Publications
2025
-
Where tests fall short: Empirically analyzing oracle gaps in covered code
Megan Maton, Gregory M. Kapfhammer, and Phil McMinn
In Proceedings of the 19th International Symposium on Empirical Software Engineering and Measurement, 2025
Background: Developers often rely on statement coverage to assess test suite quality. However, statement coverage alone may detect as few as 10% of faults, necessitating more rigorous approaches. While mutation testing is effective, its execution and human analysis costs remain high. Identifying covered statements that are not checked by oracles (e.g., assertions) offers a cost-effective alternative; however, the lack of empirical evidence for selecting an appropriate Oracle Gap Calculation Approach (OGCA) prevents developers from making informed choices. Aims: This knowledge-seeking study compares oracle gap characteristics determined by different OGCAs to assist developers in choosing the most valuable approach for their use cases. Method: Using mixed-method empirical analysis, we conduct an in-depth evaluation of the oracle gaps produced by three OGCAs: Checked Coverage using a Dynamic Slicer (CCDS), Checked Coverage using an Observational Slicer (CCOS), and Pseudo-Tested Statement Identification (PTSI). Across 30 Java classes from six open-source projects, we report on a quantitative evaluation of gap prominence, distribution, fault detection correlation, and execution times, as well as results from a qualitative manual inspection of the statement types found in the oracle gaps. Results: The qualitative analysis showed data-loading statements, iteration statements, and output updates to be most prominent in the oracle gaps. PTSI identified the oracle gaps with the lowest median mutation score (0.32), highlighting areas requiring more fault detection improvement compared to CCDS (0.76) and CCOS (0.50). PTSI also had the shortest median execution time (19.9 seconds), far quicker than both CCDS (273.2 seconds) and CCOS (5957.1 seconds). Conclusions: PTSI quickly reveals priority testing areas for improved fault detection, making it an effective OGCA for developers to identify where tests fall short.
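The abstract above hinges on the distinction between a statement being covered and being checked by an oracle. A minimal, hypothetical Java sketch of such an oracle gap (the class, test, and names are invented for illustration, not taken from the study):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: the log-update statement below is executed by
// the test, but no assertion ever inspects the log, so the statement sits
// in the oracle gap that approaches such as CCDS, CCOS, and PTSI aim to expose.
class Counter {
    private int value = 0;
    private final List<String> log = new ArrayList<>();

    int increment() {
        log.add("incremented");  // covered, but checked by no oracle
        return ++value;
    }

    List<String> log() { return log; }
}

public class OracleGapExample {
    public static void main(String[] args) {
        Counter c = new Counter();
        // The test's only oracle checks the return value, so the log-update
        // statement is executed but never verified.
        if (c.increment() != 1) throw new AssertionError("increment failed");
        System.out.println("test passed without ever reading the log");
    }
}
```

Closing the gap here would mean adding an assertion on `c.log()` so that the side effect, not just the return value, is observed by the suite.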
2024
-
Exploring pseudo-testedness: Empirically evaluating extreme mutation testing at the statement level
Megan Maton, Gregory M. Kapfhammer, and Phil McMinn
In Proceedings of the 40th International Conference on Software Maintenance and Evolution, 2024
Extreme mutation testing (XMT) detects undesirable pseudo-testedness in a program by deleting the method bodies of covered code and observing whether the test suite can detect their absence. Even though XMT may identify test limitations, its coarse granularity means that it may overlook testing inadequacies, particularly at the statement level, that developers may want to address before committing the resources demanded by traditional mutation testing. This paper proposes the use of the statement deletion mutation operator (SDL) to uncover pseudo-tested statements in addition to complete methods. In an experimental evaluation involving four frequently studied, large, Apache Commons Java projects and 23 projects randomly selected from the Maven Central Repository, we found 722 different cases of pseudo-tested statements. Critically, we discovered that 48% of these statements exist outside of pseudo-tested methods, meaning that the detection of testing deficiencies related to these statements would normally be left to traditional, resource-intensive, mutation testing. Also, we found that a popular Java mutation testing tool would not have mutated some of the statement types involved in the first place, effectively rendering these issues hitherto hard to discover. This paper therefore demonstrates that XMT alone is insufficient and should be combined with pseudo-tested statement evaluation to pinpoint subtle, yet important, testing oversights that a developer should tackle before applying traditional mutation testing.
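The abstract's key claim is that a method can survive XMT (deleting its whole body breaks a test) while still containing an individual pseudo-tested statement that only SDL reveals. A hedged, hypothetical Java sketch of that situation (all names invented, not from the paper):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: normalize() is NOT pseudo-tested under XMT, because
// emptying its body would break the asserted return value. Yet deleting only
// the cache-update statement (the SDL mutant) leaves the suite passing: a
// pseudo-tested statement that method-level XMT alone would miss.
class Normalizer {
    private final Map<String, String> cache = new HashMap<>();

    String normalize(String s) {
        String result = s.trim().toLowerCase();
        cache.put(s, result);  // SDL deletes this line; no test notices
        return result;
    }
}

public class SdlExample {
    public static void main(String[] args) {
        // The suite's only oracle is the normalized return value.
        if (!new Normalizer().normalize("  Foo ").equals("foo"))
            throw new AssertionError("normalize failed");
        System.out.println("suite passes; the cache update is pseudo-tested");
    }
}
```

Here the statement-level deletion pinpoints a gap inside an otherwise adequately tested method, which is the 48%-outside-pseudo-tested-methods scenario the evaluation quantifies.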
-
PseudoSweep: A pseudo-tested code identifier
Megan Maton, Gregory M. Kapfhammer, and Phil McMinn
In Proceedings of the 40th International Conference on Software Maintenance and Evolution – Tool Demonstrations Track, 2024
Software testing remains a crucial practice for ensuring and maintaining code quality. Yet a critical issue persists: the existence of pseudo-tested statements. Tests cover these statements, but removing them does not trigger test failures. Since no established tools address this challenge, this paper introduces PseudoSweep, a novel tool that automatically identifies pseudo-tested methods and statements in Java projects. PseudoSweep combines method and statement deletion techniques to reveal these maintenance problems. In addition to explaining the approach used by PseudoSweep, this paper details use cases and overviews results from experiments with PseudoSweep. The tool is available (including set-up instructions and examples) at https://github.com/PseudoTested/PseudoSweep and there is a video demonstration at https://youtu.be/5QCsu7MbiXI.