Article

Regression Testing in Agile—A Systematic Mapping Study

School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ 85281, USA
* Author to whom correspondence should be addressed.
Software 2025, 4(2), 9; https://doi.org/10.3390/software4020009
Submission received: 29 October 2024 / Revised: 29 March 2025 / Accepted: 1 April 2025 / Published: 14 April 2025

Abstract

Background: Regression testing is critical in agile software development, as it ensures that frequent changes do not introduce defects into previously working functionalities. While agile methodologies emphasize rapid iterations and value delivery, regression testing research has predominantly focused on optimizing technical efficiency rather than aligning with agile principles. Aim: This study aims to systematically map research trends and gaps in regression testing within agile environments, identifying areas that require further exploration to enhance alignment with agile practices and value-driven outcomes. Method: A systematic mapping study analyzed 35 primary studies. The research categorized studies based on their focus areas, evaluation metrics, agile frameworks, and methodologies, providing a comprehensive overview of the field. Results: The findings reveal a strong emphasis on test prioritization and selection, reflecting the need for optimized fault detection and execution efficiency in agile workflows. However, areas such as test generation, test minimization, and cost analysis are under-explored. Current evaluation metrics primarily address technical outcomes, neglecting agile-specific aspects such as the business impact of defect severity and iterative workflows. Additionally, the research highlights the dominance of continuous integration frameworks, with limited attention to other agile practices like Scrum and a lack of datasets capturing agile-specific attributes such as testing costs and user story importance. Conclusions: This study underscores the need for research to expand beyond existing focus areas, exploring diverse testing techniques and developing agile-centric metrics and datasets. By addressing these gaps, future work can enhance the applicability of regression testing strategies and align them more closely with agile development principles.

1. Introduction

Regression testing is a quality assurance practice that ensures changes in a software codebase do not introduce defects into previously functioning features [1]. Traditionally, regression testing was a one-time activity performed close to release. However, with the growing popularity of agile methods characterized by rapid iteration and continuous delivery, regression testing has evolved into an ongoing process [2]. Studies such as [3,4] have acknowledged this shift. Recent review studies [5,6] have provided detailed insights into the application of artificial intelligence (AI) and machine learning (ML) in regression testing, along with techniques like Regression Test Selection (RTS) and Prioritization (RTP), and the metrics used to evaluate these methods. While these studies are valuable, they often fail to capture agile’s unique focus on continuous value delivery, the primary metric of success in agile development [7]. This gap highlights the need to understand how current regression testing approaches address agile-specific attributes. To bridge this gap, we conducted a mapping study of 35 relevant works to identify trends and research gaps in regression testing within agile environments. Our findings underscore current limitations and reveal the need for regression testing approaches that fully integrate agile principles, particularly value-driven testing.
Regression testing is a resource-intensive process, leading to the development of techniques like Regression Test Minimization (RTM), Regression Test Selection, and Regression Test Prioritization [8]. These established methodologies have been widely adopted across software projects, including agile workflows. Their primary objective is to optimize resources by identifying test cases sufficient to verify new changes and detect faults early. Typically, these techniques rely on technical information such as code coverage, fault histories, and dependency analysis [5]. While these approaches effectively optimize technical aspects, they often overlook the business significance of test cases. A recent example is the CrowdStrike case, where agile practices prioritized speed and automation to streamline regression testing [9,10]. While automation effectively identified technical regressions (e.g., unit and integration-level issues), it failed to account for business-critical scenarios, resulting in a loss of value [11]. This highlights the need for regression testing approaches that align with agile’s emphasis on maintaining value delivery by ensuring that requirements that have reached the definition of done [12] continue to meet both technical and business expectations.
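To make this concrete, the following minimal sketch (our illustration in Python; all test names, signals, and the scoring heuristic are assumptions, not an implementation from any primary study) shows how RTS and RTP typically combine the technical signals named above, namely code coverage, fault history, and execution cost:

```python
# Illustrative only: a toy regression test selection (RTS) and prioritization
# (RTP) pass driven purely by technical signals. Test names and data are
# hypothetical.
from dataclasses import dataclass

@dataclass
class TestCase:
    name: str
    covered_files: set[str]   # coverage data: files this test exercises
    recent_failures: int      # fault-history signal
    duration_s: float         # execution cost in seconds

def select_and_prioritize(tests: list[TestCase], changed_files: set[str]) -> list[TestCase]:
    # RTS step: keep only tests whose coverage intersects the change set.
    selected = [t for t in tests if t.covered_files & changed_files]
    # RTP step: run fault-prone, cheap tests first (failures per second of runtime).
    return sorted(selected, key=lambda t: t.recent_failures / t.duration_s, reverse=True)

suite = [
    TestCase("test_checkout", {"cart.py", "payment.py"}, recent_failures=3, duration_s=2.0),
    TestCase("test_login", {"auth.py"}, recent_failures=0, duration_s=0.5),
    TestCase("test_refund", {"payment.py"}, recent_failures=1, duration_s=1.0),
]
for t in select_and_prioritize(suite, changed_files={"payment.py"}):
    print(t.name)  # test_checkout, then test_refund; test_login is excluded
```

Note that a heuristic of this kind optimizes only technical signals; the business significance of a test never enters the score, which is precisely the limitation highlighted by the CrowdStrike example.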
According to Heidenberg et al. [13], value in agile encompasses both financial and non-financial dimensions. This aligns with Poppendieck’s [14] concept of ‘Lean Engineering’, which asserts that any process or activity only adds value if it benefits the customer or organization. While regression testing does not directly create value, it plays a critical role as a ‘value-preserving’ activity by ensuring that new increments do not introduce defects that could diminish the value of previously delivered software. With the popularity of agile methods (89% of organizations follow some variant of agile) [15], numerous efforts have been made to address regression testing in agile settings, with multiple review studies aggregating these contributions to provide a comprehensive overview.
Existing review studies such as Pan et al. [5] and Greca et al. [16] have made significant contributions to advancing regression testing methodologies in agile. Pan et al. [5] reviewed machine learning-based techniques for test case selection and prioritization, identifying supervised learning, clustering, and reinforcement learning that leverage features like fault histories, code coverage, and textual logs. The study reports that despite the technical advancements in the field, the lack of consistent evaluation metrics and limited reproducibility restricts their scalability and practical adoption. Similarly, the review by Greca et al. explored the industrial relevance and applicability of regression testing techniques, noting that while methods like test case prioritization, selection, and test suite reduction are technically robust, their adoption in real-world systems remains constrained by high implementation costs and inadequate documentation. While these reviews provide valuable insights into regression testing techniques, their scope predominantly emphasizes technical efficiency and industrial applicability. However, these studies do not explore how these techniques align with agile’s iterative workflows or its core principle of delivering continuous value. This presents an opportunity to examine regression testing research within agile contexts, focusing on how approaches integrate with the agile principle of value preservation. By adopting a systematic mapping approach, this study offers a comprehensive landscape view, identifying the prevailing research trends and under-explored dimensions—particularly the role of value-driven testing in agile environments.
We opted for a systematic mapping study (SMS) approach because it aligns with our goals of cataloging existing knowledge and synthesizing actionable guidelines, and because it directly addresses the observed research gap. By leveraging this approach, we aim to contribute meaningfully to shaping the trajectory of future research in agile regression testing, ensuring that it is both informed by a comprehensive knowledge base and attuned to the practical realities of the field. Such an initiative promises to be a starting point for subsequent scientific ventures and highlights the significance of understanding and addressing the challenges agile regression testing poses. Within this context, our paper seeks to make a substantial contribution, highlighting contemporary research pathways and unveiling potential exploratory avenues. With this motivation, we pose the following research questions.
RQ1: 
What research trends are observed in agile regression testing?
RQ2: 
What research gaps should the community address?
The rest of the paper is organized as follows: Section 2 reviews related work; Section 3 describes the methods, study selection, and results; Section 4 discusses the findings and offers actionable recommendations; Section 5 examines threats to validity; and Section 6 presents conclusions and future work.

2. Related Work

This section provides an overview of recent review studies, presenting their findings within the domain of regression testing. By summarizing these studies’ contributions, we outline the research community’s efforts to aggregate approaches and metrics in agile regression testing environments. This overview contextualizes the broader research landscape and offers insights into the areas on which these reviews have focused.
The systematic literature review by Pan et al. [5] explores the application of machine learning techniques for test case selection and prioritization in continuous integration environments, analyzing 29 primary studies published from 2006 to 2020. The study categorizes ML techniques into supervised, unsupervised, reinforcement learning, and natural language processing (NLP) approaches, with supervised learning, especially ranking models, being the most prevalent. The paper emphasizes the practicality of features like execution history, code coverage, and code complexity for ML-based TSP (ML applied to RTS and RTP) in CI. However, it highlights challenges in evaluating these techniques due to inconsistent metrics and diverse software project characteristics, with only 21% of the studies being reproducible. The authors call for future research to focus on reproducibility, advanced ML techniques like deep learning, and standard evaluation practices to enable meaningful comparisons across studies.
Another review study [17] examines regression test case prioritization (TCP, equivalent to RTP) by reviewing 80 primary studies from 1999 to 2016. The study finds search-based techniques, particularly those using Genetic Algorithms, to be the most popular, focusing on enhancing fault detection rates while minimizing execution time. However, it highlights the need for research on other TCP aspects, such as execution time and cost-effectiveness, and points out inconsistencies in evaluation methods that hinder comparisons across studies. The review identifies a gap in multi-objective TCP approaches, which consider multiple criteria simultaneously, and suggests future research should focus on developing such approaches and standardizing evaluation methods for more robust comparisons.
A systematic mapping study [18] was conducted on machine learning techniques applied to software testing, analyzing 48 primary studies from 1995 to 2018. The studies were classified by study type (including publication and research types), testing activity (such as compatibility testing, test case design, and test prioritization), and the types of ML techniques used. The study concluded that ML techniques were predominantly applied for test case generation, refinement, and evaluation, and for predicting the cost of testing-related activities.
This systematic mapping study [3] investigates test case prioritization and selection in continuous integration environments (TCPCI) by analyzing 35 primary studies published from 2009 to 2019. The study highlights a growing interest in TCPCI, particularly between 2016 and 2018, with history-based approaches, which use execution history data, being the most common due to their effectiveness in improving fault detection rates. However, it also notes the under-exploration of probabilistic and distribution-based approaches. The paper underscores the need for more sophisticated evaluation methods that account for the dynamic nature of continuous integration, pointing to a gap in research on time efficiency and resource consumption. The authors advocate for future work to develop new approaches integrating multiple information sources, such as historical data, code coverage, and user requirements, to enhance TCPCI effectiveness.
In conclusion, recent review studies have contributed significantly to advancing regression testing research by providing useful insights into techniques, methodologies, and metrics. They have highlighted key advancements, such as adopting machine learning-based approaches, using metrics such as fault detection rates, and the growing focus on industrial applicability. However, while these studies have extensively covered technical efficiency and general trends, there remains an opportunity to explore how agile processes characterized by iterative workflows, frequent integration, and continuous delivery shape the regression testing problem. This systematic mapping study builds on these contributions by examining the literature through the lens of agile principles, aiming to uncover trends and gaps that relate specifically to agile environments. In doing so, it seeks to complement existing research, offering a more nuanced understanding of regression testing in agile development.

3. Methods

This section outlines the research methodology employed in this work. Systematic Mapping Studies (SMS), also called scoping studies, offer a comprehensive overview of a research area by classifying its contributions into categories. This method involves thoroughly examining the existing literature to understand the thematic scope and the mediums of publication. A systematic mapping primarily seeks to structure a research field. Systematic mapping studies are utilized by scholars in various fields, following distinct guidelines or methodologies [19].
Several reasons justify the choice of SMS for our research methodology. Our investigation is directed towards answering two critical questions. RQ1 is concerned with delineating research trends in agile regression testing over recent years, a vital endeavor for grasping the evolving dynamics of agile software development (ASD). An SMS is aptly suited for providing a comprehensive perspective of the research environment, thereby fulfilling this aim. Similarly, RQ2 aims to pinpoint research voids, an essential step for directing future scholarly and practical pursuits. The SMS methodology is ideally suited for methodically identifying knowledge gaps warranting further exploration. Moreover, the introduction highlights the pivotal influence of ASD within the software industry, underlining the urgent need to comprehend the progressive trends in agile regression testing. An SMS aligns well with our objectives to catalog existing knowledge and compile directives that will influence forthcoming research activities. In conducting this research, we primarily adhered to the systematic mapping guidelines outlined by Petersen et al. [19]. While other guidelines, such as those of Kitchenham et al. [20], offer a basic framework for mapping studies, they provide relatively limited procedural details regarding iterative search refinement, classification schemes, and data extraction processes. In contrast, Petersen et al.’s guidelines (2015) deliver a more precise and comprehensive set of recommendations specifically tailored to the demands of mapping studies in software engineering. Their framework delineates explicit strategies for developing search protocols, systematically categorizing research contributions, and iteratively refining data extraction, all essential for capturing the broad and dynamic research landscape in agile regression testing. These additional details and the methodological rigor of Petersen et al.’s approach better align with our study’s objectives, making it the more appropriate choice for our systematic mapping study.
Furthermore, conducting review studies is challenging and time-consuming; the PRISMA guidelines were developed to help researchers address these challenges [21]. These guidelines are widely recognized for improving the transparency and reliability of systematic reviews and meta-analyses. They are particularly suited to studies with narrowly defined research questions that involve detailed synthesis and statistical analysis. However, this study employs a systematic mapping study methodology, which focuses on categorizing and summarizing research trends rather than synthesizing data or conducting meta-analyses.
While PRISMA’s structured approach ensures rigor in systematic reviews, its complete application is less aligned with the mapping study’s broader exploratory goals. Instead, this study adheres to established SMS guidelines, such as those outlined by Petersen et al., to systematically map the research landscape and identify high-level trends and gaps in agile regression testing. Nonetheless, we recognize the value of certain PRISMA-inspired practices in enhancing transparency and reproducibility. Key elements, such as a clear study selection process and the inclusion of a flow diagram (Figure 1), have been incorporated. However, features like detailed risk-of-bias assessments, more relevant to meta-analyses, were omitted, as they are beyond the scope of this mapping study. By focusing on methodologies that align with SMS objectives, this study ensures methodological rigor while maintaining the flexibility needed to provide a broad overview of the research landscape in agile regression testing.
All study stages were conducted in the Fall of 2024. The first author (a PhD candidate) carried out the research, while the second author reviewed it. Although our search queries were not explicitly restricted by publication date, the vast majority of the relevant literature retrieved pertained to the past decade.

3.1. Study Selection

Library Scan: The study selection process involved multiple stages, starting with the library scan, as summarized in Figure 1. To ensure a comprehensive search, we began with Google Scholar, a database covering a wide range of academic publications. Using the search term “agile regression testing”, more than 69,000 results were initially retrieved. Google Scholar’s default sorting by relevance prioritized the studies most aligned with the search term. Only the first 100 results (10 pages, with 10 results per page) were reviewed, as the relevance of results noticeably declined beyond this point. Additionally, equivalent search strings were applied to the ACM Digital Library, IEEE Xplore, and Scopus databases. These databases allow refined searches limited to abstracts, titles, and keywords, enabling a more focused list of studies that complements Google Scholar’s broader scope. The search strings for each database are listed below, with the corresponding numbers of retrieved studies in Table 1:
  • Google Scholar: (agile regression test) OR (continuous regression test)
  • SCOPUS: TITLE ((agile AND regression AND test) OR (continuous AND regression AND test))
  • ACM: [Title: agile] AND [Title: regression] AND [Title: test] OR [Title: continuous] AND [Title: regression] AND [Title: test]
  • IEEE: (“Document Title”: agile regression test) OR (“Document Title”: continuous regression test)
Filtering and Screening: As depicted in Figure 1, the filtering process was conducted in the following steps:
  • Title and Abstract Screening: Studies were excluded if their titles or abstracts did not explicitly reference agile methodologies and regression testing. This step ensured that only studies directly relevant to the research questions were retained for further review.
  • Duplicate Removal: The authors manually identified and removed duplicates across databases.
  • Full-Text Review: The remaining studies were fully reviewed, focusing on alignment with the research questions and methodological rigor.
  • Snowballing: A backward snowballing step was performed to find relevant works.
Note: For backward snowballing, we inspected the reference lists of the studies that had undergone full-text review to identify additional relevant articles. This was applied only once to maintain a focused and manageable scope.
Quality and Reliability: To ensure the reliability of the selected studies, several criteria were assessed, including:
  • Credibility of data sources and methodological rigor (e.g., data used in the studies, detailed experimental setups, benchmarking practices).
  • Relevance to regression testing within agile environments.

3.2. Data Extraction and Results

This section provides a detailed overview of the data gathered from an extensive review of 35 studies (refer to Appendix A) as part of this systematic mapping study. The extracted information encompasses the following attributes: the publication year, the primary research focus, the evaluation metrics utilized, and the types of agile methods used. These categories were chosen to align with the overarching research questions and objectives, enabling a comprehensive analysis of the existing literature and allowing us to understand the prevailing trends and identify critical gaps in regression testing within agile environments.
The focus column contains five categories (see Table 2) based on the research objectives of the selected studies, highlighting techniques such as RTS, RTP, and RTM as foundational regression testing techniques. These three categories were deductively chosen for their established significance in the field, as identified in Yoo and Harman’s seminal survey [8]. Regression Test Generation (RTG) and cost-related aspects of regression testing were inductively added as classifications to capture included studies whose focus fell outside the deductively derived categories.
The metrics column emphasizes the evaluation measures used to assess regression testing approaches. Metrics such as the Average Percentage of Faults Detected (APFD), Normalized APFD (NAPFD), and APFD adjusted for test execution cost and fault severity (APFDc) are widely used to measure technical efficiency. Additionally, metrics like Rank Percentile Average (RPA), Rank of the Failing Test Cases (RFTC), Fault Detection Rate (FDR), and Normalized Time Reduction (NTR) were also found to be used for RTP. A summary of the key findings is presented in Table 3.
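For reference, APFD is conventionally defined as follows [1], where $n$ is the number of test cases in the prioritized suite, $m$ is the number of faults it reveals, and $TF_i$ denotes the position of the first test case that exposes fault $i$:

$$\mathrm{APFD} = 1 - \frac{\sum_{i=1}^{m} TF_i}{nm} + \frac{1}{2n}$$

NAPFD and APFDc adapt this definition, respectively, to suites that do not detect every fault and to settings with non-uniform execution costs and fault severities.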
Finally, the agile method types column identifies the specific agile frameworks considered in the studies. Behavior-driven development (BDD) is explicitly mentioned as a representative example, reflecting the increasing focus on user-centric development practices in agile methodologies. By extracting these categories of data, this study provides a high-level analysis of regression testing trends, mapping how current research aligns with agile principles like iterative workflows and continuous value delivery.
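For illustration, the extraction scheme described above can be expressed as a simple record; the sketch below (in Python; the class and the example values are placeholders of our own, not artifacts of the mapped studies) mirrors the categories used in this study:

```python
# Hypothetical record type mirroring the data-extraction attributes: study
# identifier, publication year, research focus, evaluation metrics, and agile
# method. Category sets follow the paper's classification scheme.
from dataclasses import dataclass

FOCUS = {"RTP", "RTS", "RTM", "RTG", "Cost"}    # focus categories (Table 2)
AGILE = {"CI", "Scrum", "BDD", "Unspecified"}   # agile method types observed

@dataclass
class ExtractionRecord:
    study_id: str        # e.g., "S01"
    year: int            # publication year
    focus: str           # one of FOCUS
    metrics: list[str]   # e.g., ["APFD", "Precision"]
    agile_method: str    # one of AGILE

record = ExtractionRecord("S01", 2024, "RTM", ["APFD"], "Unspecified")  # placeholder values
assert record.focus in FOCUS and record.agile_method in AGILE
```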
Research Focus and Publication Year Trends: The spectrum of regression testing research highlights several distinct trends, with RTP emerging as the most studied area. As summarized in Table 3, RTP accounts for 57.7% of the total studies (18 out of 35), underscoring the community’s emphasis on accelerating fault detection by ordering test cases (Figure 2). This focus on rapid feedback loops aligns with agile development cycles, where teams aim to detect critical issues early and maintain continuous delivery.
Next in prominence is RTS, constituting 26.9% of the studies (10 out of 35; Figure 2). Table 3 shows how these RTS efforts tackle the ongoing challenge of identifying the most relevant subset of test cases to ensure coverage and accuracy in iterative workflows. Typical RTS approaches leverage historical fault data, code coverage, or dependencies to minimize redundant testing while preserving high confidence in the software’s stability. Together, RTP and RTS represent over 84% of the research focus, as Table 3 and Figure 2 indicate, highlighting their critical role in addressing agile-specific constraints such as limited resources, fast iteration, and tightly bounded sprint timelines.
Niche areas like cost analysis, RTG, and RTM collectively contribute the remaining 15.5%. While these areas are less studied, they are essential in creating efficient regression testing strategies: RTG emphasizes generating test cases for new functionalities, RTM optimizes test suites by eliminating redundant cases, and cost analysis, though under-explored, remains vital in resource-constrained agile environments. The distribution of research outputs over time, visualized in Figure 3, demonstrates notable surges in activity, particularly in 2020 and 2022. The year 2020 marked a peak with nine publications, driven by the growing adoption of machine learning techniques such as reinforcement learning and deep learning in regression testing practices. These advances enabled scalable and adaptive solutions for RTP and RTS, addressing the unique demands of agile workflows. The year 2022 maintained this momentum with another peak of eight publications, reflecting the academic community’s sustained interest in refining RTP and RTS practices. The productivity during these years underscores the field’s focus on leveraging technological advancements to address practical challenges in agile regression testing. By contrast, earlier years (2013–2017) exhibit more sporadic contributions, reflecting the gradual evolution of agile regression testing as a distinct research area.
Evaluation Metrics: As illustrated in Figure 4, the bar chart shows how often each metric appears across the 35 primary studies. Since any paper may employ multiple metrics, the total across bars can exceed 35. The APFD group (including APFD, NAPFD, and APFDc) is most common, appearing 15 times, followed by Precision and Cost (4 each), Coverage (3), and various others (e.g., F-score, FDR, Recall) occurring once or twice. These metrics emphasize fault detection, testing speed, and coverage breadth. Meanwhile, machine learning–based techniques also adopt metrics like Precision, Recall, and F1 to gauge predictive accuracy—though the definition of “true positives” can differ across studies, underscoring the need for standardization [5]. Finally, value-oriented metrics aligned with agile principles, such as those capturing business impact or user-story importance, remain unexplored.
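To make the caveat about differing “true positive” definitions concrete, the following minimal sketch (our illustration; it assumes one common convention in which a selected test that actually fails counts as a true positive) computes Precision, Recall, and the F1 score for a test-selection outcome:

```python
# Illustrative evaluation of a test-selection outcome. The true-positive
# convention below (selected AND actually failing) is an assumption; as noted
# in the text, studies define it differently.
def precision_recall_f1(selected: set[str], failing: set[str]) -> tuple[float, float, float]:
    tp = len(selected & failing)  # selected tests that really fail
    precision = tp / len(selected) if selected else 0.0
    recall = tp / len(failing) if failing else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

p, r, f = precision_recall_f1(selected={"t1", "t2", "t3"}, failing={"t2", "t3", "t4"})
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")  # 0.67 / 0.67 / 0.67
```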
Type of Agile: The analysis of agile methods, as visualized in Figure 5, reveals a strong focus on continuous integration (CI), referenced in 67.3% of the studies (26 out of 35). CI’s dominance reflects its centrality to agile workflows, where frequent integration and testing cycles are critical for maintaining quality in fast-paced development environments. One study explicitly adopts behavior-driven development (BDD), which emphasizes user-centric testing approaches. Scrum appears in 7.7% of studies, while other agile frameworks like Kanban are notably absent. This distribution highlights a potential gap in research on how regression testing practices adapt to diverse agile methodologies beyond CI. Future studies should explore how specific frameworks influence testing strategies, particularly prioritization and test suite optimization.

4. Discussion

This section presents our interpretation of the results, thereby addressing the two research questions posed in this paper.
RQ1: What research trends are observed in agile regression testing? Agile regression testing has witnessed significant advancements, driven by the increasing adoption of agile methodologies and the need for effective testing strategies to support iterative workflows. Consistent with prior reviews, such as those by Durelli et al. [18] and Khatibsyarbini et al. [17], our mapping study confirms a dominant focus on techniques like regression test prioritization and regression test selection. These approaches collectively address critical challenges in agile environments, such as resource constraints and the rapid pace of delivery cycles. RTP, in particular, continues to command a significant share of research, with 57.7% of the studies in our dataset focusing on prioritization strategies. This finding aligns with Pan et al. [5], who identified supervised and reinforcement learning techniques as key enablers for regression test prioritization.
The focus on regression test selection, accounting for 26.9% of the studies, reflects ongoing efforts to refine test case selection methods that balance comprehensive coverage and time efficiency. While Lima and Vergilio [3] highlighted RTS as a critical aspect of continuous integration workflows, our findings further indicate its broader relevance across agile methodologies. Regression test selection techniques rely on static code features and historical fault data to guide selection decisions. While these approaches are practical, they could benefit from greater emphasis on factors like business importance post-delivery, ensuring that regression tests validate both technical functionality and the value delivered to end-users.
A noticeable trend across the reviewed studies is the consistent use of traditional metrics, such as the Average Percentage of Faults Detected and test execution time, to evaluate regression testing techniques. These metrics are foundational, as observed in Khatibsyarbini et al. and Pan et al., but their utility in agile contexts is evolving. For instance, as agile emphasizes continuous value delivery, traditional metrics provide limited insights into the business impact of regression testing outcomes. Our results highlight a gradual shift toward integrating machine learning metrics, such as precision, recall, and the F1 score, in 45.7% of the studies. This aligns with Durelli et al., who underscored the growing reliance on machine learning-based approaches for fault prediction and test case optimization. However, the variation in how these metrics are applied across techniques suggests an opportunity to establish more standardized evaluation practices that align with agile goals.
Another trend evident from our mapping study is the prominence of continuous integration as the primary agile framework in regression testing research, featured in 67.3% of the studies. This finding echoes Lima and Vergilio, who emphasized continuous integration’s central role in enabling frequent testing cycles. At the same time, our study highlights a gap in exploring other agile frameworks, such as Scrum and Kanban. Only 7.7% of the studies explicitly mentioned Scrum, while frameworks like Kanban were absent. This focus on continuous integration reflects the industry’s prioritization of automation-centric practices, but it also points to the untapped potential for tailoring regression testing strategies to alternative agile workflows.
While prior reviews primarily emphasize technical advancements, our study identifies an increasing awareness of agile-specific challenges, such as adapting testing techniques to iterative workflows and preserving business value. For example, Pan et al. and Khatibsyarbini et al. highlight fault detection as a core metric. However, its impact is enhanced when considered alongside factors like defect severity or customer satisfaction. Our findings add to this narrative by emphasizing the need to align regression testing with the agile principle of delivering continuous value.
In summary, the trends observed in agile regression testing reflect a maturing field that continues to build on foundational research. Techniques like prioritization and selection dominate the landscape, driven by their adaptability to agile workflows, while traditional metrics are gradually supplemented with ML-based evaluations. However, future research can offer even more comprehensive insights into regression testing practices in agile environments by focusing more on agile-specific concerns—such as broader exploration of frameworks and deeper alignment with business goals.
RQ2: What research gaps should the community address? Our systematic mapping study has identified several critical gaps in agile regression testing research, highlighting opportunities for future exploration and innovation. Though reflective of historical research trends and practical challenges, these gaps present an avenue for advancing the field in ways that align more closely with the principles of agile methodologies. Addressing these areas can enhance the robustness, scalability, and economic viability of regression testing practices while bridging the divide between academic research and industrial applications.
One prominent gap lies in regression test generation, which remains under-explored, accounting for only 4.3% of the studies analyzed. This under-representation may stem from the complexity of designing test cases that address both technical functionality and agile-specific needs, such as evolving user requirements and rapid delivery cycles. Integrating advanced approaches into RTG, particularly those capable of adapting to agile workflows, could significantly enhance software quality by ensuring that generated test cases align with the agile value delivery philosophy.
Similarly, regression test minimization has received limited attention, with only 6.4% of studies focusing on optimizing test suites. In agile contexts, where frequent integration cycles demand efficiency, test minimization offers substantial potential to streamline testing processes by reducing redundant test cases without compromising testing goals. Expanding work in this area would enable faster feedback cycles and improve scalability, particularly in large-scale agile projects.
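As a sketch of the underlying idea (not a technique prescribed by the mapped studies), RTM is commonly framed as a set-cover problem over testing requirements, solvable approximately with a greedy heuristic; the names below are hypothetical:

```python
# Illustrative greedy test-suite minimization: repeatedly keep the test that
# covers the most yet-uncovered requirements, discarding redundant tests.
def minimize(suite: dict[str, set[str]]) -> list[str]:
    # suite: mapping of test name -> set of requirements/branches it covers
    uncovered = set().union(*suite.values())
    kept = []
    while uncovered:
        best = max(suite, key=lambda t: len(suite[t] & uncovered))
        if not suite[best] & uncovered:
            break
        kept.append(best)
        uncovered -= suite[best]
    return kept

suite = {"t1": {"r1", "r2"}, "t2": {"r2"}, "t3": {"r3"}, "t4": {"r1", "r3"}}
print(minimize(suite))  # ['t1', 't3'] -- t2 and t4 add no new coverage
```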
Cost analysis in regression testing represents another significant gap, largely overlooked in favor of technical outcomes such as fault detection and execution time. Agile projects often operate under resource constraints, making it essential to quantify the economic impact of various testing strategies. However, the difficulty of measuring return on investment in environments characterized by rapid change and frequent iteration may deter researchers from exploring this area. Addressing this gap would enable teams to balance technical efficiency with business considerations, fostering better decision-making and resource allocation in agile workflows.
A recurring theme across our findings is the misalignment of evaluation metrics with agile principles, particularly the emphasis on value delivery. While traditional metrics like the Average Percentage of Faults Detected and execution time dominate the literature, they provide limited insights into agile-specific outcomes, such as preserving business value and satisfying end-user expectations. Machine learning metrics such as precision, recall, and the F1 score, originally developed in information retrieval [22,23], are increasingly integrated into regression testing studies, appearing in 45.7% of the research analyzed. However, the absence of standardized evaluation practices and agile-centric metrics remains a barrier to aligning testing strategies with the broader objectives of agile development. Future work should prioritize developing metrics that bridge the gap between technical performance and business impact, enhancing the strategic value of regression testing in agile settings.
Another noteworthy gap is the under-representation of agile frameworks beyond Continuous Integration. While CI dominates the field, appearing in 67.3% of the studies, frameworks like Scrum and Kanban receive minimal attention. Scrum, for instance, is the most widely used agile framework, with 63% of organizations adopting it [15], and its popularity continues to grow. The limited academic focus on Scrum and other frameworks restricts the applicability of regression testing research to industry practices, potentially hindering its relevance in real-world agile environments. Addressing this imbalance is crucial to ensuring that regression testing strategies cater to the diverse workflows and requirements of different agile methodologies.
Finally, our findings underscore the lack of research into agile-related datasets that incorporate variables such as requirements or user stories with their associated business value, testing time, test setup and tear-down costs, and defect severity’s impact on business outcomes. This gap is significant because agile prioritizes iterative delivery and value preservation, making understanding the broader implications of regression testing decisions essential. Developing such datasets would provide researchers with the tools to analyze regression testing practices more comprehensively, aligning them with technical and business goals.
The observed gaps stem from a combination of research trends and practical constraints. Techniques like prioritization and selection, supported by advancements in machine learning, have garnered significant attention due to their immediate, tangible benefits, such as faster defect detection and optimized test execution. In contrast, generation, minimization, and cost analysis, which require foundational development and longer-term investment, are often de-prioritized. This imbalance reflects the pressures on researchers to produce publishable results with visible impacts alongside the challenges of justifying slower-yielding innovations to funding bodies and stakeholders.
Addressing these gaps would complement existing research and significantly enhance the effectiveness of agile regression testing. For instance, advancing generation and minimization techniques could enable more efficient test management, while robust cost analysis frameworks would ensure the sustainability of testing practices in resource-constrained projects. Similarly, developing agile-centric metrics and exploring underrepresented frameworks would provide a holistic understanding of regression testing, aligning it more closely with agile principles of iterative improvement and value delivery.
In conclusion, the identified gaps in agile regression testing underscore the need for a balanced research agenda that moves beyond quick wins to address foundational and under-explored areas. By focusing on these issues, the research community can drive meaningful advancements in agile regression testing, ensuring adaptability and relevance in an ever-evolving software development landscape.

Actionable Recommendations

This section presents some of the recommended actions that industry personnel can take to improve the regression testing process in their agile project settings. While this work shows research trends in agile regression testing, significant gaps and challenges remain, irrespective of the line of business [24,25]; these are addressed in detail for researchers in the future work section. With that in mind, below are our actionable recommendations:
For Developers
  • Enhance Test Case Relevance: Collaborate closely with testers to align test cases with business-critical requirements and user stories. By embedding business value into the testing framework, developers can ensure that delivered increments maintain their intended purpose and impact across iterations.
  • Incorporate Foundational Techniques: While prioritization and selection are essential for immediate optimization, attention must also be given to generation and minimization. RTG helps create high-quality test cases that reflect agile’s iterative and user-centric development. At the same time, minimization helps maintain efficiency as test suites grow, ensuring scalability in continuous delivery environments.
For Testers
  • Adopt Advanced Prioritization Techniques: Testers can better align testing practices with agile’s rapid-feedback and value-delivery priorities by focusing on defect severity, testing time, and business impact metrics (a minimal scoring sketch follows this list).
  • Address Cost Components: Integrate cost awareness into regression testing workflows by considering execution time, resource consumption, and setup/tear-down costs. A cost-centric perspective ensures efficient use of resources, particularly in constrained agile settings.
  • Develop Agile-Specific Datasets: Create datasets that include user story value, defect severity’s impact on business outcomes, and resource utilization during testing. These datasets can facilitate research and improve the strategic alignment of regression testing with agile goals.
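The scoring sketch below illustrates the prioritization and cost recommendations above by combining defect severity, business impact, and execution time into one priority score; the linear form, the weights, and the 1-to-5 scales are assumptions that a team would calibrate to its own context:

```python
# Hypothetical value-aware priority score. Weights (w_sev, w_biz, w_cost) and
# the severity/impact scales are illustrative assumptions, not a prescribed method.
def priority_score(severity: int, business_impact: int, exec_minutes: float,
                   w_sev: float = 0.5, w_biz: float = 0.4, w_cost: float = 0.1) -> float:
    # Higher severity and business impact raise priority; runtime is penalized.
    return w_sev * severity + w_biz * business_impact - w_cost * exec_minutes

tests = {
    "test_payment_flow": priority_score(severity=5, business_impact=5, exec_minutes=3),
    "test_profile_badge": priority_score(severity=2, business_impact=1, exec_minutes=1),
}
for name, score in sorted(tests.items(), key=lambda kv: kv[1], reverse=True):
    print(name, round(score, 2))  # test_payment_flow 4.2, test_profile_badge 1.3
```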
For Managers/Leadership
  • Broaden Framework Coverage: Ensure regression testing strategies extend beyond continuous integration to frameworks like Scrum and Kanban. Scrum, for instance, is the most widely used agile framework, adopted by 63% [15] of organizations. Tailoring testing practices to these frameworks enhances their applicability and relevance across diverse agile settings.
  • Promote Value-Driven Testing: Support the adoption of metrics and tools that emphasize business value preservation. For example, incorporating defect severity and user satisfaction metrics into regression testing ensures alignment with agile’s core principle of delivering customer value.
  • Invest in Long-Term Efficiency: Allocate resources to under-explored areas like generation and minimization, which, while foundational, contribute to sustainable and scalable testing practices.

5. Threats to Validity

The systematic mapping study was designed to provide a comprehensive understanding of regression testing in agile environments. Nevertheless, specific validity threats must be acknowledged and contextualized.
Construct Validity: Construct validity concerns the degree to which the research questions, methodology, and results align with the study’s objectives. One potential threat stems from the exclusion of unpublished and non-peer-reviewed studies. While this approach ensures methodological rigor and the reliability of findings, it may have reduced the study’s comprehensiveness. The decision to exclude these sources was made to prioritize reproducibility and ensure research quality. The lack of standardized terminology across studies also impacted the data extraction process. We mitigated this by designing search strings and inclusion criteria that explicitly account for common synonyms and variations in terminology related to regression testing and agile methods.
Internal Validity: Internal validity relates to the methodology’s ability to minimize bias and ensure the accuracy of extracted data. Publication bias is a significant concern in systematic mapping studies, as impactful or positive results dominate the academic literature. This bias may have influenced the trends observed in this study. We addressed it by employing a broad search strategy across diverse databases, including Google Scholar, IEEE Xplore, ACM Digital Library, and Scopus, to ensure the inclusion of a wide array of studies. Transparency in study selection and filtering was also maintained to minimize subjective bias. We defined specific criteria for assessing credibility and relevance to ensure a rigorous evaluation of the candidate papers. Credibility was evaluated based on the following factors: (1) the clarity, rigor, and design of the methodology and the evaluation metrics used, and (2) the degree to which the study addressed our core research questions, i.e., regression testing in an agile setting, and its contribution to advancing the field. All candidate papers were assessed using these criteria, and the evaluations were systematically recorded in a spreadsheet. This spreadsheet tracked each step of the selection process, including initial screening, detailed evaluation, and final inclusion. The first author conducted the study while the second author reviewed it, and both authors had full access to the spreadsheet, ensuring that all decisions were made transparently.
External Validity: External validity refers to the generalizability of the findings. The geographic distribution of authors’ affiliations in the included studies primarily represents Europe, potentially limiting insights into global practices in agile regression testing. While the databases used were comprehensive, the observed regional concentration suggests that additional work may be needed to capture practices and trends from other regions.
Conclusion Validity: Conclusion validity pertains to the relationship between the extracted data and the study’s outcomes. Excluding unpublished and non-peer-reviewed studies may have influenced the generalizability of the reported findings. However, the methodological rigor applied to the study selection process aimed to ensure that the included studies were credible and reliable. Furthermore, the potential influence of database coverage on the completeness of the dataset is acknowledged. Despite these limitations, including multiple databases and the design of semantically consistent search strings enhances confidence in the comprehensiveness of the findings.

6. Conclusions and Future Work

This systematic mapping study offers a structured analysis of current research trends and gaps in agile regression testing. In response to RQ1, the study finds that research concentrates on RTP and RTS, primarily assessed through technical metrics such as APFD, precision, and recall. The extensive emphasis on continuous integration underscores its critical role in enabling iterative and rapid development cycles. However, the study identifies a significant lack of agile-specific evaluation metrics, particularly those directly measuring business value or value preservation.
Addressing RQ2, several key research gaps emerged. Regression test generation and minimization have received limited research attention, suggesting areas for future empirical exploration. Additionally, comprehensive cost-analysis frameworks are notably scarce despite their importance in understanding the economic implications of agile projects. Moreover, the limited exploration of regression testing beyond the CI context (such as within Scrum, Kanban, or XP) restricts the broader generalizability of current findings.
Future review studies following similar systematic mapping methodologies can extend this research by specifically addressing areas beyond the scope of this study. Future systematic reviews can examine regression testing techniques within less-explored agile frameworks such as Scrum, Kanban, and XP, thereby clarifying differences and commonalities in testing practices. Another meaningful direction is systematically analyzing methodological rigor and reporting practices in the existing regression testing literature to establish quality, consistency, and reproducibility benchmarks. Furthermore, systematic reviews can explore how emerging technological trends, including DevOps, containerization, and microservices, influence regression testing strategies within agile environments. Additionally, review studies should examine regression testing strategies employed in large-scale agile implementations, identifying particular challenges and solutions pertinent to these contexts. Lastly, systematic mapping studies could comprehensively categorize and critically evaluate agile-specific metrics used in regression testing, emphasizing metrics that capture business value and end-user impacts.

Author Contributions

Conceptualization, conducting the study, and writing, S.D.; review, K.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available in GitHub at https://github.com/Suddhasvatta007/Regression-Testing-in-Agile-A-Systematic-Mapping-Study-MDPI.git (accessed on 29 October 2024).

Acknowledgments

This work acknowledges the assistance of a Large Language Model (LLM), used solely for editing and drafting purposes. The authors have reviewed all content to ensure accuracy, integrity, and correctness.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

  • S01 Singh, M., Chauhan, N., Popli, R. (2024). Test case reduction and SWOA optimization for distributed agile software development using regression testing. Multimedia Tools and Applications, 1-26.
  • S02 Mafi, Z., Mirian-Hosseinabadi, S. H. (2024). Regression test selection in test-driven development. Automated Software Engineering, 31(1), 9.
  • S03 Vescan, A., Gaceanu, R. D., Szederjesi-Dragomir, A. (2024). Embracing Unification: A Comprehensive Approach to Modern Test Case Prioritization. In ENASE (pp. 396-405).
  • S04 Wang, D., Zhao, Y., Xiao, L., Yu, T. (2023, October). An Empirical Study of Regression Testing for Android Apps in Continuous Integration Environment. In 2023 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) (pp. 1-11). IEEE.
  • S05 Da Roza, E. A., Lima, J. A. P., Silva, R. C., Vergilio, S. R. (2022, March). Machine learning regression techniques for test case prioritization in continuous integration environment. In 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) (pp. 196-206). IEEE.
  • S06 Elsner, D., Wuersching, R., Schnappinger, M., Pretschner, A., Graber, M., Dammer, R., Reimer, S. (2022, May). Build system aware multi-language regression test selection in continuous integration. In Proceedings of the 44th International Conference on Software Engineering: Software Engineering in Practice (pp. 87-96).
  • S07 Chen, R., Xiao, Z., Xiao, L., Li, Z. (2022, August). Regression Testing Prioritization Technique Based on Historical Execution Information. In 2022 International Conference on Machine Learning, Cloud Computing and Intelligent Mining (MLCCIM) (pp. 276-281). IEEE.
  • S08 Abdelkarim, M., ElAdawi, R. (2022, April). Tcp-net: Test case prioritization using end-to-end deep neural networks. In 2022 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW) (pp. 122-129). IEEE.
  • S09 Mirzaei, H., Keyvanpour, M. R. (2022, March). Reinforcement Learning Reward Function for Test Case Prioritization in Continuous Integration. In 2022 9th Iranian Joint Congress on Fuzzy and Intelligent Systems (CFIS) (pp. 1-6). IEEE.
  • S10 Bagherzadeh, M., Kahani, N., Briand, L. (2021). Reinforcement learning for test case prioritization. IEEE Transactions on Software Engineering, 48(8), 2836-2856.
  • S11 Kauhanen, E., Nurminen, J. K., Mikkonen, T., Pashkovskiy, M. (2021, March). Regression test selection tool for python in continuous integration process. In 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) (pp. 618-621). IEEE.
  • S12 Elsner, D., Hauer, F., Pretschner, A., Reimer, S. (2021, July). Empirically evaluating readily available information for regression test optimization in continuous integration. In Proceedings of the 30th acm sigsoft international symposium on software testing and analysis (pp. 491-504).
  • S13 Xu, J., Du, Q., Li, X. (2021, July). A requirement-based regression test selection technique in behavior-driven development. In 2021 IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC) (pp. 1303-1308). IEEE.
  • S14 Marijan, D., Gotlieb, A., Sapkota, A. (2020, August). Neural network classification for improving continuous regression testing. In 2020 IEEE International Conference On Artificial Intelligence Testing (AITest) (pp. 123-124). IEEE.
  • S15 Medhat, N., Moussa, S. M., Badr, N. L., Tolba, M. F. (2020). A framework for continuous regression and integration testing in iot systems based on deep learning and search-based techniques. IEEE Access, 8, 215716-215726.
  • S16 Bertolino, A., Guerriero, A., Miranda, B., Pietrantuono, R., Russo, S. (2020, June). Learning-to-rank vs ranking-to-learn: Strategies for regression testing in continuous integration. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering (pp. 1-12).
  • S17 Ali, S., Hafeez, Y., Hussain, S., Yang, S. (2020). Enhanced regression testing technique for agile software development and continuous integration strategies. Software Quality Journal, 28, 397-423.
  • S18 Lima, J. A. P., Mendonça, W. D., Vergilio, S. R., Assunção, W. K. (2020, October). Learning-based prioritization of test cases in continuous integration of highly-configurable software. In Proceedings of the 24th ACM conference on systems and software product line: Volume A-Volume A (pp. 1-11).
  • S19 Xiao, L., Miao, H., Shi, T., Hong, Y. (2020). LSTM-based deep learning for spatial–temporal software testing. Distributed and Parallel Databases, 38, 687-712.
  • S20 Shi, T., Xiao, L., Wu, K. (2020, October). Reinforcement learning based test case prioritization for enhancing the security of software. In 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA) (pp. 663-672). IEEE.
  • S21 Shi, A., Zhao, P., Marinov, D. (2019, October). Understanding and improving regression test selection in continuous integration. In 2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE) (pp. 228-238). IEEE.
  • S22 Marijan, D., Gotlieb, A., Liaaen, M. (2019). A learning algorithm for optimizing continuous integration development and testing practice. Software: Practice and Experience, 49(2), 192-213.
  • S23 Yu, T., Wang, T. (2018, October). A study of regression test selection in continuous integration environments. In 2018 IEEE 29th International Symposium on Software Reliability Engineering (ISSRE) (pp. 135-143). IEEE.
  • S24 Kwon, J. H., Ko, I. Y. (2017, December). Cost-effective regression testing using bloom filters in continuous integration development environments. In 2017 24th Asia-Pacific Software Engineering Conference (APSEC) (pp. 160-168). IEEE.
  • S25 de S. Campos Junior, H., de Paiva, C. A., Braga, R., Araújo, M. A. P., David, J. M. N., Campos, F. (2017, September). Regression tests provenance data in the continuous software engineering context. In Proceedings of the 2nd Brazilian Symposium on Systematic and Automated Software Testing (pp. 1-6).
  • S26 Kandil, P., Moussa, S., Badr, N. (2017). Cluster-based test cases prioritization and selection technique for agile regression testing. Journal of Software: Evolution and Process, 29(6), e1794.
  • S27 Labuschagne, A., Inozemtseva, L., Holmes, R. (2017, August). Measuring the cost of regression testing in practice: A study of Java projects using continuous integration. In Proceedings of the 2017 11th joint meeting on foundations of software engineering (pp. 821-830).
  • S28 Kim, J., Jeong, H., Lee, E. (2017, April). Failure history data-based test case prioritization for effective regression test. In Proceedings of the Symposium on Applied Computing (pp. 1409-1415).
  • S29 Spieker, H., Gotlieb, A., Marijan, D., Mossige, M. (2017, July). Reinforcement learning for automatic test case prioritization and selection in continuous integration. In Proceedings of the 26th ACM SIGSOFT international symposium on software testing and analysis (pp. 12-22).
  • S30 Marijan, D., Liaaen, M. (2016, October). Effect of time window on the performance of continuous regression testing. In 2016 IEEE International Conference on Software Maintenance and Evolution (ICSME) (pp. 568-571). IEEE.
  • S31 Arora, M., Chopra, S., Gupta, P. (2016). Estimation of regression test effort in agile projects. Far East J. Electron. Commun, 3, 741-753.
  • S32 Kandil, P., Moussa, S., Badr, N. (2015, December). A methodology for regression testing reduction and prioritization of agile releases. In 2015 5th international conference on Information Communication Technology and accessibility (ICTA) (pp. 1-6). IEEE.
  • S33 Anita, N. C. (2014). A regression test selection technique by optimizing user stories in an agile environment. In 2014 IEEE International Advance Computing Conference (IACC) (pp. 1454-1458).
  • S34 Elbaum, S., Rothermel, G., Penix, J. (2014, November). Techniques for improving regression testing in continuous integration development environments. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering (pp. 235-245).
  • S35 Marijan, D., Gotlieb, A., Sen, S. (2013, September). Test case prioritization for continuous regression testing: An industrial case study. In 2013 IEEE International Conference on Software Maintenance (pp. 540-543). IEEE.

References

  1. Rothermel, G.; Untch, R.H.; Chu, C.; Harrold, M.J. Test case prioritization: An empirical study. In Proceedings of the IEEE International Conference on Software Maintenance-1999 (ICSM’99). ‘Software Maintenance for Business Change’ (Cat. No. 99CB36360), Oxford, UK, 30 August–3 September 1999; pp. 179–188.
  2. Fowler, M.; Foemmel, M. Continuous Integration. Available online: https://martinfowler.com/articles/continuousIntegration.html (accessed on 10 January 2025).
  3. Lima, J.A.P.; Mendonça, W.D.; Vergilio, S.R.; Assunção, W.K. Learning-based prioritization of test cases in continuous integration of highly-configurable software. In Proceedings of the 24th ACM Conference on Systems and Software Product Line: Volume A, Montreal, QC, Canada, 19–23 October 2020; pp. 1–11.
  4. Spieker, H.; Gotlieb, A.; Marijan, D.; Mossige, M. Reinforcement learning for automatic test case prioritization and selection in continuous integration. In Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis, Santa Barbara, CA, USA, 10–14 July 2017; pp. 12–22.
  5. Pan, R.; Bagherzadeh, M.; Ghaleb, T.A.; Briand, L. Test case selection and prioritization using machine learning: A systematic literature review. Empir. Softw. Eng. 2022, 27, 29.
  6. Kazmi, R.; Jawawi, D.N.A.; Mohamad, R.; Ghani, I. Effective regression test case selection: A systematic literature review. ACM Comput. Surv. 2017, 50, 1–32.
  7. Beck, K.; Beedle, M.; van Bennekum, A.; Cockburn, A.; Cunningham, W.; Fowler, M.; Grenning, J.; Highsmith, J.; Hunt, A.; Jeffries, R.; et al. Manifesto for Agile Software Development. Available online: https://agilemanifesto.org/ (accessed on 10 January 2025).
  8. Yoo, S.; Harman, M. Regression testing minimization, selection and prioritization: A survey. Softw. Test. Verif. Reliab. 2012, 22, 67–120.
  9. CrowdStrike. Channel File 291 Incident Root Cause Analysis; Technical Report; CrowdStrike, Inc.: Austin, TX, USA, 2024. Available online: https://www.crowdstrike.com/ (accessed on 10 January 2025).
  10. CrowdStrike. What Is CI/CD? Available online: https://www.crowdstrike.com/en-us/cybersecurity-101/cloud-security/continuous-integration-continuous-delivery-ci-cd/ (accessed on 12 September 2024).
  11. Coutinho, J.C.; Andrade, W.L.; Machado, P.D. Requirements engineering and software testing in agile methodologies: A systematic mapping. In Proceedings of the XXXIII Brazilian Symposium on Software Engineering, Salvador, Brazil, 23–27 September 2019; pp. 322–331.
  12. Silva, A.; Araújo, T.; Nunes, J.; Perkusich, M.; Dilorenzo, E.; Almeida, H.; Perkusich, A. A systematic review on the use of definition of done on agile software development projects. In Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering, Karlskrona, Sweden, 15–16 June 2017; pp. 364–373.
  13. Heidenberg, J.; Weijola, M.; Mikkonen, K.; Porres, I. A model for business value in large-scale agile and lean software development. In Systems, Software and Services Process Improvement, Proceedings of the 19th European Conference, EuroSPI 2012, Vienna, Austria, 25–27 June 2012; Springer: Berlin/Heidelberg, Germany, 2012; Volume 19, pp. 49–60.
  14. Poppendieck, M. Principles of Lean Thinking. IT Manag. Sel. 2011, 18, 1–7.
  15. Digital.ai. 17th State of Agile Report. Available online: https://digital.ai/resource-center/analyst-reports/state-of-agile-report/ (accessed on 28 October 2024).
  16. Greca, R.; Miranda, B.; Bertolino, A. State of practical applicability of regression testing research: A live systematic literature review. ACM Comput. Surv. 2023, 55, 1–36.
  17. Khatibsyarbini, M.; Isa, M.A.; Jawawi, D.N.; Tumeng, R. Test case prioritization approaches in regression testing: A systematic literature review. Inf. Softw. Technol. 2018, 93, 74–93.
  18. Durelli, V.H.; Durelli, R.S.; Borges, S.S.; Endo, A.T.; Eler, M.M.; Dias, D.R.; Guimarães, M.P. Machine learning applied to software testing: A systematic mapping study. IEEE Trans. Reliab. 2019, 68, 1189–1212.
  19. Petersen, K.; Vakkalanka, S.; Kuzniarz, L. Guidelines for conducting systematic mapping studies in software engineering: An update. Inf. Softw. Technol. 2015, 64, 1–18.
  20. Kitchenham, B.; Charters, S. Guidelines for Performing Systematic Literature Reviews in Software Engineering. Available online: https://www.researchgate.net/profile/Barbara-Kitchenham/publication/302924724_Guidelines_for_performing_Systematic_Literature_Reviews_in_Software_Engineering/links/61712932766c4a211c03a6f7/Guidelines-for-performing-Systematic-Literature-Reviews-in-Software-Engineering.pdf (accessed on 28 October 2024).
  21. Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G.; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. Int. J. Surg. 2010, 8, 336–341.
  22. Takahashi, K.; Yamamoto, K.; Kuchiba, A.; Koyama, T. Confidence interval for micro-averaged F1 and macro-averaged F1 scores. Appl. Intell. 2022, 52, 4961–4972.
  23. Raghavan, V.; Bollmann, P.; Jung, G.S. A critical investigation of recall and precision as measures of retrieval system performance. ACM Trans. Inf. Syst. 1989, 7, 205–229.
  24. Das, S.; Gary, K. Agile transformation at scale: A tertiary study. In Agile Processes in Software Engineering and Extreme Programming–Workshops, Proceedings of the XP 2021 Workshops, Virtual Event, 14–18 June 2021; Revised Selected Papers 22; Springer International Publishing: Cham, Switzerland, 2021; pp. 3–11.
  25. Das, S.; Gary, K. Challenges and Success Factors in Large Scale Agile Transformation—A Systematic Literature Review. In Proceedings of the International Conference on Information Technology-New Generations, Las Vegas, NV, USA, 14–18 April 2024; Springer: Cham, Switzerland, 2024; pp. 405–416.
Figure 1. Study selection process.
Figure 2. Percentage of studies in each focus area.
Figure 3. Number of publications in each focus area over time.
Figure 4. Frequency of metrics used.
Figure 5. Percentage of each agile method/type.
Table 1. Libraries and Paper Counts.

Library | Count
Google Scholar | 100
SCOPUS | 24
ACM | 18
IEEE | 16
Table 2. Focus Areas.

Focus | Definition
RTM (regression test minimization) | Given a test suite T and test requirements {r1, …, rn}, with subsets T1, …, Tn ⊆ T such that each Ti satisfies ri, find a minimal hitting set T′ ⊆ T that satisfies all ri, i.e., remove redundant tests from the suite so that the reduced set still meets all testing requirements [8].
RTS (regression test selection) | Given program P, modified version P′, and test suite T, find a subset T′ ⊆ T with which to test P′, i.e., select from the available tests those relevant to testing the changed program [8].
RTP (regression test prioritization) | Given test suite T, its set of permutations PT, and a function f: PT → ℝ, find T′ ∈ PT such that ∀T″ ∈ PT, f(T′) ≥ f(T″), i.e., order tests to maximize some desirable property of the suite (e.g., detecting faults earlier) [8].
RTG (regression test generation) | Creates tests to ensure software changes do not break existing functionality [S25].
Cost (cost of regression testing) | The total resources, measured in person-hours, spent on regression testing [S31].
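To make the formal definitions in Table 2 concrete, the sketch below gives one minimal Python interpretation of each technique: RTM as a greedy hitting-set approximation, RTS as intersecting per-test execution traces with the set of changed entities, and RTP as ordering by a scoring function f. The code and its data structures (coverage, traces, score) are illustrative assumptions, not an implementation drawn from any of the included studies.

```python
# Illustrative sketch of the Table 2 definitions; all data structures here
# are hypothetical and chosen only to make the set operations explicit.

def minimize(suite, coverage):
    """RTM: greedy hitting-set approximation. `coverage` maps each test to
    the set of requirements it satisfies; assumes every requirement is
    satisfiable by at least one test in the suite."""
    uncovered = set().union(*coverage.values())
    reduced = []
    while uncovered:
        # Take the test that satisfies the most still-uncovered requirements.
        best = max(suite, key=lambda t: len(coverage[t] & uncovered))
        reduced.append(best)
        uncovered -= coverage[best]
    return reduced

def select(suite, traces, changed):
    """RTS: keep only tests whose execution trace touches a changed entity."""
    return [t for t in suite if traces[t] & changed]

def prioritize(suite, score):
    """RTP: order tests so that an objective f (here, a per-test score such
    as historical fault-detection likelihood) is maximized early in the run."""
    return sorted(suite, key=score, reverse=True)

suite = ["t1", "t2", "t3"]
coverage = {"t1": {"r1", "r2"}, "t2": {"r2"}, "t3": {"r3"}}
traces = {"t1": {"mod_a"}, "t2": {"mod_b"}, "t3": {"mod_a", "mod_c"}}
print(minimize(suite, coverage))                                 # ['t1', 't3']
print(select(suite, traces, {"mod_a"}))                          # ['t1', 't3']
print(prioritize(suite, {"t1": 0.2, "t2": 0.9, "t3": 0.5}.get))  # ['t2', 't3', 't1']
```

In practice, the included studies replace these placeholder heuristics with coverage analysis, failure-history models, or learned rankings, but the underlying set-theoretic framing is the same.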
Table 3. Included Studies.

Paper ID | Year | Focus | Metric | Agile Type
S01 | 2024 | RTP | Precision, Recall, F-score | Not specified
S02 | 2024 | RTS | Number of test cases | Not specified
S03 | 2024 | RTP | NAPFD | Not specified
S04 | 2023 | Cost | Cost of RTS | Continuous Integration
S05 | 2022 | RTP | NAPFD | Continuous Integration
S06 | 2022 | RTS | Test Time Saving | Continuous Integration
S07 | 2022 | RTP | FDR | Continuous Integration
S08 | 2022 | RTP | FDR | Continuous Integration
S09 | 2022 | RTP | NAPFD | Continuous Integration
S10 | 2022 | RTP | NRPA, APFD | Continuous Integration
S11 | 2021 | RTS | MSR | Continuous Integration
S12 | 2021 | RTS, RTP | APFDc | Continuous Integration
S13 | 2021 | RTS | Precision, Efficiency | BDD
S14 | 2020 | RTP | Time to Prioritize Tests | Continuous Integration
S15 | 2020 | RTP | Precision | Continuous Integration
S16 | 2020 | RTP | RPA | Continuous Integration
S17 | 2020 | RTS, RTP | APFD | Continuous Integration
S18 | 2020 | RTP | NAPFD, RFTC, NTR | Continuous Integration
S19 | 2020 | RTP | APFD | Continuous Integration
S20 | 2020 | RTP | APFD, NAPFD | Continuous Integration
S21 | 2019 | RTS | Time Saved with RTS | Continuous Integration
S22 | 2019 | RTM | Fault Detection vs. Time Budgets | Continuous Integration
S23 | 2018 | RTS | Cost of RTS | Continuous Integration
S24 | 2017 | Cost | Precision | Continuous Integration
S25 | 2017 | RTG | Coverage | Continuous Integration
S26 | 2017 | RTS, RTP | F-score, APFD | Scrum
S27 | 2017 | Cost | Cost of regression test | Continuous Integration
S28 | 2017 | RTP | APFD | Continuous Integration
S29 | 2017 | RTS, RTP | NAPFD | Continuous Integration
S30 | 2016 | RTP | APFD | Continuous Integration
S31 | 2016 | Cost | Cost of RT | Not specified
S32 | 2015 | RTM, RTP | Coverage, Test Set Size | Not specified
S33 | 2014 | RTS | Coverage | Not specified
S34 | 2014 | RTS, RTP | APFD | Continuous Integration
S35 | 2013 | RTP | APFDc vs. Time | Continuous Integration
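APFD (Average Percentage of Faults Detected) and its variants (NAPFD, APFDc) dominate the metric column of Table 3. For reference, the standard formula from Rothermel et al. [1] is APFD = 1 − (TF1 + … + TFm)/(n·m) + 1/(2n), where n is the number of tests, m the number of faults, and TFi the 1-based position of the first test that exposes fault i. The sketch below computes it for a hypothetical prioritized suite; the test order and fault-detection matrix are invented for illustration.

```python
# Minimal sketch of the standard APFD metric (Rothermel et al. [1]);
# the test order and fault matrix below are hypothetical examples.

def apfd(order, fault_matrix):
    """order: prioritized list of test IDs.
    fault_matrix: maps each fault to the set of tests that detect it.
    Returns 1 - (sum of TF_i) / (n * m) + 1 / (2 * n), where TF_i is the
    1-based position in `order` of the first test exposing fault i."""
    n, m = len(order), len(fault_matrix)
    position = {test: i + 1 for i, test in enumerate(order)}
    tf_sum = sum(min(position[t] for t in detecting)
                 for detecting in fault_matrix.values())
    return 1 - tf_sum / (n * m) + 1 / (2 * n)

# Five tests, two faults: f1 is first exposed by the 1st test in the order,
# f2 by the 3rd, so APFD = 1 - (1 + 3)/(5 * 2) + 1/(2 * 5) = 0.7.
order = ["t3", "t1", "t4", "t2", "t5"]
faults = {"f1": {"t3", "t2"}, "f2": {"t4", "t5"}}
print(round(apfd(order, faults), 2))  # 0.7
```

A higher APFD means faults are exposed earlier in the prioritized run; NAPFD normalizes for selected subsets that miss some faults, and APFDc additionally weights tests by execution cost.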
