Article

Incremental Formula-Based Fix Localization

1 Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16419, Korea
2 College of Computing, Sungkyunkwan University, Suwon 16419, Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2021, 11(1), 303; https://doi.org/10.3390/app11010303
Submission received: 13 November 2020 / Revised: 21 December 2020 / Accepted: 23 December 2020 / Published: 30 December 2020

Abstract:
Automatically fixing bugs in software programs can significantly reduce costs and improve productivity in software development. Toward this goal, a critical and challenging problem is automatic fix localization, which identifies program locations where a bug fix can be synthesized. In this paper, we present AgxFaults, a technique that automatically identifies minimal subsets of program statements at which a suitable modification can remove the error. AgxFaults works by dynamically encoding the semantics of the program parts that are relevant to an observed error into an unsatisfiable logical formula and then manipulating this formula in an incremental, on-demand manner. We perform various experiments on faulty versions of the traffic collision avoidance system (TCAS) program in the Siemens Suite, programs in Bekkouche’s benchmark, and several real bugs in the Defects4J benchmark. The experimental results show that AgxFaults outperforms single-path-formula approaches in the effectiveness of both fix localization and fault localization. AgxFaults is more efficient and scalable than program-formula-based approaches while providing similar effectiveness: its solving time is 28% faster, and its running time is 45% faster, than the program-formula-based approach, with similar fault localization results.

1. Introduction

Debugging is an essential and yet the most expensive task in software development [1,2]. It includes a labor-intensive process of locating and fixing faulty code in a buggy program. This process consumes about 50% of the total software development costs, of which the majority is spent on fault localization [1]. Automatic techniques that reduce manual effort in debugging can significantly impact software costs and productivity [3].
Many automatic techniques have been proposed for supporting developers in various debugging activities (e.g., References [4,5,6,7]). Most fault localization approaches (e.g., spectrum-based or mutation-based methods [8,9]) focus on computing a statistical measure of suspiciousness to rank program statements by their likelihood of being faulty. However, to be useful, these methods require a test suite containing many passing and failing executions with high code coverage [10,11]. Such a high-quality test suite is often not available in practice. Moreover, these methods provide only a ranked list of suspicious statements without any explanation; thus, developers still need considerable inspection effort to examine these statements in order to localize and fix faults [3].
Formula-based fault localization (FFL) is a particularly promising approach, as it not only logically identifies possible fault locations but also provides additional information that helps to explain and fix the faults [5]. Consider a buggy program and a failing test case whose execution trace, called the error trace, demonstrates an error. FFL techniques work by constructing an unsatisfiable logical formula, called the error trace formula, that is a symbolic representation of the error trace, and by using an automatic solver to find the causes of this formula’s unsatisfiability. Based on the solution obtained from the solver, possible faulty statements in the program can be logically identified. Existing FFL techniques differ in how they construct the error trace formula and how they use the automatic solver to manipulate the formula for locating the fault.
In References [12,13,14,15], a static analysis technique is used to construct a logical formula, called the program-formula, that is semantically equivalent to the input program (with respect to a certain unwinding bound). Specifically, every satisfying assignment of the program-formula corresponds to a feasible execution of the program, and, vice versa, every program execution (within the bound) corresponds to a satisfying assignment of the formula. This program-formula is then conjoined with clauses encoding the input values and the assertions of a failing test case to form an unsatisfiable formula. The extended formula is fed into a pMaxSAT solver, which finds an assignment to the formula’s variables that maximizes the number of satisfied clauses. The set of unsatisfied clauses, called a minimal correction subset (MCS) of the formula, indicates a corresponding minimal set of program statements that can be modified to correct the considered error execution. In addition, the obtained variable assignment corresponds to a feasible correct execution of the angelically fixed program, which replaces all statements in the MCS with suitable angelic values. As a result, these approaches can provide developers with a potential minimal fix location, together with a successful angelic execution as an explanation.
A key limitation of these methods is that they are extremely computationally expensive and have scalability issues even with small programs. This is because they encode all possible execution paths of a program into one formula, which easily leads to a very large and complex formula that recent solvers cannot handle, or can handle only with difficulty. Jin [16] proposed an on-demand formula computation (OFC) technique to construct a smaller formula that encodes only the program parts relevant to a given test case. Their experimental evaluation showed that OFC formulas are much simpler than program-formulas but still sufficient to produce the same result as the program-formulas. This method, however, requires computing all MCSs of multiple intermediate formulas before obtaining the final formula. Enumerating all MCSs of many formulas may outweigh the benefit of generating a simpler formula.
The approaches in References [17,18,19] work with a formula encoding the semantics of a single execution path; we refer to this formula as the single-path trace formula. A single-path trace formula is semantically equivalent to a straight-line program containing the program statements whose execution produced the error. Although single-path trace formulas are simple and thus easy to solve, they do not contain information about the control dependences among statements in the original program, so an MCS obtained from them may not correctly correspond to an angelic fix in the original program. As a result, these methods may fail to identify some angelic fix locations, and they may also report invalid angelic fix candidates.
In this paper, we present AgxFaults, an incremental formula-based fault localization method that overcomes the above limitations. Our method is based on two main components. First, instead of relying on a static formula that encodes the entire program (like the program-formula) or only a single execution path (like the single-path formula), AgxFaults is based on an error formula that is constructed and extended dynamically in an on-demand manner. This formula, called the angelic error trace formula, encodes only the program parts that are relevant to a specific failing test case and over-abstracts all unrelated program parts as angelic non-deterministic executions. This encoding yields compact error trace formulas that are easier to solve but still sufficient to identify both data- and control-related faults. Second, because the angelic formula grows dynamically, instead of making multiple separate calls to a MaxSMT solver to solve multiple formulas (as in the OFC approach), we adapt the incremental core-guided MaxSMT algorithm [20] to manipulate the formula and compute the MCSs incrementally.
We implemented our method in a tool named AgxFaults by extending Java PathFinder (a NASA model checking tool) to localize faults in Java programs. The input of AgxFaults is a buggy program and failing test cases (given either as a JUnit test case or as a pair of input and expected post-condition in a configuration file). AgxFaults outputs a set of minimal angelic fix candidates (MFCs); each MFC is a pair consisting of a minimal fix location set and an angelic execution path showing that modifying these statements makes the given failing test case pass.
We evaluated our method on various programs of different kinds and sizes. These programs include several sample programs provided by Bekkouche [19], 41 faulty versions of a commercial traffic collision avoidance system (TCAS), and several large and complex real-world programs in the Defects4J benchmark. The experimental results show that AgxFaults succeeded in reporting the actual fault location for all bugs in both the Bekkouche and TCAS benchmarks. AgxFaults outperformed single-path-formula approaches in both the success rate and the accuracy of fault localization, and it provided results similar to the program-formula approach with better efficiency and scalability. Specifically, AgxFaults has a 28% faster formula solving time and a 45% faster running time than the program-formula approach when applied to the TCAS programs. Furthermore, as program complexity increased (e.g., with a larger loop unwinding bound), the formula solving time of the program-formula-based method grew exponentially, while that of AgxFaults grew much more slowly.
In summary, we have made the following contributions in this paper:
  • We propose a technique to dynamically encode the semantics of the partial program related to a specific test input into a formula in an incremental, on-demand manner.
  • We present an iterative algorithm for enumerating minimal angelic fix candidates by manipulating and solving the constructed error formula incrementally.
  • We implement our proposed method in a tool, AgxFaults, which is publicly available as open-source software.
  • We perform experiments on various public benchmarks and open-source projects to show the effectiveness of our proposed approach.
The rest of this paper is organized as follows. We first provide a basic background in Section 2. We then describe the detail of the proposed method in Section 3. We describe our experimental setup in Section 4 and discuss the experimental results in Section 5. We review related work in Section 6. Finally, we give our conclusions in Section 7.

2. Background

In this section, we describe the fault localization problem and provide a running example. Then, we explain the basic background of maximal satisfiability-based fault localization.

2.1. Fault Localization Problem

Fault localization is the problem of identifying the program statements that are responsible for an observed failure in a software program. Without knowing the correct program in advance, it is impractical to automatically pinpoint faults with absolute accuracy. Indeed, any program statement, or subset of program statements, that, if suitably modified, can remove the failure is considered possibly faulty [12,14,16]. Since checking the existence of an actual syntactic fix for a program is extremely computationally expensive [7,21], we instead check for the existence of an angelic fix, which removes the failure angelically by replacing program expressions with suitable non-deterministic values (i.e., angelic values).
In the context of this paper, we consider fault localization to be the problem of finding angelic fix candidates for an observed failure in a software program. An angelic fix candidate (AFC) consists of (1) a fix location set (i.e., a set of suspicious statements) and (2) angelic values (i.e., a set of values that, if substituted at these statements, would make the given failing execution succeed). Essentially, the angelic values represent an angelic execution that diverges from the original program execution by replacing the values of specific variables at the fix locations with their corresponding angelic values. Because a large number of angelic fix candidates may exist, providing the developer with a set containing only minimal angelic fix candidates (i.e., candidates whose fix location set is minimal) is preferable. Thus, given a faulty program and a failing test case whose execution demonstrates a failure, we produce a set of minimal angelic fix candidates (MFCs) that make the given test execution succeed.
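As a sketch of the data involved, an angelic fix candidate can be represented as a pair of a fix location set and a map of angelic values. The class and field names below are our own illustration (populated with the values of mfc_3 and the larger candidate discussed later in this section), not AgxFaults code:

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class AngelicFixCandidate:
    """A fix location set plus the angelic values that repair the failing run."""
    locations: List[int]            # statement (line) numbers forming the fix set
    angelic_values: Dict[str, int]  # variable name -> angelic value at the fix point

    def is_smaller_than(self, other: "AngelicFixCandidate") -> bool:
        # A candidate is minimal when no other candidate uses a proper
        # subset of its fix locations.
        return set(self.locations) < set(other.locations)

# The single-location candidate makes the two-location one non-minimal.
mfc = AngelicFixCandidate(locations=[3], angelic_values={"a": 5})
afc = AngelicFixCandidate(locations=[3, 1], angelic_values={"a": 5, "b": 3})
assert mfc.is_smaller_than(afc)
```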
Figure 1 shows a method foo and a unit test method testFoo, which checks whether foo returns a certain value when called on a particular input. The unit test method testFoo calls foo with the input x = 3, y = 5 and asserts that the output is equal to 2.
However, because of a fault at line 3, where the assignment (a = 2x) is accidentally written as (a = x), the method foo returns 8, thus violating the assertion, and the test fails. After running the test case testFoo, we know there is a fault in the program foo. Without knowledge of the correct program, however, a program statement can be considered possibly faulty if there exists a suitable replacement for it that makes the observed error disappear.
We use x = (L, V) to denote an angelic fix candidate x, where L = {l_1, …, l_n} is its fix location set and V = {v_1 = val_1, …, v_m = val_m} is the corresponding set of angelic values. In reality, the sizes of L and V may differ, e.g., a statement may have multiple instances in an execution. To simplify the presentation, however, we assume m = n; thus, the angelic value V[i] corresponds to the statement at fix location L[i]. Each item v_i = val_i in the angelic value set V maps an angelic value val_i to the variable v_i at fix location i.
Given the program and a failing test case in Figure 1, our approach produces five minimal angelic fix candidates: mfc_1 = ({8}, {return = 2}); mfc_2 = ({6}, {guard = True}); mfc_3 = ({3}, {a = 5}); mfc_4 = ({1}, {b = 2}); and mfc_5 = ({2, 4}, {guard = False, a = 5}).
Consider MFC mfc_3 = ({3}, {a_1 = 5}). This MFC states that the failure can be removed by modifying the assignment (a = x) at line 3 so that it assigns the angelic value 5 to the variable a. Indeed, assigning 5 to the variable a at line 3 changes the value of the condition expression in the if-statement at line 6 (i.e., a >= y) from false to true; thus, the execution flow of the error trace is flipped into the true-branch. As a result, the statement (b = x y) at line 7 is executed, the final value of variable b at the return statement is 2, and the test assertion is satisfied.
An angelic fix candidate is said to be a feasible fix candidate if substituting the values of the variables at the fix locations with their corresponding angelic values actually results in a successful program execution, i.e., the corresponding angelic execution succeeds. Otherwise, it is said to be an invalid (or infeasible) angelic fix candidate. All five MFCs above are feasible fix candidates because their corresponding angelic executions are feasible.
An angelic fix candidate is a correct fault location if all statements in its fix location set are actually faulty. For example, mfc_3 = ({3}, {a_1 = 5}) is a correct fault location because the only statement in its fix location set, the statement at line 3, is actually faulty. Consider, in contrast, the MFC mfc = ({3, 1}, {a_1 = 5, b_1 = 3}). This MFC is a feasible fix candidate because replacing the value of variable “b” at line 1 with 3 and the value of variable “a” with 5 actually makes the test execution succeed. It is not a correct fault location, however, because its fix location set contains the statement at line 1, which is not faulty.
An angelic fix candidate is said to be a correct fix candidate if it is both a feasible fix candidate and a correct fault location. For example, mfc_3 = ({3}, {a_1 = 5}) is a correct fix candidate, as it is both feasible and a correct fault location. Consider, in contrast, the MFC mfc = ({3}, {a_1 = 0}). This MFC is a correct fault location, but it is not a feasible fix candidate because replacing the value of variable “a” at line 3 with 0 does not make the test execution succeed. Thus, it is not a correct fix candidate.

2.2. Formula Satisfiability and Solvers

Formula satisfiability (SAT or SMT) is the problem of determining whether there exists an assignment to the variables of a given logic formula such that the formula evaluates to true. If such an assignment (called a model) exists, the formula is satisfiable (SAT); otherwise, it is unsatisfiable (UNSAT).
Maximum satisfiability (MaxSAT or MaxSMT) is an optimization version of the SAT problem whose goal is to find a model for a given formula that maximizes the number of clauses satisfied together. Such a maximal subset of clauses is called a maximal satisfiable subset (MSS). The complement of an MSS is called a minimal correction subset (MCS): a minimal subset of clauses whose removal makes the remaining formula satisfiable again. Partial MaxSAT (pMaxSAT) is an extension of MaxSAT in which clauses are marked as either “soft” or “hard”. The goal in pMaxSAT is to find a model that satisfies all “hard” clauses and maximizes the number of satisfied “soft” clauses.
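The MCS definition above can be made concrete with a tiny propositional example. The following sketch (illustrative only; the clauses, variable names, and brute-force search are our own, not part of any solver) enumerates the MCSs of a small pMaxSAT instance by testing ever-larger subsets of soft clauses to drop:

```python
from itertools import combinations, product

# Tiny pMaxSAT instance over booleans x, y (hypothetical example).
hard = [lambda x, y: x or y]                                   # must hold
soft = [lambda x, y: not x, lambda x, y: not y, lambda x, y: x]  # may be dropped

def satisfiable(clauses):
    # Brute-force SAT check over the two boolean variables.
    return any(all(c(x, y) for c in clauses)
               for x, y in product([False, True], repeat=2))

def minimal_correction_subsets(hard, soft):
    """Smallest-first search for soft subsets whose removal makes the
    rest (plus all hard clauses) satisfiable, minimal w.r.t. inclusion."""
    mcss = []
    for k in range(len(soft) + 1):
        for drop in combinations(range(len(soft)), k):
            if any(set(m) <= set(drop) for m in mcss):
                continue  # not minimal: contains an already-found MCS
            keep = [c for i, c in enumerate(soft) if i not in drop]
            if satisfiable(hard + keep):
                mcss.append(drop)
    return mcss

# Dropping soft clause 0, or soft clauses 1 and 2 together, restores SAT.
print(minimal_correction_subsets(hard, soft))  # -> [(0,), (1, 2)]
```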
Although the SAT problem is known to be NP-complete, recent solvers can handle large SAT formulas encoding practical industrial problems. SAT solvers are programs that accept a logic formula in conjunctive normal form (CNF) and decide whether it is satisfiable. If the formula is satisfiable, the solver returns “SAT” and provides a model; otherwise, it returns “UNSAT” and may produce an unsatisfiable core, a subset of clauses that cannot be satisfied together, as an explanation for the unsatisfiability. Some recent solvers support incremental SAT solving, which facilitates solving a series of closely related formulas. Incremental SAT solvers retain the information learned while checking the satisfiability of one formula and use it to avoid repeating redundant work in subsequent satisfiability checks.
Generally, MaxSAT algorithms perform a succession of SAT solver calls; after each call, they add cardinality constraints to move toward an optimal solution. State-of-the-art MaxSAT solvers are based on the core-guided MaxSAT algorithm, which leverages the unsatisfiable core produced by the SAT solver. Specifically, after each call to the SAT solver, they relax the clauses in the unsatisfiable core by associating a relaxation variable with each such clause. To reach optimal solutions, they add cardinality constraints that bound the number of relaxed clauses.
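The relaxation step can be sketched in miniature. This is a simplified illustration under our own assumptions (a single boolean variable, a hand-picked unsatisfiable core, and brute-force search standing in for a real SAT solver), not the full core-guided algorithm of Reference [20]:

```python
from itertools import product

def relax_core(core_clauses):
    """Replace each soft clause c in an unsat core by (r or c) for a fresh
    relaxation variable r, and bound how many r may be true (at most one)."""
    def relaxed(i, clause):
        return lambda assign, relax: relax[i] or clause(assign)
    new_clauses = [relaxed(i, c) for i, c in enumerate(core_clauses)]
    at_most_one = lambda assign, relax: sum(relax) <= 1
    return new_clauses, at_most_one

# Unsatisfiable core over one boolean x: {x, not x}.
core = [lambda a: a["x"], lambda a: not a["x"]]
clauses, card = relax_core(core)

def solve(clauses, card, n_relax):
    # Brute-force stand-in for a SAT call on the relaxed formula.
    for x in (False, True):
        for relax in product((False, True), repeat=n_relax):
            a = {"x": x}
            if card(a, relax) and all(c(a, relax) for c in clauses):
                return a, relax
    return None

model = solve(clauses, card, 2)
assert model is not None  # relaxation makes the core satisfiable
```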

2.3. MaxSAT-Based Fault Localization

MaxSAT-based fault localization approaches [12,14,15,19,22] reduce fault localization to the maximal satisfiability problem: finding a variable assignment for a logic formula such that the number of satisfied clauses is maximized. Given a buggy program P and a failing test case T(inp, as) that exposes a bug in the program, these approaches proceed as follows.
First, they use a bounded model checking or symbolic execution tool to construct a logical formula, called the error trace formula, that semantically represents the error execution of the buggy program on the failing test case. Essentially, this error trace formula is the conjunction Φ ≡ φ_inp ∧ φ_tf ∧ φ_as, where φ_inp encodes the test input, φ_tf is the trace formula encoding the semantics of the program execution trace induced by the given input, and φ_as encodes the test assertion that the program must satisfy (or the expected output that the program must produce for the given test input). Since the program fails the test, this error trace formula is logically unsatisfiable. Because the test input and the test assertion are correct by definition, the clauses encoding them are not responsible for the unsatisfiability; the causes must therefore lie among the clauses of the trace formula φ_tf. This mirrors exactly the situation in which faulty statements in the program are responsible for the test failure.
Second, they treat the constructed error trace formula as an instance of the partial MaxSAT problem, in which the clauses encoding the test input φ_inp and the test assertions φ_as are marked “hard”, and the clauses of the trace formula φ_tf are marked “soft”. They then feed the formula into a pMaxSAT solver to obtain a set of MCSs of the formula. Intuitively, the clauses in an MCS indicate a corresponding minimal fix location set, and the maximum satisfiability model of the MCS provides angelic values for these fix locations. As a result, they can produce a set of minimal fix candidates that make the given test execution succeed.
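As a minimal illustration of this reduction (using a hypothetical two-statement trace of our own, not the program of Figure 1), the hard clauses encode the test input and assertion, the soft clauses encode the trace, and each MCS maps back to a candidate fix location set:

```python
from itertools import combinations, product

# Hypothetical one-path trace: statements "x = a" and "y = x + 1",
# with failing test input a = 1 and assertion y == 3.
hard = [lambda a, x, y: a == 1, lambda a, x, y: y == 3]      # test input + assertion
soft = [lambda a, x, y: x == a, lambda a, x, y: y == x + 1]  # trace formula clauses
stmt_of = {0: "line 1: x = a", 1: "line 2: y = x + 1"}

def sat(clauses, domain=range(0, 5)):
    # Brute-force satisfiability over small integer values of a, x, y.
    return any(all(c(a, x, y) for c in clauses)
               for a, x, y in product(domain, repeat=3))

def mcss(hard, soft):
    found = []
    for k in range(len(soft) + 1):
        for drop in combinations(range(len(soft)), k):
            if any(set(m) <= set(drop) for m in found):
                continue  # skip non-minimal supersets
            if sat(hard + [c for i, c in enumerate(soft) if i not in drop]):
                found.append(drop)
    return found

# Each MCS maps back to a minimal set of candidate fix locations:
# dropping either trace clause restores satisfiability.
locations = [[stmt_of[i] for i in m] for m in mcss(hard, soft)]
print(locations)  # -> [['line 1: x = a'], ['line 2: y = x + 1']]
```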
Consider our example in Figure 1. Program-formula-based approaches, such as BugAssist [12] and SNIPER [14,15], first inline all function calls and unwind all loops in the program foo up to a given bound to obtain a loop-free and call-free program. They then transform the flattened program into a semantically equivalent program in static single assignment (SSA) form [23], in which each variable is assigned at most once. Figure 2 shows the SSA form of the method foo with a loop unwinding bound of two. Each statement in the SSA program is then represented as a logic clause, and these clauses are conjoined to form a program-formula, the formula TF shown in Figure 3. This program-formula is semantically equivalent to the original program with respect to the unwinding bound: every satisfying assignment of the program-formula corresponds to a feasible execution of the program, and, vice versa, every program execution (within the bound) corresponds to a satisfying assignment of the program-formula. The program-formula is then conjoined with clauses encoding the test input values and the test assertions of the failing test case to form the error trace formula IN ∧ TF ∧ AS, shown in Figure 3. Applying a pMaxSAT solver to this error trace formula yields five minimal correction subsets in total: mcs_1 = {c_9}, mcs_2 = {c_6}, mcs_3 = {c_3}, mcs_4 = {c_1}, and mcs_5 = {c_2, c_4}. Each MCS indicates a corresponding minimal fix location set, and the maximum satisfiability model of the MCS provides angelic values for these fix locations. As a result, the following five minimal fix candidates are identified and reported to the developer: mfc_1 = ({8}, {foo = 2}); mfc_2 = ({6}, {g_2 = True}); mfc_3 = ({3}, {a_1 = 5}); mfc_4 = ({1}, {b_1 = 2}); and mfc_5 = ({2, 4}, {g_1 = False, a_2 = 5}).

3. Proposed Method

In this section, we provide details of our proposed fault localization method, AgxFaults.

3.1. Overview

AgxFaults takes as input a buggy program and a failing test case that demonstrates a program failure. It outputs a set of pairs, each consisting of a minimal set of suspicious statements and an angelic execution that explains how the failure can be removed by replacing these suspicious statements with angelic values. The fault localization process of AgxFaults is iterative, incremental, and on-demand. Figure 4 provides a high-level view of AgxFaults and its main components. Below, we first briefly describe the main components and then explain the overall fault localization process of AgxFaults.
  • Angelic DCFG (Dynamic Control Flow Graph): the dynamic control flow graph of an angelic program [24]. This angelic program acts as an abstraction of the input buggy program in which only the program parts relevant to the error are represented precisely, while irrelevant parts are represented abstractly as angelic non-determinisms (i.e., executions that can produce non-deterministic values such that the program execution succeeds).
  • Error Trace Formula: essentially the formula IN ∧ TF_agx ∧ AS, where IN represents the test input, AS represents the assertions of the given failing test case, and TF_agx is semantically equivalent to the current version of the angelic program. The error trace formula encodes the fault localization problem of the current angelic program with the given failing test case. Each MCS of this angelic formula corresponds to an angelic execution in the angelic program.
  • Angelic Execution: a correct execution of the angelic program, obtained by diverting the original error trace so that the output of specific statements is dynamically replaced with proper values, i.e., angelic values, that make the test execution succeed.
  • On-demand Program Explorer and Encoder: responsible for refining the angelic program and the error trace formula in an on-demand manner.
  • Incremental Formula Solver: responsible for computing the minimal correction subsets (MCSs) of the angelic formula incrementally.
  • MCS Analyzer: analyzes each obtained MCS of the angelic formula to determine possible faults in the program. In addition, it determines which abstract parts of the angelic program need further refinement to provide a more precise result.
The core idea of AgxFaults is to work with an angelic program incrementally instead of a program-formula encoding the full semantics of the original buggy program, which can lead to very complex and expensive computation. In the beginning, only the statement instances executed in the original failing trace are represented precisely in the angelic program. The angelic program is then expanded dynamically, in an on-demand manner, after each iteration to provide more precise results.

3.2. Overall Fault Localization Algorithm

The overall fault localization process of AgxFaults is described in Algorithm 1. In the algorithm, P_agx represents the dynamic control flow graph of the angelic program, and solver is an instance of an incremental partial MaxSAT solver. Additional hard and soft constraints can be added to the solver via the methods addHard() and addSoft(), respectively. The solver’s Check() method returns True if it finds an MCS for the current formula; otherwise, it returns False.
In the beginning, the algorithm initializes P_agx as a pure angelic program, which does not contain any concrete statements (line 1). The formula solver, solver, is initialized with an empty set of soft constraints and a hard constraint set containing the clauses encoding the test input and its assertion (lines 1 to 3).
The main process of the algorithm is the loop from line 4 to line 15. It is an iterative process comprising the following steps. At the beginning of each iteration, the solver is asked to find an MCS for the current formula (line 4). If there is no further MCS, the loop terminates. Otherwise, the iteration starts and the solution of the formula is stored in model (line 5). Based on model, the corresponding possible faulty statements mcs and an angelic execution Π_agx in the angelic program are determined (line 6). If the angelic execution Π_agx contains unspecified executions (i.e., the condition at line 7 is true), the angelic refinement process refines the angelic program and the angelic formula at these angelic-branches (line 8). Otherwise, Π_agx is a feasible angelic execution; thus, the possible fault location mcs, together with the angelic execution path Π_agx, is reported to the developer (line 12). In addition, a blocking constraint is added to the solver (line 13). This blocking constraint guarantees that the current MCS will not be encountered again in subsequent iterations; specifically, the blocking constraint of the MCS is the disjunction ⋁_{c ∈ mcs} c, stating that the program statements in the mcs are not all simultaneously faulty.
Algorithm 1 Overall Fault Localization Algorithm
Input: prog: buggy program
Input: (inp, as): failing test case
Input: bound: maximum fix location set size
Output: {(stmt, angelic correct execution)}
1:  P_agx ← ∅; IN, AS ← inp, as
2:  Φ_h, Φ_s ← (IN ∧ AS), ∅
3:  solver ← new IncrMaxSMTSolver(Φ_h, Φ_s, bound)
4:  while solver.Check() ∧ ¬timeout() do
5:      φ_mcs, model ← solver.getModel()
6:      stmt, Π_agx ← AnalyzeMCS(model, P_agx)
7:      if Π_agx contains angelic-branches then
8:          P_agx, Φ_h, Φ_s ← AngelicRefinement(stmt, Π_agx, P_agx)
9:          solver.addHard(Φ_h)
10:         solver.addSoft(Φ_s)
11:     else
12:         writeOutput(stmt, Π_agx)
13:         solver.addHard(BlockingConstraint(φ_mcs))
14:     end if
15: end while
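The enumeration loop of Algorithm 1 (Check, report, add a blocking constraint, repeat) can be sketched in miniature as follows. This is our own simplified illustration: a brute-force search stands in for the incremental pMaxSAT solver, the angelic-refinement branch is omitted, and the example clauses are hypothetical:

```python
from itertools import combinations, product

def enumerate_mcss(hard, soft, domain=range(0, 5)):
    """Report one MCS per iteration, then 'block' it so later iterations
    must find a different one (mirroring lines 4-15 of Algorithm 1)."""
    def sat(clauses):
        return any(all(c(*v) for c in clauses)
                   for v in product(domain, repeat=3))

    def next_mcs(blocked):
        # Smallest-first search for a correction subset not yet blocked.
        for k in range(len(soft) + 1):
            for drop in combinations(range(len(soft)), k):
                if any(set(b) <= set(drop) for b in blocked):
                    continue  # blocking constraint: this MCS was already reported
                if sat(hard + [c for i, c in enumerate(soft) if i not in drop]):
                    return drop
        return None

    reported = []
    while (m := next_mcs(reported)) is not None:  # solver.Check() loop
        reported.append(m)                        # writeOutput + addHard(blocking)
    return reported

# Hypothetical trace "x = a; y = x + 1" with input a = 1 and assertion y == 3.
hard = [lambda a, x, y: a == 1, lambda a, x, y: y == 3]
soft = [lambda a, x, y: x == a, lambda a, x, y: y == x + 1]
print(enumerate_mcss(hard, soft))  # -> [(0,), (1,)]
```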

3.3. On-Demand Program Explorer and Incremental Formula Encoder

The On-demand Program Explorer and Encoder (OPEE) is responsible for dynamically constructing and refining the angelic program, as well as the angelic formula, in an incremental, on-demand manner. Algorithm 2 shows its details. It takes as input a program P, a concrete input inp, a set of angelic modifications mcs, and an angelic execution path Π that contains angelic-branches. Each angelic modification m ∈ mcs specifies a statement instance st and a value val, called the angelic value; mcs[st] denotes the angelic value of statement instance st. The abstract execution path Π contains a sequence of branch decisions, and Π[st] denotes the decision at branch instance st.
The procedure AngelicRefinement() in Algorithm 2 describes how the on-demand program explorer works. The OPEE component runs the program with the given test input and dynamically substitutes the value produced by each statement st ∈ mcs with its corresponding angelic value mcs[st] to explore the program execution path specified by the abstract path Π. During program execution, it treats each statement instance st differently depending on whether st corresponds to a concrete or a non-deterministic statement instance in the angelic program P_agx. If st corresponds to a concrete statement instance of P_agx, the condition at line 21 is true. If st is an angelic location, the execution engine perturbs the program memory M so that the output of st is replaced with its corresponding angelic value (line 23). The execution engine also checks whether the current execution has diverged from the expected abstract path by comparing the actual branch decision with the angelic branch decision (line 25); if a divergence occurs, the execution stops. If st corresponds to a non-deterministic statement in P_agx (i.e., the condition at line 21 is false), the angelic program is refined by making this statement instance concrete (line 31), and the angelic formula is simultaneously updated to encode this statement instance (line 32).
Let s t be the statement instance that is being encoded into the angelic formula. Depending on the type of s t , it is encoded in the angelic formula differently. The procedure UpdateFormula( ) in Algorithm 2 shows the details. Specifically, if  s t is an assignment statement [ v = e x p r ] , we represent s t as an equivalence relation between the variable v on the left-hand side and the expression e x p r on the right-hand side (line 47). For a conditional statement, [ i f c o n d ] , we add an extra variable g u a r d to represent the branch predicate, and we represent the conditional statement as an equivalence relation between the g u a r d variable and the conditional predicate expression (line 45). If s t is a phi statement instance, [ v = Φ ( g u a r d c s , e x p r ) ] , it is represented as an implication constraint ( g u a r d c s → ( v = e x p r ) ) . This implication constraint essentially states that, if the execution trace reaches the statement s t by going through the branch g u a r d c s of the conditional statement, then the value of variable v is equal to e x p r ; otherwise, the value of v is unconstrained. Encoding phi statements as implication constraints allows the constraints on the joined variable to be tightened by additional constraints when other branches of the conditional statement are executed.
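To make the phi encoding concrete, the following is a minimal sketch (in Python, purely for illustration; the actual tool encodes these constraints for an SMT solver, and the tiny value domain here is our own simplification). It shows that the implication g u a r d c s → ( v = e x p r ) forces v only when the branch is taken:

```python
# Constraint for the phi statement v = Φ(guard, expr), with expr fixed to 1:
# encoded as the implication (guard -> v == 1).
def phi_constraint(guard, v):
    return (not guard) or (v == 1)

# Enumerate a tiny domain to see which values of v the constraint allows.
def allowed_values(guard):
    return sorted({v for v in (0, 1, 2) if phi_constraint(guard, v)})

print(allowed_values(True))   # taken branch: v is forced to expr, i.e. [1]
print(allowed_values(False))  # untaken branch: v is unconstrained, [0, 1, 2]
```

When the guard is false, every value of v satisfies the implication, which is exactly why constraints from other branches can later tighten the joined variable.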
Algorithm 2 On-demand Program Explorer and Formula Encoder
16: procedure AngelicRefinement( m c s , Π a g x , P a g x )
17:  Φ h a r d , Φ s o f t ← ∅ , ∅
18:  M ← i n i t i a l i z e ( P , i n p ) ; s t ← s t e n t r y ;
19: while s t ≠ n u l l do
20:   ( s t s s a , π s s a ) = c o n v e r t T o S S A ( s t , π s s a )
21:  if s t s s a ∈ P a g x then
22:   if s t s s a ∈ m c s then
23:     M ← p e r t u r b ( s t s s a , m c s [ s t s s a ] )
24:   end if
25:   if ( s t s s a is [ i f c o n d ] ) then
26:    if M [ g u a r d s t ] ≠ Π a g x [ g u a r d s t ] then
27:     break   ▹ execution is diverted from the angelic path Π a g x
28:    end if
29:   end if
30:  else ▹ s t s s a ∉ P a g x
31:    P a g x ← P a g x ∪ { s t s s a }         ▹ refine the angelic program
32:   updateFormula( s t s s a , Φ h a r d , Φ s o f t )    ▹ refine the angelic formula
33:  end if
34:   ( s t , M ) ← e x e c u t e S t a t e m e n t ( P , M )
35: end while
36: return ( P a g x , Φ h a r d , Φ s o f t )
37: end procedure
 
38: procedure updateFormula( s t s s a , Φ h a r d , Φ s o f t )
39: if ( s t s s a is [ v = Φ ( g u a r d c s , e x p r ) ] ) then
40:   φ v a l ← ( g u a r d c s → ( v = e x p r ) )
41:   Φ h a r d ← Φ h a r d ∪ { φ v a l }
42: else
43:  if s t s s a is [ i f c o n d ] then
44:    c o n d ← conditional expression of s t s s a
45:    φ v a l ← ( g u a r d s t = c o n d )
46:  else
47:    φ v a l ← ( v = e x p r )
48:  end if
49:   a b ← fault predicate of original s t
50:   Φ h a r d ← Φ h a r d ∪ { a b ∨ φ v a l }
51:   Φ s o f t ← Φ s o f t ∪ { ¬ a b }
52: end if
53: end procedure
If a statement s t is faulty, a fix would replace it with a different statement. In that case, the constraints that constrain the values of variables in s t are invalid and should be relaxed. To reason about the faultiness of program statements, we associate with each statement s t in the original program a Boolean variable a b s t as a fault predicate. Specifically, the variable a b s t indicates that the statement s t is faulty (or correct) if it is evaluated to t r u e (or f a l s e , respectively). We encode each conditional and each assignment statement instance s t into the angelic formula by adding a clause ( a b s t ∨ φ v a l ) to the hard constraint set and adding a clause ¬ a b s t to the soft constraint set (lines 49–51), where φ v a l is the constraint representation of the statement instance. Since phi statements are artificial statements introduced to explicitly represent the dependence of variables on branch decisions in the execution trace, the faultiness of a phi statement instance is accounted for by faults in its preceding assignment or conditional statement instances. Thus, we add the constraint representing a phi statement instance into the angelic formula as a hard constraint (line 41).
To summarize, the constructed angelic formula is an instance of the partial maximum satisfiability problem: the hard constraints represent the semantics of the test case's input and assertions, together with the constructed angelic program, while the soft constraints contain the fault predicates of the buggy-program statements that have been represented in the angelic program. Each MCS of this angelic error trace formula indicates a minimal set of program statements whose corresponding fault predicates a b are evaluated to t r u e .
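The partial MaxSAT view can be illustrated with a toy example. The sketch below is pure Python brute force over a tiny integer domain; the two-statement program, the variable names, and the domain bound are our own invention for illustration, not AgxFaults' actual encoding, which targets an SMT solver. Statement s1 is y = x + 1 and s2 is z = y * 2, run on input x = 2 with the failing assertion z == 10; each statement constraint holds unless its fault predicate is relaxed:

```python
from itertools import combinations, product

X = 2  # concrete test input

def consistent(relaxed, y, z):
    # Hard constraints: each statement holds unless its fault predicate
    # (here just the statement's index) is in the relaxed set.
    if 1 not in relaxed and y != X + 1:   # s1: y = x + 1
        return False
    if 2 not in relaxed and z != y * 2:   # s2: z = y * 2
        return False
    return z == 10                        # failing test's expected output

def find_mcses(stmts=(1, 2), domain=range(0, 13)):
    mcses = []
    for k in range(len(stmts) + 1):       # smallest subsets first
        for subset in combinations(stmts, k):
            if any(m <= set(subset) for m in mcses):
                continue                  # a subset already suffices: not minimal
            if any(consistent(set(subset), y, z)
                   for y, z in product(domain, repeat=2)):
                mcses.append(set(subset))
    return mcses

print(find_mcses())  # → [{1}, {2}]
```

With all fault predicates false the formula is unsatisfiable (y = 3 forces z = 6 ≠ 10); relaxing either statement alone restores satisfiability, so both singletons are MCSes, i.e., minimal fix candidates.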

3.4. Incremental Formula Solver

After each iteration of the Angelic Refinement Loop, a new set of constraints is added to the angelic formula. Instead of treating the angelic formula after each iteration as an independent MaxSAT problem and invoking a MaxSAT solver to find MCSes from scratch, we consider all angelic formulas generated so far as a sequence of similar MaxSAT problems, i.e., an instance of a sequential maximum satisfiability problem [20]. The Formula Solver uses a sequential MaxSAT solver to compute the MCSes of the angelic formula incrementally.

3.5. MCS Analyzer and Report Writer

Given an MCS, m c s , and a satisfiable model of the angelic error trace formula, the MCS Analyzer is responsible for reconstructing the angelic execution trace Π a g x in the angelic program that corresponds to the MCS in the angelic formula. The angelic execution trace Π a g x is reconstructed by traversing the dynamic control flow graph of P a g x from the entry point as follows. The entry point of the program is the first element of Π a g x . All assignment statements on the path are added to Π a g x . When the traversal reaches a conditional branch node c s , the guard expression of c s is evaluated using the m o d e l to determine the selected branch, which we call b. If the selected branch b is not included in the angelic program P a g x (i.e., b is a non-deterministic branch), then the traversal moves to the corresponding phi statement instance of c s . Otherwise, it moves to the first statement in the selected branch and continues until it reaches the terminal node.
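A minimal sketch of this traversal follows. The CFG dictionary, node names, and model format are hypothetical simplifications; the real analyzer operates on JPF's dynamic control flow graph. Node s3 is left outside the angelic program, so when the model selects it, the traversal jumps to the conditional's phi statement instead:

```python
# Dynamic CFG of a tiny angelic program: each node maps to its successors.
# Conditional nodes carry a dict keyed by the guard's value in the model.
CFG = {
    "entry": ["s1"],
    "s1": ["c1"],                                # assignment
    "c1": {"True": "s2", "False": "s3"},         # conditional branch
    "s2": ["phi_c1"],
    "s3": ["phi_c1"],
    "phi_c1": ["exit"],
}
ANGELIC = {"entry", "s1", "c1", "s2", "phi_c1", "exit"}  # s3 is abstracted

def reconstruct_path(model):
    path, node = [], "entry"
    while node != "exit":
        path.append(node)
        succ = CFG[node]
        if isinstance(succ, dict):               # conditional node
            branch = succ[str(model[node])]      # evaluate guard in the model
            # A branch outside the angelic program is non-deterministic:
            # skip directly to the conditional's phi statement.
            node = branch if branch in ANGELIC else "phi_" + node
        else:
            node = succ[0]
    path.append("exit")
    return path

print(reconstruct_path({"c1": True}))   # deterministic branch: includes s2
print(reconstruct_path({"c1": False}))  # non-deterministic: jumps to phi_c1
```

The first call yields the path through s2, while the second skips the abstracted s3 and lands on the phi statement, mirroring the traversal rule described above.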

3.6. Illustrative Example

We illustrate how our proposed method works using the running example in Figure 1. Figure 5 shows the progress of AgxFaults in the first four iterations. In the figure, the green nodes represent program parts that are assumed to be correct; thus, they are encoded into the formula as hard constraints.
After the initialization steps, the angelic program P a g x contains only an entry point and a normal exit point (i.e., successful termination). The solver, s o l v e r , is initialized with an empty soft-constraint set and a hard-constraint set that encodes the entry point and the normal exit point, which represents the successful terminator. In the first iteration of the loop, the algorithm does nothing but invoke the AngelicRefinement method (line 8) with an empty angelic fix location set and an empty angelic path as input. Thus, the AngelicRefinement method simply runs the given test case and encodes the executed statements into the formula. Figure 5b shows the state of the algorithm after the first iteration. Specifically, the angelic program precisely represents only the statements executed in the original error trace, while all other parts are abstracted. Figure 6 shows the set of constraints encoded in the first iteration (we eliminate the fault predicate variables a b from the clauses for simplicity).
In the second iteration, the solver first solves the current angelic formula and returns an MCS ϕ m c s = { f o o = b 3 } , which corresponds to the return statement at line 8. Then, the MCSAnalyzer component analyzes the MCS and the corresponding model to identify the fix location set and the corresponding angelic path (line 6 in Algorithm 1). The constructed angelic path Π a g x is the path (s,1,2,6,8,9,OK) in the CFG of the angelic program. This path is feasible because all edges in the path are deterministic. Thus, the algorithm goes to line 12 to output the newly found minimal fix candidate and then adds a constraint to block this solution from occurring in later iterations. In other words, from this point on, the solver considers the statement at line 8 to be correct.
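The effect of blocking constraints can be sketched as follows, using a made-up two-statement encoding in pure Python (our own simplification: in AgxFaults, blocking amounts to adding a hard ¬ a b clause to the solver, so a blocked statement is treated as correct thereafter). Each iteration finds the smallest remaining relaxation set, then blocks its statements:

```python
from itertools import combinations, product

X = 2  # concrete test input

def feasible(relaxed, domain=range(0, 13)):
    # Does some assignment to (y, z) satisfy the non-relaxed statement
    # constraints (y = x + 1, z = y * 2) and the expected output z == 10?
    for y, z in product(domain, repeat=2):
        if 1 not in relaxed and y != X + 1:
            continue
        if 2 not in relaxed and z != y * 2:
            continue
        if z == 10:
            return True
    return False

def enumerate_with_blocking(stmts=(1, 2)):
    blocked, found = set(), []
    while True:
        candidates = [s for s in stmts if s not in blocked]
        mcs = next((set(c) for k in range(1, len(candidates) + 1)
                    for c in combinations(candidates, k)
                    if feasible(set(c))), None)
        if mcs is None:
            return found      # UNSAT: all minimal fix candidates enumerated
        found.append(mcs)
        blocked |= mcs        # block the solution: statements now "correct"

print(enumerate_with_blocking())  # → [{1}, {2}]
```

After the first MCS {1} is found and blocked, the second iteration searches only among the remaining statements, which is why the search space shrinks with each reported fix candidate.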
In the third iteration, the solver returns an MCS m c s = { b 1 = x 0 + y 0 } , which corresponds to the assignment [ b = x + y ] at line 1. The constructed angelic path for this MCS is the path (s,1,2,6,8,9,OK), which is a feasible path in the angelic program. Thus, as in the second iteration, the algorithm outputs the newly found MFC, adds a blocking constraint for line 1, and goes to the next iteration.
Figure 5c shows the current state of the angelic program and the angelic formula in the solver. Because of the blocking constraints added in the previous iterations, the statements at lines 1 and 8 are no longer considered as possible faults. Thus, the solver only needs to search a smaller space for the next MCSes.
In the fourth iteration, the solver finds an MCS m c s = { g 2 = ( a 1 > = y 0 ) } with the angelic value { g 2 = T r u e } . This MCS indicates that the branch condition at line 6 is the fix location. The constructed angelic path for this MCS is the path (S,1,2,3,6,?,8,9,OK). Because the angelic path contains a non-deterministic branch, the AngelicRefinement method is invoked with input m c s = { l i n e 6 } and its corresponding angelic value T r u e to explore the true-branch of the if-statement at line 6. After the refinement, the assignment [ b = x y ] is present in the angelic program, and the additional constraints in Figure 7 are added to the solver.
After the refinement, the process continues until a timeout occurs or all minimal fix candidates have been found (i.e., solver.Check() returns UNSAT).

4. Evaluation Setup

To evaluate our proposed method, we performed experiments on three different sets of benchmarks and compared against existing formula-based fault localization techniques, including techniques based on program formulas (e.g., BugAssist [12], Sniper [14]), techniques based on single-path control-flow-insensitive formulas (e.g., References [17,19]), and techniques based on single-path control-flow-sensitive formulas (e.g., Reference [18]). All experiments were performed on a computer with a 4.0 GHz Intel Core i7 CPU and 8 GB RAM. In this section, we describe the setup of our evaluation.

4.1. Implementation

We have implemented our approach in a prototype tool, named AgxFaults, that automatically localizes faults in Java programs. Unfortunately, all the formula-based fault localization techniques that provide tools or source code online target C programs. Thus, for comparison, we also implemented three existing formula-based fault localization techniques in AgxFaults: the program-formula (PF) approach used in References [12,14], the flow-insensitive trace formula (FI) approach used in Reference [17], and the flow-sensitive trace formula (FS) approach used in Reference [18].
We implemented the AgxFaults tool as an extension of NASA's model checker Java PathFinder (JPF). The inputs of AgxFaults are a buggy program and a failing test case (given as a JUnit test method or a pair of input and expected post-condition in a configuration file). The output is a list of minimal angelic fix candidates (MFCs), where each MFC contains (1) a set of suspicious statements, together with (2) the angelic values that these statements should have produced to make the test execution, which originally fails, succeed. AgxFaults includes a customizable formula builder and a constraint solver for solving formulas. We implemented the formula builder using the extension mechanisms of JPF, adapting the implementation of jDart [25], a concolic execution engine for Java programs. Specifically, we use the bytecode factories and listener extension mechanisms of JPF to (1) create fresh symbolic variables on-the-fly when executing assignment and branch condition instructions, (2) dynamically perturb the concrete program state and force the program execution to follow a specific path, and (3) manipulate and propagate symbolic values along different program execution paths and collect constraints for building error trace formulas. As inherited from jDart, AgxFaults also uses the constraints library jConstraints [26] as an abstraction layer for constraint solvers. Since jConstraints does not support solving pMaxSMT/MaxSMT and sequential MaxSMT problems, we implemented Fu & Malik's core-guided MaxSAT algorithm [27] and the incremental core-guided MaxSMT solving algorithm [20] in the jConstraints library. We use the Z3 (https://github.com/Z3Prover/z3) SMT solver as our back-end constraint solver.
We implemented the program-formula approach (PF) following the full flow-sensitive formula encoding approach in Sniper [14] because it was experimentally shown to be more effective than BugAssist. To construct a program formula, we force the on-demand program explorer and formula builder to explore all paths in the program up to a certain bound and encode all of them into the formula, instead of operating on demand. We implemented the single-path flow-insensitive formula approach by following the error trace formula encoding described in Reference [17]. The single-path flow-sensitive formula approach is implemented by following the error trace formula encoding described in Reference [18]. The source code of AgxFaults, which contains our implementations of all the above techniques, as well as our benchmark programs, is available online for open access at http://bit.ly/agxfaults.

4.2. Research Questions and Evaluation Metrics

We applied each of these four implemented techniques to several buggy programs to empirically investigate the following research questions:
RQ1: How effective is AgxFaults in finding angelic fix candidates, compared to program-formula and single-path formula approaches?
A technique is said to be more effective in finding angelic fix candidates if it can find more feasible fix candidates. A technique is said to be more precise in finding angelic fix candidates if it always produces feasible MFCs, i.e., the ratio of the number of feasible MFCs it finds to the total number of MFCs it reports is high. Thus, to answer RQ1, we identify two metrics:
(1) the average number of feasible MFCs that the technique found per run (i.e., the number of feasible MFCs found), and
(2) the ratio of the number of feasible MFCs to the total number of MFCs that the technique reported per run (i.e., precision).
RQ2: How effective is AgxFaults in localizing fault location, compared to program-formula and single-path formula approaches?
To evaluate the fault localization effectiveness of a technique, we identify three metrics:
(1) the number of runs in which the technique reported the correct fault locations, out of the total number of its runs (i.e., the successful fault localization rate),
(2) the number of runs in which the technique reported a correct fix candidate, out of the total number of its runs (i.e., the successful fix localization rate), and
(3) the percentage of program code lines that the developer needs to examine before identifying the first faulty statement (i.e., the EXAM score).
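Under the definitions above, the three metrics could be computed from run records as in this illustrative sketch. The run data and the record format (a list of MFCs per run, each an MFC's reported locations plus a feasibility flag, against a known faulty line) are invented for the example:

```python
# Ground-truth faulty line and three made-up run records.
FAULTY = {6}
runs = [
    [({6}, True), ({1, 8}, True)],   # run 1: correct, feasible fix at line 6
    [({8}, True)],                   # run 2: feasible but wrong location
    [({6}, False)],                  # run 3: right location, infeasible
]

def fault_loc_rate(runs):
    # A run succeeds if some MFC's locations are all actually faulty.
    ok = sum(any(locs <= FAULTY for locs, _ in run) for run in runs)
    return ok / len(runs)

def fix_loc_rate(runs):
    # A run succeeds if some MFC is both at the fault and feasible.
    ok = sum(any(locs <= FAULTY and feas for locs, feas in run) for run in runs)
    return ok / len(runs)

def exam_score(reported_lines, total_lines, faulty=FAULTY):
    # Percentage of lines examined, in reported order, to reach the fault.
    examined = 0
    for line in reported_lines:
        examined += 1
        if line in faulty:
            return 100.0 * examined / total_lines
    return 100.0                     # fault never reported

print(fault_loc_rate(runs))          # runs 1 and 3 report {6}: 2/3
print(fix_loc_rate(runs))            # only run 1 is correct and feasible: 1/3
print(exam_score([1, 8, 6, 2], 50))  # 3 of 50 lines examined: 6.0
```

Note that fault localization only requires reporting the right location, while fix localization additionally requires the MFC to be a feasible fix candidate, which is why run 3 counts for the former but not the latter.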
RQ3: How efficient and scalable is AgxFaults, compared to program-formula and single-path formula approaches?
We use the CPU time as a metric to measure the efficiency and scalability of a technique. Specifically, for each technique, we measure:
(1) the CPU time spent by the solver to solve formulas (i.e., Formula solving time), and
(2) the total running time of the technique for each run (i.e., Running time).
RQ4: Can AgxFaults be applied to real bugs in large software projects?

4.3. Benchmarks

For our evaluation, we use several buggy programs selected from three different benchmarks. The first benchmark consists of a set of example programs provided by Bekkouche [28]. The size of these programs ranges from 17 to 130 lines of code. Each program contains one to three faults that were specifically injected to evaluate fault localization techniques. The second benchmark contains 41 faulty versions of a real traffic collision avoidance system (TCAS) from Siemens [29], which is widely used in software testing and fault localization research. The third benchmark consists of real-world buggy programs from Defects4J [30].
We selected the programs in the first and second benchmarks for two main reasons. First, these programs are small, so they allow us to find all minimal angelic fix candidates, which is generally impossible for larger and more complex programs. By finding all minimal angelic fix candidates, we can precisely compare the efficiency of the techniques in terms of complexity reduction and their effectiveness in terms of success rates. Second, the TCAS programs are commonly used to evaluate state-of-the-art formula-based fault localization techniques, including BugAssist, Sniper, and LocFaults, so we can directly compare our results with those techniques. Since the programs in the first and second benchmarks are small and contain only artificial faults, the third benchmark adds non-trivial open-source programs with real bugs.
We obtained Java versions of the TCAS programs from the SIR website. Each faulty version of the TCAS program has a size of 180 lines and contains one to three artificially injected faults. These programs also come with a total of 1576 test inputs and a fault-free version. For each buggy version, we manually compared it with the fault-free version and considered the set of differing statements to be the actual faulty statements. To obtain failing test cases for each buggy version, we ran all test cases on the fault-free version to obtain the expected outputs. Then, for each buggy version, we ran all the test cases and matched the results against the expected outputs to identify the failing test cases.
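This oracle-based procedure for identifying failing test cases can be sketched as follows. The fault-free and buggy functions here are toy stand-ins for the TCAS versions, invented purely to show the two steps:

```python
def fault_free(x, y):
    return x + y                        # reference (fault-free) version

def buggy(x, y):
    return x - y if y > 1 else x + y    # injected fault triggers when y > 1

# Step 1: run every test input on the fault-free version to build the oracle.
tests = [(1, 0), (2, 1), (3, 2), (4, 5)]
expected = {t: fault_free(*t) for t in tests}

# Step 2: run the buggy version and keep the inputs whose output mismatches.
failing = [t for t in tests if buggy(*t) != expected[t]]
print(failing)                          # inputs that expose the injected fault
```

Only the inputs that actually reach and expose the injected fault end up in the failing set; the other test cases pass on both versions and are filtered out.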
The third benchmark we considered is several faulty programs in the Defects4J, a benchmark containing real bugs from large and complex open-source projects. We randomly select several buggy versions of open-source projects in the Defects4J. These programs include JFreeChart, Commons-Codec, Commons-Compress, Common-CSV, Common-Lang, Common-Math, and Mockito. In this experiment, we use the failing test cases that are already provided in the Defects4J as the input for the AgxFaults tool.

Study Protocol

We ran each of the four techniques implemented in our tool on each buggy program multiple times, each time inputting a different failing test case and outputting a list of minimal angelic fix candidates (MFCs). We examined all MFCs produced for each run, in the generated order, and determined the validity and accuracy of the results. Specifically, a run was considered a successful fault localization if its output contained a correct fault location MFC, i.e., an MFC in which all suspicious statements are actually faulty statements. A run was considered a successful fix localization if its output contained a correct fix location MFC, i.e., an MFC that is both a feasible fix candidate and performs the fix at the correct fault location.
To evaluate the performance of the techniques, we recorded the average CPU time of each technique for processing each failing test case and the amount of that time consumed by the solvers, the number of MFCs each technique generated and how many of them are feasible fix candidates, the number of code lines included in the report, whether the generated report contained the actual fault, and how many lines of code a developer would have to examine to identify the actual fault location.
We did not set a timeout when running the techniques on the programs in the TCAS and Bekkouche benchmarks; thus, each tool finished after it had reported all the MFCs that it could find. When the techniques were run on the programs in the Defects4J benchmark, we set a timeout of five minutes and a loop unwinding bound of 100.

5. Results of the Experiments

5.1. Result of RQ1: Effectiveness in Finding Angelic Fix Candidates

To evaluate the effectiveness of our method in finding angelic fix candidates, we report and compare (1) the number of MFCs that each technique found per run and (2) how many of these are feasible angelic fix candidates. A technique is said to be more effective in finding angelic fix candidates if it can find more feasible fix candidates.
Table 1 shows the experimental results for 41 buggy versions of the TCAS program in terms of effectiveness in finding MFCs. The first section of the table shows the version name (Ver), the number of faults (Faults), and the number of failing test cases (Ftc) for each buggy version. The second and third sections of the table show, on average, the total number of MFCs that each technique found in a fault localization run for each buggy version, as well as the number of these found MFCs that are feasible (i.e., MFCs with valid explanation), respectively.
For example, version v1 has one injected fault, and there are 131 failing test cases. Thus, each fault localization method was run on version v1 131 times. On average, both AgxFaults and PF found 44 minimal angelic fix candidates per run, while FS and FI found 52 and 3, respectively. After examining the MFCs generated by all methods, we found that all MFCs generated by AgxFaults, PF, and FI are feasible, while only 35 out of 52 MFCs generated by FS are feasible. Thus, the numbers of feasible MFCs generated by AgxFaults, PF, FS, and FI are 44, 44, 35, and 3, respectively.
In total, each method was run 2156 times over the 41 program versions. On average, the numbers of MFCs generated per run by AgxFaults, PF, FS, and FI are 87, 87, 45, and 3, respectively. Of these, the numbers of feasible MFCs for AgxFaults, PF, FS, and FI are 87, 87, 23, and 3, respectively. As the results show, AgxFaults produced the same results as the PF approach, which is based on a more complex error trace formula encoding the entire program semantics, on all buggy versions. All MFCs generated by the AgxFaults, PF, and FI approaches are feasible (i.e., provide valid explanations), while only 23 (51%) out of 45 MFCs generated by FS are feasible. AgxFaults outperforms the single-path formula approaches (both the flow-sensitive and flow-insensitive techniques) by finding more feasible MFCs.

5.2. Result of RQ2: Effectiveness in Fault Localization

To evaluate fault localization effectiveness, we report and compare (1) the number of successful fault localization runs, (2) the number of successful fix localization runs, and (3) the average EXAM score of each technique for each buggy program. To recall, a run of a technique is a successful fault localization if it found an MFC that fixes only faulty statements (i.e., all fix locations in the MFC are actually faulty statements). A fault localization run is a successful fix localization if it outputs an MFC that is both a feasible fix candidate and at a correct fault location. The EXAM score is the percentage of program code lines that the developer needs to examine before identifying the fault; it is computed as the ratio of the number of lines of code that the developer examines before reaching an actual faulty line to the total number of code lines in the program.
Table 2 shows the experimental results for the TCAS programs in terms of fault localization effectiveness. The first section of the table shows the version name (Ver) and the number of fault localization runs (Ftc) of each technique for each buggy version.
The columns in the first section, “#Succ. fault localization”, of Table 2 show the number of runs in which each technique succeeded in reporting the correct fault location. We obtained the results of Bug-Assist (BA) and LocFaults (LF) from References [19,22]. Out of 2156 runs of each technique, the AgxFaults, PF, and FS techniques successfully output the correct fault locations in 2156 (100%) runs, while Bug-Assist (BA), LocFaults (LF), and FI did so in 2087 (96%), 1345 (62%), and 121 (5.6%) runs, respectively. Notably, FI reported the correct fault location only for version v36, in which the faulty statement is data-dependent on the program output.
The columns in the second section, “#Succ. fix localization”, of Table 2 show the number of runs in which each technique output an MFC that is both a correct fault location and a feasible angelic fix candidate. As the results show, all 2156 out of 2156 runs of AgxFaults and PF are successful fix localization runs, while the FS and FI techniques succeed in 2027 and 1345 runs, respectively.
The columns in the last section of the table show the EXAM score of each technique. On average, the EXAM score for Agx was 6.8%, for PF was 6.9%, for FS was 11.5%, for FI was 17.7%, and for execution slicing was 17.6%.

5.3. Result of RQ3: Efficiency and Scalability

To answer RQ3, we compare AgxFaults with the PF, FS, and FI techniques based on their computational expense. We use CPU time as the metric to measure computational complexity. Specifically, we measure and report the CPU time spent by the solver to solve formulas and the total running time of each technique per run.
Figure 8 and Figure 9 show the formula solving time (the CPU time spent by the solver to solve formulas) and the total running time of the PF, Agx, FS, and FI techniques for the 41 buggy versions of the TCAS program. Both the formula solving time and the total running time of Agx were significantly smaller than those of PF for most versions. On average, the Agx approach was 28% faster than PF at formula solving, but three times slower than FS and 51 times slower than FI. At total running time, the Agx approach was 45% faster than PF, but 9.4 times slower than FS and 9.4 times slower than FI.
Because the computational complexity of the AgxFaults and PF approaches is proportional to the loop unwinding bound that limits the maximum number of iterations for each loop in the target program, the computational complexity increases as the loop unwinding bound increases. Thus, we used the set of programs containing loops in Bekkouche's benchmark to further evaluate the scalability of the fault localization approaches with respect to the loop unwinding bound. These programs include various variants of the SquareRoot, Sum, and BSearch programs. SquareRoot finds the integer part of the square root of an integer. Sum computes the sum of all natural numbers from one to a given input value. BSearch implements the binary search algorithm over a sorted array of integers in increasing order. We ran each fault localization technique multiple times for each program and each failing test case, varying the maximum number of loop unwindings for the program execution trace from 10 to 100.
The fault localization results returned by PF and AgxFaults are identical for each program. Specifically, both methods return 6 MFCs for SquareRoot, 8 MFCs for Sum, and 8 MFCs for BSearch. Figure 10 shows the average formula solving time of the Agx and PF approaches when applied to programs with increasing loop unwinding bounds. As shown in this graph, for a small loop unwinding bound, the formula solving times of PF and Agx are similar. However, as the loop unwinding bound increased, the time it took the PF approach to solve the problem grew exponentially, while that of the Agx approach grew at a significantly slower rate.

5.4. Result of RQ4: Real Software Bugs

This experiment evaluates the capability of AgxFaults on real bugs in large and complex projects. Table 3 gives details of the projects and the characteristics of the bugs used in this study. Columns “Name” and “LOC” in the table show the project name and the number of lines of Java code in each project. Columns “Bug ID” and “Description” show the unique id that identifies the bug and a description of each bug. The columns in the “Patch size” section of the table show the complexity of the patch written by the developer to fix each bug. Specifically, columns “Add”, “Del.”, and “Edit” show the number of lines that the developer added, deleted, and edited to fix the bug.
Table 4 shows the results of our method for each bug in the benchmark. Column “#MFC” shows the number of angelic fix candidates that AgxFaults found for a failing test case. All these generated MFCs are feasible. Column “#Susp. Lines” shows the number of distinct lines reported in the list of MFCs. Column “Found actual fault?” indicates whether the reported lines contain the actual faulty statements (“yes”/“no”). Column “Exam lines” shows the number of lines of code that the developer needs to examine to identify the first fault location. Columns “Solver time” and “Run time” show the time spent in the SMT solver and the total running time of AgxFaults for each buggy program.
Let us consider bug Chart 5, for example. Figure 11 shows how the developer fixed the bug: the developer (1) changed the condition expression of the if statement at line 548 and (2) added additional code at line 544. AgxFaults found 6 MFCs for this bug, shown in Figure 12, and all of these MFCs are feasible (replacing the value of the suspicious expressions with the corresponding angelic value actually results in a successful execution). A total of 7 lines of code are reported across all MFCs. The actual faulty line (i.e., the if statement at line 548) is reported in 4 MFCs: mfc3, mfc4, mfc5, and mfc6. Each of these MFCs contains the buggy line together with one additional statement. This result indicates that modifying the buggy if statement alone is not enough to make the failing test case pass; the developer should modify both the if statement at line 548 and one additional statement, as reported in mfc3, mfc4, mfc5, and mfc6. For example, mfc3 shows that modifying the if statement at line 548, together with the assignment “return = 0” at line 203 of the file XYDataItem.java, can make the failing test case succeed. The number of lines that the developer has to examine before identifying the faulty line is 3, as the developer needs to examine two lines in mfc1 and mfc2 before checking mfc3. The total running time of AgxFaults is about 3 s, of which the SMT solver accounts for 0.47 s.
Out of the 22 bugs in this study, there are 9 bugs for which AgxFaults reported the actual faulty line as the first candidate. A developer needs to examine fewer than 5 lines to identify the actual faulty line in all cases except bug Codec18. The total running time for each run is a few seconds, which is acceptable.

Comparison with Existing Techniques on Real Bugs

Because the program-formula approach crashed or timed out without generating any MFCs when it was run on the bugs in the real-bug benchmark, we can only compare the results of AgxFaults with those of the single-path formula approaches FS and FI.
Table 5 shows the comparison of the results generated by AgxFaults with those generated by the FS and FI approaches on the real-bug benchmark. Column “Trace size” shows the number of lines of code executed in the error trace of the bug. The “#MFC” columns show the number of minimal angelic fix candidates that each of AgxFaults, FS, and FI produced for a failing test case. The “#Susp. Lines” columns show the total number of lines that each technique reported as suspicious. The “#Exam lines” columns show the number of lines of code that the developer needs to examine to identify the first fault. An empty value in “#Exam lines” means that the developer would not find any faulty statement in the list of suspicious statements produced by the tool, i.e., the tool did not report any actual faulty statement in the buggy program.
As Table 5 shows, AgxFaults outperforms FS and FI in terms of fault localization effectiveness: of the 22 bugs in this study, AgxFaults successfully reported the actual faulty line for 19 bugs, while FS succeeded for 9 bugs and FI for 5 bugs.
We compared the efficiency of AgxFaults on the real-bug benchmark with the FS and FI techniques only. Table 6 shows the comparison. As shown in the table, the formula solving time of AgxFaults is about 1.7 times longer than that of FS and about 95 times longer than that of FI, and the total running time of AgxFaults is about 1.7 times longer than that of FS and about 5 times longer than that of FI. Formula solving accounts for about 43% of the total running time of AgxFaults, compared with 41% for FS and about 3% for FI.

5.5. Threats to Validity

The most important internal threat to validity in our evaluation is that we ourselves implemented the existing techniques that we compare against; since we target Java programs, all the techniques that we compare against unfortunately target C or C++ programs. Another internal threat is the possibility of errors in our implementation of the core-guided incremental sequential partial MaxSMT algorithms. To reduce these threats, we have made all the source code of our implementation available online in an open-source repository.
The main external threat to validity is that we performed our evaluation on two simple programs and a set of bugs from real open-source projects. These do not necessarily represent all types of programs and bugs; thus, our results may not generalize. Another external threat is that the SMT solver Z3 and the JConstraints library, which we used in our implementation, may contain bugs.

6. Related Work

For decades, many automated techniques have been proposed for fault localization. We refer the interested reader to the survey by W. Wong et al. [6] for a systematic literature review. In this section, we give a brief overview of the most popular fault localization families, such as spectrum-based, slicing-based, and mutation-based techniques, and focus especially on formula-based fault localization, which is closely related to our work.

6.1. Spectrum-Based Fault Localization

Most existing automatic fault localization techniques are spectrum-based fault localization (SFL) techniques [4,6]. SFL techniques profile the buggy program with a given test suite and count the number of passing and failing tests that cover each statement. Based on this coverage information, they compute for each statement a suspiciousness score that measures the likelihood of it being faulty, and output to developers a list of statements ranked by this score. SFL techniques require only lightweight computation; thus, they can be applied to very large programs. However, they usually return a long list of program entities with no context information. Moreover, in order to rank the actual faulty statement at the top of the suspicious list, they require a comprehensive test suite that contains sufficiently many passing and failing executions. These limitations reduce the usefulness of their fault localization results. Our approach requires only a single failing test case, and it returns small sets of suspicious statements at which a suitable modification can make the test pass.
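The coverage-based scoring that SFL techniques apply can be sketched as follows. This is an illustrative toy, not part of AgxFaults; the Ochiai formula is one common SFL metric, and the statement names and coverage counts are hypothetical:

```python
# Illustrative sketch (not part of AgxFaults): how an SFL technique turns
# per-statement coverage counts into a suspiciousness ranking.
import math

def ochiai(ef, nf, ep):
    """Ochiai suspiciousness. ef/nf = failing tests that do/do not cover the
    statement; ep = passing tests that cover it."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

# coverage[stmt] = (ef, nf, ep) collected from a hypothetical test suite
coverage = {"s1": (2, 0, 1), "s2": (1, 1, 3), "s3": (0, 2, 4)}
ranked = sorted(coverage, key=lambda s: ochiai(*coverage[s]), reverse=True)
print(ranked)  # ['s1', 's2', 's3']: most suspicious first
```

Note that the ranking quality depends entirely on how well the test suite separates faulty from correct statements, which is the limitation discussed above.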

6.2. Program Slicing Based

Slicing-based techniques [31] use program dependence information to reduce the suspicious scope to the subset of statements that might affect the wrong values of variables at the failure site. Since all statements, together with the corresponding dependencies, are taken into account, slicing-based techniques often return an imprecise list of suspicious statements. B. Hofer and F. Wotawa [32] combine dynamic slicing with constraint solving to produce a more precise list of suspicious statements than dynamic slicing alone.
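The core of a dynamic slice is a backward closure over the dependencies observed in one execution. The sketch below is illustrative only; the statement names and dependency map are hypothetical:

```python
# Illustrative sketch of dynamic slicing: keep only the executed statements
# that the failure site transitively depends on.
def dynamic_slice(deps, criterion):
    """deps maps a statement to the statements it directly depends on
    (data or control dependence observed in the failing run)."""
    slice_, work = set(), [criterion]
    while work:
        s = work.pop()
        if s not in slice_:
            slice_.add(s)
            work.extend(deps.get(s, []))
    return slice_

# Hypothetical trace: s1 defines a; s2 computes b from a; s3 is unrelated;
# s4 outputs the wrong value of b (the slicing criterion).
deps = {"s4": ["s2"], "s2": ["s1"], "s3": []}
print(sorted(dynamic_slice(deps, "s4")))  # ['s1', 's2', 's4']; s3 is excluded
```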

6.3. Mutation-Based Fault Localization

Mutation-based fault localization (MBFL) [8] is a recent direction that utilizes mutation analysis for fault localization. These techniques first use a set of syntactic change operations (i.e., mutation operators) to mutate the program code and generate several variant programs, called mutants. They then run these mutants with the test cases and measure how the test execution results change when a code element is mutated. Based on this information, MBFL techniques statistically infer program elements that are highly relevant to the fault. A limitation of MBFL techniques is the huge mutation execution cost [9], because they need to generate a large number of mutants and run them against many test cases.
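The intuition behind MBFL scoring can be sketched in a few lines. This is an illustrative toy, not any specific MBFL tool; the scoring rule and the test outcomes are hypothetical:

```python
# Illustrative sketch (not a specific MBFL tool): a statement whose mutation
# flips failing tests to passing is likely fault-relevant.
def mbfl_score(outcomes):
    """outcomes: one (failed_before_mutation, failed_after_mutation) pair
    per test, for a mutant of one statement."""
    flipped = sum(1 for before, after in outcomes if before and not after)
    failing = sum(1 for before, _ in outcomes if before)
    return flipped / failing if failing else 0.0

# Mutating statement A makes both failing tests pass; mutating B changes nothing.
score_a = mbfl_score([(True, False), (True, False), (False, False)])
score_b = mbfl_score([(True, True), (True, True), (False, False)])
print(score_a, score_b)  # 1.0 0.0
```

The execution cost criticized in the text comes from the fact that each mutant requires rerunning (part of) the test suite.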

6.4. Formula-Based Fault Localization

Bug-Assist [12,22], SNIPER [14,33], and F. Wotawa [13] construct a formula that semantically represents all possible executions of a buggy program (unwound to a given bound) and conjoin this formula with clauses encoding the input and expected output of a failing test case to form an unsatisfiable error trace formula. Bug-Assist [12,22] and SNIPER [14,33] treat this error trace formula as an instance of the partial MaxSAT problem, in which the clauses encoding the test input and expected output are marked as hard clauses and the clauses encoding program statements are marked as soft clauses. They use a pMaxSAT solver to obtain MCSs and report the program statements that correspond to clauses in an MCS as possible faults. To reduce the formula solving time, S. Lamraoui et al. [15] determine correct basic blocks (CBs), which are basic blocks that do not participate in any failing execution, and set all clauses related to statements in these CBs as hard clauses. Instead of using a MaxSAT solver to obtain MCSs directly, F. Wotawa [13] derives the MCSs by computing irreducible infeasible subsets (minimal hitting sets) of the error trace formula. Our approach is similar to these MaxSAT-based approaches in finding minimal sets of program locations where an angelic fix may exist. However, it differs in several aspects. First, while these approaches are based on a static error trace formula, which may be overly complex or insufficient for reasoning, our approach is based on a formula that is constructed dynamically on demand to balance efficiency and complexity. Second, instead of using a MaxSAT solver, we adapt a core-guided MaxSAT algorithm to manipulate and solve the formula incrementally.
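The hard/soft clause split and the notion of an MCS can be made concrete with a toy formula. This sketch is illustrative only (it brute-forces satisfiability instead of calling a real pMaxSAT solver, and the clauses are hypothetical stand-ins for a real error trace formula):

```python
# Illustrative toy of the partial MaxSAT view of an error trace formula.
# Hard clauses encode the failing test's input/expected output; soft clauses
# encode statements. An MCS is a smallest set of soft clauses whose removal
# makes the whole formula satisfiable.
from itertools import combinations, product

def satisfiable(clauses, variables):
    """Clause = list of (var, value) literals; a clause holds if any literal
    matches the assignment. Brute-forces all boolean assignments."""
    for bits in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, bits))
        if all(any(assign[v] == val for v, val in clause) for clause in clauses):
            return True
    return False

def minimal_correction_sets(hard, soft, variables):
    """Return all minimum-size sets of soft-clause indices whose removal
    restores satisfiability."""
    for k in range(len(soft) + 1):
        mcss = [set(drop) for drop in combinations(range(len(soft)), k)
                if satisfiable(hard + [c for i, c in enumerate(soft)
                                       if i not in drop], variables)]
        if mcss:
            return mcss
    return []

# Hypothetical trace: the test requires x == True and y == False (hard);
# statement 0 sets y to True (contradicting the test); statement 1 sets x.
hard = [[("x", True)], [("y", False)]]
soft = [[("y", True)], [("x", True)]]
print(minimal_correction_sets(hard, soft, ["x", "y"]))  # [{0}]: statement 0 is suspect
```

Real tools obtain the same result with a pMaxSAT solver rather than enumeration, which is what makes the approach scale beyond toy formulas.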
E. Ermis [17], U. Christ [18], C. Oh [34], and M. Bekkouche [19] work on error trace formulas that represent the sequence of program statements whose execution produced an error, which we refer to as single-path formulas. E. Ermis [17] and U. Christ [18] leverage Craig interpolation to find error invariants at every point in the error trace, where an error invariant for a position in a trace is a condition under which the error will still occur if the program is continued from that position. Based on error invariants, they can semantically remove all irrelevant statements from the error trace, resulting in a shorter error trace in which bugs are easier to localize. Like our approach, both approaches can output a reduced error trace that contains only error-relevant statements (in our approach, the reduced error trace can be reconstructed by sorting all statements in an MCS in execution order); in addition, our approach also suggests how to fix these bugs. Moreover, while our approach uses incremental SMT solving, which is commonly supported by recent SMT solvers, these error invariant approaches require an interpolation solver, which is less widely available.
Bekkouche et al. [19] encode the assignment statements on a given error path into an error trace formula. They use a MaxSAT solver to compute the MCS of this formula and report the corresponding statements as possible faults. An attempt is made to divert at most k conditional branch decisions on the error path to find alternative corrected paths. For each corrected path found, the diverted conditional branches, as well as the MCS of the trace formula constructed on the path up to the first diverted condition, are reported as possibly faulty. Similar to this approach, we also divert the counterexample path to find corrected executions. However, there are several differences between our approach and that of Bekkouche et al. [19]. First, our approach finds corrected paths by diverting not only branch decisions but also assignment statements. Second, instead of checking all possible diverted paths by exhaustively diverting bounded subsets of conditions on the counterexample path, we encode the possible effects of diverting operations into the trace formula to leverage the search capability of the solver. As a result, by analyzing the MCSs obtained from the solver, a much smaller number of diversion attempts is needed to derive corrected paths than with Bekkouche's approach.
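The exhaustive branch-diversion search that our approach avoids can be sketched as follows. This is an illustrative toy, not LocFaults itself; the buggy program, its branch instrumentation, and the test are hypothetical:

```python
# Illustrative toy of path diversion: flip up to k of the branch decisions
# recorded on a failing path and re-execute, looking for a "corrected" path
# that produces the expected output.
from itertools import combinations

def run(x, flips=frozenset()):
    """Toy buggy abs(): the branch condition is inverted. `flips` inverts the
    decision of the listed branch indices, mimicking a diverted path."""
    def branch(i, cond):
        return (not cond) if i in flips else cond
    if branch(0, x < 0):      # buggy condition: should be x >= 0
        return x
    return -x

def divert(test_input, expected, n_branches, k):
    """Exhaustively try flipping up to k branch decisions until the test passes."""
    for size in range(1, k + 1):
        for flips in combinations(range(n_branches), size):
            if run(test_input, frozenset(flips)) == expected:
                return set(flips)   # branches whose diversion corrects the run
    return None

print(divert(3, 3, n_branches=1, k=1))  # {0}: diverting branch 0 makes the test pass
```

With many branches, the number of flip combinations grows combinatorially, which is why encoding diversion effects into the formula, as our approach does, reduces the number of attempts.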
W. Jin and A. Orso [16] proposed two techniques, called on-demand formula computation (OFC) and clause weighting (CW), to mitigate the computational expense and improve the accuracy of formula-based fault localization. Specifically, OFC (1) encodes only the statement instances in the original failing trace into the error trace formula and (2) computes all MCSs of the constructed formula. (3) If there is a conditional statement s such that (i) s is found in an MCS and (ii) a branch b of s is not yet encoded in the formula, then OFC expands the formula by encoding all statement instances in branch b and goes back to step (2); otherwise, the obtained MCSs are reported as the final output. OFC and our proposed ATF encoding are similar in that both encode only part of the program into the formula in an incremental manner. There are two main differences between OFC and our method. First, the formula in the OFC approach is expanded to encode all branches of conditional statements that occur in an MCS, while the ATF formula is expanded to refine abstracted conditional branches that are included in an angelic execution of the angelic program. Second, our approach does not require computing all MCSs of the intermediate formula, as OFC does; instead, it computes one MCS at a time and stops computing MCSs when the formula needs more refinement.
In the Angelic Debugging approach [21], program expressions in a suspicious scope (provided in advance) are replaced with a nondeterministic expression (i.e., an angelic choice that can return an arbitrary value). Then, symbolic execution is used to find a successful execution of the transformed program with the input fixed to a given failing test input. If such a successful execution exists, the transformed suspicious expression is considered a fix candidate, and the concrete values of the nondeterministic expression are reported as angelic values, or a suggestion for a fix. One limitation of this method is that it handles each expression separately; thus, symbolic execution must be run many times, once per expression, and the SMT solver must be called many times to solve different formulas. Another limitation is that it does not return minimal results: indeed, it can output a successful angelic execution by replacing all statements in the suspicious scope with angelic values.
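The angelic-replacement idea can be made concrete with a tiny example. This sketch is illustrative only: a real tool finds the angelic value via symbolic execution and an SMT solver, whereas this toy simply enumerates candidate values; the program and test are hypothetical:

```python
# Illustrative sketch of the angelic-debugging idea: replace one suspicious
# expression with a free "angelic" value and search for a value that makes
# the failing test pass.
def program(x, angelic=None):
    # Suspicious expression: should compute x * 2, but the code computes x + 2.
    v = angelic if angelic is not None else x + 2
    return v + 1

failing_input, expected = 5, 11          # program(5) == 8, so the test fails
angelic_value = next((a for a in range(-100, 100)
                      if program(failing_input, a) == expected), None)
print(angelic_value)  # 10: forcing the expression's value to 10 passes the test
```

Because the existence of an angelic value (here, 10 = 5 * 2) shows the expression is a plausible fix site, the expression is reported as a fix candidate and the value as a repair hint.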

6.5. Automatic Program Repair

Automatic program repair (APR) [7,35,36,37] is currently a hot research topic in software engineering. These techniques try to provide the developer with actual patches that make the buggy program pass a given test suite that it originally fails. Automated program repair techniques usually start with fault localization or fix localization [37,38] to identify a subset of code elements at which a patch can be applied. The effectiveness of the fix localization step is critically important to the effectiveness, as well as the reliability, of automatic program repair [39,40]. The fix localization components in semantics-based APR approaches (such as Angelix [37] and Nopol [36]) share the same objective as our approach: finding angelic execution paths that make the failing test case pass. Our method differs from these techniques in several respects. The angelic fix localization in Nopol finds angelic values only for conditional expressions, assumes a single modification, and does not use a solver. Our approach is similar to the angelic forest extractor component in Angelix, as both find angelic values for assignments as well as conditional expressions; however, our method produces angelic execution paths by modifying a minimal set of locations, while Angelix does not constrain the number of fix locations.

7. Conclusions

In this paper, we presented AgxFaults, a formula-based fault localization method that aims to automatically find minimal sets of program locations where a bug fix might exist. We implemented AgxFaults as an extension of Java PathFinder for automatically localizing faults in Java programs. We used AgxFaults to localize faults in benchmark programs of various sizes and compared its performance to existing formula-based fault localization approaches. The experimental results demonstrated that our proposed method outperformed single-path formula approaches in terms of effectiveness, and that AgxFaults was comparable to the program-formula approach in effectiveness while being better in efficiency and scalability. We also demonstrated the capability of AgxFaults when applied to bugs from large real-world software projects.

Author Contributions

Conceptualization, Q.-N.P.; methodology, Q.-N.P.; software, Q.-N.P.; validation, Q.-N.P. and E.L.; formal analysis, Q.-N.P. and E.L.; investigation, Q.-N.P. and E.L.; writing—original draft preparation, Q.-N.P.; writing—review and editing, Q.-N.P. and E.L.; supervision, E.L.; project administration, E.L.; funding acquisition, E.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research is supported by the Next-Generation Information Computing Development Program (2017M3C4A7068179), and the Basic Science Research Program (2019R1A2C2006411) through the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIT).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The implementation and benchmarks data are publicly available at https://bit.ly/agxfaults.

Acknowledgments

I would like to extend my thanks to my advisor Eunseok Lee and my lab mates for their guidance and support throughout the process of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Britton, T.; Jeng, L.; Carver, G.; Cheak, P.; Katzenellenbogen, T. Reversible Debugging Software: Quantify the Time and Cost Saved Using Reversible Debuggers; University of Cambridge: Cambridge, UK, 2013; Available online: https://core.ac.uk/display/23390105 (accessed on 29 December 2020).
  2. Hailpern, B.; Santhanam, P. Software debugging, testing, and verification. IBM Syst. J. 2002, 41, 4–12.
  3. Parnin, C.; Orso, A. Are automated debugging techniques actually helping programmers? In Proceedings of the 2011 International Symposium on Software Testing and Analysis, Toronto, ON, Canada, 17–21 July 2011; p. 199.
  4. Abreu, R.; Zoeteweij, P.; Gemund, A.V. Spectrum-Based Multiple Fault Localization. In Proceedings of the 2009 IEEE/ACM International Conference on Automated Software Engineering, Auckland, New Zealand, 16–20 November 2009.
  5. Roychoudhury, A.; Chandra, S. Formula-based software debugging. Commun. ACM 2016, 59, 68–77.
  6. Wong, W.E.; Gao, R.; Li, Y.; Abreu, R.; Wotawa, F. A Survey on Software Fault Localization. IEEE Trans. Softw. Eng. 2016, 42, 707–740.
  7. Gazzola, L.; Micucci, D.; Mariani, L. Automatic Software Repair: A Survey. IEEE Trans. Softw. Eng. 2017. [Early access]
  8. Papadakis, M.; Le Traon, Y. Metallaxis-FL: Mutation-based fault localization. Softw. Test. Verif. Reliab. 2015, 25, 605–628.
  9. Li, Z.; Wang, H.; Liu, Y. HMER: A Hybrid Mutation Execution Reduction approach for Mutation-based Fault Localization. J. Syst. Softw. 2020, 168, 110661.
  10. Pearson, S.; Campos, J.; Just, R.; Fraser, G.; Abreu, R.; Ernst, M.D.; Pang, D.; Keller, B. Evaluating & improving fault localization techniques. In Proceedings of the 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE), Buenos Aires, Argentina, 20–28 May 2017.
  11. Gopinath, D.; Zaeem, R.N.; Khurshid, S. Improving the effectiveness of spectra-based fault localization using specifications. In Proceedings of the 2012 27th IEEE/ACM International Conference on Automated Software Engineering, Essen, Germany, 3–7 September 2012; p. 40.
  12. Jose, M.; Majumdar, R. Cause Clue Clauses: Error Localization using Maximum Satisfiability. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, San Jose, CA, USA, 4–8 June 2011; pp. 437–446.
  13. Wotawa, F.; Nica, M.; Moraru, I. Automated debugging based on a constraint model of the program and a test case. J. Log. Algebr. Program. 2012, 81, 390–407.
  14. Lamraoui, S.M.; Nakajima, S. A Formula-Based Approach for Automatic Fault Localization of Imperative Programs. In Proceedings of the 16th International Conference on Formal Engineering Methods, Luxembourg, 3–5 November 2014; pp. 251–266.
  15. Lamraoui, S.M.; Nakajima, S.; Hosobe, H. Hardened Flow-Sensitive Trace Formula for Fault Localization. In Proceedings of the International Conference on Engineering of Complex Computer Systems (ICECCS), Gold Coast, Australia, 9–12 December 2015; pp. 50–59.
  16. Jin, W.; Orso, A. Improving efficiency and accuracy of formula-based debugging. In Proceedings of the Haifa Verification Conference, Haifa, Israel, 14–17 November 2016; pp. 99–116.
  17. Ermis, E.; Schäf, M.; Wies, T. Error invariants. In Proceedings of the International Symposium on Formal Methods, Paris, France, 27–31 August 2012; pp. 187–201.
  18. Christ, U.; Ermis, E.; Schäf, M.; Wies, T. Flow-Sensitive Fault Localization. Verif. Model Checking Abstr. Interpret. 2013, 7737, 189–208.
  19. Bekkouche, M.; Collavizza, H.; Rueher, M. LocFaults: A new flow-driven and constraint-based error localization approach. In Proceedings of the 30th Annual ACM Symposium on Applied Computing, Salamanca, Spain, 13–17 April 2015; pp. 1773–1780.
  20. Si, X.; Zhang, X.; Manquinho, V.; Janota, M.; Ignatiev, A.; Naik, M. On Incremental Core-Guided MaxSAT Solving. In Principles and Practice of Constraint Programming; Rueher, M., Ed.; Springer International Publishing: Cham, Switzerland, 2016; Volume 9892, pp. 473–482.
  21. Chandra, S.; Torlak, E.; Barman, S.; Bodik, R. Angelic debugging. In Proceedings of the 2011 33rd International Conference on Software Engineering (ICSE), Honolulu, HI, USA, 21–28 May 2011; pp. 121–130.
  22. Jose, M.; Majumdar, R. Bug-assist: Assisting fault localization in ANSI-C Programs. In Proceedings of the International Conference on Computer Aided Verification, Edinburgh, UK, 15–19 July 2011; pp. 504–509.
  23. Cytron, R.; Ferrante, J.; Rosen, B.K.; Wegman, M.N.; Zadeck, F.K. Efficiently computing static single assignment form and the control dependence graph. ACM Trans. Program. Lang. Syst. 1991, 13, 451–490.
  24. Barman, S.; Bodik, R.; Chandra, S.; Galenson, J.; Kimelman, D.; Rodarmor, C.; Tung, N. Programming with angelic nondeterminism. ACM SIGPLAN Not. 2010, 45, 339.
  25. Luckow, K.; Dimjašević, M.; Giannakopoulou, D.; Howar, F.; Isberner, M.; Kahsai, T.; Rakamarić, Z.; Raman, V. JDart: A Dynamic Symbolic Analysis Framework. In Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2016), Eindhoven, The Netherlands, 4–7 April 2016.
  26. Howar, F.; Jabbour, F.; Mues, M. JConstraints: A Library for Working with Logic Expressions in Java. In Models, Mindsets, Meta: The What, the How, and the Why Not? Springer: Cham, Switzerland, 2019.
  27. Fu, Z.; Malik, S. On Solving the Partial MAX-SAT Problem. In Proceedings of the International Conference on Theory and Applications of Satisfiability Testing 2006, Seattle, WA, USA, 12–15 August 2006; pp. 252–265.
  28. Bekkouche, M. Java Benchmark. Available online: http://www.capv.toile-libre.org/Benchs_Mohammed.html (accessed on 29 December 2020).
  29. Hutchins, M.; Foster, H.; Goradia, T.; Ostrand, T. Experiments of the effectiveness of dataflow- and controlflow-based test adequacy criteria. In Proceedings of the 16th International Conference on Software Engineering, Sorrento, Italy, 16–21 May 1994.
  30. Just, R.; Jalali, D.; Ernst, M.D. Defects4J: A Database of Existing Faults to Enable Controlled Testing Studies for Java Programs. In Proceedings of the 2014 International Symposium on Software Testing and Analysis, San Jose, CA, USA, 21–26 July 2014.
  31. Weiser, M. Program slicing. In Proceedings of the 5th International Conference on Software Engineering, San Diego, CA, USA, 9–12 March 1981.
  32. Hofer, B.; Wotawa, F. Combining slicing and constraint solving for better debugging: The CONBAS approach. Adv. Softw. Eng. 2012, 2012, 628571.
  33. Lamraoui, S.M.; Nakajima, S. A Formula-Based Approach for Automatic Fault Localization of Multi-fault Programs. J. Inf. Process. 2016, 24, 251–266.
  34. Oh, C.; Schäf, M.; Schwartz-Narbonne, D.; Wies, T. Concolic Fault Abstraction. In Proceedings of the 2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation, Victoria, BC, USA, 28–29 September 2014; pp. 135–144.
  35. Yuan, Y.; Banzhaf, W. Toward Better Evolutionary Program Repair: An Integrated Approach. ACM Trans. Softw. Eng. Methodol. (TOSEM) 2020, 29, 1–53.
  36. Xuan, J.; Martinez, M.; DeMarco, F.; Clément, M.; Marcote, S.L.; Durieux, T.; Berre, D.L.; Monperrus, M. Nopol: Automatic Repair of Conditional Statement Bugs in Java Programs. IEEE Trans. Softw. Eng. 2016, 41, 34–55.
  37. Mechtaev, S.; Yi, J.; Roychoudhury, A. Angelix: Scalable Multiline Program Patch Synthesis via Symbolic Analysis. In Proceedings of the International Conference on Software Engineering, Austin, TX, USA, 14–22 May 2016.
  38. Jeffrey, D.; Gupta, N.; Gupta, R. Fault localization using value replacement. In Proceedings of the 2008 International Symposium on Software Testing and Analysis, Seattle, WA, USA, 20–24 July 2008; p. 167.
  39. Liu, K.; Koyuncu, A.; Bissyande, T.F.; Kim, D.; Klein, J.; Le Traon, Y. You cannot fix what you cannot find! An investigation of fault localization bias in benchmarking automated program repair systems. In Proceedings of the 2019 IEEE 12th International Conference on Software Testing, Verification and Validation, Xi’an, China, 22–27 April 2019; pp. 102–113.
  40. Liu, K.; Li, L.; Koyuncu, A.; Kim, D.; Liu, Z.; Klein, J.; Bissyandé, T.F. A critical review on the evaluation of automated program repair systems. J. Syst. Softw. 2021, 171, 110817.
Figure 1. Buggy program and a failing test.
Figure 2. Static single assignment (SSA) representation of method foo.
Figure 3. Error trace formula.
Figure 4. Overview of AgxFaults.
Figure 5. Angelic program encoded during the process of AgxFaults.
Figure 6. Constraints added in the first iteration.
Figure 7. Constraints added in the fourth iteration.
Figure 8. Formula solving time of different methods on the TCAS programs.
Figure 9. Total runtime of different methods on the TCAS programs.
Figure 10. Formula solving time with increasing loop unwinding bounds.
Figure 11. Chart 5 diff.
Figure 12. Minimal angelic fix candidates (MFCs) returned by AgxFaults for the bug Chart 5.
Table 1. Comparison of the effectiveness in finding minimal angelic fix candidate of AgxFaults (Agx), Program Formula (PF), Single Path Control-Flow Sensitive (FS), and Control-Flow Insensitive (FI) approaches on the traffic collision avoidance system (TCAS) programs.
Program InfoNum. MFC FoundNum. Feasible MFC
ProgFaultsFtcAgxPFFSFIAgxPFFSFI
v1113144445234444353
v216948484434848223
v312461615336161333
v412242424834242343
v511061614736161313
v611265655036565313
v713666665036666323
v81144445234444353
v91739394833939143
v1021481815038181353
v11314815015037315015053
v1217059594935959303
v131457575135757333
v14150771037773
v1521064644636464303
v1617063634936363333
v1713564645036464323
v1812985855038585233
v1911964645136464333
v201739394833939143
v2111642425134242353
v22111121121493121121113
v2314236364933636143
v241742424934242333
v251379795137979133
v2611155554935555313
v2711061614736161313
v2817648484334848213
v2911856564335656213
v3015844444234444203
v311279106106423106106213
v321336107107433107107303
v3417950504535050263
v3527648484334848213
v36112122322352322322333
v3719952525035252253
v391379795137979133
v402121111111473111111153
v4112244444934444363
SumAvg.215687874538787233
Table 2. Comparison of the effectiveness in identifying fault locations of AgxFaults (Agx), Program Formula (PF), Single Path Control-Flow Sensitive (FS), Control-Flow Insensitive (FI), Bug-Assist (BA), and LocFaults (LF) approaches on the TCAS programs.
#Succ. Fault Localization#Succ. Fix LocalizationEXAM Score
VerFtcAgxPFFSFIBALFAgxPFFSFIAgxPFFSFIExe
v1131131131131013113113113113106.18.46.419.318.2
v2696969690696969694207.47.910.218.717.6
v3242424240142324242405.16.19.819.518.3
v422222222022422222208.689.719.418.2
v510101010010910101005.86.58.819.518.3
v6121212120121112129012.413.613.719.218.1
v73636363603636363636013.112.415.619.118
v81111011111013.717.617.620.219.1
v9777707777704.97.211.819.718.5
v101414141401412141411013.712.712.719.218.1
v11148148148148014814814814814802.22.16.818.917.7
v12707070700484570707005.56.67.619.418.3
v134444044444013.28.213.619.518.5
v14505050500505050505002.913.711.19.9
v15101010100101010101006.35.4719.418.2
v16707070700707070707009.89.816.918.717.6
v17353535350353535353509.88.213.719.118
v182929292902928292929066.713.819.718.5
v191919191901918191919013.111.81719.518.4
v20777707777704.97.211.819.718.5
v21161616160161616161606.48.214.419.618.5
v221111111101111111111043.413.120.219
v234242424204142424242054.712.719.818.7
v247777077777011.98.36.619.718.5
v25333303233304.63.36.62018.9
v2611111111011711111107.54.38.219.618.5
v2710101010010910101005.86.58.819.518.3
v28767676760587476765507.97.412.718.617.5
v29181818180181718181407.37.58.318.517.3
v30585858580585858584306.96.99.318.217.1
v312792792792790279 27927927906.26.411.818.217.1
v323363363363360336 33633633609.19.513.418.917.7
v34797979790797979797907.67.67.819.418.2
v35767676760587476765107.27.21218.617.5
v361211211211211211211201211211211211.51.414.81.518.6
v37999999990992199996809.29.817.819.117.9
v39333303233304.63.36.62018.9
v4012112112112101217212112112105.75.513.418.617.5
v41222222220221622222205.75.49.819.418.2
2156215621562156121208713452156215620271216.86.911.517.717.6
Table 3. Bugs used for studies.
ProjectBugDev. Patch Size
NameLOCBugIDDescriptionAddDelEdit
Chart89.3KChart5mising branch401
Chart7wrong assignments002
Chart18missing guard1132
Chart22algorithm error3012
Codec17.5KCodec16wrong field initialization111
Codec18wrong return expression111
Compress28.3KComp24algorithm error340
Comp27wrong branch guard030
Comp6algorithm error302
Csv3.8KCsv2mising try-catch700
Csv8algorithm error670
Lang53.2KLang14missing branch301
Lang21wrong return expression111
Lang22missing branch712
Lang30missing branch3805
Lang31missing branch800
Lang40design error701
Lang58wrong if condition010
Math60.6KMath94wrong if condition001
Math97design error1402
Mockito10.5KMock11design error801
Mock21design error1604
Table 4. Fault localization results of AgxFaults for real bugs.
Bug ID#MFC#Susp. LinesFound Actual Fault?#Exam LinesSolver Time (ms)Run Time (ms)
Chart567yes34703167
Chart744yes415(s)19(s)
Chart1833yes34585749
Chart2211no22203509
Codec1663yes5582237
Codec184812yes1229387870
Comp243114yes3937026(s)
Comp273117yes229(s)63(s)
Comp644yes2461586
Csv222yes151419
Csv844yes12022369
Lang1411yes1691772
Lang2181yes128976534
Lang2211no325(s)29(s)
Lang30118yes51662397
Lang313913yes18515757
Lang40134yes44456444
Lang5844yes11575907
Math9422yes157(s)59(s)
Math9722yes17222512
Mock1111yes141743
Mock2122no351725
Table 5. Comparing the results of AgxFaults (Agx), Single-path Control-Flow-Sensitive (FS), and Single-path Control-Flow-Insensitive (FI) approaches on real bugs.
Trace Size#MFC#Susp. Lines#Exam Lines
Bug IDAgxFSFIAgxFSFIAgxFSFI
Chart592861477146391
Chart784341404140412-
Chart188063033033--
Chart2275411141114-7-
Codec16546923525--
Codec182074866126612--
Comp245836316614553--
Comp276183312217222--
Comp6904604602--
Csv2452222221--
Csv85754334331--
Lang1461131121111
Lang21510388011011-
Lang221621110100---
Lang301611110088052-
Lang31345390013001--
Lang40774013004004--
Lang5847641394108172
Math947912002001--
Math97155244244144
Mock1135111111111
Mock21105222222---
Table 6. Comparing execution time of AgxFaults (Agx), Single-path Control-Flow-Sensitive (FS), and Single-path Control-Flow-Insensitive (FI) approaches on real bugs.
Solver TimeRunning Time
Bug IDAgxFSFIAgxFSFI
Chart5470571108316739531968
Chart715,170546318,96233763227
Chart184580139574925972434
Chart22220400119350930702192
Codec165819376223730852485
Codec182938139152787023412009
Comp24937021217926,29539503094
Comp2729,469121935563,23667465035
Comp646673158617451368
Csv25108141924781390
Csv820210281236938412597
Lang14691813177217021668
Lang21289747554653481282495
Lang2225,96253,232429,50757,6913452
Lang301661412239719741640
Lang31851573575724881616
Lang40445193644424381802
Lang58157257136590729982298
Math9456,59524,535559,01427,8602569
Math97722180116251220911661
Mock114107174318071698
Mock2154022172521151766
Sum146,27986,7031538259,711148,47450,464
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
