SLIDE 1
Class 20
- Fault localization (cont’d)
- Test-data generation
- Exam review: Nov 3, after class to 7:30
- Responsible for all material up through Nov 3
(through test-data generation)
- Send questions beforehand so all can prepare
- Exam: Nov 10
- Final project presentations: Dec 1, 3; 4:35-6:45
- Assign (see Schedule for links)
- Problem Set 9 discuss
- Readings
Fault Localization Using Tarantula
- What information does Tarantula use to compute the suspiciousness (and ranking) of statements in the program?
- How is this information used?
- Are there other ways to compute the suspiciousness using this information?
- What information other than statement coverage could be used for fault localization?
- Do you think statement coverage would have worked for tritype?
- How could we use fault localization to identify which changes are most suspicious after a build?
SLIDE 2 Improving Fault-localization Efficiency
[Diagram: sequential debugging loop. Execute P; if tests fail, debug to produce P'; execute P'; if tests fail, debug to produce P''; repeat until all tests pass and Pi is failure-free.]
- Are all failing tests caused by the same fault?
- Can we associate groups of tests with different
faults?
- Can we reduce debugging effort by considering
these groups individually?
- Can we reduce debugging effort by considering
these groups simultaneously?
Improving Fault-localization Efficiency
mid() {
  int x,y,z,m;
  1:  read("Enter 3 integers:",x,y,z);
  2:  m = z;
  3:  if (y<z)
  4:    if (x<y)
  5:      m = y;
  6:    else if (x<z)
  7:      m = y;
  8:  else
  9:    if (x>y)
  10:     m = z;
  11:   else if (x>z)
  12:     m = x;
  13: print("Middle number is:", m);
}

Test  Input (x,y,z)  Statements covered  Pass/fail status
t1    3,3,5          1 2 3 4 6 7 13      P
t2    1,2,3          1 2 3 4 5 13        P
t3    3,2,2          1 2 3 8 9 10 13     P
t4    5,5,5          1 2 3 8 9 11 13     P
t5    1,1,4          1 2 3 4 6 7 13      P
t6    5,3,4          1 2 3 4 6 13        P
t7    3,2,1          1 2 3 8 9 10 13     F
t8    2,1,3          1 2 3 4 6 7 13      F
t9    5,4,2          1 2 3 8 9 10 13     F
t10   5,2,6          1 2 3 4 6 7 13      F
SLIDE 3
Debugging Process
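[Diagram: two debugging processes. Sequential: execute P, debug the failed tests to produce P', and repeat until all tests pass and Pi is failure-free. Parallel: execute P, debug separate groups of "some failed tests" at the same time to produce P', and repeat until all tests pass and Pk is failure-free.]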
SLIDE 4
Hierarchy of Bugs
- Faults often dominate each other
- Failing test cases are first caused by a set of initial faults
- Once initial faults are fixed, other faults manifest themselves
[Figure: timeline showing Fault 1 through Fault 8 manifesting over time]
SLIDE 5
Debugging Process
Potential benefits:
- Reduced time to failure-free program
- Less "noise" in locating each fault
- Better utilization of developer effort
Potential costs:
- Overhead to partition test cases
(developers)
Crucial problem:
- Partitioning failed tests into groups of similar behavior, with each group focused on a different fault
- These groups are fault-focusing clusters of failed test cases
SLIDE 6 Fault-focusing Clusters—Overview
[Figure: the test cases t01 through t10]
Fault-focusing clusters:
- Clusters of failing test cases
- Clusters failing in a similar way
- Each cluster targeting a different fault
Fault-focusing Clusters
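[Figure: the test cases split into the failing tests t07, t09, t08, t10, which form the fault-focusing clusters, and the passing tests t01 through t06, which are added to each cluster to form the specialized test suites]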
Specialized test suites: Fault-focusing clusters combined with passing test cases
SLIDE 7 Fault-focusing Clusters
[Figure: the specialized test suites, each a fault-focusing cluster of failing tests combined with the passing tests t01 through t06, assigned to Developer 1 and Developer 2]
time (in parallel) using specialized test suites
SLIDE 8 Fault-focusing Clusters
[Diagram: three-stage pipeline of Execution, Clustering, and Fault Localization, with the labels "failed test cases", "execution information", "specialized test suites", and "suspiciousness and ranks"]
Fault-focusing Clusters
Clustering by behavior models
Dynamic information
- profiles (branch, method-method, …)
- only failed tests
Statistical analysis, machine learning
- generate models for each execution
- cluster models
Fault localization for stopping point
SLIDE 9
Clustering Behavior Models
- Models: discrete-time Markov chains (DTMCs) built from profiles (branch, method, …)
- Clustering: iterative; each step merges the two most similar clusters according to Sim1
- Sim1: sum of the absolute differences between matching transitions in the DTMCs being compared
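The Sim1 measure can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the traces, the names `build_dtmc`, `sim1`, and `closest_pair`, and the choice to treat a transition missing from one model as probability 0 are all assumptions. The slide does not say how two models are merged, so the sketch only selects the pair that would be merged next.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_dtmc(trace):
    """Estimate transition probabilities from one execution's event sequence."""
    counts = defaultdict(Counter)
    for src, dst in zip(trace, trace[1:]):
        counts[src][dst] += 1
    dtmc = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        for dst, n in outgoing.items():
            dtmc[(src, dst)] = n / total
    return dtmc


def sim1(a, b):
    """Sum of absolute differences between transition probabilities.

    Assumption: a transition absent from one model counts as probability 0.
    Smaller values mean the two models are more similar.
    """
    return sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in set(a) | set(b))


def closest_pair(models):
    """Names of the two models with the smallest Sim1 value (the next merge)."""
    return min(combinations(sorted(models), 2),
               key=lambda pair: sim1(models[pair[0]], models[pair[1]]))


# Hypothetical branch traces for three failed executions.
models = {
    "tA": build_dtmc(["b1", "b2", "b1", "b3"]),
    "tB": build_dtmc(["b1", "b2", "b1", "b2"]),
    "tC": build_dtmc(["b1", "b3", "b1", "b3"]),
}
print(closest_pair(models))
```

Because Sim1 as described is a sum of differences, it behaves like a distance: the pair with the smallest value is the most similar.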
The most difficult problem in clustering is determining a good stopping criterion.
What is a good stopping point when clustering for fault-focusing clusters?
Fault Localization for Stopping Point
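[Figure: dendrogram with leaves t07, t08, t09, t10; t07 and t09 merge into t07-09, t08 and t10 merge into t08-10, and these two merge into t07-09-08-10]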
SLIDE 10
Fault Localization for Stopping Point
$\mathrm{Sim}_2(A, B) = \frac{|A \cap B|}{|A \cup B|}$ (A, B: the sets of top-ranked statements from the two fault localizations being compared)
t07-09-08-10: test suite t01 t02 t03 t04 t05 t06 t07 t08 t09 t10, ranked by fault localization
SLIDE 11
Tarantula: Fault Localization
Test suite: t1 (3,3,5) P, t2 (1,2,3) P, t3 (3,2,2) P, t4 (5,5,5) P, t5 (1,1,4) P, t6 (5,3,4) P, t7 (3,2,1) F, t8 (2,1,3) F, t9 (5,4,2) F, t10 (5,2,6) F

Statement                                      Suspiciousness
1:  read("Enter 3 integers:",x,y,z);           0.50
2:  m = z;                                     0.50
3:  if (y<z)                                   0.50
4:    if (x<y)                                 0.43
5:      m = y;                                 0.00
6:    else if (x<z)                            0.50
7:      m = y;  //bug                          0.60
8:  else                                       0.60
9:    if (x>y)                                 0.60
10:     m = z;  //bug                          0.75
11:   else if (x>z)                            0.00
12:     m = x;                                 0.00
13: print("Middle number is:", m);             0.50
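The suspiciousness values on this slide can be reproduced with the Tarantula ratio, suspiciousness(s) = %failed(s) / (%passed(s) + %failed(s)). The sketch below is illustrative only: the coverage sets were obtained by tracing mid() by hand on the ten inputs, and the names `COVERAGE`, `FAILED`, and `tarantula` are not from the slides.

```python
# Statements of mid() executed by each test (hand-traced from the listing).
COVERAGE = {
    "t1": {1, 2, 3, 4, 6, 7, 13},
    "t2": {1, 2, 3, 4, 5, 13},
    "t3": {1, 2, 3, 8, 9, 10, 13},
    "t4": {1, 2, 3, 8, 9, 11, 13},
    "t5": {1, 2, 3, 4, 6, 7, 13},
    "t6": {1, 2, 3, 4, 6, 13},
    "t7": {1, 2, 3, 8, 9, 10, 13},
    "t8": {1, 2, 3, 4, 6, 7, 13},
    "t9": {1, 2, 3, 8, 9, 10, 13},
    "t10": {1, 2, 3, 4, 6, 7, 13},
}
FAILED = {"t7", "t8", "t9", "t10"}


def tarantula(coverage, failed, statements=range(1, 14)):
    """Suspiciousness per statement; assumes at least one passing and one failing test."""
    passed = set(coverage) - set(failed)
    scores = {}
    for stmt in statements:
        fail_ratio = sum(stmt in coverage[t] for t in failed) / len(failed)
        pass_ratio = sum(stmt in coverage[t] for t in passed) / len(passed)
        total = fail_ratio + pass_ratio
        # A statement executed by no test gets suspiciousness 0.
        scores[stmt] = fail_ratio / total if total else 0.0
    return scores


scores = tarantula(COVERAGE, FAILED)
for stmt, value in scores.items():
    print(f"{stmt:2d}  {value:.2f}")
```

With all four failing tests in one suite, statement 10 ranks first (0.75) while statement 7 ties with statements 8 and 9 (0.60), which is the "noise" that the specialized test suites aim to reduce.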
Fault Localization for Stopping Point
t07-09-08-10: test suite t01 t02 t03 t04 t05 t06 t07 t08 t09 t10; top-ranked statements: 10 9 8 7
t07-09: test suite t01 t02 t03 t04 t05 t06 t07 t09
SLIDE 12
Fault-focusing Cluster 1
Test suite: t1 (3,3,5) P, t2 (1,2,3) P, t3 (3,2,2) P, t4 (5,5,5) P, t5 (1,1,4) P, t6 (5,3,4) P, t7 (3,2,1) F, t9 (5,4,2) F
Statements of mid() marked //bug: 7, 10
Statement:       1    2    3    4    5    6    7    8    9    10   11   12   13
Suspiciousness:  0.50 0.50 0.50 0.00 0.00 0.00 0.00 0.75 0.75 0.86 0.00 0.00 0.50
Fault Localization for Stopping Point
t07-09-08-10: top-ranked statements: 10 9 8 7
t07-09: test suite t01 t02 t03 t04 t05 t06 t07 t09; top-ranked statements: 10 9 8 1
$\mathrm{Sim}_2 = \frac{3}{5} = .60$
SLIDE 13 Fault Localization for Stopping Point
t07-09-08-10: top-ranked statements: 10 9 8 7
t08-10: test suite t01 t02 t03 t04 t05 t06 t08 t10
Similarity of t07-09 to t07-09-08-10: .60
Fault-focusing Cluster 2
Test suite: t1 (3,3,5) P, t2 (1,2,3) P, t3 (3,2,2) P, t4 (5,5,5) P, t5 (1,1,4) P, t6 (5,3,4) P, t8 (2,1,3) F, t10 (5,2,6) F
Statements of mid() marked //bug: 7, 10
Statement:       1    2    3    4    5    6    7    8    9    10   11   12   13
Suspiciousness:  0.50 0.50 0.50 0.60 0.00 0.67 0.75 0.00 0.00 0.00 0.00 0.00 0.50
SLIDE 14
Fault Localization for Stopping Point
t07-09-08-10: top-ranked statements: 10 9 8 7
t08-10: test suite t01 t02 t03 t04 t05 t06 t08 t10; top-ranked statements: 7 6 4 1
$\mathrm{Sim}_2 = \frac{1}{7} = .14$
Similarity to t07-09-08-10: t07-09 = .60, t08-10 = .14
Fault Localization for Stopping Point
t08-10: test suite t01 t02 t03 t04 t05 t06 t08 t10; top-ranked statements: 7 6 4 1
t08: test suite t01 t02 t03 t04 t05 t06 t08; top-ranked statements: 7 6 4 1
SLIDE 15
Fault Localization for Stopping Point
t08-10: top-ranked statements: 7 6 4 1
t08: test suite t01 t02 t03 t04 t05 t06 t08; top-ranked statements: 7 6 4 1
$\mathrm{Sim}_2 = \frac{4}{4} = 1.00$
SLIDE 16 Fault Localization for Stopping Point
t08-10: top-ranked statements: 7 6 4 1
t10: test suite t01 t02 t03 t04 t05 t06 t10; top-ranked statements: 7 6 4 1
$\mathrm{Sim}_2 = \frac{4}{4} = 1.00$
Fault Localization for Stopping Point
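[Figure: dendrogram over t07, t08, t09, t10 annotated with similarities: t08-10 to t07-09-08-10 = .14; t07-09 to t07-09-08-10 = .60; t08 to t08-10 = 1.00; t10 to t08-10 = 1.00]
- Composite is similar (above threshold) to both of its constituents, so clustering stops at this level
- Result is two clusters: {t07, t09}, {t08, t10}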
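The similarity values in the dendrogram (.14, .60, 1.00) are consistent with reading the measure as set overlap between top-ranked statements, |A ∩ B| / |A ∪ B|. The sketch below checks that reading against the ranks shown on the slides. The threshold value is an assumption (the slides do not state one), and `sim2`, `TOP_RANKED`, and `stops_at` are illustrative names.

```python
def sim2(a, b):
    """Overlap of two sets of top-ranked statements: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


# Top-ranked statements per cluster, as shown on the slides.
TOP_RANKED = {
    "t07-09-08-10": {10, 9, 8, 7},
    "t07-09": {10, 9, 8, 1},
    "t08-10": {7, 6, 4, 1},
    "t08": {7, 6, 4, 1},
    "t10": {7, 6, 4, 1},
}

THRESHOLD = 0.7  # assumed value; any threshold above .60 and at most 1.00 gives the same result


def stops_at(composite, constituents, threshold=THRESHOLD):
    """True if the composite is similar enough to every constituent to stay one cluster."""
    return all(
        sim2(TOP_RANKED[composite], TOP_RANKED[c]) >= threshold
        for c in constituents
    )


print(round(sim2(TOP_RANKED["t07-09-08-10"], TOP_RANKED["t07-09"]), 2))
print(round(sim2(TOP_RANKED["t07-09-08-10"], TOP_RANKED["t08-10"]), 2))
print(stops_at("t08-10", ["t08", "t10"]))
print(stops_at("t07-09-08-10", ["t07-09", "t08-10"]))
```

Under this assumed threshold, t08-10 is kept as one cluster and t07-09-08-10 is split. The slides do not show ranks for t07 and t09 individually, so the same check for t07-09 is left out.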
SLIDE 17
SLIDE 18
Visualization of Specialized Test Suites
[Figure: the mid() listing shown twice, once for Cluster 1 and once for Cluster 2, each with the two statements marked //bug]
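How the two specialized test suites separate the two faults can be sketched end to end. This is an illustration under assumptions: `mid_trace` is a Python transcription of the slide's mid() that also records executed statement numbers, the oracle is the true median, and the clusters are taken from the slides rather than computed.

```python
def mid_trace(x, y, z):
    """Transcription of the slide's mid(); returns (m, executed statement numbers)."""
    cov = {1, 2, 3, 13}
    m = z
    if y < z:
        cov.add(4)
        if x < y:
            cov.add(5)
            m = y
        else:
            cov.add(6)
            if x < z:
                cov.add(7)
                m = y
    else:
        cov |= {8, 9}
        if x > y:
            cov.add(10)
            m = z
        else:
            cov.add(11)
            if x > z:
                cov.add(12)
                m = x
    return m, cov


def tarantula(coverage, failed, statements=range(1, 14)):
    """Tarantula suspiciousness; assumes at least one passing and one failing test."""
    passed = set(coverage) - set(failed)
    scores = {}
    for stmt in statements:
        fail_ratio = sum(stmt in coverage[t] for t in failed) / len(failed)
        pass_ratio = sum(stmt in coverage[t] for t in passed) / len(passed)
        total = fail_ratio + pass_ratio
        scores[stmt] = fail_ratio / total if total else 0.0
    return scores


TESTS = {
    "t1": (3, 3, 5), "t2": (1, 2, 3), "t3": (3, 2, 2), "t4": (5, 5, 5),
    "t5": (1, 1, 4), "t6": (5, 3, 4), "t7": (3, 2, 1), "t8": (2, 1, 3),
    "t9": (5, 4, 2), "t10": (5, 2, 6),
}

runs = {name: mid_trace(*inp) for name, inp in TESTS.items()}
coverage = {name: cov for name, (_, cov) in runs.items()}
# A test fails when the output differs from the true middle value.
failing = {name for name, (m, _) in runs.items() if m != sorted(TESTS[name])[1]}
passing = set(TESTS) - failing

clusters = [{"t7", "t9"}, {"t8", "t10"}]  # fault-focusing clusters from the slides
top_statements = []
for cluster in clusters:
    suite = passing | cluster  # specialized test suite
    scores = tarantula({t: coverage[t] for t in suite}, cluster)
    top_statements.append(max(scores, key=scores.get))
print(top_statements)
```

Each specialized suite puts a different statement at the top of its ranking, so two developers can each start from a different fault.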
SLIDE 19 Empirical Study
Variables
- NSTS: Finding faults using non-specialized test suites
- STS-S: Finding faults using specialized test suites
- STS-P: Finding faults using specialized test suites in parallel
Measures
- D: total developer effort
- FF: total effort to failure-free program
Subject
SPACE
- 6000 LOC
- 100 8-fault versions (> 1000 derivative versions)
Method
For each of the 100 8-fault versions, debug until failure-free, using
- Non-specialized test suite
- Specialized test suite, both sequential and parallel
D: Total Developer Effort
- Using specialized test suites based on fault-focusing clusters is less expensive, on average, than not using specialized test suites
- The benefit holds when performing
- fault localization sequentially (one developer)
- fault localization in parallel (multiple developers)
SLIDE 20 FF: Total Effort to Failure-free
Sample source  Sample mean  Sample standard deviation  99% CI lower bound  99% CI upper bound
FF (NSTS)      36.26        22.86                      30.83               41.69
FF (STS-S)     26.16        22.58                      20.80               31.53
FF (STS-P)     18.29        14.00                      14.96               21.62
Using specialized test suites and performing the fault localization sequentially or in parallel can provide significant savings over using non-specialized test suites
Summary of Results
For SPACE, using specialized test suites is usually less expensive than using non-specialized test suites
- Total developer effort is reduced for both sequential and parallel modes
- Time to a failure-free program is reduced without negatively affecting the total developer effort
SLIDE 21
Automatic Test Data Generation
Test Data Generation
Ferguson and Korel described three categories of test-data generation. What are they?
SLIDE 22
Test Data Generation
Ferguson and Korel described three categories of test-data generation:
1. Random: randomly select from the universe of inputs
2. Goal-oriented: select test data to execute a given entity (e.g., statement, branch, def-use pair) irrespective of the path taken
3. Path-oriented: select a program path and generate test data that will execute that path; the path can be selected automatically or by the user
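The three categories can be contrasted on the mid() example from earlier in the lecture. The sketch below is a deliberately naive stand-in: the goal-oriented and path-oriented functions use exhaustive search over a small assumed input domain, not the techniques Ferguson and Korel describe, and all names (`executed`, `DOMAIN`, `random_tests`, `goal_oriented`, `path_oriented`) are illustrative.

```python
import itertools
import random


def executed(x, y, z):
    """Statement numbers of mid() that run on input (x, y, z)."""
    stmts = {1, 2, 3, 13}
    if y < z:
        stmts.add(4)
        if x < y:
            stmts.add(5)
        else:
            stmts.add(6)
            if x < z:
                stmts.add(7)
    else:
        stmts |= {8, 9}
        if x > y:
            stmts.add(10)
        else:
            stmts.add(11)
            if x > z:
                stmts.add(12)
    return stmts


DOMAIN = range(1, 7)  # assumed universe of values for each input


def random_tests(n, seed=0):
    """1. Random: draw n inputs uniformly from the universe of inputs."""
    rng = random.Random(seed)
    return [tuple(rng.randint(1, 6) for _ in range(3)) for _ in range(n)]


def goal_oriented(target):
    """2. Goal-oriented: any input that executes `target`, whatever path it takes."""
    for inp in itertools.product(DOMAIN, repeat=3):
        if target in executed(*inp):
            return inp
    return None


def path_oriented(path):
    """3. Path-oriented: an input that executes exactly the chosen path, if feasible."""
    for inp in itertools.product(DOMAIN, repeat=3):
        if executed(*inp) == set(path):
            return inp
    return None


print(random_tests(3))
print(goal_oriented(12))  # statement 12 is not executed by any of t1 to t10
print(path_oriented([1, 2, 3, 4, 6, 7, 13]))
print(path_oriented([1, 2, 3, 5, 13]))  # infeasible: statement 5 requires statement 4
```

The last call shows a cost specific to path-oriented generation: a selected path may be infeasible, in which case no test data exists for it.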