Information Theory and Software Testing. David Clark. (PowerPoint presentation)



SLIDE 1

Information Theory and Software Testing

David Clark

David Clark IT and ST

SLIDE 2

Papers

  • Squeeziness: An Information Theoretic Measure for Avoiding Fault Masking. D. Clark and R. Hierons. IPL 2012.
  • Fault Localization Prioritization: Comparing Information Theoretic and Coverage Based Approaches. S. Yoo, M. Harman and D. Clark. ToSEM 2013.
  • An Analysis of the Relationship between Conditional Entropy and Failed Error Propagation in Software Testing. K. Androutsopoulos, D. Clark, H. Dan, R. Hierons and M. Harman. ICSE 2014.
  • Information Transformation: An Underpinning Theory for Software Engineering. D. Clark, R. Feldt, S. Poulding and S. Yoo. ICSE 2015.
  • Test Set Diameter: Quantifying the Diversity of Sets of Test Cases. R. Feldt, S. Poulding, D. Clark and S. Yoo. ICST 2016.
  • Test Oracle Assessment and Improvement. G. Jahangirova, D. Clark, M. Harman and P. Tonella. ISSTA 2016.

SLIDE 3

Problems

  • What is the test execution order that locates a software fault as quickly as possible?
  • How can we choose tests that don’t suffer from coincidental correctness?
  • How do we know that we have enough tests?
  • How do we know that our test suite is sufficiently diverse?
  • How can we measure how much a real oracle deviates from an ideal oracle?

SLIDE 4

Shannon Entropy

randomness of a random variable
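For a discrete random variable X with distribution p, Shannon entropy is H(X) = −Σ p(x) log2 p(x). A minimal sketch in Python (illustrative, not from the slides):

```python
import math

def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin is maximally random for two outcomes: 1 bit.
print(shannon_entropy([0.5, 0.5]))  # 1.0
# A certain outcome carries no information: 0 bits.
print(shannon_entropy([1.0]))       # 0.0
```

Entropy is largest for the uniform distribution and zero for a point mass, which is why it serves as a measure of randomness.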

SLIDE 5

Kolmogorov Complexity

The length of the shortest program that can produce a given string from no inputs

Solomonoff, Kolmogorov, Chaitin

randomness of a string

SLIDE 6

Use Entropy to speed Fault Location

  • Program with m statements, S = {s0, s1, . . . , sm−1}
  • Test suite with n tests, T = {t0, t1, . . . , tn−1}
  • S contains a single fault
  • Random variable X models fault locality
  • p(X = sj) is the probability that sj is the faulty statement
  • Drive H(X) → 0 as fast as possible
  • Estimate the change in entropy due to each test
  • Employ a greedy algorithm to select the next test

SLIDE 7

Localisation Metrics

  • AKA “suspiciousness” metrics: the likelihood of a statement containing the fault
  • Examples: Tarantula, Ochiai, Jaccard, etc.

Tarantula Metric

τ(s) = (fail(s) / totalfail) / (pass(s) / totalpass + fail(s) / totalfail)
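A sketch of the Tarantula metric, assuming per-statement pass/fail execution counts are available (the function and parameter names are illustrative, not from the papers):

```python
def tarantula(fail_s, total_fail, pass_s, total_pass):
    """Suspiciousness of a statement: its normalised failing-execution
    ratio divided by the sum of its normalised passing and failing ratios."""
    if total_fail == 0 or (fail_s == 0 and pass_s == 0):
        return 0.0
    fail_ratio = fail_s / total_fail
    pass_ratio = pass_s / total_pass if total_pass else 0.0
    return fail_ratio / (pass_ratio + fail_ratio)

# A statement executed by every failing test and no passing test
# is maximally suspicious.
print(tarantula(2, 2, 0, 2))  # 1.0
```

A statement never executed by a failing test scores 0; one executed only by failing tests scores 1.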

SLIDE 8

Tarantula Metric illustration

Structural    Tarantula Metric (τ)      Tarantula Metric (τ)
Element       after t1, t2, t3          after adding t4
s1            0.00                      0.00
s2            0.00                      0.00
s3            0.00                      0.00
s4            0.00                      0.00
s5            0.00                      0.00
s6            1.00                      1.00
s7 (faulty)   0.67                      1.00
s8            1.00                      1.00
s9            0.67                      0.50
Result        t1: P, t2: F, t3: P       t4: F
SLIDE 9

  • B(sj) is the event that sj is faulty
  • Ti = Ti−1 ∪ {ti} is a set of tests
  • τ(s|Ti) is the suspiciousness of s after executing Ti

Tarantula induced Probability Distribution

PTi(B(sj)) = τ(sj|Ti) / Σ(j=1..m) τ(sj|Ti)

Tarantula induced Entropy

HTi(S) = − Σ(j=1..m) PTi(B(sj)) · log PTi(B(sj))
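The induced distribution and its entropy can be sketched directly; the suspiciousness scores below are illustrative:

```python
import math

def induced_entropy(suspiciousness):
    """Normalise suspiciousness scores into a probability distribution
    over fault locations and return its Shannon entropy in bits."""
    total = sum(suspiciousness)
    probs = [s / total for s in suspiciousness]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Entropy shrinks as suspiciousness concentrates on fewer statements.
print(induced_entropy([1.0, 1.0, 1.0, 1.0]))  # 2.0 bits
print(induced_entropy([0.9, 0.05, 0.05]))     # well below log2(3)
```

Lower entropy means the fault's location is better pinned down, which is what each newly executed test should achieve.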

SLIDE 10

Entropy Lookahead

Lookahead Probability Distribution on Failure

α = PTi+1(F(ti+1)) ≈ TFi / (TPi + TFi)

Lookahead Probability Distribution on Fault Location

PTi+1(B(sj)) = PTi+1(B(sj)|F(ti+1)) · α + PTi+1(B(sj)|¬F(ti+1)) · (1 − α)

  • F(ti) is the event that ti is identified as a failing test
  • Use PTi+1(B(sj)) to calculate HTi+1(S), the estimated entropy of B that results from adding the execution of ti+1
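A sketch of the lookahead estimate, assuming the two conditional fault-location distributions for the candidate test have already been computed (all numbers illustrative):

```python
import math

def lookahead_entropy(p_if_fail, p_if_pass, tf, tp):
    """Estimated fault-location entropy after running a candidate test:
    mix the fail/pass conditional distributions by the estimated failure
    probability alpha ~= TF / (TP + TF), then take Shannon entropy."""
    alpha = tf / (tp + tf)
    mixed = [alpha * pf + (1 - alpha) * pp
             for pf, pp in zip(p_if_fail, p_if_pass)]
    return -sum(p * math.log2(p) for p in mixed if p > 0)

# Greedy selection would pick the candidate test minimising this estimate.
h = lookahead_entropy([0.7, 0.1, 0.1, 0.1], [0.25] * 4, tf=1, tp=3)
print(h)
```

The greedy loop simply evaluates this estimate for every remaining test and executes the one with the smallest value.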

SLIDE 11

Outcomes

  • Approach is independent of the fault localisation method used
  • Experimental evidence from four SUTs plus their test suites drawn from the Software-artifact Infrastructure Repository (SIR)
  • Increased the suspiciousness ranking and decreased the cost of fault localisation for 70% of the faults examined

Paper: Fault Localization Prioritization: Comparing Information Theoretic and Coverage Based Approaches. Yoo, Harman and Clark. ToSEM 2013.

SLIDE 12

Use Conditional Entropy to avoid Coincidental Correctness

Intended:   x = x + 2; if (x > 0) x = x % 4; else x = x;
Unintended: x = 3 * x; if (x > 0) x = x % 4; else x = x;

Input:              t1: x == 3    t2: x == -5
Intended output:    t1: x == 1    t2: x == -3
Unintended output:  t1: x == 1    t2: x == -15
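The example runs as stated: input x == 3 is coincidentally correct because 5 % 4 and 9 % 4 both equal 1, while x == -5 exposes the fault. A direct Python transcription:

```python
def intended(x):
    x = x + 2
    return x % 4 if x > 0 else x

def faulty(x):
    x = 3 * x          # fault: should be x + 2
    return x % 4 if x > 0 else x

print(intended(3), faulty(3))    # 1 1   -> coincidentally correct
print(intended(-5), faulty(-5))  # -3 -15 -> fault revealed
```

The `% 4` squeezes many internal states onto few outputs, which is exactly what lets the corrupted state on t1 mask the fault.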

SLIDE 13

The Abstract View

(Diagram: the intended program P and the unintended program P′, each split at a program point pp / pp′ into a prefix A / A′ and a suffix B / B′, with corresponding intermediate states.)

SLIDE 14

Information Based View

(Diagram: a function f maps inputs to an output o with probability p(o); the preimage f −1o has entropy H(f −1o).)

SLIDE 15

The Maths

Loss of information from running program P:

H(I) − H(O), where [[P]] I = O

Conditional entropy of I given O: Squeeziness.

Sq(f) = H(I) − H(O)
      = Σ(o ∈ O) p(o) H(f −1o)    (via the partition property)
      = H(I|O)                    (deterministic case)
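Squeeziness of a deterministic function can be computed by brute force on a small domain; a sketch assuming uniformly distributed inputs (the function name is illustrative):

```python
import math
from collections import Counter

def squeeziness(f, inputs):
    """Sq(f) = H(I) - H(O) for uniformly distributed inputs: the
    information a deterministic f destroys by merging inputs."""
    inputs = list(inputs)
    h_in = math.log2(len(inputs))            # entropy of uniform input
    counts = Counter(f(i) for i in inputs)   # output multiplicities
    n = len(inputs)
    h_out = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return h_in - h_out

# x % 4 squeezes the 8 inputs 0..7 onto 4 outputs, losing exactly 1 bit.
print(squeeziness(lambda x: x % 4, range(8)))  # 1.0
```

An injective function has squeeziness 0: it loses no information, so faulty internal states cannot be masked by it.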

SLIDE 16

Example Hypothesis

(Diagram as on slide 13: the intended and unintended programs split into A; B and A′; B′ at program points pp, pp′.)

[[π]] pa = pp′, where π = A′B′ and πl = B′

SLIDE 17

Summary

  • 30 SUTs
  • 1,408 mutants
  • 7,140,000 test cases
  • Five different IT metrics experimentally investigated
  • Two metrics showed 0.95 Spearman rank correlation with the probability of failed error propagation
  • 10% of all 7,140,000 test inputs suffered from FEP

Paper: An Analysis of the Relationship between Conditional Entropy and Failed Error Propagation in Software Testing. Androutsopoulos, Clark, Dan, Hierons and Harman. ICSE 2014.

SLIDE 18

Use Kolmogorov Complexity to Measure Input Diversity

Normalised Information Distance: for two strings x and y,

NID(x, y) = max{K(x|y), K(y|x)} / max{K(x), K(y)}

  • Enables comparisons between strings of different lengths

NCD, the Normalised Compression Distance: for two strings x and y,

NCD(x, y) = (C(xy) − min{C(x), C(y)}) / max{C(x), C(y)}

  • A computable approximation using compressors such as 7zip, Bzip
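NCD can be approximated with any off-the-shelf compressor; a sketch using Python's standard-library zlib (the choice of compressor affects approximation quality, and this is an illustration rather than the papers' setup):

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalised Compression Distance, approximated via zlib
    compressed lengths: C stands in for Kolmogorov complexity K."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"abcd" * 100
b_ = b"abcd" * 100
c = bytes(range(256)) * 2
print(ncd(a, b_))  # near 0: identical strings compress well together
print(ncd(a, c))   # larger: unrelated strings share little structure
```

Small NCD means the compressor finds shared structure; diverse test inputs should therefore sit far apart under this distance.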

SLIDE 19

Experiments

  • Use a version of NCD for multisets – calculate the set “diameter”
  • Bigger diameter means more diversity
  • Purely consider sets of inputs – no information from executions except in the course of evaluation
  • Inputs for three SUTs: JEuclid, NanoXML, ROME
  • Controlled for input size
  • Compared test sets using three fixed sizes: 10, 25 and 50

SLIDE 20

Outcomes for Higher Diameter Test Sets

  • On average, higher code coverage
  • Higher code coverage than randomly selected test sets
  • Leads to higher code coverage even if we control for the size of test inputs
  • May have better fault-finding ability
  • Selection scales quadratically in the size of the initial pool of tests and linearly with the average length of the tests

Paper: Test Set Diameter: Quantifying the Diversity of Sets of Test Cases. Feldt, Poulding, Clark and Yoo. ICST 2016.

SLIDE 21

Oracle Deficiencies


public class Subtract {
    public double value(double x, double y) {
        double result = x - y;
        assert (result != x);      // too strong: false alarm when y == 0
        assert (result == x - y);
        return result;
    }
}

public class FastMath {
    public int max(int a, int b) {
        int max;
        if (a >= b) { max = a; } else { max = b; // max = a;
        }
        assert (max >= a);         // too weak: would miss the commented fault
        return max;
    }
}

Oracles may be too strong (false alarms, as in Subtract) or too weak (missed faults, as in FastMath).
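The false-alarm case can be reproduced in a Python transcription of the Subtract example (illustrative, not the original Java):

```python
def subtract(x, y):
    result = x - y
    # Too-strong oracle: rejects the perfectly valid case y == 0.
    assert result != x, "false alarm"
    assert result == x - y
    return result

print(subtract(5.0, 2.0))  # 3.0: passes both assertions

# The valid call subtract(5.0, 0.0) trips the too-strong assertion.
try:
    subtract(5.0, 0.0)
    outcome = "accepted"
except AssertionError:
    outcome = "false alarm"
print(outcome)
```

The symmetric deficiency, a too-weak oracle, simply never raises on a faulty result, so no test run can reveal the fault through it.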

SLIDE 22

Oracle Improvement Steps

  • Since E is fixed: a + b = const, c + d = const (repartitioning)
  • False negative reduction: a′ = a + ε, b′ = b − ε
  • False positive reduction: c′ = c + δ, d′ = d − δ

SLIDE 23

Oracle Improvement Modelling

Mutual information:

I(X; Y) = Σ(x ∈ X) Σ(y ∈ Y) p(x, y) log2 ( p(x, y) / (p(x) p(y)) )

I(α; G) = −(b + c) log2(b + c) − (a + d) log2(a + d)
          −(a + b) log2(a + b) − (c + d) log2(c + d)
          + a log2 a + b log2 b + c log2 c + d log2 d
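Mutual information of a 2×2 joint distribution can be computed from the standard definition; a sketch where the joint table stands in for the oracle/ground-truth cells a, b, c, d normalised to probabilities:

```python
import math

def mutual_information(joint):
    """I(X;Y) = sum over cells of p(x,y) log2( p(x,y) / (p(x) p(y)) )
    for a 2x2 joint distribution given as [[p00, p01], [p10, p11]]."""
    px = [sum(row) for row in joint]            # row marginals
    py = [sum(col) for col in zip(*joint)]      # column marginals
    return sum(p * math.log2(p / (px[i] * py[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0)

# Independent variables carry no mutual information.
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))  # 0.0
# Perfect agreement between oracle and ground truth: the full 1 bit.
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))      # 1.0
```

An ideal oracle maximises I(α; G); each improvement step should move the table's cells so that this quantity increases.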

SLIDE 24

Bad Oracles

  • A bad oracle α is one for which ac < bd
  • ∆I(α; G) = (bd − ac) / (c + d)

(Plot: bad oracle vs. good oracle.)

Paper: Test Oracle Assessment and Improvement. Jahangirova, Clark, Harman and Tonella. ISSTA 2016.

SLIDE 25

In Conclusion

Looked at contributions, both theoretical and practical, to:

  • oracle improvement
  • test set diversity
  • coincidental correctness
  • test set prioritisation

More to come:

  • InfoTestSS, an EPSRC funded project
  • Applying information theoretic ideas to test set selection and exploring relationships with coverage and mutation testing
  • EPSRC contribution approx £900,000, shared between UCL and Brunel
  • Industrial contribution approx £230,000 from J.P. Morgan and Berner & Mattner
  • Project collaborators include Rob Hierons, Mark Harman, Robert Feldt, Michele Boreale, Paolo Tonella
