Mutation Testing, Reid Holmes (PowerPoint presentation)



slide-1
SLIDE 1

Mutation Testing

Reid Holmes

slide-2
SLIDE 2

Key questions

Is a test suite:
Sufficiently broad?
Sufficiently deep?

slide-3
SLIDE 3

Test suite depth

Mutation testing

slide-4
SLIDE 4

Program -> Generate Mutants

slide-6
SLIDE 6

Program -> Generate Mutants -> Mutant

slide-9
SLIDE 9

Program -> Generate Mutants -> Mutant
Mutant + Test Suite -> Execute Suites -> Kill Score

slide-10
SLIDE 10

What mutations?

flip boolean
increment to decrement
boundaries (<, >=, etc)
remove conditional

slide-11
SLIDE 11

Mutation operators

Conditional Boundary

<  -> <=
<= -> <
>  -> >=
>= -> >

if (a<b) {..} -> if (a<=b) {..}
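To make the conditional-boundary operator concrete, here is a minimal sketch. The class and method names are hypothetical and not from the slides. It suggests that only an input sitting exactly on the boundary (a == b) can tell the original apart from the mutant.

```java
final class BoundaryMutantSketch {
    // Original predicate (hypothetical): true when a is strictly less than b.
    static boolean original(int a, int b) {
        return a < b;
    }

    // Conditional-boundary mutant: < has been replaced by <=.
    static boolean mutant(int a, int b) {
        return a <= b;
    }

    public static void main(String[] args) {
        // Away from the boundary both versions agree, so this input cannot kill the mutant.
        System.out.println(original(1, 2) == mutant(1, 2)); // prints true
        // On the boundary they disagree, so a test using this input would kill the mutant.
        System.out.println(original(2, 2) == mutant(2, 2)); // prints false
    }
}
```

A suite that never exercises a == b would leave this mutant alive, which is the kind of gap the operator is meant to expose.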

slide-12
SLIDE 12

Mutation operators

Negate Conditionals

== -> !=
!= -> ==
…

if (a==b) {..} -> if (a!=b) {..}
slide-13
SLIDE 13

Mutation operators

Remove Conditionals

if (a==b) {..} -> if (true) {..}
slide-14
SLIDE 14

Mutation operators

Math

+ -> -
* -> /
| -> &
…

int a = b + c; -> int a = b - c;
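A small sketch of the math operator in action, with hypothetical names. It illustrates that a test input of c = 0 cannot distinguish b + c from b - c, while any non-zero c can.

```java
final class MathMutantSketch {
    // Original computation from the slide: a = b + c.
    static int original(int b, int c) {
        return b + c;
    }

    // Math mutant: + has been replaced by -.
    static int mutant(int b, int c) {
        return b - c;
    }

    public static void main(String[] args) {
        // With c == 0 both versions return 4, so a test using this input lets the mutant survive.
        System.out.println(original(4, 0) == mutant(4, 0)); // prints true
        // With c == 3 the results are 7 and 1, so a test asserting on the value kills the mutant.
        System.out.println(original(4, 3) == mutant(4, 3)); // prints false
    }
}
```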
slide-15
SLIDE 15

Mutation operators

Increments/Decrements

++ -> --
-- -> ++

i++ -> i--
slide-16
SLIDE 16

Mutation operators

Inline Constant

int i = 0; -> int i = 3;
slide-17
SLIDE 17

Mutation operators

Return mutator

return o; -> return null;
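As a rough illustration of the return mutator, the sketch below uses hypothetical method names. A test that merely calls the method without checking the result would pass on both versions. A test that asserts on the returned value tells them apart.

```java
final class ReturnMutantSketch {
    // Original (hypothetical): hands back the object it was given.
    static String original(String o) {
        return o;
    }

    // Return mutant: the returned value has been replaced by null.
    static String mutant(String o) {
        return null;
    }

    public static void main(String[] args) {
        // Asserting on the returned value distinguishes the two versions.
        System.out.println("x".equals(original("x"))); // prints true
        System.out.println("x".equals(mutant("x")));   // prints false
    }
}
```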
slide-18
SLIDE 18

Mutation operators

Skip void calls

void somethingImportant(){..}

int foo() {
    int i = 5;
    somethingImportant();
    return i;
}
->
int foo() {
    int i = 5;
    // somethingImportant();
    return i;
}
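The following sketch fills in the slide's foo() example under one assumption: somethingImportant() has an observable side effect, modeled here by a hypothetical call counter. It suggests that a test checking only the return value cannot kill this mutant, while a test that also observes the side effect can.

```java
final class VoidCallMutantSketch {
    // Hypothetical side effect: counts how often somethingImportant() ran.
    static int calls = 0;

    static void somethingImportant() {
        calls++;
    }

    // Original method from the slide.
    static int foo() {
        int i = 5;
        somethingImportant();
        return i;
    }

    // Mutant: the void call has been removed.
    static int fooMutant() {
        int i = 5;
        return i;
    }

    public static void main(String[] args) {
        calls = 0;
        int originalResult = foo();
        int originalCalls = calls;

        calls = 0;
        int mutantResult = fooMutant();
        int mutantCalls = calls;

        // Return values match, so a test asserting only on the result lets the mutant survive.
        System.out.println(originalResult == mutantResult); // prints true
        // The side effect differs, so a test that checks it kills the mutant.
        System.out.println(originalCalls == mutantCalls);   // prints false
    }
}
```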
slide-19
SLIDE 19

public float avg(float[] data) {
    float sum = 0;
    for (float num : data) {
        sum += num;
    }
    return sum * data.length;
}

slide-20
SLIDE 20

public float avg(float[] data) {
    float sum = 0;
    for (float num : data) {
        sum += num;
    }
    return sum * data.length;
}

Test suite:

assertEq(avg([1]), 1);

slide-21
SLIDE 21

public float avg(float[] data) {
    float sum = 1;
    for (float num : data) {
        sum += num;
    }
    return sum * data.length;
}

Test suite:

✖ assertEq(avg([1]), 1);

slide-22
SLIDE 22

public float avg(float[] data) {
    float sum = 0;
    for (float num : data) {
        sum -= num;
    }
    return sum * data.length;
}

Test suite:

✖ assertEq(avg([1]), 1);

slide-23
SLIDE 23

public float avg(float[] data) {
    float sum = 0;
    for (float num : data) {
        sum += num;
    }
    return sum / data.length;
}

Test suite:

✔ assertEq(avg([1]), 1);

slide-24
SLIDE 24

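Test suite:

✔ assertEq(avg([1]), 1);

Mutants:

✖ sum = 0 -> sum = 1              (test fails, mutant killed)
✖ sum += num -> sum -= num        (test fails, mutant killed)
✔ sum * length -> sum / length    (test passes, mutant survives)

Kill Score: 66% (2 of 3 mutants killed)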
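The kill score for the running example can be reproduced with a short sketch. The avg method and its three mutants are copied from the slides. The harness around them (mutants, testPasses, countKilled) is a hypothetical stand-in for what a mutation testing tool would do, not the API of any particular tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class KillScoreSketch {
    // Program under test, as on the slides (note that it multiplies by the length).
    static float avg(float[] data) {
        float sum = 0;
        for (float num : data) {
            sum += num;
        }
        return sum * data.length;
    }

    // Mutant 1 (inline constant): sum = 0 becomes sum = 1.
    static float mutantInit(float[] data) {
        float sum = 1;
        for (float num : data) {
            sum += num;
        }
        return sum * data.length;
    }

    // Mutant 2 (math): sum += num becomes sum -= num.
    static float mutantMinus(float[] data) {
        float sum = 0;
        for (float num : data) {
            sum -= num;
        }
        return sum * data.length;
    }

    // Mutant 3 (math): sum * length becomes sum / length.
    static float mutantDivide(float[] data) {
        float sum = 0;
        for (float num : data) {
            sum += num;
        }
        return sum / data.length;
    }

    // Hypothetical helper: collects the three mutants shown on the slides.
    static List<Function<float[], Float>> mutants() {
        List<Function<float[], Float>> list = new ArrayList<>();
        list.add(KillScoreSketch::mutantInit);
        list.add(KillScoreSketch::mutantMinus);
        list.add(KillScoreSketch::mutantDivide);
        return list;
    }

    // The one-test suite from the slides: assertEq(avg([1]), 1).
    static boolean testPasses(Function<float[], Float> impl) {
        return impl.apply(new float[] {1f}) == 1f;
    }

    // A mutant is killed when the suite fails on it.
    static int countKilled(List<Function<float[], Float>> candidates) {
        int killed = 0;
        for (Function<float[], Float> candidate : candidates) {
            if (!testPasses(candidate)) {
                killed++;
            }
        }
        return killed;
    }

    public static void main(String[] args) {
        List<Function<float[], Float>> all = mutants();
        int killed = countKilled(all);
        // Integer division truncates 66.67 to 66, matching the figure on the slide.
        System.out.println("Kill score: " + killed + "/" + all.size()
                + " = " + (100 * killed / all.size()) + "%"); // prints Kill score: 2/3 = 66%
    }
}
```

The surviving mutant is the one that divides by the length: with a single-element input, multiplying and dividing by 1 give the same result.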

slide-25
SLIDE 25

Test suite:

assertEq(avg([1]), 1);

New test:

assertEq(avg([1,1]), 1);

Mutants:

✖ sum = 0 -> sum = 1
✖ sum += num -> sum -= num
✔ sum * length -> sum / length    (should have been / not * all along)

slide-26
SLIDE 26

public float avg(float[] data) {
    float sum = 0;
    for (float num : data) {
        sum += num;
    }
    return sum / data.length;
}

Test suite:

assertEq(avg([1]), 1);

New test:

assertEq(avg([1,1]), 1);

slide-27
SLIDE 27

Test suite:

assertEq(avg([1]), 1);

New test:

assertEq(avg([1,1]), 1);

Given what avg is expected to return, the new test should pass on the original program; instead it fails, revealing a fault in the program itself.
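A brief sketch of why the new test matters, using hypothetical method names for the two versions of avg from the slides. On the two-element input the original (multiplying) version returns 4, while the version that divides returns the expected 1.

```java
final class SurvivingMutantSketch {
    // Original program from the slides: multiplies the sum by the length.
    static float avgOriginal(float[] data) {
        float sum = 0;
        for (float num : data) {
            sum += num;
        }
        return sum * data.length;
    }

    // The surviving mutant, which turns out to be the intended behaviour.
    static float avgFixed(float[] data) {
        float sum = 0;
        for (float num : data) {
            sum += num;
        }
        return sum / data.length;
    }

    public static void main(String[] args) {
        float[] one = {1f};
        float[] two = {1f, 1f};

        // The old test cannot tell the versions apart: both return 1.0 for [1].
        System.out.println(avgOriginal(one) + " " + avgFixed(one)); // prints 1.0 1.0
        // The new test can: the original returns 4.0 for [1,1], the fix returns 1.0.
        System.out.println(avgOriginal(two) + " " + avgFixed(two)); // prints 4.0 1.0
    }
}
```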

slide-28
SLIDE 28

Mutation assumptions

1) Competent Programmer Hypothesis:

-> Most programs are nearly correct.

2) Coupling Hypothesis:

-> Big bugs are composed of a series of small errors.
slide-29
SLIDE 29

Assessing the quality of test suites

slide-30
SLIDE 30

Mutation testing

“If the program works … on specified data, then it will always work on any data.”

- Hoare

slide-31
SLIDE 31

Past studies:

Correctness focus
Programmatic oracle
Synthetic
Small programs
Few faults
Few mutants

slide-32
SLIDE 32

                       ISSTA 1996   ICSE 2005   FSE 2014
KLOC                   1            6           321
Faults                 12           38          357
Mutants                24           1,100       230,000
Tests                  generated    generated   developer-written & generated
Coverage controlled    ✖            ✖           ✔
Examines shortcomings  ✖            ✖           ✔
slide-33
SLIDE 33

                       ISSTA 1996   ICSE 2005   FSE 2014
KLOC                   1            6           321
Faults                 12           38          357
Mutants                24           1,100       230,000
Tests                  generated    generated   developer-written & generated
Coverage controlled    ✖            ✖           ✔
Examines shortcomings  ✖            ✖           ✔

Do stronger tests detect more mutants?
Is mutant detection correlated with fault detection?
Can mutants describe all real faults?

slide-34
SLIDE 34

Experimental method

Define Candidates
Generate Suites
Analyze Misses
Compilable Faults
Triggering Tests
slide-35
SLIDE 35

Experimental results

Do stronger tests detect more mutants?

                     Unchanged   Increased
Mutant detection     27%         73%
Statement coverage   60%         40%

slide-36
SLIDE 36

Experimental results

What kinds of faults are not represented by mutants?

No operator    10%
Weak/missing   17%
Increased      73%

if (x) { … return; }
if (x) { … // del }

if (cK.length != sD[0].length)
if (cK.length != getCatCount())

slide-37
SLIDE 37

Mutation takeaway

A correlation exists between mutant detection and real fault detection.

slide-38
SLIDE 38

Impact on testing

Kill score is a better predictor of test quality than coverage
Mutants can serve as effective proxies for real faults
Stronger coverage criteria offer little additional insight
60% of real faults are already covered
Adding tests can be more impactful than increasing coverage
Mutants can describe many real faults