
slide-1
SLIDE 1

Shadow of a Doubt: Testing for Divergences Between Software Versions

Hristina Palikareva Tomasz Kuchta Cristian Cadar

ICSE’16, 20th May 2016

This work is supported by EPSRC and Microsoft Research

slide-2
SLIDE 2

Motivation

§ Software patches
  § Frequent, at the core of software evolution
  § New features, bug fixes, better performance, usability
  § Poorly tested in practice
  § May introduce bugs

slide-3
SLIDE 3

Motivation: an example

Old

    int gt_100(unsigned x) {
        unsigned y = x;
        if (y > 100)
            return 1;
        else
            return 0;
    }

slide-6
SLIDE 6

Motivation: an example

§ Test cases: x = 0, x = 100, x = 101, 100% code coverage
§ Only 50% new behaviour coverage
§ Code coverage not sufficient!

Old

    int gt_100(unsigned x) {
        unsigned y = x;
        if (y > 100)
            return 1;
        else
            return 0;
    }

New

    int gt_100(unsigned x) {
        unsigned y = x + 1;
        if (y > 100)
            return 1;
        else
            return 0;
    }
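The coverage gap in this example can be checked concretely. The sketch below runs the old and new versions side by side and counts how many inputs of a test suite make them disagree. The names `gt_100_old`, `gt_100_new`, `diverges` and `count_divergent` are illustrative and do not come from the slides; the only language fact relied on is that `unsigned` arithmetic in C wraps around, so `x + 1` is 0 when `x` is the largest `unsigned` value.

```c
#include <assert.h>

/* Old version from the slide. */
int gt_100_old(unsigned x) {
    unsigned y = x;
    return y > 100 ? 1 : 0;
}

/* New (patched) version from the slide. */
int gt_100_new(unsigned x) {
    unsigned y = x + 1; /* wraps to 0 when x is the maximum unsigned value */
    return y > 100 ? 1 : 0;
}

/* Illustrative helper: 1 if the two versions return different results for x. */
int diverges(unsigned x) {
    return gt_100_old(x) != gt_100_new(x);
}

/* Illustrative helper: number of inputs in a test suite that expose a divergence. */
int count_divergent(const unsigned *tests, int n) {
    assert(n >= 0);
    int count = 0;
    for (int i = 0; i < n; i++)
        count += diverges(tests[i]);
    return count;
}
```

On the suite x = 0, 100, 101 only x = 100 exposes a difference. The other divergent input, the maximum `unsigned` value, is never exercised, which is consistent with the 50% new behaviour coverage stated on the slide.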

slide-7
SLIDE 7

Contributions

§ Shadow symbolic execution technique
  § Focuses on the new behaviours of the patch
§ Technique for unifying two versions of a program
  § Execute in a single symbolic execution instance
§ A patch testing approach
  § Shadow symbolic execution
  § Enhanced cross-version checks

slide-8
SLIDE 8

Symbolic execution

x is a symbolic variable

    int gt_100(unsigned x) {
        unsigned y = x + 1;
        if (y > 100)
            return 1;
        else
            return 0;
    }

slide-14
SLIDE 14

Symbolic execution

x is a symbolic variable

    int gt_100(unsigned x) {
        unsigned y = x + 1;
        if (y > 100)
            return 1;
        else
            return 0;
    }

Paths explored:

§ x + 1 > 100, e.g. x = 1000
§ x + 1 ≤ 100, e.g. x = 0
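Each explored path is described by a path condition, and a symbolic executor asks a constraint solver for a concrete input that satisfies it. The sketch below does not implement a symbolic executor; it only encodes the two path conditions of this example as C predicates and checks candidate inputs against them. The names `pc_then`, `pc_else` and `is_witness` are hypothetical.

```c
#include <assert.h>

/* Program under test, as on the slide. */
int gt_100(unsigned x) {
    unsigned y = x + 1;
    if (y > 100)
        return 1;
    else
        return 0;
}

/* Path condition of the 'then' path: x + 1 > 100 (unsigned, wrapping). */
int pc_then(unsigned x) { return x + 1u > 100u; }

/* Path condition of the 'else' path: x + 1 <= 100 (unsigned, wrapping). */
int pc_else(unsigned x) { return x + 1u <= 100u; }

/* An input is a witness for a path if it satisfies that path's condition
   and the program, run on it, returns the value that path returns. */
int is_witness(unsigned x, int expected_return) {
    assert(expected_return == 0 || expected_return == 1);
    int pc = expected_return ? pc_then(x) : pc_else(x);
    return pc && gt_100(x) == expected_return;
}
```

The slide's examples x = 1000 and x = 0 are witnesses for the two paths. Because the arithmetic wraps, the maximum `unsigned` value also satisfies x + 1 ≤ 100.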

slide-15
SLIDE 15

Shadow symbolic execution

slide-16
SLIDE 16

Shadow symbolic execution

§ Old and new version in the same instance
  § The two versions are combined
  § Executed in lock-step fashion
  § The old version shadows the new one
§ Focus on the new behaviour
  § Versions take different sides of a branch

slide-17
SLIDE 17

Shadow symbolic execution

Old

    int gt_100(unsigned x) {
        unsigned y = x;
        if (y > 100)
            return 1;
        else
            return 0;
    }

New

    int gt_100(unsigned x) {
        unsigned y = x + 1;
        if (y > 100)
            return 1;
        else
            return 0;
    }

slide-18
SLIDE 18

Shadow symbolic execution

Combined

    int gt_100(unsigned x) {
        unsigned y = change(x, x + 1);
        if (y > 100)
            return 1;
        else
            return 0;
    }
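To see how a single annotated source can stand for both versions, here is a minimal sketch of a native definition of `change()`. This is an assumption made for illustration: in the tool the annotation is interpreted during symbolic execution, and the slides do not show how the macro is defined. The `active_version` flag and the `run_as` helper are hypothetical.

```c
#include <assert.h>

/* Hypothetical switch: selects which side of every change() is evaluated. */
enum version { OLD_VERSION, NEW_VERSION };
static enum version active_version = NEW_VERSION;

/* Hypothetical native definition; not the tool's own implementation. */
#define change(old_expr, new_expr) \
    (active_version == OLD_VERSION ? (old_expr) : (new_expr))

/* Combined program from the slide. */
int gt_100(unsigned x) {
    unsigned y = change(x, x + 1);
    if (y > 100)
        return 1;
    else
        return 0;
}

/* Runs the combined program as one of the two versions. */
int run_as(enum version v, unsigned x) {
    assert(v == OLD_VERSION || v == NEW_VERSION);
    active_version = v;
    return gt_100(x);
}
```

With this definition, `run_as(OLD_VERSION, 100)` and `run_as(NEW_VERSION, 100)` return different values, which is the kind of divergence the technique looks for.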

slide-21
SLIDE 21

Shadow symbolic execution

4-way fork

    int gt_100(unsigned x) {
        unsigned y = change(x, x + 1);
        if (y > 100)
            return 1;
        else
            return 0;
    }

slide-27
SLIDE 27

Shadow symbolic execution

    int gt_100(unsigned x) {
        unsigned y = change(x, x + 1);
        if (y > 100)
            return 1;
        else
            return 0;
    }

§ new: x+1 ≤ 100
  § old: x ≤ 100 → new:else, old:else
  § old: x > 100 → new:else, old:then (Max Int)
§ new: x+1 > 100
  § old: x ≤ 100 → new:then, old:else (100 ✓)
  § old: x > 100 → new:then, old:then
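The four outcomes of the fork can be reproduced for concrete inputs. The sketch below classifies an input by the side each version takes at the branch; the enum and function names are hypothetical, and the code evaluates concrete values rather than symbolic expressions.

```c
#include <assert.h>
#include <limits.h>

/* The four sides of the fork, named as on the slide. */
enum fork_side {
    NEW_ELSE_OLD_ELSE,
    NEW_ELSE_OLD_THEN, /* versions diverge */
    NEW_THEN_OLD_ELSE, /* versions diverge */
    NEW_THEN_OLD_THEN
};

/* Determines which side of the fork a concrete input falls on. */
enum fork_side classify(unsigned x) {
    int old_then = x > 100u;      /* old version: y = x */
    int new_then = x + 1u > 100u; /* new version: y = x + 1, wraps at UINT_MAX */

    /* Old takes 'then' while new takes 'else' only when x + 1 wraps to 0. */
    assert(!(old_then && !new_then) || x == UINT_MAX);

    if (new_then)
        return old_then ? NEW_THEN_OLD_THEN : NEW_THEN_OLD_ELSE;
    return old_then ? NEW_ELSE_OLD_THEN : NEW_ELSE_OLD_ELSE;
}

/* 1 if the two versions take different sides of the branch. */
int is_divergent(enum fork_side side) {
    return side == NEW_ELSE_OLD_THEN || side == NEW_THEN_OLD_ELSE;
}
```

Here x = 100 lands on new:then, old:else and the maximum `unsigned` value lands on new:else, old:then; all other inputs follow the same side in both versions.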

slide-28
SLIDE 28

Shadow symbolic execution

    int gt_100(unsigned x) {
        unsigned y = x;
        if (change(y > 100, y >= 100))
            return 1;
        else
            return 0;
    }

Divergence not always possible

§ Old branch conditions: y > 100, y ≤ 100
§ New branch conditions: y ≥ 100, y < 100
§ Example path conditions: y < 100, y > 100, y > 200
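Whether a divergence is feasible depends on the path condition that holds when the changed branch is reached. As a minimal sketch, assume the path condition restricts y to an inclusive interval [lo, hi]; this interval model and the name `divergence_possible` are simplifications introduced here, whereas a symbolic executor would pass the full constraints to a solver.

```c
#include <assert.h>

/* Branch conditions of the two versions, from the slide. */
int old_cond(unsigned y) { return y > 100; }
int new_cond(unsigned y) { return y >= 100; }

/* Hypothetical helper: the path condition is modelled as lo <= y <= hi.
   The two branch conditions disagree only at y == 100, so a divergence
   is feasible exactly when 100 lies inside the interval. */
int divergence_possible(unsigned lo, unsigned hi) {
    assert(lo <= hi);
    return lo <= 100u && 100u <= hi;
}
```

Under y < 100, y > 100 or y > 200 the helper reports that no divergence is possible, so those paths need no further exploration for this change.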

slide-29
SLIDE 29

Shadow symbolic execution

§ Advantages of shadow symbolic execution
  § Pruning execution paths – smaller search space
  § Space efficiency
    § Two versions combined into one
    § Expression sharing via shadow expressions
  § Does not execute unchanged path prefix twice

slide-30
SLIDE 30

Patch annotations

slide-31
SLIDE 31

Expressing patches

§ Annotations
  § change(old, new) macro
  § Currently manual, automation possible
  § A set of 15 rules
  § See project website for annotated patches

http://srg.doc.ic.ac.uk/projects/shadow/

slide-32
SLIDE 32

Annotation rules

§ Modifying an rvalue expression

Old

    if (argc - optind < 1)
    {
        error (...);
        usage (EXIT_FAILURE);
    }

New

    if (n_args < 1)
    {
        error (...);
        usage (EXIT_FAILURE);
    }

Combined

    if (change(argc - optind, n_args) < 1)
    {
        error (...);
        usage (EXIT_FAILURE);
    }

slide-33
SLIDE 33

Annotation rules

§ Adding an assignment

Old

    byte_idx = 0;
    print_delimiter = false;

New

    byte_idx = 0;
    print_delimiter = false;
    current_rp = rp;

Combined

    byte_idx = 0;
    print_delimiter = false;
    current_rp = change(current_rp, rp);

slide-34
SLIDE 34

Patch testing approach

slide-37
SLIDE 37

Shadow approach overview

§ Inputs: Old version, New version, Test suite
§ Stages: Unify versions → Select test cases → Shadow → Enhanced checks
§ Outputs: Regression bugs, Expected divergences

slide-39
SLIDE 39

Shadow approach overview

Unify versions → Select test cases → Shadow → Enhanced checks

§ Combine old and new version
  § change() macro
  § Set of rules

slide-40
SLIDE 40

Shadow approach overview

Unify versions → Select test cases → Shadow → Enhanced checks

§ Select test cases that touch the patch
  § Run test suite on the new version
  § Use coverage data
    § Cover at least one line of the patch
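The selection step can be sketched as a filter over per-test coverage data. The data layout below (a `struct test_coverage` holding the line numbers each test executed) and the function names are assumptions made for illustration; the slides do not say how coverage is stored or which coverage tool is used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical coverage record: the source lines one test executed. */
struct test_coverage {
    const char *name;
    const int *lines; /* covered line numbers */
    size_t n_lines;
};

/* 1 if value occurs in xs[0..n). */
static int contains(const int *xs, size_t n, int value) {
    for (size_t i = 0; i < n; i++)
        if (xs[i] == value)
            return 1;
    return 0;
}

/* 1 if the test covers at least one line of the patch. */
int touches_patch(const struct test_coverage *t,
                  const int *patch_lines, size_t n_patch) {
    for (size_t i = 0; i < t->n_lines; i++)
        if (contains(patch_lines, n_patch, t->lines[i]))
            return 1;
    return 0;
}

/* Stores the indices of the selected tests in selected[], which must have
   room for n_tests entries, and returns how many tests were selected. */
size_t select_tests(const struct test_coverage *tests, size_t n_tests,
                    const int *patch_lines, size_t n_patch,
                    size_t *selected) {
    assert(selected != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n_tests; i++)
        if (touches_patch(&tests[i], patch_lines, n_patch))
            selected[count++] = i;
    return count;
}
```

A linear scan is enough for a sketch; with large suites the patch lines could be kept in a sorted array or bitmap instead.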

slide-44
SLIDE 44

Shadow approach overview

Unify versions → Select test cases → Shadow → Enhanced checks

§ Use test suite inputs
§ Try to find divergent paths
§ Perform bounded symbolic execution (BSE)
  § New test cases
  § Explore more divergent behaviours

slide-45
SLIDE 45

Shadow approach overview

Unify versions → Select test cases → Shadow → Enhanced checks

§ Run old and new versions on the generated inputs
§ Compare:
  § program outputs
  § program exit codes
  § memory safety violations (ASAN)
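The comparison step can be sketched as a function over the recorded results of the two runs. The `struct run_result` layout, the `diff_kind` flags and `compare_runs` are hypothetical; how a result is collected (running each binary on the generated input and checking for a sanitizer report) is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of one program run on a generated input. */
struct run_result {
    const char *output;   /* captured program output */
    int exit_code;        /* program exit code */
    int memory_violation; /* nonzero if a memory safety violation was reported */
};

/* Bit flags describing how two runs differ. */
enum diff_kind {
    DIFF_NONE = 0,
    DIFF_OUTPUT = 1,
    DIFF_EXIT_CODE = 2,
    DIFF_MEMORY = 4
};

/* Compares the old and new run on the three criteria from the slide. */
unsigned compare_runs(const struct run_result *old_run,
                      const struct run_result *new_run) {
    assert(old_run != NULL && new_run != NULL);
    assert(old_run->output != NULL && new_run->output != NULL);

    unsigned diff = DIFF_NONE;
    if (strcmp(old_run->output, new_run->output) != 0)
        diff |= DIFF_OUTPUT;
    if (old_run->exit_code != new_run->exit_code)
        diff |= DIFF_EXIT_CODE;
    if ((old_run->memory_violation != 0) != (new_run->memory_violation != 0))
        diff |= DIFF_MEMORY;
    return diff;
}
```

Two runs that print the same text but differ in whether a memory safety violation was reported would yield `DIFF_MEMORY` alone, so an output comparison by itself would miss them.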

slide-46
SLIDE 46

Implementation and evaluation

slide-47
SLIDE 47

Implementation

§ Implemented on top of KLEE
§ Uses concolic execution functionality from ZESTI and Docovery

http://klee.github.io
http://srg.doc.ic.ac.uk/projects/zesti
http://srg.doc.ic.ac.uk/projects/docovery

slide-48
SLIDE 48

Evaluation

§ Evaluated on patches from the CoREBench study
  § http://www.comp.nus.edu.sg/~release/corebench/
  § 18 unique Coreutils patches which introduced bugs
  § Significantly more complex than typical patches used in the evaluation of previous work (e.g. SIR, Siemens)
  § The bug-fixing patches are also known
  § Evaluated 16 out of 18; the other 2 were excluded due to technical issues

slide-49
SLIDE 49

Evaluation

Patch    Tool      Patch size        Annotations
                   LOC     Hunks
1        mv, rm    45      17        12
3        cut       294     35        14
4        tail      21      4         4
5=16     tail      275     13        1
6        cut       8       3         3
7        seq       148     5         5
8        seq       37      4         12
10       cp        16      8         2
11       cut       2       1         1
12=17    cut       110     17        4
13       ls        13      2         2
14       ls        15      5         4
15       du        3       1         1
19       seq       40      9         6
21       cut       31      10        6
22       expr      54      6         4

slide-52
SLIDE 52

Evaluation

Expected divergences

§ Generated input: cut -s -d: -f0- <file> (file contains “:::\n:1”)
  § Old behaviour: :::\n1
  § New behaviour: \n\n
§ Generated input: cut -d: -f1,0- <file> (file contains “a:b:c”)
  § Old behaviour: a:b:c
  § New behaviour: a
§ Generated input: tail --retry ///s\x01\x00g\x00
  § Old behaviour: tail: warning: --retry is useful mainly when following by name…
  § New behaviour: tail: warning: --retry ignored; --retry is useful only when following…

slide-54
SLIDE 54

Evaluation

Bugs

§ Generated input: cut -c1-3,8- --output-d=: <file> (file contains “abcdefg”)
  § Old behaviour: abc
  § New behaviour: abc + buffer overflow
§ Generated input: cut -c1-7,8- --output-d=: <file> (file contains “abcdefg”)
  § Old behaviour: abcdefg
  § New behaviour: abcdefg + buffer overflow
§ Generated input: cut -b0-2,2- --output-d=: <file> (file contains “abc”)
  § Old behaviour: abc
  § New behaviour: signal abort

New bug, not part of CoREBench

slide-55
SLIDE 55

Evaluation

Patch    Divergences    Output differences
                        Expected    Bug
1        39K            3           -
3        15K            -
4        39             36          -
5=16     14             -           2
6        1.4K           -           86
7        124            5           -
8        54K            -
10       6              -           2
11       874            9           -
12=17    4.2K           -           78
13       11             1           1
14       2              -
15       1              1           -
19       33K            7           -
21       21K            151         684
22

slide-59
SLIDE 59

Evaluation

§ Unsuccessful cases
  § Refactorings
  § Non-functional changes
    § Memory consumption
    § Performance
  § Technical challenges:
    § Reasoning about file access rights
    § Symbolic directories support
    § Floating point support
    § Not reproducible

slide-60
SLIDE 60

Summary

Shadow symbolic execution

§ A symbolic execution technique for patch testing
  § Generates inputs that trigger new behaviours
  § Prunes large parts of the search space
  § Useful for: regression testing, test-suite augmentation, patch understanding

http://srg.doc.ic.ac.uk/projects/shadow/