Comparative Causality: Explaining the Differences Between Executions. William N. Sumner and Xiangyu Zhang, {wsumner,xyzhang}@cs.purdue.edu. ICSE 2013, 22 May 2013.


slide-1
SLIDE 1

Comparative Causality: Explaining the Differences Between Executions

William N. Sumner, Xiangyu Zhang
{wsumner,xyzhang}@cs.purdue.edu
ICSE 2013, 22 May 2013

slide-6
SLIDE 6

Background

Debugging requires understanding how a program behaves.

  • Which statements are buggy and how?
  • How does a bug/fault lead to a failure?
  • What might possible fixes be?

These questions need an explanation for a bug.

slide-8
SLIDE 8

Background

Explaining a bug (fault → failure)

inventory = [(Shoes,5); (Hats,0); (Ties,1)]
bought = 0
for (item, available) in inventory:
    if bought < 3 and available >= 0:
        buy(item)
        bought += 1
print "Items bought: ", bought

Should print "Items bought: 2"
Failure: prints "Items bought: 3"

BUG: the test available >= 0 should be available > 0
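The fault and its fix can be reproduced with a small runnable sketch. The function name `count_purchases` and the `in_stock` parameter are illustrative assumptions, not names from the slides; `buy(item)` is reduced to incrementing the counter.

```python
def count_purchases(inventory, in_stock):
    """Buy at most 3 items; `in_stock` decides whether an item can be bought."""
    bought = 0
    for item, available in inventory:
        if bought < 3 and in_stock(available):
            bought += 1  # stands in for buy(item)
    return bought


inventory = [("Shoes", 5), ("Hats", 0), ("Ties", 1)]

# Faulty predicate from the slide: an item with 0 available still passes.
buggy = count_purchases(inventory, lambda available: available >= 0)
# Corrected predicate: only items with stock left pass.
fixed = count_purchases(inventory, lambda available: available > 0)

print("Items bought:", buggy)
print("Items bought:", fixed)
```

The two runs differ only in the comparison operator, which is what makes the pair convenient for comparing a failing behavior against a passing one.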

slide-9
SLIDE 9

Background

Explaining a bug (fault → failure)

Program:

   inv = [(S,5); (H,0); (T,1)]
   bt = 0
1) for (itm, av) in inv:
2)     if bt < 3 and av >= 0:
3)         buy(itm)
4)         bt += 1
5) print bt

Trace:

1) for (itm, av) = (S,5):
2)     if bt < 3 and av >= 0:
3)         buy(itm)
4)         bt += 1
1) for (itm, av) = (H,0):
2)     if bt < 3 and av >= 0:
3)         buy(itm)
4)         bt += 1
1) for (itm, av) = (T,1):
2)     if bt < 3 and av >= 0:
3)         buy(itm)
4)         bt += 1
5) print bt

slide-13
SLIDE 13

Background

Explaining a bug (fault → failure)

Trace:

1) for (itm, av) = (S,5):
2)     if bt < 3 and av >= 0:
3)         buy(itm)
4)         bt += 1
1) for (itm, av) = (H,0):
2)     if bt < 3 and av >= 0:      (A)
3)         buy(itm)
4)         bt += 1                 (B)
1) for (itm, av) = (T,1):
2)     if bt < 3 and av >= 0:
3)         buy(itm)
4)         bt += 1                 (C)
5) print bt                        (D)

A faulty branch is taken at A,
so bt is given the faulty value 2 at B,
so bt is given the faulty value 3 at C,
so '3' is printed erroneously at D.

slide-14
SLIDE 14

Background

Explaining a bug (fault → failure)

A → B → C → D

The chain from the faulty branch (A) to the erroneous output (D) is an explanation.
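One way to locate the start of such a chain is to record a trace of the failing run and compare it with a trace of a passing run. The sketch below is an illustration only: the event format and the `run` helper are assumptions, and the passing trace is obtained here by swapping in the corrected predicate rather than by using a second input.

```python
def run(inventory, in_stock):
    """Execute the slide's loop and record (statement, item, detail) events."""
    trace = []
    bt = 0
    for itm, av in inventory:
        trace.append((1, itm, f"av={av}"))
        taken = bt < 3 and in_stock(av)
        trace.append((2, itm, f"branch={taken}"))
        if taken:
            trace.append((3, itm, "buy"))
            bt += 1
            trace.append((4, itm, f"bt={bt}"))
    trace.append((5, None, f"print {bt}"))
    return trace


inv = [("S", 5), ("H", 0), ("T", 1)]
buggy = run(inv, lambda av: av >= 0)
fixed = run(inv, lambda av: av > 0)

# The first event where the two traces disagree is the faulty branch (A).
first_diff = next(b for b, f in zip(buggy, fixed) if b != f)
print(first_diff)
```

Here the first difference is the branch at statement 2 for item H, and the later differences (the extra increments and the final print) follow from it.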

slide-17
SLIDE 17

Existing Approaches

Dynamic Slicing [Agrawal PLAN90]

  • Too large & unwieldy in practice

State Replacement

  • What faulty state can reproduce a failure?

– Cause Effect Chains [Zeller FSE02]
– Causal Paths [Sumner FASE09, FSE10]

slide-23
SLIDE 23

Causal Paths [FASE09, FSE10]

Reproduce the failure in a correct run:

Buggy state:    x = 5, y = 4, z = 3
Correct state:  x = 5, y = 2, z = 1

Candidate sets to patch into a trial run: {y=4, z=3}, {y=4}, {z=3}

Trial: ?

Blame smallest set possible

slide-27
SLIDE 27

Causal Paths [FASE09, FSE10]

Reproduce the failure in a correct run:

Buggy state:  x = 5, y = 4, z = 3
Trial state:  x = 5, y = 4, z = 1   (correct state with y = 4 patched in)

y=4 is responsible
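The "blame the smallest set" step can be sketched as a search over subsets of the differing state. This is a simplified illustration, not the search used in the cited papers: it enumerates subsets exhaustively, and the `fails` oracle (the failure appears whenever y > 3) is a hypothetical stand-in for re-running the rest of the program.

```python
from itertools import combinations


def smallest_blamed_set(buggy_state, correct_state, fails):
    """Return the smallest set of buggy values that, patched into the
    correct state, makes `fails` report the failure (None if no set does)."""
    differing = [v for v in buggy_state if buggy_state[v] != correct_state[v]]
    for size in range(1, len(differing) + 1):
        for subset in combinations(differing, size):
            trial = dict(correct_state)
            trial.update({v: buggy_state[v] for v in subset})
            if fails(trial):
                return {v: buggy_state[v] for v in subset}
    return None


buggy_state = {"x": 5, "y": 4, "z": 3}
correct_state = {"x": 5, "y": 2, "z": 1}


def fails(state):
    # Hypothetical oracle: assume the run fails whenever y > 3.
    return state["y"] > 3


blamed = smallest_blamed_set(buggy_state, correct_state, fails)
print(blamed)
```

Under that assumed oracle, the single-variable trial with y = 4 already reproduces the failure, so the larger set {y=4, z=3} is never tried.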

slide-28
SLIDE 28

Causal Paths [FASE09, FSE10]

Reproduce the failure in a correct run: Buggy Correct

y = 4

Why was y=4?

slide-29
SLIDE 29

Causal Paths [FASE09, FSE10]

Reproduce the failure in a correct run: Buggy Correct

y = 4 y = ...

slide-30
SLIDE 30

Causal Paths [FASE09, FSE10]

Reproduce the failure in a correct run: Buggy Correct

Buggy state:    a = 1, b = 2, c = 1   (leads to y = 4)
Correct state:  a = 1, b = 1, c = 0

?

slide-32
SLIDE 32

Causal Paths [FASE09, FSE10]

Reproduce the failure in a correct run: Buggy Correct

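Buggy state:  a = 1, b = 2, c = 1
Trial state:  a = 1, b = 2, c = 1   (correct state with b = 2 and c = 1 patched in)

The trial again yields y = 4, so b = 2 and c = 1 are responsible.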

slide-34
SLIDE 34

Causal Paths [FASE09, FSE10]

Reproduce the failure in a correct run: Buggy Correct

y = 4 b = 2 c = 1

Proceed until no differences

slide-35
SLIDE 35

Causal Paths [FASE09, FSE10]

Reproduce the failure in a correct run: Buggy Correct

y = 4 b = 2 c = 1

Line 23 Line 42

An explanation of the bug!
slide-38
SLIDE 38

Causal Paths [FASE09, FSE10]

Problems?

  • Identifying where to compare. [Sumner FASE09]
  • Identifying & replacing state. [Sumner FSE10]
  • Replacing state confounds the explanation.

– Arbitrary replacement can yield arbitrary results.

  • Not helpful with execution omission

– A failure isn't just bad behavior, it is missing good behavior

slide-43
SLIDE 43

Confounding

Which state should we blame? Recall:

Buggy state:  x = 5, y = 4, z = 3
Trial state:  x = 5, y = 4, z = 1   (correct state with y = 4 patched in)

What does this patched run even mean?

slide-45
SLIDE 45

Example – Altered Meaning

1) x ← input()
2) y ← input()
3) z ← input()
4) if y>1 & z<6:
5)     y ← 5
6) else: y ← y+1
7) print(y)

Correct           Buggy
x ← 1             x ← 0
y ← 1             y ← 2
z ← 3             z ← 6
if False:         if False:
else: y ← 2       else: y ← 3
print(2)          print(3)

What should we blame here?

slide-50
SLIDE 50

Example – Altered Meaning

1) x ← input()
2) y ← input()
3) z ← input()
4) if y>1 & z<6:
5)     y ← 5
6) else: y ← y+1
7) print(y)

Correct           Buggy             Trial (Correct with y ← 2 patched in)
x ← 1             x ← 0             x ← 1
y ← 1             y ← 2             y ← 2
z ← 3             z ← 6             z ← 3
if False:         if False:         if True:
else: y ← 2       else: y ← 3           y ← 5
print(2)          print(3)          print(5)

  • New control flow unlike original runs
  • Occurs in a large portion of real bugs
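The altered control flow is easy to reproduce. The sketch below transcribes the slide's seven-statement program into a Python function (the name `program` and the returned tuple are illustrative choices) and runs the correct inputs, the buggy inputs, and the trial that patches the buggy y into the correct run.

```python
def program(x, y, z):
    """The slide's program; returns (branch outcome at line 4, printed value)."""
    taken = y > 1 and z < 6
    if taken:
        y = 5
    else:
        y = y + 1
    return taken, y


correct = program(x=1, y=1, z=3)
buggy = program(x=0, y=2, z=6)
trial = program(x=1, y=2, z=3)  # correct inputs with the buggy y patched in

print("correct:", correct)
print("buggy:  ", buggy)
print("trial:  ", trial)
```

Both original runs take the else branch, while the trial takes the true branch and prints a value that neither original run produced, so the trial's outcome says little about why the original runs differ.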

slide-53
SLIDE 53

Confounding of Explanations

Behavior not found in original executions:

  • includes irrelevant information
  • excludes necessary information

Solution: Dual Slicing

  • Identify & extract execution differences relevant to the failure

– Those that differ across executions

  • Run trials on the extracted program
slide-54
SLIDE 54

Dual Slicing

  • A slice of two executions at once

– Includes dependences that differ across executions
– Skips dependences that are the same

Execution 1         Execution 2
1) x ← 1            1) x ← 0
2) y ← 1            2) y ← 1
3) print(x+y)       3) print(x+y)
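A minimal sketch of the idea, under simplifying assumptions: each statement executes at most once per run, and each run is given as a hand-written map from statement number to the value it produced and the statements it depends on. The representation and the `dual_slice` function are illustrative, not the tool's actual data structures.

```python
def dual_slice(run_a, run_b, criterion):
    """Collect statements reachable backward from `criterion` whose values
    differ between the two runs; dependences that are the same are skipped."""
    in_slice, worklist = set(), [criterion]
    while worklist:
        stmt = worklist.pop()
        a, b = run_a.get(stmt), run_b.get(stmt)
        value_a = a[0] if a else None
        value_b = b[0] if b else None
        if stmt in in_slice or value_a == value_b:
            continue  # already visited, or identical in both runs
        in_slice.add(stmt)
        for entry in (a, b):
            if entry:
                worklist.extend(entry[1])
    return sorted(in_slice)


# Each run maps statement number -> (value produced, statements it depends on).
run_a = {1: (1, []), 2: (1, []), 3: (2, [1, 2])}  # x ← 1, y ← 1, prints 2
run_b = {1: (0, []), 2: (1, []), 3: (1, [1, 2])}  # x ← 0, y ← 1, prints 1

print(dual_slice(run_a, run_b, criterion=3))
```

For these two runs the slice keeps statements 1 and 3 and skips statement 2, because y has the same value in both executions.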

slide-61
SLIDE 61

Dual Slicing

  • Identify differences affecting the failure

1) x ← input()
2) y ← input()
3) z ← input()
4) if y>1 & z<6:
5)     y ← 5
6) else: y ← y+1
7) print(y)

Correct           Buggy
x ← 1             x ← 0
y ← 1             y ← 2
z ← 3             z ← 6
if False:         if False:
else: y ← 2       else: y ← 3
print(2)          print(3)

Dual slice, working backward from the output: 7, 6, 2

slide-62
SLIDE 62

Dual Slicing

Dual slice of the Correct and Buggy runs: 7, 6, 2

Extract:

2) y ← input()
6) y ← y+1
7) print(y)

slide-65
SLIDE 65

Example – Extracted Meaning

2) y ← input()
6) y ← y+1
7) print(y)

Correct        Buggy          Trial (Correct with y ← 2 patched in)
y ← 1          y ← 2          y ← 2
y ← 2          y ← 3          y ← 3
print(2)       print(3)       print(3)

Trial can now correctly blame y

slide-66
SLIDE 66

Data Confounding

  • Control flow is not the only source of confounding

1) x ← [0,1,2,3]
2) y ← input()
3) z ← input()
4) x[z] ← 5
5) print(x[y])

Correct                  Buggy
x ← …                    x ← …
y ← 1                    y ← 2
z ← 2                    z ← 3
x[2] ← 5                 x[3] ← 5
print(x[1]), prints 1    print(x[2]), prints 2

slide-67
SLIDE 67

Data Confounding

What should we blame here?

slide-72
SLIDE 72

Data Confounding

1) x ← [0,1,2,3]
2) y ← input()
3) z ← input()
4) x[z] ← 5
5) print(x[y])

Correct                  Buggy                    Trial (Correct with y ← 2 patched in)
x ← …                    x ← …                    x ← …
y ← 1                    y ← 2                    y ← 2
z ← 2                    z ← 3                    z ← 2
x[2] ← 5                 x[3] ← 5                 x[2] ← 5
print(x[1]), prints 1    print(x[2]), prints 2    print(x[2]), prints 5

  • Either new control flow or new data flow can cause confounding.
  • Removing them is crucial.
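The data-flow case can be checked the same way. This sketch transcribes the five-statement program into a Python function (the name `program` is an illustrative choice) and runs the two original inputs plus the trial that patches only y.

```python
def program(y, z):
    """The slide's program: write 5 at index z, then return x[y]."""
    x = [0, 1, 2, 3]
    x[z] = 5
    return x[y]


correct = program(y=1, z=2)
buggy = program(y=2, z=3)
trial = program(y=2, z=2)  # correct inputs with the buggy y patched in

print(correct, buggy, trial)
```

In the trial, the read of x[y] picks up the value written by x[z] ← 5, a data dependence that exists in neither original run, so the printed 5 matches neither of them.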
slide-74
SLIDE 74

Execution Omission

  • A failure is not just incorrect behavior, it is missing correct behavior.

– Also known as execution omission
– Cannot be explained by reproducing faulty behavior

slide-75
SLIDE 75

Execution Omission

1) x ← input()
2) y ← input()
3) if x > 3:
4)     print(y)
5) print('*')

Correct           Buggy
x ← 4             x ← 0
y ← 5             y ← 2
if True:          if False:
    print(5)
print('*')        print('*')

slide-76
SLIDE 76

Execution Omission

What should we blame here?

slide-81
SLIDE 81

Execution Omission

1) x ← input()
2) y ← input()
3) if x > 3:
4)     print(y)
5) print('*')

Correct           Buggy             Trial (Correct with x ← 0 patched in)
x ← 4             x ← 0             x ← 0
y ← 5             y ← 2             y ← 5
if True:          if False:         if False:
    print(5)
print('*')        print('*')        print('*')

  • x alone reproduces the failure!
  • Does x alone explain the bug?

– Can you fix the bug by only fixing x?

We can run a symmetric trial to find out!

slide-85
SLIDE 85

Execution Omission

1) x ← input()
2) y ← input()
3) if x > 3:
4)     print(y)
5) print('*')

Correct           Buggy             Symmetric trial (Buggy with x ← 4 patched in)
x ← 4             x ← 0             x ← 4
y ← 5             y ← 2             y ← 2
if True:          if False:         if True:
    print(5)                            print(2)
print('*')        print('*')        print('*')

  • Fixing x does not fix the missing behavior!
  • x alone does not explain the bug.
slide-87
SLIDE 87

Execution Omission

What if we try both x and y?

slide-90
SLIDE 90
Execution Omission

1) x ← input()
2) y ← input()
3) if x > 3:
4)     print(y)
5) print('*')

Correct           Buggy             Symmetric trial (Buggy with x ← 4 and y ← 5 patched in)
x ← 4             x ← 0             x ← 4
y ← 5             y ← 2             y ← 5
if True:          if False:         if True:
    print(5)                            print(5)
print('*')        print('*')        print('*')

  • Fixing x and y together can fix the bug

Symmetric trials at each step explain the missed behavior, too.
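The two directions of trial can be sketched side by side. The helper names `reproduces_failure` and `repairs_failure` are assumptions made for this illustration; the program body follows the slide, with printed values collected in a list.

```python
def program(x, y):
    """The slide's program; returns the list of printed values."""
    out = []
    if x > 3:
        out.append(y)
    out.append("*")
    return out


correct_in = {"x": 4, "y": 5}
buggy_in = {"x": 0, "y": 2}
correct_out = program(**correct_in)
buggy_out = program(**buggy_in)


def reproduces_failure(names):
    """Forward trial: patch buggy values into the correct run."""
    trial = {**correct_in, **{n: buggy_in[n] for n in names}}
    return program(**trial) == buggy_out


def repairs_failure(names):
    """Symmetric trial: patch correct values into the buggy run."""
    trial = {**buggy_in, **{n: correct_in[n] for n in names}}
    return program(**trial) == correct_out


for names in (["x"], ["x", "y"]):
    print(names, reproduces_failure(names), repairs_failure(names))
```

Patching x alone reproduces the failure but does not repair it, while patching x and y together does both, which is why only the pair accounts for the omitted print(y).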

slide-93
SLIDE 93

Comparative Causality

Explaining a failure:

  • Reproducing the failure is not enough.
  • Requires explaining why both executions differ from each other
  • Dual slicing ensures

– That we compare behaviors from the two executions

  • Symmetric comparison explains

– Why the buggy execution did something wrong
– Why the buggy execution didn't do something right

slide-94
SLIDE 94

Real Results

  • Implemented with LLVM
  • 20 KLOC
  • Automatically explains bugs in C programs.

– 20 KLOC to 400 KLOC
– 400K to 2.24M instructions

slide-95
SLIDE 95

Real Results

                 CC                            Old
Program  Size    Trials  Time  Stmts  Root     Trials  Time  Stmts  Root   Precision  Recall
find      73k        15    12      6  X          1260   253         -
gnuplot  144k        33    44     10  X           469   141     10  X      1          1
gnuplot  139k       323   200     48  X           208    51         -
gnuplot  134k       337   961    129  -          1888   950    121  -      0.97       0.91
gnuplot  134k       130   140     33  -          3012   931     38  -      0.87       1
grep      12k       186   114     62  -          1012  8263     23  -      0.96       0.35
grep      12k       327   156     69  -          1734   183     32  -      1          0.46
grep      12k        78    49     27  X          1546   168     23  -      0.96       0.81
make      30k        62   342     27  X           543   416     17  -      1          0.63
tar       20k         8    22      3  X           221    50      3  X      1          1
tar       24k       125   124     48  X           332   110         -
tar       20k       121    53     20  X           296    66         -
tar       20k        28    43     10  X          2117   439      5  -      1          0.5
tar       21k        87    80     23  X           709   165     15  X      0.73       0.48
tar       21k        15    22      4  X          1283   228      4  X      1          1
Avg.                125   157     34             1109   828     20         0.22       0.26

X: root cause identified; -: not identified.
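The averages and the headline figures quoted on the following slides can be recomputed from the per-bug rows. The sketch below copies the Trials, Time, Stmts, and Root columns from the table; it assumes that an X in the Root column means the root cause was identified and that Time is in seconds, neither of which the slide states explicitly.

```python
from statistics import mean

# One tuple per bug, in table order:
# (CC trials, CC time, CC stmts, CC root found,
#  Old trials, Old time, Old root found)
rows = [
    (15, 12, 6, True, 1260, 253, False),       # find
    (33, 44, 10, True, 469, 141, True),        # gnuplot
    (323, 200, 48, True, 208, 51, False),      # gnuplot
    (337, 961, 129, False, 1888, 950, False),  # gnuplot
    (130, 140, 33, False, 3012, 931, False),   # gnuplot
    (186, 114, 62, False, 1012, 8263, False),  # grep
    (327, 156, 69, False, 1734, 183, False),   # grep
    (78, 49, 27, True, 1546, 168, False),      # grep
    (62, 342, 27, True, 543, 416, False),      # make
    (8, 22, 3, True, 221, 50, True),           # tar
    (125, 124, 48, True, 332, 110, False),     # tar
    (121, 53, 20, True, 296, 66, False),       # tar
    (28, 43, 10, True, 2117, 439, False),      # tar
    (87, 80, 23, True, 709, 165, True),        # tar
    (15, 22, 4, True, 1283, 228, True),        # tar
]

cc_trials, cc_time, cc_stmts, cc_root, old_trials, old_time, old_root = zip(*rows)

summary = {
    "cc_trials": round(mean(cc_trials)),
    "old_trials": round(mean(old_trials)),
    "cc_minutes": round(mean(cc_time) / 60, 1),   # assumes Time is in seconds
    "old_minutes": round(mean(old_time) / 60, 1),
    "cc_stmts": round(mean(cc_stmts)),
    "cc_root_pct": round(100 * sum(cc_root) / len(rows)),
    "old_root_pct": round(100 * sum(old_root) / len(rows)),
}
print(summary)
```

Under those assumptions the recomputed values line up with the rounded figures on the slides: roughly 125 versus 1100 trials, about 3 versus 14 minutes, and 73% versus 27% of root causes identified.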

slide-98
SLIDE 98

Real Results

Precise reasoning is more efficient! On average, CC takes about 3 min per bug, versus 14 min for the old approach.

slide-99
SLIDE 99

Real Results

On average, CC needs 125 trials per bug, versus about 1100 trials for the old approach.

slide-100
SLIDE 100

Real Results

On average, CC explanations contain 35 statements, versus 20 statements for the old approach.

slide-103
SLIDE 103

Real Results

The old approach explains 27% of the bugs; CC explains 73%.

slide-105
SLIDE 105

Real Results

Program  Size   #Inst   CC Stmts  Slicing Stmts
find      73k   481K           6            185
gnuplot  144k   461K          10            148
gnuplot  139k   1.34M         48            464
gnuplot  134k   2.18M        129            368
gnuplot  134k   2.19M         33            237
grep      12k   415K          62            153
grep      12k   434K          69            109
grep      12k   466K          27             95
make      30k   2.24M         27             38
tar       20k   1.22M          3              3
tar       24k   962K          48             61
tar       20k   1.11M         20           1239
tar       20k   1.12M         10           1270
tar       21k   1.26M         23             25
tar       21k   1.19M          4            557
Avg.                          34            330

On average: 35 stmts for CC, 330 stmts for dynamic slicing.

slide-106
SLIDE 106

Real Results

Lucid (CC) explanations average 34 statements, versus 330 for dynamic slices.

More than an order of magnitude
slide-107
SLIDE 107

Explaining a Bug (from tar)

int read_header()
1)   name ← input()

int extract_dir()
2)   status ← mkdir(name)
3)   if status:
4)     if !is_dir(name):
5)       error()
6)   return status

void extract_archive()
7)   status = extract_dir(name)
8)   if status:
9)     undo_last_backup()

Symptom:

Nothing extracted from archive when backing up existing files.

Why?

slide-108
SLIDE 108

Explaining a Bug (from tar)

int read_header() 1) name ← input() int extract_dir() 2) status ← mkdir(name) 3) if status 4) if !is_dir(name): 5) error() 6) return status void extract_archive() 7) status = extract_dir(name) 8) if status: 9) undo_last_backup() 8

extract undo backup

slide-109
SLIDE 109

Explaining a Bug (from tar)

[Figure: causal chain between the two executions, over the numbered lines of the tar pseudocode.
  line 8: extract vs. undo backup
  line 7: status = 0 vs. status = -1]

slide-110
SLIDE 110

Explaining a Bug (from tar)

[Figure: causal chain between the two executions, over the numbered lines of the tar pseudocode.
  line 8: extract vs. undo backup
  line 7: status = 0 vs. status = -1
  line 6: status = 0 vs. status = -1]

slide-111
SLIDE 111

Explaining a Bug (from tar)

[Figure: causal chain between the two executions, over the numbered lines of the tar pseudocode.
  line 8: extract vs. undo backup
  line 7: status = 0 vs. status = -1
  line 6: status = 0 vs. status = -1
  line 2: status = 0 vs. status = -1]

slide-112
SLIDE 112

Explaining a Bug (from tar)

[Figure: causal chain between the two executions, over the numbered lines of the tar pseudocode.
  line 8: extract vs. undo backup
  line 7: status = 0 vs. status = -1
  line 6: status = 0 vs. status = -1
  line 2: status = 0 vs. status = -1
  line 1: name differs ("dir" and "dir2")]
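The chain of differences on this slide can be reproduced with a toy model. The Python sketch below is an illustration under stated assumptions, not the authors' tool: `run`, `existing_dirs`, and `override` are hypothetical names, the file system is modeled as a set, and the directory names are made up. It records the value at each numbered line for two executions, lists the lines where they differ, and then replaces one value in the failing execution to see whether the difference at line 8 goes away.

```python
def run(name, existing_dirs, override=None):
    """Toy model of the numbered pseudocode; returns {line: value}."""
    override = override or {}
    trace = {1: name}                                  # 1) name <- input()
    status = -1 if name in existing_dirs else 0        # 2) mkdir fails if the dir exists
    status = override.get(2, status)                   # optional state replacement at line 2
    trace[2] = status
    # Lines 3-5 only report an error when the name is not a directory;
    # they never change status, so the toy model skips them.
    trace[6] = status                                  # 6) return status
    trace[7] = status                                  # 7) status = extract_dir(name)
    trace[8] = "undo backup" if status else "extract"  # 8-9) undo on failure
    return trace


existing = {"photos"}                    # hypothetical pre-existing directory
failing = run("photos", existing)        # directory already exists
passing = run("notes", existing)         # directory is new

# Lines whose values differ, walking back from the symptom to the input.
differing = [line for line in (8, 7, 6, 2, 1) if failing[line] != passing[line]]
print(differing)                         # [8, 7, 6, 2, 1]

# State replacement: give the failing run the passing run's value at line 2.
patched = run("photos", existing, override={2: passing[2]})
print(patched[8])                        # extract
```

Replacing the status at line 2 is enough to flip the outcome at line 8, which matches the chain drawn on the slide.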

slide-113
SLIDE 113

Explaining a Bug (from tar)

Symptom:

Nothing extracted from archive when backing up existing files.

Why?

Because an existing directory sets a failure status that undoes extraction.

slide-114
SLIDE 114

Explaining a Bug (from tar)

int read_header()
  1) name ← input()

int extract_dir()
  2) status ← mkdir(name)
  3) if status:
  4)   if !is_dir(name):
  5)     error()
       else:
         status = 0
  6) return status

void extract_archive()
  7) status = extract_dir(name)
  8) if status:
  9)   undo_last_backup()

Symptom:

Nothing extracted from archive when backing up existing files.

Why?

Because an existing directory sets a failure status that undoes extraction.

Fix: the added else branch (status = 0) clears the failure status when the directory already exists.
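To try the fix against a real file system, the following Python sketch mirrors the pseudocode. It is an approximation under assumptions: the `fixed` flag is a hypothetical switch for the else branch, `os.mkdir` stands in for the slide's `mkdir`, and `error()` is reduced to a comment.

```python
import os
import tempfile


def extract_dir(name, fixed):
    """Return 0 on success and -1 on failure, like the slide's extract_dir()."""
    try:
        os.mkdir(name)                   # 2) status <- mkdir(name)
        status = 0
    except OSError:                      # raised when the directory already exists
        status = -1
    if status:                           # 3)
        if not os.path.isdir(name):      # 4)
            pass                         # 5) error() would be reported here
        elif fixed:
            status = 0                   # fix: an existing directory is not a failure
    return status                        # 6)


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "dir")
    os.mkdir(target)                     # the directory already exists
    buggy_status = extract_dir(target, fixed=False)
    fixed_status = extract_dir(target, fixed=True)

print(buggy_status, fixed_status)        # -1 0
```

Without the else branch the caller sees -1 and would undo the backup; with it the status is 0 and extraction proceeds.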

slide-115
SLIDE 115

Limitations

  • Needs a deterministic, reproducible failure
  • Depends on the correct run
  • Requires a model of external state, I/O
slide-116
SLIDE 116

Related Work

  • Delta debugging [Zeller FSE'02]
  • Fault localization [Jones ASE'05, Liblit PLDI'04]
  • Tests for localization [Artzi ICSE'10, Rößler ISSTA'12]
  • Dynamic slicing [Zhang PLDI'04]
  • Constraint comparison [Qi FSE'09]
  • Identifying repair candidates [Chandra ICSE'11, Jeffrey ISSTA'08]

slide-117
SLIDE 117

Future Work

Comparative Causality is a general framework for explaining why executions differ.

  • Debugging
  • Program understanding
  • Reverse engineering
slide-118
SLIDE 118

Conclusions

  • Failures can be explained by explaining why executions differ.

slide-119
SLIDE 119

Conclusions

  • Failures can be explained by explaining why executions differ.
  • Executing the dual slice guards against the effects of confounding from state replacement.

slide-120
SLIDE 120

Conclusions

  • Failures can be explained by explaining why executions differ.
  • Executing the dual slice guards against the effects of confounding from state replacement.
  • Symmetric causality testing explains:
      – Observed incorrect behaviors
      – Missing correct behaviors

slide-121
SLIDE 121

Conclusions

  • Failures can be explained by explaining why executions differ.
  • Executing the dual slice guards against the effects of confounding from state replacement.
  • Symmetric causality testing explains:
      – Observed incorrect behaviors
      – Missing correct behaviors
  • Comparative Causality yields precise and concise explanations for real-world bugs.

slide-122
SLIDE 122

Comparative Causality: Explaining the Differences Between Executions

William N. Sumner, Xiangyu Zhang
{wsumner,xyzhang}@cs.purdue.edu
ICSE 2013, 22 May 2013