Isolating Failure-Inducing Thread Schedules

SLIDE 1

0/12

  • International Symposium on Software Testing and Analysis (ISSTA), Rome, Italy, 2002

Isolating Failure-Inducing Thread Schedules

Andreas Zeller, Lehrstuhl für Softwaretechnik, Universität des Saarlandes, Saarbrücken
Jong-Deok Choi, IBM T. J. Watson Research Center, Yorktown Heights, New York

SLIDE 4

1/12

  • How Thread Schedules Induce Failures

The behavior of a multi-threaded program can depend on the thread schedule:

Schedule 1 (Thread B runs after Thread A has finished):

  Thread A              Thread B
  open(".htpasswd")
  read(...)
  modify(...)
  write(...)
  close(...)
        --- Thread Switch ---
                        open(".htpasswd")
                        read(...)
                        modify(...)
                        write(...)
                        close(...)

Schedule 2 (Thread A and Thread B interleaved):

  Thread A              Thread B
  open(".htpasswd")
                        open(".htpasswd")
  read(...)
  modify(...)
                        read(...)
  write(...)
  close(...)
                        modify(...)
                        write(...)
                        close(...)

A’s updates get lost!

Thread switches and schedules are nondeterministic: Bugs are hard to reproduce and hard to isolate!
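
The lost update can be replayed deterministically in a few lines. The following is a minimal sketch, not code from the talk: the class `LostUpdateDemo`, the in-memory list standing in for `.htpasswd`, the entries `alice` and `bob`, and the schedule strings are all hypothetical. Each character of a schedule names the thread that executes its next step (read, modify, write); open and close are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: two simulated threads update a shared file under an explicit schedule.
final class LostUpdateDemo {
    // Stand-in for the contents of .htpasswd (one entry per element).
    private List<String> file = new ArrayList<>();

    // One simulated thread: read the file, add its own entry, write the result back.
    private final class Worker {
        private final String entry;
        private List<String> copy; // private copy made by the read step
        private int step = 0;

        Worker(String entry) { this.entry = entry; }

        void next() {
            switch (step++) {
                case 0: copy = new ArrayList<>(file); break; // read(...)
                case 1: copy.add(entry); break;              // modify(...)
                case 2: file = copy; break;                  // write(...)
                default: throw new IllegalStateException("worker already finished");
            }
        }
    }

    // Runs workers A and B; schedule.charAt(i) names the thread that executes step i.
    static List<String> run(String schedule) {
        LostUpdateDemo demo = new LostUpdateDemo();
        Worker a = demo.new Worker("alice");
        Worker b = demo.new Worker("bob");
        for (char c : schedule.toCharArray()) {
            (c == 'A' ? a : b).next();
        }
        return demo.file;
    }

    public static void main(String[] args) {
        System.out.println(run("AAABBB")); // A finishes before B starts: prints [alice, bob]
        System.out.println(run("AABABB")); // B reads before A writes:    prints [bob]
    }
}
```

The two schedule strings differ only in where the thread switch falls, yet the second one silently drops A's entry.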

SLIDE 5

2/12

  • Recording and Replaying Runs

DEJAVU captures and replays program runs deterministically:

[Figure: DEJAVU records the schedule of a run and replays it; the recorded and the replayed runs show the same values (x = 45, y = 39, z = 67).]

Allows simple reproduction of schedules and induced failures

SLIDE 7

3/12

  • Differences between Schedules

Using DEJAVU, we can consider the schedule as an input which determines whether the program passes or fails.

[Figure: replaying one schedule yields a passing run (✔); replaying another yields a failing run (✘).]

The difference between schedules is relevant for the failure: A small difference can pinpoint the failure cause

SLIDE 8

4/12

  • Finding Differences

[Figure: thread switches t1, t2, t3 in the failing run (✘) and in the passing run (✔).]

  • We start with runs ✔ and ✘
  • We determine the differences ∆i between thread switches ti:
      – t1 occurs in ✔ at “time” 254
      – t1 occurs in ✘ at “time” 278
      – The difference ∆1 = |278 − 254| induces a statement interval: the code executed between “time” 254 and 278
      – Same applies to t2, t3, etc.

Our goal: Narrow down the difference such that only a small relevant difference remains, pinpointing the root cause
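
Computing the differences is simple arithmetic over the recorded switch times. A minimal sketch, assuming a schedule is represented as an array of the "times" at which the thread switches t1, t2, ... occur; only the times 254 and 278 for t1 come from the slide, while the remaining values and the class `ScheduleDiff` are made up for illustration.

```java
// Hypothetical illustration: differences between the thread switches of two schedules.
final class ScheduleDiff {
    // deltas[i] = |failing[i] - passing[i]|: how far thread switch t(i+1) is moved.
    static long[] deltas(long[] passing, long[] failing) {
        if (passing.length != failing.length) {
            throw new IllegalArgumentException("schedules must have the same number of thread switches");
        }
        long[] deltas = new long[passing.length];
        for (int i = 0; i < passing.length; i++) {
            deltas[i] = Math.abs(failing[i] - passing[i]);
        }
        return deltas;
    }

    public static void main(String[] args) {
        long[] passing = {254, 600, 910}; // t1 from the slide; t2 and t3 are invented
        long[] failing = {278, 590, 910};
        long[] d = deltas(passing, failing);
        for (int i = 0; i < d.length; i++) {
            System.out.println("delta " + (i + 1) + " = " + d[i]);
        }
    }
}
```

With these values the program reports 24, 10 and 0: t1 is moved by 24 "time" units, t2 by 10, and t3 not at all.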

SLIDE 9

5/12

  • Isolating Relevant Differences

We use Delta Debugging to isolate the relevant differences.
Delta Debugging applies subsets of differences to ✔:

[Figure: a generated schedule (?) between ✘ and ✔.]

  • The entire difference ∆1 is applied
  • Half of the difference ∆2 is applied
  • ∆3 is not applied at all

DEJAVU executes the debuggee under this generated schedule; an automated test checks if the failure occurs

SLIDE 10

6/12

  • The Isolation Process

Delta Debugging systematically narrows down the difference

[Figure: Delta Debugging generates a schedule (?) between ✘ and ✔; DEJAVU replays the generated schedule; the test outcome is ✔ or ✘.]
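
The narrowing step can be sketched as plain bisection. This is a deliberately simplified stand-in, not the Delta Debugging algorithm itself: it assumes the differences are applied in a fixed order and that exactly one of them flips the outcome from ✔ to ✘, whereas Delta Debugging also copes with unresolved (?) outcomes. The class `IsolateDifference` and the predicate `failsWith`, which stands for "DEJAVU replays the generated schedule and the automated test checks for the failure", are hypothetical, as is the position of the culprit in `main`.

```java
import java.util.function.LongPredicate;

// Hypothetical illustration: bisection over an ordered list of schedule differences.
final class IsolateDifference {
    static int testsRun = 0; // number of generated schedules that were tested

    // failsWith.test(k) must replay the passing schedule with the first k differences
    // applied and report whether the failure occurs.
    // Precondition: with 0 differences applied the run passes, with all n it fails.
    // Returns the 1-based index of the difference that flips the outcome to "fail".
    static long isolate(long n, LongPredicate failsWith) {
        long pass = 0; // largest known number of applied differences that still passes
        long fail = n; // smallest known number of applied differences that fails
        while (fail - pass > 1) {
            long mid = pass + (fail - pass) / 2;
            testsRun++;
            if (failsWith.test(mid)) {
                fail = mid;
            } else {
                pass = mid;
            }
        }
        return fail;
    }

    public static void main(String[] args) {
        long n = 3_842_577_240L;       // number of differences reported later in the talk
        long culprit = 1_234_567_890L; // invented position of the failure-inducing difference
        long found = isolate(n, k -> k >= culprit); // stand-in for replay plus automated test
        System.out.println("isolated difference #" + found + " after " + testsRun + " tests");
    }
}
```

Each iteration halves the gap between the last passing and the first failing configuration, so the number of replays grows only logarithmically with the number of differences.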

SLIDE 12

7/12

  • A Real Program

We examine Test #205 of the SPEC JVM98 Java test suite: a raytracer program depicting a dinosaur.
The program is single-threaded; the multi-threaded code is commented out.

To test our approach,

  • we make the raytracer program multi-threaded again
  • we introduce a simple race condition
  • we implement an automated test that checks whether the failure occurs or not
  • we generate random schedules until we obtain both a passing schedule (✔) and a failing schedule (✘)

SLIDE 13

8/12

  • Passing and Failing Schedule

We obtain two schedules with 3,842,577,240 differences, each moving a thread switch by ±1 “time” unit

[Figure: Thread Schedules. Thread switches plotted against time (# yield points) for the failing schedule and the passing schedule.]

SLIDE 14

9/12

  • Narrowing Down the Failure Cause

Delta Debugging isolates one single difference after 50 tests:

[Figure: Delta Debugging Log. Deltas for cpass and cfail plotted against the number of tests executed.]
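
For a sense of scale (a back-of-the-envelope comparison, not a figure from the talk): an ideal binary search over the 3,842,577,240 differences between the two schedules would need

```latex
\[
  \lceil \log_2 3{,}842{,}577{,}240 \rceil = 32
  \qquad\text{since}\qquad
  2^{31} = 2{,}147{,}483{,}648 \;<\; 3{,}842{,}577{,}240 \;\le\; 2^{32} = 4{,}294{,}967{,}296
\]
```

tests, so the 50 tests reported here stay within a factor of two of that bound.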

SLIDE 15

10/12

  • The Root Cause of the Failure

    public class Scene {
        // ...
        private static int ScenesLoaded = 0;
        // (more methods...)

        private int LoadScene(String filename) {
            // Reads the shared counter into a local copy.
            int OldScenesLoaded = ScenesLoaded;
            // (more initializations...)
            infile = new DataInputStream(...);
            // (more code...)
            // Writes back the copy plus one: if another thread updated ScenesLoaded
            // since the read above, that update is lost.
            ScenesLoaded = OldScenesLoaded + 1;
            System.out.println("" + ScenesLoaded + " scenes loaded.");
            // ...
        }
        // ...
    }

SLIDE 19

11/12

  • Lessons Learned

Delta Debugging is efficient even when applied to very large thread schedules
Programs are “mostly correct” w.r.t. the thread schedule ⇒ Delta Debugging works like a binary search

No analysis is required as Delta Debugging relies on experiments alone
Only the schedule was observed and altered
Failure-inducing thread switch is easily associated with code

Alternate runs can be obtained automatically by generating random schedules
Only one initial run (✔ or ✘) is required

The whole approach is annoyingly simple in comparison to many other ideas we initially had

SLIDE 20

12/12

  • Conclusion

Debugging multi-threaded applications is easy:

  • Record/Replay tools like DEJAVU reproduce runs
  • Delta Debugging pinpoints the root cause of the failure

Debugging can do without analysis:

  • It suffices to execute the debuggee under changing circumstances

There is still much work to do:

  • More case studies (as soon as DEJAVU can handle GUIs)
  • Using program analysis to guide the narrowing process
  • Isolating cause-effect chain from root cause to failure

http://www.st.cs.uni-sb.de/dd/
http://www.research.ibm.com/dejavu/