Isolating Failure-Inducing Thread Schedules (Andreas Zeller)

International Symposium on Software Testing and Analysis (ISSTA), Rome, Italy, 2002. Isolating Failure-Inducing Thread Schedules. Andreas Zeller (Lehrstuhl für Softwaretechnik), Jong-Deok Choi (IBM T. J. Watson Research)


  1. Isolating Failure-Inducing Thread Schedules (0/12)
     International Symposium on Software Testing and Analysis (ISSTA), Rome, Italy, 2002
     Andreas Zeller, Lehrstuhl für Softwaretechnik, Universität des Saarlandes, Saarbrücken
     Jong-Deok Choi, IBM T. J. Watson Research Center, Yorktown Heights, New York

  4. How Thread Schedules Induce Failures (1/12)
     The behavior of a multi-threaded program can depend on the thread schedule:

     Passing schedule (✔)
       Thread A              Thread B
       open(".htpasswd")
       read(...)
       modify(...)
       write(...)
       close(...)
       -- thread switch --
                             open(".htpasswd")
                             read(...)
                             modify(...)
                             write(...)
                             close(...)

     Failing schedule (✘): A's updates get lost!
       Thread A              Thread B
       open(".htpasswd")
                             open(".htpasswd")
       read(...)
                             read(...)
       modify(...)
       write(...)
       close(...)
                             modify(...)
                             write(...)
                             close(...)

     Thread switches and schedules are nondeterministic:
     bugs are hard to reproduce and hard to isolate!
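
To make the lost update concrete, here is a minimal sketch that replays the two schedules above deterministically instead of using real threads. The class name `LostUpdateDemo`, the integer standing in for the contents of `.htpasswd`, and the string encoding of a schedule are all assumptions made for illustration; they are not part of the original slides.

```java
class LostUpdateDemo {
    // Hypothetical stand-in for the contents of ".htpasswd": a simple counter.
    static int file;

    /**
     * Executes the read/modify/write steps of threads A and B in the order
     * given by the schedule, e.g. "AAABBB" means A runs all three of its
     * steps before B runs any of its own.
     */
    static int run(String schedule) {
        file = 0;
        int[] local = new int[2]; // each thread's private copy of the data
        int[] step = new int[2];  // next step per thread: 0=read, 1=modify, 2=write
        for (char c : schedule.toCharArray()) {
            int t = c - 'A';
            switch (step[t]++) {
                case 0: local[t] = file; break; // read(...)
                case 1: local[t] += 1;   break; // modify(...)
                case 2: file = local[t]; break; // write(...)
                default: throw new IllegalArgumentException("too many steps for thread " + c);
            }
        }
        return file;
    }

    public static void main(String[] args) {
        // A finishes before the thread switch: both updates survive.
        System.out.println("passing schedule: " + run("AAABBB")); // prints 2
        // B reads before A writes: B later overwrites A's update.
        System.out.println("failing schedule: " + run("ABAABB")); // prints 1
    }
}
```

Because the schedule is an explicit input here, the same string always produces the same result, which is the property the following slides obtain for real programs through record and replay.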

  5. Recording and Replaying Runs (2/12)
     DEJAVU captures and replays program runs deterministically.
     [Figure: a run producing x = 45, y = 39, z = 67 is recorded as a schedule; replaying the recorded schedule with DEJAVU yields the same values x = 45, y = 39, z = 67.]
     This allows simple reproduction of schedules and induced failures.
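
The record/replay idea can be sketched with a toy scheduler. This is not DEJAVU's interface; `RecordReplaySketch`, its methods, and the use of a seeded `java.util.Random` to imitate nondeterministic thread switches are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class RecordReplaySketch {
    /**
     * Record mode: at each yield point the (simulated) scheduler may switch
     * threads. The yield points where a switch happened are logged, and the
     * resulting execution order is appended to trace.
     */
    static List<Integer> record(int yieldPoints, long seed, StringBuilder trace) {
        Random random = new Random(seed);
        List<Integer> switches = new ArrayList<>();
        int current = 0; // 0 = thread A, 1 = thread B
        for (int time = 0; time < yieldPoints; time++) {
            if (random.nextInt(4) == 0) { // stand-in for a nondeterministic switch
                current = 1 - current;
                switches.add(time);
            }
            trace.append(current == 0 ? 'A' : 'B');
        }
        return switches;
    }

    /** Replay mode: switch threads exactly at the logged yield points. */
    static String replay(int yieldPoints, List<Integer> switches) {
        StringBuilder trace = new StringBuilder();
        int current = 0;
        int next = 0; // index of the next logged switch
        for (int time = 0; time < yieldPoints; time++) {
            if (next < switches.size() && switches.get(next) == time) {
                current = 1 - current;
                next++;
            }
            trace.append(current == 0 ? 'A' : 'B');
        }
        return trace.toString();
    }

    public static void main(String[] args) {
        StringBuilder recorded = new StringBuilder();
        List<Integer> schedule = record(40, 2002L, recorded);
        String replayed = replay(40, schedule);
        System.out.println("schedule (switch times): " + schedule);
        System.out.println("replay matches record:   " + replayed.equals(recorded.toString()));
    }
}
```

The list of switch times is the whole recorded schedule in this sketch, so it can be stored, replayed, or edited like any other program input.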

  7. Differences between Schedules (3/12)
     Using DEJAVU, we can consider the schedule as an input which determines whether the program passes or fails.
     [Figure: replaying one schedule yields ✔, replaying the other yields ✘.]
     The difference between schedules is relevant for the failure:
     a small difference can pinpoint the failure cause.

  8. Finding Differences (4/12)
     • We start with runs ✔ and ✘
     • We determine the differences ∆i between thread switches ti:
       – t1 occurs in ✔ at "time" 254
       – t1 occurs in ✘ at "time" 278
       – The difference ∆1 = |278 − 254| induces a statement interval: the code executed between "time" 254 and 278
       – Same applies to t2, t3, etc.
     [Figure: thread switches t1, t2, t3 occurring at different times in ✔ and ✘.]
     Our goal: Narrow down the difference such that only a small relevant difference remains, pinpointing the root cause
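
The per-switch differences can be computed directly from the two recorded schedules. The sketch below assumes both schedules contain the same number of thread switches and represents each schedule as an array of switch times; only the values 254 and 278 for t1 come from the slide, the times for t2 and t3 are made up for illustration.

```java
import java.util.Arrays;

class ScheduleDiff {
    /** Returns the difference |failing[i] - passing[i]| for each thread switch. */
    static long[] deltas(long[] passing, long[] failing) {
        if (passing.length != failing.length) {
            throw new IllegalArgumentException("schedules must have the same number of thread switches");
        }
        long[] result = new long[passing.length];
        for (int i = 0; i < passing.length; i++) {
            result[i] = Math.abs(failing[i] - passing[i]);
        }
        return result;
    }

    /** Total number of unit differences, each moving one thread switch by one "time" unit. */
    static long totalDifferences(long[] deltas) {
        long sum = 0;
        for (long d : deltas) {
            sum += d;
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] passing = {254, 900, 1500}; // t1 from the slide; t2, t3 hypothetical
        long[] failing = {278, 900, 1490};
        long[] d = deltas(passing, failing);
        System.out.println(Arrays.toString(d));  // prints [24, 0, 10]
        System.out.println(totalDifferences(d)); // prints 34
    }
}
```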

  9. Isolating Relevant Differences (5/12)
     We use Delta Debugging to isolate the relevant differences.
     Delta Debugging applies subsets of differences to ✔:
     • The entire difference ∆1 is applied
     • Half of the difference ∆2 is applied
     • ∆3 is not applied at all
     [Figure: a generated schedule between ✔ and ✘ whose outcome is still unknown (?).]
     DEJAVU executes the debuggee under this generated schedule; an automated test checks if the failure occurs.

  10. The Isolation Process (6/12)
      Delta Debugging systematically narrows down the difference.
      [Figure: a schedule generated between ✔ and ✘ is replayed by DEJAVU; the test outcome is either ✔ or ✘.]
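
The narrowing loop can be sketched as follows. This is a deliberately simplified variant, not the full Delta Debugging algorithm: it assumes every generated schedule either passes or fails (no unresolved outcomes), in which case the search behaves like a binary search. The class `DeltaDebuggingSketch`, the method `isolate`, and the predicate standing in for "replay with DEJAVU and run the automated test" are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class DeltaDebuggingSketch {
    /**
     * Narrows the differences between a passing and a failing run down to one.
     * Invariant: applying `applied` to the passing run still passes, while
     * applying `applied` plus `remaining` fails.
     *
     * @param differences all unit differences between the two runs (non-empty)
     * @param fails       returns true if the run with the given differences applied fails
     */
    static <T> T isolate(List<T> differences, Predicate<List<T>> fails) {
        List<T> applied = new ArrayList<>();
        List<T> remaining = new ArrayList<>(differences);
        while (remaining.size() > 1) {
            int mid = remaining.size() / 2;
            List<T> firstHalf = new ArrayList<>(remaining.subList(0, mid));
            List<T> candidate = new ArrayList<>(applied);
            candidate.addAll(firstHalf);
            if (fails.test(candidate)) {
                // The failure-inducing difference is within the first half.
                remaining = firstHalf;
            } else {
                // Still passing: keep the first half applied, search the rest.
                applied = candidate;
                remaining = new ArrayList<>(remaining.subList(mid, remaining.size()));
            }
        }
        return remaining.get(0);
    }

    public static void main(String[] args) {
        List<Integer> differences = new ArrayList<>();
        for (int i = 0; i < 16; i++) {
            differences.add(i);
        }
        // Hypothetical test: the run fails exactly when difference 11 is applied.
        Integer cause = isolate(differences, applied -> applied.contains(11));
        System.out.println("failure-inducing difference: " + cause); // prints 11
    }
}
```

With 16 differences the sketch needs four tests. The general algorithm must also cope with unresolved test outcomes and with differences that only fail in combination, which this sketch does not attempt.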

  12. A Real Program (7/12)
      We examine Test #205 of the SPEC JVM98 Java test suite: a raytracer program depicting a dinosaur.
      The program is single-threaded; the multi-threaded code is commented out.
      To test our approach,
      • we make the raytracer program multi-threaded again
      • we introduce a simple race condition
      • we implement an automated test that would check whether the failure occurs or not
      • we generate random schedules until we obtain both a passing schedule (✔) and a failing schedule (✘)
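
The last step, generating random schedules until one passes and one fails, might look like the sketch below. It does not run the raytracer; it simulates two threads that each read a shared counter and later write back the old value plus one, the same pattern as the race in `LoadScene`. The class `RandomScheduleSearch`, the character encoding of schedules, and the attempt limit of 1000 are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class RandomScheduleSearch {
    /**
     * Automated test: simulates threads A and B under the given schedule.
     * Each thread's first step reads the counter, its second step writes
     * back the value it read plus one. Passes if both updates survive.
     */
    static boolean passes(List<Character> schedule) {
        int scenesLoaded = 0;
        int[] old = new int[2];
        boolean[] hasRead = new boolean[2];
        for (char c : schedule) {
            int t = c - 'A';
            if (!hasRead[t]) {
                old[t] = scenesLoaded; // read the shared counter
                hasRead[t] = true;
            } else {
                scenesLoaded = old[t] + 1; // write back a possibly stale value
            }
        }
        return scenesLoaded == 2;
    }

    /** Shuffles schedules until both outcomes were seen; returns [passing, failing]. */
    static List<List<Character>> findPassingAndFailing(long seed) {
        Random random = new Random(seed);
        List<Character> passing = null;
        List<Character> failing = null;
        for (int attempt = 0; attempt < 1000 && (passing == null || failing == null); attempt++) {
            List<Character> schedule = new ArrayList<>(Arrays.asList('A', 'A', 'B', 'B'));
            Collections.shuffle(schedule, random);
            if (passes(schedule)) {
                passing = schedule;
            } else {
                failing = schedule;
            }
        }
        if (passing == null || failing == null) {
            throw new IllegalStateException("did not observe both outcomes");
        }
        return Arrays.asList(passing, failing);
    }

    public static void main(String[] args) {
        List<List<Character>> found = findPassingAndFailing(205L);
        System.out.println("passing schedule: " + found.get(0));
        System.out.println("failing schedule: " + found.get(1));
    }
}
```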

  13. Passing and Failing Schedule (8/12)
      We obtain two schedules with 3,842,577,240 differences, each moving a thread switch by ±1 "time" unit.
      [Figure: "Thread Schedules". x-axis: thread switches (0 to 100); y-axis: time in # yield points (0 to 1.8e+08); series: Failing Schedule, Passing Schedule.]

  14. Narrowing Down the Failure Cause (9/12)
      Delta Debugging isolates one single difference after 50 tests.
      [Figure: "Delta Debugging Log". x-axis: tests executed (0 to 50); y-axis: deltas (1e+11 to 1e+14); series: cpass, cfail.]

  15. The Root Cause of the Failure (10/12)

      public class Scene {
          // ...
          private static int ScenesLoaded = 0;
          // (more methods...)

          private int LoadScene(String filename) {
              int OldScenesLoaded = ScenesLoaded;
              // (more initializations...)
              infile = new DataInputStream(...);
              // (more code...)
              ScenesLoaded = OldScenesLoaded + 1;
              System.out.println("" + ScenesLoaded + " scenes loaded.");
              // ...
          }
          // ...
      }

  19. Lessons Learned (11/12)
      • Delta Debugging is efficient even when applied to very large thread schedules
        – Programs are "mostly correct" w.r.t. the thread schedule ⇒ Delta Debugging works like a binary search
      • No analysis is required as Delta Debugging relies on experiments alone
        – Only the schedule was observed and altered
        – Failure-inducing thread switch is easily associated with code
      • Alternate runs can be obtained automatically by generating random schedules
        – Only one initial run (✔ or ✘) is required
      • The whole approach is annoyingly simple in comparison to many other ideas we initially had

  20. Conclusion (12/12)
      Debugging multi-threaded applications is easy:
      • Record/Replay tools like DEJAVU reproduce runs
      • Delta Debugging pinpoints the root cause of the failure
      Debugging can do without analysis:
      • It suffices to execute the debuggee under changing circumstances
      There is still much work to do:
      • More case studies (as soon as DEJAVU can handle GUIs)
      • Using program analysis to guide the narrowing process
      • Isolating cause-effect chain from root cause to failure
      http://www.st.cs.uni-sb.de/dd/
      http://www.research.ibm.com/dejavu/
