
Towards Production-Run Heisenbugs Reproduction on Commercial Hardware



  1. Towards Production-Run Heisenbugs Reproduction on Commercial Hardware. Shiyou Huang, Bowen Cai, and Jeff Huang

  2. What’s a coder’s worst nightmare? https://www.quora.com/What-is-a-coders-worst-nightmare

  3. The bug only occurs in production but cannot be replicated locally. https://www.quora.com/What-is-a-coders-worst-nightmare

  4. Heisenbug: When you trace them, they disappear!

  5. Heisenbug: When you trace them, they disappear! • Localization is hard

  6. Heisenbug: When you trace them, they disappear! • Localization is hard • Reproduction is hard

  7. Heisenbug: When you trace them, they disappear! • Localization is hard • Reproduction is hard • You never know if it is fixed…

  8. A motivating example. Init: x=1, y=2
     T1:
       1: T2.start()
       2: z=0
       3: x++
       4: y++
       5: z=1
       6: T2.join()
     T2:
       7: if (z==1)
       8:   assert(x+1==y) ✗
     The assertion fails, yet z=1 implies x=2, y=3 and hence x+1==y: contradiction!
     http://stackoverflow.com/questions/16159203/

  9. A motivating example (same program as slide 8). The failure is possible under PSO (partial store order). http://stackoverflow.com/questions/16159203/

  10. A motivating example (same program as slide 8). $12 million loss of equipment! http://stackoverflow.com/questions/16159203/
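
The contradiction on slides 8-10 can be checked with a small brute-force search. The sketch below is a deliberately simplified model, not the authors' tool: it assumes z is initially 0, treats `x++` and `y++` as plain stores of 2 and 3, lets T2 read z, x, y in program order, and models PSO only as "stores to different variables may become visible in any order". All names (`T1_STORES`, `visible_orders`, `assertion_can_fail`) are hypothetical.

```python
from itertools import permutations

# T1's stores in program order as (line, variable, value); x++ and y++ are
# modelled by the values they store (x: 1 -> 2, y: 2 -> 3).
T1_STORES = [(2, "z", 0), (3, "x", 2), (4, "y", 3), (5, "z", 1)]
INIT = {"x": 1, "y": 2, "z": 0}  # z's initial value is an assumption


def visible_orders(model):
    """Yield the orders in which T1's stores may become visible to T2."""
    for order in permutations(T1_STORES):
        if model == "SC":
            allowed = list(order) == T1_STORES  # full program order
        else:  # "PSO": only stores to the same variable stay ordered
            allowed = all(
                order.index(a) < order.index(b)
                for i, a in enumerate(T1_STORES)
                for b in T1_STORES[i + 1:]
                if a[1] == b[1]
            )
        if allowed:
            yield order


def memory_after(order, count):
    """Return memory once the first `count` stores of `order` are visible."""
    mem = dict(INIT)
    for _, var, value in order[:count]:
        mem[var] = value
    return mem


def assertion_can_fail(model):
    """Try every point at which T2 may read z, then x, then y."""
    for order in visible_orders(model):
        n = len(order)
        for i in range(n + 1):                # stores visible when T2 reads z
            for j in range(i, n + 1):         # ... when it reads x
                for k in range(j, n + 1):     # ... when it reads y
                    z = memory_after(order, i)["z"]
                    x = memory_after(order, j)["x"]
                    y = memory_after(order, k)["y"]
                    if z == 1 and x + 1 != y:
                        return True
    return False


print("SC :", assertion_can_fail("SC"))
print("PSO:", assertion_can_fail("PSO"))
```

In this model the assertion cannot fail when the stores become visible in program order, but it can once the store `z=1` is allowed to overtake the store from `y++`.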

  11. Record & Replay (RnR). Goal: record the non-determinism at runtime and reproduce the failure. (Diagram: Record and Replay of the failure execution.)

  12. Record & Replay (RnR). Goal: record the non-determinism at runtime and reproduce the failure. • Record: runtime overhead • Replay: the ability to reproduce failures

  13. Related Work
      • Software-based approach
        • order-based: fully record shared memory dependencies at runtime
          • LEAP[FSE’10], Order[USENIX ATC’11], Chimera[PLDI’12], Light[PLDI’15], RR[USENIX ATC’17]…
          • Chimera: > 2.4x
        • search-based: partially record the dependencies at runtime and use offline analysis (e.g. SMT solvers) to reason about the dependencies
          • ODR[SOSP’09], Lee et al. [MICRO’09], Weeratunge et al. [ASPLOS’10], CLAP[PLDI’13]…
          • CLAP: 0.9x – 3x
      • Hardware-based approach
        • Rerun[ISCA’08], Delorean[ISCA’08], Coreracer[MICRO’11], PBI[ASPLOS’13]…
        • rely on special hardware that is not deployed

  14. Reality of RnR in production • high overheads • failing to reproduce failures • lack of commodity hardware support

  15. Contributions. Goal: record the execution at runtime with low overhead and faithfully reproduce it offline • RnR based on control flow tracing on commercial hardware (Intel PT) • core-based constraints reduction to reduce the offline computation • H3, evaluated on popular benchmarks and real-world applications, overhead: 1.4%-23.4%

  16. Intel Processor Trace (PT). PT: program control flow tracing, supported on 5th and 6th generation Intel Core • Low overhead, as low as 5% [1] • Highly compacted packets, <1 bit per retired instruction • One bit (1/0) for branch taken indication • Compressed branch target address. [1]: https://sites.google.com/site/intelptmicrotutorial

  17. PT Tracing Overhead. 4.9% overhead on executions of PARSEC 3.0 on average.
      (Diagram: Intel CPU core 0...n emits a packets stream per logical CPU; a driver configures and enables Intel PT; the Intel PT software decoder combines the packets with binary image files and runtime data into the reconstructed execution.)

        Program        Native time (s)   PT time (s)   OH(%)    Trace
        bodytrack      0.557             0.573         2.9%     94M
        x264           1.086             1.145         5.4%     88M
        vips           1.431             1.642         14.7%    98M
        blackscholes   1.51              1.56          9.9%     289M
        ferret         1.699             1.769         4.1%     145M
        swaptions      2.81              2.98          6.0%     897M
        raytrace       3.818             4.036         5.7%     102M
        facesim        5.048             5.145         1.9%     110M
        fluidanimate   14.8              15.1          1.4%     1240M
        freqmine       15.9              17.1          7.5%     2468M
        Avg.           4.866             5.105         4.9%     553M
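
As a reading aid for the OH(%) column on slide 17, the sketch below assumes that overhead is the relative slowdown of the PT-traced run over the native run; the helper name `overhead_percent` is hypothetical. With the averaged times from the table it gives the headline 4.9%.

```python
def overhead_percent(native_s, traced_s):
    """Relative slowdown of the traced run over the native run, in percent."""
    return (traced_s - native_s) / native_s * 100.0


# Times (seconds) taken from the "Avg." and "bodytrack" rows of the table.
print(round(overhead_percent(4.866, 5.105), 1))
print(round(overhead_percent(0.557, 0.573), 1))
```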

  18. Challenges • PT trace: low-level representation (assembly instructions) • Absence of the thread information • No data values of memory accesses

  19. Solutions
      • PT trace: low-level representation & no data values
        • Idea: extract the path profiles from the PT trace and re-execute the program with KLEE to generate symbolic values
      • Absence of the thread information
        • Idea: use thread context switch information from Perf

  20. H3 Overview
      (Diagram, Recording & Decoding: at the user end, threads T0 ... Tn run on cores 0-3; PT tracing records the execution on each core into a packet log; decoding the log against the binary image yields path profiles; symbolic execution along them yields the symbolic trace of each thread.)
      (Diagram, Offline Constraints Construction & Solving: 1. Constraints formula generation: path constraints, core-based read-write constraints, synchronization constraints, memory order constraints. 2. SMT solver, which outputs a global schedule.)
      Phase 1: Control-flow tracing. Reconstruct the execution on each core by decoding the packets generated by PT and thread information from Perf.
      Phase 2: Offline analysis • Path profiles of each thread • Symbolic trace of each thread • SMT constraints over the trace

  21. Example, Step 1: collecting path profiles of each thread (program as on slide 8). PT: tracing control flow of the program’s execution. (Diagram: the trace packets are assigned to T1 and T2 using perf context switch events (TID, CPUID, TIME…); for each thread, the binary image and the packets log are decoded with libipt into the reconstructed execution, i.e. the program’s control flow over basic blocks A-F, which is then matched to line numbers: line 1, line 2, ..., line n.)

  22. Example, Step 1: collecting path profiles of each thread (program as on slide 8). PT: tracing control flow of the program’s execution. (Diagram: as on slide 21; the matched line numbers are then matched to the basic blocks BB1, BB2, BB3 of the *.ll file, giving the path profile.) Path profiles: T1: bb1; T2: bb1, bb2
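
The idea behind Step 1, replaying one taken/not-taken bit per conditional branch against the program's control flow graph to recover the executed path, can be sketched as follows. This is a toy model only: the CFG, its block names, and `reconstruct_path` are hypothetical, and real PT packets are decoded against the binary image by a decoder library such as libipt rather than by code like this.

```python
# Hypothetical toy CFG. Each block is one of:
#   ("cond", taken_target, not_taken_target)  conditional branch
#   ("jump", target)                          unconditional transfer
#   ("exit",)                                 end of the thread
CFG = {
    "bb1": ("cond", "bb2", "bb3"),
    "bb2": ("jump", "bb4"),
    "bb3": ("jump", "bb4"),
    "bb4": ("exit",),
}


def reconstruct_path(cfg, entry, branch_bits):
    """Walk the CFG, consuming one bit (1 = taken) per conditional branch."""
    bits = iter(branch_bits)
    path = []
    block = entry
    while True:
        path.append(block)
        kind, *targets = cfg[block]
        if kind == "exit":
            return path
        if kind == "cond":
            taken = next(bits)  # raises StopIteration if the trace is too short
            block = targets[0] if taken else targets[1]
        else:  # "jump": no bit is needed, the target is static
            block = targets[0]


print(reconstruct_path(CFG, "bb1", [1]))
print(reconstruct_path(CFG, "bb1", [0]))
```

Only the conditional branch consumes a bit; unconditional transfers are resolved from the CFG alone, which is why such a trace can stay well below one bit per executed instruction.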

  23. Example, Step 2: symbolic trace generation. Init: x=1, y=2. KLEE[OSDI’08]: execute the thread along the path profile, using symbolic values to represent concrete values, e.g., $W_z^2$: value written to z at line 2; $R_x^3$: value read from x at line 3.
      T1:
        1: T2.start()
        2: z=0                $W_z^2 = 0$
        3: x++                $W_x^3 = R_x^3 + 1$
        4: y++                $W_y^4 = R_y^4 + 1$
        5: z=1                $W_z^5 = 1$
        6: T2.join()
      T2:
        7: if (z==1)          $\mathit{True} \equiv (R_z^7 == 1)$
        8: assert(x+1==y) ✗   $R_x^8 + 1 \neq R_y^8$

  24. Example, Step 3: computing global failure schedule (program as on slide 8). CLAP[PLDI’13]: reason about dependencies of memory accesses. Order variable $O$ represents the order of a statement in the global schedule, e.g., $O_2 < O_3$ means 2: z=0 happens before 3: x++.

  25. Example, Step 3: computing global failure schedule (program as on slide 8). CLAP[PLDI’13]: reason about dependencies of memory accesses.
      Read-write constraints (match a read to a write):
        $(R_z^7 = 0 \wedge O_7 < O_2) \vee (R_z^7 = W_z^5 \wedge O_5 < O_7 \wedge (O_2 < O_5 \vee O_7 < O_2))$
      Memory order constraints, given separately for SC and PSO: SC keeps each thread's full program order (e.g., $O_1 < O_2 < O_3 < O_4 < O_5 < O_6$ for T1 and $O_7 < O_8$ for T2), while PSO does not order writes to different addresses.
      Path constraints: $R_z^7 = 1$
      Failure constraints: $R_x^8 + 1 \neq R_y^8$

  26. Example, Step 3: computing global failure schedule. Same constraints as on slide 25, with the happens-before (HB) and reads-from (rf) edges marked on the example.
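
Once a global schedule has been computed, reproducing the failure amounts to executing the statements in that order. The sketch below replays a schedule over the example at statement granularity; it is an illustration only, `replay` is a hypothetical helper, z is assumed to start at 0, and the failing schedule shown (line 5 before line 4, as PSO permits) is one possible schedule chosen by hand rather than the output of a solver.

```python
def replay(schedule):
    """Execute the example's statements in the given global order."""
    mem = {"x": 1, "y": 2, "z": 0}  # z's initial value is an assumption
    steps = {
        2: lambda: mem.update(z=0),
        3: lambda: mem.update(x=mem["x"] + 1),
        4: lambda: mem.update(y=mem["y"] + 1),
        5: lambda: mem.update(z=1),
    }
    branch_taken = False
    for line in schedule:
        if line in steps:
            steps[line]()
        elif line == 7:                      # 7: if (z==1)
            branch_taken = mem["z"] == 1
        elif line == 8 and branch_taken:     # 8: assert(x+1==y)
            if mem["x"] + 1 != mem["y"]:
                return "assertion failed"
        # lines 1 and 6 (start/join) have no effect on memory
    return "ok"


print(replay([1, 2, 3, 4, 5, 7, 8, 6]))  # program order within T1
print(replay([1, 2, 3, 5, 7, 8, 4, 6]))  # z=1 becomes visible before y++
```

The second schedule satisfies the path constraint (the read of z at line 7 sees 1) and the failure constraint (x+1 differs from y at line 8), which is the kind of order the offline phase searches for.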
