Towards Production-Run Heisenbugs Reproduction on Commercial Hardware

Towards Production-Run Heisenbugs Reproduction on Commercial Hardware. Shiyou Huang, Bowen Cai, and Jeff Huang 1

What’s a coder’s worst nightmare? https://www.quora.com/What-is-a-coders-worst-nightmare 2

The bug only occurs in production but cannot be replicated locally. https://www.quora.com/What-is-a-coders-worst-nightmare 3

Heisenbug: When you trace them, they disappear! 4

Heisenbug: When you trace them, they disappear! • Localization is hard 5

Heisenbug: When you trace them, they disappear! • Localization is hard • Reproduction is hard 6

Heisenbug: When you trace them, they disappear! • Localization is hard • Reproduction is hard • You never know if it is fixed… 7

A motivating example
Init: x=1, y=2
T1: 1: T2.start() 2: z=0 3: x++ 4: y++ 5: z=1 6: T2.join()
T2: 7: if (z==1) 8: assert(x+1==y) ✗
If T2 sees z==1 at line 7, then in program order lines 3 and 4 have already run, so x=2, y=3 and x+1==y holds. A failing assertion at line 8 is a contradiction!
http://stackoverflow.com/questions/16159203/ 8

A motivating example: the same program, with the failing execution labelled PSO. Under PSO (partial store order), the stores of lines 3 to 5 may become visible to T2 out of program order, so T2 can observe z==1 while x+1==y does not yet hold. http://stackoverflow.com/questions/16159203/ 9

A motivating example: the same program, annotated "$12 million loss of equipment!" http://stackoverflow.com/questions/16159203/ 10
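The contradiction under SC, and its disappearance under PSO, can be checked by brute force. The sketch below is a simplified model, not the authors' tooling: it assumes z starts at 0, treats x++ and y++ as plain stores of 2 and 3 (T1 is the only writer), and models PSO as allowing stores to different variables to become visible in any order. All names are illustrative.

```python
from itertools import permutations

# Assumed initial memory; the slide gives x and y, z = 0 is an assumption.
INIT = {"x": 1, "y": 2, "z": 0}
# T1 lines 2-5, modelled as stores in program order.
STORES = [("z", 0), ("x", 2), ("y", 3), ("z", 1)]


def visible_orders(model):
    """Yield every order in which T1's stores may become visible to T2."""
    for perm in permutations(range(len(STORES))):
        allowed = True
        for a in range(len(perm)):
            for b in range(a + 1, len(perm)):
                first, second = perm[a], perm[b]
                if first > second:  # visible out of program order
                    same_var = STORES[first][0] == STORES[second][0]
                    if model == "SC" or same_var:
                        allowed = False
        if allowed:
            yield [STORES[i] for i in perm]


def assertion_can_fail(model):
    """True if T2 can read z == 1 and then see x + 1 != y."""
    for order in visible_orders(model):
        # snapshots[k] is memory after the first k stores are visible.
        snapshots = [dict(INIT)]
        for var, value in order:
            nxt = dict(snapshots[-1])
            nxt[var] = value
            snapshots.append(nxt)
        n = len(snapshots)
        # T2 reads z, then x, then y, at non-decreasing points in time.
        for pz in range(n):
            for px in range(pz, n):
                for py in range(px, n):
                    if (snapshots[pz]["z"] == 1
                            and snapshots[px]["x"] + 1 != snapshots[py]["y"]):
                        return True
    return False


print("SC :", assertion_can_fail("SC"))   # False
print("PSO:", assertion_can_fail("PSO"))  # True
```

Under SC only the program-order visibility sequence exists, so z==1 implies both increments are visible. Under the PSO model, z=1 can overtake y++, which is enough to break the assertion.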

Record & Replay (RnR). Goal: record the non-determinism at runtime and reproduce the failure. Failure execution: Record, then Replay 11

Record & Replay (RnR). Goal: record the non-determinism at runtime and reproduce the failure. Failure execution: Record, then Replay • runtime overhead • the ability to reproduce failures 12

Related Work
• Software-based approach
  • order-based: fully record shared memory dependencies at runtime
    • LEAP[FSE’10], Order[USENIX ATC’11], Chimera[PLDI’12], Light[PLDI’15], RR[USENIX ATC’17]…
    • Chimera: > 2.4x
  • search-based: partially record the dependencies at runtime and use offline analysis (e.g. SMT solvers) to reason about the dependencies
    • ODR[SOSP’09], Lee et al. [MICRO’09], Weeratunge et al.[ASPLOS’10], CLAP[PLDI’13]…
    • CLAP: 0.9x – 3x
• Hardware-based approach
  • Rerun[ISCA’08], Delorean[ISCA’08], Coreracer[MICRO’11], PBI[ASPLOS’13]…
  • rely on special hardware that is not deployed 13

Reality of RnR in production • high overheads • failing to reproduce failures • lack of commodity hardware support 14

Contributions. Goal: record the execution at runtime with low overhead and faithfully reproduce it offline
• RnR based on control flow tracing on commercial hardware (Intel PT)
• core-based constraint reduction to reduce the offline computation
• H3, evaluated on popular benchmarks and real-world applications, overhead: 1.4%-23.4% 15

Intel Processor Trace (PT). PT: program control flow tracing, supported on 5th and 6th generation Intel Core processors
• Low overhead, as low as 5% [1]
• Highly compacted packets, <1 bit per retired instruction
  • One bit (1/0) for branch taken indication
  • Compressed branch target address
[1] https://sites.google.com/site/intelptmicrotutorial 16
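To see why one taken/not-taken bit per conditional branch is enough to recover a path, here is a minimal sketch. It is a toy model, not the real PT packet format or decoder: the control-flow graph, block names, and the `reconstruct_path` helper are all hypothetical.

```python
# Toy control-flow graph (hypothetical). A conditional block maps to
# (successor if taken, successor if not taken); an unconditional block
# maps to its single successor; an exit block maps to None.
CFG = {
    "A": ("B", "C"),
    "B": "D",
    "C": "D",
    "D": ("E", "F"),
    "E": None,
    "F": None,
}


def reconstruct_path(cfg, entry, tnt_bits):
    """Walk the CFG, consuming one bit at each conditional branch."""
    bits = iter(tnt_bits)
    path = [entry]
    block = entry
    while cfg[block] is not None:
        successors = cfg[block]
        if isinstance(successors, tuple):
            taken = next(bits)  # raises StopIteration if the trace is too short
            block = successors[0] if taken else successors[1]
        else:
            block = successors  # no bit needed: the target is static
        path.append(block)
    return path


print(reconstruct_path(CFG, "A", [1, 0]))  # ['A', 'B', 'D', 'F']
```

Only the two conditional branches cost a bit each; the unconditional edges are recovered from the binary alone, which is the intuition behind the "<1 bit per retired instruction" figure.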

PT Tracing Overhead. 4.9% overhead on executions of PARSEC 3.0 on average.
[Figure: Intel PT workflow. Labels: Intel CPU core 0...n; Driver; Configure & Enable Intel PT; Packets stream (per logical CPU); Intel PT Software Decoder; Runtime data; Binary Image files; Reconstructed execution.]

Program        Native time (s)   PT time (s)   OH(%)    Trace
bodytrack      0.557             0.573         2.9%     94M
x264           1.086             1.145         5.4%     88M
vips           1.431             1.642         14.7%    98M
blackscholes   1.51              1.56          9.9%     289M
ferret         1.699             1.769         4.1%     145M
swaptions      2.81              2.98          6.0%     897M
raytrace       3.818             4.036         5.7%     102M
facesim        5.048             5.145         1.9%     110M
fluidanimate   14.8              15.1          1.4%     1240M
freqmine       15.9              17.1          7.5%     2468M
Avg.           4.866             5.105         4.9%     553M
17
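The OH(%) column reads as the relative slowdown of the traced run over the native run. As a quick check under that assumption, the sketch below recomputes it for a few rows of the table; the `overhead_percent` helper is illustrative.

```python
# (native seconds, PT-traced seconds) copied from the table above.
TIMES = {
    "bodytrack": (0.557, 0.573),
    "x264": (1.086, 1.145),
    "vips": (1.431, 1.642),
    "Avg.": (4.866, 5.105),
}


def overhead_percent(native, traced):
    """Relative slowdown of the traced run, in percent."""
    return (traced - native) / native * 100.0


for name, (native, traced) in TIMES.items():
    print(f"{name}: {overhead_percent(native, traced):.1f}%")
```

For these rows the formula reproduces the reported 2.9%, 5.4%, 14.7%, and 4.9%.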

Challenges • PT trace: low-level representation (assembly instruction) • Absence of the thread information • No data values of memory accesses 18

Solutions
• PT trace: low-level representation & no data values
  • Idea: extract the path profiles from the PT trace and re-execute the program with KLEE to generate symbolic values
• Absence of the thread information
  • Idea: use thread context switch information from Perf 19
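The second idea can be sketched as a time-interval lookup: a per-core trace record belongs to whichever thread was last scheduled onto that core. This is a minimal illustration, not H3's implementation; the event tuples, timestamps, and helper names are hypothetical, and real perf output needs parsing first.

```python
from bisect import bisect_right

# Hypothetical context-switch events: (time, cpu_id, tid scheduled in).
SWITCHES = [(0, 0, 101), (50, 0, 102), (0, 1, 102), (40, 1, 103)]
# Hypothetical decoded trace records: (time, cpu_id, basic block).
RECORDS = [(10, 0, "bb1"), (20, 1, "bb1"), (60, 0, "bb2")]


def build_timeline(switches):
    """Per CPU, keep sorted switch times and the thread scheduled at each."""
    timeline = {}
    for time, cpu, tid in sorted(switches):
        times, tids = timeline.setdefault(cpu, ([], []))
        times.append(time)
        tids.append(tid)
    return timeline


def thread_at(timeline, cpu, time):
    """Thread running on `cpu` at `time`, or None before the first switch."""
    times, tids = timeline[cpu]
    index = bisect_right(times, time) - 1
    return tids[index] if index >= 0 else None


def split_by_thread(timeline, records):
    """Group trace records into one time-ordered block list per thread."""
    per_thread = {}
    for time, cpu, block in sorted(records):
        tid = thread_at(timeline, cpu, time)
        per_thread.setdefault(tid, []).append(block)
    return per_thread


print(split_by_thread(build_timeline(SWITCHES), RECORDS))
# {101: ['bb1'], 102: ['bb1', 'bb2']}
```

Thread 102 migrates from core 1 to core 0 in this toy data, and its blocks are still stitched into one per-thread sequence.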

H3 Overview
[Figure: H3 pipeline. Recording & Decoding (user end): threads T0 ... Tn on cores 0 to 3 under PT tracing; packet log and binary image are decoded into the execution recorded by each core. Offline Constraints Construction & Solving: path profiles, symbolic execution, symbolic trace of each thread; 1. constraints formula generation (path constraints, core-based read-write constraints, synchronization constraints, memory order constraints); 2. SMT solver, producing a global schedule.]
Phase 1: Control-flow tracing. Reconstruct the execution on each core by decoding the packets generated by PT and thread information from Perf.
Phase 2: Offline analysis
• Path profiles of each thread
• Symbolic trace of each thread
• SMT constraints over the trace 20

Example, Step 1: Collecting path profiles of each thread. PT: tracing the control flow of the program’s execution.
Init: x=1, y=2
T1: 1: T2.start() 2: z=0 3: x++ 4: y++ 5: z=1 6: T2.join()
T2: 7: if (z==1) 8: assert(x+1==y) ✗
[Figure: for each of T1 and T2, binary image + packets log are decoded (libipt) into the reconstructed execution, i.e. the program's control flow over blocks A to F, which is then matched to line numbers (line 1, line 2, ..., line n). Trace packets are attributed to T1 and T2 using perf context switch events (TID, CPUID, TIME…).] 21

Example, Step 1: Collecting path profiles of each thread. PT: tracing the control flow of the program’s execution.
Init: x=1, y=2
T1: 1: T2.start() 2: z=0 3: x++ 4: y++ 5: z=1 6: T2.join()
T2: 7: if (z==1) 8: assert(x+1==y) ✗
[Figure: for each of T1 and T2, binary image + packets log are decoded into the reconstructed execution (the program's control flow over blocks A to F), matched to line numbers, and then matched to *.ll basic blocks (BB1, BB2, BB3).]
Path profile: T1: bb1; T2: bb1, bb2 22

Example, Step 2: symbolic trace generation. KLEE[OSDI’08]: execute the thread along the path profile, using symbolic values to represent concrete values, e.g., $W_z^2$: value written to z at line 2; $R_x^3$: value read from x at line 3.
Init: x=1, y=2
T1:
1: T2.start()
2: z=0        $W_z^2 = 0$
3: x++        $W_x^3 = R_x^3 + 1$
4: y++        $W_y^4 = R_y^4 + 1$
5: z=1        $W_z^5 = 1$
6: T2.join()
T2:
7: if (z==1)              $\mathit{True} \equiv (R_z^7 = 1)$
8: assert(x+1==y) ✗       $R_x^8 + 1 \neq R_y^8$
23
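The naming scheme on this slide (one fresh read symbol per shared read, one write symbol per store, both tagged with the line number) can be sketched in a few lines. This is only an illustration of the notation, not KLEE; the tuple encoding of statements and the helper names are hypothetical.

```python
import re

SHARED = {"x", "y", "z"}

# Hypothetical encoding of each thread's path: (line, kind, target, expression).
T1_PATH = [
    (2, "store", "z", "0"),
    (3, "store", "x", "x + 1"),
    (4, "store", "y", "y + 1"),
    (5, "store", "z", "1"),
]
T2_PATH = [
    (7, "assume", None, "z == 1"),      # branch taken at line 7
    (8, "assume", None, "x + 1 != y"),  # the assertion at line 8 fails
]


def symbolize(expr, line):
    """Replace every shared-variable read with a fresh symbol R_var^line."""
    def to_symbol(match):
        name = match.group(1)
        return f"R_{name}^{line}" if name in SHARED else name
    return re.sub(r"\b([a-z])\b", to_symbol, expr)


def symbolic_trace(path):
    """Emit one symbolic constraint per statement on the path."""
    trace = []
    for line, kind, target, expr in path:
        rhs = symbolize(expr, line)
        trace.append(f"W_{target}^{line} = {rhs}" if kind == "store" else rhs)
    return trace


print(symbolic_trace(T1_PATH))
print(symbolic_trace(T2_PATH))
```

The read symbols stay unconstrained at this stage; which write each one observes is decided later by the read-write constraints.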

Example, Step 3: computing the global failure schedule. CLAP[PLDI’13]: reason about dependencies of memory accesses.
Init: x=1, y=2
T1: 1: T2.start() 2: z=0 3: x++ 4: y++ 5: z=1 6: T2.join()
T2: 7: if (z==1) 8: assert(x+1==y) ✗
An order variable O represents the position of a statement in the global order over T1 and T2, e.g., $O_2 < O_3$ means that 2: z=0 happens before 3: x++. 24

Example, Step 3: computing the global failure schedule. CLAP[PLDI’13]: reason about dependencies of memory accesses.
Init: x=1, y=2
T1: 1: T2.start() 2: z=0 3: x++ 4: y++ 5: z=1 6: T2.join()
T2: 7: if (z==1) 8: assert(x+1==y) ✗
Read-Write Constraints (match a read to a write):
$(R_z^7 = 0 \land O_7 < O_2) \lor (R_z^7 = W_z^5 \land O_5 < O_7 \land (O_2 < O_5 \lor O_7 < O_2))$
Memory Order Constraints (one column for SC, one for PSO): SC keeps the statements of each thread in program order, $O_1 < O_2 < O_3 < \dots < O_6$ and $O_7 < O_8$; the PSO column keeps a subset of these, including $O_1 < O_2$, $O_5 < O_6$ and $O_7 < O_8$. [The per-access read and write ordering terms of both columns are not recoverable from the extracted text.]
Path Constraints: $R_z^7 = 1$
Failure Constraints: $R_x^8 + 1 \neq R_y^8$ 25

Example, Step 3: computing the global failure schedule. The same program and constraints, with happens-before (HB) and reads-from (rf) edges marked on the example. 26
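The "match a read to a write" encoding follows a regular pattern: the read takes its value from one write that precedes it, and every other write to the same variable falls either before that write or after the read. Below is a minimal sketch of a generator for that pattern, in the style of CLAP's encoding. The function name and the plain-text formula syntax are illustrative, and feeding the result to an SMT solver is left out.

```python
def read_write_constraint(read, writes, initial=None):
    """Build the disjunction matching one read to a candidate write.

    read    -- (variable, line) of the read
    writes  -- lines of all writes to the same variable
    initial -- optional initial value, seen if the read precedes every write
    """
    var, r = read
    disjuncts = []
    if initial is not None:
        before_all = " and ".join(f"O_{r} < O_{w}" for w in writes)
        disjuncts.append(f"(R_{var}^{r} = {initial} and {before_all})")
    for w in writes:
        parts = [f"R_{var}^{r} = W_{var}^{w}", f"O_{w} < O_{r}"]
        for other in writes:
            if other != w:
                # No other write may land between the chosen write and the read.
                parts.append(f"(O_{other} < O_{w} or O_{r} < O_{other})")
        disjuncts.append("(" + " and ".join(parts) + ")")
    return " or ".join(disjuncts)


# The read of z at line 7, with the writes at lines 2 and 5.
print(read_write_constraint(("z", 7), [2, 5]))
```

For the read at line 7 this yields one disjunct per write; the disjunct for the write at line 5 is `R_z^7 = W_z^5 and O_5 < O_7 and (O_2 < O_5 or O_7 < O_2)`.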
