Automated Repair of Concurrency Bugs
Ben Liblit with Guoliang Jin and Shan Lu
We need reliable software
2
People's daily lives now depend on reliable software
Software companies spend lots of resources on debugging
More than 50% of effort
Concurrency bugs hurt
3
Multi-threaded program
[Figure: a multicore chip with four cores (core1 to core4), each with its own cache and running one thread (thread1 to thread4), all connected to shared memory.]
4
Huge Interleaving space
An example concurrency bug
Thread 1:
    if (ptr != NULL) {
        ptr->field = 1;
    }
Thread 2:
    ptr = NULL;
The interleaving space
5
Bad interleavings
Previous research focuses
Thread 1:
    if (ptr != NULL) {
        ptr->field = 1;
    }
Thread 2:
    ptr = NULL;
Bad interleaving: Thread 2 runs ptr = NULL after Thread 1's check but before its dereference, causing a segmentation fault.
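The interleaving space of the example above is small enough to enumerate by hand. The following is a minimal sketch, not part of the original slides: it models the pointer as a flag and replays each legal schedule of the three operations, using the labels p (the NULL check), c (the dereference) and r (the remote assignment) that the deck introduces later. The names replay and count_bad_interleavings are hypothetical.

```c
#include <stddef.h>

/* Outcome of replaying one interleaving of the slide's example. */
enum outcome { OK = 0, SEGFAULT = 1 };

/*
 * Replay one schedule of the three operations:
 *   'p' : Thread 1 checks   ptr != NULL
 *   'c' : Thread 1 executes ptr->field = 1 (only if the check passed)
 *   'r' : Thread 2 executes ptr = NULL
 * The pointer is modelled as a flag, so the crash is reported
 * instead of actually dereferencing NULL.
 */
enum outcome replay(const char *schedule)
{
    int ptr_is_null = 0;    /* ptr starts out valid */
    int check_passed = 0;

    for (const char *op = schedule; *op != '\0'; op++) {
        switch (*op) {
        case 'p':
            check_passed = !ptr_is_null;
            break;
        case 'c':
            if (check_passed && ptr_is_null)
                return SEGFAULT;    /* dereference of NULL */
            break;
        case 'r':
            ptr_is_null = 1;
            break;
        }
    }
    return OK;
}

/* Count failing schedules among all legal interleavings (p before c). */
int count_bad_interleavings(void)
{
    static const char *const schedules[] = { "rpc", "prc", "pcr" };
    int bad = 0;

    for (size_t i = 0; i < sizeof schedules / sizeof schedules[0]; i++)
        bad += (replay(schedules[i]) == SEGFAULT);
    return bad;
}
```

Only the schedule in which r falls between p and c fails; real programs have far more operations per thread, which is why the space of interleavings becomes huge.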
Bug fixing
6
*SIGPLAN: “one of the first papers to attack the problem of automated bug fixing”
Automated fixing is difficult
7
Description:
Symptom, triggering condition, …
Patch:
Correctness, performance, simplicity
as long as the buggy interleaving does not occur
Automated concurrency-bug fixing?
8
Description:
Symptom, triggering condition, …
Patch:
Correctness, performance, simplicity
Description:
Interleavings that lead to software failure
9
Patch:
Correctness, performance, simplicity
atomicity violation detectors (report p, c, r)
Park [ASPLOS’09], Flanagan [POPL’04], Lu [ASPLOS’06], Chew [EuroSys’10]
order violation detectors (report A, B)
Zhang [ASPLOS’10], Lucia [MICRO’09], Yu [ISCA’09], Gao [ASPLOS’11]
data race detectors (report I1, I2)
Sen [PLDI’08], Savage [TOCS’97], Yu [SOSP’05], Erickson [OSDI’10], Kasikci [ASPLOS’10]
abnormal data flow detectors (report Wb, R, Wg)
Zhang [ASPLOS’11], Shi [OOPSLA’10]
How to get a general solution that generates good patches?
[Figure: candidate patches (mutual exclusion or order) are built into patched binaries; testing selects binaries, which are merged into the final patched binary.]
10
Description:
Interleavings that lead to software failure
Patch:
Correctness, performance, simplicity
CFix stages: fix-strategy design, synchronization enforcement, patch testing & selection, patch merging, run-time support
Inputs: source code, bug reports
Contributions
automated fixing for non-deadlock concurrency bugs
mutual exclusion and order relationship
a set of techniques to automate the whole bug-fixing process: CFix
11
CFix: fix-strategy design
Challenges:
12
Two types of synchronization relationships
13
Mutual exclusion
Order relationship
Fix-strategy for atomicity-violation detectors example 1
Thread 1:
    if (ptr != NULL) {
        ptr->field = 1;
    }
Thread 2:
    ptr = NULL;
14
Fix-strategy for atomicity-violation detectors example 2
Thread 1:
    ptr->field = 1;
    ptr->field = 1;
Thread 2:
    ptr = NULL;
15
CFix: fix-strategy design
Challenges:
Solution:
mutual exclusion &
enforcement
16
Fix-strategies
AV detector: p, c, r
OV detector: A, B
Race detector: I1, I2
DU detector: Wb, R, Wg
17
CFix: synchronization enforcement
Challenges:
and simplicity
Solution:
Mutual exclusion enforcement: AFix [PLDI’11]
Order relationship enforcement: OFix [OSDI’12]
18
exclusive with r
Mutual exclusion relationship
19
Thread 1:
    if (ptr != NULL) {     // p
        ptr->field = 1;    // c
    }
Thread 2:
    ptr = NULL;            // r
Mutual exclusion enforcement: AFix
20
Put p and c into a critical section: naïve
21
Put p and c into a critical section: AFix
22
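As a rough illustration of the mutual exclusion relationship, the sketch below places the p..c region and the remote access r under one lock for the running example. It is a hand-written approximation of the idea, not AFix's actual generated patch; the lock name afix_lock, the struct obj type and the driver run_patched_example are assumptions made for this example.

```c
#include <pthread.h>
#include <stddef.h>

struct obj { int field; };

static struct obj the_obj;
static struct obj *ptr = &the_obj;

/* Hypothetical lock introduced by the patch; the name is illustrative. */
static pthread_mutex_t afix_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread 1: the p..c region runs inside the critical section. */
static void *thread1(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&afix_lock);
    if (ptr != NULL) {      /* p */
        ptr->field = 1;     /* c */
    }
    pthread_mutex_unlock(&afix_lock);
    return NULL;
}

/* Thread 2: r takes the same lock, so it cannot run between p and c. */
static void *thread2(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&afix_lock);
    ptr = NULL;             /* r */
    pthread_mutex_unlock(&afix_lock);
    return NULL;
}

/* Run both threads once; returns 0 if both were created and joined. */
int run_patched_example(void)
{
    pthread_t t1, t2;

    if (pthread_create(&t1, NULL, thread1, NULL) != 0)
        return -1;
    if (pthread_create(&t2, NULL, thread2, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```

With the lock in place, r can run before p or after c but never in between, so the failing interleaving is excluded while the other two remain possible.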
Subtle details
23
Order relationship
24
[Figure: order-relationship examples with operations labeled initialization, read, use, destroy]
Order relationship: two sub-types
allA-B: every instance of A (A1 … An) executes before B
firstA-B: at least one instance of A executes before B
25
OFix allA-B enforcement
26
OFix allA-B enforcement: A side
How to identify the last A instance in one thread
. . .;
for (. . .)
    . . . ; // A
. . .;
27
OFix allA-B enforcement: A side
How to identify the last thread that executes A
void main() {
    for (. . .)
        thread_create(thr_main);
    . . .;
}
void ofix_signal() {
    mutex_lock(L);
    if (/* counter for signal threads */ == 0)
        cond_broadcast(con);
    mutex_unlock(L);
}
void thr_main() {
    for (. . .)
        . . . ; // A
    . . .;
}
Counter for signal threads: initialized to 1 and incremented (++) at each thread_create
28
OFix allA-B enforcement: B side
B
void ofix_wait() {
    mutex_lock(L);
    if (/* counter for signal threads */ != 0)
        cond_timedwait(con, L, t);
    mutex_unlock(L);
}
29
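The A-side and B-side fragments can be combined into one runnable sketch of allA-B enforcement. This is an approximation under stated assumptions, not OFix's actual code: the variable name counter, the helper ofix_register (standing in for the "++" at thread_create), the decrement inside ofix_signal, and the demo driver run_demo are all assumptions. It also loops around the timed wait to tolerate spurious wakeups, whereas the slide shows a single if.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

static pthread_mutex_t L = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t con = PTHREAD_COND_INITIALIZER;
static int counter = 1;       /* assumed name; 1 stands for the creating thread */
static atomic_int a_count;    /* demo only: number of A instances executed */

/* Assumed helper: the "++" performed at each thread_create. */
static void ofix_register(void)
{
    pthread_mutex_lock(&L);
    counter++;
    pthread_mutex_unlock(&L);
}

/* A side: called once by each thread when it can execute no more A. */
static void ofix_signal(void)
{
    pthread_mutex_lock(&L);
    if (--counter == 0)
        pthread_cond_broadcast(&con);
    pthread_mutex_unlock(&L);
}

/* B side: wait until the counter is 0 or timeout_sec elapses.
 * Returns 0 if all A instances are done, nonzero otherwise. */
static int ofix_wait(int timeout_sec)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_sec;

    pthread_mutex_lock(&L);
    int rc = 0;
    while (counter != 0 && rc == 0)   /* loop guards against spurious wakeups */
        rc = pthread_cond_timedwait(&con, &L, &deadline);
    int done = (counter == 0);
    pthread_mutex_unlock(&L);
    return done ? 0 : rc;
}

static void *thr_main(void *arg)
{
    (void)arg;
    for (int i = 0; i < 3; i++)
        atomic_fetch_add(&a_count, 1);   /* A */
    ofix_signal();                       /* this thread runs no more A */
    return NULL;
}

/* Returns the number of A instances that B observes, or -1 on timeout. */
int run_demo(void)
{
    pthread_t t[2];
    int started = 0;

    for (int i = 0; i < 2; i++) {
        ofix_register();
        if (pthread_create(&t[started], NULL, thr_main, NULL) != 0) {
            ofix_signal();               /* undo the registration */
            break;
        }
        started++;
    }
    ofix_signal();                       /* creator starts no more A threads */
    int rc = ofix_wait(5);
    int seen = atomic_load(&a_count);    /* B */
    for (int i = 0; i < started; i++)
        pthread_join(t[i], NULL);
    return rc == 0 ? seen : -1;
}
```

Starting the counter at 1 keeps it from reaching 0 while the creating thread may still spawn threads that execute A; the timed wait keeps B from blocking forever if a signal never arrives.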
OFix firstA-B
B A
30
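For firstA-B, B only needs to wait for one instance of A. A minimal sketch under assumptions follows: the flag a_executed and the function names ofix_first_signal and ofix_first_wait are hypothetical, and the slides do not show how OFix implements this case.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

static pthread_mutex_t L = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t con = PTHREAD_COND_INITIALIZER;
static int a_executed;    /* assumed flag: set once the first A has run */

/* A side: call right after each A instance; only the first call matters. */
void ofix_first_signal(void)
{
    pthread_mutex_lock(&L);
    if (!a_executed) {
        a_executed = 1;
        pthread_cond_broadcast(&con);
    }
    pthread_mutex_unlock(&L);
}

/* B side: wait until some A has executed or timeout_sec elapses.
 * Returns 0 if an A instance has executed, nonzero otherwise. */
int ofix_first_wait(int timeout_sec)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_sec;

    pthread_mutex_lock(&L);
    int rc = 0;
    while (!a_executed && rc == 0)    /* loop guards against spurious wakeups */
        rc = pthread_cond_timedwait(&con, &L, &deadline);
    int done = a_executed;
    pthread_mutex_unlock(&L);
    return done ? 0 : rc;
}
```

If A never executes, the wait ends at the deadline instead of hanging, which is the situation a time-out is meant to cover.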
CFix: patch testing & selection
Challenge:
testing
Solution:
31
Patch testing principles
32
Run once without external perturbation
Thread 1:
    ptr->field = 1;
    ptr->field = 1;
Thread 2:
    ptr = NULL;
33
Implicit bad patch
a Mutual Exclusion b c Order Relationships
34
Challenge:
mistake usually leads to multiple bug reports
Solution:
CFix: patch merging
35
An example with multiple reports
void buf_write() {
    int tmp = buf_len + str_len;
    if (tmp > MAX)
        return;
    memcpy(buf[buf_len], str, str_len);
    buf_len = tmp;
}
Reported operations: p1, c1, r1 (report 1); p2, c2, r2 (report 2)
36
Related patch: a case of AFix
Unmerged patches:
    lock(L1); p1; lock(L2); p2; c1; unlock(L1); c2; unlock(L2);
    lock(L1); r1; unlock(L1);
    lock(L2); r2; unlock(L2);
Merged patch:
    lock(L1); p1; p2; c1; c2; unlock(L1);
    lock(L1); r2; unlock(L1);
37
The merged patch for the example
void buf_write() {
    int tmp = buf_len + str_len;          // p1
    if (tmp > MAX) {
        return;
    }
    memcpy(buf[buf_len], str, str_len);   // c1, p2
    buf_len = tmp;                        // c2, r1, r2
}
38
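A merged patch for buf_write might look roughly like the sketch below, in which a single lock L1 covers every reported operation and is released on the early-return path as well. This is an illustration under assumptions rather than CFix's actual output: buf is assumed to be a char array (hence &buf[buf_len]), the function takes str and str_len as parameters and returns a status code, and MAX, writer and run_writers are chosen only for the demo.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX 64

static char buf[MAX];
static size_t buf_len;
static pthread_mutex_t L1 = PTHREAD_MUTEX_INITIALIZER;   /* the one merged lock */

/* Append str to buf; returns 0 on success, -1 if it does not fit. */
int buf_write(const char *str, size_t str_len)
{
    pthread_mutex_lock(&L1);
    size_t tmp = buf_len + str_len;           /* p1 */
    if (str_len > MAX || tmp > MAX) {
        pthread_mutex_unlock(&L1);            /* unlock on the early exit too */
        return -1;
    }
    memcpy(&buf[buf_len], str, str_len);      /* c1, p2 */
    buf_len = tmp;                            /* c2, r1, r2 */
    pthread_mutex_unlock(&L1);
    return 0;
}

static void *writer(void *arg)
{
    (void)arg;
    for (int i = 0; i < 4; i++)
        buf_write("abcd", 4);
    return NULL;
}

/* Two concurrent writers; returns the final buffer length. */
size_t run_writers(void)
{
    pthread_t t[2];
    int started = 0;

    for (int i = 0; i < 2; i++)
        if (pthread_create(&t[started], NULL, writer, NULL) == 0)
            started++;
    for (int i = 0; i < started; i++)
        pthread_join(t[i], NULL);
    return buf_len;
}
```

Because the remote operations r1 and r2 are the same statement executed by another call to buf_write, one lock around the function body serializes both reports at once, which is simpler than nesting L1 and L2.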
is a deadlock underlying time-out
for production runs
CFix: run-time support
39
Evaluation methodology
Applications: PBZIP2, x264, FFT, HTTrack, Mozilla-1, transmission, ZSNES, Apache, MySQL-1, MySQL-2, Mozilla-2, Cherokee, Mozilla-3
Detectors: AV Detector, OV Detector, RA Detector, DU Detector
40
Evaluation result
Columns: APP., AV Detector, OV Detector, RA Detector, DU Detector, # of Ops
APP.            # of Ops
PBZIP2          5
x264            7
FFT             5
HTTrack         2
Mozilla-1       2
transmission    2
ZSNES           3
Apache          3
MySQL-1         5
MySQL-2         9
Mozilla-2       3
Cherokee        2
Mozilla-3       5
41
Comparison with manual patches
APP.            Manual Patch
PBZIP2          Order with pthread_join
x264            Order with pthread_join
FFT             Order with pthread_join
HTTrack         N/A
Mozilla-1       Order with lock
transmission    Move before pthread_create
ZSNES           Move before pthread_create
Apache          New lock in structure
MySQL-1         Existing lock and variable
MySQL-2         Existing lock
Mozilla-2       Make the variable local
Cherokee        Existing lock
Mozilla-3       Customized synchronization
similar correctness and performance
integrate better with existing code
42
Broader context and related work
Concurrency bug detection: atomicity and races; record/replay; production runs; special considerations for repair
Correct by construction: synthesis and sketching; derivation from high-level constructs; global static analysis
Hot-patching at run time: apply developer-provided fixes; execution steering
43
Broader context and related work
Concurrency bug detection
[Flanagan, POPL’04], CCI [Jin, OOPSLA’10], AVIO [Lu, ASPLOS’06], Vaziri [POPL’06], ConTeGe [PLDI’12; ICSE’13; ISSTA’14]
PLDI’10], Choi [PLDI’02], FastTrack [Flanagan, PLDI’09], CCI [Jin, OOPSLA’10], Eraser [Savage, TOCS’97], ConTeGe [PLDI’12; ICSE’13; ISSTA’14]
for omissions!
44
Broader context and related work
Correct by construction
verification
program obey specification
world code
48
Broader context and related work
Correct by construction
futures, etc.
automatically
49
Broader context and related work
Correct by construction
synchronization analyses
boundaries that guarantee conflict-serializability
blocks
deadlocks
50
Broader context and related work
Hot-patching at run time
patches
considerations apply?
51
Broader context and related work
Hot-patching at run time
[Alglave, CAV’14], Kivati [Chew, EuroSys’10], Atom-Aid [Lucia, MICRO’09], Yu [ISCA’09; MICRO’10]
[Křena, PADTAD’07; Letko, PADTAD’08], Ratanaworabhan [PPoPP’09]
Dimmunix [Jula, OSDI’08]
[OSDI, 2010], dOS [Bergan, OSDI’10], Grace [Berger, OOPSLA’09], Cui [OSDI’10; SOSP’11], Dthreads [Liu, SOSP’11], Kendo [Olszewski, ASPLOS’09]
52
CFix summary
54
55