Institute for Communication Technologies and Embedded Systems
Automatic Exploration of SW Concurrency Bugs through Deterministic Behavior Control
Luis Gabriel Murillo, Rainer Leupers
MAD Workshop 14.11.13, Munich, Germany
Motivation: MPSoC
[Figure: MPSoC block diagram. Labels: CPU 1 to CPU n, ASIPs, DSPs, L1 caches, System RAM, System ROM, bus, NoC router, and several debuggers.]
Task 1 and Task 2 over time:
Task 1:
21 a = 2
22 unlock(x)
24 print(a)
25 ...
Task 2:
84 ...
85 lock(x)
86 ...
87 ...
88 a = 1
89 unlock(x)
Bugs appear due to improper synchronization.
Probe effect!
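The two-task example can be explored without running real threads. The following is a minimal sketch, not the tool presented in the talk: it enumerates every interleaving of the operations shown for Task 1 and Task 2 and records which values `print(a)` can observe. The operation encoding, the function names, the initial value `a = 0`, and the assumption that Task 1 already holds lock `x` at the start are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { OP_WRITE, OP_LOCK, OP_UNLOCK, OP_PRINT } OpKind;
typedef struct { OpKind kind; int value; } Op;

/* Task 1: a = 2; unlock(x); print(a). Assumed to hold x at the start. */
static const Op task1_ops[] = { {OP_WRITE, 2}, {OP_UNLOCK, 0}, {OP_PRINT, 0} };
/* Task 2: lock(x); a = 1; unlock(x). */
static const Op task2_ops[] = { {OP_LOCK, 0}, {OP_WRITE, 1}, {OP_UNLOCK, 0} };
enum { N1 = 3, N2 = 3 };

typedef struct {
    bool printed[3];  /* printed[v] is true if some schedule prints v */
    int schedules;    /* number of complete, feasible schedules */
} Outcomes;

static void explore(int i1, int i2, int a, int owner, Outcomes *out);

/* Execute one operation of `task`, then continue the exploration. */
static void step(const Op *op, int task, int i1, int i2, int a, int owner,
                 Outcomes *out) {
    switch (op->kind) {
    case OP_WRITE:  a = op->value; break;
    case OP_LOCK:
        if (owner != 0) return;   /* lock is taken: this task cannot run now */
        owner = task;
        break;
    case OP_UNLOCK: owner = 0; break;
    case OP_PRINT:
        assert(a >= 0 && a < 3);
        out->printed[a] = true;
        break;
    }
    explore(i1, i2, a, owner, out);
}

/* Depth-first enumeration: at each point try to advance either task. */
static void explore(int i1, int i2, int a, int owner, Outcomes *out) {
    if (i1 == N1 && i2 == N2) { out->schedules++; return; }
    if (i1 < N1) step(&task1_ops[i1], 1, i1 + 1, i2, a, owner, out);
    if (i2 < N2) step(&task2_ops[i2], 2, i1, i2 + 1, a, owner, out);
}

Outcomes explore_all(void) {
    Outcomes out = { {false, false, false}, 0 };
    explore(0, 0, 0, 1, &out);    /* a = 0, lock x owned by Task 1 */
    return out;
}
```

Calling `explore_all()` from a test or a `main` shows that both `1` and `2` are reachable outputs, because `print(a)` runs after `unlock(x)` and is therefore not ordered against Task 2's write.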
[Figure: debugging workflow. Labels: Platform, Parallel Application, Dynamic Monitoring, Analysis, Replay & Iterate, User Intervention, Automation.]
Source excerpt:
9. ...
11. print(a);
14. a=1;
Application:
... void *task1(void *) { print(a); ...
void *task2(void *) { a=1; ...
Diagnostic: Synchronization Conflict
Time: 20ms
Location: main.c:24 and main.c:88
Concurrency-related event
[Figure: Task 1 and Task 2 on the Platform, producing EVENT 1 to EVENT 5, …]
Parallel SW: all synchronization, task management, message passing, shared memory…
Virtual Platform
                     AVIO              Chess             Portend       This work
                     (Lu et al. ’06)   (Microsoft ’08)   (EPFL ’12)
Target system        x86               Windows           LLVM          Virtual Platform
Target application   C(++)             .NET              Pthread       SW + HW
Further comparison rows: Non-intrusive; Instrumentation; Wrapper; Symbolic execution; Deterministic replay; Deterministic program exploration; Extensibility
Platform
Events of Task 1 and Task 2: Lock GET (x), Lock RELEASE (x), …, READ (a), Lock RELEASE (x), WRITE (a)
Source:
19 task1(){
20 ...
21 a = 2
22 unlock(x)
23 print(a)
24 ...}
83 task2(){
84 a = 1
Main
5 main(){
6 ...
7 new(task1)
8 new(task2)}
Inputs: DWARF, ELF, OS/Lib Awareness, Debugger BE
Abstraction
[Figure: events abstracted across the core, OS and application levels. Labels: BP on instr. (write inst.); BP on instr., Func call, thread create; New task, Func call, Get lock; time, event, context; Visible, Shadowed.]
[Figure labels: Platform, Events, Analysis, Replay]
Identify synchronization (“happens before” analysis)
Identify “always concurrent” events
Identify event dependencies
Identify conflicts
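The analysis steps above can be illustrated with vector clocks, a standard way to decide "happens before" on an event trace. This is a sketch under stated assumptions, not the analysis implemented in the presented framework: the `Event` layout, the fixed bounds, and the function names are hypothetical, and only lock release/acquire pairs are treated as synchronization.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_TASKS 4
#define MAX_LOCKS 4

typedef enum { EV_READ, EV_WRITE, EV_ACQUIRE, EV_RELEASE } EvKind;

typedef struct {
    EvKind kind;
    int task;            /* 0 .. MAX_TASKS-1 */
    int obj;             /* variable id for accesses, lock id for lock events */
    int vc[MAX_TASKS];   /* vector clock, filled in by stamp_trace() */
} Event;

static void vc_join(int *dst, const int *src) {
    for (int i = 0; i < MAX_TASKS; i++)
        if (src[i] > dst[i]) dst[i] = src[i];
}

/* Walk the trace in recorded order and stamp every event.
 * A release publishes the task's clock on the lock; the next
 * acquire of that lock inherits it (synchronization edge). */
void stamp_trace(Event *trace, int n) {
    int task_vc[MAX_TASKS][MAX_TASKS] = {{0}};
    int lock_vc[MAX_LOCKS][MAX_TASKS] = {{0}};
    for (int i = 0; i < n; i++) {
        Event *e = &trace[i];
        assert(e->task >= 0 && e->task < MAX_TASKS);
        bool is_lock_ev = e->kind == EV_ACQUIRE || e->kind == EV_RELEASE;
        if (is_lock_ev) assert(e->obj >= 0 && e->obj < MAX_LOCKS);
        if (e->kind == EV_ACQUIRE) vc_join(task_vc[e->task], lock_vc[e->obj]);
        task_vc[e->task][e->task]++;
        memcpy(e->vc, task_vc[e->task], sizeof e->vc);
        if (e->kind == EV_RELEASE)
            memcpy(lock_vc[e->obj], task_vc[e->task], sizeof lock_vc[e->obj]);
    }
}

/* For two distinct events: a happens before b iff b has seen a's own tick. */
static bool happens_before(const Event *a, const Event *b) {
    return a->vc[a->task] <= b->vc[a->task];
}

static bool is_access(const Event *e) {
    return e->kind == EV_READ || e->kind == EV_WRITE;
}

/* Count unordered access pairs on the same variable from different
 * tasks where at least one access is a write. */
int count_conflicts(const Event *trace, int n) {
    int conflicts = 0;
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            const Event *a = &trace[i], *b = &trace[j];
            if (!is_access(a) || !is_access(b)) continue;
            if (a->obj != b->obj || a->task == b->task) continue;
            if (a->kind != EV_WRITE && b->kind != EV_WRITE) continue;
            if (!happens_before(a, b) && !happens_before(b, a)) conflicts++;
        }
    }
    return conflicts;
}
```

For a trace in which one task writes `a` under lock `x`, releases `x` and later reads `a`, while a second task acquires `x`, writes `a` and releases `x`, the two writes are ordered through the lock, but the second task's write and the first task's read are concurrent and are reported as one conflict.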
Behavior Control
[Figure: VP running a full-system simulation of the OS (e.g. Linux) and the Application with Task 1, Task 2, … Task n over time t. Labels: Debug API, Controllers, Monitors, Output, Event Trace.]
Iterate to explore trace transformations.
E.g. emulate call to Linux scheduler.
Results: ARM Versatile Express

Event-based Framework     Retargetable BE   High-level Monitors
Adaptation Effort         ~1 man-month      ~2 man-days

Monitoring and Analysis   Synthetic         SPLASH-2
Total events (no SM)      ~500              600 – 123k
Total events              ~2500             3000 – 1.9M
Overhead                  ~3x               ~3x (WC: 60x)

Replay                    Synthetic         SPLASH-2
Constraints               ~50               500 – 3200
item0: previous modify (6) at 1405
(6,kNone).kOnVirtWrite(0) @00072014 @000199dc: slave1.C:517
===
item1: current visit (4) at 19913
(4,kNone).kOnVirtRead(0) @00072014 @000199bc: slave1.C:517
Filtered conflicts
        Total   Sync     Mutex   Conflict
Count   284     260      23      1
rel.            91.5 %   8.1 %   0.4 %
516: /*LOCK(locks->psibilock)*/
517: global->psibi = global->psibi + psibipriv;
518: /*UNLOCK(locks->psibilock)*/
src/RandomSwapBugFinder.cc:299 : bug occurs when events happen in this order:
first event: 0xc170f508 (4,kNone).kOnVirtRead(0) @00072014 @000199bc: slave1.C:517
second event: 0xc1702d48 (6,kNone).kOnVirtWrite(0) @00072014 @000199dc: slave1.C:517
The bug was found after one iteration.
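The report above describes a bug that shows up once two conflicting events occur in the opposite order. One way to derive such a schedule from a recorded trace is sketched below. It is an illustrative assumption, not the code behind `RandomSwapBugFinder.cc`: the `TraceEvent` type, `swap_order`, and `MAX_SPAN` are hypothetical, and the sketch ignores lock constraints that a real replay would have to honor.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_SPAN 64   /* hypothetical upper bound on the reordered window */

typedef struct {
    int task;  /* task that produced the event */
    int id;    /* opaque event identifier */
} TraceEvent;

/* Rewrite trace[i..j] so that event j runs before event i.
 * All events of event j's task inside the window are moved to the front,
 * which keeps the program order of every task intact.
 * Returns false if the request is invalid. */
bool swap_order(TraceEvent *trace, int n, int i, int j) {
    assert(trace != NULL);
    if (i < 0 || j >= n || i >= j || j - i + 1 > MAX_SPAN)
        return false;
    if (trace[i].task == trace[j].task)
        return false;   /* order within one task must not change */

    TraceEvent window[MAX_SPAN];
    int k = 0;
    const int later_task = trace[j].task;

    /* First the events of the task that should now run earlier ... */
    for (int p = i; p <= j; p++)
        if (trace[p].task == later_task) window[k++] = trace[p];
    /* ... then every other event, in its original relative order. */
    for (int p = i; p <= j; p++)
        if (trace[p].task != later_task) window[k++] = trace[p];

    memcpy(&trace[i], window, (size_t)k * sizeof window[0]);
    return true;
}
```

The transformed trace would then serve as the set of ordering constraints for the next replay iteration; if the program misbehaves under that order, the swapped pair identifies the bug.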