Cloud9
Parallel Symbolic Execution for Automated Real-World Software Testing
Stefan Bucur, Vlad Ureche, Cristian Zamfir, George Candea
School of Computer and Communication Sciences
Automated Software Testing
[Figure: automated techniques and industrial SW testing. Labels: Symbolic Execution, Model Checking, Manual Testing, Static Analysis, Fuzzing, Scalability, Applicability, Usability]
Cloud9 - The Big Picture
Automated Systems Testing
KLEE [*]
Symbolic Execution
Memcached, GNU Coreutils, Apache
[*] C. Cadar, D. Dunbar, D. Engler, “KLEE: Unassisted and automatic generation [...] programs”
Symbolic Execution in a Nutshell
Concrete input: [C9 A0 ... ]
Branch conditions evaluated along the path: pkt->magic != 0xC9, then pkt->cmd == GET

void proc_pkt(packet_t* pkt) {
  if (pkt->magic != 0xC9) {
    err(pkt);
    return;
  }
  if (pkt->cmd == GET) {
    ...
  } else if ...
  ...
}
Symbolic Execution in a Nutshell
Symbolic input: λ
Execution tree for proc_pkt:
  λ.magic != 0xC9
  λ.magic == 0xC9
    λ.cmd == GET
    λ.cmd != GET
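The forking behaviour on this slide can be illustrated with a toy sketch. It is not how KLEE or Cloud9 work internally: the constraint representation, the brute-force `solve_field` helper, and the value chosen for `GET` are all assumptions made for illustration. Each branch of `proc_pkt` forks the path, and a stand-in "solver" then picks one concrete byte per field that satisfies the path's constraints.

```c
#include <assert.h>
#include <stdint.h>

#define MAGIC 0xC9
#define GET   0xA0  /* assumed opcode value; the slides do not define GET */

enum { F_MAGIC = 0, F_CMD = 1 };

/* One branch decision: symbolic byte `field` is (equal ? == : !=) `value`. */
typedef struct { int field; int equal; uint8_t value; } constraint_t;

/* A path through proc_pkt: at most two branch decisions in this toy. */
typedef struct { constraint_t c[2]; int n; } path_t;

/* Toy stand-in for a constraint solver: brute-force one byte of the input. */
static int solve_field(const path_t *p, int field, uint8_t *out) {
    for (int v = 0; v < 256; v++) {
        int ok = 1;
        for (int i = 0; i < p->n; i++) {
            if (p->c[i].field != field) continue;
            int is_eq = ((uint8_t)v == p->c[i].value);
            if (is_eq != p->c[i].equal) ok = 0;
        }
        if (ok) { *out = (uint8_t)v; return 1; }
    }
    return 0; /* constraints on this field are unsatisfiable */
}

/* Fork at each branch of proc_pkt; returns the number of paths found. */
static int explore(path_t paths[], int capacity) {
    assert(capacity >= 3);
    int n = 0;
    /* First branch taken: err(pkt) and return. */
    paths[n].n = 1;
    paths[n].c[0] = (constraint_t){F_MAGIC, 0, MAGIC};
    n++;
    /* First branch not taken: fork again on pkt->cmd == GET. */
    for (int eq = 1; eq >= 0; eq--) {
        paths[n].n = 2;
        paths[n].c[0] = (constraint_t){F_MAGIC, 1, MAGIC};
        paths[n].c[1] = (constraint_t){F_CMD, eq, GET};
        n++;
    }
    return n;
}
```

Calling `explore` yields three paths, one per leaf of the tree above; solving each path's constraints gives one concrete test input per path.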
[Figure: execution tree rooted at λ; label: program size]
CPU bottleneck, memory exhaustion
Parallel Tree Exploration
Workers: W1, W2, W3
Key research problem: scalable parallel exploration
Linear Solution to Exponential Problem
[Plot: time to test vs. program size, with curves for 1, 2, 4, and 8 workers and a marked testing target]
Bring testing time down to practical values
Throw Hardware at the Problem
Scalability Challenges
Tree structure not known a priori
Static allocation
Anticipate allocation
Outline
Cloud9 Architecture
Global symbolic tree, split into W1’s local tree, W2’s local tree, and W3’s local tree
Each worker runs a local sequential symbolic execution engine (KLEE)
Candidate nodes, fence nodes [...] exploration
Load Balancing
Load balancer (LB) and workers W1, W2, W3
Hybrid distributed system: centralized reports, P2P work transfer
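A minimal sketch of the hybrid scheme, under stated assumptions: workers report their queue lengths to the LB, and the LB only decides who sends how much to whom, leaving the transfer itself to the two workers. The `transfer_t` type, the `balance_once` function, and the "move half the gap" policy are hypothetical and are not taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transfer order issued by the load balancer (LB). */
typedef struct { size_t src; size_t dst; unsigned jobs; } transfer_t;

/*
 * Workers report queue lengths to the LB (centralized reports). The LB picks
 * the most and least loaded workers and asks the former to send half of the
 * difference directly to the latter (P2P work transfer).
 * Returns 1 and fills *out if a transfer is worthwhile, 0 otherwise.
 */
static int balance_once(const unsigned queue_len[], size_t n_workers,
                        unsigned min_gap, transfer_t *out) {
    assert(n_workers > 0 && out != NULL);
    size_t hi = 0, lo = 0;
    for (size_t i = 1; i < n_workers; i++) {
        if (queue_len[i] > queue_len[hi]) hi = i;
        if (queue_len[i] < queue_len[lo]) lo = i;
    }
    unsigned gap = queue_len[hi] - queue_len[lo];
    if (gap < min_gap || gap < 2) return 0; /* balanced enough */
    out->src = hi;
    out->dst = lo;
    out->jobs = gap / 2;
    return 1;
}
```

With reported queue lengths of 100, 0, and 40 and a minimum gap of 10, this sketch would ask the first worker to send 50 jobs to the second. Keeping only small reports on the central path is the point of the design: the bulk data moves between workers.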
Work Transfer
Workers: W1, W2
Node labels: candidate, fence, virtual, materialized
Exploration disjointness + completeness
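One way to read the node labels on this slide is as a small state machine. The sketch below is an interpretation, not the Cloud9 implementation: the `worker_t` layout and the function names are hypothetical. The sender fences off a transferred candidate, the receiver first holds it as a virtual node, and it becomes explorable only once materialized, so at any time at most one worker may explore a given node.

```c
#include <assert.h>

/* Node labels taken from the slide; the state machine itself is an assumption. */
typedef enum { CANDIDATE, FENCE, VIRTUAL, MATERIALIZED } node_state_t;

#define MAX_NODES 16

/* A worker's view of the frontier: one state per global node id, or -1 if absent. */
typedef struct { int state[MAX_NODES]; } worker_t;

static void worker_init(worker_t *w) {
    for (int i = 0; i < MAX_NODES; i++) w->state[i] = -1;
}

/* Move one candidate from src to dst: src fences it off, dst holds it as virtual. */
static int transfer(worker_t *src, worker_t *dst, int node) {
    assert(node >= 0 && node < MAX_NODES);
    if (src->state[node] != CANDIDATE) return 0; /* only candidates can move */
    src->state[node] = FENCE;
    dst->state[node] = VIRTUAL;
    return 1;
}

/* Before exploring a virtual node, the receiver rebuilds its program state. */
static int materialize(worker_t *w, int node) {
    assert(node >= 0 && node < MAX_NODES);
    if (w->state[node] != VIRTUAL) return 0;
    w->state[node] = MATERIALIZED;
    return 1;
}

/* A worker may explore a node only if it holds it as candidate or materialized. */
static int may_explore(const worker_t *w, int node) {
    return w->state[node] == CANDIDATE || w->state[node] == MATERIALIZED;
}
```

In this reading, disjointness follows because a fenced node is never explored by its old owner, and completeness follows because every transferred candidate still exists on exactly one worker.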
Path-based Encoding
Example path bits: 1 1 1 1 1
<128 bits (16 bytes)
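A path-based encoding can be sketched as a bit string holding one branch decision per tree level, packed into a fixed 16-byte buffer. The `path_code_t` type, the helper names, and the use of 128 as a hard depth limit are assumptions made for this illustration; the slide only states that encoded paths stay under 128 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_DEPTH 128  /* 128 branch decisions fit in 16 bytes */

/* A tree node identified by the branch decisions on the way from the root. */
typedef struct { uint8_t bits[MAX_DEPTH / 8]; unsigned depth; } path_code_t;

static void path_init(path_code_t *p) { memset(p, 0, sizeof *p); }

/* Append one decision: 0 = first child, 1 = second child. Returns 0 if full. */
static int path_push(path_code_t *p, int taken) {
    assert(taken == 0 || taken == 1);
    if (p->depth >= MAX_DEPTH) return 0;
    if (taken) p->bits[p->depth / 8] |= (uint8_t)(1u << (p->depth % 8));
    p->depth++;
    return 1;
}

/* Read back the i-th decision, as a receiving worker would during replay. */
static int path_get(const path_code_t *p, unsigned i) {
    assert(i < p->depth);
    return (p->bits[i / 8] >> (i % 8)) & 1;
}
```

Sending decisions instead of full program states keeps a transferred job small; the receiver pays for it by replaying the decisions from the root to rebuild the state.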
Load Balancing in Practice
[Plot: work done [% of total instructions] vs. time [minutes], for LB stopped after 1 min, LB stopped after 4 min, and continuous load balancing]
Load balancing necessary to ensure scalability
Outline
Calls into the Environment

if (fork() == 0) {
  ...
  if ((res = recv(sock, buff, size, 0)) > 0) {
    pthread_mutex_lock(&mutex);
    memcpy(gBuff, buff, res);
    pthread_mutex_unlock(&mutex);
  }
  ...
} else {
  ...
  pid_t pid = wait(&stat);
  ...
}
Environment Model
Program under test calls into the environment (C library / OS), e.g. fork()
Environment: cannot directly execute symbolically
Model code: equivalent functionality, executable symbolically
Symbolic execution engine
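To make "model code" concrete, here is a minimal sketch of what a model of a byte stream (a pipe or socket buffer) might look like. It is an illustration only: `stream_model_t`, `model_write`, and `model_read` are hypothetical names, not part of Cloud9's POSIX model. The model is ordinary C that touches only program memory, which is what allows an engine to execute it with symbolic bytes in the buffer.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_BUF_SIZE 64

/* Hypothetical model of a byte stream (pipe or socket), kept in program memory. */
typedef struct {
    unsigned char data[MODEL_BUF_SIZE];
    size_t head;   /* index of the next byte to read */
    size_t count;  /* bytes currently buffered */
} stream_model_t;

/* Modeled write: copies into the buffer, returns bytes accepted. */
static size_t model_write(stream_model_t *s, const void *buf, size_t len) {
    assert(s != NULL && buf != NULL);
    const unsigned char *src = buf;
    size_t n = 0;
    while (n < len && s->count < MODEL_BUF_SIZE) {
        s->data[(s->head + s->count) % MODEL_BUF_SIZE] = src[n++];
        s->count++;
    }
    return n;
}

/* Modeled read: drains up to len bytes, returns bytes delivered. */
static size_t model_read(stream_model_t *s, void *buf, size_t len) {
    assert(s != NULL && buf != NULL);
    unsigned char *dst = buf;
    size_t n = 0;
    while (n < len && s->count > 0) {
        dst[n++] = s->data[s->head];
        s->head = (s->head + 1) % MODEL_BUF_SIZE;
        s->count--;
    }
    return n;
}
```

A real model would also need blocking behaviour and error cases; this sketch returns short counts instead, which keeps it free of any call the engine could not run.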
Starting Point
Symbolic Execution Engine
POSIX: Network (stubs), Files
Single-threaded isolated n[...]es, single-threaded utilities
POSIX Environment Model
Symbolic Execution Engine
POSIX: Network (TCP/UDP/UNIX), Files, Pipes, Threads (pthread_*), Processes, Signals
Message passing; servers and clients; multi-threaded programs; distributed systems; asynchronous events, IPC; single-threaded utilities
Key Changes in Symbolic Execution
Multithreading and Scheduling
Address Space Isolation
Symbolic Engine System Calls
[...] needed for threads/processes
thread_create, thread_terminate, process_fork, process_terminate, get_context, thread_preempt, thread_sleep, thread_notify, get_wait_list, make_shared
Outline
Testing Real-World Software
Memcached, GNU Coreutils, Apache
Time to Reach Target Coverage
[Plot for printf: time to achieve target coverage [minutes] vs. number of workers (1, 4, 8, 24, 48), one curve each for 60%, 70%, 80%, and 90% coverage]
Faster time-to-cover, higher coverage values
Increase in Code Coverage
[Plot: additional code covered [% of program LOC] vs. index of tested Coreutil (sorted by additional coverage); Coreutils suite, 12 workers, 10 min.]
Consistent code coverage increase
Exhaustive Exploration
[Plot: time to complete exhaustive test [hours] vs. number of workers (2, 4, 6, 12, 24, 48); memcached, 7.4×10^4 paths]
Scalability of exhaustive path exploration
Instruction Throughput
[Plot for memcached: useful work done [# of instructions] vs. number of workers (1, 4, 6, 12, 24, 48), measured at 4, 6, 8, and 10 minutes]
Linear scalability with number of workers
Experimental Setup
Symbolic state: client process and memcached/Apache/lighttpd
TCP stream carrying a symbolic cmd.
Execute the “whole world” symbolically
Symbolic Test Cases
[Slide fragments: symbolic test cases, scheduler]
Testing HTTP header extension

make_symbolic(hdrData);
// Append symbolic header to request
strcat(req, "X-NewExtension: ");
strcat(req, hdrData);
// Enable fault injection on socket
ioctl(ssock, SIO_FAULT_INJ, RD | WR);
// Symbolic stream fragmentation
ioctl(ssock, SIO_PKT_FRAGMENT, RD);
Conclusions