CSE507
Emina Torlak
emina@cs.washington.edu
courses.cs.washington.edu/courses/cse507/14au/
Computer-Aided Reasoning for Software
Symbolic Execution
Today
Last lecture
Bounded verification: forward VCG for finitized programs
Today
Symbolic execution
The spectrum of program validation tools
[Figure: tools plotted by confidence against cost (programmer effort, time, expertise)]
Verification
Static Analysis
Extended Static Checking
Ad-hoc Testing
Concolic Testing & Whitebox Fuzzing (e.g., SAGE, Pex, CUTE, DART)
Bounded Verification & Symbolic Execution (e.g., JPF, Klee)
Symbolic execution
1976: A system to generate test data and symbolically execute programs (Lori Clarke)
1976: Symbolic execution and program testing (James King)
2005-present: practical symbolic execution
Classic symbolic execution
Execute the program on symbolic values.
Symbolic state maps variables to symbolic values.
Path condition is a quantifier-free formula over the symbolic inputs that encodes all branch decisions taken so far.
All paths in the program form its execution tree, in which some paths are feasible and some are infeasible.

    def f(x, y):
        if x > y:
            x = x + y
            y = x - y
            x = x - y
            if x - y > 0:
                assert False
        return (x, y)

Execution tree (each node shows the branch condition taken, then the symbolic state):
    start: x ↦ X, y ↦ Y
        X ≤ Y: x ↦ X, y ↦ Y (feasible)
        X > Y: x ↦ X + Y, y ↦ Y
            true: x ↦ X + Y, y ↦ X
                true: x ↦ Y, y ↦ X
                    Y - X ≤ 0: x ↦ Y, y ↦ X (feasible)
                    Y - X > 0: x ↦ Y, y ↦ X (infeasible)
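The execution tree for f can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: symbolic values are restricted to linear forms a*X + b*Y + c, and the helper names (add, sub, feasible, explore_f) are hypothetical. The feasible function is a bounded brute-force search standing in for a real solver, so a result of None means only that no witness exists in the small box searched, not a proof of infeasibility.

```python
from itertools import product

# A symbolic value is a linear form a*X + b*Y + c, stored as the tuple (a, b, c).
X, Y = (1, 0, 0), (0, 1, 0)

def add(u, v):
    return tuple(p + q for p, q in zip(u, v))

def sub(u, v):
    return tuple(p - q for p, q in zip(u, v))

def evaluate(u, x, y):
    a, b, c = u
    return a * x + b * y + c

def feasible(pc, bound=3):
    """Stand-in for a solver: search a small box for inputs satisfying the PC.

    Each conjunct is (u, positive), meaning u > 0 if positive, else u <= 0.
    Returns a witness (x, y) or None.
    """
    for x, y in product(range(-bound, bound + 1), repeat=2):
        if all((evaluate(u, x, y) > 0) == positive for u, positive in pc):
            return (x, y)
    return None

def explore_f():
    """Enumerate the paths of f as (pc, state, outcome, witness) tuples."""
    paths = []
    state = {"x": X, "y": Y}
    # Outer branch not taken: X - Y <= 0.
    pc = [(sub(state["x"], state["y"]), False)]
    paths.append((pc, dict(state), "return", feasible(pc)))
    # Outer branch taken: X - Y > 0, followed by the three assignments.
    pc = [(sub(state["x"], state["y"]), True)]
    state["x"] = add(state["x"], state["y"])   # x -> X + Y
    state["y"] = sub(state["x"], state["y"])   # y -> X
    state["x"] = sub(state["x"], state["y"])   # x -> Y
    diff = sub(state["x"], state["y"])         # x - y -> Y - X
    for positive, outcome in ((False, "return"), (True, "assert fails")):
        branch_pc = pc + [(diff, positive)]
        paths.append((branch_pc, dict(state), outcome, feasible(branch_pc)))
    return paths
```

Calling explore_f() yields three paths. The two returning paths come with a concrete witness, and the path that reaches the assertion has none, which matches the feasible and infeasible labels in the tree.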
Classic symbolic execution: practical issues
Loops and recursion: infinite execution trees
Path explosion: exponentially many paths
Heap modeling: symbolic data structures and pointers
Solver limitations: dealing with complex PCs
Environment modeling: dealing with native / system / library calls
Loops and recursion
Dealing with infinite execution trees:
Path explosion
Achieving good coverage in the presence of exponentially many paths:
[Figure: symbolic execution, random testing, interleaved execution]
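To make the exponential growth concrete, the sketch below enumerates the path conditions of n sequential, independent if-statements. The function name path_conditions and the branch names c0, c1, ... are hypothetical, chosen only for illustration.

```python
from itertools import product

def path_conditions(n):
    """Enumerate the path conditions of n sequential, independent if-statements."""
    return [
        " ∧ ".join(f"c{i}" if taken else f"¬c{i}" for i, taken in enumerate(decisions))
        for decisions in product((True, False), repeat=n)
    ]
```

Every additional branch doubles the count, so path_conditions(n) has 2**n entries. This is why exhaustive path enumeration stops scaling after a few dozen independent branches.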
Heap modeling
Modeling symbolic heap values and pointers
General lazy concretization

    class Node {
      int elem;
      Node next;
    }

    n = symbolic(Node);
    x = n.next;

After n = symbolic(Node):
    n ↦ A0
    heap: A0 {elem: ?, next: ?}
After x = n.next, one state per choice for A0.next:
    A0.next = null: n ↦ A0, x ↦ null; heap: A0 {elem: ?, next: null}
    A0.next = A0: n ↦ A0, x ↦ A0; heap: A0 {elem: ?, next: A0}
    A0.next = A1: n ↦ A0, x ↦ A1; heap: A0 {elem: ?, next: A1}, A1 {elem: ?, next: ?}
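The case split on reading n.next can be sketched as a function that forks the heap once per possible target of an uninitialized reference. This is a minimal illustration, assuming every heap object is a Node, that "?" marks an uninitialized field, and that objects are named A0, A1, and so on. The name read_field is hypothetical.

```python
import copy

def read_field(heap, obj, field):
    """Return (value, heap) pairs, one per choice for an uninitialized reference field."""
    if heap[obj][field] != "?":
        # Already concretized: no case split is needed.
        return [(heap[obj][field], heap)]
    fresh = f"A{len(heap)}"
    outcomes = []
    # The field may be null, alias any existing object, or point to a fresh object.
    for choice in [None, *heap.keys(), fresh]:
        h = copy.deepcopy(heap)
        if choice == fresh:
            h[fresh] = {"elem": "?", "next": "?"}
        h[obj][field] = choice
        outcomes.append((choice, h))
    return outcomes

heap = {"A0": {"elem": "?", "next": "?"}}
outcomes = read_field(heap, "A0", "next")
```

With the single object A0 on the heap, outcomes holds three states: x is null, x aliases A0, or x points to a fresh A1 whose fields are themselves uninitialized.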
Concolic testing

    #include <stdlib.h>  /* abort, NULL */

    typedef struct cell {
      int v;
      struct cell *next;
    } cell;

    int f(int v) {
      return 2*v + 1;
    }

    int testme(cell *p, int x) {
      if (x > 0)
        if (p != NULL)
          if (f(x) == p->v)
            if (p->next == p)
              abort();
      return 0;
    }

Execute concretely and symbolically.
Negate last decision and solve for new inputs.

Runs (concrete inputs, then the PC of the path taken):
    1. p ↦ null, x ↦ 236
       PC: x > 0 ∧ p = null
    2. p ↦ A0, x ↦ 236; A0 {v: 634, next: null}
       PC: x > 0 ∧ p ≠ null ∧ p.v ≠ 2x + 1
    3. p ↦ A0, x ↦ 1; A0 {v: 3, next: null}
       PC: x > 0 ∧ p ≠ null ∧ p.v = 2x + 1 ∧ p.next ≠ p
    4. p ↦ A0, x ↦ 1; A0 {v: 3, next: A0}
       PC: x > 0 ∧ p ≠ null ∧ p.v = 2x + 1 ∧ p.next = p (reaches abort)
Solver limitations
Reducing the demands on the solver:
Environment modeling
Dealing with system / native / library calls:
Summary
Today
Next lecture