CSE507 Computer-Aided Reasoning for Software: Symbolic Execution



slide-1
SLIDE 1

CSE507

Emina Torlak

emina@cs.washington.edu

courses.cs.washington.edu/courses/cse507/14au/

Computer-Aided Reasoning for Software

Symbolic Execution

slide-4
SLIDE 4

Today

Last lecture

  • Bounded verification: forward VCG for finitized programs

Today

  • Symbolic execution: a path-based translation
  • Concolic testing
slide-6
SLIDE 6

The spectrum of program validation tools

[Figure: validation tools plotted by confidence against cost (programmer effort, time, expertise).]

  • Verification
  • Static Analysis
  • Extended Static Checking
  • Ad-hoc Testing
  • Concolic Testing & Whitebox Fuzzing (e.g., SAGE, Pex, CUTE, DART)
  • Bounded Verification & Symbolic Execution (e.g., JPF, Klee)

slide-7
SLIDE 7

Symbolic execution

1976: A system to generate test data and symbolically execute programs (Lori Clarke)
1976: Symbolic execution and program testing (James King)
2005-present: practical symbolic execution

  • Using SMT solvers
  • Heuristics to control exponential explosion
  • Heap modeling and reasoning about pointers
  • Environment modeling
  • Dealing with solver limitations
slide-19
SLIDE 19

Classic symbolic execution

Execute the program on symbolic values.

Symbolic state maps variables to symbolic values.

Path condition is a quantifier-free formula over the symbolic inputs that encodes all branch decisions taken so far.

All paths in the program form its execution tree, in which some paths are feasible and some are infeasible.

    def f(x, y):
        if x > y:
            x = x + y
            y = x - y
            x = x - y
            if x - y > 0:
                assert False
        return (x, y)

Execution tree (branch condition, then the symbolic state reached):

    x ↦ X, y ↦ Y
    ├─ X ≤ Y:  x ↦ X, y ↦ Y                    feasible
    └─ X > Y:  x ↦ X + Y, y ↦ Y
       └─ true:  x ↦ X + Y, y ↦ X
          └─ true:  x ↦ Y, y ↦ X
             ├─ Y - X ≤ 0:  x ↦ Y, y ↦ X       feasible
             └─ Y - X > 0:  x ↦ Y, y ↦ X       infeasible
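The execution tree for f can be reproduced with a few lines of Python. This is a minimal sketch, not the lecture's implementation: symbolic values are restricted to linear expressions over X and Y, and `feasible` is a toy stand-in for an SMT solver that searches a small box of integers (so "no model found" is only evidence of infeasibility, not a proof). The helper names `add`, `sub`, `evaluate`, `feasible`, and `explore_f` are hypothetical.

```python
from itertools import product

# A symbolic value is a linear expression a*X + b*Y + c, stored as (a, b, c).
X = (1, 0, 0)
Y = (0, 1, 0)

def add(u, v):
    return tuple(p + q for p, q in zip(u, v))

def sub(u, v):
    return tuple(p - q for p, q in zip(u, v))

def evaluate(expr, x, y):
    a, b, c = expr
    return a * x + b * y + c

def feasible(pc, bound=3):
    """Toy stand-in for an SMT solver (assumption): search a small box of
    integers for inputs satisfying every constraint in the path condition.
    Each constraint is (expr, positive): expr > 0 if positive, else expr <= 0."""
    for x, y in product(range(-bound, bound + 1), repeat=2):
        if all((evaluate(e, x, y) > 0) == positive for e, positive in pc):
            return (x, y)
    return None

def explore_f():
    """Enumerate the three paths of f as (path condition, final state, model)."""
    start = {"x": X, "y": Y}
    outer = sub(start["x"], start["y"])      # x > y  iff  X - Y > 0
    # Swap performed on the X > Y branch.
    s = dict(start)
    s["x"] = add(s["x"], s["y"])             # x ↦ X + Y
    s["y"] = sub(s["x"], s["y"])             # y ↦ X
    s["x"] = sub(s["x"], s["y"])             # x ↦ Y
    inner = sub(s["x"], s["y"])              # x - y > 0  iff  Y - X > 0
    paths = [
        ([(outer, False)], start),               # X <= Y
        ([(outer, True), (inner, False)], s),    # X > Y and Y - X <= 0
        ([(outer, True), (inner, True)], s),     # X > Y and Y - X > 0: assert False
    ]
    return [(pc, state, feasible(pc)) for pc, state in paths]

for pc, state, model in explore_f():
    print(state, "feasible with inputs" if model else "infeasible", model or "")
```

The path that reaches `assert False` needs both X - Y > 0 and Y - X > 0, so the search finds no inputs for it, matching the infeasible leaf of the tree.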

slide-26
SLIDE 26

Classic symbolic execution: practical issues

  • Loops and recursion: infinite execution trees
  • Path explosion: exponentially many paths
  • Heap modeling: symbolic data structures and pointers
  • Solver limitations: dealing with complex PCs
  • Environment modeling: dealing with native / system / library calls

slide-29
SLIDE 29

Loops and recursion

Dealing with infinite execution trees:

  • Finitize paths by limiting the size of PCs (bounded verification)
  • Use loop invariants (verification)

A loop annotated with invariant I:

    init;
    while (C) { B; }    // invariant I
    assert P;

is replaced by the loop-free program:

    init;
    assert I;
    makeSymbolic(targets(B));
    assume I;
    if (C) { B; assert I; }
    else assert P;
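The loop-free encoding can be tried out on a concrete loop. The sketch below is illustrative only: the loop (summing 0..n-1), its postcondition, the two candidate invariants, and the name `check_loop` are all assumptions introduced here, and `makeSymbolic` is approximated by trying every value in a small range rather than by introducing fresh symbolic values.

```python
def check_loop(inv, n, havoc=range(-5, 30)):
    """Check `i = s = 0; while i < n: s += i; i += 1; assert 2*s == n*(n-1)`
    using the loop-free encoding, for one concrete n."""
    # init; assert I;
    i, s = 0, 0
    if not inv(i, s, n):
        return False
    # makeSymbolic(targets(B)): the loop body writes i and s, so try every
    # value in a small range (toy stand-in for fresh symbolic values).
    for i in havoc:
        for s in havoc:
            if not inv(i, s, n):          # assume I;
                continue
            if i < n:                     # if (C) { B; assert I; }
                s2, i2 = s + i, i + 1
                if not inv(i2, s2, n):
                    return False
            elif 2 * s != n * (n - 1):    # else assert P;
                return False
    return True

strong = lambda i, s, n: 0 <= i <= n and 2 * s == i * (i - 1)
weak = lambda i, s, n: 0 <= i <= n

print(check_loop(strong, 5))  # invariant is strong enough to establish P
print(check_loop(weak, 5))    # preserved by the body, but too weak to imply P
```

The second call shows why the choice of I matters: an invariant can hold initially and be preserved by B yet still fail the final `assert P` once the loop variables have been made symbolic.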

slide-31
SLIDE 31

Path explosion

Achieving good coverage in the presence of exponentially many paths:

  • Select next branch at random
  • Select next branch based on coverage
  • Interleave symbolic execution with random testing

[Figure panels: symbolic execution, random testing, interleaved execution]
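The first two heuristics amount to different policies for choosing the next pending state from a worklist. The following sketch assumes a hypothetical representation in which the execution tree is a dictionary from nodes to children and each node is a (location, id) pair; it is not how any particular tool schedules states, and it does not cover interleaving with random testing.

```python
import random

def explore(tree, root, pick, budget):
    """Visit up to `budget` nodes of an execution tree.
    tree maps a node to its children; a node is a (location, id) pair.
    pick(frontier, covered) returns the index of the next state to run."""
    frontier, covered, order = [root], set(), []
    while frontier and len(order) < budget:
        node = frontier.pop(pick(frontier, covered))
        order.append(node)
        covered.add(node[0])
        frontier.extend(tree.get(node, []))
    return order, covered

def pick_random(frontier, covered):
    return random.randrange(len(frontier))

def pick_depth_first(frontier, covered):
    return len(frontier) - 1

def pick_by_coverage(frontier, covered):
    # Prefer a pending state at a location not executed so far.
    for index, (location, _) in enumerate(frontier):
        if location not in covered:
            return index
    return 0

# A loop that keeps unrolling, plus one else branch off the entry.
tree = {
    ("entry", 0): [("else", 1), ("loop", 2)],
    ("loop", 2): [("loop", 3)],
    ("loop", 3): [("loop", 4)],
}

for pick in (pick_depth_first, pick_by_coverage, pick_random):
    order, covered = explore(tree, ("entry", 0), pick, budget=3)
    print(pick.__name__, sorted(covered))
```

With a budget of three states, the depth-first policy spends its budget unrolling the loop, while the coverage-based policy reaches the else branch as well.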

slide-32
SLIDE 32

Heap modeling


Modeling symbolic heap values and pointers

  • Segmented address space via the theory of arrays (Klee)
  • Lazy concretization (JPF)
  • Concolic lazy concretization (CUTE)
slide-37
SLIDE 37

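General lazy concretization

    class Node { int elem; Node next; }
    n = symbolic(Node);
    x = n.next;

State before the read of n.next:

    n ↦ A0                               heap: A0 = {elem: ?, next: ?}

The read forks into three states, one per choice for A0.next:

    A0.next = null:  n ↦ A0, x ↦ null    heap: A0 = {elem: ?, next: null}
    A0.next = A0:    n ↦ A0, x ↦ A0      heap: A0 = {elem: ?, next: A0}
    A0.next = A1:    n ↦ A0, x ↦ A1      heap: A0 = {elem: ?, next: A1}, A1 = {elem: ?, next: ?}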
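The three-way fork on the first read of an uninitialized reference field can be sketched as follows. The heap representation (a dictionary from addresses to field dictionaries), the `UNKNOWN` marker, and the function name `lazy_init` are assumptions made for illustration; the sketch also assumes every heap object is a Node, so any existing address is a type-compatible alias.

```python
import copy

UNKNOWN = "?"   # marks a field that has not been concretized yet

def lazy_init(heap, addr, field, fresh_addr):
    """Return (heap, value) successors for a first read of heap[addr][field]."""
    assert heap[addr][field] == UNKNOWN
    successors = []
    # Choice 1: null. Choice 2: alias any node already on the heap.
    for target in [None] + sorted(heap):
        h = copy.deepcopy(heap)
        h[addr][field] = target
        successors.append((h, target))
    # Choice 3: a fresh node whose own fields are still unknown.
    h = copy.deepcopy(heap)
    h[fresh_addr] = {"elem": UNKNOWN, "next": UNKNOWN}
    h[addr][field] = fresh_addr
    successors.append((h, fresh_addr))
    return successors

heap = {"A0": {"elem": UNKNOWN, "next": UNKNOWN}}   # n = symbolic(Node)
for h, x in lazy_init(heap, "A0", "next", "A1"):    # x = n.next
    print("x ↦", x, "heap:", h)
```

Each successor carries its own copy of the heap, so later reads along one path do not affect the others.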

slide-42
SLIDE 42

Concolic testing

    #include <stdlib.h>  /* abort, NULL */

    typedef struct cell {
      int v;
      struct cell *next;
    } cell;

    int f(int v) { return 2*v + 1; }

    int testme(cell *p, int x) {
      if (x > 0)
        if (p != NULL)
          if (f(x) == p->v)
            if (p->next == p)
              abort();
      return 0;
    }

Execute concretely and symbolically. Negate last decision and solve for new inputs.

    Run  Concrete                                       PC
    1    p ↦ null, x ↦ 236                              x > 0 ∧ p = null
    2    p ↦ A0, x ↦ 236, A0 = {v: 634, next: null}     x > 0 ∧ p ≠ null ∧ p.v ≠ 2x + 1
    3    p ↦ A0, x ↦ 1, A0 = {v: 3, next: null}         x > 0 ∧ p ≠ null ∧ p.v = 2x + 1 ∧ p.next ≠ p
    4    p ↦ A0, x ↦ 1, A0 = {v: 3, next: A0}           x > 0 ∧ p ≠ null ∧ p.v = 2x + 1 ∧ p.next = p

slide-43
SLIDE 43

Solver limitations


Reducing the demands on the solver:

  • On-the-fly expression simplification
  • Incremental solving
  • Solution caching
  • Substituting concrete values for symbolic in complex PCs (CUTE)
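Solution caching can be illustrated with a small wrapper around a solver. This is a sketch under stated assumptions: `toy_solver` and `SolutionCache` are hypothetical names, constraints are limited to (operator, constant) pairs over a single integer x, and the cache exploits two general facts, namely that any superset of an unsatisfiable constraint set is unsatisfiable and that a model of a constraint set also satisfies each of its subsets.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq}

def toy_solver(constraints):
    """Stand-in solver (assumption): constraints are (op, constant) pairs
    over a single integer x, solved by trying x in [-10, 10]."""
    for x in range(-10, 11):
        if all(OPS[op](x, k) for op, k in constraints):
            return {"x": x}
    return None

class SolutionCache:
    def __init__(self, solver):
        self.solver = solver
        self.sat = {}        # frozenset of constraints -> model
        self.unsat = []      # frozensets known to have no model
        self.solver_calls = 0

    def check(self, constraints):
        key = frozenset(constraints)
        # A superset of an unsatisfiable set is unsatisfiable.
        if any(core <= key for core in self.unsat):
            return None
        # A model of a superset also satisfies every subset.
        for cached, model in self.sat.items():
            if key <= cached:
                return model
        self.solver_calls += 1
        model = self.solver(key)
        if model is None:
            self.unsat.append(key)
        else:
            self.sat[key] = model
        return model

cache = SolutionCache(toy_solver)
cache.check([(">", 0), ("<", 5)])               # solver call
cache.check([(">", 0)])                         # answered from the cache
cache.check([(">", 5), ("<", 3)])               # solver call, unsatisfiable
cache.check([(">", 5), ("<", 3), ("==", 7)])    # answered from the cache
print("solver calls:", cache.solver_calls)
```

Path conditions along one path grow by one conjunct at a time, so subset and superset hits of this kind are common during exploration.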
slide-44
SLIDE 44

Environment modeling


Dealing with system / native / library calls:

  • Partial state concretization
  • Manual models of the environment (Klee)
slide-45
SLIDE 45

Summary


Today

  • Practical symbolic execution and concolic testing

Next lecture

  • Basics of model checking