symbolic execution

Symbolic Execution - Saswat Anand, 22/09/2009 - PDF document


  1. Symbolic Execution
     Saswat Anand, 22/09/2009

     Limitation of Dataflow Analysis
     • Statements in the slide's control-flow figure: if(p < 10), i = 10, x = 1, if(p > 10), x = x+1, j = i+1
     • Is the DU pair involving variable i real?
     • No, because the path is infeasible.

  2. Outline
     • Background
       – feasible and infeasible program paths
       – constraints, and constraint satisfiability
     • Symbolic execution
       – basic idea
       – handling of symbolic references
     • Overview of compositional symbolic execution
     • Overview of implementation of symbolic execution
     • Limitations of symbolic execution
     • Summary

     Feasible and Infeasible Paths
     • A path refers to a path in the (inter-procedural) control-flow graph of the program.
     • A path is feasible if there exists an input I to the program that covers the path; i.e., when the program is executed with I as input, the path is taken.
     • A path is infeasible if there exists no input I that covers the path.
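The two definitions above can be tried out by brute force on a tiny program. The sketch below assumes a hypothetical reading of the dataflow example in which the definition i = 10 is guarded by p < 10 and the use j = i + 1 is guarded by p > 10; the class and method names are illustrative only. A bounded search over p is evidence of infeasibility, not a proof.

```java
import java.util.OptionalInt;

// Brute-force feasibility check for paths through two conditionals.
// Assumed program (hypothetical reading of the dataflow example):
//   if (p < 10) i = 10;      // first conditional
//   if (p > 10) j = i + 1;   // second conditional
class PathFeasibility {
    // Returns some input p in [-bound, bound] that produces the requested
    // branch outcomes, or an empty result if no such input is found.
    static OptionalInt findInput(boolean takeFirst, boolean takeSecond, int bound) {
        for (int p = -bound; p <= bound; p++) {
            boolean first = p < 10;
            boolean second = p > 10;
            if (first == takeFirst && second == takeSecond) {
                return OptionalInt.of(p);
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // then/then is the def-use path for i: no input in the range covers it.
        System.out.println("then/then: " + findInput(true, true, 1000));
        // then/else is covered by any p < 10.
        System.out.println("then/else: " + findInput(true, false, 1000));
    }
}
```

Under this reading, the only way to skip both branches is p = 10, which the search also finds.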

  3. Infeasible Paths
     • An infeasible path does not imply dead code; however, dead code implies an infeasible path.
     • In real software, a very large portion of the total no. of paths is infeasible.
     • Automatic test-input generation does not scale when there are a large no. of infeasible paths to the target location that needs to be covered.

       if(sameGoto)
         newTarget = ((IfStmt) stmtSeq[5]).getTarget();
       else {
         newTarget = next;
         oldTarget = ((IfStmt) stmtSeq[5]).getTarget();
       }
       …
       if(!sameGoto)
         b.getUnits().insertAfter(…);
       …

     An example of an infeasible path from soot. A path that goes through the then branches of both conditional stmts. is infeasible.

     Constraints
     X > Y ∧ Y+X ≤ 10
     • X, Y are called free variables.
     • A solution of the constraint is a set of assignments, one for each free variable, that satisfies the constraint.
     • {X = 3, Y=2} is a solution.
     • {X = 6, Y=5} is not a solution.

     More types of constraints
     1. Linear constraint
        • X > Y ∧ Y+X ≤ 10
     2. Non-linear constraint
        • X * Y < 100
        • X % 3 ∧ Y > 10
        • (X >> 3) < Y
     3. Use of function symbols
        • f(X) > 10 ∧ (forall a. f(a) = a + 10)
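To make the notion of a solution concrete, the sketch below evaluates the linear constraint from the Constraints slide on the two assignments it mentions. The class name, method names and the bounded enumeration are assumptions for illustration; the enumeration only stands in for a real constraint solver.

```java
// Checks candidate assignments against the linear constraint
// X > Y and Y + X <= 10. Names are illustrative, not from the slides.
class LinearConstraint {
    // True if the assignment {X = x, Y = y} satisfies the constraint.
    static boolean satisfies(int x, int y) {
        return x > y && y + x <= 10;
    }

    // Bounded enumeration standing in for a constraint solver: returns the
    // first satisfying assignment {x, y} with both values in [lo, hi],
    // or null if none exists in that range.
    static int[] findSolution(int lo, int hi) {
        for (int x = lo; x <= hi; x++) {
            for (int y = lo; y <= hi; y++) {
                if (satisfies(x, y)) {
                    return new int[] {x, y};
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println("{X=3, Y=2}: " + satisfies(3, 2)); // the slide's solution
        System.out.println("{X=6, Y=5}: " + satisfies(6, 5)); // the slide's non-solution
        int[] s = findSolution(0, 10);
        System.out.println("found: X = " + s[0] + ", Y = " + s[1]);
    }
}
```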

  4. Constraints (contd.)
     • A decision procedure is a tool that can decide if a constraint is satisfiable.
     • A constraint solver is a tool that finds satisfying assignments for a constraint, if it is satisfiable.
     • In general, checking constraint satisfiability is undecidable.

     Symbolic Execution
     • Symbolic execution refers to execution of a program with symbols as arguments.
     • Unlike concrete execution, where the taken path is determined by the input, in symbolic execution the program can take any feasible path.
     • During symbolic execution, program state consists of
       – symbolic values for some memory locations
       – path condition
     • Path condition is a condition on the input symbols such that if a path is feasible its path-condition is satisfiable.
     • A solution of the path-condition is a test-input that covers the respective path.

  5. Symbolic Execution

       1  int x, y;
       2  if(x > y){
       3    x = x+y;
       4    y = x - y;
       5    x = x - y;
       6    if(x > y)
       7      assert false;
       8  }
       9  printf(x,y);

     Symbolic state and path condition along the path (inputs x=A, y=B):
     • after 1: x=A, y=B
     • then branch at 2: x=A, y=B; A>B
     • after 3: x=A+B, y=B; A>B
     • after 4: x=A+B, y=A; A>B
     • after 5: x=B, y=A; A>B
     • else branch at 6: x=B, y=A; A>B ∧ B ≤ A
     • then branch at 6: x=B, y=A; A>B ∧ B>A (UNSAT!)

     Test inputs:
     • inputs that cover else branch at stmt. 2: x = 3, y = 4
     • inputs that cover then branch at 2 and else at 6: x = 5, y = 1 (one solution of the constraint A>B ∧ B ≤ A is A = 5, B = 1)
     • inputs that cover then branch at 2 and then at 6: does not exist!
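The walk-through above can be mimicked with a very small symbolic executor. The sketch below is only an illustration under stated assumptions: symbolic values are restricted to linear forms a*A + b*B, all class and method names are hypothetical, and a bounded search over A and B stands in for a constraint solver (so "no solution found" is not a proof of unsatisfiability).

```java
import java.util.ArrayList;
import java.util.List;

// Symbolic execution of the swap example with linear symbolic values.
class SwapSymbolic {
    // Symbolic value a*A + b*B over the input symbols A and B.
    static final class Lin {
        final int a, b;
        Lin(int a, int b) { this.a = a; this.b = b; }
        Lin plus(Lin o)  { return new Lin(a + o.a, b + o.b); }
        Lin minus(Lin o) { return new Lin(a - o.a, b - o.b); }
        int eval(int A, int B) { return a * A + b * B; }
    }

    // One conjunct of a path condition: the test (lhs > rhs) had outcome 'taken'.
    static final class Branch {
        final Lin lhs, rhs;
        final boolean taken;
        Branch(Lin lhs, Lin rhs, boolean taken) { this.lhs = lhs; this.rhs = rhs; this.taken = taken; }
        boolean holds(int A, int B) { return (lhs.eval(A, B) > rhs.eval(A, B)) == taken; }
    }

    // Builds the path condition for the chosen outcomes at statements 2 and 6.
    // thenAt6 is ignored when the else branch is taken at statement 2.
    static List<Branch> pathCondition(boolean thenAt2, boolean thenAt6) {
        Lin x = new Lin(1, 0);                      // x = A
        Lin y = new Lin(0, 1);                      // y = B
        List<Branch> pc = new ArrayList<>();
        pc.add(new Branch(x, y, thenAt2));          // stmt 2: x > y
        if (thenAt2) {
            x = x.plus(y);                          // stmt 3: x = A+B
            y = x.minus(y);                         // stmt 4: y = A
            x = x.minus(y);                         // stmt 5: x = B
            pc.add(new Branch(x, y, thenAt6));      // stmt 6: x > y, i.e. B > A
        }
        return pc;
    }

    // Bounded search standing in for a solver: returns {A, B} or null.
    static int[] solve(List<Branch> pc, int bound) {
        for (int A = -bound; A <= bound; A++) {
            for (int B = -bound; B <= bound; B++) {
                boolean ok = true;
                for (Branch br : pc) ok &= br.holds(A, B);
                if (ok) return new int[] {A, B};
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int[] s = solve(pathCondition(true, false), 10);
        System.out.println("then at 2, else at 6: A = " + s[0] + ", B = " + s[1]);
        System.out.println("then at 2, then at 6 has a solution: "
                + (solve(pathCondition(true, true), 10) != null));
    }
}
```

Any solution returned for the then/else path has A > B, matching the slide's example input x = 5, y = 1.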

  6. All-paths Symbolic Execution

     Normal execution
     • input: x = 4, y = 3
     • output: 3, 4

     Symbolic execution
     • input: x = A, y = B
     • Path-condition: A ≤ B; output: A, B
     • Path-condition: A>B ∧ B ≤ A; output: B, A

     Symbolic states and path conditions (PC) in the execution tree:

       int x, y;            x=A, y=B      PC: true
       if(x > y){           x=A, y=B      PC: A>B
         x = x+y;           x=A+B, y=B    PC: A>B
         y = x - y;         x=A+B, y=A    PC: A>B
         x = x - y;         x=B, y=A      PC: A>B
         if(x > y)
           assert false;    x=B, y=A      PC: A>B ∧ B>A    UNSAT!
       }
       printf(x,y);         x=A, y=B      PC: A ≤ B
                            x=B, y=A      PC: A>B ∧ B ≤ A

     Handling Symbolic References

        1  class Node {
        2    int elem;
        3    Node next;
        4    foo( Node n1, Node n2 ){
        5      if (n1 == null) return ;
        6      if (n2 == null ) return ;
        7      if (n2.elem == 0)
        8        return ;
        9      if (n1.next != null )
       10        n1.next.elem = n1.elem -10;
       11      assert (n2.elem != 0);
       12    }

  7. Handling Symbolic References
     • setElem(H,n,e) – updates the elem field of node n in heap H to value e; returns the updated heap
     • getElem(H,n) – returns the value of elem field of node n in heap H
     • setNext(H,n,e), getNext(H,n) – likewise for next field

     Invariants:
       forall H, n, v. getElem(setElem(H,n,v),n) = v
       forall H, n, v. getNext(setNext(H,n,v),n) = v

     Handling Symbolic References

        1  class Node {
        2    int elem;
        3    Node next;
        4    foo( Node n1, Node n2 ){
        5      if (n1 == null) return ;
        6      if (n2 == null ) return ;
        7      if (n2.elem == 0)
        8        return ;
        9      if (n1.next != null )
       10        n1.next.elem = n1.elem -10;
       11      assert (n2.elem != 0);
       12    }

     Path condition for the path 4-5-6-7-9-10-11:
       n1 ≠ null ∧
       n2 ≠ null ∧
       getElem(H1,n2) ≠ 0 ∧
       getNext(H1,n1) ≠ null ∧
       H2 = setElem(H1, getNext(H1,n1), getElem(H1,n1)-10) ∧
       getElem(H2,n2) = 0

     The last conjunct states that the assertion fails; the condition is satisfiable when n2 aliases n1.next and n1.elem = 10.
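The heap functions can be modelled as operations on an immutable heap value, which makes the invariants and the path condition easy to evaluate on concrete data. The sketch below is an assumption-laden illustration: nodes are identified by strings, a missing next entry plays the role of null, unset elem fields default to 0, and the names Heap, HeapDemo and violatesAssert are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Immutable heap model: every update returns a new heap, the old one is untouched.
final class Heap {
    private final Map<String, Integer> elem;  // node id -> value of elem field
    private final Map<String, String> next;   // node id -> id of successor (absent = null)

    Heap() { this(new HashMap<>(), new HashMap<>()); }
    private Heap(Map<String, Integer> elem, Map<String, String> next) {
        this.elem = elem;
        this.next = next;
    }

    static Heap setElem(Heap h, String n, int e) {
        Map<String, Integer> copy = new HashMap<>(h.elem);
        copy.put(n, e);
        return new Heap(copy, h.next);
    }
    static int getElem(Heap h, String n) { return h.elem.getOrDefault(n, 0); }

    static Heap setNext(Heap h, String n, String target) {
        Map<String, String> copy = new HashMap<>(h.next);
        copy.put(n, target);
        return new Heap(h.elem, copy);
    }
    static String getNext(Heap h, String n) { return h.next.get(n); }
}

class HeapDemo {
    // Evaluates the path condition for the assertion-violating path on a
    // concrete initial heap h1 and concrete node ids n1, n2.
    static boolean violatesAssert(Heap h1, String n1, String n2) {
        if (n1 == null || n2 == null) return false;
        if (Heap.getElem(h1, n2) == 0) return false;
        String succ = Heap.getNext(h1, n1);
        if (succ == null) return false;
        Heap h2 = Heap.setElem(h1, succ, Heap.getElem(h1, n1) - 10);
        return Heap.getElem(h2, n2) == 0;
    }

    public static void main(String[] args) {
        Heap h1 = Heap.setNext(Heap.setElem(Heap.setElem(new Heap(), "a", 10), "b", 7), "a", "b");
        // n2 aliases n1.next and n1.elem == 10, so the assertion can fail.
        System.out.println(violatesAssert(h1, "a", "b"));
        // Without aliasing the write to n1.next.elem cannot affect n2.elem.
        System.out.println(violatesAssert(h1, "a", "a"));
    }
}
```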

  8. Compositional Symbolic Execution
     • Goal: generate an input that leads to execution of error()
     • No. of paths to error() = 2^50
     • Symbolically executing each path and checking its feasibility does not scale!
     • Key idea: compute function summaries to be used at all call-sites of the function

       int abs(int x){
         if(x >= 0)
           return x;
         else
           return -x;
       }

       int sumAbs(int[] a){
         int sum = 0;
         for ( int i = 0; i < 50; i++)
           sum += abs(a[i]);
         if (sum == 13)
           error();
         return sum;
       }

     Compositional Symbolic Execution (contd.)
     • Symbolically execute all paths of the callee function (e.g., abs) and compute a function summary.
     • For each path in the function, the summary encodes the path-condition of the path and the value returned on the path.
     • When symbolically executing paths in the caller function (e.g., sumAbs), reuse the summary of the callee instead of symbolically executing paths in the callee repeatedly.
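One way to picture a function summary is as a list of cases, one per callee path, each pairing the path's condition with the value returned on it. The sketch below encodes the two paths of abs this way and reuses the summary at every call site in a sumAbs-like loop. It is a concrete-value illustration only: the names AbsSummary, Case and apply are hypothetical, the loop runs over the whole array rather than a fixed 50 elements, and a real tool would hand the summary to a solver as a formula rather than evaluate it.

```java
import java.math.BigInteger;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

class AbsSummary {
    // One summary case: the path condition of a callee path and its return value.
    static final class Case {
        final IntPredicate guard;
        final IntUnaryOperator result;
        Case(IntPredicate guard, IntUnaryOperator result) {
            this.guard = guard;
            this.result = result;
        }
    }

    // Summary of abs: (x >= 0 and abs(x) = x) or (x < 0 and abs(x) = -x).
    static final List<Case> ABS = List.of(
            new Case(x -> x >= 0, x -> x),
            new Case(x -> x < 0, x -> -x));

    // Uses the summary instead of re-executing the callee's body.
    static int apply(List<Case> summary, int x) {
        for (Case c : summary) {
            if (c.guard.test(x)) return c.result.applyAsInt(x);
        }
        throw new IllegalStateException("summary does not cover input " + x);
    }

    // Caller in the style of sumAbs, with every call to abs replaced by the summary.
    static int sumAbs(int[] a) {
        int sum = 0;
        for (int v : a) sum += apply(ABS, v);
        return sum;
    }

    // Number of caller paths if each of 'calls' calls to abs is expanded inline.
    static BigInteger pathsWithoutSummary(int calls) {
        return BigInteger.valueOf(2).pow(calls);
    }

    public static void main(String[] args) {
        System.out.println("paths for 50 calls: " + pathsWithoutSummary(50));
        System.out.println("sumAbs({-6, 7}) = " + sumAbs(new int[] {-6, 7}));
    }
}
```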

  9. Compositional Symbolic Execution (contd.)
     (code for abs and sumAbs as on the previous slide)
     • 2 paths to symbolically execute in abs
     • summary of abs function:
         forall x. (x ≥ 0 ∧ abs(x) = x) ∨ (x < 0 ∧ abs(x) = -x)
     • No. of paths that lead to error() without descending into abs function = 1
     • path-condition of path leading to error:
         abs(a[0]) + abs(a[1]) + … + abs(a[49]) = 13
         ∧ forall x. (x ≥ 0 ∧ abs(x) = x) ∨ (x < 0 ∧ abs(x) = -x)

     Implementation of Symbolic Execution
     • Transformation approach
       – transform the program to another program that operates on symbolic values such that execution of the transformed program is equivalent to symbolic execution of the original program
       – difficult to implement, portable solution, suitable for Java, .NET
     • Instrumentation approach
       – callback hooks are inserted in the program such that symbolic execution is done in background during normal execution of program
       – easy to implement for C
     • Customized runtime approach
       – customize the runtime (e.g., JVM) to support symbolic execution
       – applicable to Java, .NET, difficult to implement, flexible, not portable

  10. Implementation of Symbolic Execution for Java (contd.)

      original program:

        void foo( int x, int y){
          if (x > y){
            x = x + y;
            y = x - y;
            x = x - y;
            if (x > y)
              assert false;
          }
        }

      transformed program:

        void foo( Expression x, Expression y){
          if (_GT(x, y)){
            x = _ADD(x, y);
            y = _SUB(x, y);
            x = _SUB(x, y);
            if (_GT(x,y))
              assert false;
          }
        }

        class Expression{
          int concreteValue;
          Operator op;
          Expression leftOp;
          Expression rightOp;
          …
        }

      Applications of Symbolic Execution
      • Test-input generation
      • Bug finding
      • Program verification
      • Determining functional equivalence
      • Worst case execution time estimation for real-time software
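The transformed program from the Java implementation slide can be fleshed out into something runnable. The sketch below is one possible reading, not the author's implementation: it assumes that _ADD and _SUB build expression trees while tracking concrete values, that _GT decides the branch from the concrete values and records the corresponding symbolic condition, and it uses a String operator plus a name field in place of the slide's Operator type. The TransformedFoo class, its run helper and the static pathCondition list are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Symbolic value that also carries the concrete value of the current run.
class Expression {
    final int concreteValue;
    final String op;                   // null for an input symbol (the slide uses an Operator type)
    final Expression leftOp, rightOp;
    final String name;                 // symbol name for leaves (assumed field)

    Expression(String name, int concreteValue) {
        this.name = name; this.concreteValue = concreteValue;
        this.op = null; this.leftOp = null; this.rightOp = null;
    }
    Expression(String op, Expression leftOp, Expression rightOp, int concreteValue) {
        this.name = null; this.concreteValue = concreteValue;
        this.op = op; this.leftOp = leftOp; this.rightOp = rightOp;
    }
    @Override public String toString() {
        return op == null ? name : "(" + leftOp + " " + op + " " + rightOp + ")";
    }
}

class TransformedFoo {
    // Conjuncts of the path condition collected during one run.
    static final List<String> pathCondition = new ArrayList<>();

    static Expression _ADD(Expression a, Expression b) {
        return new Expression("+", a, b, a.concreteValue + b.concreteValue);
    }
    static Expression _SUB(Expression a, Expression b) {
        return new Expression("-", a, b, a.concreteValue - b.concreteValue);
    }
    // Follows the concrete outcome and records the matching symbolic condition.
    static boolean _GT(Expression a, Expression b) {
        boolean taken = a.concreteValue > b.concreteValue;
        String cond = "(" + a + " > " + b + ")";
        pathCondition.add(taken ? cond : "!" + cond);
        return taken;
    }

    // Transformed version of foo from the slide.
    static void foo(Expression x, Expression y) {
        if (_GT(x, y)) {
            x = _ADD(x, y);
            y = _SUB(x, y);
            x = _SUB(x, y);
            if (_GT(x, y)) throw new AssertionError("assert false reached");
        }
    }

    // Runs foo on inputs A = a, B = b and returns the recorded path condition.
    static List<String> run(int a, int b) {
        pathCondition.clear();
        foo(new Expression("A", a), new Expression("B", b));
        return new ArrayList<>(pathCondition);
    }

    public static void main(String[] args) {
        System.out.println(run(5, 1));
        System.out.println(run(3, 4));
    }
}
```

The recorded conditions are left unsimplified; turning them into A>B ∧ B ≤ A would be the job of a simplifier or solver.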
