
Compilation 2007

Static Analysis

Michael I. Schwartzbach
BRICS, University of Aarhus

Interesting Questions

  • Is every statement reachable?
  • Does every non-void method return a value?
  • Will local variables definitely be assigned before they are read?
  • Will the current value of a variable ever be read?
  • ...
  • How much heap space will the program need?
  • Does the program always terminate?
  • Will the output always be correct?

Rice’s Theorem

Theorem 11.9 (Martin, p. 420)

If R is a property of languages that is satisfied by some but not all recursively enumerable languages, then the decision problem

    P_R: Given a TM T, does L(T) have property R?

is unsolvable.

Rice’s Theorem Explained

Theorem 11.9 (Martin, p. 420)

"Every interesting question about the behavior of a program is undecidable."

Static Analysis

Static analysis provides approximate answers to interesting questions about programs

The approximation is conservative, meaning that some answers are guaranteed to be true

Compilers spend most of their time performing static analysis so they may:

  • understand the semantics of programs
  • provide safety guarantees
  • generate efficient code

Conservative Approximation

A typical scenario for a boolean property:

  • if the analysis says yes, the property definitely holds
  • if it says no, the property may or may not hold
  • only the yes answer will help the compiler
  • a trivial analysis will say no always
  • the engineering challenge is to say yes often enough

For other kinds of properties, the notion of approximation may be more subtle

A Range of Static Analyses

Static analysis may take place:

  • at the source code level
  • at some intermediate level
  • at the machine code level

Static analysis may look at:

  • statement blocks only
  • an entire method (intraprocedural)
  • the whole program (interprocedural)

The precision and cost both rise as we include more information

The Phases of GCC (1/2)

  • Parsing
  • Tree optimization
  • RTL generation
  • Sibling call optimization
  • Jump optimization
  • Register scan
  • Jump threading
  • Common subexpression elimination
  • Loop optimizations
  • Jump bypassing
  • Data flow analysis
  • Instruction combination
  • If-conversion
  • Register movement
  • Instruction scheduling
  • Register allocation
  • Basic block reordering
  • Delayed branch scheduling
  • Branch shortening
  • Assembly output
  • Debugging output

The Phases of GCC (2/2)

Static analysis uses 60% of the compilation time

Reachability Analysis

Java requires two reachability guarantees:

  • all statements must be reachable (avoid dead code)
  • all non-void methods must return a value

These are interesting properties and thus they are undecidable

But a static analysis may provide conservative approximations

To ensure that different compilers accept the same programs, the Java language specification mandates a specific static analysis

Constraint-Based Analysis

For every node S that represents a statement in the AST, we define two boolean properties:

  • C[[S]] denotes that S may complete normally
  • R[[S]] denotes that S is possibly reachable

A statement may only complete if it is reachable

For each syntactic kind of statement, we generate constraints that relate C[[...]] and R[[...]]

Information Flow

The values of R[[...]] are inherited

The values of C[[...]] are synthesized

[Figure: an AST in which R flows down from parent to child and C flows up from child to parent]

Reachability Constraints (1/3)

if(E) S:
    R[[S]] = R[[if(E) S]]
    C[[if(E) S]] = R[[if(E) S]]

if(E) S1 else S2:
    R[[Si]] = R[[if(E) S1 else S2]]
    C[[if(E) S1 else S2]] = C[[S1]] ∨ C[[S2]]

while(true) S:
    R[[S]] = R[[while(true) S]]
    C[[while(true) S]] = false

while(false) S:
    R[[S]] = false
    C[[while(false) S]] = R[[while(false) S]]

Reachability Constraints (2/3)

while(E) S:
    R[[S]] = R[[while(E) S]]
    C[[while(E) S]] = R[[while(E) S]]

return:
    C[[return]] = false

return E:
    C[[return E]] = false

throw E:
    C[[throw E]] = false

{σ x; S}:
    R[[S]] = R[[{σ x; S}]]
    C[[{σ x; S}]] = C[[S]]

Reachability Constraints (3/3)

S1S2:
    R[[S1]] = R[[S1S2]]
    R[[S2]] = C[[S1]]
    C[[S1S2]] = C[[S2]]

for any simple statement S:
    C[[S]] = R[[S]]

for any method or constructor body {S}:
    R[[S]] = true

Exploiting the Information

For any statement S where R[[S]] = false:

    unreachable statement

For any non-void method with body {S} where C[[S]] = true:

    missing return statement

These guarantees are sound but conservative
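The reachability constraints above can be evaluated in a single traversal of the AST: R is passed down as an argument and C comes back as the result. The following is a minimal sketch in Python, not the analysis as implemented in any real compiler. The class names (Simple, Return, If, While, Seq) and the functions completes and check_method are hypothetical, conditions are reduced to plain strings, and Return stands in for return, return E and throw E alike.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Simple:              # any simple statement, e.g. "x = 1;"
    text: str

@dataclass
class Return:              # stands in for return, return E and throw E
    text: str = "return;"

@dataclass
class If:                  # the condition itself does not matter here
    then: object
    orelse: Optional[object] = None

@dataclass
class While:
    cond: str              # "true", "false", or any other expression
    body: object

@dataclass
class Seq:                 # S1 S2
    first: object
    second: object


def completes(stmt, reachable, unreachable):
    """Return C[[stmt]] given R[[stmt]] = reachable.

    The outermost statements with R = false are appended to `unreachable`.
    """
    if not reachable:
        unreachable.append(stmt)
        return False       # a statement may only complete if it is reachable
    if isinstance(stmt, Simple):
        return True        # C[[S]] = R[[S]]
    if isinstance(stmt, Return):
        return False
    if isinstance(stmt, If):
        c1 = completes(stmt.then, True, unreachable)
        if stmt.orelse is None:
            return True    # C[[if(E) S]] = R[[if(E) S]]
        c2 = completes(stmt.orelse, True, unreachable)
        return c1 or c2
    if isinstance(stmt, While):
        if stmt.cond == "true":
            completes(stmt.body, True, unreachable)
            return False
        if stmt.cond == "false":
            completes(stmt.body, False, unreachable)
            return True
        completes(stmt.body, True, unreachable)
        return True
    if isinstance(stmt, Seq):
        c1 = completes(stmt.first, True, unreachable)
        return completes(stmt.second, c1, unreachable)
    raise TypeError(f"unknown statement: {stmt!r}")


def check_method(body, non_void):
    """Return (unreachable statements, whether a return is missing)."""
    unreachable = []
    c = completes(body, True, unreachable)   # R[[S]] = true for a method body
    return unreachable, non_void and c


# if (b) return 17; if (!b) return 42;
body = Seq(If(Return("return 17;")), If(Return("return 42;")))
print(check_method(body, non_void=True))     # ([], True): missing return
```

The method body in the usage example always returns at run time, yet the analysis reports a missing return, which is the price of the conservative approximation.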

Approximations

C[[S]] may be true too often:

  • some unfair missing return errors may occur

        if (b) return 17;
        if (!b) return 42;

R[[S]] may be true too often:

  • some dead code is not detected

        if (b==!b) { ... }

Definite Assignment Analysis

Java requires that a local variable is assigned before its value is used

This is an interesting property and thus it is undecidable

But a static analysis may provide a conservative approximation

To ensure that different compilers accept the same programs, the Java language specification mandates a specific static analysis

Constraint-Based Analysis

For every node S that represents a statement in the AST, we define some set-valued properties:

  • B[[S]] denotes the variables that are definitely assigned before S is executed
  • A[[S]] denotes the variables that are definitely assigned after S is executed

For every node E that represents an expression in the AST, we similarly define B[[E]] and A[[E]]

Increased Precision

To handle cases such as:

    { int k;
      if (a>0 && (k=System.in.read())>0)
        System.out.print(k);
    }

we also use two refinements of A[[...]]:

  • At[[E]] which assumes that E evaluates to true
  • Af[[E]] which assumes that E evaluates to false

Information Flow

The values of B[[...]] are inherited

The values of A[[...]], At[[...]] and Af[[...]] are synthesized

[Figure: an AST in which B flows down from parent to child and A, At, Af flow up from child to parent]

Definite Assignment Constraints (1/7)

if(E) S:
    B[[E]] = B[[if(E) S]]
    B[[S]] = At[[E]]
    A[[if(E) S]] = A[[S]] ∩ Af[[E]]

if(E) S1 else S2:
    B[[E]] = B[[if(E) S1 else S2]]
    B[[S1]] = At[[E]]
    B[[S2]] = Af[[E]]
    A[[if(E) S1 else S2]] = A[[S1]] ∩ A[[S2]]

Definite Assignment Constraints (2/7)

while(E) S:
    B[[E]] = B[[while(E) S]]
    B[[S]] = At[[E]]
    A[[while(E) S]] = Af[[E]]

return:
    A[[return]] = ∞

return E:
    B[[E]] = B[[return E]]
    A[[return E]] = ∞

throw E:
    B[[E]] = B[[throw E]]
    A[[throw E]] = ∞

Here ∞ denotes the set of all variables in scope

Definite Assignment Constraints (3/7)

E;:
    B[[E]] = B[[E;]]
    A[[E;]] = A[[E]]

{σ x=E; S}:
    B[[E]] = B[[{σ x=E; S}]]
    B[[S]] = A[[E]] ∪ {x}
    A[[{σ x=E; S}]] = A[[S]]

{σ x; S}:
    B[[S]] = B[[{σ x; S}]]
    A[[{σ x; S}]] = A[[S]]

Definite Assignment Constraints (4/7)

S1S2:
    B[[S1]] = B[[S1S2]]
    B[[S2]] = A[[S1]]
    A[[S1S2]] = A[[S2]]

x = E:
    B[[E]] = B[[x = E]]
    A[[x = E]] = A[[E]] ∪ {x}

x[E1] = E2:
    B[[E1]] = B[[x[E1] = E2]]
    B[[E2]] = A[[E1]]
    A[[x[E1] = E2]] = A[[E2]]

Definite Assignment Constraints (5/7)

true:
    At[[true]] = B[[true]]
    Af[[true]] = ∞
    A[[true]] = B[[true]]

false:
    At[[false]] = ∞
    Af[[false]] = B[[false]]
    A[[false]] = B[[false]]

!E:
    B[[E]] = B[[!E]]
    At[[!E]] = Af[[E]]
    Af[[!E]] = At[[E]]
    A[[!E]] = A[[E]]

Definite Assignment Constraints (6/7)

E1 && E2:
    B[[E1]] = B[[E1 && E2]]
    B[[E2]] = At[[E1]]
    At[[E1 && E2]] = At[[E2]]
    Af[[E1 && E2]] = Af[[E1]] ∩ Af[[E2]]
    A[[E1 && E2]] = At[[E1 && E2]] ∩ Af[[E1 && E2]]

E1 || E2:
    B[[E1]] = B[[E1 || E2]]
    B[[E2]] = Af[[E1]]
    At[[E1 || E2]] = At[[E1]] ∩ At[[E2]]
    Af[[E1 || E2]] = Af[[E2]]
    A[[E1 || E2]] = At[[E1 || E2]] ∩ Af[[E1 || E2]]

Definite Assignment Constraints (7/7)

EXP(E1,...,Ek): (any other expression with subexpressions)
    B[[E1]] = B[[EXP(E1,...,Ek)]]
    B[[Ei+1]] = A[[Ei]]
    A[[EXP(E1,...,Ek)]] = A[[Ek]]

When not specified otherwise:
    At[[E]] = Af[[E]] = A[[E]]

Exploiting the Information

For every expression E of the form:

  • x
  • x++
  • x--
  • x[E']

where x∉B[[E]]:

variable might not have been initialized
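The definite assignment constraints can likewise be evaluated in one AST traversal, with B passed down and A, At, Af returned. The sketch below covers only part of the constraint set (no arrays, x++ or throw) and makes several assumptions that are not in the slides: the AST is encoded as nested tuples, TOP is a hypothetical sentinel standing for ∞, all variable names are unique, and the function names are invented for illustration.

```python
TOP = None   # stands for ∞, the set of all variables in scope

def meet(a, b):
    """Set intersection that treats TOP as the full set."""
    if a is TOP:
        return b
    if b is TOP:
        return a
    return a & b

def add(a, x):
    """a ∪ {x}."""
    return TOP if a is TOP else a | {x}

def expr(e, b, errors):
    """Return (A, At, Af) for expression e, given B[[e]] = b."""
    kind = e[0]
    if kind == "var":                       # a read of variable x
        if b is not TOP and e[1] not in b:
            errors.append(e[1])             # might not have been initialized
        return b, b, b
    if kind == "assign":                    # x = E
        a = add(expr(e[2], b, errors)[0], e[1])
        return a, a, a
    if kind == "true":
        return b, b, TOP
    if kind == "false":
        return b, TOP, b
    if kind == "not":
        a, at, af = expr(e[1], b, errors)
        return a, af, at
    if kind == "and":
        _, t1, f1 = expr(e[1], b, errors)
        _, t2, f2 = expr(e[2], t1, errors)  # B[[E2]] = At[[E1]]
        at, af = t2, meet(f1, f2)
        return meet(at, af), at, af
    if kind == "or":
        _, t1, f1 = expr(e[1], b, errors)
        _, t2, f2 = expr(e[2], f1, errors)  # B[[E2]] = Af[[E1]]
        at, af = meet(t1, t2), f2
        return meet(at, af), at, af
    a = b                                   # EXP(E1,...,Ek), left to right
    for sub in e[1:]:
        a = expr(sub, a, errors)[0]
    return a, a, a

def stmt(s, b, errors):
    """Return A[[s]] for statement s, given B[[s]] = b."""
    kind = s[0]
    if kind == "expr":                      # E;
        return expr(s[1], b, errors)[0]
    if kind == "decl":                      # {σ x; S} or {σ x=E; S}
        _, x, init, body = s
        if init is not None:
            b = add(expr(init, b, errors)[0], x)
        return stmt(body, b, errors)
    if kind == "if":
        _, cond, then, orelse = s
        _, at, af = expr(cond, b, errors)
        a1 = stmt(then, at, errors)
        a2 = af if orelse is None else stmt(orelse, af, errors)
        return meet(a1, a2)
    if kind == "while":
        _, at, af = expr(s[1], b, errors)
        stmt(s[2], at, errors)
        return af
    if kind == "seq":                       # S1 S2
        return stmt(s[2], stmt(s[1], b, errors), errors)
    if kind == "return":
        if s[1] is not None:
            expr(s[1], b, errors)
        return TOP
    raise ValueError(kind)

# { int k; if (a>0 && (k=System.in.read())>0) System.out.print(k); }
# with a assumed to be assigned on entry; ("op", ...) is any other expression.
cond = ("and", ("op", ("var", "a"), ("op",)),
               ("op", ("assign", "k", ("op",)), ("op",)))
prog = ("decl", "k", None, ("if", cond, ("expr", ("op", ("var", "k"))), None))
errors = []
print(stmt(prog, frozenset({"a"}), errors), errors)
```

In the usage example the read of k is accepted because B for the then-branch is At of the condition, which contains k. The result set after the whole statement contains only a.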

Approximation

A[[...]] and B[[...]] may be too small:

  • some unfair uninitialized variable errors may occur

        { int x;
          if (b) x = 17;
          if (!b) x = 42;
          System.out.print(x);
        }

A Simpler Guarantee

In Joos 1, definite assignment is guaranteed by:

  • requiring initializers for all local declarations
  • forbidding a local variable to appear in its own initializer

This is an even coarser approximation; the following is legal Java but is rejected in Joos 1:

    { int x = (x=1)+42;
      System.out.print(x);
    }

Flow-Sensitive Analysis

The analyses for:

  • reachability
  • definite assignment

may simply be computed by traversing the AST

Other analyses are defined on the control flow graph of a program and require more complex techniques

Motivation: Register Optimization

For native code, we may want to optimize the use of registers:

    mov 1,R3
    mov R3,R1

may be replaced by:

    mov 1,R1

This optimization is only sound if the value in R3 is not used in the future

Motivation: Register Spills

When pushing a new frame, we write back all variables from registers to memory

It would be better to only write back those registers whose values may be used in the future

[Figure: memory slots c, b, a, y, x and registers R1, R2, R3]

Liveness

In both examples, we need to know if the value of some register Ri might be read in the future

If so, it is called live (and otherwise dead)

Exact liveness is of course undecidable

A static analysis may conservatively approximate liveness at each program point

A trivial analysis thinks everything is live

A superior analysis identifies more dead registers

Liveness Analysis

For every program point Si we define the following set-valued properties:

  • B[[Si]] denotes the set of registers that are possibly live before Si
  • A[[Si]] denotes the set of registers that are possibly live after Si

For every program point we generate a constraint that relates A[[...]] and B[[...]] for neighboring program points

We no longer just use the AST...

Terminology

  • succ(Si) denotes the set of program points to which execution may continue (by falling through or jumping)
  • uses(Si) denotes the set of registers that Si reads
  • defs(Si) denotes the set of registers that Si writes

A Tiny Example

                          uses(Si)   defs(Si)   succ(Si)
    S1: mov 3,R1          {}         {R1}       {S2}
    S2: mov 4,R2          {}         {R2}       {S3}
    S3: add R1,R2,R3      {R1,R2}    {R3}       {S4}
    S4: mov R3,R0         {R3}       {R0}       {S5}
    S5: return            {R0}       {}         {}

Dataflow Constraints

For every program point Si we have:

    B[[Si]] = uses(Si) ∪ (A[[Si]] \ defs(Si))
    A[[Si]] = ⋃ B[[x]]    (union over all x∈succ(Si))

A cyclic control flow graph will generate a cyclic collection of constraints
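When the control flow graph has no cycles, as in the Tiny Example, the two constraints can be solved in one backward pass that visits every program point after all of its successors. The following Python sketch assumes a hypothetical encoding of each program point as a (uses, defs, succ) triple; the function name solve_acyclic is likewise made up for illustration.

```python
# The Tiny Example: point -> (uses, defs, succ)
program = {
    "S1": (set(),        {"R1"}, ["S2"]),   # mov 3,R1
    "S2": (set(),        {"R2"}, ["S3"]),   # mov 4,R2
    "S3": ({"R1", "R2"}, {"R3"}, ["S4"]),   # add R1,R2,R3
    "S4": ({"R3"},       {"R0"}, ["S5"]),   # mov R3,R0
    "S5": ({"R0"},       set(),  []),       # return
}

def solve_acyclic(program, order):
    """Solve the liveness constraints for an acyclic program.

    `order` must list every program point after all of its successors.
    """
    before, after = {}, {}
    for point in order:
        uses, defs, succ = program[point]
        # A[[Si]] is the union of B[[x]] over all successors x
        after[point] = set().union(*(before[x] for x in succ))
        # B[[Si]] = uses(Si) ∪ (A[[Si]] \ defs(Si))
        before[point] = uses | (after[point] - defs)
    return before, after

before, after = solve_acyclic(program, ["S5", "S4", "S3", "S2", "S1"])
for point in sorted(program):
    print(point, sorted(before[point]), sorted(after[point]))
```

A loop breaks the ordering requirement, because some point then has to be visited before one of its own successors.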

Dataflow Example

S7: add R1,R2,R3, with succ(S7) = {S8, S9, S17}

    B[[S8]] = {R4}
    B[[S9]] = {R1}
    B[[S17]] = {R3}
    B[[S7]] = {R1,R2} ∪ ({R4,R1,R3} \ {R3}) = {R1,R2,R4}

A Small Example

    { int i, even, odd, sum;
      i = 1;
      even = 0;
      odd = 0;
      sum = 0;
      while (i < 10) {
        if (i%2 == 0) even = even+i;
        else odd = odd+i;
        sum = sum+i;
        i++;
      }
    }

Generated Native Code

           mov 1,R1         // R1 is i
           mov 0,R2         // R2 is even
           mov 0,R3         // R3 is odd
           mov 0,R4         // R4 is sum
    loop:  andcc R1,1,R5    // R5 = R1 & 1
           cmp R5,0
           bne else         // if R5!=0 goto else
           add R2,R1,R2     // R2 = R2+R1
           b endif
    else:  add R3,R1,R3     // R3 = R3+R1
    endif: add R4,R1,R4     // R4 = R4+R1
           add R1,1,R1      // R1 = R1+1
           cmp R1,9
           ble loop         // if i<=9 goto loop

The Control Flow Graph

[Figure: control flow graph; its nodes and successor edges are listed below]

    S1:  mov 1,R1          → S2
    S2:  mov 0,R2          → S3
    S3:  mov 0,R3          → S4
    S4:  mov 0,R4          → S5
    S5:  andcc R1,1,R5     → S6
    S6:  cmp R5,0          → S7
    S7:  bne S9            → S8, S9
    S8:  add R2,R1,R2      → S10
    S9:  add R3,R1,R3      → S10
    S10: add R4,R1,R4      → S11
    S11: add R1,1,R1       → S12
    S12: cmp R1,9          → S13
    S13: ble S5            → S5, exit

Cyclic Constraints

    B[[S1]] = f1(B[[S2]])
    B[[S2]] = f2(B[[S3]])
    B[[S3]] = f3(B[[S4]])
    B[[S4]] = f4(B[[S5]])
    B[[S5]] = f5(B[[S6]])
    B[[S6]] = f6(B[[S7]])
    B[[S7]] = f7(B[[S8]],B[[S9]])
    B[[S8]] = f8(B[[S10]])
    B[[S9]] = f9(B[[S10]])
    B[[S10]] = f10(B[[S11]])
    B[[S11]] = f11(B[[S12]])
    B[[S12]] = f12(B[[S13]])
    B[[S13]] = f13(B[[S5]])

where the fi(...) functions express the local constraints

Fixed-Point Solutions

Define ⊥ = (∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅)

Define R = {R0,R1,R2,R3,R4,R5} and let 𝒫(R) be the set of all subsets of R

Define F: 𝒫(R)¹³ → 𝒫(R)¹³ as:

    F(X1,X2,X3,X4,X5,X6,X7,X8,X9,X10,X11,X12,X13) =
      (f1(X2),f2(X3),f3(X4),f4(X5),f5(X6),f6(X7),f7(X8,X9),
       f8(X10),f9(X10),f10(X11),f11(X12),f12(X13),f13(X5))

A solution is now a fixed point X∈𝒫(R)¹³ such that F(X)=X

The smallest fixed point is computed as: Fⁿ(⊥) for some n≥0

Computing the Minimal Fixed Point (1/3)

           ⊥     F(⊥)       F²(⊥)         F³(⊥)
    S1     {}    {}         {}            {}
    S2     {}    {}         {}            {}
    S3     {}    {}         {}            {R1}
    S4     {}    {}         {R1}          {R1}
    S5     {}    {R1}       {R1}          {R1}
    S6     {}    {R5}       {R5}          {R1,R2,R3,R5}
    S7     {}    {}         {R1,R2,R3}    {R1,R2,R3,R4}
    S8     {}    {R1,R2}    {R1,R2,R4}    {R1,R2,R4}
    S9     {}    {R1,R3}    {R1,R3,R4}    {R1,R3,R4}
    S10    {}    {R1,R4}    {R1,R4}       {R1,R4}
    S11    {}    {R1}       {R1}          {R1}
    S12    {}    {R1}       {R1}          {R1}
    S13    {}    {R1}       {R1}          {R1}

Computing the Minimal Fixed Point (2/3)

           F⁴(⊥)               F⁵(⊥)               F⁶(⊥)
    S1     {}                  {}                  {}
    S2     {R1}                {R1}                {R1}
    S3     {R1}                {R1}                {R1,R2}
    S4     {R1}                {R1,R2,R3}          {R1,R2,R3}
    S5     {R1,R2,R3}          {R1,R2,R3,R4}       {R1,R2,R3,R4}
    S6     {R1,R2,R3,R4,R5}    {R1,R2,R3,R4,R5}    {R1,R2,R3,R4,R5}
    S7     {R1,R2,R3,R4}       {R1,R2,R3,R4}       {R1,R2,R3,R4}
    S8     {R1,R2,R4}          {R1,R2,R4}          {R1,R2,R4}
    S9     {R1,R3,R4}          {R1,R3,R4}          {R1,R3,R4}
    S10    {R1,R4}             {R1,R4}             {R1,R4}
    S11    {R1}                {R1}                {R1,R2,R3}
    S12    {R1}                {R1,R2,R3}          {R1,R2,R3,R4}
    S13    {R1,R2,R3}          {R1,R2,R3,R4}       {R1,R2,R3,R4}

Computing the Minimal Fixed Point (3/3)

           F⁷(⊥)               F⁸(⊥)
    S1     {}                  {}
    S2     {R1}                {R1}
    S3     {R1,R2}             {R1,R2}
    S4     {R1,R2,R3}          {R1,R2,R3}
    S5     {R1,R2,R3,R4}       {R1,R2,R3,R4}
    S6     {R1,R2,R3,R4,R5}    {R1,R2,R3,R4,R5}
    S7     {R1,R2,R3,R4}       {R1,R2,R3,R4}
    S8     {R1,R2,R4}          {R1,R2,R3,R4}
    S9     {R1,R3,R4}          {R1,R2,R3,R4}
    S10    {R1,R2,R3,R4}       {R1,R2,R3,R4}
    S11    {R1,R2,R3,R4}       {R1,R2,R3,R4}
    S12    {R1,R2,R3,R4}       {R1,R2,R3,R4}
    S13    {R1,R2,R3,R4}       {R1,R2,R3,R4}

F⁸(⊥) = F⁹(⊥)
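The iteration can be sketched in a few lines of Python: start from ⊥, apply F to all thirteen components at once, and stop when nothing changes. This is a minimal sketch under stated assumptions, not a production dataflow solver. The (uses, defs, succ) encoding and the name least_fixed_point are hypothetical, condition codes are ignored, and the exit after S13 is treated as a point where nothing is live. How many rounds are needed depends on the order in which components are updated; the least fixed point itself does not.

```python
# The loop example: point -> (uses, defs, succ)
program = {
    "S1":  (set(),        {"R1"}, ["S2"]),          # mov 1,R1
    "S2":  (set(),        {"R2"}, ["S3"]),          # mov 0,R2
    "S3":  (set(),        {"R3"}, ["S4"]),          # mov 0,R3
    "S4":  (set(),        {"R4"}, ["S5"]),          # mov 0,R4
    "S5":  ({"R1"},       {"R5"}, ["S6"]),          # andcc R1,1,R5
    "S6":  ({"R5"},       set(),  ["S7"]),          # cmp R5,0
    "S7":  (set(),        set(),  ["S8", "S9"]),    # bne S9
    "S8":  ({"R1", "R2"}, {"R2"}, ["S10"]),         # add R2,R1,R2
    "S9":  ({"R1", "R3"}, {"R3"}, ["S10"]),         # add R3,R1,R3
    "S10": ({"R1", "R4"}, {"R4"}, ["S11"]),         # add R4,R1,R4
    "S11": ({"R1"},       {"R1"}, ["S12"]),         # add R1,1,R1
    "S12": ({"R1"},       set(),  ["S13"]),         # cmp R1,9
    "S13": (set(),        set(),  ["S5"]),          # ble S5
}

def least_fixed_point(program):
    """Iterate F from ⊥ until F(X) = X; return B[[Si]] for every point."""
    before = {point: frozenset() for point in program}        # ⊥
    while True:
        new = {}
        for point, (uses, defs, succ) in program.items():
            # A[[Si]] from the previous round's B values of the successors
            after = frozenset().union(*(before[x] for x in succ))
            new[point] = frozenset(uses) | (after - defs)
        if new == before:                                     # F(X) = X
            return before
        before = new

live = least_fixed_point(program)
for point, registers in live.items():
    print(point, sorted(registers))
```

Termination follows because every component can only grow from one round to the next and there are finitely many registers.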

Application: Register Allocation

Variables that are never live at the same time may share the same register

Create a graph of variables where edges indicate simultaneous liveness:

[Figure: graph over the variables a, b, c, d, e, f]

Register allocation is done by finding a minimal graph coloring and assigning a register to each color:

    {{a,d,f}, {b,e}, {c}}
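As an illustration of the coloring step, here is a greedy coloring sketch in Python. The edges of the graph did not survive in this transcript, so the edge list below is purely hypothetical; it was chosen only so that the result matches the classes {{a,d,f}, {b,e}, {c}}. Greedy coloring is also a swapped-in heuristic: it does not always find a minimal coloring, although it happens to do so for this graph, which contains the triangle a, b, c and therefore needs at least three colors.

```python
def greedy_coloring(variables, edges):
    """Give each variable the smallest color not used by a colored neighbor."""
    neighbors = {v: set() for v in variables}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    color = {}
    for v in variables:
        taken = {color[n] for n in neighbors[v] if n in color}
        color[v] = next(c for c in range(len(variables)) if c not in taken)
    return color

variables = ["a", "b", "c", "d", "e", "f"]
# Hypothetical edges: each pair is assumed to be live at the same time.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("c", "e"), ("d", "e"), ("e", "f"), ("c", "f")]

color = greedy_coloring(variables, edges)

# Group the variables by color; each group can share one register.
classes = {}
for v, c in color.items():
    classes.setdefault(c, set()).add(v)
print(sorted(sorted(group) for group in classes.values()))
```

The order in which variables are visited affects the result of a greedy coloring, so a real allocator would typically combine it with an ordering heuristic.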