Dataflow Analysis - - PowerPoint PPT Presentation


Dataflow Analysis Jonathan Aldrich


slide-1
SLIDE 1
  • Jonathan Aldrich

Dataflow Analysis

slide-2
SLIDE 2

Outline

  • Introduction to Dataflow Analysis
  • Dataflow Analysis Frameworks
  • Lattices
  • Abstraction functions
  • Control flow graphs
  • Flow functions
  • Worklist algorithm
  • Example Dataflow Analyses
  • Constant Propagation
  • Reaching Definitions
  • Live Variable Analysis
slide-3
SLIDE 3

Motivation: Dataflow Analysis

  • Catch interesting errors
  • Non-local: x is null, x is written to y, y is dereferenced
  • Optimize code
  • Reduce run time, memory usage, …
  • Soundness required
  • Safety-critical domain
  • Assure lack of certain errors
  • Cannot optimize unless it is proven safe
  • Correctness comes before performance
  • Automation required
  • Dramatically decreases cost
  • Makes cost/benefit worthwhile for far more purposes

slide-4
SLIDE 4

Dataflow analysis

  • Tracks value flow through program
  • Can distinguish order of operations
  • Did you read the file after you closed it?
  • Does this null value flow to that dereference?
  • Differs from AST walker
  • Walker simply collects information or checks patterns
  • Tracking flow allows more interesting properties
  • Abstracts values
  • Chooses abstraction particular to property
  • Is a variable null?
  • Is a file open or closed?
  • Could a variable be 0?
  • Where did this value come from?
slide-5
SLIDE 5

Zero Analysis

  • Could variable x be 0?
  • Useful to know if you have an expression y/x
  • In C, useful for null pointer analysis
  • Program semantics
  • Stack environment η maps every variable to an integer
  • Semantic abstraction
  • σ maps every variable to non-zero (NZ), zero (Z), or maybe zero (MZ)
  • Abstraction function for integers αZI :
  • αZI(0) = Z
  • αZI(n) = NZ for all n ≠ 0

  • We may not know if a value is zero or not
  • Analysis is always an approximation
  • Need MZ option, too
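
The abstraction above can be sketched in a few lines of Python. This is only an illustration: the names `Zero`, `alpha_zi`, and `alpha_env` are hypothetical and do not come from the slides.

```python
from enum import Enum


class Zero(Enum):
    Z = "Z"    # definitely zero
    NZ = "NZ"  # definitely non-zero
    MZ = "MZ"  # maybe zero: the analysis cannot tell


def alpha_zi(n: int) -> Zero:
    """Abstract one concrete integer; a single value is never MZ."""
    return Zero.Z if n == 0 else Zero.NZ


def alpha_env(eta: dict) -> dict:
    """Abstract a whole stack environment η, variable by variable."""
    return {var: alpha_zi(value) for var, value in eta.items()}


# Example environment (values chosen for illustration only).
sigma = alpha_env({"x": 10, "y": 10, "z": 0})
```

MZ never comes out of the abstraction function itself; it only appears when the analysis has to merge or approximate values.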
slide-6
SLIDE 6

Zero Analysis Example

                    σ = []
x := 10;
y := x;
z := 0;
while y > -1 do
    x := x / y;
    y := y-1;
    z := 5;

slide-7
SLIDE 7

Zero Analysis Example

                    σ = []
x := 10;            σ = [x↦αZI(10)]
y := x;
z := 0;
while y > -1 do
    x := x / y;
    y := y-1;
    z := 5;

slide-8
SLIDE 8

Zero Analysis Example

                    σ = []
x := 10;            σ = [x↦NZ]
y := x;
z := 0;
while y > -1 do
    x := x / y;
    y := y-1;
    z := 5;

slide-9
SLIDE 9

Zero Analysis Example

                    σ = []
x := 10;            σ = [x↦NZ]
y := x;             σ = [x↦NZ, y↦σ(x)]
z := 0;
while y > -1 do
    x := x / y;
    y := y-1;
    z := 5;

slide-10
SLIDE 10

Zero Analysis Example

                    σ = []
x := 10;            σ = [x↦NZ]
y := x;             σ = [x↦NZ, y↦NZ]
z := 0;
while y > -1 do
    x := x / y;
    y := y-1;
    z := 5;

slide-11
SLIDE 11

Zero Analysis Example

                    σ = []
x := 10;            σ = [x↦NZ]
y := x;             σ = [x↦NZ, y↦NZ]
z := 0;             σ = [x↦NZ, y↦NZ, z↦αZI(0)]
while y > -1 do
    x := x / y;
    y := y-1;
    z := 5;

slide-12
SLIDE 12

Zero Analysis Example

                    σ = []
x := 10;            σ = [x↦NZ]
y := x;             σ = [x↦NZ, y↦NZ]
z := 0;             σ = [x↦NZ, y↦NZ, z↦Z]
while y > -1 do
    x := x / y;
    y := y-1;
    z := 5;

slide-13
SLIDE 13

Zero Analysis Example

                    σ = []
x := 10;            σ = [x↦NZ]
y := x;             σ = [x↦NZ, y↦NZ]
z := 0;             σ = [x↦NZ, y↦NZ, z↦Z]
while y > -1 do     σ = [x↦NZ, y↦NZ, z↦Z]
                    σ = [x↦NZ, y↦NZ, z↦Z]
    x := x / y;
    y := y-1;
    z := 5;

slide-14
SLIDE 14

Zero Analysis Example

                    σ = []
x := 10;            σ = [x↦NZ]
y := x;             σ = [x↦NZ, y↦NZ]
z := 0;             σ = [x↦NZ, y↦NZ, z↦Z]
while y > -1 do     σ = [x↦NZ, y↦NZ, z↦Z]
                    σ = [x↦NZ, y↦NZ, z↦Z]
    x := x / y;     σ = [x↦MZ, y↦NZ, z↦Z]
    y := y-1;
    z := 5;

slide-15
SLIDE 15

Zero Analysis Example

                    σ = []
x := 10;            σ = [x↦NZ]
y := x;             σ = [x↦NZ, y↦NZ]
z := 0;             σ = [x↦NZ, y↦NZ, z↦Z]
while y > -1 do     σ = [x↦NZ, y↦NZ, z↦Z]
                    σ = [x↦NZ, y↦NZ, z↦Z]
    x := x / y;     σ = [x↦MZ, y↦NZ, z↦Z]
    y := y-1;       σ = [x↦MZ, y↦MZ, z↦Z]
    z := 5;

slide-16
SLIDE 16

Zero Analysis Example

                    σ = []
x := 10;            σ = [x↦NZ]
y := x;             σ = [x↦NZ, y↦NZ]
z := 0;             σ = [x↦NZ, y↦NZ, z↦Z]
while y > -1 do     σ = [x↦NZ, y↦NZ, z↦Z]
                    σ = [x↦NZ, y↦NZ, z↦Z]
    x := x / y;     σ = [x↦MZ, y↦NZ, z↦Z]
    y := y-1;       σ = [x↦MZ, y↦MZ, z↦Z]
    z := 5;         σ = [x↦MZ, y↦MZ, z↦NZ]

slide-17
SLIDE 17

Zero Analysis Example

                    σ = []
x := 10;            σ = [x↦NZ]
y := x;             σ = [x↦NZ, y↦NZ]
z := 0;             σ = [x↦NZ, y↦NZ, z↦Z]
while y > -1 do     σ = [x↦MZ, y↦MZ, z↦MZ]
                    σ = [x↦MZ, y↦MZ, z↦MZ]
    x := x / y;     σ = [x↦MZ, y↦NZ, z↦Z]
    y := y-1;       σ = [x↦MZ, y↦MZ, z↦Z]
    z := 5;         σ = [x↦MZ, y↦MZ, z↦NZ]

slide-18
SLIDE 18

Zero Analysis Example

                    σ = []
x := 10;            σ = [x↦NZ]
y := x;             σ = [x↦NZ, y↦NZ]
z := 0;             σ = [x↦NZ, y↦NZ, z↦Z]
while y > -1 do     σ = [x↦MZ, y↦MZ, z↦MZ]
                    σ = [x↦MZ, y↦MZ, z↦MZ]
    x := x / y;     σ = [x↦MZ, y↦MZ, z↦MZ]
    y := y-1;       σ = [x↦MZ, y↦MZ, z↦Z]
    z := 5;         σ = [x↦MZ, y↦MZ, z↦NZ]

slide-19
SLIDE 19

Zero Analysis Example

                    σ = []
x := 10;            σ = [x↦NZ]
y := x;             σ = [x↦NZ, y↦NZ]
z := 0;             σ = [x↦NZ, y↦NZ, z↦Z]
while y > -1 do     σ = [x↦MZ, y↦MZ, z↦MZ]
                    σ = [x↦MZ, y↦MZ, z↦MZ]
    x := x / y;     σ = [x↦MZ, y↦MZ, z↦MZ]
    y := y-1;       σ = [x↦MZ, y↦MZ, z↦MZ]
    z := 5;         σ = [x↦MZ, y↦MZ, z↦NZ]

slide-20
SLIDE 20

Zero Analysis Example

                    σ = []
x := 10;            σ = [x↦NZ]
y := x;             σ = [x↦NZ, y↦NZ]
z := 0;             σ = [x↦NZ, y↦NZ, z↦Z]
while y > -1 do     σ = [x↦MZ, y↦MZ, z↦MZ]
                    σ = [x↦MZ, y↦MZ, z↦MZ]
    x := x / y;     σ = [x↦MZ, y↦MZ, z↦MZ]
    y := y-1;       σ = [x↦MZ, y↦MZ, z↦MZ]
    z := 5;         σ = [x↦MZ, y↦MZ, z↦NZ]

Nothing more happens!

slide-21
SLIDE 21

Zero Analysis Termination

  • The analysis values will not change, no matter how many times we execute the loop
  • Proof: our analysis is deterministic
  • We run through the loop with the current analysis values, none of them change
  • Therefore, no matter how many times we run the loop, the results will remain the same
  • Therefore, we have computed the dataflow analysis results for any number of loop iterations
slide-22
SLIDE 22

Zero Analysis Termination

  • The analysis values will not change, no matter how many times we execute the loop
  • Proof: our analysis is deterministic
  • We run through the loop with the current analysis values, none of them change
  • Therefore, no matter how many times we run the loop, the results will remain the same
  • Therefore, we have computed the dataflow analysis results for any number of loop iterations
  • Why does this work?
  • If we simulate the loop, the data values could (in principle) keep changing indefinitely
  • There are an infinite number of data values possible
  • Not true for 32-bit integers, but might as well be true
  • Counting to 2^32 is slow, even on today’s processors
  • Dataflow analysis only tracks 2 possibilities!
  • So once we’ve explored them all, nothing more will change
  • This is the secret of abstraction
  • We will make this argument more precise later
slide-23
SLIDE 23

Using Zero Analysis

  • Visit each division in the program
  • Get the results of zero analysis for the divisor
  • If the results are definitely zero, report an error
  • If the results are possibly zero, report a warning
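
The reporting step above might look like the following sketch. The input format (a list of divisions plus a map from line numbers to abstract states) is an assumption made for illustration; the slides do not prescribe one.

```python
def check_divisions(divisions, sigma_before):
    """Return (line, severity) pairs for suspicious divisions.

    divisions:    list of (line, divisor_variable) pairs (hypothetical format)
    sigma_before: maps a line to the abstract state {variable: "Z"|"NZ"|"MZ"}
                  that holds just before that line executes
    """
    reports = []
    for line, divisor in divisions:
        value = sigma_before[line][divisor]
        if value == "Z":
            reports.append((line, "error"))    # definitely divides by zero
        elif value == "MZ":
            reports.append((line, "warning"))  # might divide by zero
    return reports


# Made-up analysis results for three divisions.
reports = check_divisions(
    [(5, "y"), (9, "z"), (12, "w")],
    {5: {"y": "NZ"}, 9: {"z": "Z"}, 12: {"w": "MZ"}},
)
```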
slide-24
SLIDE 24

Outline

  • Introduction to Dataflow Analysis
  • Dataflow Analysis Frameworks
  • Lattices
  • Abstraction functions
  • Control flow graphs
  • Flow functions
  • Worklist algorithm
  • Example Dataflow Analyses
  • Constant Propagation
  • Reaching Definitions
  • Live Variable Analysis
slide-25
SLIDE 25

Defining Dataflow Analyses

  • Lattice
  • Describes program data abstractly
  • Abstract equivalent of stack or heap contents
  • Abstraction function
  • Maps concrete value to lattice element
  • Flow functions
  • Describes how abstract data changes
  • Abstract equivalent of statement/expression semantics
  • Control flow graph
  • Determines how abstract data propagates from statement to statement

  • Abstract equivalent of program semantics
slide-26
SLIDE 26

Definitions

  • Partial order – a relation ⊑ that is:
  • Reflexive (l ⊑ l)
  • Transitive (l1 ⊑ l2 ∧ l2 ⊑ l3 ⇒ l1 ⊑ l3)
  • Antisymmetric (l1 ⊑ l2 ∧ l2 ⊑ l1 ⇒ l1 = l2)
  • Upper bound l of a set L
  • ∀ l′∈L : l′ ⊑ l
  • Least upper bound / join operator ⊔
  • An upper bound that is ⊑ all other upper bounds
  • Lattice – a partially ordered set where every subset has a least upper bound
slide-27
SLIDE 27

Lattices in Program Analysis

  • A lattice is a tuple (L, ⊑, ⊔, ⊥, ⊤)
  • L is a set of abstract elements
  • ⊑ is a partial order on L
  • Means at least as precise as
  • ⊔ is the least upper bound of two elements
  • Must exist for every two elements in L
  • Used to merge two abstract values

[Figure: zero lattice diagram with ⊤=MZ at the top (less precise), Z and NZ in the middle, and ⊥ at the bottom (more precise)]

  • ⊥ (bottom) is the least element of L
  • Means we haven’t analyzed this yet
  • Will become clear later
  • ⊤ (top) is the greatest element of L
  • Means we don’t know anything
  • L may be infinite
  • Typically should have finite height
  • All paths from ⊥ to ⊤ should be finite
  • We’ll see why later
slide-28
SLIDE 28

Zero Analysis Lattice

  • Integer zero lattice
  • LZI = { ⊥, Z, NZ, MZ }
  • ⊥ ⊑ Z, ⊥ ⊑ NZ, NZ ⊑ MZ, Z ⊑ MZ
  • ⊥ ⊑ MZ holds by transitivity
  • ⊔ defined as join for ⊑
  • x ⊔ y = z iff
  • z is an upper bound of x and y
  • z is the least such bound

[Figure: zero lattice diagram: ⊤=MZ above Z and NZ, which are above ⊥]

  • Obeys laws: ⊥ ⊔ X = X, ⊤ ⊔ X = ⊤, X ⊔ X = X
  • Also Z ⊔ NZ = MZ
  • ⊥ = ⊥
  • ∀X . ⊥ ⊑ X
  • ⊤ = MZ
  • ∀X . X ⊑ ⊤
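
As a rough illustration, the four-element lattice and its join fit in a few lines of Python. The string encoding of the elements and the names `leq` and `join` are assumptions of this sketch, not part of the slides.

```python
BOT, Z, NZ, MZ = "bot", "Z", "NZ", "MZ"  # MZ doubles as ⊤
ELEMENTS = (BOT, Z, NZ, MZ)


def leq(a, b):
    """a ⊑ b: equal elements, ⊥ below everything, everything below ⊤."""
    return a == b or a == BOT or b == MZ


def join(a, b):
    """Least upper bound of a and b."""
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return MZ  # Z and NZ are incomparable; ⊤ is their only upper bound
```

The laws on the slide can be checked directly, for example `join(BOT, x) == x` and `join(x, x) == x` for every `x` in `ELEMENTS`.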

slide-29
SLIDE 29

Zero Analysis Lattice

  • Integer zero lattice
  • LZI = { ⊥, Z, NZ, MZ }
  • ⊥ ⊑ Z, ⊥ ⊑ NZ, NZ ⊑ MZ, Z ⊑ MZ
  • ⊔ defined as join for ⊑
  • ⊥ = ⊥
  • ⊤ = MZ

[Figure: zero lattice diagram: ⊤=MZ above Z and NZ, which are above ⊥]

  • Program lattice is a tuple lattice
  • LZ is the set of all maps from Var to LZI
  • σ1 ⊑Z σ2 iff ∀x∈Var . σ1(x) ⊑ZI σ2(x)
  • σ1 ⊔Z σ2 = { x ↦ σ1(x) ⊔ZI σ2(x) | x∈Var }
  • ⊥ = { x ↦ ⊥ZI | x∈Var }
  • ⊤ = { x ↦ ⊤ZI | x∈Var } = { x ↦ MZ | x∈Var }
  • Can produce a tuple lattice from any base lattice
  • Just define as above
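
A minimal sketch of the pointwise construction, assuming abstract states are Python dicts over the same set of variables and that lattice elements are encoded as the strings `"bot"`, `"Z"`, `"NZ"`, `"MZ"` (both are assumptions of this example):

```python
def join_zi(a, b):
    """Join in the base lattice: ⊥ is the identity, Z ⊔ NZ = MZ."""
    if a == b or b == "bot":
        return a
    if a == "bot":
        return b
    return "MZ"


def leq_zi(a, b):
    """a ⊑ b exactly when joining a into b changes nothing."""
    return join_zi(a, b) == b


def join_tuple(s1, s2):
    """Pointwise join of two abstract states."""
    return {x: join_zi(s1[x], s2[x]) for x in s1}


def leq_tuple(s1, s2):
    """Pointwise ordering of two abstract states."""
    return all(leq_zi(s1[x], s2[x]) for x in s1)


s1 = {"x": "Z", "y": "NZ"}
s2 = {"x": "Z", "y": "Z"}
merged = join_tuple(s1, s2)  # x stays Z; y becomes MZ
```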

slide-30
SLIDE 30

Tuple Lattices Visually

  • For Var = { x,y }

[Figure: Hasse diagram of the tuple lattice, listed level by level from top to bottom]

⊤ = {x↦MZ, y↦MZ}
{x↦Z, y↦MZ}  {x↦NZ, y↦MZ}  {x↦MZ, y↦Z}  {x↦MZ, y↦NZ}
…
{x↦MZ, y↦⊥ZI}  {x↦Z, y↦Z}  {x↦Z, y↦NZ}  …
{x↦Z, y↦⊥ZI}  {x↦NZ, y↦⊥ZI}  {x↦⊥ZI, y↦Z}  {x↦⊥ZI, y↦NZ}
⊥ = {x↦⊥ZI, y↦⊥ZI}

slide-31
SLIDE 31

One Path in a Tuple Lattice

⊤={w↦MZ, x↦MZ, y↦MZ, z↦MZ}

  • {w↦Z, x↦MZ, y↦MZ, z↦MZ} -
  • {w↦Z, x↦MZ, y↦NZ, z↦MZ} -
  • {w↦Z, x↦NZ, y↦⊥ZI , z↦⊥ZI} -
  • {w↦⊥ZI, x↦NZ, y↦⊥ZI , z↦⊥ZI} -

⊥={w↦⊥ZI, x↦⊥ZI, y↦⊥ZI , z↦⊥ZI}

slide-32
SLIDE 32

Quick Quiz

  • Consider the following two tuple lattice values: [x↦Z, y↦MZ] and [x↦MZ, y↦NZ]
  • How do the two compare in the lattice ordering for zero analysis?
  • What is the join of these two tuple lattice values?

slide-33
SLIDE 33

Outline

  • Introduction to Dataflow Analysis
  • Dataflow Analysis Frameworks
  • Lattices
  • Abstraction functions
  • Control flow graphs
  • Flow functions
  • Worklist algorithm
  • Example Dataflow Analyses
  • Constant Propagation
  • Reaching Definitions
  • Live Variable Analysis
slide-34
SLIDE 34

Abstraction Function

  • Maps each concrete program state to a lattice element
  • For tuple lattices, the function can be defined for values and lifted to tuples
  • Integer Zero abstraction function αZI :
  • αZI(0) = Z
  • αZI(n) = NZ for all n ≠ 0
  • Zero Analysis abstraction function αZA :
  • αZA(η) = {x ↦ αZI(η(x)) | x∈Var}
  • This is just the tuple form of αZI(n)
  • Can be done for any tuple lattice
slide-35
SLIDE 35

Outline

  • Introduction to Dataflow Analysis
  • Dataflow Analysis Frameworks
  • Lattices
  • Abstraction functions
  • Control flow graphs
  • Flow functions
  • Worklist algorithm
  • Example Dataflow Analyses
  • Constant Propagation
  • Reaching Definitions
  • Live Variable Analysis
slide-36
SLIDE 36

Control Flow Graph (CFG)

  • Shows order of statement execution
  • Determines where data flows
  • Decomposes expressions into primitive operations
  • Typically one CFG node per “useful” AST node
  • constants, variables, binary operations, assignments, if, while, …

  • Loops are written out
  • Form a loop in the CFG
  • Benefit: analysis is defined one operation at a time
slide-37
SLIDE 37

Intuition for Building a CFG

  • Connect nodes in order of operation
  • Defined by language
  • Java order of operation
  • Expressions, assignment, sequence
  • Evaluate subexpressions left to right
  • Evaluate node after children (postfix)
  • While, If
  • Evaluate condition first, then if/while
  • if branches to else and then
  • while branches to loop body and exit
slide-38
SLIDE 38

Control Flow Graph Example

while i*2 < 10 do
    if x < i+2
    then x := x + 5
    else i := i + 1

[Figure: CFG for the program above, with nodes BEGIN, i, 2, *, 10, <, while, x, i, 2, +, <, if, x, 5, +, :=, i, 1, +, :=, END]

slide-39
SLIDE 39

Control Flow Graph Example

while i*2 < 10 do
    if x < i+2
    then x := x + 5
    else i := i + 1

[Figure: CFG for the program above, with nodes BEGIN, i, 2, *, 10, <, while, x, i, 2, +, <, if, x, 5, +, :=, i, 1, +, :=, END]

slide-40
SLIDE 40

Quick Quiz

  • Draw a CFG for the following program:

1: x := 0
2: y := 1
3: if (z == 0)
4:     x := x + y
5: else y := y – 1
6: w := y

slide-41
SLIDE 41

Outline

  • Introduction to Dataflow Analysis
  • Dataflow Analysis Frameworks
  • Lattices
  • Abstraction functions
  • Control flow graphs
  • Flow functions
  • Worklist algorithm
  • Example Dataflow Analyses
  • Constant Propagation
  • Reaching Definitions
  • Live Variable Analysis
slide-42
SLIDE 42

Flow Functions

  • Compute dataflow information after a statement from dataflow information before the statement
  • Formally, map a lattice element and a CFG node to a new lattice element
  • ƒanalysis(σ, [operation]) = σ’
  • ƒ is just a flow function
  • analysis is the name of our analysis
  • σ is the old lattice element
  • operation is the CFG node being analyzed
  • σ’ is the new lattice element
slide-43
SLIDE 43

Three Address Code

  • Analysis performed on 3-address code
  • inspired by 3 addresses in assembly language: add x,y,z
  • Convert complex expressions to 3-address code
  • Each subexpression represented by a temporary variable
  • x+3*y ⇒ t1:=3; t2:= t1*y; t3:=x+t2

if (x == foo()) { y = z * x + 5 / w; }
⇒
t1 = foo(); t2 = x == t1; if (t2) { t3 = z * x; t4 = 5 / w; y = t3 + t4; }
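
The conversion can be sketched as a post-order walk over an expression tree. The tuple encoding of expressions and the function name `to_three_address` are assumptions made for this example; literals get their own temporary, as in the x+3*y example above.

```python
from itertools import count


def to_three_address(expr):
    """Flatten an expression tree into 3-address instructions.

    expr is a variable name (str), an integer literal, or a tuple
    (op, left, right). Returns the list of generated instructions.
    """
    code = []
    fresh = count(1)  # numbering for temporaries t1, t2, ...

    def visit(e):
        if isinstance(e, str):   # variable: already has a name
            return e
        if isinstance(e, int):   # literal: load into a temporary
            t = f"t{next(fresh)}"
            code.append(f"{t} := {e}")
            return t
        op, left, right = e      # evaluate children left to right first
        l = visit(left)
        r = visit(right)
        t = f"t{next(fresh)}"
        code.append(f"{t} := {l} {op} {r}")
        return t

    visit(expr)
    return code


# x + 3*y
instructions = to_three_address(("+", "x", ("*", 3, "y")))
```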

slide-44
SLIDE 44

While3Addr

  • copy        x = y
  • binary op   x = y op z   (op ∈ {+,-,*,/,…})
  • literal     x = n
  • unary op    x = op y     (op ∈ {-,!,++,…})
  • label       label lab
  • jump        jump lab
  • branch      btrue x lab

slide-45
SLIDE 45

Zero Analysis Flow Functions

  • ZA(σ, [x := y]) = [x ↦ σ(y)] σ
  • ZA(σ, [x := n]) = if n==0 then [x ↦ Z]σ else [x ↦ NZ]σ
  • ZA(σ, [x := …]) = [x ↦ MZ] σ
  • Could be more precise!
  • ZA(σ, [x := a * b]) = if σ(a)=NZ && σ(b)=NZ then [x ↦ NZ]σ else [x ↦ MZ]σ
  • ZA(σ, /* any nonassignment */) = σ
slide-46
SLIDE 46

Zero Analysis Flow Functions

  • ZA(σ, [x := y]) = [x ↦ σ(y)] σ
  • ZA(σ, [x := n]) = if n==0 then [x ↦ Z]σ else [x ↦ NZ]σ
  • ZA(σ, [x := …]) = [x ↦ MZ] σ
  • Could be more precise!
  • ZA(σ, [x := a * b]) = if σ(a)=Z || σ(b)=Z then [x ↦ Z]σ else [x ↦ MZ]σ
  • ZA(σ, /* any nonassignment */) = σ
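
These flow functions translate almost directly into code. In the sketch below, instructions are encoded as tuples such as `("copy", x, y)`; that encoding and the name `flow_za` are assumptions of the example. Both multiplication refinements from the slides are included, and the NZ case assumes unbounded integers (no overflow), which the slides do not discuss.

```python
def flow_za(sigma, instr):
    """Zero-analysis flow function: returns a new state, leaving sigma untouched."""
    out = dict(sigma)
    kind = instr[0]
    if kind == "copy":                      # x := y
        _, x, y = instr
        out[x] = sigma[y]
    elif kind == "const":                   # x := n
        _, x, n = instr
        out[x] = "Z" if n == 0 else "NZ"
    elif kind == "binop":                   # x := a op b
        _, x, op, a, b = instr
        if op == "*" and "Z" in (sigma[a], sigma[b]):
            out[x] = "Z"                    # anything times zero is zero
        elif op == "*" and sigma[a] == sigma[b] == "NZ":
            out[x] = "NZ"                   # assumes no integer overflow
        else:
            out[x] = "MZ"                   # default: we do not know
    # any non-assignment leaves the state unchanged
    return out


sigma = {"x": "MZ", "y": "NZ", "z": "Z"}
```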
slide-47
SLIDE 47

Zero Analysis Example

x := 0; while x < 3 do x := x+1

[Figure: CFG with nodes BEGIN, [x]1, [0]2, [:=]3, [x]4, [3]5, [<]6, [while]7, [x]8, [x]9, [1]10, [+]11, [:=]12, [;]13, END]

slide-48
SLIDE 48

Zero Analysis Example

x := 0; while x < 3 do x := x+1

[Figure: CFG with nodes BEGIN, [x]1, [0]2, [:=]3, [x]4, [3]5, [<]6, [while]7, [x]8, [x]9, [1]10, [+]11, [:=]12, [;]13, END]

slide-49
SLIDE 49

Zero Analysis Example

Initial dataflow σι = { x ↦ MZ | x∈Var }

Intuition: We know nothing about initial variable values. We could use a precondition if we had one.

[Figure: CFG with nodes BEGIN, [x]1, [0]2, [:=]3, [x]4, [3]5, [<]6, [while]7, [x]8, [x]9, [1]10, [+]11, [:=]12, [;]13, END]

slide-50
SLIDE 50

Zero Analysis Example

σι = { x ↦ MZ | x∈Var }
σ2 = ZA(σι, [t2 := 0]) = [t2 ↦ Z] σι

ZA(σ, [x := n]) = if n==0 then [x ↦ Z]σ else [x ↦ NZ]σ

[Figure: CFG with nodes BEGIN, [x]1, [0]2, [:=]3, [x]4, [3]5, [<]6, [while]7, [x]8, [x]9, [1]10, [+]11, [:=]12, [;]13, END]

slide-51
SLIDE 51

Zero Analysis Example

σι = { x ↦ MZ | x∈Var }
σ2 = [t2 ↦ Z] σι
σ3 = ZA(σ2, [x := t2])
   = [x ↦ σ2(t2)] σ2
   = [x ↦ Z] σ2
   = [x ↦ Z, t2 ↦ Z] σι

ZA(σ, [x := y]) = [x ↦ σ(y)] σ

[Figure: CFG with nodes BEGIN, [x]1, [0]2, [:=]3, [x]4, [3]5, [<]6, [while]7, [x]8, [x]9, [1]10, [+]11, [:=]12, [;]13, END]

slide-52
SLIDE 52

Zero Analysis Example

σι = { x ↦ MZ | x∈Var }
σ3 = [x ↦ Z, t2 ↦ Z] σι

Input to [3]5 comes from [:=]3 and [:=]12
Input should be σ3 ⊔ σ12
What is σ12?
Solution: assume ⊥
Benefit: σ3 ⊔ ⊥ = σ3
Same result as ignoring back edge first time

[Figure: CFG with nodes BEGIN, [x]1, [0]2, [:=]3, [x]4, [3]5, [<]6, [while]7, [x]8, [x]9, [1]10, [+]11, [:=]12, [;]13, END]

slide-53
SLIDE 53

Zero Analysis Example

σι = { x ↦ MZ | x∈Var }
σ3 = [x ↦ Z, t2 ↦ Z] σι
σ12 = ⊥
σ5 = ZA(σ3 ⊔ σ12, [t5 := 3])
   = ZA(σ3 ⊔ ⊥, [t5 := 3])
   = ZA(σ3, [t5 := 3])
   = [t5 ↦ NZ] σ3

ZA(σ, [x := n]) = if n==0 then [x ↦ Z]σ else [x ↦ NZ]σ

[Figure: CFG with nodes BEGIN, [x]1, [0]2, [:=]3, [x]4, [3]5, [<]6, [while]7, [x]8, [x]9, [1]10, [+]11, [:=]12, [;]13, END]

slide-54
SLIDE 54

Zero Analysis Example

σι = { x ↦ MZ | x∈Var }
σ3 = [x ↦ Z, t2 ↦ Z] σι
σ12 = ⊥
σ5 = [t5 ↦ NZ] σ3
σ6 = ZA(σ5, [t6 := x < t5])
   = σ5
   = [t5 ↦ NZ] σ3

ZA(σ, /* any other */) = σ

[Figure: CFG with nodes BEGIN, [x]1, [0]2, [:=]3, [x]4, [3]5, [<]6, [while]7, [x]8, [x]9, [1]10, [+]11, [:=]12, [;]13, END]

slide-55
SLIDE 55

Zero Analysis Example

σι = { x ↦ MZ | x∈Var }
σ3 = [x ↦ Z, t2 ↦ Z] σι
σ12 = ⊥
σ6 = [t5 ↦ NZ] σ3

Skipping similar nodes

[Figure: CFG with nodes BEGIN, [x]1, [0]2, [:=]3, [x]4, [3]5, [<]6, [while]7, [x]8, [x]9, [1]10, [+]11, [:=]12, [;]13, END]

slide-56
SLIDE 56

Zero Analysis Example

σι = { x ↦ MZ | x∈Var }
σ3 = [x ↦ Z, t2 ↦ Z] σι
σ12 = ⊥
σ10 = [t10 ↦ NZ, …] σ3
σ11 = ZA(σ10, [t11 := x + t10])
    = [t11 ↦ MZ] σ10

ZA(σ, [x := y op z]) = [x ↦ MZ] σ

[Figure: CFG with nodes BEGIN, [x]1, [0]2, [:=]3, [x]4, [3]5, [<]6, [while]7, [x]8, [x]9, [1]10, [+]11, [:=]12, [;]13, END]

slide-57
SLIDE 57

Zero Analysis Example

σι = { x ↦ MZ | x∈Var }
σ3 = [x ↦ Z, t2 ↦ Z] σι
σ12 = ⊥
σ11 = [t10 ↦ NZ, t11 ↦ MZ, …] σ3
σ12 = ZA(σ11, [x := t11])
    = [x ↦ σ11(t11)] σ11
    = [x ↦ MZ] σ11
    = [x ↦ MZ, …] σ3

ZA(σ, [x := y]) = [x ↦ σ(y)] σ

[Figure: CFG with nodes BEGIN, [x]1, [0]2, [:=]3, [x]4, [3]5, [<]6, [while]7, [x]8, [x]9, [1]10, [+]11, [:=]12, [;]13, END]

slide-58
SLIDE 58

Zero Analysis Example

σι = { x ↦ MZ | x∈Var }
σ3 = [x ↦ Z, t2 ↦ Z] σι
σ12 = [x ↦ MZ, …] σ3
σ5 = ZA(σ3 ⊔ σ12, [t5 := 3])
   = ZA([x ↦ MZ]σ3, [t5 := 3])
   = [t5 ↦ NZ] [x ↦ MZ, …]σ3
   = [t5 ↦ NZ, x ↦ MZ, …] σ3

ZA(σ, [x]k) = [tk ↦ σ(x)] σ

[Figure: CFG with nodes BEGIN, [x]1, [0]2, [:=]3, [x]4, [3]5, [<]6, [while]7, [x]8, [x]9, [1]10, [+]11, [:=]12, [;]13, END]

slide-59
SLIDE 59

Zero Analysis Example

σι = { x ↦ MZ | x∈Var }
σ3 = [x ↦ Z, t2 ↦ Z] σι
σ12 = [x ↦ MZ, …] σ3

Propagation of x ↦ MZ continues
σ12 does not change, so no need to iterate again

[Figure: CFG with nodes BEGIN, [x]1, [0]2, [:=]3, [x]4, [3]5, [<]6, [while]7, [x]8, [x]9, [1]10, [+]11, [:=]12, [;]13, END]

slide-60
SLIDE 60

Quick Quiz

  • Explain in detail how the dataflow lattice value after the statement w := y is computed, using the CFG below as your point of reference.
  • Answer:
slide-61
SLIDE 61

Outline

  • Introduction to Dataflow Analysis
  • Dataflow Analysis Frameworks
  • Lattices
  • Abstraction functions
  • Control flow graphs
  • Flow functions
  • Worklist algorithm
  • Example Dataflow Analyses
  • Constant Propagation
  • Reaching Definitions
  • Live Variable Analysis
slide-62
SLIDE 62

Kildall’s Dataflow Analysis Worklist Algorithm

worklist = new Set();
for all node indexes i do
    results[i] = ⊥A;
results[entry] = ιA;
worklist.add(all nodes);
while (!worklist.isEmpty()) do
    i = worklist.pop();
    before = ⊔k∈pred(i) results[k];
    after = A(before, node(i));
    if (!(after ⊑ results[i]))
        results[i] = after;
        for all k∈succ(i) do
            worklist.add(k);

  • Ok to just add entry node if flow functions cannot return ⊥A (examples will assume this)
  • Pop removes the most recently added element from the set (performance optimization)
slide-63
SLIDE 63

Example of Worklist

[a := 0]1
[b := 0]2
while [a < 2]3 do
    [b := a]4;
    [a := a + 1]5;
[a := 0]6

Position | Worklist | a  | b
         | 1        | MZ | MZ

[Figure: control flow graph with nodes 1, 2, 3, 4, 5, 6]

slide-64
SLIDE 64

Example of Worklist

[a := 0]1
[b := 0]2
while [a < 2]3 do
    [b := a]4;
    [a := a + 1]5;
[a := 0]6

Position | Worklist | a  | b
         | 1        | MZ | MZ
1        | 2        | Z  | MZ
2        | 3        | Z  | Z
3        | 4,6      | Z  | Z
4        | 5,6      | Z  | Z
5        | 3,6      | MZ | Z
3        | 4,6      | MZ | Z
4        | 5,6      | MZ | MZ
5        | 3,6      | MZ | MZ
3        | 4,6      | MZ | MZ
4        | 6        | MZ | MZ
6        |          | Z  | MZ

[Figure: control flow graph with nodes 1, 2, 3, 4, 5, 6]

slide-65
SLIDE 65

Quick Quiz

  • Show how the worklist algorithm given in class operates on the program given, by filling in the table below.

1: x := 0
2: y := 1
3: if (z == 0)
4:     x := x + y
5: else y := y – 1
6: w := y

slide-66
SLIDE 66

Worklist Algorithm Performance

worklist = new Set();
for all node indexes i do
    input[i] = ⊥A;
input[entry] = ιA;
worklist.add(all nodes);
while (!worklist.isEmpty()) do
    i = worklist.pop();
    after = A(input[i], node(i));
    for all k∈succ(i) do
        newinput = input[k] ⊔ after
        if (!(newinput ⊑ input[k]))
            input[k] = newinput;
            worklist.add(k);
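
The input-based variant above can be sketched in Python and run on the earlier worklist example program. Everything below is illustrative: the tuple encoding of nodes, the helper names, and the choice to seed the worklist with only the entry node (which the slides allow when flow functions cannot return ⊥) are assumptions of this sketch.

```python
def join_zi(a, b):
    """Join in the zero lattice; "bot" is the identity."""
    if a == b or b == "bot":
        return a
    if a == "bot":
        return b
    return "MZ"


def join(s1, s2):
    return {x: join_zi(s1[x], s2[x]) for x in s1}


def leq(s1, s2):
    return join(s1, s2) == s2


def flow(sigma, instr):
    """Zero-analysis flow function over a tiny instruction encoding."""
    out = dict(sigma)
    if instr[0] == "const":      # x := n
        out[instr[1]] = "Z" if instr[2] == 0 else "NZ"
    elif instr[0] == "copy":     # x := y
        out[instr[1]] = sigma[instr[2]]
    elif instr[0] == "binop":    # x := y op z, result unknown
        out[instr[1]] = "MZ"
    return out                   # tests and other nodes change nothing


def analyze(nodes, succ, entry, variables):
    inputs = {i: {x: "bot" for x in variables} for i in nodes}
    inputs[entry] = {x: "MZ" for x in variables}  # initial dataflow ι
    worklist = [entry]
    while worklist:
        i = worklist.pop()       # most recently added node first
        after = flow(inputs[i], nodes[i])
        for k in succ[i]:
            new_input = join(inputs[k], after)
            if not leq(new_input, inputs[k]):
                inputs[k] = new_input
                if k not in worklist:
                    worklist.append(k)
    return inputs


# [a := 0]1 [b := 0]2 while [a < 2]3 do [b := a]4; [a := a + 1]5; [a := 0]6
nodes = {1: ("const", "a", 0), 2: ("const", "b", 0), 3: ("test",),
         4: ("copy", "b", "a"), 5: ("binop", "a"), 6: ("const", "a", 0)}
succ = {1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [3], 6: []}
inputs = analyze(nodes, succ, entry=1, variables=["a", "b"])
final = flow(inputs[6], nodes[6])  # state after the last statement
```

The order in which nodes are popped differs from the table on the slide, but the fixed point is the same: after statement 6, a is Z and b is MZ.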

slide-67
SLIDE 67

Worklist Algorithm Performance

worklist = new Set();
for all node indexes i do
    input[i] = ⊥A;
input[entry] = ιA;
worklist.add(all nodes);
while (!worklist.isEmpty()) do
    i = worklist.pop();
    after = A(input[i], node(i));
    for all k∈succ(i) do
        newinput = input[k] ⊔ after
        if (!(newinput ⊑ input[k]))
            input[k] = newinput;
            worklist.add(k);

  • How many times might a node get added to the worklist?
  • The node’s input must increase each time
  • The number of increases is bounded by the height h of the lattice
  • How many times do statements execute?
  • While loop executes h times for each node n (h*n total)
  • Statements in inner loop execute h times for each successor edge (h*e total, where e ≥ n)
  • Assume operation cost is c
  • Then performance is O(h*e*c)
  • Often h, e, and c are bounded by n. So we get O(n^3)
  • Good enough to run on a function, but not on the whole program

slide-68
SLIDE 68

Outline

  • Dataflow Analysis Frameworks
  • Lattices
  • Abstraction functions
  • Control flow graphs
  • Flow functions
  • Worklist algorithm
  • Example Dataflow Analyses
  • Constant Propagation
  • Reaching Definitions
  • Live Variable Analysis
  • Dataflow Analysis Correctness
  • Termination
  • Soundness
slide-69
SLIDE 69

Constant Propagation

  • Goal: determine which variables hold a constant value:

x := 3;
y := x+7;
if b
then z := x+2
else z := y-5;
w := z-2

  • What is w?
  • Useful for optimization, error checking
  • Zero analysis is a special case
slide-70
SLIDE 70

Constant Propagation Definition

  • Constant lattice (LC, ⊑C, ⊔C, ⊥, ⊤)
  • LC = ℤ ∪ { ⊥, ⊤ }
  • ∀n∈ℤ : ⊥ ⊑C n && n ⊑C ⊤
  • Constant propagation lattice
  • Tuple lattice formed from above lattice
  • See notes on zero analysis for details

[Figure: constant lattice diagram showing the integers …, -1, 0, 1, …]

  • Abstraction function:
  • αC(n) = n
  • αCP(η) = { x ↦ αC(η(x)) | x∈Var }
  • Initial data:
  • ιCP = { x ↦ ⊤ | x∈Var }
slide-71
SLIDE 71

Constant Propagation Definition

  • CP(σ, [x := y]) = [x ↦ σ(y)] σ
  • CP(σ, [x := n]) = [x ↦ n]σ
  • CP(σ, [x := y op z]) = [x ↦ (σ(y) op⊤ σ(z))] σ
  • n op⊤ m = n op m
  • n op⊤ ⊤ = ⊤
  • ⊤ op⊤ m = ⊤
  • Note: we could define for ⊥ too, but we won’t actually ever see ⊥ during analysis

  • CP(σ, /* any other */) = σ
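
A small sketch of these flow functions, applied to a straight-line program (x := 3; y := x+7; z := x+2; w := z-2, values chosen for illustration). The tuple encoding, the string `"top"` for ⊤, and allowing integer literals as operands are all assumptions of this example rather than part of the slides.

```python
import operator

TOP = "top"
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def value_of(sigma, operand):
    """Literals stand for themselves; variables are looked up in sigma."""
    return operand if isinstance(operand, int) else sigma[operand]


def op_top(op, a, b):
    """op⊤: compute on constants, give up as soon as either side is ⊤."""
    if a == TOP or b == TOP:
        return TOP
    return OPS[op](a, b)


def flow_cp(sigma, instr):
    out = dict(sigma)
    kind = instr[0]
    if kind == "copy":       # x := y
        _, x, y = instr
        out[x] = sigma[y]
    elif kind == "const":    # x := n
        _, x, n = instr
        out[x] = n
    elif kind == "binop":    # x := y op z
        _, x, op, y, z = instr
        out[x] = op_top(op, value_of(sigma, y), value_of(sigma, z))
    return out               # any other node: state unchanged


sigma = {v: TOP for v in "xyzw"}  # initial data: everything is ⊤
program = [("const", "x", 3), ("binop", "y", "+", "x", 7),
           ("binop", "z", "+", "x", 2), ("binop", "w", "-", "z", 2)]
for instr in program:
    sigma = flow_cp(sigma, instr)
```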
slide-72
SLIDE 72

Constant Propagation Example

[x := 3]1;
[y := x+7]2;
if [b]3
then [z := x+2]4
else [z := y-5]5;
[w := z-2]6
slide-73
SLIDE 73

Constant Propagation Example

[x := 3]1;
[y := x+7]2;
if [b]3
then [z := x+2]4
else [z := y-5]5;
[w := z-2]6

Position | Worklist | x | y  | z | w
         | 1        | ⊤ | ⊤  | ⊤ | ⊤
1        | 2        | 3 | ⊤  | ⊤ | ⊤
2        | 3        | 3 | 10 | ⊤ | ⊤
3        | 4,5      | 3 | 10 | ⊤ | ⊤
4        | 6,5      | 3 | 10 | 5 | ⊤
6        | 5        | 3 | 10 | 5 | 3
5        | 6        | 3 | 10 | 5 | ⊤
6        |          | 3 | 10 | 5 | 3

slide-74
SLIDE 74

Constant Propagation Example

[x := 3]1;
[y := x+7]2;
if [b]3
then [z := x+1]4
else [z := y-5]5;
[w := z-2]6

Position | Worklist | x | y  | z | w
         | 1        | ⊤ | ⊤  | ⊤ | ⊤
1        | 2        | 3 | ⊤  | ⊤ | ⊤
2        | 3        | 3 | 10 | ⊤ | ⊤
3        | 4,5      | 3 | 10 | ⊤ | ⊤
slide-75
SLIDE 75

Constant Propagation Example

[x := 3]1;
[y := x+7]2;
if [b]3
then [z := x+1]4
else [z := y-5]5;
[w := z-2]6

Position | Worklist | x | y  | z | w
         | 1        | ⊤ | ⊤  | ⊤ | ⊤
1        | 2        | 3 | ⊤  | ⊤ | ⊤
2        | 3        | 3 | 10 | ⊤ | ⊤
3        | 4,5      | 3 | 10 | ⊤ | ⊤
4        | 6,5      | 3 | 10 | 4 | ⊤
6        | 5        | 3 | 10 | 4 | 2
5        | 6        | 3 | 10 | 5 | ⊤
6        |          | 3 | 10 | ⊤ | ⊤

slide-76
SLIDE 76

Loss of Precision

if [x = 0]1
then [y := 1]2;
else [y := x]3;
[z := 10/y]4

Position | Worklist | x  | y  | z
         | 1        | MZ | MZ | MZ
1        | 2,3      | MZ | MZ | MZ
2        | 4,3      | MZ | NZ | MZ
4        | 3        | MZ | NZ | MZ
3        | 4        | MZ | MZ | MZ
4        |          | MZ | MZ | MZ

slide-77
SLIDE 77

Branch Sensitivity for Zero Analysis

  • Existing flow functions
  • ZA(σ, [x := y]) = [x ↦ σ(y)] σ
  • ZA(σ, [x := n]) = if n==0 then [x ↦ Z]σ else [x ↦ NZ]σ
  • ZA(σ, [x := y op z]) = [x ↦ MZ] σ
  • ZA(σ, /* any other */) = σ
  • Propagate different info on branches
  • ZA_T(σ, [x = 0]) = [x ↦ Z] σ
  • ZA_F(σ, [x = 0]) = [x ↦ NZ] σ
  • Slightly more general:
  • ZA_T(σ, [x = y]) = [x ↦ σ(y)] σ
  • ZA_F(σ, [x = y]) = [x ↦ ¬σ(y)] σ
  • Assume ¬Z=NZ; ¬NZ=Z; ¬MZ=MZ

slide-78
SLIDE 78

Branch Sensitivity for Zero Analysis

  • Existing flow functions
  • ZA(σ, [x := y]) = [x ↦ σ(y)] σ
  • ZA(σ, [x := n]) = if n==0 then [x ↦ Z]σ else [x ↦ NZ]σ
  • ZA(σ, [x := y op z]) = [x ↦ MZ] σ
  • ZA(σ, /* any other */) = σ
  • Propagate different info on branches
  • ZA_T(σ, [x = 0]) = [x ↦ Z] σ
  • ZA_F(σ, [x = 0]) = [x ↦ NZ] σ
  • Slightly more general:
  • ZA_T(σ, [x = y]) = [x ↦ σ(y)] σ
  • ZA_F(σ, [x = y]) = [x ↦ ¬σ(y)] σ
  • Assume ¬Z=NZ; ¬NZ=Z; ¬MZ=MZ
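
To see the effect of the branch-specific functions, here is a sketch restricted to the [x = 0] test, applied to the program if x = 0 then y := 1 else y := x; z := 10/y. The helper names and the dict encoding of states are assumptions of this example.

```python
def join_zi(a, b):
    """Join in the zero lattice; "bot" is the identity."""
    if a == b or b == "bot":
        return a
    if a == "bot":
        return b
    return "MZ"


def assume_zero_test(sigma, x, branch):
    """Flow function for the test [x = 0] on the true or false branch."""
    out = dict(sigma)
    out[x] = "Z" if branch else "NZ"
    return out


start = {"x": "MZ", "y": "MZ", "z": "MZ"}

then_state = assume_zero_test(start, "x", True)
then_state["y"] = "NZ"             # [y := 1]

else_state = assume_zero_test(start, "x", False)
else_state["y"] = else_state["x"]  # [y := x], and x is NZ on this branch

# Merge the two branches before the division [z := 10/y].
at_division = {v: join_zi(then_state[v], else_state[v]) for v in start}
```

At the merge point x is MZ again, but y is NZ on both branches, so the division by y can be shown safe.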
slide-79
SLIDE 79

Precision Regained

Worklist simplified to the statement level

if [x = 0]1
then [y := 1]2;
else [y := x]3;
[z := 10/y]4

Position   | Worklist | x  | y  | z
           | 1        | MZ | MZ | MZ
1T         | 2,3      | Z  | MZ | MZ
1F         | 2,3      | NZ | MZ | MZ
2 (use 1T) | 4,3      | Z  | NZ | MZ
4          | 3        | Z  | NZ | MZ
3 (use 1F) | 4        | NZ | NZ | MZ
4          |          | MZ | NZ | MZ

slide-80
SLIDE 80

Outline

  • Dataflow Analysis Frameworks
  • Lattices
  • Abstraction functions
  • Control flow graphs
  • Flow functions
  • Worklist algorithm
  • Example Dataflow Analyses
  • Constant Propagation
  • Reaching Definitions
  • Live Variable Analysis
  • Dataflow Analysis Correctness
  • Termination
  • Soundness
slide-81
SLIDE 81

Reaching Definitions Analysis

  • Goal: determine which is the most recent assignment to a variable that precedes its use:

[y := x]1;
[z := 1]2;
while [y>1]3 do
    [z := z * y]4;
    [y := y - 1]5;
[y := 0]6;

  • Example: definitions 1 and 5 reach the use of y at 4
  • Applications
  • Simpler version of constant propagation, zero analysis, etc.
  • Just look at the reaching definitions for constants
  • If definitions reaching use include “undefined” sentinel, then we may be using an undefined variable

slide-82
SLIDE 82

Reaching Definitions

  • Set Lattice (ℙDEFS, ⊑RD, ⊔RD, ∅, DEFS)
  • DEFS is the set of definitions in the program
  • Each element of the lattice is a subset of DEFS
  • ℙDEFS is the powerset of DEFS, i.e. the set of all subsets of DEFS

[Figure: Hasse diagram of the powerset lattice for DEFS={1,2,4}: DEFS at the top; {1,2}, {1,4}, {2,4} below it; then {1}, {2}, {4}; ∅ at the bottom]

  • Approximation
  • A definition d may reach program point P if d is in the lattice at P
  • We call this a may analysis
  • x ⊑RD y iff x ⊆ y
  • x ⊔RD y = x ⋃ y
  • This is a direct consequence of the definition of ⊑RD
  • Most precise element ⊥=∅ (no reaching definitions)
  • Least precise element ⊤=DEFS (all definitions reach)

slide-83
SLIDE 83

Reaching Definitions

  • Initially assume dummy assignments
  • Represents passed values for parameters
  • Represents uninitialized for non-parameters
  • ιRD = { x0 | x∈Var }
  • Flow functions
  • ƒRD(σ, [x := …]k) = σ − { xm | xm ∈ DEFS(x) } ⋃ { xk }

  • Kills (removes from set) all other definitions of x
  • Generates (adds to set) the current definition xk
  • Kill/Gen pattern true in many analyses with set lattices
  • ƒRD(σ, /* any other */) = σ
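
The kill/gen pattern can be sketched with Python sets. Here a definition xk is encoded as the pair `(variable, label)`; that encoding and the name `flow_rd` are assumptions of this example.

```python
def flow_rd(sigma, var, label):
    """Flow function for an assignment [var := ...]label."""
    killed = {d for d in sigma if d[0] == var}  # every other definition of var
    return (sigma - killed) | {(var, label)}    # kill first, then generate


# State before [z := z * y]4 on the second pass through the loop.
before = frozenset({("x", 0), ("y", 1), ("y", 5), ("z", 2), ("z", 4)})
after = flow_rd(before, "z", 4)
```

Both z2 and the older z4 are killed and z4 is generated again, which gives {x0, y1, y5, z4} as in the example that follows.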
slide-84
SLIDE 84

Reaching Definitions Example

[y := x]1;
[z := 1]2;
while [y>1]3 do
    [z := z * y]4;
    [y := y - 1]5;
[y := 0]6;

slide-85
SLIDE 85

Reaching Definitions Example

[y := x]1;
[z := 1]2;
while [y>1]3 do
    [z := z * y]4;
    [y := y - 1]5;
[y := 0]6;

Position | Worklist | Lattice element
         | 1        | {x0, y0, z0}
1        | 2        | {x0, y1, z0}
2        | 3        | {x0, y1, z2}
3        | 4,6      | {x0, y1, z2}
4        | 5,6      | {x0, y1, z4}
5        | 3,6      | {x0, y5, z4}
3        | 4,6      | {x0, y1, y5, z2, z4}
4        | 5,6      | {x0, y1, y5, z4}
5        | 6        | {x0, y5, z4}
6        |          | {x0, y6, z2, z4}

slide-86
SLIDE 86

Outline

  • Dataflow Analysis Frameworks
  • Lattices
  • Abstraction functions
  • Control flow graphs
  • Flow functions
  • Worklist algorithm
  • Example Dataflow Analyses
  • Constant Propagation
  • Reaching Definitions
  • Live Variable Analysis
  • Dataflow Analysis Correctness
  • Termination
  • Soundness
slide-87
SLIDE 87

Live Variables Analysis

  • Goal: determine which variables may be used again before they are redefined (i.e. are live) at the current program point:

[y := x]1;
[z := 1]2;
while [y>1]3 do
    [z := z * y]4;
    [y := y - 1]5;
[y := 0]6;

  • Example: after statement 1, y is live, but x and z are not
  • Optimization applications
  • If a variable is not live after it is defined, can remove the definition statement (e.g. 6 in the example)

slide-88
SLIDE 88

Live Variables Definition

[Figure: Hasse diagram of the powerset lattice for Var={x,y,z}: Var at the top; {x,y}, {x,z}, {y,z} below it; then {x}, {y}, {z}; ∅ at the bottom]

  • Set Lattice (ℙVar, ⊑LV, ⊔LV, ∅, Var)
  • Var is the set of variables in the program
  • Each element of the lattice is a subset of Var
  • ℙVar is the powerset of Var, i.e. the set of all subsets of Var
  • x ⊑LV y iff x ⊆ y
  • x ⊔LV y = x ⋃ y
  • Most precise element ⊥=∅ (no live variables)
  • Least precise element ⊤=Var (all variables live)

slide-89
SLIDE 89

Live Variables Definition

  • Live Variables is a backwards analysis
  • To figure out if a variable is live, you have to look at the future execution of the program
  • Will x be used before it is redefined?
  • When x is defined, assume it is not live
  • When x is used, assume it is live
  • Propagate lattice elements as usual, except backwards
  • Initially assume return value is live
  • ιLV = { x } where x is the variable returned from the function

slide-90
SLIDE 90

Flow Function Practice

  • Write flow functions for Live Variable analysis:

  • ƒLV(σ, [x := e]k) =
  • ƒLV(σ, /* any other */) =
slide-91
SLIDE 91

Flow Function Practice

  • Write flow functions for Live Variable analysis:
  • ƒLV(σ, [x := e]k) = (σ − { x }) ⋃ vars(e)
  • Kills (removes from set) the variable x
  • Generates (adds to set) the variables in e
  • Note: must kill first then generate (what if e = x?)
  • ƒLV(σ, [e]k) = σ ⋃ vars(e)
  • ƒLV(σ, /* any other */) = σ
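
These backward flow functions can be sketched with Python sets. The function names and the convention of passing vars(e) as a list are assumptions of this example.

```python
def flow_lv_assign(live_after, target, used):
    """[target := e]: kill the target first, then add the variables of e."""
    return (live_after - {target}) | set(used)


def flow_lv_use(live_after, used):
    """[e]: an expression or test only adds the variables it reads."""
    return live_after | set(used)


# Walking backwards over [y := x]1; [z := 1]2 with {y, z} live after statement 2.
live_before_2 = flow_lv_assign({"y", "z"}, "z", [])      # [z := 1]
live_before_1 = flow_lv_assign(live_before_2, "y", ["x"])  # [y := x]

# Kill-then-generate matters when e mentions the target, e.g. [x := x + 1].
self_update = flow_lv_assign({"x"}, "x", ["x"])
```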
slide-92
SLIDE 92

Worklist Practice

Show how the worklist algorithm given in class operates on the program given, by filling in the table below.

[y := x]1;
[z := 1]2;
while [y>1]3 do
    [z := z * y]4;
    [y := y - 1]5;
[y := 0]6;
return z;

slide-93
SLIDE 93

Live Variables Example

[y := x]1;
[z := 1]2;
while [y>1]3 do
    [z := z * y]4;
    [y := y - 1]5;
[y := 0]6;
return z;

Position | Worklist | Lattice element
exit     | 6        | {z}
6        | 3        | {z}
3        | 5,2      | {y,z}
5        | 4,2      | {y,z}
4        | 3,2      | {y,z}
3        | 2        | {y,z}
2        | 1        | {y}
1        |          | {x}