CO444H SSA SSA Construction SSA-based analysis Ben Livshits 1 - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Dataflow Solutions
SSA
SSA Construction
SSA-based analysis

CO444H

Ben Livshits

1

slide-2
SLIDE 2

2

Refresher: Reaching Definitions

  • Direction D = forward.
  • Domain V = set of all sets of definitions in the flow graph.
  • ∧ = union.
  • Functions F = all “gen-kill” functions of the form f(x) = (x - KILL) ∪ GEN, where KILL and GEN are sets of definitions (members of V).
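
The gen-kill form above can be sketched in a few lines of Python. This is only an illustration: the definition labels d1..d4 and the helper names `gen_kill` and `meet` are hypothetical and do not come from the slides.

```python
def gen_kill(gen, kill):
    """Build the transfer function f(x) = (x - KILL) | GEN."""
    gen, kill = frozenset(gen), frozenset(kill)

    def f(x):
        return (frozenset(x) - kill) | gen

    return f


def meet(*values):
    """The meet for reaching definitions is set union."""
    out = frozenset()
    for v in values:
        out |= frozenset(v)
    return out


# Hypothetical block containing "d3: a = ...", which kills the other
# definitions of a (labeled d1 and d2 here).
f_b = gen_kill(gen={"d3"}, kill={"d1", "d2"})
print(sorted(f_b({"d1", "d4"})))  # ['d3', 'd4']
```

Representing each value as a frozenset keeps the lattice elements immutable, so they can be compared and stored safely during iteration.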

slide-3
SLIDE 3

Last Time: May vs. Must Analysis

            May                     Must
Forward     Reaching definitions    Available expressions
Backward    Live variables          Very busy expressions

3

slide-4
SLIDE 4

Different Dataflow Solutions

4

slide-5
SLIDE 5

5

What Does the Iterative Algorithm Do?

  • IDEAL = ideal solution = meet over all executable paths from entry to a point
  • MOP = meet over all paths from entry to a given point, of the transfer function along that path applied to vENTRY
  • MFP (maximal fixedpoint) = result obtained by running the iterative algorithm from last lecture

slide-6
SLIDE 6

6

Transfer Function of a Path

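[Figure: a path leading to block B, with transfer functions f1, f2, . . ., fn-1 along it.]

Transfer function of the path, applied to vENTRY: fn-1( . . . f2(f1(vENTRY)) . . . )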

slide-7
SLIDE 7

7

Maximum Fixedpoint

  • Fixedpoint = solution to the equations used in iteration:
      IN(B) = ∧ OUT(P), over all predecessors P of B
      OUT(B) = fB(IN(B))
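
As a rough sketch, the two equations can be iterated to a fixed point as follows for reaching definitions (meet = union). The CFG, the block names, and the definition labels d1 and d2 are hypothetical, and vENTRY is assumed to be the empty set.

```python
def solve_forward(cfg, transfer, entry="ENTRY"):
    """Iterate IN(B) = union of OUT(P), OUT(B) = f_B(IN(B)) until nothing changes."""
    preds = {b: [] for b in cfg}
    for p, succs in cfg.items():
        for s in succs:
            preds[s].append(p)
    in_ = {b: frozenset() for b in cfg}
    out = {b: frozenset() for b in cfg}
    changed = True
    while changed:
        changed = False
        for b in cfg:
            if b == entry:
                continue  # OUT(ENTRY) stays at vENTRY (assumed empty here)
            in_[b] = frozenset().union(*(out[p] for p in preds[b]))
            new_out = frozenset(transfer[b](in_[b]))
            if new_out != out[b]:
                out[b], changed = new_out, True
    return in_, out


# Hypothetical CFG: B1 defines a (d1); B2 is a loop body that redefines a (d2).
cfg = {"ENTRY": ["B1"], "B1": ["B2"], "B2": ["B2", "EXIT"], "EXIT": []}
transfer = {
    "B1": lambda x: (x - {"d2"}) | {"d1"},
    "B2": lambda x: (x - {"d1"}) | {"d2"},
    "EXIT": lambda x: x,
}
in_, out = solve_forward(cfg, transfer)
print(sorted(in_["B2"]), sorted(out["B2"]))  # ['d1', 'd2'] ['d2']
```

Both definitions reach the top of B2 because of the loop back edge, but only d2 leaves it.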

slide-8
SLIDE 8

Why Maximum?

Maximum = any other solution is ≤ the result of the iterative algorithm (MFP)

8

slide-9
SLIDE 9

9

MOP and IDEAL

  • All solutions are really meets of the result of starting with vENTRY and following some set of paths to the point in question
  • If we don’t include at least the IDEAL paths, we have an error – we are not conservative
  • But try not to include too many more – try to be precise

slide-10
SLIDE 10

10

MOP Versus IDEAL --- (1)

  • At each block B, MOP[B] ≤ IDEAL[B].
  • i.e., the meet over many paths is ≤ the meet over a subset.
  • Example:
  • x ∧ y ∧ z ≤ x ∧ y
  • because x ∧ y ∧ z ∧ x ∧ y = x ∧ y ∧ z.
  • Intuition: Anything not ≤ IDEAL is not safe, because there is some executable path whose effect is not accounted for.

slide-11
SLIDE 11

11

MOP Versus IDEAL --- (2)

  • Conversely: any solution that is ≤ IDEAL accounts for all executable paths (and maybe more paths), and is therefore conservative (safe), even if not accurate.

slide-12
SLIDE 12

12

MFP Versus MOP --- (1)

  • Is MFP ≤ MOP?
  • If so, then since MOP ≤ IDEAL, we have MFP ≤ IDEAL, and therefore MFP is safe.
  • Yes, but … requires two important assumptions about the framework:

1. “Monotonicity”
2. Finite height (no infinite chains . . . < x2 < x1 < x within the lattice)

slide-13
SLIDE 13

13

MFP Versus MOP --- (2)

  • Intuition: If we computed the MOP directly, we would compose functions along all paths, then take a big meet
  • But the MFP (iterative algorithm) alternates compositions and meets somewhat arbitrarily
  • Also, meets occur early, which causes a loss of precision

slide-14
SLIDE 14

14

Monotonicity

  • A framework is monotone if the transfer function (call it f) respects ≤.
  • That is:
  • If x ≤ y, then f(x) ≤ f(y).
  • Equivalently: f(x ∧ y) ≤ f(x) ∧ f(y).
  • Intuition: it is conservative to take a meet before completing the composition of functions.
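
For a small lattice, monotonicity can be checked by brute force. The sketch below assumes the reaching-definitions lattice (meet = union, so x ≤ y exactly when x ∪ y = x); the three definition labels and the functions `f` and `g` are hypothetical examples.

```python
from itertools import combinations


def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def leq(x, y):
    """x <= y iff x meet y == x; with meet = union this means x is a superset of y."""
    return x | y == x


def is_monotone(f, domain):
    """Check: whenever x <= y, also f(x) <= f(y)."""
    return all(leq(f(x), f(y)) for x in domain for y in domain if leq(x, y))


domain = powerset({"d1", "d2", "d3"})
f = lambda x: (x - {"d1"}) | {"d3"}                            # gen-kill form
g = lambda x: frozenset() if "d1" in x else frozenset({"d2"})  # not gen-kill
print(is_monotone(f, domain))  # True
print(is_monotone(g, domain))  # False
```

The function `g` fails because {d1} ≤ {} in this lattice, yet g({d1}) = {} is not ≤ g({}) = {d2}.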

slide-15
SLIDE 15

15

Monotonicity is Quite Common

  • The frameworks we’ve studied so far are all monotone:
  • Easy to prove for functions in GEN/KILL form
  • Try proving this yourself for reaching definitions, for example
  • And they all are finite height
  • Only a finite number of definitions, variables, etc. in any given program
  • You can only iterate so many times
slide-16
SLIDE 16

16

Two Paths to B That Meet Early

[Figure: two paths from ENTRY reach block B, one with OUT = x and the other with OUT = y; B has transfer function f.]

  • In MFP, values x and y get combined too soon: IN = x ∧ y, OUT = f(x ∧ y).
  • MOP considers the paths independently and combines at the last possible moment: OUT = f(x) ∧ f(y).
  • Since f(x ∧ y) ≤ f(x) ∧ f(y), we might have lost some precision because of the early meet.
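
A concrete case where the early meet loses precision is constant propagation, which is monotone but not distributive. The sketch below is not taken from the slides: the variables a, b, c, the statement c = a + b, and the NAC ("not a constant") marker are assumptions chosen for illustration.

```python
NAC = "NAC"  # "not a constant": the bottom value for a single variable


def meet(x, y):
    """Pointwise meet of two environments: keep a value only if both agree."""
    return {v: x[v] if x[v] == y[v] else NAC for v in x}


def f(env):
    """Transfer function for the statement c = a + b."""
    a, b = env["a"], env["b"]
    out = dict(env)
    out["c"] = NAC if NAC in (a, b) else a + b
    return out


x = {"a": 2, "b": 3, "c": NAC}  # OUT of one predecessor
y = {"a": 3, "b": 2, "c": NAC}  # OUT of the other predecessor

mfp = f(meet(x, y))     # meet first, then apply f (what the iterative algorithm does)
mop = meet(f(x), f(y))  # apply f along each path, then meet
print(mfp["c"], mop["c"])  # NAC 5
```

Along either path c is 5, but after the early meet neither a nor b is a known constant, so that fact is lost.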

slide-17
SLIDE 17

17

Distributive Frameworks

  • Strictly stronger than monotonicity is the distributivity condition: f(x ∧ y) = f(x) ∧ f(y)
  • Functions F = all “gen-kill” functions of the form f(x) = (x - KILL) ∪ GEN, where KILL and GEN are sets of definitions (members of V)
  • (x - (K1 ∩ K2)) ∪ (G1 ∪ G2) = (x - K1) ∪ G1 ∪ (x - K2) ∪ G2
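
The distributivity condition can be checked exhaustively for gen-kill functions over a small universe. This is a sanity check rather than a proof; the three definition labels and the helper names are hypothetical.

```python
from itertools import combinations


def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def gen_kill(gen, kill):
    return lambda x: (x - kill) | gen


def is_distributive(f, domain):
    """Check f(x meet y) == f(x) meet f(y), with meet = union."""
    return all(f(x | y) == f(x) | f(y) for x in domain for y in domain)


domain = powerset({"d1", "d2", "d3"})
# Try every GEN and KILL set over the three-definition universe.
all_ok = all(is_distributive(gen_kill(g, k), domain)
             for g in domain for k in domain)
print(all_ok)  # True
```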

slide-18
SLIDE 18

18

Advantages of Distributivity

  • All the GEN/KILL frameworks are distributive.
  • If a framework is distributive, then combining paths early doesn’t hurt.
  • MOP = MFP
  • That is, the iterative algorithm computes a solution that takes into account all and only the physical paths.
  • It also does so quite fast
slide-19
SLIDE 19

What Are Some of the Disadvantages of Multiple Reaching Definitions?

19

slide-20
SLIDE 20

SSA

  • SSA Representation
  • SSA Construction
  • Analysis based on SSA

20

slide-21
SLIDE 21

Static Single Assignment

  • Each variable has only one reaching definition
  • When two definitions of the same variable merge, a Ф function is introduced, with a new definition of the variable
  • First consider SSA for alias-free variables

21

slide-22
SLIDE 22

Example: CFG

[Figure: CFG whose blocks contain three definitions “a = …” and three uses “… = a+5”.]

Multiple reaching definitions

22

slide-23
SLIDE 23

Example: SSA Form

[Figure: the same CFG in SSA form, with statements a1 = …, … = a1+5, a2 = …, … = a2+5, a3 = …, a4 = Ф(a1,a3), … = a4+5.]

Single reaching definition

23

slide-24
SLIDE 24

Ф Functions

  • A Ф operand represents the reaching definition from the corresponding predecessor
  • The ordering of Ф operands is important for knowing which path each definition comes from
  • The predicate is generally not recorded as part of the Ф function

24

slide-25
SLIDE 25

SSA Conditions

1. If
  • two non-null paths X →+ Z and Y →+ Z converge at node Z, and
  • nodes X and Y contain assignments to V (V = ..),
  • then V = Ф(V, .., V) has been inserted at Z
2. Each mention of V has been replaced by a mention of Vi
3. V and the corresponding Vi have the same value.

25

slide-26
SLIDE 26

SSA Placement

  • SSA Representation
  • SSA Construction
      • Step 1: Place Ф statements
      • Step 2: Rename all variables
  • Converting out of SSA

26

slide-27
SLIDE 27

Ф Function Placement

[Figure: CFG with an assignment a = … and Ф functions a = Ф(a, a) placed at two nodes.]

Place minimal number of Ф functions

27

slide-28
SLIDE 28

Renaming Uses to Refer to Proper Definitions

28

[Figure: the CFG after renaming, with statements a1 = …, … = a1+5, a4 = …, … = a4+5, a2 = …, a3 = Ф(a1,a2), … = a3+5.]

slide-29
SLIDE 29

SSA Construction Algorithm (I)

  • Step 1: Place Ф statements by computing iterated dominance frontier

29

slide-30
SLIDE 30

Control Flow Graph (CFG)

  • A control flow graph G = (V, E)
  • Set V contains distinguished nodes START and END
  • Every node is reachable from START
  • END is reachable from every node in G.
  • START has no predecessors
  • END has no successors.
  • Predecessor, successor, path

30

slide-31
SLIDE 31

Dominator Relation

  • If X appears on every path from START to Y, then X dominates Y

  • Domination is both reflexive and transitive
  • IDOM(Y): immediate dominator of Y
  • Dominator Tree
  • START is the root
  • Any node Y other than START has IDOM(Y) as its parent
  • Parent, child, ancestor, descendant
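
The dominator relation and IDOM can be computed with a simple iterative scheme. The sketch below uses a hypothetical CFG over the node names from the example slides; its edges are an assumption, chosen to be consistent with the DF results stated later in the deck.

```python
def dominators(cfg, start="START"):
    """dom[n] = set of nodes that appear on every path from START to n."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for p, succs in cfg.items():
        for s in succs:
            preds[s].add(p)
    dom = {n: set(nodes) for n in nodes}
    dom[start] = {start}
    changed = True
    while changed:
        changed = False
        for n in nodes - {start}:
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom


def immediate_dominators(dom, start="START"):
    """IDOM(n) is the strict dominator of n closest to n in the dominator tree."""
    idom = {}
    for n, ds in dom.items():
        if n != start:
            # Strict dominators form a chain; the closest has the most dominators.
            idom[n] = max(ds - {n}, key=lambda d: len(dom[d]))
    return idom


# Hypothetical edges (assumed, not recovered from the slide figure).
cfg = {"START": ["a", "END"], "a": ["b", "c"], "b": ["d"],
       "c": ["d"], "d": ["END"], "END": []}
dom = dominators(cfg)
idom = immediate_dominators(dom)
print(idom["d"], idom["END"])  # a START
```

The `idom` map is the dominator tree in parent-pointer form: START is the root and every other node points at its parent.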

31

slide-32
SLIDE 32

Dominator Tree Example

[Figure: CFG with nodes START, a, b, c, d, END, next to its dominator tree (DT) under construction. DT so far: START.]

32

slide-33
SLIDE 33

Dominator Tree Example

[Figure: CFG with nodes START, a, b, c, d, END, next to its dominator tree (DT) under construction. DT so far: START, a.]

33

slide-34
SLIDE 34

Dominator Tree Example

[Figure: CFG with nodes START, a, b, c, d, END, next to its dominator tree (DT) under construction. DT so far: START, a, b, c.]

34

slide-35
SLIDE 35

Dominator Tree Example

[Figure: CFG with nodes START, a, b, c, d, END, next to its dominator tree (DT) under construction. DT so far: START, a, d, b, c.]

35

slide-36
SLIDE 36

Dominator Tree Example

[Figure: CFG with nodes START, a, b, c, d, END, next to its completed dominator tree (DT) with nodes START, a, d, END, b, c.]

36

slide-37
SLIDE 37

Dominance Frontier

  • Dominance frontier DF(X) for node X is the set of nodes Y such that
  • 1. X dominates a predecessor of Y
  • 2. X does not strictly dominate Y
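
The two conditions translate almost literally into code. The sketch below assumes a hypothetical CFG and the immediate-dominator map that goes with it; both are illustrative assumptions rather than the figure from the slides.

```python
def dominates(idom, x, y):
    """True if x dominates y (reflexive): walk up the dominator tree from y."""
    while y is not None:
        if y == x:
            return True
        y = idom.get(y)  # the root has no entry, so the walk stops there
    return False


def dominance_frontier(cfg, idom, x):
    """DF(x) = {y : x dominates a predecessor of y, x does not strictly dominate y}."""
    df = set()
    for p, succs in cfg.items():
        if dominates(idom, x, p):                  # condition 1
            for y in succs:
                strictly = x != y and dominates(idom, x, y)
                if not strictly:                   # condition 2
                    df.add(y)
    return df


# Hypothetical CFG and its immediate dominators (assumed for illustration).
cfg = {"START": ["a", "END"], "a": ["b", "c"], "b": ["d"],
       "c": ["d"], "d": ["END"], "END": []}
idom = {"a": "START", "b": "a", "c": "a", "d": "a", "END": "START"}
print(dominance_frontier(cfg, idom, "c"))  # {'d'}
print(dominance_frontier(cfg, idom, "a"))  # {'END'}
```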

37

slide-38
SLIDE 38

DF Example

[Figure: CFG with nodes START, a, b, c, d, END, and its dominator tree (DT) with nodes START, a, d, END, b, c.]

DF(c) = ?
DF(a) = ?

38

slide-39
SLIDE 39

DF Example

[Figure: CFG with nodes START, a, b, c, d, END, and its dominator tree (DT) with nodes START, a, d, END, b, c.]

DF(c) = {d}
DF(a) = ?

39

slide-40
SLIDE 40

DF Example

[Figure: CFG with nodes START, a, b, c, d, END, and its dominator tree (DT) with nodes START, a, d, END, b, c.]

DF(c) = {d}
DF(a) = {END}

40

slide-41
SLIDE 41

Computing DF(X)

  • DF(X) is the union of the following sets
      • DFlocal(X), a set of successor nodes that X doesn’t strictly dominate
          • E.g. DFlocal(c) = {d}, see previous slide
      • DFup(Z) for all Z ∈ Children(X)
          • DFup(Z) = {Y ∈ DF(Z) such that IDOM(Z) doesn’t strictly dominate Y}
          • E.g. X = a, Z = d, Y = END, see previous slide
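
The DFlocal/DFup decomposition suggests a bottom-up pass over the dominator tree. The sketch below uses the common shortcut of testing "X does not strictly dominate Y" as IDOM(Y) ≠ X, which is valid for the candidate nodes that arise in this pass. The CFG and `idom` map are hypothetical.

```python
def dominance_frontiers(cfg, idom, root="START"):
    """Compute DF(X) for every X as DFlocal(X) plus DFup(Z) for each child Z."""
    children = {n: [] for n in cfg}
    for n, parent in idom.items():
        children[parent].append(n)
    df = {}

    def visit(x):
        for z in children[x]:
            visit(z)  # children first (post-order on the dominator tree)
        # DFlocal(X): successors of X that X does not strictly dominate.
        df[x] = {y for y in cfg[x] if idom.get(y) != x}
        # DFup(Z): members of DF(Z) that IDOM(Z) = X does not strictly dominate.
        for z in children[x]:
            df[x] |= {y for y in df[z] if idom.get(y) != x}

    visit(root)
    return df


# Hypothetical CFG and immediate dominators (assumed for illustration).
cfg = {"START": ["a", "END"], "a": ["b", "c"], "b": ["d"],
       "c": ["d"], "d": ["END"], "END": []}
idom = {"a": "START", "b": "a", "c": "a", "d": "a", "END": "START"}
df = dominance_frontiers(cfg, idom)
print(df["c"], df["a"])  # {'d'} {'END'}
```

Here END enters DF(a) through DFup(d), matching the X = a, Z = d, Y = END example.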

41

[Figure: CFG with nodes START, a, b, c, d, END.]

slide-42
SLIDE 42

Iterated Dominance Frontier

  • We can also define this notion for sets of nodes: DF(SET) is the union of DF(X), where X ∈ SET.
  • Iterated dominance frontier DF+(SET) is the limit of
  • DF1 = DF(SET) and DFi+1 = DF(SET ∪ DFi)
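
The limit in this definition can be computed by iterating until the set stops growing. In the sketch below, the `df` table is a hypothetical precomputed dominance-frontier map for a small example CFG.

```python
def df_of_set(df, nodes):
    """DF(SET) = union of DF(X) for X in SET."""
    out = set()
    for n in nodes:
        out |= df[n]
    return out


def iterated_df(df, nodes):
    """DF+(SET): limit of DF1 = DF(SET), DFi+1 = DF(SET ∪ DFi)."""
    result = df_of_set(df, nodes)
    while True:
        nxt = df_of_set(df, set(nodes) | result)
        if nxt == result:
            return result
        result = nxt


# Hypothetical dominance frontiers (assumed for illustration).
df = {"START": set(), "a": {"END"}, "b": {"d"},
      "c": {"d"}, "d": {"END"}, "END": set()}
print(sorted(iterated_df(df, {"b"})))  # ['END', 'd']
```

DF({b}) alone is {d}; END appears only in the second round, because d is itself a node whose frontier must be included.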

42

slide-43
SLIDE 43

Computing Joins

  • J(SET) of join nodes
  • Set of all nodes Z such that there are (at least) two non-null CFG paths that start at two distinct nodes in SET and converge at Z
  • Iterated join J+(SET) is the limit of
  • J1 = J(SET) and Ji+1 = J(SET ∪ Ji)
  • J+(SET) = DF+(SET)

43

slide-44
SLIDE 44

Placement of Ф Functions in SSA

  • for each variable V
      • add all nodes with assignments to V to worklist W
      • for-each X in worklist W do
          • for-each Y in DF(X) do
              • if no Ф added in Y then
                  • place (V = Ф (V,…,V)) at Y
                  • if Y has not been added before, add Y to W.
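
The worklist procedure above might look like this in Python. It is a sketch under assumptions: the `df` table and the variable name v are hypothetical, and the function only records where a Ф is needed instead of rewriting any program.

```python
def place_phis(df, defsites):
    """defsites: variable -> set of nodes that assign it.
    Returns variable -> set of nodes where a Ф function for it is placed."""
    phis = {}
    for v, sites in defsites.items():
        has_phi = set()
        ever_on_worklist = set(sites)
        worklist = list(sites)
        while worklist:
            x = worklist.pop()
            for y in df[x]:
                if y not in has_phi:
                    has_phi.add(y)  # place V = Ф(V, ..., V) at Y
                    if y not in ever_on_worklist:
                        # The Ф is itself a new assignment to V at Y.
                        ever_on_worklist.add(y)
                        worklist.append(y)
        phis[v] = has_phi
    return phis


# Hypothetical dominance frontiers (assumed for illustration).
df = {"START": set(), "a": {"END"}, "b": {"d"},
      "c": {"d"}, "d": {"END"}, "END": set()}
print(place_phis(df, {"v": {"b", "c"}}))
```

With assignments to v in b and c, a Ф lands at d first; since that Ф is a new definition, d's own frontier then adds END.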

44

slide-45
SLIDE 45

Computational Complexity

  • Constructing SSA takes O(Atot * avrgDF), where
      • Atot: total number of assignments
      • avrgDF: weighted average DF size
  • The worst-case computational complexity is O(n^2).
      • e.g. nested repeat-until loops

[Figure: CFG with nodes S, a, b, c, E, d.]

45

slide-46
SLIDE 46

Ф Placement Example

[Figure: CFG with an assignment a = … and Ф functions a = Ф(a, a) placed at two nodes.]

Place Ф at Iterated Dominance Frontiers

46

slide-47
SLIDE 47

SSA Construction (II)

  • Step 2: Rename all variables in original program and Ф functions, using dominator tree and renaming stack to keep track of the current names.

47

slide-48
SLIDE 48

Variable Renaming

  • Rename from the START node recursively
  • For each node X
      • For each assignment (V = …) in X
          • Rename any use of V with the TOS of rename stack
          • Push the new name Vi on rename stack
          • i = i + 1
      • Rename all the Ф operands through successor edges
      • Recursively rename for all child nodes in the dominator tree
      • For each assignment (V = …) in X
          • Pop Vi in X from the rename stack
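
The renaming pass can be sketched as a recursive walk over the dominator tree with one stack per variable. The statement encoding (`[target, uses]` lists), the Ф table layout, and the three-block CFG are hypothetical choices for this example. The sketch also treats a Ф as a definition that pushes a new name, a detail the outline above leaves implicit, and numbers names from 0.

```python
from collections import defaultdict


def rename(cfg, children, blocks, phis, root):
    """blocks[n]: list of [target, uses] statements, rewritten in place.
    phis[n][v]: {"target": None, "args": {}} for each Ф for v placed at n."""
    counter = defaultdict(int)
    stack = defaultdict(list)

    def fresh(v):
        name = f"{v}{counter[v]}"
        counter[v] += 1
        stack[v].append(name)
        return name

    def top(v):
        return stack[v][-1] if stack[v] else v  # no reaching definition: keep the name

    def visit(x):
        pushed = []
        for v, phi in phis.get(x, {}).items():   # a Ф defines a new name
            phi["target"] = fresh(v)
            pushed.append(v)
        for stmt in blocks.get(x, []):
            target, uses = stmt
            stmt[1] = [top(u) for u in uses]     # rename uses with the TOS
            stmt[0] = fresh(target)              # push the new name
            pushed.append(target)
        for s in cfg[x]:                         # Ф operands via successor edges
            for v, phi in phis.get(s, {}).items():
                phi["args"][x] = top(v)
        for c in children.get(x, []):
            visit(c)
        for v in pushed:                         # pop the names defined in X
            stack[v].pop()

    visit(root)


# Hypothetical example: B1 defines and uses a, B2 redefines a, B3 merges.
cfg = {"B1": ["B2", "B3"], "B2": ["B3"], "B3": []}
children = {"B1": ["B2", "B3"]}                  # dominator tree
blocks = {"B1": [["a", []], ["t", ["a"]]], "B2": [["a", []]], "B3": [["u", ["a"]]]}
phis = {"B3": {"a": {"target": None, "args": {}}}}
rename(cfg, children, blocks, phis, "B1")
print(blocks["B3"], phis["B3"]["a"])
```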

48

slide-49
SLIDE 49

Renaming Example

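[Figure: renaming in progress on the example CFG. Statements shown: a1 = …, … = a1+5, a = …, … = a+5, a = …, a = Ф(a1, a), … = a+5. Labels: TOS, Rename expr.]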

49

slide-50
SLIDE 50

Remaining Questions?

  • What to do with
      • arrays
      • pointer aliasing
      • interprocedural analysis
  • How would this work in SSA?

    f(17)
    function f(int x){ x = x + 1; return x; }

50

slide-51
SLIDE 51

SSA

  • SSA Representation
  • SSA Construction
  • Analysis based on SSA

51

slide-52
SLIDE 52

Constant and Copy Propagation in SSA

52

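[Figure: SSA-form CFG with statements a0 = 1, x1 = a0+5, a2 = x1+4, x2 = a2+5, a1 = 6, a3 = Ф(a0,a1), x4 = a3+5, a4 = Ф(a2,a3), x3 = Ф(x2,x4). Value annotations in the figure: 6; 10; 15; 1,6; 6,11; 10,1,6; 15,11.]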

slide-53
SLIDE 53

Taint Tracking Analysis

  • Typically concerns itself with untrusted external inputs in the program and where they flow
  • For example
  • buf[i] = … -- what is the source of i? Can it be negative? Can it overrun buf?

53

slide-54
SLIDE 54

Taint Tracking in SSA

54

[Figure: the same SSA-form CFG with taint sources: a0 = tainted, x1 = a0+5, a2 = x1+4, x2 = a2+5, a1 = tainted, a3 = Ф(a0,a1), x4 = a3+5, a4 = Ф(a2,a3), x3 = Ffunction(x2,x4).]
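
Because every SSA name has exactly one definition, taint can be propagated along def-use edges with a small fixed-point loop. The sketch below encodes the statements from the slide as a map from each name to the names its definition reads. The function name and the second call, with a1 as the only source, are hypothetical additions for illustration.

```python
def tainted_names(uses, sources):
    """uses: SSA name -> names read by its (single) definition.
    sources: names defined directly from untrusted input."""
    tainted = set(sources)
    changed = True
    while changed:
        changed = False
        for name, operands in uses.items():
            if name not in tainted and any(o in tainted for o in operands):
                tainted.add(name)
                changed = True
    return tainted


# Statements as listed on the slide; a0 and a1 are the tainted sources.
uses = {
    "a0": [], "a1": [], "x1": ["a0"], "a2": ["x1"], "x2": ["a2"],
    "a3": ["a0", "a1"], "x4": ["a3"], "a4": ["a2", "a3"], "x3": ["x2", "x4"],
}
print(sorted(tainted_names(uses, {"a0", "a1"})))
# Hypothetical variation: if only a1 were tainted, x1, a2 and x2 stay clean.
print(sorted(tainted_names(uses, {"a1"})))  # ['a1', 'a3', 'a4', 'x3', 'x4']
```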