263-2810: Advanced Compiler Design 2.0 Static Single Assignment



slide-1
SLIDE 1

263-2810: Advanced Compiler Design 2.0 Static Single Assignment Form

Thomas R. Gross Computer Science Department ETH Zurich, Switzerland

slide-2
SLIDE 2

2.0 Static Single Assignment IR

§ SSA: Static single assignment
§ There is one assignment statement that writes a variable/field/memory location
  § Assignment for short: statement, expression, …
  § Static: in the source/IR
  § Single: each statement writes a different variable
§ SSA makes data dependences explicit

slide-3
SLIDE 3

Example

Instead of

    x = a + b
    d = x + 1
    b = a + c

we have

    x1 = a0 + b0
    d1 = x1 + 1
    b1 = a0 + c0

slide-4
SLIDE 4

Example

§ Statement 1 produces a value for statement 2
§ Only true dependences recorded
  § No constraints due to variable names

    x1 = a0 + b0
    d1 = x1 + 1
    b1 = a0 + c0

slide-5
SLIDE 5

Outline

SSA form is used in many production compilers

§ 2.1 SSA form for straight line code
  § How to turn a (JavaLi/C/Java/…) program into SSA form
§ 2.2 Conditional statements
§ 2.3 Benefits of SSA form
§ 2.4 SSA for well-structured programs
  § Only selected subset of control flow constructs allowed
  § “goto”-free programs
  § C: longjmp & friends
§ 3.0 SSA for arbitrary programs

slide-6
SLIDE 6

2.1 SSA for a basic block

§ Assumption: Program (method) translated into basic blocks, forest of IR trees for each basic block
  § E.g., AST or similar IR
  § All sources and destinations of operations visible
  § Program written without consideration of SSA format
§ Goal: transform one basic block into SSA format
§ Approach: consider all statements (IR trees) in sequence
  § Consider each operand
§ Simplification: consider only method-local scalar operands
  § Arrays, objects, structs, records: later

slide-7
SLIDE 7

§ For each variable X: counter C_X
  § C_X initialized to 0 at start of method
§ C_X indicates the “current” version
§ Given a statement or expression D = S ⊗ T or S ⊗ T
  1. Look up C_S and C_T
     § Yields current version, say Sn and Tm
  2. (for statements) Increment C_D
     § Yields new version (Dk)
  3. Replace S, T, D: Dk = Sn ⊗ Tm or Sn ⊗ Tm

slide-8
SLIDE 8

Example

    x = a + b
    y = x + 2
    z = a + 1
    x = c * 2
    w = x + 1

slide-9
SLIDE 9

Example

    x1 = a0 + b0
    y1 = x1 + 2
    z1 = a0 + 1
    x2 = c0 * 2
    w1 = x2 + 1
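The counter scheme for a single basic block can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the function name `rename_block`, the `(dest, expr)` statement encoding, and the use of a regular expression to find operands are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Matches variable names; numeric literals such as "2" are left alone.
IDENT = re.compile(r"[A-Za-z_]\w*")


def rename_block(statements, counters=None):
    """Rename the variables of one basic block into SSA form.

    statements: list of (dest, expr) pairs; dest is None for a bare expression.
    counters:   defaultdict(int) mapping each variable X to its counter C_X
                (hypothetical encoding; a missing entry means version 0).
    """
    counters = defaultdict(int) if counters is None else counters
    result = []
    for dest, expr in statements:
        # Step 1: look up the current version of every operand.
        new_expr = IDENT.sub(lambda m: f"{m.group(0)}{counters[m.group(0)]}", expr)
        if dest is None:
            result.append(new_expr)
            continue
        # Step 2: increment C_D to obtain the new version of the destination.
        counters[dest] += 1
        # Step 3: emit the rewritten statement.
        result.append(f"{dest}{counters[dest]} = {new_expr}")
    return result


block = [("x", "a + b"), ("y", "x + 2"), ("z", "a + 1"), ("x", "c * 2"), ("w", "x + 1")]
for line in rename_block(block):
    print(line)
```

Because the operands are rewritten before the destination counter is incremented, a statement that reads and writes the same variable uses the old version on the right and the new version on the left.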

slide-10
SLIDE 10

Example

Turn the following example into SSA form (3 min). The result:

    a1 = a0 + x0
    x1 = a1 + b0
    c1 = a1 + b0
    x2 = c1 + 2
    a2 = x2 + b0

slide-11
SLIDE 11

§ Each basic block can be handled that way …
§ Only the variable names are changed
  § Otherwise use trees as before
  § Could use any other IR
§ Sometimes variable aj is called a version of a
§ Need right value for counters C_X at the start of a basic block

slide-12
SLIDE 12


slide-13
SLIDE 13

Finding the current version

§ Simple example

    a = 1;
    if (b ≠ 0) {
        a = 0;
    }
    x = a;

    a1 = 1;
    if (b0 ≠ 0) {
        a2 = 0;
    }
    x1 = a???;

  Can’t use a1
  Can’t use a2

slide-14
SLIDE 14

Solution: φ function

§ Introduce a “magic” function φ
§ φ function delivers the correct version
  § (in the example)
  § If (b0 = 0): returns a1
  § If (b0 ≠ 0): returns a2
§ The function picks the correct version depending on the path taken to reach BB2
  § More precisely: the path that requires us to use either a1 or a2

slide-15
SLIDE 15

φ function

§ Result (return value) depends on path taken
  § Result value assigned to a new version of variable
§ Arguments are possible return values
  § Different versions of (conceptually) the same variable

    a1 = 1;
    if (b0 ≠ 0) {
        a2 = 0;
    }
    a3 = φ (a2, a1)
    x1 = a3;
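One way to read the φ function operationally is to remember which path was taken and let φ select the version that belongs to that path. The sketch below simulates the example in Python; the names `phi`, `run`, `came_from`, and the path labels `"entry"` and `"then"` are hypothetical and exist only for this illustration.

```python
def phi(env, incoming, came_from):
    """Return the value of the version that belongs to the path actually taken.

    env:       dict holding the values of all SSA names defined so far
    incoming:  dict mapping a path label to the SSA name arriving on that path
    came_from: label of the path that was executed
    Only the selected name is read; the other argument is never touched.
    """
    return env[incoming[came_from]]


def run(b0):
    """Execute the example for a given b0 and return the value of x1."""
    env = {"b0": b0}
    env["a1"] = 1
    came_from = "entry"            # path that skips the then-branch
    if env["b0"] != 0:
        env["a2"] = 0
        came_from = "then"         # path through the then-branch
    env["a3"] = phi(env, {"then": "a2", "entry": "a1"}, came_from)
    env["x1"] = env["a3"]
    return env["x1"]


print(run(0), run(5))
```

When `b0` is 0 the name `a2` is never defined, and the lookup still succeeds because φ reads only `a1` on that path.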

slide-16
SLIDE 16


slide-17
SLIDE 17

φ function -- Notes

§ φ function appears only on the right hand side of an assignment
§ φ function placed at beginning of basic block
  § Not mandatory but simplifies reading examples
  § Simplifies optimizations/transformations
  § No other statements or expressions appear before φ function in a basic block

slide-18
SLIDE 18

φ function -- Notes

§ φ function does not evaluate all arguments
  § Only the single argument that is returned is evaluated
  § Why does this matter?
  § Precise and fine-grained information

slide-19
SLIDE 19


slide-20
SLIDE 20

§ a2 is read (and therefore live) at the end of BB1
§ a1 is read (and therefore live) at the end of BB2
§ Neither a1 nor a2 is read (live) in BB3
  § No need to find register
  § Only a3 is a candidate for register allocation

slide-21
SLIDE 21

Converting to SSA

§ Given a CFG with ENTRY node, forest of IR trees or AST
§ Start with ENTRY node
  § Convert the operations in this basic block
§ Insert φ functions as needed
  § More on this later
§ Process next basic block until all blocks have been processed

slide-22
SLIDE 22

Implementing φ functions

BB0: if (b0 ≠ 0)
BB1: a2 = 0
BB2: a1 = 1
BB3: a3 = φ (a2, a1)

slide-23
SLIDE 23

§ Assign a1 and a2 to the same register
  § Both a1 and a2 are allocated to a register and they get the same register
  § Useful if a3 is read before it is spilled
§ Create a temporary t (stored in memory) and insert copy statements

BB1: a2 = 0
     t = a2
BB2: a1 = 1
     t = a1
BB3: a3 = t
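The copy-insertion variant can be sketched as a small rewriting pass. This is an illustrative sketch under simplifying assumptions, not a production algorithm: the block encoding, the `("phi", dest, args)` tuple, and the function name `lower_phis` are hypothetical, blocks are assumed to carry no explicit branch statement at their end, and interactions between several φ functions in one block are ignored.

```python
def lower_phis(blocks, preds):
    """Replace every phi by copies through a temporary.

    blocks: dict block name -> list of statements; a phi is the tuple
            ("phi", dest, args), every other statement is a plain string.
    preds:  dict block name -> list of predecessor names; args[i] of a phi
            is the version arriving from preds[block][i].
    """
    # Start from the blocks with all phi tuples removed.
    lowered = {name: [s for s in stmts if not isinstance(s, tuple)]
               for name, stmts in blocks.items()}
    counter = 0
    for name, stmts in blocks.items():
        phi_copies = []
        for stmt in stmts:
            if not isinstance(stmt, tuple):
                continue
            _, dest, args = stmt
            temp = "t" if counter == 0 else f"t{counter}"
            counter += 1
            # Each predecessor stores its own version into the temporary.
            for pred, arg in zip(preds[name], args):
                lowered[pred].append(f"{temp} = {arg}")
            # The join block reads the temporary instead of calling phi.
            phi_copies.append(f"{dest} = {temp}")
        lowered[name] = phi_copies + lowered[name]
    return lowered


blocks = {
    "BB0": ["if (b0 != 0)"],
    "BB1": ["a2 = 0"],
    "BB2": ["a1 = 1"],
    "BB3": [("phi", "a3", ["a2", "a1"])],
}
preds = {"BB0": [], "BB1": ["BB0"], "BB2": ["BB0"], "BB3": ["BB1", "BB2"]}
print(lower_phis(blocks, preds))
```

The copies are appended to the predecessors rather than placed in the join block because only the predecessor knows which version is the right one on its path.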

slide-24
SLIDE 24

2.3 Benefits of SSA form

  1. Some optimizations are easy or even obvious
  2. Efficient representation of dependences

slide-25
SLIDE 25

SSA-based optimizations

§ Elimination of common subexpressions (CSE)
§ The evaluation of an expression “a + b” at point P can be eliminated if “a + b” is evaluated on all paths leading to P
  § Must consider all paths
  § There cannot be an assignment to a or b along any path after a + b has been evaluated

slide-26
SLIDE 26

SSA-based optimizations

§ Elimination of common subexpressions (CSE)
  § First step: identify common subexpressions

    t = a + b;
    v = a + b;

slide-27
SLIDE 27

CSE

§ Must consider complete program

  § “All paths …”
  § “No assignments to operands …”

slide-28
SLIDE 28

SSA-based optimizations

§ SSA form immediately provides the answer

    t = ai + bj;
    v = am + bn;

  If m = i and n = j, the expression “a + b” is the same
  § Candidate for removal

slide-29
SLIDE 29

Example

    = a + b;
    if ( … ) {
        x = 0;
    }
    = a + b;

slide-30
SLIDE 30

Example

    = ai + bj;
    if ( … ) {
        x2 = 0;
    }
    = ai + bj;

slide-31
SLIDE 31


slide-32
SLIDE 32

Use-Def (ud) chains

§ Given a “use” of a variable, would like to know where the value was written (“defined”)
  § Useful to identify register allocation candidates
§ SSA form makes it easy: there is one definition for each variable
  § Easily maintained mapping

slide-33
SLIDE 33

Def-Use (du) chains

§ For a given definition, find all uses of the value computed
§ SSA form makes it easy: can easily identify uses
  § No extra work needed

ai = … = ai + … = ai + ak = … = ak = am + …
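Both kinds of chains can be collected in a single pass over SSA code. The sketch below is a minimal illustration; the function name `build_chains` and the representation of statements as `"dest = expr"` strings are assumptions made for this example.

```python
import re
from collections import defaultdict

NAME = re.compile(r"[A-Za-z_]\w*")


def build_chains(statements):
    """Build use-def and def-use maps for a list of SSA statements.

    statements: strings of the form "dest = expr" over SSA names.
    Returns (definition, uses): definition[v] is the index of the single
    statement that writes v; uses[v] lists the statements that read v.
    """
    definition = {}
    uses = defaultdict(list)
    for index, stmt in enumerate(statements):
        dest, expr = (part.strip() for part in stmt.split("=", 1))
        assert dest not in definition, "SSA: each name is written only once"
        definition[dest] = index
        for name in NAME.findall(expr):
            uses[name].append(index)      # one link per use
    return definition, dict(uses)


code = ["x1 = a0 + b0", "y1 = x1 + 2", "z1 = a0 + 1", "x2 = c0 * 2", "w1 = x2 + 1"]
definition, uses = build_chains(code)
print(definition["x2"], uses["a0"])
```

A use-def query is a single dictionary lookup because every name has exactly one definition, and the def-use map stores one entry per use.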


slide-34
SLIDE 34

SSA efficient representation

§ Consider ud-chains and du-chains
§ Storage space for links directly proportional to the number of uses

slide-35
SLIDE 35

SSA efficient representation

§ Old: Global dataflow equations are solved by iteration
§ Use a bit vector to represent
  § Variables
  § Definitions
  § Expressions …

slide-36
SLIDE 36

Available Expressions: Finding IN(B) and OUT(B)

§ gen_B and kill_B capture what happens inside a basic block
  § Sets of expressions “generated” and “killed”
§ We need IN and OUT for each basic block
  § IN(B) = ∩ OUT(Bi), over all predecessors Bi of B in the CFG
  § OUT(B) = gen_B ∪ (IN(B) - kill_B)
§ N basic blocks, 2×N sets IN / OUT

slide-37
SLIDE 37

Finding IN(B) and OUT(B)

§ N basic blocks, 2×N sets IN / OUT
§ System with 2×N unknowns
  § Solve by iterating until a fixed point is found
§ How to start iteration?
  § Safe assumption OUT(ENTRY) = ∅

slide-38
SLIDE 38

Finding IN(B) and OUT(B)

§ Safe assumption OUT(ENTRY) = ∅
§ What about OUT(Bi) for Bi ≠ ENTRY?
  § For reaching definitions, we wanted the smallest set of definitions that “reach”
    § OK if we say d reaches but it does not
  § For available expressions, we want the largest set of expressions that “reach”
    § OK if expr is available but not included in set
§ So start with a large approximation and remove expressions that are clearly not available
  § OUT(Bi) = U
  § U is the set of all expressions that appear in the program

slide-39
SLIDE 39

Finding available expressions

OUT(ENTRY) = ∅
Initialize OUT(B) = U for ∀ B ≠ ENTRY

while (changes to any OUT(B)) {
    for (each basic block B ≠ ENTRY) {
        IN(B) = ∩ OUT(Bi), over all predecessors Bi of B in the CFG
        OUT(B) = gen_B ∪ (IN(B) - kill_B)
    }
}

slide-40
SLIDE 40

Comments

§ The order of visiting nodes of the control flow graph matters
  § For speed of convergence, not correctness
§ Needs sets to hold all expressions that appear in function/method
  § Possibly large
  § Bit vector representation allows fast implementation of set operations
    § Need multi-word set representation
§ Compiler may limit size of bit vector
  § Not all instructions/variables/expressions will be considered
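The set operations map onto bitwise operations once each expression is assigned a bit position. The sketch below uses Python integers as bit vectors; the helper names `make_index`, `to_bits`, `transfer`, and `meet` are hypothetical. Python integers have arbitrary width, so the multi-word representation a compiler written in C would need is hidden here.

```python
def make_index(expressions):
    """Assign each expression a bit position."""
    return {expr: bit for bit, expr in enumerate(expressions)}


def to_bits(exprs, index):
    """Encode a collection of expressions as an integer bit vector."""
    bits = 0
    for expr in exprs:
        bits |= 1 << index[expr]
    return bits


def transfer(in_bits, gen_bits, kill_bits):
    """OUT(B) = gen_B union (IN(B) minus kill_B), on bit vectors."""
    return gen_bits | (in_bits & ~kill_bits)


def meet(out_bits_of_preds):
    """IN(B) = intersection of OUT over all predecessors (at least one)."""
    result = out_bits_of_preds[0]
    for bits in out_bits_of_preds[1:]:
        result &= bits
    return result


index = make_index(["a+b", "c+d", "e*f"])
in_bits = to_bits(["a+b", "c+d"], index)
out_bits = transfer(in_bits, to_bits(["e*f"], index), to_bits(["c+d"], index))
print(bin(out_bits))
```

Union, intersection, and difference each cost one machine operation per word, which is the reason the bit vector representation is fast.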

slide-41
SLIDE 41

SSA efficient representation

§ Common subexpressions identified easily
§ Cost of representation reduced in practice

slide-42
SLIDE 42

2.4 SSA for well-structured programs

§ Well-structured programs contain only “nice” control flow constructs
  § Programming language enforces property that program is well-structured
§ Syntax-directed translation to SSA format
  § Insert φ functions
    § We say “insert φ node”: a tree node with the φ function
  § Rename variables
    § Use correct version

slide-43
SLIDE 43

Preparation

§ Augment symbol table to record for each variable
  § Current version (integer)
  § Next version (integer)
  § We sometimes use the term “version number”

slide-44
SLIDE 44

Syntax-directed translation

Given a CFG
Process program starting with ENTRY node

1. Handle straight-line code (basic blocks)
2. Handle conditional statements (if-then, if-then-else)
3. Handle loops

slide-45
SLIDE 45

2.4.1 Straight-line code

§ Process block from first statement (first IR node) to last statement, in source program order
§ One statement S at a time:
  § Consider the right-hand side (assume no side effects)
    § For a variable V on the RHS, use the current version
    § Found in symbol table
  § Rewrite RHS
  § Consider the left-hand side (effect of the statement)
    § For variable D, use next version
    § Update symbol table, increment “next” version
§ Same applies to expression E
  § No need to deal with “left-hand side” unless there are side effects
§ V, D must be scalar method-local variables
  § Others unchanged
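The symbol-table variant with separate "current" and "next" fields can be sketched as follows. The class `Entry`, the function `translate_statement`, and the string encoding of statements are assumptions for this illustration. The input statements in the usage example were obtained by stripping the version numbers from the SSA exercise shown earlier, which is also an assumption about what the original exercise looked like.

```python
import re
from dataclasses import dataclass

IDENT = re.compile(r"[A-Za-z_]\w*")


@dataclass
class Entry:
    current: int = 0        # version read by uses
    next_version: int = 1   # version handed to the next definition


def translate_statement(dest, rhs, table):
    """Rewrite one statement `dest = rhs`; pass dest=None for an expression.

    table: dict variable name -> Entry (the augmented symbol table).
    """
    def use(match):
        name = match.group(0)
        entry = table.setdefault(name, Entry())
        return f"{name}{entry.current}"

    # Right-hand side first: it must read the versions current before S.
    new_rhs = IDENT.sub(use, rhs)
    if dest is None:
        return new_rhs
    # Left-hand side: take the next version and update the symbol table.
    entry = table.setdefault(dest, Entry())
    entry.current = entry.next_version
    entry.next_version += 1
    return f"{dest}{entry.current} = {new_rhs}"


table = {}
program = [("a", "a + x"), ("x", "a + b"), ("c", "a + b"), ("x", "c + 2"), ("a", "x + b")]
ssa = [translate_statement(dest, rhs, table) for dest, rhs in program]
print(ssa)
```

Keeping "next" apart from "current" means a later step can change which version is current without risking that a version number is handed out twice.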

slide-46
SLIDE 46
