CS293S Redundancy Removal: SVN & DVN & GCSE Yufei Ding - - PowerPoint PPT Presentation



slide-1
SLIDE 1

CS293S Redundancy Removal: SVN & DVN & GCSE

Yufei Ding

slide-2
SLIDE 2

2

Review of Last Class

Removing redundant expressions

  • DAG: version tracking
  • Linear representation: value numbering

slide-3
SLIDE 3

Missed opportunities (need stronger methods)

Local Value Numbering <-> Linear IR

Local Value Numbering

  • 1 block at a time
  • Strong local results
  • No cross-block effects

Example CFG, one line per basic block:

  • A: m ← a + b; n ← a + b
  • B: p ← c + d; r ← c + d
  • C: q ← a + b; r ← c + d
  • D: e ← b + 18; s ← a + b; u ← e + f
  • E: e ← a + 17; t ← c + d; u ← e + f
  • F: v ← a + b; w ← c + d; x ← e + f
  • G: y ← a + b; z ← c + d

slide-4
SLIDE 4

Missed opportunities (need stronger methods)

Local Value Numbering <-> Linear IR

Local Value Numbering

  • 1 block at a time
  • Strong local results
  • No cross-block effects

Can we find a set of blocks that also ensures the sequential execution order found in a basic block?

Example CFG, one line per basic block:

  • A: m ← a + b; n ← a + b
  • B: p ← c + d; r ← c + d
  • C: q ← a + b; r ← c + d
  • D: e ← b + 18; s ← a + b; u ← e + f
  • E: e ← a + 17; t ← c + d; u ← e + f
  • F: v ← a + b; w ← c + d; x ← e + f
  • G: y ← a + b; z ← c + d
slide-5
SLIDE 5

5

Topics of This Class

Scope of optimization

  • Basic block -> Local value numbering
  • Extended basic block -> Superlocal value numbering (SVN)
  • Dominator -> Dominator-based value numbering (DVN)

Global Common Subexpression Elimination (GCSE)

  • Closer to DAG-based methods
  • Works on lexical notation instead of expression values

slide-6
SLIDE 6

6

Basic blocks

A basic block is a maximal-length segment of straight-line, unpredicated code. In other words, it has one entry point (i.e., no code within it is the destination of a jump instruction), one exit point, and no jump instructions contained within it.

Example

L2: L1:

m = 2; c = m + n; if(c>0) goto L1; d = 4; goto L2; c = 5;

slide-7
SLIDE 7

CFG

Control-flow graph (CFG)

  • Nodes for basic blocks
  • Edges for branches
  • Basis for much program analysis & transformation

This CFG, G = (N,E)

  • N = {A,B,C,D,E,F,G}
  • E = {(A,B),(A,C),(B,G),(C,D),(C,E),(D,F),(E,F),(F,G)}
  • |N| = 7, |E| = 8

Contents of each basic block:

  • A: m ← a + b; n ← a + b
  • B: p ← c + d; r ← c + d
  • C: q ← a + b; r ← c + d
  • D: e ← b + 18; s ← a + b; u ← e + f
  • E: e ← a + 17; t ← c + d; u ← e + f
  • F: v ← a + b; w ← c + d; x ← e + f
  • G: y ← a + b; z ← c + d
slide-8
SLIDE 8

Extended basic block (EBB)

An EBB is a set of blocks B1, B2, ..., Bn, where each Bi, 2 <= i <= n, has a unique predecessor, and that predecessor is in the EBB.

  • May have multiple exits
  • A tree structure
  • If a block is added to the EBB, all of its predecessors must be included.
  • B1 is the one with no predecessor inside the EBB, i.e., the root of the EBB.

Example CFG, one line per basic block:

  • A: m ← a + b; n ← a + b
  • B: p ← c + d; r ← c + d
  • C: q ← a + b; r ← c + d
  • D: e ← b + 18; s ← a + b; u ← e + f
  • E: e ← a + 17; t ← c + d; u ← e + f
  • F: v ← a + b; w ← c + d; x ← e + f
  • G: y ← a + b; z ← c + d

Can you find the maximal EBBs?
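The question can be explored with a short Python sketch. It assumes the edge list of the example CFG with F flowing into G, assumes the entry block has no predecessors, and uses the hypothetical helper names `find_ebbs` and `ebb_paths`; it is an illustration of the definition, not the course's reference implementation.

```python
from collections import defaultdict

NODES = ["A", "B", "C", "D", "E", "F", "G"]
# Assumed edge list: F -> G is used so that G merges B and F.
EDGES = [("A", "B"), ("A", "C"), ("B", "G"), ("C", "D"),
         ("C", "E"), ("D", "F"), ("E", "F"), ("F", "G")]


def build_maps(edges):
    """Return (predecessors, successors) adjacency maps."""
    preds, succs = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        succs[src].append(dst)
        preds[dst].append(src)
    return preds, succs


def find_ebbs(nodes, edges):
    """Map each EBB root to the sorted list of blocks in its maximal EBB."""
    preds, succs = build_maps(edges)
    ebbs = {}
    for root in nodes:
        if len(preds[root]) == 1:
            continue  # a block with a unique predecessor joins that block's EBB
        members, stack = [], [root]
        while stack:
            block = stack.pop()
            members.append(block)
            # Absorb every successor whose only predecessor is this block.
            stack.extend(s for s in succs[block] if len(preds[s]) == 1)
        ebbs[root] = sorted(members)
    return ebbs


def ebb_paths(root, members, edges):
    """List the root-to-leaf paths inside one EBB."""
    _, succs = build_maps(edges)
    inside = set(members) - {root}

    def walk(block, prefix):
        prefix = prefix + [block]
        kids = [s for s in succs[block] if s in inside]
        if not kids:
            return [prefix]
        return [p for k in kids for p in walk(k, prefix)]

    return walk(root, [])


ebbs = find_ebbs(NODES, EDGES)
paths = [p for root, members in ebbs.items()
         for p in ebb_paths(root, members, EDGES)]
print(ebbs)
print(paths)
```

The paths are the units that a local method can process one at a time, since each path behaves like a single long straight-line block.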

slide-9
SLIDE 9

Superlocal Value Numbering

  • A: m ← a + b; n ← a + b
  • B: p ← c + d; r ← c + d
  • C: q ← a + b; r ← c + d
  • D: e ← b + 18; s ← a + b; u ← e + f
  • E: e ← a + 17; t ← c + d; u ← e + f
  • F: v ← a + b; w ← c + d; x ← e + f
  • G: y ← a + b; z ← c + d

  • 1. First find the maximal EBBs: {A,B,C,D,E}, {F}, {G}
  • 2. Apply the local method to each path through each EBB
  • Do {A,B}, {A,C,D}, {A,C,E}, {F}, {G}
slide-10
SLIDE 10

10

Implementation

Reuse the value numbering results of blocks shared by several paths, for efficiency

  • This necessitates undoing a block's effects
  • After {A,C,D}, it must recreate the state of {A,C} before processing E.

Options:

  • 1. Record the state of the tables at each block boundary, and restore the state when needed.
  • 2. Walk backward and undo the effects. This needs a record of the "lost" information.
  • 3. Scoped hash tables (lowest cost): keep a separate table for the entries produced in the current block.

slide-11
SLIDE 11

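Scoped Value Table

  • A: m ← a + b; n ← a + b
  • B: p ← c + d; r ← c + d
  • C: r ← c + d; q ← a + b
  • D: e ← b + 18; s ← a + b; u ← e + f
  • E: t ← c + d; u ← a + b
  • F: v ← a + b; w ← c + d; x ← e + f
  • G: y ← a + b; z ← c + d

Table entries, grouped by the scope that created them:

  • Scope A: a->1 b->2 1+2->3 m->3 n->3
  • Scope C (inside A): c->4 d->5 4+5->6 r->6 q->3
  • Scope E (inside C): t->6 u->3
  • Scope B (inside A): c->4 d->5 4+5->6 p->6 r->6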
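The table contents above can be reproduced with a small scoped hash table. The sketch below is written under several assumptions: the class name `ScopedTable` and the function `svn` are hypothetical, operations are encoded as `(target, left, op, right)` tuples, no name is redefined along a path (as SSA form would guarantee), and the value-number counter is rewound when a scope is popped so that sibling scopes reuse numbers as on the slide.

```python
class ScopedTable:
    """A stack of dictionaries; lookups search from the innermost scope out."""

    def __init__(self):
        self.scopes = []   # innermost scope is last
        self.marks = []    # counter value saved at each push
        self.next_vn = 1

    def push(self):
        self.scopes.append({})
        self.marks.append(self.next_vn)

    def pop(self):
        # Dropping the scope undoes the block's effect in one step.
        self.scopes.pop()
        self.next_vn = self.marks.pop()

    def lookup(self, key):
        for scope in reversed(self.scopes):
            if key in scope:
                return scope[key]
        return None

    def insert(self, key, vn):
        self.scopes[-1][key] = vn

    def fresh(self):
        vn = self.next_vn
        self.next_vn += 1
        return vn


def svn(block, code, children, table, redundant):
    """Value-number `block`, then its EBB children, inside a new scope."""
    table.push()
    for target, left, op, right in code[block]:
        operand_vns = []
        for name in (left, right):
            vn = table.lookup(name)
            if vn is None:
                vn = table.fresh()
                table.insert(name, vn)
            operand_vns.append(vn)
        key = (operand_vns[0], op, operand_vns[1])
        vn = table.lookup(key)
        if vn is None:
            vn = table.fresh()
            table.insert(key, vn)
        else:
            redundant.append((block, target))  # value already computed
        table.insert(target, vn)
    for child in children.get(block, ()):
        svn(child, code, children, table, redundant)
    table.pop()


CODE = {
    "A": [("m", "a", "+", "b"), ("n", "a", "+", "b")],
    "B": [("p", "c", "+", "d"), ("r", "c", "+", "d")],
    "C": [("r", "c", "+", "d"), ("q", "a", "+", "b")],
    "D": [("e", "b", "+", "18"), ("s", "a", "+", "b"), ("u", "e", "+", "f")],
    "E": [("t", "c", "+", "d"), ("u", "a", "+", "b")],
    "F": [("v", "a", "+", "b"), ("w", "c", "+", "d"), ("x", "e", "+", "f")],
    "G": [("y", "a", "+", "b"), ("z", "c", "+", "d")],
}
EBB_CHILDREN = {"A": ["B", "C"], "C": ["D", "E"]}  # tree of the EBB rooted at A

redundant = []
table = ScopedTable()
for root in ("A", "F", "G"):   # one walk per EBB
    svn(root, CODE, EBB_CHILDREN, table, redundant)
print(redundant)
```

Nothing in F or G is reported, because each of those blocks starts from an empty table.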

slide-12
SLIDE 12

12

Rewritten

a ← b + c   e ← b - c

b -> 1 c -> 2 1 + 2 -> 3 a -> 3 1 -> b 2 -> c 3 -> a

d ← b - c   f ← b - c

1 - 2 -> 4 e -> 4 4 -> e 4 -> d 1 - 2 -> 4 d -> 4 f -> 4

d ← b - c   f ← d

slide-13
SLIDE 13

13

Rewritten

a ← b + c; a ← 17; e ← b + c; d ← b + c

Renaming is still needed. But does it work in all scenarios?

a1 ← b1 + c1; a2 ← 17; e1 ← b1 + c1; d1 ← b1 + c1

slide-14
SLIDE 14

Extra Complexity

14

a1 ← b + c   a3 ← 17   a2 ← a1 + c   d ← a + c ?

slide-15
SLIDE 15

SSA (Static Single Assignment) Name Space

Two principles

  • Each name is defined by exactly one operation
  • Each operand refers to exactly one definition

To reconcile these principles with real code

  • Insert φ-functions at merge points to reconcile the name space

x ← ...   x ← ...   ... ← x + ...

becomes

x0 ← ...   x1 ← ...   x2 ← φ(x0,x1)   ... ← x2 + ...

slide-16
SLIDE 16

Another SSA Example

16

x ← ...   x ← ...   ... ← x + ...   x ← x + ...

becomes

x3 ← ...   x4 ← ...   x5 ← φ(x3,x4)   ... ← x5 + ...   x1 ← φ(x0,x5)   x2 ← x1 + ...

Detail: CT-2ndEd: Section 5.4.2; CT-1stEd: Section 5.5.

slide-17
SLIDE 17

Superlocal Value Numbering

This is in SSA form:

  • A: m0 ← a + b; n0 ← a + b
  • B: p0 ← c + d; r0 ← c + d
  • C: q0 ← a + b; r1 ← c + d
  • D: e0 ← b + 18; s0 ← a + b; u0 ← e + f
  • E: e1 ← a + 17; t0 ← c + d; u1 ← e + f
  • F: e3 ← φ(e0,e1); u2 ← φ(u0,u1); v0 ← a + b; w0 ← c + d; x0 ← e + f
  • G: r2 ← φ(r0,r1); y0 ← a + b; z0 ← c + d

  • 1. Build SSA form
  • 2. Find EBBs
  • 3. Apply value numbering to each path in each EBB, using scoped hash tables

slide-18
SLIDE 18

Superlocal Value Numbering

This is in SSA form:

  • A: m0 ← a + b; n0 ← a + b
  • B: p0 ← c + d; r0 ← c + d
  • C: q0 ← a + b; r1 ← c + d
  • D: e0 ← b + 18; s0 ← a + b; u0 ← e + f
  • E: e1 ← a + 17; t0 ← c + d; u1 ← e + f
  • F: e3 ← φ(e0,e1); u2 ← φ(u0,u1); v0 ← a + b; w0 ← c + d; x0 ← e + f
  • G: r2 ← φ(r0,r1); y0 ← a + b; z0 ← c + d

With all the bells & whistles

  • Finds more redundancy
  • Pays little additional cost
  • Still does nothing for F & G
slide-19
SLIDE 19

Dominator-Based Value Numbering

19

slide-20
SLIDE 20

20

Regional (Dominator-based) Methods

Dominators of b: all blocks that dominate b.

  • If every path from the entry of the graph to b goes through a, then a is one of b's dominators.
  • The full set of dominators for b is denoted by DOM(b).

Strict dominators: if a dominates b and a ≠ b, then we say a strictly dominates b.

Immediate dominator: the immediate dominator of b is the strict dominator of b that is closest to b. It is denoted IDOM(b).

slide-21
SLIDE 21

Example

  • A: m ← a + b; n ← a + b
  • B: p ← c + d; r ← c + d
  • C: q ← a + b; r ← c + d
  • D: e ← b + 18; s ← a + b; u ← e + f
  • E: e ← a + 17; t ← c + d; u ← e + f
  • F: v ← a + b; w ← c + d; x ← e + f
  • G: y ← a + b; z ← c + d

BLOCK | A | B | C | D | E | F | G
DOM   |   |   |   |   |   |   |
IDOM  |   |   |   |   |   |   |
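One way to check an answer for the DOM and IDOM rows is the classic iterative set formulation, DOM(n) = {n} ∪ ⋂ DOM(p) over the predecessors p of n. The sketch below assumes the example CFG's edge list with F flowing into G, assumes every non-entry block is reachable and has at least one predecessor, and uses hypothetical function names.

```python
NODES = ["A", "B", "C", "D", "E", "F", "G"]
# Assumed edge list for the example CFG (F -> G).
EDGES = [("A", "B"), ("A", "C"), ("B", "G"), ("C", "D"),
         ("C", "E"), ("D", "F"), ("E", "F"), ("F", "G")]


def dominators(nodes, edges, entry):
    """Iterate DOM(n) = {n} | intersection of DOM(p) for p in preds(n)."""
    preds = {n: [] for n in nodes}
    for src, dst in edges:
        preds[dst].append(src)

    dom = {n: set(nodes) for n in nodes}   # optimistic initial guess
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            new = set.intersection(*(dom[p] for p in preds[n])) | {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def immediate_dominators(dom, entry):
    """The closest strict dominator is the one with the largest DOM set."""
    idom = {}
    for n, doms in dom.items():
        if n != entry:
            idom[n] = max(doms - {n}, key=lambda d: len(dom[d]))
    return idom


dom = dominators(NODES, EDGES, "A")
idom = immediate_dominators(dom, "A")
for n in NODES:
    print(n, sorted(dom[n]), idom.get(n, "-"))
```

Picking the strict dominator with the largest DOM set works because the strict dominators of a block form a chain, so the closest one is dominated by all the others.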

slide-22
SLIDE 22

Dominator-Based Value Numbering

Basic strategy: use the table from IDom(x) to start value numbering x

  • Use C for F and A for G
  • Imposes a Dom-based application order

  • A: m0 ← a + b; n0 ← a + b
  • B: p0 ← c + d; r0 ← c + d
  • C: q0 ← a + b; r1 ← c + d
  • D: e0 ← b + 18; s0 ← a + b; u0 ← e + f
  • E: e1 ← a + 17; t0 ← c + d; u1 ← e + f
  • F: e3 ← φ(e0,e1); u2 ← φ(u0,u1); v0 ← a + b; w0 ← c + d; x0 ← e + f
  • G: r2 ← φ(r0,r1); y0 ← a + b; z0 ← c + d
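The strategy of starting each block from its immediate dominator's table can be sketched as a preorder walk of the dominator tree. This is a minimal illustration, not the course's algorithm in full: it uses `collections.ChainMap` as the scoped table, treats every φ-function as an opaque new value, ignores commutativity, and copies operand names verbatim from the slide. The function name `dvn` and the tuple encoding are assumptions.

```python
from collections import ChainMap
from itertools import count

# (target, operator, operands), with operand names as printed on the slide.
CODE = {
    "A": [("m0", "+", ("a", "b")), ("n0", "+", ("a", "b"))],
    "B": [("p0", "+", ("c", "d")), ("r0", "+", ("c", "d"))],
    "C": [("q0", "+", ("a", "b")), ("r1", "+", ("c", "d"))],
    "D": [("e0", "+", ("b", "18")), ("s0", "+", ("a", "b")),
          ("u0", "+", ("e", "f"))],
    "E": [("e1", "+", ("a", "17")), ("t0", "+", ("c", "d")),
          ("u1", "+", ("e", "f"))],
    "F": [("e3", "phi", ("e0", "e1")), ("u2", "phi", ("u0", "u1")),
          ("v0", "+", ("a", "b")), ("w0", "+", ("c", "d")),
          ("x0", "+", ("e", "f"))],
    "G": [("r2", "phi", ("r0", "r1")), ("y0", "+", ("a", "b")),
          ("z0", "+", ("c", "d"))],
}
# Dominator tree: IDOM is A for B, C, G and C for D, E, F.
DOM_CHILDREN = {"A": ["B", "C", "G"], "C": ["D", "E", "F"]}


def dvn(block, code, dom_children, table, fresh, redundant):
    """Value-number `block` starting from the table of its immediate dominator."""
    scope = table.new_child()          # writes land in a new innermost dict
    for target, op, args in code[block]:
        if op == "phi":
            scope[target] = next(fresh)   # opaque: always a new value
            continue
        vns = []
        for name in args:
            if name not in scope:         # searches every enclosing scope
                scope[name] = next(fresh)
            vns.append(scope[name])
        key = (op, *vns)
        if key in scope:
            redundant.append(target)
        else:
            scope[key] = next(fresh)
        scope[target] = scope[key]
    for child in dom_children.get(block, ()):
        dvn(child, code, dom_children, scope, fresh, redundant)
    # Returning discards `scope`, which restores the dominator's table.


redundant = []
dvn("A", CODE, DOM_CHILDREN, ChainMap(), count(1), redundant)
print(redundant)
```

Under these assumptions the walk reports `v0`, `w0` and `y0` in addition to what the EBB-based walk finds, while `x0` and `z0` stay out of reach because their expressions are not computed in a dominating block.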

slide-23
SLIDE 23

SSA Resolves Name Conflicts

Before renaming: a ← b + c   b ← 17   d ← b - c   e ← b + c

In SSA form: a ← b0 + c   b1 ← 17   d ← b0 - c   b2 ← φ(b0,b1)   e ← b2 + c

slide-24
SLIDE 24

Summary

Two methods with a scope beyond a basic block

  • Superlocal value numbering (SVN): value numbering across basic blocks
  • Dominator-based value numbering (DVN): uses dominance information to handle join points in the CFG

They can be used together

  • 1. First build SSA
  • 2. Do SVN
  • 3. Do DVN, reusing the value tables built in SVN

Building SSA form is the prerequisite for both!

slide-25
SLIDE 25

Examples

25

e = c + d; f = c + d; g = c + d; x = a + b; c = a - b;

slide-26
SLIDE 26

The first data-flow problem A global method

26

Global Common Subexpression Elimination (GCSE)

slide-27
SLIDE 27

27

Some Expression Sets

For each block b

  • Let AVAIL(b) be the set of expressions available on entry to b.
  • Let EXPRKILL(b) be the set of expressions killed in b, i.e., expressions with one or more operands redefined in b. This must consider all expressions in the whole graph!
  • Let DEEXPR(b) be the set of downward exposed expressions in b, i.e., expressions defined in b and not subsequently killed in b.

slide-28
SLIDE 28

28

Formula to Compute AVAIL

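Now, AVAIL(b) can be defined as:

$$\mathrm{AVAIL}(b) = \bigcap_{x \in \mathrm{preds}(b)} \Big( \mathrm{DEEXPR}(x) \cup \big( \mathrm{AVAIL}(x) \cap \overline{\mathrm{EXPRKILL}(x)} \big) \Big)$$

  • preds(b) is the set of b's predecessors in the control-flow graph. (Again, a predecessor is an immediate parent, not including other ancestors.)
  • The overbar denotes the complement: the expressions that x does not kill.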

slide-29
SLIDE 29

29

Computing Available Expressions

The Big Picture

  • 1. Build a control-flow graph
  • 2. Gather the initial data: DEEXPR(b) & EXPRKILL(b)
  • 3. Propagate information around the graph, evaluating the equation

Works for loops through an iterative algorithm that finds the fixed point. Essentially all data-flow problems are solved this way.

slide-30
SLIDE 30

Making Theory Concrete

Computing AVAIL for the example

  • AVAIL(A) = Ø
  • AVAIL(B) = {a+b} ∪ (Ø ∩ all) = {a+b}
  • AVAIL(C) = {a+b}
  • AVAIL(D) = {a+b,c+d} ∪ ({a+b} ∩ all) = {a+b,c+d}
  • AVAIL(E) = {a+b,c+d}
  • AVAIL(F) = [{b+18,a+b,e+f} ∪ ({a+b,c+d} ∩ {all - e+f})] ∩ [{a+17,c+d,e+f} ∪ ({a+b,c+d} ∩ {all - e+f})] = {a+b,c+d,e+f}
  • AVAIL(G) = [{c+d} ∪ ({a+b} ∩ all)] ∩ [{a+b,c+d,e+f} ∪ ({a+b,c+d,e+f} ∩ all)] = {a+b,c+d}

Example CFG, one line per basic block:

  • A: m ← a + b; n ← a + b
  • B: p ← c + d; r ← c + d
  • C: q ← a + b; r ← c + d
  • D: e ← b + 18; s ← a + b; u ← e + f
  • E: e ← a + 17; t ← c + d; u ← e + f
  • F: v ← a + b; w ← c + d; x ← e + f
  • G: y ← a + b; z ← c + d

slide-31
SLIDE 31

Computing Available Expressions

First step is to compute DEEXPR & EXPRKILL

assume a block b with operations o1, o2, …, ok

VARKILL ← Ø
DEEXPR(b) ← Ø
for i = k to 1                                (backward through block; O(k) steps)
    assume oi is "x ← y + z"
    add x to VARKILL
    if (y ∉ VARKILL) and (z ∉ VARKILL) then
        add "y + z" to DEEXPR(b)

EXPRKILL(b) ← Ø
for each expression e                         (O(N) steps; N is # operations)
    for each variable v ∈ e
        if v ∈ VARKILL(b) then
            EXPRKILL(b) ← EXPRKILL(b) ∪ {e}

Many data-flow problems have initial information that costs less to compute

slide-32
SLIDE 32

32

Computing Available Expressions

The worklist iterative algorithm

Worklist ← { all blocks, bi }
while (Worklist ≠ Ø)
    remove a block b from Worklist
    recompute AVAIL(b) as
        $\mathrm{AVAIL}(b) = \bigcap_{x \in \mathrm{preds}(b)} \big( \mathrm{DEEXPR}(x) \cup ( \mathrm{AVAIL}(x) \cap \overline{\mathrm{EXPRKILL}(x)} ) \big)$
    if ??? then
        Worklist ← ???

slide-33
SLIDE 33

33

Computing Available Expressions

The worklist iterative algorithm

Worklist ← { all blocks, bi }
while (Worklist ≠ Ø)
    remove a block b from Worklist
    recompute AVAIL(b) as
        $\mathrm{AVAIL}(b) = \bigcap_{x \in \mathrm{preds}(b)} \big( \mathrm{DEEXPR}(x) \cup ( \mathrm{AVAIL}(x) \cap \overline{\mathrm{EXPRKILL}(x)} ) \big)$
    if AVAIL(b) changed then
        Worklist ← Worklist ∪ successors(b)

  • Finds a fixed-point solution to the equation for AVAIL
  • That solution is unique
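The local sets and the worklist iteration can be tried out on the running example with the Python sketch below. Several details are assumptions rather than part of the slides: expressions are encoded as strings such as `"a+b"`, the edge list includes F -> G, AVAIL is initialised to the empty set for the entry block and to the set of all expressions elsewhere, and the function names are hypothetical.

```python
from collections import defaultdict, deque

CODE = {
    "A": [("m", "a+b"), ("n", "a+b")],
    "B": [("p", "c+d"), ("r", "c+d")],
    "C": [("q", "a+b"), ("r", "c+d")],
    "D": [("e", "b+18"), ("s", "a+b"), ("u", "e+f")],
    "E": [("e", "a+17"), ("t", "c+d"), ("u", "e+f")],
    "F": [("v", "a+b"), ("w", "c+d"), ("x", "e+f")],
    "G": [("y", "a+b"), ("z", "c+d")],
}
EDGES = [("A", "B"), ("A", "C"), ("B", "G"), ("C", "D"),
         ("C", "E"), ("D", "F"), ("E", "F"), ("F", "G")]


def operands(expr):
    return set(expr.split("+"))


def local_sets(ops, all_exprs):
    """Backward pass over one block: returns (DEEXPR, EXPRKILL)."""
    var_kill, de_expr = set(), set()
    for target, expr in reversed(ops):
        var_kill.add(target)
        if not (operands(expr) & var_kill):
            de_expr.add(expr)          # no operand is redefined later on
    expr_kill = {e for e in all_exprs if operands(e) & var_kill}
    return de_expr, expr_kill


def available_expressions(code, edges, entry):
    preds, succs = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        succs[src].append(dst)
        preds[dst].append(src)

    all_exprs = {expr for ops in code.values() for _, expr in ops}
    de, kill = {}, {}
    for block, ops in code.items():
        de[block], kill[block] = local_sets(ops, all_exprs)

    avail = {block: set(all_exprs) for block in code}  # optimistic start
    avail[entry] = set()
    worklist = deque(code)
    while worklist:
        b = worklist.popleft()
        if b == entry:
            continue                   # nothing is available on entry
        new = set(all_exprs)
        for x in preds[b]:
            # AVAIL(x) minus EXPRKILL(x) is AVAIL(x) intersected with the complement.
            new &= de[x] | (avail[x] - kill[x])
        if new != avail[b]:
            avail[b] = new
            worklist.extend(s for s in succs[b] if s not in worklist)
    return avail


avail = available_expressions(CODE, EDGES, "A")
for block in sorted(avail):
    print(block, sorted(avail[block]))
```

Set difference plays the role of intersecting with the complement of EXPRKILL, so the universe of expressions never has to be materialised for that step.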
slide-34
SLIDE 34

34

Data-flow Analysis

Data-flow analysis is a collection of techniques for compile-time reasoning about run-time flow of values

  • Almost always involves building a graph
  • Problems are trivial on a basic block
  • Global problems -> control-flow graph (or derivative)
  • Whole program problems -> call graph (or derivative)
  • Usually formulated as a set of simultaneous equations

slide-35
SLIDE 35

35

Replacement step in GCSE

Limit to textually identical expressions (like DAG, unlike value numbering)

Example 1

  • Preceding block: a <- b + c; d <- b
  • B: e <- d + c
  • AVAIL(B) = {b+c}
  • Cannot find or remove the redundancy!

Example 2

  • B1: a <- b + c
  • B2: e <- b + c
  • B: f <- b + c
  • AVAIL(B) = {b+c}
  • Should replace b+c with ?

slide-36
SLIDE 36

36

GCSE (replacement step)

Compute a static mapping from expression to name

  • After analysis & before transformation
  • ∀ block b, ∀ expression e ∈ AVAIL(b), assign e a global name by hashing on e

During transformation step

  • Evaluation of e ⇒ insert copy name(e) ← e (e is not available and needs to be evaluated)
  • Reference to e ⇒ replace e with name(e) (e is available and should be replaced)
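The two rewriting rules can be sketched for a single block as follows. This is an illustrative Python sketch under stated assumptions: the block's AVAIL set is supplied by a prior analysis, expressions are strings such as `"c+d"`, the name table (`t1`, `t2`) is given rather than produced by hashing, and `rewrite_block` is a hypothetical helper.

```python
def operands(expr):
    return set(expr.split("+"))


def rewrite_block(ops, avail_in, names):
    """Rewrite one block given AVAIL on entry and the expression-to-name map.

    ops is a list of (target, expr); the result uses the same encoding.
    """
    avail = set(avail_in)
    out = []
    for target, expr in ops:
        if expr in names:
            if expr not in avail:
                # Evaluation: compute into the global name first.
                out.append((names[expr], expr))
                avail.add(expr)
            # Reference (or the copy after an evaluation): use the name.
            out.append((target, names[expr]))
        else:
            out.append((target, expr))   # not a named expression; keep as is
        # Assigning to target kills every expression that reads target.
        avail = {e for e in avail if target not in operands(e)}
    return out


NAMES = {"a+b": "t1", "c+d": "t2"}

# Block with nothing available on entry.
b1 = rewrite_block([("m", "a+b"), ("n", "c+d")], set(), NAMES)
# Block that redefines c before recomputing c+d (entry set is an assumption).
b2 = rewrite_block([("c", "17"), ("q", "c+d")], {"a+b", "c+d"}, NAMES)
# Block where c+d is available on entry.
b4 = rewrite_block([("r", "c+d")], {"a+b", "c+d"}, NAMES)
print(b1, b2, b4, sep="\n")
```

Tracking availability inside the block matters: in the second call the assignment to `c` kills `c+d`, so the expression is re-evaluated into `t2` instead of being replaced by a stale copy.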

slide-37
SLIDE 37

Example

Original code:

  • B1: m=a+b; n=c+d;
  • B2: c = 17; q=c+d;
  • B3: p=c+d;
  • B4: r=c+d;

name | expression
t1   | a+b
t2   | c+d

Rewritten code:

  • B1: t1 = a+b; m=t1; t2=c+d; n=t2;
  • B2: c = 17; t2=c+d; q=t2;
  • B3: t2=c+d; p=t2;
  • B4: r=t2;

AVAIL(B4) = {c+d, a+b}

slide-38
SLIDE 38

38

GCSE (replacement step)

The major problem with this approach

  • Inserts extraneous copies
  • At all definitions and uses of any e ∈ AVAIL(b), ∀ b

Not a big issue

  • Those extra copies are dead and easy to remove
  • The useful ones often coalesce away

slide-39
SLIDE 39

Comparison

Each redundant operation is tagged with the method that catches it:

  • A: m ← a + b; n ← a + b [LVN]
  • B: p ← c + d; r ← c + d [LVN]
  • C: q ← a + b [SVN]; r ← c + d
  • D: e ← b + 18; s ← a + b [SVN]; u ← e + f
  • E: e ← a + 17; t ← c + d [SVN]; u ← e + f
  • F: v ← a + b [DVN]; w ← c + d [DVN]; x ← e + f [GCSE]
  • G: y ← a + b [DVN]; z ← c + d [GCSE]

The VN methods are ordered

  • LVN ≤ SVN ≤ DVN
  • GCSE is different
  • Based on names, not value
  • But for this particular example: DVN ≤ GCSE
  • Not always!!!!
slide-40
SLIDE 40

40

Redundancy Elimination Wrap-up

Conclusions

  • Redundancy elimination has some depth & subtlety
  • Variations on names, algorithms & analysis

DVN is probably the method of choice

  • Results quite close to the global methods (± 1%)
  • Cost is low