Principles of Programming Languages - - PowerPoint PPT Presentation

Principles of Programming Languages http://www.di.unipi.it/~andrea/Didattica/PLP-15/ Prof. Andrea Corradini Department of Computer Science, Pisa Lesson 17 Loops in Control Flow Graphs Convergence speed of data-flow analysis Region


slide-1
SLIDE 1

Principles of Programming Languages

http://www.di.unipi.it/~andrea/Didattica/PLP-15/

  • Prof. Andrea Corradini

Department of Computer Science, Pisa

  • Loops in Control Flow Graphs

– Convergence speed of data-flow analysis

  • Region based Data-Flow analysis
  • Symbolic analysis

Lesson 17

slide-2
SLIDE 2

Determining Loops in Flow Graphs: Dominators

  • Dominators: d dom n

– Node d of a CFG dominates node n if every path from the entry node to n goes through d
– The loop entry dominates all nodes in the loop

  • The immediate dominator m of a node n is the last dominator of n on any path from the initial node to n

– If d ≠ n and d dom n then d dom m

  • Since each node other than the entry has a unique immediate dominator, dominators form a tree

slide-3
SLIDE 3

Dominator Tree of a CFG

[Figure: a CFG with nodes 1 to 10 and its dominator tree]

slide-4
SLIDE 4

Data-Flow Analysis for Dominators

  • The set of dominators for each node n, D(n), can be computed with data-flow analysis
  • Fact: d dom n iff d = n or d dom m for all m in pred(n)

– Direction: forwards
– Semilattice: powerset of CFG nodes
– Transfer function: f_B(x) = x ∪ {B}
– Meet operator: intersection (must)
– Boundary: OUT[ENTRY] = {ENTRY}
– Initialization: OUT[B] = all nodes

[Figure: node n with predecessors P1, P2, ..., Pk, and a dominator d]
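
The data-flow framework above can be sketched in Python as follows. This is only an illustrative sketch: the function name, the adjacency-list encoding of the CFG (every node is a key of `succ`) and the assumption that all nodes are reachable from the entry are not taken from the slides.

```python
def dominators(succ, entry):
    """Iterative forward data-flow analysis computing D(n) for every node.

    succ maps each node to the list of its successors (hypothetical encoding).
    """
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)

    # Boundary and initialization: OUT[ENTRY] = {ENTRY}, OUT[B] = all nodes.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}

    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            # Meet (intersection) over the predecessors, then f_B(x) = x U {B}.
            preds = [dom[p] for p in pred[n]]
            new = (set.intersection(*preds) if preds else set(nodes)) | {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom
```

For instance, on a CFG with edges 1→2, 2→3, 2→4, 3→5, 4→5, 5→2 and 5→6, node 5 is dominated by 1, 2 and itself, but by neither 3 nor 4.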

slide-5
SLIDE 5

Natural Loops

  • A back edge in a CFG is an edge a → b where b dominates a
  • A natural loop:

– has a single-entry node d, the header, which dominates all nodes in the loop
– has a back edge that enters node d

  • Given a back edge n → d

– Its natural loop consists of d plus the nodes that can reach n without going through d
– The loop header is node d
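
A possible way to collect the nodes of the natural loop of a back edge n → d is a backward search from n that never crosses d. The sketch below assumes a hypothetical adjacency-list encoding of the CFG; the function name is illustrative.

```python
def natural_loop(succ, n, d):
    """Nodes of the natural loop of the back edge n -> d.

    succ maps each node to the list of its successors (hypothetical encoding).
    """
    pred = {}
    for u, targets in succ.items():
        for v in targets:
            pred.setdefault(v, set()).add(u)

    loop = {d}          # the header is always in the loop and blocks the search
    stack = []
    if n not in loop:
        loop.add(n)
        stack.append(n)
    while stack:
        m = stack.pop()
        for p in pred.get(m, ()):
            if p not in loop:   # walk predecessors, never passing through d
                loop.add(p)
                stack.append(p)
    return loop
```

On a CFG with edges 1→2, 2→3, 2→4, 3→5, 4→5, 5→2 and 5→6, the back edge 5 → 2 yields the loop {2, 3, 4, 5}.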

slide-6
SLIDE 6

Reducible Flow Graphs

[Figure: example of a reducible CFG, with nodes 1 to 4]

[Figure: example of a nonreducible CFG, with nodes 1 to 3 (not a natural loop: no back edge to dominator 1)]

  • A flow graph is reducible if and only if the graph obtained by deleting all its back edges is acyclic.
  • We consider only reducible CFGs
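
The characterization above can be turned into a direct test: compute dominators, drop every edge u → v where v dominates u, and check that what remains is acyclic. The following is a sketch under stated assumptions (hypothetical adjacency-list encoding, every node a key of `succ`, all nodes reachable from the entry); it is not an optimized algorithm.

```python
def is_reducible(succ, entry):
    """True if deleting all back edges leaves an acyclic graph."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for u in succ:
        for v in succ[u]:
            pred[v].add(u)

    # Dominator sets, computed by iterated intersection over predecessors.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = set(nodes)
            for p in pred[n]:
                new &= dom[p]
            new.add(n)
            if new != dom[n]:
                dom[n], changed = new, True

    # Keep only the edges u -> v that are not back edges (v does not dominate u).
    forward = {u: [v for v in succ[u] if v not in dom[u]] for u in nodes}

    # Kahn's algorithm: the graph is acyclic iff every node can be removed.
    indeg = {n: 0 for n in nodes}
    for u in nodes:
        for v in forward[u]:
            indeg[v] += 1
    ready = [n for n in nodes if indeg[n] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in forward[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return removed == len(nodes)
```

The three-node graph with edges 1→2, 1→3, 2→3 and 3→2 is rejected: neither 2 nor 3 dominates the other, so the cycle between them contains no back edge.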
slide-7
SLIDE 7

Natural Inner/Outer Loops

  • In reducible CFGs, unless two loops have the same header, they are disjoint or one is nested within the other
  • A nested loop is an inner loop if it contains no other loops
  • A loop is an outer loop if it is not contained within another loop

slide-8
SLIDE 8

Natural Inner/Outer Loops Example

[Figure: the CFG with nodes 1 to 10 and its dominator tree, highlighting the natural loops for 7 dom 10, 3 dom 4, 4 dom 7, 1 dom 9 and 3 dom 8]

slide-9
SLIDE 9

Depth of a Control Flow Graph

  • Depth: largest number of back edges in any acyclic path in the graph
  • Intuition: not larger than the maximal nesting of loops

[Figure: two drawings of the example CFG with nodes 1 to 10]

slide-10
SLIDE 10

Speed of convergence of data-flow analysis

  • Maximum number of iterations: (height of the lattice) × (number of nodes)
  • If the value of interest can be propagated along acyclic paths (as for reaching definitions, available expressions, live variables), a few passes are sufficient in general, depending on the loop nesting (typically, depth of the CFG + 1).
  • Otherwise, several iterations in loops might be needed: e.g. constant folding

L: x = y; y = z; z = 1; goto L
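
To see why this loop needs several passes, the sketch below iterates a simple constant analysis on it. The lattice encoding (`UNDEF` as top, `NAC` as bottom), the assumption that all variables are `UNDEF` at the loop entry, and all function names are illustrative choices, not taken from the slides.

```python
UNDEF, NAC = "UNDEF", "NAC"   # assumed top and bottom of the constant lattice

def meet(a, b):
    """Meet of two constant-lattice values."""
    if a == UNDEF:
        return b
    if b == UNDEF:
        return a
    return a if a == b else NAC

def transfer(m):
    """Effect of the block  x = y; y = z; z = 1  on a map of variable values."""
    out = dict(m)
    out["x"] = out["y"]   # x receives the old value of y
    out["y"] = out["z"]   # y receives the old value of z
    out["z"] = 1
    return out

def analyze():
    """Iterate to a fixed point, recording OUT[L] after every pass."""
    entry = {"x": UNDEF, "y": UNDEF, "z": UNDEF}   # assumed boundary value
    out = dict(entry)                               # OUT[L] initialized to top
    history = []
    while True:
        # IN[L] is the meet of the entry edge and the back edge.
        in_l = {v: meet(entry[v], out[v]) for v in entry}
        new_out = transfer(in_l)
        history.append(new_out)
        if new_out == out:
            return history
        out = new_out
```

Under these assumptions the constant 1 reaches z in the first pass, y in the second and x only in the third, because the value travels around the cycle; a fourth pass confirms the fixed point.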

slide-11
SLIDE 11

Region-Based Analysis

  • In data-flow analysis, transfer functions are associated with basic blocks
  • Here we associate them with regions, which provide a hierarchical view of the program
  • The analysis proceeds from smaller to larger regions, up to entire procedures
  • More algebraic structure is needed:

– Semilattice of values
– Semilattice of transfer functions with meet, composition and closure operators

slide-12
SLIDE 12

Regions

  • A region is a portion of the flow graph with a single entry point.

– A single statement of a high-level language is a region
– Each block or other form of statement nesting is a region

  • A region is a collection of nodes N and edges E such that

– N has a header h that dominates all nodes in N
– No node outside N can reach a node m in N without passing through h
– E contains all edges between nodes in N (except, possibly, for some edges that enter h)

  • Note: natural loops are regions, but regions need not contain loops

slide-13
SLIDE 13

Region hierarchies

  • Assumption: the CFG is reducible (thus natural loops are disjoint or nested)
  • Building the region hierarchy for the CFG:

– Every block is a leaf region
– For each natural loop L, starting from the innermost, replace the body of L (all nodes and edges except the back edges to the header) with a new node representing a region R. All edges from L become edges from R, possibly loops. R is a body region.
– Construct the loop region R', which is identical to R but without the loop: it represents the whole L
– Finally construct one region for the resulting acyclic flow graph, if needed

slide-14
SLIDE 14

Example: the region hierarchy of a CFG

slide-15
SLIDE 15

Region based analysis: idea

  • We define transfer functions for regions, exploiting the hierarchy
  • One transfer function for each region R and subregion R': f_{R,IN[R']} summarizes the effect of all possible paths from the entry of R to the entry of R'.
  • One transfer function for each exit block B in R: f_{R,OUT[B]} summarizes all paths from the entry of R to the exit of B
  • Move upwards:

– For leaf regions, f_{B,IN[B]} is the identity and f_{B,OUT[B]} is the transfer function of the block
– Body regions are acyclic graphs of subregions: compose the transfer functions in topological order (see later)
– For loop regions, one has to take into account only the back edges to the header (see later)

  • For the data-flow values, proceed from the top region to the leaves. Compute the values at the entry of the top region, then at the entry of its subregions, and so on.

slide-16
SLIDE 16

Needed properties of transfer functions

  • Examples from reaching definitions:

f(x) = gen ∪ (x – kill)

  • Composition, for block/region sequences
  • Meet, to combine transfer functions along different paths to the same point
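
For gen/kill functions both operations stay within the same form. A short derivation, writing f_1(x) = gen_1 ∪ (x – kill_1) and f_2(x) = gen_2 ∪ (x – kill_2) (the subscripted names are introduced here only for illustration):

```latex
\begin{align*}
(f_2 \circ f_1)(x) &= gen_2 \cup \bigl((gen_1 \cup (x - kill_1)) - kill_2\bigr) \\
                   &= \bigl(gen_2 \cup (gen_1 - kill_2)\bigr) \cup \bigl(x - (kill_1 \cup kill_2)\bigr) \\[4pt]
(f_1 \wedge f_2)(x) &= f_1(x) \cup f_2(x) \\
                    &= (gen_1 \cup gen_2) \cup \bigl(x - (kill_1 \cap kill_2)\bigr)
\end{align*}
```

In both cases the result is again of the form gen ∪ (x – kill), so a composed or merged function can be stored as a single gen/kill pair.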

slide-17
SLIDE 17

Needed properties of transfer functions

  • Closure is needed for loops. It represents the effect of going around the cycle any number of times
  • For reaching definitions:
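
For gen/kill functions, f ∘ f = f, so the closure f* (the meet of f^n over all n ≥ 0, with f^0 the identity) reduces to f*(x) = gen ∪ x. The Python sketch below represents such a function as a gen/kill pair and implements composition, meet and closure on that representation; the class and method names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GenKill:
    """Transfer function f(x) = gen U (x - kill) for reaching definitions."""
    gen: frozenset
    kill: frozenset

    def __call__(self, x):
        return self.gen | (x - self.kill)

    def then(self, other):
        """Composition: apply self first, then other."""
        return GenKill(other.gen | (self.gen - other.kill),
                       self.kill | other.kill)

    def meet(self, other):
        """Meet (union of results) of two functions reaching the same point."""
        return GenKill(self.gen | other.gen, self.kill & other.kill)

    def closure(self):
        """Effect of going around the cycle zero or more times: gen U x."""
        return GenKill(self.gen, frozenset())
```

A quick check of the closure on one input: with gen = {d1} and kill = {d2, d3}, applying the closure to {d2, d4} gives {d1, d2, d4}, which is the union of the results of zero and one trips around the loop.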

slide-18
SLIDE 18

Composing transfer functions in body regions

  • If R is a body region:
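
A sketch of the equations for a body region, following the usual textbook formulation of region-based analysis; this is an assumption about the intended content, and the notation may differ from the original slide. The subregions S of R are visited in topological order:

```latex
\begin{align*}
f_{R,\mathrm{IN}[S]} &= \bigwedge_{\text{predecessors } B \text{ in } R \text{ of the header of } S} f_{R,\mathrm{OUT}[B]}
  && \text{(the identity if } S \text{ contains the header of } R\text{)} \\
f_{R,\mathrm{OUT}[B]} &= f_{S,\mathrm{OUT}[B]} \circ f_{R,\mathrm{IN}[S]}
  && \text{for each exit block } B \text{ in } S
\end{align*}
```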

slide-19
SLIDE 19

  • Region-based analysis of reaching definitions
slide-20
SLIDE 20

Composing transfer functions in loop regions

  • If R is a loop region:
  • f_{R,IN[S]} represents the effect of executing any path from the entry of R to the entry of S, after executing the loop any number of times (possibly 0)
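
A sketch of the equations for a loop region R whose only subregion is the body region S, again following the usual textbook formulation; this is an assumption about the intended content, and the notation may differ from the original slide. The closure is applied to the meet over the back edges to the header:

```latex
\begin{align*}
f_{R,\mathrm{IN}[S]} &= \Bigl(\bigwedge_{\text{predecessors } B \text{ in } R \text{ of the header of } S} f_{S,\mathrm{OUT}[B]}\Bigr)^{*} \\
f_{R,\mathrm{OUT}[B]} &= f_{S,\mathrm{OUT}[B]} \circ f_{R,\mathrm{IN}[S]}
  \qquad \text{for each exit block } B \text{ in } R
\end{align*}
```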

slide-21
SLIDE 21

Computing data-flow values in a top-down pass

  • The transfer functions associated with regions are used to compute values at the beginning of each region:
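
A sketch of the top-down pass in the usual textbook formulation; this is an assumption about the intended content, and the symbols R_n (the top region) and R' (the region immediately enclosing R) are introduced here for illustration:

```latex
\begin{align*}
\mathrm{IN}[R_n] &= \mathrm{IN}[\mathrm{ENTRY}] \\
\mathrm{IN}[R]   &= f_{R',\mathrm{IN}[R]}\bigl(\mathrm{IN}[R']\bigr)
  \qquad \text{for every other region } R\text{, visited in top-down order}
\end{align*}
```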

slide-22
SLIDE 22

The reaching definition running example

  • Values computed by region-based reaching definition analysis

slide-23
SLIDE 23

Region-Based Symbolic Analysis

  • The analysis identifies program variables whose value can be expressed as affine expressions (~ linear combinations) of certain reference variables, and it returns such expressions
  • Reference variables can be

– loop control variables
– variables holding values returned by functions or read from input

  • Affine expressions can also refer to iteration counts
  • Induction variables are those expressible as a*i + b, with i = count of iterations

slide-24
SLIDE 24

Symbolic Analysis: Motivations

  • The identification of induction variables enables various kinds of optimizations

– Values can be computed with addition or shift (not multiplication)
– Accesses to array elements can be parallelized if they are distinct
– Loop invariants and constants can be identified as degenerate affine expressions over loop indexes

  • x is the only reference variable
  • With symbolic analysis we learn that y = x-1 and z = x-2
  • Thus the assignments to A are at distinct locations
  • And the last statement is never executed

slide-25
SLIDE 25

Sample program and corresponding CFG (with regions, after some transformations)

  • f and g are induction variables.
  • They can be expressed as

– f = i + 99
– g = j + 9

with i, j iteration counters

  • The control variables f and g are replaced by iteration counters
  • for loops are transformed into repeat-until loops
slide-26
SLIDE 26

Data-flow analysis: the domain

  • Domain of data-flow values: the map lattice (Vars → AffExp, ∧s) of symbolic maps, where

– Vars is the set of variables of the program
– (AffExp, ∧s) is the flat semilattice of all affine expressions of reference variables, with NAA as bottom (representing “non-affine expression”)

  • By definition of map lattice we have that

– The meet is defined by (f∧f')(x) = f(x)∧f'(x)
– The ordering is f ≤ f' ⇔ ∀x, f(x) ≤ f'(x)
– The bottom value is the map mapping all variables to NAA
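
The pointwise definitions above can be illustrated with a small Python sketch. It assumes, purely for illustration, that affine expressions are stored as strings in a canonical form, so that two expressions are equal exactly when their strings are equal; all names are hypothetical.

```python
NAA = "NAA"   # bottom of the flat semilattice: "not an affine expression"

def meet_expr(a, b):
    """Meet in the flat semilattice: equal values are kept, anything else is NAA."""
    return a if a == b else NAA

def meet_map(f, g):
    """Pointwise meet of two symbolic maps over the same variables."""
    return {x: meet_expr(f[x], g[x]) for x in f}

def leq(f, g):
    """f <= g in the map lattice iff f(x) <= g(x) for every variable x."""
    return all(meet_expr(f[x], g[x]) == f[x] for x in f)
```

For example, the meet of {a: "x+1", b: "2*x"} and {a: "x+1", b: "x"} keeps a and sends b to NAA, and the result lies below both arguments in the ordering.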

slide-27
SLIDE 27

Data-flow analysis: Transfer functions of statements

  • Transform a symbolic map, according to the semantics of the statement
  • “c0 + c1 y + c2 z” represents any affine expression involving variables of the program
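
A sketch of the transfer function f_s of a statement s that assigns to a variable x, in the usual textbook formulation; this is an assumption about the intended content, and the notation may differ from the original slide:

```latex
\[
f_s(m)(v) =
\begin{cases}
m(v) & \text{if } v \neq x \\
c_0 + c_1\, m(y) + c_2\, m(z) & \text{if } v = x \text{ and } s \text{ is } x = c_0 + c_1 y + c_2 z, \\
 & \text{with } (c_1 = 0 \text{ or } m(y) \neq \mathrm{NAA}) \text{ and } (c_2 = 0 \text{ or } m(z) \neq \mathrm{NAA}) \\
\mathrm{NAA} & \text{otherwise}
\end{cases}
\]
```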
slide-28
SLIDE 28

Data-flow analysis: Composition of transfer functions

  • Standard composition of linear combinations, if defined, otherwise NAA values are propagated

slide-29
SLIDE 29

Region Based Analysis: meet and closure of transfer functions

  • For Region Based Analysis we should define the meet and closure of transfer functions
  • Meet: The value of a variable for the meet of two functions is NAA unless it has the same value for both functions:
  • Closure: Given a symbolic map m and a transfer function f (representing a single execution of a loop) we can summarize the effect of 0 or any number of executions of the loop only if m is loop invariant:
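
The two definitions can be written out as follows. This is a sketch in the usual textbook formulation, offered as an assumption about the intended formulas; the notation may differ from the original slide:

```latex
\[
(f_1 \wedge f_2)(m)(v) =
\begin{cases}
f_1(m)(v) & \text{if } f_1(m)(v) = f_2(m)(v) \\
\mathrm{NAA} & \text{otherwise}
\end{cases}
\qquad
f^{*}(m)(v) =
\begin{cases}
m(v) & \text{if } f(m)(v) = m(v) \\
\mathrm{NAA} & \text{otherwise}
\end{cases}
\]
```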

slide-30
SLIDE 30

Region Based Analysis: closure is not enough

  • For symbolic analysis, the closure operator on transfer functions, summarizing 0 or any number of executions of a loop, is not informative enough

– The symbolic map needs to be parametrized by the number of times a loop is executed
– When the loop terminates, the number of iterations is used to determine the value of induction variables after the loop

  • We need to compute the effect of composing a transfer function g (the effect of one iteration) a fixed number of times: g^0 = I, g^(i+1) = g ∘ g^i
  • Therefore we introduce parametrized (transfer) function composition
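
For a concrete count i, the powers g^i can be obtained by plain repeated composition. The Python sketch below does this for affine transfer functions only (it has no NAA case); the dictionary encoding, the `CONST` key and the function names are hypothetical.

```python
CONST = "1"   # hypothetical key holding the constant term of an affine expression

def substitute(expr, f):
    """Rewrite expr, stated over the values after f, over the values before f."""
    result = {}
    for var, coeff in expr.items():
        inner = {CONST: 1} if var == CONST else f[var]
        for v, c in inner.items():
            result[v] = result.get(v, 0) + coeff * c
    return {v: c for v, c in result.items() if c != 0}

def compose(g, f):
    """The transfer function g ∘ f: apply f first, then g."""
    return {x: substitute(expr, f) for x, expr in g.items()}

def power(g, i):
    """g^0 = I and g^(i+1) = g ∘ g^i."""
    result = {x: {x: 1} for x in g}   # the identity map I
    for _ in range(i):
        result = compose(g, result)
    return result
```

With g mapping x to x + 2 and leaving y unchanged, `power(g, 3)` maps x to x + 6 and y to y, matching the pattern m(x) + c·i for a variable incremented by a constant at every iteration.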

slide-31
SLIDE 31

Region Based Analysis: parametrized function composition

  • For a transfer function g representing the execution of one cycle, we determine g^i for all i >= 0
  • Potential induction variables are of three kinds:

1. if g(m)(x) = m(x) + c with c constant, then g^i(m)(x) = m(x) + c·i [x is a basic induction variable]
2. if g(m)(x) = m(x), then g^i(m)(x) = m(x) [x is a symbolic constant]
3. if g(m)(x) = c0 + c1 m(x1) + … + cn m(xn) where each xk is a basic induction variable or a symbolic constant, then g^i(m)(x) = c0 + c1 g^i(m)(x1) + … + cn g^i(m)(xn) [x is a (non-basic) induction variable]
4. In all other cases g^i(m)(x) = NAA

slide-32
SLIDE 32

Region Based Symbolic Analysis: the modified algorithm

  • The algorithm is still made of two passes:

– A bottom-up pass to compute the transfer functions for all regions
– A top-down pass to compute the symbolic map at the entry of each region

  • But for each loop region R and body sub-region S, instead of f_{R,IN[S]} (representing the effect of executing the loop any number of times) one computes f_{R,i,IN[S]} (representing the effect of executing the loop exactly i times)

slide-33
SLIDE 33

Region Based Symbolic Analysis: the modified algorithm (2)

  • If the number of iterations of a loop is known, the summary of the region is computed by replacing i with the actual count.
  • In the top-down pass f_{R,i,IN[S]} is used to compute the symbolic map at the entry of the i-th iteration of a loop.
  • To prevent NAA variables from penetrating into inner loops, if m(v) is used in an assignment in region R but m(v) = NAA at the entry of R, the assignment t = v is added for a new reference variable t, and m(v) is replaced by t everywhere. Example:

slide-34
SLIDE 34

Sample program and corresponding CFG (with regions, after some transformations)

slide-35
SLIDE 35

Result of Region Based Analysis

  • Modified program

Sequence of values for the variables Symbolic maps