From Mozart and Freud to .Net Technologies (Jens Knoop, Vienna University of Technology, Austria)
SLIDE 1

From Mozart and Freud to .Net Technologies

Jens Knoop Vienna University of Technology, Austria

SLIDE 2

The Mystery Keynote !?

Disclosing the secret...

SLIDE 3

Wolfgang Amadeus Mozart 2006

Celebrating the 250th Anniversary of the Birth of Mozart!

1756 – 2006

SLIDE 4

Sigmund Freud 2006

Celebrating the 150th Anniversary of the Birth of Freud!

1856 – 2006

SLIDE 5

.Net Technologies 2006

Celebrating the 50th Anniversary of the First Release of .Net!

1956 – 2006

SLIDE 6

.Net Technologies 2006

Celebrating the 50th Anniversary of the First Release of .Net!

1956 – 2006

Really? After some investigation, unfortunately, not.

SLIDE 7

Started Thinking about the Question

Is there a topic in computer science that researchers

  • started to investigate as early as 1956, and
  • still do, and
  • which is of interest for the .Net-community?

SLIDE 8

Travelling Back in Time

...as far as possible. Coming up with:

  • The first issue of the Communications of the ACM did not appear until 1958!

SLIDE 9

Having a Closer Look

...at this very first issue of the CACM, I got quite excited:

  • Ershov, A. P. On Programming of Arithmetic Operations. CACM 1 (8), 3-6, 1958.

SLIDE 10

Findings

  • In this article, Ershov proposes the use of “convention numbers” for denoting the results of computations and avoiding having to recompute them.

  • Ershov’s work can be considered the initial work on value numbering schemes.

  • Rephrased in more modern terms: the origin of research on Code Motion (CM) can be traced back to Ershov’s CACM article of 1958!

SLIDE 11

Conclusions drawn from these Findings

Research on CM-based Program Optimizations...

Its 50th Anniversary is approaching!

1956/58 – 2006/08

SLIDE 12

2006

The Year of Anniversaries...

  • 1756-2006: 250th Anniversary of the Birth of Mozart
  • 1856-2006: 150th Anniversary of the Birth of Freud
  • 1956/58-2006: 50th Anniversary of the Origin of Research on Code Motion

SLIDE 13

Does CM Meet the Criteria?

Indeed, it is an active area of research...

  • Relevant ...widely used in practice
  • General ...a family of optimizations rather than a single one
  • Challenging ...conceptually simple, but exhibits lots of thought-provoking phenomena

Last but not least...

  • The underlying theory is a nice piece of mathematics!

Technology for the Impatient .Net Programmer and User

SLIDE 14

The Plan for this Presentation

  • Providing a Tour through 50 Years of Research on CM
  • Focusing on
    – Achievements
    – Phenomena
    – Open Problems and Challenges

SLIDE 15

Tour Stops Included

  • Part I: Code motion – Exploring the Design Space
  • Part II: Code motion – Classically, but Advanced
  • Part III: Code motion – Phenomena of its Derivatives
  • Part IV: Code Motion – Recent Strands of Research
  • Conclusions and Perspectives

SLIDE 16

Part I: CM – Exploring the Design Space

CM in the early days essentially meant...

  • Common subexpression elimination
  • Loop invariant code motion

SLIDE 17

CM – The Seminal Work

Even if CM can be traced back to...

  • Ershov, A. P. On Programming of Arithmetic Operations. CACM 1 (8), 3-6, 1958.

...it is fair to say that contemporary CM starts with the seminal work of

  • Morel, E. and Renvoise, C. Global Optimization by Suppression of Partial Redundancies. CACM 22 (2), 96-103, 1979.

SLIDE 18

CM – What’s it all about?

  • Essentially, CM aims at avoiding recomputing values

(Figure: before the transformation, y := a+b and x := a+b are computed separately; after it, h := a+b is inserted at safe points and the uses become y := h and x := h.)
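The idea can be sketched in ordinary code (an illustrative sketch, not the slide's exact flow graph; the function names are ours):

```python
# Before CM: a+b is computed on one branch and recomputed at the join,
# so on the True path the second computation is redundant.
def before(a, b, flag):
    y = 0
    if flag:
        y = a + b        # first computation
    x = a + b            # partially redundant recomputation
    return x + y

# After CM: a temporary h holds the value; every path computes a+b once.
def after(a, b, flag):
    h = a + b            # hoisted to a safe computation point
    y = 0
    if flag:
        y = h            # reuses h instead of recomputing
    x = h                # totally redundant computation eliminated
    return x + y

# Both versions compute the same result on every path.
assert before(2, 3, True) == after(2, 3, True) == 10
assert before(2, 3, False) == after(2, 3, False) == 5
```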

SLIDE 19

In practice, it is slightly more complex!

(Figure: a larger flow graph with nodes 1-18, containing an assignment a := c and several computations of a+b: x := a+b, y := a+b, z := a+b at various nodes.)

SLIDE 20

A Transformation

(Figure: the same flow graph after one code-motion transformation: h := a+b is inserted at safe points and the original computations are replaced by x := h, y := h, z := h.)

SLIDE 21

Another Transformation

(Figure: another admissible transformation of the same flow graph: h := a+b is inserted at different safe points, and the uses are again replaced by h.)

SLIDE 22

In more detail

CM and its two traditional optimization goals...

(Figure: a small example with an assignment a := ... followed by a computation z := a+b.)

SLIDE 23

Conceptually

...CM can be considered a two-stage process

  • 1. Expression hoisting

...hoisting expressions to “earlier” safe computation points

  • 2. Total redundancy elimination

...eliminating computations which became totally redundant

SLIDE 24

Extreme Strategy – Earliestness Principle

Placing computations as early as possible...

  • Theorem [Computational Optimality]

...hoisting expressions to their earliest safe computation points yields computationally optimal programs

❀ ...known as Busy Code Motion (PLDI’92, Knoop et al.) ...already known to Morel and Renvoise (though no theorem or proof).

SLIDE 25

Earliestness Principle

Placing computations as early as possible... ...yields computationally optimal programs.

(Figure: h := a+b placed at the earliest safe computation points, with the use z := h and the assignment a := ... .)

SLIDE 26

Note: Earliestness means in fact...

...as early as possible, but not earlier!

Incorrect!

(Figure: h := a+b hoisted above the assignment a := x+y; since a is redefined before the use z := h, this placement is not safe.)

SLIDE 27

Earliestness Principle: Important Drawback

...computationally optimal, but maximum register pressure

  • Maximum Register Pressure!

(Figure: earliest placement of h := a+b keeps h live across long stretches of the program, maximizing register pressure.)

SLIDE 28

Dual Extreme Strategy – Latestness Principle

Placing computations as late as possible...

  • Theorem [Optimality]

...hoisting expressions to their latest safe computation points yields computationally optimal programs with minimum register pressure

❀ ...known as Lazy Code Motion (PLDI’92, Knoop et al.)
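The difference between the two extremes can be pictured with a toy sketch (illustrative only; the loop stands in for any unrelated code across which h would stay live):

```python
# Busy (earliest) placement: h is computed at the earliest safe point and
# stays live across the unrelated work below, occupying a register.
def busy(a, b, items):
    h = a + b                  # earliest safe point: long live range
    total = 0
    for i in items:            # unrelated work; h is live throughout
        total += i
    return total + h           # the only use of h

# Lazy (latest) placement: h is computed just before its use, so its
# live range, and hence the register pressure it causes, is minimal.
def lazy(a, b, items):
    total = 0
    for i in items:            # unrelated work; h not yet live
        total += i
    h = a + b                  # latest safe point: short live range
    return total + h

assert busy(1, 2, [10, 20]) == lazy(1, 2, [10, 20]) == 33
```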

SLIDE 29

Latestness Principle

...computationally optimal, too, with minimum register pressure!

  • Minimum Register Pressure!

(Figure: latest placement of h := a+b keeps h's live range, and hence register pressure, minimal; z := h.)

SLIDE 30

These days...

Lazy Code Motion is...

  • ...the de-facto standard algorithm for CM used in contemporary state-of-the-art compilers
    – GNU compiler family
    – Sun SPARC compiler family
    – ...

SLIDE 31

Towards Exploring the Design Space

Traditionally,

  • Code (C) means expressions
  • Motion (M) means hoisting

But...

  • CM is more than hoisting of expressions and PR(E)E!

SLIDE 32

Obviously, code...

...can be assignments, too.

(Figure: a flow graph in which the assignment x := a+b occurs on several paths and is partially redundant.)

  • Here, CM means partially redundant assignment elimination (PRAE)

SLIDE 33

In contrast to expressions, assignments...

...might also be sunk.

(Figure: x := a+b flows to uses out(x), but on one path x is overwritten by x := y+z before any use; sinking x := a+b towards its uses removes it from the path where it is dead. Here out(x) denotes a use of x.)

  • Now, CM means partially dead code elimination (PDCE)
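Sinking can likewise be sketched in code (an illustrative sketch with invented names; the return marks the use out(x)):

```python
# Before: x := a+b is partially dead; on the flag-path x is overwritten
# before ever being used, so the computation is wasted there.
def before(a, b, z, flag):
    x = a + b          # partially dead assignment
    if flag:
        x = z          # kills x without using a+b
    return x           # out(x): the only use of x

# After sinking x := a+b into the branch where x is actually live:
def after(a, b, z, flag):
    if flag:
        x = z
    else:
        x = a + b      # computed only on the path where it is used
    return x           # out(x)

assert before(1, 2, 9, True) == after(1, 2, 9, True) == 9
assert before(1, 2, 9, False) == after(1, 2, 9, False) == 3
```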

SLIDE 34

Towards the Design Space of CM-Algorithms...

More generally...

  • Code means expressions/assignments
  • Motion means hoisting/sinking

Code \ Motion   | Hoisting | Sinking
Expressions     | EH       | ·/·
Assignments     | AH       | AS

SLIDE 35

Refining the Design Space of CM-Algorithms...

Dimensions of the design space:

  – Setting: Intraprocedural / Interprocedural / Parallelism / Predicated code / ...
  – Code: EH vs. AH, AS
  – Paradigm: Syntactic (syn. redundancies) vs. Semantic (sem. redundancies)

Introducing semantics... !

(Figure: x := a+b; c := a; y := a+b; z := c+b. After c := a, the computation c+b is semantically, though not syntactically, redundant with a+b.)

SLIDE 37

Semantic Code Motion...

allows more powerful optimizations!

(Figure: a program with tuple assignments (x,y,z) := (a,b,a+b) and (a,b,c) := (x,y,y+z); introducing h := x+y resp. h := a+b eliminates the semantically redundant recomputation.)

(example by B. Steffen, TAPSOFT’87)

SLIDE 38

Remember,...

CM (PREE) and its optimization goals!

  • Speed
  • Register Pressure

There might be a third one:

  • Code Size

SLIDE 39

A Computationally and Code-Size Optimal Program

(Figure: a placement with a single insertion h := a+b and the use z := h, next to the assignment a := ... .)

❀ Code size

SLIDE 40

1999 World Market for Microprocessors

Going for size makes sense...

Chip Category      | Number Sold
Embedded 4-bit     | 2000 million
Embedded 8-bit     | 4700 million
Embedded 16-bit    | 700 million
Embedded 32-bit    | 400 million
DSP                | 600 million
Desktop 32/64-bit  | 150 million (∼ 2%)

... David Tennenhouse (Intel Director of Research). Keynote Speech at the 20th IEEE Real-Time Systems Symposium (RTSS’99), Phoenix AZ, 1999.

SLIDE 41

Think of...

... domain-specific processors as used in embedded systems

  • Telecom

– Cell phones, pagers, ...

  • Consumer Electronics

– MP3 player, cameras, pocket games, ...

  • Automotive

– GPS navigation, airbags, ...

  • ...

SLIDE 42

For such applications...

...code size is often more critical than speed!

SLIDE 43

Part II: CM – Classically, but Advanced

...enhancing (L)CM to take a user’s priorities into account!

  • Computational Quality ...Run-Time Performance
  • Lifetime Quality ...Register Pressure
  • Code-Size Quality ...Code Size

SLIDE 44

...rendering possible this transformation, too:

  • Moderate Register Pressure!

(Figure: a transformation placing h := a+b between the two extremes, trading some register pressure for other qualities.)

SLIDE 45

Towards Code-Size Sensitive CM...

  • Background: Classical CM
    ❀ Busy CM (BCM) / Lazy CM (LCM) (Knoop et al., PLDI’92)
    – Received the ACM SIGPLAN Most Influential PLDI Paper Award 2002 (for 1992)
    – Selected for “20 Years of the ACM SIGPLAN PLDI: A Selection” (60 papers out of ca. 600 papers)

  • Code-Size Sensitive CM (Knoop et al., POPL’00)
    ❀ ...modular extension of BCM/LCM
    ∗ Modelling and Solving the Problem ...based on graph-theoretical means
    ∗ Main Results ...correctness, optimality

SLIDE 46

The Running Example

(Figure: the running example, a flow graph with nodes 1-15 containing three computations of a+b and an assignment a := ... .)

SLIDE 47

The Running Example (Cont’d)

Two Code-Size Optimal Programs

(Figure: two transformations a) and b) of the running example, each inserting h := a+b at different points and replacing the original computations by h; both are code-size optimal.)

SLIDE 48

The Running Example (Cont’d)

  • a) SQ > LQ > CQ
  • b) SQ > CQ > LQ

(Figure: the two code-size optimal programs again, each realizing one prioritization of code-size quality (SQ), lifetime quality (LQ), and computational quality (CQ).)

SLIDE 49

The Running Example (Cont’d)

Note, we do not want the following transformation: it's no option!

Impairing!

(Figure: a transformation inserting h := a+b where it impairs performance; such placements are excluded.)

SLIDE 50

Code-Size Sensitive PRE

❀ The Problem

...how to get a code-size minimal placement of computations, i.e., a placement which is

– admissible (semantics & performance preserving) – code-size minimal

❀ Solution: A Fresh Look at PRE

...considering PRE a trade-off problem: trading the original computations against newly inserted ones!

❀ The Key Idea (the “Clou”): Use Graph Theory!

...reducing the trade-off problem to the computation of tight sets in bipartite graphs based on maximum matchings!

SLIDE 51

Bipartite Graph

(Figure: a bipartite graph with node sets S and T; a tight set Sts and its neighbourhood Γ(Sts) are highlighted.)

Tight Set ...of a bipartite graph (S ∪ T, E) is a subset Sts ⊆ S such that

∀ S′ ⊆ S. |Sts| − |Γ(Sts)| ≥ |S′| − |Γ(S′)|

2 Variants: (1) Largest Tight Sets (2) Smallest Tight Sets
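For small graphs, tight sets can be computed by brute force straight from this definition (an exponential illustration of the definition only; the real algorithm uses maximum matchings, and the toy graph below is ours):

```python
from itertools import combinations

def deficiency(subset, E):
    """|S'| - |Γ(S')| for a subset S' of S, with Γ the neighbourhood in E."""
    gamma = {t for (s, t) in E if s in subset}
    return len(subset) - len(gamma)

def tight_sets(S, E):
    """All subsets of S with maximum deficiency, i.e., the tight sets."""
    subsets = [set(c) for k in range(len(S) + 1)
               for c in combinations(sorted(S), k)]
    best = max(deficiency(sub, E) for sub in subsets)
    return [sub for sub in subsets if deficiency(sub, E) == best]

# Toy bipartite graph: S = {1, 2, 3}, T = {'a', 'b'};
# node 3 has no neighbour, nodes 1 and 2 share 'a'.
S = {1, 2, 3}
E = {(1, 'a'), (2, 'a'), (2, 'b')}
ts = tight_sets(S, E)

# The two variants from the slide:
largest = max(ts, key=len)
smallest = min(ts, key=len)
assert largest == {1, 2, 3} and smallest == {3}
```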

SLIDE 53

Apparently

Off-the-shelf algorithms of graph theory can be used to compute...

  • Maximum matchings and
  • Tight sets

Hence, our PRE problem boils down to... ...constructing the bipartite graph modelling the problem!

SLIDE 54

Modelling the Trade-Off Problem

The Set of Nodes

(Figure: the two node sets of the bipartite graph, SDS and TDS, derived from Comp, UpSafe, DownSafe, and the BCM insertion points InsertBCM; in the running example they comprise nodes such as 2-8 and 11-13.)

The Set of Edges...

SLIDE 55

The Set of Nodes

(Figure: the node sets related back to the running example in variants a) and b): the original computations of a+b and the candidate insertion points for h := a+b.)

SLIDE 56

Modelling the Trade-Off Problem

The Set of Nodes

(Figure: as before, the node sets SDS and TDS.)

The Bipartite Graph

(Figure: the resulting bipartite graph over SDS and TDS for the running example.)

The Set of Edges

∀ n ∈ SDS ∀ m ∈ TDS. {n, m} ∈ EDS ⇐⇒df m ∈ Closure(pred(n))

SLIDE 57

DownSafety Closures

DownSafety Closure. For n ∈ DownSafe \ UpSafe, the DownSafety Closure Closure(n) is the smallest set of nodes satisfying

  • 1. n ∈ Closure(n)
  • 2. ∀ m ∈ Closure(n) \ Comp. succ(m) ⊆ Closure(n)
  • 3. ∀ m ∈ Closure(n). pred(m) ∩ Closure(n) ≠ ∅ ⇒ pred(m) \ UpSafe ⊆ Closure(n)
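Read operationally, the definition is a fixpoint: start from {n} and add nodes until conditions 2 and 3 hold (a minimal sketch; the dict-based graph encoding and the toy example are ours):

```python
def downsafety_closure(n, succ, pred, Comp, UpSafe):
    """Smallest superset of {n} closed under the two rules:
    add successors of non-Comp members (rule 2), and add non-UpSafe
    predecessors of members that already have a predecessor in the
    closure (rule 3)."""
    C = {n}
    changed = True
    while changed:
        changed = False
        for m in list(C):
            add = set()
            if m not in Comp:                      # rule 2
                add |= set(succ.get(m, ())) - C
            if set(pred.get(m, ())) & C:           # rule 3
                add |= set(pred.get(m, ())) - UpSafe - C
            if add:
                C |= add
                changed = True
    return C

# Toy chain 1 -> 2 -> 3 with the computation at node 3.
succ = {1: {2}, 2: {3}}
pred = {2: {1}, 3: {2}}
assert downsafety_closure(1, succ, pred, {3}, set()) == {1, 2, 3}
```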

SLIDE 58

DownSafety Closures – The Very Idea 1(4)

(Figure: the closure construction on the running example, starting from a single down-safe candidate insertion point for h := a+b.)

SLIDE 59

DownSafety Closures – The Very Idea 2(4)

(Figure: the next step of the closure construction on the running example: successors of non-computation nodes are added.)

SLIDE 60

DownSafety Closures – The Very Idea 3(4)

(Figure: a path on which h would be used without initialization. No Initialization! The closure must therefore be extended by predecessors.)

SLIDE 61

DownSafety Closures – The Very Idea 4(4)

(Figure: the completed closure on the running example, with all required insertions of h := a+b.)

SLIDE 63

DownSafety Regions

Some subsets of nodes are distinguished. We call each of these sets a DownSafety Region...

  • A set R ⊆ N of nodes is a DownSafety Region if and only if
  • 1. Comp\UpSafe ⊆ R ⊆ DownSafe\UpSafe
  • 2. Closure(R) = R

SLIDE 64

Fundamental...

Insertion Theorem Insertions of admissible PRE-Transformations are always at “earliest-frontiers” of DownSafety regions.

}

Comp

UpSafe Transp

R

DownSafe/UpSafe EarliestFrontier

R

...characterizes for the first time all semantics preserving CM-transf.

SLIDE 65

The Key Questions

...concerning correctness and optimality:

  • 1. Where to insert computations, why is it correct?
  • 2. What is the impact on the code size?
  • 3. Why is it optimal, i.e., code-size minimal?

...three theorems, each answering one of these questions.

SLIDE 66

Main Results / First Question

  • 1. Where to insert computations, why is it correct?

Intuitively, at the earliestness frontier of the DS-region induced by the tight set...

Theorem 1 [Tight Sets: Insertion Points] Let TS ⊆ SDS be a tight set. Then

RTS =df Γ(TS) ∪ (Comp\UpSafe)

is a DownSafety Region with BodyRTS = TS.

Correctness ...immediate corollary of Theorem 1 and the Insertion Theorem

SLIDE 67

Main Results / Second Question

  • 2. What is the impact on the code size?

Intuitively, the difference between computations inserted and replaced...

Theorem 2 [DownSafety Regions: Space Gain] Let R be a DownSafety Region with BodyR =df R\EarliestFrontierR. Then

  • Space Gain of Inserting at EarliestFrontierR:

|Comp\UpSafe| − |EarliestFrontierR| = |BodyR| − |Γ(BodyR)| =df defic(BodyR)

SLIDE 68

Main Results / Third Question

  • 3. Why is it optimal, i.e., code-size minimal?

Due to an inherent property of tight sets (non-negative deficiency!)...

Optimality Theorem [The Transformation] Let TS ⊆ SDS be a tight set.

  • Insertion Points: InsertSpCM =df EarliestFrontierRTS = RTS\TS
  • Space Gain: defic(TS) =df |TS| − |Γ(TS)| ≥ 0 (maximal)

SLIDE 69

Largest vs. Smallest Tight Sets: The Impact

Largest tight sets favor the Earliestness Principle, i.e., Computational Quality; smallest tight sets favor the Latestness Principle, i.e., Lifetime Quality.

(Figure: the DownSafety regions RLaTS and RSmTS induced by the largest and smallest tight sets, with their earliest frontiers and Comp.)

SLIDE 70

Recall the Running Example

Largest Tight Set: Earliestness Principle (SQ > CQ); Smallest Tight Set: Latestness Principle (SQ > LQ)

(Figure: the two corresponding transformations a) and b) of the running example, with h := a+b inserted according to the largest resp. smallest tight set.)

SLIDE 71

Code-Size Sensitive CM at a Glance

(Diagram: code-size sensitive CM at a glance. A preprocess performs LCM resp. BCM on G and computes the required predicates (2-3 GEN/KILL-DFAs); a reduction phase constructs the bipartite graph and computes a maximum matching; the optimization phase computes a largest/smallest tight set and determines the insertion points.)

SLIDE 72

A brief overview on the history of CM...

  • 1958: ...first glimpse of PRE
    ❀ Ershov’s work On Programming of Arithmetic Operations
  • 1979: ...origin of contemporary PRE
    ❀ Morel/Renvoise’s seminal work on PRE
  • 1992: ...LCM [Knoop et al., PLDI’92]
    ❀ ...first to achieve computational optimality with minimum register pressure
    ❀ ...first to be rigorously proven correct and optimal
  • 2000: ...origin of code-size sensitive PRE [Knoop et al., POPL 2000]
    ❀ ...first to allow prioritization of goals
    ❀ ...rigorously proven correct and optimal
    ❀ ...first to bridge the gap between traditional compilation and compilation for embedded systems

SLIDE 73

Overview (Cont’d)

  • ca. since 1997: ...a new strand of research on PRE

❀ Speculative PRE: Gupta, Horspool, Soffa, Xue, Scholz, Knoop,...

  • 2005: ...another fresh look at PRE (as maximum flow problem)

❀ Unifying PRE and Speculative PRE [Jingling Xue and J. Knoop]

SLIDE 74

Part III: CM – Phenomena of its Derivatives

Optimality results are quite sensitive! Three examples to provide evidence... (A) Code motion vs. code placement (B) Interdependencies of (elementary) transformations (C) Paradigm dependencies

SLIDE 75

(A) Code Motion vs. Code Placement

...not just synonyms!

(Figure: an original program with c := a, z := a+b and z := c+b. After semantic code motion, motion gets stuck on one of the expressions; semantic code placement, inserting (h1,h2) := (a+b,c+b), succeeds in placing both a+b and c+b.)

SLIDE 76

Even worse...

Optimality is lost!

(Figure: two semantically transformed versions of a program with z := a+b, y := c+b, and c := a; their results are incomparable.)

Incomparable!

SLIDE 77

Even worse still...

Performance may be lost when naively applied!

(Figure: naive semantic code placement inserts h := a+b on a path where a+b was not originally computed, degrading performance there.)

SLIDE 78

(B) Interdependencies of Transformations

(Figure: a sequence of assignment sinkings (AS) and total dead-code eliminations (TDCE) on a fragment with x := a+b, a := b+c, x := z and uses out(x,a); each elimination enables further sinking: 2nd order effects!)

AS + TDCE ...2nd Order Effects!

❀ ...Partial Dead-Code Elimination (PDCE)

SLIDE 79

Interdependencies of Transformations

(Figure: a sequence of assignment hoistings (AH) and total redundant-assignment eliminations (TRAE) on a fragment with a := b+c, b := a+c and uses out(a,b); again 2nd order effects arise.)

AH + TRAE ...2nd Order Effects!

❀ ...Partially Redundant Assignment Elimination (PRAE)

SLIDE 80

Conceptually

...we can think of PREE, PRAE and PDCE in terms of

  • PREE = EH ; TREE
  • PRAE = (AH + TRAE)∗
  • PDCE = (AS + TDCE)∗

SLIDE 81

PRAE/PDCE – Optimality Results

Derivation relation ⊢...

  • PRAE... G ⊢ET G′ with ET = {AH, TRAE}
  • PDCE... G ⊢ET G′ with ET = {AS, TDCE}

We can prove...

Optimality Theorem. For both PRAE and PDCE, ⊢ET is confluent and terminating.

SLIDE 82

(Figure: the universe of programs reachable from G via ⊢ET; since ⊢ET is confluent and terminating, every maximal derivation ends in the same optimal program Gopt.)

SLIDE 83

Consider now...

  • Assignment Placement AP

AP = (AH + TRAE + AS + TDCE)∗

...should be even more powerful! Indeed, but...

(Figure: a fragment with x := a+b and uses out(x) on which both PRAE steps and PDCE steps are applicable; applying them in different orders yields different results.)

SLIDE 84

Confluence...

...and hence (global) optimality are lost!

(Figure: the universe now contains several locally optimal programs GlocOpt; different derivation orders from G can get stuck in different ones.)

SLIDE 85

Even worse...

...there are scenarios, where we can end up with universes like

(Figure: a universe in which it is unclear where derivations from G end up at all.)

SLIDE 86

(C) Paradigm Dependencies

(Figure: a parallel program (ParBegin/ParEnd) with x := a+b, y := c+b, z := d+b; after the earliestness transformation, (h1,h2,h3) := (a+b,c+b,d+b) is inserted and the uses become x := h1, y := h2, z := h3.)

SLIDE 87

Part IV: CM – Recent Strands of Research

...another strand of research on CM is gaining more and more attention

  • Speculative CM (SCM)

SLIDE 88

SCM – What’s it all about?

In contrast to CM,

  • SCM takes profile information into account

...thereby allowing the performance of hot program paths to be improved at the expense of impairing cold program paths. Everything else, especially the optimization goals, stays the same!

SLIDE 89

(Figure: a flow graph with nodes 1-12, several computations of a+b, and edge execution frequencies; annotations such as 50/80 contrast computation counts before and after speculative placement.)

SLIDE 91

SCM vs. CM

Apparently,

  • SCM and CM are two closely related and very similar problems having much in common!

However,

  • SCM and CM are tackled by quite diverse algorithmic means
    – CM ...based on solving (typically) 4 bitvector analyses: Availability, Anticipability, ...
    – SCM ...based on solving a maximum flow problem
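One of those bitvector analyses, Availability, can be sketched for a single expression as a boolean fixpoint (an illustrative sketch; the dict-based predicate and graph encodings are ours):

```python
def availability(nodes, pred, comp, transp, entry):
    """Forward 'must' analysis for one expression: available at a node's
    entry iff available at the exit of ALL predecessors; available at
    the exit iff computed locally, or entering and transparent (i.e.,
    no operand is redefined)."""
    avail_in = {n: n != entry for n in nodes}   # optimistic init (greatest fixpoint)
    avail_out = {n: True for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            ai = False if n == entry else all(avail_out[p] for p in pred.get(n, []))
            ao = comp[n] or (ai and transp[n])
            if (ai, ao) != (avail_in[n], avail_out[n]):
                avail_in[n], avail_out[n] = ai, ao
                changed = True
    return avail_in

# Chain 1 -> 2 -> 3: a+b computed at node 1 and never killed afterwards,
# so it is available on entry to nodes 2 and 3.
nodes = [1, 2, 3]
pred = {2: [1], 3: [2]}
comp = {1: True, 2: False, 3: False}
transp = {1: True, 2: True, 3: True}
ai = availability(nodes, pred, comp, transp, entry=1)
assert ai == {1: False, 2: True, 3: True}
```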

SLIDE 92

Recent Achievement

...the missing link between

  • Classical PRE (CPRE) and Speculative PRE (SPRE)

On the theoretical side, this yields...

  • a common high-level conceptual basis and understanding of CPRE and SPRE

On the practical side, we obtain...

  • a new and simple algorithm for CPRE, which turns out to outperform its competitors (joint work with Jingling Xue, CC 2006)

SLIDE 93

Major Finding

Like SCM

  • CM is a maximum flow problem, too!

This means

  • Each (S)CM-algorithm, if optimal, must find in one way or the other the unique minimum cut on a flow network derived from a program’s CFG.

Hence, we have

  • The Missing Link between CM and SCM!
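To make the max-flow view concrete, here is a generic Edmonds-Karp sketch that returns the flow value and the source side of a minimum cut (a textbook algorithm on a toy network of our own devising, not the Xue/Knoop formulation):

```python
from collections import deque, defaultdict

def max_flow_min_cut(cap, s, t):
    """Edmonds-Karp: repeatedly augment along BFS residual paths.
    Returns the max-flow value and the s-side of a minimum cut."""
    flow = defaultdict(int)
    def bfs_parents():
        parent = {s: None}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in cap[u]:
                if v not in parent and cap[u][v] - flow[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        return parent
    while True:
        parent = bfs_parents()
        if t not in parent:
            # nodes still reachable in the residual graph form the cut's s-side
            value = sum(flow[(s, v)] for v in cap[s])
            return value, set(parent)
        path, v = [], t
        while v != s:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] - flow[(u, v)] for u, v in path)
        for u, v in path:
            flow[(u, v)] += bottleneck
            flow[(v, u)] -= bottleneck   # residual back-flow

def add_edge(cap, u, v, c):
    cap[u][v] = c
    cap[v].setdefault(u, 0)              # residual back-edge

# Toy flow network: min cut separates s from everything else.
cap = defaultdict(dict)
for u, v, c in [('s','a',3), ('s','b',2), ('a','t',2), ('b','t',3), ('a','b',1)]:
    add_edge(cap, u, v, c)
value, cut = max_flow_min_cut(cap, 's', 't')
assert value == 5 and cut == {'s'}
```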

SLIDE 94

On the Impact of this Finding 1(4)

Practically

  • Possibly none

...at least not in terms of demanding replacement of implementations of optimal state-of-the-art CM algorithms by the flow-network based one.

Theoretically

  • Possibly a lot

...a common high-level basis for understanding and reasoning about both SCM and CM.

SLIDE 95

On the Impact of this Finding 2(4)

This is in line with work on CM by other researchers striving for a simple and “motion-free” characterization of CM:

  • Bronnikov, D., A Practical Adaption of Partial Redundancy Elimination, SIGPLAN Not., 39(8), 49-53, 2004.
  • Dhamdhere, D. M., E-path PRE: Partial Redundancy Elimination Made Easy, SIGPLAN Not., 37(8), 53-65, 2002.
  • Paleri, V. K., Srikant, Y. N., Shankar, P., A Simple Algorithm for Partial Redundancy Elimination, SIGPLAN Not., 33(12), 35-43, 1998.

SLIDE 96

On the Impact of this Finding 3(4)

However, for these approaches

  • either no proofs of correctness and optimality are given
  • or these proofs still rely on a low-level path-based reasoning

Especially in this respect, the characterization of

  • CM as a maximum flow problem

can be considered a major step forward.

SLIDE 97

On the Impact of this Finding 4(4)

A practical impact though... Based on the new understanding, we obtained

  • a new and simple CM-algorithm
    – Like its competitors: ...relies on 4 bitvector analyses
    – At first sight thus: ...yet another CM-algorithm
    – But: ...outperforms its competitors

SLIDE 98

Practical Measurements

...of the new algorithm show

  • a reduction in the number of bitvector operations required ranging from 20% to 60% in comparison to three state-of-the-art algorithms for CM (including LCM and E-path)

Experiments were performed

  • on an Intel Xeon and a Sun UltraSPARC-III platform
  • with the GCC compiler as vehicle
  • using all of the 22 C/C++/Fortran SPECcpu2000 benchmarks

SLIDE 99

Conclusions and Perspectives

  • Code Motion (CM)

...a hot topic of on-going research for almost 50 years!

  • State-of-the-Art in Theory and Practice

  – Theory available and widely used in practice
    ∗ Classic CM
  – Theory available, but not yet widely used
    ∗ Derivatives of Classic CM (PDCE, PFCE, SR, DAP, ...)
    ∗ Speculative CM and some derivatives (SR)
    ∗ Semantic CM
  – Theory not yet available
    ∗ Speculative Semantic CM
    ∗ ...

SLIDE 100

Conclusions and Perspectives

  • Our obligation

  – Pushing forward the further development of CM-based optimizations
  – Demanding their application (e.g. in the Phoenix framework)

...in order to help the impatient (.Net) programmer and user!

SLIDE 101

Perspectives

Predicting the future...

  • Niels Bohr: “Predictions are always difficult, especially about the future.”

SLIDE 102

Perspectives: 50 Years from Now...

The future will be bright! In particular, I predict that...

  • we will celebrate the...

– 300th Anniversary of the Birth of Mozart
– 200th Anniversary of the Birth of Freud
– 100th Anniversary of the Beginning of CM-Research

  • and that...

SLIDE 103

Perspectives: 50 Years from Now...

...we will all meet again at The 54th Annual .Net Technologies 2056 Conference in Central Europe, June 1-5, 2056, Plzen, Czech Republic

SLIDE 104

Perspectives: 50 Years from Now...

Browsing the programme of .Net Technologies 2056, I foresee... Keynote Speech: From .Net to .Net/XP: The Role and the Impact of 100 Years of CM-Research

SLIDE 105

Thank you!

Questions?

Acknowledgements: Most of the results reported are joint work with Oliver Rüthing (U. Dortmund), Bernhard Steffen (U. Dortmund), Eduard Mehofer (U. Wien), and more recently Bernhard Scholz (U. Sydney), Nigel Horspool (U. Victoria), and Jingling Xue (Univ. of New South Wales).
