
Optimizing for Space and Time Usage - PowerPoint PPT Presentation



  1. Optimizing for Space and Time Usage with Speculative Partial Redundancy Elimination. Bernhard Scholz, University of Sydney, Australia; Nigel Horspool, University of Victoria, Canada; Jens Knoop, Vienna University of Technology, Austria. Slide 1

  2. Optimizing for Space and Time Usage with SPRE. Bernhard Scholz, University of Sydney, Australia; Nigel Horspool, University of Victoria, Canada; Jens Knoop, Vienna University of Technology, Austria. Slide 1

  3. Overview
     • SPRE is normally a speed optimization ...
     • ... but SPRE may significantly increase program size.
     • We present a new SPRE approach where the objective function is a linear combination of space and time. (The problem maps to the well-known maximum flow problem in networks.)
     • An objective function that combines space and time can come close to the optimal results obtained when space and time are optimized separately.
     Slide 2

  4. Introduction: Partial Redundancy Elimination ... is a generalization of code motion (Morel and Renvoise, 1979). [Figure: flow graph containing the statements ...= a+b, a = ..., and ...= a+b.] Slide 3

  5. Introduction: Partial Redundancy Elimination ... is a generalization of code motion (Morel and Renvoise, 1979). [Figure: the flow graph before and after the transformation. Before: ...= a+b, a = ..., ...= a+b. After: t1 = a+b, a = ..., ... = t1, t1 = a+b, ... = t1.] Slide 4

  6. Introduction: Partial Redundancy Elimination ... is a generalization of code motion (Morel and Renvoise, 1979). [Figure: the same flow graph before and after the transformation, now annotated with execution frequencies of 1000 and 500. Before: ...= a+b, a = ..., ...= a+b. After: t1 = a+b, a = ..., ... = t1, t1 = a+b, ... = t1.] Slide 4

  7. Partial Redundancy Elimination ... is also very conservative. An expression e can be inserted at a point P only if every path starting from P uses e. This restriction guarantees:
     • safety, and
     • optimality (no more evaluations of the expression than before).
     [Figure: flow graph containing the statement ...= a+b.]
     Slide 5

  8. Partial Redundancy Elimination ... is also very conservative. An expression e can be inserted at a point P only if every path starting from P uses e. This restriction guarantees:
     • safety, and
     • optimality (no more evaluations of the expression than before).
     [Figure: flow graph in which ...= a+b could become ... = t1 if t1 = a+b were inserted at the point marked "?".]
     Slide 5

  9. Partial Redundancy Elimination ... is also very conservative. An expression e can be inserted at a point P only if every path starting from P uses e. This restriction guarantees:
     • safety, and
     • optimality (no more evaluations of the expression than before).
     [Figure: the same flow graph annotated with execution frequencies of 1 and 10000; the candidate insertion t1 = a+b is marked "?", and ...= a+b would become ... = t1.]
     Slide 5
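The restriction on this slide ("every path starting from P uses e") can be checked with a backward data-flow pass. The following is a minimal sketch, not the authors' implementation: the block names, the tiny flow graph, and the function `anticipable_in` are hypothetical, and each block is assumed to be a single statement that either uses the expression, modifies one of its operands, or does neither.

```python
def anticipable_in(blocks, succ, uses, kills):
    """Return, per block, whether the expression is used on every path from its entry.

    blocks: list of block names; succ: name -> list of successor names;
    uses / kills: sets of blocks that compute the expression / modify an operand.
    """
    ant = {b: True for b in blocks}  # optimistic start, refined downwards
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b in uses:
                new = True                      # the block itself uses the expression
            elif b in kills or not succ[b]:
                new = False                     # operand modified, or program exit reached
            else:
                new = all(ant[s] for s in succ[b])
            if new != ant[b]:
                ant[b] = new
                changed = True
    return ant

# Hypothetical graph: B1 branches to B2 (which computes a+b) and to B3 (an exit without a+b).
blocks = ["B1", "B2", "B3"]
succ = {"B1": ["B2", "B3"], "B2": [], "B3": []}
result = anticipable_in(blocks, succ, uses={"B2"}, kills=set())
print(result)
```

In this sketch B1 is not a legal insertion point for classic PRE, because the path through B3 never uses a+b, no matter how rarely that path executes.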


  11. Speculative PRE
     • An evaluation of e can be inserted anywhere as long as it is safe to do so, and
     • we speculatively compute e in the hope that the value will be useful later.
     Using probabilistic information (from execution profiles or elsewhere), the optimality goal becomes minimization of the expected number of evaluations. Cai and Xue presented an SPRE algorithm in 2003 which finds time-optimal solutions.
     Slide 6

  12. SPRE Example. [Figure: flow graph with the blocks a = ... and b = ..., two occurrences of ..=a+b, and frequencies 1, 3, and 10.] #evals=15; #occurrences=2. Slide 7

  13. SPRE Example, optimized for time. [Figure: the example flow graph before and after the transformation. Before: ..=a+b occurs twice (#evals=15; #occurrences=2). After: h = a+b is inserted at three places and both uses become ...= h.] Result: #evals=5; #occurrences=3. Slide 8

  14. SPRE Example, optimized for space. [Figure: the example flow graph before and after the transformation. Before: ..=a+b occurs twice (#evals=15; #occurrences=2). After: h = a+b is inserted once, one use becomes ...= h, and the other remains ..=a+b.] Result: #evals=8; #occurrences=2. Slide 9
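To see how a linear combination of time and space chooses between these outcomes, here is a small sketch that scores the three variants using only the #evals and #occurrences figures quoted on Slides 7 to 9. The weight names `w_time` and `w_space` and the helper functions are assumptions made for illustration; the approach in the talk searches all placements with a flow network rather than comparing three fixed candidates.

```python
# (#evals, #occurrences) as reported on the slides
candidates = {
    "original": (15, 2),
    "time-optimized": (5, 3),
    "space-optimized": (8, 2),
}

def combined_cost(evals, occurrences, w_time, w_space):
    """Linear combination of expected evaluations (time) and static occurrences (space)."""
    return w_time * evals + w_space * occurrences

def best_variant(w_time, w_space):
    """Name of the candidate with the lowest combined cost for the given weights."""
    return min(candidates,
               key=lambda name: combined_cost(*candidates[name], w_time, w_space))

print(best_variant(w_time=1, w_space=0))  # only time matters
print(best_variant(w_time=1, w_space=4))  # each extra occurrence costs as much as 4 evaluations
```

With a pure time objective the time-optimized variant wins (cost 5); with the second weighting the costs are 23, 17, and 16, so the space-optimized variant wins.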

  15. Overview of Algorithm
     • The problem is decomposed into local transformations on each block in the flow graph. (For convenience only, we consider each simple statement to be a block.)
     • For an expression a+b, we have three kinds of block: a NULL block, which neither computes a+b nor assigns to a or b; a COMP block, which computes a+b; and a MOD block, which assigns to a or b (and does not compute a+b).
     • Each local transformation incurs a cost (or a benefit); the cost is a linear combination of the code size and the expected execution frequency of the node.
     • We map the costs and the constraints into a network flow problem. We use a maximum flow algorithm to find the combination of local transformations that achieves the lowest total cost (or greatest total benefit).
     Slide 10
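The three block kinds can be told apart mechanically. The sketch below assumes a hypothetical representation in which each single-statement block records the variables it assigns and the operand pairs it adds; the function name and argument layout are illustrative only.

```python
def classify_block(defined_vars, computed_exprs, expr=("a", "b")):
    """Classify a single-statement block with respect to one expression.

    defined_vars: set of variables the block assigns to.
    computed_exprs: set of operand pairs the block computes, e.g. {("a", "b")}.
    """
    if expr in computed_exprs:
        # Assumption: a statement such as a = a+b is treated as COMP here;
        # the slides do not discuss this corner case.
        return "COMP"
    if set(expr) & set(defined_vars):
        return "MOD"
    return "NULL"

kinds = [
    classify_block({"x"}, {("a", "b")}),   # x = a+b
    classify_block({"a"}, set()),          # a = ...
    classify_block({"c"}, {("c", "d")}),   # c = c+d, unrelated to a+b
]
print(kinds)
```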

  16. Local Transformations. For an expression a+b, the transformation of a block is driven by
     • availability/unavailability of a+b on entry,
     • whether we want a+b to be available on exit.
     The three kinds of block are diagrammed like this: [Figure: a NULL block, a MOD block (a = ...), and a COMP block (= a+b).]
     Slide 11

  17. Local Transformations: NULL block
     Transformations          | a+b available on exit | a+b unavailable on exit
     a+b available on entry   | Cost = 0              | Cost = 0
     a+b unavailable on entry | h = a+b; Cost = ???   | Cost = 0
     Static cost of h=a+b is 1. Dynamic cost of h=a+b is the execution frequency of the node.
     Slide 12
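One way to read the NULL-block table together with the cost definition on Slide 10 is sketched below. It is an interpretation, not a value given on the slide: the "???" entry is filled with a weighted sum of the static cost (1) and the dynamic cost (the node's execution frequency), and the weights `w_time` and `w_space` are hypothetical parameters.

```python
def null_block_cost(avail_in, avail_out, freq, w_time=1.0, w_space=0.0):
    """Cost of the local transformation chosen for a NULL block.

    Only the combination "unavailable on entry, available on exit" requires
    inserting h = a+b; every other combination costs nothing.
    """
    if not avail_in and avail_out:
        return w_space * 1 + w_time * freq   # static cost 1, dynamic cost = frequency
    return 0.0

speed_cost = null_block_cost(False, True, freq=13)                       # pure speed objective
size_cost = null_block_cost(False, True, freq=13, w_time=0.0, w_space=1.0)  # pure size objective
free_cost = null_block_cost(True, True, freq=13)                         # already available
print(speed_cost, size_cost, free_cost)
```

With the default weights the cost equals the block's execution frequency, which is the capacity used for NULL blocks in the speed-optimization network later in the deck.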

  18. Local Transformations: COMP block (original statement: = a+b)
     Transformations          | a+b available on exit      | a+b unavailable on exit
     a+b available on entry   | = h; Cost = 0              | = h; Cost = 0
     a+b unavailable on entry | h = a+b; = h; Cost = ???   | = a+b; Cost = ???
     Slide 13

  19. Local Transformations: MOD block (original statement: a = ...)
     Transformations          | a+b available on exit      | a+b unavailable on exit
     a+b available on entry   | a = ; h = a+b; Cost = ???  | a = ; Cost = 0
     a+b unavailable on entry | a = ; h = a+b; Cost = ???  | a = ; Cost = 0
     Slide 14

  20. Searching for an Optimal Solution ... [Figure: the SPRE example flow graph with the blocks a = ... and b = ..., two occurrences of ..=a+b, and frequencies 1, 3, and 10.] Slide 15

  21. Searching for an Optimal Solution ...
     • Labels i1 ... i7, o1 ... o7 denote all the places where expression a+b might be made available or left unavailable.
     • A = set of labels where a+b is available in the optimal solution; ~A is the complement set.
     • There are constraints on the partitioning of labels into the A and ~A sets, which we express in a flow network.
     [Figure: the example flow graph with an input label i and an output label o attached to each of its seven blocks.]
     Slide 16

  22. Searching for an Optimal Solution ... The labels are the nodes of the network. We add two more nodes s and f (for start and finish). [Figure: network nodes s, i1 ... i7, o1 ... o7, and f.] Slide 17

  23. Searching for an Optimal Solution ... We add edges with infinite capacity wherever two labels must have the same assignment (both in A or both in ~A), as when they are connected by an edge in the original flow graph. [Figure: the network with the infinite-capacity (∞) edges added.] Slide 17

  24. Searching for an Optimal Solution ... For each NULL block, we create an edge from its input label to its output label with capacity equal to that block's execution frequency. [Figure, labeled "Speed optimization": the network with NULL-block edges of capacity 5, 3, and 13 added.] Slide 17

  25. Searching for an Optimal Solution ... For each COMP block, we add an edge from its input label to the f label with capacity equal to that block's execution frequency. [Figure, labeled "Speed optimization": the network with COMP-block edges of capacity 5 and 10 added.] Slide 17

  26. Searching for an Optimal Solution ... For each MOD block, we add an edge from s to its output label with capacity equal to that block's execution frequency. We also add an edge from s to the input label of the entry point with infinite capacity. [Figure, labeled "Speed optimization": the network with two MOD-block edges of capacity 1 and the ∞ edge from s to i1 added.] Slide 17

  27. Searching for an Optimal Solution ... The min cut, giving a maximum flow of 5. [Figure, labeled "Speed optimization": the complete network with the minimum cut marked.] Slide 17
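The construction on Slides 17 can be sketched end to end as follows. This is an illustration under stated assumptions, not the authors' code: the three-block flow graph (B1, B2, B3) and its frequencies are hypothetical and are not the deck's seven-block example; "same assignment" constraints are modelled as a pair of opposite edges with a very large capacity; and the labels left on the s side of the cut are read as ~A. The maximum flow is computed with a plain Edmonds-Karp search.

```python
from collections import defaultdict, deque

INF = 10**9  # stands in for the infinite capacities on the slides

def build_network(blocks, cfg_edges, entry):
    """blocks: name -> (kind, frequency). Returns a list of (u, v, capacity) edges."""
    edges = [("s", "i_" + entry, INF)]              # s -> input label of the entry point
    for name, (kind, freq) in blocks.items():
        if kind == "NULL":
            edges.append(("i_" + name, "o_" + name, freq))
        elif kind == "COMP":
            edges.append(("i_" + name, "f", freq))
        elif kind == "MOD":
            edges.append(("s", "o_" + name, freq))
    for pred, succ in cfg_edges:                    # labels joined by a flow-graph edge
        edges.append(("o_" + pred, "i_" + succ, INF))
        edges.append(("i_" + succ, "o_" + pred, INF))
    return edges

def min_cut(edges, source, sink):
    """Edmonds-Karp. Returns (max flow value, nodes on the source side of a min cut)."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, cap in edges:
        residual[u][v] += cap
        residual[v][u] += 0                         # ensure the reverse entry exists
    flow = 0
    while True:
        parent = {source: None}                     # breadth-first search in the residual graph
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow, set(parent)                # no augmenting path is left
        bottleneck, v = INF, sink
        while parent[v] is not None:                # smallest residual capacity on the path
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:                # push the flow along the path
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck

# Hypothetical loop: B1 (entry) -> B2; B2 loops on itself and occasionally visits B3.
blocks = {"B1": ("NULL", 1), "B2": ("COMP", 100), "B3": ("MOD", 1)}
cfg_edges = [("B1", "B2"), ("B2", "B2"), ("B2", "B3"), ("B3", "B2")]
flow, source_side = min_cut(build_network(blocks, cfg_edges, "B1"), "s", "f")
print(flow, sorted(source_side))
```

In this sketch the minimum cut has value 2: it severs the NULL edge of B1 and the MOD edge of B3, which corresponds to inserting h = a+b at the end of B1 and after the assignment in B3, instead of evaluating a+b 100 times in B2.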
