Optimizing for Space and Time Usage

Optimizing for Space and Time Usage with Speculative Partial Redundancy Elimination (SPRE)
Bernhard Scholz, University of Sydney, Australia
Nigel Horspool, University of Victoria, Canada
Jens Knoop, Vienna University of Technology, Austria
Slide 1


Overview
• SPRE is normally a speed optimization ...
• ... but SPRE may significantly increase program size.
• We present a new SPRE approach where the objective function is a linear combination of space and time. (The problem maps to the well-known maximum flow problem in networks.)
• An objective function which combines space and time can come close to the optimal results obtained when space and time are optimized separately.
Slide 2

Introduction
Partial Redundancy Elimination ... is a generalization of code motion (Morel and Renvoise, 1979).
[Figure: flow graph containing the statements ... = a+b, a = ..., and ... = a+b]
Slide 3

Introduction
Partial Redundancy Elimination ... is a generalization of code motion (Morel and Renvoise, 1979).
[Figure: the flow graph before and after PRE. After the transformation, one path contains t1 = a+b followed by ... = t1, the other contains a = ... followed by t1 = a+b, and the final ... = a+b is replaced by ... = t1]
Slide 4

Introduction
Partial Redundancy Elimination ... is a generalization of code motion (Morel and Renvoise, 1979).
[Figure: the same before and after flow graphs, annotated with execution frequencies of 1000 and 500]
Slide 4

Partial Redundancy Elimination
... is also very conservative. An expression e can be inserted at a point P only if every path starting from P uses e. This restriction guarantees:
• safety, and
• optimality (no more evaluations of the expression than before).
[Figure: flow graph with a later use ... = a+b. A candidate insertion t1 = a+b (marked "?") would turn the use into ... = t1. The execution frequencies shown in the figure are 1 and 10000.]
Slide 5

Speculative PRE
• An evaluation of e can be inserted anywhere as long as it is safe to do so, and
• we speculatively compute e in the hope that the value will be useful later.
Using probabilistic information (from execution profiles or elsewhere), the optimality goal becomes minimization of the expected number of evaluations.
Cai and Xue presented a SPRE algorithm in 2003 which finds time-optimal solutions.
Slide 6

SPRE Example
[Figure: example flow graph annotated with execution frequencies, containing a = ..., b = ..., and two occurrences of .. = a+b]
#evals=15; #occurrences=2
Slide 7

SPRE Example: optimized for time
[Figure: the original flow graph (#evals=15; #occurrences=2) next to the time-optimized version, in which h = a+b is inserted at three points and both occurrences of .. = a+b become ... = h]
#evals=5; #occurrences=3
Slide 8

SPRE Example: optimized for space
[Figure: the original flow graph (#evals=15; #occurrences=2) next to the space-optimized version, in which h = a+b is inserted once, one occurrence of .. = a+b becomes ... = h, and the other occurrence of .. = a+b remains]
#evals=8; #occurrences=2
Slide 9
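The three variants of the example can be compared under a single weighted objective. The sketch below is only an illustration: the (#evals, #occurrences) pairs come from the slides, while the weights w_time and w_space, the function names, and the chosen weight settings are assumptions made here.

```python
# (#evals, #occurrences) for each variant, as reported on the example slides.
VARIANTS = {
    "original": (15, 2),
    "time-optimized": (5, 3),
    "space-optimized": (8, 2),
}


def combined_cost(evals, occurrences, w_time, w_space):
    """Linear combination of dynamic cost (evals) and static cost (occurrences).

    w_time and w_space are hypothetical weights, not values from the slides.
    """
    return w_time * evals + w_space * occurrences


def best_variant(w_time, w_space):
    """Return the name of the variant with the lowest combined cost."""
    return min(
        VARIANTS,
        key=lambda name: combined_cost(*VARIANTS[name], w_time, w_space),
    )


if __name__ == "__main__":
    for weights in [(1, 0), (1, 1), (1, 4)]:
        print(weights, best_variant(*weights))
```

With time-only weights the time-optimized variant wins; once the assumed space weight is large enough relative to the time weight, the two-occurrence variant with 8 evaluations becomes the cheaper choice.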

Overview of Algorithm
• The problem is decomposed into local transformations on each block in the flow graph. (For convenience only, we consider each simple statement to be a block.)
• For an expression a+b, we have three kinds of block:
  • a NULL block, which neither computes a+b nor assigns to a or b;
  • a COMP block, which computes a+b;
  • a MOD block, which assigns to a or b (and does not compute a+b).
• Each local transformation incurs a cost (or a benefit); the cost is a linear combination of the code size and the expected execution frequency of the node.
• We map the costs and the constraints into a network flow problem. We use a maximum flow algorithm to find the combination of local transformations that achieves the lowest total cost (or greatest total benefit).
Slide 10
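The three block kinds can be sketched as a small classifier. The statement representation used here (an assigned target name plus a tuple of right-hand-side operand names) and the names Kind and classify are assumptions for illustration. The sketch treats a+b purely syntactically and ignores commutativity.

```python
from enum import Enum


class Kind(Enum):
    NULL = "NULL"
    COMP = "COMP"
    MOD = "MOD"


def classify(target, rhs, expr=("a", "b")):
    """Classify one simple statement with respect to the expression a+b.

    target: name of the assigned variable, or None if nothing is assigned.
    rhs:    tuple of operand names on the right-hand side (hypothetical model).
    A statement that both computes a+b and assigns to a or b is reported as
    COMP here, following the slide's wording that MOD blocks do not compute a+b.
    """
    if rhs == expr:
        return Kind.COMP
    if target in expr:
        return Kind.MOD
    return Kind.NULL


if __name__ == "__main__":
    print(classify("x", ("a", "b")))  # computes a+b
    print(classify("a", ("c",)))      # assigns to a
    print(classify("x", ("c", "d")))  # neither
```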

Local Transformations
For an expression a+b, the transformation of a block is driven by
• availability/unavailability of a+b on entry,
• whether we want a+b to be available on exit.
The three kinds of block are diagrammed like this:
[Figure: a NULL block (empty), a MOD block (a = ...), and a COMP block (= a+b)]
Slide 11

Local Transformations
NULL block transformations:
• a+b available on entry, available on exit: block unchanged. Cost = 0
• a+b available on entry, unavailable on exit: block unchanged. Cost = 0
• a+b unavailable on entry, available on exit: insert h = a+b. Cost = ???
• a+b unavailable on entry, unavailable on exit: block unchanged. Cost = 0
Static cost of h=a+b is 1. Dynamic cost of h=a+b is the execution frequency of the node.
Slide 12
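The cost of the one NULL-block case that inserts h = a+b can be sketched from the two facts stated on the slide (static cost 1, dynamic cost equal to the node's execution frequency) and the earlier statement that the cost is a linear combination of the two. The weights w_time and w_space and both function names are assumptions. The slide itself leaves that cell as "???", so this is one plausible reading rather than the authors' formula.

```python
def insertion_cost(freq, w_time=1.0, w_space=1.0):
    """Cost of inserting one copy of h = a+b into a node.

    Static cost is 1 (one more occurrence in the code); dynamic cost is the
    node's execution frequency. The weights are hypothetical parameters.
    """
    static_cost = 1
    dynamic_cost = freq
    return w_space * static_cost + w_time * dynamic_cost


def null_block_cost(avail_in, avail_out, freq, w_time=1.0, w_space=1.0):
    """Cost of a NULL block for a given entry/exit availability of a+b.

    Only the unavailable-on-entry, available-on-exit case inserts h = a+b;
    the other three cases leave the block unchanged and cost 0.
    """
    if not avail_in and avail_out:
        return insertion_cost(freq, w_time, w_space)
    return 0.0


if __name__ == "__main__":
    # Pure speed optimization: only the execution frequency counts.
    print(null_block_cost(False, True, freq=3, w_time=1.0, w_space=0.0))
    # Pure space optimization: every insertion costs one occurrence.
    print(null_block_cost(False, True, freq=3, w_time=0.0, w_space=1.0))
```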

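Local Transformations
COMP block transformations:
• a+b available on entry, available on exit: = a+b becomes = h. Cost = 0
• a+b available on entry, unavailable on exit: = a+b becomes = h. Cost = 0
• a+b unavailable on entry, available on exit: = a+b becomes h = a+b followed by = h. Cost = ???
• a+b unavailable on entry, unavailable on exit: = a+b is kept. Cost = ???
Slide 13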

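Local Transformations
MOD block transformations:
• a+b available on entry, available on exit: a = is followed by an inserted h = a+b. Cost = ???
• a+b available on entry, unavailable on exit: a = is kept unchanged. Cost = 0
• a+b unavailable on entry, available on exit: a = is followed by an inserted h = a+b. Cost = ???
• a+b unavailable on entry, unavailable on exit: a = is kept unchanged. Cost = 0
Slide 14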

Searching for an Optimal Solution ...
[Figure: the example flow graph annotated with execution frequencies, containing a = ..., b = ..., and two occurrences of .. = a+b]
Slide 15

Searching for an Optimal Solution ...
• Labels i1 ... i7, o1 ... o7 denote all the places where expression a+b might be made available or left unavailable.
• A = set of labels where a+b is available in the optimal solution; ~A is the complement set.
• There are constraints on the partitioning of labels into the A and ~A sets, which we express in a flow network.
[Figure: the example flow graph with an input label i and an output label o attached to each of its seven blocks]
Slide 16

Searching for an Optimal Solution ...
The labels are the nodes of the network. We add two more nodes s and f (for start and finish).
[Figure: network with nodes s, i1 ... i7, o1 ... o7, and f]
Slide 17

Searching for an Optimal Solution ...
We add edges with infinite capacity wherever two labels must have the same assignment (both in A or both in ~A), as when they are connected by an edge in the original flow graph.
[Figure: the network with the infinite-capacity (∞) edges added]
Slide 17

Searching for an Optimal Solution ...
For each NULL block, we create an edge from its input label to its output label with capacity equal to that block’s execution frequency.
[Figure: the network for speed optimization, with the new edges carrying capacities 5, 3, and 13]
Slide 17

Searching for an Optimal Solution ...
For each COMP block, we add an edge from its input label to the f label with capacity equal to that block’s execution frequency.
[Figure: the network for speed optimization, with the new edges carrying capacities 5 and 10]
Slide 17

Searching for an Optimal Solution ...
For each MOD block, we add an edge from s to its output label with capacity equal to that block’s execution frequency. We also add an edge from s to the input label of the entry point with infinite capacity.
[Figure: the network for speed optimization, with the new MOD edges carrying capacities 1 and 1, and an ∞ edge from s to i1]
Slide 17

Searching for an Optimal Solution ...
The min cut, giving a maximum flow of 5.
[Figure: the completed network for speed optimization with the minimum cut marked]
Slide 17
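The construction on the preceding slides can be sketched end to end. Here is a minimal version, assuming a hypothetical flow-graph topology: the figure's edges are not recoverable from the slide text, so EDGES below is an invented reading chosen only so that the edge capacities (5, 3, 13 for NULL blocks, 5 and 10 for COMP blocks, 1 and 1 for MOD blocks) match the ones listed on the slides. The block numbering, the function names, and the use of Edmonds-Karp as the maximum flow algorithm are likewise assumptions, as is reading the source side of the cut as ~A.

```python
from collections import defaultdict, deque

INF = float("inf")

# Hypothetical example data; only the frequencies appear on the slides.
KINDS = {1: "NULL", 2: "MOD", 3: "NULL", 4: "MOD", 5: "NULL", 6: "COMP", 7: "COMP"}
FREQ = {1: 5, 2: 1, 3: 3, 4: 1, 5: 13, 6: 5, 7: 10}
EDGES = [(1, 2), (1, 3), (1, 4), (2, 5), (3, 5), (4, 5), (5, 6), (5, 7)]


def build_network(kinds, freq, edges, entry):
    """Build the capacity map described on the slides (speed optimization)."""
    cap = defaultdict(lambda: defaultdict(int))
    for b, kind in kinds.items():
        if kind == "NULL":      # input label -> output label
            cap[f"i{b}"][f"o{b}"] = freq[b]
        elif kind == "COMP":    # input label -> f
            cap[f"i{b}"]["f"] = freq[b]
        elif kind == "MOD":     # s -> output label
            cap["s"][f"o{b}"] = freq[b]
    for u, v in edges:          # flow-graph edges tie labels together
        cap[f"o{u}"][f"i{v}"] = INF
    cap["s"][f"i{entry}"] = INF
    return cap


def max_flow(cap, s, t):
    """Edmonds-Karp. Returns (flow value, nodes on the source side of a min cut)."""
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow, set(parent)
        bottleneck, v = INF, t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck   # use up forward capacity
            cap[v][u] += bottleneck   # open residual capacity
            v = u
        flow += bottleneck


def solve_example():
    cap = build_network(KINDS, FREQ, EDGES, entry=1)
    original = {u: dict(vs) for u, vs in cap.items()}  # snapshot before augmenting
    flow, source_side = max_flow(cap, "s", "f")
    cut = sorted(
        (u, v)
        for u, vs in original.items()
        for v, c in vs.items()
        if c > 0 and u in source_side and v not in source_side
    )
    return flow, cut


if __name__ == "__main__":
    flow, cut = solve_example()
    print("max flow:", flow)
    print("cut edges:", cut)
```

Under this assumed topology the computed maximum flow is 5, the same value the slide reports, and the cut consists of the two MOD edges plus the NULL edge of capacity 3. Since the topology is a guess, the cut found here need not coincide with the one drawn in the deck.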
