
Optimizing Stream Programs Using Linear State Space Analysis



  1. Optimizing Stream Programs Using Linear State Space Analysis
     Sitij Agrawal¹˒², William Thies¹, and Saman Amarasinghe¹
     ¹ Massachusetts Institute of Technology
     ² Sandbridge Technologies
     CASES 2005
     http://cag.lcs.mit.edu/streamit

  2. Streaming Application Domain
     • Based on a stream of data
       – Graphics, multimedia, software radio
       – Radar tracking, microphone arrays, HDTV editing, cell phone base stations
     • Properties of stream programs
       – Regular and repeating computation
       – Parallel, independent actors with explicit communication
       – Data items have short lifetimes
     [Figure: stream graph. AtoD → Decode → duplicate split → three LPF/HPF branches (LPF 1/HPF 1, LPF 2/HPF 2, LPF 3/HPF 3) → roundrobin join → Encode → Transmit]

  3. Conventional DSP Design Flow
     Spec. (data-flow diagram)
       → Signal processing expert, in Matlab: design the datapaths (no control flow), DSP optimizations
       → Coefficient tables
       → Software engineer, in C and assembly: rewrite the program, architecture-specific optimizations (performance, power, code size)
       → C/Assembly code

  4. Ideal DSP Design Flow
     Application-level design
       → Application programmer: high-level program (dataflow + control)
       → Compiler: DSP optimizations, architecture-specific optimizations
       → C/Assembly code
     Challenge: maintaining performance

  5. The StreamIt Language
     • Goals:
       – Provide a high-level stream programming model
       – Invent new compiler technology for streams
     • Contributions:
       – Language design [CC ’02, PPoPP ’05]
       – Compiling to tiled architectures [ASPLOS ’02, ISCA ’04, Graphics Hardware ’05]
       – Cache-aware scheduling [LCTES ’03, LCTES ’05]
       – Domain-specific optimizations [PLDI ’03, CASES ’05]

  6. Programming in StreamIt

         void->void pipeline FMRadio(int N, float lo, float hi) {
           add AtoD();
           add FMDemod();
           add splitjoin {
             split duplicate;
             for (int i=0; i<N; i++) {
               add pipeline {
                 add LowPassFilter(lo + i*(hi - lo)/N);
                 add HighPassFilter(lo + i*(hi - lo)/N);
               }
             }
             join roundrobin();
           }
           add Adder();
           add Speaker();
         }

     [Figure: stream graph. AtoD → FMDemod → Duplicate → three branches (LPF 1 → HPF 1, LPF 2 → HPF 2, LPF 3 → HPF 3) → RoundRobin → Adder → Speaker]

  7. Example StreamIt Filter

         float->float filter LowPassButterWorth (float sampleRate, float cutoff) {
           float coeff;
           float x;
           init {
             coeff = calcCoeff(sampleRate, cutoff);
           }
           work peek 2 push 1 pop 1 {
             x = peek(0) + peek(1) + coeff * x;
             push(x);
             pop();
           }
         }
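The peek/push/pop rates in the work function can be hard to picture from the declaration alone. The following is a minimal Python sketch of one plausible reading of the slide's filter: each firing peeks at two items, pushes one, and pops one. The function name, the list-based input channel, the zero initial state, and passing `coeff` directly (instead of calling the slide's `calcCoeff`, whose definition is not shown) are all assumptions made for illustration.

```python
def low_pass_butterworth(samples, coeff):
    """Simulate the slide's work function: peek 2, push 1, pop 1 per firing."""
    tape = list(samples)   # input channel; index 0 is the next item to pop
    out = []               # output channel
    x = 0.0                # filter state, assumed to start at zero
    while len(tape) >= 2:  # a firing needs two items available to peek
        x = tape[0] + tape[1] + coeff * x   # peek(0) + peek(1) + coeff * x
        out.append(x)                       # push(x)
        tape.pop(0)                         # pop()
    return out


print(low_pass_butterworth([1.0, 2.0, 3.0, 4.0], coeff=0.5))
# prints [3.0, 6.5, 10.25]
```

Four inputs yield three outputs because the last item can never be paired with a successor; a real StreamIt schedule would simply wait for more data.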

  8. Focus: Linear State Space Filters
     • Properties:
       1. Outputs are a linear function of inputs and states
       2. New states are a linear function of inputs and states
     • Most common target of DSP optimizations
       – FIR / IIR filters
       – Linear difference equations
       – Upsamplers / downsamplers
       – DCTs

  9. Representing State Space Filters
     • A state space filter is a tuple 〈A, B, C, D〉
       – inputs $u$, states $x$, outputs $y$
       – state update: $x' = Ax + Bu$
       – output: $y = Cx + Du$

  10. Representing State Space Filters
      • A state space filter is a tuple 〈A, B, C, D〉 with inputs $u$, states $x$, and outputs $y$: $x' = Ax + Bu$, $y = Cx + Du$
      • Example filter:

          float->float filter IIR {
            float x1, x2;
            work push 1 pop 1 {
              float u = pop();
              push(2*(x1+x2+u));
              x1 = 0.9*x1 + 0.3*u;
              x2 = 0.9*x2 + 0.2*u;
            }
          }

  11. Representing State Space Filters
      • For the IIR filter on slide 10, with $x' = Ax + Bu$ and $y = Cx + Du$:
        $A = \begin{bmatrix} 0.9 & 0 \\ 0 & 0.9 \end{bmatrix}$, $B = \begin{bmatrix} 0.3 \\ 0.2 \end{bmatrix}$, $C = \begin{bmatrix} 2 & 2 \end{bmatrix}$, $D = \begin{bmatrix} 2 \end{bmatrix}$


  16. Representing State Space Filters
      • A state space filter is a tuple 〈A, B, C, D〉
      • Linear dataflow analysis extracts the tuple from the filter code (same IIR filter and matrices as on slide 11)
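To make the correspondence between the IIR code and its matrices concrete, here is a small Python sketch that runs both forms on the same input and checks that they agree. The matrices are read off the slide's filter code; the function names, the zero initial state, and the sample input are assumptions for illustration only.

```python
import math

# <A, B, C, D> for the slide's IIR filter (single input, single output).
A = [[0.9, 0.0],
     [0.0, 0.9]]
B = [0.3, 0.2]
C = [2.0, 2.0]
D = 2.0


def run_state_space(inputs):
    """Iterate y = Cx + Du, then x' = Ax + Bu, starting from x = 0."""
    x = [0.0, 0.0]
    ys = []
    for u in inputs:
        ys.append(sum(c * xi for c, xi in zip(C, x)) + D * u)
        x = [sum(a * xi for a, xi in zip(row, x)) + b * u
             for row, b in zip(A, B)]
    return ys


def run_iir_code(inputs):
    """Direct transcription of the slide's work function."""
    x1 = x2 = 0.0
    ys = []
    for u in inputs:
        ys.append(2 * (x1 + x2 + u))
        x1 = 0.9 * x1 + 0.3 * u
        x2 = 0.9 * x2 + 0.2 * u
    return ys


samples = [1.0, 0.5, -2.0, 3.0]
same = all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
           for a, b in zip(run_state_space(samples), run_iir_code(samples)))
print(same)
```

Note that the output is computed from the state before it is updated, matching the order of `push` and the state assignments in the filter.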

  17. State Space Optimizations 17 1. State removal 2. Reducing the number of parameters 3. Combining adjacent filters

  18. Change-of-Basis Transformation
      Start from $x' = Ax + Bu$, $y = Cx + Du$.

  19. Change-of-Basis Transformation
      Let $T$ be an invertible matrix and multiply the state update by it:
      $Tx' = TAx + TBu$, $y = Cx + Du$

  20. Change-of-Basis Transformation
      Insert $T^{-1}T$ (the identity) in front of $x$:
      $Tx' = TA(T^{-1}T)x + TBu$, $y = C(T^{-1}T)x + Du$

  21. Change-of-Basis Transformation
      Regroup:
      $Tx' = TAT^{-1}(Tx) + TBu$, $y = CT^{-1}(Tx) + Du$

  22. Change-of-Basis Transformation
      Define $z = Tx$.

  23. Change-of-Basis Transformation
      Substitute $z$:
      $z' = TAT^{-1}z + TBu$, $y = CT^{-1}z + Du$

  24. Change-of-Basis Transformation
      With $A' = TAT^{-1}$, $B' = TB$, $C' = CT^{-1}$, $D' = D$:
      $z' = A'z + B'u$, $y = C'z + D'u$

  25. Change-of-Basis Transformation
      Can map original states $x$ to transformed states $z = Tx$ without changing I/O behavior
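The formulas $A' = TAT^{-1}$, $B' = TB$, $C' = CT^{-1}$, $D' = D$ can be applied mechanically. The Python sketch below does so for two-state filters, using the IIR matrices and the $T$ from the state removal example; the helper names and the restriction to 2x2 matrices are assumptions chosen to keep the example short.

```python
def matmul(X, Y):
    """Multiply two matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def inverse_2x2(T):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = T
    det = a * d - b * c
    if det == 0:
        raise ValueError("T must be invertible")
    return [[d / det, -b / det],
            [-c / det, a / det]]


def change_basis(A, B, C, D, T):
    """Return (A', B', C', D') for the transformed state z = Tx."""
    T_inv = inverse_2x2(T)
    return (matmul(matmul(T, A), T_inv),  # A' = T A T^-1
            matmul(T, B),                 # B' = T B
            matmul(C, T_inv),             # C' = C T^-1
            D)                            # D' = D


A = [[0.9, 0.0], [0.0, 0.9]]
B = [[0.3], [0.2]]
C = [[2.0, 2.0]]
D = [[2.0]]
T = [[1.0, 0.0], [1.0, 1.0]]

A2, B2, C2, D2 = change_basis(A, B, C, D, T)
print(A2, B2, C2, D2)
```

Up to floating-point rounding, this gives $B' = [0.3, 0.5]^T$ and $C' = [0, 2]$ while leaving $A$ unchanged, which are the transformed values that appear in the state removal example.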

  26. 1) State Removal
      • Can remove states which are:
        a. Unreachable – do not depend on input
        b. Unobservable – do not affect output
      • To expose unreachable states, reduce $[A \mid B]$ to a kind of row-echelon form
        – For unobservable states, reduce $[A^T \mid C^T]$
      • Automatically finds minimal number of states
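One way to see that a filter carries removable states is the textbook rank test on the observability and reachability matrices. This is not the row-echelon procedure the slide refers to, only a standard stand-in that detects the same condition. The sketch below is pure Python; the helper names and the tolerance `1e-9` are assumptions.

```python
def matmul(X, Y):
    """Multiply two matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def transpose(M):
    return [list(col) for col in zip(*M)]


def rank(M, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    r = 0
    for col in range(len(M[0])):
        pivot = max(range(r, len(M)), key=lambda i: abs(M[i][col]),
                    default=None)
        if pivot is None or abs(M[pivot][col]) < tol:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def observable_rank(A, C):
    """Rank of the stacked rows C, CA, ..., CA^(n-1)."""
    rows, block = [], C
    for _ in range(len(A)):
        rows.extend(block)
        block = matmul(block, A)
    return rank(rows)


def reachable_rank(A, B):
    """Reachability is observability of the transposed system."""
    return observable_rank(transpose(A), transpose(B))


A = [[0.9, 0.0], [0.0, 0.9]]
B = [[0.3], [0.2]]
C = [[2.0, 2.0]]
print(observable_rank(A, C), reachable_rank(A, B))  # prints 1 1
```

For the two-state IIR filter both ranks are 1, so at most one state is needed, which agrees with the single-state result of the state removal example.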

  27. State Removal Example
      Change of basis with $T = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$:
      • Original: $x' = \begin{bmatrix} 0.9 & 0 \\ 0 & 0.9 \end{bmatrix} x + \begin{bmatrix} 0.3 \\ 0.2 \end{bmatrix} u$, $y = \begin{bmatrix} 2 & 2 \end{bmatrix} x + 2u$
      • Transformed: $x' = \begin{bmatrix} 0.9 & 0 \\ 0 & 0.9 \end{bmatrix} x + \begin{bmatrix} 0.3 \\ 0.5 \end{bmatrix} u$, $y = \begin{bmatrix} 0 & 2 \end{bmatrix} x + 2u$

          float->float filter IIR {
            float x1, x2;
            work push 1 pop 1 {
              float u = pop();
              push(2*(x1+x2+u));
              x1 = 0.9*x1 + 0.3*u;
              x2 = 0.9*x2 + 0.2*u;
            }
          }

  28. State Removal Example
      In the transformed system, x1 is unobservable.

  29. State Removal Example
      After removing the unobservable state: $x' = 0.9x + 0.5u$, $y = 2x + 2u$

          float->float filter IIR {
            float x;
            work push 1 pop 1 {
              float u = pop();
              push(2*(x+u));
              x = 0.9*x + 0.5*u;
            }
          }

  30. State Removal Example
      • Original filter (two states, code as on slide 29): 9 FLOPs and 12 loads/stores per output
      • After state removal (one state, code as on slide 29): 5 FLOPs and 8 loads/stores per output
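As a sanity check on the example, the Python sketch below runs the two-state and one-state versions of the IIR filter side by side and confirms that they produce the same outputs. Both loops are transcribed from the slide's StreamIt code; the function names, the zero initial states, and the test input are assumptions.

```python
import math


def iir_two_states(inputs):
    """Original filter: states x1 and x2."""
    x1 = x2 = 0.0
    ys = []
    for u in inputs:
        ys.append(2 * (x1 + x2 + u))
        x1 = 0.9 * x1 + 0.3 * u
        x2 = 0.9 * x2 + 0.2 * u
    return ys


def iir_one_state(inputs):
    """Filter after state removal: single state x (equal to x1 + x2)."""
    x = 0.0
    ys = []
    for u in inputs:
        ys.append(2 * (x + u))
        x = 0.9 * x + 0.5 * u
    return ys


samples = [1.0, 0.5, -2.0, 3.0, 0.0, 4.0]
equivalent = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
    for a, b in zip(iir_two_states(samples), iir_one_state(samples))
)
print(equivalent)  # prints True
```

The comparison uses a tolerance because the two versions perform their floating-point additions in a different order.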

  31. 2) Parameter Reduction
      • Goal: Convert matrix entries (parameters) to 0 or 1
      • Allows static evaluation:
        – 1*x → x: eliminates 1 multiply
        – 0*x + y → y: eliminates 1 multiply, 1 add
      • Algorithm (Ackerman & Bucy, 1971)
        – Also reduces matrices $[A \mid B]$ and $[A^T \mid C^T]$
        – Attains a canonical form with few parameters

  32. Parameter Reduction Example
      Change of basis with $T = 2$:
      • Before: $x' = 0.9x + 0.5u$, $y = 2x + 2u$ (6 FLOPs per output)
      • After: $x' = 0.9x + 1u$, $y = 1x + 2u$ (4 FLOPs per output)
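The effect of the scaling $T = 2$ can be reproduced in a few lines. The Python sketch below applies the change of basis to a one-state filter and counts operations under an assumed convention: one multiply per coefficient that is neither 0 nor 1, and one add per extra nonzero term in each equation. That convention is an assumption, chosen because it reproduces the 6 and 4 FLOPs quoted on the slide; the function names are hypothetical.

```python
def scale_state(a, b, c, d, t):
    """Change of basis z = t*x for a one-state filter <a, b, c, d>."""
    if t == 0:
        raise ValueError("t must be nonzero (invertible)")
    # a' = t*a/t = a, b' = t*b, c' = c/t, d' = d
    return a, t * b, c / t, d


def count_flops(a, b, c, d):
    """Count multiplies and adds for x' = a*x + b*u and y = c*x + d*u."""
    flops = 0
    for row in ((a, b), (c, d)):
        nonzero = [v for v in row if v != 0]
        flops += sum(1 for v in nonzero if v != 1)  # multiplies
        flops += max(len(nonzero) - 1, 0)           # adds
    return flops


before = (0.9, 0.5, 2.0, 2.0)
after = scale_state(*before, t=2.0)
print(after)                                   # prints (0.9, 1.0, 1.0, 2.0)
print(count_flops(*before), count_flops(*after))  # prints 6 4
```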

  33. 3) Combining Adjacent Filters
      • Filter 1: $y = D_1 u$
      • Filter 2: $z = D_2 y$
      • Combined filter: $z = D_2 D_1 u = Eu$, where $E = D_2 D_1$

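  34. 3) Combining Adjacent Filters
      Filter 1 (input $u$, output $y$) followed by Filter 2 (input $y$, output $z$) gives the combined filter:
      $x' = \begin{bmatrix} A_1 & 0 \\ B_2 C_1 & A_2 \end{bmatrix} x + \begin{bmatrix} B_1 \\ B_2 D_1 \end{bmatrix} u$
      $z = \begin{bmatrix} D_2 C_1 & C_2 \end{bmatrix} x + D_2 D_1 u$
      Also in paper:
        – combination of parallel streams
        – combination of feedback loops
        – expansion of mis-matching filters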
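The block formula for the combined filter can be checked numerically. The Python sketch below builds the combined 〈A, B, C, D〉 from two filters and compares it against running the filters one after the other. It assumes the simple case in which Filter 1 produces exactly as many items per firing as Filter 2 consumes (the mismatched case is handled in the paper and is not attempted here); all helper names and the sample filters are hypothetical.

```python
import math


def matmul(X, Y):
    """Multiply two matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def madd(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]


def combine_pipeline(f1, f2):
    """Combined <A, B, C, D> for f1 followed by f2 (matching rates assumed)."""
    A1, B1, C1, D1 = f1
    A2, B2, C2, D2 = f2
    n2 = len(A2)
    B2C1 = matmul(B2, C1)
    A = [row + [0.0] * n2 for row in A1] + \
        [B2C1[i] + A2[i] for i in range(n2)]          # [[A1, 0], [B2 C1, A2]]
    B = B1 + matmul(B2, D1)                            # [B1; B2 D1]
    C = [l + r for l, r in zip(matmul(D2, C1), C2)]    # [D2 C1, C2]
    D = matmul(D2, D1)                                 # D2 D1
    return A, B, C, D


def run(f, inputs):
    """Run a single-input, single-output filter from zero initial state."""
    A, B, C, D = f
    x = [[0.0] for _ in A]
    ys = []
    for u in inputs:
        uv = [[u]]
        ys.append(madd(matmul(C, x), matmul(D, uv))[0][0])
        x = madd(matmul(A, x), matmul(B, uv))
    return ys


f1 = ([[0.9, 0.0], [0.0, 0.9]], [[0.3], [0.2]], [[2.0, 2.0]], [[2.0]])
f2 = ([[0.9]], [[0.5]], [[2.0]], [[2.0]])
samples = [1.0, 0.5, -2.0, 3.0]

cascade = run(f2, run(f1, samples))
combined = run(combine_pipeline(f1, f2), samples)
print(all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
          for a, b in zip(cascade, combined)))
```

When both filters are stateless, the state blocks are empty and the combined filter reduces to $z = D_2 D_1 u$, the case shown on slide 33.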
