An SSA-based Algorithm for Optimal Speculative Code Motion under an Execution Profile


  1. An SSA-based Algorithm for Optimal Speculative Code Motion under an Execution Profile
     Hucheng Zhou, Tsinghua University
     June 2011
     Joint work with: Wenguang Chen (Tsinghua University), Fred Chow (ICube Technology Corp.)

  2. Contents
     • Basic Concepts
     • PRE
     • SSA
     • SSAPRE
     • Speculative Code Motion
     • MC-SSAPRE Algorithm
     • Complexity
     • Experiments
     • Conclusion
     June 2011 MC-SSAPRE PLDI 2

  3. Partial Redundancy Elimination (PRE)
     • Eliminates expressions that are redundant on some (not necessarily all) paths
     • One of the most important and widely applied target-independent global optimizations
     • Subsumes global common subexpression elimination and loop-invariant code motion
     [Figure: a CFG before and after PRE. Before: B1 computes a+b, B2 does not, both join at B3, and B4 and B5 each compute a+b. After: B1 and B2 each hold t=a+b, and B4 and B5 use t.]
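The effect of the transformation on this slide can be checked by counting evaluations of a+b along each path. The following is a minimal sketch, assuming the diamond-shaped CFG read off the figure; the dictionaries `BEFORE` and `AFTER` and the helper `evaluations_per_path` are hypothetical names introduced only for illustration.

```python
# Number of times each block evaluates a+b (assumed from the slide's figure).
BEFORE = {"B1": 1, "B2": 0, "B3": 0, "B4": 1, "B5": 1}
AFTER = {"B1": 1, "B2": 1, "B3": 0, "B4": 0, "B5": 0}  # B4 and B5 reuse t

# Every entry-to-exit path through the diamond.
PATHS = [
    ("B1", "B3", "B4"),
    ("B1", "B3", "B5"),
    ("B2", "B3", "B4"),
    ("B2", "B3", "B5"),
]


def evaluations_per_path(program):
    """Map each path to the number of a+b evaluations executed along it."""
    return {"->".join(p): sum(program[b] for b in p) for p in PATHS}


before = evaluations_per_path(BEFORE)
after = evaluations_per_path(AFTER)
for path in before:
    print(path, before[path], "->", after[path])
```

Paths through B1 drop from two evaluations to one, and paths through B2 stay at one, so no path becomes more expensive. That is the safety property classical PRE preserves.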

  4. PRE Facts
     • Applied to each lexically identified expression independently
       – e.g. (a+b), (a-b), (a*c)
     • Formulated as a placement problem:
       – Step 1: Determine where to perform insertions, rendering more computations fully redundant
       – Step 2: Delete fully redundant computations
     • Main challenge is in Step 1

  5. The Most Popular PRE Algorithms
     • Lazy Code Motion (Knoop et al.)
       – Computationally and life-time optimal
       – Ordinary program representation
       – Bit-vector-based iterative data flow analyses
     • SSAPRE
       – Computationally and life-time optimal
       – SSA form of program representation
       – Sparse solution of data flow properties
       – Subsumes local common subexpression elimination
         • Insensitive to basic block boundaries

  6. Static Single Assignment (SSA)
     • Program representation with built-in use-def information
     • Use-def edges factored at join points in the CFG
     • Use-def implicitly represented via unique names
     • Each renamed variable has only one definition
     [Figure: three views of the same CFG, in which B1 and B2 join at B3 and B3 branches to B4 and B5. CFG: a is defined in B1 and B2 and used in B4 and B5. Use-def: each use is linked to both definitions. Factored use-def: B1 defines a1, B2 defines a2, B3 holds a3 = φ(a1, a2), and B4 and B5 use a3.]

  7. Factored Redundancy Graph (FRG)
     • Used in SSAPRE to represent redundancy relationships among occurrences of the same expression via edges
     • The redundancy edges are factored as in SSA
     • Can be viewed as SSA applied to expressions
       – Effectively puts the temporary t that stores the expression after PRE in SSA form
     [Figure: three views of the same CFG. CFG: a+b occurs in B1, B2, B4 and B5. Redundancy: the occurrences in B4 and B5 are linked to those in B1 and B2. Factored redundancy: B1 holds t1=a+b, B2 holds t2=a+b, B3 holds t3 = Φ(t1, t2), and B4 and B5 use t3.]

  8. Speculative Code Motion
     • Classical PRE only inserts at places where the expression is anticipated (down-safe)
       – Many redundant computations cannot be eliminated
     • Speculative code motion ignores the safety constraint
       – Can remove more redundancies
       – Not applicable to computations that may trigger runtime exceptions
     [Figure: a CFG in which B1 computes a+b, B2 does not, both join at B3, and only B4 recomputes a+b. With speculation, B1 and B2 each hold t=a+b and B4 uses t. Figure labels: Classical PRE, Speculation, Unsafe Path.]

  9. While Loop Example
     • Invariant code motion involves speculation
     [Figure: a while loop under classical PRE and under speculation.]

  10. While Loop Restructuring
     • The common solution
     • Speculation no longer necessary
     • But code size increases
     [Figure: a while loop that is restructured and then optimized by PRE.]

  11. Speculation not always beneficial
     • Useless computations are introduced on some paths
     • Beneficial only if the removed computations execute more frequently than the inserted computations
     • Requires execution frequency information
     [Figure: a CFG with block frequencies B1=50, B2=100, B3=150, B4=50, B5=100. Before: a+b is computed in B1 and B4. After speculation: B1 and B2 each hold t=a+b and B4 uses t. Non-beneficial because freq(B2) > freq(B4).]
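The frequency argument on this slide reduces to simple arithmetic. Below is a small sketch using the block frequencies shown in the figure; the placement of a+b in B1 and B4 is my reading of the flattened figure, and `dynamic_count` and `speculation_is_beneficial` are hypothetical helper names.

```python
# Block execution frequencies from the slide's figure.
FREQ = {"B1": 50, "B2": 100, "B3": 150, "B4": 50, "B5": 100}


def dynamic_count(computing_blocks):
    """Total number of times a+b is evaluated, given the blocks that compute it."""
    return sum(FREQ[b] for b in computing_blocks)


def speculation_is_beneficial(inserted, removed):
    """Speculation pays off only if it removes more evaluations than it inserts."""
    return dynamic_count(removed) > dynamic_count(inserted)


before = dynamic_count(["B1", "B4"])  # original program: 50 + 50
after = dynamic_count(["B1", "B2"])   # after speculation: 50 + 100

print(before, after)
print(speculation_is_beneficial(inserted=["B2"], removed=["B4"]))
```

With these numbers the speculative version executes the expression 150 times instead of 100, which matches the slide's remark that freq(B2) > freq(B4) makes the insertion a loss.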

  12. Problem Statement
     How to minimize the dynamic execution count of an expression under an execution profile
     • A more aggressive form of PRE
       – Classical PRE is beneficial regardless of execution frequencies
     • Cai and Xue (2003, 2006) were the first to apply min-cut to solve this problem optimally
       – Algorithm called MC-PRE
       – Uses bit-vector-based data flow analyses
       – Min-cut applied to the CFG
     • No SSA-based technique exists yet

  13. Topic of this Paper
     MC-SSAPRE: a new algorithm that yields optimal code placement under the SSAPRE framework
     Overview:
     • Form an essential flow graph (EFG) out of the FRG
     • Map the BB execution frequencies to the EFG nodes
     • Apply min-cut to the EFG

  14. Algorithm Steps
     MC-SSAPRE Steps
     • Construct FRG
       – Φ insertion
       – Rename
     • Form EFG and perform min-cut
       – Data flow
       – Graph reduction
       – Single source
       – Single sink
       – Minimum cut
       – WillBeAvail
     • Book-keeping
       – Finalize
       – CodeMotion
     SSAPRE Steps
     • Construct FRG
       – Φ insertion
       – Rename
     • Data Flow Attributes
       – DownSafety
       – WillBeAvail
     • Book-keeping
       – Finalize
       – CodeMotion

  15. Running example in SSA Form
     [Figure: the input program as a CFG with block frequencies. a1+b1 occurs in B1 (50), B4 (50), B6 (10), B8 (60) and B9 (5). The other blocks are B2 (20), B3 (70), B5 (10), B7 (exit, 50), B10 (exit, 5) and two further exit blocks with frequencies 60 and 5.]

  16. FRG for Running Example
     Introduce h so the FRG can be viewed from an SSA perspective
     [Figure: the input program and the FRG obtained by Φ insertion and Rename. FRG nodes: h1 in B1 (50); h2 = Φ(h1, ⊥) in B3 (70); h3 in B4 (50); h2 in B6 (10); h4 = Φ(h3, h2) and the occurrence h4 in B8 (60); h2 in B9 (5).]

  17. Roles of Factored Redundancy Graph
     • Insertions need to be considered only at Φ's
       – associated with the Φ operands
     • Medium to compute data flow properties to disqualify more Φ's from being insertion candidates
     • SSA form for t (temporary to store the computed value) will be carved out of the FRG
     • Three kinds of nodes:
       1. Real occurrences in original program
          • Def: always non-redundant
          • Use: partially redundant (including fully redundant)
       2. Φ (def)
       3. Φ operand (use), which can be ⊥

  18. Data Flow Properties for MC-SSAPRE
     • Fully available
       – Insertions at these Φ's are always unnecessary because the computed values are available
     • Partially anticipated
       – Insertions should only be at these Φ's
       – Otherwise, the inserted computation would have no use

  19. Graph Reduction
     Use computed data flow properties to further narrow down the Φ candidates for insertion
     Delete:
     • Φ's that are fully available
     • Φ's that are not partially anticipated
     • Use nodes (real occurrences or Φ operands) that are fully redundant
     • Edges from/to above nodes

  20. Graph Reduction for Running Example
     [Figure: the FRG before and after graph reduction. Nodes shown after reduction: h2 = Φ(h1, ⊥) in B3 (70), the occurrence h2 in B6 (10), h4 = Φ(h3, h2) in B8 (60) and the occurrence h4.]
     rg_excluded: fully redundant occurrences determined during Renaming

  21. Form Essential Flow Graph (EFG)
     • Introduce a virtual source node
       – Add an edge from it to each ⊥ Φ operand
     • Introduce a virtual sink node
       – Add an edge from each real occurrence to it
     • Result is a complete flow network
     [Figure: the reduced graph of the running example with the new source and sink edges added.]

  22. Edges in EFG
     • Edges to the sink are never insertion candidates
       – Mark with ∞ frequency
     • Other edges are:
       – Type 1 edge: edges ending at a Φ operand
       – Type 2 edge: edges from a Φ to a real occurrence
     [Figure: the EFG of the running example with Type 1 and Type 2 edges labeled and both sink edges marked ∞.]

  23. Mapping Frequencies to EFG Edges
     • Model insertion at a Type 1 edge by inserting at the exit of the predecessor BB corresponding to the Φ operand
       – Annotate the Type 1 edge with the node frequency of that predecessor BB
     • Insertion at a Type 2 edge means performing the computation in place
       – Annotate the Type 2 edge with the frequency of the real occurrence

  24. EFG annotated with Frequencies
     [Figure: the original program next to the final EFG. EFG edge weights: 20 on the edge from the source, 10 and 10 on the two edges below h2 = Φ(h1, ⊥), 60 on the edge below h4 = Φ(h3, h2), and ∞ on both edges to the sink.]
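The frequency mapping described on the two slides above can be sketched as a small graph-building routine. This is only an illustration under stated assumptions: the node names (`phi_h2`, `occ_B6` and so on), the function `build_efg`, and the choice of B2 and B5 as the predecessor blocks of the two Type 1 edges are my reading of the flattened figure, not something the slides state explicitly.

```python
INF = float("inf")

# Block frequencies taken from the running example.
BB_FREQ = {"B2": 20, "B5": 10, "B6": 10, "B8": 60}


def build_efg(type1_edges, type2_edges, bb_freq):
    """Return {node: {successor: weight}} for an essential flow graph.

    type1_edges: (from_node, phi_node, predecessor_block) triples; the weight
                 is the frequency of the predecessor block of the operand.
    type2_edges: (phi_node, occurrence_node, occurrence_block) triples; the
                 weight is the frequency of the real occurrence.
    Every real occurrence also gets an infinite-weight edge to the sink.
    """
    efg = {}
    for src, phi, pred_bb in type1_edges:
        efg.setdefault(src, {})[phi] = bb_freq[pred_bb]
    for phi, occ, occ_bb in type2_edges:
        efg.setdefault(phi, {})[occ] = bb_freq[occ_bb]
        efg.setdefault(occ, {})["sink"] = INF
    return efg


efg = build_efg(
    type1_edges=[("source", "phi_h2", "B2"), ("phi_h2", "phi_h4", "B5")],
    type2_edges=[("phi_h2", "occ_B6", "B6"), ("phi_h4", "occ_B8", "B8")],
    bb_freq=BB_FREQ,
)
for node, successors in efg.items():
    print(node, successors)
```

The resulting weights (20, 10, 10, 60 and two infinite sink edges) agree with the numbers visible in the annotated EFG figure.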

  25. Performing Minimum Cut
     A minimum cut
     • separates the flow network into two halves, such that
     • the sum of the weights of the cut edges is minimized
     By performing insertions at the cut edges, the number of executions of the computation is minimized
       – Implies computational optimality
     If the min-cut is not unique, choose the cut nearest the sink
       – Induces life-time optimality

  26. Our Example
     • Two possible min-cuts
     • Pick the later (red) one
     [Figure: the EFG with two min-cuts marked, one at the edge of weight 20 and one across the two edges of weight 10.]
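The choice between the two cuts can be reproduced with any max-flow routine followed by a backward search from the sink. The sketch below uses Edmonds-Karp, which is a stand-in and not necessarily the min-cut algorithm used in the paper. The edge weights come from the slides, while the graph topology and the node names (`phi_h2`, `occ_B6`, `occ_h4`, and so on) are assumptions inferred from the figure.

```python
from collections import deque

INF = float("inf")

# EFG of the running example (topology assumed from the figure).
EFG = {
    "source": {"phi_h2": 20},
    "phi_h2": {"occ_B6": 10, "phi_h4": 10},
    "phi_h4": {"occ_h4": 60},
    "occ_B6": {"sink": INF},
    "occ_h4": {"sink": INF},
}


def residual_after_max_flow(cap, s, t):
    """Run Edmonds-Karp and return the residual capacities."""
    res = {u: dict(vs) for u, vs in cap.items()}
    for u, vs in cap.items():
        for v in vs:
            res.setdefault(v, {}).setdefault(u, 0)  # reverse edges
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return res
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= bottleneck
            res[v][u] += bottleneck


def latest_min_cut(cap, s, t):
    """Return the minimum cut whose edges lie nearest the sink."""
    res = residual_after_max_flow(cap, s, t)
    can_reach_sink = {t}
    queue = deque([t])
    while queue:  # backward search over edges with residual capacity
        v = queue.popleft()
        for u in res:
            if u not in can_reach_sink and res[u].get(v, 0) > 0:
                can_reach_sink.add(u)
                queue.append(u)
    return sorted(
        (u, v)
        for u, vs in cap.items()
        for v in vs
        if u not in can_reach_sink and v in can_reach_sink
    )


cut = latest_min_cut(EFG, "source", "sink")
print(cut, sum(EFG[u][v] for u, v in cut))
```

Taking the set of nodes that can still reach the sink in the residual graph yields the smallest possible sink side, so the cut edges are as late as possible. On this graph that selects the two weight-10 edges rather than the single weight-20 edge at the source, which corresponds to the later cut the slide picks.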

  27. Final Result
     [Figure: the final transformed program next to the EFG with the chosen min-cut. Computations t1=a1+b1 and t2=a1+b1 are placed according to the cut, and the remaining occurrences are replaced by uses of t1 and t2.]
