Heuristics for Profile-driven Method-level Speculative Parallelization

John Whaley and Christos Kozyrakis, Stanford University, June 15, 2005


  1. Heuristics for Profile-driven Method-level Speculative Parallelization
     John Whaley and Christos Kozyrakis, Stanford University
     June 15, 2005

  2. Speculative Multithreading
     • Speculatively parallelize an application
       – Uses speculation to overcome ambiguous dependencies
       – Uses hardware support to recover from misspeculation
       – Promising technique for automatically extracting parallelism from programs
     • Problem: Where to put the threads?

  3. Method-Level Speculation
     • Idea: Use method boundaries as speculative threads
       – Computation is naturally partitioned into methods
       – Execution often independent
       – Well-defined interface
     • Extract parallelism from irregular, non-numerical applications

  4. Method-Level Speculation Example
       main() {
         work_A;
         foo();
         work_C;  // reads *q
       }
       foo() {
         work_B;  // writes *p
       }

  5. Method-Level Speculation Example
       main() {
         work_A;
         foo() {
           work_B;  // writes *p
         }
         work_C;  // reads *q
       }

  6. Method-Level Speculation Example: sequential execution
     [Timeline: work_A; foo() runs work_B; work_C]

  7. Method-Level Speculation Example: TLS execution – no violation
     [Timeline: work_A; fork; after the fork overhead, work_C runs speculatively alongside foo() running work_B. p != q, so there is no violation.]

  8. Method-Level Speculation Example: TLS execution – violation
     [Timeline: work_A; fork; after the fork overhead, work_C runs speculatively alongside foo() running work_B. p = q, so a violation occurs: the speculative work_C is aborted and, after further overhead, work_C runs again.]

  9. Method-Level Speculation Example: the three timelines side by side
     • Sequential: work_A; foo() runs work_B; work_C
     • TLS – no violation (p != q): work_A; fork and overhead; foo() runs work_B alongside work_C
     • TLS – violation (p = q): work_A; fork and overhead; foo() runs work_B alongside work_C (aborted); overhead; work_C
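
The three timelines can be compared with a small cost model. This is only a sketch under stated assumptions, not the model used in the talk: the function names are hypothetical, the fork overhead is charged to the speculative thread only, and a violation is assumed to restart work_C once work_B has finished.

```python
def sequential_time(work_a, work_b, work_c):
    """Sequential execution: the three pieces of work run back to back."""
    return work_a + work_b + work_c


def tls_time(work_a, work_b, work_c, overhead, violation):
    """Method-level speculation on the call to foo() (which runs work_b).

    Assumptions (not from the slides): the fork overhead delays only the
    speculative thread, and on a violation the aborted work_c is re-run
    after work_b completes, paying the overhead once more.
    """
    if not violation:
        # work_b runs on the main thread while work_c runs speculatively.
        return work_a + max(work_b, overhead + work_c)
    # The speculative work_c is discarded, so nothing overlaps usefully.
    return work_a + work_b + overhead + work_c


if __name__ == "__main__":
    print(sequential_time(10, 20, 30))
    print(tls_time(10, 20, 30, overhead=5, violation=False))
    print(tls_time(10, 20, 30, overhead=5, violation=True))
```

Under these assumptions a violating speculation ends up slower than sequential execution, which is why the choice of speculation points matters.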

  10. Nested Speculation
       main() {
         foo() {
           work_A;
         }
         work_B;
         bar() {
           work_C;
         }
         work_D;
       }
      [Timeline: foo() runs work_A while a forked thread, after the fork overhead, runs work_B and then bar() (work_C); at the call to bar(), a second fork, after its overhead, runs work_D.]
      Sequences of method calls can cause nested speculation.

  11. This Talk: Choosing Speculation Points
      • Which methods to speculate?
        – Low chance of violation
        – Not too short, not too long
        – Not too many stores
      • Idea: Use profile data to choose good speculation points
        – Used for profile-driven and dynamic compilers
        – Should be low-cost but accurate
      • We evaluated 7 different heuristics
        – ~80% effective compared to a perfect oracle

  12. Difficulties in Method-Level Speculation
      • Method invocations can have varying execution times
        – Too short: Doesn’t overcome speculation overhead
        – Too long: More likely to violate or overflow, prevents other threads from retiring
      • Return values
        – Mispredicted return value causes violation

  13. Classes of Heuristics
      • Simple Heuristics
        – Use only simple information, such as method runtime
      • Single-Pass Heuristics
        – More advanced information, such as sequence of store addresses
        – Single pass through profile data
      • Multi-Pass Heuristics
        – Multiple passes through profile data

  15. Runtime Heuristic (SI-RT)
      • Speculate on all methods with:
        – MIN < runtime < MAX
      • Idea: Should be long enough to amortize overhead, but not long enough to violate
      • Data required:
        – Average runtime of each method

  16. Store Heuristic (SI-SC)
      • Speculate on all methods with:
        – dynamic # of stores < MAX
      • Idea: Stores cause violations, so speculate on methods with few stores
      • Data required:
        – Average dynamic store count of each method
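
Both simple heuristics reduce to a threshold filter over per-method averages. The sketch below illustrates this; the function names, the dictionary layout of the profile, and the sample numbers are assumptions made for the example, not data from the talk.

```python
def select_by_runtime(avg_runtime, min_cycles, max_cycles):
    """SI-RT style filter: keep methods with MIN < average runtime < MAX."""
    return sorted(m for m, cycles in avg_runtime.items()
                  if min_cycles < cycles < max_cycles)


def select_by_store_count(avg_stores, max_stores):
    """SI-SC style filter: keep methods whose average dynamic store count < MAX."""
    return sorted(m for m, stores in avg_stores.items() if stores < max_stores)


# Hypothetical profile data: method name -> average per invocation.
profile_runtime = {"tiny": 20, "medium": 5_000, "huge": 2_000_000}
profile_stores = {"tiny": 1, "medium": 40, "huge": 90_000}

if __name__ == "__main__":
    print(select_by_runtime(profile_runtime, 100, 100_000))
    print(select_by_store_count(profile_stores, 50))
```

Note that the store-count filter alone would still accept a very short method such as `tiny`, which the runtime filter rejects because it cannot amortize the fork overhead.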

  18. Stalled Threads
       foo() {
         bar() {
           work_A;
         }
         work_B;
       }
      [Timeline: bar() runs work_A; the forked thread runs work_B after the fork overhead and then sits idle until bar() finishes.]
      Speculative threads may stall while waiting to become main thread.

  19. Fork at intermediate points
       foo() {
         bar() {
           work_A;
         }
         work_B;
       }
      [Timeline: bar() runs work_A; the fork is placed partway through bar(), so work_B starts later, after the fork overhead.]
      Fork at an intermediate point within a method to avoid violations and stalling.

  20. Best Speedup Heuristic (SP-SU)
      • Speculate on methods with:
        – predicted speedup > THRES
      • Calculate predicted speedup by:
        (expected sequential run time) / (expected parallel run time)
      • Scan store stream backwards to find fork point
        – Choose fork point to avoid violations and stalling
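
One way to read the backward scan is sketched below. It is an interpretation, not the authors' algorithm: the trace format, the set of addresses read by the continuation, the run-time model, and all names are assumptions introduced for illustration.

```python
def find_fork_point(stores, continuation_reads):
    """Scan the method's store stream backwards.

    stores: list of (cycle, address) pairs in program order (assumed format).
    continuation_reads: addresses the code after the call is assumed to read.
    Returns the cycle just after the latest conflicting store, or 0 if no
    store conflicts (fork at method entry).
    """
    for cycle, address in reversed(stores):
        if address in continuation_reads:
            return cycle + 1
    return 0


def predicted_speedup(method_cycles, continuation_cycles, fork_point, overhead):
    """Expected sequential run time divided by expected parallel run time.

    Assumed model: everything before the fork point is serial; afterwards the
    rest of the method overlaps with the fork overhead plus the continuation.
    """
    sequential = method_cycles + continuation_cycles
    remaining = method_cycles - fork_point
    parallel = fork_point + max(remaining, overhead + continuation_cycles)
    return sequential / parallel


# Hypothetical store trace for one method invocation.
trace = [(10, 0x100), (40, 0x200), (90, 0x300)]

if __name__ == "__main__":
    fork = find_fork_point(trace, {0x200})
    print(fork, predicted_speedup(100, 50, fork, overhead=5))
```

A method would then be selected when the predicted speedup exceeds THRES. Forking later than the last conflicting store avoids the violation, and the `max` term reflects that a continuation finishing early simply waits for the method to complete.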

  21. Most Cycles Saved Heuristic (SP-CS)
      • Speculate on methods with:
        – predicted cycle savings > THRES
      • Calculate predicted cycle savings by:
        sequential cycle count – parallel cycle count
      • Place fork point such that:
        – predicted probability of violation < RATIO
      • Uses same information as SP-SU
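
The fork-point rule can be sketched as picking the earliest fork point whose observed violation rate across profiled invocations stays below RATIO. The profile format, the run-time model, and the function names here are assumptions for illustration; the talk does not specify them.

```python
def choose_fork_point(last_conflicts, ratio):
    """Earliest fork point with predicted violation probability < ratio.

    last_conflicts: one entry per profiled invocation, holding the cycle of
    the last conflicting store, or None if that invocation had no conflict
    (assumed profile format). Returns None if no fork point qualifies.
    """
    if not last_conflicts:
        return None
    total = len(last_conflicts)
    candidates = sorted({0} | {c + 1 for c in last_conflicts if c is not None})
    for fork in candidates:
        # An invocation violates if a conflicting store happens after the fork.
        violations = sum(1 for c in last_conflicts if c is not None and c >= fork)
        if violations / total < ratio:
            return fork
    return None


def cycles_saved(method_cycles, continuation_cycles, fork_point, overhead):
    """Sequential cycle count minus parallel cycle count (assumed model)."""
    sequential = method_cycles + continuation_cycles
    remaining = method_cycles - fork_point
    parallel = fork_point + max(remaining, overhead + continuation_cycles)
    return sequential - parallel


# Hypothetical profile: 10 invocations, two of which had a conflicting store.
observed = [None, None, 10, None, 80, None, None, None, None, None]

if __name__ == "__main__":
    fork = choose_fork_point(observed, ratio=0.15)
    print(fork, cycles_saved(100, 50, fork, overhead=5))
```

A looser RATIO permits an earlier fork and more overlap at the price of more expected violations; a method is kept only if the resulting savings exceed THRES.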

  23. Nested Speculation
       main() {
         foo() {
           work_A;
           bar() {
             work_B;
           }
           work_C;
         }
         work_D;
       }
      [Timeline: threads labeled foo(), bar(), and foo(); segments work_A, work_B, work_C, and work_D; two forks, each followed by overhead; one idle period.]
      Effectiveness of speculation choice depends on choices for caller methods!

  24. Best Speedup Heuristic with Parent Info (MP-SU)
      • Iterative algorithm:
        – Choose speculation with best speedup
        – Readjust all callee methods to account for speculation in caller
        – Repeat until best speedup < THRES
      • Max # of iterations: depth of call graph

  25. Most Cycles Saved Heuristic with Parent Info (MP-CS)
      • Iterative algorithm:
        1. Choose speculation with most cycles saved and predicted violations < RATIO
        2. Readjust all callee methods to account for speculation in caller
        3. Repeat until most cycles saved < THRES
      • Multi-pass version of SP-CS
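
The iterative structure of the multi-pass heuristics can be sketched as a greedy loop. The slides do not say how callee estimates are readjusted, so the `readjust` callback below is a hypothetical placeholder supplied by the caller, and only direct callees are readjusted; both are assumptions, as are all names and numbers.

```python
def select_multi_pass(estimates, callees, thres, ratio, readjust):
    """Greedy multi-pass selection in the style of MP-CS.

    estimates: {method: (cycles_saved, violation_probability)} (assumed format).
    callees:   {method: [direct callees]}.
    readjust:  hypothetical callback (callee_estimate, caller_estimate) ->
               new callee estimate; the real adjustment is not specified.
    """
    remaining = dict(estimates)
    chosen = []
    while remaining:
        eligible = [m for m, (_, prob) in remaining.items() if prob < ratio]
        if not eligible:
            break
        # Most cycles saved wins; the name breaks ties deterministically.
        best = max(eligible, key=lambda m: (remaining[m][0], m))
        if remaining[best][0] < thres:
            break
        chosen.append(best)
        caller_estimate = remaining.pop(best)
        for callee in callees.get(best, ()):
            if callee in remaining:
                remaining[callee] = readjust(remaining[callee], caller_estimate)
    return chosen


def halve_savings(callee_estimate, caller_estimate):
    """Illustrative placeholder only: halve the callee's predicted savings."""
    saved, prob = callee_estimate
    return (saved // 2, prob)


# Hypothetical estimates and call graph.
example_estimates = {"foo": (500, 0.05), "bar": (300, 0.05), "baz": (120, 0.05)}
example_callees = {"foo": ["bar"], "bar": [], "baz": []}

if __name__ == "__main__":
    print(select_multi_pass(example_estimates, example_callees,
                            thres=200, ratio=0.2, readjust=halve_savings))
```

With this placeholder, `bar` would pass the threshold on its own but drops below it once its caller `foo` is chosen, which is the effect the readjustment step is meant to capture.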

  26. Most Cycles Saved Heuristic with No Nesting (MP-CSNN)
      • Iterative algorithm:
        – Choose speculation with most cycles saved and predicted violations < RATIO
        – Eliminate all callee methods from consideration
        – Repeat until most cycles saved < THRES
      • Disallows nested speculation to avoid double-counting the benefits
      • Faster to compute than MP-CS
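
The no-nesting variant replaces the readjustment step with elimination, as the sketch below shows. It assumes that "all callee methods" means transitive callees in the call graph; that reading, the data layout, and the names are assumptions rather than details from the talk.

```python
def transitive_callees(method, callees):
    """All methods reachable from `method` through the call graph."""
    seen = set()
    stack = list(callees.get(method, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(callees.get(current, ()))
    return seen


def select_no_nesting(saved, violation_prob, callees, thres, ratio):
    """Greedy selection in the style of MP-CSNN.

    saved:          {method: predicted cycles saved}
    violation_prob: {method: predicted violation probability}
    callees:        {method: [direct callees]}
    """
    candidates = {m for m in saved if violation_prob[m] < ratio}
    chosen = []
    while candidates:
        best = max(candidates, key=lambda m: (saved[m], m))
        if saved[best] < thres:
            break
        chosen.append(best)
        candidates.discard(best)
        # Disallow nested speculation below the chosen method.
        candidates -= transitive_callees(best, callees)
    return chosen


# Hypothetical call graph and estimates.
graph = {"main": ["foo", "baz"], "foo": ["bar"], "bar": [], "baz": ["qux"], "qux": []}
savings = {"main": 10, "foo": 500, "bar": 300, "baz": 200, "qux": 150}
probs = {"main": 0.05, "foo": 0.05, "bar": 0.05, "baz": 0.5, "qux": 0.05}

if __name__ == "__main__":
    print(select_no_nesting(savings, probs, graph, thres=100, ratio=0.2))
```

Because eliminated callees are never re-estimated, each iteration is a simple set subtraction, which fits the slide's claim that this variant is faster to compute than MP-CS.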

  27. Experimental Results
