Bursty Tracing: A Framework for Low-Overhead Temporal Profiling

Bursty Tracing: A Framework for Low-Overhead Temporal Profiling. Martin Hirzel (hirzel@colorado.edu), Trishul Chilimbi (trishulc@microsoft.com). FDDO-4, December 2001, Austin, Texas.


  1. Bursty Tracing: A Framework for Low-Overhead Temporal Profiling
     Martin Hirzel (hirzel@colorado.edu), Trishul Chilimbi (trishulc@microsoft.com)
     FDDO-4, December 2001, Austin, Texas

  2. “Low-overhead temporal profiling”
     • Low overhead
       – Intended for dynamic optimization systems
       – Profile overhead must be recovered by optimization
     • Temporal profiling
       – Trend in profiling literature: discover more causality (path profiling, calling context trees, etc.)
       – Temporal profiles expose more optimization opportunities

  3. Arnold-Ryder profiling framework
     [Figure: (a) original procedure with blocks A and B; (b) modified procedure (Arnold-Ryder) with checking code A, B and instrumented code A’, B’, and checks at the procedure entry and on the loop back-edge]
     • Counter nCheck
     • Sampling rate $r = \frac{1}{nCheck_0 + 1}$
     • Implemented in Jikes RVM (Java on PowerPC)

  4. Why longer bursts
     • Arnold-Ryder framework isolates events by loop back-edges, calls, and returns
     • Example:
           for (i = 1; i < n; i++)
               if (...) f(); else g();
     • Temporal relationships interesting for optimization:
       – Single-entry multiple-exit regions
       – Field reordering

  5. Contributions
     • Longer bursts
       – Our framework captures temporal relationships across loop back-edges, calls, and returns.
     • x86 binaries
       – We report experiences with the framework in an alternative setting with different advantages and disadvantages.
     • Overhead reduction techniques
       – We eliminate some of the checks at procedure entries and at loop back-edges.

  6. Talk outline
     • Introduction
     • Methodology
       – Longer bursts
       – Overhead reduction by eliminating checks
     • Evaluation
       – Overhead
       – Profile quality
     • Conclusion

  7. Longer bursts
     [Figure: (a) original procedure with blocks A and B; (b) modified procedure (longer bursts) with checking code A, B and instrumented code A’, B’, and checks at the procedure entry and on the loop back-edge]
     • Counters nCheck and nInstr
     • Sampling rate $r = \frac{nInstr_0}{nCheck_0 + nInstr_0}$
     • Implemented using Vulcan (x86 binaries)
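The two-counter scheme on this slide can be sketched as a small state machine. This is a minimal illustration, not the Vulcan implementation: the class `BurstyDispatcher`, the method `at_check`, and the exact reset policy are assumptions chosen so that each period consists of nCheck_0 checks in checking code followed by nInstr_0 checks in instrumented code, which matches the sampling rate above. Setting `n_instr0=1` gives the Arnold-Ryder rate from slide 3.

```python
from fractions import Fraction


class BurstyDispatcher:
    """Chooses between checking code and instrumented code at each check.

    Hypothetical sketch: names and reset policy are assumptions.
    """

    def __init__(self, n_check0, n_instr0):
        if n_check0 < 0 or n_instr0 < 1:
            raise ValueError("need n_check0 >= 0 and n_instr0 >= 1")
        self.n_check0 = n_check0
        self.n_instr0 = n_instr0
        self.n_check = n_check0  # checks left before the next burst starts
        self.n_instr = n_instr0  # checks left in the current or next burst

    def at_check(self):
        """Run at every procedure entry and loop back-edge.

        Returns True if the instrumented version executes until the next check.
        """
        if self.n_check > 0:
            self.n_check -= 1
            return False
        self.n_instr -= 1
        if self.n_instr == 0:
            # Burst finished: reset both counters for the next period.
            self.n_check = self.n_check0
            self.n_instr = self.n_instr0
        return True


def sampling_rate(n_check0, n_instr0):
    """r = nInstr0 / (nCheck0 + nInstr0), kept exact as a fraction."""
    return Fraction(n_instr0, n_check0 + n_instr0)


dispatcher = BurstyDispatcher(n_check0=3, n_instr0=2)
pattern = [dispatcher.at_check() for _ in range(10)]
print(pattern)
print(sampling_rate(3, 2))
```

With these toy values, each period is three checks in checking code followed by a burst spanning two checks, so consecutive events inside a burst stay together in the trace.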

  8. Fewer checks
     • Goal: reduce overhead
     • Starting point: 6-35% overhead in our setting with checks on all procedure entries and loop back-edges
     • Constraint: never recurse or loop for unbounded amount of time without check
     • Remark: analogous to thread-yield points, gc-safe points, asynchronous-exception points

  9. Eliminating entry checks
     [Figure: call graph over the procedures main, substitute, check, match, insert_after, expand, join, ~symbols, and delete_digram]

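  10. Eliminating entry checks
     [Figure: the same call graph (main, substitute, check, match, insert_after, expand, join, ~symbols, delete_digram), with each procedure annotated with a number from 0 to 4]
     $C = \{\, f \in N \mid \neg\,\mathit{is\_leaf}(f) \wedge (\mathit{is\_root}(f) \vee \mathit{addr\_taken}(f) \vee \mathit{recursion\_from\_below}(f)) \,\}$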
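The set C of procedures that keep their entry check can be computed with a single pass over the call graph. The sketch below is an illustration under assumptions: the slide does not define recursion_from_below, so it is read here as "f is called from a procedure that is still active below it in a depth-first traversal", i.e. the call closes a cycle. The function name `entry_check_set`, its parameters, and the toy call graph are hypothetical.

```python
def entry_check_set(calls, roots, addr_taken):
    """Return the procedures that keep an entry check.

    calls:      dict mapping each procedure to the list of its callees
    roots:      set of call-graph roots (e.g. the program entry point)
    addr_taken: set of procedures whose address is taken
    """
    nodes = set(calls) | {g for callees in calls.values() for g in callees}
    back_targets = set()  # assumed meaning of recursion_from_below
    state = {}            # 1 = on the DFS stack, 2 = finished

    def dfs(f):
        state[f] = 1
        for g in calls.get(f, ()):
            if state.get(g) == 1:
                back_targets.add(g)  # this call closes a cycle through g
            elif g not in state:
                dfs(g)
        state[f] = 2

    # Visit roots first, then anything unreachable from them.
    for f in sorted(roots) + sorted(nodes):
        if f not in state:
            dfs(f)

    def is_leaf(f):
        return not calls.get(f)

    return {
        f for f in nodes
        if not is_leaf(f)
        and (f in roots or f in addr_taken or f in back_targets)
    }


# Toy call graph (made up): parse and eval are mutually recursive.
calls = {
    "main": ["parse", "log"],
    "parse": ["eval", "log"],
    "eval": ["parse"],
    "log": [],
}
print(sorted(entry_check_set(calls, roots={"main"}, addr_taken=set())))
```

Under this reading every cycle in the call graph still passes through at least one checked procedure, which is what the constraint on slide 8 requires; leaf procedures never need a check because they cannot be part of a cycle.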

  11. Eliminating loop back-edge checks
     • Tight inner loops
       – Checking gets expensive relative to time spent in original code
       – Statically optimized, not much opportunity for dynamic optimization
     • Omit both checking and profiling for tight inner loops
     • k-boring loop:
       – No calls
       – At most k profiling events of interest
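The k-boring test is a simple predicate over a loop body. The sketch below assumes a toy instruction representation (`Instr` with `is_call` and `is_event` flags, both hypothetical) and assumes the body passed in already includes the instructions of any nested loops.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    opcode: str
    is_call: bool = False   # instruction is a procedure call
    is_event: bool = False  # instruction is a profiling event of interest


def is_k_boring(loop_body, k):
    """True if the loop has no calls and at most k events of interest."""
    if any(instr.is_call for instr in loop_body):
        return False
    return sum(instr.is_event for instr in loop_body) <= k


# Toy loop body: two loads (the events when profiling data references),
# one add, one compare-and-branch.
body = [
    Instr("load", is_event=True),
    Instr("load", is_event=True),
    Instr("add"),
    Instr("cmp_branch"),
]
print(is_k_boring(body, k=4))                              # two events, no calls
print(is_k_boring(body + [Instr("call", is_call=True)], k=4))
```

A loop that passes the test for the chosen k would get neither a back-edge check nor instrumentation, in line with the L4 and L10 configurations named in the evaluation.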

  12. Evaluation: Overhead
     • $\mathit{overhead}(r) = \mathit{basic\_overhead} + r \cdot \mathit{instr\_overhead}$
     [Figure: bar chart of % basic overhead (axis 0 to 40) for 181.mcf, 252.eon, 300.twolf, 305.espresso, and boxsim, with one bar per configuration]
     Configurations:
     • orig: all checks intact
     • EL: no checks on entry to leaf procedures
     • EC: call-graph technique
     • EN: no checks on entry to any procedures
     • L4: 4-boring loop technique
     • L10: 10-boring loop technique
     • LN: no checks on any loop back-edges
     • EC+L4: call-graph and 4-boring loop techniques
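The overhead model is linear in the sampling rate, which a few lines make concrete. The numbers below are hypothetical and only illustrate the shape of the formula; they are not measurements from the talk.

```python
def total_overhead(basic_overhead, instr_overhead, r):
    """overhead(r) = basic_overhead + r * instr_overhead (all in percent)."""
    return basic_overhead + r * instr_overhead


# Hypothetical values: 4% cost for the checks alone, 250% slowdown while
# running instrumented code, and three candidate sampling rates.
for r in (0.0, 0.02, 0.10):
    print(r, total_overhead(basic_overhead=4.0, instr_overhead=250.0, r=r))
```

As r approaches zero the total approaches the basic overhead, so at low sampling rates the checks themselves dominate. That is why the chart reports basic overhead and why the check-elimination techniques target it.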

  13. Case study: Hot data stream profiles
     • data reference: dynamic load, (pc, addr) pair
     • data stream: sequence v of data references
     • heat of data stream: v.heat = v.length ∗ v.frequency
     • hot data stream: when v.heat > heat threshold (we set the threshold such that all hot data streams together cover 90% of the profile)
     • hot data stream profile: set P of hot data streams and their heats
     • $\mathit{overlap}(P, Q) = \sum_{v \in P \cup Q} \min\{v.\mathit{heat}_P,\ v.\mathit{heat}_Q\}$
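The definitions above translate directly into code. This is a sketch under stated assumptions: streams are identified by hashable keys, a stream missing from a profile is treated as having heat 0, the 90% threshold is applied greedily from the hottest stream down, and `trace_length` (the total number of data references in the profile) is a hypothetical parameter. All example numbers are made up.

```python
def heat(length, frequency):
    """v.heat = v.length * v.frequency"""
    return length * frequency


def hot_streams(streams, trace_length, coverage_percent=90):
    """Pick the hottest streams until they cover the requested share.

    streams: dict mapping a stream id to (length, frequency)
    Returns a dict mapping each hot stream id to its heat.
    """
    heats = {v: heat(n, freq) for v, (n, freq) in streams.items()}
    hot, covered = {}, 0
    for v, h in sorted(heats.items(), key=lambda item: item[1], reverse=True):
        if covered * 100 >= coverage_percent * trace_length:
            break
        hot[v] = h
        covered += h
    return hot


def overlap(p, q):
    """Sum over v in P union Q of min(v.heat_P, v.heat_Q)."""
    return sum(min(p.get(v, 0), q.get(v, 0)) for v in set(p) | set(q))


# Made-up streams: id -> (length, frequency), in a trace of 100 references.
streams = {"abc": (3, 20), "de": (2, 15), "x": (1, 10)}
profile_p = hot_streams(streams, trace_length=100)
profile_q = {"abc": 50, "xy": 20}
print(profile_p)
print(overlap(profile_p, profile_q))
```

To report overlap as a percentage, as the evaluation charts do, the heats would first need to be normalized (for example to percent of the trace); that normalization is an assumption, since the slide does not spell it out.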

  14. Evaluation: Overlap
     [Figure: bar chart of % overlap (axis 0 to 60) for 181.mcf, 252.eon, 300.twolf, 305.espresso, and boxsim, with one bar per $nCheck_0 : nInstr_0$ setting: 20:1, 100:1, 200:1, 200:10, 1000:10, 2000:10, 1000:50, 5000:50]

  15. Evaluation: Overlap
     • $nCheck_0 : nInstr_0$ = 1000:50
     [Figure: bar chart of % overlap (axis 0 to 60) for 181.mcf, 252.eon, 300.twolf, 305.espresso, and boxsim, with one bar per configuration]
     Configurations:
     • orig: all checks intact
     • EL: no checks on entry to leaf procedures
     • EC: call-graph technique
     • EN: no checks on entry to any procedures
     • L4: 4-boring loop technique
     • L10: 10-boring loop technique
     • LN: no checks on any loop back-edges
     • EC+L4: call-graph and 4-boring loop techniques

  16. Related work
     • Arnold, Ryder, A framework for reducing the cost of instrumented code, PLDI 2001
     • Temporal profiling
       – Ball, Larus, Efficient path profiling, MICRO 1996
       – Ammons, Ball, Larus, Exploiting hardware performance counters with flow and context sensitive profiling, PLDI 1997
       – Larus, Whole program paths, PLDI 1999
       – Chilimbi, Efficient representations and abstractions for quantifying and exploiting data reference locality, PLDI 2001

  17. Conclusions
     • Bursty tracing can collect temporal profiles online
       – General, low-overhead, deterministic
       – Flexible trade-off between sampling rate, overhead, and burst-length
       – Temporal
     • Future work
       – Prefetching hot data streams
       – Eliminating more loop back-edge checks
       – Improving profile quality further
