
Kyle C. Hale, Boris Grot, Stephen W. Keckler. December 12, 2009



  1. Kyle C. Hale, Boris Grot, Stephen W. Keckler
     December 12, 2009
     Department of Computer Science, The University of Texas at Austin

  2. • Static energy consumption due to leakage is threatening to become dominant
     • The network constitutes a substantial portion (up to 36%) of total chip energy [Kim et al., ISLPED ’03]
     • Abundant on-chip bandwidth is often underutilized due to low injection rates

  3. • Background
     • Segment Gating
     • Methodology
     • Evaluation
       ◦ Optimal Gating Scheme
       ◦ Static Scheme with Random Segment Selection
       ◦ Static Scheme with Intelligent Segment Selection
       ◦ Dynamic Gating Scheme
     • Future Work

  4. • Gated-VDD (power gating) at various granularities
     • Power-aware buffer designs [Chen & Peh, ISLPED ’03]
     • Slow-Silent VCs (DVFS applied to links, power gating applied to buffers) [Matsutani et al., NOCS ’08]

  5. • Aggressive gating of idle resources
     • Link
       ◦ Driver, repeaters
     • Router
       ◦ All VC buffers and management logic
       ◦ Xbar ports
         ▪ Line drivers & switching elements
       ◦ Allocators
         ▪ 2-level stateful allocators

  6. [Figure: upstream and downstream routers, each with R.C., V.A., and S.A. stages and N, S, E, W ports]

  7. • Background
     • Segment Gating
     • Methodology
     • Evaluation
       ◦ Optimal Gating Scheme
       ◦ Static Scheme with Random Segment Selection
       ◦ Static Scheme with Intelligent Segment Selection
       ◦ Dynamic Gating Scheme
     • Future Work

  8. • Three types of gating schemes
       ◦ Optimal gating (oracle, no cost)
         ▪ Upper bound on energy savings
       ◦ Static segment gating
         ▪ Off-line decision on which segments to gate
       ◦ Dynamic segment gating
         ▪ “On-line” decisions based on dynamic workload
     • Evaluated via analytical approaches
       ◦ Not simulation based (for now)
       ◦ Effects of contention ignored

  9. • No power-down/wake-up overheads (latency, energy)
     • A segment shuts down during any period of inactivity and wakes up instantly on demand
     • Used as a baseline measurement for static schemes
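As a rough illustration of the oracle scheme above, the sketch below charges a segment leakage only for the cycles in which it is busy and compares that with a segment that is never gated. The function names and the ten-cycle trace are hypothetical; the per-cycle link leakage value is taken from the power-model slide.

```python
def oracle_leakage_nj(activity, static_nj_per_cycle):
    """Leakage of one segment under ideal gating: pay only for busy cycles.

    `activity` holds one boolean per cycle (True = the segment is busy).
    """
    busy_cycles = sum(1 for busy in activity if busy)
    return busy_cycles * static_nj_per_cycle


def ungated_leakage_nj(activity, static_nj_per_cycle):
    """Leakage of the same segment when it is never powered down."""
    return len(activity) * static_nj_per_cycle


# Hypothetical trace: a link that is busy in 2 of 10 cycles.
trace = [False, True, False, False, False, True, False, False, False, False]
LINK_STATIC_NJ = 0.556  # link leakage per cycle, from the power-model slide

saved_fraction = 1 - (oracle_leakage_nj(trace, LINK_STATIC_NJ)
                      / ungated_leakage_nj(trace, LINK_STATIC_NJ))
```

Because the oracle pays no wake-up or power-down cost, the saved fraction is simply the idle fraction of the trace.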

  10. • Objective: turn off a certain number of segments before the workload runs
      • Measure the impact of gating through its effect on hop counts
      • Static analysis tool:
        ◦ Represents the mesh as a directed graph
        ◦ Hop counts derived from shortest path lengths between communicating nodes
        ◦ Shortest paths may be longer than minimal Manhattan routes
      • Segments turned off via a stochastic process
        ◦ Invariant: full connectivity maintained
        ◦ Take multiple samples to generate a distribution
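The analysis described on this slide can be sketched as follows. This is not the authors' tool: it is a minimal reconstruction that assumes each segment is one directed mesh link, uses breadth-first search for hop counts, and rejects any removal that would break all-to-all reachability. All function names are hypothetical.

```python
import random
from collections import deque


def mesh_links(k):
    """Directed links of a k x k mesh; nodes are (x, y) tuples."""
    links = set()
    for x in range(k):
        for y in range(k):
            for dx, dy in ((1, 0), (0, 1)):
                nx, ny = x + dx, y + dy
                if nx < k and ny < k:
                    links.add(((x, y), (nx, ny)))
                    links.add(((nx, ny), (x, y)))
    return links


def hop_counts(links, src):
    """BFS hop counts from src over the links that are still powered on."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def fully_connected(links, nodes):
    """True if every node can still reach every other node."""
    return all(len(hop_counts(links, n)) == len(nodes) for n in nodes)


def gate_random(k, n_gate, rng):
    """Gate up to n_gate random links while preserving full connectivity."""
    nodes = [(x, y) for x in range(k) for y in range(k)]
    links = mesh_links(k)
    candidates = sorted(links)
    rng.shuffle(candidates)
    gated = 0
    for link in candidates:
        if gated == n_gate:
            break
        links.remove(link)
        if fully_connected(links, nodes):
            gated += 1
        else:
            links.add(link)  # removing it would disconnect some pair
    return links


def mean_hops(links, nodes):
    """Average hop count over all ordered pairs of distinct nodes."""
    total = sum(d for n in nodes for d in hop_counts(links, n).values())
    return total / (len(nodes) * (len(nodes) - 1))


nodes = [(x, y) for x in range(8) for y in range(8)]
full = mesh_links(8)
sample = gate_random(8, 20, random.Random(0))  # one sample of the distribution
```

Calling `gate_random` with different seeds and recording `mean_hops(sample, nodes)` for each yields the kind of distribution the slide refers to.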

  11. • Select segments at random to power down
      • Limits of static segment gating:
        ◦ 161 segments (links) out of 224 gated in a 64-node mesh
        ◦ A gated segment remains in that state for the rest of the workload
      • For certain traffic patterns, a random decision could lead to a bad choice

  12. • Random selection ignores characteristics of traffic patterns
      • Instead, pick segments based on link utilization
        ◦ For applications with communication regularity
        ◦ Requires the communication pattern to be known a priori
      • Two-stage approach
        ◦ Stage 1: pick segments (links) with zero utilization
          ▪ 92 for bit-complement
          ▪ 100 for transpose
        ◦ Stage 2 (iterative):
          ▪ Turn off the least-utilized segment
          ▪ Recompute utilization based on the new traffic flow
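The two-stage selection above could look roughly like the sketch below. Several details are assumptions rather than the authors' method: traffic is given as a list of (source, destination) flows, each flow follows one BFS shortest path, utilization is the number of flows crossing a link, and a link is only gated if every flow still has a route. The names `shortest_path`, `utilization`, and `gate_by_utilization` are hypothetical.

```python
from collections import Counter, deque


def shortest_path(links, src, dst):
    """One BFS shortest path from src to dst as a list of links, or None."""
    adj = {}
    for a, b in sorted(links):
        adj.setdefault(a, []).append(b)
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    if dst not in parent:
        return None
    path = []
    node = dst
    while parent[node] is not None:
        path.append((parent[node], node))
        node = parent[node]
    return path[::-1]


def utilization(links, flows):
    """Number of flows crossing each link, or None if a flow has no route."""
    use = Counter({link: 0 for link in links})
    for src, dst in flows:
        path = shortest_path(links, src, dst)
        if path is None:
            return None
        use.update(path)
    return use


def gate_by_utilization(links, flows, n_gate):
    """Stage 1 gates unused links; stage 2 repeatedly gates the least-used one."""
    links = set(links)
    use = utilization(links, flows)
    gated = [l for l in sorted(links) if use[l] == 0][:n_gate]  # stage 1
    links -= set(gated)
    while len(gated) < n_gate:  # stage 2
        use = utilization(links, flows)
        for link in sorted(links, key=lambda l: (use[l], l)):
            trial = links - {link}
            if utilization(trial, flows) is not None:
                links, gated = trial, gated + [link]
                break
        else:
            break  # no further link can be gated without cutting a flow
    return links, gated


# Hypothetical 2 x 2 mesh (nodes 0..3) with traffic between opposite corners.
square = {(0, 1), (1, 0), (0, 2), (2, 0), (1, 3), (3, 1), (2, 3), (3, 2)}
flows = [(0, 3), (3, 0)]
remaining, gated = gate_by_utilization(square, flows, 4)
```

In this small example both flows route through node 1, so the four links touching node 2 carry no traffic and are gated in stage 1; asking for a fifth link gates nothing more, because every remaining link is needed by a flow.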

  13. • Objective: dynamically gate segments to accommodate a changing workload
      • PARSEC traces run through a cycle-accurate network simulator
        ◦ Log each link’s idle/active periods
      • Off-line analysis of activity logs
        ◦ Combine with power model
        ◦ Gate idle links
        ◦ Wake up links on demand
        ◦ Ignore contention and wake-up delays
        ◦ Segments must be gated long enough to amortize the energy cost of wake-up and power-down
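The off-line log analysis starts from each link's idle periods. A minimal sketch of that first step is shown below, assuming a hypothetical log format in which each link records the sorted cycle numbers in which it carried a flit; the simulator's actual log format is not given in the slides.

```python
def idle_periods(active_cycles, total_cycles):
    """Lengths of the idle gaps in one link's activity log.

    `active_cycles` is a sorted list of cycle numbers in which the link
    carried a flit; cycles run from 0 to total_cycles - 1.
    """
    gaps = []
    prev = -1  # pretend the link was last active just before cycle 0
    for cycle in active_cycles:
        if cycle - prev > 1:
            gaps.append(cycle - prev - 1)
        prev = cycle
    if total_cycles - prev > 1:
        gaps.append(total_cycles - prev - 1)  # trailing idle period
    return gaps


# Hypothetical log: a link active in cycles 3, 4, and 10 of a 20-cycle run.
gaps = idle_periods([3, 4, 10], 20)
```

The resulting gap lengths are what a gating policy then filters against its wake-up and power-down costs.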

  14. Component      Static (nJ/cycle)   Dynamic (nJ/flit)
      Flit buffers   1.5                 7.23
      Crossbar       0.491               10.3
      Allocators     0.215               0.7
      Link           0.556               8.1
      • Derived using ORION 2.0
      • Allocator energy: combined energy of the 1st- and 2nd-level switch and VC allocators
      • Note: buffers account for only 55% of leakage energy!
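The table can be turned into a small energy calculator. The sketch below is an assumption about how the numbers combine (static energy scales with powered-on cycles, dynamic energy with flits handled); the dictionary and function names are hypothetical. With the table's values, flit buffers come out at roughly 54% of the summed per-cycle leakage, in line with the slide's note.

```python
# (static nJ/cycle, dynamic nJ/flit) per component, copied from the table.
POWER_MODEL = {
    "flit_buffers": (1.5, 7.23),
    "crossbar": (0.491, 10.3),
    "allocators": (0.215, 0.7),
    "link": (0.556, 8.1),
}


def component_energy_nj(component, cycles_on, flits):
    """Energy of one component: leakage while powered on plus per-flit cost."""
    static, dynamic = POWER_MODEL[component]
    return static * cycles_on + dynamic * flits


def leakage_share(component):
    """Fraction of the summed per-cycle leakage due to one component."""
    total_static = sum(static for static, _ in POWER_MODEL.values())
    return POWER_MODEL[component][0] / total_static


buffer_share = leakage_share("flit_buffers")
link_energy = component_energy_nj("link", cycles_on=1000, flits=10)
```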

  15. Topology             64-node mesh
      Channels             128 bits wide, 1 cycle/link
      Synthetic workloads  Uniform random, transpose, bit-complement. Each workload comprises 1,000 packets injected by each node
      PARSEC traces        Blackscholes, bodytrack, fluidanimate, vips, x264. Sim-medium datasets
      Router details       2-stage speculative pipeline, 5 ports, 4 VCs/port, 5 flits/VC

  16. • Background
      • Segment Gating
      • Methodology
      • Evaluation
        ◦ Optimal Gating Scheme
        ◦ Static Scheme with Random Segment Selection
        ◦ Static Scheme with Intelligent Segment Selection
        ◦ Dynamic Gating Scheme
      • Future Work

  17. [Figure: total energy (µJ), split into dynamic and leakage, for uniform random, transpose, and bit-complement traffic]

  18. [Figure: hop count (mean, max, min) vs. number of gated segments, 0 to 160]
      • ~20 edges can be removed with a negligible hop count increase
      • 2-4x increase in hop count with the maximum number of segments gated

  19. [Figure: total energy (µJ), split into dynamic and leakage, at 0.1%, 1.0%, 5.0%, and 10.0% for 0, 40, 80, 120, and 161 gated segments]

  21. • Two policies for idle period and break-even point
        ◦ Aggressive: 2 cycles idle, 10 cycles to break even
        ◦ Conservative: 10 cycles idle, 50 cycles to break even
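One way to read the two policies is sketched below. The interpretation is an assumption: a link must sit idle for the policy's idle count before it is powered down, and, since the analysis is off-line and knows each gap's length, a gap is only gated when the remaining off time reaches the break-even point. The names `POLICIES` and `gated_cycles` and the gap lengths are hypothetical.

```python
# (idle cycles before gating, cycles gated needed to break even)
POLICIES = {
    "aggressive": (2, 10),
    "conservative": (10, 50),
}


def gated_cycles(gaps, policy):
    """Cycles a link spends powered down, given its idle-gap lengths."""
    idle_wait, break_even = POLICIES[policy]
    total = 0
    for gap in gaps:
        off = gap - idle_wait  # cycles left once the idle wait has elapsed
        if off >= break_even:  # otherwise gating would cost more than it saves
            total += off
    return total


# Hypothetical idle gaps (in cycles) for one link.
gaps = [3, 15, 80]
aggressive_off = gated_cycles(gaps, "aggressive")
conservative_off = gated_cycles(gaps, "conservative")
```

With these gaps the aggressive policy gates the 15- and 80-cycle gaps, while the conservative policy gates only the 80-cycle gap.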

  23. • Detailed simulation and analysis
      • Consider performance impact and contention
      • Other network configurations
      • Clearly establish the regimes that should use segment gating
      • Explore application to fault-tolerant systems

  24. • Real applications show sparse communication, so the potential for static energy savings is high
      • Using link utilization, we can minimize the dynamic energy incurred from gating segments statically
      • The aggressive dynamic policy gives static energy savings for up to 99% of cycles
