
Toward Runtime Power Management of Exascale Networks by On/Off Control of Links



  1. Toward Runtime Power Management of Exascale Networks by On/Off Control of Links
     Ehsan Totoni, University of Illinois at Urbana-Champaign, PPL
     Charm++ Workshop, April 16, 2013

  2. Power challenge
     - Power is a major challenge
       - Blue Waters consuming up to 13 MW
       - Enough to electrify a small town
       - Power and cooling infrastructure
     - Up to 30% of power in network
       - Projected for future by Peter Kogge
       - Saving 25% power in current Cray XT system by turning down network (work from Sandia)

  3. Network link power
     - Network is not "energy proportional"
       - Consumption is not related to utilization
       - Near peak most of the time
       - Unlike processor
     - Recent study:
       - Work from Google in ISCA'10
       - 50% of power in network of non-HPC data center
         - When CPUs are underutilized
       - Up to 65% of network's power is in links

  4. Exascale networks
     - Dragonfly
       - IBM PERCS in Power 775 machines
       - Cray Aries network in XC30 "Cascade"
       - DOE Exascale Report
     - High dimensional Tori
       - 5D Torus in IBM Blue Gene/Q
       - 6D Torus in K Computer
     - Higher radix -> a lot of links!

  5. Communication patterns
     - Applications' communication patterns are different
     - Network topology designed for a wide range of applications
     [Figure: communication patterns of NPB CG and MILC]

  6. Fraction of links ever used
     [Figure: bar chart of link usage (%) on Full Network, 3D Torus, PERCS, and 6D Torus for NAMD_PME, NAMD, MILC, CG, MG, and BT]

  7. Nearest neighbor usage
     [Figure: bar chart of link usage (%) on Full Network, 3D Torus, PERCS, and 6D Torus for Jacobi2D, Jacobi3D, and Jacobi4D]

  8. More expensive links
     [Figure: bar chart of link usage (%) for LL links, D links, LR links, and all links, for NAMD_PME, NAMD, MILC, CG, MG, and BT]

  9. Nearest neighbor
     [Figure: bar chart of link usage (%) for LL links, D links, LR links, and all links, for Jacobi2D, Jacobi3D, and Jacobi4D]

  10. Solution to power waste
     - Many of the links are never used
       - For common applications
     - Are networks over-built?
       - Maybe: FFTs are crucial
       - But processors are also overbuilt
     - Let's make them "energy proportional"
       - Consume according to workload
       - Just like processors
     - Turn off unused links
       - Commercial network exists (Motorola)
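
The "turn off unused links" idea on this slide can be illustrated with a small sketch. It is not the author's tool: the function name `link_usage`, the representation of a link as a `(source, destination)` tuple, and the assumption that each message's route is already known as a list of links are all hypothetical choices made for illustration.

```python
def link_usage(all_links, routes):
    """Return (percent of links ever used, set of links never used).

    all_links: iterable of hashable link identifiers (assumed format: (src, dst)).
    routes: iterable of routes, each a list of the links one message traverses.
    """
    all_links = set(all_links)
    if not all_links:
        raise ValueError("the network must contain at least one link")

    # A link counts as used if any message ever crosses it.
    used = set()
    for route in routes:
        used.update(route)
    used &= all_links  # ignore links that are not part of this network

    unused = all_links - used  # candidates for being switched off
    percent_used = 100.0 * len(used) / len(all_links)
    return percent_used, unused


# Hypothetical 4-link ring in which traffic only touches two links.
ring = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")}
traffic = [[("a", "b"), ("b", "c")], [("a", "b")]]
percent, off_candidates = link_usage(ring, traffic)
```

In this toy case half of the links carry no traffic, so they are the ones a runtime could ask the hardware to switch off.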

  11. Runtime system solution
     - Hardware can cause delays
       - According to related work
       - Not enough application knowledge
       - Small window size
     - Compiler does not have enough info
       - Input dependent program flow
     - Application does not know hardware
       - Significant programming burden to expose
     - Runtime system is the best
       - Mediates all communication
       - Knows the application
       - Knows the hardware

  12. Feasibility
     - Probably not available for your cluster downstairs
       - Need to convince hardware vendors
       - Runtime hints to hardware, small delay penalty if wrong
     - Multiple jobs: interference
       - Isolated allocations are becoming common
       - Blue Genes allocate cubes already
       - Capability machines are for big jobs

  13. Software design choices
     - Random mapping and indirect routing have similar performance but different link usages
     [Figure: bar chart of link usage (%) of Jacobi3d 300K for LL links, D links, LR links, and all links under Default, Random, and Indirect]

  14. Power model
     - We saw many links that are never used
     - Used links are not used all the time
       - For only a fraction of iteration time
       - Compute-communicate paradigm
     - A power model for "network capacity utilization"
       - "Average" utilization of all the links
       - Assume that links are turned magically on and off
         - At the exact right time
         - No switching overhead
       - Example: network used one tenth of iteration time
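
A minimal sketch of the power model on this slide follows. It assumes that "average utilization of all the links" means the arithmetic mean, over every link, of the fraction of the iteration during which that link carries traffic; the slide does not spell out the formula, so this reading, the function name, and the parameter names are assumptions.

```python
def network_capacity_utilization(busy_time_per_link, iteration_time):
    """Mean fraction of the iteration during which links must be on.

    Models ideal on/off control: a link is on exactly while it is busy,
    with no switching overhead. Links that are never used contribute 0.
    Both arguments use the same time unit.
    """
    if iteration_time <= 0:
        raise ValueError("iteration_time must be positive")
    if not busy_time_per_link:
        raise ValueError("at least one link is required")

    # A link cannot be busy for longer than the iteration itself.
    fractions = [min(t, iteration_time) / iteration_time
                 for t in busy_time_per_link]
    return sum(fractions) / len(fractions)


# The slide's example: every link is busy for one tenth of the iteration.
u_all_used = network_capacity_utilization([1.0, 1.0, 1.0, 1.0], 10.0)

# Same traffic, but half of the links are never used at all.
u_half_unused = network_capacity_utilization([1.0, 1.0, 0.0, 0.0], 10.0)
```

Under this reading, the slide's example gives a utilization of about one tenth, and leaving half of the links idle halves it again.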

  15. Model results
     [Figure: bar chart of network capacity utilization (U %) on PERCS and 6D Torus for NAMD, MILC, CG, MG, and BT]

  16. Scheduling on/offs
     - Runtime roughly knows when a message will arrive
       - For common iterative HPC applications
       - Low noise systems (e.g. IBM Blue Genes)
     - There is a delay for switching the link
       - 10 μs for current implementation
       - Much smaller than iteration time
     - Runtime can be conservative
       - Schedule "on"s earlier
       - Similar to having more switching delay
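
The conservative scheduling described on this slide might be sketched as below. Only the 10 μs transition delay comes from the slide; the names `LinkSchedule`, `schedule_link`, and `powered_fraction`, the `safety_margin` parameter, and the assumption that a link draws power during its transition are hypothetical. All times are in seconds, measured from the start of an iteration.

```python
from dataclasses import dataclass


@dataclass
class LinkSchedule:
    turn_on_at: float   # when the runtime asks the link to power up
    turn_off_at: float  # when the link can be powered down again


def schedule_link(expected_arrival, transfer_time,
                  transition_delay=10e-6, safety_margin=0.0):
    """Plan one on/off cycle around a predicted message.

    The link is switched on early enough to finish its transition before
    the message arrives; a larger safety_margin makes the runtime more
    conservative, which acts like a longer switching delay.
    """
    turn_on_at = max(0.0, expected_arrival - transition_delay - safety_margin)
    turn_off_at = expected_arrival + transfer_time
    return LinkSchedule(turn_on_at, turn_off_at)


def powered_fraction(schedule, iteration_time):
    """Fraction of the iteration during which the link is powered."""
    end = min(schedule.turn_off_at, iteration_time)
    return max(0.0, end - schedule.turn_on_at) / iteration_time


# Message expected 5 ms into a 10 ms iteration, taking 1 ms to transfer,
# evaluated here with a 1 ms transition delay.
plan = schedule_link(expected_arrival=0.005, transfer_time=0.001,
                     transition_delay=0.001)
fraction_on = powered_fraction(plan, iteration_time=0.010)
```

Increasing either `transition_delay` or `safety_margin` moves `turn_on_at` earlier and raises the powered fraction, which is why conservative scheduling costs some of the potential savings.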

  17. Delay overhead
     [Figure: network capacity utilization (U %) versus link transition delay (0.01 to 10 ms) for NAMD, MILC, CG, MG, and BT]

  18. Results summary
     [Figure: bar chart of machine power saving potential (%) for NAMD_PME, MILC, CG, MG, and BT; series: Basic PERCS, Schedule 1ms delay PERCS, Basic 6D Torus, Schedule 1ms delay 6D Torus]

  19. Questions? Are you convinced?
