Power Use of Disk Subsystems in Supercomputers, Matthew L. Curry

SLIDE 1

Power Use of Disk Subsystems in Supercomputers

Matthew L. Curry
Sandia National Laboratories

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

SLIDE 2

Supercomputer Power Use and Exascale

  • DOE Plan: First exascale machine will consume up to 20 MW, or 50 GF/W
  • The June 2011 Green500 list has a BG/Q prototype as the most efficient machine
    – 2 GF/W
    – In the next decade, machines need to be 25x more power efficient!
  • Where can we find more power efficiency?
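
The 50 GF/W and 25x figures follow directly from the numbers on this slide. A minimal sketch of that arithmetic, with illustrative variable names that are not from the original deck:

```python
# Inputs taken from the slide; names are illustrative only.
EXAFLOP = 1e18                 # 1 exaflop/s in flop/s
power_budget_w = 20e6          # DOE plan: up to 20 MW
green500_best_gf_per_w = 2.0   # BG/Q prototype, June 2011 Green500

# Required efficiency in GF/W for an exaflop within the power budget.
target_gf_per_w = EXAFLOP / power_budget_w / 1e9

# Improvement factor over the most efficient machine on the list.
improvement = target_gf_per_w / green500_best_gf_per_w

print(f"target efficiency: {target_gf_per_w:.0f} GF/W")
print(f"required improvement: {improvement:.0f}x")
```
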

SLIDE 3

Memory Hierarchy and Power

  • The first reaction is often to look at which operations require the most power
  • Disks are far away and (most) have moving parts
  • How much power does storage really use for real application behavior?

Figure (memory hierarchy, energy per byte): Disk (1000 nJ/byte), Off-Chip Cache (1 nJ/byte), DRAM (0.1 nJ/byte), On-Chip Cache (0.1 nJ/byte), Reg

SLIDE 4

A Study of Power in Supercomputing

  • Survey three sites with large machines
    – Los Alamos: Roadrunner, #10, and others
    – Los Alamos/Sandia ACES: Cielo, #6
    – Sandia: Red Sky, #16
    – Clemson University’s Palmetto, #96
  • Asked for power data from compute and I/O infrastructure separately
    – No cooling, external infrastructure, etc. Just compute, I/O servers, disks.

SLIDE 5

Los Alamos Description

  • Two separate methods of sampling
    – Cielo individually
      • 4.7-6.7 MW
      • 1.1 PF (~143k cores)
      • 10 PB of dedicated Panasas storage
    – Secure Computing Environment, which includes Cielo, Roadrunner, capacity clusters, etc.
      • 16.5 MW typical
      • 3.5 PF
      • 20 PB of Panasas storage, with 10 PB served to all machines except Cielo via a 10GigE fabric
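
For comparison with the GF/W targets discussed earlier, the figures on this slide can be turned into a rough efficiency estimate. This is only a sketch: it assumes the listed PF values can be divided by the listed power draw, which the slide does not state, and the names below are illustrative.

```python
GF = 1e9  # flop/s in one gigaflop/s

# (performance in flop/s, power in watts), taken from the slide.
systems = {
    "Cielo at 4.7 MW": (1.1e15, 4.7e6),
    "Cielo at 6.7 MW": (1.1e15, 6.7e6),
    "Secure Computing Environment at 16.5 MW": (3.5e15, 16.5e6),
}


def gf_per_watt(flops: float, watts: float) -> float:
    """Rough efficiency estimate: listed performance over listed power."""
    return flops / watts / GF


efficiency = {name: gf_per_watt(f, w) for name, (f, w) in systems.items()}

for name, value in efficiency.items():
    print(f"{name}: {value:.2f} GF/W")
```

Under that assumption, all three estimates land well below 1 GF/W.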

SLIDE 6

Los Alamos Results

Figure: percentage of power (axis from 93% to 100%) for Cielo and the LANL Secure Computing Environment, split into Disks + Storage Servers + SAN versus Compute + I/O Forwarding Nodes.

SLIDE 7

Sandia Description

  • Red Sky/Red Mesa is the premier capacity platform for Sandia and NREL
    – 3 PB
    – 433.5 TF (~42k cores)
  • One rack of storage and compute measured throughout a single day
  • Extrapolated to unclassified section of Red Sky, which is approximately 56% of the Red Sky/Red Mesa machine

SLIDE 8

Sandia Results

Figure: percentage of power (axis from 92% to 100%) at Hours 0, 1, 7, 8, 11, 13, and 14, split into Disks versus Compute + Storage Servers.

SLIDE 9

Clemson Description

  • Capacity, condominium cluster at Clemson University
    – 92 TF, ~14k cores
    – 616 TB
  • Data collection at two-hour intervals over two weeks
    – Storage infrastructure used mostly constant power throughout

SLIDE 10

Clemson Results

Figure: percentage of power (axis from 90% to 100%) from Hour 0 to Hour 160, split into Disks + Storage Servers versus Compute.

SLIDE 11

Extrapolating to Exascale

  • Exascale storage systems will require 320 PB-1 EB of storage at 106.7 TB/s
    – 32 PB main memory
    – Checkpoint every hour
    – 95% (57/60 minutes) must be spent computing
  • Predictions for future disks (~30 TB capacity, ~380 MB/s bandwidth) dictate 277k disks!
    – 66% of power budget if power per disk remains constant
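
The disk count on this slide can be approximated from the stated bandwidth and capacity figures. The sketch below is an illustration, not the authors' model: it uses decimal units and the rounded inputs from the slide, so it gives about 281k disks, near the slide's 277k. The per-disk power is back-calculated from the slide's own 66% figure and presumably includes whatever the authors counted per disk.

```python
import math

# Inputs from the slide (decimal units assumed).
required_bw = 106.7e12      # bytes/s
disk_bw = 380e6             # bytes/s per future disk
disk_capacity = 30e12       # bytes per future disk
capacity_low, capacity_high = 320e15, 1e18   # 320 PB to 1 EB
power_budget_w = 20e6       # 20 MW

# Disks needed to meet bandwidth versus disks needed to meet capacity.
disks_for_bw = math.ceil(required_bw / disk_bw)
disks_for_capacity_low = math.ceil(capacity_low / disk_capacity)
disks_for_capacity_high = math.ceil(capacity_high / disk_capacity)

# Bandwidth dominates: far more disks are needed for speed than for space.
disks_needed = max(disks_for_bw, disks_for_capacity_high)

# Per-disk power implied by the slide's "277k disks = 66% of budget".
implied_w_per_disk = 0.66 * power_budget_w / 277e3
power_share = disks_needed * implied_w_per_disk / power_budget_w

print(f"disks for bandwidth: {disks_for_bw}")
print(f"disks for capacity: {disks_for_capacity_low} to {disks_for_capacity_high}")
print(f"implied power per disk: {implied_w_per_disk:.1f} W")
print(f"share of 20 MW budget: {power_share:.0%}")
```

The point of the comparison is that the disk count is driven by checkpoint bandwidth, not by capacity.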

SLIDE 12

Burst Buffer

  • Grider has detailed in many presentations a “burst buffer” idea for checkpointing
    – Quickly accept a checkpoint in smaller flash store
    – Bleed flash to slower disk-based storage between checkpoints
  • It has been shown that this will work from a purchase price standpoint
    – Power?

SLIDE 13

Flash Characteristics

  • Current flash (e.g., Intel 320 series) can accept 1 MB/s per gigabyte of capacity
    – Even today, 90 PB of flash (to hold three checkpoints) is sufficient to sustain 90 TB/s of bandwidth
  • Use 10 TB/s disk-based store
    – Requires 25k disks, which may hold 738 PB
    – Extrapolating from today’s disk power, this is 6% of the power budget
    – Flash uses a comparable amount of power, yielding 6.6% of 20 MW for disk and flash
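
A few of the burst-buffer numbers can be cross-checked with simple arithmetic. The sketch below uses only figures from the slides plus two labeled assumptions: decimal units, and a per-checkpoint size of one third of the flash store (from "to hold three checkpoints"). Names are illustrative.

```python
# Flash tier, from the slide.
flash_capacity_gb = 90e6          # 90 PB expressed in GB
flash_bw_per_gb = 1e6             # 1 MB/s per GB, in bytes/s
flash_bw = flash_capacity_gb * flash_bw_per_gb   # aggregate bytes/s

# Disk tier, from the slide.
drain_bw = 10e12                  # 10 TB/s disk-based store
power_budget_w = 20e6             # 20 MW
disk_share, total_share = 0.06, 0.066

# Assumption: one checkpoint is a third of the 90 PB flash store.
checkpoint_bytes = 90e15 / 3
drain_seconds = checkpoint_bytes / drain_bw

disk_power_w = disk_share * power_budget_w
total_power_w = total_share * power_budget_w

print(f"flash bandwidth: {flash_bw / 1e12:.0f} TB/s")
print(f"time to drain one checkpoint to disk: {drain_seconds / 60:.0f} min")
print(f"disk power: {disk_power_w / 1e6:.2f} MW")
print(f"disk + flash power: {total_power_w / 1e6:.2f} MW")
```

Under these assumptions, draining one checkpoint to disk takes less than the hourly checkpoint interval, which is what the "bleed between checkpoints" idea requires.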

SLIDE 14

Conclusion

  • I/O consumes a low proportion of power within the machine
    – 4.4-5.5%
  • One exascale storage model, the burst-buffer scheme, can be done with 6.6% of the power budget
  • Inefficiencies in the power feed systems of the data center can be a larger consumer of power!
  • We should always be on the lookout for ways to be more efficient
    – Especially for workloads that aren’t checkpointing

SLIDE 15

Acknowledgements

  • Authors
    – Lee Ward, Sandia
    – Gary Grider, Los Alamos
    – Jill Gemmill, Clemson
    – Jay Harris, Clemson
    – Dave Martinez, Sandia