
3 babblings from bent



SLIDE 1

john bent, seagate gov

3 babblings from bent

dagstuhl 2017

  • 1. grind-crunch
  • 2. from 2 to 4 and back again
  • 3. to share or not to share
SLIDE 2


grind-crunch

bent

SLIDE 3

[Diagram: timeline of compute cycles, each a sequence of comm / grind / crunch / load phases, punctuated by ckpt (checkpoint) bursts.]

Key observation: a grind (for strong-scaling apps) traverses all of memory.

bent’s super simplistic understanding of hpc sims
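to make the picture concrete, here is a minimal sketch (hypothetical, not from the slides) of that cycle structure: each cycle exchanges halos, grinds over the whole local domain, and periodically checkpoints; the grind is the phase that touches every byte of state, which is why checkpoint size tracks memory size.

    # hypothetical sketch of the cycle/ckpt timeline above (crunch/load elided);
    # for a strong-scaling app the grind sweeps the entire local domain,
    # so every cycle traverses all of this rank's memory
    import numpy as np

    state = np.random.rand(1_000_000)      # stand-in for all of a rank's state

    def comm(s):
        pass                               # halo exchange with neighbors (elided)

    def grind(s):
        # stencil update: reads and writes every element, i.e. all of memory
        s += 0.25 * (np.roll(s, 1) + np.roll(s, -1) - 2 * s)

    def checkpoint(s, cycle):
        # a burst of writes whose size equals the memory footprint
        np.save(f"ckpt_{cycle}.npy", s)

    CKPT_INTERVAL = 100
    for cycle in range(1_000):
        comm(state)
        grind(state)
        if cycle % CKPT_INTERVAL == 0:
            checkpoint(state, cycle)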

SLIDE 4

SLIDE 5
  • one more dollar test
SLIDE 6

from 2 to 4 and back again

bent, settlemyer, grider

SLIDE 7

HPC Storage Stack, 2002-2015

Tightly Coupled Parallel Application
  ↓ concurrent, aligned, interleaved IO
Parallel File System
  ↓ concurrent, unaligned, interspersed IO
Tape Archive

SLIDE 8

(Graph and analysis courtesy of Gary Grider)

whence burst buffer?
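the graph itself is lost in this transcript, but the flavor of the argument can be reproduced with a back-of-envelope calculation (every number below is an illustrative assumption, not a figure from the slide): checkpoint bandwidth must be sized to dump all of memory in a small fraction of the compute interval, and the marginal dollar buys that bandwidth far more cheaply as flash than as disk.

    # back-of-envelope for the burst buffer argument (all numbers are assumptions)
    memory_tb = 2_000          # aggregate machine memory to checkpoint, TB
    ckpt_interval_s = 3_600    # one checkpoint per hour
    io_overhead = 0.05         # spend at most 5% of runtime checkpointing

    needed_bw_gbps = memory_tb * 1_000 / (ckpt_interval_s * io_overhead)  # GB/s
    print(f"required burst bandwidth: {needed_bw_gbps:,.0f} GB/s")

    # assumed marginal cost of *bandwidth*: disk buys capacity cheaply but
    # bandwidth expensively; flash is the reverse
    disk_usd_per_gbps = 5_000
    flash_usd_per_gbps = 500
    print(f"as disk:  ${needed_bw_gbps * disk_usd_per_gbps / 1e6:,.0f}M")
    print(f"as flash: ${needed_bw_gbps * flash_usd_per_gbps / 1e6:,.0f}M")
    # => spend the next dollar on a flash burst tier; keep disk/tape for capacity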

SLIDE 9

HPC Storage Stack, 2002-2015

Tightly Coupled Parallel Application
  ↓ concurrent, aligned, interleaved IO
Parallel File System
  ↓ concurrent, unaligned, interspersed IO
Tape Archive

SLIDE 10

HPC Storage Stack, 2015-2016

Tightly Coupled Parallel Application
  ↓
Burst Buffer
  ↓
Parallel File System
  ↓
Tape Archive

SLIDE 11

Post-2016 HPC cold data requirements → MarFS

whence object?

SLIDE 12

HPC Storage Stack, 2015-2016

Tightly Coupled Parallel Application
  ↓
Burst Buffer
  ↓
Parallel File System
  ↓
Tape Archive

SLIDE 13

HPC Storage Stack, 2016-2020

Tightly Coupled Parallel Application
  ↓
Burst Buffer
  ↓
Parallel File System
  ↓
Object Store
  ↓
Tape Archive

SLIDE 14

four is too many

SLIDE 15

HPC Storage Stack, 2020-

Tightly Coupled Parallel Application
  ↓
Burst Buffer
  ↓
Object Store

SLIDE 16

why not one?

SLIDE 17

SLIDE 18

the number is 2

  • if physical, there may be many
  • if logical, there should be two

human in loop to make difficult decisions

  • one storage system focused on performance, one on durability

durable should be site-wide, perf can be machine-local
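a hypothetical sketch of the two logical tiers (paths and helpers below are invented for illustration): the application lands everything on the performance tier, and a human in the loop decides what migrates to the durable tier and what gets discarded.

    # hypothetical two-tier model: one logical system for performance,
    # one for durability; the hard decisions stay with a human
    import pathlib
    import shutil

    PERF = pathlib.Path("/perf")        # machine-local, fast, not durable (assumed mount)
    DURABLE = pathlib.Path("/durable")  # site-wide, durable, slower (assumed mount)

    def write_hot(name: str, data: bytes) -> None:
        """Hot data always lands on the performance tier first."""
        (PERF / name).write_bytes(data)

    def preserve(name: str) -> None:
        """Human decision: this data should outlive the machine."""
        shutil.copy2(PERF / name, DURABLE / name)

    def discard(name: str) -> None:
        """Human decision: this data is scratch; reclaim perf-tier space."""
        (PERF / name).unlink()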

SLIDE 19

from 2 to 4 and back again

2002-2015: Tightly Coupled Parallel Application → Parallel File System → Tape Archive
2015-2016: Tightly Coupled Parallel Application → Burst Buffer → Parallel File System → Tape Archive
2016-2020: Tightly Coupled Parallel Application → Burst Buffer → Parallel File System → Object Store → Tape Archive
2020-: Tightly Coupled Parallel Application → Burst Buffer → Object Store

SLIDE 20


to share or not to share

a comparison of burst buffer architectures

bent, settlemyer, cao

SLIDE 21


three places to add burst buffers

[Diagram: grid of compute nodes (CN), each with its own node-local burst buffer.]

private, e.g. Cray/Intel Aurora @ Argonne

SLIDE 22


three places to add burst buffers

[Diagram: grid of compute nodes (CN) served by dedicated shared burst buffer nodes.]

shared, e.g. Cray Trinity @ LANL

SLIDE 23


three places to add burst buffers

[Diagram: grid of compute nodes (CN) with burst buffers embedded in the storage system.]

embedded, e.g. Seagate Nytro NXD

SLIDE 24


private

  • pros: no contention; linear scaling; low cost; no network bandwidth
  • cons: coupled failure domain; single shared file is difficult; small jobs cannot use them all

SLIDE 25


shared

  • pros: n-1 easy; data can outlive job; temporary storage if pfs offline; small jobs can use it all; decoupled failure domain; most flexible ratio btwn compute, burst, pfs
  • cons: most expensive; interference possible

SLIDE 26


embedded

  • pros: n-1 easy; data outlives job; small jobs can use it all; decoupled failure domain from app; low cost; most transparent
  • cons: SAN must be provisioned for burst; interference possible

SLIDE 27


the value of decoupled failure domains

Bent, Settlemyer, et al. On the non-suitability of non-volatility. HotStorage ’15.

  • observations:

shared doesn’t need parity; private does ... but then the high private perf is lost

[Figure labels: e.g. RAID6 (10+2); e.g. 2-way replica]

SLIDE 28


the value of shared for bandwidth

Lei Cao, Bradley Settlemyer, and John Bent. To share or not to share: Comparing burst buffer architectures. SpringSim 2017.

Mean Ckpt Bw:
  Local Unreliable: 206.8 GB/s

simulation of APEX workflows running on Trinity

SLIDE 29


the value of shared for bandwidth

Lei Cao, Bradley Settlemyer, and John Bent. To share or not to share: Comparing burst buffer architectures. SpringSim 2017.

Mean Ckpt Bw:
  Local Unreliable: 206.8 GB/s
  Local 20% Parity: 165.6 GB/s

simulation of APEX workflows running on Trinity

SLIDE 30


the value of shared for bandwidth

Lei Cao, Bradley Settlemyer, and John Bent. To share or not to share: Comparing burst buffer architectures. SpringSim 2017.

Mean Ckpt Bw:
  Local Unreliable: 206.8 GB/s
  Local 20% Parity: 165.6 GB/s
  Shared Unreliable: 614.54 GB/s

simulation of APEX workflows running on Trinity

  • observation: capacity machines need shared burst buffers
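the local numbers above are consistent with a simple overhead model (a sanity check added here, not the paper's actual simulation): charging 20% of local writes to parity predicts almost exactly the reported drop, while the shared tier keeps its raw bandwidth because its failure domain is decoupled from the compute nodes.

    # sanity check on the table above (simple model, not the SpringSim simulation)
    local_raw = 206.8                  # GB/s, "Local Unreliable" from the slide
    parity_overhead = 0.20             # 20% of local writes become parity
    local_protected = local_raw * (1 - parity_overhead)
    print(f"predicted protected local: {local_protected:.1f} GB/s")  # ~165.4; slide says 165.6

    # shared needs no protection against losing a compute node, so it keeps
    # its raw bandwidth; the gap is the cost of coupled failure domains
    shared = 614.54
    print(f"shared vs protected local: {shared / local_protected:.1f}x")  # ~3.7x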
SLIDE 31


babblings from bent @dagstuhl

(blame the jet lag)
john.bent@seagategov.com