seagate gov john bent, dagstuhl 2017
3 babblings from bent
dagstuhl 2017
- 1. grind-crunch
- 2. from 2 to 4 and back again
- 3. to share or not to share
grind-crunch
bent
[Figure: timeline diagrams. Labels: cycle, ckpt; comm, grind; crunch, load.]
Key observation: a grind (for strong-scaling apps) traverses all of memory.
from 2 to 4 and back again
bent, settlemyer, grider
HPC Storage Stack, 2002-2015
Tightly Coupled Parallel Application
Parallel File System
Tape Archive
IO labels on the diagram: concurrent, aligned, interleaved IO; concurrent, unaligned, interspersed IO
(Graph and analysis courtesy of Gary Grider)
HPC Storage Stack, 2015-2016
Tightly Coupled Parallel Application
Burst Buffer
Parallel File System
Tape Archive
Post 2016 HPC Cold Data Requirements MarFS
HPC Storage Stack, 2016-2020
Tightly Coupled Parallel Application
Burst Buffer
Parallel File System
Object Store
Tape Archive
HPC Storage Stack, 2020-
Tightly Coupled Parallel Application
Burst Buffer
Object Store
human in loop to make difficult decisions
durable should be site-wide, perf can be machine-local
HPC Storage Stack by era (top to bottom)
2002-2015: Tightly Coupled Parallel Application, Parallel File System, Tape Archive (IO labels: concurrent, aligned, interleaved IO; concurrent, unaligned, interspersed IO)
2015-2016: Tightly Coupled Parallel Application, Burst Buffer, Parallel File System, Tape Archive
2016-2020: Tightly Coupled Parallel Application, Burst Buffer, Parallel File System, Object Store, Tape Archive
2020-: Tightly Coupled Parallel Application, Burst Buffer, Object Store
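The outline's title for this part, "from 2 to 4 and back again", can be read off the stacks above by counting the storage tiers beneath the application in each era. A minimal sketch, assuming the tier lists are as transcribed from the slides; the names `STACKS` and `storage_tiers` are illustrative and not from the talk:

```python
# Storage tiers beneath the application in each era, as listed on the slides.
STACKS = {
    "2002-2015": ["Parallel File System", "Tape Archive"],
    "2015-2016": ["Burst Buffer", "Parallel File System", "Tape Archive"],
    "2016-2020": ["Burst Buffer", "Parallel File System", "Object Store", "Tape Archive"],
    "2020-": ["Burst Buffer", "Object Store"],
}


def storage_tiers(era):
    """Number of storage tiers below the application for the given era."""
    return len(STACKS[era])


# Dicts preserve insertion order, so the eras stay in chronological order.
tier_counts = {era: storage_tiers(era) for era in STACKS}
for era, count in tier_counts.items():
    print(f"{era:>9}: {count} tiers")
```

Under this reading the count goes 2, 3, 4, then back to 2 once the burst buffer and object store replace the parallel file system and tape archive.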
to share or not to share
bent, settlemyer, cao
three places to add burst buffers
[Figure: grid of compute nodes (CN), shown once per burst buffer placement]
- private, e.g. Cray/Intel Aurora @ Argonne
- shared, e.g. Cray Trinity @ LANL
- embedded, e.g. Seagate Nytro NXD
private
- pros: no contention; linear scaling; low cost; no network bandwidth
- cons: coupled failure domain; single shared file is difficult; small jobs cannot use them all
shared
- pros: n-1 easy; data can outlive job; temporary storage if pfs offline; small jobs can use it all; decoupled failure domain; most flexible ratio btwn compute, burst, pfs
- cons: most expensive; interference possible
embedded
- pros: n-1 easy; data outlives job; small jobs can use it all; decoupled failure domain from app; low cost; most transparent
- cons: SAN must be provisioned for burst; interference possible
the value of decoupled failure domains
Bent, Settlemyer, et al. On the non-suitability of non-volatility. HotStorage ’15.
shared doesn’t need parity
private does ... but then the high private perf is lost
parity examples: RAID6 (10+2), 2-way replica
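To make the cost of protecting private burst buffers concrete, the sketch below compares the two example schemes named above. It assumes the usual reading of RAID6 (10+2) as 10 data units plus 2 parity units per stripe, and of a 2-way replica as one extra full copy of the data. The function name `redundancy_overhead` is illustrative and not from the talk.

```python
def redundancy_overhead(data_units, redundant_units):
    """Return (extra bytes written per byte of user data,
    usable fraction of raw capacity) for a protection scheme."""
    extra_writes = redundant_units / data_units
    usable_fraction = data_units / (data_units + redundant_units)
    return extra_writes, usable_fraction


# Assumption: 10+2 means 10 data units and 2 parity units per stripe.
raid6 = redundancy_overhead(10, 2)
# Assumption: a 2-way replica stores one extra copy of every unit.
replica = redundancy_overhead(1, 1)

print(f"RAID6 (10+2):  {raid6[0]:.0%} extra writes, {raid6[1]:.1%} usable")
print(f"2-way replica: {replica[0]:.0%} extra writes, {replica[1]:.1%} usable")
```

Under these assumptions a 10+2 layout adds 20% to every write and a 2-way replica doubles it, which is the kind of overhead that eats into the private configuration's bandwidth advantage.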
the value of shared for bandwidth
Lei Cao, Bradley Settlemyer, and John Bent. To share or not to share: Comparing burst buffer architectures. SpringSim 2017.
                Local Unreliable   Local 20% Parity   Shared Unreliable
Mean Ckpt Bw    206.8 GB/s         165.6 GB/s         614.54 GB/s
simulation of APEX workflows running on Trinity
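The relative effect is easier to see as ratios. A small sketch using only the three mean checkpoint bandwidths reported on the slide; the variable names are illustrative:

```python
# Mean checkpoint bandwidth in GB/s, as reported on the slide.
mean_ckpt_bw = {
    "local unreliable": 206.8,
    "local 20% parity": 165.6,
    "shared unreliable": 614.54,
}

# Express every configuration relative to unprotected local burst buffers.
baseline = mean_ckpt_bw["local unreliable"]
relative = {name: bw / baseline for name, bw in mean_ckpt_bw.items()}

for name, ratio in relative.items():
    print(f"{name:>17}: {ratio:.2f}x local unreliable")
```

In this simulation, adding 20% parity costs the local configuration about 20% of its mean checkpoint bandwidth, while the shared configuration delivers roughly three times the unprotected local figure.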
(blame the jet lag) john.bent@seagategov.com