Automatic and Transparent I/O Optimization With Storage Integrated Runtime Support
Noah Watkins Carlos Maltzahn Zhihao Jia Alex Aiken Galen Shipman Pat McCormick UC Santa Cruz Stanford LANL
What is this talk about? Convince you that
execution models
2
[Figure labels: Application (x3), Database Engine, Data; operation: Add Index via Add-Index()]
3
[Figure labels: SCIENCE, bits, POSIX]
5
[Figure labels: SCIENCE, bits, POSIX I/O]
6
[Figure labels: SCIENCE, bits, POSIX I/O]
pwrite(fd, data, 1.5MB, 1MB)
application intent, data model
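The pwrite call above is all the storage system sees: a descriptor, a buffer, a length, and an offset. The following is a minimal sketch of that loss of intent, assuming a hypothetical 4x8 grid of float64 values (the grid, the helper names, and the sizes are illustrative and not from the talk). It relies on os.pwrite and os.pread, which are available on Unix-like systems.

```python
import os
import struct
import tempfile

# Hypothetical application data: a small 2D grid of float64 values.
ROWS, COLS = 4, 8
grid = [[float(r * COLS + c) for c in range(COLS)] for r in range(ROWS)]

def row_to_bytes(row):
    """Flatten one row of floats into raw little-endian bytes."""
    return struct.pack(f"<{len(row)}d", *row)

def write_row(fd, row_index, row):
    """Write one row at its byte offset; the kernel sees only (fd, bytes, offset)."""
    data = row_to_bytes(row)
    offset = row_index * COLS * 8  # 8 bytes per float64
    return os.pwrite(fd, data, offset)

def read_row(fd, row_index):
    """Read one row back; the caller must re-supply the layout knowledge."""
    raw = os.pread(fd, COLS * 8, row_index * COLS * 8)
    return list(struct.unpack(f"<{COLS}d", raw))

with tempfile.TemporaryFile() as f:
    fd = f.fileno()
    written = [write_row(fd, i, row) for i, row in enumerate(grid)]
    restored = [read_row(fd, i) for i in range(ROWS)]
```

Nothing in the bytes or offsets records that the data is a grid, which rows belong together, or why the application wrote them; that knowledge lives only in the helper functions.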
7
[Figure labels: SCIENCE, bits, POSIX I/O]
[Figure: App1 and App2 issue write(...) calls on a timeline (t0, t1); labels: contention, blocking]
storage system state, configuration
8
[Figure labels: Application, SQL, Database Engine, Data; HPC Application, ???, ??? (Runtime), Data]
9
[Figure labels: bits, POSIX]
Array I/O common data model
Collective I/O Data Sieving Hints I/O Pattern Transform Hints
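Data sieving, one of the optimizations named above, can be sketched in a few lines: rather than issuing one request per small noncontiguous piece, read a single contiguous window that covers all of them and extract the pieces in memory. This is an illustrative sketch only; the function name and the request list are hypothetical, and real MPI-IO implementations add buffer-size limits and write-side locking that are omitted here.

```python
import io

def sieved_read(f, requests):
    """Serve many small (offset, length) reads with one large contiguous read."""
    if not requests:
        return []
    start = min(off for off, _ in requests)
    end = max(off + length for off, length in requests)
    f.seek(start)
    window = f.read(end - start)  # single I/O request covering all pieces
    # Extract each requested piece from the in-memory window.
    return [window[off - start: off - start + length] for off, length in requests]

data = bytes(range(100))
f = io.BytesIO(data)
requests = [(10, 4), (30, 2), (50, 3)]  # hypothetical noncontiguous accesses
pieces = sieved_read(f, requests)
```

The trade-off is visible in the numbers: one request replaces three, but 43 bytes are read to obtain 9 useful ones, so the technique pays off only when the gaps between pieces are small relative to per-request cost.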
10
○ Multiple tiers and networks
○ Resolution-aware I/O
○ Data rendezvous
○ Metadata
○ Consistency
11
12
[Figure labels: Computation, I/O, bits, System A]
14
[Figure labels: Computation, I/O, bits, System A]
Timestep 1 Timestep 2 Timestep 3
15
[Figure labels: Computation, I/O, bits, System A]
Timestep 1 Timestep 2 Timestep 3

TS  State         Done
1   Red (T1)      Yes
2   Green (Xfer)  No
16
[Figure labels: Computation, I/O, bits, System A]
Timestep 1 Timestep 2 Timestep 3

TS  State                    Done
1   Red (T1)                 Yes
2   Green (T1), Blue (Xfer)  No
3   Yellow (T2)              Yes
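One way to read the timestep table above is as bookkeeping for I/O that proceeds while computation continues. The sketch below is an interpretation under that assumption, not the authors' implementation: compute, write_output, and the in-memory store are hypothetical stand-ins, and a thread pool plays the role of the storage path.

```python
from concurrent.futures import ThreadPoolExecutor

def compute(step):
    """Stand-in for one simulation timestep (hypothetical)."""
    return [step * i for i in range(5)]

def write_output(store, step, data):
    """Stand-in for a storage write; here it just copies into a dict."""
    store[step] = list(data)
    return step

store = {}   # pretend storage target
status = {}  # timestep -> Future tracking whether its I/O is done

with ThreadPoolExecutor(max_workers=2) as pool:
    for step in (1, 2, 3):
        data = compute(step)
        # Hand the write to the pool and continue computing immediately.
        status[step] = pool.submit(write_output, store, step, data)
# Leaving the `with` block waits for all outstanding writes.

done = {step: fut.done() for step, fut in status.items()}
```

While the loop is running, status plays the role of the Done column: a timestep whose future has not completed is still in transfer.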
18
[Figure labels: Computation, I/O, bits, System A]
19
[Figure labels: Computation, I/O, bits, System B]
20
[Figure labels: Computation, I/O, bits, System B]
compute();
21
[Figure labels: SCIENCE, bits, Model, Dependence, Intent, I/O, Application, Runtime]
22
[Figure labels: SCIENCE, bits, Model, Dependence, Intent, I/O]
23
[Figure labels: Legion Runtime, Machine, GASNet Memory, GPU RAM, ZeroCopy RAM]
○ Parallel, data-centric, task-based
○ Do not commit to physical layout
○ Unified model across memory types
application
○ Managed by runtime
○ Optimizations
“Legion: Expressing Locality and Independence with Logical Regions”, Michael Bauer, Sean Treichler, Elliott Slaughter, Alex Aiken, SC 12
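The idea of not committing to a physical layout can be illustrated with a toy example: the same task runs unchanged against two different physical instances of one logical collection. This is not the Legion API (Legion is a C++ runtime); every class and function name below is hypothetical and exists only to show the separation between logical access and physical storage.

```python
class AoSInstance:
    """Array-of-structs layout: one record per element."""
    def __init__(self, n, fields):
        self.rows = [{name: 0.0 for name in fields} for _ in range(n)]

    def read(self, i, field):
        return self.rows[i][field]

    def write(self, i, field, value):
        self.rows[i][field] = value

class SoAInstance:
    """Struct-of-arrays layout: one contiguous list per field."""
    def __init__(self, n, fields):
        self.cols = {name: [0.0] * n for name in fields}

    def read(self, i, field):
        return self.cols[field][i]

    def write(self, i, field, value):
        self.cols[field][i] = value

def scale_task(instance, n, src, dst, factor):
    """A task written against logical (index, field) accesses only."""
    for i in range(n):
        instance.write(i, dst, factor * instance.read(i, src))

def run(instance_cls, n=4):
    inst = instance_cls(n, ("x", "y"))
    for i in range(n):
        inst.write(i, "x", float(i))
    scale_task(inst, n, "x", "y", 2.0)
    return [inst.read(i, "y") for i in range(n)]

aos_result = run(AoSInstance)
soa_result = run(SoAInstance)
```

Because scale_task never names a layout, whoever picks the instance class (in the talk's setting, the runtime) is free to choose the one that suits the target memory.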
24
[Figure labels: Legion Runtime, Machine, GASNet Memory, GPU RAM, ZeroCopy RAM]
memory into Legion
application
and coherence control
HDF5 RADOS
25
HDF5 (FS) librados (Object) Legion Runtime
26
HDF5 (FS) librados (Object) Legion Runtime checkpoint
27
HDF5 (FS) librados (Object) Legion Runtime restart
28
HDF5 (FS) librados (Object) Legion Runtime
29
Lustre, HDF5 (Read) Lustre, HDF5 (Write)
MPI-IO, Caching N-N N-1
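The N-N and N-1 labels refer to two common parallel file access patterns: N processes writing N separate files, or N processes writing disjoint ranges of one shared file. A minimal sketch of how each rank's target would be chosen, assuming equal-sized blocks and hypothetical file names:

```python
def n_to_n_target(rank, block_size, prefix="ckpt"):
    """N-N: each rank writes its own file, starting at offset 0."""
    return (f"{prefix}.{rank}", 0)

def n_to_1_target(rank, block_size, prefix="ckpt"):
    """N-1: all ranks share one file; each owns a disjoint byte range."""
    return (f"{prefix}.shared", rank * block_size)

NUM_RANKS, BLOCK = 4, 1024  # hypothetical job size and per-rank block size
nn = [n_to_n_target(r, BLOCK) for r in range(NUM_RANKS)]
n1 = [n_to_1_target(r, BLOCK) for r in range(NUM_RANKS)]
```

N-N avoids sharing a file between writers but leaves N files to manage; N-1 yields a single file but concentrates all writers on it.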
30
RADOS Target (R/W)
31
Legion, Lustre, HDF5 Legion, RADOS Target
shared storage dedicated storage
32
Lustre, HDF5 RADOS Target
OSD Cache Journaling Limited DMA Threads Caching, Noise
○ Hard-coding optimizations is bad
○ Restricts flexibility and portability
○ Noah Watkins (jayhawk@soe.ucsc.edu)
33