SLIDE 1

Automatic and Transparent I/O Optimization With Storage Integrated Runtime Support

Noah Watkins, Carlos Maltzahn (UC Santa Cruz); Zhihao Jia, Alex Aiken (Stanford); Galen Shipman, Pat McCormick (LANL)

SLIDE 2

What is this talk about?

  • Convince you that storage should be integrated into [HPC] application execution models

[Diagram: multiple applications atop a database engine that manages their data; an Add-Index() operation restructures the stored data inside the engine]

SLIDE 3

Application and System Development

[Diagram: science applications pushing bits through the POSIX interface toward the storage system]

  • Isolated development
  • Maximize FLOPS
  • Checkpoint / Restart
SLIDE 4

Application and System Development

[Diagram: science applications pushing bits through the POSIX interface toward the storage system]

  • Isolated development
  • Maximize BW/Latency
  • File system interface
SLIDE 5

Conflict of Interest in System Development

[Diagram: science applications and the storage system meeting at the POSIX I/O interface]

Application development:
  • Isolated development
  • Maximize FLOPS
  • Checkpoint / Restart

Storage system development:
  • Isolated development
  • Maximize BW/Latency
  • File system interface
SLIDE 6

Abstractions hide important parameters

[Diagram: a science application issues pwrite(fd, data, 1.5MB, 1MB) through the POSIX I/O interface]

  • unaligned write
  • update multiple blocks
  • locking protocols

Hidden from the storage system: application intent and the data model.
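
To make the mismatch concrete, here is a minimal C++ sketch (the block size and write parameters come from the slide's pwrite example; the arithmetic and helper code are ours) of what a block-based storage system must reconstruct from that single call: which blocks are touched, and which of them turn into read-modify-writes.

    // What pwrite(fd, data, 1.5MB, 1MB) looks like to a block-based
    // storage system with 1 MB blocks. Illustrative sketch only.
    #include <cstdint>
    #include <cstdio>

    int main() {
        const uint64_t kBlock = 1ULL << 20;   // 1 MB block size
        const uint64_t offset = 1ULL << 20;   // write offset: 1 MB
        const uint64_t length = 3ULL << 19;   // write length: 1.5 MB

        uint64_t first = offset / kBlock;                 // first block touched
        uint64_t last  = (offset + length - 1) / kBlock;  // last block touched
        for (uint64_t b = first; b <= last; ++b) {
            uint64_t lo = (b == first) ? offset % kBlock : 0;
            uint64_t hi = (b == last) ? (offset + length - 1) % kBlock + 1 : kBlock;
            bool partial = (lo != 0) || (hi != kBlock);
            // A partial block forces a read-modify-write (and, on a shared
            // file system, a locking protocol) the caller never asked for.
            std::printf("block %llu: bytes [%llu, %llu) %s\n",
                        (unsigned long long)b, (unsigned long long)lo,
                        (unsigned long long)hi,
                        partial ? "partial -> read-modify-write" : "full overwrite");
        }
        return 0;
    }

None of this tells the system why the application wrote 1.5 MB at that offset; the intent and the data model are gone by the time the request arrives.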

SLIDE 7

Inflexible applications cannot adapt

[Diagram: App1 and App2 issue write(...) calls concurrently; at t0 they contend for the same storage, and by t1 one application blocks behind the other]

Hidden from the application: storage system state and configuration.

SLIDE 8

Communicating application requirements

[Diagram: an application communicates with a database engine via SQL; an HPC application has only question marks between it, its runtime, and its data]

  • Database engines use SQL to communicate declarative requirements
  • HPC applications are entirely different and require different mechanisms
SLIDE 9

I/O Middleware Stacks

[Diagram: I/O middleware stack above POSIX; Array I/O provides a common data model, with collective I/O, data sieving, hints, and I/O pattern transforms in the layers below]

SLIDE 10

Remainder of the talk

  • Illustrate challenges for existing I/O stacks and application design
  • Describe our work integrating storage into the Legion runtime
  • Present preliminary results

SLIDE 11

Motivating Example

  • Heterogeneous memory hierarchy
    ○ Multiple tiers and networks
  • Adaptive mesh refinement (AMR)
    ○ Resolution-aware I/O
  • Workflow systems and in-transit processing
    ○ Data rendezvous
  • Out-of-core algorithms
  • Data management challenges
    ○ Metadata
    ○ Consistency
  • Independent I/O

SLIDE 12

Motivating Example: Independent I/O

[Diagram: a computation phase on System A overlapped with independent I/O tasks against the storage system]

SLIDE 13

Motivating Example: Independent I/O

[Diagram: a computation phase on System A overlapped with independent I/O tasks against the storage system]

    compute();
    a = async_io();
    compute();
    wait(a);
    compute(a);
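
Here is a minimal sketch of the same pattern using POSIX AIO (the file name and buffer size are illustrative): the second compute() runs while the write is in flight, and aio_suspend() plays the role of wait(a).

    // Compute/I-O overlap with POSIX AIO; link with -lrt on Linux.
    #include <aio.h>
    #include <fcntl.h>
    #include <cstdio>
    #include <cstring>

    static void compute() { /* stand-in for a timestep of computation */ }

    int main() {
        int fd = open("checkpoint.dat", O_CREAT | O_WRONLY, 0644);
        if (fd < 0) { perror("open"); return 1; }

        static char buf[1 << 20];             // 1 MB of state to persist
        std::memset(buf, 0x42, sizeof(buf));

        struct aiocb cb;
        std::memset(&cb, 0, sizeof(cb));
        cb.aio_fildes = fd;
        cb.aio_buf    = buf;
        cb.aio_nbytes = sizeof(buf);
        cb.aio_offset = 0;

        compute();                            // compute();
        if (aio_write(&cb) != 0) {            // a = async_io();
            perror("aio_write"); return 1;
        }
        compute();                            // compute(); overlapped with I/O

        const struct aiocb* list[1] = { &cb };
        aio_suspend(list, 1, nullptr);        // wait(a);
        if (aio_return(&cb) < 0) { perror("aio_return"); return 1; }
        compute();                            // compute(a); after I/O completes
        return 0;
    }

Note that the application hard-codes when to start and when to wait on I/O; as the following slides show, the right schedule depends on the system it runs on.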

SLIDE 14

Independent I/O: Consistency Challenges

[Diagram: computation and independent I/O on System A across Timesteps 1-3]

SLIDE 15

Independent I/O: Consistency Challenges

[Diagram: computation and independent I/O on System A across Timesteps 1-3]

  TS   State          Done
  1    Red (T1)       Yes
  2    Green (Xfer)   No

SLIDE 16

Independent I/O: Consistency Challenges

[Diagram: computation and independent I/O on System A across Timesteps 1-3]

  TS   State                     Done
  1    Red (T1)                  Yes
  2    Green (T1), Blue (Xfer)   No
  3    Yellow (T2)               Yes

SLIDE 17

Independent I/O: Consistency Challenges

[Diagram: computation and independent I/O on System A across Timesteps 1-3]

  TS   State                     Done
  1    Red (T1)                  Yes
  2    Green (T1), Blue (Xfer)   No
  3    Yellow (T2)               Yes
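
A hedged sketch of the bookkeeping this table implies (the types and names are ours, not the runtime's): a timestep counts as Done only when every instance it produced has left the Xfer state, so a restart can never observe a half-written timestep.

    // Bookkeeping behind the TS/State/Done table; illustrative only.
    #include <map>
    #include <string>
    #include <vector>

    enum class XferState { InFlight, Durable };  // "(Xfer)" vs written out

    struct InstanceRecord {
        std::string name;   // e.g. "Green (T1)"
        XferState state;
    };

    using TimestepLog = std::map<int, std::vector<InstanceRecord>>;

    // Done = every instance produced by the timestep is durable.
    bool done(const TimestepLog& log, int ts) {
        auto it = log.find(ts);
        if (it == log.end()) return false;
        for (const InstanceRecord& rec : it->second)
            if (rec.state != XferState::Durable) return false;
        return true;
    }

    int main() {
        TimestepLog log;
        log[1] = {{"Red (T1)",    XferState::Durable}};   // Done: Yes
        log[2] = {{"Green (T1)",  XferState::Durable},
                  {"Blue (Xfer)", XferState::InFlight}};  // Done: No
        log[3] = {{"Yellow (T2)", XferState::Durable}};   // Done: Yes
        return (done(log, 1) && !done(log, 2) && done(log, 3)) ? 0 : 1;
    }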

SLIDE 18

Independent I/O: Portability

[Diagram: computation and independent I/O as mapped onto System A]

SLIDE 19

Independent I/O: Portability

[Diagram: the same application mapped onto System B, which exposes a different balance of computation and I/O resources]

SLIDE 20

Independent I/O: Portability

[Diagram: the same application mapped onto System B, which exposes a different balance of computation and I/O resources]

    compute();
    a = async_io();
    compute();
    wait(a);
    compute(a);

SLIDE 21

Melding I/O and Application Semantics

[Diagram: application, runtime, and I/O system exchanging intent]

  1. Data model
  2. Memory model
  3. Data dependence

SLIDE 22

Melding I/O and Application Semantics

[Diagram: application, runtime, and I/O system exchanging intent]

  1. Data model
  2. Memory model
  3. Data dependence

SLIDE 23

Legion Programming Model and Runtime

[Diagram: the Legion runtime mapped onto a machine model with GASNet memory, GPU RAM, and zero-copy RAM]

  • Prototype built in Legion
    ○ Parallel, data-centric, task-based
  • Logical region data model
    ○ Does not commit to a physical layout
  • Memory hierarchy
    ○ Unified model across memory types
  • Data dependencies extracted from the application
    ○ Managed by the runtime
    ○ Enable optimizations

“Legion: Expressing Locality and Independence with Logical Regions”, Michael Bauer, Sean Treichler, Elliott Slaughter, Alex Aiken, SC ’12
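
For readers new to Legion, here is a minimal sketch of the logical region data model, following the public Legion tutorials (exact API details vary across Legion versions): a region names an index space and its fields but commits to no physical layout, which is what later allows a persistent memory to stand in for a volatile one.

    // Logical regions in Legion, after the public tutorials; API details
    // vary across Legion versions.
    #include "legion.h"
    using namespace Legion;

    enum TaskIDs  { TOP_LEVEL_TASK_ID };
    enum FieldIDs { FID_TEMPERATURE };

    void top_level_task(const Task* task,
                        const std::vector<PhysicalRegion>& regions,
                        Context ctx, Runtime* runtime) {
        // An index space names the elements; a field space names the fields.
        IndexSpace is = runtime->create_index_space(ctx, Rect<1>(0, 1023));
        FieldSpace fs = runtime->create_field_space(ctx);
        FieldAllocator alloc = runtime->create_field_allocator(ctx, fs);
        alloc.allocate_field(sizeof(double), FID_TEMPERATURE);

        // The logical region commits to no layout or placement; the
        // runtime chooses both when it maps physical instances.
        LogicalRegion lr = runtime->create_logical_region(ctx, is, fs);

        runtime->destroy_logical_region(ctx, lr);
        runtime->destroy_field_space(ctx, fs);
        runtime->destroy_index_space(ctx, is);
    }

    int main(int argc, char** argv) {
        Runtime::set_top_level_task_id(TOP_LEVEL_TASK_ID);
        TaskVariantRegistrar registrar(TOP_LEVEL_TASK_ID, "top_level");
        registrar.add_constraint(ProcessorConstraint(Processor::LOC_PROC));
        Runtime::preregister_task_variant<top_level_task>(registrar, "top_level");
        return Runtime::start(argc, argv);
    }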

SLIDE 24

Legion and Persistent Memory Integration

[Diagram: the Legion machine model extended with persistent memories backed by HDF5 and RADOS]

  • Our work introduces persistent memory into Legion
  • HDF5 and RADOS targets
  • Legion tracks persistent instances like any other memory
  • Persistence is transparent to the application
  • Integrated with dependence tracking and coherence control
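
To show the shape of the integration, here is a deliberately hypothetical sketch; these types and calls are ours, not the Legion API. The point is the design: the application names a logical region, and whether its instance materializes in GASNet memory, an HDF5 file, or a RADOS pool is a mapping decision rather than application code.

    // Hypothetical sketch of "persistence as just another memory".
    // NOT the Legion API; all names are illustrative.
    #include <memory>
    #include <string>

    struct LogicalRegionHandle { std::string name; };

    struct MemoryTarget {
        virtual ~MemoryTarget() = default;
        // Materialize a physical instance of the region in this memory.
        virtual void make_instance(const LogicalRegionHandle& lr) = 0;
    };

    struct GasnetMemory : MemoryTarget {
        void make_instance(const LogicalRegionHandle&) override { /* DRAM */ }
    };

    struct Hdf5Memory : MemoryTarget {      // persistent target #1
        explicit Hdf5Memory(std::string path) : path_(std::move(path)) {}
        void make_instance(const LogicalRegionHandle&) override { /* file */ }
        std::string path_;
    };

    struct RadosMemory : MemoryTarget {     // persistent target #2
        explicit RadosMemory(std::string pool) : pool_(std::move(pool)) {}
        void make_instance(const LogicalRegionHandle&) override { /* objects */ }
        std::string pool_;
    };

    int main() {
        LogicalRegionHandle state{"simulation_state"};
        // Identical application code in all three cases; only the mapping
        // target differs, so dependence tracking and coherence control
        // apply to persistent instances exactly as to volatile ones.
        std::unique_ptr<MemoryTarget> target =
            std::make_unique<Hdf5Memory>("checkpoint.h5");
        target->make_instance(state);
        return 0;
    }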

SLIDE 25

Preliminary Results: Microbenchmark

[Diagram: microbenchmark setup; the Legion runtime drives an HDF5 (file system) backend and a librados (object) backend]

SLIDE 26

Preliminary Results: Microbenchmark

[Diagram: checkpoint flowing from the Legion runtime to the HDF5 (FS) and librados (object) backends]

SLIDE 27

Preliminary Results: Microbenchmark

[Diagram: restart flowing from the HDF5 (FS) and librados (object) backends back through the Legion runtime]

SLIDE 28

Preliminary Results: Optimizations

[Diagram: the Legion runtime and the HDF5 (FS) and librados (object) backends, annotated with the optimizations below]

Optimizations

  • Sharding
  • Independent I/O (sketched below)
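
A small sketch of the sharding optimization (the shard count and sizes match the evaluation slides; the helper code is ours): state is split into fixed-size shards so each can be checkpointed as an independent I/O, with no global barrier before any one shard becomes durable.

    // 14 GB of state split into 256 shards of 56 MB each; every
    // (offset, length) extent is a self-contained, independent write.
    #include <cstdint>
    #include <cstdio>

    int main() {
        const uint64_t kTotal  = 14ULL << 30;        // 14 GB total state
        const int      kShards = 256;
        const uint64_t kShard  = kTotal / kShards;   // 56 MB per shard

        for (int s = 0; s < 2; ++s) {                // print two as a check
            uint64_t offset = uint64_t(s) * kShard;
            std::printf("shard %d -> bytes [%llu, %llu)\n", s,
                        (unsigned long long)offset,
                        (unsigned long long)(offset + kShard));
        }
        return 0;
    }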
SLIDE 29

Preliminary Results: Weak Scaling

[Plots: weak-scaling read and write throughput on Lustre with HDF5; comparison points include MPI-IO with caching and N-N/N-1 access patterns]

  • Application state partitioned into 256 shards
  • Scaled from 4 GB to 32 GB across 2 to 16 nodes
  • Compared throughput against IOR using N-1, HDF5, and MPI-IO configurations
SLIDE 30

Preliminary Results: Weak Scaling

[Plot: weak-scaling read/write throughput against the RADOS target]

  • Application state partitioned into 256 shards
  • Scaled from 4 GB to 32 GB across 2 to 10 nodes
  • Transparent integration with non-POSIX backends
SLIDE 31

Checkpoint without global barrier

[Plots: checkpoint behavior for Legion with Lustre/HDF5 (shared storage) and with the RADOS target (dedicated storage)]

  • Application state partitioned into 256 shards
  • 14 GB data set size (56 MB shards), fixed set of 12 nodes
  • Tracked read-write phases for each shard
SLIDE 32
Preliminary Results: Strong Scaling

  • Application state partitioned into 256 shards
  • Total application state size 14 GB
  • Scaled from 4 to 32 nodes (Lustre), 2 to 12 nodes (RADOS)

[Plots: strong scaling for Lustre/HDF5 and the RADOS target; annotations cite OSD cache and journaling, limited DMA threads, and caching/noise effects]

SLIDE 33

Conclusion

  • Memory hierarchies are becoming complex!
  • We cannot continue to just evolve applications
  • Storage should be integrated into application execution models
    ○ Hard-coding optimizations is bad
    ○ It restricts flexibility and portability
  • The Legion runtime and programming model supports pluggable memories
  • We integrate persistent storage as a memory
  • Initial results show the feasibility of the system design
  • This enables a wide range of transparent optimizations
  • Questions?
    ○ Noah Watkins (jayhawk@soe.ucsc.edu)