

SLIDE 1

Damaris: Using Dedicated I/O Cores for Scalable Post-petascale HPC Simulations

Matthieu Dorier
ENS Cachan Brittany extension
matthieu.dorier@eleves.bretagne.ens-cachan.fr
Advised by Gabriel Antoniu
SRC

SLIDE 2

Context: HPC simulations on Blue Waters

• INRIA/UIUC Joint Lab for Petascale Computing
• Targeting large-scale simulations of unprecedented accuracy
• Our concern: I/O performance scalability


SLIDE 4

Motivation: data management in HPC

• Problem:
  • All processes enter their I/O phase at the same time
  • File system contention: lack of scalability
  • High I/O overhead, high performance variability

[Figure: ~100,000 compute processes funneling data through ~10,000 I/O processes to ~100 data servers holding petabytes]
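To make the contention pattern concrete, here is a minimal MPI-IO sketch (not from the talk; the file name and buffer size are illustrative) of the conventional approach the slide criticizes: every rank blocks and writes at the same instant, so all of them hit the file system together.

```c
/* Sketch of the synchronous I/O phase that causes contention and jitter.
 * With ~100,000 ranks hitting ~100 data servers at once, the file system
 * becomes the bottleneck. Compile with an MPI compiler, e.g. mpicc. */
#include <mpi.h>
#include <stdlib.h>

#define N 1048576  /* local doubles per rank (illustrative size) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *field = malloc(N * sizeof(double));
    /* ... simulation computes `field` ... */

    /* Every rank stops computing here and writes simultaneously. */
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_Offset offset = (MPI_Offset)rank * N * sizeof(double);
    MPI_File_write_at_all(fh, offset, field, N, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    free(field);
    MPI_Finalize();
    return 0;
}
```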
SLIDE 5

I/O variability: an example

• CM1 tornado simulation: 672 processes sorted by write time
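A plot like this can be produced by timing each rank's write phase and gathering the durations on one rank. The sketch below is an assumption of how such measurements could be collected, not CM1's actual instrumentation.

```c
/* Sketch (assumption, not CM1 code): time each rank's write phase and
 * gather the durations on rank 0, which is how a "processes sorted by
 * write time" plot can be produced. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

void report_write_times(MPI_Comm comm, double write_seconds)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    double *all = NULL;
    if (rank == 0) all = malloc(size * sizeof(double));
    MPI_Gather(&write_seconds, 1, MPI_DOUBLE,
               all, 1, MPI_DOUBLE, 0, comm);

    if (rank == 0) {
        qsort(all, size, sizeof(double), cmp_double);
        printf("min %.3f s, median %.3f s, max %.3f s\n",
               all[0], all[size / 2], all[size - 1]);
        free(all);
    }
}

/* Usage in the simulation loop:
 *   double t0 = MPI_Wtime();
 *   do_write();                       // the I/O phase being measured
 *   report_write_times(MPI_COMM_WORLD, MPI_Wtime() - t0);
 * A large max/min spread is exactly the variability shown on this slide. */
```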

SLIDE 6

The Damaris approach: dedicated I/O cores

Leave a core, go faster!

• Use the SMP node’s intra-node shared memory
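The general pattern can be sketched as follows. Damaris manages its own shared-memory segments internally; this sketch uses standard MPI-3 calls (and illustrative names) purely to convey the idea of one dedicated core per node plus an intra-node shared buffer, not Damaris's actual implementation.

```c
/* Sketch of the dedicated-core pattern: one core per SMP node is set
 * aside for I/O, and compute cores hand it data through intra-node
 * shared memory instead of writing to the file system themselves. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* One communicator per SMP node. */
    MPI_Comm node;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node);
    int node_rank;
    MPI_Comm_rank(node, &node_rank);

    /* Convention (assumption): local rank 0 is the dedicated I/O core,
     * the remaining cores run the simulation. */
    int is_io_core = (node_rank == 0);

    /* A shared-memory window visible to every core of the node: compute
     * cores copy their data in; the I/O core locates each region with
     * MPI_Win_shared_query and writes it out while computation resumes. */
    MPI_Win win;
    double *buf;
    MPI_Aint local_bytes = is_io_core ? 0 : 1048576 * sizeof(double);
    MPI_Win_allocate_shared(local_bytes, sizeof(double), MPI_INFO_NULL,
                            node, &buf, &win);

    /* ... simulation loop: compute cores fill `buf`, signal the I/O
     *     core, and immediately go back to computing ... */

    MPI_Win_free(&win);
    MPI_Comm_free(&node);
    MPI_Finalize();
    return 0;
}
```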

SLIDE 7

Integration with the CM1 tornado simulation

• Less than an hour to write an I/O backend with Damaris (see the sketch below)
• The I/O core spends 25% of its time writing → 75% spare time!
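For illustration, a minimal client-side sketch of such a backend, modeled on the C API of later Damaris releases (damaris_initialize, damaris_start, damaris_write, damaris_end_iteration); the variable name, array size, and configuration file are hypothetical. Variable layouts are declared in the XML configuration rather than in code, which is what keeps the integration effort so low.

```c
/* Sketch of a simulation instrumented with Damaris (hedged: based on the
 * C client API of later Damaris releases; names are illustrative). */
#include <mpi.h>
#include <Damaris.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    damaris_initialize("cm1.xml", MPI_COMM_WORLD);

    int is_client;
    damaris_start(&is_client);   /* dedicated cores serve inside this call */
    if (is_client) {
        MPI_Comm comm;
        damaris_client_comm_get(&comm);  /* communicator without I/O cores */

        for (int step = 0; step < 100; step++) {
            /* ... compute one iteration over the local array ... */
            double pressure[64];
            damaris_write("pressure", pressure);  /* copy to shared memory */
            damaris_end_iteration();  /* I/O cores take over; compute resumes */
        }
        damaris_stop();
    }

    damaris_finalize();
    MPI_Finalize();
    return 0;
}
```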

How to use the spare time?

• Custom plugin system (see the plugin sketch below):
  • Data post-processing, indexing, analysis
  • End-to-end scientific process
  • Connect visualization/analysis tools → inline visualization
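A hedged sketch of such a plugin: the function signature and XML snippet follow the plugin examples distributed with later Damaris releases, and the library and action names are hypothetical. The dedicated core invokes the action during its spare time, so post-processing costs the compute cores nothing.

```c
/* Sketch of a Damaris plugin, invoked on the dedicated I/O core.
 * Declared in the XML configuration, e.g. (assumed syntax):
 *   <actions>
 *     <event name="end_of_iteration" action="compress_data"
 *            library="libmyplugin.so" scope="group"/>
 *   </actions>
 */
#include <stdint.h>
#include <stdio.h>

void compress_data(const char *event, int32_t source,
                   int32_t iteration, const char *args)
{
    /* Runs on the I/O core, not on the compute cores. The spare time
     * could be used here to compress, index, or hand the data to a
     * visualization backend before writing it out. */
    printf("[io-core] event %s from source %d at iteration %d (args: %s)\n",
           event, source, iteration, args ? args : "");
}
```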


SLIDE 9

Results with the CM1 tornado simulation

• On Grid’5000 (French national testbed; 24 cores/node, 672 cores), with PVFS, compared with collective I/O:
  • Collective I/O carries communication overhead → dedicating a core is more efficient
  • No synchronization between processes
  • 6x higher write throughput
• On BluePrint (Power5 Blue Waters interim system at NCSA; 16 cores/node, 1024 cores), with GPFS, compared with the file-per-process approach:
  • On 64 nodes → 64 files instead of 1024
• Overall benefits:
  • Spare time usage
  • Data layout adaptation for subsequent analysis
  • Overhead-free compression (ratio of about 600%)
  • No more I/O jitter


SLIDE 11

Conclusion

• Damaris: dedicated I/O cores in multicore SMP nodes
  1. Better I/O and overall performance
  2. No more variability in write phases
  3. Easy integration and configuration
• Targeting Blue Waters and future post-petascale machines
• Very promising prospects in many directions:
  • Integration with other simulations: Enzo (AMR), GTC, …
  • Leverage spare time for efficient inline visualization
  • Data-aware self-configuration, scheduled data movements, multi-simulation coupling
• http://damaris.gforge.inria.fr


Thank you, questions?