SLIDE 25 Scalable Parallel I/O Alternatives
Summary and Future Work
- We examined several parallel I/O approaches:
– 1 POSIX File per Proc: < 1 GB/sec on BG/L
– PMPIO: < 1 GB/sec on BG/L
– syncIO – all processors write as groups to different files
- BG/L: 6.6 GB/sec read, 1.3 GB/sec write
- BG/P: 11.6 GB/sec read, 25 GB/sec write
– rbIO – gives up 3 to 6% of compute nodes to hide latency of blocking parallel I/O.
- BG/L: 2.3 GB/sec actual write, 22 TB/sec perceived write
- BG/P: ~18 GB/sec actual write, ~22 TB/sec perceived write
- Good trade-off on Blue Gene
- All procs to 1 file does not yield good performance even if aligned.
- Performance “sweet spot” for syncIO depends significantly on the I/O
architecture, so the file format must be tuned accordingly
– BG/L @ CCNI has a metadata bottleneck and must adjust # of files accordingly – e.g., 32 to 128 writers
– BG/P @ ALCF can sustain much higher performance, but requires more files – e.g., 1024 writers
– Suggests that collective I/O is sensitive to underlying file system performance.
- For rbIO, we observed that 1024 writers gave the best performance so far
on both the BG/L and BG/P platforms.
- Future Work – impact on performance of different filesystems
– Leverage Darshan logs @ ALCF to better understand Intrepid performance
– More experiments on Blue Gene/P under PVFS, Cray XT5 under Lustre
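
To make the grouping ideas in the summary concrete, here is a minimal sketch of how ranks might be mapped to files (syncIO) and how a fraction of ranks might be set aside as dedicated writers (rbIO). It is an illustration only, not the implementation behind the numbers on this slide. The function names, the contiguous equal-sized grouping, and the choice of the first rank in each group as the writer are all assumptions.

```python
from typing import Tuple


def syncio_file_index(rank: int, num_ranks: int, num_files: int) -> int:
    """Return the index of the file a rank writes to (hypothetical scheme).

    Assumes contiguous, equal-sized groups of ranks, one file per group.
    """
    if num_files <= 0 or num_ranks % num_files != 0:
        raise ValueError("num_ranks must be a positive multiple of num_files")
    group_size = num_ranks // num_files
    return rank // group_size


def rbio_role(rank: int, group_size: int) -> Tuple[str, int]:
    """Return (role, writer_rank) for a rank (hypothetical scheme).

    Assumes the first rank of each group is the dedicated writer and the
    remaining ranks are workers that hand their data to that writer.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    writer = (rank // group_size) * group_size
    role = "writer" if rank == writer else "worker"
    return role, writer


def writer_fraction(group_size: int) -> float:
    """Fraction of ranks given up as writers: one writer per group."""
    return 1.0 / group_size


if __name__ == "__main__":
    # 1024 ranks writing to 32 files: ranks 0..31 share file 0, and so on.
    print(syncio_file_index(40, num_ranks=1024, num_files=32))
    # One writer per 32 ranks gives up 1/32 of the ranks.
    print(rbio_role(33, group_size=32), writer_fraction(32))
```

With one writer per 32 ranks the writer fraction is 1/32 (about 3%), and with one per 16 ranks it is 1/16 (about 6%), which is roughly the range quoted above for rbIO. The number of files or writers is the tuning knob that the "sweet spot" bullets refer to.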