1. Center for Information Services and High Performance Computing (ZIH)
   Advanced Data Placement via Ad-hoc File Systems at Extreme Scales (ADA-FS)
   Michael Kluge, Wolfgang E. Nagel, André Brinkmann, Achim Streit, Sebastian Oeste, Marc-André Vef, Mehmet Soysal
   PDSW-DISCS @ SC'16, Salt Lake City, 2016/11/24

2. Project Rationale: I/O Challenges at Exascale
   - The I/O subsystem is the slowest part of an HPC machine to access.
   - It is a shared medium: no reliable bandwidth, no good transfer-time predictions.
   - Upcoming architectures bring "fat nodes" with intermediate local storage.
   Goal: optimize I/O
   - Faster access by using the additional storage
   - A transparent solution for parallel applications
   - Pre-stage inputs early, cache outputs

3. Proposed Solution
   Ad-hoc overlay file system
   - A separate overlay file system per application run
   - Instantiated on the scheduled compute nodes
   - Lives longer than the user's job
   Central I/O planner
   - Global planning of I/O, including stage-in/-out of data, for all parallel jobs
   - Optimization of data placement in the ad-hoc file system (resp. nodes)
   - Integration with the system's batch scheduler
   Application monitoring, resource discovery
   - I/O behavior, machine-specific storage types, sizes, speeds, …
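The per-job lifecycle sketched on this slide can be illustrated in a few lines. This is a minimal sketch with invented names (`AdHocJob`, `run`), not the project's actual code: it only shows the ordering the slide implies — mount the overlay on the assigned nodes, stage inputs in early, run the application, stage outputs back out, tear the overlay down.

```python
# Hypothetical sketch of the ADA-FS per-job lifecycle described above.
from dataclasses import dataclass, field

@dataclass
class AdHocJob:
    job_id: int
    nodes: list            # compute nodes the batch scheduler assigned
    inputs: list           # files to pre-stage into the ad-hoc FS
    outputs: list          # files to copy back to the parallel FS
    log: list = field(default_factory=list)

    def run(self):
        # 1. Instantiate the overlay FS on the scheduled compute nodes.
        self.log.append(f"mount ad-hoc FS on {len(self.nodes)} nodes")
        # 2. Stage inputs in early, before the application starts.
        for f in self.inputs:
            self.log.append(f"stage-in {f}")
        # 3. The application does its I/O against the fast overlay.
        self.log.append("run application")
        # 4. Stage (cached) outputs back to the global file system.
        for f in self.outputs:
            self.log.append(f"stage-out {f}")
        # 5. Tear the ad-hoc FS down; it belongs to this job only.
        self.log.append("unmount ad-hoc FS")
        return self.log

job = AdHocJob(1, ["n01", "n02"], ["mesh.dat"], ["result.h5"])
steps = job.run()
```

The point of the ordering is that stage-in overlaps with the time the job waits in the queue, so the application never blocks on the shared parallel file system at startup.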

4. Ad-hoc Overlay File System
   Research goals:
   - Relax POSIX semantics based on access patterns
   - No locking
   - Distributed metadata
   - Eventual consistency
   - Make applications responsible for their I/O
   Related work: GPFS, Lustre, BeeGFS, …; DeltaFS, BurstFS, …
   Status:
   - Design phase for scalable metadata and lock-free block storage
   - Key-value stores for metadata
   - Evaluation of different storage schemata
   - Monitoring
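The "key-value stores for metadata" and "no locking" points above can be sketched as follows. This is an assumed design for illustration, not the project's implementation: file metadata is stored under the full path as a flat key, so a `stat` is a single hash lookup with no directory traversal and no locks, and hashing the key also picks which metadata server owns the entry.

```python
# Illustrative sketch: distributed, lock-free file metadata in a
# key-value store, keyed by the full path. Names are invented.
import hashlib

class KVMetadataStore:
    def __init__(self, num_servers):
        # One dict stands in for each metadata server.
        self.servers = [dict() for _ in range(num_servers)]

    def _owner(self, path):
        # Hash the full path to pick the owning metadata server;
        # the same path always maps to the same server.
        h = int(hashlib.md5(path.encode()).hexdigest(), 16)
        return self.servers[h % len(self.servers)]

    def create(self, path, size=0, mode=0o644):
        # A create is one insert on one server -- no directory locks.
        self._owner(path)[path] = {"size": size, "mode": mode}

    def stat(self, path):
        # A stat is one lookup on one server.
        return self._owner(path).get(path)

store = KVMetadataStore(num_servers=4)
store.create("/job42/output/rank0.dat", size=4096)
```

Flat path keys trade away cheap directory renames (every child key would move) for contention-free creates and stats, which matches the slide's goal of relaxing POSIX semantics based on actual HPC access patterns.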

5. Central I/O Planner
   Research goals:
   - Stage-in and stage-out of the temporary file system, maybe even during job runtime
   - Schedule I/O based on estimations from the running/planned jobs
   Related work:
   - Current batch systems
   - Data staging from Grid environments
   - Workpool/workspace concepts
   - I/O scheduling and QoS approaches
   Status:
   - Prototype for a temporary file system based on BeeGFS
   - Stage-in and stage-out based on parallel copy tools
   - SLURM integration

6. Resource Discovery and Monitoring
   Research goals:
   - Collect available resources
   - Monitor FS activities
   - Provide the planner with estimations about I/O capabilities and current usage
   - Learn I/O behavior for standard applications
   Related work:
   - OpenMPI
   - Likwid
   - Many data collection tools
   - I/O pattern recognition
   Status:
   - Working prototype that discovers node and connection details
   - Working on integration into the I/O planner
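The "collect available resources" step can be sketched with standard-library calls. This is an illustrative stand-in, not the project's prototype: it probes one node-local path (here `/tmp` as an assumed stand-in for node-local storage) and reports the capacity figures an I/O planner would need for placement decisions.

```python
# Illustrative resource-discovery probe; function name is invented.
import shutil

def discover_local_storage(path="/tmp"):
    """Report capacity of a node-local storage path for the planner."""
    usage = shutil.disk_usage(path)   # (total, used, free) in bytes
    return {
        "path": path,
        "total_bytes": usage.total,
        "free_bytes": usage.free,
    }

info = discover_local_storage()
```

A real discovery service would additionally record the storage type and measured bandwidth per device (the "machine-specific storage types, sizes, speeds" from the proposed-solution slide), since capacity alone does not tell the planner which node can absorb a stage-in fastest.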
