Efficient Scientific Data Management on Supercomputers: HDF5 and Proactive Data Containers (PDC)
Suren Byna, Staff Scientist
Scientific Data Management Group, Data Science and Technology Department
Lawrence Berkeley National Laboratory
Scientific Data - Where is it coming from?
▪ Simulations
▪ Experiments
▪ Observations
Life of scientific data
Stages: generation → in situ analysis → processing → storage → analysis → preservation (archive) → sharing → refinement
Supercomputing systems
Typical supercomputer architecture
Cori system
[Diagram: Cori system - compute nodes (CN) and burst buffer nodes (BB, 2x SSD per blade) on the Aries high-speed network; I/O nodes (ION, 2x InfiniBand HCA) connect over the InfiniBand storage fabric to storage servers (Lustre OSSs/OSTs).]
Scientific Data Management in supercomputers
▪ Data representation
– Metadata, data structures, data models
▪ Data storage
– Storing and retrieving data and metadata to/from file systems quickly
▪ Data access
– Improving the performance of the data access patterns that scientists need
▪ Facilitating analysis
– Strategies to support finding meaning in the data
▪ Data transfers
– Transferring data within a supercomputing system and between different systems
Focus of this presentation
▪ Storing and retrieving data – parallel I/O and HDF5
– Software stack
– Modes of parallel I/O
– Intro to HDF5 and some I/O tuning of exascale applications
▪ Autonomous data management system
– Proactive Data Containers (PDC) system
– Metadata management service
– Data management service
Trends – Storage system transformation
▪ Storage hierarchies
– Conventional: memory → parallel file system (Lustre, GPFS) → archival storage (HPSS tape); an I/O gap sits between memory and the disk-based file system
– Shared burst buffer (e.g., Cori @ NERSC): memory → shared burst buffer → parallel file system (Lustre, GPFS) → archival storage (HPSS tape)
– Node-local (e.g., Theta @ ALCF, Summit @ OLCF): memory → node-local storage → parallel file system (on Theta) or center-wide storage (on Summit) → archival storage (HPSS tape)
– Upcoming: memory → node-local storage → NVM-based shared storage → parallel file system → campaign / center-wide storage → archival storage (HPSS tape)
▪ The I/O performance gap in HPC storage is a significant bottleneck because of slow disk-based storage
▪ SSDs and new memory technologies are trying to fill the gap, but they increase the depth of the storage hierarchy
Parallel I/O software stack
Applications
High-Level I/O Library (HDF5, NetCDF, ADIOS)
I/O Middleware (MPI-IO)
I/O Forwarding
Parallel File System (Lustre, GPFS, …)
I/O Hardware
▪ I/O libraries
– HDF5 (The HDF Group) [LBL, ANL]
– ADIOS (ORNL)
– PnetCDF (Northwestern, ANL)
– NetCDF-4 (UCAR)
▪ Middleware: POSIX-IO, MPI-IO (ANL)
▪ I/O forwarding
▪ File systems: Lustre (Intel), GPFS (IBM), DataWarp (Cray), …
▪ I/O hardware (disk-based, SSD-based, …)
▪ Types of parallel I/O
– 1 writer/reader, 1 file
– N writers/readers, N files (file-per-process)
– N writers/readers, 1 file
– M writers/readers, 1 file (aggregators; two-phase I/O)
– M aggregators, M files (file-per-aggregator); variations of this mode
Parallel I/O – Application view
[Diagram: process-to-file mappings for each mode - P0 … Pn writing/reading one shared file; n processes to n files; n processes to one file; M writers/readers to M files; M writers/readers to one shared file.]
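As a concrete illustration of the "N writers/readers, 1 file" mode with collective (two-phase) I/O at the MPI-IO middleware layer, here is a minimal sketch; the file name and per-process element count are illustrative, not from the original slides.

```c
/* Minimal sketch: N writers, one shared file, collective (two-phase) I/O.
 * Compile with an MPI C compiler (e.g., mpicc) and run with several ranks. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int count = 1024;                  /* elements per process (illustrative) */
    double *buf = malloc(count * sizeof(double));
    for (int i = 0; i < count; i++)
        buf[i] = rank + i * 1e-6;            /* some per-rank data */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "shared_file.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each rank writes its own contiguous block of the shared file.
     * The _all (collective) variant lets the MPI-IO layer aggregate
     * requests across ranks (the two-phase I/O mentioned above). */
    MPI_Offset offset = (MPI_Offset)rank * count * sizeof(double);
    MPI_File_write_at_all(fh, offset, buf, count, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}
```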
▪ Parallel file systems
– Lustre and Spectrum Scale (GPFS)
▪ Typical building blocks of parallel file systems
– Storage hardware – HDD or SSD RAID
– Storage servers (in Lustre, Object Storage Servers [OSS] and Object Storage Targets [OST])
– Metadata servers
– Client-side processes and interfaces
▪ Management
– Stripe files for parallelism
– Tolerate failures
Parallel I/O – System view
[Diagram: logical view of a file vs. its physical view on a parallel file system - the file is striped across OST 0–3 and accessed over the communication network.]
WHAT IS HDF5?
What is HDF5?
▪ HDF5 = Hierarchical Data Format, version 5
▪ Open file format
– Designed for high-volume and complex data
▪ Open source software
– Works with data in the format
▪ An extensible data model
– Structures for data organization and specification
HDF5 is like …
HDF5 is designed …
▪ for high volume and/or complex data
▪ for every size and type of system – from cell phones to supercomputers
▪ for flexible, efficient storage and I/O
▪ to enable applications to evolve in their use of HDF5 and to accommodate new models
▪ to support long-term data preservation
[Charts: library usage on Cori and Edison in 2017 - number of unique users and number of linking incidences per library (mpich, libsci, mkl, hdf5-parallel, fftw, hdf5, papi, netcdf, netcdf-hdf5parallel, impi, petsc, parallel-netcdf, tpsl, gsl, boost, zlib, …).]
HDF5 Overview
▪ HDF5 is designed to organize, store, discover, access, analyze, share, and preserve diverse, complex data in continuously evolving heterogeneous computing and storage environments.
▪ First released in 1998; maintained by The HDF Group
▪ Heavily used on DOE supercomputing systems
▪ "De-facto standard for scientific computing," integrated into every major scientific analytics and visualization tool
▪ Top library used at NERSC by the number of linked instances and the number of unique users
▪ HDF5 in the Exascale Computing Project: 19 out of the 26 (22 ECP + 4 NNSA) applications currently use or plan to use HDF5
HDF5 Ecosystem
▪ Data Model
▪ File Format
▪ Library
▪ Tools
▪ Documentation
▪ Supporters
▪ …
HDF5 DATA MODEL
HDF5 File
An HDF5 file is a container that holds data objects, for example a table of measurements and free-text notes:

lat | lon | temp
----|-----|-----
 12 |  23 | 3.1
 15 |  24 | 4.2
 17 |  21 | 3.6

Experiment Notes: Serial Number: 99378920, Date: 3/13/09, Configuration: Standard 3
HDF5 Data Model
HDF5 Objects: File, Group, Link, Dataset, Attribute, Dataspace, Datatype
HDF5 Dataset
▪ HDF5 datasets organize and contain data elements.
– A dataset is a multi-dimensional array of identically typed data elements.
▪ HDF5 datatype describes the individual data elements (e.g., integer: 32-bit, LE).
▪ HDF5 dataspace describes the logical layout of the data elements (e.g., rank 3; dimensions Dim[0] = 4, Dim[1] = 5, Dim[2] = 7).
HDF5 Datatype
▪ Describes individual data elements in an HDF5 dataset
▪ Wide range of datatypes supported:
- Integer
- Float
- Enum
- Array
- User-defined (e.g., 13-bit integer)
- Variable-length types (e.g., strings, vectors)
- Compound (similar to C structs)
- More …
HDF5 Dataspace
Two roles:
▪ Dataspace contains spatial information
– Rank and dimensions (e.g., rank = 2, dimensions = 4 x 6)
– Permanent part of the dataset definition
▪ Partial I/O: dataspace describes the application's data buffer and the data elements participating in I/O (e.g., rank = 1, dimension = 10)
HDF5 Dataset with a 2D array
▪ Dataspace: rank = 2, dimensions = 5 x 3
▪ Datatype: 32-bit integer
HDF5 Dataset with Compound Datatype
▪ Compound datatype with members: uint16, char, int32, and a 2x3x2 array of float32
▪ Dataspace: rank = 2, dimensions = 5 x 3
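A minimal sketch of how a compound datatype like the one above can be assembled with the HDF5 C API; the struct name and member names are illustrative (not from the slides).

```c
/* Sketch: compound datatype with uint16, char, int32, and a 2x3x2 float32 array.
 * Struct and member names are illustrative. */
#include "hdf5.h"

typedef struct {
    unsigned short a;          /* uint16               */
    char           b;          /* char                 */
    int            c;          /* int32                */
    float          d[2][3][2]; /* 2x3x2 float32 array  */
} record_t;

hid_t make_record_type(void)
{
    hsize_t adims[3] = {2, 3, 2};
    hid_t arr = H5Tarray_create2(H5T_NATIVE_FLOAT, 3, adims);

    hid_t cmpd = H5Tcreate(H5T_COMPOUND, sizeof(record_t));
    H5Tinsert(cmpd, "a", HOFFSET(record_t, a), H5T_NATIVE_USHORT);
    H5Tinsert(cmpd, "b", HOFFSET(record_t, b), H5T_NATIVE_CHAR);
    H5Tinsert(cmpd, "c", HOFFSET(record_t, c), H5T_NATIVE_INT);
    H5Tinsert(cmpd, "d", HOFFSET(record_t, d), arr);

    H5Tclose(arr);   /* the compound keeps its own copy of the member type */
    return cmpd;     /* pass to H5Dcreate/H5Dwrite, close with H5Tclose     */
}
```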
How are data elements stored?
▪ Contiguous (default): data elements are stored physically adjacent to each other in the file
▪ Chunked: better access time for subsets; datasets are extendible
▪ Chunked & compressed: improves storage efficiency and transmission speed
(Each layout defines how the buffer in memory maps to the data in the file.)
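As a sketch of how chunked and compressed storage is requested, a dataset creation property list can select the chunked layout and a gzip filter; the file name, dataset name, dimensions, and chunk shape below are illustrative.

```c
/* Sketch: create a chunked, gzip-compressed 2-D dataset.
 * Names, dimensions, and chunk shape are illustrative. */
#include "hdf5.h"

void create_chunked_compressed(void)
{
    hsize_t dims[2]  = {1024, 1024};
    hsize_t chunk[2] = {128, 128};

    hid_t file  = H5Fcreate("chunked.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(2, dims, NULL);

    /* Dataset creation property list: chunked layout + gzip (deflate) level 6 */
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 2, chunk);
    H5Pset_deflate(dcpl, 6);

    hid_t dset = H5Dcreate(file, "data", H5T_NATIVE_FLOAT, space,
                           H5P_DEFAULT, dcpl, H5P_DEFAULT);

    H5Dclose(dset);
    H5Pclose(dcpl);
    H5Sclose(space);
    H5Fclose(file);
}
```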
HDF5 Groups and Links
HDF5 groups and links organize data objects. Every HDF5 file has a root group ( / ).
[Diagram: a root group / with groups such as SimOut and Viz linking to data objects, e.g., the lat/lon/temp table, the experiment notes, and values such as Parameters = 10;100;1000 and Timestep = 36,000.]
HDF5 Attributes
▪ Typically contain user metadata
▪ Have a name and a value
▪ Attributes "decorate" HDF5 objects
▪ The value is described by a datatype and a dataspace
▪ Analogous to a dataset, but attributes do not support partial I/O operations, nor can they be compressed or extended
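A minimal sketch of attaching an attribute to an already opened object (here a dataset handle); the attribute name and value are illustrative, echoing the Timestep example above.

```c
/* Sketch: attach a small scalar attribute to an open dataset (or group). */
#include "hdf5.h"

void add_timestep_attribute(hid_t dset)
{
    int timestep = 36000;

    hid_t space = H5Screate(H5S_SCALAR);            /* a single value        */
    hid_t attr  = H5Acreate(dset, "Timestep", H5T_NATIVE_INT,
                            space, H5P_DEFAULT, H5P_DEFAULT);
    H5Awrite(attr, H5T_NATIVE_INT, &timestep);      /* write the whole value */

    H5Aclose(attr);
    H5Sclose(space);
}
```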
HDF5 Home Page
▪ HDF5 home page: http://www.hdfgroup.org/solutions/hdf5/
– Latest release: HDF5 1.10.5 (1.12 coming soon)
▪ HDF5 source code
– Written in C, with optional C++, Fortran, and Java APIs
– Along with "High Level" APIs
– Contains command-line utilities (h5dump, h5repack, h5diff, …) and compile scripts
▪ HDF5 pre-built binaries
– When possible, include the C, C++, Fortran, Java, and High Level libraries
– Check the ./lib/libhdf5.settings file
– Built with, and require, the SZIP and ZLIB external libraries
HDF5 Software Layers & Storage
▪ Applications and tools: VPIC, netCDF-4, H5Part API, HDFview, h5dump, High Level APIs, …
▪ HDF5 library
– HDF5 data model objects: Groups, Datasets, Attributes, …
– Tunable properties: chunk size, I/O driver, …
– Language interfaces: C, Fortran, C++
– Internals: memory management, datatype conversion, filters, chunked storage, version compatibility, and so on
– Virtual File Layer (I/O drivers): POSIX I/O, split files, MPI I/O, custom
▪ Storage (HDF5 file format): single file, split files, file on a parallel file system, other
The General HDF5 API
▪ C, Fortran, Java, C++, and .NET bindings
– Also: IDL, MATLAB, Python (H5Py, PyTables), Perl, ADA, Ruby, …
▪ C routines begin with the prefix H5?
– ? is a character corresponding to the type of object the function acts on
▪ Example functions:
– H5D: Dataset interface, e.g., H5Dread
– H5F: File interface, e.g., H5Fopen
– H5S: dataSpace interface, e.g., H5Sclose
The HDF5 API
▪ For flexibility, the API is extensive
– 300+ functions
▪ This can be daunting… but there is hope
– A few functions can do a lot
– Start simple
– Build up knowledge as more features are needed
[Image: Victorinox Swiss Army Cybertool 34]
General Programming Paradigm
▪ Object is opened or created
▪ Object is accessed, possibly many times
▪ Object is closed
▪ Properties of the object are optionally defined
– Creation properties (e.g., use chunked storage)
– Access properties
Basic Functions
– H5Fcreate (H5Fopen): create (open) a File
– H5Screate_simple / H5Screate: create a dataSpace
– H5Dcreate (H5Dopen): create (open) a Dataset
– H5Dread, H5Dwrite: access a Dataset
– H5Dclose: close a Dataset
– H5Sclose: close a dataSpace
– H5Fclose: close a File
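A minimal end-to-end sketch using only the basic functions above: create a file, write a 5 x 3 integer dataset (as in the data model example earlier), and close everything. The file and dataset names are illustrative.

```c
/* Sketch: create a file and write a 5x3 integer dataset with the basic calls. */
#include "hdf5.h"

int main(void)
{
    hsize_t dims[2] = {5, 3};
    int data[5][3];
    for (int i = 0; i < 5; i++)
        for (int j = 0; j < 3; j++)
            data[i][j] = i * 3 + j;                  /* fill with sample values */

    hid_t file  = H5Fcreate("example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(2, dims, NULL);   /* rank 2, 5x3             */
    hid_t dset  = H5Dcreate(file, "dset", H5T_NATIVE_INT, space,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    H5Dwrite(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);

    H5Dclose(dset);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}
```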
Other Common Functions
– DataSpaces: H5Sselect_hyperslab (partial I/O), H5Sselect_elements (partial I/O), H5Dget_space
– DataTypes: H5Tcreate, H5Tcommit, H5Tclose, H5Tequal, H5Tget_native_type
– Groups: H5Gcreate, H5Gopen, H5Gclose
– Attributes: H5Acreate, H5Aopen_name, H5Aclose, H5Aread, H5Awrite
– Property lists: H5Pcreate, H5Pclose, H5Pset_chunk, H5Pset_deflate
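As a sketch of partial I/O with a hyperslab selection: read a 2 x 3 block starting at offset (1, 1) from an already opened 2-D integer dataset. Sizes and offsets are illustrative.

```c
/* Sketch: partial I/O - read a 2x3 block from an open 2-D integer dataset. */
#include "hdf5.h"

void read_block(hid_t dset)
{
    hsize_t start[2] = {1, 1};     /* where the block begins in the dataset */
    hsize_t count[2] = {2, 3};     /* how many elements to read             */
    int     block[2][3];

    /* File-side dataspace: select which dataset elements participate */
    hid_t fspace = H5Dget_space(dset);
    H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);

    /* Memory-side dataspace: the shape of the application buffer */
    hid_t mspace = H5Screate_simple(2, count, NULL);

    H5Dread(dset, H5T_NATIVE_INT, mspace, fspace, H5P_DEFAULT, block);

    H5Sclose(mspace);
    H5Sclose(fspace);
}
```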
HDF5 performance on supercomputers
▪ A plasma physics simulation, using VPIC code
– I/O kernel with MPI processes, where each process writes 8 variables of 8 million particles
HDF5 performance tuning – Athena
▪ Athena astrophysics code experienced poor performance
– The code spent 40% of its execution time in I/O with HDF5; profiling tools identified a large number of concurrent writes
– Switching to collective I/O reduced the I/O portion to less than 1% of the execution time
▪ Neurological disorder I/O pipeline
– Identified that the h5py interface was prefilling HDF5 dataset buffers unnecessarily; avoiding that improved performance by 20X (from 40 minutes to 2 minutes)
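For context, switching parallel HDF5 to collective I/O is typically a property-list change rather than an algorithmic rewrite. The sketch below shows the generic pattern (MPI-IO file driver plus a collective dataset transfer property list); it is not the Athena code itself, it requires an HDF5 build with parallel support, and the file name is illustrative.

```c
/* Sketch: open a file with the MPI-IO driver and request collective transfers. */
#include "hdf5.h"
#include <mpi.h>

hid_t create_parallel_file(const char *name, hid_t *dxpl_out)
{
    /* File access property list: use the MPI-IO virtual file driver */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);

    hid_t file = H5Fcreate(name, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
    H5Pclose(fapl);

    /* Dataset transfer property list: collective instead of independent I/O */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

    *dxpl_out = dxpl;   /* pass this to each H5Dwrite / H5Dread call */
    return file;
}
```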
HDF5 performance tuning – Accelerator physics
▪ Accelerator physics simulation code (WarpX-IO)
[Chart: WarpX-IO write performance with default settings, Lustre tuning, and h5py bug fix + Lustre tuning.]
HDF5 performance tuning – AMReX I/O
[Chart: AMReX I/O benchmark performance with initial tuning.]
Autonomous data management using object storage – Proactive Data Containers (PDC)
Storage Systems and I/O: Current status
[Diagram: applications use the I/O software stack (high-level libraries such as HDF5, I/O middleware such as POSIX and MPI-IO, I/O forwarding, parallel file systems) to move data in memory to files in the file system, across a hardware hierarchy of memory, node-local storage, shared burst buffer, parallel file system, campaign storage, and archival storage (HPSS tape); users are left to tune the middleware and the file systems.]
▪ Challenges
– The multi-level hierarchy complicates data movement, especially if the user has to be involved
– POSIX-IO semantics hinder the scalability and performance of file systems and I/O software
HPC data management requirements
Use case | Domain | Sim / EOD / Analysis | Data size | I/O requirements
---------|--------|----------------------|-----------|------------------
FLASH | High-energy density physics | Simulation | ~1 PB | Data transformations, scalable I/O interfaces, correlation among simulation and experimental data
CMB / Planck | Cosmology | Simulation, EOD/Analysis | 10 PB | Automatic data movement optimizations
DECam & LSST | Cosmology | EOD/Analysis | ~10 TB | Easy interfaces, data transformations
ACME | Climate | Simulation | ~10 PB | Async I/O, derived variables, automatic data movement
TECA | Climate | Analysis | ~10 PB | Data organization and efficient data movement
HipMer | Genomics | EOD/Analysis | ~100 TB | Scalable I/O interfaces, efficient and automatic data movement

Common requirements: easy interfaces and superior performance; autonomous data management; information capture and management.
Next Gen Storage – Proactive Data Containers (PDC)
[Diagram: applications use a high-level API; the PDC software manages data (in memory) across the hardware hierarchy of memory, node-local storage, shared burst buffer, disk-based storage, campaign storage, and archival storage (HPSS tape).]
▪ Object-centric data access interface
– Simple put/get interface
– Array-based variable access
▪ Transparent data management
– Data placement in the storage hierarchy
– Automatic data movement
▪ Information capture and management
– Rich metadata
– Connection of results and raw data with relationships
▪ Persistent storage accessed through a storage API: burst buffer file system, Lustre, DAOS, …
PDC System – High-level Architecture
Object-centric PDC Interface
▪ Object-level interface
– Create containers and objects
– Add attributes
– Put object
– Get object
– Delete object
▪ Array-specific interface
– Create regions
– Map regions in PDC objects
– Lock
– Release
(An illustrative sketch of the object-level calls follows this list.)
– J. Mu, J. Soumagne, et al., "A Transparent Server-managed Object Storage System for HPC", IEEE Cluster 2018
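To convey the flavor of the object-level interface (create, tag, put, query, get), here is a hypothetical sketch. The types and functions below are stand-ins declared only for this illustration; they are not the actual PDC API, for which see the paper cited above and the PDC documentation.

```c
/* Hypothetical illustration of an object-centric put/get workflow.
 * These declarations are stand-ins for the sketch, NOT the real PDC API. */
#include <stddef.h>

typedef int container_t;   /* stand-in handle types */
typedef int object_t;

container_t container_create(const char *name);
object_t    object_create(container_t c, const char *name);
void        object_set_tag(object_t o, const char *key, const char *value);
void        object_put(object_t o, const void *buf, size_t nbytes);
object_t    object_find(container_t c, const char *query);
void        object_get(object_t o, void *buf, size_t nbytes);

void example(float *x, size_t n)
{
    container_t run = container_create("run_001");
    object_t    px  = object_create(run, "particles/x");

    /* Rich, searchable metadata attached to the object instead of file paths */
    object_set_tag(px, "app", "VPIC");
    object_set_tag(px, "timestep", "100");

    /* Put the data; the service decides where in the storage hierarchy it lives */
    object_put(px, x, n * sizeof(float));

    /* Later: locate the object by its tags and read the data back */
    object_t found = object_find(run, "app=VPIC AND timestep=100");
    object_get(found, x, n * sizeof(float));
}
```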
Proactive Data Container
[Diagram: a PDC container holding objects such as datasets, KV-stores, and groups, organized under a <root> group (objects A through F).]
▪ Usage of compute resources for I/O
– Shared mode: compute nodes are shared between applications and I/O services
– Dedicated mode: I/O services run on separate nodes
▪ Transparent data movement by PDC servers
– Applications map data buffers to objects; PDC servers place and manage the data
– Applications query for data objects using attributes
▪ Superior I/O performance
Transparent data movement in storage hierarchy
– H. Tang, S. Byna, et al., "Toward Scalable and Asynchronous Object-centric Data Management for HPC", IEEE/ACM CCGrid 2018
[Charts: read and write times in seconds vs. number of processes (124 to 15,872), comparing HDF5, PLFS, and PDC on Lustre and on the burst buffer (BB).]
Metadata management
▪ Flat name space
▪ Rich metadata
– Pre-defined tags that include provenance
– User-defined tags for capturing relationships between data objects
▪ Distributed in-memory metadata management
– Distributed hash tables and Bloom filters are used for faster access
– H. Tang, S. Byna, et al., "SoMeta: Scalable Object-centric Metadata Management for High Performance Computing", IEEE Cluster 2017
HDF5 and PDC bridge
▪ Developed an HDF5 Virtual Object Layer (VOL) connector to make PDC available to all HDF5 applications
▪ Minimal code change for HDF5 applications; working towards requiring no code change
▪ 2X to 7X speedup with the dedicated mode of PDC
[Chart: VPIC-IO write performance – time in seconds vs. number of client processes (nodes), from 992 (32) to 15,872 (512), comparing native HDF5 (collective), native HDF5 (independent), HDF5 PDC VOL with a shared server, and HDF5 PDC VOL with a separate server over TCP and over GNI.]
[Chart: BD-CATS I/O performance – same configurations and scales.]
Collaborators: THG
Conclusions
▪ Easy interfaces and superior performance
– Simpler object interface
– Applications produce data objects and declare them to be kept persistent
– Applications request the data they desire
▪ Autonomous data management
– Asynchronous and autonomous data movement
– Bring interesting data to apps
▪ Information capture and management
– Manage rich metadata and enhance search capabilities
– Perform analysis and transformations in the data path
▪ Contact:
- Suren Byna (sdm.lbl.gov/~sbyna/) [SByna@lbl.gov]
▪ Contributions to this presentation
- ExaHDF5 project team (sdm.lbl.gov/exahdf5)
- Proactive Data Containers (PDC) team (sdm.lbl.gov/pdc)
- SDM group: sdm.lbl.gov