

Slide 1

DeltaFS Indexed Massive Directory
Software-Defined Storage for Fast Queries

PDSW-DISCS 2017

Qing Zheng, George Amvrosiadis, Saurabh Kadekodi, Michael Kuchnik, Chuck Cranor, Garth Gibson (Carnegie Mellon University)
Brad Settlemyer, Gary Grider, Fan Guo (Los Alamos National Laboratory, LANL)

Slide 2

Key features

  • 1. Requires no dedicated resources
  • 2. Almost no post-processing needed
  • 3. Low I/O overhead

Slide 3

Target workloads

  • 1. Data-intensive HPC simulations
  • 2. Not designed for indexing checkpoints
  • 3. Settings where I/O bandwidth is limited

Slide 4

Agenda

Part 1 – Motivation
Part 2 – In-situ indexing design
Part 3 – API, LANL VPIC integration
Conclusion

Slide 5

Existing HPC systems build indexes during post-processing

Queries are delayed until post-processing is done (5-20% of simulation time)

[Figure: the app writes raw output to temporary files on Lustre (1); a post-processing step builds the indexes (2); only then can queries run (3).]

Slide 6

Problem faced: the increasing time-to-science

Due to the growing gap between compute and I/O, and inefficient support for small data

[Figure: timeline from simulation start to query finish.]

Slide 7

In-transit processing indexes data while it is written to storage

But it needs separate resources (e.g. a MapReduce cluster) for sorting and indexing

[Figure: the app writes to temporary files on Lustre; a MapReduce-style indexing service processes the data in transit before queries run.]

Slide 8

In-situ indexing runs directly on app nodes, using app resources

No need for a separate indexing cluster

[Figure: the app and the indexing code share the compute nodes; data + index go straight to Lustre, ready for queries.]

Slide 9

Key idea: reuse storage write-back buffering and idle CPU cycles for in-situ indexing

Slide 10

Example app: LANL VPIC

Each VPIC process simulates millions of particles; each particle record is 40 bytes

Particles move across processes during a simulation

During the run: small random writes. After the run: highly selective queries
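
The slides do not spell out the 40-byte record. A hypothetical C layout that adds up to 40 bytes (the field names and choices are assumptions for illustration, not VPIC's actual output format):

    #include <stdint.h>

    /* Hypothetical 40-byte particle record; fields are illustrative
     * assumptions, not VPIC's real format. */
    struct particle {
        uint64_t id;           /*  8 bytes: global particle ID (the query key) */
        float    x, y, z;      /* 12 bytes: position                           */
        float    ux, uy, uz;   /* 12 bytes: momentum                           */
        float    weight;       /*  4 bytes: particle weight                    */
        uint32_t step;         /*  4 bytes: timestep of this record            */
    };                         /* total: 40 bytes                              */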

Slide 11

With file-per-process output, fetching a single particle trajectory costs TBs of I/O

[Figure: simulation processes each write one output file ('file-per-process'); a given particle's records (e.g. A) are scattered across the files, so one trajectory query is a TB-scale search.]

Slide 12

5,000x faster than baseline with DeltaFS in-situ indexing

[Figure: time for reading a single particle trajectory from a 10 TB, 48-billion-particle dataset, log-scale seconds; DeltaFS (w/ 1 CPU core) vs. baseline (full-system parallel scan w/ 3K CPU cores).]

Slide 13

Part II
System design: lightweight in-situ indexing

  • 1. Tiny memory footprint
  • 2. Zero write amplification
  • 3. No read-back

Slide 14

Resource-efficient indexing via log-structured I/O

Tiny memory footprint; full storage bandwidth utilization

[Figure: inside each app process, the app thread fills a write-back buffer; an indexing thread drains it, appending a data log and its index to Lustre.]
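
A minimal sketch of this write path, assuming a double buffer handed off to a background indexing thread (all names are hypothetical and error handling is elided; DeltaFS's real internals differ):

    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>

    #define BUF_CAP (4u << 20)            /* 4 MB write-back buffer */

    struct buf { char data[BUF_CAP]; size_t used; };

    static struct buf bufs[2];
    static int active = 0;                /* buffer the app thread fills          */
    static int sealed = -1;               /* buffer awaiting indexing, -1 if none */
    static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;

    /* Indexing thread: drain sealed buffers, appending data (and, in a
     * real system, index entries and filters) to per-process logs. */
    static void *indexer(void *arg) {
        FILE *data_log  = fopen("data.log",  "ab");
        FILE *index_log = fopen("index.log", "ab");
        (void)arg; (void)index_log;       /* index writing elided below */
        for (;;) {
            pthread_mutex_lock(&mu);
            while (sealed < 0) pthread_cond_wait(&cv, &mu);
            struct buf *b = &bufs[sealed];
            pthread_mutex_unlock(&mu);

            fwrite(b->data, 1, b->used, data_log);  /* append to the data log  */
            /* ...sort keys, emit index block + filter to index_log here...    */
            b->used = 0;

            pthread_mutex_lock(&mu);
            sealed = -1;                  /* hand the buffer back */
            pthread_mutex_unlock(&mu);
            pthread_cond_signal(&cv);
        }
        return NULL;
    }

    /* App thread: append a record; seal and swap buffers when full,
     * blocking only if the indexer still owns the other buffer. */
    void imd_append(const void *rec, size_t len) {
        struct buf *b = &bufs[active];
        if (b->used + len > BUF_CAP) {
            pthread_mutex_lock(&mu);
            while (sealed >= 0) pthread_cond_wait(&cv, &mu);
            sealed = active;              /* give the full buffer away */
            active ^= 1;
            pthread_mutex_unlock(&mu);
            pthread_cond_signal(&cv);
            b = &bufs[active];
        }
        memcpy(b->data + b->used, rec, len);
        b->used += len;
    }

The app would spawn indexer once with pthread_create at startup; indexing then overlaps with computation, which is what lets it run on idle CPU cycles.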

Slide 15

LSM-trees compact all the time, but we can't afford that

We must keep I/O overhead low, at 10-20% (a simulation alternates compute and I/O phases)

Compaction easily causes 1000% I/O overhead by re-reading and re-writing previously written data
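
A back-of-the-envelope way to see where a 1000% figure can come from (my arithmetic, not the paper's derivation): if compaction re-reads and re-writes each byte of a raw data volume D an average of r times over the run, the extra device traffic relative to the raw writes is

    \[
      \frac{\text{extra I/O}}{\text{raw writes}} = \frac{rD + rD}{D} = 2r
    \]

so r = 5 rounds of compaction already mean 1000% overhead, far beyond the 10-20% budget above.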

Slide 16

In-situ indexing via aggressive data partitioning

An all-to-all shuffle routes each particle's records (A-F) to a fixed partition, bounding the amount of data a query must read per timestep (see the sketch below)

[Figure: app processes #0, #1, #2, ... exchange records in an all-to-all shuffle during each I/O phase, so keys A-F end up clustered by partition.]
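
A minimal sketch of the shuffle's routing rule, assuming a simple hash partitioner (the hash choice and all names here are mine; DeltaFS's actual partitioner may differ). Every record of a given particle hashes to the same rank, so each trajectory lands fully clustered on one process:

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* FNV-1a: a small, well-known string hash; used only for illustration. */
    static uint32_t fnv1a(const char *key, size_t len) {
        uint32_t h = 2166136261u;
        for (size_t i = 0; i < len; i++) {
            h ^= (unsigned char)key[i];
            h *= 16777619u;
        }
        return h;
    }

    /* Pick the destination rank for one record of the named particle. */
    int shuffle_dest(const char *particle_name, int nranks) {
        return (int)(fnv1a(particle_name, strlen(particle_name)) % (uint32_t)nranks);
    }

During each I/O phase, a process forwards the record to shuffle_dest(name, nprocs), e.g. over RPC, instead of writing it locally.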

Slide 17

In-situ indexing as a file system library component

No dedicated cluster needed

[Figure: app data enters a shuffle sender, crosses the all-to-all shuffle, and arrives at a shuffle receiver, which fills a WriteBuffer; the buffer is drained into a Data Log (packed data blocks) and an Index Log (index blocks plus filters).]
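
A hypothetical C rendering of that on-storage layout (sizes and field names are assumptions made for illustration):

    #include <stdint.h>

    /* The data log is a sequence of packed data blocks; the index log
     * keeps, for each data block, its extent in the data log, its key
     * range, and a small filter so queries can skip blocks unread. */
    struct block_handle {
        uint64_t offset;      /* byte offset of the data block in the data log */
        uint32_t size;        /* length of the data block in bytes             */
    };

    struct index_entry {
        struct block_handle block;
        char    min_key[16]; /* smallest filename (key) in the block      */
        char    max_key[16]; /* largest filename (key) in the block       */
        uint8_t filter[64];  /* e.g. a Bloom filter over the block's keys */
    };

A query scans the small index log first, tests each key range and filter, and fetches only the data blocks that may contain the requested filename.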

Slide 18

Part III
Programming interface: Indexed Massive Directory (IMD)

In-situ indexing is keyed on filenames: mkdir("./particles", DELTAFS_IMD)

Slide 19

How to use an Indexed Massive Dir (IMD)

  • 1. Data searched together goes into a single IMD file
       e.g. one file for each particle
  • 2. Create as many IMD files as you want
       e.g. 1 trillion files for 1 trillion particles

Query your data by "open-read-close" (sketched below)
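
A sketch of both halves in C. Only mkdir("./particles", DELTAFS_IMD) is taken from the slides; the rest assumes the app is linked against the DeltaFS library, which serves these POSIX-style calls and defines DELTAFS_IMD. All other names are illustrative:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    #ifndef DELTAFS_IMD
    #define DELTAFS_IMD 0755  /* placeholder so this sketch compiles stand-alone;
                                 the real flag is assumed to come from DeltaFS */
    #endif

    #define REC_SIZE 40       /* one 40-byte particle record per timestep */

    /* Write path: one IMD file per particle, appended to every timestep. */
    static void append_record(unsigned long long id, const char rec[REC_SIZE]) {
        char path[64];
        snprintf(path, sizeof(path), "./particles/%016llx", id);
        int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd >= 0) { write(fd, rec, REC_SIZE); close(fd); }
    }

    /* Query path: plain open-read-close returns the whole trajectory. */
    static long read_trajectory(unsigned long long id, char *out, size_t cap) {
        char path[64];
        snprintf(path, sizeof(path), "./particles/%016llx", id);
        int fd = open(path, O_RDONLY);
        if (fd < 0) return -1;
        long n = (long)read(fd, out, cap);   /* all timesteps, clustered */
        close(fd);
        return n;
    }

    int main(void) {
        /* From the slides: mark the directory as an Indexed Massive Dir. */
        mkdir("./particles", DELTAFS_IMD);

        char rec[REC_SIZE] = {0};
        append_record(0x4c1fULL, rec);       /* simulation side */

        static char traj[1 << 20];
        long n = read_trajectory(0x4c1fULL, traj, sizeof(traj));  /* query side */
        printf("trajectory bytes: %ld\n", n);
        return 0;
    }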

Slide 20

VPIC using DeltaFS IMD

[Figure: simulation processes write one IMD file per VPIC particle ('file-per-particle', up to 1T files); the Indexed Massive Directory clusters each particle's records into data objects, with index objects alongside, so a trajectory query searches MBs instead of TBs.]

Slide 21

LANL Trinity experiments

32 cores per compute node; 1-99 compute nodes; 496 million to 48 billion particles

[Figure: VPIC-Baseline buffers writes and stores plain output on the SSD burst-buffer and HDD Lustre tiers; VPIC-DeltaFS adds DeltaFS indexing next to the buffer on the compute nodes, so queries run with no post-processing.]

Slide 22

Query time (sec, log scale) vs. simulation size: baseline (full-system parallel scan) vs. DeltaFS (w/ 1 CPU core)

  Nodes   Particles (millions)   DeltaFS speedup
  1       496                    245x
  2       992                    665x
  4       1,984                  532x
  8       3,968                  625x
  16      7,936                  992x
  33      16,368                 2221x
  66      32,736                 4049x
  99      49,104                 5112x

Slide 23

I/O time per dump (sec) vs. simulation size: DeltaFS's indexing overhead is high for tiny simulations but fades for bigger ones

  Nodes   Particles (millions)   DeltaFS I/O time vs. baseline
  1       496                    9.63x
  2       992                    4.78x
  4       1,984                  2.42x
  8       3,968                  1.56x
  16      7,936                  1.29x
  33      16,368                 1.13x
  66      32,736                 1.15x
  99      49,104                 1.13x

Slide 24

Conclusion

In-situ indexing for transparent, almost-free query acceleration: no dedicated nodes, no post-processing, ~15% I/O overhead

  • Indexed Massive Dir (~3% app memory, compaction-free, POSIX API)
  • Powered by Mercury RPC
  • DeltaFS is one of the Mochi micro-services

https://mercury-hpc.github.io/
https://press3.mcs.anl.gov/mochi/
https://github.com/pdlfs/deltafs
https://mercury-hpc.github.io/ https://press3.mcs.anl.gov/mochi/ https://github.com/pdlfs/deltafs