DataMods
Programmable File System Services
Noah Watkins*, Carlos Maltzahn, Scott Brandt
UC Santa Cruz, *Inktank
Adam Manzanares
California State University, Chico
Talk Agenda
1. Middleware and modern IO stacks
2. Services in middleware and
New behavior
Generalized storage services
Pluggable customization (new programmer role)

Metadata Management
Data Placement
Intelligent Access
Asynchronous Services
10
11
[Figure: a file drawn as a header followed by data and metadata blocks (labels: Header, File, Data, Meta). Later builds of the slide add an array-based application issuing read(array-slice) through the HDF5 Library, and then a Workflow Driver above the HDF5 Library.]
File Manifold (Metadata and Data Placement)
Typed and Active Storage
Asynchronous Services
[Figure: Client 1, Client 2, and Client 3, each paired with its own Log-structured Index]
Parallel Log-structured File System
[Figure: timeline (axis: Time) with lanes for Application, PLFS, File System, and Optimization Process; annotations: Compute, Return to compute sooner]
Logical top-half file is not materialized
Routes to per-process log file
Append striping within object namespace
Index-enabled logical-to-physical
Interface to index maintenance routines
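
As a rough illustration of the points above, the sketch below models a logical file that is never materialized: each process appends its writes to its own log and records an index entry mapping a logical range to a position in that log. The names `IndexEntry`, `ProcessLog`, and `read_logical` are hypothetical and are not taken from PLFS or DataMods. Overlapping writes are resolved by iteration order here, which is a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    logical_offset: int   # offset in the logical (never materialized) file
    length: int           # number of bytes covered by this entry
    physical_offset: int  # offset inside the writing process's own log


@dataclass
class ProcessLog:
    data: bytearray = field(default_factory=bytearray)
    index: list = field(default_factory=list)

    def write(self, logical_offset, payload):
        # Append-only: the payload always lands at the current end of the log,
        # and the index remembers where it belongs in the logical file.
        self.index.append(IndexEntry(logical_offset, len(payload), len(self.data)))
        self.data.extend(payload)


def read_logical(logs, offset, length):
    """Resolve a logical read by scanning every per-process index."""
    out = bytearray(length)  # ranges nobody wrote read back as zero bytes
    for log in logs:
        for e in log.index:
            lo = max(offset, e.logical_offset)
            hi = min(offset + length, e.logical_offset + e.length)
            if lo < hi:  # the entry overlaps the requested range
                src = e.physical_offset + (lo - e.logical_offset)
                out[lo - offset:hi - offset] = log.data[src:src + (hi - lo)]
    return bytes(out)


logs = [ProcessLog(), ProcessLog()]
logs[0].write(0, b"AAAA")
logs[1].write(4, b"BBBB")
logs[0].write(8, b"CCCC")
print(read_logical(logs, 0, 12))
```

The linear scan over every entry is what makes index size matter: the fewer entries each log carries, the cheaper a logical read becomes.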
[Figure: PLFS Map File, Number of Index Entries (log scale, 1 to 10,000,000) across traces 1 to 91; series: Raw Trace (Baseline), Merge Compress; annotations: Contiguous Writes; Large, Strided]
[Figure: PLFS Map File, Number of Index Entries (log scale, 1 to 10,000,000) across traces 1 to 91; series: Raw Trace (Baseline), Merge Compress, Pattern Compress; annotation: Strided pattern identified]
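
The two compression series can be illustrated with a small sketch. It assumes, as one plausible reading of the slides, that merge compression coalesces entries that are contiguous both logically and in the log, and that pattern compression collapses a run of equal-length entries at a constant logical stride into a single record. The function names and tuple layouts are hypothetical, not the PLFS or DataMods implementation.

```python
def merge_compress(entries):
    """Coalesce adjacent entries that are contiguous logically and physically.

    entries: list of (logical_offset, length, physical_offset) in write order.
    """
    merged = []
    for lo, ln, po in entries:
        if merged:
            plo, pln, ppo = merged[-1]
            if plo + pln == lo and ppo + pln == po:
                merged[-1] = (plo, pln + ln, ppo)  # extend the previous entry
                continue
        merged.append((lo, ln, po))
    return merged


def pattern_compress(entries):
    """Collapse constant-stride runs into one record each.

    Returns (logical_start, length, physical_start, stride, count) tuples.
    A run requires equal lengths, a constant logical stride, and entries
    laid out back to back in the log.
    """
    out = []
    i, n = 0, len(entries)
    while i < n:
        lo, ln, po = entries[i]
        stride = entries[i + 1][0] - lo if i + 1 < n else 0
        count = 1
        while i + count < n and entries[i + count] == (
            lo + count * stride, ln, po + count * ln
        ):
            count += 1
        out.append((lo, ln, po, stride if count > 1 else 0, count))
        i += count
    return out


contiguous = [(0, 4, 0), (4, 4, 4), (8, 4, 8)]
strided = [(0, 4, 0), (16, 4, 4), (32, 4, 8), (48, 4, 12)]
print(merge_compress(contiguous))
print(merge_compress(strided))
print(pattern_compress(strided))
```

Under these assumptions, merging alone leaves a strided workload untouched because its entries are never logically adjacent, while the pattern pass reduces it to one record.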
Index Consolidation
Index Pack
[Figure: Reduction Factor over Baseline (0.1 to 1) per trace; series: Reduction from Merging, Reduction from Pattern; traces include PatternIO, LANL_App1 through LANL_App3, chombo, flash, strided, and nonstrided workloads]
Global reduction HDF5 Indexing, Data reorganization