Discovering Structure in Unstructured I/O

Jun He 1,2, John Bent 3, Aaron Torres 4, Gary Grider 4, Garth Gibson 5, Carlos Maltzahn 6, Xian-He Sun 1
1 Illinois Institute of Technology
2 New Mexico Consortium
3 EMC
4 Los Alamos National Laboratory
5 Carnegie Mellon University
6 University of California Santa Cruz
November 12, 2012

Outline

This presentation focuses on recognizing I/O patterns and representing them compactly. PLFS (Parallel Log-structured File System) accelerates checkpointing significantly, but its internal metadata may grow too large. The question is how to recognize I/O patterns and reduce PLFS metadata size. Metadata size is reduced significantly and R/W performance is improved.
[Figure: metadata compression per workload (Pagoda, PatternIO, LANL_App1 to LANL_App3, FLASH, BTIO); axis "Metadata Compression", ticks from 1 to 1000.]

Motivation

Checkpointing is the main driver of storage requirements in supercomputers. PLFS can improve checkpointing performance by up to several orders of magnitude. PLFS transparently transforms N-1 writes (N processes writing to one shared file) into N-N writes (each process writing to its own file).

PLFS internal metadata may grow very big.
[Figure: Proc 0 and Proc 1 write to one logical file (logical view, with a hole). PLFS reorganization places their data in Physical File 0 and Physical File 1 and records an index (metadata) for each. As the processes keep writing, the indexes explode.]
Index.0 (metadata)
Logical Offset   Length   Physical Offset   Chunk ID
0                2        0                 0
3                2        2                 0
7                4        4                 0
14               2        8                 0
17               2        10                0
21               4        12                0
28               2        16                0
31               2        18                0
35               4        20                0
42               3        24                0
46               3        27                0
50               3        30                0
54               3        33                0
58               3        36                0
Index.1 (metadata)
Logical Offset   Length   Physical Offset   Chunk ID
2                1        0                 1
5                2        1                 1
11               3        3                 1
16               1        6                 1
19               2        7                 1
25               3        9                 1
30               1        12                1
33               2        13                1
39               3        15                1
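The index entries on this slide can be illustrated with a small sketch. It assumes, based only on the numbers shown in the slide, that each process appends its writes to its own physical file, so the physical offset is the running sum of earlier write lengths. The names `IndexEntry` and `build_index` are hypothetical and are not part of the PLFS API.

```python
from collections import namedtuple

# Hypothetical record mirroring the four columns shown on the slide.
IndexEntry = namedtuple(
    "IndexEntry", "logical_offset length physical_offset chunk_id"
)

def build_index(writes, chunk_id):
    """Build one index entry per write for a single process.

    writes: list of (logical_offset, length) pairs in write order.
    Assumption: data is appended to the physical file, so each entry's
    physical offset is the sum of the lengths written before it.
    """
    entries = []
    physical = 0
    for logical, length in writes:
        entries.append(IndexEntry(logical, length, physical, chunk_id))
        physical += length
    return entries

# First four writes of the process with chunk ID 1, taken from the slide.
index_1 = build_index([(2, 1), (5, 2), (11, 3), (16, 1)], chunk_id=1)
for entry in index_1:
    print(entry)
```

One entry is stored per write, so the index grows linearly with the number of writes, which is the growth the slide is pointing at.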

Applications’ I/O has patterns and they can be represented compactly.
[Figure: pattern of LANL anonymous 3. Colors indicate ranks.]
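As a rough illustration of what a compact representation could look like, the sketch below stores a run of offsets as a start offset, a repeating unit of strides, and a repeat count, and expands it back on demand. The tuple layout and the function name `expand` are assumptions made for this example, not the format PLFS actually uses. The offsets are the logical offsets that appear elsewhere in this deck.

```python
def expand(start, unit, repeats):
    """Rebuild the offsets described by (start, unit, repeats).

    start:   first offset of the run
    unit:    list of strides that repeats
    repeats: how many times the unit is applied
    """
    offsets = [start]
    for _ in range(repeats):
        for stride in unit:
            offsets.append(offsets[-1] + stride)
    return offsets

# Two compact records stand in for fourteen explicit offsets.
first = expand(0, [3, 4, 7], 3)    # strides 3, 4, 7 repeated three times
second = expand(42, [4], 4)        # stride 4 repeated four times
full = first + second[1:]          # 42 ends the first run and starts the second
print(full)
```

Under these assumptions, fourteen offsets are described by two small records, and the saving grows with the number of repeats.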

Metadata of LANL anonymous 3 is big.
[Figure: comparison of file size, metadata on disks, replicated metadata (each reader has a copy), and replicated metadata after pattern compression.]

Related Work

Coarse-granularity patterns are not precise enough. Statistical methods are lossy.
[Figures: one from reference 1 (thanks to Phil Carns) and one from reference 7.]
1. (DARSHAN) P. Carns, K. Harms, W. Allcock, C. Bacon, S. Lang, R. Latham, and R. Ross, “Understanding and improving computational science storage access through continuous characterization,” ACM Transactions on Storage (TOS), vol. 7, no. 3, p. 8, 2011.
2. B. Pasquale and G. Polyzos, “A static analysis of I/O characteristics of scientific applications in a production workload,” in Proceedings of the 1993 ACM/IEEE Conference on Supercomputing. ACM, 1993, pp. 388-397.
3. E. Smirni and D. Reed, “Lessons from characterizing the input/output behavior of parallel scientific applications,” Performance Evaluation, vol. 33, no. 1, pp. 27-44, 1998.
4. S. Byna, Y. Chen, X. Sun, R. Thakur, and W. Gropp, “Parallel I/O prefetching using MPI file caching and I/O signatures,” in Proceedings of the 2008 ACM/IEEE Conference on Supercomputing. IEEE Press, 2008, p. 44.
5. J. He, H. Song, X. Sun, Y. Yin, and R. Thakur, “Pattern-aware file reorganization in MPI-IO,” in Proceedings of the Sixth Workshop on Parallel Data Storage. ACM, 2011, pp. 43-48.
6. T. Madhyastha and D. Reed, “Learning to classify parallel input/output access patterns,” IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 8, pp. 802-813, 2002.
7. J. Oly and D. Reed, “Markov model prediction of I/O requests for scientific applications,” in Proceedings of the 16th International Conference on Supercomputing. ACM, 2002, pp. 147-155.
8. N. Tran and D. Reed, “Automatic time series modeling for adaptive I/O prefetching,” IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 4, pp. 362-377, 2004.

Methods

The sliding window algorithm is effective in discovering patterns.
Logical offsets: 0 3 7 14 17 21 28 31 35 42 46 50 54 58
Stride list: 3 4 7 3 4 7 3 4 7 4 4 4 4
Complexity: O(wn), where w is the window size and n is the input length.
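The slide's example can be reproduced with a simplified sketch. This is not the authors' implementation: it is a greedy search, written for illustration, that derives the stride list from the logical offsets and then looks for the shortest repeating unit of at most `window` strides at each position. The function names are hypothetical, and this naive version does not necessarily meet the O(wn) bound stated on the slide.

```python
def strides(offsets):
    """Differences between consecutive logical offsets."""
    return [b - a for a, b in zip(offsets, offsets[1:])]

def find_patterns(seq, window):
    """Greedily split seq into (unit, repeats) pairs.

    At each position, try unit lengths 1..window and take the shortest
    unit that repeats at least twice in a row. If none repeats, emit the
    single element as a unit with repeats == 1.
    """
    patterns = []
    i = 0
    while i < len(seq):
        best = ([seq[i]], 1)
        for p in range(1, window + 1):
            unit = seq[i:i + p]
            k = 1
            # Count how many consecutive copies of unit follow position i.
            while seq[i + k * p:i + (k + 1) * p] == unit:
                k += 1
            if k > 1:
                best = (unit, k)
                break
        patterns.append(best)
        i += len(best[0]) * best[1]
    return patterns

offsets = [0, 3, 7, 14, 17, 21, 28, 31, 35, 42, 46, 50, 54, 58]
stride_list = strides(offsets)
print(stride_list)                    # [3, 4, 7, 3, 4, 7, 3, 4, 7, 4, 4, 4, 4]
print(find_patterns(stride_list, 4))  # [([3, 4, 7], 3), ([4], 4)]
```

With a window of 4, the thirteen strides collapse into two records: the unit 3, 4, 7 repeated three times, followed by the stride 4 repeated four times.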

Results

Patterns of real applications are explored, as well as benchmarks.
Applications explored:
• LIVE RUN: Pagoda (PNNL), MPI-Blast, MILC, Montage (NASA), ADIOS (ORNL), MADBench2 (LBL)
• TRACE REPLAY: Alegra (SNL), S3D (SNL), LANL anonymous applications, FLASH, BTIO
Benchmarks explored:
• PATTERN-IO (NERSC), MPI-TILE-IO (ANL), FS-TEST (LANL)
Example: write patterns of MILC (physics app). In-memory index compression rates by Pattern PLFS (higher is better): (A) 37.0; (B) 3.0; (C) 3.6.

Write Performance Improvement
[Figure: memory footprint per index versus number of originating writes (16K, 64K, 256K), Pattern PLFS compared with PLFS 2.2.1.]
[Figure: (A) Open Time (sec), (B) Bandwidth (MB/s), (C) Close T, each versus number of writes (16K, 64K, 256K), Pattern PLFS compared with PLFS 2.2.1. Plot annotations: "Unchanged", "1.5GB/s".]
512 processes with write size of 4K.

Read Performance Improvement
[Figure: Open Time (sec) and Bandwidth (MB/s) versus number of originating writes (16K, 64K, 256K), Pattern PLFS compared with PLFS 2.2.1. Panels (A) and (B): uniform read. Panels (C) and (D): non-uniform read. Plot annotation: "480%".]
Uniform read: 512 processes. Non-uniform read: 256 processes.

PLFS metadata can be reduced by up to several orders of magnitude.
[Figure: metadata compression per workload; axis "Metadata Compression", ticks from 1 to 1000, with the value 1500 also marked. Workloads: Pagoda, PatternIO.64PE, PatternIO.4PE, PatternIO.16PE, LANL_App3.64PE, LANL_App2.MPI-IO_Independent, LANL_App2.MPI-IO_Collective, LANL_App2.App_IO_Library, LANL_App1.64PE, FLASH.8PE, FLASH.64PE, FLASH.32PE, FLASH.16PE, BTIO.16PE.]

Conclusions & Future Work

The proposed sliding window algorithm is effective at discovering structure and improving I/O performance. Application patterns are studied. An I/O structure discovery algorithm and a compact structure representation are proposed. Metadata is reduced and I/O performance is improved.
[Figure: read performance plots, Open Time (sec) and Bandwidth (MB/s) for uniform and non-uniform reads versus number of originating writes (16K, 64K, 256K), Pattern PLFS compared with PLFS 2.2.1.]

The proposed techniques have the potential to be applied in other systems.
Predictability & Compactness
• Pre-fetching
• Block pre-allocation
• Data layout optimization
• SciHadoop metadata compression

Acknowledgement
• Michael Lang (Los Alamos National Laboratory)
• Adam Manzanares (California State University)
• All the reviewers
This work was performed at the Ultrascale Systems Research Center (USRC) at Los Alamos National Laboratory, supported by the U.S. Department of Energy DE-FC02-06ER25750. The publication has been assigned the LANL identifier LA-UR-12-25954.

Q & A Jun’s email: junnhe@gmail.com
