Discovering Structure in Unstructured I/O




SLIDE 1

Jun He (1,2), John Bent (3), Aaron Torres (4), Gary Grider (4), Garth Gibson (5), Carlos Maltzahn (6), Xian-He Sun (1)

(1) Illinois Institute of Technology
(2) New Mexico Consortium
(3) EMC
(4) Los Alamos National Laboratory
(5) Carnegie Mellon University
(6) University of California Santa Cruz

November 12, 2012

Discovering Structure in Unstructured I/O

SLIDE 2

Outline

SLIDE 3

PLFS (Parallel Log-structured File System) accelerates checkpointing significantly, but its internal metadata may grow too large. This work shows how to recognize I/O patterns and use them to reduce PLFS metadata size. As a result, metadata size is reduced significantly and read/write performance is improved.

This presentation focuses on recognizing I/O patterns and representing them compactly.

[Figure: Metadata Compression, log-scale axis from 1 to 1000. Workloads shown: BTIO.16PE, FLASH.8PE, FLASH.16PE, FLASH.32PE, FLASH.64PE, LANL_App1.64PE, LANL_App2.App_IO_Library, LANL_App2.MPI-IO_Collective, LANL_App2.MPI-IO_Independent, LANL_App3.64PE, PatternIO.4PE, PatternIO.16PE, PatternIO.64PE, Pagoda.]

SLIDE 4

Motivation

SLIDE 5

Up to several orders of magnitude improvement.

Checkpointing is the storage driver in supercomputers. PLFS can improve checkpointing significantly.

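PLFS transparently transforms an N-1 write pattern (N processes writing to one shared file) into an N-N write pattern (each process writing to its own file).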

SLIDE 6

PLFS internal metadata may grow very big.

[Figure: PLFS reorganization. Labels: Logical view, Proc 0, Proc 1, Hole, Physical File 0, Physical File 1, Index.0 (metadata), Index.1 (metadata). Both processes keep writing, and the index entries explode.]

Index.0 (metadata)

Logical Offset   Length   Physical Offset   Chunk ID
0                2        0                 0
3                2        2                 0
7                4        4                 0
14               2        8                 0
17               2        10                0
21               4        12                0
28               2        16                0
31               2        18                0
35               4        20                0
42               3        24                0
46               3        27                0
50               3        30                0
54               3        33                0
58               3        36                0

Index.1 (metadata)

Logical Offset   Length   Physical Offset   Chunk ID
2                1        0                 1
5                2        1                 1
11               3        3                 1
16               1        6                 1
19               2        7                 1
25               3        9                 1
30               1        12                1
33               2        13                1
39               3        15                1
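
The bookkeeping behind this slide can be sketched roughly as follows. This is a toy model, not PLFS code: the class `LogStructuredFile`, its methods, and the in-memory logs are hypothetical names chosen for illustration. Each writer appends data to its own log and records one index entry (logical offset, length, physical offset, chunk ID) per write, which is why the index grows with the number of writes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    logical_offset: int   # position in the shared logical file
    length: int           # number of bytes written
    physical_offset: int  # position inside the writer's own log
    chunk_id: int         # which log holds the data (here: the writer's rank)


class LogStructuredFile:
    """Toy model (assumption): one append-only log per writer plus a shared index."""

    def __init__(self, num_writers):
        self.logs = [bytearray() for _ in range(num_writers)]
        self.index = []

    def write(self, rank, logical_offset, data):
        log = self.logs[rank]
        # One index entry per write: this is the metadata that "explodes".
        self.index.append(IndexEntry(logical_offset, len(data), len(log), rank))
        log.extend(data)

    def read(self, logical_offset, length):
        out = bytearray(length)  # holes read back as zero bytes
        end = logical_offset + length
        for e in self.index:  # later entries overwrite earlier ones
            lo = max(logical_offset, e.logical_offset)
            hi = min(end, e.logical_offset + e.length)
            if lo < hi:
                src = e.physical_offset + (lo - e.logical_offset)
                out[lo - logical_offset:hi - logical_offset] = \
                    self.logs[e.chunk_id][src:src + (hi - lo)]
        return bytes(out)


# Two writers interleave in the logical file, as in the slide's first rows.
f = LogStructuredFile(num_writers=2)
f.write(0, 0, b"AA")
f.write(1, 2, b"b")
f.write(0, 3, b"AA")
f.write(1, 5, b"bb")
f.write(0, 7, b"AAAA")
print(len(f.index))   # one entry per write
print(f.read(0, 11))  # logical view stitched back together from both logs
```

Five writes already need five index entries; a checkpoint with millions of small writes needs millions, and every reader has to load them.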

SLIDE 7

Pattern of LANL anonymous 3. Colors indicate ranks.

Applications’ I/O has patterns and they can be represented compactly.
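
One possible compact form, shown here only as a sketch, stores a start offset, a short unit of strides, and a repeat count instead of one entry per write. The class `StridePattern` and its fields are hypothetical and are not taken from the PLFS source; the sample numbers are the logical offsets used on the sliding-window slide later in this deck.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StridePattern:
    """Hypothetical compact form: start, then `strides` applied `repeats` times."""
    start: int
    strides: Tuple[int, ...]
    repeats: int

    def count(self):
        # The start offset plus one offset per applied stride.
        return 1 + len(self.strides) * self.repeats

    def offset_at(self, k):
        """k-th offset (0-based), computed arithmetically without expanding."""
        if not 0 <= k < self.count():
            raise IndexError(k)
        cycles, rest = divmod(k, len(self.strides))
        return self.start + cycles * sum(self.strides) + sum(self.strides[:rest])

    def offsets(self):
        return [self.offset_at(k) for k in range(self.count())]


first = StridePattern(start=0, strides=(3, 4, 7), repeats=3)
second = StridePattern(start=42, strides=(4,), repeats=4)
print(first.offsets())
print(second.offsets())
```

Under these assumptions, two small records stand in for fourteen separate offsets, and any single offset can still be looked up directly with `offset_at`.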

SLIDE 8

Metadata of LANL anonymous 3 is big.

[Figure: size comparison of the file size, the metadata on disks, the replicated metadata (each reader has a copy), and the replicated metadata after pattern compression.]

SLIDE 9

Related Work

SLIDE 10

1. (DARSHAN) P. Carns, K. Harms, W. Allcock, C. Bacon, S. Lang, R. Latham, and R. Ross, “Understanding and improving computational science storage access through continuous characterization,” ACM Transactions on Storage (TOS), vol. 7, no. 3, p. 8, 2011.
2. B. Pasquale and G. Polyzos, “A static analysis of I/O characteristics of scientific applications in a production workload,” in Proceedings of the 1993 ACM/IEEE Conference on Supercomputing. ACM, 1993, pp. 388–397.
3. E. Smirni and D. Reed, “Lessons from characterizing the input/output behavior of parallel scientific applications,” Performance Evaluation, vol. 33, no. 1, pp. 27–44, 1998.
4. S. Byna, Y. Chen, X. Sun, R. Thakur, and W. Gropp, “Parallel I/O prefetching using MPI file caching and I/O signatures,” in Proceedings of the 2008 ACM/IEEE Conference on Supercomputing. IEEE Press, 2008, p. 44.
5. J. He, H. Song, X. Sun, Y. Yin, and R. Thakur, “Pattern-aware file reorganization in MPI-IO,” in Proceedings of the Sixth Workshop on Parallel Data Storage. ACM, 2011, pp. 43–48.
6. T. Madhyastha and D. Reed, “Learning to classify parallel input/output access patterns,” IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 8, pp. 802–813, 2002.
7. J. Oly and D. Reed, “Markov model prediction of I/O requests for scientific applications,” in Proceedings of the 16th International Conference on Supercomputing. ACM, 2002, pp. 147–155.
8. N. Tran and D. Reed, “Automatic time series modeling for adaptive I/O prefetching,” IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 4, pp. 362–377, 2004.

Coarse-granularity patterns are not precise enough. Statistical methods are lossy.

[Figures: one from reference 1 (thanks to Phil Carns) and one from reference 7.]

SLIDE 11

Methods

SLIDE 12

A sliding window algorithm is effective at discovering patterns.

Complexity: O(wn), where w is the window size and n is the input length.

[Figure: logical file with the written regions marked.]

Logical offsets: 0 3 7 14 17 21 28 31 35 42 46 50 54 58
Stride list:     3 4 7 3 4 7 3 4 7 4 4 4 4
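
A minimal sketch of pattern discovery on a stride list is given below. It is a simplified greedy variant written for illustration, not the authors' implementation, and the O(wn) bound on the slide refers to their algorithm rather than to this sketch. The function names `find_patterns` and `expand`, the `window` parameter, and the tuple layout (start offset, stride unit, repeats) are assumptions.

```python
def find_patterns(offsets, window=4):
    """Greedy sketch: turn offsets into (start, stride_unit, repeats) tuples.

    Needs at least two offsets; `window` caps the length of a stride unit.
    """
    strides = [b - a for a, b in zip(offsets, offsets[1:])]
    patterns = []
    i, n = 0, len(strides)
    while i < n:
        best_period, best_repeats = 1, 1
        for period in range(1, min(window, n - i) + 1):
            unit = strides[i:i + period]
            repeats = 1
            # Count how many times the unit repeats back to back.
            while strides[i + repeats * period:i + (repeats + 1) * period] == unit:
                repeats += 1
            # Keep the candidate that covers the most strides (ties: shorter unit).
            if period * repeats > best_period * best_repeats:
                best_period, best_repeats = period, repeats
        patterns.append((offsets[i], tuple(strides[i:i + best_period]), best_repeats))
        i += best_period * best_repeats
    return patterns


def expand(patterns):
    """Rebuild the offset list, to check that the compact form loses nothing."""
    if not patterns:
        return []
    offsets = [patterns[0][0]]
    for _start, unit, repeats in patterns:
        for _ in range(repeats):
            for stride in unit:
                offsets.append(offsets[-1] + stride)
    return offsets


logical_offsets = [0, 3, 7, 14, 17, 21, 28, 31, 35, 42, 46, 50, 54, 58]
patterns = find_patterns(logical_offsets)
print(patterns)
print(expand(patterns) == logical_offsets)
```

On the slide's input this sketch yields two tuples: the unit (3, 4, 7) repeated three times starting at offset 0, then the unit (4) repeated four times starting at offset 42. Unlike the statistical methods mentioned under related work, the round trip through `expand` is exact.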

SLIDE 13

Results

SLIDE 14

Example: write patterns of MILC (physics app). In-memory index compression rates by Pattern PLFS (higher is better): (A): 37.0; (B): 3.0; (C): 3.6


Patterns of real applications are explored, as well as benchmarks.

Applications explored:

  • LIVE RUN: Pagoda (PNNL), MPI-Blast, MILC, Montage (NASA), ADIOS (ORNL), MADBench2 (LBL)
  • TRACE REPLAY: Alegra (SNL), S3D (SNL), LANL anonymous applications, FLASH, BTIO

Benchmarks explored:

  • PATTERN-IO (NERSC), MPI-TILE-IO (ANL), FS-TEST (LANL)

SLIDE 15

Write Performance Improvement

512 processes with write size of 4K.

[Figure: Pattern PLFS vs. PLFS 2.2.1 against the number of writes (K), in three panels: (A) open time (sec), (B) bandwidth (MB/s), (C) close time. Plot annotations: 1.5GB/s, Unchanged.]

[Figure: index memory footprint of Pattern PLFS vs. PLFS 2.2.1 against the number of originating writes.]

SLIDE 16

Read Performance Improvement

Uniform read: 512 processes. Non-uniform read: 256 processes.

[Figure: Pattern PLFS vs. PLFS 2.2.1 against the number of originating writes (K), in four panels: (A) uniform read, open time (sec); (B) uniform read, bandwidth (MB/s); (C) non-uniform read, open time (sec); (D) non-uniform read, bandwidth (MB/s). Plot annotation: 480%.]

SLIDE 17

PLFS metadata can be reduced by up to several orders of magnitude.

[Figure: Metadata Compression, log-scale axis from 1 to 1000 with a 1500 annotation. Workloads shown: BTIO.16PE, FLASH.8PE, FLASH.16PE, FLASH.32PE, FLASH.64PE, LANL_App1.64PE, LANL_App2.App_IO_Library, LANL_App2.MPI-IO_Collective, LANL_App2.MPI-IO_Independent, LANL_App3.64PE, PatternIO.4PE, PatternIO.16PE, PatternIO.64PE, Pagoda.]

SLIDE 18

Conclusions & Future Work

SLIDE 19

The proposed sliding window algorithm is effective at discovering structure and improving I/O performance.

Application patterns are studied. An I/O structure discovery algorithm and a compact structure representation are proposed.

Metadata is reduced and I/O performance is improved.

[Figure: the read performance panels from slide 16, repeated: (A) and (B) uniform read, (C) and (D) non-uniform read, Pattern PLFS vs. PLFS 2.2.1.]

SLIDE 20

The proposed techniques have the potential to be applied in other systems.

Predictability & Compactness

  • Pre-fetching
  • Block pre-allocation
  • Data layout optimization
  • SciHadoop metadata compression

SLIDE 21

Acknowledgement

  • Michael Lang (Los Alamos National Laboratory)
  • Adam Manzanares (California State University)
  • All the reviewers

This work was performed at the Ultrascale Systems Research Center (USRC) at Los Alamos National Laboratory, supported by the U.S. Department of Energy DE-FC02-06ER25750. The publication has been assigned the LANL identifier LA-UR-12-25954.

SLIDE 22

Q & A

Jun’s email: junnhe@gmail.com