Towards Optimizing Large-Scale Data Transfers with End-to-End Integrity Verification



SLIDE 1

Towards Optimizing Large-Scale Data Transfers with End-to-End Integrity Verification

Raj Kettimuthu, Argonne National Laboratory & University of Chicago
Si Liu, Xian-He Sun, Illinois Institute of Technology
Eun-Sung Jung, Hongik University

SLIDE 2

Exploding data volumes

100,000 TB

MACHO et al.: 1 TB

Palomar: 3 TB

2MASS: 10 TB

GALEX: 30 TB

Sloan: 40 TB

Pan-STARRS: 40,000 TB

2004: 36 TB

2014: 3,300 TB

2020: 100+ EB

10^5 increase in data volumes in 6 years

Astronomy, Climate, Genomics

SLIDE 3

End-to-end wide-area data transfers

[Diagram: two sites, each with a storage system attached to a data transfer node; the two data transfer nodes are connected over the wide area]

SLIDE 4

Pipeline Transfer and Checksum

[Figure: timeline of pipelined transfer and checksum operations; horizontal axis: Time]

SLIDE 5

Pipelining Data Transfer and End-to-End Data Integrity Check

§ Pipelining

– File-level pipelining: overlap a file transfer and a file integrity check
– Block-level pipelining: overlap a block transfer and a block data integrity check

  • Block size is less than the average file size in a dataset

§ Analytical Modeling

  • t: transfer time of 500 MB of data
  • c: checksum time of 500 MB of data

§ Enhancing Block-level Pipelining

– Based on the analysis, the best performance is achieved when the data transfer time is close to the data checksum time
– Checksum-dominant case: reduce the data checksum time (current work)
– Transfer-dominant case: reduce the transfer time (future work)
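
The claim that performance is best when t is close to c can be illustrated with a textbook two-stage pipeline model. This is a sketch under stated assumptions, not the model from the talk: every block is assumed to take the same transfer time `t` and checksum time `c`, there is one transfer stage and one checksum stage, and the function names are hypothetical.

```python
def sequential_time(n_blocks, t, c):
    """Total time when each block is transferred, then checksummed, with no overlap."""
    return n_blocks * (t + c)


def pipelined_time(n_blocks, t, c):
    """Total time for an idealized two-stage pipeline (assumed model).

    The first block must be transferred before any checksum can start (t),
    the last block must still be checksummed at the end (c), and in between
    the slower stage sets the pace for the remaining n_blocks - 1 steps.
    """
    if n_blocks == 0:
        return 0.0
    return t + (n_blocks - 1) * max(t, c) + c


def speedup(n_blocks, t, c):
    """Ratio of sequential time to pipelined time."""
    return sequential_time(n_blocks, t, c) / pipelined_time(n_blocks, t, c)


if __name__ == "__main__":
    n = 20
    # Same total work per block (t + c = 4), split differently between stages.
    for t, c in [(2.0, 2.0), (1.0, 3.0), (3.0, 1.0)]:
        print(f"t={t} c={c} pipelined={pipelined_time(n, t, c)} "
              f"speedup={speedup(n, t, c):.2f}")
```

With the per-block work held constant, the balanced split (t = c) gives the shortest pipelined time in this model; in the checksum-dominant and transfer-dominant cases the slower stage bounds the throughput, which is why reducing the dominant time is the natural optimization.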

11/13/16

SLIDE 6

Block-level pipelining -- Results

§ Results on Cooley

§ Results on Rain

SLIDE 7

Block-level Pipelining – Perfect Pipeline


Comparison of the performance of 1-Checksum-Thread and 2-Checksum-Thread on Cooley
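
The comparison above varies the number of checksum threads. A minimal sketch of block-level pipelining with a configurable number of checksum workers might look as follows. Everything here is illustrative rather than the authors' implementation: slicing an in-memory buffer stands in for receiving a block over the network, SHA-256 is an assumed checksum algorithm (the slides do not name one), and `pipelined_checksums` is a hypothetical helper.

```python
import hashlib
import queue
import threading


def pipelined_checksums(data, block_size, n_checksum_threads=1):
    """Return per-block SHA-256 hex digests, hashing blocks while later ones arrive."""
    # Bounded queue: the "transfer" stage stalls if the checksum stage falls behind.
    blocks = queue.Queue(maxsize=4)
    digests = {}
    lock = threading.Lock()

    def checksum_worker():
        while True:
            item = blocks.get()
            if item is None:  # sentinel: no more blocks for this worker
                return
            index, block = item
            digest = hashlib.sha256(block).hexdigest()
            with lock:
                digests[index] = digest

    workers = [threading.Thread(target=checksum_worker)
               for _ in range(n_checksum_threads)]
    for worker in workers:
        worker.start()

    # "Transfer" stage (assumption): hand over one block at a time.
    for index, offset in enumerate(range(0, len(data), block_size)):
        blocks.put((index, data[offset:offset + block_size]))

    # One sentinel per worker, then wait for all checksums to finish.
    for _ in workers:
        blocks.put(None)
    for worker in workers:
        worker.join()

    # Reassemble digests in block order, since workers may finish out of order.
    return [digests[i] for i in range(len(digests))]


if __name__ == "__main__":
    payload = bytes(range(256)) * 1000
    one = pipelined_checksums(payload, 64 * 1024, n_checksum_threads=1)
    two = pipelined_checksums(payload, 64 * 1024, n_checksum_threads=2)
    print(len(one), one == two)
```

Adding a second worker only helps in the checksum-dominant case, where blocks arrive faster than one thread can hash them; when the transfer stage is the bottleneck, the extra worker mostly waits on the queue.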

SLIDE 8

Questions