Fixed-PSNR Lossy Compression for Scientific Data

Dingwen Tao (The University of Alabama, USA)
Sheng Di (Argonne National Laboratory, USA)
Xin Liang (University of California, Riverside, USA)
Zizhong Chen (University of California, Riverside, USA)
Franck Cappello (Argonne National Laboratory, USA)
September 2018

Outline
• Introduction
• Background
• State-of-the-Art Lossy Compressor
• Peak Signal to Noise Ratio (PSNR)
• L2-Norm-Preserving Lossy Compression
• Design of Fixed-PSNR Lossy Compression
• Experimental Evaluation
• Conclusion

Introduction
• Scientific data are growing extremely fast: the ability to generate data is exceeding our ability to store and analyze it
  • Simulation systems and observation instruments grow in capability with Moore’s Law
  • Petabyte (PB) data sets will be common soon!
• Climate simulation (CESM)
  • High-resolution simulation → 1 TB of data generated per compute day
• IPCC Coupled Model Intercomparison Projects (CMIPs)
  • Phase 5 (2013): 2.5 PB of output → Phase 6 (2018): 10 PB expected!
• The relative cost of storage is increasing
  • Previous NCAR platform (2013): ~20% of hardware budget
  • Current NCAR platform (2017): ~50% of hardware budget
• Data reduction by a factor of at least 10 is needed! (A. Baker et al., HPDC’16)

State-of-the-Art Lossy Compression
• ISABELA (NCSU)
  • Sorting preconditioner
  • B-Spline interpolation
  • Error control mode: pointwise relative error bound
• ZFP (LLNL)
  • Customized orthogonal block transform
  • Exponent alignment
  • Block-wise bit-stream truncation
  • Error control mode: absolute error bound
• SZ (ANL)
  • Multi-dimensional prediction
  • Error-controlled quantization
  • Variable-length encoding
  • Unpredictable data analysis
  • Error control modes: absolute error bound, pointwise relative error bound, value-range-based relative error bound
• VAPOR (NCAR)
  • Wavelet transformation
  • Vector quantization
  • Error control mode: no error bound control scheme
• None of these compressors can control L2-norm-based data distortion (e.g., RMSE, PSNR)

Peak Signal to Noise Ratio
• PSNR: one of the most critical indicators used to assess the distortion of reconstructed data vs. original data
• PSNR is defined as
  $$\mathrm{PSNR} = -20 \cdot \log_{10}(\mathrm{NRMSE})$$
  where NRMSE (normalized root mean squared error), an L2-norm-based data distortion, is
  $$\mathrm{NRMSE} = \frac{\sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - x'_i)^2}}{x_{\max} - x_{\min}}$$
  with $x_i$ the original values, $x'_i$ the reconstructed values, and $N$ the number of data points
Fig. Distortion of Slice 50 in Hurricane-ISABEL (TCf48) data with compression ratio of 117:1: (a) original raw data, (b) SZ-2.0 (PSNR = 56 dB), (c) SZ-1.4 (PSNR = 39 dB), (d) ZFP (PSNR = 31 dB)
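The PSNR definition above can be sketched in a few lines of Python. This is only an illustration of the formula, not code from the SZ or ZFP packages; the function names `nrmse` and `psnr` and the sample arrays are assumptions made for the example.

```python
import math

def nrmse(original, reconstructed):
    """Root mean squared error normalized by the value range of the original data."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / len(original)
    value_range = max(original) - min(original)
    if value_range == 0:
        raise ValueError("value range is zero; NRMSE is undefined")
    return math.sqrt(mse) / value_range

def psnr(original, reconstructed):
    """PSNR in dB: -20 * log10(NRMSE); infinite when the reconstruction is exact."""
    e = nrmse(original, reconstructed)
    return math.inf if e == 0 else -20.0 * math.log10(e)

# Made-up sample data: every point is off by about 0.01 on a value range of 4.
original = [0.0, 1.0, 2.0, 3.0, 4.0]
reconstructed = [0.01, 0.99, 2.01, 2.99, 4.01]
print(f"PSNR = {psnr(original, reconstructed):.2f} dB")
```

Because the error is normalized by the value range, the same absolute error yields a higher PSNR on data with a wider range.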

L2-Norm-Preserving Lossy Compression
• Prediction-based lossy compression
  • Compression
    1. Predict data values (i.e., $X_{pred}$) and calculate prediction errors (i.e., $X_{pe}$)
    2. Quantize or encode $X_{pe}$
    3. Entropy encoding (optional)
  • Decompression
    1. Reconstruct prediction values (i.e., $X^{decomp}_{pred}$)
    2. De-quantize or decode $X^{decomp}_{pe}$
    3. Reconstruct data values: $X^{decomp} = X^{decomp}_{pred} + X^{decomp}_{pe}$
• Assure $X_{pred} = X^{decomp}_{pred}$ in compression; otherwise data loss will propagate!
• Since $X = X_{pred} + X_{pe}$, it follows that $X - X^{decomp} = X_{pe} - X^{decomp}_{pe}$
• Theorem 1: For prediction-based lossy compression, the overall L2-norm-based data distortion is the same as the distortion (introduced in Step 2) of the prediction error.
• The same result is also correct for orthogonal-transform-based lossy compression (such as ZFP)
• Theorem 2: For orthogonal-transform-based lossy compression, the overall L2-norm-based data distortion is the same as the distortion (introduced in Step 2) of the transformed data.
• Consequence: the overall L2-norm-based distortion can be estimated from the data distortion in Step 2
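Theorem 1 can be checked numerically with a toy compressor. The sketch below is a hypothetical one-dimensional example, not the SZ predictor: it assumes a previous-value predictor built from decompressed values (so that the prediction is identical in compression and decompression) and uniform quantization as Step 2. The names `compress_decompress` and `mse` are made up for this illustration.

```python
import random

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def compress_decompress(data, bin_size):
    """Toy prediction-based compressor (hypothetical, not SZ itself).

    Predictor: the previous DECOMPRESSED value, so X_pred == X_pred^decomp.
    Step 2: uniform quantization of the prediction error.
    Returns (decompressed data, prediction errors, dequantized prediction errors).
    """
    decompressed, pred_errors, dequantized = [], [], []
    prev = 0.0
    for x in data:
        pred = prev                  # same prediction the decompressor will make
        pe = x - pred                # prediction error X_pe
        code = round(pe / bin_size)  # integer quantization code (Step 2)
        pe_dec = code * bin_size     # dequantized prediction error
        x_dec = pred + pe_dec        # reconstructed data value
        decompressed.append(x_dec)
        pred_errors.append(pe)
        dequantized.append(pe_dec)
        prev = x_dec
    return decompressed, pred_errors, dequantized

# Made-up test signal: a random walk.
random.seed(0)
data, value = [], 0.0
for _ in range(1000):
    value += random.uniform(-1.0, 1.0)
    data.append(value)

decompressed, pe, pe_dec = compress_decompress(data, bin_size=0.01)
data_mse = mse(data, decompressed)
step2_mse = mse(pe, pe_dec)
print(f"MSE of data:              {data_mse:.3e}")
print(f"MSE of prediction errors: {step2_mse:.3e}")
```

The two printed values agree up to floating-point rounding. If the predictor used original values instead of decompressed ones, the decompressor could not reproduce the predictions and the errors would accumulate.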

Design of Fixed-PSNR Lossy Compression
• Quantization: prediction errors (or transformed data), i.e., $X_{pe}$ → a set of integers
• Dequantization: integers → decompressed prediction errors
• $X_{pe} \sim P(x)$, where $P(x)$ is the probability density function
• Distortion introduced by quantization:
  $$\mathrm{MSE}(X_{pe}, X^{decomp}_{pe}) = \int_{-\infty}^{\infty} (x - Q(x))^2 \cdot P(x)\,dx \approx \frac{1}{12}\sum_{i=1}^{2n} P(m_i)\,\delta_i^3$$
  where $Q(x)$ is the dequantized value of $x$, $\delta_i$ is the size of the $i$-th quantization bin, and $m_i$ is that bin’s midpoint
• Assume uniform quantization:
  • $\delta_1 = \delta_2 = \cdots = \delta_{2n} = \delta$
  • Adopted by SZ lossy compressor
• $\mathrm{MSE} = \frac{1}{12}\delta^2$, $\mathrm{PSNR} = 20 \cdot \log_{10}\frac{vr}{\delta} + 10 \cdot \log_{10} 12$, where $vr = x_{\max} - x_{\min}$ is the value range
• Absolute error bound $= \frac{1}{2} \cdot \delta$ ⇒ $eb_{abs} = \sqrt{3} \cdot 10^{-\mathrm{PSNR}/20} \cdot vr$
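The final formula maps a demanded PSNR to an absolute error bound. The sketch below illustrates that mapping under the uniform-quantization assumption. It is not the SZ implementation: `abs_error_bound_for_psnr` and `uniform_quantize` are hypothetical helpers, the test data are synthetic uniform random numbers, and the values are quantized directly (equivalent to a zero predictor), which by Theorem 1 gives the same distortion as quantizing prediction errors.

```python
import math
import random

def abs_error_bound_for_psnr(target_psnr_db, value_range):
    """eb_abs = sqrt(3) * 10^(-PSNR/20) * vr, from the uniform-quantization model."""
    return math.sqrt(3.0) * 10.0 ** (-target_psnr_db / 20.0) * value_range

def uniform_quantize(values, eb_abs):
    """Snap each value to the midpoint of its bin of size delta = 2 * eb_abs."""
    delta = 2.0 * eb_abs
    return [round(v / delta) * delta for v in values]

def psnr(original, reconstructed):
    """PSNR in dB: -20 * log10(RMSE / value range)."""
    mse = sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / len(original)
    vr = max(original) - min(original)
    return -20.0 * math.log10(math.sqrt(mse) / vr)

# Synthetic data (an assumption for the demo): uniform random values in [0, 1).
random.seed(1)
data = [random.random() for _ in range(20000)]
vr = max(data) - min(data)

target = 60.0  # demanded PSNR in dB
eb = abs_error_bound_for_psnr(target, vr)
reconstructed = uniform_quantize(data, eb)
measured = psnr(data, reconstructed)
print(f"target PSNR = {target:.1f} dB, eb_abs = {eb:.3e}, measured PSNR = {measured:.2f} dB")
```

The measured PSNR lands close to the target here because the values are spread almost evenly within each bin. On real data the match depends on how well that assumption holds, and it tends to hold better for small bins, that is, for higher demanded PSNRs.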

Experimental Evaluation
• Experimental setup
  • Bebop cluster at Argonne (Intel Xeon E5-2695 v4 processors and 128 GB of memory)
  • Data: 2D CESM-ATM, 3D Hurricane-ISABEL, 3D NYX
  • Fixed-PSNR mode implemented on top of the SZ framework

Experimental Results
• Our fixed-PSNR lossy compressor can precisely control PSNRs
  • Meets the demanded PSNR for 90%+ of the ATM data fields on average
  • Deviation from the demanded PSNR is limited to 0.1~5.0 dB on average
  • The higher the demanded PSNR, the better our fixed-PSNR method performs
