PaSTRI: Error-Bounded Lossy Compression

  1. PaSTRI: Error-Bounded Lossy Compression for Two-Electron Integrals in Quantum Chemistry
     Ali Murat Gok (Northwestern University, USA), Sheng Di (Argonne National Laboratory, USA), Yuri Alexeev (Argonne National Laboratory, USA), Dingwen Tao (The University of Alabama, USA), Vladimir Mironov (Lomonosov Moscow State University, Russia), Xin Liang (University of California, Riverside, USA), Franck Cappello (Argonne National Laboratory, USA)
     September 2018

  2. Sheng Di (ANL), Xin Liang (U. C. Riverside), Dingwen Tao (U. Alabama), Franck Cappello (Lead)

  3. Outline • Introduction • Background • Electron Repulsion Integrals (ERIs) • ERI Data Representation • Patterns in ERIs • PaSTRI Compression • Optimizations of Quantization & Encoding • Experimental Evaluation • Conclusion

  6. Introduction
     • HPC applications work with extremely large data (petabytes)
     • Large data → system bottlenecks (memory, storage, bandwidth)
     • Electron Repulsion Integrals (ERIs):
       • Large data size: petabytes
       • Costly computations: O(N^4)
       • Data reuse: ~10-30 times
     • PaSTRI: Pattern Scaling for Two-Electron Repulsion Integrals
       • Calculate and compress once
       • Decompress whenever needed

  11. Electron Repulsion Integrals (ERIs)
      • ERIs are a part of solving the Schrödinger equation and scale as O(N^4)
      Orbital   # of BFs
      s         1
      p         3
      d         6
      f         10
      g         15
      …         …

  15. ERI Data Representation
      • (ij|kl) representation examples: (dd|dd), (dp|ff), (ps|df), …
      Orbital → # of BFs: s 1, p 3, d 6, f 10, g 15, …
      (dd|dd) block, 6*6*6*6 = 1296 pts:
      Index     Value
      0,0,0,0   1234E-6
      0,0,0,1   2345E-7
      …         …
      0,0,0,5   3456E-6
      0,0,1,0   4567E-8
      …         …
      5,5,5,5   6789E-5
      (dp|ff) block, 6*3*10*10 = 1800 pts:
      Index     Value
      0,0,0,0   1234E-6
      0,0,0,1   2345E-7
      …         …
      0,0,0,9   3456E-6
      0,0,1,0   4567E-8
      …         …
      5,2,9,9   6789E-5

  16. ERI Data Representation
      • (ij|kl) representation examples: (dd|dd), (dp|ff), (ps|df), …
      Orbital → # of BFs: s 1, p 3, d 6, f 10, g 15, …
      (ff|ff) block, 10*10*10*10 = 10000 pts:
      Index     Value
      0,0,0,0   1234E-6
      0,0,0,1   2345E-7
      …         …
      0,0,0,9   3456E-6
      0,0,1,0   4567E-8
      …         …
      9,9,9,9   6789E-5
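
The block sizes quoted on these slides are the product of the basis-function counts of the four shells in the block label. A minimal Python sketch of that arithmetic follows; the names `BFS_PER_SHELL`, `block_shape`, and `block_size` are hypothetical and not taken from the PaSTRI code.

```python
# Number of basis functions (BFs) per shell type, copied from the slide table.
BFS_PER_SHELL = {"s": 1, "p": 3, "d": 6, "f": 10, "g": 15}


def block_shape(block):
    """Parse a label such as '(dp|ff)' into its four per-shell BF counts."""
    shells = [c for c in block if c in BFS_PER_SHELL]
    if len(shells) != 4:
        raise ValueError(f"expected four shell letters in {block!r}")
    return tuple(BFS_PER_SHELL[s] for s in shells)


def block_size(block):
    """Total number of ERI values (points) stored for one block."""
    n_i, n_j, n_k, n_l = block_shape(block)
    return n_i * n_j * n_k * n_l


for label in ("(dd|dd)", "(dp|ff)", "(ff|ff)"):
    print(label, block_shape(label), block_size(label))
```

The loop reproduces the three sizes shown on the slides (1296, 1800, and 10000 points).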

  17. ERI Data Representation
      • (ij|kl) representation examples: (dd|dd), (dp|ff), (ps|df), …
      Orbital → # of BFs: s 1, p 3, d 6, f 10, g 15, …
      1D index to 4D index:
      (ff|ff)            (dd|dd)            (fd|ps)
      1D     4D          1D     4D          1D    4D
      0      0,0,0,0     0      0,0,0,0     0     0,0,0,0
      1      0,0,0,1     1      0,0,0,1     1     0,0,1,0
      …      …           …      …           …     …
      9      0,0,0,9     5      0,0,0,5     19    1,0,1,0
      10     0,0,1,0     6      0,0,1,0     20    1,0,2,0
      …      …           …      …           …     …
      9999   9,9,9,9     1295   5,5,5,5     179   9,5,2,0
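
The 1D/4D correspondence in this table is ordinary row-major flattening, with the last index varying fastest. The sketch below shows the conversion in both directions; the function names `to_1d` and `to_4d` are hypothetical, and the shapes are the per-shell BF counts of each block.

```python
def to_1d(idx4, shape):
    """Flatten a 4D index (i, j, k, l); the last index varies fastest."""
    flat = 0
    for index, extent in zip(idx4, shape):
        if not 0 <= index < extent:
            raise IndexError(f"index {index} out of range for extent {extent}")
        flat = flat * extent + index
    return flat


def to_4d(flat, shape):
    """Inverse of to_1d: recover (i, j, k, l) from the 1D position."""
    idx = []
    for extent in reversed(shape):
        flat, remainder = divmod(flat, extent)
        idx.append(remainder)
    return tuple(reversed(idx))


dd_dd = (6, 6, 6, 6)    # (dd|dd)
fd_ps = (10, 6, 3, 1)   # (fd|ps)
print(to_4d(6, dd_dd), to_4d(1295, dd_dd))
print(to_4d(19, fd_ps), to_4d(179, fd_ps))
print(to_1d((9, 9, 9, 9), (10, 10, 10, 10)))
```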

  23. Patterns in ERIs
      [Figure: (dd|dd) original data, range [0:215], values between -4E-07 and 4E-07, split into six sub-blocks: [0:35], [36:71], [72:107], [108:143], [144:179], [180:215]]
      [Figure: two panels comparing data ranges [0:35] and [36:71]; axis limits ±4E-7 and ±1E-7]
      [Figure: |Deviation| and |Compr. Error| on a logarithmic axis from 1E+0 down to 1E-12]
      • Reasonable absolute error bound: 10^-10

  24. Patterns in ERIs
      • (dd|dd) → (6*6 | 6*6) → (36 | 36) → (1296), labeled by Period (SB Size), Block Size, and # of SBs
      Orbital → # of BFs: s 1, p 3, d 6, f 10, g 15, …
      [Figure: (dd|dd) original data, range [0:215], values between -4E-07 and 4E-07, split into six sub-blocks: [0:35], [36:71], [72:107], [108:143], [144:179], [180:215]]
      [Figure: two panels comparing data ranges [0:35] and [36:71]; axis limits ±4E-7 and ±1E-7]
      [Figure: |Deviation| and |Compr. Error| on a logarithmic axis from 1E+0 down to 1E-12]

  25. Patterns in ERIs
      • (dd|dd) → (6*6 | 6*6) → (36 | 36) → (1296), labeled by Period (SB Size), Block Size, and # of SBs
      • Original data: full block of 1296 values (64-bit)
      • Compressed:
        • Pattern: 36 values (<64-bit)
        • Scale: 36 values (<64-bit)
        • Error Correction: ? bits
      [Figure: (dd|dd) original data, range [0:215], values between -4E-07 and 4E-07, split into six sub-blocks: [0:35], [36:71], [72:107], [108:143], [144:179], [180:215]]
      [Figure: two panels comparing data ranges [0:35] and [36:71]; axis limits ±4E-7 and ±1E-7]
      [Figure: |Deviation| and |Compr. Error| on a logarithmic axis from 1E+0 down to 1E-12]
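
The storage comparison on this slide can be turned into a rough compression-ratio estimate. The sketch below is only illustrative arithmetic: the slide gives the pattern and scale widths as "<64-bit" and the error-correction cost as "?", so the 32-bit widths and the error-correction totals used in the example are assumptions, and the function names are hypothetical.

```python
def compressed_bits(period, num_sub_blocks, pattern_bits, scale_bits, ec_bits_total):
    """Bits for one block: `period` pattern values, one scale per sub-block, plus error correction."""
    return period * pattern_bits + num_sub_blocks * scale_bits + ec_bits_total


def compression_ratio(period, num_sub_blocks, pattern_bits, scale_bits, ec_bits_total):
    """Ratio of the original size (64-bit values) to the estimated compressed size."""
    original_bits = period * num_sub_blocks * 64
    return original_bits / compressed_bits(
        period, num_sub_blocks, pattern_bits, scale_bits, ec_bits_total
    )


# (dd|dd): period 36, 36 sub-blocks, 1296 values in total.
# ASSUMPTION: 32 bits per pattern and scale value; error-correction totals are made up.
print(compression_ratio(36, 36, 32, 32, ec_bits_total=0))
print(compression_ratio(36, 36, 32, 32, ec_bits_total=1296 * 2))
```

The first call gives the upper limit when no error correction is needed; every bit spent on error correction lowers the ratio, which is why the deviation between sub-blocks and the scaled pattern matters.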

  26. Why are there patterns in ERIs?
      • ERI values are calculated in ordered loops
      • ERI values depend on both the shape and the distance of electron clouds
      • For distant clouds, the shape loses its importance and the distance dominates
      • Most of the electron clouds are distant from each other

  28. Generating Pattern and Scaling Coefficients
      • Scaling coefficient = a / b (Note: |b| ≥ |a|)
      • Candidate metrics:
        • ER (Ratio of Extremums)
        • FR (Ratio of Firsts)
        • AR (Ratio of Averages)
        • AAR (Ratio of Abs. Averages)
        • IS (Interval Scaling)
      • Slide annotations: "Requires Sign Correction"; "Best Compression, Fast"; "a" or "b" can be close to zero!
      [Figure: one Sub-Block vs. Pattern panel per metric, marking the values a and b]
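
To make the five metrics concrete, here is a small Python sketch. The slide only names the metrics, so every definition below is an assumption: `a` is taken from the sub-block and `b` from the pattern, "extremum" means the signed value of largest magnitude, and "interval scaling" is read as the ratio of value ranges (max minus min). The function names are hypothetical.

```python
def extremum(values):
    """Signed value with the largest magnitude (assumed meaning of 'extremum')."""
    return max(values, key=abs)


def mean(values):
    return sum(values) / len(values)


def scaling_coefficient(sub_block, pattern, metric="ER"):
    """Return a / b, with a from the sub-block and b from the pattern (assumed mapping)."""
    if metric == "ER":      # ratio of extremums
        a, b = extremum(sub_block), extremum(pattern)
    elif metric == "FR":    # ratio of first elements
        a, b = sub_block[0], pattern[0]
    elif metric == "AR":    # ratio of averages
        a, b = mean(sub_block), mean(pattern)
    elif metric == "AAR":   # ratio of averages of absolute values (sign is lost)
        a, b = mean([abs(v) for v in sub_block]), mean([abs(v) for v in pattern])
    elif metric == "IS":    # ASSUMPTION: ratio of value ranges (sign is lost)
        a, b = max(sub_block) - min(sub_block), max(pattern) - min(pattern)
    else:
        raise ValueError(f"unknown metric {metric!r}")
    return a / b if b != 0 else 0.0   # guard: b can be (close to) zero


pattern = [1.0, -4.0, 2.0]
sub_block = [-0.25, 1.0, -0.5]   # exactly -0.25 times the pattern
for name in ("ER", "FR", "AR", "AAR", "IS"):
    print(name, scaling_coefficient(sub_block, pattern, name))
```

In this toy example ER, FR, and AR recover the signed factor, while AAR and IS return only its magnitude, which illustrates why a metric built on absolute values would need a separate sign correction. FR and AR divide by a single element or by an average, either of which can be near zero even when the data are not.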

  30. PaSTRI Compression
      • Calculate period (based on the last two BFs)
      • Determine Pattern (P), then quantize P to PQ
      • Calculate Scaling coefficients (S), then quantize S to SQ
      • The number of elements in PQ and SQ depends on the block type (s, p, d, f, g, …)
      • Calculate Error Correction (EC), then quantize EC to ECQ
        • EC = Original data - (PQ * P_binsize) * (SQ * S_binsize)
        • The number of elements in ECQ depends on the deviation (whether the atoms are distant or not)
      • Decide encoding mode
        • Sparse or Non-sparse
      • Encode PQ, SQ, and ECQ and write to output file
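
The quantization steps listed on this slide can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the PaSTRI implementation: the pattern is taken to be the sub-block containing the largest-magnitude value, scaling uses the ratio of extremums, the error correction uses a bin size of twice the error bound, and the sparse/non-sparse encoding step is omitted. The names `quantize` and `compress_block` and the bin-size values are hypothetical.

```python
def quantize(values, bin_size):
    """Map each value to the index of its nearest bin."""
    return [round(v / bin_size) for v in values]


def compress_block(data, period, error_bound, p_binsize, s_binsize):
    """Return (PQ, SQ, ECQ) for one block split into sub-blocks of length `period`."""
    sub_blocks = [data[i:i + period] for i in range(0, len(data), period)]

    # ASSUMPTION: the pattern is the sub-block holding the largest-magnitude value,
    # so every scaling coefficient has magnitude at most 1.
    pattern = max(sub_blocks, key=lambda sb: max(abs(v) for v in sb))
    pattern_ext = max(pattern, key=abs)
    pq = quantize(pattern, p_binsize)

    # ASSUMPTION: ratio-of-extremums scaling, one coefficient per sub-block.
    scales = [max(sb, key=abs) / pattern_ext if pattern_ext != 0 else 0.0
              for sb in sub_blocks]
    sq = quantize(scales, s_binsize)

    # EC = original - (PQ * P_binsize) * (SQ * S_binsize), quantized with a bin of
    # 2 * error_bound so the reconstruction stays within the error bound.
    ecq = []
    for sb_index, sb in enumerate(sub_blocks):
        for j, value in enumerate(sb):
            approx = (pq[j] * p_binsize) * (sq[sb_index] * s_binsize)
            ecq.append(round((value - approx) / (2 * error_bound)))
    return pq, sq, ecq


base = [4e-7, -2e-7, 1e-7]
demo_data = base + [0.5 * v for v in base] + [-0.25 * v + 3e-10 for v in base]
pq, sq, ecq = compress_block(demo_data, period=3, error_bound=1e-10,
                             p_binsize=1e-9, s_binsize=1e-3)
print(pq, sq, ecq)
```

When sub-blocks are close to scaled copies of the pattern, most ECQ entries are zero or very small, which is what makes a sparse encoding mode attractive.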

  31. PaSTRI Decompression
      • Read encoding mode, error bound
      • Calculate period
      • Read PQ and reconstruct Pattern
      • Read SQ and reconstruct Scaling coefficients
      • Read ECQ and reconstruct Error Correction
      • Reconstruct data values:
        • Decompressed Data = Pattern_DQ * Scale_DQ + ErrorCorrection_DQ, where Pattern_DQ * Scale_DQ is the scaled pattern
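
The reconstruction formula on this slide can be written as a short Python sketch. It assumes the quantized arrays have already been decoded from the file, that the period equals the number of pattern elements, and that each array was quantized with a uniform bin size; the function name `decompress_block` and the `ec_binsize` parameter are hypothetical.

```python
def decompress_block(pq, sq, ecq, p_binsize, s_binsize, ec_binsize):
    """Rebuild one block: data = Pattern_DQ * Scale_DQ + ErrorCorrection_DQ."""
    period = len(pq)                           # ASSUMPTION: one pattern value per position
    pattern_dq = [q * p_binsize for q in pq]   # dequantized pattern
    scale_dq = [q * s_binsize for q in sq]     # one dequantized scale per sub-block

    data = []
    for sb_index, scale in enumerate(scale_dq):
        for j, pattern_value in enumerate(pattern_dq):
            ec_dq = ecq[sb_index * period + j] * ec_binsize
            data.append(pattern_value * scale + ec_dq)   # scaled pattern + correction
    return data


# Toy input: two sub-blocks of period 2.
print(decompress_block(pq=[4, -2], sq=[2, 1], ecq=[0, 0, 1, -1],
                       p_binsize=1.0, s_binsize=0.5, ec_binsize=0.25))
```

Decompression needs only multiplications and additions per value, which fits the "compress once, decompress whenever needed" usage described in the introduction.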
