oracle advanced compression tests

Oracle Advanced Compression Tests, Svetozar Kapusta, 15th of October 2009



  1. Oracle Advanced Compression Tests, Svetozar Kapusta, 15th of October 2009

  2. What is CERN?
     • CERN is the world’s largest particle physics laboratory, located in Geneva, Switzerland
     • CERN hosts the Large Hadron Collider (LHC), which is the biggest man-made accelerator
     • LHC will start its operation in November 2009 and will form, together with its experiments, the biggest sub-nuclear microscope in the world
     CERN is:
     • ≈ 2500 staff scientists (physicists, engineers, etc.)
     • ≈ 6500 visiting scientists (half of the world's particle physicists), coming from ≈ 500 universities or institutes representing ≈ 80 nationalities
     Courtesy of M. Girone

  3. LHC: a Very Large Scientific Instrument
     LHC: 27 km long, 100 m underground
     [Aerial photograph; labels: Mont Blanc (4810 m), Downtown Geneva, ATLAS, ALICE, CMS + TOTEM]
     Courtesy of M. Girone

  4. … Based on Advanced Technology: 27 km of superconducting magnets cooled in superfluid helium at 1.9 K. Courtesy of M. Girone

  5. Experiments are ready for collisions Courtesy of M. Girone

  6. The Data Acquisition. Courtesy of M. Girone, Ian.Bird@cern.ch

  7. Data Acquisition, First pass processing: 1.25 GB/sec (ions). Courtesy of M. Girone

  8. CERN Openlab
     • Collaboration between CERN and industrial Openlab partners: HP, Intel, Oracle and Siemens
     • Framework for evaluating and integrating cutting-edge IT technologies
     • CERN acquires early access to technology
     • CERN offers expertise and a demanding computing environment to push new technologies to their limits
     • CERN provides a neutral ground for carrying out advanced R&D
     • Excellent collaboration with Oracle

  9. Databases for physics at CERN
     • Relational databases play a key role in the experiments’ production dataflow chains
     • Listed among the critical services for the LHC experiments
     • Bulk of physics data stored in files, a fraction of it in databases
     • Most applications are OLTP
     • Some data warehouse applications are also emerging

  10. Data Growth
     • Expected data growth is ≈ 20-30 TB per year per experiment
     • Experiments need to have all data available at any time: during the experiments' lifetimes (10-15 years), plus a few extra years, as the data analysis will continue
     • We have to provide an efficient way of storing and accessing the few petabytes of mostly read-only data
     • Answer to our challenge is the compression available in Oracle 11gR2 and Exadata2

  11. Advanced Compression Tests
     • Exadata2 located in Reading, UK
       • Half rack with 7 storage cells, each with 12 disks
       • Accessed remotely from Geneva for 2 weeks
     • Data used
       • The largest and representative production and test tables
       • Exported compressed using Datapump
       • Imported into Exadata2 using Datapump
     • Applications
       • PVSS (slow control system used by the experiments)
       • GRID monitoring application
       • GRID test data
       • File transfer applications (PANDA)
       • Logging application for ATLAS
     • First results the same day

  12. Compression factors for various compression types of various physics applications
     [Bar chart, scale 0 to 70; series: NO COMPRESSION, BASIC, OLTP, QUERY LOW, QUERY HIGH, ARCHIVE LOW, ARCHIVE HIGH]
     Tables tested:
     • PVSS columns: 6 number, 4 TS(9), 5 varchar2, 3 binary_double
     • LCG GRID monitoring columns: 5 number
     • LCG TESTDATA columns: 6 number(38), 1 varchar2, 1 CLOB
     • ATLAS PANDA FILESTABLE columns: 3 number, 12 varchar2, 2 date, 2 char
     • ATLAS LOG MESSAGES columns: 5 number, 7 varchar2, 1 TS
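The compression types in the chart legend correspond to table compression clauses in Oracle 11g Release 2. The slides do not show how the test tables were built, so the following is only a sketch, assuming a CREATE TABLE ... AS SELECT copy per compression type; the source table name `PVSS_SAMPLE` and both helper functions are hypothetical.

```python
# Chart labels mapped to Oracle 11g Release 2 table compression clauses.
# The QUERY and ARCHIVE levels are hybrid columnar compression (EHCC).
COMPRESSION_CLAUSES = {
    "NO COMPRESSION": "NOCOMPRESS",
    "BASIC": "COMPRESS BASIC",
    "OLTP": "COMPRESS FOR OLTP",
    "QUERY LOW": "COMPRESS FOR QUERY LOW",
    "QUERY HIGH": "COMPRESS FOR QUERY HIGH",
    "ARCHIVE LOW": "COMPRESS FOR ARCHIVE LOW",
    "ARCHIVE HIGH": "COMPRESS FOR ARCHIVE HIGH",
}


def ctas_statement(source_table, label):
    """Build a CTAS statement copying source_table with one compression type."""
    clause = COMPRESSION_CLAUSES[label]
    target = f"{source_table}_{label.replace(' ', '_')}"
    return f"CREATE TABLE {target} {clause} AS SELECT * FROM {source_table}"


def compression_factor(uncompressed_bytes, compressed_bytes):
    """Compression factor as plotted: uncompressed size over compressed size."""
    return uncompressed_bytes / compressed_bytes


# PVSS_SAMPLE is a hypothetical table name used only for illustration.
for label in COMPRESSION_CLAUSES:
    print(ctas_statement("PVSS_SAMPLE", label))
```

Segment sizes for `compression_factor` would have to come from the database itself; how the presenter measured them is not stated in the slides.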

  13. Table creation times for various compression types of various physics applications. Normalized to no compression.
     [Bar chart, scale 0 to 45; series: NO COMPRESSION, BASIC, OLTP, QUERY LOW, QUERY HIGH, ARCHIVE LOW, ARCHIVE HIGH]
     Tables: PVSS, LCG GRID monitoring, LCG TESTDATA, ATLAS PANDA FILESTABLE, ATLAS LOG MESSAGES (column types as on slide 12)
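"Normalized to no compression" means each measurement is divided by the measurement for the uncompressed table, so the baseline reads 1. A minimal sketch of that step follows; the timings are hypothetical placeholders, not values read from the chart.

```python
def normalize_to_baseline(measurements, baseline="NO COMPRESSION"):
    """Divide every measurement by the baseline measurement."""
    base = measurements[baseline]
    return {label: value / base for label, value in measurements.items()}


# Hypothetical table creation times in seconds (not taken from the slides).
timings = {"NO COMPRESSION": 40.0, "OLTP": 60.0, "ARCHIVE HIGH": 400.0}
normalized = normalize_to_baseline(timings)
print(normalized)
```

A normalized value above 1 means the operation took longer than on the uncompressed table, and a value below 1 means it was faster.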

  14. Full table scan performance for various compression types of various physics applications. Normalized to no compression.
     [Bar chart, scale 0 to 3.5; series: NO COMPRESSION, BASIC, OLTP, QUERY LOW, QUERY HIGH, ARCHIVE LOW, ARCHIVE HIGH]
     Tables: PVSS, LCG GRID monitoring, LCG TESTDATA, ATLAS PANDA FILESTABLE, ATLAS LOG MESSAGES (column types as on slide 12)

  15. Full table scan performance for various compression types of various physics applications. Normalized to no compression. Exadata offloading set to false.
     [Bar chart, scale 0 to 30; series: NO COMPRESSION, BASIC, OLTP, QUERY LOW, QUERY HIGH, ARCHIVE LOW, ARCHIVE HIGH]
     Tables: PVSS, LCG GRID monitoring, LCG TESTDATA, ATLAS PANDA FILESTABLE, ATLAS LOG MESSAGES (column types as on slide 12)

  16. Exadata2 offloading: full table scan performance for various compression types of the ATLAS logging application, with and without Exadata offloading
     [Bar chart; vertical axis: full table scan time [s], 1 to 1000]
     Please note the logarithmic scale

  17. Export Datapump Compression
     • Compression factor for PVSS data
       • Export Datapump ≈ 9X
       • tar bzip2 utility
         • ≈ 11X on non-compressed exported PVSS data
         • ≈ 1.2X on the compressed exported PVSS data
     • Compression factor for LCG application
       • Export Datapump ≈ 13X
       • tar bzip2 utility
         • ≈ 9X on non-compressed exported LCG data
         • ≈ 1.2X on the compressed exported LCG data
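The small ≈ 1.2X gain on already compressed exports reflects a general property: data that has been compressed once has little redundancy left for a second pass. The sketch below illustrates this with Python's standard `bz2` module on synthetic rows. It is an assumption-laden stand-in: the rows are generated, not real PVSS or LCG data, and bzip2 is applied twice rather than on top of a Datapump-compressed export.

```python
import bz2
import random


def bzip2_factor(data: bytes) -> float:
    """Compression factor: original size over bzip2-compressed size."""
    return len(data) / len(bz2.compress(data))


# Synthetic, repetitive rows standing in for an uncompressed export.
rng = random.Random(0)
rows = "".join(
    f"{i},{rng.choice(['OK', 'WARN', 'ERR'])},{rng.randint(0, 99)}\n"
    for i in range(20000)
)
raw = rows.encode("ascii")

first_pass = bzip2_factor(raw)                 # redundant text compresses well
second_pass = bzip2_factor(bz2.compress(raw))  # compressed input barely shrinks

print(f"first pass:  {first_pass:.1f}X")
print(f"second pass: {second_pass:.2f}X")
```

The exact factors depend on the data; only the contrast between the two passes is the point here.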

  18. Conclusions
     • Tested basic, OLTP and hybrid columnar compression, and Datapump compression
     • The results for data from physics applications are rather impressive (2-6X OLTP, 10-70X EHCC archive high)
     • EHCC can achieve up to ≈ 3X better compression than tar bzip2 compression of the same data exported uncompressed
     • Oracle Compression offers a win-win solution, especially for OLTP
       • Shrinks used storage volume
       • Improves performance

  19. Thank you for your attention

  20. Backup
     [Bar chart, scale 0 to 18; series: CPU Consumed vs No Cmp, Logical Reads vs No Cmp]
