Data Life Cycle Management for Oracle @ CERN with partitioning Oracle @ CERN with partitioning, compression, and archive.
Luca Canali, CERN
Orcan Conference, Stockholm, May 2010 y
Data Life Cycle Management for Oracle @ CERN with partitioning - - PowerPoint PPT Presentation
Data Life Cycle Management for Oracle @ CERN with partitioning Oracle @ CERN with partitioning, compression, and archive. Luca Canali, CERN Orcan Conference, Stockholm, May 2010 y Outline Physics DB Services at CERN y Motivations for
Orcan Conference, Stockholm, May 2010 y
2 Data Life Cycle Management @ CERN, Luca Canali
Balloon ( 3 0 Km ) CD stack w ith
CD stack w ith 1 year LHC data! ( ~ 2 0 Km )
Concorde ( 1 5 Km )
( 4 .8 Km ) 3 Data Life Cycle Management @ CERN, Luca Canali
bookkeeping, file transfers, etc..
Fil t l fil t f d t t
4 Data Life Cycle Management @ CERN, Luca Canali
5 Data Life Cycle Management @ CERN, Luca Canali
Data Life Cycle Management @ CERN, Luca Canali 6
Data Life Cycle Management @ CERN, Luca Canali 7
8 Data Life Cycle Management @ CERN, Luca Canali
The threshold seems to be moving with time and technology
progress
9 Data Life Cycle Management @ CERN, Luca Canali
10 Data Life Cycle Management @ CERN, Luca Canali
11 Data Life Cycle Management @ CERN, Luca Canali
12 Data Life Cycle Management @ CERN, Luca Canali
Failgroup4 Failgroup4 Failgroup2 Failgroup2 Failgroup3 Failgroup3 Failgroup1 Failgroup1
Failgroup4 Failgroup4 Failgroup2 Failgroup2 Failgroup3 Failgroup3 Failgroup1 Failgroup1
13 Data Life Cycle Management @ CERN, Luca Canali
Backup data
Metadata
1GbE 1GbE Media Manager Server
FC FC FC FC
Data Life Cycle Management @ CERN, Luca Canali 14 Database Server Tape drives
15 Data Life Cycle Management @ CERN, Luca Canali
Data Life Cycle Management @ CERN, Luca Canali 16
17 Data Life Cycle Management @ CERN, Luca Canali
18 Data Life Cycle Management @ CERN, Luca Canali
19 Data Life Cycle Management @ CERN, Luca Canali
20 Data Life Cycle Management @ CERN, Luca Canali
21 Data Life Cycle Management @ CERN, Luca Canali
22 Data Life Cycle Management @ CERN, Luca Canali
23 Data Life Cycle Management @ CERN, Luca Canali
24 Data Life Cycle Management @ CERN, Luca Canali
25 Data Life Cycle Management @ CERN, Luca Canali
26 Data Life Cycle Management @ CERN, Luca Canali
27 Data Life Cycle Management @ CERN, Luca Canali
Data Life Cycle Management @ CERN, Luca Canali 28
29 Data Life Cycle Management @ CERN, Luca Canali
Can use partition exchange to table
archive (exadata 11gR2) archive (exadata 11gR2)
Access to data of archive need to be validated with application
Data Life Cycle Management @ CERN, Luca Canali 30
31 Data Life Cycle Management @ CERN, Luca Canali
C b bl i d ti
y p
32 Data Life Cycle Management @ CERN, Luca Canali
33 Data Life Cycle Management @ CERN, Luca Canali
Data Life Cycle Management @ CERN, Luca Canali 34
number of disks which determine max IOPS number of disks, which determine max IOPS
more CPU
Data Life Cycle Management @ CERN, Luca Canali 35
Data Life Cycle Management @ CERN, Luca Canali
36
– (Ex: Atlas’ PVSS, Atlas MDT DCS, LCG’s SAME)
Data Life Cycle Management @ CERN, Luca Canali 37
p
superset of partitioning key
May require some index change for the archive
– Disable PKs in the archive table (Ex: Atlas PANDA) – Or change PK to add partitioning key
Data Life Cycle Management @ CERN, Luca Canali 38
Key factoring
Data Life Cycle Management @ CERN, Luca Canali
39
TAGs, Atlas MDT DCS
Data Life Cycle Management @ CERN, Luca Canali 40
Data Life Cycle Management @ CERN, Luca Canali 41
accelerator) GRID it i li ti
test
42 Data Life Cycle Management @ CERN, Luca Canali
Measured Compression factor for selected Physics Apps.
50 60 70
factor
30 40 50
10 20 Columnar for Query Low Columnar for Query High Columnar for Archive Low Columnar for Archive High
Co
No compression OLTP compression PVSS (261M rows, 18GB) LCG GRID Monitoring (275M rows, 7GB)
Data from Svetozar Kapusta – Openlab (CERN).
LCG TESTDATA 2007 (103M rows, 75GB) ATLAS PANDA FILESTABLE (381M rows, 120GB) ATLAS LOG MESSAGES (323M rows, 66GB)
43 Data Life Cycle Management @ CERN, Luca Canali
Full table scan speedup of compressed tables for count(*) operation (speed=1 for ‘no compression’)
3 3.5
can
1.5 2 2.5
up of full sc
0.5 1 BASIC QUERY LOW QUERY HIGH ARCHIVE LOW ARCHIVE HIGH
Speed
NO COMPRESSION OLTP PVSS (261M rows, 18GB) LCG GRID Monitoring (275M rows, 7GB) ATLAS PANDA FILESTABLE (381M, 120GB) ATLAS LOG MESSAGES (323M rows, 78GB) ( , ) ( , ) LCG TESTDATA (103M rows, 75GB)
Data from Svetozar Kapusta – Openlab (CERN).
44 Data Life Cycle Management @ CERN, Luca Canali
Hypothetical full scan speed up for count(*) operations Obtained by disabling cell offloading in exadata.
25 30
can
15 20 25
up of full sc
5 10 BASIC QUERY LOW QUERY HIGH ARCHIVE LOW ARCHIVE HIGH
Speed
NO COMPRESSION OLTP PVSS (261M rows, 18GB) LCG GRID Monitoring (275M rows, 7GB) ATLAS PANDA FILESTABLE (381M, 120GB) LCG TESTDATA (103M rows, 75GB) ATLAS PANDA FILESTABLE (381M, 120GB) LCG TESTDATA (103M rows, 75GB) ATLAS LOG MESSAGES (323M rows, 78GB) Data from Svetozar Kapusta – Openlab (CERN).
45 Data Life Cycle Management @ CERN, Luca Canali
Table creation time for various compression types of various physics applications. T = 1 for ‘no compression’.
40 45
ess
20 25 30 35
5 10 15 OLTP BASIC QUERY LOW QUERY HIGH ARCHIVE LOW ARCHIVE HIGH
Time facto
NO COMPRESSION OLTP PVSS (261M rows, 18GB) LCG GRID Monitoring (275M rows, 7GB) LCG TESTDATA (103M rows, 75GB) ATLAS PANDA FILESTABLE (381M, 120GB) ATLAS LOG MESSAGES (323M rows, 78GB)
Data from Svetozar Kapusta – Openlab (CERN).
46 Data Life Cycle Management @ CERN, Luca Canali
blocks (around 32K) E h i it t d t i t ll ‘b l ’
CU HEADER BLOCK HEADER BLOCK HEADER BLOCK HEADER BLOCK HEADER
C3 C7
Logical Compression Unit
CU HEADER
C3 C4 C1 C2 C7 C5 C6 C8 C8
Data Life Cycle Management @ CERN, Luca Canali 47
Picture and info from: B. Hodak, Oracle (OOW09 presentation on OTN).
Data Life Cycle Management @ CERN, Luca Canali 48
Compression Table Type Compression Type Blocks Used Comp Factor Constant Table no compression 2637824 1 Constant Table comp basic 172032 15 3 Constant Table comp basic 172032 15.3 Constant Table comp for oltp 172032 15.3 Constant Table comp for archive high 3200 824.3 Constant Table high 3200 824.3 Constant Table comp for query high 3200 824.3
Random Table no compression 2711552 1 Random Table comp basic 2708352 1.0 Random Table comp for oltp 2708352 1.0 p p Random Table comp for archive high 1277952 2.1 comp for query
Data Life Cycle Management @ CERN, Luca Canali 49
Random Table high 1449984 1.9
Method Uncompressed Compressed Ratio gzip -9 13763946 bytes 622559 bytes 22 compress for archive high 896 blocks 48 blocks 19
Data Life Cycle Management @ CERN, Luca Canali 50
Data Life Cycle Management @ CERN, Luca Canali 51
Consistent gets Consistent gets Table Compr Type (select *) (select owner) no compression 4 4 basic compression 4 4 comp for oltp 4 4 comp for query high 9 5 comp for query low 9 5
p q y comp for archive low 10 5 comp for archive high 24 5
Data Life Cycle Management @ CERN, Luca Canali
52
53 Data Life Cycle Management @ CERN, Luca Canali
Data Life Cycle Management @ CERN, Luca Canali 54