Ideas for evolution of replication technology @ CERN Openlab Minor - - PowerPoint PPT Presentation

Ideas for evolution of replication technology @ CERN. Openlab Minor Review, December 14th, 2010. Zbigniew Baranowski, IT-DB, CERN IT Department, CH-1211 Geneva 23, Switzerland, www.cern.ch/it

SLIDE 1

Ideas for evolution of replication technology @ CERN

Openlab Minor Review, December 14th, 2010
Zbigniew Baranowski, IT-DB

CERN IT Department, CH-1211 Geneva 23, Switzerland
www.cern.ch/it

SLIDE 2

Outline

  • Replication use cases at CERN
  • Motivation for evolution of replication
  • Oracle replication technologies
  • Possible future replication solutions for LCG
  • Summary

SLIDE 3

Replication use cases: ONLINE - OFFLINE

  • ATLAS
    – CONDITIONS (4M LCRs/day)
    – PVSS (60M LCRs/day)
  • CMS
    – CONDITIONS (6M LCRs/day)
    – PVSS (20M LCRs/day)
  • LHCb
    – CONDITIONS (6K LCRs/day)
  • ALICE
    – PVSS (4M LCRs/day)
  • COMPASS
    – PVSS (4M LCRs/day)

[Diagram labels: CONDITIONS, PVSS]
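To put these daily volumes in perspective, the sketch below converts them to average per-second rates. It assumes LCRs (logical change records) arrive uniformly over 24 hours, which real detector workloads are unlikely to do, so the results are a lower bound on peak load. The dictionary and function names are illustrative and do not come from the original slides.

```python
# Daily LCR volumes as listed on the slide (ONLINE - OFFLINE use case).
DAILY_LCRS = {
    ("ATLAS", "CONDITIONS"): 4_000_000,
    ("ATLAS", "PVSS"): 60_000_000,
    ("CMS", "CONDITIONS"): 6_000_000,
    ("CMS", "PVSS"): 20_000_000,
    ("LHCb", "CONDITIONS"): 6_000,
    ("ALICE", "PVSS"): 4_000_000,
    ("COMPASS", "PVSS"): 4_000_000,
}

SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(lcrs_per_day: float) -> float:
    """Average LCRs per second, assuming a uniform load over 24 hours."""
    return lcrs_per_day / SECONDS_PER_DAY


def totals_by_experiment(volumes):
    """Sum the daily volumes of all applications belonging to one experiment."""
    totals = {}
    for (experiment, _application), lcrs in volumes.items():
        totals[experiment] = totals.get(experiment, 0) + lcrs
    return totals


if __name__ == "__main__":
    for experiment, total in totals_by_experiment(DAILY_LCRS).items():
        rate = average_rate(total)
        print(f"{experiment:8s} {total:>12,d} LCRs/day  ~{rate:8.1f} LCRs/s")
```

Under this uniform-load assumption, the largest single stream (ATLAS PVSS) averages a few hundred LCRs per second.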

SLIDE 4

Replication use cases: OFFLINE - ONLINE

  • LHCb (in addition to ONLINE-OFFLINE)
    – CONDITIONS (8K LCRs/day)

[Diagram label: CONDITIONS]

SLIDE 5

Replication use cases: OFFLINE – T1s

  • ATLAS
    – CONDITIONS (4M LCRs/day)
  • LHCb
    – LFC (235K LCRs/day)
    – CONDITIONS (15K LCRs/day)

[Diagram labels: CONDITIONS, LFC]

SLIDE 6

Replication use cases: T1 - OFFLINE

  • ATLAS
    – AMI (800K LCRs/day)
    – Muon (700K LCRs/day)

[Diagram labels: AMI, MUON]

SLIDE 7

Motivation for evolution of replication solutions

  • Need for a stable and reliable replication service
  • Streams 10g requires frequent interventions (at least once per week)
    – Consistency problems
    – Blocking sessions
    – Memory pool shortages
    – LogMiner crashes
    – Unsupported changes made by users
  • Streams administration is time consuming and requires expert knowledge
  • Migration to 11gR2 in 2012

SLIDE 8

Motivation for other replication solutions

  • Is there a solution which can simplify maintenance of replication?
    – Satisfies physics data workload
    – Requires minimum maintenance effort
    – Is resilient to users’ unsupported operations
    – Ensures replicated data consistency
    – Utilizes minimum amount of resources

SLIDE 9

Possible replication solutions

  • Logical (SQL based) replication
    – Streams 11gR2
    – GoldenGate
  • Physical (block-level) replication
    – Active DataGuard 11gR2
  • Combinations of physical and logical replication

SLIDE 10

Streams 11gR2

SLIDE 11

Streams 11gR2 solution

  • Technology features
    – Considerable maintenance effort
        • but in 11g should be less than in 10g
    – No additional license required
    – Many improvements
        • stability, management, monitoring, verification of data consistency
    – Very good performance (30K-40K LCRs/s)
    – Best practices identified, a lot of experience
    – Source and destination database fully accessible for reads and writes
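One rough way to read the 30K-40K LCRs/s figure: after an outage, the backlog drains at the apply rate minus the rate at which new changes keep arriving. The sketch below assumes constant rates and that the quoted throughput can be sustained during recovery. Neither assumption is confirmed by the slides, and the function and parameter names are hypothetical. The 60M LCRs/day input is the ATLAS PVSS volume quoted earlier in the deck.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def catch_up_seconds(outage_seconds: float, incoming_rate: float, apply_rate: float) -> float:
    """Seconds needed to drain the backlog built up during an outage.

    backlog     = outage_seconds * incoming_rate
    drain speed = apply_rate - incoming_rate (new changes keep arriving)
    """
    if apply_rate <= incoming_rate:
        raise ValueError("apply rate must exceed incoming rate, otherwise the backlog never drains")
    backlog = outage_seconds * incoming_rate
    return backlog / (apply_rate - incoming_rate)


if __name__ == "__main__":
    incoming = 60_000_000 / SECONDS_PER_DAY      # ATLAS PVSS average, LCRs/s
    for apply_rate in (30_000, 40_000):          # throughput range quoted on the slide
        minutes = catch_up_seconds(8 * 3600, incoming, apply_rate) / 60
        print(f"apply {apply_rate:,d} LCRs/s: ~{minutes:.1f} min to recover from an 8 h outage")
```

The 8-hour outage is an arbitrary example value, not a figure from the presentation.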

SLIDE 12

Streams 11gR2 solution

  • As ONLINE – OFFLINE replication
    – Users and data content can abort the replication
    – Streams processes may affect performance of the online database
    – no extra hardware needed
    – bi-directional replication

[Diagram labels: SQLs, SQLs]

SLIDE 13

Streams 11gR2

  • As OFFLINE – T1s
    – Recovery of replica requires
        • coordination between the T1, other T1s and T0
        • expert knowledge of procedures
    – Downstream capture
        • additional hardware required
        • complete isolation from OFFLINE database
        • standby database can be source of replication
    – T1 databases are read/write accessible
    – Good monitoring for distributed Streams deployment (strmmon, EM)

[Diagram label: Redo Transport]

SLIDE 14

GoldenGate

[Figure; source: Oracle.com]

SLIDE 15

GoldenGate

  • Technology features
    – source and destination database fully accessible for reads and writes
    – good quality of software (very stable, free of locks, almost transparent for databases)
    – good performance (comparable to Streams 11g)
    – additional license required
    – standby database cannot be used as source
    – no in-house experience
    – additional dedicated disk space required for trail files
    – additional software to be installed and maintained on database machines

SLIDE 16

GoldenGate solution

  • As ONLINE-OFFLINE replication
    – no extra hardware needed
    – possible loops back in replication
    – minor impact on source database
    – users and data content can abort the replication

[Diagram labels: SQLs, GG]

SLIDE 17

GoldenGate solution

  • As OFFLINE – T1s
    – easier maintenance
        • No side effects on source when target is down
        • No split of replication required
        • Trail files can be used for T1 recovery
    – no remote administration: access to nodes required
    – no monitoring for distributed environment
    – cannot use standby database (i.e. Active DataGuard) as a source of replication

SLIDE 18

Active DataGuard 11gR2

[Figure; source: Oracle.com]

SLIDE 19

Active DataGuard 11gR2

  • Technology features
    – Physical replication
        • identical copy
    – Minimum maintenance effort
    – Outperforms other replication technologies
        • Oracle claims 200 MB/s of redo processing
    – Improved data reliability of primary database
        • failover
        • automatic recovery of corrupted blocks
    – Fast recovery with RMAN
    – Additional license required
    – Target/standby database is read only

SLIDE 20

Active DataGuard 11gR2

  • As ONLINE – OFFLINE replication
    – additional database installations needed for non-replicated data (split of OFFLINE)
    – same version of software required (installation, upgrades)
    – online database is protected with another standby database
    – further replication to T1s is possible in sequential standbys configuration

[Diagram labels: Redo Transport, Redo Transport]

SLIDE 21

Active DataGuard 11gR2

  • As OFFLINE – T1s
    – same version required on all T1 DBs
        • Coordination of interventions becomes critical
    – T1 database is read only
    – additional database installations needed for non-replicated data (split of OFFLINE)
    – Physical replication: lower maintenance effort
    – No downstream needed

[Diagram label: Redo Transport]
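The feature slides for the three technologies can be condensed into a small lookup table. The sketch below records only properties stated in the deck (replication type, licensing, whether the target accepts writes). The `Technology` class and the `candidates` helper are illustrative names, and the filter is meant as a reading aid rather than a selection procedure used by the author.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Technology:
    name: str
    kind: str               # "logical" (SQL based) or "physical" (block-level)
    extra_license: bool     # "additional license required" on the slides
    target_writable: bool   # destination database open for writes


TECHNOLOGIES = [
    Technology("Streams 11gR2", "logical", extra_license=False, target_writable=True),
    Technology("GoldenGate", "logical", extra_license=True, target_writable=True),
    Technology("Active DataGuard 11gR2", "physical", extra_license=True, target_writable=False),
]


def candidates(require_writable_target: bool, allow_extra_license: bool) -> List[str]:
    """Names of the technologies compatible with the two requirements."""
    return [
        t.name
        for t in TECHNOLOGIES
        if (t.target_writable or not require_writable_target)
        and (allow_extra_license or not t.extra_license)
    ]


if __name__ == "__main__":
    # A T1 site that must stay writable and cannot buy extra licenses:
    print(candidates(require_writable_target=True, allow_extra_license=False))
    # A read-only replica with no licensing restriction:
    print(candidates(require_writable_target=False, allow_extra_license=True))
```

Operational aspects listed on the slides, such as maintenance effort, in-house experience and version coupling, are deliberately left out because they do not reduce to a yes/no flag.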

SLIDE 22

Possible solutions

  • Streams 11gR2 replication at all Tiers
    – Same setup as current production
        • No additional installations needed

[Diagram labels: Redo Transport, PROPAGATION]

SLIDE 23

Possible solutions

  • GoldenGate replication at all Tiers
  • New software has to be deployed
  • Additional port needs to be opened
  • Do we need downstream database?

[Diagram labels: GG, GG FILES]

SLIDE 24

Possible solutions

  • ONLINE -> OFFLINE: Active DataGuard
  • OFFLINE -> T1s: Streams 11g

[Diagram labels: PROPAGATION, Redo Transport]
[Diagram notes: Possible redo transport directions; Additional standby database for ONLINE-OFFLINE model protection]

SLIDE 25

Online database failover and recovery with ADG 11gR2

[Diagram labels: PROPAGATION, Redo Transport, X]
[Diagram note: ONLINE-OFFLINE model is broken!!!]

SLIDE 26

Offline database failover and recovery with ADG 11gR2

[Diagram labels: PROPAGATION, Redo Transport, Recovery, X]
[Diagram note: ONLINE-OFFLINE model is broken!!!]

SLIDE 27

Possible solutions

  • ONLINE -> OFFLINE: Streams 11g / GG
  • OFFLINE -> T1s: Streams 11g

[Diagram labels: Redo Transport, Streams / GG]
[Diagram note: Extra procedures required in order to perform synchronized software upgrade]

SLIDE 28

Summary

  • Migration to the new database versions (2012) gives an opportunity to re-design and improve the replication service
  • Three candidate technologies are being investigated
    – Streams 11gR2
    – GoldenGate
    – Active DataGuard

SLIDE 29

Acknowledgements

  • Many thanks to all Physics DBAs, especially:
    – Luca
    – Jacek
    – Dawid
  • Consultancy
    – Gancho
    – Stephen Balousek (Oracle)
    – Jagdev Dhillon (Oracle)

SLIDE 30

Questions?