Real-time Replication in the Real World

SLIDE 2

Real-time Replication in the Real World

Richard E. Baum

C. Thomas Tyler

SLIDE 3

Agenda

  • Provide an overview of replication solutions
  • Discuss relevant new 2009.2 features
  • Review some real-world solutions

SLIDE 4

Terminology

  • High Availability (HA)
      • Typical Goal: Keep Perforce online 24x7
  • Disaster Recovery (DR)
      • Business continuity
      • Murphy’s Law Insurance
  • Recovery Point Objective (RPO)
      • Targeted max data loss in various failure scenarios
  • Recovery Time Objective (RTO)
      • Targeted max time to recover from a failure

SLIDE 5

Terminology

  • Archive Files
      • Contains all versioned and shelved files
  • Metadata
      • All data in db.* files under P4ROOT
  • Read-Only Replica
      • Copy of live Perforce DBs for read-only operations

SLIDE 6

Terminology

  • Offline Checkpoint
      • Checkpoint created from replicated db.* files.
  • Perforce SDP (Server Deployment Package)
      • Server management scripts from Perforce Consulting
  • DRBD (Distributed Replicated Block Device)
  • Keep your eyes open for emerging technologies!
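
The offline checkpoint idea can be sketched roughly as follows: replay a rotated journal into a spare copy of the db.* files, then dump a checkpoint from that copy so the live server is not locked while the checkpoint is written. This is a minimal sketch and is not taken from the SDP; the function name, the argument layout, and the P4D override variable are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper, not part of the Perforce SDP.
# P4D can be overridden (for example with a stub) for dry runs.
P4D=${P4D:-p4d}

# offline_checkpoint OFFLINE_ROOT JOURNAL CHECKPOINT
#   OFFLINE_ROOT: directory holding the replicated db.* files (assumed path)
#   JOURNAL:      rotated journal file from the live server
#   CHECKPOINT:   checkpoint file to create
offline_checkpoint() {
    offline_root=$1 journal=$2 checkpoint=$3
    # Bring the offline db.* files up to date with the rotated journal.
    "$P4D" -r "$offline_root" -jr "$journal" || return 1
    # Dump a checkpoint from the offline copy; -jd does not rotate the journal.
    "$P4D" -r "$offline_root" -jd "$checkpoint"
}
```

Because the dump reads only the offline copy, users on the live server should not notice it running.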

SLIDE 9

High Availability Thinking

  • We’re willing to invest in a more sophisticated deployment architecture to reduce unplanned downtime.
  • We will not accept data loss for any Single Point of Failure (SPOF).
  • Downtime is extremely expensive for us. We are willing to spend a lot to reduce the likelihood of downtime, and minimize it when it is unavoidable.

SLIDE 10

High Availability Technologies

  • Metadata:
      • Journal Truncation (p4d -jj)
      • p4 replicate
      • DAS/RAID or fast SAN for metadata
  • Archive Files:
      • SAN
      • p4 export – for metadata-driven archive updates

SLIDE 11

To Cluster, or Not To Cluster?

  • Perforce is not a cluster-aware application
  • Adds complexity and cost
  • Can reduce downtime
  • Simplifies automation of some failover tasks
      • DNS Switchover
      • Automatically mounting SAN Volumes
  • Perforce SDP designed to simplify cluster failover

SLIDE 12

Sample HA Deployment (w/SAN)

SLIDE 13

Sample HA Deployment (w/DAS)

SLIDE 16

Disaster Recovery Thinking

  • We’re willing to invest in a more sophisticated deployment architecture to ensure business continuity in the event of a disaster.
  • We need to ensure accessibility of our intellectual property, even in the event of a sudden and total loss of one of our data centers.

SLIDE 17

Disaster Recovery Technologies

  • Metadata:
      • Journal Truncation (p4d -jj)
      • p4 replicate
  • Archive Files:
      • Rsync/Robocopy
      • Block-level WAN replication solutions
      • p4 export – for metadata-driven archive updates

SLIDE 18

Sample DR Deployment

SLIDE 19

Read-Only Replica Thinking

  • We have automation that interacts with Perforce, such as continuous integration build systems or reports, that impacts performance on our primary server.
  • We’re willing to invest in a more sophisticated deployment architecture to improve performance and increase our scalability.

SLIDE 20

Read-Only Replica Technologies

  • Metadata:
      • p4 replicate with filtering wrappers
      • Optional p4broker for a transparent solution
          • Users always point to same P4PORT
  • Archive Files:
      • Shared storage with primary server

SLIDE 21

Sample RO Replica (One Server)

SLIDE 22

Sample RO Replica (2 Servers + Broker)

SLIDE 23

Tools for Metadata Replication

  • Classic journal truncation (p4d -jj)
  • p4jrep (deprecated)
  • p4 replicate (New in 2009.2)
  • p4 export (New in 2009.2)

SLIDE 24

Replication Example #1 – to Journal

#!/bin/bash
# Stream the master's journal records into a local journal file for later replay.
P4MASTERPORT=perforce.myco.com:1742
CHECKPOINT_PREFIX=/p4servers/master/checkpoints/myco
P4ROOT_REPLICA=/p4servers/replica/root
REPSTATE=/p4servers/replica/root/rep.state

p4 -p "$P4MASTERPORT" replicate \
    -s "$REPSTATE" \
    -J "$CHECKPOINT_PREFIX" \
    -o /p4servers/replica/logs/journal

SLIDE 25

Replication Example #2 – to DBs

#!/bin/bash
# Pipe the master's journal records straight into the replica's db.* files.
P4MASTERPORT=perforce.myco.com:1742
CHECKPOINT_PREFIX=/p4servers/master/checkpoints/myco
P4ROOT_REPLICA=/p4servers/replica/root
REPSTATE=/p4servers/replica/root/rep.state

p4 -p "$P4MASTERPORT" replicate \
    -s "$REPSTATE" \
    -J "$CHECKPOINT_PREFIX" -k \
    p4d -r "$P4ROOT_REPLICA" -f -b 1 -jrc -

SLIDE 26

Replication Example #3 - Filtering

#!/bin/bash
# Same as Example #2, but drop db.have records before they reach the replica.
P4MASTERPORT=perforce.myco.com:1742
CHECKPOINT_PREFIX=/p4servers/master/checkpoints/myco
P4ROOT_REPLICA=/p4servers/replica/root
REPSTATE=/p4servers/replica/root/rep.state

p4 -p "$P4MASTERPORT" replicate \
    -s "$REPSTATE" \
    -J "$CHECKPOINT_PREFIX" -k \
    grep --line-buffered -v '@db\.have@' |\
    p4d -r "$P4ROOT_REPLICA" -f -b 1 -jrc -

SLIDE 27

Archive File Replication Solutions

  • File level – Rsync/Robocopy
  • Filesystem or block-level (DRBD, etc.)
  • Commercial WAN replication solutions
  • Metadata-driven using p4 export
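
For the file-level option, a minimal rsync sketch might look like the following. The function name, the RSYNC override variable, and the destination shown in the comment are assumptions for illustration. The sketch deliberately leaves out deletion, so files removed on the primary stay on the target.

```shell
#!/bin/bash
# RSYNC can be overridden (for example with a stub) for dry runs.
RSYNC=${RSYNC:-rsync}

# sync_archives SRC_DIR DEST
#   SRC_DIR: local archive root (assumed to hold the depot directories)
#   DEST:    rsync destination, e.g. drhost:/p4servers/dr/depots (hypothetical)
sync_archives() {
    # -a preserves permissions and timestamps; the trailing slash copies the
    # contents of SRC_DIR rather than the directory itself.
    "$RSYNC" -a "${1%/}/" "$2"
}
```

Running this from cron on the primary gives a simple pull-free transfer, at the cost of rescanning the tree on every run.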

SLIDE 28

Replication Race

  • Metadata vs. Archive Files
      • Which data gets there first?
  • Perfect Consistency
      • Could mean a higher recovery point objective (RPO).
      • Recovery state is clean for all recovered data.
  • Minimum Data Loss
      • More metadata is preserved.
      • p4 verify errors point to lost archive files.
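
After a recovery that favors minimum data loss, one way to size the gap is to count the revisions that p4 verify flags as missing. The sketch below is a hypothetical helper that reads verify output on stdin. It assumes that missing revisions are reported on lines containing MISSING!, and the sample lines in the usage comment are illustrative rather than captured from a real server.

```shell
#!/bin/bash
# count_missing: read `p4 verify -q` output on stdin and print how many
# lines are flagged MISSING!, i.e. metadata whose archive file never arrived.
count_missing() {
    # grep -c prints 0 but exits non-zero when nothing matches; mask that.
    grep -c 'MISSING!' || true
}

# Typical use on the recovered server (depot path is an example):
#   p4 verify -q //... | count_missing
```

A count of zero suggests the archive files caught up with the metadata before the failure.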

SLIDE 29

Example 1: Classic DR

  • Pre-2009.2 Servers
  • Classic Journal Truncation
  • Commercial WAN replication technology
  • Relaxed 8-hour recovery point objective (RPO)

SLIDE 30

Example 1: Classic DR

SLIDE 31

Example 1: Classic DR

Core approach was very straightforward:

  • On the primary server
      • Run p4d -jj every 8 hours
      • Deposit journal files on same volume as archive files (gaining the benefit of free file transfer)
  • On the DR server
      • Replay outstanding journals using p4d -jr
      • Perforce instance on spare always up
      • Its daily job is running p4 verify
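
The two halves of this approach can be sketched as a pair of shell functions. This is a minimal sketch, not the scripts used at the site described. The function names, the argument order, and the idea that the caller tracks the first and last outstanding journal numbers are assumptions.

```shell
#!/bin/bash
# P4D can be overridden (for example with a stub) for dry runs.
P4D=${P4D:-p4d}

# rotate_journal P4ROOT PREFIX   (primary, e.g. from cron every 8 hours)
# Truncates the live journal; the rotated file is written as PREFIX.jnl.N.
rotate_journal() {
    "$P4D" -r "$1" -jj "$2"
}

# replay_journals P4ROOT FIRST LAST PREFIX   (DR server)
# Replays PREFIX.jnl.FIRST through PREFIX.jnl.LAST in order, stopping at
# the first failure so that a gap is never skipped.
replay_journals() {
    n=$2
    while [ "$n" -le "$3" ]; do
        "$P4D" -r "$1" -jr "$4.jnl.$n" || return 1
        n=$((n + 1))
    done
}
```

If PREFIX points at the replicated archive volume, the rotated journals travel to the DR site along with the archive files.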

SLIDE 32

Example 2: Real-Time Replication

  • Suitable for HA or DR
  • Using p4 replicate
      • Wraps the p4 replicate utility
      • Replication engine runs continuously
      • Leave changes in journal for later replay, or
      • Replay changes directly to replica P4ROOT
  • Recovery Point Objective (RPO):
      • As low as 2 seconds for metadata.
      • WAN replication for archive files
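
One plausible shape for a wrapper that keeps the replication engine running is a bounded restart loop. The sketch below is an assumption about how such a wrapper could look, not the presenters' script. The function name and its restart limit and delay parameters are hypothetical.

```shell
#!/bin/bash
# keep_running MAX_RESTARTS DELAY CMD [ARGS...]
# Runs CMD; whenever it exits, waits DELAY seconds and starts it again,
# giving up after MAX_RESTARTS restarts and returning the last exit status.
keep_running() {
    max=$1 delay=$2
    shift 2
    restarts=0
    while :; do
        "$@"
        status=$?
        echo "replicator exited with status $status" >&2
        [ "$restarts" -ge "$max" ] && return "$status"
        restarts=$((restarts + 1))
        sleep "$delay"
    done
}

# Example (variables as in the replication examples):
#   keep_running 10 5 p4 -p "$P4MASTERPORT" replicate -s "$REPSTATE" \
#       -J "$CHECKPOINT_PREFIX" -k p4d -r "$P4ROOT_REPLICA" -f -b 1 -jrc -
```

Restarting is safe in this arrangement because p4 replicate records its position in the state file given with -s and resumes from there.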

SLIDE 33

Example 2: Real-Time Replication

SLIDE 34

Failover Automation

  • Only automate tasks behind FAILOVER button
  • Allow only a trained Perforce administrator to push the button.
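
The "button" can be as simple as a script that refuses to do anything until a person confirms. The following is a minimal sketch under that assumption. The function names and the step names in the usage comment are hypothetical, and the real steps would be site specific.

```shell
#!/bin/bash
# confirm_failover: ask the operator to type the word FAILOVER.
# Returns 0 only on an exact match read from stdin.
confirm_failover() {
    printf 'Type FAILOVER to promote the standby: ' >&2
    read -r answer || return 1
    [ "$answer" = "FAILOVER" ]
}

# run_failover STEP...: run each step (a command or function name) in
# order, but only after confirmation; stop at the first step that fails.
run_failover() {
    confirm_failover || { echo "aborted" >&2; return 1; }
    for step in "$@"; do
        echo "running: $step" >&2
        "$step" || { echo "step failed: $step" >&2; return 1; }
    done
}

# Example (step functions are placeholders defined elsewhere):
#   run_failover stop_replication mount_archives switch_dns start_p4d
```

Nothing in this sketch detects a failure on its own; deciding that a failover is needed stays with the administrator.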

SLIDE 35

Failover Automation

SLIDE 36

Failover Automation

  • Perforce is not a cluster-aware application
  • Clustering adds some value
      • Simplifies automation of
          • DNS switchover
          • SAN mount transfers
          • etc.
  • Offline checkpoints can be beneficial
      • After failover, db.* files may be in an unknown state

SLIDE 37

Just A Bit More About Failover

  • It’s Complicated!
      • Simulation of hardware failures is non-trivial
      • There is a limit to how much confidence you should gain from testing.
  • No substitute for a trained administrator
      • Can analyze failures
      • Determine the best course of action

SLIDE 38

Example 3: Read-only Replica

  • Use Filtered Replication
      • Basic grep (with line buffering)
          • For filtering one-liner journal entries like db.have
      • More sophisticated filtering
          • Needed for journal entries that span multiple lines
          • Perforce Public Depot has a good example:
            //guest/michael_shields/src/p4jrep/awkfilter.sh
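
To illustrate why multi-line entries need more than grep, here is a minimal record-aware filter. It is not the awkfilter.sh script named above and makes no claim to match it. It assumes that journal text fields are delimited by @, that a literal @ inside a field is doubled, and that the table name is the third field of a record, so a record is complete when the running count of @ characters is even. The function name is hypothetical.

```shell
#!/bin/bash
# drop_table TABLE: copy journal text from stdin to stdout, omitting every
# record whose table field is TABLE (for example db.have).
drop_table() {
    awk -v table="@$1@" '
    {
        # Append this physical line to the logical record being assembled.
        rec = (pending ? rec "\n" : "") $0
        # Count @ characters; an odd running total means a text field is
        # still open and the record continues on the next line.
        line = $0
        at += gsub(/@/, "", line)
        pending = (at % 2)
        if (!pending) {
            split(rec, f, " ")
            if (f[3] != table) { print rec; fflush() }
            rec = ""; at = 0
        }
    }'
}

# Example: ... -k sh -c "drop_table db.have" would need the function
# exported; more simply, save the awk program as a script and name it
# after -k in place of grep.
```

Unlike a line-based grep, this keeps a change description intact even when one of its lines happens to mention the filtered table name.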

SLIDE 39

Example 3: Read-only Replica

  • For Continuous Integration/Build Farms
  • Define how users will connect to the Replica
      • Simple (for administrators):
          • Modify build scripts to use appropriate P4PORT values
          • Point users at appropriate P4PORT depending on task
      • Simple (for end users):
          • All users use p4broker P4PORT
          • p4broker routes requests to appropriate server instance
              • Either the live server or the read-only replica
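
For the administrator-side option, a build script could pick its P4PORT per command with a small helper like the one below. This is only a sketch. The replica address is hypothetical, the master address reuses the one from the replication examples, and the list of read-only commands is illustrative rather than complete.

```shell
#!/bin/bash
# Hypothetical addresses; substitute your own.
P4PORT_MASTER=${P4PORT_MASTER:-perforce.myco.com:1742}
P4PORT_REPLICA=${P4PORT_REPLICA:-replica.myco.com:1742}

# p4port_for COMMAND: print the P4PORT a script should use for COMMAND.
p4port_for() {
    case $1 in
        # Illustrative list of commands that only read metadata or content.
        changes|describe|dirs|filelog|files|fstat|print|sizes)
            echo "$P4PORT_REPLICA" ;;
        *)
            echo "$P4PORT_MASTER" ;;
    esac
}

# Example:
#   p4 -p "$(p4port_for changes)" changes -m 10 //depot/...
```

With p4broker in front, the same routing decision moves out of the build scripts and into the broker, so callers keep a single P4PORT.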

SLIDE 40

Example 3: Read-only Replica

  • Make Archive Files Available on Replica
      • Multiple Server Machines, Master & Replica
          • Use a SAN or other shared storage solution
          • Files mounted read-only on the replica
      • Run Replica instance on Primary server
          • Works if hardware is powerful enough
          • Run replica under different login
          • Cannot write to the archived files

SLIDE 41

Review of RO Replica

SLIDE 42

Summary

  • Advanced replication solutions
      • Easier with p4 replicate and p4 export
  • Typical Uses:
      • High Availability
      • Disaster Recovery
      • Read-only Replicas
  • Perforce Technical Support can help!
  • Perforce Consulting can help, too!

SLIDE 43

Demo

SLIDE 44

Q & A
