Real-time Replication in the Real World
Richard E. Baum
C. Thomas Tyler
2
Agenda
- Provide an overview of replication solutions
- Discuss relevant new 2009.2 features
- Review some real-world solutions
3
Terminology
- High Availability (HA)
- Typical Goal: Keep Perforce online 24x7
- Disaster Recovery (DR)
- Business continuity
- Murphy’s Law Insurance
- Recovery Point Objective (RPO)
- Targeted max data loss in various failure scenarios
- Recovery Time Objective (RTO)
- Targeted max time to recover from a failure
4
Terminology
- Archive Files
- Contains all versioned and shelved files
- Metadata
- All data in db.* files under P4ROOT
- Read-Only Replica
- Copy of live Perforce DBs for read-only operations
5
Terminology
- Offline Checkpoint
- Checkpoint created from replicated db.* files.
- Perforce SDP (Server Deployment Package)
- Server management scripts from Perforce Consulting
- DRBD (Distributed Replicated Block Device)
- Keep your eyes open for emerging technologies!
6
7
8
High Availability Thinking
- We’re willing to invest in a more sophisticated deployment architecture to reduce unplanned downtime.
- We will not accept data loss for any Single Point of Failure (SPOF).
- Downtime is extremely expensive for us. We are willing to spend a lot to reduce the likelihood of downtime, and to minimize it when it is unavoidable.
9
High Availability Technologies
- Metadata:
- Journal Truncation (p4d -jj)
- p4 replicate
- DAS/RAID or fast SAN for metadata
- Archive Files:
- SAN
- p4 export – for metadata-driven archive updates
10
To Cluster, or Not To Cluster?
- Perforce is not a cluster-aware application
- Adds complexity and cost
- Can reduce downtime
- Simplifies automation of some failover tasks
- DNS Switchover
- Automatically mounting SAN Volumes
- Perforce SDP designed to simplify cluster failover
11
Sample HA Deployment (w/SAN)
12
Sample HA Deployment (w/DAS)
13
14
15
Disaster Recovery Thinking
- We’re willing to invest in a more sophisticated deployment architecture to ensure business continuity in the event of a disaster.
- We need to ensure accessibility of our intellectual property, even in the event of a sudden and total loss of one of our data centers.
16
Disaster Recovery Technologies
- Metadata:
- Journal Truncation (p4d -jj)
- p4 replicate
- Archive Files:
- Rsync/Robocopy
- Block-level WAN replication solutions
- p4 export – for metadata-driven archive updates
17
Sample DR Deployment
18
Read-Only Replica Thinking
- We have automation that interacts with Perforce, such as continuous integration build systems or reports, and it impacts performance on our primary server.
- We’re willing to invest in a more sophisticated deployment architecture to improve performance and increase our scalability.
19
Read-Only Replica Technologies
- Metadata:
- p4 replicate with filtering wrappers
- Optional p4broker for a transparent solution
- Users always point to same P4PORT
- Archive Files:
- Shared storage with primary server
20
Sample RO Replica (One Server)
21
Sample RO Replica (2 Servers + Broker)
22
Tools for Metadata Replication
- Classic journal truncation (p4d -jj)
- p4jrep (deprecated)
- p4 replicate (New in 2009.2)
- p4 export (New in 2009.2)
23
Replication Example #1 – to Journal
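#!/bin/bash
# Pull journal records from the master and write them to a local journal file.
P4MASTERPORT=perforce.myco.com:1742
CHECKPOINT_PREFIX=/p4servers/master/checkpoints/myco
P4ROOT_REPLICA=/p4servers/replica/root
REPSTATE=/p4servers/replica/root/rep.state
# -s: state file that tracks the replication position
# -J: journal file prefix used on the master
# -o: file that receives the replicated journal records
p4 -p "$P4MASTERPORT" replicate \
  -s "$REPSTATE" \
  -J "$CHECKPOINT_PREFIX" \
  -o /p4servers/replica/logs/journal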
24
Replication Example #2 – to DBs
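#!/bin/bash
# Pull journal records from the master and replay them into the replica's db.* files.
P4MASTERPORT=perforce.myco.com:1742
CHECKPOINT_PREFIX=/p4servers/master/checkpoints/myco
P4ROOT_REPLICA=/p4servers/replica/root
REPSTATE=/p4servers/replica/root/rep.state
# -k: keep the pipe to the p4d subprocess open between polls
# p4d -jrc - : replay journal records read from stdin
p4 -p "$P4MASTERPORT" replicate \
  -s "$REPSTATE" \
  -J "$CHECKPOINT_PREFIX" -k \
  p4d -r "$P4ROOT_REPLICA" -f -b 1 -jrc -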
25
Replication Example #3 - Filtering
#!/bin/bash
# Same as Example #2, but drop db.have records before they reach the replica.
P4MASTERPORT=perforce.myco.com:1742
CHECKPOINT_PREFIX=/p4servers/master/checkpoints/myco
P4ROOT_REPLICA=/p4servers/replica/root
REPSTATE=/p4servers/replica/root/rep.state
# --line-buffered: make grep pass each record on immediately instead of batching
p4 -p "$P4MASTERPORT" replicate \
  -s "$REPSTATE" \
  -J "$CHECKPOINT_PREFIX" -k \
  grep --line-buffered -v '@db\.have@' |\
  p4d -r "$P4ROOT_REPLICA" -f -b 1 -jrc -
26
Archive File Replication Solutions
- File level – Rsync/Robocopy
- Filesystem or block-level (DRBD, etc.)
- Commercial WAN replication solutions
- Metadata-driven using p4 export
27
Replication Race
- Metadata vs. Archive Files
- Which data gets there first?
- Perfect Consistency
- Could mean a higher recovery point objective (RPO).
- Recovery state is clean for all recovered data.
- Minimum Data Loss
- More metadata is preserved.
- p4 verify errors point to lost archive files.
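One way to see how the race turned out after a recovery is to run p4 verify on the recovered server and count the revisions it flags. The sketch below only summarizes verify output read from stdin. It assumes problem revisions end in a MISSING! or BAD! marker, and the function name summarize_verify is hypothetical.

```shell
#!/bin/bash
# Count the revisions that p4 verify flags as missing or corrupt.
# Assumption: flagged lines end in " MISSING!" or " BAD!".
summarize_verify() {
    awk '
        / MISSING!$/ { missing++ }
        / BAD!$/     { bad++ }
        END { printf "missing=%d bad=%d\n", missing, bad }
    '
}

# Usage (needs a running server; -q limits output to problem revisions):
#   p4 -p "$P4PORT" verify -q //... 2>&1 | summarize_verify
```

A non-zero missing count after a "minimum data loss" recovery is expected: it identifies the archive files that lost the race to their metadata.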
28
Example 1: Classic DR
- Pre-2009.2 Servers
- Classic Journal Truncation
- Commercial WAN replication technology
- Relaxed 8-hour recovery point objective (RPO)
29
Example 1: Classic DR
30
Core approach was very straightforward:
- On the primary server
- Run p4d -jj every 8 hours
- Deposit journal files on same volume as archive files (gaining the benefit of free file transfer)
- On the DR server
- Replay outstanding journals using p4d -jr
- Perforce instance on spare always up
- Its daily job is running p4 verify
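The DR-side replay step above can be sketched as a small script. This is an illustration under stated assumptions, not the script used at the site described: it assumes rotated journals arrive as PREFIX.jnl.N files, and the function names, the state file, and the P4D override are all hypothetical.

```shell
#!/bin/bash
# List rotated journals numbered above $3, oldest first.
# Assumption: journals are named <prefix>.jnl.<N> in directory $1.
pending_journals() {
    local dir=$1 prefix=$2 last=$3 f n
    for f in "$dir/$prefix".jnl.*; do
        [ -e "$f" ] || continue
        n=${f##*.}
        case $n in ''|*[!0-9]*) continue ;; esac   # skip non-numeric suffixes
        [ "$n" -gt "$last" ] && printf '%s %s\n' "$n" "$f"
    done | sort -n | cut -d' ' -f2-
}

# Replay each pending journal in order and record the last one applied.
# P4D can be overridden (for example with "echo") for a dry run.
replay_pending() {
    local root=$1 dir=$2 prefix=$3 state=$4 last f
    last=$(cat "$state" 2>/dev/null || echo 0)
    pending_journals "$dir" "$prefix" "$last" | while IFS= read -r f; do
        ${P4D:-p4d} -r "$root" -jr "$f" || break
        echo "${f##*.}" > "$state"
    done
}

# Usage on the DR server, for example from cron:
#   replay_pending /p4servers/replica/root /p4servers/master/checkpoints myco /p4servers/replica/last.jnl
```

Sorting numerically matters here: a plain glob would order myco.jnl.10 before myco.jnl.9 and replay journals out of sequence.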
31
Example 2: Real-Time Replication
- Suitable for HA or DR
- Using p4 replicate
- Wraps the p4 replicate utility
- Replication engine runs continuously
- Leave changes in journal for later replay, or
- Replay changes directly to replica P4ROOT
- Recovery Point Objective (RPO):
- As low as 2 seconds for metadata.
- WAN replication for archive files
32
Example 2: Real-Time Replication
33
Failover Automation
- Only automate tasks behind FAILOVER button
- Allow only a trained Perforce administrator to push the button.
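In script form, the "button" can be as simple as a prompt that refuses to continue unless the administrator types a confirmation word. A minimal sketch; the function name confirm_failover and the step functions in the usage comment are hypothetical placeholders for site-specific tasks.

```shell
#!/bin/bash
# Return success only if the operator types the exact word FAILOVER.
confirm_failover() {
    local answer
    printf 'Type FAILOVER to proceed: ' >&2
    IFS= read -r answer || return 1
    [ "$answer" = "FAILOVER" ]
}

# Usage: nothing behind the button runs without confirmation.
#   confirm_failover || { echo "Aborted." >&2; exit 1; }
#   switch_dns          # hypothetical site-specific step
#   start_replica_p4d   # hypothetical site-specific step
```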
34
Failover Automation
35
Failover Automation
- Perforce is not a cluster-aware application
- Clustering adds some value
- Simplifies automation of
- DNS switchover
- SAN mount transfers
- etc.
- Offline checkpoints can be beneficial
- After failover, db.* files may be in an unknown state
36
Just A Bit More About Failover
- It’s Complicated!
- Simulation of hardware failures is non-trivial
- There is a limit to how much confidence you should gain from testing.
- No substitute for a trained administrator
- Can analyze failures
- Determine the best course of action
37
Example 3: Read-only Replica
- Use Filtered Replication
- Basic grep (with line buffering)
- For filtering one-liner journal entries like db.have
- More sophisticated filtering
- Needed for journal entries that span multiple lines
- Perforce Public Depot has a good example:
//guest/michael_shields/src/p4jrep/awkfilter.sh
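The basic grep approach generalizes to several tables by building the pattern from a list. The sketch below handles single-line journal records only, so it is not a substitute for the awk filter mentioned above. The function name drop_tables is hypothetical, and the sample records in the usage note are simplified illustrations rather than exact journal syntax.

```shell
#!/bin/bash
# Drop single-line journal records for the named tables from stdin.
# Example: drop_tables db.have
drop_tables() {
    local pattern="" t esc
    # With no tables named, pass everything through unchanged.
    [ "$#" -eq 0 ] && { cat; return; }
    for t in "$@"; do
        # Escape dots so "db.have" cannot match e.g. "dbXhave".
        esc=$(printf '%s' "$t" | sed 's/\./\\./g')
        pattern="${pattern:+$pattern|}@${esc}@"
    done
    # --line-buffered keeps records flowing to p4d without batching delays.
    grep --line-buffered -E -v "$pattern"
}

# Usage with simplified sample records:
#   printf '%s\n' '@pv@ 3 @db.have@ @//ws/a.c@' '@pv@ 7 @db.rev@ @//depot/a.c@' | drop_tables db.have
```

In the replication pipeline from Example #3, this function would take the place of the grep command, which means running it through a shell rather than invoking it directly.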
38
Example 3: Read-only Replica
- For Continuous Integration/Build Farms
- Define how users will connect to the Replica
- Simple (for administrators):
- Modify build scripts to use appropriate P4PORT values
- Point users at appropriate P4PORT depending on task
- Simple (for end users):
- All users use p4broker P4PORT
- p4broker routes requests to appropriate server instance
- Either the live server or the read-only replica
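For the administrator-simple option, a build script can choose its P4PORT per command. A minimal sketch, assuming a replica address of replica.myco.com:1742 (hypothetical) and a short, deliberately conservative list of reporting commands; which commands are safe to send to a replica is a site decision.

```shell
#!/bin/bash
P4PORT_LIVE=perforce.myco.com:1742      # master, as in the replication examples
P4PORT_REPLICA=replica.myco.com:1742    # hypothetical replica address

# Print the P4PORT a given p4 command should use.
port_for() {
    case $1 in
        changes|describe|filelog|files|fstat|print)
            echo "$P4PORT_REPLICA" ;;   # read-only reporting commands
        *)
            echo "$P4PORT_LIVE" ;;      # everything else goes to the live server
    esac
}

# Usage in a build script:
#   p4 -p "$(port_for changes)" changes -m 1 //depot/...
#   p4 -p "$(port_for submit)" submit -d "Nightly build results"
```

The p4broker option moves this same decision out of every script and into one place, which is why it is simpler for end users.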
39
Example 3: Read-only Replica
- Make Archive Files Available on Replica
- Multiple Server Machines, Master & Replica
- Use a SAN or other shared storage solution
- Files mounted read-only on the replica
- Run Replica instance on Primary server
- Works if hardware is powerful enough
- Run replica under different login
- Cannot write to the archived files
40
Review of RO Replica
41
Summary
- Advanced replication solutions
- Easier with p4 replicate and p4 export
- Typical Uses:
- High Availability
- Disaster Recovery
- Read-only Replicas
- Perforce Technical Support can help!
- Perforce Consulting can help, too!
42
Demo
43
Q & A
44