#
Scott Stanford
– Topology
– Infrastructure
– Backups & Disaster Recovery
– Monitoring
– Lessons Learned
– Q&A
#
– 1.2 TB database, mostly db.have
– Average daily journal size 70 GB
#
[Diagram: P4D (Sunnyvale); traditional proxies: Boston, Pittsburgh, RTP, Bangalore]
files
#
[Diagram: Commit (Sunnyvale); Edge servers: Sunnyvale, RTP, Bangalore; proxies: Pittsburgh, Boston; traditional proxies: Boston, Pittsburgh, RTP, Bangalore]
traditional model to Commit/Edge servers.
until the migration completes later this year
Edge (50ms improvement)
#
#
Edge server, formerly were proxies
storage used for the database, journal, and log storage
P4TARGET of the closest Edge server (RTP)
an active/standby host pairing
#
and HBA
each controller
(NFS and p4d)
#
Each Commit/Edge server is configured in a pair consisting of
– Allows for a quick failover of the p4d without any DNS changes or changes to the users' environment
(Perl, Ruby, Python, P4, Git-Fusion, common scripts)
#
– NetApp EF540 w/ FC for the Commit server
– NetApp E5512 w/ FC or SAS for each Edge server
– All RAID 10 with multiple spare disks, XFS, dual controllers, and dual power supplies
– Warm database or read-only replica on stand-by host
– Production journal
– Production p4d log
#
storage, snapshots, and offsite
– Depot storage
– Rotated journals & p4d logs
– Checkpoints
– Warm database
– Git-Fusion homedir & cache, dedicated volume per instance
#
#
verify they match
database
database
checkpoint (p4d -jd)
– Checksum journal on SAN
– Copy journal to NFS
– Compare checksums
NFS
– Create snapshot(s)
– Delete old snapshots
– Replay on warm standby
– Replay on warm NFS
– p4d -jj
Every 1 hour
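The checksum, copy, and compare steps of the journal flow could be sketched roughly as below. This is a minimal illustration and not the presenter's script: the function names, the choice of SHA-256, and the directory layout are all assumptions.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(journal: Path, nfs_dir: Path) -> Path:
    """Checksum the journal, copy it to nfs_dir, and compare checksums.

    'journal' stands in for a rotated journal on SAN storage and 'nfs_dir'
    for the NFS backup location; both paths are hypothetical.
    """
    source_sum = sha256_of(journal)            # checksum journal on SAN
    nfs_dir.mkdir(parents=True, exist_ok=True)
    target = nfs_dir / journal.name
    shutil.copy2(journal, target)              # copy journal to NFS
    if sha256_of(target) != source_sum:        # compare checksums
        raise RuntimeError(f"checksum mismatch after copying {journal}")
    return target
```

Raising on a mismatch keeps a corrupt copy from being treated as a valid backup; a real job would presumably also alert and retry.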
#
Warm database
be applied:
– p4 journals -F "jdate>=(event epoch - 1)" -T jfile,jnum
Read-only Replica from Edge
– Edge server captures event in events.csv
– Monit triggers backups on events.csv
– Determine which journals to apply
– Apply journals
– Commit server truncates
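The "determine which journals to apply" step could look something like the sketch below, which builds the `p4 journals` filter from an event epoch and mirrors the same cutoff locally. The function names and the record layout (dicts with `jfile`, `jnum`, `jdate`) are assumptions for illustration, not the presenter's tooling.

```python
from typing import Dict, List


def journals_command(event_epoch: int) -> List[str]:
    """Build the 'p4 journals' call that lists journals rotated at or
    after one second before the event."""
    return [
        "p4", "journals",
        "-F", f"jdate>={event_epoch - 1}",
        "-T", "jfile,jnum",
    ]


def journals_to_apply(records: List[Dict[str, str]], event_epoch: int) -> List[Dict[str, str]]:
    """Apply the same cutoff to already-fetched records (hypothetical
    dicts carrying jdate) and return them in journal-number order."""
    cutoff = event_epoch - 1
    picked = [rec for rec in records if int(rec["jdate"]) >= cutoff]
    return sorted(picked, key=lambda rec: int(rec["jnum"]))
```

Sorting by journal number matters because journals have to be replayed in sequence on the warm database.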
#
– Save the change output for any open changes
– Generate the journal data for the client
– Create a tarball of the open files
– Retained for 14 days
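The tarball and 14-day retention bullets could be sketched as follows. This is only an illustration under assumptions: the function names, the `.tar.gz` naming, and the use of file modification time to judge age are not from the slides.

```python
import tarfile
import time
from pathlib import Path
from typing import Iterable, List, Optional

RETENTION_DAYS = 14  # retention period stated on the slide


def archive_open_files(files: Iterable[Path], archive: Path) -> Path:
    """Create a gzip-compressed tarball of the given open files.

    Files are stored under their base names only, so two files with the
    same name in different directories would collide in this sketch.
    """
    with tarfile.open(archive, "w:gz") as tar:
        for path in files:
            tar.add(path, arcname=Path(path).name)
    return archive


def prune_old_archives(backup_dir: Path, now: Optional[float] = None,
                       retention_days: int = RETENTION_DAYS) -> List[Path]:
    """Delete *.tar.gz archives older than the retention period."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400
    removed = []
    for archive in sorted(backup_dir.glob("*.tar.gz")):
        if archive.stat().st_mtime < cutoff:
            archive.unlink()
            removed.append(archive)
    return removed
```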
#
– Main backup method
– Created and kept for:
– Used for online backups
– Created every 4 weeks, kept for 12 months
– Contains all of the data needed to recreate the instance
– Sunnyvale
from production snapshots with FlexClone
– DR
Bangalore Edge servers
#
#
– Monitors and alerts
– Used for identifying host or performance issues
– Storage monitoring
– Monitor both infrastructure and the end-user experience
#
data to a single M/Monit instance
system)
ssh, sendmail, ntpd, crond, ypbind, p4p, p4d, p4web, p4broker
conditions met (e.g. clean a proxy cache or purge all)
thresholds
ties
have affected production in the past:
– NIC errors
– Number of filehandles
– Known patterns in the system log
– p4d crashes
#
single M/Monit instance
recovered from
#
license trends, number of clients and opened files per p4d)
commands
– Alerts if a site is 15% slower than a historical average
– Runs for both the Perforce binary and internal wrappers
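The 15% rule could be expressed roughly as below. This is a sketch under assumptions: the slide does not say how the historical average is computed, so a plain arithmetic mean of past timings is used here, and the function name is hypothetical.

```python
from statistics import mean
from typing import Sequence

SLOWDOWN_THRESHOLD = 0.15  # 15% slower than the historical average


def is_slow(latest_seconds: float, history_seconds: Sequence[float],
            threshold: float = SLOWDOWN_THRESHOLD) -> bool:
    """Return True when the latest timing exceeds the historical mean
    by more than the threshold fraction."""
    if not history_seconds:
        return False  # no baseline yet, so nothing to compare against
    baseline = mean(history_seconds)
    return latest_seconds > baseline * (1 + threshold)
```

For example, with past timings averaging 1.0 s, a 1.2 s run would trigger an alert while a 1.1 s run would not.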
#
#
– Most noticeable for sites with higher latency WAN connections
commands when the WAN or Commit site are inaccessible
server
servers
#
– Set the dm.rotatelogwithjnl configurable to 0
interesting results with csv logs
– Warm databases are harder to maintain with frequent journal truncations, no way to trigger
to new P4TARGETs can cause increased load on the WAN depending on the topology.
#
Scott Stanford sstanfor@netapp.com
#
Scott Stanford is the SCM Lead for NetApp where he also functions as a worldwide Perforce Administrator and tool
development, with thirteen years specializing in configuration
Architect at Synopsys.
#
SnapShot:
http://www.netapp.com/us/technology/storage-efficiency/se-technologies.aspx
SnapVault & SnapMirror:
http://www.netapp.com/us/products/protection-software/index.aspx
Backup & Recovery of Perforce on NetApp:
http://www.netapp.com/us/system/pdf-reader.aspx?pdfuri=tcm:10-107938-16&m=tr-4142.pdf
Monit:
http://mmonit.com/