
MongoDB Backup and Recovery Field Guide, Tim Vaillancourt



  1. MongoDB Backup and Recovery Field Guide
     Tim Vaillancourt, Sr Technical Operations Architect, Percona

  2. `whoami`
     { name: "tim", lastname: "vaillancourt", employer: "percona", techs: [ "mongodb", "mysql", "cassandra", "redis", "rabbitmq", "solr", "python", "golang" ] }

  3. Agenda
     ● History
     ● Methods
       ○ Logical
       ○ Binary
         ■ Cold
         ■ LVM
         ■ Hot Backup
     ● Integrity / Consistency
       ○ mongodb_consistent_backup
     ● Architecture
     ● Restore and Validation

  4. History
     ● 3000-4000 BC: Culturally significant data backed up in a universal format
     ● 1400: The Printing Press
     ● 1600-1800: Chapultepec Aqueduct
     ● 1990s: Floppy and Zip Disks
     ● 2000s: No more Floppy/Zip Disks
     ● Present: All my data is on Google Drive and I have 7 days of hourly Time Machine backups!
     ● Future: ?

  5. Replication != Backup
     ● Replication is not a backup!
       ○ Replication is High Availability
       ○ Including
         ■ Binary/Statement-based Replication of any type
           ● Delayed Replication***
         ■ RAID Arrays
     ● <EOF>

  6. Backup Methods

  7. Logical Backups
     ● Tools
       ○ mongodump
         ■ Uses find() queries with $snapshot to back up all collections
         ■ Supports Gzip and Threading in 3.2+
         ■ Outputs a directory containing bson files in various subdirectories
       ○ Custom Queries
         ■ The client API could be used similarly to mongodump to perform logical backups
     ● Benefits
       ○ Reduced storage footprint
       ○ Replication awareness
       ○ Compatibility
     ● Drawbacks
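As a minimal sketch of the mongodump options mentioned above, the Python helper below builds (but does not run) a command line that enables Gzip compression and parallel collection dumps. The function name, host and output path are hypothetical; `--gzip`, `--numParallelCollections` and `--out` are mongodump flags available from 3.2.

```python
from typing import List


def build_mongodump_command(host: str, out_dir: str, threads: int = 4) -> List[str]:
    """Return an argv list for a compressed, multi-threaded mongodump run."""
    return [
        "mongodump",
        "--host", host,
        "--gzip",                                   # compress each output file
        "--numParallelCollections", str(threads),   # dump several collections at once
        "--out", out_dir,                           # directory that receives the dump files
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(build_mongodump_command("localhost:27017", "/data/backup/dump")))
```

The list can be passed to `subprocess.run(..., check=True)` on a host where mongodump is installed.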

  8. Binary Backups: Cold Backup
     ● Very simple process
     ● Causes full outage to MongoDB instance!
     ● Process
       ○ Stop mongod
       ○ Copy and archive dbPath
       ○ Start mongod
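The three cold-backup steps can be sketched in Python as follows. This is only an illustration: `stop_mongod` and `start_mongod` are hypothetical callables supplied by the caller (for example, wrappers around the host's service manager), and the archive format is an assumption.

```python
import tarfile
from pathlib import Path
from typing import Callable


def cold_backup(db_path: str, archive_path: str,
                stop_mongod: Callable[[], None],
                start_mongod: Callable[[], None]) -> str:
    """Stop mongod, archive dbPath into a gzip tarball, then start mongod again."""
    stop_mongod()
    try:
        with tarfile.open(archive_path, "w:gz") as tar:
            # Store files under the dbPath directory name inside the archive.
            tar.add(db_path, arcname=Path(db_path).name)
    finally:
        # Restart even if archiving failed, to keep the outage as short as possible.
        start_mongod()
    return archive_path
```

Restarting in a `finally` block reflects the point of the slide: the instance is down for the whole copy, so nothing should be able to leave it stopped.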

  9. Binary Backups: LVM / Filer / Cloud Disk
     ● Process
       ○ If Non-Journalled
         ■ db.fsyncLock()
         ■ Keep session open
       ○ Create block-device snapshot
       ○ Unlock the database
         ■ db.fsyncUnlock()
       ○ Copy or archive the snapshot directory
       ○ Remove block-device snapshot (as quickly as possible!)
     ● LVM
       ○ Snapshots have been demonstrated to cause up to 30%* write latency impact to disk due to COW
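The lock, snapshot, unlock ordering for a non-journalled instance can be sketched as below. `client` is assumed to behave like a `pymongo.MongoClient`, and `create_snapshot` is a hypothetical callable that creates the block-device snapshot (for instance by invoking `lvcreate --snapshot`); neither comes from the slides.

```python
from typing import Any, Callable


def snapshot_with_fsync_lock(client: Any, create_snapshot: Callable[[], None]) -> None:
    """Hold the fsync lock only for as long as the snapshot takes to create.

    `client` is assumed to behave like a pymongo.MongoClient; it stays open
    for the whole call, which matches the "keep session open" step.
    """
    client.admin.command("fsync", lock=True)   # server command behind db.fsyncLock()
    try:
        create_snapshot()                      # e.g. run `lvcreate --snapshot ...`
    finally:
        client.admin.command("fsyncUnlock")    # server command behind db.fsyncUnlock()
```

Unlocking in `finally` means a failed snapshot attempt does not leave writes blocked.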

  10. Binary Backups: Hot Backup
      ● PSMDB or MongoDB Enterprise
        ○ Pay $$$ for MongoDB Enterprise or download PSMDB for free(!)
        ○ db.adminCommand({ createBackup: 1, backupDir: "/data/mongodb/backup" })
        ○ Copy/archive the output path
        ○ Delete the backup output path
        ○ NOTE:
          ■ RocksDB-based createBackup creates filesystem hardlinks whenever possible!
          ■ Delete RocksDB backupDir as soon as possible to reduce bloom filter overhead!
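A rough Python sketch of the hot-backup sequence (run `createBackup`, archive the output path, delete it) is shown below. It assumes `client` behaves like a `pymongo.MongoClient` connected to a server that supports `createBackup`, and that the script runs on the same host as mongod so that `backup_dir` is a local path; the function and parameter names are hypothetical.

```python
import shutil
from typing import Any


def hot_backup(client: Any, backup_dir: str, archive_base: str) -> str:
    """Run createBackup, archive the output directory, then delete it."""
    # Same command document as on the slide: { createBackup: 1, backupDir: ... }
    client.admin.command("createBackup", backupDir=backup_dir)
    # Produces "<archive_base>.tar.gz" containing the files under backup_dir.
    archive = shutil.make_archive(archive_base, "gztar", root_dir=backup_dir)
    shutil.rmtree(backup_dir)   # remove the backup output path promptly
    return archive
```

Keeping `archive_base` outside `backup_dir` avoids deleting the archive together with the backup directory.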

  11. Backup Integrity / Consistency

  12. The “Distributed Cluster Backup Problem”
      ● mongodump is single-node consistent only!
      ● Common to most or all database techs in sharded environments
      ● Problems:
        ○ Backup tools consider single-instance integrity only
        ○ Backups of different shards may complete at different times
        ○ Changes replicate asynchronously
        ○ Data may be balancing / moving in the cluster
      ● Risks:
        ○ Orphaned documents / references
        ○ Holes in data

  13. Backups: mongodb_consistent_backup
      ● Python project by Percona-Lab for consistent backups
      ● URL: https://github.com/Percona-Lab/mongodb_consistent_backup
      ● Best-effort support, not a “Percona Product”
      ● Created to solve limitations in MongoDB backup tools:
        ○ Replica Set and Sharded Cluster awareness
        ○ Cluster-wide Point-in-time consistency
        ○ In-line Oplog backup (vs post-backup)
        ○ Notifications of success / failure
      ● Extra Features
        ○ Remote Upload (AWS S3, Google Cloud Storage and Rsync)
        ○ Archiving (Tar or ZBackup deduplication and optional AES-at-rest)
        ○ CentOS/RHEL7 RPMs and Docker-based releases (.deb soon!)
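To illustrate what cluster-wide point-in-time consistency means, the sketch below picks a common end point across shards whose backups finished at different times and trims each captured oplog to it. This is a simplified illustration, not the code of mongodb_consistent_backup: the function names are hypothetical, timestamps are modelled as plain `(seconds, increment)` tuples, and shard clocks are assumed to be comparable.

```python
from typing import Dict, List, Tuple

# An oplog timestamp is modelled as (seconds, increment), like a BSON Timestamp.
Timestamp = Tuple[int, int]


def consistent_end_point(last_ts_per_shard: Dict[str, Timestamp]) -> Timestamp:
    """Return the newest timestamp that every shard's captured oplog reaches."""
    return min(last_ts_per_shard.values())


def trim_oplog(entries: List[Timestamp], end: Timestamp) -> List[Timestamp]:
    """Drop oplog entries newer than the common end point."""
    return [ts for ts in entries if ts <= end]


if __name__ == "__main__":
    last = {"shard0": (1500000100, 3), "shard1": (1500000090, 1)}
    print(consistent_end_point(last))
```

Replaying each shard's trimmed oplog on restore would then bring every shard to the same moment rather than to its own completion time.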

  14. Backups: mongodb_consistent_backup
      ● 1.2.0
        ○ Multi-threaded Rsync Upload
        ○ Replica Set Tags support
        ○ Support for MongoDB SSL / TLS connections and client auth
        ○ Rotation / Expiry of old backups (locally-stored only)
      ● Future
        ○ Incremental Backups
        ○ Binary-level Backups (Hot Backup, Cold Backup, LVM, Cloud-based, etc)
        ○ More Notification Methods (PagerDuty, Email, etc)
        ○ Restore Helper Tool
        ○ Instrumentation / Metrics
        ○ <YOUR AWESOME IDEA HERE> we take GitHub PRs (and it’s Python)!

  15. Backup Architecture

  16. Architecture: Simple Example
      ● Method
        ○ Run mongodump (with --oplog) using a plain secondary
        ○ Store backups with on-site remote storage (filer, rsync, etc)
      ● Potential Issues
        ○ Application Impact
          ■ I/O and CPU impact due to backups may affect application
          ■ Storage-engine and FS caches will become dirty
          ■ Primary Failure
            ● A failure of the Primary may cause the Secondary backing-up to become Primary
            ● This can be avoided by using a Read Preference of ‘secondary’ (supported in recent mongodump versions)
        ○ No Disaster Recovery
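The simple architecture above might translate into command lines like the ones this Python sketch builds. The helper names and host strings are hypothetical; `--oplog`, `--readPreference` and mongorestore's `--oplogReplay` are existing flags, with `--readPreference` available only in more recent mongodump versions, as the slide notes.

```python
from typing import List


def dump_with_oplog(host: str, out_dir: str) -> List[str]:
    """mongodump argv that reads from a secondary and captures the oplog."""
    return [
        "mongodump",
        "--host", host,                    # e.g. "rs0/db1:27017,db2:27017"
        "--readPreference", "secondary",   # never read from the primary
        "--oplog",                         # also record writes made during the dump
        "--out", out_dir,
    ]


def restore_with_oplog(host: str, dump_dir: str) -> List[str]:
    """mongorestore argv that replays the captured oplog after loading the data."""
    return ["mongorestore", "--host", host, "--oplogReplay", dump_dir]
```

Pairing `--oplog` on the dump with `--oplogReplay` on the restore is what makes the single-node backup consistent to the moment the dump finished.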

  17. Architecture: Tag-Based Example
      ● Replica Set Tags
        ○ Allow selection of MongoDB nodes using key/value pairs
        ○ Represented in JSON/single document
        ○ Multiple key/value pairs are possible
      ● Example Backup from “west” Only
        ○ Specify a single node with a tag such as { location: “west” }
        ○ Use Read Preference Tag in mongodump/mongodb_consistent_backup to target a specific node.
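A tag-based read preference can be expressed as a small JSON document. The sketch below builds one for the `{ location: "west" }` example and places it in a mongodump command line; the helper names are hypothetical, and passing a JSON document to `--readPreference` is assumed to be supported by the mongodump version in use.

```python
import json
from typing import Dict, List


def tagged_read_preference(tags: Dict[str, str], mode: str = "secondary") -> str:
    """Build a read-preference document that targets members by tag."""
    return json.dumps({"mode": mode, "tagSets": [tags]})


def tagged_mongodump(host: str, out_dir: str, tags: Dict[str, str]) -> List[str]:
    """mongodump argv that only reads from members matching the given tags."""
    return [
        "mongodump",
        "--host", host,
        "--readPreference", tagged_read_preference(tags),
        "--out", out_dir,
    ]


if __name__ == "__main__":
    print(tagged_read_preference({"location": "west"}))
```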

  18. Architecture: Offsite Backup Example
      ● Example
        ○ Create backup within local datacenter
        ○ Upload completed backups to other datacenter, cloud, etc
          ■ mongodb_consistent_backup supports Amazon S3, Google Cloud Storage and Rsync for remote upload!
      ● Benefits
        ○ Fast backup time due to in-datacenter latency
      ● Drawbacks
        ○ A full backup's data is uploaded on each backup job

  19. Architecture: Disaster Recovery Example
      ● Example
        ○ Place a SECONDARY node in another location
          ■ Dedicated node is recommended to reduce impact
          ■ hidden:true recommended
        ○ Run backup from off-site SECONDARY member
        ○ Optionally upload to Cloud Storage
      ● Benefits
        ○ Only changes (replication) replicated to offsite location
        ○ Potentially faster uploads to Cloud Storage
      ● Drawbacks
        ○ Bootstrap / Initial Sync may use high bandwidth (if not seeded by backup)
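For reference, a replica set member document for such a dedicated off-site node could look like the one this sketch returns. The helper name, member id and hostname are hypothetical; the one fixed rule is that a hidden member must also have priority 0.

```python
from typing import Any, Dict


def hidden_backup_member(member_id: int, host: str) -> Dict[str, Any]:
    """Return a replica set member document for a hidden, backup-only node."""
    return {
        "_id": member_id,
        "host": host,
        "hidden": True,   # not advertised to application clients
        "priority": 0,    # required for hidden members; can never become primary
    }


if __name__ == "__main__":
    print(hidden_backup_member(3, "dr1.example.net:27017"))
```

Because hidden members are not advertised to clients, a backup job would typically connect to this host directly rather than rely on read-preference routing.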

  20. Restore and Validation
      “It’s not a backup system, it’s a restore system” ~ Raymond Blum, Google SRE

  21. Restoring and Validation
      ● Methodology
        ○ Optimise restore time, not backup run time
          ■ Users and business care how fast their data is back, not how long it takes to back up
          ■ Binary-level backups are much faster to restore in MongoDB
      ● Validation
        ○ This is very application specific
        ○ Randomly sample restored data and validate
          ■ Example: Compare to Production
            ● Compare real Production item, user, article, etc to backup
            ● Ensure backup age doesn’t cause false alarms, i.e. test data older than backup
          ■ Example: Integration Test / QA
            ● Run code integration tests or QA on restored data
          ■ Example: Production Backup as Test Data
            ● Copy Production Data to Test periodically using backups
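The "compare to Production" check can be sketched in Python as below. Documents are modelled as plain dictionaries keyed by id, and the `updated_at` field is a hypothetical application-level timestamp; real validation would read from the production and restored databases instead.

```python
import random
from typing import Any, Dict, List


def sample_mismatches(production: Dict[Any, dict], restored: Dict[Any, dict],
                      backup_time: float, sample_size: int, seed: int = 0) -> List[Any]:
    """Compare a random sample of production documents with the restored copy.

    Only documents last modified before the backup started are eligible, so
    changes made after the backup are not reported as failures.
    """
    eligible = [doc_id for doc_id, doc in production.items()
                if doc["updated_at"] < backup_time]
    rng = random.Random(seed)
    chosen = rng.sample(eligible, min(sample_size, len(eligible)))
    # A missing or differing restored document counts as a mismatch.
    return [doc_id for doc_id in chosen if restored.get(doc_id) != production[doc_id]]
```

Filtering on `updated_at` is what keeps backup age from causing false alarms: anything changed after the backup is expected to differ and is excluded from the sample.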

  22. Thank You Sponsors!

  23. April 23-25, 2018 SAVE THE DATE! Santa Clara Convention Center CALL FOR PAPERS OPENING SOON! www.perconalive.com

  24. Questions?
