The Backup Methods Available for MongoDB, by Adamo Tonete



slide-1
SLIDE 1

The Backup Methods Available for MongoDB

Adamo Tonete

slide-2
SLIDE 2

2

Agenda

Backup importance for companies and backup plans. Available Methods:

  • Disk Snapshot
  • mongodump
  • rsync or copy
  • Point in time backup from Percona
  • MongoDB Cloud / Ops Manager backup (on-prem)
  • Hot Backup

Q&A

slide-3
SLIDE 3

Replica-set and Shard Concepts 101

slide-4
SLIDE 4

4

Replica-set and Shard Concepts

slide-5
SLIDE 5

5

Replica-set and Shard Concepts

slide-6
SLIDE 6

Why is Backup Important?

slide-7
SLIDE 7

7

Why is Backup Important?

Data usually is the most valuable asset in a company. A company with severe data loss may not even come back to the business. Could you imagine a bank losing all its data or an e-commerce offline for 1 week?

slide-8
SLIDE 8

8

Data loss can occur in four main situations:

  1) Human error
  2) DB failure/corruption
  3) System failure/collapse
  4) Security breach

Why is Backup Important?

slide-9
SLIDE 9

Backup Plan

slide-10
SLIDE 10

10

Backup Plan

Choose the best RPO and RTO for your company.

  • Recovery Point Objective (RPO)
  • Recovery Time Objective (RTO)

Backup Plan/Disaster Recovery Plan

slide-11
SLIDE 11

11

  • RTO is how much time the company can accept being "offline".
  • How long should it take to have my application back online?

Why is Backup Important?

slide-12
SLIDE 12

12

  • RPO is the point in time the backups must be able to restore to when we have a data loss/incident.
  • This is an extremely important metric for deciding how often a backup needs to be made.

Why is Backup Important?

slide-13
SLIDE 13

13

Backup Plan/Disaster Recovery Plan

1TB replica-set

slide-14
SLIDE 14

14

Backup Plan/Disaster Recovery Plan

RTO = 20 minutes RPO = 30 minutes

1TB replica-set

slide-15
SLIDE 15

15

Backup Plan/Disaster Recovery Plan

RTO = 20 minutes RPO = 30 minutes

1TB replica-set

95% reads, 5% writes

slide-16
SLIDE 16

16

Backup Plan/Disaster Recovery Plan

RTO = 20 minutes RPO = 30 minutes

1TB replica-set

2000 inserts/day, 3000 reviews/day

slide-17
SLIDE 17

17

Backup Plan/Disaster Recovery Plan

We have 1TB of data and...

  • 5 GB is for user login
  • 2 GB/day of new writes
  • ~900 GB of reviews
  • 40 GB is the favorites (90% of the traffic)

Favorites are updated asynchronously every 20 minutes.

slide-18
SLIDE 18

18

Backup Plan/Disaster Recovery Plan

  • Login, Favorites, Comment/upvote: 90% traffic - 10% data
  • Historical data/non-favorites: 10% traffic - 90% data

slide-19
SLIDE 19

19

Backup Plan/Disaster Recovery Plan

  • Back up the user database every 30 minutes
  • Back up the favorite topics every 20 minutes (right after the sync)
  • Back up the new comments incrementally (using a filter for created_at > last backup)
  • Back up the historical/non-favorites collection once per day

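
The incremental comments backup above could be driven by mongodump's --query option. The following is a minimal sketch, not the presenter's script: the database name "app", the collection name "comments", the output path, and the timestamp are all hypothetical, and it assumes a mongodump release that expects the --query value as Extended JSON.

```python
import json
from datetime import datetime, timezone


def incremental_dump_args(db, collection, last_backup, out_dir):
    """Build a mongodump argv that exports only documents newer than last_backup."""
    # Render the timestamp as an Extended JSON date in UTC.
    since = last_backup.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    query = {"created_at": {"$gt": {"$date": since}}}
    return [
        "mongodump",
        "--db", db,                    # --query requires --db and --collection
        "--collection", collection,
        "--query", json.dumps(query),
        "--out", out_dir,
    ]


# Hypothetical values for illustration only.
last_backup = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
args = incremental_dump_args("app", "comments", last_backup, "/backups/comments")
print(args)
```

The list can be handed to subprocess.run(args, check=True) on a host where mongodump is installed. The time of each successful run would need to be stored somewhere so the next run can use it as last_backup.
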
slide-20
SLIDE 20

20

Backup Plan/Disaster Recovery Plan

  • User data: 5 GB, every 30 minutes
  • Favorites: 40 GB, every 20 minutes
  • Non-favorite data: 900 GB
  • Comments: 500 MB, every hour

slide-21
SLIDE 21

21

What feature should have priority in a recovery situation?

Backup Plan/Disaster Recovery Plan

slide-22
SLIDE 22

22

Backup Plan/Disaster Recovery Plan

  • Login, Favorites, Comment/upvote: 90% traffic - 10% data

slide-23
SLIDE 23

23

  • With 10% of the data restored, the environment handles 90% of the requests while the old data is slowly recovered.
  • Not all companies consider this a full recovery within the RTO, but others do. It depends on the expectations.

Replica-sets and Shard concepts
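
A rough calculation shows why restoring the hot data first helps with the RTO. This sketch uses the figures from the plan (1 TB total, 5 GB of user data, 40 GB of favorites, RTO of 20 minutes). It assumes decimal units and ignores startup time, index builds, and validation, so the numbers are only a lower bound on the throughput needed.

```python
def required_mb_per_s(size_gb: float, rto_minutes: float) -> float:
    """Sustained throughput (MB/s) needed to copy size_gb within rto_minutes."""
    return size_gb * 1000 / (rto_minutes * 60)


RTO_MINUTES = 20       # from the plan
FULL_SET_GB = 1000     # the whole 1 TB replica-set (assuming 1 TB = 1000 GB)
HOT_SET_GB = 5 + 40    # user data + favorites

full = required_mb_per_s(FULL_SET_GB, RTO_MINUTES)
hot = required_mb_per_s(HOT_SET_GB, RTO_MINUTES)
print(f"full set: {full:.1f} MB/s, hot set: {hot:.1f} MB/s")
# prints "full set: 833.3 MB/s, hot set: 37.5 MB/s"
```

Under these assumptions the hot set needs roughly a twentieth of the throughput the full set would, which is what makes a 20-minute target realistic for the part of the data that serves most of the traffic.
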

slide-24
SLIDE 24

Disk Snapshot

slide-25
SLIDE 25

25

A disk snapshot is a full copy of the data currently on a disk. The snapshot process may take a while, but the advantage is that when a restore is needed the files are already in the form the database uses. There is no need to create indexes or run a logical restore, so recovery is fast.

Disk Snapshot

slide-26
SLIDE 26

26

Disk Snapshot

Advantages:

  • Straightforward approach: take a copy of what is on the disk and that’s all.

slide-27
SLIDE 27

27

Disk Snapshot

Disadvantages

  • May slow down the database while the snapshot is being created.
  • Can take several hours, depending on the disk speed.
  • No "partial" restore: all or nothing.

slide-28
SLIDE 28

28

Disk Snapshot

  • Backup type: Binary copy
  • Time to backup: High
  • Complexity: Low
  • Time to recover: Low

slide-29
SLIDE 29

Rsync or scp to a different host

slide-30
SLIDE 30

30

Rsync or SCP

  • Consists of copying the entire data folder to a different machine/disk while the mongod process is stopped or all writes are stopped.
  • It was very common with MMAPv1 and is still possible with WiredTiger.

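
One way to stop all writes during the copy is the fsync command with lock: true, released afterwards with fsyncUnlock. The sketch below is an illustration under assumptions, not the presenter's procedure: the client argument is expected to behave like a pymongo MongoClient, and the paths and host name in rsync_data_dir are hypothetical.

```python
import subprocess


def copy_with_writes_locked(client, copy_fn):
    """Flush and lock writes on the node, run copy_fn, then always unlock."""
    client.admin.command("fsync", lock=True)
    try:
        copy_fn()
    finally:
        # Unlock even if the copy fails, so the node does not stay blocked.
        client.admin.command("fsyncUnlock")


def rsync_data_dir(src="/var/lib/mongodb/", dst="backup-host:/backups/mongodb/"):
    """Copy the data folder with rsync (hypothetical source and destination)."""
    subprocess.run(["rsync", "-a", src, dst], check=True)


# Possible usage against a secondary (requires pymongo and a reachable node):
#   from pymongo import MongoClient
#   copy_with_writes_locked(MongoClient("mongodb://secondary-host:27017"), rsync_data_dir)
```

Running this against a secondary rather than the primary keeps the write lock away from application traffic, which matches the "stop a secondary or lock writes" trade-off of this method.
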
slide-31
SLIDE 31

31

Advantages

  • Data is ready to be used in the target folder. Just start the mongod process using the backup folder.

Rsync or SCP

slide-32
SLIDE 32

32

Rsync or SCP

Disadvantages

  • Needs to stop a secondary or lock writes.
  • May affect performance.
  • Restore is all or nothing.

slide-33
SLIDE 33

33

Rsync or SCP

  • Backup type: Binary
  • Time to backup: High
  • Complexity: Medium
  • Time to recover: Low

slide-34
SLIDE 34

Mongodump

slide-35
SLIDE 35

35

mongodump

mongodump is bundled with MongoDB and is the preferred tool for backing up a MongoDB database. It is important to mention that there are 2 steps to perform a disaster recovery when using mongodump:

  1) Create the dump file
  2) Restore the dump file with mongorestore

slide-36
SLIDE 36

36

mongodump

Use mongodump to create backups per:

  • Database
  • Collection
  • Specific value (query)
  • Point in time backup (when using replica-sets)
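
The backup scopes listed above map onto mongodump command-line options. The helper below is a small sketch of that mapping; the function name and the example database, collection, and path names are hypothetical. It also encodes two constraints of the tool: --query needs a collection, and --oplog only works for full-instance dumps.

```python
def mongodump_args(out_dir, db=None, collection=None, query=None, oplog=False):
    """Build a mongodump argv for one of the backup scopes."""
    if collection and not db:
        raise ValueError("--collection requires --db")
    if query and not collection:
        raise ValueError("--query requires --collection")
    if oplog and (db or collection):
        # mongodump supports --oplog only when dumping the whole instance.
        raise ValueError("--oplog cannot be combined with --db/--collection")

    args = ["mongodump", "--out", out_dir]
    if db:
        args += ["--db", db]
    if collection:
        args += ["--collection", collection]
    if query:
        args += ["--query", query]
    if oplog:
        args.append("--oplog")
    return args


# Hypothetical names for illustration only.
per_database = mongodump_args("/backups/app", db="app")
per_collection = mongodump_args("/backups/users", db="app", collection="users")
per_query = mongodump_args("/backups/active", db="app", collection="users",
                           query='{"active": true}')
point_in_time = mongodump_args("/backups/full", oplog=True)

for cmd in (per_database, per_collection, per_query, point_in_time):
    print(" ".join(cmd))
```
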
slide-37
SLIDE 37

37

Although the mongodump tool is very versatile, merely having a backup file doesn't mean you are safe. Dump files need to be processed by mongorestore to rebuild the database, and an error in the dump file may break the entire restore process.

mongodump

slide-38
SLIDE 38

38

mongodump

[Diagram: dump process writing the backup files]

slide-39
SLIDE 39

39

mongodump

Backup files

Collection   Start Time   End Time
users        T            T+10
logins       T            T+20
favorites    T+10         T+30
other        T+20         T+40

slide-40
SLIDE 40

40

mongorestore

[Diagram: backup files]

slide-41
SLIDE 41

41

Mongodump

[Diagram: dump process writing the backup files, plus the oplog]

slide-42
SLIDE 42

42

Mongodump

Backup files

Collection   Start Time   End Time   Oplog
users        T            T+10       T+50
logins       T            T+20       T+40
favorites    T+10         T+30       T+20
messages     T+20         T+40       T+0
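
Since each collection finishes dumping at a different time, a consistent replica-set backup relies on capturing the oplog during the dump and replaying it on restore. A minimal sketch of the two commands, with a hypothetical backup directory:

```python
import shlex

BACKUP_DIR = "/backups/full"  # hypothetical path

# Full-instance dump that also records oplog entries written during the dump.
dump_cmd = ["mongodump", "--oplog", "--out", BACKUP_DIR]

# Restore that replays the captured oplog, bringing every collection to the
# point in time at which the dump ended.
restore_cmd = ["mongorestore", "--oplogReplay", "--dir", BACKUP_DIR]

for cmd in (dump_cmd, restore_cmd):
    print(shlex.join(cmd))
```

Both lists can be passed to subprocess.run(..., check=True) on a host with the MongoDB database tools installed.
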

slide-43
SLIDE 43

43

Mongodump

It is easy to achieve a point-in-time backup in a replica-set with mongodump. However, the same is not true for a sharded cluster. How can we guarantee that all the backups end at the same time? https://github.com/Percona-Lab/mongodb_consistent_backup

slide-44
SLIDE 44

44

Mongodump + Percona Scripts

Percona point-in-time backup is a beta tool from Percona for backing up a whole cluster at a single point in time. It relies on mongodump and ensures all the dumps end at the same time, generating a point-in-time backup of the cluster. Full backup only, not partial.

slide-45
SLIDE 45

45

Mongodump + Percona Scripts

slide-46
SLIDE 46

46

Advantages

  • Highly flexible tool to generate backups.
  • Default logical backup method offered by MongoDB.

Mongodump + Percona Scripts

slide-47
SLIDE 47

47

Mongodump + Percona Scripts

Disadvantages

  • Default behavior is not point in time.
  • Restore can take longer, as indexes need to be rebuilt.
  • Backup files need to be tested.

slide-48
SLIDE 48

48

Mongodump + Percona Scripts

  • Backup type: logical
  • Time to backup: depends
  • Complexity: low to high
  • Time to recover: depends, usually high

slide-49
SLIDE 49

MongoDB Atlas

slide-50
SLIDE 50

50

MongoDB Atlas

Fully managed backup service offered by MongoDB. It is possible to back up using cloud provider snapshots or continuous backup. Only an agent needs to be installed and it is all done. The configuration is done through a website. No technical skills needed.

slide-51
SLIDE 51

51

MongoDB Atlas

  • Backup type: logical/snapshots
  • Time to backup: low
  • Complexity: (unknown)
  • Time to recover: (unknown), probably fast as the data is in the same DC

slide-52
SLIDE 52

Percona Hot Backup

slide-53
SLIDE 53

53

Binary, lightweight backup method that copies the database to a different folder/disk without affecting the instance performance. Available for WiredTiger only. Acts very much like a disk snapshot, but at the database level. Generates a point-in-time copy of the database.

Percona Hot Backup
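
In Percona Server for MongoDB, a hot backup is requested with the createBackup admin command. The sketch below wraps that call. It assumes client behaves like a pymongo MongoClient connected to a Percona Server for MongoDB node; the function name, host name, and backup path are hypothetical, and the directory must be writable by the mongod process on the server itself.

```python
def create_hot_backup(client, backup_dir):
    """Ask the server to write a hot backup of its data files to backup_dir."""
    # Sends {createBackup: 1, backupDir: backup_dir} to the admin database.
    return client.admin.command("createBackup", backupDir=backup_dir)


# Possible usage (requires pymongo and a Percona Server for MongoDB node):
#   from pymongo import MongoClient
#   create_hot_backup(MongoClient("mongodb://db-host:27017"), "/backups/hot/2024-01-01")
```

The resulting folder is a copy of the data files, so restoring it means pointing a mongod process at that folder rather than running mongorestore.
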

slide-54
SLIDE 54

54

Percona Hot Backup

  • Backup type: binary
  • Time to backup: medium
  • Complexity: low
  • Time to recover: low

slide-55
SLIDE 55

Questions
