MySQL Backup and Restore at Facebook Scale (Ola Berjak)




slide-1
SLIDE 1

MySQL Backup and Restore at Facebook Scale

Ola Berjak

Production Engineer at MySQL Infrastructure, Facebook London

slide-2
SLIDE 2

MySQL Backup and Restore at Facebook Scale
…and how it’s not rocket science

Ola Berjak

Production Engineer at MySQL Infrastructure, Facebook London

slide-3
SLIDE 3
slide-4
SLIDE 4

  • When do we need backups?
  • How do we perform backups?
  • How do we restore backups?
slide-7
SLIDE 7

When do we need backups?

slide-8
SLIDE 8

MALICIOUS ATTACKER
HARDWARE FAILURE
HUMAN ERROR
slide-10
SLIDE 10

FULL DUMPS
DIFFS
slide-11
SLIDE 11

How do we perform backups?

slide-12
SLIDE 12

Every database, every day

slide-13
SLIDE 13

                              LOGICAL BACKUPS   PHYSICAL BACKUPS
CUSTOMER LOGIC                Easy              Complex
DEBUGGING                     Easy              Complex
SINGLE TABLE RESTORE          Easy              Complex
PORTABILITY                   Consistent        Inconsistent
BACKUP AND RESTORE DURATION   Long              Short
slide-14
SLIDE 14

Technical setup

FULL DUMPS

  • mysqldump
  • number of rows for each table
  • zstd compression
  • trailing index
slide-15
SLIDE 15

mysqldump --single-transaction --skip-lock-tables (...)

slide-16
SLIDE 16

mysqldump --single-transaction --skip-lock-tables (...) | compress and add index

slide-17
SLIDE 17

mysqldump --single-transaction --skip-lock-tables (...) | compress and add index | upload
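
The dump, compress, upload chain above can be sketched in Python. This is a minimal sketch under stated assumptions, not the tooling used in the talk: the function names are hypothetical, the `zstd` command-line tool is assumed to be installed, `upload` stands in for whatever storage client is available, the options elided as "(...)" on the slide are not filled in, and the "add index" step is left out.

```python
import subprocess
from typing import Callable, List


def build_dump_command(database: str) -> List[str]:
    """Return the mysqldump invocation from the slide for one database.

    Connection options and the flags elided as "(...)" are not included.
    """
    return ["mysqldump", "--single-transaction", "--skip-lock-tables", database]


def dump_compress_upload(database: str, upload: Callable[[bytes], None]) -> None:
    """Pipe mysqldump through the zstd CLI and pass the result to `upload`.

    `upload` is a hypothetical callable; it receives the compressed dump.
    """
    dump = subprocess.Popen(build_dump_command(database), stdout=subprocess.PIPE)
    zstd = subprocess.Popen(["zstd", "-c"], stdin=dump.stdout, stdout=subprocess.PIPE)
    dump.stdout.close()  # lets mysqldump see a broken pipe if zstd exits early
    compressed, _ = zstd.communicate()
    if dump.wait() != 0 or zstd.returncode != 0:
        raise RuntimeError(f"backup of {database} failed")
    upload(compressed)
```

Checking both exit codes matters here: a pipeline that only looks at the last stage can upload a truncated dump without noticing.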
slide-18
SLIDE 18

Trailing index

{ "size": 7331, "offset": 1337, "table_name": "foo" },
{ "size": 223, "offset": 8668, "table_name": "bar" }
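
In the example above, the offset of "bar" (8668) equals the offset of "foo" plus its size (1337 + 7331), so each entry locates one table's chunk inside the dump file. The sketch below shows how such an index could be written and used to read a single table. It is an illustration only: the file layout (chunks, then JSON index, then an 8-byte index length), the function names, and the use of `zlib` in place of zstd (to stay within the Python standard library) are all assumptions.

```python
import json
import struct
import zlib
from typing import Dict, List


def write_dump_with_index(tables: Dict[str, bytes]) -> bytes:
    """Concatenate per-table compressed chunks and append a trailing index.

    Assumed layout: [chunk]...[chunk][index JSON][8-byte big-endian index length].
    """
    body = bytearray()
    index: List[dict] = []
    for table_name, sql in tables.items():
        chunk = zlib.compress(sql)  # stand-in for zstd
        index.append({"size": len(chunk), "offset": len(body), "table_name": table_name})
        body += chunk
    index_bytes = json.dumps(index).encode("utf-8")
    return bytes(body) + index_bytes + struct.pack(">Q", len(index_bytes))


def read_index(blob: bytes) -> List[dict]:
    """Read the index back using the length stored in the last 8 bytes."""
    (index_len,) = struct.unpack(">Q", blob[-8:])
    return json.loads(blob[-8 - index_len:-8])


def read_table(blob: bytes, table_name: str) -> bytes:
    """Decompress only the chunk that belongs to `table_name`."""
    for entry in read_index(blob):
        if entry["table_name"] == table_name:
            start = entry["offset"]
            return zlib.decompress(blob[start:start + entry["size"]])
    raise KeyError(table_name)
```

Because the index sits at the end, the writer can stream chunks without knowing their sizes in advance, and a reader can restore one table without decompressing the rest.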
slide-19
SLIDE 19

Open-source

mysqldump: https://github.com/facebook/mysql-5.6
zstd: https://github.com/facebook/zstd
slide-20
SLIDE 20

Open source tooling will get the job done

slide-21
SLIDE 21

Open source tooling will get the job done

automysqlbackup

  • scheduling
  • email notifications
  • custom backup rotation
slide-22
SLIDE 22

Open source tooling will get the job done

automysqlbackup

  • scheduling
  • email notifications
  • custom backup rotation

monitoring tools

  • alerting
slide-23
SLIDE 23
slide-24
SLIDE 24

Technical setup

DIFFS

  • full dump format
  • 2 files for a single diff backup
  • rows removed
  • rows inserted and updated
slide-25
SLIDE 25

DiffDatabase

most recent dump backup, “base dump”:

    CREATE TABLE foo
    INSERT INTO foo
    -- rows for foo: 1337
    CREATE TABLE bar
    INSERT INTO bar

new dump:

    CREATE TABLE foo
    INSERT INTO foo
    -- rows for foo: 7331
    CREATE TABLE bar
    INSERT INTO bar

Output: diff1, diff2
slide-26
SLIDE 26

MergeDatabase

base dump:

    CREATE TABLE foo
    INSERT INTO foo
    -- rows for foo: 1337
    CREATE TABLE bar
    INSERT INTO bar

diff1, diff2

new dump:

    CREATE TABLE foo
    INSERT INTO foo
    -- rows for foo: 1337
    CREATE TABLE bar
    INSERT INTO bar
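
The DiffDatabase and MergeDatabase steps can be illustrated with a small Python sketch. It assumes each table is modelled as a mapping from primary key to row values; the function names and this representation are hypothetical and say nothing about how the real tools parse dump files. The two return values of `diff_database` correspond to the two diff files: rows removed, and rows inserted or updated.

```python
from typing import Dict, Set, Tuple

Rows = Dict[int, tuple]  # primary key -> row values (assumed representation)


def diff_database(base: Rows, new: Rows) -> Tuple[Set[int], Rows]:
    """Return (keys removed since base, rows inserted or updated since base)."""
    removed = set(base) - set(new)
    upserted = {pk: row for pk, row in new.items() if base.get(pk) != row}
    return removed, upserted


def merge_database(base: Rows, removed: Set[int], upserted: Rows) -> Rows:
    """Rebuild the new dump from the base dump plus the two diff parts."""
    merged = {pk: row for pk, row in base.items() if pk not in removed}
    merged.update(upserted)
    return merged
```

Merging the base with its diff should reproduce the new dump exactly, which is also a convenient property to check when validating diff backups.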
slide-27
SLIDE 27

D D D D F D D D D F F

slide-28
SLIDE 28

2-3x+ less space used
slide-29
SLIDE 29

Explore the open source tooling

slide-30
SLIDE 30

Explore the open source tooling

automysqlbackup

  • scheduling
  • email notifications
  • custom backup rotation

differential backups
slide-31
SLIDE 31

Due diligence checklist

slide-32
SLIDE 32

Due diligence checklist

  • verify the size
slide-33
SLIDE 33

Due diligence checklist

  • verify the size
  • set up alerting
slide-34
SLIDE 34

Due diligence checklist

  • verify the size
  • set up alerting
  • store checksums and metadata
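
The size and checksum items on this checklist could look roughly like the Python sketch below. The metadata fields, the SHA-256 choice, and the 50% size tolerance are assumptions made for illustration; the slides do not say which checksum or thresholds are used.

```python
import hashlib
from typing import Optional


def backup_metadata(payload: bytes) -> dict:
    """Record the size and checksum to store alongside a backup."""
    return {"size": len(payload), "sha256": hashlib.sha256(payload).hexdigest()}


def size_looks_sane(size: int, previous_size: Optional[int], tolerance: float = 0.5) -> bool:
    """Flag empty backups and backups whose size moved too far from the last one."""
    if size == 0:
        return False
    if previous_size is None:
        return True  # nothing to compare against yet
    return abs(size - previous_size) <= tolerance * previous_size


def verify_download(payload: bytes, metadata: dict) -> bool:
    """Check that a downloaded backup still matches its stored metadata."""
    return backup_metadata(payload) == metadata
```

A failed `size_looks_sane` check is the kind of signal that would feed the alerting item on the same checklist.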
slide-35
SLIDE 35

MALICIOUS ATTACKER
HARDWARE FAILURE
HUMAN ERROR
slide-36
SLIDE 36

Technical setup

BINARY LOGS

  • all transactions for all databases from master
  • compressed using zstd
  • metadata stored
slide-37
SLIDE 37

Due diligence checklist

  • verify the size
  • set up alerting
  • store checksums and metadata
slide-38
SLIDE 38

Due diligence checklist

  • verify the size
  • set up alerting
  • store checksums and metadata
  • detect gaps in transactions backed up
slide-39
SLIDE 39

Due diligence checklist

  • verify the size
  • set up alerting
  • store checksums and metadata
  • detect gaps in transactions backed up
  • monitor the “backup lag”
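
The last two checklist items can be sketched in Python as follows. The model is an assumption: each backed-up binary log is described by an inclusive range of transaction sequence numbers, and backup lag is measured as the age of the newest backed-up transaction. Real metadata (for example GTID sets) would need its own parsing, which is not shown.

```python
from typing import List, Tuple


def find_gaps(ranges: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return missing (first, last) transaction ranges between backed-up ranges.

    Each input tuple is the inclusive range covered by one backed-up binary log.
    """
    gaps = []
    covered_up_to = None
    for first, last in sorted(ranges):
        if covered_up_to is not None and first > covered_up_to + 1:
            gaps.append((covered_up_to + 1, first - 1))
        covered_up_to = last if covered_up_to is None else max(covered_up_to, last)
    return gaps


def backup_lag_seconds(now: float, last_backed_up_commit: float) -> float:
    """Seconds between now and the newest transaction that is safely backed up."""
    return max(0.0, now - last_backed_up_commit)
```

For example, `find_gaps([(1, 100), (101, 200), (250, 300)])` reports that transactions 201 to 249 are missing from the backups.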
slide-40
SLIDE 40

How do we restore backups?

slide-41
SLIDE 41

Continuous restore pipeline

slide-42
SLIDE 42

Continuous restore pipeline

Scheduler
slide-43
SLIDE 43

Continuous restore pipeline

Scheduler
Warchief Loadbalancer

slide-44
SLIDE 44

Continuous restore pipeline

Scheduler
Loadbalancer
DB

slide-45
SLIDE 45

Continuous restore pipeline

Scheduler
Loadbalancer
DB
Worker  Worker  Worker  Worker
MySQL   MySQL   MySQL   MySQL

slide-46
SLIDE 46

SELECT CHECKSUM DOWNLOAD LOAD VERIFY REPLAY
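
Reading the six labels above as sequential stages of one restore job (an assumption; the slide only lists the names), a worker could be sketched in Python like this. The `handlers` mapping and `run_restore` are hypothetical names, and each handler simply reports success or failure.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Stage names as they appear on the slide, in slide order.
STAGES = ["SELECT", "CHECKSUM", "DOWNLOAD", "LOAD", "VERIFY", "REPLAY"]


def run_restore(handlers: Dict[str, Callable[[], bool]]) -> Tuple[List[str], Optional[str]]:
    """Run the stages in order and stop at the first failure.

    Returns (completed stages, name of the failed stage or None).
    """
    completed: List[str] = []
    for stage in STAGES:
        if not handlers[stage]():
            return completed, stage
        completed.append(stage)
    return completed, None
```

Recording which stage failed makes it easier to tell a bad backup (a VERIFY failure, say) from an infrastructure problem such as a failed DOWNLOAD.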

slide-52
SLIDE 52

Start small and build up

slide-53
SLIDE 53

BUSINESS PRIORITIES
DEVELOPMENT TIME
BUSINESS CONTINUITY
DATA RESILIENCE
slide-54
SLIDE 54

Today’s agenda

  • Why do we need backups?
  • Backups and restores made easy
  • How to make sure our backups don’t go to “/dev/null”?
slide-57
SLIDE 57
slide-58
SLIDE 58

“The best outages are the ones that don’t happen.”

PRETTY MUCH EVERY PRODUCTION ENGINEER I KNOW

slide-59
SLIDE 59

Thank you

slide-60
SLIDE 60

Ola Berjak

aberjak@fb.com
@Lexxzor