Backup and Recovery Strategy About Stacy 10+ years of experience on - - PowerPoint PPT Presentation

backup and recovery strategy about stacy
SMART_READER_LITE
LIVE PREVIEW

Backup and Recovery Strategy About Stacy 10+ years of experience on - - PowerPoint PPT Presentation

Backup and Recovery Strategy About Stacy 10+ years of experience on various flavors of relational databases. Focus on performance tuning, code reviews, database deployment and infrastructure management for MySQL In her spare time, she


slide-1
SLIDE 1

Backup and Recovery Strategy

slide-2
SLIDE 2

About Stacy

▪ 10+ years of experience on various flavors of relational databases. ▪ Focus on performance tuning, code reviews, database deployment and infrastructure management for MySQL ▪ In her spare time, she enjoys reading books and doing some volunteer work.

slide-3
SLIDE 3

About Kushal

▪ 4+ years of experience on relational databases. ▪ Focus on infrastructure management for MySQL and Oracle RAC, code reviews, troubleshooting and capacity planning ▪ In spare time, like to play indoor and outdoor games

slide-4
SLIDE 4

MySQL at Yahoo

▪ MySQL powers many mission-critical products at Yahoo ▪ Different configurations based on product requirements ▪ Products: Yahoo Sports, Flickr, Finance, Tumblr, Daily Fantasy, Ad System, YDS ▪ MySQL Flavor: Percona Server 5.5 to 5.6 including XtraDB Cluster ▪ Operating systems : RHEL 6.x

slide-5
SLIDE 5

What are the next 50 minutes about?

§ Backups

  • XtraBackup and its options

§ Recovery

  • File based
  • GTID enabled
  • MTS without GTID

§ Backup Monitoring § Multi Source Replication Setup

slide-6
SLIDE 6

Backups

§ Why we need backup?

slide-7
SLIDE 7

MySQL Database Backup Method

  • Hot Backup

§ mysqldump § MySQL Enterprise Backup § Xtrabackup § File System

  • Cold Backup

§ File System

slide-8
SLIDE 8

Why go for XtraBackup

XtraBackup is an open-source hot backup utility

§ Optimized

  • Uses posix_fadvise() not to cache data in memory
  • Perform more aggressive read-ahead optimizations on the source files

§ Reliable

  • Checks for page corruption on each page with InnoDB’s buf_page_is_corrupted() function.

§ No interruption for writes

slide-9
SLIDE 9

Xtrabackup Improvements

Percona MySQL 5.6.16-64+ use backup locks instead of “FLUSH TABLES WITH READ LOCK “ It is a light-weight lock.

  • Block updates to non-transactional tables and

DDL statements for all tables

  • New backup lock will not flush tables

Storage engines are not forced to close tables and tables are not expelled from the table cache.

  • Only waits for conflicting statements to complete (i.e. DDL and updates to non-

transactional tables). It never waits for SELECTs, or UPDATEs to InnoDB tables to complete

https://www.percona.com/doc/percona-server/5.6/management/backup_locks.html

slide-10
SLIDE 10

XtraBackup

PRESENTED BY Brian Koch ⎪ March 2014

  • Records the log sequence number(LSN) when it starts
  • Copies all InnoDB files which is internally in inconsistent state
  • Read transaction log files to create its own change log file
  • Backup locks
  • Use LOCK TABLES FOR BACKUP to copy non-InnoDB files
  • Use Lock BINLOG FOR BACKUP to block operation and gets binary log coordinates
  • Apply xtrabackup logs to make data files consistent state
slide-11
SLIDE 11

Yahoo MySQL Database Architecture

▪Dual masters single writer setup ▪Each master carries one or more slave

VIP for Write Prod master BCP master Prod slave BCP slave VIP for Read-only Backup server

slide-12
SLIDE 12

Current Backup Strategy

PRESENTED BY Brian Koch ⎪ March 2014

  • Full backup on Sunday and incremental backup from Monday to Saturday
  • The backups are stored in external storage volume
  • The backup job runs on dedicated slave or prod slave
  • The binary logs are saved periodically

Prod master Dedicated slave Xtrabackup job Backup filer Copy binary logs

slide-13
SLIDE 13

§ xtrabackup.addl_args: --slave-info § xtrabackup.backup_dir: /mnt/backup/ § xtrabackup.cron_bkup_ret_hour: 0 § xtrabackup.cron_bkup_ret_min: 0 § xtrabackup.cron_bkup_ret_wday: 1 § xtrabackup.retain_extra_days: 14 § xtrabackup.cron_day: * § xtrabackup.cron_hour: 7 § xtrabackup.cron_min: 0 § xtrabackup.cron_wday: 0 § xtrabackup.incremental: --incremental § xtrabackup.cron_incremental_day: * § xtrabackup.cron_incremental_hour: 7 § xtrabackup.cron_incremental_min: 0 § xtrabackup.cron_incremental_wday: 1-6 § xtrabackup.crontab: on § xtrabackup.email: page-oncall-email § xtrabackup.backup_vol_threshold: 80 § xtrabackup.backup_user: root § xtrabackup.backup_password: <path to pw file>

Xtrabackup Wrapper

slide-14
SLIDE 14

Xtrabackup Options

Options Comments slave-info

Use it when taking backup is from the slave. Record its master's binary log file name and its position in the file xtrabackup_slave_info

parallel

It specifies the number of threads the xtrabackup child process should use to back up files concurrently, speed up backup process

compress

It instructs xtrabackup to compress backup copies of InnoDB data files.

use-memory

Speed up the apply log process

safe-slave- backup

stop slave SQL thread before running FLUSH TABLES WITH READ LOCK and wait to start backup until Slave_open_temp_tables in SHOW STATUS is zero.

incremental

Instructs xtrabackup to take incremental backup instead of fullbackup

slide-15
SLIDE 15

Xtrabackup with Network Streaming

PRESENTED BY Brian Koch ⎪ March 2014

Target mkdir /home/tmp/backup

screen -S backup nc -l 9999 | tar -ixf - -C /home/tmp/backup

Source screen -S backup

sudo innobackupex --user=root –password=** --slave-info --stream=tar ./ | nc <target> 9999

Target The backups will be in target:/home/tmp/backup

sudo innobackupex --apply-log --use-memory=16G --parallel=8 /home/tmp/backup sudo innobackupex --move-back /home/tmp/backup sudo chown -R mysql:mysql /home/y/…/mysql start mysql_server

slide-16
SLIDE 16

Backup Option: --compress

PRESENTED BY Brian Koch ⎪ March 2014

§ Saved backup space 2/3 § Take more time to complete backup § Take less time if multiple compress-threads are used

Database Server Full backup size Compresse d size Full bkup time Compressed bkup time Compress- thread db1 1.3 TB 360 GB 220 mins 255 mins 1 db2 1.3 TB 360 GB 220 mins 180 mins 8

slide-17
SLIDE 17

Database Restore Comparison

PRESENTED BY Brian Koch ⎪ March 2014

DB Backup size Transfer backup to target server Decompress Apply log Total time Space saved Time saved

Full backup 1.3TB 210 mins 15 mins 225 mins Full compressed backup 360GB 75 mins 65 mins 15 mins 155 mins ~70% ~30%

slide-18
SLIDE 18

The Issues Could Happen

PRESENTED BY Brian Koch ⎪ March 2014

  • The database that has a lot of partitions could run into

Error number 24 means 'Too many open files’ Added “ulimit -n 65535“ in the backup script

  • Transaction logs could be small

xtrabackup: error: log block numbers mismatch: xtrabackup: error: expected log block no. 98032273, but got no. 102226569 from the log file. xtrabackup: error: it looks like InnoDB log has wrapped around before xtrabackup could process all records due to either log copying being too slow, or log files being too small. xtrabackup: Error: xtrabackup_copy_logfile() failed.

Increase transaction log size or take backup from off-peak time.

slide-19
SLIDE 19

CAVEATS

PRESENTED BY Brian Koch ⎪ March 2014

  • The backup folder should be dedicated.

If it contains other folders or other files, the incremental backup may not be right.

  • The backup purging time and backup time shall be different.
  • Information_schema is not backed up. So the table create time stamp could

be lost.

slide-20
SLIDE 20

CAVEATS --- continued

▪ event master : status is ENABLED slave: status is SLAVESIDE_DISABLED ▪ Master and slave are NOT same Taking backup from master to rebuild slave, event is enabled causes issue

slide-21
SLIDE 21

Create event cross all database servers

▪ Login to each server

set sql_log_bin=0; create event event_name …. BEGIN IF @@global.read_only=0 THEN call del_test_data(); end if; END

Ensures master and slave identical

slide-22
SLIDE 22

Backup and Recovery Strategy -- continue

slide-23
SLIDE 23

Take Database Backup

PRESENTED BY Brian Koch ⎪ March 2014

0 12 * * 1 root find /backup/dbbk -maxdepth 1 -type d -ctime +14 | xargs rm -rf 0 0 * * 0 root /home/y/bin/innobackupex_y 0 0 * * 1-6 root /home/y/bin/innobackupex_y --incremental

slide-24
SLIDE 24

Point-Time-Recovery

PRESENTED BY Brian Koch ⎪ March 2014

Assume there are compressed backups:

full backup: /mnt/backup/saca/2016-04-07_05-05-01 Incremental backup: /mnt/backup/saca/2016-04-08_05-00-00

  • 1. Transfer backups to the restore server
  • 2. Decompress the backups first

§ sudo innobackupex /home/syuan/backup/2016-04-07_05-05-01 --use-memory=32G --decompress --parallel=16 § sudo innobackupex /home/syuan/backup/2016-04-08_05-00-00 --use-memory=8G

  • -decompress --parallel=8
  • 3. Apply logs

§ sudo innobackupex --apply-log --redo-only /home/syuan/backup/2016-04-07_05-05-01 --use-memory=32G -- parallel=16 § sudo innobackupex --apply-log --redo-only /home/syuan/backup/2016-04-07_05-05-01 --incremental- dir=/home/syuan/backup/2016-04-08_05-00-00 --use-memory=8G

  • -parallel=8

§ sudo innobackupex --apply-log /home/syuan/backup/2016-04-07_05-05-01 --use-memory=32G --parallel=16

slide-25
SLIDE 25

Point-Time-Recovery - continued

PRESENTED BY Brian Koch ⎪ March 2014

  • 4. Move the backups into right directory

sudo innobackupex --move-back /home/syuan/backup/2016-04-07_05-05-01

  • 5. Start database
  • sudo chown -R mysql:mysql /home/y/var/mysql
  • yinst start mysql_server
  • 6. Update root password
  • 7. mysqlbinlog
  • stop-datetime
  • stop-position
slide-26
SLIDE 26

Slave Rebuild

PRESENTED BY Brian Koch ⎪ March 2014

Get the replication coordinates

  • If the backup is taken from the master, check the file xtrabackup_binlog_info in the backup folder
  • If the backup is from slave, please check the file xtrabackup_slave_info

Set up the replication

CHANGE MASTER TO MASTER_HOST='master-host-name.yahoo.com', MASTER_USER='replicant', MASTER_PASSWORD='pass-word', MASTER_LOG_FILE='mysqld-bin.004464', MASTER_LOG_POS=49402644; START SLAVE; SHOW SLAVE STATUS\G

slide-27
SLIDE 27

Slave Rebuild - continued

PRESENTED BY Brian Koch ⎪ March 2014

The binary logs that is needed have been purged.

  • 1. Transfer the needed binary logs to the restore server
  • 2. Use mysqlbinlog utility to apply them to the database.

mysqlbinlog --start-position=49402644 mysqld-bin.004464 mysqld-bin.004465 .......| mysql -u root –p

Caution: If data contains BLOB, please use binary-mode Using one db connection to process all binary logs to avoid temporary table issue.

slide-28
SLIDE 28

Alternative Way to Apply Binary Logs

PRESENTED BY Brian Koch ⎪ March 2014

  • Set up a fake master or binlog server
  • Copy all necessary binary logs to the fake master
  • Make sure the system recognize all these binary logs

Edit file names in mysqld-bin.index Restart MySQL

  • Set up replications between this master and restore server
  • Change master to

………

slide-29
SLIDE 29

Advantages

PRESENTED BY Brian Koch ⎪ March 2014

Much safer and robust.

  • Recover speed is much faster

20% faster comparing with using mysqlbinlog utility

  • Best way for partial replication database to be recovered

Add ignore, replicate-do-db in my.cnf, start replication

slide-30
SLIDE 30

Clone Replication Slave

PRESENTED BY Brian Koch ⎪ March 2014

  • Requirements for my.cnf

relay_log_info_repository = TABLE master_info_repository = TABLE relay_log_recovery = 1

  • Xtrabackup option slave-info

Take xtrabackup from the sever, restore it in target server Start slave; The replication is automatically enabled.

slide-31
SLIDE 31

Start Slave Issues

PRESENTED BY Brian Koch ⎪ March 2014

  • Start slave fails

Error log: Slave failed to initialize relay log info structure from the repository

Explanation: Relay log info is messed up.

reset slave all; change master to … start slave;

slide-32
SLIDE 32

Start Slave Issues - GTID

Enable Replication in Slave Issue

mysql> SET GLOBAL gtid_purged='ffee1ff8-363f-11e5-af47-9cb654954cac:1-29123533'; ERROR 1840 (HY000): @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.

How to fix

RESET MASTER; ▪ SET GLOBAL gtid_purged='ffee1ff8-363f-11e5-af47-9cb654954cac:1-29123533’; ▪ CHANGE MASTER TO MASTER_HOST="mastername", master_user='rep_user', master_password='rep_password', MASTER_AUTO_POSITION = 1; ▪ START SLAVE;

slide-33
SLIDE 33

Special backup case – MTS without GTID enabled

PRESENTED BY Brian Koch ⎪ March 2014

  • Can not add --slave-info in MTS without GTID.
  • How to get the consistent backup

1. Normal way: STOP SLAVE; START SLAVE UNTIL SQL_AFTER_MTS_GAP; 2. The alternative way: Backup the database as normal

Recovery:

  • 1. The table slave_worker_info records each worker detail, the required relay log from backup

server or corresponding mysql binary logs can be found from its master.

  • 2. START SLAVE UNTIL SQL_AFTER_MTS_GAPS; to fill all gaps,
  • 3. Then direct new slave to the existing master as usual.
slide-34
SLIDE 34

PITR– MTS without GTID enabled

PRESENTED BY Brian Koch ⎪ March 2014

Follow previous instructions to restore the database backup.

mysql> select ID,Relay_log_name,Relay_log_pos,Master_log_name,Master_log_pos from mysql.slave_worker_info; +----+------------------------------------------------+---------------+-------------------+----------------+ | ID | Relay_log_name | Relay_log_pos | Master_log_name | Master_log_pos | +----+------------------------------------------------+---------------+-------------------+----------------+ | 1 | /home/y/var/mysql/data/mysqld-relay-bin.000187 | 946879583 | mysqld-bin.000059 | 946879419 | | 2 | /home/y/var/mysql/data/mysqld-relay-bin.000187 | 947931963 | mysqld-bin.000059 | 947931799 | | 3 | /home/y/var/mysql/data/mysqld-relay-bin.000187 | 947515423 | mysqld-bin.000059 | 947515259 | +----+------------------------------------------------+---------------+-------------------+----------------+ 3 rows in set (0.00 sec)

Please copy mysqld-relay-bin.000187 from the backup slave to the restore server.

slide-35
SLIDE 35

PITR– MTS without GTID enabled --Continued

PRESENTED BY Brian Koch ⎪ March 2014

If the relay log file exists

1. Edit mysqld-relay-bin.index with this new relay log file. 2. Edit file permission chown mysql:mysql mysqld-relay-bin.000187

Example: [root@bf1-trfdb-001 data]# cat mysqld-relay-bin.index /home/y/var/mysql/data/mysqld-relay-bin.000001 /home/y/var/mysql/data/mysqld-relay-bin.000187

  • 3. Restart mysqld to make the system recognized the new relay log
slide-36
SLIDE 36

PITR– MTS without GTID enabled --Continued

PRESENTED BY Brian Koch ⎪ March 2014

Use mysql binary log which is from the master

1. Rename mysqld-bin.000059 to mysqld-relay-bin.000002 Edit mysqld-relay-bin.index with this new relay log file.

  • 2. Edit file permission chown mysql:mysql mysqld-relay-bin.000002

Example: [root@bf1-trfdb-001 data]# cat mysqld-relay-bin.index /home/y/var/mysql/data/mysqld-relay-bin.000001 /home/y/var/mysql/data/mysqld-relay-bin.000002

slide-37
SLIDE 37

PITR– MTS without GTID enabled --Continued

PRESENTED BY Brian Koch ⎪ March 2014

  • 3. update mysql.slave_worker_info

set relay_log_name= ‘…./mysqld-relay-bin.000002',relay_log_pos=master_log_pos;

mysql> select ID,Relay_log_name,Relay_log_pos,Master_log_name,Master_log_pos from mysql.slave_worker_info; +----+------------------------------------------------+---------------+-------------------+----------------+ | ID | Relay_log_name | Relay_log_pos | Master_log_name | Master_log_pos | +----+------------------------------------------------+---------------+-------------------+----------------+ | 1 | /home/y/var/mysql/data/mysqld-relay-bin.000002 | 946879419 | mysqld-bin.000059 | 946879419 | | 2 | /home/y/var/mysql/data/mysqld-relay-bin.000002 | 947931799 | mysqld-bin.000059 | 947931799 | | 3 | /home/y/var/mysql/data/mysqld-relay-bin.000002 | 947515259 | mysqld-bin.000059 | 947515259 | +----+------------------------------------------------+---------------+-------------------+----------------+

slide-38
SLIDE 38

PITR– MTS without GTID enabled --Continued

PRESENTED BY Brian Koch ⎪ March 2014

  • 4. Fill all holes.

START SLAVE UNTIL SQL_AFTER_MTS_GAPS;

mysql> select ID,Relay_log_name,Relay_log_pos,Master_log_name,Master_log_pos from mysql.slave_worker_info; +----+----------------+---------------+-----------------+----------------+ | ID | Relay_log_name | Relay_log_pos | Master_log_name | Master_log_pos | +----+----------------+---------------+-----------------+----------------+ | 1 | | 0 | | 0 | | 2 | | 0 | | 0 | | 3 | | 0 | | 0 | +----+----------------+---------------+-----------------+----------------+

3 rows in set (0.00 sec)

  • 5. Change master to ….; start slave;
slide-39
SLIDE 39

PITR– MTS without GTID enabled --Continued

PRESENTED BY Brian Koch ⎪ March 2014

TABLE mysql.slave_relay_log_info is NOT updated relay_log_recovery=1 starts auto recovery

Relay_Log_File: mysqld-relay-bin.000001 Relay_Log_Pos: 4

Replication is screwed!!

https://bugs.mysql.com/bug.php?id=81385

slide-40
SLIDE 40

MySQL Backup Monitoring

PRESENTED BY Brian Koch ⎪ March 2014

Email alert if backup fails Backup dashboard

slide-41
SLIDE 41

Multi-source Replication

PRESENTED BY Brian Koch ⎪ March 2014

  • Enable replication slave to receive transactions from multiple sources simultaneously
  • Merge table shards, data aggregation for reporting purpose
  • Back up one database server instead of of different multitenant clusters.
  • Automate database restore from prod to non-prod.

master1 master2 master3 slave Binlog channel 1 Binlog channel 2 Binlog channel 3 Binlogs

slide-42
SLIDE 42

Multi-source Replication

PRESENTED BY Brian Koch ⎪ March 2014

  • Pre-condition in target slave:

master-info-repository=TABLE relay-log-info-repository=TABLE slave is MySQL57

  • Replication topology can be (GTID) based or binary log position-based
  • Change master to master_host=‘master1’,… FOR CHANNEL 'master-1';
  • Start slave FOR CHANNEL 'master-1';

Stop slave FOR CHANNEL 'master-1'; Show slave status FOR CHANNEL 'master-1’\G; SELECT * FROM performance_schema.replication_connection_status\G

slide-43
SLIDE 43

Multi-source Replication -setup

PRESENTED BY Brian Koch ⎪ March 2014

  • Source: MySQL 5.6 Target: MySQL 5.7
  • Source: take xtrabackup using --slave-info in the master, transfer the backup to the slave

generate a file contains create database/table sql statement without data

  • Target: innobackupex --apply-log --export /home/backup

Load database structure sql script in the slave Generate “ALTER TABLE TABLENAME DISCARD|IMPORT TABLESPACE;” sql scripts Target: Run drop tablespace script Copy backups(*.ibd,*.exp, *.cfg) into right database folders Run import tablespace script Get the coordinates from xtrabackup_slave_info to setup replication

slide-44
SLIDE 44

Multi-source Replication -setup

PRESENTED BY Brian Koch ⎪ March 2014

Target slave: Reset master; SET GLOBAL gtid_purged=…… change master to master_host=’master1', master_port=3306, master_user='rep', master_password='pass' , MASTER_AUTO_POSITION = 1 for channel 'ch1_master1’; start slave for channel 'ch1_master1';

slide-45
SLIDE 45

Multi-source Replication - Issues

PRESENTED BY Brian Koch ⎪ March 2014

Pre-check: Any temporal time columns Any antelope tables Rebuild all those tables Issue mysql> alter table application IMPORT TABLESPACE; ERROR 1808 (HY000): Schema mismatch (Table flags don't match, server table has 0x6 and the meta-data file has 0x1) Drop table and recreate table with ROW_FORMAT=compact Copy backup and import this tablespace

slide-46
SLIDE 46

Multi-source Replication - restrictions

PRESENTED BY Brian Koch ⎪ March 2014

1. All tables should have the newer format for timestamp columns 2. innodb_file_per_table should be used 3. All files should be in Barracuda format 4. Database name has to be different crossing multiple masters

slide-47
SLIDE 47

Summary

▪ Percona Xtrabackup Options ▪ Issues and Cautions ▪ Point-time recovery using different ways ▪ Backup/Recovery for MTS without GTID ▪ Database backup monitoring ▪ Multi-Source Replication Setup

slide-48
SLIDE 48

QA