MySQL Online Schema Changes at Uber and Tango Ben Black and David - - PowerPoint PPT Presentation

mysql online schema changes at uber and tango
SMART_READER_LITE
LIVE PREVIEW

MySQL Online Schema Changes at Uber and Tango Ben Black and David - - PowerPoint PPT Presentation

MySQL Online Schema Changes at Uber and Tango Ben Black and David Turner Who are we? Ben Black - DBA Tango, talks to himself thinking it is expert advice, herded cats at PalominoDB, tracked users at Garmin, grumbles at databases from his


slide-1
SLIDE 1

MySQL Online Schema Changes at Uber and Tango

Ben Black and David Turner

slide-2
SLIDE 2

Who are we?

  • Ben Black - DBA Tango, talks to himself thinking it is

expert advice, herded cats at PalominoDB, tracked users at Garmin, grumbles at databases from his couch in Kansas.

  • David Turner - Automates databases to have more

time to sneak out of the office and go abalone diving.

slide-3
SLIDE 3

About Tango.me

  • Tango is a free mobile messaging app with more than 350 million

registered members who enjoy free video and voice calls, texting, social discovery, browsing and sharing content with one another. As a Top 10 social app on Android and Top 20 on iTunes, Tango is available in 16 languages in more than 224 countries.

slide-4
SLIDE 4

About Uber

Uber Technologies, Inc. provides a smartphone application that connects drivers with people who need a ride. The company's application enables users to arrange and schedule transportation and/or logistics services with third party providers. Uber Technologies serves customers in North, Central, and South Americas, as well as Europe, the Middle East, Africa, and the Asia Pacific.

slide-5
SLIDE 5

Topics covered

  • Schema changes and Defrags: including

best practices

  • MySQL online DDL
  • Percona online schema change
  • gh-ost
slide-6
SLIDE 6

Best Practices

  • Backups
  • Benchmark
  • Schema/Migration Repository
  • Deployment Strategy
  • Monitor

○ Proactive

  • Verify consistency of data
slide-7
SLIDE 7

Backups

  • You will make a mistake
  • Odds increase with number of hosts
  • Automate recovery and confirm. At least run

quarterly recoveries.

  • Actually test restores
  • Verify data
slide-8
SLIDE 8

Benchmarking

  • Why should you benchmark?
slide-9
SLIDE 9

Benchmark

  • Knowing what to expect

○ Disk used ○ Time to completion ○ Processlist, files, etc

  • Is it better to create sk after loading table?

○ Fast index creation (note: drop and add)

  • Graphs for comparison and impact
  • Fun!
slide-10
SLIDE 10

Schema Repository

  • Inconsistency can break promotions

○ missing index ○ missing column

  • Automate schema consistency verification
  • Automate data consistency checks
slide-11
SLIDE 11

Deployment Strategy

  • Wrapper

○ Enforce process ■ Naming conventions ■ SQL_LOG_BIN = 0 ■ RBR and slave_type_conversions(no signage..) ■ Performance gains with server variables (restart db) ■ rename before drop ■ Grouping alters together. ○ Logging ○ Confirm slave updated (lag.., exclusive locks)

slide-12
SLIDE 12

Monitor

  • Progress

○ Masters or slaves may block on ddl ○ Impact on applications (query latency) ○ Show processlist ○ Show engine innodb status ○ select * from information_schema.global_temporary_tables ○ ls *tablename* ○ Show table status like ‘%<table_name>%’;

  • Errant transactions on gtid enabled clusters
slide-13
SLIDE 13

Verify data

  • comprehensive slave scans
  • pt-table-checksum
  • Spot checking
slide-14
SLIDE 14

Old school schema changes and defrags

  • Alter tables and reclaim space
  • Blocking alters
  • Customer workarounds
slide-15
SLIDE 15

Blocking alters

my_table writes my_table_new selects

slide-16
SLIDE 16

Problems with blocking alters

  • Unable to write to the table
  • The larger the table the longer the writes are

blocked.

  • Can only see the progress if file_per_table is

enabled and using ls -alh.

slide-17
SLIDE 17

Customer workarounds

master slave slave alter table

  • Slave Lag
  • Less efficient
slide-18
SLIDE 18

Additional customer workarounds

  • Openark
  • Facebook
  • Percona
  • gh-ost
slide-19
SLIDE 19

MySQL fast index creation

  • Innodb 5.1 Plugin or MySQL 5.5+
  • No more rebuilds for secondary indexes
  • Drop and add indexes faster
  • Faster table loads(add sk last)
  • Better secondary indexes*
slide-20
SLIDE 20

MySQL 5.6+ Innodb online DDL

  • More in place operations
  • Online alters for operations
  • Slaves lag for length of operations
  • No throttle option
slide-21
SLIDE 21

MySQL Online DDL

my_table_new log my_table

innodb_online_alter_log_max_size lockwait_timeout

writes selects Exclusive lock

slide-22
SLIDE 22

Why MySQL Online DDL

  • Int to bigint
  • Alters performed on slaves

○ Zero tolerance for data loss ■ Recoveries can suck ■ Table checksums ○ RBR and triggers ○ Less io than some other tools ○ Avoid additional load on master ■ no throttling parameters ○ All IOD operations via replication block the sql_thread on the slaves ○ 5.6.17 defrags and algorithm=inplace

slide-23
SLIDE 23

Alter online lock modes

  • Exclusive

○ alter table my_tbl engine=innodb, lock=exclusive;

  • Shared

○ alter table my_tbl engine=innodb, lock=shared;

  • Default

○ Know your stuff and test

  • None

○ alter table my_tbl add index sk1(pid), lock=none;

slide-24
SLIDE 24

Alter online algorithms

  • Inplace

○ alter table my_tbl add column, algorithm=inplace, lock=none;

  • Copy

○ alter table my_tbl engine=innodb, algorithm=copy, lock=shared; ○ all algorithm=copy allow only shared or exclusive lock.

slide-25
SLIDE 25

Operations

  • show process list contains only sleeping
  • connections. Am I clear for alters?

○ performance_schema

slide-26
SLIDE 26

More examples

# Defragging a table requires a copy and does not allow concurrent dml prior to 5.6.17. alter table ${table} engine=innodb, algorithm=copy, lock=shared; # Index ops alter table ${table} add index sk1 (zone), algorithm=inplace, lock=none; alter table ${table} drop index sk1, algorithm=inplace, lock=none; # Column ops alter table ${table} add column c3 int, add column c1 int, add column c4 int, algorithm=inplace, lock=none; alter table ${table} drop column c4,algorithm=inplace, lock=none; alter table ${table} modify c1 int default 0, algorithm=inplace, lock=none; alter table ${table} change c3 c2 int, algorithm=inplace, lock=none; alter table ${table} modify column c2 int after c1, algorithm=inplace, lock=none; alter table ${table} modify column c2 int default null, algorithm=inplace, lock=none; alter table ${table} modify c2 bigint, algorithm=copy, lock=shared; alter table ${table} drop column c1, drop column c2 , algorithm=inplace, lock=none;

slide-27
SLIDE 27

Innodb online DDL: additional info

  • Foreign keys

○ Create fk with foreign_key_checks=0 ○ Child table restrictions ■ Alter on child blocks concurrent dml ■ Alter on child could be blocked by transaction on the parent* ■ Alter on completion on parent could be blocked by dml on same parent if it causes dml on child.

  • Partitioning
  • Auto increment - alter instantaneous, add requires lock=shared
  • A table of info:

http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html

  • Full text indexes still require shared lock
slide-28
SLIDE 28

Innodb online DDL variables

  • innodb_online_alter_log_max_size
  • lock_wait_timeout
  • innodb_sort_buffer_size
  • others(benchmark)
slide-29
SLIDE 29

Innodb Crash Recovery

  • Secondary indexes
  • Clustered indexes(don’t crash!)
slide-30
SLIDE 30

Summary

  • Take Backups
  • Test recovery
  • Benchmark
slide-31
SLIDE 31

You want to do what to that 300G table?

  • App down for Maintenance/blocking DDL
  • Disable writes/read-only for altered table
  • Alter slave(s) and swap masters
  • MySQL Online DDL and serialized

replication/lag

slide-32
SLIDE 32

OSC Tools or how I learned to love the alter...

  • copy original table structure to new table
  • alter new table
  • create triggers to copy dml from original

table to new table -OR- stream binary log events

  • copy data in original table to new table in

chunks

  • swap table names and drop original table
slide-33
SLIDE 33

table id PK _table_new id PK new_col 1) copy table 2) alter _table_new 3) create triggers 4) copy data in chunks 5) swap tables/drop orig DML DML triggers (upd,del,ins) Copy chunks

slide-34
SLIDE 34

pt-online-schema-change

Percona toolkit is your friend. Open source and you can see the tables, sql, triggers and what it is doing. Pay attention to version you are using!!! FK issues, log_bin with 2.0, RTFM

slide-35
SLIDE 35

Whoomp! There it is!

time ./pt-online-schema-change --dry-run --user=bblack

  • -ask-pass --max-lag=2 --check-interval=1 --progress=time,10
  • -max-load Threads_running:50 --critical-load

Threads_running:6000 --alter "DROP COLUMN col4"

  • -host=127.0.0.1 D=my_db,t=my_table
  • -alter-foreign-keys-method=drop_swap --recursion-method

dsn=D=percona,t=dsns --no-check-replication-filters --set-vars innodb_lock_wait_timeout=10 --chunk-time=.2 screen - in case you lose connection time - test when possible for timing

slide-36
SLIDE 36

It's all ball bearings nowadays

  • -progress time,10 (default 30 seconds)
  • -max-lag and --check-interval (both default 1s)
  • -recursion-method (how to find slaves)

show processlist show hosts dsn table (great for ignoring some replicas)

slide-37
SLIDE 37

Other params

  • -chunk-time=.2 (default .5)
  • -max-load and --critical-load (Threads_running=25,50)
  • -alter-foreign-keys-method=drop_swap
  • -no-check-replication-filters
  • -no-drop-old-table (for RDS or EXT)
slide-38
SLIDE 38

What could possibly go wrong?

  • PK/UK required
  • FK names will change on altered table
  • FK's reference table being altered
  • -alter-foreign-keys-method drop_swap

sets foreign_key_checks=0 drops original table (hardlink!!! - EXT) renames new table

  • NOTIFICATION EMAILS (don’t kick off on Friday and go home)
slide-39
SLIDE 39

What could possibly go wrong?

  • non xfs huge table drops (create the hardlink!)

○ Can even cause innodb to crash mysqld

  • PKs with gaps greatly affect time estimates
  • Largest table to alter vs free disk space
  • Disk space (2x free needed for RBR)
  • Global mutexes (table drops can take several seconds)
  • table metadata locks (triggers)
  • No triggers allowed on the table to alter
slide-40
SLIDE 40

NOOOOOO!!

  • Running PT-OSC against a slave with RBR

Replication started erroring after pt-osc

why???

  • And how about syncing a table using pt-osc

with RBR?

slide-41
SLIDE 41

Trust but verify…

show create table altered_table\G show create table _altered_table_new\G show table status like ‘%altered_table%’; ls -alh /data/schema/*altered_table*.ibd

slide-42
SLIDE 42

Set it and forget it???

  • watch

○ w ○ df -h ○ ls -alh ibd files ○ mysql -e"show processlist;"|egrep -v "Sleep|repl" ○ slave lag

  • How is it affecting the application? external

monitoring/app response time

slide-43
SLIDE 43

I think I’m paranoid

watch -n 1 'w|head -1;echo "";mysql -e"show engine innodb status\G"|grep -i history;echo "";df -h|grep local;echo "";mysql -e"show processlist;"|wc -l;echo "";mysql -e"show processlist;"|grep -v Sleep|grep -v repl'

slide-44
SLIDE 44

Watch yo replicas

[01:11 root@host ~]$ cat /home/bblack/chech_repl.sh echo "master001" `mysql -hmaster001 -e"show slave status\G"|grep Seconds` echo "slave001" `mysql -hslave001 -e"show slave status\G"|grep Seconds` [01:14 bblack@host ~]$ watch ./chech_repl.sh master001 Seconds_Behind_Master: 0 slave001 Seconds_Behind_Master: 0 Slaves on non-3306? Use dsns table with entries for <hostname>:<port>

slide-45
SLIDE 45

Are we there yet?

pt-online-schema-change prints progress to std err mysql> show table status like ‘%table_name%’; ls -alh /data/schema/*table_name*.ibd mysql> select max(id) from table_name; mysql> show full processlist; -- to find id of current chunk

slide-46
SLIDE 46

Sneaky little hobbitses. Wicked, tricksy

To do a noop and optimize the table

  • -alter "engine=INNODB"

100-200MB cutoff for blocking alters Overhead for adding indexes Never do pt-osc for dropping indexes

slide-47
SLIDE 47

$ time ./pt-online-schema-change --execute ... --alter "ADD ..." D=my_schema,t=my_table ... No slaves found. See --recursion-method if host xxxxxx has slaves. Not checking slave lag because no slaves were found and --check-slave-lag was not specified. Operation, tries, wait: copy_rows, 10, 0.25 create_triggers, 10, 1 drop_triggers, 10, 1 swap_tables, 10, 1 update_foreign_keys, 10, 1 Altering `my_schema`.`my_table`... Creating new table... Created new table my_schema._my_table_new OK. Altering new table... Altered `my_schema`.`_my_table_new` OK. 2017-04-07T02:18:19 Creating triggers... 2017-04-07T02:18:19 Created triggers OK.

slide-48
SLIDE 48

2017-04-07T02:18:19 Copying approximately 13445478 rows... Copying `my_schema`.`my_table`: 0% 39:00 remain Copying `my_schema`.`my_table`: 0% 34:28 remain Copying `my_schema`.`my_table`: 1% 34:56 remain Copying `my_schema`.`my_table`: 1% 35:09 remain <snip> Copying `my_schema`.`my_table`: 98% 00:27 remain Copying `my_schema`.`my_table`: 99% 00:14 remain Copying `my_schema`.`my_table`: 99% 00:01 remain 2017-04-07T02:47:38 Copied rows OK. 2017-04-07T02:47:38 Swapping tables... 2017-04-07T02:47:38 Swapped original and new tables OK. Not dropping old table because --no-drop-old-table was specified. 2017-04-07T02:47:38 Dropping triggers... 2017-04-07T02:47:38 Dropped triggers OK. Successfully altered `my_schema`.`my_table`. real 29m20.292s

slide-49
SLIDE 49

gh-ost

https://github.com/github/gh-ost

  • Triggerless online schema change tool
  • True throttling - No triggers - No added load
  • Dynamic control of throttling, etc
  • Controlled cutover to new table
  • Requires at least one slave to use RBR
slide-50
SLIDE 50

gh-ost

  • Now works with RDS
  • No Active-Active multi-master
  • Designed for multiple concurrent alters
slide-51
SLIDE 51

gh-ost (diagram from github)

slide-52
SLIDE 52

Links

○ Schema migrations ■ https://sqlalchemy-migrate.readthedocs.io/en/latest/ - sqlalchemy migrations ■ https://www.percona.com/live/17/sessions/automatic-mysql-sche ma-management-skeema - skeema (check it out) ○ https://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overvie w.html - MySQL online DDL ○ https://www.percona.com/doc/percona-toolkit/2.1/pt-online-schema-ch ange.html - PT online schema change ○ https://github.com/github/gh-ost/blob/master/README.md - gh-ost repo

slide-53
SLIDE 53

Thank you

  • Uber - see us at our booth

○ dturner@uber.com

  • Tango

○ bblack@tango.me