! GitHub's online schema migrations for MySQL Tom Krouper, Shlomi - - PowerPoint PPT Presentation

github s online schema migrations for mysql tom krouper
SMART_READER_LITE
LIVE PREVIEW

! GitHub's online schema migrations for MySQL Tom Krouper, Shlomi - - PowerPoint PPT Presentation

! GitHub's online schema migrations for MySQL Tom Krouper, Shlomi Noach GitHub Illustrated with ghosts ! How people build so fu ware 1 GitHub ! The worlds largest Octocat T-shirt and stickers store And water bo tu les And


slide-1
SLIDE 1

How people build sofuware

!

GitHub's online schema migrations for MySQL

Tom Krouper, Shlomi Noach GitHub

1

!

Illustrated with ghosts

slide-2
SLIDE 2

How people build sofuware

!

2

!

  • The world’s largest Octocat T-shirt and stickers store
  • And water botules
  • And hoodies
  • We also do stuff related to things

GitHub

slide-3
SLIDE 3

How people build sofuware

!

3

!

  • gh-ost is GitHub’s MySQL schema migration tool
  • GitHub Online Schema Transmogrifier/Transfigurator/Transfer/Thingy
  • Developed by the @github/database-infrastructure
  • Used in production daily
  • Open source, github.com/github/gh-ost

But, what is this all about?

gh-ost

slide-4
SLIDE 4

How people build sofuware

!

4

!

  • No support for foreign keys (partially possible to add)
  • No support for triggers (possible to add)
  • RBR required on at least one server. FULL binlog image required (for now)
  • No support for 5.7 generated columns (possible to add)
  • Multisource replication not supported
  • Active-active master-master replication not supported (possible to add)
  • Bugs: gh-ost owns far more logic and data transfer, therefore needs to get

its hands dirty with timezones, characters sets, etc.. See Issues. Complete listing in github.com/github/gh-ost/blob/master/doc/ requirements-and-limitations.md, github.com/github/gh-ost/issues

Known limitations: let’s get this out of our way

slide-5
SLIDE 5

How people build sofuware

!

5

!

  • GitHub stores repositories in git, and uses MySQL as the backend database

for all related metadata:

  • Repository metadata, users, issues, pull requests, comments etc.
  • Our MySQL servers must be available, responsive and in good state:
  • Write throughput expected to be high
  • Write latency expected to be low
  • Replica lag expected to be low

MySQL

slide-6
SLIDE 6

How people build sofuware

!

6

!

  • MySQL schema migration is a known problem
  • Addressed by schema migration tools since 2009. Most common are:
  • pt-online-schema-change by Percona
  • fc-osc by Facebook
  • GitHub develops rapidly. Engineers require changes to MySQL tables daily,

and these changes should take place quickly

  • Migrations must not block development
  • Migrations must not impact availability

Migrations

slide-7
SLIDE 7

How people build sofuware

!

7

!

  • We’ve been using pt-online-schema-change for years
  • As we grew in volume and traffic, we hit more and more problems
  • Some migrations cause such high load that writes were stalled and GitHub

performance degraded

  • Others would cause consistent replication lags
  • Some tables could only be migrated off-peak
  • Some tables could only be migrated during weekend
  • We would atuend to running migrations
  • Some tables could not be migrated
  • In 2016, we suffered outages due to migrations on our busiest tables
  • We had a list of “risky” migrations

GitHub migration pains

slide-8
SLIDE 8

How people build sofuware

! !

Synchronous triggers based migration

8

!

" "

  • riginal table

ghost table

#

insert delete update insert delete update

pt-online-schema-change

  • ak-online-alter-table

LHM

slide-9
SLIDE 9

How people build sofuware

! !

9

!

" "

  • riginal table

ghost table

#

insert delete update inserts

"

changelog table

Asynchronous triggers based migration

fc-osc

slide-10
SLIDE 10

How people build sofuware

!

10

!

  • Stored routines
  • Interpreted, not compiled. Latency to each transaction
  • Locks
  • Transaction space competes for multiple, uncoordinated locks
  • Metadata locks
  • Unsuspendible
  • Even as throtuling is required, triggers must continue to work
  • Concurrent migrations
  • Trust issues
  • No reliable testing
  • Either cannot test in production, or test does not get actual write workload

What’s wrong with triggers?

slide-11
SLIDE 11

How people build sofuware

!

11

!

  • gh-ost connects as replica and pulls binary log entries (RBR format)
  • Interprets related DML (INSERT, UPDATE, DELETE) entries and transforms them

to meet refactored table structure

  • Applies on ghost table
  • gh-ost connects to master and iterates rows
  • One chunk afuer the other, copies rows from the original table to the ghost table
  • Much like existing tools, but more on this later
  • maintains a “changelog” table for internal lightweight bookkeeping

Binlog based design

slide-12
SLIDE 12

How people build sofuware

! !

12

!

" "

  • riginal table

ghost table

#

insert delete update no triggers

$

binary log

Triggerless, binlog based migration

slide-13
SLIDE 13

How people build sofuware

!

!

13

!

" "

#

$

!

" "

master replica

Binlog based migration, utilize replica

slide-14
SLIDE 14

How people build sofuware

!

14

!

  • Binary logs can be read from anywhere
  • gh-ost prefers connecting to a replica, offloading work from master
  • gh-ost controls the entire data flow
  • It can truly throtule, suspending all writes on the migrated server
  • gh-ost writes are decoupled from the master workload
  • Write concurrency on master turns irrelevant
  • gh-ost’s design is to issue all writes sequentially
  • Completely avoiding locking contention
  • Migrated server only sees a single connection issuing writes
  • Migration algorithm simplified

Binlog based design implications

slide-15
SLIDE 15

How people build sofuware

!

15

!

! !

$

" "

master replica binary log

  • riginal

table ghost table gh-ost migration:

  • creates ghost table on migrated server
  • alters ghost table
  • hooks up as a MySQL replica, streams binary log events
  • interchangeably:
  • applies events on ghost table
  • copies rows from original table onto ghost table
  • cut-over

Preferred setup:

  • connects to replica
  • inspects table structure, table dimensions on replica
  • hooks as replica onto replica
  • apply all changes on master
  • writes internal & heartbeat events onto master, 


expects them on replica " "

gh-ost design

slide-16
SLIDE 16

How people build sofuware

!

16

!

!

$

!

$

!

$

!

$

!

$

!

$ $

! !

$

!

$

  • a. connect to replica
  • b. connect to master
  • c. migrate/test on replica

gh-ost operation modes

slide-17
SLIDE 17

How people build sofuware

!

Trust

What makes gh-ost, a newcomer tool, trusted with

  • ur data?

As trusted as - or more trusted than - existing solution?

17

slide-18
SLIDE 18

How people build sofuware

!

18

!

  • Other than unit tests and integration tests, gh-ost supports testing in

production

  • You will execute a gh-ost migration on a replica
  • gh-ost will execute as normal, but applying changes on replica
  • Just before cut-over it stops replication
  • Execution ends with both original and ghost tables in place, replication stopped
  • At your leisure, you can compare/checksum the two tables
  • We have dedicated servers that continuously test our entire production

table set

  • Each table is migrated on replica via “trivial” (no schema change) migration
  • Tables data checksummed and expected to be identical

Testing

slide-19
SLIDE 19

How people build sofuware

!

19

!

! ! !

$

Testing in production

!

$

!

$

! ! !

production replicas testing replicas master

slide-20
SLIDE 20

How people build sofuware

!

20

!

  • There are no triggers. gh-ost can completely throtule the operation when it

chooses to.

  • Throtuling based on multiple criteria:
  • Master metrics thresholds (e.g. Threads_running)
  • Replication lag
  • Arbitrary query
  • Flag file
  • Use command
  • Trust: you could choose, at any time and effective immediately, to throtule

gh-ost’s operation and resume normal master workload.

  • And you may resume operation once satisfied

Throtuling

slide-21
SLIDE 21

How people build sofuware

!

21

!

  • The final migration step: replacing the original table with the ghost table,

incurs a brief table lock

  • This metadata-locks-involved step is a critical point for the migration
  • During brief lock time, number of connections may escalate
  • People tend to stick around during this phase.
  • People actually plan ahead migration start time based on the estimated

completion time, so they can guarantee to be around

  • gh-ost offers postponed cut-over (optional, configurable)
  • As cut-over is ready, gh-ost just keeps synching the tables via binlog events
  • Requires an explicit command/hint to cut-over
  • Trust: I can safely go to bed

Cut-over

slide-22
SLIDE 22

How people build sofuware

!

22

!

  • gh-ost will invoke your hooks at points of interest
  • If you like, do your own cleanup, collecting, auditing, chatuing.
  • Hooks available for:
  • startup, validated, row-copy about to begin, routinely status, about to cut-over,

stop-replication, success, failure

  • gh-ost will populate environment variables for your process
  • htups://github.com/github/gh-ost/blob/master/doc/hooks.md
  • Trust: integrate with your infrastructure

Hooks

slide-23
SLIDE 23

How people build sofuware

!

23

!

  • gh-ost supports niceness
  • Explicitly forcing it to periodic sleep based on nice-ratio
  • Trust: one can reduce gh-ost’s load at any time

nice

slide-24
SLIDE 24

How people build sofuware

!

24

!

  • gh-ost monitors replication lag in subsecond-resolution
  • For control-replicas, it requires a query that is known to return subsecond lag.
  • At GitHub replication lag is normally kept subsecond
  • We don’t like it when we see 5 second lag
  • We really don’t like it when we see 10 second lag
  • 20 second lag typically leads to investigation
  • We are able to migrate our busiest tables, during rush hour, and keep

replication lag below 300ms

  • Trust: migrations will do whatever it takes to keep replicas up-to-date

Subsecond replication lag

slide-25
SLIDE 25

How people build sofuware

!

throtuling in production

25

!

no migration migration updated
 max-lag-millis=200 migration begins
 max-lag-millis=500

Our production replication lag, before and during migration on one of our busiest tables


CEST tz

slide-26
SLIDE 26

How people build sofuware

!

26

!

  • With existing tools, you run your migration tool based on some

configuration.

  • If configuration does not match your workload, you kill the migration and

start a new one with more relaxed/aggressive config

  • gh-ost listens on Unix socket file and/or TCP
  • You can connect to a running migration and ask:
  • status
  • max-lag-millis=500
  • throtule
  • cut-over
  • Trust: you can always get a reliable status or reconfigure as you see fit

Dynamic visibility & control

slide-27
SLIDE 27

How people build sofuware

!

27

!

  • We work from/with ChatOps
  • Are slowly and incrementally integrating gh-ost into our flow and ChatOps
  • We control migrations via chat:
  • .migration sup
  • .migration max-lag-millis 300
  • .migration cut-over
  • Migrations ping us in chat to let us know their status; or if they’re ready to

cut-over

  • Migrations are accessible to everyone, not just DBAs

gh-ost @ GitHub

slide-28
SLIDE 28

How people build sofuware

!

gh-ost chatops @ GitHub

28

!

  • We control gh-ost via chatops
  • And gh-ost chats to us
  • The chat is a changelog visible to all. It tells us what happened when,

and who did what.

slide-29
SLIDE 29

How people build sofuware

!

The future

We want to make gh-ost robust We want it to be widely adopted We have a few ideas on where it can go

29

slide-30
SLIDE 30

How people build sofuware

! !

30

!

"

  • riginal table

ghost table

#

insert delete update binary log row copy "

$

row-copy still couples tables

slide-31
SLIDE 31

How people build sofuware

! !

31

!

" "

  • riginal table

ghost table

#

insert delete update

$

binary log read rows no data flow 
 between tables write events,
 write rows

decoupling row-copy

slide-32
SLIDE 32

How people build sofuware

!

! !

32

!

  • riginal, ghost tables
  • riginal, ghost tables

#

insert delete update

$

binary log write events,
 write rows

master replica

" " " "

decoupled row-copy, utilizing replica

read rows

slide-33
SLIDE 33

How people build sofuware

! !

! !

33

!

  • riginal table
  • riginal table

#

insert delete update

$

binary log

master replica

" "

ghost table

"

some unrelated server
 in a far galaxy

write events,
 write rows

remote, live table migration

read rows

slide-34
SLIDE 34

How people build sofuware

!

!

!

34

!

  • riginal table

#

insert delete update

$

master

"

ghost table

"

some unrelated server
 in a far galaxy

remote, live table migration

slide-35
SLIDE 35

How people build sofuware

!

!

35

!

" "

#

$

Resurrection

  • gh-ost bails out on meeting critical-load
  • Or someone kills it
  • Resurrect operation!
  • Resume rowcopy from same place
  • Resume binlog apply from same position
  • Both are idempotent, accurate sync is not necessary
slide-36
SLIDE 36

How people build sofuware

!

Open source

How gh-ost is developed and what we envision

36

slide-37
SLIDE 37

How people build sofuware

!

37

!

  • gh-ost is released under the MIT license
  • We encourage collaboration
  • Issues
  • Bugs
  • Questions
  • Feature requests
  • Sharing experience
  • Pull requests
  • Code
  • Documentation
  • We suggest some work for the community, that is not on our immediate

roadmap

Open Source

slide-38
SLIDE 38

How people build sofuware

!

Usage

Best read the docs.

38

!

slide-39
SLIDE 39

How people build sofuware

!

39

!

gh-ost

  • -user="gh-ost"
  • -password="123456"
  • -host=replica.with.rbr.com
  • -database="my_schema"
  • -table="my_table"
  • -verbose
  • -alter="engine=innodb"
  • -max-load=Threads_running=25
  • -critical-load=Threads_running=1000
  • -chunk-size=1000
  • -throttle-control-replicas="myreplica.1.com,myreplica.2.com"

[continued next slide]

Execution sample

slide-40
SLIDE 40

How people build sofuware

!

40

!

[continued]

  • -heartbeat-interval-millis=100
  • -replication-lag-query="select unix_timestamp(now(6)) -

unix_timestamp(ts) from meta.heartbeat order by ts desc limit 1"

  • -max-lag-millis=500
  • -switch-to-rbr
  • -exact-rowcount
  • -concurrent-rowcount
  • -panic-flag-file=/tmp/ghost.panic.flag
  • -postpone-cut-over-flag-file=/tmp/ghost.postpone.flag
  • -hooks-path=/path/to/hooks/
  • -hooks-hint="@$(whoami)"

[--execute]

Execution sample

slide-41
SLIDE 41

How people build sofuware

!

Thank you!

Questions?

41

!

github.com/tomkrouper @CaptainEyesight
 
 github.com/shlomi-noach @ShlomiNoach
 
 /cc
 github.com/ggunson @shebang_the_cat
 
 github.com/jonahberquist @hashtagjonah