Automatic MySQL Schema Management with Skeema Evan Elias Percona - - PowerPoint PPT Presentation

automatic mysql schema management with skeema
SMART_READER_LITE
LIVE PREVIEW

Automatic MySQL Schema Management with Skeema Evan Elias Percona - - PowerPoint PPT Presentation

Automatic MySQL Schema Management with Skeema Evan Elias Percona Live, April 2017 What is Schema Management? Organize table schemas in a repo Execution of all DDL, on the correct MySQL instances, with the correct OSC flags Not


slide-1
SLIDE 1

Automatic MySQL Schema Management with Skeema

Evan Elias Percona Live, April 2017

slide-2
SLIDE 2

What is Schema Management?

  • Organize table schemas in a repo
  • Execution of all DDL, on the correct MySQL instances, with the

correct OSC flags

  • Not to be confused with specific OSC tools! Schema

management is a separate concept / higher layer

2

slide-3
SLIDE 3

Prior Art

3

slide-4
SLIDE 4

Introducing Skeema

  • CLI tool for schema management
  • Designed specifically for MySQL / InnoDB
  • Free and open source (Apache 2.0 license)


http://github.com/skeema/skeema

  • Written in Go

4

slide-5
SLIDE 5

Location in the MySQL toolchain

Skeema’s functionality combines/replaces several types of tools:

5

Schema dumper

+ +

Schema diff-er DDL / OSC executor

slide-6
SLIDE 6

DESIGN PRINCIPLES

slide-7
SLIDE 7

Manage schemas like code

  • Top-level directory for each pool/cluster
  • Subdirectory for each database schema
  • One file for each table, containing CREATE TABLE statement
  • Config files may be placed in any dir / subdir, and they “stack”

7

slide-8
SLIDE 8

Declarative schema management

  • Filesystem defines what is the ideal state of schemas
  • Tool automatically figures out how to reach this state on a DB

No need to ever create migrations or DDL by hand!

8

slide-9
SLIDE 9

None of this…

9

slide-10
SLIDE 10
  • r this…

10

slide-11
SLIDE 11

11

DEFINITELY not this…

XML?!? NOPE NOPE NOPE

slide-12
SLIDE 12

Repo is CREATE TABLE all the way down

12

git diff skeema diff

slide-13
SLIDE 13

Very very agnostic

  • OSC tool
  • Service discovery
  • Sharding scheme
  • SCM tool
  • Application language

13

slide-14
SLIDE 14

Environments

  • Configure multiple environments (dev, test, stage, prod; any

arbitrary name)

  • Also support patterns such as “every engineer has their own

dev DB on localhost”

  • Ability to migrate one environment at a time
  • Visibility into the state of an environment via diff

14

slide-15
SLIDE 15

Flexible configuration

Config powerful enough for diverse use-cases:

  • Ability to override settings at multiple levels:

environment, cluster, or schema

  • Options for conditionally controlling behavior based on

table size

15

slide-16
SLIDE 16

Configuration examples

“In production, use pt-osc when altering tables over 1GB, or use MySQL 5.6 built-in online DDL for anything smaller”

16

[production] alter-wrapper="/usr/local/bin/pt-online-schema-change --execute --alter {CLAUSES} D={SCHEMA},t={TABLE},h={HOST},P={PORT},u={USER},p={PASSWORDX}" alter-wrapper-min-size=1g alter-algorithm=inplace alter-lock=none

slide-17
SLIDE 17

Configuration examples

“In dev, always just use standard ALTERs”

17

[development] skip-alter-wrapper skip-alter-algorithm skip-alter-lock

slide-18
SLIDE 18

Configuration examples

“In everything but dev, apply extra safety checks to destructive DDL, but only for tables with at least one row. In dev, always allow destructive DDL without any extra safety checks.

18

safe-below-size=1 [dev] allow-unsafe

slide-19
SLIDE 19

Configuration examples

“For a specific schema, apply different flags to
 pt-osc”

19

In /var/schemas/.skeema:

alter-wrapper="/usr/local/bin/pt-online-schema-change …”

In /var/schemas/some-cluster/some-schema/.skeema:

alter-wrapper="/usr/local/bin/pt-online-schema-change —extra-flag …”

slide-20
SLIDE 20

Configuration examples

“When interacting with any schema on a particular cluster, increase innodb_lock_wait_timeout and use different sql_mode”

20

In /var/schemas/special-cluster/.skeema:

connect-options=“innodb_lock_wait_timeout=60,sql_mode=
 ‘STRICT_ALL_TABLES,ALLOW_INVALID_DATES'”

slide-21
SLIDE 21

USAGE

slide-22
SLIDE 22

Dumping schemas to filesystem

22

Schema dumper

+ +

Schema diff-er DDL / OSC executor

Command: skeema init [environment]

slide-23
SLIDE 23

skeema init

  • Point it at a MySQL instance
  • For each database schema on the instance, creates a subdir
  • For each table in each schema, creates one .sql file,

containing a CREATE TABLE statement

  • In each dir, a .skeema option file is created, storing the host/

port (at top level) or schema name (in each database subdir)

23

slide-24
SLIDE 24
  • -h host
  • -P port
  • -S socket # if host=localhost
  • -u user
  • -ppassword
  • -p # prompt for password

24

skeema init: options

Same option-handling as the standard MySQL client! If ~/.my.cnf exists, automatically parsed for user and password. Default output dirname based on host/port; override with --dir

slide-25
SLIDE 25

25

skeema init

Example above assumes user and password obtained from ~/.my.cnf Port is non-standard due to use of Docker

slide-26
SLIDE 26

26

skeema init

slide-27
SLIDE 27

27

skeema init

slide-28
SLIDE 28

Updating files from out-of-band changes

28

Schema dumper

+ +

Schema diff-er DDL / OSC executor

Command: skeema pull [environment]

slide-29
SLIDE 29

29

skeema pull

slide-30
SLIDE 30

Linting schemas

30

Schema dumper

+ +

Schema diff-er DDL / OSC executor

Command: skeema lint

slide-31
SLIDE 31

skeema lint

  • Identifies invalid SQL
  • Normalizes format of table files to match MySQL’s SHOW

CREATE TABLE

  • Future versions may also warn on redundant indexes, lack of

PK, etc

  • Exit code 0=no changes, 1=reformatted something, 2+=error

31

slide-32
SLIDE 32

skeema lint

32

Original CREATE TABLE, with any arbitrary formatting

slide-33
SLIDE 33

skeema lint

33

slide-34
SLIDE 34

skeema lint

34

After reformatting

slide-35
SLIDE 35

skeema lint

35

Another example: catching typos / invalid SQL

slide-36
SLIDE 36

skeema lint

36

slide-37
SLIDE 37

Command: skeema diff [environment]

Diff-ing schemas

37

Schema dumper

+ +

Schema diff-er DDL / OSC executor

slide-38
SLIDE 38

skeema diff

  • Compares state of database(s) to that of filesystem,

generating DDL as output

  • Generated DDL, if executed, would cause database(s) to match

the filesystem

  • Usable as a “dry run” before executing schema changes

38

slide-39
SLIDE 39

skeema diff

39

Starting with the above table as an example

slide-40
SLIDE 40

skeema diff

40

Let’s say we want to add a new col and index. Just edit the file.

slide-41
SLIDE 41

skeema diff

41

skeema diff shows, but does not run, the corresponding DDL

slide-42
SLIDE 42

skeema diff: use in pipelines

  • STDOUT is SQL, STDERR is human log messages
  • Exit code 0=no diffs, 1=some diffs, 2+=error
  • Use --quick option to only output instances with at least one
  • diff. (Good for finding shards where a schema change didn’t

complete!)

  • If an external OSC tool is configured, the output will include

the full command-line

42

slide-43
SLIDE 43

Command: skeema push [environment]

Executing DDL

43

Schema dumper

+ +

Schema diff-er DDL / OSC executor

slide-44
SLIDE 44

skeema push

  • Exact same behavior as skeema diff, but actually executes

the DDL!

  • Recommended workflow: run skeema diff first to preview

changes, then skeema push to execute them

  • Can double-check by running skeema diff again after; should

show no remaining differences

44

slide-45
SLIDE 45

skeema push

45

slide-46
SLIDE 46

Destructive actions and safety checks

46

slide-47
SLIDE 47

ADVANCED CONFIG

slide-48
SLIDE 48

Shell-outs and variables

  • Several options configure external command-lines
  • These support use of variable placeholders
  • Conn variables: {HOST}, {PORT}, {USER}, etc
  • DDL-specific vars: {TYPE}, {CLAUSES}, {SIZE}, etc
  • Automatic escaping of quotes in variable values

48

slide-49
SLIDE 49

Online schema change

  • alter-wrapper: external command for ALTERs
  • ddl-wrapper: external command for ALTER, CREATE, DROP
  • alter-algorithm and alter-lock: force use of MySQL 5.6 online

DDL (if not using an external tool)

  • alter-wrapper-min-size: Only apply alter-wrapper to tables

that are at least this size, in bytes

49

slide-50
SLIDE 50

Online schema change example

“In production, use pt-osc when altering tables over 1GB, or use MySQL 5.6 built-in online DDL for anything smaller”

50

[production] alter-wrapper="/usr/local/bin/pt-online-schema-change --execute --alter {CLAUSES} D={SCHEMA},t={TABLE},h={HOST},P={PORT},u={USER},p={PASSWORDX}" alter-wrapper-min-size=1g alter-algorithm=inplace alter-lock=none

slide-51
SLIDE 51

Online schema change example

51

slide-52
SLIDE 52

Service discovery and sharding

  • Ordinarily, host option specifies a hostname or IP address (or

comma-separated list of them)

  • For service discovery, host-wrapper option allows host to

represent a cluster name instead

  • host-wrapper defines a command-line to shell out to.

Command’s STDOUT will be parsed and interpreted as one or more hosts (or host:port) to use

52

slide-53
SLIDE 53

Service discovery example

The executed script should be capable of doing lookups (e.g. in ZooKeeper, etcd, or Consul) such as "return the master of pool foo" or "return all shard masters for sharded pool xyz"

53

host-wrapper=/some/script.sh /mysql/{ENVIRONMENT}/{HOST}

slide-54
SLIDE 54

Sharding behavior

  • For a sharded topology, ensure that host-wrapper returns all

shard masters

  • diff and push default to applying to all shards, one at a time
  • first-only option to just run against first returned shard
  • concurrent-instances option to run against multiple at once
  • pull always runs against first shard only

54

slide-55
SLIDE 55

WORKFLOWS

slide-56
SLIDE 56

Local dev DB per engineer

  • Engineers run skeema on dev box when interacting with dev DB
  • Configure dev environment in .skeema file to use localhost

56

[development] host=localhost socket=/path/to/mysql.sock allow-unsafe skip-alter-wrapper skip-alter-algorithm skip-alter-lock

slide-57
SLIDE 57

Dev: pulling in others’ changes

  • git pull # bring local schema repo 


# up-to-date

  • skeema diff development # preview what changes need 


# to be applied to local dev

  • skeema push development # apply the changes to local dev

57

slide-58
SLIDE 58

Dev: Using Rails / Django / etc

  • Make changes to model definitions and/or migrations as usual
  • rails migrate # or however you prefer to apply 


# changes to development

  • skeema pull development # update the schema repo 


# to reflect those changes

  • git commit …

58

slide-59
SLIDE 59

Prod: pull request workflow

  • 1. Update table files in a branch, using preferred dev workflow
  • 2. Open a pull request
  • 3. Review change as normal, merge to master when ready
  • 4. Check out master, run git pull
  • 5. Run skeema diff to preview DDL
  • 6. Run skeema push to execute DDL

59

slide-60
SLIDE 60

UNDER THE HOOD

slide-61
SLIDE 61

How it works

  • Performs test DDL operations in a temporary database

(_skeema_tmp by default, but configurable)

  • Introspection on information_schema, not parsing DDL
  • All auto-generated DDL automatically sanity-tested

61

slide-62
SLIDE 62

Additional Golang modules

  • Go La Tengo: MySQL automation and schema introspection
  • mybase: MySQL-like option parsing
  • Both are open source, see http://github.com/skeema

62

slide-63
SLIDE 63

Future directions

  • More statefulness
  • Support for table and column renames
  • Agent mode, with GitHub API integration
  • Easier gh-ost and fb-osc integration
  • Missing features: FKs, partitioning, compression, other storage

engines, SSL, views, triggers, stored procs, fulltext, spatial

63

slide-64
SLIDE 64

Q + A