Percona Live Europe 2016: Launching Vitess (Anthony Yeh, Dan Rogart)

SLIDE 1

Percona Live Europe 2016 Launching Vitess

Anthony Yeh, Dan Rogart Amsterdam, Netherlands | October 3 – 5, 2016

SLIDE 2

Overview

http://vitess.io

SLIDE 3


Why Vitess?

[Diagram: "Their App" talks to its own sharding magic in front of MySQL; YouTube's app and "Your App" talk to Vitess, whose sharding magic sits in front of MySQL]

SLIDE 4


Why not Vitess?

Vitess is...

  • an opinionated cluster
    ○ Many ways to scale; this is one.
    ○ More on those opinions next.
  • a powerful tool
    ○ Huge problems get easier.
    ○ Simple things get more complex.

Vitess is not...

  • a proxy
    ○ Understands the query.
    ○ Generates queries of its own.
  • plug-and-play
    ○ ... yet.
    ○ This talk is about the gaps.

SLIDE 5

Launching Vitess

http://vitess.io/user-guide/launching.html

SLIDE 6

Scalability Philosophy

SLIDE 7


Horizontal Scaling

Small Instances

  • Many instances per host
  • Faster replication, backup/restore
  • Less contention, outages isolated

Self-Healing, Automation

  • Health checks
  • Ops work should be O(1)

Cluster Orchestration

  • Containers isolate ports, files, compute
  • Scheduling for resilience
  • Improves HW utilization
SLIDE 8


Durability and Consistency

Durability through replication

  • Disk is not durable
    ○ sync_binlog off
  • Data must be on multiple machines
    ○ semisync
    ○ lossless failover
    ○ routine reparent

Sharded consistency model

  • Single-shard transactions
    ○ Same guarantees as MySQL
  • Cross-shard transactions
    ○ May fail partially across shards
    ○ Work in progress on 2PC
  • Cross-shard reads
    ○ Even with 2PC, may read from shards in different states
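The cross-shard read caveat can be made concrete with a toy simulation (illustrative Python, not Vitess code): a transaction spanning two shards commits on each shard separately, so a concurrent cross-shard read can observe one shard's new state and the other shard's old state.

```python
# Toy illustration of the cross-shard read anomaly. Two dicts stand in
# for two shards; a "transfer" commits on shard A, then on shard B.

shard_a = {"alice": 100}
shard_b = {"bob": 100}

def read_total():
    """Cross-shard read: one fetch per shard, then sum."""
    return shard_a["alice"] + shard_b["bob"]

def transfer(amount):
    """Cross-shard 'transaction': debit shard A, credit shard B."""
    shard_a["alice"] -= amount   # commit on shard A lands first...
    snapshot = read_total()      # ...a reader sneaks in between commits
    shard_b["bob"] += amount     # commit on shard B lands second
    return snapshot

seen = transfer(30)
print(seen)          # 170: the in-between total, not 200
print(read_total())  # 200: consistent again once both shards committed
```

The reader saw a total that never "existed" logically; 2PC fixes partial failure but not this read-skew, which is why the slide calls it out separately.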

SLIDE 9


Globally Distributed

Multi-Cell Deployment

  • Cell = Zone | Availability Zone
    ○ Possible shared fate within cell
    ○ But failures shouldn't propagate
  • Multi-Region
    ○ Survive fiber cuts, regional outages
    ○ Lower regional read latency
  • Single-Master
    ○ Writes redirected at frontend
    ○ Only one inter-cell roundtrip
    ○ DB writes intra-cell

Cluster Metadata ("Topology")

  • Distributed, consistent, highly available key-value store
    ○ e.g. etcd, ZooKeeper
  • Global Topology Store
    ○ Quorum across multiple cells
    ○ Survives any given cell death
  • Local Topology Store
    ○ Quorum within a single cell
    ○ Independent of any other cell

SLIDE 10

Production Planning

SLIDE 11


Testing

Integration Tests

  • Run app tests against Vitess
    ○ Use real schema
    ○ Test sharding
  • py/vttest
    ○ Small footprint to run on 1 machine
    ○ Emulate a full cluster for tests
    ○ Loads schema from .sql files
    ○ 1 vtcombo = all Vitess servers
    ○ 1 mysqld = all shards

Query Compatibility

  • Bind Variables
    ○ Client-side prepared statements
    ○ Vitess query plan cache
  • Tablet Types
    ○ master: writes, read-after-write
    ○ replica: live site read traffic
    ○ rdonly: batch jobs, backups
  • Query Support
    ○ Vitess SQL parser is incomplete
    ○ Report important use cases
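Bind variables matter because a plan cache is keyed by the statement with its literals pulled out, so queries that differ only in values share one entry. A minimal sketch of that idea (regex-based toy, not the real Vitess parser):

```python
import re

# Toy normalizer: replace numeric and quoted string literals with :vN
# placeholders so a plan cache keys on the statement shape, not values.

def normalize(sql):
    bind_vars = {}
    def repl(match):
        name = "v%d" % (len(bind_vars) + 1)
        bind_vars[name] = match.group(0)
        return ":" + name
    normalized = re.sub(r"'[^']*'|\b\d+\b", repl, sql)
    return normalized, bind_vars

plan_cache = {}
for q in ("SELECT * FROM users WHERE id = 1",
          "SELECT * FROM users WHERE id = 42"):
    key, binds = normalize(q)
    plan_cache.setdefault(key, "plan")  # both queries map to one entry

print(len(plan_cache))  # 1
```

Client-side prepared statements give the same effect for free: the driver ships the placeholder form, so the server never sees a literal-specific statement.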

SLIDE 12


Replication

Binary Logging

  • Enabled everywhere (slaves too)
  • Statement-based
    ○ Rewrite to PK lookups
  • GTID required
  • Used for master management, resharding, update stream, schema swap, etc.

Side Effects

  • Triggers
  • Stored procedures
  • Foreign key constraints
  • These can break resharding
SLIDE 13


Monitoring

Status URLs (vtgate, vttablet, etc.)

  • /debug/status
  • /debug/vars

○ Prometheus, InfluxDB

  • /healthz
  • /queryz
  • /schemaz

Coming soon...

○ Realtime fleet-wide health map
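/debug/vars serves JSON counters (Go expvar style), which is what Prometheus/InfluxDB scrape. A runnable sketch of consuming two polls and turning monotonic counters into rates; in production you would GET http://host:port/debug/vars, and the variable names below are illustrative rather than an exact vttablet schema:

```python
import json

# Two stubbed polls of /debug/vars, 60 seconds apart. Counters are
# cumulative, so rates come from the delta between polls.

poll_1 = json.loads('{"Queries": 1000, "Errors": 3}')
poll_2 = json.loads('{"Queries": 1600, "Errors": 4}')
interval_seconds = 60

qps = (poll_2["Queries"] - poll_1["Queries"]) / interval_seconds
error_rate = (poll_2["Errors"] - poll_1["Errors"]) / interval_seconds

print(qps)  # 10.0
print(error_rate)
```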

SLIDE 14


Backups

Built-in Backups

  • Part of cloning, schema swap
    ○ Restores every day
  • Storage Plugins
    ○ Filesystem (NFS, etc.)
    ○ Google Cloud Storage
    ○ Amazon S3
    ○ Ceph
  • Needs to be triggered periodically
SLIDE 15

Migration Strategies

Tribute

SLIDE 16


Migration

New Workloads

  • Getting Started + Launch Guide

Offline Migration

  • Import data to Vitess

Online Migration

  • Run Vitess above existing MySQL
  • Previously Unsharded
  • Already Sharded
    ○ Custom Vindex
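A custom vindex is how an already-sharded app keeps its existing row-to-shard mapping: it exposes its legacy "which shard owns this key" function to Vitess as a mapping from sharding key to keyspace_id. Real vindexes are Go code inside Vitess; this is a conceptual Python sketch, and `legacy_keyspace_id` is a hypothetical stand-in for the app's existing hash:

```python
import hashlib

# Conceptual custom-vindex sketch: map a sharding key to a keyspace_id
# using the app's pre-existing scheme, then route by keyspace_id range.

def legacy_keyspace_id(user_id):
    # Example legacy scheme: first 8 bytes of md5 of the key.
    return hashlib.md5(str(user_id).encode()).digest()[:8]

def shard_for(user_id):
    # With two shards named '-80' and '80-', routing is a range check
    # on the keyspace_id (lexicographic byte comparison).
    return "-80" if legacy_keyspace_id(user_id) < b"\x80" else "80-"

for uid in (1, 2, 3):
    print(uid, shard_for(uid))
```

The point is that resharding and routing stay correct without rewriting keys, because Vitess only ever asks the vindex for the keyspace_id.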

SLIDE 17

YouTube Production

Dan Rogart, YouTube SRE

SLIDE 18


Run Vitess the SRE Way!

  • Cattle, not pets
  • Systemic failure is more important than individual failure
  • Failure is constant
  • Automate responses to failure when appropriate
  • Or detect and alert a human if required
  • The atomic unit is a mysql instance - for durability, availability, replacement

SLIDE 19


"If I have seen further than others, it is by standing upon the shoulders of giants" -- Isaac Newton

  • s/seen/scaled/
  • Vitess runs on MySQL...
  • MySQL runs on Borg (Google's container cloud)...
  • Borg runs on Google datacenters and networks...
  • Each level is supported by amazing teams and we rely heavily upon their work

SLIDE 20


Vitess runs on MySQL on Borg

  • YouTube/Vitess did not fully migrate into Borg until 2013
  • So, it's actually a pretty good example of how a Vitess integration with an existing MySQL stack went (pretty well, so far)
  • MoB had a lot of mature tools that Vitess leveraged:
    ○ Backups
    ○ Failover
    ○ Schema Management
SLIDE 21


Decider

[Diagram: one shard with a master and several replica mysqld+vttablet pairs (including batch replicas), fronted by vtgates, alongside vtctld; the decider monitors the mysqld instances]

SLIDE 22


Decider... (vastly simplified):

  • Polls all mysql instances every n seconds
  • If the old master is unhealthy, it elects a new master from the replica pool
  • It re-masters all the other replicas to properly replicate from the new master
  • Is the reason TabletExternallyReparented exists in Vitess
  • Total failover times for YouTube Vitess are around 5 seconds
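The loop above can be sketched with mock objects (this is illustrative Python, not YouTube's implementation):

```python
# Vastly simplified decider: poll health, elect a new master if the old
# one is unhealthy, and re-point every other replica at it.

class Mysql:
    def __init__(self, name):
        self.name, self.healthy, self.master = name, True, None

def decide(instances, current_master):
    if current_master.healthy:
        return current_master          # nothing to do this poll
    # Elect the first healthy replica as the new master...
    candidates = [i for i in instances
                  if i.healthy and i is not current_master]
    new_master = candidates[0]
    new_master.master = None
    # ...and re-master all the other instances to replicate from it.
    for i in instances:
        if i is not new_master:
            i.master = new_master
    return new_master

a, b, c = Mysql("a"), Mysql("b"), Mysql("c")
a.healthy = False                      # old master fails a health poll
master = decide([a, b, c], a)
print(master.name)                     # b
```

After the external tool does this, it tells Vitess what happened, which is exactly the role of TabletExternallyReparented.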
SLIDE 23


Schema Management (small changes)

  • Autoschema
  • A "small" change is basically an ALTER against a table with < 2M rows
  • When executed on a replica, it won't block the replication stream
  • Defined paths in source control are monitored
  • When a peer-reviewed file containing SQL is submitted...
  • ...autoschema will validate the change and apply it to all masters in a cluster
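The validation gate can be sketched as follows. The real checks are internal to YouTube, so the row-count source and the 2M threshold here are illustrative assumptions taken from the slide:

```python
import re

# Hypothetical autoschema-style validation: accept only ALTER TABLE
# statements, and only against tables under the small-change threshold.

ROW_COUNTS = {"users": 1_500_000, "videos": 900_000_000}  # illustrative
SMALL_CHANGE_MAX_ROWS = 2_000_000

def validate(sql):
    match = re.match(r"\s*ALTER\s+TABLE\s+(\w+)\b", sql, re.IGNORECASE)
    if not match:
        return False, "only ALTER TABLE statements are auto-applied"
    table = match.group(1)
    if ROW_COUNTS.get(table, 0) >= SMALL_CHANGE_MAX_ROWS:
        return False, "table too big for autoschema; needs a pivot"
    return True, "ok"

print(validate("ALTER TABLE users ADD COLUMN bio TEXT"))
print(validate("ALTER TABLE videos ADD COLUMN tag VARCHAR(64)"))
```

Changes that fail the size check fall through to the pivot process on the next slide.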

SLIDE 24


Schema Management (big changes)

  • Pivot
  • A "big" change is basically an ALTER that will block traffic for too long on the master or block replication too long when executed on a slave
  • Defined paths in source control are monitored
  • When a peer-reviewed file containing SQL is submitted...
  • ...an SRE will start a pivot
  • The ALTER is applied to a single replica and a seed backup is taken
  • All other replicas are restarted such that they restore from the backup that contains the change
  • Finally, the master is done last: a replica with the change is promoted
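The pivot sequence can be summarized as pseudocode-style Python (mock step names, not the real tooling):

```python
# Pivot orchestration sketch: alter one replica, seed a backup from it,
# restore everyone else from that backup, and swap the master last.

def pivot(replicas, master, alter):
    steps = []
    seed = replicas[0]
    steps.append(f"apply '{alter}' on {seed}")         # 1. alter one replica
    steps.append(f"take seed backup from {seed}")      # 2. backup has the change
    for r in replicas[1:]:
        steps.append(f"restore {r} from seed backup")  # 3. others restore
    steps.append(f"promote {seed}; demote {master}")   # 4. master is done last
    return steps

steps = pivot(["r1", "r2", "r3"], "m", "ALTER TABLE big ADD c INT")
for step in steps:
    print(step)
```

Because every replica picks up the change by restoring a backup rather than replaying the ALTER, the master never blocks on it.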

SLIDE 25


Schema Management

  • Autoschema changes take minutes
  • Pivots take days
  • At YouTube, all schema changes must be forwards and backwards compatible with code. Enforced with extensive automated tests.
  • Sometimes dangerous: a common example is removing a column using a pivot. This can break replication, so we have to block access.
  • Sometimes confusing for our developers: they shouldn't really care about how a change happens
  • Open source pivot is coming.
SLIDE 26


Resharding Automation

  • Online copy of data performed n times
  • Final offline copy of data to sync to a gtid
  • Filtered replication
  • Traffic redirect
  • ???
  • Profit!
SLIDE 27


Resharding Automation (online copy)

[Diagram: an unsharded source keyspace and target shards 0 and 1, each with a master and replica mysqld+vttablet pairs; a vtworker copies data between source and target]

  • Replication running
  • Read chunks from source
  • Read chunks from target
  • Reconcile and write diff to target
  • Adaptive throttle
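The reconcile step can be sketched like this (illustrative Python, not vtworker itself): for one primary-key chunk, compare source and target rows and write only the difference.

```python
# Chunk reconciliation sketch: both inputs are {pk: row} dicts covering
# the same PK range. Output is the minimal set of writes for the target.

def reconcile_chunk(source_rows, target_rows):
    inserts = {pk: r for pk, r in source_rows.items()
               if pk not in target_rows}
    updates = {pk: r for pk, r in source_rows.items()
               if pk in target_rows and target_rows[pk] != r}
    deletes = [pk for pk in target_rows if pk not in source_rows]
    return inserts, updates, deletes

source = {1: "a", 2: "b", 3: "c"}
target = {1: "a", 2: "stale", 4: "orphan"}
ins, upd, dels = reconcile_chunk(source, target)
print(ins)   # {3: 'c'}
print(upd)   # {2: 'b'}
print(dels)  # [4]
```

Writing only the diff is what makes repeated online passes cheap: each pass converges the target while the adaptive throttle bounds the write load.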
SLIDE 28


Resharding Automation (offline copy)

[Diagram: same topology as the online copy, with the vtworker copying from the unsharded source to target shards 0 and 1]

  • Replication stopped
  • Read chunks from source
  • Read chunks from target
  • Reconcile and write diff to target
  • Adaptive throttle
SLIDE 29


Resharding Automation (filtered repl)

[Diagram: unsharded source keyspace and target shards 0 and 1; no vtworker in this phase, as the target masters pull binlogs from source replicas directly]

  • Target master tablets connect to a source replica
  • Parse binlogs and apply statements that belong in that shard
  • GTID is stored and replicated on target to survive restarts
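The "belongs in that shard" filter can be sketched as follows (toy Python, not Vitess's binlog parser; `keyspace_id_of` is a hypothetical hash-based mapping standing in for the keyspace's vindex):

```python
import hashlib

# Filtered replication sketch: each target shard applies only the binlog
# events whose rows' keyspace_ids fall in its range.

def keyspace_id_of(pk):
    return hashlib.md5(str(pk).encode()).digest()[:8]

def belongs_to_shard(pk, lo, hi):
    """Shard covers keyspace_ids in [lo, hi); None means open-ended."""
    kid = keyspace_id_of(pk)
    return (lo is None or kid >= lo) and (hi is None or kid < hi)

# Shard '-80' covers [start, 0x80); shard '80-' covers [0x80, end).
events = [("INSERT", 1), ("INSERT", 2), ("DELETE", 3)]
for_minus_80 = [e for e in events if belongs_to_shard(e[1], None, b"\x80")]
for_80_dash = [e for e in events if belongs_to_shard(e[1], b"\x80", None)]
print(len(for_minus_80) + len(for_80_dash))  # 3: each event lands on exactly one shard
```

Because the shard ranges partition the keyspace_id space, every statement is applied by exactly one target shard, which is what keeps the copy consistent.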

SLIDE 30


Resharding Automation (redirection)

  • Finally, application traffic is redirected:
  • vtctl-prod MigrateServedTypes keyspace_name/0 replica
    ○ (sends replica traffic from unsharded to sharded)
  • vtctl-prod MigrateServedTypes keyspace_name/0 master
    ○ (master cutover, point of no return)
  • < 5s of downtime during master cutover (faster than a normal decider failover, since only the Vitess layer is touched)

SLIDE 31


Regression Testing

  • We use the Yahoo Cloud Serving Benchmark (YCSB)
  • Allows for comparison of Vitess to other storage solutions using the same workloads
  • A daily Vitess/YCSB sandbox is run to measure qps per core and latency
  • Deviations from previous results (positive or negative) are noted and investigated
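The deviation check can be sketched like this (illustrative Python; the 5% threshold is an assumption, not the slide's number):

```python
# Flag any metric that moved more than THRESHOLD in either direction
# relative to the previous daily run.

THRESHOLD = 0.05

def deviations(previous, current):
    flagged = {}
    for metric, old in previous.items():
        change = (current[metric] - old) / old
        if abs(change) > THRESHOLD:
            flagged[metric] = change
    return flagged

previous = {"qps_per_core": 2000.0, "p99_latency_ms": 12.0}
current = {"qps_per_core": 2300.0, "p99_latency_ms": 12.1}

print(deviations(previous, current))
```

Note that positive swings are flagged too: an unexplained improvement often means the benchmark changed, not the system.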

SLIDE 32


Rate My Session!