Not Your Grandpas Replication The New Wave of MySQL Replication and - - PowerPoint PPT Presentation

not your grandpa s replication
SMART_READER_LITE
LIVE PREVIEW

Not Your Grandpas Replication The New Wave of MySQL Replication and - - PowerPoint PPT Presentation

Not Your Grandpas Replication The New Wave of MySQL Replication and How It Helps Your Applications Robert Hodges - Continuent, Inc. Jay Pipes - Rackspace, Inc. MySQL MySQL Conference 2010 Conference 2010 Agenda Agenda / About Us /


slide-1
SLIDE 1

MySQL MySQL Conference 2010 Conference 2010

Not Your Grandpa’s Replication

The New Wave of MySQL Replication and How It Helps Your Applications

Robert Hodges - Continuent, Inc. Jay Pipes - Rackspace, Inc.

slide-2
SLIDE 2

MySQL Conference 2010 MySQL Conference 2010

Agenda Agenda

/ About Us / Replication Problems New and Old / Old and New Replication Contenders / Questions

slide-3
SLIDE 3

MySQL Conference 2010 MySQL Conference 2010

About Us About Us

/ / Jay Pipes -- Jay Pipes -- Drizzle code monkey and man of Drizzle code monkey and man of Rackspace Rackspace

  • Drizzle replication designer and

Drizzle replication designer and chief implementer chief implementer

/ / Robert Hodges -- Robert Hodges -- Tungsten chief propeller-head (and Tungsten chief propeller-head (and CTO of Continuent) CTO of Continuent)

  • Tungsten Replicator for MySQL & PostgreSQL, backups,

Tungsten Replicator for MySQL & PostgreSQL, backups, distributed management, etc. distributed management, etc.

/ / Continuent: Cross-platform database clustering and Continuent: Cross-platform database clustering and replication replication / / Rackspace Rackspace: Hosting, Fanatical Support, etc. : Hosting, Fanatical Support, etc.

slide-4
SLIDE 4

MySQL Conference 2010 MySQL Conference 2010

In Days of Old Life Was Simple

slide-5
SLIDE 5

MySQL Conference 2010 MySQL Conference 2010

MySQL Replication Addressed Problems

/ Switch to new database after crash / Scale website performance on read-only copies / Perform schema upgrades and system maintenance with minimal downtime / Keep appliance and embedded DBMS available / Allow updates across sites / But times have changed!

slide-6
SLIDE 6

MySQL Conference 2010 MySQL Conference 2010

Replication Meets Industrial Data Farms

slide-7
SLIDE 7

MySQL Conference 2010 MySQL Conference 2010

New Replication Challenges Emerge

/ Big data -- Too big to back up or move

  • Intrusion detection systems generate burst updates of 100K/sec

/ Multi-tenant applications

  • SaaS / ISP want to backup/restore/migrate/manage tenants

/ New hardware - Multi-core, large memory, SSD

  • Sites like craigslist.org want multiple cores to reduce slave latency

/ Complex topologies

  • Market automation apps shard data across dozens of servers with

complex data flows

/ Scalable operation across sites

  • Merchant systems and on-line testing update multiple locations

MySQL replication does not handle any/all of these problems especially well

slide-8
SLIDE 8

MySQL Conference 2010 MySQL Conference 2010

And Some Problems Never Go Away

Educa&on
is
required.
People
don't want
to
hear
this.
But
from
my experience
a
lot
of
problems
are caused
by
SQL
app
developers.

Mark
Callaghan

slide-9
SLIDE 9

MySQL Conference 2010 MySQL Conference 2010

Replication Technology Review

slide-10
SLIDE 10

MySQL Conference 2010 MySQL Conference 2010

Replicate Statements or Rows?

/ SQL updates can be represented in two different ways / Statements -- What the client said / Row updates -- What the client actually did

Statement Replication

Replicate changes as SQL statements

Row Replication

Replicate changes other than DDL as row updates

DDL, only way some DBMS can log canges/replicate Flexible, fewer weird exceptions

slide-11
SLIDE 11

MySQL Conference 2010 MySQL Conference 2010

Physical vs. Logical Replication

/ Databases can update either at disk or logical level, hence two replication approaches / Log records -- Databases apply them automatically during recovery / SQL statements -- Clients send SQL to make changes

Physical Replication

Replicate log records/events to create bit-for-bit copy

Logical Replication

Replicate SQL to create equivalent data

Transparent, high performance, hard to cross architectures and versions, may limit slaves Flexible, fewer/different restrictions, allow schema differences, can manage upgrade

slide-12
SLIDE 12

MySQL Conference 2010 MySQL Conference 2010

Asynchronous vs. Synchronous

/ Replicating is like buying a car--there are lots of ways to pay for it / $0 down - Pay later; hope nothing goes wrong / Down payment - Pay some so less goes wrong later / Cash - Pay up front and it’s yours forever Asynchronous Replication Commit now, replicate later Semi-Synchronous Replication Replicate to at least

  • ne other node

Synchronous Replication Replicate fully to all other node

Lose data but robust against network failure Trade-off data loss vs. partition handling Network fails --> you fail

slide-13
SLIDE 13

MySQL Conference 2010 MySQL Conference 2010

Multi-Master or Master/Slave or…?

OK, now it gets confusing! Should I… / Update one database and let it serialize all changes? / Update any database with global update ordering? / Update any database and replicate without global

  • rdering?

Master/Slave Single master serializes and replicates Multi-Master Multiple masters with global serialization Master-Master Multiple masters with no global serialization

Fast serialization, SPOF, no split brain Good scaling but really hard to implement Convenient for WAN but hard for applications

slide-14
SLIDE 14

MySQL Conference 2010 MySQL Conference 2010

Current Contenders

slide-15
SLIDE 15

MySQL Conference 2010 MySQL Conference 2010

MySQL Native Replication: The Default

/ High-performance, built-in replication used by just about everyone / Key Characteristics

  • Logical - Replicates statements and/or rows
  • Asynchronous - Applications do not wait
  • Log-based - Based on MySQL binlog with a

variety of options/tricks / Fastest and most mature replication for MySQL

slide-16
SLIDE 16

MySQL Conference 2010 MySQL Conference 2010

MySQL Replication Architecture Master Master Slave Slave binlogs binlogs relay relay logs logs

Dump Dump thread thread I/O I/O thread thread SQL SQL thread thread :3306 :3306

slide-17
SLIDE 17

MySQL Conference 2010 MySQL Conference 2010

MySQL Master Master Replication

MySQL MySQL DBMS DBMS

Application

MySQL MySQL DBMS DBMS Binlog Binlog events events

Virtual IP / Handles maintenance very well (painless resync, application upgrades, cross architecture/version) / Tools like Flipper, MMM, and Heartbeat support it very well

Binlog Binlog events events

slide-18
SLIDE 18

MySQL Conference 2010 MySQL Conference 2010

MySQL Replication Features

/ It replicates *everything* / Very mature and fast enough for most uses / Row-based replication added in 5.1

  • Removes corner cases

/ Features for many use cases:

  • Relay logs
  • replicate-ignore-db/replicate-do-db/etc.
  • Black hole replication
  • Bi-directional replication

/ Lots of tool support: Maatkit, MMM, Heartbeat, mysqlbinlog

slide-19
SLIDE 19

MySQL Conference 2010 MySQL Conference 2010

Development Still Advancing

/ MySQL 5.5

  • Semi-synchronous replication
  • Slave fsync tuning
  • Automatic relay log recovery
  • Replication heartbeats
  • SHOW RELAY LOGS command

/ Plus regular bug fixes (397 since 2009 UC according to Lars) / Plus MariaDB is getting into the act!

  • (We’ll have more news in the next talk)
slide-20
SLIDE 20

MySQL Conference 2010 MySQL Conference 2010

/ Data protection still weak

  • No checksums on data
  • 2PC issues between log and stores
  • No global transaction IDs

/ Difficult to manage as topologies scale / Broken slaves a common problem / Fully pluggable interfaces still a long way off

MySQL Replication: What’s Not to Like?

slide-21
SLIDE 21

MySQL Conference 2010 MySQL Conference 2010

Tungsten: Complete Master/Slave Clusters

/ Build complete data services using copies of MySQL databases / Think of Tungsten as a data service appliance / Key Characteristics

  • Logical - Replicates statements and/or rows
  • Asynchronous - Applications do not wait
  • Log-based - Reads MySQL binlogs directly or

via client protocol to master

/ Features for SaaS, ISP and large enterprises

slide-22
SLIDE 22

MySQL Conference 2010 MySQL Conference 2010

Tungsten Data Tungsten Data Services Services

Apache/Mod_PHP

DBMS DBMS

Replicator Manager

DBMS DBMS

Replicator Manager

DBMS DBMS

Replicator Manager

libmysqlclient libmysqlclient.a .a

Apache/Mod_PHP

libmysqlclient libmysqlclient.a .a

Connector Manager

Master Master Slave Slave Slave Slave

Connector Manager

slide-23
SLIDE 23

MySQL Conference 2010 MySQL Conference 2010

Tungsten Replication Pipelines

binlogs binlogs THL THL Slave Slave DBMS DBMS Applier Applier Extractor Extractor Filters Filters Applier Applier Extractor Extractor Filters Filters

Stage Stage Stage Stage Pipeline Pipeline Tungsten Replicator Process Tungsten Replicator Process

Transaction Transaction History Log History Log

slide-24
SLIDE 24

MySQL Conference 2010 MySQL Conference 2010

Tungsten Features

/ Unaltered MySQL 5.0/5.1 databases / Very flexible pipelines and extensions / Global transaction IDs, crash-safe slaves, heartbeats, consistency checks, checksums / Autonomic failover and management / Seamless failover/app scaling / Rapid new feature additions

slide-25
SLIDE 25

MySQL Conference 2010 MySQL Conference 2010

New SaaS-Oriented Features

/ Tungsten 2.0 adds for SaaS/ISP usage

  • Parallel replication based on shards
  • Fast event logging
  • Low-latency WAN replication
  • Multi-master replication

/ PostgreSQL 8 warm standby support and adding features to manage PostgreSQL 9 / Drizzle support as soon as we get customers

slide-26
SLIDE 26

MySQL Conference 2010 MySQL Conference 2010

/ Strengths

  • Complete solution for managing data, not just

replication

  • Features for SaaS / multi-tenant apps
  • External replication is very flexible
  • Improving very rapidly

/ Why Not Use It?

  • Pretty new (< 2 years)
  • You may want to use other tools like Heartbeat
  • Built-in replication is simpler (well sometimes)
  • Needs more features for big data

Tungsten Trade-Offs

slide-27
SLIDE 27

MySQL Conference 2010 MySQL Conference 2010

Drizzle: MySQL is Not in Kansas Anymore

/ Completely redesigned, pluggable replication for Drizzle DBMS / Key Characteristics

  • Physical - Replicates change set messages

including DDL

  • Asynchronous or Synchronous - Depends on

implementation

  • Row-based - Packages row changes into

Transaction GPB messages and sends replication streams / Drizzle team supplies reference implementation of replication

slide-28
SLIDE 28

MySQL Conference 2010 MySQL Conference 2010

Rethinking Replication Design

Drizzle's replication system looks nothing like MySQL Drizzle is entirely row-based (yes even DDL) Forget the terms master, slave, and binlog We use the terms publisher, subscriber, replicator and applier We have a transaction log, but it is not required for replication

Drizzle's transaction log is a module The transaction log module has example implementations of an applier

slide-29
SLIDE 29

MySQL Conference 2010 MySQL Conference 2010

Clients Parser Optimizer Listener Plugin (Protocol) Pluggable Storage Engine API MyISAM InnoDB MEMORY Archive PBXT Executor Authentication Plugin Query Cache Plugin Logging Plugin (Pre) Logging Plugin (Post) Replication Plugins Replication Services Transaction Services Scheduler Plugin

kernel

Authorization Plugin User-Defined Function Plugins Dictionary Plugin Plugin Registration Metadata Services

slide-30
SLIDE 30

MySQL Conference 2010 MySQL Conference 2010

Client issues DML that modifies data TransactionServices constructs Transaction message object

kernel

ReplicationServices pushes Transaction message out to all replication streams plugin::StorageEngine makes changes to data store TransactionServices calls commitTransaction() plugin::TransactionReplicator calls replicate()

Flow of events when client changes data state

plugin::TransactionApplier calls apply()

slide-31
SLIDE 31

MySQL Conference 2010 MySQL Conference 2010

What is a Replication Stream?

 A replication stream is the pair of a replicator and an

applier

 Each applier must be matched with a replicator

Can be done via command-line arguments

Can be hard-coded

 To see the replication streams that are active, you can

query DATA_DICTIONARY.REPLICATION_STREAMS:

drizzle> select * from data_dictionary.replication_streams; +--------------------+-------------------------+ | REPLICATOR | APPLIER | +--------------------+-------------------------+ | default_replicator | transaction_log_applier | +--------------------+-------------------------+ 1 row in set (0 sec)

slide-32
SLIDE 32

MySQL Conference 2010 MySQL Conference 2010

Drizzle Replication Add-ons: Rabbit MQ

 Developed by Marcus Eriksson − http://developian.com  Can replicate externally or internally − External by reading the Drizzle transaction log and

sending logs to RabbitMQ

 Multi-threaded applier constructs SQL statements from

transaction messages in log files on replica

− Internal via a C++ plugin

 /plugin/rabbitmq/  Implements plugin::TransactionApplier  Sends transaction message to RabbitMQ

slide-33
SLIDE 33

MySQL Conference 2010 MySQL Conference 2010

Galera: Synchronous Multi-Master Is Back!

/ In-core multi-master replication using certification to ensure global serialization / Key Characteristics

  • Logical replication - Replicates change sets
  • Synchronous - All databases in sync at all times
  • Multi-master - Connect to any node to update

/ So…

  • Good update performance
  • Write scaling!
  • Excellent read scaling (linear)
  • No master SPOF
slide-34
SLIDE 34

MySQL Conference 2010 MySQL Conference 2010

Galera Architecture - Virtual Synchrony MySQL MySQL DBMS DBMS MySQL MySQL DBMS DBMS MySQL MySQL DBMS DBMS

wsrep wsrep wsrep wsrep wsrep wsrep Galera Galera Library Library Galera Galera Library Library Galera Galera Library Library

Group Communications/Total Order Group Communications/Total Order

INSERT INTO FOO VALUES(1, 24, INSERT INTO FOO VALUES(1, 24, ‘ ‘hello hello’ ’); );

slide-35
SLIDE 35

MySQL Conference 2010 MySQL Conference 2010

Galera Benefits and Drawbacks

/ Completely eliminating latency between copies lets you scale horizontally with very effective HA / “Virtual synchrony” approach scales writes! / Subject to aborts on hot spots

  • “Birthday problem” data sets or queue-like structures

/ DDL changes typically lock the cluster / WAN operation is “interesting” (Find out more @3pm) / Very sensitive to group communication performance / Nobody has ever gotten the certification approach to work! This is vaporware!! Are you kidding me??

slide-36
SLIDE 36

MySQL Conference 2010 MySQL Conference 2010

Powered By Multi-Master (But Not Galera)

slide-37
SLIDE 37

MySQL Conference 2010 MySQL Conference 2010

PBXT: Skip Binlog Bureaucracy!

/ Replicate directly between PBXT storage plugins / Key Characteristics

  • Physical - Very efficient row replication
  • Asynchronous - Applications do not wait
  • Log-based - Replicate on PBXT journal

/ First step in attacking big data--very fast replication for moving quantities of data

slide-38
SLIDE 38

MySQL Conference 2010 MySQL Conference 2010

PBXT Replication Architecture

Index Transaction Log Data Log Replication Thread

INSERT ADD_ROW

Handle Data Writer Thread Slave Thread

MySQL Slave MySQL Master

Replication Stream

slide-39
SLIDE 39

MySQL Conference 2010 MySQL Conference 2010

DBT2 Performance Results

10X 9% 44%

slide-40
SLIDE 40

MySQL Conference 2010 MySQL Conference 2010

PBXT Round up

/ Very simple set-up for now (server restart to change slave configuration) / Journals are purged after they reach all slaves / Initial performance testing looks very promising / This is alpha code but… / Why didn’t anyone do this before??

slide-41
SLIDE 41

MySQL Conference 2010 MySQL Conference 2010

PostgreSQL: What Are the Neighbors Doing?

/ PostgreSQL 8 provides “warm standby” log shipping

  • No slave queries
  • Unavoidable data loss gap

/ PostgreSQL 9 introduces log streaming and “hot” standby slave reads / Key characteristics of PG 9 replication

  • Physical - Bit-for-bit copy of entire database
  • Asynchronous - Applications do not wait (for

now)

  • Log-based - Based on PG write-ahead log
slide-42
SLIDE 42

MySQL Conference 2010 MySQL Conference 2010

PostgreSQL 8.4 Warm Standby

WAL WAL Files Files

PostgreSQL PostgreSQL Master Master pg_xlogs pg_xlogs Directory Directory

Archived Archived WAL WAL Files Files

Archive Archive Directory Directory PostgreSQL PostgreSQL Standby Standby

WAL WAL Files Files

pg_xlogs pg_xlogs Directory Directory pg_standby pg_standby rsync to standby rsync to standby Continuous Continuous recovery recovery

slide-43
SLIDE 43

MySQL Conference 2010 MySQL Conference 2010

Hot Standby and Log Streaming

PostgreSQL PostgreSQL Master Master PostgreSQL PostgreSQL Hot Hot Standby Standby Continuous replication to Continuous replication to standby standby Recovery Recovery WAL WAL Sender Sender WAL WAL Receiver Receiver

Archived Archived WAL WAL Files Files

Archive Archive Directory Directory Archiving Archiving R/W client R/W client R/O client R/O client

slide-44
SLIDE 44

MySQL Conference 2010 MySQL Conference 2010

PostgreSQL Roundup

/ Log streaming/hot standby layers on top of existing features that are very robust / MVCC is a problem with full physical replication

  • New updates can clear records used by slave queries

/ Configuration files simple but overall set up and replication management are tricky / Still no way to read PostgreSQL logs directly / But expect log streaming to kill a lot of interest in trigger-based replication using SLONY, Londiste, or Bucardo / PG 9 is due out this summer

slide-45
SLIDE 45

MySQL Conference 2010 MySQL Conference 2010

Summary

slide-46
SLIDE 46

MySQL Conference 2010 MySQL Conference 2010

Final Words Final Words

/ / MySQL Replication is continuing to evolve MySQL Replication is continuing to evolve and set the standard and set the standard / / New replication mechanisms are evolving New replication mechanisms are evolving quickly to attack new problems quickly to attack new problems / / Educate yourself and build something new! Educate yourself and build something new!

slide-47
SLIDE 47

MySQL Conference 2010 MySQL Conference 2010

Our Thanks To…

/ Mark Callaghan / Seppo Jaakola / Mats Kindahl / Paul McCullagh / Alexey Yurchenko / Jeremy Zawodny

slide-48
SLIDE 48

MySQL Conference 2010 MySQL Conference 2010

Information/Contact Information/Contact

Robert Hodges http://www.continuent.com robert.hodges at continuent.com Jay Pipes http://www.rackspace.com jaypipes at gmail.com

slide-49
SLIDE 49

MySQL Conference 2010 MySQL Conference 2010

Extra Slides Extra Slides

slide-50
SLIDE 50

MySQL Conference 2010 MySQL Conference 2010

Google Semi-Synchronous Replication

/ Quorum algorithm -- Commits block until at last one slave responds affirmatively / Protects data but avoids system freeze if a slave is unavailable / Released as patch to MySQL; not widely available yet

MySQL MySQL DBMS DBMS MySQL MySQL DBMS DBMS Commit succeeds Commit succeeds when > 0 slaves when > 0 slaves respond affirmatively respond affirmatively MySQL MySQL DBMS DBMS MySQL MySQL DBMS DBMS MySQL MySQL DBMS DBMS