

SLIDE 1

Galera Replication

Synchronous Multi-Master Replication for InnoDB

...well, why not for any other DBMS as well

Seppo Jaakola, Alexey Yurchenko

SLIDE 2

April 14, 2010 Codership @ MySQL Conference 2010

Contents

1. Galera Cluster
2. Replication API
3. Benchmarking
4. Installation & Management
5. Galera Project

SLIDE 3

Replication for Transactional DBMS

DBMS

SLIDE 4

Replication API

[Diagram labels: DBMS, repl API]

Interface for replication system

➔ Calls for replication
➔ Callbacks from replication

Plugin framework
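
The "calls in, callbacks out" shape of such an interface can be sketched in a few lines. Every name below (`LoopbackProvider`, `register_applier`, `replicate`) is hypothetical and chosen for illustration only; none of it is taken from the wsrep API.

```python
from typing import Callable, List

# Hypothetical names throughout: this is not the wsrep API, only an
# illustration of a DBMS calling a provider and receiving callbacks.
ApplyCallback = Callable[[bytes], None]


class LoopbackProvider:
    """Toy replication provider that hands every write set to all registered appliers."""

    def __init__(self) -> None:
        self._appliers: List[ApplyCallback] = []

    def register_applier(self, callback: ApplyCallback) -> None:
        # The DBMS gives the provider a callback for incoming write sets.
        self._appliers.append(callback)

    def replicate(self, write_set: bytes) -> int:
        # Call from the DBMS into the replication system.
        for apply in self._appliers:
            apply(write_set)  # callback from the replication system
        return len(self._appliers)


received: List[bytes] = []
provider = LoopbackProvider()
provider.register_applier(received.append)
delivered = provider.replicate(b"UPDATE t SET x = 1")
```

A real provider would ship the write set over the network; the loopback only shows which side calls which.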

SLIDE 5

Pluggable Replicator

[Diagram labels: DBMS, repl API, Replication Provider]

Provider can be loaded at DBMS start

SLIDE 6

Galera Cluster

[Diagram labels: InnoDB, wsrep, Galera Replication]

For MySQL/InnoDB

SLIDE 7

Galera Cluster

[Diagram labels: InnoDB, wsrep, Galera Replication]

For MySQL/InnoDB
wsrep extension implements replication API

SLIDE 8

Galera Cluster

[Diagram labels: InnoDB, wsrep, Galera Replication]

For MySQL/InnoDB
wsrep extension implements replication API
Dynamically loaded library

SLIDE 9

Galera Cluster

[Diagram labels: Clients, InnoDB, wsrep, Galera Replication]

Transparent connections

SLIDE 10

Multi Master

[Diagram labels: Clients, InnoDB, wsrep, Galera Replication]

Transparent connections
Multi-master

SLIDE 11

Multi Master

[Diagram labels: Clients, two InnoDB/wsrep nodes, Galera Replication]

Transparent connections
Multi-master

SLIDE 12

Multi Master

[Diagram labels: Clients, three InnoDB/wsrep nodes, Galera Replication]

Transparent connections
Multi-master

SLIDE 13

Synchronous Replication

[Diagram labels: Clients, three InnoDB/wsrep nodes, Galera Replication]

Transparent connections
Multi-master
Synchronous replication

SLIDE 14

Galera Replication

  • Synchronous multi-master replication

➔ High Availability

  • No middle-ware, connections directly to DBMS

➔ Transparency

  • Row events, row level locking

➔ Write scalability

  • Certification based replication method

SLIDE 15

Synchronous Replication

[Diagram labels: Client, trx, commit, three wsrep nodes, Galera Replication]

SLIDE 16

Synchronous Replication

[Diagram labels: Client, trx, commit, WS, three wsrep nodes, Galera Replication]

Transaction is replicated to all nodes => HA

SLIDE 17

Synchronous Replication

[Diagram labels: Client, trx, three wsrep nodes, Galera Replication]

Transaction is applied at later time => virtual synchrony

SLIDE 18

Certification Based Replication

  • Transactions are processed independently on each cluster node
  • Transaction write sets are replicated at commit time
  • Cluster-wide conflicts are resolved by a certification test
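
A rough sketch of what a certification test might look like follows. It assumes a first-committer-wins rule over a globally ordered stream of write sets; `WriteSet`, `Certifier` and the key format are invented for illustration and are not Galera's actual data structures.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class WriteSet:
    keys: FrozenSet[str]   # rows modified by the transaction
    snapshot_seqno: int    # last committed seqno the transaction had seen


class Certifier:
    """Simplified first-committer-wins certification (an assumption, not Galera's code)."""

    def __init__(self) -> None:
        self.last_seqno = 0
        self._last_writer: Dict[str, int] = {}  # key -> seqno of last certified write

    def certify(self, ws: WriteSet) -> bool:
        # Every write set takes a slot in the global order, pass or fail.
        self.last_seqno += 1
        for key in ws.keys:
            if self._last_writer.get(key, 0) > ws.snapshot_seqno:
                return False  # a concurrent transaction already changed this row
        for key in ws.keys:
            self._last_writer[key] = self.last_seqno
        return True


certifier = Certifier()
first = certifier.certify(WriteSet(frozenset({"t1:row1"}), snapshot_seqno=0))
conflicting = certifier.certify(WriteSet(frozenset({"t1:row1"}), snapshot_seqno=0))
independent = certifier.certify(WriteSet(frozenset({"t1:row2"}), snapshot_seqno=0))
```

Because the test depends only on the write set and on the order of earlier write sets, every node that sees the same order reaches the same verdict without extra coordination.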

SLIDE 19

Query Processing

[Diagram labels: Client, MySQL, MySQL, write set population, write set applier, WS extract, certification test, replication, Group Communication]

SLIDE 20

Commit Processing

[Diagram labels: Client, commit, MySQL, MySQL, write set population, write set applier, WS extract, certification test, replication, WS, Group Communication]

SLIDE 21

Commit Processing

[Diagram labels: Client, MySQL, MySQL, write set population, write set applier, WS extract, certification test, replication, WS, Group Communication]

SLIDE 22

Commit Processing

[Diagram labels: Client, MySQL, MySQL, write set population, write set applier, WS extract, certification test, replication, WS, Group Communication, commit, rollback]

SLIDE 23

Replication API

SLIDE 24

Replication API

  • Galera integrates closely with DBMS transaction processing

➔ There must be an interface between the DBMS and the replication system

SLIDE 25

Other Replication APIs

  • MySQL's API, in the works:
    http://forge.mysql.com/wiki/MySQL_Replication:_Walk-through_of_the_new_5.1_and_6.0_features
  • Drizzle's API, already there:
    http://www.jpipes.com/index.php?/archives/290-Towards-a-New-Modular-Replication-Architecture.html
  • MariaDB is specifying a new API:
    https://lists.launchpad.net/maria-developers/msg01998.html

SLIDE 26

wsrep API

  • Codership's replication API
  • DBMS agnostic replication interface
  • Defines:

– Write Set replication for transactions
– TO isolation for replicating DDL

  • Suitable for different replication modes (sync/async, multi-master, master/slave, PITR...)
  • https://launchpad.net/wsrep

SLIDE 27

wsrep API Implementation

  • Replication provider library load/unload
  • Write set population calls
  • Write set replication calls (at commit)
  • Prioritized transactions

– Lock queue modified
– Aborting local victims

  • Configuration hooks
  • Status hooks
  • TO isolation for DDL queries

SLIDE 28

Galera Library

[Diagram labels: DBMS, wsrep hooks, wsrep API, dlopen, wsrep provider, Galera, certification, replication, GCS framework, vsbes, gcomm, spread]

SLIDE 29

Benchmarking

SLIDE 30

Benchmarking

  • Tested with several benchmarks

– Sysbench, dbt2, DOTS, osdb, jmeter, sqlgen...

  • Tested on physical hardware and on Amazon EC2 instances

➔ In general, shows good scalability even with write-intensive workloads

SLIDE 31

SysBench Benchmarks

  • SysBench OLTP mode test
  • 1M rows
  • EC2 Large instances

nodes  users  trx/s  deadlks  95%lat
1      18     385    0        0.092
2      36     761    2.54     0.100
3      45     900    3.42     0.103
4      60     1034   4.54     0.120

Official 5.1.33 binary:

1      18     451    0        0.079
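
The scaling implied by these figures can be worked out directly. The snippet below only does arithmetic on the numbers in the table; note that the user count also grows with the node count, so the runs are not at equal load.

```python
# Throughput figures copied from the SysBench table above.
galera_trx_per_s = {1: 385, 2: 761, 3: 900, 4: 1034}
stock_single_node = 451  # official 5.1.33 binary, 1 node

baseline = galera_trx_per_s[1]
for nodes, trx in galera_trx_per_s.items():
    speedup = trx / baseline          # relative to one Galera node
    efficiency = speedup / nodes      # 1.0 would be perfectly linear scaling
    print(f"{nodes} node(s): speedup {speedup:.2f}x, efficiency {efficiency:.0%}")

# Cost of running one Galera node compared with the stock binary.
overhead = 1 - baseline / stock_single_node
print(f"single-node overhead vs. stock binary: {overhead:.1%}")
```

On these numbers, two nodes nearly double the single-node throughput, four nodes reach roughly 2.7 times it, and a single Galera node runs about 15% below the stock binary.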

SLIDE 32

Synchronous WAN Replication

  • SysBench OLTP
  • 1M rows
  • EC2 large instances
  • EU → US
  • Distance: ~3000 miles
  • Ping RTT: ~88 ms
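
What an 88 ms round trip means for commit rates can be estimated with a back-of-the-envelope calculation. It rests on an assumption not stated on the slide: that a synchronous commit cannot complete in less than one network round trip.

```python
rtt_seconds = 0.088  # ping RTT from the slide

# Assumption: a commit cannot complete in less than one round trip,
# so one connection committing back to back is capped by the RTT.
max_commits_per_connection = 1 / rtt_seconds


def throughput_ceiling(connections: int) -> float:
    """Upper bound on commits/s if every connection commits back to back."""
    return connections * max_commits_per_connection
```

Under that assumption a single connection tops out at about 11 commits per second, so WAN throughput has to come from many concurrent connections rather than from fast individual commits.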

SLIDE 33

Installation

SLIDE 34

Installing MySQL/Galera

Download from www.codership.com

Distribution choices:

1. Pre-built RPM or Debian package
2. Demo tar distribution
3. Source build

SLIDE 35

Demo Distribution

  • Pre-built 32/64 bit Linux binaries
  • Installs in one directory path
  • Contains a sample database
  • Good for testing/evaluation

SLIDE 36

Demo Distribution

  • Install as regular user (not root)

$ tar xzf mysql-5.1.43-galera-0.7.3-x86_64.tgz

  • Node startup by: mysql-galera script

– Commands: start | stop | check

  • Specify cluster_address

– Start first node with address: gcomm://
– Start other nodes with gcomm://<first-node-ip>

$ mysql-galera -g gcomm:// start
$ mysql-galera -g gcomm://<other-IP> start
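
For scripted setups, the two start commands can be generated by a small helper. This is a sketch: `start_command` is a hypothetical function, and only the `mysql-galera -g <address> start` form shown above is assumed.

```python
from typing import List, Optional


def start_command(first_node_ip: Optional[str] = None) -> List[str]:
    """Build the mysql-galera start command for the first or a joining node."""
    # An empty gcomm:// address starts a new cluster; otherwise join via the given node.
    address = "gcomm://" if first_node_ip is None else f"gcomm://{first_node_ip}"
    return ["mysql-galera", "-g", address, "start"]


bootstrap = start_command()
joiner = start_command("10.0.0.1")  # example address
```

The returned list can be passed to `subprocess.run` on the node being started.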

SLIDE 37

Galera in Cloud

  • VPS.net

– Nice new cloud computing solution
– MySQL/Galera images available

  • Amazon EC2

– Extensively tested in EC2
– Deploy e.g. an Ubuntu node and install MySQL/Galera manually
– Pre-built image underway

SLIDE 38

Cluster Topologies

➔ Use 3 or more nodes for HA
➔ Application load balancing gives best performance
➔ Use load balancer if a single connection point is needed
➔ Reference node can help in joining

SLIDE 39

Dedicated Replication Interconnection

[Diagram labels: SW, SW; 10.0.0.1, 10.0.0.2, 192.168.0.2, 192.168.0.1]

Public connections
Min 1 Gb/sec replication network

SLIDE 40

Application Load Balancing

[Diagram labels: Client, connection pool]

+ Gives best performance
  • Application must react to cluster changes
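
One way an application might react to cluster changes is to rotate over the nodes and skip those it has marked as down. The sketch below is a hypothetical helper (`NodePicker`), not part of any Galera or MySQL client library; how failures are detected is left to the application.

```python
from itertools import cycle
from typing import Iterable, Set


class NodePicker:
    """Round-robin over cluster nodes, skipping any marked as down."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self._nodes = list(nodes)
        self._ring = cycle(self._nodes)
        self._down: Set[str] = set()

    def mark_down(self, node: str) -> None:
        self._down.add(node)

    def mark_up(self, node: str) -> None:
        self._down.discard(node)

    def next_node(self) -> str:
        # Try each node at most once per call before giving up.
        for _ in range(len(self._nodes)):
            node = next(self._ring)
            if node not in self._down:
                return node
        raise RuntimeError("no cluster node available")


picker = NodePicker(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
picker.mark_down("10.0.0.2")
picks = [picker.next_node() for _ in range(4)]
```

With the second node marked down, connections alternate between the remaining two until `mark_up` is called.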

SLIDE 41

Load Balancer

[Diagram labels: Client]

Load balancers in order of performance:

  • HW balancers
  • IP dispatching in kernel, e.g. LVS
  • TCP/IP load balancers in user land, e.g. GLB
  • Proxy (e.g. MySQL Proxy)

SLIDE 42

Reference Node

[Diagram labels: Clients, Galera Replication]

No client connections

  • Works as donor for joining nodes
  • Backups by xtrabackup

SLIDE 43

Reference Node as MySQL Master

[Diagram labels: Clients, Galera Replication, MySQL master, MySQL slave]

SLIDE 44

Management

SLIDE 45

wsrep Variables

mysql> show variables like 'wsrep%';
+--------------------------------+---------------------------------------------------------------+
| Variable_name                  | Value                                                         |
+--------------------------------+---------------------------------------------------------------+
| wsrep_auto_increment_control   | ON                                                            |
| wsrep_cluster_address          | gcomm://                                                      |
| wsrep_cluster_name             | my_wsrep_cluster                                              |
| wsrep_convert_LOCK_to_trx      | OFF                                                           |
| wsrep_data_home_dir            | /home/galera/mysql-5.1.42-2957,1439/mysql/var/                |
| wsrep_dbug_option              | NULL                                                          |
| wsrep_debug                    | OFF                                                           |
| wsrep_drupal_282555_workaround | ON                                                            |
| wsrep_local_cache_size         | 20971520                                                      |
| wsrep_node_incoming_address    | 10.0.0.121:3306                                               |
| wsrep_node_name                | abyssinian                                                    |
| wsrep_on                       | ON                                                            |
| wsrep_provider                 | /home/galera/mysql-5.1.42-2957,1439/galera/lib/libmmgalera.so |
| wsrep_provider_options         | NULL                                                          |
| wsrep_retry_autocommit         | ON                                                            |
| wsrep_slave_threads            | 1                                                             |
| wsrep_sst_auth                 | root:rootpass                                                 |
| wsrep_sst_donor                | NULL                                                          |
| wsrep_sst_method               | mysqldump                                                     |
| wsrep_sst_receive_address      | AUTO                                                          |
| wsrep_start_position           | NULL                                                          |
| wsrep_ws_persistency           | OFF                                                           |
+--------------------------------+---------------------------------------------------------------+
22 rows in set (0.00 sec)

SLIDE 46

wsrep Variables

  • wsrep_provider

– Path to provider library

  • wsrep_cluster_address

– Tells the connection point where the node can join
– 'gcomm://' for the first node
– 'gcomm://<IP address>' for joining nodes

SLIDE 47

wsrep Status

mysql> show status like 'wsrep%';
+---------------------------+--------------------------------------+
| Variable_name             | Value                                |
+---------------------------+--------------------------------------+
| wsrep_local_state_uuid    | 0eedf650-1694-11df-0800-6227ab0639e3 |
| wsrep_last_committed      | 3                                    |
| wsrep_replicated          | 0                                    |
| wsrep_replicated_bytes    | 0                                    |
| wsrep_received            | 0                                    |
| wsrep_received_bytes      | 0                                    |
| wsrep_local_commits       | 0                                    |
| wsrep_local_cert_failures | 0                                    |
| wsrep_local_bf_aborts     | 0                                    |
| wsrep_flow_control_waits  | 0                                    |
| wsrep_local_status        | Joined (5)                           |
| wsrep_cluster_conf_id     | 1                                    |
| wsrep_cluster_size        | 1                                    |
| wsrep_cluster_state_uuid  | 0eedf650-1694-11df-0800-6227ab0639e3 |
| wsrep_cluster_status      | Primary                              |
| wsrep_local_index         | 0                                    |
| wsrep_ready               | ON                                   |
+---------------------------+--------------------------------------+
17 rows in set (0.00 sec)

SLIDE 48

wsrep Status

  • wsrep_last_committed

– Tells which transaction was committed last

  • wsrep_local_cert_failures
  • wsrep_local_bf_aborts

– How many rollbacks the cluster has caused

  • wsrep_flow_control_waits

– How much waiting flow control has caused
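
A monitoring script could read these counters from the `show status like 'wsrep%'` output once it has been loaded into a dictionary. The health rule below (ready is ON and cluster status is Primary) is an assumption made for illustration, as are the function names.

```python
from typing import Dict


def node_is_healthy(status: Dict[str, str]) -> bool:
    """Assumed rule: usable when wsrep_ready is ON and the cluster status is Primary."""
    return (
        status.get("wsrep_ready") == "ON"
        and status.get("wsrep_cluster_status") == "Primary"
    )


def cluster_rollbacks(status: Dict[str, str]) -> int:
    """Rollbacks caused by the cluster: certification failures plus BF aborts."""
    return int(status["wsrep_local_cert_failures"]) + int(status["wsrep_local_bf_aborts"])


# Values taken from the status output shown on the previous slide.
sample = {
    "wsrep_ready": "ON",
    "wsrep_cluster_status": "Primary",
    "wsrep_local_cert_failures": "0",
    "wsrep_local_bf_aborts": "0",
    "wsrep_flow_control_waits": "0",
}
healthy = node_is_healthy(sample)
rollbacks = cluster_rollbacks(sample)
```

Tracking how `cluster_rollbacks` grows over time gives a rough view of how often multi-master writes collide.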

SLIDE 49

Backups

  • No direct backup method in 0.7 release :(
  • To get a backup

➔ Join/depart a node in a cluster
➔ Use reference node as MySQL master and fan out to a backup slave
➔ Use xtrabackup in reference node to get hot backup

SLIDE 50

Joining New Nodes

[Diagram labels: Clients; three MySQL nodes (10.0.0.1, 10.0.0.2, 10.0.0.3); active cluster; joining node]

Joining node: wsrep_cluster_address=10.0.0.2

SST Request

SLIDE 51

Joining New Nodes

[Diagram labels: Clients; three MySQL nodes (10.0.0.1, 10.0.0.2, 10.0.0.3); donor node]

1. mysqldump
2. load

SLIDE 52

Joining New Nodes

[Diagram labels: Clients; three MySQL nodes (10.0.0.1, 10.0.0.2, 10.0.0.3); active cluster]

SLIDE 53

Galera Project

SLIDE 54

Galera Project

[Timeline, 2008 to 2010: kick off; first public releases; 0.7 release, fully open source; 0.7.1, 0.7.2, 0.7.3]

SLIDE 55

Release 0.7

  • Current release 0.7.3

– Stable release
– Production readiness
– Open source

  • Simple management & installation utilities
  • State transfer by mysqldump
  • “Reasonably” good performance

SLIDE 56

Road Map

[Timeline, 2010 to 2011]

Stability milestone

➔ 0.7 releases 0.7.4...

Optimization milestone

➔ Incremental backups
➔ Xtrabackup
➔ UDP multicast

Management milestone

➔ Cluster commands
➔ Management console

SLIDE 57

Summary

  • Certification based replication turns out to be effective

– High Availability
– Transparency
– Good scalability even with high write rates

  • wsrep API is “not too hard” to implement
  • Any (transactional) DBMS can leverage this replication possibility

SLIDE 58

Codership – The Saga

  • Founders Seppo Jaakola, Alexey Yurchenko, Teemu Ollakka
  • Fin-Rus community working from Finland
  • Experts in distributed systems & DBMS development, information security
  • Set Sail Oct 2007
  • Projects:

– Galera
– GLB (Debian ITP)
– Cluster testing framework (in-house)

SLIDE 59

Get in Touch!

  • R&D consulting services
  • Support subscriptions
  • Downloads available: http://www.codership.com
  • info@codership.com
  • Mailing list: codership-team@googlegroups.com