MySQL Cluster Tutorial - MySQL Conference & Expo 2011

SLIDE 1

MySQL Cluster Tutorial
MySQL Conference & Expo 2011

Max Mether <max@skysql.com>
Joffrey Michaie <joffrey@skysql.com>
Johan Andersson <johan@severalnines.com>

SLIDE 2

Who are we?

  • Max Mether

– Trainer and Consultant at MySQL from 2001
– Curriculum Manager at MySQL
– Training Manager at SkySQL from 2010

  • Johan Andersson

– Cluster Practice Manager at MySQL from 2003
– Consultant at Severalnines from 2010

  • Joffrey Michaie

– Cluster Consultant at MySQL from 2009
– Consultant at SkySQL from 2010

11.04.2011 SkySQL Ab 2011 Confidential 2

SLIDE 3

Part 1 Introduction

SLIDE 4

Cluster Use Cases

  • What is cluster used for?

– Telecom applications
– Online Gaming
– Financial Applications
– eCommerce
– Session Management

SLIDE 5

Cluster Usage

What are you using the cluster for, or what will you use it for?

SLIDE 6

Features

  • Shared nothing architecture

– No single point of failure

  • Synchronous replication between nodes
  • ACID transactions
  • Row level locking

SLIDE 7

Features

  • In-memory storage

– Some data can be stored on disk
– Checkpointing to disk for durability

  • Two types of indexes

– Ordered T-trees
– Unique hash indexes

  • Online operations

– Add node groups
– Software upgrade
– Some table alterations

SLIDE 8

Architecture

SLIDE 9

Partitioning

[Diagram, animated across slides 9-12: a Table partitioned across Node 3 and Node 4; labels: Primary Replica, Secondary Replica]

SLIDE 13

Partitioning – 4 Data Nodes

[Diagram, animated across slides 13-14: a Table partitioned across Node 3, Node 4, Node 5 and Node 6, which form two node groups]

SLIDE 15

Heartbeat Circle

[Diagram, animated across slides 15-18: Node 3, Node 4, Node 5 and Node 6 arranged in a heartbeat circle]

SLIDE 19

Network Partitioning Protocol

  • The network partitioning protocol is designed to avoid a split-brain scenario:

  • 1. Is there at least one node from each node group?
  • 2. Are all nodes present from any node group?
  • 3. Ask the arbitrator
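
The three checks above can be sketched as a small decision function. This is a simplified illustration, not NDB's actual implementation. The name `partition_survives`, the dict layout and the `arbitrator_grants` flag are assumptions made for the example, and so is the grouping of Node 3 and Node 4 into one node group and Node 5 and Node 6 into the other.

```python
def partition_survives(node_groups, alive, arbitrator_grants):
    """Decide whether the set of nodes `alive` may keep running.

    node_groups: dict mapping a node group id to the set of node ids in it.
    alive: set of node ids that can still see each other.
    arbitrator_grants: True if the arbitrator picks this partition
                       (hypothetical flag standing in for the real request).
    """
    # 1. At least one node from each node group, otherwise data is missing.
    if any(not (members & alive) for members in node_groups.values()):
        return False
    # 2. If all nodes of some node group are present, no other partition
    #    can hold a complete data set, so this one may continue.
    if any(members <= alive for members in node_groups.values()):
        return True
    # 3. Otherwise another viable partition may exist: ask the arbitrator.
    return arbitrator_grants


# Assumed layout: two node groups with two data nodes each.
groups = {0: {3, 4}, 1: {5, 6}}
print(partition_survives(groups, {3, 4, 5}, False))  # uneven split, larger side
print(partition_survives(groups, {6}, True))         # uneven split, smaller side
print(partition_survives(groups, {3, 5}, True))      # even split, wins arbitration
```

In the even split, both halves pass check 1 and fail check 2, which is why the arbitrator is needed to pick exactly one side.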

SLIDE 20

Uneven Split

[Diagram: an uneven network split among Node 3, Node 4, Node 5 and Node 6, which form two node groups]

SLIDE 21

Even Split

[Diagram: an even network split among Node 3, Node 4, Node 5 and Node 6, which form two node groups]

SLIDE 22

Durability

  • In order for a node to recover faster, some data is stored locally

– The REDO log

  • Synchronized by global checkpoints (GCP)

– The DataMemory

  • Synchronized by local checkpoints (LCP)
  • These can also be used for system recovery

SLIDE 23

Transactions

[Diagram: SQL Node, Node 3, Node 4; label: Transaction request]

SLIDE 24

Transactions – Two Phase Commit

[Diagram: SQL Node, Transaction Coordinator, Node 3, Node 4; label: Transaction request]

SLIDE 25

Transactions – Prepare Phase

[Diagram, animated across slides 25-27: SQL Node, Transaction Coordinator, Node 3, Node 4; label: Transaction request]

SLIDE 28

Transactions – Commit Phase

[Diagram, animated across slides 28-31: SQL Node, Transaction Coordinator, Node 3, Node 4; labels: Transaction request, then Transaction successful]
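
The prepare and commit phases named on the transaction slides follow the general two-phase commit pattern. The sketch below shows that generic, textbook pattern only. It does not reproduce NDB's internal message flow, and the `Participant` class and `two_phase_commit` function are hypothetical names invented for the illustration.

```python
class Participant:
    """A toy stand-in for a data node (hypothetical, not an NDB API)."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "idle"

    def prepare(self):
        # Vote yes only if the change can be applied locally.
        self.state = "prepared" if self.can_prepare else "aborted"
        return self.can_prepare

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Prepare every participant, then commit only if all voted yes."""
    # Phase 1: prepare. all() stops at the first "no" vote.
    if all(p.prepare() for p in participants):
        # Phase 2: commit everywhere.
        for p in participants:
            p.commit()
        return True
    # At least one participant refused: undo everywhere.
    for p in participants:
        p.rollback()
    return False


nodes = [Participant("Node 3"), Participant("Node 4")]
print(two_phase_commit(nodes), [p.state for p in nodes])
```

The coordinator reports success to the client only after every participant has committed, which matches the "Transaction successful" step at the end of the slide sequence.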

SLIDE 32

Indexes

  • Unique Hash Indexes

– Each table has a Primary Key hash index
– Other unique hash indexes implemented by hidden tables

  • Partitioned like tables
  • Ordered indexes

– T-trees
– Local for each node

SLIDE 33

Part 2 Practical Labs

SLIDE 34

Preparations

  • 1. Load the virtualbox and start the system
  • 2. Examine the Cluster configuration file
  • 3. Start the cluster

– Start management node
– Tail the cluster log
– Start data nodes
– Start the MySQL servers

  • 4. Load the sakila database
  • 5. Start the MySQL clients

SLIDE 35

Exercise 1 – Initial test

  • 1. Create a test table
  • 2. Insert a row in the test table
  • 3. Log in to the other MySQL server and verify that the table is there too

  • 4. Drop the table

SLIDE 36

Exercise 2 – Backup and Restore

  • 1. Take a backup

– Use the START BACKUP command

  • 2. Start a clean cluster with no data
  • 3. Use the backup to restore the data

– Use the ndb_restore utility

SLIDE 37

Exercise 3 – Node Recovery

  • 1. Examine the log during the process
  • 2. Kill one of the data nodes with the kill command
  • 3. Execute a query. Is the cluster still working?
  • 4. Restart the node
  • 5. Execute a query

SLIDE 38

Exercise 4 – NDBINFO

  • 1. Go to the ndbinfo schema
  • 2. Examine the tables

SLIDE 39

Exercise 5 – Resource Limits

  • 1. Issue the statements that run into limits
  • 2. Change the configuration file

– Set MaxNoOfConcurrentOperations to 40000

  • 3. Do a "rolling restart"

– Restart each node one by one

  • 4. Re-issue the failed statement

SLIDE 40

Exercise 6 – Partial Restore

  • 1. Restore one table from the backup

– Restore the table directly, or
– Extract the contents to a plain file and import it

SLIDE 41

Exercise 7 – Query Optimization

  • 1. Run the Query

– Watch query time

  • 2. Use EXPLAIN, show indexes

– Watch the query execution plan and cardinality

  • 3. Rewrite the query

– Watch query time

SLIDE 42

Part 3 Best Practices

SLIDE 43

Agenda

  • Cluster Setup

– Recommended Setup
– Networking & Hardware Selection

  • Disk Data Tables
  • Configuration
  • Administration

– Online/Offline Operations
– Backup and restore

  • Monitoring

SLIDE 44

Recommended Setup

[Diagram labels: Clients; Load Balancer(s); two hosts, each running SQL + Mgm + AppServer + WebServer...; Bonding; Redundant switches; two Data nodes]

SLIDE 45

Networking

  • Dedicated >= 1GB/s networking
  • Prevent network failures (NIC x 2, Bonding)
  • Use low-latency networking (Dolphin...)

– Especially with >= 8 data nodes, or when you want higher throughput and lower latency

  • No security layer to the management node (remote shutdown allowed ...)
  • Enable port 1186 access only from cluster nodes and administrators

SLIDE 46

Hardware – Data Nodes

  • One data node can use 8 cores (Cluster 7.0+)
  • CPU: 2 x 4 core (Nehalem works really well)

– Fast CPU → fast processing of messages

  • RAM: As much as you need

– A 10GB data set will require 20GB of RAM (two replicas)
– Each node will then need 2 x 10GB / #data nodes (for example, 2 data nodes → 10GB each → 16GB of RAM per node is good)

  • Disk: 10 x DataMemory + space for BACKUP + TableSpace (if disk data tables)
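
The sizing rules above reduce to simple arithmetic. The sketch below restates them as two helper functions. The function and parameter names are made up for illustration, and the results cover only what the slide's rules count, not operating system or buffer overhead.

```python
def ram_per_data_node_gb(data_set_gb, data_nodes, replicas=2):
    """RAM each data node needs for the data itself: replicas x data / nodes."""
    return replicas * data_set_gb / data_nodes


def disk_per_data_node_gb(data_memory_gb, backup_gb=0, tablespace_gb=0):
    """Disk per data node: 10 x DataMemory + BACKUP + TableSpace."""
    return 10 * data_memory_gb + backup_gb + tablespace_gb


# The slide's example: a 10GB data set on 2 data nodes.
print(ram_per_data_node_gb(10, 2))    # 10.0
print(disk_per_data_node_gb(10))      # 100
```

With 10GB of data per node, the slide's suggestion of 16GB of RAM leaves headroom for everything else running on the host.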

SLIDE 47

Hardware – MySQL Servers

  • CPU: 2 – 16 cores
  • RAM: Not so important – 4GB enough (depends on connections and buffers)

  • Disks: Used mainly for logging

– Binary log needed for replication

SLIDE 48

Disk Subsystem

low-end: 1 x SATA 7200RPM

  • For read-mostly workloads with few writes
  • No redundancy (but the other data node is the mirror)

mid-end: 1 x SAS 10KRPM

  • Heavy duty (many MB/s)
  • No redundancy (but the other data node is the mirror)

high-end: 4 x SAS 10KRPM

  • Heavy duty (many MB/s)
  • Disk redundancy (RAID1+0), hot swap

[Diagram: in each of the three options the disks hold LCP and REDOLOG]

  • REDO, LCP, BACKUP – written sequentially in small chunks (256KB)
  • If possible, use ODirect=1

SLIDE 49

Filesystem

  • Most customers use ext3 (Linux) and UFS (Solaris)

– ext2 could be an option (but recovery takes longer)

  • XFS – we do not have much experience with it
  • ZFS

– You must separate the journal (ZIL) and the filesystem

  • Mount with noatime
  • Raw device is not supported

SLIDE 50

Disk Data Storage

Minimal recommended: 2 x SAS 10KRPM (preferably)
High-end: 4 x SAS 10-15KRPM (preferably)

  • Use high-end for heavy read/write of data (1000's of 10KB records per sec), e.g. content delivery platforms
  • SSD for TABLESPACE is also interesting – not much experience of this yet
  • Having TABLESPACE on a separate disk is good for read performance
  • Enable WRITE_CACHE on devices

[Diagram: placement of TABLESPACE, LCP, REDO LOG and UNDO LOG files across the disks in each setup]

SLIDE 51

Configuration – Disk Data Storage

  • Use Disk Data tables for

– Simple accesses (read/write on PK)
– Same as for InnoDB: they easily become disk bound (check with iostat)

  • Set

– DiskPageBufferMemory=3072M

  • This is a good start if you rely a lot on disk data. It works like the InnoDB buffer pool, so set it as high as you can!
  • Increased chance that a page will be cached

– SharedGlobalMemory=384M-1024M
– UNDO_BUFFER=64M to 128M (if you write a lot)

  • You cannot change this BUFFER later!
  • Specified at LOGFILE GROUP creation time

– DiskIOThreadPool=[ 8 .. 16 ] (Cluster 7.0+)

SLIDE 52

Configuration - General

– MaxNoOfExecutionThreads <= #cores

  • Otherwise contention can occur → unexpected behaviour

– RedoBuffer=32-64M

  • If you need to set it higher, your disks are too slow

– FragmentLogFileSize=256M
– NoOfFragmentLogFiles = 6 x DataMemory (in MB) / (4 x 256MB)

  • Most common issue – redo log too small

– Try the configurator: www.severalnines.com/config
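
The redo log rule above can be worked through numerically. The helper below is a sketch of that formula only. The function name is invented for the example, and rounding up to a whole number of files is an assumption of this sketch, not something the slide states.

```python
import math


def no_of_fragment_log_files(data_memory_mb, fragment_log_file_size_mb=256):
    """6 x DataMemory (MB) / (4 x FragmentLogFileSize), rounded up (assumption)."""
    return math.ceil(6 * data_memory_mb / (4 * fragment_log_file_size_mb))


# Example: DataMemory=4096M with FragmentLogFileSize=256M.
print(no_of_fragment_log_files(4096))  # 24
```

The divisor is 4 x 256MB because each increment of NoOfFragmentLogFiles adds a set of four log files of FragmentLogFileSize each, so the result here corresponds to 24 x 4 x 256MB = 24GB of redo log, six times the DataMemory.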

SLIDE 53

Application - Primary Keys

  • Always define a primary key

– Tables without primary keys are accepted

  • A hidden primary key is created
  • The hidden PK is not replicated
  • There are recovery issues with hidden PKs
  • Application behavior (KEY NOT FOUND.. etc)
  • At least have a

id BIGINT AUTO_INCREMENT PRIMARY KEY

– Even if you don't need it for your applications

SLIDE 54

Application - Query Cache

  • Don't cache everything in the Query Cache

– Expensive to invalidate over N mysql servers
– A write on one server will force the others to purge their cache

  • For tables that change seldom (or read-only)

– Set query_cache_type=2 (DEMAND)
  SELECT SQL_CACHE <cols> .. FROM table;
– This can be good for STATIC data

SLIDE 55

Application – Transaction Size

  • Transactions (large updates)

– NDB is designed for many short transactions

  • Recommended to UPDATE / DELETE in small chunks
  • Use LIMIT 10000 until all records are UPDATED/DELETED
  • MaxNoOfConcurrentOperations

– Limit for how many records can be modified simultaneously on one data node
– MaxNoOfConcurrentOperations=1000000 will use 1GB of RAM

  • Use only if necessary
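
The "LIMIT 10000 until all records are deleted" advice can be sketched as a loop. This is a minimal illustration, assuming a hypothetical `execute` helper that runs one statement in its own transaction and returns the number of affected rows. The helper, the table name and the WHERE clause are placeholders, and a fake `execute` is used here so the loop can run without a cluster.

```python
def delete_in_chunks(execute, table, where, chunk_size=10000):
    """Repeat a bounded DELETE until fewer than chunk_size rows are affected.

    execute: callable that runs one SQL statement in its own transaction
             and returns the affected row count (hypothetical helper).
    """
    total = 0
    while True:
        affected = execute(
            f"DELETE FROM {table} WHERE {where} LIMIT {chunk_size}"
        )
        total += affected
        if affected < chunk_size:
            return total


# Fake executor simulating a table with 25000 matching rows.
remaining = [25000]


def fake_execute(sql):
    deleted = min(remaining[0], 10000)
    remaining[0] -= deleted
    return deleted


print(delete_in_chunks(fake_execute, "session_table", "expired = 1"))
```

Each iteration touches at most 10000 rows, so the number of concurrent operations stays well below the MaxNoOfConcurrentOperations limit instead of requiring it to be raised. By the slide's figure, that setting costs roughly 1KB of RAM per allowed operation.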

SLIDE 56

Application – Table Locks

  • Table lock commands are local only

– FLUSH TABLES WITH READ LOCK;
– LOCK TABLES <table> READ;

  • You must get the lock on all mysql servers

SLIDE 57

Application – Schema Operations

  • Don't use too much CREATE/DROP TABLE of NDB tables

– It is a heavy operation within Cluster
– Takes much longer than with standard MySQL

SLIDE 58

REDO Log Optimizations

  • Some tables account for a lot of writes, but do not need to be recovered (session tables)

– A session table often does not need to be REDO logged or checkpointed

  • Create these tables as 'NO LOGGING' tables:

mysql> set @ndb_curr_val=@@ndb_table_no_logging;
mysql> set ndb_table_no_logging=1;
mysql> create table session_table(..) engine=ndb;
mysql> set ndb_table_no_logging=@ndb_curr_val;

– session_table will not be REDO logged → No disk activity for this table!

SLIDE 59

ALTER TABLE

  • NOT online operations:

– Rename a table
– Change data type
– Change storage size
– Drop column
– Rename column
– Add/Drop a PRIMARY KEY

  • Online operations:

– Add column (ALTER ONLINE …)
– CREATE INDEX
– Online add node (see my presentation from last year for how to do it)

  • Altering a 1GB table offline requires 1GB extra

SLIDE 60

Administration Layer

  • Introduce a MySQL Server for administration purposes!
  • Should never ever get application requests
  • Simplifies heavy (non online) schema changes

[Diagram labels: Storage layer, SQL layer, App layer, Admin layer, syncrepl]

# give explicit nodeid in config.ini:
[mysqld]
id=8
hostname=X

# in my.cnf:
ndb_connectstring="nodeid=8;x,y"
ndb_cluster_connection_pool=1

SLIDE 61

Online Upgrades

  • OS, SW version (7.0.x → 7.1.x)
  • Configuration

– Increase DataMemory (DM), IndexMemory (IM), buffers, redo log, [mysqld] slots

  • Hardware (add more RAM etc.)
  • Adding data nodes (from 7.0)

– See Johan's presentation from the 2009 conference

SLIDE 62

Backup

  • Backup of NDB tables

– Online – can have ongoing transactions
– Consistent – only committed data
– ndb_mgm -e "START BACKUP"

  • Copy backup files from data nodes to a safe location
  • Non-NDB tables must be backed up separately
  • MySQL system tables are stored only in MyISAM

SLIDE 63

Backup

  • You want to back up (for each mysql server)

– mysql database
– Triggers, SP ...

  • Use 'mysqldump'

mysqldump mysql > mysql.sql
mysqldump --no-data --no-create-info -R > routines.sql

  • Copy my.cnf & config.ini files

SLIDE 64

Monitoring

  • Mandatory to monitor

– CPU/Network/Memory usage
– Disk capacity (I/O) usage
– Network latency between nodes
– Node status ...
– Used Index/Data Memory

  • www.severalnines.com/cmon – monitors data nodes and MySQL servers
