APACHE COTTON: MySQL on Mesos (Yan Xu, xujyan)

SLIDE 1

APACHE COTTON

MySQL on Mesos

Yan Xu xujyan

SLIDE 2

SHORT HISTORY

  • Mesos: cornerstone of Twitter’s compute platform.
  • MySQL: backbone of Twitter’s data platform.
  • Mysos: started as a hackweek project @twitter.
  • Apache Cotton: fluffy, elastic & cloudy.
  • Call for collaboration / contribution!

SLIDE 3

WHY MYSQL ON MESOS

  • From manual MySQL administration to self-service.
  • Distributed systems should manage themselves.

SLIDE 4

WHY COTTON

  • Reusability vs. flexibility.
  • Generic schedulers: stateless tier one services.
  • MySQL: stateful & complex.
  • Interactive: Cotton coordinates MySQL instances during their lifecycles.

  • Push common functionality down!

SLIDE 5

COTTON INTRO

SLIDES 6-10

MYSQL ON A MESOS CLUSTER

[Diagram, built up over five frames: a Mesos cluster with mesos-master and mesos-slaves, and frameworks whose services are labeled S, S’, and S”. Later frames add cotton-scheduler among the frameworks, then cotton-executor.]

SLIDE 11

“COTTON CLUSTERS”

[Diagram: one Mesos cluster containing two Cotton clusters; the first Cotton cluster holds two MySQL clusters, the second holds one.]

SLIDE 12

SCHEDULER + EXECUTOR

  • Scheduler: global view, central scheduling, reacts to events (from Mesos and executors), coordinates instances (e.g. Elector).
  • Scheduler deploy, upgrade, HA, service discovery: managed by Aurora; state stored in ZK.
  • Executor: bootstrap, launch, monitor, interact.

SLIDE 13

[Diagram: a client sends CREATE CLUSTER ‘cluster0’ (nodes=3, size=SMALL, user, passwd, backup_id) to the Cotton Scheduler; the scheduler receives resource offers for mesos-slaves via mesos-master.]
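
The request in this diagram can be sketched as a small data structure. This is a minimal illustration, not Cotton's actual API: the class name, the field types, and the resources behind SMALL are assumptions; only the field names come from the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical size presets: the slide names SMALL but not its resources.
SIZE_PRESETS: Dict[str, Dict[str, float]] = {
    "SMALL": {"cpus": 1.0, "mem_mb": 1024.0, "disk_mb": 4096.0},
}


@dataclass(frozen=True)
class CreateClusterRequest:
    """Hypothetical model of the CREATE CLUSTER arguments shown on the slide."""
    name: str
    nodes: int
    size: str
    user: str
    passwd: Optional[str] = None      # optional on input
    backup_id: Optional[str] = None   # optional: restore from this backup

    def validate(self) -> None:
        if not self.name:
            raise ValueError("cluster name is required")
        if self.nodes < 1:
            raise ValueError("nodes must be at least 1")
        if self.size not in SIZE_PRESETS:
            raise ValueError("unknown size: %s" % self.size)

    def task_resources(self) -> List[Dict[str, float]]:
        """One resource demand per MySQL instance, to be matched against offers."""
        self.validate()
        return [dict(SIZE_PRESETS[self.size]) for _ in range(self.nodes)]


request = CreateClusterRequest(name="cluster0", nodes=3, size="SMALL", user="app")
print(len(request.task_resources()))  # one entry per requested node
```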

SLIDE 14

[Diagram: the Cotton Scheduler launches 3 tasks; a LAUNCH TASK goes to each of the mesos-slaves that will host instances A, B, and C.]

SLIDE 15

[Diagram: instances A, B, and C each report RUNNING; the Cotton Scheduler sees 3 tasks RUNNING.]

SLIDE 16

[Diagram: the Cotton Scheduler asks instances A, B, and C for their GTID_EXECUTED.]

SLIDE 17

[Diagram: instances A, B, and C report their GTIDs to the Cotton Scheduler, which applies Master = Slave(MAX(GTIDs)).]
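
The rule Master = Slave(MAX(GTIDs)) can be sketched as follows. This is a simplified illustration, not Cotton's Elector: it assumes each instance reports a MySQL GTID set string and ranks instances by the number of transactions in that set. The function names are hypothetical, and a transaction count cannot distinguish GTID sets that have diverged.

```python
def gtid_count(gtid_set):
    """Count transactions in a GTID set such as 'uuid:1-5:7,uuid2:1-3'."""
    total = 0
    for part in gtid_set.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        # The first field is the server UUID; the rest are intervals.
        _uuid, *intervals = part.split(":")
        for interval in intervals:
            start, _, end = interval.partition("-")
            total += int(end or start) - int(start) + 1
    return total


def elect_master(gtid_sets):
    """Pick the instance with the most executed transactions.

    gtid_sets maps instance name -> reported GTID set string.
    Ties go to the lexicographically smallest name, for determinism.
    """
    if not gtid_sets:
        raise ValueError("no candidates to elect")
    return max(sorted(gtid_sets), key=lambda name: gtid_count(gtid_sets[name]))


reported = {"A": "u:1-10", "B": "u:1-8", "C": "u:1-10"}
print(elect_master(reported))
```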

SLIDE 18

[Diagram: the Cotton Scheduler sends ELECT A; A is elected.]

SLIDE 19

[Diagram: the Cotton Scheduler sends PROMOTE to A and REPARENT to B and C, and records under zk:// that the master is A and the slaves are B and C.]

SLIDE 20

[Diagram: instance A is unhealthy and host C is down; the Cotton Scheduler sends ELECT B.]

SLIDE 21

[Diagram: the Cotton Scheduler launches tasks D and E.]
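
The failure handling in the last two diagrams can be sketched as one planning step. This is an illustrative sketch under stated assumptions, not Cotton's scheduler code: the function name, the action tuples, and the integer replication positions are all hypothetical.

```python
def plan_recovery(master, slaves, failed, desired_size, positions):
    """Return (new_master, remaining_slaves, actions) after instances fail.

    positions maps instance name -> how far it has replicated (higher is
    more up to date); instances without an entry count as 0.
    """
    survivors = [name for name in slaves if name not in failed]
    actions = []
    if master in failed:
        if not survivors:
            raise RuntimeError("no healthy slave left to promote")
        # Promote the most up-to-date survivor; ties go to the smallest name.
        master = max(sorted(survivors), key=lambda name: positions.get(name, 0))
        survivors.remove(master)
        actions.append(("ELECT", master))
        actions.extend(("REPARENT", name) for name in survivors)
    # Launch replacements until the cluster is back to its requested size.
    missing = desired_size - (1 + len(survivors))
    if missing > 0:
        actions.append(("LAUNCH", missing))
    return master, survivors, actions


# The scenario on the slides: A is unhealthy and the host of C is down.
print(plan_recovery("A", ["B", "C"], {"A", "C"}, 3, {"B": 8}))
```

In this scenario the plan elects B and launches two replacement tasks, which corresponds to tasks D and E in the diagram.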

SLIDE 22

SERVICE DISCOVERY

  • MySQL connection: ip, port, user, password
  • API input: cluster name, user(, password)
  • API output: (immediate) ZooKeeper path(, password)
  • Cotton publishes to ZK:
      • ip:port
      • master/slave roles
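
A client-side lookup against what Cotton publishes might look like the sketch below. The znode layout (a `master` and a `slaves` group under the cluster path) and the JSON payload with a `serviceEndpoint` field are assumptions made for illustration. The function works on a snapshot dictionary rather than a live ZooKeeper connection, so it runs without a ZK client library.

```python
import json


def resolve_cluster(znodes, cluster_path):
    """Resolve master and slave endpoints from a snapshot of ZK nodes.

    znodes maps znode path -> bytes payload. The layout and payload
    format used here are hypothetical.
    """
    def endpoints(group):
        prefix = "%s/%s/" % (cluster_path, group)
        found = []
        for path in sorted(znodes):
            if path.startswith(prefix):
                record = json.loads(znodes[path].decode("utf-8"))
                endpoint = record["serviceEndpoint"]
                found.append("%s:%d" % (endpoint["host"], endpoint["port"]))
        return found

    masters = endpoints("master")
    return {"master": masters[0] if masters else None,
            "slaves": endpoints("slaves")}


snapshot = {
    "/cotton/cluster0/master/member_0000000001":
        b'{"serviceEndpoint": {"host": "10.0.0.1", "port": 31001}}',
    "/cotton/cluster0/slaves/member_0000000002":
        b'{"serviceEndpoint": {"host": "10.0.0.2", "port": 31002}}',
    "/cotton/cluster0/slaves/member_0000000003":
        b'{"serviceEndpoint": {"host": "10.0.0.3", "port": 31003}}',
}
print(resolve_cluster(snapshot, "/cotton/cluster0"))
```

A real client would re-run the lookup whenever a ZK watch fires, since the master can change after a failover.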

SLIDE 23

DISCOVERY MECHANISMS

  • ServerSets w/ ZK libraries
  • Proxying (haproxy-marathon-bridge)
  • DNS (mesos-dns)
  • Scheduler REST API
  • Mesos commons library

SLIDE 24

MONITORING

  • Watching ZK group:
      • When is the MySQL cluster ready? (e.g. in an hour)
      • When has the old master died and a new master been elected?
  • Monitoring stats: Is the cluster healthy? Is the Cotton service healthy?
      • <scheduler>/vars.json
      • Mesos exported stats (container stats, <slave>/monitor/statistics.json)
      • Executor exported stats (MySQL specific)
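
A health check built on `<scheduler>/vars.json` could be sketched like this. The metric names (`cluster.running_tasks`, `cluster.has_master`) are hypothetical placeholders, not names Cotton is known to export; only the `/vars.json` endpoint comes from the slide.

```python
import json
from urllib.request import urlopen


def fetch_vars(scheduler_url, timeout=5.0):
    """Fetch and decode <scheduler>/vars.json."""
    url = scheduler_url.rstrip("/") + "/vars.json"
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def unhealthy_reasons(stats, expected_nodes):
    """Return a list of problems found in a stats dict (empty means healthy).

    The metric names below are assumptions made for illustration.
    """
    reasons = []
    running = stats.get("cluster.running_tasks", 0)
    if running < expected_nodes:
        reasons.append("only %d of %d tasks running" % (running, expected_nodes))
    if not stats.get("cluster.has_master", 0):
        reasons.append("no master elected")
    return reasons


sample = {"cluster.running_tasks": 2, "cluster.has_master": 1}
print(unhealthy_reasons(sample, expected_nodes=3))
```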

SLIDE 25

USING COTTON

SLIDE 26

CUSTOMIZING COTTON

  • Organization level (e.g. Twitter):
      • Merge custom scripts and configs.
      • Implement Installer, BackupStore interfaces.
  • Cotton cluster level (e.g. Devel): scheduler flags.
  • MySQL cluster level (e.g. Ads): API request arguments.

SLIDE 27

BACKUP STORE

  • Discover, fetch, decrypt, extract, post-processing, etc.
  • Customization: TwitterBackupStore
      • Flags: --backup_store_args (e.g. HDFS args).
      • API args: backup_id.
          • Organization specific: group, cluster, partition, timestamp, etc.
          • Supporting defaults and conventions: latest, etc.
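
A custom backup store could be sketched as an abstract interface plus one implementation. The method name `restore`, its signature, and the `LocalBackupStore` class are hypothetical and not Cotton's actual BackupStore interface. The sketch illustrates the "latest" convention by resolving it to the greatest backup id, which assumes ids sort chronologically.

```python
import abc
import os


class BackupStore(abc.ABC):
    """Hypothetical interface: fetch a backup and unpack it into a directory."""

    @abc.abstractmethod
    def restore(self, backup_id, target_dir):
        """Restore backup_id into target_dir and return the resolved id."""


class LocalBackupStore(BackupStore):
    """Toy store holding backups in memory as {backup_id: {filename: bytes}}."""

    def __init__(self, backups):
        self._backups = dict(backups)

    def _resolve(self, backup_id):
        if backup_id == "latest":
            if not self._backups:
                raise KeyError("no backups available")
            # Assumes ids such as dates, which sort chronologically.
            return max(self._backups)
        if backup_id not in self._backups:
            raise KeyError("unknown backup: %s" % backup_id)
        return backup_id

    def restore(self, backup_id, target_dir):
        resolved = self._resolve(backup_id)
        for filename, payload in self._backups[resolved].items():
            with open(os.path.join(target_dir, filename), "wb") as handle:
                handle.write(payload)
        return resolved


store = LocalBackupStore({
    "2015-08-01": {"dump.sql": b"-- old"},
    "2015-08-20": {"dump.sql": b"-- new"},
})
```

An organization-specific store would replace the in-memory dictionary with its own discover, fetch, decrypt, and extract steps behind the same method.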

SLIDE 28

INSTALLER

  • Discover, fetch, install.
  • Customization: TwitterPackageInstaller
      • Twitter uses its own MySQL releases (with its own custom config variables).
      • Auxiliary utils and libs.
      • Flags: --installer_args (e.g. package management args).
      • API args: package.
          • release, version, tags (latest, live), etc.
  • Filesystem isolation (target 0.25): MySQL as a separate layer.

SLIDE 29

CONFIGS AND SCRIPTS

  • Customization: Executor files
      • e.g. my.cnf.
      • Unpacked to disk upon launch.
  • Flags: --executor_environ.
      • JSON => ExecutorInfo::command::environment.
      • e.g. MYSOS_DEFAULTS_FILE=<custom_my_cnf>.
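
The mapping from the `--executor_environ` JSON to `ExecutorInfo::command::environment` can be sketched as below. The flag's JSON shape (a list of name/value objects, mirroring Mesos `Environment.Variable`) is an assumption, the function name is hypothetical, and the file path in the example is a placeholder.

```python
import json


def executor_environment(flag_value):
    """Turn the flag's JSON into a dict shaped like ExecutorInfo.command.environment.

    Assumes the flag holds a JSON list of {"name": ..., "value": ...} objects.
    """
    variables = json.loads(flag_value)
    if not isinstance(variables, list):
        raise ValueError("expected a JSON list of name/value objects")
    environment = []
    for variable in variables:
        name, value = variable["name"], variable["value"]
        if not isinstance(name, str) or not isinstance(value, str):
            raise ValueError("name and value must both be strings")
        environment.append({"name": name, "value": value})
    return {"command": {"environment": {"variables": environment}}}


# Placeholder path standing in for <custom_my_cnf>.
flag = '[{"name": "MYSOS_DEFAULTS_FILE", "value": "/path/to/custom_my.cnf"}]'
print(executor_environment(flag))
```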

SLIDE 30

DEPLOYING COTTON

  • Cotton (PyPI) + Your customization => Your Cotton.
  • PEX: Python EXecutable with all its dependencies bundled.
      • cotton_executor.pex: fetchable by mesos-fetcher.
      • cotton_scheduler.pex: managed by Aurora.

SLIDE 31

DEVELOPING COTTON

SLIDE 32

RETAINING MYSQL STATE

  • Persistent volumes: available since Mesos 0.23.
  • Reserve hosts for Cotton through roles.
  • Scheduler: create persistent volumes.
  • Executor: sandbox bind mounted from a persistent volume (a directory managed by Mesos and not garbage-collected).
  • Scheduler: look for the old volumes from offers and reuse if possible.
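
The last step, finding old volumes in incoming offers, can be sketched as follows. Offers are modeled as simplified dictionaries loosely following the Mesos `Resource` message (`name`, `role`, `disk.persistence.id`). The function name, the flat offer `id`, and the role name are assumptions for illustration.

```python
def find_reusable_volumes(offers, role, wanted_ids):
    """Map each wanted persistence id to the offer that carries its volume.

    offers: list of {"id": ..., "resources": [...]} dicts (simplified shape).
    role: the role the hosts were reserved for.
    wanted_ids: persistence ids of volumes created for earlier instances.
    """
    found = {}
    for offer in offers:
        for resource in offer.get("resources", []):
            persistence = resource.get("disk", {}).get("persistence")
            if (resource.get("name") == "disk"
                    and resource.get("role") == role
                    and persistence
                    and persistence["id"] in wanted_ids):
                found[persistence["id"]] = offer["id"]
    return found


offers = [
    {"id": "offer-1", "resources": [
        {"name": "cpus", "role": "cotton"},
        {"name": "disk", "role": "cotton",
         "disk": {"persistence": {"id": "cluster0-A"}}},
    ]},
    {"id": "offer-2", "resources": [
        {"name": "disk", "role": "cotton"},  # plain reserved disk, no volume
    ]},
]
print(find_reusable_volumes(offers, "cotton", {"cluster0-A", "cluster0-B"}))
```

A volume that never shows up in an offer, `cluster0-B` here, would mean its instance has to be started on fresh disk instead of reusing old state.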

SLIDE 33

ROAD AHEAD

  • Client auth & authz; admin privilege.
  • Replicate and failover using GTID.
  • Isolate disk IO on shared hosts.
  • Multiple MySQL versions per Cotton cluster.
  • Scheduling constraints.
  • Add instances to existing clusters.

SLIDE 34

RESOURCES

  • https://github.com/apache/incubator-cotton
  • https://issues.apache.org/jira/browse/cotton
  • #apache-cotton on freenode.org
  • https://twitter.com/apachecotton
  • dev-subscribe@cotton.incubator.apache.org

SLIDE 35

QUESTIONS?
