Getting Rid of Zookeeper (Jay Guo, Kapil Arya)



SLIDE 1

Getting Rid of Zookeeper

MesosCon Asia 2016

Jay Guo
Software Developer @IBM
guojiannan@cn.ibm.com

Kapil Arya
Mesos Committer @Mesosphere
kapil@mesosphere.io

SLIDE 2

Motivation

SLIDE 3

Zookeeper is...

  • Mature
  • Feature-rich
  • ...

SLIDE 4

But!

  • Primitive K/V store
    ○ Provide your own tooling for other abstractions!
  • Heavy
  • Hard dependencies
  • Language bindings instead of a RESTful API
  • ...

SLIDE 5

It’s all about having options!

  • Chocolate
  • Strawberry
  • Vanilla
  • ...

SLIDE 6

Mesos HA: An Overview

SLIDE 7

High Availability

First of all, we need a distributed key-value store...

SLIDE 8

Mesos HA

  • At least three Mesos Masters
  • One leading Master
    ○ Leader election
    ○ Leader detection
  • Replicated Log

Zookeeper as the distributed key-value store

[Diagram: Zookeeper cluster of three ZK nodes; three Mesos Masters, one marked leader; Mesos Agents / Frameworks.]

SLIDE 9

Mesos HA: Leader Election

  • All Masters “contend” to be the leader!

[Diagram: three ZK nodes; three Mesos Masters, each labeled “Contend”.]

SLIDE 10

Mesos HA: Leader Election

  • All Masters “contend” to be the leader!
  • Only one succeeds; others fail

[Diagram: three ZK nodes; three Mesos Masters labeled “Fail”, “Fail”, and “Success”.]

SLIDE 11

Mesos HA: Leader Election

  • All Masters “contend” to be the leader!
  • Only one succeeds; others fail
  • We have a leading Master!

[Diagram: three ZK nodes; three Mesos Masters labeled “Watch”, “Watch”, and “Hold”.]
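
The election steps on slides 9 to 11 can be sketched with a single "create only if absent" operation on the KV store. This is a minimal in-memory illustration, not the Mesos implementation: the `KVStore` class, the `/mesos/leader` key, and the free function `contend` are hypothetical names, and a real store (Zookeeper, Etcd, ...) would have to make the create atomic across the whole cluster.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory stand-in for a distributed KV store.
class KVStore {
public:
  // Creates `key` only if it does not exist yet; returns true on success.
  bool createIfAbsent(const std::string& key, const std::string& value) {
    return data_.emplace(key, value).second;
  }

  // Returns the stored value, or an empty string if the key is absent.
  std::string get(const std::string& key) const {
    std::map<std::string, std::string>::const_iterator it = data_.find(key);
    return it == data_.end() ? std::string() : it->second;
  }

private:
  std::map<std::string, std::string> data_;
};

// A Master contends by trying to claim the (hypothetical) leader key with
// its own address. Only the first contender can succeed; the rest fail.
bool contend(KVStore& store, const std::string& masterAddress) {
  assert(!masterAddress.empty());
  return store.createIfAbsent("/mesos/leader", masterAddress);
}
```

With three Masters calling `contend` against the same store, the first call returns true ("Success" and then "Hold"), and the other two return false ("Fail" and then "Watch").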

SLIDE 12

Mesos HA: Losing a Leader

  • Suppose the leading Master is “lost”

[Diagram: three ZK nodes; two Mesos Masters labeled “Watch”; the third labeled “Master connection lost”.]

SLIDE 13

Mesos HA: Losing a Leader

  • Suppose the leading Master is “lost”
  • All other Masters are notified

[Diagram: three ZK nodes; two Mesos Masters labeled “Notify”; the third labeled “Master connection lost”.]

SLIDE 14

Mesos HA: Losing a Leader

  • Suppose the leading Master is “lost”
  • All other Masters are notified
  • The remaining Masters contend again

[Diagram: three ZK nodes; three Mesos Masters, two of them labeled “Contend”.]

SLIDE 15

Mesos HA: Losing a Leader

  • Suppose the leading Master is “lost”
  • All other Masters are notified
  • The remaining Masters contend again
  • One of them succeeds

[Diagram: three ZK nodes; three Mesos Masters, two of them labeled “Success” and “Fail”.]

SLIDE 16

Mesos HA: Losing a Leader

  • Suppose the leading Master is “lost”
  • All other Masters are notified
  • The remaining Masters contend again
  • One of them succeeds
  • A new leader is elected!

[Diagram: three ZK nodes; three Mesos Masters, two of them labeled “Watch” and “Hold”.]
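
The failover sequence on slides 12 to 16 (leader lost, others notified, contend again, one succeeds) can be sketched as a watch on the leader key. This is a single-process illustration under stated assumptions: `WatchableStore`, `Master`, and `kLeaderKey` are hypothetical names, and calling `remove` by hand stands in for the store dropping the entry when the leader's connection is lost.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical key name; the slides do not say which key is used.
static const char* const kLeaderKey = "/mesos/leader";

// Hypothetical in-memory KV store with one-shot "key removed" watches.
class WatchableStore {
public:
  typedef std::function<void()> Callback;

  bool createIfAbsent(const std::string& key, const std::string& value) {
    return data_.emplace(key, value).second;
  }

  std::string get(const std::string& key) const {
    std::map<std::string, std::string>::const_iterator it = data_.find(key);
    return it == data_.end() ? std::string() : it->second;
  }

  // Registers a callback that fires once when `key` is removed.
  void watch(const std::string& key, Callback callback) {
    watchers_[key].push_back(std::move(callback));
  }

  // Removes `key` and notifies every watcher registered so far.
  void remove(const std::string& key) {
    if (data_.erase(key) == 0) return;
    std::vector<Callback> pending;
    pending.swap(watchers_[key]);  // callbacks may register new watches
    for (Callback& callback : pending) callback();
  }

private:
  std::map<std::string, std::string> data_;
  std::map<std::string, std::vector<Callback> > watchers_;
};

class Master {
public:
  Master(WatchableStore& store, const std::string& address)
    : store_(store), address_(address), leading_(false) {
    assert(!address.empty());
  }

  // Tries to become leader; on failure, watches the leader key and
  // contends again once the current leader's entry disappears.
  void contend() {
    leading_ = store_.createIfAbsent(kLeaderKey, address_);
    if (!leading_) {
      store_.watch(kLeaderKey, [this]() { contend(); });
    }
  }

  bool leading() const { return leading_; }

private:
  WatchableStore& store_;
  std::string address_;
  bool leading_;
};
```

In this sketch the watcher that registered first wins the re-election simply because callbacks run in registration order; a real store makes no such promise, which is why every remaining Master contends again.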

SLIDE 17

What about Agents/Frameworks?

SLIDE 18

Mesos HA: Leader Detection

  • Framework/Agent connects to Zookeeper to “detect” the current leading Master

[Diagram: three ZK nodes; three Mesos Masters labeled “Watch”, “Watch”, and “Hold”; Mesos Agents / Frameworks labeled “Detect”.]

SLIDE 19

Mesos HA: Leader Detection

  • Framework/Agent connects to Zookeeper to “detect” the current leading Master
  • Zookeeper provides Master’s location
    ○ I.e., IP:Port

[Diagram: three ZK nodes; three Mesos Masters labeled “Watch”, “Watch”, and “Hold”; Mesos Agents / Frameworks labeled “IP:Port”.]

SLIDE 20

Mesos HA: Leader Detection

  • Framework/Agent connects to Zookeeper to “detect” the current leading Master
  • Zookeeper provides Master’s location
  • Framework/Agent connects to the “leader”

[Diagram: three ZK nodes; three Mesos Masters labeled “Watch”, “Watch”, and “Hold”; Mesos Agents / Frameworks labeled “Connect”.]
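
Detection as described on slides 18 to 20 amounts to reading the leader's location from the store and then connecting to it. A minimal sketch, assuming the location is stored as a plain `IP:Port` string under a hypothetical `/mesos/leader` key (the real record in Mesos carries more than that); `MasterLocation`, `parseLocation`, and `detect` are illustrative names, and a `std::map` stands in for the KV store.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical, simplified view of where the leading Master lives.
struct MasterLocation {
  std::string ip;
  int port;
};

// Parses an "IP:Port" string; throws if there is no colon or no valid port.
MasterLocation parseLocation(const std::string& value) {
  const std::string::size_type colon = value.rfind(':');
  if (colon == std::string::npos) {
    throw std::invalid_argument("expected IP:Port, got: " + value);
  }
  return MasterLocation{value.substr(0, colon),
                        std::stoi(value.substr(colon + 1))};
}

// Looks up the leader key in a snapshot of the KV store. Returns false if
// no leader is currently registered; otherwise fills in `leader`.
bool detect(const std::map<std::string, std::string>& store,
            MasterLocation* leader) {
  assert(leader != nullptr);
  std::map<std::string, std::string>::const_iterator it =
      store.find("/mesos/leader");
  if (it == store.end()) return false;
  *leader = parseLocation(it->second);
  return true;
}
```

An Agent or Framework would call `detect` again whenever it is told the key changed, then reconnect to the new `ip` and `port`.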

SLIDE 21

What about Replicated Log?

Replicated Log lets you create replicated fault-tolerant append-only logs. The Mesos master uses Replicated Log to store cluster state in a replicated, durable way.

SLIDE 22

Mesos HA: Replicated Log

  • Each replica registers its pid in ZK and maintains its presence.

[Diagram: three ZK nodes; two Replicas, each labeled “Register & hold”.]

SLIDE 23

Mesos HA: Replicated Log

  • Each replica registers its pid in ZK and maintains its presence.
  • When a new replica joins the cluster, the existing ones are notified and learn the pid of the new replica.

[Diagram: three ZK nodes; three Replicas: the new one labeled “register”, the two existing ones labeled “Notified with info of new replica”.]

SLIDE 24

Mesos HA: Replicated Log

  • Each replica registers its own pid in ZK and maintains its presence.
  • When a new replica joins the cluster, the existing ones are notified and learn the pid of the new replica.
  • Every replica knows all nodes in the cluster, and they run Paxos.

[Diagram: three ZK nodes; three Replicas, each labeled “Paxos”.]
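
The membership flow on slides 22 to 24 can be sketched as a group that records each replica's pid and tells every member about the current membership. This is an in-memory illustration only: `ReplicaGroup`, its `join` method, and the pid strings used with it are hypothetical, and `quorumSize` shows the usual majority rule for Paxos rather than anything stated on the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for the pid registry kept in the KV store.
class ReplicaGroup {
public:
  typedef std::function<void(const std::set<std::string>&)> Listener;

  // Registers `pid` and notifies every listener, including the new one,
  // with the full membership so that each replica knows all nodes.
  void join(const std::string& pid, Listener listener) {
    assert(!pid.empty());
    members_.insert(pid);
    listeners_.push_back(std::move(listener));
    for (const Listener& notify : listeners_) notify(members_);
  }

  const std::set<std::string>& members() const { return members_; }

private:
  std::set<std::string> members_;
  std::vector<Listener> listeners_;
};

// Majority quorum: the smallest number of replicas that is more than half.
std::size_t quorumSize(std::size_t replicas) {
  assert(replicas > 0);
  return replicas / 2 + 1;
}
```

With three replicas, `quorumSize(3)` is 2, so the log can still make progress with one replica down.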

SLIDE 25

Replacing Zookeeper

[Diagram: three Mesos Masters, one marked leader; the ZK nodes behind them give way to a Distributed KV Store marked “?”, where “?” = ZK, Etcd, Consul, ...]

SLIDE 26

Three Key Components

  • Master Contender for leader election

bool contend();

[Diagram: Mesos Master with a Contender; Distributed KV Store marked “?” in place of ZK.]

SLIDE 27

Three Key Components

  • Master Contender for leader election
  • Master Detector for discovery

bool contend();
MasterInfo detect(MasterInfo previous);

[Diagram: Mesos Master with a Contender and a Detector; Distributed KV Store marked “?” in place of ZK.]

SLIDE 28

Three Key Components

  • Master Contender for leader election
  • Master Detector for discovery
  • PIDGroup for initialization

bool contend();
MasterInfo detect(MasterInfo previous);
void initialize(pid_t pid);

[Diagram: Mesos Master with a Contender, a Detector, and a PIDGroup; Distributed KV Store marked “?” in place of ZK.]
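
The three signatures on the slide can be read as three small interfaces that any KV-store backend has to implement. The sketch below follows the simplified signatures shown above, not the actual Mesos headers: `MasterInfo` and `Pid` are stand-in types, and the `Fake*` classes with their shared `FakeStore` are hypothetical in-memory implementations for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>

// Stand-in types, assumed for this sketch only.
struct MasterInfo {
  std::string ip;
  int port;
};
typedef std::string Pid;

class Contender {
public:
  virtual ~Contender() {}
  virtual bool contend() = 0;
};

class Detector {
public:
  virtual ~Detector() {}
  virtual MasterInfo detect(MasterInfo previous) = 0;
};

class PIDGroup {
public:
  virtual ~PIDGroup() {}
  virtual void initialize(Pid pid) = 0;
};

// Shared state standing in for the distributed KV store.
struct FakeStore {
  bool hasLeader;
  MasterInfo leader;
  std::set<Pid> pids;
  FakeStore() : hasLeader(false), leader() {}
};

class FakeContender : public Contender {
public:
  FakeContender(FakeStore& store, const MasterInfo& self)
    : store_(store), self_(self) {}

  // Succeeds only if nobody holds leadership yet.
  bool contend() override {
    if (store_.hasLeader) return false;
    store_.leader = self_;
    store_.hasLeader = true;
    return true;
  }

private:
  FakeStore& store_;
  MasterInfo self_;
};

class FakeDetector : public Detector {
public:
  explicit FakeDetector(FakeStore& store) : store_(store) {}

  // A real detector would wait until the leader differs from `previous`;
  // this sketch just returns the current value.
  MasterInfo detect(MasterInfo /*previous*/) override { return store_.leader; }

private:
  FakeStore& store_;
};

class FakePIDGroup : public PIDGroup {
public:
  explicit FakePIDGroup(FakeStore& store) : store_(store) {}

  // Records this replica's pid in the shared store.
  void initialize(Pid pid) override {
    assert(!pid.empty());
    store_.pids.insert(pid);
  }

private:
  FakeStore& store_;
};
```

Swapping the backend then means providing another set of three classes behind the same interfaces, which is the case the next slide makes for modules.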

SLIDE 29

A Case for Modularization!

  • There are already clear-cut interfaces between:
    ○ Master and Contender
    ○ Agent and Detector
    ○ Framework and Detector
  • For a new distributed KV store implementation, we just write a module, without having to modify Mesos itself!

[Diagram: Mesos Master with a Contender, a Detector, and a PIDGroup; Distributed KV Store marked “?” in place of ZK.]

SLIDE 30

Let’s Talk about Modules!

SLIDE 31

Mesos Modules

  • Module/Plugin/Extension
  • Add/replace a Mesos component
    ○ Isolators
    ○ Authenticators
    ○ …
  • Hook modules:
    ○ Listen to interesting events
    ○ Modify/enhance certain code paths
    ○ Prepare/enhance task environment
    ○ ...

SLIDE 32

How are Modules Used?

  • Compiled as shared libraries
    ○ E.g., libmesos_network_overlay.so
  • Specified when launching Master/Agent/Framework

mesos-agent.sh <master-parameters> --modules=file:///path/to/modules.json --isolation="my_isolator"

  • Gets loaded during initialization
    ○ E.g., the “my_isolator” isolator will be loaded into the Agent to provide task isolation

SLIDE 33

Community Modules

I just wrote a Mesos module that provides a really cute feature. How do I make it useful for others?

SLIDE 34

Modules are Tricky!

  • Developing
  • Building
  • Testing
  • Using
  • Hosting
  • How can we make it all better for the community?

SLIDE 35

Writing Modules

  • Doesn’t require intimate Mesos knowledge
    ○ Just the details of the subsystem being implemented (e.g., Isolators)
  • Familiarity with the Mesos model is required
    ○ E.g., libprocess, events, futures and promises, etc.
  • Closely tied to the Mesos version
    ○ To ensure mutual compatibility

SLIDE 36

Building Modules: Issues

  • Build Mesos first!
    ○ Install all Mesos dependencies
    ○ Takes a long time to build
    ○ Version dependencies

SLIDE 37

Building Modules: Good News!

  • Starting with the Mesos 1.0 release, pre-compiled Mesos deb/rpm packages contain everything needed to build modules

SLIDE 38

Testing Modules

I just wrote a simple Mesos module that provides a cute feature and I know how to build it! Can I write unit tests for it?

SLIDE 39

Testing Modules

  • Key questions:
    ○ How to get good test coverage?
    ○ How can we solicit help from the community?
  • Good news!
    ○ Efforts are under way to create a “libmesos_test” library that can be used to create/run gmock-style tests, just like in Mesos itself.

SLIDE 40

Community-Driven Modules

How do we, as a community, make third-party modules available for general consumption, while making sure that developers and consumers can seamlessly test and integrate them into their environments?

SLIDE 41

Community Modules: Proposal

  • A central registry that contains pointers:
    ○ E.g., github.com/mesos/modules
    ○ Each module (or a set of related modules) in its own repository
  • Make Mesos version-specific binary rpm/deb modules available
    ○ E.g., lib_my_module_<module-version>_<mesos_version>.so

SLIDE 42

Module CI: Coming Soon!

  • Builds binary packages for every registered module
    ○ Across a given set of Mesos versions
    ○ Work-in-progress!
  • Automatic build/testing for upcoming Mesos releases
    ○ Catch incompatibilities sooner!
  • Run tests!

SLIDE 43

Let’s take a look at Etcd!

SLIDE 44

Etcd: A Distributed KV Store

  • HTTP API (no language bindings)
  • May already exist in your environment

SLIDE 45

Etcd in a Mesos Cluster

  • Create Etcd-specific modules for:
    ○ Master Detector
    ○ Master Contender
    ○ PIDGroup
  • No need to modify/rebuild Mesos

[Diagram: Mesos Master with a Contender, a Detector, and a PIDGroup; the Distributed KV Store is now Etcd in place of ZK.]
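
Because Etcd exposes a plain HTTP API, a contender can be expressed as one HTTP request. The sketch below only builds such a request, assuming the etcd v2 keys API (a `PUT` with `prevExist=false` creates the key only if it is absent, and `ttl` makes it expire unless refreshed). Whether the Etcd modules linked at the end of the deck work this way is not stated in the slides; `HttpRequest`, `buildContendRequest`, and the `/mesos/leader` key are hypothetical, and the value is not URL-encoded here.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Minimal description of an HTTP request; sending it is left to any client.
struct HttpRequest {
  std::string method;
  std::string url;
  std::string body;
};

// Builds a "create only if absent, with expiry" request for the etcd v2
// keys API. `key` must start with '/', e.g. "/mesos/leader".
HttpRequest buildContendRequest(const std::string& endpoint,
                                const std::string& key,
                                const std::string& value,
                                int ttlSeconds) {
  assert(!key.empty() && key[0] == '/');
  assert(ttlSeconds > 0);
  std::ostringstream body;
  body << "value=" << value << "&ttl=" << ttlSeconds;
  return HttpRequest{"PUT",
                     endpoint + "/v2/keys" + key + "?prevExist=false",
                     body.str()};
}
```

The Master whose request succeeds holds leadership and has to refresh the key before the TTL runs out; the others get an error response and would watch the key instead.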

SLIDE 46

Again, it’s all about having options!

  • Chocolate
  • Strawberry
  • Vanilla
  • ...

SLIDE 47

Again, it’s all about having options!

  • Chocolate
    ○ Zookeeper
  • Strawberry
    ○ Etcd
  • Vanilla
    ○ Consul
  • ...

SLIDE 48

Demo!

SLIDE 49

Module CI: A Glimpse!

SLIDE 50

Acknowledgments!

  • Shuai Lin
  • Cody Maloney
  • Benjamin Hindman
  • Joseph Wu

SLIDE 51

Thanks!

  • Etcd modules:
    ○ https://github.com/guoger/mesos-etcd-module/tree/1.1.x
    ○ https://github.com/guoger/mesos/tree/pid-group-on-1.1.x