Cluster management at Google, 2015-02, john wilkes



slide-1
SLIDE 1

Cluster management at Google

2015-02

john wilkes / johnwilkes@google.com Principal Software Engineer

slide-2
SLIDE 2

Images by Connie Zhou

For the past 15 years, Google has been building out the world’s fastest, most powerful, highest quality cloud infrastructure on the planet.

slide-3
SLIDE 3

job hello_world = {
  runtime = { cell = 'ic' }              // What cluster should we run in?
  binary = '.../hello_world_webserver'   // What program are we to run?
  args = { port = '%port%' }             // Command line parameters
  requirements = {                       // Resource requirements
    ram = 100M
    disk = 100M
    cpu = 0.1
  }
  replicas = 5                           // Number of tasks
}

10000

Hello World

slide-4
SLIDE 4

> borgcfg .../hello_world_webserver.borg up
...
About to affect 10000 tasks and 1 packages on cell IC.
Do you wish to continue (yes/no) [no]? yes

==== Staging package hello_world_webserver.63ce1b965155c75e/johnwilkes on ic... SUCCESS
==== Making package hello_world_webserver.63ce1b965155c75e/johnwilkes on ic... SUCCESS
==== Starting job hello_world on ic... SUCCESS

Hello World

slide-5
SLIDE 5

Hello World

slide-6
SLIDE 6

What just happened?

[Diagram: a Borg cell. Labeled components: config file, binary, borgcfg, web browsers, five BorgMaster replicas (each with a link shard and a read/UI shard), scheduler, persistent store (Paxos), and Borglets.]

Hello World

slide-7
SLIDE 7


Hello World

slide-8
SLIDE 8

Hello World

slide-9
SLIDE 9

task-eviction rates and causes


Failures

slide-10
SLIDE 10


A 2000-machine service will have >10 machine crashes per day

  • DRAM errors (1% AFR)
  • Disk failures (2-10% AFR)
  • Machine crashes (~2/year)
  • OS upgrades (2-6/year)

slide-11
SLIDE 11


A 2000-machine service will have >10 machine crashes per day

This is normal; not a problem

  • DRAM errors (1% AFR)
  • Disk failures (2-10% AFR)
  • Machine crashes (~2/year)
  • OS upgrades (2-6/year)
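The ">10 machine crashes per day" figure can be checked against the per-machine rates listed on this slide. The sketch below is a back-of-the-envelope calculation only; it assumes the rates are per machine per year and a 365-day year, and the function name is illustrative rather than anything from the talk.

```python
def events_per_day(machines: int, events_per_machine_year: float) -> float:
    """Expected fleet-wide events per day for a given per-machine annual rate."""
    return machines * events_per_machine_year / 365.0


MACHINES = 2000  # service size quoted on the slide

# Machine crashes: ~2 per machine per year.
crashes_per_day = events_per_day(MACHINES, 2)

# OS upgrades: 2-6 per machine per year, so compute both ends of the range.
upgrades_per_day = (events_per_day(MACHINES, 2), events_per_day(MACHINES, 6))

print(f"machine crashes/day: {crashes_per_day:.1f}")
print(f"OS upgrades/day: {upgrades_per_day[0]:.1f} to {upgrades_per_day[1]:.1f}")
```

With these assumptions, crashes alone come to just under 11 per day, which is consistent with the slide's point that failures at this scale are routine.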

slide-12
SLIDE 12

Advanced bin-packing algorithms

Experimental placement of production VM workload, July 2014

Efficiency

slide-13
SLIDE 13

Advanced bin-packing algorithms

There are no obvious bucket sizes (cf. cloud VMs)


nice round numbers gaming the system

Efficiency

slide-14
SLIDE 14

Advanced bin-packing algorithms

Heterogeneous workloads, May 2011
Omega paper, EuroSys 2013

[Plot: CDF of job runtime (log scale) for batch jobs and service jobs]

Efficiency

slide-15
SLIDE 15


Utilization: sharing clusters between prod/batch helps

Efficiency

slide-16
SLIDE 16

Utilization: sharing clusters between prod/batch helps


Efficiency

slide-17
SLIDE 17

Heterogeneity and dynamicity of clouds at scale: Google trace analysis. SoCC’12

Advanced bin-packing algorithms

Data from a cluster with 12k machines, May 2011
Trace is publicly available

Efficiency

slide-18
SLIDE 18

Resource reclamation could be more aggressive

Nov/Dec 2013


Efficiency

slide-19
SLIDE 19

[Plots: tasks/machine and threads/machine]

Multiple applications per machine

CPI^2 paper, EuroSys 2013

Efficiency

slide-20
SLIDE 20

[Plot: distribution of task CPI with markers at μ, μ + σ, μ + 2σ, and μ + 3σ; outliers => victims]

Multiple applications per machine

CPI^2 paper, EuroSys 2013

  1. Gather CPI for all the tasks in a job
  2. Find outliers
  3. Take action
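The first two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the production system: the function name, the task names, the sample CPI values, and the default cutoff of two standard deviations above the mean are all assumptions chosen for the example. The third step, taking action, is left out.

```python
from statistics import mean, pstdev


def find_cpi_outliers(cpi_by_task: dict[str, float], num_sigma: float = 2.0) -> list[str]:
    """Return tasks whose CPI exceeds mean + num_sigma * stddev across the job."""
    values = list(cpi_by_task.values())
    if len(values) < 2:
        return []  # not enough tasks to talk about a distribution
    mu = mean(values)
    sigma = pstdev(values)
    threshold = mu + num_sigma * sigma
    return sorted(task for task, cpi in cpi_by_task.items() if cpi > threshold)


# Step 1: gather CPI for all the tasks in a job (hypothetical sample data).
sample = {f"task-{i}": cpi for i, cpi in enumerate(
    [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.0, 1.0, 3.0])}

# Step 2: find outliers; these are the candidate victims of interference.
print(find_cpi_outliers(sample))
```

Comparing a task against the other tasks of the same job works because they run the same binary, so their CPI should be similar unless something on the machine is interfering.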

Efficiency

slide-21
SLIDE 21

Exposing mechanisms is fragile
Better: declarative intents

Achieving desired behavior

slide-22
SLIDE 22

Achieving desired behavior

Service level objective (SLO)

Examples:

  • availability
  • obtainability
  • reliability
  • velocity
  • freshness?
  • accuracy?
  • security?

an SLO

slide-23
SLIDE 23

web browsers BorgMaster link shard UI shard BorgMaster link shard UI shard BorgMaster link shard UI shard BorgMaster link shard UI shard Cell Scheduler borgcfg web browsers scheduler Borglet Borglet Borglet Borglet BorgMaster link shard read/UI shard Config file

persistent store (Paxos)

A few other moving parts

slide-24
SLIDE 24

app

agent

master

job config

A few other moving parts

slide-25
SLIDE 25

app

agent

master

job config storage

A few other moving parts

slide-26
SLIDE 26

app

agent

master

job config storage

A few other moving parts

slide-27
SLIDE 27

app

agent

master

system config job config storage

A few other moving parts

slide-28
SLIDE 28

monitoring app

agent

master

system config job config storage

A few other moving parts

slide-29
SLIDE 29

app

agent

master

system config monitoring binaries + data distribution job config storage

A few other moving parts

slide-30
SLIDE 30

app

agent

master

system config monitoring security binaries + data distribution job config storage

A few other moving parts

slide-31
SLIDE 31

app

agent

master

system config monitoring security accounting/planning binaries + data distribution job config storage

Diagram from an original by Cody Smith.

A few other moving parts

slide-32
SLIDE 32

app

agent master

system config monitoring security accounting/billing binaries + data distribution job config storage

Diagram from an original by Cody Smith.

A few other moving parts

slide-33
SLIDE 33

Containers

Everything at Google runs in a container -- including our VMs

Containers give us:

  • resource isolation
  • execution isolation
  • CPU QoS

We start over 2 billion containers per week.

Image: "Container" glynlowe CC-BY-2.0 https://www.flickr.com/photos/glynlowe/10921733615

slide-34
SLIDE 34

[Diagram: four machines]

κυβερνήτης:

Greek for “pilot” or “helmsman of a ship”

The open source cluster manager from Google.

Kubernetes

slide-35
SLIDE 35

[Diagram: seven machines, each with a host and a container agent]

Kubernetes

Web server Log roller

slide-36
SLIDE 36

Log roller Web server

[Diagram: seven machines, each with a host and a container agent]

Kubernetes master/scheduler

Pods

slide-37
SLIDE 37

[Diagram: pods labeled FE (x5) and BE (x9) on seven machines, each with a host and a container agent]

Kubernetes master/scheduler

Labels

slide-38
SLIDE 38

[Diagram: pods labeled FE (x5) and BE (x9) on seven machines, each with a host and a container agent]

Kubernetes master/scheduler

Label selectors

labels:
  role: frontend

slide-39
SLIDE 39

[Diagram: seven machines, each with a host and a container agent]

Kubernetes master/scheduler

[Diagram: pods labeled FE (x5) and BE (x9)]

Label selectors

labels:
  role: frontend
  stage: production
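A label selector picks out the pods whose labels contain every key/value pair in the selector. The sketch below shows that matching rule in Python; it is a simplified model for illustration, not the Kubernetes implementation, and the pod names and the `stage: test` value are made up for the example.

```python
def matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """True if every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(pods: list[dict], selector: dict[str, str]) -> list[str]:
    """Names of the pods whose labels satisfy the selector."""
    return [pod["name"] for pod in pods if matches(selector, pod["labels"])]


# Hypothetical pods; only the label keys and values shown on the slides are real.
pods = [
    {"name": "fe-1", "labels": {"role": "frontend", "stage": "production"}},
    {"name": "fe-2", "labels": {"role": "frontend", "stage": "test"}},
    {"name": "be-1", "labels": {"role": "backend", "stage": "production"}},
]

print(select(pods, {"role": "frontend"}))
print(select(pods, {"role": "frontend", "stage": "production"}))
```

Adding a second pair to the selector narrows the match, which is the difference between the two label-selector slides.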

slide-40
SLIDE 40

FE FE FE

replicas: 3
template: ...
labels:
  role: frontend

[Diagram: seven machines, each with a host and a container agent]

Kubernetes - Master/Scheduler

Replica controller

slide-41
SLIDE 41

FE FE FE FE

replicas: 4
template: ...
labels:
  role: frontend

[Diagram: seven machines, each with a host and a container agent]

Kubernetes - Master/Scheduler

Replica controller
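The step from `replicas: 3` to `replicas: 4` illustrates the declarative style: the user states the desired count and the controller closes the gap. Below is a minimal Python sketch of such a reconciliation step. It is an assumption-laden toy, not Kubernetes code: the class, its methods, and the generated pod names are hypothetical, and pods are plain dictionaries.

```python
import itertools


class ReplicaController:
    """Keeps the number of pods carrying its labels equal to `replicas`."""

    def __init__(self, replicas: int, labels: dict[str, str]):
        self.replicas = replicas
        self.labels = dict(labels)
        self._ids = itertools.count(1)  # source of hypothetical pod names

    def _owns(self, pod: dict) -> bool:
        return all(pod["labels"].get(k) == v for k, v in self.labels.items())

    def reconcile(self, pods: list[dict]) -> list[dict]:
        """Return the pod list after starting or stopping pods to hit the target."""
        mine = [p for p in pods if self._owns(p)]
        others = [p for p in pods if not self._owns(p)]
        while len(mine) < self.replicas:  # too few: start pods from the template
            mine.append({"name": f"fe-{next(self._ids)}", "labels": dict(self.labels)})
        del mine[self.replicas:]          # too many: stop the surplus
        return others + mine


rc = ReplicaController(replicas=3, labels={"role": "frontend"})
pods = rc.reconcile([])         # starts three pods
rc.replicas = 4
pods = rc.reconcile(pods)       # starts a fourth
pods = rc.reconcile(pods[1:])   # one pod was lost; a replacement is started
print([p["name"] for p in pods])
```

The same loop handles scaling up, scaling down, and replacing failed pods, because it only ever compares the observed count with the desired one.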

slide-42
SLIDE 42

id: frontend-service
port: 9000
labels:
  role: frontend

[Diagram: frontend-service and four FE pods on seven machines, each with a host and a container agent]

Kubernetes - Master/Scheduler

Service
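A service gives clients one stable name and port, and uses a label selector to find the pods behind it. The Python sketch below models that idea only. The class, the pod addresses, and the round-robin choice of backend are assumptions made for the example; the slide does not say how requests are spread across pods.

```python
import itertools


class Service:
    """A stable id and port in front of whichever pods match a label selector."""

    def __init__(self, service_id: str, port: int, selector: dict[str, str]):
        self.id = service_id
        self.port = port
        self.selector = dict(selector)
        self._counter = itertools.count()

    def endpoints(self, pods: list[dict]) -> list[str]:
        """Addresses of the pods that currently match the selector."""
        return [p["address"] for p in pods
                if all(p["labels"].get(k) == v for k, v in self.selector.items())]

    def route(self, pods: list[dict]) -> str:
        """Pick a backend for one request (round-robin, an assumption)."""
        backends = self.endpoints(pods)
        if not backends:
            raise LookupError(f"no pods match {self.selector}")
        return backends[next(self._counter) % len(backends)]


# Hypothetical pods and addresses.
pods = [
    {"address": "10.0.0.1:8080", "labels": {"role": "frontend"}},
    {"address": "10.0.0.2:8080", "labels": {"role": "frontend"}},
    {"address": "10.0.0.3:8080", "labels": {"role": "backend"}},
]

svc = Service("frontend-service", 9000, {"role": "frontend"})
print(svc.endpoints(pods))
```

Because the endpoint list is recomputed from labels on every call, pods can come and go without clients of `frontend-service` noticing.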

slide-43
SLIDE 43

The open source cluster manager from Google.

  • Pods: groups of containers
  • Labels
  • Replica controller
  • Services

http://kubernetes.io

Kubernetes

slide-44
SLIDE 44

[Plot axes: offered load, resources]

Do it yourself? Sure.

Pulling it all together

slide-45
SLIDE 45
slide-46
SLIDE 46
slide-47
SLIDE 47

We choose to go to the roof not because it is glamorous, but because it is right there!

... the bulk of our success is the result of the methodical, relentless, persistent pursuit of 1.3-2x opportunities -- what I have come to call "roofshots".

-- Luiz Barroso

Pulling it all together

slide-48
SLIDE 48

Porsche doesn't make cars: it designs and assembles them

1H2014:
  ○ 1.7% (89k) of VW group's vehicles
  ○ 23% (€1.4b) of its profits

Data: Volkswagen, 2014-07-31
Image: john wilkes

Pulling it all together

slide-49
SLIDE 49

Cloud system providers are getting better at everything ...

  • capacity management
  • monitoring
  • storage + networking
  • reliability
  • software development tooling
  • ...

Wouldn't you like to stand on others' shoulders?

Pulling it all together

slide-50
SLIDE 50

johnwilkes@google.com http://kubernetes.io


Three rules of thumb:

  • 1. Resiliency is more important than performance.
  • 2. Relax. Let go. Build on what others have done.
  • 3. Do more monitoring.
slide-51
SLIDE 51
slide-52
SLIDE 52

SLA = SLOs + consequences of achieving or missing them

Example:

  • if availability > 99.95% (SLO), user pays £xx/CPU-week
  • else gets a 30% refund
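The example SLA can be written as a small function: the SLO is the availability threshold, and the SLA adds what happens on either side of it. This Python sketch is illustrative only. The function and parameter names are invented, and because the slide leaves the price as £xx, the price is an input here and the value used in the example call is a placeholder.

```python
def weekly_charge(availability: float, cpu_weeks: float, price_per_cpu_week: float,
                  slo: float = 0.9995, refund_fraction: float = 0.30) -> float:
    """Amount owed for one week under the example SLA.

    availability and slo are fractions (0.9995 means 99.95%).
    """
    full_price = cpu_weeks * price_per_cpu_week
    if availability > slo:
        return full_price                       # SLO met: user pays in full
    return full_price * (1 - refund_fraction)   # SLO missed: 30% refund


# Placeholder price of 2.0 per CPU-week; the real figure is not given on the slide.
print(weekly_charge(availability=0.9999, cpu_weeks=10, price_per_cpu_week=2.0))
print(weekly_charge(availability=0.9990, cpu_weeks=10, price_per_cpu_week=2.0))
```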

Achieving desired behavior

Service level agreement (SLA)

an SLO

slide-53
SLIDE 53

For the past 15 years, Google has been building out the world’s fastest, most powerful, highest quality cloud infrastructure on the planet.

Images by Connie Zhou