Docker Orchestration: Beyond the Basics
Aaron Lehmann, Software Engineer, Docker (PowerPoint PPT presentation)

SLIDE 1

Docker Orchestration: Beyond the Basics

Aaron Lehmann, Software Engineer, Docker

SLIDE 2

About me

  • Software engineer at Docker
  • Maintainer on SwarmKit and Docker Engine open source projects
  • Focusing on distributed state, task scheduling, and rolling updates

SLIDE 3

Swarm mode

SLIDE 4

Swarm mode is Docker's built-in orchestration

  • Docker can orchestrate containers over multiple machines without extra software
  • Example: running instances of a web service on several machines

SLIDE 5

Getting started with swarm mode

  • Initialize a new swarm:

mgr-1$ docker swarm init

  • Join an existing swarm:

worker-1$ docker swarm join --token <token> 192.168.65.2:2377

SLIDE 6

Swarm mode: Services

  • Swarm mode deals with services, not individual containers
  • Each service creates one or more replica tasks, which are run as containers
  • On a manager, create a new service for a search microservice application:

mgr-1$ docker service create -p 8080:8080 --name search \
         --replicas 4 searchsvc:v1.0

mgr-1$ docker service ls
ID            NAME    REPLICAS  IMAGE           COMMAND
2xtw9qipmbe9  search  4/4       searchsvc:v1.0

SLIDE 7

Swarm mode: Nodes

  • Worker nodes just run service tasks
  • Manager nodes manage the swarm

mgr-1$ docker node ls
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
drwxwi4h2fb0tcrwgmpmma2x0 *  mgr-1     Ready   Active        Leader
1mhtdwhvsgr3c26xxbnzdc3yp    mgr-2     Ready   Active        Reachable
516pacagkqp2xc3fk9t1dhjor    mgr-3     Ready   Active        Reachable
9j68exjopxe7wfl6yuxml7a7j    worker-1  Ready   Active
03g1y59jwfg7cf99w4lt0f662    worker-2  Ready   Active
dxn1zf6l61qsb1josjja83ngz    worker-3  Ready   Active

SLIDE 8

Swarm mode topology

[Diagram: three managers and six workers; search service and billing service containers are distributed across the workers]

SLIDE 9

Swarm mode topology

[Diagram: the same swarm, with the three managers shown separately from the six workers running the service containers]

SLIDE 10

Swarm mode topology

[Diagram: the swarm with five workers remaining; the service containers are redistributed across the surviving workers]

SLIDE 11

High availability

SLIDE 12

High availability

  • Survive failures of some portion of workers and managers
  • If a worker fails, its assigned tasks are rescheduled elsewhere
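As a sketch of how to observe this (using the search service and hostnames from the earlier slides as assumed examples), the task list for a service shows where each replica runs, so a rescheduled task appears on a new node:

```shell
# On a manager: list the tasks of the search service and the node
# each one runs on. After a worker fails, its old task shows a
# desired state of Shutdown next to the replacement task that was
# rescheduled onto a healthy node.
mgr-1$ docker service ps search
```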

SLIDE 13

High availability

  • What about manager failures?
  • Managers are part of a Raft cluster that replicates the state of the swarm

SLIDE 14

Raft

  • Raft is a protocol for maintaining a strongly consistent distributed log
  • A way to avoid a single point of failure
SLIDE 15

Raft concepts

  • Quorum: A majority of managers
  • Leader: A single manager, chosen by election, that can add information to the distributed log
  • Election: The process of choosing a new leader
SLIDE 16

High availability

  • The leader is the manager that:
  • Makes the scheduling decisions
  • Keeps track of node health
  • Handles API calls
SLIDE 17

High availability

  • If the leader fails, another manager is elected in its place
  • For Raft to function, more than half the managers (a quorum) must be reachable

SLIDE 18

How many managers for a swarm?

  • A single manager is fine in some scenarios
  • Any swarm meant to survive a manager failure should have 3 or 5 managers
  • No scaling benefit to adding additional managers
  • Each one replicates a full copy of the swarm's state
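The quorum rule behind this guidance is plain integer arithmetic: a majority is floor(n/2) + 1 managers, and the swarm tolerates n minus that many failures. A quick shell sketch:

```shell
# Raft quorum math: majority = n/2 + 1 (integer division),
# tolerated failures = n - majority.
for n in 1 2 3 4 5 6 7; do
  majority=$(( n / 2 + 1 ))
  tolerated=$(( n - majority ))
  echo "$n managers: majority $majority, tolerates $tolerated failures"
done
```

Note that an even manager count tolerates no more failures than the odd count just below it, which is why 3 or 5 managers is the usual recommendation.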
SLIDE 19

Manager fault tolerance

Number of managers    Majority    Tolerated failures
1                     1           0
2                     2           0


SLIDE 21

Manager fault tolerance

Number of managers    Majority    Tolerated failures
1                     1           0
2                     2           0
3                     2           1
4                     3           1


SLIDE 23

Manager fault tolerance

Number of managers    Majority    Tolerated failures
1                     1           0
2                     2           0
3                     2           1
4                     3           1
5                     3           2
6                     4           2


SLIDE 25

Manager fault tolerance

Number of managers    Majority    Tolerated failures
1                     1           0
2                     2           0
3                     2           1
4                     3           1
5                     3           2
6                     4           2
7                     4           3
8                     5           3
9                     5           4

SLIDE 26

Where to deploy the managers

  • Managers must have static IP addresses
  • Managers should have very reliable connectivity to each other
  • Swarms that span a big geographic area aren't recommended
  • Looking at federation as an eventual solution for multi-region
  • Spreading managers across a cloud provider's "availability zones" in one region may make sense

SLIDE 27

Advertised IP addresses

  • All managers must be reachable by all other managers
  • Managers need to know their own IP addresses so they can tell other managers how to reach them
  • The address is autodetected if there is only one network device, or in the process of joining an existing swarm

SLIDE 28

Advertised IP addresses

  • If the address can't be autodetected, provide --advertise-addr when running docker swarm init
  • Many swarm instability issues are actually caused by managers not being able to communicate
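A minimal sketch of this flag, reusing the address from the join example earlier (substitute the manager's actual routable IP):

```shell
# Tell other managers how to reach this node when it has multiple
# network interfaces and autodetection would be ambiguous.
mgr-1$ docker swarm init --advertise-addr 192.168.65.2:2377
```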

SLIDE 29

What to do if quorum is lost

  • Suppose two out of three managers fail
  • The swarm won't be able to schedule tasks or perform administrative functions
  • You will see timeouts from commands like docker node ls if this happens

SLIDE 30

What to do if quorum is lost

  • What if these managers are gone forever?
  • docker swarm init --force-new-cluster on the surviving manager recovers from this state
  • This modifies the swarm so that it only has a single manager
  • From that point, new managers can be added
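A recovery sequence might look like the following sketch (the hostname is assumed; run it on the one surviving manager):

```shell
# Rebuild a single-manager swarm from this node's local Raft state.
# Existing services and tasks are preserved.
mgr-1$ docker swarm init --force-new-cluster

# Then restore fault tolerance: print a fresh manager join token and
# use it to join replacement managers.
mgr-1$ docker swarm join-token manager
```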
SLIDE 31

Protecting managers from accidental overloading

  • By default, managers will be assigned tasks just like workers
  • This makes sense on a laptop-scale deployment
  • Best practice for serious deployments: avoid running container workloads on managers

SLIDE 32

Protecting managers from accidental overloading

  • Drain the managers to prevent them from running service tasks:

mgr-1$ docker node update --availability=drain <manager id>

  • Alternatively, set the node.role == worker constraint on all services
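As a sketch of the constraint approach, reusing the search service from the earlier slides as an illustration:

```shell
# Keep this service's tasks off the managers by constraining
# placement to worker nodes.
mgr-1$ docker service create --name search \
    --constraint node.role==worker \
    --replicas 4 searchsvc:v1.0
```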
SLIDE 33

Rolling updates

  • Important to avoid downtime during updates
  • docker service update is a rolling update by default
  • Parameters:
  • Update delay (--update-delay)
  • Update failure action: pause or continue (--update-failure-action)
  • Parallelism (--update-parallelism)
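Putting these parameters together, an update of the search service might look like this (the v1.1 image tag is an assumption for illustration):

```shell
# Roll out a new image two tasks at a time, waiting 10s between
# batches, and pause the rollout if an updated task fails.
mgr-1$ docker service update \
    --image searchsvc:v1.1 \
    --update-parallelism 2 \
    --update-delay 10s \
    --update-failure-action pause \
    search
```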
SLIDE 34

Rolling updates

[Diagram: timeline of a rolling update; each batch of tasks goes through prepare new, stop old, start new, health checks, then an update delay before the next batch begins; the number of tasks per batch is the update parallelism]

SLIDE 35

Security

SLIDE 36

Security model

  • All swarm connections are encrypted and authenticated with mutual TLS
  • Each node is identified by its certificate (CN = node ID)
  • The certificate authorizes the node to act as either a worker or manager (OU = swarm-manager or OU = swarm-worker)
  • By default, each manager operates as a certificate authority with the same CA key

SLIDE 37

Security around adding nodes

  • How does a new node authenticate itself before having a certificate?
  • It presents a join token, which is provided to docker swarm join

SLIDE 38

Security around adding nodes

  • The join token contains a secret that authorizes the new node to receive either a worker or manager certificate
  • It also contains a digest of the root CA certificate, for protection against man-in-the-middle attacks
  • The node does not use or store the join token after joining
SLIDE 39

Node joining example: adding a new worker

  • On a manager, retrieve the join token:

mgr-1$ docker swarm join-token worker
To add a worker to this swarm, run the following command:
    docker swarm join \
    --token SWMTKN-1-5f7umqonkff6je2l1kqpxdsok3bwipn73hlr5dxtvx4lusy809-5yn6jy5zqqq3tnummvq365y7m \
    172.17.0.2:2377

SLIDE 40

Node joining example: adding a new worker

  • Run the command on the new worker:

worker-1$ docker swarm join --token \
    SWMTKN-1-5f7umqonkff6je2l1kqpxdsok3bwipn73hlr5dxtvx4lusy809-5yn6jy5zqqq3tnummvq365y7m \
    172.17.0.2:2377
This node joined a swarm as a worker.

SLIDE 41

Node joining flow

[Diagram: the joining node sends the join token and a certificate request to a manager over TLS with no client certificate; the manager returns a signed certificate; node registration and task assignments then use mutually authenticated TLS]

SLIDE 42

Rotating join tokens

  • The join tokens remain valid until they are rotated
  • It is good practice to periodically rotate them
  • docker swarm join-token --rotate worker generates a new worker token to replace the old one
  • docker swarm join-token --rotate manager generates a new manager token to replace the old one

SLIDE 43

Rotating join tokens

mgr-1$ docker swarm join-token --rotate worker
Successfully rotated worker join token.
To add a worker to this swarm, run the following command:
    docker swarm join \
    --token SWMTKN-1-5f7umqonkff6je2l1kqpxdsok3bwipn73hlr5dxtvx4lusy809-6cq1skbwkkrp2xgv4ak0cgn01 \
    172.17.0.2:2377

SLIDE 44

Certificate renewal

  • By default, certificates issued to nodes by the swarm manager are valid for 90 days
  • Before they expire, nodes automatically renew their certificates
  • Jitter is added to the renewal time
SLIDE 45

Certificate renewal

  • Renewal does not involve join tokens
  • A manager will issue a renewed certificate to any node that can prove its identity with mutual TLS
  • The certificate validity period can be changed with:

mgr-1$ docker swarm update --cert-expiry=1000h

  • A shorter expiration time limits the time window where a leaked certificate is useful to an attacker
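To check when a node's current certificate expires, one option is to inspect the certificate the engine stores on disk (path as used by swarm mode of this era; openssl is assumed to be installed):

```shell
# Print the subject and validity window of this node's swarm
# certificate.
$ sudo openssl x509 \
    -in /var/lib/docker/swarm/certificates/swarm-node.crt \
    -noout -subject -dates
```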

SLIDE 46

External certificate authorities

  • Some may prefer to use a hardened external CA
  • Swarm mode can be set up to call out to an external CA:

$ docker swarm init --external-ca \
    protocol=cfssl,url=https://myca.domain.com

SLIDE 47

Node joining flow (external CA)

[Diagram: the same joining flow as before, except the manager forwards the certificate request to an external CA, which returns the signed certificate]

SLIDE 48

External certificate authorities

  • Initially supported protocol is cfssl's JSON API
  • Swarm manager authenticates with the external CA using mutual TLS with its manager certificate
  • External CA becomes a single point of failure for granting and renewing certificates

SLIDE 49

Registry credentials

  • Some images are private, meaning a password or token is needed to pull them
  • If I just run:

mgr-1$ docker service create myprivateimage

...workers won't be able to pull the image

SLIDE 50

Registry credentials

  • docker login credentials are forwarded to nodes executing containers from the service if --with-registry-auth is specified:

mgr-1$ docker service create \
    --with-registry-auth myprivateimage

  • Note that the password or token is exposed to workers where the tasks are scheduled

SLIDE 51

Registry credentials

  • Alternative to --with-registry-auth: pre-pull private images on all nodes
  • Consider using constraints to limit private images to hardened nodes
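A pre-pull could be scripted over SSH along these lines (the hostnames reuse the earlier examples, and myprivateimage is the placeholder image from the previous slide):

```shell
# Pull the private image on every node ahead of deployment, so the
# registry credentials never need to be forwarded at schedule time.
# Each node is assumed to already be logged in to the registry.
for host in worker-1 worker-2 worker-3; do
  ssh "$host" "docker pull myprivateimage"
done
```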

SLIDE 52

Upcoming improvements

SLIDE 53

Upcoming improvements: Secrets management

  • Include secrets such as crypto keys with services
  • The associated secrets are sent to the nodes where the service's tasks are assigned
  • They are made available inside a RAM filesystem mounted inside the container
  • Secrets are not written to disk as part of task execution
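This feature was still upcoming at the time of the talk; the CLI as it later shipped looks roughly like the following sketch (secret and service names are illustrative):

```shell
# Store a secret in the swarm's replicated state.
mgr-1$ printf 'my-api-token' | docker secret create search_api_token -

# Attach it to a service; tasks see it as an in-memory file at
# /run/secrets/search_api_token inside the container.
mgr-1$ docker service create --name search \
    --secret search_api_token searchsvc:v1.0
```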
SLIDE 54

Upcoming improvements: Visibility into container logs

  • Currently no swarm-level support for container logs
  • It will be possible to access logs from any task through a manager, regardless of where the task is running

SLIDE 55

Upcoming improvements: High availability scheduling

  • Improves scheduling algorithm to spread out replicas over as many nodes as possible
  • This avoids concentrating the service's tasks on one node or a few nodes
  • Then it's harder for hardware/network failures to take out all replicas

SLIDE 56

Upcoming improvements: High availability scheduling

Scale up search service to 3 replicas, old scheduling algorithm:

[Diagram: Node 2 runs three billing service containers; Node 1 runs two search service containers. Node 1 has the fewest tasks, so it receives the new search task, concentrating all three search replicas on one node]

SLIDE 57

Upcoming improvements: High availability scheduling

Scale up search service to 3 replicas, new scheduling algorithm:

[Diagram: Node 2 runs three billing service containers; Node 1 runs two search service containers. Node 2 has the fewest replicas of the search service, so it receives the new search task, spreading the replicas across both nodes]

SLIDE 58

Upcoming improvements: Roll back service updates

  • If you accidentally roll out a bad update to a service, it will be possible to roll it back with a simple command:

mgr-1$ docker service update --rollback <servicename>

  • This reverts to the previous version of the service
SLIDE 59

Upcoming improvements: Roll back service updates

  • Related enhancement: rolling updates can pause after a configurable fraction of the new tasks fail
  • Configurable time period to monitor each new task for failure

SLIDE 60

Upcoming improvements: Roll back service updates

[Diagram: update timeline in which task 1 is prepared, started, and health checked, then fails during its monitoring period while task 2 is being prepared, pausing the rollout]

SLIDE 61

Other resources

  • Swarm mode documentation: https://docs.docker.com/engine/swarm/
  • Overview of Raft: http://thesecretlivesofdata.com/raft/
  • SwarmKit project GitHub repository: https://github.com/docker/swarmkit

SLIDE 62

Thanks!