SLIDE 1 Ed Rooth
@sym3tri | ed.rooth@coreos.com | coreos.com
More Containers, More Problems
SLIDE 2
- 1. Define problems
- 2. Define vision of the solution
- 3. How CoreOS is building solutions
- 4. How you can get started
Agenda
SLIDE 3
a server
It all started with...
SLIDE 4
many servers
Then we got...
SLIDE 5
VMs on our servers
Then we got...
SLIDE 6
APIs around hosted VMs (cloud)
Then we got...
SLIDE 7
even more servers
Which led to...
SLIDE 8
The cloud made booting servers really easy. Also… Moore’s law is still a thing.
Too Many Servers!
SLIDE 9
- Patching is hard
- Dependency management is hard
- Managing access is hard
- Managing workloads is hard
- App lifecycle management is hard
- Identifying security issues is hard
More Servers, More Problems
SLIDE 10 More Servers == More Sysadmins
Servers Sysadmins
1000 500
SLIDE 11 1000 500
More Servers, More Problems
Servers Sysadmins
SLIDE 12
… before the rest of us did. They solved many of these problems internally, and published some great papers.
Google needed more servers
SLIDE 13
We started building it
CoreOS, Google, and the community... are building the open-source version.
SLIDE 14
#GIFEE
SLIDE 15
Google’s Infrastructure For Everyone Else
What is #GIFEE?
SLIDE 16 "Fundamentally, it's what happens when you ask a software engineer to design an operations function."
Vice President, Google Engineering; founder of Google SRE
Google’s Infrastructure
SLIDE 17
SLIDE 18
- Servers are not your pets
- Servers are the new CPU cores
- Clusters are the new servers
What is #GIFEE?
SLIDE 19
Evolution of Servers
SLIDE 20
Clusters
Server Cluster
SLIDE 21
Clusters
Process App
SLIDE 22
- Operating System: Custom Linux
- Distributed Consensus: Chubby
- Cluster Manager: Borg
- Monitoring: BorgMon
- RPC framework: Stubby
- Auth: private
SLIDE 23
- Operating System: Custom Linux → CoreOS Linux
- Distributed Consensus: Chubby → etcd
- Cluster Manager: Borg → Kubernetes
- Monitoring: BorgMon → Prometheus
- RPC framework: Stubby → gRPC
- Auth: private → Dex
Open Source
SLIDE 24
“cluster operating system”
SLIDE 25
Orchestration State Scheduler: Gets work to the servers
OS for Clusters
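
Slide 25 describes the scheduler as the piece that gets work to the servers. A minimal Python sketch of that idea, assuming a greedy "most free CPU wins" placement rule. The `Node` type, the `schedule` function, and the CPU figures are hypothetical and only illustrate the concept; this is not how Kubernetes scores nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical worker with a fixed CPU capacity (in millicores)."""
    name: str
    capacity: int
    pods: dict = field(default_factory=dict)  # pod name -> requested CPU

    @property
    def free(self) -> int:
        return self.capacity - sum(self.pods.values())


def schedule(pod: str, cpu: int, nodes: list) -> str:
    """Place a pod on the node with the most free CPU that can still fit it."""
    candidates = [n for n in nodes if n.free >= cpu]
    if not candidates:
        raise RuntimeError(f"no node can fit {pod} ({cpu}m)")
    best = max(candidates, key=lambda n: n.free)
    best.pods[pod] = cpu
    return best.name


# Assumed cluster: two workers of different sizes, four identical pods.
nodes = [Node("worker-1", 1000), Node("worker-2", 2000)]
placements = [schedule(f"myapp-{i}", 600, nodes) for i in range(4)]
print(placements)
```

The caller only states what should run and how much CPU it needs; which server it lands on is the scheduler's decision.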
SLIDE 26
- Software manages servers
- Software manages workloads
- Declare what you want, it will become so
What is #GIFEE?
SLIDE 27 worker kubelet worker kubelet worker kubelet worker kubelet worker kubelet worker kubelet worker kubelet API + scheduler
SLIDE 28 worker kubelet API + scheduler
SLIDE 29 API + Scheduler + worker
works on 1 node too
SLIDE 30
- Primary component of the Cluster OS
- Fits our vision
- Started by Google, with over 10 years of experience running Borg
SLIDE 31
- Centralized administration & orchestration
- No more SSH
- Yes, that even means your favorite config mgmt tool
What is #GIFEE?
SLIDE 32 What is #GIFEE?
$ scp myapp host:/opt
$ ssh host systemd-run /opt/myapp
Don’t say HOW
SLIDE 33 What is #GIFEE?
$ kubectl run myapp \
    --image=quay.io/sym3tri/hello \
    --replicas=1
$ kubectl get pods
POD           IP
myapp-97wt8   10.2.29.3
say WHAT
SLIDE 34 What is #GIFEE?
$ kubectl scale rc myapp --replicas=4
$ kubectl get pods
POD           IP
myapp-97wt8   10.2.29.3
myapp-f839d   10.2.29.4
myapp-98b35   10.2.29.5
myapp-e40ee   10.2.29.8
say WHAT again
SLIDE 35 What is #GIFEE?
$ kubectl run myapp \
    --image=quay.io/sym3tri/hello \
    --replicas=1
$ kubectl get pods
POD           IP
myapp-97wt8   10.2.29.3
say WHAT
SLIDE 36
SLIDE 37
RC web-prod select(env=prod,app=web) count=1 Pod env=prod app=web
SLIDE 38
RC web-prod select(env=prod,app=web) count=4 Pod env=prod app=web Pod env=prod app=web Pod env=prod app=web Pod env=prod app=web
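
Slides 37 and 38 show a replication controller (RC) that selects pods by label and holds their number at `count`. A minimal Python sketch of that reconcile step, assuming pods are plain dictionaries of labels. The `matches` and `reconcile` helpers and the `web-prod-N` naming are hypothetical, not the Kubernetes implementation.

```python
import itertools


def matches(selector: dict, labels: dict) -> bool:
    """A pod matches when it carries every key/value pair in the selector."""
    return all(labels.get(k) == v for k, v in selector.items())


def reconcile(pods: dict, selector: dict, count: int, ids) -> dict:
    """Return a new pod set with exactly `count` pods matching the selector."""
    pods = dict(pods)
    matching = sorted(n for n, labels in pods.items() if matches(selector, labels))
    for name in matching[count:]:            # too many: delete the surplus
        del pods[name]
    for _ in range(count - len(matching)):   # too few: create the shortfall
        pods[f"web-prod-{next(ids)}"] = dict(selector)
    return pods


ids = itertools.count(1)
selector = {"env": "prod", "app": "web"}
pods = {"db-1": {"env": "prod", "app": "db"}}  # unrelated pod, never touched

pods = reconcile(pods, selector, 1, ids)   # count=1, as on slide 37
pods = reconcile(pods, selector, 4, ids)   # count=4, as on slide 38
del pods["web-prod-2"]                     # simulate a pod dying
pods = reconcile(pods, selector, 4, ids)   # the controller replaces it
print(sorted(pods))
```

Nothing in the loop says how to start or stop a particular pod. It only compares the declared count with what currently matches the selector and closes the gap.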
SLIDE 39
automated != automatic
SLIDE 40
- Dependencies are isolated per app
- Apps automatically migrate throughout the cluster
What is #GIFEE?
SLIDE 41 All apps are “12-factor” Configuration/Secret management
What is #GIFEE?
prod config staging config
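
Slide 41 pairs "12-factor" apps with configuration and secret management, shown as separate prod and staging configs. A minimal Python sketch of the pattern, assuming the app reads all of its settings from environment variables. The variable names `DATABASE_URL` and `LOG_LEVEL` and the example URLs are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    database_url: str
    log_level: str


def load_config(env=os.environ) -> Config:
    """Build the app config from environment variables only."""
    return Config(
        database_url=env["DATABASE_URL"],        # required: fail fast if missing
        log_level=env.get("LOG_LEVEL", "info"),  # optional, with a default
    )


# The same code runs in both environments; only the injected values differ.
prod = load_config({"DATABASE_URL": "postgres://db.prod.example/app"})
staging = load_config({
    "DATABASE_URL": "postgres://db.staging.example/app",
    "LOG_LEVEL": "debug",
})
print(prod, staging, sep="\n")
```

Because the build artifact carries no environment-specific values, the same container image can be promoted from staging to prod unchanged.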
SLIDE 42
- Consistent Deployment API
- Deploy canary builds and experiments
- Rolling Updates
What is #GIFEE?
SLIDE 43 Load Balanced Service
app v1 app v1 app v1 app v1
SLIDE 44 Load Balanced Service
app v1 app v1 app v1 app v1 app v2
SLIDE 45 Load Balanced Service
app v1 app v1 app v1 app v1 app v2
SLIDE 46 Load Balanced Service
app v1 app v1 app v1 app v1 app v2
SLIDE 47 Load Balanced Service
app v1 app v1 app v1 app v2 app v2
SLIDE 48 Load Balanced Service
app v1 app v1 app v2 app v2 app v2
SLIDE 49 Load Balanced Service
app v2 app v2 app v2 app v2
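
Slides 43 to 49 step through a rolling update behind a load-balanced service. A minimal Python sketch of one possible strategy, assuming one new pod is started before one old pod is stopped. The `rolling_update` function is hypothetical and skips health checks, so it illustrates the sequence rather than any real tool's behavior.

```python
def rolling_update(pods: list, old: str, new: str):
    """Yield the pod list after each step: start one new pod, then stop one old pod."""
    pods = list(pods)
    while old in pods:
        pods.append(new)   # surge: bring up a replacement first
        yield list(pods)
        pods.remove(old)   # then retire one pod of the old version
        yield list(pods)


steps = list(rolling_update(["v1"] * 4, "v1", "v2"))
for step in steps:
    print(step)
```

In this sketch the service never drops below its original four pods, so the load balancer always has a full set of backends while versions are swapped.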
SLIDE 50 C Team B Team A Team
What is #GIFEE?
- Mixed workloads (staging + prod)
- Logically partitioned resources
SLIDE 51 Trusted & Secure from the bottom up* Only trusted code is executed
What is #GIFEE?
Cluster OS Container Runtime OS Firmware & TPM
SLIDE 52 Every {human,machine,process} is… authenticated & authorized All communication is encrypted
What is #GIFEE?
worker kubelet API + scheduler
SLIDE 53 Failure is expected and handled for…
- Services / Apps
- Machines
- Storage
- Clusters
- Regions
What is #GIFEE?
SLIDE 54
Logging Monitoring / Alerting
What is #GIFEE?
SLIDE 55
- Compatibility with existing tools
- Work with other projects (Docker, Calico, Prometheus)
- Incorporates lessons learned
#GIFEE vs Google Infra?
SLIDE 56
- Build for scale
- Manage your apps, not servers
- High Availability
- New paradigm of infra/development
Why?
SLIDE 57
We believe: As #GIFEE becomes ubiquitous, the Internet becomes more secure overall
#GIFEE and Security
SLIDE 58
Secure the Internet
CoreOS Mission
SLIDE 59
Journey to #GIFEE
SLIDE 60 Leverage prior work + standards
Getting Started
SLIDE 61
Start from the bottom The Operating System
Securing The Internet
SLIDE 62 Minimal Server OS + Automatic Updates Requires:
- Distributed consensus
- Containers
- Cluster computing
Securing The Internet
SLIDE 63
In this new world we containerize all the things…
Containerize
SLIDE 64
but…
Containerize
SLIDE 65 “Every solution breeds new problems”
One problem solved → another problem arises
More Containers, More Problems
SLIDE 66 Problem #1
container distribution
More Containers, More Problems
SLIDE 67 Problem #1
container distribution
More Containers, More Problems
Solution
SLIDE 68 More Containers, More Problems
Problem #2
- Docker security model
- Docker coupling of components
SLIDE 69 More Containers, More Problems
Problem #2
- Docker security model
- Docker coupling of components
Solution
SLIDE 70
More Containers, More Problems
systemd app systemd app docker run redis docker engine daemon
SLIDE 71
Implementation:
Side Note: Spec vs Implementation
SLIDE 72 Side Note: Spec vs Implementation
Specification:
https://en.wikipedia.org/wiki/ISO_668
SLIDE 73 More Containers, More Problems
Problem #3
SLIDE 74 More Containers, More Problems
Problem #3
Solution
SLIDE 75 More Containers, More Problems
Problem #4
SLIDE 76 More Containers, More Problems
Problem #4
Solution
- Go
- Buildroot
- acbuild for ACIs
SLIDE 77 github.com/brianredbeard/minimal_containers NOOOOOOOOO!!!
Your container is 500MB !?
SLIDE 78 Problems #5-11
- Co-locating Containers
- Intelligent Scheduling
- Port Management
- Segmenting workloads
- Configuration Management
- Secrets Management
- Inconsistent Deployments
More Containers, More Problems
SLIDE 79 Problems #5-11
- Co-locating Containers
- Intelligent Scheduling
- Port Management
- Segmenting workloads
- Configuration Management
- Secrets Management
- Inconsistent Deployments
More Containers, More Problems
Solution
SLIDE 80 More Containers, More Problems
Problem #12 Networking
- Too many types of SDNs
- IP per POD
SLIDE 81 More Containers, More Problems
Problem #12 Networking
- Too many types of SDNs
- IP per POD
Solution
SLIDE 82 More Containers, More Problems
Problem #13
- Metrics
- Monitoring
- Alerting
SLIDE 83 More Containers, More Problems
Problem #13
- Metrics
- Monitoring
- Alerting
Solution
SLIDE 84 More Containers, More Problems
Problem #14
containers
SLIDE 85 More Containers, More Problems
Problem #14
containers Solution
SLIDE 86
SLIDE 87 More Containers, More Problems
Problem #15
clusters
SLIDE 88 More Containers, More Problems
Problem #15
clusters Solution
SLIDE 89
SLIDE 90 More Containers, More Problems
Problem #16
SLIDE 91 More Containers, More Problems
Problem #16
Solution
- Ignition
- coreos-baremetal
- Tectonic baremetal installer
SLIDE 92 More Containers, More Problems
Problem #17
trust
SLIDE 93 More Containers, More Problems
Problem #17
trust
Solution
Distributed Trusted Computing (DTC)
SLIDE 94 More Containers, More Problems
Problem #18
SLIDE 95 More Containers, More Problems
Problem #18
Solution
SLIDE 96 Kubernetes is the kernel, Tectonic is the distro.
tectonic.com @tectonic
SLIDE 98
Kubernetes Contributions
OIDC Authentication RBAC Authorization TLS Bootstrapping rktnetes 2x Scheduler Performance etcd 3 support coreos-kubernetes Bootstrap/Upgrade Simplification
SLIDE 99
Future
- More Management Tools
- Expand platform support
- Prometheus Enhancements
- Federated Clusters
SLIDE 100
Summary
- Open-Source is key
- Security is key
- Updates are key
- Containers
- Orchestration
- Automatic systems
SLIDE 101 Ed Rooth
@sym3tri | ed.rooth@coreos.com | coreos.com
More Containers, More Problems
SLIDE 102 We’re hiring in all departments! Email: careers@coreos.com Positions: coreos.com/careers
90+ Projects on GitHub, 1,000+ Contributors
OPEN SOURCE
CoreOS.com - @coreoslinux - github/coreos Secure solutions, support plans, training + more
ENTERPRISE
sales@coreos.com - tectonic.com - quay.io
CoreOS is Running the World’s Containers