From Sysadmin to SRE: CORE Site Reliability at Netflix



SLIDE 1

From Sysadmin to SRE

CORE Site Reliability at Netflix

SLIDE 2

Jonah

SLIDE 3

Al

SLIDE 4
SLIDE 5

Cloud Operations Reliability Engineering

SLIDE 6

context > control

SLIDE 7

hire smart people (and get out of the way)

SLIDE 8

freedom & responsibility

SLIDE 9

learning organization

SLIDE 10

Netflix as a Node shop

SLIDE 11

sysadmin to SRE

a before and after story

SLIDE 12

configuration management

SLIDE 13

baked AMIs

SLIDE 14

uncontrolled chaos

SLIDE 15

deliberate chaos

SLIDE 16

change prevention

SLIDE 17

change logging

SLIDE 18

Nagios, Ganglia, Graphite, Cacti, MRTG

SLIDE 19

Atlas & insight engineering

SLIDE 20
SLIDE 21
SLIDE 22
SLIDE 23
SLIDE 24
SLIDE 25

@jonahhorowitz @altobey
https://netflix.github.io/
https://jobs.netflix.com/
Next: @brendangregg, Broken Linux Performance Tools, Ballroom H, 13:30 to 14:30

SLIDE 26