Engineering Velocity: Continuous Delivery at Netflix, Dianne Marsh



SLIDE 1

Engineering Velocity: Continuous Delivery at Netflix

Dianne Marsh SATURN 2014

SLIDE 2

en-gi-neer-ing + ve-loc-i-ty

  • applying science and technology to designing and building speed into a system

SLIDE 3

Availability vs. Rate of Change

[Chart: Availability (in 9’s, 1 to 6) vs. Rate of Change (10, 100, 1000)]
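
To make the availability axis concrete, here is a minimal sketch of the arithmetic behind "9's", assuming a 365-day year. The function name is illustrative and does not come from the talk.

```python
# Convert an availability target expressed in "nines" into a downtime budget.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years are ignored


def allowed_downtime_minutes(nines: int) -> float:
    """Return the yearly downtime budget, in minutes, for N nines of availability."""
    unavailability = 10 ** -nines  # e.g. 3 nines (99.9%) -> 0.001
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in range(1, 7):
        print(f"{n} nines: {allowed_downtime_minutes(n):,.2f} minutes/year")
```

Under these assumptions, three nines leaves roughly 525.6 minutes of downtime per year, and each additional nine divides that budget by ten.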

SLIDE 4

Shift the Curve

[Chart: Availability (in 9’s, 1 to 6) vs. Rate of Change (10, 100, 1000, 10000)]

SLIDE 5

http://www.slideshare.net/reed2001/culture-1798664

SLIDE 6

Manager’s Role

  • Context, not Control
  • Loosely coupled, Tightly aligned
  • And hire well!

SLIDE 7

Get out of the Way

Freedom to Innovate

SLIDE 8

Support Experimentation

  • How We Built a Predictive Autoscaling Engine

http://techblog.netflix.com/2013/11/scryer-netflixs-predictive-auto-scaling.html

SLIDE 9

Support Independent Paths of Exploration

Don’t Prematurely Optimize!

SLIDE 10

Blameless Culture

SLIDE 11

Developers Deploy Their Code

Run What You Wrote

  • Rapid Innovation
  • Rapid Detection
  • Rapid Response
  • = Freedom + Responsibility
SLIDE 12

Support with Tools

SLIDE 13

Jenkins Job DSL

  • Configuration as Code
  • Groovy Script
  • Scripts go in Version Control

http://www.slideshare.net/quidryan/configuration-as-code

SLIDE 14

Aminator

  • Create AMI from Base AMI
  • Image contains service and everything needed to run it
  • Unit of Deployment for Test and Prod
  • Abstracts Cloud Details

http://techblog.netflix.com/2013/03/ami-creation-with-aminator.html

SLIDE 15

Asgard

  • Deploys Netflix to the Cloud
  • Red/Black push
  • Developed to address delays in rollback

http://www.infoq.com/presentations/asgard

SLIDE 16

Red/Black Push

  • Scale up new instances
  • Run canary analysis
  • Turn on traffic to new ASG
  • Turn off traffic to old ASG
  • Wait … analyze … continue
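
The steps above can be sketched as a small simulation. This is an illustration under stated assumptions, not Asgard's implementation: the `Group` class and the `canary_ok` and `healthy_after_switch` callbacks are hypothetical stand-ins for an auto scaling group (ASG) and for whatever analysis a team runs.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Hypothetical stand-in for an auto scaling group (ASG)."""
    name: str
    instances: int = 0
    serving: bool = False


def red_black_push(old, new, canary_ok, healthy_after_switch):
    """Walk through the red/black steps and return the group left serving traffic."""
    new.instances = old.instances       # 1. scale up new instances
    if not canary_ok(new):              # 2. run canary analysis
        new.instances = 0               #    canary failed: discard the new group
        return old
    new.serving = True                  # 3. turn on traffic to new ASG
    old.serving = False                 # 4. turn off traffic to old ASG
    if not healthy_after_switch(new):   # 5. wait ... analyze ...
        old.serving = True              #    roll back: old instances are still running
        new.serving = False
        return old
    return new                          #    ... continue
```

In this sketch a rollback is only a traffic switch, because the old group keeps its instances until the new one has proven itself.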
SLIDE 17

Workflow

  • Continuous Delivery Engine
  • Judges between Stages
  • Represent Best Practices

http://techblog.netflix.com/2013/09/glisten-groovy-way-to-use-amazons.html

SLIDE 18

One Click Deployment?

SLIDE 19

Regional Isolation

Limit Impact of Human Error

  • Stagger Deployments?
  • Canary Testing per Region?
  • Know your Service!
SLIDE 20

Multi-Region Consistency

Build Tooling to:

  • Schedule Deployments
  • Prefer Off-Peak
  • Choose Next Available Region

  • Provide Visibility by Region
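
One way to read "Prefer Off-Peak" and "Choose Next Available Region" is as a small selection rule. The sketch below is an assumption about how such tooling might behave; the function name and the peak-hour windows in the usage note are made up for illustration.

```python
def next_region(pending, peak_hours, hour_utc):
    """Pick the next region to deploy to, preferring regions that are off-peak.

    pending    -- regions still waiting for the deployment, in priority order
    peak_hours -- mapping of region -> collection of UTC hours treated as peak
    hour_utc   -- the current hour (0-23) in UTC
    """
    off_peak = [r for r in pending if hour_utc not in peak_hours.get(r, ())]
    candidates = off_peak or pending  # "prefer", so fall back to any pending region
    return candidates[0] if candidates else None
```

For example, with `peak_hours = {"us-east-1": range(0, 4), "eu-west-1": range(18, 23)}` and `hour_utc = 2`, the call `next_region(["us-east-1", "eu-west-1"], peak_hours, 2)` skips us-east-1 and picks eu-west-1.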
SLIDE 21

Simian Army

  • Chaos Monkey
  • Latency Monkey
  • Conformity Monkey
  • Janitor Monkey

(and more)

http://www.infoq.com/presentations/netflix-resiliency-failure-cloud

SLIDE 22

Chaos Monkey

Kills Running Instances

  • Simulates failures inherent to running in the cloud

  • In Production
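
As a rough illustration of the idea (not the actual Chaos Monkey code), the selection step could look like the sketch below. The group names and instance IDs in the usage note are hypothetical, and actually terminating an instance is left to whatever cloud API is in use.

```python
import random


def pick_victims(groups, rng=random):
    """Choose one running instance at random from each non-empty group.

    groups -- mapping of group name -> list of running instance IDs
    rng    -- source of randomness; pass random.Random(seed) for repeatable runs
    """
    return {
        name: rng.choice(instances)
        for name, instances in groups.items()
        if instances  # empty groups have nothing to kill
    }
```

Calling `pick_victims({"api": ["i-1", "i-2"], "batch": []})` returns a single victim for `api` and leaves the empty `batch` group out.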
SLIDE 23

Latency Monkey

Introduces Latency between services
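
A minimal sketch of latency injection, assuming the service call can be wrapped in-process. This is not how Latency Monkey itself is built; the decorator name and its parameters are illustrative.

```python
import functools
import random
import time


def with_injected_latency(max_delay_s, probability=1.0, rng=random, sleep=time.sleep):
    """Wrap a call so that it is sometimes delayed by up to max_delay_s seconds."""
    def decorator(call):
        @functools.wraps(call)
        def wrapper(*args, **kwargs):
            if rng.random() < probability:           # decide whether to delay this call
                sleep(rng.uniform(0, max_delay_s))   # artificial delay before the real call
            return call(*args, **kwargs)
        return wrapper
    return decorator
```

Decorating a client function with `@with_injected_latency(0.5, probability=0.1)` would delay about one call in ten by up to half a second, which is enough to exercise timeouts and fallbacks in the caller.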

SLIDE 24

Conformity Monkey

Have Deployments Diverged?

  • Balance Regional Consistency with Regional Isolation
  • Build Best Practices into Tooling and Reporting
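
The question "Have Deployments Diverged?" can be sketched as a comparison of deployed versions across regions. The function name and the version strings below are assumptions for illustration, not Conformity Monkey's actual rules.

```python
from collections import Counter


def diverged_regions(versions_by_region):
    """Return the regions whose deployed version differs from the most common one."""
    if not versions_by_region:
        return []
    majority, _count = Counter(versions_by_region.values()).most_common(1)[0]
    return sorted(r for r, v in versions_by_region.items() if v != majority)
```

With `{"us-east-1": "1.4", "us-west-2": "1.4", "eu-west-1": "1.3"}` the check reports `["eu-west-1"]`. During a staggered rollout some divergence is intentional, so a report like this is probably more useful as visibility than as a hard gate.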

SLIDE 25

Janitor Monkey

Reduce Cognitive Load and Cost

  • Remove unused instances
  • Uniform way to clean up
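
A uniform cleanup rule might be as simple as "idle for longer than some threshold". The sketch below assumes a 14-day threshold and a last-used timestamp per resource; both are illustrative choices, not Janitor Monkey's real policy.

```python
from datetime import datetime, timedelta


def cleanup_candidates(last_used, now, max_idle=timedelta(days=14)):
    """Return IDs of resources idle for longer than max_idle, longest-idle first.

    last_used -- mapping of resource ID -> datetime of last observed use
    now       -- the reference time for the idle calculation
    """
    idle = [(ts, rid) for rid, ts in last_used.items() if now - ts > max_idle]
    return [rid for _ts, rid in sorted(idle)]
```

A single rule applied everywhere is what makes the cleanup predictable: owners know in advance which resources will be flagged and why.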
SLIDE 26

Shifting the Curve with Tooling

  • Value Self-Service
  • Test Everywhere
  • Awareness of Multiple Regions
  • Best Practices Represented in Tooling
  • Recover Quickly and Easily
  • Be Cloud Native
SLIDE 27

Shifting the Curve with Culture

  • Context not Control
  • Freedom to Experiment
  • Blameless Culture
SLIDE 28

ArsTechnica, November 2012

“As the number of applications and the scale of the campaign's AWS infrastructure use climbed, the DevOps team shifted to using Asgard—an open-source tool developed by Netflix to manage cloud deployments.”

SLIDE 29

Thanks!

Dianne Marsh (@dmarsh)
dmarsh@netflix.com