Scaling Continuous Deployment @ Etsy



SLIDE 1

Scaling Continuous Deployment @ Etsy

Avleen Vig, Staff Operations Engineer
@avleen

With much credit: Daniel Schauenberg (@mrtazz)

SLIDE 2

Statistics

SLIDE 3

Statistics

SLIDE 4

Our application

Mostly monolithic

SLIDE 5

Our application

A few services too

SLIDE 6

Our application

Deploy frequency

SLIDE 7

Our team

Before..

SLIDE 8

Our team

Today.. ..and that’s just a fraction!

SLIDE 9

Deploying code

The push train

Item by decomodwalls

SLIDE 10

Deploying code

#push

  • IRC channel to organize push trains
  • Join a train if you want to deploy changes
  • Schedule is planned via the channel topic
  • First in the train is the driver

SLIDE 11

Deploying code

<prod> kseever* + jameslee | jpaul | avleen (c)

#push

SLIDE 12

Deploying code

<prod> bateman* + krunal* + enorris* | tristan (c) + jameslee (c) + jlaster (c) | dawa + corey + sandosh + jklein + magera + seth_home + mpascual + nathan | bateman | russp (c)

#push
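
The channel topic on the two slides above is regular enough to read mechanically. A minimal sketch, assuming `|` separates trains and `+` separates the people within one train. The slides do not say what the `*` and `(c)` markers mean, so the parser only records them as flags (`starred` and `c_flag` are hypothetical names). The first rider of each train is its driver, as stated earlier.

```python
from dataclasses import dataclass


@dataclass
class Rider:
    name: str
    starred: bool  # trailing "*" in the topic; meaning not given on the slides
    c_flag: bool   # trailing "(c)" in the topic; meaning not given on the slides


def parse_topic(topic: str) -> list[list[Rider]]:
    """Split a #push topic into trains; each train is a list of riders."""
    # Drop a leading "<prod>"-style queue label if one is present.
    if topic.startswith("<") and ">" in topic:
        topic = topic.split(">", 1)[1]
    trains = []
    for chunk in topic.split("|"):
        riders = []
        for raw in chunk.split("+"):
            token = raw.strip()
            if not token:
                continue
            c_flag = token.endswith("(c)")
            if c_flag:
                token = token[: -len("(c)")].strip()
            starred = token.endswith("*")
            if starred:
                token = token[:-1].strip()
            riders.append(Rider(token, starred, c_flag))
        if riders:
            trains.append(riders)
    return trains


trains = parse_topic("<prod> kseever* + jameslee | jpaul | avleen (c)")
drivers = [train[0].name for train in trains]  # first in each train drives
```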

SLIDE 13

Deploying code

Deployinator

SLIDE 14

https://github.com/etsy/deployinator

SLIDE 15

Deploying code

So what’s the problem?

SLIDE 16

Deploying code

So what’s the problem?

  • Deploy-time requests are not atomic
  • Weird limbo while syncing in-place
  • Limits on pushes-per-day
  • Long wait times

SLIDE 17

Deploying code

Um, limits per day?

  • Deploys per day = (push_queue_hours * 60) / minutes to deploy
  • At 15 mins/deploy, we get ~32 deploys per day - not enough!
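
The arithmetic behind that limit, as a small sketch. The 8-hour push window is an assumption: the slide does not state it, but it is what 32 deploys at 15 minutes each implies.

```python
def deploys_per_day(push_queue_hours: float, minutes_per_deploy: float) -> int:
    """Whole deploys that fit in the daily push window when run one at a time."""
    return int(push_queue_hours * 60 // minutes_per_deploy)


# Assumption: an 8-hour window, inferred from ~32 deploys at 15 minutes each.
print(deploys_per_day(8, 15))  # 32
```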

SLIDE 18

How can we scale it?

Our options:

  • More code in each deploy
  • Allow concurrent deploys
  • Reduce deploy times
  • Make deploys atomic
  • Fork more concurrent rsyncs

SLIDE 19

How can we scale it?

More code in each deploy:

  • Also has limits
  • How many people can be in each push?
  • We found ~8 to be our limit for reducing wait times
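
A rough way to see the effect of the cap: cut the waiting list into trains of at most 8 and count how many 15-minute deploys run before a given person's train starts. The function names and the fixed 15-minute deploy are assumptions for illustration, not Etsy tooling.

```python
def plan_trains(queue: list[str], max_riders: int = 8) -> list[list[str]]:
    """Cut the waiting list into consecutive trains of at most max_riders people."""
    return [queue[i:i + max_riders] for i in range(0, len(queue), max_riders)]


def wait_minutes(position: int, max_riders: int = 8, minutes_per_deploy: int = 15) -> int:
    """Minutes until the train carrying 0-based queue position `position` starts."""
    return (position // max_riders) * minutes_per_deploy


waiting = [f"dev{i}" for i in range(19)]   # 19 hypothetical people in the queue
trains = plan_trains(waiting)              # train sizes: 8, 8 and 3
```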

SLIDE 20

How can we scale it?

Allow concurrent deploys:

  • For config changes
  • Code on independent systems
  • The few services we have
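
One way to picture concurrent deploys: keep a separate queue per deploy target, so a config push never waits behind a web push. A minimal sketch; the class and the target names ("web", "config") are hypothetical, and the real queues live in the IRC channel topic.

```python
from collections import defaultdict, deque


class SplitQueues:
    """One FIFO per deploy target; targets never block each other."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def join(self, target: str, person: str) -> None:
        self._queues[target].append(person)

    def next_up(self) -> dict[str, str]:
        """Who may deploy right now: the head of every non-empty queue."""
        return {target: q[0] for target, q in self._queues.items() if q}

    def done(self, target: str) -> str:
        """Remove and return the person who just finished deploying to target."""
        return self._queues[target].popleft()


queues = SplitQueues()
queues.join("web", "alice")
queues.join("web", "bob")
queues.join("config", "carol")
ready = queues.next_up()  # alice and carol can deploy at the same time
```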

SLIDE 21

How can we scale it?

Concurrent deploys:

HELLO SPLIT QUEUES

SLIDE 22

How can we scale it?

Reduce deploy times:

  • Tweaks around rsync
  • Keep codebase in RAM (tmpfs)
  • Increase rsync concurrency
  • Reduce timeouts and retry intervals
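
A sketch of the last two bullets: run rsync to many hosts in parallel with a short I/O timeout and a bounded number of retries. The worker count, timeout, retry count, paths and host names are all made-up values, and this is not Deployinator's code. `-a`, `--delete` and `--timeout` are standard rsync options.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def rsync_command(src: str, host: str, dest: str, io_timeout: int) -> list[str]:
    # --timeout makes rsync give up after io_timeout seconds without I/O.
    return ["rsync", "-a", "--delete", f"--timeout={io_timeout}", src, f"{host}:{dest}"]


def run_rsync(cmd: list[str]) -> bool:
    """Run one rsync and report whether it exited cleanly."""
    return subprocess.run(cmd).returncode == 0


def sync_all(hosts, src, dest, workers=8, io_timeout=10, retries=2, run=run_rsync):
    """Push src to every host in parallel; return the hosts that never succeeded."""

    def sync_one(host):
        cmd = rsync_command(src, host, dest, io_timeout)
        # any() stops at the first success, so retries only happen after a failure.
        return any(run(cmd) for _ in range(1 + retries))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(sync_one, hosts))
    return [host for host, ok in zip(hosts, results) if not ok]
```

Passing a different `run` callable keeps the sketch testable without touching real hosts.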

SLIDE 23

How can we scale it?

Make deploys atomic:

Yin Yang

Active Docroot

SLIDE 24

How can we scale it?

Make deploys atomic:

Yin Yang

Active Docroot rsync

SLIDE 25

How can we scale it?

Make deploys atomic:

Yin Yang

Active Docroot

SLIDE 26

How can we scale it?

Make deploys atomic:

Yin Yang

Active Docroot

SLIDE 27

How can we scale it?

Make deploys atomic:

  • Not so trivial
  • PHP opcache problems
  • include_path troubles
  • Swapping symlinks mid-request
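
The yin/yang flip itself can be sketched in a few lines: sync into the inactive directory, then repoint the docroot symlink with a single rename. This assumes a POSIX filesystem; the `current` link name and the `sync` callback are hypothetical. It covers only the swap, not the opcache and include_path problems listed above.

```python
import os
import tempfile


def activate(docroot_link: str, target_dir: str) -> None:
    """Point docroot_link at target_dir in one atomic step."""
    tmp_link = docroot_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target_dir, tmp_link)
    # rename(2) replaces the old link in place, so the docroot never disappears.
    os.replace(tmp_link, docroot_link)


def deploy(base: str, sync) -> str:
    """Sync into the inactive side, then flip the link. Returns the new active side."""
    link = os.path.join(base, "current")
    active = os.path.basename(os.path.realpath(link)) if os.path.lexists(link) else "yang"
    inactive = "yin" if active == "yang" else "yang"
    sync(os.path.join(base, inactive))
    activate(link, os.path.join(base, inactive))
    return inactive


base = tempfile.mkdtemp()
for side in ("yin", "yang"):
    os.mkdir(os.path.join(base, side))
first = deploy(base, sync=lambda path: None)   # no-op stand-in for rsync
second = deploy(base, sync=lambda path: None)  # flips back to the other side
```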

SLIDE 28

http://github.com/etsy/mod_realdoc

SLIDE 29

How can we scale it?

Make deploys atomic, mod_realdoc:

  • Apache post_read_request hook
  • Whole request works on realpath of docroot

  • Caches realpath for 2s
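
The caching idea can be illustrated outside Apache. mod_realdoc is an Apache module; the Python below is only a sketch of "resolve the docroot symlink, remember the answer for 2 seconds", with a hypothetical class name and an injectable clock so the expiry is easy to exercise.

```python
import os
import time


class RealpathCache:
    """Resolve symlinks, reusing each answer until its TTL runs out."""

    def __init__(self, ttl_seconds: float = 2.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._cached = {}  # path -> (resolved path, expiry time)

    def resolve(self, path: str) -> str:
        now = self.clock()
        hit = self._cached.get(path)
        if hit and now < hit[1]:
            return hit[0]  # still fresh: skip the filesystem lookup
        resolved = os.path.realpath(path)
        self._cached[path] = (resolved, now + self.ttl)
        return resolved
```

With a cache like this, a request that starts just after a symlink flip may still see the old docroot for up to the TTL, but it sees one consistent path for its whole lifetime.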

SLIDE 30

http://github.com/etsy/incpath

SLIDE 31

How can we scale it?

Make deploys atomic, incpath:

  • PHP extension
  • Updates a portion of include_path
  • $_SERVER["DOCUMENT_ROOT"]
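
A guess at what "updates a portion of include_path" amounts to, written in Python rather than as a PHP extension: entries under the symlinked docroot are rewritten to the resolved docroot, and everything else is left alone. The function name, the `:` separator and the example paths are assumptions; this is not incpath's actual code.

```python
def rewrite_include_path(include_path: str, docroot_link: str,
                         docroot_real: str, sep: str = ":") -> str:
    """Swap the symlinked docroot prefix for its resolved path in each entry."""
    link = docroot_link.rstrip("/")
    real = docroot_real.rstrip("/")
    rewritten = []
    for entry in include_path.split(sep):
        # Match the docroot itself or anything below it, but not "/current2".
        if entry == link or entry.startswith(link + "/"):
            entry = real + entry[len(link):]
        rewritten.append(entry)
    return sep.join(rewritten)


# Hypothetical paths: "current" is the symlink, "yin" is the side it points at.
new_path = rewrite_include_path(
    ".:/var/www/current/lib:/usr/share/php",
    docroot_link="/var/www/current",
    docroot_real="/var/www/yin",
)
print(new_path)  # .:/var/www/yin/lib:/usr/share/php
```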

SLIDE 32

Infrastructure

SLIDE 33

Scaling infrastructure

Before:

Deploy Host

Deployinator

Production Servers

SLIDE 34

Scaling infrastructure

After:

Deploy Host

Deployinator

Production Servers

Deploy Host

SLIDE 35

Results!

SLIDE 36

Results!

What did we gain?

  • No need to restart Apache
  • Entire deploy in one push
  • Opcode cache stays warm!

SLIDE 37

Results!

Push frequency

  • Deploys per day = (push_queue_hours * 60) / minutes to deploy
  • Still ~15 mins/deploy:
      Much more code going out
      Tests still run fast
      Less time waiting to deploy

SLIDE 38

Q&A