Deploying, At An Unusual Scale - Andrew Godwin (@andrewgodwin)


slide-1
SLIDE 1

Deploying, At An Unusual Scale

Andrew Godwin

http://www.flickr.com/photos/whiskeytango/1431343034/

@andrewgodwin

slide-2
SLIDE 2

Hi, I'm Andrew.

Serial Python developer
Django core committer
Co-founder of ep.io

slide-3
SLIDE 3

Hi, I'm Andrew.

Serial Python developer
Django core committer
Co-founder of ep.io
Occasional fast talker

slide-4
SLIDE 4

"Andrew speaks English like a machine gun speaks bullets."

Reinout van Rees

slide-5
SLIDE 5

We're ep.io

Python Platform-as-a-Service
Easy deployment, easy upgrades
PostgreSQL, Redis, Celery, and more

slide-6
SLIDE 6

Why am I here?

Our Architecture
How we deploy Django
How varied Django deployments are

slide-7
SLIDE 7

Our Architecture

slide-8
SLIDE 8

Balancer   Balancer

Runner   Runner   Runner

App 1   App 2   App 3   App 2   App 4   App 1

Databases   File Storage

slide-9
SLIDE 9

Oh My God, It's Full of Pairs

Everything is redundant
Distributed programming is Hard

slide-10
SLIDE 10

Hardware

Real colo'd machines
Linode
EC2
(pretty unreliable) (pretty reliable) (pretty reliable)
IPv6 (as much as we can)

slide-11
SLIDE 11

ØMQ

We used to use Redis
Everything now on ZeroMQ
Eliminates SPOF*

* Single Point Of Failure. What a pointless acronym.

slide-12
SLIDE 12

ØMQ Usage

Redundant location-resolvers (Nexus)
REQ/XREP for control messages
PUSH/PULL for stats, logs
PUB/SUB for heartbeats, locking

slide-13
SLIDE 13

Runners

Unsurprisingly, these run the code
SquashFS filesystem images
Virtualenvs per app
UID & permission isolation, more coming

slide-14
SLIDE 14

Logging/Stats

All done asynchronously using ØMQ
Logs to filesystem (chunked files)
Stats to PostgreSQL database, for now

slide-15
SLIDE 15

Loadbalancers

Intercept all incoming HTTP requests
Look up hostname (or suffix)
HTTP 1.1 compliant
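
The "look up hostname (or suffix)" step could be sketched roughly as follows. This is only an illustration, not ep.io's balancer code: the routing tables, app IDs, and the `resolve_app` name are all hypothetical, and the sketch ignores bracketed IPv6 literals in the Host header.

```python
from typing import Optional

# Hypothetical routing tables (not from the talk): exact hostnames
# and wildcard-style suffixes, each mapped to an app ID.
EXACT_HOSTS = {"www.example.com": "app-1"}
SUFFIX_HOSTS = {".apps.example.com": "app-2"}


def resolve_app(host_header: str) -> Optional[str]:
    """Return the app ID for a Host header, or None if nothing matches."""
    # Drop an optional ":port" and normalise case; hostnames are
    # case-insensitive. (Bracketed IPv6 literals are not handled here.)
    host = host_header.split(":", 1)[0].strip().lower().rstrip(".")
    if host in EXACT_HOSTS:
        return EXACT_HOSTS[host]
    # Try progressively shorter suffixes: ".b.c.d", then ".c.d", then ".d".
    labels = host.split(".")
    for i in range(1, len(labels)):
        suffix = "." + ".".join(labels[i:])
        if suffix in SUFFIX_HOSTS:
            return SUFFIX_HOSTS[suffix]
    return None
```

With these tables, `resolve_app("foo.apps.example.com")` gives `"app-2"`, while an unknown host gives `None`, at which point a balancer would presumably answer with an error page.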

slide-16
SLIDE 16

Databases

Shared (only for PostgreSQL)
Dedicated (uses Runner framework)
PostgreSQL 9, damnit

slide-17
SLIDE 17

Django in the backend

We use the ORM extensively
Annoying settings fiddling in __init__

slide-18
SLIDE 18

www.ep.io

Runs on ep.io, just like any other app*
Provides JSON API, web UI

* Well not quite - App ID 0 is special - but we're working on it

slide-19
SLIDE 19

WSGI

It's a standard, right?

slide-20
SLIDE 20

WSGI

It's a standard, right?
Well, yes, and it works fine,
but it's not enough for serving a Python app
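
For context, here is roughly everything WSGI itself standardises: one callable. The sketch below uses only the standard library; `application` and the `call_app` helper are illustrative names. Note what is absent from the interface: static files, dependencies, and settings, which is the gap the following slides cover.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A complete WSGI app: take the environ, start the response, return body chunks."""
    body = b"Hello from WSGI\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process, roughly the way a server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal, valid environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```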

slide-21
SLIDE 21

Static Files

CSS, images, JavaScript, etc.
Needs a URL and a directory path
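
A minimal sketch of what "a URL and a directory path" implies for whatever serves the files. The `resolve_static` function and its parameters are hypothetical, and the sketch only maps paths; it does not check that the file exists or serve it.

```python
import os
from typing import Optional


def resolve_static(url_path: str, url_prefix: str, directory: str) -> Optional[str]:
    """Map a request path under url_prefix to a filesystem path inside directory.

    Returns None when the URL is outside the prefix or would escape the directory.
    """
    if not url_prefix.endswith("/"):
        url_prefix += "/"
    if not url_path.startswith(url_prefix):
        return None
    relative = url_path[len(url_prefix):]
    root = os.path.realpath(directory)
    candidate = os.path.realpath(os.path.join(root, relative))
    # Refuse "../" tricks and absolute paths that land outside the root.
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate
```

For example, with prefix `/static/` and some directory `D`, the request path `/static/css/site.css` maps to `D/css/site.css`, while `/static/../secret` is rejected.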

slide-22
SLIDE 22

Python & Dependencies

Mostly filled by pip/buildout/etc
packaging apparently allows version spec

slide-23
SLIDE 23

Deploying Django

It makes things consistent, right?

slide-24
SLIDE 24

Settings Layouts

Vanilla settings.py
local_settings.py
configs/HOSTNAME.py
Many others...
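
To make the variety concrete, here is a sketch of how a platform might probe for the layouts listed above. The search order, the module naming, and all function names are assumptions made for illustration; the talk does not say how ep.io detects settings.

```python
import importlib.util
import socket
from typing import Callable, List, Optional


def candidate_settings_modules(project: str, hostname: str) -> List[str]:
    """List settings modules to try, most specific first (assumed order)."""
    # Hostnames may contain dots and dashes, which module names cannot.
    host_module = hostname.split(".")[0].replace("-", "_")
    return [
        f"{project}.configs.{host_module}",  # configs/HOSTNAME.py
        f"{project}.local_settings",         # local_settings.py
        f"{project}.settings",               # vanilla settings.py
    ]


def module_exists(name: str) -> bool:
    """True if the module can be located on sys.path, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises if a parent package is missing.
        return False


def pick_settings_module(
    project: str,
    hostname: Optional[str] = None,
    exists: Callable[[str], bool] = module_exists,
) -> Optional[str]:
    """Return the first candidate that exists, or None."""
    hostname = hostname or socket.gethostname()
    for name in candidate_settings_modules(project, hostname):
        if exists(name):
            return name
    return None
```

The result could then be exported as `DJANGO_SETTINGS_MODULE`. In many real projects `local_settings.py` is imported by `settings.py` rather than used on its own, which is exactly the kind of variation that makes guessing hard.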

slide-25
SLIDE 25

Python Paths

Project-level imports
App-level imports
apps/ directories

slide-26
SLIDE 26

Databases

If it's SQL, it's PostgreSQL
Redis for key-value, MongoDB soon
Some things assume a safe network

slide-27
SLIDE 27

HA (High Availability)

Not terribly easy with shared DBs
PostgreSQL 9's sensible warm standby
Redis has SLAVEOF
Possibly use DRBD for general solution

slide-28
SLIDE 28

Backups

High Availability is NOT a backup
btrfs for consistent snapshotting
Archived remote syncs
No access to backups from servers

slide-29
SLIDE 29

Migrations

No solution yet for migration/code sync
We're working on it...

slide-30
SLIDE 30

Web serving

It's not like it's important or anything

slide-31
SLIDE 31

gunicorn

Small and lightweight
Supports long-running requests
Pretty stable

slide-32
SLIDE 32

nginx

Even more lightweight
Extremely fast
Really, really stable

slide-33
SLIDE 33

The Load Balancer

Used to be HAProxy
Rewritten to custom Python daemon
eventlet used for high throughput
Can't use nginx, no HTTP 1.1 for backends

slide-34
SLIDE 34

Celery

See: Yesterday's Talk
Slightly tricky to run many
We use Redis as the backend

slide-35
SLIDE 35

Management Commands

First off, run as subprocess
Then, a custom PTY module
Now, run as pty-wrapping subprocesses
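
A "pty-wrapping subprocess" can be sketched with the standard library alone. This is an assumption about the general technique, not ep.io's code; `run_in_pty` is a hypothetical name, and the sketch is Unix-only and does not forward input to the command.

```python
import os
import pty
import subprocess
import sys


def run_in_pty(argv):
    """Run argv with a pseudo-terminal as its stdio; return (exit code, output bytes)."""
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(
            argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd, close_fds=True
        )
    except Exception:
        os.close(master_fd)
        os.close(slave_fd)
        raise
    # The parent only needs the master side; the child holds the slave side.
    os.close(slave_fd)
    chunks = []
    try:
        while True:
            try:
                data = os.read(master_fd, 1024)
            except OSError:
                # Linux reports EIO once the child has closed the slave side.
                break
            if not data:
                break
            chunks.append(data)
    finally:
        os.close(master_fd)
    return proc.wait(), b"".join(chunks)
```

The reason to bother: a command run this way sees a real terminal, so `sys.stdout.isatty()` is true inside it and interactive prompts or coloured output behave as they would in a shell. A management command could be run as `run_in_pty([sys.executable, "manage.py", "syncdb"])`, where the `manage.py` path is whatever the app uses.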

slide-36
SLIDE 36

Some General Advice

If you're crazy enough to do this

slide-37
SLIDE 37

Messaging's Not Enough

Having a state to check is handy
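
One way to read this slide: messages such as heartbeats are easy to miss, so it helps to fold them into a state table that anyone can query later. The sketch below is an illustrative interpretation; the `HeartbeatTable` class and the timeout value are hypothetical.

```python
import time
from typing import Dict, List, Optional


class HeartbeatTable:
    """Turn a stream of heartbeat messages into checkable state."""

    def __init__(self, timeout: float = 10.0):
        self.timeout = timeout
        self._last_seen: Dict[str, float] = {}

    def record(self, node: str, now: Optional[float] = None) -> None:
        """Call this whenever a heartbeat message arrives from node."""
        self._last_seen[node] = time.monotonic() if now is None else now

    def is_alive(self, node: str, now: Optional[float] = None) -> bool:
        """A node is alive if it has been seen within the timeout."""
        now = time.monotonic() if now is None else now
        seen = self._last_seen.get(node)
        return seen is not None and (now - seen) <= self.timeout

    def dead_nodes(self, now: Optional[float] = None) -> List[str]:
        """Nodes that were seen at some point but have since gone quiet."""
        now = time.monotonic() if now is None else now
        return sorted(
            node for node, seen in self._last_seen.items()
            if now - seen > self.timeout
        )
```

A component that starts late, or that dropped a few messages, can still ask `is_alive("runner-1")` instead of having to replay everything it missed.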

slide-38
SLIDE 38

Why run one, when you can run two for twice the price?

Redundancy is good. Double redundancy is better.

slide-39
SLIDE 39

Always expect the worst

Hope you never have to deal with it.

slide-40
SLIDE 40

The more backups, the better.

Make sure you have historical ones, too.

slide-41
SLIDE 41

Django is very flexible

Sometimes a little too flexible...

slide-42
SLIDE 42

Your real problems will emerge later

Don't over-optimise up front for everything

slide-43
SLIDE 43

Questions?

Andrew Godwin

andrew@ep.io
@andrewgodwin