HEROKU: THE PATH TO HIGH SCALE AND AVAILABILITY fabio@heroku.com



slide-1
SLIDE 1

HEROKU

THE PATH TO HIGH SCALE AND AVAILABILITY

fabio@heroku.com

slide-2
SLIDE 2

QCONSP 2014

FABIO KUNG

Tech Lead, Runtime Systems at Heroku

slide-3
SLIDE 3
slide-4
SLIDE 4

heroku scale web=3 worker=2

slide-5
SLIDE 5

alta-escala-disponibilidade.herokuapp.com

slide-6
SLIDE 6

millions of (web) applications

slide-7
SLIDE 7

one of the largest Linux container (LXC) deployments in the world

slide-8
SLIDE 8

>60k requests per second

slide-9
SLIDE 9

>5G requests per day
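
The per-second and per-day figures on these two slides are consistent with each other. A quick check, assuming (as a simplification not stated in the talk) that the 60k rate is sustained around the clock:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Lower bound quoted on the slide (">60k"); assumed constant all day.
requests_per_second = 60_000
requests_per_day = requests_per_second * SECONDS_PER_DAY

print(f"{requests_per_day:,} requests/day")  # 5,184,000,000, i.e. just over 5G
```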

slide-10
SLIDE 10

12FACTOR.NET

portable
modern platforms (cloud)
elasticity

slide-11
SLIDE 11

two regions in production: us-east and eu-west

slide-12
SLIDE 12

multiple Availability Zones

slide-13
SLIDE 13
slide-14
SLIDE 14
slide-15
SLIDE 15
slide-16
SLIDE 16

2008

slide-17
SLIDE 17

2009

slide-18
SLIDE 18

2010

slide-19
SLIDE 19

GROWTH

facebook/heroku

slide-20
SLIDE 20

2011

slide-21
SLIDE 21

2013

slide-22
SLIDE 22

VIBE

slide-23
SLIDE 23
slide-24
SLIDE 24

starving-samurai-42.herokuapp.com

slide-25
SLIDE 25
slide-26
SLIDE 26

https://www.flickr.com/photos/timriley/9361949580

slide-27
SLIDE 27
slide-28
SLIDE 28
slide-29
SLIDE 29

hacker culture
flickr/dominicotine

slide-30
SLIDE 30

TEAMS AND COMPONENTS

slide-31
SLIDE 31

TOTAL OWNERSHIP

Dependencies?
Autonomy
Polyglot
Full stack

slide-32
SLIDE 32

CORE -> MICROSERVICES

no free lunch

slide-33
SLIDE 33

IMPLICIT INTERFACES

poor, informal documentation
evolution, updates, coordinated releases
manifest-driven APIs

slide-34
SLIDE 34

DISTRIBUTED SYSTEMS

retry
circuit breaker
rate limiting
rollback (distributed transactions)
state replication
cache
...
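
Two of the patterns named on this slide, retry and circuit breaker, can be sketched in a few lines. This is only an illustration, not code from the talk: the class and function names, thresholds, and delays are all hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and a call is rejected without being tried."""


class CircuitBreaker:
    """Hypothetical minimal breaker: opens after `max_failures` consecutive
    failures and rejects calls until `reset_after` seconds have passed."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: allow one trial call (half-open state).
            # A single further failure re-opens the circuit.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def retry(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call `func` up to `attempts` times, doubling the delay after each failure.
    A CircuitOpenError is not retried, because the breaker is already failing fast."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

The two compose naturally: `retry(lambda: breaker.call(fetch))` retries transient errors but stops immediately once the breaker opens, which keeps a struggling downstream service from being hammered by retries. Here `breaker` and `fetch` are placeholder names.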

slide-35
SLIDE 35

HEROKU SCALE WEB=3 WORKER=5

slide-36
SLIDE 36

HEROKU SCALE WEB=3 WORKER=5

slide-37
SLIDE 37

TROUBLESHOOTING

asynchronicity
distributed tracing
visibility!
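
One common building block for distributed tracing is a request ID that every service reuses and logs. The sketch below is an illustration only: the header name `X-Request-ID` and both helper functions are assumptions, not something shown in the talk.

```python
import uuid

# Assumed header name, chosen for illustration.
REQUEST_ID_HEADER = "X-Request-ID"


def ensure_request_id(headers):
    """Return (request_id, new_headers): reuse an incoming ID or mint a new one.

    The returned headers are a copy, ready to be sent on outgoing calls so
    that downstream services log the same ID.
    """
    out = dict(headers)
    request_id = out.get(REQUEST_ID_HEADER) or uuid.uuid4().hex
    out[REQUEST_ID_HEADER] = request_id
    return request_id, out


def log_line(request_id, service, message):
    """Format a key=value log line that can be searched by request ID."""
    return f"request_id={request_id} service={service} msg={message!r}"
```

With the same ID in every service's logs, a single search reconstructs the path of one request through asynchronous components.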

slide-38
SLIDE 38
slide-39
SLIDE 39
slide-40
SLIDE 40

TESTS

slide-41
SLIDE 41

DEPLOYS

slide-42
SLIDE 42

DUPLICATION!

slide-43
SLIDE 43

EPHEMERALIZATION

Do more with less.

slide-44
SLIDE 44

DOGFOODING

slide-45
SLIDE 45

TOOLS TEAM

slide-46
SLIDE 46

DEVCLOUDS

boot your own Heroku

@merman boot my cloud

slide-47
SLIDE 47

KERNEL PLATFORM

slide-48
SLIDE 48

DIREWOLF

slide-49
SLIDE 49

POSTGRESQL

counterexample: RabbitMQ

slide-50
SLIDE 50

ORG ACCOUNTS

slide-51
SLIDE 51

MULTIPLE TECHNOLOGIES

guidelines
service toolkits
polyglot product

slide-52
SLIDE 52

#OPSLIFE

slide-53
SLIDE 53

weekly on-call shifts

slide-54
SLIDE 54

ESCALATION PATH

whole team in the rotation
team manager
Incident Commander

slide-55
SLIDE 55

TRANSPARENCY

status.heroku.com

slide-56
SLIDE 56

csquared's Heroku Outage Lights System

slide-57
SLIDE 57

OPS TEAM

Total ownership?

slide-58
SLIDE 58

SRE

SITE RELIABILITY ENGINEERS
global reliability
capacity planning
reviews
incident retrospectives
tools, dashboards
on-call burden

slide-59
SLIDE 59

CHANGES

updating existing instances vs. replacing them with new instances

slide-60
SLIDE 60

RISK AVERSION

simple one-line changes -> catastrophe

slide-61
SLIDE 61

RIGOR

slide-62
SLIDE 62

"Hackers write Too Much Software. Need to change Process. Heroes mask Too Many Problems. Need to change Teamwork."

- Noah, Engineering Manager

slide-63
SLIDE 63

CODE REVIEW

async, remote members

slide-64
SLIDE 64

DOCUMENTATION

slide-65
SLIDE 65

DIAGRAMS

slide-66
SLIDE 66

DESIGN

slide-67
SLIDE 67

BLOG DRIVEN DEVELOPMENT

slide-68
SLIDE 68

CFP

big, difficult decisions

slide-69
SLIDE 69

CHECKLISTS

slide-70
SLIDE 70

Example: production checklist
✓ Has ops docs with executable instructions
✓ Has a high-fidelity staging setup with production parity
✓ Requested audit from the security team
✓ Alerts a human if it is down
✓ Simulated failures
✓ Uses structured logging
✓ Enforces SSL access
✓ Creds and rotation procedures are documented
✓ Send a launch email to engineering@
✓ Move to Production on the Engineering Lifecycle board

slide-71
SLIDE 71

SUPPORT

embedded

slide-72
SLIDE 72

BUS FACTOR

Total ownership?

slide-73
SLIDE 73

BENEVOLENT DICTATORSHIP

BDFL

slide-74
SLIDE 74
slide-75
SLIDE 75

COOPETITION

COOPERATIVE COMPETITION

slide-76
SLIDE 76

LXC

e.g.: DotCloud, container-rfc

"We lost the standards game for virtual machine images, but it feels like this community is tight-knit enough we might be able to do something for Linux Containers."

- Alex Polvi (coreos.com)

slide-77
SLIDE 77

GIT

$ git push heroku master
Counting objects: 1, done.
Writing objects: 100% (1/1), 181 bytes | 0 bytes/s, done.
Total 1 (delta 0), reused 0 (delta 0)

-----> Ruby app detected
-----> Compiling Ruby
...
To git@heroku.com:myapp.git
   91dfe0b..f251ba7  master -> master

e.g.: GitHub

slide-78
SLIDE 78

2012

slide-79
SLIDE 79

CLOUD

e.g.: AWS, AppEngine

slide-80
SLIDE 80

PEOPLE

"no jerks" policy

slide-81
SLIDE 81

CORE -> INDEPENDENT TEAMS

slide-82
SLIDE 82

TOTAL OWNERSHIP

slide-83
SLIDE 83

FOCUS?

SRE
product

- heroes

+ coordination

slide-84
SLIDE 84

HEROKU IN EUROPE

Hurricane Sandy (2012) -> us-east -> us-west

slide-85
SLIDE 85

MANAGEMENT

slide-86
SLIDE 86

mdz's Scaling Human Systems

slide-87
SLIDE 87

SLACK

always too busy

slide-88
SLIDE 88

WHAT CHANGED?

values (Adam Wiggins)

slide-89
SLIDE 89

EPHEMERALIZATION

Do more with less.

slide-90
SLIDE 90

MAKE IT REAL

Ideas are cheap.

slide-91
SLIDE 91

SHIP IT

Nothing is real until it's being used by a real user.

slide-92
SLIDE 92

DO IT WITH STYLE

Aesthetic matters.

slide-93
SLIDE 93

INTUITION-DRIVEN | DATA-DRIVEN

Users don't really know what they want.

slide-94
SLIDE 94

... PROVE IT WITH DATA

bikeshed@heroku.com

slide-95
SLIDE 95

DIVIDE AND CONQUER

If it's hard, cut scope.

slide-96
SLIDE 96

TIMING MATTERS

Maybe now isn't the right time.

slide-97
SLIDE 97

THROW THINGS AWAY

Never be afraid to throw something away and do it again.

slide-98
SLIDE 98

https://www.flickr.com/photos/teich/9427507382/

slide-99
SLIDE 99

SMALL SHARP TOOLS

Composability.
The Art of Unix Programming.
Also teams. Several small, autonomous, focused teams.

slide-100
SLIDE 100

PUT IT IN THE CLOUD

Services, not software.

slide-101
SLIDE 101

RESULTS, NOT POLITICS

"Get ahead" in your career by delivering real value, not by impressing your boss or with big talk.

slide-102
SLIDE 102

DECISION-MAKING VIA OWNERSHIP

NOT CONSENSUS OR AUTHORITY
Ownership can't be given, only taken.
Ownership can't be declared, only demonstrated.

slide-103
SLIDE 103

DO-OCRACY / INTRAPRENEURSHIP

Ask forgiveness, not permission.

slide-104
SLIDE 104

EVERYTHING IS AN EXPERIMENT

Everything is always subject to change. Ending an experiment isn't a failure.

slide-105
SLIDE 105

OWN UP TO FAILURE

Admit your mistake, say you're sorry, and feel the failure to make sure you learned from it. Then, get back to work.

slide-106
SLIDE 106

GRADUAL ROLLOUTS

Incremental. Adjust.
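
One way to make a rollout incremental and adjustable is to gate a change by percentage. The sketch below is a hypothetical illustration, not a mechanism described in the talk; the function name and the hashing scheme are assumptions.

```python
import hashlib


def in_rollout(feature, user_id, percent):
    """Return True if `user_id` falls inside the first `percent` of 100 buckets.

    The bucket is derived from a hash of the feature name and user ID, so a
    given user always gets the same answer, and raising `percent` only ever
    adds users to the rollout.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return bucket < percent
```

Starting at a small percentage, watching the results, and then raising or lowering the number is the "adjust" half of the slide.
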
slide-107
SLIDE 107

DESIGN EVERYTHING

Be intentional.

slide-108
SLIDE 108

QUESTION EVERYTHING

The status quo is never good enough.

slide-109
SLIDE 109

MANIACAL FOCUS ON SIMPLICITY

There is no step 1.

slide-110
SLIDE 110

... CODE THAT IS TOO CLEVER

load averages
ELB -> unicorns

slide-111
SLIDE 111

CLI 4 LIFE

Command-line interfaces are the heart of developer workflows.

slide-112
SLIDE 112

IGNORE THE COMPETITION

Except to borrow good ideas.

slide-113
SLIDE 113

WRITE WELL

Clear writing is clear thinking.

slide-114
SLIDE 114

STRONG OPINIONS, WEAKLY HELD

Be willing to change your mind.

slide-115
SLIDE 115

THANK YOU!

fabio@heroku.com