SOLID SNAKES ATTITUDE INCENTIVES IMPORTANT VS URGENT SIMPLICITY - - PowerPoint PPT Presentation




slide-1
SLIDE 1

SOLID SNAKES

HYNEK SCHLAWACK

slide-2
SLIDE 2
slide-3
SLIDE 3
slide-4
SLIDE 4

ATTITUDE

slide-5
SLIDE 5

INCENTIVES

slide-6
SLIDE 6

IMPORTANT VS URGENT

slide-7
SLIDE 7

THE PRICE OF RELIABILITY IS THE PURSUIT OF THE UTMOST SIMPLICITY.

Sir C.A.R. Hoare

SIMPLICITY
slide-8
SLIDE 8
slide-9
SLIDE 9

NORMAL ACCIDENTS

slide-10
SLIDE 10
slide-11
SLIDE 11
slide-12
SLIDE 12
slide-13
SLIDE 13

ESSENTIAL

slide-14
SLIDE 14

ESSENTIAL VS ACCIDENTAL

slide-15
SLIDE 15
slide-16
SLIDE 16
slide-17
SLIDE 17
slide-18
SLIDE 18
slide-19
SLIDE 19

OPERATIONAL COMPLEXITY

slide-20
SLIDE 20

your DC Client App DB Redis Cache CDN Work Queue
slide-21
SLIDE 21

your DC Client App DB Redis Cache CDN Work Queue
slide-22
SLIDE 22

your DC Client App DB Redis Cache CDN Work Queue
slide-23
SLIDE 23

MICROSERVICES

slide-24
SLIDE 24

Service 2 Service 3 Service 1 Service 4 Service 5 Service 6 Service 7 Service 8

slide-25
SLIDE 25
slide-26
SLIDE 26

COMPLEXITY IS REALITY

slide-27
SLIDE 27
slide-28
SLIDE 28

PLAN FOR STUPIDITY

slide-29
SLIDE 29

I DON’T BELIEVE IN HUMAN ERROR

John Allspaw, CTO at Etsy

HUMAN ERRORS
slide-30
SLIDE 30
slide-31
SLIDE 31
slide-32
SLIDE 32

DATA VALIDATION

slide-33
SLIDE 33

DATA VALIDATION AT EDGES

slide-34
SLIDE 34

DATA VALIDATION NORMALIZATION AT EDGES
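
One way to read "at edges" is that untrusted input is checked and normalized exactly once, where it enters the system, so inner code can rely on clean values. A minimal sketch under that assumption follows; `SignupRequest`, `parse_signup`, and the field names are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignupRequest:
    """Hypothetical payload type used only for illustration."""
    email: str
    age: int


def parse_signup(raw: dict) -> SignupRequest:
    """Validate and normalize untrusted input once, at the boundary."""
    # Normalize first: trim whitespace and lowercase the address.
    email = str(raw.get("email", "")).strip().lower()
    if "@" not in email:
        raise ValueError(f"invalid email: {email!r}")
    try:
        age = int(raw.get("age"))
    except (TypeError, ValueError) as e:
        raise ValueError("age must be an integer") from e
    if not 0 <= age <= 150:
        raise ValueError(f"age out of range: {age}")
    return SignupRequest(email=email, age=age)
```

Code behind the edge then accepts only `SignupRequest` and never re-checks raw dictionaries.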

slide-35
SLIDE 35
slide-36
SLIDE 36

PLOT TWIST!

slide-37
SLIDE 37

FAILURE IS INEVITABLE

slide-38
SLIDE 38

RELIABILITY

slide-39
SLIDE 39

RELIABILITY

Twitter 2007

slide-40
SLIDE 40

RELIABILITY

Twitter 2007 NASA 1969

slide-41
SLIDE 41
slide-42
SLIDE 42

FAILURE IS INEVITABLE

slide-43
SLIDE 43

FAILURE IS INEVITABLE (⌐■_■)

slide-44
SLIDE 44

EXPECT

slide-45
SLIDE 45
slide-46
SLIDE 46
slide-47
SLIDE 47

TIMEOUTS
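
The idea behind this slide is presumably that no call to something outside your process should be allowed to wait forever. Most network libraries accept a timeout argument; when a call offers none, a deadline can be approximated by running it in a worker thread. The sketch below assumes that approach; `call_with_timeout` is a hypothetical helper, not something shown in the talk.

```python
import concurrent.futures

# Shared pool for calls that need a deadline (size chosen arbitrarily).
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def call_with_timeout(fn, timeout_s, *args):
    """Run fn(*args) and give up waiting after timeout_s seconds."""
    future = _pool.submit(fn, *args)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Best effort only: a thread that is already running cannot be
        # interrupted, so the work may still finish in the background.
        future.cancel()
        raise TimeoutError(f"call exceeded {timeout_s}s") from None
```

The caller stops waiting, but the worker is not killed, so a native timeout on the socket or client is preferable whenever one exists.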

slide-48
SLIDE 48
slide-49
SLIDE 49

CLOSED

Local Client Remote API Circuit Breaker call() call() result result
slide-50
SLIDE 50

CLOSED → OPEN

Local Client Remote API Circuit Breaker call() call() timeout! timeout!
slide-51
SLIDE 51

OPEN

Local Client Remote API Circuit Breaker call() circuit open!
slide-52
SLIDE 52

OPEN → HALF-CLOSED

Local Client Remote API Circuit Breaker call() call() result result
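
The four diagrams above can be sketched as a small state machine: calls pass through while the breaker is closed, repeated failures open it, an open breaker fails immediately without touching the remote API, and after a cool-down a single trial call decides whether it closes again. This is a minimal illustration, not the implementation used in the talk; the class name, thresholds, and the injectable `clock` are assumptions.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses to call the remote API."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means CLOSED

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                # OPEN: fail fast, do not touch the remote API.
                raise CircuitOpenError("circuit open")
            # Cool-down elapsed (HALF-CLOSED): let one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success: reset to CLOSED.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller wraps each remote call as `breaker.call(remote_api, request)` and treats `CircuitOpenError` like any other fast failure.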
slide-53
SLIDE 53

REDUNDANCY

slide-54
SLIDE 54
slide-55
SLIDE 55

DOCS

slide-56
SLIDE 56

DEAL WITH IT

(¬∎_∎)

slide-57
SLIDE 57

DON’T MAKE IT WORSE

slide-58
SLIDE 58

RETRIES

slide-59
SLIDE 59

BACKOFF

slide-60
SLIDE 60

BACKOFF

EXPONENTIAL

slide-61
SLIDE 61

BACKOFF

EXPONENTIAL

WITH JITTER
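
A minimal sketch of retries with capped exponential backoff and jitter, assuming the "full jitter" variant (each delay is drawn uniformly between zero and the current exponential ceiling). The function names, default values, and the set of retryable exceptions are illustrative assumptions, not taken from the slides.

```python
import random
import time


def backoff_delays(base_s=0.1, cap_s=10.0, attempts=5, rng=random.random):
    """Yield jittered delays: uniform in [0, min(cap_s, base_s * 2**n))."""
    for n in range(attempts):
        yield rng() * min(cap_s, base_s * 2 ** n)


def retry(fn, attempts=5, base_s=0.1, cap_s=10.0,
          retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call fn up to `attempts` times, sleeping a jittered delay in between."""
    delays = list(backoff_delays(base_s, cap_s, attempts - 1))
    for n in range(attempts):
        try:
            return fn()
        except retry_on as e:
            if n == attempts - 1:
                # Out of attempts: surface the failure, keep the cause.
                raise RuntimeError(f"gave up after {attempts} attempts") from e
            sleep(delays[n])
```

The jitter spreads clients out so that they do not all retry at the same instant after a shared outage, and the cap keeps the wait bounded.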

slide-62
SLIDE 62

Frontend Backend

3x

slide-63
SLIDE 63

Internal Backend A Internal Backend B

9x 9x

Frontend Backend 3x
slide-64
SLIDE 64

Internal Backend C 27x Internal Backend A Internal Backend B 9x 9x Frontend Backend 3x
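
The 3x, 9x, 27x labels are consistent with every layer making three attempts against the layer below it, so the worst-case load multiplies at each hop. A small sketch of that arithmetic, assuming three attempts per layer (the function name is hypothetical):

```python
def worst_case_calls(attempts_per_layer: int, layers: int) -> int:
    """Worst-case calls reaching the deepest layer for one user request."""
    return attempts_per_layer ** layers


# One, two, and three hops below the frontend.
amplification = [worst_case_calls(3, depth) for depth in (1, 2, 3)]
```

Under this assumption, a single failing request can turn into 27 calls against the deepest backend, which is why retrying at every layer can make an outage worse.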
slide-65
SLIDE 65

DON’T SWALLOW ERRORS

slide-66
SLIDE 66

try:
    do_something()
    return True
except Exception:
    return False

slide-67
SLIDE 67

try:
    do_something()
except Exception:
    raise AppException()

slide-68
SLIDE 68

try:
    do_something()
    return True
except Exception as e:
    raise AppException() from e

slide-69
SLIDE 69

try:
    do_something()
    return True
except Exception as e:
    raise AppException() from e

AppException().__cause__ == e
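
A runnable version of the last snippet, to show what `raise ... from e` preserves. `AppException` and `do_something` are stand-ins defined here for illustration; the `KeyError` is a made-up failure.

```python
class AppException(Exception):
    """Stand-in for the application-level exception on the slides."""


def do_something():
    raise KeyError("missing")  # hypothetical low-level failure


def run():
    try:
        do_something()
        return True
    except Exception as e:
        # Chain explicitly so the original error is not lost.
        raise AppException("do_something failed") from e


try:
    run()
except AppException as exc:
    caught = exc

# The original exception travels along as __cause__.
original = caught.__cause__
```

The traceback of an uncaught `AppException` then also shows the original error as its direct cause, instead of hiding it.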

slide-70
SLIDE 70

DON’T TRY TOO HARD

slide-71
SLIDE 71

sys.exit(1)

slide-72
SLIDE 72

CRASH-ONLY

slide-73
SLIDE 73

FAIL FAST FAIL LOUDLY
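
One reading of these slides: when a process hits an error it cannot handle, it should log the full traceback and exit with a non-zero status, leaving the restart to a supervisor rather than limping on. A minimal sketch under that assumption; the `main` wrapper and logger name are hypothetical.

```python
import logging
import sys

log = logging.getLogger("app")


def main(run):
    """Run the application; on an unexpected error, log loudly and exit."""
    try:
        run()
    except Exception:
        # log.exception records the message together with the traceback.
        log.exception("unrecoverable error, exiting")
        sys.exit(1)
```

The non-zero exit code is what lets a supervisor or orchestrator notice the crash and start a fresh process.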

slide-74
SLIDE 74

FOCUS ON RECOVERY

slide-75
SLIDE 75

MTTR

slide-76
SLIDE 76
slide-77
SLIDE 77

ZERO EXPECTATIONS

slide-78
SLIDE 78
slide-79
SLIDE 79

FAULT TOLERANCE

slide-80
SLIDE 80

FAULT TOLERANCE RECOVERY

slide-81
SLIDE 81

OX.CX/SS @HYNEK VRMD.DE