Things break. riak bends. Justin Sheehy (justin@basho.com)



SLIDE 1

Things break. riak bends.

Justin Sheehy

justin@basho.com

SLIDE 2

Perfection is Unattainable

A system cannot perform as well during a storm of component failure as it can on a sunny day.

SLIDE 3

Know How You Degrade

You might prevent whole system failure if you’re lucky and good, but what happens during partial failure? Plan it and understand it before your users do.

SLIDE 4

Know How You Degrade

Plan it and understand it before your users do. You think you know which parts will break.

SLIDE 5

Know How You Degrade

Plan it and understand it before your users do. You think you know which parts will break. You are wrong.

SLIDE 6

Harvest and Yield

harvest: a fraction (data available / complete data)

yield: a probability (queries completed / queries requested)

in tension with each other: (harvest * yield) ~ constant

goal: failures cause known linear reduction to one of these
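
A minimal sketch of the two ratios, assuming a store split into four equal shards with one shard down and queries spread evenly across shards. The class, the field names, and the two "policies" are hypothetical and only illustrate choosing which of the two numbers takes the hit.

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Counters for one observation window (hypothetical names)."""
    queries_requested: int
    queries_completed: int
    data_available: int   # e.g. shards reachable when answering
    data_complete: int    # e.g. shards that exist in total


def harvest(stats: WindowStats) -> float:
    # harvest: data available / complete data
    return stats.data_available / stats.data_complete


def yield_(stats: WindowStats) -> float:
    # yield: queries completed / queries requested
    return stats.queries_completed / stats.queries_requested


# Same failure (1 of 4 shards down), two ways to degrade.
# Policy A: answer every query from the 3 live shards (harvest drops).
policy_a = WindowStats(1000, 1000, 3, 4)
# Policy B: refuse the queries that need the dead shard (yield drops).
policy_b = WindowStats(1000, 750, 4, 4)

print(harvest(policy_a), yield_(policy_a))  # 0.75 1.0
print(harvest(policy_b), yield_(policy_b))  # 1.0 0.75
```

Either way the loss is linear in the number of failed shards; the design decision is which of the two numbers absorbs it.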

SLIDE 7

Harvest and Yield

traditional design demands 100% harvest, but success of modern applications is often measured in yield

plan ahead, know when you care!

SLIDE 8

Perfection is Unattainable

A system cannot perform as well during a storm of component failure as it can on a sunny day.

SLIDE 9

Perfection is Unattainable

A system cannot perform as well during a storm of component failure as it can on a sunny day.

failures will happen.

SLIDE 10

Resilience is Attainable

Assume that failures will happen.

Designing whole systems and components with individual failures in mind is a plan for predictable success.

SLIDE 11

Resilience is Attainable

Designing whole systems and components with individual failures in mind is a plan for predictable success.

Layered, multi-scale resilience is key!

SLIDE 12

Component Failure: reboot of live database

Worst case: whole DB corrupted!

Typical mitigation: write-ahead logging for repair

SLIDE 13

Component Failure: reboot of live database

Worst case: whole DB corrupted!

Typical mitigation: write-ahead logging for repair

Drawbacks: logging adds I/O, repair can be slow

SLIDE 14

Component Failure: reboot of live database

Alternative: append-only main storage

"log-structured" databases

Example: Bitcask

SLIDE 15

Bitcask

simple append-only file format

SLIDE 16

Bitcask

simple append-only file format

as a component of something bigger

SLIDE 17

Component Failure: reboot during record write

What about a half-written write? Two problems: detection, minimization.

SLIDE 18

Component Failure: reboot during record write

What about a half-written write?

Two problems: detection, minimization.

detection: minimum-length check, CRC check per record

SLIDE 19

Component Failure: reboot during record write

What about a half-written write?

Two problems: detection, minimization.

minimization: invalidate only the end-failed record, not the file
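
A minimal sketch of detection and minimization on an append-only log. The record layout (CRC32, then payload length, then payload) and the function names are assumptions for illustration, not Bitcask's actual on-disk format.

```python
import struct
import zlib

# Assumed record header: CRC32 of the payload, then the payload length.
HEADER = struct.Struct(">II")


def encode_record(payload: bytes) -> bytes:
    return HEADER.pack(zlib.crc32(payload), len(payload)) + payload


def scan(log: bytes):
    """Return (valid_payloads, valid_length).

    Stops at the first record that fails the minimum-length or CRC check,
    so only the damaged tail is dropped, not the whole file.
    """
    records, offset = [], 0
    while offset < len(log):
        # detection: minimum-length check (is a whole header present?)
        if len(log) - offset < HEADER.size:
            break
        crc, length = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        payload = log[start:start + length]
        # detection: truncated payload, or CRC mismatch
        if len(payload) < length or zlib.crc32(payload) != crc:
            break
        records.append(payload)
        offset = start + length
    return records, offset


log = encode_record(b"k1=v1") + encode_record(b"k2=v2")
# Simulate a reboot mid-write: the last record is missing its final 2 bytes.
torn = log + encode_record(b"k3=v3")[:-2]

records, valid_length = scan(torn)
print(records)                    # [b'k1=v1', b'k2=v2']
print(valid_length == len(log))   # True
```

Because the file is append-only, a torn write can only be at the tail, so truncating to `valid_length` is enough to recover.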

SLIDE 20

Zoom Out: Bitcask is one part of Riak

SLIDE 21

Component Failure: internal subsystem crash

Bugs can lurk anywhere. Unpredictability, eek.

Typical mitigation: complex exception management

[diagram: "X?" marked at seven places in the system]

SLIDE 22

Component Failure: internal subsystem crash

Stronger mitigation: supervision trees and "let it crash"

Added bonus: simpler and clearer code
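
A rough Python stand-in for the idea, not Erlang/OTP's supervisor behaviour: the worker contains no defensive error handling, and a separate supervisor restarts it from a clean state. `supervise`, `flaky_worker`, and `max_restarts` are hypothetical names.

```python
def supervise(start_worker, max_restarts=3):
    """Run start_worker(); if it crashes, start it again from scratch.

    After max_restarts restarts the crash is re-raised, so that a parent
    supervisor (the next level of the tree) can decide what to do.
    """
    restarts = 0
    while True:
        try:
            return start_worker()
        except Exception as crash:
            if restarts >= max_restarts:
                raise
            restarts += 1
            print(f"worker crashed ({crash!r}); restart #{restarts}")


attempts = {"n": 0}


def flaky_worker():
    # "Let it crash": no try/except here, the worker just fails loudly.
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("unexpected state")
    return "done"


result = supervise(flaky_worker)
print(result)  # done
```

The worker stays simple because recovery policy lives in one place, which is the "simpler and clearer code" bonus.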

SLIDE 23

Zoom Out: Virtual Nodes

Many storage instances per server. If one fails, whole system is okay.

SLIDE 24

Zoom Out: Virtual Nodes

Many storage instances per server. If one fails, whole system is okay.

Also good for operational sanity when adding or removing hosts.
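
A generic consistent-hashing sketch with several virtual nodes per host, to show why adding a host is operationally gentle. This is not Riak's ring implementation; `Ring`, `vnodes_per_host`, and the hashing scheme are assumptions for illustration.

```python
import hashlib
from bisect import bisect_right


def _hash(key: str) -> int:
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class Ring:
    def __init__(self, hosts, vnodes_per_host=8):
        # Each host owns many small points on the ring instead of one big arc.
        points = sorted(
            (_hash(f"{host}#{i}"), host)
            for host in hosts
            for i in range(vnodes_per_host)
        )
        self.hashes = [h for h, _ in points]
        self.hosts = [host for _, host in points]

    def owner(self, key: str) -> str:
        # The first virtual node clockwise from the key's hash owns the key.
        idx = bisect_right(self.hashes, _hash(key)) % len(self.hashes)
        return self.hosts[idx]


keys = [f"key-{i}" for i in range(1000)]
before = Ring(["a", "b", "c"])
after = Ring(["a", "b", "c", "d"])

moved = [k for k in keys if before.owner(k) != after.owner(k)]
# Only keys claimed by the new host move; nothing shuffles between a, b, c.
print(all(after.owner(k) == "d" for k in moved))  # True
print(len(moved) < len(keys))                     # True
```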

SLIDE 25

Zoom Out: Riak is a Distributed System

SLIDE 26

Component Failure: reboot during record write

What about a half-written write?

Two problems: detection, minimization.

minimization: invalidate only the end-failed record, not the file

Isn't this still a busted record?

SLIDE 27

Mitigation: quorum reads

replica responses: {ok, Value}, {ok, Value}, {error, not_found}

SLIDE 28

Mitigation: quorum reads

replica responses: {ok, Value}, {ok, Value}, {error, not_found}

client receives: {ok, Value}

SLIDE 29

Mitigation: quorum reads

helps with nearly any local error:

replica responses: {ok, Value}, {ok, Value}, {error, bad_crc}

client receives: {ok, Value}

SLIDE 30

Mitigation: read-repair

replica responses: {ok, Value}, {ok, Value}, {error, bad_crc}

client receives: {ok, Value}

failed replica after repair: {ok, Value}
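
A minimal sketch of a quorum read followed by read-repair, assuming three in-memory replicas modelled as dicts and simple majority agreement. A real store would also need versioning to decide which value is newest; that is left out here. All names, including the `quorum_not_met` reason, are hypothetical.

```python
from collections import Counter

CORRUPT = object()  # marks a record that fails its CRC check on one replica


def read_local(replica, key):
    """One replica's answer: ("ok", value) or ("error", reason)."""
    if key not in replica:
        return ("error", "not_found")
    value = replica[key]
    if value is CORRUPT:
        return ("error", "bad_crc")
    return ("ok", value)


def quorum_read(replicas, key, r=2):
    """Succeed when at least r replicas agree on the same value."""
    replies = [read_local(rep, key) for rep in replicas]
    counts = Counter(value for tag, value in replies if tag == "ok")
    best = counts.most_common(1)
    if not best or best[0][1] < r:
        return ("error", "quorum_not_met"), replies
    return ("ok", best[0][0]), replies


def read_repair(replicas, key, r=2):
    """Quorum read, then push the agreed value to replicas that disagreed."""
    result, replies = quorum_read(replicas, key, r)
    if result[0] == "ok":
        for rep, reply in zip(replicas, replies):
            if reply != result:
                rep[key] = result[1]
    return result


replicas = [{"k": "v"}, {"k": "v"}, {"k": CORRUPT}]
print(read_repair(replicas, "k"))      # ('ok', 'v')
print(read_local(replicas[2], "k"))    # ('ok', 'v')
```

The client never sees the local error, and the damaged replica is fixed as a side effect of the read.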

SLIDE 31

Component Failure: server down!

From a distributed system's point of view, a whole server can be seen as "a component."

How can the overall system continue to perform?

Computers fail all the time.

SLIDE 32

Mitigation: quorum reads

replica responses: {ok, Value}, {ok, Value}, X (server down)

client receives: {ok, Value}

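SLIDE 33

What about writes?

[diagram: client sends PUT Value; one server is down (X); the others answer ok]

SLIDE 34

Mitigation: sloppy quorum

[diagram: client sends PUT Value; one server is down (X); another node accepts the write in its place, and ok responses come back]

SLIDE 35

Mitigation: sloppy quorum

sloppy quorums work for reads too!

[diagram: one server is down (X); the nodes that answer return {ok, Value}, and the client receives {ok, Value}]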

SLIDE 36

sloppy quorums are sloppy

[diagram: {ok, Value}, {ok, Value}, {ok, Value}, and one "?"]

SLIDE 37

Mitigation: hinted handoff

SLIDE 38

Mitigation: hinted handoff

also a fix for inconsistent view of membership!
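
A minimal sketch of a sloppy-quorum write followed by hinted handoff, assuming in-memory nodes. `Node`, `sloppy_put`, `handoff`, and the `w` parameter are hypothetical names, not Riak's API.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}    # key -> value for keys this node owns
        self.hints = {}   # intended owner name -> {key: value} held for it


def sloppy_put(primaries, fallbacks, key, value, w=3):
    """Write to every primary that is up. For each primary that is down,
    give the write to a fallback node along with a hint naming the
    intended owner. Returns True if at least w nodes acknowledged."""
    acks = 0
    spare = (n for n in fallbacks if n.up)
    for primary in primaries:
        if primary.up:
            primary.data[key] = value
            acks += 1
        else:
            stand_in = next(spare, None)
            if stand_in is not None:
                stand_in.hints.setdefault(primary.name, {})[key] = value
                acks += 1
    return acks >= w


def handoff(holder, nodes_by_name):
    """Deliver hinted writes to owners that are reachable again."""
    for owner_name in list(holder.hints):
        owner = nodes_by_name[owner_name]
        if owner.up:
            owner.data.update(holder.hints.pop(owner_name))


a, b, c, d = (Node(name) for name in "abcd")
c.up = False                                   # server down
accepted = sloppy_put([a, b, c], [d], "k", "v")
print(accepted, c.data, d.hints)               # True {} {'c': {'k': 'v'}}

c.up = True                                    # server comes back
handoff(d, {n.name: n for n in (a, b, c, d)})
print(c.data, d.hints)                         # {'k': 'v'} {}
```

The write succeeds while the owner is away, and the hint is what lets the data find its way home afterwards.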

SLIDE 39

Zoom out: multiple clusters

SLIDE 40

Component Failure: datacenter-level outage

[diagram: one datacenter is down (X)]

SLIDE 41

Mitigation: masterless replication

[diagram: one datacenter is down (X)]

still live!

SLIDE 42

Mitigation: masterless replication

(will catch up later)

SLIDE 43

Things break. riak bends.

Justin Sheehy

justin@basho.com