NoSQL like There is No Tomorrow



slide-1
SLIDE 1

@ksshams @swami_79

NoSQL like There is No Tomorrow

Swaminathan Sivasubramanian

GM, NoSQL

Khawaja

Head of Engineering, NoSQL

Swami

slide-2
SLIDE 2

@ksshams @swami_79

how can you build your own DynamoDB Scale service?

slide-3
SLIDE 3

@ksshams @swami_79

let’s start with a story about a little company called amazon.com

slide-4
SLIDE 4

@ksshams @swami_79

once upon a time...

(in 2000)

episode 1

slide-5
SLIDE 5

@ksshams @swami_79

a few thousand miles away... (seattle)

slide-6
SLIDE 6

@ksshams @swami_79

amazon.com - a rapidly growing Internet-based retail business that relied on relational databases

slide-7
SLIDE 7

@ksshams @swami_79

we had 1000s of independent services

slide-8
SLIDE 8

@ksshams @swami_79

each service managed its state in RDBMs

slide-9
SLIDE 9

@ksshams @swami_79

RDBMs are actually kind of cool

slide-10
SLIDE 10

@ksshams @swami_79

first of all... SQL!!

slide-11
SLIDE 11

@ksshams @swami_79

so it is easier to query..

slide-12
SLIDE 12

@ksshams @swami_79

easier to learn

slide-13
SLIDE 13

@ksshams @swami_79

as versatile as a swiss army knife

complex queries

key-value access

transactions

analytics

slide-14
SLIDE 14

@ksshams @swami_79

RDBMs are too similar to Swiss Army Knives

slide-15
SLIDE 15

@ksshams @swami_79

but sometimes.. swiss army knives.. can be more than what you bargained for

slide-16
SLIDE 16

@ksshams @swami_79

partitioning

easy

re-partitioning

hard..
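Why re-partitioning is hard can be illustrated with a small sketch. The slide does not say how the databases were partitioned, so this assumes the common hash-mod-N scheme; the key names and the helper functions `partition_for` and `fraction_moved` are hypothetical.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic hash; Python's built-in hash() is salted per process."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def partition_for(key: str, num_partitions: int) -> int:
    """Naive placement: hash the key, then take it modulo the partition count."""
    return stable_hash(key) % num_partitions


def fraction_moved(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys that land on a different partition after resizing."""
    changed = sum(
        1 for k in keys
        if partition_for(k, old_count) != partition_for(k, new_count)
    )
    return changed / len(keys)


# Hypothetical workload: 10,000 order IDs spread over 4 database partitions.
keys = [f"order-{i}" for i in range(10_000)]
moved = fraction_moved(keys, old_count=4, new_count=5)
print(f"{moved:.0%} of keys move when going from 4 to 5 partitions")
```

Under this scheme a key stays put only when its hash gives the same remainder for both counts, so adding one partition to four relocates roughly four keys out of five. That is the kind of bulk data movement that makes repartitioning a project rather than a routine operation.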

slide-17
SLIDE 17

@ksshams @swami_79

so we bought

bigger boxes...

slide-18
SLIDE 18

@ksshams @swami_79

Q4 was hard work at Amazon

  • benchmark new hardware
  • migrate to new hardware
  • repartition databases
  • pray ...

slide-19
SLIDE 19

@ksshams @swami_79

RDBMs availability challenges..

slide-20
SLIDE 20

@ksshams @swami_79

then.. (in 2005)

episode 2

slide-21
SLIDE 21

@ksshams @swami_79

amazon dynamo

predecessor to dynamoDB

specialist tool :

  • limited querying capabilities
  • simpler consistency

  • replicated DHT with consistent hashing
  • optimistic replication
  • “sloppy quorum”
  • anti-entropy mechanism
  • object versioning
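The "replicated DHT with consistent hashing" idea can be sketched in a few lines. This is a toy ring under stated assumptions, not Dynamo's actual implementation: the `HashRing` class, the node names, the MD5-based hash and the choice of 100 virtual nodes per host are all illustrative.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic hash; Python's built-in hash() is salted per process."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._tokens = []   # sorted ring positions
        self._owner = {}    # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node claims several positions so load spreads more evenly.
        for i in range(self.vnodes):
            token = stable_hash(f"{node}#{i}")
            bisect.insort(self._tokens, token)
            self._owner[token] = node

    def preference_list(self, key: str, n: int = 3):
        """Return the first n distinct nodes clockwise from the key."""
        wanted = min(n, len(set(self._owner.values())))
        index = bisect.bisect(self._tokens, stable_hash(key))
        replicas = []
        while len(replicas) < wanted:
            node = self._owner[self._tokens[index % len(self._tokens)]]
            if node not in replicas:
                replicas.append(node)
            index += 1
        return replicas


keys = [f"cart-{i}" for i in range(2000)]
ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
before = {k: ring.preference_list(k, 1)[0] for k in keys}
ring.add_node("node-e")
after = {k: ring.preference_list(k, 1)[0] for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved) / len(keys):.0%} of keys changed owner after adding node-e")
```

Only the keys claimed by the new node change owner, and every one of them moves to that node. The rest of the ring is untouched, which is what makes adding capacity incremental.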
slide-22
SLIDE 22

@ksshams @swami_79

dynamo had many benefits

  • higher availability
      • we traded it off for eventual consistency

  • incremental scalability
      • no more repartitioning
      • no need to architect apps for peak
      • just add boxes

  • simpler querying model ==>> predictable performance
slide-23
SLIDE 23

@ksshams @swami_79

but dynamo was not perfect...

lacked strong consistency

slide-24
SLIDE 24

@ksshams @swami_79

but dynamo was not perfect...

scaling was easier, but...

slide-25
SLIDE 25

@ksshams @swami_79

but dynamo was not perfect...

steep learning curve

slide-26
SLIDE 26

@ksshams @swami_79

but dynamo was not perfect...

dynamo was a product ... ==>> not a service...

slide-27
SLIDE 27

@ksshams @swami_79

then.. (in 2012)

episode 3

slide-28
SLIDE 28

@ksshams @swami_79

“Even though we have years of experience with large, complex NoSQL architectures, we are happy to be finally out of the business of managing it ourselves.” - Don MacAskill, CEO

  • NoSQL database
  • fast & predictable performance
  • seamless scalability
  • easy administration

DynamoDB

slide-29
SLIDE 29

@ksshams @swami_79

build services not software!!

slide-30
SLIDE 30

@ksshams @swami_79

amazon.com’s experience with services

slide-31
SLIDE 31

@ksshams @swami_79

how do you create a successful service?

slide-32
SLIDE 32

@ksshams @swami_79

with great services, comes great responsibility

slide-33
SLIDE 33

@ksshams @swami_79

DynamoDB Goals and Philosophies

never compromise on durability

scale is our problem

easy to use

scale in rps

consistent and low latencies

slide-34
SLIDE 34

@ksshams @swami_79

(diagram: Architect, Test, Monitor, Goals, Deploy, Develop, Customer)

slide-35
SLIDE 35

@ksshams @swami_79

Architect

Test

Monitor Goals

Customer

Deploy Develop

slide-36
SLIDE 36

@ksshams @swami_79

Sacred Tenets in Services

plan for success - plan for scalability

don’t compromise durability for performance

plan for failures - fault-tolerance is key

consistent performance is important

design - think of blast radius

insist on correctness

slide-37
SLIDE 37

@ksshams @swami_79

fault tolerance is a lesson best learned offline

slide-38
SLIDE 38

@ksshams @swami_79

a simple 2-way replication system of a traditional database…

Primary Standby

Writes

slide-39
SLIDE 39

@ksshams @swami_79

P S

S is dead, need to trigger new replica

P is dead, need to promote myself

P’

slide-40
SLIDE 40

@ksshams @swami_79

improved Replication: quorum

Replica Replica

Writes

Replica

Quorum: Successful write on a majority
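A minimal sketch of the rule "successful write on a majority", assuming in-memory replicas. The `Replica` class and `quorum_write` function are hypothetical names for illustration, not part of any real database API.

```python
class Replica:
    """In-memory stand-in for a storage node (hypothetical)."""

    def __init__(self, name: str, healthy: bool = True):
        self.name = name
        self.healthy = healthy
        self.data = {}

    def write(self, key, value) -> bool:
        """Acknowledge the write only if this replica is reachable."""
        if not self.healthy:
            return False
        self.data[key] = value
        return True


def quorum_write(replicas, key, value) -> bool:
    """Report success once a strict majority of replicas acknowledged."""
    acks = sum(1 for replica in replicas if replica.write(key, value))
    return acks >= len(replicas) // 2 + 1


replicas = [Replica("a"), Replica("b"), Replica("c", healthy=False)]
print(quorum_write(replicas, "cart:42", "book"))   # 2 of 3 acks

replicas[1].healthy = False
print(quorum_write(replicas, "cart:42", "pen"))    # 1 of 3 acks
```

With one replica down, the write still succeeds on two of three. With two down, it is rejected, although the surviving replica has already stored the value. Handling that kind of partial failure is where the real difficulty starts.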

slide-41
SLIDE 41

Not so easy..

Replica B Replica C

Writes from client A

Replica A Replica D

New member in the group

Should I continue to serve reads?

Should I start a new quorum?

Replica E Replica F

Reads and Writes from client B

Classic Split Brain Issue in Replicated systems leading to lost writes!

slide-42
SLIDE 42

@ksshams @swami_79

Building correct distributed systems is not straightforward..

  • How do you handle replica failures?
  • How do you ensure there is not a parallel quorum?
  • How do you handle partial failures of replicas?
  • How do you handle concurrent failures?
slide-43
SLIDE 43

correctness is hard, but necessary

slide-44
SLIDE 44

Formal Methods

slide-45
SLIDE 45

Formal Methods

to minimize bugs, we must have a precise description of the design

slide-46
SLIDE 46

Formal Methods

code is too detailed

how would you express partial failures or concurrency?

design documents and diagrams are vague & imprecise

slide-47
SLIDE 47

Formal Methods

law of large numbers is your friend, so design for scale until you hit large numbers

slide-48
SLIDE 48

@ksshams @swami_79

TLA+ to the rescue?

slide-49
SLIDE 49

@ksshams @swami_79

PlusCal
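What a model checker buys you can be hinted at with a toy. The sketch below is neither TLA+ nor PlusCal. It is plain Python that mimics the core idea of exhaustively exploring every interleaving of a tiny design and checking an invariant. The two-process counter, and the function names `interleavings` and `run`, are assumptions made up for illustration.

```python
from itertools import permutations


def interleavings(num_procs=2, steps_per_proc=2):
    """Every schedule in which each process runs its own steps in order."""
    labels = [p for p in range(num_procs) for _ in range(steps_per_proc)]
    return sorted(set(permutations(labels)))


def run(schedule):
    """Execute read-then-write increments of a shared counter."""
    counter = 0
    local = {}       # value each process last read
    steps_done = {}  # how many steps each process has taken
    for proc in schedule:
        if steps_done.get(proc, 0) == 0:
            local[proc] = counter        # step 1: read the shared value
        else:
            counter = local[proc] + 1    # step 2: write back read value + 1
        steps_done[proc] = steps_done.get(proc, 0) + 1
    return counter


# Invariant under test: after both increments, the counter equals 2.
schedules = interleavings()
violations = [s for s in schedules if run(s) != 2]
print(f"{len(violations)} of {len(schedules)} schedules lose an update")
for schedule in violations:
    print("counterexample:", schedule)
```

Four of the six possible schedules break the invariant, and each one comes back as a concrete counterexample trace. A real specification language adds a precise notation for failures and concurrency on top of this kind of exhaustive search.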

slide-50
SLIDE 50

@ksshams @swami_79

formal methods are necessary

but not sufficient..

slide-51
SLIDE 51

@ksshams @swami_79

(diagram: Architect, Test, Monitor, Goals, Deploy, Develop, Customer)
slide-52
SLIDE 52

@ksshams @swami_79

don’t forget to test - no, seriously..

slide-53
SLIDE 53

embrace failure and don’t be surprised

simulate failures at unit test level

fault injection testing

datacenter testing

network brown out testing

scale testing
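"Simulate failures at unit test level" can be as small as a test double that fails on purpose. This is a sketch under stated assumptions: `FaultInjectingStore`, `InjectedFault` and `put_with_retry` are hypothetical names, and the retry policy is only an example of code worth exercising this way.

```python
class InjectedFault(Exception):
    """Raised by the test double to simulate a failed storage call."""


class FaultInjectingStore:
    """Key-value store double that fails its first N put() calls."""

    def __init__(self, failures_to_inject: int):
        self.failures_left = failures_to_inject
        self.calls = 0
        self.data = {}

    def put(self, key, value) -> None:
        self.calls += 1
        if self.failures_left > 0:
            self.failures_left -= 1
            raise InjectedFault("simulated replica timeout")
        self.data[key] = value


def put_with_retry(store, key, value, max_attempts: int = 3) -> int:
    """Code under test: retry failed puts, return the attempt that succeeded."""
    for attempt in range(1, max_attempts + 1):
        try:
            store.put(key, value)
            return attempt
        except InjectedFault:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller


def test_retry_survives_two_faults():
    store = FaultInjectingStore(failures_to_inject=2)
    assert put_with_retry(store, "k", "v") == 3
    assert store.data == {"k": "v"}


def test_retry_gives_up_after_three_faults():
    store = FaultInjectingStore(failures_to_inject=3)
    try:
        put_with_retry(store, "k", "v")
    except InjectedFault:
        assert store.data == {}
    else:
        raise AssertionError("expected the injected fault to propagate")


test_retry_survives_two_faults()
test_retry_gives_up_after_three_faults()
print("fault-injection tests passed")
```

The same double can be reused at larger scope, for example to fail a fraction of calls at random during a scale test.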

slide-54
SLIDE 54

testing is a lifelong journey

slide-55
SLIDE 55

@ksshams @swami_79

testing is necessary but not sufficient..

slide-56
SLIDE 56

@ksshams @swami_79

(diagram: Architect, Test, Monitor, Goals, Deploy, Develop, Customer)
slide-57
SLIDE 57

@ksshams @swami_79

release cycle

gamma: simulate real world

one box: does it work?

phased deployment: treading lightly

monitor: does it still work?

slide-58
SLIDE 58

@ksshams @swami_79

Canaries

slide-59
SLIDE 59

@ksshams @swami_79

Alarms

slide-60
SLIDE 60

@ksshams @swami_79

Monitor customer behavior

(diagram: Architect, Test, Monitor, Goals, Deploy, Develop, Customer)

slide-61
SLIDE 61

@ksshams @swami_79

measuring customer experience is key

don’t be satisfied by average - look at the 99th percentile
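A small worked example of why the average hides what customers feel. The latency numbers are made up for illustration, and `percentile` is a hypothetical helper using the nearest-rank definition.

```python
import math
import statistics


def percentile(samples, pct: int):
    """Nearest-rank percentile: smallest sample with pct% of data at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical request latencies: 98% fast, 2% hit a slow path.
latencies_ms = [10] * 980 + [2000] * 20

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms} ms  p50={p50_ms} ms  p99={p99_ms} ms")
```

The mean comes out just under 50 ms, which looks acceptable. The 99th percentile is 2000 ms, so one request in fifty is waiting two seconds. A customer who makes many requests per page will hit that tail often.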

slide-62
SLIDE 62

@ksshams @swami_79

understand the scaling dimensions

slide-63
SLIDE 63

@ksshams @swami_79

understand how your service will be abused

slide-64
SLIDE 64

@ksshams @swami_79

let’s see these rules in action through a true story

slide-65
SLIDE 65

@ksshams @swami_79

we were building distributed systems all over amazon.com

slide-66
SLIDE 66

@ksshams @swami_79

we needed a uniform and correct way to do consensus..

slide-67
SLIDE 67

@ksshams @swami_79

so we built a paxos lock library service

slide-68
SLIDE 68

@ksshams @swami_79

such a service is so much more useful than just leader election.. it became a distributed state store

slide-69
SLIDE 69

@ksshams @swami_79

such a service is so much more useful than just leader election..

or a distributed state store

wait wait.. you’re telling me if I poll, I can detect node failure?

slide-70
SLIDE 70

@ksshams @swami_79

we acted quickly - and scaled up our entire fleet with more nodes

doh!!!!

we slowed consensus...
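The deck does not spell out why adding nodes slowed consensus. One plausible reading, assumed here, is that majority-based protocols such as Paxos need more acknowledgements and more messages per decision as the group grows. The functions below are a hypothetical back-of-the-envelope sketch of that relationship.

```python
def majority(num_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return num_nodes // 2 + 1


def messages_per_round(num_nodes: int) -> int:
    """Rough message count for one decision.

    Assumes two phases (prepare and accept), each with one request and
    one reply per node. Real deployments batch and optimise this.
    """
    phases = 2
    return phases * 2 * num_nodes


for nodes in (3, 5, 9, 15):
    print(
        f"{nodes:>2} nodes: wait for {majority(nodes)} acks, "
        f"about {messages_per_round(nodes)} messages per decision"
    )
```

Capacity for reads and notifications may grow with node count, but under these assumptions the cost of agreeing on each write grows with it too.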

slide-71
SLIDE 71

@ksshams @swami_79

understand the scaling dimensions

& scale them independently...

slide-72
SLIDE 72

@ksshams @swami_79

Leader Election

Failure Notification

State Store

a lock service has 3 components..

slide-73
SLIDE 73

@ksshams @swami_79

Leader Election

Failure Notification

State Store

they must be scaled independently..

slide-74
SLIDE 74

@ksshams @swami_79

Leader Election

Failure Notification

State Store

they must be scaled independently..

slide-75
SLIDE 75

@ksshams @swami_79

Leader Election

Failure Notification

State Store

they must be scaled independently..

slide-76
SLIDE 76

@ksshams @swami_79

understand scaling dimensions

  • observe how service is used
  • scalability over features
  • strive for correctness
  • relentlessly test
  • monitor like a hawk

slide-77
SLIDE 77

@ksshams @swami_79

Thank You!

@swami_79 @ksshams