Data Modeling for Scale with Riak Data Types - Sean Cribbs



slide-1
SLIDE 1

Data Modeling for Scale with Riak Data Types

Sean Cribbs @seancribbs #riak #datatypes QCon NYC 2014

slide-2
SLIDE 2

I work for Basho. We make Riak.

Visit our booth!

slide-3
SLIDE 3

Riak is Eventually Consistent

key-value + indexes + search + MapReduce

slide-4
SLIDE 4

Eventual Consistency

  1. Replicated
  2. Loose coordination
  3. Convergence

slide-5
SLIDE 5

✔ Fault-tolerant ✔ Highly available ✔ Low-latency

Eventual is Good

slide-6
SLIDE 6

Consistency?

[Diagram: nodes 1, 2 and 3 with conflicting values A and B]

No clear winner! Throw one out? Keep both?

slide-7
SLIDE 7

Consistency?

[Diagram: nodes 1, 2 and 3 with conflicting values A and B]

No clear winner! Throw one out? (Cassandra) Keep both?

slide-8
SLIDE 8

Consistency?

[Diagram: nodes 1, 2 and 3 with conflicting values A and B]

No clear winner! Throw one out? (Cassandra) Keep both? (Riak)

slide-9
SLIDE 9

Conflicts!

A! B!

slide-10
SLIDE 10

Semantic Resolution

  • Your app knows the domain - use business rules to resolve
  • Amazon Dynamo’s shopping cart
slide-11
SLIDE 11

Semantic Resolution

  • Your app knows the domain - use business rules to resolve
  • Amazon Dynamo’s shopping cart

“Ad hoc approaches have proven brittle and error-prone”
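As a rough illustration of semantic resolution, the sketch below merges conflicting shopping-cart versions by taking their union, the rule usually associated with the Dynamo cart. The function name and the cart representation are assumptions made for this example, not a Riak or Dynamo API. It also shows one way such hand-written rules go wrong: an item removed on one replica comes back after the merge.

```python
def resolve_carts(siblings):
    """Merge conflicting cart versions by keeping every item seen in any of them."""
    merged = set()
    for cart in siblings:
        merged |= set(cart)
    return merged


# Both replicas start from the same cart.
base = {"book", "lamp"}

# Replica A removes the lamp while replica B concurrently adds a pen.
replica_a = base - {"lamp"}
replica_b = base | {"pen"}

resolved = resolve_carts([replica_a, replica_b])
print(sorted(resolved))  # the removed "lamp" is back in the cart
```

No add is ever lost with this rule, but the merge cannot tell "removed" apart from "never added" without extra bookkeeping.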

slide-12
SLIDE 12

Convergent Replicated Data Types

slide-13
SLIDE 13

Convergent Replicated Data Types

useful abstractions

slide-14
SLIDE 14

Convergent Replicated Data Types

multiple independent copies useful abstractions

slide-15
SLIDE 15

Convergent Replicated Data Types

multiple independent copies resolves automatically toward a single value useful abstractions

slide-16
SLIDE 16

How CRDTs Work

  • A partially-ordered set of values
  • A merge function
  • An identity value
  • Inflation operations
slide-17
SLIDE 17

How CRDTs Work

  • A partially-ordered set of values
  • A merge function
  • An identity value
  • Inflation operations

What CRDTs Enable

  • Consistency without coordination
  • Fluent, rich interaction with data
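
The four ingredients listed under How CRDTs Work can be made concrete with a grow-only counter, one of the simplest state-based CRDTs. This is a minimal sketch for illustration, not Riak's implementation; the class name and replica names are made up.

```python
class GCounter:
    """Grow-only counter: one non-negative tally per replica."""

    def __init__(self, counts=None):
        # Identity value: an empty mapping, which merges as a no-op.
        self.counts = dict(counts or {})

    def increment(self, replica, amount=1):
        # Inflation operation: a replica's tally only ever grows.
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[replica] = self.counts.get(replica, 0) + amount

    def merge(self, other):
        # Merge function: per-replica maximum, the least upper bound in the
        # partial order where a <= b if every tally in a is <= the one in b.
        replicas = set(self.counts) | set(other.counts)
        return GCounter({
            r: max(self.counts.get(r, 0), other.counts.get(r, 0))
            for r in replicas
        })

    @property
    def value(self):
        return sum(self.counts.values())


a, b = GCounter(), GCounter()
a.increment("node1", 2)   # two increments handled by node1
b.increment("node2", 3)   # three increments handled by node2

merged = a.merge(b)
print(merged.value)
```

Because the merge is commutative, associative and idempotent, replicas can exchange state in any order, any number of times, and still agree on the total. That is the sense in which consistency is reached without coordination.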
slide-18
SLIDE 18

This research is supported in part by European FP7 project 609 551 SyncFree http://syncfree.lip6.fr/ (2013-2016).

slide-19
SLIDE 19

by @joedevivo

slide-20
SLIDE 20

Forget CRDTs Do Data Modeling

slide-21
SLIDE 21

Data Modeling for Riak

  • Identify needs for both read and write
  • Design around key as index
  • Denormalize relationships if possible
  • Weigh data size against coherence
slide-22
SLIDE 22

Riak Data Types

slide-23
SLIDE 23

Riak Data Types

increment decrement

Counter :: int

slide-24
SLIDE 24

Riak Data Types

increment decrement

Counter :: int

add* remove

Set :: { bytes }

slide-25
SLIDE 25

Riak Data Types

increment decrement

Counter :: int

add* remove

Set :: { bytes }

remove update*

Map :: bytes → DT

slide-26
SLIDE 26

Riak Data Types

increment decrement

Counter :: int

add* remove

Set :: { bytes }

remove update*

Map :: bytes → DT

slide-27
SLIDE 27

Riak Data Types

increment decrement

Counter :: int

add* remove

Set :: { bytes }

assign

Register :: bytes

remove update*

Map :: bytes → DT

slide-28
SLIDE 28

Riak Data Types

increment decrement

Counter :: int

add* remove

Set :: { bytes }

enable* disable

Flag :: boolean

assign

Register :: bytes

remove update*

Map :: bytes → DT
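
The asterisks on add*, enable* and update* appear to mark the operation that wins when both operations of a pair happen concurrently; that reading is an assumption, since the slides do not spell it out. The sketch below shows how a set can behave that way using the observed-remove technique. It is a toy in-memory model with made-up names, not Riak's implementation.

```python
import itertools


class AddWinsSet:
    """Toy observed-remove set: a remove only cancels the adds it has seen."""

    def __init__(self, replica):
        self.replica = replica
        self._seq = itertools.count()
        self.adds = {}          # element -> set of unique tags
        self.removed = set()    # tags cancelled by a remove

    def add(self, element):
        # Every add gets a fresh tag that no earlier remove can have seen.
        tag = (self.replica, next(self._seq))
        self.adds.setdefault(element, set()).add(tag)

    def remove(self, element):
        # Cancel only the tags this replica currently knows about.
        self.removed |= self.adds.get(element, set())

    def merge(self, other):
        for element, tags in other.adds.items():
            self.adds.setdefault(element, set()).update(tags)
        self.removed |= other.removed

    @property
    def value(self):
        # An element is present while at least one of its tags is live.
        return {e for e, tags in self.adds.items() if tags - self.removed}


a, b = AddWinsSet("a"), AddWinsSet("b")
a.add("alice")
b.merge(a)            # both replicas now contain "alice"

a.remove("alice")     # replica a removes her...
b.add("alice")        # ...while replica b concurrently re-adds her

a.merge(b)
b.merge(a)
print(a.value, b.value)
```

Both replicas end up containing "alice": the remove on replica a never observed the tag created by the concurrent add on replica b, so that add survives the merge.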

slide-29
SLIDE 29

MADDATA

slide-30
SLIDE 30

Counters

slide-31
SLIDE 31

Ad Network

  • Impressions - when someone sees an ad
  • Click-through - when someone clicks on an ad
  • Hourly rollups

ad-metrics/<campaign>/<type>-<hour>

slide-32
SLIDE 32

$ riak-admin bucket-type create ad-metrics \
    '{"props":{"datatype":"counter"}}'
ad-metrics created
$ riak-admin bucket-type activate ad-metrics
ad-metrics has been activated
$ riak-admin bucket-type list
ad-metrics (active)

Ad Network

slide-33
SLIDE 33

from riak import RiakClient
from rogersads import RIAK_CONFIG
from time import strftime

client = RiakClient(**RIAK_CONFIG)
metrics = client.bucket_type('ad-metrics')

def record_metric(campaign, metric_type):
    # One counter per campaign, metric type and hour: <type>-<YYYYMMDD>-<HH>
    key = metric_type + strftime('-%Y%m%d-%H')
    counter = metrics.bucket(campaign).new(key)
    counter.increment()
    counter.store()

Ad Network
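
Because each hourly rollup lives under its own key, reading a range of hours means computing the keys first and then fetching each counter. The helper below sketches only the key computation, reusing the `-%Y%m%d-%H` suffix from record_metric. The function name and the half-open range convention are assumptions, not part of the talk.

```python
from datetime import datetime, timedelta


def hourly_keys(metric_type, start, end):
    """Return the counter keys for each hour in [start, end)."""
    # Truncate to the top of the hour so keys line up with the rollups.
    hour = start.replace(minute=0, second=0, microsecond=0)
    keys = []
    while hour < end:
        keys.append(metric_type + hour.strftime('-%Y%m%d-%H'))
        hour += timedelta(hours=1)
    return keys


keys = hourly_keys('impressions',
                   datetime(2014, 6, 11, 9, 30),
                   datetime(2014, 6, 11, 12, 0))
print(keys)
```

Each key could then be fetched from the campaign's bucket and the counter values summed, which keeps every read a plain key lookup rather than a query.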

slide-34
SLIDE 34

Ad Network

[Chart: y-axis ticks 750 to 3000, x-axis ticks 9 to 16]
slide-35
SLIDE 35

Sets

slide-36
SLIDE 36
  • RSVPs - guest lists
  • Connections - friends lists per-user
  • Likes - expressing interest

PartyOn

slide-37
SLIDE 37

$ riak-admin bucket-type create partyon-sets \
    '{"props":{"datatype":"set"}}'
partyon-sets created
$ riak-admin bucket-type activate partyon-sets
partyon-sets has been activated
$ riak-admin bucket-type list
partyon-sets (active)

PartyOn

slide-38
SLIDE 38
  • RSVPs

partyon-sets/rsvps/<eventid>

  • Connections

partyon-sets/friends/<userid>

  • Likes

partyon-sets/likes/<eventid>

PartyOn

slide-39
SLIDE 39

from riak.datatypes import Set

sets = client.bucket_type('partyon-sets')
rsvps = sets.bucket('rsvps')
friends = sets.bucket('friends')
likes = sets.bucket('likes')

PartyOn

slide-40
SLIDE 40

def rsvp_get(event):
    return rsvps.get(event)  # Returns a Set

def rsvp_add(event, user):
    guests = rsvps.new(event)
    guests.add(user)
    guests.store(return_body=True)
    return guests.context

def rsvp_remove(event, user, context):
    guests = Set(rsvps, event, context=context)
    guests.remove(user)
    guests.store()

PartyOn

slide-41
SLIDE 41

Maps

(and the rest)

slide-42
SLIDE 42

GameNet

  • User profiles - demographic data

users/profiles/<userid>

  • Achievements - trophies per game

users/trophies/<userid>

  • Game state - progress and stats

users/<gameid>/<userid>

slide-43
SLIDE 43

$ riak-admin bucket-type create users \
    '{"props":{"datatype":"map"}}'
users created
$ riak-admin bucket-type activate users
users has been activated
$ riak-admin bucket-type list
users (active)

GameNet

slide-44
SLIDE 44

users = client.bucket_type('users')

def update_profile(user, fields):
    profile = users.bucket('profiles').get(user)
    for field in fields:
        if field in USER_FLAGS:
            if fields[field]:
                profile.flags[field].enable()
            else:
                profile.flags[field].disable()
        else:
            value = fields[field]
            profile.registers[field].assign(value)
    profile.store()

GameNet

slide-45
SLIDE 45

def add_trophy(user, game, trophy):
    trophies = users.bucket('trophies').get(user)
    trophies.sets[game].add(trophy)
    trophies.store()

def get_trophies(user, game):
    trophies = users.bucket('trophies').get(user)
    return trophies.sets[game].value

GameNet

slide-46
SLIDE 46

def build_structure(user, game, structure, gold, wood, stone):
    gamestate = users.bucket(game).get(user)
    gamestate.sets['structures'].add(structure)
    gamestate.counters['gold'].decrement(gold)
    gamestate.counters['wood'].decrement(wood)
    gamestate.counters['stone'].decrement(stone)
    gamestate.store(return_body=True)
    return gamestate.value

GameNet
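
Counters are plain integers and can go negative, so an application would probably check a player's resources before calling build_structure. The helper below is a hypothetical precondition check over ordinary dictionaries; its name and the example balances are invented for illustration.

```python
def can_afford(resources, costs):
    """True if every cost is covered by the matching resource balance."""
    return all(resources.get(name, 0) >= amount
               for name, amount in costs.items())


balances = {'gold': 120, 'wood': 40, 'stone': 10}
print(can_afford(balances, {'gold': 100, 'wood': 30}))  # True
print(can_afford(balances, {'stone': 25}))              # False
```

The balances could be read from the map's counters before the update. Since updates are not coordinated, two concurrent purchases can each pass this check and still overdraw the counter together, so the check is best-effort only.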

slide-47
SLIDE 47

client.create_search_index('asteroids')
users.bucket('asteroids').set_property('search_index', 'asteroids')

def find_asteroids_opponents(min_score=0):
    # Solr range syntax requires an uppercase TO
    query = "score_counter:[{} TO *]".format(min_score)
    results = client.fulltext_search(
        'asteroids', query,
        fl=['userid_register', 'score_counter'])
    return results['docs']

GameNet

slide-48
SLIDE 48

Benefits

  • Richer interactions, familiar types
  • Write mutations, not state
  • No merge function to write
  • Same reliability and predictability as vanilla Riak

slide-49
SLIDE 49

Caveats

  • Value size still matters
  • Updates not idempotent (yet)
  • Cross-key atomicity not possible (yet)
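
The idempotence caveat matters when a client retries after a timeout: if the first attempt actually succeeded, the increment is applied twice. The toy counter below is an in-memory stand-in, not a Riak client object; it only contrasts re-sending an operation with re-merging state.

```python
class ToyCounter:
    """In-memory stand-in for a server-side counter."""

    def __init__(self):
        self.value = 0

    def apply_increment(self, amount):
        # Operations carry no identity, so a duplicate cannot be detected.
        self.value += amount


counter = ToyCounter()
counter.apply_increment(1)   # first attempt succeeds, but the reply is lost
counter.apply_increment(1)   # the client retries the same logical update
print(counter.value)         # 2, although the user acted once

# Merging state, by contrast, is idempotent: max(x, x) == x.
tally = {"node1": 1}
merged_again = {node: max(count, tally[node]) for node, count in tally.items()}
print(merged_again == tally)
```

One common mitigation, stated here as a general technique rather than something the talk prescribes, is to retry only those updates whose double application is acceptable.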
slide-50
SLIDE 50

Future

  • Riak 2.0 due out this summer - betas available now!
  • Richer querying, lighter storage requirements, more types

slide-51
SLIDE 51