CRDTs in Practice
Marc Shapiro – Inria & UPMC Nuno Preguiça – U. NOVA
[CRDTs in practice — CodeMesh 2015]
Cloud to the edge
Social, web, e-commerce: shared mutable data
Scalability ⇒ replication ⇒ consistency issues
Data type
Replicated
Available
properties)
[All About Consistency — CodeMesh 2015]
Availability is king (otherwise stay away)
Fine-grain mutable shared data
Mobile computing In DC Geo-replication
Backward-compatible with sequential datatype
If operations commute, they can be concurrent
Otherwise, deterministic semantics
≣ add(e) || rm (f)
Largest European on-line betting operator
Before: SQL Server, doesn't scale, hours to converge
Mid-2013: NoSQL Riak: available, siblings; ad-hoc merge (hard!)
≥ Jan. 2014; in anger ≥ Dec. 2014 ORSWOT add-remove set
Transformational: “CRDTs saved the day”
Future wish list: “Extra guarantees … without impacting availability.”
Many Set operations commute: add(e) / add(f), add(e) / rm(f), etc.
Non-commuting pair: add(e) / rm(e)
∧ rmv(e)<add(e) ⟹ e ∈ S }
{⊥e ∈ S}
{e ∈ S}
{e ∉ S}
All deterministic, satisfy conditions
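One of the deterministic outcomes listed above, {e ∈ S} for a concurrent add(e) / rm(e), can be sketched as a state-based add-wins set. This is only an illustration and not Riak's ORSWOT: the class name, the random tags, and the `demo` helper are assumptions made for this example.

```python
import uuid


class AddWinsSet:
    """State-based add-wins set (illustrative sketch, not an optimised ORSWOT)."""

    def __init__(self):
        self.adds = set()     # (element, unique tag) pairs ever added
        self.removed = set()  # (element, tag) pairs that a remove has observed

    def add(self, e):
        # Every add gets a fresh tag, so a later remove cannot cancel it by accident.
        self.adds.add((e, uuid.uuid4().hex))

    def rm(self, e):
        # Remove only the adds of e that this replica has seen so far.
        self.removed |= {(x, t) for (x, t) in self.adds if x == e}

    def merge(self, other):
        # Set union is commutative, associative and idempotent.
        self.adds |= other.adds
        self.removed |= other.removed

    def value(self):
        return {x for (x, _) in self.adds - self.removed}


def demo():
    a, b = AddWinsSet(), AddWinsSet()
    a.add("e")
    b.merge(a)        # both replicas now contain e
    a.rm("e")         # concurrent with ...
    b.add("e")        # ... a fresh add(e) on the other replica
    a.merge(b)
    b.merge(a)
    return a.value(), b.value()
```

After the two merges both replicas hold the same value, and e is present because the remove never observed the tag of the concurrent add.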
Replicated wedding list
Ordered list of “wishes” (strings)
Position: “after item”
[Figure: replicas of the wedding list holding the wishes TV, Venice, Books and Ski trip in successive states]
[Figure: list items TV, ski trip, books, World peace, laptop, Venice, iDrone]
Each item points to the next one
[Figure: successive replica states of the list iDrone, World Peace, TV, Ski trip, Books, Laptop]
Remove specification
{ true } rm(wish) { tombstone(wish) }
Move, offer: maintain uniqueness invariant
{ ¬offered(wish,_) } offer(wish) { offered(wish, red) }
Precondition stable under concurrent updates?
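The "after item" positioning and the tombstone postcondition of rm(wish) can be sketched together as a small replicated list. This is a simplified sketch in the style of RGA, which is a swapped-in technique and not necessarily the design used in the talk. It assumes causal delivery and that each new id, a (counter, replica) pair, is larger than every id the inserting replica has already seen. The names `WishList`, `insert_after` and `demo` are hypothetical.

```python
class WishList:
    """Replicated ordered list, RGA-style sketch with tombstones."""

    HEAD = (0, "")  # sentinel id: inserting after HEAD means "at the front"

    def __init__(self):
        # Each entry is [id, text, removed]; the sentinel is never shown.
        self.items = [[self.HEAD, None, True]]

    def _index(self, item_id):
        return next(i for i, it in enumerate(self.items) if it[0] == item_id)

    def insert_after(self, ref_id, new_id, text):
        i = self._index(ref_id) + 1
        # Skip concurrent inserts with a larger id, so that every replica
        # chooses the same slot whatever the delivery order.
        while i < len(self.items) and self.items[i][0] > new_id:
            i += 1
        self.items.insert(i, [new_id, text, False])

    def rm(self, item_id):
        # Tombstone: the entry stays in place as an anchor for later inserts.
        self.items[self._index(item_id)][2] = True

    def value(self):
        return [text for _, text, removed in self.items if not removed]


def demo():
    tv, venice, books = (1, "a"), (2, "a"), (2, "b")
    r1, r2 = WishList(), WishList()
    for r in (r1, r2):
        r.insert_after(WishList.HEAD, tv, "TV")
    # Two concurrent inserts after TV, delivered in opposite orders.
    r1.insert_after(tv, venice, "Venice")
    r1.insert_after(tv, books, "Books")
    r2.insert_after(tv, books, "Books")
    r2.insert_after(tv, venice, "Venice")
    return r1, r2
```

Because a removed wish is only marked, an insert that names it as its "after item" can still be placed on replicas that have already applied the remove.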
Availability ⟹ concurrent updates
Backwards compatible
Maintaining invariants
Valter Balegas et al.– NOVA LINCS, DI, FCT, Universidade NOVA de Lisboa @ RICON'15
Many applications need to enforce conditions like: counter ≥ K E.g.:
X ≥ 0
Given X = n, there are n rights to execute dec()
Distribute rights among replicas
Execute operations locally without coordination
Peer-to-peer synchronisation
Fail if not enough rights exist
Create(type, value); Increment(value); Decrement(value); Value(); Transfer(to, qty);
[Figure: each replica R1, R2, R3 holds a table R with rows and columns r1, r2, r3, plus a column U]
Increment(10); Increment(8); Increment(15);
[Figure: row r1 holds 10, row r2 holds 15, row r3 holds 8]
decrement(15); decrement(5);
[Figure: row r2 holds 15 and 5]
transfer(r1, 4);
[Figure: row r3 holds 4 and 8]
merge(r1,r2);
Each replica only touches its own line. Merge by taking the max of each cell.
[Figure: merged table: row r1 holds 10; row r2 holds 15 and 5]
merge(r3,r2);
[Figure: merged table: row r1 holds 10; row r2 holds 15 and 5; row r3 holds 4 and 8]
decrement(12);
Check local rights ≥ 12
local = R[1][1] + ΣR[i][1]
Operations execute locally; fail if no rights available
Redistribute rights
Peer-to-peer synchronization
Prototype implemented on top of Riak
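The rights-based counter walked through above can be sketched as follows, for the invariant X ≥ 0. This is a minimal sketch and not the Riak prototype. It assumes 0-based replica indices, and it assumes the local-rights formula also subtracts rights given away and rights already used, terms that are not visible in the formula shown on the slides. The class and helper names are hypothetical.

```python
class BoundedCounter:
    """Counter that never goes below 0; one instance per replica (sketch)."""

    def __init__(self, me, n):
        self.me = me
        # R[i][i]: rights created by replica i; R[i][j]: rights i handed to j.
        self.R = [[0] * n for _ in range(n)]
        # U[i]: rights consumed by decrements at replica i.
        self.U = [0] * n

    def local_rights(self):
        i, n = self.me, len(self.U)
        received = sum(self.R[j][i] for j in range(n) if j != i)
        given = sum(self.R[i][j] for j in range(n) if j != i)
        return self.R[i][i] + received - given - self.U[i]

    def increment(self, k):
        self.R[self.me][self.me] += k

    def decrement(self, k):
        if self.local_rights() < k:
            return False  # fail locally, no coordination attempted
        self.U[self.me] += k
        return True

    def transfer(self, to, k):
        if self.local_rights() < k:
            return False
        self.R[self.me][to] += k
        return True

    def value(self):
        n = len(self.U)
        return sum(self.R[i][i] for i in range(n)) - sum(self.U)

    def merge(self, other):
        # Each replica writes only its own row and its own U entry, and the
        # entries only grow, so a cell-wise max is a safe merge.
        n = len(self.U)
        for i in range(n):
            self.U[i] = max(self.U[i], other.U[i])
            for j in range(n):
                self.R[i][j] = max(self.R[i][j], other.R[i][j])


def demo():
    r1, r2, r3 = (BoundedCounter(i, 3) for i in range(3))
    r1.increment(10)
    r3.increment(8)
    r2.increment(15)
    r2.decrement(5)
    r3.transfer(0, 4)  # r3 hands 4 of its rights to r1 (index 0)
    r1.merge(r2)
    r1.merge(r3)
    rights_before = r1.local_rights()
    ok = r1.decrement(12)
    return rights_before, ok, r1.value()
```

The demo loosely follows the slide sequence; the merge order here is chosen so that r1 has seen the transfer from r3 before it attempts decrement(12).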
Deployment: 3 Regions on AWS (m1.large) Configurations:
[Figure: app with partial database; full database; labels: Transmit, Process request & store update, fail-over]
Cache data at clients
Highly available transactions
Causal consistency
High-level operations
Operations modeled as transactions State
Directory: (name, type) → object
Concurrent: merge v recursively
Concurrent create, edit: re-create
3 DCs in Amazon EC2
100 client nodes in PlanetLab
Cache size: 512 objects
SwiftSocial: 90% cache hits
Operations with > 1 cache miss
Read-in-past + client-assisted fault tolerance RTT Client-side caching & updates
[Figure legend: reads/writes, remote, no FT; reads, remote + stable update; writes, remote + stable update]
[Figure: SwiftCloud compared with classic synch. for 1, 2 and 3 DCs; annotations: 60×, 6×, 12×]
Applications require multiple CRDTs
Need to lower expectations…
… but still possible to enforce some invariants
consistency
SyncFree
Masoud Saeida-Ardakani, Carlos Baquero, Valter Balegas, Annette Bieniusa, Russell Brown, Sérgio Duarte, Carla Ferreira, Alexey Gotsman, Mahsa Najafzadeh, Marek Zawirski, and more.