Programming Distributed Systems 08 Replicated Data Types



slide-1
SLIDE 1

Programming Distributed Systems

08 Replicated Data Types Annette Bieniusa

AG Softech FB Informatik TU Kaiserslautern

Summer Term 2018

Annette Bieniusa Programming Distributed Systems Summer Term 2018 1/ 37

slide-2
SLIDE 2

Motivation

So far, we resolved conflicting (i.e. non-commutative) updates simply by sequencing operations using the arbitration order (ar). But sometimes, applications
do not want to depend on a global order such as ar,
do want to be made aware of conflicts,
do want to resolve conflicts in a type-specific way.


slide-3
SLIDE 3

Example: Multi-value register


slide-4
SLIDE 4

How can we determine the state?


slide-5
SLIDE 5

Formal model


slide-6
SLIDE 6

Sequential semantics for registers

S : Op × Op∗ → Val
S(rd, ε) = undef (read returns initial value)
S(rd, wr(2) · wr(8)) = 8 (read returns last value written)
S(wr(3), rd · wr(2) · wr(8)) = ok (write always returns ok)
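The three equations above can be replayed with a small sketch. The tuple encoding of operations (`("rd",)`, `("wr", v)`) and the use of `None` for undef are assumptions made for illustration only.

```python
def S(op, history):
    """Sequential register semantics: return value of `op` after `history`."""
    if op[0] == "wr":
        return "ok"                      # a write always returns ok
    # A read returns the value of the most recent write, if there is one.
    for prev in reversed(history):
        if prev[0] == "wr":
            return prev[1]
    return None                          # stands in for undef (no write yet)


print(S(("rd",), []))
print(S(("rd",), [("wr", 2), ("wr", 8)]))
print(S(("wr", 3), [("rd",), ("wr", 2), ("wr", 8)]))
```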


slide-7
SLIDE 7

Operation Context

An operation context is a finite event graph C = (E, op, vis, ar). Events in E capture what prior operations are visible to the operation that is to be performed.


slide-8
SLIDE 8

Concurrent semantics for Multi-Value Register

F : Op × C → Val
Fmvr(wr(x), C) = ok
Fmvr(rd(), C) = {x | exists e in C such that op(e) = wr(x) and e is vis-maximal in C}
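A minimal sketch of the read case, assuming the context is given as a set of events plus a set of `(a, b)` id pairs meaning "a is visible to b". The `Event` class and the choice to test maximality among write events only are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    eid: str
    kind: str            # "wr" or "rd"
    value: object = None


def mvr_read(events, vis):
    """Values of all writes that are not visible to any other write."""
    writes = [e for e in events if e.kind == "wr"]

    def overwritten(e):
        # e is not vis-maximal if some other write has e in its visibility set
        return any((e.eid, w.eid) in vis for w in writes)

    return {e.value for e in writes if not overwritten(e)}


# wr(0) is visible to two concurrent writes wr(1) and wr(2).
w0, w1, w2 = Event("w0", "wr", 0), Event("w1", "wr", 1), Event("w2", "wr", 2)
vis = {("w0", "w1"), ("w0", "w2")}
print(sorted(mvr_read({w0, w1, w2}, vis)))   # both concurrent values survive
```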


slide-9
SLIDE 9

Quiz: What do the read ops return?


slide-10
SLIDE 10


slide-11
SLIDE 11

Return values in Abstract Executions revisited

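Last lecture: Consistency notions can alternatively be defined with concurrency semantics.
An abstract execution A = (E, op, rval, rb, ss, vis, ar) satisfies a concurrent semantics F if, for every event e in E,
rval(e) = F(op(e), A|vis⁻¹(e),op,vis,ar)
Here A|vis⁻¹(e),op,vis,ar is the operation context of e, i.e. A restricted to the events visible to e.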


slide-12
SLIDE 12

Semantics of a set data type

Sequential specification of abstract data type Set S:
{true} add(e) {e ∈ S}
{true} rmv(e) {e ∉ S}

The following pairs of operations are commutative (for two elements e, f with e ≠ f):

{true} add(e); add(e) {e ∈ S}
{true} add(e); add(f) {e, f ∈ S}
{true} rmv(e); rmv(e) {e ∉ S}
{true} rmv(e); rmv(f) {e, f ∉ S}
{true} add(e); rmv(f) {e ∈ S, f ∉ S}

For these ops, the concurrent execution should yield the same result as executing the ops in any order.
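The commutativity claims can be checked mechanically on a sequential set. The helper names `apply` and `commute` below are hypothetical; the sketch simply runs both orders and compares the resulting sets.

```python
def apply(ops, start=frozenset()):
    """Run a sequence of ("add", x) / ("rmv", x) operations on a copy of `start`."""
    s = set(start)
    for name, x in ops:
        if name == "add":
            s.add(x)
        else:
            s.discard(x)
    return s


def commute(op1, op2, start=frozenset()):
    """True if executing op1;op2 and op2;op1 from `start` gives the same set."""
    return apply([op1, op2], start) == apply([op2, op1], start)


print(commute(("add", "e"), ("add", "f")))   # different elements commute
print(commute(("add", "e"), ("rmv", "f")))   # add/rmv on different elements commute
print(commute(("add", "e"), ("rmv", "e")))   # add/rmv on the same element do not
```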


slide-13
SLIDE 13

What are the options regarding a concurrency semantics for add(e) and rmv(e)?

The operations add(e) and rmv(e) are not commutative

{true} add(e); rmv(e) {e ∉ S}
{true} rmv(e); add(e) {e ∈ S}

Options

sequential consistency / linearizability → not really...
add-wins: e ∈ S
remove-wins: e ∉ S
erroneous state
last-writer wins (i.e. define the arbitration order through a total order, for example via totally ordered timestamps)


slide-14
SLIDE 14

Set Semantics


slide-15
SLIDE 15

Formal Semantics for the Add-Wins Set

Faws(add(x), C) = ok
Faws(rmv(x), C) = ok
Faws(rd(), C) = {x | exists e in C such that op(e) = add(x) and there exists no e′ in C such that op(e′) = rmv(x) and e →vis e′}
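The read rule can be evaluated directly on a small operation context. As before, the `Event` class and the encoding of vis as a set of `(a, b)` id pairs ("a is visible to b") are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    eid: str
    kind: str        # "add" or "rmv"
    elem: object


def aws_read(events, vis):
    """Elements with at least one add that no remove of that element has seen."""
    result = set()
    for e in events:
        if e.kind != "add":
            continue
        cancelled = any(
            r.kind == "rmv" and r.elem == e.elem and (e.eid, r.eid) in vis
            for r in events
        )
        if not cancelled:
            result.add(e.elem)
    return result


# a1 = add(x) is seen by r1 = rmv(x); a2 = add(x) is concurrent with r1.
a1, r1, a2 = Event("a1", "add", "x"), Event("r1", "rmv", "x"), Event("a2", "add", "x")
print(aws_read({a1, r1}, {("a1", "r1")}))        # the remove has seen the only add
print(aws_read({a1, r1, a2}, {("a1", "r1")}))    # the concurrent add wins
```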


slide-16
SLIDE 16

Sets with odd semantics

Grow-only set

Convergence by union on element set
No remove operation

2P-Set (Wuu & Bernstein PODC 1984)

Set of added elements + set of tombstones
Add/remove once
Violates sequential spec

c-set (Sovran et al., SOSP 2011)

Maintain add/remove counter
Violates sequential spec
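To see why a 2P-Set deviates from the sequential specification, here is a rough state-based sketch. The class and method names are hypothetical; the state is the pair (added elements, tombstones), and merge is pairwise union.

```python
class TwoPSet:
    def __init__(self, added=(), removed=()):
        self.added = set(added)        # grows only
        self.removed = set(removed)    # tombstones, grows only

    def add(self, e):
        self.added.add(e)

    def remove(self, e):
        if e in self.added:            # only elements seen as added can be removed
            self.removed.add(e)

    def contains(self, e):
        return e in self.added and e not in self.removed

    def merge(self, other):
        return TwoPSet(self.added | other.added, self.removed | other.removed)


s = TwoPSet()
s.add("e")
s.remove("e")
s.add("e")                             # sequentially, e should now be in the set
print(s.contains("e"))                 # but the tombstone keeps it out
```

The last line illustrates "add/remove once": after add(e); rmv(e); add(e) a sequential set contains e, whereas the 2P-Set does not.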


slide-17
SLIDE 17

Conflict-free Replicated Data Types (CRDTs) [3]

Same API as sequential abstract data type, but with concurrency semantics
If operations are commutative, same semantics as in sequential execution
Otherwise, need arbitration to resolve conflict

Don’t lose updates!
Results should not depend on the order in which updates are received
Semantics close to sequential version


slide-18
SLIDE 18

CRDTs and Consistency

Strong Eventual Consistency

Eventual delivery: Every update is eventually applied at all correct replicas
Termination: Every update operation terminates
Strong convergence: Correct replicas that have applied the same updates have equivalent state


slide-19
SLIDE 19

CRDT catalogue

Register (Last-writer-wins, Multi-value)
Set (Grow-Only, Add-Wins, Remove-Wins)
Flags
Counter (unlimited, restricted/bounded)
Graph (directed, monotone DAG)
Sequence / List
Map, JSON


slide-20
SLIDE 20

Specification: Replicated counter

All operations commute, no conflict resolution is needed
Value returned depends only on E and op, but not on vis and ar
Fctr(rd(), (E, op, vis, ar)) = |{e′ ∈ E | op(e′) = inc}|
Next: Operational Semantics
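A small sketch of the counter specification, assuming events are given as a list of ids and `op` as a dictionary from id to operation name. The parameters `vis` and `ar` are accepted but deliberately unused, mirroring the remark that the result does not depend on them.

```python
def ctr_read(events, op, vis=None, ar=None):
    """Number of inc events in the context; vis and ar are ignored on purpose."""
    return sum(1 for e in events if op[e] == "inc")


events = ["e1", "e2", "e3"]
op = {"e1": "inc", "e2": "rd", "e3": "inc"}

# Two different visibility relations over the same events give the same result.
print(ctr_read(events, op, vis={("e1", "e3")}))
print(ctr_read(events, op, vis=set()))
```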


slide-21
SLIDE 21

State-based CRDTs


slide-22
SLIDE 22

State-based specifications

Synchronization by propagating replica state
Updates must inflate the state
State must form a join-semilattice wrt merge
⇒ Merge must be idempotent, commutative, associative
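As an illustration of these requirements, a grow-only counter can be sketched with one slot per replica. The class name `GCounter`, the fixed replica count `n`, and the in-place `merge` are assumptions of this sketch, not taken from the slides.

```python
class GCounter:
    def __init__(self, replica_id, n):
        self.id = replica_id
        self.p = [0] * n               # p[i] = increments observed from replica i

    def inc(self):
        self.p[self.id] += 1           # update inflates the state

    def value(self):
        return sum(self.p)

    def merge(self, other):
        # Pointwise maximum: idempotent, commutative and associative.
        self.p = [max(a, b) for a, b in zip(self.p, other.p)]


a, b = GCounter(0, 2), GCounter(1, 2)
a.inc(); a.inc(); b.inc()
a.merge(b)
b.merge(a)
b.merge(a)                             # merging the same state twice changes nothing
print(a.value(), b.value())
```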


slide-23
SLIDE 23

Join-semilattice

A join-semilattice S is a set that has a join (i.e. a least upper bound) for any nonempty finite subset: For all elements x, y ∈ S, the least upper bound (LUB) x ⊔ y exists.
The join ⊔ is commutative, idempotent and associative.
A partial order on the elements of S is induced by setting x ≤ y iff x ⊔ y = y.
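The three laws and the induced order can be tested on a finite sample of elements. The helper names `satisfies_join_laws` and `leq` are hypothetical, and the check only covers the sample it is given, so it is a sanity test rather than a proof.

```python
from itertools import product


def satisfies_join_laws(elems, join):
    """Check idempotence, commutativity and associativity on a finite sample."""
    idem = all(join(x, x) == x for x in elems)
    comm = all(join(x, y) == join(y, x) for x, y in product(elems, repeat=2))
    assoc = all(
        join(join(x, y), z) == join(x, join(y, z))
        for x, y, z in product(elems, repeat=3)
    )
    return idem and comm and assoc


def leq(x, y, join):
    """Induced partial order: x <= y iff x join y == y."""
    return join(x, y) == y


sets = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]
union = lambda x, y: x | y

print(satisfies_join_laws(sets, union))                   # set union is a join
print(satisfies_join_laws([0, 1, 2], max))                # so is max on integers
print(satisfies_join_laws([0, 1, 2], lambda x, y: x + y)) # addition is not idempotent
print(leq(frozenset({1}), frozenset({1, 2}), union))      # induced order is subset
```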


slide-24
SLIDE 24

Examples


slide-25
SLIDE 25

Example: Counter


slide-26
SLIDE 26

Operation-based CRDTs


slide-27
SLIDE 27

Op-based specifications

Synchronization by propagating operations and re-applying them at every replica
Concurrent updates must commute
Requires reliable causal delivery
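A minimal sketch of an op-based counter, assuming some broadcast layer delivers every operation exactly once to every other replica. The class `OpCounter` and its methods `update` and `deliver` are hypothetical names; delivery is simulated by direct calls.

```python
class OpCounter:
    def __init__(self):
        self.value = 0

    def update(self, delta):
        """Apply locally and return the operation that must be broadcast."""
        self.value += delta
        return ("add", delta)

    def deliver(self, op):
        """Re-apply a remote operation; assumed to be delivered exactly once."""
        kind, delta = op
        assert kind == "add"
        self.value += delta


a, b, c = OpCounter(), OpCounter(), OpCounter()
op1 = a.update(+1)          # concurrent updates at a and b
op2 = b.update(-3)

a.deliver(op2)
b.deliver(op1)
c.deliver(op2)              # c receives the operations in the opposite order
c.deliver(op1)
print(a.value, b.value, c.value)
```

Because additions commute, the delivery order of concurrent operations does not matter here. Delivering an operation twice would change the result, which is why reliable delivery is required.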


slide-28
SLIDE 28

Example: Counter


slide-29
SLIDE 29

Example: Add-wins Set (Observed-remove Set)


slide-30
SLIDE 30

Example: Add-wins Set (Observed-remove Set)


slide-31
SLIDE 31

Optimized version of Add-wins Set

Possible to garbage-collect the tombstone after remove
Trick: Assuming causal delivery, element will never be re-introduced (with the same id) [2]
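The idea can be illustrated with an op-based observed-remove set that keeps only live (element, tag) pairs and no tombstones. This is an illustrative variant under the assumption of reliable causal delivery, not the construction from [2]; the class name, the tag format `(replica_id, counter)` and the direct `deliver` calls are all assumptions.

```python
import itertools


class ORSet:
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counter = itertools.count()
        self.entries = set()             # live (element, tag) pairs only

    def add(self, elem):
        tag = (self.replica_id, next(self.counter))   # globally unique id
        op = ("add", elem, tag)
        self.deliver(op)
        return op

    def remove(self, elem):
        # Remove exactly the tags observed locally; concurrent adds survive.
        observed = frozenset(t for (e, t) in self.entries if e == elem)
        op = ("rmv", elem, observed)
        self.deliver(op)
        return op

    def deliver(self, op):
        kind, elem, payload = op
        if kind == "add":
            self.entries.add((elem, payload))
        else:
            self.entries -= {(elem, t) for t in payload}

    def value(self):
        return {e for (e, _) in self.entries}


a, b = ORSet("A"), ORSet("B")
op1 = a.add("x"); b.deliver(op1)         # b has seen the first add
op2 = b.remove("x")                      # removes only the observed tag
op3 = a.add("x")                         # concurrent add with a fresh tag
a.deliver(op2); b.deliver(op3)
print(a.value(), b.value())              # the concurrent add wins at both replicas
```

Since every add carries a fresh tag and a remove is delivered after the adds it observed, a removed tag never reappears, so nothing has to be remembered about it.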


slide-32
SLIDE 32

Optimized version of Add-wins Set


slide-33
SLIDE 33

Challenges with CRDTs

Meta-data overhead for CRDTs that require causal contexts

Version vectors track concurrent modifications (O(N) ⇒ Churn!)

Monotonically growing state with state-based approach

Tombstones ⇒ Garbage collection

Composability for recursive CRDTs (Maps, Sequences; Transactions)

What is the desirable semantics of the composed entity?


slide-34
SLIDE 34

Delta-based CRDTs

State-based CRDTs suffer from monotonically growing state
Op-based CRDTs require causal delivery

Delta-based CRDTs[1]

Small message comprising a set of incremental updates
Work over unreliable communication channels
A delta-mutator mδ is a function, corresponding to an update operation, which takes a state X in a join-semilattice S as parameter and returns a delta-mutation mδ(X), also in S.
X′ = X ⊔ mδ(X)
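A sketch of a delta-mutator for a grow-only counter, assuming the state is a dictionary from replica id to count and the join is the pointwise maximum. The function names `join`, `inc_delta` and `value` are hypothetical.

```python
def join(x, y):
    """Pointwise maximum of two counter states (missing entries count as 0)."""
    return {k: max(x.get(k, 0), y.get(k, 0)) for k in x.keys() | y.keys()}


def inc_delta(x, replica):
    """Delta-mutator: returns only the changed entry, itself a valid state."""
    return {replica: x.get(replica, 0) + 1}


def value(x):
    return sum(x.values())


a = {}
d1 = inc_delta(a, "A"); a = join(a, d1)      # X' = X join m_delta(X)
d2 = inc_delta(a, "A"); a = join(a, d2)

# Replica b receives the deltas duplicated and out of order.
b = {}
for d in (d2, d2, d1):
    b = join(b, d)
print(a, b, value(b))
```

Because deltas are merged with the same join as full states, duplicated or reordered delta messages do not change the outcome in this example.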


slide-35
SLIDE 35

Adoption of CRDTs in industry


slide-36
SLIDE 36

Conclusion

CRDTs provide Strong Eventual Consistency (sometimes even more)
Data-type specific conflict resolution

Deterministic (independent of local update order)
Relation to sequential semantics
Meta-data overhead can be substantial


slide-37
SLIDE 37

Further reading I

[1] Paulo Sérgio Almeida, Ali Shoker and Carlos Baquero. “Delta state replicated data types”. In: J. Parallel Distrib. Comput. 111 (2018), pp. 162–173. doi: 10.1016/j.jpdc.2017.08.003. url: https://doi.org/10.1016/j.jpdc.2017.08.003.
[2] Annette Bieniusa et al. “An optimized conflict-free replicated set”. In: CoRR abs/1210.3368 (2012). arXiv: 1210.3368. url: http://arxiv.org/abs/1210.3368.
[3] Nuno Preguiça, Carlos Baquero and Marc Shapiro. “Conflict-Free Replicated Data Types (CRDTs)”. In: Encyclopedia of Big Data Technologies. Ed. by Sherif Sakr and Albert Zomaya. Cham: Springer International Publishing, 2018, pp. 1–10. isbn: 978-3-319-63962-8. doi: 10.1007/978-3-319-63962-8_185-1. url: https://doi.org/10.1007/978-3-319-63962-8_185-1.
