Bumper : Sheltering Transactions from Conflicts Nuno Diegues and - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Bumper: Sheltering Transactions from Conflicts

Nuno Diegues and Paolo Romano

ndiegues@gsd.inesc-id.pt www.cloudtm.eu

Nuno Diegues 1/27

slide-2
SLIDE 2

Context: scalability

Large-scale systems

powered by the cloud computing paradigm

Nuno Diegues 2/27

slide-3
SLIDE 3

Context: scalability

BigTable Dynamo Cassandra 2006 2007 2008

Weak Consistency

Nuno Diegues 2/27

slide-4
SLIDE 4

Context: scalability

BigTable Dynamo Cassandra 2006 2007 2008

Weak Consistency

Programmers were exposed to idiosyncrasies of the concurrency model

Nuno Diegues 2/27

slide-5
SLIDE 5

Context: strong consistency

BigTable Dynamo Cassandra Azure Megastore SCORe Spanner 2006 2007 2008 2011 2012

Weak Consistency Strong Consistency

Serializable transactions

Nuno Diegues 3/27

slide-6
SLIDE 6

Context: exploited techniques

Partial Replication

Nuno Diegues 4/27

slide-8
SLIDE 8

Context: exploited techniques

access

Genuineness

Nuno Diegues 4/27

slide-9
SLIDE 9

Context: exploited techniques

commit?

Genuineness

Nuno Diegues 4/27

slide-10
SLIDE 10

Context: exploited techniques

read-only transaction

Multi-versions

Nuno Diegues 4/27

slide-11
SLIDE 11

Context: exploited techniques

read-only transaction update transaction

Multi-versions

Nuno Diegues 4/27

slide-12
SLIDE 12

Context: exploited techniques

read-only transaction update transaction

modify

Multi-versions

Nuno Diegues 4/27

slide-13
SLIDE 13

Context: exploited techniques

read-only transaction update transaction

modify

2 2 2 2 1 1 1 1 1 1

Multi-versions

Nuno Diegues 4/27

slide-14
SLIDE 14

Context: exploited techniques

read-only transaction update transaction consistent reads 2 2 1 1 1 1 2 2 1 1

Multi-versions

Nuno Diegues 4/27

slide-15
SLIDE 15

Context: exploited techniques

read-only transaction update transaction consistent reads 2 2 1 1 1 1 2 2 1 1

Multi-versions — abort-free read-only transactions

Nuno Diegues 4/27
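A minimal sketch of why multi-versioning keeps read-only transactions abort-free, assuming a hypothetical in-memory `MultiVersionStore` (the class and method names are illustrative and are not taken from SCORe or Bumper). A reader fixes a snapshot timestamp when it starts and always reads the newest version at or below that timestamp, so a concurrent update cannot invalidate it.

```python
class MultiVersionStore:
    """Toy multi-version store; every name here is hypothetical."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.clock = 0      # last commit timestamp handed out

    def commit_write(self, key, value):
        # Each committed write creates a new version instead of overwriting.
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def read_at(self, key, snapshot_ts):
        # Return the newest version whose commit timestamp is <= snapshot_ts.
        result = None
        for commit_ts, value in self.versions.get(key, []):
            if commit_ts <= snapshot_ts:
                result = value
        return result


store = MultiVersionStore()
store.commit_write("x", "x1")
store.commit_write("y", "y1")
snapshot = store.clock          # a read-only transaction starts here
store.commit_write("x", "x2")   # a concurrent update transaction commits

# The reader still sees a consistent snapshot and never has to abort.
reader_view = (store.read_at("x", snapshot), store.read_at("y", snapshot))
```

The reader's view stays at the versions that existed when it started, while a transaction starting later would observe the new version of `x`.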

slide-16
SLIDE 16

A State of the art protocol: SCORe [Middleware12]

Open-source project

Nuno Diegues 5/27

slide-17
SLIDE 17

A State of the art protocol: SCORe [Middleware12]

[Plot: throughput (txs/s) vs. #machines]

Scalable performance through genuineness and multi-version

Nuno Diegues 5/27

slide-18
SLIDE 18

A State of the art protocol: SCORe [Middleware12]

[Plot: throughput (txs/s) vs. #machines]

Update transactions performance depends on contention patterns

Nuno Diegues 5/27

slide-19
SLIDE 19

Problem 1/2

Overly pessimistic validation schemes

Nuno Diegues 6/27

slide-20
SLIDE 20

Problem 1/2

function commit(Transaction tx):
    for each ‹datum, version› ∈ tx.readSet do
        if not latestVersion(datum, version) then
            abort(tx)

Nuno Diegues 6/27

slide-21
SLIDE 21

Problem 1/2

read x

T1

read y

Nuno Diegues 6/27

slide-22
SLIDE 22

Problem 1/2

read x

T1

read y read x write x

T2

Nuno Diegues 6/27

slide-23
SLIDE 23

Problem 1/2

read x

T1

read y write y read x write x

T2

Nuno Diegues 6/27

slide-24
SLIDE 24

Problem 1/2

read x

T1

read y write y read x write x

rw

T2

Stale read — T1 missed T2

Nuno Diegues 6/27

slide-25
SLIDE 25

Problem 1/2

read x

T1

read y write y read x write x

rw

T2

Sufficient condition for rejecting non-serializable histories

But it is not necessary

Nuno Diegues 6/27
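The pessimistic commit check from the pseudocode above can be sketched as follows. This is an illustration only: `validate_pessimistic` and the dictionary-based read-set are assumptions made for the example, not the protocol's actual data structures. It replays the T1/T2 scenario: T1 read x and y, T2 then committed a new version of x, and T1 is rejected even though the serial order T1 then T2 would be valid.

```python
def validate_pessimistic(read_set, latest_versions):
    """Return True only if every version read is still the latest one.

    read_set:        datum -> version observed by the transaction
    latest_versions: datum -> latest committed version
    """
    for datum, version in read_set.items():
        if latest_versions[datum] != version:
            return False  # stale read detected: the transaction aborts
    return True


# T1 read x and y at version 1 and wants to write y.
t1_read_set = {"x": 1, "y": 1}

# T2 (read x, write x) committed in the meantime, creating version 2 of x.
latest_after_t2 = {"x": 2, "y": 1}

# T1 is aborted, although serializing T1 before T2 would be correct.
t1_may_commit = validate_pessimistic(t1_read_set, latest_after_t2)
```

The check is sufficient to reject non-serializable histories, but this example shows it also rejects a serializable one.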

slide-26
SLIDE 26

Problem 2/2

Contention hotspots are inherently non-parallelizable

Nuno Diegues 7/27

slide-27
SLIDE 27

Problem 2/2

read z write z

T1

Nuno Diegues 7/27

slide-28
SLIDE 28

Problem 2/2

read z write z read z write z

T2 T1

Nuno Diegues 7/27

slide-29
SLIDE 29

Problem 2/2

read z write z read z write z

rw rw

T2 T1

Nuno Diegues 7/27

slide-30
SLIDE 30

Contribution: Bumper

Objective

Enable scalability in conflict-prone scenarios

By sheltering transactions from conflicts

Nuno Diegues 8/27

slide-31
SLIDE 31

Contribution: Bumper

Objective

[Plot: throughput (txs/s) vs. #machines; series: SCORe]

Nuno Diegues 8/27

slide-32
SLIDE 32

Contribution: Bumper

Objective

[Plot: throughput (txs/s) vs. #machines; series: Bumper, SCORe]

Nuno Diegues 8/27

slide-33
SLIDE 33

Contributions: outline

Distributed Time-Warping

◮ automatically avoid benign conflicts

Nuno Diegues 9/27

slide-34
SLIDE 34

Contributions: outline

Distributed Time-Warping

◮ automatically avoid benign conflicts

Delayed Actions

◮ circumvent contention hotspots

Nuno Diegues 9/27

slide-35
SLIDE 35

Contributions: outline

Distributed Time-Warping

◮ automatically avoid benign conflicts

Delayed Actions

◮ circumvent contention hotspots

Experimental Evaluation

Nuno Diegues 9/27

slide-36
SLIDE 36

Contribution: 1/2 Distributed Time-Warping

Allow transactions that observe stale reads to commit

Nuno Diegues 10/27

slide-37
SLIDE 37

Contribution: 1/2

read x

T1

read y write y read x write x

rw

T2

Nuno Diegues 10/27

slide-38
SLIDE 38

Contribution: 1/2

read x

T1

read y write y read x write x

rw

T2 T2 T1

time

serialization

Nuno Diegues 10/27

slide-39
SLIDE 39

Contribution: 1/2

read x

T1

read y write y read x write x

rw

T2 T2 T1

time

rw

serialization

Nuno Diegues 10/27

slide-40
SLIDE 40

Contribution: 1/2

read x

T1

read y write y read x write x

rw

T2 T2

serialization

Nuno Diegues 10/27

slide-41
SLIDE 41

Contribution: 1/2

read x

T1

read y write y read x write x

rw

T2 T2

serialization

T1

rw

Nuno Diegues 10/27

slide-42
SLIDE 42

Contribution: 1/2

read x

T1

read y write y read x write x

rw

T2 T2

serialization

T1

rw

Time-warp commit

Nuno Diegues 10/27

slide-43
SLIDE 43

Distributed Time-Warping: intuition

T tries to time-warp to the past if it read stale data

Nuno Diegues 11/27

slide-44
SLIDE 44

Distributed Time-Warping: intuition

T tries to time-warp to the past if it read stale data

Acceptable if no T′ witnessed the absence of T

Nuno Diegues 11/27

slide-45
SLIDE 45

Distributed Time-Warping: rules 1/2

Distributed scalar clock protocol — used to version data

1 tsC orders transactions after dependencies (read-from)

Nuno Diegues 12/27

slide-46
SLIDE 46

Distributed Time-Warping: rules 1/2

Distributed scalar clock protocol — used to version data

1 tsC orders transactions after dependencies (read-from)
2 before transactions that were missed

Nuno Diegues 12/27

slide-47
SLIDE 47

Distributed Time-Warping: rules 1/2

Distributed scalar clock protocol — used to version data

1 tsC orders transactions after dependencies (read-from)
  ◮ In the absence of stale reads, T2.tsC = scalar clock
2 before transactions that were missed

read x

T1

read y read x write x

T2

ts = 1

Nuno Diegues 12/27

slide-49
SLIDE 49

Distributed Time-Warping: rules 1/2

Distributed scalar clock protocol — used to version data

1 tsC orders transactions after dependencies (read-from)
  ◮ In the absence of stale reads, T2.tsC = scalar clock
2 before transactions that were missed
  ◮ Otherwise, serialize before earliest transaction missed

read x

T1

read y read x write x

T2

ts = 1

Nuno Diegues 12/27

slide-50
SLIDE 50

Distributed Time-Warping: rules 1/2

Distributed scalar clock protocol — used to version data

1 tsC orders transactions after dependencies (read-from)
  ◮ In the absence of stale reads, T2.tsC = scalar clock
2 before transactions that were missed
  ◮ Otherwise, serialize before earliest transaction missed
  ◮ earliest missed T2: T1.tsC = T2.tsC − ε

read x

T1

read y read x write x

T2

ts = 1

Nuno Diegues 12/27

slide-51
SLIDE 51

Distributed Time-Warping: rules 1/2

Distributed scalar clock protocol — used to version data

1 tsC orders transactions after dependencies (read-from)
  ◮ In the absence of stale reads, T2.tsC = scalar clock
2 before transactions that were missed
  ◮ Otherwise, serialize before earliest transaction missed
  ◮ earliest missed T2: T1.tsC = T2.tsC − ε

read x

T1

read y write y read x write x

T2

ts = 1

Nuno Diegues 12/27

slide-52
SLIDE 52

Distributed Time-Warping: rules 1/2

Distributed scalar clock protocol — used to version data

1 tsC orders transactions after dependencies (read-from)
  ◮ In the absence of stale reads, T2.tsC = scalar clock
2 before transactions that were missed
  ◮ Otherwise, serialize before earliest transaction missed
  ◮ earliest missed T2: T1.tsC = T2.tsC − ε

read x

T1

read y write y read x write x

rw

T2

ts = 1

Nuno Diegues 12/27

slide-53
SLIDE 53

Distributed Time-Warping: rules 1/2

Distributed scalar clock protocol — used to version data

1 tsC orders transactions after dependencies (read-from)
  ◮ In the absence of stale reads, T2.tsC = scalar clock
2 before transactions that were missed
  ◮ Otherwise, serialize before earliest transaction missed
  ◮ earliest missed T2: T1.tsC = T2.tsC − ε

read x

T1

read y write y read x write x

rw

T2

ts = 1 ts = 1 - ε

Nuno Diegues 12/27

slide-54
SLIDE 54

Distributed Time-Warping: rules 1/2

Distributed scalar clock protocol — used to version data

1 tsC orders transactions after dependencies (read-from)
  ◮ In the absence of stale reads, T2.tsC = scalar clock
2 before transactions that were missed
  ◮ Otherwise, serialize before earliest transaction missed
  ◮ earliest missed T2: T1.tsC = T2.tsC − ε

read x

T1

read y write y read x write x

rw

T2

ts = 1 ts = 1 - ε write z

T3

ts = 2

Nuno Diegues 12/27
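The two timestamp rules above can be sketched as a small function. This is a sketch under stated assumptions: `commit_timestamp`, the use of `Fraction`, and the concrete value of `EPSILON` are choices made for the example; the slides only say that a time-warping transaction takes the timestamp of the earliest transaction it missed, minus ε.

```python
from fractions import Fraction

# Assumption: any positive value smaller than one clock step works as epsilon.
EPSILON = Fraction(1, 1000)


def commit_timestamp(clock_value, missed_commit_timestamps):
    """Pick tsC for a committing transaction.

    clock_value:              the scalar clock value it would normally get
    missed_commit_timestamps: tsC of transactions whose writes it missed
    """
    if not missed_commit_timestamps:
        # Rule 1: no stale reads, so serialize at the scalar clock.
        return Fraction(clock_value)
    # Rule 2: time-warp just before the earliest transaction that was missed.
    return min(missed_commit_timestamps) - EPSILON


# Scenario from the slides: T2 commits first, T1 missed T2's write of x,
# and T3 (write z) commits afterwards without stale reads.
ts_t2 = commit_timestamp(1, [])
ts_t1 = commit_timestamp(2, [ts_t2])
ts_t3 = commit_timestamp(2, [])

ordered = sorted([("T1", ts_t1), ("T2", ts_t2), ("T3", ts_t3)], key=lambda p: p[1])
serial_order = [name for name, _ in ordered]
```

Sorting by tsC yields the serialization order T1, T2, T3, matching the diagram where T1 commits later in real time but is ordered before T2.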

slide-55
SLIDE 55

Distributed Time-Warping: rules 2/2

When can we not apply this idea?

Nuno Diegues 13/27

slide-56
SLIDE 56

Distributed Time-Warping: rules 2/2

When can we not apply this idea?

Look out for a specific structure

Nuno Diegues 13/27

slide-57
SLIDE 57

Distributed Time-Warping: rules 2/2

When can we not apply this idea?

Look out for a specific structure:

Three transactions connected

◮ a triad

A

write y read y

rw

T

write x

B

read x

rw

Nuno Diegues 13/27

slide-58
SLIDE 58

Distributed Time-Warping: rules 2/2

When can we not apply this idea?

Look out for a specific structure:

Three transactions connected

◮ a triad

The link between all three

◮ the pivot

A

write y read y

rw

T

write x

B

read x

rw

Pivot

Nuno Diegues 13/27

slide-59
SLIDE 59

Distributed Time-Warping: rules 2/2

When can we not apply this idea?

Look out for a specific structure:

Three transactions connected

◮ a triad

The link between all three

◮ the pivot

Abort if:

◮ Completes a triad
◮ Whose pivot time-warp commits

A

write y read y

rw

T

write x

B

read x

rw

Pivot

Nuno Diegues 13/27

slide-60
SLIDE 60

Distributed Time-Warping: details

read access

read-set

empty

Nuno Diegues 14/27

slide-61
SLIDE 61

Distributed Time-Warping: details

read access

stamp: <ts, tid>

read-set

Register read operation

Nuno Diegues 14/27

slide-62
SLIDE 62

Distributed Time-Warping: details

read-only transaction

stamp: <ts, tid>

read-set

Abort-free read-only transactions

Nuno Diegues 14/27

slide-63
SLIDE 63

Distributed Time-Warping: details

stamp: <ts, tid>

read write

stamp: <ts, tid>

read-set write-set

Consider update transaction: Validate writes and reads

Nuno Diegues 14/27

slide-64
SLIDE 64

Distributed Time-Warping: details

stamp: <10, T> stamp: <10, T> validate writes T.start = 9 may time-warp?

read-set write-set

Validate writes and reads

Nuno Diegues 14/27

slide-65
SLIDE 65

Distributed Time-Warping: details

stamp: <10, T> stamp: <10, T> validate writes T.start = 9 may time-warp? Yes

read-set write-set

Validate writes and reads

Nuno Diegues 14/27

slide-66
SLIDE 66

Distributed Time-Warping: details

stamp: <10, T> stamp: <10, T> validate reads need time-warp? T.start = 9 may time-warp? Yes

read-set write-set

Validate writes and reads

Nuno Diegues 14/27

slide-67
SLIDE 67

Distributed Time-Warping: details

stamp: <10, T> stamp: <10, T> validate reads need time-warp? No T.start = 9 may time-warp? Yes

read-set write-set

1 1

1 1

Validate writes and reads

Nuno Diegues 14/27

slide-68
SLIDE 68

Distributed Time-Warping: details

stamp: <10, T> stamp: <10, T> validate reads need time-warp? Yes T.start = 9 may time-warp? Yes

read-set write-set

2 1

1 1

new version missed!

1

Time-warp commit

Nuno Diegues 14/27

slide-69
SLIDE 69

Distributed Time-Warping: details

stamp: <10, T> stamp: <11, Z> need time-warp? Yes T.start = 9 may time-warp? No

read-set write-set

2 1

1 1

1

Abort condition

Nuno Diegues 14/27
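The three outcomes walked through above (regular commit, time-warp commit, abort) can be sketched as a decision function. This is one plausible reading of the slides, not the published algorithm: the rule used in `may_time_warp` (no other transaction has stamped a written item with a timestamp above `T.start`) is an assumption inferred from the `<10, T>` and `<11, Z>` examples, and all function and parameter names are hypothetical.

```python
def needs_time_warp(read_set, latest_versions):
    # Validate reads: a newer version of something we read means we missed a writer.
    return any(latest_versions[key] != version for key, version in read_set.items())


def may_time_warp(tid, start_ts, write_set, read_stamps):
    # Validate writes (assumed rule): if another transaction read an item we
    # write, with a stamp newer than our start, it witnessed our absence.
    for key in write_set:
        stamp_ts, stamp_tid = read_stamps.get(key, (0, tid))
        if stamp_tid != tid and stamp_ts > start_ts:
            return False
    return True


def commit_outcome(tid, start_ts, read_set, write_set, latest_versions, read_stamps):
    if not needs_time_warp(read_set, latest_versions):
        return "commit"
    if may_time_warp(tid, start_ts, write_set, read_stamps):
        return "time-warp commit"
    return "abort"


reads = {"r1": 1, "r2": 1}
writes = {"w"}

no_warp = commit_outcome("T", 9, reads, writes, {"r1": 1, "r2": 1}, {"w": (10, "T")})
warp = commit_outcome("T", 9, reads, writes, {"r1": 2, "r2": 1}, {"w": (10, "T")})
blocked = commit_outcome("T", 9, reads, writes, {"r1": 2, "r2": 1}, {"w": (11, "Z")})
```

The last case corresponds to the abort condition: T would be the pivot of a triad, because it missed a writer and was itself missed by reader Z.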

slide-70
SLIDE 70

Contribution: 2/2

read z write z read z write z

rw rw

T2 T1

Recall hotspot example

Nuno Diegues 15/27

slide-71
SLIDE 71

Contribution: 2/2

read z write z read z write z

rw rw

T2 T1

? Delayed Actions programming abstraction

Nuno Diegues 15/27

slide-73
SLIDE 73

Contribution: 2/2

function TPC-C-payment(wID, dID, cID, amount):
    store.startTransaction()
    ⊲ load warehouse wID
    ⊲ load district dID
    ⊲ load customer cID
    ⊲ decrement customer balance
    ⊲ increment customer yield
    ⊲ update customer credit history
    ⊲ increment warehouse yield
    ⊲ increment district yield
    store.commitTransaction()
    return

Nuno Diegues 15/27

slide-74
SLIDE 74

Contribution: 2/2

function TPC-C-payment(wID, dID, cID, amount):
    store.startTransaction()
    ⊲ load warehouse wID
    ⊲ load district dID
    ⊲ load customer cID
    ⊲ decrement customer balance
    ⊲ increment customer yield
    ⊲ update customer credit history
    ⊲ increment warehouse yield      [contention hotspot]
    ⊲ increment district yield
    store.commitTransaction()        [restart a lot of work]
    return

Nuno Diegues 15/27

slide-75
SLIDE 75

Contribution: 2/2

function TPC-C-payment(wID, dID, cID, amount):
    store.startTransaction()
    ...                                    ⊲ lots of stuff
    String key ← wID + "balanceKey"
    int balance ← store.read(key)          ⊲ does not affect transaction
    balance ← balance + amount
    store.write(key, balance)              ⊲ objective is to increment
    ...
    store.commitTransaction()
    return

Nuno Diegues 15/27

slide-76
SLIDE 76

Delayed Actions: key idea

T1 T2

Server

read read start start

...

read balance write balance write balance read balance

... ... ...

try commit try commit

ok

abort

Nuno Diegues 16/27

slide-78
SLIDE 78

Delayed Actions: key idea

T1 T2

Server

read read start start

... ...

try commit try commit

ok

delay T1 delay T2

ok

Nuno Diegues 16/27

slide-79
SLIDE 79

Contribution 2/2

Delayed Actions: identify operations processed without conflicts

Nuno Diegues 17/27

slide-80
SLIDE 80

Contribution 2/2

Delayed Actions: identify operations

◮ outcome does not affect the transaction

processed without conflicts

Nuno Diegues 17/27

slide-81
SLIDE 81

Contribution 2/2

Delayed Actions: identify operations

◮ outcome does not affect the transaction
◮ requires programmer’s knowledge

processed without conflicts

Nuno Diegues 17/27

slide-83
SLIDE 83

Contribution 2/2

Delayed Actions: identify operations

◮ outcome does not affect the transaction
◮ requires programmer’s knowledge

processed without conflicts

◮ delayed to commit-time

Nuno Diegues 17/27

slide-84
SLIDE 84

Contribution 2/2

Delayed Actions: identify operations

◮ outcome does not affect the transaction
◮ requires programmer’s knowledge

processed without conflicts

◮ delayed to commit-time
◮ use the latest snapshot

Nuno Diegues 17/27

slide-85
SLIDE 85

Contribution 2/2

Delayed Actions: identify operations

◮ outcome does not affect the transaction
◮ requires programmer’s knowledge

processed without conflicts

◮ delayed to commit-time
◮ use the latest snapshot
◮ ensure it cannot be invalidated

Nuno Diegues 17/27

slide-86
SLIDE 86

Delayed Actions: details

In the commit procedure:

Nuno Diegues 18/27

slide-87
SLIDE 87

Delayed Actions: details

In the commit procedure:

acquire locks for delayed actions

Nuno Diegues 18/27

slide-88
SLIDE 88

Delayed Actions: details

In the commit procedure:

acquire locks for delayed actions

keys were defined by programmer

Nuno Diegues 18/27

slide-89
SLIDE 89

Delayed Actions: details

In the commit procedure:

acquire locks for delayed actions

keys were defined by programmer

◮ over-approximate for dynamic structures

Nuno Diegues 18/27

slide-90
SLIDE 90

Delayed Actions: details

In the commit procedure:

acquire locks for delayed actions

keys were defined by programmer

◮ over-approximate for dynamic structures

if successful, confirm transaction:

Nuno Diegues 18/27

slide-91
SLIDE 91

Delayed Actions: details

In the commit procedure:

acquire locks for delayed actions

keys were defined by programmer

◮ over-approximate for dynamic structures

if successful, confirm transaction:

◮ atomically execute delayed actions

Nuno Diegues 18/27

slide-92
SLIDE 92

Delayed Actions: details

In the commit procedure:

acquire locks for delayed actions

keys were defined by programmer

◮ over-approximate for dynamic structures

if successful, confirm transaction:

◮ atomically execute delayed actions
◮ write-back deferred updates

Nuno Diegues 18/27

slide-93
SLIDE 93

Delayed Actions: details

In the commit procedure:

acquire locks for delayed actions

keys were defined by programmer

◮ over-approximate for dynamic structures

if successful, confirm transaction:

◮ atomically execute delayed actions
◮ write-back deferred updates
◮ atomicity enforced with sequential confirmation thread per machine

Nuno Diegues 18/27
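The commit-time steps listed above can be sketched on a single node as follows. This is a toy illustration, not Bumper's implementation: `ToyStore`, `begin`, `delay`, and `commit` are hypothetical names, the per-key `threading.Lock` objects stand in for the delayed-action locks, and `confirm_lock` stands in for the sequential confirmation thread.

```python
import threading


class ToyStore:
    """Hypothetical single-node store illustrating delayed actions."""

    def __init__(self):
        self.data = {}
        self.key_locks = {}                   # key -> lock for delayed actions
        self.confirm_lock = threading.Lock()  # stand-in for the confirmation thread

    def begin(self):
        return {"writes": {}, "delayed": []}

    def delay(self, tx, keys, action):
        # The programmer declares up front which keys the action touches.
        tx["delayed"].append((list(keys), action))

    def commit(self, tx):
        # 1. Acquire locks for all delayed-action keys (sorted to avoid deadlock).
        lock_keys = sorted({k for keys, _ in tx["delayed"] for k in keys})
        locks = [self.key_locks.setdefault(k, threading.Lock()) for k in lock_keys]
        for lock in locks:
            lock.acquire()
        try:
            with self.confirm_lock:
                # 2. Execute delayed actions atomically on the latest state.
                for keys, action in tx["delayed"]:
                    action(self.data, keys)
                # 3. Write back the transaction's deferred updates.
                self.data.update(tx["writes"])
        finally:
            for lock in reversed(locks):
                lock.release()
        return True


def increment(change):
    def action(data, keys):
        data[keys[0]] = data.get(keys[0], 0) + change
    return action


store = ToyStore()
t1, t2 = store.begin(), store.begin()    # both start before either commits
store.delay(t1, ["balance"], increment(5))
store.delay(t2, ["balance"], increment(7))
t1["writes"]["customer:1"] = "paid"
t1_ok, t2_ok = store.commit(t1), store.commit(t2)
```

Neither transaction reads the hot `balance` key during execution, so neither can observe a stale value of it; both increments are applied at commit time and no work is restarted.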

slide-94
SLIDE 94

In more detail...

Algorithms for Bumper

Nuno Diegues 19/27

slide-95
SLIDE 95

In more detail...

Integration with different Distributed Data Platforms

Nuno Diegues 19/27

slide-96
SLIDE 96

In more detail...

Correctness proofs

Nuno Diegues 19/27

slide-97
SLIDE 97

In more detail...

Fault-tolerance

Nuno Diegues 19/27

slide-98
SLIDE 98

Experimental Evaluation: questions

How much can Bumper enhance scalability in conflict-prone scenarios?

Nuno Diegues 20/27

slide-99
SLIDE 99

Experimental Evaluation: questions

How much can Bumper enhance scalability in conflict-prone scenarios?

To what extent can it reduce transactions’ abort rate?

Nuno Diegues 20/27

slide-100
SLIDE 100

Experimental Evaluation: questions

How much can Bumper enhance scalability in conflict-prone scenarios?

To what extent can it reduce transactions’ abort rate?

What overheads does it introduce?

Nuno Diegues 20/27

slide-101
SLIDE 101

Experimental Evaluation: example

[Plot: throughput (txs/s) and aborted transactions (%) vs. #nodes; series: Bumper throughput, SCORe throughput, Bumper aborts, SCORe aborts]

Nuno Diegues 21/27

slide-102
SLIDE 102

Ordered Skip-List: DTW

[Plot: throughput (txs/s) and aborted transactions (%) vs. #nodes; series: SCORe throughput, SCORe aborts]

50% read-only; transactions modify different elements but conflict

Nuno Diegues 22/27

slide-103
SLIDE 103

Ordered Skip-List: DTW

[Plot: throughput (txs/s) and aborted transactions (%) vs. #nodes; series: Bumper throughput, SCORe throughput, Bumper aborts, SCORe aborts]

Speedup of 2.23× at 160 machines

Nuno Diegues 22/27

slide-104
SLIDE 104

TPC-C: DA

[Plot: throughput (txs/s) and aborted transactions (%) vs. #nodes; series: SCORe throughput, SCORe aborts]

also 50% read-only; transactions update warehouse/district balance often

Nuno Diegues 23/27

slide-105
SLIDE 105

TPC-C: DA

[Plot: throughput (txs/s) and aborted transactions (%) vs. #nodes; series: Bumper throughput, SCORe throughput, Bumper aborts, SCORe aborts]

Speedup of 3.4× at 160 machines

Nuno Diegues 23/27

slide-106
SLIDE 106

Vacation: DTW + DA

[Plot: throughput (txs/s) and aborted transactions (%) vs. #nodes; series: SCORe throughput, SCORe aborts]

also 50% read-only; both Time-Warp and Delayed Actions

Nuno Diegues 24/27

slide-107
SLIDE 107

Vacation: DTW + DA

[Plot: throughput (txs/s) and aborted transactions (%) vs. #nodes; series: Bumper throughput, SCORe throughput, Bumper aborts, SCORe aborts]

Reduction of abort rate to under 1%.

Nuno Diegues 24/27

slide-108
SLIDE 108

Overhead assessment

[Two plots: throughput (txs/s) and aborted transactions (%) vs. #nodes, with series Bumper throughput, SCORe throughput, Bumper aborts, SCORe aborts; and throughput (txs/s) vs. #nodes, with series Bumper throughput, SCORe throughput]

Nuno Diegues 25/27

slide-109
SLIDE 109

Summary

Strong consistency is possible with MVCC + Genuine Partial

Nuno Diegues 26/27

slide-110
SLIDE 110

Summary

Strong consistency is possible with MVCC + Genuine Partial

Update transactions impair scalability:

Nuno Diegues 26/27

slide-111
SLIDE 111

Summary

Strong consistency is possible with MVCC + Genuine Partial

Update transactions impair scalability:

◮ benign conflicts that do not threaten serializability

Nuno Diegues 26/27

slide-112
SLIDE 112

Summary

Strong consistency is possible with MVCC + Genuine Partial

Update transactions impair scalability:

◮ benign conflicts that do not threaten serializability
◮ hotspots of contention

Nuno Diegues 26/27

slide-113
SLIDE 113

Summary

Strong consistency is possible with MVCC + Genuine Partial

Update transactions impair scalability:

◮ benign conflicts that do not threaten serializability
◮ hotspots of contention

Distributed Time-Warping to automatically reduce benign conflicts

Nuno Diegues 26/27

slide-114
SLIDE 114

Summary

Strong consistency is possible with MVCC + Genuine Partial

Update transactions impair scalability:

◮ benign conflicts that do not threaten serializability
◮ hotspots of contention

Distributed Time-Warping to automatically reduce benign conflicts

Delayed Actions to overcome hotspots of contention

Nuno Diegues 26/27

slide-115
SLIDE 115

Questions?

Thank you!

Nuno Diegues 27/27

slide-116
SLIDE 116

Synergy: DTW + DA

Nuno Diegues 28/27

slide-117
SLIDE 117

Synergy: DTW + DA

read x:0 write x=1

T1

Serialization: i.e., possible equivalent serial order...

Nuno Diegues 28/27

slide-118
SLIDE 118

Synergy: DTW + DA

read x:0 write x=1 read x:0 delayed:z

T1 T2

Serialization: T1

Nuno Diegues 28/27

slide-119
SLIDE 119

Synergy: DTW + DA

read x:0 write x=1 read x:0 delayed:z

T1 T2

read z:0 write z=1

Serialization: T1

Nuno Diegues 28/27

slide-120
SLIDE 120

Synergy: DTW + DA

read x:0 write x=1 read x:0 delayed:z

rw

T1 T2

read z:0 write z=1

Serialization: T2 → T1

Nuno Diegues 28/27

slide-121
SLIDE 121

Synergy: DTW + DA

read x:0 write x=1 read x:0 delayed:z

rw

T1 T2

read z:0 write z=1 read z:0 write z=1

T3

read x:1

Serialization: T2 → T1

Nuno Diegues 28/27

slide-122
SLIDE 122

Synergy: DTW + DA

read x:0 write x=1 read x:0 delayed:z

rw rw

T1 T2

read z:0 write z=1 read z:0 write z=1

T3

read x:1

Serialization: T3 → T2 → T1

Nuno Diegues 28/27

slide-123
SLIDE 123

Synergy: DTW + DA

read x:0 write x=1 read x:0 delayed:z

rw rw

T1 T2

read z:0 write z=1 read z:0 write z=1

T3

read x:1

Serialization: T3 → T2 → T1 → T3

Nuno Diegues 28/27

slide-124
SLIDE 124

Synergy: DTW + DA

read x:0 write x=1 read x:0 delayed:z

rw rw

T1 T2

read z:0 write z=1 read z:0 write z=1

T3

read x:1

Possible to overcome, at the cost of additional validations — future work

Nuno Diegues 28/27

slide-125
SLIDE 125

Synergy: DTW + DA

read x:0 write x=1 read x:0 delayed:z

rw rw

T1 T2

read z:0 write z=1 read z:0 write z=1

T3

read x:1

A transaction either Time-Warps or uses Delayed Actions (or none...)

Nuno Diegues 28/27

slide-126
SLIDE 126

Red Blue Consistency [OSDI12]

Consistency intuition to reason on serializable transactions and eventual consistency

Nuno Diegues 29/27

slide-127
SLIDE 127

Red Blue Consistency [OSDI12]

Consistency intuition to reason on serializable transactions and eventual consistency

We have more scalable red transactions

Nuno Diegues 29/27

slide-128
SLIDE 128

Red Blue Consistency [OSDI12]

Consistency intuition to reason on serializable transactions and eventual consistency

We have more scalable red transactions

We could have blue as well

Nuno Diegues 29/27

slide-129
SLIDE 129

Red Blue Consistency [OSDI12]

Consistency intuition to reason on serializable transactions and eventual consistency

We have more scalable red transactions

We could have blue as well

Our red transactions are better

Nuno Diegues 29/27

slide-130
SLIDE 130

Serializable Snapshot Isolation [TODS09]

Bumper: abort-free read-only transactions

Nuno Diegues 30/27

slide-131
SLIDE 131

Serializable Snapshot Isolation [TODS09]

Bumper:

abort-free read-only transactions

distributed, genuine partial

Nuno Diegues 30/27

slide-132
SLIDE 132

Serializable Snapshot Isolation [TODS09]

Bumper:

abort-free read-only transactions

distributed, genuine partial

accept even more histories

Nuno Diegues 30/27

slide-133
SLIDE 133

Serializable Snapshot Isolation[TODS09]

write x read x write y

rw

T1 T2

read z

T3

inConflict = true inConflict = true

outConflict = true outConflict = true

Nuno Diegues 31/27

slide-134
SLIDE 134

Serializable Snapshot Isolation[TODS09]

write x read x write y

rw

T1 T2

read z

T3

Nuno Diegues 31/27
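For comparison, the `inConflict` / `outConflict` bookkeeping of Serializable Snapshot Isolation can be sketched as below. This is a simplified illustration of the flag mechanism only; the `Tx` class, the helper names, and the placement of the rw edges in the example are assumptions and are not read off the diagram.

```python
class Tx:
    def __init__(self, name):
        self.name = name
        self.in_conflict = False   # some transaction has an rw edge into this one
        self.out_conflict = False  # this transaction has an rw edge to another one


def add_rw_edge(reader, writer):
    # The reader missed the writer's update: an rw anti-dependency reader -> writer.
    reader.out_conflict = True
    writer.in_conflict = True


def must_abort(tx):
    # SSI conservatively treats a transaction with both flags set as dangerous.
    return tx.in_conflict and tx.out_conflict


t1, t2, t3 = Tx("T1"), Tx("T2"), Tx("T3")
add_rw_edge(t1, t2)  # illustrative edge
add_rw_edge(t2, t3)  # illustrative edge: T2 now has both flags set

flagged = [tx.name for tx in (t1, t2, t3) if must_abort(tx)]
```

The flags alone do not check whether the middle transaction actually closes a cycle, which is why a scheme that inspects more context can accept more histories.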

slide-135
SLIDE 135

Delayed Actions: usage

function void updateBalance(int change):
    store.startTransaction()
    int amount ← store.read("balanceKey")
    amount ← amount + change
    store.write("balanceKey", amount)
    ...                              ⊲ possibly do other stuff
    store.commitTransaction()
    return

Nuno Diegues 32/27

slide-136
SLIDE 136

Delayed Actions: usage

function void updateBalance(int change):
    store.startTransaction()
    store.delay({"balanceKey"}, function void action(String[] keys):
        int amount ← store.read(keys[0])
        amount ← amount + change
        store.write(keys[0], amount)
    )
    ...                              ⊲ possibly do other stuff
    store.commitTransaction()        ⊲ executes delayed actions if successful
    return

Nuno Diegues 32/27

slide-137
SLIDE 137

Delayed Actions: locking

Lock modes: when can we share the lock?

           read   write
read        ✓
write       ✗      ✗

Nuno Diegues 33/27

slide-138
SLIDE 138

Delayed Actions: locking

Lock modes: when can we share the lock?

           read   write   delayed
read        ✓
write       ✗      ✗
delayed     ✗      ✗       ✓

Nuno Diegues 33/27
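The lock-sharing table can be encoded as a small lookup. This sketch assumes the table is a symmetric compatibility matrix shown as its lower triangle (read/read and delayed/delayed may share, every other pairing may not); the `COMPATIBLE` dictionary and `can_share` are hypothetical names for illustration.

```python
# Lower triangle of the (assumed symmetric) compatibility matrix.
COMPATIBLE = {
    ("read", "read"): True,
    ("write", "read"): False,
    ("write", "write"): False,
    ("delayed", "read"): False,
    ("delayed", "write"): False,
    ("delayed", "delayed"): True,
}


def can_share(held_mode, requested_mode):
    """Return True if a lock held in held_mode can be shared with requested_mode."""
    if (held_mode, requested_mode) in COMPATIBLE:
        return COMPATIBLE[(held_mode, requested_mode)]
    # Symmetric case: look up the mirrored entry.
    return COMPATIBLE[(requested_mode, held_mode)]


two_readers = can_share("read", "read")
two_delayed = can_share("delayed", "delayed")
reader_vs_delayed = can_share("read", "delayed")
```

Under this reading, delayed actions on the same key from different transactions can hold the lock together, since they are executed one after another at confirmation time, while ordinary reads and writes of that key are kept out.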