FIT: A Distributed Database Performance Tradeoff Jose M. Faleiro, - - PowerPoint PPT Presentation

SLIDE 1

FIT: A Distributed Database Performance Tradeoff

Jose M. Faleiro, Yale University Daniel J. Abadi, Yale University

  • Presented by Bojun Wang

[Figure: FIT diagram with the labels Fairness, Isolation, Throughput]

SLIDE 2

Isolation vs. Throughput and Fairness

  • Strong isolation -> poor throughput
  • Weak isolation -> good throughput
  • But fairness is a third factor: FIT is a three-way tradeoff

[Figure: FIT diagram with the labels Fairness, Isolation, Throughput]

SLIDE 3

DEFINITIONS

  • Distributed transaction: a transaction whose reads/writes involve records from multiple partitions
  • ASSUMPTION: a distributed database must satisfy Liveness, Atomicity, and Safety

SLIDE 4

DEFINITIONS

  • Liveness: if a distributed transaction is re-submitted every time it sees a system-induced abort, it is guaranteed to commit eventually.
      • System-induced abort: caused by partition failure or deadlock
      • Logic-induced abort: caused by logic inside the transaction
  • Safety: all nodes involved in a distributed transaction must agree to commit; otherwise the transaction aborts.
  • Atomicity: either all or none of a transaction's updates are in the database.
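
The liveness definition above can be illustrated with a small retry loop. This is only a sketch: the exception classes, `run_until_commit`, and the two example transactions are hypothetical names introduced here, not part of any system discussed in the slides.

```python
class SystemInducedAbort(Exception):
    """Abort caused by the system, e.g. a partition failure or a deadlock."""


class LogicInducedAbort(Exception):
    """Abort requested by the transaction's own logic."""


def run_until_commit(txn):
    """Re-submit txn after every system-induced abort.

    Returns (result, attempts). A LogicInducedAbort is not retried;
    it propagates to the caller.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return txn(), attempts
        except SystemInducedAbort:
            continue  # re-submit the same transaction


def make_deadlock_prone_txn(aborts_before_commit):
    """Hypothetical transaction that is a deadlock victim a fixed number of times."""
    remaining = [aborts_before_commit]

    def txn():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise SystemInducedAbort("chosen as deadlock victim")
        return "committed"

    return txn


def overdraw():
    """Hypothetical transaction whose own logic decides to abort."""
    raise LogicInducedAbort("insufficient funds")


print(run_until_commit(make_deadlock_prone_txn(2)))  # ('committed', 3)
```

Liveness only promises a commit for transactions that keep being re-submitted after system-induced aborts; a logic-induced abort such as `overdraw` is a final outcome and is not retried.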

SLIDE 5

Fairness (intuitively)

  • The database system does not deliberately prioritize or delay certain transactions.
  • It never artificially adds latency to a transaction for the purpose of facilitating the execution of other transactions.

SLIDE 6

UNFAIRNESS EXAMPLES

  • Example 1: “group commit”
      • writing logs to disk is slow
      • write the logs of N transactions in one batch, with a single disk write
      • better overall throughput
      • but some transactions cannot commit until the threshold N is met
  • Example 2: “lazy evaluation”
      • collect transactions that read/write spatially close records
      • defer their execution
      • amortize the cost of bringing records into memory
      • but some transactions have to wait for other transactions
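
The group-commit example can be sketched as follows. The class `GroupCommitLog`, the logical clock `now`, and the batch size are assumptions made for illustration; a real log manager would also flush on a timeout.

```python
class GroupCommitLog:
    """Toy group commit: one simulated disk write per batch of N log records."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.pending = []        # (txn_id, arrival_tick) waiting for the next flush
        self.disk_writes = 0     # number of simulated disk writes
        self.commit_delay = {}   # txn_id -> ticks spent waiting for the batch to fill

    def append(self, txn_id, now):
        self.pending.append((txn_id, now))
        if len(self.pending) >= self.batch_size:
            self._flush(now)

    def _flush(self, now):
        self.disk_writes += 1    # a single write covers the whole batch
        for txn_id, arrived in self.pending:
            self.commit_delay[txn_id] = now - arrived
        self.pending.clear()


log = GroupCommitLog(batch_size=4)
for tick in range(8):            # one transaction arrives per tick
    log.append(f"T{tick}", now=tick)

print(log.disk_writes)           # 2 writes instead of 8
print(log.commit_delay["T0"])    # 3 ticks: T0 waited for three later arrivals
print(log.commit_delay["T3"])    # 0 ticks: T3 filled the batch
```

The throughput gain (two disk writes for eight transactions) is paid for by the transactions that arrive early in each batch, which is the unfairness the slide describes.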

SLIDE 7

DEFINITIONS

  • Synchronization independence: one transaction cannot cause another transaction to block or abort, even when their data accesses conflict.
  • Synchronization independence implies weak isolation
      • a system running with synchronization independence cannot guarantee any form of isolation
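
A minimal sketch of why never blocking or aborting a conflicting transaction weakens isolation: two hypothetical transactions, T1 and T2, each read a counter `x` and write back `x + 1`. The function `run_schedule` and the schedules are illustrative assumptions, not taken from the slides.

```python
def run_schedule(schedule, initial=0):
    """Execute read/write steps in the given order; no step ever blocks or aborts."""
    store = {"x": initial}
    local = {}                               # value each transaction has read
    for txn, op in schedule:
        if op == "read":
            local[txn] = store["x"]
        else:                                # "write": store the value read, plus one
            store["x"] = local[txn] + 1
    return store["x"]


serial = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

print(run_schedule(serial))       # 2: both increments are applied
print(run_schedule(interleaved))  # 1: T2 overwrites T1's update (lost update)
```

Preventing the second outcome would require T2 to wait for T1 or to abort, which is exactly what synchronization independence rules out.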

SLIDE 8

FIT TRADEOFF

  • A distributed transaction needs coordination between partitions
  • Strong isolation -> conflicting transactions must wait -> coordination increases wait time -> bad throughput

SLIDE 9

FIT TRADEOFF

  • A distributed transaction needs coordination between nodes
  • Strong isolation -> conflicting transactions must wait -> coordination increases wait time
  • Weak isolation: synchronization independence reduces the impact of coordination -> good throughput

SLIDE 10

FIT TRADEOFF

  • Strong isolation
      • coordination makes conflicting transactions wait longer
  • But giving up fairness can reduce this impact
  • Example
      • do coordination outside of the transaction
      • this does not increase the wait time of conflicting transactions
  • Result: better throughput, worse fairness
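
A back-of-the-envelope model of this argument, under assumptions that are not in the slides: conflicting transactions on one record run one at a time, each needs `exec_ms` of execution, and coordination costs `coord_ms`. The function name and the numbers are illustrative only.

```python
def max_conflicting_throughput(exec_ms, coord_ms, coordinate_inside_txn):
    """Upper bound on conflicting transactions per second on a single record.

    Conflicting transactions are serialized, so the bound is 1 / (time each
    one makes the others wait).
    """
    wait_ms = exec_ms + (coord_ms if coordinate_inside_txn else 0.0)
    return 1000.0 / wait_ms


# Assumed costs: 1 ms of execution, 9 ms of cross-partition coordination.
print(max_conflicting_throughput(1.0, 9.0, coordinate_inside_txn=True))   # 100.0
print(max_conflicting_throughput(1.0, 9.0, coordinate_inside_txn=False))  # 1000.0
```

In this model, moving coordination outside the transaction leaves the coordination cost in place but removes it from the time that conflicting transactions wait for each other. The delay it adds to individual transactions is the fairness cost.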

SLIDE 11

FIT IN EXAMPLES

[Figure: G-Store, Calvin, Spanner, Cassandra, and RAMP positioned on the Fairness / Isolation / Throughput diagram]

SLIDE 12

EXAMPLES: G-Store

[Figure: FIT diagram with the labels Isolation, Throughput, Fairness]

  • KeyGroup
      • put a set of keys on one “leader” partition
      • reduces coordination cost
  • Not fair to keys outside the KeyGroup
  • Some transactions are delayed while a new KeyGroup is formed

SLIDE 13

EXAMPLES: Calvin

[Figure: FIT diagram with the labels Isolation, Throughput, Fairness]

  • Pre-process a batch of transactions
      • generate a total ordering, i.e. a redo log
      • serializable isolation level
      • eliminates deadlock; avoids expensive planning for failures (forced log writes, synchronous replication)
      • minimizes coordination cost
  • Pre-processing a large batch of transactions for throughput -> unfairness
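
The pre-processing idea can be sketched as a toy sequencer. This is not Calvin's code: `Sequencer`, `apply_log`, the sort-by-id ordering rule, and the two example transactions are assumptions chosen to keep the sketch short.

```python
class Sequencer:
    """Toy pre-processing layer: collects a batch, then fixes a total order."""

    def __init__(self):
        self.batch = []   # (txn_id, arrival_tick) submitted during the current epoch
        self.log = []     # totally ordered list of txn ids (acts as a redo log)

    def submit(self, txn_id, arrival_tick):
        self.batch.append((txn_id, arrival_tick))

    def close_epoch(self, now):
        """Append the batch to the log in a deterministic order.

        Returns how long each transaction waited for the epoch to close.
        """
        waits = {}
        for txn_id, arrived in sorted(self.batch):   # assumed rule: order by id
            self.log.append(txn_id)
            waits[txn_id] = now - arrived
        self.batch.clear()
        return waits


def apply_log(log, txns, initial):
    """Execute the log in order; every replica that does this gets the same state."""
    state = dict(initial)
    for txn_id in log:
        txns[txn_id](state)
    return state


def add_ten(state):
    state["x"] += 10


def double(state):
    state["x"] *= 2


seq = Sequencer()
seq.submit("T2", arrival_tick=0)
seq.submit("T1", arrival_tick=3)
waits = seq.close_epoch(now=10)

txns = {"T1": add_ten, "T2": double}
replica_a = apply_log(seq.log, txns, {"x": 1})
replica_b = apply_log(seq.log, txns, {"x": 1})
print(seq.log, replica_a, replica_b)   # ['T1', 'T2'] {'x': 22} {'x': 22}
print(waits)                           # {'T1': 7, 'T2': 10}
```

Because the order is fixed before execution, the replicas need no agreement while executing. The `waits` values show the cost: T2 arrived first but was held for the whole epoch.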

SLIDE 14

EXAMPLES: Spanner

[Figure: FIT diagram with the labels Isolation, Throughput, Fairness]

  • Serializable isolation level
  • Guarantees fairness
  • Two-phase commit in a replicated setting
      • synchronously replicate every node's prepare vote
      • synchronously replicate the coordinator's final commit decision
  • Coordination during the transaction -> hurts throughput
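
The two replication steps listed above can be sketched as a toy two-phase commit. This is not Spanner's API: `two_phase_commit` and the `replicate` callback are hypothetical, and the callback merely stands in for a synchronous replication round.

```python
def two_phase_commit(votes, replicate):
    """Toy two-phase commit in which every vote and the decision are replicated.

    votes: dict mapping node name -> "yes" or "no" (the node's prepare vote).
    replicate: callable standing in for one synchronous replication round.
    """
    # Phase 1: each participant's prepare vote is made durable before it counts.
    for node, vote in votes.items():
        replicate(f"prepare:{node}:{vote}")

    # Safety: commit only if every participant voted yes.
    decision = "commit" if all(v == "yes" for v in votes.values()) else "abort"

    # Phase 2: the coordinator's decision is made durable as well.
    replicate(f"decision:{decision}")
    return decision


replicated = []
outcome = two_phase_commit({"A": "yes", "B": "yes"}, replicated.append)
print(outcome)          # commit
print(len(replicated))  # 3 replication rounds: two votes plus the decision
```

All of these rounds happen while the transaction is still in progress, so conflicting transactions wait for them; that is the throughput cost the slide points to.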

SLIDE 15

EXAMPLES: Cassandra

[Figure: FIT diagram with the labels Isolation, Throughput, Fairness]

  • “batch transaction”: UPDATE SET DELETE
  • allows clients to see partial results
  • gives up isolation
  • no coordination required for conflicting “transactions”
  • good throughput and good fairness

SLIDE 16

EXAMPLES: RAMP

[Figure: FIT diagram with the labels Isolation, Throughput, Fairness]

  • Read Atomic: either all or none of a transaction's updates are visible
  • Implemented by Read Atomic Multi-Partition (RAMP) transactions
  • guarantees synchronization independence
  • weak isolation
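
Read Atomic visibility can be sketched with versioned records that remember which other keys were written by the same transaction. This is a heavily simplified illustration, loosely modeled on the two-round read used by RAMP. It folds all partitions into one hypothetical `Store` class and is not the authors' implementation.

```python
class Store:
    """Toy multi-version store; all partitions are folded into one object."""

    def __init__(self, initial):
        # versions[key][ts] = (value, keys written by the same transaction)
        self.versions = {k: {0: (v, frozenset())} for k, v in initial.items()}
        self.last_commit = {k: 0 for k in initial}   # newest committed ts per key

    def prepare(self, ts, writes):
        """Install every write of a transaction, not yet visible to readers."""
        for key, value in writes.items():
            siblings = frozenset(writes) - {key}
            self.versions[key][ts] = (value, siblings)

    def commit(self, ts, key):
        """Make the prepared version of one key visible."""
        self.last_commit[key] = max(self.last_commit[key], ts)

    def read_naive(self, keys):
        """Read the newest committed version of each key independently."""
        return {k: self.versions[k][self.last_commit[k]][0] for k in keys}

    def read_atomic(self, keys):
        """Two-round read that never blocks and never returns a partial write."""
        first = {k: self.last_commit[k] for k in keys}            # round 1
        required = dict(first)
        for k, ts in first.items():
            _, siblings = self.versions[k][ts]
            for s in siblings:                # a sibling of k must be at least at ts
                if s in required and ts > required[s]:
                    required[s] = ts
        return {k: self.versions[k][required[k]][0] for k in keys}  # round 2


store = Store({"x": 0, "y": 0})
store.prepare(1, {"x": 1, "y": 1})   # one transaction writes both x and y
store.commit(1, "x")                 # its commit has reached x but not yet y

print(store.read_naive(["x", "y"]))   # {'x': 1, 'y': 0}: partial update visible
print(store.read_atomic(["x", "y"]))  # {'x': 1, 'y': 1}: all of the update
```

The reader repairs the partial view on its own, using the prepared version of `y`, so neither the reader nor the writer blocks or aborts the other.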

SLIDE 17

FIT IN EXAMPLES

[Figure: G-Store, Calvin, Spanner, Cassandra, and RAMP positioned on the Fairness / Isolation / Throughput diagram]

SLIDE 18

FIT IN MULTICORE DATABASES

[Figure: FIT diagram with the labels Isolation, Throughput, Fairness]

  • Silo: multicore database, serializable
  • Trades off fairness to gain throughput
      • logs are appended to a shared in-memory buffer
      • appending is expensive due to synchronization cost
      • each core stores its logs in a core-local buffer
      • logs are periodically moved from the local buffers to the shared buffer
  • Amortizes synchronization cost over a batch of transactions -> unfairness

SLIDE 19

FIT IN MULTICORE DATABASES

[Figure: FIT diagram with the labels Isolation, Throughput, Fairness]

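  • Doppel: multicore database, serializable
  • Two phases, joined and split, with an aggregation step between them
  • Joined phase: only one copy of each record exists; all transactions are allowed
  • Split phase: records are replicated; only commuting operations are allowed, so other operations wait for the next joined phase -> unfairness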
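
The joined/split idea can be sketched with a counter that is replicated per core during the split phase. `SplitCounter` and its methods are hypothetical names. The sketch treats increments as the only commuting operation and defers reads, which is an assumption made for illustration.

```python
class SplitCounter:
    """Toy record that alternates between a joined phase and a split phase."""

    def __init__(self, value, num_cores):
        self.value = value                # the single copy used in the joined phase
        self.slices = [0] * num_cores     # per-core replicas used in the split phase
        self.phase = "joined"
        self.stashed = []                 # cores whose reads were deferred

    def start_split(self):
        self.phase = "split"

    def increment(self, core, amount):
        if self.phase == "split":
            self.slices[core] += amount   # commuting: touches core-local state only
        else:
            self.value += amount

    def read(self, core):
        if self.phase == "split":
            self.stashed.append(core)     # non-commuting: waits for the joined phase
            return None
        return self.value

    def reconcile(self):
        """Aggregate the per-core slices and return to the joined phase."""
        self.value += sum(self.slices)
        self.slices = [0] * len(self.slices)
        self.phase = "joined"
        results = [(core, self.value) for core in self.stashed]
        self.stashed.clear()
        return results                    # deferred reads are answered only now


counter = SplitCounter(value=10, num_cores=2)
counter.start_split()
counter.increment(core=0, amount=5)
counter.increment(core=1, amount=7)
print(counter.read(core=0))   # None: the read is deferred during the split phase
print(counter.reconcile())    # [(0, 22)]
```

The increments proceed without cross-core synchronization, while the read is delayed until aggregation. That delay is the fairness given up for throughput.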

SLIDE 20

Coordination is a price

FIT TRADEOFF

  • Pay it during the transaction + strong isolation ==> poor throughput
  • Pay it before the transaction + strong isolation ==> unfairness
  • Give up isolation (reduces the impact of coordination) ==> fairness & throughput
