High Performance Transactions via Early Write Visibility Jose - - PowerPoint PPT Presentation
High Performance Transactions via Early Write Visibility Jose - - PowerPoint PPT Presentation
High Performance Transactions via Early Write Visibility Jose Faleiro Daniel Abadi Joseph Hellerstein Serializability is our gold standard Developers focus on individual transaction correctness System ensures correctness under concurrency
Serializability is our gold standard
Developers focus on individual transaction correctness System ensures correctness under concurrency Developers can focus on application logic
Elephant in the room: Serializability is the exception Weak isolation is the norm
Serializability in practice…
is snapshot isolation is not the default
Non-modular applications: changing anything changes everything
Non-modular applications: changing anything changes everything Silent data corruption
Silent data corruption Non-modular applications: changing anything changes everything Security bugs
“The hacker discovered that if you place several withdrawals all in practically the same instant, they will get processed at more or less the same time. This will result in a negative balance, but valid insertions into the database… ’’
(Second!) Elephant in the room: Very little progress towards addressing the gap
The real hurdle is recoverability mechanism
Recoverability + isolation
Strong isolation mechanisms have limited mileage Due to: Recoverability mechanisms Isolation level specifications
Recoverability + isolation
Limitation is independent of isolation level implementation
Recoverability + isolation
Limitation is independent of isolation level implementation
Includes all modern concurrency control protocols based on 2PL, OCC, MVCC, Timestamp ordering
This talk
State-of-the-art recoverability mechanisms fundamentally limit strong isolation levels New recoverability mechanism based on deterministic execution
Recoverability
Committed transactions must read committed data Required of popular isolation levels: Read committed, snapshot isolation, repeatable read, serializable
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Recoverability mechanisms
Purchase(item_id, cust_id): item = item = items_tbl items_tbl[item_id item_id] if if item.count item.count == 0: == 0: Abort() Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Recoverability mechanisms
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count item.count -= 1 = 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Recoverability mechanisms
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert bills.insert(cust_id cust_id, , item_id item_id, , item.price item.price) history.insert history.insert(cust_id cust_id, , item_id item_id)
Recoverability mechanisms
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort()
Recoverability mechanisms
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1
Recoverability mechanisms
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort()
Recoverability mechanisms
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort()
Recoverability mechanisms
Recoverability mechanisms
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Strawmen: Wait until commit Limited throughput Expose writes immediately Cascaded rollbacks
Recoverability mechanisms
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
State-of-the-art: Group commit Wait until end of execution, not commit Readers “share fate” with writers Durable write latency does not limit throughput Cascaded rollbacks restricted to failures
State-of-the-art: Group commit
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Writes made visible at the end of txn’s execution
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Write visibility delay
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Serializability: Conflicting txns must wait Read Committed: Conflicting txns can read old values Write visibility delay
Serializable
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id) Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id) Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Read committed
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id) Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id) Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Item
Read
Impact of delayed write visibility
Impact of delayed write visibility
Workload: 10 read-modify-write txns 1 hot, 9 cold Vary position of hot update
Impact of delayed write visibility
Cold update Cold update . . . Cold update Cold update Hot update Hot update Workload: 10 read-modify-write txns 1 hot, 9 cold Vary position of hot update
Impact of delayed write visibility
Cold update Cold update . . . Cold update Cold update Hot update Hot update Workload: 10 read-modify-write txns 1 hot, 9 cold Vary position of hot update Write visibility delay: 0
Impact of delayed write visibility
Cold update Cold update . . . Hot update Hot update Cold update Cold update Workload: 10 read-modify-write txns 1 hot, 9 cold Vary position of hot update Write visibility delay: 2
Impact of delayed write visibility
Hot Hot update update Cold update Cold update . . . Cold update Cold update Workload: 10 read-modify-write txns 1 hot, 9 cold Vary position of hot update Write visibility delay: 9
Impact of delayed write visibility
0 K 100 K 200 K 300 K 400 K 1 2 3 4 5 6 7 8 9 Throughput (txns/sec) Write visibility delay
Read Committed Serializable
Impact of delayed write visibility 30% drop in Read committed vs 3x drop in Serializable
Metaphor credit: Bill Thies
Evolution of transaction processing
Concurrency control Recoverability
Metaphor credit: Bill Thies
Evolution of transaction processing
Concurrency control Recoverability
Why delayed write visibility?
Database systems have the flexibility to arbitrarily abort transactions
Why delayed write visibility?
Abort statements Constraint violations Deadlocks Failures Validation errors Resource constraints
Why delayed write visibility?
Abort statements Constraint violations Deadlocks Failures Validation errors Resource constraints State induced System induced
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Write visibility delay
Why delayed write visibility
Abort statements Constraint violations Deadlocks Failures Validation errors Resource constraints State induced System induced
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Early write visibility
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Item’s count update is visible here
Early write visibility
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id) Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Early write visibility
Challenges
Delayed write visibility for serializability
Long duration write locks (e.g., 2PL) Write buffering (e.g., OCC, MVCC, T/O)
Intrinsic system induced aborts
Dynamic locking can deadlock Abort on serialization errors (OCC, MVCC, T/O)
Early write visibility is incompatible with existing isolation mechanisms
Deterministic execution
Deterministic execution
T0 T1 T2 T3 T4
Deterministic execution
T0 T1 T2 T3 T4 T0 T1 T2 T4 T3
Determine legal schedule
Deterministic execution
T0 T1 T2 T3 T4 T0 T1 T2 T4 T3
Determine legal schedule Execute
Deterministic execution
T0 T1 T2 T3 T4 T0 T1 T2 T4 T3
Determine legal schedule Execute
Serializable “by definition” No system induced aborts
Piece wise visibility
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Item Bills Hist
Piece wise visibility
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Item Bills Hist
Data dependencies
Piece wise visibility
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Item Bills Hist
Data dependencies Commit dependencies
Abortable Non-abortable Non-abortable
Piece wise visibility
Item1 Bills Hist
Data dependencies Commit dependencies
Item0
Piece wise visibility
Item1 Bills Hist
Data dependencies Commit dependencies
Item0
RVP RVPs (rendezvous points) implement a lightweight commit protocol
Piece wise visibility
Item1 Bills Hist
Data dependencies Commit dependencies
Item0
C=2 RVPs (rendezvous points) implement a lightweight commit protocol
Piece wise visibility
Item1 Bills Hist
Data dependencies Commit dependencies
Item0
C=1 RVPs (rendezvous points) implement a lightweight commit protocol
Piece wise visibility
Item1 Bills Hist
Data dependencies Commit dependencies
Item0
C=0 RVPs (rendezvous points) implement a lightweight commit protocol
Piece wise visibility
Item1 Bills Hist
Data dependencies Commit dependencies
Item0
C=-1 RVPs (rendezvous points) implement a lightweight commit protocol
Piece wise visibility
Item1 Bills Hist
Data dependencies Commit dependencies
Item0
RVP RVPs (rendezvous points) implement a lightweight commit protocol Abortable writes visible after commit decision Non-abortable writes visible after they execute
Executing pieces
T0 T1 T2 T3 T4 T0 T1 T2 T4 T3
Determine legal schedule Execute
Replace transactions with pieces Use Piece Wise Visibility rules
Conflict information
PWV can use coarse-grained conflict information
E.g., partitions, foreign-keys