Don’t Give Up on Serializability Just Yet Neha Narula

Don’t Give Up on Serializability Just Yet
A journey into serializable systems
Neha Narula, MIT CSAIL
GOTO Chicago, May 2015

@neha
• PhD candidate at MIT
• Formerly at Google
• Research in fast transactions for multi-core databases and distributed systems

“However, the most important person in my gang will be a systems programmer. A person who can debug a device driver or a distributed system is a person who can be trusted in a Hobbesian nightmare of breathtaking scope; a systems programmer has seen the terrors of the world and understood the intrinsic horror of existence.”

A journey into serializable systems

• 1M messages/sec
• 1/5 of all page views in the US
• 1M messages/sec from mobile devices

Databases are difficult to scale
• Application servers are stateless; add more for more traffic
• Database is stateful

Distributed databases
Partition data on multiple servers for more performance

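Example partitioned database
[Diagram: webservers send queries for the widgets table to one of several database servers. The table is partitioned by widget_id into the ranges 0-99, 100-199, and 200-299, so the webserver must decide which database ("?") holds a given row.]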
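The routing question in the diagram can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the server names (`db0`, `db1`, `db2`) and the helper `server_for` are hypothetical, and only the three widget_id ranges come from the slide.

```python
from bisect import bisect_right

# Hypothetical partition map. Only the ranges 0-99, 100-199, 200-299 come
# from the slide; the server names are made up for illustration.
RANGE_STARTS = [0, 100, 200]
SERVERS = ["db0", "db1", "db2"]
MAX_WIDGET_ID = 299


def server_for(widget_id: int) -> str:
    """Return the database server that owns the range containing widget_id."""
    if not 0 <= widget_id <= MAX_WIDGET_ID:
        raise KeyError(f"no partition holds widget_id {widget_id}")
    # bisect_right finds the first range start greater than widget_id;
    # the range just before it is the owner.
    return SERVERS[bisect_right(RANGE_STARTS, widget_id) - 1]


for wid in (0, 99, 100, 250):
    print(wid, "->", server_for(wid))
```

A lookup keyed by widget_id touches one server. A query on any other column would have to be sent to every partition, and a transaction that updates rows in two ranges has to coordinate two servers, which is where the rest of the talk picks up.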

2007
• MapReduce
• Google File System
• Bigtable

Pros/Cons

Pros
• In-memory
• HIGHLY scalable
• Transparently fault tolerant
• Geo replication

Cons
• No schema
• Require complex key/row/document design
• No query language
• No indexes
• No transactions
• No guarantees

mysql> BEGIN TRANSACTION
       UPDATE …
       COMMIT

Problem with dropping transactions
• Difficult to reason about concurrent interleavings
• Might result in incorrect, unrecoverable state

“The hacker discovered that multiple simultaneous withdrawals are processed essentially at the same time and that the system's software doesn't check quickly enough for a negative balance”
http://arstechnica.com/security/2014/03/yet-another-exchange-hacked-poloniex-loses-around-50000-in-bitcoin/
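The bug described in the quote is a check-then-act race. The sketch below is not the exchange's code. It is a hypothetical `Account` class that replays the bad interleaving by hand (both requests check the balance before either debits) and then shows the same two withdrawals when the check and the debit happen as one isolated step.

```python
import threading


class Account:
    """Hypothetical account used only to illustrate the race."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    # Unsafe: the check and the debit are two separate steps.
    def check(self, amount):
        return self.balance >= amount

    def debit(self, amount):
        self.balance -= amount

    # Safe: check and debit form one critical section.
    def withdraw_atomic(self, amount):
        with self._lock:
            if self.balance >= amount:
                self.balance -= amount
                return True
            return False


def interleaved_withdrawals(account, amount):
    """Replay the bad schedule: both requests check, then both debit."""
    ok1 = account.check(amount)
    ok2 = account.check(amount)  # still sees the old balance
    if ok1:
        account.debit(amount)
    if ok2:
        account.debit(amount)
    return account.balance


unsafe = Account(100)
unsafe_balance = interleaved_withdrawals(unsafe, 80)

safe = Account(100)
results = []
threads = [
    threading.Thread(target=lambda: results.append(safe.withdraw_atomic(80)))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("unsafe balance:", unsafe_balance)
print("safe balance:", safe.balance, "results:", sorted(results))
```

With the lock, exactly one withdrawal succeeds no matter how the threads are scheduled. A serializable transaction gives the same guarantee without the application having to write the lock itself.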

Consistency guarantees help us reason about our code and avoid subtle bugs

Consistency A very misused word in systems! • C as in ACID • C as in CAP • C as in sequential, causal, eventual, strict consistency

ACID Transactions
• Atomic: Whole thing happens or not
• Consistent: Application-defined correctness
• Isolated: Other transactions do not interfere
• Durable: Can recover correctly from a crash

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
...
COMMIT
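As a small illustration of the "whole thing happens or not" property, here is a sketch using Python's built-in `sqlite3` module. SQLite stands in for the database on the slide, and the `accounts` table, the `transfer` helper, and the CHECK constraint are assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"  # application-defined correctness
)
conn.execute("INSERT INTO accounts VALUES ('alice', 50), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


failed = transfer(conn, "alice", "bob", 80)   # would overdraw alice
after_failed = balances(conn)
succeeded = transfer(conn, "alice", "bob", 30)
after_success = balances(conn)
print(failed, after_failed)
print(succeeded, after_success)
```

The credit to bob is deliberately issued first, so the failed transfer shows the rollback undoing an update that had already been applied inside the transaction.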

What is Serializability?
Serializability != Serial

What is Serializability?
The result of executing a set of transactions is the same as if those transactions had executed one at a time, in some serial order.
If each transaction preserves correctness, the DB will be in a correct state.
We can pretend like there’s no concurrency!

Database transactions should be serializable

Initially k=0, j=0

    TXN1(k, j Key) (Value, Value) {
        a := GET(k)
        b := GET(j)
        return a, b
    }

    TXN2(k, j Key) {
        ADD(k, 1)
        ADD(j, 1)
    }

To the programmer, the transactions appear to run in one of two orders in time: TXN1 then TXN2, or TXN2 then TXN1.

Valid return values for TXN1: (0,0) or (1,1)
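To see why only (0,0) and (1,1) are valid, the sketch below enumerates every interleaving of the two transactions' operations and compares each outcome with the two serial orders. The operation encoding and helper names are assumptions made for the example; the transactions themselves are the ones on the slide.

```python
from itertools import combinations

# The two transactions from the slide, as lists of (operation, key).
TXN1 = [("GET", "k"), ("GET", "j")]
TXN2 = [("ADD", "k"), ("ADD", "j")]


def run(schedule):
    """Execute a schedule on a fresh database; return what TXN1 read."""
    db = {"k": 0, "j": 0}
    reads = []
    for op, key in schedule:
        if op == "GET":
            reads.append(db[key])
        else:  # ADD(key, 1)
            db[key] += 1
    return tuple(reads)


def interleavings(a, b):
    """Yield every merge of a and b that keeps each list's internal order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ai, bi = iter(a), iter(b)
        yield [next(ai) if i in slots else next(bi) for i in range(n)]


serial = {run(TXN1 + TXN2), run(TXN2 + TXN1)}
all_results = {run(s) for s in interleavings(TXN1, TXN2)}
anomalies = all_results - serial

print("serial outcomes:   ", sorted(serial))
print("all interleavings: ", sorted(all_results))
print("non-serializable:  ", sorted(anomalies))
```

Two of the six interleavings let TXN1 observe only half of TXN2's updates, returning (0,1) or (1,0). A serializable database has to prevent exactly those schedules, or make them impossible to observe.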

Benefits of Serializability
• Do not have to reason about interleavings
• Do not have to express invariants separately from the code!

Serializability Costs
• On a multi-core database, serialization and cache line transfers
• On a distributed database, serialization and network calls
Concurrency control: Locking and coordination

Eventual consistency If no new updates are made to the object, eventually all accesses will return the last updated value.

Eventual consistency (revised)
If no new updates are made to the object, eventually all accesses will return the same value (replacing "the last updated value").
• What is last, really?
• And when do we stop writing?
• And what about multi-key transactions?

Sequential consistency: cache coherence
[Diagram: processors P1, P2, and P3 sharing RAM]

    P1: W(x)a
    P2:          W(x)b
    P3:                     R(x)a               R(x)b
    -------------------------------------------------> time

    P1: W(x)a
    P2:                               W(x)b
    P3:                     R(x)a               R(x)b
    -------------------------------------------------> time

    P1: W(x)a
    P2:          W(x)b
    P3:                     R(x)b               R(x)a
    -------------------------------------------------> time

    P1:                               W(x)a
    P2:          W(x)b
    P3:                     R(x)b               R(x)a
    -------------------------------------------------> time

External Consistency
• Everything that sequential consistency has
• Except that results actually match time, as seen by an external observer

Not Externally Consistent

    P1: W(x)a
    P2:          W(x)b
    P3:                     R(x)b               R(x)a
    -------------------------------------------------> time

[Speech bubbles from the observer: "The value of x is b!" and "Then I read x=a?"]
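The difference between the two definitions can be checked mechanically for small histories. The brute-force sketch below makes two assumptions that are not on the slides: each operation is treated as instantaneous, and the timestamps are made-up positions along the time axis. A history is sequentially consistent if some total order respects each process's program order and every read returns the latest preceding write. It is externally consistent if the real-time order itself has that property.

```python
from itertools import permutations

# Each operation: (process, kind, value, time). Times are hypothetical
# positions on the slide's time axis; all operations are on variable x.
NOT_EXTERNAL = [
    ("P1", "W", "a", 1),
    ("P2", "W", "b", 2),
    ("P3", "R", "b", 3),
    ("P3", "R", "a", 4),
]
# Same reads, but W(x)a now happens between the two reads.
EXTERNAL = [
    ("P2", "W", "b", 1),
    ("P3", "R", "b", 2),
    ("P1", "W", "a", 3),
    ("P3", "R", "a", 4),
]


def legal(order):
    """Every read returns the value of the most recent preceding write."""
    current = None
    for _, kind, value, _ in order:
        if kind == "W":
            current = value
        elif value != current:
            return False
    return True


def respects_program_order(order):
    """Operations of the same process appear in their original order."""
    last_time = {}
    for proc, _, _, t in order:
        if last_time.get(proc, float("-inf")) > t:
            return False
        last_time[proc] = t
    return True


def sequentially_consistent(history):
    return any(
        legal(p) and respects_program_order(p) for p in permutations(history)
    )


def externally_consistent(history):
    return legal(sorted(history, key=lambda op: op[3]))


for name, h in (("NOT_EXTERNAL", NOT_EXTERNAL), ("EXTERNAL", EXTERNAL)):
    print(name, sequentially_consistent(h), externally_consistent(h))
```

The first history is sequentially consistent because the order W(x)b, R(x)b, W(x)a, R(x)a explains what P3 saw, but that order contradicts the clock, which is what the observer in the slide is objecting to.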

CAP Theorem
• Brewer’s PODC talk: “Consistency, Availability, Partition-tolerance: choose two” in 2000
  – Partition-tolerance is a failure model
  – Choice: can you process reads and writes during a partition or not?
• FLP result
  – “Impossibility of Distributed Consensus with One Faulty Process” in 1985
  – Asynchronous model; cannot tell the difference between message delay and failure

What does this mean?
It’s impossible to decide anything on the internet?

NP-hard

What does CAP mean?
• It’s impossible to decide everything on the internet 100% of the time if we can’t rely on synchronous messaging
• We can decide everything 100% of the time if partitions heal (we know the upper bound on message delays)
• We can still play Candy Crush

CAP: Consistency vs. Performance
Consistency (like serializability) requires communication and blocking
How do we reduce these costs while:
• Producing a correct ordering of reads and writes, and
• Handling failures and (eventually) making progress?

Improving Serializability Performance
Technique: Systems
• Atomic clocks to bound time skew: Spanner
• Transaction chopping: Lynx, ROCOCO
• Commutative locking: Escrow transactions, abstract data types, Doppel
• Deterministic ordering: Granola, Calvin

Goal: parallel performance
• Different concurrency control schemes for popular, contended data
• Commutative locking
• Abstract datatypes
• Per-core (or per-server) data and constraints

Operation Model
Developers write transactions as stored procedures which are composed of operations on keys and values:

Traditional key/value operations
• value GET(k): replicate for reads
• void PUT(k, v): save last write

Operations on numeric values which modify the existing value (replicate for commutative operations)
• void INCR(k, n)
• void MAX(k, n)
• void MULT(k, n)

Ordered PUT, insert to an ordered list, user-defined functions (log operations)
• void OPUT(k, v, o)
• void TOPK_INSERT(k, v, o)
• void UDF(k, v, a)
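The reason commutative operations such as INCR, MAX, and MULT can be replicated is that their partial results can be merged in any order. The sketch below illustrates that idea only. It is not the implementation of any system named in the talk, and the `PerCoreValue` class and its method names are hypothetical.

```python
from functools import reduce

# How to combine two partial results for each commutative operation,
# and the value that leaves a result unchanged (the identity).
MERGE = {
    "INCR": lambda a, b: a + b,
    "MAX": max,
    "MULT": lambda a, b: a * b,
}
IDENTITY = {"INCR": 0, "MAX": float("-inf"), "MULT": 1}


class PerCoreValue:
    """One key whose updates are buffered in a separate slice per core."""

    def __init__(self, op, initial, n_cores):
        self.op = op
        self.global_value = initial
        self.slices = [IDENTITY[op]] * n_cores

    def apply(self, core, n):
        # Touches only this core's slice, so cores never contend.
        self.slices[core] = MERGE[self.op](self.slices[core], n)

    def reconcile(self):
        # Fold every slice into the global value, then reset the slices.
        self.global_value = reduce(
            MERGE[self.op], self.slices, self.global_value
        )
        self.slices = [IDENTITY[self.op]] * len(self.slices)
        return self.global_value


counter = PerCoreValue("INCR", 10, n_cores=4)
counter.apply(0, 1)
counter.apply(3, 5)
counter.apply(0, 2)
total = counter.reconcile()

high = PerCoreValue("MAX", 7, n_cores=4)
high.apply(1, 3)
high.apply(2, 9)
highest = high.reconcile()

print(total, highest)
```

Because the merge function is commutative and associative, the reconciled value does not depend on which core handled which update. A plain PUT has no such merge, which is why the slide treats it separately.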

Spanner/F1 “We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions.”

Takeaways
• Use well-tested, long-lived database systems
• Use SERIALIZABLE until it becomes a performance problem
• Think about what is changing when you move to systems with different models

Thanks!
narula@mit.edu
http://nehanaru.la
@neha
The Stata Center via emax: http://hip.cat/emax/

Questions? Please remember to evaluate via the GOTO Guide App
