Granola: Low-Overhead Distributed Transaction Coordination. James Cowling and Barbara Liskov, MIT CSAIL
Granola: An infrastructure for building distributed storage applications. A distributed transaction coordination protocol that provides strong consistency without locking.
Distributed Storage [Figure: Client 1 and Client 2 accessing Repository 1, Repository 2, Repository 3, …]
Why Transactions? Atomic operations allow users and developers to ignore concurrency. Distributed atomic operations allow data to span multiple repositories and avoid inconsistency between repositories.
Distributed transactions are hard: Tension between consistency and performance
Opting for Consistency: Two-phase commit with strict two-phase locking (distributed databases, Sinfonia, etc.). High transaction cost: multiple message delays, forced log writes, locking/logging overhead (≈30-40%*). * OLTP Through the Looking Glass, and What We Found There, Harizopoulos et al., SIGMOD '08
Opting for Performance: No distributed transactions (SimpleDB, Bigtable, CRAQ, etc.). Weak consistency models (Dynamo, Cassandra, MongoDB, HBase, PNUTS, etc.). Both place the burden of consistency on the application developer.
Where we come in… Strong Consistency and High Performance (for a large class of transactions)
Introduce a new transaction class which lets us provide consistency without locking
Outline: Motivation, Transaction Model, Protocol, Evaluation
One-Round Transactions: Expressed in one round of communication between clients and repositories. Execute to completion at each participant.
General Operations: Transactions are uninterpreted by Granola and can execute arbitrary operations.
Transaction Classes: Single-Repository: execute entirely on one repository. Distributed Transactions: • Coordinated • Independent
Coordinated Transactions: Commit only if all participants vote to commit. Example: • Transfer $50 between accounts
Independent Transactions: Transactions where all participants will make the same commit/abort decision. Examples: • Add 1% interest to each bank balance. • Compute total amount of money in the bank.
Independent Transactions: Evidence these are common in OLTP workloads: • Any read-only distributed transaction • Transactions that always commit • Atomic updates to replicated data • Where the commit decision is a deterministic function of shared state
Example: TPC-C. The TPC-C benchmark can be expressed entirely using single-repository and independent transactions. E.g., the new_order transaction only aborts on an invalid item number, which can be computed locally if we replicate the Item table.
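The new_order case above can be sketched as follows. This is a minimal illustration, assuming every participant holds a replicated copy of the Item table; the function names and the table contents are hypothetical, not part of Granola or TPC-C.

```python
# Hypothetical sketch: names and data are illustrative, not Granola's API.
ITEM_TABLE = {1: "widget", 2: "gadget"}  # assumed replicated at every repository


def new_order_decision(item_ids, item_table):
    """Return 'commit' if every item number is valid, else 'abort'."""
    return "commit" if all(i in item_table for i in item_ids) else "abort"


def run_at_participants(item_ids, replicas):
    """Each participant decides locally; no commit votes are exchanged."""
    return [new_order_decision(item_ids, table) for table in replicas]


# Three participants, each with its own copy of the Item table.
replicas = [dict(ITEM_TABLE) for _ in range(3)]
valid = run_at_participants([1, 2], replicas)
invalid = run_at_participants([1, 99], replicas)
```

Because the decision is a deterministic function of replicated state, all participants reach the same commit/abort outcome independently.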
Outline: Motivation, Transaction Model, Protocol, Evaluation
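[Figure: architecture. Client: the Client Application sends invoke to the Granola Client Library and receives a result. Repository: the Granola Repository Library sends run to the Server Application and receives a result. The Granola libraries handle transaction coordination.]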
Replication [Figure: a Repository implemented as a Primary and Backups]
Repository Modes: Primarily in Timestamp Mode • Single-repository, Independent. Occasionally in Locking Mode • When coordinated transactions are required
Timestamps: Each transaction is assigned a timestamp. Each repository executes transactions in timestamp order. Timestamps define a global serial order.
Key Questions: How do we assign timestamps in a scalable, fault-tolerant way? How do we make sure we always execute in timestamp order?
Single-Repository Protocol: Clients present the highest timestamp they have observed. The repository chooses a timestamp higher than the client timestamp, any previous transaction, and its clock value. The repository executes in timestamp order and sends the response and timestamp to the client.
Assign timestamp: choose a timestamp higher than previous transactions. Log transaction: run the replication protocol to record the timestamp. Run: execute in timestamp order; send the result and timestamp to the client.
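The assign, log, run steps of the single-repository protocol can be sketched as follows. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, the clock is an injected callable, and appending to a local list stands in for the replication protocol.

```python
class Repository:
    """Hypothetical single-repository sketch; not Granola's actual API."""

    def __init__(self, clock):
        self.clock = clock   # callable returning the local clock value
        self.last_ts = 0     # timestamp of the most recent transaction
        self.log = []        # stand-in for the replicated log
        self.state = {}      # application state

    def assign(self, client_ts):
        # Strictly higher than the client timestamp, any previous
        # transaction, and the local clock value.
        ts = max(client_ts, self.last_ts, self.clock()) + 1
        self.last_ts = ts
        return ts

    def handle(self, txn, client_ts):
        ts = self.assign(client_ts)
        self.log.append((ts, txn.__name__))  # assumed: replication happens here
        # Transactions are handled one at a time with increasing timestamps,
        # so running in arrival order is also timestamp order in this sketch.
        result = txn(self.state)
        return result, ts


def deposit(state):
    state["balance"] = state.get("balance", 0) + 50
    return state["balance"]


repo = Repository(clock=lambda: 7)
r1, ts1 = repo.handle(deposit, client_ts=0)
r2, ts2 = repo.handle(deposit, client_ts=20)
```

The second call shows the client-presented timestamp (20) dominating the clock and the previous transaction.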
Independent Protocol: Clients present the highest timestamp they observed. Each repository chooses a proposed timestamp higher than its clock value and previous transactions. Repositories vote to determine the highest timestamp. Each repository executes in timestamp order and sends the timestamp to the client.
Propose timestamp: choose a proposed timestamp higher than previous transactions. Log transaction: run the replication protocol to record the proposed timestamp. Vote: send the proposed timestamp to the other participants. Pick final timestamp: the highest timestamp from among the votes. Run: execute in timestamp order; send the result and final timestamp to the client.
Timestamp Constraint: A repository won't execute a transaction until it has the lowest timestamp of all concurrent transactions. This guarantees a global serial execution order.
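The propose, vote, pick-final steps can be sketched as follows. This is a minimal illustration, assuming synchronous vote exchange and fixed clock values; the class, its methods, and the clock values (chosen so the proposals come out as 9 and 3) are hypothetical.

```python
class Participant:
    """Hypothetical participant in an independent transaction."""

    def __init__(self, name, clock_value):
        self.name = name
        self.clock_value = clock_value  # assumed fixed for the sketch
        self.last_ts = 0
        self.log = []                   # stand-in for the replicated log

    def propose(self, txn_id, client_ts):
        # Proposed timestamp is higher than the client timestamp,
        # previous transactions, and the local clock value.
        proposed = max(client_ts, self.last_ts, self.clock_value) + 1
        self.last_ts = proposed
        self.log.append((txn_id, proposed))
        return proposed

    def finalize(self, txn_id, votes):
        # Final timestamp is the highest among all votes.
        final = max(votes)
        self.last_ts = max(self.last_ts, final)
        return final


def run_independent(txn_id, participants, client_ts):
    votes = [p.propose(txn_id, client_ts) for p in participants]
    return [p.finalize(txn_id, votes) for p in participants]


a = Participant("A", clock_value=8)
b = Participant("B", clock_value=2)
finals = run_independent("T1", [a, b], client_ts=0)
```

Both participants arrive at the same final timestamp from the same set of votes, without any commit/abort vote.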
Example 1: Clients Alice and Bob; Repository 1 and Repository 2. Queues and histories start empty.
Step 1: Independent transaction T1 is sent to both repositories.
Step 2: Each repository queues T1 with its own proposed timestamp: one proposes 9, the other proposes 3.
Step 3: The repositories exchange votes, Vote T1 [9] and Vote T1 [3]. Vote T1 [9] is delayed in transit.
Step 4: The repository that proposed 9 receives Vote T1 [3], moves T1 [final ts: 9] to its history, and replies with T1 [final ts: 9]. The other repository still has T1 [prop. ts: 3] queued.
Step 5: Transaction T2 arrives at the repository still waiting on T1 and is assigned timestamp 5. Queue: T1 [prop. ts: 3], T2 [ts: 5]. T2 must wait, because T1 has the lower timestamp.
Step 6: Vote T1 [9] arrives. The queue is reordered: T2 [ts: 5], T1 [final ts: 9].
Step 7: T2 runs and its reply carries T2 [ts: 5]. Then T1 runs and its reply carries T1 [final ts: 9].
Step 8: Final histories: T2 [ts: 5], T1 [final ts: 9] at one repository; T1 [final ts: 9] at the other. Global serial order: T2 -> T1
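The queue behaviour in Example 1 can be sketched as follows. This is a simplified illustration of the timestamp constraint at the repository that proposed 3, assuming ties between timestamps do not occur; the class and method names are hypothetical.

```python
class TimestampQueue:
    """Hypothetical sketch of a repository queue under the timestamp constraint."""

    def __init__(self):
        self.pending = {}   # txn_id -> [timestamp, is_final]
        self.history = []   # executed transactions as (txn_id, timestamp)

    def add(self, txn_id, ts, final):
        self.pending[txn_id] = [ts, final]
        self.drain()

    def finalize(self, txn_id, final_ts):
        self.pending[txn_id] = [final_ts, True]
        self.drain()

    def drain(self):
        # Run a transaction only when it has the lowest timestamp of all
        # pending transactions and that timestamp is final. A proposed
        # timestamp can only grow, so a final minimum is safe to execute.
        while self.pending:
            txn_id, (ts, final) = min(self.pending.items(), key=lambda kv: kv[1][0])
            if not final:
                break
            del self.pending[txn_id]
            self.history.append((txn_id, ts))


q = TimestampQueue()
q.add("T1", 3, final=False)   # independent transaction, vote still in flight
q.add("T2", 5, final=True)    # single-repository transaction
before_vote = list(q.history) # nothing can run yet
q.finalize("T1", 9)           # delayed vote arrives
```

After the vote arrives, T2 runs before T1, matching the global serial order T2 -> T1.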
Choosing timestamps: The client-provided timestamp guarantees that a transaction will be serialized after any transaction it observed.
Example 2: Clients Alice and Bob; Repository 1 and Repository 2. One repository has T1 [final ts: 9] in its history; the other still has T1 [prop. ts: 3] queued because Vote T1 [9] is delayed.
Step 1: Transaction T2 is sent to the repository that has already completed T1.
Step 2: T2 is assigned timestamp 10. History: T1 [final ts: 9], T2 [ts: 10]. The reply carries T2 [ts: 10].
Step 3: Transaction T3 is sent to the other repository, presenting latest ts: 10.
Step 4: T3 is assigned timestamp 11. Queue: T1 [prop. ts: 3], T3 [ts: 11].
Step 5: Vote T1 [9] arrives. Queue: T1 [final ts: 9], T3 [ts: 11].
Step 6: Final histories: T1 [final ts: 9], T3 [ts: 11] at one repository; T1 [final ts: 9], T2 [ts: 10] at the other. Global serial order: T1 -> T2 -> T3
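The role of the client-provided timestamp in Example 2 can be sketched as follows. This is a minimal illustration, assuming a client library that remembers the highest timestamp it has seen; the classes, the clock values, and the preset last_ts values are hypothetical and chosen to reproduce the timestamps 10 and 11.

```python
class Repo:
    """Hypothetical repository that only assigns timestamps."""

    def __init__(self, clock_value, last_ts=0):
        self.clock_value = clock_value
        self.last_ts = last_ts

    def execute(self, client_ts):
        ts = max(client_ts, self.last_ts, self.clock_value) + 1
        self.last_ts = ts
        return ts


class Client:
    """Hypothetical client library tracking the latest observed timestamp."""

    def __init__(self):
        self.latest_ts = 0

    def invoke(self, repo):
        ts = repo.execute(self.latest_ts)
        self.latest_ts = max(self.latest_ts, ts)
        return ts


done_t1 = Repo(clock_value=8, last_ts=9)     # already executed T1 at ts 9
waiting_t1 = Repo(clock_value=4, last_ts=3)  # still holds T1 [prop. ts: 3]

client = Client()
t2 = client.invoke(done_t1)      # T2
t3 = client.invoke(waiting_t1)   # T3, presented with latest ts 10

# Without the client timestamp, the same repository would pick a lower value.
no_hint = Repo(clock_value=4, last_ts=3).execute(0)
```

With the hint, T3 is ordered after T2; without it, the second repository would have assigned a timestamp below 10.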
Where are we now? Timestamp-based transaction coordination • Interoperability with coordinated transactions • Recovery from failures
Coordinated Transactions: Application determines commit/abort vote. Requires locking to ensure the vote isn't invalidated.
Protocol changes: Prepare phase, where the application acquires locks and determines its vote. Timestamp voting. Repository can commit transactions out of timestamp order.
Timestamp Constraint: Timestamps still match the serial order, even if execution happens out of timestamp order.
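The prepare phase for a coordinated transaction can be sketched as follows, using the $50 transfer example. This is a simplified illustration that omits timestamp voting and replication; the class, its methods, and the account data are hypothetical.

```python
class AccountRepo:
    """Hypothetical repository holding account balances."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.locked = {}   # account -> id of the transaction holding the lock

    def prepare(self, txn_id, account, delta):
        """Acquire the lock and vote; the lock keeps a commit vote valid."""
        if account in self.locked:
            return False                       # conflicting transaction
        if self.balances[account] + delta < 0:
            return False                       # application votes abort
        self.locked[account] = txn_id
        return True

    def finish(self, txn_id, account, delta, commit):
        # Only the lock holder may apply the update and release the lock.
        if self.locked.get(account) == txn_id:
            if commit:
                self.balances[account] += delta
            del self.locked[account]


def transfer(txn_id, src_repo, src, dst_repo, dst, amount):
    steps = [(src_repo, src, -amount), (dst_repo, dst, amount)]
    votes = [repo.prepare(txn_id, acct, delta) for repo, acct, delta in steps]
    commit = all(votes)   # commit only if all participants vote to commit
    for repo, acct, delta in steps:
        repo.finish(txn_id, acct, delta, commit)
    return commit


r1 = AccountRepo({"alice": 100})
r2 = AccountRepo({"bob": 10})
ok = transfer("T1", r1, "alice", r2, "bob", 50)
failed = transfer("T2", r1, "alice", r2, "bob", 80)
```

The second transfer aborts at both repositories because one participant votes abort, and the lock taken by the other participant is released without applying the update.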
[Figure: Repository 1 and Repository 2 in Locking Mode, Repository 3 in Timestamp Mode; Client 1 and Client 2 issue a coordinated transaction and an independent transaction.]