Commitment and Mutual Exclusion
CS 188 Distributed Systems
February 18, 2015
Lecture 11, CS 188, Winter 2015
Introduction

- Many distributed systems require that participants agree on something
  – On changes to important data
  – On the status of a computation
  – On what to do next
- Reaching agreement in a general distributed system is challenging
Commitment

- Reaching agreement in a distributed system is extremely important
- Usually impossible to control a system’s behavior without agreement
- One approach to agreement is to get all participants to prepare to agree
- Then, once prepared, to take the action
Challenges to Commitment

- There are challenges to ensuring that commitment occurs
- Different nodes’ actions aren’t synchronous
- Communication is only via messages
- Other actions can intervene
- Failures can occur
For Example,

- An optimistically replicated file system like Ficus
- We want to be able to add replicas of a volume
- Which is a lot easier to do if all nodes hosting existing replicas agree
The Scenario

[Diagram: replicas A, B, and C each hold version vector (3, 7, 5); a new node D announces “I want a replica, too!” and B creates replica D — but we need a version vector element for the new replica.]
So What’s the Problem?

- A and C don’t know about the new replica
  – But they can learn about it as soon as they contact B
- So why is there any difficulty?
One Problem

[Diagram: the replicas partition; each partition adds a new replica (D in one, E in the other), and each new replica takes the same new version vector slot. After independent updates, both partitions hold identical version vectors: different updates, same version vector.]
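The collision can be reproduced in a few lines. This is an illustrative sketch, not Ficus code; the helper name and the concrete vectors are made up to mirror the slide’s scenario of two partitions each adding a fourth replica.

```python
def new_replica_vector(existing):
    # Adding a replica appends one more vector element, initialized to 0.
    # Crucially, both partitions append to the SAME slot independently.
    return existing + [0]

# Partition {A, B}: B adds replica D, and D makes one update.
vec_d = new_replica_vector([3, 7, 5])   # [3, 7, 5, 0]
vec_d[3] += 1                           # D's update -> [3, 7, 5, 1]
data_d = "update made at D"

# Partition {C}: C adds replica E, and E makes one update.
vec_e = new_replica_vector([3, 7, 5])   # [3, 7, 5, 0]
vec_e[3] += 1                           # E's update -> [3, 7, 5, 1]
data_e = "update made at E"

# Same version vector, different updates: the conflict is undetectable.
print(vec_d == vec_e)    # True
print(data_d == data_e)  # False
```

Because the vectors are identical, version-vector comparison reports the replicas as in sync even though their contents diverged.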
And It Can Be a Lot Worse

- What if replicas are being added and dropped frequently?
- How will we keep track of which ones are live and which ones are which?
- It can get very confusing
But That’s Not What I Want To Do, Anyway

- A common answer from system designers
- They don’t care about the odd corner cases
- They don’t expect them to happen
- So why pay a lot to handle them right?
- Sometimes a reasonable answer . . .
Why You Should Care

- If you allow a system to behave a certain way
  – Even if you don’t think it ever will
- And your system is widely deployed and used
- Sooner or later that improbable thing will happen
- And who knows what happens next?
The Basic Solution

- Use a commitment protocol
- To ensure that all participating nodes understand what’s happening
- And agree to it
- Handles issues of concurrency and failures
Transactions

- A mechanism to achieve commitment
- By ensuring atomicity
  – Also consistency, isolation, and durability
- Very important in the database community
- A set of asynchronous request/reply communications
- Either all of the set complete or none do
Transactions and ACID Properties

- ACID: Atomicity, Consistency, Isolation, and Durability
- Atomicity: all happen or none do
- Consistency: outcome equivalent to some serial ordering of actions
- Isolation: partial results are invisible outside the transaction
- Durability: committed transactions survive crashes and other failures
Achieving the ACID Properties

- In a distributed environment, use the two-phase commit protocol
- A unanimous voting protocol
  – Do something only if all participants agree it should be done
- Essentially, hold on to the results of a transaction until all participants agree
Basics of Two-Phase Commit

- Run at the end of all application actions in a transaction
- Must end in a commit or abort decision
- Must work despite delays and failures
- Requires access to stable storage
- Usually started by a coordinator
  – But the coordinator has no more power than any other participant
The Two Phases

- Phase one: prepare to commit
  – All participants are informed that they should get ready to commit
  – All agree to do so
- Phase two: commitment
  – Actually commit all effects of the transaction
Outline of Two-Phase Commit Protocol

1. Coordinator writes prepare to its local stable log
2. Coordinator sends a prepare message to all other participants
3. Each participant either prepares or aborts, writing its choice to its local log
4. Each participant sends its choice to the coordinator
The Two-Phase Commit Protocol, continued

5. The coordinator collects all votes
6. If all participants vote to commit, the coordinator writes commit to its log
7. If any participant votes to abort, the coordinator writes abort to its log
8. The coordinator sends its decision to all others
The Two-Phase Commit Protocol, concluded

9. If the other participants receive a commit message, they write commit to their logs and release transaction resources
10. If the other participants receive an abort message, they write abort to their logs and release transaction resources
11. They return an acknowledgement to the coordinator
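The eleven steps above can be sketched compactly. This is a minimal, in-memory illustration, not a real implementation: message passing is simulated by method calls, stable storage by a Python list, and the class names are made up for the sketch.

```python
class Participant:
    def __init__(self, name, will_commit=True):
        self.name = name
        self.will_commit = will_commit
        self.log = []                    # stands in for stable storage

    def on_prepare(self):
        # Step 3: prepare or abort, writing the choice to the local log.
        vote = "prepared" if self.will_commit else "abort"
        self.log.append(vote)
        return vote                      # Step 4: send choice to coordinator

    def on_decision(self, decision):
        # Steps 9-10: record the decision and release transaction resources.
        self.log.append(decision)
        return "ack"                     # Step 11: acknowledge

class Coordinator:
    def __init__(self, participants):
        self.participants = participants
        self.log = []

    def run(self):
        self.log.append("prepare")                        # Step 1
        votes = [p.on_prepare() for p in self.participants]  # Steps 2-5
        # Steps 6-7: unanimous "prepared" commits; any abort vote aborts.
        decision = "commit" if all(v == "prepared" for v in votes) else "abort"
        self.log.append(decision)
        acks = [p.on_decision(decision) for p in self.participants]  # Step 8
        assert all(a == "ack" for a in acks)
        return decision

print(Coordinator([Participant("B"), Participant("C"), Participant("D")]).run())
# -> commit
print(Coordinator([Participant("B"), Participant("C", will_commit=False)]).run())
# -> abort
```

Note the unanimity: a single abort vote flips the whole transaction to abort, which is exactly what makes two-phase commit a unanimous voting protocol.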
A Two-Phase Commit Example

[Diagram: Node 1, the coordinator, sends prepare to nodes 2, 3, and 4 in phase 1; each logs prepared and replies. All voted yes! In phase 2 the coordinator sends commit to each node; each logs committed and acknowledges.]
What About the Abort Case?

- Same as commit, except not everyone voted yes
- Instead of committing, send aborts
  – And abort locally at the coordinator
- On receipt of an abort message, undo everything
Overheads of Two-Phase Commit

- For n participants, 4*(n-1) messages
  – Each participant (except the coordinator) gets a prepare and a commit message
  – Each participant (except the coordinator) sends a prepared and a committed message
- Can optimize the committed messages away
  – With the potential cost of serious latencies in clearing log records
Two-Phase Commit and Failures

- Two-phase commit behaves well in the face of all single-node failures
  – May not be able to commit
  – But will cleanly commit or abort
  – And, if anyone commits, eventually everyone will
- Assumes fail-stop failures
Some Failure Examples: Example 1

[Diagram: the coordinator (node 1) fails after sending prepare to only some participants. Nodes 2, 3, and 4 consult each other on timeout and abort.]
Some Failure Examples: Example 2

[Diagram: node 1 sends prepare to nodes 2, 3, and 4, but node 4 fails before replying. Node 1 never gets a response from node 4, so it times out and aborts.]
Some Failure Examples: Example 3

[Diagram: node 4 fails after replying prepared. All voted yes, so node 1 sends commit to nodes 2 and 3; node 1 never gets the committed message from node 4. When node 4 recovers, it consults its log, notices it was prepared, queries the commit status, and then commits as well.]
Handling Failures

- Non-failed nodes still recover if some participants failed
- The coordinator can determine what other nodes did
  – Did we commit or did we not?
- If the coordinator failed, a new coordinator can be elected
  – And can determine the state of the commit
  – Except . . .
An Issue With Two-Phase Commit

- What if both the coordinator and another node fail?
  – During the commit phase
- Two possibilities:
  1. The other failed node committed
  2. The other failed node did not commit
Possibility 1

[Diagram: all four nodes prepare; the coordinator (node 1) sends commit and node 2 commits, then both fail.]
Possibility 2

[Diagram: all four nodes prepare; the coordinator (node 1) writes commit but fails before node 2 commits, and node 2 fails too.]
What Do the Other Nodes Do?

[Diagram: in both cases, surviving nodes 3 and 4 see only the prepare exchange. They cannot tell what happened at the failed nodes: did nodes 1 and 2 both commit, or did node 2 fail before committing?]
Why Does It Matter?

- Well, why?
- Consider, for each case, what would have happened if node 2 hadn’t failed
Handling the Problem

- Go to three phases instead of two
- The third phase provides the necessary information to distinguish the cases
- So if this two-node failure occurs, the other nodes can tell what happened
Three Phase Commit

[State diagram: the coordinator sends canCommit and waits for an OK from every participant (a no or a timeout means abort); it then sends startCommit, collects ACKs from all, and finally sends Commit, which each participant confirms with an ACK. A participant that times out while merely prepared aborts, but one that times out after receiving startCommit commits.]
Why Three Phases?

- The first phase tells everyone a commit is in progress
- The second phase ensures that everyone knows that everyone else was told
  – No chance that only some were told
- The third phase actually performs the commit
- Three phases ensure that a failure of the coordinator plus another participant is non-ambiguous
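The happy path of the three phases can be sketched as follows. The message names (canCommit, startCommit, doCommit) follow the slides; the function shapes are illustrative, and failures/timeouts are deliberately omitted to show only the phase structure.

```python
def three_phase_commit(participants):
    """Each participant is modeled as a function msg -> reply.
    This sketch shows only the phase ordering, not fault handling."""
    # Phase 1: everyone is told a commit is in progress.
    if not all(p("canCommit") == "yes" for p in participants):
        return "abort"
    # Phase 2: everyone learns that everyone else was told.
    if not all(p("startCommit") == "ack" for p in participants):
        return "abort"
    # Phase 3: actually perform the commit.
    for p in participants:
        p("doCommit")
    return "commit"

def willing(msg):
    # A participant that agrees at every step.
    return {"canCommit": "yes", "startCommit": "ack", "doCommit": "ack"}[msg]

def refusing(msg):
    # A participant that votes no in phase 1.
    return "no"

print(three_phase_commit([willing, willing, willing]))  # -> commit
print(three_phase_commit([willing, refusing]))          # -> abort
```

The extra round trip is the whole point: a surviving node that holds a startCommit record knows phase 1 completed everywhere, which is the information two-phase commit lacks.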
How Does This Work?

[Diagram: surviving nodes 3 and 4 have startCommit records in their logs.]

- These status records tell us more than a prepare record would
  – Node 2 ACKed the canCommit message
  – Node 1 knew all participants did a canCommit
- So it is safe for nodes 3 and 4 to commit
Overhead of Three Phase Commit

- For n participants, 6*(n-1) messages
  – Each participant (except the coordinator) gets a canCommit, a startCommit, and a doCommit message
  – Each participant (except the coordinator) ACKs each of those messages
- Again, the final ACK can be optimized away
  – But the coordinator can’t delete its record till it knows of all ACKs
Distributed Mutual Exclusion

- Another common problem in synchronizing distributed systems
- One-way communications can use simple synchronization
  – Built into the paradigm
  – Or handled at the shared server
- More general communications require more complex synchronization
  – To ensure multiple simultaneously running processes interact properly
Synchronization and Mutual Exclusion

- Mutual exclusion ensures that only one of a set of participants uses a resource
  – At any given moment
- In certain cases, that’s all the synchronization required
- In other cases, more synchronization can be built on top of mutual exclusion
The Basic Mutual Exclusion Problem

- n independent participants are sharing a resource
  – In the distributed case, each participant is on a different node
- At any moment, only one participant can use the resource
- Must avoid deadlock, ensure fairness, and use few resources
Mutual Exclusion Approaches
- Contention-based
- Controlled
Contention-Based Mutual Exclusion

- Each process freely and equally competes for the resource
- Some algorithm is used to evaluate the request resolution criterion
- Timestamps, priorities, and voting are ways to resolve conflicting requests
- Assumes everyone cooperates and follows the rules
Timestamp Schemes

- Whoever asked first should get the resource
- Runs into the obvious problems of distributed clocks
- Usually handled with logical clocks, not physical clocks
Lamport’s Mutual Exclusion Algorithm

- Uses Lamport clocks
  – With a total order
- Assumes N processes
- Any pair can communicate directly
- Assumes reliable, in-order delivery of messages
  – Though arbitrary message delays are allowable
Outline of Lamport’s Algorithm

- Each process keeps a queue of requests
- When a process wants the resource, it adds its request to its local queue, in order
- It sends a REQUEST to all other processes
- All other processes send REPLY messages
- When done with the resource, the process sends a RELEASE message to all others
- Lamport timestamps appear on all messages
When Does Someone Get the Resource?

- A requesting process gets the resource when:
  1) It has received replies from all other processes
  2) Its request is at the top of its queue
  3) A RELEASE message was received
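The rules above can be exercised in a toy, single-threaded simulation. This is illustrative only: the global `PROCS` registry and all class/method names are invented for the sketch, and "message delivery" is just a direct method call, which trivially satisfies the reliable in-order delivery assumption.

```python
import heapq

PROCS = {}   # global registry standing in for the network (sketch only)

class Process:
    def __init__(self, pid):
        self.pid, self.clock, self.ts = pid, 0, None
        self.queue, self.replies = [], set()
        PROCS[pid] = self

    def _tick(self, seen=0):
        # Lamport clock update on every event / message receipt.
        self.clock = max(self.clock, seen) + 1

    def request(self):
        # Queue own request locally, in timestamp order, then REQUEST all.
        self._tick()
        self.ts = self.clock
        heapq.heappush(self.queue, (self.ts, self.pid))
        for p in list(PROCS.values()):
            if p is not self:
                p.on_request(self.ts, self.pid)

    def on_request(self, ts, pid):
        self._tick(ts)
        heapq.heappush(self.queue, (ts, pid))
        PROCS[pid].on_reply(self.clock, self.pid)   # REPLY immediately

    def on_reply(self, ts, pid):
        self._tick(ts)
        self.replies.add(pid)

    def may_enter(self):
        # The slide's conditions: replies from all others, and own
        # request at the head of the local queue.
        return (len(self.replies) == len(PROCS) - 1
                and bool(self.queue)
                and self.queue[0] == (self.ts, self.pid))

    def release(self):
        # Drop own request, then tell everyone else via RELEASE.
        heapq.heappop(self.queue)
        self.replies.clear()
        for p in list(PROCS.values()):
            if p is not self:
                p.on_release(self.pid)

    def on_release(self, pid):
        self._tick()
        self.queue = [e for e in self.queue if e[1] != pid]
        heapq.heapify(self.queue)

a, b, c = Process("A"), Process("B"), Process("C")
b.request()
c.request()
print(b.may_enter(), c.may_enter())   # True False: B asked first
b.release()
print(c.may_enter())                  # True: now it's C's turn
```

Tuple ordering on (timestamp, pid) gives the total order the algorithm needs: ties on timestamp are broken by process id.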
Lamport’s Algorithm At Work

[Diagram: processes A, B, C, and D, clocks at 10, with A’s request (A, 10) already queued. B requests the resource at timestamp 11; (B, 11) joins every queue behind (A, 10), and the others REPLY. When A sends RELEASE messages, (B, 11) reaches the head of each queue and B receives the resource.]
Dealing With Multiple Requests

[Diagram: with (A, 10) at the head of every queue, B and C both request at timestamp 11. Both (B, 11) and (C, 11) join every queue, ordered by the total order on timestamps. When A releases the resource, (B, 11) is at the head of every queue, so B receives the resource before C.]
Complexity of Lamport Algorithm

- For N participants, 3*(N-1) messages per completion of the critical section
- The requester sends N-1 REQUEST messages
- The N-1 other processes each REPLY
- When the requester relinquishes the critical section, it sends N-1 RELEASE messages
A Problem With Lamport Algorithm

- One slow or failed process can prevent anyone from getting the resource
- Since no process can claim the resource unless it knows all other processes have seen its request
Voting Schemes

- Processes vote on who should get the shared resource next
- Can work even if one process fails
  – Or even if a minority of processes fail
- Variants can allow weighted voting
Basics of Voting Algorithms

- A process needing the shared resource sends a REQUEST to all other processes
- Each process receiving a request checks if it has already voted for someone else
- If not, it votes for the requester
  – By replying
Obtaining the Shared Resource In Voting Schemes

- When a requester gets replies from a majority of voters, it gets the critical section
- Since any voting process only replies to one requester, only one requester can get a majority
- When done with the resource, the process sends RELEASE messages to all who voted for it
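The majority argument can be made concrete in a few lines. This is a minimal sketch with invented names; real voting schemes also let requesters vote for themselves and handle message loss, which is omitted here.

```python
class Voter:
    def __init__(self):
        self.voted_for = None      # at most one outstanding vote

    def on_request(self, requester):
        # Vote only if not already committed to someone else.
        if self.voted_for is None:
            self.voted_for = requester
            return True
        return False

    def on_release(self, requester):
        # A RELEASE from the winner frees this voter's vote.
        if self.voted_for == requester:
            self.voted_for = None

def try_acquire(me, voters):
    votes = sum(v.on_request(me) for v in voters)
    return votes > len(voters) // 2       # strict majority wins

voters = [Voter() for _ in range(5)]
print(try_acquire("P1", voters))  # True: P1 collects all five votes
print(try_acquire("P2", voters))  # False: every voter is pledged to P1
for v in voters:
    v.on_release("P1")
print(try_acquire("P2", voters))  # True: votes freed by the RELEASEs
```

Since each voter grants at most one vote at a time, two strict majorities would have to share a voter, so two requesters can never both win.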
Avoiding Deadlock

- If more than two processes request the resource, sometimes no one wins
- Effectively a deadlock condition
- Can be fixed by allowing processes to change their votes
  – Requires permission from the process that originally got the vote
Complexity of Voting Schemes for Mutual Exclusion

- O(N) messages
  – For reasons similar to the Lamport discussion
- Use of quorums can reduce this to O(SQRT(N))
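One way to see where O(SQRT(N)) comes from (an illustrative sketch, not the only quorum construction): arrange the N processes in a SQRT(N) × SQRT(N) grid and let a quorum be one full row plus one full column. Any two such quorums intersect, so two requesters can never both win, yet each quorum has only about 2*SQRT(N) members.

```python
import math

def grid_quorum(n, row, col):
    # Processes 0..n-1 laid out in a sqrt(n) x sqrt(n) grid; a quorum is
    # the union of one full row and one full column.
    side = math.isqrt(n)
    assert side * side == n, "sketch assumes N is a perfect square"
    row_members = {row * side + c for c in range(side)}
    col_members = {r * side + col for r in range(side)}
    return row_members | col_members

q1 = grid_quorum(16, row=0, col=0)
q2 = grid_quorum(16, row=3, col=2)
print(len(q1))        # 7 members, i.e. 2*sqrt(16) - 1
print(bool(q1 & q2))  # True: any two row+column quorums intersect
```

The intersection is guaranteed because q1’s row crosses q2’s column (and vice versa), so contacting one quorum instead of a majority shrinks the message count to O(SQRT(N)).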
Token Based Mutual Exclusion

- Maintain a token shared by all processes needing the resource
- The current holder of the token has access to the resource
- To gain access to the resource, a process must obtain the token
Obtaining the Token

- Typically done by asking for it through some topology of the processes
  – Ring
  – Tree
  – Broadcast
Ring Topologies for Tokens

- The token circulates along a pre-defined logical ring of processes
- As the token arrives, if the local process wants the resource, the token is held
- Once finished, the token is passed on
- Good for high loads; high overhead for low loads
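The circulation rule above fits in a few lines. This is a toy single-trip simulation with invented names, not a distributed implementation; "holding" the token is modeled as recording the critical-section entry.

```python
def circulate(wants, start=0):
    """wants[i] is True if process i currently wants the resource.
    Returns the order in which processes enter the critical section
    during one full trip of the token around the ring."""
    n = len(wants)
    entered = []
    for k in range(n):
        holder = (start + k) % n
        if wants[holder]:          # hold the token and use the resource
            entered.append(holder)
        # afterwards, pass the token to the next process on the ring
    return entered

print(circulate([False, True, False, True]))  # -> [1, 3]
```

The tradeoff on the slide is visible here: under load every hop hands the token to someone useful, but with no requesters the token still makes the full trip, which is pure overhead.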
A Token Ring

[Diagram: processes arranged in a logical ring, passing the token from one to the next.]
Tree Topologies

- Only pass the token when needed
- Use a tree structure to pass requests from a requesting process to the current token holder
- When the token is passed, re-arrange the tree to put the new token holder at the root
Broadcast Topologies

- When a process wants the token, it sends a request to all other processes
- If the current token holder isn’t using it, it sends the token to the requester
- If the token is in use, its holder adds the request to a queue
- Use a timestamp scheme to order the queue
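The holder’s side of the broadcast scheme can be sketched as below. Names are illustrative, broadcasting itself is elided, and the timestamp ordering is modeled with a min-heap as the slide’s queue.

```python
import heapq

class TokenHolder:
    def __init__(self, pid):
        self.pid = pid
        self.in_use = False
        self.pending = []          # (timestamp, requester) min-heap

    def on_request(self, ts, requester):
        # An idle holder hands the token over immediately.
        if not self.in_use:
            return requester
        # Otherwise the request waits, ordered by timestamp.
        heapq.heappush(self.pending, (ts, requester))
        return None

    def on_done(self):
        # Finished with the resource: pass the token to the oldest
        # pending request, or keep it if no one is waiting.
        self.in_use = False
        if self.pending:
            return heapq.heappop(self.pending)[1]
        return self.pid

holder = TokenHolder("A")
holder.in_use = True               # A is using the resource
holder.on_request(12, "B")         # queued
holder.on_request(11, "C")         # queued, but with an earlier timestamp
print(holder.on_done())            # -> C (earliest timestamp goes first)
```

Ordering the queue by timestamp is what keeps the scheme fair: requests are served roughly in the order they were made, not in the order their messages happened to arrive.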
A Common Problem With Token Schemes

- What happens if the token holder fails?
- Could keep the token in stable storage
  – But it is still unavailable until the token holder recovers
- Could create a new token