Consensus with Partial Synchrony
Pedro Ferreira do Souto
Departamento de Engenharia Informática
Faculdade de Engenharia
Universidade do Porto

Outline
1. Failure Detection
2. Consensus
   - Problem Definition
   - Solution by Transformation of Synchronous Algorithms
   - PSynchAgreement
   - More Partially Synchronous Models
3. Further Reading

Failure Detection: PSynchFD Failure Detector

Failure detector: a GTA (general timed automaton) with
- stop_i input actions;
- inform-stopped(j)_i output actions, i ≠ j, which notify process i that process j has stopped.

PSynchFD failure detector algorithm
1. Each process P_i continually sends messages to all the other processes.
2. If a process P_i performs a sufficiently large number m of steps without receiving a message from P_j, it records that P_j has stopped and outputs inform-stopped(j)_i.
   - The number m of steps is taken to be the smallest integer that is strictly greater than (d + ℓ_2)/ℓ_1 + 1.

A perfect failure detector reports
1. only failures that have actually happened;
2. all such failures to all other non-faulty processes.
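The timeout rule of PSynchFD can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `timeout_steps` and `PSynchFDMonitor` are hypothetical, one call to `step` stands for one local step of P_i, and the sending of P_i's own messages and the real-time bounds ℓ_1, ℓ_2 and d are not simulated.

```python
import math


def timeout_steps(d: float, l1: float, l2: float) -> int:
    """Smallest integer strictly greater than (d + l2)/l1 + 1."""
    threshold = (d + l2) / l1 + 1
    return math.floor(threshold) + 1


class PSynchFDMonitor:
    """Receive-side bookkeeping of process i (hypothetical sketch)."""

    def __init__(self, me, processes, m):
        self.m = m
        # Number of consecutive local steps without a message from each peer.
        self.silent_steps = {j: 0 for j in processes if j != me}
        # Peers already reported through inform-stopped(j)_i.
        self.stopped = set()

    def step(self, received_from):
        """Perform one local step.

        received_from: set of peers whose message arrived since the last step.
        Returns the peers j for which inform-stopped(j)_i is output now.
        """
        reported = []
        for j in self.silent_steps:
            if j in self.stopped:
                continue
            if j in received_from:
                self.silent_steps[j] = 0
            else:
                self.silent_steps[j] += 1
                if self.silent_steps[j] >= self.m:
                    self.stopped.add(j)
                    reported.append(j)
        return reported
```

Counting steps rather than reading a clock matches the model: a process has no access to real time, only the guarantee that each of its steps takes between ℓ_1 and ℓ_2 time units.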

Theorem 25.1: PSynchFD is a perfect failure detector

Proof (by contradiction). It should be clear that all failures are eventually detected. So, let us assume that P_i reports that P_j has stopped although it has not.
1. If P_i outputs inform-stopped(j)_i, it has not received a message from P_j in its previous m steps, where m is strictly greater than (d + ℓ_2)/ℓ_1 + 1.
2. Since each step takes at least ℓ_1 time units, strictly more than d + ℓ_2 time units have passed since the last time P_i received a message from P_j.
3. Since the channel delay is at most d, P_j has not sent a message for more than ℓ_2 time units.
4. Since P_j sends messages to every process once per step, P_j has taken more than ℓ_2 time units to execute a step.
5. This is a contradiction, because ℓ_2 is the upper bound on the time P_j takes to execute a step.
Thus P_j must have stopped.

Lower bound on PSynchFD (Theorem 25.2 part 1)

Theorem 25.2 part 1: In any timed execution, the time from a stop_j event until an inform-stopped(j)_i event, if any, is strictly greater than d.

[Figure: timeline marking t − a, t − a + ℓ_2 and t, with a > ℓ_2 + d and a final interval of length d.]

Let t be the time when event inform-stopped(j)_i occurs.
1. As pointed out above, P_i has not received any message from P_j for a time a > ℓ_2 + d.
2. Hence, P_j has not sent any message in the interval [t − a, t − a + ℓ_2], for otherwise it would have been received by P_i in the interval [t − a, t − a + ℓ_2 + d], which is included in [t − a, t].
3. Since a > ℓ_2 + d, P_j must have stopped by t − a + ℓ_2 < t − d, i.e. more than d time units before inform-stopped(j)_i.

Note: This means that if P_i times out P_j, then all the messages P_j sent before failing have already been received.

Upper bound on PSynchFD (Theorem 25.2 part 2)

Theorem 25.2 part 2: In any admissible timed execution in which a stop_j event occurs, within time L d + d + O(L ℓ_2) after stop_j, either an inform-stopped(j)_i event or a stop_i event occurs. Here L = ℓ_2/ℓ_1 is a measure of the uncertainty of process execution speeds.

[Figure: timeline starting at t, with an interval of length d followed by an interval of length m ℓ_2.]

Let t be the time when event stop_j occurs.
1. No message is sent from P_j to P_i after time t, so no message is received by P_i from P_j after time t + d.
2. After receiving P_j's last message, P_i counts m steps, each of which can take at most ℓ_2 time to execute.
3. Because m is strictly greater than (d + ℓ_2)/ℓ_1 + 1, we get m ℓ_2 > (d + ℓ_2) L + ℓ_2, i.e. m ℓ_2 = L d + O(L ℓ_2).
4. Thus, if P_i does not fail in the meantime, the total time from stop_j to inform-stopped(j)_i is L d + d + O(L ℓ_2).
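The step from the lower bound on m to m ℓ_2 = L d + O(L ℓ_2) also relies on an upper bound on m that the slide leaves implicit. A sketch of the arithmetic, assuming only that m is the smallest integer strictly above the threshold (so it exceeds the threshold by at most 1) and that L ≥ 1:

```latex
\begin{align*}
\frac{d+\ell_2}{\ell_1} + 1 \;<\; m \;&\le\; \frac{d+\ell_2}{\ell_1} + 2 \\
(d+\ell_2)L + \ell_2 \;<\; m\,\ell_2 \;&\le\; (d+\ell_2)L + 2\ell_2
  \;=\; Ld + L\ell_2 + 2\ell_2 \\
m\,\ell_2 \;&=\; Ld + O(L\ell_2) \qquad (\text{since } \ell_2 \le L\ell_2) \\
\text{time from } stop_j \text{ to } inform\text{-}stopped(j)_i \;&\le\; d + m\,\ell_2 \;=\; Ld + d + O(L\ell_2)
\end{align*}
```

For instance, with the illustrative values d = 10, ℓ_1 = 1 and ℓ_2 = 2, we have L = 2, the threshold is 13, m = 14, and the detection time is at most d + m ℓ_2 = 38.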

Consensus: External interfaces

System A
- init(v)_i input action;
- decide(v)_i output action;
- stop_i input action;
where 1 ≤ i ≤ n and v ∈ V.

Note: all actions with subscript i are said to occur on port i.

User U_i
- decide(v)_i input action;
- init(v)_i output action;
- U_i performs at most one init_i action in any timed execution.

[Figure: user U_i connected to system A through port i, with actions init(v)_i, decide(v)_i and stop_i.]

Definition: A sequence of init_i and decide_i actions is well-formed for i provided that it is some prefix of a sequence of the form init(v)_i, decide(w)_i.
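The well-formedness definition admits exactly three shapes of sequence on a port: the empty sequence, a single init, or an init followed by a decide. A minimal Python sketch of that check, assuming each action is encoded as a hypothetical `(kind, value)` pair with kind `"init"` or `"decide"`:

```python
def is_well_formed(actions):
    """True if the sequence is a prefix of init(v)_i, decide(w)_i.

    actions: list of (kind, value) pairs observed on one port i.
    The values v and w are unconstrained here; only the order of kinds matters.
    """
    kinds = [kind for kind, _value in actions]
    pattern = ["init", "decide"]
    # A prefix of the pattern has the same kinds as the pattern cut to its length;
    # sequences longer than the pattern can never match.
    return kinds == pattern[:len(kinds)]
```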

Consensus: Problem definition (1/2)

- Well-formedness: In any timed execution of the combined system, and for any port i, the interactions between U_i and A are well-formed for i.
- Agreement: In any timed execution, all decision values are identical.
- Validity: In any timed execution, if all init actions that occur contain the same value v, then v is the only possible decision value.
- Failure-free termination: In any admissible failure-free timed execution in which init events occur on all ports, a decide event occurs on each port.
- f-failure termination, 0 ≤ f ≤ n: In any admissible timed execution in which init events occur on all ports, if there are stop events on at most f ports, then a decide event occurs on all the remaining ports.

Definition: Wait-free termination is the special case of f-failure termination where f = n.
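The conditions above can be read as predicates over a finished execution. The Python sketch below is an illustration, not part of the original slides: it assumes a hypothetical summary of one complete admissible execution, with `inits` and `decisions` mapping port numbers to values and `stopped` holding the ports on which a stop event occurred.

```python
def check_agreement(decisions):
    """All decision values are identical."""
    return len(set(decisions.values())) <= 1


def check_validity(inits, decisions):
    """If every init carries the same value v, only v may be decided."""
    init_values = set(inits.values())
    if len(init_values) != 1:
        return True  # the condition constrains nothing in this case
    (v,) = init_values
    return all(w == v for w in decisions.values())


def check_f_failure_termination(ports, inits, decisions, stopped, f):
    """If inits occur on all ports and at most f ports stop,
    every remaining port decides."""
    premise = all(i in inits for i in ports) and len(stopped) <= f
    if not premise:
        return True  # the condition is vacuously satisfied
    return all(i in decisions for i in ports if i not in stopped)
```

Failure-free termination corresponds to calling `check_f_failure_termination` with `stopped=set()` and `f=0`, and wait-free termination to `f=len(ports)`.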

Consensus: Problem definition (2/2)

System A is the composition of the following automata:
- processes P_i, with bounds ℓ_1 and ℓ_2 for each of their tasks, where 0 < ℓ_1 ≤ ℓ_2 < ∞. Processes are subject to stopping failures;
- channels C_ij, which are point-to-point reliable FIFO channels with an upper bound of d on the delivery time for every message (this is not an MMT automaton).

[Figure: users U_1 to U_n connected through ports to processes, which are connected by channels; port i carries stop_i, init(v)_i and decide(v)_i.]

Definition: A solves the agreement problem if it satisfies well-formedness, agreement, validity and failure-free termination.

Solution by Transformation of Synchronous Algorithms: Idea for a Solution

Main result: It is possible to solve agreement with f failures in the partially synchronous setting with upper and lower bounds of f + 1 rounds (just like in the synchronous model).

Observation: All the algorithms for agreement in the synchronous network model require f + 1 rounds to tolerate f stopping failures.

Idea: Transform these algorithms into algorithms for the partially synchronous network model.

Solution by Transformation of Synchronous Algorithms: Transformation of synchronous network algorithms (1/3)

Let A be any synchronous network algorithm for a complete graph network. The algorithm A′ for the partially synchronous network model is as follows.

Each process P_i is the composition of two MMT automata:
- Q_i is i's portion of the PSynchFD algorithm. It includes:
  - the stop_i input action;
  - the inform-stopped(j)_i output actions.
- R_i is the main automaton. It includes:
  - inform-stopped(j)_i inputs (which are matched with the Q_i outputs);
  - a stopped state variable, which keeps track of the set of failed processes, i.e. the processes j for which it has received the input inform-stopped(j)_i;
  - the simulated state variables of process i of A.
