 
Coordination

If the solution to availability and scalability is to decentralize and replicate functions and data, how do we coordinate the nodes?
• data consistency
• update propagation
• mutual exclusion
• consistent global states
• group membership
• group communication
• event ordering
• distributed consensus
• quorum consensus

Failures and Consensus

Overview
• The consensus problem and its variants
• Failure models
• Consensus in synchronous model with no failures
• Consensus in synchronous model with fail-stop
• The trouble with byzantine failures
• Impossibility of consensus with too many byzantine failures
• Consensus in synchronous with a few byzantine failures
• Impossibility of consensus in asynchronous with failures
• Consensus in practice anyway
• Recovery and failure detectors

Consensus

[Figure: processes P_1, P_2, P_3 exchange proposed values v_1, v_2, v_3 over unreliable multicast and run a consensus algorithm to produce decisions d_1, d_2, d_3.]

Step 1. Propose.
Step 2. Decide.
Generalizes to N nodes/processes.

Properties for Correct Consensus

Termination: All correct processes eventually decide.
Agreement: All correct processes select the same d_i.
Or…(stronger) all processes that do decide select the same d_i, even if they later fail.
Integrity: All deciding processes select the “right” value.
• As specified for the variants of the consensus problem.

Variant I: Consensus (C)

d_i = v_k
P_i selects d_i from {v_0, …, v_{N-1}}.
All P_i select d_i as the same v_k.
If all P_i propose the same v, then d_i = v, else d_i is arbitrary.
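The three correctness properties can be stated as checks over the outcome of one run. The following is a minimal sketch for Variant I only; the function name `check_consensus`, the dictionaries keyed by process id, and the use of `None` for "has not decided" are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Optional, Set

def check_consensus(proposals: Dict[int, Hashable],
                    decisions: Dict[int, Optional[Hashable]],
                    correct: Set[int]) -> Dict[str, bool]:
    """Check one finished run against the Variant I properties.

    proposals: value v_i proposed by each process (hypothetical encoding).
    decisions: value d_i selected by each process, None if it never decided.
    correct:   ids of the processes that did not fail.
    """
    decided = {p: d for p, d in decisions.items() if d is not None}
    # Termination: every correct process has decided.
    termination = all(p in decided for p in correct)
    # Agreement (stronger form): all processes that decided chose the same
    # value, including any that failed after deciding.
    agreement = len(set(decided.values())) <= 1
    # Integrity: every decision is one of the proposed values; together with
    # agreement this gives d_i = v when all processes proposed the same v.
    integrity = all(d in proposals.values() for d in decided.values())
    return {"termination": termination,
            "agreement": agreement,
            "integrity": integrity}
```

For example, `check_consensus({0: "a", 1: "b"}, {0: "a", 1: "b"}, {0, 1})` reports that termination and integrity hold but agreement does not.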
Variant II: Command Consensus (BG)

[Figure: a leader (or commander) sends v_leader to each subordinate (or lieutenant); each selects d_i = v_leader.]

P_i selects d_i = v_leader proposed by designated leader node P_leader if the leader is correct, else the selected value is arbitrary.
As used in the Byzantine generals problem.
Also called attacking armies.

Variant III: Interactive Consistency (IC)

d_i = [v_0, …, v_{N-1}]
P_i selects d_i = [v_0, …, v_{N-1}], a vector reflecting the values proposed by all correct participants.

Equivalence of Consensus Variants

If any of the consensus variants has a solution, then all of them have a solution.
Proof is by reduction.
• IC from BG. Run BG N times, one with each P_i as leader.
• C from IC. Run IC, then select from the vector.
• BG from C.
  Step 1: leader proposes to all subordinates.
  Step 2: subordinates run C to agree on the proposed value.
• IC from C? BG from IC? Etc.

Four Dimensions of Failure Models

Reliable vs. unreliable network
  Reliable: all messages are eventually delivered exactly once.
Synchronous vs. asynchronous communication
  Synchronous: message delays (and process delays) are bounded, enabling communication in synchronous rounds.
Byzantine vs. fail-stop
  Fail-stop: faulty nodes stop and do not send.
  Byzantine: faulty nodes may send arbitrary messages.
Authenticated vs. unauthenticated
  Authenticated: the source and content of every message can be verified, even if a Byzantine failure occurs.

Assumptions

For now we assume:
• Nodes/processes communicate only by messages.
• The network may be synchronous or asynchronous.
• The network channels are reliable. Is this realistic?
There are three kinds of node/process failures:
• Fail-stop
• Authenticated Byzantine (“signed messages”)
• Byzantine (“unsigned”)

Consensus: synchronous with no failures

The solution is trivial in one round of proposal messages.
Intuition: all processes receive the same values, the values sent by the other processes.
Step 1. Propose.
Step 2. At end of round, each P_i decides from received values.
• Consensus: apply any deterministic function to {v_0, …, v_{N-1}}.
• Command consensus: if v_leader was received, select it, else apply any deterministic function to {v_0, …, v_{N-1}}.
• Interactive consistency: construct a vector from all received values.
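The one-round, no-failure case can be sketched in a few lines. This is an illustrative simulation rather than a networked implementation: `propose_round` stands in for a reliable synchronous multicast, `min` is an arbitrary choice for "any deterministic function", and every function name here is hypothetical.

```python
from typing import Dict, List

def propose_round(proposals: List[int]) -> List[Dict[int, int]]:
    """Step 1: every process multicasts its value.

    With a reliable synchronous network and no failures, each process ends
    the round holding the same map {sender id: proposed value}.
    """
    return [dict(enumerate(proposals)) for _ in proposals]

def decide_consensus(received: Dict[int, int]) -> int:
    """Variant I: apply a deterministic function (here: min) to the values."""
    return min(received.values())

def decide_command(received: Dict[int, int], leader: int) -> int:
    """Variant II: take the leader's value if it arrived, else fall back."""
    if leader in received:
        return received[leader]
    return min(received.values())

def decide_interactive(received: Dict[int, int]) -> List[int]:
    """Variant III: build the vector of all received values, ordered by id."""
    return [received[j] for j in sorted(received)]

# Step 2: each process decides locally from what it received.
views = propose_round([7, 3, 9])
print([decide_consensus(v) for v in views])      # every process picks 3
print([decide_command(v, 2) for v in views])     # every process picks 9
print(decide_interactive(views[0]))              # [7, 3, 9]
```

Because every process sees an identical set of values, any deterministic rule yields agreement; the choice of rule only matters for which value wins.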
Consensus: synchronous fail-stop

F+1 rounds of exchanges can reach consensus for N processes with up to F processes failing.
In each round, each node says everything that it knows that it hasn’t already said in previous rounds.
At most N^2 values are sent.
Intuition: suppose P_i learns a value v from P_j during a round.
• Other correct processes also learned v from P_j during that round, unless P_j failed during the round.
• Other correct processes will learn it from P_i in the next round, unless P_i also fails during that round.
• Adversary must fail one process in each round, after sending its value to one other process…so F+1 rounds are sufficient if at most F failures occur.

Lamport’s 1982 Result, Generalized by Pease

The Lamport/Pease result shows that consensus is impossible:
• with byzantine failures,
• if one-third or more processes fail (N ≤ 3F),
  Lamport shows it for 3 processes, but Pease generalizes to N.
• even with synchronous communication.
Intuition: a node presented with inconsistent information cannot determine which process is faulty.
The good news: consensus can be reached if N > 3F, no matter what kinds of node failures occur.

Impossibility with three byzantine generals

[Figure: two scenarios with commander p_1 and lieutenants p_2, p_3. In one, the commander sends 1:v to both lieutenants and p_3 relays 3:1:u; in the other, the commander sends 1:w and 1:x and the lieutenants relay 2:1:w and 3:1:x. “3:1:u” means “3 says 1 says u”. Faulty processes are shown shaded. [Lamport82]]

Intuition: subordinates cannot distinguish these cases.
Each must select the commander’s value in the first case, but this means they cannot agree in the second case.

Solution with four byzantine generals

[Figure: two scenarios with commander p_1 and lieutenants p_2, p_3, p_4. In one, the commander sends 1:v to every lieutenant and a faulty lieutenant relays other values; in the other, a faulty commander sends 1:u, 1:w, 1:v. Faulty processes are shown shaded.]

Intuition: vote.
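The fail-stop argument can be checked with a small round-based simulation. The sketch below assumes a hypothetical crash schedule format (`crashes` maps a process id to the round in which it fails and the recipients it still reaches in that round) and uses `min` as the final deterministic decision rule; neither comes from the slides.

```python
from typing import Dict, List, Set, Tuple

def flood_consensus(proposals: List[int],
                    crashes: Dict[int, Tuple[int, List[int]]],
                    rounds: int) -> Dict[int, int]:
    """Simulate synchronous fail-stop consensus by flooding.

    In each round every live process sends whatever it knows and has not
    already sent. A process listed in `crashes` fails during its crash round
    after reaching only the listed recipients, then stays silent.
    Returns the decision (min of known values) of each surviving process.
    """
    n = len(proposals)
    known: List[Set[int]] = [{v} for v in proposals]
    sent: List[Set[int]] = [set() for _ in range(n)]
    crashed: Set[int] = set()
    for rnd in range(1, rounds + 1):
        inbox: List[Set[int]] = [set() for _ in range(n)]
        for p in range(n):
            if p in crashed:
                continue
            new_values = known[p] - sent[p]
            if p in crashes and crashes[p][0] == rnd:
                targets = crashes[p][1]      # partial send, then fail
                crashed.add(p)
            else:
                targets = list(range(n))
            for q in targets:
                inbox[q] |= new_values
            sent[p] |= new_values
        for q in range(n):
            if q not in crashed:
                known[q] |= inbox[q]
    return {p: min(known[p]) for p in range(n) if p not in crashed}

# Adversarial schedule with F = 2: P0 fails in round 1 after reaching only
# P1, and P1 fails in round 2 after reaching only P2.
schedule = {0: (1, [1]), 1: (2, [2])}
print(flood_consensus([1, 5, 6, 7], schedule, rounds=3))  # {2: 1, 3: 1}
print(flood_consensus([1, 5, 6, 7], schedule, rounds=2))  # {2: 1, 3: 5}
```

With F+1 = 3 rounds the survivors agree; stopping after F = 2 rounds leaves the value 1 known to P2 but not P3, which is exactly the chain of failures the intuition describes.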
Summary: Byzantine Failures

A solution exists if fewer than one-third are faulty (N > 3F).
It works only if communication is synchronous.
Like fail-stop consensus, the algorithm requires F+1 rounds.
The algorithm is very expensive and therefore impractical.
Number of messages is exponential in the number of rounds.
Signed messages make the problem easier (authenticated byzantine).
• In general case, the failure bounds (N > 3F) are not affected.
• Practical algorithms exist for N > 3F. [Castro&Liskov]

Fischer-Lynch-Paterson (1985)

No consensus can be guaranteed in an asynchronous communication system in the presence of any failures.
Intuition: a “failed” process may just be slow, and can rise from the dead at exactly the wrong time.
Consensus may occur recognizably on occasion, or often.
• e.g., if no messages are inconveniently delayed
FLP implies that no agreement can be guaranteed in an asynchronous system with byzantine failures either.
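The "vote" intuition for four generals, and why it breaks with three, can be illustrated with a two-round simulation for a single fault (F = 1). This is a sketch under assumptions: the names `om1` and `majority`, the `lies` map of forged relays keyed by (sender, receiver), and the fallback value `"retreat"` are hypothetical, and the three-general run only shows this particular voting rule failing, not the general impossibility proof.

```python
from collections import Counter
from typing import Dict, List, Tuple

def majority(votes: List[str], default: str) -> str:
    """Return the value held by a strict majority of votes, else the default."""
    value, count = Counter(votes).most_common(1)[0]
    return value if count > len(votes) / 2 else default

def om1(commander_sends: Dict[int, str],
        lies: Dict[Tuple[int, int], str],
        default: str = "retreat") -> Dict[int, str]:
    """Two rounds: the commander sends, then lieutenants relay and vote.

    commander_sends: value the commander sent to each lieutenant id.
    lies: forged relays, keyed by (faulty sender, receiver); every other
          relay repeats what the sender really received.
    """
    lieutenants = sorted(commander_sends)
    decisions = {}
    for i in lieutenants:
        votes = [commander_sends[i]]              # "1:v"
        for j in lieutenants:
            if j != i:                            # "j:1:v"
                votes.append(lies.get((j, i), commander_sends[j]))
        decisions[i] = majority(votes, default)
    return decisions

# Four generals, lieutenant p3 faulty: correct p2 and p4 still choose v.
print(om1({2: "v", 3: "v", 4: "v"}, {(3, 2): "u", (3, 4): "w"}))
# Four generals, commander faulty: all lieutenants agree on the default.
print(om1({2: "u", 3: "w", 4: "v"}, {}))
# Three generals, lieutenant p3 faulty: p2 sees [v, u] and cannot pick v.
print(om1({2: "v", 3: "v"}, {(3, 2): "u"}))
```

With N = 4 and F = 1 the bound N > 3F holds and the vote masks the single liar; with N = 3 there is no majority to outvote it.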