CS6100: Topics in Design and Analysis of Algorithms, Fault Tolerant Consensus



SLIDE 1

CS6100: Topics in Design and Analysis of Algorithms

Fault Tolerant Consensus

CS6100 (Even 2012): Fault Tolerant Consensus

SLIDE 2

Models

Failure Types

  • 1. Clean Crash Failure: the node fails completely.
  • 2. (Unclean) Crash Failure: the node fails after only some of its messages in a round have been sent.
  • 3. Byzantine Failure: the node behaves arbitrarily.

Timing

  • Asynchronous: deterministic consensus is impossible even with a single crash failure (FLP 1985).
  • Synchronous: our focus.

We consider f-resilient synchronous complete networks, i.e., at most f nodes crash, in the LOCAL model.

SLIDE 3

Problem Definition

  • Each node i has an initial value called input_i.
  • Intermediate variable x_i (not strictly required).
  • Each node i has a write-once output variable y_i.
  • Consensus is reached when the following guarantees are achieved.

Termination: Every non-faulty node i must eventually write into y_i.

Agreement: y_i = y_j for all non-faulty nodes i and j.

Validity: If input_i = 0 (resp., 1) for all non-faulty i, then every non-faulty i must decide 0 (resp., 1).

  • We want to minimize time (rounds) to termination.
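The three guarantees can be stated as a small checker. The following is a minimal sketch, not part of the slides: the function name `check_consensus` and the dictionary-based representation of inputs, outputs and the faulty set are assumptions made for illustration.

```python
def check_consensus(inputs, outputs, faulty):
    """Return the list of violated guarantees (an empty list means consensus).

    inputs  -- dict: node -> input bit (input_i)
    outputs -- dict: node -> decided value y_i, or None if y_i was never written
    faulty  -- set of nodes that failed during the execution
    """
    good = [i for i in inputs if i not in faulty]
    violated = []
    # Termination: every non-faulty node has written its output.
    if any(outputs.get(i) is None for i in good):
        violated.append("termination")
    # Agreement: all written outputs of non-faulty nodes coincide.
    decided = {outputs[i] for i in good if outputs.get(i) is not None}
    if len(decided) > 1:
        violated.append("agreement")
    # Validity: if all non-faulty inputs equal b, every decision must be b.
    good_inputs = {inputs[i] for i in good}
    if len(good_inputs) == 1 and decided - good_inputs:
        violated.append("validity")
    return violated
```

For example, `check_consensus({0: 0, 1: 0, 2: 1}, {0: 0, 1: 0, 2: 1}, faulty={2})` returns an empty list, because only the non-faulty nodes 0 and 1 are constrained.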

SLIDE 4

Algorithm

Each node i initially sets x = p_i, its input value, then V = {x}. For every round k, 1 ≤ k ≤ f + 1:

  • 1. Send the set V (or only the elements not sent before) to every other node.
  • 2. Denote the set received from node j as S_j, for each j ≠ i that has not crashed yet.
  • 3. V = V ∪ (⋃_j S_j).
  • 4. If k = f + 1, then y = min(V).

Lemma 1. In every execution, at the end of round f + 1, V_i = V_j for every two non-faulty nodes i and j.

The following theorem follows as a corollary.

Theorem 2. The above algorithm solves consensus in f + 1 rounds.
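The rounds above can be simulated directly. This is a sketch under stated assumptions rather than the course's reference code: the name `flood_consensus` and the `crashes` argument (a map from a node to the round in which it crashes and the set of nodes its last message still reaches) are hypothetical, and the caller is expected to supply at most f crashes.

```python
def flood_consensus(inputs, f, crashes=None):
    """Run the (f + 1)-round flooding algorithm.

    crashes maps node -> (round, receivers): the node crashes in that round
    after its message reached only `receivers` (an empty set is a clean crash).
    Returns {node: decision} for the nodes that never crashed.
    """
    crashes = crashes or {}
    n = len(inputs)
    V = [{x} for x in inputs]              # V_i = {input of node i}
    alive = set(range(n))
    for k in range(1, f + 2):              # rounds 1 .. f + 1
        inbox = [set() for _ in range(n)]
        crashed_now = set()
        for j in alive:
            if j in crashes and crashes[j][0] == k:
                receivers = crashes[j][1]  # partial send, then crash
                crashed_now.add(j)
            else:
                receivers = set(range(n)) - {j}
            for i in receivers:
                inbox[i] |= V[j]           # S_j as received by node i
        alive -= crashed_now
        for i in alive:
            V[i] |= inbox[i]               # V = V ∪ (⋃_j S_j)
    return {i: min(V[i]) for i in alive}   # y = min(V) after round f + 1
```

With `inputs = [0, 1, 1, 1]`, `f = 1` and node 0 crashing in round 1 after reaching only node 1, node 1 relays the value 0 in round 2, so all three surviving nodes decide 0.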

SLIDE 5

Proof of Lemma 1

Suppose for contradiction that V_i ≠ V_j for two non-faulty nodes i and j; without loss of generality, there is a value x ∈ V_j \ V_i. In every round r, 1 ≤ r ≤ f + 1, x is in the set of values held by at least one node that has not yet crashed. Otherwise, x could not end up in V_j. However, there are at most f faults, so by the pigeonhole principle, there must be a fault-free round in [1, f + 1]. During that fault-free round, a node holding x sends it to every other node, so every node, including i, receives x. This contradicts our assumption that x ∉ V_i.
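The pigeonhole step can be checked exhaustively for small f. The sketch below uses hypothetical helper names (`has_fault_free_round`, `check_pigeonhole`) and simply tries every way of placing at most f crashes into rounds 1..f + 1.

```python
from itertools import product

def has_fault_free_round(crash_rounds, f):
    """True if some round in 1..f+1 contains no crash."""
    return any(r not in crash_rounds for r in range(1, f + 2))

def check_pigeonhole(f):
    """Try every assignment of at most f crashes to rounds 1..f+1."""
    rounds = range(1, f + 2)
    for count in range(f + 1):
        for assignment in product(rounds, repeat=count):
            if not has_fault_free_round(set(assignment), f):
                return False
    return True
```

With f + 1 crashes the guarantee disappears: `has_fault_free_round({1, 2}, 1)` is false, which is why the bound of at most f faults matters.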

SLIDE 6

Lower Bound on the Number of Rounds

Intuition: If nodes decide too early, they cannot distinguish between two executions in which the decisions must be different.

Some definitions.

Execution: (Informal definition.) Sequence of synchronous operations (i.e., computations, messages sent and messages received) executed by all the nodes.

View of node i: Subsequence of computations and messages sent/received by node i. Denoted α | i.

Similarity of Executions w.r.t. i: Two executions α_1 and α_2 are similar w.r.t. node i if α_1 | i = α_2 | i. We denote this similarity by α_1 ∼_i α_2.

Failure Sparse Executions: There is at most 1 node failure per round.

In the rest of the lower bound proof, we assume failure sparse executions.

SLIDE 7

Valence of a Configuration C: the set of all consensus values that can be reached from the configuration C.

Univalent/Bivalent Configuration C: The valence of C is a set of cardinality one/two.

0-valent configuration: the valence set has only 0. Similarly, define 1-valent configurations.

Theorem 3. Any consensus algorithm A on n nodes resilient to f crash failures requires at least f + 1 rounds in some execution. (Assume n ≥ f + 2.)

Proof. First we show that some initial configuration is bivalent. Then we show that there is an execution under which the configuration after the first f − 1 rounds is bivalent. Finally, we show that one more round does not suffice.

SLIDE 8

Initial Bivalent Configuration

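Assume all initial configurations are univalent. Consider the following sequence of possible initial values for n (= 10) nodes.

0000000000 -> 0
    :
0000011111 -> 0
0000001111 -> 1
    :
1111111111 -> 1

The two adjacent configurations with different valences differ only in the input of the 6th node. If the 6th node crashes before sending any message, should the consensus be 1 or 0? The remaining nodes cannot tell the two configurations apart, a contradiction.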
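The search for the critical pair of initial configurations can be sketched as a scan along a chain in which neighbours differ in one input. The chain order, the helper names `chain` and `first_flip`, and the `decide` function standing in for the failure-free outcome of an algorithm are all assumptions for illustration.

```python
def chain(n):
    """Configurations 0^n, 0^(n-1)1, ..., 1^n; neighbours differ in one input."""
    return [[0] * (n - k) + [1] * k for k in range(n + 1)]

def first_flip(configs, decide):
    """Return the first adjacent pair with different decisions, plus the
    index of the single node whose input differs between them."""
    for a, b in zip(configs, configs[1:]):
        if decide(a) != decide(b):
            pos = next(i for i, (x, y) in enumerate(zip(a, b)) if x != y)
            return a, b, pos
    return None  # cannot happen if validity holds at both ends of the chain

def majority(config):
    """Hypothetical failure-free decision rule: strict majority of ones."""
    return int(2 * sum(config) > len(config))
```

By validity the two ends of the chain decide 0 and 1, so a flip must exist. For the `majority` rule with n = 10 it occurs between five and six ones, and the node at index 4 is the one whose crash would hide the difference.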

SLIDE 9

f − 1 Round Bivalent Executions

Lemma 4. ∀k, 0 ≤ k ≤ f − 1, ∃ a k-round execution of A that ends in a bivalent configuration.

Proof. By induction on k; the base case k = 0 is the bivalent initial configuration. Assume some (k − 1)-round execution ends in a bivalent configuration, k ≤ f − 1. We need to show that it has a one-round extension that also ends in a bivalent configuration. Assume the contrary, i.e., every one-round extension is univalent.

WLOG assume the one-round extension without failure is 1-valent. Call this β_k. Since the execution is bivalent at the end of round k − 1, there is a one-round failure sparse extension that is 0-valent, with one node p turning faulty. Call it γ_k. One can find two intermediate configurations (shown as α_k^i and α_k^{i+1}) with different valences. However, the two configurations differ only slightly. In particular, there is a node q_{i+1} that does not hear from p in α_k^{i+1} while it hears from p in α_k^i.

SLIDE 10

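Figure: one-round extensions of the bivalent (0/1) configuration at the end of round k − 1.

  • β_k, 1-valent: node p sends to all nodes.
  • α_k^i, 1-valent: node p fails to send to (q_1, . . . , q_i).
  • α_k^{i+1}, 0-valent: node p fails to send to (q_1, . . . , q_i, q_{i+1}).
  • γ_k, 0-valent: node p fails to send to (q_1, . . . , q_i, q_{i+1}, . . . , q_m).

Consider the case where q_{i+1} dies in round k + 1 before sending any message. The remaining nodes cannot distinguish between α_k^i and α_k^{i+1}, so they decide the same value in both, contradicting the different valences. This extension is allowed because it is failure sparse and uses at most k + 1 ≤ f failures.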

SLIDE 11

One More Indecisive Round

Lemma 5. If α_{f−1} is an (f − 1)-round execution of A that ends in a bivalent configuration, then there exists a one-round extension of α_{f−1} in which some non-faulty node has not decided.

Proof Sketch. Consider the following one-round extensions.

  • β_f, 1-valent: p sends to p_j; p sends to p_k.
  • δ_f: p fails to send to p_j; p sends to p_k.
  • γ_f, 0-valent: p fails to send to p_j; p might send to p_k.

Note that in the above possible one-round extensions, p_k cannot distinguish between β_f and δ_f, whereas p_j cannot distinguish between δ_f and γ_f. If every non-faulty node had decided by the end of round f, then in δ_f node p_k would decide 1 and node p_j would decide 0, violating agreement. The proof of the theorem follows.
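The indistinguishability claims in the proof sketch can be made concrete by recording only what p_j and p_k hear from p in round f. This is an illustrative sketch: it assumes the three extensions share the prefix α_{f−1} and agree on all other messages, and it fixes the "might send" case in γ_f to "does not send". The table `HEARS_FROM_P` and the function `similar` are hypothetical names.

```python
# True means the round-f message from p arrives at that node.
HEARS_FROM_P = {
    "beta_f":  {"p_j": True,  "p_k": True},
    "delta_f": {"p_j": False, "p_k": True},
    "gamma_f": {"p_j": False, "p_k": False},  # "might send" fixed to False here
}

def similar(node, a, b):
    """a ~_node b: the node's round-f view is the same in both extensions."""
    return HEARS_FROM_P[a][node] == HEARS_FROM_P[b][node]
```

Here `similar("p_k", "beta_f", "delta_f")` and `similar("p_j", "delta_f", "gamma_f")` both hold, while p_j can tell β_f and δ_f apart.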

SLIDE 12

Byzantine Agreement With Global Coin

What if f < n/8 nodes can behave arbitrarily? (We don’t allow bad nodes to spoof other nodes.) For instance, they can collude and actively thwart the efforts of the algorithm! Of course, there is no hope of solving consensus in ≤ f rounds.

Suppose the nodes have access to a global unbiased coin. Each round, this coin produces an outcome that every node sees, but future outcomes are not revealed.
SLIDE 13

Algorithm ByzGen

Require: A value input_i for each node i.
Ensure: A decision y_i.

vote = input_i
L = (5n/8) + 1
H = (3n/4) + 1
G = 7n/8
for all rounds do
    Broadcast vote.
    Receive votes from all processors.
    maj = majority of values received, including own vote. {maj is a value from {0, 1}.}
    tally = # of occurrences of maj among votes received.
    if coin = heads then
        threshold = L
    else
        threshold = H
    end if
    if tally ≥ threshold then
        vote = maj
    else
        vote = 0
    end if
    if tally ≥ G then
        Set y_i to maj permanently.
    end if
end for
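A small simulation can make the round structure concrete. The sketch below is not the course's code and makes several assumptions: n is a multiple of 8 so the thresholds are integers, the global coin is drawn from a seeded `random.Random`, ties in the majority go to 0, and the Byzantine nodes follow a simple hypothetical strategy of sending an independent random bit to each receiver (a real adversary may be much smarter).

```python
import random

def byzgen(inputs, byzantine, seed=0, max_rounds=200):
    """Simulate ByzGen; return {good node: (decision, round decided)}."""
    n = len(inputs)
    rng = random.Random(seed)
    L, H, G = 5 * n // 8 + 1, 3 * n // 4 + 1, 7 * n // 8
    good = [i for i in range(n) if i not in byzantine]
    vote = {i: inputs[i] for i in good}
    decided = {}
    for rnd in range(1, max_rounds + 1):
        heads = rng.random() < 0.5            # global coin, same for every node
        threshold = L if heads else H
        new_vote = {}
        for i in good:
            # Good votes look the same to everyone; each Byzantine sender
            # may tell each receiver something different.
            received = [vote[j] if j in vote else rng.randint(0, 1)
                        for j in range(n)]
            ones = sum(received)
            maj = 1 if ones > n - ones else 0
            tally = ones if maj == 1 else n - ones
            new_vote[i] = maj if tally >= threshold else 0
            if tally >= G and i not in decided:
                decided[i] = (maj, rnd)       # y_i is write-once
        vote = new_vote
        if len(decided) == len(good):
            break
    return decided
```

For instance, with n = 16, one Byzantine node and all good inputs equal to 1, every good node sees a tally of at least 15 ≥ G = 14 and decides 1 in the first round, whatever the coin shows.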

SLIDE 15

Analysis

What happens if all good processors begin with the same value? Easy case: every good processor sees a tally of at least n − f > G for that value and decides it in the first round.

If two good nodes compute different maj values, then no good node's tally reaches the threshold (either L or H). They will all vote for 0, and in the next round maj will be 0.

Faulty processors foil a threshold x ∈ {L, H} in a round if the tally reaches x for one good processor while it is below x for another. Note that L and H cannot both be foiled in the same round, since the tallies at two good processors differ by at most f < n/8 = H − L. The probability that a chosen threshold is foiled in a round, therefore, is at most 1/2. So in expected 2 rounds, we have an unfoiled threshold. Then all good players will choose the same vote v, and in the next round they will reach consensus because at least n − f > G ≥ H > L nodes will send out vote v.

Theorem 6. The expected # of rounds for ByzGen to reach agreement is a constant.
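The two quantitative steps in the analysis can be written out as follows. This is a sketch using the thresholds of ByzGen; t_min and t_max are assumed names for the smallest and largest tally of the same value over the good processors, and the coin is assumed to be revealed only after the faulty votes of the round are fixed.

```latex
\begin{align*}
H - L &= \left(\tfrac{3n}{4} + 1\right) - \left(\tfrac{5n}{8} + 1\right)
       = \tfrac{n}{8} > f, \\
\text{both foiled} &\;\Longrightarrow\; t_{\min} < L \ \text{and}\ t_{\max} \ge H
       \;\Longrightarrow\; t_{\max} - t_{\min} > H - L > f, \\
&\qquad \text{contradicting } t_{\max} - t_{\min} \le f, \\
\mathbb{E}[\text{rounds until an unfoiled threshold}]
      &\le \sum_{k \ge 1} k \left(\tfrac{1}{2}\right)^{k} = 2 .
\end{align*}
```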
