Perfect Failure Detection with Very Few Bits
Pierre Fraigniaud (1), Sergio Rajsbaum (2), C. Travers (3), Petr Kuznetsov (4), Thibault Rieutord (4)
(1) IRIF, Paris (2) UNAM, Mexico (3) LaBRI, Bordeaux (4) ParisTech, Paris
ANR Descartes, Chasseneuil, October 2017
[Figure: system model. Layers: Protocol, Failure Detector, Communication Primitives, Network. Annotations: Asynchrony, Failures]
Failure detector D is the weakest for task T ⇐⇒
1 There is a protocol for T using D
2 Any f.d. D′ that can be used to solve T can emulate D
Minimum information on failures required to solve T
Perfect failure detector P
[Figure: processes p1, ..., p5; P outputs the set {p1, p4}, and later the set {p1, p4, p5}]
Failure detector φ
[Figure: processes p1, ..., p5; φ outputs the integer 2, and later the integer 3]
[Figure: comparison of P and φ in an n-process system]
And yet:
P and φ are equivalent: any task that can be solved using P (resp. φ) can also be solved using φ (resp. P)
There exists a failure detector µP as powerful as P that
No failure detector outputting a constant number of bits per proc. can emulate P
[Figure: five processes with identifiers id1, ..., id5 hold the labels 01, 11, 00, 01, 01]
[Fraigniaud, Rajsbaum, T. LATIN’16]
f(01 11 00 01 01) = YES
f(11 00 01) = NO
An alphabet A and a function f : A^* → {YES, NO} such that for each n ∈ N there exists a code of n, c_n ∈ A^n, with:
1 f(c_n) = YES and
2 For every sub-word c′ of c_n with c′ ≠ c_n, f(c′) = NO
Trivial distributed code of n: C(n) = n, n, . . . , n (n letters)
f(x1, . . . , xℓ) = YES ⇐⇒ x1 = x2 = . . . = xℓ = ℓ
Alphabet of N symbols to encode the first N integers
Challenge: Compact encoding
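The trivial encoding above can be checked mechanically for small n. The following is a minimal sketch, not taken from the slides: the names code, f, proper_subwords and is_valid_code are illustrative, and treating the empty word as NO is an assumption made here so that condition 2 also covers it.

```python
from itertools import combinations

def code(n):
    """Trivial distributed code of n: the word n, n, ..., n of length n."""
    return [n] * n

def f(word):
    """YES (True) iff every letter equals the length of the word.
    Assumption: the empty word is rejected."""
    return len(word) > 0 and all(x == len(word) for x in word)

def proper_subwords(word):
    """Yield every word obtained by deleting at least one letter, order kept."""
    for k in range(len(word)):
        for positions in combinations(range(len(word)), k):
            yield [word[i] for i in positions]

def is_valid_code(n):
    """Check both conditions of the definition for the trivial code of n."""
    c = code(n)
    return f(c) and not any(f(s) for s in proper_subwords(c))
```

The check enumerates all 2^n − 1 proper sub-words, so it is only meant for small n. It also makes the cost of this encoding visible: each letter is an integer up to N, hence an alphabet of N symbols.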
Codes over {0, 1}: no word is a sub-word of a later word
0000 (code of 4)
00110 (code of 5)
011010 (code of 6)
1101010
10101011
010101111
1111110010
11111001011
111100101111
1110010111111
. . .
11111111111 . . . 1 (code of 2257 − 2)
H(4) − 1
Let w, w′ ∈ {0, 1}^*
w ≤_* w′ ⇐⇒ w is a sub-word of w′
Example: w = 1010 is a sub-word of w′ = 001110111110
A sequence w1, w2, . . . , wℓ of words of {0, 1}^* is bad iff for every i < j, wi is not a sub-word of wj
({0, 1}^*, ≤_*) is a well-quasi order
That is, every bad sequence over {0, 1}^* is finite
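Both notions are easy to test on concrete words. The sketch below is illustrative only (is_subword and is_bad are names chosen here): it checks the sub-word relation with a single left-to-right scan and declares a sequence bad when no word is a sub-word of a later one.

```python
def is_subword(w, v):
    """True iff w can be obtained from v by deleting letters."""
    it = iter(v)
    # 'letter in it' advances the iterator, so letters are matched in order.
    return all(letter in it for letter in w)

def is_bad(words):
    """True iff for every i < j, words[i] is not a sub-word of words[j]."""
    return not any(
        is_subword(words[i], words[j])
        for i in range(len(words))
        for j in range(i + 1, len(words))
    )
```

For instance, is_subword("1010", "001110111110") holds, and the first four words listed for the codes of 4 to 7 (0000, 00110, 011010, 1101010) form a bad sequence.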
Bad sequences w1, w2, . . . , wℓ over {0, 1}∗ with
have length bounded by L(d) where L is a function of Ackermannian growth
Sequences D1, D2, D3, . . . starting at H(1), H(H(1)), H(H(H(1))), . . .
D1: encode integers H(1) . . . H(H(1)) − 1
D2: encode integers H(H(1)) . . . H(H(H(1))) − 1
A = {0, 1} × N
Code(n) : (0, i), (1, i), (0, i), . . . , (1, i), (0, i), (1, i), where the bits form the word of length n in D_i
f((b1, d1), (b2, d2), . . . , (bn, dn)) = YES ⇐⇒
1 d1 = d2 = . . . = dn = i
2 b1, . . . , bn is the sequence of length n in D_i
How many bits to distributively encode the first n integers?
Code(n) : (0, i), (1, i), (0, i), . . . , (1, i), (0, i), (1, i)
⇒ 1 + log(i) bits, where i = min{j : n < H^j(1)} ≤ Ack^{-1}(n)
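To make the bit count concrete, the index i = min{j : n < H^j(1)} can be computed by iterating H. This is only a sketch under stated assumptions: the slides do not give H explicitly here, so the function H below is a hypothetical stand-in (any strictly increasing function works for the loop), and counting the bits of i with bit_length is one possible reading of log(i).

```python
def sequence_index(n, H):
    """Smallest j >= 1 such that n < H^j(1), with H^j the j-fold iterate of H."""
    j, bound = 1, H(1)
    while n >= bound:
        j, bound = j + 1, H(bound)
    return j

def bits_per_letter(n, H):
    """One bit for the {0, 1} component plus the bits needed to write i."""
    i = sequence_index(n, H)
    return 1 + i.bit_length()

def H(x):
    """Hypothetical stand-in for the slides' H; NOT the actual function."""
    return 2 ** x + 1
```

With this stand-in, H(1) = 3, H(H(1)) = 9 and H(H(H(1))) = 513, so n = 100 gets index 3 and 3 bits per letter. With the Ackermannian H of the slides the index grows far more slowly.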
Failure detector µP:
non-faulty processes
[Figure: timelines of p1, ..., p5 with two crashes (x), divided into epoch i and epoch i + 1]
[Figure: time axis split into epoch 1, epoch 2, . . . , epoch ℓ, with values a1, a2, . . . , aℓ]
Let Q = {q1, . . . , q4} ⊆ {p1, . . . , pn}
[Figure: q1, q2, q3, q4 output w1, w2, w3, w4 during epoch i, with value ai]
w = w1w2w3w4
⇒ f(w) = false and f(code(ai)) = true
[Figure: time axis with epoch 1, epoch 2, epoch 3, . . . , epoch ℓ and values a1, a2, a3, a4; four successive collects return the words w1, w2, w3, w4]
⇒ In a sequence of n collects, at least one is clean
f(wi) = true
P output = {p1, . . . , pn} \ Q
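The last step, turning an accepted collect into an output of P, can be sketched as follows. Everything here is an illustrative assumption rather than the protocol of the slides: the function names are invented, letters are ordered by process identifier, clean collects and epochs are not modelled, and the trivial encoding (every live process outputs the number of live processes) stands in for µP's compact code.

```python
def f(word):
    """Stand-in decision function: the trivial encoding, not the compact one."""
    return len(word) > 0 and all(x == len(word) for x in word)

def suspects_from_collect(all_processes, collected, f):
    """collected maps each responding process (the set Q) to its letter.
    Returns {p1, ..., pn} \\ Q if f accepts the collected word, else None."""
    responders = sorted(collected)
    word = [collected[p] for p in responders]
    if f(word):
        return set(all_processes) - set(responders)
    return None
```

With processes 1 to 5 and live processes 2, 3 and 5 each outputting 3, a collect from all three is accepted and yields the suspects {1, 4}. A collect from only 2 and 5 gives the word 3, 3, which f rejects, so nothing is output.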
Failure detector µP
Is there a f.d. D that
1 can emulate P
2 outputs less than log Ack^{-1}(n) bits per process?
No failure detector with constant-size output can emulate P
Assume for contradiction D f.d. such that
Ingredients
Construct two executions e and e′ :
⇒ in e′, T_{D→P} erroneously outputs a non-faulty process
Let e be an (infinite) execution
[Figure: process q receives the f.d. outputs d2, d3, d4, d5, d6, d7; the outputs d3, d5, d7 are all equal to d]
Execution ẽ
[Figure: process q receives the outputs d, d]
Let e be an (infinite) execution in which crashes are initial
[Figure: processes q1, ..., q5, two of which crash initially (x)]
Constant f.d. output (di) at each non-faulty process qi
e → w_e = d1d3d5 ∈ R_D^*
execution    associated word ∈ R_D^*
e1           w1
e2           w2
e3           w3
. . .        . . .
eL           wL
|wi| = i
(R_D^*, ≤_*) is a wqo
⇒ For large enough L, ∃ i, j : 1 ≤ i < j ≤ L and wi is a sub-word of wj
execution e: w = abc, processes q1, q2, q3
execution e′: w′ = xabyc, processes q′1, q′2, q′3, q′4, q′5
a (or b, c) may be output at processes with distinct ids in e and e′.
Rely on Ramsey’s Theorem to get rid of ids
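The correspondence between the processes of e and those of e′ is given by the positions at which w embeds into w′. A small sketch (the function name embedding is chosen here, and the leftmost embedding is just one possible choice):

```python
def embedding(w, v):
    """Leftmost positions of v (0-based) that spell w, or None if w is not
    a sub-word of v."""
    positions, start = [], 0
    for letter in w:
        k = v.find(letter, start)
        if k < 0:
            return None
        positions.append(k)
        start = k + 1
    return positions
```

For w = abc and w′ = xabyc the positions are 1, 2 and 4 (0-based), so q1, q2, q3 in e see the same outputs as q′2, q′3, q′5 in e′. Nothing forces these processes to have the same identifiers, which is the issue the Ramsey argument addresses.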
Summary:
process
impossible
Future work:
between output size and failure detector power
integers
c assigns a color to each subset of processes: c(Q = {q1, . . . , qk}) ∈ R_D^k
[Figure: processes q1, q2, q3; every process p ≠ qi crashes (x)]
c({q1, . . . , q3}) = (d1, d2, d3)
c : Q = {q1, . . . , qk} → R_D^k
For any q ∈ S,
A failure detector D
[Figure: processes p1, ..., p4 with three crashes (x)]
For each failure pattern, D defines which outputs are valid
Failure detectors D and D′ are equivalent ⇐⇒ There exist two asynchronous, crash-resilient protocols