Perfect Failure Detection with Very Few Bits Pierre Fraigniaud 1 - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Perfect Failure Detection with Very Few Bits

Pierre Fraigniaud (1), Sergio Rajsbaum (2), C. Travers (3), Petr Kuznetsov (4), Thibault Rieutord (4)

(1) IRIF, Paris; (2) UNAM, Mexico; (3) LaBRI, Bordeaux; (4) ParisTech, Paris

ANR Descartes, Chasseneuil, October 2017

slide-2
SLIDE 2

Failure Detectors [Chandra Toueg 96]

  • Distributed device
  • Give (unreliable) information on failures
slide-3
SLIDE 3

Modular Distributed Computing

Failure Detector Communication Primitives Network Protocol

slide-4
SLIDE 4

Modular Distributed Computing

Failure Detector Communication Primitives Network Protocol Asynchrony

slide-5
SLIDE 5

Modular Distributed Computing

Failure Detector Communication Primitives Network Protocol Asynchrony Failures

slide-6
SLIDE 6

Relative Hardness of Distributed Task

Failure detector D is the weakest for task T ⇔

1. There is a protocol for T using D
2. Any f.d. D′ that can be used to solve T can emulate D

slide-7
SLIDE 7

Relative Hardness of Distributed Task

Failure detector D is the weakest for task T ⇔

1. There is a protocol for T using D
2. Any f.d. D′ that can be used to solve T can emulate D

Minimum information on failures required to solve T

slide-8
SLIDE 8

Failure Detectors

1 2 3 4 5 6

  • Local failure detection module at each proc.
slide-9
SLIDE 9

Failure Detectors

1 2 3 4 5 6

  • Local failure detection module at each proc.
  • Provide information on other proc. failures
slide-10
SLIDE 10

Perfect Failure Detector

{p1, p4} p1 p2 p3 p4 p5 Perfect failure detector P

slide-11
SLIDE 11

Perfect Failure Detector

{p1, p4} p1 p2 p3 p4 p5 Perfect failure detector P

  • Provide each proc. with a list of proc ids.
slide-12
SLIDE 12

Perfect Failure Detector

{p1, p4} p1 p2 p3 p4 p5 Perfect failure detector P

  • Provide each proc. with a list of proc ids.
  • No false alarm
slide-13
SLIDE 13

Perfect Failure Detector

{p1, p4, p5} p1 p2 p3 p4 p5 Perfect failure detector P

  • Provide each proc. with a list of proc ids.
  • No false alarm
  • Eventually outputs the set of faulty processes
slide-14
SLIDE 14

Failure Detector φ

2 p1 p2 p3 p4 p5 Failure detector φ

slide-15
SLIDE 15

Failure Detector φ

2 p1 p2 p3 p4 p5 Failure detector φ

  • Provide each proc. with an integer
slide-16
SLIDE 16

Failure Detector φ

2 p1 p2 p3 p4 p5 Failure detector φ

  • Provide each proc. with an integer
  • Lower bound on the number of failures
slide-17
SLIDE 17

Failure Detector φ

3 p1 p2 p3 p4 p5 Failure detector φ

  • Provide each proc. with an integer
  • Lower bound on the number of failures
  • Eventually tight
slide-18
SLIDE 18

P vs. φ

In an n-process system

P

φ

slide-19
SLIDE 19

P vs. φ

In an n-process system

P

  • List of proc ids.

φ

  • integer f, 0 ≤ f ≤ n
slide-20
SLIDE 20

P vs. φ

In an n-process system

P

  • List of proc ids.
  • n bits per process

φ

  • integer f, 0 ≤ f ≤ n
  • log n bits per process
slide-21
SLIDE 21

P vs. φ

In an n-process system

P

  • List of proc ids.
  • n bits per process

φ

  • integer f, 0 ≤ f ≤ n
  • log n bits per process

And yet:

Theorem (Mostefaoui, Raynal, T.)

P and φ are equivalent: any task that can be solved using P (resp. φ) can also be solved using φ (resp. P)
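
To make the gap between the two output sizes concrete, here is a minimal sketch. It assumes P's output is stored as one bit per process id and φ's output as a plain binary integer; the function names are illustrative and do not come from the talk.

```python
def bits_for_P(n: int) -> int:
    """P outputs an arbitrary subset of n process ids: one bit per id (assumed layout)."""
    return n


def bits_for_phi(n: int) -> int:
    """phi outputs an integer f with 0 <= f <= n, i.e. one of n + 1 values.

    For n >= 1, n.bit_length() equals ceil(log2(n + 1)).
    """
    return n.bit_length()


for n in (5, 64, 1024):
    print(n, bits_for_P(n), bits_for_phi(n))
```

For n = 1024 this gives 1024 bits for P against 11 bits for φ, even though the two detectors solve the same tasks.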

slide-22
SLIDE 22

This talk

How many bits per proc. are needed to achieve perfect failure detection ?

slide-23
SLIDE 23

This talk

How many bits per proc. are needed to achieve perfect failure detection?

Theorem (upper bound)

There exists a failure detector µP as powerful as P that

  • outputs O(Ack⁻¹(n)) bits per proc
slide-24
SLIDE 24

This talk

How many bits per proc. are needed to achieve perfect failure detection?

Theorem (upper bound)

There exists a failure detector µP as powerful as P that

  • outputs O(Ack⁻¹(n)) bits per proc

Theorem (lower bound)

No failure detector outputting a constant number of bits per proc. can emulate P

slide-25
SLIDE 25

Model

id1 id2 id3 id4 id5

  • Message passing
slide-26
SLIDE 26

Model

id1 id2 id3 id4 id5

  • Message passing
  • Asynchronous
slide-27
SLIDE 27

Model

id1 id2 id3 id4 id5

  • Message passing
  • Asynchronous
  • n processes
slide-28
SLIDE 28

Model

id1 id2 id3 id4 id5

  • Message passing
  • Asynchronous
  • n processes
  • Crash failures
slide-29
SLIDE 29

Model

id1 id2 id3 id4 id5

  • Message passing
  • Asynchronous
  • n processes
  • Crash failures
  • Unique ids
slide-30
SLIDE 30

Distributed Encoding of the Integers

[Fraigniaud, Rajsbaum, T. LATIN’16]

slide-31
SLIDE 31

Counting the Stars

slide-32
SLIDE 32

Counting with Distributed Certificates

id1 id2 id3 id4 id5

slide-33
SLIDE 33

Counting with Distributed Certificates

id1 id2 id3 id4 id5 01 11 00 01 01

slide-34
SLIDE 34

Counting with Distributed Certificates

id1 id2 id3 id4 id5 01 11 00 01 01

  • verify(5, 01 11 00 01 01) ? → YES

slide-35
SLIDE 35

Counting with Distributed Certificates

id1 id2 id3 id4 id5 01 11 00 01 01

  • verify(5, 01 11 00 01 01) ? → YES
  • verify(3, 11 00 01) ? → NO

slide-36
SLIDE 36

Distributed Encoding of the Integers

  • A alphabet
slide-37
SLIDE 37

Distributed Encoding of the Integers

  • A alphabet
  • f : A∗ → {YES, NO}
slide-38
SLIDE 38

Distributed Encoding of the Integers

  • A alphabet
  • f : A∗ → {YES, NO}

such that for each n ∈ ℕ there exists a code of n, c_n ∈ A^n, with:

1. f(c_n) = YES

slide-39
SLIDE 39

Distributed Encoding of the Integers

  • A alphabet
  • f : A∗ → {YES, NO}

such that for each n ∈ ℕ there exists a code of n, c_n ∈ A^n, with:

1. f(c_n) = YES, and
2. For every proper sub-word c′ of c_n, f(c′) = NO

slide-40
SLIDE 40

Simple Distributed Encoding

distributed code of n: C(n) = n, n, ..., n (n times)
slide-41
SLIDE 41

Simple Distributed Encoding

distributed code of n: C(n) = n, n, ..., n (n times)

f(x1, ..., xℓ) = YES ⇔ x1 = x2 = ... = xℓ = ℓ

slide-42
SLIDE 42

Simple Distributed Encoding

distributed code of n: C(n) = n, n, ..., n (n times)

f(x1, ..., xℓ) = YES ⇔ x1 = x2 = ... = xℓ = ℓ

Alphabet of N symbols to encode the first N integers

slide-43
SLIDE 43

Simple Distributed Encoding

distributed code of n: C(n) = n, n, ..., n (n times)

f(x1, ..., xℓ) = YES ⇔ x1 = x2 = ... = xℓ = ℓ

Alphabet of N symbols to encode the first N integers

Challenge: Compact encoding
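
The simple encoding can be checked against the definition of a distributed encoding by brute force. The sketch below is illustrative only: it assumes n ≥ 1, treats "sub-word" as "subsequence", and rejects the empty word; the helper names `proper_subwords` and `is_distributed_code` are not from the talk.

```python
from itertools import combinations


def code(n):
    """Distributed code of n in the simple scheme: the symbol n, repeated n times."""
    return [n] * n


def f(word):
    """YES iff every symbol equals the length of the word (empty word assumed NO)."""
    return len(word) > 0 and all(x == len(word) for x in word)


def proper_subwords(word):
    """Yield every sub-word (subsequence) strictly shorter than word."""
    for k in range(len(word)):
        for idx in combinations(range(len(word)), k):
            yield [word[i] for i in idx]


def is_distributed_code(n):
    """Check both conditions: f accepts code(n) and rejects all its proper sub-words."""
    c = code(n)
    return f(c) and not any(f(s) for s in proper_subwords(c))


print(all(is_distributed_code(n) for n in range(1, 9)))
print(f(code(5)), f(code(5)[:3]))
```

The check passes, but each process holds the symbol n itself, which is why the alphabet grows linearly with the largest integer to encode.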

slide-44
SLIDE 44

Diagonal Sequence

0000 code of 4

slide-45
SLIDE 45

Diagonal Sequence

0000 (code of 4)
00110 (code of 5)

slide-46
SLIDE 46

Diagonal Sequence

0000 (code of 4)
00110 (code of 5)
011010 (code of 6)

slide-47
SLIDE 47

Diagonal Sequence

0000 (code of 4)
00110 (code of 5)
011010 (code of 6)
1101010
10101011
010101111
1111110010
11111001011
111100101111
1110010111111
...
11111111111...1 (code of 2257 − 2)

slide-48
SLIDE 48

Diagonal Sequence

0000 (code of 4)
00110 (code of 5)
011010 (code of 6)
1101010
10101011
010101111
1111110010
11111001011
111100101111
1110010111111
...
11111111111...1 (code of 2257 − 2)

not a sub-word

slide-49
SLIDE 49

Diagonal Sequence

0000 (code of 4)
00110 (code of 5)
011010 (code of 6)
1101010
10101011
010101111
1111110010
11111001011
111100101111
1110010111111
...
11111111111...1 (code of 2257 − 2)

not a sub-word

H(4)-1

slide-50
SLIDE 50

Aside: Well Quasi-order

Let w, w′ ∈ {0, 1}∗

w ⪯∗ w′ ⇔ w is a sub-word of w′

Example: w = 1010, w′ = 001110111110

slide-51
SLIDE 51

Aside: Well Quasi-order

Let w, w′ ∈ {0, 1}∗

w ⪯∗ w′ ⇔ w is a sub-word of w′

Example: w = 1010, w′ = 001110111110

Bad Sequence

A sequence w1, w2, ..., wℓ of words of {0, 1}∗ is bad iff for every i < j, wi is not a sub-word of wj (wi ⋠∗ wj)
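
The sub-word order and the bad-sequence condition are easy to test mechanically. The following is a small sketch, assuming words are Python strings and "sub-word" means "subsequence"; the function names are illustrative.

```python
def is_subword(w, v):
    """True iff w can be obtained from v by deleting symbols (w is a subsequence of v)."""
    it = iter(v)
    # 'ch in it' advances the iterator, so matches are found left to right.
    return all(ch in it for ch in w)


def is_bad(sequence):
    """True iff no earlier word is a sub-word of a later word."""
    return not any(
        is_subword(sequence[i], sequence[j])
        for j in range(len(sequence))
        for i in range(j)
    )


print(is_subword("1010", "001110111110"))
print(is_bad(["0000", "00110", "011010"]))
print(is_bad(["0", "00"]))
```

The first call reproduces the example on the slide, and the second checks the first three words of the diagonal sequence shown earlier in the deck.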

slide-52
SLIDE 52

Well-quasi Order

Higman’s lemma

({0, 1}∗, ⪯∗) is a well-quasi order

slide-53
SLIDE 53

Well-quasi Order

Higman’s lemma

({0, 1}∗, ⪯∗) is a well-quasi order

That is, every bad sequence over {0, 1}∗ is finite

slide-54
SLIDE 54

Well-quasi Order

Higman’s lemma

({0, 1}∗, ⪯∗) is a well-quasi order

That is, every bad sequence over {0, 1}∗ is finite

Length Function Theorem [Schmitz et al., ICALP’11]

Bad sequences w1, w2, . . . , wℓ over {0, 1}∗ with

  • |w1| ≤ d
  • |wi| ≤ i

have length bounded by L(d) where L is a function of Ackermannian growth

slide-55
SLIDE 55

A Bad Sequence

0000 00110 011010 1101010 10101011 10101011 010101111 1111110010 11111001011 111100101111 1110010111111 . . . . . . . . . . . . 11111111111 . . . . . . . . . 1 not a sub-word

slide-56
SLIDE 56

Multi diagonal sequence

D1 D2 D3

slide-57
SLIDE 57

Multi diagonal sequence

D1 D2 D3 H(1) H(H(1)) H(H(H(1)))

slide-58
SLIDE 58

Multi diagonal sequence

D1 D2 D3 H(1) H(H(1)) H(H(H(1))) encode integers H(1) . . . H(H(1)) − 1

slide-59
SLIDE 59

Multi diagonal sequence

D1 D2 D3 H(1) H(H(1)) H(H(H(1))) encode integers H(1) . . . H(H(1)) − 1 encode integers H(H(1)) . . . H(H(H(1))) − 1

slide-60
SLIDE 60

Encoding from Multi Diagonal Sequence

A = {0, 1} × ℕ

Code(n): built from the word 010...0101 of length n in Di

slide-61
SLIDE 61

Encoding from Multi Diagonal Sequence

A = {0, 1} × ℕ

Code(n): (0, i), (1, i), (0, i), ..., (1, i), (0, i), (1, i), where 010...0101 is the word of length n in Di

slide-62
SLIDE 62

Encoding from Multi Diagonal Sequence

A = {0, 1} × ℕ

Code(n): (0, i), (1, i), (0, i), ..., (1, i), (0, i), (1, i), where 010...0101 is the word of length n in Di

f((b1, d1), (b2, d2), ..., (bn, dn)) = YES ⇔

1. d1 = d2 = ... = dn = i
2. b1, ..., bn is the sequence of length n in Di

slide-63
SLIDE 63

Compactness

How many bits to distributively encode the first n integers? Code(n) : (0, i), (1, i), (0, i), . . . , (1, i), (0, i), (1, i)

slide-64
SLIDE 64

Compactness

How many bits to distributively encode the first n integers?

Code(n): (0, i), (1, i), (0, i), ..., (1, i), (0, i), (1, i) ⇒ 1 + log(i) bits

slide-65
SLIDE 65

Compactness

How many bits to distributively encode the first n integers?

Code(n): (0, i), (1, i), (0, i), ..., (1, i), (0, i), (1, i) ⇒ 1 + log(i) bits

where i = min{j : n < H^j(1)} ≤ Ack⁻¹(n) (H^j denotes H iterated j times)
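
The bit count can be traced with a short sketch. The real function H from the talk is Ackermannian and is not reproduced here; `toy_H` below is a loudly hypothetical stand-in, chosen only so that the loop terminates, and the function names are assumptions made for illustration.

```python
def diagonal_index(n, H):
    """Smallest j >= 1 such that n < H^j(1), where H^j is H iterated j times."""
    j, bound = 1, H(1)
    while n >= bound:
        bound = H(bound)
        j += 1
    return j


def bits_per_process(n, H):
    """1 bit for the symbol in {0, 1} plus ceil(log2(i)) bits for the index i."""
    i = diagonal_index(n, H)
    return 1 + (i - 1).bit_length()


def toy_H(x):
    """Hypothetical stand-in for H (NOT the function used in the talk)."""
    return 2 ** x + 1


for n in (2, 8, 100, 513):
    print(n, diagonal_index(n, toy_H), bits_per_process(n, toy_H))
```

Even with this modest stand-in the index i grows very slowly with n; with an Ackermannian H it is bounded by the inverse Ackermann function, as stated on the slide.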

slide-66
SLIDE 66

Perfect Failure Detection from Distributed Encoding

slide-67
SLIDE 67

Perfect failure detection from distributed encoding

Failure detector µP:

  • Encodes an upper bound on the number of alive processes
  • Eventually converges to the (code of the) number of non-faulty processes

slide-68
SLIDE 68

µP

p1 p2 p3 p4 p5 x x epoch i epoch i + 1

  • w1
  • w2
  • w4
  • w5
  • Constant fd output at each proc in each epoch
  • w1w2w4w5 ⪯∗ code(ai), where #alive(epoch i) ≤ ai
slide-69
SLIDE 69

µP

epoch 1 epoch 2 epoch ℓ a1 a2 aℓ time

  • At most n epochs
  • a1 ≥ a2 ≥ . . . ≥ aℓ
slide-70
SLIDE 70

µP

epoch 1 epoch 2 epoch ℓ a1 a2 aℓ time

  • At most n epochs
  • a1 ≥ a2 ≥ . . . ≥ aℓ
  • aℓ = # alive(last epoch) = # correct procs
slide-71
SLIDE 71

From µP to P

Let Q = {q1, . . . , q4} ⊆ {p1, . . . , pn} q1 q2 q3 q4 epoch i ai

  • w1
  • w2
  • w3
  • w4

w = w1w2w3w4

  • Recall: w ⪯∗ code(ai) and |Alive(epoch i)| ≤ ai
slide-72
SLIDE 72

From µP to P

Let Q = {q1, . . . , q4} ⊆ {p1, . . . , pn} q1 q2 q3 q4 epoch i ai

  • w1
  • w2
  • w3
  • w4

w = w1w2w3w4

  • Recall: w ⪯∗ code(ai) and |Alive(epoch i)| ≤ ai
  • Code def.: w ≺∗ code(ai) ⇒ f(w) = false and f(code(ai)) = true

slide-73
SLIDE 73

From µP to P

Let Q = {q1, . . . , q4} ⊆ {p1, . . . , pn} q1 q2 q3 q4 epoch i ai

  • w1
  • w2
  • w3
  • w4

w = w1w2w3w4

  • Recall: w ⪯∗ code(ai) and |Alive(epoch i)| ≤ ai
  • Code def.: w ≺∗ code(ai) ⇒ f(w) = false and f(code(ai)) = true

  • Hence, if f(w) = true then {p1, . . . , pn} \ Q ⊆ Faulty
slide-74
SLIDE 74

Dirty Collect

Let Q = {q1, . . . , q4} ⊆ {p1, . . . , pn} q1 q2 q3 q4 epoch i ai

  • w1
  • w2
  • w3
  • w4

w = w1w2w3w4

  • w1, w2, w3, w4 sampled in different (≠) epochs
slide-75
SLIDE 75

Dirty Collect

Let Q = {q1, . . . , q4} ⊆ {p1, . . . , pn} q1 q2 q3 q4 epoch i ai

  • w1
  • w2
  • w3
  • w4

w = w1w2w3w4

  • w1, w2, w3, w4 sampled in different (≠) epochs
  • f(w) = true ?? f(w) = false ??
slide-76
SLIDE 76

Clean Collect

epoch 1 epoch 2 epoch 3 epoch ℓ a1 a2 a3 a4 time collect collect collect collect w1 w2 w3 w4

  • At most n epochs
slide-77
SLIDE 77

Clean Collect

epoch 1 epoch 2 epoch 3 epoch ℓ a1 a2 a3 a4 time collect collect collect collect w1 w2 w3 w4

  • At most n epochs

⇒ In a sequence of n collects, at least one is clean

slide-78
SLIDE 78

From µP to P

epoch 1 epoch 2 epoch 3 epoch ℓ a1 a2 a3 a4 time collect collect collect collect w1 w2 w3 w4

  • Collect i is successful if (1) it terminates and (2) f(wi) = true

slide-79
SLIDE 79

From µP to P

epoch 1 epoch 2 epoch 3 epoch ℓ a1 a2 a3 a4 time collect collect collect collect w1 w2 w3 w4

  • Collect i is successful if (1) it terminates and (2) f(wi) = true
  • If, for some set Q, there are n successful collects, then P output = {p1, ..., pn} \ Q
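
The counting rule can be sketched as a small bookkeeping class. This is a minimal sketch under stated assumptions, not the protocol from the paper: the message exchange behind a collect is abstracted into a dictionary, words are formed in increasing id order, and the simple encoding f (every symbol equals the word length) stands in for the compact one. The class and method names are hypothetical.

```python
from collections import Counter


def f(word):
    """Stand-in verifier: the simple encoding, YES iff every symbol equals len(word)."""
    return len(word) > 0 and all(x == len(word) for x in word)


class PEmulator:
    """Counts successful collects per responder set Q and suspects the complement."""

    def __init__(self, processes, verify):
        self.processes = frozenset(processes)
        self.n = len(self.processes)
        self.verify = verify
        self.successes = Counter()
        self.suspected = frozenset()

    def record_collect(self, responses):
        """responses maps each responding process id to the symbol it returned."""
        q = frozenset(responses)
        word = [responses[p] for p in sorted(q)]
        if self.verify(word):
            self.successes[q] += 1
            # n successful collects on Q: at least one was clean, so the rest crashed.
            if self.successes[q] >= self.n:
                self.suspected = self.suspected | (self.processes - q)
        return self.suspected


emulator = PEmulator(range(1, 6), f)
for _ in range(5):
    output = emulator.record_collect({2: 3, 3: 3, 5: 3})
print(sorted(output))
```

Waiting for n successful collects is what makes the output safe: with at most n epochs, one of those collects must have been taken entirely within a single epoch.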

slide-80
SLIDE 80

µP: Summary

Failure detector µP

  • Outputs O(log Ack⁻¹(n)) bits per process
  • Can emulate the perfect failure detector P
slide-81
SLIDE 81

µP: Summary

Failure detector µP

  • Outputs O(log Ack⁻¹(n)) bits per process
  • Can emulate the perfect failure detector P
  • (P can also emulate µP – see the paper)
slide-82
SLIDE 82

Lower Bound

Failure detector µP

  • Outputs O(log Ack⁻¹(n)) bits per process
slide-83
SLIDE 83

Lower Bound

Failure detector µP

  • Outputs O(log Ack⁻¹(n)) bits per process

Is there a f.d. D that

1. can emulate P
2. outputs fewer than log Ack⁻¹(n) bits per process?

slide-84
SLIDE 84

Lower Bound

Failure detector µP

  • Outputs O(log Ack⁻¹(n)) bits per process

Is there a f.d. D that

1. can emulate P
2. outputs fewer than log Ack⁻¹(n) bits per process?

Theorem

No failure detector with constant-size output can emulate P

slide-85
SLIDE 85

Lower Bound Proof

Assume for contradiction that D is a f.d. with

  • Constant range R_D (independent of n)
slide-86
SLIDE 86

Lower Bound Proof

Assume for contradiction that D is a f.d. with

  • Constant range R_D (independent of n)
  • T_{D→P} (can emulate P)
slide-87
SLIDE 87

Lower Bound Proof

Assume for contradiction that D is a f.d. with

  • Constant range R_D (independent of n)
  • T_{D→P} (can emulate P)

Ingredients

  • Ramsey’s theorem
  • Well quasi-order theory
slide-88
SLIDE 88

Goal

Construct two executions e and e′ :

  • indistinguishable for some non-faulty processes
  • with Correct(e) ⊊ Correct(e′)
slide-89
SLIDE 89

Goal

Construct two executions e and e′ :

  • indistinguishable for some non-faulty processes
  • with Correct(e) ⊊ Correct(e′)

⇒ in e′, T_{D→P} erroneously outputs a non-faulty process

slide-90
SLIDE 90

From Executions to Words

Let e be an (infinite) execution

q: d1 d2 d3 d4 d5 d6 d7

  • As R_D is finite, ∃d ∈ R_D output infinitely many times at q
slide-91
SLIDE 91

From Executions to Words

Let e be an (infinite) execution

[Figure: outputs at q, with the value d recurring among d2, d3, d5, d7]

  • As R_D is finite, ∃d ∈ R_D output infinitely many times at q
slide-92
SLIDE 92

From Executions to Words

Let e be an (infinite) execution

[Figure: outputs at q, with the value d recurring among d2, d3, d5, d7]

  • As R_D is finite, ∃d ∈ R_D output infinitely many times at q

Execution ẽ

q: d d d

  • Constant failure detector output
slide-93
SLIDE 93

From Executions to Words

Let e be an (infinite) execution in which crashes are initial

[Figure: processes q1, ..., q5, two of which crash initially]

slide-94
SLIDE 94

From Executions to Words

Let e be an (infinite) execution in which crashes are initial

[Figure: q2 and q4 crash initially; q1, q3, q5 output d1, d3, d5 respectively at every step]

Constant f.d. output (di) at each non-faulty process qi

slide-95
SLIDE 95

From Executions to Words

Let e be an (infinite) execution in which crashes are initial

[Figure: q2 and q4 crash initially; q1, q3, q5 output d1, d3, d5 respectively at every step]

Constant f.d. output (di) at each non-faulty process qi

e → w_e = d1d3d5 ∈ R_D∗

slide-96
SLIDE 96

Towards Indistinguishable Executions

execution    associated word ∈ R_D∗
e1           w1
e2           w2
e3           w3
...          ...
eL           wL

|wi| = i

slide-97
SLIDE 97

Towards Indistinguishable Executions

execution    associated word ∈ R_D∗
e1           w1
e2           w2
e3           w3
...          ...
eL           wL

|wi| = i

  • (Higman’s Lemma) (R_D∗, ⪯∗) is a wqo

slide-98
SLIDE 98

Towards Indistinguishable Executions

execution    associated word ∈ R_D∗
e1           w1
e2           w2
e3           w3
...          ...
eL           wL

|wi| = i

  • (Higman’s Lemma) (R_D∗, ⪯∗) is a wqo

⇒ For large enough L, ∃ i, j : 1 ≤ i < j ≤ L and wi is a sub-word of wj
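
Both steps, mapping an execution to its word and searching the list of words for an embedded pair, can be sketched in a few lines. The sketch assumes each non-faulty process has a constant output and that words are read in increasing id order; the names `word_of_execution` and `find_embedded_pair` are illustrative.

```python
def word_of_execution(outputs, correct):
    """Concatenate the constant outputs of the non-faulty processes, by increasing id."""
    return tuple(outputs[q] for q in sorted(correct))


def is_subword(w, v):
    """True iff w is a (scattered) sub-word of v."""
    it = iter(v)
    return all(symbol in it for symbol in w)


def find_embedded_pair(words):
    """Return the first (i, j) with i < j and words[i] a sub-word of words[j], else None."""
    for j in range(len(words)):
        for i in range(j):
            if is_subword(words[i], words[j]):
                return i, j
    return None


w_e = word_of_execution({1: "d1", 3: "d3", 5: "d5"}, {1, 3, 5})
print(w_e)
print(find_embedded_pair([("a", "b", "c"), ("x", "a", "b", "y", "c")]))
```

Higman's lemma guarantees that the search succeeds once the list of words is long enough, whatever the finite range R_D is.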

slide-99
SLIDE 99

Towards Indistinguishable Executions

execution e, w = abc: q1, q2, q3 constantly output a, b, c respectively

execution e′, w′ = xabyc: q′1, q′2, q′3, q′4, q′5 constantly output x, a, b, y, c respectively

slide-100
SLIDE 100

Towards Indistinguishable Executions

execution e, w = abc: q1, q2, q3 constantly output a, b, c respectively

execution e′, w′ = xabyc: q′1, q′2, q′3, q′4, q′5 constantly output x, a, b, y, c respectively

a (or b, c) may be output at processes with distinct ids in e and e′.

slide-101
SLIDE 101

Towards Indistinguishable Executions

execution e, w = abc: q1, q2, q3 constantly output a, b, c respectively

execution e′, w′ = xabyc: q′1, q′2, q′3, q′4, q′5 constantly output x, a, b, y, c respectively

a (or b, c) may be output at processes with distinct ids in e and e′.

Rely on Ramsey’s Theorem to get rid of ids

slide-102
SLIDE 102

Conclusion

Summary:

  • Perfect failure detection with O(Ack⁻¹(n)) bits per process
  • Perfect failure detection with constant output is impossible

slide-103
SLIDE 103

Conclusion

Summary:

  • Perfect failure detection with O(Ack⁻¹(n)) bits per process
  • Perfect failure detection with constant output is impossible

  • Applications of wqo theory to distributed computing
slide-104
SLIDE 104

Conclusion

Summary:

  • Perfect failure detection with O(Ack⁻¹(n)) bits per process
  • Perfect failure detection with constant output is impossible

  • Applications of wqo theory to distributed computing

Future work:

slide-105
SLIDE 105

Conclusion

Summary:

  • Perfect failure detection with O(Ack⁻¹(n)) bits per process
  • Perfect failure detection with constant output is impossible

  • Applications of wqo theory to distributed computing

Future work:

  • Close the gap between lower and upper bounds
slide-106
SLIDE 106

Conclusion

Summary:

  • Perfect failure detection with O(Ack⁻¹(n)) bits per process
  • Perfect failure detection with constant output is impossible

  • Applications of wqo theory to distributed computing

Future work:

  • Close the gap between lower and upper bounds
  • Failure detector as (distributed) encoder: relation between output size and failure detector power

slide-107
SLIDE 107

Conclusion

Summary:

  • Perfect failure detection with O(Ack⁻¹(n)) bits per process
  • Perfect failure detection with constant output is impossible

  • Applications of wqo theory to distributed computing

Future work:

  • Close the gap between lower and upper bounds
  • Failure detector as (distributed) encoder: relation between output size and failure detector power
  • Other applications of the distributed encoding of the integers

slide-108
SLIDE 108

Thanks!

slide-109
SLIDE 109

Coloring Subsets of Processes

c assigns a color to each subset of processes: c(Q = {q1, ..., qk}) ∈ R_D^k

[Figure: processes q1, q2, q3; every process p ≠ qi crashes initially]

slide-110
SLIDE 110

Coloring Subsets of Processes

c assigns a color to each subset of processes: c(Q = {q1, ..., qk}) ∈ R_D^k

[Figure: every process p ≠ qi crashes initially; q1, q2, q3 constantly output d1, d2, d3]
slide-111
SLIDE 111

Coloring Subsets of Processes

c assigns a color to each subset of processes: c(Q = {q1, ..., qk}) ∈ R_D^k

[Figure: every process p ≠ qi crashes initially; q1, q2, q3 constantly output d1, d2, d3]

c({q1, ..., q3}) = (d1, d2, d3)

slide-112
SLIDE 112

Getting Rid of Ids

c : Q = {q1, ..., qk} → R_D^k

slide-113
SLIDE 113

Getting Rid of Ids

c : Q = {q1, ..., qk} → R_D^k

Ramsey’s Theorem

  • For any m, k
slide-114
SLIDE 114

Getting Rid of Ids

c : Q = {q1, ..., qk} → R_D^k

Ramsey’s Theorem

  • For any m, k
  • There exists n = g(m, k) such that
slide-115
SLIDE 115

Getting Rid of Ids

c : Q = {q1, ..., qk} → R_D^k

Ramsey’s Theorem

  • For any m, k
  • There exists n = g(m, k) such that
  • There exists an m-subset S of the n procs such that
slide-116
SLIDE 116

Getting Rid of Ids

c : Q = {q1, ..., qk} → R_D^k

Ramsey’s Theorem

  • For any m, k
  • There exists n = g(m, k) such that
  • There exists an m-subset S of the n procs such that
  • Every k-subset of S has the same color
slide-117
SLIDE 117

Getting Rid of Ids

c : Q = {q1, ..., qk} → R_D^k

Ramsey’s Theorem

  • For any m, k
  • There exists n = g(m, k) such that
  • There exists an m-subset S of the n procs such that
  • Every k-subset of S has the same color

Intuitively

For any q ∈ S,

  • For failure patterns with k correct procs and initial crashes
  • F.d. output at q depends only on the rank of its id
SLIDE 118

Failure Detector Specification

A failure detector D

  • Outputs symbols in some range R_D
  • Is defined with respect to failure patterns
slide-119
SLIDE 119

Failure Pattern

p1 p2 p3 p4 x x x

slide-120
SLIDE 120

Failure Pattern

p1 p2 p3 p4 x x x

F : ℕ → 2^{p1,...,pn}

slide-121
SLIDE 121

Failure Detector Specification

For each failure pattern, D defines which outputs are valid

  • History : H(p, t) is the output at process p at time t
  • D(F) = valid histories for failure pattern F
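
As an illustration of this style of specification, the sketch below checks a history of the perfect failure detector P against a failure pattern over a finite horizon. It rests on assumptions not stated on the slides: F(t) is read as the set of processes crashed by time t (the usual Chandra-Toueg convention), and "eventually" is approximated by "at the last time step". All names are hypothetical.

```python
def is_valid_P_history(F, H, processes, horizon):
    """Finite-horizon check of a P history H against a failure pattern F."""
    faulty = F(horizon)
    for t in range(horizon + 1):
        for p in processes - F(t):
            # No false alarm: a live process only lists processes already crashed.
            if not H(p, t) <= F(t):
                return False
    # Completeness (approximated): at the horizon every correct process lists all faulty ones.
    return all(H(p, horizon) == faulty for p in processes - faulty)


crash_time = {1: 0, 4: 0, 5: 3}  # illustrative pattern: p1 and p4 crash at 0, p5 at 3


def F(t):
    return frozenset(p for p, c in crash_time.items() if c <= t)


def H_delayed(p, t):
    """Every process learns of each crash one step late."""
    return F(t - 1) if t > 0 else frozenset()


def H_hasty(p, t):
    """Lists p5 before it has crashed: a false alarm."""
    return frozenset({1, 4, 5})


processes = frozenset(range(1, 6))
print(is_valid_P_history(F, H_delayed, processes, 10))
print(is_valid_P_history(F, H_hasty, processes, 10))
```

The delayed history is accepted, while the hasty one is rejected at time 0 because it suspects p5 before p5 crashes.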
slide-122
SLIDE 122

Failure Detector Equivalence

Failure detectors D and D′ are equivalent ⇔ there exist two asynchronous, crash-resilient protocols

  • T_{D→D′} that emulates D′ using D
  • T_{D′→D} that emulates D using D′