Contention-Related Crash Failures, Anaïs Durand, LIP6, Sorbonne Université

SLIDE 1

Contention-Related Crash Failures Anaïs Durand

LIP6, Sorbonne Université, Paris

April 1st, 2019

Anaïs Durand Contention-Related Crash Failures 1/25

SLIDE 2

Set Agreement and Renaming in the Presence of Contention-Related Crash Failures

SSS 2018

Joint work with: Michel Raynal, Gadi Taubenfeld

SLIDE 3

Computational Model

◮ Asynchronous deterministic system
◮ n processes p_1, ..., p_n
◮ Atomic read/write registers
◮ 0 ≤ t < n process crashes
◮ Participation required

SLIDE 4

Process crashes

2 kinds of process crashes usually considered:
◮ Initially dead processes
◮ “Classical” (any-time) crashes: no constraints

SLIDE 6

Contention-Related Crash Failures [Taubenfeld,18]

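Contention = # processes that accessed a shared register
           ≈ # processes that started to compute

λ = predefined contention threshold

2 possible definitions:

[Figure: two timelines, each split at the point where contention reaches λ. First timeline: regions labeled "No crashes" and "λ-constrained crashes". Second timeline: one region labeled "No crashes".]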
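The definition can be made concrete with a small checker. This is a minimal sketch under one assumption: a crash counts as λ-constrained when it happens while contention is at most λ, which is the reading the later slides rely on (the test count_i ≤ λ, and "n − t > λ ⇒ no λ-constrained crashes"). The event encoding and the function name are hypothetical.

```python
from typing import Iterable, Tuple

# Hypothetical event encoding: ("access", pid) or ("crash", pid).
Event = Tuple[str, int]

def respects_lambda_constraint(events: Iterable[Event], lam: int) -> bool:
    """Return True if every crash occurs while contention <= lam.

    Contention is taken to be the number of distinct processes that
    have accessed a shared register so far (assumed reading of the slide).
    """
    accessed = set()
    for kind, pid in events:
        if kind == "access":
            accessed.add(pid)
        elif kind == "crash":
            if len(accessed) > lam:
                return False  # crash after contention passed the threshold
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return True
```

With λ = 2, a crash after a single access is allowed, while a crash after three distinct processes have accessed shared memory is rejected.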

SLIDE 7

Contention-Related vs. Any-Time Crash Failures

Consensus:
◮ [Fischer et al., 85]: Impossible with one any-time crash failure.
◮ [Taubenfeld, 18]: Algorithm that tolerates one (n − 1)-constrained crash failure for n > 1.

k-Set Agreement, 1 ≤ k < n:
◮ [Borowsky, Gafni, 93]: Impossible with k any-time crash failures.
◮ [Taubenfeld, 18]: Algorithm that tolerates ℓ + k − 2 (n − ℓ)-constrained crash failures for ℓ ≥ 1 and n ≥ 2ℓ + k − 2.

SLIDE 8

Motivation

Consider a problem P that can be solved with t any-time crash failures, but is impossible to solve with t + 1 any-time crash failures.

Given λ, can P be solved with both t1 λ-constrained and t2 ≤ t any-time crash failures, with t1 + t2 > t?

We consider here: k-set agreement (for k ≥ 2) and renaming.

SLIDE 9

k-Set Agreement

SLIDE 10

k-Set Agreement

Definition [Chaudhuri, 90]

One-shot object
Operation propose(v): proposes value v and returns a decided value
Properties:
◮ Validity: every decided value is a proposed value
◮ Agreement: at most k different values are decided
◮ Termination: every correct process decides
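The two safety properties can be checked on the outcome of a finished run. The sketch below is illustrative only; the function name and the list-based encoding of proposals and decisions are assumptions. Termination is a liveness property and cannot be checked from a finite outcome.

```python
def check_k_set_agreement(proposed, decided, k):
    """Return (validity, agreement) for one finished run.

    proposed: values proposed by the processes
    decided:  values returned by the processes that decided
    """
    decided_values = set(decided)
    validity = decided_values <= set(proposed)   # decided values were proposed
    agreement = len(decided_values) <= k         # at most k different values
    return validity, agreement
```

For k = 2, the outcome [1, 1, 3, 3] over proposals [1, 2, 3, 4] satisfies both properties, whereas [1, 2, 3] violates agreement.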

SLIDE 11

k-Set Agreement Algorithm: Properties

λ = n − k, k ≥ 2
k = m + f, m ≥ 0, f ≥ 1

λ-constrained crashes: 2m
any-time crashes: f − 1
total # of faults: t = 2m + f − 1 = k + m − 1

[Borowsky, Gafni, 93]: Impossible with k any-time crash failures.

SLIDE 12

k-Set Agreement: Parameters

Parameters f and m allow the user to tune the proportion of each type of crash failure.

◮ m = 0, f = k: t = k − 1, max # any-time crashes (= k − 1)
◮ m and f both k/2, rounded down or up so that m + f = k: t = 2m + f − 1
◮ m = k − 1, f = 1: t = 2k − 2, max # λ-constrained crashes (= 2k − 2)
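The trade-off can be tabulated for a given k. The sketch below only restates the formulas of the slides (k = m + f, 2m λ-constrained crashes, f − 1 any-time crashes); the function names and the dictionary layout are hypothetical.

```python
def fault_budget(k: int, m: int) -> dict:
    """Fault budget of the k-set agreement algorithm for a split k = m + f."""
    if k < 2 or not 0 <= m <= k - 1:
        raise ValueError("need k >= 2 and 0 <= m <= k - 1 (so that f >= 1)")
    f = k - m
    return {
        "m": m,
        "f": f,
        "lambda_constrained": 2 * m,   # crashes allowed while contention <= λ
        "any_time": f - 1,             # classical crashes
        "total": 2 * m + f - 1,        # t = k + m - 1
    }

def all_budgets(k: int) -> list:
    """One budget per admissible value of m, from m = 0 to m = k - 1."""
    return [fault_budget(k, m) for m in range(k)]
```

For k = 4, all_budgets(4) ranges from 3 any-time crashes (m = 0) to 6 λ-constrained crashes (m = 3): each unit of m gives up one any-time crash and gains two λ-constrained ones.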

SLIDE 13

k-Set Agreement: Shared Registers (1/2)

DEC: atomic register, initially ⊥
PART[1..n]: snapshot object, initially [down, ..., down]
◮ Atomic (linearizable) operations write() and snapshot()
◮ ≈ array of single-writer multi-reader atomic registers PART[1..n] such that:
  • p_i invokes write(v) = writes v into PART[i]
  • p_i invokes snapshot() = obtains the value of the array PART[1..n] as if it read all its entries simultaneously and instantaneously
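The interface of the snapshot object can be illustrated as follows. This is a sketch, not the read/write-register construction the model assumes: a lock is used to make both operations atomic, the class name is hypothetical, and indices are 0-based with the caller passing its own index i explicitly.

```python
import threading

class SnapshotObject:
    """Lock-based stand-in for PART[1..n] (illustration only)."""

    def __init__(self, n: int, initial="down"):
        self._cells = [initial] * n
        self._lock = threading.Lock()

    def write(self, i: int, v) -> None:
        """Process i writes v into its own entry."""
        with self._lock:
            self._cells[i] = v

    def snapshot(self) -> list:
        """Return a copy of all entries taken at a single point in time."""
        with self._lock:
            return list(self._cells)
```

A snapshot is a copy, so later writes do not change a value already returned to a process.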

SLIDE 14

k-Set Agreement: Shared Registers (2/2)

MUTEX[1]: one-shot deadlock-free f-mutex
MUTEX[2]: one-shot deadlock-free m-mutex
◮ Operations acquire() and release() (invoked at most once)
◮ Properties (stated for the m-mutex; the f-mutex is the same with f in place of m):
  • Mutual exclusion: ≤ m processes simultaneously in critical section
  • Deadlock-freedom: if < m processes crash, then ≥ 1 process invoking acquire() terminates its invocation
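The interface of such an object can be sketched with a counting semaphore. This is only an illustration of the interface: it is not built from read/write registers, the class name is hypothetical, and "invoked at most once" is read here as at most once per process.

```python
import threading

class OneShotLMutex:
    """Semaphore-based stand-in for a one-shot l-mutex (illustration only)."""

    def __init__(self, l: int):
        self._sem = threading.Semaphore(l)   # at most l processes in the CS
        self._used = set()                   # processes that already invoked acquire()
        self._guard = threading.Lock()

    def acquire(self, pid: int, blocking: bool = True) -> bool:
        with self._guard:
            if pid in self._used:
                raise RuntimeError("acquire() may be invoked at most once per process")
            self._used.add(pid)
        return self._sem.acquire(blocking=blocking)

    def release(self) -> None:
        self._sem.release()
```

With l = 2, two processes enter immediately, a third non-blocking attempt fails, and a later process can enter once one of the first two has released.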

SLIDE 18

k-Set Agreement Algorithm (1/2)

operation propose(in_i) is
(1)  PART.write(up);                                % signal participation
(2)  repeat
(3)    part_i := PART.snapshot();                    % wait for n − t
(4)    count_i := |{x such that part_i[x] = up}|;    % participants
(5)  until count_i ≥ n − t end repeat;
(6)  if count_i ≤ λ then                             % split processes into groups
(7)    group_i := 2;                                 % MUTEX[2] (m-mutex)
(8)  else
(9)    group_i := 1;                                 % MUTEX[1] (f-mutex)
(10) end if
(11) launch in parallel the threads T1 and T2;

SLIDE 20

k-Set Agreement Algorithm (2/2)

thread T1 is                          % wait for a decided value
(12) loop forever
(13)   if DEC ≠ ⊥ then
(14)     return(DEC);
(15)   end if;
(16) end loop;

thread T2 is                          % decides a value if it enters its CS
(17) if group_i = 1 ∨ m > 0 then
(18)   MUTEX[group_i].acquire();
(19)   if DEC = ⊥ then
(20)     DEC := in_i;
(21)   end if
(22)   MUTEX[group_i].release();
(23)   return(DEC);
(24) end if;
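The whole operation can be exercised in a crash-free run with Python threads. This is a simplified sketch, not the authors' implementation: a lock stands in for the snapshot object, semaphores stand in for the two mutexes, None plays the role of ⊥, and T1 and T2 are run one after the other rather than in parallel (in a crash-free run T2 always terminates when it is executed). All names are hypothetical.

```python
import threading
import time

def run_k_set_agreement(n, k, m, inputs):
    """Crash-free run of propose() by n threads; returns the decided values."""
    f = k - m
    t = 2 * m + f - 1                # total number of tolerated faults
    lam = n - k                      # contention threshold λ
    part = ["down"] * n              # PART
    part_lock = threading.Lock()     # stands in for the atomic snapshot object
    dec = [None]                     # DEC, None plays the role of ⊥
    mutex = {1: threading.Semaphore(f), 2: threading.Semaphore(m)}
    decided = [None] * n

    def propose(i, value):
        with part_lock:                          # line 1
            part[i] = "up"
        while True:                              # lines 2-5
            with part_lock:
                count = part.count("up")
            if count >= n - t:
                break
            time.sleep(0)                        # let other threads run
        group = 2 if count <= lam else 1         # lines 6-10
        if group == 1 or m > 0:                  # thread T2, lines 17-24
            with mutex[group]:
                if dec[0] is None:
                    dec[0] = value
        else:                                    # only T1 can decide here
            while dec[0] is None:
                time.sleep(0)
        decided[i] = dec[0]

    threads = [threading.Thread(target=propose, args=(i, inputs[i])) for i in range(n)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return decided
```

For example, run_k_set_agreement(6, 3, 1, [10, 11, 12, 13, 14, 15]) returns six values taken from the inputs, with at most 3 distinct values among them; which values are decided depends on the thread schedule.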

SLIDE 23

k-Set Agreement Algorithm: Validity & Agreement

a) Every decided value is a value of DEC
b) DEC is assigned only proposed values in_i, inside a CS
c) MUTEX[1] ⇒ ≤ f different values
   MUTEX[2] ⇒ ≤ m different values
   ⇒ ≤ f + m = k decided values

SLIDE 26

k-Set Agreement Algorithm: Termination (1/5)

a) ≤ t crashes + participation required
   ⇒ eventually count_i ≥ n − t at every correct process p_i
b) ≤ n − k processes with count_i ≤ n − k = λ when leaving loop (2)-(5)
   ⇒ ≤ n − k processes in group 2
c) one process decides ⇒ every correct process decides (T1 returns as soon as DEC ≠ ⊥)

SLIDE 30

k-Set Agreement Algorithm: Termination (2/5)

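d) If m = 0: k = m + f = f
   ◮ f − 1 any-time crashes, n − t correct processes, with n − t = n − (f − 1) = n − k + 1
   ◮ ≤ n − k processes in group 2, hence ≥ 1 correct process in group 1

   [Figure: processes split into Group 1 and Group 2, with labels n − k and f]

   ≥ 1 correct process & ≤ f − 1 (any-time) crashes in group 1 (properties of the deadlock-free f-mutex MUTEX[1]) ⇒ at least one process decides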

SLIDE 34

k-Set Agreement Algorithm: Termination (3/5)

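d) If m > 0, case |group 1| ≥ f:
   ◮ f − 1 any-time crashes, 2m λ-constrained crashes, n − t correct processes
   ◮ a process in group 1 has seen count_i > λ, so only any-time crashes can occur in group 1

   [Figure: processes split into Group 1 and Group 2, with label f]

   ≥ 1 correct process & ≤ f − 1 (any-time) crashes in group 1 (properties of the deadlock-free f-mutex MUTEX[1]) ⇒ at least one process decides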

SLIDE 35

k-Set Agreement Algorithm: Termination (4/5)

d) If m > 0, case |group 1| < f and a correct process ∈ group 1:
   ◮ f − 1 any-time crashes, 2m λ-constrained crashes, n − t correct processes

   [Figure: processes split into Group 1 and Group 2, with label f]

   ≥ 1 correct process & ≤ f − 1 (any-time) crashes in group 1 (properties of the deadlock-free f-mutex MUTEX[1]) ⇒ at least one process decides

SLIDE 37

k-Set Agreement Algorithm: Termination (5/5)

d) If m > 0, case |group 1| < f and no correct process ∈ group 1:
   ◮ f − 1 any-time crashes, 2m λ-constrained crashes, n − t correct processes

   [Figure: processes split into Group 1 and Group 2, with labels n − k, f and m]

   (n − k) − (n − t) = t − k = (2m + f − 1) − (m + f) = m − 1

   ≥ 1 correct process & ≤ m − 1 crashes in group 2 (properties of the deadlock-free m-mutex MUTEX[2]) ⇒ at least one process decides
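The counting step of this last case can be replayed numerically: group 2 holds at most n − k processes and, in this case, all of the at least n − t correct ones, which leaves at most (n − k) − (n − t) crashed processes in it. The helper below is a hypothetical sketch that only re-evaluates this formula.

```python
def max_crashes_in_group2(n: int, k: int, m: int) -> int:
    """Upper bound on crashed processes in group 2 when all correct ones are there."""
    f = k - m
    t = 2 * m + f - 1          # total fault budget
    max_group2 = n - k         # at most λ = n - k processes join group 2
    min_correct = n - t        # at least n - t processes are correct
    return max_group2 - min_correct
```

The result equals m − 1 for every admissible split, which is strictly less than m, so the deadlock-freedom property of the m-mutex applies.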

SLIDE 38

k-Set Agreement Algorithm: Properties

λ = n − k, k ≥ 2
k = m + f, m ≥ 0, f ≥ 1

λ-constrained crashes: 2m
any-time crashes: f − 1
total # of faults: t = 2m + f − 1 = k + m − 1

SLIDE 39

k-Set Agreement Algorithm: Generalization

λ = n − ℓ, k ≤ ℓ ≤ n, k ≥ 2
k = m + f, m ≥ 0, f ≥ 1

λ-constrained crashes: 2m + ℓ − k
any-time crashes: f − 1
total # of faults: t = 2m + ℓ − k + f − 1
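The generalized budget can be evaluated directly from these formulas. The sketch below is illustrative; the function name and the parameter name ell (for ℓ) are hypothetical.

```python
def generalized_budget(n: int, k: int, m: int, ell: int) -> dict:
    """Fault budget of the generalized algorithm with threshold λ = n - ell."""
    if not (k >= 2 and 0 <= m <= k - 1 and k <= ell <= n):
        raise ValueError("need k >= 2, 0 <= m <= k - 1 and k <= ell <= n")
    f = k - m
    lambda_constrained = 2 * m + ell - k
    any_time = f - 1
    return {
        "lambda": n - ell,
        "lambda_constrained": lambda_constrained,
        "any_time": any_time,
        "total": lambda_constrained + any_time,
    }
```

With ℓ = k the values coincide with the basic algorithm (λ = n − k, t = 2m + f − 1); each unit added to ℓ lowers the threshold λ by one and adds one λ-constrained crash to the budget.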

SLIDE 40

Conclusion

Notion of contention-related crash failures
◮ Makes it possible to circumvent impossibility results
◮ Better understanding of fault tolerance: in the k-set agreement algorithm, one “strong” any-time failure can be traded for 2 “weak” (n − k)-constrained failures

Future work:
◮ Tight bounds?
◮ General algorithm for k-set agreement, ∀k ≥ 1.
◮ What about crashes after the contention threshold λ?
◮ What about other definitions of weak crash failures?

SLIDE 41

Thank you for your attention! Do you have any questions?

SLIDE 42

Renaming

SLIDE 43

Renaming

Definition [Attiya et al., 90]

Initial name: id_i
New name space: {1..M}
Operation rename(id_i): returns a new name
Properties:
◮ Validity: every new name ∈ {1..M}
◮ Agreement: no two processes obtain the same new name
◮ Termination: every invocation of rename() by a correct process terminates
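As for k-set agreement, the safety properties can be checked on the outcome of a finished run. The function name and the list encoding of the new names are assumptions made for this sketch; termination cannot be checked from a finite outcome.

```python
def check_renaming(new_names, M: int):
    """Return (validity, agreement) for the names obtained in one run."""
    validity = all(1 <= name <= M for name in new_names)    # names in {1..M}
    agreement = len(set(new_names)) == len(new_names)       # no name used twice
    return validity, agreement
```

With M = 5, the outcome [1, 4, 2] satisfies both properties, [1, 1, 2] violates agreement, and [1, 6] violates validity.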

SLIDE 44

Renaming Algorithm: Properties

M = n + f, λ = n − t − 1
t = m + f, m ≥ 0, f ≥ 0

λ-constrained crashes: m
any-time crashes: f
total # of faults: t = m + f

[Herlihy, Shavit, 93]: Impossible with f + 1 any-time crash failures.

SLIDE 45

Renaming Algorithm: Parameters

Parameters f and m allow the user to tune the proportion of each type of crash failure and the size of the new name space.

◮ m = 0, f = t: M = n + t, max # any-time crashes (= t)
◮ m and f both t/2, rounded down or up so that m + f = t: M = n + f
◮ m = t, f = 0: M = n, max # λ-constrained crashes (= t)

SLIDE 46

Renaming Algorithm: Shared Registers

PART[1..n]: snapshot object, initially [down, ..., down]
RENAMING_f: (n + f)-renaming object that:
◮ tolerates ≤ f any-time crash failures
◮ does not require participation
e.g. [Attiya, Welch, 04]

SLIDE 48

Renaming Algorithm

operation rename(id_i) is
(1) PART.write(up);                                % signal participation
(2) repeat
(3)   part_i := PART.snapshot();                    % wait for n − t
(4)   count_i := |{x such that part_i[x] = up}|;    % participants
(5) until count_i ≥ n − t end repeat;
(6) newName_i := RENAMING_f.rename(id_i);           % get new name
(7) return(newName_i);

SLIDE 51

Renaming Algorithm: Proof

a) ≤ t crashes + participation required
   ⇒ eventually count_i ≥ n − t at every correct process p_i
b) n − t > λ ⇒ no λ-constrained crashes in RENAMING_f
   ⇒ ≤ f crashes in RENAMING_f
c) participation not required for RENAMING_f + properties of RENAMING_f
   ⇒ validity, agreement, & termination

SLIDE 52

Generalization to One-Shot Concurrent Objects

Transform OB = one-shot object tolerating < X any-time crashes, participation not required

λ = n − t − 1
t = m + f, m ≥ 0, 0 ≤ f ≤ X

λ-constrained crashes: m
any-time crashes: f ≤ X
total # of faults: t = m + f

operation op(in_i) is
(1) PART.write(up);
(2) repeat
(3)   part_i := PART.snapshot();
(4)   count_i := |{x such that part_i[x] = up}|;
(5) until count_i ≥ n − t end repeat;
(6) res_i := OB.op(in_i);
(7) return(res_i);
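The transformation can be sketched as a wrapper that waits for n − t participants before delegating to the underlying object. This is a crash-free illustration with Python threads, not the authors' code: a lock stands in for the snapshot object, and the underlying object is a toy lock-based renaming that is not itself crash-tolerant. All names (ParticipationGate, make_toy_renaming, run_demo) are hypothetical.

```python
import threading
import time

class ParticipationGate:
    """Lines 1-7 of op(): signal participation, wait, then call OB.op()."""

    def __init__(self, n: int, t: int, ob_op):
        self.n, self.t = n, t
        self._ob_op = ob_op                  # operation of the wrapped object OB
        self._part = ["down"] * n            # PART
        self._lock = threading.Lock()        # stands in for the snapshot object

    def op(self, i: int, value):
        with self._lock:                     # line 1
            self._part[i] = "up"
        while True:                          # lines 2-5
            with self._lock:
                count = self._part.count("up")
            if count >= self.n - self.t:
                break
            time.sleep(0)                    # let other threads run
        return self._ob_op(value)            # lines 6-7

def make_toy_renaming():
    """Toy renaming: hands out 1, 2, 3, ... under a lock (illustration only)."""
    lock = threading.Lock()
    next_name = [0]
    def rename(_old_id):
        with lock:
            next_name[0] += 1
            return next_name[0]
    return rename

def run_demo(n: int, t: int) -> list:
    """Crash-free run: n threads rename themselves through the gate."""
    gate = ParticipationGate(n, t, make_toy_renaming())
    names = [None] * n
    def worker(i):
        names[i] = gate.op(i, 100 + i)       # 100 + i is an arbitrary initial name
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return names
```

In a crash-free run, run_demo(4, 1) hands out the names 1 to 4 in some schedule-dependent order. The point of the gate is that no process reaches OB before at least n − t > λ processes have started, so only any-time crashes can occur inside OB.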
