Global Predicate Detection and Event Ordering Our Problem To - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Global Predicate Detection and Event Ordering

slide-2
SLIDE 2

Our Problem

To compute predicates over the state of a distributed application

slide-3
SLIDE 3

Model

Message passing
No failures
Two possible timing assumptions:

  • 1. Synchronous System
  • 2. Asynchronous System
      • No upper bound on message delivery time
      • No bound on relative process speeds
      • No centralized clock

slide-4
SLIDE 4

Asynchronous systems

Weakest possible assumptions

  • cf. “finite progress axiom”

Weak assumptions ≡ fewer vulnerabilities
Asynchronous ≠ slow
“Interesting” model w.r.t. failures (ah ah ah!)

slide-5
SLIDE 5

Client-Server

Processes exchange messages using Remote Procedure Call (RPC)

A client requests a service by sending the server a message. The client blocks while waiting for a response

[Figure: client c and server s]

slide-6
SLIDE 6

Client-Server

Processes exchange messages using Remote Procedure Call (RPC)

A client requests a service by sending the server a message. The client blocks while waiting for a response

The server computes the response (possibly asking other servers) and returns it to the client

[Figure: client c and server s]

slide-7
SLIDE 7

Deadlock!

[Figure: processes $p_1$, $p_2$, $p_3$]

slide-8
SLIDE 8

Goal

Design a protocol by which a processor can determine whether a global predicate (say, deadlock) holds

slide-9
SLIDE 9

Wait-For Graphs

Draw arrow from $p_i$ to $p_j$ if $p_j$ has received a request but has not responded yet

slide-10
SLIDE 10

Wait-For Graphs

Draw arrow from $p_i$ to $p_j$ if $p_j$ has received a request but has not responded yet

Cycle in WFG $\Rightarrow$ deadlock
Deadlock $\Rightarrow \Diamond$ cycle in WFG
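As a minimal sketch of the check behind these slides, the wait-for graph can be held as an adjacency map and searched for a cycle with a depth-first traversal. The function name `find_cycle` and the dictionary layout (each process maps to the processes it is waiting on) are assumptions made for illustration, not part of the original deck.

```python
from typing import Dict, List, Optional


def find_cycle(wfg: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle of the wait-for graph as a list of processes, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    nodes = set(wfg) | {q for targets in wfg.values() for q in targets}
    colour = {p: WHITE for p in nodes}
    path: List[str] = []

    def visit(p: str) -> Optional[List[str]]:
        colour[p] = GREY
        path.append(p)
        for q in wfg.get(p, []):
            if colour[q] == GREY:          # back edge: q is already on the path
                return path[path.index(q):]
            if colour[q] == WHITE:
                found = visit(q)
                if found:
                    return found
        path.pop()
        colour[p] = BLACK
        return None

    for p in sorted(nodes):
        if colour[p] == WHITE:
            cycle = visit(p)
            if cycle:
                return cycle
    return None


# Hypothetical example: p1 waits on p2, p2 on p3, p3 on p1.
deadlocked = {"p1": ["p2"], "p2": ["p3"], "p3": ["p1"]}
print(find_cycle(deadlocked))
```

The search only says whether the graph it is given contains a cycle; whether that graph reflects a state the system could actually have been in is a separate question.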

slide-11
SLIDE 11

The protocol

$p_0$ sends a message to $p_1 \ldots p_3$
On receipt of $p_0$'s message, $p_i$ replies with its state and wait-for info

slide-12
SLIDE 12

An execution

[Figure: an execution of $p_1$, $p_2$, $p_3$]

slide-14
SLIDE 14

An execution

Ghost Deadlock!

[Figure: an execution of $p_1$, $p_2$, $p_3$]

slide-15
SLIDE 15

Houston, we have a problem...

Asynchronous system: no centralized clock, etc. etc.
Synchrony useful to

  • coordinate actions
  • order events

Mmmmhhh...

slide-16
SLIDE 16

Events and Histories

Processes execute sequences of events
Events can be of 3 types: local, send, and receive
$e_p^i$ is the $i$-th event of process $p$
The local history $h_p$ of process $p$ is the sequence of events executed by process $p$

  • $h_p^k$: prefix that contains first $k$ events
  • $h_p^0$: initial, empty sequence

The history $H$ is the set $h_{p_0} \cup h_{p_1} \cup \ldots \cup h_{p_{n-1}}$

NOTE: In $H$, local histories are interpreted as sets, rather than sequences, of events

slide-17
SLIDE 17

Ordering events

Observation 1: Events in a local history are totally ordered

[Figure: timeline of $p_i$ with its events ordered along the time axis]

slide-18
SLIDE 18

Ordering events

Observation 1: Events in a local history are totally ordered
Observation 2: For every message $m$, $send(m)$ precedes $receive(m)$

[Figure: timelines of $p_i$ and $p_j$, with message $m$ drawn from $send(m)$ to $receive(m)$]

slide-19
SLIDE 19

Happened-before (Lamport[1978])

A binary relation $\rightarrow$ defined over events

  • 1. if $e_i^k, e_i^l \in h_i$ and $k < l$, then $e_i^k \rightarrow e_i^l$
  • 2. if $e_i = send(m)$ and $e_j = receive(m)$, then $e_i \rightarrow e_j$
  • 3. if $e \rightarrow e'$ and $e' \rightarrow e''$, then $e \rightarrow e''$
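The three rules can be turned directly into a small computation. The sketch below is one possible encoding, not the deck's own: events are `(process, index)` pairs, `histories` gives the number of events per process, and `messages` lists `(send_event, receive_event)` pairs; all of these names are assumptions for illustration. Rule 3 is applied by brute-force transitive closure, which is fine for toy examples only.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

Event = Tuple[str, int]  # (process name, position in its local history)


def happened_before(histories: Dict[str, int],
                    messages: List[Tuple[Event, Event]]) -> Set[Tuple[Event, Event]]:
    """Return every pair (a, b) such that a -> b."""
    events = [(p, k) for p, n in histories.items() for k in range(1, n + 1)]
    hb: Set[Tuple[Event, Event]] = set()
    # Rule 1: events of the same process are ordered by their index.
    for (p, k), (q, l) in product(events, events):
        if p == q and k < l:
            hb.add(((p, k), (q, l)))
    # Rule 2: send(m) -> receive(m).
    hb.update(messages)
    # Rule 3: transitivity, repeated until nothing new is added.
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(hb), list(hb)):
            if b == c and (a, d) not in hb:
                hb.add((a, d))
                changed = True
    return hb


def concurrent(hb: Set[Tuple[Event, Event]], a: Event, b: Event) -> bool:
    """Two distinct events are concurrent when neither happened before the other."""
    return a != b and (a, b) not in hb and (b, a) not in hb


# Hypothetical execution: p1's second event sends a message received by p2's first event.
hb = happened_before({"p1": 2, "p2": 2, "p3": 1}, [(("p1", 2), ("p2", 1))])
print((("p1", 1), ("p2", 2)) in hb)            # ordered through the message
print(concurrent(hb, ("p3", 1), ("p1", 1)))    # p3 never communicates
```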

slide-20
SLIDE 20

Space-Time diagrams

A graphic representation of a distributed execution

[Figure: space-time diagram of $p_1$, $p_2$, $p_3$ along a time axis]

slide-24
SLIDE 24

Space-Time diagrams

A graphic representation of a distributed execution

[Figure: space-time diagram of $p_1$, $p_2$, $p_3$ along a time axis]

$H$ and $\rightarrow$ impose a partial order

slide-28
SLIDE 28

Runs and Consistent Runs

A run is a total ordering of the events in $H$ that is consistent with the local histories of the processors

  • Ex: $h_1, h_2, \ldots, h_n$ is a run

A run is consistent if the total order imposed in the run is an extension of the partial order induced by $\rightarrow$

A single distributed computation may correspond to several consistent runs!
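To make the distinction between a run and a consistent run concrete, here is a small sketch under assumed names: `order` holds the generating pairs of $\rightarrow$ (local order plus one pair per message), and `is_consistent_run` checks that a proposed total order respects all of them. The event labels are hypothetical.

```python
from typing import Hashable, Iterable, List, Tuple


def is_consistent_run(run: List[Hashable],
                      order: Iterable[Tuple[Hashable, Hashable]]) -> bool:
    """True if the total order `run` extends the partial order given by `order`."""
    position = {event: i for i, event in enumerate(run)}
    return all(position[a] < position[b] for a, b in order)


# Hypothetical execution: p1 executes a1, a2; p2 executes b1, b2;
# b1 sends a message that a2 receives.
order = [("a1", "a2"), ("b1", "b2"), ("b1", "a2")]

print(is_consistent_run(["a1", "a2", "b1", "b2"], order))  # h1 followed by h2
print(is_consistent_run(["a1", "b1", "a2", "b2"], order))
print(is_consistent_run(["b1", "a1", "b2", "a2"], order))
```

The first ordering is a run (it respects both local histories) but places a receive before its send, so it is not consistent; the other two are distinct consistent runs of the same computation.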

slide-29
SLIDE 29

Cuts

A cut $C$ is a subset of the global history $H$:

$C = h_1^{c_1} \cup h_2^{c_2} \cup \ldots \cup h_n^{c_n}$

[Figure: space-time diagram of $p_1$, $p_2$, $p_3$ with a cut]

slide-30
SLIDE 30

Cuts

A cut $C$ is a subset of the global history $H$:

$C = h_1^{c_1} \cup h_2^{c_2} \cup \ldots \cup h_n^{c_n}$

The frontier of $C$ is the set of events $e_1^{c_1}, e_2^{c_2}, \ldots, e_n^{c_n}$

[Figure: space-time diagram of $p_1$, $p_2$, $p_3$ with a cut]

slide-31
SLIDE 31

Global states and cuts

The global state of a distributed computation is an $n$-tuple of local states $\Sigma = (\sigma_1, \ldots, \sigma_n)$

To each cut $(c_1, \ldots, c_n)$ corresponds a global state $(\sigma_1^{c_1}, \ldots, \sigma_n^{c_n})$

slide-32
SLIDE 32

Consistent cuts and consistent global states

A cut is consistent if $\forall e_i, e_j : e_j \in C \wedge e_i \rightarrow e_j \Rightarrow e_i \in C$

A consistent global state is one corresponding to a consistent cut
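A short sketch of the consistency test, under assumed names: a cut is a map from process to the number of its events included ($c_p$), and `messages` lists `(send_event, receive_event)` pairs with events written as `(process, index)`. Because a cut is a union of prefixes, local order is closed automatically, so it is enough to check that no included receive is missing its send.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, int]  # (process name, position in its local history)


def in_cut(cut: Dict[str, int], event: Event) -> bool:
    """An event is in the cut if it lies within its process's included prefix."""
    process, index = event
    return index <= cut.get(process, 0)


def is_consistent_cut(cut: Dict[str, int],
                      messages: List[Tuple[Event, Event]]) -> bool:
    """Every receive event in the cut must have its send event in the cut."""
    return all(in_cut(cut, send) for send, recv in messages if in_cut(cut, recv))


# Hypothetical execution: p1's second event sends a message received by p2's first event.
messages = [(("p1", 2), ("p2", 1))]
print(is_consistent_cut({"p1": 1, "p2": 1}, messages))  # receive without its send
print(is_consistent_cut({"p1": 2, "p2": 1}, messages))
print(is_consistent_cut({"p1": 2, "p2": 0}, messages))  # message still in flight
```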

slide-33
SLIDE 33

What $p_0$ sees

[Figure: space-time diagram of $p_1$, $p_2$, $p_3$]

slide-34
SLIDE 34

What $p_0$ sees

Not a consistent global state: the cut contains the event corresponding to the receipt of the last message by $p_3$ but not the corresponding send event

[Figure: space-time diagram of $p_1$, $p_2$, $p_3$]

slide-35
SLIDE 35

Our task

Develop a protocol by which a processor can build a consistent global state
Informally, we want to be able to take a snapshot of the computation
Not obvious in an asynchronous system...

slide-36
SLIDE 36

Our approach

Develop a simple synchronous protocol
Refine protocol as we relax assumptions
Record:

  • processor states
  • channel states

Assumptions:

  • FIFO channels
  • Each $m$ timestamped with $T(send(m))$

slide-37
SLIDE 37

Snapshot I

  • i. $p_0$ selects $t_{ss}$
  • ii. $p_0$ sends “take a snapshot at $t_{ss}$” to all processes
  • iii. when clock of $p_i$ reads $t_{ss}$ then $p_i$
      • a. records its local state $\sigma_i$
      • b. starts recording messages received on each of its incoming channels
      • c. stops recording a channel when it receives the first message with timestamp greater than or equal to $t_{ss}$

slide-38
SLIDE 38

Snapshot I

  • i. $p_0$ selects $t_{ss}$
  • ii. $p_0$ sends “take a snapshot at $t_{ss}$” to all processes
  • iii. when clock of $p_i$ reads $t_{ss}$ then $p_i$
      • a. records its local state $\sigma_i$
      • b. sends an empty message along its outgoing channels
      • c. starts recording messages received on each of its incoming channels
      • d. stops recording a channel when it receives the first message with timestamp greater than or equal to $t_{ss}$
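The channel-recording rule of step iii can be sketched as a small function. This is an illustration only: `record_channel` and the `(timestamp, payload)` layout are assumed names, and the list passed in is taken to be the messages arriving on one FIFO channel after the receiver has recorded its local state.

```python
from typing import List, Tuple


def record_channel(arrivals: List[Tuple[int, str]], tss: int) -> List[str]:
    """Return the payloads that make up the recorded state of one incoming channel.

    Recording stops at the first message whose timestamp T(send(m)) is >= tss:
    with FIFO channels, everything after it was also sent at or after tss.
    """
    recorded: List[str] = []
    for timestamp, payload in arrivals:
        if timestamp >= tss:
            break
        recorded.append(payload)
    return recorded


# Hypothetical arrivals on one channel; the empty payload stands for the
# empty message that the sender emits when its own clock reads tss.
arrivals = [(7, "a"), (9, "b"), (10, ""), (11, "c")]
print(record_channel(arrivals, tss=10))
```

The empty message matters because a channel that carries no application traffic after $t_{ss}$ would otherwise never deliver a message that ends the recording.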

slide-39
SLIDE 39

Correctness

Theorem: Snapshot I produces a consistent cut

Proof

Need to prove $e_j \in C \wedge e_i \rightarrow e_j \Rightarrow e_i \in C$

  • 0. $e_j \in C \equiv T(e_j) < t_{ss}$   < Definition >
  • 1. $e_j \in C$   < Assumption >
  • 2. $e_i \rightarrow e_j$   < Assumption >
  • 3. $T(e_j) < t_{ss}$   < 0 and 1 >
  • 4. $e_i \rightarrow e_j \Rightarrow T(e_i) < T(e_j)$   < Property of real time >
  • 5. $T(e_i) < T(e_j)$   < 2 and 4 >
  • 6. $T(e_i) < t_{ss}$   < 5 and 3 >
  • 7. $e_i \in C$   < Definition >
slide-40
SLIDE 40

Clock Condition

  • 4. $e_i \rightarrow e_j \Rightarrow T(e_i) < T(e_j)$   < Property of real time >

Can the Clock Condition be implemented some other way?
slide-41
SLIDE 41

Lamport Clocks

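Each process maintains a local variable $LC$
$LC(e) \equiv$ value of $LC$ for event $e$

  • Consecutive events $e_p^i$, $e_p^{i+1}$ on process $p$: $LC(e_p^i) < LC(e_p^{i+1})$
  • Message sent at $e_p^i$ on process $p$ and received at $e_q^j$ on process $q$: $LC(e_p^i) < LC(e_q^j)$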

slide-42
SLIDE 42

Increment Rules

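  • Consecutive events $e_p^i$, $e_p^{i+1}$ on process $p$: $LC(e_p^{i+1}) = LC(e_p^i) + 1$
  • Message sent at $e_p^i$ on process $p$ and received at $e_q^j$ on process $q$: $LC(e_q^j) = \max(LC(e_q^{j-1}), LC(e_p^i)) + 1$

Timestamp $m$ with $TS(m) = LC(send(m))$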
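The increment rules fit in a few lines. The class below is a minimal sketch; `LamportClock` and its method names are assumptions chosen for illustration, and the clock starts at 0 so that the first event gets value 1.

```python
class LamportClock:
    """Logical clock of one process, following the two increment rules."""

    def __init__(self) -> None:
        self.value = 0

    def tick(self) -> int:
        """Local event: LC(e^{i+1}) = LC(e^i) + 1."""
        self.value += 1
        return self.value

    def send(self) -> int:
        """Send event: advance like a local event; the result is TS(m)."""
        return self.tick()

    def receive(self, ts: int) -> int:
        """Receive event: LC = max(own previous LC, TS(m)) + 1."""
        self.value = max(self.value, ts) + 1
        return self.value


# Hypothetical exchange between two processes p and q.
p, q = LamportClock(), LamportClock()
p.tick()                 # local event on p
ts = p.send()            # p sends m, timestamped TS(m) = LC(send(m))
q.receive(ts)            # q's clock jumps past TS(m)
reply = q.send()
p.receive(reply)
print(p.value, q.value)
```

By construction every receive gets a larger value than the matching send, and values grow along each local history, which is what the Clock Condition requires.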

slide-43
SLIDE 43

Space-Time Diagrams and Logical Clocks

[Figure: space-time diagram of $p_1$, $p_2$, $p_3$ with each event labeled by its logical clock value, from 1 to 9]

slide-44
SLIDE 44

A subtle problem

“when $LC = t$ do $S$” doesn’t make sense for Lamport clocks!

  • there is no guarantee that $LC$ will ever be $t$
  • $S$ is anyway executed after $LC = t$

Fixes:

  • if $e$ is internal/send and $LC = t - 2$:
      • execute $e$ and then $S$
  • if $e = receive(m) \wedge (TS(m) \geq t) \wedge (LC \leq t - 1)$:
      • put message back in channel
      • re-enable $e$; set $LC = t - 1$; execute $S$

slide-45
SLIDE 45

An obvious problem

No $t_{ss}$!
Choose $\Omega$ large enough that it cannot be reached by applying the update rules of logical clocks

slide-46
SLIDE 46

An obvious problem

No $t_{ss}$!
Choose $\Omega$ large enough that it cannot be reached by applying the update rules of logical clocks
mmmmhhhh...

slide-47
SLIDE 47

An obvious problem

No $t_{ss}$!
Choose $\Omega$ large enough that it cannot be reached by applying the update rules of logical clocks
mmmmhhhh...
Doing so assumes

  • upper bound on message delivery time
  • upper bound on relative process speeds

We better relax it...

slide-48
SLIDE 48

Snapshot II

  • processor $p_0$ selects $\Omega$
  • $p_0$ sends “take a snapshot at $\Omega$” to all processes; it waits for all of them to reply and then sets its logical clock to $\Omega$
  • when clock of $p_i$ reads $\Omega$ then $p_i$
      • records its local state $\sigma_i$
      • sends an empty message along its outgoing channels
      • starts recording messages received on each incoming channel
      • stops recording a channel when it receives the first message with timestamp greater than or equal to $\Omega$

slide-49
SLIDE 49

Relaxing synchrony

[Figure: timeline of $p_i$ with labels: “take a snapshot at $\Omega$”; empty message: $TS(m) \geq \Omega$; monitors channels; records local state $\sigma_i$; sends empty message: $TS(m) \geq \Omega$]

Process $p_i$ does nothing for the protocol during this time!

Use empty message to announce snapshot!

slide-50
SLIDE 50

Snapshot III

  • processor $p_0$ sends itself “take a snapshot”
  • when $p_i$ receives “take a snapshot” for the first time from $p_j$:
      • records its local state $\sigma_i$
      • sends “take a snapshot” along its outgoing channels
      • sets channel from $p_j$ to empty
      • starts recording messages received over each of its other incoming channels
  • when $p_i$ receives “take a snapshot” beyond the first time from $p_k$:
      • stops recording channel from $p_k$
  • when $p_i$ has received “take a snapshot” on all channels, it sends collected state to $p_0$ and stops.
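The marker rules of Snapshot III can be exercised in a small single-threaded simulation. Everything below is a sketch under stated assumptions rather than the deck's own code: processes hold an integer balance, application messages are transfers, the network is fully connected with FIFO channels, and the names `Process`, `Network`, `transfer`, `drain`, and `snapshot_total` are hypothetical. The initiator's "send itself" step is modeled by calling `record` directly, and the final step of shipping collected state to $p_0$ is left out.

```python
from collections import deque

MARKER = "take a snapshot"


class Process:
    def __init__(self, name, balance, peers):
        self.name, self.balance, self.peers = name, balance, peers
        self.recorded_state = None   # sigma_i, once recorded
        self.channel_state = {}      # sender -> messages recorded on that channel
        self.recording = set()       # incoming channels still being recorded

    def record(self, net, marker_from=None):
        """First marker: record local state, relay markers, start recording."""
        self.recorded_state = self.balance
        for q in self.peers:
            self.channel_state[q] = []          # the marker's own channel stays empty
            net.send(self.name, q, MARKER)
        self.recording = {q for q in self.peers if q != marker_from}

    def deliver(self, net, sender, msg):
        if msg == MARKER:
            if self.recorded_state is None:
                self.record(net, marker_from=sender)
            else:
                self.recording.discard(sender)  # later marker: stop recording
        else:
            self.balance += msg
            if sender in self.recording:
                self.channel_state[sender].append(msg)

    def done(self):
        return self.recorded_state is not None and not self.recording


class Network:
    def __init__(self, names, balance):
        self.channels = {}  # (src, dst) -> FIFO queue
        self.processes = {n: Process(n, balance, [m for m in names if m != n])
                          for n in names}

    def send(self, src, dst, msg):
        self.channels.setdefault((src, dst), deque()).append(msg)

    def transfer(self, src, dst, amount):
        self.processes[src].balance -= amount
        self.send(src, dst, amount)

    def drain(self):
        """Deliver one message per non-empty channel, round after round."""
        progress = True
        while progress:
            progress = False
            for (src, dst), queue in sorted(self.channels.items()):
                if queue:
                    self.processes[dst].deliver(self, src, queue.popleft())
                    progress = True


def snapshot_total(net):
    """Sum of recorded local states plus all recorded in-flight transfers."""
    return sum(p.recorded_state + sum(sum(m) for m in p.channel_state.values())
               for p in net.processes.values())


net = Network(["p0", "p1", "p2"], balance=10)
net.transfer("p1", "p2", 3)          # in flight when the snapshot starts
net.processes["p0"].record(net)      # p0 initiates
net.transfer("p2", "p0", 2)          # sent before p2 sees any marker
net.drain()
print(snapshot_total(net))
```

In this toy setting the total amount of money is an invariant of every consistent global state, so the recorded local states plus the recorded channel contents should add up to the initial 30.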

slide-51
SLIDE 51

Snapshots: a perspective

The global state $\Sigma^s$ saved by the snapshot protocol is a consistent global state

slide-52
SLIDE 52

Snapshots: a perspective

The global state $\Sigma^s$ saved by the snapshot protocol is a consistent global state
But did it ever occur during the computation?

  • a distributed computation provides only a partial order of events
  • many total orders (runs) are compatible with that partial order
  • all we know is that $\Sigma^s$ could have occurred

slide-53
SLIDE 53

Snapshots: a perspective

The global state $\Sigma^s$ saved by the snapshot protocol is a consistent global state
But did it ever occur during the computation?

  • a distributed computation provides only a partial order of events
  • many total orders (runs) are compatible with that partial order
  • all we know is that $\Sigma^s$ could have occurred

We are evaluating predicates on states that may have never occurred!

slide-54
SLIDE 54

An Execution and its Lattice

[Figure: space-time diagram of $p_1$ (events $e_1^1 \ldots e_1^6$) and $p_2$ (events $e_2^1 \ldots e_2^5$)]

slide-55
SLIDE 55

An Execution and its Lattice

[Figure: space-time diagram of $p_1$ (events $e_1^1 \ldots e_1^6$) and $p_2$ (events $e_2^1 \ldots e_2^5$)]

Lattice so far: $\Sigma^{00}$

slide-56
SLIDE 56

An Execution and its Lattice

[Figure: space-time diagram of $p_1$ (events $e_1^1 \ldots e_1^6$) and $p_2$ (events $e_2^1 \ldots e_2^5$)]

Lattice so far: $\Sigma^{00}$, $\Sigma^{10}$

slide-57
SLIDE 57

An Execution and its Lattice

[Figure: space-time diagram of $p_1$ (events $e_1^1 \ldots e_1^6$) and $p_2$ (events $e_2^1 \ldots e_2^5$)]

Lattice so far: $\Sigma^{00}$, $\Sigma^{10}$, $\Sigma^{01}$

slide-58
SLIDE 58

An Execution and its Lattice

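[Figure: space-time diagram of $p_1$ (events $e_1^1 \ldots e_1^6$) and $p_2$ (events $e_2^1 \ldots e_2^5$)]

Lattice so far: $\Sigma^{00}$, $\Sigma^{10}$, $\Sigma^{01}$, $\Sigma^{11}$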

slide-60
SLIDE 60

An Execution and its Lattice

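[Figure: space-time diagram of $p_1$ (events $e_1^1 \ldots e_1^6$) and $p_2$ (events $e_2^1 \ldots e_2^5$)]

Lattice so far: $\Sigma^{00}$, $\Sigma^{10}$, $\Sigma^{01}$, $\Sigma^{11}$, $\Sigma^{02}$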

slide-61
SLIDE 61

An Execution and its Lattice

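[Figure: space-time diagram of $p_1$ (events $e_1^1 \ldots e_1^6$) and $p_2$ (events $e_2^1 \ldots e_2^5$)]

Lattice so far: $\Sigma^{00}$, $\Sigma^{10}$, $\Sigma^{01}$, $\Sigma^{11}$, $\Sigma^{02}$, $\Sigma^{12}$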

slide-62
SLIDE 62

An Execution and its Lattice

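[Figure: space-time diagram of $p_1$ (events $e_1^1 \ldots e_1^6$) and $p_2$ (events $e_2^1 \ldots e_2^5$)]

Lattice so far: $\Sigma^{00}$, $\Sigma^{10}$, $\Sigma^{01}$, $\Sigma^{11}$, $\Sigma^{02}$, $\Sigma^{12}$, $\Sigma^{21}$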

slide-63
SLIDE 63

An Execution and its Lattice

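[Figure: space-time diagram of $p_1$ (events $e_1^1 \ldots e_1^6$) and $p_2$ (events $e_2^1 \ldots e_2^5$)]

Lattice so far: $\Sigma^{00}$, $\Sigma^{10}$, $\Sigma^{01}$, $\Sigma^{11}$, $\Sigma^{02}$, $\Sigma^{12}$, $\Sigma^{21}$, $\Sigma^{22}$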

slide-64
SLIDE 64

An Execution and its Lattice

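[Figure: space-time diagram of $p_1$ (events $e_1^1 \ldots e_1^6$) and $p_2$ (events $e_2^1 \ldots e_2^5$)]

Lattice so far: $\Sigma^{00}$, $\Sigma^{10}$, $\Sigma^{01}$, $\Sigma^{11}$, $\Sigma^{02}$, $\Sigma^{12}$, $\Sigma^{21}$, $\Sigma^{22}$, $\Sigma^{32}$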

slide-65
SLIDE 65

An Execution and its Lattice

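[Figure: space-time diagram of $p_1$ (events $e_1^1 \ldots e_1^6$) and $p_2$ (events $e_2^1 \ldots e_2^5$)]

Lattice so far: $\Sigma^{00}$, $\Sigma^{10}$, $\Sigma^{01}$, $\Sigma^{11}$, $\Sigma^{02}$, $\Sigma^{12}$, $\Sigma^{21}$, $\Sigma^{22}$, $\Sigma^{32}$, $\Sigma^{42}$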

slide-67
SLIDE 67

An Execution and its Lattice

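[Figure: space-time diagram of $p_1$ (events $e_1^1 \ldots e_1^6$) and $p_2$ (events $e_2^1 \ldots e_2^5$)]

Complete lattice: $\Sigma^{00}$, $\Sigma^{10}$, $\Sigma^{01}$, $\Sigma^{11}$, $\Sigma^{02}$, $\Sigma^{12}$, $\Sigma^{21}$, $\Sigma^{22}$, $\Sigma^{32}$, $\Sigma^{42}$, $\Sigma^{03}$, $\Sigma^{04}$, $\Sigma^{14}$, $\Sigma^{13}$, $\Sigma^{23}$, $\Sigma^{24}$, $\Sigma^{31}$, $\Sigma^{41}$, $\Sigma^{43}$, $\Sigma^{33}$, $\Sigma^{34}$, $\Sigma^{44}$, $\Sigma^{35}$, $\Sigma^{45}$, $\Sigma^{55}$, $\Sigma^{65}$, $\Sigma^{64}$, $\Sigma^{63}$, $\Sigma^{53}$, $\Sigma^{54}$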

slide-68
SLIDE 68

Reachability

$\Sigma^{kl}$ is reachable from $\Sigma^{ij}$ if there is a path from $\Sigma^{ij}$ to $\Sigma^{kl}$ in the lattice

[Figure: the lattice of global states, $\Sigma^{00}$ through $\Sigma^{65}$, with $\Sigma^{21}$ and $\Sigma^{55}$ marked]
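For two processes, the lattice and the reachability relation are small enough to compute by brute force. The sketch below uses assumed names (`consistent_states`, `reachable`) and a hypothetical two-process execution, not the one drawn in the figure: a state $\Sigma^{ij}$ is the pair `(i, j)`, and each message is written as `((sender, send_index), (receiver, receive_index))` with processes numbered 1 and 2.

```python
from itertools import product


def consistent_states(n1, n2, messages):
    """All (i, j) such that the cut with i events of p1 and j of p2 is consistent."""
    def ok(i, j):
        count = {1: i, 2: j}
        # Every receive inside the cut needs its send inside the cut.
        return all(s_k <= count[s_p]
                   for (s_p, s_k), (r_p, r_k) in messages
                   if r_k <= count[r_p])
    return {(i, j) for i, j in product(range(n1 + 1), range(n2 + 1)) if ok(i, j)}


def reachable(states, src, dst):
    """True if dst can be reached from src by adding one event at a time."""
    frontier, seen = [src], {src}
    while frontier:
        i, j = frontier.pop()
        if (i, j) == dst:
            return True
        for nxt in ((i + 1, j), (i, j + 1)):
            if nxt in states and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False


# Hypothetical execution: each process has 2 events; p1's first event sends
# a message that p2's second event receives.
states = consistent_states(2, 2, [((1, 1), (2, 2))])
print(sorted(states))
print(reachable(states, (0, 0), (2, 2)))
print(reachable(states, (1, 1), (0, 1)))
```

Each path from the bottom state to the top state of the lattice corresponds to one consistent run, which is why a single computation can have many of them.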

slide-72
SLIDE 72

So, why do we care about $\Sigma^s$ again?

Deadlock is a stable property: Deadlock $\Rightarrow \Box$ Deadlock
If a run $R$ of the snapshot protocol starts in $\Sigma^i$ and terminates in $\Sigma^f$, then $\Sigma^i \leadsto_R \Sigma^f$

slide-73
SLIDE 73

So, why do we care about $\Sigma^s$ again?

Deadlock is a stable property: Deadlock $\Rightarrow \Box$ Deadlock
If a run $R$ of the snapshot protocol starts in $\Sigma^i$ and terminates in $\Sigma^f$, then $\Sigma^i \leadsto_R \Sigma^f$
Deadlock in $\Sigma^s$ implies deadlock in $\Sigma^f$

slide-74
SLIDE 74

So, why do we care about $\Sigma^s$ again?

Deadlock is a stable property: Deadlock $\Rightarrow \Box$ Deadlock
If a run $R$ of the snapshot protocol starts in $\Sigma^i$ and terminates in $\Sigma^f$, then $\Sigma^i \leadsto_R \Sigma^f$
Deadlock in $\Sigma^s$ implies deadlock in $\Sigma^f$
No deadlock in $\Sigma^s$ implies no deadlock in $\Sigma^i$

slide-75
SLIDE 75

Same problem, different approach

Monitor process does not query explicitly
Instead, it passively collects information and uses it to build an observation

(reactive architectures, Harel and Pnueli [1985])

An observation is an ordering of the events of the distributed computation based on the order in which the receiver is notified of the events.