total order broadcast, VL Networked Embedded Systems, Markus Kammerstetter


SLIDE 1

total order broadcast

VL Networked Embedded Systems

Markus Kammerstetter (e0226196)

SLIDE 2

Markus Kammerstetter, 2007 2

overview

  • basic broadcast
  • basic service specification and quality of service
  • ordering
  • reliability
  • implementation
SLIDE 3

basic broadcast (1)

  • different strengths of reliability (e.g. best effort – fail/recovery model)
  • no ordering: messages are considered "independent"

SLIDE 4

basic broadcast (2)

  • two messages from the same process might not be delivered in the order they were broadcast
  • a message m1 that causes a message m2 might be delivered by some process after m2

SLIDE 5

basic broadcast problems

  • how can messages be ordered (type of ordering)?
  • who orders the messages?
  • what degree of fault tolerance should there be?

SLIDE 6

basic service specification (1)

  • a broadcast service can support various properties, such as the type of ordering or the degree of fault tolerance
  • together, these properties form the quality of service
SLIDE 7

basic service specification (2)

  • the interface to the basic broadcast service has to be modified to support quality of service

bc-send_i(m, qos): an input event of processor p_i, which sends a message m to all processors, containing an indicator of the sender; qos is a parameter describing the quality of service required

SLIDE 8

basic service specification (3)

bc-recv_i(m, j, qos): an output event in which processor p_i receives the message m previously broadcast by p_j; qos is a parameter describing the quality of service required

SLIDE 9

basic service specification (4)

  • basic broadcast service:

– each bc-recv_i(m, j, qos) event is mapped to an earlier bc-send_j(m, qos) event
– every received message was previously sent (integrity)
– every sent message is eventually received once (liveness)
– every sent message is received only once at every processor (no duplicates)
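
A minimal sketch of how these three properties could be checked on a finite recorded trace. The trace format (tuples of processor indices and message ids), the function name check_basic_broadcast, and the reading of liveness as "received by the end of the trace" are assumptions made for illustration; they are not part of the slides. The sketch also assumes unique message ids per sender and does not check that each receive occurs after its send.

```python
from collections import Counter

def check_basic_broadcast(sends, recvs, n):
    """sends: list of (sender, msg); recvs: list of (receiver, sender, msg).
    n: number of processors. Returns one boolean per property."""
    sent = set(sends)
    recv_count = Counter(recvs)
    # integrity: every received message maps to some send event
    integrity = all((j, m) in sent for (_, j, m) in recvs)
    # no duplicates: no (receiver, sender, msg) triple occurs twice
    no_duplicates = all(c == 1 for c in recv_count.values())
    # liveness (finite-trace reading): every sent message reached every processor
    liveness = all((i, j, m) in recv_count for (j, m) in sent for i in range(n))
    return {"integrity": integrity,
            "no_duplicates": no_duplicates,
            "liveness": liveness}

sends = [(0, "a"), (1, "b")]
recvs = [(i, j, m) for i in range(2) for (j, m) in sends]
print(check_basic_broadcast(sends, recvs, 2))
```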

SLIDE 10

broadcast service qualities (1)

  • broadcast properties can be divided into two categories:

– ordering and reliability

SLIDE 11

broadcast service qualities (2)

  • ordering:

– do all processors see all messages in the same order, or do they just see the messages from a single processor in the order they were sent?
– does the order in which messages are received preserve the happens-before (causal) relation?

SLIDE 12

broadcast service qualities (3)

  • reliability:

– do all processors see the same set of messages even if failures occur in the underlying system?
– do all processors see all the messages broadcast by a nonfaulty processor?

SLIDE 13

ordering (1)

  • single-source FIFO: for all messages m1 and m2 and all processors p_i and p_j, if p_i sends m1 before m2, then m2 is not received at p_j before m1 is.
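
A small sketch of the single-source FIFO condition as a check over recorded receive logs. The input format and the name is_single_source_fifo are hypothetical. The check only compares the relative order of messages that were actually received, so it assumes integrity and liveness hold.

```python
def is_single_source_fifo(send_order, recv_logs):
    """send_order: dict sender -> list of messages in the order they were sent.
    recv_logs: dict receiver -> list of (sender, msg) in the order received."""
    for log in recv_logs.values():
        for sender, sent in send_order.items():
            position = {m: k for k, m in enumerate(sent)}
            # send positions of this sender's messages, in receive order
            seen = [position[m] for (s, m) in log if s == sender]
            if seen != sorted(seen):
                return False
    return True

send_order = {0: ["a", "b"], 1: ["x"]}
ok = {0: [(0, "a"), (1, "x"), (0, "b")], 1: [(1, "x"), (0, "a"), (0, "b")]}
bad = {0: [(0, "b"), (0, "a"), (1, "x")]}
print(is_single_source_fifo(send_order, ok), is_single_source_fifo(send_order, bad))
```

Messages from different senders may interleave freely; only the per-sender order is constrained.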

SLIDE 14

ordering (2)

  • totally ordered: for all messages m1 and m2 and all processors p_i and p_j, if m1 is received at p_i before m2 is, then m2 is not received at p_j before m1 is. (all processes must deliver all messages in the same order, i.e. the order is now total)
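
The total order condition can be sketched as a pairwise comparison of receive logs. The log format and the name is_totally_ordered are assumptions; message ids are assumed unique, and only messages received by both processors of a pair are compared.

```python
from itertools import combinations

def is_totally_ordered(recv_logs):
    """recv_logs: dict receiver -> list of message ids in the order received."""
    for log_a, log_b in combinations(recv_logs.values(), 2):
        pos_b = {m: k for k, m in enumerate(log_b)}
        # positions in log_b of the common messages, taken in log_a order
        ranks = [pos_b[m] for m in log_a if m in pos_b]
        if ranks != sorted(ranks):
            return False
    return True

print(is_totally_ordered({0: ["m1", "m2", "m3"], 1: ["m1", "m2", "m3"]}))
print(is_totally_ordered({0: ["m1", "m2"], 1: ["m2", "m1"]}))
```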

SLIDE 15

ordering (3)

  • causally ordered: for all messages m1 and m2 and every processor p_i, if m1 happens before m2, then m2 is not received at p_i before m1 is.

A message m1 is said to happen before m2 if either:

  • the bc-recv event for m1 happened before the bc-send event for m2, or
  • m1 and m2 are sent by the same processor and m1 is sent before m2
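
A rough sketch of a causal order check on a small trace. It assumes processors interact only through the broadcast service, so that the happens-before relation on messages can be built from the two rules above plus transitive closure. The event format and the names happens_before and is_causally_ordered are hypothetical.

```python
def happens_before(events):
    """events: dict processor -> list of ("send", m) / ("recv", m) in local order.
    Returns the set of pairs (m1, m2) where m1 happens before m2."""
    hb = set()
    for log in events.values():
        seen = []  # messages this processor has sent or received so far
        for kind, m in log:
            if kind == "send":
                hb.update((earlier, m) for earlier in seen)
            seen.append(m)
    changed = True
    while changed:  # naive transitive closure, fine for small traces
        changed = False
        for a, b in list(hb):
            for c, d in list(hb):
                if b == c and (a, d) not in hb:
                    hb.add((a, d))
                    changed = True
    return hb

def is_causally_ordered(events):
    hb = happens_before(events)
    for log in events.values():
        received = [m for kind, m in log if kind == "recv"]
        pos = {m: k for k, m in enumerate(received)}
        for m1, m2 in hb:
            if m1 in pos and m2 in pos and pos[m2] < pos[m1]:
                return False
    return True

# p1 receives a and then sends b, so a happens before b; p2 sees b first.
trace = {0: [("send", "a"), ("recv", "a"), ("recv", "b")],
         1: [("recv", "a"), ("send", "b"), ("recv", "b")],
         2: [("recv", "b"), ("recv", "a")]}
print(is_causally_ordered(trace))
```

In this trace every sender broadcasts only one message, so single-source FIFO holds trivially while causal order is violated at p2.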

SLIDE 16

relationships (1)

  • what are the relationships between these ordering requirements?

– only one implication: causally ordered implies single-source FIFO, because the happens-before relation on messages respects the order in which individual processors send the messages
– besides that implication, none of the ordering requirements implies any other

SLIDE 17

relationships (2)

  • causally ordered does not imply totally ordered
SLIDE 18

relationships (3)

  • totally ordered does not imply causally ordered or single-source FIFO ordered
SLIDE 19

relationships (4)

  • single-source FIFO ordered does not imply causally ordered or totally ordered

SLIDE 20

reliability (1)

  • as mentioned earlier, the basic broadcast service must satisfy:

– integrity
– liveness
– no duplicates

  • however, in the presence of faulty processors, the liveness property needs to be weakened

SLIDE 21

reliability (2)

assume there are at most f faulty processors, and let K be the mapping from bc-recv(m) events to bc-send(m) events

  • integrity: for each processor p_i (0 <= i <= n-1), the restriction of K to bc-recv_i events is well-defined.

→ every received message was previously sent; no message is received "out of thin air"

SLIDE 22

reliability (3)

  • no duplicates: for each processor p_i (0 <= i <= n-1), the restriction of K to bc-recv_i events is one-to-one.

→ no message is received more than once at any single processor
SLIDE 23

reliability (4)

  • nonfaulty liveness: when restricted to bc-send and bc-recv events at nonfaulty processors, K is surjective.

→ all messages broadcast by a nonfaulty processor are eventually received by all nonfaulty processors

SLIDE 24

reliability (5)

  • faulty liveness: if one nonfaulty processor has a bc-recv event that maps to a particular bc-send event at a faulty processor, then every nonfaulty processor has such an event.

→ every message sent by a faulty processor is either received by all nonfaulty processors or by none of them

SLIDE 25

implementation (basic broadcast)

  • the basic broadcast service is implemented on top of the underlying point-to-point message system, which is assumed to have no failures.

SLIDE 26

implementation (single-source FIFO)

  • single-source FIFO ordering is implemented on top of basic broadcast.
  • each processor assigns an increasing sequence number to each new message that it broadcasts
  • the recipient delays the receipt of a message until all previous messages from the same sender, i.e. those with lower sequence numbers, have been processed

SLIDE 27

implementation (single-source FIFO)

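sender: keeps a counter N; for each new message: N = N + 1, then bc-send(data, N)

receiver: performs bc-recv(data, N) only if all previous messages i (i < N) from the same sender have been received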
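
The sender counter and the receiver's waiting rule can be sketched as follows. This is an illustrative sketch, not the implementation from the lecture: the class names FifoSender and FifoReceiver, the hold-back dictionary, and the choice to start the counter at 0 (so the first message carries number 1) are assumptions.

```python
class FifoSender:
    def __init__(self, pid):
        self.pid = pid
        self.seq = 0  # counter N, assumed to start at 0

    def send(self, data):
        self.seq += 1                      # N = N + 1
        return (self.pid, self.seq, data)  # packet handed to bc-send


class FifoReceiver:
    def __init__(self):
        self.next_seq = {}   # sender -> next expected sequence number
        self.held = {}       # (sender, seq) -> data waiting for earlier messages
        self.delivered = []  # (sender, data) in delivery order

    def on_bc_recv(self, packet):
        sender, seq, data = packet
        expected = self.next_seq.get(sender, 1)
        if seq < expected:
            return  # already delivered; ignore
        self.held[(sender, seq)] = data
        # deliver as long as the next expected message is available
        while (sender, expected) in self.held:
            self.delivered.append((sender, self.held.pop((sender, expected))))
            expected += 1
        self.next_seq[sender] = expected


s = FifoSender(0)
p1, p2, p3 = s.send("a"), s.send("b"), s.send("c")
r = FifoReceiver()
for packet in (p3, p1, p2):  # basic broadcast may reorder
    r.on_bc_recv(packet)
print(r.delivered)
```

Holding early packets in a buffer rather than dropping them keeps the liveness of the underlying basic broadcast intact.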

SLIDE 28

implementation (total order)

  • total order is a more difficult property

– either asymmetric: relies on a central coordinator that orders all messages (implemented on top of basic broadcast)
– or symmetric: the processors decide together on an order for all broadcast messages (implemented on top of single-source FIFO)

SLIDE 29

implementation (total order)

asymmetric total order algorithm:

  • processor p_i sends m using the basic broadcast service to a unique central site at processor p_c
  • processor p_c assigns a sequence number to each message and then sends it to all processors using the basic broadcast service
  • a message k is received if all previous messages i (0 <= i < k) are received
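
A minimal sketch of the asymmetric scheme: a central sequencer stamps messages, and each receiver delivers strictly in stamp order. The class names Sequencer and OrderedReceiver are hypothetical, and numbering from 0 is an assumption; network transport is left out.

```python
class Sequencer:
    """Central site p_c: stamps each incoming message with the next number."""
    def __init__(self):
        self.next_number = 0

    def order(self, msg):
        stamped = (self.next_number, msg)
        self.next_number += 1
        return stamped  # to be sent to all processors via basic broadcast


class OrderedReceiver:
    def __init__(self):
        self.expected = 0    # next sequence number to deliver
        self.held = {}       # number -> message that arrived early
        self.delivered = []

    def on_bc_recv(self, stamped):
        number, msg = stamped
        self.held[number] = msg
        # message k is delivered only after all messages i < k
        while self.expected in self.held:
            self.delivered.append(self.held.pop(self.expected))
            self.expected += 1


seq = Sequencer()
stamped = [seq.order(m) for m in ["x", "y", "z"]]
a, b = OrderedReceiver(), OrderedReceiver()
for s in stamped:
    a.on_bc_recv(s)
for s in reversed(stamped):  # a different arrival order at b
    b.on_bc_recv(s)
print(a.delivered, b.delivered)
```

Both receivers end up with the same delivery order regardless of arrival order, because the order is fixed at the single sequencer.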

SLIDE 30

implementation (total order)

  • to spread the communication overhead, the role of the central processor can rotate among the processors (e.g. by means of a rotating token)
  • since all messages are assigned a number at a central site, the messages are clearly received in the same total order at all processors

SLIDE 31

implementation (total order)

symmetric total order algorithm:

  • based on assigning timestamps to messages
  • assume that the underlying communication system provides single-source FIFO
  • each processor maintains an increasing counter (timestamp)
  • before sending, each message is tagged with the current timestamp

SLIDE 32

implementation (total order)

  • each processor also maintains a vector which estimates the timestamps of all other processors. Processor p_i updates its entry for p_j using the tags on messages received from p_j and using special "timestamp update" messages sent by p_j.
  • a message with timestamp T is received once all previous messages with timestamps <= T have arrived (this is done by waiting until every vector entry is at least T)

SLIDE 33

proof (total order)

proof:

  • we must show:

– integrity
– no duplicates
– liveness
– total ordering

SLIDE 34

proof (total order)

  • integrity: holds because of the underlying single-source FIFO
  • no duplicates: holds because of the underlying single-source FIFO

SLIDE 35

proof (total order)

liveness:

  • suppose, for contradiction, that some processor p_i has some entry (m,T) stuck in its pending set forever, where T is the smallest timestamp of all stuck entries
  • eventually (m,T) has the smallest timestamp of all entries
  • why is (m,T) stuck at p_i? → because its estimate of some p_k's timestamp is stuck at some value T' < T
  • that would mean that either p_k never receives (m,T), or p_k's timestamp update message resulting from p_k receiving (m,T) is never received at p_i. → contradiction to the correctness of the single-source FIFO broadcast

SLIDE 36

proof (total order)

total ordering: suppose p_i does to-bc-recv() for message m with timestamp T, and later it does the same for message m' with timestamp T'. We have to show that T < T'.

SLIDE 37

proof (total order)

total ordering (cont.):

  • if (m',T') is in p_i's pending set when p_i does to-bc-recv() for (m,T) → then T < T' → proved
  • the other case is that (m',T') is not yet in p_i's pending set. Let p_j be the processor that initiated to-bc-send() of m'.
  • when p_i does to-bc-recv for (m,T) from p_k → T <= ts[j], so p_i has received a message from p_j with timestamp >= T
  • by the single-source FIFO property, every subsequent message p_i receives from p_j will have a timestamp > T, so T' must be > T → proved

SLIDE 38

symmetric to-algorithm

when to-bc-send_i(m) occurs:
    ts[i]++
    add (m,ts[i]) to pending
    invoke ssf-bc-send_i((m,ts[i]))

when ssf-bc-recv_i((m,T)) from p_j occurs:
    ts[j] := T
    add (m,T) to pending
    if T > ts[i] then
        ts[i] := T
        invoke ssf-bc-send_i("ts-up",T)

invoke to-bc-recv_i(m) when:
    (m,T) is entry in pending with smallest T
    T ≤ ts[k] for all k
  result: remove (m,T) from pending

when ssf-bc-recv_i("ts-up",T) from p_j occurs:
    ts[j] := T

initially: ts[j] = 0 (0 <= j <= n-1), pending set is empty
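
The pseudocode above can be tried out with a small simulation. This is a sketch under stated assumptions, not the lecture's implementation: the names Processor and simulate are hypothetical, the single-source FIFO broadcast is modelled by one FIFO queue per (sender, receiver) pair, a processor does not receive its own packets (j ≠ i), and equal timestamps are ordered by sender id, a tie-break the slide does not spell out.

```python
import random
from collections import deque

class Processor:
    def __init__(self, pid, n):
        self.pid, self.n = pid, n
        self.ts = [0] * n    # ts[k]: estimate of p_k's timestamp
        self.pending = []    # entries (T, sender, m)
        self.delivered = []  # messages passed to to-bc-recv, in order

    def to_bc_send(self, m):
        self.ts[self.pid] += 1
        T = self.ts[self.pid]
        self.pending.append((T, self.pid, m))
        return [("msg", m, T)]  # packets for the FIFO broadcast

    def on_ssf_recv(self, j, packet):
        kind, m, T = packet
        out = []
        self.ts[j] = T
        if kind == "msg":
            self.pending.append((T, j, m))
            if T > self.ts[self.pid]:
                self.ts[self.pid] = T
                out.append(("ts-up", None, T))
        return out

    def deliver_ready(self):
        self.pending.sort()  # smallest (T, sender) first: assumed tie-break
        while self.pending and self.pending[0][0] <= min(self.ts):
            T, j, m = self.pending.pop(0)
            self.delivered.append(m)

def simulate(n, sends, seed=0):
    """sends: list of (pid, msg) requests; returns each processor's delivery log."""
    rng = random.Random(seed)
    procs = [Processor(i, n) for i in range(n)]
    chan = {(i, j): deque() for i in range(n) for j in range(n) if i != j}
    todo = deque(sends)

    def broadcast(i, packets):
        for p in packets:
            for j in range(n):
                if j != i:
                    chan[(i, j)].append(p)

    while todo or any(chan.values()):
        busy = [k for k, q in chan.items() if q]
        if todo and (not busy or rng.random() < 0.3):
            i, m = todo.popleft()          # issue the next send request
            broadcast(i, procs[i].to_bc_send(m))
            procs[i].deliver_ready()
        else:
            i, j = rng.choice(busy)        # deliver the head of a random channel
            broadcast(j, procs[j].on_ssf_recv(i, chan[(i, j)].popleft()))
            procs[j].deliver_ready()
    return [p.delivered for p in procs]

logs = simulate(3, [(0, "a"), (1, "b"), (2, "c"), (0, "d"), (1, "e")], seed=1)
print(logs[0] == logs[1] == logs[2])
```

Changing the seed changes the interleaving of sends and channel deliveries, which is a quick way to observe that all processors still end up with identical delivery logs.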

SLIDE 39

end

  • thank you for your attention
  • questions?