Time, Clocks, and State Machine Replication
Dan Ports, CSEP 552
Today's question: How do we order events in a distributed system?
physical clocks
logical clocks
snapshots
(break)
application: state machine replication
(Chain Replication / Lab 2)
if object O depends on source S, and O.time < S.time, rebuild O
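The rebuild rule above can be sketched in a few lines. This is a minimal illustration, not any build tool's actual code: `Artifact` and `needs_rebuild` are hypothetical names, and timestamps are plain numbers. The second half shows how the rule can fail when the source and the object were stamped by machines whose clocks disagree.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Artifact:
    name: str
    time: float  # last-modified timestamp, as reported by the machine that wrote it


def needs_rebuild(obj: Artifact, sources: List[Artifact]) -> bool:
    """Rebuild obj if any source it depends on carries a newer timestamp."""
    return any(obj.time < src.time for src in sources)


# Same clock: the source was edited after the object was built.
obj = Artifact("main.o", time=100.0)
src = Artifact("main.c", time=105.0)
stale = needs_rebuild(obj, [src])

# Skewed clocks (assumed scenario): the edit really happened after the build,
# but the machine that saved the source runs 10 s behind, so the timestamp
# comparison wrongly concludes the object is up to date.
skewed_src = Artifact("main.c", time=105.0 - 10.0)
missed = not needs_rebuild(obj, [skewed_src])
```

Here `stale` is true (correct rebuild) and `missed` is also true (a needed rebuild that the skewed timestamp hides).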
center, cross-datacenter replication, edge caches
effect, then snaps back => results in an oscillation at resonant frequency
5 min / yr
1 sec / yr
<1 ms / yr
100 ns / yr
(measurements from Amazon EC2)
200 ms is a user-perceptible difference
(How do we know the master’s time is correct?)
value in the message
value in the message + minimum delay
to T1 + (T2-T0)/2
If we know the minimum latency: (T2-T0)/2 - min
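The round-trip estimate and its error bound can be worked through in code. This is a sketch under stated assumptions: T0 and T2 are read from the client's clock (request sent, reply received), T1 is the server's clock value carried in the reply, and `min_delay` is the known minimum one-way latency (0 if unknown). The function name is hypothetical.

```python
def estimate_server_time(t0: float, t1: float, t2: float, min_delay: float = 0.0):
    """Return (estimated server time at t2, error bound on that estimate).

    t0: client clock when the request was sent
    t1: server clock value carried in the reply
    t2: client clock when the reply arrived
    min_delay: known minimum one-way network latency (0 if unknown)
    """
    rtt = t2 - t0
    # Assume the reply spent half of the round trip in flight.
    estimate = t1 + rtt / 2
    # The reply's real one-way delay lies in [min_delay, rtt - min_delay],
    # so the estimate is off by at most rtt/2 - min_delay in either direction.
    error = rtt / 2 - min_delay
    return estimate, error


# 20 ms round trip, 4 ms minimum one-way latency (illustrative numbers).
estimate, error = estimate_server_time(t0=100.000, t1=250.000, t2=100.020,
                                       min_delay=0.004)
```

With these numbers the client would set its clock to about 250.010, accurate to within about 6 ms. A shorter round trip gives a tighter bound, which is why repeating the exchange and keeping the sample with the smallest RTT helps.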
to min RTT
latency introduced in network
events
happened first!
Ci(a) < Ci(b)
b = process j receives m
Ci(a) < Cj(b)
local events
timestamp Tm = (Ci at the time message was sent)
Cj = max(Cj, Tm + 1) + 1
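A small sketch of the clock rules above. The class and method names are hypothetical, and the code assumes that sending a message counts as an event that increments the sender's clock before the timestamp is taken. The receive rule is the one stated above.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self) -> int:
        """Increment on every local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is treated as an event (assumption); returns the timestamp Tm."""
        self.time += 1
        return self.time

    def receive(self, tm: int) -> int:
        """Receive rule from the text: Cj = max(Cj, Tm + 1) + 1."""
        self.time = max(self.time, tm + 1) + 1
        return self.time


p_i, p_j = LamportClock(), LamportClock()
a = p_i.local_event()   # C_i = 1
tm = p_i.send()         # C_i = 2, message carries Tm = 2
b = p_j.receive(tm)     # C_j = max(0, 3) + 1 = 4
c = p_j.local_event()   # C_j = 5
```

The send timestamp is smaller than the receive timestamp, and timestamps within each process only increase, so both conditions Ci(a) < Ci(b) and Ci(a) < Cj(b) hold in this run.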
C(a) < C(b) => a -> b
sometimes neither C(a) < C(b) nor C(b) < C(a)
i.e. a list of all previous events
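One well-known compact encoding of "all previous events" is a vector clock: instead of listing every event, each process keeps a count of how many events it has seen from every other process. The sketch below is an illustration of that technique, not code from the lecture; `tick`, `merge`, and `happened_before` are hypothetical helper names.

```python
from typing import Dict

VC = Dict[str, int]  # process id -> number of that process's events seen so far


def tick(vc: VC, pid: str) -> VC:
    """Return a copy of vc with pid's own entry advanced by one event."""
    out = dict(vc)
    out[pid] = out.get(pid, 0) + 1
    return out


def merge(local: VC, received: VC, pid: str) -> VC:
    """On receive: take the element-wise max, then count the receive event."""
    keys = set(local) | set(received)
    merged = {k: max(local.get(k, 0), received.get(k, 0)) for k in keys}
    return tick(merged, pid)


def happened_before(a: VC, b: VC) -> bool:
    """a < b iff every entry of a is <= b's and at least one is strictly smaller."""
    keys = set(a) | set(b)
    no_larger = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    some_smaller = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_larger and some_smaller


a = tick({}, "p1")       # event on p1
b = tick({}, "p2")       # independent event on p2
c = merge(b, a, "p2")    # p2 receives a message stamped with a
```

Here `a` and `b` are concurrent (neither is less than the other), while both are ordered before `c`. The comparison is a partial order, which is what lets C(a) < C(b) imply a -> b.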
reputation to the pages it links to
same time”
the sending process’s checkpoint should reflect sending it
message needs to be reflected in the receiver’s or channel’s checkpoint
another!
if a -> b, and b is in some checkpoint, so is a
error-free, in-order delivery, finite delay
until receiving a token on that channel
“before the snapshot” from “after the snapshot”
message from j then j’s state includes sending that message
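The token rule can be simulated in a single thread. This is a toy sketch in the style of the Chandy-Lamport snapshot algorithm, assuming two processes that transfer money over FIFO channels modeled as deques; the `Process` class and all of its fields are hypothetical. A process checkpoints its state, sends a token on every outgoing channel, and records each incoming channel until the token arrives on it.

```python
from collections import deque

TOKEN = "TOKEN"  # separates "before the snapshot" from "after the snapshot"


class Process:
    def __init__(self, pid, balance):
        self.pid = pid
        self.balance = balance
        self.out = {}            # destination pid -> outgoing FIFO channel
        self.incoming = []       # pids of processes that can send to us
        self.checkpoint = None   # recorded local state
        self.recording = {}      # source pid -> messages seen since our checkpoint
        self.channel_state = {}  # source pid -> final recorded channel contents

    def send(self, dst, amount):
        self.balance -= amount
        self.out[dst].append(amount)

    def start_snapshot(self):
        self.checkpoint = self.balance
        self.recording = {src: [] for src in self.incoming}
        for channel in self.out.values():
            channel.append(TOKEN)

    def deliver(self, src, msg):
        if msg == TOKEN:
            if self.checkpoint is None:
                self.start_snapshot()
            # Whatever was recorded on this channel before its token was in flight.
            self.channel_state[src] = self.recording.pop(src)
        else:
            self.balance += msg
            if src in self.recording:
                self.recording[src].append(msg)


a, b = Process("A", 100), Process("B", 100)
ab, ba = deque(), deque()
a.out["B"], b.out["A"] = ab, ba
a.incoming, b.incoming = ["B"], ["A"]

a.send("B", 10)                # in flight on A -> B
b.send("A", 20)                # in flight on B -> A
a.start_snapshot()             # A checkpoints, token queued behind the 10
a.deliver("B", ba.popleft())   # the 20 arrives after A's checkpoint: recorded
b.deliver("A", ab.popleft())   # the 10 arrives before B's checkpoint
b.deliver("A", ab.popleft())   # token: B checkpoints and sends its own token
a.deliver("B", ba.popleft())   # token: A stops recording the B -> A channel

snapshot_total = (a.checkpoint + b.checkpoint
                  + sum(a.channel_state["B"]) + sum(b.channel_state["A"]))
```

The snapshot accounts for all 200 units: 90 at A, 90 at B, and 20 captured as channel state. No transfer is lost or counted twice, even though the two checkpoints were taken at different moments.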
by using (at least) f+1 copies
applied to all replicas with the same result
sequence
Key idea: If the system is a state machine, keeping the replicas consistent means agreeing on the order of operations
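To make the key idea concrete, here is a minimal sketch with a deterministic key-value state machine. `KVStateMachine` and `replay` are hypothetical names, and the operation format is invented for the example. Replicas that apply the same operations in the same order end up identical; applying them in a different order can produce a different state.

```python
class KVStateMachine:
    """Deterministic: the next state depends only on the current state and the op."""

    def __init__(self):
        self.store = {}

    def apply(self, op):
        kind, key, value = op
        if kind == "put":
            self.store[key] = value
        elif kind == "append":
            self.store[key] = self.store.get(key, "") + value
        return self.store.get(key)


def replay(ops):
    """Start a fresh replica and apply the operations in the given order."""
    replica = KVStateMachine()
    for op in ops:
        replica.apply(op)
    return replica.store


log = [("put", "x", "a"), ("append", "x", "b"), ("put", "y", "c")]

replica1 = replay(log)
replica2 = replay(log)                            # same order, same state
reordered = replay([log[1], log[0], log[2]])      # same ops, different order
```

`replica1` and `replica2` agree, while `reordered` ends with a different value for `x`, which is why the replicas must agree on the order.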
(multicore)
if a finishes before b starts, a is ordered before b
assigns order to requests
replicas (here, all f+1)
primary respects the order of all successful operations (this is the hard part!)
detected
membership
response comes from tail
parallel, waits for responses
chain always two nodes (primary & backup)
key-value store
state
the state from the old master?
even if the primary fails!
primary?
succeed
so it rejects forwarded ops from the old primary
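One way to picture that rejection is with numbered views. This is a sketch under assumptions not stated above: each server tracks a view of the form (view number, primary, backup), views are installed in increasing order, and a server accepts a forwarded operation only if it is the current backup and the sender is the current primary. The `Server` class and its methods are hypothetical.

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.view = (0, None, None)  # (view number, primary, backup); assumed layout
        self.store = {}

    def install_view(self, num, primary, backup):
        """Adopt a newer view; ignore stale ones."""
        if num > self.view[0]:
            self.view = (num, primary, backup)

    def handle_forward(self, sender, key, value):
        """Accept a forwarded op only from the primary of the view we believe in."""
        _, primary, backup = self.view
        if backup != self.name or sender != primary:
            return False  # we are no longer that sender's backup
        self.store[key] = value
        return True


s2 = Server("S2")
s2.install_view(1, primary="S1", backup="S2")
accepted = s2.handle_forward("S1", "k", "v1")   # S1 is the primary in view 1

# S1 is declared failed; S2 is promoted and S3 becomes its backup.
s2.install_view(2, primary="S2", backup="S3")
rejected = not s2.handle_forward("S1", "k", "v2")
```

Because the old primary cannot get its operation acknowledged by the new primary, it cannot report success to the client, so a stale primary does no harm even if it has not yet learned about the new view.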