Distributed Systems (5DV020), Fall 2009: Time and Synchronization

Outline

  • Introduction
  • Basic definitions
  • Synchronization algorithms

– Synchronous systems
– Cristian's algorithm
– Berkeley algorithm
– Network Time Protocol

  • Summary

Time, and the lack thereof

  • A global notion of the correct time would be tremendously useful. Why?

– Consistency of distributed data, transactions, authenticity checks (ticket lifetimes), duplication detection, distributed debugging and garbage detection, etc.

Time, and the lack thereof

  • Why do we not have global time?

– Clocks drift, are inaccurate, may fail arbitrarily, etc.
– Time is relative, and depends on the observer of the timed events

  • Causal relationships (cause and effect) must not be violated

Basic definitions

  • A distributed system P consists of N processes: p_i, i = 1, 2, …, N
  • Each process p_i has state s_i
  • Processes communicate only via message passing (network)
  • Events e occur in processes

– Internal events
– Send events
– Receive events

Basic definitions

  • Events are ordered within a process p_i by the relation →_i

e^0 →_i e^1 →_i e^2

  • Define the history of p_i as its events, ordered by →_i

history(p_i) = h_i = <e_i^0, e_i^1, e_i^2, ...>

Basic definitions

  • Clock skew

– Instantaneous difference between readings of any two clocks

  • Clock drift

– Variations in how clocks count time (oscillations in a crystal), which cause divergence between clocks

Basic definitions

  • Clock drift rate

– Change in offset between a clock and a perfect reference clock, per unit of time

  • Consumer-level clocks: about 10^-6 seconds/second, roughly 1 second every 11.6 days
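
The 11.6-day figure follows directly from the drift rate. A quick check, as a sketch (the function name `days_until_error` is only illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_error(drift_rate: float, error_seconds: float = 1.0) -> float:
    """Days a clock drifting at `drift_rate` (seconds per second) needs
    before it is off by `error_seconds`."""
    return error_seconds / drift_rate / SECONDS_PER_DAY


if __name__ == "__main__":
    # 10^-6 s/s means one second of error after 10^6 seconds.
    print(round(days_until_error(1e-6), 1))  # 11.6
```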

Computer clocks

  • Hardware clock H(t)

– Gives “raw” time reading

  • Software clock C(t) = αH(t) + β

– Scaled by OS to give accurate time
– Used for timestamps
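
A minimal sketch of the software clock, assuming the hardware clock is exposed as a function returning a raw tick count. The class name `SoftwareClock`, its method names, and the interpretation of α as seconds per tick are illustrative assumptions, not something the slides specify:

```python
from typing import Callable


class SoftwareClock:
    """Software clock C(t) = alpha * H(t) + beta on top of a raw hardware reading."""

    def __init__(self, read_hardware: Callable[[], float],
                 alpha: float = 1.0, beta: float = 0.0):
        self.read_hardware = read_hardware  # returns H(t), e.g. a tick count
        self.alpha = alpha                  # scale factor (assumed: seconds per tick)
        self.beta = beta                    # offset (assumed: time at tick zero)

    def now(self) -> float:
        """Return C(t) for the current hardware reading."""
        return self.alpha * self.read_hardware() + self.beta

    def set_offset_to_match(self, reference_time: float) -> None:
        """Choose beta so that the clock reads `reference_time` right now."""
        self.beta = reference_time - self.alpha * self.read_hardware()
```

A synchronization algorithm would then only ever touch α and β, never the hardware counter itself.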

Time sources

  • Coordinated Universal Time (abbreviated UTC, thanks to the French)

– Based on atomic clocks
– Used for synchronization of all kinds of equipment (e.g. your computer, GPS, fancy radio-controlled clocks, etc.)

Synchronization types

  • External synchronization

– Processes are synchronized to external time source (e.g. UTC)

  • Internal synchronization

– “Correct time” exists only within a group of processes
– Need not be synchronized to an external source

Correctness and monotonicity

  • Correctness (drift is bounded):

(1 – p)(t' – t) ≤ H(t') – H(t) ≤ (1 + p)(t' – t)

– Forbids “jumps” in hardware clocks: the drift rate stays within the bound p

  • Monotonicity (ever-increasing)

t' > t ⇒ C(t') > C(t)

– Note: only deals with software clock
– Simpler, and often sufficient
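
Both conditions can be checked mechanically on recorded readings. The sketch below is only illustrative: the function names are made up, `drift_bound` stands for the bound p in the correctness condition, and it assumes the real times t and t' are known from some reference.

```python
from typing import Sequence


def within_drift_bound(t0: float, t1: float, h0: float, h1: float,
                       drift_bound: float) -> bool:
    """Correctness: (1 - p)(t1 - t0) <= H(t1) - H(t0) <= (1 + p)(t1 - t0)."""
    elapsed = t1 - t0    # real time that passed
    measured = h1 - h0   # time the hardware clock says has passed
    return (1 - drift_bound) * elapsed <= measured <= (1 + drift_bound) * elapsed


def is_monotonic(software_readings: Sequence[float]) -> bool:
    """Monotonicity: later readings of C are strictly larger than earlier ones."""
    return all(a < b for a, b in zip(software_readings, software_readings[1:]))
```

For example, a clock that gains 0.5 s over 100 s violates a bound of 10^-6, and a software clock that is set backwards violates monotonicity even if its hardware clock is correct.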

Synchronization algorithms

  • Internal synchronization

– In synchronous systems (trivial case)
– Berkeley algorithm

  • External synchronization

– Cristian's algorithm
– Network Time Protocol (NTP)

Clock synchronization in synchronous systems

  • Synchronous systems define bounds on all relevant parts

– Clock drift
– Message transmission delays
– Process execution step times

  • Send request, get response back

Clock synchronization in synchronous systems

  • Only uncertainty is the actual current transmission delay

u = (max – min)

– Set time to (time in response) + (max + min)/2, so that the skew is at most u/2
– For N processes, the optimum bound on the skew is u(1 – 1/N)
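
A small sketch of this case, assuming the delay bounds min and max are known. It takes the midpoint of the delay bounds, (time in response) + (min + max)/2, which is the same as adding u/2 when min is zero; either way the remaining skew is at most u/2. The function names are illustrative.

```python
from typing import Tuple


def synchronous_sync(time_in_response: float, min_delay: float,
                     max_delay: float) -> Tuple[float, float]:
    """Return (new clock value, bound on the remaining skew)."""
    u = max_delay - min_delay  # uncertainty in the transmission delay
    new_time = time_in_response + (min_delay + max_delay) / 2
    return new_time, u / 2


def optimum_skew_bound(u: float, n: int) -> float:
    """Best achievable skew bound when synchronizing n clocks: u(1 - 1/N)."""
    return u * (1 - 1 / n)
```

With a delay between 2 ms and 10 ms, the receiver adds 6 ms to the received time and is off by at most 4 ms.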

Cristian's algorithm

  • S is connected to a time source
  • p requests (m_r) and receives (m_t) the time

– S records the time t as late as possible before transmitting m_t
– p knows the total round-trip time T_round
– Simply set time to (t + T_round / 2)?

Cristian's algorithm

  • Only reasonable on the same LAN! But then, if the minimum transmission time (t_min) is known:

– Earliest time S could have placed the time in m_t was t_min after p dispatched m_r; the latest was t_min before p received m_t
– So when m_t arrives, the time by S's clock lies in [t + t_min, t + T_round – t_min]
– Width of range is (T_round – 2 t_min), so accuracy is ±(T_round / 2 – t_min)
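
The client side can be sketched as follows. This is a minimal illustration, not a full implementation: `request_time` is a hypothetical callable standing in for the network exchange with S, and the round trip is measured with a local monotonic clock.

```python
import time
from typing import Callable, Tuple


def cristian_estimate(server_time: float, t_sent: float, t_received: float,
                      t_min: float = 0.0) -> Tuple[float, float]:
    """Return (estimated current server time, accuracy as a +/- value)."""
    t_round = t_received - t_sent
    estimate = server_time + t_round / 2   # midpoint of the possible range
    accuracy = t_round / 2 - t_min         # half the width of that range
    return estimate, accuracy


def sync_once(request_time: Callable[[], float],
              clock: Callable[[], float] = time.monotonic,
              t_min: float = 0.0) -> Tuple[float, float]:
    """Time one request/response exchange and estimate the server's time."""
    t_sent = clock()
    server_time = request_time()   # hypothetical: sends m_r, returns t from m_t
    t_received = clock()
    return cristian_estimate(server_time, t_sent, t_received, t_min)
```

A shorter round trip gives a tighter accuracy bound, so a client may repeat the exchange and keep the result with the smallest T_round.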

Cristian's algorithm

  • Single point of failure!
  • Crashing server?

– Multicast to group of servers

  • Fake servers?

– Establish cryptographic authentication

  • Arbitrarily failing servers?

– Have enough correct ones to achieve agreement

Berkeley algorithm

  • Uses Cristian's methods
  • Master/Slave relationship
  • Master polls slaves

– Gets the current time of each slave
– Averages the times and sends each slave the offset by which to adjust its clock

  • Master fails?

– Crash: elect a new one!
– Arbitrary failure? Oops…
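
One round of the master's work can be sketched as below. The averaging step comes from the usual description of the Berkeley algorithm rather than from the slide, and the outlier rule (ignore readings further than `max_deviation` from the median) is a simplifying assumption for illustration. Slave times are assumed to be already corrected for round-trip delay, as in Cristian's algorithm.

```python
from statistics import median
from typing import Dict, Optional


def berkeley_adjustments(master_time: float, slave_times: Dict[str, float],
                         max_deviation: Optional[float] = None) -> Dict[str, float]:
    """Return the amount each clock (including the master's) should be adjusted by."""
    readings = {"master": master_time, **slave_times}
    usable = list(readings.values())
    if max_deviation is not None:
        # Assumed outlier rule: drop readings too far from the median.
        middle = median(usable)
        usable = [t for t in usable if abs(t - middle) <= max_deviation] or [middle]
    average = sum(usable) / len(usable)
    # Send offsets rather than absolute times, so message delay matters less.
    return {name: average - reading for name, reading in readings.items()}
```

With the master at 180 and slaves at 205 and 170 (minutes past midnight, say), the average is 185, so the adjustments are +5, -20 and +15.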

Network Time Protocol

  • Unlike the others, designed for WAN rather than LAN use

– Time servers close to the time source are more trusted
– Redundant paths → survives disconnects
– Massively scalable
– Authentication of time servers to avoid propagation of arbitrary failures

Network Time Protocol

  • Synchronization subnets

– Primary level (stratum) is directly connected to a time source
– Secondary level syncs to primary, tertiary to secondary, etc.

  • A higher stratum number means less reliable

– Dynamically reconfigurable: if a primary server's time source goes down, it becomes a secondary server

Network Time Protocol

  • Multicast mode

– “Time is X” sent between LAN nodes
– Only as accurate as the LAN allows
– Used only for unimportant nodes

  • Procedure-call mode

– Similar to Cristian's algorithm
– More accurate than multicast mode

  • Symmetric mode

– Pairs of messages
– Used in lower strata (closer to the time source)

Network Time Protocol

  • All messages are sent over UDP
  • For procedure-call and symmetric mode, messages contain

– Local times when the previous NTP message between the nodes was sent and received
– Local time of the current message's transmission

  • Receiver notes the local time when the message is received

Network Time Protocol

  • Delay in Server B may be non-negligible
  • Messages may be lost along the way

Network Time Protocol

  • For each message pair, calculate

– o_i: estimated offset between the clocks
– d_i: total transmission time (delay)

  • True offset is denoted o (without the index)
  • Denote the transmission time of m as t, and that of m' as t'

Network Time Protocol

T_{i-2} = T_{i-3} + t + o
T_i = T_{i-1} + t' – o

leads to

d_i = t + t' = T_{i-2} – T_{i-3} + T_i – T_{i-1}

also

o = o_i + (t' – t)/2, where
o_i = (T_{i-2} – T_{i-3} + T_{i-1} – T_i)/2

Network Time Protocol

  • Since t, t' ≥ 0, we know that

o_i – d_i/2 ≤ o ≤ o_i + d_i/2

  • Or, in English: o_i is an estimate of the offset, and d_i is a measure of its accuracy
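
The offset and delay estimates can be computed from the four timestamps of one message pair. In this sketch (function and parameter names are illustrative), `t_i3` and `t_i2` are the send and receive times of m, and `t_i1` and `t_i` are the send and receive times of m', each read from the local clock of the node where the event happens:

```python
from typing import Tuple


def ntp_offset_delay(t_i3: float, t_i2: float, t_i1: float,
                     t_i: float) -> Tuple[float, float]:
    """Return (o_i, d_i) for one pair of messages m and m'."""
    delay = (t_i2 - t_i3) + (t_i - t_i1)          # d_i = t + t'
    offset = ((t_i2 - t_i3) + (t_i1 - t_i)) / 2   # o_i
    return offset, delay


def offset_bounds(offset: float, delay: float) -> Tuple[float, float]:
    """The true offset o lies in [o_i - d_i/2, o_i + d_i/2]."""
    return offset - delay / 2, offset + delay / 2
```

As a check, take a true offset of 5 s with t = 10 ms and t' = 30 ms: the timestamps 100.000, 105.010, 105.020 and 100.050 give d_i = 0.040 and o_i = 4.990, and the true offset 5.0 lies inside [4.970, 5.010].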

Network Time Protocol

  • Pairs are retained for quality calculations
  • NTP peers communicate with many other peers, to decrease error
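
One simple way to use the retained pairs, sketched here under assumptions: keep the most recent few (offset, delay) pairs and trust the offset whose delay is smallest, since a small delay means a tight bound on the true offset. The class name `PairFilter` and the default window of 8 are illustrative choices, not taken from the slides.

```python
from collections import deque
from typing import Deque, Tuple


class PairFilter:
    """Keeps the most recent (offset, delay) pairs and picks the most trustworthy."""

    def __init__(self, window: int = 8):
        # deque with maxlen silently drops the oldest pair when full
        self.pairs: Deque[Tuple[float, float]] = deque(maxlen=window)

    def add(self, offset: float, delay: float) -> None:
        self.pairs.append((offset, delay))

    def best_offset(self) -> float:
        """Offset of the retained pair with the smallest delay."""
        if not self.pairs:
            raise ValueError("no pairs recorded yet")
        offset, _delay = min(self.pairs, key=lambda pair: pair[1])
        return offset
```

Running one such filter per peer, and then comparing peers, is one way to picture how talking to many peers decreases the error.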

Summary

  • We do not have universal time

– But we can synchronize clocks “reasonably well” anyway

  • Internal vs. external synchronization
  • Real-time systems must use more sophisticated algorithms than what we have seen during this lecture!

Summary

  • Algorithms

– Synchronous system (trivial)
– Cristian's algorithm: used in many others
– Berkeley algorithm: master/slave application of Cristian's for internal synchronization
– Network Time Protocol: suitable for WANs, uses message pairs
  • Message pairs

Next lecture

  • Logical time
  • Global states
  • Distributed debugging