Verification of clock synchronization algorithm (Original Welch-Lynch algorithm and adaptation to TTA)



SLIDE 1

Verification of clock synchronization algorithm

(Original Welch-Lynch algorithm and adaptation to TTA)

Christian Müller, cm@wjpserver.cs.uni-sb.de, Saarland University

7 October 2005

SLIDE 2

Overview

  • Clock synchronization in general
  • Original Welch-Lynch algorithm
  • Verification
  • Adaptation to TTA (Flexray)
SLIDE 3

Clock synchronization

  • Typical problems

– hardware clocks are not synchronized with each other
– hardware clocks drift at different rates
– message delivery delay varies
– the software processes that access the hardware clocks can themselves be faulty

➔ messages can be inconsistent (in the worst case: dual-faced clocks, which report different values to different nodes)

SLIDE 4

Clock synchronization

  • Introduction to Welch-Lynch algorithm

– a fault-tolerant algorithm for clock synchronization in a distributed system
– intended for a fully connected network of n processes
– executed periodically, at the same local time on all nodes
– requires at least n² messages per synchronization round

SLIDE 5

Welch-Lynch algorithm

Step 1: exchange clock values
Step 2: determine the adjustment
Step 3: adjust the local time
Step 4: when it is time, apply it

SLIDE 6

Welch-Lynch algorithm

Given:

n := number of all nodes
f := maximum number of faulty clocks, with condition n > 3f

(1) sort the clocks (C1..Cn) from smallest to largest
(2) exclude the f smallest and the f largest clocks
(3) compute the average of the (f+1)-st and (n-f)-th clocks

$$\mathrm{cfn}[C_1, \dots, C_n] = \frac{C_{f+1} + C_{n-f}}{2}$$

SLIDE 7

Welch-Lynch algorithm

  • Assumptions

– the drift of every clock from real time is bounded by a constant 0 < ρ << 1:

$$1 - \rho \le \frac{dH_i(t)}{dt} \le 1 + \rho$$

– there are at most f < n/3 faulty clocks
– initially, all non-faulty clocks are synchronized to within some β
– the message delivery delay lies in [δ-ε, δ+ε], where δ > ε ≥ 0

SLIDE 8

Welch-Lynch algorithm

  • Notation

– PCp is the physical clock of a node p
– CORRp is the computed correction of PCp
– VCp is the (virtual) local clock of a node p
– VCp(t) = PCp(t) + CORRp(t)

  • clock names are always capitalized and map real time to local time:

➔ VCp(t) returns the local time T of node p at the real time t.

SLIDE 9

Welch-Lynch algorithm

  • Correctness properties

– Agreement: at each time t, all non-faulty processes p and q are synchronized to within γ:

$$|VC_p(t) - VC_q(t)| \le \gamma$$

– Validity: the clocks of non-faulty processes are within a linear envelope of real time.

SLIDE 10

Welch-Lynch algorithm

[Figure: linear envelope of real time. Local time is plotted against real time (marks t0 to t4), with lines of slope 1, slope 1+ρ and slope 1-ρ.]

SLIDE 11

Welch-Lynch algorithm

T := T0;                          (initialization)
repeat forever
    wait until VCp = T;
    broadcast SYNC;
    wait for Δ time units;
    ADJp := T + δ – cfn(ARRp);
    CORRp := CORRp + ADJp;
    T := T + P;
end of loop.

On reception of a SYNC message from q do
    ARRp[q] := VCp.
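The adjustment step of the loop can be tried out in isolation. The Python sketch below assumes that the arrival times ARRp have already been collected; the helper `resynchronize` and its argument names are hypothetical and only mirror the two assignments to ADJp and CORRp.

```python
def cfn(values, f):
    """Mean of the (f+1)-st and (n-f)-th value of the sorted input."""
    ordered = sorted(values)
    return (ordered[f] + ordered[len(ordered) - f - 1]) / 2


def resynchronize(T, delta, arrivals, corr, f):
    """One adjustment step: returns (ADJ, new CORR).

    T        local time at which the round started
    delta    nominal message delay
    arrivals local arrival times of the SYNC messages (ARRp)
    corr     current correction CORRp
    """
    adj = T + delta - cfn(arrivals, f)   # ADJp := T + delta - cfn(ARRp)
    return adj, corr + adj               # CORRp := CORRp + ADJp


adj, corr = resynchronize(T=100, delta=2, arrivals=[101, 102, 103, 150], corr=0, f=1)
print(adj, corr)   # -0.5 -0.5
```

A SYNC message sent at local time T is expected to arrive at local time T + δ. Here the fault-tolerant midpoint of the arrival times is 102.5, slightly later than the expected 102, which means the local clock is ahead and receives a negative adjustment.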

SLIDE 12

Welch-Lynch algorithm

  • For a correct execution of the algorithm, P and Δ have to satisfy several conditions

– the last SYNC message of the current round can arrive at node p at a real time t with:

t ≤ tp + β + δ + ε

where:

tp := the real time at which the round starts
β := maximal deviation between non-faulty clocks, in real time
δ + ε := maximal message delay

SLIDE 13

Welch-Lynch algorithm

  • For a correct execution of the algorithm, P and Δ have to satisfy several conditions

– the last SYNC message of the current round can arrive at node p at a real time

t ≤ tp + β + δ + ε

– VC(tp + β + δ + ε) ≤ T + (1+ρ)(β + δ + ε)

➔ Δ ≥ (1+ρ)(β + δ + ε)

SLIDE 14

Welch-Lynch algorithm

  • For a correct execution of the algorithm, P and Δ have to satisfy several conditions

– for p not to miss the next round, T+P must be larger than the new clock value at the time of the correction!

➔ P ≥ Δ + ADJmax

where

ADJmax = (β + ε) + ρ∙|β - δ + ε|

(can be easily derived)
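The two conditions on Δ and P can be turned into a small calculator. This Python sketch takes ADJmax as (β + ε) + ρ∙|β - δ + ε| and Δ ≥ (1+ρ)(β + δ + ε) from the slides; the function names are hypothetical, and all parameters are assumed to be given in the same time unit.

```python
def min_wait(rho, beta, delta, eps):
    """Smallest admissible waiting time: (1 + rho)(beta + delta + eps)."""
    return (1 + rho) * (beta + delta + eps)


def max_adjustment(rho, beta, delta, eps):
    """Bound on a single adjustment: (beta + eps) + rho * |beta - delta + eps|."""
    return (beta + eps) + rho * abs(beta - delta + eps)


def min_period(rho, beta, delta, eps):
    """Smallest admissible round length when the smallest waiting time is used."""
    return min_wait(rho, beta, delta, eps) + max_adjustment(rho, beta, delta, eps)


# Illustrative values only: drift 1e-4, skew 1.0, delay 2.0 +/- 0.5 time units.
params = dict(rho=1e-4, beta=1.0, delta=2.0, eps=0.5)
print(min_wait(**params), max_adjustment(**params), min_period(**params))
```

Choosing a larger Δ than the minimum is allowed, but P then has to grow with it, since the condition is P ≥ Δ + ADJmax.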

SLIDE 15

SLIDE 16

Verification

  • Abstract idea

– although the algorithm is fairly simple, its analysis is surprisingly complicated and requires a long series of lemmas
– to make the proof presentable, we abstract from several details and concentrate on its main idea
– for simplicity we assume that broadcasting a message, computing the adjustment and storing an arrival time are instantaneous operations

SLIDE 17

Verification

  • Idea

– examine two non-faulty clocks just before a synchronization round, when their deviation is maximal

  • Consider two clocks before the same synchronization round

– Cp(t) = cfn(ARRp)
– Cq(t) = cfn(ARRq) (analogously)

SLIDE 18

Verification

  • Claim

|Cp(tsync) – Cq(tsync)| ≤ γ for all non-faulty p, q at tsync

[Figure: timeline of real time with marks at 0, tsync and tsync+1.]

  • Proof

|cfn(ARRp) – cfn(ARRq)| = ?

  • What does the cfn function return?

SLIDE 19

Verification

  • What do we know about these arrays?

– they are sorted from smallest to largest
– mARRp is a subset of ARRp
– mARRp contains all the non-faulty clocks and is equal for all nodes in each synchronization interval
– length(mARRp) ≥ 2f + 1

ARRp:   A1 . . . Af+1 . . . An-f . . . An
mARRp:  M1 . . . Mm

SLIDE 20

Verification

  • M1 = Ai for some i

➔ at most f faulty values can precede M1, so i ≤ f+1 and therefore M1 ≤ Af+1
➔ analogously, Mf+1 = Aj for some j ≥ f+1, so Af+1 ≤ Mf+1

(I) M1 ≤ Af+1 ≤ Mf+1

➔ analogously for the upper end:

(II) Mm-f ≤ An-f ≤ Mm

ARRp:   A1 . . . Af+1 . . . An-f . . . An
mARRp:  M1 . . . Mm

SLIDE 21

Verification

  • Let k be any index between f+1 and m-f.

– since m ≥ 2f+1, such a k exists.

  • Because of (I) and (II), the following holds:

M1 ≤ Af+1 ≤ Mk ≤ An-f ≤ Mm

ARRp:   A1 . . . Af+1 . . . An-f . . . An
mARRp:  M1 . . . Mm

SLIDE 22

Verification

M1 ≤ Af+1 ≤ Mk ≤ An-f ≤ Mm

$$\frac{M_1 + M_k}{2} \le \frac{A_{f+1} + A_{n-f}}{2} \le \frac{M_k + M_m}{2}$$

➔ (M1 + Mk)/2 ≤ cfn(ARRp) ≤ (Mk + Mm)/2
➔ (M1 + Mk)/2 ≤ cfn(ARRq) ≤ (Mk + Mm)/2
➔ the cfn function returns a result bounded only by the values of non-faulty nodes => fault tolerance
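The bound (M1 + Mk)/2 ≤ cfn(ARRp) ≤ (Mk + Mm)/2 can be checked empirically. The Python sketch below is a randomized test and not a proof: it mixes up to f arbitrary "faulty" values into a list of non-faulty ones and checks the bound for every admissible k. The function `check_bounds`, its parameters and the value ranges are assumptions chosen for illustration.

```python
import random


def cfn(values, f):
    """Mean of the (f+1)-st and (n-f)-th value of the sorted input."""
    ordered = sorted(values)
    return (ordered[f] + ordered[len(ordered) - f - 1]) / 2


def check_bounds(trials=1000, n=7, f=2, seed=0):
    """Return True if the bound held in every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        faulty = rng.randint(0, f)                                   # at most f faulty clocks
        good = sorted(rng.uniform(0, 10) for _ in range(n - faulty))  # mARR: M1..Mm
        bad = [rng.uniform(-1000, 1000) for _ in range(faulty)]      # arbitrary faulty values
        result = cfn(good + bad, f)
        m = len(good)
        for k in range(f + 1, m - f + 1):                            # f+1 <= k <= m-f (1-based)
            mk = good[k - 1]
            if not (good[0] + mk) / 2 <= result <= (mk + good[-1]) / 2:
                return False
    return True


print(check_bounds())   # True
```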
SLIDE 23

Verification

  • Proof:

|Cp(tsync) – Cq(tsync)|
  = |cfn(ARRp) – cfn(ARRq)|
  ≤ |(M1 + Mk)/2 – (Mk + Mm)/2|
  = |M1 – Mm|/2
  ≤ (γ + λ)/2

for γ ≥ λ it holds that (γ + λ)/2 ≤ γ

SLIDE 24

Verification

[Figure: proof of validity. Local time is plotted against real time (marks t0 to t4), showing the lines of slope 1, slope 1+ρ and slope 1-ρ together with the traces of faulty clock 1 and faulty clock 2.]

SLIDE 25

Verification

  • Since VC(t) is a linear function, the following holds:

VC(a + b) = A + VC(b), where A = VC(a)

  • Consider the local time difference of some node between two synchronization intervals:

VC(t_{i+1}) – VC(t_i) = VC(t_i + (t_{i+1} – t_i)) – VC(t_i)
                      = T_i + VC(t_{i+1} – t_i) – T_i
                      = VC(t_{i+1} – t_i)

(1-ρ)(t_{i+1} – t_i) ≤ T_{i+1} – T_i ≤ (1+ρ)(t_{i+1} – t_i)

SLIDE 26

Verification

  • But!

– our model is very abstract and not practical
– we neglected message delivery delays and the run time of all procedures

  • Normally every possible delay has to be bounded by a constant, and appropriate values have to be chosen for these constants

SLIDE 27

Adaptation to TTA (Flexray)

  • TTA version is basically WLA, but:

– k = 1 with k > 3f
– some changes in the fault assumptions
– when choosing the second-smallest and second-largest values, TTA does not consider all accurate clocks, but just 4 of them!
– these accurate clocks are chosen by the membership algorithm

➔ so all non-faulty nodes have the same members at all times
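Restricted to four readings, the selection of the second-smallest and second-largest value is easy to write down. The Python sketch below assumes that the two selected values are averaged as in the original Welch-Lynch function cfn; the name `tta_midpoint` is hypothetical.

```python
def tta_midpoint(readings):
    """Average of the second-smallest and second-largest of exactly four readings."""
    if len(readings) != 4:
        raise ValueError("expected exactly 4 clock readings")
    ordered = sorted(readings)
    return (ordered[1] + ordered[2]) / 2   # the smallest and the largest are ignored


print(tta_midpoint([2, 1, 3, 2]))   # 2.0
```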

SLIDE 28

Adaptation to TTA (Flexray)

  • Fault assumptions

– in the TTA bus topology and in a Flexray system there are no dual-faced clock effects

  • each node always receives the same time from a faulty node (there is only one channel)

  • So is the WLA not needed?
  • No, it still is, because messages can get lost!
SLIDE 29

Adaptation to TTA (Flexray)

  • Further changes:

– each node maintains a push-down stack of depth 4 for clock readings
– if a SYF message arrives and is valid (it must come from one of the members)

  • the clock difference reading is computed and pushed onto the stack

– when it is time, the local clock is synchronized using the stack values

SLIDE 30

Adaptation to TTA (Flexray)

[Figure: membership with Node I and Node II; Node II sends a SYF message to Node I, whose stack holds the readings 2, 1, 3, 2.]

  • Computing the clock difference:

e.g. this SYF message was expected at time 5 but was sent at time 8

5 – 8 = -3

  • Push -3 onto the stack now

[Figure: the stack of Node I now holds -3, 2, 1, 3.]

  • The oldest value gets discarded
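The stack update from this example can be reproduced with a bounded double-ended queue. This Python sketch is an illustration under assumptions: the names `record_sync_frame` and `correction_from_stack` are hypothetical, the newest reading is kept on the left, and the correction is taken as the average of the second-smallest and second-largest stack value.

```python
from collections import deque


def record_sync_frame(stack, expected_time, actual_time):
    """Compute the clock difference for a valid SYF message and push it."""
    difference = expected_time - actual_time
    stack.appendleft(difference)    # maxlen=4: the oldest (rightmost) value is dropped
    return difference


def correction_from_stack(stack):
    """Average of the second-smallest and second-largest of the four readings."""
    ordered = sorted(stack)
    return (ordered[1] + ordered[2]) / 2


stack = deque([2, 1, 3, 2], maxlen=4)       # stack contents before the message arrives
record_sync_frame(stack, expected_time=5, actual_time=8)
print(list(stack))                          # [-3, 2, 1, 3]
print(correction_from_stack(stack))         # 1.5
```

Using `maxlen=4` lets the container discard the oldest reading automatically, which matches the depth-4 stack described on the previous slide.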
SLIDE 31

Adaptation to TTA (Flexray)

  • How does a node compute the difference?

– communication in TTA is time-triggered according to global schedules
– each node knows beforehand at which time a certain message will be sent

➔ the difference between the expected time and the actual arrival time can be used to calculate the deviation between the sender's and the receiver's clock

SLIDE 32

Adaptation to TTA (Flexray)

  • Further changes

– in Flexray and TTA, each node starts a synchronization round at a different time

➔ the duration P of one round has to be adapted accordingly

SLIDE 33

Thanks for your attention!