Virtualized Physical Clocks - PowerPoint PPT Presentation



SLIDE 1

Virtualized Physical Clocks

SLIDE 2

What do we use clocks for?

  • When did something happen? When will it happen?
    • This class starts at 3pm
  • How long does something take?
    • This class lasts for 1 hour 20 minutes
  • What happened first and what happened later?
    • The class started before it ended
SLIDE 3

Clocks in Distributed Systems

  • We use clocks for similar things in distributed systems
    • Take a backup at 5pm / restore to the backup taken at 5pm
    • Take a backup every hour
    • Ensure that a resource is released by process 1 before process 2 accesses it
SLIDE 4

Application of Clocks to Order Events

  • Consider a multi‐version database system
    • When a new version is created, we add it to the existing versions
    • A transaction (or the system on behalf of the transaction) can determine which version to read
    • Each version has a timestamp
  • Suppose you have a perfectly synchronized clock and a very fast processor
    • Treat every transaction as if it were instantaneous
    • Assign a timestamp, say T, to the transaction
    • Each read and write of the transaction would have time T
SLIDE 5

Application of Clocks to Order Events

  • Example:
    • T1 has timestamp 100
      • It reads x, which has versions at times 0, 50, 75, and 90
      • T1 would read the version at time 90
      • It creates a new version of x
      • The new version would have a timestamp of 100
    • T2 has timestamp 110
      • Assuming there are no transactions other than T1 and T2, if T2 reads x, it should read the version written by T1
  • Advantages
    • Determining the state of the system at time 100 is trivial
    • Read‐only transactions are never aborted
  • Problems
    • If T2 ran concurrently with T1 and read x before T1 had written x, aborting T1 or T2 may be necessary
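The T1/T2 example can be sketched in a few lines of Python. This is only an illustration under the slide's idealized assumptions (perfect clocks, instantaneous transactions). The class name `VersionedItem` and its methods are hypothetical, not part of any particular database.

```python
from bisect import bisect_right


class VersionedItem:
    """All versions of one data item, kept sorted by write timestamp."""

    def __init__(self):
        self.timestamps = []  # sorted write timestamps
        self.values = []      # values[i] was written at timestamps[i]

    def write(self, ts, value):
        # Insert the new version at its sorted position.
        i = bisect_right(self.timestamps, ts)
        self.timestamps.insert(i, ts)
        self.values.insert(i, value)

    def read(self, ts):
        # Return (version timestamp, value) of the newest version written at or before ts.
        i = bisect_right(self.timestamps, ts)
        if i == 0:
            raise KeyError("no version at or before timestamp %d" % ts)
        return self.timestamps[i - 1], self.values[i - 1]


x = VersionedItem()
for ts in (0, 50, 75, 90):
    x.write(ts, "x@%d" % ts)

t1_read = x.read(100)    # T1 (timestamp 100) reads the version at time 90
x.write(100, "x@100")    # T1 creates a new version stamped 100
t2_read = x.read(110)    # T2 (timestamp 110) reads the version written by T1
```

Because old versions are never overwritten, a read at any past timestamp simply picks the newest version at or before it, which is why reconstructing the state at time 100 is trivial.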

SLIDE 6

But...

  • Our clocks are not perfectly synchronized
  • Problems caused by loosely synchronized clocks
    • Suppose we have transactions T3 and T4 such that
      • T3 wrote x
      • T4 read x
      • T3 finished before T4 started
    • Then T4 must be ordered later than T3 in the serialization order
      • i.e., T4 must read x written by T3 (or by some later transaction)
    • Loose synchronization may, however, allow T3’s timestamp to be higher than T4’s timestamp, which prevents T4 from reading the value of x written by T3
  • To prevent this problem, Google Spanner introduces the notion of commit‐wait
    • Force T4 to delay, thereby guaranteeing that its timestamp is higher than that of T3
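The anomaly, and the effect of a delay, can be illustrated with made-up numbers. This is a simplified sketch, not Spanner's actual protocol. The skew values, `MAX_SKEW`, and the helper `latest_version_at` are assumptions chosen for the example.

```python
MAX_SKEW = 10             # assumed bound on clock error
SKEW_A, SKEW_B = 10, 0    # hypothetical: machine A runs 10 units fast, machine B is exact


def latest_version_at(versions, ts):
    """Return the newest (write_ts, value) pair with write_ts <= ts."""
    visible = [v for v in versions if v[0] <= ts]
    return max(visible)


versions_of_x = [(0, "initial")]

# True time 100: T3 runs on machine A, writes x, and finishes.
t3_timestamp = 100 + SKEW_A
versions_of_x.append((t3_timestamp, "written by T3"))

# True time 105: T4 starts on machine B, strictly after T3 finished.
t4_timestamp = 105 + SKEW_B
t4_sees = latest_version_at(versions_of_x, t4_timestamp)

# If T4 is delayed by the skew bound before taking its timestamp,
# the timestamp is guaranteed to exceed T3's.
t4_delayed_timestamp = (105 + MAX_SKEW) + SKEW_B
t4_delayed_sees = latest_version_at(versions_of_x, t4_delayed_timestamp)
```

Without the delay, T4's timestamp (105) is lower than T3's (110), so T4 reads the initial version even though T3 finished first. With the delay, T4 reads T3's write, at the cost of waiting.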
SLIDE 7

Why did this happen?

  • We anticipated/wanted that temporal dependency would translate into causal dependency
    • T3 finished before T4 started
    • We wanted T3 to impact T4
  • The notion of causality captures which events can (potentially) affect other events
SLIDE 8

Causality

  • Causality (happened before) captures the information flow
  • Event a happened before b iff
    • a and b are on the same process and a occurred before b, or
    • a is a send event and b is the corresponding receive event, or
    • there exists an event c such that
      • a happened before c and
      • c happened before b
  • Lamport’s logical clocks assign a timestamp to each event such that
    • a happened before b => l.a < l.b
  • Vector clocks assign a (vector) timestamp to each event such that
    • a happened before b <=> vc.a < vc.b
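The vector-clock comparison vc.a < vc.b is the usual component-wise order. A minimal sketch follows. The function names and the sample timestamps are hypothetical.

```python
def vc_less(a, b):
    """True iff vector timestamp a is strictly less than b:
    every component of a is <= the matching component of b,
    and at least one component is strictly smaller."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def concurrent(a, b):
    """Two distinct events are concurrent iff neither timestamp is less than the other."""
    return not vc_less(a, b) and not vc_less(b, a)


# Three processes; the timestamps are made up for illustration.
send_on_p0 = (1, 0, 0)     # p0 sends a message
recv_on_p1 = (1, 1, 0)     # p1 receives it, so the send happened before the receive
local_on_p2 = (0, 0, 1)    # p2 acts independently of both

send_before_recv = vc_less(send_on_p0, recv_on_p1)
send_vs_local = concurrent(send_on_p0, local_on_p2)
```

Unlike Lamport timestamps, which are totally ordered integers, vector timestamps can be incomparable, and that is exactly how they expose concurrency.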
SLIDE 9

Causality (Continued)

  • Implementation of logical clocks
    • When process j sends message m
      • l.j = l.j + 1
      • l.m = l.j
    • For a receive event where message m is received
      • l.j = max(l.j, l.m) + 1
  • Property of logical clocks
    • a happened before b => l.a < l.b
    • l.a = l.b => a is concurrent with b
  • Useful to take a consistent snapshot
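The send and receive rules above translate almost directly into code. A minimal sketch, with a hypothetical class name and a made-up two-process exchange:

```python
class LamportClock:
    """Logical clock l.j of a single process j."""

    def __init__(self):
        self.l = 0

    def send(self):
        # l.j = l.j + 1; the message carries l.m = l.j
        self.l += 1
        return self.l

    def receive(self, l_m):
        # l.j = max(l.j, l.m) + 1
        self.l = max(self.l, l_m) + 1
        return self.l


p, q = LamportClock(), LamportClock()
a = p.send()         # p sends m1
b = q.receive(a)     # q receives m1
c = q.send()         # q sends m2
d = p.receive(c)     # p receives m2
```

The four events form one causal chain, so their timestamps strictly increase (1, 2, 3, 4). Nothing in the rules refers to physical time.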
SLIDE 10

How would logical clocks be different?

  • Given the expected dependency between T3 and T4
    • Assign T4 a timestamp higher than that of T3
    • No waiting is involved, since it is a logical clock
SLIDE 11

Let’s review what we wanted to do with (logical) clocks

  • When did something happen? When will it happen?
    • This class starts at 3pm
    • NO
  • How long does something take?
    • This class takes 1 hour 20 minutes
    • NO
  • What happened first and what happened later?
    • The class started before it ended
    • YES/NO
SLIDE 12

What is the problem?

  • Logical clocks do not convey any information about actual (real/physical) time
SLIDE 13

Goals

  • Problem: Given a distributed system, assign each event e a timestamp l.e such that
    • 1. e hb f => l.e < l.f
    • 2. Space requirement of l.e is O(1) integers
    • 3. l.e is represented with bounded space
    • 4. l.e is close to pt.e (the physical clock value at e), i.e., |l.e – pt.e| is bounded
SLIDE 14

Naïve Algorithm

Logical Clocks

  • When process j sends message m
    • l.j = l.j + 1
    • l.m = l.j
  • For a receive event where message m is received
    • l.j = max(l.j, l.m) + 1

Naïve Algorithm

  • When process j sends message m
    • l.j = l.j + 1
    • l.j := max(l.j, pt.j)
    • l.m = l.j
  • For a receive event where message m is received
    • l.j = max(l.j, l.m) + 1
    • l.j := max(l.j, pt.j)
SLIDE 15

Naïve Algorithm

Satisfies the first two requirements:

  • 1. e hb f => l.e < l.f
  • 2. Space requirement of l.e is O(1) integers

Fails the other two requirements (we omit the proof):

  • 3. l.e is represented with bounded space
  • 4. l.e is close to pt.e, i.e., |l.e – pt.e| is bounded
  • Unbounded drift is caused by
    • l.j := max(l.j + 1, pt.j), and
    • l.j := max(l.j + 1, l.m + 1, pt.j)

SLIDE 16

This is an example to show that drift between l and pt can increase in unbounded fashion
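One way to reproduce such growth is a small simulation of the naïve rules. The scenario is an assumption made for illustration: a single message circulates around a ring of processes, and each process's physical clock advances by exactly one tick per local event. The function name `naive_ring_drift` is hypothetical.

```python
def naive_ring_drift(num_procs, rounds):
    """Pass one message around a ring using the naive rules.

    Returns the largest l.j - pt.j over all processes after each round."""
    pt = [0] * num_procs    # physical clocks pt.j
    l = [0] * num_procs     # naive timestamps l.j
    l_m = None              # timestamp carried by the message in flight
    drift_per_round = []
    for _ in range(rounds):
        for j in range(num_procs):
            if l_m is not None:
                pt[j] += 1                              # one tick for the receive event
                l[j] = max(l[j] + 1, l_m + 1, pt[j])    # naive receive rule
            pt[j] += 1                                  # one tick for the send event
            l[j] = max(l[j] + 1, pt[j])                 # naive send rule
            l_m = l[j]
        drift_per_round.append(max(l[k] - pt[k] for k in range(num_procs)))
    return drift_per_round


drift = naive_ring_drift(num_procs=3, rounds=5)
```

In this scenario every hop adds 2 to l, while each process's physical clock gains only 2 ticks per full round, so the gap grows by a fixed amount every round and never shrinks.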

SLIDE 17

Problem with Naïve algorithm

  • Drift between l.e and pt.e is not bounded
  • Why is this a problem?
    • Consider the case where the user wants a snapshot of a database at time t
    • Since no process knows the precise physical time, the snapshot provided will not be precisely at physical time t
    • It will be at time t’ (that is hopefully close to t)
    • If clock skew is ε, the best we can do is to let t’ be in [t − ε, t + ε]
    • With unbounded drift, a snapshot taken at l = t gives no such guarantee
SLIDE 18

Algorithm for Hybrid Logical Clocks

Naïve Algorithm

  • When process j sends message m
    • l.j = l.j + 1
    • l.j := max(l.j, pt.j)
    • l.m = l.j

Revised Algorithm

  • When process j sends message m
    • l.j’ = l.j
    • l.j := max(l.j’, pt.j)
    • If (l.j = l.j’) then c.j := c.j + 1
    • Else c.j := 0
    • l.m = l.j, c.m = c.j
SLIDE 19

Algorithm for Hybrid Logical Clocks (Continued)

Naïve Algorithm

  • Upon receiving m at j
    • l.j = max(l.j, l.m) + 1
    • l.j := max(l.j, pt.j)

Revised Algorithm

  • Upon receiving m at j
    • l.j’ := l.j
    • l.j := max(l.j’, l.m, pt.j)
    • If (l.j = l.j’ = l.m) then c.j := max(c.j, c.m) + 1
    • Elseif (l.j = l.j’) then c.j := c.j + 1
    • Elseif (l.j = l.m) then c.j := c.m + 1
    • Else c.j := 0
    • Timestamp the receive event with (l.j, c.j)
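The revised send and receive rules can be sketched as a small class. This is a minimal sketch, not a production implementation. The class name `HLC` and the idea of passing the physical clock in as a zero-argument callable are assumptions made so the example can use fixed, made-up clock readings.

```python
class HLC:
    """Hybrid logical clock (l.j, c.j) of a single process j."""

    def __init__(self, physical_clock):
        self.pt = physical_clock   # zero-argument callable returning pt.j (assumed interface)
        self.l = 0
        self.c = 0

    def send(self):
        """Apply the send rule; returns (l.m, c.m) to attach to the message."""
        l_old = self.l
        self.l = max(l_old, self.pt())
        self.c = self.c + 1 if self.l == l_old else 0
        return self.l, self.c

    def receive(self, l_m, c_m):
        """Apply the receive rule; returns the timestamp of the receive event."""
        l_old = self.l
        self.l = max(l_old, l_m, self.pt())
        if self.l == l_old == l_m:
            self.c = max(self.c, c_m) + 1
        elif self.l == l_old:
            self.c += 1
        elif self.l == l_m:
            self.c = c_m + 1
        else:
            self.c = 0
        return self.l, self.c


fast = HLC(lambda: 10)           # a process whose physical clock reads 10
m = fast.send()

now = [1]                        # mutable cell standing in for a slower clock
slow = HLC(lambda: now[0])
r = slow.receive(*m)             # l jumps to 10; c breaks the tie
now[0] = 2
s = slow.send()                  # pt is still behind l, so only c advances
now[0] = 14
t = slow.send()                  # pt has caught up: l follows pt and c resets
```

With these readings the four timestamps are (10, 0), (10, 1), (10, 2), and (14, 0). l never advances on its own as it does in the naïve algorithm. It only takes on physical clock values already seen somewhere in the system, and c orders events that share the same l.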
SLIDE 20

HLC Algorithm

[Figure: example HLC execution. Each event is labeled with a triple (pt.j, l.j, c.j). Labels in extraction order: (10,10,0), (0,0,0), (1,10,1), (2,10,2), (2,10,3), (3,10,4), (3,10,5), (4,10,6), (14,14,0), (13,13,0). Messages carry l.m = 10 with c.m = 0, 2, 4, and 4.]

Callouts in the figure:

  • Receive: l.j := max(l’.j, l.m, pt.j); elseif (l.j = l.m) then c.j := c.m + 1
  • Send: l.j := max(l’.j, pt.j); if (l.j = l’.j) then c.j := c.j + 1
  • Receive where pt.j is the maximum: l.j := max(l’.j, l.m, pt.j); reset c
SLIDE 21

Properties of HLC

  • Logical clock property:
    • e hb f => (l.e, c.e) < (l.f, c.f) (lexicographic comparison)
  • |l.f – pt.f| <= ε, where ε is the clock skew
    • pt.e <= l.e <= pt.e + ε
  • The value c.e is bounded
    • c.e <= N * (number of events that can be created on a process within ε), where N is the number of processes
    • In practice, it is very small (in single digits)
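These properties can be checked on a small simulated run. The scenario below is an assumption made for illustration: one message circulates around a ring, true time advances by one unit per event, and each process's physical clock is true time plus a fixed, made-up offset, so ε is the spread of the offsets. All function names are hypothetical.

```python
def hlc_send(l, c, pt):
    """HLC send rule as a pure function; returns the new (l, c)."""
    l_new = max(l, pt)
    return (l_new, c + 1 if l_new == l else 0)


def hlc_receive(l, c, l_m, c_m, pt):
    """HLC receive rule as a pure function; returns the new (l, c)."""
    l_new = max(l, l_m, pt)
    if l_new == l == l_m:
        return (l_new, max(c, c_m) + 1)
    if l_new == l:
        return (l_new, c + 1)
    if l_new == l_m:
        return (l_new, c_m + 1)
    return (l_new, 0)


def run_ring(skews, rounds):
    """Return (pt, l, c) for every event, listed in causal order."""
    n = len(skews)
    state = [(0, 0)] * n    # (l, c) per process
    true_time = 0
    msg = None
    events = []
    for _ in range(rounds):
        for k in range(n):
            if msg is not None:
                true_time += 1
                pt = true_time + skews[k]
                state[k] = hlc_receive(*state[k], *msg, pt)
                events.append((pt, *state[k]))
            true_time += 1
            pt = true_time + skews[k]
            state[k] = hlc_send(*state[k], pt)
            events.append((pt, *state[k]))
            msg = state[k]
    return events


skews = [5, 0, 2]                      # hypothetical clock offsets
epsilon = max(skews) - min(skews)
events = run_ring(skews, rounds=4)

bounded = all(pt <= l <= pt + epsilon for pt, l, c in events)
ordered = all(e[1:] < f[1:] for e, f in zip(events, events[1:]))
largest_c = max(c for pt, l, c in events)
```

Python compares tuples lexicographically, so `e[1:] < f[1:]` is exactly the (l, c) comparison from the first property. In this run l stays within ε of the local physical clock and c stays small.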

SLIDE 22

Let’s review what we wanted to do with (hybrid logical) clocks

  • When did something happen? When will it happen?
    • The class started at 3pm
    • Yes. Choose the l value to be within ε of 3pm; that is the best we can do anyway
  • How long does something take?
    • The class took 1 hour 20 minutes
    • Look at the difference between the l values
  • What happened first and what happened later?
    • The class started before it ended
    • Use lexicographic ordering (guarantees consistency with causal order)
SLIDE 23

Revisiting Multiversion Database

SLIDE 24

Review the earlier example

  • Problems caused by loosely synchronized clocks
    • Suppose we have transactions T3 and T4 such that
      • T3 wrote x
      • T4 read x
      • T3 finished before T4 started
    • Then T4 must be ordered later than T3 in the serialization order
      • i.e., T4 must read x written by T3 (or by some later transaction)
    • Loose synchronization may, however, allow T3’s timestamp to be higher than T4’s timestamp, which prevents T4 from reading the value of x written by T3
  • To prevent this problem, Google Spanner introduces the notion of commit‐wait
    • Force T4 to delay, thereby guaranteeing that its timestamp is higher than that of T3
SLIDE 25

Other Choices

  • Alternate choice
    • Increase the (physical) time of the machine running T4
    • Unacceptable, as it would cause problems for other applications (e.g., the sleep function) as well as for NTP synchronization
  • A better choice
    • Create a new HLC timestamp for T4 that is higher than that of T3
    • Leave physical time unchanged
    • Change the l value of the timestamp (and, if necessary, the c value)
    • The c value is still bounded
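The better choice amounts to applying the HLC update rule to T3's timestamp. A minimal sketch follows. The function name `hlc_after` and the sample values are hypothetical.

```python
def hlc_after(prev, pt):
    """Return an HLC timestamp (l, c) strictly higher than prev,
    given the local physical time pt. The physical clock is not modified."""
    l_prev, c_prev = prev
    l = max(l_prev, pt)
    c = c_prev + 1 if l == l_prev else 0
    return (l, c)


t3 = (110, 0)                      # hypothetical HLC timestamp of T3
t4 = hlc_after(t3, pt=105)         # T4's machine reads physical time 105
t4_later = hlc_after(t3, pt=120)   # if the local clock is already ahead, c resets
```

Here T4 gets (110, 1) immediately, without waiting and without touching the machine's clock. Once physical time passes 110, new timestamps follow the physical clock again and c returns to 0.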
SLIDE 26

Other Applications of HLC

  • Causally Consistent Data Store
  • Rollback on a key‐value store
  • Runtime monitoring of partially synchronous distributed systems
SLIDE 27

Moral

Questions?

SLIDE 28