Distributed Systems: Time and Global States



slide-1
SLIDE 1

MISM 95-702 Distributed Systems 1

Distributed Systems

Time and Global States

slide-2
SLIDE 2

Learning Goals

  • To understand:

– The challenge of time in a distributed system
– How to synchronize distributed clocks
– How you can assess the state of a distributed system
– Debugging distributed systems

slide-3
SLIDE 3

Example

  • Browse to http://tinyurl.com/702clock

– This is your local clock

  • Take out a piece of paper
  • Solve by hand: 643 * 192

– Timestamp each line after you complete it

  • E.g.

Arithmetic    Timestamp
643           92
192           96
1286          112
…

slide-4
SLIDE 4

Time in distributed systems

  • Who finished first?
  • How could you decide computationally?
  • Can you use the timestamps?

– Are they reliable?
– Why or why not?

  • How could you make the timestamps more reliable?
  • What other approach could you take?

slide-5
SLIDE 5

Skew and drift

  • Why can’t we have a global clock on distributed systems?

– Clock skew: two clocks show two different times
– Clock drift: each clock runs at a slightly different speed

slide-6
SLIDE 6

Time

  • What is a second?

– 9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of Caesium-133 (Cs-133)

  • Ordinary quartz crystal clocks

– Drift about 1 second every 11 days
– How many things can a 2 GHz processor do in that 1 second of drift?
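
The arithmetic behind the question can be sketched as follows. This is a rough illustration only: the helper names `drift_rate` and `cycles_in` are made up for this example, and it counts clock cycles, which is only a proxy for "things a processor can do".

```python
SECONDS_PER_DAY = 24 * 60 * 60

def drift_rate(drift_seconds, interval_days):
    """Fractional drift: seconds of error accumulated per second of real time."""
    return drift_seconds / (interval_days * SECONDS_PER_DAY)

def cycles_in(seconds, clock_hz):
    """Number of clock cycles a processor completes in the given time."""
    return seconds * clock_hz

rate = drift_rate(1.0, 11)      # 1 second of drift every 11 days
cycles = cycles_in(1.0, 2e9)    # a 2 GHz processor during that 1 second

print(f"drift rate: {rate:.2e} (about {rate * 1e6:.2f} parts per million)")
print(f"cycles in 1 s at 2 GHz: {cycles:.0f}")
```

A drift of roughly one part per million sounds small, but one second of disagreement covers two billion processor cycles, which is why timestamps from different machines cannot simply be compared.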

slide-7
SLIDE 7

Clocks

  • Cesium clocks

– Expensive

  • GPS receiver

– Less expensive
– (GPS system has cesium clock(s))

  • Terrestrial radio

– Least expensive and least accurate

slide-8
SLIDE 8

3 Days in the life of my Mac

3/20/10 11:55:00 PM ntpd[26] time reset -1.782968 s
3/21/10 12:47:41 PM ntpd[26] time reset -0.719539 s
3/21/10 4:30:51 PM ntpd[26] time reset +0.327154 s
3/21/10 7:55:42 PM ntpd[26] time reset -0.238545 s
3/21/10 10:29:06 PM ntpd[26] time reset +0.364890 s
3/22/10 11:28:51 AM ntpd[26] time reset -1.058507 s
3/22/10 3:09:51 PM ntpd[26] time reset +0.572059 s
3/22/10 9:33:25 PM ntpd[26] time reset -0.165838 s
3/22/10 10:19:11 PM ntpd[26] time reset +1.000670 s
3/23/10 7:50:47 AM ntpd[26] time reset -0.171427 s
3/23/10 10:10:30 AM ntpd[26] time reset +0.133970 s
3/23/10 11:55:39 AM ntpd[26] time reset -0.136061 s
3/23/10 12:37:57 PM ntpd[26] time reset -0.526902 s
3/23/10 1:09:51 PM ntpd[26] time reset +0.400528 s

slide-9
SLIDE 9

Demonstrate External Synchronization

slide-10
SLIDE 10

Demonstrate Internal Synchronization

slide-11
SLIDE 11

Network Time Protocol

Design Goals:

  • Sync with UTC over the Internet
  • Reliability via redundancy
  • Scale to a large number of clients and servers
  • Defend against Mallory (a malicious attacker)

Graphic source: http://en.wikipedia.org/wiki/Network_Time_Protocol

slide-12
SLIDE 12

How is time synchronized?

Simulation:

  • Two clocks
  • UDP packet (reusable)

slide-13
SLIDE 13

UDP Packet (reusable)

a  Sent time
b  Received time
c  Sent-back time
d  Returned-back time

Calculation

e  Total round trip time: d - a
f  Remote processing time: c - b
g  Delay each way: (e - f) / 2
h  Offset relative to remote: (d - g) - c
i  Amount to adjust local clock: -h
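
The calculation on the packet can be sketched in a few lines. This is an illustration under stated assumptions, not NTP itself: it assumes a and d are read from the local clock, b and c from the remote clock, and that the delay is the same in both directions. The `Exchange` class and the sample numbers are made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Exchange:
    a: float  # sent time (local clock)
    b: float  # received time (remote clock)
    c: float  # sent-back time (remote clock)
    d: float  # returned-back time (local clock)

def local_offset(x):
    """Return h: how far the local clock is ahead of the remote clock."""
    e = x.d - x.a          # total round trip time
    f = x.c - x.b          # remote processing time
    g = (e - f) / 2        # delay each way, assuming a symmetric path
    h = (x.d - g) - x.c    # offset relative to remote
    return h

# Hypothetical exchange: the local clock runs 5 s ahead of the remote clock,
# each network hop takes 1 s, and the remote side spends 2 s processing.
x = Exchange(a=105.0, b=101.0, c=103.0, d=109.0)
h = local_offset(x)
adjustment = -h            # i: amount to adjust the local clock
print(h, adjustment)
```

If the two directions do not take the same time, g is wrong by half the difference, and so is the computed offset.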
slide-14
SLIDE 14

Test your synchronization

  • 1 student be a “1”
  • 2 students be “2’s”
  • Remaining be “3’s”

slide-15
SLIDE 15

Summarize

  • Summarize in your own words how NTP synchronization works
  • What is NTP synchronized time good enough for?
  • What are its shortcomings?
slide-16
SLIDE 16

Simulation Setup

  • Each student takes n candies and n coins

– Set the candies aside in the mine
– Leave the coins in inventory in front of you

  • Have a piece of paper to write on

slide-17
SLIDE 17

Simulation Process:

  • Occasionally move candy from mine to inventory
  • Occasionally pass a coin to someone

– Receive a candy in return

  • Occasionally pass a candy to someone

– Receive a coin in return

  • Record each step in the process
  • E.g.

– Send coin to Betsy
– Mine candy
– Receive candy from Fred
– Send coin to Fred
– Receive candy from Betsy
– Mine candy
– …

slide-18
SLIDE 18

Distributed Systems Histories

  • Could you re-enact what happened from your record?

  • How?
  • How precise would it be?
  • How precise does it need to be?

slide-19
SLIDE 19

Global State Terminology

Define by example:

  • Process history
  • Global history
  • Happened-before relation
  • Cut
  • Consistent cut
  • Inconsistent cut
  • Frontier of the cut
  • Run
  • Linearization
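
As one way to define a few of these terms by example, here is a minimal sketch. The two-process history, the event names, and the single message are hypothetical. A cut is written as the number of events taken from the start of each process history.

```python
# Hypothetical history: process P sends a message at x2, process Q receives it at y2.
histories = {"P": ["x1", "x2", "x3"], "Q": ["y1", "y2", "y3"]}
message_pairs = [("x2", "y2")]  # (send event, receive event)

def frontier(cut):
    """Frontier of the cut: the last included event of each process."""
    return {proc: histories[proc][n - 1] for proc, n in cut.items() if n > 0}

def is_consistent(cut):
    """A cut is consistent when every receive it contains has its send inside too."""
    included = {e for proc, n in cut.items() for e in histories[proc][:n]}
    return all(send in included
               for send, recv in message_pairs if recv in included)

print(is_consistent({"P": 2, "Q": 2}))  # send x2 and receive y2 both included
print(is_consistent({"P": 1, "Q": 2}))  # receive y2 included without send x2
print(frontier({"P": 2, "Q": 1}))
```

Because each process contributes a prefix of its own history, only the message pairs need checking. The second cut is inconsistent because it shows an effect (the receive) without its cause (the send).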
slide-20
SLIDE 20

MISM 95-702 OCT 20

Linearize these two process histories

Process A        Process B
State 3c, 6p     State 4c, 6p
SendB 2p         RecA 2p
State 3c, 4p     State 4c, 8p
RecB 1c          SendA 1c
State 4c, 4p     State 3c, 8p
SendB 2c         RecA 2p
State 2c, 4p     State 3c, 10p
SendB 2p         SendA 2p
State 2c, 2p     State 3c, 8p
RecB 2p
State 2c, 4p

slide-21
SLIDE 21

Make up a story for p1, p2, p3

[Figure: events a through f on processes p1, p2, and p3, with messages m1 and m2, along a physical time axis]

slide-22
SLIDE 22

Draw 5 consistent cuts

[Figure: events a through f on processes p1, p2, and p3, with messages m1 and m2, along a physical time axis]

slide-23
SLIDE 23

Draw 2 inconsistent cuts

[Figure: events a through f on processes p1, p2, and p3, with messages m1 and m2, along a physical time axis]

slide-24
SLIDE 24

Write down all x->y

[Figure: events a through f on processes p1, p2, and p3, with messages m1 and m2, along a physical time axis]

slide-25
SLIDE 25

Write down all x->y

[Figure: events a through f on processes p1, p2, and p3, with messages m1 and m2, along a physical time axis]

  • 1. a->b
  • 2. a->c
  • 3. a->d
  • 4. a->f
  • 5. b->c
  • 6. b->d
  • 7. b->f
  • 8. c->d
  • 9. c->f
  • 10. d->f
  • 11. e->f
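
The eleven pairs listed above can be reproduced mechanically. The sketch below assumes a layout that is consistent with that list, since the figure itself is not reproduced here: a and b on p1, c and d on p2, e and f on p3, with m1 sent at b and received at c, and m2 sent at d and received at f.

```python
from itertools import product

# Assumed layout: each process's events in local order,
# plus the (send, receive) events of each message.
process_histories = {"p1": ["a", "b"], "p2": ["c", "d"], "p3": ["e", "f"]}
messages = {"m1": ("b", "c"), "m2": ("d", "f")}

def happened_before(histories, messages):
    """Return the set of (x, y) pairs such that x -> y."""
    events = [e for history in histories.values() for e in history]
    relation = set()
    # Rule 1: consecutive events on one process are ordered.
    for history in histories.values():
        relation.update(zip(history, history[1:]))
    # Rule 2: a send happens before the matching receive.
    relation.update(messages.values())
    # Rule 3: transitivity. product() varies k slowest, as the closure requires.
    for k, i, j in product(events, repeat=3):
        if (i, k) in relation and (k, j) in relation:
            relation.add((i, j))
    return relation

pairs = sorted(happened_before(process_histories, messages))
for x, y in pairs:
    print(f"{x}->{y}")
```

Any two events that appear in neither order in the result are concurrent: nothing in the histories or messages orders them.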
slide-26
SLIDE 26

Is a->e?

[Figure: events a through f on processes p1, p2, and p3, with messages m1 and m2, along a physical time axis]

slide-27
SLIDE 27

Lamport (Logical) Clocks

  • We cannot rely on physical clocks, so we order events logically instead
  • Events on one process happen in order

– Each happens-before the next

  • The passing of messages can be used to indicate happens-before between processes

– The sending of the message happens-before the receiving of the message.

  • Logical clocks, in the form of vector clocks, are used in Dynamo: Amazon.com’s highly available key-value storage system that some of their core services use.

– See: http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf
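
The two rules above translate into a very small clock. The sketch below is a minimal illustration, not a production implementation. The event layout is an assumption, because the diagram is not reproduced here: a and b on p1, c and d on p2, e and f on p3, with m1 going from b to c and m2 from d to f.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or send: advance the clock and return the timestamp."""
        self.time += 1
        return self.time

    def receive(self, sent_time):
        """Receive: move past the sender's timestamp, then count this event."""
        self.time = max(self.time, sent_time) + 1
        return self.time

p1, p2, p3 = LamportClock(), LamportClock(), LamportClock()
L = {}
L["a"] = p1.tick()            # local event on p1
L["b"] = p1.tick()            # p1 sends m1, stamped with L(b)
L["c"] = p2.receive(L["b"])   # p2 receives m1
L["d"] = p2.tick()            # p2 sends m2, stamped with L(d)
L["e"] = p3.tick()            # local event on p3
L["f"] = p3.receive(L["d"])   # p3 receives m2
print(L)
```

Under this layout e gets timestamp 1 and b gets timestamp 2, yet nothing orders e before b. The timestamps respect happens-before, but a smaller timestamp alone does not prove it.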

slide-28
SLIDE 28

Number a-f

[Figure: events a through f on processes p1, p2, and p3, with messages m1 and m2, along a physical time axis]

slide-29
SLIDE 29

Is your numbering similar?

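[Figure: events a through f on processes p1, p2, and p3, with messages m1 and m2, along a physical time axis; the events carry the timestamp labels 2, 1, 3, 4, 5, 1]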

slide-30
SLIDE 30

What time is g?

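[Figure: events a through f on processes p1, p2, and p3, with messages m1 and m2, along a physical time axis; the events carry the timestamp labels 2, 1, 3, 4, 5, 1, and an additional event g is shown]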

slide-31
SLIDE 31

Now what time is g?

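[Figure: events a through f on processes p1, p2, and p3, with messages m1 and m2, along a physical time axis; the events carry the timestamp labels 2, 1, 3, 4, 5, 1, and an additional event g is shown]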

slide-32
SLIDE 32

L(d)>L(g) so did d happen after g?

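[Figure: events a through f on processes p1, p2, and p3, with messages m1 and m2, along a physical time axis; the events carry the timestamp labels 2, 1, 3, 4, 5, 1, and an additional event g is shown]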

slide-33
SLIDE 33

L(d)>L(g) so did d happen after g?

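[Figure: events a through f on processes p1, p2, and p3, with messages m1 and m2, along a physical time axis; the events carry the timestamp labels 2, 1, 3, 4, 5, 1, and an additional event g is shown]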

No.

– d->f implies L(d) < L(f)
– L(g) < L(d) does not imply g->d

slide-34
SLIDE 34

Problem 1

  • We have stores and warehouses all over the world
  • Each has a local system that tracks inventory.
  • What is our current level of inventory?

slide-35
SLIDE 35

Problem 2

  • We have offices around the world
  • Each is buying and selling currency
  • What is our current level of capital?

slide-36
SLIDE 36

Problem 3

  • We have a very complex chemical manufacturing plant
  • Each sensor and valve is computer controlled
  • There are some sensor and valve combinations that are very dangerous
  • How do we know if we are in one of those states?

slide-37
SLIDE 37

Finding Global States

  • It is often important to obtain the global state of a distributed system.
  • It would be nice to

– Be omniscient (see all at once)
– Or stop time

  • But in distributed systems, we only have unreliable messages that take time to go from one process to another.
  • What to do?

slide-38
SLIDE 38

Chandy & Lamport Snapshot Algorithm

  • Assumptions:

– Reliable channels
– Every message sent is reliably received

    • Once (not more than once)
    • In order (FIFO)

– There is a path between every 2 processes
– Any process may initiate a snapshot
– Processes continue as normal while snapshot is taking place.

slide-39
SLIDE 39

Simulation

  • 3 types of messages

– Coins
– Candies
– Markers

  • 3 distributed systems (3 people)

– Business logic thread:

    • Mine new candies (about one every 10 seconds)
    • Trade candies and coins

– Marker receiving rule thread
– Marker sending rule thread

  • TCP Streams (3 people)

– Move coins, candies, or markers from sender to destination

  • Monitoring (TA)

– Request a distributed system to start snapshot
– Sums all counts from distributed systems

slide-40
SLIDE 40

Snapshot Algorithm

  • What will the snapshot look like?

– The state of each process

    • I.e. how many coins and candies

– The state of each channel

    • 2 directional channels between every 2 processes
    • How many coins and candies were “on the channel”, i.e. in transit.

slide-41
SLIDE 41

Snapshot Algorithm

  • Initiator:

Record state (coins and candies)
Do Marker sending rule

  • Marker receiving rule:

When I receive a marker on channel c
If (I haven’t yet recorded my state) {
    Record my state now (count coins and candies)
    Do Marker sending rule
    Record that the state of channel c is empty
    Begin keeping track of the state of all other incoming channels
    (I.e. count coins and candies arriving via each channel.)
} else this is coming in from a new direction {
    Record that the state of channel c is all messages it has received since it began keeping track. (No more coming.)
    If (this is last channel to receive marker on)
        send records to Monitor process
}

  • Marker sending rule:

For (each outgoing channel c) {
    Send one marker message over c
}
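
The rules above can be played out in a small single-threaded simulation. This is a sketch under stated assumptions rather than a faithful implementation: three made-up processes A, B, and C each start with 2 candies and 2 coins, nobody mines during the run, and each FIFO channel is a `deque`. The `Process` class and its method names are illustrative only.

```python
from collections import deque
from itertools import permutations

MARKER = "marker"

class Process:
    def __init__(self, name, candies, coins):
        self.name = name
        self.state = {"candy": candies, "coin": coins}
        self.recorded = None     # my state at the moment I recorded it
        self.channel_state = {}  # sender -> items counted as "on the channel"
        self.tracking = set()    # incoming channels I am still counting

    def send(self, net, dest, item):
        self.state[item] -= 1
        net[(self.name, dest)].append(item)

    def record(self, net, marker_from=None):
        """Record my state, then apply the marker sending rule."""
        self.recorded = dict(self.state)
        for (src, dst), channel in net.items():
            if src == self.name:
                channel.append(MARKER)          # one marker per outgoing channel
            elif dst == self.name:
                self.channel_state[src] = []    # starts out empty
                if src != marker_from:
                    self.tracking.add(src)      # count arrivals from now on

    def deliver(self, net, src):
        """Marker receiving rule, plus normal handling of coins and candies."""
        item = net[(src, self.name)].popleft()
        if item != MARKER:
            self.state[item] += 1
            if src in self.tracking:
                self.channel_state[src].append(item)
        elif self.recorded is None:
            self.record(net, marker_from=src)   # channel src is recorded as empty
        else:
            self.tracking.discard(src)          # no more coming on this channel

procs = {n: Process(n, candies=2, coins=2) for n in "ABC"}
net = {pair: deque() for pair in permutations("ABC", 2)}  # one FIFO per direction

procs["A"].send(net, "B", "coin")    # in flight before the snapshot starts
procs["A"].record(net)               # A is the initiator
procs["C"].send(net, "A", "candy")   # C has not seen a marker yet

while any(net.values()):             # deliver until every channel is empty
    for (src, dst), channel in net.items():
        if channel:
            procs[dst].deliver(net, src)

# Monitor: sum the recorded process states and the recorded channel states.
totals = {"candy": 0, "coin": 0}
for p in procs.values():
    for item, count in p.recorded.items():
        totals[item] += count
    for items in p.channel_state.values():
        for item in items:
            totals[item] += 1
print(totals)
```

In this run the candy that C sends to A arrives after A has recorded its state but before C's marker, so it is counted as the state of the channel from C to A. The snapshot totals match the 6 candies and 6 coins the system started with.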

slide-42
SLIDE 42

Snapshot State

                                        Candies    Coins
My state when I received first marker
Channel Betsy
Channel Fred
…
TOTAL

  • You continue to mine candies after you have recorded your state and sent out markers
  • But your initially recorded state does not change
  • You continue to count candies and coins coming in on each channel until you receive a marker on it.

slide-44
SLIDE 44

Try it

  • Begin

– Trading coins and candies
– Mining new candies

  • Soon we will initiate a snapshot

– Play out the Chandy & Lamport algorithm

slide-45
SLIDE 45

How many candies are in the system?

D1 State: 2c
D3 State: 3c
D2 State: 2c

C1 Message: 3c
C2 Message: 2c
C3 Message: 1c
C4
C5
C6
C2 Message: 1c

slide-46
SLIDE 46

Summarize

  • Summarize in your own words how the Chandy & Lamport snapshot algorithm works.

  • How do its assumptions limit it?
  • What shortcomings does it have?
slide-47
SLIDE 47

Using State for Debugging

  • A: waitreply(B)
  • B:
  • C: waitreply(A)
  • D: waitreply(F)
  • E: waitreply(F)
  • F: waitreply(G)
  • G: waitreply(D)
  • How could info like this be collected?
  • What does it say about the state of the system?
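
One way to read the list above is as a wait-for graph, where a cycle means deadlock. The sketch below assumes the information has already been gathered into a single dictionary, and that each process waits on at most one other, as on the slide. The function name `find_deadlock` is made up for this example.

```python
# Who each process is waiting on (None means it is not waiting).
waits_for = {"A": "B", "B": None, "C": "A", "D": "F", "E": "F", "F": "G", "G": "D"}

def find_deadlock(waits_for):
    """Return the processes on a wait-for cycle, or [] if there is none."""
    for start in waits_for:
        path = []
        node = start
        # Follow the chain of waits until it ends or revisits a process.
        while node is not None and node not in path:
            path.append(node)
            node = waits_for.get(node)
        if node is not None:               # walked back onto the path: a cycle
            return path[path.index(node):]
    return []

print(find_deadlock(waits_for))
```

Here D, F, and G wait on each other in a circle, and E is stuck behind them. A and C can still make progress once B replies. The result is only meaningful if the wait-for information describes a consistent global state.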
