Accurate Network Clock Synchronization at Scale

Accurate Network Clock Synchronization at Scale. Balaji Prabhakar. Joint work with: Yilong Geng, Zi Yin, Shiyu Liu, Ashish Naik, Mendel Rosenblum and Amin Vahdat


slide-1
SLIDE 1

Balaji Prabhakar

Joint work with: Yilong Geng, Zi Yin, Shiyu Liu, Ashish Naik, Mendel Rosenblum and Amin Vahdat

Accurate Network Clock Synchronization at Scale

slide-2
SLIDE 2

A classical hard problem: affects performance of distributed systems

  • Can boost performance of existing solutions
  • e.g., in finance, in databases (causality and external consistency)
  • Or enable new ones
  • e.g., distributed ledgers, fine-grained resource and task scheduling

In financial trading, end-to-end clock synchronization can provide

  • Fairness in trading exchanges, enabling them to move to the Cloud
  • Accurate timestamps for market data collected at different geographic regions
  • A solution to a host of other issues (e.g., preventing front-running)

Background: Clock Synchronization

slide-3
SLIDE 3

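Synchronizing clocks with probes

[Figure: two probes exchanged between clocks A and B along a common time axis $t$, with timestamps $TX_A$, $RX_B$, $TX_B$, $RX_A$]

$t_A = t + \Delta t_A$, $t_B = t + \Delta t_B$

All methods of clock sync use these 4 timestamps; main differences:

1. Where to take timestamps?
2. How to process the timestamps?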
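For reference, the textbook way to process a single set of four timestamps is the two-way exchange estimate. The sketch below is a minimal illustration of that classical estimator, not the method proposed in this talk. It assumes the one-way delays in the two directions are equal. The function name and the example numbers are hypothetical: `tx_a` and `rx_a` are read from clock A, `rx_b` and `tx_b` from clock B.

```python
def two_way_estimate(tx_a, rx_b, tx_b, rx_a):
    """Classical per-exchange estimate from four timestamps.

    Returns (offset of clock B relative to clock A, round-trip network delay).
    Assumes symmetric one-way delays, which real networks often violate.
    """
    forward = rx_b - tx_a   # one-way delay A->B plus the offset (B minus A)
    reverse = rx_a - tx_b   # one-way delay B->A minus the offset (B minus A)
    offset = (forward - reverse) / 2.0
    round_trip = forward + reverse
    return offset, round_trip


# Hypothetical example: B runs 5 us ahead of A, each direction takes 10 us.
offset_us, rtt_us = two_way_estimate(tx_a=100.0, rx_b=115.0, tx_b=120.0, rx_a=125.0)
print(offset_us, rtt_us)
```

Any queueing delay that hits only one direction shifts this estimate by half of that extra delay, which is one reason the choice of where to timestamp and how to process the timestamps matters.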

slide-4
SLIDE 4

Where to take timestamps

[Figure: path from Server A (CPU, NIC, PHY) through a switch (PHY, queue, PHY) to Server B (PHY, NIC, CPU), labeled with NTP, PTP and DTP]

Huygens

  • Uses NIC or CPU timestamps and, respectively, gets ns- and us-level sync accuracy
  • Is a software-based approach

slide-5
SLIDE 5

First, Let’s Look at Factors Which Make Clock Sync Hard

Each clock is different

  • Clocks have different resonant frequencies
  • Clocks respond differently to the same frequency or offset control signal

[Figure: plot annotated "Control Stopped"]

slide-6
SLIDE 6

Factors Which Make Clock Sync Hard

A clock’s behavior varies over time

  • Due to temperature
  • Due to vibration noise (e.g., cooling fans), etc.

[Figure: clock behavior plotted against time of day]

slide-7
SLIDE 7

The network connecting the clocks

  • Variations in path delays, which affect clock sync probe latencies
  • Network path asymmetries (propagation delays from A→B and B→A are different)

What Makes Clock Sync Hard

Therefore, to achieve good clock sync

  • Need to solve both the heterogeneous clocks and the jittery network problems
  • Continuously
slide-8
SLIDE 8

Two approaches

  • 1. Treat each set of timestamp quadruples individually
  • DTP and PTP
  • Susceptible to clock rounding errors

How to process timestamps

slide-9
SLIDE 9

Two approaches

  • 1. Treat each set of timestamp quadruples individually
  • DTP and PTP
  • Susceptible to clock rounding errors
  • 2. Treat many sets of timestamp quadruples collectively

A. NTP does this for timestamp quads from multiple packets between a single pair of machines
B. Huygens does this for timestamp quads from many probes and across a network of machines

  • Goes below clock rounding errors
  • Synchronizes clocks to 10s of nanoseconds under heavy load
  • Achieves global consensus of time

How to process timestamps

slide-10
SLIDE 10

Accurate sync methods aim to use the network to synchronize clocks

  • To synchronize clocks A and B in end-hosts, synchronize all the switches/links between A and B (or have them participate in “transparent mode”)

⎼ e.g., PTP, DTP

  • Use dedicated wires of precisely known lengths to send time pulses along

⎼ PPS

  • Use a dedicated synchronous Ethernet network to convey pulses/time

⎼ White Rabbit

All of the above

  • Require hardware upgrades and/or special protocols
  • Do not scale to large distances or large numbers of nodes

NTP, on the other hand, is scalable to 1000s of nodes across large distances

  • But is quite inaccurate (100s of us – 10s of milliseconds) and has a high variance

Current Solutions: Discussion

slide-11
SLIDE 11

Our Approach: The Huygens Algorithm

Huygens is a software-based algorithm that can

  • Work off of timestamps from the NICs, hosts, VMs or containers
  • Accuracy

⎼ NIC timestamps → nanosecond level
⎼ host/VM/container timestamps → 100s of ns to 1-2 us in a single DC
⎼ across multiple DCs → 1-10 us, depending on the WAN link quality

Paper: “Exploiting a Natural Network Effect for Scalable, Fine-grained Clock Synchronization”, NSDI 2018. https://www.usenix.org/conference/nsdi18/presentation/geng

Huygens can synchronize clocks at just the desired nodes

  • No need to sync all intermediate clocks → enables scaling in size and distance
  • Being a software overlay, it needs no hardware upgrades and can deploy in current DCs and Clouds

slide-12
SLIDE 12

Problem: Given N clocks connected by a packet-switched network, synchronize them as accurately as possible

Introduce a probe mesh:

  • Each clock randomly picks a constant number, say 5, of other clocks to probe
  • Probes are acked
  • Each probe or ack carries a transmit timestamp and a receive timestamp from the sending and receiving clocks
  • Probing overhead: minimal, roughly 700Kbps, in total per node, counting probes and acks

The Huygens Algorithm
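The probe mesh construction can be sketched in a few lines. This is a minimal illustration, assuming each node draws its probe targets uniformly at random without replacement. The function name `build_probe_mesh`, the `seed` parameter and the node labels are hypothetical and do not come from the talk.

```python
import random


def build_probe_mesh(nodes, k=5, seed=None):
    """Map each node to k randomly chosen other nodes that it will probe.

    Assumption: targets are drawn uniformly without replacement. If there
    are fewer than k other nodes, a node probes all of them.
    """
    rng = random.Random(seed)
    mesh = {}
    for node in nodes:
        others = [n for n in nodes if n != node]
        mesh[node] = rng.sample(others, min(k, len(others)))
    return mesh


# Hypothetical example: 20 servers, each probing 5 others.
servers = ["server-%d" % i for i in range(20)]
mesh = build_probe_mesh(servers, k=5, seed=1)
for source in servers[:3]:
    print(source, "probes", mesh[source])
```

Because every node picks the same constant number of targets, the total number of probe pairs grows linearly with the number of clocks rather than quadratically.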

slide-13
SLIDE 13

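Consider one pair of clocks at servers A and B

[Figure: a probe from A to B and its ack from B to A along a common time axis $t$, with timestamps $TX_A$, $RX_B$, $TX_B$, $RX_A$]

$t_A = t + \Delta t_A$, $t_B = t + \Delta t_B$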

slide-14
SLIDE 14

Probe from A to B:

  • Receive time = transmit time + delay
  • $RX_B - \Delta t_B = TX_A - \Delta t_A + \text{propagation and queueing delay}$
  • $\Delta t_B - \Delta t_A = RX_B - TX_A - \text{propagation and queueing delay}$
  • $\Delta t_B - \Delta t_A < RX_B - TX_A$

Ack from B to A:

  • $\Delta t_B - \Delta t_A > TX_B - RX_A$

Each probe/ack bounds clock discrepancy
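The bounds on this slide can be sketched as a small helper. This is a minimal illustration: the function names and example numbers are hypothetical. `tx_a` and `rx_a` are read from clock A, `rx_b` and `tx_b` from clock B. The second function additionally assumes the discrepancy does not change across the samples it combines, which only holds over intervals short enough for drift to be negligible.

```python
def discrepancy_bounds(tx_a, rx_b, tx_b, rx_a):
    """Bounds on the discrepancy of clock B relative to clock A.

    The probe A->B gives an upper bound, the ack B->A gives a lower bound,
    because network delays are always positive.
    """
    upper = rx_b - tx_a
    lower = tx_b - rx_a
    return lower, upper


def tightest_bounds(exchanges):
    """Combine several probe/ack exchanges into the tightest interval.

    Assumption: the discrepancy is constant over these exchanges (no drift).
    """
    bounds = [discrepancy_bounds(*exchange) for exchange in exchanges]
    return max(low for low, _ in bounds), min(high for _, high in bounds)


# Hypothetical example: B is 5 us ahead of A; delays vary from probe to probe.
exchanges = [
    (100.0, 115.0, 120.0, 125.0),  # 10 us each way -> bounds (-5, 15)
    (200.0, 207.0, 210.0, 208.0),  # 2 us then 3 us -> bounds (2, 7)
]
print(tightest_bounds(exchanges))
```

Exchanges that experienced the smallest delays give the tightest bounds, which is why filtering out delayed probes matters later in the talk.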

slide-15
SLIDE 15

Clock bounds over time

[Figure: bounds on $\Delta t_B - \Delta t_A$ (us) plotted against $t_A$ (sec)]

Offset: −93.3 us
Drift: −1.65 us/sec
Offset: −96.6 us

slide-16
SLIDE 16

Clocks can drift away from each other as fast as 30 us/sec

More examples of drifting clocks

[Figure: histogram of clock drift (us/sec, from −30 to 30) against number of server pairs]

slide-17
SLIDE 17
  • Clock drifting speed varies over time due to temperature variations
  • Approximate clock difference with piecewise linear functions

Nonlinear Clock Drifts
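A piecewise linear approximation can be sketched as follows. This is a minimal illustration that fits an ordinary least-squares line in each fixed-length time window. Both the fixed window length and the use of least squares are assumptions made for the example: the talk does not say how segments are chosen, and it fits its line with an SVM rather than least squares. All names are hypothetical.

```python
def fit_line(times, values):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


def piecewise_linear(times, values, window):
    """Fit one line per window of length `window`.

    Assumes `times` is strictly increasing. Returns a list of
    (segment start time, drift, offset at time zero) tuples.
    """
    segments = []
    start = 0
    while start < len(times):
        end = start
        while end < len(times) and times[end] < times[start] + window:
            end += 1
        if end - start >= 2:  # a line needs at least two points
            slope, intercept = fit_line(times[start:end], values[start:end])
            segments.append((times[start], slope, intercept))
        start = end
    return segments


# Hypothetical clock differences (us) sampled once per second: the drift
# changes from +2 us/sec to -1 us/sec at t = 3 s.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
diffs = [0.0, 2.0, 4.0, 10.0, 9.0, 8.0]
print(piecewise_linear(times, diffs, window=3.0))
```

Shorter windows track temperature-driven drift changes more closely but leave fewer samples per fit, so the window length is a trade-off.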

slide-18
SLIDE 18

1. Support vector machine
2. Coded probes
3. Network effect

3 Steps to Finding the Middle Red Line

slide-19
SLIDE 19

SVM achieves sync accuracy of 300~400 ns. Noisy timestamps cause synchronization errors!

Step 1: SVMs

How to identify packets with zero queueing delays and no timestamp noise?

slide-20
SLIDE 20

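Step 2: Coded probes

[Figure: a first packet and a second packet sent 10 us apart through the network, with three possible spacings at the receiver]

  • Spacing >> 10 us: second packet delayed more
  • Spacing << 10 us: first packet delayed more
  • Spacing ~ 10 us: likely no queueing delay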
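The coded-probe check can be sketched as a filter on the receive spacing of each packet pair. This is a minimal illustration: the 10 us transmit gap comes from the slide, while the tolerance of 1 us, the function names and the example timestamps are hypothetical.

```python
def classify_pair(rx_first, rx_second, tx_gap=10.0, tolerance=1.0):
    """Classify one coded-probe pair from its receive timestamps (in us).

    tx_gap is the spacing at the sender. tolerance is an assumed threshold
    for how far the receive spacing may deviate and still count as clean.
    """
    rx_gap = rx_second - rx_first
    if rx_gap > tx_gap + tolerance:
        return "second packet delayed more"
    if rx_gap < tx_gap - tolerance:
        return "first packet delayed more"
    return "likely no queueing delay"


def keep_clean_pairs(pairs, tx_gap=10.0, tolerance=1.0):
    """Keep only the pairs whose spacing survived the network unchanged."""
    return [
        pair for pair in pairs
        if classify_pair(pair[0], pair[1], tx_gap, tolerance) == "likely no queueing delay"
    ]


# Hypothetical receive timestamps (us) for three pairs sent 10 us apart.
pairs = [(0.0, 10.2), (50.0, 75.0), (100.0, 103.0)]
for first, second in pairs:
    print(first, second, classify_pair(first, second))
print(keep_clean_pairs(pairs))
```

Only the timestamps from the surviving pairs would then be passed on to the estimation step.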

slide-21
SLIDE 21

Empirically, coded probes filter out 90% of bad data and reduce the clock sync error by a factor of 4.

Coded probes

slide-22
SLIDE 22

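Step 3: The network effect

[Figure: three clocks A, B and C probing each other in a loop]

  • A: If my clock is at 10, B’s clock must be at 10:15
  • B: If my clock is at 10:15, C’s clock must be at 10:05
  • C: If my clock is at 10:05, A’s clock must be at 9:50
  • Guys, we are off by 10 minutes!

[Figure: loops labeled with pairwise offset estimates; each edge of the three-clock loop receives a correction of 3.3]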
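The loop argument can be sketched numerically. Pairwise offsets around a closed loop must sum to zero, so any non-zero sum (the loop surplus) is estimation error. The sketch below spreads that surplus equally over the edges of one loop, which is an assumption made for illustration and should not be read as the full mesh-wide procedure of the talk. The function name is hypothetical, and the numbers are the minutes from the A, B, C example.

```python
def correct_loop(edge_offsets):
    """Remove the loop surplus by spreading it equally over the edges.

    edge_offsets[i] is the estimated offset of the next clock relative to
    the current one while walking once around a closed loop. After the
    correction the offsets sum to zero, as consistent clocks require.
    """
    surplus = sum(edge_offsets)
    share = surplus / len(edge_offsets)
    return [offset - share for offset in edge_offsets]


# A says B is +15 min, B says C is -10 min, C says A is -15 min.
estimates = [15.0, -10.0, -15.0]
print("loop surplus:", sum(estimates))
print("corrected:", correct_loop(estimates))
```

With three edges and a surplus of 10 minutes in magnitude, each edge moves by about 3.3 minutes.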

slide-23
SLIDE 23

The network effect

[Figure: nodes A, B, C, D, E, F]

[Figure: mean and 99th percentile error (ns) versus K, for K from 2 to 12]

$\mathrm{std}(\text{err after N.E.}) \approx \frac{1}{\sqrt{K}}\,\mathrm{std}(\text{err before N.E.})$

slide-24
SLIDE 24

Google – Jupiter testbed

  • 3-stage 40Gb/s Clos network
  • 20 racks, 237 servers

A 10G-40G production network

  • 5-stage Clos network

Stanford testbed

  • 2-layer 1G network
  • 8 racks, 128 servers
  • Cisco 2960 and Cisco 3560 switches

Many financial firms

Pilots and deployments

slide-25
SLIDE 25

Stanford

  • 2-stage 1Gb/s Clos network
  • 8 racks, 128 servers

Stanford Testbed

Cisco 2960

slide-26
SLIDE 26

Comparison with NTP

             NTP (with NIC timestamps)                  Huygens
             Mean abs. error   99th pct. abs. error     Mean abs. error   99th pct. abs. error
0% load      177.7 ns          558.8 ns                 10.2 ns           18.5 ns
40% load     77,975 ns         347,638 ns               11.2 ns           22.0 ns
80% load     211,011 ns        778,070 ns               14.3 ns           32.7 ns

slide-27
SLIDE 27

[Figure: mean and 99th percentile error (ns) versus network load (%)]

Robust to high network load

Huygens: Synchronization error stays under 50 ns at 90% load

slide-28
SLIDE 28

Comparison with NTP

             NTP (with NIC timestamps)                  Huygens
             Mean abs. error   99th pct. abs. error     Mean abs. error   99th pct. abs. error
0% load      177.7 ns          558.8 ns                 10.2 ns           18.5 ns
40% load     77,975 ns         347,638 ns               11.2 ns           22.0 ns
80% load     211,011 ns        778,070 ns               14.3 ns           32.7 ns

slide-29
SLIDE 29

Demo of Clock Sync in the Cloud