 
Accurate Network Clock Synchronization at Scale
Balaji Prabhakar
Joint work with: Yilong Geng, Zi Yin, Shiyu Liu, Ashish Naik, Mendel Rosenblum and Amin Vahdat

Background: Clock Synchronization
A classical hard problem: affects performance of distributed systems
• Can boost performance of existing solutions
  - e.g., in finance, in databases (causality and external consistency)
• Or enable new ones
  - e.g., distributed ledgers, fine-grained resource and task scheduling
In financial trading, end-to-end clock synchronization can provide
• Fairness in trading exchanges, enabling them to move to the Cloud
• Accurate timestamps for market data collected at different geographic regions
• Solutions to a host of other issues (e.g., preventing front-running)

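Synchronizing clocks with probes
[Figure: timeline of two probes exchanged between clocks A and B against true time $t$, one in each direction, giving the four timestamps $TX_A$, $RX_B$, $TX_B$, $RX_A$.]
$t_A = t + \Delta t_A$
$t_B = t + \Delta t_B$
All methods of clock sync use these 4 timestamps; main differences:
1. Where to take timestamps?
2. How to process the timestamps?
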
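As a sketch of how the four timestamps are commonly combined, the block below derives the midpoint estimate used in NTP-style two-way exchanges. This is not a formula stated on the slide. The one-way delays $d_{AB}$ and $d_{BA}$ are symbols introduced here for illustration; $TX_A$, $RX_B$ are the timestamps of the probe from A to B and $TX_B$, $RX_A$ those of the probe from B to A.

```latex
\begin{align*}
RX_B - \Delta t_B &= TX_A - \Delta t_A + d_{AB} && \text{(probe A to B)}\\
RX_A - \Delta t_A &= TX_B - \Delta t_B + d_{BA} && \text{(probe B to A)}\\[4pt]
\Delta t_B - \Delta t_A
  &= \frac{(RX_B - TX_A) + (TX_B - RX_A)}{2} - \frac{d_{AB} - d_{BA}}{2}\\[4pt]
d_{AB} + d_{BA} &= (RX_B - TX_A) - (TX_B - RX_A)
\end{align*}
```

The first term of the third line is computable from the timestamps alone. The second term is not: it vanishes only when the two one-way delays are equal, so any path asymmetry or unequal queueing shows up directly as offset error.
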
Where to take timestamps
[Figure: the path from Server A to Server B through CPU, NIC, PHY and switch queues, marking the layers at which NTP, PTP and DTP take their timestamps.]
Huygens
• Uses NIC or CPU timestamps and, respectively, gets ns- and us-level sync accuracy
• Is a software-based approach

First, Let's Look at Factors Which Make Clock Sync Hard
Each clock is different
• Clocks have different resonant frequencies
• Clocks respond differently to the same frequency or offset control signal
[Figure: clock behavior over time, with one point annotated "Control Stopped".]

Factors Which Make Clock Sync Hard
A clock's behavior varies over time
• Due to temperature
• Due to vibration noise (e.g., cooling fans), etc.
[Figure: clock behavior plotted against time of day.]

What Makes Clock Sync Hard
The network connecting the clocks
• Variations in path delays which affect clock sync probe latencies
• Network path asymmetries (propagation delays from A → B and B → A are different)
Therefore, to achieve good clock sync
• Need to solve both the heterogeneous clocks and the jittery network problems
• Continuously

How to process timestamps
Two approaches
1. Treat each set of timestamp quadruples individually
   - DTP and PTP
   - Susceptible to clock rounding errors
2. Treat many sets of timestamp quadruples collectively
   A. NTP does this for timestamp quads from multiple packets between a single pair of machines
   B. Huygens does this for timestamp quads from many probes and across a network of machines
      • Goes below clock rounding errors
      • Synchronizes clocks to 10s of nanoseconds under heavy load
      • Achieves global consensus of time

Current Solutions: Discussion
Accurate sync methods aim to use the network to synchronize clocks
• To synchronize clocks A and B in end-hosts, synchronize all the switches/links between A and B (or have them participate in "transparent mode")
  - e.g., PTP, DTP
• Use dedicated wires of precisely known lengths to send time pulses along
  - PPS
• Use a dedicated synchronous Ethernet network to convey pulses/time
  - White Rabbit
All of the above
• Require hardware upgrades and/or special protocols
• Do not scale to large distances or large numbers of nodes
NTP, on the other hand, is scalable to 1000s of nodes across large distances
• But is quite inaccurate (100s of us to 10s of milliseconds) and has a high variance

Our Approach: The Huygens Algorithm
Huygens is a software-based algorithm that can
• Work off of timestamps from the NICs, hosts, VMs or containers
• Accuracy
  - NIC timestamps → nanosecond level
  - host/VM/container timestamps → 100s of ns to 1-2 us in a single DC
  - across multiple DCs → 1 to 10 us, depending on the WAN link quality
Paper: "Exploiting a Natural Network Effect for Scalable, Fine-grained Clock Synchronization", NSDI 2018. https://www.usenix.org/conference/nsdi18/presentation/geng
Huygens can synchronize clocks at just the desired nodes
• No need to sync all intermediate clocks → enables scaling in size and distance
• Being a software overlay, it needs no hardware upgrades and can deploy in current DCs and Clouds

The Huygens Algorithm
Problem: Given N clocks connected by a packet-switched network, synchronize them as accurately as possible
Introduce a probe mesh:
• Each clock randomly picks a constant number, say 5, of other clocks to probe
• Probes are acked
• Each probe or ack carries a transmit timestamp and a receive timestamp from the sending and receiving clocks
• Probing overhead: minimal, roughly 700Kbps in total per node, counting probes and acks

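The peer-selection step of the probe mesh can be sketched as follows. This is an illustrative sketch only: the function name `build_probe_mesh`, the `seed` parameter and the node names are assumptions made for the example, and the slide does not say how the random choice is implemented.

```python
import random


def build_probe_mesh(nodes, k=5, seed=None):
    """Map each node to k distinct peers, chosen uniformly at random.

    Hypothetical helper: k=5 follows the "say 5" on the slide; everything
    else (names, seeding) is an assumption made for illustration.
    """
    rng = random.Random(seed)
    mesh = {}
    for node in nodes:
        # A node never probes itself.
        candidates = [n for n in nodes if n != node]
        mesh[node] = rng.sample(candidates, min(k, len(candidates)))
    return mesh


nodes = [f"server{i}" for i in range(20)]
mesh = build_probe_mesh(nodes, k=5, seed=1)
```

Because every node picks a constant number of peers, the total probing load grows linearly with the number of clocks rather than quadratically, as it would if every pair probed each other.
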
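Consider one pair of clocks at servers A and B
[Figure: timeline of a probe from A to B and its ack from B to A against true time $t$, with timestamps $TX_A$, $RX_B$ for the probe and $TX_B$, $RX_A$ for the ack.]
$t_A = t + \Delta t_A$
$t_B = t + \Delta t_B$
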
Each probe/ack bounds clock discrepancy
Probe from A to B: receive time = transmit time + delay
• $RX_B - \Delta t_B = TX_A - \Delta t_A + \text{propagation and queueing delay}$
• $\Delta t_B - \Delta t_A = RX_B - TX_A - \text{propagation and queueing delay}$
• $\Delta t_B - \Delta t_A < RX_B - TX_A$
Ack from B to A:
• $\Delta t_B - \Delta t_A > TX_B - RX_A$

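The two inequalities can be turned into a small calculation: every probe gives an upper bound on the discrepancy Δt_B − Δt_A and every ack gives a lower bound, so the tightest pair comes from the least-delayed packet in each direction. The sketch below assumes the discrepancy is constant over the batch (no drift), and the function name and sample numbers are made up for illustration.

```python
def discrepancy_bounds(probes, acks):
    """Return (lower, upper) bounds on dt_B - dt_A.

    probes: list of (tx_a, rx_b) timestamp pairs for packets sent A -> B.
    acks:   list of (tx_b, rx_a) timestamp pairs for packets sent B -> A.
    Assumes the clock discrepancy does not change across the batch.
    """
    # Each probe: dt_B - dt_A < rx_b - tx_a, so the smallest value is tightest.
    upper = min(rx_b - tx_a for tx_a, rx_b in probes)
    # Each ack: dt_B - dt_A > tx_b - rx_a, so the largest value is tightest.
    lower = max(tx_b - rx_a for tx_b, rx_a in acks)
    return lower, upper


# Made-up data in ns: B's clock is 100 ns ahead of A's.
# Probes A -> B experience delays of 5, 8 and 20 ns.
probes = [(0, 105), (10, 118), (20, 140)]
# Acks B -> A experience delays of 6 and 12 ns.
acks = [(200, 106), (210, 122)]

lower, upper = discrepancy_bounds(probes, acks)
print(lower, upper)  # 94 105
```

The gap between the two bounds equals the smallest one-way delay in each direction added together (5 + 6 = 11 ns here), which is why packets with no queueing delay are the valuable ones.
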
Clock bounds over time
[Figure: upper and lower bounds on the clock discrepancy between A and B (us) plotted against time (sec). Annotations: offset -93.3 us; offset -96.6 us; drift -1.65 us/sec.]

More examples of drifting clocks
[Figure: histogram of the number of server pairs versus clock drift (us/sec), with drift ranging from -30 to 30 us/sec.]
Clocks can drift away from each other as fast as 30 us/sec

Nonlinear Clock Drifts
• Clock drifting speed varies over time due to temperature variations
• Approximate clock difference with piecewise linear functions

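One way to picture the piecewise-linear approximation is to cut the offset samples into fixed-length windows and fit a straight line to each. The sketch below uses ordinary least squares as a stand-in; the window length, the function names and the sample data are assumptions for illustration, not the fitting procedure used by Huygens.

```python
def fit_line(points):
    """Least-squares fit of y = slope * t + intercept to (t, y) points.

    Needs at least two points with distinct t values.
    """
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    cov = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = cov / var_t
    return slope, mean_y - slope * mean_t


def piecewise_fit(samples, window):
    """Fit one line per consecutive window of `window` seconds.

    Returns {window_start_time: (drift, intercept)}.
    """
    buckets = {}
    for t, y in samples:
        buckets.setdefault(int(t // window), []).append((t, y))
    return {
        idx * window: fit_line(pts)
        for idx, pts in sorted(buckets.items())
        if len(pts) >= 2
    }


# Made-up offsets (us): drift of -1.65 us/sec for 2 s, then +0.5 us/sec.
first = [(t, -93.3 - 1.65 * t) for t in (0.0, 0.5, 1.0, 1.5)]
second = [(t, -96.6 + 0.5 * (t - 2.0)) for t in (2.0, 2.5, 3.0, 3.5)]
segments = piecewise_fit(first + second, window=2.0)
```

Each entry of `segments` holds the drift (slope, in us/sec) and intercept for one window, so a clock difference at any time is read off the line of the window that contains it.
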
3 Steps to Finding the Middle Red Line
1. Support vector machine
2. Coded probes
3. Network effect

Step 1: SVMs
How to identify packets with zero queueing delays and no timestamp noise?
SVM achieves sync accuracy of 300~400 ns.
Noisy timestamps cause synchronization errors!

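Step 2: Coded probes
[Figure: a pair of probe packets sent 10 us apart crosses the network. The spacing observed at the receiver indicates what happened in transit:]
• >> 10 us: second packet delayed more
• << 10 us: first packet delayed more
• ~ 10 us: likely no queueing delay
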
Coded probes
Empirically, coded probes filter out 90% of bad data and reduce the clock sync error by a factor of 4.

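The coded-probe check can be sketched as a filter on the spacing between the two packets of a pair. The send gap is measured on the sender's clock and the receive gap on the receiver's clock, so the unknown offset between the two clocks cancels. The tolerance value, function name and sample data below are assumptions for illustration.

```python
def filter_coded_probes(pairs, tolerance_us=1.0):
    """Keep probe pairs whose receive spacing matches their send spacing.

    pairs: list of ((tx1, rx1), (tx2, rx2)) timestamps in us.
    tolerance_us is a hypothetical threshold chosen for this sketch.
    """
    kept = []
    for (tx1, rx1), (tx2, rx2) in pairs:
        tx_gap = tx2 - tx1  # spacing at the sender, e.g. 10 us
        rx_gap = rx2 - rx1  # spacing at the receiver
        if abs(rx_gap - tx_gap) <= tolerance_us:
            kept.append(((tx1, rx1), (tx2, rx2)))
    return kept


pairs = [
    ((0, 105.0), (10, 115.2)),    # gap ~10 us: likely no queueing delay
    ((100, 205.0), (110, 240.0)), # gap >> 10 us: second packet delayed more
    ((200, 330.0), (210, 332.0)), # gap << 10 us: first packet delayed more
]
kept = filter_coded_probes(pairs)
```

A pair that passes is only "likely" clean, as the slide says: both packets could in principle have been delayed by the same amount, which this check cannot detect.
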
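Step 3: The network effect
[Figure: clocks A, B and C connected in a loop. Most edge labels did not survive extraction, apart from the value 3.3, which appears three times.]
• A: "If my clock is at 10, B's clock must be at 10:15"
• B: "If my clock is at 10:15, C's clock must be at 10:05"
• C: "If my clock is at 10:05, A's clock must be at 9:50"
• "Guys, we are off by 10 minutes!"
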
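The A, B, C example can be checked numerically: true pairwise offsets around any loop must sum to zero, so a nonzero sum (the loop surplus) exposes estimation error. The sketch below computes the surplus and, as one simple choice, splits the correction evenly over the three edges. The even split and the function names are assumptions for this single-loop illustration; a real network has many overlapping loops that have to be reconciled together.

```python
def loop_surplus(edge_offsets):
    """Sum of estimated pairwise offsets around a closed loop (0 if consistent)."""
    return sum(edge_offsets)


def correct_loop(edge_offsets):
    """Remove an equal share of the surplus from every edge of the loop."""
    share = loop_surplus(edge_offsets) / len(edge_offsets)
    return [offset - share for offset in edge_offsets]


# Estimated offsets in minutes, from the slide:
# A -> B: +15 (10:00 -> 10:15), B -> C: -10 (10:15 -> 10:05),
# C -> A: -15 (10:05 -> 9:50).
offsets = [15, -10, -15]

surplus = loop_surplus(offsets)
corrected = correct_loop(offsets)
print(surplus)  # -10
```

After the correction the three offsets sum to zero again, and each edge has moved by 10/3, roughly 3.3 minutes.
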
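The network effect
[Figure: example probe graph over clocks A to F, and a plot of mean and 99th-percentile error (ns, 0 to 50) versus K (0 to 12).]
[Formula, only partly recoverable: relates the error after the network effect (N.E.) to the error before it.]
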
Pilots and deployments
Google – Jupiter testbed
• 3-stage 40Gb/s Clos network
• 20 racks, 237 servers
A 10G-40G production network
• 5-stage Clos network
Stanford testbed
• 2-layer 1G network
• 8 racks, 128 servers
• Cisco 2960 and Cisco 3560 switches
Many financial firms

Stanford Testbed
• 2-stage 1Gb/s Clos network
• 8 racks, 128 servers
[Figure: photo of the Stanford testbed, labeled "Cisco 2960".]

Comparison with NTP
            NTP                                 Huygens (with NIC timestamps)
            Mean abs.     99th percentile       Mean abs.     99th percentile
            error         abs. error            error         abs. error
0% load     177.7 ns      558.8 ns              10.2 ns       18.5 ns
40% load    77,975 ns     347,638 ns            11.2 ns       22.0 ns
80% load    211,011 ns    778,070 ns            14.3 ns       32.7 ns

Robust to high network load
[Figure: mean and 99th-percentile error (ns, 0 to 60) versus network load (%, 0 to 90).]
Huygens: Synchronization error stays under 50 ns at 90% load

Demo of Clock Sync in the Cloud