SLIDE 1

Keeping in Sync

George Neville-Neil

gnn@neville-neil.com

October 2014

George Neville-Neil (gnn@neville-neil.com) Keeping in Sync October 2014 1 / 28

SLIDE 2

Introduction

The Need for Accurate Time

◮ Coordination Amongst Distributed Systems
◮ Cellular Networks
◮ Factory Automation
◮ Electrical Power Grids
◮ High Frequency Trading

SLIDE 3

Introduction

Why Aren’t Computer Clocks Accurate?

◮ Commodity crystals can drift by ±30 seconds per year
◮ Can wander by tens of milliseconds over brief periods
◮ Environmental Effects
◮ Crystal Aging
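
The ±30 seconds per year figure can be restated as a frequency error, which is how oscillator accuracy is usually quoted. The sketch below is plain arithmetic; the function names are illustrative only, and a 365-day year is assumed.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000; assumes a 365-day year


def drift_ppm(seconds_per_year: float) -> float:
    """Express a yearly drift as a frequency error in parts per million."""
    return seconds_per_year / SECONDS_PER_YEAR * 1e6


def drift_per_day_ms(seconds_per_year: float) -> float:
    """Average drift accumulated in one day, in milliseconds."""
    return seconds_per_year / 365 * 1000


print(f"{drift_ppm(30):.2f} ppm")                # 0.95 ppm
print(f"{drift_per_day_ms(30):.0f} ms per day")  # 82 ms per day
```

Even at this rate an uncorrected clock accumulates tens of milliseconds of error every day, far more than the applications listed earlier can tolerate.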

SLIDE 4

Introduction

Hardware Solutions

◮ Better Crystals
◮ Add-in Cards
◮ All of these are expensive: $1000 per host

SLIDE 5

Introduction

Network Solutions

◮ Ask a “better” clock what time it is
◮ daytime protocol, circa 1983
◮ NTP circa 1980 (The Gold Standard)
◮ Precision Time Protocol 2002/2008

SLIDE 6

Introduction

Precision Time Protocol

◮ First defined as IEEE-1588/2002 (Version 1)
◮ Current version is IEEE-1588/2008 (Version 2)
◮ Heavy use of Multicast
◮ Works best on a LAN

SLIDE 7

Theory of Operation

Theory of Operation

◮ A Really Good Clock Exists
◮ Grandmaster Multicasts its time on the LAN
◮ Slaves work out network latency to Grandmaster
◮ Slaves adjust their local clocks based on their measurements
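
The last step, adjusting the local clock from measurements, is commonly done with a feedback loop rather than by jumping the clock. The sketch below uses a proportional-integral (PI) controller as one plausible approach; the slides do not say which algorithm the implementation uses, and the class name, gains, and units are all assumptions made for illustration.

```python
class PIServo:
    """Hypothetical PI controller that turns a measured offset into a
    frequency correction. Gains are illustrative, not tuned values."""

    def __init__(self, kp: float = 0.1, ki: float = 0.001):
        self.kp = kp          # proportional gain
        self.ki = ki          # integral gain
        self.integral = 0.0   # accumulated correction for steady drift

    def update(self, offset_ns: float) -> float:
        """offset_ns is slave time minus master time, in nanoseconds.

        Returns a frequency adjustment in arbitrary units: negative
        means slow the local clock down, positive means speed it up.
        """
        self.integral += self.ki * offset_ns
        return -(self.kp * offset_ns + self.integral)


servo = PIServo()
adjustment = servo.update(1000.0)  # slave is 1 microsecond ahead
print(adjustment < 0)              # True: the clock should be slowed
```

A real daemon would hand the returned value to the operating system's clock-adjustment interface on every measurement, so the offset shrinks gradually and time never runs backwards.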

SLIDE 8

Theory of Operation

Terms

Grandmaster: The Really Good Clock
Steering: The process of manipulating the local clock
Slave: A client that wishes to steer its clock
Offset: Time difference between slave and master
Delay: Time a packet takes to go between the master and slave

SLIDE 9

Theory of Operation

Hardware Time-stamping

◮ Time-stamps should be recorded as close to the PHY as possible
◮ Recording a time-stamp in the kernel is better than in user space
◮ The kernel is really not quite good enough

SLIDE 10

Theory of Operation

Possible Time-stamping Locations

SLIDE 11

Theory of Operation

PTP Packets

◮ ANNOUNCE
◮ SYNC
◮ FOLLOWUP
◮ DELAY REQUEST
◮ DELAY RESPONSE
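
On the wire, each of these packets is identified by a four-bit message type in the first octet of the PTP Version 2 common header. The sketch below maps the five packets to their IEEE 1588-2008 type codes and pulls out the sequence id; the class and function names are illustrative, and the parser deliberately ignores every other header field.

```python
import struct
from enum import IntEnum


class MessageType(IntEnum):
    """PTPv2 message type codes for the five packets listed above."""
    SYNC = 0x0
    DELAY_REQ = 0x1
    FOLLOW_UP = 0x8
    DELAY_RESP = 0x9
    ANNOUNCE = 0xB


HEADER_LEN = 34  # length of the PTPv2 common header in octets


def parse_header(packet: bytes):
    """Return (message type, PTP version, sequence id) from a raw packet.

    Raises ValueError for short packets and for message types other
    than the five above (for example the peer-delay messages).
    """
    if len(packet) < HEADER_LEN:
        raise ValueError("packet shorter than PTP common header")
    message_type = MessageType(packet[0] & 0x0F)  # low nibble of octet 0
    version = packet[1] & 0x0F                    # low nibble of octet 1
    (sequence_id,) = struct.unpack_from("!H", packet, 30)
    return message_type, version, sequence_id


# Build a minimal fake SYNC header to exercise the parser.
header = bytearray(HEADER_LEN)
header[0] = MessageType.SYNC
header[1] = 2
struct.pack_into("!H", header, 30, 1234)
print(parse_header(bytes(header)))
```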

SLIDE 12

Theory of Operation

SYNC

◮ Main packet from master to slave
◮ Carries the time from the master (T1)
◮ Sent every 2 seconds (default)
◮ Slave timestamps these packets (T2)

SLIDE 13

Theory of Operation

FOLLOWUP

◮ Used when master does not have hardware time-stamping
◮ Follows the SYNC packet
◮ Carries the time at which the SYNC was transmitted

SLIDE 14

Theory of Operation

DELAY REQUEST

◮ Sent from the slave to the master
◮ First of a pair of packets used to derive network delay
◮ Slave records when it sent this packet (T3)
◮ Can now be sent via unicast or multicast

SLIDE 15

Theory of Operation

DELAY RESPONSE

◮ Master responds with this packet to a DELAY REQUEST
◮ Carries the time at which the DELAY REQUEST arrived (T4)
◮ Used by the slave to determine network delay
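
Once the slave holds all four timestamps (T1 and T4 from the master's clock, T2 and T3 from its own), it can separate the clock offset from the network delay. The sketch below uses the standard PTP delay request-response arithmetic and assumes the path is symmetric, that is, the delay is the same in both directions; the function name and the nanosecond units are illustrative.

```python
def offset_and_delay(t1: int, t2: int, t3: int, t4: int):
    """Compute (offset, delay) from the four PTP timestamps.

    t1: master time when SYNC was sent
    t2: slave time when SYNC arrived
    t3: slave time when DELAY REQUEST was sent
    t4: master time when DELAY REQUEST arrived
    A positive offset means the slave clock is ahead of the master.
    """
    master_to_slave = t2 - t1  # true delay plus offset
    slave_to_master = t4 - t3  # true delay minus offset
    delay = (master_to_slave + slave_to_master) / 2
    offset = (master_to_slave - slave_to_master) / 2
    return offset, delay


# Slave runs 50 us ahead of the master; one-way delay is 100 us.
# All values are in nanoseconds.
print(offset_and_delay(1_000_000, 1_150_000, 2_000_000, 2_050_000))
# (50000.0, 100000.0)
```

Any asymmetry between the two directions shows up directly as an error in the computed offset, which is why the sources of inaccuracy discussed under Tuning matter so much.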

SLIDE 16

Theory of Operation

Network Time Diagram

[Figure: message sequence between Master and Slave. SYNC leaves the master at T1 and arrives at the slave at T2; FOLLOWUP carries T1; DELAY REQUEST leaves the slave at T3 and arrives at the master at T4; DELAY RESPONSE carries T4.]

SLIDE 17

Implementation

Features

◮ Can run as a master, serving hundreds of slaves
◮ Runs as a slave on the client
◮ Lightweight, does not consume many resources
◮ Uses BPF/PCAP timestamps for enhanced accuracy on FreeBSD
◮ Records performance data
◮ Records quality files (more later)

SLIDE 18

Implementation

Supporting Scripts

ptplib: R library for log files
offset.R: Graphs offset and delay data from log files
compare.R: Compares two clients' log files
stats.R: Raw statistics

SLIDE 19

Measurements

Measuring Slave Synchronization

◮ All slaves record several measurements
  ◮ Master to Slave Delay
  ◮ Slave to Master Delay
  ◮ Offset from Master
  ◮ Quality

SLIDE 20

Measurements

Inter-Slave Quality Measurement

◮ What often most concerns us is synchronization between slaves
◮ We need an external event that all slaves see equally
◮ It is necessary to line up measurements between multiple slaves
◮ Use the SYNC packet’s sequence id
◮ SYNC is multicast so all slaves see the same packet at the same time
  ◮ More or less
◮ Record local time on each SYNC
◮ Post processing shows us how well synchronized two slaves are
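
The post-processing step can be sketched as a join on sequence id. The example below assumes each slave's quality data has already been loaded into a dictionary mapping SYNC sequence id to the local arrival time in nanoseconds; that in-memory format and the function name are hypothetical, since the actual quality file layout is not shown in the slides.

```python
def inter_slave_differences(log_a: dict, log_b: dict) -> dict:
    """Line up two slaves' records by SYNC sequence id.

    Returns {sequence id: local time on A minus local time on B} for
    every SYNC that both slaves recorded. Packets seen by only one
    slave are skipped.
    """
    common = sorted(log_a.keys() & log_b.keys())
    return {seq: log_a[seq] - log_b[seq] for seq in common}


# Hypothetical records: sequence id -> local arrival time (ns).
slave_a = {100: 1_000_000_000, 101: 3_000_000_050, 102: 5_000_000_010}
slave_b = {101: 3_000_000_020, 102: 5_000_000_020, 103: 7_000_000_000}

print(inter_slave_differences(slave_a, slave_b))
# {101: 30, 102: -10}
```

Because both slaves timestamp the same multicast packet, each difference approximates how far apart the two clocks were at that instant, give or take the "more or less" of unequal delivery times. A longer-running analysis would also need to account for the sequence id wrapping around, which this sketch ignores.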

SLIDE 21

Measurements

Offset Measurements for Unsynchronized System

SLIDE 22

Measurements

Offset Measurements for System Synchronized to NTP

SLIDE 23

Measurements

Offset Measurements for PTP Synchronized System

SLIDE 24

Measurements

Quality Measurements for PTP Synchronized System

SLIDE 25

Tuning

Sources of Timing Inaccuracy

◮ Network Jitter
◮ Store and Forward Routers
◮ Packet Asynchrony
◮ Competing Network Traffic
◮ Interrupt Moderation in NIC Drivers

SLIDE 26

Current Status and Future Work

Current Status

◮ Maintained at SourceForge
◮ BSD License
◮ Runs on FreeBSD, Linux and Mac OS
◮ FreeBSD Ports
◮ Red Hat RPM

SLIDE 27

Current Status and Future Work

Future Work

◮ Support for low-level timestamping
◮ netmap(4) support
◮ DPDK support
◮ Better support for embedded systems
◮ Move to GitHub
◮ Always more testing to do
