SLIDE 1

Enabling Flow-level Latency Measurements across Routers in Data Centers

Parmjeet Singh, Myungjin Lee, Sagar Kumar, Ramana Rao Kompella

SLIDE 2

Latency-critical applications in data centers

 Guaranteeing low end-to-end latency is important

 Web search (e.g., Google’s instant search service)
 Retail advertising
 Recommendation systems
 High-frequency trading in financial data centers

 Operators want to troubleshoot latency anomalies

 End-host latencies can be monitored locally
 Detection, diagnosis, and localization across the network: no native support for latency measurements in routers/switches

SLIDE 3

Prior solutions

 Lossy Difference Aggregator (LDA)

 Kompella et al. [SIGCOMM ’09]
 Aggregate latency statistics

 Reference Latency Interpolation (RLI)

 Lee et al. [SIGCOMM ’10]
 Per-flow latency measurements

RLI is more suitable here due to its finer-grained, per-flow measurements

SLIDE 4

Deployment scenario of RLI

 Upgrading all switches/routers in a data center network

 Pros

 Provides the finest granularity of latency anomaly localization

 Cons

 Significant deployment cost
 Possible downtime of the entire production data center

 In this work, we consider partial deployment of RLI

 Our approach: RLI across Routers (RLIR)

SLIDE 5

Overview of RLI architecture

 Goal

 Latency statistics on a per-flow basis between interfaces

 Problem setting

 No per-packet timestamp storage at ingress and egress, due to high storage and communication cost

 Regular packets do not carry timestamps

[Figure: a router with ingress interface I and egress interface E]
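An illustrative back-of-envelope, ours rather than the slides': at 10 Gbps, minimum-size frames (84 bytes on the wire) arrive at about 10^10 / (84 × 8) ≈ 14.9 M packets/s, so storing even an 8-byte timestamp per packet would consume roughly 119 MB/s of state per interface, before counting the cost of shipping those timestamps from ingress to egress.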

SLIDE 6

Overview of RLI architecture

 Premise of RLI: delay locality (packets that are close in time experience similar delays)

 Approach (steps 3 and 4 are sketched in code after the figure below)

1) The injector sends reference packets regularly
2) Reference packet carries ingress timestamp
3) Linear interpolation: compute per-packet latency estimates at the latency estimator
4) Per-flow estimates by aggregating per-packet estimates

[Figure: reference packet injector at ingress I, latency estimator at egress E; a delay-vs-time plot shows reference packet delays (R), the linear interpolation line between them, and interpolated delays (L) for regular packets 1 and 2]
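To make steps 3 and 4 concrete, a minimal Python sketch of interpolation-based estimation follows. This is our illustration, not the authors' implementation: the data layout (ref_samples, regular_pkts) and the function names are assumptions, and a real estimator would run online at the egress rather than over pre-sorted lists.

```python
from collections import defaultdict

def interpolate_delays(ref_samples, regular_pkts):
    """Step 3: estimate each regular packet's delay by linearly
    interpolating between the delays of the surrounding reference packets.

    ref_samples:  [(arrival_time, measured_delay)] for reference packets,
                  sorted by arrival_time
    regular_pkts: [(arrival_time, flow_id)], sorted by arrival_time
    """
    estimates = []
    i = 0
    for t, flow in regular_pkts:
        # Advance to the reference interval that contains time t
        while i + 1 < len(ref_samples) and ref_samples[i + 1][0] <= t:
            i += 1
        t0, d0 = ref_samples[i]
        t1, d1 = ref_samples[min(i + 1, len(ref_samples) - 1)]
        if t1 == t0:
            delay = d0  # only one usable sample: no interpolation possible
        else:
            delay = d0 + (d1 - d0) * (t - t0) / (t1 - t0)
        estimates.append((flow, delay))
    return estimates

def per_flow_latency(estimates):
    """Step 4: aggregate per-packet estimates into per-flow mean latency."""
    acc = defaultdict(lambda: [0.0, 0])
    for flow, delay in estimates:
        acc[flow][0] += delay
        acc[flow][1] += 1
    return {flow: total / count for flow, (total, count) in acc.items()}
```

Because each regular packet's delay is read off the line joining the two nearest reference delays, accuracy rests on the delay-locality premise above.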

SLIDE 7

Full vs. Partial deployment

 Full deployment: 16 RLI sender-receiver pairs
 Partial deployment: 4 RLI senders + 2 RLI receivers
 81.25% deployment cost reduction

[Figure: six-switch topology (Switch 1 through Switch 6) annotated with RLI senders (reference packet injectors) and RLI receivers (latency estimators)]
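One way to read the 81.25% figure, assuming each sender and each receiver counts as one deployed component: full deployment needs 16 × 2 = 32 components, partial deployment needs 4 + 2 = 6, and 1 − 6/32 = 0.8125, an 81.25% reduction.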

SLIDE 8

Case 1: Presence of cross traffic

 Issue: Inaccurate link utilization estimation at the sender leads to a high reference packet injection rate

 Approach

 Not actively addressing the issue
 Evaluation shows little impact on the packet loss rate
 Details in the paper

[Figure: six-switch topology; cross traffic joins the bottleneck link downstream, while the RLI sender estimates link utilization at Switch 1]
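For intuition on why a utilization underestimate matters, RLI senders pace reference packet injection by the estimated utilization of the outgoing link; if cross traffic joining downstream is invisible to the sender, the estimate is low and the sender over-injects. A minimal sketch of such utilization-adaptive pacing; the EWMA weight and rate thresholds are illustrative assumptions, not values from the paper:

```python
def update_injection_rate(bytes_sent, interval_s, link_capacity_bps,
                          ewma_util, alpha=0.25):
    """Estimate outgoing-link utilization with an EWMA over the last
    interval and back the reference packet rate off as the link fills.
    Returns (new_ewma_util, reference_packets_per_second)."""
    inst_util = (bytes_sent * 8) / (interval_s * link_capacity_bps)
    ewma_util = alpha * inst_util + (1 - alpha) * ewma_util
    if ewma_util < 0.5:
        rate = 1000   # lightly loaded: inject densely for accuracy
    elif ewma_util < 0.9:
        rate = 100    # moderately loaded: back off
    else:
        rate = 10     # near saturation: inject sparingly
    # If cross traffic is invisible here, inst_util is an underestimate,
    # so the chosen rate is higher than the bottleneck link can spare.
    return ewma_util, rate
```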

SLIDE 9

Case 2: RLI Sender side

 Issue: Traffic may take different routes at an intermediate switch

 Approach: Sender sends reference packets to all receivers

[Figure: six-switch topology; traffic entering at the RLI sender can be forwarded to either RLI receiver at an intermediate switch]

SLIDE 10

Case 3: RLI Receiver side

 Issue: Hard to associate reference packets with regular packets that traversed the same path

 Approaches

 Packet marking: requires native support from routers
 Reverse ECMP computation: ‘reverse’ engineer intermediate routes using the ECMP hash function (see the sketch below)
 IP prefix matching, in limited situations

[Figure: six-switch topology with RLI senders (reference packet injectors) and RLI receivers (latency estimators)]
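A minimal sketch of the reverse ECMP computation idea. It is purely illustrative: production switches use vendor-specific hash functions, so the MD5 hash and the function names here are stand-ins. The receiver replays the hash an intermediate switch would have computed over a packet's 5-tuple to infer which next hop, and hence which path, the packet took:

```python
import hashlib

def ecmp_next_hop(five_tuple, next_hops):
    """Deterministically map a flow 5-tuple (src IP, dst IP, src port,
    dst port, protocol) to one of the ECMP next hops, mimicking a
    switch's hash-based path selection."""
    key = "|".join(map(str, five_tuple)).encode()
    bucket = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
    return next_hops[bucket % len(next_hops)]

def took_same_path(pkt_tuple, ref_tuple, next_hops):
    """Associate a regular packet with a reference packet only if the
    intermediate switch would have hashed both onto the same next hop."""
    return ecmp_next_hop(pkt_tuple, next_hops) == ecmp_next_hop(ref_tuple, next_hops)
```

With that, the estimator can restrict interpolation to reference packets that hashed onto the same next hop as the regular packet.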

SLIDE 11

Deployment example in fat-tree topology

[Figure: fat-tree topology annotated with RLI senders (reference packet injectors) and RLI receivers (latency estimators); some sender-receiver associations use IP prefix matching, others reverse ECMP computation / IP prefix matching]

SLIDE 12

Evaluation

 Simulation setup

 Trace: regular traffic (22.4M pkts) + cross traffic (70M pkts)
 Simulator (see figure below)

 Results

 Accuracy of per-flow latency estimates

[Figure: simulator setup; a packet trace feeds a traffic divider in front of Switch 1 and Switch 2, a cross traffic injector adds cross traffic, the RLI sender injects reference packets at a 10% or 1% injection rate, and the RLI receiver estimates latencies]

SLIDE 13

Accuracy of per-flow latency estimates

[Figure: CDFs of the relative error of per-flow latency estimates at 93% bottleneck link utilization, comparing 10% and 1% reference packet injection rates; annotated relative errors include 1.2%, 4.5%, 18%, 31%, and 67%]

SLIDE 14

Summary

 Low-latency applications in data centers

 Localization of latency anomalies is important

 RLI provides flow-level latency statistics, but full deployment (i.e., at all routers/switches) is expensive

 Proposed a solution enabling partial deployment of RLI

 Little loss in localization granularity (i.e., anomalies are still localized to every other router)

SLIDE 15

Thank you! Questions?