RADWAN | Rate Adaptive Wide Area Networks

SLIDE 1

RADWAN | Rate Adaptive Wide Area Networks

Rachee Singh / U. Massachusetts Amherst
Manya Ghobadi / Microsoft Research
Klaus-Tycho Foerster / University of Vienna
Mark Filer / Microsoft Research
Phillipa Gill / U. Massachusetts Amherst

SLIDE 2

Wide Area Networks

  • O(100) datacenters
  • Dedicated Wide Area Network
  • Costs O(100) million dollars per year

[SIGCOMM ’13] [SIGCOMM ’14] [SIGCOMM ’16]

SLIDE 3

  • O(100,000) miles of fiber
  • O(1,000) optical devices
  • Fiber is scarce and expensive

Identify inefficiencies in the optical backbone to gain capacity and availability at reduced cost.

SLIDE 4

This Talk

Gain 134 Tbps of capacity and prevent 25% of link failures in a large North American WAN.

SLIDE 5

Talk Outline

  1. How inefficient are optical backbones?
  2. Dynamic capacity links in WANs
  3. Challenges in dynamically adapting link capacities
  4. Rate Adaptive WANs

SLIDE 6

Optical Backbone Networks

Optical cross-connects (OXCs)

  • OXC: switches optical signals
  • Signal-to-noise ratio (SNR) measures signal quality
  • At OXC, measure signal quality
      • 8,000 wavelengths
      • Every 15 minutes
      • February 2015 to June 2017

[Diagram label: fiber]

SLIDE 7

Longitudinal Signal Quality on Fiber

[Figure: SNR of a fiber over time (higher is better). Capacity thresholds are marked at 50, 75, 100, 125, 150, 175 and 200 Gbps, along with the failure SNR. Date label: 01-07-2017.]

SLIDE 8

Opportunity for capacity gain

For 8,000 wavelengths in WAN:

  • Analyze average SNR
  • Compare with thresholds for link capacity

64% of optical wavelengths can operate at 175 Gbps or more.

95% of optical wavelengths can operate at higher than 100 Gbps.

[Figure: CDF of average SNR (dB), x-axis from 2 to 16 dB, with capacity thresholds marked at 100, 125, 150, 175 and 200 Gbps.]
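The comparison on this slide (average SNR against per-capacity thresholds) can be sketched in a few lines of Python. This is only an illustration: the slides name the capacity steps but not the dB value of each threshold, so the numbers in `THRESHOLDS_DB` are hypothetical placeholders, and the function names are made up for this sketch.

```python
from bisect import bisect_right

# Hypothetical SNR thresholds (dB) for each capacity step (Gbps).
# The slides give the capacity steps but not these dB values.
THRESHOLDS_DB = [6.5, 8.0, 10.0, 12.0, 14.0]
CAPACITIES_GBPS = [100, 125, 150, 175, 200]


def feasible_capacity(snr_db: float) -> int:
    """Return the highest capacity whose threshold is at or below snr_db (0 if none)."""
    idx = bisect_right(THRESHOLDS_DB, snr_db)
    return CAPACITIES_GBPS[idx - 1] if idx > 0 else 0


def fraction_at_or_above(avg_snrs_db, rate_gbps: int) -> float:
    """Fraction of wavelengths whose average SNR supports at least rate_gbps."""
    if not avg_snrs_db:
        return 0.0
    supported = sum(1 for snr in avg_snrs_db if feasible_capacity(snr) >= rate_gbps)
    return supported / len(avg_snrs_db)
```

Feeding the measured per-wavelength averages into `fraction_at_or_above(..., 175)` would give the kind of figure quoted on the slide, provided the real thresholds are substituted for the placeholders.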

SLIDE 9

Opportunity for availability gain

  • Distribution of link failure SNR
  • Across WAN links
  • For 2.5 years

25% of failures have SNR > 2.5 dB

These failures can be prevented by reducing link capacity to 50 Gbps

[Figure: distribution of SNR (dB) at link failure.]
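The availability argument amounts to counting failure events whose SNR stayed above the level needed for the lowest rate. A minimal sketch follows, assuming (as the slide implies) that 2.5 dB is enough to sustain 50 Gbps. The function name and the sample values in the usage note are hypothetical.

```python
# Assumption taken from the slide: a link whose SNR stays above 2.5 dB
# could keep running at 50 Gbps instead of being declared down.
MIN_SNR_FOR_50G_DB = 2.5


def preventable_fraction(failure_snrs_db, min_snr_db: float = MIN_SNR_FOR_50G_DB) -> float:
    """Fraction of failure events whose SNR at failure time exceeds min_snr_db."""
    if not failure_snrs_db:
        return 0.0
    preventable = sum(1 for snr in failure_snrs_db if snr > min_snr_db)
    return preventable / len(failure_snrs_db)
```

With made-up failure SNRs of 0.1, 0.5, 1.0 and 3.0 dB, only the last event is above the threshold, so one failure in four would count as preventable.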

SLIDE 10

Our Proposal

Dynamically adapt link capacities in response to changes in SNR.

  • Gain 134 Tbps capacity by increasing link capacity when SNR is high
  • Prevent 25% of link failures by reducing link capacity when SNR is low

SLIDE 11

Talk Outline

  1. How inefficient are optical backbones?
  2. Dynamic capacity links in WANs
  3. Challenges in dynamically adapting link capacities
  4. Rate Adaptive WANs

SLIDE 12

Challenges in dynamically adapting link capacities

  1. Requires hardware support for capacity reconfiguration
  2. Requires re-thinking IP layer traffic engineering

SLIDE 13

Key question

Can we use commodity hardware for changing link capacities?

Bandwidth Variable Transceiver

  • Arista 7504 linecards
  • Supports higher order modulations (QPSK, 8-QAM, 16-QAM)
  • Link capacity of 100G, 150G, 200G
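The three modulation formats and the three capacities line up through bits per symbol: QPSK carries 2 bits per symbol, 8-QAM 3 and 16-QAM 4. The sketch below assumes a fixed symbol rate, so that capacity scales linearly with bits per symbol. The 50 Gbps factor is inferred from the slide's 100G/150G/200G steps, not stated on it, and the names are illustrative.

```python
import math

# Assumption: at a fixed symbol rate, capacity is proportional to bits per
# symbol. 50 Gbps per bit/symbol is inferred from the 100G/150G/200G steps.
GBPS_PER_BIT_PER_SYMBOL = 50

# Number of constellation points for each modulation format.
CONSTELLATION_POINTS = {"QPSK": 4, "8-QAM": 8, "16-QAM": 16}


def bits_per_symbol(modulation: str) -> int:
    """Bits encoded per symbol: log2 of the constellation size."""
    return int(math.log2(CONSTELLATION_POINTS[modulation]))


def link_capacity_gbps(modulation: str) -> int:
    """Link capacity implied by the modulation format under the assumption above."""
    return bits_per_symbol(modulation) * GBPS_PER_BIT_PER_SYMBOL
```

Denser constellations need a higher SNR to decode reliably, which is why the usable modulation, and therefore the capacity, depends on signal quality.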

SLIDE 14

Challenge 1: Adapting capacity on commodity h/w

[Testbed diagram: Arista 7504 chassis, ports Ethernet 3/1/1 and Ethernet 4/1/1, Variable Optical Attenuator.]

Increasing noise from attenuator:

  • 200G → link down → capacity downgrade to 150G
  • 150G → link down → capacity downgrade to 100G

Takes over 1 minute to change capacity → link downtime

SLIDE 15

Problem

Commodity hardware is not optimized for dynamically adapting link capacity.

SLIDE 16

Question

What causes latency of capacity reconfiguration?

[Timeline, about 1 minute in total: turn off laser → program registers → turn laser on → laser is on. The link is usable before and after this sequence and not usable during it.]

Majority of time spent in turning laser on.

SLIDE 17

Question

Can we reduce the latency of capacity reconfiguration by not turning off the laser?

SLIDE 18

Question

Can we reduce the latency of capacity reconfiguration by not turning off the laser?

Acacia BVT Evaluation Board

  • Do not turn off laser in the evaluation board
  • Program registers for modulation change
  • Repeat experiment 200 times

If the laser is left on, the outage is only 35 ms to change capacity
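To put the two outage figures side by side, the arithmetic can be written out as below. The 60-second value is a lower bound taken from "over 1 minute" on SLIDE 14, and the 35 ms value is from the evaluation board result. Using 200 capacity changes as the workload is an arbitrary choice borrowed from the experiment's repeat count.

```python
# Outage per capacity change, in seconds.
LASER_CYCLE_OUTAGE_S = 60.0  # "over 1 minute" on commodity hardware, so a lower bound
LASER_ON_OUTAGE_S = 0.035    # 35 ms when the laser is left on


def total_outage_s(num_changes: int, per_change_s: float) -> float:
    """Cumulative link outage for a number of capacity changes."""
    return num_changes * per_change_s


def outage_ratio() -> float:
    """How many times longer the laser-cycling outage is than the laser-on outage."""
    return LASER_CYCLE_OUTAGE_S / LASER_ON_OUTAGE_S


if __name__ == "__main__":
    print(total_outage_s(200, LASER_CYCLE_OUTAGE_S))  # at least 12000 seconds
    print(total_outage_s(200, LASER_ON_OUTAGE_S))     # about 7 seconds
    print(outage_ratio())                             # about 1714
```

Under these figures, 200 changes cost more than three hours of outage when the laser is cycled and about seven seconds when it is left on.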

SLIDE 19

Key question

How should traffic engineering incorporate dynamic capacity links?

SLIDE 20

Question

How should traffic engineering incorporate dynamic capacity links?

  • Capacity changes cause links to be unavailable for carrying traffic.
  • Capacity changes lead to network churn and can be disruptive.

SLIDE 21

Talk Outline

  1. How inefficient are optical backbones?
  2. Dynamic capacity links in WANs
  3. Challenges in dynamically adapting link capacities
  4. Rate Adaptive WANs

SLIDE 22

Solution

We design the Rate Adaptive Wide Area Network (RADWAN) traffic engineering controller.

  • SNR-aware: knows possible capacity gain of each link
  • Minimally disruptive: reconfigures capacity while minimizing network churn
  • Rate adaptive: adapts link rates to meet demands and improve availability

SLIDE 23

RADWAN Traffic Engineering Formulation

Inputs

  • Network topology
  • Optical topology and SNR
  • Demand matrix
  • Current flow allocation

Optimization objective and constraints

Outputs

  • Flow allocations
  • Links to reconfigure
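The slide does not spell out the objective or the constraints, so the following is not the RADWAN formulation. It is a deliberately simple greedy stand-in that shows the trade-off the inputs and outputs suggest: capacity gained by reconfiguring a link against the traffic disturbed while it reconfigures. It ignores routing and the demand matrix's structure entirely. The `Link` fields, `links_to_reconfigure`, and `churn_weight` are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    current_gbps: int     # capacity the link is configured for today
    feasible_gbps: int    # capacity its current SNR could support
    carried_gbps: float   # traffic currently allocated to the link


def links_to_reconfigure(links, unmet_demand_gbps: float, churn_weight: float = 1.0):
    """Greedily pick capacity upgrades until the unmet demand is covered.

    Candidates are ranked by capacity gain minus a churn penalty proportional
    to the traffic that would be disturbed during reconfiguration.
    """
    candidates = [ln for ln in links if ln.feasible_gbps > ln.current_gbps]
    candidates.sort(
        key=lambda ln: (ln.feasible_gbps - ln.current_gbps) - churn_weight * ln.carried_gbps,
        reverse=True,
    )
    chosen = []
    remaining = unmet_demand_gbps
    for link in candidates:
        if remaining <= 0:
            break
        chosen.append(link.name)
        remaining -= link.feasible_gbps - link.current_gbps
    return chosen
```

Raising `churn_weight` steers the choice toward lightly loaded links, which is one way to read "minimally disruptive". A real controller would instead solve for flow allocations and reconfigurations jointly.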

SLIDE 24

Proof of concept: RADWAN

[Testbed diagram: nodes A, B, C and D; fiber spans of 390 km, 375 km, 410 km and 365 km; legend: Router, Amplifier.]

SLIDE 25

Throughput Gains with RADWAN

[Figure: network throughput (Gbps) for SWAN [SIGCOMM ’13], SWAN-150, RADWAN and RADWAN-hitless.]

RADWAN has 40% higher network throughput compared to SWAN

SLIDE 26

Conclusion

  • Physical layer today is configured statically
  • We show that this leaves money on the table, in terms of
      • Network performance capacity
      • Link availability
      • Equipment cost ($/Gbps)
  • RADWAN introduces programmability in Layer 1
      • Improves network throughput by 40%
      • Reduces link downtime by a factor of 18
      • Reduces equipment cost ($/Gbps) by 32%