Perseverance-Aware Traffic Engineering in Rate-Adaptive Networks


Perseverance-Aware Traffic Engineering in Rate-Adaptive Networks with Reconfiguration Delay Shih-Hao Tseng, (pronounced as She-How Zen) October 10, 2019 Department of Computing and Mathematical Sciences, California Institute of Technology


SLIDE 1

Perseverance-Aware Traffic Engineering in Rate-Adaptive Networks with Reconfiguration Delay

Shih-Hao Tseng, (pronounced as “She-How Zen”) October 10, 2019

Department of Computing and Mathematical Sciences, California Institute of Technology

SLIDE 2

Optical Networks

  • Modern wide-area networks consist of expensive optical fibers.
  • The capacity of the optical fibers is determined by the signal-to-noise ratio (SNR) and the adopted modulation (such as PSK, QAM, etc.).

[Figure: modulation and noise determine capacity]

SLIDE 3

Rate-Adaptive Networks

  • In practice, the SNR is much better than the adopted modulation requires.
  • RADWAN (Singh et al., 2018) leverages bandwidth variable transceivers (BVTs) to change the modulation and vary the capacity.

[Figure: modulation and noise determine capacity]

Singh et al., “RADWAN: Rate Adaptive Wide Area Network,” 2018.

SLIDE 5

Rate-Adaptive Networks: Challenge

  • Reconfiguration delay: While the modulation is being changed, the optical link is down for a while.

[Figure: reconfiguration delay]

SLIDE 7

One-Shot Update and Churn

  • The reconfiguration delay causes traffic disturbance, which is named churn in RADWAN.
  • Adaptive links bring higher final throughput while causing churn. RADWAN updates the links in one shot and addresses the trade-off by

    max (final throughput) − ε · (churn)

    where ε is the trade-off factor.

SLIDE 8

One-Shot Update

  • One-shot update leads to considerable traffic fluctuation.

[Figure: initial → one-shot → final]

SLIDE 9

Multi-Step Reconfiguration

  • One-shot update leads to considerable traffic fluctuation.
  • We can update links in batches to reduce the traffic fluctuation by introducing intermediate steps, similar to SWAN (Hong et al., 2013).

[Figure: initial → one-shot → final, compared with initial → step 1 → step 2 → final]

Hong et al., “Achieving High Utilization with Software-Driven WAN,” 2013.

SLIDE 11

Multi-Step Reconfiguration and Perseverance

  • Given a multi-step plan, we can consider not only the total impact (churn) but also the smoothness of the update.
  • We propose perseverance to describe the smoothness of the transition. The perseverance level is defined as the maximum allowed throughput drop between two consecutive steps.

[Figure: throughput versus step (1 to 4); each drop between consecutive steps is ≤ 40%; perseverance level = 40%]

SLIDE 12

Multi-Step Reconfiguration and Perseverance

  • Taking perseverance into consideration, we consider the following optimization, which differs from RADWAN’s churn-based proposal:

    max (final throughput in T steps)
    s.t. (perseverance level ≥ ρ)

    where ρ is the lower bound on the perseverance level.
  • A multi-step reconfiguration allows higher final throughput without degrading the perseverance level.

SLIDE 13

Multi-Step Reconfiguration and Perseverance

[Figure: initial → one-shot → final; perseverance level ρ = 0]

SLIDE 14

Multi-Step Reconfiguration and Perseverance

[Figure: initial → step 1 → step 2 → final; perseverance level ρ = 0.5]

SLIDE 15

Rate Adaptation Planning (RAP) Problem

  • Consider a network shared by N flows. Each flow n sends at rate $x_n(t)$ during step t. With the horizon T, we can write the discrete-time control formulation of the rate adaptation planning (RAP) problem as

    $$\mathrm{RAP} = \max \sum_{n \in N} x_n(T)$$

    subject to capacity constraints, perseverance constraints, initial constraints, and feasibility constraints.

SLIDE 16

Rate Adaptation Planning (RAP) Problem: Constraints

  • Perseverance constraints: Given a perseverance level ρ, the perseverance constraint can be written as $\rho\, x_n(t-1) \le x_n(t)$ for all t = 1, 2, ..., T.
  • Initial constraints: $x_n(0)$ is given for all n. Each link has an initial capacity.
  • Feasibility constraints: Each flow n has a predetermined path set to send its traffic. $x_n(t)$ is the sum of the traffic along all the paths.
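
As a minimal sketch of the perseverance constraint above, the following Python helpers check the rate sequence of a single flow. The function names and the floating-point tolerance are assumptions made for illustration, and the sample rates are not taken from the slides.

```python
def satisfies_perseverance(rates, rho, tol=1e-9):
    """Return True if rho * x(t-1) <= x(t) holds for every step t = 1..T.

    `rates` is the sequence x(0), x(1), ..., x(T) of one flow.
    """
    return all(rho * prev <= cur + tol for prev, cur in zip(rates, rates[1:]))


def tightest_level(rates):
    """Largest rho the sequence satisfies: the minimum of x(t) / x(t-1).

    Steps with x(t-1) == 0 impose no restriction and are skipped.
    """
    ratios = [cur / prev for prev, cur in zip(rates, rates[1:]) if prev > 0]
    return min(ratios) if ratios else float("inf")


# Illustrative numbers: the throughput dips during the update, then recovers.
plan = [200.0, 100.0, 200.0, 400.0]
print(satisfies_perseverance(plan, rho=0.5))  # True: the worst step keeps 50%
print(satisfies_perseverance(plan, rho=0.6))  # False: 100 < 0.6 * 200
print(tightest_level(plan))                   # 0.5
```

With ρ = 0 every sequence passes, which corresponds to an unconstrained one-shot update.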

SLIDE 17

Rate Adaptation Planning (RAP) Problem: Constraints

  • Capacity constraints: The capacity $c_l$ of a link l is determined by the adopted modulation (and the underlying SNR). Once the modulation is changed, the link is down for one step.

[Figure: timeline t = 0, 1, 2, 3, 4, T = 5]

SLIDE 19

Mixed Integer Linear Programming Formulation

  • Under a fixed SNR, we can show that an optimal update plan can be achieved by changing the modulation on each link l at most once, to the one providing the highest capacity.
  • As such, we introduce the auxiliary integer variable $z_l(t)$ for each link l to indicate whether the modulation of l has been changed by step t.

[Figure: timeline t = 0, ..., T = 5 with z_l(0) = 0 and z_l(1) = z_l(2) = z_l(3) = z_l(4) = z_l(5) = 1 (modulation changed at step 1)]

SLIDE 20

Mixed Integer Linear Programming Formulation

[Figure: timeline t = 0, ..., T = 5 with z_l(0) = z_l(1) = z_l(2) = 0 and z_l(3) = z_l(4) = z_l(5) = 1 (modulation changed at step 3)]

SLIDE 21

Mixed Integer Linear Programming Formulation

[Figure: timeline t = 0, ..., T = 5 with z_l(0) = z_l(1) = z_l(2) = z_l(3) = z_l(4) = z_l(5) = 0 (modulation never changed)]

SLIDE 22

Mixed Integer Linear Programming Formulation

  • Transformed capacity constraints: Using $z_l(t)$, we can write the capacity as

    $$c_l(t) = c_l^{\min}\,(1 - z_l(t)) + c_l^{\max}\, z_l(t-1)$$

    where $c_l^{\min}$ and $c_l^{\max}$ are the minimum and the maximum achievable capacity of the link under the SNR.

[Figure: timeline t = 0, ..., T = 5 with z_l(0) = 0 and z_l(1) = z_l(2) = z_l(3) = z_l(4) = z_l(5) = 1]
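
The capacity expression can be evaluated directly to see the one-step outage it encodes. This is a minimal sketch: the function name `link_capacity` and the capacities of 100 and 200 Gbps are illustrative assumptions, while the three indicator sequences are the ones shown on the timelines of the preceding slides.

```python
def link_capacity(z, c_min, c_max):
    """Capacity c_l(t) for t = 1..T from the indicator sequence z_l(0..T).

    Implements c_l(t) = c_min * (1 - z_l(t)) + c_max * z_l(t - 1).
    """
    return [c_min * (1 - z[t]) + c_max * z[t - 1] for t in range(1, len(z))]


# Capacities in Gbps are assumptions for illustration only.
print(link_capacity([0, 1, 1, 1, 1, 1], c_min=100, c_max=200))
# [0, 200, 200, 200, 200]  (down at t = 1, then at the higher rate)
print(link_capacity([0, 0, 0, 1, 1, 1], c_min=100, c_max=200))
# [100, 100, 0, 200, 200]  (down at t = 3)
print(link_capacity([0, 0, 0, 0, 0, 0], c_min=100, c_max=200))
# [100, 100, 100, 100, 100]  (never changed)
```

At the step where z_l switches from 0 to 1, both terms vanish, so the link carries nothing for exactly one step.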

SLIDE 23

Mixed Integer Linear Programming Formulation

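$$
\begin{aligned}
\max \quad & \sum_{n \in N} x_n(T) && \text{(RAP)} \\
\text{s.t.} \quad & \text{capacity constraints} \\
& \text{perseverance constraints} \\
& \text{initial constraints} \\
& \text{feasibility constraints} \\
& z_l(t-1) \le z_l(t) && \forall t \in T,\ l \in L \\
& z_l(t) \in \{0, 1\} && \forall t \in T,\ l \in L
\end{aligned}
$$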
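
To make the formulation concrete, here is a brute-force sketch on a toy instance, assuming a single flow whose paths are parallel links, so that its rate is bounded only by the total capacity. It enumerates the step at which each link changes modulation instead of calling a MILP solver, and it is not the proposed algorithm. All names and numbers are hypothetical.

```python
from itertools import product


def capacity_at(t, switch_step, c_min, c_max):
    """Capacity of one link at step t if its modulation changes at switch_step.

    switch_step = None means the link is never reconfigured.
    """
    if switch_step is None or t < switch_step:
        return c_min
    if t == switch_step:
        return 0.0  # the link is down during the reconfiguration step
    return c_max


def best_plan(links, x0, rho, horizon):
    """Enumerate switch steps per link; return (final throughput, switch steps).

    Assumes horizon >= 1 and one flow that can use every link.
    """
    choices = [None] + list(range(1, horizon + 1))
    best = (float("-inf"), None)
    for plan in product(choices, repeat=len(links)):
        lower = x0  # smallest rate the perseverance constraint still allows
        feasible = True
        for t in range(1, horizon + 1):
            total = sum(capacity_at(t, s, lo, hi) for s, (lo, hi) in zip(plan, links))
            lower *= rho
            if lower > total + 1e-9:
                feasible = False
                break
        if feasible and total > best[0]:
            best = (total, plan)  # total is the capacity at the final step
    return best


links = [(100.0, 200.0), (100.0, 200.0)]  # (c_min, c_max) per link, assumed
print(best_plan(links, x0=200.0, rho=0.0, horizon=2))  # (400.0, (1, 1))
print(best_plan(links, x0=200.0, rho=0.5, horizon=3))  # (400.0, (1, 2))
```

With ρ = 0 both links switch at once (one-shot). With ρ = 0.5 the links must be staggered, which costs one extra step but never drops the flow below half of its previous rate. The enumeration grows exponentially in the number of links, so it only serves as an illustration.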

SLIDE 30

Analysis of Rate Adaptation Planning (RAP) Problem

  • Can we solve RAP in polynomial time?
    → Unlikely: RAP is NP-hard.
  • Can we approximate RAP within a constant factor?
    → No, unless P = NP.
  • Why is RAP so hard?
    → Under some mild assumptions, we can always reach the optimal final throughput, although the update sequence may be extremely long.
    → We would prefer to finish the update in a bounded number of steps. Therefore, we need some good heuristics for RAP.

SLIDE 32

Algorithm Design Ideas

  • Find a feasible reconfiguration plan.
  • Fix the configuration (i.e., when we should change the modulations of which links) and maximize the usage of available links.

In sum, we design our algorithm (ALG) to

  1. solve a 2-step LP relaxation at time t for relaxed $z_l(t) \in (0, 1)$;
  2. round $z_l(t)$ up to integers to form the configuration at t;
  3. iterate through t = 1, ..., T − 1 to obtain the configurations and find the work-conserving reconfiguration plan.
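
The rounding in step 2 can be sketched as follows. This is only an illustration under stated assumptions: the relaxed values are taken as already computed by some LP solver (not shown), the helper name `upround` and the tolerance for solver noise are hypothetical, and the monotonicity z_l(t − 1) ≤ z_l(t) is enforced by never resetting a link that has already changed.

```python
import math


def upround(z_relaxed, z_prev, tol=1e-9):
    """Round relaxed indicators in [0, 1] up to {0, 1}.

    z_relaxed: dict link -> fractional value from the LP relaxation at step t.
    z_prev:    dict link -> integer configuration at step t - 1.
    """
    z_now = {}
    for link, value in z_relaxed.items():
        rounded = 0 if value <= tol else math.ceil(value)
        # A link whose modulation has changed stays changed.
        z_now[link] = max(rounded, z_prev.get(link, 0))
    return z_now


print(upround({"a": 0.0, "b": 0.3, "c": 1.0}, {"a": 0, "b": 0, "c": 0}))
# {'a': 0, 'b': 1, 'c': 1}
```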

SLIDE 33

Proposed Algorithm (ALG)

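[Figure: ALG with T = 5. Stage 1: for each t = 1, ..., 4, a relaxation with the constraint $x_n(t) \ge \rho^t x_n(0)$ and the objective $\max \sum_{n \in N} x_n(t+1)$. Stage 2: the resulting configurations $z_l(t)$ for t = 1, ..., 4. Stage 3: with the configurations fixed, an objective over $t \in T$ and $n \in N$ of $x_n(t)$, which yields the per-path rates $x_n^p(1), \ldots, x_n^p(5)$.]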

SLIDE 34

Simulations: Questions of Interest

  • Is it still beneficial to have rate-adaptive links under the reconfiguration delay and perseverance constraints?

  • What is a reasonable perseverance level?
  • How well does perseverance smoothen the process?
  • How hard is it to find a perseverance-aware solution?

SLIDE 35

Simulation Setup

  • We simulate RADWAN, the optimal solution to RAP (OPT), and the proposed algorithm (ALG) on existing WAN topologies: SWAN, Internet2, and B4.
  • The baseline case: T = 5 and ρ = 0.5.

[Figure: topologies. (a) SWAN (8 nodes, 12 links); (b) B4 (18 nodes, 39 links)]

SLIDE 37

Advantage of Rate Adaptive Links

Table 1: Average throughput (Gbps) under different WANs. Our methods OPT and ALG boost the throughput by 40% to 50% while ensuring a steadier reconfiguration plan.

             ρ ≈ 1                  ρ = 0.5               ρ ≈ 0
topology     w/o adaptive links     OPT        ALG        RADWAN†
SWAN         681.85                 998.623    998.611    998.625
Internet2    1071.15                1510.13    1509.89    1510.15
B4           2621.12                3919.81    3919.14    3919.87

†We also simulate RADWAN with ε = 0.001, and the resulting average throughput remains the same as with ε = 0.1.
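
The 40% to 50% figure in the caption can be reproduced from the table values. The short check below is a sketch that compares OPT at ρ = 0.5 with the case without adaptive links; the variable and function names are chosen here for illustration.

```python
# (without adaptive links, OPT at rho = 0.5) in Gbps, copied from Table 1.
table1 = {
    "SWAN": (681.85, 998.623),
    "Internet2": (1071.15, 1510.13),
    "B4": (2621.12, 3919.81),
}


def relative_gain(baseline, improved):
    """Fractional throughput increase over the baseline."""
    return (improved - baseline) / baseline


for name, (base, opt) in table1.items():
    print(f"{name}: {100 * relative_gain(base, opt):.1f}%")
# SWAN: 46.5%
# Internet2: 41.0%
# B4: 49.5%
```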

SLIDE 39

Convergence under Different Perseverance Levels

[Figure: throughput (Gbps) versus step (1 to 10) for perseverance levels ρ = 0.5, 0.55, 0.75, 0.8, 0.85, 0.9]

Figure 2: Larger perseverance levels ρ (boxed values) prevent aggressive updates with large disturbance; hence, it takes more steps for ALG to converge to the maximum throughput.

SLIDE 40

Convergence under Different Perseverance Levels

[Figure: number of steps for convergence (1 to 10) versus perseverance level ρ (0.5 to 0.95)]

Figure 3: The 1st-5th-50th-95th-99th percentiles of the minimum number of steps needed for throughput convergence. When ρ = 0.5, ALG converges in 5 steps in 99% of the 1000 random cases.

SLIDE 42

Mitigation of Transition Fluctuation

[Figure: maximum throughput deviation (Gbps) for OPT, ALG, and RADWAN. (a) SWAN (8 nodes, 12 links); (b) B4 (18 nodes, 39 links)]

SLIDE 43

Comparison of Computation Overhead

  • Is it still beneficial to have rate-adaptive links under the reconfiguration delay and perseverance constraints?
  • What is a reasonable perseverance level?
    → How does a perseverance level slow down the convergence to the final throughput?
  • How well does perseverance smoothen the process?
  • How hard is it to find a perseverance-aware solution?
    → What are the computation overheads?

SLIDE 44

Comparison of Computation Overhead

Table 2: Average CPU computation time (ms) and the fraction ALG/OPT. ALG uses much less time and scales better than OPT (the current reconfiguration downtime is 68 s, which can potentially be reduced to 35 ms).

topology     OPT           ALG      ALG/OPT
SWAN         67.7          15.0     22.2%
Internet2    497.1         39.6     8.0%
B4           1.2 × 10^6    332.0    0.03%
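
The fraction column follows from dividing the ALG time by the OPT time. The sketch below recomputes it from the table values; the dictionary name is an assumption for illustration.

```python
# (OPT, ALG) average CPU time in ms, copied from Table 2.
table2 = {
    "SWAN": (67.7, 15.0),
    "Internet2": (497.1, 39.6),
    "B4": (1.2e6, 332.0),
}

for name, (opt_ms, alg_ms) in table2.items():
    print(f"{name}: ALG/OPT = {100 * alg_ms / opt_ms:.2f}%")
# SWAN: ALG/OPT = 22.16%
# Internet2: ALG/OPT = 7.97%
# B4: ALG/OPT = 0.03%
```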

SLIDE 45

Conclusion

  • Instead of one-shot update and churn, we introduce the idea of perseverance for a multi-step reconfiguration.
  • We propose an efficient algorithm (ALG) to approach RAP. The proposed algorithm improves the overall throughput and smoothens the transition with small computation overhead.
  • Besides perseverance, network operators might want to maintain other properties (such as throughput level) during a multi-step reconfiguration. It is possible to extend the proposed multi-step update framework and examine other transition metrics.

SLIDE 46

Questions & Answers

SLIDE 47

References

  • R. Singh, M. Ghobadi, K.-T. Foerster, M. Filer, and P. Gill, “RADWAN: Rate adaptive wide area network,” in Proc. ACM SIGCOMM. ACM, 2018, pp. 547–560.
  • C.-Y. Hong, S. Kandula, R. Mahajan, M. Zhang, V. Gill, M. Nanduri, and R. Wattenhofer, “Achieving high utilization with software-driven WAN,” ACM SIGCOMM CCR, vol. 43, no. 4, pp. 15–26, 2013.