PCF: Provably Resilient Flexible Routing
Chuan Jiang, Sanjay Rao, Mohit Tawarmalani Purdue University
ACM SIGCOMM 2020
Background
The network performance requirements are increasingly stringent. Over a 5 year period, traffic has
met 99.99% of the time (vs. 99% of the time) [1].
performance.
[1] Hong et al, B4 and after: managing hierarchy, partitioning, and asymmetry for availability and scale in Google’s software-defined WAN, SIGCOMM 2018.
Design the networks so that the desired traffic can be served over a target set of failures.
simultaneous link failures)
across failures.
[1] Hong et al, Achieving high utilization with software-driven WAN, SIGCOMM 2013.
[2] Jain et al, B4: Experience with a globally-deployed software defined WAN, SIGCOMM 2013.
[3] Liu et al, Traffic engineering with forward fault correction, SIGCOMM 2014.
[4] Sinha et al, Network design for tolerating multiple link failures using Fast Re-route (FRR), DRCN 2014.
[5] Wang et al, R3: resilient routing reconfiguration, SIGCOMM 2010.
very conservative.
when a failure occurs, which always provide the best throughput.
failures.
Figure: design space for resilient routing, with axes tractable failure analysis (yes/no), throughput (low/high), and response overhead (low/high). FFC and Optimal are placed in this space, and the desired area for new mechanisms is marked.
sustains high throughput with low response
analysis.
tunnels.
free and resilient Flexible routing).
PCF’s schemes sustain up to 1.5X higher throughput than FFC on average across the topologies, with a benefit of 2.6X in some cases.
Topology: nodes S, U, T
Link capacity: 1/3 (e1, e2, e3)
Link capacity: 1 (e4, e5)
Tunnels:
l1 - e1,e4
l2 - e1,e5
l3 - e2,e4
l4 - e2,e5
l5 - e3,e4
l6 - e3,e5
Reservation on tunnels:
l1 - e1,e4: 1/6
l2 - e1,e5: 1/6
l3 - e2,e4: 1/6
l4 - e2,e5: 1/6
l5 - e3,e4: 1/6
l6 - e3,e5: 1/6
Remaining tunnels can
FFC’s performance guarantee: 1/2
Optimal scheme: 2/3
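The two numbers above can be reproduced with a short calculation. The sketch below assumes, consistent with the tunnel definitions, that e1 to e3 are parallel S-U links of capacity 1/3 and e4 to e5 are parallel U-T links of capacity 1. The function names are illustrative, and the FFC figure uses a simplified count-based bound (any k tunnels may fail, where k is the largest number of tunnels sharing one link) applied to the reservation shown on the slide, not the full FFC optimization.

```python
from fractions import Fraction as F

# Assumed layout: e1-e3 are parallel S-U links, e4-e5 are parallel U-T links.
capacity = {"e1": F(1, 3), "e2": F(1, 3), "e3": F(1, 3), "e4": F(1), "e5": F(1)}
su_links, ut_links = ["e1", "e2", "e3"], ["e4", "e5"]

tunnels = {
    "l1": ["e1", "e4"], "l2": ["e1", "e5"],
    "l3": ["e2", "e4"], "l4": ["e2", "e5"],
    "l5": ["e3", "e4"], "l6": ["e3", "e5"],
}
reservation = {name: F(1, 6) for name in tunnels}


def ffc_guarantee(tunnels, reservation):
    """Count-based bound: assume any k tunnels can fail together, where k is
    the largest number of tunnels that share one link."""
    links = {e for path in tunnels.values() for e in path}
    k = max(sum(e in path for path in tunnels.values()) for e in links)
    worst_loss = sum(sorted(reservation.values(), reverse=True)[:k])
    return sum(reservation.values()) - worst_loss


def optimal_guarantee(capacity, su_links, ut_links):
    """Worst case over single link failures of the S-T max flow, which for
    two groups of parallel links in series is the smaller group capacity."""
    def flow(failed):
        su = sum(capacity[e] for e in su_links if e != failed)
        ut = sum(capacity[e] for e in ut_links if e != failed)
        return min(su, ut)
    return min(flow(e) for e in capacity)


print(ffc_guarantee(tunnels, reservation))              # 1/2
print(optimal_guarantee(capacity, su_links, ut_links))  # 2/3
```

The gap comes from e4 and e5: each carries three tunnels, so the count-based bound plans for three simultaneous tunnel failures even though losing e4 or e5 leaves a full unit of U-T capacity.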
throughput.
tractable failure analysis.
simultaneously meet three objectives:
Topology: nodes S, U, T
Link capacity: 1/3 (e1, e2, e3)
Link capacity: 1 (e4, e5)
Tunnels:
l1 - e1
l2 - e2
l3 - e3
l4 - e4
l5 - e5
T) of the logical sequence.
works upon failures.
Segment S-U: 2/3 unit of traffic can be sent under single link failure.
Segment U-T: 1 unit of traffic can be sent under single link failure.
We can reserve 2/3 unit on the logical sequence S-U-T. This reservation is always available under single link failure.
Performance guarantee: 2/3 (optimal)
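The 2/3 guarantee follows from taking, for each segment of the logical sequence, the capacity that survives its worst single link failure, and then the minimum across segments. A minimal sketch, assuming that segment S-U is served by e1 to e3 (capacity 1/3 each) and segment U-T by e4 and e5 (capacity 1 each); the helper name is hypothetical.

```python
from fractions import Fraction as F

# Assumed mapping of links to the segments of the logical sequence S-U-T.
segments = {
    "S-U": {"e1": F(1, 3), "e2": F(1, 3), "e3": F(1, 3)},
    "U-T": {"e4": F(1), "e5": F(1)},
}


def segment_guarantee(link_caps):
    """Capacity left on a segment after its worst single link failure."""
    return sum(link_caps.values()) - max(link_caps.values())


per_segment = {name: segment_guarantee(caps) for name, caps in segments.items()}
sequence_reservation = min(per_segment.values())

print(per_segment["S-U"])    # 2/3
print(per_segment["U-T"])    # 1
print(sequence_reservation)  # 2/3
```

Because a single link failure touches only one segment, the reservation on the sequence is limited by the weakest segment, here S-U.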
Figure: a logical sequence through hops s, v1, v2, ..., vm, t; consecutive hops are connected by a logical segment.
active under certain conditions (e.g. a set of links fail).
Figure: logical sequences and physical tunnels over hops s, v1, v2, ..., vm, t.
sequence
sequences from s to t
Figure: topology with nodes s, 1, 2, 3, 4, t, links of capacity 1 and 1/2, and tunnels l1, l2, l3, l4.
Provided tunnels: l1, l2, l3
Maximum number of tunnels sharing a common link: 1
Estimated number of tunnel failures under single link failure: 1
considers all combinations of so many tunnel failures.
Provided tunnels: l1, l2, l3, l4
Maximum number of tunnels sharing a common link: 2
Estimated number of tunnel failures under single link failure: 2
l1 and l2 at the same time.
single link failure, the performance will be very low.
PCF solves this issue by modeling the fact that when one link fails, l1 and l2 cannot both fail at the same time.
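A small sketch of the difference, under loudly stated assumptions: the topology figure for this example did not survive, so the tunnel-to-link mapping, the link names and the reservations below are hypothetical. They are chosen only so that at most two tunnels share a link and l1 and l2 are link-disjoint. The count-based estimate plans for any two tunnels failing, while the link-based view enumerates what each single link failure actually takes down. This illustrates the idea only; it is not PCF's optimization model.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical tunnel paths (link names are made up for illustration).
tunnels = {
    "l1": ["a"],
    "l2": ["b"],
    "l3": ["x", "c"],
    "l4": ["x", "d"],  # shares link x with l3
}
reservation = {"l1": F(1, 2), "l2": F(1, 2), "l3": F(1, 4), "l4": F(1, 4)}
total = sum(reservation.values())
links = sorted({e for path in tunnels.values() for e in path})

# Count-based estimate: any k tunnels may fail, k = max tunnels on one link.
k = max(sum(e in path for path in tunnels.values()) for e in links)
count_based = min(
    total - sum(reservation[t] for t in failed)
    for failed in combinations(tunnels, k)
)

# Link-based view: only tunnels crossing the failed link are lost.
link_based = min(
    total - sum(r for t, r in reservation.items() if e in tunnels[t])
    for e in links
)

print(k)            # 2
print(count_based)  # 1/2
print(link_based)   # 1
```

In this made-up instance the count-based bound is driven by the pair {l1, l2}, a combination that no single link failure can produce.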
additional tunnels, and performs at least as well as FFC.
throughput is arbitrarily worse than optimal even when exponentially many tunnels are used; and (ii) PCF’s throughput achieves the optimal with only polynomially many tunnels.
PCF - implementation
(satisfy a topological order):
failure
PCF - family of schemes
Figure: relationships among the schemes FFC, PCF-TF, PCF-LS-General, PCF-LS-TopSort, PCF-CLS-General and PCF-CLS-TopSort. Legend: an edge between A and B means A is provably better than B.
All PCF schemes are associated with tractable models that guarantee the network is congestion-free under failures.
Response mechanisms annotated on the figure: distribute traffic proportionally (fully distributed); solve a linear system.
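As a rough illustration of the fully distributed response, the sketch below rescales a pair's traffic over its surviving tunnels in proportion to their reservations. The function name, tunnel names and numbers are hypothetical; which schemes use this response and which require solving a linear system is not restated here.

```python
from fractions import Fraction as F


def proportional_split(demand, reservation, failed):
    """Split demand over surviving tunnels in proportion to reservations."""
    alive = {t: r for t, r in reservation.items() if t not in failed}
    total = sum(alive.values())
    if total == 0:
        raise ValueError("no surviving tunnel with a reservation")
    return {t: demand * r / total for t, r in alive.items()}


# Hypothetical reservations for one source-destination pair.
reservation = {"l1": F(1, 2), "l2": F(1, 4), "l3": F(1, 4)}
split = proportional_split(F(1, 2), reservation, failed={"l1"})
print(split)  # l2 and l3 each carry 1/4
```

Each surviving tunnel carries no more than its reservation as long as the demand does not exceed the total surviving reservation, which is the condition the offline model is meant to guarantee.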
shortest paths.
all pairs can be scaled)
Figure: demand scale (y-axis, 0.2 to 0.8) of FFC and PCF-TF with 2, 3 and 4 tunnels.
with 3 and 4 tunnels than with
are added.
Deltacom topology, single link failure
Figure: CDF across topologies of demand scale relative to FFC (x-axis, 1.0 to 3.0; y-axis, 0.0 to 1.0). Curve: Optimal.
higher throughput than FFC.
40% more demand than FFC.
Better performance
21 topologies, up to 3 link failures
Fraction of topologies
by 11% on average and more than 50% in the best case.
response mechanism as FFC.
Figure: CDF across topologies of demand scale relative to FFC. Curves: PCF-TF, Optimal.
by 25% on average, and performs 2.6x better in the best case.
mechanism
Figure: CDF across topologies of demand scale relative to FFC. Curves: PCF-TF, PCF-LS, Optimal.
by 50% on average, and matches the optimal for most cases.
failures
Figure: CDF across topologies of demand scale relative to FFC. Curves: PCF-TF, PCF-LS, PCF-CLS, Optimal.
experiments
demands
local routing under single link failure.
Figure: solving time in seconds (log scale, 10^-1 to 10^3, truncated at 1 h) versus number of sub-links (50 to 300). Curves: PCF-TF, PCF-CLS, Optimal.
under 10 seconds.
solving time is under 100 seconds.
topologies.
finish.
21 topologies, up to 3 link failures
the network’s intrinsic capability. We present the underlying reasons.
21 topologies.
Chuan Jiang: jiang486@purdue.edu
Sanjay Rao: sanjay@ecn.purdue.edu