Eliminating Adverse Control Plane Interactions in Independent Network Systems
Matthew K. Mukerjee
Computer Science PhD Thesis Defense
May 1st, 2018
Network Control
VM migration Routing CDN server selection Congestion Control
? ? ? ?
Coordination Coordination Coordination Coordination
[Figure: Source and Destination; Which way?]
Routing: “figure out” best path (periodically computed)
Forwarding: data transmission (done per-packet)
Control Plane Data Plane
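The routing/forwarding split above can be illustrated with a minimal sketch. It assumes a shortest-path control plane (Dijkstra) and a dictionary as the forwarding table; the topology, node names, and function names are hypothetical and do not come from the talk.

```python
import heapq


def compute_next_hops(graph, source):
    """Control plane: run Dijkstra from `source`, return {destination: next hop}."""
    dist = {source: 0}
    next_hop = {}
    visited = set()
    # Heap entries: (distance so far, node, first hop used to reach the node)
    heap = [(0, source, None)]
    while heap:
        d, node, first = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if first is not None:
            next_hop[node] = first
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if neighbor not in visited and nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                hop = neighbor if first is None else first
                heapq.heappush(heap, (nd, neighbor, hop))
    return next_hop


def forward(forwarding_table, destination):
    """Data plane: a per-packet table lookup, with no path computation."""
    return forwarding_table[destination]


# Hypothetical four-node topology with link costs.
graph = {
    "S": {"A": 1, "B": 4},
    "A": {"S": 1, "B": 1, "D": 5},
    "B": {"S": 4, "A": 1, "D": 1},
    "D": {"A": 5, "B": 1},
}
table = compute_next_hops(graph, "S")  # recomputed periodically
print(forward(table, "D"))             # executed for every packet
```

The expensive computation runs only when the table is rebuilt; each packet pays for a single lookup.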
Distributed (OSPF): Quick failure response, Bad at performance
Centralized (SDN, Controller): Good at performance, Slow failure response
Bad CDN server selection → ISP paying for costly routes
Coordination?
[Figure: routes labeled Cheap and Expensive; CDN server A and CDN server B]
App TE + ISP TE
User decisions TCP decisions Application decisions
Issues? Issues? Issues?
Categorizing Control Coordination
App TE + ISP TE
BGP + BGP
Internet-scale Routing (BGP + OSPF)
Coflow (App + DC scheduling)
[Figure: routes labeled Cheap and Expensive; OSPF in ISP A, ISP B, and ISP C, with BGP between them]
Reaction Priority Ranking Transparency Hierarchical Partitioning
Control Coordination
Hierarchical Partitioning Transparency
Coflow
VDN
Internet-scale Routing
VDX
Reaction
App TE + ISP TE
Priority Ranking
BGP + BGP
Etalon
Scenario:Scalability
Scenario:Admin
Scenario:Layering
Difficult to scale datacenters with demand
P-FatTree: A multi-channel datacenter network topology. HotNets 2016.
Higher Bandwidth
Higher Port Count
CMOS limits…
Use circuits to build bigger + faster networks!
Reconfigurable Datacenter Networks (RDCNs)
Packet Switch Circuit Switch
Circuit Switch Design: How do you physically build it?
Network Scheduling: How do you make use of it?
End-to-End Challenges: What existing things break?
[Figure: Circuit Switch alongside the Packet Network (Packet Switch…)]
RDCN switch design
Challenge: Workloads
[Figure: Racks 1 to N, each with Servers 1 to M and a ToR Switch; the ToR Switches connect to both the Packet Switch and the Circuit Switch; App Demand]
RDCN scheduling
[Figure: Racks 1 to N, each connected to the Packet Switch and the Circuit Switch]
Demand: 1 → 2, 1 → 3, 1 → N, 1 → 2, 2 → 3, 2 → 6, 2 → N, 2 → 5, N → 4, N → 1, N → 2, N → 7
RDCN Scheduling Algorithm (e.g., Solstice): output For Circuit Switch and For Packet Switch
Circuit Schedule:
Config 1: 1 → 2, 2 → 3, …, N → 4; RECONFIG DELAY
Config 2: 1 → N, 2 → 6, …, N → 2; RECONFIG DELAY
Config 3: 1 → 3, 2 → N, …, N → 1; RECONFIG DELAY
Time (μs) labels: 300μs, 20μs, 180μs, 20μs, 20μs, 518μs
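To make the scheduling step concrete, here is a minimal sketch of turning rack-to-rack demand into circuit configurations, with the leftover going to the packet switch. It is a simple greedy matching written for illustration, not Solstice itself; the function name, the demand values, and the `min_duration_us` threshold are assumptions.

```python
def greedy_circuit_schedule(demand_us, reconfig_delay_us, min_duration_us):
    """Greedy sketch (not Solstice): split demand into circuit configurations.

    demand_us: {(src_rack, dst_rack): microseconds of circuit time wanted}
    Returns (schedule, leftover); leftover is left for the packet switch.
    """
    remaining = dict(demand_us)
    schedule = []
    while True:
        # Only demands large enough to justify a reconfiguration are candidates.
        candidates = sorted(
            ((t, pair) for pair, t in remaining.items() if t >= min_duration_us),
            reverse=True,
        )
        # A configuration is a matching: each rack sends to at most one rack
        # and receives from at most one rack.
        used_src, used_dst, circuits = set(), set(), []
        for _, (src, dst) in candidates:
            if src not in used_src and dst not in used_dst:
                circuits.append((src, dst))
                used_src.add(src)
                used_dst.add(dst)
        if not circuits:
            return schedule, remaining
        # Hold the configuration until its smallest demand is fully served.
        duration = min(remaining[pair] for pair in circuits)
        schedule.append({
            "circuits": circuits,
            "duration_us": duration,
            "reconfig_delay_us": reconfig_delay_us,
        })
        for pair in circuits:
            remaining[pair] -= duration
            if remaining[pair] == 0:
                del remaining[pair]


# Hypothetical demand between three racks, in microseconds of circuit time.
demand = {(1, 2): 300, (2, 3): 300, (3, 1): 300,
          (1, 3): 180, (2, 1): 180, (3, 2): 10}
schedule, leftover = greedy_circuit_schedule(
    demand, reconfig_delay_us=20, min_duration_us=20)
for config in schedule:
    print(config)
print("packet switch carries:", leftover)
```

Each configuration costs one reconfiguration delay, so small demands are deliberately left to the packet switch rather than given a circuit.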
Contributions
End-to-End Challenges
Challenge: Demand Estimation; Solution: Endhost-based Estimation
Challenge: BW Fluct.; Solution: Dynamic Buffer Resizing
Challenge: Workloads; Solution: App-specific Modification
Etalon, an RDCN Emulator
TCP and rapid bw fluctuations
Challenge: BW Fluct.
Solution: Dynamic Buffer Resizing
[Figure: Sender, ToR Queue, Packet Switch (Low BW) and Circuit Switch (High BW), ToR Queue, Receiver]
[Figure: BW vs. Time (μs); What we want vs. What we get]
[Figure: SMALL vs. BIG ToR Queue, with Latency and Bandwidth indicators]
How Early?
Static buffers provide good circuit util or latency
Challenge: BW Fluct.
Solution: Dynamic Buffer Resizing
[Figure: circuit utilization vs. Buffer size (packets), ticks 4 8 16 32 64 128; bar labels 99 92 61 39 25 16; SMALL buffers: Low utilization]
[Figure: Median latency (μs), ticks 200 to 600, vs. Buffer size (packets); BIG buffers: High Latency]
Buffer resize provides good circuit util and latency
[Figure: circuit utilization vs. Early buffer resize (μs), ticks 200 to 1400; bar labels 100 100 88 79 65 56 52 39; 2x increase in utilization]
[Figure: Median latency (μs), ticks 100 to 600, vs. Early buffer resize (μs); Steady Latency]
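A minimal sketch of the early-resize idea follows: the ToR queue is enlarged some time before a circuit comes up and shrunk again when it goes down. The function name, the window representation, and all numeric values are assumptions for illustration, not Etalon's actual interface.

```python
def tor_buffer_size(now_us, circuit_windows, small_pkts, big_pkts, early_us):
    """Return the ToR queue capacity (in packets) to use at time now_us.

    circuit_windows: list of (start_us, end_us) intervals when a circuit is up.
    The queue grows early_us before each circuit starts and shrinks at its end.
    """
    for start_us, end_us in circuit_windows:
        if start_us - early_us <= now_us < end_us:
            return big_pkts
    return small_pkts


# Hypothetical schedule: one circuit up from t=1000μs to t=1180μs.
windows = [(1000, 1180)]
for t in (300, 400, 1179, 1180):
    print(t, tor_buffer_size(t, windows, small_pkts=16, big_pkts=128, early_us=600))
```

The `early_us` parameter corresponds to the "How Early?" question: a larger lead gives TCP more time to fill the bigger queue before the high-bandwidth circuit arrives.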
Overview
End-to-End Challenges
Challenge: Demand Estimation; Solution: Endhost-based Estimation
Challenge: BW Fluct.; Solution: Dynamic Buffer Resizing
Challenge: Workloads; Solution: App-specific Modification
Etalon, an RDCN Emulator
Difficult to schedule workloads
Challenge: Workloads
Solution: App-specific Modification
[Figure: Name Node and Racks A to D, four Data Nodes per rack; writes numbered 1, 2, 3]
Schedule:
Config 1: A → B, B → C, C → D, D → A; RECONFIG DELAY
Config 2: A → C, B → D, C → A, D → B; RECONFIG DELAY
Config 3: A → D, B → A, C → B, D → C; RECONFIG DELAY
reHDFS
Schedule:
Config 1: A → C, B → D, C → A, D → B
reHDFS reduces tail latency
9x decrease in write time
[Figure: CDF (%) of HDFS write completion time (ms), including reHDFS]
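The three configurations in the schedule above are cyclic shifts of the rack list, which the sketch below reproduces. It then counts how many configurations a set of rack-to-rack flows touches. The slides do not spell out how reHDFS places replicas, so the "aligned" flow set is only an assumption about why a single configuration can suffice; all names are hypothetical.

```python
def rotation_configs(racks):
    """Cyclic-shift circuit configurations: config k connects rack i to rack i+k."""
    n = len(racks)
    return [{racks[i]: racks[(i + k) % n] for i in range(n)} for k in range(1, n)]


def configs_needed(flows, configs):
    """1-based indices of configurations that carry at least one flow."""
    return [idx for idx, cfg in enumerate(configs, start=1)
            if any(cfg.get(src) == dst for src, dst in flows)]


racks = ["A", "B", "C", "D"]
configs = rotation_configs(racks)

# Hypothetical flow sets: scattered destinations vs. destinations that all
# follow one shift (A → C, B → D, C → A).
scattered = [("A", "B"), ("B", "D"), ("C", "B")]
aligned = [("A", "C"), ("B", "D"), ("C", "A")]

print(configs_needed(scattered, configs))
print(configs_needed(aligned, configs))
```

Flows that touch fewer configurations wait through fewer reconfiguration delays, which is one plausible reading of the single-configuration reHDFS schedule.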
Control Coordination
Hierarchical Partitioning Transparency
Coflow
VDN
Internet-scale Routing
VDX
Reaction
App TE + ISP TE
Priority Ranking
BGP + BGP
Etalon
Scenario:Scalability
Scenario:Admin
Scenario:Layering
Traditional Content Delivery
[Figure: Content Provider (CP), CDN, Client; Legend: Content]
Changing Content Delivery
[Figure: Content Provider (CP), CDNs, Clients; Legend: Content]
Brokered Content Delivery
[Figure: Content Provider (CP), Broker (B), CDNs, Clients; Legend: Content, Control]
Easier for CPs to meet performance and cost goals
Brokers select “best” CDN for clients to minimize cost and meet performance goals
How do brokers and CDNs impact each other? (this talk)
Contributions
for each other by analyzing data from both
interfaces
CDN Cost and Pricing
[Figure: Content Provider (CP), CDN, Client; Legend: Content]
Internal Costs: Bandwidth (mostly)
Do bandwidth costs differ across geographic regions?
CDN Cost / Byte Delivered
difference in cost per byte between the most expensive and least expensive countries
CDN Internal Cost
[Figure: clusters labeled CDN Y $, CDN X $, CDN X $, CDN X $$$$]
CDN External Price
[Figure: Content Provider (CP) and Clients; CDN Pricing: CDN Y, CDN X; $$, $$$]
CDN Y makes money, CDN X loses money
Do we see traffic patterns like this at the country level?
Country Level Traffic
[Figure: % Used in Country for Countries (Anonymized) 1 to 15; series CDN A, CDN B, CDN C, Other]
Flat pricing makes CDN profits unpredictable with brokers
Country 8 costly → CDN B loses money!
Country 7 cheap → CDN A profits!
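The flat-pricing problem can be shown with a small worked calculation. In this sketch a CDN charges one flat price per unit delivered while its internal cost varies by country, so its profit depends on where the broker sends traffic. All prices, costs, and traffic volumes are made up for illustration.

```python
def cdn_profit(traffic_by_country, cost_per_unit, flat_price_per_unit):
    """Profit under flat pricing: revenue is uniform, cost depends on country."""
    return sum(units * (flat_price_per_unit - cost_per_unit[country])
               for country, units in traffic_by_country.items())


# Hypothetical per-unit delivery costs: country 7 is cheap, country 8 is costly.
cost_per_unit = {7: 1, 8: 4}
flat_price = 2

# Hypothetical broker decisions: CDN A gets country 7, CDN B gets country 8.
profit_a = cdn_profit({7: 100}, cost_per_unit, flat_price)
profit_b = cdn_profit({8: 100}, cost_per_unit, flat_price)
print(profit_a, profit_b)
```

With identical prices and traffic volumes, the sign of the profit is determined entirely by the broker's placement, which the CDN does not control.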
Contributions
for each other by analyzing data from both
interfaces
Brokered Delivery Today
[Figure: Content Provider (CP), Broker, CDNs, Clients; Legend: Content, Control]
Latency & loss Measurements
ISP, device type, location, …
Which CDN to use
Which cluster to receive from
VDX
[Figure: Broker, CDNs, Clients; steps 1, 2, 3]
Example
[Figure: Content Provider (CP) and Clients; clusters labeled CDN Y $, CDN X $, CDN X $, CDN X $$$$]
CDN Pricing (first build): CDN Y, CDN X; $$, $$$
CDN Pricing (later builds): CDN X; $$, $$$$; CDN X; CDN Y $$
CDN X can compete with
Evaluation
as public data from 13 other CDNs
client performance, delivery costs, etc.
distributions, etc.
Per-CDN Profits
[Figure: CDN Profit for CDNs 1 to 14; Brokered (Today) vs. VDX]
Evaluation Takeaways
(performance can be better; most CDNs lose money on brokered video delivery)
and cost
Control Coordination
Hierarchical Partitioning Transparency
Coflow
VDN
Internet-scale Routing
VDX
Reaction
App TE + ISP TE
Priority Ranking
BGP + BGP
Etalon
Scenario:Scalability
Scenario:Admin
Scenario:Layering
Control Coordination
Hierarchical Partitioning Transparency
Coflow
VDN
Internet-scale Routing
Reaction
App TE + ISP TE
Priority Ranking
BGP + BGP
Etalon
Scenario:Scalability
Scenario:Admin
Scenario:Layering
Information Sharing
Some Full
VDX
Live Video is Becoming Wildly Popular
Internet traffic
CDN Live Video Delivery Background
[Figure: Video Sources (A, B), Reflector Clusters (C, D), Edge Clusters (E, F, G), Clients (H, I, J), DNS; Legend: Control, Data; Video 1 and Video 2 Requests (HTTP GET) and Responses (HTTP RESPONSE)]
Link Capacity labels: 1K 2K 3K 200 750 2K 300 500 300 750 700
Link Cost labels: 100 100 120 25 20 15 15 1 10 1
Objective: Reasonable service quality & Minimal delivery cost
Problems with CDNs Today
QUALITATIVE QUANTITATIVE
Service Quality
[Figure: plotted against # of Videos (Thousands), 2 to 10; Optimal vs. CDN]
Simulation using Conviva traces, modeling user-generated content
Delivery Cost (per request)
CDN: 2.0x
OPTIMAL: 1.0x
Simulation using Conviva traces, modeling large sports events
Not Fine-Grained: Videos aggregated into large groups
Slow DNS Updates: Can’t push updates; DNS entries get cached
Solution?
[Liu, Xi et al. A Case for a Coordinated Video Control Plane. SIGCOMM 2012]
Outline
Problems with Live Video Today
Centralized Control
Distributed Control
Hybrid Control
Putting it all Together
Motivating Centralized Optimization
[Figure: Video Sources (A, B), Reflector Clusters (C, D), Edge Clusters (E, F, G), Clients (H, I, J), DNS; Legend: Control, Data; links labeled with Link Capacity]
Congestion!
Needs global view to coordinate videos and network resources
Unfortunately… No Free Lunch
[Figure: Join Time (Seconds) vs. # of videos; Light Load; Fully Centralized]
Slow join times!
Experiments on EC2 nodes with a centralized controller at CMU across the Internet
Outline
Problems with Live Video Today
Centralized Control: Slow join times
Distributed Control
Hybrid Control
Putting it all Together
Alternate Approach: Distributed
[Figure: Video Sources (A, B), Reflector Clusters (C, D), Edge Clusters (E, F, G), Clients (H, I, J), Central Controller; Legend: Control, Data, Video 1 Data Requests, Video 1 Responses; links labeled with Link Capacity; request for 800]
Build “distance-to-video” tables at each cluster, for each video
DISTANCE AT CLUSTER F
VIDEO 1:
VIA C: 2; (B, 1K)
VIA D: 1; (D, 800)
PICK SHORTEST PATH WITH ENOUGH CAPACITY
Distributed decisions fast (ms) but sub-optimal
Combine approaches? “Hybrid Control”
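The "pick shortest path with enough capacity" rule can be sketched as a small lookup over the distance-to-video table. The `Route` type and `pick_route` function are hypothetical names, "1K" is read as 1000, and capacity is assumed to be in the same unit as the requested rate.

```python
from typing import NamedTuple, Optional, Sequence


class Route(NamedTuple):
    via: str        # next-hop cluster
    distance: int   # hops to reach the video
    capacity: int   # available capacity along that path


def pick_route(routes: Sequence[Route], needed: int) -> Optional[Route]:
    """Return the shortest route with at least `needed` capacity, else None."""
    feasible = [r for r in routes if r.capacity >= needed]
    if not feasible:
        return None
    return min(feasible, key=lambda r: r.distance)


# Table at cluster F for Video 1, taken from the slide (1K read as 1000).
cluster_f_video_1 = [Route("C", 2, 1000), Route("D", 1, 800)]

print(pick_route(cluster_f_video_1, 800))
print(pick_route(cluster_f_video_1, 900))
print(pick_route(cluster_f_video_1, 1200))
```

The decision uses only local state, which is why it is fast. It is also why it can be sub-optimal: the table says nothing about other videos competing for the same links.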
Outline
Problems with Live Video Today
Centralized Control: Slow join times
Distributed Control: Low bitrate
Hybrid Control
Putting it all Together
Combining Approaches: VDN
[Figure: Video Sources (A, B), Reflector Clusters (C, D), Edge Clusters (E, F, G), Clients (H, I, J); Central Controller reached across The Internet (HIGH LATENCY); Legend: Control, Data, Video 1 Data Requests, Video 1 Responses, Video 1 Control Traffic]
Challenges of Hybrid Control
TRIVIAL PRIOR WORK CHALLENGING
interactions
Hybrid Control and Responsiveness
[Figure: Join Time (Seconds) vs. # of videos; Light Load; series Hybrid Control, Fully Centralized, Fully Distributed]
Experiments on EC2 nodes with a centralized controller at CMU across the Internet
Fully Centralized: Slow join times!
Fully Distributed: Not stable
Hybrid Control: Great join times and more stable
Control Coordination
Hierarchical Partitioning
VDN
Reaction
App TE + ISP TE
Scenario:Scalability
Internet Routing
Transparency
Coflow
Etalon
Scenario:Layering
Priority Ranking
Scenario:Admin
BGP + BGP
VDX
Information Sharing
Some Full
Control Coordination
Hierarchical Partitioning Transparency
Coflow
VDN
Internet Routing
Reaction
App TE + ISP TE
Priority Ranking
Etalon
Scenario:Scalability
Scenario:Admin
Scenario:Layering
Information Sharing
Some Full
BGP + BGP
VDX
Shared resources
Yes No None
Control Coordination
Hierarchical Partitioning Transparency
Coflow
VDN
Internet Routing
Reaction
App TE + ISP TE
Priority Ranking
Etalon
Scenario:Scalability
Scenario:Admin
Scenario:Layering
Information Sharing
Some Full
BGP + BGP
VDX
Shared resources
Yes No None
Route Redistribution Pytheas C3 (Conviva) OSPF Fibbing Bohatei Klein Wiser P4P (vanilla) DASH + HTTP + TCP OSPF Areas Congestion Control CC + AQM
Future Work
Control Coordination
Hierarchical Partitioning Transparency
Coflow
VDN
Internet Routing
Reaction
App TE + ISP TE
Priority Ranking
Etalon
Scenario:Scalability
Scenario:Admin
Scenario:Layering
Information Sharing
Some Full
BGP + BGP
VDX
Shared resources
Yes No None