Practical, Real-time Centralized Control for CDN-based Live Video Delivery
Matt Mukerjee, David Naylor, Junchen Jiang, Dongsu Han, Srini Seshan, Hui Zhang
Live Video is Becoming Wildly Popular
Internet traffic
to efficiently manage quality and cost, with high responsiveness
Hybrid Control
Central Optimization: quality and cost management
Distributed Control: responsiveness to joins and failures
Outline: Problems with Live Video Today; Centralized Control; Distributed Control; Hybrid Control; Putting it all Together
CDN Live Video Delivery Background
[Figure: CDN topology with control and data directions marked. Video sources A and B; reflector clusters C and D; edge clusters E, F, and G; clients H, I, and J, directed to edge clusters via DNS. Video requests are HTTP GETs and video data returns as HTTP responses (legend: Video 1 and Video 2 requests and responses). Each link is annotated with a link capacity and a link cost.]
[Figure: Service Quality vs. # of videos (thousands), CDN compared with OPTIMAL. Simulation using Conviva traces, modeling user-generated content.]
[Figure: Delivery Cost (per request), CDN compared with OPTIMAL. Simulation using Conviva traces, modeling large sports events.]
Qualitative and quantitative comparison of CDN with OPTIMAL:
Service Quality [Figure: vs. # of videos (thousands)]
Not fine-grained: videos are aggregated into large groups
Slow DNS updates: can't push updates; DNS entries get cached
Delivery Cost: CDN 2.0x, OPTIMAL 1.0x
Service Quality [Figure: vs. # of videos (thousands)]
Fine-grained control: per-video control
Real-time response: sub-second response to failures and joins
Delivery Cost: CDN 2.0x, OPTIMAL 1.0x
Room for improvement, but Internet latency / loss
[Liu, Xi et al. A Case for a Coordinated Video Control Plane. SIGCOMM 2012]
[Figure, animated over several builds: the CDN topology (video sources A, B; reflector clusters C, D; edge clusters E, F, G; clients H, I, J) annotated with link capacities. Flows of 300, 200, 300, and 200 are added one build at a time. Build labels, in order: "DNS", "DNS Congestion!", "DNS", "Central Controller".]
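Central Controller optimization (maximize service quality, minimize delivery cost):

\[
\max \;\; w_s \cdot \sum_{l \in L_{AS},\, o \in O} \mathit{Priority}_o \cdot \mathit{Request}_{l,o} \cdot \mathit{Serves}_{l,o}
\;-\; w_c \cdot \sum_{l \in L,\, o \in O} \mathit{Cost}(l) \cdot \mathit{Bitrate}(o) \cdot \mathit{Serves}_{l,o}
\]

subject to:

\[
\forall l \in L,\, o \in O : \; \mathit{Serves}_{l,o} \in \{0, 1\}
\]
\[
\forall l \in L : \; \sum_{o \in O} \mathit{Bitrate}(o) \cdot \mathit{Serves}_{l,o} \le \mathit{Capacity}(l)
\qquad \text{(don't exceed link capacity)}
\]
\[
\forall l \in L,\, o \in O : \; \sum_{l' \in \mathit{InLinks}(l)} \mathit{Serves}_{l',o} \ge \mathit{Serves}_{l,o}
\qquad \text{(sender must have received video)}
\]

The first term of the objective is service quality; the second is delivery cost.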
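As a rough illustration of the optimization above, the sketch below brute-forces every 0/1 assignment of Serves on a toy three-link chain. Everything concrete here is an assumption made for the example and is not taken from the slides: the link names, the capacities and costs, the two videos, the weights `W_S` and `W_C`, and the choice to exempt the source's outgoing link from the "sender must have received video" constraint. A real controller would use an integer-program solver rather than enumeration.

```python
from itertools import product

# Hypothetical toy instance: source -> reflector -> edge -> client AS.
LINKS = ["src-refl", "refl-edge", "edge-as"]
VIDEOS = ["v1", "v2"]
CLIENT_LINKS = {"edge-as"}                      # plays the role of L_AS
CAPACITY = {"src-refl": 2000, "refl-edge": 1000, "edge-as": 2000}
COST = {"src-refl": 1, "refl-edge": 1, "edge-as": 1}
BITRATE = {"v1": 800, "v2": 900}
PRIORITY = {"v1": 100, "v2": 1}
REQUEST = {("edge-as", "v1"): 1, ("edge-as", "v2"): 1}
# InLinks(l); None marks a link leaving a source (assumed always allowed).
IN_LINKS = {"src-refl": None, "refl-edge": ["src-refl"], "edge-as": ["refl-edge"]}
W_S, W_C = 100, 1                               # assumed weights w_s, w_c


def feasible(serves):
    """Check the capacity and 'sender must have received video' constraints."""
    for l in LINKS:
        if sum(BITRATE[o] * serves[l, o] for o in VIDEOS) > CAPACITY[l]:
            return False
        if IN_LINKS[l] is None:
            continue
        for o in VIDEOS:
            if sum(serves[u, o] for u in IN_LINKS[l]) < serves[l, o]:
                return False
    return True


def objective(serves):
    """w_s * service quality - w_c * delivery cost."""
    quality = sum(PRIORITY[o] * REQUEST.get((l, o), 0) * serves[l, o]
                  for l in CLIENT_LINKS for o in VIDEOS)
    cost = sum(COST[l] * BITRATE[o] * serves[l, o]
               for l in LINKS for o in VIDEOS)
    return W_S * quality - W_C * cost


def solve():
    """Enumerate all Serves assignments and keep the best feasible one."""
    keys = [(l, o) for l in LINKS for o in VIDEOS]
    best = None
    for bits in product((0, 1), repeat=len(keys)):
        serves = dict(zip(keys, bits))
        if not feasible(serves):
            continue
        value = objective(serves)
        if best is None or value > best[0]:
            best = (value, serves)
    return best


best_value, best_serves = solve()
```

In this toy instance the 1000-capacity link can carry only one of the two videos, so the higher-priority `v1` is delivered end to end and `v2` is dropped.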
Flexibility of Centralized Optimization
[Figure, two builds: the CDN topology (video sources A, B; reflector clusters C, D; edge clusters E, F, G; clients H, I, J) with a Central Controller. Links are annotated with link capacities (800 to 2K) and link costs (all 1 except one link of cost 10). First build: flows of 800, with one placement marked "?". Second build: a 900 flow is added and the videos carry a Video Priority (100 and 1).]
[Figure: Service Quality vs. # of videos (thousands), VDN compared with CDN. Simulation using Conviva traces, modeling user-generated content.]
[Figure: Delivery Cost (per request), VDN compared with CDN. Simulation using Conviva traces, modeling large sports events.]
[Figure: Join Time (seconds) vs. # of videos under light load, Fully Centralized. Experiments on EC2 nodes with a centralized controller at CMU across the Internet.]
Slow join times!
Problems with Centralization
[Figure: the CDN topology (video sources A, B; reflector clusters C, D; edge clusters E, F, G; clients H, I, J) with a client request of 800 for Video 1. Control traffic for Video 1 travels to and from the Central Controller across the Internet, with high latency in each direction.]
Outline: Problems with Live Video Today; Centralized Control (slow join times); Distributed Control; Hybrid Control; Putting it all Together
Alternate Approach: Distributed
[Figure, animated over several builds: the CDN topology (video sources A, B; reflector clusters C, D; edge clusters E, F, G; clients H, I, J) annotated with link capacities. A client requests Video 1 at 800, and edge cluster F must decide where to fetch it from ("?").]

Distance table at cluster F for Video 1:

Via    # of hops to video    Path bottleneck
C      2                     (B, 1K)
D      1                     (D, 800)

An earlier build also shows the entry "1; (B, 2K)".

Rule: pick the shortest path with enough capacity. The final builds show the 800 flow placed.
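The "pick shortest path with enough capacity" rule applied to the distance table at cluster F can be sketched as follows. This is a minimal illustration, not the talk's implementation: the `Route` type, its field names, and the `pick_route` function are hypothetical, and "1K" is read as 1000 in whatever unit the slides use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    via: str                  # neighbor cluster the video would be fetched through
    hops: int                 # number of hops to the video
    bottleneck_node: str      # node where the path bottleneck sits
    bottleneck_capacity: int  # remaining capacity at the bottleneck


def pick_route(routes: List[Route], needed: int) -> Optional[Route]:
    """Return the fewest-hop route whose bottleneck can carry `needed`."""
    candidates = [r for r in routes if r.bottleneck_capacity >= needed]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.hops)


# Distance table at cluster F for Video 1, as shown on the slide.
table_at_f = [
    Route(via="C", hops=2, bottleneck_node="B", bottleneck_capacity=1000),
    Route(via="D", hops=1, bottleneck_node="D", bottleneck_capacity=800),
]

chosen = pick_route(table_at_f, needed=800)
```

For the 800 request on the slide both routes have enough capacity, so the one-hop route via D wins. A 900 request would no longer fit through D's bottleneck and would fall back to the two-hop route via C.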
Outline: Problems with Live Video Today; Centralized Control (slow join times); Distributed Control (low bitrate); Hybrid Control; Putting it all Together
Hybrid Control
Central Optimization: quality and cost management (minutes)
Distributed Control: responsiveness to joins and failures (milliseconds)
interactions
TRIVIAL PRIOR WORK CHALLENGING
Combining Approaches: Hybrid
[Figure, three builds: the CDN topology (video sources A, B; reflector clusters C, D; edge clusters E, F, G; clients H, I, J) annotated with link capacities, plus a Central Controller reached across the Internet with high latency in each direction. A client requests Video 1 at 800 ("?"); the 800 flow is then placed; the last build adds Video 1 control traffic to and from the Central Controller.]
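One way to picture the hybrid scheme is a per-cluster agent that answers requests immediately from a locally computed (distributed) choice and switches to the central controller's decision once one arrives. The sketch below is only that picture: the `LocalAgent` class, its method names, and in particular the rule that a central decision overrides the distributed one are assumptions made for illustration, not details stated on the slides.

```python
from typing import Callable, Dict, Optional


class LocalAgent:
    """Hypothetical per-cluster agent combining central and distributed control."""

    def __init__(self, distributed_choice: Callable[[str], Optional[str]]):
        # Decisions pushed by the central controller (arrive slowly, over the Internet).
        self._central: Dict[str, str] = {}
        # Fast local fallback, e.g. shortest path with enough capacity.
        self._distributed_choice = distributed_choice

    def install_central(self, video: str, next_hop: str) -> None:
        """Record a decision received from the central controller."""
        self._central[video] = next_hop

    def next_hop(self, video: str) -> Optional[str]:
        """Assumed policy: prefer the central decision, else decide locally."""
        if video in self._central:
            return self._central[video]
        return self._distributed_choice(video)


# A new join is answered locally at once; the central decision takes over later.
agent = LocalAgent(distributed_choice=lambda video: "D")
first_answer = agent.next_hop("video1")
agent.install_central("video1", "C")
later_answer = agent.next_hop("video1")
```

Under this assumed policy the join never waits on the high-latency path to the controller, while long-lived flows still end up on the centrally optimized route.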
TRIVIAL PRIOR WORK CHALLENGING
interactions
CHALLENGING
centralized decisions
interactions
Hybrid Control and Responsiveness
[Figure, three builds: Join Time (seconds) vs. # of videos under light load, comparing Hybrid Control, Fully Centralized, and Fully Distributed. Experiments on EC2 nodes with a centralized controller at CMU across the Internet. Annotations, in build order: "Slow join times!" (Fully Centralized), "Not stable", "Great join times and more stable".]
Outline: Problems with Live Video Today; Centralized Control (slow join times); Distributed Control (low bitrate); Hybrid Control ("better than both"); Putting it all Together
[Figure: system architecture over the CDN topology (video sources A, B; reflector clusters C, D; edge clusters E, F, G; clients H, I, J). A logically centralized controller and a "Local Agent" per cluster, each with discovery and control components. The central controller and the local agents exchange topology and video info and distribution trees. Within a local agent, hybrid control combines centralized and distributed decisions and drives the data plane, which consists of HTTP servers.]
based live video delivery
[Figure, two builds: the CDN topology (video sources A, B; reflector clusters C, D; edge clusters E, F, G; clients H, I, J) annotated with link capacities and flows of 300, 200, 1.5K, and 1.5K. First build: Even Split (1K). Second build: Uneven Split (1.5K / 500).]
Distributed: Example of Sub-optimal
[Figure, two builds: the CDN topology (video sources A, B; reflector clusters C, D; edge clusters E, F, G; clients H, I, J) annotated with link capacities, with two 800 flows for Video 1.]
Wasting bandwidth
Coordination difficult without centralization
workload
could greatly improve user experience but would be difficult to design over Internet