SLIDE 1

Yale LANS

ShadowStream: Performance Evaluation as a Capability in Production Internet Live Streaming Networks

Chen Tian, Richard Alimi, Yang Richard Yang, David Zhang

  • Aug. 16, 2012
SLIDE 2

Live Streaming is a Major Internet App

SLIDE 3

Poor Performance After Updates

Lacking sufficient evaluation before release

SLIDE 4

Don’t We Already Have …

Testbeds

  • Emulab
  • PlanetLab
  • …

Testing Channels

  • Gradually rolling out

They are not enough!

SLIDE 5

Live Streaming Background

We focus on hybrid live streaming systems: CDN + P2P

SLIDE 6

Live Streaming Background

We focus on hybrid live streaming systems: CDN + P2P

SLIDE 7

Testbed: Misleading Results at Small Scale

Piece Missing Ratio by version (Production Default, With Connection Limit) and scale (Small-Scale, Large-Scale); values as extracted: 3.7%, 0.7%, 64.8%, 3.5%

Live streaming performance can be highly non-linear.

SLIDE 8

Testbed: Misleading Results due to Missing Features

Columns: Piece Missing Ratio | # Timed-out Requests | # Received Duplicate Packets | # Received Outdated Packets
LAN Style (Same BW): 1.5%, 1404.25, 5.65 (one of the four values is missing from this transcript)
ADSL Style (Same BW): 7.3% | 2548.25 | 633 | 154.20

Realistic features can have large performance impacts.

SLIDE 9

Testing Channel: Lacking QoE Protection

SLIDE 10

Testing Channel: Lacking Orchestration

What we want is … What we have is …

[Figure: two plots of Number of Peers against Time (Seconds); the first shows the Expected curve, the second shows Expected against Provided]

SLIDE 11

ShadowStream Design Goal

Use production network for testing with

  • Protection of real user QoE
  • Transparent orchestration of testing conditions

SLIDE 12

Roadmap

  • Motivation
  • Protection Design
  • Orchestration Design
  • Evaluations
  • Conclusions and Future Work

SLIDE 13

Protection: Basic Scheme

Note: R denotes Repair, E denotes Experiment

SLIDE 14

Example Illustration: E Success

SLIDE 15

Example Illustration: E Success

SLIDE 16

SLIDE 17

SLIDE 18

Example Illustration: E Fail

SLIDE 19

Example Illustration: E Fail

SLIDE 20

Example Illustration: E Fail

SLIDE 21

Example Illustration: E Fail

SLIDE 22

How to Repair?

  • Choice 1: dedicated CDN resources (R=rCDN)
– Benefit: simple
– Limitations
  • requires resource reservation, e.g., 100,000 clients x 1 Mbps = 100 Gbps
  • may not work well when there is a network bottleneck
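The reservation figure for a dedicated repair CDN is plain multiplication. A minimal sketch, assuming every client may fall back to the CDN at the full stream rate at the same time (the function name is hypothetical, not from the talk):

```python
def reserved_cdn_capacity_gbps(num_clients: int, stream_rate_mbps: float) -> float:
    """Worst-case capacity to reserve if all clients need repair at full rate.

    Assumption: one stream per client and decimal units (1 Gbps = 1000 Mbps).
    """
    return num_clients * stream_rate_mbps / 1000.0


# The slide's example: 100,000 clients at 1 Mbps each.
capacity = reserved_cdn_capacity_gbps(100_000, 1.0)
print(f"{capacity} Gbps")
```

The reservation grows linearly with the number of clients under test, which is why the slide lists it as a limitation.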

SLIDE 23

How to Repair?

  • Choice 2: production machine (R=production)
– Benefit 1: Larger resource pool
– Benefit 2: Fine-tuned algorithms
– Benefit 3: A unified approach to protection & orchestration (later)
SLIDE 24

R=Production: Resource Competition

Repair and Experiment compete on client upload bandwidth.
Competition leads to underestimation of Experiment performance.

SLIDE 25

R=Production: Misleading Result

[Figure: plots of missing ratio y = m(θ) against x = θ with the line x + y = θ0; marked points θ0, θR, θL, θ*; labels "repair demand", "accurate result", "misleading result"]

SLIDE 26

SLIDE 27

SLIDE 28

SLIDE 29

SLIDE 30

Implementing PCE

Requirements

  • Testing state is transparent to each streaming machine
  • Streaming machines are isolated from each other

SLIDE 31

SLIDE 32

SLIDE 33

SLIDE 34

Client Components

SLIDE 35

Roadmap

  • Motivation
  • Protection Design
  • Orchestration Design
  • Evaluations
  • Conclusions and Future Work

SLIDE 36

Orchestration Challenges

  • How to start an Experiment streaming machine
– Transparent to real viewers
  • How to control the arrival/departure of each Experiment machine in a scalable way

[Figure: Orchestrator and a client; labels: Streaming Hypervisor, P, C, E]

SLIDE 37

Transparent Orchestration Idea

Viewer Enters Channel

SLIDE 38

Transparent Orchestration Idea

Experiment Enters Testing

[Figure: E and R, with the real playpoint and the virtual playpoint marked]

SLIDE 39

Transparent Orchestration Idea

Experiment Leaves Testing

[Figure: E and R, with the real playpoint and the virtual playpoint marked]

SLIDE 40

Distributed Activation of Testing

  • Orchestrator distributes parameters to clients
  • Each client independently generates its arrival time according to the same distribution function F(t)
  • Together they achieve the global arrival pattern
– Cox and Lewis Theorem
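The idea of distributed activation can be illustrated with inverse-transform sampling. This is a minimal sketch and not the ShadowStream implementation: the ramp-shaped F(t) = (t/T)^2, the per-client seeding, and all function names are assumptions chosen for illustration.

```python
import math
import random


def sample_arrival_time(rng: random.Random, duration_s: float) -> float:
    """Draw one arrival time from the assumed F(t) = (t / duration_s) ** 2.

    Inverse transform: solving u = F(t) for t gives t = duration_s * sqrt(u).
    """
    u = rng.random()
    return duration_s * math.sqrt(u)


def simulate_clients(num_clients: int, duration_s: float, seed: int = 0) -> list:
    """Each client draws its own arrival time; clients never coordinate."""
    times = []
    for client_id in range(num_clients):
        # Hypothetical per-client generator, standing in for independent clients.
        rng = random.Random(seed * 1_000_003 + client_id)
        times.append(sample_arrival_time(rng, duration_s))
    return sorted(times)


def joined_by(times: list, t: float) -> int:
    """Number of clients whose arrival time is at or before t."""
    return sum(1 for x in times if x <= t)


arrivals = simulate_clients(num_clients=4000, duration_s=1000.0)
for t in (250.0, 500.0, 750.0, 1000.0):
    target = 4000 * (t / 1000.0) ** 2
    print(f"t={t:6.0f}s  joined={joined_by(arrivals, t):5d}  target={target:7.1f}")
```

The orchestrator only has to ship the parameters of F(t) and the test duration; the aggregate join curve then tracks the target without per-client scheduling messages.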

SLIDE 41

Orchestrator Components

SLIDE 42

Roadmap

  • Motivation
  • Protection Design
  • Orchestration Design
  • Evaluations
  • Conclusions and Future Work

SLIDE 43

Software Implementation

  • Compositional Runtime
– Modular design, including scheduler, dynamic loading of blocks, etc.
– 3400 lines of code
  • Pre-packaged blocks
– HTTP integration, UDP sockets and debugging
– 500 lines of code
  • Live streaming machine
– 4200 lines of code

SLIDE 44

Experimental Opportunities

SLIDE 45

Protection and Accuracy

Piece Missing Ratio

                        Virtual Playpoint   Real Playpoint
Buggy                   8.73%               N/A
R=rCDN                  8.72%               0%
R=rCDN w/ bottleneck    8.81%               5.42%
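One way to read the two piece-missing ratios reported here is sketched below. It assumes that the ratio at the virtual playpoint counts pieces Experiment failed to deliver in time, and that the ratio at the real playpoint counts pieces still missing after Repair has run; the function and variable names are hypothetical.

```python
def missing_ratios(pieces, experiment_delivered, repair_delivered):
    """Return (ratio at virtual playpoint, ratio at real playpoint).

    Assumptions: `experiment_delivered` holds the piece ids Experiment got
    before the virtual playpoint, and `repair_delivered` holds the ids Repair
    got before the real playpoint.
    """
    pieces = list(pieces)
    missing_at_virtual = [p for p in pieces if p not in experiment_delivered]
    # Repair only needs to cover what Experiment missed.
    missing_at_real = [p for p in missing_at_virtual if p not in repair_delivered]
    total = len(pieces)
    return len(missing_at_virtual) / total, len(missing_at_real) / total


pieces = range(100)
experiment = {p for p in pieces if p % 10 != 0}       # Experiment misses 10 pieces
repair = {p for p in pieces if p % 10 == 0} - {50}    # Repair recovers all but one
virtual_ratio, real_ratio = missing_ratios(pieces, experiment, repair)
print(virtual_ratio, real_ratio)
```

Under this reading, the first number measures the Experiment algorithm and the second measures what a real viewer would have seen.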

SLIDE 46

Protection and Accuracy

Piece Missing Ratio

                            Virtual Playpoint   Real Playpoint
PCE bottleneck              9.13%               0.15%
PCE w/ higher bottleneck    8.85%               0%

SLIDE 47

Orchestration: Distributed Activation

SLIDE 48

Utility on Top: Deterministic Replay

Control non-deterministic inputs

  • Event
  • Message
  • Random seeds

Practical per-client log size

                              Log Size
100 clients; 650 seconds      223KB
300 clients; 1,800 seconds    714KB
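Deterministic replay can be pictured as logging the three non-deterministic inputs and feeding them back later. The following is a minimal sketch under that assumption; the class, the JSON log layout, and the toy decision logic are all hypothetical and not taken from the ShadowStream code.

```python
import json
import random


class RecordingClient:
    """Toy client whose decisions depend only on its inputs and its random seed."""

    def __init__(self, seed=None, log=None):
        self.replaying = log is not None
        # In live mode start a fresh log; in replay mode reuse the recorded one.
        self.log = log if self.replaying else {"seed": seed, "inputs": []}
        self.rng = random.Random(self.log["seed"])
        self.decisions = []

    def handle(self, kind, payload):
        """Process one event or message, recording it when running live."""
        if not self.replaying:
            self.log["inputs"].append([kind, payload])
        choice = self.rng.randrange(8)  # stand-in for a randomized decision
        self.decisions.append((kind, payload, choice))


def replay(serialized_log):
    """Re-run a client from a serialized log and return its decisions."""
    log = json.loads(serialized_log)
    client = RecordingClient(log=log)
    for kind, payload in log["inputs"]:
        client.handle(kind, payload)
    return client.decisions


live = RecordingClient(seed=42)
live.handle("event", "timer fired")
live.handle("message", "request piece 17")
serialized = json.dumps(live.log)
print(len(serialized), "bytes logged; replay matches:", replay(serialized) == live.decisions)
```

Only inputs and the seed are stored, not the client's internal state, which is consistent with the small per-client log sizes listed on this slide.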

SLIDE 49

Roadmap

  • Motivation
  • Protection Design
  • Orchestration Design
  • Evaluations
  • Conclusions and Future Work

SLIDE 50

Contributions

  • Design and implementation of a novel live streaming network that introduces performance evaluation as an intrinsic capability in production networks
– Scalable (PCE) protection of QoE despite large-scale Experiment failures
– Transparent orchestration for flexible testing

SLIDE 51

Future Work

  • Large-scale deployment and evaluation
  • Apply the Shadow (Experiment->Validation->Repair) scheme to other applications
  • Extend the Shadow (Experiment->Validation->Repair) scheme
– E.g., repair does not need to do the same job as Experiment, as long as it masks visible failures

SLIDE 52

Adaptive Rate Streaming Repair

            Accuracy   Protected QoE   Protection Overhead
Follow      1.26x      1.59x           1.49 Kbps
Base        1.26x      1.42x           3.69 Kbps
Adaptive    1.26x      1.58x           1.39 Kbps

SLIDE 53

Thank you!

SLIDE 54

Questions?

SLIDE 55

backup