Popularity Prediction of Facebook Videos for Higher Quality - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Popularity Prediction of Facebook Videos for Higher Quality Streaming

Linpeng Tang, Qi Huang, Amit Puntambekar, Ymir Vigfusson, Wyatt Lloyd, Kai Li

slide-2
SLIDE 2

Videos are Central to Facebook

8 billion views per day

  • 9-year-old singing on America’s Got Talent: 44M views
  • Black bear roaming in Princeton: 3.8K views
  • Small shop making frozen yogurt: 122 views

slide-3
SLIDE 3

Workflow of Videos on Facebook

[Diagram components: Upload, Original, Streaming Video Engine, Encoded, Backend Storage, CDN, ABR Streaming]

ABR streams the best quality version of the video that fits!

Intensive processing needed to create multiple video versions for ABR streaming

slide-4
SLIDE 4

Better Video Streaming from More Processing

  • Better compression at the same quality
  • QuickFire: 20% size reduction using 20X computation
  • More users can view the high quality versions

[Figure: bandwidth and video quality for viewers Alice and Bob]

slide-5
SLIDE 5

Better Video Streaming from More Processing

  • Better compression at the same quality
  • QuickFire: 20% size reduction using 20X computation
  • More users can view the high quality versions

[Figure: bandwidth and video quality for viewers Alice and Bob, with better compression]

slide-6
SLIDE 6

How to apply QuickFire for FB videos

  • Infeasible to encode all videos with QuickFire

– Increase by 20X the already large processing fleet

  • High skew in popularity

– Reap most benefit with modest processing?


slide-7
SLIDE 7

Opportunity: High Skew in Popularity

  • Access logs of 1 million videos randomly sampled by ID
  • Watch time: total time users spent watching a video

[Figure: cumulative watch time ratio (0.0 to 1.0) vs. video rank (10^0 to 10^6, log scale)]

slide-8
SLIDE 8

[Figure: cumulative watch time ratio (0.0 to 1.0) vs. video rank (10^0 to 10^6, log scale)]

Opportunity: High Skew in Popularity

  • We can serve most watch time even with a small fraction of videos encoded with QuickFire
  • Can we predict these videos for more processing?

80%+ watch time
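
The skew curve described on these slides can be reproduced from per-video watch times. The following is a minimal sketch, not the authors' analysis code: the function name and the five sample values are hypothetical, and it assumes the total watch time is nonzero.

```python
def cumulative_watch_time_ratio(watch_times):
    """Return the cumulative share of total watch time by video rank.

    Entry k-1 is the fraction of all watch time contributed by the
    k most-watched videos. Assumes the total watch time is nonzero.
    """
    ranked = sorted(watch_times, reverse=True)  # rank 1 = most watched
    total = sum(ranked)
    ratios = []
    running = 0.0
    for wt in ranked:
        running += wt
        ratios.append(running / total)
    return ratios


# Hypothetical watch times (seconds) for five videos; not real data.
sample = [10, 700, 40, 200, 50]
curve = cumulative_watch_time_ratio(sample)
```

Plotting `curve` against rank on a log-scaled x-axis gives a chart of the same shape as the one on the slide; a steep early rise means a few videos account for most of the watch time.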

slide-9
SLIDE 9

CHESS Video Prediction System

  • Popularity prediction is important for higher quality streaming

– Directs encoding toward videos with the largest benefit

  • Goal of CHESS video prediction system

– Identify videos with highest future watch time
– Maximize watch-time ratio with budgeted processing

slide-10
SLIDE 10

CHESS Video Prediction System

[Diagram components: Streaming Video Engine, CDN, Backend Storage, CHESS-VPS. Inputs to CHESS-VPS: access logs; social signals from the Facebook Graph Serving System. Output: predicted popular videos.]

slide-11
SLIDE 11

CHESS Video Prediction System

[Diagram components: Streaming Video Engine, CDN, Backend Storage (original and QuickFire-encoded versions), CHESS-VPS. Inputs to CHESS-VPS: access logs; social signals from the Facebook Graph Serving System. Output: predicted popular videos.]

slide-12
SLIDE 12

CHESS Video Prediction System

[Diagram components: Streaming Video Engine, CDN, Backend Storage (original and QuickFire-encoded versions), CHESS-VPS. Inputs to CHESS-VPS: access logs; social signals from the Facebook Graph Serving System. Output: predicted popular videos.]

Serving QuickFire-encoded versions!

slide-13
SLIDE 13

Requirements of CHESS-VPS

  • Handle working set of ~80 million videos
  • Generate new predictions every few minutes
  • Requires a new prediction algorithm: CHESS!


slide-14
SLIDE 14

CHESS Key Insights

  • Efficiently model influence of past accesses as the basis for scalable prediction
  • Combine multiple predictors to boost accuracy

slide-15
SLIDE 15

Efficiently model past access influence

  • Self-exciting process

– A past access makes future accesses more probable, i.e., it has some influence on future popularity

[Figure: Influence of past accesses; x-axis: Time, y-axis: Influence]

slide-16
SLIDE 16

Efficiently model past access influence

  • Self-exciting process

– A past access makes future accesses more probable, i.e., it has some influence on future popularity
– Prediction: sum up total future influence of all past accesses

[Figure: influence of past accesses (x-axis: Time, y-axis: Influence), marking "now" and the total future influence]

slide-17
SLIDE 17

Efficiently model past access influence

  • Influence modeled with kernel function
  • Power-law kernel used by prior works

– Provides high accuracy
– Must scan all past accesses: O(N) time/space, not scalable

[Figure: power-law kernel $y = (x + \beta)^{-\alpha}$; x-axis: Time, y-axis: Influence]
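
To make the O(N) cost concrete, here is a minimal sketch of a power-law-kernel score. It assumes each access is weighted by its watch time and that the score is the kernel-decayed sum at the current time. The function name, the parameter values `alpha` and `beta`, and the sample history are hypothetical and not taken from the slides.

```python
def power_law_score(accesses, now, alpha=1.5, beta=1.0):
    """Sum the power-law-decayed influence of every past access.

    accesses: list of (timestamp, watch_time) pairs with timestamp <= now.
    Every call touches all N stored accesses, so both the time per
    prediction and the space per video are O(N).
    """
    return sum(x * (now - t + beta) ** (-alpha) for t, x in accesses)


# Hypothetical access history: (time, watch time) pairs.
history = [(0.0, 30.0), (2.0, 10.0), (3.0, 20.0)]
score = power_law_score(history, now=3.0)
```

The full history must be kept and rescanned because the power-law decay between two points in time depends on how old the access already is.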

slide-18
SLIDE 18

Efficiently model past access influence

  • Influence modeled with kernel function
  • Power-law kernel used by prior works
  • Key insight: use exponential kernel for scalability

[Figure: power-law kernel $y = (x + \beta)^{-\alpha}$ vs. exponential kernel $y = \exp(-x/w)$; x-axis: Time, y-axis: Influence]

slide-19
SLIDE 19

Efficiently model past access influence

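  • Self-exciting process with the exponential kernel

$$\tilde{F}(t) = \frac{x}{w} + \exp\left(-\frac{t - u}{w}\right)\tilde{F}(u)$$

Here $x/w$ is the current access watch-time term, $\tilde{F}(u)$ is the previous prediction, and $\exp(-(t - u)/w)$ is the exponential decay applied to it.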
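
The recurrence can be maintained with constant state per video. The sketch below is one possible rendering, not the authors' implementation: the class name, the zero starting value, and the sample numbers are assumptions, and `u` is taken to be the time of the previous access.

```python
import math


class ExpKernel:
    """Self-exciting popularity estimate with one exponential kernel.

    Stores only the last prediction and its timestamp: O(1) space.
    """

    def __init__(self, window):
        self.w = window    # decay time scale w
        self.value = 0.0   # F~(u), prediction at the previous access
        self.last = None   # u, time of the previous access

    def update(self, t, x):
        """Apply F~(t) = x/w + exp(-(t - u)/w) * F~(u) for an access at time t."""
        if self.last is None:
            decay = 1.0    # no history yet; value is 0.0, so this is a no-op
        else:
            decay = math.exp(-(t - self.last) / self.w)
        self.value = x / self.w + decay * self.value
        self.last = t
        return self.value


k = ExpKernel(window=2.0)
k.update(0.0, 4.0)   # first access: 4/2
k.update(2.0, 6.0)   # second access: 6/2 + exp(-1) * previous value
```

Each update costs a single exponential and a multiply-add, regardless of how many accesses the video has received.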

slide-20
SLIDE 20

Efficiently model past access influence

  • Single exponential kernel is less accurate than power-law kernel

– 10% lower watch time ratio

  • O(1) space/time to maintain

Single exponential kernel is less accurate yet scalable
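
One way to see where the O(1) bound comes from is to unroll the exponential-kernel recurrence. The derivation below assumes accesses with watch times $x_1, \dots, x_n$ arrive at times $t_1 < \dots < t_n$ and that the prediction starts from zero; the indices are notation introduced here for illustration.

```latex
\begin{align*}
\tilde{F}(t_n)
  &= \sum_{i=1}^{n} \frac{x_i}{w}\exp\left(-\frac{t_n - t_i}{w}\right) \\
  &= \frac{x_n}{w} + \exp\left(-\frac{t_n - t_{n-1}}{w}\right)
     \sum_{i=1}^{n-1} \frac{x_i}{w}\exp\left(-\frac{t_{n-1} - t_i}{w}\right) \\
  &= \frac{x_n}{w} + \exp\left(-\frac{t_n - t_{n-1}}{w}\right)\tilde{F}(t_{n-1})
\end{align*}
```

The second line uses $\exp(-(a+b)/w) = \exp(-a/w)\exp(-b/w)$, so the whole history collapses into the single stored value $\tilde{F}(t_{n-1})$. The power-law kernel $(x + \beta)^{-\alpha}$ does not factor this way, which is why it requires scanning all past accesses.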

slide-21
SLIDE 21

Combining Efficient Features in a Model

  • Key insight: maintain multiple exponential kernels
  • O(1) space/time

[Figure: actual access pattern (watch time over time) modeled by exponential kernels]

Combining multiple exponential kernels is as accurate as a power-law kernel
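
Maintaining several kernels is a small extension of the single-kernel update: each kernel keeps its own value and decays at its own time scale. The following is a minimal sketch under that reading. The class name and the three window values are hypothetical; the slides do not list the windows actually used.

```python
import math


class MultiExpKernel:
    """A fixed set of exponential kernels, one value per window."""

    def __init__(self, windows):
        self.windows = list(windows)
        self.values = [0.0] * len(self.windows)
        self.last = None   # time of the previous access

    def update(self, t, x):
        """Update every kernel for an access at time t with watch time x."""
        for i, w in enumerate(self.windows):
            if self.last is None:
                decay = 1.0
            else:
                decay = math.exp(-(t - self.last) / w)
            self.values[i] = x / w + decay * self.values[i]
        self.last = t
        return list(self.values)   # one feature per kernel


# Hypothetical windows (in hours), chosen only for illustration.
features = MultiExpKernel(windows=[1.0, 24.0, 240.0])
features.update(0.0, 12.0)
vec = features.update(1.0, 12.0)
```

The state is still constant per video (one number per window plus a timestamp), and the resulting vector can serve as input to a downstream model.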

slide-22
SLIDE 22

Combining Efficient Features in a Model

Social signals further boost accuracy

[Diagram: raw features feed a neural network that predicts future popularity. Past access watch-time passes through multiple kernels; directly-used features are likes, comments, shares, owner likes, and video age.]

slide-23
SLIDE 23

CHESS Video Prediction System

[Diagram components: streaming access logs; Shard1 to Shard4; prediction workers Worker1 to Worker4 with NN models; aggregators (Aggr) producing the aggregated top videos; clients]
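
The worker and aggregator roles in this diagram can be sketched as a sharded top-k merge. This is only an illustration of the general pattern, not the CHESS-VPS implementation: the modulo sharding, the function names, and the scores are all assumptions.

```python
import heapq


def shard_of(video_id, num_shards):
    """Route a video to a worker shard (simple modulo; an assumption)."""
    return video_id % num_shards


def local_top_k(predictions, k):
    """Each prediction worker reports its k highest-scoring videos."""
    return heapq.nlargest(k, predictions.items(), key=lambda kv: kv[1])


def aggregate_top_k(per_worker_tops, k):
    """The aggregator merges worker reports into a global top-k list."""
    merged = [item for top in per_worker_tops for item in top]
    return heapq.nlargest(k, merged, key=lambda kv: kv[1])


# Hypothetical predicted scores keyed by video ID.
scores = {1: 5.0, 2: 9.0, 3: 1.0, 4: 7.0, 5: 3.0, 6: 8.0}
workers = [{} for _ in range(2)]
for vid, s in scores.items():
    workers[shard_of(vid, 2)][vid] = s

# Each worker reports at least as many items as the global k, so the
# merged result equals the true global top-k.
top = aggregate_top_k([local_top_k(w, 3) for w in workers], 3)
```

If workers reported fewer items than the global k, a shard holding several of the top videos could have some of them dropped before aggregation.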

slide-24
SLIDE 24

Evaluation

  • What is the accuracy of CHESS?
  • How do our design decisions on CHESS affect its accuracy and resource consumption?
  • What is CHESS’s impact on video processing and watch time ratio of QuickFire?

slide-25
SLIDE 25

Evaluation

  • What is the accuracy of CHESS?
  • How do our design decisions on CHESS affect its accuracy and resource consumption?
  • What is CHESS’s impact on video processing and watch time ratio of QuickFire?

slide-26
SLIDE 26

Metrics

  • Watch time ratio

– Ratio of watch time from better encoded videos
– Directly proportional to benefits of better encoding

  • Processing time

slide-27
SLIDE 27

Metrics

  • Watch time ratio

– Ratio of watch time from better encoded videos
– Directly proportional to benefits of better encoding

  • Processing time (infeasible to encode all videos)

– Video length ∝ processing time
– Video length ratio ≈ computation overhead
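
Both metrics are simple ratios over a set of videos selected for better encoding. The sketch below shows one way to compute them, assuming each video has a known length and a known watch time over the evaluation period. The function name and the three-video catalog are hypothetical.

```python
def encoding_metrics(videos, selected_ids):
    """Return (watch_time_ratio, video_length_ratio) for a selection.

    videos: dict mapping video ID -> (length_seconds, watch_time_seconds).
    selected_ids: IDs chosen for the more expensive encoding.
    """
    total_len = sum(length for length, _ in videos.values())
    total_wt = sum(wt for _, wt in videos.values())
    sel_len = sum(videos[v][0] for v in selected_ids)
    sel_wt = sum(videos[v][1] for v in selected_ids)
    return sel_wt / total_wt, sel_len / total_len


# Hypothetical catalog: ID -> (length in seconds, watch time in seconds).
catalog = {"a": (60, 9000), "b": (120, 900), "c": (20, 100)}
wt_ratio, len_ratio = encoding_metrics(catalog, {"a"})
```

In this toy catalog, encoding only video "a" covers 90% of watch time while processing 30% of the total video length; a good predictor pushes the first number up and the second down.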


slide-28
SLIDE 28

CHESS is Accurate

  • Vary video length ratio (proxy for processing overhead)
  • Observe watch time ratio of better encoded videos

[Figure: watch time ratio (0.0 to 1.0) vs. video length ratio (10^-6 to 10^0, log scale)]

slide-29
SLIDE 29

CHESS is Accurate

[Figure: watch time ratio (0.0 to 1.0) vs. video length ratio (10^-6 to 10^0, log scale). Series: Initial(1d) (CACM'10)]

  • Initial(1d): initial watch time up to 1 day after upload
slide-30
SLIDE 30

CHESS is Accurate

[Figure: watch time ratio (0.0 to 1.0) vs. video length ratio (10^-6 to 10^0, log scale). Series: SEISMIC (KDD'15), Initial(1d) (CACM'10)]

  • Initial(1d): initial watch time up to 1 day after upload
  • SEISMIC: handcrafted power-law kernel
slide-31
SLIDE 31

CHESS is Accurate

[Figure: watch time ratio (0.0 to 1.0) vs. video length ratio (10^-6 to 10^0, log scale). Series: CHESS, SEISMIC (KDD'15), Initial(1d) (CACM'10)]

  • Initial(1d): initial watch time up to 1 day after upload
  • SEISMIC: handcrafted power-law kernel

CHESS provides higher accuracy than even the non-scalable state of the art

slide-32
SLIDE 32

CHESS Reduces Encoding Processing

  • Predict on whole Facebook video workload in real-time
  • Sample 0.5% of videos for actual encoding

[Figure: watch time ratio (0.3 to 0.9) vs. processing CPU (10% to 60%). Series: CHESS, Owner Likes]

CHESS reduces CPU by 3x (54% to 17%) for 80% watch time ratio

slide-33
SLIDE 33

Related Work

  • Popularity Prediction: Hawkes'71, Crane'08, Szabo'10, Cheng'14, SEISMIC'15. CHESS is scalable and accurate.
  • Video QoE Optimization: Liu'12, Aaron'15, Huang'15, Jiang'16, QuickFire'16. Optimize encoding with access feedback.
  • Caching: LFU'93, LRU'94, SLRU'94, GDS'97, GDSF'98, MQ'01. Identify hot items to improve efficiency.

slide-34
SLIDE 34

Conclusion

  • Popularity prediction can direct encoding for higher quality streaming
  • CHESS: first scalable and accurate popularity predictor

– Model influence of past accesses with O(1) time/space
– Combine multiple kernels & social signals to boost accuracy

  • Evaluation on Facebook video workload

– More accurate than the non-scalable state-of-the-art method
– Serve 80% user watch time with 3x reduction in processing