The Lifecyle of a Youtube Video: Phases, Content and Popularity
Honglin Yu, Lexing Xie, Scott Sanner
Australian National University, NICTA
May 22, 2015
Overview
“The scarce, and therefore valuable, resource is now attention”
— B. A. Huberman
◮ Previous: Crane and Sornette’s model (PNAS 2008)
◮ But, in reality ...
[Figure: daily viewcount over time for two example videos]
Generalized Power-law Phases
[CS08]: $x[t] \sim t^{b}$
◮ result of epidemic branching processes
[Ours]: $x[t] = a t^{b} + c$
◮ sufficiently expressive for monotonic curves
◮ model multiple phases
◮ account for different background processes
Both are efficient to fit.
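A minimal sketch of one way to fit the form x[t] = a t^b + c, not necessarily the procedure the authors use: for a fixed exponent b the model is linear in a and c, so each candidate b on a grid can be solved in closed form and the best one kept. The name `fit_power_law`, the exponent grid, and the convention t = 1, 2, ... are assumptions made for illustration.

```python
def fit_power_law(x, b_grid=None):
    """Least-squares fit of x[t] ~ a * t**b + c for t = 1..len(x).

    Returns (a, b, c, sse). Illustrative sketch, not the authors' code.
    """
    if len(x) < 2:
        raise ValueError("need at least two points")
    if b_grid is None:
        # Hypothetical grid: exponents from -3 to 3 in steps of 0.05, skipping 0.
        b_grid = [k / 20.0 for k in range(-60, 61) if k != 0]
    n = len(x)
    mean_x = sum(x) / n
    best = None
    for b in b_grid:
        # With b fixed, u = t**b turns the model into x = a*u + c (simple regression).
        u = [float(t) ** b for t in range(1, n + 1)]
        mean_u = sum(u) / n
        var_u = sum((ui - mean_u) ** 2 for ui in u)
        if var_u == 0.0:
            continue  # u is constant, so a and c cannot be separated
        a = sum((ui - mean_u) * (xi - mean_x) for ui, xi in zip(u, x)) / var_u
        c = mean_x - a * mean_u
        sse = sum((a * ui + c - xi) ** 2 for ui, xi in zip(u, x))
        if best is None or sse < best[3]:
            best = (a, b, c, sse)
    return best
```

The grid search trades precision in b for simplicity; a finer grid or a local refinement around the best b would tighten the fit.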
The Phase-finding Algorithm
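$$\min \;\; \underbrace{\sum_{i=1}^{n} E_i\{\, x[\underbrace{t^{s}_{i} : t^{e}_{i}}_{\text{boundary}}],\ \underbrace{a_i, b_i, c_i}_{\text{parameter}} \,\}}_{\text{fitting error}} \; + \; \underbrace{\eta\,(n-1)}_{\text{regularizer}}$$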
◮ Try all possible segmentations
◮ Dynamic programming, with curve fitting in the inner loop
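The objective above can be minimized with a standard dynamic program over segment end points. The sketch below is a generic illustration under stated assumptions, not the authors' implementation: `segment`, `seg_cost`, and `min_len` are hypothetical names, and `seg_cost` stands in for whatever per-segment fitting error E_i is used.

```python
def segment(x, seg_cost, eta, min_len=1):
    """Split x into contiguous segments minimising
    sum(seg_cost(segment)) + eta * (number_of_segments - 1).

    Returns (total_cost, [(start, end), ...]) with half-open index ranges.
    """
    T = len(x)
    if T < min_len or T == 0:
        raise ValueError("series shorter than min_len")
    INF = float("inf")
    best = [INF] * (T + 1)  # best[j]: optimal objective for the prefix x[:j]
    back = [0] * (T + 1)    # back[j]: start index of the last segment in that optimum
    best[0] = -eta          # offsets the eta added for the first segment
    for j in range(min_len, T + 1):
        for i in range(0, j - min_len + 1):
            if best[i] == INF:
                continue
            # Fitting happens inside the loop: one cost evaluation per (i, j) pair.
            cost = best[i] + seg_cost(x[i:j]) + eta
            if cost < best[j]:
                best[j] = cost
                back[j] = i
    # Walk the back-pointers to recover the segment boundaries.
    bounds = []
    j = T
    while j > 0:
        bounds.append((back[j], j))
        j = back[j]
    bounds.reverse()
    return best[T], bounds
```

This evaluates `seg_cost` on O(T^2) candidate segments, so the cost of a single fit dominates the running time.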
The “Tweeted Video” Dataset
Category        #videos    Category     #videos
Music            64096     Howto          4357
Entertainment    26602     Travel         3379
Comedy           14616     Games          3299
People           12759     Nonprofit      2672
News             10422     Autos          2398
Film              8356     Animals        2375
Sports            7872     Shows           407
Tech              4626     Movies           15
Education         4577     Trailers         13
Total number: 172841
◮ Unique longitudinal popularity history for a large and diverse set of videos
◮ From a 20-30% sample of tweets, 2009.06-07
Examples of Segmentation Result
[Figure: daily viewcount vs. dates (mmm-yy) for four example videos. (a) ID: 3o3hfNmtxYg (b) ID: IoNcZRkwbCA (c) ID: Hi0cQ5ELdt4 (d) ID: LRDihKbdrwc]
Four Types of Phases
172K videos
[Figure: percentage of phases of each type: convex.inc, convex.dec, concave.inc, concave.dec]
◮ #phases: 3.3/video
◮ duration: 233/phase
◮ #views: 22K+/phase
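For a phase of the form x[t] = a t^b + c with t > 0, the sign of the first derivative a b t^(b-1) gives the trend and the sign of the second derivative a b (b-1) t^(b-2) gives the shape. The sketch below maps fitted (a, b) to the four labels on this slide; the function name `phase_type` and the "degenerate" label for flat or linear phases are assumptions, and the source does not state that the types are assigned this way.

```python
def phase_type(a, b):
    """Label a phase x[t] = a * t**b + c (t > 0) by shape and trend."""
    slope = a * b                # sign of dx/dt for t > 0
    curvature = a * b * (b - 1)  # sign of d2x/dt2 for t > 0
    if slope == 0 or curvature == 0:
        return "degenerate"      # flat (a == 0 or b == 0) or linear (b == 1)
    shape = "convex" if curvature > 0 else "concave"
    trend = "inc" if slope > 0 else "dec"
    return f"{shape}.{trend}"
```

For example, a decaying burst with a > 0 and b < 0 comes out as convex.dec, while a > 0 with 0 < b < 1 comes out as concave.inc.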
#Phases vs. Video Popularity
[Figure: %videos with various #phases (1, 2, 3, 4, 5, 6, ≥ 7) vs. popularity percentile]
◮ Popular videos have more phases.
Dominant Convex Decreasing Phases
[Figure: daily viewcount vs. #days after uploading]
Dominant phase: $T_{\text{phase}} \geq 0.9\,T$
◮ Novelty is the (only) most important factor
◮ These videos do not revive
[Figure: %videos with domVexDec phase, by category: Film, Music, Animals, Comedy, Education, Howto, Travel, Autos, Entertainment, Sports, People, Tech, Games, Nonprofit, News]
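The dominance criterion on this slide, T_phase ≥ 0.9T, can be checked directly on a list of phase boundaries. This is a small sketch under assumptions: `dominant_phase` is a hypothetical helper, phases are half-open day ranges, and T is taken to be the total length covered by all phases.

```python
def dominant_phase(phases, ratio=0.9):
    """Return the index of the first phase covering at least ratio * T days, else None.

    phases: list of (start, end) half-open day ranges; T is their total length.
    """
    total = sum(end - start for start, end in phases)
    for k, (start, end) in enumerate(phases):
        if end - start >= ratio * total:
            return k
    return None
```

A video whose returned phase is also convex and decreasing would count as having a dominant convex-decreasing phase in the sense used here.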
How does popularity change?
[Figure: popularity percentile (%) at 1 year vs. percentile at 6 months; annotations: rank change, time range]
◮ Many videos go through a jump in popularity.
◮ They have been in a continuously increasing phase, or have at least one new phase.
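To make the comparison of percentiles at 6 months and at 1 year concrete, the sketch below ranks videos by viewcount at each time and takes the difference. The percentile definition (share of videos with at most as many views), the function name `percentile_ranks`, and the toy numbers in the usage note are assumptions for illustration, not values from the dataset.

```python
import bisect

def percentile_ranks(views):
    """Percentile (0-100] of each video: share of videos with at most as many views."""
    ordered = sorted(views)
    n = len(ordered)
    return [100.0 * bisect.bisect_right(ordered, v) / n for v in views]

def rank_change(views_early, views_late):
    """Per-video change in popularity percentile between two points in time."""
    early = percentile_ranks(views_early)
    late = percentile_ranks(views_late)
    return [l - e for e, l in zip(early, late)]
```

With made-up counts `rank_change([10, 200, 50, 1000], [5000, 300, 60, 1500])`, the first video moves from the lowest to the highest percentile, which is the kind of jump this slide describes.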
Phase-aware Viewcount Prediction
[Figure: daily viewcount vs. days after uploading; days up to the pivot date $t_p$ are features, days after it are the target]
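◮ Target: $\chi^{*} = \sum_{\tau=1}^{\Delta t} x[t_p + \tau]$
◮ Prediction: $\hat{\chi} = \sum_{\tau=1}^{t_p} \alpha_{\tau}\, x[\tau]$
◮ Measure: normalized MSE, $\epsilon = \frac{1}{\Delta t\,|V|} \sum_{v \in V} (\chi^{*} - \hat{\chi})^{2}$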
[Figure: prediction error vs. pivot date, two panels: vex.inc and cav.dec]
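The target, prediction, and normalized MSE defined on this slide translate directly into code. The sketch below only evaluates those three formulas for given weights; how the weights α are learned, and how phase information enters the predictor, is not specified on the slide and is left out. Function names and the convention that `x[0]` holds day 1 are assumptions.

```python
def target(x, t_p, dt):
    """chi* : total views on days t_p+1 .. t_p+dt (x[0] is day 1)."""
    return sum(x[t_p:t_p + dt])

def predict(x, alpha):
    """chi-hat : weighted sum of the first t_p = len(alpha) daily viewcounts."""
    return sum(a * xi for a, xi in zip(alpha, x))

def normalized_mse(series, alpha, dt):
    """epsilon = 1 / (dt * |V|) * sum over videos of (chi* - chi-hat)**2."""
    t_p = len(alpha)
    total = sum((target(x, t_p, dt) - predict(x, alpha)) ** 2 for x in series)
    return total / (dt * len(series))
```

One plausible way to obtain `alpha` is ordinary least squares over a training set of videos, fitted separately per pivot date; that choice is an assumption here.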
Summary
◮ Main contributions
  ◮ New representation: popularity phases.
  ◮ New method: phase extraction algorithm.
  ◮ A large-scale measurement study.
  ◮ Better viewcount prediction.
◮ Links
  ◮ Segmentation Algorithm: https://github.com/yuhonglin/segfit
  ◮ Dataset: https://github.com/yuhonglin/ytphasedata
  ◮ Data crawler: https://github.com/yuhonglin/YTCrawl