The Lifecyle of a Youtube Video: Phases, Content and Popularity




SLIDE 1

The Lifecyle of a Youtube Video: Phases, Content and Popularity

Honglin Yu, Lexing Xie, Scott Sanner

Australian National University, NICTA

May 22, 2015

SLIDE 2

Overview

“The scarce, and therefore valuable, resource is now attention”

— B. A. Huberman

◮ Previous: Crane and Sornette’s model (PNAS 2008)
◮ But, in reality ...

[Figure: daily viewcount over time for two example videos, one spanning Aug 2011 to Apr 2013 and one spanning Aug 2011 to Jun 2012.]

SLIDE 3

Generalized Power-law Phases

[Figure: example curves for the two models.]

[CS08]: $x[t] \sim t^{b}$

◮ result of epidemic branching processes

[Ours]: $x[t] = a\,t^{b} + c$

◮ sufficiently expressive for monotonic curves
◮ model multiple phases
◮ account for different background processes

Both are efficient to fit
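
One way to see why the generalized power law x[t] = a t^b + c is cheap to fit: for a fixed exponent b the model is linear in a and c, so those two have a closed-form least-squares solution and only b needs a search. The sketch below is a minimal illustration under that assumption, not the authors' fitting routine; `fit_power_law` and the exponent grid are hypothetical choices made for this example.

```python
def fit_power_law(x, b_grid=None):
    """Fit x[t] ~ a * t**b + c for t = 1..len(x).

    Grid search over the exponent b; for each b, (a, c) come from
    ordinary least squares on the transformed variable u = t**b.
    Returns (a, b, c, sse), where sse is the sum of squared errors.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    if b_grid is None:
        # Hypothetical search range: -3.0 .. 3.0 in steps of 0.05.
        # b = 0 is skipped because t**0 is constant and cannot be
        # separated from the offset c.
        b_grid = [k / 20 for k in range(-60, 61) if k != 0]

    mean_x = sum(x) / n
    best = None
    for b in b_grid:
        u = [t ** b for t in range(1, n + 1)]
        mean_u = sum(u) / n
        var_u = sum((ui - mean_u) ** 2 for ui in u)
        cov = sum((ui - mean_u) * (xi - mean_x) for ui, xi in zip(u, x))
        a = cov / var_u              # closed-form slope
        c = mean_x - a * mean_u      # closed-form intercept
        sse = sum((a * ui + c - xi) ** 2 for ui, xi in zip(u, x))
        if best is None or sse < best[3]:
            best = (a, b, c, sse)
    return best


if __name__ == "__main__":
    # Synthetic decaying series, used only to exercise the function.
    series = [500 * t ** -0.5 + 20 for t in range(1, 51)]
    a, b, c, sse = fit_power_law(series)
    print(a, b, c, sse)
```

A finer grid or a one-dimensional optimizer over b would sharpen the estimate; the grid is used here only to keep the example dependency-free.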

SLIDE 4

The Phase-finding Algorithm

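$$
\min \; \sum_{i=1}^{n} \underbrace{E_i\{\, x[\underbrace{t_i^{s} : t_i^{e}}_{\text{boundary}}],\ \underbrace{a_i, b_i, c_i}_{\text{parameters}} \,\}}_{\text{fitting error}} \;+\; \underbrace{\eta\,(n-1)}_{\text{regularizer}}
$$

◮ Try all possible segmentations
◮ Dynamic programming with fitting in the loop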
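
The objective above (per-phase fitting error plus a penalty η for every additional phase) can be minimized exactly by dynamic programming over the end point of the last phase. The sketch below is a plain O(n²) version under several assumptions that are not from the slides: the per-segment fit is a coarse grid search over b, time restarts at t = 1 inside each phase, and `min_len` is a hypothetical minimum phase length. It is an illustration of the idea, not the segfit implementation.

```python
# Hypothetical exponent grid: -3.0 .. 3.0 in steps of 0.25, without 0.
B_GRID = [k / 4 for k in range(-12, 13) if k != 0]


def segment_error(seg):
    """Smallest squared error of a*t**b + c on one segment (t = 1..len)."""
    n = len(seg)
    mean_x = sum(seg) / n
    best = float("inf")
    for b in B_GRID:
        u = [t ** b for t in range(1, n + 1)]
        mean_u = sum(u) / n
        var_u = sum((ui - mean_u) ** 2 for ui in u)
        cov = sum((ui - mean_u) * (xi - mean_x) for ui, xi in zip(u, seg))
        a = cov / var_u
        c = mean_x - a * mean_u
        sse = sum((a * ui + c - xi) ** 2 for ui, xi in zip(u, seg))
        best = min(best, sse)
    return best


def find_phases(x, eta, min_len=3):
    """Minimize sum of segment errors + eta * (number of phases - 1).

    Returns (phases, cost); phases is a list of half-open (start, end)
    index pairs covering x.
    """
    n = len(x)
    if n < min_len:
        raise ValueError("series shorter than min_len")
    inf = float("inf")
    # cost[j]: best objective for x[:j]. Starting at -eta means that
    # n phases are charged eta * (n - 1), as in the objective.
    cost = [inf] * (n + 1)
    cost[0] = -eta
    prev = [0] * (n + 1)
    for j in range(min_len, n + 1):
        for i in range(0, j - min_len + 1):
            if cost[i] == inf:
                continue  # x[:i] cannot be segmented
            cand = cost[i] + eta + segment_error(x[i:j])  # fit inside the loop
            if cand < cost[j]:
                cost[j], prev[j] = cand, i
    # Walk the back-pointers to recover the phase boundaries.
    phases, j = [], n
    while j > 0:
        phases.append((prev[j], j))
        j = prev[j]
    return phases[::-1], cost[n]


if __name__ == "__main__":
    # Synthetic two-phase series: a decay followed by a rise.
    x = [1000 * t ** -1.0 + 10 for t in range(1, 21)]
    x += [5 * t ** 1.5 + 50 for t in range(1, 21)]
    print(find_phases(x, eta=100.0))
```

Larger η yields fewer, longer phases; η = 0 lets the segmentation split wherever that lowers the fitting error.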

SLIDE 5

The “Tweeted Video” Dataset

Category        #videos    Category     #videos
Music             64096    Howto           4357
Entertainment     26602    Travel          3379
Comedy            14616    Games           3299
People            12759    Nonprofit       2672
News              10422    Autos           2398
Film               8356    Animals         2375
Sports             7872    Shows            407
Tech               4626    Movies            15
Education          4577    Trailers          13

Total number: 172841

◮ Unique longitudinal popularity history for a large+diverse set of videos

◮ From 20-30% sample of tweets 2009.06-07

SLIDE 6

Examples of Segmentation Result

[Figure: segmentation results for four example videos; x-axis: dates (mmm-yy), y-axis: daily viewcount. (a) ID: 3o3hfNmtxYg (b) ID: IoNcZRkwbCA (c) ID: Hi0cQ5ELdt4 (d) ID: LRDihKbdrwc]

SLIDE 7

Four Types of Phases

172K videos

[Figure: share of phases by type (convex.inc, convex.dec, concave.inc, concave.dec); axis from 0% to 60%.]

#phases: 3.3/video

duration: 233/phase

#views: 22K+/phase
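
The four phase types can be read off a fitted curve x[t] = a t^b + c. A minimal sketch, assuming the type is decided by the signs of the first and second derivatives for t > 0 (the slides do not spell out the rule); `phase_type` is a hypothetical helper name.

```python
def phase_type(a, b):
    """Classify x[t] = a * t**b + c on t > 0.

    dx/dt   = a * b * t**(b - 1)            -> sign of a * b
    d2x/dt2 = a * b * (b - 1) * t**(b - 2)  -> sign of a * b * (b - 1)
    Returns one of "convex.inc", "convex.dec", "concave.inc",
    "concave.dec", or None for flat or straight-line fits.
    """
    slope = a * b
    curvature = a * b * (b - 1)
    if slope == 0 or curvature == 0:
        return None  # constant (a = 0 or b = 0) or linear (b = 1)
    trend = "inc" if slope > 0 else "dec"
    shape = "convex" if curvature > 0 else "concave"
    return f"{shape}.{trend}"


if __name__ == "__main__":
    for a, b in [(500, -0.5), (5, 1.5), (5, 0.5), (-5, 1.5)]:
        print(a, b, phase_type(a, b))
```

The offset c does not affect the type, since it shifts the curve without changing its slope or curvature.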

SLIDE 8

#Phases vs. Video Popularity

[Figure: %videos with various #phases (1, 2, 3, 4, 5, 6, ≥ 7) against popularity percentile (5 to 95); y-axis from 0.0 to 1.0.]

◮ Popular videos have more phases.

SLIDE 9

Dominant Convex Decreasing Phases

[Figure: daily viewcount against #days after uploading for an example video with a dominant convex-decreasing phase.]

$T_{\text{phase}} \geq 0.9\,T$

◮ Novelty is the (only) most important factor
◮ Do not revive

[Figure: %videos with domVexDec phase by category (Film, Music, Animals, Comedy, Education, Howto, Travel, Autos, Entertainment, Sports, People, Tech, Games, Nonprofit, News); axis from 0.0 to 0.7.]
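
The dominance criterion Tphase ≥ 0.9T can be checked directly on a segmentation. The sketch below assumes T is the total observed length of the video's history and Tphase the length of a single phase, with phases given as half-open (start, end) index pairs; both the representation and the name `dominant_phase` are hypothetical.

```python
def dominant_phase(phases, ratio=0.9):
    """Return the index of the phase covering at least `ratio` of the
    whole history, or None if no phase is that long.

    phases: list of contiguous half-open (start, end) index pairs.
    """
    total = phases[-1][1] - phases[0][0]
    for k, (start, end) in enumerate(phases):
        if end - start >= ratio * total:
            return k
    return None


if __name__ == "__main__":
    print(dominant_phase([(0, 5), (5, 100)]))
    print(dominant_phase([(0, 50), (50, 100)]))
```

Combining this check with the type of the dominant phase would single out videos whose dominant phase is convex decreasing.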

SLIDE 10

How does popularity change?

[Figure: popularity percentile (%) at 1 year against percentile at 6 months; annotated with "rank change →" and "time range →".]

◮ Many videos go through a jump in popularity.
◮ They have been in a continuously increasing phase, or have at least one new phase.
SLIDE 11

Phase-aware Viewcount Prediction

[Figure: daily viewcount against days after uploading for an example video, marking the feature window before the pivot date $t_p$ and the target window after it.]

◮ Target: $\chi^{*} = \sum_{\tau=1}^{\Delta t} x[t_p + \tau]$
◮ Prediction: $\hat{\chi} = \sum_{\tau=1}^{t_p} \alpha_{\tau}\, x[\tau]$
◮ Measure: normalized MSE, $\epsilon = \frac{1}{\Delta t\,|V|} \sum_{v \in V} (\chi^{*} - \hat{\chi})^{2}$

[Figure: prediction error against pivot date (30 to 120), one panel for vex.inc and one for cav.dec.]
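
The target, prediction and error measure above translate almost line by line into code. In this sketch each video is a list of daily viewcounts with index 0 holding day 1, the weights α are simply passed in (how they are learned, and how phase information enters, is not shown), and all function names are hypothetical.

```python
def target(x, tp, dt):
    """chi* = sum of x over days tp+1 .. tp+dt (days are 1-indexed)."""
    return sum(x[tp + tau - 1] for tau in range(1, dt + 1))


def predict(x, alpha):
    """chi_hat = sum over days 1 .. tp of alpha[tau] * x[tau], tp = len(alpha)."""
    return sum(a * x[i] for i, a in enumerate(alpha))


def normalized_mse(videos, alpha, dt):
    """epsilon = 1 / (dt * |V|) * sum over videos of (chi* - chi_hat)**2."""
    tp = len(alpha)
    total = sum((target(x, tp, dt) - predict(x, alpha)) ** 2 for x in videos)
    return total / (dt * len(videos))


if __name__ == "__main__":
    # Toy data and weights, chosen only to exercise the functions.
    videos = [[1, 2, 3, 4, 5, 6], [2, 2, 2, 2, 2, 2]]
    alpha = [1.0, 1.0, 1.0]   # pivot date tp = 3
    print(normalized_mse(videos, alpha, dt=2))
```

In practice the weights would be estimated by linear regression on a training set of videos, for example one set of weights per pivot date.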

SLIDE 12

Summary

◮ Main contribution
  ◮ New representation: popularity phases.
  ◮ New method: phase extraction algorithm.
  ◮ A large-scale measurement study.
  ◮ Better viewcount prediction.

◮ Links
  ◮ Segmentation Algorithm: https://github.com/yuhonglin/segfit
  ◮ Dataset: https://github.com/yuhonglin/ytphasedata
  ◮ Data crawler: https://github.com/yuhonglin/YTCrawl

◮ Our on-going work: generative model of popularity

Thank you!

SLIDE 13

[CS08] Riley Crane and Didier Sornette. Robust dynamic classes revealed by measuring the response function of a social system. Proceedings of the National Academy of Sciences, 105(41):15649–15653, 2008.