Popularity over time: Analysis of Videos on YouTube (Tizian Sarre)



SLIDE 1

Fakultät für Informatik

Technische Universität München

Popularity over time – Analysis of Videos on YouTube

Tizian Sarre

Advisor(s): Dr. Heiko Niedermayer
Supervisor: Prof. Dr.-Ing. Georg Carle

Chair of Network Architectures and Services
Department of Informatics
Technical University of Munich (TUM)

SLIDE 2

Outline

  • YouTube Platform
  • Motivation
  • Dataset
  • Modeling
  • Conclusion
  • Future work
  • References
SLIDE 3

The Platform

  • 2nd biggest website worldwide
  • Over 1 billion users
  • More than 4 billion daily views
  • More than 300 hours of newly uploaded videos every minute
SLIDE 4

Motivation

Huge amounts of data

  • Storage costs
  • Networking costs

Video popularity analysis

  • Better understanding of user behavior
  • Modeling video views
  • (Network performance improvement)
SLIDE 5

Dataset Overview

Measured for 6 months (October 16th 2015 to April 14th 2016)

YouTube metrics

  • Static video information for 59328 videos
  • 8909 unavailable videos
  • 58594 measured videos

Facebook metrics

  • Collected for the YouTube videos in the dataset

SLIDE 6

Dataset (YouTube)

YouTube metrics

Static video information

  • Title
  • Duration
  • Description
  • Published

Unavailable videos

  • Time

Dynamic video measurements

  • Views
  • Likes
  • Dislikes
  • Comments

SLIDE 7

YouTube Analysis -> Metrics Correlation

  • YouTube views correlate strongly and positively with video likes
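
A correlation like the one between views and likes can be quantified with a correlation coefficient. The sketch below is only an illustration: the slides do not say which coefficient was used, so Pearson's r on log-scaled counts is an assumption, and the sample numbers are hypothetical rather than taken from the dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical (views, likes) pairs, NOT taken from the dataset.
views = [120, 950, 4_300, 18_000, 250_000, 1_200_000]
likes = [3, 14, 80, 260, 3_900, 15_000]

# Counts span several orders of magnitude, so compare them on a log scale
# (an assumption of this sketch).
log_views = [math.log10(v) for v in views]
log_likes = [math.log10(l) for l in likes]

r = pearson(log_views, log_likes)
print(f"r = {r:.3f}")
```

A value of r close to 1 indicates a strong positive correlation.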

SLIDE 8

YouTube Analysis -> Search Ranking Benefits

  • Keywords, description, etc. show only a weak positive correlation with views

SLIDE 9

YouTube Analysis -> Unavailable Videos

  • Most videos are taken down early
  • Still, the lifetime median is surprisingly high (23)

SLIDE 10

Dataset (Facebook)

Facebook metrics

Dynamic YouTube video measurements

  • Likes
  • Comments
  • Shares
  • Totals
SLIDE 11

Facebook Analysis: Influence on YouTube

  • Facebook shares correlate positively with YouTube video views
  • The linear regression line appears “curved” because of the logarithmic scale
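
Why a straight regression line looks curved on a logarithmic plot can be checked numerically. The sketch below assumes both axes are logarithmic (the slides only mention a "logarithmic scale") and uses hypothetical coefficients `a` and `b`. For y = a*x + b the slope seen on log-log axes is a*x / (a*x + b), which is constant only when b = 0.

```python
def loglog_slope(a, b, x):
    """Local slope of y = a*x + b when both axes are logarithmic.

    d(log y) / d(log x) = a*x / (a*x + b)
    """
    return a * x / (a * x + b)

a, b = 50.0, 1000.0  # hypothetical regression coefficients
for x in (1, 10, 100, 1_000, 10_000):
    print(f"shares={x:>6}  slope on log-log axes={loglog_slope(a, b, x):.3f}")
```

The slope rises from near 0 toward 1 as x grows, so the line bends upward on the plot even though it is straight on linear axes.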

SLIDE 12

Modeling -> Top Days

Certain videos are more popular than others

  • When and how often do the highest views gains (top days) occur?
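
Finding a video's top day could be sketched as follows, assuming one cumulative view count per day. The function names and the sample series are hypothetical.

```python
def daily_gains(cumulative_views):
    """Turn a series of cumulative view counts into per-day gains."""
    return [b - a for a, b in zip(cumulative_views, cumulative_views[1:])]

def top_day(cumulative_views):
    """Return (day, gain) for the day with the highest views gain.

    Day 1 is the gain between the first and the second measurement.
    """
    gains = daily_gains(cumulative_views)
    best = max(range(len(gains)), key=gains.__getitem__)
    return best + 1, gains[best]

# Hypothetical cumulative view counts, one measurement per day.
series = [100, 180, 230, 900, 960, 990]
print(top_day(series))  # (3, 670)
```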
SLIDE 13

Modeling -> Event Days

When and how often do distinctly popular days (event days) occur?

How exactly are event days defined? Desired properties:

  • Independence of future data
  • Outstanding popularity
  • Independence of video age
  • Event day occurrence independence
  • Popularity-only dependence
SLIDE 14

Modeling -> Event Days

Different event day definition attempts have been made

  • Absolute median
  • Popularity categories
  • Power law-based
  • Varying event models
SLIDE 15

Modeling -> Event Days -> Absolute Median

Idea: For each day of video lifetime, calculate the median of the daily views gains across all videos in the dataset

  • Use the daily medians as event day decider
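
The per-day medians could be computed roughly as in the sketch below. It assumes each video's gains are stored as a list indexed by video age, and all data shown is hypothetical.

```python
from statistics import median

def daily_medians(gains_by_video):
    """Median views gain for each day of video lifetime.

    gains_by_video: one list of daily gains per video, indexed by video age
    (index 0 = first measured day). Videos may have different lengths.
    """
    longest = max(len(g) for g in gains_by_video)
    return [
        median(g[day] for g in gains_by_video if len(g) > day)
        for day in range(longest)
    ]

# Hypothetical daily gains of three videos.
gains_by_video = [
    [500, 120, 40, 10],
    [90, 30, 12],
    [20_000, 7_000, 900, 300],
]
print(daily_medians(gains_by_video))  # [500, 120, 40, 155.0]
```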

SLIDE 16

Modeling -> Event Days -> Absolute Median

Why not use the average?

  • Too heavily influenced by higher values
  • Represents reality less appropriately

Figure: daily views gains medians of the dataset
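
A small numeric example shows why the average is a poor summary of views gains. The values are hypothetical, with a single viral video among otherwise modest gains.

```python
from statistics import mean, median

# Hypothetical views gains of seven videos on the same day of their lifetime.
gains = [3, 5, 8, 12, 20, 35, 250_000]

print(f"mean   = {mean(gains):.1f}")   # dominated by the single large value
print(f"median = {median(gains)}")     # 12, close to what a typical video gains
```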

SLIDE 17

Modeling -> Event Days -> Absolute Median

How do we use the daily views gains medians to derive event days?

  • We could use the exact medians as decider
  • Better: add an arbitrary positive modifier to achieve outstanding popularity
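
A median-based decider with a modifier might look like the sketch below. Treating the modifier as an additive offset is an assumption (the slides only say a positive modifier is added), and the medians and gains are hypothetical.

```python
def event_days(gains, medians, modifier):
    """Return the 1-based days whose gain exceeds the day's median plus a modifier."""
    return [
        day
        for day, (gain, med) in enumerate(zip(gains, medians), start=1)
        if gain > med + modifier
    ]

medians = [40, 25, 15, 10, 8]   # hypothetical dataset-wide daily medians
gains = [35, 120, 14, 60, 9]    # hypothetical daily gains of one video

print(event_days(gains, medians, modifier=0))   # exact medians: [2, 4, 5]
print(event_days(gains, medians, modifier=30))  # stricter decider: [2, 4]
```

With the exact medians, day 5 counts as an event day although its gain is barely above the median; the modifier filters out such marginal cases.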
SLIDE 18

Modeling -> Event Days -> Absolute Median

Observation: Lots of event days

SLIDE 19

Modeling -> Event Days -> Absolute Median

Problem: Daily views gains medians are not a good decider

  • Values too small
  • A more fine-grained decider is needed

SLIDE 20

Modeling -> Event Days -> Popularity categories

Idea: Classify videos dynamically according to popularity categories

  • Smaller intervals for lower views gains (due to the geometric views gains distribution)
  • More realistic event day determination via the interval median as decider
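
One way to picture popularity categories is with intervals that grow geometrically, so low gains get narrow intervals. The sketch below is an assumption-heavy illustration: the interval base of 10, the rule "gain above the median of its own interval", and all sample values are hypothetical and not confirmed by the slides.

```python
from statistics import median

def category(gain, base=10):
    """Index k of the interval containing gain.

    Category 0 covers [0, base); category k >= 1 covers [base**k, base**(k+1)).
    """
    k = 0
    while gain >= base ** (k + 1):
        k += 1
    return k

def category_medians(all_gains, base=10):
    """Median gain per category, computed over all observed gains."""
    buckets = {}
    for g in all_gains:
        buckets.setdefault(category(g, base), []).append(g)
    return {k: median(v) for k, v in buckets.items()}

def is_event_day(gain, medians, base=10):
    """Assumed rule: a gain is an event if it exceeds its interval's median."""
    return gain > medians[category(gain, base)]

# Hypothetical views gains observed across the dataset.
all_gains = [2, 4, 7, 15, 40, 80, 300, 450, 2_000]
medians = category_medians(all_gains)

print(medians)
print(is_event_day(90, medians), is_event_day(100, medians))  # True False
```

A gain of 90 is flagged while a gain of 100 is not, because 100 falls into the next interval with a much higher median. The outcome hinges on where the interval boundaries are placed.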

SLIDE 21

Modeling -> Event Days -> Popularity categories

Problem: Event day determination depends on interval choice

  • Seemingly random event days
  • Popularity-only dependence violated

SLIDE 22

Modeling -> Event Days -> Power Law

Idea: Fit a power law-based model to the dataset’s views gains

  • Strong positive deviations are event days
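
A power law-based decider could be sketched as follows. This is a minimal illustration under several assumptions: the model form gain(t) = c * t^(-alpha), the fit by least squares in log-log space, the fit on a single series instead of the whole dataset, and the `factor` that defines a "strong" deviation are all hypothetical choices.

```python
import math

def fit_power_law(gains):
    """Least-squares fit of gain(t) = c * t**(-alpha) in log-log space.

    gains[0] is the gain on day t = 1. Returns (c, alpha).
    """
    xs = [math.log(t) for t in range(1, len(gains) + 1)]
    ys = [math.log(g) for g in gains]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

def event_days(gains, c, alpha, factor=2.0):
    """1-based days whose gain exceeds `factor` times the model's expectation."""
    return [t for t, g in enumerate(gains, start=1)
            if g > factor * c * t ** (-alpha)]

# Hypothetical daily gains decaying roughly like 1000 / t, with a spike on day 5.
gains = [1000, 500, 333, 250, 1500, 167, 143, 125]
c, alpha = fit_power_law(gains)
print(event_days(gains, c, alpha))  # [5]
```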

SLIDE 23

Modeling -> Event Days -> Power Law

Results

  • Slightly fluctuating but overall decreasing event day occurrence
  • Event days less likely than with absolute medians

SLIDE 24

Modeling -> Event Days -> Power Law

Positives:

  • Relatively reasonable model

Negatives:

  • Model changes are not considered
  • Multiple models are not supported

SLIDE 25

Modeling -> Event Days -> Varying Event Models

Idea: Adjust the current power law model when strong deviations occur

SLIDE 26

Modeling -> Event Days -> Varying Event Models

Deviation weight using least squares?

  • No good: high deviations between views gains are weighted too severely

SLIDE 27

Modeling -> Event Days -> Varying Event Models

Better with a relative measure:

  • Deviations weighted linearly
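
The difference between a squared and a relative deviation measure can be seen in a short comparison. The exact relative measure used in the slides is not preserved in this transcript, so the definition |observed - expected| / expected below is an assumption, and the numbers are hypothetical.

```python
def squared_deviation(observed, expected):
    """Least-squares style weight: grows quadratically with the absolute gap."""
    return (observed - expected) ** 2

def relative_deviation(observed, expected):
    """Deviation as a fraction of the expected value (assumed definition)."""
    return abs(observed - expected) / expected

# Each pair is 50% above its expected value, at very different popularity levels.
for observed, expected in [(15, 10), (1_500, 1_000), (150_000, 100_000)]:
    print(observed, expected,
          squared_deviation(observed, expected),
          relative_deviation(observed, expected))
```

The squared measure differs by eight orders of magnitude across the three rows, while the relative measure is 0.5 in every case.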

SLIDE 28

Modeling -> Event Days -> Varying Event Models

Decider for model adaptation? More/less than 50% of the expected value

  • Further model adaptations become less likely (consecutive event days less likely)
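
An adaptive model of this kind could be sketched as below. Several details are assumptions rather than the method from the slides: the model form c * t^(-alpha), reading the 50% rule as "relative deviation from the expected value above 0.5", and adapting by rescaling c so the model passes through the deviating observation. All values are hypothetical.

```python
def adaptive_event_days(gains, c, alpha, tolerance=0.5):
    """Walk through daily gains and adapt the model scale on strong deviations.

    The expected gain on day t is c * t**(-alpha). When the observed gain
    differs from the expectation by more than `tolerance` (relative), the day
    is recorded and c is rescaled so the model passes through that observation.
    Returns the list of 1-based adaptation days.
    """
    adapted = []
    for t, gain in enumerate(gains, start=1):
        expected = c * t ** (-alpha)
        if abs(gain - expected) / expected > tolerance:
            adapted.append(t)
            c = gain * t ** alpha  # assumed adaptation rule
    return adapted

# Hypothetical gains: decay like 1000 / t, then a jump to a higher level on day 4.
gains = [1000, 510, 330, 1200, 950, 800, 690]
print(adaptive_event_days(gains, c=1000.0, alpha=1.0))  # [4]
```

Without the adaptation step, days 4 to 7 would all deviate strongly from the original model; after adapting on day 4, the following days match the new model and are not flagged.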

SLIDE 29

Modeling -> Event Days -> Varying Event Models

  • Similar results compared to the power law approach
  • Still no support for multiple event models due to uncertainty

SLIDE 30

Conclusion

  • We discussed several possible event day definitions, some better and some worse
  • External popularity influence not considered due to uncertainty
  • Vague models for predictions
SLIDE 31

Future Work

  • Consider video recommendations for popularity analysis
  • YouTube internal and external search engine rank/hits
  • Consider YouTube channel subscriber base for video popularity analysis
  • Twitter/Snapchat/Instagram social media influence analysis
  • Alternate event day definitions and multiple models detection support (e.g. with CUSUM)
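
For orientation, a one-sided CUSUM detector accumulates how far values drift above a target level and raises an alarm once the sum passes a threshold. The sketch below is a generic textbook form, not something evaluated in this work; the parameter names and sample values are hypothetical.

```python
def cusum_upper(values, target, slack, threshold):
    """One-sided (upper) CUSUM change detector.

    Accumulates s = max(0, s + x - target - slack) and returns the first
    0-based index at which s exceeds `threshold`, or None if it never does.
    """
    s = 0.0
    for i, x in enumerate(values):
        s = max(0.0, s + x - target - slack)
        if s > threshold:
            return i
    return None

# Hypothetical daily values that shift to a higher level halfway through.
values = [10, 11, 9, 10, 12, 15, 16, 17, 16, 18]
print(cusum_upper(values, target=10, slack=1, threshold=10))  # 7
```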

SLIDE 32

Thank you

SLIDE 33

References

  • http://www.alexa.com/topsites
  • https://www.youtube.com/yt/press/en-GB/statistics.html
  • https://cnet3.cbsistatic.com/hub/i/r/2013/11/11/c3fb1098-6de7-11e3-913e-14feb5ca9861/resize/570xauto/4ddbc82dd9df232db62c49b29192f268/sandvine-2H13-NA-top10-peak_.png
  • http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/complete-white-paper-c11-481360.html