SLIDE 1

SEISMIC: A Self-Exciting Point Process Model for Predicting Tweet Popularity

Qingyuan Zhao¹, Murat A. Erdogdu¹, Hera Y. He¹, Anand Rajaraman², Jure Leskovec²

Department of Statistics¹ and Computer Science², Stanford University

KDD’15, Aug 12, 2015

SLIDE 2

SEISMIC Background SEISMIC Experiments Summary 1/19

Information cascade

An information cascade occurs when people engage in the same actions.

Source: wikimedia.org Source: adweek.com

SLIDE 3

Twitter

Twitter provides an ideal playground for studying information cascades.

Start: a Twitter user posts a 140-character message that can be seen by his or her followers.
Spread: another user forwards (retweets) the tweet on Twitter.

SLIDE 5

Predicting cascades in real time

Goal: Given the tweet and its retweets up to time T, predict its final popularity.

Applications:
Ranking content.
Detecting viral/breakout tweets.
Understanding human social behavior.

SLIDE 7

Mathematical definitions

Data

Relative retweet times: $t_0 = 0, t_1, t_2, \ldots$
Number of retweets by time $t$: $R_t = \sum_{t_i \le t} 1$.
Number of followers of each retweeter: $n_0, n_1, n_2, \ldots$
Number of exposed users by time $t$: $N_t = \sum_{t_i \le t} n_i$.

Problem statement

Given $(R_t, N_t)$ for $0 \le t \le T$, predict $R_\infty$.
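
To make the two counting processes concrete, here is a minimal Python sketch that computes them from a sorted list of relative post times and the matching follower counts. The function name `cascade_counts` and the sample numbers are invented for this illustration. It assumes the original tweet (index 0) contributes its followers to the exposed-user count but is not itself counted as a retweet, which matches how the later estimator indexes its sums.

```python
from bisect import bisect_right


def cascade_counts(times, followers, t):
    """Return (R_t, N_t) for a cascade observed up to time t.

    times     -- sorted relative post times, times[0] == 0 is the original tweet
    followers -- follower count n_i of the user who posted at times[i]
    Assumption: the original tweet adds to N_t but is not counted in R_t.
    """
    if len(times) != len(followers):
        raise ValueError("times and followers must have the same length")
    k = bisect_right(times, t)      # number of posts with t_i <= t
    r_t = max(k - 1, 0)             # retweets only (exclude the original)
    n_t = sum(followers[:k])        # everyone exposed so far
    return r_t, n_t


times = [0.0, 40.0, 95.0, 300.0]    # seconds since the original tweet
followers = [1000, 150, 20, 500]    # n_0, n_1, n_2, n_3
print(cascade_counts(times, followers, 100.0))
```

Using `bisect_right` keeps each query logarithmic in the number of posts, apart from the follower sum, which could be replaced by a prefix-sum array if many time points are queried.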

SLIDE 12

Approaches to cascade prediction

Broadly categorized into two groups:

Feature-based methods (the majority):
  Feature engineering: temporal, network structure, content, user, ...
  Supervised learning: linear regression, collaborative filtering, regression trees, topic modeling, ...

Point-process-based methods:
  Dynamic Poisson process, reinforced Poisson process.
  Our model (SEISMIC): self-exciting point process.

SLIDE 13

Example

SLIDE 14

Example

[Figure. Top panel: Histogram of Retweet Times (y-axis: Retweet Count). Bottom panel: Prediction by SEISMIC (x-axis: Time since original tweet (hour); y-axis: Retweets; legend: Final, SEISMIC, Cumulative).]

SLIDE 15

SEISMIC

SEISMIC (Self-Exciting Model of Information Cascades) is a flexible model of information cascades.

Highlights:
Generative model.
Easy interpretation.
Scalable: prediction takes O(# retweets).
State-of-the-art performance.

SLIDE 18

Background: point processes

Point process models

$R_t$ is characterized by its intensity

$$\lambda_t = \lim_{\Delta \downarrow 0} \frac{P(R_{t+\Delta} - R_t = 1)}{\Delta}.$$

Examples

Poisson process: $\lambda_t = \lambda$.
Reinforced Poisson process¹: $\lambda_t = p \cdot \phi(t) \cdot g(R_t)$.

These models are not suitable for modeling viral tweets.

¹ S. Gao, J. Ma, and Z. Chen. Modeling and predicting retweeting dynamics on microblogging platforms. In WSDM '15, 2015.

SLIDE 22

SEISMIC

Key steps of retweeting

How often does a user check Twitter? Memory kernel (power-law distribution).
What is the user's probability of retweeting a given tweet? Tweet infectiousness.

Self-exciting point process

$$\lambda_t = p \cdot \sum_{t_i \le t} n_i \, \phi(t - t_i), \quad t \ge t_0.$$

Here $p$ is the infectiousness (the "probability" of retweeting), and the self-exciting sum $\sum_{t_i \le t} n_i \, \phi(t - t_i)$ is the "rate" of viewing.
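
As a rough illustration of the intensity formula, the Python sketch below evaluates it for a toy cascade. The slide only says the memory kernel follows a power law, so the specific shape used here (flat up to `s0` seconds, then a power-law tail, normalized to integrate to 1) and the values `s0=300.0` and `theta=0.25` are assumptions made for the example, as are the function names.

```python
def memory_kernel(s, s0=300.0, theta=0.25):
    """Hypothetical memory kernel phi(s): flat up to s0, power-law tail after.

    The constant c is chosen so that phi integrates to 1 over [0, infinity).
    s0 and theta are illustrative values, not fitted parameters.
    """
    if s < 0:
        return 0.0
    c = theta / (s0 * (1.0 + theta))
    if s <= s0:
        return c
    return c * (s / s0) ** (-(1.0 + theta))


def intensity(t, times, followers, p, kernel=memory_kernel):
    """lambda_t = p * sum over posts with t_i <= t of n_i * phi(t - t_i)."""
    return p * sum(
        n * kernel(t - ti) for ti, n in zip(times, followers) if ti <= t
    )


times = [0.0, 60.0, 400.0]      # seconds since the original tweet
followers = [1000, 200, 50]     # n_0, n_1, n_2
for t in (100.0, 1000.0, 10000.0):
    print(t, intensity(t, times, followers, p=0.01))
```

Each post adds a bump proportional to its follower count, and the bump decays as the post ages, so with no new retweets the intensity falls over time.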

SLIDE 23

Time-varying infectiousness

A fixed $p$ is not enough to model viral tweets.

[Figure. Top panel: Histogram of Retweet Times (y-axis: Retweet Count). Bottom panel: Infectiousness Estimated by SEISMIC (y-axis: Infectiousness).]

SEISMIC replaces $p$ by a smooth process $p_t$.

SLIDE 24

Estimate infectiousness

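We estimate $p_t$ by locally smoothing the maximum likelihood estimator (MLE):

$$\hat{p}_t = \frac{\sum_{i=1}^{R_t} K_t(t - t_i)}{\sum_{i=0}^{R_t} n_i \int_{t_i}^{t} K_t(t - s)\,\phi(s - t_i)\,ds}.$$

The numerator is a weighted "number of retweets" and the denominator is a weighted "number of views".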
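
The estimator can be sketched numerically as follows. This is only an illustration under stated assumptions: the slide does not give the form of the weighting kernel $K_t$ or of $\phi$, so the triangular weight `triangular_weight`, the flat-then-power-law `memory_kernel`, its parameter values, and the midpoint-rule integration are all choices made for this example.

```python
def memory_kernel(s, s0=300.0, theta=0.25):
    """Hypothetical phi(s): flat up to s0, power-law tail after, integrates to 1."""
    if s < 0:
        return 0.0
    c = theta / (s0 * (1.0 + theta))
    return c if s <= s0 else c * (s / s0) ** (-(1.0 + theta))


def triangular_weight(age, t):
    """Hypothetical K_t: weight falls linearly from 1 at age 0 to 0 at age t/2."""
    return max(1.0 - 2.0 * age / t, 0.0)


def estimate_infectiousness(t, times, followers, weight=triangular_weight, steps=2000):
    """Weighted retweet count divided by weighted view count, as in the formula."""
    if t <= 0:
        raise ValueError("t must be positive")
    seen = [(ti, n) for ti, n in zip(times, followers) if ti <= t]
    # Numerator: weighted count of retweets (index 0 is the original tweet).
    numerator = sum(weight(t - ti, t) for ti, _ in seen[1:])
    # Denominator: n_i times the integral of K_t(t - s) * phi(s - t_i) over [t_i, t],
    # approximated with the midpoint rule.
    denominator = 0.0
    for ti, n in seen:
        width = (t - ti) / steps
        integral = width * sum(
            weight(t - s, t) * memory_kernel(s - ti)
            for s in (ti + (k + 0.5) * width for k in range(steps))
        )
        denominator += n * integral
    return numerator / denominator if denominator > 0 else 0.0


times = [0.0, 100.0, 200.0, 450.0, 550.0]
followers = [1000, 100, 100, 300, 50]
print(estimate_infectiousness(600.0, times, followers))
```

Passing a constant weight (for example `weight=lambda age, t: 1.0`) turns off the smoothing and reduces the estimate to the number of retweets divided by the kernel-weighted exposure.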

SLIDE 25

Predict popularity

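SEISMIC prediction formula

Assume the out-degrees in the network have mean $n^*$ and the infectiousness parameter $p_t \equiv p$ for $t \ge T$. Then

$$E[R_\infty \mid \mathcal{F}_T] = \begin{cases} R_T + \dfrac{p\,(N_T - N^e_T)}{1 - p\,n^*}, & \text{if } p < \dfrac{1}{n^*}, \\ \infty, & \text{if } p \ge \dfrac{1}{n^*}, \end{cases}$$

where

$$N^e_T = \sum_{i=0}^{R_T} n_i \int_{t_i}^{T} \phi(t - t_i)\,dt.$$

See our paper for the derivation.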
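
A small Python sketch of the prediction formula is given below. It assumes a flat-then-power-law memory kernel with illustrative parameters (`s0=300.0`, `theta=0.25`), chosen because its integral has a closed form. The slides do not specify the kernel, and the function names and sample values are hypothetical.

```python
import math


def kernel_cdf(x, s0=300.0, theta=0.25):
    """Integral of the assumed kernel phi from 0 to x (tends to 1 as x grows)."""
    if x <= 0:
        return 0.0
    c = theta / (s0 * (1.0 + theta))
    if x <= s0:
        return c * x
    return c * s0 + (c * s0 / theta) * (1.0 - (x / s0) ** (-theta))


def predict_final_retweets(T, times, followers, p, mean_degree):
    """E[R_inf | F_T] = R_T + p * (N_T - N^e_T) / (1 - p * n*), or infinity."""
    seen = [(ti, n) for ti, n in zip(times, followers) if ti <= T]
    r_T = max(len(seen) - 1, 0)                             # retweets so far
    n_T = sum(n for _, n in seen)                           # exposed users so far
    n_e_T = sum(n * kernel_cdf(T - ti) for ti, n in seen)   # exposure already "used up"
    if p * mean_degree >= 1.0:
        return math.inf                                     # supercritical cascade
    return r_T + p * (n_T - n_e_T) / (1.0 - p * mean_degree)


times = [0.0, 100.0, 200.0]
followers = [1000, 100, 100]
print(predict_final_retweets(600.0, times, followers, p=0.001, mean_degree=200.0))
print(predict_final_retweets(600.0, times, followers, p=0.01, mean_degree=200.0))
```

The term `n_T - n_e_T` is the exposure that has not yet had a chance to react, and dividing by `1 - p * mean_degree` accounts for the retweets that future retweets will trigger in turn.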

SLIDE 26

Example

[Figure. Top panel: Histogram of Retweet Times (y-axis: Retweet Count). Bottom panel: Prediction by SEISMIC (x-axis: Time since original tweet (hour); y-axis: Retweets; legend: Final, SEISMIC, Cumulative).]

SLIDE 27

Experiments: dataset

Raw dataset: all tweet and retweet activity from October 7 to November 7, 2011.

Filtered by:
Posted in the first 15 days;
English tweets;
No hashtag;
At least 50 retweets.

This leaves 166,076 cascades (over 34 million tweets/retweets in total).

SLIDE 28

Baselines

We compare SEISMIC to four different baselines:

1. LR: linear regression
2. LR-D: linear regression with degree
3. DPM: dynamic Poisson model
4. RPS: reinforced Poisson model

SLIDE 29

Comparison: Absolute Percentage Error (APE)

$$\mathrm{APE} = \frac{|\hat{R}_\infty - R_\infty|}{R_\infty}.$$

15% vs. 25% percentage error after observing a cascade for 1 hour.
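
For reference, the APE metric is straightforward to compute per cascade. The sketch below uses made-up predictions and final counts. Summarizing the per-cascade errors with a median is an assumption of this example, not something the slide states.

```python
from statistics import median


def absolute_percentage_error(predicted, actual):
    """APE = |predicted - actual| / actual for one cascade."""
    if actual <= 0:
        raise ValueError("actual final retweet count must be positive")
    return abs(predicted - actual) / actual


# Hypothetical predicted and true final retweet counts for three cascades.
predictions = {"a": 115.0, "b": 60.0, "c": 900.0}
finals = {"a": 100, "b": 80, "c": 1000}

errors = [absolute_percentage_error(predictions[k], finals[k]) for k in finals]
print(errors, median(errors))
```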

SLIDE 30

Comparison: Coverage of breakouts

A list of the true top 500 tweets with the most retweets.
Lists of the predicted top 500 tweets at all time points.
70% vs. 55% coverage after observing 25% of the retweets.
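
One way to read the coverage metric is as the fraction of the true top-k tweets that also appear in the predicted top-k list. The Python sketch below implements that reading. The function name `top_k_coverage` and the toy data are hypothetical, and k is 2 here rather than 500 to keep the example small.

```python
def top_k_coverage(true_counts, predicted_counts, k):
    """Fraction of the true top-k tweets that also appear in the predicted top-k."""
    true_top = set(sorted(true_counts, key=true_counts.get, reverse=True)[:k])
    pred_top = set(sorted(predicted_counts, key=predicted_counts.get, reverse=True)[:k])
    if not true_top:
        raise ValueError("need at least one tweet and k >= 1")
    return len(true_top & pred_top) / len(true_top)


# Hypothetical final retweet counts and predictions keyed by tweet id.
true_counts = {"a": 100, "b": 90, "c": 10, "d": 5}
predicted_counts = {"a": 80, "b": 5, "c": 50, "d": 1}
print(top_k_coverage(true_counts, predicted_counts, k=2))
```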

SLIDE 31

Summary

In conclusion, SEISMIC:
Effectively models information cascades by self-exciting point processes;
Efficiently updates parameters and makes predictions;
Outperforms several baselines and the state of the art.

Code and data are available online at http://snap.stanford.edu/seismic.

SLIDE 32

Estimation of memory kernel φ(t)

SLIDE 33

More detail: final tweak

The prediction is unstable when $\hat{p}_t$ is close to $1/n^*$. The real $p_s$ is likely to decrease.

Stabilized prediction:

$$\hat{R}_\infty(t) = R_t + \alpha_t \, \frac{\hat{p}_t\,(N_t - N^e_t)}{1 - \gamma_t\,\hat{p}_t\,n^*},$$

where $0 < \alpha_t, \gamma_t \le 1$ are trained for the network.
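
The effect of the two correction factors can be seen in a short Python sketch. The inputs (retweet count, exposure totals, estimated infectiousness, mean degree) are made-up numbers, the function name is hypothetical, and returning infinity when the denominator is not positive is an assumption carried over from the unstabilized formula rather than something stated on the slide.

```python
import math


def stabilized_prediction(r_t, n_t, n_e_t, p_hat, mean_degree, alpha=1.0, gamma=1.0):
    """R_hat_inf(t) = R_t + alpha * p_hat * (N_t - N^e_t) / (1 - gamma * p_hat * n*)."""
    if not (0.0 < alpha <= 1.0 and 0.0 < gamma <= 1.0):
        raise ValueError("alpha and gamma must lie in (0, 1]")
    denominator = 1.0 - gamma * p_hat * mean_degree
    if denominator <= 0.0:
        return math.inf  # assumed handling of the supercritical case
    return r_t + alpha * p_hat * (n_t - n_e_t) / denominator


# Hypothetical cascade state: p_hat * n* = 0.95 is close to the unstable regime.
state = dict(r_t=120, n_t=50000.0, n_e_t=30000.0, p_hat=0.0019, mean_degree=500.0)
print(stabilized_prediction(**state))                        # alpha = gamma = 1
print(stabilized_prediction(**state, alpha=0.8, gamma=0.7))  # damped version
```

With `alpha = gamma = 1` the function reduces to the unstabilized formula, and smaller values pull the prediction down, most strongly when `p_hat * mean_degree` is near 1.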