Berlin Institute of Technology, Department Machine Learning

Canonical Trend Analysis for Social Networks. Felix Bießmann, Jens-Michalis Papaioannou, Mikio Braun, Matthias L. Jugel, Klaus-Robert Müller, Andreas Harth. Berlin Institute of Technology, Department Machine Learning.


SLIDE 1

Berlin Institute of Technology, Department Machine Learning

Canonical Trend Analysis for Social Networks

Felix Bießmann, Jens-Michalis Papaioannou, Mikio Braun, Matthias L. Jugel, Klaus-Robert Müller, Andreas Harth

SLIDE 14

Canonical Trends

Temporal Dynamics of Web Data

  • Web content is copied, repeated or rephrased (Trends/Memes)
  • This temporal structure contains important information
  • Growing interest in temporal dynamics of graphs
      • Understanding dynamic graphs [Leskovec et al., KDD 2005]
      • Causal inference [Lozano and Sindhwani, NIPS 2010]
      • Diffusion of information [Gomez Rodriguez et al., ICML 2011/2012]

Canonical Trend Analysis

  • Exploits temporal structure to find trends
  • Finds web sources that precede/follow trends

Examples:

  • Spatiotemporal Dynamics of Retweets to News Articles
  • Music trends on Last.fm

SLIDE 17

Canonical Correlation Analysis

[Diagram: a latent variable Z (the trend) underlies two feature views X and Y (e.g. bag of words, user actions, edge histograms, ...), which are projected to $w_x^\top X$ and $w_y^\top Y$.]

$$\operatorname*{argmax}_{w_x, w_y} \frac{w_x^\top X Y^\top w_y}{\sqrt{w_x^\top X X^\top w_x \; w_y^\top Y Y^\top w_y}}$$

[Jordan 1875], [Hotelling 1936], [Bach and Jordan 2006]
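The objective above can be illustrated with a minimal NumPy sketch. It assumes samples are stored column-wise (X is D_x × T, Y is D_y × T) and adds a small ridge term `reg` for numerical stability; neither the function name `linear_cca` nor the ridge term nor the toy data comes from the slides. Whitening both views and taking the top singular pair is one standard way to solve this argmax, not necessarily the one used by the authors.

```python
import numpy as np

def linear_cca(X, Y, reg=1e-8):
    """First pair of canonical directions for X (Dx x T) and Y (Dy x T)."""
    X = X - X.mean(axis=1, keepdims=True)
    Y = Y - Y.mean(axis=1, keepdims=True)
    Cxx = X @ X.T + reg * np.eye(X.shape[0])   # ridge term is an assumption
    Cyy = Y @ Y.T + reg * np.eye(Y.shape[0])
    Cxy = X @ Y.T
    # Whiten both views with Cholesky factors: M = Lx^{-1} Cxy Ly^{-T}.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(M)
    # Map the top singular vectors back to the original feature spaces.
    wx = np.linalg.solve(Lx.T, U[:, 0])
    wy = np.linalg.solve(Ly.T, Vt[0])
    return wx, wy, s[0]

# Toy data: one shared latent trend z drives both views.
rng = np.random.default_rng(0)
T = 500
z = rng.standard_normal(T)
X = np.outer([1.0, -2.0, 0.5], z) + 0.1 * rng.standard_normal((3, T))
Y = np.outer([0.7, 1.5], z) + 0.1 * rng.standard_normal((2, T))
wx, wy, rho = linear_cca(X, Y)
```

Here `rho` is the first canonical correlation, and `wx @ X` and `wy @ Y` are the two projections whose correlation it measures.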

SLIDE 19

Canonical Trend Model

[Diagram: a latent variable Z (the trend) underlies two feature views X and Y (e.g. bag of words, user actions, edge histograms, ...); X is marked "faster".]

$$\sum_\tau w_x(\tau)^\top X_{t-\tau} \qquad \text{and} \qquad w_y^\top Y_t$$

SLIDE 25

An Example on News Trends

[Figure: content of mashable.com, arstechnica.com, techcrunch.com and slashdot.org at times t = -1 and t = 0]

Predict future content of all other web sources from past content of a single web source:

$$X_f \in \mathbb{R}^{W \times T}, \qquad Y = \sum_{f' \neq f} X_{f'}$$
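A small sketch of how the two views on this slide could be assembled, assuming every source's bag-of-words matrix has the same W × T shape. The helper names `others_sum` and `lagged_pair`, and the convention that column i is time step i, are illustrative assumptions.

```python
import numpy as np

def others_sum(features, f):
    """Return (X_f, Y) with Y the sum of all sources' matrices except source f."""
    stacked = np.stack(features)               # F x W x T
    return stacked[f], stacked.sum(axis=0) - stacked[f]

def lagged_pair(X_f, Y, tau):
    """Pair X_f at time t - tau with Y at time t (tau >= 1)."""
    return X_f[:, :-tau], Y[:, tau:]

# Toy data: three sources, W = 2 words, T = 3 time steps.
features = [np.full((2, 3), float(i)) for i in (1, 2, 3)]
X_f, Y = others_sum(features, 0)
X_past, Y_future = lagged_pair(X_f, Y, tau=1)
```

With these pairs, the past content of one source can be correlated with the later content of all others.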

SLIDE 26

An Example on News Trends

[Figure: content of mashable.com, arstechnica.com, techcrunch.com and slashdot.org at times t = -1 and t = 0]

$$X_f \in \mathbb{R}^{W \times T}, \qquad Y = \sum_{f' \neq f} X_{f'}$$

[Diagram: both projections are linked to the latent variable Z.]

$$\sum_\tau w_x(\tau)^\top X_{t-\tau} \qquad \text{and} \qquad w_y^\top Y_t$$

SLIDE 27

Why Project onto the Canonical Subspace?

  • Easily interpretable: for text data, each canonical direction is a topic [De Bie and Cristianini, 2004]
  • Information-theoretically optimal compression [Creutzig 2009]
  • Canonical correlations can be converted to a Granger causality index [Otter 1991]

SLIDE 30

Canonical Trend Analysis For Social Networks

  • Quantifying spatiotemporal retweet response to news content
  • Finding users ahead of and following music trends on Last.fm

SLIDE 34

Canonical Trend Analysis For Social Networks

  • Time t: some news web site publishes some content ...
  • Time t + τ1: ... which is retweeted ...
  • ... at different locations

SLIDE 38

Data Extraction

For each news site $f \in \{1, 2, \dots, F\}$ extract

  • Bag-of-Words features: $X_f = [x_f(t=1), \dots, x_f(t=T)] \in \mathbb{R}^{W \times T}$
  • Retweet locations: $Y_f = [y_f(t=1), \dots, y_f(t=T)] \in \mathbb{R}^{L \times T}$
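As an illustration of the W × T layout of $X_f$, the sketch below builds a word-count matrix with one column per time bin. The function name `bag_of_words_matrix`, the whitespace tokenisation, the fixed vocabulary and the example texts are all simplifying assumptions, not details taken from the slides.

```python
from collections import Counter

def bag_of_words_matrix(docs_per_step, vocab):
    """Build a W x T count matrix (list of rows): one column per time step.

    docs_per_step: list of T strings, the text a site published in each time bin.
    vocab: list of W words defining the rows.
    """
    index = {w: i for i, w in enumerate(vocab)}
    X = [[0] * len(docs_per_step) for _ in vocab]
    for t, text in enumerate(docs_per_step):
        # Lower-case and split on whitespace (a deliberately crude tokeniser).
        for word, n in Counter(text.lower().split()).items():
            if word in index:
                X[index[word]][t] = n
    return X

vocab = ["apple", "google", "tablet"]
docs = ["Apple launches tablet", "Google answers Apple tablet tablet"]
X_f = bag_of_words_matrix(docs, vocab)
```

The retweet-location matrix $Y_f$ has the same layout, with one row per location instead of one row per word.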

SLIDE 43

Data Extraction: Retweet Locations

  1. Extract URI of each news article in Twitter stream
  2. Retrieve location from Twitter user profile
  3. Resolve ambiguities / remove nonsense locations
  4. Downsample geographic locations

SLIDE 44

Mean Locations of Retweeted News Articles

SLIDE 45

Downsampling of Geographic Information

GADM: An RDF spatial representation of all the administrative regions in the world
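The four extraction steps listed under "Data Extraction: Retweet Locations" might be sketched as below. This is only an illustration: the tweet dictionaries with keys `text` and `profile_location`, the tiny `REGION_OF` lookup table and the function name `retweet_regions` are hypothetical, and the table stands in for a real gazetteer such as GADM rather than reproducing its interface.

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")

# Hypothetical lookup: free-text profile location -> administrative region.
REGION_OF = {
    "los angeles": "California",
    "san francisco, ca": "California",
    "nyc": "New York",
    "toronto": "Ontario",
}

def retweet_regions(tweets, site):
    """Count tweets linking to `site` per administrative region."""
    counts = Counter()
    for tw in tweets:
        urls = URL_RE.findall(tw["text"])                  # step 1: extract URIs
        if not any(site in u for u in urls):
            continue
        loc = (tw.get("profile_location") or "").strip().lower()  # step 2
        region = REGION_OF.get(loc)                        # steps 3 and 4
        if region is not None:                             # drop unresolvable locations
            counts[region] += 1
    return counts

tweets = [
    {"text": "RT great read http://latimes.com/a1", "profile_location": "Los Angeles"},
    {"text": "RT http://latimes.com/a1", "profile_location": "the moon"},
    {"text": "RT http://example.org/x", "profile_location": "NYC"},
    {"text": "http://latimes.com/a2 wow", "profile_location": "Toronto"},
]
counts = retweet_regions(tweets, "latimes.com")
```

Counting per time bin instead of over the whole stream would give the columns of $Y_f$.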

SLIDE 46

Canonical Trend Analysis

[Diagram: a hidden variable Z (news topic) underlies two views, news content (bag-of-words) and retweet locations.]

$$\hat{x}_f(t) = w_x^\top X_f(:, t), \qquad \hat{y}_f(t) = \sum_\tau w_y(\tau)^\top Y_f(:, t + \tau)$$

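SLIDE 47

Canonical Trend Analysis

News content (bag-of-words): $\hat{x}_f(t) = w_x^\top X_f(:, t)$, with $w_x \in \mathbb{R}^{W}$

Retweet locations: $\hat{y}_f(t) = \sum_\tau w_y(\tau)^\top Y_f(:, t + \tau)$, with $w_y \in \mathbb{R}^{L N_\tau}$ when the $w_y(\tau)$ are stacked over all $N_\tau$ lags

Optimal $w_x$ and $w_y(\tau)$:

$$\operatorname*{argmax}_{w_y(\tau),\, w_x} \mathrm{Corr}(\hat{x}_f(t), \hat{y}_f(t))$$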

SLIDE 49

Efficient Computation of Canonical Trends

Temporal Embedding [Takens 1981]

  • Standard CCA problem

$$\tilde{Y}_f = \begin{bmatrix} Y_{f,\tau=1} \\ \vdots \\ Y_{f,\tau=N_\tau} \end{bmatrix} \in \mathbb{R}^{L N_\tau \times T}$$

(linear) 'Kernel Trick'

  • Very efficient for high-dimensional feature spaces

$$w_y(\tau) = Y_{f,\tau}\,\alpha, \qquad w_x = X_f\,\beta$$

[Jordan 1875], [Hotelling 1936], [Anderson 1999], [Fyfe 2000], [Fukumizu 2007]
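A minimal sketch of the temporal embedding, assuming block k of the stacked matrix holds $Y_f(:, t + k)$. Unlike the slide, which keeps T columns, this version simply drops the columns whose shifted index would run past the end; that edge convention and the name `temporal_embedding` are assumptions.

```python
import numpy as np

def temporal_embedding(Y, n_lags):
    """Stack time-shifted copies of Y (L x T) into an (L * n_lags) x (T - n_lags) matrix."""
    L, T = Y.shape
    T_eff = T - n_lags
    # Block k has columns Y[:, t + k] for t = 0 .. T_eff - 1.
    blocks = [Y[:, k:k + T_eff] for k in range(1, n_lags + 1)]
    return np.vstack(blocks)

Y = np.arange(8.0).reshape(2, 4)          # toy data: L = 2 locations, T = 4
Y_tilde = temporal_embedding(Y, n_lags=2)
```

After stacking, a single weight vector applied to `Y_tilde` computes the sum over lags, so the lagged problem reduces to ordinary CCA between `X` and `Y_tilde`.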

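SLIDE 50

Efficient Computation of Canonical Trends

The objective function is maximized in the dual, where $K_{\tilde Y} = \tilde{Y}^\top \tilde{Y}$ and $K_X = X^\top X$ are linear kernels:

$$\mathrm{Corr}(\hat{x}(t), \hat{y}(t)) = \frac{\sum_\tau w_y(\tau)^\top Y_\tau X^\top w_x}{\sqrt{\sum_\tau \left(w_y(\tau)^\top Y_\tau Y_\tau^\top w_y(\tau)\right) \, w_x^\top X X^\top w_x}} = \frac{\alpha^\top K_{\tilde Y} K_X \beta}{\sqrt{\alpha^\top K_{\tilde Y}^2 \alpha \; \beta^\top K_X^2 \beta}}$$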
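The equality between the primal and dual forms can be checked numerically. The sketch below uses random toy matrices and arbitrary dual coefficients, and writes the primal expression in stacked form with the embedded matrix $\tilde{Y}$ (named `Yt` here); all variable names and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 6
X = rng.standard_normal((50, T))     # W x T: many more features than time points
Yt = rng.standard_normal((40, T))    # embedded retweet features, (L * N_tau) x T

K_X = X.T @ X                        # T x T linear kernels
K_Y = Yt.T @ Yt

alpha = rng.standard_normal(T)       # arbitrary dual coefficients
beta = rng.standard_normal(T)
w_x, w_y = X @ beta, Yt @ alpha      # primal weights from the dual expansion

primal = (w_y @ Yt @ X.T @ w_x) / np.sqrt(
    (w_y @ Yt @ Yt.T @ w_y) * (w_x @ X @ X.T @ w_x))
dual = (alpha @ K_Y @ K_X @ beta) / np.sqrt(
    (alpha @ K_Y @ K_Y @ alpha) * (beta @ K_X @ K_X @ beta))
```

Both expressions give the same value, but the dual one only touches T × T matrices, which is why it is cheap when the feature dimension is large.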

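SLIDE 52

Efficient Computation of Canonical Trends

Dual coefficients are the solution to a generalized eigenvalue equation [Bießmann et al., Machine Learning, 2010]:

$$\begin{bmatrix} 0 & K_{\tilde Y} K_X \\ K_X K_{\tilde Y} & 0 \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \lambda \begin{bmatrix} K_{\tilde Y}^2 + I\kappa_y & 0 \\ 0 & K_X^2 + I\kappa_x \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \end{bmatrix}$$

$$\mathrm{Corr}(\hat{x}(t), \hat{y}(t)) = \frac{\sum_\tau w_y(\tau)^\top Y_\tau X^\top w_x}{\sqrt{\sum_\tau \left(w_y(\tau)^\top Y_\tau Y_\tau^\top w_y(\tau)\right) \, w_x^\top X X^\top w_x}} = \frac{\alpha^\top K_{\tilde Y} K_X \beta}{\sqrt{\alpha^\top K_{\tilde Y}^2 \alpha \; \beta^\top K_X^2 \beta}}$$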
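One way to solve a regularized generalized eigenvalue problem of this form with plain NumPy is sketched below. The function name `kernel_cca_dual`, the default values of `kappa_y` and `kappa_x`, and the toy data are assumptions; the Cholesky reduction to an ordinary symmetric eigenproblem is a standard technique and is not claimed to be the authors' implementation. In practice the kernels would usually be centered first.

```python
import numpy as np

def kernel_cca_dual(K_Y, K_X, kappa_y=1e-2, kappa_x=1e-2):
    """Largest eigenvalue and dual coefficients (alpha, beta) of A v = lambda B v."""
    T = K_X.shape[0]
    Z = np.zeros((T, T))
    A = np.block([[Z, K_Y @ K_X], [K_X @ K_Y, Z]])
    B = np.block([[K_Y @ K_Y + kappa_y * np.eye(T), Z],
                  [Z, K_X @ K_X + kappa_x * np.eye(T)]])
    # B is symmetric positive definite, so B = C C^T and the problem becomes
    # an ordinary symmetric eigenproblem for C^{-1} A C^{-T}.
    C = np.linalg.cholesky(B)
    Ci = np.linalg.inv(C)
    lam, V = np.linalg.eigh(Ci @ A @ Ci.T)
    v = Ci.T @ V[:, -1]              # eigh sorts ascending: last is the largest
    return lam[-1], v[:T], v[T:]

# Toy data: a shared latent signal z in both views.
rng = np.random.default_rng(2)
T = 30
z = rng.standard_normal(T)
X = np.outer(rng.standard_normal(5), z) + 0.5 * rng.standard_normal((5, T))
Y = np.outer(rng.standard_normal(4), z) + 0.5 * rng.standard_normal((4, T))
K_X, K_Y = X.T @ X, Y.T @ Y
lam, alpha, beta = kernel_cca_dual(K_Y, K_X)
```

The regularizers κ_y and κ_x keep the right-hand matrix invertible; without them the dual problem is degenerate whenever the kernels are rank deficient.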

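SLIDE 57

Comparisons: Mean, PCA and Canonical Trends

  • Mean: $w_x^\top = \mathbf{1}_x^\top / N, \quad w_y(\tau) = \mathbf{1}_y / N$
  • PCA: $\operatorname*{argmax}_{w_y(\tau)} \left(w_y(\tau)^\top \tilde{Y}_f \tilde{Y}_f^\top w_y(\tau)\right), \quad \operatorname*{argmax}_{w_x} \left(w_x^\top X X^\top w_x\right), \quad \text{s.t. } w_y(\tau)^\top w_y(\tau) = w_x^\top w_x = 1$
  • Canonical Trends: $\operatorname*{argmax}_{w_y(\tau),\, w_x} \mathrm{Corr}(\hat{x}_f(t), \hat{y}_f(t))$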
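The two baselines can be sketched in a few lines of NumPy. This assumes that N on the slide is the number of features being averaged and that the data are centered before PCA; the helper names `mean_weights` and `pca_weights` and the toy data are illustrative.

```python
import numpy as np

def mean_weights(D):
    """Uniform weights 1/D: the projection is the mean over the D features."""
    return np.ones(D) / D

def pca_weights(X):
    """Unit-norm maximiser of w^T X X^T w: the top eigenvector of X X^T."""
    Xc = X - X.mean(axis=1, keepdims=True)
    vals, vecs = np.linalg.eigh(Xc @ Xc.T)     # eigenvalues in ascending order
    return vecs[:, -1]

# Toy data: four features with very different variances.
rng = np.random.default_rng(3)
X = rng.standard_normal((4, 200)) * np.array([[3.0], [1.0], [0.5], [0.1]])
w_mean = mean_weights(4)
w_pca = pca_weights(X)
```

Both baselines choose the weights of each view separately, whereas the canonical trend objective chooses them jointly so that the two projections correlate.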

SLIDE 58

Comparisons: Mean, PCA and Canonical Trends

Hypotheses:

  • Mean: mean word count predicts mean tweet frequency best
  • PCA: word-count variance predicts tweet variance
  • Canonical Trends: news content helps predict retweet frequency

SLIDE 61

Canonical Convolution

[Figure: excerpts from the LA Times and the spatiotemporal response in California, New York and Ontario]

SLIDE 70

Spatiotemporal Analysis of Retweets of News

We use canonical correlation analysis to compute

  • a Bag-of-Words subspace (topic) and
  • spatiotemporal Twitter response patterns

such that news content and retweets are maximally correlated.

Results can be interpreted w.r.t.

  • How much impact a news site has on the Twitter community
  • (Content that will lead to high retweet frequency)
  • (Where and when maximal impact is reached)

SLIDE 71

Users and Trends on Last.fm

A Last.fm user subgraph

[Figure: RDF subgraph. The users lfm:user/JoanLandor and lfm:user/popnutten are linked by foaf:knows. Weekly chart nodes (#weeklychart, with #from and #to dates among 18-Nov-2007, 25-Nov-2007, 02-Dec-2007 and 09-Dec-2007) carry ranked artist lists (#list): New Order (#rank_1), Tiger Lou (#rank_2), Aereogramme (#rank_3) in one chart and Air (#rank_1), Death Cab for Cutie (#rank_2), The Notwist (#rank_3) in another. Artists carry #tag edges such as new wave, electronic, indie and alternative.]

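SLIDE 73

Users and Trends on Last.fm

Extract weekly chart list of Last.fm music tags

  • Single user chart time series: $X_f = [x_f(t=1), \dots, x_f(t=T)] \in \mathbb{R}^{M \times T}$
  • All other users: $Y_f = \sum_{f' \neq f} X_{f'}$

Canonical Correlogram

$$\rho(\tau) = \mathrm{Corr}\left(w_x(\tau)^\top X_\tau,\; w_y^\top Y\right) = \frac{w_x(\tau)^\top X_\tau Y^\top w_y}{\sqrt{w_x(\tau)^\top X_\tau X_\tau^\top w_x(\tau) \cdot w_y^\top Y Y^\top w_y}} = \frac{\alpha^\top K_\tau K_Y \beta}{\sqrt{\alpha^\top K_\tau^2 \alpha \cdot \beta^\top K_Y^2 \beta}}$$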
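A canonical correlogram can be sketched by running CCA once per time lag. The code below assumes that lag τ pairs X(:, t + τ) with Y(:, t), so a peak at a negative lag means X leads Y; this sign convention, the ridge term `reg`, the function names and the toy data are assumptions rather than details from the slides.

```python
import numpy as np

def first_canonical_corr(X, Y, reg=1e-6):
    """Largest canonical correlation between X (Dx x T) and Y (Dy x T)."""
    X = X - X.mean(axis=1, keepdims=True)
    Y = Y - Y.mean(axis=1, keepdims=True)
    Lx = np.linalg.cholesky(X @ X.T + reg * np.eye(len(X)))
    Ly = np.linalg.cholesky(Y @ Y.T + reg * np.eye(len(Y)))
    M = np.linalg.solve(Lx, X @ Y.T) @ np.linalg.inv(Ly).T
    return np.linalg.svd(M, compute_uv=False)[0]

def canonical_correlogram(X, Y, max_lag):
    """rho(tau) for tau = -max_lag .. max_lag, pairing X(:, t + tau) with Y(:, t)."""
    T = X.shape[1]
    rho = {}
    for tau in range(-max_lag, max_lag + 1):
        lo, hi = max(0, -tau), min(T, T - tau)   # t with 0 <= t + tau < T
        rho[tau] = first_canonical_corr(X[:, lo + tau:hi + tau], Y[:, lo:hi])
    return rho

# Toy data: Y repeats X three steps later, so X is "ahead of the trend".
rng = np.random.default_rng(4)
X = rng.standard_normal((2, 300))
Y = np.roll(X, 3, axis=1) + 0.1 * rng.standard_normal((2, 300))
rho = canonical_correlogram(X, Y, max_lag=5)
```

In this toy example the correlogram peaks at the lag where X is shifted three steps back relative to Y, which is the kind of asymmetry used to separate users ahead of a trend from users behind it.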

SLIDE 76

Users and Trends on Last.fm

[Figure, panel "Ahead of Trend": correlation against time lag (weeks, -10 to 10); top tags: punk cabaret, crust punk, hardcore punk, punk blues, emo]

[Figure, panel "Behind the Trend": correlation against time lag (weeks, -10 to 10); top tags: deutscher hip h, german hip hop, german reggae, minimal synth, minimalist]

SLIDE 84

Summary

Canonical Trend Analysis (CTA)

  • Finds maximally correlated subspace of graph feature time series
  • Efficient computations via representer theorem

CTA between news content and retweet location

  • Reveals ‘strongest’ topics and spatiotemporal tweet response

CTA between users on Last.fm

  • Finds users ahead of and behind musical trends

SLIDE 89

Future Work

  • Sparse, non-negative canonical directions
  • Features other than BoW
  • Online optimization
  • What about nonstationarities?

SLIDE 90

Detecting ‘Trendsetting’ News Websites

Real Data Example:

BoW features from 96 technology news feeds in October 2011

[Figure: $y_f(t)$ and $\hat{y}_f(t)$]

SLIDE 91

Comparison Canonical Trend Analysis and LSA

Canonical topics predict overall topics better than Latent Semantic Indexing.

Canonical trend analysis between $X_f$ and $Y_f$ vs. LSA on $X_f$ and $Y_f$ separately