Relational Stacked Denoising Autoencoder for Tag Recommendation


Relational Stacked Denoising Autoencoder for Tag Recommendation. Hao Wang, Dept. of Computer Science and Engineering, Hong Kong University of Science and Technology. Joint work with Xingjian Shi and Dit-Yan Yeung. To appear in AAAI 2015.


SLIDE 1

Relational Stacked Denoising Autoencoder for Tag Recommendation

Hao Wang
Dept. of Computer Science and Engineering
Hong Kong University of Science and Technology

Joint work with Xingjian Shi and Dit-Yan Yeung
To appear in AAAI 2015

Hao Wang Relational SDAE 1 / 36

SLIDE 2

Outline

1. Background and Related Work
2. Generalized Probabilistic SDAE
3. Relational SDAE
4. Performance Evaluation
5. Case study
6. Conclusion

SLIDE 4

Tag Recommendation: Flickr

https://www.flickr.com

SLIDE 5

Tag Recommendation: CiteULike

http://www.citeulike.org

SLIDE 6

Tag Recommendation

[Figure: items 1 to 5 shown alongside tags 1 to 5]

SLIDE 7

Related Work

Content-based:

1. Chen et al., 2008
2. Chen et al., 2010
3. Shen and Fan, 2010

Co-occurrence based:

1. Garg and Weber, 2008
2. Weinberger et al., 2008
3. Rendle and Schmidt-Thieme, 2010

Hybrid:

1. Wu et al., 2009
2. Wang and Blei, 2011
3. Yang et al., 2013

SLIDE 8

Content-based

1. Chen et al., 2008
2. Chen et al., 2010
3. Shen and Fan, 2010
4. ...

Pros:

1. Tag independence
2. Interpretability
3. No new-item problem

Cons:

1. Need domain knowledge

SLIDE 9

Co-occurrence based

1. Garg and Weber, 2008
2. Weinberger et al., 2008
3. Rendle and Schmidt-Thieme, 2010
4. ...

Pros:

1. No domain knowledge needed

Cons:

1. Requires some form of rating feedback (co-occurrence matrix)
2. New-tag problem and new-item problem

SLIDE 10

Hybrid

1. Wu et al., 2009
2. Wang and Blei, 2011
3. Yang et al., 2013
4. ...

BEST OF BOTH WORLDS

SLIDE 11

Collaborative Topic Regression (CTR) (Wang and Blei, KDD 2011)

SLIDE 12

Collaborative Topic Regression (CTR) (Wang and Blei, KDD 2011)

LDA: sparse, relatively high dimension
MF: low rank, low dimension

SLIDE 13

Problems to Explore

1. Can SDAE learn effective representations for recommendation?
2. How can relational information be incorporated into SDAE?
3. How well does it perform?

SLIDE 15

Stacked Denoising Autoencoder (Vincent et al. JMLR 2010)

[Figure: SDAE network with layers X_0, X_1, X_2, X_3, X_4 and clean input X_c]

\[
\min_{\{W_l\},\{b_l\}} \|X_c - X_L\|_F^2 + \lambda \sum_l \|W_l\|_F^2,
\]

where $\lambda$ is a regularization parameter and $\|\cdot\|_F$ denotes the Frobenius norm.
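The objective above can be evaluated with a few lines of NumPy. The following is only a minimal sketch under stated assumptions: sigmoid activations in every layer, masking noise to build the corrupted input X_0, and toy layer sizes. The names `sdae_objective`, `weights`, and `biases` are illustrative and do not come from the slides.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sdae_objective(X0, Xc, weights, biases, lam):
    """Return ||Xc - XL||_F^2 + lam * sum_l ||W_l||_F^2 for a sigmoid network."""
    X = X0
    for W, b in zip(weights, biases):
        X = sigmoid(X @ W + b)               # X_l = sigma(X_{l-1} W_l + b_l)
    recon = np.sum((Xc - X) ** 2)            # squared Frobenius reconstruction error
    reg = lam * sum(np.sum(W ** 2) for W in weights)
    return recon + reg

rng = np.random.default_rng(0)
J, B, K = 6, 8, 3                            # items, input width, bottleneck width (toy sizes)
Xc = rng.random((J, B))                      # clean input
mask = rng.random((J, B)) > 0.3              # assumed masking noise, drops about 30% of entries
X0 = Xc * mask                               # corrupted input fed to the network
weights = [rng.normal(0, 0.1, (B, K)), rng.normal(0, 0.1, (K, B))]
biases = [np.zeros(K), np.zeros(B)]
loss = sdae_objective(X0, Xc, weights, biases, lam=0.1)
```

Minimizing this quantity over the weights and biases (for example by back-propagation) is what training the SDAE amounts to; the sketch only evaluates it.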

SLIDE 16

Generalized Probabilistic SDAE

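[Figure: graphical model of the generalized probabilistic SDAE, with a plate labeled J, nodes x_0, x_1, x_2, x_3, x_4, x_c, and W^+, and hyperparameters λ_n and λ_w]

1. For each layer $l$ of the SDAE network,
   1. For each column $n$ of the weight matrix $W_l$, draw
      $W_{l,*n} \sim \mathcal{N}(0, \lambda_w^{-1} I_{K_l})$.
   2. Draw the bias vector $b_l \sim \mathcal{N}(0, \lambda_w^{-1} I_{K_l})$.
   3. For each row $j$ of $X_l$, draw
      $X_{l,j*} \sim \mathcal{N}(\sigma(X_{l-1,j*} W_l + b_l), \lambda_s^{-1} I_{K_l})$.
2. For each item $j$, draw a clean input
   $X_{c,j*} \sim \mathcal{N}(X_{L,j*}, \lambda_n^{-1} I_B)$.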
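The generative process can be read as ancestral sampling, one layer at a time. The sketch below assumes a sigmoid for σ and draws every weight and bias entry independently from N(0, 1/λ_w); the function name `sample_generalized_sdae` and the toy sizes are hypothetical. With very large λ_s the layer noise becomes negligible and each X_l is essentially the deterministic sigmoid output.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sample_generalized_sdae(X0, layer_sizes, lam_w, lam_s, lam_n, rng):
    """Ancestral sampling: weights, biases, each layer X_l, then the clean input X_c."""
    X = X0
    for K_l in layer_sizes:
        W = rng.normal(0.0, lam_w ** -0.5, (X.shape[1], K_l))   # entries ~ N(0, 1/lam_w)
        b = rng.normal(0.0, lam_w ** -0.5, K_l)
        mean = sigmoid(X @ W + b)
        X = mean + rng.normal(0.0, lam_s ** -0.5, mean.shape)   # X_l ~ N(mean, I/lam_s)
    Xc = X + rng.normal(0.0, lam_n ** -0.5, X.shape)            # X_c ~ N(X_L, I/lam_n)
    return X, Xc

rng = np.random.default_rng(1)
X0 = rng.random((5, 8))                                         # toy corrupted input, J = 5, B = 8
XL, Xc = sample_generalized_sdae(X0, [4, 8], lam_w=1.0, lam_s=1e12, lam_n=1e12, rng=rng)
```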

SLIDE 18

Relational SDAE: Generative Process

1. Draw the relational latent matrix $S$ from a matrix variate normal distribution:
   $S \sim \mathcal{N}_{K,J}(0, I_K \otimes (\lambda_l L_a)^{-1})$.
2. For layer $l$ of the SDAE where $l = 1, 2, \ldots, \frac{L}{2} - 1$,
   1. For each column $n$ of the weight matrix $W_l$, draw
      $W_{l,*n} \sim \mathcal{N}(0, \lambda_w^{-1} I_{K_l})$.
   2. Draw the bias vector $b_l \sim \mathcal{N}(0, \lambda_w^{-1} I_{K_l})$.
   3. For each row $j$ of $X_l$, draw
      $X_{l,j*} \sim \mathcal{N}(\sigma(X_{l-1,j*} W_l + b_l), \lambda_s^{-1} I_{K_l})$.
3. For layer $\frac{L}{2}$ of the SDAE network, draw the representation vector for item $j$ from the product of two Gaussians (PoG):
   $X_{\frac{L}{2},j*} \sim \mathrm{PoG}(\sigma(X_{\frac{L}{2}-1,j*} W_l + b_l), s_j^T, \lambda_s^{-1} I_K, \lambda_r^{-1} I_K)$.
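For isotropic Gaussians, the normalized product of N(μ1, λ_s⁻¹ I) and N(μ2, λ_r⁻¹ I) is again Gaussian, with precision λ_s + λ_r and a precision-weighted mean. The short sketch below illustrates that standard identity; the vectors `mu_net` and `s_j` are made-up stand-ins for the network output and the relational latent vector, not values from the talk.

```python
import numpy as np

def product_of_gaussians(mu1, mu2, lam_s, lam_r):
    """Mean and precision of N(mu1, I/lam_s) * N(mu2, I/lam_r) after normalization."""
    precision = lam_s + lam_r
    mean = (lam_s * mu1 + lam_r * mu2) / precision   # precision-weighted average
    return mean, precision

mu_net = np.array([0.2, 0.8, 0.5])   # stand-in for sigma(X_{L/2-1,j*} W_l + b_l)
s_j = np.array([0.6, 0.4, 0.5])      # stand-in for the relational latent vector s_j
mean, precision = product_of_gaussians(mu_net, s_j, lam_s=3.0, lam_r=1.0)
```

A larger λ_r pulls the middle-layer representation toward s_j, while a larger λ_s keeps it close to the network output.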

SLIDE 19

Relational SDAE: Generative Process

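4. For layer $l$ of the SDAE network where $l = \frac{L}{2} + 1, \frac{L}{2} + 2, \ldots, L$,
   1. For each column $n$ of the weight matrix $W_l$, draw
      $W_{l,*n} \sim \mathcal{N}(0, \lambda_w^{-1} I_{K_l})$.
   2. Draw the bias vector $b_l \sim \mathcal{N}(0, \lambda_w^{-1} I_{K_l})$.
   3. For each row $j$ of $X_l$, draw
      $X_{l,j*} \sim \mathcal{N}(\sigma(X_{l-1,j*} W_l + b_l), \lambda_s^{-1} I_{K_l})$.
5. For each item $j$, draw a clean input
   $X_{c,j*} \sim \mathcal{N}(X_{L,j*}, \lambda_n^{-1} I_B)$.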

SLIDE 20

Relational SDAE: Graphical Model

[Figure: graphical model of relational SDAE, with a plate labeled J, nodes x_0, x_1, x_2, x_3, x_4, x_c, s, W^+, and A, and hyperparameters λ_l, λ_n, λ_w, and λ_r]

SLIDE 21

Multi-Relational SDAE: Graphical Model

[Figure: graphical model of multi-relational SDAE, with a plate labeled J, nodes x_0, x_1, x_2, x_3, x_4, x_c, s, Q, W^+, and A, and hyperparameters λ_n, λ_l, λ_w, and λ_r]

SLIDE 22

Relational SDAE: Objective function

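The log-likelihood:

\[
\begin{aligned}
\mathcal{L} = {} & -\frac{\lambda_l}{2}\,\mathrm{tr}(S L_a S^T) - \frac{\lambda_r}{2}\sum_j \|(s_j^T - X_{\frac{L}{2},j*})\|_2^2 \\
& - \frac{\lambda_w}{2}\sum_l \left(\|W_l\|_F^2 + \|b_l\|_2^2\right) \\
& - \frac{\lambda_n}{2}\sum_j \|X_{L,j*} - X_{c,j*}\|_2^2 \\
& - \frac{\lambda_s}{2}\sum_l \sum_j \|\sigma(X_{l-1,j*} W_l + b_l) - X_{l,j*}\|_2^2,
\end{aligned}
\]

where $X_{l,j*} = \sigma(X_{l-1,j*} W_l + b_l)$. Similar to the generalized SDAE, taking $\lambda_s$ to infinity, the last term of the joint log-likelihood will vanish.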
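To see what the first term does, assume (the slides do not define it) that L_a is the graph Laplacian D − A of a symmetric item adjacency matrix A. Under that assumption, tr(S L_a Sᵀ) equals half the sum of A_ij ‖s_i − s_j‖² over all item pairs, so linked items are encouraged to have similar relational latent vectors. The toy graph and function names below are illustrative only.

```python
import numpy as np

def graph_laplacian(A):
    """L_a = D - A for a symmetric adjacency matrix A (assumed definition)."""
    return np.diag(A.sum(axis=1)) - A

def relational_penalty(S, A):
    """tr(S L_a S^T), where column j of S is the relational latent vector s_j."""
    return np.trace(S @ graph_laplacian(A) @ S.T)

def pairwise_penalty(S, A):
    """(1/2) * sum_{i,j} A_ij * ||s_i - s_j||^2, written out explicitly."""
    J = A.shape[0]
    return 0.5 * sum(A[i, j] * np.sum((S[:, i] - S[:, j]) ** 2)
                     for i in range(J) for j in range(J))

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)            # toy chain of 4 linked items
S = np.random.default_rng(0).normal(size=(3, 4))     # K = 3 latent dims, J = 4 items
```

Both functions return the same value for any S, which is the identity that makes the trace term a smoothness penalty over the item graph.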

SLIDE 23

Updating Rules

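For S:

\[
\begin{aligned}
S_{k*}(t+1) &\leftarrow S_{k*}(t) + \delta(t)\, r(t) \\
r(t) &\leftarrow \lambda_r X_{\frac{L}{2},*k}^T - (\lambda_l L_a + \lambda_r I_J)\, S_{k*}(t) \\
\delta(t) &\leftarrow \frac{r(t)^T r(t)}{r(t)^T (\lambda_l L_a + \lambda_r I_J)\, r(t)}.
\end{aligned}
\]

For X, W, and b: use back-propagation.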
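The update for S is steepest descent with an exact line search on the linear system (λ_l L_a + λ_r I_J) s = λ_r x, where s is row k of S and x is column k of the middle-layer representation, both treated as length-J vectors. The sketch below is one way to write it, assuming a fixed iteration cap and a toy 3-item Laplacian; the function name `update_row_of_S` is hypothetical.

```python
import numpy as np

def update_row_of_S(s, x, La, lam_l, lam_r, n_steps=200):
    """Steepest descent with exact line search on (lam_l*La + lam_r*I) s = lam_r * x."""
    M = lam_l * La + lam_r * np.eye(La.shape[0])
    for _ in range(n_steps):
        r = lam_r * x - M @ s                 # residual r(t)
        denom = r @ M @ r
        if denom < 1e-30:                     # residual is numerically zero: converged
            break
        delta = (r @ r) / denom               # step size delta(t)
        s = s + delta * r                     # S_k*(t+1)
    return s

La = np.array([[1, -1, 0],
               [-1, 2, -1],
               [0, -1, 1]], dtype=float)      # Laplacian of a 3-item chain (toy example)
x = np.array([1.0, 0.0, -1.0])                # stand-in for column k of X_{L/2}
s = update_row_of_S(np.zeros(3), x, La, lam_l=0.5, lam_r=1.0)
```

Because the system matrix is symmetric positive definite whenever λ_r > 0, the iteration converges to the same vector a direct solver would return.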

SLIDE 24

From Representation to Tag Recommendation

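Objective function:

\[
\mathcal{L} = -\frac{\lambda_u}{2}\sum_i \|u_i\|_2^2 - \frac{\lambda_v}{2}\sum_j \|v_j - X_{\frac{L}{2},j*}^T\|_2^2 - \sum_{i,j}\frac{c_{ij}}{2}\,(R_{ij} - u_i^T v_j)^2,
\]

where $\lambda_u$ and $\lambda_v$ are hyperparameters. $c_{ij}$ is set to 1 for the existing ratings and 0.01 for the missing entries.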
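The slides do not say how this objective is optimized. One common choice for objectives of this shape is coordinate ascent with closed-form updates, alternating between the u_i and the v_j; the sketch below assumes that choice. The matrix layout (columns of `U` are tag vectors, columns of `V` are item vectors, rows of `X` are the learned representations) and all names are illustrative.

```python
import numpy as np

def objective(R, C, U, V, X, lam_u, lam_v):
    """L from the slide: columns of U are u_i (tags), columns of V are v_j (items)."""
    fit = 0.5 * np.sum(C * (R - U.T @ V) ** 2)
    return (-0.5 * lam_u * np.sum(U ** 2)
            - 0.5 * lam_v * np.sum((V - X.T) ** 2) - fit)

def coordinate_ascent_step(R, C, U, V, X, lam_u, lam_v):
    """One sweep of closed-form updates: each u_i with V fixed, then each v_j with U fixed."""
    K = U.shape[0]
    for i in range(R.shape[0]):
        Ci = C[i]                                        # confidences c_ij for tag i
        A = (V * Ci) @ V.T + lam_u * np.eye(K)           # sum_j c_ij v_j v_j^T + lam_u I
        U[:, i] = np.linalg.solve(A, (V * Ci) @ R[i])
    for j in range(R.shape[1]):
        Cj = C[:, j]                                     # confidences c_ij for item j
        A = (U * Cj) @ U.T + lam_v * np.eye(K)           # sum_i c_ij u_i u_i^T + lam_v I
        V[:, j] = np.linalg.solve(A, (U * Cj) @ R[:, j] + lam_v * X[j])
    return U, V

rng = np.random.default_rng(0)
I, J, K = 5, 4, 2                                        # tags, items, latent dims (toy sizes)
R = (rng.random((I, J)) > 0.6).astype(float)             # toy tag-item matrix
C = np.where(R > 0, 1.0, 0.01)                           # c_ij as described on the slide
X = rng.random((J, K))                                   # stand-in for X_{L/2}
U = rng.normal(0, 0.1, (K, I))
V = X.T.copy()
before = objective(R, C, U, V, X, 0.1, 1.0)
U, V = coordinate_ascent_step(R, C, U, V, X, 0.1, 1.0)
after = objective(R, C, U, V, X, 0.1, 1.0)
```

Each closed-form update maximizes the objective exactly in one block of variables, so the objective never decreases from one sweep to the next.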

SLIDE 25

Algorithm

1. Learning representation:
   repeat
     Update S using the updating rules
     Update X, W, and b
   until convergence
   Get the resulting representation $X_{\frac{L}{2},j*}$
2. Learning $u_i$ and $v_j$:
   Optimize the objective function $\mathcal{L}$
3. Recommend tags to items according to the predicted $R_{ij}$:
   $R_{ij} = u_i^T v_j$
   Rank $R_{1j}, R_{2j}, \ldots, R_{Ij}$
   Recommend tags with the largest $R_{ij}$ to item $j$
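The final ranking step can be sketched as follows. The recall helper is an assumption about how recall at M might be computed (the fraction of an item's true tags that appear among its top M recommendations); the slides do not spell out the metric, and the toy vectors are invented for illustration.

```python
import numpy as np

def recommend_tags(U, V, j, top_m):
    """Indices of the top_m tags for item j, ranked by R_ij = u_i^T v_j."""
    scores = U.T @ V[:, j]                    # predicted R_1j, ..., R_Ij
    return np.argsort(-scores)[:top_m]        # largest scores first

def recall_at_m(recommended, true_tags):
    """Assumed metric: share of the item's true tags found among the recommendations."""
    hits = len(set(int(t) for t in recommended) & set(true_tags))
    return hits / len(true_tags)

U = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 0.0]])               # K = 2, three tags (toy values)
V = np.array([[1.0],
              [0.0]])                         # one item
top2 = recommend_tags(U, V, j=0, top_m=2)
recall = recall_at_m(top2, {2, 1})
```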

SLIDE 26

Problems to Explore

1. Can SDAE learn effective representations for recommendation?
2. How can relational information be incorporated into SDAE?
3. How well does it perform?

SLIDE 28

Datasets

Description of datasets

                   citeulike-a   citeulike-t   movielens-plot
#items                   16980         25975             7261
#tags                     7386          8311             2988
#tag-item pairs         204987        134860            51301
#relations               44709         32665           543621

SLIDE 29

citeulike-a, Sparse Setting

[Figure: recall versus M for RSDAE, SDAE, CTR-SR, and CTR; M ranges from 50 to 300, recall axis from 0.15 to 0.35]

SLIDE 30

citeulike-a, Dense Setting

[Figure: recall versus M for RSDAE, SDAE, CTR-SR, and CTR; M ranges from 50 to 300, recall axis from 0.3 to 0.6]

SLIDE 31

movielens-plot, Sparse Setting

[Figure: recall versus M for RSDAE, SDAE, CTR-SR, and CTR; M ranges from 50 to 300, recall axis from 0.06 to 0.24]

SLIDE 32

movielens-plot, Dense Setting

[Figure: recall versus M for RSDAE, SDAE, CTR-SR, and CTR; M ranges from 50 to 300, recall axis from 0.2 to 0.45]

SLIDE 34

Tagging Scientific Articles

An example article with recommended tags

Example Article
Title: Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews
Top topic 1: language, text, mining, representation, semantic, concepts, words, relations, processing, categories

Top 10 tags

Rank  SDAE                 True?   RSDAE                       True?
1     instance             no      sentiment analysis          no
2     consumer             yes     instance                    no
3     sentiment analysis   no      consumer                    yes
4     summary              no      summary                     no
5     31july09             no      sentiment                   yes
6     medline              no      product review mining       yes
7     eit2                 no      sentiment classification    yes
8     l2r                  no      31july09                    no
9     exploration          no      opinion mining              yes
10    biomedical           no      product                     yes

SLIDE 35

Tagging Movies

An example movie with recommended tags

Example Movie
Title: E.T. the Extra-Terrestrial
Top topic 1: crew, must, on, earth, human, save, ship, rescue, by, find, scientist, planet

Top 10 recommended tags (SDAE)

Rank  Tag                                    True tag?
1     Saturn Award (Best Special Effects)    yes
2     Want                                   no
3     Saturn Award (Best Fantasy Film)       no
4     Saturn Award (Best Writing)            yes
5     Cool but freaky                        no
6     Saturn Award (Best Director)           no
7     Oscar (Best Editing)                   no
8     almost favorite                        no
9     Steven Spielberg                       yes
10    sequel better than original            no

SLIDE 36

Tagging Movies

An example movie with recommended tags

Example Movie
Title: E.T. the Extra-Terrestrial
Top topic 1: crew, must, on, earth, human, save, ship, rescue, by, find, scientist, planet

Top 10 recommended tags (RSDAE)

Rank  Tag                                    True tag?
1     Steven Spielberg                       yes
2     Saturn Award (Best Special Effects)    yes
3     Saturn Award (Best Writing)            yes
4     Oscar (Best Editing)                   no
5     Want                                   no
6     Liam Neeson                            no
7     AFI 100 (Cheers)                       yes
8     Oscar (Best Sound)                     yes
9     Saturn Award (Best Director)           no
10    Oscar (Best Music - Original Score)    yes

SLIDE 38

Conclusion

Contribution:

1. Adapt SDAE for tag recommendation
2. A probabilistic relational model for relational deep learning
3. State-of-the-art performance

Take-home Message:

1. Deep models significantly boost recommendation accuracy
2. Probabilistic formulation facilitates relational deep learning
3. Incorporating relational information further boosts accuracy

SLIDE 39

Future Work

1. Applications other than tag recommendation
2. Adaptation for other deep learning models
3. Integrated model instead of separate ones
4. Fully Bayesian methods

SLIDE 40

Other Work

1. Collaborative CNN for recommender systems
2. Relational deep learning for link prediction

SLIDE 41

citeulike-a, Sparse Setting

[Figure: recall versus M for CCNN, CTR, CMF, and SVDFeature; M ranges from 50 to 300, recall axis from 0.04 to 0.22]

SLIDE 42

Q & A
