Collaborative Deep Learning for Recommender Systems (PowerPoint presentation by Hao Wang)




SLIDE 1

Collaborative Deep Learning for Recommender Systems

Hao Wang, Naiyan Wang, Dit-Yan Yeung


SLIDE 2
  • Motivation
  • Stacked Denoising Autoencoders
  • Probabilistic Matrix Factorization
  • Collaborative Deep Learning
  • Experiments
  • Summary

Motivation Stacked DAE PMF Collaborative DL Experiments Summary 2

SLIDE 3

Recommender Systems


[Figure: a rating matrix with observed preferences; the task is to predict the missing entries, i.e., matrix completion]

SLIDE 4

Recommender Systems with Content

Content information: plots, directors, actors, etc.

SLIDE 5

Modeling the Content Information

  • Handcrafted features
  • Automatically learned features
  • Automatically learned features that also adapt to the ratings


SLIDE 6

Modeling the Content Information

  • 1. Powerful features for content information: deep learning
  • 2. Feedback from rating information: non-i.i.d. data, hence collaborative deep learning

SLIDE 7

Deep Learning

Stacked denoising autoencoders, convolutional neural networks, recurrent neural networks.

"Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction." (Bengio et al. 2015)

SLIDE 8

Deep Learning

Stacked denoising autoencoders, convolutional neural networks, recurrent neural networks

Typically for i.i.d. data

SLIDE 9

Modeling the Content Information

  • 1. Powerful features for content information: deep learning
  • 2. Feedback from rating information: non-i.i.d. data, hence collaborative deep learning (CDL)

SLIDE 10

Contribution

Collaborative deep learning:
  • deep learning for non-i.i.d. data
  • joint representation learning and collaborative filtering


SLIDE 11

Contribution

Collaborative deep learning Complex target: * beyond targets like classification and regression * to complete a low-rank matrix


SLIDE 12

Contribution

Collaborative deep learning Complex target First hierarchical Bayesian models for hybrid deep recommender system


SLIDE 13

Contribution

Collaborative deep learning Complex target First hierarchical Bayesian models for hybrid deep recommender system Significantly advance the state of the art


SLIDE 14
  • Motivation
  • Stacked Denoising Autoencoders
  • Probabilistic Matrix Factorization
  • Collaborative Deep Learning
  • Experiments
  • Summary


SLIDE 15

Stacked Denoising Autoencoders (SDAE)

[Figure: an SDAE maps the corrupted input back to the clean input] (Vincent et al. 2010)
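The idea on this slide can be sketched with a single layer of the stack: corrupt the input by masking entries, then train the network to reconstruct the clean input. Below is a minimal, illustrative NumPy version; all function names, dimensions, and hyperparameters are ours, not from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_dae_layer(X_clean, hidden, epochs=1000, lr=1.0, noise=0.2, seed=0):
    """Train one denoising-autoencoder layer: corrupt the input by
    randomly masking entries, then learn to reconstruct the clean input."""
    rng = np.random.default_rng(seed)
    n, d = X_clean.shape
    W1 = rng.normal(0, 0.1, (d, hidden))    # encoder weights
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, d))    # decoder weights
    b2 = np.zeros(d)
    for _ in range(epochs):
        X0 = X_clean * (rng.random(X_clean.shape) > noise)  # corrupted input
        H = sigmoid(X0 @ W1 + b1)           # hidden representation
        Xr = sigmoid(H @ W2 + b2)           # reconstruction of the CLEAN input
        # gradient of the squared reconstruction error, backpropagated
        dXr = (Xr - X_clean) * Xr * (1.0 - Xr)
        dH = (dXr @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ dXr) / n
        b2 -= lr * dXr.mean(axis=0)
        W1 -= lr * (X0.T @ dH) / n
        b1 -= lr * dH.mean(axis=0)
    return W1, b1, W2, b2
```

Stacking means feeding the hidden representation H of one trained layer in as the input of the next layer.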

SLIDE 16
  • Motivation
  • Stacked Denoising Autoencoders
  • Probabilistic Matrix Factorization
  • Collaborative Deep Learning
  • Experiments
  • Summary


SLIDE 17

Probabilistic Matrix Factorization (PMF)


Notation: u_i is the latent vector of user i, v_j is the latent vector of item j, and R_ij is the rating of item j from user i.

Generative process (Salakhutdinov et al. 2008):
  u_i ~ N(0, lambda_u^{-1} I)
  v_j ~ N(0, lambda_v^{-1} I)
  R_ij ~ N(u_i^T v_j, C_ij^{-1})

Objective function if using MAP:
  min over U, V of  sum_{i,j} (C_ij / 2)(R_ij - u_i^T v_j)^2 + (lambda_u / 2) sum_i ||u_i||^2 + (lambda_v / 2) sum_j ||v_j||^2
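The MAP objective above can be optimized by alternating least squares: fix V and solve a ridge regression for each u_i, then fix U and solve for each v_j. A small, self-contained sketch (uniform confidence C_ij = 1 on observed entries; names and defaults are illustrative):

```python
import numpy as np

def pmf_map(R, mask, k=5, lam_u=0.1, lam_v=0.1, iters=20, seed=0):
    """MAP estimate of PMF by alternating least squares.
    R: rating matrix; mask: 1 where R is observed, 0 elsewhere."""
    rng = np.random.default_rng(seed)
    n_u, n_v = R.shape
    U = rng.normal(0, 0.1, (n_u, k))
    V = rng.normal(0, 0.1, (n_v, k))
    I = np.eye(k)
    for _ in range(iters):
        for i in range(n_u):                 # update each user latent vector
            obs = mask[i] > 0
            Vi = V[obs]                      # items rated by user i
            U[i] = np.linalg.solve(Vi.T @ Vi + lam_u * I, Vi.T @ R[i, obs])
        for j in range(n_v):                 # update each item latent vector
            obs = mask[:, j] > 0
            Uj = U[obs]                      # users who rated item j
            V[j] = np.linalg.solve(Uj.T @ Uj + lam_v * I, Uj.T @ R[obs, j])
    return U, V
```

Each inner solve is the exact minimizer of the quadratic objective with the other block held fixed, so the MAP objective never increases.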

SLIDE 18
  • Motivation
  • Stacked Denoising Autoencoders
  • Probabilistic Matrix Factorization
  • Collaborative Deep Learning
  • Experiments
  • Summary


SLIDE 19

Probabilistic SDAE


Generalized SDAE generative process (notation: X_0 is the corrupted input, X_c the clean input, W_l and b_l the weights and biases of layer l):
  W_l ~ N(0, lambda_w^{-1} I),  b_l ~ N(0, lambda_w^{-1} I)
  x_{l,j} ~ N(sigma(x_{l-1,j} W_l + b_l), lambda_s^{-1} I)
  x_{c,j} ~ N(x_{L,j}, lambda_n^{-1} I)

SLIDE 20

Collaborative Deep Learning


Graphical model: collaborative deep learning couples the probabilistic SDAE with PMF.

Notation: corrupted input, clean input, weights and biases, content representation, rating of item j from user i, latent vector of item j, latent vector of user i.

Two-way interaction

  • More powerful representation
  • Infer missing ratings from content
  • Infer missing content from ratings

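The coupling described above can be made concrete: the item latent vector is generated as the SDAE's middle-layer content representation plus a Gaussian offset, and a rating is generated from the user and item latent vectors. A toy generative sketch (all dimensions, weight values, and names are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x0, W1, b1, W2, b2):
    """Encoder half of a 2-layer SDAE: the middle-layer content representation."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    return sigmoid(sigmoid(x0 @ W1 + b1) @ W2 + b2)

# toy dimensions: d-dim bag-of-words content -> k-dim representation
d, h, k = 12, 8, 4
W1, b1 = rng.normal(0, 0.1, (d, h)), np.zeros(h)
W2, b2 = rng.normal(0, 0.1, (h, k)), np.zeros(k)

x0 = (rng.random(d) > 0.5).astype(float)          # (corrupted) content of item j
theta_j = encode(x0, W1, b1, W2, b2)              # content representation
lam_v = 10.0
v_j = theta_j + rng.normal(0, lam_v ** -0.5, k)   # item latent = content + offset
u_i = rng.normal(0, 1, k)                         # user latent vector
r_hat = u_i @ v_j                                 # predicted rating
```

Because v_j is anchored to the content representation, content informs missing ratings, and because v_j is also fitted to the ratings, rating signal flows back into the representation.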

SLIDE 21

Collaborative Deep Learning

Neural network representation of the degenerate CDL

SLIDE 22

Collaborative Deep Learning

Information flows from ratings to content

SLIDE 23

Collaborative Deep Learning

Information flows from content to ratings

SLIDE 24

Collaborative Deep Learning

Reciprocal: representation and recommendation

SLIDE 25

Learning

Maximizing the posterior probability is equivalent to maximizing the joint log-likelihood.

SLIDE 26

Learning

Prior (regularization) for user latent vectors, weights, and biases.

SLIDE 27

Learning

Generating item latent vectors from the content representation with a Gaussian offset.

SLIDE 28

Learning

'Generating' the clean input from the output of the probabilistic SDAE with a Gaussian offset.

SLIDE 29

Learning

Generating the input of layer l from the output of layer l-1 with a Gaussian offset.

SLIDE 30

Learning

The rating term measures the error of the predicted ratings.

SLIDE 31

Learning

If lambda_s goes to infinity, the Gaussian hidden layers collapse to deterministic values and the likelihood becomes that of a vanilla SDAE network.
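For reference, the resulting joint log-likelihood (up to constants) in this limit can be reconstructed as follows, where f_e denotes the encoder output (the content representation) and f_r the full network output; this is our reconstruction of the standard CDL objective, not a formula taken verbatim from the slides:

```latex
\mathcal{L} = -\frac{\lambda_u}{2}\sum_i \lVert u_i \rVert_2^2
  - \frac{\lambda_w}{2}\sum_l \left( \lVert W_l \rVert_F^2 + \lVert b_l \rVert_2^2 \right)
  - \frac{\lambda_v}{2}\sum_j \left\lVert v_j - f_e(X_{0,j*}, W^+)^T \right\rVert_2^2
  - \frac{\lambda_n}{2}\sum_j \left\lVert f_r(X_{0,j*}, W^+) - X_{c,j*} \right\rVert_2^2
  - \sum_{i,j} \frac{C_{ij}}{2} \left( R_{ij} - u_i^T v_j \right)^2
```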

SLIDE 32

Update Rules

  • For U and V, use block coordinate descent.
  • For W and b, use a modified version of backpropagation.
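The block coordinate descent step for U and V has a closed form; a sketch is below. The pull of V toward the content representations Theta is what distinguishes this from plain PMF. Confidence weights C, names, and default values are illustrative.

```python
import numpy as np

def update_U_V(R, C, Theta, U, V, lam_u=0.1, lam_v=1.0):
    """One sweep of block coordinate descent for U and V.
    R: ratings; C: confidence weights; Theta: content representations
    (encoder outputs), toward which V is regularized instead of 0."""
    k = U.shape[1]
    I = np.eye(k)
    for i in range(U.shape[0]):              # exact minimizer for each u_i
        Ci = np.diag(C[i])
        U[i] = np.linalg.solve(V.T @ Ci @ V + lam_u * I, V.T @ Ci @ R[i])
    for j in range(V.shape[0]):              # exact minimizer for each v_j
        Cj = np.diag(C[:, j])
        V[j] = np.linalg.solve(U.T @ Cj @ U + lam_v * I,
                               U.T @ Cj @ R[:, j] + lam_v * Theta[j])
    return U, V
```

Because each block update is an exact minimizer with the other variables held fixed, the regularized objective is non-increasing across sweeps.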

SLIDE 33
  • Motivation
  • Stacked Denoising Autoencoders
  • Probabilistic Matrix Factorization
  • Collaborative Deep Learning
  • Experiments
  • Summary


SLIDE 34

Datasets

Content information per dataset: titles and abstracts (Wang et al. 2011), titles and abstracts (Wang et al. 2013), and movie plots.

SLIDE 35

Evaluation Metrics

Recall@M = (number of items the user likes among the top M recommended) / (total number of items the user likes)

Mean Average Precision (mAP): the mean over users of the average precision of each user's ranked list.

Higher recall and mAP indicate better recommendation performance.
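These two metrics are straightforward to compute; a minimal sketch follows (the cutoff of 500 matches the mAP protocol mentioned later in the deck; normalizing average precision by min(|liked|, cutoff) is one common convention and is our assumption here).

```python
import numpy as np

def recall_at_m(ranked, liked, m):
    """Fraction of a user's liked items that appear in the top-M list."""
    hits = len(set(ranked[:m]) & set(liked))
    return hits / len(liked)

def average_precision(ranked, liked, cutoff=500):
    """Average precision of one ranked list, truncated at `cutoff`."""
    liked = set(liked)
    hits, score = 0, 0.0
    for rank, item in enumerate(ranked[:cutoff], start=1):
        if item in liked:
            hits += 1
            score += hits / rank            # precision at this hit
    return score / min(len(liked), cutoff)

def mean_average_precision(ranked_lists, liked_lists, cutoff=500):
    """Mean of per-user average precision."""
    return float(np.mean([average_precision(r, l, cutoff)
                          for r, l in zip(ranked_lists, liked_lists)]))
```

For example, with ranked list [1, 2, 3, 4, 5] and liked items {1, 3}, Recall@2 is 0.5 and the average precision is (1/1 + 2/3) / 2 = 5/6.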

SLIDE 36

Comparing Methods

Baselines: hybrid methods using bag-of-words (BOW) and ratings (e.g., PMF+LDA); they are loosely coupled, and the interaction is not two-way.

SLIDE 37

Recall@M

[Figure: Recall@M curves on citeulike-t (sparse and dense settings) and Netflix (sparse and dense settings), with observations for the very sparse and the dense cases]

SLIDE 38

Mean Average Precision (mAP)

Following exactly the same protocol as Oord et al. 2013, we set the cutoff point at 500 for each user. CDL yields a relative performance boost of about 50%.

SLIDE 39

Number of Layers

[Figure: performance versus number of layers, in the sparse and the dense setting]

The best performance is achieved when the number of layers is 2 or 3 (4 or 6 layers in the generalized neural network).

SLIDE 40

Example User

Romance movies (e.g., Moonstruck, True Romance)

Precision: 30% vs. 20%

SLIDE 41

Example User

Action and drama movies (e.g., Johnny English, American Beauty)

Precision: 50% vs. 20%

SLIDE 42

Example User

Precision: 90% vs. 50%

SLIDE 43
  • Motivation
  • Stacked Denoising Autoencoders
  • Probabilistic Matrix Factorization
  • Collaborative Deep Learning
  • Experiments
  • Summary


SLIDE 44

Summary

  • Non-i.i.d. (collaborative) deep learning
  • With a complex target
  • First hierarchical Bayesian model for hybrid deep recommender systems
  • Significantly advances the state of the art

SLIDE 45

Summary

  • Word2vec, tf-idf
  • Sampling-based and variational inference
  • Tagging information, networks

SLIDE 46

Thank you!

More results, code, and datasets: http://www.wanghao.in
Hao Wang, hwangaz@cse.ust.hk