CS249: ADVANCED DATA MINING Recommender Systems II


CS249: ADVANCED DATA MINING Recommender Systems II Instructor: Yizhou Sun yzsun@cs.ucla.edu May 31, 2017 Recommender Systems Recommendation via Information Network Analysis Hybrid Collaborative Filtering with Information Networks


slide-1
SLIDE 1

CS249: ADVANCED DATA MINING

Instructor: Yizhou Sun

yzsun@cs.ucla.edu May 31, 2017

Recommender Systems II

slide-2
SLIDE 2

Recommender Systems

  • Recommendation via Information Network Analysis
  • Hybrid Collaborative Filtering with Information Networks
  • Graph Regularization for Recommendation
  • Summary

slide-3
SLIDE 3

Traditional View of Recommendation

[Figure: example movies Avatar, Titanic, Aliens, and Revolutionary Road]

slide-4
SLIDE 4

Recommendation Paradigm

[Figure labels: recommender system, recommendation, user feedback, external knowledge, product features, user-item feedback]

Collaborative Filtering

E.g., K-Nearest Neighbor (Sarwar WWW’01), Matrix Factorization (Hu ICDM’08, Koren IEEE-CS’09), Probabilistic Model (Hofmann SIGIR’03)

Content-Based Methods

E.g., (Balabanovic Comm. ACM’97, Zhang SIGIR’02)

Hybrid Methods

E.g., Content-Based CF (Antonopoulus, IS’06), External Knowledge CF (Ma WSDM’11)

slide-5
SLIDE 5

An Example of Traditional Method: Matrix Factorization

$R$: Rating Matrix; $\hat{R}$: Estimated Rating Matrix
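The factorization named on this slide can be sketched as follows. This is a minimal illustration, assuming a squared-error objective with L2 regularization minimized by stochastic gradient descent; the toy ratings, the function names (`factorize`, `predict`, `rmse`), and all hyperparameters are assumptions and do not come from the lecture.

```python
import math
import random

def factorize(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=2000, seed=0):
    """Fit R ~ U V^T on observed (user, item, rating) triples with SGD."""
    rng = random.Random(seed)
    U = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(U[u][f] * V[i][f] for f in range(k))
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                # Gradient step on the squared error plus the L2 penalty.
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V

def predict(U, V, u, i):
    """Estimated rating: dot product of the user and item latent vectors."""
    return sum(a * b for a, b in zip(U[u], V[i]))

def rmse(ratings, U, V):
    """Root mean squared error on the given (user, item, rating) triples."""
    return math.sqrt(sum((r - predict(U, V, u, i)) ** 2 for u, i, r in ratings) / len(ratings))

# Hypothetical toy data. Items: 0 = Avatar, 1 = Titanic, 2 = Aliens, 3 = Revolutionary Road.
ratings = [
    (0, 0, 5), (0, 2, 5), (0, 1, 1),
    (1, 1, 5), (1, 3, 4), (1, 0, 1),
    (2, 0, 4), (2, 2, 5),
]
U, V = factorize(ratings, n_users=3, n_items=4)
```

Unobserved entries, such as user 2's rating for Titanic, are then estimated with `predict(U, V, 2, 1)`.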

slide-6
SLIDE 6

Challenges

  • How to address the data sparsity and cold start issues?
  • How to leverage different sources of information?

slide-7
SLIDE 7

Solution: A Heterogeneous Information Network View of Recommendation

[Figure: network connecting the movies Avatar, Titanic, Aliens, and Revolutionary Road with James Cameron, Kate Winslet, Leonardo DiCaprio, Zoe Saldana, and the genres Adventure and Romance]

slide-8
SLIDE 8

What Are Information Networks?

  • A network where each node represents an entity (e.g., user in a social network) and each link (e.g., friendship) a relationship between entities.
  • Nodes/links may have attributes, labels, and weights.
  • Links may carry rich semantic information.

slide-9
SLIDE 9

We are living in a connected world!


slide-10
SLIDE 10

Even in the Biomedical Domain

[Figure: biomedical network with node types Gene, Patient, Symptom, Microbe, Drug, Compound, Side Effect, and Disease, and link types carriedBy, cause, similarTo, and contain]

slide-11
SLIDE 11

Recommender Systems

  • Recommendation via Information Network Analysis
  • Hybrid Collaborative Filtering with Information Networks
  • Graph Regularization for Recommendation
  • Summary

slide-12
SLIDE 12

Recommendation Paradigm

[Figure labels: recommender system, recommendation, user feedback, external knowledge, product features, user-item feedback]

Collaborative Filtering

E.g., K-Nearest Neighbor (Sarwar WWW’01), Matrix Factorization (Hu ICDM’08, Koren IEEE-CS’09), Probabilistic Model (Hofmann SIGIR’03)

Content-Based Methods

E.g., (Balabanovic Comm. ACM’97, Zhang SIGIR’02)

Hybrid Methods

E.g., Content-Based CF (Antonopoulus, IS’06), External Knowledge CF (Ma WSDM’11)

slide-13
SLIDE 13

Problem Definition

[Figure labels: recommender system, recommendation, user feedback, information network, implicit user feedback]

Hybrid collaborative filtering with information networks

slide-14
SLIDE 14

Recommend with Trust and Distrust Relationships [Ma et al., RecSys’09]

  • Users can be easily influenced by the friends they trust, and prefer their friends’ recommendations.

[Figure: a user wondering "Where to have dinner?" asks three friends, who answer "Good", "Very Good", and "Cheap & Delicious"]

slide-15
SLIDE 15

Trust and Distrust Graph

$S^{T}$: Trust Graph; $S^{D}$: Distrust Graph; $R$: User-Item Rating Matrix

slide-16
SLIDE 16

Recommendation with Trust and Distrust Relationships

$S^{T}$: Trust Graph; $S^{D}$: Distrust Graph

slide-17
SLIDE 17

Results

  • Dataset: Epinions
  • Metric: RMSE


slide-18
SLIDE 18

Hybrid Collaborative Filtering with Networks

  • Utilizing network relationship information can enhance the recommendation quality
  • However, most previous studies use only a single type of relationship between users or items (e.g., social network, Ma WSDM’11; trust relationship, Ester KDD’10; service membership, Yuan RecSys’11)

slide-19
SLIDE 19

The Heterogeneous Information Network View of Recommender System

[Figure: network connecting the movies Avatar, Titanic, Aliens, and Revolutionary Road with James Cameron, Kate Winslet, Leonardo DiCaprio, Zoe Saldana, and the genres Adventure and Romance]

slide-20
SLIDE 20

Relationship Heterogeneity Alleviates Data Sparsity

[Figure: long-tail distribution with axes "# of ratings" and "# of users or items". A small number of users and items have a large number of ratings; most users and items have a small number of ratings.]

Collaborative filtering methods suffer from the data sparsity issue

  • Heterogeneous relationships complement each other
  • Users and items with limited feedback can be connected to the network by different types of paths
  • Connect new users or items (cold start) in the information network

slide-21
SLIDE 21

Relationship Heterogeneity Based Personalized Recommendation Models (Yu et al., WSDM’14)

Different users may have different behaviors or preferences

[Figure: the movie Aliens liked by a James Cameron fan, an 80s sci-fi fan, and a Sigourney Weaver fan]

Different users may be interested in the same movie for different reasons

Two levels of personalization

Data level

  • Most recommendation methods use one model for all users and rely on personal feedback to achieve personalization

Model level

  • With different entity relationships, we can learn personalized models for different users to further distinguish their differences

slide-22
SLIDE 22

Preference Propagation-Based Latent Features

[Figure: example network with users Alice, Bob, and Charlie; movies Titanic, Revolutionary Road, Skyfall, and King Kong; people Kate Winslet, Naomi Watts, Ralph Fiennes, and Sam Mendes; genre: drama; tag: Oscar Nomination]

  • Generate L different meta-paths (path types) connecting users and items
  • Propagate user implicit feedback along each meta-path
  • Calculate latent features for users and items for each meta-path with an NMF-related method
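The propagation step can be illustrated with a small sketch. It assumes that feedback is diffused by multiplying the user-movie feedback matrix with a row-normalized movie-movie matrix built from one meta-path (movie-actor-movie); the normalization, the helper names, and the toy links are assumptions, and the weighting used in the cited paper may differ. The subsequent factorization of the diffused matrix is not shown.

```python
def matmul(A, B):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def row_normalize(A):
    """Scale each row to sum to 1 (all-zero rows stay zero)."""
    out = []
    for row in A:
        s = sum(row)
        out.append([x / s if s else 0.0 for x in row])
    return out

# Implicit feedback (hypothetical): rows = users (Alice, Bob),
# columns = movies (Titanic, Revolutionary Road, Skyfall, King Kong).
feedback = [
    [1, 0, 0, 0],  # Alice watched Titanic
    [0, 0, 0, 1],  # Bob watched King Kong
]

# Movie-actor links: columns = actors (Kate Winslet, Naomi Watts, Ralph Fiennes).
movie_actor = [
    [1, 0, 0],  # Titanic
    [1, 0, 0],  # Revolutionary Road
    [0, 0, 1],  # Skyfall
    [0, 1, 0],  # King Kong
]

# Meta-path movie-actor-movie: entry (i, j) counts actors shared by movies i and j.
movie_movie = matmul(movie_actor, transpose(movie_actor))

# Diffuse each user's feedback to movies reachable along the meta-path.
diffused = matmul(feedback, row_normalize(movie_movie))
```

In this toy example Alice's single observation on Titanic spreads to Revolutionary Road through the shared actor, so she now has a nonzero preference for a movie she gave no feedback on. Each meta-path yields one such diffused matrix, which is then factorized to obtain that meta-path's latent features.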

slide-23
SLIDE 23

Recommendation Models

Observation 1: Different meta-paths may have different importance

Global Recommendation Model (1)

Observation 2: Different users may require different models

Personalized Recommendation Model (2)

[Equation annotations: ranking score; the q-th meta-path (L in total); features for user i and item j; user-cluster similarity; c total soft user clusters]
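The two equations did not survive in the slide text, so the sketch below is reconstructed from the annotations only and should be read as an assumption: the global score is taken to be a weighted sum, over meta-paths, of the dot product between user and item latent features, and the personalized score is taken to mix per-cluster models by user-cluster similarity. Function and argument names are hypothetical.

```python
def global_score(theta, user_feats, item_feats):
    """Eq. (1)-style ranking score.

    theta[q]      : importance weight of the q-th meta-path
    user_feats[q] : latent features of user i under the q-th meta-path
    item_feats[q] : latent features of item j under the q-th meta-path
    """
    return sum(
        t * sum(a * b for a, b in zip(u, v))
        for t, u, v in zip(theta, user_feats, item_feats)
    )

def personalized_score(cluster_sims, cluster_thetas, user_feats, item_feats):
    """Eq. (2)-style ranking score.

    cluster_sims[k]   : similarity between the user and the k-th soft cluster
    cluster_thetas[k] : meta-path weights learned for the k-th cluster
    """
    return sum(
        sim * global_score(theta_k, user_feats, item_feats)
        for sim, theta_k in zip(cluster_sims, cluster_thetas)
    )

# Hypothetical example with L = 2 meta-paths and c = 2 user clusters.
user_feats = [[1.0, 0.0], [0.5, 0.5]]
item_feats = [[1.0, 1.0], [2.0, 2.0]]
score_global = global_score([0.5, 0.25], user_feats, item_feats)
score_personal = personalized_score(
    [0.75, 0.25], [[1.0, 0.0], [0.0, 1.0]], user_feats, item_feats
)
```

With a single cluster of similarity 1, the personalized score reduces to the global one, which is why the global model can be seen as a special case.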

slide-24
SLIDE 24

Parameter Estimation

  • Bayesian personalized ranking (Rendle UAI’09)
  • Objective function (3): minimize over Θ

[Equation annotations: sigmoid function; one term for each correctly ranked item pair, i.e., $u_i$ gave feedback to $e_a$ but not $e_b$]

Learning Personalized Recommendation Model

  • Soft cluster users with NMF + k-means
  • For each user cluster, learn one model with Eq. (3)
  • Generate personalized model for each user on the fly with Eq. (2)
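As a rough illustration of the ranking objective, the sketch below uses the standard Bayesian personalized ranking form: the negative log-sigmoid of the score difference for each pair in which a user gave feedback to one item but not the other, plus an L2 penalty on the parameters. The exact Eq. (3) is not recoverable from the slide text, so the regularization term, the `lam` parameter, and the data layout are assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bpr_loss(scores, pairs, theta, lam=0.01):
    """BPR-style objective to be minimized over the parameters theta.

    scores : dict mapping (user, item) to a ranking score
    pairs  : list of (user, a, b) where the user gave feedback to a but not b
    theta  : flat list of model parameters (used only for the L2 penalty)
    """
    nll = -sum(
        math.log(sigmoid(scores[(u, a)] - scores[(u, b)]))
        for u, a, b in pairs
    )
    return nll + 0.5 * lam * sum(t * t for t in theta)

# Hypothetical check: ranking the observed item higher gives a lower loss.
pairs = [(0, "a", "b")]
loss_good = bpr_loss({(0, "a"): 2.0, (0, "b"): 0.0}, pairs, theta=[])
loss_bad = bpr_loss({(0, "a"): 0.0, (0, "b"): 2.0}, pairs, theta=[])
```

In the personalized setting described above, one such objective would be minimized per user cluster, using only the pairs of users assigned to that cluster.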

slide-25
SLIDE 25

Experiment Setup

  • Datasets
  • Comparison methods:
  • Popularity: recommend the most popular items to users
  • Co-click: conditional probabilities between items
  • NMF: non-negative matrix factorization on user feedback
  • Hybrid-SVM: use Rank-SVM with plain features (utilize both user feedback and information network)

slide-26
SLIDE 26

Performance Comparison

HeteRec personalized recommendation (HeteRec-p) provides the best recommendation results

slide-27
SLIDE 27

Performance under Different Scenarios

  • HeteRec-p consistently outperforms other methods in different scenarios
  • Better recommendation results if users provide more feedback
  • Better recommendation for users who like less popular items

slide-28
SLIDE 28

Recommender Systems

  • Recommendation via Information Network Analysis
  • Hybrid Collaborative Filtering with Information Networks
  • Graph Regularization for Recommendation
  • Summary

slide-29
SLIDE 29

From Graph Regularization Point of View

  • Why do additional links help?
  • They define new similarity metrics between users or items.
  • How to integrate this assumption into recommendation?
  • Use graph regularization to force two entities to be similar in latent space if they are similar in the graph
  • The original form of graph regularization
  • $\frac{1}{2} \sum_{i,j} w_{ij} (f_i - f_j)^2 = f^\top L f$
  • $w_{ij}$: similarity of node $i$ and $j$
  • $f_i$: some latent representation for node $i$
  • $L$: Laplacian matrix of $W$, i.e., $L = D - W$, where $D$ is a diagonal matrix and $D_{ii} = \sum_j w_{ij}$
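The identity between the pairwise penalty and the Laplacian quadratic form can be checked numerically. The sketch below assumes a symmetric similarity matrix and a scalar latent value per node; the toy values and function names are illustrative assumptions.

```python
def laplacian(W):
    """L = D - W, where D is diagonal with D_ii = sum_j w_ij."""
    n = len(W)
    return [
        [(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
        for i in range(n)
    ]

def pairwise_penalty(W, f):
    """(1/2) * sum over all ordered pairs (i, j) of w_ij * (f_i - f_j)^2."""
    n = len(W)
    return 0.5 * sum(W[i][j] * (f[i] - f[j]) ** 2 for i in range(n) for j in range(n))

def quadratic_form(L, f):
    """f^T L f."""
    n = len(L)
    return sum(f[i] * L[i][j] * f[j] for i in range(n) for j in range(n))

# Symmetric similarities among three nodes, and one latent value per node.
W = [
    [0.0, 1.0, 0.5],
    [1.0, 0.0, 0.0],
    [0.5, 0.0, 0.0],
]
f = [1.0, 2.0, 4.0]

lhs = pairwise_penalty(W, f)
rhs = quadratic_form(laplacian(W), f)
```

Both sides evaluate to 5.5 here. The penalty grows when strongly connected nodes (large $w_{ij}$) receive distant latent values, which is what pushes similar entities together during training.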

slide-30
SLIDE 30

Recommender Systems with Social Regularization [Ma et al., WSDM’11]

  • Input: Social Relation + Rating Matrix


slide-31
SLIDE 31

Two Regularization Forms

  • Model 1: Average-based Regularization
  • We are similar to the average of our friends
  • Model 2: Individual-based Regularization
  • We are similar to each of our friends

Similarity can be propagated via friends: transitivity!
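The difference between the two forms can be made concrete with a small sketch. It assumes the average-based term penalizes the squared distance between a user's latent vector and the similarity-weighted average of the friends' vectors, and the individual-based term penalizes the similarity-weighted squared distance to each friend separately. Trade-off coefficients are omitted, and the data layout and names are assumptions.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def average_based(U, friends, sim):
    """Model 1: each user should be close to the weighted average of their friends."""
    total = 0.0
    for i, fs in friends.items():
        if not fs:
            continue
        weight = sum(sim[(i, f)] for f in fs)
        avg = [
            sum(sim[(i, f)] * U[f][d] for f in fs) / weight
            for d in range(len(U[i]))
        ]
        total += sq_dist(U[i], avg)
    return total

def individual_based(U, friends, sim):
    """Model 2: each user should be close to every friend individually."""
    return sum(
        sim[(i, f)] * sq_dist(U[i], U[f])
        for i, fs in friends.items()
        for f in fs
    )

# Hypothetical example: user 0 has two friends with opposite tastes.
U = {0: [0.0, 0.0], 1: [1.0, 0.0], 2: [-1.0, 0.0]}
friends = {0: [1, 2]}
sim = {(0, 1): 1.0, (0, 2): 1.0}

penalty_avg = average_based(U, friends, sim)
penalty_ind = individual_based(U, friends, sim)
```

Here user 0 sits exactly at the average of two dissimilar friends, so the average-based penalty is 0 while the individual-based penalty is 2. The individual-based form keeps track of each friend's taste, which the average can wash out when friends are diverse.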

slide-32
SLIDE 32

How to compute similarity between two users?

  • Cosine similarity (VSS)
  • Pearson correlation coefficient (PCC)

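Both measures can be sketched on rating dictionaries. This version assumes similarities are computed over the items two users have both rated, and that the Pearson means are taken over those co-rated items; other conventions (for example, each user's overall mean rating) are possible. The users and ratings are hypothetical.

```python
import math

def common_items(ra, rb):
    """Items rated by both users."""
    return sorted(set(ra) & set(rb))

def vss(ra, rb):
    """Cosine (vector space) similarity over co-rated items."""
    items = common_items(ra, rb)
    num = sum(ra[i] * rb[i] for i in items)
    den = math.sqrt(sum(ra[i] ** 2 for i in items)) * math.sqrt(sum(rb[i] ** 2 for i in items))
    return num / den if den else 0.0

def pcc(ra, rb):
    """Pearson correlation coefficient over co-rated items."""
    items = common_items(ra, rb)
    if not items:
        return 0.0
    mean_a = sum(ra[i] for i in items) / len(items)
    mean_b = sum(rb[i] for i in items) / len(items)
    num = sum((ra[i] - mean_a) * (rb[i] - mean_b) for i in items)
    den = math.sqrt(sum((ra[i] - mean_a) ** 2 for i in items)) * math.sqrt(
        sum((rb[i] - mean_b) ** 2 for i in items)
    )
    return num / den if den else 0.0

# Hypothetical ratings.
alice = {"Avatar": 5, "Titanic": 3, "Aliens": 4}
bob = {"Avatar": 4, "Titanic": 2, "Aliens": 3}    # same taste, stricter scale
carol = {"Avatar": 1, "Titanic": 5, "Aliens": 3}  # opposite taste
```

Because PCC subtracts each user's mean, it treats Alice and Bob as perfectly correlated even though Bob rates one point lower throughout, and it gives Alice and Carol a negative score. Cosine similarity on raw ratings is always non-negative and is less sensitive to such rating-scale differences.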

slide-33
SLIDE 33

Results


slide-34
SLIDE 34

Meta-Path-based Regularization [Yu et al., IJCAI-HINA’13]

  • What if there is more than one type of relation?
  • Solution:
  • Use meta-paths to generate similarity relations between items, e.g., movie-director-movie
  • Learn the importance score for each meta-path

[Figure: rating data combined with a heterogeneous information network]

slide-35
SLIDE 35

Notations

  • We have n users and m items.
  • By computing similarity scores of all item pairs along a certain meta-path, we can get a similarity matrix
  • With L different meta-paths, we can calculate L similarity matrices
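One way to obtain such an item-item similarity matrix is sketched below for the movie-director-movie meta-path. The slide text does not name the similarity measure, so the use of PathSim (twice the number of path instances between two items, divided by the sum of each item's path instances to itself) is an assumption, as are the toy links and function names.

```python
def commuting_matrix(A):
    """M = A A^T: entry (i, j) counts meta-path instances between items i and j."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in A] for r1 in A]

def pathsim(A):
    """PathSim along a symmetric meta-path defined by the item-to-attribute matrix A."""
    M = commuting_matrix(A)
    n = len(M)
    return [
        [
            2.0 * M[i][j] / (M[i][i] + M[j][j]) if (M[i][i] + M[j][j]) else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]

# Movie-director links (toy data).
# Rows: Avatar, Titanic, Aliens, Revolutionary Road.
# Columns: James Cameron, Sam Mendes.
movie_director = [
    [1, 0],
    [1, 0],
    [1, 0],
    [0, 1],
]

S = pathsim(movie_director)
```

The result is an m x m matrix for this meta-path. Repeating the computation with other link types (for example movie-actor or movie-genre) gives the remaining similarity matrices, one per meta-path.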
slide-36
SLIDE 36

Objective Function

[Equation annotations:]

  • Approximate R with U V product
  • Regularization on U, V
  • Regularization on θ, which is the importance score for each meta-path
  • Similar items measured from HIN should have similar low-rank representations

slide-37
SLIDE 37

Equivalent Objective Function Using Graph Laplacian


Similar items measured from HIN should have similar low-rank representations

slide-38
SLIDE 38

Dataset

  • We combine IMDb + MovieLens100K

We randomly sample training datasets of different sizes (0.4, 0.6, and 0.8)

slide-39
SLIDE 39

Results


slide-40
SLIDE 40

Recommender Systems

  • Recommendation via Information Network Analysis
  • Hybrid Collaborative Filtering with Information Networks
  • Graph Regularization for Recommendation
  • Summary

slide-41
SLIDE 41

Summary

  • Recommendation via Information Network Analysis
  • Users and items are embedded in a heterogeneous information network
  • Recommendation can be considered as a link prediction problem
  • Hybrid Collaborative Filtering with Information Networks
  • Propagate the feedback via meta-paths
  • Graph Regularization for Recommendation
  • Similar items/users should have similar latent vectors

slide-42
SLIDE 42

More about Course Project

  • Presentation
  • 20 mins + 5 mins Q&A
  • Time arrangement
  • June 5: Team 1-4
  • June 7: Team 5-8
  • Course Project Final Report + Data (link) + Code
  • Due June 12

slide-43
SLIDE 43

Peer Evaluation Questions

  • 1. Is the proposed problem interesting and novel?
  • 2. Is the problem formalization reasonable?
  • 3. Is the solution solid and reasonable?
  • 4. To what extent does the project achieve the claimed goal?
  • 5. How good is the presentation?