Effective Missing Data Prediction for Collaborative Filtering

SLIDE 1

Outline Introduction Missing Data Prediction Empirical Analysis Conclusions and Future Work

Effective Missing Data Prediction for Collaborative Filtering

Hao Ma, Irwin King, and Michael R. Lyu

Department of Computer Science and Engineering The Chinese University of Hong Kong

SIGIR 2007, Amsterdam, the Netherlands July 24, 2007

Hao Ma, Irwin King, and Michael R. Lyu Effective Missing Data Prediction for Collaborative Filtering

SLIDE 2

Outline

1. Introduction
  • Simple Examples of Recommender System
  • Definitions of Some Concepts
  • A Simple CF Example
  • Pearson Correlation Coefficient
  • Significance Weighting

2. Missing Data Prediction
  • Collaborative Filtering Challenges
  • User-Item Matrix
  • Similar Neighbors Selection
  • Missing Data Prediction
  • Parameter Discussion

3. Empirical Analysis
  • Datasets
  • Metrics
  • Summary of Experiments
  • Comparisons
  • Impact of Parameters

4. Conclusions and Future Work
  • Conclusions and Future Work

SLIDE 3

Search Using Google

SLIDE 6

Searching Products on Amazon.com
If a user is viewing the Palm Treo 750 smartphone on Amazon.com, other related information is recommended to the user in addition to the specification of the Treo 750.

SLIDE 8

Searching Products on Amazon.com
These methods are very popular in many online recommendation systems.

SLIDE 10

More Complicated Recommendations

SLIDE 15

More Complicated Recommendations
The technique Amazon.com adopts is called Collaborative Filtering!

SLIDE 17

Google
  • Similarity calculation
  • Link analysis
Amazon – Simple Example
  • The user-item matrix consists of many 0s and 1s
  • Frequent pattern mining
Amazon – Complicated Example
  • The user-item matrix consists of many ratings given by different users
  • Predict the other missing data as accurately as possible

SLIDE 21

Definition of Recommendation Systems
  • Computer programs
  • Predict items that a user may be interested in
  • Items could be movies, music, books, news, web pages, etc.
  • Given some information about the user's profile

SLIDE 26

Definition of Collaborative Filtering
  • Making automatic predictions (filtering) about the interests of a user
  • By collecting taste information from many other users (collaborating)

SLIDE 29

User-based Collaborative Filtering

SLIDE 35

User-based Collaborative Filtering
User-based collaborative filtering predicts the ratings of active users based on the ratings of similar users found in the user-item matrix.

The similarity between users can be defined as:

\[
Sim(a, u) = \frac{\sum_{i \in I(a) \cap I(u)} (r_{a,i} - \bar{r}_a) \cdot (r_{u,i} - \bar{r}_u)}{\sqrt{\sum_{i \in I(a) \cap I(u)} (r_{a,i} - \bar{r}_a)^2} \cdot \sqrt{\sum_{i \in I(a) \cap I(u)} (r_{u,i} - \bar{r}_u)^2}}
\]

where I(a) ∩ I(u) is the set of items rated by both user a and user u, r_{a,i} is the rating user a gave item i, and \bar{r}_a is the average rating of user a.

Sim(a, u) ranges over [−1, 1], and a larger value means users a and u are more similar.
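
The user similarity above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the dict-of-dicts layout, the names `ratings` and `user_similarity`, and the toy data are assumptions. The sketch also assumes each user's average is taken over all items that user rated, and that an undefined correlation (no co-rated items, or zero variance) is reported as 0.

```python
from math import sqrt

# Toy data (hypothetical): user -> {item -> rating}
ratings = {
    "alice": {"i1": 5, "i2": 3, "i3": 4},
    "bob": {"i1": 4, "i2": 2, "i3": 3},
    "carol": {"i1": 1, "i2": 5, "i3": 2},
}

def user_similarity(ratings, a, u):
    """Pearson correlation between users a and u over their co-rated items."""
    common = set(ratings[a]) & set(ratings[u])  # I(a) ∩ I(u)
    if not common:
        return 0.0  # assumption: no co-rated items means similarity 0
    # Average rating of each user over all items that user rated.
    mean_a = sum(ratings[a].values()) / len(ratings[a])
    mean_u = sum(ratings[u].values()) / len(ratings[u])
    num = sum((ratings[a][i] - mean_a) * (ratings[u][i] - mean_u) for i in common)
    den_a = sqrt(sum((ratings[a][i] - mean_a) ** 2 for i in common))
    den_u = sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common))
    if den_a == 0 or den_u == 0:
        return 0.0  # assumption: undefined correlation is treated as 0
    return num / (den_a * den_u)

print(user_similarity(ratings, "alice", "bob"))    # same rating pattern, close to 1
print(user_similarity(ratings, "alice", "carol"))  # opposite pattern, negative
```

In the toy data, alice and bob deviate from their own averages in the same way, so their similarity is close to 1, while alice and carol get a negative value.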

SLIDE 41

Item-based Collaborative Filtering
Item-based collaborative filtering predicts the ratings of active users based on the computed information of similar items.

The similarity between items can be defined as:

\[
Sim(i, j) = \frac{\sum_{u \in U(i) \cap U(j)} (r_{u,i} - \bar{r}_i) \cdot (r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U(i) \cap U(j)} (r_{u,i} - \bar{r}_i)^2} \cdot \sqrt{\sum_{u \in U(i) \cap U(j)} (r_{u,j} - \bar{r}_j)^2}}
\]

where U(i) ∩ U(j) is the set of users who rated both item i and item j, and \bar{r}_i is the average rating of item i.

Like user similarity, item similarity Sim(i, j) also ranges over [−1, 1].
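
The item similarity can be sketched directly on a small user-item matrix, working column-wise instead of row-wise. This is a minimal sketch under stated assumptions: the matrix is a list of user rows with `None` for missing ratings, the names `R` and `item_similarity` are hypothetical, each item's average is taken over all users who rated it, and an undefined correlation is reported as 0.

```python
from math import sqrt

# Toy user-item matrix (hypothetical): rows are users, columns are items,
# None marks a missing rating.
R = [
    [5, 4, None],
    [3, 2, 5],
    [4, 3, 1],
    [None, 1, 4],
]

def item_similarity(matrix, i, j):
    """Pearson correlation between item columns i and j."""
    def column_mean(k):
        vals = [row[k] for row in matrix if row[k] is not None]
        return sum(vals) / len(vals)

    # U(i) ∩ U(j): rows of users who rated both items.
    common = [row for row in matrix if row[i] is not None and row[j] is not None]
    if not common:
        return 0.0  # assumption: no common raters means similarity 0
    mean_i, mean_j = column_mean(i), column_mean(j)
    num = sum((row[i] - mean_i) * (row[j] - mean_j) for row in common)
    den_i = sqrt(sum((row[i] - mean_i) ** 2 for row in common))
    den_j = sqrt(sum((row[j] - mean_j) ** 2 for row in common))
    den = den_i * den_j
    return num / den if den else 0.0  # assumption: zero variance gives 0

print(item_similarity(R, 0, 1))  # rated alike by common users, positive
print(item_similarity(R, 0, 2))  # rated in opposite directions, negative
```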

SLIDE 45

An Example

SLIDE 48

Significance Weighting
A similarity computed from only a few co-rated items is unreliable. We use the following equation to solve this problem:

\[
Sim'(a, u) = \frac{\min(|I_a \cap I_u|, \gamma)}{\gamma} \cdot Sim(a, u),
\]

where |I_a ∩ I_u| is the number of items that user a and user u have both rated.

The similarity between items can then be defined as:

\[
Sim'(i, j) = \frac{\min(|U_i \cap U_j|, \delta)}{\delta} \cdot Sim(i, j),
\]

where |U_i ∩ U_j| is the number of users who rated both item i and item j.
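
The effect of significance weighting is easy to see numerically. The sketch below applies the same scaling to either a user similarity (threshold γ) or an item similarity (threshold δ); the function name `significance_weight` and the threshold value 30 are illustrative assumptions, not values taken from the slides.

```python
def significance_weight(sim, n_common, threshold):
    """Scale a similarity by min(n_common, threshold) / threshold.

    n_common is |I_a ∩ I_u| for users or |U_i ∩ U_j| for items;
    threshold plays the role of gamma or delta.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return min(n_common, threshold) / threshold * sim

# With an assumed threshold of 30, a similarity based on 3 common ratings is
# devalued to a tenth of its value, while one based on 50 is left unchanged.
print(significance_weight(0.9, 3, 30))
print(significance_weight(0.9, 50, 30))
```

Once the overlap reaches the threshold the factor is exactly 1, so well-supported similarities are never boosted, only weakly supported ones are reduced.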

SLIDE 51

User-Item Matrix
Challenges of Collaborative Filtering
  • Data Sparsity
  • Prediction Accuracy
  • Scalability
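
Data sparsity can be made concrete by measuring the density of a user-item matrix, that is, the fraction of entries that actually hold a rating. The following is a small sketch; the `None`-for-missing layout, the name `density`, and the toy matrix are assumptions for illustration.

```python
def density(matrix):
    """Fraction of user-item entries that hold an observed rating."""
    total = sum(len(row) for row in matrix)
    observed = sum(1 for row in matrix for r in row if r is not None)
    return observed / total if total else 0.0

# Toy matrix (hypothetical): 3 users x 4 items, only 4 of 12 entries rated.
sparse_R = [
    [5, None, None, 1],
    [None, 3, None, None],
    [None, None, 4, None],
]

print(density(sparse_R))  # a third of the entries are observed
```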

SLIDE 56

Challenges of Collaborative Filtering
  • Data Sparsity
  • Prediction Accuracy
  • Scalability
Data Sparsity
  • Propose an algorithm to increase the density of the user-item matrix
  • Only predict some of the missing data
Prediction Accuracy
  • Adopt significance weighting
  • Linearly combine user information with item information
  • Predict the missing data with high confidence
Our algorithm improves prediction accuracy by 6.24% on average over other state-of-the-art methods.
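
One possible reading of "linearly combine user information with item information" and "only predict some of the missing data" is sketched below. This is not the authors' exact procedure: the function name `combine_predictions`, the mixing weight `lam`, and the use of `None` for "no confident prediction" are all assumptions made for illustration.

```python
def combine_predictions(user_pred, item_pred, lam=0.5):
    """Linearly blend a user-based and an item-based prediction.

    Either input may be None when that side has no reliable neighbors.
    Returns None when neither side can predict, so the entry stays missing.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if user_pred is None and item_pred is None:
        return None  # leave the entry missing rather than guess
    if user_pred is None:
        return item_pred
    if item_pred is None:
        return user_pred
    return lam * user_pred + (1.0 - lam) * item_pred

print(combine_predictions(4.0, 2.0, lam=0.5))  # both sides available
print(combine_predictions(None, 2.0))          # only item-based available
print(combine_predictions(None, None))         # entry is left missing
```

Leaving an entry missing when neither side is confident is what keeps the filled-in matrix from being polluted by low-quality guesses.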


Hao Ma, Irwin King, and Michael R. Lyu Effective Missing Data Prediction for Collaborative Filtering

slide-62
SLIDE 62

Outline Introduction Missing Data Prediction Empirical Analysis Conclusions and Future Work Collaborative Filtering Challenges User-Item Matrix Similar Neighbors Selection Missing Data Prediction Parameter Discussion

Challenges of Collaborative Filtering Data Sparsity Prediction Accuracy Scalability Data Sparsity Propose an algorithm to increase the density of User-Item Matrix Only predict some of the missing data Prediction Accuracy Adopt significance weighting Linearly combine user information with item information Predict the missing data with high confidence Our algorithm increases 6.24% of prediction accuracy over other state-of-the-art methods in average

Hao Ma, Irwin King, and Michael R. Lyu Effective Missing Data Prediction for Collaborative Filtering

slide-63
SLIDE 63

Outline Introduction Missing Data Prediction Empirical Analysis Conclusions and Future Work Collaborative Filtering Challenges User-Item Matrix Similar Neighbors Selection Missing Data Prediction Parameter Discussion

Challenges of Collaborative Filtering Data Sparsity Prediction Accuracy Scalability Data Sparsity Propose an algorithm to increase the density of User-Item Matrix Only predict some of the missing data Prediction Accuracy Adopt significance weighting Linearly combine user information with item information Predict the missing data with high confidence Our algorithm increases 6.24% of prediction accuracy over other state-of-the-art methods in average

Hao Ma, Irwin King, and Michael R. Lyu Effective Missing Data Prediction for Collaborative Filtering

slide-64
SLIDE 64

Outline Introduction Missing Data Prediction Empirical Analysis Conclusions and Future Work Collaborative Filtering Challenges User-Item Matrix Similar Neighbors Selection Missing Data Prediction Parameter Discussion

Challenges of Collaborative Filtering Data Sparsity Prediction Accuracy Scalability Data Sparsity Propose an algorithm to increase the density of User-Item Matrix Only predict some of the missing data Prediction Accuracy Adopt significance weighting Linearly combine user information with item information Predict the missing data with high confidence Our algorithm increases 6.24% of prediction accuracy over other state-of-the-art methods in average

Hao Ma, Irwin King, and Michael R. Lyu Effective Missing Data Prediction for Collaborative Filtering

slide-65
SLIDE 65

Outline Introduction Missing Data Prediction Empirical Analysis Conclusions and Future Work Collaborative Filtering Challenges User-Item Matrix Similar Neighbors Selection Missing Data Prediction Parameter Discussion

Figure: User-Item Matrix and Predicted User-Item Matrix

slide-66
SLIDE 66

Similar Neighbors Selection

For every missing data $r_{u,i}$, a set of similar users $S(u)$ towards user $u$ can be generated according to:

$$S(u) = \{ u_a \mid Sim'(u_a, u) > \eta,\ u_a \neq u \}$$

where $Sim'(u_a, u)$ is computed using Significance Weighting, and $\eta$ is the user similarity threshold.

At the same time, for every missing data $r_{u,i}$, a set of similar items $S(i)$ towards item $i$ can be generated according to:

$$S(i) = \{ i_k \mid Sim'(i_k, i) > \theta,\ i_k \neq i \}$$

where $\theta$ is the item similarity threshold.
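The threshold rule for S(u) and S(i) can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: it assumes the significance-weighted similarities Sim′ have already been computed and stored as a list of rows, and the names `similar_neighbors`, `sim`, `target`, and `threshold` are hypothetical.

```python
def similar_neighbors(sim_row, target, threshold):
    """Indices whose similarity to `target` exceeds `threshold`.

    `sim_row` holds the (assumed precomputed) Sim' values between `target`
    and every user (or item); `target` itself is always excluded.
    """
    return [j for j, s in enumerate(sim_row) if j != target and s > threshold]


# Hypothetical similarity matrix for four users (illustrative values only).
sim = [
    [1.0, 0.6, 0.2, 0.5],
    [0.6, 1.0, 0.1, 0.3],
    [0.2, 0.1, 1.0, 0.7],
    [0.5, 0.3, 0.7, 1.0],
]

neighbors_of_0 = similar_neighbors(sim[0], target=0, threshold=0.4)
neighbors_of_2 = similar_neighbors(sim[2], target=2, threshold=0.9)
```

The same function serves for items by passing a row of item-item similarities and θ as the threshold. An empty result corresponds to S(u) = ∅ or S(i) = ∅.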

slide-69
SLIDE 69

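Missing Data Prediction Algorithm

Given the missing data $r_{u,i}$, if $S(u) \neq \emptyset \wedge S(i) \neq \emptyset$, the prediction of missing data $P(r_{u,i})$ is defined as:

$$P(r_{u,i}) = \lambda \times \left( \bar{u} + \frac{\sum_{u_a \in S(u)} Sim'(u_a, u) \cdot (r_{u_a,i} - \bar{u}_a)}{\sum_{u_a \in S(u)} Sim'(u_a, u)} \right) + (1 - \lambda) \times \left( \bar{i} + \frac{\sum_{i_k \in S(i)} Sim'(i_k, i) \cdot (r_{u,i_k} - \bar{i}_k)}{\sum_{i_k \in S(i)} Sim'(i_k, i)} \right)$$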
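As a rough sketch of how the combined formula could be evaluated, the Python below computes the user-based and item-based parts and blends them with λ. It is an illustration under stated assumptions rather than the authors' code: each neighbor is passed as a hypothetical `(similarity, rating, neighbor_mean)` tuple, every listed neighbor is assumed to have a rating for the entry in question, and `user_mean` and `item_mean` stand for the mean ratings of user u and item i.

```python
def deviation_estimate(base_mean, neighbors):
    """base_mean + sum(sim * (rating - neighbor_mean)) / sum(sim).

    `neighbors` is a non-empty list of (similarity, rating, neighbor_mean)
    tuples; similarities are assumed positive so the denominator is non-zero.
    """
    numerator = sum(s * (r - m) for s, r, m in neighbors)
    denominator = sum(s for s, _, _ in neighbors)
    return base_mean + numerator / denominator


def combined_prediction(lam, user_mean, user_neighbors, item_mean, item_neighbors):
    """Linear blend of the user-based and item-based estimates."""
    user_part = deviation_estimate(user_mean, user_neighbors)
    item_part = deviation_estimate(item_mean, item_neighbors)
    return lam * user_part + (1.0 - lam) * item_part


# Illustrative numbers only (not taken from the paper's experiments).
prediction = combined_prediction(
    lam=0.5,
    user_mean=3.0,
    user_neighbors=[(0.5, 4.0, 3.0), (0.5, 5.0, 4.0)],  # user part = 4.0
    item_mean=2.0,
    item_neighbors=[(1.0, 3.0, 2.5)],                    # item part = 2.5
)
```

Subtracting each neighbor's own mean before averaging compensates for users who rate systematically high or low, and likewise for items.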

slide-71
SLIDE 71

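Missing Data Prediction Algorithm

If $S(u) \neq \emptyset \wedge S(i) = \emptyset$, the prediction of missing data $P(r_{u,i})$ is defined as:

$$P(r_{u,i}) = \bar{u} + \frac{\sum_{u_a \in S(u)} Sim'(u_a, u) \cdot (r_{u_a,i} - \bar{u}_a)}{\sum_{u_a \in S(u)} Sim'(u_a, u)}$$

If $S(u) = \emptyset \wedge S(i) \neq \emptyset$, the prediction of missing data $P(r_{u,i})$ is defined as:

$$P(r_{u,i}) = \bar{i} + \frac{\sum_{i_k \in S(i)} Sim'(i_k, i) \cdot (r_{u,i_k} - \bar{i}_k)}{\sum_{i_k \in S(i)} Sim'(i_k, i)}$$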

slide-74
SLIDE 74

Missing Data Prediction Algorithm

If $S(u) = \emptyset \wedge S(i) = \emptyset$, the prediction of missing data $P(r_{u,i})$ is defined as:

$$P(r_{u,i}) = 0$$

This differs from all other existing prediction or smoothing methods: they always try to predict all the missing data in the user-item matrix, so some missing data are predicted with poor quality.
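The case analysis of the Missing Data Prediction Algorithm (both neighbor sets non-empty, only one non-empty, both empty) can be summarized as a small dispatch function. This is a hedged sketch: `predict_missing`, `user_part`, and `item_part` are hypothetical names, and the two estimates are assumed to be computed elsewhere, with `None` standing for an empty neighbor set.

```python
from typing import Optional


def predict_missing(user_part: Optional[float],
                    item_part: Optional[float],
                    lam: float) -> float:
    """Pick the prediction rule according to which neighbor sets are non-empty.

    `user_part` is the user-based estimate (None if S(u) is empty);
    `item_part` is the item-based estimate (None if S(i) is empty).
    """
    if user_part is not None and item_part is not None:
        return lam * user_part + (1.0 - lam) * item_part
    if user_part is not None:
        return user_part
    if item_part is not None:
        return item_part
    return 0.0  # no confident neighbors: the entry is left unpredicted


both = predict_missing(4.0, 2.5, lam=0.5)
user_only = predict_missing(4.0, None, lam=0.5)
item_only = predict_missing(None, 2.5, lam=0.5)
neither = predict_missing(None, None, lam=0.5)
```

Returning 0 in the last case keeps the entry empty in the user-item matrix instead of filling it with a low-confidence guess.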

slide-77
SLIDE 77

Parameters γ, δ, η, θ, λ

Discussion on γ and δ
  Employed to avoid overestimating the user similarities and item similarities
  Too high ⇒ users or items do not have enough neighbors ⇒ decrease of prediction accuracy
  Too low ⇒ overestimate problem still exists ⇒ decrease of prediction accuracy
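To make the role of γ and δ concrete, the sketch below assumes the significance-weighting form Sim′ = min(n, cap) / cap × Sim, where n is the number of co-rated items (or co-rating users) and cap plays the role of γ (or δ). That form is recalled from the deck's earlier Significance Weighting slide and should be checked against it; the function and variable names are hypothetical.

```python
def significance_weighted(sim: float, num_co_rated: int, cap: int) -> float:
    """Devalue `sim` when it is based on fewer than `cap` co-rated entries."""
    return min(num_co_rated, cap) / cap * sim


raw_sim = 0.9  # illustrative raw similarity

# Only 3 co-rated items against a cap of 30: the similarity is scaled by 0.1.
few_overlap = significance_weighted(raw_sim, num_co_rated=3, cap=30)

# 50 co-rated items against the same cap: the similarity is left unchanged.
many_overlap = significance_weighted(raw_sim, num_co_rated=50, cap=30)
```

With a very large cap almost every pair is devalued and may fall below the neighbor thresholds, which matches the "not enough neighbors" effect. With a very small cap, pairs that share only a few ratings keep nearly their full similarity, so the overestimation remains.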

slide-81
SLIDE 81

Parameters γ, δ, η, θ, λ

Discussion on η and θ
  Thresholds to select neighbors
  Too high ⇒ few missing data are predicted ⇒ user-item matrix remains very sparse
  Too low ⇒ almost all the missing data are predicted ⇒ user-item matrix becomes very dense

slide-85
SLIDE 85

Parameters γ, δ, η, θ, λ

Discussion on λ
  Determines how closely the rating prediction relies on user information or item information
  λ = 1 ⇒ prediction depends completely upon user-based information
  λ = 0 ⇒ prediction depends completely upon item-based information

slide-89
SLIDE 89

Parameter Discussion

Table: The relationship between parameters and other CF approaches (MDP: Missing Data Prediction)

λ   η   θ   Related CF Approaches
1   1   1   User-based CF without MDP
0   1   1   Item-based CF without MDP
1   0   0   User-based CF with full MDP
0   0   0   Item-based CF with full MDP

slide-91
SLIDE 91

MovieLens

It contains 100,000 ratings (on a 1-5 scale) given by 943 users to 1,682 movies, and each user rated at least 20 movies.
The density of the user-item matrix is:

$$\frac{100000}{943 \times 1682} = 6.30\%$$

The statistics of the MovieLens dataset are summarized in the following table:

Table: Statistics of Dataset MovieLens

Statistics              User      Item
Min. Num. of Ratings    20        1
Max. Num. of Ratings    737       583
Avg. Num. of Ratings    106.04    59.45
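The density and the two averages in the table follow directly from the three dataset counts. A short Python check (variable names are illustrative):

```python
num_ratings = 100_000
num_users = 943
num_items = 1_682

# Fraction of user-item pairs that carry a rating.
density = num_ratings / (num_users * num_items)

# Average number of ratings per user and per item.
avg_per_user = num_ratings / num_users
avg_per_item = num_ratings / num_items

print(f"density = {density:.2%}")                     # density = 6.30%
print(f"avg ratings per user = {avg_per_user:.2f}")   # avg ratings per user = 106.04
print(f"avg ratings per item = {avg_per_item:.2f}")   # avg ratings per item = 59.45
```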

slide-94
SLIDE 94

Mean Absolute Errors

We use the Mean Absolute Error (MAE) metric to compare the prediction quality of our proposed approach with that of other collaborative filtering methods.
MAE is defined as:

$$MAE = \frac{\sum_{u,i} |r_{u,i} - \hat{r}_{u,i}|}{N},$$

where $r_{u,i}$ denotes the rating that user $u$ gave to item $i$, $\hat{r}_{u,i}$ denotes the rating of user $u$ for item $i$ as predicted by our approach, and $N$ denotes the number of tested ratings.
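The MAE definition translates directly into code. The sketch below is a plain-Python illustration with made-up ratings; `mean_absolute_error`, `actual`, and `predicted` are hypothetical names.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between true and predicted ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative held-out ratings and predictions (not from the experiments).
actual = [4, 3, 5, 2]
predicted = [3.5, 3.0, 4.0, 3.0]

mae = mean_absolute_error(actual, predicted)  # (0.5 + 0 + 1 + 1) / 4
```

A smaller value means the predictions are closer to the held-out ratings, which is how the comparison tables in this section should be read.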

slide-97
SLIDE 97

Summary of Experiments
  Comparisons with Traditional PCC Methods
  Comparisons with State-of-the-Art Algorithms
  Impact of Missing Data Prediction
  Impact of γ and δ
  Impact of λ
  Impact of η and θ

Comparisons with Traditional PCC Methods
  User-based collaborative filtering using Pearson Correlation Coefficient
  Item-based collaborative filtering using Pearson Correlation Coefficient

Comparisons with State-of-the-Art Algorithms
  Similarity Fusion (SF) [J. Wang, et al., SIGIR 2006]
  Smoothing and Cluster-Based PCC (SCBPCC) [G. Xue, et al., SIGIR 2005]
  Aspect Model (AM) [T. Hofmann, TOIS 2004]
  Personality Diagnosis (PD) [D. M. Pennock, et al., UAI 2000]

Impact of Missing Data Prediction
  Effective Missing Data Prediction (EMDP)
  Predict Every Missing Data (PEMD)

Impact of Parameters
  Impact of each parameter

slide-102
SLIDE 102

MAE Comparisons with PCC Methods

Table: MAE comparison with other approaches (A smaller MAE value means a better performance)

Training Users   Methods   Given5   Given10   Given20
MovieLens 300    EMDP      0.784    0.765     0.755
                 UPCC      0.838    0.814     0.802
                 IPCC      0.870    0.838     0.813
MovieLens 200    EMDP      0.796    0.770     0.761
                 UPCC      0.843    0.822     0.807
                 IPCC      0.855    0.834     0.812
MovieLens 100    EMDP      0.811    0.778     0.769
                 UPCC      0.876    0.847     0.811
                 IPCC      0.890    0.850     0.824

slide-104
SLIDE 104

MAE Comparisons with State-of-the-Art Algorithms

Table: MAE comparison with state-of-the-art algorithms (A smaller MAE value means a better performance)

Num. of Training Users   100                     200                     300
Ratings Given            5       10      20      5       10      20      5       10      20
EMDP                     0.807   0.769   0.765   0.793   0.760   0.751   0.788   0.754   0.746
SF                       0.847   0.774   0.792   0.827   0.773   0.783   0.804   0.761   0.769
SCBPCC                   0.848   0.819   0.789   0.831   0.813   0.784   0.822   0.810   0.778
AM                       0.963   0.922   0.887   0.849   0.837   0.815   0.820   0.822   0.796
PD                       0.849   0.817   0.808   0.836   0.815   0.792   0.827   0.815   0.789
PCC                      0.874   0.836   0.818   0.859   0.829   0.813   0.849   0.841   0.820

slide-105
SLIDE 105

Outline Introduction Missing Data Prediction Empirical Analysis Conclusions and Future Work Datasets Metrics Summary of Experiments Comparisons Impact of Parameters

MAE Comparisons with State-of-the-Art Algorithms

Table: MAE comparison with state-of-the-art algorithms (A smaller MAE value means a better performance)

  • Num. of Training Users

100 200 300 Ratings Given 5 10 20 5 10 20 5 10 20 EMDP 0.807 0.769 0.765 0.793 0.760 0.751 0.788 0.754 0.746 SF 0.847 0.774 0.792 0.827 0.773 0.783 0.804 0.761 0.769 SCBPCC 0.848 0.819 0.789 0.831 0.813 0.784 0.822 0.810 0.778 AM 0.963 0.922 0.887 0.849 0.837 0.815 0.820 0.822 0.796 PD 0.849 0.817 0.808 0.836 0.815 0.792 0.827 0.815 0.789 PCC 0.874 0.836 0.818 0.859 0.829 0.813 0.849 0.841 0.820 Hao Ma, Irwin King, and Michael R. Lyu Effective Missing Data Prediction for Collaborative Filtering

slide-106
SLIDE 106

Impact of Missing Data Prediction

Figure: MAE Comparison of EMDP and PEMD (A smaller MAE value means a better performance). x-axis: Lambda, 0.1 to 1; y-axis: MAE, 0.74 to 0.83; curves: EMDP and PEMD, each at Given5, Given10, and Given20.

SLIDE 107

Impact of γ and δ

Figure: Impact of γ and δ on MAE and Matrix Density
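
As a reminder of what these two parameters control, here is a minimal sketch of significance weighting. It assumes γ caps the number of co-rated items used when weighting a user-user similarity and δ plays the same role for the users shared by two items; the function name and the sample numbers are hypothetical.

```python
def significance_weight(similarity: float, n_common: int, cap: int) -> float:
    """Devalue a similarity that rests on few common ratings.

    The similarity is scaled by min(n_common, cap) / cap, so it is left
    unchanged once at least `cap` ratings are shared.
    `cap` stands for gamma (user similarity) or delta (item similarity).
    """
    if cap <= 0:
        raise ValueError("cap must be positive")
    return min(n_common, cap) / cap * similarity


# Two users with the same raw similarity but different overlap sizes.
weighted_few = significance_weight(0.9, n_common=3, cap=30)
weighted_many = significance_weight(0.9, n_common=40, cap=30)

print(weighted_few, weighted_many)
```

Under this assumption, a larger cap shrinks more similarities, which in turn leaves fewer neighbors above the selection thresholds and therefore changes both MAE and matrix density.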

SLIDE 108

Impact of λ

Figure: Impact of λ on MAE
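
The role of λ can be illustrated with a minimal sketch, assuming λ linearly weights a user-based prediction against an item-based prediction when both are available. The function name and the sample ratings are hypothetical.

```python
def combine_predictions(user_pred: float, item_pred: float, lam: float) -> float:
    """Blend a user-based and an item-based prediction.

    lam = 1 relies only on the user-based prediction,
    lam = 0 relies only on the item-based prediction.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * user_pred + (1.0 - lam) * item_pred


# Hypothetical predictions for one missing rating.
user_based = 4.0
item_based = 3.0

for lam in (0.0, 0.3, 0.7, 1.0):
    blended = combine_predictions(user_based, item_based, lam)
    print(f"lambda={lam:.1f} -> prediction={blended:.2f}")
```

Sweeping λ over a grid like this and recording the MAE at each value is one way to produce a curve of the kind shown in the figure.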

SLIDE 109

Impact of η and θ

Figure: Impact of η and θ on MAE and Density
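
The link between these thresholds and density can be sketched as follows, assuming η and θ are the minimum similarities a user or an item must exceed to count as a neighbor. The function name and the similarity values are hypothetical.

```python
from typing import Dict, List


def select_neighbors(similarities: Dict[str, float], threshold: float) -> List[str]:
    """Keep only the candidates whose similarity exceeds the threshold.

    `threshold` stands for eta (similar users) or theta (similar items).
    """
    return [name for name, sim in similarities.items() if sim > threshold]


# Hypothetical similarities between one active user and three candidates.
candidate_sims = {"u1": 0.82, "u2": 0.41, "u3": 0.15}

loose = select_neighbors(candidate_sims, threshold=0.3)
strict = select_neighbors(candidate_sims, threshold=0.5)
none_left = select_neighbors(candidate_sims, threshold=0.9)

print(loose, strict, none_left)
```

Under this assumption, raising a threshold empties more neighbor sets, so fewer missing entries can be predicted and the filled matrix becomes less dense.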

SLIDE 110

Conclusions
  • Proposes an effective missing data prediction algorithm for Collaborative Filtering
  • Combines user information and item information
  • Outperforms other state-of-the-art collaborative filtering approaches

Future Work
  • Explore the relationship between user information and item information
  • Analyze and improve the scalability of our algorithm
  • Employ more metrics to measure our algorithm

SLIDE 118

Q & A
Home Page: http://www.cse.cuhk.edu.hk/~hma
Email: hma@cse.cuhk.edu.hk
Thanks!
