SLIDE 1

Item-based vs User-based Collaborative Recommendation Predictions

Joel Azzopardi

Department of Artificial Intelligence
Faculty of ICT
University of Malta
joel.azzopardi@um.edu.mt

September 2017

Joel Azzopardi (University of Malta) IKC 2017, Gdansk, Poland September 2017 1 / 21

SLIDE 2

Overview

1. The Problem
2. Background
3. Research Questions
4. Methodology
5. Evaluation
6. Conclusions

SLIDE 3

The Problem

SLIDE 4

The Problem

Information overload.

  • Information Retrieval: the user ‘pulls’ relevant information by submitting a query.
  • Recommendation Systems: the system ‘pushes’ relevant information to the user, based on a user model.

Main challenge: handling large amounts of data efficiently and effectively.

SLIDE 5

Background

SLIDE 6

Recommendation Approaches

Content-based techniques: recommendation is performed on the basis of similarity between the content of the different items (documents).

  • Features need to be extracted from the items (documents).
  • Do not suffer from the new user/item problem or from the sparse matrix problem.
  • Suitable for items with high turnover (e.g. news).

Collaborative techniques: recommendation is performed on the basis of what other ‘similar’ users have found useful.

  • Do not use features from the items/documents.
  • Need substantial user-item rating overlap.

SLIDE 7

Collaborative Recommendation

More effective than content-based approaches.
Exploit the fact that humans enjoy sharing their opinions with others.
Two main types:

  • User-based: an item’s recommendation score for a user is calculated from that item’s ratings by other, similar users.
  • Item-based: an item’s rating is predicted from how that user has rated similar items.
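The two views differ only in which axis of the rating data is compared. The sketch below illustrates this on hypothetical toy ratings. Cosine similarity is an assumption made for illustration, since the slides do not name the similarity measure used.

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. Hypothetical data for illustration only.
ratings = {
    "u1": {"A": 5, "B": 3},
    "u2": {"A": 4, "B": 2, "C": 5},
    "u3": {"B": 5, "C": 1},
}

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    shared = set(a) & set(b)
    dot = sum(a[key] * b[key] for key in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def transpose(user_ratings):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    by_item = {}
    for user, items in user_ratings.items():
        for item, rating in items.items():
            by_item.setdefault(item, {})[user] = rating
    return by_item

# User-based view: compare two users by the items they rated.
user_sim = cosine(ratings["u1"], ratings["u2"])

# Item-based view: compare two items by the users who rated them.
item_ratings = transpose(ratings)
item_sim = cosine(item_ratings["A"], item_ratings["C"])
```

Computing `cosine` for every pair of users (or items) would give the pair-wise similarity matrix that a neighbourhood-based recommender looks up at prediction time.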

SLIDE 8

Research Questions

SLIDE 9

Research Questions

  • What will be the performance of an ensemble system combining both user-based and item-based approaches?
  • What is the effect of Latent Semantic Analysis (LSA) applied to the collaborative recommendation algorithms?
  • What is the optimal neighbourhood size for the different collaborative recommendation setups?

SLIDE 10

Latent Semantic Analysis

$X = T \cdot S \cdot D^{T}$

Figure 1: Latent Semantic Analysis Process, from: http://www.slideshare.net/vitomirkovanovic/topic-modeling-for-learning-analytics-researchers-lak15-tutorial, September 2016
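The slide gives only the factorisation, so the following spells out the dimensionality-reduction step under the usual singular value decomposition conventions (an assumption here): S is diagonal with singular values in decreasing order, and LSA keeps only the k largest.

```latex
X = T \, S \, D^{T}, \qquad
S = \operatorname{diag}(\sigma_1, \sigma_2, \dots, \sigma_r), \quad
\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r \ge 0

X_k = T_k \, S_k \, D_k^{T}, \qquad
S_k = \operatorname{diag}(\sigma_1, \dots, \sigma_k), \quad k \le r
```

Here T_k and D_k hold the first k columns of T and D. Among all matrices of rank at most k, X_k is the closest to X in the Frobenius norm, which is why discarding the smaller singular values can be read as removing noise while keeping the dominant structure.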

SLIDE 11

Methodology

SLIDE 12

Collaborative Recommendation Algorithm

predictRating-SimUsers(UserSimMatrix, UserID, ItemID, k)
    CandidateRatings ← ∅
    SimUsers ← getSimilarUsers(UserSimMatrix, UserID)
    curk ← 0
    while (curk < k)
        user ← getNextMostSimilarUser(SimUsers)
        SimUserRating ← getUserItemRating(user, ItemID)
        if (exists(SimUserRating))
            updateCandidateRatings(CandidateRatings, SimUserRating, Similarity(user, UserID))
            curk ← curk + 1
        end if
    end while
    return (getHighestWeightedCandidate(CandidateRatings))
end
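A minimal Python sketch of the same procedure follows. The dictionary layouts for the similarity matrix and the ratings are assumptions made for illustration, as is the choice to return `None` when no neighbour has rated the item. Unlike the pseudocode, the loop also ends when the neighbour list is exhausted.

```python
from collections import defaultdict

def predict_rating(user_sim, ratings, user_id, item_id, k):
    """Predict a rating from the k most similar users who rated item_id.

    user_sim: dict user -> {other_user: similarity}   (assumed layout)
    ratings:  dict user -> {item: rating}             (assumed layout)
    Returns None when no neighbour has rated the item.
    """
    # Neighbours ordered from most to least similar.
    neighbours = sorted(user_sim.get(user_id, {}).items(),
                        key=lambda pair: pair[1], reverse=True)
    votes = defaultdict(float)   # candidate rating -> accumulated similarity
    used = 0
    for other, sim in neighbours:
        if used >= k:
            break
        rating = ratings.get(other, {}).get(item_id)
        if rating is None:
            continue             # neighbour has not rated the item: skip it
        votes[rating] += sim     # similarity-weighted vote
        used += 1                # only neighbours with a rating count towards k
    if not votes:
        return None
    # The candidate rating with the highest total weight wins.
    return max(votes, key=votes.get)

# Hypothetical toy data.
toy_sim = {"u1": {"u2": 0.9, "u3": 0.5, "u4": 0.45, "u5": 0.1}}
toy_ratings = {"u2": {"X": 4}, "u3": {"X": 5}, "u4": {"X": 5}, "u5": {"Y": 3}}
nearest_one = predict_rating(toy_sim, toy_ratings, "u1", "X", k=1)    # 4
nearest_three = predict_rating(toy_sim, toy_ratings, "u1", "X", k=3)  # 5
```

With k=1 only the closest neighbour votes, so the prediction is 4. With k=3 the two votes for 5 carry a combined weight of 0.95 and outweigh the single 0.9 vote for 4. The item-based variant would follow the same pattern with an item similarity matrix and the target user's own ratings.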

SLIDE 13

Methodology

The algorithm is based on k Nearest Neighbours (kNN).

Votes are weighted according to the neighbours’ similarities.

Use of:

  • A user pair-wise similarity matrix in user-based recommendation.
  • An item pair-wise similarity matrix in item-based recommendation.

In LSA, these similarity matrices are decomposed, and only the top dimensions are considered.

Ensemble algorithm:

  • Separate candidate user-item ratings are obtained from the user-based and item-based algorithms.
  • The two lists are merged.
  • The predicted recommendation score is set to the highest-weighted candidate score in the merged list.
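The ensemble step can be sketched as follows. The slides do not say how weights are combined when both lists propose the same rating, so summing them is an assumption, and the `(rating, weight)` pair representation is hypothetical.

```python
from collections import defaultdict

def merge_candidates(user_based, item_based):
    """Merge two candidate lists and return the highest-weighted rating.

    Each list holds (rating, weight) pairs, where weight is the similarity
    of the neighbour that cast the vote. Weights for the same rating are
    summed across both lists (an assumption, not stated in the slides).
    Returns None if both lists are empty.
    """
    totals = defaultdict(float)
    for rating, weight in list(user_based) + list(item_based):
        totals[rating] += weight
    if not totals:
        return None
    return max(totals, key=totals.get)

# Hypothetical candidates from the two recommenders.
from_users = [(4, 0.9), (5, 0.5)]
from_items = [(5, 0.6), (3, 0.2)]
ensemble_prediction = merge_candidates(from_users, from_items)  # 5
```

Here neither recommender alone would pick 5 with much confidence, but the combined weight of 1.1 beats the 0.9 for rating 4, which shows how agreement between the two views can change the outcome.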

SLIDE 14

Evaluation

SLIDE 15

Evaluation Dataset

MovieLens 1M dataset

  • 1,000,209 ratings
  • 3,883 movies
  • 6,040 different users

Split into 80% / 20% for training and testing.

  • The training set consists of the oldest 80% of each user’s ratings.
  • The remaining ratings form the test set.

Metric used: Mean Absolute Error (MAE)
Neighbourhood sizes: 1, 2, 3, 6, 10, 20, 40, 80, 140, 200
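The split and the metric can be sketched as below. The tuple layout `(user, item, rating, timestamp)` and the rounding-down of the 80% cut-off are assumptions. The slides do not describe how fractional counts were handled.

```python
from collections import defaultdict

def split_oldest_first(rows, train_fraction=0.8):
    """Per-user temporal split of (user, item, rating, timestamp) tuples.

    The oldest train_fraction of each user's ratings go to training,
    the rest to testing. Rounding down at the cut is an assumption.
    """
    by_user = defaultdict(list)
    for row in rows:
        by_user[row[0]].append(row)
    train, test = [], []
    for user_rows in by_user.values():
        user_rows.sort(key=lambda r: r[3])            # oldest first
        cut = int(len(user_rows) * train_fraction)
        train.extend(user_rows[:cut])
        test.extend(user_rows[cut:])
    return train, test

def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over paired ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical toy data: one user, five ratings with timestamps 1..5.
toy_rows = [("u1", "m%d" % t, 3, t) for t in (5, 2, 4, 1, 3)]
toy_train, toy_test = split_oldest_first(toy_rows)
toy_mae = mean_absolute_error([4, 3, 5], [5, 3, 3])   # (1 + 0 + 2) / 3
```

A lower MAE means predicted ratings are closer, on average, to the ratings users actually gave in the held-out test set.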

SLIDE 16

System Configurations Evaluated

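Columns: Algorithm Index, Similar Items, Similar Users, Item Category, LSA Dimensions Used.
[The tick marks for the Similar Items, Similar Users and Item Category columns are not recoverable; only the LSA Dimensions Used column is shown.]

Algorithm Index    LSA Dimensions Used
1                  -
2                  300
3                  -
4                  1000
5                  -
6                  300
7                  -
8                  -
9                  300
10                 -
11                 1000
12                 -
13                 300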

SLIDE 17

Results

SLIDE 18

Conclusions

SLIDE 19

Comparison of the Different Setups

  • Item-based recommenders perform considerably better than the user-based ones.
  • LSA has a beneficial effect on user-based recommendations, but an overall negative effect on the item-based recommendations.
  • The ensemble system that uses LSA gives the best results (albeit only slightly) across practically all neighbourhood sizes.

SLIDE 20

Optimal Neighbourhood Size

The optimal neighbourhood size seems to be around 40.

  • Item-based recommenders are most effective with a neighbourhood size of 40, with a slight deterioration of results for larger sizes.
  • The performance of user-based recommenders keeps improving (albeit very slightly) as the neighbourhood size increases.
  • The ensemble algorithm that uses LSA obtains its best results with a neighbourhood size of 80, and results degrade slightly with larger neighbourhoods.

SLIDE 21

Future Work

  • Investigating the different ways in which content-type features may be incorporated into collaborative systems.
  • Recommendation over big data: how to perform distributed recommendation over multiple datasets and merge the resulting recommendation scores.
