
ACM Conference on Recommender Systems 2011 – Doctoral Symposium October 23, Chicago, USA

IR Group @ UAM

Predicting Performance in Recommender Systems

Alejandro Bellogín

Supervised by Pablo Castells and Iván Cantador

Escuela Politécnica Superior Universidad Autónoma de Madrid

@abellogin alejandro.bellogin@uam.es


Motivation

Is it possible to predict the accuracy of a recommendation?


Hypothesis

Data that are commonly available to a Recommender System could contain signals that enable an a priori estimation of the success of the recommendation.

Research Questions

1. Is it possible to define a performance prediction theory for recommender systems in a sound, formal way?

2. Is it possible to adapt query performance techniques (from IR) to the recommendation task?

3. What kind of evaluation should be performed? Is IR evaluation still valid in our problem?

4. What kind of recommendation problems can these models be applied to?

slide-5
SLIDE 5

IRG

IR Group @ UAM

ACM Conference on Recommender Systems 2011 – Doctoral Symposium October 23, Chicago, USA 5

Predicting Performance in Recommender Systems

  • RQ1. Is it possible to define a performance prediction theory for recommender systems in a sound, formal way?
    a) Define a predictor of performance γ = γ(u, i, r, …)
    b) Agree on a performance metric μ = μ(u, i, r, …)
    c) Check predictive power by measuring the correlation corr([γ(x1), …, γ(xn)], [μ(x1), …, μ(xn)])
    d) Evaluate final performance: dynamic vs static
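The predictive-power check in step (c) can be sketched in plain Python; the predictor values gamma and metric values mu below are made-up toy numbers for illustration, not results from the thesis:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# gamma: predictor value per data point x1..xn; mu: measured performance
gamma = [0.2, 0.5, 0.9, 0.4, 0.7]
mu = [0.1, 0.4, 0.8, 0.5, 0.6]
print(pearson(gamma, mu))  # close to 1 here, i.e. high predictive power
```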


Predicting Performance in Recommender Systems

  • RQ2. Is it possible to adapt query performance techniques (from IR) to the recommendation task?
  • In IR: “Estimation of the system’s performance in response to a specific query”
  • Several predictors proposed
  • We focus on query clarity → user clarity

User clarity

  • It captures uncertainty in the user’s data
  • Distance between the user’s and the system’s probability models
  • X may be: users, items, ratings, or a combination

clarity(u) = \sum_{x \in X} p(x|u) \log_2 \frac{p(x|u)}{p_c(x)}

where p(x|u) is the user’s model and p_c(x) is the system’s (background) model.
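As a toy illustration of the rating-based case, user clarity reduces to a KL divergence between the user’s smoothed rating distribution and the background one. The smoothing constant alpha and the toy data are assumptions of this sketch, not the estimation used in the thesis:

```python
import math
from collections import Counter

def rating_clarity(user_ratings, all_ratings, vocab=(1, 2, 3, 4, 5), alpha=0.1):
    """KL divergence between the user's and the background rating models."""
    cu = Counter(user_ratings)
    cb = Counter(all_ratings)
    nu = len(user_ratings) + alpha * len(vocab)
    nb = len(all_ratings) + alpha * len(vocab)
    clarity = 0.0
    for x in vocab:
        p_u = (cu[x] + alpha) / nu   # user model p(x|u), Laplace-smoothed
        p_c = (cb[x] + alpha) / nb   # background model p_c(x)
        clarity += p_u * math.log2(p_u / p_c)
    return clarity

all_r = [1, 2, 3, 4, 5] * 20    # flat background rating distribution
focused = [5, 5, 5, 4, 5]       # opinionated user -> high clarity
flat = [1, 2, 3, 4, 5]          # matches the background -> clarity near 0
print(rating_clarity(focused, all_r) > rating_clarity(flat, all_r))  # True
```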


User clarity

  • Three user clarity formulations:

clarity(u) = \sum_{x \in X} p(x|u) \log_2 \frac{p(x|u)}{p_c(x)}

with p(x|u) the user model and p_c(x) the background model, instantiated as:

| Name | Vocabulary | User model | Background model |
|------|------------|------------|------------------|
| Rating-based | Ratings | p(r|u) | p_c(r) |
| Item-based | Items | p(i|u) | p_c(i) |
| Item-and-rating-based | Items rated by the user | p(r|i,u) | p_ml(r|i) |


User clarity

  • Seven user clarity models implemented:

| Name | Formulation | User model | Background model |
|------|-------------|------------|------------------|
| RatUser | Rating-based | p_U(r|i,u) ; p_UR(i|u) | p_c(r) |
| RatItem | Rating-based | p_I(r|i,u) ; p_UR(i|u) | p_c(r) |
| ItemSimple | Item-based | p_R(i|u) | p_c(i) |
| ItemUser | Item-based | p_UR(i|u) | p_c(i) |
| IRUser | Item-and-rating-based | p_U(r|i,u) | p_ml(r|i) |
| IRItem | Item-and-rating-based | p_I(r|i,u) | p_ml(r|i) |
| IRUserItem | Item-and-rating-based | p_UI(r|i,u) | p_ml(r|i) |


User clarity

  • Predictor that captures uncertainty in the user’s data
  • Different formulations capture different nuances
  • More dimensions in RS than in IR: users, items, ratings, features, …

[Figure: example distributions over the rating vocabulary (1–5) for the RatItem model, comparing the background model p_c(x) with two user models p(x|u1) and p(x|u2).]


Predicting Performance in Recommender Systems

  • RQ3. What kind of evaluation should be performed? Is IR evaluation still valid in our problem?
  • In IR: Mean Average Precision + correlation
  • 50 points (queries) vs 1000+ points (users)
  • The performance metric is not clear: error-based, precision-based?
  • What is performance? It may depend on the final application
  • Possible bias, e.g., towards users or items with larger profiles

[Figure annotation: r ~ 0.57]


Predicting Performance in Recommender Systems

  • RQ4. What kind of recommendation problems can these models be applied to?

  • Whenever a combination of strategies is available
  • Example 1: dynamic neighbor weighting
  • Example 2: dynamic ensemble recommendation

Dynamic neighbor weighting

  • The user’s neighbors are weighted according to their similarity
  • Can we take into account the uncertainty in the neighbors’ data?
  • User neighbor weighting [1]
  • Static: g(u,i) = C \sum_{v \in N[u]} sim(u,v) \, rat(v,i)
  • Dynamic: g(u,i) = C \sum_{v \in N[u]} \gamma(v) \, sim(u,v) \, rat(v,i)
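A minimal sketch of the two scoring rules above; the dictionaries sim, rat, and the per-neighbor weight gamma are toy stand-ins for what the actual CF system would compute (e.g. gamma from normalized clarity values):

```python
def static_score(u, i, neighbors, sim, rat, C=1.0):
    """Static kNN scoring: g(u,i) = C * sum over v in N[u] of sim(u,v)*rat(v,i)."""
    return C * sum(sim[(u, v)] * rat[(v, i)]
                   for v in neighbors[u] if (v, i) in rat)

def dynamic_score(u, i, neighbors, sim, rat, gamma, C=1.0):
    """Dynamic variant: each neighbor v is additionally weighted by gamma(v)."""
    return C * sum(gamma[v] * sim[(u, v)] * rat[(v, i)]
                   for v in neighbors[u] if (v, i) in rat)

# toy data: two neighbors of user "u" who both rated item "m"
neighbors = {"u": ["a", "b"]}
sim = {("u", "a"): 0.9, ("u", "b"): 0.5}
rat = {("a", "m"): 4.0, ("b", "m"): 2.0}
gamma = {"a": 0.8, "b": 0.3}  # e.g. normalized clarity per neighbor
print(static_score("u", "m", neighbors, sim, rat))          # ~4.6
print(dynamic_score("u", "m", neighbors, sim, rat, gamma))  # ~3.18
```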


Dynamic hybrid recommendation

  • The weight is the same for every item and user (learnt from training)
  • What about boosting those users predicted to perform better for some recommender?

  • Hybrid recommendation [3]
  • Static: g(u,i) = \lambda \, g_{R1}(u,i) + (1 - \lambda) \, g_{R2}(u,i)
  • Dynamic: g(u,i) = \gamma(u) \, g_{R1}(u,i) + (1 - \gamma(u)) \, g_{R2}(u,i)
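The static vs. dynamic combination can be sketched as follows; the toy component recommenders g_cf and g_cb and the per-user weight function gamma are illustrative assumptions, not the hybrids evaluated in the thesis:

```python
def static_hybrid(g1, g2, u, i, lam):
    """g(u,i) = lam*g_R1(u,i) + (1-lam)*g_R2(u,i): one global weight lam."""
    return lam * g1(u, i) + (1 - lam) * g2(u, i)

def dynamic_hybrid(g1, g2, u, i, gamma):
    """g(u,i) = gamma(u)*g_R1(u,i) + (1-gamma(u))*g_R2(u,i): per-user weight."""
    w = gamma(u)
    return w * g1(u, i) + (1 - w) * g2(u, i)

# toy component recommenders and a per-user weight from some predictor
g_cf = lambda u, i: 4.0   # e.g. a collaborative filtering score
g_cb = lambda u, i: 2.0   # e.g. a content-based score
gamma = lambda u: 0.9 if u == "power_user" else 0.2
print(dynamic_hybrid(g_cf, g_cb, "power_user", "x", gamma))  # ~3.8
print(dynamic_hybrid(g_cf, g_cb, "newcomer", "x", gamma))    # ~2.4
```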


Results – Neighbor weighting

  • Correlation analysis [1], with respect to the Neighbor Goodness metric: “how good a neighbor is to her vicinity”
  • Performance [1] (MAE = Mean Absolute Error; the lower, the better)

[Figure: MAE of Standard CF vs. Clarity-enhanced CF, for 10–90% of ratings used for training and for neighbourhood sizes 100–500.]

  • Positive, although not very strong, correlations
  • Improvement of over 5% w.r.t. the baseline; moreover, it does not degrade performance


Results – Hybrid recommendation

  • Correlation analysis [2], with respect to nDCG@50 (normalized Discounted Cumulative Gain)
  • Performance [3]

[Figure: nDCG@50 of the Adaptive vs. Static hybrids H1–H4.]

On average, most of the predictors obtain positive, strong correlations.

The dynamic strategy outperforms the static one for different combinations of recommenders.


Summary

  • Inferring a user’s performance in a recommender system
  • Different adaptations of query clarity techniques
  • Building dynamic recommendation strategies
  • Dynamic neighbor weighting: according to the expected goodness of a neighbor
  • Dynamic hybrid recommendation: based on predicted performance
  • Encouraging results
  • Positive predictive power (good correlations between predictors and metrics)
  • Dynamic strategies obtain better (or equal) results than static ones

Related publications

  • [1] A. Bellogín and P. Castells. A Performance Prediction Approach to Enhance Collaborative Filtering Performance. In ECIR 2010.
  • [2] A. Bellogín, P. Castells, and I. Cantador. Predicting the Performance of Recommender Systems: An Information Theoretic Approach. In ICTIR 2011.
  • [3] A. Bellogín, P. Castells, and I. Cantador. Performance Prediction for Dynamic Ensemble Recommender Systems. In press.

Future Work

  • Explore other input sources
  • Item predictors
  • Social links
  • Implicit data (with time)
  • We need a theoretical background
  • Why do some predictors work better?
  • Larger datasets

FW – Other input sources

  • Item predictors
  • Social links
  • Implicit data (with time)
  • Item predictors could be very useful:
  • Different recommender behavior depending on item attributes
  • They would allow us to capture popularity, diversity, etc.

FW – Other input sources

  • Item predictors
  • Social links
  • Implicit data (with time)
  • First results using social-based predictors
  • Combination of social and CF
  • Graph-based measures as predictors
  • “Indicators” of the user’s strength

FW – Theoretical background

  • We need a theoretical background
  • Why do some predictors work better?

Thank you!

Predicting Performance in Recommender Systems

Alejandro Bellogín

Supervised by Pablo Castells and Iván Cantador

Escuela Politécnica Superior Universidad Autónoma de Madrid

@abellogin alejandro.bellogin@uam.es

Acknowledgements to the National Science Foundation for the funding to attend the conference.


Reviewer’s comments: Confidence

• Other methods to measure the self-performance of RS
  • Confidence
  • These methods capture the performance of the RS, not the user’s performance


Reviewer’s comments: Neighbor’s goodness

• Neighbor goodness seems to be a little bit ad hoc
  • We need a measurable definition of neighbor performance:

    NG(u) ~ “total MAE reduction by u” ~ “MAE without u” – “MAE with u”

  • Some attempts in trust research: sign and error deviation [Rafter et al. 2009]

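The approximation NG(u) ~ “MAE without u” – “MAE with u” could be sketched like this; pred_with and pred_without are hypothetical hooks into a CF system that include or exclude u as a neighbor, and the test-set numbers are purely illustrative:

```python
def mae(preds, truths):
    """Mean Absolute Error over paired predictions and true ratings."""
    return sum(abs(p - t) for p, t in zip(preds, truths)) / len(preds)

def neighbor_goodness(predict_with, predict_without, test_set):
    """NG ~ MAE without u minus MAE with u: positive => u is a helpful neighbor."""
    truths = [r for (_, _, r) in test_set]
    with_u = [predict_with(tu, ti) for (tu, ti, _) in test_set]
    without_u = [predict_without(tu, ti) for (tu, ti, _) in test_set]
    return mae(without_u, truths) - mae(with_u, truths)

# toy test set of (user, item, true rating) and two prediction functions
test_set = [("x", "i1", 4.0), ("x", "i2", 2.0)]
pred_with = lambda tu, ti: {"i1": 3.8, "i2": 2.1}[ti]     # u kept as neighbor
pred_without = lambda tu, ti: {"i1": 3.0, "i2": 1.0}[ti]  # u dropped
print(neighbor_goodness(pred_with, pred_without, test_set))  # positive: u helps
```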


Reviewer’s comments: Neighbor’s weighting issues

• Neighborhood size vs dynamic neighborhood weighting
  • So far, only dynamic weighting
  • Same training time as static weighting
  • Future work: dynamic size
• Apply this method to larger datasets
  • Current work
• Apply this method to other CF methods (e.g., latent factor models, SVD)
  • More difficult to identify the combination
  • Future work

Reviewer’s comments: Dynamic hybrid issues

• Other methods to combine recommenders
  • Stacking
  • Multi-linear weighting
• We focus on linear weighted hybrid recommendation
  • Future work: cascade, stacking