SLIM: Sparse Linear Methods for Top-N Recommender Systems (PowerPoint Presentation)


SLIDE 1

SLIM: Sparse Linear Methods for Top-N Recommender Systems

Xia Ning and George Karypis

Computer Science & Engineering, University of Minnesota, Minneapolis, MN. Email: {xning,karypis}@cs.umn.edu

December 14, 2011

SLIDE 2


Outline

1. Introduction: Top-N Recommender Systems; Definitions and Notations; The State-of-the-Art Methods

2. Methods: Sparse LInear Methods for top-N Recommendation; Learning W for SLIM; SLIM with Feature Selection

3. Materials

4. Experimental Results: SLIM on Binary Data (Top-N Recommendation Performance; SLIM for Long-Tail Distribution; SLIM Regularization Effects); SLIM on Rating Data

5. Conclusions

SLIDE 3

Outline

SLIDE 4


Top-N Recommender Systems

❑ Top-N recommendation
  ❑ E-commerce: huge amounts of products
  ❑ Recommend a short ranked list of items for users
❑ Top-N recommender systems
  ❑ Neighborhood-based Collaborative Filtering (CF)
    ❑ Item-based [2]: fast to generate recommendations, low recommendation quality
  ❑ Model-based methods [1, 3, 5]
    ❑ Matrix Factorization (MF) models: slow to learn the models, high recommendation quality
❑ SLIM: Sparse LInear Methods
  ❑ Fast, with high recommendation quality

SLIDE 5


Definitions and Notations

Table 1: Definitions and Notations

Def      Descriptions
u_i      user
t_j      item
U        all users (|U| = n)
T        all items (|T| = m)
A        user-item purchase/rating matrix, size n × m
W        item-item similarity matrix / coefficient matrix
a_i^T    the i-th row of A: the purchase/rating history of u_i on T
a_j      the j-th column of A: the purchase/rating history of U on t_j

❑ Row vectors are denoted by the transpose superscript T; otherwise, vectors are column vectors by default.
❑ Matrix/vector notations are used instead of user/item purchase/rating profiles.

SLIDE 6


The State-of-the-Art Methods

Item-based Collaborative Filtering (1)

❑ Item-based k-nearest-neighbor (itemkNN) CF
  ❑ Identify a set of similar items
  ❑ Item-item similarity:
    ❑ Calculated from A
    ❑ Cosine similarity measure (a code sketch follows the figure below)

[Figure: the binary user-item matrix A (users u_1 ... u_n as rows, items t_1 ... t_m as columns) and the item-item similarity matrix W, in which each item's column holds non-zero similarity values s only for its nearest neighbors (1st nn, 2nd nn, ...).]
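A minimal sketch of this step, not taken from the paper: it assumes the user-item matrix A is available as a SciPy sparse matrix, computes the cosine item-item similarities with scikit-learn, and keeps only each item's k nearest neighbors so W stays sparse. The function name `itemknn_similarity` and the default k are illustrative.

```python
import numpy as np
from scipy import sparse
from sklearn.metrics.pairwise import cosine_similarity

def itemknn_similarity(A, k=20):
    """A: sparse n_users x n_items matrix; returns a sparse m x m similarity matrix W."""
    sim = cosine_similarity(sparse.csr_matrix(A).T)  # items as rows -> dense m x m
    np.fill_diagonal(sim, 0.0)                       # an item is not its own neighbor
    W = np.zeros_like(sim)
    for j in range(sim.shape[1]):                    # keep the k largest similarities per column
        top = np.argsort(sim[:, j])[-k:]
        W[top, j] = sim[top, j]
    return sparse.csr_matrix(W)
```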

SLIDE 7


The State-of-the-Art Methods

Item-based Collaborative Filtering (2)

[Figure: a user's row vector a_i^T (binary purchase history over items t_1 ... t_m) multiplied by the item-item similarity matrix W yields the recommendation score vector over all items.]

❑ itemkNN recommendation
  ❑ Recommend items similar to those the user has purchased: $\tilde{a}_i^T = a_i^T \times W$ (a scoring sketch follows below)
  ❑ Fast: sparse item neighborhood
  ❑ Low quality: no knowledge is learned
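A corresponding scoring sketch, again illustrative rather than the authors' implementation: it assumes A is a sparse user-item matrix and W is the sparse similarity matrix built above, computes $\tilde{a}_i^T = a_i^T W$, masks items the user already has, and returns the top-N item indices.

```python
import numpy as np
from scipy import sparse

def recommend_top_n(A, W, user, N=10):
    """Score items for one user with a_i^T W and return the top-N unseen items."""
    A = sparse.csr_matrix(A)
    scores = A[user].dot(W).toarray().ravel()   # score vector = a_i^T W
    scores[A[user].indices] = -np.inf           # exclude items the user already purchased
    return np.argsort(scores)[::-1][:N]
```

The same routine applies unchanged once W is learned by SLIM later in the talk, since SLIM keeps this linear scoring form and only changes how W is obtained.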

SLIDE 8


The State-of-the-Art Methods

Matrix Factorization (1)

❑ Latent factor models
  ❑ Factorize A into low-rank user factors (U) and item factors (V^T)
  ❑ U and V^T represent user and item characteristics in a common latent space
❑ Formulated as an optimization problem:

$$\min_{U,V} \; \frac{1}{2}\|A - UV^T\|_F^2 + \frac{\beta}{2}\|U\|_F^2 + \frac{\lambda}{2}\|V^T\|_F^2$$

[Figure: the n × m user-item matrix A approximated by the product of an n × k user-factor matrix U and a k × m item-factor matrix V^T.]

SLIDE 9


The State-of-the-Art Methods

Matrix Factorization (2)

[Figure: a user's latent factor vector u_*^T (length k) multiplied by the item factor matrix V^T yields the predicted score vector over all items t_1 ... t_m.]

❑ MF recommendation
  ❑ Prediction: dot product in the latent space, $\tilde{a}_{ij} = U_i^T \cdot V_j$ (a scoring sketch follows below)
  ❑ Slow: dense U and V^T
  ❑ High quality: user tastes and item properties are learned
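For contrast, a scoring sketch under the assumption that dense factors U (n × k) and V (m × k) have already been learned by some MF trainer (ALS or SGD, not shown here); names are illustrative. Every item requires a dense dot product, which is what makes MF scoring slower than scoring with a sparse W.

```python
import numpy as np

def mf_recommend_top_n(U, V, user_items, user, N=10):
    """U: n x k user factors, V: m x k item factors, user_items: per-user sets of known items."""
    scores = V @ U[user]                       # score of item j = u_i^T v_j, for every item j
    scores[list(user_items[user])] = -np.inf   # skip items the user already has
    return np.argsort(scores)[::-1][:N]
```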

SLIDE 10

Outline

SLIDE 11


SLIM for top-N Recommendation

❑ Motivations:
  ❑ recommendations generated fast
  ❑ high-quality recommendations
  ❑ “have my cake and eat it too”
❑ Key ideas:
  ❑ retain the nature of itemkNN: sparse W
  ❑ optimize the recommendation performance: learn W from A
    ❑ sparsity structures
    ❑ coefficient values

SLIDE 12


Learning W for SLIM

❑ The optimization problem:

$$\min_{W} \; \frac{1}{2}\|A - AW\|_F^2 + \frac{\beta}{2}\|W\|_F^2 + \lambda\|W\|_1$$
$$\text{subject to } W \ge 0, \quad \mathrm{diag}(W) = 0 \qquad (1)$$

SLIDE 13


Learning W for SLIM

❑ The optimization problem:

$$\min_{W} \; \frac{1}{2}\|A - AW\|_F^2 + \frac{\beta}{2}\|W\|_F^2 + \lambda\|W\|_1$$
$$\text{subject to } W \ge 0, \quad \mathrm{diag}(W) = 0 \qquad (1)$$

❑ Computing W:
  ❑ The columns of W are independent: easy to parallelize
  ❑ The decoupled per-column problems (a solver sketch follows below):

$$\min_{w_j} \; \frac{1}{2}\|a_j - A w_j\|_2^2 + \frac{\beta}{2}\|w_j\|_2^2 + \lambda\|w_j\|_1$$
$$\text{subject to } w_j \ge 0, \quad w_{j,j} = 0 \qquad (2)$$
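A minimal training sketch for problem (2), assuming scikit-learn's ElasticNet as a stand-in for the paper's own coordinate-descent solver: `positive=True` gives the non-negativity constraint, zeroing column j of A before the fit enforces w_{j,j} = 0, and the l1/l2 weights match λ and β only up to ElasticNet's 1/n scaling of the squared loss. Function and parameter names are illustrative.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import ElasticNet

def slim_train(A, l1_reg=1.0, l2_reg=0.1):
    """Learn a sparse, non-negative item-item matrix W with diag(W) = 0.

    A: user-item matrix (n x m); converted to CSC for fast column access.
    Note: ElasticNet divides the squared loss by the number of users, so
    l1_reg/l2_reg correspond to lambda/beta only up to that scaling.
    """
    A = sparse.csc_matrix(A, dtype=np.float64)
    n_items = A.shape[1]
    alpha = l1_reg + l2_reg
    model = ElasticNet(alpha=alpha, l1_ratio=l1_reg / alpha,
                       positive=True, fit_intercept=False)
    cols = []
    for j in range(n_items):                      # columns are independent -> parallelizable
        aj = A[:, j].toarray().ravel()            # target column a_j
        start, end = A.indptr[j], A.indptr[j + 1]
        saved = A.data[start:end].copy()
        A.data[start:end] = 0.0                   # exclude item j itself, so w_{j,j} = 0
        model.fit(A, aj)                          # l1/l2-regularized non-negative fit, as in (2)
        A.data[start:end] = saved
        cols.append(sparse.csc_matrix(model.coef_.reshape(-1, 1)))
    return sparse.hstack(cols).tocsr()            # W: m x m, sparse and non-negative
```

Because each column is fit independently, the loop parallelizes trivially across items.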

SLIDE 14


Reducing model learning time

$$\min_{w_j} \; \frac{1}{2}\|a_j - A w_j\|_2^2 + \frac{\beta}{2}\|w_j\|_2^2 + \lambda\|w_j\|_1$$

❑ fsSLIM: SLIM with feature selection (a sketch follows below)
  ❑ Prescribe the potential non-zero structure of w_j
  ❑ Select a subset of columns from A, giving a reduced matrix A′
    ❑ e.g., using the itemkNN item-item similarity matrix

[Figure: the full user-item matrix A and the reduced matrix A′ that keeps only the columns (items) most relevant to item t_j; the target column a_j is then fit against A′ instead of A.]

$$\min_{w_j} \; \frac{1}{2}\|a_j - A' w_j\|_2^2 + \frac{\beta}{2}\|w_j\|_2^2 + \lambda\|w_j\|_1$$
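A sketch of the fsSLIM idea for a single item, under the same ElasticNet stand-in assumption as before: cosine similarity picks the k candidate columns forming A′, the reduced problem is solved, and the learned coefficients are scattered back into a full-length sparse column. Names and defaults are illustrative.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import ElasticNet
from sklearn.metrics.pairwise import cosine_similarity

def fsslim_column(A, j, k=100, l1_reg=1.0, l2_reg=0.1):
    """Learn column w_j using only the k columns of A most similar to item j."""
    A = sparse.csc_matrix(A, dtype=np.float64)
    sim = cosine_similarity(A.T[j], A.T).ravel()      # similarity of item j to all items
    sim[j] = -np.inf                                  # exclude item j itself (w_{j,j} = 0)
    candidates = np.argsort(sim)[-k:]                 # indices of the k most similar items
    A_reduced = A[:, candidates]                      # A' with only the selected columns
    aj = A[:, j].toarray().ravel()
    alpha = l1_reg + l2_reg
    model = ElasticNet(alpha=alpha, l1_ratio=l1_reg / alpha,
                       positive=True, fit_intercept=False)
    model.fit(A_reduced, aj)
    wj = np.zeros(A.shape[1])
    wj[candidates] = model.coef_                      # scatter back to the full item space
    return sparse.csc_matrix(wj.reshape(-1, 1))       # column j of W (m x 1)
```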

SLIDE 15

Outline

SLIDE 16


Datasets, Evaluation Methodology and Metrics

Table 2: The Datasets Used in Evaluation

dataset    #users    #items    #trns        rsize    csize    density   ratings
ccard      42,067    18,004    308,420      7.33     17.13    0.04%     -
ctlg2      22,505    17,096    1,814,072    80.61    106.11   0.47%     -
ctlg3      58,565    37,841    453,219      7.74     11.98    0.02%     -
ecmrc      6,594     3,972     50,372       7.64     12.68    0.19%     -
BX         3,586     7,602     84,981       23.70    11.18    0.31%     1-10
ML10M      69,878    10,677    10,000,054   143.11   936.60   1.34%     1-10
Netflix    39,884    8,478     1,256,115    31.49    148.16   0.37%     1-5
Yahoo      85,325    55,371    3,973,104    46.56    71.75    0.08%     1-5

(#trns: number of transactions, i.e., non-zero entries in A; rsize and csize: average number of non-zeros per row and per column; density: #trns / (#users × #items); ratings: the rating scale, where present.)

❑ Datasets: 8 real datasets of 2 categories (purchase/binary data and rating data)
❑ Evaluation methodology: Leave-One-Out cross validation
❑ Evaluation metrics (a computation sketch follows below)
  ❑ Hit Rate: $\mathrm{HR} = \dfrac{\#\text{hits}}{\#\text{users}}$
  ❑ Average Reciprocal Hit-Rank (ARHR) [2]: $\mathrm{ARHR} = \dfrac{1}{\#\text{users}} \sum_{i=1}^{\#\text{hits}} \dfrac{1}{p_i}$, where $p_i$ is the rank of the hit item in user $i$'s recommendation list
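A small sketch of how the two metrics can be computed under leave-one-out evaluation, assuming that for each user we have recorded the 1-based rank of the single left-out item in that user's top-N list, or None for a miss; names are illustrative.

```python
def hit_rate_and_arhr(hit_positions):
    """hit_positions: one entry per user (rank p_i of the left-out item, or None for a miss)."""
    n_users = len(hit_positions)
    hits = [p for p in hit_positions if p is not None]
    hr = len(hits) / n_users                      # HR = #hits / #users
    arhr = sum(1.0 / p for p in hits) / n_users   # ARHR = (1/#users) * sum_i 1/p_i
    return hr, arhr

# Example: 3 users, hits at ranks 1 and 3 for the first two, a miss for the third.
print(hit_rate_and_arhr([1, 3, None]))            # -> (0.666..., 0.444...)
```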

SLIDE 17

Outline

SLIDE 18


SLIM on Binary Data

Top-N recommendation performance

[Figure 1: HR comparison; Figure 2: ARHR comparison; Figure 3: learning time (s) comparison; Figure 4: testing time (s) comparison, all on the ccard, ecmrc and Netflix datasets.]

Methods compared: itemkNN, itemprob, userkNN, PureSVD, WRMF, BPRMF, BPRkNN, SLIM, fsSLIM.

SLIDE 19


SLIM on Binary Data

SLIM for Long-Tail Distribution

[Figure 5: Rating distribution in ML10M; a small "short head" of popular items accounts for most of the purchases/ratings, while the "long tail" of unpopular items accounts for the rest.]

❑ SLIM outperforms the other methods on the “long tail”.

[Figure 6: HR on the ML10M tail; Figure 7: ARHR on the ML10M tail. Methods compared: itemkNN, itemprob, userkNN, PureSVD, WRMF, BPRMF, BPRkNN, SLIM, fsSLIM.]

SLIDE 20


SLIM on Binary Data

SLIM Recommendations for Different top-N

[Figure 8: HR on BX for N = 5, 10, 15, 20, 25; Figure 9: HR on Netflix for N = 5, 10, 15, 20, 25. Methods compared: itemkNN, itemprob, userkNN, PureSVD, WRMF, BPRMF, BPRkNN, SLIM.]

❑ The performance difference between SLIM and the best of the other methods is larger for smaller values of N.
❑ SLIM tends to rank the most relevant items higher than the other methods.

SLIDE 21


SLIM on Binary Data

SLIM Regularization Effects

[Figure 10: SLIM regularization effects on BX: recommendation time (s) and HR as functions of the regularization parameters β and λ, each varied over 0.0, 0.5, 1.0, 2.0, 3.0 and 5.0.]

$$\min_{W} \; \frac{1}{2}\|A - AW\|_F^2 + \frac{\beta}{2}\|W\|_F^2 + \lambda\|W\|_1$$

❑ As greater ℓ1-norm regularization (i.e., larger λ) is applied, lower recommendation time is achieved, indicating that the learned W is sparser.
❑ The best recommendation quality is achieved when both regularization parameters β and λ are non-zero.
❑ The recommendation quality changes smoothly as the regularization parameters β and λ change.

SLIDE 22


SLIM on Rating Data

Top-N recommendation performance

[Figure 11: SLIM on Netflix: the distribution of the rating values 1-5 and the per-rating hit rate (rHR) for each rating value. Methods compared: PureSVD-r, PureSVD-b, WRMF-r, WRMF-b, BPRkNN-r, BPRkNN-b, SLIM-r, SLIM-b.]

❑ Evaluation metric: per-rating Hit Rate (rHR), the hit rate computed separately for left-out items of each rating value
❑ All the -r methods (trained on the rating values) produce higher hit rates on items with higher ratings.
❑ The -r methods outperform the -b methods (trained on binarized data) on high-rated items.
❑ SLIM-r consistently outperforms the other methods on items with higher ratings.

SLIDE 23

Outline

SLIDE 24


Conclusions

❑ SLIM: Sparse LInear Method for top-N recommendations
  ❑ The recommendation score for a new item can be calculated as an aggregation over other items
  ❑ A sparse aggregation coefficient matrix W is learned by SLIM, which makes the aggregation very fast
  ❑ W is learned by solving an ℓ1-norm and ℓ2-norm regularized optimization problem, so that sparsity is introduced into W
  ❑ Fast and efficient

SLIDE 25


References

[1] P. Cremonesi, Y. Koren, and R. Turrin. Performance of recommender algorithms on top-N recommendation tasks. In Proceedings of the Fourth ACM Conference on Recommender Systems (RecSys '10), pages 39–46, New York, NY, USA, 2010. ACM.

[2] M. Deshpande and G. Karypis. Item-based top-N recommendation algorithms. ACM Transactions on Information Systems, 22:143–177, January 2004.

[3] Y. Hu, Y. Koren, and C. Volinsky. Collaborative filtering for implicit feedback datasets. In Proceedings of the Eighth IEEE International Conference on Data Mining (ICDM '08), pages 263–272, Washington, DC, USA, 2008. IEEE Computer Society.

[4] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI '09), pages 452–461, Arlington, Virginia, United States, 2009. AUAI Press.

[5] V. Sindhwani, S. S. Bucak, J. Hu, and A. Mojsilovic. One-class matrix completion with low-density factorizations. In Proceedings of the 2010 IEEE International Conference on Data Mining (ICDM '10), pages 1055–1060, Washington, DC, USA, 2010. IEEE Computer Society.

[6] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society (Series B), 58:267–288, 1996.

SLIDE 26


Thank You!
