SLIDE 1

Ordinal Non-negative Matrix Factorization for Recommendation

International Conference on Machine Learning

Olivier Gouvert (1), Thomas Oberlin (2), Cédric Févotte (1)

(1) IRIT, Université de Toulouse, CNRS, France
(2) ISAE-SUPAERO, Université de Toulouse, France

Ordinal Non-negative Matrix Factorization for Recommendation. Gouvert O., Oberlin T. and Févotte C. ICML 2020

SLIDE 3

Introduction OrdNMF Experimental Results Conclusion

Collaborative Filtering (CF)

◮ Based only on the feedback of users on items
◮ $\mathbf{Y}$: feedback matrix of size $U \times I$, where $y_{ui}$ is the feedback of user $u \in \{1, \dots, U\}$ on item $i \in \{1, \dots, I\}$
◮ Ordinal data: nominal data that exhibit a natural ordering [Stevens et al., 1946]:

  • Explicit feedback: bad ≺ average ≺ good ≺ excellent
  • Implicit feedback: quantized play counts
  • ... without loss of generality, $y_{ui} \in \{0, \dots, V\}$

SLIDE 5

Non-negative Matrix Factorization (NMF)

◮ Approximation: $\mathbf{Y} \approx \mathbf{W}\mathbf{H}^T$ [Lee and Seung, 1999]

  • W ≥ 0 of size U × K: preferences of the users
  • H ≥ 0 of size I × K: attributes of the items

[Figure: the matrix $\mathbf{Y}$ and its approximation $\hat{\mathbf{Y}} = \mathbf{W}\mathbf{H}^T$]
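As a rough illustration of the shapes involved, the sketch below computes the reconstruction W Hᵀ for a toy problem in plain Python. The factor values and the helper name `reconstruct` are made up for this example and do not come from the talk.

```python
def reconstruct(W, H):
    """Return Y_hat = W H^T, where W is U x K and H is I x K (lists of rows)."""
    return [[sum(w_k * h_k for w_k, h_k in zip(w_u, h_i)) for h_i in H] for w_u in W]

# Toy non-negative factors (hypothetical values): U = 2 users, I = 3 items, K = 2.
W = [[1.0, 0.0],
     [0.5, 2.0]]
H = [[2.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]

Y_hat = reconstruct(W, H)  # U x I matrix of non-negative scores
```

Each row of W and each row of H shares the same K latent components, so entry (u, i) of the result is simply the inner product of user u's preferences with item i's attributes.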

◮ How to process ordinal data?

SLIDE 7

How to process ordinal data?

◮ Threshold models:

  • Quantization of a continuous latent variable
  • Some examples: [Chu and Ghahramani, 2005, Paquet et al., 2012, Hernandez-Lobato et al., 2014]

◮ Contributions:

  • NMF for ordinal data (OrdNMF)
      ◮ Non-negative constraints
      ◮ Multiplicative noise
      ◮ Link with Poisson factorization (PF) [Gopalan et al., 2015]
  • Efficient variational algorithm
      ◮ Augmentation trick
      ◮ Scales with the number of non-zero values
  • Excellent flexibility of OrdNMF

SLIDE 9

Ordinal Non-negative Matrix Factorization (OrdNMF)

◮ Approximation: $\mathbf{Y} \approx G_{\mathbf{b}}(\mathbf{W}\mathbf{H}^T)$

  • Y ordinal matrix
  • W ≥ 0 and H ≥ 0

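◮ Quantization of the non-negative numbers:

$$G_{\mathbf{b}} : \mathbb{R}_+ \to \{0, \dots, V\}, \qquad x \mapsto v \ \text{such that} \ x \in [b_{v-1}, b_v),$$

where $\mathbf{b}$ is an increasing sequence of thresholds.

[Figure: staircase plot of $v = G_{\mathbf{b}}(x)$; x-axis: x on a logarithmic scale from $10^0$ to $10^6$; y-axis: v up to 10]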
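The quantization map can be sketched with a binary search over the thresholds. The snippet assumes the thresholds are stored as the list `[b_0, ..., b_{V-1}]`, with the conventions b_{-1} = 0 and b_V = +∞ left implicit; the function name `quantize` and the threshold values are illustrative only.

```python
from bisect import bisect_right

def quantize(x, b):
    """G_b(x): return v such that b[v-1] <= x < b[v].

    b = [b_0, ..., b_{V-1}] is increasing; b_{-1} = 0 and b_V = +inf are implicit.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    # The class index equals the number of thresholds at or below x.
    return bisect_right(b, x)

b = [1.0, 10.0, 100.0]  # hypothetical thresholds, so V = 3 (4 classes)
classes = [quantize(x, b) for x in (0.2, 1.0, 9.9, 10.0, 5000.0)]
```

Because the intervals are closed on the left, a value that falls exactly on a threshold is assigned to the upper class, which is why `bisect_right` is used rather than `bisect_left`.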

◮ Goal: joint estimation of W, H and b

SLIDE 11

Ordinal Non-negative Matrix Factorization (OrdNMF)

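◮ Generative model:

$$x_{ui} = [\mathbf{W}\mathbf{H}^T]_{ui} \cdot \varepsilon_{ui}, \qquad y_{ui} = G_{\mathbf{b}}(x_{ui})$$

[Figure: graphical model relating W, H, X and Y, with Y obtained from X through $G_{\mathbf{b}}$]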

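◮ Multiplicative noise: $\varepsilon$ is a non-negative random variable with c.d.f. $F_\varepsilon$
◮ Cumulative distribution function:

$$\begin{aligned}
P[y_{ui} \le v \mid \mathbf{W}, \mathbf{H}] &= P[G_{\mathbf{b}}(x_{ui}) \le v \mid \mathbf{W}, \mathbf{H}] \\
&= P\left[[\mathbf{W}\mathbf{H}^T]_{ui} \cdot \varepsilon_{ui} < b_v\right] \\
&= P\left[\varepsilon_{ui} < \frac{b_v}{[\mathbf{W}\mathbf{H}^T]_{ui}}\right] \\
&= F_\varepsilon\left(\frac{b_v}{[\mathbf{W}\mathbf{H}^T]_{ui}}\right)
\end{aligned}$$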

SLIDE 13

Inverse-Gamma OrdNMF

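◮ Generative model:

$$x_{ui} = [\mathbf{W}\mathbf{H}^T]_{ui} \cdot \varepsilon_{ui}, \qquad y_{ui} = G_{\mathbf{b}}(x_{ui})$$

[Figure: graphical model relating W, H, X and Y, with Y obtained from X through $G_{\mathbf{b}}$]

◮ Inverse-gamma noise: $\varepsilon_{ui} \sim \mathrm{IG}(1, 1)$
◮ Cumulative distribution function:

$$P[y_{ui} \le v \mid \mathbf{W}, \mathbf{H}] = e^{-[\mathbf{W}\mathbf{H}^T]_{ui}\, b_v^{-1}}$$

or, equivalently,

$$P[y_{ui} > v \mid \mathbf{W}, \mathbf{H}] = 1 - e^{-[\mathbf{W}\mathbf{H}^T]_{ui}\, b_v^{-1}}$$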
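A quick simulation can be used to sanity-check the closed-form c.d.f. under inverse-gamma noise. The sketch below relies on the fact that the reciprocal of an Exp(1) draw follows IG(1, 1); the value of `lam` (standing for one entry of W Hᵀ), the thresholds and the helper name `sample_y` are all hypothetical.

```python
import math
import random

def sample_y(lam, b, rng):
    """Draw y = G_b(lam * eps) with eps ~ IG(1, 1)."""
    g = rng.expovariate(1.0)               # Exp(1) draw
    eps = 1.0 / g if g > 0 else math.inf   # its reciprocal is IG(1, 1)
    x = lam * eps
    return sum(1 for t in b if t <= x)     # number of thresholds at or below x

rng = random.Random(0)
lam, b = 2.0, [1.0, 3.0, 10.0]   # hypothetical [WH^T]_ui and thresholds (V = 3)
draws = [sample_y(lam, b, rng) for _ in range(20000)]

# Empirical and closed-form values of P[y <= v] for v = 0, ..., V - 1
empirical = [sum(y <= v for y in draws) / len(draws) for v in range(len(b))]
closed_form = [math.exp(-lam / t) for t in b]
```

With this many draws, the two lists should agree to roughly two decimal places.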

SLIDE 14

Interpretation

◮ V dependent Bernoulli models:

$$\{y_{ui} > v\} \sim \mathrm{Bern}\left(1 - e^{-[\mathbf{W}\mathbf{H}^T]_{ui}\, b_v^{-1}}\right), \qquad v \in \{0, \dots, V - 1\}$$

  • V = 1: Bernoulli-Poisson factorization (BePoF) [Acharya et al., 2015]
  • ... Poisson factorization (PF) [Gopalan et al., 2015] applied to binary data

[Figure: an ordinal matrix Y with V = 3 (4 classes) and the three binary matrices Y > 0, Y > 1 and Y > 2]
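The Bernoulli view can be made concrete by differencing the c.d.f. to obtain the probability of each class. The following is a minimal sketch under the conventions assumed here (thresholds stored as `[b_0, ..., b_{V-1}]`, with b_V = +∞ implicit); `lam` stands for one entry of W Hᵀ, and setting b_0 = 1 in the V = 1 case is an illustrative choice rather than something stated on the slide.

```python
import math

def class_probabilities(lam, b):
    """Return [P[y = 0], ..., P[y = V]] under IG(1, 1) noise, with lam = [WH^T]_ui.

    b = [b_0, ..., b_{V-1}] is increasing; b_V = +inf is implicit.
    """
    cdf = [math.exp(-lam / t) for t in b] + [1.0]   # P[y <= v] for v = 0..V
    return [cdf[0]] + [cdf[v] - cdf[v - 1] for v in range(1, len(cdf))]

probs = class_probabilities(1.5, [1.0, 4.0, 20.0])  # hypothetical values, V = 3

# With V = 1 and b_0 = 1, P[y = 1] = 1 - exp(-lam): the Bernoulli-Poisson link.
p0, p1 = class_probabilities(1.5, [1.0])
```

The probabilities sum to one by construction, since the differences of the c.d.f. telescope.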

SLIDE 16

Bayesian Inference

◮ Bayesian inference:

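  • Priors: $w_{uk} \sim \mathrm{Gamma}(\alpha^W, \beta^W_u)$ and $h_{ik} \sim \mathrm{Gamma}(\alpha^H, \beta^H_i)$
  • Variational inference (VI): $p(\mathbf{W}, \mathbf{H} \mid \mathbf{Y}) \approx q(\mathbf{W})\, q(\mathbf{H})$

◮ Log-likelihood, with $\Delta_v = b_{v-1}^{-1} - b_v^{-1}$:

$$\log P[y_{ui} = v \mid \mathbf{W}, \mathbf{H}] =
\begin{cases}
-[\mathbf{W}\mathbf{H}^T]_{ui}\, b_0^{-1}, & \text{if } v = 0 \\
-[\mathbf{W}\mathbf{H}^T]_{ui}\, b_v^{-1} + \log\left(1 - e^{-[\mathbf{W}\mathbf{H}^T]_{ui}\, \Delta_v}\right), & \text{if } v > 0
\end{cases}$$

◮ Non-conjugate model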
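The piecewise log-likelihood can be evaluated directly, as in the sketch below. It assumes the convention 1/b_V = 0 for the last class and uses `expm1` for the `log(1 - exp(-a))` term, which is a numerical choice made here and not something taken from the talk; the function name and the threshold values are hypothetical.

```python
import math

def log_likelihood(v, lam, b):
    """log P[y = v | lam] under IG(1, 1) noise, with lam = [WH^T]_ui > 0.

    b = [b_0, ..., b_{V-1}] is increasing; 1 / b_V = 0 is implicit.
    """
    inv_b = [1.0 / t for t in b] + [0.0]     # b_v^{-1} for v = 0..V
    if v == 0:
        return -lam * inv_b[0]
    delta_v = inv_b[v - 1] - inv_b[v]        # Delta_v = b_{v-1}^{-1} - b_v^{-1}
    # log(1 - exp(-a)) written as log(-expm1(-a)) for better accuracy at small a
    return -lam * inv_b[v] + math.log(-math.expm1(-lam * delta_v))

b = [1.0, 4.0, 20.0]                         # hypothetical thresholds, V = 3
total = sum(math.exp(log_likelihood(v, 1.5, b)) for v in range(len(b) + 1))
```

Exponentiating and summing over all classes should give one, which is a cheap check that the two branches are consistent with the c.d.f.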

SLIDE 18

Model Augmentation

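◮ Trick: model augmentation similar to [Acharya et al., 2015]:

$$n_{ui} \mid y_{ui}, \mathbf{W}, \mathbf{H} \sim
\begin{cases}
\delta_0, & \text{if } y_{ui} = 0 \\
\mathrm{ZTP}\left([\mathbf{W}\mathbf{H}^T]_{ui}\, \Delta_{y_{ui}}\right), & \text{if } y_{ui} > 0
\end{cases}$$

where ZTP denotes the zero-truncated Poisson distribution.

  • Joint likelihood: generalized Kullback-Leibler divergence
  • Scales with the number of non-zero entries in Y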
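To make the augmentation concrete, here is a small sketch of how the latent count could be drawn. The zero-truncated Poisson sampler uses plain c.d.f. inversion, which is only one of several possible choices; the helper names, the `delta` list layout (`delta[v - 1]` holding Δ_v) and the numeric values are assumptions made for this example.

```python
import math
import random

def sample_ztp(rate, rng):
    """Draw n >= 1 from a zero-truncated Poisson(rate) by CDF inversion."""
    u = rng.random()
    n = 1
    p = rate * math.exp(-rate) / -math.expm1(-rate)  # P[n = 1]
    cum = p
    while u > cum and p > 0.0:
        n += 1
        p *= rate / n                                # P[n] = P[n - 1] * rate / n
        cum += p
    return n

def sample_latent_count(y, lam, delta, rng):
    """n | y: 0 if y = 0, else ZTP(lam * Delta_y); delta[v - 1] holds Delta_v."""
    return 0 if y == 0 else sample_ztp(lam * delta[y - 1], rng)

rng = random.Random(0)
samples = [sample_ztp(2.0, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)   # theoretical mean: rate / (1 - exp(-rate))
```

Entries with y = 0 contribute a deterministic zero count, so only the non-zero entries of Y require any sampling or bookkeeping.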

◮ Threshold optimization: working on the decrement sequence $\Delta$ (defined as $\Delta_v = b_{v-1}^{-1} - b_v^{-1}$) rather than on the threshold sequence $\mathbf{b}$

  • Very simple update rules for $\mathbf{b}$
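The change of variables between thresholds and decrements can be sketched as follows, assuming the convention 1/b_V = 0 so that 1/b_{v-1} is the sum of Δ_v, ..., Δ_V. This only illustrates the reparameterization, not the update rules themselves; the function names and values are hypothetical.

```python
def thresholds_from_deltas(delta):
    """Recover [b_0, ..., b_{V-1}] from delta = [Delta_1, ..., Delta_V] (all > 0)."""
    b, inv_b = [], 0.0                 # start from 1 / b_V = 0
    for d in reversed(delta):          # Delta_V, ..., Delta_1
        inv_b += d                     # 1 / b_{v-1} = 1 / b_v + Delta_v
        b.append(1.0 / inv_b)
    return b[::-1]                     # thresholds in increasing order

def deltas_from_thresholds(b):
    """Delta_v = 1 / b_{v-1} - 1 / b_v for v = 1..V, with 1 / b_V = 0."""
    inv_b = [1.0 / t for t in b] + [0.0]
    return [inv_b[v - 1] - inv_b[v] for v in range(1, len(inv_b))]

b = thresholds_from_deltas([0.5, 0.3, 0.2])   # hypothetical decrements, V = 3
```

Any sequence of positive decrements maps back to an increasing threshold sequence, so the ordering constraint on b does not need to be enforced separately.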

SLIDE 19

Experimental Results

◮ MovieLens dataset

  • Ratings of users on movies on a scale from 1 to 10
  • Class 0: absence of a rating

◮ Splitting of $\mathbf{Y}$:

  • $\mathbf{Y}^{\text{train}}$: 80% of the non-zero values
  • $\mathbf{Y}^{\text{test}}$: remaining 20%

SLIDE 20

Prediction Results

◮ Normalized discounted cumulative gain (NDCG)

  • Ranking metric
  • Relevance: $\mathrm{rel}(u, i) = \mathbb{1}[y^{\text{test}}_{ui} \ge s]$

Table: Recommendation performance (NDCG@100 with threshold s). R: raw data. B: binary data

Model    Data      K     s = 1   s = 4   s = 6   s = 8   s = 10
OrdNMF   R         150   0.444   0.444   0.439   0.414   0.353
BePoF    B (≥ 1)   50    0.433   0.430   0.421   0.383   0.310
PF       B (≥ 1)   100   0.431   0.428   0.418   0.380   0.306
BePoF    B (≥ 8)   50    0.389   0.393   0.399   0.408   0.369
PF       B (≥ 8)   150   0.386   0.389   0.395   0.403   0.365
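For readers unfamiliar with the metric, the sketch below computes NDCG@k for a single user with the binary relevance defined above. It uses one common definition (unit gains with a log2 position discount); the exact variant used in the experiments is not specified on the slide, and the function name, item ids and ratings are invented for the example.

```python
import math

def ndcg_at_k(ranked_items, y_test, s, k):
    """NDCG@k for one user with binary relevance rel = 1[y_test[item] >= s].

    ranked_items: item ids sorted by decreasing predicted score.
    y_test: dict mapping item id -> held-out ordinal feedback (missing means 0).
    """
    rel = [1.0 if y_test.get(i, 0) >= s else 0.0 for i in ranked_items[:k]]
    dcg = sum(r / math.log2(pos + 2) for pos, r in enumerate(rel))
    n_relevant = sum(1 for y in y_test.values() if y >= s)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(min(k, n_relevant)))
    return dcg / idcg if idcg > 0 else 0.0

y_test = {"a": 9, "b": 3, "c": 8}   # hypothetical held-out ratings
score = ndcg_at_k(["b", "a", "d", "c"], y_test, s=8, k=3)
```

Raising the threshold s shrinks the set of relevant items, which is why the same ranking can score differently across the columns of the table.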

SLIDE 21

Posterior Predictive Check (PPC)

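◮ Generating new data based on the posterior predictive distribution:

$$p(\mathbf{Y}^*, \mathbf{W}, \mathbf{H} \mid \mathbf{Y}) \approx p(\mathbf{Y}^* \mid \mathbf{W}, \mathbf{H})\, q(\mathbf{W})\, q(\mathbf{H})$$

[Plot: number of occurrences (log scale, $10^4$ to $10^7$) of each class from 1 to 10; legend: Truth (2.47), OrdNMF (2.49)]

Figure: PPC of the distribution of the classes in the MovieLens dataset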

SLIDE 22

Conclusion

◮ Take-home message:

  • NMF framework to process ordinal data
  • Natural extension of BePoF
  • Efficient variational algorithm that scales with the number of non-zero values
  • Flexibility of OrdNMF

◮ More information:

  • GitHub: https://github.com/Oligou/OrdNMF
  • Contact: oliviergouvert@gmail.com

SLIDE 23

References I

[Acharya et al., 2015] Acharya, A., Ghosh, J., and Zhou, M. (2015). Nonparametric Bayesian Factor Analysis for Dynamic Count Matrices. In Proc. International Conference on Artificial Intelligence and Statistics (AISTATS).

[Chu and Ghahramani, 2005] Chu, W. and Ghahramani, Z. (2005). Gaussian processes for ordinal regression. The Journal of Machine Learning Research, 6(Jul):1019–1041.

[Gopalan et al., 2015] Gopalan, P., Hofman, J. M., and Blei, D. M. (2015). Scalable Recommendation with Hierarchical Poisson Factorization. In Proc. Conference on Uncertainty in Artificial Intelligence (UAI), pages 326–335.

[Hernandez-Lobato et al., 2014] Hernandez-Lobato, J. M., Houlsby, N., and Ghahramani, Z. (2014). Probabilistic Matrix Factorization with Non-random Missing Data. In Proc. International Conference on Machine Learning (ICML), pages 1512–1520.

SLIDE 24

References II

[Lee and Seung, 1999] Lee, D. D. and Seung, H. S. (1999). Learning the Parts of Objects by Non-negative Matrix Factorization. Nature, 401(6755):788–791.

[Paquet et al., 2012] Paquet, U., Thomson, B., and Winther, O. (2012). A hierarchical model for ordinal matrix factorization. Statistics and Computing, 22(4):945–957.

[Stevens et al., 1946] Stevens, S. S. et al. (1946). On the theory of scales of measurement. Science.
