

SLIDE 1

Online-Updating Regularized Kernel Matrix Factorization Models for Large-Scale Recommender Systems.

Steffen Rendle, Lars Schmidt-Thieme

University of Hildesheim, Germany

February 10th, 2012

Steffen Rendle, Lars Schmidt-Thieme Online-Updating RKMF Models for Large-Scale Recommenders

SLIDE 2

Outline

Motivation Related work Matrix Factorization (MF) Kernel Matrix Factorization (KMF) Learning Matrix Factorization Models SVD versus Regularized KMF Online Updates Evaluation

SLIDE 3

Motivation

Recommender systems predict how much a user likes a given item. This is a matrix completion task: a matrix R : |U| × |I| should be completed, where the entry r_{u,i} represents the rating of user u for item i. The set S of observed ratings contains triples (u, i, v). Matrix factorization estimates R by an approximation R̂, from which new ratings are predicted.
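Since most entries of R are unobserved, S is typically stored as a list of (u, i, v) triples rather than a dense matrix; a minimal sketch in Python (data and helper name are illustrative):

```python
# S holds the observed ratings as (u, i, v) triples; the full |U| x |I|
# matrix is never materialized since most entries are unknown.
S = [
    (0, 0, 5.0), (0, 2, 3.0),
    (1, 1, 4.0),
    (2, 0, 1.0), (2, 2, 2.0),
]

def sparsity(n_users, n_items, S):
    """Fraction of the rating matrix that the model must fill in."""
    return 1.0 - len(S) / (n_users * n_items)

print(sparsity(3, 3, S))  # 5 of 9 entries observed
```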

SLIDE 4

Motivation

The dynamics of recommender systems often require recomputing the prediction model, e.g., when a new user enters the system. Models for large-scale recommenders are typically static and do not reflect ratings that arrive after training (the new-user problem). The profile C(u, ·) of a user u grows from 0 to k ratings, where: C(u, ·) := {r_{u′,i′} ∈ S | u′ = u}

SLIDE 5

Related work

Different approaches for rating prediction:

  • Collaborative filtering based on the k-nearest-neighbor method (kNN) [SKKR01].
  • Latent semantic models [Hof04] and classifiers [ST05].
  • Models based on matrix factorization (MF) [Wu07].

SLIDE 6

Matrix factorization (MF)

The goal is to approximate the true, unobserved rating matrix R by R̂ : |U| × |I|:

R̂ = W · Hᵗ, where W : |U| × k and H : |I| × k

w_u holds the k features that describe user u; h_i holds the k features that describe item i. A single prediction is the inner product

r̂_{u,i} = ⟨w_u, h_i⟩ = Σ_{f=1}^{k} w_{u,f} · h_{i,f}

Often a bias term b_{u,i} is added to center the approximation.
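The factorization and the per-entry prediction can be computed directly with NumPy (the toy matrices are illustrative):

```python
import numpy as np

k = 2                      # number of latent features
W = np.array([[1.0, 0.5],  # one row of k features per user
              [0.2, 1.0]])
H = np.array([[0.4, 2.0],  # one row of k features per item
              [1.0, 0.0],
              [0.5, 0.5]])

# Full approximation R_hat = W . H^t
R_hat = W @ H.T

# A single prediction is just the inner product <w_u, h_i>
u, i = 0, 2
r_ui = W[u] @ H[i]
assert np.isclose(R_hat[u, i], r_ui)
print(r_ui)  # 1.0*0.5 + 0.5*0.5 = 0.75
```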

SLIDE 7

Kernel Matrix Factorization

Interactions between the feature vectors w_u and h_i are kernelized:

r̂_{u,i} = a + c · K(w_u, h_i)

The terms a and c rescale the approximation. The kernel K : R^k × R^k → R can be one of the following well-known kernels:

  • linear: K_l(w_u, h_i) = ⟨w_u, h_i⟩
  • polynomial: K_p(w_u, h_i) = (1 + ⟨w_u, h_i⟩)^d
  • RBF: K_r(w_u, h_i) = exp(−‖w_u − h_i‖² / (2σ²))
  • logistic: K_s(w_u, h_i) = φ(b_{u,i} + ⟨w_u, h_i⟩), where φ(x) = 1 / (1 + e^{−x})
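The four kernels and the rescaled prediction are straightforward to sketch (parameter defaults are illustrative):

```python
import numpy as np

def k_linear(w, h):
    return w @ h

def k_poly(w, h, d=2):
    return (1.0 + w @ h) ** d

def k_rbf(w, h, sigma=1.0):
    return np.exp(-np.sum((w - h) ** 2) / (2.0 * sigma ** 2))

def k_logistic(w, h, b=0.0):
    return 1.0 / (1.0 + np.exp(-(b + w @ h)))

def predict(w, h, kernel, a=0.0, c=1.0, **kw):
    """Kernelized prediction r_hat = a + c * K(w_u, h_i)."""
    return a + c * kernel(w, h, **kw)

w, h = np.array([1.0, 0.0]), np.array([1.0, 1.0])
print(predict(w, h, k_logistic))            # sigmoid(1), about 0.731
# a = 1, c = 4 rescale the logistic output into a 1..5 rating scale
print(predict(w, h, k_logistic, a=1.0, c=4.0))
```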

SLIDE 8

Kernel Matrix Factorization

Benefits of using kernels:

  • A bounded kernel like the logistic one keeps the predicted ratings within the range of the application domain.
  • Non-linear correlations between users and items can be modeled.
  • Different kernels lead to different models that can be combined in an ensemble.

SLIDE 9

Non-negative matrix factorization

Additional constraints on the feature matrices W and H require every entry to be non-negative. The motivation is to eliminate negative correlations (which are commonly used in CF algorithms).
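One common way to enforce non-negativity in gradient-based learning is to project back onto the non-negative orthant after each update (a generic projected-gradient sketch, not necessarily the paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, (4, 3))   # may contain negative entries
H = rng.normal(0.0, 0.1, (5, 3))

# Projected gradient step: after an (omitted) gradient update,
# clip each entry at zero.
W = np.maximum(W, 0.0)
H = np.maximum(H, 0.0)

# All entries, and hence all predictions <w_u, h_i>, are non-negative.
print((W >= 0).all(), ((W @ H.T) >= 0).all())
```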

SLIDE 10

Learning Matrix Factorization Models.

Learning minimizes the error between the approximation R̂ and the original matrix R. The optimization task is:

argmin_{W,H} E(S, W, H), where E(S, W, H) := Σ_{r_{u,i}∈S} (r_{u,i} − r̂_{u,i})²

Overfitting: the Netflix dataset contains 480,000 users and 17,000 items with 100 million ratings. For k = 100 this means estimating about 50 million parameters, which leads to overfitting. Two counter-strategies:

  • Regularization
  • Early stopping

SLIDE 11

Regularization

A regularization term is added to the optimization task. Tikhonov regularization is used, with a parameter λ controlling its strength. The final optimization task is:

argmin_{W,H} Opt(S, W, H), where Opt(S, W, H) := E(S, W, H) + λ(‖W‖²_F + ‖H‖²_F)
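The regularized objective follows directly from the two formulas; a small sketch (the λ value and toy data are illustrative):

```python
import numpy as np

def error(S, W, H):
    """E(S, W, H): squared error over the observed ratings only."""
    return sum((v - W[u] @ H[i]) ** 2 for (u, i, v) in S)

def opt(S, W, H, lam=0.1):
    """Opt(S, W, H) = E + lambda * (||W||_F^2 + ||H||_F^2)."""
    return error(S, W, H) + lam * (np.sum(W ** 2) + np.sum(H ** 2))

W = np.array([[1.0, 0.0]])
H = np.array([[1.0, 0.0], [0.0, 1.0]])
S = [(0, 0, 1.0), (0, 1, 2.0)]
print(error(S, W, H))   # (1-1)^2 + (2-0)^2 = 4.0
print(opt(S, W, H))     # 4.0 + 0.1 * (1 + 2) = 4.3
```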

SLIDE 12

Optimization by Gradient Descent

Gradient descent is used for MF and also for KMF:

Figure: Generic learning algorithm for KMF.
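The generic algorithm in the figure can be sketched as stochastic gradient descent over the observed ratings, shown here for the linear kernel (hyperparameters and toy data are illustrative, not the paper's settings):

```python
import numpy as np

def train_rkmf(S, n_users, n_items, k=8, lr=0.02, lam=0.05,
               n_iters=300, seed=0):
    """SGD for regularized MF with the linear kernel: each observed
    rating updates only the affected rows w_u and h_i."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 0.1, (n_users, k))
    H = rng.normal(0.0, 0.1, (n_items, k))
    for _ in range(n_iters):
        for (u, i, v) in S:
            err = v - W[u] @ H[i]
            W[u] += lr * (err * H[i] - lam * W[u])
            H[i] += lr * (err * W[u] - lam * H[i])
    return W, H

S = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0), (1, 1, 2.0)]
W, H = train_rkmf(S, n_users=2, n_items=2)
print(W[0] @ H[0])  # close to the observed rating 5.0
```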

SLIDE 13

SVD versus Regularized KMF

Singular Value Decomposition (SVD) decomposes a matrix into three matrices:

R = W′ Σ H′ᵗ, where W′ : |U| × |U|, Σ : |U| × |I|, H′ : |I| × |I|

SVD is not well suited for recommender systems because:

  • The huge number of missing values has to be imputed, e.g., the sparsity rate of the Netflix dataset is 99%.
  • The lack of regularization leads to overfitting.

Figure: RMSE results on Netflix probe for RKMF and k-rank SVD.
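For reference, a rank-k SVD reconstruction on a fully imputed toy matrix can be sketched with NumPy; the zero imputation of unknown entries is exactly the weakness noted above:

```python
import numpy as np

# Toy rating matrix with missing entries imputed as 0 -- SVD needs a
# fully observed matrix, which is why it struggles on 99%-sparse data.
R = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 1.0],
              [0.0, 2.0, 4.0]])

U, s, Vt = np.linalg.svd(R, full_matrices=False)

k = 2  # keep only the k largest singular values
R_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The full decomposition reproduces R exactly; R_k is the best rank-k
# approximation of the imputed matrix in the Frobenius norm.
print(np.round(R_k, 2))
```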

SLIDE 14

Online Updates

Retraining the whole KMF model whenever a new rating arrives is infeasible: for the Netflix dataset with k = 40 features, i = 120 iterations, and |S| = 100,000,000 ratings, one full retrain amounts to 480 billion feature updates. The online-update technique instead retrains only the affected parts of the model:

Figure: Online updates for new-user problem.
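A sketch of such an online update for the new-user problem: only the feature vector w_u is retrained on the user's profile while H stays fixed (hyperparameters and toy data are illustrative):

```python
import numpy as np

def user_update(S, W, H, u, lr=0.1, lam=0.05, n_iters=100):
    """Online update: retrain only row W[u] on the profile C(u, .).
    H and all other user rows stay fixed, so the cost is independent
    of |S| and proportional to the profile size."""
    profile = [(i, v) for (u_, i, v) in S if u_ == u]
    for _ in range(n_iters):
        for (i, v) in profile:
            err = v - W[u] @ H[i]
            W[u] += lr * (err * H[i] - lam * W[u])
    return W

# New user 0 with two ratings; item features H are assumed pretrained.
H = np.array([[1.0, 0.0], [0.0, 1.0]])
W = np.zeros((1, 2))
S = [(0, 0, 4.0), (0, 1, 2.0)]
W = user_update(S, W, H, u=0)
print(W[0] @ H[0])  # close to 4.0, slightly shrunk by regularization
```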

SLIDE 15

Further Speedup

Retraining a user u on a new rating matters most for users with a small profile. The proposed rules skip some online updates once the profile is large enough, by training with probability:

P_u(train | r_{u,i}) = γ^{|C(u,·)|}, γ ∈ (0, 1)

P_u(train | r_{u,i}) = min(1, m / |C(u,·)|), m ∈ N⁺
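Both skip rules can be sketched directly (the γ and m values are illustrative):

```python
import random

def p_train_exp(profile_size, gamma=0.8):
    """Rule 1: P(train | r_ui) = gamma ** |C(u, .)|, gamma in (0, 1)."""
    return gamma ** profile_size

def p_train_inv(profile_size, m=5):
    """Rule 2: P(train | r_ui) = min(1, m / |C(u, .)|)."""
    return min(1.0, m / profile_size) if profile_size > 0 else 1.0

def should_update(profile_size, rule=p_train_inv, rng=random.random):
    """Small profiles almost always trigger an update; large ones rarely."""
    return rng() < rule(profile_size)

print(p_train_inv(3))   # 1.0  (profile still small: always train)
print(p_train_inv(50))  # 0.1  (large profile: train 10% of the time)
print(round(p_train_exp(10), 4))  # 0.8**10 ~ 0.1074
```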

SLIDE 16

Evaluation

1 Create a new-user scenario:

  • Pick n% of the users and put them in U_t.
  • For each user u ∈ U_t: split the ratings in C(u, ·) into two disjoint sets T_u and V_u, where |T_u| = min(m, |C(u, ·)| / 2) and V_u = C(u, ·) \ T_u; remove all ratings C(u, ·) from S.

2 Train the model on S: (W, H) ← Opt(S, W, H).

3 Evaluate the new-user scenario, for j = 1 … m:

  • For each user u ∈ U_t with |T_u| ≥ j: add one rating r_{u,i} ∈ T_u to S, update the model (W, H) ← USERUPDATE(S, W, H, r_{u,i}), and calculate the error se_u^j = E(V_u, W, H).
  • Calculate RMSE_j = √( Σ_{u:|T_u|≥j} se_u^j / Σ_{u:|T_u|≥j} |V_u| )
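Step 1 of this protocol can be sketched as follows (helper names and the even-split heuristic are illustrative):

```python
import random

def new_user_split(S, users, m, frac=0.1, seed=0):
    """Hold out a fraction of the users (U_t); split each held-out
    profile C(u, .) into an update set T_u and an evaluation set V_u
    with |T_u| = min(m, |C(u, .)| // 2), and remove C(u, .) from S."""
    rng = random.Random(seed)
    held_out = set(rng.sample(users, max(1, int(frac * len(users)))))
    T, V = {}, {}
    for u in held_out:
        profile = [r for r in S if r[0] == u]
        n_train = min(m, len(profile) // 2)
        T[u], V[u] = profile[:n_train], profile[n_train:]
    S_train = [r for r in S if r[0] not in held_out]
    return S_train, T, V

# 10 users, 4 ratings each; hold out 20% of them as new users.
S = [(u, i, float(i + 1)) for u in range(10) for i in range(4)]
S_train, T, V = new_user_split(S, list(range(10)), m=2, frac=0.2)
print(len(T), {u: (len(T[u]), len(V[u])) for u in T})
```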

SLIDE 17

Evaluation settings

Evaluation on two movie recommendation datasets:

Dataset   | Users   | Items  | Ratings
Netflix   | 480,000 | 17,000 | 100 million
Movielens | 6,040   | 3,706  | 1 million

SLIDE 18

Quality of recommendations

Figure: New-user / new-item problem on Movielens and Netflix. The curves show the RMSE of online updates (see protocol) compared to a full retrain.

SLIDE 19

Speedup

Figure: Runtime of a full retrain vs. the proposed online updates w.r.t. the profile size |C(u, ·)| (for the new-user problem).

SLIDE 20

Conclusion

Generic online-update methods for RKMF models. A precise approximation of the fully retrained model without the cost of a whole retraining process. The runtime complexity makes online updates feasible for real-world datasets.

SLIDE 21

References

  • [Hof04] T. Hofmann. Latent semantic models for collaborative filtering. ACM Transactions on Information Systems, 22(1):89–115, 2004.
  • [SKKR01] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide Web (WWW), pages 285–295. ACM, 2001.
  • [ST05] L. Schmidt-Thieme. Compound classification models for recommender systems. In Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM), 2005.
  • [Wu07] M. Wu. Collaborative filtering via ensembles of matrix factorizations. In Proceedings of KDD Cup and Workshop, 2007.
