

SLIDE 1

CSE 6240: Web Search and Text Mining. Spring 2020

Deep Learning Based Recommendation Systems

  • Prof. Srijan Kumar

http://cc.gatech.edu/~srijan

SLIDE 2

Today’s Lecture

  • Introduction
  • Neural Collaborative Filtering
  • RRN
  • LatentCross
  • JODIE

Reference paper: Deep Learning based Recommender System: A Survey and New Perspectives. Zhang et al., ACM CSUR 2019.

SLIDE 3

Deep Recommender Systems

  • How can deep learning advance recommendation systems?
  • A simple way for content-based models: use CNNs and LSTMs to generate image and text features of items

SLIDE 4

Deep Recommender Systems

  • But how can DL be used for tasks and methods at the core of recommendation systems?

– For collaborative filtering?
– For latent factor models?
– For temporal dynamics?
– Some new techniques?

SLIDE 5

Why Deep Learning Techniques

Pros:

  • Capture non-linearity well
  • Non-manual representation learning
  • Efficient sequence modeling
  • Somewhat flexible and easy to retrain

Cons:

  • Lack of interpretability
  • Large data requirements
  • Extensive hyper-parameter tuning
SLIDE 6

Applicable DL Techniques

Deep Learning methods:

  • MLPs and AutoEncoders
  • CNNs
  • RNNs
  • Adversarial Networks
  • Attention models
  • Deep reinforcement learning

How to use these methods to improve recommender systems?

SLIDE 7

Today’s Lecture

  • Introduction
  • Neural Collaborative Filtering
  • Recurrent Recommender Networks
  • LatentCross
  • JODIE

Reference paper: Neural Collaborative Filtering. Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, Tat-Seng Chua. WWW 2017.
SLIDE 8

Matrix Factorization

  • MF uses an inner product as the interaction function (sketched below)

– Latent factors are independent of each other

  • Limitation: the simple choice of the inner product function can limit the expressiveness of an MF model
  • Potential solution: increase the number of factors. However,

– This increases the complexity of the model
– Leads to overfitting
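To make the limitation concrete, here is a minimal sketch (NumPy; names are illustrative, not from the slides) of the fixed inner-product interaction function that MF uses:

    import numpy as np

    def mf_score(p_u: np.ndarray, q_i: np.ndarray) -> float:
        # r_hat_ui = p_u . q_i: latent factors combine only linearly and
        # pairwise, which is what limits the expressiveness of MF
        return float(np.dot(p_u, q_i))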

SLIDE 9

Improving Matrix Factorization

  • Key question: How can we improve matrix factorization?
  • Answer: Learn the relation between factors from the data, rather than fixing it to be the simple inner product

– Does not increase the complexity
– Does not lead to overfitting

  • One solution: Neural Collaborative Filtering
SLIDE 10

Neural Collaborative Filtering

  • Neural Collaborative Filtering (NCF) is a deep learning version of the traditional recommender system
  • Learns the interaction function with a deep neural network

– Uses non-linear functions, e.g., multi-layer perceptrons, to learn the interaction function
– Models the data well when latent factors are not independent of each other, which is especially true in large real datasets

SLIDE 11

Neural Collaborative Filtering

  • Neural extension of the traditional recommender system
  • Input: rating matrix, plus user profile and item features (optional)

– If user/item features are unavailable, we can use one-hot vectors

  • Output: user and item embeddings, prediction scores
  • Traditional matrix factorization is a special case of NCF

SLIDE 12

NCF Setup

  • User feature vector: v_u
  • Item feature vector: v_i
  • User embedding matrix: U
  • Item embedding matrix: I
  • Neural network: f
  • Neural network parameters: Θ
  • Predicted rating: ŷ_ui = f(U^T v_u, I^T v_i | U, I, Θ)
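A minimal PyTorch sketch of this setup, assuming one-hot user/item inputs (so the feature vectors reduce to IDs); the layer sizes and the choice of f are illustrative, not prescribed by the paper:

    import torch
    import torch.nn as nn

    class NCF(nn.Module):
        def __init__(self, n_users: int, n_items: int, dim: int = 32):
            super().__init__()
            self.U = nn.Embedding(n_users, dim)  # user embedding matrix U
            self.I = nn.Embedding(n_items, dim)  # item embedding matrix I
            # f: the interaction function, a small MLP with parameters Θ
            self.f = nn.Sequential(
                nn.Linear(2 * dim, dim), nn.ReLU(),
                nn.Linear(dim, 1), nn.Sigmoid(),
            )

        def forward(self, u: torch.Tensor, i: torch.Tensor) -> torch.Tensor:
            # ŷ_ui = f(U^T v_u, I^T v_i | Θ); with one-hot v_u, U^T v_u is a lookup
            z = torch.cat([self.U(u), self.I(i)], dim=-1)
            return self.f(z).squeeze(-1)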
SLIDE 13

NCF Model Architecture

  • Multiple fully connected layers form the Neural CF layers
  • Output is the predicted rating score ŷ_ui
  • Real rating score is r_ui

SLIDE 14

1-Layer NCF

  • Layer 1 is an element-wise product of the user and item embeddings
  • Output layer is a fully connected layer without bias (sketched below)
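A sketch of this 1-layer case (GMF in the paper); fixing the output weights h to all ones recovers plain MF, which is why MF is a special case of NCF:

    import torch
    import torch.nn as nn

    class OneLayerNCF(nn.Module):
        def __init__(self, n_users: int, n_items: int, dim: int = 32):
            super().__init__()
            self.U = nn.Embedding(n_users, dim)
            self.I = nn.Embedding(n_items, dim)
            self.h = nn.Linear(dim, 1, bias=False)  # output layer, no bias

        def forward(self, u: torch.Tensor, i: torch.Tensor) -> torch.Tensor:
            # layer 1: element-wise product; output: ŷ_ui = σ(h^T (p_u ⊙ q_i))
            return torch.sigmoid(self.h(self.U(u) * self.I(i))).squeeze(-1)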

SLIDE 15

Multi-Layer NCF

  • Each layer is a fully connected layer with a non-linearity on top; together, the layers form a multi-layer perceptron
  • Final score is used to calculate the loss and train the layers

SLIDE 16

NCF model: Loss function

  • Train on the difference between the predicted rating and the real rating
  • Use negative sampling to reduce the number of negative data points
  • Loss = cross-entropy loss (see the sketch below)
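A minimal sketch of this objective, assuming implicit 0/1 feedback and a model like the NCF sketch above; the uniform negative sampler and num_neg are illustrative choices:

    import torch
    import torch.nn.functional as F

    def ncf_loss(model, users, pos_items, n_items, num_neg=4):
        # observed interactions are the positives (label 1)
        pos = model(users, pos_items)
        # negative sampling: a few unobserved items per positive (label 0),
        # ignoring the small chance of drawing an observed pair
        neg_users = users.repeat_interleave(num_neg)
        neg_items = torch.randint(0, n_items, (len(neg_users),))
        neg = model(neg_users, neg_items)
        y = torch.cat([torch.ones_like(pos), torch.zeros_like(neg)])
        return F.binary_cross_entropy(torch.cat([pos, neg]), y)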
SLIDE 17

Experimental Setup

  • Two public datasets: MovieLens, Pinterest

– Transform MovieLens ratings to the 0/1 implicit case

  • Evaluation protocols:

– Leave-one-out setting: hold out the latest rating of each user as the test item
– Top-k evaluation: create a ranked list of items

  • Evaluation metric:

– Hit Ratio: does the correct item appear in the top 10? (sketched below)
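A sketch of the Hit Ratio under this protocol; following the NCF paper, the held-out item is assumed to be ranked against 100 sampled negative items:

    import torch

    def hit_ratio_at_10(model, user, held_out_item, neg_items):
        # score the held-out item (placed at index 0) with the sampled negatives
        items = torch.cat([held_out_item.view(1), neg_items])
        scores = model(user.repeat(len(items)), items)
        # hit if the held-out item lands in the top-10 of the ranked list
        return bool((torch.topk(scores, k=10).indices == 0).any())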
SLIDE 18

Baselines

  • Item Popularity

– Items are ranked by their popularity

  • ItemKNN [Sarwar et al, WWW’01]

– The standard item-based CF method

  • BPR [Rendle et al, UAI’09]

– Bayesian Personalized Ranking optimizes the MF model with a pairwise ranking loss

  • eALS [He et al, SIGIR’16]

– The state-of-the-art CF method for implicit data; it optimizes the MF model with a varying-weighted regression loss

SLIDE 19

Performance vs. Embedding Size

  • NeuMF > eALS and BPR (5% improvement)
  • NeuMF > MLP (MLP has lower training loss but higher test loss)

SLIDE 20

Convergence Behavior

  • Most effective updates in the first 10 iterations
  • More iterations make NeuMF overfit
  • Trade-off between the representation ability and the generalization ability of a model

SLIDE 21

Is Deeper Helpful?

  • With the same number of factors, more nonlinear layers improve the performance
  • Linear layers degrade the performance
  • The improvement diminishes as more layers are added
SLIDE 22

NCF: Shortcomings

  • Architecture is limited
  • NCF does not model the temporal behavior of users or items

– Recall: users and items exhibit temporal bias
– NCF has the same input for a user at all times

  • Non-inductive: new users and new items, on which training was not done, cannot be processed

SLIDE 23

Today’s Lecture

  • Introduction
  • Neural Collaborative Filtering
  • RRN
  • LatentCross
  • JODIE
SLIDE 24

RRN

  • RRN = Recurrent Recommender Networks
  • One of the first methods to model the temporal evolution of user and item behavior
  • Reference paper: Recurrent Recommender Networks. C.-Y. Wu, A. Ahmed, A. Beutel, A. Smola, H. Jing. WSDM 2017.

SLIDE 25

Traditional Methods

  • Existing models assume user and item states are stationary

– States = embeddings, hidden factors, representations

  • However, user preferences and item states change over time
  • How to model this?
  • Key idea: use RNNs to learn the evolution of user embeddings

SLIDE 26

User Preferences

  • User preference changes over time

[Figure: a user's movie preferences 10 years ago vs. now]

SLIDE 27

Item States

  • Movie reception changes over time

[Figure: a movie's reception shifting from "Bad movie" to "So bad that it's great to watch"]

SLIDE 28

Exogenous Effects

[Figure: "La La Land" won big at the Golden Globes]

SLIDE 29

Seasonal Effects

[Figure: Christmas movies are only watched around Christmas]

SLIDE 30

Traditional Methods

  • Traditional matrix factorization, including NCF, assumes user state u_i and item state m_j are fixed and independent of each other
  • Both are used to make predictions about the rating score r_ij
  • Figure: latent variable block diagram of traditional MF

SLIDE 31

RRN Framework

  • RRN innovates by modeling temporal dynamics within each user state u_i and movie state m_j
  • u_i^t depends on u_i^{t-1} and influences u_i^{t+1} (recurrence sketched below)

– Same for movies

  • User and item states are independent of each other
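Written out, a hedged reconstruction of this recurrence (notation adapted; the paper instantiates the RNNs as LSTMs):

    u_i^{t} = \mathrm{RNN}_{\text{user}}\left(u_i^{t-1}, x_i^{t}\right), \qquad m_j^{t} = \mathrm{RNN}_{\text{movie}}\left(m_j^{t-1}, x_j^{t}\right)

where x_i^t and x_j^t are the interaction inputs at step t (described on the User RNN and Movie RNN slides).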

SLIDE 32

Model Learning Setting

  • Actions happen over time
  • How to split training and testing data to respect the time dependency?

SLIDE 33

Traditional Random Split: N/A

  • Random train/test split violates the temporal dependency

– Future actions can be in train, while past actions can be in test

SLIDE 34

Realistic Learning Setting

  • Train on the first K% of the data and test on the last data points (a split sketch follows)
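A minimal sketch of such a time-respecting split (function and field layout are illustrative):

    def temporal_split(events, frac=0.9):
        # events: iterable of (user, item, rating, timestamp) tuples
        events = sorted(events, key=lambda e: e[3])  # order by timestamp
        k = int(len(events) * frac)
        return events[:k], events[k:]  # train on the first K%, test on the rest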

SLIDE 35

RRN Model

  • Train two RNNs: one for all users and another for all movies

– User RNN parameters are shared across all users; same for movies

[Figure: the user RNN and the movie RNN]

SLIDE 36

RRN Process

  • Initialization: user and movie embeddings are initialized

– Initialization can be one-hot

  • Embedding update: each observed interaction updates the embeddings via the RNNs
  • Prediction: to predict the rating a user gives to a movie, the user's embedding is multiplied with the movie's embedding
  • Loss: the user-movie rating score prediction error is used to update the RNN parameters

SLIDE 37

User RNN

  • User RNN takes a user's (movie, rating) sequence (an update sketch follows)

– Each input: concatenation of the movie embedding and a one-hot vector of the rating score
– RNN initialization: special 'new' vector to indicate a new user

  • For the next user, the process is repeated, starting from initialization
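A minimal sketch of one user-state update under these assumptions, using a single LSTM cell whose parameters are shared across all users (sizes and names are illustrative):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    dim, n_ratings = 32, 5
    user_rnn = nn.LSTMCell(dim + n_ratings, dim)  # parameters shared by every user

    def update_user_state(state, movie_emb, rating):
        # state: (h, c) pair; rating: integer class in {0, ..., n_ratings - 1}
        # input = [movie embedding ; one-hot vector of the rating score]
        x = torch.cat([movie_emb, F.one_hot(rating, n_ratings).float()], dim=-1)
        return user_rnn(x, state)  # new (h, c) user state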

SLIDE 38

Movie RNN

  • Movie RNN takes the movie's (user, rating) sequence

– Each input: concatenation of the user embedding and a one-hot vector of the rating score
– RNN initialization: special 'new' vector to indicate a new movie

  • For the next movie, the process is repeated, starting from initialization

SLIDE 39

Rating Prediction

  • What is the rating given by user u_i to movie m_j at time t?
  • Take the user and movie embeddings up to time t and output the rating
  • Output function: MLP, Hadamard product, etc. (a minimal sketch follows)
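A minimal sketch of one such output function, the summed Hadamard product (i.e., a dot product of the two states); an MLP readout would be a drop-in replacement:

    import torch

    def predict_rating(u_state: torch.Tensor, m_state: torch.Tensor) -> torch.Tensor:
        # r_hat_{ij|t} = <u_i^t, m_j^t>: element-wise product, then sum
        return (u_state * m_state).sum(dim=-1)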


SLIDE 40

Model Training

  • Learn the model parameters θ such that the predicted rating is close to the actual rating (objective reconstructed below)
  • R(θ) is a regularization term to avoid overfitting
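Putting the two statements together, a hedged reconstruction of the training objective (squared error is assumed, since the experiments report RMSE):

    \min_{\theta} \sum_{(i,j,t)} \left( \hat{r}_{ij|t}(\theta) - r_{ij|t} \right)^2 + R(\theta)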
SLIDE 41

Experiments

  • Three datasets, several baselines

– PMF: Salakhutdinov & Mnih, NIPS '07
– T-SVD: Koren, KDD '09
– U-AR & I-AR: Sedhain et al., WWW '15

  • Metric = RMSE (Root Mean Square Error)

SLIDE 42

Temporal Effects

  • How well does the model capture the temporal effects?

SLIDE 43

Exogenous Effects

  • RRN automatically captures the exogenous effects

[Figure: rating dynamics around the Oscars & Golden Globes]

SLIDE 44

System Effects

  • RRN automatically learns the system effects

[Figure: rating shift after Netflix changed the Likert scale]

SLIDE 45

Movie Age Effect

  • RRN automatically learns effects that we typically capture via hand-crafted features

[Figure: movie age effects]

SLIDE 46

RRN Summary

  • Novel model
  • Future prediction
  • Accurate prediction
  • Temporal dynamics

SLIDE 47

Next Lecture

  • Introduction
  • Neural Collaborative Filtering
  • RRN
  • LatentCross
  • JODIE