

SLIDE 1

Neighborhood-Enhanced Transfer Learning for One-Class Collaborative Filtering

Wanling Cai1,2, Jiongbin Zheng1, Weike Pan1∗, Jing Lin1, Lin Li1, Li Chen2, Xiaogang Peng1∗ and Zhong Ming1∗

cswlcai@comp.hkbu.edu.hk, jiongbin92@gmail.com, panweike@szu.edu.cn, linjing4@email.szu.edu.cn, lilin20171@email.szu.edu.cn, lichen@comp.hkbu.edu.hk, pengxg@szu.edu.cn, mingz@szu.edu.cn

1College of Computer Science and Software Engineering

Shenzhen University, Shenzhen, China

2Department of Computer Science

Hong Kong Baptist University, Hong Kong, China


SLIDE 2

Introduction

Problem Definition

One-Class Collaborative Filtering

Input: a set of (user, item) pairs P = {(u, i)}, where each (u, i) pair means that user u has given positive feedback on item i.

Goal: recommend to each user u ∈ U a personalized ranked list of items from the set of unobserved items, i.e., I\Pu.
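To make the input and output concrete, here is a minimal Python sketch of the data structures involved; the toy IDs and variable names are our illustration only, not anything from the paper.

```python
from collections import defaultdict

# P: observed positive (user, item) pairs; the IDs below are toy values.
P = {(0, 3), (0, 7), (1, 3), (2, 5)}
I = set(range(10))  # the item universe

# P_u: the items observed by each user u.
P_u = defaultdict(set)
for u, i in P:
    P_u[u].add(i)

# The candidate items to rank for user u are the unobserved ones, I \ P_u.
candidates = {u: I - P_u[u] for u in P_u}
```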


SLIDE 3

Introduction

Challenges

1. The sparsity of observed feedback.

2. The ambiguity of unobserved feedback.


SLIDE 4

Introduction

Overview of Our Solution

Figure: Illustration of our transfer learning solution

Transfer by Neighborhood-Enhanced Factorization (TNF)

We first extract the local knowledge of neighborhood information among users. We then transfer it to a global preference learning task in an enhanced factorization-based framework.


SLIDE 5

Introduction

Advantages of Our Solution

Our TNF is able to inherit the merits of the localized neighborhood-based methods and the globalized factorization-based methods.

Notice that neighborhood-based methods and factorization-based methods are rarely studied in one single framework or solution for OCCF.

The factored representation of users and items allows TNF to capture and model transitive relations within a group of close neighbors on datasets of low density.


SLIDE 6

Introduction

Notations

n: number of users
m: number of items
u ∈ U: user ID
i, i′ ∈ I: item ID
R = {(u, i)}: universe of all possible (user, item) pairs
P = {(u, i)}: the whole set of observed (user, item) pairs
A, |A| = ρ|P|: a sampled set of negative feedback from R\P
Iu: item set observed by user u
d: number of latent dimensions
bu ∈ ℝ: user bias
bi ∈ ℝ: item bias
Vi· ∈ ℝ^{1×d}: item-specific latent feature vector
Xu′· ∈ ℝ^{1×d}: user-specific latent feature vector
Nu: a set of nearest neighbors of user u
r̂ui: predicted preference of user u to item i
αx, αv, βu, βv: trade-off parameters on the regularization terms
γ: learning rate
T: iteration number in the algorithm


SLIDE 7

Method

Neighborhood Construction

In order to extract the local knowledge from the records of users' behaviors, we first calculate the cosine similarity between user u and user w,

$s_{uw} = \frac{|I_u \cap I_w|}{\sqrt{|I_u|} \sqrt{|I_w|}}$,

where $|I_u|$, $|I_w|$ and $|I_u \cap I_w|$ denote the number of items observed by user u, by user w, and by both user u and user w, respectively. We can then obtain a set of the most similar users of each user u to construct a neighborhood $N_u$.
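As an illustration (not the authors' released code), the similarity and the neighborhood construction can be sketched as follows; the brute-force pairwise loop and the function names are our assumptions.

```python
import heapq

def cosine_sim(I_u, I_w):
    """s_uw = |I_u ∩ I_w| / (sqrt(|I_u|) * sqrt(|I_w|)) over sets of item IDs."""
    if not I_u or not I_w:
        return 0.0
    return len(I_u & I_w) / ((len(I_u) ** 0.5) * (len(I_w) ** 0.5))

def build_neighborhoods(I_of, k=20):
    """N_u: the k most similar users of each user u (k = 20 in the experiments)."""
    users = list(I_of)
    return {
        u: [w for _, w in heapq.nlargest(
                k, ((cosine_sim(I_of[u], I_of[w]), w) for w in users if w != u))]
        for u in users
    }
```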


SLIDE 8

Method

Assumption

We assume that the neighborhood knowledge extracted from local associations can be incorporated into a global factorization framework so as to better capture the latent representations. This process is analogous to human learning: people with intense concentration digest knowledge locally but effectively, while others with a big picture in mind are experts at building correlations across different domains or tasks. Learners who can exploit a key combination of local and global cues may achieve more.


SLIDE 9

Method

Transfer by Neighborhood-Enhanced Factorization

Specifically, a recent work [Guo et al., 2017] inspires us to aggregate the like-minded users' preferences. Finally, we have the estimated preference of user u to item i as follows,

$\hat{r}_{ui} = b_u + b_i + \frac{1}{|N_u|} \sum_{u' \in N_u} X_{u' \cdot} V_{i \cdot}^T$.  (1)

In this way, the local knowledge of neighborhood can be transferred into the factorization-based method. For this reason, we call it transfer by neighborhood-enhanced factorization (TNF). Notice that a closely related work, FISM [Kabbur et al., 2013], focuses on learning the factored item similarity by incorporating the knowledge of items that have been observed by user u (i.e., $I_u$).
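A minimal numpy sketch of Eq. (1), assuming X and V are n×d and m×d parameter matrices and N maps each user to a list of neighbor indices; the function name is ours.

```python
import numpy as np

def predict(u, i, N, X, V, b_user, b_item):
    """Eq. (1): r_hat_ui = b_u + b_i + (1/|N_u|) * sum_{u' in N_u} X_{u'.} V_{i.}^T."""
    U_bar = X[N[u]].mean(axis=0)  # virtual user vector aggregated over N_u
    return b_user[u] + b_item[i] + U_bar @ V[i]
```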


SLIDE 10

Method

Pointwise Preference Learning

In our TNF, we adopt pointwise preference learning as our preference learning paradigm. The objective function is as follows,

$\min_{\Theta} \sum_{(u,i) \in P \cup A} f_{ui} + R(\Theta)$,  (2)

where $f_{ui} = \log(1 + \exp(-r_{ui} \hat{r}_{ui}))$ is a loss function defined on a (u, i) pair, and $\Theta = \{X_{u' \cdot}, V_{i \cdot}, b_u, b_i;\ i = 1, \ldots, m,\ u, u' = 1, \ldots, n\}$ denotes the set of model parameters to be learned. Notice that we use $r_{ui} = 1$ and $r_{ui} = -1$ to denote positive and negative preference for an observed $(u, i) \in P$ pair and an unobserved $(u, i) \in A$ pair, respectively. In addition, we introduce the regularization term

$R(\Theta) = \frac{\alpha_x}{2} \sum_{u' \in N_u} \|X_{u' \cdot}\|_F^2 + \frac{\alpha_v}{2} \|V_{i \cdot}\|_F^2 + \frac{\beta_u}{2} b_u^2 + \frac{\beta_v}{2} b_i^2$

to help avoid overfitting, where $\alpha_x$, $\alpha_v$, $\beta_u$ and $\beta_v$ are trade-off hyperparameters.
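The loss itself is one line; a sketch follows, using np.logaddexp for numerical stability, which is our implementation choice rather than something the slide specifies.

```python
import numpy as np

def pointwise_loss(r_ui, r_hat_ui):
    """f_ui = log(1 + exp(-r_ui * r_hat_ui)); r_ui is +1 on P and -1 on A."""
    return np.logaddexp(0.0, -r_ui * r_hat_ui)
```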


SLIDE 11

Method

Gradients

In order to solve the optimization problem in Eq. (2), we adopt the commonly used stochastic gradient descent (SGD) algorithm. Specifically, for each $(u, i) \in P \cup A$, we have the gradients,

$\nabla X_{u' \cdot} = \frac{\partial f_{ui}}{\partial X_{u' \cdot}} = -e_{ui} \frac{1}{|N_u|} V_{i \cdot} + \alpha_x X_{u' \cdot}, \quad u' \in N_u$,  (3)

$\nabla V_{i \cdot} = \frac{\partial f_{ui}}{\partial V_{i \cdot}} = -e_{ui} \frac{1}{|N_u|} \sum_{u' \in N_u} X_{u' \cdot} + \alpha_v V_{i \cdot}$,  (4)

$\nabla b_u = \frac{\partial f_{ui}}{\partial b_u} = -e_{ui} + \beta_u b_u$,  (5)

$\nabla b_i = \frac{\partial f_{ui}}{\partial b_i} = -e_{ui} + \beta_v b_i$,  (6)

where $e_{ui} = \frac{r_{ui}}{1 + \exp(r_{ui} \hat{r}_{ui})}$, and $\bar{U}_{u \cdot} = \frac{1}{|N_u|} \sum_{u' \in N_u} X_{u' \cdot}$ is a virtual user-specific latent feature vector of user u aggregated from the neighborhood $N_u$.


SLIDE 12

Method

Update Rules

For each $(u, i) \in P \cup A$, we have the update rules,

$X_{u' \cdot} = X_{u' \cdot} - \gamma \nabla X_{u' \cdot}, \quad u' \in N_u$,  (7)

$V_{i \cdot} = V_{i \cdot} - \gamma \nabla V_{i \cdot}$,  (8)

$b_u = b_u - \gamma \nabla b_u$,  (9)

$b_i = b_i - \gamma \nabla b_i$,  (10)

where $\gamma > 0$ is the learning rate.
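Putting Eqs. (3)-(10) together, one SGD step for a sampled (u, i) pair might look like the following sketch (in-place numpy updates; the default hyperparameter values are placeholders, not the tuned ones).

```python
import numpy as np

def sgd_step(u, i, r_ui, N, X, V, b_user, b_item,
             alpha_x=0.01, alpha_v=0.01, beta_u=0.01, beta_v=0.01, gamma=0.01):
    nbrs = N[u]
    U_bar = X[nbrs].mean(axis=0)                       # (1/|N_u|) sum X_{u'.}
    r_hat = b_user[u] + b_item[i] + U_bar @ V[i]       # Eq. (1)
    e_ui = r_ui / (1.0 + np.exp(r_ui * r_hat))         # shared residual term

    grad_V = -e_ui * U_bar + alpha_v * V[i]            # Eq. (4)
    for w in nbrs:                                     # Eqs. (3) and (7)
        X[w] -= gamma * (-e_ui * V[i] / len(nbrs) + alpha_x * X[w])
    V[i] -= gamma * grad_V                             # Eq. (8)
    b_user[u] -= gamma * (-e_ui + beta_u * b_user[u])  # Eqs. (5) and (9)
    b_item[i] -= gamma * (-e_ui + beta_v * b_item[i])  # Eqs. (6) and (10)
```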


SLIDE 13

Method

Algorithm

1: Input: Observations P
2: Output: Recommended items for each user
3: Initialize model parameters Θ
4: Construct a neighborhood $N_u$ for each user u
5: for t1 = 1, . . . , T do
6:    Randomly pick a set A with |A| = ρ|P|
7:    for t2 = 1, 2, . . . , |P ∪ A| do
8:        Randomly draw a (u, i) pair from P ∪ A
9:        Calculate $\bar{U}_{u \cdot} = \frac{1}{|N_u|} \sum_{u' \in N_u} X_{u' \cdot}$
10:       Calculate $\hat{r}_{ui} = b_u + b_i + \bar{U}_{u \cdot} V_{i \cdot}^T$
11:       Calculate $e_{ui} = \frac{r_{ui}}{1 + \exp(r_{ui} \hat{r}_{ui})}$
12:       Update $b_u$, $b_i$, $V_{i \cdot}$ and $X_{u' \cdot}$ for $u' \in N_u$
13:   end for
14: end for

Notes: randomly drawing a (u, i) pair from P ∪ A is more efficient than the user-wise sampling strategy in [Pan and Chen, 2013].
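A Python sketch of the loop above, reusing sgd_step from the previous slide's sketch; the uniform negative sampler and the shuffled pass standing in for line 8's random draws are our assumptions about details the slide leaves open.

```python
import random

def train(P, I_of, all_items, N, X, V, b_user, b_item, rho=3, T=100, **hp):
    users, items = list(I_of), list(all_items)
    positives = list(P)
    for _ in range(T):                            # line 5
        A = set()                                 # line 6: |A| = rho * |P| negatives
        while len(A) < rho * len(positives):
            u, i = random.choice(users), random.choice(items)
            if i not in I_of[u]:                  # keep only unobserved pairs
                A.add((u, i))
        data = [(u, i, +1) for u, i in positives] + [(u, i, -1) for u, i in A]
        random.shuffle(data)                      # stands in for line 8's random draws
        for u, i, r_ui in data:                   # lines 7-13
            sgd_step(u, i, r_ui, N, X, V, b_user, b_item, **hp)
```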


SLIDE 14

Experiments

Datasets

Table: Statistics of the datasets used in the experiments, including the number of users (n), the number of items (m), the number of (user, item) pairs in the training data (|P|), the number of (user, item) pairs in the test data (|Pte|), and the density of each dataset, i.e., (|P| + |Pte|)/(nm).

Dataset      n      m      |P|      |Pte|    (|P| + |Pte|)/(nm)
ML100K       943    1,682  27,688   27,687   3.49%
ML1M         6,040  3,952  287,641  287,640  2.41%
UserTag      3,000  2,000  123,218  123,218  4.11%
Netflix5K5K  5,000  5,000  77,936   77,936   0.62%
XING5K5K     5,000  5,000  39,197   39,197   0.31%
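As a quick arithmetic check of the density column, e.g., for ML100K:

```python
n, m = 943, 1682
num_P, num_Pte = 27688, 27687
print(f"{(num_P + num_Pte) / (n * m):.2%}")  # 3.49%
```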

Notice that the datasets and code are publicly available.¹

¹ http://csse.szu.edu.cn/staff/panwk/publications/TNF/

SLIDE 15

Experiments

Baselines

UCF: user-oriented collaborative filtering [Aggarwal et al., 1999]
MF: matrix factorization with square loss [Koren et al., 2009]
BPR: Bayesian personalized ranking [Rendle et al., 2009]
FISM: factored item similarity model [Kabbur et al., 2013]
NeuMF: neural matrix factorization [He et al., 2017]


SLIDE 16

Experiments

Parameter Configurations (1/2)

For UCF and TNF, we use the cosine similarity and set the size of the neighborhood to 20.

For BPR, FISM and our TNF, we adopt the commonly used stochastic gradient descent (SGD) method with the same sampling strategy for a fair comparison, and we fix the number of latent dimensions as d = 20 and the learning rate as γ = 0.01.

For FISM and TNF, we set ρ = 3, i.e., |A| = 3|P|.

For the deep model NeuMF, we implement the method using TensorFlow² and keep the structure with the best performance as reported in [He et al., 2017].

² https://www.tensorflow.org/

SLIDE 17

Experiments

Parameter Configurations (2/2)

For each factorization-based algorithm on each dataset, we search the trade-off parameters over αx = αv = βu = βv ∈ {0.1, 0.01, 0.001} and find an optimal iteration number T ∈ {10, 20, 30, . . . , 990, 1000} by checking the NDCG@5 performance on the validation data every ten iterations.
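A sketch of this search procedure; init_model, train_for, evaluate_ndcg5 and valid_data are hypothetical helpers standing in for the actual training and evaluation code.

```python
best_score, best_config = -1.0, None
for reg in (0.1, 0.01, 0.001):  # alpha_x = alpha_v = beta_u = beta_v = reg
    model = init_model(alpha_x=reg, alpha_v=reg, beta_u=reg, beta_v=reg)  # assumed helper
    for t in range(10, 1001, 10):                  # T in {10, 20, ..., 1000}
        train_for(model, iterations=10)            # assumed: run 10 more SGD iterations
        score = evaluate_ndcg5(model, valid_data)  # assumed: NDCG@5 on validation data
        if score > best_score:
            best_score, best_config = score, (reg, t)
```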


SLIDE 18

Experiments

Evaluation Metrics

Top-5 ranking-oriented evaluation metrics: Precision@5, Recall@5, F1@5, NDCG@5, 1-call@5.

For each method, we calculate the performance on warm-start users and warm-start items.

In parameter search, warm-start users (or items) denote the users (or items) shared between the training data and the validation data.

In the final test, warm-start users (or items) denote the users (or items) shared between the training data and the test data.
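A sketch of the per-user top-5 metrics, assuming ranked is a model's top-5 list for one user and relevant is that user's test items; the standard definitions are used, as the slide does not spell them out.

```python
import math

def metrics_at_5(ranked, relevant):
    hits = [1 if i in relevant else 0 for i in ranked[:5]]
    prec = sum(hits) / 5
    rec = sum(hits) / len(relevant) if relevant else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec > 0 else 0.0
    dcg = sum(h / math.log2(pos + 2) for pos, h in enumerate(hits))
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(min(5, len(relevant))))
    ndcg = dcg / idcg if idcg > 0 else 0.0
    one_call = 1.0 if any(hits) else 0.0   # 1 if at least one hit in the top-5
    return prec, rec, f1, ndcg, one_call
```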


SLIDE 19

Experiments

Main Results (1/2)

Table: Recommendation performance of UCF, MF, BPR, FISM, NeuMF and our TNF on five real-world datasets. The significantly best results are marked in bold.

Dataset  Method  Prec@5         Rec@5          F1@5           NDCG@5         1-call@5
ML100K   UCF     0.3448±0.0020  0.0867±0.0012  0.1197±0.0011  0.3647±0.0049  0.7940±0.0197
         MF      0.3669±0.0086  0.0983±0.0045  0.1348±0.0053  0.3842±0.0078  0.8303±0.0107
         BPR     0.3504±0.0065  0.0915±0.0043  0.1274±0.0041  0.3670±0.0069  0.8082±0.0161
         FISM    0.4011±0.0032  0.1009±0.0011  0.1401±0.0011  0.4161±0.0037  0.8370±0.0059
         NeuMF   0.3648±0.0085  0.0936±0.0048  0.1293±0.0054  0.3789±0.0094  0.8057±0.0176
         TNF     0.4118±0.0080  0.1052±0.0031  0.1452±0.0042  0.4316±0.0084  0.8538±0.0134
ML1M     UCF     0.3705±0.0026  0.0615±0.0011  0.0942±0.0012  0.3855±0.0024  0.8090±0.0004
         MF      0.4174±0.0005  0.0704±0.0007  0.1080±0.0005  0.4306±0.0010  0.8437±0.0045
         BPR     0.4180±0.0039  0.0665±0.0008  0.1030±0.0011  0.4300±0.0040  0.8202±0.0049
         FISM    0.4241±0.0013  0.0727±0.0005  0.1114±0.0005  0.4388±0.0018  0.8478±0.0046
         NeuMF   0.3995±0.0105  0.0658±0.0019  0.1011±0.0026  0.4143±0.0100  0.8176±0.0105
         TNF     0.4602±0.0044  0.0781±0.0011  0.1193±0.0013  0.4781±0.0040  0.8662±0.0019
UserTag  UCF     0.2524±0.0028  0.0400±0.0005  0.0624±0.0013  0.2619±0.0028  0.5757±0.0093
         MF      0.2957±0.0022  0.0456±0.0012  0.0722±0.0015  0.3032±0.0024  0.6146±0.0077
         BPR     0.2883±0.0034  0.0439±0.0012  0.0695±0.0016  0.2959±0.0039  0.5978±0.0009
         FISM    0.2797±0.0089  0.0413±0.0022  0.0658±0.0031  0.2871±0.0064  0.5686±0.0055
         NeuMF   0.2943±0.0076  0.0462±0.0008  0.0731±0.0013  0.3021±0.0086  0.6049±0.0100
         TNF     0.3195±0.0018  0.0513±0.0013  0.0802±0.0013  0.3320±0.0030  0.6367±0.0014

SLIDE 20

Experiments

Main Results (2/2)

Table: Recommendation performance of UCF, MF, BPR, FISM, NeuMF and our TNF on five real-world datasets (continued). The significantly best results are marked in bold.

Dataset      Method  Prec@5         Rec@5          F1@5           NDCG@5         1-call@5
Netflix5K5K  UCF     0.1939±0.0016  0.0657±0.0013  0.0780±0.0008  0.2112±0.0029  0.5221±0.0026
             MF      0.2239±0.0029  0.0935±0.0012  0.1056±0.0014  0.2390±0.0046  0.6125±0.0050
             BPR     0.2488±0.0030  0.0919±0.0013  0.1075±0.0013  0.2650±0.0040  0.6138±0.0034
             FISM    0.2568±0.0048  0.1033±0.0034  0.1178±0.0027  0.2754±0.0057  0.6521±0.0130
             NeuMF   0.2293±0.0078  0.0848±0.0016  0.0987±0.0033  0.2463±0.0077  0.5847±0.0143
             TNF     0.2775±0.0008  0.1075±0.0022  0.1235±0.0013  0.3012±0.0023  0.6579±0.0019
XING5K5K     UCF     0.0741±0.0012  0.0370±0.0017  0.0386±0.0012  0.0828±0.0014  0.2343±0.0033
             MF      0.0720±0.0026  0.0301±0.0013  0.0346±0.0018  0.0773±0.0026  0.2247±0.0086
             BPR     0.0674±0.0022  0.0256±0.0017  0.0306±0.0017  0.0714±0.0030  0.2025±0.0060
             FISM    0.0835±0.0022  0.0379±0.0009  0.0427±0.0013  0.0898±0.0022  0.2648±0.0095
             NeuMF   0.0481±0.0024  0.0166±0.0006  0.0195±0.0009  0.0507±0.0033  0.1347±0.0042
             TNF     0.0869±0.0017  0.0407±0.0012  0.0447±0.0008  0.0960±0.0027  0.2689±0.0070

SLIDE 21

Experiments

Observations (1/3)

TNF performs significantly better than all the five baselines on all the five evaluation metrics across the five datasets, which clearly shows the effectiveness of our transfer learning solution.

TNF performs much better than the neighborhood-based method, i.e., UCF, in all cases, which showcases the effectiveness of the second task of global preference learning in our TNF.


SLIDE 22

Experiments

Observations (2/3)

TNF is considerably better than the typical globalized factorization-based methods, i.e., MF and BPR, in terms of all evaluation metrics, indicating that it is more effective to learn global preferences by leveraging the local knowledge transferred from the first task of neighborhood construction.


SLIDE 23

Experiments

Observations (3/3)

TNF beats the four very strong baseline methods, i.e., MF, BPR, FISM and NeuMF, in all cases, which showcases the merit of our proposed solution in exploiting the complementarity of the neighborhood-based method and the factorization-based method in a unified framework.

In particular, TNF performs significantly better than FISM, which shows the usefulness of the local knowledge as exploited in the second task of TNF.


SLIDE 24

Related Work

Related Work

1. Neighborhood-based recommendation
   user-oriented methods [Aggarwal et al., 1999]
   item-oriented methods [Deshpande and Karypis, 2004]

2. Factorization-based recommendation
   matrix factorization (MF) [Hu et al., 2008, Johnson, 2014]
   factored item similarity model (FISM) [Kabbur et al., 2013]

3. Deep learning based recommendation
   collaborative denoising auto-encoder (CDAE) [Wu et al., 2016]
   neural matrix factorization (NeuMF) [He et al., 2017]


SLIDE 25

Conclusions and Future Work

Conclusions

We study an important collaborative filtering problem with users' one-class feedback, and design a novel transfer learning solution called transfer by neighborhood-enhanced factorization (TNF).

In our TNF, the local knowledge of the neighborhood among users is extracted from the users' behaviors, and is then transferred to a factorization-based global preference learning task in order to better capture the latent representations of users and items.


SLIDE 26

Conclusions and Future Work

Future Work

For future work, we are interested in studying the complementarity of the knowledge of the neighborhood (Nu) and that of the historically observed items (Iu), and in incorporating the mined knowledge into deep learning frameworks such as stacked denoising auto-encoders and multi-layer neural networks.


SLIDE 27

Thank you

Thank you!

We thank the handling editors and reviewers for their effort and constructive expert comments, and acknowledge the support of the National Natural Science Foundation of China under Grant Nos. 61872249, 61502307, 61836005 and 61672358.


SLIDE 28

References

Aggarwal, C. C., Wolf, J. L., Wu, K.-L., and Yu, P. S. (1999). Horting hatches an egg: A new graph-theoretic approach to collaborative filtering. KDD '99, pages 201–212.

Deshpande, M. and Karypis, G. (2004). Item-based top-n recommendation algorithms. ACM Transactions on Information Systems, 22(1):143–177.

Guo, G., Zhang, J., Zhu, F., and Wang, X. (2017). Factored similarity models with social trust for top-n item recommendation. Knowledge-Based Systems, 122(C):17–25.

He, X., Liao, L., Zhang, H., Nie, L., Hu, X., and Chua, T.-S. (2017). Neural collaborative filtering. WWW '17, pages 173–182.

Hu, Y., Koren, Y., and Volinsky, C. (2008). Collaborative filtering for implicit feedback datasets. ICDM '08, pages 263–272.

Johnson, C. C. (2014). Logistic matrix factorization for implicit feedback data. NIPS '14, 27.

Kabbur, S., Ning, X., and Karypis, G. (2013). FISM: Factored item similarity models for top-n recommender systems. KDD '13, pages 659–667.

Koren, Y., Bell, R., and Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42(8):30–37.

Pan, W. and Chen, L. (2013). GBPR: Group preference based Bayesian personalized ranking for one-class collaborative filtering. IJCAI '13, pages 2691–2697.

SLIDE 29

References

Rendle, S., Freudenthaler, C., Gantner, Z., and Schmidt-Thieme, L. (2009). BPR: Bayesian personalized ranking from implicit feedback. UAI '09, pages 452–461.

Wu, Y., DuBois, C., Zheng, A. X., and Ester, M. (2016). Collaborative denoising auto-encoders for top-n recommender systems. WSDM '16, pages 153–162.