SLIDE 1

PAT: Preference-Aware Transfer Learning for Recommendation with Heterogeneous Feedback

Feng Liang^{a,b,c}, Wei Dai^{a,b,c}, Yunfeng Huang^{a,b,c}, Weike Pan^{a,b,c,*}, Zhong Ming^{a,b,c,*}

{liangfeng2018, daiwei20171, huangyunfeng2017}@email.szu.edu.cn, {panweike, mingz}@szu.edu.cn

^a National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen, China
^b Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen University, Shenzhen, China
^c College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China

Liang et al., (SZU) Preference-Aware Transfer (PAT) IJCNN 2020 1 / 30

SLIDE 2

Introduction

Problem Definition

Rating Prediction with Users’ Heterogeneous Explicit Feedback

Input: a set of grade score records R = {(u, i, r_ui)} with r_ui ∈ G a grade score such as one of {0.5, 1, ..., 5}, and a set of binary rating records R̃ = {(u, i, r̃_ui)} with r̃_ui ∈ B = {like, dislike}.

Goal: estimate the grade score of each (user, item) pair in the test data TE.
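To make the problem setting concrete, here is a minimal sketch of the two kinds of training records as plain Python data. The user/item IDs and scores are hypothetical toy values, not drawn from the real datasets:

```python
# Toy illustration of the input data (hypothetical values).
# Grade score range G and binary rating range B, as in the problem definition.
G = [0.5 * k for k in range(1, 11)]            # {0.5, 1.0, ..., 5.0}
B = {"like", "dislike"}

# R = {(u, i, r_ui)}: grade score records (target training data).
R = [(0, 3, 4.5), (0, 7, 2.0), (1, 3, 5.0)]

# R_tilde = {(u, i, r~_ui)}: binary rating records (auxiliary training data).
R_tilde = [(0, 5, "like"), (1, 2, "dislike")]

# TE: (user, item) pairs whose grade scores we want to estimate.
TE = [(1, 7), (0, 2)]

# Sanity checks on the toy records.
assert all(r in G for _, _, r in R)
assert all(b in B for _, _, b in R_tilde)
```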

SLIDE 3

Introduction

Motivation

1. TMF exploits the implicit preference context of users from the auxiliary binary data for grade score prediction, but it does not exploit the implicit preference context within the target data.

2. SVD++ and MF-MPC only exploit the preference context in the target data to model users’ personalized preferences, and do not consider the auxiliary binary ratings as TMF does.

SLIDE 4

Introduction

Our Contributions

1. To share knowledge between the two types of data more fully, we address the problem from a transfer learning perspective, i.e., we take the grade scores as the target data and the binary ratings as the auxiliary data.

2. Besides the observed explicit feedback of grade scores and binary ratings, we propose to exploit the implicit preference context beneath the feedback, which we incorporate into the prediction of users’ grade scores on items.

3. We conduct extensive empirical studies on two large public datasets and find that our PAT performs significantly better than state-of-the-art methods.

SLIDE 5

Introduction

Notations (1/3)

Table: Some notations and their explanations.

Notation | Explanation
n | number of users
m | number of items
u, u′ ∈ {1, 2, ..., n} | user ID
i, j ∈ {1, 2, ..., m} | item ID
G = {0.5, 1, ..., 5} | grade score range
B = {like, dislike} | binary rating range
r_ui ∈ G | grade score of user u to item i
r̃_ui ∈ B | binary rating of user u to item i
R = {(u, i, r_ui)} | grade score records (training data)
R̃ = {(u, i, r̃_ui)} | binary rating records (training data)
p = |R| | number of grade scores (training data)
p̃ = |R̃| | number of binary ratings (training data)
I^g_u, g ∈ G | items rated by user u with score g (training data)
P_u | items liked (w/ positive feedback) by user u (training data)
N_u | items disliked (w/ negative feedback) by user u (training data)
TE = {(u, i, r_ui)} | grade score records in test data

SLIDE 6

Introduction

Notations (2/3)

Table: Some notations and their explanations (cont.).

Notation | Explanation
µ ∈ R | global average rating value
b_u ∈ R | user bias
b_i ∈ R | item bias
d ∈ R | number of latent dimensions
U_u·, W_u· ∈ R^{1×d} | user-specific latent feature vectors
U, W ∈ R^{n×d} | user-specific latent feature matrices
V_i·, C^p_j·, C^n_j·, C^o_i′·, C^g_i′· ∈ R^{1×d} | item-specific latent feature vectors
V, C^p, C^n, C^o, C^g ∈ R^{m×d} | item-specific latent feature matrices
r̂_ui | predicted grade score of user u to item i
r̃̂_ui | predicted binary rating of user u to item i

SLIDE 7

Introduction

Notations (3/3)

Table: Some notations and their explanations (cont.).

Notation | Explanation
γ | learning rate
ρ | interaction weight between grade scores and binary ratings
α | tradeoff parameter on the corresponding regularization terms
δ^O, δ^G, δ^p, δ^n ∈ {0, 1} | indicator variables for the preference context components
w^p, w^n | weights on positive and negative feedback
T | iteration number in the algorithm

SLIDE 8

Related Work

Related Work (1/2)

Probabilistic matrix factorization (PMF) [Salakhutdinov and Mnih, 2008] is a dominant recommendation model that takes the explicit grade score matrix as input and outputs the learned low-rank feature vectors of users and items.

Transfer by collective factorization (TCF) [Pan and Yang, 2013] models users’ personalized preferences from both grade scores and binary ratings collectively by sharing users’ features and items’ features. Notice that when the auxiliary binary ratings are not considered, TCF reduces to PMF.

Interaction-rich transfer by collective factorization (iTCF) [Pan and Ming, 2014] builds on CMF [Singh and Gordon, 2008] and exploits the rich interactions among the user-specific latent features of the target and auxiliary data when calculating the gradients of the item features during model training. Notice that when these rich interactions are not exploited, iTCF reduces to CMF.

Transfer by mixed factorization (TMF) [Pan et al., 2016] combines the feature vectors learned from the two types of data in a collective and integrative manner. Notice that when the like/dislike feedback of users to items is not considered in grade score prediction, TMF becomes iTCF.

SLIDE 9

Related Work

Related Work (2/2)

In SVD++ [Koren, 2008], a user’s estimated score for an item is related to the other items that the user rated in the past, which are called the preference context of the user. There is no distinction among these rated items: whatever scores they are assigned, they fall into the same set, i.e., their effects are classified into the same class, which is a typical example of one-class preference context (OPC). When predicting an unobserved score, OPC provides a global preference context for the user.

In MF-MPC [Pan and Ming, 2017], on the other hand, the rated items of a given user except the target one, i.e., the preference context, are classified into several clusters according to the grade scores, which is named multi-class preference context (MPC). Intuitively, MPC is an advanced version of OPC that not only offers the global preference information of users but also distinguishes the information of different score values.

SLIDE 10

Method

Collective Matrix Factorization (CMF)

In order to jointly model the two types of explicit feedback, i.e., $r_{ui}$ and $\tilde{r}_{ui}$, a state-of-the-art method approximates the grade score and the binary rating simultaneously by sharing some latent variables [Singh and Gordon, 2008]:

$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^T + b_u + b_i + \mu, \qquad \hat{\tilde{r}}_{ui} = W_{u\cdot} V_{i\cdot}^T \tag{1}$$

where the item-specific latent feature vector $V_{i\cdot}$ is shared between the two factorization subtasks. However, for the goal of grade score prediction, some implicit preference contexts are missing in the above joint modeling approach of CMF [Singh and Gordon, 2008].

SLIDE 11

Method

Implicit Preference Context

Mathematically, we may represent the one-class preference context as $\bar{C}^O_{u\cdot}$, the graded preference context as $\bar{C}^G_{u\cdot}$, the positive preference context as $\bar{C}^p_{u\cdot}$, and the negative preference context as $\bar{C}^n_{u\cdot}$ as follows [Koren, 2008, Pan and Ming, 2017, Pan et al., 2016]:

$$\bar{C}^O_{u\cdot} = \delta^O \frac{1}{\sqrt{|\mathcal{I}_u \backslash \{i\}|}} \sum_{i' \in \mathcal{I}_u \backslash \{i\}} C^o_{i'\cdot} \tag{2}$$

$$\bar{C}^G_{u\cdot} = \delta^G \sum_{g \in \mathbb{G}} \frac{1}{\sqrt{|\mathcal{I}^g_u \backslash \{i\}|}} \sum_{i' \in \mathcal{I}^g_u \backslash \{i\}} C^g_{i'\cdot} \tag{3}$$

$$\bar{C}^p_{u\cdot} = \delta^p w^p \frac{1}{\sqrt{|\mathcal{P}_u|}} \sum_{j \in \mathcal{P}_u} C^p_{j\cdot} \tag{4}$$

$$\bar{C}^n_{u\cdot} = \delta^n w^n \frac{1}{\sqrt{|\mathcal{N}_u|}} \sum_{j \in \mathcal{N}_u} C^n_{j\cdot} \tag{5}$$

where $\delta^O, \delta^G, \delta^p, \delta^n \in \{0, 1\}$ are the indicator variables, and $w^p$ and $w^n$ are the weights on positive feedback and negative feedback, respectively.
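As a small numpy sketch, Eqs. (2)–(5) can be computed with one pooling helper. The item sets, grade values, and weights below are made-up toy data, and the 1/√|·| normalization is an assumption carried over from SVD++/MF-MPC-style models:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 4, 6                         # toy latent dimensionality and item count

# Item-specific context feature matrices (rows C^o_{i'.}, C^g_{i'.}, C^p_{j.}, C^n_{j.}).
Co = rng.normal(scale=0.1, size=(m, d))
Cg = {3.0: rng.normal(scale=0.1, size=(m, d)),
      4.0: rng.normal(scale=0.1, size=(m, d))}
Cp = rng.normal(scale=0.1, size=(m, d))
Cn = rng.normal(scale=0.1, size=(m, d))

def pooled(C, idx):
    """Sum over an index set with 1/sqrt(|set|) normalization (zeros if empty)."""
    idx = list(idx)
    if not idx:
        return np.zeros(C.shape[1])
    return C[idx].sum(axis=0) / np.sqrt(len(idx))

# Toy preference sets for one user u and target item i = 0.
rated = {0: 3.0, 2: 4.0, 4: 4.0}    # I_u with grade scores (from target data)
Pu, Nu = [1, 3], [5]                # liked / disliked items (from auxiliary data)
i, wp, wn = 0, 2.0, 1.0

I_u = [j for j in rated if j != i]                     # I_u \ {i}
C_O = pooled(Co, I_u)                                  # Eq. (2), with delta^O = 1
C_G = sum(pooled(Cg[g], [j for j in I_u if rated[j] == g])
          for g in Cg)                                 # Eq. (3), with delta^G = 1
C_p = wp * pooled(Cp, Pu)                              # Eq. (4)
C_n = wn * pooled(Cn, Nu)                              # Eq. (5)
```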

SLIDE 12

Method

Transfer with Implicit Preference Context

With the preference contexts defined above, we propose to incorporate them into the collective factorization framework:

$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^T + b_u + b_i + \mu + (\bar{C}^O_{u\cdot} + \bar{C}^G_{u\cdot} + \bar{C}^p_{u\cdot} + \bar{C}^n_{u\cdot}) V_{i\cdot}^T, \qquad \hat{\tilde{r}}_{ui} = W_{u\cdot} V_{i\cdot}^T \tag{6}$$

which will bring the user-specific latent feature vectors of two users u and u′ close if they have similar implicit preference contexts, in a similar way to SVD++ [Koren, 2008]. Notice that we incorporate the preference context into the prediction rule of grade scores instead of that of binary ratings, because that matches our final goal of grade score prediction rather than binary rating prediction.
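A direct sketch of the prediction rule of Eq. (6), assuming the context vectors of Eqs. (2)–(5) have already been computed; the function names are ours:

```python
import numpy as np

def predict_grade(Uu, Vi, bu, bi, mu, C_O, C_G, C_p, C_n):
    """Eq. (6): grade score prediction with the implicit preference context."""
    return float(Uu @ Vi + bu + bi + mu + (C_O + C_G + C_p + C_n) @ Vi)

def predict_binary(Wu, Vi):
    """Auxiliary binary-rating prediction with the shared item factors V_i."""
    return float(Wu @ Vi)

# With all factors and contexts at zero, the prediction falls back to the
# global average plus the user and item biases.
d = 4
z = np.zeros(d)
assert abs(predict_grade(z, z, 0.1, -0.2, 3.5, z, z, z, z) - 3.4) < 1e-12
```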

SLIDE 13

Method

Overall Prediction Rule of PAT

The following methods are special cases of our PAT:
RSVD [Koren, 2008]: {e1, e2}
CMF [Singh and Gordon, 2008]: {e1, e2, e3, e4}
iTCF [Pan and Ming, 2014]: {e1, e2, e3, e4, e5}
TMF [Pan et al., 2016]: {e1, e2, e3, e4, e5, e6, e7}
SVD++ [Koren, 2008], MF-MPC [Pan and Ming, 2017]: {e1, e2, e8}

SLIDE 14

Method

Objective Function

We then reach an objective function similar to that of CMF [Singh and Gordon, 2008], iTCF [Pan and Ming, 2014] and TMF [Pan et al., 2016]:

$$\min_{\Theta} \sum_{u=1}^{n} \sum_{i=1}^{m} y_{ui} f_{ui} + \lambda \sum_{u=1}^{n} \sum_{i=1}^{m} \tilde{y}_{ui} \tilde{f}_{ui} \tag{7}$$

where

$$f_{ui} = \frac{1}{2}(r_{ui} - \hat{r}_{ui})^2 + \frac{\alpha}{2}\|U_{u\cdot}\|^2 + \frac{\alpha}{2}\|V_{i\cdot}\|^2 + \frac{\alpha}{2} b_u^2 + \frac{\alpha}{2} b_i^2 + \delta^p \frac{\alpha}{2} \sum_{j \in \mathcal{P}_u} \|C^p_{j\cdot}\|_F^2 + \delta^n \frac{\alpha}{2} \sum_{j \in \mathcal{N}_u} \|C^n_{j\cdot}\|_F^2 + \delta^O \frac{\alpha}{2} \sum_{i' \in \mathcal{I}_u \backslash \{i\}} \|C^o_{i'\cdot}\|_F^2 + \delta^G \frac{\alpha}{2} \sum_{g \in \mathbb{G}} \sum_{i' \in \mathcal{I}^g_u \backslash \{i\}} \|C^g_{i'\cdot}\|_F^2$$

and

$$\tilde{f}_{ui} = \frac{1}{2}(\tilde{r}_{ui} - \hat{\tilde{r}}_{ui})^2 + \frac{\alpha}{2}\|W_{u\cdot}\|^2 + \frac{\alpha}{2}\|V_{i\cdot}\|^2.$$

SLIDE 15

Method

Gradients (1/2)

We have the gradients of the model parameters w.r.t. $f_{ui}$ as follows:

$$\nabla \mu = -e_{ui}$$
$$\nabla b_u = -e_{ui} + \alpha b_u$$
$$\nabla b_i = -e_{ui} + \alpha b_i$$
$$\nabla U_{u\cdot} = -e_{ui} V_{i\cdot} + \alpha U_{u\cdot}$$
$$\nabla V_{i\cdot} = -e_{ui}\left(\rho U_{u\cdot} + (1-\rho) W_{u\cdot} + \bar{C}^p_{u\cdot} + \bar{C}^n_{u\cdot} + \bar{C}^O_{u\cdot} + \bar{C}^G_{u\cdot}\right) + \alpha V_{i\cdot}$$
$$\nabla C^o_{i'\cdot} = \delta^O\left(-e_{ui} \frac{1}{\sqrt{|\mathcal{I}_u \backslash \{i\}|}} V_{i\cdot} + \alpha C^o_{i'\cdot}\right), \quad i' \in \mathcal{I}_u \backslash \{i\}$$
$$\nabla C^g_{i'\cdot} = \delta^G\left(-e_{ui} \frac{1}{\sqrt{|\mathcal{I}^g_u \backslash \{i\}|}} V_{i\cdot} + \alpha C^g_{i'\cdot}\right), \quad i' \in \mathcal{I}^g_u \backslash \{i\},\ g \in \mathbb{G}$$

where $e_{ui} = (r_{ui} - \hat{r}_{ui})$ is the error w.r.t. the target grade score, and $\rho U_{u\cdot} + (1-\rho) W_{u\cdot}$ is used to introduce rich interactions [Pan and Ming, 2014] between the user-specific latent features $U_{u\cdot}$ and $W_{u\cdot}$.

SLIDE 16

Method

Gradients (2/2)

We have the gradients of the model parameters w.r.t. $\tilde{f}_{ui}$ as follows:

$$\nabla C^p_{j\cdot} = \delta^p\left(-e_{ui} w^p \frac{1}{\sqrt{|\mathcal{P}_u|}} V_{i\cdot} + \alpha C^p_{j\cdot}\right), \quad j \in \mathcal{P}_u$$
$$\nabla C^n_{j\cdot} = \delta^n\left(-e_{ui} w^n \frac{1}{\sqrt{|\mathcal{N}_u|}} V_{i\cdot} + \alpha C^n_{j\cdot}\right), \quad j \in \mathcal{N}_u$$
$$\nabla W_{u\cdot} = \lambda\left(-\tilde{e}_{ui} V_{i\cdot} + \alpha W_{u\cdot}\right)$$
$$\nabla V_{i\cdot} = \lambda\left(-\tilde{e}_{ui}\left(\rho W_{u\cdot} + (1-\rho) U_{u\cdot}\right) + \alpha V_{i\cdot}\right)$$

where $\tilde{e}_{ui} = (\tilde{r}_{ui} - \hat{\tilde{r}}_{ui})$ is the error w.r.t. the auxiliary binary rating.

SLIDE 17

Method

Update Rule

We have the update rule

$$\theta = \theta - \gamma \nabla \theta \tag{8}$$

where $\gamma$ is the learning rate, and $\theta$ can be $\mu$, $b_u$, $b_i$, $U_{u\cdot}$, $V_{i\cdot}$, $C^p_{j\cdot}$, $C^n_{j\cdot}$, $C^o_{i'\cdot}$, $C^g_{i'\cdot}$, or $W_{u\cdot}$.

SLIDE 18

Method

Algorithm

Algorithm 1 The algorithm of preference-aware transfer (PAT).

1: for t = 1, ..., T do
2:   for iter = 1, ..., |R ∪ R̃| do
3:     Randomly pick a record (u, i, r_ui) or (u, i, r̃_ui) from R ∪ R̃.
4:     Calculate the gradients w.r.t. f_ui or f̃_ui accordingly.
5:     Update the corresponding model parameters.
6:   end for
7:   Decrease the learning rate via γ ← γ × 0.9.
8: end for
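The steps above can be sketched as the following training loop. This is a simplified illustration that keeps Algorithm 1's structure (random pick from the union of both record sets, per-record SGD step, learning-rate decay) but, for brevity, omits the biases and preference-context terms and encodes like/dislike as +1/-1; both simplifications are ours, not from the slides:

```python
import random
import numpy as np

def train_pat_sketch(R, R_tilde, n, m, d=8, T=5, gamma=0.01, lam=1.0,
                     alpha=0.01, rho=0.5, seed=0):
    """Simplified sketch of Algorithm 1: per-record SGD over the union of the
    two training sets, with learning-rate decay after each epoch."""
    rng = np.random.default_rng(seed)
    shuffler = random.Random(seed)
    U = rng.normal(scale=0.1, size=(n, d))     # target user factors
    W = rng.normal(scale=0.1, size=(n, d))     # auxiliary user factors
    V = rng.normal(scale=0.1, size=(m, d))     # shared item factors
    records = ([("target", u, i, r) for u, i, r in R]
               + [("aux", u, i, r) for u, i, r in R_tilde])
    for _ in range(T):                         # line 1: for t = 1..T
        shuffler.shuffle(records)              # line 3: random pick
        for kind, u, i, r in records:
            if kind == "target":               # gradient step w.r.t. f_ui
                e = r - U[u] @ V[i]
                U[u] += gamma * (e * V[i] - alpha * U[u])
                # rich interaction rho*U + (1 - rho)*W in the item gradient
                V[i] += gamma * (e * (rho * U[u] + (1 - rho) * W[u]) - alpha * V[i])
            else:                              # gradient step w.r.t. f~_ui
                e = r - W[u] @ V[i]            # like/dislike encoded as +1/-1
                W[u] += gamma * lam * (e * V[i] - alpha * W[u])
                V[i] += gamma * lam * (e * (rho * W[u] + (1 - rho) * U[u]) - alpha * V[i])
        gamma *= 0.9                           # line 7: decrease learning rate
    return U, W, V
```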

SLIDE 19

Method

Discussion

Our transfer learning solution is very generic: it contains several state-of-the-art factorization-based recommendation methods as special cases, including RSVD [Koren, 2008], CMF [Singh and Gordon, 2008], iTCF [Pan and Ming, 2014], TMF [Pan et al., 2016], SVD++ [Koren, 2008] and MF-MPC [Pan and Ming, 2017]. From the table below, we can see that our PAT contains several pluggable components, such as the components for the auxiliary binary ratings, the positive and negative preference context, the multiclass preference context, and the interaction between the two types of feedback, which shows that our solution is generic and flexible.

Table: Relationships between our preference-aware transfer (PAT) and other factorization-based methods, in the perspective of their projection onto the graphical model of our PAT.

Algorithm | Edges
RSVD [Koren, 2008] | {e1, e2}
CMF [Singh and Gordon, 2008] | {e1, e2, e3, e4}
iTCF [Pan and Ming, 2014] | {e1, e2, e3, e4, e5}
TMF [Pan et al., 2016] | {e1, e2, e3, e4, e5, e6, e7}
SVD++ [Koren, 2008], MF-MPC [Pan and Ming, 2017] | {e1, e2, e8}
PAT-OPC, PAT (proposed) | {e1, e2, e3, e4, e5, e6, e7, e8}

SLIDE 20

Experiments

Datasets

We adopt two public datasets used in a previous study on modeling heterogeneous feedback [Pan et al., 2016], i.e., MovieLens 10M (denoted as ML10M) and Flixter. The ML10M dataset contains 10,000,054 grade scores from 71,567 users to 10,681 items. The Flixter dataset contains 8,196,075 grade scores from 147,612 users to 48,794 items.

To simulate the problem setting with heterogeneous feedback, we process each dataset as follows: (i) we randomly split the data into five parts of similar size, and (ii) we then take two parts as training data with grade scores, take another two parts as binary ratings by transforming grade scores of four or higher to “like” and grade scores below four to “dislike”, and take the remaining part as the test data with grade scores. We repeat this process five times and obtain five copies of grade score records, binary ratings and test data. The results are averaged over those five copies.
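The splitting protocol can be sketched as follows; the function name is ours, and `records` is assumed to be a list of (user, item, grade) triples:

```python
import numpy as np

def simulate_heterogeneous_feedback(records, seed=0):
    """Sketch of the data protocol: split the grade scores into five parts of
    similar size; keep two as target training data, binarize two as auxiliary
    like/dislike data (grade >= 4 -> "like", grade < 4 -> "dislike"),
    and keep the last one as test data."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(records)), 5)
    take = lambda p: [records[k] for k in p]
    train_scores = take(parts[0]) + take(parts[1])
    train_binary = [(u, i, "like" if r >= 4 else "dislike")
                    for u, i, r in take(parts[2]) + take(parts[3])]
    test_scores = take(parts[4])
    return train_scores, train_binary, test_scores
```

In the experiments this procedure would be repeated with five different seeds and the results averaged.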

SLIDE 21

Experiments

Baselines

RSVD [Koren, 2008] is a basic matrix factorization method without modeling preference context or auxiliary binary ratings, which is a special case of our PAT with edges {e1, e2}; MF-MPC [Pan and Ming, 2017] is a recent advanced matrix factorization method exploiting the multiclass preference context beneath the grade scores, which is a special case of our PAT with edges {e1, e2, e8}; and TMF [Pan et al., 2016] is a recent factorization-based transfer learning method incorporating the auxiliary binary ratings, which is a special case of our PAT with edges {e1, e2, e3, e4, e5, e6, e7}.

Notice that we do not include some other algorithms for the studied problem, such as collective matrix factorization (CMF) [Singh and Gordon, 2008] with edges {e1, e2, e3, e4} and interaction-rich transfer by collective factorization (iTCF) [Pan and Ming, 2014] with edges {e1, e2, e3, e4, e5}, because they usually perform worse than TMF [Pan et al., 2016].

SLIDE 22

Experiments

Parameter Configurations

We adhere to the same rules used in TMF [Pan et al., 2016]. Specifically, we fix the number of latent dimensions d = 20 on ML10M and d = 10 on Flixter, the iteration number T = 50, the learning rate γ = 0.01, the interaction weight ρ = 0.5, the weight on the auxiliary binary ratings λ = 1, the tradeoff parameter on the regularization terms α = 0.01, and the weights on positive and negative feedback w^p = 2 and w^n = 1.

SLIDE 23

Experiments

Evaluation Metrics

Mean Absolute Error (MAE):

$$\mathrm{MAE} = \sum_{(u,i,r_{ui}) \in TE} |r_{ui} - \hat{r}_{ui}| \,/\, |TE|$$

Root Mean Square Error (RMSE):

$$\mathrm{RMSE} = \sqrt{\sum_{(u,i,r_{ui}) \in TE} (r_{ui} - \hat{r}_{ui})^2 \,/\, |TE|}$$

For both metrics, the smaller the better.
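The two metrics are straightforward to compute; a minimal sketch (the function name and the constant predictor are ours):

```python
def mae_rmse(test_records, predict):
    """MAE and RMSE over the test data TE, given any predict(u, i) function."""
    errors = [r_ui - predict(u, i) for u, i, r_ui in test_records]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5
    return mae, rmse

# A constant predictor on two toy test records: the errors are +1 and -1,
# so MAE = RMSE = 1.0 here.
mae, rmse = mae_rmse([(0, 0, 4.0), (0, 1, 2.0)], lambda u, i: 3.0)
```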

SLIDE 24

Experiments

Main Results (1/4)

Table: Recommendation performance of our preference-aware transfer (PAT) and other factorization-based methods on ML10M and Flixter, where the results of RSVD [Koren, 2008] and TMF [Pan et al., 2016] are copied from [Pan et al., 2016]. Notice that we follow the parameter setting in TMF [Pan et al., 2016] and fix α = 0.01 and T = 50 for all the methods, and wp = 2 and wn = 1 for TMF and our PAT. We also include the configurations in our generic PAT framework for comparative study and reproducibility.

Data | Algorithm | MAE | RMSE | Configurations
ML10M | RSVD | 0.6438±0.0011 | 0.8364±0.0012 | δp = δn = 0, δG = 0, λ = 0, ρ = 1
ML10M | MF-MPC | 0.6162±0.0006 | 0.8063±0.0007 | δp = δn = 0, δG = 1, λ = 0, ρ = 1
ML10M | TMF | 0.6124±0.0007 | 0.8005±0.0008 | δp = δn = 1, δG = 0, λ = 1, ρ = 0.5
ML10M | PAT | 0.6107±0.0003 | 0.7989±0.0008 | δp = δn = 1, δG = 1, λ = 1, ρ = 0.5
Flixter | RSVD | 0.6561±0.0007 | 0.8814±0.0010 | δp = δn = 0, δG = 0, λ = 0, ρ = 1
Flixter | MF-MPC | 0.6383±0.0004 | 0.8644±0.0005 | δp = δn = 0, δG = 1, λ = 0, ρ = 1
Flixter | TMF | 0.6348±0.0007 | 0.8615±0.0012 | δp = δn = 1, δG = 0, λ = 1, ρ = 0.5
Flixter | PAT | 0.6332±0.0006 | 0.8572±0.0010 | δp = δn = 1, δG = 1, λ = 1, ρ = 0.5

SLIDE 25

Experiments

Main Results (2/4)

Observations: Our PAT performs significantly better than all the baseline methods across the two datasets, which shows the effectiveness of our transfer learning solution in modeling users’ heterogeneous feedback and preference context. Compared with RSVD and MF-MPC, TMF and our PAT, which use both the target grade scores and the auxiliary binary ratings, perform better, which showcases the usefulness of the binary ratings.

SLIDE 26

Experiments

Main Results (3/4)

Figure: Recommendation performance (MAE and RMSE) of factorization methods with one-class preference context (OPC) and multiclass preference context (MPC), i.e., MF with OPC (SVD++ [Koren, 2008]), MF with MPC (MF-MPC [Pan and Ming, 2017]), the reduced version of our PAT with OPC (PAT-OPC), and our PAT with MPC (PAT), on ML10M (top) and Flixter (bottom), respectively.

SLIDE 27

Experiments

Main Results (4/4)

The overall performance ordering is SVD++ < MF-MPC < PAT-OPC < PAT, which clearly showcases the effectiveness of our preference-aware transfer learning solution in modeling users’ heterogeneous feedback. For the two methods with OPC, i.e., SVD++ and our PAT-OPC, and the two methods with MPC, i.e., MF-MPC and our PAT, we can see that integrating the binary rating records always brings performance improvement. ...

SLIDE 28

Conclusions and Future Work

Conclusions

In particular, we take the grade scores as the target data and the likes/dislikes as the auxiliary data from a transfer learning view, and exploit the implicit preference beneath both the target and the auxiliary data as the preference context, in order to build a more accurate and generic recommendation model. Technically, we find that several recent algorithms can be projected as parts of our generic solution PAT, i.e., as special cases. Empirically, we obtain very promising results on two large public datasets in comparison with several state-of-the-art methods. More importantly, we observe that the empirical results are consistent with the technical framework across different subsets of components, i.e., more components lead to better performance.

SLIDE 29

Conclusions and Future Work

Future Work

We are interested in further generalizing our generic factorization framework with deep federated learning [Xue et al., 2019, Yang et al., 2019] and ranking-oriented recommendation [Wu et al., 2018, Pei et al., 2019].

SLIDE 30


Thank you!

We thank the support of the National Natural Science Foundation of China under Grant Nos. 61872249, 61836005 and 61672358.

Q & A: If you have any questions and/or suggestions, you are welcome to send us an email: liangfeng2018@email.szu.edu.cn.

SLIDE 31

References

Koren, Y. (2008). Factorization meets the neighborhood: A multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 426–434.

Pan, W. and Ming, Z. (2014). Interaction-rich transfer learning for collaborative filtering with heterogeneous user feedbacks. IEEE Intelligent Systems, 29(6):48–54.

Pan, W. and Ming, Z. (2017). Collaborative recommendation with multiclass preference context. IEEE Intelligent Systems, 32(2):45–51.

Pan, W., Xia, S., Liu, Z., Peng, X., and Ming, Z. (2016). Mixed factorization for collaborative recommendation with heterogeneous explicit feedbacks. Information Sciences, 332:84–93.

Pan, W. and Yang, Q. (2013). Transfer learning in heterogeneous collaborative filtering domains. Artificial Intelligence, 197:39–55.

Pei, C., Zhang, Y., Zhang, Y., Sun, F., Lin, X., Sun, H., Wu, J., Jiang, P., Ge, J., Ou, W., and Pei, D. (2019). Personalized re-ranking for recommendation. In Proceedings of the 13th ACM Conference on Recommender Systems, pages 3–11.

Salakhutdinov, R. and Mnih, A. (2008). Probabilistic matrix factorization. In Annual Conference on Neural Information Processing Systems, pages 1257–1264.

Singh, A. P. and Gordon, G. J. (2008). Relational learning via collective matrix factorization. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 650–658.

SLIDE 32

References (cont.)

Wu, L., Hsieh, C., and Sharpnack, J. (2018). SQL-Rank: A listwise approach to collaborative ranking. In Proceedings of the 35th International Conference on Machine Learning, pages 5311–5320.

Xue, F., He, X., Wang, X., Xu, J., Liu, K., and Hong, R. (2019). Deep item-based collaborative filtering for top-N recommendation. ACM Transactions on Information Systems, 37(3):33:1–33:25.

Yang, Q., Liu, Y., Chen, T., and Tong, Y. (2019). Federated machine learning: Concept and applications. ACM Transactions on Intelligent Systems and Technology, 10(2):12:1–12:19.