SLIDE 1

MF-DMPC: Matrix Factorization with Dual Multiclass Preference Context for Rating Prediction

Jing Lin, Weike Pan, Zhong Ming

National Engineering Laboratory for Big Data System Computing Technology, College of Computer Science and Software Engineering, Shenzhen University, China linjing4@email.szu.edu.cn, {panweike,mingz}@szu.edu.cn

Lin, Pan and Ming (SZU) MF-DMPC 1 / 26

SLIDE 2

Introduction

Context in RS

[Figure: an incomplete user-item rating matrix (n users by m items, with observed ratings and many unknown "?" entries) serving as the internal context, alongside external context such as a social network, weather, date, place, and item content (e.g., La La Land: Comedy, Drama, Music; Avatar: Action, Adventure, Fantasy).]

Using external context in RS may cause the problems of model inflexibility, computational burden and system incompatibility, so algorithms that only use internal context (i.e., collaborative filtering) still occupy an important place in the research community.

Lin, Pan and Ming (SZU) MF-DMPC 2 / 26

SLIDE 3

Introduction

Motivation

MF-MPC combines the two complementary categories of collaborative filtering: neighborhood-based and model-based methods. It improves on SVD (a typical matrix factorization method) by adding a matrix transformed from the multiclass preference context (MPC) of a given user, which represents user similarities as in a neighborhood-based method. In this paper, we further introduce a matrix factorization model that combines not only user similarities but also item similarities.

Lin, Pan and Ming (SZU) MF-DMPC 3 / 26

SLIDE 4

Introduction

The Derivation Process of MF-DMPC (1)

[Figure: (a) the user-based MF-MPC model (u = 1, 2, ..., n; i = 1, 2, ..., m),
$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^T + \bar{U}^{MPC}_{u\cdot} V_{i\cdot}^T + b_u + b_i + \mu,$$
annotated with the user-specific latent feature vector $U_{u\cdot}$, the item-specific latent feature vector $V_{i\cdot}$, the user-specific latent preference vector $\bar{U}^{MPC}_{u\cdot}$, the user bias $b_u$, the item bias $b_i$ and the global average $\mu$.]

According to Pan et al., MF-MPC refers to matrix factorization with (user-based) multiclass preference context.

Lin, Pan and Ming (SZU) MF-DMPC 4 / 26

SLIDE 5

Introduction

The Derivation Process of MF-DMPC (2)

[Figure: (b) the item-based MF-MPC model (u = 1, 2, ..., n; i = 1, 2, ..., m),
$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^T + \bar{V}^{MPC}_{i\cdot} U_{u\cdot}^T + b_u + b_i + \mu,$$
annotated with the item-specific latent preference vector $\bar{V}^{MPC}_{i\cdot}$ and the same feature vectors and bias terms as in (a).]

Symmetrically, we introduce item-based multiclass preference context to represent item similarities.

Lin, Pan and Ming (SZU) MF-DMPC 5 / 26

SLIDE 6

Introduction

The Derivation Process of MF-DMPC (3)

[Figure: (c) the MF-DMPC model (u = 1, 2, ..., n; i = 1, 2, ..., m),
$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^T + \bar{U}^{MPC}_{u\cdot} V_{i\cdot}^T + \bar{V}^{MPC}_{i\cdot} U_{u\cdot}^T + b_u + b_i + \mu,$$
annotated with both the user-specific and the item-specific latent preference vectors in addition to the feature vectors and bias terms above.]

Finally, we introduce both user-based and item-based MPC (dual MPC) into the prediction rule to obtain our improved model, called matrix factorization with dual multiclass preference context (MF-DMPC).

Lin, Pan and Ming (SZU) MF-DMPC 6 / 26

SLIDE 7

Introduction

Advantage of Our Solution

MF-DMPC inherits the high accuracy and good explainability of MF-MPC and performs even better. In fact, our model is a more generic method that successfully exploits the complementarity between user-based and item-based neighborhood information.

Lin, Pan and Ming (SZU) MF-DMPC 7 / 26

SLIDE 8

Preliminaries

Problem Definition

In this paper, we study the problem of making good use of internal context in recommender systems. We only need an incomplete rating matrix R = {(u, i, r_ui)} for our task, where u represents one of the ID numbers of the n users (or rows), i represents one of the ID numbers of the m items (or columns), and r_ui is the recorded rating of user u to item i. As a result, we will build an improved model to estimate the missing entries of the rating matrix.
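As a small illustration (not from the authors' code; all names below are ours), the rating records and the per-class index sets used later, I^r_u and U^r_i, can be kept in plain dictionaries:

```python
from collections import defaultdict

# Toy rating records R = {(u, i, r_ui)} on a 1-5 scale; indices are 0-based here.
R = [(0, 0, 5), (0, 2, 3), (1, 0, 4), (1, 1, 1), (2, 2, 5)]

# I[u][r]: items rated by user u with rating r;  U[i][r]: users who rated item i with r.
I = defaultdict(lambda: defaultdict(set))
U = defaultdict(lambda: defaultdict(set))
for u, i, r in R:
    I[u][r].add(i)
    U[i][r].add(u)

mu = sum(r for _, _, r in R) / len(R)   # global average rating
print(dict(I[0]), round(mu, 2))
```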

Lin, Pan and Ming (SZU) MF-DMPC 8 / 26

SLIDE 9

Preliminaries

Notations

Table: Some notations and explanations.

Symbol                            Meaning
n                                 user number
m                                 item number
u, u′                             user ID
i, i′                             item ID
M                                 multiclass preference set
r_ui ∈ M                          rating of user u to item i
R = {(u, i, r_ui)}                rating records of training data
y_ui ∈ {0, 1}                     indicator: y_ui = 1 if (u, i, r_ui) ∈ R and y_ui = 0 otherwise
I^r_u, r ∈ M                      items rated by user u with rating r
I_u                               items rated by user u
U^r_i, r ∈ M                      users who rate item i with rating r
U_i                               users who rate item i
µ ∈ R                             global average rating value
b_u ∈ R                           user bias
b_i ∈ R                           item bias
d ∈ R                             number of latent dimensions
U_u·, N^r_u· ∈ R^{1×d}            user-specific latent feature vectors
V_i·, M^r_i· ∈ R^{1×d}            item-specific latent feature vectors
R^te = {(u, i, r_ui)}             rating records of test data
r̂_ui                              predicted rating of user u to item i
T                                 iteration number in the algorithm

Lin, Pan and Ming (SZU) MF-DMPC 9 / 26

SLIDE 10

Preliminaries

Prediction Rule of SVD

In the state-of-the-art matrix factorization based model, the SVD model, the prediction rule for the rating of user u to item i is as follows,

$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^T + b_u + b_i + \mu, \qquad (1)$$

where $U_{u\cdot} \in \mathbb{R}^{1\times d}$ and $V_{i\cdot} \in \mathbb{R}^{1\times d}$ are the user-specific and item-specific latent feature vectors, respectively, and $b_u$, $b_i$ and $\mu$ are the user bias, the item bias and the global average, respectively.
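For concreteness, Eq. (1) is just one inner product plus three scalars. Below is a minimal sketch with randomly initialized factors; sizes and names are illustrative only.

```python
import numpy as np

n, m, d = 3, 3, 8                        # users, items, latent dimensions (toy sizes)
rng = np.random.default_rng(0)
Ufac = rng.normal(0.0, 0.01, (n, d))     # U_u. : user-specific latent feature vectors
Vfac = rng.normal(0.0, 0.01, (m, d))     # V_i. : item-specific latent feature vectors
bu = np.zeros(n)                         # user biases b_u
bi = np.zeros(m)                         # item biases b_i
mu = 3.6                                 # global average mu (toy value)

def predict_svd(u, i):
    """Eq. (1): r_hat_ui = U_u. V_i.^T + b_u + b_i + mu."""
    return Ufac[u] @ Vfac[i] + bu[u] + bi[i] + mu

print(round(predict_svd(0, 2), 4))
```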

Lin, Pan and Ming (SZU) MF-DMPC 10 / 26

SLIDE 11

Preliminaries

Preference Generalization Probability of MF-MPC

In the MF-MPC model, the rating of user u to item i, $r_{ui}$, can be represented in a probabilistic way as

$$P\big(r_{ui} \mid (u, i); (u, i', r_{ui'}), i' \in \cup_{r\in M} I^r_u \setminus \{i\}\big), \qquad (2)$$

which means that $r_{ui}$ is dependent on not only the (user, item) pair (u, i), but also the examined items $i' \in I_u \setminus \{i\}$ and the categorical score $r_{ui'} \in M$ of each item. Notice that the multiclass preference context (MPC) refers to the condition $(u, i', r_{ui'}), i' \in \cup_{r\in M} I^r_u \setminus \{i\}$.

Lin, Pan and Ming (SZU) MF-DMPC 11 / 26

SLIDE 12

Preliminaries

Prediction Rule of MF-MPC

In order to introduce MPC into the MF based model, we need a user-specific latent preference vector $\bar{U}^{MPC}_{u\cdot}$ for user u,

$$\bar{U}^{MPC}_{u\cdot} = \sum_{r \in M} \frac{1}{\sqrt{|I^r_u \setminus \{i\}|}} \sum_{i' \in I^r_u \setminus \{i\}} M^r_{i'\cdot}. \qquad (3)$$

Notice that $M^r_{i\cdot} \in \mathbb{R}^{1\times d}$ is a classified item-specific latent feature vector and $\frac{1}{\sqrt{|I^r_u \setminus \{i\}|}}$ plays as a normalization term for the preference of class r.

By adding the neighborhood information $\bar{U}^{MPC}_{u\cdot}$ to the SVD model, we get the MF-MPC prediction rule for the rating of user u to item i,

$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^T + \bar{U}^{MPC}_{u\cdot} V_{i\cdot}^T + b_u + b_i + \mu, \qquad (4)$$

where $U_{u\cdot}$, $V_{i\cdot}$, $b_u$, $b_i$ and $\mu$ are exactly the same as in the SVD model. MF-MPC generates better recommendation performance than SVD and SVD++, and also contains them as particular cases.
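A minimal sketch of Eqs. (3) and (4), assuming the index sets I[u][r] built earlier; Mfac[r] holds the classified item-specific vectors M^r_{i'.}, and all names are illustrative, not the authors' implementation.

```python
import numpy as np

d, classes = 8, (1, 2, 3, 4, 5)
rng = np.random.default_rng(1)
Mfac = {r: rng.normal(0.0, 0.01, (3, d)) for r in classes}   # M^r_{i'.} for 3 toy items
I = {0: {5: {0}, 3: {2}}}                                    # I[u][r]: items rated r by user u

def user_mpc(u, i, I, Mfac, d):
    """Eq. (3): class-wise normalized sum of M^r_{i'.} over items i' in I^r_u \\ {i}."""
    vec = np.zeros(d)
    for r, items in I[u].items():
        items = items - {i}                    # exclude the target item i
        if items:
            vec += sum(Mfac[r][j] for j in items) / np.sqrt(len(items))
    return vec

# Eq. (4): add the preference vector to the SVD rule (Ufac, Vfac, bu, bi, mu as before):
# r_hat = (Ufac[u] + user_mpc(u, i, I, Mfac, d)) @ Vfac[i] + bu[u] + bi[i] + mu
```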

Lin, Pan and Ming (SZU) MF-DMPC 12 / 26

SLIDE 13

Our Model

Dual Multiclass Preference Context

We first define the item-based multiclass preference context (item-based MPC) $\bar{V}^{MPC}_{i\cdot}$ to represent item similarities. Symmetrically, we have

$$\bar{V}^{MPC}_{i\cdot} = \sum_{r \in M} \frac{1}{\sqrt{|U^r_i \setminus \{u\}|}} \sum_{u' \in U^r_i \setminus \{u\}} N^r_{u'\cdot}, \qquad (5)$$

where $N^r_{u\cdot} \in \mathbb{R}^{1\times d}$ is a classified user-specific latent feature vector.

So we have the item-based MF-MPC prediction rule,

$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^T + \bar{V}^{MPC}_{i\cdot} U_{u\cdot}^T + b_u + b_i + \mu. \qquad (6)$$

We can introduce both user-based and item-based neighborhood information into the matrix factorization method by keeping both $\bar{U}^{MPC}_{u\cdot}$ and $\bar{V}^{MPC}_{i\cdot}$ in the model, collectively called the dual multiclass preference context (DMPC).

Lin, Pan and Ming (SZU) MF-DMPC 13 / 26

SLIDE 14

Our Model

Prediction Rule of MF-DMPC

For matrix factorization with dual multiclass preference context, the prediction rule for the rating of user u to item i is defined as follows,

$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^T + \bar{U}^{MPC}_{u\cdot} V_{i\cdot}^T + \bar{V}^{MPC}_{i\cdot} U_{u\cdot}^T + b_u + b_i + \mu, \qquad (7)$$

with all notations as described above. Finally, we call our new model "MF-DMPC" for short.
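Putting the pieces together, Eq. (7) evaluates both preference vectors and adds the two extra inner products to the SVD rule. A minimal end-to-end sketch under the same illustrative names (not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, d, classes = 3, 3, 8, (1, 2, 3, 4, 5)
Ufac = rng.normal(0.0, 0.01, (n, d));  Vfac = rng.normal(0.0, 0.01, (m, d))
bu = np.zeros(n);  bi = np.zeros(m);  mu = 3.6
Mfac = {r: rng.normal(0.0, 0.01, (m, d)) for r in classes}   # M^r_{i'.}
Nfac = {r: rng.normal(0.0, 0.01, (n, d)) for r in classes}   # N^r_{u'.}

# Toy index sets: I[u][r] = items rated r by u,  Uset[i][r] = users who rated i with r.
I    = {0: {5: {0}, 3: {2}}, 1: {4: {0}, 1: {1}}, 2: {5: {2}}}
Uset = {0: {5: {0}, 4: {1}}, 1: {1: {1}},         2: {3: {0}, 5: {2}}}

def mpc_vector(index_sets, factors, exclude, d):
    """Eqs. (3)/(5): class-wise normalized sum of factor rows, excluding the target."""
    vec = np.zeros(d)
    for r, members in index_sets.items():
        members = members - {exclude}
        if members:
            vec += sum(factors[r][k] for k in members) / np.sqrt(len(members))
    return vec

def predict_dmpc(u, i):
    """Eq. (7): r_hat = U_u. V_i.^T + U_MPC V_i.^T + V_MPC U_u.^T + b_u + b_i + mu."""
    u_mpc = mpc_vector(I[u], Mfac, i, d)       # user-based MPC, Eq. (3)
    v_mpc = mpc_vector(Uset[i], Nfac, u, d)    # item-based MPC, Eq. (5)
    return (Ufac[u] + u_mpc) @ Vfac[i] + v_mpc @ Ufac[u] + bu[u] + bi[i] + mu

print(round(predict_dmpc(0, 1), 4))
```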

Lin, Pan and Ming (SZU) MF-DMPC 14 / 26

SLIDE 15

Our Model

Optimization Problem

With the prediction rule, we can learn the model parameters by solving the following minimization problem,

$$\min_{\Theta} \sum_{u=1}^{n} \sum_{i=1}^{m} y_{ui} \Big[ \frac{1}{2} (r_{ui} - \hat{r}_{ui})^2 + \mathrm{reg}(u, i) \Big], \qquad (8)$$

where

$$\mathrm{reg}(u, i) = \frac{\alpha_m}{2} \sum_{r\in M} \sum_{i' \in I^r_u \setminus \{i\}} \|M^r_{i'\cdot}\|_F^2 + \frac{\alpha_n}{2} \sum_{r\in M} \sum_{u' \in U^r_i \setminus \{u\}} \|N^r_{u'\cdot}\|_F^2 + \frac{\alpha_u}{2} \|U_{u\cdot}\|^2 + \frac{\alpha_v}{2} \|V_{i\cdot}\|^2 + \frac{\beta_u}{2} b_u^2 + \frac{\beta_v}{2} b_i^2$$

is the regularization term used to avoid overfitting, and $\Theta = \{U_{u\cdot}, V_{i\cdot}, b_u, b_i, \mu, M^r_{i\cdot}, N^r_{u\cdot}\}$, u = 1, 2, ..., n, i = 1, 2, ..., m, r ∈ M.

Notice that the objective function of MF-DMPC is quite similar to that of MF-MPC. The difference lies in the "dual" MPC, i.e., $\bar{V}^{MPC}_{i\cdot} U_{u\cdot}^T$ in the prediction rule, and $\frac{\alpha_n}{2} \sum_{r\in M} \sum_{u' \in U^r_i \setminus \{u\}} \|N^r_{u'\cdot}\|_F^2$ in the regularization term.
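For a single observed rating, the bracketed term of Eq. (8) is a halved squared error plus the Frobenius-norm penalties of reg(u, i). A small sketch; the function signature and default tradeoff values are illustrative only.

```python
import numpy as np

def pointwise_loss(e_ui, Uu, Vi, b_u, b_i, M_rows, N_rows,
                   a_u=0.01, a_v=0.01, beta_u=0.01, beta_v=0.01, a_m=0.01, a_n=0.01):
    """Bracketed term of Eq. (8) for one (u, i): 0.5 * e_ui^2 + reg(u, i).
    M_rows / N_rows are the M^r_{i'.} and N^r_{u'.} rows entering the two MPC sums."""
    reg = (a_m / 2) * sum(float(np.sum(Mr ** 2)) for Mr in M_rows) \
        + (a_n / 2) * sum(float(np.sum(Nr ** 2)) for Nr in N_rows) \
        + (a_u / 2) * float(np.sum(Uu ** 2)) + (a_v / 2) * float(np.sum(Vi ** 2)) \
        + (beta_u / 2) * b_u ** 2 + (beta_v / 2) * b_i ** 2
    return 0.5 * e_ui ** 2 + reg
```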

Lin, Pan and Ming (SZU) MF-DMPC 15 / 26

SLIDE 16

Our Model

Gradients

Using the stochastic gradient descent (SGD) algorithm, we have the gradients of the model parameters,

$$\nabla U_{u\cdot} = -e_{ui} (V_{i\cdot} + \bar{V}^{MPC}_{i\cdot}) + \alpha_u U_{u\cdot} \qquad (9)$$

$$\nabla V_{i\cdot} = -e_{ui} (U_{u\cdot} + \bar{U}^{MPC}_{u\cdot}) + \alpha_v V_{i\cdot} \qquad (10)$$

$$\nabla b_u = -e_{ui} + \beta_u b_u \qquad (11)$$

$$\nabla b_i = -e_{ui} + \beta_v b_i \qquad (12)$$

$$\nabla \mu = -e_{ui} \qquad (13)$$

$$\nabla M^r_{i'\cdot} = \frac{-e_{ui} V_{i\cdot}}{\sqrt{|I^r_u \setminus \{i\}|}} + \alpha_m M^r_{i'\cdot}, \quad i' \in I^r_u \setminus \{i\}, r \in M, \qquad (14)$$

$$\nabla N^r_{u'\cdot} = \frac{-e_{ui} U_{u\cdot}}{\sqrt{|U^r_i \setminus \{u\}|}} + \alpha_n N^r_{u'\cdot}, \quad u' \in U^r_i \setminus \{u\}, r \in M, \qquad (15)$$

where $e_{ui} = r_{ui} - \hat{r}_{ui}$ is the difference between the true rating and the predicted rating.
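One stochastic update over a single sampled rating, following Eqs. (9)-(16); this is a sketch under the same illustrative names as the prediction sketch, with the target item and user already removed from the index sets by the caller.

```python
import numpy as np

def sgd_step(u, i, r_ui, Ufac, Vfac, bu, bi, mu, Mfac, Nfac, I_u, U_i,
             gamma=0.01, a_u=0.01, a_v=0.01, beta_u=0.01, beta_v=0.01,
             a_m=0.01, a_n=0.01):
    """One SGD update for (u, i, r_ui). I_u / U_i map a rating class to the
    sets I^r_u \\ {i} and U^r_i \\ {u}. Returns the updated global average mu."""
    d = Ufac.shape[1]
    u_mpc, v_mpc = np.zeros(d), np.zeros(d)
    for r, items in I_u.items():
        if items:
            u_mpc += sum(Mfac[r][j] for j in items) / np.sqrt(len(items))   # Eq. (3)
    for r, users in U_i.items():
        if users:
            v_mpc += sum(Nfac[r][w] for w in users) / np.sqrt(len(users))   # Eq. (5)
    r_hat = (Ufac[u] + u_mpc) @ Vfac[i] + v_mpc @ Ufac[u] + bu[u] + bi[i] + mu
    e = r_ui - r_hat                                     # e_ui
    Uu_old, Vi_old = Ufac[u].copy(), Vfac[i].copy()      # gradients use the old values
    # Eqs. (9)-(13) combined with Eq. (16): theta <- theta - gamma * grad
    Ufac[u] -= gamma * (-e * (Vi_old + v_mpc) + a_u * Uu_old)
    Vfac[i] -= gamma * (-e * (Uu_old + u_mpc) + a_v * Vi_old)
    bu[u]   -= gamma * (-e + beta_u * bu[u])
    bi[i]   -= gamma * (-e + beta_v * bi[i])
    mu      -= gamma * (-e)
    # Eqs. (14)-(15): classified factors that entered the two MPC sums
    for r, items in I_u.items():
        for j in items:
            Mfac[r][j] -= gamma * (-e * Vi_old / np.sqrt(len(items)) + a_m * Mfac[r][j])
    for r, users in U_i.items():
        for w in users:
            Nfac[r][w] -= gamma * (-e * Uu_old / np.sqrt(len(users)) + a_n * Nfac[r][w])
    return mu
```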

Lin, Pan and Ming (SZU) MF-DMPC 16 / 26

SLIDE 17

Our Model

Update Rules

We have the update rule,

$$\theta = \theta - \gamma \nabla\theta, \qquad (16)$$

where $\gamma$ is the learning rate and $\theta \in \Theta$ is a model parameter to be learned.

Lin, Pan and Ming (SZU) MF-DMPC 17 / 26

SLIDE 18

Our Model

Algorithm of MF-DMPC

1:  Initialize the model parameters Θ
2:  for t = 1, ..., T do
3:      for t2 = 1, ..., |R| do
4:          Randomly pick up a rating from R
5:          Calculate the gradients via Eqs. (9)-(15)
6:          Update the parameters via Eq. (16)
7:      end for
8:      Decrease the learning rate: γ ← γ × 0.9
9:  end for

Figure: The algorithm of MF-DMPC.
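The outer loop follows the figure directly. A sketch that reuses the `sgd_step` function and the toy parameters and index sets (Ufac, Vfac, bu, bi, mu, Mfac, Nfac, I, Uset) from the sketches above; it is not the authors' implementation.

```python
import random

R = [(0, 0, 5), (0, 2, 3), (1, 0, 4), (1, 1, 1), (2, 2, 5)]   # training ratings
T, gamma = 50, 0.01

for t in range(T):
    for _ in range(len(R)):
        u, i, r_ui = random.choice(R)                          # line 4: pick a rating at random
        I_u = {r: s - {i} for r, s in I[u].items()}            # I^r_u \ {i}
        U_i = {r: s - {u} for r, s in Uset[i].items()}         # U^r_i \ {u}
        mu = sgd_step(u, i, r_ui, Ufac, Vfac, bu, bi, mu,      # lines 5-6: gradients + updates
                      Mfac, Nfac, I_u, U_i, gamma=gamma)
    gamma *= 0.9                                               # line 8: decay the learning rate
```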

Lin, Pan and Ming (SZU) MF-DMPC 18 / 26

SLIDE 19

Our Model

Analysis

The time complexity of MF-MPC, SVD++ and the proposed MF-DMPC satisfies MF-DMPC > MF-MPC > SVD++, mainly because of the traversal needed to calculate $\bar{U}^{MPC}_{u\cdot}$ and $\bar{V}^{MPC}_{i\cdot}$ in MF-DMPC, $\bar{U}^{MPC}_{u\cdot}$ in MF-MPC, and $\bar{U}^{OPC}_{u\cdot}$ (the oneclass preference context) in SVD++. As for space complexity, we can reckon from the size of the dominating model parameter vectors shown in the table below.

Table: The size of the dominating model parameter vectors in different models.

Model      Dominating model parameter vectors   Size (×d)
SVD        U_u·, V_i·                           n + m
SVD++      U_u·, V_i·, M_i·                     n + m + m
MF-MPC     U_u·, V_i·, M^r_i·                   n + m + m|M|
MF-DMPC    U_u·, V_i·, M^r_i·, N^r_u·           n + m + m|M| + n|M|
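As a quick back-of-the-envelope check of the table, for ML100K (n = 943, m = 1,682, |M| = 5 rating classes and d = 20), the counts work out as follows (a sketch, numbers taken from the data-set table):

```python
n, m, M, d = 943, 1682, 5, 20          # ML100K: users, items, rating classes, latent dims

counts = {
    "SVD":     n + m,                  # U_u., V_i.
    "SVD++":   n + m + m,              # plus one extra preference vector per item
    "MF-MPC":  n + m + m * M,          # plus M^r_i. per item and rating class
    "MF-DMPC": n + m + m * M + n * M,  # plus N^r_u. per user and rating class
}
for name, c in counts.items():
    print(f"{name:8s} {c:6d} vectors  ->  {c * d} latent parameters")
```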

Lin, Pan and Ming (SZU) MF-DMPC 19 / 26

SLIDE 20

Experiments

Data Sets and Evaluation Metrics

We choose the same data sets used in previous research about MF-MPC for convenience (see table below). We use five-fold cross validation in the empirical studies.

Table: Statistics of the data sets used in the experiments.

Data set   n        m        |R| + |R^te|   |R|/n/m   n/m
ML100K     943      1,682    100,000        5.04%     0.56
ML1M       6,040    3,952    1,000,209      3.35%     1.53
ML10M      71,567   10,681   10,000,054     1.05%     6.70

We adopt the mean absolute error (MAE) and the root mean square error (RMSE) as evaluation metrics:

$$MAE = \sum_{(u,i,r_{ui}) \in R^{te}} |r_{ui} - \hat{r}_{ui}| \, / \, |R^{te}|$$

$$RMSE = \sqrt{\sum_{(u,i,r_{ui}) \in R^{te}} (r_{ui} - \hat{r}_{ui})^2 \, / \, |R^{te}|}$$
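A minimal sketch of the two metrics, given any prediction function such as the `predict_dmpc` sketch from earlier:

```python
import numpy as np

def mae_rmse(test_records, predict):
    """MAE and RMSE over R_te = {(u, i, r_ui)} for a predict(u, i) function."""
    errs = np.array([r_ui - predict(u, i) for u, i, r_ui in test_records])
    return float(np.mean(np.abs(errs))), float(np.sqrt(np.mean(errs ** 2)))

# Example: mae, rmse = mae_rmse([(0, 1, 4.0)], predict_dmpc)
```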

Lin, Pan and Ming (SZU) MF-DMPC 20 / 26

SLIDE 21

Experiments

Baselines and Parameter Settings

In order to find out the effects of introducing different kinds of MPC into the matrix factorization (MF) model, we compare the performance of SVD (see Eq. (1)) against that achieved by matrix factorization with user-based MPC (see Eq. (4)), item-based MPC (see Eq. (6)) and dual MPC (see Eq. (7)). We configure the parameter settings of the factorization-based methods as follows:

- The learning rate: γ = 0.01.
- The number of latent dimensions: d = 20.
- The iteration number: T = 50.
- The tradeoff parameters are searched empirically using the first copy of each data set and the RMSE metric, under the following conditions: α_u = α_v = β_u = β_v = α with α ∈ {0.001, 0.01, 0.1}; for user-based MF-MPC, α_m = α; for item-based MF-MPC, α_n = α; for dual MPC (MF-DMPC), α_m, α_n ∈ {0.001, 0.01, 0.1}.

Lin, Pan and Ming (SZU) MF-DMPC 21 / 26

SLIDE 22

Experiments

Results

Table: Recommendation performance of our MF-DMPC and other baseline methods on three Movielens data sets.

Data     Method              MAE             RMSE             Parameter (α, α_m, α_n)
ML100K   SVD                 0.7446±0.0033   0.9445±0.0035    (0.01, N/A, N/A)
ML100K   User-Based MF-MPC   0.7123±0.0028   0.9102±0.0029    (0.01, 0.01, N/A)
ML100K   Item-Based MF-MPC   0.7038±0.0021   0.9008±0.0025    (0.01, N/A, 0.01)
ML100K   MF-DMPC             0.7011±0.0025   0.8991±0.0024    (0.01, 0.01, 0.01)
ML1M     SVD                 0.7017±0.0016   0.8899±0.0023    (0.01, N/A, N/A)
ML1M     User-Based MF-MPC   0.6613±0.0015   0.8465±0.0017    (0.01, 0.01, N/A)
ML1M     Item-Based MF-MPC   0.6587±0.0009   0.8439±0.0013    (0.01, N/A, 0.01)
ML1M     MF-DMPC             0.6564±0.0016   0.8434±0.0017    (0.01, 0.01, 0.01)
ML10M    SVD                 0.6067±0.0007   0.7913±0.0009    (0.01, N/A, N/A)
ML10M    User-Based MF-MPC   0.5965±0.0006   0.78135±0.0007   (0.01, 0.01, N/A)
ML10M    Item-Based MF-MPC   0.6024±0.0006   0.7900±0.0008    (0.01, N/A, 0.01)
ML10M    MF-DMPC             0.5955±0.0005   0.78133±0.0007   (0.01, 0.001, 0.1)

Lin, Pan and Ming (SZU) MF-DMPC 22 / 26

SLIDE 23

Experiments

Observations

We can make the following observations:

- The performance of the factorization framework improves greatly when multiclass preference context is introduced.
- Among all kinds of MPC, dual MPC contributes the most to minimizing the prediction error.
- On the MovieLens data sets, whether user-based or item-based MPC is more helpful depends on the ratio of the user group size to the item group size (n/m). Normally, item-based MF-MPC performs better when n/m is of a moderate size, while when a massive number of users is involved (n/m gets large), user-based MPC becomes even more important. (This may also be affected by additional factors such as the density of the rating matrix.)
- The performance of MF-DMPC is in a way restrained by the better of the user-based and item-based MF-MPC results: it is just slightly better than that result. Still, this slight improvement supports the view that MF-DMPC successfully strikes a balance between user-based and item-based MPC, showing MF-DMPC to be a more generic method.

Lin, Pan and Ming (SZU) MF-DMPC 23 / 26

SLIDE 24

Conclusions and Future Work

Conclusions

We present a novel collaborative filtering method that joins neighborhood information to a factorization model for rating prediction. Specifically, we extend the multiclass preference context (MPC) to include two types, i.e., user-based and item-based, and combine them in a single prediction rule in order to achieve better recommendation performance than the reference models.

Lin, Pan and Ming (SZU) MF-DMPC 24 / 26

SLIDE 25

Conclusions and Future Work

Future Work

We are interested in studying issues such as the efficiency and robustness of factorization-based algorithms with preference context. We also expect some advanced strategies, perhaps based on adversarial sampling, denoising or multilayer perceptrons, to be used in the algorithm.

Lin, Pan and Ming (SZU) MF-DMPC 25 / 26

SLIDE 26

Thank You

Thank you!

We thank Mr. Yunfeng Huang for his assistance in code review and for helpful discussions. We thank the support of the National Natural Science Foundation of China No. 61502307 and No. 61672358, and the Natural Science Foundation of Guangdong Province No. 2016A030313038.

Lin, Pan and Ming (SZU) MF-DMPC 26 / 26