SLIDE 1

Transfer Learning for Heterogeneous One-Class Collaborative Filtering

Weike Pan, Mengsi Liu and Zhong Ming∗

panweike@szu.edu.cn, liumengsi@email.szu.edu.cn, mingz@szu.edu.cn College of Computer Science and Software Engineering Shenzhen University

Pan, Liu and Ming (CSSE, SZU) HOCCF (TJSL) IEEE Intelligent Systems 1 / 27

SLIDE 2

Introduction

Problem Definition

For a user u ∈ U, we have a set of preferred items, i.e., Pu, and a set of examined items, i.e., Eu.

Our goal is then to exploit these two types of one-class feedback and recommend a ranked list of items from I\Pu for each user u.

SLIDE 3

Introduction

Challenges

  • Sparsity of positive feedback
  • Uncertainty of implicit examinations

SLIDE 4

Introduction

Overview of Our Solution

Algorithm: Transfer via Joint Similarity Learning (TJSL)
  • Learn a similarity between a candidate item and a preferred item
  • Learn a similarity between a candidate item and an identified likely-to-prefer examined item

SLIDE 5

Introduction

Advantage of Our Solution

Joint similarity learning has the merit of being able to connect two seemingly unrelated items, which cannot be achieved w.r.t. the sparse positive feedback only.

SLIDE 6

Introduction

Notations

U: user set, u ∈ U
I: item set, i, i′, j ∈ I
P: positive feedbacks, with Pu = {i | (u, i) ∈ P} the items preferred by user u
E: implicit examinations, with Eu = {i | (u, i) ∈ E} the items examined by user u
R = {(u, i)}: all (user, item) pairs
A ⊂ R\P: sampled (assumed-negative) feedbacks
Pte: test positive feedbacks
d ∈ R: number of latent features
Vi·, Pi′·, Ej· ∈ R^(1×d): latent feature vectors
bu ∈ R: user bias
bi ∈ R: item bias
rui ∈ {1, 0}: preference of (u, i)
r̂ui: prediction of (u, i)
T, L, L0: iteration numbers
ρ: sampling parameter
αv, αp, αe, βu, βv: tradeoff parameters

SLIDE 7

Method

Prediction Rule of FISMrmse for OCCF

The predicted preference of user u on item i is

$$\hat{r}_{ui} = b_u + b_i + \sum_{i' \in P_u \setminus \{i\}} s_{i'i}, \quad (1)$$

where $\sum_{i' \in P_u \setminus \{i\}} s_{i'i} = \bar{U}^{-i}_{u\cdot} V_{i\cdot}^T$ and

$$\bar{U}^{-i}_{u\cdot} = \frac{1}{\sqrt{|P_u \setminus \{i\}|}} \sum_{i' \in P_u \setminus \{i\}} P_{i'\cdot}.$$

Note that $\bar{U}^{-i}_{u\cdot}$ can be considered as a virtual user profile w.r.t. user u and item i according to the positive feedback.
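The score in Eq. (1) can be sketched in a few lines of NumPy. This is not the authors' code; the array names (`b_u`, `b_i`, `V`, `P`) mirror the slide's notation, and square-root normalization of the preferred-item profile is assumed.

```python
import numpy as np

def fism_score(u, i, Pu, b_u, b_i, V, P):
    """Sketch of Eq. (1): score of item i for user u, built from
    the user's preferred items Pu while excluding i itself."""
    neighbors = [k for k in Pu if k != i]          # P_u \ {i}
    if not neighbors:                              # cold case: biases only
        return b_u[u] + b_i[i]
    # virtual user profile: normalized sum of preferred-item factors
    U_bar = P[neighbors].sum(axis=0) / np.sqrt(len(neighbors))
    return b_u[u] + b_i[i] + float(U_bar @ V[i])
```

With unit factors and two neighbors, the similarity part contributes 2/√2 per latent dimension, which is easy to verify by hand.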

SLIDE 8

Method

Prediction Rule of TJSL(1,1) for HOCCF

The predicted preference of user u on item i is

$$\hat{r}_{ui} = b_u + b_i + \sum_{i' \in P_u \setminus \{i\}} s_{i'i} + \sum_{j \in E_u} s_{ji}, \quad (2)$$

where $\sum_{i' \in P_u \setminus \{i\}} s_{i'i} = \bar{U}^{-i}_{u\cdot} V_{i\cdot}^T$, $\sum_{j \in E_u} s_{ji} = \tilde{\bar{U}}_{u\cdot} V_{i\cdot}^T$, and

$$\bar{U}^{-i}_{u\cdot} = \frac{1}{\sqrt{|P_u \setminus \{i\}|}} \sum_{i' \in P_u \setminus \{i\}} P_{i'\cdot}, \qquad \tilde{\bar{U}}_{u\cdot} = \frac{1}{\sqrt{|E_u|}} \sum_{j \in E_u} E_{j\cdot}.$$

Note that $\tilde{\bar{U}}_{u\cdot}$ is a virtual user profile of user u according to the implicit examinations.

SLIDE 9

Method

Objective Function of TJSL(1,1)

The objective function of TJSL(1,1) is

$$\min_{\Theta} \sum_{(u,i) \in P \cup A} f_{ui}, \quad (3)$$

where $\Theta = \{b_u, b_i, V_{i\cdot}, P_{i'\cdot}, E_{j\cdot}\}$, $i, i', j = 1, \ldots, m$, $u = 1, \ldots, n$, and

$$f_{ui} = \frac{1}{2}(r_{ui} - \hat{r}_{ui})^2 + \frac{\alpha_v}{2}\|V_{i\cdot}\|_F^2 + \frac{\alpha_p}{2}\sum_{i' \in P_u \setminus \{i\}}\|P_{i'\cdot}\|_F^2 + \frac{\alpha_e}{2}\sum_{j \in E_u}\|E_{j\cdot}\|_F^2 + \frac{\beta_u}{2}b_u^2 + \frac{\beta_v}{2}b_i^2.$$

SLIDE 10

Method

Gradients of TJSL(1,1)

For each (u, i) ∈ P ∪ A, we have the gradients

$$\nabla b_u = \frac{\partial f_{ui}}{\partial b_u} = -e_{ui} + \beta_u b_u$$

$$\nabla b_i = \frac{\partial f_{ui}}{\partial b_i} = -e_{ui} + \beta_v b_i$$

$$\nabla V_{i\cdot} = \frac{\partial f_{ui}}{\partial V_{i\cdot}} = -e_{ui}(\bar{U}^{-i}_{u\cdot} + \tilde{\bar{U}}_{u\cdot}) + \alpha_v V_{i\cdot}$$

$$\nabla P_{i'\cdot} = \frac{\partial f_{ui}}{\partial P_{i'\cdot}} = -e_{ui}\frac{1}{\sqrt{|P_u \setminus \{i\}|}} V_{i\cdot} + \alpha_p P_{i'\cdot}, \quad i' \in P_u \setminus \{i\}$$

$$\nabla E_{j\cdot} = \frac{\partial f_{ui}}{\partial E_{j\cdot}} = -e_{ui}\frac{1}{\sqrt{|E_u|}} V_{i\cdot} + \alpha_e E_{j\cdot}, \quad j \in E_u$$

where e_ui = r_ui − r̂_ui. Note that r_ui = 1 if (u, i) ∈ P, and r_ui = 0 if (u, i) ∈ A.
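A single stochastic step for one sampled pair (u, i), following the gradients above, might look like the sketch below. The array layout and hyperparameter defaults are assumptions for illustration, not the authors' implementation; `gamma` is the learning rate γ.

```python
import numpy as np

def sgd_step(u, i, r_ui, Pu, Eu, b_u, b_i, V, P, E,
             gamma=0.01, a_v=0.01, a_p=0.01, a_e=0.01,
             beta_u=0.01, beta_v=0.01):
    """One TJSL(1,1) update for pair (u, i): compute the error e_ui,
    then descend on b_u, b_i, V_i., P_i'. and E_j. in place."""
    nbrs = [k for k in Pu if k != i]
    d = V.shape[1]
    U_bar = P[nbrs].sum(axis=0) / np.sqrt(len(nbrs)) if nbrs else np.zeros(d)
    U_til = E[Eu].sum(axis=0) / np.sqrt(len(Eu)) if Eu else np.zeros(d)
    e = r_ui - (b_u[u] + b_i[i] + (U_bar + U_til) @ V[i])
    # gradients are computed from the current parameters, then applied
    grad_V = -e * (U_bar + U_til) + a_v * V[i]
    b_u[u] -= gamma * (-e + beta_u * b_u[u])
    b_i[i] -= gamma * (-e + beta_v * b_i[i])
    if nbrs:
        P[nbrs] -= gamma * (-e * V[i] / np.sqrt(len(nbrs)) + a_p * P[nbrs])
    if Eu:
        E[Eu] -= gamma * (-e * V[i] / np.sqrt(len(Eu)) + a_e * E[Eu])
    V[i] -= gamma * grad_V
    return e
```

Note that all gradients use the parameter values from before the update, matching lines 8-12 of the algorithm on the next slide.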

SLIDE 11

Method

Update Rules of TJSL(1,1)

For each (u, i) ∈ P ∪ A, we have the update rules

$$b_u \leftarrow b_u - \gamma \nabla b_u, \quad b_i \leftarrow b_i - \gamma \nabla b_i, \quad V_{i\cdot} \leftarrow V_{i\cdot} - \gamma \nabla V_{i\cdot},$$

$$P_{i'\cdot} \leftarrow P_{i'\cdot} - \gamma \nabla P_{i'\cdot}, \; i' \in P_u \setminus \{i\}, \qquad E_{j\cdot} \leftarrow E_{j\cdot} - \gamma \nabla E_{j\cdot}, \; j \in E_u,$$

where γ (γ > 0) is the learning rate.

SLIDE 12

Method

Algorithm of TJSL(1,1)

1: Input: Positive feedbacks P, implicit examinations E, iteration number T, and parameters ρ, αv, αp, αe, βu, βv
2: Output: Learned model Θ
3: Initialize the model Θ
4: for t = 1, …, T do
5:   Randomly sample A ⊂ R\P with |A| = ρ|P|
6:   for t2 = 1, …, |P ∪ A| do
7:     Randomly pick (u, i) ∈ P ∪ A
8:     Calculate Ū^(−i)_u· = (1/√|Pu\{i}|) Σ_{i′∈Pu\{i}} Pi′·
9:     Calculate Ũ_u· = (1/√|Eu|) Σ_{j∈Eu} Ej·
10:    Calculate r̂_ui = bu + bi + Ū^(−i)_u· V_i·^T + Ũ_u· V_i·^T
11:    Calculate e_ui = r_ui − r̂_ui
12:    Update bu, bi, Vi·, Pi′· (i′ ∈ Pu\{i}) and Ej· (j ∈ Eu)
13:  end for
14: end for
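Line 5 of the algorithm draws a fresh set of assumed-negative pairs on every scan. A minimal sketch of that sampling step (a hypothetical helper, not the authors' code; P is assumed to be a set of (u, i) index pairs):

```python
import random

def sample_negatives(P, n_users, n_items, rho=3, seed=0):
    """Draw A subset of R \\ P with |A| = rho * |P|, as in line 5.
    Rejection sampling: keep drawing pairs until enough unobserved
    ones are found (fast when P is sparse relative to R)."""
    rng = random.Random(seed)
    A = set()
    target = rho * len(P)
    while len(A) < target:
        pair = (rng.randrange(n_users), rng.randrange(n_items))
        if pair not in P and pair not in A:
            A.add(pair)
    return A
```

Because A is resampled each outer iteration, every unobserved pair eventually gets a chance to serve as a negative example.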

SLIDE 13

Method

Prediction Rule of TJSL

The predicted preference of user u on item i is

$$\hat{r}^{(\ell)}_{ui} = b_u + b_i + \sum_{i' \in P_u \setminus \{i\}} s_{i'i} + \sum_{j \in E^{(\ell)}_u} s_{ji}, \quad E^{(\ell)}_u \subseteq E_u, \quad (4)$$

where $s_{i'i} = \frac{1}{\sqrt{|P_u \setminus \{i\}|}} P_{i'\cdot} V_{i\cdot}^T$ and $s_{ji} = \frac{1}{\sqrt{|E^{(\ell)}_u|}} E_{j\cdot} V_{i\cdot}^T$.

SLIDE 14

Method

Objective Function of TJSL

The objective function of TJSL is as follows,

$$\min_{\Theta^{(\ell)}, \, E^{(\ell)}_u \subseteq E_u} \sum_{(u,i) \in P \cup A} f^{(\ell)}_{ui}, \quad (5)$$

where

$$f^{(\ell)}_{ui} = \frac{1}{2}(r_{ui} - \hat{r}^{(\ell)}_{ui})^2 + \frac{\alpha_v}{2}\|V_{i\cdot}\|_F^2 + \frac{\alpha_p}{2}\sum_{i' \in P_u \setminus \{i\}}\|P_{i'\cdot}\|_F^2 + \frac{\alpha_e}{2}\sum_{j \in E^{(\ell)}_u}\|E_{j\cdot}\|_F^2 + \frac{\beta_u}{2}b_u^2 + \frac{\beta_v}{2}b_i^2,$$

and the model parameters are $\Theta^{(\ell)} = \{b_u, b_i, V_{i\cdot}, P_{i'\cdot}, E_{j\cdot} \mid u \in U, i \in I, i' \in P_u, j \in E^{(\ell)}_u\}$.

SLIDE 15

Method

Gradients of TJSL

Given the selected examined items $E^{(\ell)}_u$, we have the gradient of each corresponding θ ∈ Θ^(ℓ) for (u, i) ∈ P ∪ A,

$$\nabla\theta = \frac{\partial f^{(\ell)}_{ui}}{\partial \theta}, \quad (6)$$

where ∇θ can be

$$\nabla b_u = -e_{ui} + \beta_u b_u, \qquad \nabla b_i = -e_{ui} + \beta_v b_i,$$

$$\nabla V_{i\cdot} = -e_{ui}\Big(\frac{1}{\sqrt{|P_u \setminus \{i\}|}}\sum_{i' \in P_u \setminus \{i\}} P_{i'\cdot} + \frac{1}{\sqrt{|E^{(\ell)}_u|}}\sum_{j \in E^{(\ell)}_u} E_{j\cdot}\Big) + \alpha_v V_{i\cdot},$$

$$\nabla P_{i'\cdot} = -e_{ui}\frac{1}{\sqrt{|P_u \setminus \{i\}|}} V_{i\cdot} + \alpha_p P_{i'\cdot}, \quad i' \in P_u \setminus \{i\},$$

$$\nabla E_{j\cdot} = -e_{ui}\frac{1}{\sqrt{|E^{(\ell)}_u|}} V_{i\cdot} + \alpha_e E_{j\cdot}, \quad j \in E^{(\ell)}_u.$$

Note that $e_{ui} = r_{ui} - \hat{r}^{(\ell)}_{ui}$ is the difference between the true preference and the estimated preference.

SLIDE 16

Method

Update Rule of TJSL

Finally, we have the update rule for each corresponding θ ∈ Θ^(ℓ),

$$\theta \leftarrow \theta - \gamma \nabla\theta, \quad (7)$$

where γ (γ > 0) is the learning rate.

SLIDE 17

Method

Identification of Likely-to-Prefer Items

Once we have learned the model parameters Θ^(ℓ), we can identify some likely-to-prefer items from Eu via the prediction rule in Eq.(4). Specifically, for each examined item j ∈ Eu of user u, we estimate its preference score r̂^(ℓ)_uj, and then take the τ|Eu| examined items with the highest scores. Note that τ (0 < τ ≤ 1) is an item-selection parameter, initialized as 1 in the beginning and then gradually decreased via τ ← τ × 0.9 so as to ignore unlikely-to-prefer items.
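The selection step above amounts to a top-k cut over the examined items. A sketch follows; how the non-integer count τ|Eu| is rounded is an assumption here (the slide does not specify it), and `scores` is a hypothetical mapping from item to its predicted score r̂^(ℓ)_uj.

```python
def select_likely_to_prefer(Eu, scores, tau):
    """Keep the tau-fraction of user u's examined items with the
    highest predicted preference scores (slide 17)."""
    k = max(1, int(tau * len(Eu)))              # assumed rounding: floor, >= 1
    ranked = sorted(Eu, key=lambda j: scores[j], reverse=True)
    return ranked[:k]
```

With τ = 1 every examined item survives; each later round shrinks the kept set by roughly 10%.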

SLIDE 18

Method

Algorithm of TJSL (part 1)

Input: Positive feedbacks P, implicit examinations E, iteration numbers T, L, L0, and parameters ρ, αv, αp, αe, βu, βv.
Output: Selected examinations E^(ℓ) and learned models Θ^(ℓ), ℓ = L − L0 + 1, …, L.

The final prediction rule is

$$\hat{r}_{ui} = \frac{1}{L_0} \sum_{\ell = L - L_0 + 1}^{L} \hat{r}^{(\ell)}_{ui},$$

where r̂^(ℓ)_ui is the prediction using the ℓ-th model parameters and data.
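The final rule is just an average over the last L0 models. A one-liner sketch, where `preds` is a hypothetical list holding r̂^(ℓ)_ui for ℓ = 1, …, L:

```python
def final_prediction(preds, L, L0):
    """Average the last L0 per-model predictions r_hat^(l)_ui,
    l = L - L0 + 1, ..., L (slide 18). preds[l-1] is model l's score."""
    last = preds[L - L0:]          # models L-L0+1 .. L (1-indexed)
    return sum(last) / L0
```

Averaging only the last models uses the rounds in which the noisiest examined items have already been filtered out.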

SLIDE 19

Method

Algorithm of TJSL (part 2)

1: Let E^(1) = E, τ = 1
2: for ℓ = 1, …, L do
3:   Initialize the model Θ^(ℓ)
4:   for t = 1, …, T do
5:     Randomly sample A ⊂ R\P with |A| = ρ|P|
6:     for t2 = 1, …, |P ∪ A| do
7:       Randomly pick (u, i) ∈ P ∪ A
8:       Calculate r̂^(ℓ)_ui via Eq.(4)
9:       Calculate ∇θ, θ ∈ Θ^(ℓ) via Eq.(6)
10:      Update θ, θ ∈ Θ^(ℓ) via Eq.(7)
11:    end for
12:  end for
13:  if ℓ > L − L0 then
14:    Save the current model and data, i.e., Θ^(ℓ), E^(ℓ)
15:  end if
16:  if L > 1 and L > ℓ then
17:    τ ← τ × 0.9
18:    for u ∈ U do
19:      Select E^(ℓ+1)_u ⊆ Eu with |E^(ℓ+1)_u| = τ|Eu|
20:    end for
21:  end if
22: end for

SLIDE 20

Method

Analysis

  • When L = L0 = 1, TJSL reduces to TJSL(1,1)
  • When E = ∅, TJSL(1,1) reduces to FISMrmse
  • When T = 0, FISMrmse reduces to PopRank

SLIDE 21

Experiments

Datasets

Table: Description of the data sets used in the experiments, including numbers of users (|U|), items (|I|), positive feedbacks (|P|), implicit examinations (|E|), and test positive feedbacks (|Pte|).

Data set        |U|    |I|    |P|     |E|      |Pte|
MovieLens100K   943    1682   9438    45285    2153
MovieLens1M     6040   3952   90848   400083   45075
Alibaba2015     7475   5257   9290    62659    2322

The data and code are available at http://www.cse.ust.hk/~weikep/TL4HOCCF/

SLIDE 22

Experiments

Baselines

  • BPR: Bayesian Personalized Ranking
  • FISMrmse: Factored Item Similarity Model using RMSE loss

SLIDE 23

Experiments

Initialization of Model Parameters

We use the statistics of the positive feedbacks P to initialize the model parameters,

$$b_u = \sum_{i=1}^{m} y_{ui}/m - \mu, \qquad b_i = \sum_{u=1}^{n} y_{ui}/n - \mu,$$

$$V_{ik} = (r - 0.5) \times 0.01, \quad P_{i'k} = (r - 0.5) \times 0.01, \quad E_{jk} = (r - 0.5) \times 0.01, \quad k = 1, \ldots, d,$$

where r (0 ≤ r < 1) is a random variable and $\mu = \sum_{u=1}^{n}\sum_{i=1}^{m} y_{ui}/(nm)$.
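The initialization above can be sketched directly, assuming `y` is a dense n x m 0/1 matrix of positive feedbacks (in practice one would use a sparse representation; this version is for illustration only):

```python
import numpy as np

def init_params(y, d=20, seed=0):
    """Initialize biases and latent factors from the positives
    matrix y (n users x m items), following slide 23."""
    rng = np.random.default_rng(seed)
    n, m = y.shape
    mu = y.sum() / (n * m)                      # global average
    b_u = y.sum(axis=1) / m - mu                # per-user bias
    b_i = y.sum(axis=0) / n - mu                # per-item bias
    # small uniform noise in [-0.005, 0.005) for all factor matrices
    V = (rng.random((m, d)) - 0.5) * 0.01
    P = (rng.random((m, d)) - 0.5) * 0.01
    E = (rng.random((m, d)) - 0.5) * 0.01
    return b_u, b_i, V, P, E, mu
```

Centering the biases around the global average μ lets the factor terms start near zero and learn only the residual structure.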

SLIDE 24

Experiments

Parameter Configurations

We fix ρ = 3, γ = 0.01 and d = 20, and search the best values of the following parameters using NDCG@5:

  • αv = αp = αe = βu = βv ∈ {0.001, 0.01, 0.1}
  • T ∈ {100, 500, 1000}

For the iteration numbers, we fix L = 10 and L0 = 3.

SLIDE 25

Experiments

Results

Data set     Metric        BPR             FISMrmse        TJSL
ML100K       Precision@5   0.0552±0.0006   0.0628±0.0015   0.0697±0.0016
ML100K       NDCG@5        0.0874±0.0020   0.1029±0.0017   0.1133±0.0047
ML1M         Precision@5   0.0928±0.0008   0.0971±0.0013   0.1012±0.0011
ML1M         NDCG@5        0.1121±0.0010   0.1189±0.0008   0.1248±0.0010
Alibaba2015  Precision@5   0.0050±0.0006   0.0046±0.0003   0.0071±0.0004
Alibaba2015  NDCG@5        0.0138±0.0017   0.0126±0.0009   0.0200±0.0008

Observations

  • FISMrmse is close to or better than BPR, which shows the effectiveness of the similarity learning and neighborhood-based prediction rule in FISMrmse as compared with those of BPR.
  • TJSL further boosts the performance of FISMrmse significantly via selection of examined items, which shows the usefulness of the selected examined items and the effectiveness of integrating them in a joint similarity learning manner.

SLIDE 26

Conclusion

We study a new and important recommendation problem called heterogeneous one-class collaborative filtering (HOCCF), which contains positive feedbacks and implicit examinations. We map the HOCCF problem to the transfer learning paradigm with target data (positive feedbacks) and auxiliary data (implicit examinations), and then design a novel transfer learning algorithm, i.e., transfer via joint similarity learning (TJSL).

SLIDE 27

Thank you!

We thank the editors and reviewers for their expert comments and constructive suggestions. We thank the support of Natural Science Foundation of Guangdong Province No. 2014A030310268, National Natural Science Foundation of China No. 61502307, 61170077, Grant of Shenzhen City No. KQCX20140519103756206 and Natural Science Foundation of SZU No. 201436.