

SLIDE 1

Relation Regularized Matrix Factorization

Wu-Jun Li, Dit-Yan Yeung

Department of Computer Science and Engineering Hong Kong University of Science and Technology Hong Kong, China

IJCAI 2009

Li and Yeung (CSE, HKUST) RRMF IJCAI 2009 1 / 23

SLIDE 2

Contents

1. Introduction
2. Relation Regularized Matrix Factorization
   - Model Formulation
   - Learning
   - Convergence and Complexity Analysis
3. Experiments
4. Conclusion

SLIDE 3

Introduction

Matrix Factorization (MF)

Projects instances into a lower-dimensional latent space.
X: n × m, with each row Xi∗ denoting an instance
X ≈ UV^T, where U: n × D, V: m × D, and D < m
Ui∗ is the lower-dimensional representation of Xi∗
Objective: to get a U which can remove the noise in X
Example: latent semantic indexing (LSI) for document analysis
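The factorization above can be sketched with a truncated SVD, the decomposition behind LSI. This is an illustrative toy with random data and made-up sizes, not the learning algorithm of this paper:

```python
import numpy as np

# Toy sketch of X ~= U V^T via truncated SVD (the factorization behind LSI).
# Sizes and data are made up for illustration.
rng = np.random.default_rng(0)
n, m, D = 8, 6, 2                 # instances, features, latent dimensionality (D < m)
X = rng.standard_normal((n, m))

Uf, s, Vt = np.linalg.svd(X, full_matrices=False)
U = Uf[:, :D] * s[:D]             # n x D: row Ui* is the latent representation of Xi*
V = Vt[:D, :].T                   # m x D
X_hat = U @ V.T                   # best rank-D approximation of X (the "denoised" X)
```

Truncated SVD gives the best rank-D approximation in Frobenius norm, which is why keeping only the top D singular directions acts as noise removal.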

SLIDE 4

Introduction

Relational Data

Relational data contain both content information and relation (link) structure.
Examples:
- Web pages: page content and hyperlinks
- Research papers: paper content and citations
Representation: two matrices, a content matrix and a link matrix

SLIDE 5

Introduction

Semantics of Relations

There exist at least two types of links with different semantics:
Type I links: if two instances link to, or are linked by, one common instance, they most likely belong to the same class. Example: hyperlinks among web pages.
Type II links: two linked instances most likely belong to the same class. Example: citations among research papers.
[Figures: link patterns among instances V1, V2, V3 illustrating Type I and Type II links]

SLIDE 6

Introduction

Existing Work

Traditional MF methods:
- Can only model one matrix
- Example: LSI

Joint Link-Content MF (LCMF):
- Can model both content and link matrices simultaneously
- Can only model Type I links

[Figure: an example link structure over instances V1–V8, together with the latent representations learned by LCMF]

SLIDE 7

Introduction

Our Contribution

Relation regularized matrix factorization (RRMF):
- Models Type II links
- Can also model Type I links by preprocessing the link structure
- Convergent
- Linear time complexity

SLIDE 8

Relation Regularized Matrix Factorization Model Formulation

Notations

Content matrix: X (n × m); Xi∗ is the content feature vector of instance i.
Adjacency matrix: A (n × n); Aij = 1 if there is a relation between instances i and j, and Aij = 0 otherwise; Aii = 0.
Note: this specification of A is only suitable for Type II links. The strategy to specify A for Type I links is introduced later.

SLIDE 9

Relation Regularized Matrix Factorization Model Formulation

Objective Function

min over U, V of

  (1/2) ‖X − UV^T‖² + (α/2)(‖U‖² + ‖V‖²) + (β/2) tr(U^T L U)

where L = D − A and D is a diagonal matrix with Dii = Σj Aij.

tr(U^T L U) = (1/2) Σ_{i=1}^n Σ_{j=1}^n Aij ‖Ui∗ − Uj∗‖²

The goal of the tr(U^T L U) term is to make the latent representations of two instances as close as possible if there exists a relation between them ⇒ in line with the semantics of Type II links.
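The objective can be written down directly in code. A minimal NumPy sketch with illustrative names, which also verifies the Laplacian identity tr(U^T L U) = (1/2) Σij Aij ‖Ui∗ − Uj∗‖²:

```python
import numpy as np

# Sketch of the RRMF objective; function and variable names are illustrative.
def rrmf_objective(X, U, V, A, alpha, beta):
    Dg = np.diag(A.sum(axis=1))                  # Dii = sum_j Aij
    L = Dg - A                                   # graph Laplacian L = D - A
    fit = 0.5 * np.linalg.norm(X - U @ V.T, "fro") ** 2
    reg = 0.5 * alpha * (np.linalg.norm(U, "fro") ** 2
                         + np.linalg.norm(V, "fro") ** 2)
    rel = 0.5 * beta * np.trace(U.T @ L @ U)     # relation regularizer
    return fit + reg + rel

# Check the identity tr(U^T L U) = (1/2) sum_ij Aij ||Ui* - Uj*||^2
rng = np.random.default_rng(0)
U = rng.standard_normal((4, 2))
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
L = np.diag(A.sum(axis=1)) - A
pairwise = 0.5 * sum(A[i, j] * np.sum((U[i] - U[j]) ** 2)
                     for i in range(4) for j in range(4))
assert np.isclose(np.trace(U.T @ L @ U), pairwise)
```

The regularizer shrinks ‖Ui∗ − Uj∗‖ exactly on the pairs with Aij = 1, which is how the link semantics enter the factorization.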

SLIDE 10

Relation Regularized Matrix Factorization Model Formulation

Illustration

[Figure: the original feature representation and link structure]
[Figure: the goal to be achieved by RRMF]

SLIDE 11

Relation Regularized Matrix Factorization Model Formulation

Adapting RRMF for Type I Links

Basic idea: transform Type I links into Type II links.
Strategy: artificially add a link between two instances if they link to, or are linked by, a common instance.
[Figure: Type I links among instances V1, V2, V3 transformed into Type II links]
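A sketch of this preprocessing step, assuming the raw links are given as a directed 0/1 matrix C with C[i, j] = 1 iff instance i links to instance j (the representation and names are my own choices, not from the slides):

```python
import numpy as np

# Transform Type I links into Type II links: connect i and j whenever they
# link to, or are linked by, a common instance. C is the directed link matrix.
def type1_to_type2(C):
    co_out = C @ C.T            # (i, j) > 0 iff i and j link to a common instance
    co_in = C.T @ C             # (i, j) > 0 iff i and j are linked by a common instance
    A = ((co_out + co_in) > 0).astype(int)
    np.fill_diagonal(A, 0)      # enforce Aii = 0
    return A

# V1 and V2 both link to V3, so they receive an artificial Type II link.
C = np.array([[0, 0, 1],
              [0, 0, 1],
              [0, 0, 0]])
A = type1_to_type2(C)
```

The resulting symmetric A can then be fed to RRMF exactly as in the Type II case.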

SLIDE 12

Relation Regularized Matrix Factorization Learning

Convexity of the Objective Function

Let f = (1/2) ‖X − UV^T‖² + (α/2)(‖U‖² + ‖V‖²) + (β/2) tr(U^T L U).

Theorem: f is convex w.r.t. U (with V fixed).
Theorem: f is convex w.r.t. V (with U fixed).

SLIDE 13

Relation Regularized Matrix Factorization Learning

Alternating Projection Method

Each time, fix one parameter and update the other. Iterate until some termination condition is satisfied:
- Learn U with V fixed
- Learn V with U fixed

SLIDE 14

Relation Regularized Matrix Factorization Learning

Learning U

Optimize one column U∗d at a time, with the other columns fixed, by solving the linear system

F^(d) U∗d = e^(d)

F^(d) = E^(d) + αI + βL
E^(d) = diag(∂²g/∂U1d², ∂²g/∂U2d², …, ∂²g/∂Und²), with ∂²g/∂Uid² = Σ_{j=1}^m Vjd²
e^(d) = (e1^(d), e2^(d), …, en^(d))^T, with ei^(d) = Σ_{j=1}^m Vjd (Xij − Ui∗Vj∗^T + Uid Vjd)

Steepest descent to iteratively update U∗d:

r(t) = e^(d) − F^(d) U∗d(t)
δ(t) = r(t)^T r(t) / (r(t)^T F^(d) r(t))
U∗d(t+1) = U∗d(t) + δ(t) r(t)
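The column update can be sketched in NumPy following the slide's formulas; the helper name, iteration count, and the use of a dense L are my own choices:

```python
import numpy as np

# Update column U*d by steepest descent on F^(d) u = e^(d), as on the slide.
def update_U_column(X, U, V, L, d, alpha, beta, iters=100):
    n = U.shape[0]
    Ed = np.full(n, (V[:, d] ** 2).sum())           # E^(d)_ii = sum_j Vjd^2
    F = np.diag(Ed) + alpha * np.eye(n) + beta * L  # F^(d) = E^(d) + alpha I + beta L
    # e_i^(d) = sum_j Vjd (Xij - Ui* Vj*^T + Uid Vjd):
    # the residual with column d's own contribution added back
    R = X - U @ V.T + np.outer(U[:, d], V[:, d])
    e = R @ V[:, d]
    u = U[:, d].copy()
    for _ in range(iters):
        r = e - F @ u                               # r(t) = e^(d) - F^(d) U*d(t)
        denom = r @ F @ r
        if denom <= 1e-12:                          # converged (F is positive definite)
            break
        u = u + (r @ r) / denom * r                 # delta(t) = r^T r / (r^T F r)
    return u
```

Since F^(d) is positive definite for α > 0 (a nonnegative diagonal plus αI plus the PSD Laplacian term), the exact line-search step is well defined and the iteration converges.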

SLIDE 15

Relation Regularized Matrix Factorization Learning

Learning V

The update of the whole matrix V decomposes naturally into the update of each row Vj∗:

Vj∗ = (Σ_{i=1}^n Xij Ui∗) K^{-1}

K = Σ_{i=1}^n Ui∗^T Ui∗ + αI
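Since K does not depend on j, all rows of V can be updated at once. A NumPy sketch with illustrative names, using a linear solve instead of an explicit inverse:

```python
import numpy as np

# Closed-form V update: Vj* = (sum_i Xij Ui*) K^{-1}, with K = U^T U + alpha I
# (note sum_i Ui*^T Ui* = U^T U). Stacked over rows: V = X^T U K^{-1}.
def update_V(X, U, alpha):
    Dlat = U.shape[1]
    K = U.T @ U + alpha * np.eye(Dlat)    # D x D, symmetric positive definite
    return np.linalg.solve(K, U.T @ X).T  # equals X^T U K^{-1} since K is symmetric
```

This is just the ridge-regression solution for V, which is why the update is exact rather than iterative.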

SLIDE 16

Relation Regularized Matrix Factorization Convergence and Complexity Analysis

Convergence and Complexity

Theorem: The learning algorithm converges.
Theorem: The time complexity of the learning algorithm is O(n).

SLIDE 17

Experiments

Data Sets

WebKB:

Web pages from the CS departments of 4 universities: Cornell, Texas, Washington, Wisconsin.
7-class problem: each page belongs to one of {student, professor, course, project, staff, department, other}.

Cora:

Research papers with their bibliographic citations. Each paper is labeled as one of the subfields of data structure (DS), hardware and architecture (HA), machine learning (ML), or programming language (PL).

Characteristics of the WebKB data set:

  Data set     #classes  #instances  #terms
  Cornell      7         827         4,134
  Texas        7         814         4,029
  Washington   7         1,166       4,165
  Wisconsin    6         1,210       4,189

Characteristics of the Cora data set:

  Data set  #classes  #instances  #terms
  DS        9         751         6,234
  HA        7         400         3,989
  ML        7         1,617       8,329
  PL        9         1,575       7,949

SLIDE 18

Experiments

Convergence Speed

[Figure: (a) objective function value vs. number of iterations T; (b) accuracy vs. number of iterations T]

SLIDE 19

Experiments

Performance on Cora

[Figure: classification accuracy (in %) on DS, HA, ML, and PL for SVM on content, SVM on links, SVM on link-content, directed graph regularization, PLSI+PHITS, PCA, MMMF, link-content MF, link-content supervised MF, and RRMF]

SLIDE 20

Experiments

Performance on WebKB

[Figure: classification accuracy (in %) on Cornell, Texas, Washington, and Wisconsin for SVM on content, SVM on links, SVM on link-content, directed graph regularization, PLSI+PHITS, PCA, MMMF, link-content MF, link-content supervised MF, and RRMF]

SLIDE 21

Experiments

Sensitivity to Parameters

[Figure: (a) accuracy (in %) vs. β; (b) accuracy (in %) vs. latent dimensionality D]

SLIDE 22

Conclusion

Main Contributions

- RRMF seamlessly integrates both relation and content information.
- RRMF achieves state-of-the-art performance.
- RRMF is scalable to large-scale problems.
- RRMF is convergent and very stable.

SLIDE 23

Conclusion

MATLAB code and data can be downloaded from: http://www.cse.ust.hk/~liwujun
