A Three-Way Model for Collective Learning on Multi-Relational Data


SLIDE 1

Poster at Session P2

A Three-Way Model for Collective Learning on Multi-Relational Data

28th International Conference on Machine Learning

Maximilian Nickel (1), Volker Tresp (2), Hans-Peter Kriegel (1)

(1) Ludwig-Maximilians Universität, Munich
(2) Siemens AG, Corporate Technology, Munich

June 30th, 2011

Nickel, Tresp, Kriegel A Three-Way Model for Collective Learning June 30th, 2011 1 / 17

SLIDE 2

Outline

1 Introduction
2 RESCAL
3 Experiments
4 Summary

SLIDE 3

Introduction

Multi-Relational Data

Multi-relational data arises in many important application areas, such as computational biology, social networks, the Semantic Web and the Linked Data cloud (pictured on the original slide).

SLIDE 4

Introduction

Motivation to use Tensors for Relational Learning

Why Tensors?

  • Modelling simplicity: Multiple binary relations can be expressed straightforwardly as a three-way tensor.
  • No structure learning: It is not necessary to have prior information about independent variables, knowledge bases, etc., or to infer it from data.
  • Expected performance: Relational domains are high-dimensional and sparse, a setting in which factorization methods have shown very good results.

Problem: Tensor factorizations such as CANDECOMP/PARAFAC (CP) or Tucker cannot perform collective learning, and DEDICOM imposes constraints that are unreasonable for relational learning.

(For an excellent review of tensors, see Kolda and Bader, 2009.)

SLIDE 5

Introduction

Modelling and Terminology

Modelling binary relations as a tensor: Two modes of the tensor refer to the entities, one mode to the relations. An entry of the tensor is 1 when the corresponding relation between two entities exists and 0 otherwise.

We use the RDF formalism to model relations as (subject, predicate, object) triples.
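
The tensor construction described above can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the function name `triples_to_tensor`, the axis order (relation first) and the toy triples, including the direction of each edge, are hypothetical and not taken from the slides.

```python
import numpy as np

def triples_to_tensor(triples):
    """Build a binary tensor X of shape (n_relations, n_entities, n_entities).

    X[k, i, j] = 1 if the triple (entity i, predicate k, entity j) is present,
    and 0 otherwise.
    """
    # Collect every name that occurs as a subject or an object.
    entities = sorted({e for s, _, o in triples for e in (s, o)})
    predicates = sorted({p for _, p, _ in triples})
    ent_idx = {e: i for i, e in enumerate(entities)}
    pred_idx = {p: k for k, p in enumerate(predicates)}

    X = np.zeros((len(predicates), len(entities), len(entities)))
    for s, p, o in triples:
        X[pred_idx[p], ent_idx[s], ent_idx[o]] = 1.0
    return X, entities, predicates

# Hypothetical toy triples, for illustration only.
triples = [
    ("Al", "vicePresidentOf", "Bill"),
    ("Bill", "party", "PartyX"),
    ("Lyndon", "vicePresidentOf", "John"),
    ("John", "party", "PartyX"),
]
X, entities, predicates = triples_to_tensor(triples)
```

Each entity gets a single index that is used on both entity modes, whether it occurs as a subject or as an object.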

SLIDE 7

RESCAL

Tensor Factorization

RESCAL takes the inherent structure of dyadic relational data into account by employing the tensor factorization

  $X_k \approx A R_k A^T$

where $X_k$ is the slice of the tensor for the $k$-th relation.

  • $A$ is an $n \times r$ matrix that represents the global entity-latent-component space.
  • $R_k$ is an asymmetric $r \times r$ matrix that specifies the interaction of the latent components for the $k$-th predicate.
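
To make the shapes in this factorization concrete, here is a minimal NumPy sketch. The sizes and the random factors are arbitrary assumptions chosen for illustration, and storing the $R_k$ as one stacked array `R` is a convenience, not something the slides prescribe.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, m = 5, 2, 3   # entities, latent components, relations (toy sizes)

A = rng.standard_normal((n, r))      # one row per entity, shared by all relations
R = rng.standard_normal((m, r, r))   # one r x r interaction matrix per relation

# Reconstruct every slice at once: X_hat[k] = A @ R[k] @ A.T
X_hat = np.einsum("ir,krs,js->kij", A, R, A)
```

Because the same `A` multiplies each `R[k]` from both sides, every relation is reconstructed from the same entity representation.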

SLIDE 9

RESCAL

Solving canonical relational learning tasks

  • Link Prediction: To predict the existence of a relation between two entities, it is sufficient to look at the rank-reduced reconstruction $A R_k A^T$ of the appropriate slice.
  • Collective Classification: Can be cast as a link prediction problem by including the classes as entities and adding a classOf relation. Alternatively, standard classification algorithms can be applied to the entities’ latent-component representation $A$.
  • Link-based Clustering: Since the entities’ latent-component representation is computed considering all relations, link-based clustering can be done by clustering the entities in the latent-component space $A$.
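
The link prediction step can be sketched as follows, assuming factors `A` and `R_k` are already available. The helper names `score_links` and `top_candidates` and the hand-picked toy factors are hypothetical; the slides do not say how reconstructed scores are thresholded or ranked.

```python
import numpy as np

def score_links(A, R_k):
    """Return the reconstructed slice A @ R_k @ A.T; larger entries suggest likelier links."""
    return A @ R_k @ A.T

def top_candidates(A, R_k, X_k, top=3):
    """Rank entity pairs that are absent from X_k by their reconstructed score."""
    scores = score_links(A, R_k)
    candidates = [(scores[i, j], i, j)
                  for i in range(X_k.shape[0])
                  for j in range(X_k.shape[1])
                  if X_k[i, j] == 0]
    candidates.sort(reverse=True)
    return [(i, j, s) for s, i, j in candidates[:top]]

# Toy factors chosen by hand: entities 0 and 1 share the same latent row.
A = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
R_k = np.array([[0.0, 1.0],
                [0.0, 0.0]])
X_k = np.zeros((3, 3))
X_k[0, 2] = 1.0            # only the link (0, 2) is observed

best = top_candidates(A, R_k, X_k, top=1)
```

In this toy setting the unobserved pair (1, 2) is ranked first, because entity 1 has the same latent-component representation as entity 0, whose link to entity 2 is observed.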

SLIDE 10

RESCAL

Computing the factorization

To compute the factorization, we solve the optimization problem

  $\min_{A, R_k} \; \mathrm{loss}(A, R_k) + \mathrm{reg}(A, R_k)$

where loss is the loss function

  $\mathrm{loss}(A, R_k) = \frac{1}{2} \sum_k \lVert X_k - A R_k A^T \rVert_F^2$

and reg is the regularization term

  $\mathrm{reg}(A, R_k) = \frac{1}{2} \lambda \left( \lVert A \rVert_F^2 + \sum_k \lVert R_k \rVert_F^2 \right)$

  • Efficient alternating least-squares algorithm based on ASALSAN (Bader et al., 2007)
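
As a rough sketch of what one half of an alternating scheme can look like, the code below evaluates the objective and solves for every $R_k$ with $A$ held fixed, which reduces to a ridge regression. This is derived directly from the objective above and is not claimed to be the authors' ASALSAN-based procedure; the names `objective` and `update_R` are hypothetical, and the Kronecker product is only practical for toy sizes.

```python
import numpy as np

def objective(X, A, R, lam):
    """loss + reg as defined above; X and R are stacks of slices (axis 0 = relation k)."""
    loss = 0.5 * sum(np.linalg.norm(X[k] - A @ R[k] @ A.T, "fro") ** 2
                     for k in range(X.shape[0]))
    reg = 0.5 * lam * (np.linalg.norm(A, "fro") ** 2
                       + sum(np.linalg.norm(R[k], "fro") ** 2
                             for k in range(R.shape[0])))
    return loss + reg

def update_R(X, A, lam):
    """Minimise the objective over each R_k with A held fixed.

    Uses vec(A R_k A^T) = (A kron A) vec(R_k) with row-major vec, so each R_k
    solves the normal equations (Z^T Z + lam I) vec(R_k) = Z^T vec(X_k).
    """
    r = A.shape[1]
    Z = np.kron(A, A)                       # shape (n*n, r*r); toy sizes only
    G = Z.T @ Z + lam * np.eye(r * r)
    return np.stack([np.linalg.solve(G, Z.T @ X[k].ravel()).reshape(r, r)
                     for k in range(X.shape[0])])

# Sanity check on synthetic data that factorizes exactly.
rng = np.random.default_rng(0)
A_true = rng.standard_normal((6, 2))
R_true = rng.standard_normal((3, 2, 2))
X = np.einsum("ir,krs,js->kij", A_true, R_true, A_true)
R_est = update_R(X, A_true, 0.0)
```

With exact data, the true `A` and no regularization, the update recovers `R_true` up to numerical error. A full alternating algorithm would additionally need an update for `A`, which is not shown here.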

SLIDE 11

RESCAL

Collective Learning Example

Predict party membership of US (vice) presidents

[Figure: relational graph over the entities Bill, Al, John, Lyndon and Party X, with edges labelled party and vicePresidentOf]

It is helpful to consider the element-wise version of the loss function $f$:

  $f(A, R_k) = \frac{1}{2} \sum_{i,j,k} \left( X_{ijk} - a_i^T R_k a_j \right)^2$
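
A small numerical check can confirm that the element-wise form is the same quantity as the slice-wise Frobenius-norm loss. The function names and the random toy data below are assumptions made for illustration only.

```python
import numpy as np

def loss_frobenius(X, A, R):
    """Slice-wise form: 1/2 * sum_k ||X_k - A R_k A^T||_F^2."""
    return 0.5 * sum(np.linalg.norm(X[k] - A @ R[k] @ A.T, "fro") ** 2
                     for k in range(X.shape[0]))

def loss_elementwise(X, A, R):
    """Element-wise form: 1/2 * sum_{i,j,k} (X_ijk - a_i^T R_k a_j)^2."""
    total = 0.0
    m, n, _ = X.shape
    for k in range(m):
        for i in range(n):
            for j in range(n):
                # The same matrix A supplies the subject row a_i and the object row a_j.
                total += (X[k, i, j] - A[i] @ R[k] @ A[j]) ** 2
    return 0.5 * total

rng = np.random.default_rng(1)
X = (rng.random((2, 4, 4)) < 0.3).astype(float)   # random binary toy tensor
A = rng.standard_normal((4, 2))
R = rng.standard_normal((2, 2, 2))
```

The inner loop makes explicit that entity `i` contributes the same row `A[i]` whether it appears in the subject or the object position.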

SLIDE 17

RESCAL

Collective Learning Example

Predict party membership of US (vice) presidents

[Figure: relational graph over Bill, Al, John, Lyndon and Party X, with Bill and John each drawn as separate subject and object nodes]

It is helpful to consider the element-wise version of the loss function $f$, written here with a separate vector $b_j$ for the object:

  $f(A, R_k) = \frac{1}{2} \sum_{i,j,k} \left( X_{ijk} - a_i^T R_k b_j \right)^2$

SLIDE 18

RESCAL

Collective Learning with RESCAL

  • Collective learning is performed via the entities’ latent-component representation.
  • Important aspect of the model: Entities have a unique latent-component representation, regardless of their occurrence as subjects or objects.

SLIDE 20

Experiments

Predicting the party membership of US (vice) presidents

Task: Predict party membership of US (vice) presidents. No information is included in the data other than the party membership and who is (vice) president of whom.

Method     AUC
Random     0.16
CP         0.44
DEDICOM    0.64
SUNS       0.48
SUNS+AG    0.74
RESCAL     0.78

Figure: Prediction of party membership (bar chart of AUC per method)

SLIDE 21

Experiments

Comparison to state-of-the-art approaches

Task: Perform link prediction on the IRM datasets Kinships, UMLS and Nations.

Comparison to MRC (Kok & Domingos, 2007), IRM (Kemp et al., 2007) and BCTF (Sutskever et al., 2009), as well as CP and DEDICOM.

AUC per method and dataset (values from the bar charts; no BCTF bar is shown for Nations):

Method     Kinships   UMLS   Nations
CP         0.94       0.95   0.83
DEDICOM    0.69       0.95   0.81
BCTF       0.90       0.98   -
IRM        0.66       0.70   0.75
MRC        0.85       0.98   0.75
RESCAL     0.95       0.98   0.84

SLIDE 22

Experiments

Runtime and Implementation

The RESCAL-ALS algorithm features very fast training times.

                                 Total runtime in seconds
Dataset    Entities  Relations   Rank 10   Rank 20   Rank 40
Kinships   104       26          1.1       3.7       51.2
Nations    125       57          1.7       5.3       54.4
UMLS       135       49          2.6       4.9       72.3
Cora       2497      7           364       348       680

Table: Average runtime to compute a rank-r factorization in RESCAL

The implementation uses only standard matrix operations.

SLIDE 24

Summary

  • RESCAL is a tensor-based relational learning approach capable of collective learning.
  • The collective learning mechanism works through information propagation via the entities’ latent-component representations.
  • Good performance compared to current state-of-the-art relational learning approaches.
  • Fast training times and a simple implementation.
  • Code available at http://www.cip.ifi.lmu.de/~nickel

Thank you!
