Inductive Relation Prediction by Subgraph Reasoning Komal K. Teru, - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Inductive Relation Prediction by Subgraph Reasoning

Komal K. Teru, Etienne Denis, William L. Hamilton

McGill University and Mila – AI Institute of Quebec


slide-2
SLIDE 2

Relation prediction in Knowledge graphs

▪ Automatically expand and complete existing knowledge bases.
▪ Requires relational reasoning to make inferences.
▪ Applications in e-commerce, medicine, materials science…

slide-3
SLIDE 3

Transductive relation prediction

[Figure: a training graph and a test graph.]

slide-4
SLIDE 4

Embeddings-based methods

[Figure: example knowledge graph. Entities: LeBron, Akron School, Marie, Savannah, Philant., Jenifer, L.A, Lakers, A. Davis, Britney, Holly. Relations: founded_by, mother_of, spouse_of, occupation, part_of, teammate_of, lives_in, located_in, fav_nba_team. A "?" marks the edge to be predicted.]

▪ Encode each node to a low-dimensional embedding

slide-5
SLIDE 5

Embeddings-based methods

[Figure: example knowledge graph with a "?" marking the edge to be predicted.]

▪ Encode each node to a low-dimensional embedding
▪ Use the derived embeddings to make predictions (TransE, RotatE, etc.)

slide-6
SLIDE 6

Embeddings-based methods

[Figure: example knowledge graph with a new node, Kuzma, attached by a teammate_of edge and annotated "No embedding!". A "?" marks the edge to be predicted.]

▪ Encode each node to a low-dimensional embedding
▪ Use the derived embeddings to make predictions (TransE, RotatE, etc.)
▪ Can’t make predictions on new nodes.
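The limitation above can be sketched with a toy TransE-style scorer. This is a minimal illustration, not the code of any particular library: the embedding values, the `transe_score` and `can_score` helpers, and the two-dimensional vectors are all made up for the example. TransE scores a triple by how close head + relation lands to the tail.

```python
import math

# Hypothetical toy embedding tables: one vector per entity seen in training.
# The table grows with the number of entities, i.e. O(|V|) parameters.
entity_emb = {
    "LeBron": [0.1, 0.3],
    "A. Davis": [0.2, 0.1],
    "Lakers": [0.6, 0.5],
}
relation_emb = {
    "part_of": [0.5, 0.2],
    "teammate_of": [0.1, -0.2],
}

def transe_score(head, relation, tail):
    """TransE-style score: negative Euclidean distance between h + r and t."""
    h, r, t = entity_emb[head], relation_emb[relation], entity_emb[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def can_score(head, relation, tail):
    """A triple is scorable only if every entity and relation has an embedding."""
    return head in entity_emb and tail in entity_emb and relation in relation_emb

print(transe_score("LeBron", "part_of", "Lakers"))   # close to 0: a good fit
print(can_score("Kuzma", "teammate_of", "LeBron"))   # False: Kuzma was never embedded
```

Scoring a triple that involves Kuzma would raise a `KeyError`, which is the "No embedding!" situation: the only remedy in this setup is to add a row for the new entity and re-train.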

slide-7
SLIDE 7

Limitations of transductive relation prediction

▪ Problematic for production systems
  • Need to re-train to deal with new nodes (e.g., entities, products)
  • Predictions can become stale.

▪ Too many parameters
  • Most transductive approaches have O(|V|) space complexity.

▪ Biases model development
  • Focus on “embedding”-based methodologies.
  • Static and unrepresentative benchmarks

slide-8
SLIDE 8

Inductive learning: evolving data

[Figure: training graph and test graph. In the test graph a new node, Kuzma, appears, and a "?" marks a candidate part_of edge to be predicted.]

slide-9
SLIDE 9

Inductive learning: new graphs

[Figure: a training graph and a separate test graph.]

slide-10
SLIDE 10

GraIL: Inductive learning using GNNs

▪ A novel approach to learn entity-independent relational semantics (rules)
▪ State-of-the-art performance on inductive benchmarks
▪ Extremely parameter efficient

slide-11
SLIDE 11

GraIL: Inductive learning using GNNs

▪ Idea 1: Apply graph neural networks (GNNs) on the subgraph surrounding the candidate edge.
▪ Idea 2: Avoid explicit rule induction.
▪ Idea 3: Ensure the model is expressive enough to capture logical rules.

slide-12
SLIDE 12

GraIL: Relation prediction via subgraph reasoning

  1. Extract the subgraph around the candidate edge
  2. Assign structural labels to nodes
  3. Run a GNN on the extracted subgraph
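The first two steps can be sketched as follows. The deck does not spell out how the subgraph is chosen or what the structural labels are, so this sketch makes two assumptions: the subgraph is the set of nodes within k hops of both target nodes, and each node is labeled with its pair of shortest-path distances to the two targets. The function names and the toy triples are hypothetical.

```python
from collections import deque

def k_hop_distances(adj, source, k, blocked=None):
    """BFS distances from `source`, up to k hops, never visiting `blocked`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand beyond k hops
        for nbr in adj.get(node, ()):
            if nbr == blocked or nbr in dist:
                continue
            dist[nbr] = dist[node] + 1
            queue.append(nbr)
    return dist

def extract_labeled_subgraph(triples, u, v, k=2):
    """Return (labels, edges) for the subgraph around the candidate edge (u, ?, v)."""
    adj = {}
    for h, _, t in triples:  # treat edges as undirected for neighborhood search
        adj.setdefault(h, set()).add(t)
        adj.setdefault(t, set()).add(h)
    dist_u = k_hop_distances(adj, u, k, blocked=v)
    dist_v = k_hop_distances(adj, v, k, blocked=u)
    # Keep nodes reachable from both targets; label each with (d to u, d to v).
    labels = {n: (dist_u[n], dist_v[n]) for n in dist_u.keys() & dist_v.keys()}
    labels[u], labels[v] = (0, 1), (1, 0)  # fixed labels identify the targets
    edges = [(h, r, t) for h, r, t in triples if h in labels and t in labels]
    return labels, edges

# Made-up toy triples, loosely in the spirit of the example graph.
toy_triples = [
    ("LeBron", "part_of", "Lakers"),
    ("A. Davis", "part_of", "Lakers"),
    ("LeBron", "lives_in", "L.A"),
    ("Lakers", "located_in", "L.A"),
    ("Savannah", "spouse_of", "LeBron"),
]
labels, edges = extract_labeled_subgraph(toy_triples, "LeBron", "A. Davis", k=2)
print(labels)
```

Because the labels depend only on graph structure and not on node identity, the same labeling applies unchanged to nodes never seen during training. In the toy run, Savannah is dropped because she is not reachable from A. Davis without passing through LeBron.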

slide-13
SLIDE 13

GNN architecture


Neural message-passing approach

slide-14
SLIDE 14

GNN architecture

Neural message-passing approach

▪ Separately aggregate across different types of relations
▪ Learn a relation-specific transformation matrix
▪ Use attention to weigh information coming from different neighbors

slide-15
SLIDE 15

GNN architecture

Neural message-passing approach

The node update combines two terms:
▪ Information aggregated from the neighborhood
▪ Information from the node's embedding at the previous layer
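A single message-passing layer with these ingredients can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the attention gate here is a sigmoid over a hypothetical learned vector `attn_vec` applied to the concatenated source and target embeddings, and all weights are passed in by hand rather than learned.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def message_passing_layer(h, edges, W_rel, W_self, attn_vec):
    """One layer. h: node -> vector; edges: (source, relation, target) triples."""
    dim = len(W_self)
    agg = {node: [0.0] * dim for node in h}
    for s, r, t in edges:
        # Relation-specific transformation of the neighbor's embedding.
        msg = matvec(W_rel[r], h[s])
        # Simplified attention (assumption): a gate in (0, 1) computed from
        # the concatenated source and target embeddings.
        alpha = sigmoid(sum(a * x for a, x in zip(attn_vec, h[s] + h[t])))
        agg[t] = [a + alpha * m for a, m in zip(agg[t], msg)]
    new_h = {}
    for node, vec in h.items():
        self_term = matvec(W_self, vec)  # node's own previous-layer embedding
        # ReLU over (neighborhood aggregate + self term).
        new_h[node] = [max(0.0, a + s) for a, s in zip(agg[node], self_term)]
    return new_h

identity = [[1.0, 0.0], [0.0, 1.0]]
h0 = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
h1 = message_passing_layer(
    h0,
    edges=[("a", "r", "b")],
    W_rel={"r": identity},
    W_self=identity,
    attn_vec=[0.0, 0.0, 0.0, 0.0],  # zero vector gives a gate of exactly 0.5
)
print(h1)
```

With identity weights and a gate of 0.5, node "b" ends up with half of "a"'s embedding added to its own, while "a", which has no incoming edges, keeps its previous embedding.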

slide-16
SLIDE 16

GraIL can learn logical rules


Theorem (Informally): GraIL can learn any logical rule of the form:

These “path-based” rules are the foundation of most state-of-the-art rule induction systems.

Example of such a rule:
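The rule form and the example were images on the slide and are not present in this transcript. As a hedged reconstruction based on the deck's own description ("path-based", "chain-like" rules), such rules are usually written as a chain of relations leading from the head entity X to the tail entity Y. The symbols r_t, r_i and Z_i are notation assumed here, and the second line is an illustrative rule built from relations in the example graph, not necessarily the one shown on the slide:

```latex
\[
r_t(X, Y) \leftarrow r_1(X, Z_1) \land r_2(Z_1, Z_2) \land \dots \land r_k(Z_{k-1}, Y)
\]
\[
\mathrm{lives\_in}(X, Y) \leftarrow \mathrm{spouse\_of}(X, Z) \land \mathrm{lives\_in}(Z, Y)
\]
```

Rules of this kind mention only relations and variables, never specific entities, which is why they carry over to nodes not seen during training.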

slide-17
SLIDE 17

State-of-the-art inductive performance

  • Constructed inductive versions of three standard benchmarks.
  • Sampled mutually exclusive subgraphs of varying sizes.
  • Tested four inductive datasets for each benchmark.

Table: AUC-PR results on inductive relation prediction

slide-18
SLIDE 18

State-of-the-art inductive performance


  • Compared against state-of-the-art neural rule induction methods
  • Also compared against the best statistical induction approach.

Table: AUC-PR results on inductive relation prediction

slide-19
SLIDE 19

State-of-the-art inductive performance

  • Key finding: GraIL outperforms all previous approaches on all datasets (analogous results for hits@k)

Table: AUC-PR results on inductive relation prediction
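For readers unfamiliar with the two metrics named above, here is a minimal sketch of how they can be computed from model scores. It assumes average precision as a common stand-in for the area under the precision-recall curve, and it ignores tie handling. The function names and sample numbers are illustrative and are not results from the deck.

```python
def average_precision(labels, scores):
    """Average precision over a ranked list; labels are 1 (true) or 0 (false)."""
    positives = sum(labels)
    if positives == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / positives

def hits_at_k(positive_score, negative_scores, k):
    """True if the positive triple ranks in the top k among sampled negatives."""
    rank = 1 + sum(1 for s in negative_scores if s > positive_score)
    return rank <= k

ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
print(ap)
print(hits_at_k(0.7, [0.9, 0.8, 0.1], k=3))
```

In the toy call, the positives sit at ranks 1 and 3, so the average precision is (1/1 + 2/3) / 2.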

slide-20
SLIDE 20

Added benefits

▪ GraIL is extremely parameter efficient compared to existing neural rule-induction methods.
▪ GraIL can naturally leverage external node attributes/embeddings.

slide-21
SLIDE 21

Ensembling in the transductive setting


Table: Ensemble AUC-PR results on WN18RR

Each entry is a pair-wise ensemble of two methods

slide-22
SLIDE 22

Ensembling in the transductive setting

Table: Ensemble AUC-PR results on WN18RR

▪ Each entry is a pair-wise ensemble of two methods
▪ GraIL has the lowest performance on its own

slide-23
SLIDE 23

Ensembling in the transductive setting

Table: Ensemble AUC-PR results on WN18RR

▪ Each entry is a pair-wise ensemble of two methods
▪ GraIL has the lowest performance on its own…
▪ …but ensembling with GraIL leads to the best performance
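The deck does not say how two methods are combined, so the following is only one plausible sketch of a pair-wise ensemble: a logistic combination of the two methods' scores for the same triple. The weights `w_a`, `w_b` and `bias` are hypothetical parameters that would normally be fit on validation data.

```python
import math

def ensemble_score(score_a, score_b, w_a=1.0, w_b=1.0, bias=0.0):
    """Combine two methods' scores for one triple into a probability-like value."""
    z = w_a * score_a + w_b * score_b + bias
    return 1.0 / (1.0 + math.exp(-z))

# Two methods agreeing on a triple give a higher combined score than disagreeing.
print(ensemble_score(2.0, 1.0))
print(ensemble_score(2.0, -1.0))
```

A combination like this can help when the two methods make different kinds of errors, which is consistent with a method that is weak on its own still improving an ensemble.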

slide-24
SLIDE 24

Architecture details are important!

Table: Ablation study AUC-PR results

▪ Naïve subgraph extraction causes severe overfitting.
▪ Our node labelling and attention schemes are crucial for the theory and for strong performance.

slide-25
SLIDE 25

Future directions

  • Extracting interpretable rules from GraIL.
  • Expanding the class of first-order logical rules that can be represented beyond the chain-like rules focused on in this work.
  • Extending the generalization capabilities to new relations added to the knowledge graphs.

slide-26
SLIDE 26

Thank you!

Komal K. Teru: komal.teru@mail.mcgill.ca
Etienne Denis: etienne.denis@mail.mcgill.ca
William L. Hamilton: wlh@cs.mcgill.ca

Paper: https://arxiv.org/abs/1911.06962
Code and data: https://github.com/kkteru/grail