SLIDE 1

Srijan Kumar, Georgia Tech, CSE6240 Spring 2020: Web Search and Text Mining

CSE 6240: Web Search and Text Mining. Spring 2020

Graph and Knowledge Graph Representation Learning

  • Prof. Srijan Kumar

http://cc.gatech.edu/~srijan

SLIDE 2

Today’s Lecture

  • Embedding entire graphs
  • Introduction to Knowledge Graphs
  • Embeddings in Knowledge Graphs

– TransE
– TransR

SLIDE 3

Embedding Entire Graphs

  • Goal: How to embed an entire graph $G$?
  • Tasks:

– Classifying toxic vs. non-toxic molecules
– Identifying anomalous graphs

[Figure: a graph $G$ mapped to its embedding $z_G$]

SLIDE 4

Approach #1

Simple idea:

  • Run a standard graph embedding technique on the (sub)graph $G$
  • Then just sum (or average) the node embeddings in the (sub)graph $G$:

$$z_G = \sum_{v \in G} z_v$$

  • Used by Duvenaud et al., 2015 to classify molecules based on their graph structure

– Convolutional Networks on Graphs for Learning Molecular Fingerprints. NeurIPS 2015
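
As a minimal sketch of this pooling step, assuming the node embeddings have already been computed by some other method: the function name `pool_graph_embedding` and the toy vectors below are illustrative and do not come from the lecture.

```python
def pool_graph_embedding(node_embeddings, mode="sum"):
    """Combine per-node vectors into one graph vector by summing or averaging."""
    vectors = list(node_embeddings.values())
    dim = len(vectors[0])
    total = [sum(vec[k] for vec in vectors) for k in range(dim)]
    if mode == "sum":
        return total
    if mode == "mean":
        return [x / len(vectors) for x in total]
    raise ValueError("mode must be 'sum' or 'mean'")

# Hypothetical 2-dimensional embeddings for a three-node (sub)graph.
node_embeddings = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [2.0, 1.0]}
z_sum = pool_graph_embedding(node_embeddings, "sum")
z_mean = pool_graph_embedding(node_embeddings, "mean")
```

Summation keeps information about graph size, while averaging makes graphs of different sizes more directly comparable; which one works better is task dependent.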
SLIDE 5

Approach #2

  • Idea: Introduce a “virtual node” to represent the (sub)graph and run a standard graph embedding technique
  • Proposed by Li et al., 2016 as a general technique for subgraph embedding

– Gated Graph Sequence Neural Networks. ICLR 2016
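
One possible way to build such a virtual node on an adjacency-list graph is sketched below. The helper `add_virtual_node`, the node name `__virtual__`, and the choice of undirected edges to every member node are assumptions made for illustration, not details taken from Li et al.

```python
def add_virtual_node(adj, members, virtual="__virtual__"):
    """Return a copy of the graph with an extra node linked to every member node."""
    augmented = {node: list(nbrs) for node, nbrs in adj.items()}
    augmented[virtual] = list(members)
    for node in members:
        augmented[node].append(virtual)
    return augmented

# Path graph 0-1-2-3; the subgraph of interest is {0, 1, 2}.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
augmented = add_virtual_node(adj, members=[0, 1, 2])
```

After running any node embedding method on `augmented`, the vector learned for `__virtual__` would serve as the embedding of the subgraph.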

SLIDE 6

Approach #3

  • Represent a graph as a distribution/set of walks on that graph
  • Anonymous Walk Embeddings:

– States in an anonymous walk correspond to the index of the first time we visited the node in a random walk
– Anonymous Walk Embeddings, ICML 2018
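
The anonymization rule can be sketched in a few lines. The function name `anonymize` and the example walks are illustrative; the point is that two walks over different nodes can share the same anonymous form.

```python
def anonymize(walk):
    """Replace each node by the 1-based index of its first appearance in the walk."""
    first_seen = {}
    labels = []
    for node in walk:
        if node not in first_seen:
            first_seen[node] = len(first_seen) + 1
        labels.append(first_seen[node])
    return tuple(labels)

# Two walks over different nodes with the same visiting pattern.
anon_a = anonymize(["A", "B", "C", "B", "C"])
anon_b = anonymize(["C", "D", "B", "D", "B"])
```

Both calls return `(1, 2, 3, 2, 3)`, so the representation is agnostic to node identities.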

SLIDE 7

Number of Walks Grows

The number of anonymous walks grows exponentially:

– There are 5 anonymous walks $a_i$ of length 3:

$a_1 = 111$, $a_2 = 112$, $a_3 = 121$, $a_4 = 122$, $a_5 = 123$
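
To see the growth concretely, the sketch below enumerates anonymous walks under the convention used on this slide, where "length 3" means three visited positions and repeated consecutive labels such as 111 are allowed. The function name `anonymous_walks` is illustrative.

```python
def anonymous_walks(length):
    """Enumerate all anonymous walks with `length` positions.

    Every walk starts at label 1, and each next label is either one that
    was already used or the next unused integer.
    """
    walks = [(1,)]
    for _ in range(length - 1):
        walks = [w + (k,) for w in walks for k in range(1, max(w) + 2)]
    return walks

length_three = anonymous_walks(3)
counts = [len(anonymous_walks(n)) for n in range(1, 8)]
```

Under this convention `counts` is `[1, 2, 5, 15, 52, 203, 877]`, which illustrates how quickly enumeration becomes impractical as walks get longer.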

SLIDE 8

Idea #1: Anonymous Walks

  • Enumerate all possible anonymous walks $a_i$ of $l$ steps and record their counts
  • Represent the graph as a probability distribution over these walks
  • For example:

– Set $l = 3$
– Then we can represent the graph as a 5-dim vector, since there are 5 anonymous walks $a_i$ of length 3: 111, 112, 121, 122, 123
– $Z_G[i]$ = probability of anonymous walk $a_i$ in $G$
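
A small worked example of this 5-dim vector is sketched below for a triangle graph. It assumes a uniformly chosen start node, uniform steps to neighbors, no isolated nodes, and "length 3" meaning three visited nodes; the function names are illustrative. The distribution is computed exactly by enumerating all walks rather than by sampling.

```python
def anonymize(walk):
    first_seen = {}
    return tuple(first_seen.setdefault(node, len(first_seen) + 1) for node in walk)

def anonymous_walk_distribution(adj, length=3):
    """Exact probability of each anonymous walk with `length` visited nodes."""
    probs = {}

    def extend(walk, p):
        if len(walk) == length:
            key = anonymize(walk)
            probs[key] = probs.get(key, 0.0) + p
            return
        neighbors = adj[walk[-1]]
        for nxt in neighbors:
            extend(walk + [nxt], p / len(neighbors))

    for start in adj:
        extend([start], 1.0 / len(adj))
    return probs

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
patterns = [(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2), (1, 2, 3)]
dist = anonymous_walk_distribution(triangle)
Z_G = [dist.get(p, 0.0) for p in patterns]
```

On a graph without self-loops the patterns 111, 112, and 122 get probability zero, so for the triangle all mass is split between 121 and 123.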

SLIDE 9

Idea #2: Learn Walk Embeddings

Learn embedding $z_i$ of every anonymous walk $a_i$

  • The embedding of a graph $G$ is then the sum/avg/concatenation of the walk embeddings $z_i$

SLIDE 10

Idea #2: Learn Walk Embeddings

How to embed walks?

  • Idea: Embed walks such that the next walk starting from the same node can be predicted

– Set walk embedding $z_i$ such that we maximize $P(w_t^u \mid w_{t-\Delta}^u, \ldots, w_{t-1}^u) = f(z)$

  • Where $w_t^u$ is the $t$-th random walk starting at node $u$

– Similar to the word2vec idea

SLIDE 11

Idea #2: Learn Walk Embeddings

  • Run $T$ different random walks from $u$, each of length $l$: $N_R(u) = \{a_1^u, a_2^u, \ldots, a_T^u\}$

– Let $a_i$ be the anonymous version of walk $w_i$

  • Learn to predict walks that co-occur in a $\Delta$-size window
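
The data preparation described on this slide might look roughly as follows. This is a sketch under stated assumptions: `length` counts visited nodes, the graph has no isolated nodes, and the helper names (`sample_anonymous_walks`, `context_windows`) and the toy graph are invented for illustration.

```python
import random

def anonymize(walk):
    first_seen = {}
    return tuple(first_seen.setdefault(node, len(first_seen) + 1) for node in walk)

def sample_anonymous_walks(adj, start, num_walks, length, rng):
    """Run `num_walks` uniform random walks from `start` and anonymize each one."""
    walks = []
    for _ in range(num_walks):
        walk = [start]
        while len(walk) < length:
            walk.append(rng.choice(adj[walk[-1]]))
        walks.append(anonymize(walk))
    return walks

def context_windows(walks, delta):
    """Pair each walk with the `delta` walks sampled just before it."""
    return [(walks[t - delta:t], walks[t]) for t in range(delta, len(walks))]

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
rng = random.Random(0)
walks = sample_anonymous_walks(adj, start=0, num_walks=6, length=4, rng=rng)
examples = context_windows(walks, delta=2)
```

Each element of `examples` is a (context, target) pair, which is the same shape of training data that word2vec-style models consume.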

SLIDE 12

Idea #2: Learn Walk Embeddings

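  • Estimate embedding $z_i$ of anonymous walk $a_i$ of $w_i$:

$$\max \frac{1}{T} \sum_{t=\Delta}^{T} \log P(a_t \mid a_{t-\Delta}, \ldots, a_{t-1})$$

where $\Delta$ = context window size

  • $P(a_t \mid a_{t-\Delta}, \ldots, a_{t-1}) = \frac{\exp(f(a_t))}{\sum_{i=1}^{\eta} \exp(f(a_i))}$, i.e., softmax over all $\eta$ anonymous walks
  • $f(a_t) = b + U \cdot \left( \frac{1}{\Delta} \sum_{i=1}^{\Delta} z_i \right)$

– where $b \in \mathbb{R}$, $U \in \mathbb{R}^D$, $z_i$ is the embedding of $a_i$ (anonymized version of walk $w_i$)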
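
The softmax prediction can be sketched as below. This assumes one bias $b$ and one weight vector $U$ per candidate anonymous walk (stored here as the rows of `U` and the entries of `b`), which is one reading of the slide; all numeric values are made up for illustration.

```python
import math

def predict_walk_probs(context, U, b):
    """Softmax over candidate walks given the embeddings of the context walks."""
    dim = len(context[0])
    avg = [sum(z[k] for z in context) / len(context) for k in range(dim)]
    scores = [b_j + sum(u * a for u, a in zip(U_j, avg)) for U_j, b_j in zip(U, b)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical embeddings z_i for the 5 anonymous walks of length 3 (D = 2).
Z = [[0.1, 0.3], [0.0, -0.2], [0.5, 0.5], [-0.4, 0.2], [0.3, -0.1]]
U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
b = [0.0] * 5

context = [Z[2], Z[4]]               # embeddings of the Delta = 2 preceding walks
probs = predict_walk_probs(context, U, b)
log_likelihood = math.log(probs[2])  # contribution if the observed walk is a_3
```

Training would adjust `Z`, `U`, and `b` by gradient ascent on the average of such log-likelihood terms; that step is omitted here.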

SLIDE 13

Summary of Graph Embeddings

We discussed 3 approaches to graph embedding:

  • Approach 1: Embed nodes and sum/average them
  • Approach 2: Create a super-node that spans the (sub)graph and then embed that node
  • Approach 3: Anonymous Walk Embeddings

– Idea 1: Represent the graph via the distribution over all the anonymous walks
– Idea 2: Embed anonymous walks

SLIDE 14

Today’s Lecture

  • Embedding entire graphs
  • Introduction to Knowledge Graphs
  • Embeddings in Knowledge Graphs

– TransE
– TransR

SLIDE 15

Knowledge Graphs

  • Knowledge in graph form

– Capture entities, types, and relationships

  • Nodes are entities
  • Nodes are labeled with their types
  • Edges between two nodes capture relationships between entities

SLIDE 16

Example: Bibliographic networks

  • Node types: paper, title, author, conference, year
  • Relation types: pubWhere, pubYear, hasTitle, hasAuthor, cite
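
A bibliographic knowledge graph of this kind can be held as a list of (head, relation, tail) triples plus a type label per node. The sketch below uses the node and relation types from the slide; the entity names (`paper_1`, `author_x`, and so on) are hypothetical placeholders.

```python
from collections import defaultdict

# Each node is labeled with its type.
node_types = {
    "paper_1": "paper",
    "paper_2": "paper",
    "Title A": "title",
    "author_x": "author",
    "conf_y": "conference",
    "2013": "year",
}

# Each edge is a (head, relation, tail) triple.
triples = [
    ("paper_1", "hasTitle", "Title A"),
    ("paper_1", "hasAuthor", "author_x"),
    ("paper_1", "pubWhere", "conf_y"),
    ("paper_1", "pubYear", "2013"),
    ("paper_2", "cite", "paper_1"),
]

# Index the edges by relation type for quick lookup.
by_relation = defaultdict(list)
for head, relation, tail in triples:
    by_relation[relation].append((head, tail))
```

The triple format is the same one that the translation-based models later in the lecture take as input.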

SLIDE 17

Example: Social networks

  • Node types: account, song, post, food, channel
  • Relation types: friend, like, cook, watch, listen

SLIDE 18

Example: Google Knowledge Graph

[Figure: excerpt of the Google Knowledge Graph showing a paintedBy relation]

SLIDE 19

Knowledge Graphs in Practice

  • Google Knowledge Graph
  • Amazon Product Graph
  • Facebook Graph API
  • IBM Watson
  • Microsoft Satori
  • Project Hanover/Literome
  • LinkedIn Knowledge Graph
  • Yandex Object Answer
SLIDE 20

Applications of Knowledge Graphs

  • Serving information
SLIDE 21

Applications of Knowledge Graphs

  • Question answering and conversation agents

SLIDE 22

Knowledge Graph Datasets

  • Publicly available KGs:

– Freebase, Wikidata, DBpedia, YAGO, NELL

  • Common characteristics:

– Massive: millions of nodes and edges
– Incomplete: many true edges are missing

Given a massive KG, enumerating all the possible facts is intractable! Can we predict plausible BUT missing links?

SLIDE 23

Example: Freebase

  • Freebase

– ~50 million entities
– ~38K relation types
– ~3 billion facts/triples

  • FB15k/FB15k-237

– A complete subset of Freebase, used by researchers to learn KG models

93.8% of persons from Freebase have no place of birth and 78.5% have no nationality!

[1] Paulheim, Heiko. "Knowledge graph refinement: A survey of approaches and evaluation methods." Semantic Web 8.3 (2017): 489-508.
[2] Min, Bonan, et al. "Distant supervision for relation extraction with an incomplete knowledge base." Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2013.

SLIDE 24

Today’s Lecture

  • Embedding entire graphs
  • Introduction to Knowledge Graphs
  • Embeddings in Knowledge Graphs

– TransE
– TransR

SLIDE 25

Key Task: KG Completion

  • Knowledge Graph completion is a link prediction problem
  • KG incompleteness can substantially affect the efficiency of systems relying on it
  • Main paper: Translating Embeddings for Modeling Multi-relational Data. Bordes, Usunier, Garcia-Duran, et al. NeurIPS 2013.

SLIDE 26

Key Task: KG Completion

[Figure: a knowledge graph with a missing relation to be predicted]

  • Intuition: a link prediction model that learns from local and global connectivity patterns in the KG, taking into account entities and relationships of different types at the same time
  • Models: TransE and TransR

SLIDE 27

Translating Embeddings: TransE

  • Relationships between entities = triplets

– $h$ (head entity), $r$ (relation), $t$ (tail entity) => $(h, r, t)$

  • Entities and relations are all embedded in the same entity space $\mathbb{R}^d$
  • Relations are represented as translations

– $h + r \approx t$ if the given fact is true; else, $h + r \neq t$

SLIDE 28

TransE

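  • Translation Intuition:

– For a triple $(h, r, t)$, $\mathbf{h}, \mathbf{r}, \mathbf{t} \in \mathbb{R}^d$, $\mathbf{h} + \mathbf{r} = \mathbf{t}$

  • Score function: $f_r(h, t) = \lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert$

[Figure: vectors $\mathbf{h}$ (Obama), $\mathbf{r}$ (Nationality), and $\mathbf{t}$ (American), with $\mathbf{h} + \mathbf{r}$ landing at $\mathbf{t}$]

NOTATION: embedding vectors will appear in boldface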
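
The score function can be sketched as follows. The slide does not say which norm is meant, so the L2 norm is assumed here, and the 2-dimensional vectors (including the extra `canadian` entity) are invented for illustration. A lower score means a more plausible triple.

```python
import math

def transe_score(h, r, t):
    """Distance between the translated head h + r and the tail t (L2 norm assumed)."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# Hypothetical embeddings chosen so that obama + nationality lands on american.
obama = [0.2, 0.1]
nationality = [0.5, 0.4]
american = [0.7, 0.5]
canadian = [-0.3, 0.9]

true_score = transe_score(obama, nationality, american)
false_score = transe_score(obama, nationality, canadian)
```

Here `true_score` is essentially zero while `false_score` is clearly larger, which is the separation the model is trained to produce.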

SLIDE 29

Link Prediction in a KG using TransE

  • Who has won the Turing award?
  • Who is a Canadian citizen?

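[Figure: two panels showing the entities Hinton, Bengio, Pearl, Turing Award, Canada, Trudeau, and Bieber in embedding space; a query vector $\mathbf{q}$ built from the Win relation (first panel) or the Citizen relation (second panel) points at the answer entities]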
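
Answering a query such as "Who has won the Turing award?" amounts to ranking candidate heads by their TransE distance. The sketch below uses made-up 2-dimensional embeddings and an illustrative helper `rank_heads`; in practice the vectors would come from a trained model.

```python
import math

def transe_score(h, r, t):
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# Hypothetical embeddings, hand-picked so that winners sit near TuringAward - win.
entities = {
    "Hinton": [0.0, 1.0],
    "Bengio": [0.1, 0.9],
    "Pearl": [-0.1, 1.1],
    "Trudeau": [2.0, -1.0],
    "Bieber": [2.2, -0.8],
    "TuringAward": [1.0, 1.0],
}
win = [1.0, 0.0]

def rank_heads(relation, tail, candidates):
    """Sort candidate head entities by increasing distance for (?, relation, tail)."""
    scored = [
        (transe_score(vec, relation, candidates[tail]), name)
        for name, vec in candidates.items()
        if name != tail
    ]
    return [name for _, name in sorted(scored)]

ranking = rank_heads(win, "TuringAward", entities)
```

The top of `ranking` contains the entities whose translated embedding lies closest to the Turing Award vector; a cutoff or top-k rule would turn the ranking into answers.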

SLIDE 30

TransE Optimization

  • Learn embeddings such that $h + r = t$ for real triplets that exist in the knowledge graph, and $h + r \neq t$ for triplets that do not exist

– Create a positive training set of valid triples
– Create a negative training set by replacing entities/relations in valid triples (replacement is by random sampling)
– Update embeddings until the distance for positive training set triples is minimized and the distance for negative training set triples is maximized
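
Negative sampling can be sketched as below. This version corrupts only the head or the tail, never the relation, which is a narrower choice than the slide's "entities/relations"; the function name `corrupt` and the toy triples are illustrative, and the loop assumes at least one unseen corruption exists.

```python
import random

def corrupt(triple, entities, known, rng):
    """Replace the head or the tail with a random entity, avoiding known triples."""
    head, relation, tail = triple
    while True:
        replacement = rng.choice(entities)
        if rng.random() < 0.5:
            candidate = (replacement, relation, tail)
        else:
            candidate = (head, relation, replacement)
        if candidate not in known:
            return candidate

positives = [
    ("Hinton", "win", "TuringAward"),
    ("Bengio", "win", "TuringAward"),
    ("Trudeau", "citizen", "Canada"),
]
entities = sorted({e for h, _, t in positives for e in (h, t)})
known = set(positives)
rng = random.Random(0)
negatives = [corrupt(p, entities, known, rng) for p in positives]
```

Because the KG is incomplete, a sampled "negative" may in fact be a true but missing fact; random corruption accepts that risk in exchange for simplicity.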

SLIDE 31

TransE Training

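  • Translation Intuition: for a triple $(h, r, t)$, $\mathbf{h} + \mathbf{r} = \mathbf{t}$
  • Max-margin loss:

$$\mathcal{L} = \sum_{(h, r, t) \in G,\ (h', r, t') \notin G} \left[ \gamma + d(h + r, t) - d(h' + r, t') \right]_+$$

where $d(h + r, t)$ is the distance for the valid triple, $d(h' + r, t')$ is the distance for the negative triple, $[x]_+ = \max(0, x)$, and $\gamma$ is the margin, i.e., the smallest distance tolerated by the model between a valid triple and a corrupted one.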
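
A direct transcription of the max-margin loss is sketched below, assuming the L2 norm for $d$ and hand-picked toy embeddings; the names `margin_loss` and `emb` are illustrative.

```python
import math

def distance(h, r, t):
    return math.sqrt(sum((a + b - c) ** 2 for a, b, c in zip(h, r, t)))

def margin_loss(pairs, emb, gamma=1.0):
    """Sum of hinge terms over (valid triple, corrupted triple) pairs."""
    total = 0.0
    for (h, r, t), (h2, r2, t2) in pairs:
        pos = distance(emb[h], emb[r], emb[t])
        neg = distance(emb[h2], emb[r2], emb[t2])
        total += max(0.0, gamma + pos - neg)
    return total

emb = {
    "h": [0.0, 0.0],
    "r": [1.0, 0.0],
    "t": [1.0, 0.0],       # valid tail: h + r lands exactly here
    "t_far": [4.0, 0.0],   # corrupted tail far away from h + r
    "t_near": [1.5, 0.0],  # corrupted tail inside the margin
}
loss_far = margin_loss([(("h", "r", "t"), ("h", "r", "t_far"))], emb)
loss_near = margin_loss([(("h", "r", "t"), ("h", "r", "t_near"))], emb)
```

The far corruption already respects the margin and contributes nothing, while the near one contributes 0.5, so only "hard" negatives drive the updates.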

SLIDE 32

TransE Learning Algorithm

[Figure: TransE learning algorithm (pseudocode), annotated as follows]

  • Entities and relations are initialized uniformly, and normalized
  • Negative sampling with a triplet that does not appear in the KG
  • Comparative loss between a valid sample and a negative sample: favors lower distance values for valid triplets, higher distance values for corrupted ones
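
The three annotated steps can be put together in a compact training loop. This is a sketch under several assumptions and not a reproduction of the algorithm on the slide: it uses the squared L2 distance (chosen so the gradients are simple), plain per-triple SGD, head-or-tail corruption only, and invented hyperparameters; `train_transe` is an illustrative name.

```python
import math
import random

def normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def sq_dist(h, r, t):
    return sum((a + b - c) ** 2 for a, b, c in zip(h, r, t))

def train_transe(triples, dim=8, gamma=1.0, lr=0.05, epochs=200, seed=0):
    rng = random.Random(seed)
    entities = sorted({e for h, _, t in triples for e in (h, t)})
    relations = sorted({r for _, r, _ in triples})
    bound = 6 / math.sqrt(dim)

    def init():
        return [rng.uniform(-bound, bound) for _ in range(dim)]

    # Step 1: uniform initialization followed by normalization.
    ent = {e: normalize(init()) for e in entities}
    rel = {r: normalize(init()) for r in relations}
    known = set(triples)
    for _ in range(epochs):
        for e in ent:
            ent[e] = normalize(ent[e])
        for h, r, t in triples:
            # Step 2: sample a corrupted triple that is not in the KG.
            while True:
                e = rng.choice(entities)
                neg = (e, r, t) if rng.random() < 0.5 else (h, r, e)
                if neg not in known:
                    break
            h2, _, t2 = neg
            # Step 3: hinge loss; skip the update when the margin is satisfied.
            loss = gamma + sq_dist(ent[h], rel[r], ent[t]) - sq_dist(ent[h2], rel[r], ent[t2])
            if loss <= 0:
                continue
            g_pos = [2 * (a + b - c) for a, b, c in zip(ent[h], rel[r], ent[t])]
            g_neg = [2 * (a + b - c) for a, b, c in zip(ent[h2], rel[r], ent[t2])]
            for k in range(dim):
                ent[h][k] -= lr * g_pos[k]
                ent[t][k] += lr * g_pos[k]
                rel[r][k] -= lr * (g_pos[k] - g_neg[k])
                ent[h2][k] += lr * g_neg[k]
                ent[t2][k] -= lr * g_neg[k]
    return ent, rel

triples = [
    ("Hinton", "win", "TuringAward"),
    ("Bengio", "win", "TuringAward"),
    ("Trudeau", "citizen", "Canada"),
]
ent, rel = train_transe(triples)
```

Real implementations typically use mini-batches and vectorized tensor operations; the loop above only aims to make the order of the steps explicit.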

SLIDE 33

Complex Relational Patterns

  • Symmetric Relations: $r(h, t) \Rightarrow r(t, h) \ \forall h, t$

– Example: Family, Roommate

  • Composition Relations: $r_1(x, y) \wedge r_2(y, z) \Rightarrow r_3(x, z) \ \forall x, y, z$

– Example: My mother’s husband is my father.

  • 1-to-N, N-to-1 relations: $r(h, t_1), r(h, t_2), \ldots, r(h, t_n)$ are all True.

– Example: $r$ is “StudentsOf”

SLIDE 34

Composition in TransE

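  • Composition Relations: $r_1(x, y) \wedge r_2(y, z) \Rightarrow r_3(x, z) \ \forall x, y, z$

– Example: My mother’s husband is my father.

  • In TransE, compositional relations are possible if $\mathbf{r}_3 = \mathbf{r}_1 + \mathbf{r}_2$

[Figure: vector $\mathbf{r}_1$ from $\mathbf{x}$ to $\mathbf{y}$, vector $\mathbf{r}_2$ from $\mathbf{y}$ to $\mathbf{z}$, and vector $\mathbf{r}_3$ from $\mathbf{x}$ to $\mathbf{z}$]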
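
A quick numeric check of the composition property, using made-up 2-dimensional vectors for the "mother's husband is father" example (the variable names are illustrative):

```python
import math

def transe_score(h, r, t):
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def add(u, v):
    return [a + b for a, b in zip(u, v)]

mother_of = [1.0, 2.0]     # r1
husband_of = [0.5, -1.0]   # r2
father_of = add(mother_of, husband_of)  # r3 = r1 + r2

me = [0.0, 0.0]
my_mother = add(me, mother_of)          # x + r1 = y
my_father = add(my_mother, husband_of)  # y + r2 = z

composed_score = transe_score(me, father_of, my_father)
```

Because translations add up, the composed triple gets a score of zero whenever the two component triples do.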

SLIDE 35

Symmetric Relations in TransE

  • Symmetric Relations: $r(h, t) \Rightarrow r(t, h) \ \forall h, t$

– Example: Family, Roommate

  • In TransE, symmetric relations are not possible:

– For TransE to handle a symmetric relation $r$, for all $h, t$ that satisfy $r(h, t)$, $r(t, h)$ is also True.
– So, $h + r - t = 0$ and $t + r - h = 0$.
– Then $r = 0$ and $h = t$.
– However, $h$ and $t$ are two different entities and should be mapped to different locations.
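
The step from the two constraints to $r = 0$ and $h = t$ can be spelled out as a short derivation, obtained by adding the two equations and substituting back:

```latex
\begin{aligned}
\mathbf{h} + \mathbf{r} - \mathbf{t} &= \mathbf{0} \\
\mathbf{t} + \mathbf{r} - \mathbf{h} &= \mathbf{0} \\
\text{adding both equations:}\quad 2\mathbf{r} &= \mathbf{0} \ \Rightarrow\ \mathbf{r} = \mathbf{0} \\
\text{substituting } \mathbf{r} = \mathbf{0}\text{:}\quad \mathbf{h} - \mathbf{t} &= \mathbf{0} \ \Rightarrow\ \mathbf{h} = \mathbf{t}
\end{aligned}
```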

SLIDE 36

Limitation: N-ary Relations

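  • 1-to-N, N-to-1, N-to-N relations

– Example: $(h, r, t_1)$ and $(h, r, t_2)$ both exist in the knowledge graph, e.g., $r$ is “StudentsOf”

  • In TransE, $t_1$ and $t_2$ will map to the same vector, although they are different entities.

– $\mathbf{t}_1 = \mathbf{h} + \mathbf{r} = \mathbf{t}_2$
– $\mathbf{t}_1 \neq \mathbf{t}_2$ (contradictory!)

  • In TransE, N-ary relations are not possible

[Figure: $\mathbf{h}$ translated by the same $\mathbf{r}$ must reach both $\mathbf{t}_1$ and $\mathbf{t}_2$]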

SLIDE 37

Today’s Lecture

  • Embedding entire graphs
  • Introduction to Knowledge Graphs
  • Embeddings in Knowledge Graphs

– TransE
– TransR

SLIDE 38

Solution: TransR

  • Learn embeddings for entities and relations in separate spaces

– Model entities as vectors in the entity space $\mathbb{R}^d$
– Model a relation as a vector $\mathbf{r}$ in the relation space $\mathbb{R}^k$

  • Learn a relation-specific transformation from the entity space to the relation space, one per relation

– Train $\mathbf{M}_r \in \mathbb{R}^{k \times d}$ as the projection matrix for relation $\mathbf{r}$

  • Reference: “Learning entity and relation embeddings for knowledge graph completion.” Lin et al. AAAI 2015.

SLIDE 39

TransR Formulation

  • $h_r = M_r h$, $t_r = M_r t$
  • $f_r(h, t) = \lVert h_r + r - t_r \rVert$

– instead of $f_r(h, t) = \lVert h + r - t \rVert$
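
The TransR score can be sketched with plain lists, assuming the L2 norm and a hand-picked projection matrix with $d = 3$ and $k = 2$; the helper names `matvec` and `transr_score` and all numbers are illustrative.

```python
import math

def matvec(M, v):
    """Multiply a k x d matrix (list of rows) by a d-dimensional vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transr_score(h, r, t, M_r):
    """Project h and t into the relation space, then apply the translation test."""
    h_r = matvec(M_r, h)
    t_r = matvec(M_r, t)
    return math.sqrt(sum((a + b - c) ** 2 for a, b, c in zip(h_r, r, t_r)))

M_r = [[1.0, 0.0, 0.0],
       [0.0, 1.0, 0.0]]   # keeps the first two entity coordinates
h = [0.0, 0.0, 5.0]
t = [1.0, 2.0, -3.0]
r = [1.0, 2.0]
score = transr_score(h, r, t, M_r)
```

The third entity coordinate is ignored by this particular `M_r`, so `h` and `t` may differ arbitrarily there without affecting the score for this relation.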

SLIDE 40

Symmetric Relations in TransR

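  • Symmetric Relations: $r(h, t) \Rightarrow r(t, h) \ \forall h, t$

– Example: Family, Roommate

  • For TransR, we can learn $M_r$ to map $h$ and $t$ to the same location in the space of relation $r$: $\mathbf{r} = 0$, $h_r = M_r h = M_r t = t_r$ ✓

[Figure: $\mathbf{h}$ and $\mathbf{t}$ in the entity space are both mapped by $\mathbf{M}_r$ to the same point $\mathbf{h}_r = \mathbf{t}_r$ in the relation space]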
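
A tiny numeric instance of this argument is sketched below, with a hand-picked $1 \times 2$ projection matrix; in a real model $M_r$ would be learned, and all names and values here are illustrative.

```python
import math

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transr_score(h, r, t, M_r):
    h_r, t_r = matvec(M_r, h), matvec(M_r, t)
    return math.sqrt(sum((a + b - c) ** 2 for a, b, c in zip(h_r, r, t_r)))

M_r = [[1.0, 0.0]]   # projects 2-D entities onto their first coordinate
r = [0.0]            # the symmetric relation is the zero vector
h = [3.0, 1.0]
t = [3.0, -4.0]      # a different entity with the same projection

forward = transr_score(h, r, t, M_r)
backward = transr_score(t, r, h, M_r)
```

Both directions score zero even though `h` and `t` are distinct vectors in the entity space, which is exactly what TransE cannot achieve.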

SLIDE 41

N-ary Relations in TransR

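  • 1-to-N, N-to-1, N-to-N relations

– Example: $(h, r, t_1)$ and $(h, r, t_2)$ both exist in the knowledge graph.

  • We can learn $M_r$ so that $t_r = M_r t_1 = M_r t_2$, even though $t_1$ does not need to be equal to $t_2$!

[Figure: $\mathbf{t}_1$ and $\mathbf{t}_2$ are projected to the same $\mathbf{t}_r$, so that $\mathbf{h}_r + \mathbf{r} = \mathbf{t}_r$ holds for both]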
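
The contrast with TransE can be checked numerically. In the sketch below the vectors and the projection matrix are hand-picked for illustration: with a single translation TransE cannot give both tails a zero score, whereas TransR can once both tails project to the same point.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def transe_score(h, r, t):
    return norm([a + b - c for a, b, c in zip(h, r, t)])

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transr_score(h, r, t, M_r):
    return transe_score(matvec(M_r, h), r, matvec(M_r, t))

h = [0.0, 0.0]
t1 = [1.0, 1.0]
t2 = [1.0, -1.0]

# TransE: one translation vector must reach two different tails.
r_e = [1.0, 0.0]
transe_scores = [transe_score(h, r_e, t1), transe_score(h, r_e, t2)]

# TransR: project onto the first coordinate, where t1 and t2 coincide.
M_r = [[1.0, 0.0]]
r_r = [1.0]
transr_scores = [transr_score(h, r_r, t1, M_r), transr_score(h, r_r, t2, M_r)]
```

Here `transe_scores` is `[1.0, 1.0]`, the best compromise for this choice of `r_e`, while `transr_scores` is `[0.0, 0.0]`.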

SLIDE 42

Limitation: Composition in TransR

  • Composition Relations: $r_1(x, y) \wedge r_2(y, z) \Rightarrow r_3(x, z) \ \forall x, y, z$

– Example: My mother’s husband is my father.

  • Each relation has a different space.
  • TransR is not naturally compositional for multiple relations! ✗

SLIDE 43

Translation-Based KG Embedding

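| Embedding | Entity | Relation | $f_r(h, t)$ |
|-----------|--------|----------|-------------|
| TransE | $h, t \in \mathbb{R}^d$ | $r \in \mathbb{R}^d$ | $\lVert h + r - t \rVert$ |
| TransR | $h, t \in \mathbb{R}^d$ | $r \in \mathbb{R}^k$, $M_r \in \mathbb{R}^{k \times d}$ | $\lVert M_r h + r - M_r t \rVert$ |

| Embedding | Symmetry | Composition | One-to-many |
|-----------|----------|-------------|-------------|
| TransE | ✗ | ✓ | ✗ |
| TransR | ✓ | ✗ | ✓ |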

SLIDE 44

Today’s Lecture

  • Embedding entire graphs
  • Introduction to Knowledge Graphs
  • Embeddings in Knowledge Graphs

– TransE
– TransR