
SLIDE 1

Srijan Kumar, Georgia Tech, CSE6240 Spring 2020: Web Search and Text Mining

CSE 6240: Web Search and Text Mining. Spring 2020

Temporal Graph Representation Learning

Rakshit Trivedi

School of Computational Science and Engineering rstrivedi@gatech.edu

SLIDE 2

Today’s Lecture

  • GraphSAGE
  • Dynamic Graphs and their Applications
  • Representation Learning with:

– Discrete-Time Approaches
– Continuous-Time Approaches

SLIDE 3

GraphSAGE Idea

  • In GCN, we aggregated the neighbors’ messages as the (weighted) average of all neighbors. How can we generalize this?

[Figure: input graph and the two-hop computation graph of target node A]

[Hamilton et al., NIPS 2017]

SLIDE 4

GraphSAGE Idea

[Figure: input graph and the two-hop computation graph of target node A]

h_v^k = σ( [ A_k · AGG({h_u^{k−1}, ∀u ∈ N(v)}), B_k h_v^{k−1} ] )

AGG: any differentiable function that maps the set of vectors {h_u^{k−1}, u ∈ N(v)} to a single vector

SLIDE 5

Neighborhood Aggregation

  • Simple neighborhood aggregation:

h_v^k = σ( W_k Σ_{u∈N(v)} h_u^{k−1} / |N(v)| + B_k h_v^{k−1} )

  • GraphSAGE: concatenate neighbor embedding and self embedding:

h_v^k = σ( [ W_k · AGG({h_u^{k−1}, ∀u ∈ N(v)}), B_k h_v^{k−1} ] )

Generalized aggregation: AGG can be any differentiable, permutation-invariant function
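To make the update concrete, here is a minimal NumPy sketch of one GraphSAGE layer with a mean aggregator. The weights, graph, and dimensions are toy values for illustration, not the paper's setup:

```python
import numpy as np

def graphsage_layer(h, neighbors, W, B, agg=lambda M: M.mean(axis=0)):
    """One GraphSAGE layer: h_v = ReLU([W @ AGG({h_u}), B @ h_v])."""
    h_new = np.zeros((h.shape[0], W.shape[0] + B.shape[0]))
    for v, nbrs in neighbors.items():
        agg_msg = agg(h[nbrs])              # aggregate neighbor embeddings
        h_new[v] = np.concatenate([W @ agg_msg, B @ h[v]])
        h_new[v] = np.maximum(h_new[v], 0)  # sigma = ReLU
    # L2-normalize the embeddings, as GraphSAGE does after each layer
    norms = np.linalg.norm(h_new, axis=1, keepdims=True)
    return h_new / np.clip(norms, 1e-12, None)

# toy graph: 3 nodes on a path 0-1-2, 4-dim input features
rng = np.random.default_rng(0)
h0 = rng.normal(size=(3, 4))
neighbors = {0: [1], 1: [0, 2], 2: [1]}
W = rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4))
h1 = graphsage_layer(h0, neighbors, W, B)
print(h1.shape)  # (3, 8): concatenation doubles the dimension
```

Swapping the `agg` callable is exactly the "generalized aggregation" point: any permutation-invariant, differentiable function works.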

SLIDE 6

Neighbor Aggregation: Variants

  • Mean: Take a weighted average of neighbors:

agg = Σ_{u∈N(v)} h_u^{k−1} / |N(v)|

  • Pool: Transform neighbor vectors and apply a symmetric vector function (element-wise mean/max):

agg = γ({Q h_u^{k−1}, ∀u ∈ N(v)})

  • LSTM: Apply an LSTM to a reshuffling π of the neighbors:

agg = LSTM([h_u^{k−1}, ∀u ∈ π(N(v))])
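The three variants can be sketched side by side. This is an illustrative sketch in NumPy: `Q` is a toy transform, and the LSTM variant is stood in for by a cumulative sum over a random permutation, since the point here is only that the aggregator is order-sensitive and GraphSAGE feeds it shuffled neighbors:

```python
import numpy as np

def agg_mean(H_nbrs):
    # Mean: element-wise average of the neighbor embeddings
    return H_nbrs.mean(axis=0)

def agg_pool(H_nbrs, Q):
    # Pool: transform each neighbor with Q, then apply a symmetric
    # element-wise max over the transformed vectors
    return (H_nbrs @ Q.T).max(axis=0)

def agg_lstm_like(H_nbrs, rng):
    # LSTM variant (stand-in): order-sensitive, so feed a random
    # permutation pi of the neighbors; a real LSTM cell would go here
    perm = rng.permutation(len(H_nbrs))
    return H_nbrs[perm].cumsum(axis=0)[-1]

rng = np.random.default_rng(1)
H = rng.normal(size=(3, 4))   # 3 neighbors, 4-dim embeddings
Q = rng.normal(size=(4, 4))
print(agg_mean(H).shape, agg_pool(H, Q).shape)  # (4,) (4,)
```

Note that mean and pool are permutation-invariant by construction, while the LSTM variant is not, which is exactly why the reshuffling π is needed.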

SLIDE 7

Experiments: Dataset

  • Dynamic datasets:

– Citation Network: Predict paper category

  • Data from 2000-2005
  • 302,424 nodes
  • Train: data till 2004, test: 2005 data

– Reddit Post Network: Predict subreddit of post

  • Nodes = posts
  • Edges between posts if common users comment on the post
  • 232,965 posts
  • Train: 20 days of data, test: next 10 days of data

SLIDE 8

Experiments: Results

SLIDE 9

Summary: GCN and GraphSAGE

  • Key idea: Generate node embeddings based on local neighborhoods

– Nodes aggregate “messages” from their neighbors

using neural networks

  • Graph convolutional networks:

– Basic variant: Average neighborhood information

and stack neural networks

  • GraphSAGE:

– Generalized neighborhood aggregation

SLIDE 10

Today’s Lecture

  • GraphSAGE
  • Dynamic Graphs and their Applications
  • Representation Learning with:

– Discrete-Time Approaches
– Continuous-Time Approaches

SLIDE 11

Temporally Evolving Graphs

E-commerce Social media Finance Web Education IoT

SLIDE 12

Temporally Evolving Graphs

(i) How to model dynamics over graphs? (ii) How to leverage such a dynamic graph model to encode evolving graph information into low-dimensional representations?

SLIDE 13

Application: Social Networks

SLIDE 14

Application: Recommendation Systems

[Figure: users × products interaction matrix with features, evolving over time]

SLIDE 15

Application: Anomaly Detection

[Image from NetWalk presentation, Yu et al., KDD 2018]

SLIDE 16

How Do We Model Dynamics?

  • 1. Snapshot-Based Observation:

– Network evolution is observed as a collection of snapshots of the graph at different time steps
– Possibly significant changes in graph structure between two consecutive time steps
– Time information may or may not be explicitly available
– Demands discrete-time modeling

  • 2. Event-Based Observation:

– Network evolution is observed as time-stamped edges (each edge represents an event)
– Time information is fine-grained and explicitly available
– Demands continuous-time modeling

SLIDE 17

Today’s Lecture

  • GraphSAGE
  • Dynamic Graphs and their Applications
  • Representation Learning with:

– Discrete-Time Approaches
– Continuous-Time Approaches

SLIDE 18

Snapshot based Evolution of Graphs

[Morris 2000]

[Figure: graph snapshots at times t1, t2, t3]

  • Let G_t = (V_t, E_t) denote the graph at time t
  • Let A_t be the corresponding adjacency matrix at time t
  • Dynamic graph G = {G_1, G_2, …, G_T} is the series of graph snapshots recorded at T different time steps

SLIDE 19

Snapshot based Evolution of Graphs

[Morris 2000]

  • One Approach: Use a single graph encoder at each time step to extract node features
  • Use an RNN-based model over these node features to model the dynamics

What problems could this potentially have?

[Figure: snapshots at t1, t2, t3 → shared graph encoder (GE) → embeddings → RNN]
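The two steps above can be sketched as follows. The encoder, RNN cell, and all shapes are toy assumptions, not a specific published model; note in particular that the sketch assumes a fixed node set across snapshots, which is exactly the weakness discussed next:

```python
import numpy as np

def encode(A, X, W):
    """Toy shared graph encoder: one mean-aggregation, GCN-style layer."""
    deg = A.sum(axis=1, keepdims=True).clip(min=1)
    return np.tanh((A @ X) / deg @ W)

def rnn_step(z, s, U, V):
    """Vanilla RNN cell over per-snapshot node embeddings."""
    return np.tanh(z @ U + s @ V)

rng = np.random.default_rng(0)
n, d, h = 4, 3, 5                      # nodes, feature dim, hidden dim
W = rng.normal(size=(d, h))            # shared encoder weights
U = rng.normal(size=(h, h))
V = rng.normal(size=(h, h))

state = np.zeros((n, h))
for _ in range(3):                     # three snapshots t1, t2, t3
    A = (rng.random((n, n)) < 0.5).astype(float)
    A = np.maximum(A, A.T)             # symmetric snapshot adjacency
    X = rng.normal(size=(n, d))
    z = encode(A, X, W)                # per-snapshot node embeddings
    state = rnn_step(z, state, U, V)   # dynamics across snapshots

print(state.shape)  # (4, 5): final hidden state per node
```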

SLIDE 20

Snapshot based Evolution of Graphs

[Morris 2000]

  • The number of nodes and edges varies with the time step
  • The above approach would require complete knowledge of all nodes
  • Doesn’t perform well in practice

[Figure: snapshots at t1, t2, t3 → shared graph encoder (GE) → embeddings → RNN]

SLIDE 21

General Model

Alternative Approach: Use graph-specific encoder at each time step

[Figure: graph-specific encoders GE(t1), GE(t2), GE(t3) → embeddings at each time step; how to capture dynamics?]

SLIDE 22

General Model

Alternative Approach: Use graph-specific encoder at each time step

  • Adapt the architecture based on changes in graph properties
  • Adapt encoder parameters to model dynamics
  • Train using an unsupervised or semi-supervised loss as before, e.g. cross-entropy loss

[Figure: graph-specific encoders GE(t1), GE(t2), GE(t3) → embeddings at each time step; how to capture dynamics?]

SLIDE 23

Variant I:

Dynamic Autoencoder Architecture

DynGEM: Deep Embedding Method for Dynamic Graphs

[Slides for DynGEM adapted from the authors’ original slides, Goyal et al., 2018]

SLIDE 24

DynGEM: Model

SLIDE 25

DynGEM: Adaptive Architecture

Embedding Stability

SLIDE 26

DynGEM: Data Setup

SLIDE 27

DynGEM: Visualization

SLIDE 28

DynGEM: Link Prediction

SLIDE 29

DynGEM: Anomaly Detection

SLIDE 30

Variant II: GCN Weight Evolution

EvolveGCN: Evolving Graph Convolutional Networks for Dynamic Graphs

SLIDE 31


EvolveGCN: Model

SLIDE 32

EvolveGCN: Weight Evolution

  • GCN Reminder:

SLIDE 33

EvolveGCN: Weight Evolution

  • GCN Reminder:
  • Weight Evolution I: (only structural properties)
  • Weight Evolution II: (for attributed graphs)
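The two variants can be sketched with a simplified recurrent update over the GCN weight matrix. This is an illustrative stand-in: the EvolveGCN paper drives the structural variant with an LSTM over the weights themselves and the attributed variant with a GRU fed by node embeddings; here a single matrix-GRU plays both roles, and all parameter names and shapes are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def matrix_gru(X, W, P):
    """Simplified matrix GRU: evolve GCN weights W using input X
    (same shape as W). P holds the gate parameter matrices."""
    Z = sigmoid(X @ P["Wz"] + W @ P["Uz"])        # update gate
    R = sigmoid(X @ P["Wr"] + W @ P["Ur"])        # reset gate
    C = np.tanh(X @ P["Wc"] + (R * W) @ P["Uc"])  # candidate weights
    return (1 - Z) * W + Z * C

rng = np.random.default_rng(0)
din, dout = 4, 4
P = {k: rng.normal(scale=0.5, size=(dout, dout))
     for k in ["Wz", "Uz", "Wr", "Ur", "Wc", "Uc"]}

W = rng.normal(size=(din, dout))           # GCN weights at time t-1
# Variant I (structural): recurrence driven by the weights themselves
W_struct = matrix_gru(W, W, P)
# Variant II (attributed): recurrence driven by summarized node embeddings
H_summary = rng.normal(size=(din, dout))   # stand-in for summarize(H_t)
W_attr = matrix_gru(H_summary, W, P)
print(W_struct.shape, W_attr.shape)  # (4, 4) (4, 4)
```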

SLIDE 34


SLIDE 35

EvolveGCN: Summarization

What is the challenge?

SLIDE 36

EvolveGCN: Summarization

What is the challenge? (Need to account for changing dimension of H)

SLIDE 37

EvolveGCN: Summarization

What is the challenge? (Need to account for changing dimension of H)

Use representative summarization:

p is a time-independent parameter vector
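One such summarization scores every node against the vector p and keeps the top-k gated rows (as in EvolveGCN's top-k summarization; exact details assumed). The node count n varies per snapshot, but the output always has k rows, giving the recurrent weight update a fixed-size input:

```python
import numpy as np

def summarize(H, p, k):
    """Reduce a variable-size embedding matrix H (n x d) to k rows."""
    y = H @ p / np.linalg.norm(p)             # scalar score per node
    idx = np.argsort(y)[-k:]                  # top-k nodes by score
    return H[idx] * np.tanh(y[idx])[:, None]  # gate the selected rows

rng = np.random.default_rng(0)
p = rng.normal(size=5)
for n in (8, 12):                             # node count varies per snapshot
    H = rng.normal(size=(n, 5))
    print(summarize(H, p, k=4).shape)         # (4, 5) both times
```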

SLIDE 38

EvolveGCN: Datasets

  • SBM (Stochastic Block Model): popular model for simulating communities
  • BC-OTC, BC-Alpha: who-trusts-whom networks of Bitcoin users
  • UCI: messages sent between users in the UC Irvine student community

SLIDE 39

EvolveGCN: Datasets

  • AS (Autonomous Systems): communication network of routers that exchange traffic flows with peers
  • Reddit: subreddit-to-subreddit hyperlink network, where each hyperlink originates from a post in the source community and links to a post in the target community
  • Elliptic: Bitcoin transactions, wherein each node represents one transaction and the edges indicate payment flows

SLIDE 40

EvolveGCN: Tasks

  • Training is performed end-to-end based on the task
  • 1. Link Prediction: For a pair of nodes u and v, concatenate their embeddings and apply an MLP to compute the link probability
  • 2. Edge Classification: For an edge (u, v), similarly concatenate the corresponding node embeddings and apply an MLP to compute edge class probabilities
  • 3. Node Classification: For a node u, follow standard practice of using a softmax activation as the last layer of the GCN, thus outputting node class probabilities
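The three task heads can be sketched as follows; the embeddings, MLP sizes, and class counts are toy assumptions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mlp(x, W1, W2):
    return np.maximum(x @ W1, 0) @ W2        # one hidden layer, ReLU

rng = np.random.default_rng(0)
d = 4
emb = rng.normal(size=(10, d))               # final node embeddings
W1 = rng.normal(size=(2 * d, 8))
W2 = rng.normal(size=(8, 2))

# 1. Link prediction: concat embeddings of (u, v) -> MLP -> probability
u, v = 2, 7
link_prob = softmax(mlp(np.concatenate([emb[u], emb[v]]), W1, W2))

# 2. Edge classification: same concatenated input, MLP outputs class scores
edge_probs = softmax(mlp(np.concatenate([emb[u], emb[v]]), W1, W2))

# 3. Node classification: softmax over a (toy) last GCN layer output
Wc = rng.normal(size=(d, 3))
node_probs = softmax(emb[u] @ Wc)
print(link_prob.sum().round(6), node_probs.sum().round(6))  # 1.0 1.0
```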

SLIDE 41

EvolveGCN: Experiments

Link Prediction Edge Classification Node Classification

SLIDE 42

Summary So Far

  • GCN, GraphSAGE: learned aggregation over neighborhoods
  • Next step: Dynamic graphs (time dimension)

– Applications: Social Media, Citation Network Analysis, Financial Transactions, Anomaly Detection, and many more

  • Discrete-Time Models for Snapshot-Based Observation:

– Adaptive Architecture using Autoencoders
– Adaptive Parameters using GCN and RNN

(Does not make use of time information explicitly and cannot handle fine-grained, complex temporal dynamics)

SLIDE 43

Today’s Lecture

  • GraphSAGE
  • Dynamic Graphs and their Applications
  • Representation Learning with:

– Discrete-Time Approaches
– Continuous-Time Approaches

SLIDE 44

Event based Evolution of Graphs

SLIDE 45

Event based Evolution of Graphs

SLIDE 46

Preliminaries: Graph Attention

  • Simple neighborhood aggregation (GCN):

– Aggregates messages across the neighborhood N(v)
– α_{vu} = 1/|N(v)| assigns the weight (importance) of node u’s message to node v
– Explicitly based on structural properties of the graph, with each neighbor assigned equal importance

h_v^k = σ( W_k Σ_{u∈N(v)} h_u^{k−1} / |N(v)| + B_k h_v^{k−1} )

SLIDE 47

Preliminaries: Graph Attention

  • Simple neighborhood aggregation (GCN):

– Aggregates messages across the neighborhood N(v)
– α_{vu} = 1/|N(v)| assigns the weight (importance) of node u’s message to node v
– Explicitly based on structural properties of the graph, with each neighbor assigned equal importance

h_v^k = σ( W_k Σ_{u∈N(v)} h_u^{k−1} / |N(v)| + B_k h_v^{k−1} )

Not all neighbors are equally important!

SLIDE 48

Preliminaries: Graph Attention

Can we learn the weight factors α_{vu} implicitly? Assign arbitrary importance to different neighbors of a node in the graph

SLIDE 49

Preliminaries: Graph Attention

  • Attention Mechanism:

– While computing representations, nodes attend over their neighborhood
– Implicitly specifies different weights to different nodes in a neighborhood

Can we learn the weight factors α_{vu} implicitly? Assign arbitrary importance to different neighbors of a node in the graph
SLIDE 50

Preliminaries: Graph Attention Example

SLIDE 51

Preliminaries: Graph Attention Example

e_{vu} = a(W^l h_v^{l−1}, W^l h_u^{l−1})

a – attention mechanism; could be a single-layer NN. Normalize using a softmax to make the coefficients comparable across different neighborhoods: α_{vu} = softmax_u(e_{vu})
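A minimal NumPy sketch of these attention coefficients. Since the slide leaves a unspecified, the GAT-style choice of a single-layer mechanism a · [z_v ; z_u] with a LeakyReLU is an assumption here:

```python
import numpy as np

def gat_attention(h, neighbors, W, a):
    """Attention coefficients alpha_{vu}: score each (v, u) pair with a
    single-layer mechanism a, then softmax within each neighborhood."""
    z = h @ W.T                              # transformed node embeddings
    alpha = {}
    for v, nbrs in neighbors.items():
        # e_{vu} = LeakyReLU(a . [z_v ; z_u]) for each neighbor u
        e = np.array([np.concatenate([z[v], z[u]]) @ a for u in nbrs])
        e = np.where(e > 0, e, 0.2 * e)      # LeakyReLU
        e = np.exp(e - e.max())
        alpha[v] = e / e.sum()               # softmax over N(v)
    return alpha

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 3))
a = rng.normal(size=6)
neighbors = {0: [1, 2, 3], 1: [0], 2: [0, 3], 3: [0, 2]}
alpha = gat_attention(h, neighbors, W, a)
print({v: w.round(2) for v, w in alpha.items()})
```

Each α_{v} sums to 1 within its own neighborhood, which is what the softmax normalization buys.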

SLIDE 52

Preliminaries: Graph Attention Example

e_{vu} = a(W^l h_v^{l−1}, W^l h_u^{l−1})

a – attention mechanism; could be a single-layer NN. Normalize using a softmax to make the coefficients comparable across different neighborhoods: α_{vu} = softmax_u(e_{vu})

Multi-head attention to stabilize learning:

  • Repeat the operation at each layer R times
  • Aggregate the outputs
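The multi-head aggregation can be sketched as follows; head() is a hypothetical stand-in for one attention head's per-node output, since the point here is only how the R outputs are combined (concatenation at hidden layers, averaging at the final layer, as in GAT):

```python
import numpy as np

def head(h, W):
    # hypothetical stand-in for one attention head's per-node output
    return np.tanh(h @ W)

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 3))                  # 4 nodes, 3-dim embeddings
R = 3                                        # number of attention heads
Ws = [rng.normal(size=(3, 2)) for _ in range(R)]
outs = [head(h, W) for W in Ws]

h_concat = np.concatenate(outs, axis=1)      # hidden layers: concatenate heads
h_avg = np.mean(outs, axis=0)                # final layer: average heads
print(h_concat.shape, h_avg.shape)           # (4, 6) (4, 2)
```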

SLIDE 53

Preliminaries: Graph Attention Properties

  • Inductive:

– Shared edge-wise mechanism that does not depend on the global graph structure

  • Localized:

– Only attends over the local network neighborhood

  • Computationally Efficient:

– Computation of the attention coefficients can be parallelized across all edges of the graph
– Aggregation can be parallelized across all nodes

Key Benefit: Allows implicitly specifying different importance values α_{vu} for different neighbors

SLIDE 54

Preliminaries: Temporal Point Process

SLIDE 55

Preliminaries: Temporal Point Process

SLIDE 56

Preliminaries: Temporal Point Process

SLIDE 57

Event Based Modeling of Complex Temporal Dynamics

DyRep: Representation Learning over Dynamic Graphs

SLIDE 58

Event Based Model

SLIDE 59

Evolution Through Mediation

SLIDE 60

Social Network Example

SLIDE 61

DyRep Model

SLIDE 62

DyRep Model

SLIDE 63

Localized Embedding Propagation

SLIDE 64

Temporal Point Process Attention

SLIDE 65

Training Procedure

SLIDE 66

Experiments: Setup

SLIDE 67

Experiments: Tasks

  • Temporal Link Prediction
  • Event Time Prediction

SLIDE 68

Experiment I: Dynamic Link Prediction

SLIDE 69

Experiment I: Dynamic Link Prediction

SLIDE 70

Experiment II: Event Time Prediction

SLIDE 71

Today’s Lecture

  • GraphSAGE
  • Dynamic Graphs and their Applications
  • Representation Learning with:

– Discrete-Time Approaches
– Continuous-Time Approaches