

  1. GraphRNN: A Deep Generative Model for Graphs (24 Feb 2018)
Jiaxuan You, Rex Ying, Xiang Ren, William L. Hamilton, Jure Leskovec
Presented by: Jesse Bettencourt and Harris Chan, March 9, 2018
University of Toronto, Vector Institute

  2. Introduction: Generative Model for Graphs
Modeling graphs is fundamental for studying networks, e.g. medical, chemical, social.
Goal: Model and efficiently sample complex distributions over graphs; learn the generative model from an observed set of graphs.

  3. Challenges in Graph Generation
Large and variable output spaces
• A graph with n nodes requires up to n² values to fully specify its structure
• The number of nodes and edges varies between different graphs
Non-unique representations
• Distributions over graphs without assuming a fixed set of nodes
• An n-node graph can be represented by up to n! equivalent adjacency matrices, where π ∈ Π is an arbitrary node ordering
Complex, non-local dependencies
• New edges depend on previously generated edges

  4. Overview of GraphRNN
Decompose graph generation into two RNNs:
• Graph-level: generates the sequence of nodes
• Edge-level: generates the sequence of edges for each new node

  5. Modeling Graphs as Sequences
Graph G ∼ p(G) with n nodes under node ordering π.
Define a mapping f_S from G to a sequence:
S^π = f_S(G, π) = (S^π_1, ..., S^π_n)   (1)
Each sequence element S^π_i ∈ {0,1}^{i−1}, i ∈ {1, ..., n}, is an adjacency vector for the edges between node π(v_i) and the previous nodes π(v_j), j ∈ {1, ..., i−1}.
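As a concrete illustration, here is a minimal sketch of the mapping f_S(G, π) from Eq. (1): given an adjacency matrix and a node ordering, it returns the sequence of adjacency vectors S^π_i ∈ {0,1}^{i−1}. The function and variable names (graph_to_sequence, A, pi) are illustrative, not from the paper.

```python
import numpy as np

def graph_to_sequence(A, pi):
    """A: (n, n) symmetric 0/1 adjacency matrix; pi: a permutation of range(n)."""
    A_pi = A[np.ix_(pi, pi)]  # relabel the nodes according to the ordering pi
    # S^pi_i lists the edges between node pi(v_i) and the previously placed nodes;
    # S^pi_1 is empty, so the useful vectors start at node 2.
    return [A_pi[i, :i].copy() for i in range(1, A_pi.shape[0])]
```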

  10. Distribution on Graphs → Distribution on Sequences
Instead of learning p(G) directly, sample π ∼ Π to get observations of S^π, then learn p(S^π), modeled autoregressively:
p(G) = Σ_{S^π} p(S^π) 𝟙[f_G(S^π) = G]   (3)
Exploiting the sequential structure of S^π, decompose p(S^π):
p(S^π) = ∏_{i=1}^{n+1} p(S^π_i | S^π_1, ..., S^π_{i−1}) = ∏_{i=1}^{n+1} p(S^π_i | S^π_{<i})   (4)
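To make the factorization in Eq. (4) concrete, a hedged sketch of the resulting log-likelihood: assuming the model emits independent Bernoulli parameters theta[i] at each step (as in the GraphRNN-S variant introduced later), log p(S^π) is just the sum of per-edge log-probabilities. The names S, theta, and sequence_log_prob are illustrative.

```python
import numpy as np

def sequence_log_prob(S, theta, eps=1e-12):
    """S: list of 0/1 adjacency vectors; theta: matching list of edge probabilities."""
    logp = 0.0
    for s_i, th_i in zip(S, theta):  # one term per conditional p(S^pi_i | S^pi_<i)
        logp += np.sum(s_i * np.log(th_i + eps) + (1 - s_i) * np.log(1 - th_i + eps))
    return logp
```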

  11. Motivating GraphRNN
Model p(G): distribution over graphs
↓
Model p(S^π): distribution over sequences of edge connections
↓
Model p(S^π_i | S^π_{<i}): distribution over the i-th node's edge connections, conditioned on the previous nodes' edge connections; parameterize it with an expressive neural network

  12. GraphRNN Framework
Idea: use an RNN that consists of a state-transition function and an output function:
h_i = f_trans(h_{i−1}, S^π_{i−1})   (5)
θ_i = f_out(h_i)   (6)
• h_i ∈ ℝ^d encodes the state of the graph generated so far
• S^π_{i−1} encodes the adjacency for the most recently generated node i−1
• θ_i specifies the distribution of the next node's adjacency vector, S^π_i ∼ P_{θ_i}
• f_trans and f_out can be arbitrary neural networks
• P_{θ_i} can be an arbitrary distribution over binary vectors

  13. GraphRNN Framework (Corrected)
Idea: use an RNN that consists of a state-transition function and an output function:
h_i = f_trans(h_{i−1}, S^π_i)   (5)
θ_{i+1} = f_out(h_i)   (6)
• h_i ∈ ℝ^d encodes the state of the graph generated so far
• S^π_i encodes the adjacency for the most recently generated node i
• θ_{i+1} specifies the distribution of the next node's adjacency vector, S^π_{i+1} ∼ P_{θ_{i+1}}
• f_trans and f_out can be arbitrary neural networks
• P_{θ_{i+1}} can be an arbitrary distribution over binary vectors
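A minimal PyTorch sketch of Eqs. (5)-(6) under the corrected indexing, assuming f_trans is a GRU cell and f_out is an MLP with sigmoid output (the paper allows arbitrary networks). The sizes M and hidden and the all-zeros placeholders are illustrative.

```python
import torch
import torch.nn as nn

M, hidden = 32, 128                       # max adjacency-vector length, state size (illustrative)
f_trans = nn.GRUCell(input_size=M, hidden_size=hidden)
f_out = nn.Sequential(nn.Linear(hidden, M), nn.Sigmoid())

h_prev = torch.zeros(1, hidden)           # h_{i-1}; h_0 is the empty-graph state
S_i = torch.zeros(1, M)                   # S^pi_i: adjacency vector of the most recent node
h_i = f_trans(S_i, h_prev)                # Eq. (5): h_i = f_trans(h_{i-1}, S^pi_i)
theta_next = f_out(h_i)                   # Eq. (6): theta_{i+1} = f_out(h_i)
S_next = torch.bernoulli(theta_next)      # S^pi_{i+1} ~ P_{theta_{i+1}}
```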

  14. GraphRNN Framework (Corrected)
Idea: use an RNN that consists of a state-transition function and an output function:
h_i = f_trans(h_{i−1}, S^π_i)   (5)
θ_{i+1} = f_out(h_i)   (6)
S^π_{i+1} ∼ P_{θ_{i+1}}

  15. GraphRNN Inference Algorithm
Algorithm 1: GraphRNN inference algorithm
Input: RNN-based transition module f_trans, output module f_out, probability distribution P_{θ_i} parameterized by θ_i, start token SOS, end token EOS, empty graph state h′
Output: Graph sequence S^π
S^π_0 = SOS, h_0 = h′, i = 0
repeat
  i = i + 1
  h_i = f_trans(h_{i−1}, S^π_{i−1})   {update graph state}
  θ_i = f_out(h_i)
  S^π_i ∼ P_{θ_i}   {sample node i's edge connections}
until S^π_i is EOS
Return S^π = (S^π_1, ..., S^π_i)

  16. GraphRNN Inference Algorithm (Corrected)
Algorithm 1: GraphRNN inference algorithm
Input: RNN-based transition module f_trans, output module f_out, probability distribution P_{θ_i} parameterized by θ_i, start token SOS, end token EOS, empty graph state h′
Output: Graph sequence S^π
S^π_1 = SOS, h_0 = h′, i = 0
repeat
  i = i + 1
  h_i = f_trans(h_{i−1}, S^π_i)   {update graph state}
  θ_{i+1} = f_out(h_i)
  S^π_{i+1} ∼ P_{θ_{i+1}}   {sample node i+1's edge connections}
until S^π_{i+1} is EOS
Return S^π = (S^π_1, ..., S^π_i)
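Putting the corrected algorithm together, a hedged sketch of the inference loop, reusing f_trans and f_out modules like those in the earlier sketch. It assumes an all-ones SOS vector, treats an all-zeros sample as EOS, and caps generation at max_nodes; these conventions are illustrative, not the paper's exact choices.

```python
import torch

def graphrnn_inference(f_trans, f_out, M, hidden, max_nodes=100):
    h = torch.zeros(1, hidden)            # h_0 = h' (empty graph state)
    S = [torch.ones(1, M)]                # S^pi_1 = SOS (all-ones convention, an assumption)
    for _ in range(max_nodes):
        h = f_trans(S[-1], h)             # h_i = f_trans(h_{i-1}, S^pi_i)
        theta = f_out(h)                  # theta_{i+1} = f_out(h_i)
        s_next = torch.bernoulli(theta)   # S^pi_{i+1} ~ P_{theta_{i+1}}
        if s_next.sum() == 0:             # EOS convention: new node connects to nothing
            break
        S.append(s_next)
    return S                              # generated sequence S^pi
```

The returned list of sampled adjacency vectors can then be assembled into an adjacency matrix for the generated graph.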

  17. GraphRNN Variants
Objective: maximize ∏ p_model(S^π) over all observed graph sequences.
Implement f_trans as a Gated Recurrent Unit (GRU), but with different assumptions about p(S^π_i | S^π_{<i}) for each variant:
1. Multivariate Bernoulli (GraphRNN-S):
• f_out is an MLP with sigmoid activation that outputs θ_{i+1} ∈ ℝ^i
• θ_{i+1} parameterizes a multivariate Bernoulli; the entries of S^π_{i+1} ∼ P_{θ_{i+1}} are sampled independently
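Because the GraphRNN-S output is a set of independent Bernoullis, training this variant reduces to a binary cross-entropy between θ_{i+1} and the observed adjacency vector. A hedged sketch with stand-in tensors (the sizes and names are illustrative):

```python
import torch
import torch.nn.functional as F

theta_next = torch.rand(1, 32)                      # stand-in for f_out(h_i)
s_observed = torch.randint(0, 2, (1, 32)).float()   # observed adjacency vector S^pi_{i+1}
loss = F.binary_cross_entropy(theta_next, s_observed)  # per-step training loss
```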

  18. GraphRNN Variants
Objective: maximize ∏ p_model(S^π) over all observed graph sequences.
Implement f_trans as a Gated Recurrent Unit (GRU), but with different assumptions about p(S^π_i | S^π_{<i}) for each variant:
2. Dependent Bernoulli sequence (GraphRNN):
p(S^π_i | S^π_{<i}) = ∏_{j=1}^{i−1} p(S^π_{i,j} | S^π_{i,<j}, S^π_{<i})   (7)
• S^π_{i,j} ∈ {0,1} indicates whether node π(v_i) is connected to node π(v_j)
• f_out is an edge-level RNN that generates the edges of a given node
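A hedged sketch of the edge-level RNN in Eq. (7): each edge probability is conditioned on the edges already decided for node i (through the edge RNN's state) and on the earlier nodes (through the graph-level state h_i that initializes it). Module names and sizes are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn

graph_hidden, edge_hidden = 128, 16                       # illustrative sizes
edge_rnn = nn.GRUCell(input_size=1, hidden_size=edge_hidden)
edge_out = nn.Sequential(nn.Linear(edge_hidden, 1), nn.Sigmoid())
init_from_graph = nn.Linear(graph_hidden, edge_hidden)    # map graph state h_i to edge-RNN state

def sample_adjacency_vector(h_i, num_prev_nodes):
    """Sample S^pi_i one edge at a time, each conditioned on the edges already decided."""
    h_edge = init_from_graph(h_i)                 # conditioning on S^pi_<i comes through h_i
    prev_edge = torch.zeros(1, 1)                 # start token for the edge sequence
    edges = []
    for _ in range(num_prev_nodes):
        h_edge = edge_rnn(prev_edge, h_edge)
        p = edge_out(h_edge)                      # p(S^pi_{i,j} | S^pi_{i,<j}, S^pi_<i)
        prev_edge = torch.bernoulli(p)
        edges.append(prev_edge)
    return torch.cat(edges, dim=1)                # S^pi_i as a 0/1 row vector

# e.g. S_i = sample_adjacency_vector(torch.zeros(1, graph_hidden), num_prev_nodes=5)
```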

  19. Tractability via Breadth-First Search (BFS)
Idea: apply a BFS ordering to the graph G with node permutation π before generating the sequence S^π.
Benefits:
• Reduces the overall number of sequences to consider: we only need to train on all possible BFS orderings, rather than all possible node permutations
• Reduces the number of edge predictions: the edge-level RNN only predicts M edges, the maximum size of the BFS queue
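A minimal sketch of the BFS-ordering step, using networkx for brevity. It assumes G is connected and picks a random start node; both are assumptions of this sketch rather than a statement of the authors' exact procedure.

```python
import random
import networkx as nx

def bfs_ordering(G):
    """Return a node permutation pi given by breadth-first visit order (G assumed connected)."""
    start = random.choice(list(G.nodes()))
    order = [start] + [v for _, v in nx.bfs_edges(G, start)]
    return order   # used to relabel nodes before building the training sequence S^pi
```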

  20. BFS Order Leads to Fixed-Size S^π_i
S^π_i ∈ ℝ^M represents a "sliding window" over the nodes in the BFS queue.
Zero-pad all S^π_i to be length-M vectors:
S^π_i = (A^π_{max(1, i−M), i}, ..., A^π_{i−1, i})^T,   i ∈ {2, ..., n}   (9)
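A hedged sketch of Eq. (9): with BFS ordering, only the last (at most) M previously generated nodes can receive an edge, so each S^π_i is truncated to that window and zero-padded to a fixed length M. The names windowed_sequence and A_bfs are illustrative.

```python
import numpy as np

def windowed_sequence(A_bfs, M):
    """A_bfs: (n, n) adjacency matrix whose rows/columns are already in BFS order."""
    n = A_bfs.shape[0]
    seq = []
    for i in range(1, n):                       # node i+1 in the paper's 1-based indexing
        window = A_bfs[max(0, i - M):i, i]      # edges to the last (at most) M nodes
        padded = np.zeros(M)
        padded[:len(window)] = window           # zero-pad to the fixed length M
        seq.append(padded)
    return seq
```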

  21. Experiments

  22. Datasets
3 synthetic and 2 real graph datasets:
• Community (synthetic, 500 graphs, 60 ≤ |V| ≤ 160): 2-community Erdős–Rényi model (E-R)
• Grid (synthetic, 100 graphs, 100 ≤ |V| ≤ 400): standard 2D grid
• B-A (synthetic, 500 graphs, 100 ≤ |V| ≤ 200): Barabási–Albert model, each new node connects to 4 existing nodes
• Protein (real, 918 graphs, 100 ≤ |V| ≤ 500): amino-acid nodes, edge if ≤ 6 Angstroms apart
• Ego (real, 757 graphs, 50 ≤ |V| ≤ 399): document nodes, edges are citation relationships, from Citeseer

  23. Baseline Methods & Settings
Compared GraphRNN to traditional models and deep-learning baselines:
• Traditional: Erdős–Rényi model (E-R) (Erdős & Rényi, 1959); Barabási–Albert model (B-A) (Albert & Barabási, 2002); Kronecker graph models (Leskovec et al., 2010); mixed-membership stochastic block models (MMSB) (Airoldi et al., 2008)
• Deep learning: GraphVAE (Simonovsky & Komodakis, 2018); DeepGMG (Li et al., 2018)
Settings:
• 80%-20% train-test split
• All models trained with early stopping
• Traditional methods learn from a single graph, so a separate model is trained for each training graph in order to compare with these methods
• Deep-learning baselines use smaller datasets: Community-small (12 ≤ |V| ≤ 20) and Ego-small (4 ≤ |V| ≤ 18)

  24. Evaluating Generated Graphs via an MMD Metric
Existing approaches:
• Visual inspection
• Simple comparisons of average statistics between the two sets
Proposed: a metric based on Maximum Mean Discrepancy (MMD) that compares all moments of the empirical distributions, using an exponential kernel with the Wasserstein distance.
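A hedged sketch of such a metric: the (biased) squared-MMD estimator between two sets of per-graph statistics (e.g. degree sequences), with an exponential kernel whose distance is the first Wasserstein distance. The bandwidth sigma and the choice of statistic are assumptions of this sketch, not the paper's exact settings.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def kernel(x, y, sigma=1.0):
    """Exponential kernel over the first Wasserstein distance between 1-D samples."""
    return np.exp(-wasserstein_distance(x, y) / (2 * sigma ** 2))

def mmd_squared(X, Y, sigma=1.0):
    """X, Y: lists of 1-D statistic samples (e.g. degree sequences), one per graph."""
    k_xx = np.mean([kernel(a, b, sigma) for a in X for b in X])
    k_yy = np.mean([kernel(a, b, sigma) for a in Y for b in Y])
    k_xy = np.mean([kernel(a, b, sigma) for a in X for b in Y])
    return k_xx + k_yy - 2 * k_xy
```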
