SLIDE 1

CS224W: Machine Learning with Graphs. Jure Leskovec, Jiaxuan You, Stanford University

http://cs224w.stanford.edu

HW2 deadline postponed to next Thu, Oct 31! We are releasing an improved version of the starter code for HW2.Q4 -- keep an eye on Piazza!

SLIDE 2

Output: Node embeddings.

We can also embed larger network structures: subgraphs, whole graphs

SLIDE 3

¡ Key idea: Generate node embeddings based on local network neighborhoods

[Figure: input graph with a target node and its local neighborhood]

SLIDE 4

¡ Intuition: Nodes aggregate information from their neighbors using neural networks

[Figure: input graph with a target node; neural networks aggregate information from its neighborhood]

SLIDE 5

¡ Intuition: Network neighborhood defines a computation graph

Every node defines a computation graph based on its neighborhood!

SLIDE 6

Key idea: Generate node embeddings based on local network neighborhoods

§ Nodes aggregate “messages” from their neighbors using neural networks

¡ Graph Convolutional Neural Networks:

§ Basic variant: Average neighborhood information and stack neural networks

¡ GraphSAGE:

§ Generalized neighborhood aggregation

[Figure: neighbor embeddings $h_u^{k-1}$ are aggregated into the new embedding $h_v^k$]

SLIDE 7

Output: Vector embeddings

SLIDE 8

Output: Graph Structure!

SLIDE 9
  • 1. Problem of Graph Generation
  • 2. ML Basics for Graph Generation
  • 3. GraphRNN
  • 4. Applications and Open Questions


SLIDE 10

SLIDE 11

¡ We want to generate realistic graphs
¡ What is a good model?
¡ How can we fit the model and generate the graph using it?

[Figure: given a large real graph, generate a synthetic graph]

SLIDE 12

¡ Generation – gives insight into the graph formation process
¡ Anomaly detection – abnormal behavior, evolution
¡ Predictions – predicting the future from the past
¡ Simulations of novel graph structures
¡ Graph completion – many graphs are partially observed
¡ “What if” scenarios

SLIDE 13

Task 1: Realistic graph generation
¡ Generate graphs that are similar to a given set of graphs [focus of this lecture]

Task 2: Goal-directed graph generation
¡ Generate graphs that optimize given objectives/constraints
§ Drug molecule generation/optimization

SLIDE 14

Drug discovery

¡ Discover highly drug-like molecules

[Figure: a graph generative model outputs a molecule with drug_likeness=0.94]

SLIDE 15

Drug discovery

¡ Complete an existing molecule to optimize a desired property

[Figure: complete and improve a partial molecule, raising solubility from -5.55 to -1.78]

SLIDE 16

Discovering novel structures

[Figure: train GraphRNN on grid, community, and ego graphs]

SLIDE 17

Network Science

¡ Null models for realistic networks

[Figure: a real network approximated by null models, e.g., Barabasi_Albert(n=50, m=2) and NeuralNet_X(n=50, p=3, q=5)]

SLIDE 18

¡ Large and variable output space
§ For $n$ nodes we need to generate $n^2$ values
§ Graph size (nodes, edges) varies

[Figure: adjacency matrices; 5 nodes: 25 values, 1K nodes: 1M values]

SLIDE 19

¡ Non-unique representations:
§ An $n$-node graph can be represented in up to $n!$ ways
§ Hard to compute/optimize objective functions (e.g., reconstruction error)

[Figure: the same 5-node graph under two different node orderings yields very different adjacency matrices]

SLIDE 20

¡ Complex dependencies:
§ Edge formation has long-range dependencies

Example: Generate a ring graph on 6 nodes.

[Figure: partially built adjacency matrices for a 6-node ring; the next entry should have an edge in one case and shouldn't in the other]

Existence of an edge may depend on the entire graph!

SLIDE 21
  • 1. Problem of Graph Generation
  • 2. ML Basics for Graph Generation
  • 3. GraphRNN
  • 4. Applications and Open Questions


SLIDE 22

SLIDE 23

¡ Given: Graphs sampled from $p_{data}(G)$
¡ Goal:
§ Learn the distribution $p_{model}(G)$
§ Sample from $p_{model}(G)$

[Figure: learn $p_{model}(G)$ from samples of $p_{data}(G)$, then sample from it]

SLIDE 24

Setup:

¡ Assume we want to learn a generative model from a set of data points (i.e., graphs) $\{x_i\}$
§ $p_{data}(x)$ is the data distribution, which is never known to us, but we have samples $x_i \sim p_{data}(x)$
§ $p_{model}(x; \theta)$ is the model, parametrized by $\theta$, that we use to approximate $p_{data}(x)$

¡ Goal:
§ (1) Make $p_{model}(x; \theta)$ close to $p_{data}(x)$
§ (2) Make sure we can sample from $p_{model}(x; \theta)$, i.e., generate examples (graphs) from it

SLIDE 25

(1) Make $p_{model}(x; \theta)$ close to $p_{data}(x)$

¡ Key principle: Maximum Likelihood
¡ The fundamental approach to modeling distributions:
§ Find parameters $\theta^*$ such that, for observed data points $x_i \sim p_{data}$, the log-likelihood $\sum_i \log p_{model}(x_i; \theta^*)$ is highest among all possible choices of $\theta$
§ That is, find the model that is most likely to have generated the observed data
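To make the maximum-likelihood principle concrete, here is a minimal sketch (not from the slides): it fits the single parameter of a Bernoulli model to observed binary data by gradient ascent on $\sum_i \log p_{model}(x_i; \theta)$. The data, model family, and learning rate are all illustrative assumptions.

```python
import numpy as np

# Observed data points x_i ~ p_data (binary outcomes, made up for illustration)
x = np.array([1, 0, 1, 1, 1, 0, 1, 1])

# Model: p_model(x; theta) = theta^x * (1 - theta)^(1 - x), a Bernoulli
theta = 0.5                      # initial guess
lr = 0.1                         # learning rate (illustrative)
for _ in range(200):
    # Gradient of sum_i log p_model(x_i; theta) with respect to theta
    grad = np.sum(x / theta - (1 - x) / (1 - theta))
    theta += lr * grad / len(x)  # gradient ascent on the log-likelihood
    theta = np.clip(theta, 1e-6, 1 - 1e-6)

print(theta)  # converges to the sample mean, the MLE for a Bernoulli
```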

SLIDE 26

(2) Sample from $p_{model}(x; \theta)$

¡ Goal: Sample from a complex distribution
¡ The most common approach:
§ (1) Sample from a simple noise distribution: $z_i \sim N(0, 1)$
§ (2) Transform the noise via $f(\cdot)$: $x_i = f(z_i; \theta)$. Then $x_i$ follows a complex distribution
¡ Q: How to design $f(\cdot)$?
¡ A: Use a deep neural network, and train it using the data we have!
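A minimal sketch of the noise-transformation recipe, assuming a small feed-forward network as $f(\cdot; \theta)$; the architecture and sizes are illustrative, and the network is untrained here, so this only shows the sampling mechanics.

```python
import torch
import torch.nn as nn

# f(.; theta): a small neural network that transforms noise into samples.
# Architecture and sizes are illustrative, not from the lecture.
f = nn.Sequential(
    nn.Linear(1, 32), nn.ReLU(),
    nn.Linear(32, 1),
)

z = torch.randn(1000, 1)   # (1) sample from a simple noise distribution N(0, 1)
x = f(z)                   # (2) transform the noise: x_i = f(z_i; theta)
# After training theta on data, x would follow the complex target distribution.
print(x.mean().item(), x.std().item())
```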

SLIDE 27

[Goodfellow, NeurIPS 2016]

Taxonomy of deep generative models. This lecture: auto-regressive models.

An autoregressive (AR) model predicts future behavior based on past behavior.

SLIDE 28

Auto-regressive models

¡ $p_{model}(x; \theta)$ is used for both density estimation and sampling (from the probability density)
§ (Other models, like Variational Auto-Encoders (VAEs) and Generative Adversarial Nets (GANs), have 2 or more models, each playing one of the roles)
§ Apply the chain rule: the joint distribution is a product of conditional distributions:

$$p_{model}(x; \theta) = \prod_{t=1}^{n} p_{model}(x_t \mid x_1, \ldots, x_{t-1}; \theta)$$

§ E.g., if $x$ is a vector, $x_t$ is its $t$-th dimension; if $x$ is a sentence, $x_t$ is its $t$-th word
§ In our case: $x_t$ will be the $t$-th action (add node, add edge)
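The chain-rule factorization can be evaluated directly: the joint log-probability of a sequence is the sum of the conditional log-probabilities. A toy sketch, with a hypothetical hand-written conditional model standing in for $p_{model}$:

```python
import math

def log_p_conditional(x_t, history):
    """Hypothetical p_model(x_t | x_1..x_{t-1}); a toy rule for illustration:
    the next binary value is 1 with prob. 0.8 if the history sum is even, else 0.3."""
    p1 = 0.8 if sum(history) % 2 == 0 else 0.3
    return math.log(p1 if x_t == 1 else 1.0 - p1)

def log_p_joint(xs):
    # log p(x) = sum_t log p(x_t | x_1, ..., x_{t-1})
    return sum(log_p_conditional(x_t, xs[:t]) for t, x_t in enumerate(xs))

print(log_p_joint([1, 0, 1, 1]))
```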

SLIDE 29
  • 1. Problem of Graph Generation
  • 2. ML Basics for Graph Generation
  • 3. GraphRNN
  • 4. Applications and Open Questions


SLIDE 30

GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models. J. You, R. Ying, X. Ren, W. L. Hamilton, J. Leskovec. International Conference on Machine Learning (ICML), 2018.

SLIDE 31

Generating graphs via sequentially adding nodes and edges

[You et al., ICML 2018]

[Figure: graph $G$ and its sequential generation process $S^{\pi}$, adding nodes 1 through 5 one at a time]

SLIDE 32

Graph $G$ with node ordering $\pi$ can be uniquely mapped into a sequence of node and edge additions $S^{\pi}$.

[Figure: graph $G$ with node ordering $\pi$ mapped to the sequence $S^{\pi} = (S_1^{\pi}, S_2^{\pi}, S_3^{\pi}, S_4^{\pi}, S_5^{\pi})$]

SLIDE 33

The sequence $S^{\pi}$ has two levels ($S^{\pi}$ is a sequence of sequences):
§ Node-level: add nodes, one at a time
§ Edge-level: add edges between existing nodes

¡ Node-level: at each step, a new node is added

[Figure: the node-level steps $S_1^{\pi}$ (“Add node 1”) through $S_5^{\pi}$ (“Add node 5”)]

SLIDE 34

The sequence $S^{\pi}$ has two levels:
¡ Each node-level step is an edge-level sequence
¡ Edge-level: at each step, add a new edge

[Figure: node-level step $S_4^{\pi}$ is the edge-level sequence $S_4^{\pi} = (S_{4,1}^{\pi}, S_{4,2}^{\pi}, S_{4,3}^{\pi})$: “Don't connect 4 and 1”, “Connect 4 and 2”, “Connect 4 and 3”]

SLIDE 35

¡ Summary: a graph + a node ordering = a sequence of sequences!
¡ The node ordering is randomly selected (we will come back to this)

[Figure: graph $G$, its adjacency matrix, and the corresponding node-level and edge-level sequences]

SLIDE 36

¡ We have transformed the graph generation problem into a sequence generation problem
¡ We need to model two processes:
§ Generate a state for a new node (node-level sequence)
§ Generate edges for the new node based on its state (edge-level sequence)
¡ Approach: use RNNs to model these processes!

SLIDE 37

¡ GraphRNN has a node-level RNN and an edge-level RNN
¡ Relationship between the two RNNs:
§ The node-level RNN generates the initial state for the edge-level RNN
§ The edge-level RNN generates edges for the new node, then updates the node-level RNN state using the generated results

SLIDE 38

[Figure: the node-level RNN generates the initial state for the edge-level RNN; the edge-level RNN generates edges for the new node, then updates the node-level RNN state using the generated results]

SLIDE 39

[Figure: same diagram as the previous slide: the node-level RNN initializes the edge-level RNN, which generates the new node's edges]

Next: How to generate a sequence with RNN?

SLIDE 40

¡ $s_t$: state of the RNN after time $t$
¡ $x_t$: input to the RNN at time $t$
¡ $y_t$: output of the RNN at time $t$
¡ $W, U, V$: parameter matrices; $\sigma(\cdot)$: a non-linearity
¡ More expressive cells: GRU, LSTM, etc.

RNN cell: (1) $s_t = \sigma(W \cdot x_t + U \cdot s_{t-1})$; (2) $y_t = V \cdot s_t$
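A direct NumPy transcription of the two cell equations, with illustrative dimensions and tanh as the non-linearity; in practice one would use the GRU/LSTM cells mentioned above from a deep-learning library.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_state, d_out = 4, 8, 4            # sizes are illustrative

W = rng.normal(size=(d_state, d_in))      # input-to-state weights
U = rng.normal(size=(d_state, d_state))   # state-to-state weights
V = rng.normal(size=(d_out, d_state))     # state-to-output weights
sigma = np.tanh                           # the non-linearity sigma(.), tanh as one choice

def rnn_cell(x_t, s_prev):
    s_t = sigma(W @ x_t + U @ s_prev)     # (1) s_t = sigma(W.x_t + U.s_{t-1})
    y_t = V @ s_t                         # (2) y_t = V.s_t
    return s_t, y_t

s = np.zeros(d_state)                     # initial state s_0
for x in rng.normal(size=(5, d_in)):      # run the cell over a length-5 input
    s, y = rnn_cell(x, s)
```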

slide-41
SLIDE 41

¡ Q: How can we use an RNN to generate sequences?
¡ A: Let $x_{t+1} = y_t$!
¡ Q: How do we initialize $s_0$ and $x_1$? When do we stop generating?
¡ A: Use start/end-of-sequence tokens (SOS, EOS), e.g., the zero vector
¡ This is good, but the model is deterministic

[Figure: unrolled RNN; $x_1 = \text{SOS}$, $s_0 = \text{SOS}$, each output is fed back as the next input ($x_2 = y_1$, $x_3 = y_2$, ...), and generation stops when $y_n = \text{EOS}$]

SLIDE 42

¡ Remember our goal: use the RNN to model $\prod_{t=1}^{n} p_{model}(x_t \mid x_1, \ldots, x_{t-1}; \theta)$
¡ Let $y_t = p_{model}(x_t \mid x_1, \ldots, x_{t-1}; \theta)$
¡ Then $x_{t+1}$ is a sample from $y_t$: $x_{t+1} \sim y_t$
§ Each step of the RNN outputs a probability vector
§ We then sample from that vector, and feed the sample to the next step

[Figure: unrolled RNN; $x_1 = \text{SOS}$, then $x_2 \sim y_1$, $x_3 \sim y_2$, ...]
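A minimal sketch of this sampling loop, assuming (as on the next slide) that each output parameterizes a Bernoulli distribution over the next binary value; the network sizes and the stop rule standing in for EOS are illustrative.

```python
import torch
import torch.nn as nn

cell = nn.RNNCell(input_size=1, hidden_size=16)  # sizes are illustrative
head = nn.Linear(16, 1)                          # maps state s_t to output y_t

def generate(max_len=20):
    s = torch.zeros(1, 16)          # s_0 = SOS (zero vector)
    x = torch.zeros(1, 1)           # x_1 = SOS (zero vector)
    seq = []
    for _ in range(max_len):
        s = cell(x, s)
        y = torch.sigmoid(head(s))  # y_t: probability that the next value is 1
        x = torch.bernoulli(y)      # sample x_{t+1} ~ y_t and feed it back
        seq.append(int(x.item()))
        # Illustrative stand-in for an EOS token: stop after a run of zeros
        if len(seq) >= 3 and seq[-3:] == [0, 0, 0]:
            break
    return seq

print(generate())
```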

SLIDE 43

Suppose we have already trained the model:
§ $y_t$ follows a Bernoulli distribution (our choice of $p_{model}$)
§ $y_t = p$ means value 1 has probability $p$, value 0 has probability $1 - p$

¡ Right now everything is generated by the model
¡ How do we use the training data $x_1, x_2, \ldots, x_n$?

[Figure: unrolled RNN at inference time; outputs $y_1 = 0.9$, $y_2 = 0.4$, $y_3 = 0.7$, with each next input sampled from the previous output ($x_2 \sim y_1$, $x_3 \sim y_2$)]

SLIDE 44

Training the model

¡ We observe a sequence $y^*$ of edges, e.g., $[1, 0, \ldots]$
¡ Principle: Teacher Forcing -- replace the input and output by the real sequence

[Figure: unrolled RNN at training time; the ground-truth values $y_1^*, y_2^*, y_3^*$ replace the sampled inputs ($x_2 = y_1^*$, $x_3 = y_2^*$), and each prediction ($y_1 = 0.9$, $y_2 = 0.4$, $y_3 = 0.7$) is compared against the corresponding $y_t^*$ to compute the loss]

SLIDE 45

¡ Loss $L$: binary cross entropy
¡ Minimize:

$$L = -\left[ y_1^* \log(y_1) + (1 - y_1^*) \log(1 - y_1) \right]$$

¡ If $y_1^* = 1$, we minimize $-\log(y_1)$, pushing $y_1$ higher
¡ If $y_1^* = 0$, we minimize $-\log(1 - y_1)$, pushing $y_1$ lower
¡ This way, $y_1$ fits the data samples $y_1^*$
¡ Reminder: $y_1$ is computed by the RNN, so this loss adjusts the RNN parameters accordingly, via backpropagation!

[Figure: prediction $y_1 = 0.9$ compared against ground truth $y_1^* = 1$ to compute the loss]
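A minimal teacher-forcing training step under the same assumptions: the observed sequence $y^*$ replaces the sampled inputs, and binary cross entropy compares each prediction $y_t$ against $y_t^*$. Sizes, optimizer, and the example sequence are illustrative.

```python
import torch
import torch.nn as nn

cell = nn.RNNCell(input_size=1, hidden_size=16)   # illustrative sizes
head = nn.Linear(16, 1)
opt = torch.optim.Adam(list(cell.parameters()) + list(head.parameters()), lr=1e-2)
bce = nn.BCELoss()

y_star = torch.tensor([1.0, 0.0, 1.0])            # observed edge sequence (illustrative)

s = torch.zeros(1, 16)                            # s_0 = SOS
x = torch.zeros(1, 1)                             # x_1 = SOS
loss = 0.0
for t in range(len(y_star)):
    s = cell(x, s)
    y = torch.sigmoid(head(s))                    # prediction y_t in (0, 1)
    loss = loss + bce(y, y_star[t].view(1, 1))    # -[y* log y + (1 - y*) log(1 - y)]
    x = y_star[t].view(1, 1)                      # teacher forcing: feed the REAL value

opt.zero_grad()
loss.backward()                                   # backprop through time
opt.step()
```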

SLIDE 46

[Figure: observed graph; assuming Node 1 is in the graph, the node RNN (fed SOS) now adds Node 2]

SLIDE 47

[Figure: the edge RNN predicts how Node 2 connects to Node 1 (probability 0.5)]

SLIDE 48

[Figure: the edge RNN gets supervision from the ground truth (the true edge value is 1)]

SLIDE 49

[Figure: the new edges are used to update the node RNN]

SLIDE 50

[Figure: the edge RNN predicts how Node 3 connects to Node 2]

SLIDE 51

[Figure: the edge RNN gets supervision from the ground truth]

SLIDE 52

[Figure: the new edges are used to update the node RNN]

SLIDE 53

[Figure: Node 4 doesn't connect to any nodes, so generation stops]

SLIDE 54

[Figure: backprop through time; all gradients are accumulated across time steps]

SLIDE 55

[Figure: at test time, the ground truth is replaced by GraphRNN's own sampled predictions]
slide-56
SLIDE 56

Quick summary of GraphRNN:
§ Generate a graph by generating a two-level sequence
§ Use RNNs to generate the sequences

¡ Next: making GraphRNN tractable, and proper evaluation

[Figure: graph $G$, its adjacency matrix, and the node-level and edge-level RNNs]

SLIDE 57

¡ Any node can connect to any prior node
¡ Too many steps for edge generation:
§ Need to generate the full adjacency matrix
§ Complex, too-long edge dependencies

[Figure: under a random node ordering, Node 5 may connect to any/all previous nodes]

How do we limit this complexity?

“Recipe” to generate the left graph:

  • Add node 1
  • Add node 2
  • Add node 3
  • Connect 3 with 1 and 2
  • Add node 4
SLIDE 58

¡ Breadth-First Search (BFS) node ordering
¡ With a BFS node ordering:
§ Since Node 4 doesn't connect to Node 1,
§ we know all of Node 1's neighbors have already been traversed,
§ so Node 5 and all following nodes will never connect to Node 1
§ We only need memory of 2 “steps” rather than $n - 1$ steps

[Figure: the same graph under a BFS ordering]

“Recipe” to generate the left graph:

  • Add node 1
  • Add node 2
  • Connect 2 with 1
  • Add node 3
  • Connect 3 with 1
  • Add node 4
  • Connect 4 with 2 and 3
SLIDE 59

¡ Breadth-First Search node ordering
¡ Benefits:
§ Reduces the possible node orderings: from $O(n!)$ to the number of distinct BFS orderings
§ Reduces the number of steps for edge generation: fewer previous nodes to look at

[Figure: BFS node ordering; Node 5 will never connect to Node 1, so we only need memory of 2 “steps” rather than $n - 1$ steps]
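A small sketch of why BFS ordering helps, using networkx: compare how far back in the ordering any edge has to reach under a random ordering versus a BFS ordering. The example graph is an arbitrary choice; under BFS the maximum "look-back" is typically far smaller than $n - 1$.

```python
import random
import networkx as nx

G = nx.grid_2d_graph(5, 5)                     # an example graph (illustrative)
G = nx.convert_node_labels_to_integers(G)

def max_lookback(G, order):
    """Largest gap, over all edges, between a node's position and a neighbor's."""
    pos = {v: i for i, v in enumerate(order)}
    return max(abs(pos[u] - pos[v]) for u, v in G.edges())

random_order = list(G.nodes())
random.shuffle(random_order)

bfs_order = [0] + [v for _, v in nx.bfs_edges(G, source=0)]  # BFS visit order

print("random ordering look-back:", max_lookback(G, random_order))
print("BFS ordering look-back:   ", max_lookback(G, bfs_order))
```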

SLIDE 60

¡ BFS reduces the number of steps for edge generation

[Figure: adjacency matrices under a random ordering vs. a BFS ordering]
slide-61
SLIDE 61

¡ Task: Compare two sets of graphs
¡ Goal: Define similarity metrics for graphs
¡ Challenge: There is no efficient graph isomorphism test that can be applied to any class of graphs!
¡ Solution:
§ Visual similarity
§ Graph-statistics similarity

[Figure: two graphs; how similar are they?]
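A minimal sketch of statistics-based comparison. (The GraphRNN paper evaluates with MMD between distributions of statistics such as degree; the simpler total-variation distance between degree histograms used here is an illustrative stand-in.)

```python
import numpy as np
import networkx as nx

def degree_histogram(G, max_deg=20):
    """Normalized degree histogram, padded/truncated to a fixed length."""
    counts = np.zeros(max_deg)
    for _, d in G.degree():
        counts[min(d, max_deg - 1)] += 1
    return counts / counts.sum()

def degree_tv_distance(G1, G2):
    # Total variation distance between the two degree distributions
    h1, h2 = degree_histogram(G1), degree_histogram(G2)
    return 0.5 * np.abs(h1 - h2).sum()

real = nx.barabasi_albert_graph(100, 2, seed=0)       # stand-ins for a real graph
generated = nx.erdos_renyi_graph(100, 0.04, seed=0)   # and a generated graph
print(degree_tv_distance(real, generated))
```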

SLIDE 62

SLIDE 63

SLIDE 64

SLIDE 65
  • 1. Problem of Graph Generation
  • 2. ML Basics for Graph Generation
  • 3. GraphRNN
  • 4. Applications and Open Questions


SLIDE 66

Question: Can we learn a model that can generate valid and realistic molecules with high value of a given chemical property?

[Figure: the model outputs a molecule that optimizes a given property, e.g., drug_likeness=0.95] [You et al., NeurIPS 2018]

Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation. J. You, B. Liu, R. Ying, V. Pande, J. Leskovec. Neural Information Processing Systems (NeurIPS), 2018.

SLIDE 67

Generating graphs that:

¡ Optimize a given objective (High scores)

§ e.g., drug-likeness (black box)

¡ Obey underlying rules (Valid)

§ e.g., chemical valency rules

¡ Are learned from examples (Realistic)

§ e.g., imitating a molecule graph dataset


SLIDE 68

Graph Convolutional Policy Network (GCPN) combines graph representation + RL:

¡ A graph neural network captures complex structural information, and enables a validity check in each state transition (Valid)
¡ Reinforcement learning optimizes intermediate/final rewards (High scores)
¡ Adversarial training imitates examples in the given datasets (Realistic)


SLIDE 69

Visualization of GCPN graphs: Generate graphs with high property scores


SLIDE 70

Visualization of GCPN graphs: Edit given graph for higher property scores

[Figure: starting structure and finished structure]

SLIDE 71

¡ Generating graphs in other domains
§ 3D shapes, point clouds, scene graphs, etc.
¡ Scaling up to large graphs
§ Hierarchical action space, allowing high-level actions like adding a whole structure at a time
¡ Other applications: anomaly detection
§ Use generative models to estimate the probability of real graphs vs. fake graphs

SLIDE 72
  • 1. Problem of Graph Generation
  • 2. ML Basics for Graph Generation
  • 3. GraphRNN
  • 4. Applications and Open Questions
