

SLIDE 1

CSE 6240: Web Search and Text Mining. Spring 2020

Graph Neural Networks

  • Prof. Srijan Kumar

http://cc.gatech.edu/~srijan

SLIDE 2

Today’s Lecture

  • Introduction to deep graph embeddings
  • Graph convolution networks
  • GraphSAGE

SLIDE 3

Goal: Node Embeddings

similarity(u, v) ≈ z_v^T z_u

Goal: we need to define the similarity function!

[Figure: nodes of the input network mapped into a d-dimensional embedding space]

SLIDE 4

Deep Graph Encoders

  • Encoder: Map a node to a low-dimensional vector:

    enc(v) = z_v

  • Deep encoder methods are based on graph neural networks:

    enc(v) = multiple layers of non-linear transformations of the graph structure

  • The graph encoder idea is inspired by CNNs on images

(Animation: Vincent Dumoulin)

[Figure: convolution on a regular image grid vs. message passing on an irregular graph]

SLIDE 5

Idea from Convolutional Networks

  • In a CNN, a pixel's representation is created by transforming the representations of its neighboring pixels
    – In a GNN, a node's representation is created by transforming the representations of its neighboring nodes
  • But graphs are irregular, unlike images
    – So, generalize convolutions beyond simple lattices, and leverage node features/attributes
  • Solution: deep graph encoders

SLIDE 6

Deep Graph Encoders

Output: node embeddings; the same machinery can also embed larger network structures, subgraphs, and entire graphs

  • Once an encoder is defined, multiple layers of encoders can be stacked

SLIDE 7

Graph Encoder: A Naïve Approach

  • Join the adjacency matrix and the node features
  • Feed them into a deep neural network
  • Issues with this idea (made concrete in the sketch below):
    – O(|V|) parameters
    – Not applicable to graphs of different sizes
    – Not invariant to node ordering

[Figure: 5-node graph (A–E), its adjacency matrix concatenated with node features, fed into a neural network]
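A toy sketch of the naïve encoder (my own setup, not code from the slides) that makes the last two issues concrete: the input width is tied to |V|, and permuting the node order changes the output.

```python
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[0, 1, 1, 1, 0],          # adjacency matrix of a 5-node graph
              [1, 0, 1, 0, 0],
              [1, 1, 0, 0, 1],
              [1, 0, 0, 0, 0],
              [0, 0, 1, 0, 0]])
X = rng.normal(size=(5, 2))             # 2 features per node

inp = np.hstack([A, X])                  # shape (5, 5 + 2): width depends on |V|!
W = rng.normal(size=(inp.shape[1], 4))   # O(|V|) parameters per hidden unit
H = np.maximum(0, inp @ W)               # one ReLU MLP layer

# Problem 1: a 6-node graph yields rows of length 6 + 2, so W no longer fits.
# Problem 2: permuting the node order permutes the columns of A, changing the
# output even though the graph itself is unchanged.
perm = [1, 0, 2, 3, 4]
A_perm = A[perm][:, perm]
inp_perm = np.hstack([A_perm, X[perm]])
H_perm = np.maximum(0, inp_perm @ W)
print(np.allclose(H[perm], H_perm))      # False: not invariant to node ordering
```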

SLIDE 8

Graph Encoders: Two Instantiations

  • 1. Graph convolutional networks (GCN): one of the first frameworks to learn node embeddings in an end-to-end manner
    – Different from random walk methods, which are not end-to-end
  • 2. GraphSAGE: generalizes GCNs to various neighborhood aggregations

SLIDE 9

Today’s Lecture

  • Introduction to deep graph embeddings
  • Graph convolution networks (GCN)
  • GraphSAGE

Main paper: “Semi-Supervised Classification with Graph Convolutional Networks”, Kipf and Welling, ICLR 2017

SLIDE 10

Content

  • Local network neighborhoods:
    – Describe aggregation strategies
    – Define computation graphs
  • Stacking multiple layers:
    – Describe the model, parameters, and training
    – How to fit the model?
    – Simple examples of unsupervised and supervised training

SLIDE 11

Setup

  • Assume we have a graph G:
    – V is the vertex set
    – A is the adjacency matrix (assume binary)
    – X ∈ ℝ^{m×|V|} is a matrix of node features
      » Social networks: user profiles, user images
      » Biological networks: gene expression profiles
      » If there are no features, use indicator vectors (one-hot encoding of a node) or a vector of constant 1s: [1, 1, …, 1]
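A minimal sketch of this setup (variable names are my own; rows index nodes here, whereas the slide writes X as m×|V| with nodes as columns):

```python
import numpy as np

# Binary adjacency matrix of a toy 3-node path graph (|V| = 3).
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)

# Node features: one row per node. If no real features exist, fall back to
# indicator vectors (one-hot) or a constant-1 feature, as the slide suggests.
X = np.eye(A.shape[0])                  # indicator vectors (one-hot per node)
# X = np.ones((A.shape[0], 1))          # alternative: vector of constant 1s
```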

SLIDE 12

Graph Convolutional Networks

  • Idea: Generate node embeddings based on local network neighborhoods
    – A node's neighborhood defines its computation graph
  • Learn how to aggregate information from the neighborhood to learn node embeddings
    – Transform information from the neighbors and combine it:
      » Transform "messages" h_i from the neighbors: W_i h_i
      » Combine them, e.g., add them up: Σ_i W_i h_i

SLIDE 13

Idea: Aggregate Neighbors

  • Intuition: Generate node embeddings based on local network neighborhoods
  • Nodes aggregate information from their neighbors using neural networks

[Figure: input graph with target node A; A's computation graph aggregates messages from its neighbors B, C, D via neural network boxes, which in turn aggregate from their own neighbors]

SLIDE 14

Idea: Aggregate Neighbors

  • Intuition: The network neighborhood defines a computation graph

Every node defines a computation graph based on its neighborhood

SLIDE 15

Deep Model: Many Layers

  • The model can be of arbitrary depth:
    – Nodes have embeddings at each layer
    – The layer-0 embedding of node v is its input feature, x_v
    – The layer-K embedding gets information from nodes that are at most K hops away

[Figure: A's computation graph annotated by layer: layer-0 inputs x_A, x_B, x_C, x_E, x_F feed layer-1 aggregations, which feed the layer-2 output for A]

SLIDE 16

Neighborhood Aggregation

  • Neighborhood aggregation: the key distinction between approaches is how they aggregate information across the layers

[Figure: A's computation graph with the aggregation boxes left blank: what is in the box?]

SLIDE 17

Neighborhood Aggregation

  • Basic approach: Average information from the neighbors and apply a neural network

[Figure: in each box of A's computation graph, (1) average messages from the neighbors, then (2) apply a neural network]

SLIDE 18

The Math: Deep Encoder

  • Basic approach: Average neighbor messages and apply a neural network
    – Note: Apply L2 normalization to each node embedding at every layer

    h_v^0 = x_v                     (initial 0-th layer embeddings are the node features)

    h_v^k = σ( W_k Σ_{u∈N(v)} h_u^{k-1} / |N(v)| + B_k h_v^{k-1} ),   ∀k ∈ {1, …, K}

    z_v = h_v^K                     (embedding after K layers of neighborhood aggregation)

  Here σ is a non-linearity (e.g., ReLU), the sum is the average of the neighbors' previous-layer embeddings, and B_k h_v^{k-1} carries over v's own previous-layer embedding.
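A minimal NumPy sketch of this update (shapes, names, and the random toy graph are my own assumptions, not code from the lecture):

```python
import numpy as np

def gcn_layer(H_prev, A, W, B):
    """One layer of the update above.

    H_prev: (n, d_in) previous-layer embeddings h^{k-1}
    A:      (n, n) binary adjacency matrix
    W, B:   (d_in, d_out) trainable weights for the neighbor and self terms
    """
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1)   # |N(v)| per node
    neigh_mean = (A @ H_prev) / deg                     # average neighbor h^{k-1}
    H = np.maximum(0, neigh_mean @ W + H_prev @ B)      # sigma = ReLU
    # L2-normalize each node embedding at every layer, as the slide notes
    return H / np.maximum(np.linalg.norm(H, axis=1, keepdims=True), 1e-12)

rng = np.random.default_rng(0)
A = (rng.random((6, 6)) < 0.4).astype(float)
A = np.triu(A, 1); A = A + A.T           # random undirected toy graph
H = rng.normal(size=(6, 3))              # h^0 = node features x_v
for _ in range(2):                       # K = 2 layers (weights would be learned)
    H = gcn_layer(H, A, rng.normal(size=(3, 3)), rng.normal(size=(3, 3)))
z = H                                    # z_v = h_v^K
```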

SLIDE 19

GCN: Matrix Form

  • H^(l) is the matrix of node representations at the l-th layer
  • W_0^(l) and W_1^(l) are the matrices to be learned at each layer
  • A = adjacency matrix, D = diagonal degree matrix
  • The GCN update rewritten in matrix form (D^{-1}A computes the neighbor average, matching the per-node update above):

    H^(l+1) = σ( D^{-1} A H^(l) W_0^(l) + H^(l) W_1^(l) )
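The same update vectorized, as a sketch (my reconstruction; up to naming, W_0 plays the role of W_k and W_1 the role of B_k from the per-node form):

```python
import numpy as np

def gcn_layer_matrix(H, A, W0, W1):
    """H: (n, d) layer-l embeddings; returns the layer-(l+1) embeddings."""
    D_inv = np.diag(1.0 / np.maximum(A.sum(axis=1), 1))  # inverse degree matrix
    return np.maximum(0, D_inv @ A @ H @ W0 + H @ W1)    # sigma = ReLU
```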

SLIDE 20

Training the Model

  • How do we train the model?

– Need to define a loss function on the embeddings

SLIDE 21

Model Parameters

  • We can feed these embeddings into any loss function and run stochastic gradient descent to train the weight parameters
    – Once we have the weight matrices, we can calculate the node embeddings

    h_v^0 = x_v

    h_v^k = σ( W_k Σ_{u∈N(v)} h_u^{k-1} / |N(v)| + B_k h_v^{k-1} ),   ∀k ∈ {1, …, K}

    z_v = h_v^K

  W_k and B_k are the trainable weight matrices (i.e., what we learn).

SLIDE 22

Unsupervised Training

  • Training can be unsupervised or supervised
  • Unsupervised training:
    – Use only the graph structure: "similar" nodes have similar embeddings
    – A common unsupervised loss function is edge existence (sketched below)
  • The unsupervised loss function can be anything from the last section, e.g., a loss based on:
    – Node proximity in the graph
    – Random walks
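A hedged sketch of an edge-existence loss (my formulation, one of several reasonable choices): score node pairs by the dot product z_u^T z_v, and apply binary cross-entropy with observed edges as positives and sampled non-edges as negatives.

```python
import numpy as np

def edge_existence_loss(Z, pos_pairs, neg_pairs):
    """Z: (n, d) node embeddings; pos/neg_pairs: lists of (u, v) index pairs."""
    def bce(u, v, label):
        p = 1.0 / (1.0 + np.exp(-(Z[u] @ Z[v])))   # sigmoid of dot-product score
        p = np.clip(p, 1e-9, 1.0 - 1e-9)
        return -(label * np.log(p) + (1 - label) * np.log(1 - p))
    total = sum(bce(u, v, 1.0) for u, v in pos_pairs)   # edges should score high
    total += sum(bce(u, v, 0.0) for u, v in neg_pairs)  # non-edges should score low
    return total / (len(pos_pairs) + len(neg_pairs))
```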

SLIDE 23

Supervised Training

  • Train the model for a supervised task (e.g., node classification)
  • Two ways to define the total loss:
    – Total loss = supervised loss
    – Total loss = supervised loss + unsupervised loss

E.g., is a node normal or anomalous?
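A sketch of the supervised route (assumed setup: a linear classifier on top of the embeddings, with cross-entropy over labeled nodes, e.g., normal vs. anomalous):

```python
import numpy as np

def supervised_loss(Z, labels, W_cls):
    """Z: (n, d) embeddings of labeled nodes; labels: (n,) int class ids;
    W_cls: (d, n_classes) classifier weights trained jointly with the GNN."""
    logits = Z @ W_cls
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()

# The second option on the slide would combine both objectives:
# total_loss = supervised_loss(...) + edge_existence_loss(...)
```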

SLIDE 24

Model Design: Overview

(1) Define a neighborhood aggregation function
(2) Define a loss function on the embeddings

SLIDE 25

Model Design: Overview

(3) Train on a set of nodes, i.e., a batch of compute graphs

SLIDE 26

Model Design: Overview

(4) Generate embeddings for nodes as needed, even for nodes we never trained on!

SLIDE 27

GCN: Inductive Capability

  • The same aggregation parameters are shared across all nodes:
    – The number of model parameters is sublinear in |V|, and we can generalize to unseen nodes

[Figure: the compute graphs for nodes A and B in the input graph share the same parameters W_k and B_k]

SLIDE 28

Inductive Capability: New Nodes

[Figure: train on a snapshot of the graph; when a new node u arrives, generate its embedding z_u on the fly]

  • Many application settings constantly encounter previously unseen nodes
    – E.g., Reddit, YouTube, Google Scholar
  • Need to generate new embeddings "on the fly" (see the sketch below)
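A one-layer sketch of this (my own wiring, reusing the shapes from the earlier gcn_layer sketch): once W and B are trained, embedding a new node needs only its features and its neighbors' features, with no retraining.

```python
import numpy as np

def embed_new_node(x_new, neighbor_feats, W, B):
    """x_new: (d_in,) features of the new node; neighbor_feats: (m, d_in)
    features of its neighbors; W, B: trained (d_in, d_out) weights."""
    neigh_mean = neighbor_feats.mean(axis=0)          # average the neighbors' h^0
    h = np.maximum(0, W.T @ neigh_mean + B.T @ x_new)
    return h / np.maximum(np.linalg.norm(h), 1e-12)   # z_u for the new node
```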

SLIDE 29

Inductive Capability: New Graphs

  • Inductive node embeddings generalize to entirely unseen graphs
    – E.g., train on a protein interaction graph from model organism A and generate embeddings on newly collected data about organism B

[Figure: train on one graph, then generalize to a new graph and produce embeddings z_u for its nodes]

SLIDE 30

Summary So Far

  • Recap: Generate node embeddings by aggregating neighborhood information
    – We saw a basic variant of this idea
    – The key distinction between approaches is how they aggregate information across the layers
  • Next: the GraphSAGE graph neural network architecture

SLIDE 31

Today’s Lecture

  • Introduction to deep graph embeddings
  • Graph convolution networks
  • GraphSAGE
  • Main paper: "Inductive Representation Learning on Large Graphs", William L. Hamilton, Rex Ying, Jure Leskovec. NeurIPS 2017.

SLIDE 32

GraphSAGE Idea

  • In GCN, we aggregated the neighbors' messages as the (weighted) average of all neighbors. How can we generalize this?

[Figure: A's computation graph with the aggregation boxes left open]

[Hamilton et al., NIPS 2017]

SLIDE 33

GraphSAGE Idea

[Figure: input graph and target node A's computation graph]

    h_v^k = σ( [ W_k · AGG({h_u^{k-1}, ∀u ∈ N(v)}),  B_k h_v^{k-1} ] )

AGG can be any differentiable function that maps the set of vectors in N(v) to a single vector.

SLIDE 34

Neighborhood Aggregation

  • Simple neighborhood aggregation (GCN):

    h_v^k = σ( W_k Σ_{u∈N(v)} h_u^{k-1} / |N(v)| + B_k h_v^{k-1} )

  • GraphSAGE: generalized aggregation; concatenate the aggregated neighbor embedding with the self embedding:

    h_v^k = σ( [ W_k · AGG({h_u^{k-1}, ∀u ∈ N(v)}),  B_k h_v^{k-1} ] )
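A minimal sketch of the GraphSAGE update (my own shapes; assumes every node has at least one neighbor), with the aggregator passed in as a function so it can be swapped, which is exactly the generalization the slide describes:

```python
import numpy as np

def graphsage_layer(H_prev, A, W, B, agg):
    """H_prev: (n, d_in); A: (n, n) adjacency; W, B: (d_in, d_out);
    agg: maps an (m, d_in) array of neighbor embeddings to a (d_in,) vector."""
    rows = []
    for v in range(A.shape[0]):
        neigh = H_prev[A[v] > 0]                        # h_u^{k-1}, u in N(v)
        combined = np.concatenate([W.T @ agg(neigh),    # aggregated neighbors
                                   B.T @ H_prev[v]])    # self embedding
        rows.append(np.maximum(0, combined))            # sigma = ReLU
    return np.stack(rows)                               # (n, 2 * d_out)

mean_agg = lambda neigh: neigh.mean(axis=0)             # recovers GCN-style mean
```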

SLIDE 35

Neighbor Aggregation: Variants

  • Mean: Take a weighted average of the neighbors:

    AGG = Σ_{u∈N(v)} h_u^{k-1} / |N(v)|

  • Pool: Transform the neighbor vectors and apply a symmetric vector function γ (element-wise mean/max):

    AGG = γ({ Q h_u^{k-1}, ∀u ∈ N(v) })

  • LSTM: Apply an LSTM to a random permutation π of the neighbors:

    AGG = LSTM([ h_u^{k-1}, ∀u ∈ π(N(v)) ])
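Sketches of the three aggregators to plug into graphsage_layer above (γ, Q, and π follow the formulas; the LSTM variant is stubbed via a user-supplied lstm_step, since a full recurrent cell is out of scope here):

```python
import numpy as np

def mean_agg(neigh):                         # Mean: average of the neighbors
    return neigh.mean(axis=0)

def pool_agg(neigh, Q):                      # Pool: transform, then symmetric max
    return np.maximum(0, neigh @ Q).max(axis=0)   # gamma = element-wise max

def lstm_agg(neigh, lstm_step, h0, c0):      # LSTM over a reshuffled neighbor list
    order = np.random.permutation(len(neigh))     # pi: random permutation
    h, c = h0, c0
    for i in order:
        h, c = lstm_step(neigh[i], h, c)          # one recurrent step per neighbor
    return h
```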

SLIDE 36

Experiments: Dataset

  • Dynamic datasets:
    – Citation network: predict the paper category
      » Data from 2000-2005
      » 302,424 nodes
      » Train: data up to 2004; test: 2005 data
    – Reddit post network: predict the subreddit of a post
      » Nodes = posts
      » Edges between posts if the same user comments on both posts
      » 232,965 posts
      » Train: 20 days of data; test: the next 10 days of data

SLIDE 37

Experiments: Results

SLIDE 38

Summary: GCN and GraphSAGE

  • Key idea: Generate node embeddings based on local neighborhoods
    – Nodes aggregate "messages" from their neighbors using neural networks
  • Graph convolutional networks:
    – Basic variant: average neighborhood information and stack neural network layers
  • GraphSAGE:
    – Generalized neighborhood aggregation

SLIDE 39

Today’s Lecture

  • Introduction to deep graph embeddings
  • Graph convolution networks
  • GraphSAGE