MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing (PowerPoint PPT Presentation)



SLIDE 1

Abu-El-Haija et al, MixHop, ICML’19

Poster #88

Code: http://github.com/samihaija/mixhop
Slides: http://sami.haija.org/icml19

MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing

Sami Abu-El-Haija1, Bryan Perozzi2, Amol Kapoor2, Nazanin Alipourfard1, Kristina Lerman1, Hrayr Harutyunyan1, Greg Ver Steeg1, Aram Galstyan1

SLIDE 2

Agenda

  • Review Graph Convolutional Networks (GCN)
    ○ Application: Semi-Supervised Node Classification (SSNC)
    ○ Shortcoming of GCN
  • MixHop: Higher-Order GCN
    ○ Sparsification
  • MixHop Results on SSNC
SLIDE 4

Graph Convolutional Network (GCN) [1]

[1] Kipf & Welling, ICLR 2017

SLIDE 5

Graph Convolutional Network (GCN) [1]

[Figure: graph with nodes x1–x6]

SLIDE 6

Graph Convolutional Network (GCN) [1]

[Figure: graph with input features x1–x6]

SLIDE 7

Graph Convolutional Network (GCN) [1]

[Figure: GC Layer 1 applied to input features x1–x6]

SLIDE 8

Graph Convolutional Network (GCN) [1]

[Figure: GC Layer 1 maps input features x1–x6 to latent features h1(1)–h6(1)]

SLIDE 9

Graph Convolutional Network (GCN) [1]

[Figure: GC Layer 1 maps input features x1–x6 to latent features h1(1)–h6(1)]

SLIDE 10

Graph Convolutional Network (GCN) [1]

[Figure: stacked network; GC Layer 1 maps input features x1–x6 to latent features h1(1)–h6(1), and GC Layer L produces output features h1(L)–h6(L)]
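The propagation rule behind these slides can be sketched in a few lines of NumPy (a minimal sketch; the 6-node toy graph, its edges, and the random features below are illustrative stand-ins for the x1–x6 example, not data from the paper):

```python
import numpy as np

def normalized_adjacency(A):
    """Renormalized adjacency of Kipf & Welling: D^-1/2 (A + I) D^-1/2."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gc_layer(A_hat, H, W):
    """One graph-convolution layer: ReLU(A_hat @ H @ W)."""
    return np.maximum(A_hat @ H @ W, 0.0)

# Toy 6-node graph standing in for the slides' x1..x6 example (edges made up).
rng = np.random.default_rng(0)
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 3), (2, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1.0
A_hat = normalized_adjacency(A)

X = rng.normal(size=(6, 4))    # input features x1..x6
W1 = rng.normal(size=(4, 8))   # GC Layer 1 weights
H1 = gc_layer(A_hat, X, W1)    # latent features h1(1)..h6(1), shape (6, 8)
```

Stacking `gc_layer` calls with fresh weight matrices gives the multi-layer network of the following slides.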

SLIDE 11

Graph Convolutional Network (GCN) [1]

[Figure: output features h1(L)–h6(L), with labels y2 and y4 shown for the labeled nodes]

Train on semi-supervised node classification:

  • Measure loss on labeled nodes (y4, y2)
SLIDE 12

Graph Convolutional Network (GCN) [1]

[Figure: output features h1(L)–h6(L), with labels y2 and y4 shown for the labeled nodes]

Train on semi-supervised node classification:

  • Measure loss on labeled nodes (y4, y2)
  • Backprop to learn the GC layers.
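The semi-supervised part is just a mask over the loss: only labeled nodes contribute. A sketch (the helper name `masked_cross_entropy`, the toy logits, and the choice of labeled nodes are illustrative assumptions, not the paper's code):

```python
import numpy as np

def masked_cross_entropy(logits, labels, labeled_idx):
    """Semi-supervised loss: only the labeled nodes (e.g. y2, y4) contribute."""
    z = logits[labeled_idx]
    z = z - z.max(axis=1, keepdims=True)  # subtract max for numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labeled_idx)), labels].mean()

rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 3))     # output features for 6 nodes, 3 classes
labeled_idx = [1, 3]                 # nodes 2 and 4 carry labels
labels = np.array([0, 2])            # their class labels y2, y4
loss = masked_cross_entropy(logits, labels=labels, labeled_idx=labeled_idx)
```

Gradients of this scalar with respect to all layer weights drive the SGD updates shown on the next slide.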


SLIDE 13

Graph Convolutional Network (GCN) [1]

[Figure: losses at y2 and y4 backpropagate via SGD to update GC Layer 1 through GC Layer L]

SLIDE 14

Graph Convolutional Network (GCN) [1]

[Figure: the same network, with a question mark over what a GC layer computes]

SLIDE 15

Graph Convolutional Network (GCN) [1]

[Figure: GC Layer 1 maps input features x1–x6 to latent features h1(1)–h6(1)]

SLIDE 16

Graph Convolutional Network (GCN) [1]

[Figure: GC Layer 1 decomposed: h3(1) is computed by averaging x3 with its neighbors' features (Avg), then applying a shared fully connected layer (fc)]
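The "Avg" box is easy to verify numerically: with a row-normalized adjacency plus self-loops, row i of the propagated features is exactly the mean over node i's closed neighborhood (a sketch on the same toy graph; GCN itself uses the symmetric D^-1/2 normalization, which is a weighted rather than plain average):

```python
import numpy as np

# Row-normalized adjacency with self-loops: row i of P @ X is the mean of
# node i's own features and its neighbors' features (the "Avg" box).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 3), (2, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1.0
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))

A_tilde = A + np.eye(6)
P = A_tilde / A_tilde.sum(axis=1, keepdims=True)

node = 2  # x3 in the slides' 1-based labels
closed_nbhd = [j for j in range(6) if A_tilde[node, j] == 1]
assert np.allclose((P @ X)[node], X[closed_nbhd].mean(axis=0))
```

The "fc" box is then just the shared weight multiplication `@ W` applied after this averaging.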

SLIDE 17

Graph Convolutional Network (GCN) [1]

[Figure: the Avg + fc decomposition of GC Layer 1, shown both as a tensor computation and on the graph]


SLIDE 22

Shortcoming of Vanilla GCN

[Figure: the vanilla GC layer]

SLIDE 23

Shortcoming of Vanilla GCN

😁 fc is shared ⇒ inductive

[Figure: the vanilla GC layer]

SLIDE 24

Shortcoming of Vanilla GCN

😁 fc is shared ⇒ inductive
😣 The appendix experiments of [1] show no gains beyond 2 layers

[Figure: the vanilla GC layer]

SLIDE 25

Shortcoming of Vanilla GCN

😁 fc is shared ⇒ inductive
😣 The appendix experiments of [1] show no gains beyond 2 layers
😣 Cannot mix neighbors from various distances in arbitrary linear combinations

[Figure: the vanilla GC layer]

SLIDE 26

Shortcoming of Vanilla GCN

😁 fc is shared ⇒ inductive
😣 The appendix experiments of [1] show no gains beyond 2 layers
😣 Cannot mix neighbors from various distances in arbitrary linear combinations, e.g. cannot learn Gabor filters!

[Figure: the vanilla GC layer]
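One concrete feature a single vanilla GC layer cannot produce is a difference between hop distances (a Gabor-like sharpening filter on the graph), since the layer only ever sees one application of the adjacency. A sketch on the toy graph (the graph and features are illustrative):

```python
import numpy as np

# A Gabor-like "delta" feature: the difference between one-hop and two-hop
# propagated features. A single vanilla GC layer only computes A_hat @ H @ W,
# so it cannot express this contrast; it needs access to A_hat^2 as well.
rng = np.random.default_rng(0)
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 3), (2, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1.0
A_tilde = A + np.eye(6)
d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
A_hat = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

X = rng.normal(size=(6, 4))
delta = A_hat @ X - A_hat @ A_hat @ X  # contrast of 1-hop vs. 2-hop information
```

MixHop's layer, introduced next, exposes exactly these higher powers so such contrasts become learnable.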

SLIDE 27

Shortcoming of Vanilla GCN

😁 fc is shared ⇒ inductive
😣 The appendix experiments of [1] show no gains beyond 2 layers
😣 Cannot mix neighbors from various distances in arbitrary linear combinations, e.g. cannot learn Gabor filters!

[Figure: the vanilla GC layer, marked with a question mark]

SLIDE 28

Detour: Review Gabor Filters

[2] Daugman, Vision Research, 1980
[3] Daugman, Journal of the Optical Society of America, 1985
[4] Honglak Lee et al, ICML, 2009
[5] Alex Krizhevsky et al, NeurIPS, 2012

Neuroscientists discover their importance in the primate visual cortex [2, 3]:

SLIDE 29

Detour: Review Gabor Filters

Neuroscientists discover their importance in the primate visual cortex [2, 3]. Further, they are automatically recovered by training CNNs on images [4, 5].

SLIDE 30

Main Motivation

SLIDE 31

Main Motivation

Extend the class of representations realizable by GCNs, e.g. to learn Gabor filters.

SLIDE 32

Agenda

  • Review Graph Convolutional Networks (GCN)
    ○ Application: Semi-Supervised Node Classification (SSNC)
    ○ Shortcoming of GCN
  • MixHop: Higher-Order GCN
    ○ Sparsification
  • MixHop Results on SSNC
SLIDE 33

Our Model: MixHop

[Figure: the vanilla GC layer next to the MixHop GC layer]

SLIDE 34

Our Model: MixHop

[Figure: the vanilla GC layer next to the MixHop GC layer]

A couple of code lines implement the concatenation.

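Those few lines look roughly like this (a NumPy sketch of the layer's structure, not the released TensorFlow code; the `weights` dict and channel sizes are illustrative):

```python
import numpy as np

def mixhop_layer(A_hat, H, weights):
    """MixHop GC layer sketch: concatenate ReLU(A_hat^j @ H @ W_j) over powers j.

    `weights` maps each adjacency power j to its own weight matrix W_j;
    power 0 keeps the node's own features, untouched by propagation.
    """
    outputs = []
    propagated = H
    prev_power = 0
    for j, W in sorted(weights.items()):
        for _ in range(j - prev_power):
            propagated = A_hat @ propagated  # reuse lower powers incrementally
        prev_power = j
        outputs.append(propagated @ W)
    return np.maximum(np.concatenate(outputs, axis=1), 0.0)

# usage: powers {0, 1, 2} with 8 output channels each -> 24 channels total
rng = np.random.default_rng(0)
A_hat = np.eye(6)              # stand-in; use the normalized adjacency here
H = rng.normal(size=(6, 4))
weights = {j: rng.normal(size=(4, 8)) for j in (0, 1, 2)}
out = mixhop_layer(A_hat, H, weights)  # shape (6, 24)
```

Because each power gets its own weight block before the concatenation, a subsequent layer can combine hops in arbitrary linear combinations, including the delta-style differences a vanilla layer cannot express.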

SLIDE 36

Our Model: MixHop

[Figure: the MixHop GC layer]

SLIDE 37

Our Model: MixHop

[Figure: the MixHop GC layer]

😁 Inductive

SLIDE 38

Our Model: MixHop

[Figure: the MixHop GC layer]

😁 Inductive
😁 Can incorporate distant nodes

SLIDE 39

Our Model: MixHop

[Figure: the MixHop GC layer]

😁 Inductive
😁 Can incorporate distant nodes
😁 Can mix neighbors across distances in arbitrary linear combinations

SLIDE 40

Our Model: MixHop

[Figure: the MixHop GC layer]

😁 Inductive
😁 Can incorporate distant nodes
😁 Can mix neighbors across distances in arbitrary linear combinations, i.e. can learn Gabor filters!

SLIDE 41

Sparsification

We add group-lasso (L2) regularization to drop out columns of the feature matrices, similar to [6].

[6] Gordon et al, CVPR, 2018


SLIDE 42

Sparsification

We add group-lasso (L2) regularization to drop out columns of the feature matrices, similar to [6]. The 2nd layer on Cora drops out the zeroth power completely.

[Figure: learned weight matrices after sparsification]
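The regularizer itself is a sum of L2 norms over column groups, one group per adjacency power's block in the concatenated representation (a sketch; the helper name `group_lasso_penalty`, the grouping by consecutive rows, and the strength are illustrative assumptions, not the paper's exact setup):

```python
import numpy as np

def group_lasso_penalty(W, group_size, strength=1e-3):
    """Group-lasso sketch: one L2 group per block of `group_size` input rows.

    Each block corresponds to one adjacency power's channels in the
    concatenated MixHop output; the penalty pushes whole blocks toward zero,
    effectively pruning that power (as with the zeroth power on Cora).
    """
    n_groups = W.shape[0] // group_size
    blocks = W.reshape(n_groups, group_size * W.shape[1])
    return strength * np.sqrt((blocks ** 2).sum(axis=1)).sum()

# usage: next-layer weights acting on a 24-dim input made of three 8-dim blocks
rng = np.random.default_rng(0)
W = rng.normal(size=(24, 5))
penalty = group_lasso_penalty(W, group_size=8)
```

Adding this penalty to the training loss makes entire blocks shrink together, which is what lets whole powers be dropped rather than individual weights.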

SLIDE 43

Agenda

  • Review Graph Convolutional Networks (GCN)
    ○ Application: Semi-Supervised Node Classification (SSNC)
    ○ Shortcoming of GCN
  • MixHop: Higher-Order GCN
    ○ Sparsification
  • MixHop Results on SSNC
SLIDE 44

Results on Citation Datasets

SLIDE 45

Results on (Synthetic) Homophily Datasets

With less homophily, our performance gap increases

SLIDE 46

Results on (Synthetic) Homophily Datasets

With less homophily, our performance gap increases. With less homophily, our method also learns more feature differences (i.e. Gabor-like filters).

SLIDE 47

References

[1] Kipf & Welling, "Semi-Supervised Classification with Graph Convolutional Networks", ICLR, 2017
[2] Daugman, "Two-dimensional spectral analysis of cortical receptive field profiles", Vision Research, 1980
[3] Daugman, "Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters", Journal of the Optical Society of America, 1985
[4] Honglak Lee et al, "Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations", ICML, 2009
[5] Alex Krizhevsky et al, "ImageNet Classification with Deep Convolutional Neural Networks", NeurIPS, 2012
[6] Gordon et al, "MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks", CVPR, 2018

SLIDE 48

Conclusion

  • With just a couple of lines, Kipf's model can be extended to incorporate powers of the (normalized) adjacency matrix
  • Allowing it to learn general neighborhood mixing, and its special cases: Gabor-like filters and delta ops
  • Inspection shows delta ops are indeed learned with lower levels of homophily.

Slides at: http://sami.haija.org/icml19

Thank you for listening! Poster #88