Graph Representation Learning with Graph Convolutional Networks - PowerPoint PPT Presentation


slide-1
SLIDE 1

Graph Representation Learning with Graph Convolutional Networks

Jure Leskovec

slide-2
SLIDE 2

Networks: Common Language

Jure Leskovec, Stanford University

Peter Mary Albert Tom

co-worker friend brothers friend

Protein 1 Protein 2 Protein 5 Protein 9

Movie 1 Movie 3 Movie 2

Actor 3 Actor 1 Actor 2 Actor 4

|N|=4 |E|=4

slide-3
SLIDE 3

Example: Node Classification

Many possible ways to create node features: § Node degree, PageRank score, motifs, … § Degree of neighbors, PageRank of neighbors, …
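As a toy illustration of the hand-engineered features listed above, here is a minimal sketch (names and the 4-node graph are made up) that computes node degree and PageRank with plain numpy:

```python
# Hand-engineered node features for a toy graph: degree and PageRank,
# two of the feature types listed above. Illustrative sketch only.
import numpy as np

# Tiny undirected graph as an adjacency list.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def degree_features(adj):
    return {v: len(nbrs) for v, nbrs in adj.items()}

def pagerank(adj, damping=0.85, iters=100):
    """Basic power-iteration PageRank over the adjacency list."""
    n = len(adj)
    pr = np.full(n, 1.0 / n)
    for _ in range(iters):
        new = np.full(n, (1 - damping) / n)
        for v, nbrs in adj.items():
            share = damping * pr[v] / len(nbrs)  # spread v's mass to neighbors
            for u in nbrs:
                new[u] += share
        pr = new
    return pr

deg = degree_features(adj)
pr = pagerank(adj)
# Each node's feature vector: [degree, PageRank].
features = {v: [deg[v], pr[v]] for v in adj}
```

The point of the slide is exactly that such features must be designed by hand, per task.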


Machine Learning

slide-4
SLIDE 4

Machine Learning Lifecycle


Network Data → (Feature Engineering) → Node Features → Learning Algorithm → Model → Downstream prediction task

Automatically learn the features

(Supervised) Machine Learning Lifecycle: This feature, that feature. Every single time!


slide-5
SLIDE 5

Feature Learning in Graphs

This talk: Feature learning for networks!


Map each node u to a d-dimensional vector: f : u → ℝ^d

Feature representation, embedding

slide-6
SLIDE 6


GraphSAGE: Graph Convolutional Networks


Inductive Representation Learning on Large Graphs.

  • W. Hamilton, R. Ying, J. Leskovec. Neural Information Processing Systems (NIPS), 2017.

Representation Learning on Graphs: Methods and Applications.

  • W. Hamilton, R. Ying, J. Leskovec. IEEE Data Engineering Bulletin, 2017.
slide-7
SLIDE 7

From Images to Networks

Single CNN layer with 3x3 filter:

(Animation: Vincent Dumoulin)

Image | Graph

Transform information at the neighbors and combine it:

§ Transform “messages” h_u from neighbors: W h_u

§ Add them up: Σ_u W h_u
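The neighborhood aggregation above can be sketched in a few lines of numpy (the dimensions and random features are made up for illustration):

```python
# Transform each neighbor "message" h_u with a shared weight matrix W,
# then add the transformed messages up: sum_u W @ h_u.
import numpy as np

rng = np.random.default_rng(0)
d = 4
W = rng.normal(size=(d, d))                    # shared transform for this layer
h = {u: rng.normal(size=d) for u in range(5)}  # node features h_u
neighbors_of_v = [1, 2, 4]

# Aggregate the neighborhood of v.
combined = sum(W @ h[u] for u in neighbors_of_v)
```

Because the transform is linear, summing transformed messages equals transforming the summed messages, which is why a single shared W suffices.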

slide-8
SLIDE 8

Real-World Graphs

But what if your graphs look like this?


§ Examples:

Social networks, Information networks, Knowledge graphs, Communication networks, Web graph, …

slide-9
SLIDE 9

A Naïve Approach

§ Join adjacency matrix and features
§ Feed them into a deep neural net
§ Issues with this idea:

§ O(N) parameters
§ Not applicable to graphs of different sizes
§ Not invariant to node ordering
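The node-ordering problem is easy to demonstrate. In this sketch (toy 3-node graph, made-up features), relabeling the nodes of the *same* graph changes the flattened "[adjacency | features]" input the naive model would see:

```python
# Why the naive "flatten [A | X] and feed to a neural net" idea fails:
# relabeling the nodes changes the input vector, so the model is not
# invariant to node ordering.
import numpy as np

A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])            # adjacency matrix
X = np.array([[1.0], [2.0], [3.0]])  # one feature per node

perm = [2, 0, 1]                     # relabel the same graph
A_perm = A[np.ix_(perm, perm)]       # permute rows and columns together
X_perm = X[perm]

flat = np.concatenate([A, X], axis=1).ravel()
flat_perm = np.concatenate([A_perm, X_perm], axis=1).ravel()
# Same graph, different input vector:
assert not np.array_equal(flat, flat_perm)
```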

[Figure: 5×5 adjacency matrix over nodes A–E with per-node features appended, fed into a neural net. Done?]

slide-10
SLIDE 10

Graph Convolutional Networks

§ Graph Convolutional Networks:


Niepert, Mathias, Mohamed Ahmed, and Konstantin Kutzkov. "Learning convolutional neural networks for graphs." ICML. 2016. (image source)

§ Problem: For a given subgraph, how do we come up with a canonical node ordering?

slide-11
SLIDE 11

Desiderata

§ Invariant to node ordering

§ No graph isomorphism problem

§ Locality – operations depend on the neighbors of a given node

§ Number of model parameters should be independent of graph size

§ Model should be independent of graph structure and we should be able to transfer the model across graphs

slide-12
SLIDE 12

GraphSAGE

§ Adapt the GCN idea to inductive node embedding
§ Generalize beyond simple convolutions
§ Demonstrate that this generalization
§ Leads to significant performance gains
§ Allows the model to learn about local structures


slide-13
SLIDE 13

Idea: Graph defines computation

Learn how to propagate information across the graph to compute node features

§ Determine node computation graph
§ Propagate and transform information

Idea: Node’s neighborhood defines a computation graph

Semi-Supervised Classification with Graph Convolutional Networks. T. N. Kipf, M. Welling, ICLR 2017

slide-14
SLIDE 14

Our Approach: GraphSAGE


[Computation graph with weight matrices W(1), Q(1) at level 1 and W(2), Q(2) at level 2]

§ Each node defines its own computational graph

§ Each edge in this graph is a transformation/aggregation function

slide-15
SLIDE 15

Our Approach: GraphSAGE


[Computation graph with weight matrices W(1), Q(1) at level 1 and W(2), Q(2) at level 2]

Update for node v:

h_v^(ℓ+1) = σ( W^(ℓ) h_v^(ℓ) ,  σ( Q^(ℓ) Σ_{u∈N(v)} h_u^(ℓ) ) )

§ h_v^(0) = attributes of node v
§ Σ(⋅): Aggregator function (e.g., avg., LSTM, max-pooling)
§ The first term transforms v’s own features from level ℓ; the second transforms and aggregates the features of the neighbors N(v); together they give the (ℓ+1)-st level features of node v

Semi-Supervised Classification with Graph Convolutional Networks. T. N. Kipf, M. Welling, ICLR 2017
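A minimal numpy sketch of the node update above, assuming sum aggregation, ReLU for σ, and concatenation of the two terms (the aggregator and nonlinearity are pluggable; this is not the paper's reference code):

```python
# One GraphSAGE-style update for a single node v:
#   h_v^(l+1) = relu([ W @ h_v^(l) , relu(Q @ sum_{u in N(v)} h_u^(l)) ])
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sage_update(v, adj, h, W, Q):
    agg = np.sum([h[u] for u in adj[v]], axis=0)   # sum-aggregate the neighbors
    return relu(np.concatenate([W @ h[v], relu(Q @ agg)]))

rng = np.random.default_rng(1)
d = 3
adj = {0: [1, 2], 1: [0], 2: [0]}
h = {v: rng.normal(size=d) for v in adj}           # h^(0): node attributes
W, Q = rng.normal(size=(d, d)), rng.normal(size=(d, d))

h0_next = sage_update(0, adj, h, W, Q)             # 2*d-dimensional level-1 features
```

Note the output dimension doubles because the self term and the neighbor term are concatenated.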

slide-16
SLIDE 16

GraphSAGE Algorithm

K = “search depth”
§ Initialize representations as features
§ Aggregate information from neighbors
§ Concatenate neighborhood info with current representation and propagate
§ Classification (cross-entropy) loss
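The forward pass of those steps can be sketched as follows (hedged sketch: sum aggregation and ReLU are assumptions, and the cross-entropy training loop is omitted):

```python
# K-step GraphSAGE-style forward pass: initialize h^0 with features,
# then for each depth aggregate neighbors, concatenate with the current
# representation, transform, and propagate.
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def graphsage_forward(adj, feats, Ws, Qs):
    """K = len(Ws) is the 'search depth'; one (W, Q) pair per depth."""
    h = {v: np.asarray(feats[v], dtype=float) for v in adj}   # h^0 = features
    for W, Q in zip(Ws, Qs):
        h_new = {}
        for v in adj:
            agg = np.sum([h[u] for u in adj[v]], axis=0)      # aggregate neighbors
            h_new[v] = relu(np.concatenate([W @ h[v], relu(Q @ agg)]))
        h = h_new                                             # propagate to next depth
    return h
```

In the full algorithm these embeddings would feed a classifier trained with the cross-entropy loss mentioned above.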

slide-17
SLIDE 17

WL isomorphism test

§ The classic Weisfeiler-Lehman graph isomorphism test is a special case of GraphSAGE § We replace the hash function with trainable neural nets:
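For reference, a WL color-refinement step looks like this (minimal sketch): each node's label is rehashed together with the multiset of its neighbors' labels, which is exactly the slot where GraphSAGE substitutes a trainable aggregate-and-transform.

```python
# One round of Weisfeiler-Lehman color refinement: relabel each node by
# hashing its own label with the sorted multiset of neighbor labels.
def wl_iteration(adj, labels):
    return {v: hash((labels[v], tuple(sorted(labels[u] for u in adj[v]))))
            for v in adj}

def wl_colors(adj, init_labels, rounds=3):
    labels = dict(init_labels)
    for _ in range(rounds):
        labels = wl_iteration(adj, labels)
    return labels
```

On a path graph with uniform initial labels, one round already separates the endpoints (one neighbor) from the middle node (two neighbors).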


Shervashidze, Nino, et al. "Weisfeiler-Lehman graph kernels." Journal of Machine Learning Research (2011).


slide-18
SLIDE 18

GraphSAGE: Training

§ Assume parameter sharing:

[Computation graph: the same W(1), Q(1) and W(2), Q(2) matrices are shared across all nodes at each level]

§ Two types of parameters:
§ Aggregate function can have params.
§ Matrix W(k)

§ Adapt to inductive setting (e.g., unsupervised loss, neighborhood sampling, minibatch optimization)
§ Generalized notion of “aggregating neighborhood”
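Of the adaptations listed, neighborhood sampling is the simplest to show: cap each node's neighborhood at a fixed fan-out so a minibatch's cost is bounded on large graphs (hedged sketch; the fan-out value here is made up):

```python
# Sample at most `fanout` neighbors per node, so minibatch computation
# graphs stay bounded regardless of node degree.
import random

def sample_neighbors(adj, v, fanout, rng):
    nbrs = adj[v]
    if len(nbrs) <= fanout:
        return list(nbrs)
    return rng.sample(nbrs, fanout)    # uniform sample without replacement

rng = random.Random(0)
adj = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
sampled = sample_neighbors(adj, 0, fanout=2, rng=rng)
```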

slide-19
SLIDE 19

GraphSAGE: Benefits

§ Can use different aggregators:

§ Mean (simple element-wise mean), LSTM (applied to a random order of nodes), Max-pooling (element-wise max)

§ Can use different loss functions:

§ Cross entropy, hinge loss, ranking loss

§ Model has a constant number of parameters
§ Fast scalable inference
§ Can be applied to any node in any network


slide-20
SLIDE 20

GraphSAGE Performance: Experiments

§ Compare GraphSAGE to alternative methods

§ Logistic regression on features (no network information)
§ node2vec, extended node2vec with features

§ Task: Node classification, transfer learning

§ Citation graph: 302,424 papers from 2000-05
§ Predict 6 subject codes; train on 2000-04, test on ‘05

§ Reddit posts: 232,965 posts, 50 communities, Sep ‘14
§ What community does a post belong to? Train on first 20 days, test on remaining 10 days

§ Protein-protein interaction networks: 24 PPI networks from different tissues
§ Transfer learning of protein function: train on 20 networks, test on 2

DARPA SIMPLEX PI Meeting, February 6, 2018, MINER Project

slide-21
SLIDE 21

GraphSAGE Performance: Results

GraphSAGE performs best in all experiments. Achieves ~40% average improvement over raw features.


slide-22
SLIDE 22

Application: Pinterest

Human curated collection of pins


§ Pin: A visual bookmark someone has saved from the internet to a board they’ve created (image, text, link).
§ Board: A greater collection of ideas (pins having sth. in common).

slide-23
SLIDE 23

Large-Scale Application

§ Semi-supervised node embedding for graph-based recommendations
§ Graph: 2B pins, 1B boards, 20B edges

[Figure: bipartite graph of Pins and Boards; query pin Q]

slide-24
SLIDE 24

Pinterest Graph

§ Graph is dynamic: need to apply to new nodes without model retraining
§ Rich node features: content, image

slide-25
SLIDE 25

Task: Item-Item Recs

Related Pin recommendations:

§ Given a user is looking at pin Q, what pin X are they going to save next?

[Training examples: Query, Positive, Hard negative, Random negative]

slide-26
SLIDE 26

GraphSAGE Training

§ Leverage inductive capability, and train on individual subgraphs

§ 300 million nodes, 1 billion edges, 1.2 billion pin pairs (Q, X)

§ Large batch size: 2048 per minibatch


slide-27
SLIDE 27

GraphSAGE: Inference

§ Use MapReduce for model inference
§ Avoids repeated computation
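The "avoids repeated computation" point can be sketched independently of MapReduce: at inference, compute every node's layer-k embedding exactly once per layer and let all neighbors reuse it, instead of recomputing it inside each node's own computation graph (hedged sketch; the toy layer function is made up):

```python
# Layer-wise inference: one pass per layer over all nodes; each node's
# embedding at layer k is computed once and shared by every neighbor.
def layerwise_inference(adj, h0, layer_fn, K):
    h = dict(h0)
    for _ in range(K):
        # All updates in a layer read the previous layer's embeddings.
        h = {v: layer_fn(h[v], [h[u] for u in adj[v]]) for v in adj}
    return h

# Toy layer: add the sum of neighbor values to your own.
adj = {0: [1], 1: [0, 2], 2: [1]}
h = layerwise_inference(adj, {0: 1.0, 1: 2.0, 2: 3.0},
                        lambda hv, nbrs: hv + sum(nbrs), K=1)
```

In a MapReduce setting each layer pass becomes one map-and-join over the node table.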


slide-28
SLIDE 28

Experiments

Related Pin recommendations:

§ Given a user is looking at pin Q, predict what pin X they are going to save next
§ Baselines for comparison:

§ Visual: VGG-16 visual features
§ Annotation: Word2Vec model
§ Combined: combine visual and annotation
§ RW: Random-walk based algorithm
§ GraphSAGE

§ Setup: Embed 2B pins, perform nearest neighbor to generate recommendations


slide-29
SLIDE 29

Results: Ranking

Task: Given Q, rank X as high as possible among 2B pins

§ Hit-rate: percentage of times the positive pin X was among the top k
§ MRR: Mean reciprocal rank
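The two ranking metrics just defined are easy to pin down in code (sketch; variable names are made up): hit-rate@k is the fraction of queries whose positive item lands in the top k, and MRR averages 1/rank of the positive item.

```python
# ranked_lists[i] is the ranked candidate list for query i;
# positives[i] is the held-out positive item for query i.
def hit_rate_at_k(ranked_lists, positives, k):
    hits = sum(1 for r, p in zip(ranked_lists, positives) if p in r[:k])
    return hits / len(positives)

def mean_reciprocal_rank(ranked_lists, positives):
    # rank is 1-based: an item at index 0 contributes 1/1.
    return sum(1.0 / (r.index(p) + 1)
               for r, p in zip(ranked_lists, positives)) / len(positives)
```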


Method      Hit-rate  MRR
Visual      17%       0.23
Annotation  14%       0.19
Combined    27%       0.37
GraphSAGE   46%       0.56

slide-30
SLIDE 30

Example Recommendations


slide-31
SLIDE 31

GraphSAGE: Summary

§ Graph Convolution Networks

§ Generalize beyond simple convolutions

§ Fuses node features & graph info

§ State-of-the-art accuracy for node classification and link prediction.

§ Model size independent of graph size; can scale to billions of nodes

§ Largest embedding to date (3B nodes, 20B edges)

§ Leads to significant performance gains


slide-32
SLIDE 32


How can this technology be used for biomedical problems?

§ Two examples:

§ Pairs of nodes: Predicting side-effects of drug combinations
§ Subgraph prediction: Predicting which drug treats what disease

Modeling polypharmacy side effects with graph convolutional networks. M. Zitnik, M. Agrawal, J. Leskovec. BioArxiv, 2017.

slide-33
SLIDE 33

Polypharmacy Side Effects

[Figure: a patient’s medications (drug combination) and the patient’s side effects (polypharmacy side effect)]

slide-34
SLIDE 34

Polypharmacy Side Effects

[Figure: a patient’s medications and the patient’s side effects]

§ Polypharmacy is common to treat complex diseases and co-existing conditions
§ High risk of side effects due to interactions
§ 15% of the U.S. population affected
§ Annual costs exceed $177 billion

§ Difficult to identify manually:

§ Rare, occur only in a subset of patients
§ Not observed in clinical testing

slide-35
SLIDE 35

Network & Indications Data

§ Idea: Construct a heterogeneous graph of drugs and proteins
§ Train: Fit a model to predict known associations of drug pairs and side effects
§ Test: Given a query drug pair, predict candidate polypharmacy side effects

Data:

§ Protein-protein interaction network [Menche et al. Science 15]
§ 19K nodes, 350K edges

§ Drug-protein and disease-protein links:
§ 9K proteins, 800K drug-protein links

§ Drug side effects: SIDER, OFFSIDES, TWOSIDES

slide-36
SLIDE 36

Heterogeneous Graph

slide-37
SLIDE 37

Link Prediction Task

§ Predict labeled edges between drugs
§ Given a drug pair (c, d), predict how likely an edge (c, r_i, d) exists
§ Meaning: Drug combination (c, d) leads to polypharmacy side effect r_i

slide-38
SLIDE 38

Neural Architecture: Encoder

Graph encoder:

§ Input: graph, additional node features
§ Output: node embeddings

slide-39
SLIDE 39

Neural Architecture: Decoder

Graph decoder:

§ Input: Query drug pairs and their embeddings
§ Output: predicted links
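A hedged sketch of such a decoder (the actual model's decoder may differ): given the encoder's embeddings z_c, z_d of a query drug pair, score the candidate edge (c, r_i, d) with a per-side-effect bilinear form D_i.

```python
# Bilinear link decoder: probability-like score for side effect r_i
# between drugs c and d, from their embeddings and a learned matrix D_i.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def score_edge(z_c, z_d, D_i):
    return sigmoid(z_c @ D_i @ z_d)   # score in (0, 1)
```

One matrix per side-effect type lets the same drug embeddings support many edge labels.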

slide-40
SLIDE 40

Prediction Performance

§ Up to 54% improvement over baselines
§ First approach to computationally identify side effects of drug combinations


slide-41
SLIDE 41


How can this technology be used for biomedical problems?

§ Two examples:

§ Pairs of nodes: Predicting side-effects of drug combinations
§ Subgraph prediction: Predicting which drug treats what disease

slide-42
SLIDE 42

Prediction Problem

Goal: Predict which diseases a new drug (molecule) could treat

Graph convolutional drug repurposing


slide-43
SLIDE 43

Insight: Networks

§ Subgraphs of disease-associated proteins § Subgraphs of drug target proteins


slide-44
SLIDE 44

A Rationale for Graphs

A drug is likely to treat a disease if they are nearby in “pharmacological space”

[Menche et al. Science 2015; Guney et al. Nat Commun 2016; Hodos et al. Systems Biology and Medicine 2016]

slide-45
SLIDE 45

Link Prediction on Subgraphs

§ Drug repurposing: Link prediction problem on subgraphs
§ Predict new indications:

§ Obtain subgraphs by projecting drug and disease on the graph
§ Predict links between subgraphs


slide-46
SLIDE 46

SUGAR: Message Passing

Embedding for a subgraph:


slide-47
SLIDE 47

Neural Network Model


slide-48
SLIDE 48

Network & Indications Data

§ Protein-protein interaction network culled from 15 knowledge databases [Menche et al. Science 15]
§ 19K nodes, 350K edges

§ Drug-protein and disease-protein links:
§ DrugBank, OMIM, DisGeNET, STITCH DB and others
§ 5K drugs, 20K diseases
§ 20K drug-protein links, 560K disease-protein links

§ Drug medical indications:
§ DrugBank, MEDI-HPS, DailyMed, RepoDB and others
§ 6K drug-disease indications

§ Side information: Molecular pathways, disease symptoms, side effects

slide-49
SLIDE 49

Experimental Setup

§ Disease-centric cross-validation
§ For each cross-validation fold:

§ Exclude all indications of test diseases
§ Use the remaining data to train a model

§ Query: Given a disease, rank all drugs based on scores returned by the model


slide-50
SLIDE 50

Experimental Results

Comparison to current state of the art:
§ Up to 49% improvement over methods for drug repurposing
§ Up to 172% improvement over methods for scoring drug-disease pairs

slide-51
SLIDE 51

Integrating Side Information

Including additional biomedical knowledge:

genetics, molecular pathways, metabolic pathways

slide-52
SLIDE 52

Drug Repurposing @ SPARK


Drug               Disease                           Rank
N-acetyl-cysteine  cystic fibrosis                   14/5000
Xamoterol          neurodegeneration                 26/5000
Plerixafor         cancer                            54/5000
Sodium selenite    cancer                            36/5000
Ebselen            C difficile                       10/5000
Itraconazole       cancer                            26/5000
Bestatin           lymphedema                        11/5000
Bestatin           pulmonary arterial hypertension   16/5000
Ketaprofen         lymphedema                        28/5000
Sildenafil         lymphatic malformation            26/5000
Tacrolimus         pulmonary arterial hypertension   46/5000
Benzamil           psoriasis                         114/5000
Carvedilol         Chagas’ disease                   9/5000
Benserazide        BRCA1 cancer                      41/5000
Pioglitazone       interstitial cystitis             13/5000
Sirolimus          dystrophic epidermolysis bullosa  46/5000

Given C difficile, where does Ebselen rank among all approved drugs?

slide-53
SLIDE 53

SUGAR’s Predictions

Drug               Disease                           Rank
N-acetyl-cysteine  cystic fibrosis                   14/5000
Xamoterol          neurodegeneration                 26/5000
Plerixafor         cancer                            54/5000
Sodium selenite    cancer                            36/5000
Ebselen            C difficile                       10/5000
Itraconazole       cancer                            26/5000
Bestatin           lymphedema                        11/5000
Bestatin           pulmonary arterial hypertension   16/5000
Ketaprofen         lymphedema                        28/5000
Sildenafil         lymphatic malformation            26/5000
Tacrolimus         pulmonary arterial hypertension   46/5000
Benzamil           psoriasis                         114/5000
Carvedilol         Chagas’ disease                   9/5000
Benserazide        BRCA1 cancer                      41/5000
Pioglitazone       interstitial cystitis             13/5000
Sirolimus          dystrophic epidermolysis bullosa  46/5000

Higher rank is better. Example: SUGAR predicted Ebselen as the 10th most likely candidate drug for C difficile.

slide-54
SLIDE 54

Conclusion

Results from the past 1-2 years have shown:

§ Representation learning paradigm can be extended to graphs
§ No feature engineering necessary
§ Can effectively combine node attribute data with the network information
§ State-of-the-art results in a number of domains/tasks
§ Use end-to-end training instead of multi-stage approaches for better performance


slide-55
SLIDE 55

Conclusion

Next steps:
§ Multimodal & dynamic/evolving settings
§ Domain-specific adaptations (e.g. for recommender systems)
§ Graph generation
§ Prediction beyond simple pairwise edges

§ Multi-hop edge prediction

§ Theory


slide-56
SLIDE 56


PhD Students Post-Doctoral Fellows Funding Collaborators Industry Partnerships

Claire Donnat Mitchell Gordon David Hallac Emma Pierson Himabindu Lakkaraju Rex Ying Tim Althoff Will Hamilton David Jurgens Marinka Zitnik Michele Catasta Srijan Kumar Stephen Bach Rok Sosic

Research Staff

Peter Kacin Dan Jurafsky, Linguistics, Stanford University Christian Danescu-Miculescu-Mizil, Information Science, Cornell University Stephen Boyd, Electrical Engineering, Stanford University David Gleich, Computer Science, Purdue University VS Subrahmanian, Computer Science, University of Maryland Sarah Kunz, Medicine, Harvard University Russ Altman, Medicine, Stanford University Jochen Profit, Medicine, Stanford University Eric Horvitz, Microsoft Research Jon Kleinberg, Computer Science, Cornell University Sendhill Mullainathan, Economics, Harvard University Scott Delp, Bioengineering, Stanford University Jens Ludwig, Harris Public Policy, University of Chicago Geet Sethi


slide-57
SLIDE 57

References

§ node2vec: Scalable Feature Learning for Networks. A. Grover, J. Leskovec. KDD 2016.
§ Predicting multicellular function through multi-layer tissue networks. M. Zitnik, J. Leskovec. Bioinformatics, 2017.
§ Inductive Representation Learning on Large Graphs. W. Hamilton, R. Ying, J. Leskovec. NIPS 2017.
§ Representation Learning on Graphs: Methods and Applications. W. Hamilton, R. Ying, J. Leskovec. IEEE Data Engineering Bulletin, 2017.
§ Modeling polypharmacy side effects with graph convolutional networks. M. Zitnik, M. Agrawal, J. Leskovec. BioArxiv, 2017.

§ Code:
§ http://snap.stanford.edu/node2vec
§ http://snap.stanford.edu/graphsage
