Covariant Compositional Networks & GraphFlow Deep Learning


  1. Covariant Compositional Networks & GraphFlow Deep Learning Framework in C++/CUDA. Truong Son Hy. Advisor: Prof. Risi Kondor. The University of Chicago, 2018.

  2. Overview
  1. Introduction
  2. Graph Neural Networks
  3. Covariant Compositional Networks
  4. GraphFlow Deep Learning Framework
  5. Experiments
  6. Conclusion & Future Work

  3. Introduction
  In the field of Machine Learning, standard objects such as vectors, matrices, and tensors have been carefully studied and successfully applied to various areas including Computer Vision, Natural Language Processing, Speech Recognition, etc. However, none of these standard objects is efficient at capturing the structure of molecules, social networks, or the World Wide Web, which are not fixed in size. This gives rise to the need for a graph representation and for extensions of Support Vector Machines and Convolutional Neural Networks to graphs.

  4. Message Passing - The Motivation
  Figure: (a) PageRank, (b) Physics/Chemistry, (c) Knowledge graph.

  5. Message Passing - Label Propagation Algorithm
  Given an input graph/network $G = (V, E)$:
  1. Initially, each vertex $v$ of the graph is associated with a feature representation $l_v$ (label), denoted $f^0_v$. This feature representation is also called a message.
  2. Iteratively, at iteration $\ell$, each vertex collects/aggregates all messages of the previous iteration $\{f^{\ell-1}_{v_1}, \ldots, f^{\ell-1}_{v_k}\}$ from the vertices in its neighborhood $N(v) = \{v_1, \ldots, v_k\}$, and then produces a new message $f^{\ell}_v$ via some hashing function $\Phi(\cdot)$.
  3. The graph representation $\phi(G)$ is obtained by aggregating the last-iteration messages of every vertex. $\phi(G)$ is then used for the downstream application.

  6. Message Passing - Label Propagation Algorithm
  1: for $v \in V$ do
  2:   $f^0_v \leftarrow l_v$
  3: end for
  4: for $\ell = 1 \to L$ do
  5:   for $v \in V$ do
  6:     $f^{\ell}_v \leftarrow \Phi\big(f^{\ell-1}_{v_1}, \ldots, f^{\ell-1}_{v_k}\big)$ where $N(v) = \{v_1, \ldots, v_k\}$
  7:   end for
  8: end for
  9: $\phi(G) \leftarrow \Phi\big(f^L_1, \ldots, f^L_{|V|}\big)$
  10: Use $\phi(G)$ for downstream regression/classification tasks.
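  The following is a minimal C++ sketch of the label propagation loop above, assuming element-wise summation as the hashing function $\Phi$. The names (Graph, sum_aggregate, propagate) are illustrative only, not part of GraphFlow.

```cpp
// A minimal sketch of the label propagation loop, with sum-aggregation as Phi.
#include <utility>
#include <vector>

using Feature = std::vector<double>;            // a c-channel message f_v

struct Graph {
    int n;                                      // |V|
    std::vector<std::vector<int>> neighbors;    // N(v) for each vertex v
    std::vector<Feature> labels;                // initial labels l_v
};

// Phi: element-wise sum of the collected messages.
Feature sum_aggregate(const std::vector<const Feature*>& msgs, int channels) {
    Feature out(channels, 0.0);
    for (const Feature* m : msgs)
        for (int s = 0; s < channels; ++s) out[s] += (*m)[s];
    return out;
}

// Runs L rounds of message passing and returns phi(G).
Feature propagate(const Graph& g, int L, int channels) {
    std::vector<Feature> f = g.labels;                     // f^0_v = l_v
    for (int l = 1; l <= L; ++l) {
        std::vector<Feature> next(g.n);
        for (int v = 0; v < g.n; ++v) {
            std::vector<const Feature*> msgs;
            for (int u : g.neighbors[v]) msgs.push_back(&f[u]);
            next[v] = sum_aggregate(msgs, channels);       // f^l_v = Phi(...)
        }
        f = std::move(next);
    }
    std::vector<const Feature*> all;
    for (int v = 0; v < g.n; ++v) all.push_back(&f[v]);
    return sum_aggregate(all, channels);                   // phi(G)
}
```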

  7. Message Passing - Graph Neural Networks
  1: for $v \in V$ do
  2:   $f^0_v \leftarrow l_v$
  3: end for
  4: for $\ell = 1 \to L$ do
  5:   for $v \in V$ do
  6:     $f^{\ell}_v \leftarrow \Phi\big(f^{\ell-1}_{v_1}, \ldots, f^{\ell-1}_{v_k}; \{W^{\ell}_1, \ldots, W^{\ell}_{n_\ell}\}\big)$ where $N(v) = \{v_1, \ldots, v_k\}$
  7:   end for
  8: end for
  9: $\phi(G) \leftarrow \Phi\big(f^L_1, \ldots, f^L_{|V|}; \{W_1, \ldots, W_n\}\big)$
  10: Use $\phi(G)$ for downstream regression/classification tasks.
  The gradient with respect to $W$ can be computed via back-propagation.
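  As a sketch of the learnable variant of $\Phi$, assume the neighbor messages have already been summed and that $\Phi$ applies a single layer-specific weight matrix $W^{\ell}$ followed by a ReLU; the shapes and the choice of nonlinearity are assumptions for illustration.

```cpp
// A hedged sketch of a learnable Phi: linear map W^l applied to the summed
// neighbor messages, followed by a ReLU nonlinearity.
#include <algorithm>
#include <cstddef>
#include <vector>

using Feature = std::vector<double>;
using Matrix  = std::vector<std::vector<double>>;   // W^l of size c_out x c_in

Feature phi_learnable(const Feature& summed_messages, const Matrix& W) {
    Feature out(W.size(), 0.0);
    for (std::size_t i = 0; i < W.size(); ++i) {
        for (std::size_t j = 0; j < W[i].size(); ++j)
            out[i] += W[i][j] * summed_messages[j];  // W^l * sum_j f^{l-1}_j
        out[i] = std::max(0.0, out[i]);              // nonlinearity
    }
    return out;
}
```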

  8. Message Passing - What do others do?
  1. Weisfeiler-Lehman graph kernel (Shervashidze et al., 2011): an extension of the Weisfeiler-Lehman graph isomorphism test; applicable with Support Vector Machines and kernel methods.
  2. Neural Graph Fingerprint (Duvenaud et al., 2015): vector vertex representation; $\Phi$ as summation.
  3. Learning Convolutional Neural Networks for graphs (Niepert et al., 2016): flatten the graph into a long vector of vertices and apply traditional convolution on top.
  4. Gated Graph Neural Networks (Li et al., 2017): similar to Neural Graph Fingerprint, embedding an LSTM/GRU.
  5. Message Passing Neural Networks (Gilmer et al., 2017): a summary of the field.

  9. Message Passing - What do others miss?
  We will describe how our compositional architecture generalizes previous works, with an extension to higher-order representations that can retain structural information. Recent works on graph neural networks can all be seen as instances of zeroth order message passing, where each vertex representation is a vector (first order tensor) of $c$ channels in which each channel is a scalar (zeroth order P-tensor). This results in the loss of certain structural information during message aggregation, and the network loses the ability to learn the topological information of the graph's multiscale structure.

  10. Covariant Compositional Networks - Scheme
  Definition. Let $G$ be an object with $n$ elementary parts (atoms) $E = \{e_1, \ldots, e_n\}$. A compositional scheme for $G$ is a directed acyclic graph (DAG) $M$ in which each node $\nu$ is associated with some subset $P_\nu$ of $E$ (these subsets are called the parts of $G$) in such a way that:
  1. At the bottom level, there are exactly $n$ leaf nodes, each leaf node $\nu$ being associated with an elementary atom $e$; then $P_\nu$ contains the single atom $e$.
  2. $M$ has a unique root node $\nu_r$ that corresponds to the entire set $\{e_1, \ldots, e_n\}$.
  3. For any two nodes $\nu$ and $\nu'$, if $\nu$ is a descendant of $\nu'$, then $P_\nu$ is a subset of $P_{\nu'}$.

  11. Covariant Compositional Networks - Definition
  The compositional network $N$ is constructed as follows:
  1. In layer $\ell = 0$, each leaf node $\nu^0_i$ represents the single vertex $P^0_i = \{i\}$ for $i \in V$. The corresponding feature tensor $f^0_i$ is initialized by the vertex label $l_i$.
  2. In layers $\ell = 1, 2, \ldots, L$, node $\nu^\ell_i$ is connected to all nodes from the previous level that are neighbors of $i$ in $G$. The children of $\nu^\ell_i$ are $\{\nu^{\ell-1}_j : (i,j) \in E\}$. Thus, $P^\ell_i = \bigcup_{j : (i,j) \in E} P^{\ell-1}_j$. The feature tensor $f^\ell_i$ is computed as an aggregation of feature tensors in the previous layer: $f^\ell_i = \Phi(\{f^{\ell-1}_j \mid j \in P^\ell_i\})$, where $\Phi$ is some aggregation function.
  3. In layer $\ell = L + 1$, we have a single node $\nu_r$ that represents the entire graph and collects information from all nodes at level $\ell = L$: $P_r \equiv V$ and $f_r = \Phi(\{f^L_i \mid i \in P_r\})$.
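  A short sketch of how the receptive fields $P^\ell_i$ of the comp-net can be built from an adjacency list, following the definition above literally ($P^0_i = \{i\}$, and $P^\ell_i$ is the union of the previous-level receptive fields of $i$'s neighbors). The function name is hypothetical, not GraphFlow's API.

```cpp
// Builds P^l_i for l = 0..L from an adjacency list, following the comp-net
// construction above.
#include <set>
#include <vector>

std::vector<std::vector<std::set<int>>> build_receptive_fields(
        const std::vector<std::vector<int>>& neighbors, int L) {
    const int n = static_cast<int>(neighbors.size());
    std::vector<std::vector<std::set<int>>> P(
        L + 1, std::vector<std::set<int>>(n));
    for (int i = 0; i < n; ++i) P[0][i] = {i};              // leaf nodes nu^0_i
    for (int l = 1; l <= L; ++l)
        for (int i = 0; i < n; ++i)
            for (int j : neighbors[i])                      // children of nu^l_i
                P[l][i].insert(P[l - 1][j].begin(), P[l - 1][j].end());
    return P;
}
```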

  12. Covariant Compositional Networks - Covariance
  Definition. For a graph $G$ with comp-net $N$, and an isomorphic graph $G'$ with comp-net $N'$, let $\nu$ be any neuron of $N$ and $\nu'$ be the corresponding neuron of $N'$. Assume that $P_\nu = (e_{p_1}, \ldots, e_{p_m})$ while $P_{\nu'} = (e_{q_1}, \ldots, e_{q_m})$, and let $\pi \in S_m$ be the permutation that aligns the orderings of the two receptive fields, i.e., for which $e_{q_{\pi(a)}} = e_{p_a}$. We say that $N$ is covariant to permutations if for any $\pi$, there is a corresponding function $R_\pi$ such that $f_{\nu'} = R_\pi(f_\nu)$.

  13. Covariant Compositional Networks - First order
  We propose first order message passing by representing each vertex $v$ by a matrix $f^\ell_v \in \mathbb{R}^{|P^\ell_v| \times c}$, where each row of this feature matrix corresponds to a vertex in the neighborhood of $v$.
  Definition. We say that $\nu$ is a first order covariant node in a comp-net if, under the permutation of its receptive field $P_\nu$ by any $\pi \in S_{|P_\nu|}$, its activation transforms as $f_\nu \mapsto P_\pi f_\nu$, where $P_\pi$ is the permutation matrix
  $$[P_\pi]_{i,j} = \begin{cases} 1, & \pi(j) = i \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$
  The transformed activation $f_{\nu'}$ will be $[f_{\nu'}]_{a,s} = [f_\nu]_{\pi^{-1}(a),s}$, where $s$ is the channel index.
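  The following sketch illustrates the first order transformation rule $[f_{\nu'}]_{a,s} = [f_\nu]_{\pi^{-1}(a),s}$, i.e., $f_\nu \mapsto P_\pi f_\nu$, as a row permutation of the feature matrix. Storing $\pi$ as an index array with pi[a] the image of $a$ is an assumed representation.

```cpp
// Applies f -> P_pi f by moving row a of f to row pi(a) of the result,
// which is exactly [f']_{a,s} = [f]_{pi^{-1}(a),s}.
#include <vector>

using FeatureMatrix = std::vector<std::vector<double>>;   // |P_nu| x c

FeatureMatrix permute_rows(const FeatureMatrix& f, const std::vector<int>& pi) {
    const int m = static_cast<int>(f.size());
    FeatureMatrix out(m);
    for (int a = 0; a < m; ++a)
        out[pi[a]] = f[a];          // row a of f becomes row pi(a) of f'
    return out;
}
```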

  14. Covariant Compositional Networks - First order
  Figure: CCN 1D on the C$_2$H$_4$ molecular graph.

  15. Covariant Compositional Networks - Second order
  Instead of representing a vertex with a feature matrix as done in first order message passing, we can represent it by a third order tensor $f^\ell_v \in \mathbb{R}^{|P^\ell_v| \times |P^\ell_v| \times c}$ and require these feature tensors to transform covariantly in a similar manner:
  Definition. We say that $\nu$ is a second order covariant node in a comp-net if, under the permutation of its receptive field $P_\nu$ by any $\pi \in S_{|P_\nu|}$, its activation transforms as $f_\nu \mapsto P_\pi f_\nu P_\pi^{T}$. The transformed activation $f_{\nu'}$ will be $[f_{\nu'}]_{a,b,s} = [f_\nu]_{\pi^{-1}(a),\pi^{-1}(b),s}$, where $s$ is the channel index.
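  The analogous sketch for second order activations applies $f_\nu \mapsto P_\pi f_\nu P_\pi^{T}$ as the index relabeling $[f_{\nu'}]_{a,b,s} = [f_\nu]_{\pi^{-1}(a),\pi^{-1}(b),s}$; the data layout f[a][b][s] is only an assumption for illustration.

```cpp
// Applies f -> P_pi f P_pi^T by relabeling both receptive-field indices,
// leaving the channel dimension untouched.
#include <vector>

using FeatureTensor = std::vector<std::vector<std::vector<double>>>;  // f[a][b][s]

FeatureTensor permute_both(const FeatureTensor& f, const std::vector<int>& pi) {
    const int m = static_cast<int>(f.size());
    FeatureTensor out(m, std::vector<std::vector<double>>(m));
    for (int a = 0; a < m; ++a)
        for (int b = 0; b < m; ++b)
            out[pi[a]][pi[b]] = f[a][b];   // channel vector copied wholesale
    return out;
}
```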

  16. Covariant Compositional Networks - Second order
  Figure: CCN 2D on the C$_2$H$_4$ molecular graph.

  17. Covariant Compositional Networks - Second order
  $$T_{i_1, i_2, i_3, i_4, i_5, i_6} = (F_{i_1})_{i_2, i_3, i_6} \cdot A_{i_4, i_5} \qquad (2)$$
  1. The 1+1+1 case contracts $T$ in the form $T_{i_1, i_2, i_3, i_4, i_5}\, \delta^{i_{a_1}} \delta^{i_{a_2}} \delta^{i_{a_3}}$, i.e., it projects $T$ down along 3 of its 5 dimensions. This can be done in $\binom{5}{3} = 10$ ways.
  2. The 1+2 case contracts $T$ in the form $T_{i_1, i_2, i_3, i_4, i_5}\, \delta^{i_{a_1}} \delta^{i_{a_2}, i_{a_3}}$, i.e., it projects $T$ along one dimension and contracts it along two others. This can be done in $3\binom{5}{3} = 30$ ways.
  3. The 3 case is a single 3-fold contraction $T_{i_1, i_2, i_3, i_4, i_5}\, \delta^{i_{a_1}, i_{a_2}, i_{a_3}}$. This can be done in $\binom{5}{3} = 10$ ways.
  In total, we have 50 different contractions, which result in 50 times more channels. In practice, we only implement 18 contractions for efficiency.
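  As an illustration, the following minimal sketch implements one such contraction, a single 3-fold contraction (the 3 case) of a 5-index tensor with the channel index suppressed. The flat storage layout and the choice of contracting $i_1 = i_2 = i_3$ are arbitrary assumptions; this is not the specific contraction set used in GraphFlow.

```cpp
// One 3-fold contraction: sum over the diagonal i1 = i2 = i3 of T,
// leaving a 2-index tensor in (i4, i5).
#include <vector>

using Tensor5 = std::vector<double>;   // T[i1][i2][i3][i4][i5] stored flat, size n^5

std::vector<std::vector<double>> contract_3fold(const Tensor5& T, int n) {
    std::vector<std::vector<double>> out(n, std::vector<double>(n, 0.0));
    for (int k = 0; k < n; ++k)                 // i1 = i2 = i3 = k
        for (int i4 = 0; i4 < n; ++i4)
            for (int i5 = 0; i5 < n; ++i5)
                out[i4][i5] += T[(((k * n + k) * n + k) * n + i4) * n + i5];
    return out;
}
```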
