  1. Kipf, T., Welling, M.: Semi-Supervised Classification with Graph Convolutional Networks. Radim Špetlík, Czech Technical University in Prague

  2. Overview - Kipf and Welling - use a first-order approximation in the Fourier domain to obtain an efficient linear-time graph CNN - apply the approximation to the semi-supervised graph node classification problem

  3. Graph Adjacency Matrix $A$ - symmetric, square matrix - $A_{jk} = 1$ iff vertices $v_j$ and $v_k$ are adjacent (connected by an edge) - $A_{jk} = 0$ otherwise http://mathworld.wolfram.com/AdjacencyMatrix.html
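
A minimal NumPy sketch (mine, not from the slides) of what such an adjacency matrix looks like for a small undirected graph; the edge list and node count are made up for illustration:

```python
import numpy as np

# Undirected graph with 4 nodes and edges (0,1), (1,2), (2,3).
edges = [(0, 1), (1, 2), (2, 3)]
N = 4

A = np.zeros((N, N))
for j, k in edges:
    A[j, k] = 1.0  # vertices v_j and v_k are adjacent
    A[k, j] = 1.0  # symmetry: the graph is undirected

print(A)
```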

  4. Graph Convolutional Network - given a graph $G = (V, E)$, a graph-CNN is a function which: - takes as input: - a feature description $x_j \in \mathbb{R}^D$ for every node $j$, summarized in a matrix $X \in \mathbb{R}^{N \times D}$, where $N$ is the number of nodes and $D$ is the number of input features - a description of the graph structure in matrix form, typically an adjacency matrix $A$ - produces: - a node-level output $Z \in \mathbb{R}^{N \times F}$, where $F$ is the number of output features per node

  5. Graph Convolutional Network - is composed of non-linear functions $H^{(l+1)} = f(H^{(l)}, A)$, where $H^{(0)} = X$, $H^{(L)} = Z$, and $L$ is the number of layers.

  6. Graph Convolutional Network - graphically: https://tkipf.github.io/graph-convolutional-networks/

  7. Graph Convolutional Network - let's start with a simple layer-wise propagation rule $f(H^{(l)}, A) = \sigma(A H^{(l)} W^{(l)})$, where $W^{(l)} \in \mathbb{R}^{D_l \times D_{l+1}}$ is the weight matrix of the $l$-th neural network layer, $\sigma(\cdot)$ is a non-linear activation function, $A \in \mathbb{R}^{N \times N}$ is the adjacency matrix, $N$ is the number of nodes, and $H^{(l)} \in \mathbb{R}^{N \times D_l}$ https://samidavies.wordpress.com/2016/09/20/whats-up-with-the-graph-laplacian/
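
A sketch of this naive propagation rule in NumPy (my own illustration, not the authors' code), using ReLU as the non-linearity $\sigma$; the toy sizes are arbitrary:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def simple_gcn_layer(A, H, W):
    """One layer of the naive rule f(H, A) = sigma(A H W).

    A: (N, N) adjacency matrix
    H: (N, D_l) node features from the previous layer
    W: (D_l, D_{l+1}) trainable weight matrix
    """
    return relu(A @ H @ W)

# Toy example: 4 nodes, 3 input features, 2 output features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H0 = rng.normal(size=(4, 3))
W0 = rng.normal(size=(3, 2))
print(simple_gcn_layer(A, H0, W0).shape)  # (4, 2)
```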

  8. Graph Convolutional Network - multiplication with $A$ alone is not enough, we are missing the node itself (no self-loops) in $f(H^{(l)}, A) = \sigma(A H^{(l)} W^{(l)})$ - we fix it by $f(H^{(l)}, A) = \sigma(\hat{A} H^{(l)} W^{(l)})$, where $\hat{A} = A + I$ and $I$ is the identity matrix

  9. Graph Convolutional Network - $\hat{A}$ is typically not normalized, so the multiplication in $f(H^{(l)}, A) = \sigma(\hat{A} H^{(l)} W^{(l)})$ would change the scale of the feature vectors $H^{(l)}$ - we fix that by symmetric normalization, i.e. $\hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2}$, where $\hat{D}$ is the diagonal node degree matrix of $\hat{A}$, $\hat{D}_{jj} = \sum_k \hat{A}_{jk}$, producing $f(H^{(l)}, A) = \sigma(\hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2} H^{(l)} W^{(l)})$
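
A sketch of the fixed rule with self-loops and symmetric normalization (function and variable names are mine); the normalized adjacency can be pre-computed once and reused in every layer:

```python
import numpy as np

def normalized_adjacency(A):
    """Compute D_hat^{-1/2} (A + I) D_hat^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])              # add self-loops: A_hat = A + I
    d_hat = A_hat.sum(axis=1)                   # node degrees of A_hat
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d_hat))  # never zero thanks to self-loops
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_layer(A_norm, H, W, activation=np.tanh):
    """f(H, A) = sigma(A_norm H W) with the pre-normalized adjacency."""
    return activation(A_norm @ H @ W)
```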

  10. Graph Convolutional Network - examining a single layer with a single filter parameter $\theta \in \mathbb{R}$ and a single feature channel, i.e. a graph signal $x \in \mathbb{R}^N$ (one scalar per node)

  11. Graph Convolutional Network - renormalization trick: $\hat{A} = A + I$, $\hat{D}_{jj} = \sum_k \hat{A}_{jk}$

  12. Graph Convolutional Network - first-order approximation: $g_{\theta'} \star x \approx \theta'_0 x + \theta'_1 (L - I_N)\, x$, restricted to a single shared parameter $\theta = \theta'_0 = -\theta'_1$

  13. Graph Convolutional Network - spectral graph convolution: $g_\theta \star x = U g_\theta U^\top x$, i.e. graph Fourier transform ($U^\top x$), filtering (multiplication by $g_\theta$), inverse Fourier transform (multiplication by $U$) - Chebyshev approximation on the rescaled Laplacian $\tilde{L} = \frac{2}{\lambda_{\max}} L - I_N$ - first-order term: $g_{\theta'} \star x \approx \theta'_0 x + \theta'_1 (L - I_N)\, x$
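
For intuition only, a dense NumPy sketch of the exact spectral filtering $g_\theta \star x = U g_\theta U^\top x$ (mine; it requires a full eigendecomposition, which is exactly what the approximation below avoids, and it assumes a graph with no isolated nodes):

```python
import numpy as np

def normalized_laplacian(A):
    """L = I - D^{-1/2} A D^{-1/2} for an undirected graph (no isolated nodes)."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.eye(A.shape[0]) - D_inv_sqrt @ A @ D_inv_sqrt

def spectral_filter(A, x, g_theta):
    """Filter the graph signal x in the Fourier (eigenvector) basis.

    g_theta: array of length N, one filter coefficient per eigenvalue.
    """
    L = normalized_laplacian(A)
    lam, U = np.linalg.eigh(L)   # eigenvectors U form the graph Fourier basis
    x_hat = U.T @ x              # Fourier transform of the signal
    y_hat = g_theta * x_hat      # filtering in the spectral domain
    return U @ y_hat             # inverse Fourier transform
```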

  14. Graph Convolutional Network - an efficient graph convolution is obtained by interpreting this multiplication as a (first-order) Chebyshev-polynomial approximation of the convolution in the Fourier domain - the resulting layer has complexity $\mathcal{O}(|E|\, D_l\, D_{l+1})$, i.e. linear in the number of edges, where $N$ is the number of nodes, $|E|$ the number of edges, $D_l$ the number of input channels, and $D_{l+1}$ the number of output channels.
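
A sketch of $K$-term Chebyshev filtering, the approximation that Kipf and Welling truncate to first order (my own illustration; dense matrices are used here for brevity, whereas the linear-in-edges cost assumes sparse matrix-vector products):

```python
import numpy as np

def chebyshev_filter(L, x, theta, lam_max=2.0):
    """Approximate g_theta(L) x with K = len(theta) Chebyshev terms (K >= 2).

    L: (N, N) normalized graph Laplacian
    x: (N,) graph signal
    theta: (K,) Chebyshev coefficients
    """
    N = L.shape[0]
    L_tilde = (2.0 / lam_max) * L - np.eye(N)     # rescale eigenvalues to [-1, 1]
    T_prev, T_curr = x, L_tilde @ x               # T_0(L~) x and T_1(L~) x
    out = theta[0] * T_prev + theta[1] * T_curr
    for k in range(2, len(theta)):
        T_next = 2.0 * L_tilde @ T_curr - T_prev  # Chebyshev recurrence
        out += theta[k] * T_next
        T_prev, T_curr = T_curr, T_next
    return out
```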

  15. Overview - Kipf and Welling - use a first-order approximation in the Fourier domain to obtain an efficient linear-time graph CNN - apply the approximation to the semi-supervised graph node classification problem

  16. Semi-supervised Classification Task ▪ given a point set $X = \{x_1, \ldots, x_l, x_{l+1}, \ldots, x_n\}$ ▪ and a label set $\mathcal{L} = \{1, \ldots, c\}$, where - the first $l$ points have labels $y_1, \ldots, y_l \in \mathcal{L}$ - the remaining points are unlabeled - $c$ is the number of classes ▪ the goal is to - predict the labels of the unlabeled points
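
A small sketch (hypothetical variable names, toy sizes) of how this labeled/unlabeled split is commonly represented in code, with a boolean mask marking the $l$ labeled nodes so that a loss can later be restricted to them:

```python
import numpy as np

n, l, c = 10, 4, 3                       # nodes, labeled nodes, classes
labels = np.full(n, -1)                  # -1 marks an unlabeled node
labels[:l] = np.random.randint(0, c, l)  # only the first l nodes carry labels
labeled_mask = labels >= 0               # used later to restrict the loss
```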

  17. Semi-supervised Classification Task ▪ graphically: https://papers.nips.cc/paper/2506-learning-with-local-and-global-consistency.pdf

  18. graph-CNN EXAMPLE ▪ example: - two-layer graph-CNN $Z = f(X, A) = \mathrm{softmax}\big(\tilde{A}\, \mathrm{ReLU}(\tilde{A} X W^{(0)})\, W^{(1)}\big)$, with the pre-computed normalized adjacency $\tilde{A} = \hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2}$, where $W^{(0)} \in \mathbb{R}^{D \times H}$ with $D$ input channels and $H$ feature maps, and $W^{(1)} \in \mathbb{R}^{H \times F}$ with $F$ output features per node
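
A NumPy sketch of this two-layer forward pass (mine, not the reference implementation), where `A_norm` is the pre-computed normalized adjacency $\tilde{A}$ from the earlier sketch:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # row-wise, numerically stable
    return e / e.sum(axis=1, keepdims=True)

def two_layer_gcn(A_norm, X, W0, W1):
    """Z = softmax( A_norm ReLU( A_norm X W0 ) W1 ).

    X:  (N, D) input features, W0: (D, H), W1: (H, F)
    Returns Z: (N, F) class probabilities per node.
    """
    H1 = np.maximum(A_norm @ X @ W0, 0.0)   # hidden layer with ReLU
    return softmax(A_norm @ H1 @ W1)        # row-wise softmax over classes
```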

  19. Graph Convolutional Network - graphically: https://arxiv.org/pdf/1609.02907.pdf

  20. graph-CNN EXAMPLE ▪ objective function: - cross-entropy over the labeled examples, $\mathcal{L} = -\sum_{l \in \mathcal{Y}_L} \sum_{f=1}^{F} Y_{lf} \ln Z_{lf}$, where $\mathcal{Y}_L$ is the set of node indices that have labels, $Z_{lf}$ is the element in the $l$-th row and $f$-th column of $Z$, and the ground truth $Y_{lf}$ is 1 iff node $l$ belongs to class $f$.
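
The masked cross-entropy, sketched in NumPy (names are mine; `Y` is a one-hot ground-truth matrix and `labeled_idx` plays the role of the index set $\mathcal{Y}_L$):

```python
import numpy as np

def masked_cross_entropy(Z, Y, labeled_idx, eps=1e-12):
    """L = - sum_{l in Y_L} sum_f Y_{lf} * ln Z_{lf}.

    Z: (N, F) predicted class probabilities
    Y: (N, F) one-hot ground-truth labels
    labeled_idx: indices of nodes whose labels are known
    """
    Z_l = Z[labeled_idx]
    Y_l = Y[labeled_idx]
    return -np.sum(Y_l * np.log(Z_l + eps))  # eps guards against log(0)
```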

  21. graph-CNN EXAMPLE - RESULTS ▪ weights trained with gradient descent
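
As a sketch of what one such gradient-descent step could look like, here is a simplified single-layer softmax model $Z = \mathrm{softmax}(\tilde{A} X W)$ (my simplification of the two-layer network above, not the authors' training code); it uses the standard softmax cross-entropy gradient restricted to labeled nodes:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def gradient_step(A_norm, X, W, Y, labeled_mask, lr=0.1):
    """One gradient-descent step for Z = softmax(A_norm X W).

    dL/dW = (A_norm X)^T (Z - Y), with unlabeled rows zeroed out
    so that only labeled nodes contribute to the loss.
    """
    M = A_norm @ X                 # pre-aggregated features
    Z = softmax(M @ W)
    G = Z - Y                      # gradient w.r.t. the logits
    G[~labeled_mask] = 0.0         # unlabeled nodes contribute nothing
    grad_W = M.T @ G
    return W - lr * grad_W
```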

  22. graph-CNN EXAMPLE - RESULTS ▪ different variants of propagation models

  23. graph-CNN another EXAMPLE ▪ 3-layer GCN, "karate club" problem, one labeled example per class: 300 training iterations

  24. Limitations - memory grows linearly with the size of the dataset - only works with undirected graphs - assumption of locality - assumption of equal importance of self-connections vs. edges to neighbouring nodes; a possible remedy is $\hat{A} = A + \lambda I$, where $\lambda$ is a learnable parameter.
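
A sketch of that last remedy, a trade-off parameter $\lambda$ on the self-loop term (illustrative only; in practice $\lambda$ would be learned by gradient descent together with the weights):

```python
import numpy as np

def normalized_adjacency_with_lambda(A, lam):
    """D_hat^{-1/2} (A + lam * I) D_hat^{-1/2} with a trade-off parameter lam."""
    A_hat = A + lam * np.eye(A.shape[0])   # lam weights self-connections vs. neighbours
    d_hat = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d_hat))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt
```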

  25. Summary - Kipf and Welling - use a first-order approximation in the Fourier domain to obtain an efficient linear-time graph CNN - apply the approximation to the semi-supervised graph node classification problem

  26. Thank you very much for your time…

  27. Answers to Questions - $\hat{A} = A + \lambda I_N$ - the lambda parameter would control the influence of neighbouring edges vs. self-connections - how (or why) would the lambda parameter also trade off between supervised and unsupervised learning?
