
node2vec: Scalable Feature Learning for Networks (Aditya Grover) - PowerPoint PPT Presentation



  1. node2vec: Scalable Feature Learning for Networks. Aditya Grover and Jure Leskovec. KDD 2016. Presented by Haoxiang Wang. Feb 26, 2020.

  2. Node Embeddings (figure: an input graph is mapped to output embeddings of its nodes, e.g., A and B)
     - Intuition: Find embeddings of nodes in a d-dimensional space so that "similar" nodes in the graph have embeddings that are close together.

  3. Setup
     - Assume we have a graph G, where:
       - V is the vertex set (i.e., the node set).
       - A is the adjacency matrix (assumed binary).
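
A minimal Python sketch of this setup on a toy graph; the specific edges, and the adjacency-list dictionary `adj` kept alongside A, are illustrative choices reused in the sketches below, not anything specified on the slides.

```python
import numpy as np

# Toy undirected graph: V = {0, 1, 2, 3}, edges 0-1, 0-2, 1-2, 2-3 (illustrative only).
V = [0, 1, 2, 3]
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]

# Binary adjacency matrix A (A[u, v] = 1 iff u and v are connected).
A = np.zeros((len(V), len(V)), dtype=int)
for u, v in edges:
    A[u, v] = A[v, u] = 1

# Equivalent adjacency-list view, used by the random-walk sketches below.
adj = {u: [v for v in V if A[u, v]] for u in V}
```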

  4. Embedding Nodes
     - Goal: encode nodes so that similarity in the embedding space (e.g., the dot product) approximates similarity in the original network:
       similarity(u, v) ≈ z_v^T z_u
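
As a concrete illustration of this goal, the embeddings can be stored as rows of a matrix Z and the similarity decoded as a dot product; the dimension d = 16 and the random initialization here are placeholders standing in for learned values.

```python
import numpy as np

d = 16                              # embedding dimension (placeholder choice)
rng = np.random.default_rng(0)
Z = rng.normal(size=(4, d))         # one row z_u per node of the toy graph above

def similarity(u: int, v: int) -> float:
    """Decoded similarity z_v^T z_u in the embedding space."""
    return float(Z[v] @ Z[u])
```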

  5. Random Walk Embeddings: Basic Idea
     - z_u^T z_v ≈ probability that u and v co-occur on a random walk over the network.
     1. Estimate the probability of visiting node v on a random walk starting from node u, using some random walk strategy R.
     2. Optimize the embeddings to encode these random walk statistics.

  6. Algorithm / Optimization of Random Walk Embeddings
     1. Run short random walks starting from each node on the graph using some strategy R.
     2. For each node u, collect N_R(u), the multiset* of nodes visited on random walks starting from u. (*N_R(u) can have repeated elements, since nodes can be visited multiple times on random walks.)
     3. Optimize the embeddings according to:
        L = Σ_{u ∈ V} Σ_{v ∈ N_R(u)} −log P(v | z_u),
        where P(v | z_u) = exp(z_u^T z_v) / Σ_{n ∈ V} exp(z_u^T z_n).
     In practice, the denominator is approximated by random sampling of nodes from some distribution (negative sampling).
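
A minimal, unoptimized sketch of these three steps, assuming the toy adjacency-list dict `adj` and embedding matrix `Z` from the sketches above, uniform random walks as the strategy R, and the full softmax written on the slide (which in practice would be approximated by sampling).

```python
import random
import numpy as np

def random_walk(adj, start, walk_length=10):
    """Step 1: one short random walk from `start` (strategy R = uniform neighbor choice)."""
    walk = [start]
    for _ in range(walk_length - 1):
        neighbors = adj[walk[-1]]
        if not neighbors:
            break
        walk.append(random.choice(neighbors))
    return walk

def collect_neighborhoods(adj, num_walks=10, walk_length=10):
    """Step 2: N_R(u), the multiset of nodes visited on walks starting from u."""
    return {u: [v for _ in range(num_walks)
                  for v in random_walk(adj, u, walk_length)[1:]]
            for u in adj}

def objective(Z, N):
    """Step 3: L = sum_u sum_{v in N_R(u)} -log P(v | z_u), using the full softmax."""
    loss = 0.0
    for u, visited in N.items():
        scores = Z @ Z[u]                            # z_u^T z_n for every node n in V
        log_denominator = np.log(np.exp(scores).sum())
        loss += sum(log_denominator - scores[v] for v in visited)
    return loss
```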

  7. node2vec: Biased Random Walks
     - Idea: use flexible, biased random walks that can trade off between local and global views of the network (Grover and Leskovec, 2016).
     - BFS (Breadth-First Search) and DFS (Depth-First Search) are two classic strategies for defining a neighborhood N_R(u) of a given node u:
       - N_BFS(u) = {s_1, s_2, s_3}: local, microscopic view.
       - N_DFS(u) = {s_4, s_5, s_6}: global, macroscopic view.
     (Figure: node u with its BFS neighborhood s_1, s_2, s_3 and its DFS neighborhood s_4, s_5, s_6.)
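
For illustration, the two neighborhood definitions could be sketched as follows over the same adjacency-list dict `adj`; the size k = 3 matches the slide, everything else is an assumption.

```python
from collections import deque

def bfs_neighborhood(adj, u, k=3):
    """N_BFS(u): the first k nodes reached by breadth-first search (local, microscopic view)."""
    seen, order, queue = {u}, [], deque([u])
    while queue and len(order) < k:
        for w in adj[queue.popleft()]:
            if w not in seen:
                seen.add(w)
                order.append(w)
                queue.append(w)
    return order[:k]

def dfs_neighborhood(adj, u, k=3):
    """N_DFS(u): the first k nodes reached by depth-first search (global, macroscopic view)."""
    seen, order, stack = {u}, [], list(reversed(adj[u]))
    while stack and len(order) < k:
        cur = stack.pop()
        if cur in seen:
            continue
        seen.add(cur)
        order.append(cur)
        stack.extend(reversed(adj[cur]))
    return order
```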

  8. Combine BFS + DFS by a Ratio
     - Biased random walk strategy R that, given a node u, generates a neighborhood N_R(u).
     - The walker is at w; where does it go next? (Figure: unnormalized transition probabilities from w are 1/q back to the previous node s_1, 1 to a node adjacent to s_1, and 1/r to nodes farther from s_1.)
     - Two parameters:
       - Return parameter q: return back to the previous node. BFS-like walk: low value of q.
       - Walk-away parameter r: moving outwards (DFS) vs. inwards (BFS). DFS-like walk: low value of r.
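
A sketch of this biased transition rule under the slide's notation, where q and r play the roles of the return parameter p and in-out parameter q in the node2vec paper; `adj` is again the assumed adjacency-list dict, and for weighted graphs each value would also be multiplied by the edge weight.

```python
import random

def biased_step(adj, prev, cur, q=1.0, r=1.0):
    """One step of the biased walk: the walker came from `prev` and sits at `cur`.
    Unnormalized probability of moving to a neighbor v of cur:
      1/q if v == prev               (return to the previous node),
      1   if v is a neighbor of prev (stay at the same distance, BFS-like),
      1/r otherwise                  (move farther away, DFS-like)."""
    neighbors = adj[cur]
    weights = [(1.0 / q) if v == prev
               else 1.0 if v in adj[prev]
               else (1.0 / r)
               for v in neighbors]
    return random.choices(neighbors, weights=weights, k=1)[0]

def biased_walk(adj, start, walk_length=10, q=1.0, r=1.0):
    """A node2vec-style walk: first step uniform, later steps biased.
    Assumes every node has at least one neighbor."""
    walk = [start, random.choice(adj[start])]
    while len(walk) < walk_length:
        walk.append(biased_step(adj, walk[-2], walk[-1], q, r))
    return walk
```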

  9. Benchmarks: Node Classification & Link Prediction (figure: node classification infers missing node labels "?" via machine learning on the embeddings; link prediction infers missing edges "?" via machine learning)
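
An illustrative sketch of how the learned embeddings feed these two benchmarks, using synthetic stand-in data and scikit-learn (neither comes from the slides): the embedding z_u itself is the feature vector for node classification, and the element-wise (Hadamard) product z_u * z_v, one of the binary operators evaluated in the node2vec paper, serves as the edge feature for link prediction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
Z = rng.normal(size=(100, 16))                 # stand-in for learned node embeddings

# Node classification: z_u is the feature vector for node u.
train_nodes = np.arange(80)
train_labels = rng.integers(0, 2, size=80)     # synthetic labels, for illustration only
node_clf = LogisticRegression(max_iter=1000).fit(Z[train_nodes], train_labels)

# Link prediction: represent a candidate edge (u, v) by the Hadamard product z_u * z_v
# and train a classifier to separate true edges from non-edges.
candidate_edges = rng.integers(0, 100, size=(200, 2))   # synthetic (u, v) pairs
edge_labels = rng.integers(0, 2, size=200)               # synthetic edge/non-edge labels
edge_features = Z[candidate_edges[:, 0]] * Z[candidate_edges[:, 1]]
link_clf = LogisticRegression(max_iter=1000).fit(edge_features, edge_labels)
```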

  10. Empirical Results (figures/tables for link prediction and node classification not reproduced in this transcript)

  11. Advantages of node2vec
     - node2vec performs better on node classification than other node embedding methods.
     - Random walk approaches are generally more efficient (i.e., O(|E|) vs. O(|V|^2)).
     - (Note: in general, one must choose a definition of node similarity that matches the application.)

  12. Other random-walk node embedding works
     - Different kinds of biased random walks:
       - based on node attributes (Dong et al., 2017);
       - based on learned weights (Abu-El-Haija et al., 2017).
     - Alternative optimization schemes:
       - directly optimize based on 1-hop and 2-hop random walk probabilities (as in LINE; Tang et al., 2015).
     - Network preprocessing techniques:
       - run random walks on modified versions of the original network (e.g., struc2vec, Ribeiro et al., 2017; HARP, Chen et al., 2016).
