SLIDE 1
node2vec:
Scalable Feature Learning for Networks
Aditya Grover and Jure Leskovec. KDD 2016.
Presented by Haoxiang Wang. Feb 26, 2020.
SLIDE 2 Node Embeddings
´Intuition: Find embeddings of nodes in a d- dimensional space so that “similar” nodes in the graph have embeddings that are close together.
[Figure: an encoder maps each input node (A, B) in the graph to an output embedding]
SLIDE 3
Setup
´Assume we have a graph G:
´V is the vertex set (i.e., node set).
´A is the adjacency matrix (assume binary).
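As a concrete illustration of this setup, here is a minimal sketch of a graph stored as a binary adjacency matrix (the 4-node graph and its edge list are made up for the example):

```python
import numpy as np

# A small undirected graph on 4 nodes, stored as a binary adjacency matrix A.
# Edges (illustrative): (0,1), (1,2), (2,3), (0,2)
A = np.zeros((4, 4), dtype=int)
for u, v in [(0, 1), (1, 2), (2, 3), (0, 2)]:
    A[u, v] = A[v, u] = 1  # undirected: symmetric entries

V = list(range(A.shape[0]))  # vertex set
num_edges = A.sum() // 2     # each undirected edge is counted twice -> 4
```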
SLIDE 4
Embedding Nodes
´ Goal: to encode nodes so that similarity in the embedding space (e.g., dot product) approximates similarity in the original network.
similarity(u, v) ≈ z_v^T z_u
SLIDE 5 Random Walk Embeddings: Basic Idea
- 1. Estimate the probability of visiting node v on a random walk starting from node u, using some random walk strategy R.
- 2. Optimize embeddings to encode these random walk statistics:

z_u^T z_v ≈ (probability that u and v co-occur on a random walk over the network)
SLIDE 6 Algorithm/Optimization of Random Walk Embeddings
1. Run short random walks starting from each node on the graph using some strategy R.
2. For each node u, collect N_R(u), the multiset* of nodes visited on random walks starting from u. (*N_R(u) can have repeated elements, since nodes can be visited multiple times on random walks.)
3. Optimize embeddings according to:
L = Σ_{u∈V} Σ_{v∈N_R(u)} −log P(v | z_u)

P(v | z_u) = exp(z_u^T z_v) / Σ_{n∈V} exp(z_u^T z_n)

In practice, the normalization sum is approximated by random sampling based on some distribution over nodes (negative sampling).
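The three steps above can be sketched as follows. This is a toy illustration, not the paper's implementation: the graph, walk length, and walk count are made up, and the loss uses the full softmax rather than negative sampling:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph as adjacency lists (illustrative)
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
V = sorted(adj)

def random_walk(u, length):
    # Step 1: strategy R = uniform random walk of fixed length from u
    walk = [u]
    for _ in range(length - 1):
        walk.append(int(rng.choice(adj[walk[-1]])))
    return walk

# Step 2: N_R(u), the multiset of nodes visited on 10 walks of length 5 from u
N = {u: [v for _ in range(10) for v in random_walk(u, 5)[1:]] for u in V}

def loss(Z):
    # Step 3: L = sum_u sum_{v in N_R(u)} -log P(v | z_u),
    # with P(v | z_u) a softmax over all nodes (negative sampling in practice)
    total = 0.0
    for u in V:
        scores = Z @ Z[u]                              # z_u . z_n for all n
        log_p = scores - np.log(np.exp(scores).sum())  # log softmax
        total -= sum(log_p[v] for v in N[u])
    return total

Z = rng.normal(scale=0.1, size=(len(V), 4))
total_loss = loss(Z)  # always positive, since each -log P(v|z_u) > 0
```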
SLIDE 7
Node2vec: Biased Random Walks
´ Idea: use flexible, biased random walks that can trade off between local and global views of the network (Grover and Leskovec, 2016).
´ BFS (Breadth-First Search) and DFS (Depth-First Search): two classic strategies to define a neighborhood N_R(u) of a given node u:
[Figure: graph with start node u and nodes s1–s9]
N_BFS(u) = {s1, s2, s3} (local microscopic view)
N_DFS(u) = {s4, s5, s6} (global macroscopic view)
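The two neighborhood strategies can be contrasted with a short sketch. The small graph below is illustrative (not the slide's exact figure): a BFS neighborhood collects the k nearest nodes, while a DFS neighborhood walks deeper before exploring siblings:

```python
from collections import deque

# Illustrative graph: node 0 plays the role of u
adj = {0: [1, 2, 3], 1: [0, 4], 2: [0, 5], 3: [0], 4: [1, 6], 5: [2], 6: [4]}

def bfs_neighborhood(u, k):
    # BFS: the k nearest nodes -> local, microscopic view
    seen, order, q = {u}, [], deque([u])
    while q and len(order) < k:
        for v in adj[q.popleft()]:
            if v not in seen:
                seen.add(v); order.append(v); q.append(v)
                if len(order) == k:
                    break
    return order

def dfs_neighborhood(u, k):
    # DFS: go deeper before exploring siblings -> global, macroscopic view
    seen, order, stack = {u}, [], [u]
    while stack and len(order) < k:
        x = stack[-1]
        for v in adj[x]:
            if v not in seen:
                seen.add(v); order.append(v); stack.append(v)
                break
        else:
            stack.pop()  # dead end: backtrack
    return order
```

On this graph, `bfs_neighborhood(0, 3)` stays among the immediate neighbors of node 0, while `dfs_neighborhood(0, 3)` reaches nodes two and three hops away.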
SLIDE 8
Combine BFS + DFS by a Ratio
Biased random walk strategy S that, given a node v, generates a neighborhood N_S(v).
´Two parameters:
´Return parameter q: return back to the previous node.
´Walk-away parameter r: moving outwards (DFS) vs. inwards (BFS).
[Figure: walker came from s1 and is now at w, with edges from w to s1, s2, and s3]
Walker is at w. Where to go next?
Unnormalized transition probabilities from w: 1/q back to s1, 1 to s2 (same distance from s1), 1/r to s3 (farther from s1).
BFS-like walk: low value of q. DFS-like walk: low value of r.
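One step of this biased walk can be sketched as below, using the slide's q/r notation. The toy graph and node numbering (s1 = node 0, w = node 1) are made up for the example:

```python
import random

# Toy graph as adjacency lists (illustrative); walker came from 0 and is at 1
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}

def biased_step(prev, curr, adj, q, r):
    """Sample the walker's next node given the previous and current node.

    Unnormalized transition probabilities from curr:
      1/q  to return to prev,
      1    to a node at the same distance from prev (a neighbor of prev),
      1/r  to move outwards (farther from prev).
    """
    weights = []
    for x in adj[curr]:
        if x == prev:
            weights.append(1.0 / q)   # return to the previous node
        elif x in adj[prev]:
            weights.append(1.0)       # same distance from prev (BFS-like)
        else:
            weights.append(1.0 / r)   # move outwards (DFS-like)
    return random.choices(adj[curr], weights=weights, k=1)[0]

random.seed(0)
nxt = biased_step(0, 1, adj, q=2.0, r=0.5)  # low r favors walking away
```

Setting q low makes returning to the previous node likely (BFS-like exploration near the start node); setting r low makes moving outwards likely (DFS-like exploration).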
SLIDE 9 Benchmarks: Node Classification & Link Prediction
[Figure: two tasks, each solved with machine learning on node embeddings. Node classification: predict the unknown ("?") labels of nodes. Link prediction: predict missing ("?") links between nodes.]
SLIDE 10
Empirical Results
[Figures: results on node classification and link prediction benchmarks]
SLIDE 11
Advantages of Node2Vec
´node2vec performs better on node classification compared with other node embedding methods.
´Random walk approaches are generally more efficient (i.e., O(|E|) vs. O(|V|^2)).
´Note: in general, one must choose a definition of node similarity that matches the application.
SLIDE 12 Other random walk node embedding works
´ Different kinds of biased random walks:
´Based on node attributes (Dong et al., 2017).
´Based on learned weights (Abu-El-Haija et al., 2017).
´ Alternative optimization schemes:
´Directly optimize based on 1-hop and 2-hop random walk probabilities (as in LINE from Tang et al., 2015).
´ Network preprocessing techniques:
´Run random walks on modified versions of the original network (e.g., struct2vec from Ribeiro et al. 2017, Chen et