Representation Learning on Networks
Yuxiao Dong
Microsoft Research, Redmond Joint work with Jiezhong Qiu, Jie Zhang, Jie Tang (Tsinghua University) Hao Ma (MSR & Facebook AI) and Kuansan Wang (MSR)
Networks: economic networks, social networks, networks of neurons, biomedical networks, the Internet, information networks.
Slides credit: Jure Leskovec
Feature engineering:
X: hand-crafted feature matrix → machine learning models → graph & network applications
x_uk: node u's k-th feature, e.g., u's PageRank value
Feature learning (instead of feature engineering):
Z: latent feature matrix → machine learning models → graph & network applications
w_{i-2}, w_{i-1}, w_i, w_{i+1}, w_{i+2} (skip-gram context window around w_i)
Perozzi et al. DeepWalk: Online learning of social representations. In KDD '14, pp. 701–710.
skip-gram (word2vec)
- DeepWalk (walk length > 1)
- LINE (walk length = 1)
- node2vec
- metapath2vec
Microsoft Academic Graph
metapath2vec
[Embedding visualization: institutions such as Harvard, Stanford, Columbia, Yale, UChicago, Johns Hopkins, MIT, CMU, and companies such as Microsoft, Google, Facebook, AT&T Labs]
Input: adjacency matrix A → Random Walk → Skip-Gram → Output: vectors Z
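The Random Walk stage of this pipeline can be sketched in a few lines; the toy graph, function name, and parameter defaults below are illustrative assumptions, not code from the talk. The resulting node sequences play the role of sentences fed to skip-gram (word2vec).

```python
import random

def random_walks(adj, num_walks=10, walk_length=40, seed=0):
    """Generate uniform random-walk 'sentences' from an adjacency list
    (DeepWalk's corpus-building step; LINE corresponds to walk length 1)."""
    rng = random.Random(seed)
    walks = []
    nodes = list(adj)
    for _ in range(num_walks):
        rng.shuffle(nodes)          # one pass over all nodes per epoch
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                nbrs = adj[walk[-1]]
                if not nbrs:        # dead end: stop this walk early
                    break
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks

# toy graph (hypothetical): a 4-node path 0-1-2-3
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
corpus = random_walks(adj, num_walks=2, walk_length=5)
# each walk is a node sequence to be fed to skip-gram (word2vec)
```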
Unifying DeepWalk, LINE, PTE, & node2vec as Matrix Factorization
vol(G) = Σ_i Σ_j A_ij
A: adjacency matrix; D: degree matrix; b: #negative samples; T: context window size
w_{i-2}, w_{i-1}, w_i, w_{i+1}, w_{i+2} (skip-gram context window around w_i)
log( #(w,c)·|D| / (b·#(w)·#(c)) )
G = (V, E)
Levy and Goldberg. Neural word embeddings as implicit matrix factorization. In NIPS 2014
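The shifted PMI value log(#(w,c)·|D| / (b·#(w)·#(c))) that SGNS implicitly factorizes can be computed directly from raw (word, context) co-occurrence counts. A minimal sketch, assuming a hypothetical toy pair list (the helper name and data are not from the talk):

```python
import math
from collections import Counter

def shifted_pmi(pairs, b=1.0):
    """log( #(w,c)*|D| / (b*#(w)*#(c)) ) for each observed (word, context)
    pair: the matrix SGNS implicitly factorizes (Levy & Goldberg, 2014)."""
    pair_cnt = Counter(pairs)                   # #(w, c)
    w_cnt = Counter(w for w, _ in pairs)        # #(w) as a word
    c_cnt = Counter(c for _, c in pairs)        # #(c) as a context
    D = len(pairs)                              # |D|: total pair count
    return {
        (w, c): math.log(n * D / (b * w_cnt[w] * c_cnt[c]))
        for (w, c), n in pair_cnt.items()
    }

# hypothetical (word, context) pairs extracted with a sliding window
pairs = [("a", "b"), ("a", "b"), ("a", "c"), ("b", "a")]
pmi = shifted_pmi(pairs)
```

Setting b > 1 subtracts log b, matching SGNS with b negative samples.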
log( #(w,c)·|D| / (b·#(w)·#(c)) ), where #(w,c) now counts the number of times node w and its context c appear together in a random walk node sequence.
Distinguish direction and distance: log( #(w,c)·|D| / (b·#(w)·#(c)) )
the length of the random walk L → ∞
w_{i-2}, w_{i-1}, w_i, w_{i+1}, w_{i+2} (skip-gram context window around w_i)
DeepWalk is asymptotically and implicitly factorizing
1. Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM '18.
M = log( vol(G)/(bT) · (Σ_{r=1}^T (D^{-1}A)^r) · D^{-1} )
vol(G) = Σ_i Σ_j A_ij; A: adjacency matrix; D: degree matrix; b: #negative samples; T: context window size
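The factorized matrix M = vol(G)/(bT) · (Σ_{r=1}^T (D^{-1}A)^r) · D^{-1} can be built directly for a small graph. This numpy sketch follows the formula term by term; the toy triangle graph is an illustrative assumption:

```python
import numpy as np

def deepwalk_matrix(A, T=10, b=1.0):
    """Dense matrix M = vol(G)/(bT) * (sum_{r=1}^T P^r) * D^{-1},
    with P = D^{-1} A; DeepWalk implicitly factorizes log M (NetMF)."""
    vol = A.sum()                        # vol(G) = sum_ij A_ij
    D_inv = np.diag(1.0 / A.sum(axis=1))
    P = D_inv @ A                        # random-walk transition matrix
    S = np.zeros_like(A, dtype=float)
    P_r = np.eye(A.shape[0])
    for _ in range(T):                   # accumulate P^1 + ... + P^T
        P_r = P_r @ P
        S += P_r
    return vol / (b * T) * S @ D_inv

# hypothetical 3-node triangle graph
A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
M = deepwalk_matrix(A, T=2)
```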
Unifying DeepWalk, LINE, PTE, & node2vec as Matrix Factorization
Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM '18. The most cited paper in WSDM '18 as of May 2019.
Input: adjacency matrix A
Random Walk → Skip-Gram: DeepWalk, LINE, node2vec, metapath2vec
M = f(A) → (dense) Matrix Factorization: NetMF
Output: vectors Z
Incorporate network structures A into the similarity matrix M, and then factorize M.
f(A) = log( vol(G)/(bT) · (Σ_{r=1}^T (D^{-1}A)^r) · D^{-1} )
NetMF is not practical for very large networks: the similarity matrix M = f(A) is n × n and dense.
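A dense NetMF-style sketch on a toy graph (the function name and graph are hypothetical): build M = f(A), take the element-wise truncated logarithm log(max(M, 1)), and factorize with SVD. The O(n²) dense matrix constructed here is exactly why this route does not scale:

```python
import numpy as np

def netmf_embed(A, dim=2, T=2, b=1.0):
    """NetMF-style sketch: dense DeepWalk matrix -> truncated log -> SVD."""
    vol = A.sum()
    D_inv = np.diag(1.0 / A.sum(axis=1))
    P = D_inv @ A
    S, P_r = np.zeros_like(A, dtype=float), np.eye(len(A))
    for _ in range(T):
        P_r = P_r @ P
        S += P_r
    M = vol / (b * T) * S @ D_inv
    logM = np.log(np.maximum(M, 1.0))        # truncated log: keeps entries >= 0
    U, sigma, _ = np.linalg.svd(logM)
    return U[:, :dim] * np.sqrt(sigma[:dim]) # embedding = U_d * sqrt(Sigma_d)

# hypothetical 4-node cycle graph
A = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], float)
Z = netmf_embed(A, dim=2)
```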
For a random-walk matrix polynomial L = D − Σ_{r=1}^T α_r D(D^{-1}A)^r, where the coefficients α_r are non-negative and Σ_r α_r = 1, one can construct a (1 + ε)-spectral sparsifier L̃ with O(n log n / ε²) non-zeros in nearly linear time for undirected graphs.
Then factorize the constructed sparse matrix M̃ (e.g., with sparse randomized SVD).
#non-zeros: ~4.5 quadrillion → 45 billion
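The sparsification idea can be illustrated with a simplified Monte-Carlo sketch: rather than forming Σ_{r≤T} (D^{-1}A)^r densely, sample short walks and keep counts only for visited (source, target) pairs, so the result has few non-zeros. This is an illustrative simplification under assumed parameters, not the paper's exact PathSampling routine:

```python
import random
from collections import Counter

def sampled_cooccurrence(adj, T=3, num_samples=20000, seed=0):
    """Estimate the large entries of sum_{r<=T} (D^{-1}A)^r by sampling
    walks; only pairs actually visited get (sparse) entries."""
    rng = random.Random(seed)
    counts = Counter()
    nodes = list(adj)
    degs = [len(adj[u]) for u in nodes]
    for _ in range(num_samples):
        u = rng.choices(nodes, weights=degs)[0]  # start node ~ degree
        walk = [u]
        for _ in range(T):
            walk.append(rng.choice(adj[walk[-1]]))
        for r in range(1, T + 1):                # credit every distance <= T
            counts[(walk[0], walk[r])] += 1
    return counts  # sparse dict: unvisited pairs are implicit zeros

adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
sparse_M = sampled_cooccurrence(adj)
```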
Input: adjacency matrix A
Random Walk → Skip-Gram: DeepWalk, LINE, node2vec, metapath2vec
M = f(A) → (dense) Matrix Factorization: NetMF
Sparsify M → (sparse) Matrix Factorization: NetSMF
Output: vectors Z
f(A) = log( vol(G)/(bT) · (Σ_{r=1}^T (D^{-1}A)^r) · D^{-1} )
Incorporate network structures A into the similarity matrix M, and then factorize M.
R_d ← D^{-1}A(I_n − L̃)R_d
I_n − L̃ is the spectral filter of L = I_n − D^{-1}A
D^{-1}A(I_n − L̃) is D^{-1}A modulated by the filter in the spectrum
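One propagation step can be sketched on a toy graph with a full eigendecomposition (ProNE itself avoids this via a Chebyshev expansion; the band-pass filter g, the symmetric normalized Laplacian, and the toy data below are illustrative choices, not the talk's exact formulation):

```python
import numpy as np

def spectral_propagate(A, R, g=lambda lam: np.exp(-0.5 * (lam - 0.5) ** 2)):
    """One enhancement step R <- D^{-1}A (I - L~) R, where
    L~ = U g(Lambda) U^T modulates the spectrum of the normalized
    Laplacian; g acts as a band-pass filter on the eigenvalues."""
    d = A.sum(axis=1)
    D_inv = np.diag(1.0 / d)
    D_half = np.diag(d ** -0.5)
    L_sym = np.eye(len(A)) - D_half @ A @ D_half   # symmetric Laplacian
    lam, U = np.linalg.eigh(L_sym)
    L_mod = U @ np.diag(g(lam)) @ U.T              # filtered Laplacian L~
    return D_inv @ A @ (np.eye(len(A)) - L_mod) @ R

# hypothetical 4-node cycle; R is an embedding to enhance
A = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], float)
R = np.random.default_rng(0).normal(size=(4, 2))
R2 = spectral_propagate(A, R)
```

Propagating through D^{-1}A smooths each node's embedding with its neighbors', which is also the core idea behind graph neural networks.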
The idea of Graph Neural Networks
ProNE offers 10–400× speedups (ProNE with 1 thread vs. baselines with 20 threads), e.g., 19 hours → 98 mins → 10 mins.
ProNE embeds 100,000,000 nodes with 1 thread in 29 hours, with performance superiority.
Input: adjacency matrix A
Random Walk → Skip-Gram: DeepWalk, LINE, node2vec, metapath2vec
M = f(A) → (dense) Matrix Factorization: NetMF
Sparsify M → (sparse) Matrix Factorization: NetSMF
(sparse) Matrix Factorization → Z = g(Z′): ProNE
Output: vectors Z
Factorize A, and then incorporate network structures via spectral propagation.
Input: adjacency matrix A
Random Walk → Skip-Gram: DeepWalk, LINE, node2vec, metapath2vec
M = f(A) → (dense) Matrix Factorization → NetMF: theory & better accuracy
Sparsify M → (sparse) Matrix Factorization → NetSMF: handles billion-scale graphs
(sparse) Matrix Factorization → Z = g(Z′) → ProNE: 10–400× speedups
Output: vectors Z
NetSMF: Large-Scale Network Embedding as Sparse Matrix Factorization. WWW 2019.
ProNE: Fast and Scalable Network Representation Learning. IJCAI 2019.
Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec. WSDM 2018.
metapath2vec: Scalable Representation Learning for Heterogeneous Networks. KDD 2017.
Microsoft Academic Graph: https://academic.microsoft.com (as of Sep. 2019). The graph data is open!
230 million authors, 25,570 institutions, 48,757 journals, 4,307 conferences, 664,862 fields of study, 228 million papers/patents/books/preprints, spanning 1800–2019.
Papers & data & code available at https://ericdongyx.github.io/ ericdongyx@gmail.com Joint work with Jiezhong Qiu, Jie Zhang, Jie Tang (Tsinghua University) Hao Ma (MSR & Facebook AI) and Kuansan Wang (MSR)