Representation Learning on Networks. Yuxiao Dong, Microsoft Research (PowerPoint presentation)



SLIDE 1

Representation Learning on Networks

Yuxiao Dong

Microsoft Research, Redmond Joint work with Jiezhong Qiu, Jie Zhang, Jie Tang (Tsinghua University) Hao Ma (MSR & Facebook AI) and Kuansan Wang (MSR)

SLIDE 2

Networks

Economic networks, social networks, networks of neurons, biomedical networks, the Internet, information networks

Slides credit: Jure Leskovec

SLIDE 3

The Network & Graph Mining Paradigm

feature engineering → hand-crafted feature matrix X → machine learning models

Graph & network applications:

  • Node label inference;
  • Link prediction;
  • User behavior modeling;
  • ...

x_jk: node v_j's k-th feature, e.g., v_j's PageRank value
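As a concrete illustration of this paradigm, here is a minimal sketch that assembles a hand-crafted feature matrix X for a toy graph (the 4-node graph is hypothetical, and the two features, degree and PageRank, are illustrative choices; only PageRank is named on the slide):

```python
import numpy as np

# Toy undirected graph as an adjacency matrix A (hypothetical 4-node example).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

def pagerank(A, d=0.85, iters=100):
    """Power iteration for PageRank on the row-normalized adjacency matrix."""
    n = A.shape[0]
    P = A / A.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - d) / n + d * (P.T @ r)
    return r

# Hand-crafted feature matrix X: one row per node, one column per feature.
X = np.column_stack([A.sum(axis=1),    # feature 1: degree
                     pagerank(A)])     # feature 2: PageRank value
```

X (here 4 nodes x 2 features) is then fed to downstream machine learning models, which is exactly the pipeline this slide depicts.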

SLIDE 4

Representation Learning for Networks

feature learning → latent feature matrix Z → machine learning models

Graph & network applications:

  • Node label inference;
  • Node clustering;
  • Link prediction;
  • ...

  • Input: a network G = (V, E)
  • Output: Z ∈ R^(|V| × k), k ≪ |V|, a k-dim vector z_v for each node v.
SLIDE 5

π‘₯𝑗 π‘₯π‘—βˆ’2 π‘₯π‘—βˆ’1 π‘₯𝑗+1 π‘₯𝑗+2

Network Embedding: Random Walk + Skip-Gram

Perozzi et al. DeepWalk: Online learning of social representations. In KDD '14, pp. 701-710.

  • sentences in NLP
  • vertex-paths in Networks

skip-gram (word2vec)
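The "vertex paths as sentences" analogy can be sketched in a few lines of plain Python. The toy graph, walk count, and walk length below are illustrative assumptions, not DeepWalk's defaults:

```python
import random

# Toy graph as an adjacency list (hypothetical example).
graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}

def random_walk(graph, start, length, rng):
    """One truncated random walk: the network analogue of an NLP sentence."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(rng.choice(graph[walk[-1]]))
    return walk

def build_corpus(graph, walks_per_node=10, walk_length=8, seed=0):
    """Treat walks as sentences; each node id plays the role of a word."""
    rng = random.Random(seed)
    corpus = []
    for node in graph:
        for _ in range(walks_per_node):
            corpus.append(random_walk(graph, node, walk_length, rng))
    return corpus

corpus = build_corpus(graph)
```

The resulting corpus can then be fed to a skip-gram model (word2vec) to produce one embedding vector per node, which is the DeepWalk recipe this slide describes.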

SLIDE 6

Random Walk Strategies

  • Random Walk

– DeepWalk (walk length > 1)
– LINE (walk length = 1)

  • Biased Random Walk
  • 2nd order Random Walk

– node2vec

  • Metapath guided Random Walk

– metapath2vec
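A metapath2vec-style walk differs from a plain random walk only in that each step must land on a node of the type prescribed by the meta-path. A hedged sketch on a tiny hypothetical academic graph (node names, types, and the helper below are all illustrative, not the paper's code):

```python
import random

# Hypothetical heterogeneous academic graph: node -> (type, neighbors).
# Types: "A" = author, "P" = paper, "V" = venue.
nodes = {
    "a1": ("A", ["p1", "p2"]), "a2": ("A", ["p2"]),
    "p1": ("P", ["a1", "v1"]), "p2": ("P", ["a1", "a2", "v1"]),
    "v1": ("V", ["p1", "p2"]),
}

def metapath_walk(nodes, start, metapath, length, rng):
    """Walk that only follows neighbors whose type matches the meta-path.

    metapath is a cyclic type pattern, e.g. "APVPA" for
    Author-Paper-Venue-Paper-Author walks (metapath2vec style).
    """
    pattern = metapath[:-1]              # last type repeats the first
    assert nodes[start][0] == pattern[0]
    walk = [start]
    for i in range(1, length):
        want = pattern[i % len(pattern)]
        cands = [n for n in nodes[walk[-1]][1] if nodes[n][0] == want]
        if not cands:
            break                        # dead end: no neighbor of that type
        walk.append(rng.choice(cands))
    return walk

walk = metapath_walk(nodes, "a1", "APVPA", 9, random.Random(0))
```

The walks are then fed to skip-gram exactly as in DeepWalk; the meta-path constraint is what injects the heterogeneous semantics.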

SLIDE 7

Application: Embedding Heterogeneous Academic Graph

Microsoft Academic Graph

metapath2vec

  • https://academic.microsoft.com/
  • https://www.openacademic.ai/oag/
  • metapath2vec: scalable representation learning for heterogeneous networks. In KDD 2017.
SLIDE 8

Application 1: Related Venues

  • https://academic.microsoft.com/
  • https://www.openacademic.ai/oag/
  • metapath2vec: scalable representation learning for heterogeneous networks. In KDD 2017.
SLIDE 9

(example similarity results: Harvard, Stanford, Columbia, Yale, UChicago, Johns Hopkins, MIT, CMU, Microsoft, Google, AT&T Labs, Facebook)

Application 2: Similarity Search (Institution)

  • https://academic.microsoft.com/
  • https://www.openacademic.ai/oag/
  • metapath2vec: scalable representation learning for heterogeneous networks. In KDD 2017.
SLIDE 10

Network Embedding

Input: adjacency matrix A → random walk → skip-gram → Output: vectors Z

  • Random Walk

– DeepWalk (walk length > 1)
– LINE (walk length = 1)

  • Biased Random Walk
  • 2nd order Random Walk

– node2vec

  • Metapath guided Random Walk

– metapath2vec

SLIDE 11

Unifying DeepWalk, LINE, PTE, & node2vec as Matrix Factorization

  • 1. Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM’18.
  • DeepWalk
  • LINE
  • PTE
  • node2vec

π‘€π‘π‘š 𝐻 = ෍

𝑗

෍

π‘˜

π΅π‘—π‘˜ 𝑩 Adjacency matrix 𝑬 Degree matrix b: #negative samples T: context window size

SLIDE 12

π‘₯𝑗 π‘₯π‘—βˆ’2 π‘₯π‘—βˆ’1 π‘₯𝑗+1 π‘₯𝑗+2

log( #(w, c) · |D| / (b · #(w) · #(c)) )


𝐻 = (π‘Š, 𝐹)

  • Adjacency matrix 𝑩
  • Degree matrix 𝑬
  • Volume of 𝐻: π‘€π‘π‘š 𝐻

Levy and Goldberg. Neural word embeddings as implicit matrix factorization. In NIPS 2014

  • (π‘₯, 𝑑): co-occurrence of w & c
  • (π‘₯): occurrence of node w
  • (𝑑): occurrence of context c
  • 𝒠: nodeβˆ’context pair (w, c) multiβˆ’set
  • |𝒠|: number of node-context pairs

Understanding Random Walk + Skip Gram
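All the quantities in the Levy-Goldberg matrix entry can be counted directly from a walk corpus. A minimal sketch (the toy walks, window size T, and negative-sample count b are illustrative assumptions):

```python
import math
from collections import Counter

# Toy node-context multiset D built from walks with context window T = 2.
walks = [[0, 1, 2, 1, 3], [2, 0, 1, 3, 1]]
T, b = 2, 1   # context window size, number of negative samples

pairs = Counter()                         # #(w, c)
for walk in walks:
    for i, w in enumerate(walk):
        for j in range(max(0, i - T), min(len(walk), i + T + 1)):
            if i != j:
                pairs[(w, walk[j])] += 1

n_pairs = sum(pairs.values())             # |D|
occ_w, occ_c = Counter(), Counter()       # #(w), #(c)
for (w, c), k in pairs.items():
    occ_w[w] += k
    occ_c[c] += k

def shifted_pmi(w, c):
    """log( #(w,c) * |D| / (b * #(w) * #(c)) ): the matrix entry that
    skip-gram with negative sampling implicitly factorizes."""
    return math.log(pairs[(w, c)] * n_pairs / (b * occ_w[w] * occ_c[c]))
```

Entries of this shifted-PMI matrix are exactly what the slide's formula denotes; DeepWalk's contribution is that the walks generating D come from the graph.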

SLIDE 13

Understanding Random Walk + Skip Gram

log( #(w, c) · |D| / (b · #(w) · #(c)) )

  • (π‘₯, 𝑑): co-occurrence of w & c
  • (π‘₯): occurrence of node w
  • (𝑑): occurrence of context c
  • 𝒠: nodeβˆ’context pair (w, c) multiβˆ’set
  • |𝒠|: number of node-context pairs
SLIDE 14

Understanding Random Walk + Skip Gram

  • Partition the multiset D into sub-multisets according to the way in which each node and its context appear in a random walk node sequence.
  • More formally, for r = 1, 2, ⋯, T, we define sub-multisets of D by the direction and distance r between node and context.

Distinguish direction and distance

log( #(w, c) · |D| / (b · #(w) · #(c)) )

  • (π‘₯, 𝑑): co-occurrence of w & c
  • (π‘₯): occurrence of node w
  • (𝑑): occurrence of context c
  • 𝒠: nodeβˆ’context pair (w, c) multiβˆ’set
  • |𝒠|: number of node-context pairs
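The partition step above can be sketched directly on toy walks (walks and window size are illustrative): each pair is bucketed by the distance r at which the context was sampled, with left and right contexts counted separately.

```python
from collections import Counter

# Toy walks; context window T = 2, as in the earlier counting sketch.
walks = [[0, 1, 2, 1, 3], [2, 0, 1, 3, 1]]
T = 2

# One sub-multiset per distance r = 1, ..., T.
sub_multisets = {r: Counter() for r in range(1, T + 1)}
for walk in walks:
    for i, w in enumerate(walk):
        for r in range(1, T + 1):
            if i - r >= 0:                 # context r steps to the left
                sub_multisets[r][(w, walk[i - r])] += 1
            if i + r < len(walk):          # context r steps to the right
                sub_multisets[r][(w, walk[i + r])] += 1

# The union over all r recovers the full multiset D.
total = sum(sum(c.values()) for c in sub_multisets.values())
```

This bookkeeping by direction and distance is what lets the WSDM'18 analysis take each sub-multiset to its closed form as the walk length grows.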
SLIDE 15

Understanding Random Walk + Skip Gram

the length of the random walk L → ∞

  • (π‘₯, 𝑑): co-occurrence of w & c
  • 𝒠: (w, c) multiβˆ’set
SLIDE 16

Understanding Random Walk + Skip Gram

the length of the random walk L → ∞

SLIDE 17

π‘₯𝑗 π‘₯π‘—βˆ’2 π‘₯π‘—βˆ’1 π‘₯𝑗+1 π‘₯𝑗+2

DeepWalk is asymptotically and implicitly factorizing

log( vol(G)/(bT) · (Σ_{r=1}^T (D^{-1}A)^r) · D^{-1} )

1. Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM’18.

Understanding Random Walk + Skip Gram

π‘€π‘π‘š 𝐻 = ෍

𝑗

෍

π‘˜

π΅π‘—π‘˜ 𝑩 Adjacency matrix 𝑬 Degree matrix b: #negative samples T: context window size

SLIDE 18

Unifying DeepWalk, LINE, PTE, & node2vec as Matrix Factorization

Qiu et al. Network embedding as matrix factorization: unifying DeepWalk, LINE, PTE, and node2vec. In WSDM'18 (the most cited paper of WSDM'18 as of May 2019).

  • DeepWalk
  • LINE
  • PTE
  • node2vec
SLIDE 19

NetMF: explicitly factorizing the DeepWalk matrix

π‘₯𝑗 π‘₯π‘—βˆ’2 π‘₯π‘—βˆ’1 π‘₯𝑗+1 π‘₯𝑗+2

DeepWalk is asymptotically and implicitly factorizing

log( vol(G)/(bT) · (Σ_{r=1}^T (D^{-1}A)^r) · D^{-1} )

1. Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM’18.

Matrix Factorization

SLIDE 20

The NetMF algorithm:

1. Construction: build the (dense) DeepWalk matrix M
2. Factorization: factorize the element-wise truncated logarithm of M (e.g., via SVD)

1. Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM’18.
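A minimal end-to-end sketch of the two NetMF steps on a toy graph, assuming the element-wise truncated logarithm log(max(·, 1)) and a rank-d SVD (graph and hyperparameters are illustrative; the real algorithm adds approximations for scale):

```python
import numpy as np

# Toy undirected graph (hypothetical).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
T, b, d = 3, 1, 2                       # window, #negatives, embedding dim
D_inv = np.diag(1.0 / A.sum(axis=1))
P = D_inv @ A

# 1. Construction: the (dense) DeepWalk matrix with an element-wise
#    truncated logarithm, log(max(x, 1)), so every entry is well defined.
M = A.sum() / (b * T) * sum(np.linalg.matrix_power(P, r)
                            for r in range(1, T + 1)) @ D_inv
logM = np.log(np.maximum(M, 1.0))

# 2. Factorization: rank-d SVD; embeddings are U_d * sqrt(S_d).
U, S, _ = np.linalg.svd(logM)
Z = U[:, :d] * np.sqrt(S[:d])           # one d-dim vector per node
```

The truncated log keeps the matrix non-negative and finite; the SVD step is the "explicit factorization" the results slides compare against DeepWalk and LINE.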

SLIDE 21

Results

  • Predictive performance on varying the ratio of training data;
  • The x-axis represents the ratio of labeled data (%)

1. Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM’18.

SLIDE 22

Results

1. Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM’18.

Explicit matrix factorization (NetMF) offers performance gains over implicit matrix factorization (DeepWalk & LINE)

SLIDE 23

Network Embedding

Input: adjacency matrix A
  • DeepWalk, LINE, node2vec, metapath2vec: random walk + skip-gram
  • NetMF: M = g(A), then (dense) matrix factorization
Output: vectors Z

Incorporate network structures A into the similarity matrix M, and then factorize M.

SLIDE 24

Challenges

NetMF is not practical for very large networks: the constructed matrix M is dense.

SLIDE 25

NetMF

How can we solve this issue?

1. Construction 2. Factorization

  • 1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019


SLIDE 26

NetSMF: Sparse

How can we solve this issue?

1. Sparse Construction 2. Sparse Factorization

  • 1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019


SLIDE 27

Sparsify M

For a random-walk matrix polynomial of the form Σ_{r=1}^T α_r D (D^{-1}A)^r, where Σ_{r=1}^T α_r = 1 and the coefficients α_r are non-negative, one can construct a (1+ε)-spectral sparsifier M̃ with O(n log n / ε²) non-zeros in nearly linear time for undirected graphs.

  • Dehua Cheng, Yu Cheng, Yan Liu, Richard Peng, and Shang-Hua Teng, Efficient Sampling for Gaussian Graphical Models via Spectral Sparsification, COLT 2015.
  • Dehua Cheng, Yu Cheng, Yan Liu, Richard Peng, and Shang-Hua Teng. Spectral sparsification of random-walk matrix polynomials. arXiv:1502.03496.
SLIDE 28

Sparsify M

For a random-walk matrix polynomial of the form Σ_{r=1}^T α_r D (D^{-1}A)^r, where Σ_{r=1}^T α_r = 1 and the coefficients α_r are non-negative, one can construct a (1+ε)-spectral sparsifier M̃ with O(n log n / ε²) non-zeros in nearly linear time.

  • 1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019


SLIDE 29

NetSMF: Sparse

Factorize the constructed sparse matrix

  • 1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019
SLIDE 30

NetSMF: bounded approximation error

(the approximation error between M and the sparsified M̃ is bounded)

  • 1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019
SLIDE 31

#non-zeros: ~4.5 quadrillion → 45 billion

  • 1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019
SLIDE 32
  • 1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019
SLIDE 33
  • 1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019

Effectiveness:

  • NetSMF (sparse MF) ≈ NetMF (explicit MF) > DeepWalk/LINE (implicit MF)

Efficiency:

  • Sparse MF can handle billion-scale network embedding
SLIDE 34

Embedding Dimension?

  • 1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019
SLIDE 35

Network Embedding

Input: adjacency matrix A
  • DeepWalk, LINE, node2vec, metapath2vec: random walk + skip-gram
  • NetMF: M = g(A), then (dense) matrix factorization
  • NetSMF: sparsify M, then (sparse) matrix factorization
Output: vectors Z

Incorporate network structures A into the similarity matrix M, and then factorize M.

SLIDE 36

ProNE: faster and more scalable network embedding

  • 1. Zhang et al. ProNE: Fast and Scalable Network Representation Learning. In IJCAI 2019
SLIDE 37

Embedding enhancement via spectral propagation

𝑆𝑒 ← πΈβˆ’1𝐡(π½π‘œ βˆ’ ΰ·¨ 𝑀) 𝑆𝑒 is the spectral filter of 𝑀 = π½π‘œ βˆ’ πΈβˆ’1𝐡 πΈβˆ’1𝐡(π½π‘œ βˆ’ ΰ·¨ 𝑀) is πΈβˆ’1𝐡 modulated by the filter in the spectrum

  • 1. Zhang et al. ProNE: Fast and Scalable Network Representation Learning. In IJCAI 2019

The idea of Graph Neural Networks
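A hedged sketch of this propagation step on a toy graph. Assumptions: the graph is tiny, so the spectral filter is applied via an explicit eigendecomposition of the symmetric normalized Laplacian (the actual ProNE paper avoids this cost with a Chebyshev expansion of L = I − D^{-1}A), the band-pass kernel below is an illustrative choice, and R is a random stand-in for a pre-computed embedding:

```python
import numpy as np

# Toy undirected graph (hypothetical).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
n, d = A.shape[0], 2
rng = np.random.default_rng(0)
R = rng.normal(size=(n, d))                # stand-in initial embeddings

deg = A.sum(axis=1)
D_inv = np.diag(1.0 / deg)
# Symmetric normalized Laplacian, so eigh gives a clean eigendecomposition.
D_half = np.diag(deg ** -0.5)
L = np.eye(n) - D_half @ A @ D_half
lam, U = np.linalg.eigh(L)

def spectral_filter(lam, mu=0.5, theta=1.0):
    """A band-pass kernel in the Laplacian spectrum (illustrative choice)."""
    return np.exp(-theta * ((lam - mu) ** 2 - 1))

L_filt = U @ np.diag(spectral_filter(lam)) @ U.T   # filtered Laplacian L-tilde
# Propagation step: R <- D^{-1} A (I_n - L_filtered) R
R_enh = D_inv @ A @ (np.eye(n) - L_filt) @ R
```

The enhanced embeddings mix each node's vector with spectrally filtered neighborhood information, which is the graph-neural-network flavor the slide alludes to.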

SLIDE 38

Performance

ProNE on a single thread offers 10-400x speedups over baselines on 20 threads (example runtimes from the figure: 19 hours vs. 98 mins vs. 10 mins).

  • 1. Zhang et al. ProNE: Fast and Scalable Network Representation Learning. In IJCAI 2019

ProNE embeds 100,000,000 nodes on a single thread in 29 hours, while outperforming the baselines.

SLIDE 39

A general embedding enhancement framework

  • 1. Zhang et al. ProNE: Fast and Scalable Network Representation Learning. In IJCAI 2019
SLIDE 40

Network Embedding

Input: adjacency matrix A
  • DeepWalk, LINE, node2vec, metapath2vec: random walk + skip-gram
  • NetMF: M = g(A), then (dense) matrix factorization
  • NetSMF: sparsify M, then (sparse) matrix factorization
  • ProNE: (sparse) matrix factorization, then Z = g(Z′)
Output: vectors Z

ProNE: factorize A first, and then incorporate network structures via spectral propagation.

SLIDE 41

Network Embedding

Input: adjacency matrix A
  • DeepWalk, LINE, node2vec, metapath2vec: random walk + skip-gram
  • NetMF: theory & better accuracy (M = g(A), dense matrix factorization)
  • NetSMF: handles billion-scale graphs (sparsify M, sparse matrix factorization)
  • ProNE: 10-400x speedups (sparse matrix factorization, then Z = g(Z′))
Output: vectors Z

SLIDE 42

References

  • 1. Jiezhong Qiu, Yuxiao Dong, Hao Ma, Jian Li, Chi Wang, Kuansan Wang, and Jie Tang. NetSMF: Large-Scale Network Embedding as Sparse Matrix Factorization. WWW 2019.
  • 2. Jie Zhang, Yuxiao Dong, Yan Wang, Jie Tang, and Ming Ding. ProNE: Fast and Scalable Network Representation Learning. IJCAI 2019.
  • 3. Jiezhong Qiu, Yuxiao Dong, Hao Ma, Jian Li, Kuansan Wang, and Jie Tang. Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec. WSDM 2018.
  • 4. Yuxiao Dong, Nitesh V. Chawla, Ananthram Swami. metapath2vec: Scalable Representation Learning for Heterogeneous Networks. KDD 2017.
  • 5. Fanjin Zhang, et al. OAG: Toward Linking Large-scale Heterogeneous Entity Graphs. ACM KDD 2019.
  • 6. Wu, Shi, Dong, Huang, Chawla. Neural Tensor Decomposition. WSDM 2019.
SLIDE 43

Microsoft Academic Graph

https://academic.microsoft.com (as of Sep. 2019). The graph data is open!

  • 230 million authors
  • 25,570 institutions
  • 48,757 journals
  • 4,307 conferences
  • 664,862 fields of study
  • 228 million papers/patents/books/preprints

Coverage: 1800 - 2019

SLIDE 44

Thank you!

Papers & data & code available at https://ericdongyx.github.io/ ericdongyx@gmail.com Joint work with Jiezhong Qiu, Jie Zhang, Jie Tang (Tsinghua University) Hao Ma (MSR & Facebook AI) and Kuansan Wang (MSR)