metapath2vec: Scalable Representation Learning for Heterogeneous Networks


SLIDE 1

metapath2vec

Scalable Representation Learning for Heterogeneous Networks

Yuxiao Dong, Microsoft Research & University of Notre Dame
Nitesh V. Chawla, University of Notre Dame
Ananthram Swami, Army Research Lab
Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame

SLIDE 2

Conventional Network Mining and Learning

Network Mining Tasks

node attribute inference

community detection

similarity search

link prediction

social recommendation

feature engineering → hand-crafted feature matrix → machine learning models

SLIDE 3

Network Embedding for Mining and Learning

feature learning

Network Mining Tasks

node attribute inference

community detection

similarity search

link prediction

social recommendation

…

feature learning → latent representation matrix X → machine learning models

  • Y. Bengio, A. Courville, and P. Vincent. Representation learning: A review and new perspectives. IEEE TPAMI, 35(8):1798–1828, 2013.
  • Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521(7553):436–444, 2015.

SLIDE 4

Word Embedding in NLP

♣ Input: a text corpus ♣ Output: $\mathbf{X} \in \mathbb{R}^{|V| \times d}$, $d \ll |V|$, a d-dim vector $\mathbf{X}_w$ for each word $w$.

1. T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, J. Dean. Distributed representations of words and phrases and their compositionality. In NIPS '13, pp. 3111–3119.
2. T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. arXiv:1301.3781, 2013.

[Figure: word2vec skip-gram: sentences → input, hidden, and output layers → latent representation vector X; given word $x_j$, predict its context words $x_{j-2}, x_{j-1}, x_{j+1}, x_{j+2}$.]

Example corpus:

  • Computational lens on big social and information networks.
  • The connections between individuals form the structural ...
  • In a network sense, individuals matter in the ways in which ...
  • Accordingly, this thesis develops computational models to investigate the ways that ...
  • We study two fundamental and interconnected directions: user demographics and network diversity
  • ... ...


♣ Spatially close words (a word and its context words) in a sentence or document exhibit interrelations in human natural language.
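This notion of "a word and its context words" can be sketched by enumerating the (center, context) training pairs inside a fixed window. This is an illustrative sketch, not word2vec itself; the window size and whitespace tokenization are assumptions:

```python
# Enumerate skip-gram (center, context) pairs: each word is paired with
# its neighbors within `window` positions on either side.
def skipgram_pairs(sentence, window=2):
    words = sentence.lower().split()
    pairs = []
    for j, center in enumerate(words):
        for k in range(max(0, j - window), min(len(words), j + window + 1)):
            if k != j:  # skip the center position x_j itself
                pairs.append((center, words[k]))
    return pairs

pairs = skipgram_pairs("the connections between individuals form the structural")
```

word2vec then learns embeddings by predicting the context word from the center word over millions of such pairs.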

SLIDE 5

Network Embedding

♣ Input: a network $G = (V, E)$ ♣ Output: $\mathbf{X} \in \mathbb{R}^{|V| \times d}$, $d \ll |V|$, a d-dim vector $\mathbf{X}_v$ for each node $v$.

1. B. Perozzi, R. Al-Rfou, and S. Skiena. DeepWalk: Online learning of social representations. In KDD '14, pp. 701–710.
2. A. Grover, J. Leskovec. node2vec: Scalable feature learning for networks. In KDD '16, pp. 855–864.
3. T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, J. Dean. Distributed representations of words and phrases and their compositionality. In NIPS '13, pp. 3111–3119.
4. T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. arXiv:1301.3781, 2013.

[Figure: DeepWalk: random walk paths (e.g., v1 v2 v3 v5 ..., v1 v3 v5 ..., v3 v2 v1) are treated as sentences and fed to word2vec (input, hidden, and output layers): given node $v_j$, predict its context nodes $v_{j-2}, v_{j-1}, v_{j+1}, v_{j+2}$; output: latent representation vector X.]


DeepWalk [Perozzi et al., KDD14]
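The random-walk half of this pipeline can be sketched as follows; the adjacency-dict graph, walk length, and number of walks per node are illustrative choices, not the paper's settings:

```python
import random

# Truncated random walk, DeepWalk-style: hop to a uniformly random
# neighbor until the walk reaches `length` nodes.
def random_walk(graph, start, length, rng=random):
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:  # dead end: truncate early
            break
        walk.append(rng.choice(neighbors))
    return walk

# Toy undirected graph, written as an adjacency dict.
graph = {
    "v1": ["v2", "v3"],
    "v2": ["v1", "v3"],
    "v3": ["v1", "v2", "v5"],
    "v5": ["v3"],
}
# A corpus of walks: these play the role of sentences for word2vec.
walks = [random_walk(graph, v, length=5) for v in graph for _ in range(2)]
```

Each walk visits only connected nodes, so feeding them to skip-gram makes structurally close nodes share contexts.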

SLIDE 6

Heterogeneous Network Embedding: Problem

♣ Input: a heterogeneous information network $G = (V, E, T)$ with node and edge type mapping functions $\phi: V \to T_V$ and $\psi: E \to T_E$ ♣ Output: $\mathbf{X} \in \mathbb{R}^{|V| \times d}$, $d \ll |V|$, a d-dim vector $\mathbf{X}_v$ for each node $v$.

SLIDE 7

Heterogeneous Network Embedding: Challenges


How do we effectively preserve the concept of “node-context” among multiple types of nodes, e.g., authors, papers, & venues in academic heterogeneous networks?

Can we directly apply homogeneous network embedding architectures to heterogeneous networks?

It is also difficult for conventional meta-path based methods to model similarities between nodes without connected meta-paths.

SLIDE 8

Heterogeneous Network Embedding: Solutions

metapath2vec: meta-path-based random walks + skip-gram

metapath2vec++: meta-path-based random walks + heterogeneous skip-gram

SLIDE 9

metapath2vec

[Figure: skip-gram on a heterogeneous academic network with authors a1–a5, papers p1–p3, venues KDD and ACL, and organizations MIT and CMU. Input layer: |V|-dim one-hot vector; hidden layer: |V| × k; output layer: the probability that each node (e.g., p3, KDD) appears in the context.]

1. Y. Sun, J. Han. Mining Heterogeneous Information Networks: Principles and Methodologies. Morgan & Claypool Publishers, 2012.
2. T. Mikolov, et al. Distributed representations of words and phrases and their compositionality. In NIPS '13.

meta-path-based random walks + skip-gram
SLIDE 10

metapath2vec: Meta-Path-Based Random Walks


Goal: generate paths that capture both the semantic and structural correlations between different types of nodes, so that heterogeneous network structures can be fed into skip-gram.

SLIDE 11

metapath2vec: Meta-Path-Based Random Walks

Given a meta-path scheme $\mathcal{P}: V_1 \xrightarrow{R_1} V_2 \xrightarrow{R_2} \cdots V_t \xrightarrow{R_t} V_{t+1} \cdots \xrightarrow{R_{l-1}} V_l$,

the transition probability at step $i$ is defined as

$$p(v^{i+1} \mid v^i_t, \mathcal{P}) = \begin{cases} \frac{1}{|N_{t+1}(v^i_t)|} & (v^{i+1}, v^i_t) \in E,\ \phi(v^{i+1}) = t+1 \\ 0 & (v^{i+1}, v^i_t) \in E,\ \phi(v^{i+1}) \neq t+1 \\ 0 & (v^{i+1}, v^i_t) \notin E \end{cases}$$

where $v^i_t \in V_t$ and $N_{t+1}(v^i_t)$ denotes the $V_{t+1}$-type neighborhood of node $v^i_t$.

Recursive guidance for random walkers, i.e., $p(v^{i+1} \mid v^i_t) = p(v^{i+1} \mid v^i_1)$ if $t = l$, since meta-path schemes are commonly used in a symmetric way ($V_1 = V_l$).

SLIDE 12

metapath2vec: Meta-Path-Based Random Walks


Given a meta-path scheme, e.g., 'OAPVPAO':

In a traditional random walk procedure, the next step of a walker standing on node a4, having arrived from node CMU, can be any node surrounding it regardless of type: a2, a3, a5, p2, p3, and CMU.

Under the meta-path scheme 'OAPVPAO', however, the walker is biased toward paper nodes (P), given that its previous step was on the organization node CMU (O), following the semantics of this meta-path.
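This biased step can be sketched as follows: at each step the walker keeps only the neighbors whose type matches the next type in the (cyclically repeated) meta-path scheme. The toy graph, node typing, and helper name are illustrative assumptions, not the authors' code or dataset:

```python
import random

# Meta-path-guided random walk: at each step, restrict candidate
# neighbors to those whose type matches the next type in the scheme.
def metapath_walk(graph, node_type, scheme, start, length, rng=random):
    # Assumes a symmetric scheme (first type == last type), e.g. "APVPA".
    assert scheme[0] == scheme[-1] and node_type[start] == scheme[0]
    walk = [start]
    for i in range(1, length):
        want = scheme[i % (len(scheme) - 1)]  # cycle through the scheme
        candidates = [u for u in graph[walk[-1]] if node_type[u] == want]
        if not candidates:  # no neighbor of the required type: stop early
            break
        walk.append(rng.choice(candidates))
    return walk

# Toy typed academic network echoing the slide's example.
node_type = {"a2": "A", "a4": "A", "p2": "P", "p3": "P", "KDD": "V", "CMU": "O"}
graph = {
    "a4": ["CMU", "p2", "p3", "a2"],
    "a2": ["p2", "a4"],
    "p2": ["a4", "a2", "KDD"],
    "p3": ["a4", "KDD"],
    "KDD": ["p2", "p3"],
    "CMU": ["a4"],
}
walk = metapath_walk(graph, node_type, "APVPA", "a4", length=5)
```

Under the scheme "APVPA", the walker standing on a4 ignores CMU and its fellow authors and moves only to paper nodes, as the slide describes.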

SLIDE 13

metapath2vec

[Skip-gram architecture figure and references repeated from Slide 9.]

The potential issue of skip-gram for heterogeneous network embedding: to predict the context node $c_t$ (of type $t$) given a node $v$, metapath2vec encourages nodes of all types to appear in this context position.
slide-14
SLIDE 14

metapath2vec++


meta-path-based random walks heterogeneous skip-gram

[Figure: heterogeneous skip-gram on the same academic network (authors a1–a5, papers p1–p3, venues KDD and ACL, organizations MIT and CMU). Input layer: |V|-dim; the output layer is split by node type, with weight matrices of size |V_O| × k_O, |V_A| × k_A, |V_P| × k_P, and |V_V| × k_V, so the probabilities that, e.g., a3, a5, p2, p3, ACL, KDD, or CMU appear are each normalized within their own type.]

SLIDE 15

metapath2vec++: Heterogeneous Skip-Gram

[Heterogeneous skip-gram architecture figure repeated from Slide 14.]

♣ softmax in metapath2vec: $p(c_t \mid v; \theta) = \frac{e^{X_{c_t} \cdot X_v}}{\sum_{u \in V} e^{X_u \cdot X_v}}$, normalized over all nodes regardless of type

♣ softmax in metapath2vec++: $p(c_t \mid v; \theta) = \frac{e^{X_{c_t} \cdot X_v}}{\sum_{u_t \in V_t} e^{X_{u_t} \cdot X_v}}$, normalized only over nodes of the context node's type $t$

♣ the objective function is optimized by stochastic gradient descent with (heterogeneous) negative sampling
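The contrast between the two softmax layers can be sketched numerically. This toy uses made-up 2-d embeddings and a single embedding matrix (the model distinguishes center and context vectors), so it illustrates only the normalization difference:

```python
import math

# p(target | v): softmax of dot products, normalized over `candidates`.
def softmax_prob(target, candidates, emb, v):
    score = lambda u: math.exp(sum(a * b for a, b in zip(emb[u], emb[v])))
    return score(target) / sum(score(u) for u in candidates)

# Toy 2-d embeddings for a tiny academic network (illustrative values).
emb = {
    "a1": [0.5, 1.0], "a2": [1.0, 0.2],   # authors (type A)
    "p1": [0.9, 0.8], "p2": [0.1, 0.4],   # papers  (type P)
    "KDD": [1.2, 0.3],                    # venue   (type V)
}
types = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "KDD": "V"}

v, context = "a1", "p1"
# metapath2vec: normalize over ALL nodes, regardless of type.
p_all = softmax_prob(context, list(emb), emb, v)
# metapath2vec++: normalize only over nodes of the context node's type.
p_typed = softmax_prob(context, [u for u in emb if types[u] == types[context]], emb, v)
```

Because the typed denominator sums over fewer (all-positive) terms, p_typed exceeds p_all here; in training, each node type competes only within its own output sub-layer.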


1. T. Mikolov, et al. Distributed representations of words and phrases and their compositionality. In NIPS '13.
SLIDE 16

metapath2vec++

♣ every sub-procedure is easy to parallelize

[Figure: speedup of metapath2vec and metapath2vec++ as #threads grows over 1, 2, 4, 8, 16, 24, 32, 40.]

♣ 24-32X speedup by using 40 cores

SLIDE 17

Network Mining and Learning Paradigm

Network Applications

node attribute inference

community detection

similarity search

link prediction

social recommendation

latent representation vector X


metapath2vec metapath2vec++

SLIDE 18

Experiments

Baselines

♣ DeepWalk [KDD ’14] ♣ node2vec [KDD ’16] ♣ LINE [WWW ’15] ♣ PTE [KDD ’15]

Heterogeneous Data

♣ AMiner Academic Network

  • 1.7 million authors
  • 3 million papers
  • 3800+ venues
  • 8 research areas

publications

Mining Tasks

♣ node classification

  • logistic regression

♣ node clustering

  • k-means

♣ similarity search

  • cosine similarity

Parameters

♣ #walks: 1000 ♣ walk-length: 100 ♣ #dimensions: 128 ♣ neighborhood size: 7


  • J. Tang, et al. ArnetMiner: Extraction and Mining of Academic Social Networks. In KDD 2008.

https://aminer.org/aminernetwork

SLIDE 19

Application 1: Multi-Class Node Classification

SLIDE 20

Application 1: Multi-Class Node Classification

SLIDE 21

Application 2: Node Clustering

http://projector.tensorflow.org/

SLIDE 22

Application 3: Similarity Search

SLIDE 23

Visualization

word2vec [Mikolov, 2013]

http://projector.tensorflow.org/

SLIDE 24


Problem: Heterogeneous Network Embedding

Models: metapath2vec & metapath2vec++

♣ The automatic discovery of internal semantic relationships between different types of nodes in heterogeneous networks

Applications: classification, clustering, & similarity search

SLIDE 25

Thank you!


https://ericdongyx.github.io/metapath2vec/m2v.html

Data & Code