Exploiting Graph Embeddings for Graph Analysis Tasks Fatemeh Salehi - PowerPoint PPT Presentation

Exploiting Graph Embeddings for Graph Analysis Tasks Fatemeh Salehi Rizi Graph Embedding Day University of Lyon September 7, 2018

Outline Circle Prediction Social labels in an ego-network Semantic Content of Vector Embeddings Network Centrality Measures Shortest Path Approximation Shortest path in scale-free networks Futurework 2 of 32

Graph Embedding ENC : V → R d DEC : R d × R d → R + 4 of 32

Circle Prediction � Predicting the social circle for a new added alter to the ego-network 1 128th International Conference on Database and Expert Systems Applications (DEXA), 2017 5 of 32

Circle Prediction � node2vec for leaning global representations for all nodes glo( v ) � Walking locally over an ego-network to generate sequence of nodes � Paragraph Vector [2] to learn local representation loc( u ) 6 of 32

Circle Prediction � node2vec for leaning global representations for all nodes glo( v ) � Walking locally over an ego-network to generate sequence of nodes � Paragraph Vector [2] to learn local representation loc( u ) � Predicting circle for the alter v Input layer Hidden layer Output layer � Input: loc( u ) ⊕ glo( v ) � Profile similarity: sim( u , v ) . . � loc( u ) ⊕ glo( v ) ⊕ sim( u , v ) . . . . . . . 6 of 32

Circle Prediction � Statistics of social network datasets Facebook Twitter Google+ nodes | V | 4,039 81,306 107,614 edges | E | 88,234 1,768,149 13,673,453 egos | U | 10 973 132 circles |C| 46 100 468 features f 576 2,271 4,122 � Performance of the prediction measured by F 1 -score Approach Facebook Twitter Google+ glo ⊕ glo 0.37 0.46 0.49 loc ⊕ glo 0.42 0.50 0.52 glo ⊕ glo ⊕ sim 0.40 0.49 0.51 loc ⊕ glo ⊕ sim 0.45 0.53 0.55 McAuley & Leskovec [1] 0.38 0.54 0.59 7 of 32

Do embeddings retain network centralities? 2 � Degree centrality DC ( u ) = deg( u ) 1 � Closeness centrality CC ( u ) = � v ∈ V d ( u , v ) σ s , t ( u ) � Betweenness centrality BC ( u ) = � s � = u � = t σ s , t � n � Eigenvector centrality EC ( u i ) = 1 j =1 A i , j EC ( v j ) λ 2Properties of Vector Embeddings in Social Networks, Algorithms Journal, 2017 9 of 32

Relating Embeddings and Centralities � A pair ( v i , v j ) are similar if: � embedding vectors are close � similar network characteristics 1.2 1.0 3 0 2 2.5 6 1 1 0.8 2.0 5 7 4 5 0.6 6 4 1.5 0 3 1.0 0.4 7 8 2 12 0.5 0.2 11 14 14 9 8 10 0.0 12 10 0.0 13 13 9 11 0.5 2.0 1.5 1.0 0.5 0.0 0.5 1.0 0.2 0.2 0.0 0.2 0.4 0.6 0.8 1.0 10 of 32

Relating Embeddings and Centralities � A pair ( v i , v j ) are similar if: � embedding vectors are close � similar network characteristics � Relation k � f ( Y i , Y j ) ∼ w i sim( v i , v j ) i =1 � Y i is the embedding vector of v i � w i is the weight of the centrality i � p i is a function computes similarity � k is the number of centrality measures 10 of 32

Relating Embeddings and Centralities � A pair ( v i , v j ) are similar if: � embedding vectors are close � similar network characteristics � Relation k � f ( Y i , Y j ) ∼ w i sim( v i , v j ) i =1 � Y i is the embedding vector of v i � w i is the weight of the centrality i � p i is a function computes similarity � k is the number of centrality measures Learning to Rank can learn weights 10 of 32

Learning to Rank � Ranking nodes according similarity in the embedding space � Feature matrix according similarity in the network � rankSVM objective function: 1 � 2 w T w + C max(0 , 1 − w T ( x i − x j ) ( i , j ) ∈ V w = ( w DC , w CC , w BC , w EC ) 11 of 32

Learning to Rank � Every pair ( v i , v j ) has a centrality similarity � P v i : histogram of centrality distribution in N ( v i ) � Q v j : histogram of centrality distribution in N ( v j ) � centrality similarity: 1 − D KL ( P v i � Q v j ) � Feature matrix X ∈ R | z |× 4 , z = n × ( n − 1)   sim DC ( v 1 , v 2 ) sim CC ( v 1 , v 2 ) sim BC ( v 1 , v 2 ) sim EC ( v 1 , v 2 ) sim DC ( v 1 , v 3 ) sim CC ( v 1 , v 3 ) sim BC ( v 1 , v 3 ) sim EC ( v 1 , v 3 )   X =   sim DC ( v 1 , v 4 ) sim CC ( v 1 , v 4 ) sim BC ( v 1 , v 4 ) sim EC ( v 1 , v 4 )   . . . .   . . . . . . . . 12 of 32

Learning to Rank � Every node v i sort all other nodes according to Y i · Y j � v i : [ v 1 , v 2 , · · · , v n − 1 ] � Every pair ( v i , v j ) has a rank label � Ground-truth y ∈ R | z |× 1 , z = n × ( n − 1)   rank( v 1 , v 2 ) rank( v 1 , v 3 )   y =   rank( v 1 , v 4 )    .  . . 13 of 32

Semantic content of embeddings � Deepwalk: d=128, k=5, r=10, l=80 � node2vec: d=128, q=5, p=0.1 � line: d=128 Dataset Weight DeepWalk LINE node2vec 0.09 ± 0.02 -0.15 ± 0.05 0.82 ± 0.01 w DC w CC -0.01 ± 0.04 -0.07 ± 0.00 0.04 ± 0.00 Facebook 0.64 ± 0.03 -0.55 ± 0.07 -0.01 ± 0.04 w BC w EC -0.64 ± 0.02 -0.68 ± 0.08 -0.07 ± 0.00 w DC 0.07 ± 0.09 -0.09 ± 0.05 0.53 ± 0.01 -0.15 ± 0.00 -0.00 ± 0.08 0.04 ± 0.17 w CC Twitter w BC 0.51 ± 0.04 -0.69 ± 0.00 -0.11 ± 0.10 w EC -0.71 ± 0.05 -0.58 ± 0.01 -0.03 ± 0.01 w DC 0.02 ± 0.04 -0.00 ± 0.10 0.65 ± 0.00 -0.05 ± 0.11 -0.04 ± 0.09 0.09 ± 0.07 w CC Google+ w BC 0.55 ± 0.05 -0.53 ± 0.07 -0.14 ± 0.00 -0.63 ± 0.03 -0.68 ± 0.06 -0.07 ± 0.03 w EC 14 of 32

Predicting Centrality Values Dataset | V | Average Closeness std Facebook 4 , 039 0 . 2759 0.0349 Feedforward Network Linear Regression 0.09 HARP 0.08 PRUNE 0.025 HOPE 0.07 node2vec HARP 0.020 PRUNE DeepWalk 0.06 MAE MAE HOPE 0.05 node2vec 0.015 DeepWalk 0.04 0.010 0.03 2 8 32 128 2 8 32 128 Embedding Size Embedding Size Linear Regression gives the minimum MAE by HARP: 0.0070 15 of 32

Shortest-path Problem Single-Source Shortest-Path (SSSP) Given a Graph G = ( V , E ) and Source s ∈ V , compute all distances δ ( s , v ), where v ∈ V . All-Pairs Shortest-Path (APSP) Given a graph G = ( V , E ), compute all distances between a source vertex s and a destination v , where s and v are elements of the set V . 17 of 32

Shortest-path Problem Single-Source Shortest-Path (SSSP) Given a Graph G = ( V , E ) and Source s ∈ V , compute all distances δ ( s , v ), where v ∈ V . All-Pairs Shortest-Path (APSP) Given a graph G = ( V , E ), compute all distances between a source vertex s and a destination v , where s and v are elements of the set V . � Exact methods: Algorithms try to find the exact shortest-paths between vertices in any type of graphs � Approximation Methods: Algorithms attempt to compute shortest-paths between nodes by querying only some of the distances. 17 of 32

Exact Methods Algorithm Time Complexity O ( | V | 2 log | V | + | V || E | log | V | ) Dijkstra ( V times) [14] O ( | V | 3 ) Floyd-Warshall [3] Thorup [4] O ( | E || V | ) Pettie & Ramachandran [5] O ( | E || V | log α ( | E | , | V | )) O ( | V | 3 / 2 Ω ( log | V | ) 1 / 2 ) Williams [6] O ( | V | 3 (log log | V | ) / (log | V | ) 2 ) Han and Takaoka [15] O ( | V | 3 (log log | V | ) / log | V | 1 / 3) Fredman [16] O ( | V | 3 / log | V | ) T. M. Chan [17] 18 of 32

Approximation Methods v � Landmark-based Methods [8, 9, 10, 11] d(l,v) L � A subset L of vertices as landmarks l � k = | L | , k ≪ | V | d(u,l) u 19 of 32

Approximation Methods v � Landmark-based Methods [8, 9, 10, 11] d(l,v) L � A subset L of vertices as landmarks l � k = | L | , k ≪ | V | d(u,l) u � For all l ∈ L and u ∈ V : d ( l , u ) � BFS: O ( k ( | E | + | V | )) 19 of 32

Approximation Methods v � Landmark-based Methods [8, 9, 10, 11] d(l,v) L � A subset L of vertices as landmarks l � k = | L | , k ≪ | V | d(u,l) u � For all l ∈ L and u ∈ V : d ( l , u ) � BFS: O ( k ( | E | + | V | )) � d ( u , v ) = min( d ( u , l ) + d ( l , v )) � Query time: O ( k ) 19 of 32

Approximation Methods v � Landmark-based Methods [8, 9, 10, 11] d(l,v) L � A subset L of vertices as landmarks l � k = | L | , k ≪ | V | d(u,l) u � For all l ∈ L and u ∈ V : d ( l , u ) � BFS: O ( k ( | E | + | V | )) � d ( u , v ) = min( d ( u , l ) + d ( l , v )) � Query time: O ( k ) � For all pairs: O ( k ( | E | + | V | )) + O ( k | V | 2 ) 19 of 32

Approximation Methods v � Landmark-based Methods [8, 9, 10, 11] d(l,v) L � A subset L of vertices as landmarks l � k = | L | , k ≪ | V | d(u,l) u � For all l ∈ L and u ∈ V : d ( l , u ) � BFS: O ( k ( | E | + | V | )) � d ( u , v ) = min( d ( u , l ) + d ( l , v )) � Query time: O ( k ) � For all pairs: O ( k ( | E | + | V | )) + O ( k | V | 2 ) Optimal Landmark selection is a NP-hard problem! 19 of 32

Exploiting Graph Embeddings for Graph Analysis Tasks Fatemeh Salehi - PowerPoint PPT Presentation

Exploiting Graph Embeddings for Graph Analysis Tasks Fatemeh Salehi Rizi Graph Embedding Day University of Lyon September 7, 2018 Outline Circle Prediction Social labels in an ego-network Semantic Content of Vector Embeddings Network

Embeddings @ Twitter Making ML easy with Embeddings !!! Sept 2018 Agenda 1 Team 2 Whats an

Word embeddings Rappel Embeddings ( pas Word Embeddings ) Est une lookup table Formalisme:

Word Embeddings Natural Language Processing VU (706.230) - Andi Rexha 02/04/2020 Word Embeddings

Word Embeddings Revisited: Contextual Embeddings CS 6956: Deep Learning for NLP Overview

Graph Embeddings Alicia Frame, PhD October 10, 2019 Overview Whats an embedding? How do

9.4 Local Perception Filters 9.4 Local Perception Filters Exploiting Exploiting Perceptual

Word Embeddings CS 6956: Deep Learning for NLP Overview Representing meaning Word

Word Embeddings CS 6956: Deep Learning for NLP Overview Representing meaning Word

Mixed membership word embeddings: Corpus-specific embeddings without big data James Foulds

Word Embeddings CS 6956: Deep Learning for NLP Overview Representing meaning Word

Word Embeddings CS 6956: Deep Learning for NLP Overview Representing meaning Word

of Graph Embeddings Aleksandar Bojchevski Technical University of Munich, Germany Graph

Shared Memory Programming with OpenMP Lecture 6: Tasks What are tasks? Tasks are

Scheduling Aperiodic Tasks Background Scheduling Treat aperiodic tasks as lowest-priority

CS 4803 / 7643: Deep Learning Guest Lecture: Embeddings and world2vec Feb. 18 th 2020 Ledell Wu

Z 2 -embeddings and Tournaments Radoslav Fulek , Jan Kyn cl Z 2 -embeddings and Tournaments

Mobility, Data Mining, and Privacy Yannis Theodoridis InfoLab, University of Piraeus, Greece

Subgroup Discovery and Community Detection on Attributed Graphs Martin Atzmueller Universit y of

AN INTRODUCTION TO NETWORK SCIENCE Nicola Perra n.perra@greenwich.ac.uk @net_science

trs trtr sts

MRP Assessment of Generic Implications of Davis-Besse RPV Head Corrosion MRP-NRC Staff Meeting

Math 3230 Abstract Algebra I Sec 2.1: Cyclic and abelian groups, orbits Slides created by M.

Chemistry 1000 Lecture 24: Group 14 and Boron Marc R. Roussel November 2, 2018 Marc R. Roussel

Acceptance criteria Exercise IAEA SAET EK DSA Acceptance criteria Regulatory requirements