Spaceland Embedding of Sparse Stochastic Graphs
IEEE High Performance Extreme Computing, September 25, 2019

Nikos Pitsianis (1,2), Alexandros-S. Iliopoulos (2), Dimitris Floros (1), Xiaobai Sun (2)
(1) Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki
(2) Department of Computer Science, Duke University

Pitsianis Iliopoulos Floros Sun (AUTh|Duke) Embedding of Sparse Stochastic Graphs IEEE HPEC19 | Sep 25, 2019 1 / 21

Outline
1. Introduction
2. Contribution A: SG-t-SNE
3. Contribution B: SG-t-SNE-Π
4. Key references

1. Introduction
   - Graph embedding
   - Precursor work
   - Significant impact
   - Main limitations
2. Contribution A: SG-t-SNE
3. Contribution B: SG-t-SNE-Π
4. Key references

Introduction: graphs & graph embedding

Graph/network G(V, E): relational data increasingly arise in various applications: biological, social, and friend networks, food webs, co-author networks, word co-occurrence networks, product co-purchase networks, ...

Graph (vertex) embedding, to facilitate many tasks of graph data analysis:
Mapping/encoding: $V = \mathcal{X} \Rightarrow \mathcal{Y} \subseteq \mathbb{R}^d$
- word embedding (of a co-occurrence graph)
- image embedding (of a nearest-similarity graph)
- product embedding (of a co-purchase graph)
- user embedding (of a friend network)

[Figure: social network orkut with n = 3,072,441 user nodes and m = 237,442,607 friendship links. Degree distribution (top) and 2D embedding (bottom).]

SNE: stochastic neighbor embedding algorithm

Pipeline: $X = \{x_i\}_{i=1}^{n}$ → kNN graph $G(V, E_k)$ → cast stochastic weights on $E_k$, giving $G(V, E_k, W_k)$ → distribution matching → $Y = \{y_i\}_{i=1}^{n}$, $y_i \in \mathbb{R}^d$

[Figure: SNE (1) pipeline illustrated with the spatial embedding in $\mathbb{R}^2$ of n = 1,306,127 RNA sequences of E18 mouse brain cells; each $x_i$ is an RNA sequence.]

(1) Hinton and Roweis, NIPS, 2003
10x Genomics, App Note, 2017

t-SNE: t-distributed SNE (1)

From input vertex data $\mathcal{X} = \{x_i\}_{i=1}^{n}$:
- Find kNNs among $D = [d^2(x_i, x_j)]_{n \times n}$
- Cast $D_{kNN}$ to stochastic $P = [p_{j|i} + p_{i|j}]/2$, with Gaussian conditionals
  $p_{j|i}(\sigma_i) = \frac{1}{Z_i} \exp\left(-d_{ij}^2 / 2\sigma_i^2\right)$
- $\sigma_i$ is determined by the perplexity equations
  $-\sum_j a_{ij}\, p_{j|i}(\sigma_i) \log(p_{j|i}(\sigma_i)) = \log(u), \quad \forall i$   (1)
  u: perplexity parameter chosen by the user

Vertex embedding coordinates $\mathcal{Y} = \{y_i\}_{i=1}^{n}$, $y_i \in \mathbb{R}^d$, d = 1, 2, 3, ...:
- Q follows the t-distribution (Cauchy kernel):
  $q_{ij} = \frac{1}{Z} \left(1 + \|y_i - y_j\|^2\right)^{-1}$
- The coordinates are determined by the best distribution matching, measured by KL divergence:
  $\mathcal{Y}^* = \arg\min_{\mathcal{Y}} \mathrm{KL}(P \,\|\, Q(\mathcal{Y}))$

(1) van der Maaten and Hinton, JMLR, 2008
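The perplexity equation (1) is a one-dimensional root-finding problem per vertex. The following is a minimal sketch of one way to solve it, assuming plain bisection on σ_i and natural logarithms; the function names and the sample distances are illustrative and do not come from the slides.

```python
import numpy as np

def conditional_probs(d2, sigma):
    """Gaussian conditionals p_{j|i}(sigma) over one vertex's kNN squared distances."""
    # Subtracting the minimum distance avoids underflow and cancels in the normalization.
    w = np.exp(-(d2 - d2.min()) / (2.0 * sigma ** 2))
    return w / w.sum()

def entropy(p):
    """Shannon entropy with natural logarithm, skipping zero entries."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def sigma_for_perplexity(d2, u, tol=1e-8, max_iter=200):
    """Bisection on sigma so that entropy(p) = log(u); assumes 1 < u < len(d2)."""
    target = np.log(u)
    lo, hi = 1e-10, 1.0
    # Entropy grows with sigma, so widen the bracket until it exceeds the target.
    while entropy(conditional_probs(d2, hi)) < target:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if entropy(conditional_probs(d2, mid)) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol * hi:
            break
    return 0.5 * (lo + hi)

# Hypothetical squared distances from one vertex to its k = 5 nearest neighbors.
d2 = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
sigma = sigma_for_perplexity(d2, u=3.0)
p = conditional_probs(d2, sigma)
```

The bracket-widening loop terminates only when u is below the number of neighbors, which is the coupling between k and u that equation (1) implies.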

t-SNE: iterative embedding process

Pipeline: $X = \{x_i\}_{i=1}^{n}$ → kNN graph $G(V, E_k)$ → cast stochastic weights on $E_k$, giving $G(V, E_k, W_k)$ → distribution matching → $Y = \{y_i\}_{i=1}^{n}$, $y_i \in \mathbb{R}^d$

[Figure: SNE (1) pipeline illustrated with the spatial embedding in $\mathbb{R}^2$ of n = 60,000 handwritten digits (MNIST dataset); each $x_i$ holds the pixels of a digit image.]

(1) Hinton and Roweis, NIPS, 2003
LeCun et al., Proc IEEE, 1998
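The slide does not spell out the update rule. As an illustration only, the sketch below evaluates the Cauchy-kernel Q and the standard t-SNE gradient of KL(P‖Q) from van der Maaten and Hinton (2008), then takes a few plain gradient-descent steps. It is a dense O(n²) toy, assuming a symmetric P with zero diagonal that sums to one; the random P, the step size, and the function names are hypothetical.

```python
import numpy as np

def tsne_q_and_grad(P, Y):
    """Return Q (Cauchy kernel) and the gradient of KL(P || Q) with respect to Y."""
    diff = Y[:, None, :] - Y[None, :, :]            # pairwise y_i - y_j, shape (n, n, d)
    w = 1.0 / (1.0 + np.sum(diff ** 2, axis=2))     # (1 + ||y_i - y_j||^2)^(-1)
    np.fill_diagonal(w, 0.0)                        # exclude self-pairs
    Q = w / w.sum()                                 # normalization by Z
    # dKL/dy_i = 4 * sum_j (p_ij - q_ij) * w_ij * (y_i - y_j)
    grad = 4.0 * np.einsum("ij,ijk->ik", (P - Q) * w, diff)
    return Q, grad

def kl_divergence(P, Q):
    """KL(P || Q) over the entries where P is nonzero."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))

# Hypothetical toy input: a random symmetric stochastic matrix on 6 vertices.
rng = np.random.default_rng(0)
A = rng.random((6, 6))
A = A + A.T
np.fill_diagonal(A, 0.0)
P = A / A.sum()

Y0 = rng.normal(size=(6, 2))      # initial 2D coordinates
Y = Y0.copy()
step = 0.5                        # illustrative step size, no momentum
for _ in range(20):
    Q, grad = tsne_q_and_grad(P, Y)
    Y -= step * grad
```

Practical implementations add momentum, early exaggeration, and fast approximations of the dense term; none of these are shown here.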

Significant impacts

With low-dimensional spatial embedding in particular, the SNE/t-SNE algorithm family has enabled
- visual inspection and identification of connections/separations
- network-based analysis for hidden connections
- hypothesis generation and scientific discoveries

Amir et al., Nat Biotechnol, 2013
Abdelmoula et al., PNAS, 2016
van Unen et al., Nat Commun, 2017

Main limitations

⊲ Restricted to data in a metric space
  - Vertices of a network do not necessarily or readily reside in a metric space
⊲ Restricted to kNN-based stochastic graphs
  - Degree k and perplexity u are coupled by the condition 0 < u < k implied in (1)
  - A typical economic phenomenon: low-degree nodes in majority, hub nodes in minority
  - Real networks are irregular in degree distribution, defying the parameter condition u < deg(i)

[Figure: irregular degree distribution for each of three real-world networks (Amazon, DBLP, orkut): low-degree nodes (including leaf nodes) in majority; high-degree nodes in minority.]
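The condition u < deg(i) follows from the fact that a distribution over k neighbors has entropy at most log(k), so its perplexity cannot exceed k. A small numeric check, with made-up distributions chosen only for illustration:

```python
import numpy as np

def perplexity(p):
    """exp of the Shannon entropy (natural log) of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(np.exp(-np.sum(p * np.log(p))))

uniform = np.full(5, 1.0 / 5.0)                      # 5 neighbors, equal weights
skewed = np.array([0.9, 0.05, 0.03, 0.01, 0.01])     # 5 neighbors, one dominant
leaf = np.array([1.0])                               # leaf node: a single neighbor

ppl_uniform = perplexity(uniform)   # attains the upper bound k = 5
ppl_skewed = perplexity(skewed)     # strictly below 5
ppl_leaf = perplexity(leaf)         # equals 1, so no perplexity u > 1 is reachable
```

A leaf node therefore can never satisfy equation (1) for any user-chosen u > 1, which is why a single global perplexity does not fit graphs with many low-degree vertices.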

Main limitations

⊲ Existing software programs ⋆ are limited, due to slow computation speed, to
  - small graphs, or
  - 1D/2D embedding
⊲ Many networks are large; Spaceland (3D) embedding has much greater potential in preserving/encoding more structural information

[Figure: (Left) kNN graph (k = 150) for a Möbius strip on a 256 × 32 lattice, with n = 8,192 nodes; (Middle) 2D embedding with missed/unresolved connections; (Right) 3D embedding with correct connections, also offering multiple or steerable views.]

⋆ van der Maaten, JMLR, 2014; Linderman et al., Nat Methods, 2019
https://lvdmaaten.github.io/tsne
https://github.com/KlugerLab/FIt-SNE

1. Introduction
2. Contribution A: SG-t-SNE
   - Admitting arbitrary stochastic graph (SG)
   - Enabled embeddings of real-world graphs
3. Contribution B: SG-t-SNE-Π
4. Key references

SG-t-SNE: stochastic graph t-SNE

Pipeline, admitting two types of input:
- Vertex data $X = \{x_i\}_{i=1}^{n}$ → kNN graph $G(V, E_k)$ → cast/scale stochastic weights on $E_k$, giving $G(V, E_k, W_k)$
- or an arbitrary stochastic graph G, admitted directly
Both lead to $G(V, E, P(\lambda))$ → distribution matching → $Y = \{y_i\}_{i=1}^{n}$, $y_i \in \mathbb{R}^d$

[Figure: SG-SNE pipeline admitting two types of input. (Top) embedding in $\mathbb{R}^2$ of n = 1,306,127 RNA sequences of E18 mouse brain cells; (bottom) embedding of n = 8,381 peripheral blood mononuclear cells.]

10x Genomics, App Note, 2017
Zheng et al., Nat Commun, 2017

SG-t-SNE: distinctive extension & the keystone

Distinctions:
◇ Admitting an arbitrary stochastic graph $P = [p_{j|i}]$, i.e., extending the embedding to the entire family of stochastic graphs
◇ Making it feasible to exploit sparse connection patterns for
  - investigative/explorative data analysis
  - higher computation efficiency

Key: the stochastic reshaping/rescaling equations, $\forall i$:
  $\sum_j a_{ij}\, \phi\left(p_{j|i}^{\gamma_i}\right) = \lambda \quad \Rightarrow \quad p_{j|i}(\lambda) = \frac{a_{ij}\, \phi\left(p_{j|i}^{\gamma_i}\right)}{\lambda}$
- $\phi \geq 0$: reshaping function, monotonically increasing (1)
- $\lambda > 0$: rescaling parameter
- $A = [a_{ij}]$: the binary-valued adjacency matrix
- Solutions $\gamma_i$ exist unconditionally

(1) We used $\phi(x) = x$ for the presented embeddings
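To make the rescaling equations concrete, here is a minimal sketch for a single row with φ(x) = x, the choice stated in the footnote. It assumes the row holds only the nonzero entries p_{j|i} of vertex i, each strictly between 0 and 1, and it uses plain bisection on γ_i; the function names and sample values are hypothetical, and the authors' solver may differ.

```python
import numpy as np

def solve_gamma(p, lam, tol=1e-10, max_iter=200):
    """Find gamma such that sum_j p_j**gamma = lam, for entries p_j in (0, 1)."""
    p = np.asarray(p, dtype=float)

    def f(g):
        # Strictly decreasing in g because every p_j is in (0, 1).
        return np.sum(p ** g) - lam

    lo, hi = -1.0, 1.0
    while f(lo) < 0:          # move left until the sum is at least lam
        lo *= 2.0
    while f(hi) > 0:          # move right until the sum is at most lam
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def rescale_row(p, lam):
    """Return p_{j|i}(lam) = p_j**gamma / lam, which sums to one by construction."""
    p = np.asarray(p, dtype=float)
    gamma = solve_gamma(p, lam)
    return p ** gamma / lam

row = np.array([0.5, 0.3, 0.2])     # hypothetical stochastic row with 3 neighbors
same = rescale_row(row, 1.0)        # lam = 1 gives gamma = 1 and leaves the row unchanged
flatter = rescale_row(row, 2.0)     # lam = 2 gives 0 < gamma < 1 and flattens the row
```

Unlike the perplexity equation, the search does not depend on the vertex degree exceeding a user parameter, so the same λ can be applied to rows of very different lengths.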

Enabled embedding of Amazon product co-purchase network

ID      n_sub   e_in   e_out   w_in    w_out
(534)   44      374    20      71.7    2.4
(678)   70      506    19      114.6   3.3

[Figure: Amazon product sale network: n = 334,863 products, m = 1,851,744 edges for co-purchase connectivity, irregular degree distribution. (Left) 2D product embedding enabled by SG-t-SNE; (Right) two product clusters/subgraphs, (534) and (678); the vertices of each are embedded closer together, with denser intra-connections.]

Yang and Leskovec, K&IS, 2015

Enabled embedding of social network orkut

Social network orkut: n = 3,072,441 user nodes, m = 237,442,607 friendship links.

[Figure: (Left & Middle) 3D and 2D embeddings enabled by SG-t-SNE; (Right) findings.]

Findings: there is a weak-link zone (easier to observe in the 3D embedding), and calibrated communities reside on one side or the other; the rich structure reflects/decodes information on geophysical regions and cultural diversity.

Yang and Leskovec, K&IS, 2015

SG-t-SNE: exploiting sparse patterns

⊲ Vertex data: 8k peripheral blood mononuclear cells (PBMCs)
⊲ PBMC embedding via kNN graphs built from a cell similarity measure
⊲ SG-t-SNE can use a much sparser neighbor graph

[Figure panels: kNN graph $P_k$, k = 30; t-SNE embedding with k = 150, u = 50; SG-t-SNE embedding with k = 30, λ = 80. PBMCs are color-coded by the labels provided with the data.]

Zheng et al., Nat Commun, 2017

1. Introduction
2. Contribution A: SG-t-SNE
3. Contribution B: SG-t-SNE-Π
   - Challenges in gradient updates
   - Fast calculation of sparse interactions
   - Fast calculation of dense interactions
   - Fast data translocation
   - Comparisons in performance
4. Key references
