

SLIDE 1

Network Topology Inference

Gonzalo Mateos

  • Dept. of ECE and Goergen Institute for Data Science

University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/

April 9, 2019

Network Science Analytics Network Topology Inference 1

SLIDE 2

Network topology inference

Network topology inference problems Link prediction Case study: Predicting lawyer collaboration Inference of association networks Case study: Inferring genetic regulatory interactions Tomographic network topology inference Case study: Computer network topology identification

Network Science Analytics Network Topology Inference 2

SLIDE 3

Network topology inference

◮ So far dealt with modeling and inference of observed network graphs

⇒ Q: If a portion of G is unobserved, can we infer it from data?

◮ Discussed construction of representations G(V , E) for network mapping

⇒ Largely informal methodology, lacking an element of validation

◮ Formulate instead as statistical inference task, i.e. given

◮ Measurements xi of attributes at some or all vertices i ∈ V
◮ Indicators yij of edge status for some vertex pairs {i, j} ∈ V(2)
◮ A collection G of candidate graphs G

Goal: infer the topology of the network graph G(V , E)

◮ Three canonical network topology inference problems

(i) Link prediction (ii) Association network inference (iii) Tomographic network topology inference

Network Science Analytics Network Topology Inference 3

SLIDE 4

Link prediction

Original graph Link prediction

◮ Suppose we observe vertex attributes x = [x1, . . . , xNv]⊤; and
◮ Edge status is only observed for some subset of pairs V(2)_obs ⊂ V(2)

◮ Goal: predict edge status for all other pairs, i.e., V(2)_miss = V(2) \ V(2)_obs

Network Science Analytics Network Topology Inference 4

SLIDE 5

Association network inference

Original graph Association network inference

◮ Suppose we only observe vertex attributes x = [x1, . . . , xNv]⊤; and
◮ Assume edge (i, j) is defined by a nontrivial 'level of association' among xi, xj
◮ Goal: predict edge status for all vertex pairs V(2)

Network Science Analytics Network Topology Inference 5

SLIDE 6

Tomographic network topology inference

Original graph Tomographic inference

◮ Suppose we only observe xi for a subset of vertices i ∈ V in the 'perimeter' of G
◮ Goal: predict edge and vertex status in the 'interior' of G

Network Science Analytics Network Topology Inference 6

SLIDE 7

Link prediction

Network topology inference problems Link prediction Case study: Predicting lawyer collaboration Inference of association networks Case study: Inferring genetic regulatory interactions Tomographic network topology inference Case study: Computer network topology identification

Network Science Analytics Network Topology Inference 7

SLIDE 8

Link prediction

◮ Let G(V , E) be a random graph, with adjacency matrix Y ∈ {0, 1}Nv×Nv

⇒ Yobs and Ymiss denote the entries indexed by V(2)_obs and V(2)_miss

Link prediction Predict entries in Ymiss, given observations Yobs = yobs and possibly various vertex attributes X = x ∈ RNv

◮ Edge status information may be missing due to:

⇒ Difficulty in observation, issues of sampling ⇒ Edge is not yet present, wish to predict future status

◮ Given a model for X and (Yobs, Ymiss), jointly predict Ymiss based on

P( Ymiss | Yobs = yobs, X = x )

⇒ More manageable to predict the variables Ymiss_ij individually

Network Science Analytics Network Topology Inference 8

SLIDE 9

Informal scoring methods

◮ Idea: compute score s(i, j) for missing 'potential edges' {i, j} ∈ V(2)_miss

⇒ Predicted edges returned by retaining the top n∗ scores

◮ Scores designed to assess certain local structural properties of Gobs

⇒ Distance-based, inspired by the small-world principle

s(i, j) = −dist_Gobs(i, j)

⇒ Neighborhood-based, e.g., the number of common neighbors

s(i, j) = |N_i^obs ∩ N_j^obs|,   or the Jaccard coefficient   s(i, j) = |N_i^obs ∩ N_j^obs| / |N_i^obs ∪ N_j^obs|

⇒ Favor loosely-connected common neighbors [Adamic-Adar'03]

s(i, j) = Σ_{k ∈ N_i^obs ∩ N_j^obs} 1 / log |N_k^obs|

Network Science Analytics Network Topology Inference 9
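The three scores above can be sketched in plain Python; the adjacency dict below is an illustrative toy observed graph, not an example from the slides:

```python
import math

# Toy observed graph G_obs as an adjacency dict (illustrative only)
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}

def dist(adj, i, j):
    # BFS shortest-path distance between i and j in the observed graph
    frontier, seen, d = {i}, {i}, 0
    while frontier:
        if j in frontier:
            return d
        frontier = {k for u in frontier for k in adj[u]} - seen
        seen |= frontier
        d += 1
    return float("inf")

def s_dist(adj, i, j):
    # Distance-based score: s(i,j) = -dist(i,j)
    return -dist(adj, i, j)

def s_common(adj, i, j):
    # Neighborhood-based score: |N_i ∩ N_j|
    return len(adj[i] & adj[j])

def s_jaccard(adj, i, j):
    # Normalized variant: |N_i ∩ N_j| / |N_i ∪ N_j|
    return len(adj[i] & adj[j]) / len(adj[i] | adj[j])

def s_adamic_adar(adj, i, j):
    # Adamic-Adar: down-weight common neighbors with large degree
    return sum(1.0 / math.log(len(adj[k])) for k in adj[i] & adj[j])
```

Ranking all pairs in V(2)_miss by any of these scores and keeping the top n∗ yields the predicted edges.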

SLIDE 10

Tests on co-authorship networks

◮ Results from a link prediction study in [Liben Nowell-Kleinberg’03]

Network Science Analytics Network Topology Inference 10

SLIDE 11

Classification methods

◮ Idea: use training data yobs and x to build a binary classifier

⇒ Classifier is in turn used to predict the entries in Ymiss

◮ Logistic regression classifiers most popular, based on the model

log [ Pβ(Yij = 1 | Zij = z) / Pβ(Yij = 0 | Zij = z) ] = β⊤z,

where (i) β ∈ RK is a vector of regression coefficients; and (ii) Zij is a vector of explanatory variables indexed by {i, j}

Zij = [g1(Yobs_(−ij), X), . . . , gK(Yobs_(−ij), X)]⊤

◮ Functions gk(·) encode useful predictive information in yobs_(−ij) and x

Ex: vertex attributes, score functions, network statistics in ERGMs

Network Science Analytics Network Topology Inference 11

SLIDE 12

Logistic regression classifier

◮ Train: Obtain MLE β̂ via iteratively-reweighted LS

◮ Test: Potential edges (i, j) declared present based on probabilities

Pβ̂(Yij = 1 | Zij = z) = exp(β̂⊤z) / (1 + exp(β̂⊤z))

◮ Logistic regression assumes Yij conditionally independent given z

⇒ Seldom the case with relational network data

◮ Underlying mechanism of data missingness is important

⇒ Classification for link prediction reminiscent of cross-validation ⇒ Assumption that data are missing at random is fundamental

Network Science Analytics Network Topology Inference 12
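A minimal numpy sketch of the train/test steps above — iteratively-reweighted LS (Newton steps) for the MLE β̂, then thresholded fitted probabilities; the synthetic data and coefficients are hypothetical:

```python
import numpy as np

def fit_logistic_irls(Z, y, iters=15):
    # Train: MLE of beta via iteratively-reweighted least squares
    beta = np.zeros(Z.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))   # current P(Y_ij = 1 | z)
        W = p * (1.0 - p)                     # IRLS weights
        # Newton update: beta += (Z'WZ)^{-1} Z'(y - p)
        beta += np.linalg.solve(Z.T @ (W[:, None] * Z), Z.T @ (y - p))
    return beta

def predict_edges(beta, Z, thresh=0.5):
    # Test: edge declared when P(Y_ij=1|z) = exp(b'z)/(1+exp(b'z)) > thresh
    probs = 1.0 / (1.0 + np.exp(-Z @ beta))
    return probs, probs > thresh
```

Each row of `Z` would hold the explanatory variables Zij for one vertex pair, and `y` the observed edge indicators yobs.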

SLIDE 13

Latent variable models

◮ In addition to a linear predictor β⊤z, latent models describe Yij

⇒ As a function of vertex-specific latent variables ui and uj

◮ Latent models are flexible to capture underlying social mechanisms

Ex: homophily (transitivity) and stochastic equivalence (groups)

Network Science Analytics Network Topology Inference 13

SLIDE 14

Latent class and distance models

◮ Latent distance model: node i has unobserved position Ui ∈ Rd

◮ Positions Ui in latent space assumed i.i.d., e.g., Gaussian distributed
◮ Model cond. probability of edge Yij as a function of β⊤z − ‖ui − uj‖2
◮ Homophily: Nearby nodes in latent space more likely to link

◮ Latent class model: node i belongs to unobserved class Ui ∈ {1, . . . , k}

◮ Classes Ui assumed i.i.d., e.g., multinomial distributed
◮ Model cond. probability of edge Yij as a function of β⊤z + θ_{ui,uj}
◮ Stochastic equivalence: Nodes in same class equally likely to link

◮ P. D. Hoff, “Modeling homophily and stochastic equivalence in

symmetric relational data,” NIPS, 2008

Network Science Analytics Network Topology Inference 14

SLIDE 15

Logistic regression with latent variables

◮ Let M ∈ RNv×Nv be unknown, random, and symmetric of the form

M = U⊤ΛU + E, where

(i) U = [u1, . . . , uNv] is a random orthonormal matrix of latent variables;
(ii) Λ is a random diagonal matrix; and
(iii) E is a symmetric matrix of i.i.d. noise entries εij

◮ Latent eigenmodel subsumes the class and distance variants [Hoff'08]

⇒ Notice that Mij = ui⊤Λuj + εij

◮ The logistic regression model with latent variables is

log [ Pβ(Yij = 1 | Zij = z, Mij = m) / Pβ(Yij = 0 | Zij = z, Mij = m) ] = β⊤z + m

◮ Yij still assumed conditionally independent given Zij and Mij

⇒ But they are conditionally dependent given only Zij

Network Science Analytics Network Topology Inference 15

SLIDE 16

Bayesian link prediction

◮ Specify distributions for U, Λ, E to make statistical link predictions

◮ Bayesian inference natural ⇒ Specify a prior for β as well

◮ To predict the entries in Ymiss, threshold the posterior mean

E[ exp(β⊤Zij + Mij) / (1 + exp(β⊤Zij + Mij)) | Yobs = yobs, Zij = z ]

◮ Use MCMC algorithms to approximate the posterior distribution

◮ Gaussian distributions attractive for their conjugacy properties

◮ Higher complexity than MLE for standard logistic regression

⇒ Need to generate draws for Nv² unobserved variables {Mij}

⇒ Major cost reduction with reduced-rank models, rank(U) = k ≪ Nv

Network Science Analytics Network Topology Inference 16

SLIDE 17

Case study

Network topology inference problems Link prediction Case study: Predicting lawyer collaboration Inference of association networks Case study: Inferring genetic regulatory interactions Tomographic network topology inference Case study: Computer network topology identification

Network Science Analytics Network Topology Inference 17

SLIDE 18

Lawyer collaboration network

◮ Network G obs of working relationships among lawyers [Lazega’01]

◮ Nodes are Nv = 36 partners, edges indicate partners worked together


◮ Data includes various node-level attributes:

◮ Seniority (node labels indicate rank ordering)
◮ Office location (triangle, square or pentagon)
◮ Type of practice, i.e., litigation (red) and corporate (cyan)
◮ Gender (three partners are female, labeled 27, 29 and 34)

◮ Goal: predict cooperation among social actors in an organization

Network Science Analytics Network Topology Inference 18

SLIDE 19

Methods to predict lawyer collaboration

◮ Define the following set of explanatory variables:

Z(1)_ij = seniority_i + seniority_j,   Z(2)_ij = practice_i + practice_j
Z(3)_ij = I{practice_i = practice_j},  Z(4)_ij = I{gender_i = gender_j}
Z(5)_ij = I{office_i = office_j},      Z(6)_ij = |N_i^obs ∩ N_j^obs|

Method 1: standard logistic regression with Z(1)_ij, . . . , Z(5)_ij
Method 2: standard logistic regression with Z(1)_ij, . . . , Z(6)_ij
Method 3: informal scoring method with s(i, j) = Z(6)_ij
Method 4: logistic regression with Z(1)_ij, . . . , Z(5)_ij and latent eigenmodel

◮ Five-fold cross-validation over the set of 36(36 − 1)/2 = 630 vertex pairs

⇒ For each fold, 630/5 = 126 pairs in Ymiss and the rest in Yobs

Network Science Analytics Network Topology Inference 19

SLIDE 20

Receiver operating characteristic

◮ Receiver operating characteristic curves show predictive performance

(ROC curves, False Positive Rate vs. True Positive Rate, for Methods 1–4 and a random predictor)

◮ Method 1 performs worst ⇒ Agnostic to network structure
◮ Informal Method 3 yields slightly worse performance than Methods 2 and 4

Network Science Analytics Network Topology Inference 20

SLIDE 21

Inference of association networks

Network topology inference problems Link prediction Case study: Predicting lawyer collaboration Inference of association networks Case study: Inferring genetic regulatory interactions Tomographic network topology inference Case study: Computer network topology identification

Network Science Analytics Network Topology Inference 21

SLIDE 22

Association networks

◮ Def: in association networks vertices are linked if there is a sufficient level of 'association' between attributes of vertex pairs

Examples:
◮ Scientific citation networks
◮ Movie networks
◮ Gene-regulatory networks
◮ Neuro-functional connectivity networks

Network Science Analytics Network Topology Inference 22

SLIDE 23

Association network inference

◮ Given a collection of Nv elements represented as vertices v ∈ V

◮ Let xi ∈ Rm be a vector of observed vertex attributes, for all i ∈ V

◮ User-defined similarity sim(i, j) = f (xi, xj) specifies edges (i, j) ∈ E

◮ Q: What if sim values themselves (i.e., edge status) not observable?

Association network inference Infer non-trivial sim values from vertex observations {x1, . . . , xNv }

◮ Various choices to be made, hence multiple possible approaches

◮ Choice of sim: correlation, partial correlation, mutual information
◮ Choice of inference: hypothesis testing, regression, ad hoc
◮ Choice of parameters: testing thresholds, tuning regularization

Network Science Analytics Network Topology Inference 23

SLIDE 24

Correlation networks

◮ Let Xi ∈ R be an RV of interest corresponding to i ∈ V
◮ Pearson product-moment correlation as sim between vertex pairs

sim(i, j) := ρij = cov[Xi, Xj] / √(var[Xi] var[Xj]),   i, j ∈ V

◮ Def: the correlation network graph G(V, E) has edge set

E = {(i, j) ∈ V(2) : ρij ≠ 0}

◮ Association network inference ⇔ Inference of non-zero correlations

◮ Inference of E typically approached as a testing problem

H0 : ρij = 0 versus H1 : ρij ≠ 0

Network Science Analytics Network Topology Inference 24

SLIDE 25

Test statistics

◮ Let xi1, . . . , xin be observations of zero-mean Xi, for each i ∈ V

⇒ Common choice of test statistic are the empirical correlations

ρ̂ij = σ̂ij / √(σ̂ii σ̂jj),   where Σ̂ = [σ̂ij] = X⊤X / (n − 1)

◮ Convenient alternative statistic is Fisher's transformation

zij = (1/2) log[(1 + ρ̂ij) / (1 − ρ̂ij)],   i, j ∈ V

⇒ Under H0, zij ∼ N(0, 1/(n − 3)) ⇒ Simple to assess significance

◮ Reject H0 at significance level α, i.e., assign edge (i, j), if |zij| > z_{α/2}/√(n − 3)

Error rate control: PH0(false edge) = PH0( |zij| > z_{α/2}/√(n − 3) ) = α
Network Science Analytics Network Topology Inference 25
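As a sketch, the test above in a few lines of Python; the hard-coded critical value 1.96 corresponds to α = 0.05 (for other α one would use a normal quantile function, e.g., scipy.stats.norm.ppf):

```python
import math

def fisher_z(rho_hat):
    # Fisher transformation of an empirical correlation
    return 0.5 * math.log((1 + rho_hat) / (1 - rho_hat))

def declare_edge(rho_hat, n, z_crit=1.959964):
    # Under H0: rho_ij = 0, z_ij ~ N(0, 1/(n-3)); reject H0 (assign the
    # edge) when |z_ij| > z_{alpha/2}/sqrt(n-3); z_crit = 1.96 is z_{0.025}
    return abs(fisher_z(rho_hat)) > z_crit / math.sqrt(n - 3)
```

With n = 100 samples, an empirical correlation of 0.5 comfortably clears the threshold, while 0.01 does not.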

SLIDE 26

Networks and multiple testing

◮ Interesting testing challenges emerge with large-scale networks

⇒ Suppose we test all (Nv choose 2) vertex pairs, each at level α

◮ Even if the true G is the empty graph, i.e., E = ∅

⇒ We expect to declare (Nv choose 2)α spurious edges just by chance!

⇒ For a large graph, this number can be considerable

◮ Ex: For G of order Nv = 100 and individual tests at level α = 0.05

⇒ Expected number of spurious edges is 4950 × 0.05 ≈ 250

◮ This predicament known as the multiple testing problem in statistics

Network Science Analytics Network Topology Inference 26

SLIDE 27

Correction for multiple testing

◮ Idea: Control errors at the level of the collection of tests, not individually
◮ False discovery rate (FDR) control, i.e., for given level γ ensure

FDR = E[ Rfalse/R | R > 0 ] P(R > 0) ≤ γ

◮ R is the total number of edges detected; and
◮ Rfalse is the total number of false edges detected

◮ Method of FDR control at level γ [Benjamini-Hochberg'95]

Step 1: Sort the p-values for all N = (Nv choose 2) tests, yielding p(1) ≤ . . . ≤ p(N)
Step 2: Reject H0, i.e., declare all those edges, for which p(k) ≤ (k/N)γ

Network Science Analytics Network Topology Inference 27
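The two steps can be sketched directly in Python; this is a didactic version (statsmodels' `multipletests` offers a production implementation):

```python
def benjamini_hochberg(pvals, gamma):
    # Step 1: sort the p-values; Step 2: find the largest rank k with
    # p_(k) <= (k/N)*gamma and reject H0 for all tests up to that rank
    N = len(pvals)
    order = sorted(range(N), key=lambda i: pvals[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / N * gamma:
            k_max = rank
    rejected = set(order[:k_max])
    return [i in rejected for i in range(N)]
```

For p-values [0.001, 0.008, 0.039, 0.041, 0.9] at level γ = 0.05, only the first two tests are rejected (declared edges).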

SLIDE 28

Gene-regulatory interactions

◮ Genes are segments of DNA encoding information about cell functions
◮ Such information is used in the expression of genes

⇒ Creation of biochemical products, i.e., RNA or proteins

◮ Regulation of a gene refers to the control of its expression

Ex: regulation exerted during transcription, the copying of DNA to RNA
⇒ Controlling genes are transcription factors (TFs)
⇒ Controlled genes are termed targets
⇒ Regulation type: activation or repression

◮ Regulatory interactions among genes basic to the workings of organisms

⇒ Inference of interactions → Finding TF/target gene pairs

◮ Such relational information summarized in gene-regulatory networks

Network Science Analytics Network Topology Inference 28

SLIDE 29

Microarray data

◮ Relative levels of gene expression in the cell can be measured

⇒ Genome-wide scale data obtained using microarray technologies

◮ For each gene i ∈ V, measure an expression profile xi ∈ Rn

◮ Vector xi has gene expression levels under n different conditions
◮ Ex: changes in pH, heat level, oxygen concentrations

◮ Microarray data commonly used to infer gene regulatory interactions

Network Science Analytics Network Topology Inference 29

SLIDE 30

Example: gene expression level correlations

◮ Microarray data for the bacteria Escherichia coli (E. coli)

◮ Two TFs, tyrR and lrp, and potential target aroG over n = 445 experiments
◮ Ground truth: aroG is regulated by tyrR but not lrp

(Scatter plots of expression levels: aroG vs. tyrR, corr = 0.43, p-value = 7.69e−22; aroG vs. lrp, corr = 0.85, p-value = 4.27e−152)

◮ Fisher scores: z_{aroG,tyrR} = 0.4599 and z_{aroG,lrp} = 1.2562. Both p-values small

◮ Based on correlations, aroG strongly associated with both tyrR and lrp

Network Science Analytics Network Topology Inference 30

SLIDE 31

Partial correlations

◮ Use correlations carefully: ‘correlation does not imply causation’

◮ Vertices i, j ∈ V may have high ρij because they influence each other

◮ But ρij could be high if both i, j influenced by a third vertex k ∈ V

⇒ Correlation networks may declare edges due to latent variables

◮ Partial correlations better capture direct influence among vertices

◮ For i, j ∈ V consider latent vertices Sm = {k1, . . . , km} ⊂ V \ {i, j}

◮ Partial correlation of Xi and Xj, adjusting for XSm = [Xk1, . . . , Xkm]⊤, is

ρij|Sm = cov[Xi, Xj | XSm] / √(var[Xi | XSm] var[Xj | XSm]),   i, j ∈ V

◮ Q: How do we obtain these partial correlations?

Network Science Analytics Network Topology Inference 31

SLIDE 32

Computing partial correlations

◮ Given XSm = [Xk1, . . . , Xkm]⊤, the partial correlation of Xi and Xj is

ρij|Sm = cov[Xi, Xj | XSm] / √(var[Xi | XSm] var[Xj | XSm]) = σij|Sm / √(σii|Sm σjj|Sm)

◮ Here σii|Sm, σjj|Sm and σij|Sm are the diagonal and off-diagonal elements of

Σ11|2 := Σ11 − Σ12 Σ22⁻¹ Σ21 ∈ R2×2

◮ Matrices Σ11, Σ22 and Σ21 = Σ12⊤ are blocks of the covariance matrix

cov([W1; W2]) = [Σ11 Σ12; Σ21 Σ22],   where W1 = [Xi, Xj]⊤ and W2 = XSm

Network Science Analytics Network Topology Inference 32
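A numpy sketch of this block computation; the indices and example covariance are illustrative:

```python
import numpy as np

def partial_corr(Sigma, i, j, S):
    # Partial correlation of X_i, X_j given X_S, via the Schur complement
    # Sigma_{11|2} = Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_21
    idx1, idx2 = [i, j], list(S)
    S11 = Sigma[np.ix_(idx1, idx1)]
    S12 = Sigma[np.ix_(idx1, idx2)]
    S22 = Sigma[np.ix_(idx2, idx2)]
    C = S11 - S12 @ np.linalg.solve(S22, S12.T)   # 2x2 conditional covariance
    return C[0, 1] / np.sqrt(C[0, 0] * C[1, 1])
```

For a covariance in which X1 and X2 are correlated only through a common X3, the partial correlation ρ12|{3} comes out (numerically) zero even though ρ12 ≠ 0.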

SLIDE 33

Partial correlation networks

◮ Various ways to use partial correlations to define edges in G

Ex: Xi, Xj correlated regardless of which m vertices we condition upon

E = {(i, j) ∈ V(2) : ρij|Sm ≠ 0, for all Sm ∈ V(m)\{i,j}}

◮ Inference of potential edge (i, j) as a testing problem

H0 : ρij|Sm = 0 for some Sm ∈ V(m)\{i,j}
H1 : ρij|Sm ≠ 0 for all Sm ∈ V(m)\{i,j}

◮ Again, given measurements xi1, . . . , xin for each i ∈ V, need to:

◮ Select a test statistic
◮ Construct an appropriate null distribution
◮ Adjust for multiple testing

Network Science Analytics Network Topology Inference 33

SLIDE 34

Testing partial correlations

◮ Often consider a collection (over Sm) of smaller testing sub-problems

H0′ : ρij|Sm = 0 versus H1′ : ρij|Sm ≠ 0

◮ Statistic: empirical partial correlations ρ̂ij|Sm, or Fisher's z-scores

zij|Sm = (1/2) log[(1 + ρ̂ij|Sm) / (1 − ρ̂ij|Sm)]

⇒ From asymptotic theory, under H0′, zij|Sm ∼ N(0, 1/(n − m − 3))

◮ Multiple tests for each {i, j} ∈ V(2). How do we combine p-values?

◮ If pij|Sm is the p-value for testing H0′ versus H1′ for {i, j}, use

pmax_ij = max{ pij|Sm : Sm ∈ V(m)\{i,j} }

◮ FDR control possible from the collection {pmax_ij} [Wille-Bühlmann'06]

Network Science Analytics Network Topology Inference 34

SLIDE 35

Example: gene expression level partial correlations

◮ Nontrivial questions about measured TF/target gene pair correlation

⇒ TF may be a target gene of another TF'

◮ Q: Direct influence, or the result of regulation of the TF by another TF'?
◮ Partial correlation may sort out such confounding among variables

◮ Partial correlations ρaroG,tyrR|lrp and ρaroG,lrp|tyrR for the E. coli data

(Scatter plots: aroG vs. tyrR, adjusted for lrp: partial corr = 0.27, p-value = 0.92; aroG vs. lrp, adjusted for tyrR: partial corr = 0.82, p-value = 3.47e−69)

◮ Major drop ρaroG,tyrR|lrp < ρaroG,tyrR; no edge based on p-value 0.92

Network Science Analytics Network Topology Inference 35

SLIDE 36

Full partial correlations

◮ Recompute partial correlations adjusting for all other m = 152 TFs

(Scatter plots: aroG vs. tyrR, each adjusted for all other TFs: full partial corr = −0.18, p-value = 0.0024; aroG vs. lrp, each adjusted for all other TFs: full partial corr = 0.20, p-value = 0.00054)

◮ Moderately strong evidence of association for both pairs
◮ The sign of the association between aroG and tyrR changed

⇒ Suggests a repressive role of tyrR in regulating aroG

◮ Choices matter, e.g., the test statistic here. Interpret results carefully

Network Science Analytics Network Topology Inference 36

SLIDE 37

Gaussian graphical model networks

◮ Suppose variables {Xi}i∈V have a multivariate Gaussian distribution

⇒ Consider ρij|V\{i,j}, conditioning on all other vertices (m = Nv − 2)

Theorem: Under the Gaussian assumption, vertices i, j ∈ V have partial correlation ρij|V\{i,j} = 0 if and only if Xi and Xj are conditionally independent given {Xk}k∈V\{i,j}

◮ Def: the conditional independence graph G(V, E) has edge set

E = {(i, j) ∈ V(2) : ρij|V\{i,j} ≠ 0}

⇒ A special and popular case of partial correlation networks

◮ Gaussian graphical model (GGM): Gaussian assumption along with G

Network Science Analytics Network Topology Inference 37

SLIDE 38

Concentration matrix

◮ Let Σ be the covariance matrix of X = [X1, . . . , XNv]⊤

Def: the concentration matrix is Ω = Σ−1, with entries ωij

◮ Key result: For GGMs, the partial correlations can be expressed as

ρij|V\{i,j} = −ωij / √(ωii ωjj)

⇒ Non-zero entries in Ω ⇔ Edges in the graph G

◮ Inferring G from data in this context is known as covariance selection

⇒ Classical methods are 'network-agnostic,' and effectively test

H0 : ρij|V\{i,j} = 0 versus H1 : ρij|V\{i,j} ≠ 0

⇒ Often not scalable, and n ≪ Nv makes estimation of Σ̂ challenging

◮ A. Dempster, "Covariance selection," Biometrics, vol. 28, pp. 157-175, 1972

Network Science Analytics Network Topology Inference 38
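In code, the key result reads as follows — invert Σ, rescale, and read off the edges; a sketch in which the tolerance and example matrix are arbitrary:

```python
import numpy as np

def ggm_edges(Sigma, tol=1e-8):
    # Partial correlations rho_{ij|rest} = -omega_ij / sqrt(omega_ii omega_jj),
    # where Omega = Sigma^{-1}; nonzero entries define the GGM edge set
    Omega = np.linalg.inv(Sigma)
    d = np.sqrt(np.diag(Omega))
    P = -Omega / np.outer(d, d)        # matrix of partial correlations
    Nv = Sigma.shape[0]
    return {(i, j) for i in range(Nv) for j in range(i + 1, Nv)
            if abs(P[i, j]) > tol}
```

For the earlier example in which X1 and X2 are linked only through X3, the concentration matrix has ω12 = 0, so only edges (1,3) and (2,3) survive.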

SLIDE 39

Covariance selection meets linear regression

◮ Suppose the random vector X = [X1, . . . , XNv]⊤ ∼ N(0, Σ)
◮ The conditional mean of Xi given X(−i) = [X1, . . . , Xi−1, Xi+1, . . . , XNv]⊤ is

E[ Xi | X(−i) = x(−i) ] = β(−i)⊤ x(−i)

◮ Entries of β(−i) are expressible in terms of those in Ω = Σ−1, namely

β(−i),j = −ωij / ωii

⇒ Non-zero β(−i),j ⇔ Non-zero ωij in Ω ⇔ Edge (i, j) in G

◮ Suggests inference of G via least-squares (LS) regression, to estimate

β(−i) = arg min_θ E[ (Xi − θ⊤X(−i))² ]

⇒ Looking for zeros in β(−i), so we should encourage sparse solutions
Network Science Analytics Network Topology Inference 39

SLIDE 40

Sparsity and the ℓ1 norm

◮ Consider minimizing a quadratic function of θ, as in LS or ridge regression
◮ Q: What is the effect of an ℓ1-norm constraint, i.e., ‖θ‖1 = Σi |θi| ≤ τ?

⇒ Level sets touch the constraint set at a 'kink' → Sparse solution

◮ The Lasso estimator enables joint estimation and variable selection [Tibshirani'94]

θ̂Lasso = arg min_θ Σ_{i=1}^n (yi − xi⊤θ)²,   s. to ‖θ‖1 ≤ τ
Network Science Analytics Network Topology Inference 40

SLIDE 41

Penalized linear regression

◮ Given data {xik}ⁿ_k=1, ordinary LS is not satisfactory for inference of G

β̂LS_(−i) = arg min_θ Σ_{k=1}^n (xik − θ⊤x(−i),k)²

◮ If n ≪ Nv − 1, the LS estimation problem is underdetermined
◮ For finite n, LS yields non-zero estimates a.s. ⇒ Full graph G

◮ Overcome these limitations using ℓ1-norm penalized LS regression

β̂PLS_(−i) = arg min_θ Σ_{k=1}^n (xik − θ⊤x(−i),k)² + λ‖θ‖1

◮ Convex problem, where tuning λ controls the sparsity level in β̂PLS_(−i)

◮ Theoretical guarantees: consistency [Meinshausen-Bühlmann'06]

◮ Fast algorithms: graphical Lasso [Friedman et al'07]

Network Science Analytics Network Topology Inference 41
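A sketch of this neighborhood-regression idea using scikit-learn's `Lasso`; the regularization value and the "OR" symmetrization rule are choices for illustration, not prescribed by the slides:

```python
import numpy as np
from sklearn.linear_model import Lasso

def neighborhood_selection(X, lam=0.1):
    # Meinshausen-Buhlmann-style sketch: lasso-regress each X_i on all
    # other variables; nonzero coefficients propose edges (i, j)
    n, Nv = X.shape
    edges = set()
    for i in range(Nv):
        others = [j for j in range(Nv) if j != i]
        beta = Lasso(alpha=lam).fit(X[:, others], X[:, i]).coef_
        for j, b in zip(others, beta):
            if abs(b) > 1e-8:
                edges.add((min(i, j), max(i, j)))  # symmetrize ("OR" rule)
    return edges
```

On synthetic data where X1 is a noisy copy of X0 and X2 is independent, only the edge (0, 1) should survive the ℓ1 penalty.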

SLIDE 42

Summary of logical roadmap

◮ Inference of GGMs with edges E = {(i, j) ∈ V(2) : ρij|V\{i,j} ≠ 0}

Association network inference: find the pairs {i, j} for which ρij|V\{i,j} ≠ 0

Covariance selection: find the non-zero entries ωij ≠ 0 of the concentration matrix Ω = Σ−1, since ρij|V\{i,j} = −ωij / √(ωii ωjj)

Variable selection in linear regression: find the non-zero regression coefficients in β(−i) = arg min_θ E[ (Xi − θ⊤X(−i))² ], since β(−i),j = −ωij / ωii
Network Science Analytics Network Topology Inference 42

SLIDE 43

Case study

Network topology inference problems Link prediction Case study: Predicting lawyer collaboration Inference of association networks Case study: Inferring genetic regulatory interactions Tomographic network topology inference Case study: Computer network topology identification

Network Science Analytics Network Topology Inference 43

SLIDE 44

Regulatory interactions among E. coli genes

◮ Use microarray data and correlation methods to infer TF/target pairs


◮ Dataset: relative log expression RNA levels, for genes in E. coli

◮ 4,345 genes measured under 445 different experimental conditions

◮ Ground truth: 153 TFs, and TF/target pairs from database RegulonDB

Network Science Analytics Network Topology Inference 44

SLIDE 45

Methods to infer TF/target gene pairs

◮ Three correlation based methods to infer TF/target gene pairs

⇒ Interactions declared if suitable p-values fall below a threshold

Method 1: Pearson correlation between TF and potential target gene
Method 2: Partial correlation, controlling for shared effects of one (m = 1) other TF, across all 152 other TFs
Method 3: Full partial correlation, simultaneously controlling for shared effects of all (m = 152) other TFs

◮ In all cases applied Fisher transformation to obtain z-scores

⇒ Asymptotic Gaussian distributions for p-values, with n = 445

◮ Compared inferred graphs to ground-truth network from RegulonDB

Network Science Analytics Network Topology Inference 45

SLIDE 46

Performance comparisons

◮ ROC and Precision/Recall curves for Methods 1, 2, and 3

⇒ Precision: fraction of predicted links that are true ⇒ Recall: fraction of true links that are correctly predicted

(ROC curves, False Positive Rate vs. True Positive Rate; and Precision vs. Recall curves, for Methods 1–3)

◮ Method 1 performs worst, but none is stellar

⇒ Correlation not strong indicator of regulation in this data

◮ All methods share a region of high precision, but a very small recall

⇒ Limitations in number/diversity of profiles [Faith et al’07]

Network Science Analytics Network Topology Inference 46

SLIDE 47

Predicting new TF/target gene pairs

◮ In biology, often interest is in predicting new interactions

(Figure: inferred interactions of TF lrp with target genes aroA, aroG, aroP, ilvI, leuL, pntA, serA, serC, dapD, thrB, yagU)

◮ 11 interactions found for TF lrp, 10 experimentally confirmed (dotted)

⇒ 5 interacting target genes were new (magenta, red, cyan) ⇒ 4 present in RegulonDB (magenta, cyan), but not as lrp targets

Network Science Analytics Network Topology Inference 47

SLIDE 48

Tomographic inference

Network topology inference problems Link prediction Case study: Predicting lawyer collaboration Inference of association networks Case study: Inferring genetic regulatory interactions Tomographic network topology inference Case study: Computer network topology identification

Network Science Analytics Network Topology Inference 48

SLIDE 49

Tomographic network topology inference

◮ In imaging, tomography refers to imaging by sections (e.g., MRI)

◮ Reconstruction algorithms relate ‘external data’ to internal structure

Goal: create images of internal aspects of the human body

Tomographic network topology inference: Predict edge and vertex status in the 'interior' of G, given only observations xi for vertices i in the 'exterior' of G

◮ Most difficult case of topology inference. An ill-posed inverse problem

⇒ Inverse problem: invert the mapping from 'internal' to 'external'
⇒ Ill-posed: the mapping is many-to-one

◮ Most work has dealt with inference of tree topologies

Ex: computer network topologies, phylogenetic trees, media cascades

Network Science Analytics Network Topology Inference 49

SLIDE 50

Trees

◮ Def: an undirected tree T = (VT, ET) is a connected acyclic graph

(Fig. 7.8: Schematic representation of a binary tree)

◮ Nomenclature:

◮ Rooted tree: tree with a single vertex r ∈ VT singled out
◮ Leaves: subset of vertices L ⊂ VT of degree one
◮ Internal vertices: those vertices in VT \ ({r} ∪ L)
◮ Binary tree: root and internal vertices have at most two children

Network Science Analytics Network Topology Inference 50

SLIDE 51

Tomographic inference of tree topologies

◮ Given n i.i.d. measurements of RVs {X1, . . . , XNL} on NL vertices

(Fig. 7.8: Schematic representation of a binary tree)

◮ Consider the family T_NL of binary trees with NL labeled leaves

⇒ If we know r, then all trees in T_NL will be rooted at r

Tomographic tree topology inference: Find a tree T̂ ∈ T_NL that 'best' explains the data {x1, . . . , xNL}

◮ Often of interest to infer a set of branch weights as well

Network Science Analytics Network Topology Inference 51

SLIDE 52

Multicast probes: measurements

◮ Ex: Consider inference of computer network topologies, e.g., Internet ◮ Multicast packets sent from a node (r) to multiple destinations (L)

⇒ Probes forwarded at routing devices, could be lost en route

(Figure: multicast tree from a source to destination leaves, e.g., Rice ECE, Rice Owlnet, M.S.U., Illinois, U. Wisc., Berkeley, I.S.T./I.T. Portugal)

◮ For leaves ℓ ∈ L, consider the indicator Xℓ = I {ℓ received the probe}

⇒ Send n multicast probes to yield data {xℓ ∈ {0, 1}n}ℓ∈L

Network Science Analytics Network Topology Inference 52

SLIDE 53

Multicast probes: structure

◮ Think of the leaf RVs {X1, . . . , XNL} as samples of a process {Xj}j∈VT
◮ Useful notation to describe the process' structure:

◮ Def: closest common ancestor a(U) to a set of leaves U ⊆ L
◮ Def: set d(j) of all immediate descendants of internal vertex j

(Figure: multicast tree from a source to destination leaves)

◮ Multicast tree enforces hereditary constraints

⇒ Xa(U) = 0 implies Xj = 0 for all j ∈ U ⇒ If Xj = 1 for at least one j ∈ d(k), then Xk = 1

Network Science Analytics Network Topology Inference 53

SLIDE 54

Hierarchical clustering-based methods

◮ Hierarchical clustering groups NL objects based on (dis)similarity

⇒ Entire hierarchy of nested partitions obtained → dendrogram

(Figure: dendrogram obtained by hierarchical clustering of the destination leaves)

◮ Natural tool for tomographic inference of tree topologies

⇒ NL leaves as ‘objects’, dendrogram as the inferred tree ˆ T

◮ Tailor a (dis)similarity to the tomographic inference problem at hand

Network Science Analytics Network Topology Inference 54

slide-55
SLIDE 55

Multicast probes: dissimilarity

◮ Shared packet loss rate is indicative of close leaves in a multicast tree
◮ Two types of shared loss between a pair of leaves j, k ∈ L

◮ True: loss of packets on the path common to vertices j and k
◮ False: losses on paths after the closest common ancestor a({j, k})

◮ Net shared loss rate includes both effects ⇒ misleading similarity

⇒ Can obtain true shared loss rates via a simple packet-loss model

⇒ Can obtain true shared loss rates via simple packet-loss model

◮ N. G. Duffield et al, “Multicast topology inference from measured

end-to-end loss,” IEEE Trans. Info. Theory, vol. 48, pp. 26-45, 2002

Network Science Analytics Network Topology Inference 55

SLIDE 56

Multicast probes: packet-loss model

◮ Recall the cascade process {Xj}j∈VT induced by multicast probing
◮ Specify a Markov model down the tree

◮ Root r: set Xr = 1
◮ Internal vertex k: if Xk = 0, then Xj = 0 for all j ∈ d(k). Otherwise,

P(Xj = 1 | Xk = 1) = 1 − P(Xj = 0 | Xk = 1) = αj,  j ∈ d(k)

⇒ Probes successfully transmitted through link (k, j) w.p. αj

◮ Probe successfully transmitted from r to k w.p.

P(Xk = 1 | Xr = 1) := A(k) = ∏_{j≻k} αj

⇒ j ≻ k denotes the ancestral vertices of k on the path from r

◮ True shared loss rate for two leaf vertices j, k ∈ L is 1 − A(a({j, k}))
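The cascade above is easy to simulate. Below is a minimal sketch; the 7-vertex tree, the α values, and names like `simulate_probe` are hypothetical, chosen only to check the empirical reach rate of a leaf against the analytic A(k):

```python
import random

def simulate_probe(children, alpha, root):
    """One multicast probe: X_root = 1; the probe is present at vertex j
    (X_j = 1) iff it was present at j's parent and the link into j
    succeeded, which happens w.p. alpha[j]."""
    x = {root: 1}
    stack = [root]
    while stack:
        k = stack.pop()
        for j in children.get(k, []):
            x[j] = 1 if (x[k] == 1 and random.random() < alpha[j]) else 0
            stack.append(j)
    return x

def A(k, parent, alpha):
    """A(k): product of alpha_j over the links on the path from the root to k."""
    prod = 1.0
    while k in parent:
        prod *= alpha[k]
        k = parent[k]
    return prod

# Hypothetical 7-vertex tree: root 0; internal vertices 1, 2; leaves 3-6
children = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
parent = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}
alpha = {1: 0.95, 2: 0.95, 3: 0.9, 4: 0.9, 5: 0.9, 6: 0.9}

random.seed(1)
n = 50000
hits = sum(simulate_probe(children, alpha, 0)[3] for _ in range(n))
print(hits / n, A(3, parent, alpha))  # empirical vs. analytic P(X_3 = 1)
```

For leaf 3 the analytic value is A(3) = 0.95 · 0.9 = 0.855, and with 50,000 probes the empirical frequency lands close to it.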

SLIDE 57

Estimating shared loss rates

◮ Let L(k) be the set of leaves that are descendants of k

◮ Probability that at least one descendant leaf of k received a packet

γ(k) = P(⋃_{j∈L(k)} {Xj = 1})

◮ Key: Using probabilistic arguments, can establish the relation

1 − γ(k)/A(k) = ∏_{j∈d(k)} (1 − γ(j)/A(k))

⇒ Given values {γ(k)}k∈VT , can solve for the {A(k)}k∈VT

◮ But {γ(k)}k∈VT unknown! Use leaf measurements to form estimates

γ̂(k) = (1/n) ∑_{i=1}^{n} max_{j∈L(k)} xji
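A sketch of both computations, with illustrative values; `gamma_hat` forms the empirical estimate from leaf records, and `solve_A` recovers A(k) from the relation by bisection, assuming (as in the packet-loss model) that the root lies in (γ(k), 1]:

```python
def gamma_hat(x_leaf, leaves):
    """gamma_hat(k) = (1/n) sum_i max_{j in L(k)} x_{ji}: fraction of
    probes that reached at least one leaf below k."""
    n = len(next(iter(x_leaf.values())))
    return sum(max(x_leaf[j][i] for j in leaves) for i in range(n)) / n

def solve_A(gamma_k, gamma_children, iters=200):
    """Solve 1 - gamma(k)/A = prod_{j in d(k)} (1 - gamma(j)/A) for A(k)
    by bisection on the bracket (gamma(k), 1]."""
    def f(A):
        prod = 1.0
        for g in gamma_children:
            prod *= 1.0 - g / A
        return (1.0 - gamma_k / A) - prod
    lo, hi = gamma_k + 1e-12, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

# Consistency check against the earlier cascade example: with
# gamma(k) = 0.9405 and child rates 0.855, the solver should give A(k) = 0.95
print(solve_A(0.9405, [0.855, 0.855]))
```

The check follows because 1 − 0.9405/0.95 = 0.01 = (1 − 0.855/0.95)², so A = 0.95 satisfies the relation exactly.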

SLIDE 58

Agglomerative hierarchical clustering algorithm

◮ Greedy, agglomerative algorithm based on shared loss similarities

S1: Estimate packet losses γ̂(j) at the leaves j ∈ L
S2: Estimate shared losses 1 − Â(a({j, k})) for all pairs j, k ∈ L

Estimate: γ̂(a({j, k})) = (1/n) ∑_{i=1}^{n} max_{s∈{j,k}} xsi,  j, k ∈ L

Solve: 1 − γ̂(a({j, k}))/Â(a({j, k})) = ∏_{i∈{j,k}} (1 − γ̂(i)/Â(a({j, k})))

S3: Merge the pair {j∗, k∗} = arg max_{j,k} [1 − Â(a({j, k}))]
S4: Exchange {j∗, k∗} for a({j∗, k∗}) in L and go back to S2

◮ Can establish theoretical consistency guarantees for recovering T
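Steps S1-S4 can be sketched as follows. For a pair of merged vertices the relation reduces, by simple algebra, to the closed form A = γj γk / (γj + γk − γjk); the toy probe records and names like `agglomerate` are illustrative only:

```python
import itertools

def A_pair(x, j, k):
    """A_hat(a({j,k})): for a pair, 1 - g_jk/A = (1 - g_j/A)(1 - g_k/A)
    solves in closed form as A = g_j * g_k / (g_j + g_k - g_jk)."""
    n = len(x[j])
    g_j, g_k = sum(x[j]) / n, sum(x[k]) / n
    g_jk = sum(max(a, b) for a, b in zip(x[j], x[k])) / n
    return g_j * g_k / (g_j + g_k - g_jk)

def agglomerate(x):
    """S2-S4: repeatedly merge the pair with the largest estimated shared
    loss 1 - A_hat(a({j,k})); the merged node's probe record is the
    elementwise max of its children's records (a probe reached a({j,k})
    iff some leaf below received it)."""
    x = {key: list(v) for key, v in x.items()}
    merges = []
    while len(x) > 1:
        j, k = max(itertools.combinations(list(x), 2),
                   key=lambda p: 1.0 - A_pair(x, p[0], p[1]))
        x[(j, k)] = [max(a, b) for a, b in zip(x.pop(j), x.pop(k))]
        merges.append((j, k))
    return merges

# Toy data (hypothetical): 'a' and 'b' share losses, 'c' does not
probes = {'a': [1, 1, 0, 0], 'b': [1, 0, 0, 0], 'c': [1, 1, 1, 1]}
print(agglomerate(probes))
```

On this toy data the pair ('a', 'b') has shared loss 1 − Â = 0.5 while the other pairs have 0, so 'a' and 'b' merge first, matching the intended topology.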

SLIDE 59

Likelihood-based methods

◮ Probability models of leaf RVs {Xℓ}ℓ∈L used for defining (dis)similarities

⇒ But having such models f (x | T) also enables ML inference

◮ If the n observations {xi}_{i=1}^{n} are independent, the likelihood is

Ln(T) = ∏_{i=1}^{n} f (xi | T)

◮ Models often include other parameters θ (e.g., the αj) beyond T

⇒ In this case Ln(T) is an integrated likelihood, namely

Ln(T) = ∏_{i=1}^{n} ∫_{θ∈Θ} f (xi | T, θ) f (θ | T) dθ

◮ Integrals may be computationally challenging. The ML estimate is

T̂ML = arg max_{T∈T_NL} Ln(T)
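To make the ML recipe concrete, here is a hypothetical sketch for tiny 3-leaf trees under the cascade loss model with a known, common α (so no integration over θ is needed). Trees are nested tuples, leaves are integers, and f(x | T) is evaluated by a recursion down the tree; all names and values are illustrative:

```python
import math
import random

def all_zero(node, x):
    """1.0 if every leaf below `node` observed a 0, else 0.0."""
    if isinstance(node, int):
        return 1.0 if x[node] == 0 else 0.0
    p = 1.0
    for c in node:
        p *= all_zero(c, x)
    return p

def present(node, x, alpha):
    """P(observed leaf pattern below `node` | probe present at node):
    each child's incoming link survives w.p. alpha."""
    if isinstance(node, int):
        return 1.0 if x[node] == 1 else 0.0
    p = 1.0
    for c in node:
        p *= alpha * present(c, x, alpha) + (1 - alpha) * all_zero(c, x)
    return p

def sample(node, here, alpha, x):
    """Draw one probe outcome down the tree (X_r = 1 at the root)."""
    if isinstance(node, int):
        x[node] = 1 if here else 0
        return
    for c in node:
        sample(c, here and random.random() < alpha, alpha, x)

random.seed(0)
alpha, true_tree = 0.8, ((1, 2), 3)
candidates = [((1, 2), 3), ((1, 3), 2), ((2, 3), 1)]

data = []
for _ in range(20000):
    x = {}
    sample(true_tree, True, alpha, x)
    data.append(x)

# log L_n(T) = sum_i log f(x_i | T); pick the argmax over the candidate set
loglik = {T: sum(math.log(present(T, x, alpha)) for x in data) for T in candidates}
best = max(candidates, key=loglik.get)
print(best)
```

With 20,000 simulated probes the true topology wins the likelihood comparison by a wide margin, since e.g. the pattern (x1, x2, x3) = (0, 0, 1) has very different probability under ((1, 2), 3) than under ((1, 3), 2).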

SLIDE 60

Case study

Network topology inference problems Link prediction Case study: Predicting lawyer collaboration Inference of association networks Case study: Inferring genetic regulatory interactions Tomographic network topology inference Case study: Computer network topology identification

SLIDE 61

Sandwich probing

◮ Consider network tree topology inference via end-to-end probing

◮ Packet drops rare (i.e., drop rate < 2%) ⇒ Shared loss rates ineffective

◮ Alternative measuring time-delay differences: sandwich probes

◮ Send a small probe to i, then a large probe to j, then a second small probe to i
◮ Measure the time-delay difference (TDD) between the two small packets

[Figure: sandwich probing on the multicast tree, two examples. Left: 1) send to MSU1, 2) send to MSU2, 3) send to MSU1. Right: 1) send to MSU1, 2) send to Berkeley, 3) send to MSU1]

◮ If paths overlap, large probe induces high delay in the second small one

⇒ Large TDD values indicative of close leaves in the tree topology

SLIDE 62

Modeling delay differences

◮ Sent sandwich probes every 50 ms to random pairs j, k ∈ L

⇒ Total of 9,567 measured delay differences over 8 minutes

[Figure: probed multicast tree and matrix of measured TDDs among the leaves IST, IT, Bkly, MSU1, MSU2, UIUC, UWisc1, UWisc2, RiceU1, RiceU2]

◮ For each pair j, k ∈ L, let xjk be the average TDD

⇒ The Central Limit Theorem suggests xjk ∼ N(µjk, σ²jk)

⇒ Independence of the xjk reasonable by experimental setup

SLIDE 63

Agglomerative likelihood tree (ALT) algorithm

◮ Hierarchical clustering with likelihood-based similarity measure
◮ Let ℓij(µ) = log f (xij | µ) be the Gaussian log-likelihood (σ²ij known)
◮ Initialize a set of vertices S with the leaves, i.e., S = L

◮ Def: similarity among leaves is the estimated mean TDD

µ̂ij = µ̂ji = arg max_µ [ℓij(µ) + ℓji(µ)],  i, j ∈ L

◮ Merge {i∗, j∗} = arg max_{i,j} µ̂ij. Exchange {i∗, j∗} for a({i∗, j∗}) in S

◮ The algorithm then iterates until |S| = 1, merging after calculating

µ̂kl = µ̂lk = arg max_µ ∑_{m∈L(k)} ∑_{p∈L(l)} [ℓmp(µ) + ℓpm(µ)],  k, l ∈ S

⇒ Recall L(k) is the set of leaves descended from k
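A minimal sketch of the ALT loop, assuming known unit variances and toy symmetric TDD measurements (the data and the names `mu_hat`, `alt` are hypothetical). With Gaussian log-likelihoods and known variances, the arg max over µ is the precision-weighted mean of the measured TDDs:

```python
import itertools

def mu_hat(x, var, A, B):
    """argmax_mu of the summed Gaussian log-likelihoods over all cross
    pairs (m in A, p in B): the precision-weighted mean of the TDDs."""
    num = den = 0.0
    for m in A:
        for p in B:
            for a, b in ((m, p), (p, m)):
                w = 1.0 / var[a, b]
                num += w * x[a, b]
                den += w
    return num / den

def alt(x, var):
    """ALT sketch: merge the cluster pair with the largest mu_hat
    (large mean TDD = long shared path = close in the tree)."""
    leaves = sorted({i for i, _ in x})
    clusters = [frozenset([leaf]) for leaf in leaves]
    merges = []
    while len(clusters) > 1:
        A, B = max(itertools.combinations(clusters, 2),
                   key=lambda p: mu_hat(x, var, p[0], p[1]))
        clusters.remove(A)
        clusters.remove(B)
        clusters.append(A | B)
        merges.append((set(A), set(B)))
    return merges

# Toy symmetric TDD measurements (hypothetical): 'a' and 'b' are close
x = {('a', 'b'): 5.0, ('b', 'a'): 5.0, ('a', 'c'): 1.0, ('c', 'a'): 1.0,
     ('b', 'c'): 1.2, ('c', 'b'): 1.2}
var = {key: 1.0 for key in x}
print(alt(x, var))
```

Here the pair {a, b} has mean TDD 5.0 versus roughly 1.1 for any pairing with c, so the first merge is {a} with {b}, as expected.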

SLIDE 64

Inferred topology

◮ Ground-truth topology obtained via traceroute probing

⇒ traceroute replies often 'turned off' for security
⇒ Tomographic topology inference approaches are thus relevant!

[Figure: true (left) vs. ALT-inferred (right) topology over the leaves TX, IND, RiceECE, RiceOwlnet, M.S.U., Illinois, U. Wisc., I.S.T., I.T., Berkeley, Portugal]

◮ ALT-inferred topology is binary by construction ⇒ introduces artifacts
◮ R. Castro et al, "Likelihood-based hierarchical clustering," IEEE Trans. Signal Process., vol. 52, pp. 2308-2321, 2004

SLIDE 65

Glossary

◮ Topology inference ◮ Link prediction ◮ Scoring methods ◮ Logistic regression ◮ Missing data ◮ Latent variable models ◮ Latent eigenmodel ◮ Association networks ◮ Correlation networks ◮ Pearson correlation ◮ Fisher's transformation ◮ Multiple testing ◮ False discovery rate ◮ Gene-regulatory networks ◮ Microarray data ◮ Partial correlation ◮ Gaussian graphical models ◮ Concentration matrix ◮ Variable selection ◮ Network tomography ◮ Multicast probing ◮ Shared packet loss ◮ Sandwich probing ◮ Time-delay difference
