
Centrality Measures and Link Analysis
Gonzalo Mateos
Dept. of ECE and Goergen Institute for Data Science
University of Rochester
gmateosb@ece.rochester.edu
http://www.ece.rochester.edu/~gmateosb/
February 21, 2020
Network Science Analytics

Centrality measures
Centrality measures
Case study: Stability of centrality measures in weighted graphs
Centrality, link analysis and web search
A primer on Markov chains
PageRank as a random walk
PageRank algorithm leveraging Markov chain structure

Quantifying vertex importance
◮ In network analysis many questions relate to vertex importance
Example
◮ Q1: Which actors in a social network hold the ‘reins of power’?
◮ Q2: How authoritative is a WWW page considered by peers?
◮ Q3: The ‘knock-out’ of which genes is likely to be lethal?
◮ Q4: How critical to the daily commute is a subway station?
◮ Measures of vertex centrality quantify such notions of importance
  ⇒ Degrees are the simplest centrality measures. Let’s study others

Closeness centrality
◮ Rationale: ‘central’ means a vertex is ‘close’ to many other vertices
◮ Def: Distance $d(u,v)$ between vertices $u$ and $v$ is the length of the shortest $u$-$v$ path. Oftentimes referred to as geodesic distance
◮ Closeness centrality of vertex $v$ is given by
$$c_{Cl}(v) = \frac{1}{\sum_{u \in V} d(u,v)}$$
◮ Interpret $v^* = \arg\max_v c_{Cl}(v)$ as the most approachable node in $G$

Normalization, computation and limitations
◮ To compare with other centrality measures, often normalize to $[0,1]$
$$c_{Cl}(v) = \frac{N_v - 1}{\sum_{u \in V} d(u,v)}$$
◮ Computation: need all pairwise shortest path distances in $G$
  ⇒ Dijkstra’s algorithm in $O(N_v^2 \log N_v + N_v N_e)$ time
◮ Limitation 1: sensitivity, values tend to span a small dynamic range
  ⇒ Hard to discriminate between central and less central nodes
◮ Limitation 2: assumes connectivity, if not $c_{Cl}(v) = 0$ for all $v \in V$
  ⇒ Compute centrality indices in different components
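As a minimal sketch of the normalized closeness formula above, the following assumes an unweighted, undirected graph stored as an adjacency dictionary, so breadth-first search can stand in for Dijkstra's algorithm. The function names and the 5-vertex path graph are illustrative choices, not part of the slides.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop-count distances d(source, u) in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness(adj, normalized=True):
    """c_Cl(v) = (N_v - 1) / sum_u d(u, v); set to 0 if some vertex is unreachable."""
    n = len(adj)
    scores = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        total = sum(dist.values())
        if len(dist) < n or total == 0:
            # Disconnected graph (infinite distance sum) or single vertex.
            scores[v] = 0.0
            continue
        scale = (n - 1) if normalized else 1
        scores[v] = scale / total
    return scores

# Hypothetical example: path graph 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
scores = closeness(path)
most_approachable = max(scores, key=scores.get)
```

On this path graph the middle vertex attains the largest score, and the scores of the end vertices and the middle vertex differ by less than a factor of two, which hints at the small dynamic range noted in Limitation 1.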

Betweenness centrality
◮ Rationale: ‘central’ node is (in the path) ‘between’ many vertex pairs
◮ Betweenness centrality of vertex $v$ is given by
$$c_{Be}(v) = \sum_{s \neq t \neq v \in V} \frac{\sigma(s,t \mid v)}{\sigma(s,t)}$$
◮ $\sigma(s,t)$ is the total number of $s$-$t$ shortest paths
◮ $\sigma(s,t \mid v)$ is the number of $s$-$t$ shortest paths through $v \in V$
◮ Interpret $v^* = \arg\max_v c_{Be}(v)$ as the controller of information flow

Computational considerations
◮ Notice that an $s$-$t$ shortest path goes through $v$ if and only if
$$d(s,t) = d(s,v) + d(v,t)$$
◮ Betweenness centralities can be naively computed for all $v \in V$ by:
  Step 1: Use Dijkstra to tabulate $d(s,t)$ and $\sigma(s,t)$ for all $s, t$
  Step 2: Use the tables to identify $\sigma(s,t \mid v)$ for all $v$
  Step 3: Sum the fractions to obtain $c_{Be}(v)$ for all $v$ ($O(N_v^3)$ time)
◮ Cubic complexity can be prohibitive for large networks
◮ $O(N_v N_e)$-time algorithm for unweighted graphs in:
  U. Brandes, “A faster algorithm for betweenness centrality,” Journal of Mathematical Sociology, vol. 25, no. 2, pp. 163-177, 2001
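The three naive steps can be sketched as follows for an unweighted graph, where breadth-first search replaces Dijkstra. Two assumptions are made here that the slides do not spell out: the sum runs over ordered pairs $(s,t)$, so each unordered pair of an undirected graph is counted twice, and $\sigma(s,t \mid v)$ is obtained as $\sigma(s,v)\,\sigma(v,t)$ whenever $d(s,t) = d(s,v) + d(v,t)$. All names are illustrative.

```python
from collections import deque

def bfs_counts(adj, s):
    """Return (dist, sigma): hop distances from s and shortest-path counts."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                # Every shortest path to u extends to a shortest path to w.
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness_naive(adj):
    nodes = list(adj)
    # Step 1: tabulate d(s, t) and sigma(s, t) for all s, t.
    dist, sigma = {}, {}
    for s in nodes:
        dist[s], sigma[s] = bfs_counts(adj, s)
    scores = {v: 0.0 for v in nodes}
    for v in nodes:
        for s in nodes:
            if s == v or v not in dist[s]:
                continue
            for t in nodes:
                if t in (s, v) or t not in dist[s] or t not in dist[v]:
                    continue
                # Step 2: v lies on a shortest s-t path iff distances add up.
                if dist[s][t] == dist[s][v] + dist[v][t]:
                    # Step 3: accumulate sigma(s, t | v) / sigma(s, t).
                    scores[v] += sigma[s][v] * sigma[v][t] / sigma[s][t]
    return scores

# Hypothetical examples: a star with center 0, and a 4-cycle.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
star_scores = betweenness_naive(star)
cycle_scores = betweenness_naive(cycle)
```

In the star, all six ordered leaf pairs route through the center. In the 4-cycle, each vertex carries one of the two shortest paths between its two neighbors, which contributes 1/2 per ordered pair.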

Eigenvector centrality
◮ Rationale: ‘central’ vertex if ‘in-neighbors’ are themselves important
  ⇒ Compare with ‘importance-agnostic’ degree centrality
◮ Eigenvector centrality of vertex $v$ is implicitly defined as
$$c_{Ei}(v) = \alpha \sum_{(u,v) \in E} c_{Ei}(u)$$
[Figure: example digraph on vertices 1 to 6]
◮ No one points to 1
◮ Only 1 points to 2
◮ Only 2 points to 3, but 2 more important than 1
◮ 4 as high as 5 with fewer links
◮ Links to 5 have lower rank
◮ Same for 6

Eigenvalue problem
◮ Recall the adjacency matrix $\mathbf{A}$ and
$$c_{Ei}(v) = \alpha \sum_{(u,v) \in E} c_{Ei}(u)$$
◮ Vector $\mathbf{c}_{Ei} = [c_{Ei}(1), \ldots, c_{Ei}(N_v)]^\top$ solves the eigenvalue problem
$$\mathbf{A}\mathbf{c}_{Ei} = \alpha^{-1}\mathbf{c}_{Ei}$$
  ⇒ Typically $\alpha^{-1}$ chosen as largest eigenvalue of $\mathbf{A}$ [Bonacich’87]
◮ If $G$ is undirected and connected, by Perron’s Theorem then
  ⇒ The largest eigenvalue of $\mathbf{A}$ is positive and simple
  ⇒ All the entries in the dominant eigenvector $\mathbf{c}_{Ei}$ are positive
◮ Can compute $\mathbf{c}_{Ei}$ and $\alpha^{-1}$ via $O(N_v^2)$ complexity power iterations
$$\mathbf{c}_{Ei}(k+1) = \frac{\mathbf{A}\mathbf{c}_{Ei}(k)}{\|\mathbf{A}\mathbf{c}_{Ei}(k)\|}, \quad k = 0, 1, \ldots$$
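A minimal sketch of the power iteration above, assuming a dense adjacency matrix stored as a list of rows, the Euclidean norm, and a uniform starting vector. The stopping rule, the Rayleigh-quotient estimate of $\alpha^{-1}$, and the small example graph (a triangle with one pendant vertex, chosen to be connected and non-bipartite so the iteration converges) are illustrative choices rather than anything prescribed by the slides.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product, O(N_v^2) per call."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def power_iteration(A, num_iters=1000, tol=1e-12):
    """Iterate c(k+1) = A c(k) / ||A c(k)|| from a uniform start."""
    n = len(A)
    c = [1.0 / math.sqrt(n)] * n
    for _ in range(num_iters):
        y = matvec(A, c)
        norm = math.sqrt(sum(yi * yi for yi in y))
        y = [yi / norm for yi in y]
        converged = max(abs(a - b) for a, b in zip(y, c)) < tol
        c = y
        if converged:
            break
    # Rayleigh quotient c^T A c estimates the dominant eigenvalue alpha^{-1}
    # (c has unit norm, so no denominator is needed).
    lam = sum(ci * yi for ci, yi in zip(c, matvec(A, c)))
    return c, lam

# Hypothetical undirected graph: triangle 0-1-2 plus pendant edge 2-3.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
c_ei, lam = power_iteration(A)
residual = max(abs(yi - lam * ci) for yi, ci in zip(matvec(A, c_ei), c_ei))
```

Consistent with the Perron statement on the slide, every entry of the computed vector is positive for this connected undirected graph, and vertex 2, whose neighbors include both other triangle vertices, receives the largest score.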

Example: Comparing centrality measures
◮ Q: Which vertices are more central? A: It depends on the context
◮ Each measure identifies a different vertex as most central
  ⇒ None is ‘wrong’, they target different notions of importance

Example: Comparing centrality measures
◮ Q: Which vertices are more central? A: It depends on the context
[Figure panels: Closeness, Betweenness, Eigenvector]
◮ Small green vertices are arguably more peripheral
  ⇒ Less clear how the yellow, dark blue and red vertices compare

Case study
Centrality measures
Case study: Stability of centrality measures in weighted graphs
Centrality, link analysis and web search
A primer on Markov chains
PageRank as a random walk
PageRank algorithm leveraging Markov chain structure

Centrality measures robustness
◮ Robustness to noise in network data is of practical importance
◮ Approaches have been mostly empirical
  ⇒ Find average response in random graphs when perturbed
  ⇒ Not generalizable and does not provide explanations
◮ Characterize behavior in noisy real graphs
  ⇒ Degree and closeness are more reliable than betweenness
◮ Q: What is really going on?
  ⇒ Framework to study formally the stability of centrality measures
◮ S. Segarra and A. Ribeiro, “Stability and continuity of centrality measures in weighted graphs,” IEEE Trans. Signal Process., 2015

Definitions for weighted digraphs
◮ Weighted and directed graphs $G(V, E, W)$
  ⇒ Set $V$ of $N_v$ vertices
  ⇒ Set $E \subseteq V \times V$ of edges
  ⇒ Map $W : E \to \mathbb{R}_{++}$ of weights in each edge
[Figure: example weighted digraph on vertices a, b, c with edge weights 5, 2, 3, 4]
◮ Path $P(u,v)$ is an ordered sequence of nodes from $u$ to $v$
◮ When weights represent dissimilarities
  ⇒ Path length is the sum of the dissimilarities encountered
◮ Shortest path length $s_G(u,v)$ from $u$ to $v$
$$s_G(u,v) := \min_{P(u,v)} \sum_{i=0}^{\ell-1} W(u_i, u_{i+1})$$
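The shortest path length $s_G(u,v)$ can be sketched with Dijkstra's algorithm, assuming the weight map $W$ is stored as a dictionary keyed by directed edges. The function name and the three-vertex digraph below are hypothetical; in particular, the edge list is not claimed to match the slide's figure.

```python
import heapq

def shortest_path_lengths(weights, source):
    """Return {v: s_G(source, v)} for all v reachable from source.

    `weights` maps directed edges (u, v) to strictly positive weights,
    interpreted as dissimilarities that add up along a path.
    """
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, [])
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical digraph: the direct arc a -> b is longer than a -> c -> b.
W = {("a", "b"): 5.0, ("a", "c"): 2.0, ("c", "b"): 2.0}
from_a = shortest_path_lengths(W, "a")
from_b = shortest_path_lengths(W, "b")
```

Because the graph is directed, $s_G(u,v)$ and $s_G(v,u)$ generally differ: here `from_b` contains only vertex b itself, since no arc leaves it.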

Stability of centrality measures
◮ Space of graphs $\mathcal{G}_{(V,E)}$ with $(V, E)$ as vertex and edge set
◮ Define the metric $d_{(V,E)}(G,H) : \mathcal{G}_{(V,E)} \times \mathcal{G}_{(V,E)} \to \mathbb{R}_+$
$$d_{(V,E)}(G,H) := \sum_{e \in E} |W_G(e) - W_H(e)|$$
◮ Def: A centrality measure $c(\cdot)$ is stable if for any vertex $v \in V$ in any two graphs $G, H \in \mathcal{G}_{(V,E)}$, then
$$\left| c^G(v) - c^H(v) \right| \le K_G \, d_{(V,E)}(G,H)$$
◮ $K_G$ is a constant depending on $G$ only
◮ Stability is related to Lipschitz continuity in $\mathcal{G}_{(V,E)}$
◮ Independent of the definition of $d_{(V,E)}$ (equivalence of norms)
◮ Node importance should be robust to small perturbations in the graph

Degree centrality
◮ Sum of the weights of incoming arcs
$$c_{De}(v) := \sum_{u \mid (u,v) \in E} W(u,v)$$
◮ Applied to graphs where the weights in $W$ represent similarities
◮ High $c_{De}(v)$ ⇒ $v$ similar to its large number of neighbors
Proposition 1
For any vertex $v \in V$ in any two graphs $G, H \in \mathcal{G}_{(V,E)}$, we have that
$$|c^G_{De}(v) - c^H_{De}(v)| \le d_{(V,E)}(G,H)$$
i.e., degree centrality $c_{De}$ is a stable measure
◮ Can show closeness and eigenvector centralities are also stable
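Proposition 1 can be checked numerically on a small case. The sketch below assumes both graphs share the same edge set and are stored as edge-to-weight dictionaries; the function names and all weights are made up for illustration. The bound follows from the triangle inequality, since the incoming arcs of $v$ are a subset of $E$.

```python
def graph_distance(w_g, w_h):
    """d_(V,E)(G, H): sum over edges of |W_G(e) - W_H(e)|."""
    assert w_g.keys() == w_h.keys(), "G and H must share the edge set E"
    return sum(abs(w_g[e] - w_h[e]) for e in w_g)

def degree_centrality(weights, v):
    """c_De(v): sum of the weights of arcs (u, v) pointing into v."""
    return sum(w for (_, head), w in weights.items() if head == v)

# Hypothetical weighted digraphs G and H on the same vertices and edges.
W_G = {("a", "b"): 5.0, ("a", "c"): 2.0, ("c", "b"): 3.0, ("b", "a"): 4.0}
W_H = {("a", "b"): 5.5, ("a", "c"): 1.75, ("c", "b"): 3.0, ("b", "a"): 4.25}

d_GH = graph_distance(W_G, W_H)
gaps = {
    v: abs(degree_centrality(W_G, v) - degree_centrality(W_H, v))
    for v in "abc"
}
# Stability with constant K_G = 1: every gap is bounded by d(G, H).
is_stable_here = all(gap <= d_GH for gap in gaps.values())
```

A single example does not prove the proposition, but it shows how the constant $K_G = 1$ arises: each vertex only sees the perturbations on its own incoming arcs.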

Betweenness centrality
◮ Look at the shortest paths for every two nodes distinct from $v$
  ⇒ Sum the proportion that contains node $v$
$$c_{Be}(v) := \sum_{s \neq v \neq t \in V} \frac{\sigma(s,t \mid v)}{\sigma(s,t)}$$
◮ $\sigma(s,t)$ is the total number of $s$-$t$ shortest paths
◮ $\sigma(s,t \mid v)$ is the number of those paths going through $v$
Proposition 2
The betweenness centrality measure $c_{Be}$ is not stable

Instability of betweenness centrality
◮ Compare the value of $c_{Be}(v)$ in graphs $G$ and $H$
[Figure: graphs $G$ and $H$ with marked vertex $v$; edge weights equal to 1, with two weights equal to $1 + \epsilon$; $c^G_{Be}(v) = 9$ and $c^H_{Be}(v) = 0$]
  ⇒ Centrality value $c^H_{Be}(v) = 0$ remains unchanged for any $\epsilon > 0$
◮ For small values of $\epsilon$, graphs $G$ and $H$ become arbitrarily similar
$$9 = |c^G_{Be}(v) - c^H_{Be}(v)| \le K_G \, d_{(V,E)}(G,H) \to 0$$
  ⇒ Inequality is not true for any constant $K_G$
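The same discontinuity can be reproduced on a much smaller graph than the one in the slide. The sketch below is a hypothetical 4-cycle s-v-t-u-s, not the slide's example: in $G$ all weights are 1, so $v$ carries one of the two shortest s-t paths, and in $H$ the weight of edge (s, v) is $1 + \epsilon$. It assumes undirected edges, a sum over unordered pairs, and exact rational weights so that ties between path lengths are detected reliably.

```python
import heapq
from fractions import Fraction
from itertools import combinations

def dijkstra_counts(adj, s):
    """Weighted distances from s and the number of shortest paths to each vertex."""
    dist, sigma, done = {s: 0}, {s: 1}, set()
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for w, wt in adj[u]:
            nd = d + wt
            if w not in dist or nd < dist[w]:
                dist[w], sigma[w] = nd, sigma[u]
                heapq.heappush(heap, (nd, w))
            elif nd == dist[w]:
                sigma[w] += sigma[u]  # another shortest path of equal length
    return dist, sigma

def betweenness_of(edges, v):
    """c_Be(v) over unordered pairs {s, t}; assumes a connected undirected graph."""
    adj = {}
    for (a, b), wt in edges.items():
        adj.setdefault(a, []).append((b, wt))
        adj.setdefault(b, []).append((a, wt))
    info = {x: dijkstra_counts(adj, x) for x in adj}
    dist_v, sigma_v = info[v]
    total = Fraction(0)
    for s, t in combinations(sorted(x for x in adj if x != v), 2):
        dist_s, sigma_s = info[s]
        if dist_s[t] == dist_s[v] + dist_v[t]:
            total += Fraction(sigma_s[v] * sigma_v[t], sigma_s[t])
    return total

def cycle(eps):
    """Hypothetical 4-cycle with the weight of edge (s, v) perturbed by eps."""
    one = Fraction(1)
    return {("s", "v"): one + eps, ("v", "t"): one,
            ("t", "u"): one, ("u", "s"): one}

c_G = betweenness_of(cycle(Fraction(0)), "v")
# For each eps, d(G, H) = eps, yet the centrality gap does not shrink.
gaps = {eps: abs(c_G - betweenness_of(cycle(eps), "v"))
        for eps in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6))}
```

The gap stays at 1/2 while the distance between the two graphs equals $\epsilon$ and goes to zero, so no finite constant $K_G$ can satisfy the stability inequality for this pair either.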
