The Hierarchical Structure of Networks
Aaron Clauset, Santa Fe Institute
4 August 2008, SFI / CAIDA Workshop
Networks and Navigation
First, Some Pictures
social groups or communities
teenage friendships*, research collaborations*
*image stolen from elsewhere
functional(?) clusters, hierarchies
metabolites*, proteins*
*image stolen from elsewhere
co-purchasing (topical?) groups
amazon.com communities: books on politics*
*image stolen from elsewhere
How can we extract
from complex networks?
some stylized ideas
no structure, modular structure, hierarchical structure (multi-scale)
How can we extract hierarchy from complex networks?
network data → hierarchy?
Model-based inference
Model: dendrogram D with a probability p_r at each internal node r, written D, {p_r}: an “inhomogeneous” random graph with assortative modules
Model instance: Pr(i, j connected) = p_r, where r = lowest common ancestor of i, j in D
Likelihood L = Pr( data | model ) scores quality of model
Nature 453, p98 (2008)
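A minimal sketch of how the likelihood L = Pr( data | model ) could be evaluated for a given dendrogram. It assumes the form used in the cited Nature paper, where each internal node r contributes p_r^E_r (1 − p_r)^(L_r R_r − E_r), with E_r the number of edges crossing between the left and right subtrees (of L_r and R_r leaves) and p_r set to its maximum-likelihood value E_r / (L_r R_r). The nested-tuple tree encoding and the function names are illustrative choices, not taken from the slides.

```python
import math

def leaves(tree):
    """Return the set of leaf labels under a nested-tuple dendrogram."""
    if not isinstance(tree, tuple):
        return {tree}
    left, right = tree
    return leaves(left) | leaves(right)

def log_likelihood(tree, edges):
    """Log-likelihood of an undirected graph given a dendrogram.

    Assumption: each p_r takes its maximum-likelihood value E_r / (L_r * R_r).
    """
    edge_set = {frozenset(e) for e in edges}
    total = 0.0

    def visit(node):
        nonlocal total
        if not isinstance(node, tuple):
            return  # a leaf contributes nothing
        left, right = node
        L, R = leaves(left), leaves(right)
        # E_r: edges whose endpoints have this node as lowest common ancestor
        e_r = sum(1 for u in L for v in R if frozenset((u, v)) in edge_set)
        pairs = len(L) * len(R)
        p_r = e_r / pairs
        if 0 < p_r < 1:  # p_r of 0 or 1 contributes log(1) = 0
            total += e_r * math.log(p_r) + (pairs - e_r) * math.log(1 - p_r)
        visit(left)
        visit(right)

    visit(tree)
    return total

edges = [(0, 1), (2, 3)]
good = ((0, 1), (2, 3))   # groups linked vertices together
bad = ((0, 2), (1, 3))    # splits linked vertices apart
print(log_likelihood(good, edges), log_likelihood(bad, edges))
```

In this toy graph the dendrogram that groups linked vertices together scores higher than the one that separates them.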
A test: do resampled graphs look like original?
G → D → resampled G (graph from ensemble)
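Resampling a graph from the ensemble can be sketched as follows, assuming a dendrogram whose internal nodes already carry fitted probabilities. Each internal node is encoded here as a hypothetical `(p, left, right)` tuple, and every pair of leaves split at that node is linked independently with probability p.

```python
import random

def leaves(node):
    """List the leaf labels under a (p, left, right) dendrogram node."""
    if not isinstance(node, tuple):
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)

def sample_graph(node, rng):
    """Draw one graph from the ensemble defined by the dendrogram.

    A pair (u, v) whose lowest common ancestor is `node` is linked
    with that node's probability p.
    """
    if not isinstance(node, tuple):
        return set()
    p, left, right = node
    edges = {
        frozenset((u, v))
        for u in leaves(left)
        for v in leaves(right)
        if rng.random() < p
    }
    return edges | sample_graph(left, rng) | sample_graph(right, rng)

# Illustrative dendrogram: two tight pairs that are weakly linked to each other.
tree = (0.1, (0.9, "a", "b"), (0.9, "c", "d"))
resampled = sample_graph(tree, random.Random(42))
print(sorted(tuple(sorted(e)) for e in resampled))
```

Statistics of many such resampled graphs (degree, clustering, distances) can then be compared with those of the original.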
Grassland species*: plant, herbivore, parasite
*thank you: Jennifer Dunne
[Figure, panel a: fraction of vertices with degree k vs. degree, k (resampled)]
[Figure: fraction of graphs with clustering coefficient c vs. clustering coefficient, c (resampled)]
[Figure, panel b: fraction of vertex-pairs at distance d vs. distance, d (resampled)]
A test: can model predict missing links?
Grassland species network G: n = 75, m = 113
Pure-chance guess with k links removed: p_guess ≈ k / ((n choose 2) − m + k) = O(n^-2); here p_guess = k / (2662 + k)
Test idea via leave-k-out cross-validation
perfect accuracy: AUC = 1
no better than chance: AUC = 1/2
G → D → p_r for links (i, j) ∈ G
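The arithmetic behind the chance baseline, and the AUC used to score a predictor, can be checked with a short sketch. The function names are illustrative. The AUC here is computed as the probability that a removed (true) link receives a higher score than a non-link, with ties counted as one half, which is a standard definition and an assumption about how the slides evaluate it.

```python
from math import comb

def p_guess(n, m, k):
    """Chance that a random unconnected pair is one of the k removed links."""
    return k / (comb(n, 2) - m + k)

def auc(true_scores, false_scores):
    """Probability that a true missing link outranks a non-link (ties count 1/2)."""
    wins = sum(
        (t > f) + 0.5 * (t == f)
        for t in true_scores
        for f in false_scores
    )
    return wins / (len(true_scores) * len(false_scores))

# Grassland species network: n = 75, m = 113, so (n choose 2) - m = 2662.
print(comb(75, 2) - 113)      # 2662
print(p_guess(75, 113, 10))   # 10 / 2672
print(auc([0.9, 0.8], [0.1, 0.2]))
```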
[Figure: area under ROC curve (AUC) vs. fraction of edges observed, k/m; Grassland species network. Curves: pure chance; simple predictors (common neighbors, Jaccard coefficient, degree product, shortest paths); hierarchical structure]
[Figure, panel a: AUC vs. fraction of edges observed; Terrorist association network. Same curves]
[Figure, panel b: AUC vs. fraction of edges observed; metabolic network. Same curves]
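For reference, three of the simple predictors named in the plots can be written in a few lines. This is a sketch using their usual textbook definitions, which the slides do not spell out; the helper names are illustrative and the shortest-paths predictor is omitted for brevity.

```python
from collections import defaultdict

def adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def common_neighbors(adj, i, j):
    """Number of vertices adjacent to both i and j."""
    return len(adj[i] & adj[j])

def jaccard(adj, i, j):
    """Shared neighbors as a fraction of all neighbors of i or j."""
    union = adj[i] | adj[j]
    return len(adj[i] & adj[j]) / len(union) if union else 0.0

def degree_product(adj, i, j):
    """Product of the degrees of i and j."""
    return len(adj[i]) * len(adj[j])

adj = adjacency([(1, 2), (1, 3), (2, 3), (3, 4)])
# Score the unconnected pair (1, 4) with each predictor.
print(common_neighbors(adj, 1, 4), jaccard(adj, 1, 4), degree_product(adj, 1, 4))
```

Each function assigns a score to an unconnected pair; ranking pairs by score and comparing against the held-out links gives the AUC curves.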
Acknowledgments:
Sampling states according to their likelihood
Given D, choose random internal node
Choose random reconfiguration of subtrees (three subtree configurations, up to relabeling)
Recompute probabilities {p_r} and likelihood L
[ergodicity] [detailed balance]
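The sampling steps above can be sketched as a small Markov chain Monte Carlo loop. Several things here are assumptions rather than content from the slides: the dendrogram is stored as a dict mapping each internal node to its two children, each p_r is set to its maximum-likelihood value, and moves are accepted with the Metropolis rule (always if the likelihood does not decrease, otherwise with probability equal to the likelihood ratio).

```python
import math
import random

def leaf_set(children, node):
    """Leaves below `node`; internal nodes are the keys of `children`."""
    if node not in children:
        return {node}
    left, right = children[node]
    return leaf_set(children, left) | leaf_set(children, right)

def log_likelihood(children, edges):
    """Log-likelihood with each p_r at its maximum-likelihood value."""
    total = 0.0
    for left, right in children.values():
        L, R = leaf_set(children, left), leaf_set(children, right)
        e = sum(1 for u in L for v in R if frozenset((u, v)) in edges)
        pairs = len(L) * len(R)
        p = e / pairs
        if 0 < p < 1:
            total += e * math.log(p) + (pairs - e) * math.log(1 - p)
    return total

def propose(children, rng):
    """Copy the dendrogram and swap a random internal node's sibling
    with one of that node's two subtrees."""
    new = {k: list(v) for k, v in children.items()}
    # (parent, index) pairs whose child is itself an internal node
    candidates = [(p, i) for p, kids in new.items() for i in (0, 1) if kids[i] in new]
    if not candidates:
        return new
    p, i = rng.choice(candidates)
    r = new[p][i]
    j = rng.randrange(2)
    new[p][1 - i], new[r][j] = new[r][j], new[p][1 - i]
    return new

def mcmc(children, edges, steps, seed=0):
    """Run the chain and return the best dendrogram seen and its log-likelihood."""
    rng = random.Random(seed)
    edge_set = {frozenset(e) for e in edges}
    current, logl = children, log_likelihood(children, edge_set)
    best, best_logl = current, logl
    for _ in range(steps):
        cand = propose(current, rng)
        cand_logl = log_likelihood(cand, edge_set)
        # Metropolis acceptance (an assumption, see the text above)
        if cand_logl >= logl or rng.random() < math.exp(cand_logl - logl):
            current, logl = cand, cand_logl
            if logl > best_logl:
                best, best_logl = current, logl
    return best, best_logl

start = {"a": ["b", "c"], "b": [0, 2], "c": [1, 3]}
best, best_logl = mcmc(start, [(0, 1), (2, 3)], steps=200)
print(best, best_logl)
```

Recomputing the full likelihood at every step keeps the sketch short; only the terms at the two affected internal nodes actually change, so a practical version would update just those.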
[Figure, panel c: Grassland species; plant, herbivore, parasite]
degree distribution, distance distribution, rich-club distribution, short-loop distribution, betweenness function, degree-degree correlations, ... etc.
[Figure: distance distribution, p(d) vs. distance, d]
The good
The bad
global modularity Q, local modularity R, network motifs, box covering, clique covering, ... etc.
The good
The bad
hierarchical random graphs, community mixtures, latent space models, information bottlenecks, correlation reconstruction, network classification
I(X; Y) = H(X) − H(X|Y)
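As a worked instance of the mutual information identity, the sketch below computes I(X; Y) = H(X) − H(X|Y) from a joint probability table, using H(X|Y) = H(X, Y) − H(Y). The table layout (`joint[x][y]`) and the function names are illustrative assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X; Y) = H(X) - H(X|Y), with H(X|Y) = H(X, Y) - H(Y)."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    h_xy = entropy([p for row in joint for p in row])
    h_x_given_y = h_xy - entropy(p_y)
    return entropy(p_x) - h_x_given_y

independent = [[0.25, 0.25], [0.25, 0.25]]  # knowing Y says nothing about X
coupled = [[0.5, 0.0], [0.0, 0.5]]          # Y determines X exactly
print(mutual_information(independent), mutual_information(coupled))
```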
The good
The bad
Zachary’s Karate Club: n = 34, m = 78
NCAA Schedule 2000: n = 115, m = 613
MCMC mixes relatively quickly
Equilibrium in O(n^2) steps
[Figure: log-likelihood vs. time / n^2; panels: karate, n = 34 and NCAA 2000, n = 115; equilibrium marked]
[Figure: point estimate vs. consensus hierarchy (vertex labels 1 to 34)]
LouisianTech (58) LouisianMonr (59) MidTNState (63) LouisianaLaf (97) F l