Hierarchically Modular Structure in Complex Networks (Aaron Clauset)


Aaron Clauset, Santa Fe Institute, 3 November 2008. DIMACS / DyDAn, “Network Models of Biological and Social Contagion”.


SLIDE 1

Hierarchically Modular Structure in Complex Networks

Aaron Clauset
Santa Fe Institute
3 November 2008
DIMACS / DyDAn “Network Models of Biological and Social Contagion”

SLIDE 2

Modular Hierarchies

Grassland species*

[Figure: grassland species network; legend: plant → herbivore → parasite]

*thank you: Jennifer Dunne

SLIDE 3

Modular Hierarchies

SLIDE 4

Modular Hierarchies

SLIDE 5

The Task

How can we extract this hierarchical (multi-scale) structure from complex networks?

network → ? → hierarchy

SLIDE 6

One Approach

Model-based inference

  1. describe how to generate hierarchies (a model)
  2. “fit” model to empirical data
  3. test “fitted” model
  4. extract predictions + insight
  5. profit!
SLIDE 7

A Model of Hierarchy

SLIDE 8

A Model of Hierarchy

[Figure: dendrogram D with a probability p_r at each internal node r; labels: probability, assortative modules]

Model: D, {p_r}

SLIDE 9

“inhomogeneous” random graph

model → instance

$\Pr(i, j \text{ connected}) = p_r = p_{(\text{lowest common ancestor of } i, j)}$
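A minimal sketch of how one graph could be drawn from such a model, assuming that vertex pairs are connected independently of one another. The nested-tuple encoding of the dendrogram and the function names `leaves`, `pair_probabilities`, and `sample_graph` are hypothetical choices made for this illustration and do not come from the slides.

```python
import random
from itertools import product

# Hypothetical encoding: a leaf is an int (a vertex id); an internal node r
# is a tuple (p_r, left_subtree, right_subtree).
def leaves(node):
    """List the vertex ids below a dendrogram node."""
    if isinstance(node, int):
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)

def pair_probabilities(node, out=None):
    """Map each pair (i, j), i < j, to p_r of its lowest common ancestor."""
    if out is None:
        out = {}
    if isinstance(node, int):
        return out
    p, left, right = node
    # Pairs split across the two subtrees have this node as their
    # lowest common ancestor.
    for i, j in product(leaves(left), leaves(right)):
        out[(min(i, j), max(i, j))] = p
    pair_probabilities(left, out)
    pair_probabilities(right, out)
    return out

def sample_graph(dendrogram, rng):
    """Draw one graph, connecting each pair independently (an assumption)."""
    probs = pair_probabilities(dendrogram)
    return {pair for pair in sorted(probs) if rng.random() < probs[pair]}

# Two dense modules {0, 1} and {2, 3} joined sparsely at the root,
# i.e. an assortative example.
D = (0.1, (0.9, 0, 1), (0.9, 2, 3))
probs = pair_probabilities(D)
G = sample_graph(D, random.Random(0))
```

Setting the root probability higher than the module probabilities would give a disassortative instance of the same encoding.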

SLIDE 10

Model Features

  • explicit model = explicit assumptions
  • very flexible (many parameters)
  • captures structure at all scales
  • arbitrary mixtures of assortativity, disassortativity
  • learnable directly from data
SLIDE 11

Learning From Data

a direct approach

  • likelihood function (scores quality of model): L = Pr( data | model )
  • sample the good models via Markov chain Monte Carlo
  • technical details in arXiv:physics/0610051
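As a rough illustration of the likelihood L = Pr( data | model ), the sketch below scores an observed edge set against per-pair connection probabilities. It assumes pairs are independent, so the log-likelihood is a sum over pairs; the function name `log_likelihood` and the dictionary layout are hypothetical, and the exact form used by the author is in the cited arXiv paper.

```python
import math

def log_likelihood(edges, pair_prob):
    """log Pr(data | model), assuming independent vertex pairs.

    edges     : set of observed pairs (i, j) with i < j
    pair_prob : dict mapping every pair (i, j), i < j, to its connection
                probability (p_r of the pair's lowest common ancestor)
    """
    total = 0.0
    for pair, p in pair_prob.items():
        # An observed edge contributes p; an absent edge contributes 1 - p.
        q = p if pair in edges else 1.0 - p
        if q == 0.0:
            return float("-inf")  # the model rules this observation out
        total += math.log(q)
    return total

# Toy data: 4 vertices forming two modules {0, 1} and {2, 3}.
edges = {(0, 1), (2, 3)}
good = {(0, 1): 0.9, (2, 3): 0.9,
        (0, 2): 0.1, (0, 3): 0.1, (1, 2): 0.1, (1, 3): 0.1}
bad = {pair: 0.5 for pair in good}  # a model that ignores the modules

ll_good = log_likelihood(edges, good)
ll_bad = log_likelihood(edges, bad)
```

The model that respects the two modules scores higher than the uniform one, which is the sense in which the likelihood "scores quality of model".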

SLIDE 12

From Graph to Ensemble

SLIDE 13

From Graph to Ensemble

  • Given graph G
  • run MCMC to equilibrium
  • then, for each sampled D, draw a resampled graph from ensemble

A test: do resampled graphs look like original?
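One way to make the "do resampled graphs look like original?" test concrete is to compute the same summary statistic on the original graph and on each resampled graph, then compare. The sketch below does this for the degree distribution only; the helper names and the toy graphs are hypothetical stand-ins, not the author's data or code.

```python
from collections import Counter

def degree_distribution(n, edges):
    """Fraction of the n vertices that have each degree k."""
    degree = Counter()
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    counts = Counter(degree[v] for v in range(n))
    return {k: c / n for k, c in counts.items()}

def mean_distribution(n, graphs):
    """Average the degree distribution over a list of resampled graphs."""
    total = Counter()
    for edges in graphs:
        for k, frac in degree_distribution(n, edges).items():
            total[k] += frac
    return {k: v / len(graphs) for k, v in total.items()}

# Toy stand-ins: a path on 4 vertices and two hypothetical resampled graphs.
original = {(0, 1), (1, 2), (2, 3)}
resampled = [{(0, 1), (2, 3)}, {(0, 1), (1, 2), (2, 3)}]

orig_dist = degree_distribution(4, original)
mean_dist = mean_distribution(4, resampled)
```

The same pattern would apply to other statistics such as the clustering coefficient or the distance distribution.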

SLIDE 14

Grassland species*

[Figure: grassland species network; legend: plant → herbivore → parasite]

*thank you: Jennifer Dunne

SLIDE 15

Degree Distribution

[Figure, panel a: fraction of vertices with degree k versus degree k, on logarithmic axes; curves: original, resampled]

SLIDE 16

Clustering Coefficient

[Figure: fraction of graphs with clustering coefficient c versus clustering coefficient c; curves: original, resampled]

SLIDE 17

Distance Distribution

[Figure, panel b: fraction of vertex-pairs at distance d versus distance d, with a logarithmic vertical axis; curves: original, resampled]

SLIDE 18

Missing Links

A test: can model predict missing links?

SLIDE 19

Predicting is Hard

  • remove k edges from G
  • how easy is it to guess a missing link?

$p_{\text{guess}} \approx \dfrac{k}{\binom{n}{2} - m + k} = O(n^{-2})$

With n = 75 and m = 113: $p_{\text{guess}} = k/(2662 + k)$
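The constant 2662 in the slide is the number of unconnected vertex pairs, n(n − 1)/2 − m = 2775 − 113. A short check of that arithmetic, with `p_guess` as a hypothetical helper name:

```python
from math import comb

n, m = 75, 113                # vertices and edges, as given on the slide
non_edges = comb(n, 2) - m    # unconnected pairs in the full graph

def p_guess(k):
    """Chance that one uniformly random candidate pair is a removed edge.

    After removing k edges, the candidates are the original non-edges
    plus the k removed edges.
    """
    return k / (non_edges + k)
```

Even with a sizeable fraction of the edges removed, the chance of a blind guess being right stays at a few percent, which is why the slide calls prediction hard.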

SLIDE 20

Predicting Missing Links

  • Given incomplete graph G
  • run MCMC to equilibrium
  • then, over sampled D, compute average p_r for links (i, j) ∉ G
  • predict that links with high average p_r values are missing

Test idea via leave-k-out cross-validation

  • perfect accuracy: AUC = 1
  • no better than chance: AUC = 1/2
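The scoring and evaluation steps can be sketched as follows. The AUC is computed here as the probability that a randomly chosen removed edge outscores a randomly chosen true non-edge, with ties counted as one half. The function names and the toy scores are hypothetical, and the sampled models are assumed to be available as per-pair probability dictionaries.

```python
def average_scores(sampled_pair_probs):
    """Average each candidate pair's connection probability over sampled models."""
    pairs = sampled_pair_probs[0].keys()
    count = len(sampled_pair_probs)
    return {pair: sum(s[pair] for s in sampled_pair_probs) / count
            for pair in pairs}

def auc(missing_scores, nonedge_scores):
    """Probability that a removed edge outscores a true non-edge (ties count 1/2)."""
    wins = 0.0
    for s in missing_scores:
        for t in nonedge_scores:
            if s > t:
                wins += 1.0
            elif s == t:
                wins += 0.5
    return wins / (len(missing_scores) * len(nonedge_scores))

# Hypothetical per-pair probabilities from two sampled models, for the
# pairs that are unconnected in the incomplete graph.
samples = [{(0, 2): 0.8, (0, 3): 0.2, (1, 3): 0.1},
           {(0, 2): 0.6, (0, 3): 0.2, (1, 3): 0.3}]
scores = average_scores(samples)

held_out = {(0, 2)}  # edges removed before fitting (the "k" in leave-k-out)
missing = [scores[p] for p in scores if p in held_out]
absent = [scores[p] for p in scores if p not in held_out]
result = auc(missing, absent)
```

The pairwise comparison is quadratic in the number of candidates, which is fine for a network of this size; a rank-based computation would be the usual choice for larger graphs.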

SLIDE 21

Missing Structure

[Figure: area under ROC curve (AUC) versus fraction of edges observed, k/m, for the grassland species network. Curves: pure chance; simple predictors (common neighbors, Jaccard coefficient, degree product, shortest paths); hierarchical structure]
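Three of the simple predictors named in the figure legend have standard definitions that fit in a few lines. The sketch below uses those textbook definitions on a toy graph; the function names are hypothetical, and the shortest-paths predictor is omitted for brevity.

```python
def neighbors(edges):
    """Build an adjacency map {vertex: set of neighbours} from an edge set."""
    nbrs = {}
    for i, j in edges:
        nbrs.setdefault(i, set()).add(j)
        nbrs.setdefault(j, set()).add(i)
    return nbrs

def common_neighbors(nbrs, i, j):
    """Number of vertices adjacent to both i and j."""
    return len(nbrs.get(i, set()) & nbrs.get(j, set()))

def jaccard(nbrs, i, j):
    """Common neighbours divided by the size of the combined neighbourhood."""
    union = nbrs.get(i, set()) | nbrs.get(j, set())
    return common_neighbors(nbrs, i, j) / len(union) if union else 0.0

def degree_product(nbrs, i, j):
    """Product of the two vertices' degrees."""
    return len(nbrs.get(i, set())) * len(nbrs.get(j, set()))

# Toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
edges = {(0, 1), (0, 2), (1, 2), (2, 3)}
nbrs = neighbors(edges)
cn = common_neighbors(nbrs, 1, 3)
jc = jaccard(nbrs, 1, 3)
dp = degree_product(nbrs, 1, 3)
```

Each function returns a score for one candidate pair; ranking all unconnected pairs by that score gives a predictor that can be evaluated by AUC in the same way as the hierarchical model.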

SLIDE 22

Other Networks

[Figure, panel a: AUC versus fraction of edges observed for the terrorist association network]

[Figure, panel b: AUC versus fraction of edges observed for the T. pallidum metabolic network]

Curves in both panels: pure chance, common neighbors, Jaccard coefficient, degree product, shortest paths, hierarchical structure

SLIDE 23

Summary

  • Many real networks are hierarchically modular
  • Hierarchies can
      • model multi-scale structure
      • generalize a single network
      • predict missing links
  • Model-based inference is very powerful

Acknowledgments:

  • C. Moore, M.E.J. Newman, C.H. Wiggins, and C.R. Shalizi
SLIDE 24

Fin