Bayesian Methods for Graph Clustering, P. Latouche, E. Birmelé


P. Latouche, E. Birmelé. Laboratoire Statistique et Génome (UMR CNRS 8071, INRA 1152). Journées MAS, August 2008.


slide-1
SLIDE 1

Bayesian Methods for Graph Clustering

P. Latouche, E. Birmelé

Laboratoire "Statistique et Génome" (UMR CNRS 8071, INRA 1152)

Journées MAS, August 2008

  • P. Latouche (Stat. & Génome), Bayesian Methods for Graph Clustering, Journées MAS, August 2008, 1 / 22

slide-2
SLIDE 2

Outline

1. Introduction
  • Real networks
  • Random graph models
  • The MixNet model
  • Maximum likelihood estimation

2. Bayesian View of MixNet
  • Bayesian probabilistic model
  • Variational inference
  • Model selection

3. Applications
  • Affiliation models
  • Metabolic network of E. coli

slide-3
SLIDE 3

Outline

1. Introduction
  • Real networks
  • Random graph models
  • The MixNet model
  • Maximum likelihood estimation

2. Bayesian View of MixNet
  • Bayesian probabilistic model
  • Variational inference
  • Model selection

3. Applications
  • Affiliation models
  • Metabolic network of E. coli

slide-4
SLIDE 4

Real networks

Many scientific fields:

  • World Wide Web, biology, sociology, physics.

Nature of data under study:

  • interactions between n objects,
  • O(n^2) possible interactions.

Network topology:

  • describes the way nodes interact,
  • structure/function relationship.

Figure: sample of 250 blogs (nodes) with their links (edges) of the French political blogosphere.

slide-5
SLIDE 5

Estimation of random graph models

Random graph models

  • Random graph approaches (Govaert 1977, Frank and Harary 1982, Handcock 2006, Newman and Leicht 2007, Hofman and Wiggins 2008).
  • The Block-Clustering model (Snijders and Nowicki 1997).
  • Erdős-Rényi Mixture Model for Networks (MixNet; Daudin et al. 2008).

Estimation of the model parameters

  • Bayesian strategies cannot handle large networks.
  • Maximum likelihood strategies.

slide-6
SLIDE 6

The MixNet probabilistic model

Origin

  • model developed by J. Daudin et al. (2008),
  • generalization of the Erdős-Rényi (ER) model,
  • application fields: biology, internet, social networks...

Modelling connection heterogeneity

  • hypothesis: there exists a hidden structure with Q classes,
  • $Z = (Z_i)_i$, where the $Z_{iq} = \mathbb{I}\{i \in q\}$ are independent hidden variables,
  • $\alpha = \{\alpha_q\}$, the prior proportions of groups,
  • $Z_i \sim \mathcal{M}(1, \alpha)$.

Distribution of X

  • Conditional distribution: $X_{ij} \mid \{Z_{iq} Z_{j\ell} = 1\} \sim \mathcal{B}(\pi_{q\ell})$, where $\mathcal{B}(\cdot)$ is the Bernoulli distribution,
  • the $X_{ij} \mid Z$ are independent,
  • Marginal distribution: $X_{ij} \sim \sum_{q,\ell} \alpha_q \alpha_\ell \, \mathcal{B}(\pi_{q\ell})$.

Figure: three vertices i, j, k, with $X_{ij} = 1$ (edge) and $X_{jk} = 0$ (no edge).
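The generative mechanism of MixNet can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph is taken to be undirected without self-loops, the connectivity matrix is symmetric, and the function name `sample_mixnet` and the numerical values of `alpha` and `pi` are hypothetical, not taken from the slides.

```python
import random

def sample_mixnet(n, alpha, pi, seed=0):
    """Draw class labels Z and an adjacency matrix X from a MixNet model.

    Assumptions of this sketch: undirected graph, no self-loops,
    symmetric connectivity matrix pi.
    """
    rng = random.Random(seed)
    Q = len(alpha)
    # Z_i ~ M(1, alpha): one class label per vertex
    z = rng.choices(range(Q), weights=alpha, k=n)
    x = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # X_ij | Z ~ Bernoulli(pi[z_i][z_j])
            edge = 1 if rng.random() < pi[z[i]][z[j]] else 0
            x[i][j] = x[j][i] = edge
    return z, x

# Illustrative (hypothetical) parameters: two classes, dense within, sparse between
alpha = [0.6, 0.4]
pi = [[0.8, 0.1], [0.1, 0.7]]
z, x = sample_mixnet(30, alpha, pi)
print(sum(map(sum, x)) // 2, "edges among", len(z), "vertices")
```

Setting Q = 1 gives back the Erdős-Rényi model, which is the sense in which MixNet generalizes it.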

slide-7
SLIDE 7

Maximum likelihood estimation

Likelihood(s) of the model:

  → Observed data: $p(X \mid \alpha, \pi) = \sum_Z p(X, Z \mid \alpha, \pi)$.
  → Complete data: $p(X, Z \mid \alpha, \pi)$.
  → EM-like strategies require the knowledge of $p(Z \mid X, \alpha, \pi)$.

Problem

In our case, $p(Z \mid X, \alpha, \pi)$ is not tractable (no conditional independence).

slide-8
SLIDE 8

Variational method

Daudin et al., 2008

Decomposition and variational EM

$$\ln p(X \mid \alpha, \pi) = \mathcal{L}\big(q(\cdot); \alpha, \pi\big) + \mathrm{KL}\big(q(\cdot) \,\|\, p(\cdot \mid X, \alpha, \pi)\big),$$

where

$$\mathcal{L}\big(q(\cdot); \alpha, \pi\big) = \sum_Z q(Z) \ln\left\{\frac{p(X, Z \mid \alpha, \pi)}{q(Z)}\right\},$$

and

$$\mathrm{KL}\big(q(\cdot) \,\|\, p(\cdot \mid X, \alpha, \pi)\big) = -\sum_Z q(Z) \ln\left\{\frac{p(Z \mid X, \alpha, \pi)}{q(Z)}\right\}.$$

Approximation

$$q(Z) = \prod_{i=1}^{N} q(Z_i) = \prod_{i=1}^{N} \mathcal{M}(Z_i; 1, \tau_i).$$
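The decomposition can be checked numerically on a toy graph by enumerating every configuration of Z. The sketch below assumes an undirected graph without self-loops; the graph `x`, the parameters `alpha` and `pi`, and the variational probabilities `tau` are made-up illustrative values, and the helper names are hypothetical. The enumeration has Q^n terms, which is why it is only feasible for very small n.

```python
import math
from itertools import product

def log_joint(x, z, alpha, pi):
    """ln p(X, Z | alpha, pi) for an undirected graph without self-loops."""
    n = len(z)
    lp = sum(math.log(alpha[q]) for q in z)
    for i in range(n):
        for j in range(i + 1, n):
            p = pi[z[i]][z[j]]
            lp += math.log(p if x[i][j] else 1.0 - p)
    return lp

def check_decomposition(x, alpha, pi, tau):
    """Return (ln p(X), lower bound L, KL) by brute-force enumeration of Z."""
    n, Q = len(x), len(alpha)
    configs = list(product(range(Q), repeat=n))  # Q**n configurations
    lj = [log_joint(x, z, alpha, pi) for z in configs]
    log_evidence = math.log(sum(math.exp(v) for v in lj))
    bound, kl = 0.0, 0.0
    for z, v in zip(configs, lj):
        # factorized q(Z) = prod_i tau[i][z_i]
        log_q = sum(math.log(tau[i][z[i]]) for i in range(n))
        q = math.exp(log_q)
        bound += q * (v - log_q)                 # q ln{p(X,Z)/q}
        kl -= q * ((v - log_evidence) - log_q)   # -q ln{p(Z|X)/q}
    return log_evidence, bound, kl

# Toy example: a triangle plus one isolated vertex, two classes
x = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
alpha = [0.5, 0.5]
pi = [[0.8, 0.1], [0.1, 0.3]]
tau = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.2, 0.8]]
log_evidence, bound, kl = check_decomposition(x, alpha, pi, tau)
print(log_evidence, bound + kl)
```

Because the KL term is non-negative, the bound never exceeds ln p(X | α, π); improving τ tightens it.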

slide-9
SLIDE 9

Estimation of the number of classes

Criteria

Since $p(X \mid \alpha, \pi)$ is not tractable, many criteria cannot be used:

  1. Akaike Information Criterion: $\mathrm{AIC} = \ln p(X \mid \alpha_{ML}, \pi_{ML}) - M$.
  2. Bayesian Information Criterion: $\mathrm{BIC} = \ln p(X \mid \alpha_{MAP}, \pi_{MAP}) - \frac{1}{2} M \ln n$.

Integrated Classification Likelihood (ICL)

  1. Following the work of Biernacki et al. (2000), Mariadassou and Robin (2007) used a criterion based on an asymptotic approximation of the Integrated Classification Likelihood (ICL).
  2. $\mathrm{ICL} = \max_{\alpha, \pi} \ln p(X, \tilde{Z} \mid \alpha, \pi) - \frac{1}{2} \, \frac{Q(Q+1)}{2} \ln\big(n(n+1)\big) - (Q-1) \ln(n)$

slide-10
SLIDE 10

Outline

1. Introduction
  • Real networks
  • Random graph models
  • The MixNet model
  • Maximum likelihood estimation

2. Bayesian View of MixNet
  • Bayesian probabilistic model
  • Variational inference
  • Model selection

3. Applications
  • Affiliation models
  • Metabolic network of E. coli

slide-11
SLIDE 11

Conjugate prior distributions

  • Mixing coefficients: $\alpha \sim \mathrm{Dirichlet}(\alpha; n^0)$
    → $n^0 = (n^0_1, \dots, n^0_Q)$.
    → $n^0_q$ is the prior number of vertices in class $q$.

  • Connectivity matrix: $\pi \sim \prod_{q,l} \mathrm{Beta}(\pi_{ql}; \eta^0_{ql}, \zeta^0_{ql})$
    → $\eta^0_{ql}$ is the prior number of edges connecting vertices of class $q$ to vertices of class $l$.
    → $\zeta^0_{ql}$ is the prior number of non-edges connecting vertices of class $q$ to vertices of class $l$.
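Drawing parameters from these conjugate priors takes only the standard library. In the sketch below the Dirichlet draw uses the usual normalized-Gamma construction; the hyperparameter values (all equal to 1) and the symmetry of π are assumptions made for illustration, and `sample_priors` is a hypothetical name.

```python
import random

def sample_priors(n0, eta0, zeta0, seed=0):
    """Draw alpha ~ Dirichlet(n0) and pi_ql ~ Beta(eta0_ql, zeta0_ql)."""
    rng = random.Random(seed)
    # Dirichlet draw via normalized Gamma(n0_q, 1) variables
    g = [rng.gammavariate(a, 1.0) for a in n0]
    total = sum(g)
    alpha = [v / total for v in g]
    Q = len(n0)
    pi = [[0.0] * Q for _ in range(Q)]
    for q in range(Q):
        for l in range(q, Q):
            # symmetric pi (undirected graph): an assumption of this sketch
            pi[q][l] = pi[l][q] = rng.betavariate(eta0[q][l], zeta0[q][l])
    return alpha, pi

Q = 3
n0 = [1.0] * Q                          # illustrative hyperparameters
eta0 = [[1.0] * Q for _ in range(Q)]
zeta0 = [[1.0] * Q for _ in range(Q)]
alpha, pi = sample_priors(n0, eta0, zeta0)
print(alpha)
```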

slide-12
SLIDE 12

Variational Bayes

Decomposition

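$$\ln p(X) = \mathcal{L}\big(q(\cdot)\big) + \mathrm{KL}\big(q(\cdot) \,\|\, p(\cdot \mid X)\big),$$

where

$$\mathcal{L}\big(q(\cdot)\big) = \sum_Z \int\!\!\int q(Z, \alpha, \pi) \ln\left\{\frac{p(X, Z, \alpha, \pi)}{q(Z, \alpha, \pi)}\right\} d\alpha \, d\pi,$$

and

$$\mathrm{KL}\big(q(\cdot) \,\|\, p(\cdot \mid X)\big) = -\sum_Z \int\!\!\int q(Z, \alpha, \pi) \ln\left\{\frac{p(Z, \alpha, \pi \mid X)}{q(Z, \alpha, \pi)}\right\} d\alpha \, d\pi.$$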

slide-13
SLIDE 13

Variational Bayes

Factorization

$$q(Z, \alpha, \pi) = q(\alpha)\, q(\pi)\, q(Z) = q(\alpha)\, q(\pi) \prod_{i=1}^{N} q(Z_i).$$

Optimization

  1. $\ln \tilde{q}(Z_i) = \mathbb{E}_{Z^{\setminus i}, \alpha, \pi}[\ln p(X, Z, \alpha, \pi)] + \mathrm{const}$.
  2. $\ln \tilde{q}(\alpha) = \mathbb{E}_{Z, \pi}[\ln p(X, Z, \alpha, \pi)] + \mathrm{const}$.
  3. $\ln \tilde{q}(\pi) = \mathbb{E}_{Z, \alpha}[\ln p(X, Z, \alpha, \pi)] + \mathrm{const}$.

slide-14
SLIDE 14

Optimization

Variational Bayes E-step

$$q(Z_i) = \mathcal{M}(Z_i; 1, \tau_i = \{\tau_{i1}, \dots, \tau_{iQ}\}).$$

Variational Bayes M-step (1)

$$q(\alpha) = \mathrm{Dir}(\alpha; n), \quad \text{where } n_q = n^0_q + \sum_{i=1}^{N} \tau_{iq}.$$

Variational Bayes M-step (2)

$$q(\pi) = \prod_{q,l}^{Q} \mathrm{Beta}(\pi_{ql} \mid \eta_{ql}, \zeta_{ql}),$$

where $\eta_{ql} = \eta^0_{ql} + \sum_{i \neq j}^{N} X_{ij} \tau_{iq} \tau_{jl}$ and $\zeta_{ql} = \zeta^0_{ql} + \sum_{i \neq j}^{N} (1 - X_{ij}) \tau_{iq} \tau_{jl}$.
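The two M-step updates are simple weighted counts and can be sketched directly from the formulas on this slide. The code below reads the sums for η and ζ as running over all ordered pairs i ≠ j, so each undirected pair contributes twice when q = l; that reading, the function name `vb_m_step`, and the toy inputs are assumptions of this sketch.

```python
def vb_m_step(x, tau, n0, eta0, zeta0):
    """Update the Dirichlet and Beta hyperparameters from the current tau."""
    N, Q = len(x), len(n0)
    # n_q = n0_q + sum_i tau_iq
    n = [n0[q] + sum(tau[i][q] for i in range(N)) for q in range(Q)]
    eta = [row[:] for row in eta0]
    zeta = [row[:] for row in zeta0]
    for q in range(Q):
        for l in range(Q):
            for i in range(N):
                for j in range(N):
                    if i == j:
                        continue  # sums run over ordered pairs i != j
                    w = tau[i][q] * tau[j][l]
                    eta[q][l] += x[i][j] * w          # expected edges
                    zeta[q][l] += (1 - x[i][j]) * w   # expected non-edges
    return n, eta, zeta

# Toy example: 3 vertices, one edge (0-1), hard assignments to 2 classes
x = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
tau = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
ones = [[1.0, 1.0], [1.0, 1.0]]
n, eta, zeta = vb_m_step(x, tau, [1.0, 1.0], ones, ones)
print(n, eta, zeta)
```

With hard assignments the updates reduce to the prior pseudo-counts plus observed counts of vertices, edges and non-edges per pair of classes, which matches the interpretation of the hyperparameters given with the priors.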

slide-15
SLIDE 15

Model selection

  • The model evidence depends on Q.
  • Bayes' rule leads to $p(Q \mid X) \propto p(X \mid Q)\, p(Q)$.
  • If $p(Q)$ is broad, maximizing $p(Q \mid X)$ is equivalent to maximizing $p(X \mid Q)$.
  • Since $p(X \mid Q)$ is intractable, we propose to use the lower bound $\mathcal{L}\big(q(\cdot)\big)$ and to add a term $\ln Q!$ to take the multimodality into account.
  • First non-asymptotic criterion based on an approximation of the model evidence.
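Once a converged lower bound is available for each candidate Q, the selection rule amounts to a one-line comparison. The sketch below assumes the bounds have already been computed elsewhere; the function name `select_num_classes` and the numerical bounds are hypothetical values chosen only to show the mechanics.

```python
import math

def select_num_classes(lower_bounds):
    """lower_bounds: dict mapping Q to the converged lower bound L(q)."""
    # criterion = L(q) + ln Q!, using ln Q! = lgamma(Q + 1)
    scores = {Q: lb + math.lgamma(Q + 1) for Q, lb in lower_bounds.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical lower bounds, for illustration only
bounds = {1: -520.0, 2: -480.0, 3: -470.5, 4: -473.0}
best, scores = select_num_classes(bounds)
print(best, scores[best])
```

The ln Q! term grows with Q, so it partially offsets the fact that a single variational solution only captures one of the Q! equivalent labelings of the classes.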

slide-16
SLIDE 16

Outline

1. Introduction
  • Real networks
  • Random graph models
  • The MixNet model
  • Maximum likelihood estimation

2. Bayesian View of MixNet
  • Bayesian probabilistic model
  • Variational inference
  • Model selection

3. Applications
  • Affiliation models
  • Metabolic network of E. coli

slide-17
SLIDE 17

Experiments on affiliation models

  • Probability of intra-connection: λ.
  • Probability of inter-connection: ε.
  • Number of vertices: n = 50.
  • For each graph model (λ + ε = 1) and for each number of classes $Q_{True} \in \{2, 3, 4, 5\}$, we generated 100 graphs.
  • 5 initializations using spectral clustering techniques.
  • Select the best number of estimated classes according to each criterion.
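The data-generation part of this protocol can be sketched as follows. The slide does not state the class proportions, so equal proportions are assumed here; the graph is taken to be undirected without self-loops, and the function name `affiliation_graph` and the seed are hypothetical. The inference and model selection steps are not shown.

```python
import random

def affiliation_graph(n, q_true, lam, eps, rng):
    """Sample one affiliation-model graph: lam within a class, eps between."""
    # equal class proportions: an assumption, not stated on the slide
    z = [rng.randrange(q_true) for _ in range(n)]
    x = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = lam if z[i] == z[j] else eps
            x[i][j] = x[j][i] = int(rng.random() < p)
    return z, x

rng = random.Random(2008)
# 100 graphs with n = 50 for each true number of classes, lam + eps = 1
graphs = {q: [affiliation_graph(50, q, 0.85, 0.15, rng) for _ in range(100)]
          for q in (2, 3, 4, 5)}
print({q: len(g) for q, g in graphs.items()})
```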

slide-18
SLIDE 18

Affiliation model (1)

a) Q_True (rows) against Q_ICL (columns 1 to 6); non-empty cells of each row, in column order:

  Q_True = 2: 100
  Q_True = 3: 100
  Q_True = 4: 1, 98, 1
  Q_True = 5: 10, 61, 29

b) Q_True (rows) against Q_VB (columns 1 to 6); non-empty cells of each row, in column order:

  Q_True = 2: 100
  Q_True = 3: 100
  Q_True = 4: 98, 2
  Q_True = 5: 1, 29, 65, 5

Table: λ = 0.85 and ε = 0.15.

slide-19
SLIDE 19

Affiliation model (2)

a) Q_True (rows) against Q_ICL (columns 1 to 6); non-empty cells of each row, in column order:

  Q_True = 2: 100
  Q_True = 3: 100
  Q_True = 4: 14, 86
  Q_True = 5: 17, 36, 44, 3

b) Q_True (rows) against Q_VB (columns 1 to 6); non-empty cells of each row, in column order:

  Q_True = 2: 100
  Q_True = 3: 100
  Q_True = 4: 5, 94, 1
  Q_True = 5: 4, 18, 43, 29, 6

Table: λ = 0.8 and ε = 0.2.

slide-20
SLIDE 20

Metabolic network of E. coli

593 vertices. 1782 edges. 60 initializations. Compute the lower bound.


slide-21
SLIDE 21

Model selection

Figure: plot with horizontal axis ticks from 1 to 39 and vertical axis ticks from −10000 to −7500.

slide-22
SLIDE 22

In summary

Flexibility of MixNet:

  • MixNet is a probabilistic model which captures features of real networks,
  • It considers classes of connectivity.

Bayesian framework

  • Estimate of the number of classes more robust,
  • Can handle large graphs.

References:

  • Daudin J.-J., Picard F., Robin S. (2008), A mixture model for random graphs, Statistics and Computing.
  • Zanghi H., Ambroise C., Miele V. (to appear), Fast online graph clustering via Erdős-Rényi mixture, Pattern Recognition.
  • Hofman J.M., Wiggins C.H. (2008), A Bayesian approach to network modularity, Physical Review Letters.

Software:

  • MixNet, a C++ code (V. Miele): http://stat.genopole.cnrs.fr/software/mixnet
  • MixNet, an R wrapper of the MixNet C++ code (available on demand).