Bayesian non parametric inference of discrete valued networks

SLIDE 1

Bayesian non parametric inference of discrete valued networks

L. Nouedoui, P. Latouche

Université Paris 1 Panthéon-Sorbonne, Laboratoire SAMM

ESANN 13

SLIDE 2

Contents

◮ Introduction
◮ Real networks
◮ Graph clustering
◮ Stochastic block models
◮ The model
◮ Poisson mixture model
◮ Infinite Poisson mixture model
◮ Chinese restaurant process
◮ Inference
◮ Experiments

SLIDE 3

Real networks

Subset of the yeast transcriptional regulatory network (Milo et al., 2002).

SLIDE 4

Graph clustering

◮ Existing methods look for:

  ◮ Community structure
  ◮ Disassortative mixing
  ◮ Heterogeneous structure

SLIDE 8

Stochastic Block Model (SBM)

◮ Nowicki and Snijders (2001)

◮ Earlier work: Govaert et al. (1977)

◮ $Z_i$ independent hidden variables:

  ◮ $Z_i \sim \mathcal{M}(1, \alpha = (\alpha_1, \alpha_2, \dots, \alpha_K))$
  ◮ $Z_{ik} = 1$: vertex $i$ belongs to class $k$

◮ $X \mid Z$ edges drawn independently:

$$X_{ij} \mid \{Z_{ik} Z_{jl} = 1\} \sim \mathcal{B}(\pi_{kl})$$

◮ A mixture model for graphs:

$$X_{ij} \sim \sum_{k=1}^{K} \sum_{l=1}^{K} \alpha_k \alpha_l \, \mathcal{B}(\pi_{kl})$$
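The generative model on this slide can be simulated in a few lines. The sketch below is illustrative only: it assumes a directed graph without self-loops (the slide does not say), and the function name `sample_sbm` and the numerical values of `alpha` and `pi` are made up for the example.

```python
import random

def sample_sbm(n, alpha, pi, seed=0):
    """Draw class labels and a binary adjacency matrix from an SBM."""
    rng = random.Random(seed)
    k_classes = len(alpha)
    # Z_i ~ M(1, alpha): one class label per vertex
    z = rng.choices(range(k_classes), weights=alpha, k=n)
    # X_ij | {Z_ik Z_jl = 1} ~ B(pi_kl), drawn independently
    # Assumption: directed edges, no self-loops
    x = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and rng.random() < pi[z[i]][z[j]]:
                x[i][j] = 1
    return z, x

# Hypothetical parameters: three classes with strong within-class connectivity
alpha = [0.5, 0.3, 0.2]
pi = [[0.8, 0.1, 0.1],
      [0.1, 0.7, 0.1],
      [0.1, 0.1, 0.9]]
z, x = sample_sbm(30, alpha, pi)
```

Marginally, each edge then follows the mixture of Bernoulli distributions written above, with weights $\alpha_k \alpha_l$.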

SLIDE 9

[Figure: example network with numbered vertices and connection probabilities π••]

Approximations

◮ Gibbs: Nowicki and Snijders (2001)
◮ VEM: Daudin et al. (2008)
◮ VBEM: Latouche et al. (2012)

SLIDE 11

Poisson mixture model for networks

◮ Many networks have discrete-valued edges
◮ Extension of SBM to discrete edges: $X_{ij} \in \mathbb{N}$
◮ $X_{ij} \mid \{Z_{ik} Z_{jl} = 1\} \sim \mathcal{P}(\lambda_{kl})$
◮ Poisson mixture model (PM) (Mariadassou et al., 2010)
◮ Inference: VEM + ICL

ICL

◮ Based on a Laplace (asymptotic) approximation
◮ Problem for small sample sizes
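To make the Poisson mixture model at the top of this slide concrete, here is a minimal simulation sketch. It assumes a directed network without self-loops, and uses Knuth's multiplication method for the Poisson draws, which is adequate for small rates. The names `sample_poisson` and `sample_poisson_network` and the parameter values are hypothetical.

```python
import math
import random

def sample_poisson(lam, rng):
    """Knuth's multiplication method; suitable for small rates only."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def sample_poisson_network(n, alpha, lam, seed=0):
    """Z_i ~ M(1, alpha), then X_ij | {Z_ik Z_jl = 1} ~ P(lambda_kl)."""
    rng = random.Random(seed)
    z = rng.choices(range(len(alpha)), weights=alpha, k=n)
    # Assumption: directed edges, no self-loops (diagonal left at 0)
    x = [[0 if i == j else sample_poisson(lam[z[i]][z[j]], rng)
          for j in range(n)]
         for i in range(n)]
    return z, x

# Hypothetical parameters: two classes, denser counts within classes
alpha = [0.6, 0.4]
lam = [[4.0, 0.5],
       [0.5, 2.0]]
z, x = sample_poisson_network(20, alpha, lam)
```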

SLIDE 13

Chinese restaurant process

◮ Non parametric prior for PM
◮ Each class attracts new data points depending on its current size
◮ Assume that $m - 1$ observations have already been classified
◮ A new data point is assigned to

  ◮ class $k$ with probability $\propto n_k$ (the current size of class $k$)
  ◮ a new class with probability $\propto \eta_0$

◮ Exchangeable distribution
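The sequential assignment rule above translates directly into a sampler. This is a small sketch of the Chinese restaurant process itself, not of the authors' code; `sample_crp` is a hypothetical name and `eta0` must be strictly positive.

```python
import random

def sample_crp(m, eta0, seed=0):
    """Assign m observations to classes one at a time using the CRP rule."""
    rng = random.Random(seed)
    labels, sizes = [], []
    for _ in range(m):
        # An existing class k has weight n_k; a new class has weight eta0
        weights = sizes + [eta0]
        k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new class
        else:
            sizes[k] += 1     # join existing class k
        labels.append(k)
    return labels, sizes

labels, sizes = sample_crp(50, eta0=1.0)
```

Larger values of `eta0` tend to produce more classes, since a new class is opened more often.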

SLIDE 14

Chinese restaurant process

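◮ Stick-breaking prior

  ◮ $\beta_k \sim \mathrm{Beta}(1, \eta_0), \ \forall k$
  ◮ $\alpha_1 = \beta_1$
  ◮ $\alpha_k = \beta_k \prod_{l=1}^{k-1} (1 - \beta_l)$

◮ $Z_i \mid \alpha \sim \mathcal{M}(1, \alpha)$
◮ Conjugate prior

  ◮ $\lambda_{kl} \mid a, b \sim \mathrm{Gamma}(a, b)$
  ◮ Choice for the hyperparameters $a$ and $b$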
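A draw from these priors might look like the sketch below. Two assumptions are made that the slide does not state: the infinite sequence is truncated at a finite level `k_up` (the last stick takes all remaining mass so the proportions sum to one), and `b` in Gamma(a, b) is read as a rate parameter. Function names are hypothetical.

```python
import random

def stick_breaking(eta0, k_up, rng):
    """Truncated stick-breaking: returns k_up proportions that sum to 1."""
    alpha, remaining = [], 1.0
    for k in range(k_up):
        # beta_k ~ Beta(1, eta0); the last stick closes the truncation
        beta_k = 1.0 if k == k_up - 1 else rng.betavariate(1.0, eta0)
        # alpha_k = beta_k * prod_{l<k} (1 - beta_l)
        alpha.append(beta_k * remaining)
        remaining *= 1.0 - beta_k
    return alpha

def sample_rates(k_up, a, b, rng):
    """lambda_kl ~ Gamma(a, b), with b read as a rate (assumption)."""
    # random.gammavariate takes (shape, scale), hence scale = 1 / b
    return [[rng.gammavariate(a, 1.0 / b) for _ in range(k_up)]
            for _ in range(k_up)]

rng = random.Random(0)
alpha = stick_breaking(eta0=1.0, k_up=10, rng=rng)
lam = sample_rates(k_up=10, a=1.0, b=1.0, rng=rng)
```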

SLIDE 15

Gibbs sampling

◮ $p(Z, \alpha, \lambda \mid X)$ not tractable
◮ Gibbs sampling procedure:

  ◮ $\beta \sim p(\beta \mid X, Z, \lambda)$, then compute $\alpha$
  ◮ $Z_i \sim p(Z_i \mid X, Z_{\setminus i}, \alpha, \lambda)$
  ◮ $\lambda \sim p(\lambda \mid X, Z, \alpha)$

◮ Start with $K = K_{up}$ classes
◮ Some classes become empty during the algorithm
◮ Number of non-empty classes: estimate of $K$
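One sweep of the three steps could be sketched as follows. This is not the authors' implementation. It assumes a directed network without self-loops, a stick-breaking prior truncated at `k_up`, and Gamma(a, b) with `b` as a rate. Under those assumptions the conditionals are the standard conjugate ones: $\beta_k \mid Z \sim \mathrm{Beta}(1 + n_k, \eta_0 + \sum_{l>k} n_l)$ and $\lambda_{kl} \mid X, Z \sim \mathrm{Gamma}(a + \text{sum of counts in block } (k,l),\ b + \text{number of pairs in block } (k,l))$. The toy data are made up.

```python
import math
import random

def gibbs_sweep(x, z, lam, eta0, a, b, rng):
    """One sweep over beta/alpha, Z and lambda (truncated at k_up classes)."""
    n, k_up = len(x), len(lam)

    # Step 1: beta | Z, then alpha by stick-breaking.
    sizes = [z.count(k) for k in range(k_up)]
    alpha, remaining = [], 1.0
    for k in range(k_up):
        tail = sum(sizes[k + 1:])
        beta_k = (1.0 if k == k_up - 1
                  else rng.betavariate(1.0 + sizes[k], eta0 + tail))
        alpha.append(beta_k * remaining)
        remaining *= 1.0 - beta_k

    # Step 2: Z_i | X, Z without i, alpha, lambda, one vertex at a time.
    for i in range(n):
        log_w = []
        for k in range(k_up):
            s = math.log(max(alpha[k], 1e-300))
            for j in range(n):
                if j != i:
                    out_rate = max(lam[k][z[j]], 1e-300)
                    in_rate = max(lam[z[j]][k], 1e-300)
                    # Poisson log-density up to terms constant in k
                    s += x[i][j] * math.log(out_rate) - out_rate
                    s += x[j][i] * math.log(in_rate) - in_rate
            log_w.append(s)
        top = max(log_w)
        weights = [math.exp(v - top) for v in log_w]
        z[i] = rng.choices(range(k_up), weights=weights, k=1)[0]

    # Step 3: lambda_kl | X, Z from the conjugate Gamma posterior.
    for k in range(k_up):
        for l in range(k_up):
            pairs = [(i, j) for i in range(n) for j in range(n)
                     if i != j and z[i] == k and z[j] == l]
            total = sum(x[i][j] for i, j in pairs)
            lam[k][l] = rng.gammavariate(a + total, 1.0 / (b + len(pairs)))
    return z, alpha, lam

rng = random.Random(0)
n, k_up = 6, 4
# Toy count network with two groups of three vertices (made-up data)
group = [0, 0, 0, 1, 1, 1]
x = [[0 if i == j else (5 if group[i] == group[j] else 0) for j in range(n)]
     for i in range(n)]
z = [rng.randrange(k_up) for _ in range(n)]
lam = [[1.0] * k_up for _ in range(k_up)]
for _ in range(20):
    z, alpha, lam = gibbs_sweep(x, z, lam, eta0=1.0, a=1.0, b=1.0, rng=rng)
n_nonempty = len(set(z))  # number of non-empty classes after the last sweep
```

In practice the number of non-empty classes would be tracked across many sweeps after a burn-in period rather than read off a single state.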

SLIDE 16

Experiments

◮ Simulate networks
◮ $N = 50, 100, 500, 1000$
◮ $K = 3$
◮ Unbalanced proportions: $\alpha_k \propto (1/2)^k$

  ◮ $\alpha = (80.6, 16.1, 3.3)$

◮ $\lambda_{kl} = \lambda'$ and $\lambda_{kl} = (1/2)\lambda'$

SLIDE 17

Experiments

Network size   Model   Kn = 3   Kn = 2   Kn = 4
N = 50         IPM     0.59     0.41     0.00
               PM      0.17     0.82     0.01
N = 100        IPM     0.96     0.04     0.00
               PM      0.90     0.07     0.03
N = 500        IPM     1.00     0.00     0.00
               PM      1.00     0.00     0.00
N = 1000       IPM     1.00     0.00     0.00
               PM      1.00     0.00     0.00

SLIDE 18

Real data : Zachary on UCINET

[Figure: Zachary network, with the vertices Mr. Hi and John A. labelled]

SLIDE 19

References

◮ K. Nowicki and T.A.B. Snijders (2001), Estimation and prediction for stochastic blockstructures. Journal of the American Statistical Association, 96, 1077-1087

◮ E.M. Airoldi, D.M. Blei, S.E. Fienberg, E.P. Xing (2008), Mixed membership stochastic blockmodels. Journal of Machine Learning Research, 9, 1981-2014

◮ J-J. Daudin, F. Picard and S. Robin (2008), A mixture model for random graphs. Statistics and Computing, 18, 2, 151-171

◮ P. Latouche, E. Birmelé, C. Ambroise (2011), Overlapping stochastic block models with application to the French political blogosphere network. Annals of Applied Statistics, 5, 1, 309-336

◮ P. Latouche, E. Birmelé, C. Ambroise (2012), Variational Bayesian inference and complexity control for stochastic block models. Statistical Modelling, 12, 1, 93-115

SLIDE 20

Maximum likelihood estimation

◮ Log-likelihoods of the model:

  ◮ Observed-data: $\log p(X \mid \alpha, \Pi) = \log \left\{ \sum_{Z} p(X, Z \mid \alpha, \Pi) \right\}$ ↪ $K^N$ terms

◮ Expectation Maximization (EM) algorithm requires the knowledge of $p(Z \mid X, \alpha, \Pi)$

Problem

$p(Z \mid X, \alpha, \Pi)$ is not tractable (no conditional independence)

Approximations

◮ Gibbs: Nowicki and Snijders (2001)
◮ VEM: Daudin et al. (2008)
◮ VBEM: Latouche et al. (2012)
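To see why the sum over $Z$ is the bottleneck, the observed-data log-likelihood can be computed by brute force for a very small graph. The sketch below enumerates all $K^N$ labelings of a binary SBM, assuming directed edges without self-loops; the function name and the toy inputs are hypothetical. It is only usable for a handful of vertices, which is the point.

```python
import itertools
import math

def log_marginal_likelihood(x, alpha, pi):
    """log p(X | alpha, Pi) by explicit summation over all K^N labelings."""
    n, k_classes = len(x), len(alpha)
    log_terms = []
    for z in itertools.product(range(k_classes), repeat=n):
        # log p(Z | alpha)
        lp = sum(math.log(alpha[k]) for k in z)
        # log p(X | Z, Pi): edges are independent given Z
        for i in range(n):
            for j in range(n):
                if i != j:
                    p = pi[z[i]][z[j]]
                    lp += math.log(p if x[i][j] else 1.0 - p)
        log_terms.append(lp)
    # log-sum-exp for numerical stability
    top = max(log_terms)
    total = top + math.log(sum(math.exp(t - top) for t in log_terms))
    return total, len(log_terms)

# Toy directed graph on 4 vertices with K = 2 classes (made-up values)
x = [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]]
alpha = [0.7, 0.3]
pi = [[0.5, 0.5],
      [0.5, 0.5]]
loglik, n_terms = log_marginal_likelihood(x, alpha, pi)
```

With every $\pi_{kl} = 0.5$ the result does not depend on the labels, so `loglik` equals $12 \log 0.5$ (12 ordered pairs), which gives a quick sanity check; `n_terms` is $2^4 = 16$ here and grows as $K^N$.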

SLIDE 23

Model selection

Criteria

Since $\log p(X \mid \alpha, \Pi)$ is not tractable, we cannot rely on:

◮ $\mathrm{AIC} = \log p(X \mid \hat{\alpha}, \hat{\Pi}) - C$
◮ $\mathrm{BIC} = \log p(X \mid \hat{\alpha}, \hat{\Pi}) - \frac{C}{2} \log \frac{N(N-1)}{2}$

ICL

Biernacki et al. (2000) ↪ Daudin et al. (2008)

Variational Bayes EM ↪ ILvb

Latouche et al. (2012)

Exact ICL ↪ ICLex

Côme and Latouche (2013)