SLIDE 1


Bayesian community detection Results Conclusion References

JSM19 – Novel Approaches for Analyzing Dynamic Networks

Bayesian estimation of the latent dimension and communities in stochastic blockmodels

Francesco Sanna Passino, Nick Heard

Department of Mathematics, Imperial College London
francesco.sanna-passino16@imperial.ac.uk
July 30, 2019

SLIDE 2


Stochastic blockmodels as random dot product graphs

Consider an undirected graph with symmetric adjacency matrix $A \in \{0, 1\}^{n \times n}$. In random dot product graphs, the probability of a link between two nodes is expressed as the inner product between two latent positions $x_i, x_j \in F$, where $0 \le x^\top y \le 1$ for all $x, y \in F$:

$$P(A_{ij} = 1) = x_i^\top x_j.$$

The stochastic blockmodel is the classical model for community detection in graphs. Given a matrix $B \in [0, 1]^{K \times K}$ of within- and between-community link probabilities, the probability of a link depends on the community allocations $z_i, z_j \in \{1, \dots, K\}$ of the two nodes:

$$P(A_{ij} = 1) = B_{z_i z_j}.$$

The stochastic blockmodel can be interpreted as a special case of a random dot product graph. If $B_{kh} = \mu_k^\top \mu_h$ with $\mu_k, \mu_h \in F$, and all the nodes in community $k$ are assigned the latent position $\mu_k$, then:

$$P(A_{ij} = 1) = \mu_{z_i}^\top \mu_{z_j}, \quad i < j, \quad A_{ij} = A_{ji}.$$
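The construction above can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function name `sample_sbm_as_rdpg`, the seed and the 0-based community labels are assumptions made for illustration, and the community positions are the five means quoted later in the talk for Figure 5.

```python
import numpy as np

def sample_sbm_as_rdpg(mu, z, rng):
    """Draw a symmetric adjacency matrix with P(A_ij = 1) = mu[z_i] @ mu[z_j]."""
    X = mu[z]                       # n x d: node i gets the position of its community
    P = X @ X.T                     # n x n matrix of link probabilities
    if P.min() < 0 or P.max() > 1:
        raise ValueError("inner products must lie in [0, 1]")
    n = len(z)
    upper = np.triu(rng.random((n, n)) < P, k=1)   # independent draws for i < j
    A = (upper | upper.T).astype(int)              # A_ij = A_ji, no self-loops
    return A, P

rng = np.random.default_rng(0)      # arbitrary seed (assumption)
mu = np.array([[0.7, 0.4], [0.1, 0.1], [0.4, 0.8], [-0.1, 0.5], [0.3, 0.5]])
z = rng.integers(0, 5, size=500)    # uniform allocations, 0-based labels
A, P = sample_sbm_as_rdpg(mu, z, rng)
B = mu @ mu.T                       # implied K x K block probability matrix
```

Each entry of `P` equals the entry of `B` indexed by the two nodes' communities, which is exactly the sense in which the blockmodel is a special case of the random dot product graph.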

SLIDE 3


Network embeddings

Consider an undirected graph with symmetric adjacency matrix $A \in \{0, 1\}^{n \times n}$, and modified Laplacian $L = D^{-1/2} A D^{-1/2}$, $D = \mathrm{diag}\left(\sum_{i=1}^{n} A_{ij}\right)$.

The adjacency embedding of $A$ in $\mathbb{R}^d$ is:

$$\hat{X} = [\hat{x}_1, \dots, \hat{x}_n]^\top = \hat{\Gamma} \hat{\Lambda}^{1/2} \in \mathbb{R}^{n \times d},$$

where $\hat{\Lambda}$ is a $d \times d$ diagonal matrix containing the $d$ largest eigenvalues of $A$, and $\hat{\Gamma}$ is an $n \times d$ matrix containing the corresponding orthonormal eigenvectors.

The Laplacian embedding of $A$ in $\mathbb{R}^d$ is:

$$\tilde{X} = [\tilde{x}_1, \dots, \tilde{x}_n]^\top = \tilde{\Gamma} \tilde{\Lambda}^{1/2} \in \mathbb{R}^{n \times d},$$

where $\tilde{\Lambda}$ is a $d \times d$ diagonal matrix containing the $d$ largest eigenvalues of $L$, and $\tilde{\Gamma}$ is an $n \times d$ matrix containing the corresponding orthonormal eigenvectors.
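Both embeddings can be computed with a symmetric eigendecomposition. The sketch below is illustrative only: the function names are hypothetical, taking the absolute value before the square root is an assumption that guards against a negative eigenvalue among the top $d$, and giving isolated nodes a zero row in the Laplacian is also an assumption.

```python
import numpy as np

def top_eigen_embedding(M, d):
    """Rows of Gamma |Lambda|^{1/2} for the d largest eigenvalues of symmetric M."""
    vals, vecs = np.linalg.eigh(M.astype(float))   # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:d]               # indices of the d largest
    # abs() is a guard for negative eigenvalues (assumption, not in the slides)
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

def adjacency_embedding(A, d):
    return top_eigen_embedding(A, d)

def laplacian_embedding(A, d):
    deg = A.sum(axis=0).astype(float)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5       # isolated nodes keep a zero row
    L = inv_sqrt[:, None] * A * inv_sqrt[None, :]  # D^{-1/2} A D^{-1/2}
    return top_eigen_embedding(L, d)

# Toy undirected graph (arbitrary size, density and seed).
rng = np.random.default_rng(0)
upper = np.triu(rng.random((60, 60)) < 0.3, k=1)
A = (upper | upper.T).astype(int)
X_hat = adjacency_embedding(A, d=2)
X_tilde = laplacian_embedding(A, d=2)
```

Because the eigenvectors are orthonormal, the columns of each embedding are orthogonal and their squared norms equal the absolute values of the selected eigenvalues.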

SLIDE 4


Spectral estimation of the stochastic blockmodel

Based on asymptotic properties, Rubin-Delanchy et al., 2017, propose the following algorithm for consistent estimation of the latent positions in stochastic blockmodels:

Algorithm 1: Spectral estimation of the stochastic blockmodel (spectral clustering)
Input: adjacency matrix $A$ (or the Laplacian matrix $L$), dimension $d$, and number of communities $K \ge d$.
1. Compute the spectral embedding $\hat{X} = [\hat{x}_1, \dots, \hat{x}_n]^\top$ or $\tilde{X} = [\tilde{x}_1, \dots, \tilde{x}_n]^\top$ into $\mathbb{R}^d$.
2. Fit a Gaussian mixture model with $K$ components.
Result: cluster centres $\mu_1, \dots, \mu_K \in \mathbb{R}^d$ and node memberships $z_1, \dots, z_n$.

In practice: $d$ and $K$ are estimated sequentially. Issues:

- Sequential approach is sub-optimal: the estimate of $K$ depends on the choice of $d$.
- Theoretical results only hold for $d$ fixed and known.
- Distributional assumptions when $d$ is misspecified are not available.

This talk discusses a novel framework for joint estimation of d and K.
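For reference, the two steps of Algorithm 1 can be sketched in plain NumPy for fixed $d$ and $K$. This is a rough illustration rather than the implementation used by Rubin-Delanchy et al.: the EM routine, its farthest-first initialisation, the regularisation constant `reg` and the toy block matrix `B` are all assumptions made here.

```python
import numpy as np

def spectral_embedding(M, d):
    """Step 1: rows of Gamma |Lambda|^{1/2} for the d largest eigenvalues of M."""
    vals, vecs = np.linalg.eigh(M.astype(float))
    top = np.argsort(vals)[::-1][:d]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

def log_gauss(X, mean, cov):
    """Row-wise log density of N(mean, cov)."""
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (X.shape[1] * np.log(2 * np.pi) + logdet + maha)

def fit_gmm(X, K, n_iter=200, reg=1e-6):
    """Step 2: EM for a K-component Gaussian mixture with full covariances."""
    n, d = X.shape
    # Farthest-first initialisation of the centres (a choice made for this sketch).
    centres = [X[0]]
    for _ in range(K - 1):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centres], axis=0)
        centres.append(X[np.argmax(dist)])
    means = np.array(centres)
    covs = np.array([np.atleast_2d(np.cov(X.T)) + reg * np.eye(d)] * K)
    weights = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: posterior membership probabilities, computed on the log scale.
        logp = np.stack([np.log(weights[k]) + log_gauss(X, means[k], covs[k])
                         for k in range(K)], axis=1)
        resp = np.exp(logp - logp.max(axis=1, keepdims=True))
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update proportions, means and covariances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        for k in range(K):
            diff = X - means[k]
            covs[k] = (resp[:, k, None] * diff).T @ diff / nk[k] + reg * np.eye(d)
    return means, resp.argmax(axis=1)

# Toy two-community blockmodel (hypothetical B, arbitrary seed).
rng = np.random.default_rng(0)
B = np.array([[0.5, 0.1], [0.1, 0.5]])
z = rng.integers(0, 2, size=400)
upper = np.triu(rng.random((400, 400)) < B[np.ix_(z, z)], k=1)
A = (upper | upper.T).astype(int)
X_hat = spectral_embedding(A, d=2)
centres, z_hat = fit_gmm(X_hat, K=2)
```

Note that `d` and `K` are passed in as fixed inputs, which is precisely the limitation listed under "Issues": nothing in this pipeline estimates them jointly.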

SLIDE 5


A Bayesian model for network embeddings

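Choose integer $m \le n$ and obtain embedding $X \in \mathbb{R}^{n \times m}$ → $m$ arbitrarily large. Bayesian model for simultaneous estimation of $d$ and $K$ → allow for $d = \mathrm{rank}(B) \le K$.

$$
\begin{aligned}
x_i \mid d, z_i, \mu_{z_i}, \Sigma_{z_i}, \sigma^2_{z_i} &\sim N_m\left( \begin{bmatrix} \mu_{z_i} \\ 0_{m-d} \end{bmatrix}, \begin{bmatrix} \Sigma_{z_i} & 0 \\ 0 & \sigma^2_{z_i} I_{m-d} \end{bmatrix} \right), & i &= 1, \dots, n, \\
(\mu_k, \Sigma_k) \mid d &\overset{iid}{\sim} \mathrm{NIW}_d(0, \kappa_0, \nu_0 + d - 1, \Delta_d), & k &= 1, \dots, K, \\
\sigma^2_{kj} &\overset{iid}{\sim} \text{Inv-}\chi^2(\lambda_0, \sigma^2_0), & j &= d + 1, \dots, m, \\
d \mid z &\sim \mathrm{Uniform}\{1, \dots, K_\varnothing\}, \\
z_i \mid \theta &\overset{iid}{\sim} \mathrm{Multinoulli}(\theta), & i &= 1, \dots, n, \quad \theta \in S^{K-1}, \\
\theta \mid K &\sim \mathrm{Dirichlet}\left(\frac{\alpha}{K}, \dots, \frac{\alpha}{K}\right), \\
K &\sim \mathrm{Geometric}(\omega),
\end{aligned}
$$

where $K_\varnothing$ is the number of non-empty communities. Alternative: $d \sim \mathrm{Geometric}(\delta)$. Yang et al., 2019, independently proposed a similar frequentist model.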
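To make the first line of the model concrete, the sketch below evaluates the Gaussian log-likelihood of an embedding for given $d$, allocations and parameters, exploiting the block structure: a full Gaussian on the first $d$ columns and independent zero-mean Gaussians on the remaining $m - d$. It is an illustration under assumptions, not the authors' sampler: the function name is hypothetical, labels are 0-based, $\sigma^2_k$ is read as a vector of per-column variances $(\sigma^2_{k,d+1}, \dots, \sigma^2_{km})$, and all numerical values are made up.

```python
import numpy as np

def log_likelihood(X, d, z, mu, Sigma, sigma2):
    """Sum over i of log N_m(x_i | [mu_k, 0], blockdiag(Sigma_k, diag(sigma2_k))), k = z_i.

    mu: K x d, Sigma: K x d x d, sigma2: K x m (only columns d..m-1 are used).
    """
    total = 0.0
    for k in np.unique(z):
        Xk = X[z == k]
        # First d columns: community-specific mean and full covariance.
        diff = Xk[:, :d] - mu[k]
        _, logdet = np.linalg.slogdet(Sigma[k])
        maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(Sigma[k]), diff)
        total += np.sum(-0.5 * (d * np.log(2 * np.pi) + logdet + maha))
        # Remaining m - d columns: zero mean, independent, variance sigma2[k, j].
        tail = Xk[:, d:]
        s2 = sigma2[k, d:]
        total += np.sum(-0.5 * (np.log(2 * np.pi * s2) + tail ** 2 / s2))
    return total

# Made-up inputs purely to exercise the function.
rng = np.random.default_rng(0)
n, m, d, K = 50, 6, 2, 2
z = rng.integers(0, K, size=n)
mu = np.array([[0.7, 0.4], [0.1, 0.1]])
Sigma = np.array([[[0.02, 0.005], [0.005, 0.03]],
                  [[0.01, 0.0], [0.0, 0.01]]])
sigma2 = np.full((K, m), 0.05)
X = rng.normal(size=(n, m))
ll = log_likelihood(X, d, z, mu, Sigma, sigma2)
```

The block-diagonal covariance is what keeps the cost low: only a $d \times d$ matrix is inverted per community, however large $m$ is chosen.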

SLIDE 6


Empirical model validation

Figure 1. Scatterplot of the simulated $X_1$ and $X_2$, i.e. $X_{:d}$.

Figure 2. Scatterplot of the simulated $X_3$ and $X_4$.

Simulated GRDPG-SBM with n = 2500, d = 2, K = 5. Nodes allocated to communities with probability θk = P(zi = k) = 1/K.

SLIDE 7


Empirical model validation

Figure 3. Within-cluster and overall means of $X_{:15}$. Horizontal axis: dimension. Series: overall mean, within-cluster mean, max/min within-cluster mean.

Figure 4. Within-cluster correlation coefficients of $X_{:30}$. Horizontal axis: correlation coefficient $\rho^{(k)}_{ij}$. Series: $\rho^{(k)}_{ij}$ for $X_{:d}$, and histogram of $\rho^{(k)}_{ij}$ for $X_{d:}$.

Means are approximately 0 for columns with index $> d$. Reasonable to assume correlation $\rho^{(k)}_{ij} = 0$ for $i, j > d$.

SLIDE 8


Curse of dimensionality

Figure 5. Within-block variance and total variance for the adjacency embedding obtained from a simulated SBM with $d = 2$, $K = 5$, $n = 500$, and well separated means $\mu_1 = [0.7, 0.4]$, $\mu_2 = [0.1, 0.1]$, $\mu_3 = [0.4, 0.8]$, $\mu_4 = [-0.1, 0.5]$ and $\mu_5 = [0.3, 0.5]$, and $\theta = (0.2, 0.2, 0.2, 0.2, 0.2)$. Axes: variance against dimension. Series: within-cluster variance, total variance.

For some $k$ and $k'$: $\sigma^2_{kj} \approx \sigma^2_{k'j}$ for $j \gg d$ and $k \ne k'$.

SLIDE 9


Second order clustering

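Bayesian model parsimony: $K$ underestimated for $d \ll m$. Possible solution: second order clustering $v = (v_1, \dots, v_K)$ with $v_k \in \{1, \dots, H\}$. If $v_k = v_{k'}$, then $\sigma^2_{kj} = \sigma^2_{k'j}$ for $j > d$:

$$
\begin{aligned}
x_i \mid d, z_i, v_{z_i}, \mu_{z_i}, \Sigma_{z_i}, \sigma^2_{v_{z_i}} &\sim N_m\left( \begin{bmatrix} \mu_{z_i} \\ 0_{m-d} \end{bmatrix}, \begin{bmatrix} \Sigma_{z_i} & 0 \\ 0 & \sigma^2_{v_{z_i}} I_{m-d} \end{bmatrix} \right), & i &= 1, \dots, n, \\
v_k \mid K, H &\sim \mathrm{Multinoulli}(\phi), & k &= 1, \dots, K, \\
\phi \mid H &\sim \mathrm{Dirichlet}\left(\frac{\beta}{H}, \dots, \frac{\beta}{H}\right), \\
H \mid K &\sim \mathrm{Uniform}\{1, \dots, K\}.
\end{aligned}
$$

The parameter $v_k$ defines clusters of clusters. Empirical results show that the model is able to handle $d \ll m$.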
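The second order hierarchy can be simulated forwards to see how communities end up sharing variances. The sketch below is only an illustration: the sizes `K`, `m`, `d` and the hyperparameters `beta`, `lambda0`, `sigma0_sq` are hypothetical values, labels are 0-based, and the scaled inverse chi-squared draw uses the standard identity that $\lambda_0 \sigma_0^2 / Y$ has that distribution when $Y \sim \chi^2_{\lambda_0}$.

```python
import numpy as np

rng = np.random.default_rng(1)            # arbitrary seed (assumption)
K, m, d = 5, 8, 3                         # hypothetical sizes
beta, lambda0, sigma0_sq = 5.0, 3.0, 0.01 # hypothetical hyperparameters

H = int(rng.integers(1, K + 1))                 # H | K ~ Uniform{1, ..., K}
phi = rng.dirichlet(np.full(H, beta / H))       # phi | H ~ Dirichlet(beta/H, ..., beta/H)
v = rng.choice(H, size=K, p=phi)                # v_k ~ Multinoulli(phi), 0-based labels

# One variance vector per second order cluster, for columns d+1, ..., m.
group_sigma2 = lambda0 * sigma0_sq / rng.chisquare(lambda0, size=(H, m - d))

# Community k inherits the variances of its second order cluster v_k.
sigma2 = group_sigma2[v]                        # shape K x (m - d)
```

Communities with the same value of `v` have identical rows in `sigma2`, so at most `H` distinct variance vectors have to be estimated on the last $m - d$ columns instead of `K`.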

SLIDE 10


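Diagram: 3 = latent dimension.

Community-specific components: $N_3(\mu_1, \Sigma_1)$, $N_3(\mu_2, \Sigma_2)$, $N_3(\mu_3, \Sigma_3)$, $N_3(\mu_4, \Sigma_4)$, $N_3(\mu_5, \Sigma_5)$.

Shared components: $N_8(0_8, \sigma^2_1 I_5)$, $N_8(0_8, \sigma^2_2 I_5)$, $N_8(0_8, \sigma^2_3 I_5)$.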

SLIDE 11


Extension to directed and bipartite graphs

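Consider a directed graph with adjacency matrix $A \in \{0, 1\}^{n \times n}$. The $d$-dimensional adjacency embedding of $A$ in $\mathbb{R}^{2d}$ is defined as:

$$\hat{X} = \hat{U} \hat{D}^{1/2} \oplus \hat{V} \hat{D}^{1/2} = \begin{bmatrix} \hat{U} \hat{D}^{1/2} & \hat{V} \hat{D}^{1/2} \end{bmatrix} = \begin{bmatrix} \hat{X}_s & \hat{X}_r \end{bmatrix},$$

where $A = \hat{U} \hat{D} \hat{V}^\top + \hat{U}_\perp \hat{D}_\perp \hat{V}_\perp^\top$ is the singular value decomposition of $A$, with $\hat{U} \in \mathbb{R}^{n \times d}$, $\hat{D} \in \mathbb{R}_+^{d \times d}$ diagonal, and $\hat{V} \in \mathbb{R}^{n \times d}$. Essentially, only three distributions change:

$$
\begin{aligned}
x_i \mid d, K, z_i &\sim N_{2m}\left( \begin{bmatrix} \mu_{z_i} \\ 0_{m-d} \\ \mu'_{z_i} \\ 0_{m-d} \end{bmatrix}, \begin{bmatrix} \Sigma_{z_i} & & & \\ & \sigma^2_{z_i} I_{m-d} & & \\ & & \Sigma'_{z_i} & \\ & & & \sigma^{2\prime}_{z_i} I_{m-d} \end{bmatrix} \right), \\
(\mu_k, \Sigma_k) \mid d, K &\overset{iid}{\sim} \mathrm{NIW}_d(0, \kappa_0, \nu_0 + d - 1, \Delta_d), & k &= 1, \dots, K, \\
\sigma^2_{kj} \mid d, K &\overset{iid}{\sim} \text{Inv-}\chi^2(\lambda_0, \sigma^2_0), & j &= d + 1, \dots, m.
\end{aligned}
$$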

Co-clustering: different clusters for sources and receivers → bipartite graphs.
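The directed embedding defined above can be computed from a truncated singular value decomposition. The following NumPy sketch is illustrative only; the function name and the toy graph are assumptions, not part of the talk.

```python
import numpy as np

def directed_adjacency_embedding(A, d):
    """Return [X_s, X_r] = [U D^{1/2}, V D^{1/2}] from the top-d SVD of A."""
    U, s, Vt = np.linalg.svd(A.astype(float))   # singular values in descending order
    scale = np.sqrt(s[:d])
    X_s = U[:, :d] * scale                      # source (sender) positions
    X_r = Vt[:d].T * scale                      # receiver positions
    return np.hstack([X_s, X_r])                # n x 2d

# Toy directed graph (arbitrary size, density and seed).
rng = np.random.default_rng(0)
A = (rng.random((40, 40)) < 0.2).astype(int)
np.fill_diagonal(A, 0)
X_hat = directed_adjacency_embedding(A, d=3)
```

Each node receives two $d$-dimensional positions, one as a source and one as a receiver, and `X_s @ X_r.T` is the rank-$d$ approximation of `A`. Clustering the two halves separately corresponds to the co-clustering mentioned above.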

SLIDE 12


Enron e-mail network: number of clusters

Figure 6. Posterior histogram of $K_\varnothing$ and $H_\varnothing$, unconstrained model, MAP for $d$ in red.

Figure 7. Posterior histogram of $K_\varnothing$ and $H_\varnothing$, constrained model, MAP for $d$ in red.

SLIDE 13


Enron e-mail network: number of clusters

Figure 8. Posterior histogram of $K_\varnothing$, unconstrained model without second order clustering, MAP for $d$ in red.

Figure 9. Posterior histogram of $K_\varnothing$, constrained model without second order clustering, MAP for $d$ in red.

SLIDE 14


Enron e-mail network: scree-plot

Figure 10. Singular values of the adjacency matrix.

Choice of d is consistent with the elbow of the scree-plot.

SLIDE 15


Conclusion

Community detection and stochastic blockmodels:

- Bayesian model for simultaneous selection of $K$ and $d$ in generalised random dot product graphs.
- Allows for initial misspecification of the arbitrarily large parameter $m$, then refines the estimate of $d$.
- Gaussian mixture model (with constraints) based on spectral embedding.
- Easy to extend to directed and bipartite graphs.

More details: Sanna Passino and Heard, 2019 – arXiv: 1904.05333.

SLIDE 16


References

Athreya, A. et al. (2018). “Statistical Inference on Random Dot Product Graphs: a Survey”. In: Journal of Machine Learning Research 18.226, pp. 1–92.

Rubin-Delanchy, P. et al. (2017). “A statistical interpretation of spectral embedding: the generalised random dot product graph”. In: arXiv e-prints. arXiv: 1709.05506.

Sanna Passino, F. and N. A. Heard (2019). “Bayesian estimation of the latent dimension and communities in stochastic blockmodels”. In: arXiv e-prints. arXiv: 1904.05333.

Yang, C. et al. (2019). “Simultaneous dimensionality and complexity model selection for spectral graph clustering”. In: arXiv e-prints. arXiv: 1904.02926.
