The condensation threshold in stochastic block models
Joe Neeman (with Jess Banks, Cris Moore, Praneeth Netrapalli)
Austin, May 9, 2016
Stochastic block model $G(n, k, a, b)$
1. $n$ nodes, $k$ colors, about $n/k$ nodes of each color
2. connect $u$ to $v$ with probability $a/n$ if they have the same color, $b/n$ if they have different colors
Problem I: detecting
Given the (uncolored) graph, recover the colors (up to permutation) better than a random guess.
Definition. Let $\sigma_v \in \{1, \dots, k\}$ be the color of $v$. For another coloring $\tau$,
$$\mathrm{Olap}(\sigma, \tau) = \max_{\pi} \frac{\#\{v \in V : \sigma_v = \pi(\tau_v)\}}{n} - \frac{1}{k},$$
where the max is over all permutations $\pi$ on $\{1, \dots, k\}$.
Definition. $(G_n, \sigma_n) \sim G(n, k, a, b)$ is detectable if there exist $\epsilon > 0$ and maps $A_n : \{\text{graphs}\} \to \{\text{labellings}\}$ such that
$$\liminf_{n \to \infty} \Pr\big(\mathrm{Olap}(\sigma_n, A_n(G_n)) > \epsilon\big) > \epsilon.$$
Otherwise it is undetectable.
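The overlap in the definition can be computed directly for small $k$. The following is a minimal sketch, assuming colors are encoded as integers 0, ..., k-1 (the slide uses 1, ..., k) and that $k$ is small enough to try all $k!$ permutations; the function name `overlap` is illustrative.

```python
from itertools import permutations


def overlap(sigma, tau, k):
    """Olap(sigma, tau): best agreement over relabellings of tau, minus 1/k."""
    n = len(sigma)
    best = 0
    # Try every permutation pi of the k colors and count vertices where
    # sigma_v equals pi(tau_v).
    for pi in permutations(range(k)):
        agree = sum(1 for s, t in zip(sigma, tau) if s == pi[t])
        best = max(best, agree)
    return best / n - 1 / k
```

A labelling that matches the truth up to a renaming of the colors gets overlap $1 - 1/k$, while a constant labelling of a balanced coloring gets overlap 0.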
Problem II: distinguishing
Given the (uncolored) graph, did it come from $G(n, k, a, b)$ or $G(n, \frac{d}{n})$, where $d = \frac{a + (k-1)b}{k}$?
Definition. Sequences $P_n$ and $Q_n$ of probability measures are
- contiguous if $P_n(A_n) \to 0$ iff $Q_n(A_n) \to 0$
- orthogonal if $\exists A_n$ with $P_n(A_n) \to 0$ and $Q_n(A_n) \to 1$.
Say that $G(n, k, a, b)$ is
- distinguishable if it is orthogonal to $G(n, \frac{d}{n})$
- indistinguishable if it is contiguous with $G(n, \frac{d}{n})$
Better parametrization
- $a/n$ = within-block edge probability
- $b/n$ = between-block edge probability
- $k$ = number of blocks
$$d = \frac{a + (k-1)b}{k}, \qquad \lambda = \frac{a - b}{a + (k-1)b}$$
Note $\lambda \in \left[-\frac{1}{k-1}, 1\right]$.
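The change of variables can be sketched as a pair of helper functions. The forward map is the definition on the slide; the inverse map ($a = d(1 + (k-1)\lambda)$, $b = d(1 - \lambda)$) is not on the slide and is obtained here by solving those two equations. The function names are illustrative.

```python
def to_d_lambda(k, a, b):
    """Map (a, b) to (d, lambda) using the slide's definitions."""
    d = (a + (k - 1) * b) / k
    lam = (a - b) / (a + (k - 1) * b)
    return d, lam


def to_a_b(k, d, lam):
    """Inverse map, derived by solving the two definitions for a and b."""
    a = d * (1 + (k - 1) * lam)
    b = d * (1 - lam)
    return a, b
```

The two endpoints of the range of $\lambda$ correspond to $a = 0$ (giving $\lambda = -1/(k-1)$) and $b = 0$ (giving $\lambda = 1$).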
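Phase diagram for k = 2
[Figure: phase diagram in $d$ (ticks 5 to 20) and $\lambda$ (ticks −1 to 1). The curve $\lambda^2 d = 1$ separates the "detectable, distinguishable" region from the "undetectable, indistinguishable" region.]
(Mossel/N/Sly, Massoulié)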
Conjectured phase diagram for k = 20
[Figure: conjectured phase diagram in $d$ (ticks 200 to 1000) and $\lambda$ (ticks 0 to 1), with the curve $\lambda^2 d = 1$ and three regions: "detectable, distinguishable"; "detectable but hard, distinguishable"; "undetectable, indistinguishable".]
(Decelle, Krzakala, Moore, Zdeborova)
What we know for k = 20
[Figure: diagram in $d$ (ticks 200 to 1000) and $\lambda$ (ticks 0 to 1) with three regions:]
- detectable (quickly), distinguishable (Bordenave/Lelarge/Massoulié, Abbe/Sandon)
- detectable, distinguishable (Abbe/Sandon, this work)
- undetectable, indistinguishable (this work)
Theorem (Banks/Moore/N/Netrapalli)
$$d_+ = \frac{2k \log k}{(1 + (k-1)\lambda)\log(1 + (k-1)\lambda) + (k-1)(1-\lambda)\log(1-\lambda)}, \qquad d_- = \frac{2\log(k-1)}{\lambda^2 (k-1)}$$
- $d > d_+$ implies detectability, distinguishability.
- $d < d_-$ implies undetectability, indistinguishability.
If $k$ is large enough then there are $\lambda$ such that $d_+ < \frac{1}{\lambda^2}$, giving the yellow region.
$$\lim_{k \to \infty} \frac{d_+}{d_-} = \frac{\mu^2}{(1+\mu)\log(1+\mu) - \mu}, \quad \text{where } \mu = \frac{a - b}{d}.$$
If $\mu \approx \pm 1$ and $\lim_{k \to \infty} \frac{d_+}{d_-} \approx 1$ (planted coloring / giant)
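The two thresholds are easy to evaluate numerically. The sketch below transcribes the formulas for $d_+$ and $d_-$ from the theorem, assuming natural logarithms and $\lambda \neq 0$; the helper names (`xlogx`, `d_plus`, `d_minus`, `d_on_curve`) are illustrative, and the convention $0 \log 0 = 0$ is an assumption made here to handle the endpoints of the range of $\lambda$.

```python
import math


def xlogx(x):
    """x * log(x), with 0 * log(0) taken as 0 (rounding below 0 is clamped)."""
    return 0.0 if x <= 0.0 else x * math.log(x)


def d_plus(k, lam):
    """Upper threshold d_+ from the theorem (requires lam != 0)."""
    denom = xlogx(1 + (k - 1) * lam) + (k - 1) * xlogx(1 - lam)
    return 2 * k * math.log(k) / denom


def d_minus(k, lam):
    """Lower threshold d_- from the theorem (requires lam != 0)."""
    return 2 * math.log(k - 1) / (lam ** 2 * (k - 1))


def d_on_curve(lam):
    """Value of d on the curve lambda^2 * d = 1."""
    return 1 / lam ** 2
```

For example, with $k = 20$ and $\lambda = 0.1$ this gives $d_- \approx 31$ and $d_+ \approx 93$, both below $1/\lambda^2 = 100$, so this parameter choice has $d_+ < 1/\lambda^2$ as described in the theorem.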
The proofs
[Figure: the k = 20 diagram of $d$ against $\lambda$, repeated.]
Detecting/distinguishing inefficiently
Consider partitions of $G$ into $k$ equal parts. A partition is good if its average in-degree is $\approx \frac{a}{k}$ and its average out-degree is $\approx \frac{(k-1)b}{k}$. For suitable $a, b, k$, w.h.p.
- $G(n, k, a, b)$: all good partitions are correlated with the truth.
- $G(n, \frac{d}{n})$: there are no good partitions.
Proof: concentration + union bound.
Distinguishing: check if there is a good partition.
Detecting: find a good partition.
Abbe/Sandon improved this for small $d$ by taking the giant component and pruning trees.
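The notion of a good partition can be illustrated on a small sampled graph. This is a sketch under stated assumptions, not the procedure from the talk: colors are assigned in exactly balanced fashion (the model only says "about $n/k$"), the tolerance `tol` standing in for "$\approx$" is hypothetical, and the sketch only checks a given partition instead of searching over all of them.

```python
import random


def sample_sbm(n, k, a, b, rng):
    """Sample (edges, colors) from G(n, k, a, b) with balanced colors."""
    colors = [v % k for v in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = a / n if colors[u] == colors[v] else b / n
            if rng.random() < p:
                edges.append((u, v))
    return edges, colors


def avg_in_out_degree(edges, partition):
    """Average number of neighbours inside / outside a vertex's own part."""
    n = len(partition)
    within = sum(1 for u, v in edges if partition[u] == partition[v])
    between = len(edges) - within
    # Each edge contributes to the degree of both endpoints.
    return 2 * within / n, 2 * between / n


def is_good(edges, partition, k, a, b, tol):
    """Check in-degree ~ a/k and out-degree ~ (k-1)b/k up to tol (hypothetical)."""
    d_in, d_out = avg_in_out_degree(edges, partition)
    return abs(d_in - a / k) <= tol and abs(d_out - (k - 1) * b / k) <= tol
```

With, say, $n = 600$, $k = 3$, $a = 30$, $b = 3$, the planted partition passes this check with `tol=1.0`, while a random reshuffling of the same labels does not, because its in-degree is far below $a/k$.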
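Indistinguishability
If $\mathbb{E}_{Q_n}\left(\frac{dP_n}{dQ_n}\right)^2 \to C < \infty$ then $Q_n(A_n) \to 0 \Rightarrow P_n(A_n) \to 0$. Set $P_n = G(n, k, a, b)$ and $Q_n = G(n, \frac{d}{n})$. Then
$$\frac{dP_n}{dQ_n} = k^{-n} \sum_{\sigma} \frac{\prod_{E} \frac{a \text{ or } b}{n} \prod_{E^c} \left(1 - \frac{a \text{ or } b}{n}\right)}{\prod_{E} \frac{d}{n} \prod_{E^c} \left(1 - \frac{d}{n}\right)}$$
$$\left(\frac{dP_n}{dQ_n}\right)^2 = k^{-2n} \sum_{\sigma, \tau} \prod_{E} \frac{(a \text{ or } b)(a \text{ or } b)}{d^2} \prod_{E^c} \frac{\left(1 - \frac{a \text{ or } b}{n}\right)\left(1 - \frac{a \text{ or } b}{n}\right)}{\left(1 - \frac{d}{n}\right)^2}$$
Here "$a$ or $b$" means $a$ if the endpoints of the pair have the same color under the labelling ($\sigma$ or $\tau$), and $b$ otherwise.
Under $Q_n$, the events $(u, v) \in E$ are all independent, so can compute:
$$\mathbb{E}_{Q_n}\left(\frac{dP_n}{dQ_n}\right)^2 = C(1 + o(1))\, \mathbb{E} \exp(X^T B X),$$
where $X$ is a multinomial vector of length $k^2$.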
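The first implication on this slide (bounded second moment gives $Q_n(A_n) \to 0 \Rightarrow P_n(A_n) \to 0$) can be seen from the Cauchy-Schwarz inequality. The step below is a standard argument spelled out here for convenience; it is not taken from the slides.

$$P_n(A_n) = \mathbb{E}_{Q_n}\left[\frac{dP_n}{dQ_n}\, \mathbf{1}_{A_n}\right] \le \sqrt{\mathbb{E}_{Q_n}\left(\frac{dP_n}{dQ_n}\right)^2}\; \sqrt{Q_n(A_n)}$$

If the second moment stays bounded and $Q_n(A_n) \to 0$, the right-hand side tends to 0, and so does $P_n(A_n)$.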
Indistinguishability
Replacing multinomials with Gaussians,
$$\mathbb{E}_{Q_n}\left(\frac{dP_n}{dQ_n}\right)^2 \to C\, \mathbb{E} \exp(Z^T B Z) = \psi(\lambda^2 d)^{k-1}, \quad \text{where } \psi(x) = \frac{e^{-x/2 - x^2/4}}{\sqrt{1 - x}}.$$
Finite whenever $\lambda^2 d < 1$.
multinomials ↔ Gaussians ⇔ $\exp(X^T B X)$ uniformly integrable ⇔ $x^T B x - nH(x)$ maximized at $x = \mathbb{E}X$, where $H(x)$ is some kind of multivariate entropy.
Achlioptas-Naor: sufficient condition for the maximum to be at $x = \mathbb{E}X$. (They were studying planted colorings.)
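The limiting expression is simple to evaluate. The sketch below transcribes $\psi$ and the expression $\psi(\lambda^2 d)^{k-1}$ as written on the slide; it only evaluates the formula and does not check the uniform-integrability condition under which the limit holds. Returning infinity for $x \ge 1$ is a convention chosen here, and the function names are illustrative.

```python
import math


def psi(x):
    """psi(x) = exp(-x/2 - x^2/4) / sqrt(1 - x); infinite for x >= 1."""
    if x >= 1:
        return math.inf
    return math.exp(-x / 2 - x * x / 4) / math.sqrt(1 - x)


def second_moment_limit(k, d, lam):
    """The slide's expression psi(lambda^2 d)^(k-1)."""
    return psi(lam ** 2 * d) ** (k - 1)
```

The value is finite exactly when $\lambda^2 d < 1$, and it blows up as $\lambda^2 d$ approaches 1 from below.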
Indistinguishability
For the other direction ($P_n(A_n) \to 0 \Rightarrow Q_n(A_n) \to 0$), want to show $\frac{dP_n}{dQ_n}$ bounded away from zero.
Small subgraph conditioning (Robinson/Wormald): $\frac{dP_n}{dQ_n}$ is essentially a function of the number of short cycles; it converges to an explicit limiting random variable that is never zero.
Main thing to check: convergence of second moment.
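Undetectability
Suffices to show that the distribution of $G$ is not much affected by conditioning on $\sigma_u, \sigma_v$. Let $P_{1,n}, P_{2,n}$ be $P_n$ conditioned on two (possibly different) labellings of $u, v$.
$$d_{TV}(P_{1,n}, P_{2,n}) \to 0 \;\Leftrightarrow\; \mathbb{E}_{Q_n}\left|\frac{dP_{1,n}}{dQ_n} - \frac{dP_{2,n}}{dQ_n}\right| \to 0$$
$$\Leftarrow\; \mathbb{E}_{Q_n}\left(\frac{dP_{1,n}}{dQ_n} - \frac{dP_{2,n}}{dQ_n}\right)^2 \to 0 \;\Leftarrow\; \mathbb{E}_{Q_n}\, \frac{dP_{i,n}}{dQ_n} \frac{dP_{j,n}}{dQ_n} \to \psi(\lambda^2 d)^{k-1}.$$
Similar to previous second moment computation.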
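The chain of implications can be unpacked with two standard facts, stated here as assumptions of this sketch and not taken from the slide: total variation is half the $L^1$ distance between densities, and $\mathbb{E}|Y| \le \sqrt{\mathbb{E}Y^2}$. Writing $L_i = \frac{dP_{i,n}}{dQ_n}$ for $i \in \{1, 2\}$:

$$d_{TV}(P_{1,n}, P_{2,n}) = \tfrac{1}{2}\, \mathbb{E}_{Q_n}|L_1 - L_2| \le \tfrac{1}{2} \sqrt{\mathbb{E}_{Q_n}(L_1 - L_2)^2}$$

$$\mathbb{E}_{Q_n}(L_1 - L_2)^2 = \mathbb{E}_{Q_n} L_1^2 - 2\, \mathbb{E}_{Q_n} L_1 L_2 + \mathbb{E}_{Q_n} L_2^2$$

If all three expectations $\mathbb{E}_{Q_n} L_i L_j$ converge to the same limit $\psi(\lambda^2 d)^{k-1}$, the right-hand side of the second line tends to $\psi - 2\psi + \psi = 0$, which is why convergence of the mixed second moments suffices.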
Summary
Indistinguishability and undetectability follow from an explicit second moment calculation. Use Achlioptas-Naor to estimate the set of parameters where the second moment is finite.
Thank you!