The condensation threshold in stochastic block models


SLIDE 1

The condensation threshold in stochastic block models

Joe Neeman (with Jess Banks, Cris Moore, Praneeth Netrapalli) Austin, May 9, 2016

SLIDE 2

Stochastic block model G(n, k, a, b)

  • n nodes, k colors, about n/k nodes of each color
  • connect u to v with probability a/n if they have the same color, b/n if different colors
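The two-step sampling rule above is easy to sketch in code. This is an illustrative helper (`sample_sbm` is my name, not from the talk); it assigns colors uniformly at random, which gives about n/k nodes of each color.

```python
import random

def sample_sbm(n, k, a, b, seed=None):
    """Sample (edges, sigma) from G(n, k, a, b): each node gets a uniform
    color in {0, ..., k-1} (so about n/k nodes per color), and u, v are
    joined with probability a/n if they share a color, b/n otherwise."""
    rng = random.Random(seed)
    sigma = [rng.randrange(k) for _ in range(n)]
    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            p = a / n if sigma[u] == sigma[v] else b / n
            if rng.random() < p:
                edges.add((u, v))
    return edges, sigma
```

Since every edge probability is O(1/n), the resulting graph is sparse: the expected number of edges is about nd/2 with d = (a + (k − 1)b)/k.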

SLIDE 6

Problem I: detecting

Given the (uncolored) graph, recover the colors (up to permutation) better than a random guess.

Definition. Let σv ∈ {1, . . . , k} be the color of v. For another coloring τ,

  Olap(σ, τ) = max_π #{v ∈ V : σv = π(τv)} / n − 1/k,

where the max is over all permutations π of {1, . . . , k}.

Definition. (Gn, σn) ∼ G(n, k, a, b) is detectable if there exist ϵ > 0 and maps An : {graphs} → {labellings} such that

  lim inf_{n→∞} Pr(Olap(σn, An(Gn)) > ϵ) > ϵ.

Otherwise it is undetectable.
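A brute-force version of Olap makes the definition concrete. The helper below (`overlap`, my name, not from the talk) tries all k! permutations, so it is only sensible for small k.

```python
from itertools import permutations

def overlap(sigma, tau, k):
    """Olap(sigma, tau) = max_pi #{v : sigma_v = pi(tau_v)} / n - 1/k,
    where pi ranges over all permutations of the k colors."""
    n = len(sigma)
    best = max(
        sum(1 for s, t in zip(sigma, tau) if s == pi[t])
        for pi in permutations(range(k))
    )
    return best / n - 1 / k
```

The −1/k correction is what makes "better than a random guess" precise: a uniformly random τ agrees with σ on about n/k vertices, so its overlap is about 0.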

SLIDE 9

Problem II: distinguishing

Given the (uncolored) graph, did it come from G(n, k, a, b) or G(n, d/n), where d = (a + (k − 1)b)/k?

Definition. Sequences Pn and Qn of probability measures are

  • contiguous if Pn(An) → 0 iff Qn(An) → 0
  • orthogonal if ∃An with Pn(An) → 0 and Qn(An) → 1.

Say that G(n, k, a, b) is

  • distinguishable if it is orthogonal to G(n, d/n)
  • indistinguishable if it is contiguous with G(n, d/n)

SLIDE 10

Better parametrization

  • a/n = within-block edge probability
  • b/n = between-block edge probability
  • k = number of blocks

  d = (a + (k − 1)b) / k,   λ = (a − b) / (a + (k − 1)b)

Note λ ∈ [−1/(k−1), 1].
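The change of variables is a one-liner; the sketch below (`reparametrize` is my name for it) also illustrates the stated range of λ at the two extremes a = 0 and b = 0.

```python
def reparametrize(a, b, k):
    """Map block-model parameters to the average degree d and signal
    strength lambda: d = (a + (k-1)b)/k, lambda = (a - b)/(a + (k-1)b)."""
    total = a + (k - 1) * b
    return total / k, (a - b) / total
```

Setting a = 0 gives λ = −1/(k−1) (the disassortative endpoint, planted coloring), and b = 0 gives λ = 1 (disjoint communities).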

SLIDE 11

Phase diagram for k = 2

[Figure: phase diagram in the (d, λ) plane (d up to 20, λ ∈ [−1, 1]). The curve λ²d = 1 separates the detectable, distinguishable region from the undetectable, indistinguishable region.]

(Mossel/N/Sly, Massoulié)

SLIDE 12

Conjectured phase diagram for k = 20

[Figure: conjectured phase diagram in the (d, λ) plane (d up to 1000, λ ∈ [0, 1]). The curve λ²d = 1 bounds the detectable, distinguishable region; below it lie an undetectable, indistinguishable region and a detectable but hard, distinguishable region.]

(Decelle, Krzakala, Moore, Zdeborová)

SLIDE 13

What we know for k = 20

[Figure: phase diagram in the (d, λ) plane (d up to 1000, λ ∈ [0, 1]) with three regions: detectable (quickly), distinguishable (Bordenave/Lelarge/Massoulié, Abbe/Sandon); detectable, distinguishable (Abbe/Sandon, this work); undetectable, indistinguishable (this work).]

SLIDE 16

Theorem (Banks/Moore/N/Netrapalli)

  d+ = 2k log k / [(1 + (k − 1)λ) log(1 + (k − 1)λ) + (k − 1)(1 − λ) log(1 − λ)]

  d− = 2 log(k − 1) / (λ²(k − 1))

  • d > d+ implies detectability, distinguishability.
  • d < d− implies undetectability, indistinguishability.

If k is large enough then there are λ such that d+ < 1/λ², giving the yellow region.

  lim_{k→∞} d+/d− = µ² / [(1 + µ) log(1 + µ) − µ],  where µ = (a − b)/d.

If µ ≈ ±1 then lim_{k→∞} d+/d− ≈ 1 (planted coloring / giant).
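Both bounds are explicit and easy to evaluate numerically. The sketch below (function names are mine) also checks, at one sample point, the claim that for large k there are λ with d+ < 1/λ², i.e. that the information-theoretic threshold can fall below the Kesten-Stigum line λ²d = 1.

```python
import math

def d_plus(k, lam):
    """d > d_plus(k, lam) implies detectability and distinguishability."""
    num = 2 * k * math.log(k)
    den = ((1 + (k - 1) * lam) * math.log(1 + (k - 1) * lam)
           + (k - 1) * (1 - lam) * math.log(1 - lam))
    return num / den

def d_minus(k, lam):
    """d < d_minus(k, lam) implies undetectability and indistinguishability."""
    return 2 * math.log(k - 1) / (lam ** 2 * (k - 1))
```

For example, at k = 100 and λ = 0.25 one finds d+ ≈ 14.8 < 16 = 1/λ², so average degrees d strictly between d+ and 1/λ² land in the yellow (detectable but conjecturally hard) region.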

SLIDE 17

The proofs

[Figure: the (d, λ) phase diagram again: detectable (quickly), distinguishable (Bordenave/Lelarge/Massoulié, Abbe/Sandon); detectable, distinguishable (Abbe/Sandon, this work); undetectable, indistinguishable (this work).]

SLIDE 20

Detecting/distinguishing inefficiently

Consider partitions of G into k equal parts. A partition is good if its average in-degree is ≈ a/k and its average out-degree is ≈ (k−1)b/k. For suitable a, b, k, w.h.p.

  • G(n, k, a, b): all good partitions are correlated with the truth.
  • G(n, d/n): there are no good partitions.

Proof: concentration + union bound.

Distinguishing: check if there is a good partition. Detecting: find a good partition.

Abbe/Sandon improved this for small d by taking the giant component and pruning trees.
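The "good partition" test itself is cheap to state in code; what makes the argument inefficient is scanning all (exponentially many) balanced partitions for one that passes. The sketch below is illustrative: the name `is_good_partition` and the tolerance `tol` are mine, not from the talk.

```python
def is_good_partition(n, k, a, b, edges, parts, tol=0.5):
    """A partition is 'good' if the average in-degree (within a part) is
    close to a/k and the average out-degree (across parts) is close to
    (k-1)*b/k.  `parts` maps each node 0..n-1 to a part label."""
    inside = sum(1 for u, v in edges if parts[u] == parts[v])
    between = len(edges) - inside
    avg_in = 2 * inside / n     # each within-part edge adds 2 to total in-degree
    avg_out = 2 * between / n   # likewise for cross-part edges
    return (abs(avg_in - a / k) <= tol
            and abs(avg_out - (k - 1) * b / k) <= tol)
```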

SLIDE 24

[Figure: the (d, λ) phase diagram again: detectable (quickly), distinguishable; detectable, distinguishable; undetectable, indistinguishable.]

SLIDE 28

Indistinguishability

If E_{Qn} (dPn/dQn)² → C < ∞ then Qn(An) → 0 ⇒ Pn(An) → 0. Set Pn = G(n, k, a, b) and Qn = G(n, d/n). Then

  dPn/dQn = k^{−n} ∑_σ [∏_{E} ((a or b)/n) ∏_{Eᶜ} (1 − (a or b)/n)] / [∏_{E} (d/n) ∏_{Eᶜ} (1 − d/n)]

  (dPn/dQn)² = k^{−2n} ∑_{σ,τ} ∏_{E} ((a or b)(a or b)/d²) ∏_{Eᶜ} [(1 − (a or b)/n)(1 − (a or b)/n) / (1 − d/n)²]

Under Qn, the events (u, v) ∈ E are all independent, so we can compute:

  E_{Qn} (dPn/dQn)² = C(1 + o(1)) E exp(XᵀBX),

where X is a multinomial vector of length k².

SLIDE 34

Indistinguishability

Replacing multinomials with Gaussians,

  E_{Qn} (dPn/dQn)² → C E exp(ZᵀBZ) = ψ(λ²d)^{k−1},  where ψ(x) = e^{−x/2 − x²/4} / √(1 − x).

Finite whenever λ²d < 1.

  multinomials ↔ Gaussians ⇔ exp(XᵀBX) uniformly integrable ⇔ xᵀBx − nH(x) maximized at x = EX,

where H(x) is some kind of multivariate entropy. Achlioptas-Naor: sufficient condition for the maximum to be at x = EX. (They were studying planted colorings.)
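The limiting second moment can be tabulated directly. The sketch below (helper names are mine) makes the finiteness condition λ²d < 1 explicit: ψ(λ²d)^{k−1} is finite below the Kesten-Stigum line and blows up as λ²d approaches 1.

```python
import math

def psi(x):
    """psi(x) = exp(-x/2 - x**2/4) / sqrt(1 - x), defined for x < 1."""
    return math.exp(-x / 2 - x ** 2 / 4) / math.sqrt(1 - x)

def second_moment_limit(lam, d, k):
    """Limiting second moment psi(lam**2 * d)**(k - 1);
    infinite once lam**2 * d >= 1."""
    x = lam ** 2 * d
    if x >= 1:
        return math.inf
    return psi(x) ** (k - 1)
```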

SLIDE 37

Indistinguishability

For the other direction (Pn(An) → 0 ⇒ Qn(An) → 0), want to show dPn/dQn bounded away from zero.

Small subgraph conditioning (Robinson/Wormald): dPn/dQn is essentially a function of the number of short cycles; it converges to an explicit limiting random variable that is never zero.

Main thing to check: convergence of second moment.

SLIDE 43

Undetectability

Suffices to show that the distribution of G is not much affected by conditioning on σu, σv. Let P1,n, P2,n be Pn conditioned on two (possibly different) labellings of u, v.

  dTV(P1,n, P2,n) → 0
  ⇔ E_{Qn} |dP1,n/dQn − dP2,n/dQn| → 0
  ⇐ E_{Qn} (dP1,n/dQn − dP2,n/dQn)² → 0
  ⇐ E_{Qn} [(dPi,n/dQn)(dPj,n/dQn)] → ψ(λ²d)^{k−1}.

Similar to previous second moment computation.

SLIDE 45

Summary

Indistinguishability and undetectability follow from an explicit second moment calculation. Use Achlioptas-Naor to estimate the set of parameters where the second moment is finite.


SLIDE 46

[Figure: the (d, λ) phase diagram again: detectable (quickly), distinguishable; detectable, distinguishable; undetectable, indistinguishable.]

Thank you!
