The condensation threshold in stochastic block models


  1. The condensation threshold in stochastic block models. Joe Neeman (with Jess Banks, Cris Moore, Praneeth Netrapalli). Austin, May 9, 2016.

  2. Stochastic block model G(n, k, a, b).
1. n nodes, k colors, about n/k nodes of each color.
2. Connect u to v with probability a/n if they have the same color, and b/n if they have different colors.
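To make the model concrete, here is a minimal sampling sketch (mine, not from the talk; the function name sample_sbm and the use of a uniformly random coloring, which is only approximately balanced, are my choices):

```python
import random

def sample_sbm(n, k, a, b, seed=None):
    # Assign each vertex one of k colors uniformly at random (roughly n/k per
    # color), then join each pair independently with probability a/n if the
    # colors agree and b/n otherwise.
    rng = random.Random(seed)
    colors = [rng.randrange(k) for _ in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = a / n if colors[u] == colors[v] else b / n
            if rng.random() < p:
                edges.append((u, v))
    return edges, colors
```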

  3. Problem I: detecting. Given the (uncolored) graph, recover the colors (up to permutation) better than a random guess.
Definition. Let σ_v ∈ {1, . . . , k} be the color of v. For another coloring τ,
Olap(σ, τ) = max_π [ #{v ∈ V : σ_v = π(τ_v)}/n − 1/k ],
where the max is over all permutations π on {1, . . . , k}.
Definition. (G_n, σ_n) ∼ G(n, k, a, b) is detectable if there exists ϵ > 0 and maps A_n : {graphs} → {labellings} such that
lim inf_{n→∞} Pr(Olap(σ_n, A_n(G_n)) > ϵ) > ϵ.
Otherwise it is undetectable.
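A small helper (again mine, not from the slides) that evaluates the overlap just defined by brute-forcing the maximum over permutations; this is only practical for small k:

```python
from itertools import permutations

def overlap(sigma, tau, k):
    # Olap(sigma, tau) = max over permutations pi of
    #   #{v : sigma[v] == pi(tau[v])}/n - 1/k.
    n = len(sigma)
    best = float("-inf")
    for pi in permutations(range(k)):
        agree = sum(1 for s, t in zip(sigma, tau) if s == pi[t])
        best = max(best, agree / n - 1.0 / k)
    return best
```

A coloring that is uncorrelated with σ has overlap close to 0, so detectability asks for an estimator whose overlap stays bounded away from 0 with non-vanishing probability.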

  4. Problem II: distinguishing. Given the (uncolored) graph, did it come from G(n, k, a, b) or from G(n, d/n), where d = (a + (k−1)b)/k?
Definition. Sequences P_n and Q_n of probability measures are
• contiguous if P_n(A_n) → 0 iff Q_n(A_n) → 0,
• orthogonal if ∃ A_n with P_n(A_n) → 0 and Q_n(A_n) → 1.
Say that G(n, k, a, b) is
• distinguishable if it is orthogonal to G(n, d/n),
• indistinguishable if it is contiguous with G(n, d/n).
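A standard toy example, not from the slides, may help separate the two notions: product Bernoulli measures with a vanishing versus a constant bias.

```latex
% Illustration only (textbook example, not from the talk).
% Take $P_n = \mathrm{Ber}(1/2)^{\otimes n}$ and $Q_n = \mathrm{Ber}(1/2+\varepsilon_n)^{\otimes n}$.
\[
\varepsilon_n = c/\sqrt{n}:\qquad
\mathbb{E}_{P_n}\!\left[\Bigl(\tfrac{dQ_n}{dP_n}\Bigr)^{2}\right]
 = \bigl(1 + 4c^{2}/n\bigr)^{n} \to e^{4c^{2}},
\]
% so the likelihood ratio stays bounded in L^2 (and likewise with the roles swapped),
% which gives contiguity of the two sequences.
\[
\varepsilon_n = \varepsilon > 0:\qquad
A_n = \Bigl\{x : \tfrac{1}{n}\textstyle\sum_i x_i > \tfrac12 + \tfrac{\varepsilon}{2}\Bigr\},
\qquad P_n(A_n) \to 0,\quad Q_n(A_n) \to 1,
\]
% so by the law of large numbers the two sequences are orthogonal.
```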

  5. Better parametrization.
• a/n = within-block edge probability
• b/n = between-block edge probability
• k = number of blocks
d = (a + (k−1)b)/k, λ = (a−b)/(a + (k−1)b). Note λ ∈ [−1/(k−1), 1].
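A tiny conversion helper (my own naming) between the (a, b) and (d, λ) parametrizations, following the formulas above:

```python
def to_d_lambda(a, b, k):
    # d and lambda as defined on the slide above.
    d = (a + (k - 1) * b) / k
    lam = (a - b) / (a + (k - 1) * b)
    return d, lam

def to_a_b(d, lam, k):
    # Inverse map: a = d*(1 + (k-1)*lam), b = d*(1 - lam).
    return d * (1 + (k - 1) * lam), d * (1 - lam)
```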

  6. Phase diagram for k = 2. [Figure: the (d, λ) plane, d from 0 to 20 and λ from −1 to 1, split by the curve λ²d = 1 into a detectable, distinguishable region and an undetectable, indistinguishable region.] (Mossel/N/Sly, Massoulié)

  7. Conjectured phase diagram for k = 20. [Figure: the (d, λ) plane, d from 0 to 1000 and λ from 0 to 1, with the curve λ²d = 1 and three labeled regions: detectable, distinguishable; detectable but hard, distinguishable; undetectable, indistinguishable.] (Decelle, Krzakala, Moore, Zdeborova)

  8. What we know for k = 20. [Figure: the (d, λ) plane, d from 0 to 1000 and λ from 0 to 1, with regions labeled: detectable (quickly), distinguishable (Bordenave/Lelarge/Massoulié, Abbe/Sandon); detectable, distinguishable (Abbe/Sandon, this work); undetectable, indistinguishable (this work).]

  9. Theorem (Banks/Moore/N/Netrapalli). Let
d_+ = 2k log k / [ (1 + (k−1)λ) log(1 + (k−1)λ) + (k−1)(1−λ) log(1−λ) ],
d_− = 2 log(k−1) / [ (k−1)λ² ].
• d > d_+ implies detectability, distinguishability.
• d < d_− implies undetectability, indistinguishability.
If k is large enough then there are λ such that d_+ < 1/λ², giving the yellow region (the "detectable but hard" region of the conjectured phase diagram). Moreover,
lim_{k→∞} d_+/d_− = µ² / [ (1+µ) log(1+µ) − µ ], where µ = (a−b)/d.
If µ ≈ ±1 then lim_{k→∞} d_+/d_− ≈ 1 (planted coloring / giant).
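A short numerical sketch (mine) that evaluates d_+ and d_− from the theorem and compares d_+ with the value 1/λ² of the curve λ²d = 1 appearing in the phase diagrams; the sample values of k and λ are arbitrary:

```python
import math

def d_plus(k, lam):
    # d > d_plus implies detectability and distinguishability.
    num = 2 * k * math.log(k)
    den = ((1 + (k - 1) * lam) * math.log(1 + (k - 1) * lam)
           + (k - 1) * (1 - lam) * math.log(1 - lam))
    return num / den

def d_minus(k, lam):
    # d < d_minus implies undetectability and indistinguishability.
    return 2 * math.log(k - 1) / ((k - 1) * lam ** 2)

if __name__ == "__main__":
    k = 20
    for lam in (-0.05, 0.1, 0.3, 0.7):
        print(f"lam={lam:+.2f}  d_-={d_minus(k, lam):8.2f}  "
              f"d_+={d_plus(k, lam):8.2f}  1/lam^2={1 / lam**2:8.2f}")
```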

  10. The proofs. [Figure: the "What we know for k = 20" diagram from slide 8, repeated.]

  11. Detecting/distinguishing inefficiently. Consider partitions of G into k equal parts. A partition is good if its average in-degree is ≈ a/k and its average out-degree is ≈ (k−1)b/k. For suitable a, b, k, w.h.p.:
• G(n, k, a, b): all good partitions are correlated with the truth.
• G(n, d/n): there are no good partitions.
Proof: concentration + union bound. Distinguishing: check if there is a good partition. Detecting: find a good partition. Abbe/Sandon improved this for small d by taking the giant component and pruning trees.
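For intuition, a brute-force version of this test (illustrative only; the balance check and the tolerance tol are my choices, and the enumeration is exponential, so it is only usable for very small n and k):

```python
from itertools import product

def good_partitions(n, k, edges, a, b, tol=0.5):
    # Enumerate (roughly) balanced colorings and keep those whose empirical
    # average in-degree is within tol of a/k and whose average out-degree is
    # within tol of (k-1)*b/k -- the "good" partitions of the slide.
    goods = []
    for coloring in product(range(k), repeat=n):
        counts = [coloring.count(c) for c in range(k)]
        if max(counts) - min(counts) > 1:   # keep the parts (nearly) equal
            continue
        within = sum(1 for u, v in edges if coloring[u] == coloring[v])
        across = len(edges) - within
        avg_in, avg_out = 2 * within / n, 2 * across / n
        if abs(avg_in - a / k) <= tol and abs(avg_out - (k - 1) * b / k) <= tol:
            goods.append(coloring)
    return goods
```

Distinguishing then amounts to checking whether this list is nonempty, and detecting to returning any one of its elements.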
