
Information-theoretic thresholds, Amin Coja-Oghlan, Goethe University



  1. Information-theoretic thresholds Amin Coja-Oghlan Goethe University Frankfurt based on joint work with Florent Krzakala (ENS Paris) Will Perkins (Birmingham) Lenka Zdeborová (CEA Saclay)

  2. Inference from samples
     • goal: infer an unknown probability distribution from samples
     • the distribution itself is random, determined by parameters σ∗

  3. Example: error-correcting codes
     • A ∈ F_2^{m×n} is the generator matrix
     • Aσ∗ is subjected to noise

  4. Example: the stochastic block model
     • random coloring σ∗ : V → {1,…,q}
     • for each e = {v,w} independently,
       P[e ∈ G∗ | σ∗] = (dq / (n(q−1+e^{−β}))) · { e^{−β} if σ∗(v) = σ∗(w),  1 if σ∗(v) ≠ σ∗(w) }
     • d = signal strength; e^{−β} = noise
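
The edge probabilities above translate directly into a sampler. The following is an illustrative sketch (the helper name `sample_sbm` is not from the talk); by construction the marginal edge probability is d/n, so the expected average degree is d.

```python
import itertools
import math
import random

def sample_sbm(n, q, d, beta, rng=random):
    """Sample (sigma*, G*) from the sparse stochastic block model above.

    Each vertex gets a uniform colour in {0,...,q-1}; each pair {v,w}
    becomes an edge independently with probability
    d*q/(n*(q-1+e^{-beta})) times e^{-beta} if the colours agree and
    times 1 otherwise.
    """
    sigma = [rng.randrange(q) for _ in range(n)]
    base = d * q / (n * (q - 1 + math.exp(-beta)))
    edges = []
    for v, w in itertools.combinations(range(n), 2):
        p = base * (math.exp(-beta) if sigma[v] == sigma[w] else 1.0)
        if rng.random() < min(p, 1.0):
            edges.append((v, w))
    return sigma, edges
```

Averaging over the colours of a pair gives P[edge] = base · (e^{−β}/q + (q−1)/q) = d/n, which is a quick sanity check on the normalisation.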

  5. Example: the stochastic block model
     • the agreement of σ, τ : V → {1,…,q} is
       α(σ,τ) = (q/(q−1)) · ( max_{κ∈S_q} (1/n) Σ_{v∈V} 1{σ(v) = κ∘τ(v)} − 1/q )
     • for what d, β is it possible to recover τ_{G∗} such that E[α(σ∗, τ_{G∗})] ≥ Ω(1)?
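
For small q the agreement can be evaluated exactly by brute force over all q! colour permutations; a minimal sketch (the function name is ours, not the talk's):

```python
import itertools

def agreement(sigma, tau, q):
    """alpha(sigma, tau) from the slide: the best agreement fraction over
    all colour permutations kappa in S_q, rescaled so that two independent
    uniform colourings score about 0 and identical colourings score 1."""
    n = len(sigma)
    best = max(
        sum(1 for v in range(n) if sigma[v] == perm[tau[v]]) / n
        for perm in itertools.permutations(range(q))
    )
    return q / (q - 1) * (best - 1 / q)
```

The max over S_q makes the score invariant under relabelling the colours of τ, which is necessary because the block model itself carries no preferred colour names.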

  6. Example: the stochastic block model
     Easy-hard-impossible
     • for large d efficient algorithms should detect σ∗
     • for very small d there is nothing to detect
     • in between the problem may be well-posed but hard: 0 < d_inf(β) < d_alg(β)

  7. Example: the stochastic block model
     The algorithmic threshold
     • combinatorial algorithms for large d [1980s]
     • spectral algorithms for moderate d [1990s, 2000s]
     • the Kesten-Stigum threshold [AS15]:
       d_alg(β) =? ( (q−1+e^{−β}) / (1−e^{−β}) )²
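
The conjectured Kesten-Stigum expression is easy to evaluate; a small helper (name ours) for exploring it numerically:

```python
import math

def d_KS(q, beta):
    """The conjectured Kesten-Stigum threshold from the slide:
    d_alg(beta) = ((q - 1 + e^{-beta}) / (1 - e^{-beta}))^2."""
    e = math.exp(-beta)
    return ((q - 1 + e) / (1 - e)) ** 2
```

In the noiseless limit β → ∞ the expression tends to (q−1)², while increasing the noise (smaller β) pushes the threshold up.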

  8. Example: the stochastic block model
     The information-theoretic threshold
     • statistical physics prediction [DKMZ11]
     • the case q = 2 [MNS13, MNS14, M14]
     • bounds on d_inf(q,β) [BMNN16]

  9. The information-theoretic threshold
     Theorem [COKPZ16]. For β > 0, d > 0 let
       B∗_{q,β}(d) = sup{ B_{q,β,d}(π) : π = T_{q,β,d}(π), ∫ μ(i) dπ(μ) = 1/q },
     where, with Λ(x) = x ln x,
       T_{q,β,d} : π ↦ Σ_{γ=0}^∞ (d^γ exp(−d) / γ!) Σ_{h=1}^q ∫ [ Π_{j=1}^γ (1 − (1−e^{−β}) μ_j(h)) / ( q (1 − (1−e^{−β})/q)^γ ) ] δ_{BP_{μ_1,…,μ_γ}} dπ^{⊗γ}(μ_1,…,μ_γ),
       BP_{μ_1,…,μ_γ}(i) = Π_{j=1}^γ (1 − (1−e^{−β}) μ_j(i)) / Σ_{h=1}^q Π_{j=1}^γ (1 − (1−e^{−β}) μ_j(h)),
       B_{q,β,d}(π) = E[ Λ( Σ_{σ=1}^q Π_{j=1}^γ (1 − (1−e^{−β}) μ_j^{(π)}(σ)) ) / ( q (1 − (1−e^{−β})/q)^γ ) − (d/2) Λ( 1 − (1−e^{−β}) Σ_{σ=1}^q μ_1^{(π)}(σ) μ_2^{(π)}(σ) ) / ( 1 − (1−e^{−β})/q ) ].
     Then d_inf(q,β) = inf{ d > 0 : B∗_{q,β}(d) > ln q + (d/2) ln(1 − (1−e^{−β})/q) }.
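
Distributional fixed points such as π = T_{q,β,d}(π) are usually approximated numerically by population dynamics. The sketch below is a hypothetical illustration, not code from the talk: candidate messages are produced by the BP update, then resampled in proportion to the reweighting factor that appears inside T_{q,β,d}.

```python
import math
import random

def poisson(lam, rng=random):
    """Knuth's Poisson sampler; adequate for small lam."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p < L:
            return k
        k += 1

def bp_update(mus, q, beta):
    """BP_{mu_1,...,mu_gamma}(i): entrywise product of the incoming
    messages, renormalised (an empty product yields the uniform message)."""
    c = 1 - math.exp(-beta)
    out = [math.prod(1 - c * mu[i] for mu in mus) for i in range(q)]
    s = sum(out)
    return [x / s for x in out]

def population_dynamics(q, beta, d, pop_size=500, iters=20, rng=random):
    """Weighted population dynamics for pi = T_{q,beta,d}(pi)."""
    c = 1 - math.exp(-beta)
    # random initial population of messages on {0,...,q-1}
    pop = []
    for _ in range(pop_size):
        raw = [rng.random() for _ in range(q)]
        s = sum(raw)
        pop.append([x / s for x in raw])
    for _ in range(iters):
        cands, wts = [], []
        for _ in range(pop_size):
            gamma = poisson(d, rng)
            mus = [rng.choice(pop) for _ in range(gamma)]
            # reweighting factor from the operator T_{q,beta,d}
            w = sum(math.prod(1 - c * mu[h] for mu in mus) for h in range(q))
            w /= q * (1 - c / q) ** gamma
            cands.append(bp_update(mus, q, beta))
            wts.append(w)
        pop = rng.choices(cands, weights=wts, k=pop_size)
    return pop
```

The uniform message is always a (trivial) fixed point, so the population is seeded with random messages to probe for a non-trivial one.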

  10. The posterior distribution
     • define
       ψ_{G∗}(σ) = Π_{{v,w}∈E(G∗)} exp(−β 1{σ(v) = σ(w)}),
       Z(G∗) = Σ_{σ∈Ω^V} ψ_{G∗}(σ)
     • then P[σ∗ = σ | G∗] ≍ μ_{G∗}(σ) = ψ_{G∗}(σ)/Z(G∗)
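
On tiny instances ψ, Z and μ can be computed by exhaustive enumeration, which is useful for checking intuition; a minimal sketch (helper name ours, feasible only for small q^n):

```python
import itertools
import math

def posterior(edges, n, q, beta):
    """Brute-force Z(G) and mu_G by enumerating all q^n colourings:
    psi_G(sigma) = product over edges {v,w} of exp(-beta * 1{sigma(v) = sigma(w)})."""
    weights = {}
    for sigma in itertools.product(range(q), repeat=n):
        w = math.exp(-beta * sum(1 for v, u in edges if sigma[v] == sigma[u]))
        weights[sigma] = w
    Z = sum(weights.values())
    return Z, {s: w / Z for s, w in weights.items()}
```

For a single edge with q = 2 this gives Z = 2 + 2e^{−β}: two disagreeing colourings of weight 1 and two agreeing ones of weight e^{−β}.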

  11. The posterior distribution
     • reconstruction is impossible iff
       lim_{n→∞} (1/n²) Σ_{v,w} E‖ μ_{G∗,v,w} − μ_{G∗,v} ⊗ μ_{G∗,w} ‖_TV = 0

  12. The posterior distribution
       lim_{n→∞} (1/n²) Σ_{v,w} E‖ μ_{G∗,v,w} − μ_{G∗,v} ⊗ μ_{G∗,w} ‖_TV = 0
       ⇔ lim_{n→∞} (1/n) E[log Z(G∗)] = log q + (d/2) log(1 − (1−e^{−β})/q)

  13. The Aizenman-Sims-Starr scheme
       lim_{n→∞} (1/n) E[log Z(G∗_n)] = lim_{n→∞} E[ log( Z(G∗_{n+1}) / Z(G∗_n) ) ]

  14. The Aizenman-Sims-Starr scheme
       Z(G̃ + vw) / Z(G̃) = Σ_{σ,τ∈[q]} e^{−β 1{σ=τ}} μ_{G̃,v,w}(σ,τ)
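
This edge-addition identity is exact and can be verified by brute force on a tiny graph; a self-contained sketch with an arbitrarily chosen path graph and parameters (names ours):

```python
import itertools
import math

def z_and_pair_marginal(edges, n, q, beta, v, w):
    """Brute-force Z(G) and the joint posterior marginal mu_{G,v,w}."""
    Z, pair = 0.0, {}
    for sigma in itertools.product(range(q), repeat=n):
        wt = math.exp(-beta * sum(1 for a, b in edges if sigma[a] == sigma[b]))
        Z += wt
        key = (sigma[v], sigma[w])
        pair[key] = pair.get(key, 0.0) + wt
    return Z, {k: x / Z for k, x in pair.items()}

# Check Z(G~ + vw)/Z(G~) = sum_{s,t} e^{-beta*1{s=t}} mu_{G~,v,w}(s,t)
# on a small path graph, adding the extra edge {0, 3}.
n, q, beta = 4, 3, 0.7
G = [(0, 1), (1, 2), (2, 3)]
Z_G, mu = z_and_pair_marginal(G, n, q, beta, 0, 3)
Z_plus, _ = z_and_pair_marginal(G + [(0, 3)], n, q, beta, 0, 3)
lhs = Z_plus / Z_G
rhs = sum(math.exp(-beta * (s == t)) * mu.get((s, t), 0.0)
          for s in range(q) for t in range(q))
```

The identity holds because Z(G̃ + vw) = Σ_σ ψ_{G̃}(σ) e^{−β 1{σ(v)=σ(w)}}, and dividing by Z(G̃) turns the sum into an expectation over the pair marginal.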

  15. Correlations
     • X = fixed finite set
     • μ ∈ P(X^n) for some large integer n

  16. Correlations
     Definition. A probability measure μ ∈ P(X^n) is ε-symmetric if
       (1/n²) Σ_{i,j=1}^n ‖ μ_{i,j} − μ_i ⊗ μ_j ‖_TV < ε
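
The quantity in the definition can be computed exactly for small measures; an illustrative sketch (function name ours) comparing a product measure with a perfectly correlated one:

```python
import itertools

def symmetry_defect(mu, n, q):
    """(1/n^2) * sum_{i,j} ||mu_{i,j} - mu_i (x) mu_j||_TV for mu given as a
    dict mapping configurations in {0,...,q-1}^n to probabilities.  The
    diagonal terms i = j each contribute a constant, but their total share
    of the average is O(1/n), so they are negligible for large n."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            mi, mj = [0.0] * q, [0.0] * q
            mij = [[0.0] * q for _ in range(q)]
            for conf, p in mu.items():
                mi[conf[i]] += p
                mj[conf[j]] += p
                mij[conf[i]][conf[j]] += p
            total += 0.5 * sum(abs(mij[a][b] - mi[a] * mj[b])
                               for a in range(q) for b in range(q))
    return total / n ** 2

# a product measure vs a perfectly correlated one on {0,1}^4
product = {c: 2 ** -4 for c in itertools.product(range(2), repeat=4)}
paired = {(0, 0, 0, 0): 0.5, (1, 1, 1, 1): 0.5}
```

For the product measure only the i = j diagonal contributes (4 terms of 0.5 out of 16), while the paired measure has every pair of coordinates maximally correlated.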

  17. Correlations
     The magic lemma [COKPZ16]. For any ε > 0 there is a bounded random variable T such that for all μ ∈ P(X^n) the following is true:
     • choose U ⊂ {1,…,n} of size T randomly
     • sample σ̂ from μ
     • let μ̂(τ) = μ[τ | ∀i ∈ U : τ(i) = σ̂(i)]; then P[ μ̂ is ε-symmetric ] > 1 − ε

  18. Correlations
     Lemma [BCO15]. For any ε > 0, k ≥ 3 there is δ > 0 such that for n > 1/δ and δ-symmetric μ,
       (1/n^k) Σ_{i_1,…,i_k=1}^n ‖ μ_{i_1,…,i_k} − μ_{i_1} ⊗ ⋯ ⊗ μ_{i_k} ‖_TV < ε

  19. Low density generator matrix codes
     • A ∈ F_2^{m×n} with k ≥ 3 ones per row
     • signal d = km/n, noise β
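
Such an instance, together with a noisy observation of the codeword, can be sampled in a few lines; an illustrative sketch (helper name ours), storing each sparse row of A by its support:

```python
import random

def sample_ldgm(n, m, k, beta, rng=random):
    """Sample a random generator matrix A in F_2^{m x n} with exactly k ones
    per row (stored as row supports), a uniform signal sigma*, and the noisy
    observation: each bit of A sigma* is flipped independently with
    probability beta.  The signal-strength parameter is d = k*m/n."""
    A = [sorted(rng.sample(range(n), k)) for _ in range(m)]
    sigma = [rng.randrange(2) for _ in range(n)]
    y = []
    for row in A:
        bit = sum(sigma[i] for i in row) % 2
        if rng.random() < beta:
            bit ^= 1
        y.append(bit)
    return A, sigma, y
```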

  20. Low density generator matrix codes
       I(σ∗, τ | A) = Σ_{s,t} P[σ∗ = s, τ = t] log( P[σ∗ = s, τ = t] / (P[σ∗ = s] P[τ = t]) )
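
The mutual information in this formula can be evaluated directly from any explicitly given joint distribution; a generic sketch (not from the talk), using natural logarithms:

```python
import math

def mutual_information(joint):
    """I(X;Y) = sum_{x,y} P[x,y] * log(P[x,y] / (P[x] P[y])) for a joint
    distribution given as a dict {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)
```

Two sanity checks: independent variables give I = 0, and two perfectly correlated uniform bits give I = log 2.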

  21. Low density generator matrix codes
     • non-rigorous statistical physics analysis [KS99]
     • upper bound on the mutual information, even k [M05]
     • existence of lim_{n→∞} (1/n) I(σ∗, τ | A), even k [AM15]

  22. Low density generator matrix codes
     Theorem [CKPZ16]. For k ≥ 2, β > 0, d > 0 and π ∈ P₀([−1,1]) let
       B_{d,β}(π) = E[ (1/2) Λ( Σ_{σ=±1} Π_{i=1}^γ (1 + (1−2β) σ J_i Π_{j=1}^{k−1} θ^π_{i,j}) ) − (d(k−1)/k) Λ( 1 + (1−2β) J Π_{j=1}^k θ^π_j ) ].
     Then
       lim_{n→∞} (1/n) I(σ∗, τ | A) = (1 + d/k) log 2 + β log β + (1−β) log(1−β) − sup_{π∈P₀([−1,1])} B_{d,β}(π).
     The information-theoretic threshold is equal to
       d_inf(β) = inf{ d > 0 : sup_{π∈P₀([−1,1])} B_{d,β}(π) > log 2 }

  23. Conclusions
     • generalisation: the “teacher-student scheme”
     • justification of the ‘replica symmetric cavity method’
     • other applications:
       – random graph colouring
       – Goldreich’s one-way function
       – the diluted p-spin model
