Phase transitions in low-rank matrix estimation

  1. Phase transitions in low-rank matrix estimation. May 11, 2017. Marc Lelarge & Léo Miolane (INRIA, ENS).

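  2. Introduction: the statistical model. The “spiked Wigner” model:

     $$\underbrace{Y}_{\text{observations}} = \sqrt{\frac{\lambda}{n}}\, \underbrace{XX^\top}_{\text{signal}} + \underbrace{Z}_{\text{noise}}$$

     ◮ $X$: vector of dimension $n$ with entries $X_i \overset{\text{i.i.d.}}{\sim} P_0$, where $\mathbb{E} X_1 = 0$ and $\mathbb{E} X_1^2 = 1$.
     ◮ $Z_{i,j} = Z_{j,i} \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, 1)$.
     ◮ $\lambda$: signal-to-noise ratio.

     Goal: recover the low-rank matrix $XX^\top$ from $Y$.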
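
The model on this slide can be simulated directly. The following is a minimal sketch, not code from the talk: the function name `sample_spiked_wigner`, the default Rademacher prior, and the choice of $\mathcal{N}(0, 1)$ entries on the diagonal of $Z$ are all assumptions made for illustration.

```python
import numpy as np

def sample_spiked_wigner(n, lam, rng, sample_prior=None):
    """Draw (X, Y) with Y = sqrt(lam / n) * X X^T + Z (hypothetical helper)."""
    if sample_prior is None:
        # Assumption: Rademacher prior (+1 or -1 with probability 1/2),
        # which has mean 0 and variance 1 as the slide requires of P_0.
        def sample_prior(size):
            return rng.choice([-1.0, 1.0], size=size)
    X = sample_prior(n)

    # Symmetric Gaussian noise: Z_ij = Z_ji ~ N(0, 1) off the diagonal.
    G = rng.standard_normal((n, n))
    upper = np.triu(G, 1)
    # Assumption: diagonal entries are also N(0, 1); the slide does not say.
    Z = upper + upper.T + np.diag(rng.standard_normal(n))

    Y = np.sqrt(lam / n) * np.outer(X, X) + Z
    return X, Y
```

Calling `sample_spiked_wigner(500, 2.0, np.random.default_rng(0))` returns a signal vector and a symmetric observation matrix of matching size.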

  4. Principal component analysis (PCA). [Figure: spectral density of the signal and limiting spectral density of the noise.]

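  5. Principal component analysis (PCA): the B.B.P. phase transition.

     ◮ The matrix $Y/\sqrt{n} = \sqrt{\lambda}\, XX^\top / n + Z/\sqrt{n}$ is a perturbed low-rank matrix.
     ◮ Estimate $X$ using the eigenvector $\hat{x}_n$ associated with the largest eigenvalue $\mu_n$ of $Y/\sqrt{n}$.

     B.B.P. phase transition (Baik et al., 2005; Benaych-Georges and Nadakuditi, 2011), where $X$ and $\hat{x}_n$ are normalized to unit norm in the inner product $X \cdot \hat{x}_n$:

     ◮ if $\lambda \le 1$: $\mu_n \to 2$ and $X \cdot \hat{x}_n \to 0$;
     ◮ if $\lambda > 1$: $\mu_n \to \sqrt{\lambda} + \frac{1}{\sqrt{\lambda}} > 2$ and $|X \cdot \hat{x}_n| \to \sqrt{1 - 1/\lambda} > 0$.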
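
The two regimes of the B.B.P. transition can be checked numerically at finite $n$. The sketch below is an illustration under stated assumptions: the helper name `pca_spiked_wigner`, the Rademacher prior, and the $\mathcal{N}(0, 1)$ diagonal of the noise are not from the slides. It returns the top eigenvalue of $Y/\sqrt{n}$ and the absolute overlap between the normalized signal and the top eigenvector, to be compared with the limits stated above.

```python
import numpy as np

def pca_spiked_wigner(n, lam, seed=0):
    """Top eigenvalue of Y / sqrt(n) and overlap |X . x_hat| (hypothetical helper)."""
    rng = np.random.default_rng(seed)

    # Assumption: Rademacher prior, which has mean 0 and variance 1.
    X = rng.choice([-1.0, 1.0], size=n)

    # Symmetric noise with N(0, 1) entries (diagonal choice is an assumption).
    G = rng.standard_normal((n, n))
    upper = np.triu(G, 1)
    Z = upper + upper.T + np.diag(rng.standard_normal(n))

    Y = np.sqrt(lam / n) * np.outer(X, X) + Z

    # eigh returns eigenvalues in ascending order with unit-norm eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(Y / np.sqrt(n))
    mu = eigvals[-1]
    x_hat = eigvecs[:, -1]

    # Overlap between the unit-norm signal and the unit-norm top eigenvector.
    overlap = abs(X @ x_hat) / np.linalg.norm(X)
    return mu, overlap
```

For example, with `lam = 4.0` the predicted limits are $\mu_n \to 2.5$ and overlap $\to \sqrt{0.75}$, while with `lam = 0.25` the prediction is $\mu_n \to 2$ and a vanishing overlap. Finite-size fluctuations are to be expected.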

  8. Questions

     ◮ PCA fails when $\lambda \le 1$, but is it still possible to recover the signal?
     ◮ When $\lambda > 1$, is PCA optimal?
     ◮ More generally, what is the best achievable estimation performance in both regimes?

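  10. MMSE and information-theoretic threshold.

     Goal:

     $$\mathrm{MMSE}_n = \min_{\hat{\theta}} \frac{1}{n^2}\, \mathbb{E} \big\| XX^\top - \hat{\theta}(Y) \big\|^2 = \frac{1}{n^2} \sum_{1 \le i,j \le n} \mathbb{E}\big[ (X_i X_j - \mathbb{E}[X_i X_j \mid Y])^2 \big] \le \underbrace{\mathbb{E}[X^2]^2}_{\text{Dummy MSE}}$$

     Information-theoretic threshold:

     1. Compute $\lim_{n \to \infty} \mathrm{MMSE}_n$.
     2. Deduce the information-theoretic threshold, i.e. the critical value $\lambda_c$ such that
        ◮ if $\lambda > \lambda_c$, $\lim_{n \to \infty} \mathrm{MMSE}_n < \text{Dummy MSE}$;
        ◮ if $\lambda < \lambda_c$, $\lim_{n \to \infty} \mathrm{MMSE}_n = \text{Dummy MSE}$.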
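
The value of the dummy MSE can be worked out by hand. The short derivation below is a sketch rather than part of the slides: it assumes that the "dummy" estimator is the constant $\hat{\theta} = \mathbb{E}[XX^\top]$, which ignores $Y$, and that $P_0$ has a finite fourth moment.

```latex
% Dummy estimator: \hat{\theta} = E[XX^T] = I_n, since E[X_i X_j] = 0 for i \ne j
% and E[X_i^2] = 1.
\begin{align*}
\frac{1}{n^2} \sum_{1 \le i,j \le n} \mathbb{E}\big[(X_i X_j - \mathbb{E}[X_i X_j])^2\big]
  &= \frac{1}{n^2} \sum_{i \ne j} \mathbb{E}[X_i^2]\,\mathbb{E}[X_j^2]
     + \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Var}(X_i^2) \\
  &= \frac{n-1}{n}\, \mathbb{E}[X^2]^2 + \frac{1}{n} \operatorname{Var}(X^2)
  \xrightarrow[n \to \infty]{} \mathbb{E}[X^2]^2 = 1 .
\end{align*}
```

The diagonal terms contribute only $O(1/n)$, so in the limit an estimator beats the dummy one exactly when its MSE falls strictly below $\mathbb{E}[X^2]^2$.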

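  13. Connection with statistical physics: a planted spin glass model.

     ◮ Compute the MMSE for $Y = \sqrt{\lambda/n}\, XX^\top + Z$.
     ◮ Study the posterior $P(x \mid Y) = \frac{1}{\mathcal{Z}_n} P_0(x) \exp(H_n(x))$, where $\mathcal{Z}_n$ is the normalizing constant (not to be confused with the noise matrix $Z$) and

       $$H_n(x) = \sum_{i<j} \sqrt{\frac{\lambda}{n}}\, Y_{i,j} x_i x_j - \frac{\lambda}{2n} x_i^2 x_j^2 = \sum_{i<j} \underbrace{\sqrt{\frac{\lambda}{n}}\, Z_{i,j} x_i x_j}_{\text{SK}} + \underbrace{\frac{\lambda}{n} X_i X_j x_i x_j}_{\text{planted solution}} - \frac{\lambda}{2n} x_i^2 x_j^2 .$$

     ◮ Compute the limit of the free energy $F_n = \frac{1}{n} \mathbb{E} \log \mathcal{Z}_n$, because

       $$\text{Constant} - F_n = \frac{1}{n} I(X; Y) \xrightarrow{\ \partial / \partial \lambda\ } \mathrm{MMSE},$$

       that is, differentiating the mutual information with respect to $\lambda$ yields the MMSE up to a constant factor.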
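
The form of the Hamiltonian follows from Bayes' rule and the Gaussian likelihood. The derivation below is a sketch under two assumptions not stated on the slide: only the entries $Y_{i,j}$ with $i < j$ are used (matching the range of the sum in $H_n$), and $P_0(x)$ stands for the product prior $\prod_i P_0(x_i)$.

```latex
\begin{align*}
P(x \mid Y)
  &\propto P_0(x) \prod_{i<j} \exp\!\Big( -\tfrac{1}{2}
     \big( Y_{i,j} - \sqrt{\tfrac{\lambda}{n}}\, x_i x_j \big)^2 \Big) \\
  &= P_0(x) \exp\!\Big( \sum_{i<j} \sqrt{\tfrac{\lambda}{n}}\, Y_{i,j} x_i x_j
     - \tfrac{\lambda}{2n}\, x_i^2 x_j^2 \Big)
     \exp\!\Big( -\tfrac{1}{2} \sum_{i<j} Y_{i,j}^2 \Big) .
\end{align*}
% The last factor does not depend on x and is absorbed into the normalization.
% Substituting Y_{i,j} = \sqrt{\lambda/n}\, X_i X_j + Z_{i,j} splits the first term
% into the SK part (in Z_{i,j}) and the planted part (in X_i X_j).
```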

  15. Replica symmetric formula.

     The scalar channel: Lesieur et al., 2015 conjectured that the problem is characterized by the scalar channel

     $$Y_0 = \sqrt{\gamma}\, X_0 + Z_0$$

     and the scalar free energy

     $$F(\gamma) = \mathbb{E} \log \sum_{x_0} P_0(x_0)\, e^{\sqrt{\gamma}\, Y_0 x_0 - \frac{\gamma}{2} x_0^2} .$$

     Replica symmetric formula:

     $$F_n \xrightarrow[n \to \infty]{} \sup_{q \ge 0}\; F(\lambda q) - \frac{\lambda}{4} q^2, \qquad \mathrm{MMSE}_n \xrightarrow[n \to \infty]{} \mathbb{E}_{P_0}[X^2]^2 - q^*(\lambda)^2,$$

     where $q^*(\lambda)$ denotes the maximizer of the supremum. Proved by Barbier et al., 2016, extended by Lelarge and Miolane, 2016.
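
For a prior with finitely many atoms, the replica symmetric formula can be evaluated numerically. The sketch below is one possible way to do so and is not taken from the talk: the function names, the Riemann-sum approximation of the Gaussian expectation, and the grid search over $q$ are all assumptions. The search is restricted to $0 \le q \le \mathbb{E}[X^2]$, since the objective is decreasing in $q$ beyond that point.

```python
import numpy as np

# Grid and normalized weights approximating an expectation over Z_0 ~ N(0, 1).
_Z = np.linspace(-10.0, 10.0, 4001)
_W = np.exp(-_Z**2 / 2.0)
_W /= _W.sum()

def scalar_free_energy(gamma, values, probs):
    """F(gamma) = E log sum_{x0} P0(x0) exp(sqrt(gamma) Y0 x0 - gamma x0^2 / 2)."""
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    total = 0.0
    for x, p in zip(values, probs):           # average over the true X_0
        y0 = np.sqrt(gamma) * x + _Z          # Y_0 = sqrt(gamma) X_0 + Z_0
        expo = np.sqrt(gamma) * np.outer(y0, values) - gamma * values**2 / 2.0
        shift = expo.max(axis=1)              # log-sum-exp shift for stability
        log_sum = shift + np.log(np.exp(expo - shift[:, None]) @ probs)
        total += p * (_W @ log_sum)           # average over Z_0
    return total

def replica_symmetric(lam, values, probs, num_q=201):
    """Grid search for sup_q F(lam q) - lam q^2 / 4.

    Returns (limiting free energy, q_star, limiting MMSE).
    """
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    second_moment = float(probs @ values**2)
    q_grid = np.linspace(0.0, second_moment, num_q)
    objective = np.array([
        scalar_free_energy(lam * q, values, probs) - lam * q**2 / 4.0
        for q in q_grid
    ])
    k = int(np.argmax(objective))
    q_star = float(q_grid[k])
    return float(objective[k]), q_star, second_moment**2 - q_star**2
```

With the Rademacher prior (`values=[-1.0, 1.0]`, `probs=[0.5, 0.5]`), the maximizer found on the grid sits at $q = 0$ for $\lambda$ below 1 and moves away from 0 above it, so the limiting MMSE drops below the dummy MSE. The grid resolution limits the accuracy of $q^*$.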

  16. Some curves

     ◮ We will plot the MMSE and the MSE of PCA when $P_0$ is of the form

       $$P_0\Big(\sqrt{(1-p)/p}\Big) = p, \qquad P_0\Big(-\sqrt{p/(1-p)}\Big) = 1 - p$$

       for some $p \in (0, 1)$.
     ◮ One can show that the corresponding matrix estimation problem is, in some sense, equivalent to the community detection problem with 2 asymmetric communities.
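
This two-point prior is scaled so that it satisfies the model's requirements $\mathbb{E} X = 0$ and $\mathbb{E} X^2 = 1$ for every $p$. A small sketch to confirm this numerically (the function name `asymmetric_two_point_prior` is hypothetical):

```python
import math

def asymmetric_two_point_prior(p):
    """Atoms and probabilities of the two-point prior P_0 for a given p in (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    values = [math.sqrt((1.0 - p) / p), -math.sqrt(p / (1.0 - p))]
    probs = [p, 1.0 - p]
    return values, probs

def moments(values, probs):
    """First and second moments of a finitely supported distribution."""
    mean = sum(v * q for v, q in zip(values, probs))
    second = sum(v * v * q for v, q in zip(values, probs))
    return mean, second
```

Algebraically, the mean is $\sqrt{p(1-p)} - \sqrt{p(1-p)} = 0$ and the second moment is $(1-p) + p = 1$. For $p = 1/2$ the prior reduces to the symmetric $\pm 1$ case.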

  17. [Figure: MMSE, MSE AMP and MSE PCA plotted against $\lambda$ (horizontal axis from 0.25 to 2.00, vertical axis from 0.0 to 1.0).] MMSE, MSE PCA and MSE AMP, asymmetric SBM: $p = 0.05$.

  18. [Figure] “Free energy landscape”, $p = 0.05$, $\lambda = 0.63$.

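  19. [Figure: phase diagram in the $(p, \lambda)$ plane, with $p$ from 0 to 0.5 and $\lambda$ from 0 to 1.2; regions labeled EASY, HARD and IMPOSSIBLE; curves labeled K-S, $\lambda_c$ and $\lambda_{sp}$; the point $p^*$ is marked.] Phase diagram from Caltagirone et al., 2016.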

  20. Thank you for your attention. Any questions?

  21. References I

     ◮ Baik, Jinho, Gérard Ben Arous, and Sandrine Péché (2005). “Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices”. In: Annals of Probability, pp. 1643–1697.
     ◮ Barbier, Jean et al. (2016). “Mutual information for symmetric rank-one matrix estimation: A proof of the replica formula”. In: Advances in Neural Information Processing Systems, pp. 424–432.
     ◮ Benaych-Georges, Florent and Raj Rao Nadakuditi (2011). “The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices”. In: Advances in Mathematics 227.1, pp. 494–521.
     ◮ Caltagirone, Francesco, Marc Lelarge, and Léo Miolane (2016). “Recovering asymmetric communities in the stochastic block model”. In: arXiv preprint arXiv:1610.03680.
     ◮ Lelarge, Marc and Léo Miolane (2016). “Fundamental limits of symmetric low-rank matrix estimation”. In: arXiv preprint arXiv:1611.03888.

  22. References II

     ◮ Lesieur, Thibault, Florent Krzakala, and Lenka Zdeborová (2015). “MMSE of probabilistic low-rank matrix estimation: Universality with respect to the output channel”. In: 53rd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2015, Allerton Park & Retreat Center, Monticello, IL, USA, September 29 - October 2, 2015. IEEE, pp. 680–687. ISBN: 978-1-5090-1824-6. DOI: 10.1109/ALLERTON.2015.7447070. URL: http://dx.doi.org/10.1109/ALLERTON.2015.7447070.
