Phase transitions in low-rank matrix estimation
May 11, 2017 Marc Lelarge & Léo Miolane
INRIA, ENS
The statistical model
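“Spiked Wigner” model
$$Y = \sqrt{\frac{\lambda}{n}}\, XX^{\top} + Z$$
◮ $X$: vector of dimension $n$ with entries $X_i \overset{\text{i.i.d.}}{\sim} P_0$, with $\mathbb{E}X_1 = 0$ and $\mathbb{E}X_1^2 = 1$.
◮ $Z_{i,j} = Z_{j,i} \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, 1)$.
◮ $\lambda$: signal-to-noise ratio.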
Goal: recover the low-rank matrix XX⊺ from Y.
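A minimal sketch of how one might simulate this model in Python with NumPy. The function name `sample_spiked_wigner`, the two-point choice of P0 (parameter `p`) and the N(0, 1) diagonal of Z are assumptions made for illustration; the slides only specify the symmetric Gaussian noise and the first two moments of P0.

```python
import numpy as np

def sample_spiked_wigner(n, lam, rng, p=0.5):
    """Draw (X, Y) with Y = sqrt(lam / n) * X X^T + Z (illustrative sketch)."""
    # Assumed prior P0: two atoms chosen to have zero mean and unit variance.
    a, b = np.sqrt((1 - p) / p), np.sqrt(p / (1 - p))
    X = np.where(rng.random(n) < p, a, -b)
    # Symmetric noise: Z[i, j] = Z[j, i] ~ N(0, 1). The diagonal is also
    # drawn from N(0, 1), which is an assumption of this sketch.
    G = rng.standard_normal((n, n))
    upper = np.triu(G, 1)
    Z = upper + upper.T + np.diag(np.diag(G))
    Y = np.sqrt(lam / n) * np.outer(X, X) + Z
    return X, Y

rng = np.random.default_rng(0)
X, Y = sample_spiked_wigner(n=200, lam=2.0, rng=rng)
print(Y.shape, np.allclose(Y, Y.T))
```

With `p=0.5` the assumed prior reduces to uniform signs in {-1, +1}.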
B.B.P. phase transition
◮ The matrix $Y/\sqrt{n} = \sqrt{\lambda}\, XX^{\top}/n + Z/\sqrt{n}$ is a perturbed low-rank matrix.
◮ Estimate $X$ using the eigenvector $\hat{x}_n$ associated with the largest eigenvalue $\mu_n$ of $Y/\sqrt{n}$.
[Figure: spectral density of the signal and limiting spectral density of the noise.]
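◮ If $\lambda \le 1$: $\mu_n \to 2$ and $X \cdot \hat{x}_n \to 0$.
◮ If $\lambda > 1$: $\mu_n \to \sqrt{\lambda} + \frac{1}{\sqrt{\lambda}} > 2$ and $|X \cdot \hat{x}_n| \to \sqrt{1 - 1/\lambda} > 0$.
Here $X \cdot \hat{x}_n$ denotes the overlap between the two vectors after normalization to unit norm.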
Baik et al., 2005; Benaych-Georges and Nadakuditi, 2011
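The transition can be observed numerically. The sketch below is an illustration, not the authors' code: it assumes a Rademacher prior for P0, a N(0, 1) diagonal for Z, and a hypothetical helper name `pca_overlap`. It returns the top eigenvalue of Y/√n and the overlap between X and the top eigenvector, both normalized to unit norm.

```python
import numpy as np

def pca_overlap(n, lam, seed=0):
    """Top eigenvalue of Y/sqrt(n) and normalized overlap |X . x_hat|."""
    rng = np.random.default_rng(seed)
    X = rng.choice([-1.0, 1.0], size=n)           # assumption: Rademacher P0
    G = rng.standard_normal((n, n))
    upper = np.triu(G, 1)
    Z = upper + upper.T + np.diag(np.diag(G))     # symmetric Gaussian noise
    Y = np.sqrt(lam / n) * np.outer(X, X) + Z
    # eigh returns eigenvalues in ascending order, eigenvectors as columns.
    evals, evecs = np.linalg.eigh(Y / np.sqrt(n))
    mu_n, x_hat = evals[-1], evecs[:, -1]
    overlap = abs(X @ x_hat) / np.linalg.norm(X)  # x_hat already has unit norm
    return mu_n, overlap

for lam in (0.5, 4.0):
    mu_n, overlap = pca_overlap(n=1000, lam=lam)
    print(f"lambda={lam}: mu_n={mu_n:.3f}, overlap={overlap:.3f}")
```

For λ = 4 the values should be close to √λ + 1/√λ = 2.5 and √(1 − 1/λ) ≈ 0.87, up to finite-size fluctuations; for λ = 0.5 the eigenvalue should stay near 2 and the overlap should be small.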
◮ PCA fails when λ ≤ 1, but is it still possible to recover the signal?
◮ When λ > 1, is PCA optimal?
◮ More generally, what is the best achievable estimation performance in both regimes?
Goal
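$$\mathrm{MMSE}_n = \min_{\hat{\theta}} \frac{1}{n^2}\,\mathbb{E}\Big[\sum_{i,j} \big(X_i X_j - \hat{\theta}_{i,j}(Y)\big)^2\Big] = \frac{1}{n^2}\,\mathbb{E}\Big[\sum_{i,j} \big(X_i X_j - \mathbb{E}[X_i X_j \mid Y]\big)^2\Big] \le \mathbb{E}[X^2]^2$$
The upper bound $\mathbb{E}[X^2]^2$ is the “Dummy MSE”.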
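Information-theoretic threshold
Compute $\lim_{n\to\infty} \mathrm{MMSE}_n$ and deduce the critical value $\lambda_c$ such that
◮ if $\lambda > \lambda_c$, $\lim_{n\to\infty} \mathrm{MMSE}_n < \text{Dummy MSE}$;
◮ if $\lambda < \lambda_c$, $\lim_{n\to\infty} \mathrm{MMSE}_n = \text{Dummy MSE}$.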
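The value of the Dummy MSE can be checked by hand. The sketch below assumes that only pairs $i \neq j$ matter asymptotically and that the "dummy" estimator is the constant $\hat{\theta}_{i,j} = \mathbb{E}[X_i X_j]$, which ignores $Y$ entirely; both points are assumptions of this illustration.

```latex
\begin{align*}
\mathbb{E}[X_i X_j] &= \mathbb{E}[X_i]\,\mathbb{E}[X_j] = 0
  && (i \neq j,\ \text{independence},\ \mathbb{E}X_1 = 0), \\
\mathbb{E}\big[(X_i X_j - 0)^2\big] &= \mathbb{E}[X_i^2]\,\mathbb{E}[X_j^2] = \mathbb{E}[X^2]^2 = 1
  && (\mathbb{E}X_1^2 = 1).
\end{align*}
```

Since the posterior mean minimizes the squared error, $\mathrm{MMSE}_n$ cannot exceed this value in the limit; the threshold $\lambda_c$ marks where it becomes strictly smaller.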
A planted spin glass model
◮ Compute the MMSE for $Y = \sqrt{\lambda/n}\, XX^{\top} + Z$.
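◮ Study the posterior $P(x \mid Y) = \frac{1}{Z_n} P_0(x) \exp(H_n(x))$, where
$$H_n(x) = \sum_{i<j} \sqrt{\frac{\lambda}{n}}\, Y_{i,j} x_i x_j - \frac{\lambda}{2n} x_i^2 x_j^2 = \sum_{i<j} \sqrt{\frac{\lambda}{n}}\, Z_{i,j} x_i x_j + \frac{\lambda}{n} X_i X_j x_i x_j - \frac{\lambda}{2n} x_i^2 x_j^2.$$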
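◮ Compute the limit of the free energy $F_n = \frac{1}{n}\mathbb{E}\log Z_n$, because
$$\text{Constant} - F_n = \frac{1}{n} I(X; Y) \xrightarrow{\ \partial_\lambda\ } \mathrm{MMSE},$$
that is, differentiating the mutual information with respect to $\lambda$ gives access to the MMSE.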
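A short derivation sketch of where the Hamiltonian $H_n$ comes from, assuming that the entries $Y_{i,j}$ with $i < j$ are the observations, that $P_0(x)$ denotes the product prior on the vector $x$, and applying Bayes' rule with the Gaussian likelihood:

```latex
\begin{align*}
P(x \mid Y)
 &\propto P_0(x) \prod_{i<j}
    \exp\Big(-\tfrac{1}{2}\big(Y_{i,j} - \sqrt{\tfrac{\lambda}{n}}\, x_i x_j\big)^2\Big) \\
 &\propto P_0(x)
    \exp\Big(\sum_{i<j} \sqrt{\tfrac{\lambda}{n}}\, Y_{i,j}\, x_i x_j
             - \tfrac{\lambda}{2n}\, x_i^2 x_j^2\Big)
  = P_0(x)\, e^{H_n(x)} .
\end{align*}
```

The term $-Y_{i,j}^2/2$ does not depend on $x$ and is absorbed into the normalization $Z_n$. Substituting $Y_{i,j} = \sqrt{\lambda/n}\, X_i X_j + Z_{i,j}$ then splits $H_n$ into a pure-noise part and a "planted" part involving $X_i X_j x_i x_j$.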
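The scalar channel
Lesieur et al., 2015 conjectured that the problem is characterized by the scalar channel
$$Y_0 = \sqrt{\gamma}\, X_0 + Z_0$$
(with $X_0 \sim P_0$ and $Z_0 \sim \mathcal{N}(0, 1)$ independent) and the scalar free energy
$$F(\gamma) = \mathbb{E} \log \sum_{x_0} P_0(x_0)\, e^{\sqrt{\gamma}\, Y_0 x_0 - \frac{\gamma}{2} x_0^2}.$$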
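$$F_n \xrightarrow[n\to\infty]{} \sup_{q \ge 0}\; F(\lambda q) - \frac{\lambda}{4} q^2, \qquad \mathrm{MMSE}_n \xrightarrow[n\to\infty]{} \mathbb{E}_{P_0}[X^2]^2 - q^*(\lambda)^2,$$
where $q^*(\lambda)$ denotes a maximizer of the supremum.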
Proved by Barbier et al., 2016, extended by Lelarge and Miolane, 2016.
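The limiting formula can be evaluated numerically. The following sketch is only an illustration under stated assumptions: P0 is a two-point, zero-mean, unit-variance prior (so that E[X²]² = 1), the Gaussian expectation is approximated by Gauss-Hermite quadrature, and the supremum over q is replaced by a grid search on [0, 1]. The names `scalar_free_energy` and `limiting_mmse` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def scalar_free_energy(gamma, p=0.5, deg=64):
    """F(gamma) = E log sum_x P0(x) exp(sqrt(gamma) Y0 x - gamma x^2 / 2)."""
    # Assumed two-point prior with zero mean and unit variance.
    atoms = np.array([np.sqrt((1 - p) / p), -np.sqrt(p / (1 - p))])
    probs = np.array([p, 1.0 - p])
    z, w = hermegauss(deg)          # nodes/weights for the weight exp(-z^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)    # normalize so that sum(w * f(z)) ~ E f(Z0)
    total = 0.0
    for x0, p0 in zip(atoms, probs):            # average over the planted X0
        y0 = np.sqrt(gamma) * x0 + z            # scalar channel output
        h = np.sqrt(gamma) * np.outer(y0, atoms) - 0.5 * gamma * atoms**2
        # Stable log of the two-term sum over candidate values x.
        log_z = np.logaddexp(np.log(probs[0]) + h[:, 0],
                             np.log(probs[1]) + h[:, 1])
        total += p0 * np.dot(w, log_z)
    return total

def limiting_mmse(lam, p=0.5, grid=np.linspace(0.0, 1.0, 401)):
    """Grid-search the potential F(lam * q) - lam * q^2 / 4 over q."""
    potential = np.array([scalar_free_energy(lam * q, p) - lam * q**2 / 4.0
                          for q in grid])
    q_star = grid[np.argmax(potential)]
    return 1.0 - q_star**2, q_star   # uses E[X^2]^2 = 1 (unit-variance prior)

for lam in (0.5, 2.0):
    mmse, q_star = limiting_mmse(lam)
    print(f"lambda={lam}: q*={q_star:.3f}, limiting MMSE={mmse:.3f}")
```

A grid search is used instead of a fixed-point iteration because the potential may have several local maxima, and the formula calls for the global one.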
◮ We will plot the MMSE and $\mathrm{MSE}_{\mathrm{PCA}}$ curves when $P_0$ is of the form
$$P_0\Big(\sqrt{\tfrac{1-p}{p}}\Big) = p, \qquad P_0\Big(-\sqrt{\tfrac{p}{1-p}}\Big) = 1 - p,$$
for some $p \in (0, 1)$.
◮ One can show that the corresponding matrix estimation problem is, in some sense, equivalent to the community detection problem with 2 asymmetric communities.
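As a quick sanity check, the two-point prior with atoms √((1 − p)/p) and −√(p/(1 − p)), taken with probabilities p and 1 − p, has zero mean and unit variance, as the model requires. The helper name `two_point_prior` below is hypothetical.

```python
import numpy as np

def two_point_prior(p):
    """Atoms and probabilities of the asymmetric two-point prior."""
    atoms = np.array([np.sqrt((1 - p) / p), -np.sqrt(p / (1 - p))])
    probs = np.array([p, 1.0 - p])
    return atoms, probs

atoms, probs = two_point_prior(0.05)
mean = probs @ atoms                 # first moment E[X]
second_moment = probs @ atoms**2     # second moment E[X^2]
print(mean, second_moment)
```

Both moments reduce algebraically to √(p(1 − p)) − √(p(1 − p)) = 0 and (1 − p) + p = 1, so the check holds for any p in (0, 1) up to floating-point rounding.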
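[Figure: MMSE, $\mathrm{MSE}_{\mathrm{AMP}}$ and $\mathrm{MSE}_{\mathrm{PCA}}$ plotted against λ (axis ticks from 0.25 to 2.00 for λ, from 0.0 to 1.0 for the error).]
MMSE, $\mathrm{MSE}_{\mathrm{PCA}}$ and $\mathrm{MSE}_{\mathrm{AMP}}$, asymmetric SBM: p = 0.05.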
“Free energy landscape”, p = 0.05, λ = 0.63.
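[Figure: phase diagram in the (λ, p) plane, with axis ticks from 0.2 to 1.2 for λ and from 0.1 to 0.5 for p; curves labeled K-S, λsp and λc, point p∗; regions EASY, HARD and IMPOSSIBLE.]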
Phase diagram from Caltagirone et al., 2016
◮ Baik, Jinho, Gérard Ben Arous, and Sandrine Péché (2005). “Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices”. In: Annals of Probability, pp. 1643–1697.
◮ Barbier, Jean et al. (2016). “Mutual information for symmetric rank-one matrix estimation: A proof of the replica formula”. In: Advances in Neural Information Processing Systems, pp. 424–432.
◮ Benaych-Georges, Florent and Raj Rao Nadakuditi (2011). “The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices”. In: Advances in Mathematics 227.1, pp. 494–521.
◮ Caltagirone, Francesco, Marc Lelarge, and Léo Miolane (2016). “Recovering asymmetric communities in the stochastic block model”. In: arXiv preprint arXiv:1610.03680.
◮ Lelarge, Marc and Léo Miolane (2016). “Fundamental limits of symmetric low-rank matrix estimation”. In: arXiv preprint arXiv:1611.03888.
◮ Lesieur, Thibault, Florent Krzakala, and Lenka Zdeborová (2015). “MMSE [...] Control, and Computing, Allerton 2015, Allerton Park & Retreat Center, Monticello, IL, USA, September 29 - October 2, 2015. IEEE, pp. 680–687. isbn: 978-1-5090-1824-6. doi: 10.1109/ALLERTON.2015.7447070. url: http://dx.doi.org/10.1109/ALLERTON.2015.7447070.