Phase transitions in low-rank matrix estimation, May 11, 2017

SLIDE 1

Phase transitions in low-rank matrix estimation

May 11, 2017

Marc Lelarge & Léo Miolane

INRIA, ENS

1 / 14

SLIDE 2

Introduction

The statistical model

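“Spiked Wigner” model

$$\underbrace{Y}_{\text{observations}} = \underbrace{\sqrt{\frac{\lambda}{n}}\, XX^\intercal}_{\text{signal}} + \underbrace{Z}_{\text{noise}}$$

◮ $X$: vector of dimension $n$ with entries $X_i \overset{\text{i.i.d.}}{\sim} P_0$, with $E[X_1] = 0$ and $E[X_1^2] = 1$.

◮ $Z_{i,j} = Z_{j,i} \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, 1)$.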

◮ λ: signal-to-noise ratio.

Goal: recover the low-rank matrix XX⊺ from Y.
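
As an illustration of the model above, here is a minimal sketch that draws one sample (X, Y). The Rademacher prior, the function name `sample_spiked_wigner` and the parameter values are assumptions made for the example; the slides only require a prior with mean 0 and variance 1.

```python
import numpy as np

def sample_spiked_wigner(n, lam, rng, sample_prior=None):
    """Draw (X, Y) with Y = sqrt(lam / n) * X X^T + Z."""
    if sample_prior is None:
        # Assumption: Rademacher prior (+1 or -1 with probability 1/2),
        # which has mean 0 and variance 1 as the model requires.
        sample_prior = lambda size: rng.choice([-1.0, 1.0], size=size)
    X = sample_prior(n)
    # Symmetric noise: independent N(0, 1) entries on and above the diagonal,
    # mirrored below it so that Z[i, j] == Z[j, i].
    G = rng.standard_normal((n, n))
    Z = np.triu(G) + np.triu(G, k=1).T
    Y = np.sqrt(lam / n) * np.outer(X, X) + Z
    return X, Y

rng = np.random.default_rng(0)
X, Y = sample_spiked_wigner(n=500, lam=2.0, rng=rng)
print(Y.shape, np.allclose(Y, Y.T))
```

Passing a different `sample_prior` callable (any function returning `size` i.i.d. draws) swaps in another prior without touching the noise construction.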

2 / 14

SLIDE 4

Principal component analysis (PCA)

B.B.P. phase transition

◮ The matrix $Y/\sqrt{n} = \sqrt{\lambda}\, XX^\intercal / n + Z/\sqrt{n}$ is a perturbed low-rank matrix.

◮ Estimate $X$ using the eigenvector $\hat{x}_n$ associated with the largest eigenvalue $\mu_n$ of $Y/\sqrt{n}$.

[Figure: spectral density of the signal and limiting spectral density of the noise]

3 / 14

SLIDE 5

Principal component analysis (PCA)

B.B.P. phase transition

◮ if $\lambda \le 1$: $\mu_n \to 2$ and $X \cdot \hat{x}_n \to 0$.

◮ if $\lambda > 1$: $\mu_n \to \sqrt{\lambda} + \frac{1}{\sqrt{\lambda}} > 2$ and $|X \cdot \hat{x}_n| \to \sqrt{1 - 1/\lambda} > 0$.

Baik et al., 2005; Benaych-Georges and Nadakuditi, 2011
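
The transition can be checked numerically with a short simulation. The sketch below is only illustrative: the Rademacher prior, the helper name `pca_overlap`, the matrix size and the choice to normalise X to unit length before taking the inner product are all assumptions, not details given in the slides.

```python
import numpy as np

def pca_overlap(n, lam, rng):
    """Return the top eigenvalue of Y/sqrt(n) and the overlap |X . x_hat|."""
    X = rng.choice([-1.0, 1.0], size=n)          # assumed Rademacher prior
    G = rng.standard_normal((n, n))
    Z = np.triu(G) + np.triu(G, k=1).T           # symmetric Gaussian noise
    M = np.sqrt(lam) * np.outer(X, X) / n + Z / np.sqrt(n)   # this is Y / sqrt(n)
    eigvals, eigvecs = np.linalg.eigh(M)         # eigenvalues in ascending order
    mu_n = eigvals[-1]
    x_hat = eigvecs[:, -1]                       # unit-norm top eigenvector
    # Assumption: X is rescaled to unit norm so the overlap lies in [0, 1].
    overlap = abs(X @ x_hat) / np.linalg.norm(X)
    return mu_n, overlap

rng = np.random.default_rng(0)
for lam in (0.5, 4.0):
    mu_n, overlap = pca_overlap(n=800, lam=lam, rng=rng)
    if lam > 1:
        mu_pred = np.sqrt(lam) + 1 / np.sqrt(lam)
        overlap_pred = np.sqrt(1 - 1 / lam)
    else:
        mu_pred, overlap_pred = 2.0, 0.0
    print(f"lam={lam}: mu_n={mu_n:.3f} (limit {mu_pred:.3f}), "
          f"overlap={overlap:.3f} (limit {overlap_pred:.3f})")
```

At finite n the values only approximate the limits, so small deviations are expected, especially near λ = 1.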

3 / 14

SLIDE 8

Questions

◮ PCA fails when λ ≤ 1, but is it still possible to recover the signal?

◮ When λ > 1, is PCA optimal?

◮ More generally, what is the best achievable estimation performance in both regimes?

4 / 14

SLIDE 10

MMSE and information-theoretic threshold

Goal

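$$\mathrm{MMSE}_n = \min_{\hat{\theta}} \frac{1}{n^2} E\Big[\big\| XX^\intercal - \hat{\theta}(Y) \big\|^2\Big] = \frac{1}{n^2} \sum_{1 \le i,j \le n} E\Big[\big(X_i X_j - E[X_i X_j \mid Y]\big)^2\Big] \le \underbrace{E[X^2]^2}_{\text{Dummy MSE}}$$

Information-theoretic threshold

1. Compute $\lim_{n \to \infty} \mathrm{MMSE}_n$.

2. Deduce the information-theoretic threshold, i.e. the critical value $\lambda_c$ such that

◮ if $\lambda > \lambda_c$, $\lim_{n \to \infty} \mathrm{MMSE}_n < \text{Dummy MSE}$

◮ if $\lambda < \lambda_c$, $\lim_{n \to \infty} \mathrm{MMSE}_n = \text{Dummy MSE}$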

5 / 14

SLIDE 13

Connection with statistical physics

A planted spin glass model

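◮ Compute the MMSE for $Y = \sqrt{\frac{\lambda}{n}}\, XX^\intercal + Z$.

◮ Study the posterior $P(x \mid Y) = \frac{1}{Z_n} P_0(x) \exp(H_n(x))$, where

$$H_n(x) = \sum_{i<j} \sqrt{\frac{\lambda}{n}}\, Y_{i,j} x_i x_j - \frac{\lambda}{2n} x_i^2 x_j^2 = \sum_{i<j} \underbrace{\sqrt{\frac{\lambda}{n}}\, Z_{i,j} x_i x_j}_{\text{SK}} + \underbrace{\frac{\lambda}{n} X_i X_j x_i x_j - \frac{\lambda}{2n} x_i^2 x_j^2}_{\text{planted solution}}$$

◮ Compute the limit of the free energy $F_n = \frac{1}{n} E \log Z_n$, because

$$\text{Constant} - F_n = \frac{1}{n} I(X; Y) \xrightarrow{\ \partial_\lambda\ } \mathrm{MMSE}$$

that is, differentiating the mutual information with respect to $\lambda$ gives the MMSE (up to a constant factor).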
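
For very small n the posterior can be evaluated exactly by enumeration, which makes the role of the Hamiltonian concrete. The following is a sketch under assumptions: a Rademacher prior (so the prior weight is uniform over the 2^n sign vectors), a hypothetical helper `posterior_pair_means`, and n = 10, far from the large-n regime the slides study.

```python
import itertools
import numpy as np

def posterior_pair_means(Y, lam):
    """Exact E[x_i x_j | Y] for a Rademacher prior by enumerating {-1, +1}^n."""
    n = Y.shape[0]
    iu = np.triu_indices(n, k=1)                       # index pairs with i < j
    configs = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))
    pair = configs[:, iu[0]] * configs[:, iu[1]]       # x_i x_j for every i < j
    # H_n(x) = sum_{i<j} sqrt(lam/n) Y_ij x_i x_j - lam/(2n) x_i^2 x_j^2
    H = pair @ (np.sqrt(lam / n) * Y[iu]) - lam / (2 * n) * (pair ** 2).sum(axis=1)
    w = np.exp(H - H.max())      # uniform prior cancels in the normalisation
    w /= w.sum()                 # posterior weights; the sum plays the role of Z_n
    means = np.zeros((n, n))
    means[iu] = w @ pair         # posterior mean of x_i x_j for i < j
    means = means + means.T
    np.fill_diagonal(means, 1.0) # x_i^2 = 1 under the Rademacher prior
    return w, means

rng = np.random.default_rng(0)
n, lam = 10, 3.0
X = rng.choice([-1.0, 1.0], size=n)
G = rng.standard_normal((n, n))
Z = np.triu(G) + np.triu(G, k=1).T
Y = np.sqrt(lam / n) * np.outer(X, X) + Z
w, means = posterior_pair_means(Y, lam)
mse = np.mean((np.outer(X, X) - means) ** 2)   # squared error of the posterior mean
print(f"squared error on this sample: {mse:.3f}")
```

The enumeration costs 2^n evaluations, so this only serves to check the formulas on toy instances.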

6 / 14

SLIDE 15

Replica symmetric formula

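The scalar channel

Lesieur et al., 2015 conjectured that the problem is characterized by the scalar channel

$$Y_0 = \sqrt{\gamma}\, X_0 + Z_0$$

and the scalar free energy

$$F(\gamma) = E\Big[\log \sum_{x_0} P_0(x_0)\, e^{\sqrt{\gamma}\, Y_0 x_0 - \frac{\gamma}{2} x_0^2}\Big].$$

Replica symmetric formula

$$F_n \xrightarrow[n \to \infty]{} \sup_{q \ge 0}\, F(\lambda q) - \frac{\lambda}{4} q^2, \qquad \mathrm{MMSE}_n \xrightarrow[n \to \infty]{} E_{P_0}[X^2]^2 - q^*(\lambda)^2,$$

where $q^*(\lambda)$ denotes the maximizer of the supremum on the left.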

Proved by Barbier et al., 2016, extended by Lelarge and Miolane, 2016.
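
To make the formula concrete, the sketch below evaluates it numerically for one specific prior. Everything beyond the formula itself is an assumption of the example: the Rademacher prior (for which the sum over x_0 reduces to a cosh and E[X^2]^2 = 1), Gauss-Hermite quadrature for the Gaussian expectation, and a plain grid search over q in [0, 1] instead of a proper optimiser.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Gauss-Hermite nodes and weights (weight function exp(-z^2 / 2)),
# normalised so that WEIGHTS @ g(NODES) approximates E[g(Z)] for Z ~ N(0, 1).
NODES, WEIGHTS = hermegauss(80)
WEIGHTS = WEIGHTS / WEIGHTS.sum()

def scalar_free_energy(gamma):
    """F(gamma) for the Rademacher prior.

    With P_0 uniform on {-1, +1}, the inner sum equals
    exp(-gamma / 2) * cosh(sqrt(gamma) * Y_0), and by symmetry one may take X_0 = 1,
    giving F(gamma) = E[log cosh(gamma + sqrt(gamma) Z)] - gamma / 2.
    """
    t = gamma + np.sqrt(gamma) * NODES
    log_cosh = np.logaddexp(t, -t) - np.log(2.0)   # numerically stable log cosh
    return WEIGHTS @ log_cosh - gamma / 2.0

def replica_symmetric(lam, grid=np.linspace(0.0, 1.0, 1001)):
    """Grid-search sup_q F(lam * q) - lam * q^2 / 4; return (value, q_star, mmse)."""
    values = np.array([scalar_free_energy(lam * q) - lam * q ** 2 / 4.0 for q in grid])
    k = values.argmax()
    q_star = grid[k]
    return values[k], q_star, 1.0 - q_star ** 2    # E[X^2]^2 = 1 for this prior

for lam in (0.5, 1.5, 2.0):
    value, q_star, mmse = replica_symmetric(lam)
    print(f"lam={lam}: q*={q_star:.3f}, limiting MMSE={mmse:.3f}")
```

For other priors only `scalar_free_energy` and the constant E[X^2]^2 would need to change; the grid resolution limits the accuracy of q*.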

7 / 14

SLIDE 16

Some curves

◮ We will plot the MMSE and $\mathrm{MSE}_{\mathrm{PCA}}$ curves when $P_0$ is of the form

$$P_0\Big(\sqrt{(1 - p)/p}\Big) = p, \qquad P_0\Big(-\sqrt{p/(1 - p)}\Big) = 1 - p$$

for some $p \in (0, 1)$.

◮ One can show that the corresponding matrix estimation problem is, in some sense, equivalent to the community detection problem with 2 asymmetric communities.
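
A quick check that this two-point prior satisfies the model's normalisation (mean 0, second moment 1), together with a sampler. The helper names `two_point_prior` and `sample_prior` are hypothetical and introduced only for this sketch.

```python
import numpy as np

def two_point_prior(p):
    """Atoms and probabilities of the asymmetric two-point prior P_0."""
    atoms = np.array([np.sqrt((1 - p) / p), -np.sqrt(p / (1 - p))])
    probs = np.array([p, 1 - p])
    return atoms, probs

def sample_prior(p, size, rng):
    """Draw `size` i.i.d. samples from P_0."""
    atoms, probs = two_point_prior(p)
    return rng.choice(atoms, size=size, p=probs)

atoms, probs = two_point_prior(0.05)
mean = probs @ atoms                 # p*sqrt((1-p)/p) - (1-p)*sqrt(p/(1-p))
second_moment = probs @ atoms ** 2   # (1-p) + p
print(mean, second_moment)

rng = np.random.default_rng(0)
samples = sample_prior(0.05, size=10, rng=rng)
```

The two terms of the mean both equal sqrt(p(1 - p)) in absolute value and cancel, while the second moment is (1 - p) + p = 1, for every p in (0, 1).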

8 / 14

SLIDE 17

[Figure: MMSE, MSE_AMP and MSE_PCA as functions of λ, asymmetric SBM with p = 0.05.]

9 / 14

SLIDE 18

“Free energy landscape”, p = 0.05, λ = 0.63.

10 / 14

SLIDE 19

[Figure: phase diagram with axes λ and p, showing the curves K-S, λ_sp and λ_c, the point p∗, and the regions EASY, HARD and IMPOSSIBLE.]

Phase diagram from Caltagirone et al., 2016

11 / 14

SLIDE 20

Thank you for your attention.

Any questions?

12 / 14

SLIDE 21

References I

◮ Baik, Jinho, Gérard Ben Arous, and Sandrine Péché (2005). “Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices”. In: Annals of Probability, pp. 1643–1697.

◮ Barbier, Jean et al. (2016). “Mutual information for symmetric rank-one matrix estimation: A proof of the replica formula”. In: Advances in Neural Information Processing Systems, pp. 424–432.

◮ Benaych-Georges, Florent and Raj Rao Nadakuditi (2011). “The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices”. In: Advances in Mathematics 227.1, pp. 494–521.

◮ Caltagirone, Francesco, Marc Lelarge, and Léo Miolane (2016). “Recovering asymmetric communities in the stochastic block model”. In: arXiv preprint arXiv:1610.03680.

◮ Lelarge, Marc and Léo Miolane (2016). “Fundamental limits of symmetric low-rank matrix estimation”. In: arXiv preprint arXiv:1611.03888.

13 / 14

SLIDE 22

References II

◮ Lesieur, Thibault, Florent Krzakala, and Lenka Zdeborová (2015). “MMSE of probabilistic low-rank matrix estimation: Universality with respect to the output channel”. In: 53rd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2015, Allerton Park & Retreat Center, Monticello, IL, USA, September 29 - October 2, 2015. IEEE, pp. 680–687. isbn: 978-1-5090-1824-6. doi: 10.1109/ALLERTON.2015.7447070. url: http://dx.doi.org/10.1109/ALLERTON.2015.7447070.

14 / 14