Poster #46: How Much Restricted Isometry is Needed in Nonconvex Matrix Recovery?


Wed Dec 5th, 5-7 PM @ Room 210 & 230 AB. Neural Information Processing Systems 2018. Poster #46: How Much Restricted Isometry is Needed in Nonconvex Matrix Recovery? Richard Y. Zhang, Cédric Josz, Somayeh Sojoudi, Javad Lavaei.


SLIDE 1

How Much Restricted Isometry is Needed in Nonconvex Matrix Recovery?

Cédric Josz, Javad Lavaei, Somayeh Sojoudi, Richard Y. Zhang

Neural Information Processing Systems 2018

Wed Dec 5th 5 - 7 PM @ Room 210 & 230 AB

Poster #46

SLIDE 2

Nonconvex matrix recovery (Burer & Monteiro 2003)

  • Recommendation engines
  • Phase retrieval
  • Power system state estimation
  • Cluster analysis

SLIDE 3

Nonconvex matrix recovery (Burer & Monteiro 2003)

  1. Express low-rank matrix as product of factors

[Figure labels: table of movie ratings; movie genres; user preferences]

SLIDE 4

Nonconvex matrix recovery (Burer & Monteiro 2003)

  2. Minimize least-squares loss of linear model

[Figure labels: specific table elements; known ratings; table of movie ratings]
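The two steps above (factor the low-rank matrix, then minimize a least-squares loss over the known entries) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the 3×3 ratings table, the rank-1 factors `u` (user preferences) and `v` (movie genres), the choice of known entries, the step size, and the iteration count are all made up for the example.

```python
# Minimal sketch (assumed toy data): rank-1 factorization of a ratings table,
# fitted by gradient descent on a least-squares loss over the known entries.

# Hypothetical ground-truth factors, used only to generate the known ratings.
U_TRUE = [1.0, 2.0, 0.5]   # one "preference" number per user
V_TRUE = [2.0, 1.0, 1.5]   # one "genre" number per movie

# Known ratings: (user, movie) -> rating. All other table entries are unobserved.
KNOWN = {(i, j): U_TRUE[i] * V_TRUE[j]
         for (i, j) in [(0, 0), (0, 1), (1, 1), (1, 2), (2, 0), (2, 2)]}


def loss(u, v):
    """Sum of squared errors between the model u[i] * v[j] and the known ratings."""
    return sum((u[i] * v[j] - r) ** 2 for (i, j), r in KNOWN.items())


def gradient_step(u, v, lr):
    """One gradient-descent step on loss(u, v)."""
    gu = [0.0] * len(u)
    gv = [0.0] * len(v)
    for (i, j), r in KNOWN.items():
        residual = u[i] * v[j] - r
        gu[i] += 2.0 * residual * v[j]
        gv[j] += 2.0 * residual * u[i]
    new_u = [ui - lr * g for ui, g in zip(u, gu)]
    new_v = [vj - lr * g for vj, g in zip(v, gv)]
    return new_u, new_v


def fit(steps=2000, lr=0.01):
    """Run gradient descent from an arbitrary all-ones starting point."""
    u, v = [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]
    initial = loss(u, v)
    for _ in range(steps):
        u, v = gradient_step(u, v, lr)
    return initial, loss(u, v)


if __name__ == "__main__":
    initial, final = fit()
    print(f"loss before: {initial:.4f}, after: {final:.6f}")
```

The loss is nonconvex in `(u, v)` even though it is a convex least-squares problem in the product, which is why the question of spurious local minima arises at all.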

SLIDE 5

[Figure labels: spurious local minima; global minimum X = UU^T]

SLIDE 6

Exact recovery guarantee (Bhojanapalli et al. 2016)

δ-Restricted isometry property (δ-RIP)

If δ < 1/5, then no spurious local minima.

See also (Ge et al. 2017; Li & Tang 2017; Zhu et al. 2017)
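For reference, the δ-RIP condition is usually stated as below. The notation is an assumption, since the slide text does not spell it out: a linear measurement operator $\mathcal{A}$, a ground truth of rank $r$, and the Frobenius norm on the matrix side.

```latex
% Standard statement of delta-RIP (notation assumed; not reproduced on the slide).
% \mathcal{A} : linear measurement operator, r : rank of the ground truth.
(1-\delta)\,\|X\|_F^2 \;\le\; \|\mathcal{A}(X)\|^2 \;\le\; (1+\delta)\,\|X\|_F^2
\qquad \text{for all } X \text{ with } \operatorname{rank}(X) \le 2r .
```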

SLIDE 7

Exact recovery guarantee (Bhojanapalli et al. 2016)

δ-Restricted isometry property (δ-RIP)

If δ < 1/5, then no spurious local minima.

Local search is guaranteed to succeed.

See also (Ge et al. 2017; Li & Tang 2017; Zhu et al. 2017)

SLIDE 8

Exact recovery guarantee (Ge et al. 2016)

δ-Concentration inequality

If δ very small, then no spurious local minima.

See also (Ge et al. 2015; 2017; Sun et al. 2015; 2016; Park et al. 2017; etc.)

Similar idea drives many proofs

SLIDE 9


If δ < 1/5, then no spurious local min.

SLIDE 10

If δ < 1/5, then no spurious local min.

[Figure labels: = 1.0, = 1.0]

SLIDE 11

If δ < 1/5, then no spurious local min.

[Figure labels: = 1.0, = 1.0; < 1/5, < 1/5]

SLIDE 12

If δ < 1/5, then no spurious local min.

Preserve lengths with <10% distortion

[Figure labels: = 1.0, = 1.0; < 1.1, > 0.9; < 1/5, < 1/5]
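The "<10% distortion" figure follows from δ < 1/5 by taking square roots. The short derivation below assumes the RIP bounds are stated on squared lengths, which is the usual convention but is not shown on the slide:

```latex
% Worked arithmetic, assuming delta-RIP bounds squared lengths.
\|X\|_F = 1,\quad \delta < \tfrac{1}{5}
\;\Longrightarrow\;
0.8 < \|\mathcal{A}(X)\|^2 < 1.2
\;\Longrightarrow\;
\sqrt{0.8} \approx 0.894 \;<\; \|\mathcal{A}(X)\| \;<\; \sqrt{1.2} \approx 1.095 .
```

A unit length is therefore mapped to a length of roughly 0.9 to 1.1, which matches the labels in the figure.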

SLIDE 13

Can this be significantly improved?

If δ < 1/5, then no spurious local min.

[δ axis: ticks at 1/5 and 1; region labels: Good, ???]

No

  • Problem hard.
  • Specific to algorithm.
  • Proof idea is limited.

Yes

  • Problem easy.
  • Agnostic to algorithm.
  • Proof idea is powerful.

SLIDE 14

Can this be significantly improved?

If δ < 1/5, then no spurious local min.

Previous attempts all stuck at 1/5

(Bhojanapalli et al. 2016) (Ge et al. 2017) (Li & Tang 2017) (Zhu et al. 2017) etc.

[δ axis: ticks at 1/5 and 1; region labels: Good, ???]

No

  • Problem hard.
  • Specific to algorithm.
  • Proof idea is limited.

Yes

  • Problem easy.
  • Agnostic to algorithm.
  • Proof idea is powerful.

SLIDE 15

Can this be significantly improved?

If δ < 1/5, then no spurious local min.

Contribution 1. If δ ≥ 1/2, many counterexamples.

[δ axis: ticks at 1/5, 1/2 and 1; region labels: Good, Bad]

No

  • Problem hard.
  • Specific to algorithm.
  • Proof idea is limited.

Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018)

SLIDE 16

Contribution 2. Let rank r = 1. If δ < 1/2, then no spurious local min.

If δ ≥ 1/2, many counterexamples.

[δ axis: ticks at 1/2 and 1; region labels: Good, Bad]

Zhang, Sojoudi, Lavaei. Submitted to JMLR (2018)

SLIDE 17

Counterexample with δ = 1/2

Satisfies ½-RIP.

Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018); Zhang, Sojoudi, Lavaei, Submitted to JMLR (2018)

SLIDE 18

Counterexample with δ = 1/2

Satisfies ½-RIP.

Ground truth z = (1,0)

Spurious local min x = (0,1/√2)

Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018); Zhang, Sojoudi, Lavaei, Submitted to JMLR (2018)

SLIDE 19

Counterexample with δ = 1/2

  • 100,000 trials w/ SGD
  • 87,947 successful
  • 12% failure rate

Satisfies ½-RIP.

Ground truth z = (1,0)

Spurious local min x = (0,1/√2)

Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018); Zhang, Sojoudi, Lavaei, Submitted to JMLR (2018)

SLIDE 20

Counterexample with δ = 1/2

  • 100,000 trials w/ SGD
  • 87,947 successful
  • 12% failure rate

Satisfies ½-RIP.

Ground truth z = (1,0)

Spurious local min x = (0,1/√2)

Generalization to arbitrary rank-1 ground truth

Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018); Zhang, Sojoudi, Lavaei, Submitted to JMLR (2018)
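The measurement matrices of the counterexample are not reproduced in the slide text, so the sketch below uses a hypothetical set of three 2×2 symmetric matrices, chosen for illustration, to show how such a claim can be checked numerically. With rank r = 1 the RIP condition covers all 2×2 symmetric matrices, so it can be probed by sampling. In this particular instance the Hessian at x is positive semidefinite with a zero eigenvalue, so the check confirms the first- and second-order necessary conditions rather than strict local minimality.

```python
import math
import random

# Hypothetical 2x2 symmetric measurement matrices (an assumption for this sketch;
# the instance shown on the slide is not reproduced in the text).
S = math.sqrt(3.0) / 2.0
A = [
    [[1.0, 0.0], [0.0, 0.5]],
    [[0.0, S], [S, 0.0]],
    [[0.0, 0.0], [0.0, S]],
]
Z = (1.0, 0.0)                        # ground truth from the slide
X_BAD = (0.0, 1.0 / math.sqrt(2.0))   # candidate spurious point from the slide


def measure(M):
    """Apply the linear operator: one inner product <A_i, M> per measurement."""
    return [sum(Ai[p][q] * M[p][q] for p in range(2) for q in range(2)) for Ai in A]


def outer(x):
    """Rank-1 matrix x x^T."""
    return [[x[0] * x[0], x[0] * x[1]], [x[1] * x[0], x[1] * x[1]]]


def f(x):
    """Least-squares loss ||A(x x^T) - A(z z^T)||^2."""
    mx, mz = measure(outer(x)), measure(outer(Z))
    return sum((a - b) ** 2 for a, b in zip(mx, mz))


def rip_ratio(M):
    """||A(M)||^2 / ||M||_F^2 for a nonzero symmetric 2x2 matrix M."""
    num = sum(v * v for v in measure(M))
    den = sum(M[p][q] ** 2 for p in range(2) for q in range(2))
    return num / den


def gradient(x, h=1e-6):
    """Central finite-difference gradient of f."""
    g = []
    for k in range(2):
        xp = [x[0], x[1]]
        xm = [x[0], x[1]]
        xp[k] += h
        xm[k] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g


def hessian(x, h=1e-3):
    """Finite-difference Hessian of f (2x2, symmetric)."""
    def shifted(dx, dy):
        return f((x[0] + dx, x[1] + dy))
    f0 = shifted(0.0, 0.0)
    h11 = (shifted(h, 0.0) - 2.0 * f0 + shifted(-h, 0.0)) / h ** 2
    h22 = (shifted(0.0, h) - 2.0 * f0 + shifted(0.0, -h)) / h ** 2
    h12 = (shifted(h, h) - shifted(h, -h)
           - shifted(-h, h) + shifted(-h, -h)) / (4.0 * h ** 2)
    return [[h11, h12], [h12, h22]]


def min_eigenvalue(H):
    """Smallest eigenvalue of a symmetric 2x2 matrix (closed form)."""
    mean = 0.5 * (H[0][0] + H[1][1])
    radius = math.hypot(0.5 * (H[0][0] - H[1][1]), H[0][1])
    return mean - radius


if __name__ == "__main__":
    rng = random.Random(0)
    ratios = []
    for _ in range(10_000):
        a, b, c = (rng.uniform(-1.0, 1.0) for _ in range(3))
        ratios.append(rip_ratio([[a, b], [b, c]]))
    print("sampled RIP ratio range:", min(ratios), max(ratios))
    print("f(z) =", f(Z), " f(x_bad) =", f(X_BAD))
    print("gradient at x_bad:", gradient(X_BAD))
    print("smallest Hessian eigenvalue at x_bad:", min_eigenvalue(hessian(X_BAD)))
```

For this assumed operator the ratio ||A(M)||² / ||M||² stays within [1/2, 3/2], which is the 1/2-RIP condition, while the loss at x is strictly larger than the loss at the ground truth.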

SLIDE 21

Proof idea. Counterexamples via convex optimization


Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018)

Key insight. Relax into a semidefinite program
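One plausible way to write such a program is sketched below; it is an assumption, since the slide gives only the headline. Fix the pair (x, z), search over the kernel $H = \mathbf{A}^T\mathbf{A}$ of the measurement operator, impose the first- and second-order optimality conditions at x (both are linear in H), and replace the δ-RIP constraint by eigenvalue bounds on H, which are convex and imply δ-RIP. The symbol H is an illustrative name.

```latex
% Sketch under stated assumptions; f(x) = e^T H e with e = vec(x x^T - z z^T).
\begin{aligned}
\min_{\delta,\;H}\quad & \delta \\
\text{s.t.}\quad & \nabla f(x) = 0 && \text{(linear equation in } H\text{)}, \\
& \nabla^2 f(x) \succeq 0 && \text{(linear matrix inequality in } H\text{)}, \\
& (1-\delta)\,I \preceq H \preceq (1+\delta)\,I .
\end{aligned}
```

Any feasible pair (δ, H) then yields a measurement operator that satisfies δ-RIP and has x as a second-order critical point for ground truth z.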

SLIDE 22

Main Result 1. Counterexamples are almost everywhere

Theorem 1 (Zhang, Josz, Sojoudi, Lavaei 2018). Given x, z nonzero and not collinear, there exists a counterexample that

  • satisfies δ-RIP with 1/2 ≤ δ < 1
  • has z as ground truth
  • has x as spurious local min.

Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018)

SLIDE 23

Main Result 1. Counterexamples are almost everywhere

Theorem 1 (Zhang, Josz, Sojoudi, Lavaei 2018). Given x, z nonzero and not collinear, there exists a counterexample that

  • satisfies δ-RIP with 1/2 ≤ δ < 1
  • has z as ground truth
  • has x as spurious local min.

Take-away. If δ-RIP with δ ≥ 1/2, then expect spurious local minima.

Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018)

SLIDE 24


Conjecture (Zhang, Josz, Sojoudi, Lavaei 2018). If δ-RIP with δ < 1/2, then no spurious local min.

Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018)

SLIDE 25

Main Result 2. Sharp RIP-based guarantee

Theorem 2 (Zhang, Sojoudi, Lavaei 2018). If δ-RIP with δ < 1/2 and r = 1, then no spurious local min.

Proof for rank-1 case

Zhang, Sojoudi, Lavaei. Submitted to JMLR (2018)

SLIDE 26

Theorem 2 (Zhang, Sojoudi, Lavaei 2018). If δ-RIP with δ < 1/2 and r = 1, then no spurious local min.

Ongoing work. Generalization to rank-r

SLIDE 27

Practical implications?

δ-RIP with 1/2 ≤ δ < 1

SLIDE 28

“Engineered” spurious local minimum: x_bad

Global minimum: x_good

Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018)

SLIDE 29

[Figure labels: x_bad, x_good]

  1. Select γ in [0,1]
  2. Start SGD at x_init = (1-γ) x_bad + γ Gaussian
  3. Make 10k SGD steps, measure error: error = | x_final - x_good |

Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018)
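
The experiment protocol on this slide (pick γ, start SGD at a blend of x_bad and a Gaussian sample, run 10k steps, measure the error) can be sketched as below. Everything specific is an assumption made for illustration: the three measurement matrices, the values of x_bad and x_good (borrowed from the δ = 1/2 counterexample slides), the unit-variance Gaussian, the step size 0.01, and the convention of measuring error up to the sign of x, since x and -x give the same matrix xx^T.

```python
import math
import random

# Hypothetical instance (assumption): three 2x2 symmetric measurement matrices.
S = math.sqrt(3.0) / 2.0
A = [
    [[1.0, 0.0], [0.0, 0.5]],
    [[0.0, S], [S, 0.0]],
    [[0.0, 0.0], [0.0, S]],
]
X_GOOD = (1.0, 0.0)                   # assumed global minimum (ground truth)
X_BAD = (0.0, 1.0 / math.sqrt(2.0))   # assumed spurious point


def quad(Ai, x):
    """<A_i, x x^T> = x^T A_i x."""
    return sum(Ai[p][q] * x[p] * x[q] for p in range(2) for q in range(2))


# Noiseless measurements of the ground truth.
B = [quad(Ai, X_GOOD) for Ai in A]


def sgd(x, steps, lr, rng):
    """Plain SGD: each step uses the gradient of one randomly chosen squared residual."""
    x = list(x)
    for _ in range(steps):
        i = rng.randrange(len(A))
        residual = quad(A[i], x) - B[i]
        # For symmetric A_i, the gradient of residual^2 is 4 * residual * (A_i x).
        ax = [A[i][p][0] * x[0] + A[i][p][1] * x[1] for p in range(2)]
        x = [x[p] - lr * 4.0 * residual * ax[p] for p in range(2)]
    return x


def error(x):
    """Distance to x_good, up to the sign ambiguity x vs. -x (an added convention)."""
    d_plus = math.hypot(x[0] - X_GOOD[0], x[1] - X_GOOD[1])
    d_minus = math.hypot(x[0] + X_GOOD[0], x[1] + X_GOOD[1])
    return min(d_plus, d_minus)


def run_trial(gamma, rng, steps=10_000, lr=0.01):
    """Steps 1-3: given gamma, start at (1 - gamma) x_bad + gamma * Gaussian, run SGD."""
    w = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    x_init = [(1.0 - gamma) * X_BAD[p] + gamma * w[p] for p in range(2)]
    return error(sgd(x_init, steps, lr, rng))


if __name__ == "__main__":
    rng = random.Random(0)
    for gamma in (0.0, 0.5, 1.0):
        errors = sorted(run_trial(gamma, rng) for _ in range(20))
        print(f"gamma={gamma}: min={errors[0]:.3f}, "
              f"median={errors[len(errors) // 2]:.3f}, max={errors[-1]:.3f}")
```

Sweeping γ from 0 to 1 and recording order statistics of the error over many trials gives the kind of summary (max, 95%, median, 5%, min) reported for the experiments. In this assumed instance, a start exactly at x_bad keeps the first coordinate at zero, so SGD cannot reach x_good from γ = 0.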

SLIDE 30

Example 1: 100% success rate

[Plot: error (1k trials) versus γ, from x_init = x_bad to x_init ~ Gaussian; curves: max, 95%, median, 5%, min]

Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018)

SLIDE 31

Example 1: 100% success rate

[Plot: error (1k trials) versus γ, from x_init = x_bad to x_init ~ Gaussian; curves: max, 95%, median, 5%, min; annotation: 100% failure]

Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018)

SLIDE 32

Example 1: 100% success rate

[Plot: error (1k trials) versus γ, from x_init = x_bad to x_init ~ Gaussian; curves: max, 95%, median, 5%, min; annotations: 100% failure, 100% success]

Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018)

SLIDE 33

Example 2: <95% success rate

[Plot: error (1k trials) versus γ, from x_init = x_bad to x_init = w; curves: max, 95%, median, 5%, min; annotations: 100% failure, >5% failure]

Zhang, Josz, Sojoudi, Lavaei, NeurIPS (2018)

SLIDE 34

Practical implications?

δ-RIP with 1/2 ≤ δ < 1

Limitations of “no spurious local min” guarantees

[Diagram labels: 100% success; no spurious local min; spurious local min; > 0% failure]

SLIDE 35

How Much Restricted Isometry is Needed in Nonconvex Matrix Recovery?

If δ ≥ 1/2, many counterexamples.

[δ axis: ticks at 1/5, 1/2 and 1; region labels: Good, Bad]

R.Y. Zhang, C. Josz, S. Sojoudi, J. Lavaei, NeurIPS (2018)

Wed Dec 5th 5 - 7 PM @ Room 210 & 230 AB

Poster #46

SLIDE 36

How Much Restricted Isometry is Needed in Nonconvex Matrix Recovery?

If δ < 1/2, then no spurious local min (?)

If δ ≥ 1/2, many counterexamples.

[δ axis: ticks at 1/2 and 1; region labels: Good, Bad]

R.Y. Zhang, C. Josz, S. Sojoudi, J. Lavaei, NeurIPS (2018)

Wed Dec 5th 5 - 7 PM @ Room 210 & 230 AB

Poster #46

SLIDE 37

How Much Restricted Isometry is Needed in Nonconvex Matrix Recovery?

If δ < 1/2, then no spurious local min (?)

If δ ≥ 1/2, many counterexamples.

[δ axis: ticks at 1/2 and 1; region labels: Good, Bad]

R.Y. Zhang, C. Josz, S. Sojoudi, J. Lavaei, NeurIPS (2018)

Wed Dec 5th 5 - 7 PM @ Room 210 & 230 AB

Poster #46

Limitations of “no spurious local min” guarantees