Homotopy Analysis for Tensor PCA



  1. Homotopy Analysis for Tensor PCA Yuan Deng Duke University Joint work with Anima Anandkumar, Rong Ge, Hossein Mobahi

  2. Non-convex Optimization • Optimizing a smooth function f(x). • [Figure labels: local optimum, global optimum] • How to get rid of local optima?

  3. Gaussian Smoothing • Idea: Smooth the function • convolve with $\mathcal{N}(0, t)$

  4. Gaussian Smoothing • Idea: Smooth the function • convolve with $\mathcal{N}(0, t)$

  5. Gaussian Smoothing • Idea: Smooth the function • convolve with $\mathcal{N}(0, t)$ • Local minimum disappears!

  6. Gaussian Smoothing • Idea: Smooth the function • convolve with $\mathcal{N}(0, t)$ • Local minimum disappears! • Shifted global optimum

  7. Gaussian Smoothing • Idea: Smooth the function • convolve with $\mathcal{N}(0, t)$ • Local minimum disappears! • Shifted global optimum • How to decide how much to smooth? • How to recover the original global optimum?
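A minimal sketch of the smoothing idea in one dimension. The test function `f` below is hypothetical (it does not come from the slides), and `t` is treated as the standard deviation of the Gaussian, which is an assumption about the slides' convention. The smoothed value E[f(x + t z)], z ~ N(0, 1), is computed with Gauss-Hermite quadrature from NumPy.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def f(x):
    # Hypothetical non-convex test function with two local minima.
    return x**4 - 2.0 * x**2 + 0.5 * x


def gaussian_smooth(func, x, t, n_nodes=20):
    """Approximate E[func(x + t*z)] for z ~ N(0, 1) by Gauss-Hermite quadrature."""
    nodes, weights = hermegauss(n_nodes)      # weight function exp(-z^2 / 2)
    x = np.asarray(x, dtype=float)
    shifted = x[..., None] + t * nodes        # shape (..., n_nodes)
    return (func(shifted) * weights).sum(axis=-1) / np.sqrt(2.0 * np.pi)


def count_local_minima(values):
    """Count strict local minima of a sampled 1-D function (interior points only)."""
    interior = values[1:-1]
    return int(np.sum((interior < values[:-2]) & (interior < values[2:])))


grid = np.linspace(-2.0, 2.0, 401)
minima_original = count_local_minima(gaussian_smooth(f, grid, t=0.0))
minima_smoothed = count_local_minima(gaussian_smooth(f, grid, t=1.0))
```

Under these assumed values the unsmoothed function has two local minima on the grid, while the smoothed one has a single minimum that sits away from the original global minimum. This is the trade-off behind the two questions on the slide.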

  8. Homotopy Method • Try all levels of smoothing!

  9. Homotopy Method • Computer vision: image deblurring [Boccuto et al., 2002], image restoration [Nikolova et al., 2010], optical flow [Brox & Malik, 2011] • Clustering [Gold, 1994] • Graph matching [Zaslavskiy et al., 2009] • No theoretical guarantees on the solution; existing conditions are too restrictive [Mobahi and Fisher III, 2015] or difficult to check [Hazan et al., 2016]

  10. Homotopy Method • The choice of smoothing levels is handcrafted • Slow: local search is repeated for each smoothing level
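The generic homotopy loop described above can be sketched as follows. This is an illustration only: the objective `f`, the smoothing schedule `levels`, the step size, and the iteration count are all hypothetical choices, and `t` is again assumed to be the standard deviation of the smoothing Gaussian. For this polynomial the smoothed gradient has a closed form, so no sampling is needed.

```python
def f(x):
    # Hypothetical objective: local minimum near x = 0.93, global minimum near x = -1.06.
    return x**4 - 2.0 * x**2 + 0.5 * x


def smoothed_grad(x, t):
    # Gradient of g(x, t) = E[f(x + t*z)], z ~ N(0, 1).
    # Uses E[(x + t z)^4] = x^4 + 6 t^2 x^2 + 3 t^4 and E[(x + t z)^2] = x^2 + t^2.
    return 4.0 * x**3 + (12.0 * t**2 - 4.0) * x + 0.5


def local_search(x, t, step=0.01, iters=2000):
    # Plain gradient descent on the smoothed objective at level t.
    for _ in range(iters):
        x -= step * smoothed_grad(x, t)
    return x


def homotopy_minimize(x0, levels):
    # Warm-start each level from the solution of the previous, smoother level.
    x = x0
    for t in levels:
        x = local_search(x, t)
    return x


levels = [1.0, 0.8, 0.6, 0.4, 0.2, 0.0]   # handcrafted schedule (assumption)
x_plain = local_search(1.5, t=0.0)         # gradient descent without smoothing
x_homotopy = homotopy_minimize(1.5, levels)
```

From the same starting point, plain gradient descent stops in the nearby local minimum, while the warm-started sequence follows the smoothed minimizer to the global one. The cost is one full local search per smoothing level, which is the slowness noted on the slide.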

  11. Tensor PCA [Richard and Montanari 2014] • Probabilistic model for PCA: $M = \tau v v^\top + A$, with signal $v \in \mathbb{R}^d$, Gaussian noise $A$, and signal-to-noise ratio $\tau \ge 0$ • Tensor PCA: $T = \tau\, v \otimes v \otimes v + A$ • Objective: design an efficient algorithm for as small $\tau$ as possible
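A small sketch of sampling from the tensor PCA model T = τ v ⊗ v ⊗ v + A. The slide does not fix the details, so the following are assumptions: v is a unit vector, the entries of A are i.i.d. standard Gaussians (no symmetrization), and the helper name `sample_tensor_pca` is hypothetical.

```python
import numpy as np


def sample_tensor_pca(d, tau, rng):
    """Return (T, v) with T = tau * v (x) v (x) v + A, under the assumptions above."""
    v = rng.standard_normal(d)
    v /= np.linalg.norm(v)                       # unit-norm signal (assumption)
    signal = tau * np.einsum("i,j,k->ijk", v, v, v)
    noise = rng.standard_normal((d, d, d))       # i.i.d. N(0, 1) entries (assumption)
    return signal + noise, v


rng = np.random.default_rng(0)
d, tau = 30, 50.0
T, v = sample_tensor_pca(d, tau, rng)

# Projection of T onto the signal direction: tau plus a single N(0, 1) noise term.
projection = float(np.einsum("ijk,i,j,k->", T, v, v, v))
```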

  12. Previous Work • [Richard & Montanari 2014] Can find $v$ when $\tau = \Omega(d)$ in poly time, and when $\tau = \Omega(\sqrt{d})$ in exp. time. • [Hopkins, Shi & Steurer 2015] Sum-of-Squares technique, can find $v$ when $\tau = \tilde{\Omega}(d^{3/4})$ in poly time • Basic Sum-of-Squares algorithm is very slow. • Running time can be improved to $\tilde{O}(d^3)$, nearly linear

  13. Our Results
      Method        | Bound on $\tau$              | Time             | Extra Space
      Ours          | $\tilde{\Omega}(d^{3/4})$    | $\tilde{O}(d^3)$ | $O(d)$
      State-of-Art  | $\tilde{\Omega}(d^{3/4})$    | $\tilde{O}(d^3)$ | $O(d^2)$
      • Guarantee matches the best known result • Better convergence rate when $\tau$ is close to $d^{3/4}$ • One of the first results on provably analyzing the homotopy method

  14. Optimization for tensor PCA • Recall: for matrix PCA, we optimize $\max_{\|x\| = 1} x^\top M x = \tau \langle v, x \rangle^2 + x^\top A x$ • For tensor PCA, we optimize $\max_{\|x\| = 1} T(x, x, x) = \tau \langle v, x \rangle^3 + A(x, x, x)$
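The two objectives can be checked numerically. The sketch below evaluates the matrix form x^T M x and the tensor form T(x, x, x) with `numpy.einsum` and compares each against its signal-plus-noise decomposition. The dimensions, the value of τ, the unit-norm v, and the i.i.d. Gaussian noise are illustrative assumptions.

```python
import numpy as np


def tensor_form(T, x):
    # T(x, x, x) = sum_{i,j,k} T[i, j, k] * x[i] * x[j] * x[k]
    return float(np.einsum("ijk,i,j,k->", T, x, x, x))


rng = np.random.default_rng(1)
d, tau = 20, 10.0

v = rng.standard_normal(d)
v /= np.linalg.norm(v)
x = rng.standard_normal(d)
x /= np.linalg.norm(x)                      # feasible point on the unit sphere

# Matrix PCA: M = tau * v v^T + A
A_mat = rng.standard_normal((d, d))
M = tau * np.outer(v, v) + A_mat
matrix_lhs = float(x @ M @ x)
matrix_rhs = tau * float(v @ x) ** 2 + float(x @ A_mat @ x)

# Tensor PCA: T = tau * v (x) v (x) v + A
A_ten = rng.standard_normal((d, d, d))
T = tau * np.einsum("i,j,k->ijk", v, v, v) + A_ten
tensor_lhs = tensor_form(T, x)
tensor_rhs = tau * float(v @ x) ** 3 + tensor_form(A_ten, x)
```

Note the cubic dependence on the correlation ⟨v, x⟩ in the tensor case: for a random unit x the signal term is much weaker relative to the noise than in the matrix case.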

  15. Infinite Smoothing • $t = \infty$ • Unique optimum $x^*$: correlation with $v$ is $\tau / d = \Omega(d^{-0.25})$ [random unit vector: $d^{-0.5}$]
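A short derivation of where the τ/d correlation can come from. This is a sketch under assumptions the slide does not spell out: the smoothed objective is taken to be g(x, t) = E[f(x + t z)] with z ~ N(0, I_d) and f(x) = T(x, x, x), v is a unit vector, A has i.i.d. N(0, 1) entries, and x stays on the unit sphere. The vectors u and w are names introduced here for illustration.

```latex
% Odd moments of z vanish and E[z_i z_j] = \delta_{ij}, so only the t^0 and t^2 terms survive.
\begin{align*}
g(x,t) &= \mathbb{E}_z\, T(x + tz,\; x + tz,\; x + tz) \\
       &= T(x,x,x) + t^2 \sum_{i=1}^{d} \bigl[ T(x,e_i,e_i) + T(e_i,x,e_i) + T(e_i,e_i,x) \bigr] \\
       &= T(x,x,x) + t^2 \langle u, x \rangle,
       \qquad u_k = \sum_{i=1}^{d} \bigl( T_{kii} + T_{iki} + T_{iik} \bigr).
\end{align*}
% As t -> infinity the linear term dominates on the unit sphere, so the maximizer tends to u / \|u\|.
% With T = \tau\, v \otimes v \otimes v + A:
\begin{align*}
u &= 3\tau\, v + w, \qquad w_k = \sum_{i=1}^{d} \bigl( A_{kii} + A_{iki} + A_{iik} \bigr),
\qquad \|w\| \approx \sqrt{3}\, d, \\
\Bigl\langle \tfrac{u}{\|u\|},\, v \Bigr\rangle
  &\approx \frac{3\tau}{\sqrt{9\tau^2 + 3d^2}} = \Theta\!\left(\frac{\tau}{d}\right)
  \quad \text{for } \tau \lesssim d .
\end{align*}
```

With τ on the order of d^{3/4}, this gives a correlation on the order of d^{-1/4}, compared with d^{-1/2} for a random unit vector.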

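  16. Phase Transition in Homotopy Method • Lemma*: there is a threshold $\theta$ • [Figure panels: $t > \theta$, $t = \theta$, $t < \theta$, each marked with $v$] • If using infinite steps, i.e., continuously $\infty \to 0$: • $t > \theta$: $\|x_t - x^*\| \le o(1)\,\|x^*\|$ • $t < \theta$: $\langle x_t, v \rangle = \Omega(1)$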

  17. Phase Transition • $f(x) = -x^4 + 0.8x^2$ • [Figure: four panels plotting $g(x, 0.4)$, $g(x, 0.3)$, $g(x, 0.2)$, and $g(x, 0)$ for $x \in [-1, 1]$]
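The phase transition for f(x) = -x^4 + 0.8x^2 can be worked out in closed form. The sketch assumes g(x, t) = E[f(x + t z)] with z ~ N(0, 1), so that t is a standard deviation. The slides do not state this convention, and the function names are hypothetical.

```python
import math


def g(x, t):
    # E[-(x + t z)^4 + 0.8 (x + t z)^2] for z ~ N(0, 1), using
    # E[(x + t z)^4] = x^4 + 6 t^2 x^2 + 3 t^4 and E[(x + t z)^2] = x^2 + t^2.
    return -x**4 + (0.8 - 6.0 * t**2) * x**2 + 0.8 * t**2 - 3.0 * t**4


def positive_maximizer(t):
    # Stationary points of g(., t): x = 0 or x^2 = (0.8 - 6 t^2) / 2.
    curvature = 0.8 - 6.0 * t**2
    return math.sqrt(curvature / 2.0) if curvature > 0 else 0.0


# Smoothing level at which the two symmetric maximizers merge into x = 0.
threshold = math.sqrt(0.8 / 6.0)

maximizers = {t: positive_maximizer(t) for t in (0.4, 0.3, 0.2, 0.0)}
```

Under this convention the threshold lies between 0.3 and 0.4. Above it the smoothed function has a single maximizer at 0. Below it the maximizer moves away from 0 and reaches sqrt(0.4) at t = 0.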

  18. Phase Transition • If using infinite steps, i.e., continuously $\infty \to 0$: • $t > \theta$: $\|x_t - x^*\| \le o(1)\,\|x^*\|$ • $t < \theta$: $\langle x_t, v \rangle = \Omega(1)$ • $t_1 = \infty$: infinite smoothing • $t_2 = 0$: power method at 0 smoothing
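A rough sketch of the two smoothing levels named on this slide: initialize from the infinite-smoothing direction, then run power-style iterations on the unsmoothed objective. This illustrates the idea and is not the authors' exact algorithm. The assumptions are: unit-norm v, i.i.d. N(0, 1) noise, the update x ← ∇f(x)/‖∇f(x)‖ for f(x) = T(x, x, x), and illustrative values for d, τ, and the iteration count.

```python
import numpy as np


def infinite_smoothing_init(T):
    # Direction of the linear term that dominates the smoothed objective as t -> infinity:
    # u_k = sum_i (T[i, i, k] + T[i, k, i] + T[k, i, i]).
    u = np.einsum("iik->k", T) + np.einsum("iki->k", T) + np.einsum("kii->k", T)
    return u / np.linalg.norm(u)


def gradient(T, x):
    # Gradient of f(x) = T(x, x, x) for a tensor that need not be symmetric.
    return (np.einsum("ijk,j,k->i", T, x, x)
            + np.einsum("ijk,i,k->j", T, x, x)
            + np.einsum("ijk,i,j->k", T, x, x))


def power_iterations(T, x, iters=20):
    # Power-style update at zero smoothing: follow the normalized gradient.
    for _ in range(iters):
        x = gradient(T, x)
        x /= np.linalg.norm(x)
    return x


rng = np.random.default_rng(0)
d, tau = 30, 30.0
v = rng.standard_normal(d)
v /= np.linalg.norm(v)
T = tau * np.einsum("i,j,k->ijk", v, v, v) + rng.standard_normal((d, d, d))

x_init = infinite_smoothing_init(T)
x_final = power_iterations(T, x_init)
init_corr = float(x_init @ v)
final_corr = float(x_final @ v)
```

The value of τ here is well above d^{3/4} so that the example is stable at small d. It is meant to show the mechanics, not the threshold behavior.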

  19. Conclusions • The homotopy method gives near-optimal results for tensor PCA. • It is possible to analyze non-convex functions even when they do have bad local optima.

  20. Open Problems • More examples of the homotopy method? • When the tensor has higher rank? • General results on the effects of smoothing: what kind of local optima will disappear? • Different ways of smoothing/regularization?
