Spectral Frank-Wolfe Algorithm: Strict Complementarity and Linear Convergence (presentation transcript)


  1. Spectral Frank-Wolfe Algorithm: Strict Complementarity and Linear Convergence. Lijun Ding, joint work with Yingjie Fei, Qiantong Xu, and Chengrun Yang. June 15, 2020. [Lijun Ding (Cornell University), SpecFW, June 15, 2020, 17 slides]

  2. Overview. 1 Introduction: problem setup; past algorithms. 2 SpecFW and strict complementarity: Spectral Frank-Wolfe (SpecFW); strict complementarity. 3 Numerics: experimental setup; numerical results.

  3-8. Convex smooth minimization over a spectrahedron. Main optimization problem:

      minimize_{X ∈ S^n}  f(X) := g(A X) + tr(C X)
      subject to          tr(X) = 1,  X ∈ S^n_+,          (M)

  where S^n ⊂ R^{n×n} denotes the symmetric matrices. Here the function g is strongly convex and smooth; A is a linear map and C ∈ S^n; the trace tr(·) is the sum of the diagonal entries; S^n_+ is the set of positive semidefinite matrices, i.e., symmetric matrices with non-negative eigenvalues; and (M) admits a unique optimal solution X⋆.
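As a concrete illustration of (M), the sketch below instantiates f with a matrix-sensing-style choice g(y) = 0.5‖y − b‖², which is strongly convex and smooth. The matrices A_i, the vector b, and the choice C = 0 are made-up stand-ins for demonstration, not data from the talk.

```python
import numpy as np

# Illustrative instance of problem (M), assuming g(y) = 0.5 * ||y - b||^2
# and the linear map A : X -> [tr(A_i X)]_i. A_i, b, and C are made up.
rng = np.random.default_rng(0)
n, m = 20, 30
A_ops = []
for _ in range(m):
    M = rng.standard_normal((n, n))
    A_ops.append((M + M.T) / 2)  # symmetric measurement matrices
b = rng.standard_normal(m)
C = np.zeros((n, n))

def A(X):
    """Linear map A : S^n -> R^m with (A X)_i = tr(A_i X)."""
    return np.array([np.trace(Ai @ X) for Ai in A_ops])

def f(X):
    """f(X) = g(A X) + tr(C X) with g(y) = 0.5 * ||y - b||^2."""
    return 0.5 * np.linalg.norm(A(X) - b) ** 2 + np.trace(C @ X)

def grad_f(X):
    """grad f(X) = A^*(A X - b) + C, where A^* r = sum_i r_i A_i."""
    r = A(X) - b
    return sum(ri * Ai for ri, Ai in zip(r, A_ops)) + C

# A feasible point of the spectrahedron: tr(X) = 1 and X PSD.
X0 = np.eye(n) / n
```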

  9-15. Applications of (M): matrix sensing [RFP10]; matrix completion [CR09, JS10]; phase retrieval [CESV15, YUTC17]; one-bit matrix completion [DPVDBW14]; blind deconvolution [ARR13]. In these applications, expect rank r⋆ = rank(X⋆) ≪ n!

  16-21. Projected Gradient (PG). Writing SP^n := {X ∈ S^n : tr(X) = 1, X ∈ S^n_+} for the spectrahedron, problem (M) is minimize_{X ∈ SP^n} f(X). Orthogonal projection: P_{SP^n}(X) = argmin_{V ∈ SP^n} ‖X − V‖_F. PG: choose X_0 ∈ SP^n and η > 0, then iterate

      X_{t+1} = P_{SP^n}(X_t − η ∇f(X_t)).          (PG)

  Iteration complexity: O(1/ε); accelerated PG achieves O(1/√ε). Bottleneck: O(n^3) per iteration due to the FULL eigenvalue decomposition (EVD) in P_{SP^n}!
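The projection step can be sketched as follows: a full EVD of the iterate, then a Euclidean projection of the eigenvalues onto the probability simplex, which is exactly the O(n^3) bottleneck noted above. This is a minimal illustration, not the speakers' code; `grad_f` is assumed to be supplied by the caller.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex
    {w : w >= 0, sum(w) = 1} (sort-based O(n log n) method)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u - (css - 1) / np.arange(1, len(v) + 1) > 0)[0][-1]
    theta = (css[rho] - 1) / (rho + 1)
    return np.maximum(v - theta, 0)

def project_spectrahedron(X):
    """P_SP^n(X): full EVD of X, then project its spectrum onto the
    simplex. The full EVD is the O(n^3) per-iteration bottleneck."""
    lam, Q = np.linalg.eigh((X + X.T) / 2)
    return (Q * project_simplex(lam)) @ Q.T

def projected_gradient(grad_f, X0, eta, iters=100):
    """PG iteration: X_{t+1} = P_SP^n(X_t - eta * grad_f(X_t))."""
    X = X0
    for _ in range(iters):
        X = project_spectrahedron(X - eta * grad_f(X))
    return X
```

`project_spectrahedron` works for any symmetric input: feasible points are fixed, and infeasible ones land on the nearest point of SP^n in Frobenius norm.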

  22-25. Projection-free method: Frank-Wolfe (FW). FW: choose X_0 ∈ SP^n, then iterate:
      (LOO) Linear Optimization Oracle: V_t = argmin_{V ∈ SP^n} tr(V ∇f(X_t)).
      (LS) Line Search: X_{t+1} solves min_{X = η X_t + (1−η) V_t, η ∈ [0,1]} f(X).
  Low per-iteration complexity: the LOO only needs to compute one eigenvector of ∇f(X_t)! Bottleneck: slow convergence, O(1/ε) iteration complexity in both theory and practice!
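The FW iteration above can be sketched as below. The LOO over SP^n is solved in closed form by a rank-one extreme point V_t = v vᵀ, where v is the eigenvector for the smallest eigenvalue of ∇f(X_t). For simplicity this sketch substitutes the classic step size η_t = 2/(t+2) for the slide's exact line search, and uses a dense eigendecomposition where a Lanczos solver for a single eigenpair would be used in practice.

```python
import numpy as np

def frank_wolfe(grad_f, X0, iters=300):
    """FW on the spectrahedron SP^n, with the classic eta_t = 2/(t+2)
    step size used here in place of the slide's exact line search."""
    X = X0
    for t in range(iters):
        G = grad_f(X)
        # LOO: min_{V in SP^n} tr(V G) is attained at V = v v^T, where v
        # is the eigenvector of the smallest eigenvalue of G.
        lam, Q = np.linalg.eigh((G + G.T) / 2)  # Lanczos suffices in practice
        v = Q[:, 0]                             # bottom eigenvector
        V = np.outer(v, v)
        eta = 2.0 / (t + 2.0)
        X = (1 - eta) * X + eta * V             # convex combination stays in SP^n
    return X
```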

  26. FW variants. Many variants exist: randomized regularized FW [Gar16]; in-face direction FW [FGM17]; BlockFW [AZHHL17]; FW for r⋆ = rank(X⋆) = 1 [Gar19]. Shortcoming: each either lacks linear convergence, is sensitive to the input rank estimate, or requires r⋆ = 1.

  27. Outline (section divider; repeats the overview of slide 2).

  28-29. Spectral Frank-Wolfe (SpecFW). SpecFW: choose X_0 ∈ SP^n and a rank estimate k > 0, then iterate:
      (kLOO) Compute the bottom k eigenvectors V = [v_1, ..., v_k] ∈ R^{n×k} of ∇f(X_t).
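Only the kLOO step is stated above (the transcript stops at this slide). As a minimal sketch under that description, using a dense eigendecomposition for clarity, it amounts to:

```python
import numpy as np

def k_loo(G, k):
    """kLOO: bottom-k eigenvectors V = [v_1, ..., v_k] in R^{n x k}
    of the symmetric matrix G = grad f(X_t)."""
    lam, Q = np.linalg.eigh((G + G.T) / 2)  # eigenvalues in ascending order
    return Q[:, :k]
```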
