A primal-dual algorithm for exponential-cone optimization



  1. A primal-dual algorithm for exponential-cone optimization
     ICCOPT, Berlin, August 8th, 2019
     joachim.dahl@mosek.com
     www.mosek.com

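  2. Conic optimization
     Linear cone problem:
     $$\begin{array}{ll} \text{minimize} & c^T x \\ \text{subject to} & Ax = b, \\ & x \in K, \end{array}$$
     with $K = K_1 \times K_2 \times \cdots \times K_p$ a product of proper cones.
     Dual:
     $$\begin{array}{ll} \text{maximize} & b^T y \\ \text{subject to} & c - A^T y = s, \\ & s \in K^*, \end{array}$$
     with $K^* = K_1^* \times K_2^* \times \cdots \times K_p^*$.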
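
The primal-dual pair above implies the weak-duality identity $c^T x - b^T y = x^T s$ for any feasible pair. The following is a minimal numerical sketch of that identity, assuming the simplest case where $K$ is the nonnegative orthant (a linear cone); the helper name `duality_gap` and the random data are illustrative only and are not from the slides.

```python
import numpy as np

def duality_gap(c, b, x, y):
    """Primal objective minus dual objective, c^T x - b^T y."""
    return c @ x - b @ y

rng = np.random.default_rng(1)
m, n = 2, 4
A = rng.standard_normal((m, n))
x = rng.random(n) + 0.1   # strictly positive, so x is in the nonnegative orthant
s = rng.random(n) + 0.1   # strictly positive dual slack (the orthant is self-dual)
y = rng.standard_normal(m)
b = A @ x                 # chosen so that Ax = b holds
c = A.T @ y + s           # chosen so that c - A^T y = s holds

gap = duality_gap(c, b, x, y)
print(np.isclose(gap, x @ s), gap >= 0.0)
```

Because $x \in K$ and $s \in K^*$, the gap $x^T s$ is nonnegative, which is the reason complementarity $x^T s \to 0$ is driven to zero by the method.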

  3. Conic optimization
     MOSEK 9 supports the following symmetric cones,
     • linear, quadratic and semidefinite cones
     and the nonsymmetric cones,
     • three-dimensional power cone for $0 < \alpha < 1$,
       $$K_{\mathrm{pow}}^{\alpha} = \{ x \in \mathbb{R}^3 \mid x_1^{\alpha} x_2^{(1-\alpha)} \geq |x_3|,\ x_1, x_2 > 0 \},$$
     • exponential cone
       $$K_{\exp} = \mathrm{cl}\{ x \in \mathbb{R}^3 \mid x_1 \geq x_2 \exp(x_3 / x_2),\ x_2 > 0 \}.$$
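
The two nonsymmetric cone definitions can be turned into direct membership tests. This is a small sketch, not MOSEK code: the function names are hypothetical, the exponential-cone test adds the boundary piece $\{x_1 \geq 0,\ x_2 = 0,\ x_3 \leq 0\}$ that the closure contributes, and the power-cone test follows the set exactly as written on the slide (strict positivity of $x_1, x_2$).

```python
import math

def in_exp_cone(x):
    """Membership in K_exp = cl{x : x1 >= x2*exp(x3/x2), x2 > 0}."""
    x1, x2, x3 = x
    if x2 > 0:
        # math.exp may overflow for very large x3/x2; fine for a sketch.
        return x1 >= x2 * math.exp(x3 / x2)
    # Points added by the closure: x2 = 0 with x1 >= 0 and x3 <= 0.
    return x2 == 0 and x1 >= 0 and x3 <= 0

def in_pow_cone(x, alpha):
    """Membership in the 3D power cone as written on the slide, 0 < alpha < 1."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    x1, x2, x3 = x
    return x1 > 0 and x2 > 0 and x1 ** alpha * x2 ** (1.0 - alpha) >= abs(x3)

print(in_exp_cone((3.0, 1.0, 1.0)), in_exp_cone((2.0, 1.0, 1.0)))
print(in_pow_cone((4.0, 1.0, 2.0), 0.5), in_pow_cone((4.0, 1.0, 2.5), 0.5))
```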

  4. Self-concordant barriers
     Self-concordant barrier for $K_{\exp}$:
     $$F(x) = -\log(x_2 \log(x_1 / x_2) - x_3) - \log x_1 - \log x_2.$$
     Conjugate barrier:
     $$F_*(s) = \max\{ -\langle x, s \rangle - F(x) : x \in \mathrm{int}(K) \}.$$
     Standard properties:
     $$F^{(k)}(\tau x) = \frac{1}{\tau^k} F^{(k)}(x), \qquad F^{(k)}(x)[x] = -k F^{(k-1)}(x),$$
     $$-F'(x) \in \mathrm{int}(K^*), \qquad -F_*'(s) \in \mathrm{int}(K),$$
     $$F'(-F_*'(s)) = -s, \qquad F''(-F_*'(s)) = [F_*''(s)]^{-1}.$$
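
The barrier and its gradient are simple enough to write out and check numerically. The sketch below assumes the barrier parameter $\nu = 3$ for this three-term barrier (a standard fact, not stated on the slide) and verifies the homogeneity property $F'(\tau x) = F'(x)/\tau$ together with $\langle F'(x), x \rangle = -\nu$; the function names are illustrative.

```python
import numpy as np

NU = 3.0  # assumed barrier parameter of F for the 3D exponential cone

def barrier(x):
    """F(x) = -log(x2*log(x1/x2) - x3) - log(x1) - log(x2)."""
    x1, x2, x3 = x
    psi = x2 * np.log(x1 / x2) - x3
    return -np.log(psi) - np.log(x1) - np.log(x2)

def barrier_grad(x):
    """Analytic gradient F'(x), obtained by differentiating the formula above."""
    x1, x2, x3 = x
    psi = x2 * np.log(x1 / x2) - x3
    return np.array([
        -(x2 / x1) / psi - 1.0 / x1,
        -(np.log(x1 / x2) - 1.0) / psi - 1.0 / x2,
        1.0 / psi,
    ])

x = np.array([2.0, 1.0, 0.0])  # interior point: 2 > 1 * exp(0)
tau = 2.5
print(barrier_grad(x) @ x)                                        # close to -NU
print(np.allclose(barrier_grad(tau * x), barrier_grad(x) / tau))  # homogeneity
```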

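  5. Central path for conic problem
     Central path for the homogeneous model, parametrized by $\mu$:
     $$\begin{aligned} A x_\mu - b \tau_\mu &= \mu (Ax - b\tau), \\ s_\mu + A^T y_\mu - c \tau_\mu &= \mu (s + A^T y - c\tau), \\ c^T x_\mu - b^T y_\mu + \kappa_\mu &= \mu (c^T x - b^T y + \kappa), \end{aligned}$$
     $$s_\mu = -\mu F'(x_\mu), \qquad x_\mu = -\mu F_*'(s_\mu), \qquad \kappa_\mu \tau_\mu = \mu,$$
     or equivalently
     $$\begin{pmatrix} 0 & A & -b \\ -A^T & 0 & c \\ b^T & -c^T & 0 \end{pmatrix} \begin{pmatrix} y_\mu \\ x_\mu \\ \tau_\mu \end{pmatrix} - \begin{pmatrix} 0 \\ s_\mu \\ \kappa_\mu \end{pmatrix} = \mu \begin{pmatrix} r_p \\ r_d \\ r_g \end{pmatrix},$$
     $$s_\mu = -\mu F'(x_\mu), \qquad x_\mu = -\mu F_*'(s_\mu), \qquad \kappa_\mu \tau_\mu = \mu,$$
     with residuals
     $$r_p := Ax - b\tau, \quad r_d := c\tau - A^T y - s, \quad r_g := b^T y - c^T x - \kappa, \quad r_c := x^T s + \tau\kappa.$$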

  6. Scaling for nonsymmetric cones
     Following Tunçel [5] we consider a scaling $W^T W \succ 0$,
     $$v = Wx = W^{-T} s, \qquad \tilde{v} = W\tilde{x} = W^{-T}\tilde{s},$$
     where $\tilde{x} := -F_*'(s)$ and $\tilde{s} := -F'(x)$. The centrality conditions $x = \mu\tilde{x}$, $s = \mu\tilde{s}$ can then be written symmetrically as $v = \mu\tilde{v}$, and we linearize the centrality condition $v = \mu\tilde{v}$ as
     $$W \Delta x + W^{-T} \Delta s = \mu\tilde{v} - v.$$

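  7. An affine search-direction
     $$\begin{pmatrix} 0 & A & -b \\ -A^T & 0 & c \\ b^T & -c^T & 0 \end{pmatrix} \begin{pmatrix} \Delta y_a \\ \Delta x_a \\ \Delta \tau_a \end{pmatrix} - \begin{pmatrix} 0 \\ \Delta s_a \\ \Delta \kappa_a \end{pmatrix} = - \begin{pmatrix} r_p \\ r_d \\ r_g \end{pmatrix},$$
     $$\Delta s_a + W^T W \Delta x_a = -s, \qquad \tau \Delta\kappa_a + \kappa \Delta\tau_a = -\kappa\tau,$$
     satisfying
     $$(\Delta x_a)^T \Delta s_a + \Delta\tau_a \Delta\kappa_a = 0.$$
     Let $\alpha_a \in (0, 1]$ denote the largest feasible step in the affine direction. We estimate a centering parameter as
     $$\gamma := (1 - \alpha_a) \min\{ (1 - \alpha_a)^2, 1/4 \}.$$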
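
The centering-parameter heuristic is a one-line formula; a direct transcription is sketched below. The function name `centering_parameter` is hypothetical, and computing the largest feasible step $\alpha_a$ itself (a line search against the cone) is outside the scope of this sketch.

```python
def centering_parameter(alpha_a):
    """gamma = (1 - alpha_a) * min((1 - alpha_a)^2, 1/4) for alpha_a in (0, 1]."""
    if not 0.0 < alpha_a <= 1.0:
        raise ValueError("alpha_a must lie in (0, 1]")
    return (1.0 - alpha_a) * min((1.0 - alpha_a) ** 2, 0.25)

# A full affine step needs no centering; a short one asks for more.
for alpha_a in (1.0, 0.9, 0.5, 0.2):
    print(alpha_a, centering_parameter(alpha_a))
```

A long affine step ($\alpha_a$ near 1) gives $\gamma$ near 0, so the combined step is aggressive; the cap at $1/4$ bounds $\gamma$ below $1/4$ even when the affine step is very short.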

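  8. A centering search-direction
     Let $\mu = (x^T s + \tau\kappa) / (\nu + 1)$.
     $$\begin{pmatrix} 0 & A & -b \\ -A^T & 0 & c \\ b^T & -c^T & 0 \end{pmatrix} \begin{pmatrix} \Delta y_c \\ \Delta x_c \\ \Delta \tau_c \end{pmatrix} - \begin{pmatrix} 0 \\ \Delta s_c \\ \Delta \kappa_c \end{pmatrix} = (\gamma - 1) \begin{pmatrix} r_p \\ r_d \\ r_g \end{pmatrix},$$
     $$W \Delta x_c + W^{-T} \Delta s_c = \gamma\mu\tilde{v} - v, \qquad \tau \Delta\kappa_c + \kappa \Delta\tau_c = \gamma\mu - \kappa\tau.$$
     Constant decrease of residuals and complementarity:
     $$\begin{aligned} A x^+ - b\tau^+ &= (1 - \alpha(1 - \gamma)) \cdot r_p, \\ c\tau^+ - A^T y^+ - s^+ &= (1 - \alpha(1 - \gamma)) \cdot r_d, \\ b^T y^+ - c^T x^+ - \kappa^+ &= (1 - \alpha(1 - \gamma)) \cdot r_g, \\ (x^+)^T s^+ + \tau^+ \kappa^+ &= (1 - \alpha(1 - \gamma)) \cdot r_c, \end{aligned}$$
     where $z^+ := (z + \alpha \Delta z_c)$.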
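
The decrease of the three linear residuals follows from linearity alone: any direction satisfying the block linear system scales them by $1 - \alpha(1 - \gamma)$. The sketch below checks this on random data, taking $r_g = b^T y - c^T x - \kappa$ as implied by the third decrease relation on this slide. The direction is constructed by hand only to satisfy the linear system; it is not the actual centering direction (the scaling equations and the complementarity residual $r_c$ are ignored here), the iterates are not required to lie in a cone, and all names are illustrative.

```python
import numpy as np

def residuals(A, b, c, x, y, s, tau, kappa):
    """Linear residuals (r_p, r_d, r_g) of the homogeneous model."""
    r_p = A @ x - b * tau
    r_d = c * tau - A.T @ y - s
    r_g = b @ y - c @ x - kappa
    return r_p, r_d, r_g

rng = np.random.default_rng(0)
m, n = 2, 3
A, b, c = rng.standard_normal((m, n)), rng.standard_normal(m), rng.standard_normal(n)
x, y, s = rng.standard_normal(n), rng.standard_normal(m), rng.standard_normal(n)
tau, kappa = 1.3, 0.7
gamma, alpha = 0.1, 0.6

r_p, r_d, r_g = residuals(A, b, c, x, y, s, tau, kappa)

# Build some direction that satisfies the three block rows of the linear system.
dy, dtau = rng.standard_normal(m), 0.2
dx = np.linalg.lstsq(A, (gamma - 1.0) * r_p + b * dtau, rcond=None)[0]  # row 1
ds = c * dtau - A.T @ dy - (gamma - 1.0) * r_d                          # row 2
dkappa = b @ dy - c @ dx - (gamma - 1.0) * r_g                          # row 3

new = residuals(A, b, c, x + alpha * dx, y + alpha * dy, s + alpha * ds,
                tau + alpha * dtau, kappa + alpha * dkappa)
factor = 1.0 - alpha * (1.0 - gamma)
print(all(np.allclose(r_new, factor * r_old)
          for r_new, r_old in zip(new, (r_p, r_d, r_g))))
```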

  9. A higher-order corrector term
     Derivatives of $s_\mu = -\mu F'(x_\mu)$:
     $$\dot{s}_\mu + \mu F''(x_\mu)\dot{x}_\mu = -F'(x_\mu),$$
     $$\ddot{s}_\mu + \mu F''(x_\mu)\ddot{x}_\mu = -2F''(x_\mu)\dot{x}_\mu - \mu F'''(x_\mu)[\dot{x}_\mu, \dot{x}_\mu].$$
     Using $F''(x)x = -F'(x)$ and $F'''(x)[x] = -2F''(x)$ we obtain
     $$\ddot{s}_\mu + \mu F''(x_\mu)\ddot{x}_\mu = F'''(x_\mu)[\dot{x}_\mu, (F''(x_\mu))^{-1}\dot{s}_\mu].$$
     We interpret $\dot{s}_\mu \approx -\mu\Delta s_a$ and $\dot{x}_\mu \approx -\mu\Delta x_a$, i.e.,
     $$\Delta s_{\mathrm{cor}} + W^T W \Delta x_{\mathrm{cor}} = \tfrac{1}{2} F'''(x)[\Delta x_a, (F''(x))^{-1}\Delta s_a],$$
     satisfying
     $$x^T \Delta s_{\mathrm{cor}} + s^T \Delta x_{\mathrm{cor}} = -(\Delta x_a)^T \Delta s_a.$$
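
The two identities used in the derivation, $F''(x)x = -F'(x)$ and $F'''(x)[x] = -2F''(x)$, can be checked numerically for the exponential-cone barrier. The sketch below uses the analytic gradient of $F$ and plain central finite differences for the higher derivatives; the step sizes and function names are ad hoc assumptions, and a real implementation would use analytic derivatives instead.

```python
import numpy as np

def grad(x):
    """Analytic gradient of F(x) = -log(x2*log(x1/x2) - x3) - log(x1) - log(x2)."""
    x1, x2, x3 = x
    psi = x2 * np.log(x1 / x2) - x3
    return np.array([
        -(x2 / x1) / psi - 1.0 / x1,
        -(np.log(x1 / x2) - 1.0) / psi - 1.0 / x2,
        1.0 / psi,
    ])

def hess(x, h=1e-5):
    """Finite-difference Hessian built from the analytic gradient, symmetrized."""
    n = x.size
    H = np.empty((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        H[:, j] = (grad(x + e) - grad(x - e)) / (2.0 * h)
    return 0.5 * (H + H.T)

def third_along(x, u, t=1e-4):
    """Finite-difference approximation of the matrix F'''(x)[u]."""
    return (hess(x + t * u) - hess(x - t * u)) / (2.0 * t)

x = np.array([2.0, 1.0, 0.0])  # interior point of the exponential cone
print(np.allclose(hess(x) @ x, -grad(x), atol=1e-6))
print(np.allclose(third_along(x, x), -2.0 * hess(x), atol=1e-3))
```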

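  10. Combined centering-corrector direction
     A combined centering-corrector direction:
     $$\begin{pmatrix} 0 & A & -b \\ -A^T & 0 & c \\ b^T & -c^T & 0 \end{pmatrix} \begin{pmatrix} \Delta y \\ \Delta x \\ \Delta \tau \end{pmatrix} - \begin{pmatrix} 0 \\ \Delta s \\ \Delta \kappa \end{pmatrix} = (\gamma - 1) \begin{pmatrix} r_p \\ r_d \\ r_g \end{pmatrix},$$
     $$W \Delta x + W^{-T} \Delta s = \gamma\mu\tilde{v} - v + \tfrac{1}{2} W^{-T} F'''(x)[\Delta x_a, (F''(x))^{-1}\Delta s_a],$$
     $$\tau \Delta\kappa + \kappa \Delta\tau = \gamma\mu - \tau\kappa - \Delta\tau_a \Delta\kappa_a.$$
     All residuals and complementarity decrease by $(1 - \alpha(1 - \gamma))$.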

  11. Computing the scaling matrix
     Theorem (Schnabel [4]). Let $S, Y \in \mathbb{R}^{n \times p}$ have full rank $p$. Then there exists $H \succ 0$ such that $HS = Y$ if and only if $Y^T S \succ 0$.
     Let
     $$S := \begin{pmatrix} x & \tilde{x} \end{pmatrix}, \qquad Y := \begin{pmatrix} s & \tilde{s} \end{pmatrix}$$
     both be full rank. As a consequence of Thm. 1 (for $n = 3$),
     $$H = Y (Y^T S)^{-1} Y^T + z z^T,$$
     where $S^T z = 0$, $z \neq 0$ and
     $$\det(Y^T S) = (x^T s) \cdot (\tilde{x}^T \tilde{s}) - \nu^2 > 0,$$
     vanishing towards the central path.
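
The construction $H = Y (Y^T S)^{-1} Y^T + z z^T$ can be tried on synthetic data. In this sketch $S$ is random and $Y := H_{\mathrm{true}} S$ for an arbitrary symmetric positive definite $H_{\mathrm{true}}$, which guarantees $Y^T S \succ 0$; these stand in for the actual pairs $(x, \tilde{x})$ and $(s, \tilde{s})$, which would require evaluating the conjugate barrier. The vector $z$ is taken as the normalized cross product of the columns of $S$, one admissible choice among many, and the function name is hypothetical.

```python
import numpy as np

def secant_scaling(S, Y):
    """H = Y (Y^T S)^{-1} Y^T + z z^T with S^T z = 0, for S, Y in R^{3x2}."""
    z = np.cross(S[:, 0], S[:, 1])  # orthogonal to both columns of S
    z = z / np.linalg.norm(z)       # any nonzero multiple would also work
    return Y @ np.linalg.solve(Y.T @ S, Y.T) + np.outer(z, z)

rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3))
H_true = B @ B.T + np.eye(3)        # arbitrary symmetric positive definite matrix
S = rng.standard_normal((3, 2))
Y = H_true @ S                      # ensures Y^T S = S^T H_true S is positive definite

H = secant_scaling(S, Y)
print(np.allclose(H @ S, Y))                  # secant equations H S = Y
print(np.linalg.eigvalsh(H).min() > 0.0)      # H is positive definite
```

The secant equations hold because $z z^T S = 0$, and positive definiteness follows because a vector annihilated by both terms would have to lie in the range of $S$ and in the null space of $Y^T$, which $Y^T S \succ 0$ rules out.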

  12. Computing the scaling matrix
     Expanding the BFGS update [4]
     $$\hat{H} = H_0 + Y (Y^T S)^{-1} Y^T - H_0 S (S^T H_0 S)^{-1} S^T H_0,$$
     for $H_0 \succ 0$ gives the scaling by Tunçel [5] and Myklebust [2], i.e.,
     $$z z^T = H_0 - H_0 S (S^T H_0 S)^{-1} S^T H_0.$$
     We choose $H_0 := \mu F''(x)$. In other words, $W^T W = \hat{H} \approx \mu F''(x)$ and satisfies
     $$W^T W x = s, \qquad W^T W \tilde{x} = \tilde{s}.$$
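
The block BFGS update can be checked the same way. The sketch below uses an arbitrary symmetric positive definite $H_0$ in place of $\mu F''(x)$ and synthetic $S$, $Y$ with $Y^T S \succ 0$ (both are stand-in assumptions, as is the function name). It confirms $\hat{H} S = Y$ and that, for $n = 3$ and $p = 2$, the term $H_0 - H_0 S (S^T H_0 S)^{-1} S^T H_0$ is a positive semidefinite rank-one matrix that annihilates $S$, consistent with writing it as $z z^T$.

```python
import numpy as np

def bfgs_scaling(H0, S, Y):
    """Block BFGS update: H0 + Y (Y^T S)^{-1} Y^T - H0 S (S^T H0 S)^{-1} S^T H0."""
    H0S = H0 @ S
    return (H0 + Y @ np.linalg.solve(Y.T @ S, Y.T)
            - H0S @ np.linalg.solve(S.T @ H0S, H0S.T))

rng = np.random.default_rng(0)
B0, B1 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
H0 = B0 @ B0.T + np.eye(3)          # stand-in for mu * F''(x)
H_true = B1 @ B1.T + np.eye(3)      # used only to generate consistent secant data
S = rng.standard_normal((3, 2))
Y = H_true @ S                      # ensures Y^T S is symmetric positive definite

H_hat = bfgs_scaling(H0, S, Y)
H0S = H0 @ S
R = H0 - H0S @ np.linalg.solve(S.T @ H0S, H0S.T)   # the z z^T term
eigs = np.linalg.eigvalsh(0.5 * (R + R.T))         # ascending eigenvalues

print(np.allclose(H_hat @ S, Y))
print(np.allclose(R @ S, 0.0), eigs)
```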

  13. Tunçel's scaling bounds
     Let $\mu := (x^T s) / \nu$ and $\tilde{\mu} := (\tilde{x}^T \tilde{s}) / \nu$. Tunçel defines
     $$T_2(\xi, x, s) := \left\{ H \succ 0 \;\middle|\; Hx = s,\ H\tilde{x} = \tilde{s},\ \frac{\mu}{\xi(\nu(\mu\tilde{\mu} - 1) + 1)} F''(x) \preceq H \preceq \frac{\xi(\nu(\mu\tilde{\mu} - 1) + 1)}{\mu} F''(\tilde{x}) \right\}$$
     and shows polynomial convergence for a potential reduction method if
     $$\inf_{\xi} T_2(\xi, x, s) \leq O(1), \qquad \forall x \in \mathrm{int}(K),\ s \in \mathrm{int}(K^*).$$
     For symmetric cones $\xi^\star \leq 4/3$.

  14. Bounds for the exponential cone
     Given $s \in \mathrm{int}(K_{\exp}^*)$ and $\mu > 0$. Let $h := (0, 0, \nu\mu / s_3)$ and
     $$x_\alpha := h - \alpha(\mu F_*'(s) + h).$$
     1. $x_\alpha \in K_{\exp}$, $\alpha \in [0, \nu/2]$.
     2. $\langle x_\alpha, s \rangle / \nu = \mu$.
     3. $\mu \langle F'(x_\alpha), F_*'(s) \rangle = \dfrac{\nu - 1}{\alpha} + \dfrac{1}{\nu - (\nu - 1)\alpha}$.
     4. $\| x_\alpha \|^2_{-\mu F_*'(s)} = (\alpha^2 - 2\alpha)\nu(\nu - 1) + \nu^2$.
     Conjecture (Øbro [3]): For the exponential cone $\xi^\star \approx 1.2532$, attained for $x_{\alpha^\star}$ with $\alpha^\star = \nu(\nu(\nu - 1))^{-1/2}$.

  15. Øbro's conjecture
     [Figure: 3D plot with axes $x_1$, $x_2$, $x_3$.]
     Plot of $K_{\exp} \cap \{ x : x^T s = \nu\mu \}$, $D(-\mu F_*'(s), 1)$ and $x_{\alpha^\star}$ (red).

  16. Implications for the exponential cone
     • $F(x)$ does not have negative curvature, i.e., $F'''(x)[u] \not\preceq 0$, $\forall x \in \mathrm{int}(K_{\exp})$, $\forall u \in K_{\exp}$.
     • But $F''$ is still bounded, for another reason.
     • Tunçel's potential-reduction method for exponential cones has polynomial-time complexity.
     • No equivalent proof yet for MOSEK's algorithm, even with optimal scalings.
     • The BFGS scaling appears to be bounded as well, and often coincides with the optimal scaling, leaving more to be proved.

  17. Comparing MOSEK and ECOS conic solvers
     [Figure: iterations versus problem index; series: MOSEK, MOSEK n/c, ECOS.]
     Iteration counts for different exponential cone problems, comparing MOSEK (with and without proposed corrector) and ECOS.

  18. Comparing MOSEK and ECOS conic solvers
     [Figure: time [s] (logarithmic scale) versus problem index; series: MOSEK, MOSEK n/c, ECOS.]
     Solution time for different exponential cone problems, comparing MOSEK (with and without proposed corrector) and ECOS.

  19. Conclusions
     • Exponential cone optimization is included in MOSEK 9.
     • Works very well in practice, especially with the proposed corrector.
     • Solution time, accuracy and number of iterations are on par with the symmetric cone implementation.
     • No proof of polynomial-time complexity yet.
     • More details can be found in [1].

  20. References
     [1] J. Dahl and E. D. Andersen. A primal-dual interior-point algorithm for nonsymmetric exponential-cone optimization. Technical report, MOSEK ApS, 2019.
     [2] T. Myklebust and L. Tunçel. Interior-point algorithms for convex optimization based on primal-dual metrics. Technical report, University of Waterloo, 2014.
     [3] M. Øbro. Conic optimization with exponential cones. Master's thesis, Technical University of Denmark, 2019.
     [4] R. B. Schnabel. Quasi-Newton methods using multiple secant equations. Technical report, Colorado Univ., Boulder, Dept. Comp. Sci., 1983.
     [5] L. Tunçel. Generalization of primal-dual interior-point methods to convex optimization problems in conic form. Foundations of Computational Mathematics, 1:229–254, 2001.
