Extending MOSEK with exponential cones, ISMP Bordeaux 2018

Extending MOSEK with exponential cones ISMP Bordeaux 2018 joachim.dahl@mosek.com www.mosek.com

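Conic optimization
Linear cone problem:
\[
\begin{array}{ll}
\text{minimize} & c^T x \\
\text{subject to} & Ax = b, \\
& x \in K,
\end{array}
\]
with $K = K_1 \times K_2 \times \cdots \times K_p$ a product of proper cones.
Dual:
\[
\begin{array}{ll}
\text{maximize} & b^T y \\
\text{subject to} & c - A^T y = s, \\
& s \in K^*,
\end{array}
\]
with $K^* = K_1^* \times K_2^* \times \cdots \times K_p^*$.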

Symmetric cones (supported by MOSEK 8)
• the nonnegative orthant $K_l^n := \{ x \in \mathbb{R}^n \mid x_j \geq 0, \; j = 1, \dots, n \}$,
• the quadratic cone $K_q^n = \{ x \in \mathbb{R}^n \mid x_1 \geq (x_2^2 + \cdots + x_n^2)^{1/2} \}$,
• the rotated quadratic cone $K_r^n = \{ x \in \mathbb{R}^n \mid 2 x_1 x_2 \geq x_3^2 + \cdots + x_n^2, \; x_1, x_2 \geq 0 \}$,
• the semidefinite matrix cone $K_s^n = \{ x \in \mathbb{R}^{n(n+1)/2} \mid z^T \mathrm{mat}(x) z \geq 0, \; \forall z \}$.

Nonsymmetric cones (supported by MOSEK 9)
• the three-dimensional power cone $K_{\mathrm{pow}}^{\alpha} = \{ x \in \mathbb{R}^3 \mid x_1^{\alpha} x_2^{(1-\alpha)} \geq |x_3|, \; x_1, x_2 > 0 \}$, for $0 < \alpha < 1$,
• the exponential cone $K_{\exp} = \mathrm{cl} \{ x \in \mathbb{R}^3 \mid x_1 \geq x_2 \exp(x_3 / x_2), \; x_2 > 0 \}$.
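The two definitions above can be turned into simple membership tests. The following is a minimal sketch, not MOSEK code: the function names and the tolerance `tol` are illustrative assumptions. For the exponential cone the closure adds the boundary piece $x_2 = 0$, $x_1 \geq 0$, $x_3 \leq 0$, and the test for $x_2 > 0$ uses the equivalent logarithmic form $x_2 \log(x_1 / x_2) \geq x_3$ to avoid overflow in `exp`.

```python
import math

def in_exp_cone(x1, x2, x3, tol=1e-12):
    """Test membership in cl{x : x1 >= x2*exp(x3/x2), x2 > 0} (tol is illustrative)."""
    if x2 > tol:
        # For x2 > 0 the inequality is equivalent to x2*log(x1/x2) >= x3.
        if x1 <= 0.0:
            return False
        return x2 * (math.log(x1) - math.log(x2)) - x3 >= -tol
    # Boundary piece added by the closure: x2 = 0, x1 >= 0, x3 <= 0.
    return abs(x2) <= tol and x1 >= -tol and x3 <= tol

def in_power_cone(x1, x2, x3, alpha, tol=1e-12):
    """Test x1^alpha * x2^(1-alpha) >= |x3| with x1, x2 > 0, as on the slide."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if x1 <= 0.0 or x2 <= 0.0:
        return False
    return x1**alpha * x2**(1.0 - alpha) >= abs(x3) - tol
```

For instance, `in_exp_cone(math.e, 1.0, 1.0)` lies on the boundary of the cone, while `in_exp_cone(1.0, 1.0, 1.0)` violates the defining inequality.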

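Central path for conic problem
Central path for the homogeneous model, parametrized by $\mu$:
\[
\begin{aligned}
A x_\mu - b \tau_\mu &= \mu (Ax - b\tau), \\
s_\mu + A^T y_\mu - c \tau_\mu &= \mu (s + A^T y - c\tau), \\
c^T x_\mu - b^T y_\mu + \kappa_\mu &= \mu (c^T x - b^T y + \kappa), \\
s_\mu = -\mu F'(x_\mu), \quad x_\mu &= -\mu F_*'(s_\mu), \quad \kappa_\mu \tau_\mu = \mu,
\end{aligned}
\]
or equivalently
\[
\begin{bmatrix} 0 & A & -b \\ -A^T & 0 & c \\ b^T & -c^T & 0 \end{bmatrix}
\begin{bmatrix} y_\mu \\ x_\mu \\ \tau_\mu \end{bmatrix}
-
\begin{bmatrix} 0 \\ s_\mu \\ \kappa_\mu \end{bmatrix}
= \mu \begin{bmatrix} r_p \\ r_d \\ r_g \end{bmatrix},
\]
\[
s_\mu = -\mu F'(x_\mu), \quad x_\mu = -\mu F_*'(s_\mu), \quad \kappa_\mu \tau_\mu = \mu,
\]
where
\[
r_p := Ax - b\tau, \quad r_d := c\tau - A^T y - s, \quad r_g := b^T y - c^T x - \kappa, \quad r_c := x^T s + \tau\kappa.
\]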

Scaling for nonsymmetric cones
Following Tunçel [3] we consider a scaling $W^T W \succ 0$,
\[
v = Wx = W^{-T} s, \qquad \tilde{v} = W \tilde{x} = W^{-T} \tilde{s},
\]
where $\tilde{x} := -F_*'(s)$ and $\tilde{s} := -F'(x)$. The centrality conditions $x = \mu \tilde{x}$, $s = \mu \tilde{s}$ can then be written symmetrically as $v = \mu \tilde{v}$, and we linearize the centrality condition $v = \mu \tilde{v}$ as
\[
W \Delta x + W^{-T} \Delta s = -v + \mu \tilde{v}.
\]

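A centering search-direction
Let $\mu := \dfrac{x^T s + \tau\kappa}{\nu + 1}$ with barrier parameter $\nu$ and centering $\gamma$.
\[
\begin{bmatrix} 0 & A & -b \\ -A^T & 0 & c \\ b^T & -c^T & 0 \end{bmatrix}
\begin{bmatrix} \Delta y_c \\ \Delta x_c \\ \Delta \tau_c \end{bmatrix}
-
\begin{bmatrix} 0 \\ \Delta s_c \\ \Delta \kappa_c \end{bmatrix}
= (\gamma - 1) \begin{bmatrix} r_p \\ r_d \\ r_g \end{bmatrix},
\]
\[
W \Delta x_c + W^{-T} \Delta s_c = \gamma\mu \tilde{v} - v, \qquad \tau \Delta\kappa_c + \kappa \Delta\tau_c = \gamma\mu - \kappa\tau.
\]
Constant decrease of residuals and complementarity:
\[
\begin{aligned}
A x^+ - b \tau^+ &= \eta \cdot r_p, \\
c \tau^+ - A^T y^+ - s^+ &= \eta \cdot r_d, \\
b^T y^+ - c^T x^+ - \kappa^+ &= \eta \cdot r_g, \\
(x^+)^T s^+ + \tau^+ \kappa^+ &= \eta \cdot r_c,
\end{aligned}
\]
where $z^+ := (z + \alpha \Delta z_c)$ and $\eta = (1 - \alpha(1 - \gamma))$.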
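The constant-decrease property can be checked numerically in the simplest setting. The sketch below is an illustration under stated assumptions, not the solver's implementation: it takes $K$ to be the nonnegative orthant with $F(x) = -\sum_j \log x_j$ (so $\nu = n$, $W = \mathrm{diag}(\sqrt{s/x})$, $\tilde{x} = 1/s$), uses random data and an arbitrary interior iterate, and solves the centering system as one dense linear system. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 2, 4
A, b, c = rng.standard_normal((m, n)), rng.standard_normal(m), rng.standard_normal(n)

# Arbitrary interior (but infeasible) iterate; all values are illustrative.
x, s = rng.uniform(0.5, 2.0, n), rng.uniform(0.5, 2.0, n)
y, tau, kappa = rng.standard_normal(m), 1.3, 0.7
nu, gamma, alpha = n, 0.3, 0.6          # nu = n for F(x) = -sum(log x_j)
mu = (x @ s + tau * kappa) / (nu + 1)

def residuals(x, s, y, tau, kappa):
    return (A @ x - b * tau,             # r_p
            c * tau - A.T @ y - s,       # r_d
            b @ y - c @ x - kappa,       # r_g
            x @ s + tau * kappa)         # r_c

r_p, r_d, r_g, r_c = residuals(x, s, y, tau, kappa)

# Orthant scaling: W = diag(w) with w = sqrt(s/x), so v = W x = W^{-T} s.
w = np.sqrt(s / x)
v, v_tilde = w * x, w / s                # v~ = W x~ with x~ = 1/s

# Unknowns ordered as [dy (m), dx (n), dtau (1), ds (n), dkappa (1)];
# the row blocks use the same offsets.
iy, ix, it, isl, ik = 0, m, m + n, m + n + 1, m + 2 * n + 1
M = np.zeros((ik + 1, ik + 1))
M[iy:ix, ix:it] = A
M[iy:ix, it] = -b
M[ix:it, iy:ix] = -A.T
M[ix:it, it] = c
M[ix:it, isl:ik] = -np.eye(n)
M[it, iy:ix] = b
M[it, ix:it] = -c
M[it, ik] = -1.0
M[isl:ik, ix:it] = np.diag(w)            # W dx
M[isl:ik, isl:ik] = np.diag(1.0 / w)     # W^{-T} ds
M[ik, it] = kappa
M[ik, ik] = tau
rhs = np.concatenate([(gamma - 1) * r_p, (gamma - 1) * r_d, [(gamma - 1) * r_g],
                      gamma * mu * v_tilde - v, [gamma * mu - kappa * tau]])
d = np.linalg.solve(M, rhs)
dy, dx, dtau, ds, dkappa = d[iy:ix], d[ix:it], d[it], d[isl:ik], d[ik]

# Take the step and compare each new residual with eta times the old one.
eta = 1 - alpha * (1 - gamma)
new = residuals(x + alpha * dx, s + alpha * ds, y + alpha * dy,
                tau + alpha * dtau, kappa + alpha * dkappa)
decrease_ok = all(np.allclose(r_new, eta * r_old)
                  for r_new, r_old in zip(new, (r_p, r_d, r_g, r_c)))
print(decrease_ok)
```

The complementarity identity holds exactly because the skew-symmetric structure of the system makes the second-order term $(\Delta x_c)^T \Delta s_c + \Delta\tau_c \Delta\kappa_c$ vanish.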

A higher-order corrector term
Derivatives of $s_\mu = -\mu F'(x_\mu)$:
\[
\begin{aligned}
\dot{s}_\mu + \mu F''(x_\mu) \dot{x}_\mu &= -F'(x_\mu), \\
\ddot{s}_\mu + \mu F''(x_\mu) \ddot{x}_\mu &= -2 F''(x_\mu) \dot{x}_\mu - \mu F'''(x_\mu)[\dot{x}_\mu, \dot{x}_\mu].
\end{aligned}
\]
Since
\[
\mu \dot{x}_\mu = -[F''(x_\mu)]^{-1} (F'(x_\mu) + \dot{s}_\mu) = x_\mu - [F''(x_\mu)]^{-1} \dot{s}_\mu,
\]
we have
\[
\mu F'''(x_\mu)[\dot{x}_\mu, \dot{x}_\mu] = \underbrace{F'''(x_\mu)[\dot{x}_\mu, x_\mu]}_{-2 F''(x_\mu) \dot{x}_\mu} - F'''(x_\mu)[\dot{x}_\mu, (F''(x_\mu))^{-1} \dot{s}_\mu],
\]
so
\[
\ddot{s}_\mu + \mu F''(x_\mu) \ddot{x}_\mu = F'''(x_\mu)[\dot{x}_\mu, (F''(x_\mu))^{-1} \dot{s}_\mu].
\]

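An affine search-direction
Affine search-direction:
\[
\begin{bmatrix} 0 & A & -b \\ -A^T & 0 & c \\ b^T & -c^T & 0 \end{bmatrix}
\begin{bmatrix} \Delta y_a \\ \Delta x_a \\ \Delta \tau_a \end{bmatrix}
-
\begin{bmatrix} 0 \\ \Delta s_a \\ \Delta \kappa_a \end{bmatrix}
= - \begin{bmatrix} r_p \\ r_d \\ r_g \end{bmatrix},
\]
\[
W \Delta x_a + W^{-T} \Delta s_a = -v, \qquad \tau \Delta\kappa_a + \kappa \Delta\tau_a = -\kappa\tau,
\]
satisfies
\[
(\Delta x_a)^T \Delta s_a + \Delta\tau_a \Delta\kappa_a = 0.
\]
Since
\[
\dot{s}_\mu + \mu F''(x_\mu) \dot{x}_\mu = -F'(x_\mu) = s_\mu,
\]
we interpret $\Delta s_a = -\dot{s}_\mu$ and $\Delta x_a = -\dot{x}_\mu$.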

A higher-order corrector term
From
\[
\ddot{s}_\mu + \mu F''(x_\mu) \ddot{x}_\mu = F'''(x_\mu)[\dot{x}_\mu, (F''(x_\mu))^{-1} \dot{s}_\mu]
\]
we define a corrector direction as
\[
W \Delta x_{\mathrm{cor}} + W^{-T} \Delta s_{\mathrm{cor}} = \tfrac{1}{2} W^{-T} F'''(x)[\Delta x_a, (F''(x))^{-1} \Delta s_a].
\]
Note that
\[
s^T \Delta x_{\mathrm{cor}} + x^T \Delta s_{\mathrm{cor}} = \tfrac{1}{2} x^T F'''(x)[\Delta x_a, (F''(x))^{-1} \Delta s_a] = -(\Delta x_a)^T \Delta s_a,
\]
which is the condition for constant decrease of complementarity.

A higher-order corrector term
• Linear case,
\[
\tfrac{1}{2} F'''(x)[\Delta x_a, (F''(x))^{-1} \Delta s_a] = -\mathrm{diag}(x)^{-1} \mathrm{diag}(\Delta x_a) \Delta s_a,
\]
• Semidefinite case,
\[
\tfrac{1}{2} F'''(x)[\Delta x_a, (F''(x))^{-1} \Delta s_a] = -\tfrac{1}{2} x^{-1} \Delta x_a \Delta s_a - \tfrac{1}{2} \Delta s_a \Delta x_a x^{-1} = -(x^{-1}) \circ (\Delta x_a \Delta s_a),
\]
• Second-order cone case,
\[
F'''(x)[(F''(x))^{-1} u] = -\frac{2}{x^T Q x} \left( u x^T Q + Q x u^T - (x^T u) Q \right)
\]
for $Q = \mathrm{diag}(1, -1, \dots, -1)$. Then
\[
F'''(x)[(F''(x))^{-1} u] \, e = -2 (x^{-1} \circ u).
\]
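As a worked check of the linear case, the following short derivation assumes the standard logarithmic barrier $F(x) = -\sum_j \log x_j$ for the nonnegative orthant (the barrier itself is not written out on the slide).

```latex
\begin{aligned}
F'(x) &= -\mathrm{diag}(x)^{-1} e, \qquad
F''(x) = \mathrm{diag}(x)^{-2}, \qquad
F'''(x)[u] = -2\,\mathrm{diag}(x)^{-3}\,\mathrm{diag}(u), \\
(F''(x))^{-1} \Delta s_a &= \mathrm{diag}(x)^{2}\, \Delta s_a, \\
\tfrac{1}{2} F'''(x)[\Delta x_a, (F''(x))^{-1} \Delta s_a]
  &= -\mathrm{diag}(x)^{-3}\, \mathrm{diag}(\Delta x_a)\, \mathrm{diag}(x)^{2}\, \Delta s_a
   = -\mathrm{diag}(x)^{-1}\, \mathrm{diag}(\Delta x_a)\, \Delta s_a, \\
x^T \left( -\mathrm{diag}(x)^{-1}\, \mathrm{diag}(\Delta x_a)\, \Delta s_a \right)
  &= -(\Delta x_a)^T \Delta s_a .
\end{aligned}
```

With $W = \mathrm{diag}(\sqrt{s/x})$, multiplying the corrector equation by $\mathrm{diag}(\sqrt{x \circ s})$ gives $s \circ \Delta x_{\mathrm{cor}} + x \circ \Delta s_{\mathrm{cor}} = -\Delta x_a \circ \Delta s_a$, which is the familiar second-order term of Mehrotra's predictor-corrector method for linear programming.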

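Combined centering-corrector direction
A combined centering-corrector direction:
\[
\begin{bmatrix} 0 & A & -b \\ -A^T & 0 & c \\ b^T & -c^T & 0 \end{bmatrix}
\begin{bmatrix} \Delta y \\ \Delta x \\ \Delta \tau \end{bmatrix}
-
\begin{bmatrix} 0 \\ \Delta s \\ \Delta \kappa \end{bmatrix}
= (\gamma - 1) \begin{bmatrix} r_p \\ r_d \\ r_g \end{bmatrix},
\]
\[
W \Delta x + W^{-T} \Delta s = \gamma\mu \tilde{v} - v + \tfrac{1}{2} W^{-T} F'''(x)[\Delta x_a, (F''(x))^{-1} \Delta s_a],
\]
\[
\tau \Delta\kappa + \kappa \Delta\tau = \gamma\mu - \tau\kappa - \Delta\tau_a \Delta\kappa_a.
\]
All residuals and complementarity decrease by $\eta$.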

Computing the scaling matrix
Theorem (Schnabel [2]). Let $S, Y \in \mathbb{R}^{n \times p}$ have full rank $p$. Then there exists $H \succ 0$ such that $HS = Y$ if and only if $Y^T S \succ 0$.
As a consequence
\[
H = Y (Y^T S)^{-1} Y^T + Z Z^T
\]
where $S^T Z = 0$, $\mathrm{rank}(Z) = n - p$. We have $n = 3$, $p = 2$ and
\[
S := \begin{bmatrix} x & \tilde{x} \end{bmatrix}, \qquad Y := \begin{bmatrix} s & \tilde{s} \end{bmatrix},
\]
with $\det(Y^T S) = \nu^2 (\mu \tilde{\mu} - 1) \geq 0$ vanishing only on the central path.
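The construction in the theorem can be illustrated with a few lines of NumPy. This is a sketch on random data rather than on actual cone iterates: `S` is random, and `Y` is generated as `H0 @ S` for a hypothetical positive definite `H0`, which guarantees $Y^T S = S^T H_0 S \succ 0$. The null-space basis `Z` is taken from a full SVD of `S`.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 3, 2

# Illustrative data: Y = H0 S for some H0 > 0, so that Y^T S = S^T H0 S > 0.
S = rng.standard_normal((n, p))
B = rng.standard_normal((n, n))
H0 = B @ B.T + n * np.eye(n)
Y = H0 @ S

# Orthonormal basis Z of the null space of S^T (n - p columns).
U, _, _ = np.linalg.svd(S, full_matrices=True)
Z = U[:, p:]

# H = Y (Y^T S)^{-1} Y^T + Z Z^T
H = Y @ np.linalg.solve(Y.T @ S, Y.T) + Z @ Z.T

secant_ok = np.allclose(H @ S, Y)                      # H S = Y
min_eig = np.linalg.eigvalsh((H + H.T) / 2).min()      # H > 0 iff min_eig > 0
print(secant_ok, min_eig > 0)
```

Note that `H` generally differs from `H0`: the secant equations $HS = Y$ only determine $H$ on the range of $S$, and the $ZZ^T$ term fills in the remaining $n - p$ directions.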

Computing the scaling matrix
Any scaling with $n = 3$ satisfies
\[
W^T W = Y (Y^T S)^{-1} Y^T + z z^T
\]
where $\begin{bmatrix} x & \tilde{x} \end{bmatrix}^T z = 0$, $z \neq 0$. Expanding the BFGS update [2]
\[
H_+ = H + Y (Y^T S)^{-1} Y^T - H S (S^T H S)^{-1} S^T H,
\]
for $H \succ 0$ gives the scaling by Tunçel [3] and Myklebust [1], i.e.,
\[
z z^T = H - H S (S^T H S)^{-1} S^T H,
\]
with $H = \mu F''(x)$.
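A small numerical sketch of the block BFGS update shows why the last term reduces to a single vector $z$ when $n = 3$ and $p = 2$. The data are random placeholders: `H` stands in for $\mu F''(x)$, and `Y` is generated so that $Y^T S \succ 0$. None of this uses actual exponential-cone iterates.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 3, 2
S = rng.standard_normal((n, p))
B = rng.standard_normal((n, n))
Y = (B @ B.T + n * np.eye(n)) @ S      # illustrative data with Y^T S > 0
C = rng.standard_normal((n, n))
H = C @ C.T + np.eye(n)                # stand-in for H = mu * F''(x)

# zz^T = H - H S (S^T H S)^{-1} S^T H
HS = H @ S
low_rank = H - HS @ np.linalg.solve(S.T @ HS, HS.T)

# H_+ = H + Y (Y^T S)^{-1} Y^T - H S (S^T H S)^{-1} S^T H
H_plus = Y @ np.linalg.solve(Y.T @ S, Y.T) + low_rank

# Recover z from the single nonzero eigenpair of the rank-one term.
eigvals, eigvecs = np.linalg.eigh((low_rank + low_rank.T) / 2)
z = np.sqrt(eigvals[-1]) * eigvecs[:, -1]

print(np.allclose(H_plus @ S, Y),               # secant equations hold
      np.allclose(np.outer(z, z), low_rank),    # the term is exactly z z^T
      np.allclose(S.T @ z, 0))                  # z is orthogonal to S
```

The rank-one structure follows because $H - HS(S^T H S)^{-1} S^T H$ is positive semidefinite and annihilates the $p$ columns of $S$, leaving rank $n - p = 1$.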

A negative result on complexity
Nesterov's long-step Hessian estimation property holds if
\[
F'''(x)[u] \preceq 0, \quad \forall x \in \mathrm{int}(K), \; \forall u \in K.
\]
We have
\[
F'''([1; 1; -1])[u] =
u_1 \begin{pmatrix} -9 & 6 & 3 \\ 6 & -5 & -3 \\ 3 & -3 & -2 \end{pmatrix}
+ u_2 \begin{pmatrix} 6 & -5 & -3 \\ -5 & 2 & 3 \\ -3 & 3 & 2 \end{pmatrix}
+ u_3 \begin{pmatrix} 3 & -3 & -2 \\ -3 & 3 & 2 \\ -2 & 2 & 2 \end{pmatrix}.
\]
This is not negative semidefinite for all $u \in K$.
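The three matrices can be reproduced numerically. The sketch below assumes the usual three-term logarithmic barrier for the exponential cone, $F(x) = -\log(x_2 \log(x_1/x_2) - x_3) - \log x_1 - \log x_2$, which is not written out on the slide. It builds the third-derivative tensor with the chain rule and then evaluates $F'''(x)[u]$ at one hand-picked $u \in K_{\exp}$. The function name and the choice of `u` are illustrative.

```python
import numpy as np

def exp_barrier_third_derivative(x):
    """Third-derivative tensor of F(x) = -log(psi) - log(x1) - log(x2),
    with psi = x2*log(x1/x2) - x3 (assumed barrier, see the lead-in)."""
    x1, x2, x3 = x
    psi = x2 * np.log(x1 / x2) - x3
    g = np.array([x2 / x1, np.log(x1 / x2) - 1.0, -1.0])   # gradient of psi
    H = np.zeros((3, 3))                                    # Hessian of psi
    H[0, 0] = -x2 / x1**2
    H[0, 1] = H[1, 0] = 1.0 / x1
    H[1, 1] = -1.0 / x2
    D = np.zeros((3, 3, 3))                                 # third derivative of psi
    D[0, 0, 0] = 2.0 * x2 / x1**3
    D[0, 0, 1] = D[0, 1, 0] = D[1, 0, 0] = -1.0 / x1**2
    D[1, 1, 1] = 1.0 / x2**2
    # Chain rule for the third derivative of -log(psi).
    T = (-D / psi
         + (np.einsum('ij,k->ijk', H, g) + np.einsum('ik,j->ijk', H, g)
            + np.einsum('jk,i->ijk', H, g)) / psi**2
         - 2.0 * np.einsum('i,j,k->ijk', g, g, g) / psi**3)
    T[0, 0, 0] -= 2.0 / x1**3      # contribution of -log(x1)
    T[1, 1, 1] -= 2.0 / x2**3      # contribution of -log(x2)
    return T

T = exp_barrier_third_derivative(np.array([1.0, 1.0, -1.0]))

u = np.array([150.0, 1.0, 5.0])            # in K_exp, since 150 >= exp(5) ~ 148.4
M_u = np.einsum('ijk,k->ij', T, u)         # the matrix F'''(x)[u]
max_eig = np.linalg.eigvalsh(M_u).max()    # positive => not negative semidefinite
print(T[:, :, 0], T[:, :, 1], T[:, :, 2], max_eig, sep="\n")
```

The slices `T[:, :, 0]`, `T[:, :, 1]`, and `T[:, :, 2]` are the coefficient matrices of $u_1$, $u_2$, and $u_3$. The particular `u` was chosen by testing the quadratic form along $(1, 3, -3)$, a null vector of the $u_1$ matrix, where the form reduces to $-24 u_2 + 6 u_3$.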

Comparing MOSEK and ECOS conic solvers
[Figure: iterations (y-axis) per problem instance (x-axis) for MOSEK (2), MOSEK w/o corr (29), and ECOS (41).]
Iteration counts for different exponential cone problems. Failures marked with ⋄.

Comparing MOSEK and ECOS conic solvers
[Figure: log-log scatter plot of time other (y-axis) against time MOSEK (x-axis) for MOSEK w/o corr (29) and ECOS (41).]
Solution time for different exponential cone problems. Failures marked with ⋄.

Comparing MOSEK and ECOS conic solvers
[Figure: feasibility measure, largest error on a log scale (y-axis) per problem instance (x-axis), for MOSEK, MOSEK w/o corr, and ECOS.]
Feasibility measures for different exponential cone problems.

Comparing MOSEK conic and MOSEK GP solvers
[Figure: iterations (y-axis) per problem instance (x-axis) for conic (3), GP primal (14), and GP dual (2).]
Iteration counts for different GPs. Failures marked with ⋄.

Comparing MOSEK conic and MOSEK GP solvers
[Figure: log-log scatter plot of time GP (y-axis) against time conic (x-axis) for GP primal (12) and GP dual (2).]
Solution time for different GPs. Failures marked with ⋄.

Comparing MOSEK conic and MOSEK GP solvers
[Figure: feasibility measure, largest error on a log scale (y-axis) per problem instance (x-axis), for conic, GP primal, and GP dual.]
Feasibility measures for different GPs.

References
[1] T. Myklebust and L. Tunçel. Interior-point algorithms for convex optimization based on primal-dual metrics. Technical report, 2014.
[2] R. B. Schnabel. Quasi-Newton methods using multiple secant equations. Technical report, Colorado Univ., Boulder, Dept. Comp. Sci., 1983.
[3] L. Tunçel. Generalization of primal-dual interior-point methods to convex optimization problems in conic form. Foundations of Computational Mathematics, 1:229–254, 2001.
