  1. Semidefinite Optimization using MOSEK. Joachim Dahl, ISMP Berlin, August 23, 2012. http://www.mosek.com

  2. Introduction

■ MOSEK is a state-of-the-art solver for large-scale linear and conic quadratic problems.
■ Based on the homogeneous model using Nesterov-Todd scaling.
■ The next version includes semidefinite optimization.
■ The goal for the first version is to beat SeDuMi.

Why SDP in MOSEK?

■ It is very powerful and flexible for modeling.
■ We want the SDP work you developed to be available to our customers!
■ Customers have been asking about it for a while.

  3. Semidefinite optimization

  4. Semidefinite optimization

■ Primal and dual problems:

$$
\begin{array}{ll@{\qquad}ll}
\text{minimize} & \textstyle\sum_{j=1}^{p} C_j \bullet X_j & \text{maximize} & b^T y \\
\text{subject to} & \textstyle\sum_{j=1}^{p} A_{ij} \bullet X_j = b_i, & \text{subject to} & \textstyle\sum_{i=1}^{m} y_i A_{ij} + S_j = C_j, \\
& X_j \in \mathcal{S}^{n_j}_{+}, & & S_j \in \mathcal{S}^{n_j}_{+},
\end{array}
$$

where $A_{ij}, C_j \in \mathcal{S}^{n_j}$.

■ Optimality conditions:

$$
\sum_{j=1}^{p} A_{ij} \bullet X_j = b_i, \qquad \sum_{i=1}^{m} y_i A_{ij} + S_j = C_j, \qquad X_j S_j = 0, \qquad X_j, S_j \in \mathcal{S}^{n_j}_{+}.
$$

■ Equivalent complementarity conditions if $X_j, S_j \in \mathcal{S}^{n_j}_{+}$:

$$
X_j \bullet S_j = 0 \iff X_j S_j = 0 \iff X_j S_j + S_j X_j = 0.
$$
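The equivalence of the three complementarity conditions is easy to verify numerically. The following is a minimal NumPy sketch (my own illustration, not from the deck): it builds a complementary PSD pair by placing X and S on orthogonal eigenspaces, so all three conditions vanish together.

```python
import numpy as np

rng = np.random.default_rng(0)

# Complementary PSD pair: X and S live on orthogonal eigenspaces.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # orthonormal columns
X = Q[:, :2] @ np.diag([1.5, 0.5]) @ Q[:, :2].T    # PSD, rank 2
S = Q[:, 2:] @ np.diag([2.0, 1.0]) @ Q[:, 2:].T    # PSD, rank 2

print(np.trace(X @ S))                 # X • S        ~ 0
print(np.linalg.norm(X @ S))           # X S          ~ 0
print(np.linalg.norm(X @ S + S @ X))   # X S + S X    ~ 0
```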

  5. Algorithms

  6. Notation

For symmetric matrices $U \in \mathcal{S}^n$ we define

$$
\mathrm{svec}(U) = (U_{11}, \sqrt{2}U_{21}, \dots, \sqrt{2}U_{n1}, U_{22}, \sqrt{2}U_{32}, \dots, \sqrt{2}U_{n2}, \dots, U_{nn})^T
$$

with inverse operation

$$
\mathrm{smat}(u) = \begin{pmatrix}
u_1 & u_2/\sqrt{2} & \cdots & u_n/\sqrt{2} \\
u_2/\sqrt{2} & u_{n+1} & \cdots & u_{2n-1}/\sqrt{2} \\
\vdots & \vdots & \ddots & \vdots \\
u_n/\sqrt{2} & u_{2n-1}/\sqrt{2} & \cdots & u_{n(n+1)/2}
\end{pmatrix},
$$

and a symmetric product

$$
u \circ v := \tfrac{1}{2}\,\mathrm{svec}\big(\mathrm{smat}(u)\,\mathrm{smat}(v) + \mathrm{smat}(v)\,\mathrm{smat}(u)\big).
$$
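The svec/smat pair is straightforward to implement directly. Below is a NumPy sketch of the definitions above (my own illustration; MOSEK's internal routines are not public). The key invariant is svec(U)^T svec(V) = trace(UV).

```python
import numpy as np

def svec(U):
    """Stack the lower triangle of symmetric U column by column,
    scaling off-diagonal entries by sqrt(2), so that
    svec(U) @ svec(V) == trace(U @ V)."""
    n = U.shape[0]
    rows, cols = np.tril_indices(n)
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    order = np.lexsort((rows, cols))   # tril_indices is row-major; reorder by column
    return (scale * U[rows, cols])[order]

def smat(u):
    """Inverse of svec."""
    n = int((np.sqrt(8 * len(u) + 1) - 1) / 2)   # len(u) = n(n+1)/2
    U = np.zeros((n, n))
    rows, cols = np.tril_indices(n)
    order = np.lexsort((rows, cols))
    U[rows[order], cols[order]] = u
    U = U + U.T - np.diag(np.diag(U))            # symmetrize
    U[~np.eye(n, dtype=bool)] /= np.sqrt(2.0)    # undo the sqrt(2) scaling
    return U

def sym_prod(u, v):
    """u ∘ v = (1/2) svec(smat(u) smat(v) + smat(v) smat(u))."""
    U, V = smat(u), smat(v)
    return 0.5 * svec(U @ V + V @ U)

# Quick check of the trace-inner-product invariant.
rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3)); U = (M + M.T) / 2
N = rng.standard_normal((3, 3)); V = (N + N.T) / 2
print(np.isclose(svec(U) @ svec(V), np.trace(U @ V)))   # True
```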

  7. Notation

Let $a_{ij} = \mathrm{svec}(A_{ij})$, $c_j := \mathrm{svec}(C_j)$, $x_j := \mathrm{svec}(X_j)$, $s_j := \mathrm{svec}(S_j)$, and

$$
A := \begin{pmatrix} a_{11}^T & \cdots & a_{1p}^T \\ \vdots & & \vdots \\ a_{m1}^T & \cdots & a_{mp}^T \end{pmatrix}, \quad
c := \begin{pmatrix} c_1 \\ \vdots \\ c_p \end{pmatrix}, \quad
x := \begin{pmatrix} x_1 \\ \vdots \\ x_p \end{pmatrix}, \quad
s := \begin{pmatrix} s_1 \\ \vdots \\ s_p \end{pmatrix}.
$$

We then have primal and dual problems:

$$
\begin{array}{ll@{\qquad}ll}
\text{minimize} & c^T x & \text{maximize} & b^T y \\
\text{subject to} & Ax = b & \text{subject to} & A^T y + s = c \\
& x \succeq 0, & & s \succeq 0.
\end{array}
$$
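Stacking the blocks this way is mechanical. A sketch using the svec from the previous snippet (dimensions and data are made up for illustration) confirms that the trace inner products survive vectorization:

```python
import numpy as np   # svec as defined in the previous sketch

p, m = 2, 3
dims = [3, 2]                                     # block orders n_1, n_2 (made up)
rng = np.random.default_rng(1)
sym = lambda M: (M + M.T) / 2

C = [sym(rng.standard_normal((n, n))) for n in dims]
X = [sym(rng.standard_normal((n, n))) for n in dims]
Ablk = [[sym(rng.standard_normal((n, n))) for n in dims] for _ in range(m)]

c = np.concatenate([svec(Cj) for Cj in C])
x = np.concatenate([svec(Xj) for Xj in X])
A = np.vstack([np.concatenate([svec(Ablk[i][j]) for j in range(p)])
               for i in range(m)])               # rows a_{ij}^T, stacked over j

# Objective and constraint values agree with the matrix form:
print(np.isclose(sum(np.trace(Cj @ Xj) for Cj, Xj in zip(C, X)), c @ x))   # True
print(np.allclose(A @ x,
                  [sum(np.trace(Ablk[i][j] @ X[j]) for j in range(p))
                   for i in range(m)]))                                    # True
```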

  8. The homogeneous model

Simplified homogeneous model:

$$
\begin{pmatrix} 0 & A^T & -c \\ A & 0 & -b \\ c^T & -b^T & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \\ \tau \end{pmatrix} +
\begin{pmatrix} s \\ 0 \\ \kappa \end{pmatrix} =
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\qquad x, s \succeq 0, \quad \tau, \kappa \geq 0.
$$

Note that $(x, y, s, \tau, \kappa) = 0$ is feasible, and $x^T s + \kappa\tau = 0$.

■ If $\tau > 0$, $\kappa = 0$ then $(x, y, s)/\tau$ is a primal-dual optimal solution:
$$ Ax = b\tau, \qquad A^T y + s = c\tau, \qquad x^T s = 0. $$

■ If $\kappa > 0$, $\tau = 0$ then either the primal or the dual problem is infeasible:
$$ Ax = 0, \qquad A^T y + s = 0, \qquad c^T x - b^T y < 0. $$
The primal is infeasible if $b^T y > 0$; the dual is infeasible if $c^T x < 0$.
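A hedged sketch of how the τ/κ dichotomy could be turned into a termination test (my own illustration; the tolerances and certificate handling are illustrative, not MOSEK's actual stopping criteria):

```python
import numpy as np

def classify(x, y, s, tau, kappa, A, b, c, tol=1e-8):
    """Interpret a (near-)solution of the homogeneous model."""
    if tau > tol and kappa <= tol:
        # (x, y, s)/tau is a primal-dual optimal pair.
        return "optimal", (x / tau, y / tau, s / tau)
    if kappa > tol and tau <= tol:
        if b @ y > tol:
            return "primal infeasible", (y, s)   # Farkas-type certificate
        if c @ x < -tol:
            return "dual infeasible", x          # unboundedness certificate
    return "undecided", None
```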

  9. The central path

We define the central path as:

$$
\begin{pmatrix} 0 & A^T & -c \\ A & 0 & -b \\ c^T & -b^T & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \\ \tau \end{pmatrix} +
\begin{pmatrix} s \\ 0 \\ \kappa \end{pmatrix} =
\gamma \begin{pmatrix} A^T y^{(0)} + s^{(0)} - c\tau^{(0)} \\ Ax^{(0)} - b\tau^{(0)} \\ c^T x^{(0)} - b^T y^{(0)} + \kappa^{(0)} \end{pmatrix},
$$

$$
x \circ s = \gamma\mu^{(0)} e, \qquad \tau\kappa = \gamma\mu^{(0)},
$$

where $\gamma \in [0, 1]$,

$$
\mu^{(0)} = \frac{(x^{(0)})^T s^{(0)} + \kappa^{(0)}\tau^{(0)}}{n + 1} > 0
\qquad \text{and} \qquad
e = \begin{pmatrix} \mathrm{svec}(I_{n_1}) \\ \vdots \\ \mathrm{svec}(I_{n_p}) \end{pmatrix}.
$$
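For reference, the vector e and the measure μ^(0) are one-liners on top of the earlier svec (a sketch; I assume the slide's n means the total matrix order Σ_j n_j):

```python
import numpy as np   # svec as defined earlier

def e_vector(dims):
    """e = (svec(I_{n_1}); ...; svec(I_{n_p}))."""
    return np.concatenate([svec(np.eye(n)) for n in dims])

def mu0(x, s, kappa, tau, dims):
    """mu^(0) = (x^T s + kappa*tau) / (n + 1), taking n = sum_j n_j (assumption)."""
    return (x @ s + kappa * tau) / (sum(dims) + 1)
```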

  10. Primal-dual scaling

A primal-dual scaling is defined by a non-singular $R_k$ as

$$
W_k(x_k) := \mathrm{svec}(R_k^{-T} X_k R_k^{-1}), \qquad W_k^{-T}(s_k) := \mathrm{svec}(R_k S_k R_k^T),
$$

which preserves

■ the semidefinite cone,
$$ x_k \succ 0, \ s_k \succ 0 \iff W_k(x_k) \succ 0, \ W_k^{-T}(s_k) \succ 0, $$

■ the central path,
$$ x_k \circ s_k = \gamma\mu e_k \iff W_k(x_k) \circ W_k^{-T}(s_k) = \gamma\mu e_k, $$

■ and the complementarity:
$$ x_k^T s_k = W_k(x_k)^T W_k^{-T}(s_k). $$
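Both invariances are easy to confirm numerically for an arbitrary non-singular R, reusing svec from the earlier sketch (my own check, not from the deck):

```python
import numpy as np   # svec as defined earlier

rng = np.random.default_rng(2)
n = 4
R = rng.standard_normal((n, n))                    # any non-singular R
Rinv = np.linalg.inv(R)
M = rng.standard_normal((n, n)); Xk = M @ M.T + np.eye(n)   # X positive definite
N = rng.standard_normal((n, n)); Sk = N @ N.T + np.eye(n)   # S positive definite

wx = svec(Rinv.T @ Xk @ Rinv)    # W(x)
ws = svec(R @ Sk @ R.T)          # W^{-T}(s)

print(np.linalg.eigvalsh(Rinv.T @ Xk @ Rinv).min() > 0)   # cone preserved
print(np.isclose(svec(Xk) @ svec(Sk), wx @ ws))           # x^T s preserved
```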

  11. Two popular primal-dual scalings

■ Helmberg-Kojima-Monteiro (HKM): $R_k^{-1} = S_k^{1/2}$:

$$
W_k(x_k) = \mathrm{svec}(S_k^{1/2} X_k S_k^{1/2}), \qquad W_k^{-T}(s_k) = e_k.
$$

■ Nesterov-Todd (NT):

$$
R_k^{-1} = \left[ S_k^{-1/2} \big( S_k^{1/2} X_k S_k^{1/2} \big)^{1/2} S_k^{-1/2} \right]^{-1/2},
$$

which makes the scaled primal and dual variables coincide: $R_k^{-T} X_k R_k^{-1} = R_k S_k R_k^T$. Computed as

$$
LL^T = X_k, \qquad Q\Lambda Q^T = L^T S_k L, \qquad R_k^{-1} = \Lambda^{1/4} Q^T L^{-1},
$$

which satisfies $R_k^{-1} X_k R_k^{-T} = R_k^T S_k R_k = \Lambda^{1/2}$.
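The Cholesky-plus-eigendecomposition recipe translates directly into NumPy. The sketch below (my own transcription of the formulas above) also verifies the scaled identity; np.linalg.inv is used only for brevity, where a real implementation would use triangular solves.

```python
import numpy as np

def nt_scaling(X, S):
    """NT scaling per the recipe above: L L^T = X, Q diag(lam) Q^T = L^T S L,
    R^{-1} = diag(lam)^{1/4} Q^T L^{-1}."""
    L = np.linalg.cholesky(X)                 # lower-triangular Cholesky factor
    lam, Q = np.linalg.eigh(L.T @ S @ L)      # symmetric eigendecomposition
    Rinv = np.diag(lam ** 0.25) @ Q.T @ np.linalg.inv(L)
    return Rinv, lam

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4)); X = M @ M.T + np.eye(4)
N = rng.standard_normal((4, 4)); S = N @ N.T + np.eye(4)

Rinv, lam = nt_scaling(X, S)
R = np.linalg.inv(Rinv)
print(np.allclose(Rinv @ X @ Rinv.T, R.T @ S @ R))             # scaled X = scaled S
print(np.allclose(Rinv @ X @ Rinv.T, np.diag(np.sqrt(lam))))   # = Lambda^{1/2}
```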

  12. The search direction

Linearizing the scaled centrality condition

$$ W_k(x_k) \circ W_k^{-T}(s_k) = \gamma\mu e_k $$

and discarding higher-order terms leads to

$$ \Delta x_k + \Pi_k \Delta s_k = \gamma\mu s_k^{-1} - x_k, \qquad s_k^{-1} := \mathrm{svec}(S_k^{-1}), $$

where

■ HKM scaling: $\Pi_k z_k = \mathrm{svec}\big((X_k Z_k S_k^{-1} + S_k^{-1} Z_k X_k)/2\big)$,
■ NT scaling: $\Pi_k z_k = \mathrm{svec}(R_k^T R_k Z_k R_k^T R_k)$.
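Both scaling operators act on z = svec(Z) as congruence-type transforms. A direct sketch in terms of the earlier svec/smat (my own illustration):

```python
import numpy as np   # svec/smat as defined earlier

def pi_hkm(X, Sinv, z):
    """HKM: Pi z = svec((X Z S^{-1} + S^{-1} Z X) / 2)."""
    Z = smat(z)
    return svec((X @ Z @ Sinv + Sinv @ Z @ X) / 2)

def pi_nt(R, z):
    """NT: Pi z = svec(W Z W) with W = R^T R."""
    W = R.T @ R
    return svec(W @ smat(z) @ W)
```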

  13. The search direction

The most expensive part of the interior-point optimizer is solving:

$$
\begin{aligned}
A\Delta x - b\Delta\tau &= (\gamma - 1)(Ax - b\tau) &&= r_1 \\
A^T\Delta y + \Delta s - c\Delta\tau &= (\gamma - 1)(A^T y + s - c\tau) &&= r_2 \\
c^T\Delta x - b^T\Delta y + \Delta\kappa &= (\gamma - 1)(c^T x - b^T y + \kappa) &&= r_3 \\
\Delta x + \Pi\Delta s &= \gamma\mu s^{-1} - x &&= r_4 \\
\Delta\kappa + (\kappa/\tau)\Delta\tau &= \gamma\mu/\tau - \kappa &&= r_5
\end{aligned}
$$

This gives a constant decrease in the residuals and complementarity:

$$
A(x + \alpha\Delta x) - b(\tau + \alpha\Delta\tau) = (1 - \alpha(1 - \gamma))(Ax - b\tau),
$$
$$
(x + \alpha\Delta x)^T(s + \alpha\Delta s) + (\kappa + \alpha\Delta\kappa)(\tau + \alpha\Delta\tau) = \underbrace{(1 - \alpha(1 - \gamma))}_{< 1}\,(x^T s + \kappa\tau).
$$

Polynomial complexity follows for properly chosen $\gamma$.

  14. Solving for the search direction

After block elimination, we get a 2×2 system:

$$
\begin{aligned}
A\Pi A^T \Delta y - (A\Pi c + b)\Delta\tau &= r_y \\
(A\Pi c - b)^T \Delta y + (c^T\Pi c + \kappa/\tau)\Delta\tau &= r_\tau.
\end{aligned}
$$

We factor $A\Pi A^T = LL^T$ and eliminate

$$ \Delta y = L^{-T}L^{-1}\big((A\Pi c + b)\Delta\tau + r_y\big). $$

Then

$$
\left[ (A\Pi c - b)^T L^{-T}L^{-1}(A\Pi c + b) + (c^T\Pi c + \kappa/\tau) \right]\Delta\tau = r_\tau - (A\Pi c - b)^T L^{-T}L^{-1} r_y.
$$

This is one extra (insignificant) solve compared to an infeasible-start method.
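A dense transcription of this elimination (my own sketch: Π is passed as an explicit matrix, which is only viable for small problems) using one Cholesky factorization of AΠA^T:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def solve_2x2(A, Pi, b, c, kappa, tau, ry, rtau):
    """Solve the reduced 2x2 system above for (dy, dtau)."""
    F = cho_factor(A @ Pi @ A.T)              # L L^T = A Pi A^T
    u = cho_solve(F, A @ Pi @ c + b)          # L^{-T} L^{-1} (A Pi c + b)
    v = cho_solve(F, ry)                      # L^{-T} L^{-1} r_y
    den = (A @ Pi @ c - b) @ u + (c @ Pi @ c + kappa / tau)
    dtau = (rtau - (A @ Pi @ c - b) @ v) / den
    dy = u * dtau + v                         # back-substitute for dy
    return dy, dtau
```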

  15. Forming the Schur complement

The Schur complement is computed as a sum of outer products:

$$
A\Pi A^T = \begin{pmatrix} a_{11}^T & \cdots & a_{1p}^T \\ \vdots & & \vdots \\ a_{m1}^T & \cdots & a_{mp}^T \end{pmatrix}
\begin{pmatrix} \Pi_1 & & \\ & \ddots & \\ & & \Pi_p \end{pmatrix}
\begin{pmatrix} a_{11} & \cdots & a_{m1} \\ \vdots & & \vdots \\ a_{1p} & \cdots & a_{mp} \end{pmatrix}
= \sum_{k=1}^{p} P_k^T \big( P_k A_{:,k} \Pi_k A_{:,k}^T P_k^T \big) P_k,
$$

where $P_k$ permutes the rows of $A_{:,k}$. The complexity depends on the choice of $P_k$. We compute only $\mathrm{tril}(A\Pi A^T)$, so by ordering

$$
\mathrm{nnz}\big((P_k A_{:,k})_1\big) \geq \mathrm{nnz}\big((P_k A_{:,k})_2\big) \geq \cdots \geq \mathrm{nnz}\big((P_k A_{:,k})_m\big),
$$

we get a good reduction in complexity. This approach is more general than SDPA's.
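In dense form the accumulation is a short loop. This sketch (my own, ignoring the sparsity-driven permutations P_k, which are the whole point in the large-scale setting) keeps only the lower triangle:

```python
import numpy as np

def schur_lower(A_blocks, Pi_blocks):
    """Lower triangle of A Pi A^T.  A_blocks[k] is the m x (n_k(n_k+1)/2)
    slice A_{:,k}; Pi_blocks[k] is Pi_k as a dense matrix.  The nnz-based
    row permutations P_k from the slide are omitted here."""
    m = A_blocks[0].shape[0]
    H = np.zeros((m, m))
    for Ak, Pik in zip(A_blocks, Pi_blocks):
        H += np.tril(Ak @ Pik @ Ak.T)   # one outer-product term per block
    return H
```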

  16. Forming the Schur complement

To evaluate each term

$$
a_{ik}^T \Pi_k a_{jk} = (R_k^T A_{ik} R_k) \bullet (R_k^T A_{jk} R_k) = A_{ik} \bullet (R_k R_k^T A_{jk} R_k R_k^T),
$$

we follow SeDuMi. Rewrite

$$
R_k R_k^T A_{jk} R_k R_k^T = \hat{M}M + M^T\hat{M}^T,
$$

i.e., a symmetric (low-rank) product.

■ $A_{ik}$ sparse: evaluate $\hat{M}M + M^T\hat{M}^T$ element by element.
■ $A_{ik}$ dense: evaluate $\hat{M}M + M^T\hat{M}^T$ using level 3 BLAS.

Low-rank structure may be exploited later.
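As a consistency check, one Schur-complement entry can be evaluated both through the Π operator of the earlier sketch and through the matrix inner product above (my own check; I use W = R^T R as in that sketch, and the symmetric matrices are random data):

```python
import numpy as np   # svec and pi_nt as defined in the earlier sketches

rng = np.random.default_rng(4)
n = 4
sym = lambda M: (M + M.T) / 2
A_ik, A_jk = sym(rng.standard_normal((n, n))), sym(rng.standard_normal((n, n)))
R = rng.standard_normal((n, n))
W = R.T @ R

lhs = svec(A_ik) @ pi_nt(R, svec(A_jk))   # a_ik^T Pi_k a_jk
rhs = np.trace(A_ik @ W @ A_jk @ W)       # A_ik • (W A_jk W)
print(np.isclose(lhs, rhs))               # True
```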

  17. Practical details

Additionally, the MOSEK solver

■ uses Mehrotra's predictor-corrector step;
■ solves mixed linear, conic quadratic, and semidefinite problems;
■ employs a presolve step for linear variables, which can reduce problem size and complexity significantly;
■ scales constraints to improve the conditioning of the problem.

  18. Results and examples

  19. SDPLIB benchmark

■ We use the subset of SDPLIB used by H. D. Mittelmann for his SDP solver comparison.
■ For brevity we report only iteration count, computation time, and relative gap at termination.
■ Our SDPA-format converter detects block-diagonal structure within the semidefinite blocks (SeDuMi and SDPA do not), so for a few problems our solution time is much smaller.
■ All solvers run on a single thread on an Intel Xeon E32270 with 4 cores at 3.4 GHz and 32 GB memory, under Linux.
