A practical primal-dual interior-point algorithm for nonsymmetric conic optimization

A practical primal-dual interior-point algorithm for nonsymmetric conic optimization September 8, 2020 Erling D. Andersen (joint work with Joachim Dahl) MOSEK ApS, Email: e.d.andersen@mosek.com Personal WWW: https://erling.andersens.name


  1. A practical primal-dual interior-point algorithm for nonsymmetric conic optimization

     September 8, 2020

     Erling D. Andersen (joint work with Joachim Dahl)
     MOSEK ApS
     Email: e.d.andersen@mosek.com
     Personal WWW: https://erling.andersens.name
     Company WWW: https://mosek.com

  2. Outline

     • Conic optimization
       • The problem
       • The two nonsymmetric cones
     • A primal-dual interior-point algorithm
       • Survey of algorithms
       • Preliminaries
       • Motivation
       • The algorithm
     • Computational results
     • Summary

  3. Section 1 Conic optimization

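  4. Generic conic optimization problem

     Primal form:

     $$
     \begin{array}{ll}
     \text{minimize}   & \sum_k (c^k)^T x^k \\
     \text{subject to} & \sum_k A^k x^k = b, \\
                       & x^k \in K^k, \quad \forall k,
     \end{array}
     $$

     where
     • $c^k \in \mathbb{R}^{n^k}$,
     • $A^k \in \mathbb{R}^{m \times n^k}$,
     • $b \in \mathbb{R}^m$,
     • $K^k$ are convex cones.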

  5. The 5 cones

     3 symmetric cones:
     • Linear.
     • Quadratic.
     • Semidefinite.

     2 nonsymmetric cones:
     • Exponential.
     • Power.

     Observation:
     • Almost all convex optimization problems appearing in practice can be formulated using those 5 cones.

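  6. The power cone

     The power cone:

     $$K_{\mathrm{pow}}(\alpha) := \left\{ (x, z) : \prod_{j=1}^{n} x_j^{|\alpha_j|} \ge \|z\|^{\sum_{j=1}^{n} |\alpha_j|}, \ x \ge 0 \right\}.$$

     Examples ($\alpha \in (0, 1)$):

     $$
     \begin{array}{rcl}
     (t, 1, x) \in K_{\mathrm{pow}}(\alpha, 1 - \alpha) & \Leftrightarrow & t \ge |x|^{1/\alpha}, \ t \ge 0, \\
     (x, 1, t) \in K_{\mathrm{pow}}(\alpha, 1 - \alpha) & \Leftrightarrow & x^{\alpha} \ge |t|, \ x \ge 0, \\
     (x, t) \in K_{\mathrm{pow}}(e) & \Leftrightarrow & \left( \prod_{j=1}^{n} x_j \right)^{1/n} \ge |t|, \ x \ge 0.
     \end{array}
     $$

     See also Chares [2] and the Mosek modelling cookbook [5].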
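
The definition above translates directly into a membership test. The following is a minimal sketch, not part of the talk: the function name `in_power_cone` and the absolute tolerance `tol` are illustrative assumptions.

```python
import math

def in_power_cone(x, z, alpha, tol=1e-12):
    """Return True if (x, z) satisfies prod_j x_j^|alpha_j| >= ||z||^(sum_j |alpha_j|), x >= 0.

    `tol` is a hypothetical absolute tolerance so that boundary points are accepted.
    """
    if len(x) != len(alpha):
        raise ValueError("x and alpha must have the same length")
    if any(xj < 0 for xj in x):
        return False
    weights = [abs(a) for a in alpha]
    lhs = math.prod(xj ** w for xj, w in zip(x, weights))
    norm_z = math.sqrt(sum(zi * zi for zi in z))
    rhs = norm_z ** sum(weights)
    return lhs >= rhs - tol
```

For instance, with `alpha = [0.5, 0.5]` the point `x = [t, 1]`, `z = [u]` is accepted exactly when `t >= u**2`, which is the first example on the slide with $\alpha = 1/2$.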

  7. The exponential cone

     The exponential cone:

     $$K_{\exp} := \left\{ (x_1, x_2, x_3) : x_1 \ge x_2 e^{x_3 / x_2}, \ x_2 > 0 \right\} \cup \left\{ (x_1, x_2, x_3) : x_1 \ge 0, \ x_2 = 0, \ x_3 \le 0 \right\}.$$

     Applications:

     $$
     \begin{array}{rcl}
     (t, 1, x) \in K_{\exp} & \Leftrightarrow & t \ge e^{x}, \\
     (t, 1, \ln(a)\, x) \in K_{\exp} & \Leftrightarrow & t \ge a^{x}, \\
     (x, 1, t) \in K_{\exp} & \Leftrightarrow & t \le \ln(x), \\
     (1, x, t) \in K_{\exp} & \Leftrightarrow & t \le -x \ln(x), \\
     (y, x, -t) \in K_{\exp} & \Leftrightarrow & t \ge x \ln(x / y) \quad \text{(relative entropy)}.
     \end{array}
     $$

     Geometric programming + many more [2, 5].
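
As with the power cone, the two pieces of the definition can be checked case by case. This is a small sketch under stated assumptions: the name `in_exp_cone` and the tolerance `tol` are hypothetical, and an overflow in the exponential is treated as an infinite right-hand side.

```python
import math

def in_exp_cone(x1, x2, x3, tol=1e-12):
    """Return True if (x1, x2, x3) lies in the exponential cone."""
    if x2 > 0:
        try:
            bound = x2 * math.exp(x3 / x2)
        except OverflowError:
            return False  # bound is effectively +infinity
        return x1 >= bound - tol
    if x2 == 0:
        # Closure part of the cone: x1 >= 0, x3 <= 0.
        return x1 >= 0 and x3 <= 0
    return False
```

The entropy example on the slide corresponds to calls of the form `in_exp_cone(1.0, x, t)`, which holds when `t <= -x * math.log(x)`.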

  8. Section 2 A primal-dual interior-point algorithm

  9. Survey

     • Lesson learned from the linear case: Solve the primal and dual problem simultaneously.
     • Symmetric cones: Employ the Nesterov-Todd (NT) algorithm [10, 12].
     • Nonsymmetric cones: How to generalize the NT algorithm?
       • Nesterov [8, 9], Skajaa and Ye [15], Serrano [14]: Computational results available.
       • Tunçel [16], Myklebust [6], Tunçel and Myklebust [7]: No computational results.

     Present work:
     • Follows Myklebust and Tunçel.

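  10. Primal and dual problem

      Primal problem:

      $$
      \begin{array}{ll}
      \text{minimize}   & c^T x \\
      \text{subject to} & A x = b, \\
                        & x \in K,
      \end{array}
      $$

      and the dual:

      $$
      \begin{array}{ll}
      \text{maximize}   & b^T y \\
      \text{subject to} & A^T y + s = c, \\
                        & s \in K^*,
      \end{array}
      $$

      where $K = K^1 \times K^2 \times \cdots \times K^k$ and $K^*$ is the corresponding dual cone. The dual cone is known for the 5 cone types.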

  11. Barrier functions

      Let $F : \operatorname{int}(K) \to \mathbb{R}$ be a three times differentiable function. Then $F$ is a $\nu$-logarithmically homogeneous self-concordant barrier ($\nu$-LHSCB) for $\operatorname{int}(K)$ if

      $$|F'''(x)[u, u, u]| \le 2 \left( F''(x)[u, u] \right)^{3/2}$$

      and

      $$F(\tau x) = F(x) - \nu \log \tau.$$

      See [10, 12].
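
Differentiating the homogeneity identity $F(\tau x) = F(x) - \nu \log \tau$ gives a few standard relations that are used implicitly in later slides. They are not stated on the slide itself; the derivation below is a short worked sketch for $x \in \operatorname{int}(K)$ and $\tau > 0$.

```latex
% Differentiate F(\tau x) = F(x) - \nu \log \tau with respect to x:
\tau F'(\tau x) = F'(x)
  \quad\Longrightarrow\quad F'(\tau x) = \tfrac{1}{\tau} F'(x).

% Differentiate the same identity with respect to \tau and set \tau = 1:
\langle F'(x), x \rangle = -\nu.

% Differentiate \tau F'(\tau x) = F'(x) with respect to \tau and set \tau = 1:
F'(x) + F''(x)\, x = 0
  \quad\Longrightarrow\quad F''(x)\, x = -F'(x).
```

The second relation is what makes $\langle x, -F'(x) \rangle = \nu$, which reappears when the quantities $\tilde{s} := -F'(x)$ and $\mu$ are combined.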

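  12. The dual barrier

      If $F$ is a $\nu$-self-concordant barrier for $K$, then the Fenchel conjugate

      $$F_*(s) = \sup_{x \in \operatorname{int}(K)} \left\{ -\langle s, x \rangle - F(x) \right\} \qquad (1)$$

      is a $\nu$-self-concordant barrier for $K^*$. Let

      $$\tilde{x} := -F_*'(s), \quad \tilde{s} := -F'(x), \quad \mu := \frac{\langle x, s \rangle}{\nu}, \quad \tilde{\mu} := \frac{\langle \tilde{x}, \tilde{s} \rangle}{\nu}.$$

      Then $\tilde{x} \in \operatorname{int}(K)$, $\tilde{s} \in \operatorname{int}(K^*)$ and

      $$\mu \tilde{\mu} \ge 1 \qquad (2)$$

      with equality iff $x = \mu \tilde{x}$ (and $s = \mu \tilde{s}$).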
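
Inequality (2) is easy to check numerically on a cone whose conjugate barrier is available in closed form. The sketch below uses the linear cone $\mathbb{R}^n_+$ with $F(x) = -\sum_i \log x_i$ (so $\nu = n$, $\tilde{x} = 1/s$ and $\tilde{s} = 1/x$ elementwise) purely as an illustration; this choice of cone and the function name `mu_pair` are assumptions, not taken from the talk.

```python
def mu_pair(x, s):
    """Return (mu, mu_tilde) for the nonnegative orthant with F(x) = -sum(log x_i)."""
    n = len(x)  # barrier parameter nu = n for this barrier
    x_tilde = [1.0 / si for si in s]  # x~ = -F_*'(s)
    s_tilde = [1.0 / xi for xi in x]  # s~ = -F'(x)
    mu = sum(xi * si for xi, si in zip(x, s)) / n
    mu_tilde = sum(xt * st for xt, st in zip(x_tilde, s_tilde)) / n
    return mu, mu_tilde

# On the central path (x_i * s_i constant) the product equals 1.
mu, mu_tilde = mu_pair([1.0, 2.0, 4.0], [2.0, 1.0, 0.5])
on_path = mu * mu_tilde

# Away from the central path the product is strictly larger than 1.
mu, mu_tilde = mu_pair([1.0, 1.0, 1.0], [1.0, 2.0, 3.0])
off_path = mu * mu_tilde
```

For the orthant, $\mu \tilde{\mu} \ge 1$ is the arithmetic-harmonic mean inequality applied to the products $x_i s_i$, which is why equality holds only when all products coincide.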

  13. The homogeneous model

      Generalized Goldman-Tucker homogeneous model:

      $$
      (H) \qquad
      \begin{array}{rcl}
      A x - b \tau & = & 0, \\
      A^T y + s - c \tau & = & 0, \\
      -c^T x + b^T y - \kappa & = & 0, \\
      (x; \tau) \in \bar{K}, \quad (s; \kappa) \in \bar{K}^*, & &
      \end{array}
      $$

      where $\bar{K} := K \times \mathbb{R}_+$ and $\bar{K}^* := K^* \times \mathbb{R}_+$.

      • $\bar{K}$ is a Cartesian product of $k + 1$ convex cones.
      • The homogeneous model always has a solution.
      • Partial list of references:
        • Linear case: [4], [3], [17].
        • Nonlinear case: [11].

  14. Investigating the homogeneous model

      Lemma. Let $(x^*, \tau^*, y^*, s^*, \kappa^*)$ be any feasible solution to (H). Then

      i) $(x^*)^T s^* + \tau^* \kappa^* = 0$.
      ii) If $\tau^* > 0$, then $(x^*, y^*, s^*) / \tau^*$ is an optimal solution.
      iii) If $\kappa^* > 0$, then at least one of the strict inequalities

      $$b^T y^* > 0 \qquad (3)$$

      and

      $$c^T x^* < 0 \qquad (4)$$

      holds. If the first inequality holds, then $(P)$ is infeasible. If the second inequality holds, then $(D)$ is infeasible.
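
Part i) of the lemma follows from the skew-symmetric structure of (H). The derivation below is a short worked sketch (it is not on the slide): substitute $s^* = c\tau^* - A^T y^*$ and $\kappa^* = -c^T x^* + b^T y^*$ and collect terms.

```latex
(x^*)^T s^* + \tau^* \kappa^*
  = (x^*)^T \left( c \tau^* - A^T y^* \right)
    + \tau^* \left( -c^T x^* + b^T y^* \right)
  = -(y^*)^T \left( A x^* - b \tau^* \right)
  = 0 .
```

Because $x^* \in K$, $s^* \in K^*$ and $\tau^*, \kappa^* \ge 0$, both terms on the left are nonnegative, so each of them vanishes separately.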

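  15. The central path

      The central path:

      $$
      \begin{array}{rcl}
      A x - b \tau & = & \gamma (A \hat{x} - b \hat{\tau}), \\
      A^T y + s - c \tau & = & \gamma (A^T \hat{y} + \hat{s} - c \hat{\tau}), \\
      -c^T x + b^T y - \kappa & = & \gamma (-c^T \hat{x} + b^T \hat{y} - \hat{\kappa}), \\
      s + \gamma \hat{\mu} F'(x) & = & 0, \\
      \tau \kappa - \gamma \hat{\mu} & = & 0,
      \end{array}
      $$

      where

      $$\hat{\mu} := \frac{\hat{x}^T \hat{s} + \hat{\tau} \hat{\kappa}}{\nu + 1}$$

      and $(\hat{x}, \hat{\tau}, \hat{y}, \hat{s}, \hat{\kappa})$ is an "interior" solution for $\gamma = 1$. The central path is the set of solutions parameterised by $\gamma \in [0, 1]$.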

  16. Tracing the central path

      • Idea: Trace the central path using Newton's method.
      • Question: Should we use the primal or the dual barrier, i.e.

      $$s + \gamma \hat{\mu} F'(x) = s - \gamma \hat{\mu} \tilde{s} = 0$$

      or

      $$x + \gamma \hat{\mu} F_*'(s) = x - \gamma \hat{\mu} \tilde{x} = 0,$$

      where $\tilde{x} := -F_*'(s)$ and $\tilde{s} := -F'(x)$.

  17. Primal-dual scaling

      A nonsingular matrix $W$ is called a primal-dual scaling if it satisfies

      $$v := W x = W^{-T} s, \qquad \tilde{v} := W \tilde{x} = W^{-T} \tilde{s}.$$

      The primal or dual centrality conditions are equivalent to

      $$v = \gamma \hat{\mu} \tilde{v}.$$

      • Result: The centrality conditions have become symmetric!

  18. The search direction

      Affine direction:

      $$
      \begin{array}{rcl}
      A d_x^a - b d_\tau^a & = & -(A x^0 - b \tau^0), \\
      A^T d_y^a + d_s^a - c d_\tau^a & = & -(A^T y^0 + s^0 - c \tau^0), \\
      -c^T d_x^a + b^T d_y^a - d_\kappa^a & = & -(-c^T x^0 + b^T y^0 - \kappa^0), \\
      W d_x^a + W^{-T} d_s^a & = & -v^0, \\
      \tau^0 d_\kappa^a + \kappa^0 d_\tau^a & = & -\tau^0 \kappa^0.
      \end{array}
      $$

      Centering direction:

      $$
      \begin{array}{rcl}
      A d_x^c - b d_\tau^c & = & A x^0 - b \tau^0, \\
      A^T d_y^c + d_s^c - c d_\tau^c & = & A^T y^0 + s^0 - c \tau^0, \\
      -c^T d_x^c + b^T d_y^c - d_\kappa^c & = & -c^T x^0 + b^T y^0 - \kappa^0, \\
      W d_x^c + W^{-T} d_s^c & = & \mu^0 \tilde{v}^0, \\
      \tau^0 d_\kappa^c + \kappa^0 d_\tau^c & = & \mu^0.
      \end{array}
      $$

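  19. Updating the solution

      For a given $\gamma \in [0, 1]$ define

      $$
      \begin{array}{rcl}
      d_x & := & d_x^a + \gamma d_x^c, \\
      d_\tau & := & d_\tau^a + \gamma d_\tau^c, \\
      d_y & := & d_y^a + \gamma d_y^c, \\
      d_s & := & d_s^a + \gamma d_s^c, \\
      d_\kappa & := & d_\kappa^a + \gamma d_\kappa^c,
      \end{array}
      $$

      and hence for a step size $\alpha \in [0, 1]$ we have

      $$
      \begin{array}{rcl}
      x^+ & := & x^0 + \alpha d_x, \\
      \tau^+ & := & \tau^0 + \alpha d_\tau, \\
      y^+ & := & y^0 + \alpha d_y, \\
      s^+ & := & s^0 + \alpha d_s, \\
      \kappa^+ & := & \kappa^0 + \alpha d_\kappa.
      \end{array}
      $$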

  20. Basic but important properties

      $$
      \begin{array}{rcl}
      A x^+ - b \tau^+ & = & (1 - \alpha (1 - \gamma)) (A x^0 - b \tau^0), \\
      A^T y^+ + s^+ - c \tau^+ & = & (1 - \alpha (1 - \gamma)) (A^T y^0 + s^0 - c \tau^0), \\
      -c^T x^+ + b^T y^+ - \kappa^+ & = & (1 - \alpha (1 - \gamma)) (-c^T x^0 + b^T y^0 - \kappa^0), \\
      (x^+)^T s^+ + \tau^+ \kappa^+ & = & (1 - \alpha (1 - \gamma)) ((x^0)^T s^0 + \tau^0 \kappa^0).
      \end{array}
      $$

      • Equal decrease in infeasibility and complementarity for $\gamma \in [0, 1)$.
      • If $\alpha \in \, ]0, 1]$, then "convergence".
      • No merit function is needed. Yahooooo!
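
The reduction factor $1 - \alpha(1 - \gamma)$ can be verified numerically. The sketch below does this for the linear cone $\mathbb{R}^n_+$, where the scaling $W = \operatorname{diag}(\sqrt{s/x})$ satisfies both scaling conditions with $\tilde{x} = 1/s$ and $\tilde{s} = 1/x$. The choice of the linear cone, the dense assembly of the Newton system, and the names `residuals` and `combined_direction` are illustrative assumptions and do not describe the implementation discussed in the talk.

```python
import numpy as np

def residuals(A, b, c, x, tau, y, s, kappa):
    """Residuals of the three linear equations in (H), plus complementarity."""
    r_p = A @ x - b * tau
    r_d = A.T @ y + s - c * tau
    r_g = -c @ x + b @ y - kappa
    return r_p, r_d, r_g, x @ s + tau * kappa

def combined_direction(A, b, c, x, tau, y, s, kappa, gamma):
    """Solve for d = d^a + gamma * d^c on the orthant (W = diag(sqrt(s/x)))."""
    m, n = A.shape
    r_p, r_d, r_g, comp = residuals(A, b, c, x, tau, y, s, kappa)
    mu = comp / (n + 1)        # nu = n for the log barrier of the orthant
    w = np.sqrt(s / x)         # diagonal of W
    v = w * x                  # v = W x = W^{-T} s
    v_tilde = w / s            # v~ = W x~ with x~ = 1/s

    # Unknown ordering: (d_x, d_tau, d_y, d_s, d_kappa).
    ix, it = slice(0, n), n
    iy = slice(n + 1, n + 1 + m)
    i_s = slice(n + 1 + m, 2 * n + 1 + m)
    ik = 2 * n + 1 + m
    K = np.zeros((ik + 1, ik + 1))
    rhs = np.zeros(ik + 1)

    K[:m, ix] = A                          # A d_x - b d_tau
    K[:m, it] = -b
    rhs[:m] = -(1 - gamma) * r_p
    rows = slice(m, m + n)                 # A^T d_y + d_s - c d_tau
    K[rows, iy] = A.T
    K[rows, i_s] = np.eye(n)
    K[rows, it] = -c
    rhs[rows] = -(1 - gamma) * r_d
    g = m + n                              # -c^T d_x + b^T d_y - d_kappa
    K[g, ix] = -c
    K[g, iy] = b
    K[g, ik] = -1.0
    rhs[g] = -(1 - gamma) * r_g
    rows = slice(g + 1, g + 1 + n)         # W d_x + W^{-T} d_s
    K[rows, ix] = np.diag(w)
    K[rows, i_s] = np.diag(1.0 / w)
    rhs[rows] = -v + gamma * mu * v_tilde
    K[-1, it] = kappa                      # tau d_kappa + kappa d_tau
    K[-1, ik] = tau
    rhs[-1] = -tau * kappa + gamma * mu

    d = np.linalg.solve(K, rhs)
    return d[ix], d[it], d[iy], d[i_s], d[ik]
```

Taking a step of length `alpha` along the returned direction and recomputing `residuals` should scale all four quantities by `1 - alpha * (1 - gamma)`, up to rounding. The identities are algebraic, so they hold even when the trial point leaves the cone; keeping the iterate interior is the job of the step-size rule.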

  21. Choice of the primal-dual scaling

      Our method is inspired by (Tunçel; Tunçel and Myklebust):

      $$
      \begin{array}{rcl}
      W^T W & \approx & \mu^0 F''(x^0), \\
      W x & = & W^{-T} s, \\
      W \tilde{x} & = & W^{-T} \tilde{s}.
      \end{array}
      $$

      Employ the quasi-Newton idea to compute $W$.

  22. Computing the scaling matrix

      Theorem (Schnabel [13]). Let $\bar{X}, \bar{S} \in \mathbb{R}^{n \times p}$ have full rank $p$. Then there exists $H \succ 0$ such that $H \bar{X} = \bar{S}$ if and only if $\bar{S}^T \bar{X} \succ 0$. As a consequence,

      $$H = \bar{S} (\bar{S}^T \bar{X})^{-1} \bar{S}^T + Z Z^T,$$

      where $\bar{X}^T Z = 0$ and $\operatorname{rank}(Z) = n - p$.

      We have $n = 3$, $p = 2$ and

      $$\bar{X} := \begin{pmatrix} x & \tilde{x} \end{pmatrix}, \qquad \bar{S} := \begin{pmatrix} s & \tilde{s} \end{pmatrix},$$

      with

      $$\det(\bar{S}^T \bar{X}) = \nu^2 (\mu \tilde{\mu} - 1) \ge 0$$

      vanishing only on the central path.

  23. Computing the scaling matrix

      Any scaling with $n = 3$ satisfies

      $$W^T W = \bar{S} (\bar{S}^T \bar{X})^{-1} \bar{S}^T + z z^T,$$

      where

      $$\begin{pmatrix} x & \tilde{x} \end{pmatrix}^T z = 0, \qquad z \ne 0.$$

      Expanding the BFGS update [13]

      $$H^+ = H + \bar{S} (\bar{S}^T \bar{X})^{-1} \bar{S}^T - H \bar{X} (\bar{X}^T H \bar{X})^{-1} \bar{X}^T H$$

      for $H \succ 0$ gives the scaling by Tunçel [16] and Myklebust [7], i.e.,

      $$z z^T = H - H \bar{X} (\bar{X}^T H \bar{X})^{-1} \bar{X}^T H,$$

      with $H = \mu F''(x)$.
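
The formula for $W^T W$ can be exercised on small data. The sketch below builds $W^T W$ from the secant term plus the BFGS-style $z z^T$ term and recovers one valid $W$ from a Cholesky factor. Since $\tilde{x} = -F_*'(s)$ has no closed form for the nonsymmetric cones, the example uses $\mathbb{R}^3_+$ with $F(x) = -\sum_i \log x_i$ as a stand-in (so $\nu = 3$, $\tilde{x} = 1/s$, $\tilde{s} = 1/x$); that substitution, the test data, and the name `scaling_gram_matrix` are assumptions made for illustration only.

```python
import numpy as np

def scaling_gram_matrix(x, s, x_tilde, s_tilde, H):
    """Return M = W^T W = S (S^T X)^{-1} S^T + z z^T for symmetric H > 0."""
    X = np.column_stack([x, x_tilde])
    S = np.column_stack([s, s_tilde])
    secant = S @ np.linalg.solve(S.T @ X, S.T)
    HX = H @ X
    # z z^T = H - H X (X^T H X)^{-1} X^T H, which annihilates the columns of X.
    zzT = H - HX @ np.linalg.solve(X.T @ HX, HX.T)
    return secant + zzT

# Stand-in data on the orthant R^3_+ (not a nonsymmetric cone).
x = np.array([1.0, 2.0, 0.5])
s = np.array([0.7, 0.4, 3.0])
nu = 3
x_tilde = 1.0 / s                  # -F_*'(s)
s_tilde = 1.0 / x                  # -F'(x)
mu = x @ s / nu
H = mu * np.diag(1.0 / x**2)       # H = mu F''(x)

M = scaling_gram_matrix(x, s, x_tilde, s_tilde, H)
W = np.linalg.cholesky(M).T        # one choice of W with W^T W = M
```

By construction `M @ x` reproduces `s` and `M @ x_tilde` reproduces `s_tilde`, which is the same as $W x = W^{-T} s$ and $W \tilde{x} = W^{-T} \tilde{s}$. The Cholesky factor is only one of many matrices with the same $W^T W$.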

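  24. A high-order corrector term

      A high-order correction:

      $$
      \begin{array}{rcl}
      A d_x^{co} - b d_\tau^{co} & = & 0, \\
      A^T d_y^{co} + d_s^{co} - c d_\tau^{co} & = & 0, \\
      -c^T d_x^{co} + b^T d_y^{co} - d_\kappa^{co} & = & 0, \\
      W d_x^{co} + W^{-T} d_s^{co} & = & -\frac{1}{2} W^{-T} F'''(x) [d_x^a, F''(x)^{-1} d_s^a], \\
      \tau^0 d_\kappa^{co} + \kappa^0 d_\tau^{co} & = & -d_\tau^a d_\kappa^a.
      \end{array}
      $$

      For motivation see the paper. Finally,

      $$
      \begin{array}{rcl}
      d_x & := & d_x^a + \gamma d_x^c + d_x^{co}, \\
      d_\tau & := & d_\tau^a + \gamma d_\tau^c + d_\tau^{co}, \\
      d_y & := & d_y^a + \gamma d_y^c + d_y^{co}, \\
      d_s & := & d_s^a + \gamma d_s^c + d_s^{co}, \\
      d_\kappa & := & d_\kappa^a + \gamma d_\kappa^c + d_\kappa^{co}.
      \end{array}
      $$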

  25. The power cone case

      A 3-self-concordant barrier for the 3-dimensional primal power cone:

      $$F(x) = -\log\left( x_1^{2\alpha} x_2^{2 - 2\alpha} - x_3^2 \right) - (1 - \alpha) \log x_1 - \alpha \log x_2, \qquad (5)$$

      suggested by Chares [2]. Generalized in [1]. Is self-dual using a redefined inner product.

      However,
      • The conjugate barrier $F_*(s)$ or its derivatives cannot be evaluated in closed form.
      • It can be evaluated numerically to high accuracy based on an idea of Nesterov.
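
Barrier (5) can be checked against the logarithmic homogeneity condition with $\nu = 3$. The following sketch evaluates $F$ and uses a central finite difference for the gradient; the names `power_cone_barrier` and `gradient_fd` and the step size `h` are hypothetical choices, not part of the talk.

```python
import math

def power_cone_barrier(x, alpha):
    """Evaluate barrier (5) at an interior point x = (x1, x2, x3)."""
    x1, x2, x3 = x
    if x1 <= 0 or x2 <= 0:
        raise ValueError("x is not in the interior of the power cone")
    gap = x1 ** (2 * alpha) * x2 ** (2 - 2 * alpha) - x3 ** 2
    if gap <= 0:
        raise ValueError("x is not in the interior of the power cone")
    return -math.log(gap) - (1 - alpha) * math.log(x1) - alpha * math.log(x2)

def gradient_fd(f, x, h=1e-6):
    """Central finite-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

alpha, tau = 0.3, 1.7
x = [2.0, 3.0, 0.5]
F = lambda p: power_cone_barrier(p, alpha)

# F(tau * x) - F(x) + 3 * log(tau) should vanish up to rounding.
homogeneity_gap = F([tau * xi for xi in x]) - F(x) + 3 * math.log(tau)

# <F'(x), x> should be close to -nu = -3.
inner = sum(g * xi for g, xi in zip(gradient_fd(F, x), x))
```

The homogeneity check works because the argument of the first logarithm scales with $\tau^2$ and the two remaining terms contribute one more factor of $\log \tau$ in total.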
