An Efficient Affine-Scaling Algorithm for Hyperbolic Programming - PowerPoint PPT Presentation


  1. An Efficient Affine-Scaling Algorithm for Hyperbolic Programming. Jim Renegar, joint work with Mutiara Sondjaja.

  2. Euclidean space E. A homogeneous polynomial p : E → ℝ is hyperbolic if there is a vector e ∈ E such that for all x ∈ E, the univariate polynomial t ↦ p(x + te) has only real roots ("p is hyperbolic in direction e").

  Example: E = S^{n×n}, p(X) = det(X), e = I (the identity matrix). Then t ↦ p(X + tI) is the characteristic polynomial of −X; all roots are real because symmetric matrices have only real eigenvalues.

  The hyperbolicity cone Λ++ is the connected component of { x : p(x) ≠ 0 } containing e. For the example, Λ++ = S^{n×n}_++, the cone of positive-definite matrices. The convexity of this particular cone is true of hyperbolicity cones in general . . .
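The determinant example can be sanity-checked numerically. The following is an illustrative NumPy sketch (not part of the talk): the roots of t ↦ det(X + tI) are exactly the eigenvalues of −X, hence real whenever X is symmetric.

```python
import numpy as np

# Illustrative check that p(X) = det(X) is hyperbolic in direction I on
# symmetric matrices: t -> det(X + t*I) has only real roots.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
X = (A + A.T) / 2                        # a random symmetric matrix

# np.poly(-X) gives the coefficients of det(tI + X), i.e. the
# characteristic polynomial of -X in the variable t.
coeffs = np.poly(-X)
roots = np.roots(coeffs)

assert np.allclose(roots.imag, 0.0, atol=1e-6)          # all roots are real
assert np.allclose(np.sort(roots.real),
                   np.sort(-np.linalg.eigvalsh(X)), atol=1e-6)
```

For a non-symmetric X the roots would in general be complex, which is exactly what hyperbolicity in direction I rules out.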

  3. Thm (Gårding, 1959): Λ++ is a convex cone.

  A hyperbolic program is an optimization problem of the form

     min  ⟨c, x⟩
     s.t. Ax = b                  (HP)
          x ∈ Λ+  (the closure of Λ++)

  Güler (1997) introduced hyperbolic programming, motivated largely by the realization that f(x) = −ln p(x) is a self-concordant barrier function: "O(√n) iterations to halve the duality gap," where n is the degree of p. Güler showed the barrier functions f(x) = −ln p(x) possess many of the nice properties of X ↦ −ln det(X), although hyperbolicity cones in general are not symmetric (i.e., self-scaled).

  4. There are natural ways in which to "relax" HP to hyperbolic programs for lower-degree polynomials. For example, to obtain a relaxation of SDP: fix n, and for 1 ≤ k ≤ n let

     σ_k(λ_1, . . . , λ_n) := Σ_{j_1 < · · · < j_k} λ_{j_1} · · · λ_{j_k}

  (the elementary symmetric polynomial of degree k). Then X ↦ σ_k(λ(X)) is a hyperbolic polynomial of degree k in direction I, and its hyperbolicity cone contains S^{n×n}_++. These polynomials can be evaluated efficiently via the FFT. Perhaps relaxing SDPs in this and related ways will allow larger SDPs to be approximately solved efficiently. The relaxations generalize easily to all hyperbolic programs.
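Up to sign, the values σ_k(λ(X)) are the coefficients of the characteristic polynomial det(tI − X) = t^n − σ_1 t^{n−1} + σ_2 t^{n−2} − · · ·, so a small sketch can read them off with NumPy. (The FFT-based evaluation mentioned above is not implemented here; this eigenvalue-based route is only for illustration.)

```python
import numpy as np

# Illustrative sketch: sigma_k(lambda(X)) read off from the characteristic
# polynomial det(tI - X) = t^n - s1 t^{n-1} + s2 t^{n-2} - ...
def elementary_symmetric(X, k):
    coeffs = np.poly(X)              # char. poly coefficients [1, -s1, s2, ...]
    return (-1) ** k * coeffs[k]

X = np.diag([1.0, 2.0, 3.0])
# sigma_1 = 1+2+3 = 6, sigma_2 = 1*2 + 1*3 + 2*3 = 11, sigma_3 = 1*2*3 = 6
assert np.isclose(elementary_symmetric(X, 1), 6.0)
assert np.isclose(elementary_symmetric(X, 2), 11.0)
assert np.isclose(elementary_symmetric(X, 3), 6.0)
```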

  5. Barrier function: f(x) = −ln p(x), with gradient g(x) and Hessian H(x); H(x) is positive definite for all x ∈ Λ++.

  The "local inner product at e ∈ Λ++" is ⟨u, v⟩_e := ⟨u, H(e)v⟩, with induced norm ‖v‖_e = √⟨v, v⟩_e and "Dikin ellipsoids" B̄_e(e, r) = { x : ‖x − e‖_e ≤ r }.

  The gist of the original affine-scaling method due to Dikin is simply: given a strictly feasible point e for HP and an appropriate value r > 0, move from e to the optimal solution e+ of

     min  ⟨c, x⟩
     s.t. Ax = b
          x ∈ B̄_e(e, r)

  Dikin focused on linear programming and chose r = 1 (giving the largest Dikin ellipsoids contained in ℝ^n_+). Also: Vanderbei, Meketon and Freedman (1986).
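For linear programming, where the cone is ℝ^n_+ and H(e) = Diag(1/e_i²), one Dikin step has a closed form via the normal equations. The sketch below is an assumption-laden illustration, not the talk's algorithm: the helper `dikin_step` is hypothetical, and it uses r = 0.99 rather than Dikin's r = 1 so the iterates stay strictly positive. Each step costs one linear solve.

```python
import numpy as np

# Hedged sketch of one Dikin affine-scaling step for the LP
#   min c.x  s.t.  Ax = b, x >= 0,
# where the barrier -sum(ln x_i) gives H(e) = diag(1/e_i^2), so the Dikin
# ellipsoid at e is { x : ||(x - e)/e|| <= r }.  e must be strictly
# feasible (Ae = b, e > 0).
def dikin_step(A, c, e, r=0.99):
    D2 = np.diag(e ** 2)                       # H(e)^{-1}
    y = np.linalg.solve(A @ D2 @ A.T, A @ D2 @ c)
    s = c - A.T @ y                            # reduced cost
    d = -D2 @ s                                # unscaled step direction (Ad = 0)
    return e + r * d / np.linalg.norm(e * s)   # step of scaled norm r

# Tiny example: min x1  s.t.  x1 + x2 = 1, x >= 0   (optimum at x = (0, 1))
A = np.array([[1.0, 1.0]])
c = np.array([1.0, 0.0])
x = np.array([0.5, 0.5])
for _ in range(25):
    x = dikin_step(A, c, x)
assert x[0] < 1e-3 and abs(x.sum() - 1.0) < 1e-9
```

Since the scaled step length is r < 1, each iterate changes every coordinate by strictly less than 100%, which is why positivity is preserved automatically.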

  6. In the mid-1980s, there was considerable effort toward proving that Dikin's affine-scaling method runs in polynomial time (perhaps with a choice r < 1). The efforts mostly ceased when, in 1986, Megiddo and Shub showed that the "infinitesimal version" of the algorithm can come near all vertices of a Klee-Minty cube.

  Nevertheless, several algorithms similar in spirit to Dikin's method have been shown to halve the duality gap in polynomial time:

     Monteiro, Adler and Resende (1990): LP and convex QP          }  use "scaling points"
     Jansen, Roos and Terlaky (1996: LP; 1997: PSD LCP problems)   }  and "V-space"
     Sturm and Zhang (1996): SDP                                   }  use ellipsoidal cones
     Chua (2007): symmetric cone programming                       }  rather than ellipsoids

  These algorithms are primal-dual methods and rely heavily on the cones being self-scaled. Our framework shares some strong connections with the one developed by Chek Beng Chua, to whom we are indebted.

  7. For e ∈ Λ++ and 0 < α < √n, let

     K_e(α) := { x : ⟨e, x⟩_e ≥ α ‖x‖_e }

  This happens to be the smallest cone containing the Dikin ellipsoid B̄_e(e, √(n − α²)). Keep in mind that the cone grows in size as α decreases. Replacing Λ+ by K_e(α) gives

     min  ⟨c, x⟩                min  ⟨c, x⟩
     s.t. Ax = b       →        s.t. Ax = b              (HP → QP_e(α))
          x ∈ Λ+                     x ∈ K_e(α)

  Definition: Swath(α) = { e ∈ Λ++ : Ae = b and QP_e(α) has an optimal solution }.

  Prop: Swath(0) = central path. Thus, α can be regarded as a measure of the proximity of points in Swath(α) to the central path.

  8. Let x_e(α) = the optimal solution of QP_e(α) (assuming e ∈ Swath(α)); the main work in computing x_e(α) lies in solving a system of linear equations.

  We assume 0 < α < 1, in which case Λ+ ⊆ K_e(α). Thus QP_e(α) is a relaxation of HP, and hence the optimal value of HP ≥ ⟨c, x_e(α)⟩.

  Current iterate: e ∈ Λ++. The next iterate will be e′, a convex combination of e and x_e(α):

     e′ = (1/(1 + t)) (e + t x_e(α))

  The choice of t is made through duality . . .
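For the nonnegative orthant (the LP case) the local inner product is explicit, so membership in K_e(α) is a one-line check. The sketch below illustrates the containment Λ+ ⊆ K_e(α) for α < 1 and shows that it is strict; the helper `in_cone` is hypothetical.

```python
import numpy as np

# Hedged sketch on the nonnegative orthant: H(e) = diag(1/e_i^2) gives
# <e, x>_e = sum(x/e) and ||x||_e = ||x/e||, so
#   K_e(alpha) = { x : sum(x/e) >= alpha * ||x/e|| }.
def in_cone(x, e, alpha):
    a = x / e
    return a.sum() >= alpha * np.linalg.norm(a)

rng = np.random.default_rng(1)
e = rng.uniform(0.5, 2.0, size=6)
alpha = 0.9

# Every x >= 0 lies in K_e(alpha) when alpha < 1, since for a >= 0 we
# have sum(a) >= ||a||:
for _ in range(100):
    x = rng.uniform(0.0, 1.0, size=6)
    assert in_cone(x, e, alpha)

# ...but K_e(alpha) is strictly larger than the orthant: it contains
# points with a negative coordinate.
x = e.copy()
x[0] = -0.05 * e[0]
assert in_cone(x, e, alpha) and (x < 0).any()
```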

  9. (Recap slide: the relaxation HP → QP_e(α), Swath(α), and x_e(α) as above.)

  10. Notation: henceforth write x_e for x_e(α), the optimal solution of QP_e(α) (assuming e ∈ Swath(α)).

  11. Passing to duals:

     max  bᵀy                        max  bᵀy
     s.t. A*y + s = c       →        s.t. A*y + s = c        (HP* → QP_e(α)*)
          s ∈ Λ+*                         s ∈ K_e(α)*

  First-order optimality conditions for x_e yield an optimal solution (y_e, s_e) of QP_e(α)*. Because Λ+ ⊆ K_e(α), we have K_e(α)* ⊆ Λ+*; hence (y_e, s_e) is feasible for HP*.

  Primal-dual feasible pair: e for HP, (y_e, s_e) for HP*. Duality gap: gap_e := ⟨c, e⟩ − bᵀy_e.

  12. Current iterate: e ∈ Λ++. The next iterate is a convex combination of e and x_e:

     e(t) = (1/(1 + t)) (e + t x_e)

  We want t to be large so as to improve the primal objective value, but we also want e(t) ∈ Swath(α). We choose t to be the minimizer of a particular quadratic polynomial, and thereby ensure that:
     • e(t) ∈ Λ++
     • s_e ∈ int(K_{e(t)}(α)*)
  Consequently, e(t) is strictly feasible for QP_{e(t)}(α) and (y_e, s_e) is strictly feasible for QP_{e(t)}(α)*; hence e(t) ∈ Swath(α).
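The role of t in this update can be seen in a toy computation: e(t) is a convex combination, so ⟨c, e(t)⟩ = (⟨c, e⟩ + t ⟨c, x_e⟩)/(1 + t) moves monotonically from ⟨c, e⟩ toward ⟨c, x_e⟩ as t grows. The sketch below uses made-up numbers; x_e here is a stand-in, not computed from any QP.

```python
import numpy as np

# Illustrative sketch: the update e(t) = (e + t*x_e)/(1 + t) is a convex
# combination, so its objective value interpolates monotonically between
# <c, e> and <c, x_e>.
def next_iterate(e, x_e, t):
    return (e + t * x_e) / (1.0 + t)

c = np.array([1.0, -2.0, 0.5])
e = np.array([1.0, 1.0, 1.0])
x_e = np.array([0.2, 2.0, 0.8])        # stand-in for the QP_e(alpha) optimum

vals = [c @ next_iterate(e, x_e, t) for t in (0.0, 0.5, 2.0, 10.0)]
# The objective strictly decreases toward c @ x_e as t grows:
assert all(vals[i] > vals[i + 1] for i in range(3))
assert np.isclose(vals[0], c @ e)
```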

  13. The choice of t (the minimizer of a particular quadratic polynomial) also ensures that:
     • t ≥ (1/2) α / ‖x_e‖_e
  and thus ensures good improvement in the primal objective value if, say, ‖x_e‖_e ≤ √n.

  14. The choice of t further ensures that:
     • s_e ∈ K_{e(t)}(β)*  where  β = α √((1 + α)/2)
  which implies s_e is "deep within" K_{e(t)}(α)*, and hence (y_e, s_e) is "very strongly" feasible for QP_{e(t)}(α)*.

  15. Recap: e(t) = (1/(1 + t)) (e + t x_e), with t the minimizer of a particular quadratic polynomial, thereby ensuring that:
     1. There is "good" improvement in the primal objective value if ‖x_e‖_e ≤ √n.
     2. (y_e, s_e) is "very strongly" feasible for QP_{e(t)}(α)*.

  Sequence of iterates: e_0, e_1, e_2, . . . (write x_i and (y_i, s_i) rather than x_{e_i} and (y_{e_i}, s_{e_i})). If i > 0, then:
     1. ‖x_i‖_{e_i} ≤ √n  ⟹  ⟨c, e_{i+1}⟩ ≪ ⟨c, e_i⟩
     2. (y_{i−1}, s_{i−1}) is "very strongly" feasible for QP_{e_i}(α)*.
  On the other hand, we show:
     3. (‖x_i‖_{e_i} ≥ √n) ∧ (2.)  ⟹  bᵀy_i ≫ bᵀy_{i−1}
  In this manner we establish the Main Theorem . . .
