  1. Algorithms for constrained local optimization. Fabio Schoen, 2008. http://gol.dsi.unifi.it/users/schoen

  2. Feasible direction methods

  3. Frank–Wolfe method. Let X be a convex set and consider the problem

     min_{x ∈ X} f(x)

     Given x_k ∈ X, choosing a feasible direction d_k corresponds to choosing a point x ∈ X: d_k = x − x_k. The "steepest descent" choice is

     min_{x ∈ X} ∇f(x_k)^T (x − x_k)

     (a linear objective with convex constraints, usually easy to solve). Let x̂_k be an optimal solution of this subproblem.

  4. Frank–Wolfe. If ∇f(x_k)^T (x̂_k − x_k) = 0 then ∇f(x_k)^T d ≥ 0 for every feasible direction d, so the first-order necessary conditions hold. Otherwise, letting d_k = x̂_k − x_k, this is a descent direction along which a step α_k ∈ (0, 1] may be chosen according to Armijo's rule.

  5. Convergence of the Frank–Wolfe method. Under mild conditions the method converges to a point satisfying the first-order necessary conditions. However, it is usually extremely slow (convergence may be sub-linear). It might find applications in very large-scale problems in which the direction-finding subproblem is very easy to solve, e.g. when X is a polytope.
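
A minimal sketch of the iteration, assuming the feasible set X is a box so that the direction-finding LP has a closed-form vertex solution; the diminishing step size 2/(k+2) (a common alternative to Armijo's rule), the function names, and the tolerance are illustrative choices, not part of the slides:

import numpy as np

def frank_wolfe(grad_f, lo, hi, x0, iters=200):
    """Frank-Wolfe on the box lo <= x <= hi (both arrays).

    On a box the subproblem min_{x in X} g^T (x - x_k) separates by
    coordinate: the optimal vertex takes lo_i where the gradient
    component is positive and hi_i where it is negative.
    """
    x = np.asarray(x0, float)
    for k in range(iters):
        g = grad_f(x)
        x_hat = np.where(g > 0, lo, hi)   # vertex minimizing g^T x over the box
        d = x_hat - x                     # feasible direction d_k
        if g @ d >= -1e-10:               # grad^T d >= 0: first-order conditions hold
            break
        x = x + 2.0 / (k + 2) * d         # diminishing step alpha_k in (0, 1]
    return x

# Minimize ||x - c||^2 over [0,1]^2 with c outside the box:
c = np.array([1.5, -0.5])
x_star = frank_wolfe(lambda x: 2 * (x - c), np.zeros(2), np.ones(2), [0.5, 0.5])
# x_star is approximately (1, 0), the projection of c onto the box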

  6. Gradient projection methods. Generic iteration:

     x_{k+1} = x_k + α_k (x̄_k − x_k)

     where the direction d_k = x̄_k − x_k is obtained by finding

     x̄_k = [x_k − s_k ∇f(x_k)]^+

     where s_k ∈ ℝ_+ and [·]^+ denotes projection onto the feasible set.

  7. The method is slightly faster than Frank–Wolfe, with a linear convergence rate similar to that of (unconstrained) steepest descent. It might be applied when projection is relatively cheap, e.g. when the feasible set is a box. A point x_k satisfies the first-order necessary conditions (d^T ∇f(x_k) ≥ 0 for every feasible direction d) iff x_k = [x_k − s_k ∇f(x_k)]^+.
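
A sketch of gradient projection on a box, where [·]^+ is a componentwise clip; the fixed step s (which should be small relative to the gradient's Lipschitz constant) and α_k = 1 are illustrative simplifications, Armijo's rule on α_k being the robust choice:

import numpy as np

def projected_gradient(grad_f, lo, hi, x0, s=0.1, iters=1000):
    """Gradient projection on the box lo <= x <= hi."""
    x = np.asarray(x0, float)
    for _ in range(iters):
        x_bar = np.clip(x - s * grad_f(x), lo, hi)  # [x_k - s_k grad f(x_k)]^+
        if np.linalg.norm(x_bar - x) < 1e-10:       # fixed point: first-order conditions
            break
        x = x_bar                                    # step with alpha_k = 1 along d_k
    return x

c = np.array([1.5, -0.5])
print(projected_gradient(lambda x: 2 * (x - c), np.zeros(2), np.ones(2), [0.5, 0.5]))
# approximately (1, 0), as with Frank-Wolfe above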

  8. Lagrange Multiplier Algorithms

  9. Barrier methods. Consider

     min f(x)
     s.t. g_j(x) ≤ 0,  j = 1, ..., r

     A barrier is a continuous function which tends to +∞ whenever x approaches the boundary of the feasible region. Examples of barrier functions:

     B(x) = −∑_j log(−g_j(x))   (logarithmic barrier)
     B(x) = −∑_j 1/g_j(x)       (inverse barrier)

  10. Barrier method. Let ε_k ↓ 0 and let x_0 be strictly feasible, i.e. g_j(x_0) < 0 ∀j. Then let

     x_k = argmin_{x ∈ ℝ^n} ( f(x) + ε_k B(x) )

     Proposition: every limit point of {x_k} is a global minimum of the constrained optimization problem.
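
A sketch of the outer loop, assuming the unconstrained subproblems are handed to a generic solver (scipy's minimize here) and warm-started at the previous iterate; the schedule ε_{k+1} = ε_k/2 and the derivative-free inner method are illustrative choices:

import numpy as np
from scipy.optimize import minimize

def barrier_method(f, gs, x0, eps0=1.0, shrink=0.5, rounds=30):
    """Logarithmic-barrier method for min f(x) s.t. g_j(x) <= 0.

    x0 must be strictly feasible: g_j(x0) < 0 for all j.
    """
    def obj(x, eps):
        g_vals = np.array([g(x) for g in gs])
        if np.any(g_vals >= 0):               # outside the barrier's domain
            return np.inf
        return f(x) - eps * np.sum(np.log(-g_vals))

    x, eps = np.asarray(x0, float), eps0
    for _ in range(rounds):
        # warm start: minimize f + eps_k * B from the previous x_k
        x = minimize(lambda y: obj(y, eps), x, method="Nelder-Mead").x
        eps *= shrink
    return x

# The example of slide 14: min (x-1)^2 + (y-1)^2  s.t.  x + y <= 1
f = lambda v: (v[0] - 1) ** 2 + (v[1] - 1) ** 2
print(barrier_method(f, [lambda v: v[0] + v[1] - 1], [0.0, 0.0]))
# approaches (0.5, 0.5)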

  11. Analysis of barrier methods. Special case: a single constraint (the analysis can be generalized). Let x̄ be a limit point of {x_k} (a global minimum). If the KKT conditions hold, then there exists a unique λ ≥ 0 such that

     ∇f(x̄) + λ ∇g(x̄) = 0,  with λ g(x̄) = 0.

     The iterate x_k, being a solution of the barrier problem min f(x) + ε_k B(x) with g(x) < 0, satisfies

     ∇f(x_k) + ε_k ∇B(x_k) = 0

  12. ... If B(x) = φ(g(x)), then

     ∇f(x_k) + ε_k φ′(g(x_k)) ∇g(x_k) = 0

     In the limit, as k → ∞:

     lim_k ε_k φ′(g(x_k)) ∇g(x_k) = λ ∇g(x̄)

     If lim_k g(x_k) < 0, then φ′(g(x_k)) ∇g(x_k) → K (finite) and K ε_k → 0. If lim_k g(x_k) = 0, then (thanks to the uniqueness of the Lagrange multiplier)

     λ = lim_k ε_k φ′(g(x_k))

  13. Difficulties in barrier methods. Strong numerical instability: the condition number of the Hessian matrix grows as ε_k → 0. Need for an initial strictly feasible point x_0. (Partial) remedy: ε_k is decreased very slowly, and the solution of the (k+1)-th problem is obtained by starting an unconstrained optimization from x_k.

  14. Example. Consider

     min (x − 1)² + (y − 1)²
     s.t. x + y ≤ 1

     Logarithmic barrier problem:

     min (x − 1)² + (y − 1)² − ε_k log(1 − x − y),  with x + y − 1 < 0

     Gradient:

     ( 2(x − 1) + ε_k/(1 − x − y),  2(y − 1) + ε_k/(1 − x − y) )

     Stationary points: x = y = (3 ± √(1 + 4ε_k))/4 (only the "−" solution is acceptable). As ε_k → 0 it tends to (1/2, 1/2), the constrained minimum.
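
A quick numerical check of the stationary point (the ε values are arbitrary): the gradient component vanishes at the "−" root, which tends to 1/2 as ε_k → 0.

import numpy as np

for eps in (1.0, 0.1, 0.01):
    x = (3 - np.sqrt(1 + 4 * eps)) / 4      # feasible "-" root, with x = y
    grad = 2 * (x - 1) + eps / (1 - 2 * x)  # gradient component at x = y
    print(eps, x, grad)                     # grad is 0 up to roundoff; x -> 0.5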

  15. Barrier methods and LP. Consider the linear program

     min c^T x
     s.t. Ax = b, x ≥ 0

     Logarithmic barrier on x ≥ 0:

     min c^T x − ε ∑_j log x_j
     s.t. Ax = b, x > 0

  16. The central path. The starting point is usually associated with ε = ∞ and is the unique solution of

     min −∑_j log x_j
     s.t. Ax = b, x > 0

     The trajectory x(ε) of solutions to the barrier problem is called the central path, and it leads to an optimal solution of the LP.
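
A sketch of how one central-path point may be computed for fixed ε, assuming a strictly feasible x (Ax = b, x > 0) is available: Newton's method on the barrier problem, with each step obtained from the equality-constrained KKT system. The plain damping used to keep x positive stands in for a proper line search:

import numpy as np

def central_path_point(A, b, c, x, eps, iters=50):
    """Newton's method for min c^T x - eps * sum(log x_j)  s.t.  Ax = b.

    x is assumed to already satisfy Ax = b with x > 0; each step dx
    solves the KKT system [H A^T; A 0][dx; w] = [-grad; 0].
    """
    m, n = A.shape
    for _ in range(iters):
        grad = c - eps / x                    # gradient of the barrier objective
        H = np.diag(eps / x**2)               # its Hessian
        K = np.block([[H, A.T], [A, np.zeros((m, m))]])
        step = np.linalg.solve(K, np.concatenate([-grad, np.zeros(m)]))
        dx = step[:n]                         # satisfies A dx = 0, so Ax = b is kept
        t = 1.0
        while np.any(x + t * dx <= 0):        # damp to stay strictly positive
            t *= 0.5
        x = x + t * dx
        if np.linalg.norm(dx) < 1e-10:
            break
    return x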

  17. Penalty methods. Penalized problem:

     min f(x) + ρ P(x)

     where ρ > 0 and P(x) ≥ 0, with P(x) = 0 if x is feasible. Example: for

     min f(x)
     s.t. h_i(x) = 0,  i = 1, ..., m

     a penalized problem might be

     min f(x) + ρ ∑_i h_i(x)²

  18. Convergence of the quadratic penalty method (for equality constrained problems). Let

     P(x; ρ) = f(x) + ρ ∑_i h_i(x)²

     Given ρ_0 > 0, x_0 ∈ ℝ^n, k = 0, let x_{k+1} = argmin_x P(x; ρ_k) (found with an iterative method initialized at x_k); then let ρ_{k+1} > ρ_k, k := k + 1. If each x_{k+1} is a global minimizer of P and ρ_k → ∞, then every limit point of {x_k} is a global optimum of the constrained problem.
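
A sketch of this scheme, with scipy's BFGS as the inner iterative method and an illustrative schedule ρ_{k+1} = 10 ρ_k:

import numpy as np
from scipy.optimize import minimize

def quadratic_penalty(f, hs, x0, rho0=1.0, grow=10.0, rounds=8):
    """Quadratic penalty method for min f(x) s.t. h_i(x) = 0."""
    x, rho = np.asarray(x0, float), rho0
    for _ in range(rounds):
        P = lambda y, r=rho: f(y) + r * sum(h(y) ** 2 for h in hs)
        x = minimize(P, x, method="BFGS").x   # warm-started at the previous x_k
        rho *= grow
    return x

# Toy problem: min x + y  s.t.  x^2 + y^2 = 2  (optimum at (-1, -1))
print(quadratic_penalty(lambda v: v[0] + v[1],
                        [lambda v: v[0] ** 2 + v[1] ** 2 - 2], [0.0, 0.0]))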

  19. Exact penalties. A penalty is exact if there exists a finite value of the penalty parameter such that the optimal solution of the penalized problem is the optimal solution of the original one. The ℓ₁ penalty function:

     P₁(x; ρ) = f(x) + ρ ∑_i |h_i(x)|
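
A one-dimensional illustration of exactness (a constructed example, not from the slides): for min x s.t. x = 0, the quadratic penalty x + ρx² has its minimum at −1/(2ρ), which approaches 0 but never reaches it, while the ℓ₁ penalty x + ρ|x| is minimized exactly at 0 for every ρ > 1.

import numpy as np

xs = np.linspace(-2, 2, 400001)     # grid containing 0 exactly
for rho in (0.5, 2.0, 10.0):
    quad = xs + rho * xs**2         # argmin -1/(2 rho): never exactly 0
    l1 = xs + rho * np.abs(xs)      # argmin exactly 0 once rho > 1
    print(rho, xs[np.argmin(quad)], xs[np.argmin(l1)])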

  20. Exact penalties for problems with both constraint types: given

     min f(x)
     s.t. h_i(x) = 0, g_j(x) ≤ 0

     the penalized problem is

     P₁(x; ρ) = f(x) + ρ ∑_i |h_i(x)| + ρ ∑_j max(0, g_j(x))

  21. Augmented Lagrangian method. Given an equality constrained problem, reformulate it as:

     min f(x) + (ρ/2) ‖h(x)‖²
     s.t. h(x) = 0

     The Lagrangian of this problem is called the augmented Lagrangian:

     L_ρ(x; λ) = f(x) + (ρ/2) ‖h(x)‖² + λ^T h(x)

  22. Motivation. Consider

     min_x f(x) + (ρ/2) ‖h(x)‖² + λ^T h(x)

     Its gradient and Hessian are

     ∇_x L_ρ(x, λ) = ∇f(x) + ∑_i λ_i ∇h_i(x) + ρ ∑_i h_i(x) ∇h_i(x)
                   = ∇_x L(x, λ) + ρ ∑_i h_i(x) ∇h_i(x)

     ∇²_xx L_ρ(x, λ) = ∇²f(x) + ∑_i λ_i ∇²h_i(x) + ρ ∑_i h_i(x) ∇²h_i(x) + ρ ∇h(x) ∇^T h(x)
                     = ∇²_xx L(x, λ) + ρ ∑_i h_i(x) ∇²h_i(x) + ρ ∇h(x) ∇^T h(x)

  23. Motivation (continued). Let (x⋆, λ⋆) be an optimal (primal and dual) pair. Necessarily ∇_x L(x⋆, λ⋆) = 0; moreover h(x⋆) = 0, thus

     ∇_x L_ρ(x⋆, λ⋆) = ∇_x L(x⋆, λ⋆) + ρ ∑_i h_i(x⋆) ∇h_i(x⋆) = 0

     ⇒ (x⋆, λ⋆) is a stationary point of the augmented Lagrangian.

  24. Motivation (continued). Observe that, since h(x⋆) = 0,

     ∇²_xx L_ρ(x⋆, λ⋆) = ∇²_xx L(x⋆, λ⋆) + ρ ∑_i h_i(x⋆) ∇²h_i(x⋆) + ρ ∇h(x⋆) ∇^T h(x⋆)
                       = ∇²_xx L(x⋆, λ⋆) + ρ ∇h(x⋆) ∇^T h(x⋆)

     Assume that the sufficient optimality conditions hold:

     v^T ∇²_xx L(x⋆, λ⋆) v > 0   ∀v ≠ 0 : v^T ∇h(x⋆) = 0

  25. ... Let v ≠ 0 with v^T ∇h(x⋆) = 0. Then

     v^T ∇²_xx L_ρ(x⋆, λ⋆) v = v^T ∇²_xx L(x⋆, λ⋆) v + ρ v^T ∇h(x⋆) ∇^T h(x⋆) v
                             = v^T ∇²_xx L(x⋆, λ⋆) v > 0

  26. ... Let v ≠ 0 with v^T ∇h(x⋆) ≠ 0. Then

     v^T ∇²_xx L_ρ(x⋆, λ⋆) v = v^T ∇²_xx L(x⋆, λ⋆) v + ρ (v^T ∇h(x⋆))²

     where the first term might be negative. However, there exists ρ̄ > 0 such that ρ ≥ ρ̄ implies v^T ∇²_xx L_ρ(x⋆, λ⋆) v > 0. Thus, if ρ is large enough, the Hessian of the augmented Lagrangian is positive definite and x⋆ is a (strict) local minimum of L_ρ(·, λ⋆).
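
The analysis above suggests the method of multipliers: for fixed λ minimize L_ρ(·, λ), then update the multipliers. A sketch, where the update λ ← λ + ρ h(x) is the classical one (it is not derived on these slides):

import numpy as np
from scipy.optimize import minimize

def method_of_multipliers(f, h, x0, lam0, rho=10.0, rounds=20):
    """Augmented Lagrangian method for min f(x) s.t. h(x) = 0 (h vector-valued)."""
    x, lam = np.asarray(x0, float), np.asarray(lam0, float)
    for _ in range(rounds):
        L = lambda y, l=lam: f(y) + 0.5 * rho * h(y) @ h(y) + l @ h(y)
        x = minimize(L, x, method="BFGS").x
        lam = lam + rho * h(x)               # classical multiplier update
    return x, lam

# Same toy problem as above: min x + y  s.t.  x^2 + y^2 = 2
x_s, lam_s = method_of_multipliers(
    lambda v: v[0] + v[1],
    lambda v: np.array([v[0] ** 2 + v[1] ** 2 - 2]),
    [0.0, 0.0], [0.0])
# x_s -> (-1, -1); lam_s -> 0.5, the Lagrange multiplier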

  27. Inequality constraints. Consider

     min f(x)
     s.t. g(x) ≤ 0

     Nonlinear transformation of inequalities into equalities via squared slack variables:

     min_{x,s} f(x)
     s.t. g_j(x) + s_j² = 0,  j = 1, ..., p

  28. Given the problem

     min f(x)
     s.t. h_i(x) = 0,  i = 1, ..., m
          g_j(x) ≤ 0,  j = 1, ..., p

     an augmented Lagrangian problem might be defined as

     min_{x,z} L_ρ(x, z; λ, µ) = min_{x,z} f(x) + λ^T h(x) + (ρ/2) ‖h(x)‖²
                                 + ∑_j µ_j (g_j(x) + z_j²) + (ρ/2) ∑_j (g_j(x) + z_j²)²
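
Minimizing L_ρ over each slack z_j has a closed form: with u_j = g_j(x) + z_j², the inner problem min_{u_j ≥ g_j(x)} µ_j u_j + (ρ/2) u_j² is solved by u_j = max(g_j(x), −µ_j/ρ). A sketch using this elimination together with the update µ_j ← max(0, µ_j + ρ g_j(x)); both are standard consequences of this construction but are not spelled out on the slides:

import numpy as np
from scipy.optimize import minimize

def method_of_multipliers_ineq(f, g, x0, mu0, rho=10.0, rounds=20):
    """Augmented Lagrangian method for min f(x) s.t. g(x) <= 0 (g vector-valued)."""
    def L(x, mu):
        u = np.maximum(g(x), -mu / rho)       # g_j(x) + z_j^2 after eliminating z
        return f(x) + mu @ u + 0.5 * rho * u @ u
    x, mu = np.asarray(x0, float), np.asarray(mu0, float)
    for _ in range(rounds):
        x = minimize(lambda y: L(y, mu), x, method="BFGS").x
        mu = np.maximum(0.0, mu + rho * g(x)) # keeps mu >= 0
    return x, mu

# Toy problem: min x  s.t.  1 - x <= 0; solution x = 1 with multiplier mu = 1
print(method_of_multipliers_ineq(lambda v: v[0],
                                 lambda v: np.array([1.0 - v[0]]),
                                 [0.0], [0.0]))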
