  1. Unconstrained minimization
     Lectures for the PhD course on Numerical Optimization
     Enrico Bertolazzi, DIMS, Università di Trento
     November 21 – December 14, 2011

  2. Outline
     1. General iterative scheme
     2. Backtracking Armijo line-search
        - Global convergence of backtracking Armijo line-search
        - Global convergence of steepest descent
        - Wolfe–Zoutendijk global convergence
     3. The Wolfe conditions
        - The Armijo–Goldstein conditions
     4. Algorithms for line-search
        - Armijo parabolic-cubic search
        - Wolfe line-search

  3. The problem (1/3)
     Given $f : \mathbb{R}^n \to \mathbb{R}$, solve
       $\min_{x \in \mathbb{R}^n} f(x)$.
     The following regularity of $f(x)$ is assumed throughout.
     Assumption (Regularity assumption). We assume $f \in C^1(\mathbb{R}^n)$ with Lipschitz continuous gradient, i.e. there exists $\gamma > 0$ such that
       $\|\nabla f(x)^T - \nabla f(y)^T\| \le \gamma \|x - y\|$ for all $x, y \in \mathbb{R}^n$.
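A concrete case (not from the slides): for a quadratic $f(x) = \frac{1}{2} x^T A x$ the gradient is $A x$, and the smallest valid Lipschitz constant is $\gamma = \|A\|_2$. The sketch below checks this numerically; the matrix is an assumed example.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])          # assumed example matrix

def grad_f(x):
    return A @ x                    # gradient of f(x) = 0.5 * x^T A x

rng = np.random.default_rng(0)
ratios = []
for _ in range(1000):
    x = rng.standard_normal(2)
    y = rng.standard_normal(2)
    ratios.append(np.linalg.norm(grad_f(x) - grad_f(y)) / np.linalg.norm(x - y))

print(max(ratios))                  # approaches ...
print(np.linalg.norm(A, 2))         # ... gamma = ||A||_2
```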

  4. The problem (2/3)
     Definition (Global minimum). Given $f : \mathbb{R}^n \to \mathbb{R}$, a point $x^\star \in \mathbb{R}^n$ is a global minimum if
       $f(x^\star) \le f(x)$ for all $x \in \mathbb{R}^n$.
     Definition (Local minimum). Given $f : \mathbb{R}^n \to \mathbb{R}$, a point $x^\star \in \mathbb{R}^n$ is a local minimum if, for some $\delta > 0$,
       $f(x^\star) \le f(x)$ for all $x \in B(x^\star; \delta)$.
     Obviously a global minimum is a local minimum. Finding a global minimum is in general not an easy task; the algorithms presented in the sequel approximate local minima.
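To see why local methods only approximate local minima, consider this assumed one-dimensional example: $f(x) = x^4 - 3x^2 + x$ has two local minima, only one of which is global, and a descent method finds one or the other depending on the starting point.

```python
from scipy.optimize import minimize

f = lambda x: x**4 - 3 * x**2 + x

left = minimize(f, x0=-2.0)    # converges to the global minimum (~ -1.30)
right = minimize(f, x0=2.0)    # converges to a local, non-global minimum (~ 1.13)

print(left.x, left.fun)        # lower objective value
print(right.x, right.fun)
```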

  5. The problem (3/3)
     Definition (Strict global minimum). Given $f : \mathbb{R}^n \to \mathbb{R}$, a point $x^\star \in \mathbb{R}^n$ is a strict global minimum if
       $f(x^\star) < f(x)$ for all $x \in \mathbb{R}^n \setminus \{x^\star\}$.
     Definition (Strict local minimum). Given $f : \mathbb{R}^n \to \mathbb{R}$, a point $x^\star \in \mathbb{R}^n$ is a strict local minimum if, for some $\delta > 0$,
       $f(x^\star) < f(x)$ for all $x \in B(x^\star; \delta) \setminus \{x^\star\}$.
     Obviously a strict global minimum is a strict local minimum.

  6. First order necessary condition
     Lemma (First order necessary condition for a local minimum). Let $f : \mathbb{R}^n \to \mathbb{R}$ satisfy the regularity assumption. If a point $x^\star \in \mathbb{R}^n$ is a local minimum, then $\nabla f(x^\star)^T = 0$.
     Proof. Consider a generic direction $d$; then for $\delta$ small enough we have
       $\lambda^{-1} \left( f(x^\star + \lambda d) - f(x^\star) \right) \ge 0$,  $0 < \lambda < \delta$,
     so that
       $\lim_{\lambda \to 0} \lambda^{-1} \left( f(x^\star + \lambda d) - f(x^\star) \right) = \nabla f(x^\star) d \ge 0$.
     Because $d$ is a generic direction (the inequality holds for both $d$ and $-d$), we have $\nabla f(x^\star)^T = 0$.
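A numerical companion to the proof (assumed example): for $f(x) = (x_1 - 1)^2 + 2 x_2^2$ with minimizer $x^\star = (1, 0)$, the one-sided difference quotients along random directions are nonnegative and tend to $\nabla f(x^\star) d = 0$.

```python
import numpy as np

def f(x):
    return (x[0] - 1.0)**2 + 2.0 * x[1]**2

def grad_f(x):
    return np.array([2.0 * (x[0] - 1.0), 4.0 * x[1]])

x_star = np.array([1.0, 0.0])
rng = np.random.default_rng(1)
for _ in range(3):
    d = rng.standard_normal(2)
    lam = 1e-6
    quotient = (f(x_star + lam * d) - f(x_star)) / lam  # >= 0 near a minimum
    print(quotient, grad_f(x_star) @ d)                 # both tend to 0
```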

  7. 1. The first order necessary condition does not discriminate between maxima, minima, and saddle points.
     2. To discriminate between maxima and minima we need more information, e.g. the second order derivatives of $f(x)$.
     3. With second order derivatives we can build necessary conditions and sufficient conditions for a minimum.
     4. In general, using only the first and second order derivatives at the point $x^\star$, it is not possible to deduce a condition that is both necessary and sufficient for a minimum. For instance, $f(x) = x^3$ and $f(x) = x^4$ have the same first and second derivatives at $x = 0$, yet $0$ is a minimum only for the latter.

  8. Second order necessary condition
     Lemma (Second order necessary condition for a local minimum). Given $f \in C^2(\mathbb{R}^n)$, if a point $x^\star \in \mathbb{R}^n$ is a local minimum, then $\nabla f(x^\star)^T = 0$ and $\nabla^2 f(x^\star)$ is positive semi-definite, i.e.
       $d^T \nabla^2 f(x^\star) d \ge 0$ for all $d \in \mathbb{R}^n$.
     Example. This condition is only necessary. In fact, consider $f(x) = x_1^2 - x_2^3$, with
       $\nabla f(x) = (2 x_1, \, -3 x_2^2)$,   $\nabla^2 f(x) = \begin{pmatrix} 2 & 0 \\ 0 & -6 x_2 \end{pmatrix}$.
     For the point $x^\star = 0$ we have $\nabla f(0) = 0$ and $\nabla^2 f(0)$ positive semi-definite, but $0$ is a saddle point, not a minimum.
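The counterexample can be checked numerically; this sketch just evaluates the slide's own formulas.

```python
import numpy as np

def f(x):
    return x[0]**2 - x[1]**3

H0 = np.array([[2.0, 0.0],
               [0.0, 0.0]])        # Hessian of f at x* = 0

print(np.linalg.eigvalsh(H0))      # [0, 2]: positive semi-definite
print(f(np.array([0.0, 0.1])))     # -0.001 < f(0) = 0: 0 is not a minimum
```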

  9. Proof. The condition $\nabla f(x^\star)^T = 0$ comes from the first order necessary condition. Consider now a generic direction $d$ and the finite difference
       $\lambda^{-2} \left( f(x^\star + \lambda d) - 2 f(x^\star) + f(x^\star - \lambda d) \right) \ge 0$.
     By using the Taylor expansion of $f(x)$,
       $f(x^\star \pm \lambda d) = f(x^\star) \pm \lambda \nabla f(x^\star) d + \frac{\lambda^2}{2} d^T \nabla^2 f(x^\star) d + o(\lambda^2)$,
     the previous inequality becomes
       $d^T \nabla^2 f(x^\star) d + 2 \, o(\lambda^2)/\lambda^2 \ge 0$.
     Taking the limit $\lambda \to 0$, and from the arbitrariness of $d$, $\nabla^2 f(x^\star)$ must be positive semi-definite.

  10. Second order sufficient condition
      Lemma (Second order sufficient condition for a local minimum). Given $f \in C^2(\mathbb{R}^n)$, if a point $x^\star \in \mathbb{R}^n$ satisfies:
        1. $\nabla f(x^\star)^T = 0$;
        2. $\nabla^2 f(x^\star)$ is positive definite, i.e. $d^T \nabla^2 f(x^\star) d > 0$ for all $d \in \mathbb{R}^n \setminus \{0\}$;
      then $x^\star \in \mathbb{R}^n$ is a strict local minimum.
      Remark. Because $\nabla^2 f(x^\star)$ is symmetric we can write
        $\lambda_{\min} \, d^T d \le d^T \nabla^2 f(x^\star) d \le \lambda_{\max} \, d^T d$.
      If $\nabla^2 f(x^\star)$ is positive definite we have $\lambda_{\min} > 0$.
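In practice the remark suggests checking $\lambda_{\min}$ of the Hessian. Assumed example: $f(x) = x_1^2 + x_1 x_2 + 2 x_2^2$ has $\nabla f(0) = 0$ and Hessian eigenvalues $3 \pm \sqrt{2} > 0$, so $0$ is a strict local minimum.

```python
import numpy as np

H = np.array([[2.0, 1.0],
              [1.0, 4.0]])          # Hessian of x1^2 + x1*x2 + 2*x2^2 at 0

lam_min = np.linalg.eigvalsh(H)[0]  # eigvalsh: for symmetric matrices
print(lam_min, lam_min > 0)         # 3 - sqrt(2) > 0: condition holds
```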

  11. Proof. Consider a generic direction $d$ and the Taylor expansion of $f(x)$:
        $f(x^\star + d) = f(x^\star) + \nabla f(x^\star) d + \frac{1}{2} d^T \nabla^2 f(x^\star) d + o(\|d\|^2)$
          $\ge f(x^\star) + \frac{1}{2} \lambda_{\min} \|d\|^2 + o(\|d\|^2)$
          $\ge f(x^\star) + \frac{1}{2} \lambda_{\min} \|d\|^2 \left( 1 + o(\|d\|^2)/\|d\|^2 \right)$.
      Choosing $\|d\|$ small enough we can write
        $f(x^\star + d) \ge f(x^\star) + \frac{1}{4} \lambda_{\min} \|d\|^2 > f(x^\star)$ for $d \ne 0$, $\|d\| \le \delta$,
      i.e. $x^\star$ is a strict local minimum.

  12. Outline (section 1: General iterative scheme)

  13. How to find a minimum
      Given $f : \mathbb{R}^n \to \mathbb{R}$: minimize $f(x)$ over $x \in \mathbb{R}^n$.
      1. We can solve the problem by solving the necessary condition, i.e. by solving the nonlinear system $\nabla f(x)^T = 0$.
      2. Using such an approach we lose the information about $f(x)$.
      3. Moreover, such an approach can find solutions corresponding to maxima or saddle points.
      4. A better approach is to use all the information and build a minimizing procedure, i.e. a procedure that, starting from a point $x_0$, builds a sequence $\{x_k\}$ such that $f(x_{k+1}) \le f(x_k)$. In this way, at least, we avoid converging to a strict maximum.
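A small illustration of point 3 (assumed example, not from the slides): applying Newton's method to $\nabla f(x) = 0$ for $f(x) = x^4 - 2x^2$ happily converges to the stationary point $x = 0$, which is a local maximum.

```python
def fp(x):
    return 4 * x**3 - 4 * x     # f'(x) for f(x) = x^4 - 2x^2

def fpp(x):
    return 12 * x**2 - 4        # f''(x)

x = 0.1                         # start near the maximum at 0
for _ in range(10):
    x -= fp(x) / fpp(x)         # Newton step on the equation f'(x) = 0
print(x)                        # ~0.0: a maximum, not a minimum, of f
```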

  14. Iterative methods
      In practice it is very rare to be able to provide an explicit minimizer.
      Iterative method: given a starting guess $x_0$, generate the sequence $\{x_k\}$, $k = 1, 2, \ldots$
      AIM: ensure that (a subsequence) has some favorable limiting properties:
      - satisfies first-order necessary conditions;
      - satisfies second-order necessary conditions.

  15. Line-search methods
      A generic iterative minimization procedure can be sketched as follows:
      - calculate a search direction $p_k$ from $x_k$;
      - ensure that this direction is a descent direction, i.e.
          $\nabla f(x_k) p_k < 0$ whenever $\nabla f(x_k)^T \ne 0$,
        so that, at least for small steps along $p_k$, the objective function $f(x)$ will be reduced;
      - use a line-search to calculate a suitable step length $\alpha_k > 0$ so that $f(x_k + \alpha_k p_k) < f(x_k)$;
      - update the point: $x_{k+1} = x_k + \alpha_k p_k$.
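For instance (assumed values), the steepest-descent choice $p_k = -\nabla f(x_k)^T$ always satisfies the descent condition when the gradient is nonzero:

```python
import numpy as np

g = np.array([0.5, -2.0])   # some nonzero gradient at x_k (assumed values)
p = -g                      # steepest-descent direction
print(g @ p)                # -||g||^2 = -4.25 < 0: a descent direction
```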

  16. Generic minimization algorithm
      Written in pseudo-code, the minimization procedure is the following algorithm:

      Generic minimization algorithm
        Given an initial guess $x_0$, let $k = 0$;
        while not converged do
          find a descent direction $p_k$ at $x_k$;
          compute a step size $\alpha_k$ using a line-search along $p_k$;
          set $x_{k+1} = x_k + \alpha_k p_k$ and increase $k$ by 1;
        end while

      The crucial points which differentiate the algorithms are:
      1. the computation of the direction $p_k$;
      2. the computation of the step size $\alpha_k$.
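A minimal runnable sketch of this loop, assuming steepest descent for the direction and a caller-supplied line-search; the function names and interface are mine, not the lecture's.

```python
import numpy as np

def minimize_generic(f, grad_f, x0, line_search, tol=1e-8, max_iter=1000):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) <= tol:      # converged: first-order condition
            break
        p = -g                            # descent direction (steepest descent)
        alpha = line_search(f, x, p, g)   # step size along p
        x = x + alpha * p                 # update the iterate
    return x
```

Any line-search with the signature `line_search(f, x, p, g)` can be plugged in, e.g. the backtracking sketch given after slide 18.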

  17. Practical line-search methods
      The first minimization algorithms developed tried to solve
        $\alpha_k = \arg\min_{\alpha > 0} f(x_k + \alpha p_k)$,
      performing an exact line-search by univariate minimization; this is rather expensive and certainly not cost effective.
      Modern methods implement an inexact line-search:
      - ensure steps are neither too long nor too short;
      - try to pick a useful initial step size for fast convergence;
      - the best methods are based on:
        - backtracking–Armijo search;
        - Armijo–Goldstein search;
        - Frank–Wolfe search.
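Exact line-search is cheap only in special cases. For a quadratic $f(x) = \frac{1}{2} x^T A x - b^T x$ (assumed example) it has the closed form $\alpha = -\nabla f(x) p \,/\, (p^T A p)$:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 1.0])

x = np.zeros(2)
g = A @ x - b                   # gradient of the quadratic at x
p = -g                          # steepest-descent direction
alpha = -(g @ p) / (p @ A @ p)  # exact minimizer of f(x + alpha * p)
print(alpha)                    # 2/7 for this data
```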

  18. Backtracking line-search
      To obtain a monotone decreasing sequence we can use the following algorithm:

      Backtracking line-search
        Given $\alpha_{\mathrm{init}}$ (e.g., $\alpha_{\mathrm{init}} = 1$);
        given $\tau \in (0, 1)$, typically $\tau = 0.5$;
        let $\alpha^{(0)} = \alpha_{\mathrm{init}}$ and $\ell = 0$;
        while not $f(x_k + \alpha^{(\ell)} p_k) < f(x_k)$ do
          set $\alpha^{(\ell+1)} = \tau \alpha^{(\ell)}$;
          increase $\ell$ by 1;
        end while
        Set $\alpha_k = \alpha^{(\ell)}$.

      To be effective, the previous algorithm should terminate in a finite number of steps. The next lemma assures that if $p_k$ is a descent direction then the algorithm terminates.
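A runnable sketch of this loop (the `g` argument is unused here but kept so the routine matches the `line_search(f, x, p, g)` interface of the generic scheme above; the Armijo variants introduced next do use it):

```python
def backtracking(f, x, p, g, alpha_init=1.0, tau=0.5, max_halvings=50):
    alpha = alpha_init
    fx = f(x)
    for _ in range(max_halvings):
        if f(x + alpha * p) < fx:   # simple decrease achieved: accept
            return alpha
        alpha *= tau                # shrink the step and try again
    return alpha                    # safeguard against non-descent directions
```

With this, `minimize_generic(f, grad_f, x0, backtracking)` from the earlier sketch runs end to end.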

  19. Outline (section 2: Backtracking Armijo line-search)
