

SLIDE 1

Trust Region Method

Lectures for the PhD course on Numerical Optimization

Enrico Bertolazzi

DIMS – Università di Trento

November 21 – December 14, 2011

SLIDE 2

The Trust Region method

Outline

1. The Trust Region method
2. The exact solution of trust region step
3. The dogleg trust region step

SLIDE 3

The Trust Region method – Introduction

Newton and quasi-Newton methods search for a solution iteratively: at each step they choose a search direction and minimize along that direction. An alternative approach is to choose a direction and a step length together; if the step is successful in some sense, the step is accepted, otherwise another direction and step length are chosen. How the step length and direction are chosen is algorithm dependent, but a successful approach is the one based on a trust region.

SLIDE 4

The Trust Region method – Introduction

Newton and quasi-Newton methods at each step (approximately) solve the minimization problem

$$\min_s\; m(x_k + s) = f(x_k) + \nabla f(x_k)\, s + \tfrac{1}{2}\, s^T H_k s$$

in the case $H_k$ is symmetric and positive definite (SPD). If $H_k$ is SPD the minimum is

$$s = -H_k^{-1} g_k, \qquad g_k = \nabla f(x_k)^T,$$

and $s$ is the quasi-Newton step. If $H_k = \nabla^2 f(x_k)$ and is SPD, then $s = -H_k^{-1} g_k$ is the Newton step.
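For concreteness, a minimal numpy sketch of computing this step, assuming an SPD $H_k$ (the Cholesky-based solve is our own choice here, not prescribed by the slides):

```python
import numpy as np

def newton_step(g, H):
    """Return s = -H^{-1} g via Cholesky; valid when H is SPD."""
    L = np.linalg.cholesky(H)        # raises LinAlgError if H is not SPD
    y = np.linalg.solve(L, -g)       # forward substitution: L y = -g
    return np.linalg.solve(L.T, y)   # back substitution:  L^T s = y
```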

SLIDE 5

The Trust Region method – Introduction

If $H_k$ is not positive definite, the search direction $-H_k^{-1} g_k$ may fail to be a descent direction, and the previous minimization problem may have no solution. The problem is that the model $m(x_k + s)$ is only an approximation,

$$m(x_k + s) \approx f(x_k + s),$$

and this approximation is valid only in a small neighborhood of $x_k$.

[Figure: the quadratic model $m(x_k + s)$ tracking $f(x)$ near $x_k$.]

An alternative minimization problem is therefore

$$\min_s\; m(x_k + s) = f(x_k) + \nabla f(x_k)\, s + \tfrac{1}{2}\, s^T H_k s, \qquad \text{subject to}\quad \|s\| \le \delta_k,$$

where $\delta_k$ is the trust region of the model $m(x)$, i.e. the region where we trust the model to be valid.
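Trust-region variants of this constrained subproblem are what production optimizers solve at each iteration; a brief usage sketch with SciPy's `trust-ncg` method (the Rosenbrock test function is just an illustrative choice):

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess

# 'trust-ncg' minimizes the quadratic model subject to ||s|| <= delta_k
# at each step, adapting delta_k much as in the generic algorithm below.
x0 = np.array([-1.2, 1.0])
res = minimize(rosen, x0, method="trust-ncg", jac=rosen_der, hess=rosen_hess)
print(res.x)   # converges to the minimizer [1, 1]
```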

SLIDE 6

The Trust Region method – The generic trust region algorithm

Algorithm (Generic trust region algorithm)

x assigned; δ assigned;
g ← ∇f(x)ᵀ; H ← ∇²f(x);
while ‖g‖ > ε do
    s ← arg min_{‖s‖≤δ} m(x+s) = f(x) + gᵀs + ½ sᵀHs;
    pred ← m(x+s) − m(x);
    ared ← f(x+s) − f(x);
    if (ared/pred) < η₁ then
        x ← x; δ ← δγ₁;                -- reject step, reduce δ
    else
        x ← x + s;                     -- accept step, update H
        if (ared/pred) > η₂ then
            δ ← max{δ, γ₂‖s‖};         -- enlarge δ
        end if
    end if
end while
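A direct Python transcription of this loop; `subproblem` stands for any solver of the constrained model (exact or dogleg, both developed later), and the default constants are the typical values quoted on the next slide:

```python
import numpy as np

def trust_region(f, grad, hess, x, subproblem, delta=1.0, eps=1e-8,
                 eta1=0.25, eta2=0.75, gamma1=0.5, gamma2=3.0,
                 max_iter=200):
    """Generic trust region loop from the slide."""
    for _ in range(max_iter):
        g, H = grad(x), hess(x)
        if np.linalg.norm(g) <= eps:
            break
        s = subproblem(g, H, delta)       # arg min_{||s||<=delta} m(x+s)
        pred = g @ s + 0.5 * s @ H @ s    # m(x+s) - m(x), predicted reduction
        ared = f(x + s) - f(x)            # actual reduction
        if ared / pred < eta1:
            delta *= gamma1               # reject step, reduce delta
        else:
            x = x + s                     # accept step
            if ared / pred > eta2:
                delta = max(delta, gamma2 * np.linalg.norm(s))  # enlarge
    return x
```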

SLIDE 7

The Trust Region method – A fundamental lemma

The previous algorithm is based on two key ingredients:

1. The ratio r = ared/pred of the actual reduction to the predicted reduction.
2. Enlarging or reducing the trust region δ.

If the ratio satisfies 0 < η₁ < r < η₂ < 1, the model is quite appropriate: we accept the step and do not modify the trust region. If the ratio is small (r ≤ η₁), the model is not appropriate: we reject the step and reduce the trust region by a factor γ₁ < 1. If the ratio is large (r ≥ η₂), the model is very appropriate: we accept the step and enlarge the trust region by a factor γ₂ > 1. The algorithm is quite insensitive to the constants η₁ and η₂; typical values are η₁ = 0.25, η₂ = 0.75, γ₁ = 0.5 and γ₂ = 3.

SLIDE 8

The Trust Region method – A fundamental lemma

Lemma

Let $f:\mathbb{R}^n \to \mathbb{R}$ be twice continuously differentiable and $H \in \mathbb{R}^{n\times n}$ symmetric and positive definite. Then the problem

$$\min_s\; m(x+s) = f(x) + \nabla f(x)\, s + \tfrac{1}{2}\, s^T H s, \qquad \text{subject to}\quad \|s\| \le \delta,$$

is solved by

$$s(\mu) \doteq -(H + \mu I)^{-1} g, \qquad g = \nabla f(x)^T,$$

for the unique $\mu \ge 0$ such that $\|s(\mu)\| = \delta$, unless $\|s(0)\| \le \delta$, in which case $s(0)$ is the solution. For any $\mu \ge 0$, $s(\mu)$ defines a descent direction for $f$ from $x$.
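A small numerical sketch of the lemma under an assumed SPD $H$: since $\|s(\mu)\|$ decreases monotonically in $\mu$ (proved on the next two slides), the $\mu$ with $\|s(\mu)\| = \delta$ can be bracketed and bisected (the tolerance and doubling strategy are our own choices):

```python
import numpy as np

def s_of_mu(H, g, mu):
    """The step s(mu) = -(H + mu I)^{-1} g from the lemma."""
    return -np.linalg.solve(H + mu * np.eye(len(g)), g)

def solve_mu(H, g, delta, tol=1e-10):
    """Find mu >= 0 with ||s(mu)|| = delta; returns 0 if s(0) is inside."""
    if np.linalg.norm(s_of_mu(H, g, 0.0)) <= delta:
        return 0.0
    lo, hi = 0.0, 1.0
    while np.linalg.norm(s_of_mu(H, g, hi)) > delta:  # ||s(mu)|| -> 0
        hi *= 2.0
    while hi - lo > tol * (1.0 + hi):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(s_of_mu(H, g, mid)) > delta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```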

SLIDE 9

The Trust Region method – A fundamental lemma

Proof (1/2).

If $\|s(0)\| \le \delta$ then $s(0)$ is the global minimum inside the trust region. Otherwise consider the Lagrangian

$$\mathcal{L}(s,\mu) = a + g^T s + \tfrac{1}{2}\, s^T H s + \tfrac{1}{2}\,\mu\,(s^T s - \delta^2),$$

where $a = f(x)$ and $g = \nabla f(x)^T$. Then we have

$$\frac{\partial \mathcal{L}}{\partial s}(s,\mu) = H s + \mu s + g = 0 \;\Rightarrow\; s = -(H+\mu I)^{-1} g, \qquad s^T s = \delta^2.$$

Remember that if $H$ is SPD then $H + \mu I$ is SPD for all $\mu \ge 0$; moreover the inverse of an SPD matrix is SPD. From

$$g^T s = -g^T (H+\mu I)^{-1} g < 0 \quad \text{for all } \mu \ge 0$$

it follows that $s(\mu)$ is a descent direction for all $\mu \ge 0$.

SLIDE 10

The Trust Region method – A fundamental lemma

Proof (2/2).

To prove uniqueness, expand the gradient $g$ in the eigenvectors of $H$:

$$g = \sum_{i=1}^n \alpha_i u_i.$$

$H$ is SPD, so the $u_i$ can be chosen orthonormal. It follows that

$$(H + \mu I)^{-1} g = (H + \mu I)^{-1} \sum_{i=1}^n \alpha_i u_i = \sum_{i=1}^n \frac{\alpha_i}{\lambda_i + \mu}\, u_i,$$

$$\bigl\|(H + \mu I)^{-1} g\bigr\|^2 = \sum_{i=1}^n \frac{\alpha_i^2}{(\lambda_i + \mu)^2},$$

and $\|(H + \mu I)^{-1} g\|$ is a monotonically decreasing function of $\mu$.

SLIDE 11

The Trust Region method – A fundamental lemma

Remark

As a consequence of the previous Lemma: as the radius of the trust region becomes smaller, the scalar µ becomes larger, so the search direction becomes more and more oriented toward the (negative) gradient direction. As the radius of the trust region becomes larger, the scalar µ becomes smaller, so the search direction becomes more and more oriented toward the Newton direction. Thus a trust region technique changes not only the length of the step but also its direction; this results in a more robust numerical technique. The price to pay is that solving the constrained minimization is more costly than an inexact line search.

SLIDE 12

The Trust Region method – Solving the constrained minimization problem

Solving the constrained minimization problem

As for the line-search problem, we have several alternatives for solving the constrained minimization problem:

• We can solve the constrained minimization problem accurately, for example by an iterative method.
• We can approximate the solution of the constrained minimization problem.

As for line search, computing the accurate solution of the constrained minimization problem does not pay off, while a good cheap approximation normally performs better.

SLIDE 13

The exact solution of trust region step

Outline

1. The Trust Region method
2. The exact solution of trust region step
3. The dogleg trust region step

SLIDE 14

The exact solution of trust region step – The Newton approach

The Newton approach (1/5)

Consider the Lagrangian

$$\mathcal{L}(s,\mu) = a + g^T s + \tfrac{1}{2}\, s^T H s + \tfrac{1}{2}\,\mu\,(s^T s - \delta^2),$$

where $a = f(x)$ and $g = \nabla f(x)^T$. Then we can try to solve the nonlinear system

$$\frac{\partial \mathcal{L}}{\partial (s,\mu)}(s,\mu) = \begin{pmatrix} H s + \mu s + g \\ (s^T s - \delta^2)/2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

Using the Newton method we have

$$\begin{pmatrix} s_{k+1} \\ \mu_{k+1} \end{pmatrix} = \begin{pmatrix} s_k \\ \mu_k \end{pmatrix} - \begin{pmatrix} H + \mu_k I & s_k \\ s_k^T & 0 \end{pmatrix}^{-1} \begin{pmatrix} H s_k + \mu_k s_k + g \\ (s_k^T s_k - \delta^2)/2 \end{pmatrix}.$$

SLIDE 15

The exact solution of trust region step – The Newton approach

The Newton approach (2/5)

A better approach is to solve $\Phi(\mu) = 0$ where

$$\Phi(\mu) = \|s(\mu)\| - \delta, \qquad s(\mu) = -(H + \mu I)^{-1} g.$$

To build the Newton method we need to evaluate

$$\Phi'(\mu) = \frac{s(\mu)^T s'(\mu)}{\|s(\mu)\|}, \qquad s'(\mu) = (H + \mu I)^{-2} g,$$

where $s'(\mu)$ is obtained by differentiating the relation

$$H s(\mu) + \mu s(\mu) = -g \;\Rightarrow\; H s'(\mu) + \mu s'(\mu) + s(\mu) = 0.$$

Putting everything into a Newton step we obtain

$$\mu_{k+1} = \mu_k - \frac{\|s(\mu_k)\|}{s(\mu_k)^T s'(\mu_k)}\,\bigl(\|s(\mu_k)\| - \delta\bigr).$$

SLIDE 16

The exact solution of trust region step – The Newton approach

The Newton approach (3/5)

The Newton step can be reorganized as follows:

$$s_k = -(H + \mu_k I)^{-1} g, \qquad s_k' = -(H + \mu_k I)^{-1} s_k, \qquad \beta = \sqrt{s_k^T s_k},$$

$$\mu_{k+1} = \mu_k - \frac{\beta\,(\beta - \delta)}{s_k^T s_k'}.$$

Thus each Newton step requires the solution of two linear systems. However, the coefficient matrix is the same for both, so only one LU factorization is needed; the cost per step is essentially the cost of the factorization.
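A sketch of this reorganized iteration in Python, reusing SciPy's Cholesky factorization for both solves as the remark suggests (the starting µ, tolerance, and iteration cap are illustrative assumptions):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def newton_mu(H, g, delta, mu=0.0, tol=1e-10, max_iter=50):
    """Newton iteration for Phi(mu) = ||s(mu)|| - delta = 0.

    Assumes H is SPD and the constraint is active (||s(0)|| > delta).
    """
    n = len(g)
    for _ in range(max_iter):
        c = cho_factor(H + mu * np.eye(n))   # one factorization per step
        s = -cho_solve(c, g)                 # s_k  = -(H + mu I)^{-1} g
        sp = -cho_solve(c, s)                # s_k' = -(H + mu I)^{-1} s_k
        beta = np.linalg.norm(s)
        if abs(beta - delta) <= tol * delta:
            break
        mu -= beta * (beta - delta) / (s @ sp)
        mu = max(mu, 0.0)                    # keep mu feasible (safeguard)
    return mu, s
```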

SLIDE 17

The exact solution of trust region step – The Newton approach

The Newton approach (4/5)

Evaluating $\Phi''(\mu)$ we have

$$\Phi''(\mu) = \frac{\|s'(\mu)\|^2 + s(\mu)^T s''(\mu)}{\|s(\mu)\|} - \frac{\bigl(s(\mu)^T s'(\mu)\bigr)^2}{\|s(\mu)\|^3}.$$

From $(H + \mu I)\, s'(\mu) = -s(\mu)$ we obtain, differentiating again, $(H + \mu I)\, s''(\mu) = -2\, s'(\mu)$, hence

$$s(\mu)^T s''(\mu) = -2\, s(\mu)^T (H + \mu I)^{-1} s'(\mu) = 2\, \|s'(\mu)\|^2.$$

Then, using the Cauchy–Schwarz inequality $(s^T s')^2 \le \|s\|^2 \|s'\|^2$,

$$\Phi''(\mu) = \frac{3\,\|s'(\mu)\|^2}{\|s(\mu)\|} - \frac{\bigl(s(\mu)^T s'(\mu)\bigr)^2}{\|s(\mu)\|^3} \ge \frac{2\,\|s'(\mu)\|^2}{\|s(\mu)\|} > 0,$$

so for all $\mu \ge 0$ we have $\Phi''(\mu) > 0$.

SLIDE 18

The exact solution of trust region step – The Newton approach

The Newton approach (5/5)

Since $\Phi$ is convex ($\Phi''(\mu) > 0$) and decreasing, the Newton step underestimates $\mu^\star$ at each iteration: the iterates approach the solution monotonically from below.

[Figure: graph of $\Phi(\mu) = \|s(\mu)\| - \delta$ against $\mu$, crossing zero at $\mu = \mu^\star$.]

SLIDE 19

The exact solution of trust region step – The Model approach

If we expand the vector $g$ in the orthonormal basis given by the eigenvectors of $H$, we have

$$g = \sum_{i=1}^n \alpha_i u_i.$$

Using this expression to evaluate $s(\mu)$:

$$s(\mu) = -(H + \mu I)^{-1} g = -\sum_{i=1}^n \frac{\alpha_i}{\mu + \lambda_i}\, u_i, \qquad \|s(\mu)\| = \left( \sum_{i=1}^n \frac{\alpha_i^2}{(\mu + \lambda_i)^2} \right)^{1/2}.$$

This expression suggests using as a model for $\Phi(\mu)$ the rational function

$$m_k(\mu) = \frac{\alpha_k}{\beta_k + \mu} - \delta.$$

SLIDE 20

The exact solution of trust region step – The Model approach

The model has two parameters, $\alpha_k$ and $\beta_k$. To set them we impose

$$m_k(\mu_k) = \frac{\alpha_k}{\beta_k + \mu_k} - \delta = \Phi(\mu_k), \qquad m_k'(\mu_k) = -\frac{\alpha_k}{(\beta_k + \mu_k)^2} = \Phi'(\mu_k).$$

Solving for $\alpha_k$ and $\beta_k$ we have

$$\alpha_k = -\frac{(\Phi(\mu_k) + \delta)^2}{\Phi'(\mu_k)}, \qquad \beta_k = -\frac{\Phi(\mu_k) + \delta}{\Phi'(\mu_k)} - \mu_k,$$

where

$$\Phi(\mu_k) = \|s(\mu_k)\| - \delta, \qquad \Phi'(\mu_k) = -\frac{s(\mu_k)^T (H + \mu_k I)^{-1} s(\mu_k)}{\|s(\mu_k)\|}.$$

Having $\alpha_k$ and $\beta_k$ it is possible to solve $m_k(\mu) = 0$, obtaining

$$\mu_{k+1} = \frac{\alpha_k}{\delta} - \beta_k.$$

SLIDE 21

The exact solution of trust region step – The Model approach

Substituting $\alpha_k$ and $\beta_k$, the step becomes

$$\mu_{k+1} = \mu_k - \frac{\Phi(\mu_k)}{\Phi'(\mu_k)} - \frac{\Phi(\mu_k)^2}{\Phi'(\mu_k)\,\delta} = \mu_k - \frac{\Phi(\mu_k)}{\Phi'(\mu_k)} \left( 1 + \frac{\Phi(\mu_k)}{\delta} \right).$$

Comparing with the Newton step

$$\mu_{k+1} = \mu_k - \frac{\Phi(\mu_k)}{\Phi'(\mu_k)},$$

we see that this method performs larger steps, by the factor $1 + \Phi(\mu_k)\,\delta^{-1}$. Notice that $1 + \Phi(\mu_k)\,\delta^{-1}$ converges to $1$ as $\mu_k \to \mu^\star$, so this iteration reduces to the Newton iteration as $\mu_k$ approaches the solution.

SLIDE 22

The exact solution of trust region step – The Model approach

Algorithm (Exact trust region algorithm)

µ, g, H assigned;
s ← (H + µI)⁻¹g;
while |‖s‖ − δ| > ε do
    -- compute the model
    s′ ← (H + µI)⁻¹s;
    Φ ← ‖s‖ − δ;  Φ′ ← −(sᵀs′)/‖s‖;
    α ← −(Φ + δ)²/Φ′;  β ← −(Φ + δ)/Φ′ − µ;
    -- update µ and s
    µ ← α/δ − β;
    s ← (H + µI)⁻¹g;
end while
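A Python transcription of this algorithm, under the same SPD hypothesis; clamping µ at zero and the iteration cap are added safeguards, not part of the slide:

```python
import numpy as np

def exact_mu(H, g, delta, mu=0.0, eps=1e-10, max_iter=50):
    """Rational-model iteration for ||s(mu)|| = delta.

    Assumes H is SPD and the constraint is active (||s(0)|| > delta).
    """
    n = len(g)
    s = np.linalg.solve(H + mu * np.eye(n), g)        # s <- (H+muI)^{-1} g
    for _ in range(max_iter):
        ns = np.linalg.norm(s)
        if abs(ns - delta) <= eps:
            break
        sp = np.linalg.solve(H + mu * np.eye(n), s)   # s' <- (H+muI)^{-1} s
        Phi = ns - delta
        dPhi = -(s @ sp) / ns                         # Phi' = -(s.s')/||s||
        alpha = -(Phi + delta) ** 2 / dPhi
        beta = -(Phi + delta) / dPhi - mu
        mu = max(alpha / delta - beta, 0.0)           # mu <- alpha/delta - beta
        s = np.linalg.solve(H + mu * np.eye(n), g)
    return mu, -s                                     # trust region step is -s
```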

SLIDE 23

The dogleg trust region step

Outline

1. The Trust Region method
2. The exact solution of trust region step
3. The dogleg trust region step

SLIDE 24

The dogleg trust region step – The DogLeg approach

The DogLeg approach (1/3)

Computing the $\mu$ such that $\|s(\mu)\| = \delta$, as in the exact trust region step, can be very expensive. An alternative was proposed by Powell:

M. J. D. Powell, "A hybrid method for nonlinear equations", in: Numerical Methods for Nonlinear Algebraic Equations, ed. Ph. Rabinowitz, Gordon and Breach, pages 87–114, 1970.

Instead of computing the curve $s(\mu)$ exactly, a piecewise linear approximation $s_{dl}(\mu)$ is used; this approximation also permits solving $\|s_{dl}(\mu)\| = \delta$ explicitly.

SLIDE 25

The dogleg trust region step – The DogLeg approach

The DogLeg approach (2/3)

From the definition $s(\mu) = -(H + \mu I)^{-1} g$ it follows that

$$s(0) = -H^{-1} g, \qquad \lim_{\mu\to\infty} \frac{s'(\mu)}{\|s'(\mu)\|} = \frac{g}{\|g\|},$$

i.e. the curve starts from the Newton step and shrinks to zero along the direction of the gradient step. The direction $-g$ is a descent direction, so the first piece of the piecewise approximation should be a straight line from $x$ to the minimum of $m_k(x - \lambda g)$. The minimum $\lambda^\star$ is found at

$$\lambda^\star = \frac{\|g\|^2}{g^T H g}.$$

Having reached the minimum in the $-g$ direction, we then go to the point $x + s(0) = x - H^{-1} g$ with another straight line.

SLIDE 26

The dogleg trust region step – The DogLeg approach

The DogLeg approach (3/3)

We denote by

$$s_g = -\frac{\|g\|^2}{g^T H g}\, g, \qquad s_n = -H^{-1} g$$

respectively the step of the unconstrained minimization in the gradient direction and in the Newton direction. The piecewise linear curve connecting $x + s_n$, $x + s_g$ and $x$ is the DogLeg curve¹:

$$x_{dl}(\mu) = x + s_{dl}(\mu), \qquad s_{dl}(\mu) = \begin{cases} \mu\, s_g + (1-\mu)\, s_n & \mu \in [0,1], \\ (2-\mu)\, s_g & \mu \in [1,2]. \end{cases}$$

¹ Notice that $s(\mu)$ is parametrized on the interval $[0,\infty)$ while $s_{dl}(\mu)$ is parametrized on $[0,2]$.

SLIDE 27

The dogleg trust region step – The DogLeg approach

Lemma

Consider the dogleg curve connecting $x + s_n$, $x + s_g$ and $x$, expressed as $x_{dl}(\mu) = x + s_{dl}(\mu)$ where

$$s_{dl}(\mu) = \begin{cases} \mu\, s_g + (1-\mu)\, s_n & \mu \in [0,1], \\ (2-\mu)\, s_g & \mu \in [1,2]. \end{cases}$$

If $s_g$ is not parallel to $s_n$, the function $d(\mu) = \|x_{dl}(\mu) - x\| = \|s_{dl}(\mu)\|$ is strictly monotone decreasing; moreover $s_{dl}(\mu)$ is a descent direction for all $\mu \in [0,2]$.

SLIDE 28

The dogleg trust region step – The DogLeg approach

Proof (1/5).

In order to have a unique solution to the problem $\|s_{dl}(\mu)\| = \delta$ we must have that $\|s_{dl}(\mu)\|$ is a monotone decreasing function:

$$\|s_{dl}(\mu)\|^2 = \begin{cases} \mu^2 \|s_g\|^2 + (1-\mu)^2 \|s_n\|^2 + 2\mu(1-\mu)\, s_g^T s_n & \mu \in [0,1], \\ (2-\mu)^2 \|s_g\|^2 & \mu \in [1,2]. \end{cases}$$

To check monotonicity we take the first derivative:

$$\frac{d}{d\mu}\|s_{dl}(\mu)\|^2 = \begin{cases} 2\mu \|s_g\|^2 - 2(1-\mu)\|s_n\|^2 + (2-4\mu)\, s_g^T s_n & \mu \in [0,1], \\ (2\mu - 4)\|s_g\|^2 & \mu \in [1,2], \end{cases}$$

$$= \begin{cases} 2\mu\bigl(\|s_g\|^2 + \|s_n\|^2 - 2\, s_g^T s_n\bigr) - 2\|s_n\|^2 + 2\, s_g^T s_n & \mu \in [0,1], \\ (2\mu - 4)\|s_g\|^2 & \mu \in [1,2]. \end{cases}$$

SLIDE 29

The dogleg trust region step – The DogLeg approach

Proof (2/5).

Notice that $(2\mu - 4) < 0$ for $\mu \in [1,2)$, so we only need to check that

$$2\mu\bigl(\|s_g\|^2 + \|s_n\|^2 - 2\, s_g^T s_n\bigr) - 2\|s_n\|^2 + 2\, s_g^T s_n < 0 \qquad \text{for } \mu \in [0,1].$$

From the Cauchy–Schwarz inequality we have

$$\|s_g\|^2 + \|s_n\|^2 - 2\, s_g^T s_n \ge \|s_g\|^2 + \|s_n\|^2 - 2\,\|s_g\|\,\|s_n\| = \bigl(\|s_g\| - \|s_n\|\bigr)^2 \ge 0,$$

so the left-hand side is nondecreasing in $\mu$ and it is enough to check the inequality at $\mu = 1$:

$$2\bigl(\|s_g\|^2 + \|s_n\|^2 - 2\, s_g^T s_n\bigr) - 2\|s_n\|^2 + 2\, s_g^T s_n = 2\|s_g\|^2 - 2\, s_g^T s_n,$$

i.e. we must check that $\|s_g\|^2 - s_g^T s_n < 0$.

SLIDE 30

The dogleg trust region step – The DogLeg approach

Proof (3/5).

From the definitions of $s_g$ and $s_n$ we have

$$\|s_g\|^2 - s_g^T s_n = \lambda_\star^2 \|g\|^2 - \lambda_\star\, g^T H^{-1} g = \lambda_\star \left( \frac{\|g\|^2}{g^T H g}\, \|g\|^2 - g^T H^{-1} g \right) = \frac{\lambda_\star}{g^T H g} \left( \|g\|^4 - (g^T H g)(g^T H^{-1} g) \right).$$

So we must prove that

$$\|g\|^4 < (g^T H g)(g^T H^{-1} g).$$

SLIDE 31

The dogleg trust region step – The DogLeg approach

Proof (4/5).

Expanding $g$ in a set of orthonormal eigenvectors of $H$, $g = \sum_{i=1}^n \alpha_i u_i$, the previous inequality becomes

$$\|g\|^4 = \left( \sum_{i=1}^n \alpha_i^2 \right)^2 = \left( \sum_{i=1}^n \bigl(\alpha_i \lambda_i^{1/2}\bigr)\bigl(\alpha_i \lambda_i^{-1/2}\bigr) \right)^2 \le \left( \sum_{i=1}^n \alpha_i^2 \lambda_i \right) \left( \sum_{i=1}^n \alpha_i^2 \lambda_i^{-1} \right) = \bigl(g^T H g\bigr)\bigl(g^T H^{-1} g\bigr).$$

By the Cauchy–Schwarz inequality the inequality is strict unless $\alpha_i \lambda_i = c\,\alpha_i$ for $i = 1, 2, \ldots, n$; this means $\lambda_i = c$ for all $i$ with $\alpha_i \ne 0$. This implies $H^{-1} g = c^{-1} g$, i.e. the Newton step and the gradient step are parallel, which is excluded by the lemma hypothesis.

SLIDE 32

The dogleg trust region step – The DogLeg approach

Proof (5/5).

To prove that $s_{dl}(\mu)$ is a descent direction it is enough to notice that:

• for $\mu \in [0,1]$ the direction $s_{dl}(\mu)$ is a convex combination of $s_g$ and $s_n$;
• for $\mu \in [1,2)$ the direction $s_{dl}(\mu)$ is parallel to $s_g$;

so it is enough to verify that $s_g$ and $s_n$ are descent directions. For $s_g$ we have

$$s_g^T g = -\lambda_\star\, g^T g < 0,$$

and for $s_n$ we have

$$s_n^T g = -g^T H^{-1} g < 0.$$

SLIDE 33

The dogleg trust region step – The DogLeg approach

Using the previous Lemma we can prove:

Lemma

If $\|s_{dl}(0)\| \ge \delta$ then there is a unique $\mu \in [0,2]$ such that $\|s_{dl}(\mu)\| = \delta$.

Proof.

It is enough to notice that $\|s_{dl}(2)\| = 0$ and that $\|s_{dl}(\mu)\|$ is strictly monotonically decreasing.

The approximate solution of the constrained minimization can be obtained by this simple algorithm:

1. if $\delta \le \|s_g\|$ we set $s_{dl} = \delta\, s_g / \|s_g\| = -\delta\, g / \|g\|$;
2. if $\|s_g\| < \delta \le \|s_n\|$ we set $s_{dl} = \alpha s_g + (1-\alpha) s_n$, where $\alpha$ is the root in the interval $[0,1]$ of

$$\alpha^2 \|s_g\|^2 + (1-\alpha)^2 \|s_n\|^2 + 2\alpha(1-\alpha)\, s_g^T s_n = \delta^2;$$

3. if $\delta > \|s_n\|$ we set $s_{dl} = s_n$.

SLIDE 34

The dogleg trust region step – The DogLeg approach

Solving

$$\alpha^2 \|s_g\|^2 + (1-\alpha)^2 \|s_n\|^2 + 2\alpha(1-\alpha)\, s_g^T s_n = \delta^2,$$

we have that if $\|s_g\| \le \delta \le \|s_n\|$ the root in $[0,1]$ is given by:

$$\Delta = \|s_g\|^2 + \|s_n\|^2 - 2\, s_g^T s_n = \|s_g - s_n\|^2,$$

$$\alpha = \frac{\|s_n\|^2 - s_g^T s_n - \sqrt{(s_g^T s_n)^2 - \|s_g\|^2 \|s_n\|^2 + \delta^2 \Delta}}{\Delta}.$$

To avoid cancellation, the computational formula is the following:

$$\alpha = \frac{1}{\Delta}\cdot \frac{\|s_n\|^4 - 2\, s_g^T s_n\, \|s_n\|^2 + \|s_g\|^2 \|s_n\|^2 - \delta^2 \Delta}{\|s_n\|^2 - s_g^T s_n + \sqrt{(s_g^T s_n)^2 - \|s_g\|^2 \|s_n\|^2 + \delta^2 \Delta}} = \frac{\|s_n\|^2 - \delta^2}{\|s_n\|^2 - s_g^T s_n + \sqrt{(s_g^T s_n)^2 - \|s_g\|^2 \|s_n\|^2 + \delta^2\, \|s_g - s_n\|^2}}.$$

SLIDE 35

The dogleg trust region step – The DogLeg approach

Algorithm (Computing DogLeg step)

dogleg(s_g, s_n, δ):
    a ← ‖s_g‖²;  b ← ‖s_n‖²;  c ← ‖s_g − s_n‖²;
    d ← (a + b − c)/2;                             -- d = s_gᵀs_n
    α ← (b − δ²) / (b − d + √(d² − ab + δ²c));
    s_dl ← α s_g + (1 − α) s_n;
    return s_dl;
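A Python sketch combining this computation with the case selection from slide 33 (the input names follow the slide; the degenerate case of parallel $s_g$ and $s_n$ is not special-cased):

```python
import numpy as np

def dogleg(sg, sn, delta):
    """DogLeg step: sg Cauchy (gradient) step, sn Newton step, delta radius."""
    if np.linalg.norm(sn) <= delta:
        return sn                               # Newton step already inside
    if np.linalg.norm(sg) >= delta:
        return delta * sg / np.linalg.norm(sg)  # clipped gradient step
    a = sg @ sg                                 # ||sg||^2
    b = sn @ sn                                 # ||sn||^2
    c = (sg - sn) @ (sg - sn)                   # ||sg - sn||^2
    d = 0.5 * (a + b - c)                       # d = sg . sn
    # cancellation-free root in [0,1] of ||a sg + (1-a) sn|| = delta
    alpha = (b - delta**2) / (b - d + np.sqrt(d*d - a*b + delta**2 * c))
    return alpha * sg + (1.0 - alpha) * sn
```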

SLIDE 36

The dogleg trust region step – The DogLeg approach

References

• J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, Springer-Verlag, Texts in Applied Mathematics 12, 2002.

• J. E. Dennis, Jr. and R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, SIAM, Classics in Applied Mathematics 16, 1996.
