
NLO: June 13, 2013 Dr. Thomas M. Surowiec Humboldt University of - PowerPoint PPT Presentation



  1. Introduction to Quasi-Newton Methods Local Convergence Theory NLO: June 13, 2013 Dr. Thomas M. Surowiec Humboldt University of Berlin Department of Mathematics Summer 2013 Dr. Thomas M. Surowiec BMS Course NLO, Summer 2013

  2. Motivation

     The only real difference between quasi-Newton (QN) methods and the classical Newton method is that the latter uses second derivatives. Since computing the Hessian can be computationally expensive, QN methods try to avoid it.

     Basic idea:

     1. Given $H_k$, find $d_k$ such that $H_k d_k = -\nabla f(x_k)$.
     2. Do a line search to obtain $\alpha_k$ and set $x_{k+1} = x_k + \alpha_k d_k$.
     3. Use $x_k$, $x_{k+1}$, $H_k$ to obtain the update $H_{k+1}$.
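The three steps of the basic idea can be sketched as a short loop. This is a minimal sketch under stated assumptions, not the course's implementation: the names `quasi_newton` and `bfgs_update`, the backtracking (Armijo) line search with constant `1e-4`, the stopping tolerance, and the quadratic test problem are all chosen here for illustration. The update rule is passed in as a function; the example plugs in the BFGS formula that appears later in these slides.

```python
import numpy as np


def bfgs_update(H, s, y):
    """BFGS update of the Hessian approximation (formula from a later slide)."""
    ys = y @ s
    if ys <= 0.0:  # assumed safeguard: skip the update if y^T s is not positive
        return H
    Hs = H @ s
    return H + np.outer(y, y) / ys - np.outer(Hs, Hs) / (s @ Hs)


def quasi_newton(f, grad, x0, update, tol=1e-6, max_iter=200):
    """Generic quasi-Newton loop; `update(H, s, y)` returns H_{k+1}."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(x.size)  # simple symmetric positive definite start
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        # Step 1: solve H_k d_k = -grad f(x_k).
        d = np.linalg.solve(H, -g)
        # Step 2: backtracking line search (Armijo condition, assumed here).
        alpha = 1.0
        while f(x + alpha * d) > f(x) + 1e-4 * alpha * (g @ d) and alpha > 1e-12:
            alpha *= 0.5
        x_new = x + alpha * d
        # Step 3: update H_k from the step and the change in gradient.
        H = update(H, x_new - x, grad(x_new) - g)
        x = x_new
    return x


# Illustrative strictly convex quadratic: f(x) = 0.5 x^T A x - b^T x.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_min = quasi_newton(lambda x: 0.5 * x @ A @ x - b @ x,
                     lambda x: A @ x - b,
                     np.zeros(2), bfgs_update)
print(x_min, np.linalg.solve(A, b))
```

Passing `update` as an argument keeps the loop independent of the particular update rule, so the rank-1 and rank-2 formulas from the later slides could be swapped in without touching steps 1 and 2.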

  3. Motivation

     For the initial matrix $H_0$, we choose a simple symmetric positive definite matrix, oftentimes $I$. Some of the benefits of this method include:

     1. Only first derivatives are needed.
     2. $H_k$ is (will be chosen to be) always positive definite, so $d_k$ is a descent direction.
     3. Some variants require only $O(n^2)$ multiplications per iteration.

     Not all QN methods can guarantee that $H_k$ is positive definite. One speaks of variable metric methods when $H_k$ is always positive definite.

  4. Deriving Update Rules

     Starting with a symmetric positive definite matrix, one chooses a simple ansatz such that a condition called the QN condition holds. Let

     $$s_a = \alpha_a d_a \; \bigl(= -\alpha_a H_a^{-1} \nabla f(x_a)\bigr), \qquad y_a = \nabla f(x_+) - \nabla f(x_a), \qquad x_+ = x_a + s_a.$$

     Using Taylor's theorem, we can motivate the following important condition, known as the quasi-Newton or secant condition: any updated Hessian approximation $H_+$ should satisfy

     $$H_+ s_a = y_a.$$
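A small numerical check can make the Taylor motivation concrete. By Taylor's theorem, $y_a = \nabla^2 f(x_+) s_a + O(\|s_a\|^2)$ for smooth $f$, so the exact Hessian at $x_+$ satisfies the secant condition up to a second-order residual. The test function, the point `x_a`, and the step direction below are hypothetical choices made only for this illustration.

```python
import numpy as np

A = np.array([[2.0, 0.5], [0.5, 1.0]])


def grad(x):
    """Gradient of the illustrative function f(x) = 0.5 x^T A x + 0.25 * sum(x^4)."""
    return A @ x + x**3


def hess(x):
    """Hessian of the same function."""
    return A + np.diag(3.0 * x**2)


x_a = np.array([1.0, -2.0])
direction = np.array([1.0, 1.0])

errors = []
for t in (1e-1, 1e-2, 1e-3):
    s = t * direction
    x_plus = x_a + s
    y = grad(x_plus) - grad(x_a)
    # Residual of the secant condition when H_+ is the exact Hessian at x_+.
    errors.append(np.linalg.norm(hess(x_plus) @ s - y))

print(errors)  # each tenfold reduction of s shrinks the residual about 100-fold
```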

  5. Deriving Update Rules

     Using $\alpha \in \mathbb{R}$, $u \in \mathbb{R}^n$ and the ansatz $H_+ = H_a + \alpha u u^T$, we get the symmetric rank-1 update:

     $$H_+ = H_a + \frac{(y_a - H_a s_a)(y_a - H_a s_a)^T}{(y_a - H_a s_a)^T s_a}$$

     Drawbacks:

     1. $H_+$ is not necessarily positive definite.
     2. If $y_a - H_a s_a \approx 0$ or $(y_a - H_a s_a)^T s_a \approx 0$, then numerical problems appear.

     Using $\alpha \in \mathbb{R}$, $u, v \in \mathbb{R}^n$ and the ansatz $H_+ = H_a + \alpha u v^T$, we get the non-symmetric rank-1 update:

     $$H_+ = H_a + \frac{(y_a - H_a s_a) s_a^T}{s_a^T s_a}$$

     This update is decidedly disadvantageous: in general, $H_+$ is no longer symmetric.
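Both drawbacks of the symmetric rank-1 update can be seen in a few lines. The sketch below is illustrative only: the function name `sr1_update`, the skip rule with threshold `r`, and the sample data are assumptions, not part of the slides. The skip rule is one common way to sidestep the small-denominator problem; the sample data show a case with $y_a^T s_a > 0$ in which $H_+$ still becomes indefinite.

```python
import numpy as np


def sr1_update(H, s, y, r=1e-8):
    """Symmetric rank-1 update with a skip rule for tiny denominators."""
    v = y - H @ s
    denom = v @ s
    # Assumed safeguard: leave H unchanged when the denominator is negligible.
    if abs(denom) <= r * np.linalg.norm(v) * np.linalg.norm(s):
        return H
    return H + np.outer(v, v) / denom


H_a = np.eye(2)
s_a = np.array([1.0, 0.0])
y_a = np.array([0.5, 1.0])  # note y_a^T s_a = 0.5 > 0

H_plus = sr1_update(H_a, s_a, y_a)
print(H_plus @ s_a)                # equals y_a: the secant condition holds
print(np.linalg.eigvalsh(H_plus))  # one eigenvalue is negative
```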

  6. Deriving Update Rules: BFGS & DFP

     Using $\alpha, \beta \in \mathbb{R}$, $u, v \in \mathbb{R}^n$ and the ansatz $H_+ = H_a + \alpha u u^T + \beta v v^T$, we get the symmetric rank-2 update:

     $$H_+ = H_a + \frac{y_a y_a^T}{y_a^T s_a} - \frac{(H_a s_a)(H_a s_a)^T}{s_a^T H_a s_a}$$

     This is known as the “Broyden-Fletcher-Goldfarb-Shanno” (BFGS) update; it will be our main focus. One can also directly update an approximation $B_a$ of the inverse:

     $$B_+ = B_a + \frac{s_a s_a^T}{s_a^T y_a} - \frac{(B_a y_a)(B_a y_a)^T}{y_a^T B_a y_a}$$

     This symmetric rank-2 update of the inverse is known as the “Davidon-Fletcher-Powell” (DFP) update. Updating the inverse directly reduces the theoretical bound for the number of multiplications needed, because the search direction $d_a = -B_a \nabla f(x_a)$ is then a matrix-vector product rather than the solution of a linear system. However, a large body of experimental evidence shows that BFGS outperforms this method.
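The two rank-2 formulas translate almost literally into NumPy. The following is a minimal sketch; the function names `bfgs_update` and `dfp_inverse_update` and the sample matrices are assumptions for illustration. It checks that the BFGS matrix satisfies $H_+ s_a = y_a$ and that the DFP inverse update satisfies the corresponding condition $B_+ y_a = s_a$.

```python
import numpy as np


def bfgs_update(H, s, y):
    """BFGS update of the Hessian approximation H."""
    Hs = H @ s
    return H + np.outer(y, y) / (y @ s) - np.outer(Hs, Hs) / (s @ Hs)


def dfp_inverse_update(B, s, y):
    """DFP update of the inverse Hessian approximation B."""
    By = B @ y
    return B + np.outer(s, s) / (s @ y) - np.outer(By, By) / (y @ By)


H_a = np.array([[2.0, 0.5], [0.5, 1.0]])
B_a = np.linalg.inv(H_a)
s_a = np.array([1.0, -1.0])
y_a = np.array([0.5, -2.0])  # y_a^T s_a = 2.5 > 0

H_plus = bfgs_update(H_a, s_a, y_a)
B_plus = dfp_inverse_update(B_a, s_a, y_a)

print(np.allclose(H_plus @ s_a, y_a))  # secant condition H_+ s_a = y_a
print(np.allclose(B_plus @ y_a, s_a))  # inverse form B_+ y_a = s_a
```

Note that `B_plus` is not the inverse of `H_plus`: BFGS and DFP are different updates, even when started from matching matrices.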

  7. Properties of BFGS

     Lemma 1.1. Let $H_a \in S^n$ be positive definite, $y_a^T s_a > 0$, and $H_+$ determined according to BFGS. Then $H_+ \in S^n$ is positive definite.

     Proof. On the board.
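Since the proof is left to the board, a numerical sanity check of the lemma may be useful. It is an experiment, not a proof. The random test data, the seed, and the helper name `bfgs_update` are assumptions. The last lines illustrate why the hypothesis $y_a^T s_a > 0$ cannot be dropped: a direct computation with the BFGS formula gives $s_a^T H_+ s_a = y_a^T s_a$.

```python
import numpy as np


def bfgs_update(H, s, y):
    """BFGS update of the Hessian approximation H."""
    Hs = H @ s
    return H + np.outer(y, y) / (y @ s) - np.outer(Hs, Hs) / (s @ Hs)


rng = np.random.default_rng(0)
n = 4
min_eigs = []
for _ in range(100):
    M = rng.standard_normal((n, n))
    H_a = M @ M.T + np.eye(n)          # random symmetric positive definite matrix
    N = rng.standard_normal((n, n))
    s_a = rng.standard_normal(n)
    y_a = (N @ N.T + np.eye(n)) @ s_a  # guarantees y_a^T s_a > 0
    min_eigs.append(np.linalg.eigvalsh(bfgs_update(H_a, s_a, y_a)).min())

print(min(min_eigs) > 0)  # smallest eigenvalue stays positive in every trial

# With y_a^T s_a < 0 positive definiteness is lost, since s^T H_+ s = y^T s.
H_bad = bfgs_update(np.eye(2), np.array([1.0, 0.0]), np.array([-1.0, 0.0]))
print(np.linalg.eigvalsh(H_bad).min())
```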

  8. Invariance under Affine Transformations

     By showing that the BFGS and Newton methods are invariant under affine transformations, we greatly simplify the analysis in the coming days. In particular, we will be able to assume that $\nabla^2 f(x^*) = I$.
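The invariance claim can be checked numerically for BFGS with unit step length. The sketch below assumes a linear change of variables $x = Tz$, the transformed function $\tilde f(z) = f(Tz)$, and the transformed start matrix $\tilde H_0 = T^T H_0 T$; under these assumptions the iterates should satisfy $x_k = T z_k$ and $\tilde H_k = T^T H_k T$. The test function, the matrix `T`, and the helper `bfgs_step` are hypothetical choices for illustration.

```python
import numpy as np

A = np.array([[2.0, 0.5], [0.5, 1.0]])
T = np.array([[2.0, 1.0], [0.0, 1.0]])  # invertible linear map x = T z


def grad_f(x):
    """Gradient of the illustrative function f(x) = 0.5 x^T A x + 0.25 * sum(x^4)."""
    return A @ x + x**3


def grad_f_tilde(z):
    """Gradient of the transformed function f~(z) = f(T z), by the chain rule."""
    return T.T @ grad_f(T @ z)


def bfgs_step(grad, x, H):
    """One BFGS iteration with unit step length; returns the new (x, H)."""
    g = grad(x)
    s = np.linalg.solve(H, -g)
    y = grad(x + s) - g
    Hs = H @ s
    return x + s, H + np.outer(y, y) / (y @ s) - np.outer(Hs, Hs) / (s @ Hs)


x, H = np.array([0.1, -0.1]), A + 0.05 * np.eye(2)
z, H_tilde = np.linalg.solve(T, x), T.T @ H @ T

matches = []
for _ in range(4):
    x, H = bfgs_step(grad_f, x, H)
    z, H_tilde = bfgs_step(grad_f_tilde, z, H_tilde)
    matches.append(bool(np.allclose(T @ z, x, rtol=1e-6, atol=1e-10)
                        and np.allclose(T.T @ H @ T, H_tilde, rtol=1e-6)))

print(matches)  # the two runs agree up to rounding in every iteration
```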

  9. Central Convergence Result

     Theorem 2.1. Let (A) be satisfied. Then $\exists\, \delta > 0$ such that for $\|x_0 - x^*\| \le \delta$ and $\|H_0 - \nabla^2 f(x^*)\| \le \delta$, the BFGS method is well-defined and converges q-superlinearly to $x^*$.

     The proof requires a number of observations and auxiliary results.
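The local setting of the theorem can be illustrated with a small experiment: start close to $x^*$ with $H_0$ close to $\nabla^2 f(x^*)$ and run BFGS with unit step length. This is only a sketch on a hypothetical test function with minimizer $x^* = 0$; the starting point, the perturbation `0.05`, and the stopping tolerance are assumptions, and the experiment does not verify assumption (A) of the slides.

```python
import numpy as np

A = np.array([[2.0, 0.5], [0.5, 1.0]])  # Hessian at the minimizer x* = 0


def grad(x):
    """Gradient of the illustrative function f(x) = 0.5 x^T A x + 0.25 * sum(x^4)."""
    return A @ x + x**3


x = np.array([0.1, -0.1])   # x_0 close to x* = 0
H = A + 0.05 * np.eye(2)    # H_0 close to the Hessian at x*
errors = [np.linalg.norm(x)]

for _ in range(50):
    g = grad(x)
    if np.linalg.norm(g) < 1e-12:
        break
    s = np.linalg.solve(H, -g)  # local method: unit step length, no line search
    y = grad(x + s) - g
    Hs = H @ s
    H = H + np.outer(y, y) / (y @ s) - np.outer(Hs, Hs) / (s @ Hs)
    x = x + s
    errors.append(np.linalg.norm(x))

ratios = [e1 / e0 for e0, e1 in zip(errors, errors[1:])]
print(ratios)  # q-superlinear convergence means these ratios tend to zero
```

Here `ratios` holds the quotients $\|x_{k+1} - x^*\| / \|x_k - x^*\|$; q-superlinear convergence is the statement that this sequence converges to zero.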

  10. Proving the Convergence Result

      Lemma 2.1. Let (A) be satisfied. If $H_a \in S^n$ is positive definite and $x_+ = x_a - H_a^{-1} \nabla f(x_a)$, then there exists a $\delta_0 > 0$ such that for $0 < \|x_a - x^*\| \le \delta_0$ and $\|F_a\| \le \delta_0$ it holds that $y_a^T s_a > 0$. Moreover, if $H_+$ is the BFGS update of $H_a$, then it follows that

      $$F_+ = H_+^{-1} - I = (I - w_a w_a^T) F_a (I - w_a w_a^T) + D_a$$

      with $w_a = s_a / \|s_a\|$, $D_a \in \mathbb{R}^{n \times n}$, and $\|D_a\| \le K_D \|s_a\|$ with $K_D > 0$.

      $F_a$ always denotes the error of the inverse of the current ("a" for "aktuell", German for "current") approximation with respect to the Hessian at the solution, that is, $F_a = H_a^{-1} - I$ under the normalization $\nabla^2 f(x^*) = I$.

  11. Proving the Convergence Result

      Corollary 2.1. Under the same assumptions, one has

      $$\|F_+\| \le \|F_a\| + K_D \|s_a\| \le \|F_a\| + K_D \bigl(\|x_a - x^*\| + \|x_+ - x^*\|\bigr).$$

      Proof. Since we are working in finite dimensions, we may take any norm for the space $\mathbb{R}^{n \times n}$. Therefore, we let $\|\cdot\|$ be the Frobenius norm. By expanding the term $(I - w_a w_a^T) F_a (I - w_a w_a^T)$ and considering $\|(I - w_a w_a^T) F_a (I - w_a w_a^T)\|^2$, we obtain $\|F_+\| \le \|F_a\| + K_D \|s_a\|$ for some $K_D > 0$. The rest follows from the triangle inequality applied to $s_a = (x_+ - x^*) - (x_a - x^*)$.
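The expansion step in the proof can be checked numerically. For symmetric $F$ and a unit vector $w$, one way to carry out the expansion (the slides do not spell it out, so this particular form is an assumption about the intended argument) gives the identity $\|(I - ww^T) F (I - ww^T)\|_F^2 = \|F\|_F^2 - 2\|Fw\|^2 + (w^T F w)^2$. Because $(w^T F w)^2 \le \|Fw\|^2$, the projected matrix has Frobenius norm at most $\|F\|_F$. The random data and seed below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
M = rng.standard_normal((n, n))
F = 0.5 * (M + M.T)             # symmetric, like F_a
w = rng.standard_normal(n)
w /= np.linalg.norm(w)          # unit vector, like w_a = s_a / ||s_a||

P = np.eye(n) - np.outer(w, w)  # orthogonal projector I - w w^T
lhs = np.linalg.norm(P @ F @ P, "fro") ** 2
rhs = (np.linalg.norm(F, "fro") ** 2
       - 2.0 * np.linalg.norm(F @ w) ** 2
       + (w @ F @ w) ** 2)

print(np.isclose(lhs, rhs))                  # the expanded identity
print(lhs <= np.linalg.norm(F, "fro") ** 2)  # hence no increase in norm
```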
