SLIDE 1

Introduction to Quasi-Newton Methods: Local Convergence Theory

NLO: June 13, 2013

Dr. Thomas M. Surowiec
Humboldt University of Berlin, Department of Mathematics

Summer 2013

BMS Course NLO, Summer 2013

SLIDE 2

Motivation

The only real difference between quasi-Newton (QN) methods and the classical Newton method is the use of second derivatives in the latter. As the calculation of the Hessian can be quite expensive computationally, one tries to avoid this.

Basic Idea:

1. Find Hk such that Hk dk = −∇f(xk).

2. Do a line search to obtain αk and set xk+1 = xk + αk dk.

3. Use xk, xk+1, Hk to obtain the update Hk+1.
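The three steps above can be sketched as a generic loop in Python. This is an illustrative sketch, not code from the lecture: the update rule is passed in as a function (here the BFGS formula from a later slide), and the Armijo backtracking line search and the test function are my own assumptions.

```python
import numpy as np

def quasi_newton(f, grad, x0, update, tol=1e-8, max_iter=100):
    """Generic quasi-Newton loop; the Hessian approximation H_k is
    maintained by the caller-supplied `update` rule."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(len(x))                      # simple symmetric pos. def. start
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        d = np.linalg.solve(H, -g)          # step 1: H_k d_k = -grad f(x_k)
        alpha, x_new = 1.0, x + d           # step 2: Armijo backtracking
        while f(x_new) > f(x) + 1e-4 * alpha * (g @ d):
            alpha *= 0.5
            x_new = x + alpha * d
        H = update(H, x_new - x, grad(x_new) - g)  # step 3: H_{k+1}
        x = x_new
    return x

def bfgs(H, s, y):
    """BFGS rank-2 update, used here as the `update` rule."""
    return (H + np.outer(y, y) / (y @ s)
              - np.outer(H @ s, H @ s) / (s @ H @ s))

# Hypothetical test problem: minimize ||x||^2
x_star = quasi_newton(lambda x: (x**2).sum(), lambda x: 2 * x,
                      np.array([3.0, -2.0]), bfgs)
```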


SLIDE 3

Motivation

For the initial matrix, we choose a simple, symmetric pos. def. matrix, oftentimes I. Some of the benefits of this method include:

1. Only 1st derivatives are needed.

2. Hk will (be chosen to) always be pos. def., so dk is a descent direction.

3. Some variants only require O(n²) multiplications per iteration.

Not all QN methods can guarantee that Hk is pos. def. One speaks of variable metric methods when Hk is always pos. def.


SLIDE 4

Deriving Update Rules

Starting with a pos. def., symmetric matrix, one chooses a simple ansatz such that a condition called the QN condition holds. Let

sa = αa da ( = −αa (Ha)−1 ∇f(xa) ),   ya = ∇f(x+) − ∇f(xa),   x+ = xa + sa.

Using Taylor's theorem, we can motivate the following important condition, known as the quasi-Newton or secant condition: any updated Hessian approximation H+ should satisfy

H+ sa = ya.
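For a quadratic f(x) = ½ xT A x, the secant condition is exact: ya = A sa for any step, so the true Hessian A itself satisfies H+ sa = ya. A quick numerical check (illustrative; the matrix and step are arbitrary example data):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
A = A @ A.T + 3 * np.eye(3)       # symmetric pos. def. Hessian of f(x) = 0.5 x^T A x
grad = lambda x: A @ x            # gradient of the quadratic

x_a = rng.standard_normal(3)
s_a = rng.standard_normal(3)      # an arbitrary step
x_plus = x_a + s_a
y_a = grad(x_plus) - grad(x_a)    # gradient difference

# Secant condition H+ s_a = y_a, satisfied exactly by H+ = A:
assert np.allclose(A @ s_a, y_a)
```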


SLIDE 5

Deriving Update Rules

Using α ∈ R, u ∈ Rn and the ansatz H+ = Ha + α u uT, we get the symmetric rank-1 update:

H+ = Ha + (ya − Ha sa)(ya − Ha sa)T / ((ya − Ha sa)T sa)

Drawbacks:

1. H+ is not necessarily pos. def.

2. If ya − Ha sa ≈ 0 or (ya − Ha sa)T sa ≈ 0, then numerical problems appear.

Using α ∈ R, u, v ∈ Rn and the ansatz H+ = Ha + α u vT, we get the non-symmetric rank-1 update:

H+ = Ha + (ya − Ha sa) saT / (saT sa)

This update is decidedly disadvantageous.
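The first drawback can be seen numerically. A small sketch with my own example data: the symmetric rank-1 update satisfies the secant condition, yet the result here is indefinite.

```python
import numpy as np

def sr1(H, s, y):
    """Symmetric rank-1 update; assumes (y - H s)^T s is not near zero."""
    r = y - H @ s
    return H + np.outer(r, r) / (r @ s)

H = np.eye(2)                     # pos. def. starting matrix
s = np.array([1.0, 0.0])
y = np.array([-1.0, 0.5])         # example data chosen to break definiteness

H_plus = sr1(H, s, y)
assert np.allclose(H_plus @ s, y)                # secant condition holds
assert np.linalg.eigvalsh(H_plus).min() < 0      # but H+ is indefinite
```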


SLIDE 6

Deriving Update Rules: BFGS & DFP

Using α, β ∈ R, u, v ∈ Rn and the ansatz H+ = Ha + α u uT + β v vT, we get the symmetric rank-2 update:

H+ = Ha + ya yaT / (yaT sa) − (Ha sa)(Ha sa)T / (saT Ha sa)

This is known as the "Broyden–Fletcher–Goldfarb–Shanno" (BFGS) update; it will be our main focus. One can also directly update the inverse:

B+ = Ba + sa saT / (saT ya) − (Ba ya)(Ba ya)T / (yaT Ba ya)

This symmetric rank-2 update of the inverse is known as the "Davidon–Fletcher–Powell" (DFP) update. It reduces the theoretical bound for the number of multiplications needed for the calculation. However, a large body of experimental evidence shows that BFGS outperforms this method.
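Both formulas can be checked directly: BFGS satisfies the secant condition H+ sa = ya, while the DFP inverse update satisfies the corresponding inverse condition B+ ya = sa. A sketch with arbitrary example data (the starting matrices and vectors are my own assumptions):

```python
import numpy as np

def bfgs_update(H, s, y):
    """BFGS rank-2 update of the Hessian approximation H."""
    return (H + np.outer(y, y) / (y @ s)
              - np.outer(H @ s, H @ s) / (s @ H @ s))

def dfp_inverse_update(B, s, y):
    """DFP rank-2 update of the *inverse* approximation B."""
    return (B + np.outer(s, s) / (s @ y)
              - np.outer(B @ y, B @ y) / (y @ B @ y))

rng = np.random.default_rng(1)
H, B = np.eye(3), np.eye(3)
s = rng.standard_normal(3)
y = s + 0.1 * rng.standard_normal(3)   # perturbation kept small so y^T s > 0

H_plus = bfgs_update(H, s, y)
B_plus = dfp_inverse_update(B, s, y)

assert np.allclose(H_plus @ s, y)      # secant condition for H+
assert np.allclose(B_plus @ y, s)      # inverse secant condition for B+
```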


SLIDE 7

Properties of BFGS

Lemma 1.1. Let Ha ∈ Sn be positive definite, yaT sa > 0, and let H+ be determined according to BFGS. Then H+ ∈ Sn is positive definite.

Proof. On the board.
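Since the proof is left for the board, here is a randomized sanity check of the lemma (illustrative; the dimension and test data are my own assumptions): whenever yaT sa > 0, the BFGS update of a pos. def. matrix remains pos. def.

```python
import numpy as np

def bfgs_update(H, s, y):
    """BFGS rank-2 update of the Hessian approximation H."""
    return (H + np.outer(y, y) / (y @ s)
              - np.outer(H @ s, H @ s) / (s @ H @ s))

rng = np.random.default_rng(2)
for _ in range(50):
    M = rng.standard_normal((4, 4))
    H = M @ M.T + np.eye(4)                # random symmetric pos. def. H_a
    s = rng.standard_normal(4)
    y = s + 0.1 * rng.standard_normal(4)   # small perturbation keeps y^T s > 0
    assert y @ s > 0                       # hypothesis of Lemma 1.1
    H_plus = bfgs_update(H, s, y)
    # Conclusion of the lemma: all eigenvalues of H+ are positive.
    assert np.all(np.linalg.eigvalsh(H_plus) > 0)
```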


SLIDE 8

Invariance under Affine Transformations

By showing that the BFGS and Newton methods are invariant under affine transformations, we greatly simplify the analysis in the coming days. In particular, we will be able to assume that ∇²f(x∗) = I.


SLIDE 9

Central Convergence Result

Theorem 2.1. Let (A) be satisfied. Then ∃δ > 0 such that for ||x0 − x∗|| ≤ δ and ||H0 − ∇²f(x∗)|| ≤ δ, the BFGS method is well-defined and converges q-superlinearly to x∗.

The proof requires a number of observations and auxiliary results.


SLIDE 10

Proving the Convergence Result

Lemma 2.1. Let (A) be satisfied. If Ha ∈ Sn is positive definite and x+ = xa − (Ha)−1 ∇f(xa), then there exists a δ0 > 0 such that for 0 < ||xa − x∗|| ≤ δ0 and ||Fa|| ≤ δ0 it holds that yaT sa > 0. Moreover, if H+ is the BFGS update of Ha, then it follows that

F+ = (H+)−1 − I = (I − wa waT) Fa (I − wa waT) + Da

with wa = sa/||sa||, Da ∈ Rn×n, and ||Da|| ≤ KD ||sa|| with KD > 0.

Here Fa = (Ha)−1 − I always denotes the error of the inverse of the current ("a" for aktuell) approximation with respect to the Hessian at the solution, which by the affine invariance above we may take to be I.


SLIDE 11

Proving the Convergence Result

Corollary 2.1. Under the same assumptions, one has

||F+|| ≤ ||Fa|| + KD ||sa|| ≤ ||Fa|| + KD (||xa − x∗|| + ||x+ − x∗||)

Proof. Since we are working in finite dimensions, we may take any norm on the space Rn×n; we let || · || be the Frobenius norm. Expanding the term (I − wa waT) Fa (I − wa waT) and estimating ||(I − wa waT) Fa (I − wa waT)||, we obtain ||F+|| ≤ ||Fa|| + KD ||sa|| for some KD > 0. The second inequality follows from the triangle inequality applied to sa = (x+ − x∗) − (xa − x∗).
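The key estimate here is that the projection I − wa waT cannot increase the Frobenius norm, since its spectral norm is 1. A quick numerical illustration (my own example data):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
F = rng.standard_normal((n, n))        # an arbitrary error matrix F_a
w = rng.standard_normal(n)
w /= np.linalg.norm(w)                 # unit vector w_a = s_a / ||s_a||
P = np.eye(n) - np.outer(w, w)         # orthogonal projector I - w_a w_a^T

# Sandwiching by the projector never grows the Frobenius norm
# (np.linalg.norm defaults to the Frobenius norm for matrices):
assert np.linalg.norm(P @ F @ P) <= np.linalg.norm(F)
```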
