  1. Part 3: Metrics of algorithmic complexity (Wolfgang Bangerth)

  2. Outline of optimization algorithms
     All algorithms to find minima of f(x) do so iteratively:
     - start at a point x_0
     - for k = 1, 2, ...:
       - compute an update direction p_k
       - compute a step length α_k
       - set x_k = x_{k−1} + α_k p_k
       - set k = k + 1
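The generic loop above can be sketched in Python. This is a minimal illustration: the steepest-descent direction p_k = −∇f(x_{k−1}) and the fixed step length are my own illustrative choices, not prescribed by the slide.

```python
import numpy as np

def minimize(grad, x0, alpha=0.1, max_iter=1000, tol=1e-8):
    """Generic iterative minimization loop from the slide:
    pick a direction p_k and a step length alpha_k, then update x."""
    x = np.asarray(x0, dtype=float)
    for k in range(1, max_iter + 1):
        p = -grad(x)                 # update direction p_k (steepest descent here)
        if np.linalg.norm(p) < tol:  # stop once the gradient is (almost) zero
            break
        x = x + alpha * p            # x_k = x_{k-1} + alpha_k * p_k
    return x

# Example: f(x) = (x - 2)^2 has minimizer x* = 2, and grad f(x) = 2 (x - 2)
x_min = minimize(lambda x: 2 * (x - 2), x0=[0.0])
```

Later slides replace the steepest-descent direction and the fixed step length with better choices; the surrounding loop stays the same.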

  3. Outline of optimization algorithms
     All algorithms to find minima of f(x) do so iteratively:
     - start at a point x_0
     - for k = 1, 2, ...:
       - compute an update direction p_k
       - compute a step length α_k
       - set x_k = x_{k−1} + α_k p_k
       - set k = k + 1
     Questions:
     - If x* is the minimizer that we are seeking, does x_k → x*?
     - How many iterations does it take for ‖x_k − x*‖ ≤ ε?
     - How expensive is every iteration?

  4. How expensive is every iteration?
     The cost of optimization algorithms is dominated by evaluating f(x), g(x), h(x) and derivatives:
     - Traffic light example: evaluating f(x) requires us to sit at an intersection for an hour, counting cars.
     - Designing air foils: testing an improved wing design in a wind tunnel costs millions of dollars.

  5. How expensive is every iteration?
     Example: Boeing wing design
     - Boeing 767 (1980s): 50+ wing designs tested in wind tunnel
     - Boeing 777 (1990s): 18 wing designs tested in wind tunnel
     - Boeing 787 (2000s): 10 wing designs tested in wind tunnel
     Planes today are 30% more efficient than those developed in the 1970s. Optimization in the wind tunnel and in silico made that happen but is very expensive.

  6. How expensive is every iteration?
     Practical algorithms:
     To determine the search direction p_k:
     - The gradient (steepest descent) method requires 1 evaluation of ∇f(·) per iteration.
     - Newton's method requires 1 evaluation of ∇f(·) and 1 evaluation of ∇²f(·) per iteration.
     - If derivatives cannot be computed exactly, they can be approximated by several evaluations of f(·) and ∇f(·).
     To determine the step length α_k:
     - Both gradient and Newton methods typically require several evaluations of f(·) and potentially ∇f(·) per iteration.
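One way to make these per-iteration costs concrete is to wrap the derivative callables in counters and tally evaluations. A small sketch; the `Counter` helper and the test function f(x) = x² are my own illustrative assumptions:

```python
class Counter:
    """Wrap a callable and count how many times it is evaluated."""
    def __init__(self, fn):
        self.fn = fn
        self.calls = 0
    def __call__(self, x):
        self.calls += 1
        return self.fn(x)

# f(x) = x^2, minimized at x* = 0; grad f(x) = 2x, hess f(x) = 2
grad = Counter(lambda x: 2 * x)
hess = Counter(lambda x: 2.0)

# Newton's method: exactly one grad and one hess evaluation per iteration
x = 1.0
for _ in range(5):
    x = x - grad(x) / hess(x)

print(grad.calls, hess.calls)  # 5 5
```

Counting evaluations this way, rather than timing wall-clock seconds, is exactly the cost metric the summary slide advocates: when one evaluation means an hour at an intersection or a wind-tunnel run, the evaluation count dominates everything else.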

  7. How many iterations do we need?
     Question: Given a sequence x_k for which we know that x_k → x*, can we determine exactly how fast the error ‖x_k − x*‖ goes to zero?
     [Plot: ‖x_k − x*‖ versus k]

  8. How many iterations do we need?
     Definition: We say that a sequence x_k → x* is of order s if
       ‖x_k − x*‖ ≤ C ‖x_{k−1} − x*‖^s.
     A sequence of numbers a_k → 0 is called of order s if
       |a_k| ≤ C |a_{k−1}|^s.
     C is called the asymptotic constant. We call C |a_{k−1}|^{s−1} the gain factor.
     Specifically:
     - If s = 1, the sequence is called linearly convergent. Note: convergence requires C < 1. In a semi-logarithmic plot, linearly convergent sequences are straight lines.
     - If s = 2, we call the sequence quadratically convergent.
     - If 1 < s < 2, we call the sequence superlinearly convergent.
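The order s of a concrete error sequence can be estimated numerically from the definition: taking logarithms of |a_k| ≈ C |a_{k−1}|^s gives log|a_k| ≈ s·log|a_{k−1}| + log C, so s is roughly the ratio of successive log-error differences. A sketch; the helper name `observed_order` is my own:

```python
import math

def observed_order(errors):
    """Estimate the convergence order s from the last three errors."""
    e0, e1, e2 = errors[-3:]
    return math.log(e2 / e1) / math.log(e1 / e0)

# A linearly convergent sequence: a_k = 0.9^k  (s = 1)
lin = [0.9**k for k in range(10)]

# A quadratically convergent sequence: a_k = a_{k-1}^2  (s = 2, C = 1)
quad = [0.5]
for _ in range(5):
    quad.append(quad[-1]**2)

print(round(observed_order(lin), 2))   # 1.0
print(round(observed_order(quad), 2))  # 2.0
```

This three-point estimate is a common sanity check when one suspects, but cannot prove, that an implementation converges at the advertised order.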

  9. How many iterations do we need?
     Example: The sequence of numbers a_k = 1, 0.9, 0.81, 0.729, 0.6561, ... is linearly convergent because
       |a_k| ≤ C |a_{k−1}|^s  with s = 1, C = 0.9.
     Remark 1: Linearly convergent sequences can converge very slowly if C is close to 1.
     Remark 2: Linear convergence is considered slow. We will want to avoid linearly convergent algorithms.
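The bound from this example can be checked directly, since a_k = 0.9^k satisfies it with equality at every step. A quick sketch:

```python
# The slide's sequence: a_k = 0.9^k -> 1, 0.9, 0.81, 0.729, 0.6561, ...
a = [0.9**k for k in range(6)]
C, s = 0.9, 1

# |a_k| <= C * |a_{k-1}|^s holds for every consecutive pair
# (a small epsilon absorbs floating-point rounding)
assert all(abs(a[k]) <= C * abs(a[k - 1])**s + 1e-15 for k in range(1, len(a)))
```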

  10. How many iterations do we need?
      Example: The sequence of numbers a_k = 0.1, 0.03, 0.0027, 0.00002187, ... is quadratically convergent because
        |a_k| ≤ C |a_{k−1}|^s  with s = 2, C = 3.
      Remark 1: Quadratically convergent sequences can converge very slowly if C is large. For many algorithms we can show that they converge quadratically if a_0 is small enough (namely C |a_0| ≤ 1), since then
        |a_1| ≤ C |a_0|^2 ≤ |a_0|.
      If a_0 is too large, then the sequence may fail to converge, since then
        |a_1| ≤ C |a_0|^2  with  C |a_0|^2 ≥ |a_0|.
      Remark 2: Quadratic convergence is considered fast. We will want to use quadratically convergent algorithms.
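Newton's method for root finding is the classic quadratically convergent iteration; it is not part of this slide, but it shows the error-squaring behavior concretely. Applied to f(x) = x² − 2, whose root is √2, each error is bounded by a constant times the square of the previous one:

```python
import math

# Newton iteration for f(x) = x^2 - 2: x_{k} = x_{k-1} - f(x)/f'(x)
x = 1.0
errors = []
for _ in range(5):
    x = x - (x**2 - 2) / (2 * x)
    errors.append(abs(x - math.sqrt(2)))

# Quadratic convergence: |e_k| <= C |e_{k-1}|^2 with C = 1/(2*sqrt(2)) here,
# so C |e_0| < 1 and the iteration converges from x_0 = 1
assert errors[1] <= errors[0]**2
assert errors[2] <= errors[1]**2
```

Starting too far from the root illustrates the caveat on the slide: with a large initial error, C |a_0| can exceed 1 and the first steps may move away from the solution before the quadratic regime takes over.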

  11. How many iterations do we need?
      Example: Compare linear and quadratic convergence.
      [Plot: ‖x_k − x*‖ versus k for both cases]
      - Linear convergence: the gain factor C < 1 is constant.
      - Quadratic convergence: the gain factor C |a_{k−1}| ≪ 1 becomes better and better!

  12. Metrics of algorithmic complexity
      Summary:
      - Quadratic algorithms converge faster in the limit than linear or superlinear algorithms.
      - Algorithms that are better than linear will need to be started close enough to the solution.
      Algorithms are best compared by counting the number of
      - function,
      - gradient, or
      - Hessian
      evaluations needed to achieve a certain accuracy. This is generally a good measure for the run-time of such algorithms.
