

slide-1
SLIDE 1

Condition Numbers of Numeric and Algebraic Problems

Stephen Vavasis1

1Department of Combinatorics & Optimization

University of Waterloo

2011-Nov-16 / Fields Inst. Workshop on Hybrid Symbolic-Numeric Computation

1/ 76

slide-2
SLIDE 2

Outline

1

Condition numbers in general

2

Condition numbers of linear equations

3

Linear least squares

4

Eigenvalues

5

Linear Programming

6

Geometric condition numbers

7

Polynomial evaluation and roots

2/ 76

slide-3
SLIDE 3

Condition numbers in general

Outline

1

Condition numbers in general

2

Condition numbers of linear equations

3

Linear least squares

4

Eigenvalues

5

Linear Programming

6

Geometric condition numbers

7

Polynomial evaluation and roots

3/ 76

slide-4
SLIDE 4

Condition numbers in general

Condition number definition

Given a real-number problem, that is, a function Φ : R^m → R^n, the condition number of an instance x measures its sensitivity to small perturbations. In particular:

cond num = lim_{ǫ→0} sup_{‖y‖≤ǫ} ‖Φ(x + y) − Φ(x)‖ / ‖y‖   (absolute measurement),

or perhaps

cond num = lim_{ǫ→0} sup_{‖y‖≤ǫ} (‖Φ(x + y) − Φ(x)‖ / ‖Φ(x)‖) / (‖y‖ / ‖x‖)   (relative measurement).

4/ 76

slide-5
SLIDE 5

Condition numbers in general

Details

Details to specify:
• Precise definition of input and output
• Relative or absolute? (applies to both the input and the output)
• Which part of the data is perturbed?
• What norm is used to measure sensitivity?

5/ 76

slide-6
SLIDE 6

Condition numbers in general

Uses of a condition number

Condition numbers determine the best possible accuracy of the solution in the presence of approximations made by the computation. Condition numbers sometimes bound the convergence speed of iterative methods. Condition numbers sometimes measure the distance of an instance to singularity. Condition numbers sometimes shed light on preconditioning.

6/ 76

slide-7
SLIDE 7

Condition numbers in general

Condition numbers and floating-point algorithms

Condition numbers are properties of an instance (i.e., the data), not of any particular algorithm. Condition numbers set achievable limits for algorithms. Condition number analysis may indicate that certain algorithmic choices are unwise. Condition numbers often reveal useful geometric properties of the instance.

7/ 76

slide-8
SLIDE 8

Condition numbers of linear equations

Outline

1

Condition numbers in general

2

Condition numbers of linear equations

3

Linear least squares

4

Eigenvalues

5

Linear Programming

6

Geometric condition numbers

7

Polynomial evaluation and roots

8/ 76

slide-9
SLIDE 9

Condition numbers of linear equations

Condition number of linear equations

The most famous classic example (Von Neumann & Goldstine; Turing) is the condition number of solving linear equations. The function Φ is Φ(A, b) = A^{-1}b, with A ∈ R^{n×n}, b ∈ R^n. Sensitivities are understood in the relative sense. All data may be perturbed. The matrix norm is the one induced by the vector norm.

9/ 76

slide-10
SLIDE 10

Condition numbers of linear equations

Condition number of linear equations (cont’d)

Theorem: Suppose x = A^{-1}b and x + ∆x = (A + ∆A)^{-1}(b + ∆b) with

max{‖∆A‖/‖A‖, ‖∆b‖/‖b‖} ≤ δ.

Then

‖∆x‖/‖x‖ ≤ 2κ(A)δ + O(δ²),

where κ(A) = ‖A‖ · ‖A^{-1}‖ is the condition number of A.
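The first-order bound can be probed numerically. A minimal numpy sketch (the matrix, perturbation size, and sampling scheme are made-up illustrative choices, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# An ill-conditioned 2x2 system (illustrative values).
A = np.array([[1.0, 1.0],
              [1.0, 1.001]])
b = np.array([2.0, 2.001])
x = np.linalg.solve(A, b)                 # exact solution is (1, 1)
kappa = np.linalg.cond(A)                 # kappa(A) = ||A|| * ||A^{-1}||

delta = 1e-8
worst = 0.0
for _ in range(200):
    # Random perturbations with relative size exactly delta.
    dA = rng.standard_normal(A.shape)
    dA *= delta * np.linalg.norm(A, 2) / np.linalg.norm(dA, 2)
    db = rng.standard_normal(b.shape)
    db *= delta * np.linalg.norm(b) / np.linalg.norm(db)
    x_pert = np.linalg.solve(A + dA, b + db)
    worst = max(worst, np.linalg.norm(x_pert - x) / np.linalg.norm(x))

# All sampled perturbations respect the bound 2*kappa(A)*delta (+ O(delta^2)).
assert worst <= 2 * kappa * delta * 1.01
```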

10/ 76

slide-11
SLIDE 11

Condition numbers of linear equations

Condition number of linear equations (cont’d)

Note κ(A) ≥ 1 and κ(tA) = κ(A) for all t ≠ 0. Specializing to the Euclidean vector norm and its induced matrix norm, κ(A) = σ_1/σ_n, the ratio of the extremal singular values of A. Geometrically: the matrix A maps the n-ball to an ellipsoid, and the condition number is the ratio of the maximum to the minimum axis length of this ellipsoid.
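For a concrete check, the 2×2 matrix from the figure on the next slide gives κ ≈ 16.4 directly from the SVD:

```python
import numpy as np

# kappa(A) in the Euclidean norm is the ratio of extreme singular values.
A = np.array([[4.0, 2.0],
              [3.0, 2.0]])
s = np.linalg.svd(A, compute_uv=False)    # singular values, descending
kappa = s[0] / s[-1]

assert np.isclose(kappa, np.linalg.cond(A, 2))
assert round(kappa, 1) == 16.4            # matches the value on the slide
```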

11/ 76

slide-12
SLIDE 12

Condition numbers of linear equations

Condition number of linear equations (cont’d)

[Figure: the map x ↦ [4 2; 3 2]x sends the unit circle to an ellipse; κ([4 2; 3 2]) ≈ 16.4.]

12/ 76

slide-13
SLIDE 13

Condition numbers of linear equations

Condition number of linear equations (cont’d)

The bound on perturbation to x can be (mostly) achieved. The bound can be achieved if only A or only b is perturbed. Condition number does not depend on b.

13/ 76

slide-14
SLIDE 14

Condition numbers of linear equations

Condition number and distance to singularity

Theorem: If A ∈ R^{n×n} is nonsingular, then 1/κ(A) is the relative distance from A to the set of singular matrices, i.e.,

1/κ(A) = inf{‖∆A‖/‖A‖ : A + ∆A is singular}.

The optimal ∆A pushes the smallest singular value to 0.
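The optimal perturbation can be built explicitly from the SVD; a small numpy check, reusing the example matrix from the earlier figure:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [3.0, 2.0]])
U, s, Vt = np.linalg.svd(A)

# Rank-one update that zeros out the smallest singular value.
dA = -s[-1] * np.outer(U[:, -1], Vt[-1, :])
assert abs(np.linalg.det(A + dA)) < 1e-12          # A + dA is singular

# Relative distance ||dA|| / ||A|| equals 1/kappa(A).
rel_dist = np.linalg.norm(dA, 2) / np.linalg.norm(A, 2)
assert np.isclose(rel_dist, 1.0 / np.linalg.cond(A, 2))
```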

14/ 76

slide-15
SLIDE 15

Condition numbers of linear equations

Condition number and iterative methods

Theorem: Suppose A ∈ R^{n×n} is symmetric and positive definite. Then the ith iterate of the conjugate gradient method for solving Ax = b satisfies

‖x_i − A^{-1}b‖_A ≤ 2‖x_0 − A^{-1}b‖_A · ((√κ(A) − 1)/(√κ(A) + 1))^i.

Note: ‖x‖_A means (x^T A x)^{1/2}. For a symmetric positive definite A, κ(A) = λ_max(A)/λ_min(A). Steepest-descent minimization applied to φ(x) = x^T A x/2 − b^T x is also bounded in terms of the condition number.
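The bound can be observed on a small SPD system; a sketch with a hand-rolled CG and a synthetic matrix whose spectrum (hence κ) is chosen explicitly (sizes and seed are arbitrary):

```python
import numpy as np

def cg_iterates(A, b, x0, iters):
    # Textbook conjugate gradient; records every iterate.
    x, r = x0.copy(), b - A @ x0
    p, out = r.copy(), [x0.copy()]
    for _ in range(iters):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
        out.append(x.copy())
    return out

rng = np.random.default_rng(1)
n = 30
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
lam = np.linspace(1.0, 100.0, n)
A = Q @ np.diag(lam) @ Q.T                # SPD with kappa = 100 by construction
b = rng.standard_normal(n)
x_star = np.linalg.solve(A, b)

def err_A(x):                             # energy norm ||x - x*||_A
    d = x - x_star
    return np.sqrt(d @ A @ d)

kappa = lam[-1] / lam[0]
rho = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)
xs = cg_iterates(A, b, np.zeros(n), 20)
for i, x in enumerate(xs):
    # Every iterate satisfies the theorem's bound (small slack for rounding).
    assert err_A(x) <= 2 * err_A(xs[0]) * rho**i + 1e-10
```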

15/ 76

slide-16
SLIDE 16

Condition numbers of linear equations

Extension to nonsymmetric systems

For nonsymmetric A, one can apply CG to minimize ‖Ax − b‖; this is equivalent to solving the symmetric system A^T A x = A^T b. Note that κ(A^T A) = κ(A)². Other well-known iterative methods for nonsymmetric Ax = b, e.g., GMRES and Bi-CGSTAB, are not governed by κ(A).
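The squaring of the condition number is easy to verify on a matrix with prescribed singular values (the construction below is an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10
Q1, _ = np.linalg.qr(rng.standard_normal((n, n)))
Q2, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q1 @ np.diag(np.logspace(0, 3, n)) @ Q2   # singular values 1 .. 1e3

assert np.isclose(np.linalg.cond(A), 1e3, rtol=1e-8)
# Forming A^T A squares the condition number.
assert np.isclose(np.linalg.cond(A.T @ A), 1e6, rtol=1e-6)
```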

16/ 76

slide-17
SLIDE 17

Condition numbers of linear equations

Computing the condition number

The condition number of the condition number is the condition number (Demmel). This means: the sensitivity of the condition number itself with respect to perturbations of A is again κ(A). In practice, very large condition numbers (greater than about 10^17 in Matlab) cannot usually be computed accurately, except for matrices with special structure. Even for well-conditioned matrices, computing the condition number is more expensive than solving Ax = b.

17/ 76

slide-18
SLIDE 18

Condition numbers of linear equations

Preconditioning linear equations

Instead of applying conjugate gradient to Ax = b, apply it to CAC^T y = Cb, where C is a square nonsingular matrix; C^T C is called the preconditioner. Want C such that κ(CAC^T) ≪ κ(A). (It is too expensive to compute either quantity exactly.) There is a tradeoff between the time to compute/apply C and κ(CAC^T).
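A tiny illustration of the idea using the cheapest possible C, a diagonal (Jacobi) scaling; the badly scaled SPD matrix is a made-up example, not from the talk:

```python
import numpy as np

# 1-D Laplacian (SPD), rescaled by wildly varying diagonal weights.
n = 50
T = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
D = np.diag(np.logspace(0, 4, n))
A = D @ T @ D                              # SPD and very ill-conditioned

# Jacobi preconditioning: C = diag(A)^{-1/2}, so C A C^T has unit diagonal.
C = np.diag(1.0 / np.sqrt(np.diag(A)))
M = C @ A @ C.T

# kappa(C A C^T) is smaller than kappa(A) by orders of magnitude.
assert np.linalg.cond(M) < np.linalg.cond(A) / 1e3
```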

18/ 76

slide-19
SLIDE 19

Condition numbers of linear equations

Example of preconditioning

A symmetric n × n matrix A is a weighted Laplacian if the diagonal entries are nonnegative, the off-diagonal entries are nonpositive, and the row sums are nonnegative. These conditions imply positive semidefiniteness. Spielman, Teng and others, in a series of papers over the past 10 years, found a graph-theoretic preconditioner for weighted Laplacians. A consequence is that these systems can be solved in nearly linear time (linear in the number of nonzero entries of A).

19/ 76

slide-20
SLIDE 20

Condition numbers of linear equations

Extension to finite element stiffness matrices

Boman, Hendrickson and V. extend Spielman & Teng to finite element discretizations of PDEs of the form ∇ · (σ∇u) = −f. The finite element stiffness matrix K can be factored as A^T D^{1/2} H D^{1/2} A, where A is a node-arc incidence matrix and D is diagonal and positive semidefinite (⟹ A^T D A is a weighted Laplacian). If all cells of the mesh are well-shaped, then κ(H) is small, and any preconditioner for A^T D A also works for K.

20/ 76

slide-21
SLIDE 21

Linear least squares

Outline

1

Condition numbers in general

2

Condition numbers of linear equations

3

Linear least squares

4

Eigenvalues

5

Linear Programming

6

Geometric condition numbers

7

Polynomial evaluation and roots

21/ 76

slide-22
SLIDE 22

Linear least squares

Linear least squares

The problem is: minimize ‖Ax − b‖_2 given A ∈ R^{m×n} and b ∈ R^m. Equivalent to solving the linear equations A^T A x = A^T b (the first-order optimality condition). Assume that rank(A) = n, so the solution is unique. Common algorithms: the method of normal equations; QR factorization.
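A numerical sketch of the two algorithm families on a Läuchli-type matrix (entries are illustrative values). The orthogonalization-based solver copes with κ(A); the normal equations must cope with κ(A^T A) = κ(A)², whose error we print rather than assert, since rounding details vary:

```python
import numpy as np

# Lauchli-style matrix: nearly dependent columns.
eps = 1e-7
A = np.array([[1.0, 1.0],
              [eps, 0.0],
              [0.0, eps]])
b = np.array([2.0, eps, eps])        # exact LS solution is x = (1, 1)

# Orthogonalization-based solver (SVD under the hood) handles kappa ~ 1e7.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_ls, [1.0, 1.0], atol=1e-6)

# The normal equations face kappa(A^T A) ~ 1e14 instead.
G = A.T @ A
assert np.linalg.cond(G) > 1e13
x_ne = np.linalg.solve(G, A.T @ b)
print("normal-equations error:", np.linalg.norm(x_ne - 1.0))
```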

22/ 76

slide-23
SLIDE 23

Linear least squares

Condition number of linear least squares

Theorem (Wedin): If ‖∆A‖/‖A‖ ≤ δ and ‖∆b‖/‖b‖ ≤ δ, then

‖∆x‖/‖x‖ ≤ 2κ(A)δ + κ(A)²δ · ‖b − Ax‖/(‖A‖ · ‖x‖).

Achievable. Here, κ(A) = σ_1(A)/σ_n(A). If the linear-systems bound is applied to A^T A x = A^T b, one obtains a weaker bound.

23/ 76

slide-24
SLIDE 24

Linear least squares

Solving linear least squares

Moral 1: Reducing Problem A to Problem B establishes a bound on the condition number, but the bound may be weak. Moral 2: Solving linear least squares via reduction to linear equations may give a poor answer.

24/ 76

slide-25
SLIDE 25

Linear least squares

Weighted least squares

Weighted least squares means minimizing ‖D(Ax − b)‖, where D is a positive definite diagonal weight matrix. This reduces to ordinary linear least squares under the obvious substitution Ā = DA, b̄ = Db. The perturbation bound for ordinary least squares means the solution can be arbitrarily inaccurate as κ(D) → ∞.

25/ 76

slide-26
SLIDE 26

Linear least squares

Weighted least squares (cont’d)

Theorem: There is a bound on ‖x − x̂‖ that depends on χ_A and χ̄_A and is independent of κ(D). These quantities were introduced by Stewart, Todd and Dikin independently. Specialized weighted least-squares algorithms (V.; Hough & V.) can achieve this bound. These algorithms require that dependence among rows of A is detected infallibly. (Independent rows have a condition number bounded by χ_A.)

26/ 76

slide-27
SLIDE 27

Linear least squares

More about χA

χ_A = sup{‖(A^T D A)^{-1} A^T D‖ : D ∈ D} and χ̄_A = sup{‖A(A^T D A)^{-1} A^T D‖ : D ∈ D}, where D denotes the set of positive definite diagonal matrices. Dikin, Todd, and Stewart: this sup is finite. Khachiyan; Tunçel: it is NP-hard to compute or accurately approximate χ_A or χ̄_A.
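Since χ_A is a sup over all positive diagonal D, random sampling only gives a lower bound (and exact computation is NP-hard, per the slide); still, a toy 2×1 example where the sup can be worked out by hand is easy to probe:

```python
import numpy as np

rng = np.random.default_rng(3)

# chi_A = sup over positive diagonal D of ||(A^T D A)^{-1} A^T D||.
# For this 2x1 matrix the expression is sqrt(d1^2 + d2^2) / (d1 + d2),
# whose sup over d1, d2 > 0 is exactly 1.
A = np.array([[1.0], [1.0]])

worst = 0.0
for _ in range(2000):
    # Log-uniform weights spanning many orders of magnitude.
    D = np.diag(10.0 ** rng.uniform(-8, 8, size=2))
    M = np.linalg.solve(A.T @ D @ A, A.T @ D)
    worst = max(worst, np.linalg.norm(M, 2))

assert worst <= 1.0 + 1e-9    # never exceeds the sup
assert worst > 0.99           # extreme weights approach it
```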

27/ 76

slide-28
SLIDE 28

Eigenvalues

Outline

1

Condition numbers in general

2

Condition numbers of linear equations

3

Linear least squares

4

Eigenvalues

5

Linear Programming

6

Geometric condition numbers

7

Polynomial evaluation and roots

28/ 76

slide-29
SLIDE 29

Eigenvalues

Condition number of eigenvalues

If Ax = λx and λ is not a repeated eigenvalue, then λ depends smoothly on A. Differentiation shows that if λ + ∆λ is the corresponding eigenvalue of A + ∆A, ‖∆A‖ ≤ δ, then . . . Exercise: FIGURE IT OUT! Hint: Normalize the eigenvector; expand (A + ∆A)(x + ∆x) = (λ + ∆λ)(x + ∆x) to first order; multiply on the left by y^H (the normalized left eigenvector for λ).

29/ 76

slide-30
SLIDE 30

Eigenvalues

Condition number of eigenvalues

If Ax = λx and λ is not a repeated eigenvalue, then λ depends smoothly on A. Differentiation shows that if λ + ∆λ is the corresponding eigenvalue of A + ∆A, ‖∆A‖ ≤ δ, then |∆λ| ≤ δ/|y^H x| + O(δ²). Here, y is the normalized left eigenvector of A and x is the normalized right eigenvector. Absolute perturbations make sense in this context since one could have λ = 0 even when A ≠ 0. If A is a normal matrix, i.e., has an orthonormal basis of eigenvectors, then λ is perfectly conditioned since |y^H x| = 1.
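A numeric check of the |y^H x| factor on a deliberately non-normal matrix (the entries are an illustrative choice):

```python
import numpy as np

A = np.array([[1.0, 100.0],
              [0.0,   2.0]])        # non-normal; eigenvalues 1 and 2

lam, V = np.linalg.eig(A)           # right eigenvectors (unit columns)
lamL, W = np.linalg.eig(A.T)        # eigenvectors of A^T = left eigenvectors

x = V[:, np.argmin(abs(lam - 1.0))]
y = W[:, np.argmin(abs(lamL - 1.0))]
cond_ev = 1.0 / abs(y @ x)          # condition number of the eigenvalue 1
assert cond_ev > 50                 # ill conditioned despite the tiny size

# First-order bound: |Delta lambda| <= ||E|| / |y^H x| + O(||E||^2).
delta = 1e-8
rng = np.random.default_rng(4)
E = rng.standard_normal((2, 2))
E *= delta / np.linalg.norm(E, 2)
shift = np.min(abs(np.linalg.eig(A + E)[0] - 1.0))
assert shift <= delta * cond_ev * 1.1
```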

29/ 76

slide-31
SLIDE 31

Eigenvalues

Ill conditioned eigenvalues and pseudoeigenvalues

If an eigenvalue is ill conditioned (|yHx| ≈ 0), then it is sensitive to perturbations of A and hard to compute. Eigenvalue criteria applied to Jacobian are often used to test stability of solutions to differential equations and stability of numerical methods for differential equations. These criteria are unreliable if they are applied to problems with a highly non-normal Jacobian (Trefethen).

30/ 76

slide-32
SLIDE 32

Eigenvalues

Pseudoeigenvalues

Trefethen and others define an ǫ-pseudoeigenvalue of A to be a complex number λ such that ‖(A − λI)^{-1}‖ ≥ 1/ǫ. Equivalently: λ is an ǫ-pseudoeigenvalue if there exists ∆A s.t. ‖∆A‖ ≤ ǫ and λ is an eigenvalue of A + ∆A. A large condition number indicates that pseudoeigenvalues can be far from the true eigenvalues. Computational experiments and theoretical analysis show that pseudoeigenvalues can be used reliably in various stability criteria.
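The definition suggests a direct computation: λ is an ǫ-pseudoeigenvalue iff σ_min(A − λI) ≤ ǫ. A sketch on a non-normal 2×2 example (illustrative values):

```python
import numpy as np

A = np.array([[1.0, 100.0],
              [0.0,   2.0]])        # eigenvalues exactly 1 and 2
eps = 1e-3

def sigma_min(z):
    # z is an eps-pseudoeigenvalue iff sigma_min(A - z I) <= eps.
    return np.linalg.svd(A - z * np.eye(2), compute_uv=False)[-1]

# 1.05 is at distance 0.05 from the spectrum, yet it is an
# eps-pseudoeigenvalue for eps = 1e-3 because A is non-normal.
assert sigma_min(1.05) <= eps

# For a normal matrix, sigma_min equals the distance to the spectrum,
# so the same point is NOT an eps-pseudoeigenvalue.
B = np.diag([1.0, 2.0])
assert np.linalg.svd(B - 1.05 * np.eye(2), compute_uv=False)[-1] > eps
```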

31/ 76

slide-33
SLIDE 33

Linear Programming

Outline

1

Condition numbers in general

2

Condition numbers of linear equations

3

Linear least squares

4

Eigenvalues

5

Linear Programming

6

Geometric condition numbers

7

Polynomial evaluation and roots

32/ 76

slide-34
SLIDE 34

Linear Programming

Linear programming

Linear programming refers to minimizing c^T x subject to Ax = b and x ≥ 0, where A ∈ R^{m×n} is given, as are b ∈ R^m and c ∈ R^n. Linear programming admits an elegant duality theory. Two competing algorithms for LP are in common usage: simplex and interior point. Simplex is not known to be polynomial time. Three possible outcomes for an LP: an optimal solution is found; the problem is unbounded; the problem is infeasible. The optimizer is not necessarily unique.

33/ 76

slide-35
SLIDE 35

Linear Programming

Inputs and outputs for linear programming

Input is A, b, c. Some subtlety here: what if constraints are structured, e.g., l_i ≤ a_i^T x ≤ u_i? (At least) two possible outputs from an LP: the optimal value c^T x∗ or the optimizer x∗ itself. The optimizer is more sensitive to perturbation than the optimal value.

34/ 76

slide-36
SLIDE 36

Linear Programming

Example

min x + (1 + δ)y  s.t.  0 ≤ x, y ≤ 1,  x + y ≥ 0.3;  δ = −0.01

35/ 76

slide-37
SLIDE 37

Linear Programming

Example

min x + (1 + δ)y  s.t.  0 ≤ x, y ≤ 1,  x + y ≥ 0.3;  δ = 0

35/ 76

slide-38
SLIDE 38

Linear Programming

Example

min x + (1 + δ)y  s.t.  0 ≤ x, y ≤ 1,  x + y ≥ 0.3;  δ = 0.01
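Since an LP optimum is attained at a vertex, the jump in the optimizer across δ = 0 can be checked by enumerating the five vertices of this feasible region (a minimal sketch of the example on these three slides):

```python
import numpy as np

# Vertices of {0 <= x, y <= 1, x + y >= 0.3}.
vertices = np.array([[0.3, 0.0], [1.0, 0.0], [1.0, 1.0],
                     [0.0, 1.0], [0.0, 0.3]])

def optimizer(delta):
    # An LP attains its optimum at a vertex, so enumeration suffices here.
    vals = vertices @ np.array([1.0, 1.0 + delta])
    return vertices[np.argmin(vals)]

# The optimal values differ only by O(delta), but the optimizer jumps
# from one end of the edge x + y = 0.3 to the other.
assert np.allclose(optimizer(-0.01), [0.0, 0.3])
assert np.allclose(optimizer(+0.01), [0.3, 0.0])
```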

35/ 76

slide-39
SLIDE 39

Linear Programming

Renegar’s condition number

Start with the LP feasibility problem: given A ∈ R^{m×n} with rank(A) = m and b ∈ R^m, find an x such that Ax = b, x ≥ 0. Write the instance as d = (A, b) and assume it is feasible. The condition number of the instance according to Renegar is

C(d) = ‖d‖ / inf{‖d − d′‖ : d′ is infeasible}.

Renegar and Vera relate this condition number to the complexity of finding a solution to LP feasibility.

36/ 76

slide-40
SLIDE 40

Linear Programming

Followups to Renegar’s work

Many other problems similar to LP feasibility but with subtly different properties. Critique of Renegar’s theorem: A and b scale differently. More general critique: condition number should depend on S = {x : Ax = b} but not on particular choice of A and b.

37/ 76

slide-41
SLIDE 41

Linear Programming

Condition number independent of representation

Epelman and Freund define a condition number for LP feasibility as C2(d) = 1/sym(H(d), 0), where

sym(D, x) = sup{t : x + v ∈ D ⟹ x − tv ∈ D for all v}, and
H(d) = {bθ − Ax : θ ≥ 0, x ≥ 0, |θ| + ‖x‖ ≤ 1}.

Existence of a feasible point implies 0 ∈ int(H(d)). sym(D, 0) is invariant under invertible linear mappings of D. E & F show that C2(d) ≤ C(d).

38/ 76

slide-42
SLIDE 42

Linear Programming

Other forms of linear feasibility

Cheung, Cucker & Peña unify several LP results. Form the following self-dual feasibility problem:

−c^T x + b^T y ≥ 0,  c x_0 − A^T y ≥ 0,  −b x_0 + Ax = 0,  x_0, x ≥ 0.

Let ρ be the distance of this instance to another instance with a different strictly complementary basis (uniquely determined; Goldman-Tucker 1956). This distance is equal or related to several previously proposed condition numbers.

39/ 76

slide-43
SLIDE 43

Linear Programming

LP complexity bounded by χA

V. and Ye proposed an interior point method whose running time depends only on χ̄_A, m, n and not on b, c. No other interior point method has this property. Tardos had earlier shown how to get this kind of complexity with the ellipsoid method. The new technique is layered least squares, a generalization of weighted least squares in which some weights are infinitely larger than others.

40/ 76

slide-44
SLIDE 44

Linear Programming

Is χ̄_A a condition number?

Perhaps χ̄_A is a condition number for linear programming, but no precise statement is known. Cheung et al. found relationships between χ̄_A and other LP condition numbers.

Note that χ̄_A depends only on the null space N(A) and not on A itself.

41/ 76

slide-45
SLIDE 45

Linear Programming

Semidefinite programming

Semidefinite programming instances can have infinite condition number even when there is a unique optimizer. Example: maximize x subject to y ≥ 1, x² + y² ≤ 1. The unique feasible point (0, 1) has objective value 0. Perturbing the constraint to x² + y² ≤ 1 + ǫ moves the optimizer to (√ǫ, 1). Hence the ratio of the perturbation of the optimizer to the perturbation of the data is unbounded as ǫ → 0.

42/ 76

slide-46
SLIDE 46

Linear Programming

Semidefinite programming (cont’d)

In this SDP instance, the Slater condition fails.

43/ 76


slide-48
SLIDE 48

Geometric condition numbers

Outline

1

Condition numbers in general

2

Condition numbers of linear equations

3

Linear least squares

4

Eigenvalues

5

Linear Programming

6

Geometric condition numbers

7

Polynomial evaluation and roots

44/ 76

slide-49
SLIDE 49

Geometric condition numbers

Boundary value problem

A boundary value problem means solving a differential equation involving one or more spatial independent variables over a domain Ω ⊂ R^d (usually d = 1, 2, 3), given boundary conditions and field values. The classic example mentioned earlier: ∇ · (σ(x)∇u(x)) = −f(x) for x ∈ Ω, with some specification of u (Dirichlet) or ∂u/∂n (Neumann) at every point of ∂Ω. The special case σ(x) ≡ 1 and f ≡ 0 is called Laplace's equation.

45/ 76

slide-50
SLIDE 50

Geometric condition numbers

Geometry and BVP solution

The difficulty of solving a BVP depends partly on the geometry of Ω. (Many technicalities regarding mesh generation, etc.) Clean problem: assume Ω is simply connected in R², the BVP is Laplace's equation, and the boundary data is Dirichlet. One can reduce the solution to inversion of integral operators whose condition number depends on the harmonic conjugation operator. Harmonic conjugation means: given the real part of a complex analytic function defined on a simply connected open set Ω ⊂ C, find the imaginary part (unique up to an additive constant).

46/ 76

slide-51
SLIDE 51

Geometric condition numbers

Example simply connected plane domains

Well conditioned domains Poorly conditioned domains

47/ 76

slide-52
SLIDE 52

Geometric condition numbers

Complex analysis

In this case, the problem reduces to one in complex analysis: come up with a geometric bound for the harmonic conjugation operator. In the case that Ω is convex, harmonic conjugation is bounded in terms of the aspect ratio of Ω. Conjectures made by V. in the 1990s for the general case were mostly solved by C. Bishop in the past few years. The geometric condition number can be bounded in terms of a tree-like decomposition of Ω into approximately round bodies. The extension to 3D is wide open.

48/ 76

slide-53
SLIDE 53

Polynomial evaluation and roots

Outline

1

Condition numbers in general

2

Condition numbers of linear equations

3

Linear least squares

4

Eigenvalues

5

Linear Programming

6

Geometric condition numbers

7

Polynomial evaluation and roots

49/ 76

slide-54
SLIDE 54

Polynomial evaluation and roots

Evaluation of polynomials

Suppose p(x) = a_0 + a_1 x + · · · + a_d x^d (standard monomial form). If x is perturbed by an absolute amount δ_1 and a by a relative amount δ_2, then p(x) is perturbed by at most |p′(x)|δ_1 + ‖a‖ · ‖pwr(x)‖δ_2, where pwr(x) = [1; x; . . . ; x^d]. Example: it is unstable to use a Taylor series to approximate e^x when x ≪ 0, because ‖a‖ · ‖pwr(x)‖ ≫ |p(x)|. The same polynomial is fine when x ≈ 0. The same principles carry over to multivariate polynomials.
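The e^x example can be made concrete; a sketch comparing the ill-conditioned evaluation at x = −20 with the rewritten form e^{−20} = 1/e^{20} (the cutoff of 120 terms is an arbitrary choice well past convergence):

```python
import math

def taylor_exp(x, terms=120):
    # Sum the truncated Taylor series of e^x in the monomial basis.
    s, t = 0.0, 1.0
    for k in range(terms):
        s += t
        t *= x / (k + 1)
    return s

x = -20.0
# ||a|| * ||pwr(x)|| in the 1-norm is sum |x|^k / k!, roughly e^20,
# while |p(x)| is roughly e^-20: an enormous condition number.
bound = sum(abs(x) ** k / math.factorial(k) for k in range(120))
assert bound / math.exp(x) > 1e15

naive = taylor_exp(x)                 # catastrophic cancellation
stable = 1.0 / taylor_exp(-x)         # all terms positive: well conditioned
assert abs(stable - math.exp(x)) / math.exp(x) < 1e-12
assert abs(naive - math.exp(x)) / math.exp(x) > 1e-6   # garbage digits
```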

50/ 76

slide-55
SLIDE 55

Polynomial evaluation and roots

Evaluation of Bernstein-Bézier polynomials

A Bernstein-Bézier univariate polynomial is specified by coefficients a_0, . . . , a_d and given by

p(x) = Σ_{i=0}^d C(d, i) a_i x^i (1 − x)^{d−i},

where C(d, i) is the binomial coefficient. The argument x is usually in [0, 1]. One can show that the perturbation to p(x) is bounded by |p′(x)|δ_1 + ‖a‖δ_2 for this range of x.
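De Casteljau's recurrence is a standard stable way to evaluate this form; a short sketch cross-checked against direct summation (the coefficients are made-up):

```python
import math

def bernstein_eval(a, x):
    # de Casteljau: repeatedly form convex combinations of the coefficients.
    b = list(a)
    while len(b) > 1:
        b = [(1 - x) * u + x * v for u, v in zip(b, b[1:])]
    return b[0]

a = [1.0, -2.0, 0.5, 3.0]            # a cubic in the Bernstein basis
x = 0.3
d = len(a) - 1
direct = sum(math.comb(d, i) * a[i] * x**i * (1 - x)**(d - i)
             for i in range(d + 1))
assert abs(bernstein_eval(a, x) - direct) < 1e-12
```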

51/ 76

slide-56
SLIDE 56

Polynomial evaluation and roots

Condition number of a root of a univariate polynomial

Given p(x) as above, find a root. Toh and Trefethen (also Gautschi): the condition number of a root x∗ (relative perturbation to p, absolute perturbation to x∗) is . . . Exercise: FIGURE IT OUT! Hint: the Cauchy-Schwarz inequality is needed.

52/ 76

slide-57
SLIDE 57

Polynomial evaluation and roots

Condition number of a root of a univariate polynomial

Given p(x) as above, find a root. Toh and Trefethen (also Gautschi): the condition number of a root x∗ (relative perturbation to p, absolute perturbation to x∗) is ‖a‖ · ‖pwr(x∗)‖/|p′(x∗)|. Blum et al. work with absolute perturbations, so for them the condition number is ‖pwr(x∗)‖/|p′(x∗)|.

52/ 76

slide-58
SLIDE 58

Polynomial evaluation and roots

Multivariate polynomial systems

For the condition of an isolated root, Blum et al. give the following definition. Suppose x∗ is a root of the polynomial system p(x) = 0 and ∇p(x∗) (the Jacobian matrix) is invertible at x∗. The implicit function theorem implies there is a differentiable root function r(q), defined in a neighborhood O of p, such that q(r(q)) = 0 for all q ∈ O. The condition number is ‖∇r(p)‖.

53/ 76

slide-59
SLIDE 59

Polynomial evaluation and roots

Condition number theorem of Blum et al.

Say that x∗ is a singular root of p̂ if ∇p̂(x∗) is singular. Theorem: The reciprocal condition number of root x∗ of polynomial p is within a constant factor (depending on the degree) of the minimum distance from p to a polynomial p̂ such that x∗ is a singular root of p̂.

54/ 76

slide-60
SLIDE 60

Polynomial evaluation and roots

Roots in [0, 1]

Suppose only roots in [0, 1] are sought. It is not sufficient to consider condition numbers of roots in this interval, since an ill-conditioned root at 0.5 + ǫi means a slight perturbation of the polynomial may move the root into [0, 1]. E.g., p(x) = (x − 0.5)² + 10^{-9}. Two solutions to this problem: (a) consider all roots, or (b) use a definition of condition number that looks beyond the behavior at the roots.

55/ 76

slide-61
SLIDE 61

Polynomial evaluation and roots

Finding all roots in [0, 1]n

The problem arises in geometric computing. Suppose p : R^n → R^n is a polynomial, and we wish to compute all roots of p in [0, 1]^n. The ill-conditioned case is when a small perturbation to p creates a singular root in [0, 1]^n. The problem considered is not counting the number of roots in [0, 1]^n; in that problem, there is an additional ill-conditioned case when one of the roots is very close to ∂[0, 1]^n.

56/ 76

slide-62
SLIDE 62

Polynomial evaluation and roots

Condition number definition

(Srijuntongsiri & V.) Define

κ(f) = ‖f‖ · sup_{x∈[0,1]^n} min{1/‖f(x)‖, ‖∇f(x)^{-1}‖}.

In other words, f is well conditioned if for each point in [0, 1]^n, either ‖f(x)‖/‖f‖ is not too small or ∇f(x) is not too ill-conditioned. The condition number is ∞ if there is an x∗ such that f(x∗) = 0 and ∇f(x∗) is singular. Any standard norm may be used for ‖f‖.

57/ 76

slide-63
SLIDE 63

Polynomial evaluation and roots

Condition number bounds perturbations to roots

Theorem: Suppose f, f̂ are two polynomials such that ‖f − f̂‖/‖f‖ ≤ δ, where δκ(f) ≪ 1. Suppose x∗ ∈ [0, 1]^n and f(x∗) = 0. Then

inf_{y∈f̂^{-1}(0)} ‖x∗ − y‖ ≤ c_d δκ(f).

58/ 76

slide-64
SLIDE 64

Polynomial evaluation and roots

Condition number bounds distance to singularity

As mentioned earlier, a singular polynomial f̂ has a root x∗ ∈ [0, 1]^n such that ∇f̂(x∗) is singular. Theorem: For a polynomial f, the relative distance to the singular polynomials is equal to 1/κ(f) to within a factor depending on the degree.

59/ 76

slide-65
SLIDE 65

Polynomial evaluation and roots

General theorems about condition numbers

Suppose the condition number is the relative distance to singularity, where “singularity” means belonging to a semi-algebraic cone of co-dimension at least 1. Demmel showed that the mean of the logarithmic condition number is small. B¨ urgisser, Cucker and Lotz showed that the smoothed logarithmic condition number is small.

60/ 76

slide-66
SLIDE 66

Polynomial evaluation and roots

Smoothed condition number

Suppose Ψ(f) is some kind of complexity measure or condition number. It is a nonnegative real-valued function of an input instance f ∈ F.

Worst-case analysis: sup_{f∈F} Ψ(f).
Average-case analysis: E_{f∈F}[Ψ(f)].
Smoothed analysis (Spielman & Teng): sup_{f∈F} E_{f̂∈B(f,δ)}[Ψ(f̂)].

61/ 76

slide-67
SLIDE 67

Polynomial evaluation and roots

Affine invariance

The condition number of Srijuntongsiri & V. is not affinely invariant, but many algorithms are. Affinely invariant means that κ(f) = κ(Af) for any nonsingular matrix A. S. & V. solve this problem in a brute-force manner: define κ̂(f) = inf{κ(Af) : A ∈ GL_n(R)}; trivially κ̂(f) is affinely invariant. An affinely invariant algorithm whose complexity is bounded in terms of κ(f) is automatically also bounded in terms of κ̂(f).

62/ 76

slide-68
SLIDE 68

Polynomial evaluation and roots

Algorithms to find roots in [0, 1]n

Problem: Given polynomial f : Rn → Rn, find all roots in [0, 1]n. Focus is on the case of small degree, small n so that exhaustive search is tractable. Important problem in computational geometry. Goal: obtain an algorithm whose complexity is bounded by κ(f).

63/ 76

slide-69
SLIDE 69

Polynomial evaluation and roots

An algorithm with no such bound

Consider the classic Toth algorithm. Subdivide [0, 1]^n into 2^n subcubes. For each subcube, use a test that either confirms that f has a unique root in the subcube, confirms that there is no root, or else is inconclusive. If the test is inconclusive, recursively subdivide. The test is based on interval arithmetic. Problem: if a root has a coordinate that is exactly of the form k/2^l (or is very close to such a point), then the Toth algorithm can get stuck (inconclusive at all recursive levels).

64/ 76

slide-70
SLIDE 70

Polynomial evaluation and roots

KTS Algorithm

KTS stands for Kantorovich Test Subdivision algorithm. Also based on recursive subdivision into cubes. Uses Kantorovich test and exclusion test. Kantorovich test guarantees a unique root that can be found with Newton.

65/ 76

slide-71
SLIDE 71

Polynomial evaluation and roots

Kantorovich theorem

Suppose f : D → R^n is differentiable, with D an open convex subset of R^n. Suppose x_0 ∈ D and ‖∇f(x_0)^{-1} f(x_0)‖ ≤ η. Suppose ‖∇f(x_0)^{-1}(∇f(x) − ∇f(y))‖ ≤ ω‖x − y‖ for all x, y ∈ D. If h = ηω < 1/2, then f has a root in B(x_0, (1 − √(1 − 2h))/ω). This root is unique in B(x_0, (1 + √(1 − 2h))/ω) ∩ D and is the limit of Newton's method starting at x_0.
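The hypotheses can be checked explicitly for a small system; a sketch for f(x, y) = (x² + y² − 1, x − y), an illustrative choice where a valid ω follows easily because the Jacobian is affine in x:

```python
import numpy as np

def f(v):
    x, y = v
    return np.array([x * x + y * y - 1.0, x - y])

def J(v):
    x, y = v
    return np.array([[2 * x, 2 * y], [1.0, -1.0]])

x0 = np.array([0.7, 0.7])
Jinv = np.linalg.inv(J(x0))
eta = np.linalg.norm(Jinv @ f(x0))
# grad f(x) - grad f(y) = [[2(x1-y1), 2(x2-y2)], [0, 0]] has 2-norm
# 2 ||x - y||, so omega = 2 ||J(x0)^{-1}|| is a valid Lipschitz constant.
omega = 2 * np.linalg.norm(Jinv, 2)
h = eta * omega
assert h < 0.5                        # Kantorovich test passes at x0

# Newton from x0 converges to the guaranteed root (1/sqrt(2), 1/sqrt(2)).
x = x0.copy()
for _ in range(6):
    x = x - np.linalg.solve(J(x), f(x))
assert np.allclose(x, np.array([1.0, 1.0]) / np.sqrt(2), atol=1e-12)
```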

66/ 76

slide-72
SLIDE 72

Polynomial evaluation and roots

Kantorovich test

For a polynomial function, it is straightforward to estimate η and ω. Can determine if the current subcube lies in a disk where there is convergence to a unique root. As the subcube gets smaller, the disk stays the same size.

67/ 76

slide-73
SLIDE 73

Polynomial evaluation and roots

Bernstein-Bézier polynomials

The Bernstein-Bézier form of a polynomial function f : R^n → R^n of individual degree d is

f(x) = Σ_{i_1=0}^d · · · Σ_{i_n=0}^d a_{i_1···i_n} Π_{j=1}^n C(d, i_j) x_j^{i_j} (1 − x_j)^{d−i_j}.

Theorem: f([0, 1]^n) ⊂ conv(a_{0···0}, . . . , a_{d···d}). This follows because the Bernstein basis functions are nonnegative and sum to 1.

68/ 76

slide-74
SLIDE 74

Polynomial evaluation and roots

Exclusion test

Given a subcube S = [a_1, b_1] × · · · × [a_n, b_n] of [0, 1]^n, define the obvious bijection π : [0, 1]^n → S and a new polynomial f̃ : [0, 1]^n → R^n by f̃(x) = f(π(x)). Rewrite f̃ in B-B form. If the hull of the control points of f̃ does not contain 0, then f has no root in S.

69/ 76

slide-75
SLIDE 75

Polynomial evaluation and roots

Main theorem

Suppose S is a subcube of width at most c_d/κ(f)². Then either the Kantorovich test is satisfied by S, and the region containing the root also contains S, or the exclusion test is satisfied by S (or possibly both). This means that S does not have to be further subdivided, and it yields a complexity bound that depends on κ(f).

70/ 76

slide-76
SLIDE 76

Polynomial evaluation and roots

Extension to the case of one degree of freedom

Suppose now that f : [0, 1]^{n+1} → R^n is a polynomial system. Roots in this case are, generically, curves in [0, 1]^{n+1}. This problem is also important in computational geometry: the surface-surface intersection problem. Define

κ(f) = ‖f‖ · max_{x∈[0,1]^{n+1}} min{1/‖f(x)‖, ‖∇f(x)^+‖}.

Here, B^+ denotes B^T(BB^T)^{-1}, the Moore-Penrose pseudoinverse of B in the case that the rows of B are linearly independent.

71/ 76

slide-77
SLIDE 77

Polynomial evaluation and roots

KTS algorithm for one degree of freedom

Srijuntongsiri & V. extend the KTS algorithm to this case with similar complexity bounds. New problem: tracing curves between subcubes to get connected components. Kantorovich theorem implies that once subdivision is fine enough, there will be only one possible way to connect curves together.

72/ 76

slide-78
SLIDE 78

Polynomial evaluation and roots

Interval versus floating point arithmetic

The KTS algorithm has been implemented in floating-point arithmetic; in floating-point arithmetic, its correctness is not guaranteed. One can implement KTS in interval arithmetic to guarantee correctness, which raises the cost. KTS is affinely invariant in exact arithmetic, but not in floating-point or interval arithmetic.

73/ 76

slide-79
SLIDE 79

Polynomial evaluation and roots

Yet another condition number for root-finding

It is possible to reduce multivariate polynomial rootfinding to an eigenvalue/eigenvector computation (see e.g. Jónsson & V.) using resultants.

Another avenue for defining a root-finding condition number: use the conditioning of the resulting eigenvalue/eigenvector problem. Jónsson & V.: besides the condition number at the root, the resulting root-finding condition number deteriorates if the polynomials are close to having a common factor.

74/ 76

slide-80
SLIDE 80

Polynomial evaluation and roots

Future directions (I)

Condition-based analysis and floating-point arithmetic: is affine invariance good or bad? (For linear system solving, the condition number is obviously not affinely invariant!) There are multiple ways to define the condition number of polynomial rootfinding:

• the condition number of the roots;
• a condition number involving the function value and first derivative;
• a condition number involving resultant matrices.

Find connections?

75/ 76

slide-81
SLIDE 81

Polynomial evaluation and roots

Future directions (II)

Robust optimization and condition numbers. Lots of work to do on geometric condition numbers. The condition number of semidefinite programming: although Renegar and others have defined this, there are still well-conditioned instances that cause trouble for solvers. Preconditioning is not widely used outside of solving Ax = b.

76/ 76