SLIDE 1 A quick introduction to Krylov solvers for the solution of linear systems
HiePACS project - Inria Bordeaux Sud-Ouest
Inria School
SLIDE 2 Outline
Reliability of the calculation
Algorithm selection
Why search for solutions in Krylov subspaces?
Unsymmetric Krylov solvers based on the Arnoldi procedure
Algebraic preconditioning techniques
Bibliography
SLIDE 3 Outline
Reliability of the calculation
  Backward error
  Sensitivity vs. conditioning
  Backward stability of algorithms
Algorithm selection
Why search for solutions in Krylov subspaces?
Unsymmetric Krylov solvers based on the Arnoldi procedure
Algebraic preconditioning techniques
Bibliography
SLIDE 4
How to assess the quality of a computed solution?
Relative norms should be preferred. Let x be the solution and x̃ its approximation; δ = ‖x − x̃‖/‖x‖ gives the number of correct digits in the solution. Example: e = 2.7182818...

Approximation   δ
2.              2·10^{−1}
2.7             6·10^{−3}
2.71            3·10^{−3}
2.718           1·10^{−4}
2.7182          3·10^{−5}
2.71828         6·10^{−7}

The IEEE 754 standard only provides relative accuracy information.
SLIDE 5
How to assess the quality of a computed solution?
If ‖x − x̃‖/‖x‖ is large, there are TWO possible reasons:
◮ the mathematical problem is sensitive to perturbations,
◮ the selected numerical algorithm behaves poorly in finite precision arithmetic.
Backward error analysis (Wilkinson, 1963) gives a framework to look at the problem.
SLIDE 6 Sensitivity to small perturbations (exact arithmetic)
[ 1  1        ] [x_1]   [ 2        ]        [x_1]   [ 1 ]
[ 1  1.000005 ] [x_2] = [ 2.000005 ]   ⇒    [x_2] = [ 1 ]

[ 1  1        ] [x_1]   [ 2        ]        [x_1]   [ 1.16666... ]
[ 1  1.000006 ] [x_2] = [ 2.000005 ]   ⇒    [x_2] = [ 0.83333... ]

A perturbation of 10^{−6} in a single entry of A changes the solution from (1, 1)^T to (1.16666..., 0.83333...)^T.
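A minimal numpy check of this example (a sketch; the data are exactly those above) reproduces the effect and ties it to the condition number:

```python
import numpy as np

A1 = np.array([[1.0, 1.0], [1.0, 1.000005]])
A2 = np.array([[1.0, 1.0], [1.0, 1.000006]])   # one entry perturbed by 1e-6
b = np.array([2.0, 2.000005])

print(np.linalg.solve(A1, b))    # [1., 1.]
print(np.linalg.solve(A2, b))    # [1.1666..., 0.8333...]
print(np.linalg.cond(A1))        # ~8e5: the problem itself is ill-conditioned
```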
SLIDE 7 Poor numerical behaviour in finite precision
Computation of I_n = ∫_0^1 x^n e^{−x} dx, n ≥ 0.
Integration by parts gives I_0 = 1 − 1/e and I_n = n I_{n−1} − 1/e for n ≥ 1.
Computation with 16 significant digits: Ĩ_100 = 5.7·10^{141}.
Backward recurrence: start from I_300 = 1 (??) and use I_{n−1} = (I_n + 1/e)/n.
Computation with 16 significant digits: Ĩ_100 = 3.715578714528098·10^{−3}.
Exact value: I_100 = 3.715578714528085·10^{−3}.
The art of selecting the right algorithm! More computation does not necessarily imply larger errors!
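The two recurrences are easy to reproduce; a minimal sketch (loop bounds chosen to match the slide) is:

```python
import math

e_inv = 1.0 / math.e

# Forward: I_n = n I_{n-1} - 1/e. The initial rounding error is multiplied
# by n at every step, i.e. amplified roughly like n!.
I = 1.0 - e_inv                    # I_0, correct to machine precision
for n in range(1, 101):
    I = n * I - e_inv
print(I)                           # garbage of order 10^141

# Backward: I_{n-1} = (I_n + 1/e)/n. The error in the arbitrary starting
# guess is divided by n at every step and dies out.
I = 1.0                            # crude guess for I_300
for n in range(300, 100, -1):      # n = 300, 299, ..., 101
    I = (I + e_inv) / n
print(I)                           # 3.715578714528098e-03, ~15 correct digits
```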
SLIDE 8 Backward error and conditioning
The backward error analysis, introduced by Wilkinson (1963), is a powerful concept for analyzing the quality of an approximate solution:
1. it is independent of the details of round-off propagation: the errors introduced during the computation are interpreted in terms of perturbations of the initial data, and the computed solution is considered as exact for the perturbed problem;
2. because round-off errors are seen as data perturbations, they can be compared with errors due to numerical approximations (consistency of numerical schemes) or to physical measurements (uncertainties on data coming from experiments, for instance).
SLIDE 9 Backward error
Rigal and Gaches result (1967): the normwise backward error
η^N_{A,b}(x̃) = min{ε : (A + ΔA)x̃ = b + Δb, ‖ΔA‖ ≤ ε‖A‖, ‖Δb‖ ≤ ε‖b‖}   (1)
is given by
η^N_{A,b}(x̃) = ‖r‖ / (‖A‖‖x̃‖ + ‖b‖),   (2)
where r = b − Ax̃.
SLIDE 10 Backward error
Simplified proof (‖·‖ = ‖·‖_2): the right-hand side of (2) is a lower bound of (1). Let (ΔA, Δb) be such that (A + ΔA)x̃ = b + Δb, ‖ΔA‖ ≤ ε‖A‖ and ‖Δb‖ ≤ ε‖b‖. We have
(A + ΔA)x̃ = b + Δb
⇒ b − Ax̃ = ΔAx̃ − Δb
⇒ ‖b − Ax̃‖ ≤ ‖ΔA‖‖x̃‖ + ‖Δb‖
⇒ ‖r‖ ≤ ε(‖A‖‖x̃‖ + ‖b‖)
⇒ ‖r‖ / (‖A‖‖x̃‖ + ‖b‖) ≤ min{ε} = η^N_{A,b}(x̃).
SLIDE 11
The bound is attained for
ΔA_min = (‖A‖ / (‖x̃‖(‖A‖‖x̃‖ + ‖b‖))) r x̃^T   and   Δb_min = −(‖b‖ / (‖A‖‖x̃‖ + ‖b‖)) r.
We have ΔA_min x̃ − Δb_min = r, with ‖ΔA_min‖ = ‖A‖‖r‖ / (‖A‖‖x̃‖ + ‖b‖) and ‖Δb_min‖ = ‖b‖‖r‖ / (‖A‖‖x̃‖ + ‖b‖).
SLIDE 12 Normwise backward error
We can also define
η^N_b(x̃) = min{ε : Ax̃ = b + Δb, ‖Δb‖ ≤ ε‖b‖} = ‖r‖ / ‖b‖.
◮ Classically, η^N_{A,b} or η^N_b are used to implement stopping criteria in iterative methods.
◮ Notice that ‖r_k‖/‖r_0‖ (often seen in some implementations) reduces to η^N_b if x_0 = 0.
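These two quantities are cheap to evaluate; a small numpy helper, following (2) and the definition above (2-norms throughout, as in the simplified proof), might look like this:

```python
import numpy as np

def backward_errors(A, b, x_tilde):
    """Return the normwise backward errors eta^N_{A,b} and eta^N_b."""
    r = b - A @ x_tilde
    eta_Ab = np.linalg.norm(r) / (np.linalg.norm(A, 2) * np.linalg.norm(x_tilde)
                                  + np.linalg.norm(b))
    eta_b = np.linalg.norm(r) / np.linalg.norm(b)
    return eta_Ab, eta_b

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
b = rng.standard_normal(50)
x = np.linalg.solve(A, b)
print(backward_errors(A, b, x))   # both near machine precision for a stable solver
```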
SLIDE 13 Sensitivity vs. conditioning
Suppose that we have solved (A + ΔA)(x + Δx) = b + Δb. We would like to know how ‖Δx‖/‖x‖ depends on ‖ΔA‖/‖A‖ and ‖Δb‖/‖b‖. We denote ω = max(‖ΔA‖/‖A‖, ‖Δb‖/‖b‖) and κ_N(A) = ‖A‖‖A^{−1}‖.
(A + ΔA)(x + Δx) = b + Δb
⇒ Δx = −A^{−1}ΔA x − A^{−1}ΔA Δx + A^{−1}Δb
⇒ ‖Δx‖ ≤ ‖A^{−1}‖‖ΔA‖‖x‖ + ‖A^{−1}‖‖ΔA‖‖Δx‖ + ‖A^{−1}‖‖Δb‖
⇒ ‖Δx‖ ≤ ωκ_N(A)‖x‖ + ωκ_N(A)‖Δx‖ + ω‖A^{−1}‖‖b‖
⇒ (1 − ωκ_N(A))‖Δx‖ ≤ ωκ_N(A)‖x‖ + ω‖A^{−1}‖‖b‖.
SLIDE 14 Sensitivity vs. conditioning (cont)
If ωκ_N(A) < 1, then
‖Δx‖/‖x‖ ≤ (ω / (1 − ωκ_N(A))) (κ_N(A) + ‖A^{−1}‖‖b‖/‖x‖)
         ≤ 2ωκ_N(A) / (1 − ωκ_N(A))      (using ‖b‖ ≤ ‖A‖‖x‖)
         ≤ 4ωκ_N(A)                       (if, in addition, ωκ_N(A) ≤ 1/2),
which reads:
Relative forward error ≲ (Condition number) × (Backward error).
SLIDE 15 First order estimate of the forward error
Matrix   κ_N(A)     η^N_{A,b}    κ_N(A) × η^N_{A,b}   FE
M1       1·10^2     1·10^{−16}   1·10^{−14}           2·10^{−15}
M2       4·10^5     9·10^{−17}   3·10^{−11}           8·10^{−12}
M3       1·10^{16}  2·10^{−16}   2                    6·10^{−1}
M4       3·10^1     3·10^{−7}    2·10^{−5}            9·10^{−6}
M5       4·10^1     6·10^{−4}    3·10^{−2}            2·10^{−2}

Algorithm: Gaussian elimination with partial pivoting (MATLAB).
Matrices: M1 - augment(20), M2 - dramadah(2), M3 - chebvand(20), M4 - will(40), M5 - will(50), M6 - dd(20).
SLIDE 16
First order estimate of the forward error (cont)
◮ Large backward error: unreliable algorithm ⇒ change the algorithm (e.g. use Gaussian elimination with pivoting).
◮ Ill-conditioned matrix: large condition number ⇒ scale the matrix, or think about an alternative numerical formulation of the problem.
SLIDE 17 Gaussian elimination with partial pivoting
for k = 1 to n − 1 do
  for i = k + 1 to n do
    a_{ik} = a_{ik}/a_{kk}
    for j = k + 1 to n do
      a_{ij} = a_{ij} − a_{ik} a_{kj}
    end for
  end for
end for
LU factorization algorithm without pivoting.
Let x̃ be the solution computed by Gaussian elimination with partial pivoting. Then (A + ΔA)x̃ = b with
‖ΔA‖_∞ / ‖A‖_∞ ≤ 8n^3 τψ + O(ψ^2),
where ψ is the machine precision and τ is the growth factor
τ = max_{i,j,k} |a^{(k)}_{ij}| / max_{i,j} |a_{ij}|.
If τ is large it might imply numerical instability (i.e. a large backward error).
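A classic 2 × 2 illustration of a large growth factor (a sketch, with ε chosen extreme to make the point): without a row exchange the multiplier 1/ε destroys the computed solution, while partial pivoting (as used by LAPACK inside `numpy.linalg.solve`) recovers it.

```python
import numpy as np

eps = 1e-17
A = np.array([[eps, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0])

# LU without pivoting, following the algorithm above for n = 2
l21 = A[1, 0] / A[0, 0]                       # huge multiplier 1/eps
U = np.array([[A[0, 0], A[0, 1]],
              [0.0, A[1, 1] - l21 * A[0, 1]]])
y = np.array([b[0], b[1] - l21 * b[0]])       # forward substitution
x2 = y[1] / U[1, 1]
x1 = (y[0] - U[0, 1] * x2) / U[0, 0]
print(x1, x2)                                 # ~[0., 1.]: badly wrong

print(np.linalg.solve(A, b))                  # ~[1., 1.] with partial pivoting
```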
SLIDE 18 Backward stability of GMRES with robust orthogonalization
For a not too ill-conditioned matrix, the Householder GMRES implementation ensures that
η^N_{A,b}(x̃_n) = ‖b − Ax̃_n‖ / (‖A‖‖x̃_n‖ + ‖b‖) ≤ P(n)ψ + O(ψ^2),
where ψ is the machine precision.
[J. Drkošová, M. Rozložník, Z. Strakoš and A. Greenbaum, Numerical stability of the GMRES method, BIT, vol. 35, pp. 309-330, 1995.]
[C. C. Paige, M. Rozložník and Z. Strakoš, Modified Gram-Schmidt (MGS), Least Squares, and Backward Stability of MGS-GMRES, SIAM J. Matrix Anal. Appl., vol. 28 (1), pp. 264-284, 2006.]
SLIDE 19 Backward stability of GMRES with robust orthogonalization (cont)
SLIDE 20 Outline
Reliability of the calculation
Algorithm selection
Why search for solutions in Krylov subspaces?
Unsymmetric Krylov solvers based on the Arnoldi procedure
Algebraic preconditioning techniques
Bibliography
SLIDE 21 Sparse linear solver
Goal: solve Ax = b, where A is sparse. Usual trade-offs:
Direct solvers
◮ Robust/accurate for general problems
◮ BLAS-3 based implementations
◮ Memory/CPU prohibitive for large 3D problems
◮ Limited weak scalability
Iterative solvers
◮ Problem-dependent efficiency/accuracy
◮ Sparse computational kernels
◮ Lower memory requirements and possibly faster
◮ Possibly high weak scalability
SLIDE 22
Algorithm selection
Two main approaches:
◮ Direct solvers: compute a factorization and use the factors to solve the linear system; that is, express the matrix A as a product of matrices having simple structures (i.e. diagonal, triangular). Examples: LU for unsymmetric matrices, LL^T (Cholesky) for symmetric positive definite matrices, LDL^T for symmetric indefinite matrices.
◮ Iterative solvers: build a sequence (x_k) converging to x*.
SLIDE 23
Stationary vs. Krylov methods
An alternative to direct solvers when memory and/or CPU constraints arise. Two main approaches:
◮ Stationary/asymptotic methods: x_k = f(x_{k−1}) with x_k → x* for all x_0.
◮ Krylov methods: x_k ∈ x_0 + span{r_0, Ar_0, ..., A^{k−1}r_0} with r_k = b − Ax_k, subject to some constraints/optimality conditions.
SLIDE 24
Basic scheme
Let x_0 be given and M ∈ R^{n×n} a nonsingular matrix; compute
x_k = x_{k−1} + M(b − Ax_{k−1}).   (4)
Note that b − Ax_{k−1} = A(x* − x_{k−1}) ⇒ the best M is A^{−1}.
Theorem
The stationary scheme defined by (4) converges to x* = A^{−1}b for any x_0 iff ρ(I − MA) < 1, where ρ(I − MA) denotes the spectral radius of the iteration matrix (I − MA).
SLIDE 25
Some well-known schemes
Depending on the choice of M we obtain some of the best-known stationary methods. Decompose A = L + D + U, where L is the strictly lower triangular part of A, U the strictly upper triangular part, and D the diagonal of A.
◮ M = I : Richardson method,
◮ M = D^{−1} : Jacobi method,
◮ M = (L + D)^{−1} : Gauss-Seidel method.
Notice that M always has a special structure and its inverse must never be explicitly computed: computing z = (L + D)^{−1}y, say, means solving the linear system (L + D)z = y.
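A minimal sketch of scheme (4) with M = D^{−1} (Jacobi), on a matrix made strictly diagonally dominant so that ρ(I − MA) < 1 is guaranteed:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
A = 2.0 * np.eye(n) + rng.standard_normal((n, n)) / n   # strictly diag. dominant
b = rng.standard_normal(n)

d = np.diag(A)                 # applying M = D^{-1} is a pointwise division
x = np.zeros(n)                # any x0 works since rho(I - MA) < 1 here
for k in range(200):
    r = b - A @ x
    if np.linalg.norm(r) <= 1e-12 * np.linalg.norm(b):
        break
    x += r / d                 # x_k = x_{k-1} + M (b - A x_{k-1}), cf. (4)
print(k, np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```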
SLIDE 26 Outline
Reliability of the calculation Algorithm selection Why searching solutions in Krylov subspaces Unsymmetric Krylov solvers based on the Arnoldi procedure Algebraic preconditioning techniques Bibliography
SLIDE 27 Krylov methods: some background
Aleksei Nikolaevich Krylov
1863-1945: Russia, maritime engineer. His research spanned a wide range of topics, including shipbuilding, magnetism, artillery, mathematics, astronomy, and geodesy. In 1904 he built the first machine in Russia for integrating ODEs. In 1931 he published a paper on what is now called the "Krylov subspace".
Definition
Let A ∈ R^{n×n} and b ∈ R^n. The space denoted by K(A, b, m) (with m ≤ n) and defined by
K(A, b, m) = span{b, Ab, ..., A^{m−1}b}
is referred to as the Krylov space of dimension m associated with A and b.
SLIDE 28 Why use this search space?
For the sake of simplicity of exposition, we often assume x_0 = 0. This does not mean a loss of generality: the situation x_0 ≠ 0 can be transformed with a simple shift to the system Ay = b − Ax_0 = b̄, for which obviously y_0 = 0.
The minimal polynomial q(t) of A is the unique monic polynomial of minimal degree such that q(A) = 0. It is constructed from the eigenvalues of A as follows. If the distinct eigenvalues of A are λ_1, ..., λ_ℓ and λ_j has index m_j (the size of the largest Jordan block associated with λ_j), then the sum of all indices is m = Σ_{j=1}^{ℓ} m_j and
q(t) = Π_{j=1}^{ℓ} (t − λ_j)^{m_j}.   (5)
When A is diagonalizable, m = ℓ is the number of distinct eigenvalues of A; when A is a single Jordan block of size n, then m = n.
SLIDE 29
If we write
q(t) = Π_{j=1}^{ℓ} (t − λ_j)^{m_j} = Σ_{j=0}^{m} α_j t^j,
then the constant term is α_0 = Π_{j=1}^{ℓ} (−λ_j)^{m_j}. Therefore α_0 ≠ 0 iff A is nonsingular. Furthermore, from
0 = q(A) = α_0 I + α_1 A + ... + α_m A^m,   (6)
it follows that
A^{−1} = −(1/α_0) Σ_{j=0}^{m−1} α_{j+1} A^j.
This description of A^{−1} portrays x = A^{−1}b immediately as a member of the Krylov space of dimension m associated with A and b, K(A, b, m) = span{b, Ab, ..., A^{m−1}b}.
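A hedged numerical check of this argument on a 2 × 2 example, where the minimal polynomial coincides with the characteristic polynomial:

```python
import numpy as np

A = np.array([[2.0, 1.0], [0.0, 3.0]])   # eigenvalues 2 and 3, so m = 2
b = np.array([1.0, 1.0])

# minimal (= characteristic) polynomial: q(t) = (t-2)(t-3) = t^2 - 5t + 6
a0, a1, a2 = 6.0, -5.0, 1.0
assert np.allclose(a0 * np.eye(2) + a1 * A + a2 * (A @ A), 0.0)   # q(A) = 0, cf. (6)

x = -(a1 * b + a2 * (A @ b)) / a0   # A^{-1} b as a combination of b and Ab
assert np.allclose(A @ x, b)        # indeed x lies in K(A, b, 2)
```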
SLIDE 30
Taxonomy of the Krylov subspace approaches
The Krylov methods for computing x_m ∈ K(A, b, m) can be divided into four classes:
◮ The Ritz-Galerkin approach: construct x_m such that b − Ax_m ⊥ K(A, b, m).
◮ The minimum norm residual approach: construct x_m ∈ K(A, b, m) such that ‖b − Ax_m‖_2 is minimal.
◮ The Petrov-Galerkin approach: construct x_m such that b − Ax_m is orthogonal to some other m-dimensional subspace.
◮ The minimum norm error approach: construct x_m ∈ A^T K(A, b, m) such that ‖x_m − x*‖_2 is minimal.
SLIDE 31
Constructing a basis of K(A, b, m)
◮ The obvious choice b, Ab, ..., A^{m−1}b is not very attractive from a numerical point of view, since the vectors A^j b become more and more collinear with the eigenvector associated with the dominant eigenvalue. In finite precision calculation, this leads to a loss of rank of this set of vectors. Indeed, suppose that A is diagonalizable, A = VDV^{−1}, and denote b̂ = V^{−1}b the components of b in the eigenbasis; then A^m b = V D^m b̂, which is quickly dominated by the components associated with the largest |λ_i|.
◮ A better choice is to use the Arnoldi procedure.
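A sketch of the rank-loss phenomenon (normalizing each power does not help, since only the directions matter):

```python
import numpy as np

n = 200
A = np.diag(np.arange(1.0, n + 1.0))     # simple diagonalizable test matrix
b = np.ones(n)

for m in (5, 10, 15, 20):
    V = np.empty((n, m))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(1, m):
        w = A @ V[:, j - 1]
        V[:, j] = w / np.linalg.norm(w)  # normalized power basis
    print(m, np.linalg.cond(V))          # grows explosively with m
```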
SLIDE 32 Outline
Reliability of the calculation
Algorithm selection
Why search for solutions in Krylov subspaces?
Unsymmetric Krylov solvers based on the Arnoldi procedure
  The FOM variants
  The GMRES variants
  Strategies for restarted GMRES
Algebraic preconditioning techniques
Bibliography
SLIDE 33
Arnoldi
Walter Edwin Arnoldi, 1917-1995: USA. His main research subjects covered vibration of propellers, engines and aircraft, high-speed digital computers, aerodynamics and acoustics of aircraft propellers, lift support in space vehicles, and structural materials. "The principle of minimized iterations in the solution of the matrix eigenvalue problem", Quart. Appl. Math., vol. 9, 1951.
SLIDE 34 The Arnoldi procedure
This procedure builds an orthonormal basis of K(A, b, m).
Arnoldi's algorithm
1: v_1 = b/‖b‖
2: for j = 1, 2, ..., m − 1 do
3:   Compute h_{i,j} = v_i^T A v_j for i = 1, ..., j
4:   Compute w_j = A v_j − Σ_{i=1}^{j} h_{i,j} v_i
5:   Compute h_{j+1,j} = ‖w_j‖
6:   Exit if h_{j+1,j} = 0
7:   Compute v_{j+1} = w_j/h_{j+1,j}
8: end for
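A minimal Python sketch of the procedure, using the modified Gram-Schmidt update (orthogonalizing against each v_i in turn, as the FOM listing on a later slide also does); `arnoldi` is a hypothetical helper name, and the final assertion checks relation (8) of the next slide:

```python
import numpy as np

def arnoldi(A, b, m):
    """Build V_{m+1} and the (m+1) x m Hessenberg matrix Hbar_m."""
    n = b.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):              # orthogonalize against v_1, ..., v_j
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] == 0.0:              # happy breakdown
            return V[:, : j + 1], H[: j + 2, : j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

# sanity check of A V_m = V_{m+1} Hbar_m on a random example
rng = np.random.default_rng(3)
A = rng.standard_normal((60, 60))
b = rng.standard_normal(60)
V, H = arnoldi(A, b, 10)
assert np.allclose(A @ V[:, :10], V @ H)
```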
SLIDE 35 Proposition
Denote by V_m the n × m matrix with column vectors v_1, ..., v_m, by H̄_m the (m + 1) × m Hessenberg matrix whose nonzero entries are the h_{ij}, and by H_m the square matrix obtained from H̄_m by deleting its last row. Then the following relations hold:
A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T   (7)
      = V_{m+1} H̄_m,                        (8)
V_m^T A V_m = H_m.                           (9)
[Block diagram: A·V_m equals V_m·H_m plus the rank-one term w e_m^T.]
SLIDE 36
Surprising consequence of symmetry
◮ When A is symmetric, equation (9) tells us that H_m is tridiagonal.
◮ Consequently, the Arnoldi orthogonalization only needs to be performed against the last two computed basis vectors → short-term recurrence algorithms.
SLIDE 37 Arnoldi for linear systems
This technique belongs to the Ritz-Galerkin class; it seeks an approximate solution x_m in the affine subspace x_0 + K_m by imposing the condition b − Ax_m ⊥ K_m. Take v_1 = b/β with β = ‖b‖ in Arnoldi's method (x_0 = 0). x_m ∈ K_m means that there exists y_m ∈ R^m such that x_m = V_m y_m. Consequently, b − Ax_m ⊥ K_m reads
V_m^T (b − A V_m y_m) = 0 ⇔ βe_1 − H_m y_m = 0 ⇔ y_m = H_m^{−1}(βe_1).
SLIDE 38 The FOM technique
The full orthogonalization method (FOM) is based on this approach.
FOM algorithm
1: Set the initial guess x_0 = 0
2: r_0 = b; β = ‖r_0‖; v_1 = r_0/‖r_0‖
3: for j = 1, 2, ... do
4:   w_j = A v_j
5:   for i = 1 to j do
6:     h_{i,j} = v_i^T w_j
7:     w_j = w_j − h_{i,j} v_i
8:   end for
9:   h_{j+1,j} = ‖w_j‖
10:  If h_{j+1,j} = 0, set m = j and go to 13
11:  v_{j+1} = w_j/h_{j+1,j}
12: end for
13: y_m = H_m^{−1}(βe_1) and x_m = x_0 + V_m y_m.
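As a hedged sketch (not the production algorithm), FOM reduces to a few lines on top of the `arnoldi` helper sketched after the Arnoldi slide; the only new work is the small m × m solve of step 13:

```python
import numpy as np

def fom(A, b, m):
    """FOM with x0 = 0, reusing the hypothetical arnoldi() sketched earlier."""
    beta = np.linalg.norm(b)
    V, Hbar = arnoldi(A, b, m)
    k = Hbar.shape[1]                    # may be smaller than m after a breakdown
    e1 = np.zeros(k)
    e1[0] = beta
    y = np.linalg.solve(Hbar[:k, :k], e1)   # step 13: y_m = H_m^{-1}(beta e_1)
    return V[:, :k] @ y                     # x_m = V_m y_m   (x_0 = 0)
```

Per the proposition on the next slide, ‖b − Ax_m‖ can then be read off as h_{m+1,m}|e_m^T y_m| without ever forming the residual.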
SLIDE 39 The FOM technique
Proposition
The residual vector of the approximate solution x_m computed by FOM is such that
b − Ax_m = −h_{m+1,m} (e_m^T y_m) v_{m+1}   and   ‖b − Ax_m‖ = h_{m+1,m} |e_m^T y_m|.   (10)
Proof:
b − Ax_m = b − A V_m y_m
         = βv_1 − V_m H_m y_m − h_{m+1,m} v_{m+1} e_m^T y_m
         = V_m (βe_1 − H_m y_m) − h_{m+1,m} (e_m^T y_m) v_{m+1}
         = −h_{m+1,m} (e_m^T y_m) v_{m+1}.
SLIDE 40 Cost of FOM
Flops:
1. At step j, the Gram-Schmidt process costs ≈ 4nj flops (each dot product ≈ 2n, each saxpy ≈ 2n).
2. The matrix-vector product costs 2 nnz(A) − n flops.
Over m steps this leads to ≈ 2·m·nnz(A) + 2·m²·n flops.
Storage: the most consuming part is the basis V_m, that is m·n. If the convergence of the algorithm is slow (i.e. m large), this becomes unaffordable.
⇒ Alternatives: restarting (FOM(m)) or truncating (DIOM(m)).
SLIDE 41 The restarted alternative: FOM(m)
FOM(m) algorithm
1: Set the initial guess x_0
2: r_0 = b − Ax_0; β = ‖r_0‖; v_1 = r_0/‖r_0‖
3: for j = 1, ..., m do
4:   w_j = A v_j
5:   for i = 1 to j do
6:     h_{i,j} = v_i^T w_j
7:     w_j = w_j − h_{i,j} v_i
8:   end for
9:   h_{j+1,j} = ‖w_j‖
10:  v_{j+1} = w_j/h_{j+1,j}
11: end for
12: y_m = H_m^{−1}(βe_1) and x_m = x_0 + V_m y_m. If converged then stop.
13: Set x_0 = x_m; go to 2
SLIDE 42 The truncated alternative: DIOM(m)
Governing idea: perform an incomplete orthogonalization in the Arnoldi process (window of width m) and exploit the banded structure of the resulting H_k to perform an incremental LU factorization of H_k, and then derive an update of the iterate at each step of the algorithm.
Advantage: the cost of the orthogonalization is reduced, as well as the storage (only the last m vectors are kept).
Example with k = 5 and m = 3:
H_5 = [ h_11 h_12 h_13  .    .   ]
      [ h_21 h_22 h_23 h_24  .   ]
      [  .   h_32 h_33 h_34 h_35 ]
      [  .    .   h_43 h_44 h_45 ]
      [  .    .    .   h_54 h_55 ]
Notice that we still have A V_k = V_k H_k + h_{k+1,k} v_{k+1} e_k^T, but V_k is no longer orthogonal.
SLIDE 43 The basics of the LU factorization
A = [ a_11  w^T     ]  =  [ 1            0 ] [ a_11  w^T     ]
    [ v     A_{n−1} ]     [ a_11^{−1} v  I ] [ 0     A^{(1)} ]
with A^{(1)} = A_{n−1} − a_11^{−1} v w^T (the Schur complement).
Typical Gaussian elimination step k:
a^{(k+1)}_{ij} = a^{(k)}_{ij} − a^{(k)}_{ik} a^{(k)}_{kj} / a^{(k)}_{kk}.
SLIDE 44
H_6 ≡ [ u_11 u_12 h_13  .    .    .   ]
      [ l_21 u_22 u_23 h_24  .    .   ]
      [  .   l_32 u_33 u_34 h_35  .   ]
      [  .    .   l_43 u_44 u_45 h_46 ]
      [  .    .    .   l_54 u_55 u_56 ]
      [  .    .    .    .   l_65 u_66 ],
where u_56, l_65 and u_66 (i.e. the update of the LU factorization of H_6) can be computed using the trailing block
[ u_44 u_45 h_46 ]
[ l_54 u_55 h_56 ]
[  .   h_65 h_66 ].
SLIDE 45 DIOM(m) algorithm
1: Set the initial guess x_0
2: r_0 = b − Ax_0; β = ‖r_0‖; v_1 = r_0/‖r_0‖
3: for j = 1, 2, ... do
4:   w_j = A v_j
5:   for i = max{1, j − m + 1} to j do
6:     h_{i,j} = v_i^T w_j
7:     w_j = w_j − h_{i,j} v_i
8:   end for
9:   h_{j+1,j} = ‖w_j‖; v_{j+1} = w_j/h_{j+1,j}
10:  Update the LU factorization of H_j using the LU factors of H_{j−1}
11:  ξ_j = {if j = 1 then β, else −l_{j,j−1} ξ_{j−1}}
12:  p_j = u_{jj}^{−1} (v_j − Σ_{i=j−m+1}^{j−1} u_{ij} p_i)   (for i ≤ 0 set u_{ij} p_i = 0)
13:  x_j = x_{j−1} + ξ_j p_j
14: end for

For the sake of simplicity of exposition, the DIOM(m) algorithm has been derived using an LU factorization without pivoting; a variant with partial pivoting exists.
SLIDE 46
The Generalized Minimal RESidual method (GMRES)
The Generalized Minimal RESidual method (GMRES) is based on the minimum norm residual approach. Using the Arnoldi relation A V_m = V_{m+1} H̄_m, the algorithm builds the iterate x_m ∈ K(A, b, m) such that ‖b − Ax_m‖_2 is minimal. Because x_m ∈ K(A, b, m), there exists y such that x_m = V_m y.
SLIDE 47
Central equality
min ‖b − Ax_m‖ = min_y ‖b − A V_m y‖
              = min_y ‖βv_1 − V_{m+1} H̄_m y‖     where β = ‖b‖
              = min_y ‖V_{m+1} (βe_1 − H̄_m y)‖.
Because the columns of V_{m+1} are orthonormal,
‖b − Ax_m‖_2 = ‖βe_1 − H̄_m y‖_2.
SLIDE 48 The GMRES method
The GMRES iterate is the vector of K(A, b, m) such that x_m = V_m y_m, where
y_m = arg min_{y ∈ R^m} ‖βe_1 − H̄_m y‖_2.
The minimizer y_m is inexpensive to compute since it only requires the solution of an (m + 1) × m linear least-squares problem.
SLIDE 49 The GMRES algorithm
GMRES algorithm
1: Set the initial guess x_0; r_0 = b − Ax_0; β = ‖r_0‖
2: v_1 = r_0/‖r_0‖
3: for j = 1, ..., m do
4:   w_j = A v_j
5:   for i = 1 to j do
6:     h_{i,j} = v_i^T w_j; w_j = w_j − h_{i,j} v_i
7:   end for
8:   h_{j+1,j} = ‖w_j‖
9:   If h_{j+1,j} = 0, go to 12
10:  v_{j+1} = w_j/h_{j+1,j}
11: end for
12: Define the (j + 1) × j upper Hessenberg matrix H̄_j
13: Solve the least-squares problem y_j = arg min ‖βe_1 − H̄_j y‖
14: Set x_j = x_0 + V_j y_j
[Y. Saad and M. H. Schultz, GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM Journal on Scientific and Statistical Computing, 1986.]
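A hedged sketch in the same vein as the FOM one: the only change is that the small problem becomes an (m + 1) × m least-squares problem, solved here with a dense `lstsq` for clarity (the incremental Givens QR of the next slide is the practical implementation); `gmres_simple` and `arnoldi` are hypothetical helper names:

```python
import numpy as np

def gmres_simple(A, b, m):
    """Unrestarted GMRES with x0 = 0, reusing the arnoldi() sketch."""
    beta = np.linalg.norm(b)
    V, Hbar = arnoldi(A, b, m)
    k = Hbar.shape[1]
    rhs = np.zeros(k + 1)
    rhs[0] = beta
    # y_m = argmin || beta e_1 - Hbar_m y ||_2
    y, *_ = np.linalg.lstsq(Hbar, rhs, rcond=None)
    return V[:, :k] @ y
```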
SLIDE 50
Incremental QR factorization
The QR factorization of H̄_{m+1} can be cheaply computed from the QR factorization of H̄_m:
H̄_{m+1} = [ H̄_m   h           ]
          [ 0     h_{m+2,m+1} ],   where h = (h_{1,m+1}, ..., h_{m+1,m+1})^T.
It is enough to apply the m Givens rotations computed to factor H̄_m to the last column of H̄_{m+1}, then build and apply an (m + 1)-st rotation to zero out the h_{m+2,m+1} entry.
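A sketch of one step of this update, under the convention G_i = [[c_i, s_i], [−s_i, c_i]]; here `rots` holds the rotations already used to factor H̄_m and `h` is the new column of H̄_{m+1} (the helper name and data layout are assumptions for illustration):

```python
import numpy as np

def apply_and_extend(rots, h):
    """Apply the m stored Givens rotations to the new column h (length m+2),
    then create the (m+1)-st rotation that zeroes its last entry."""
    h = h.copy()
    for i, (c, s) in enumerate(rots):        # the m rotations that factored Hbar_m
        h[i], h[i + 1] = c * h[i] + s * h[i + 1], -s * h[i] + c * h[i + 1]
    rho = np.hypot(h[-2], h[-1])
    c, s = h[-2] / rho, h[-1] / rho          # new rotation: zeroes h_{m+2,m+1}
    h[-2], h[-1] = rho, 0.0
    rots.append((c, s))
    return rots, h                           # h is now the new column of R
```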
SLIDE 51
Happy breakdown in GMRES
There is one possible breakdown in the GMRES algorithm, when h_{k+1,k} = 0, which prevents increasing the dimension of the search space.
Proposition
Let A be a nonsingular matrix. Then the GMRES algorithm breaks down at step k (i.e. h_{k+1,k} = 0) iff the iterate x_k is the exact solution of Ax = b.
SLIDE 52 Cost of GMRES
◮ If all but the last Givens rotation applied by GMRES are used, we end up with a QR factorization of H_m that can be used by FOM to compute H_m^{−1}(βe_1).
◮ The memory and computational costs of GMRES are very similar to those of FOM.
◮ To alleviate the cost of the computation (the Gram-Schmidt process) and reduce the memory requirement, the two ideas implemented in FOM(m) and DIOM(m) can be applied to GMRES:
1. restarting: GMRES(m),
2. truncating: DQGMRES(m) (Direct Quasi-GMRES).
SLIDE 53 The restarted alternative: GMRES(m)
GMRES(m) algorithm
1: Set the initial guess x_0
2: r_0 = b − Ax_0; β = ‖r_0‖; v_1 = r_0/‖r_0‖
3: for j = 1, ..., m do
4:   w_j = A v_j
5:   for i = 1 to j do
6:     h_{i,j} = v_i^T w_j; w_j = w_j − h_{i,j} v_i
7:   end for
8:   h_{j+1,j} = ‖w_j‖; v_{j+1} = w_j/h_{j+1,j}
9: end for
10: y_m = arg min ‖βe_1 − H̄_m y‖ and x_m = x_0 + V_m y_m. If converged then stop.
11: Set x_0 = x_m; go to 2
SLIDE 54 The DQGMRES(m) algorithm
DQGMRES(m) algorithm
1: Set the initial guess x_0; r_0 = b − Ax_0; γ_1 = β = ‖r_0‖; v_1 = r_0/‖r_0‖
2: for j = 1, ..., k do
3:   w_j = A v_j
4:   for i = max{1, j − m + 1} to j do
5:     h_{i,j} = v_i^T w_j; w_j = w_j − h_{i,j} v_i
6:   end for
7:   h_{j+1,j} = ‖w_j‖; v_{j+1} = w_j/h_{j+1,j}
8:   Update the QR factorization of H̄_j:
9:   for i = max{1, j − m}, ..., j − 1 do
10:    Apply Q_i to the last column of H̄_j
11:  end for
12:  Apply Q_j to H̄_j, that is: compute c_j and s_j, set γ_{j+1} = −s_j γ_j, γ_j = c_j γ_j, h_{j,j} = c_j h_{j,j} + s_j h_{j+1,j}
13:  p_j = h_{j,j}^{−1} (v_j − Σ_{i=j−m}^{j−1} h_{i,j} p_i)   (for i ≤ 0 set h_{i,j} p_i = 0); x_j = x_{j−1} + γ_j p_j
14:  If converged then stop
15: end for
[Y. Saad and K. Wu, DQGMRES: a direct quasi-minimal residual algorithm based on incomplete orthogonalisation, Numerical Linear Algebra with Applications, 1996.]
SLIDE 55 Some relations between these Krylov methods
Proposition
Assume that m steps of GMRES and FOM have been performed (the steps of FOM with a singular H_m are skipped). Let ‖r^{FOM}_{m⋆}‖ be the smallest residual norm achieved in the first m steps of FOM, and ‖r^{GMRES}_m‖ the residual norm associated with the iterate computed by GMRES. We have:
‖r^{GMRES}_m‖ ≤ ‖r^{FOM}_{m⋆}‖ ≤ √m ‖r^{GMRES}_m‖.

Assume that V_{m+1}, the matrix of Arnoldi vectors generated by DQGMRES, is of full rank. Then we have the following relation between ‖r^{DQGMRES}_m‖, the residual norm associated with the iterate computed by DQGMRES at step m, and ‖r^{GMRES}_m‖, the residual norm associated with the iterate computed by GMRES:
‖r^{DQGMRES}_m‖ ≤ κ(V_{m+1}) ‖r^{GMRES}_m‖.
SLIDE 56
Strategies at restart
Motivations
◮ In GMRES(m), only the approximate solution is kept between cycles/restarts.
◮ It is often observed that part of the spectrum slows down convergence.
◮ Attempt to augment/deflate the search space with approximate eigenvectors extracted from the Krylov subspace at the end of each cycle.
SLIDE 57 Outline
Reliability of the calculation
Algorithm selection
Why search for solutions in Krylov subspaces?
Unsymmetric Krylov solvers based on the Arnoldi procedure
Algebraic preconditioning techniques
  Driving principles
  Preconditioner taxonomy and location
  Some classical algebraic preconditioners
Bibliography
SLIDE 58
Algebraic preconditioners
SLIDE 59
Outline
◮ Some background
◮ Governing principles
◮ Preconditioner taxonomy and examples
◮ Spectral corrections
SLIDE 60 Some properties of Krylov solvers
◮ CG:
‖x_k − x*‖_A = min_{p ∈ P_k, p(0)=1} ‖p(A)(x_0 − x*)‖_A,
where P_k is the set of polynomials of degree at most k.
◮ GMRES, MINRES:
‖r_k‖_2 = min_{p ∈ P_k, p(0)=1} ‖p(A) r_0‖_2.
◮ Consequence: if A is diagonalizable with m distinct eigenvalues, then CG, GMRES, MINRES converge in at most m steps (Cayley-Hamilton theorem - minimal polynomial).
◮ Consequence: if r_0 has only k nonzero components in the eigenbasis, then CG, GMRES, MINRES converge in at most k steps.
SLIDE 61 Some properties of Krylov solvers
◮ Bound on the rate of convergence of CG:
‖x_k − x*‖_A ≤ 2 ((√κ(A) − 1) / (√κ(A) + 1))^k ‖x_0 − x*‖_A.
SLIDE 62
Driving principles to design preconditioners
Find a nonsingular matrix M such that MA has "better" properties w.r.t. the convergence behaviour of the selected Krylov solver:
◮ MA has fewer distinct eigenvalues,
◮ MA ≈ I in some sense.
SLIDE 63
The preconditioner constraints
The preconditioner should
◮ be cheap to compute and to store,
◮ be cheap to apply,
◮ ensure fast convergence.
With a good preconditioner, the solution time for the preconditioned system should be significantly less than for the unpreconditioned system.
SLIDE 64 MA has fewer distinct eigenvalues: an example
Let
A = [ A  B^T ]          P = [ A   0            ]
    [ C  0   ],             [ 0   C A^{−1} B^T ].
Then P^{−1}A has three distinct eigenvalues.
[Murphy, Golub, Wathen, SIAM J. Sci. Comput., 21 (6), 2000]
SLIDE 65
Preconditioner taxonomy
There are two main classes of preconditioners ◮ Implicit preconditioners: approximate A with a matrix M such that solving the linear system Mz = r is easy. ◮ Explicit preconditioners: approximate A−1 with a matrix M and just perform z = Mr. The governing ideas in the design of the preconditioners are very similar to those followed to define iterative stationary schemes. Consequently, all the stationary methods can be used to define preconditioners.
SLIDE 66
Stationary methods
Let x_0 be given and M ∈ R^{n×n} a nonsingular matrix; compute x_k = x_{k−1} + M(b − Ax_{k−1}). Note that b − Ax_{k−1} = A(x* − x_{k−1}) ⇒ the best M is A^{−1}. The stationary scheme converges to x* = A^{−1}b for any x_0 iff ρ(I − MA) < 1, where ρ(·) denotes the spectral radius. Let A = L + D + U:
◮ M = I : Richardson method,
◮ M = D^{−1} : Jacobi method,
◮ M = (L + D)^{−1} : Gauss-Seidel method.
Notice that M always has a special structure and the inverse must never be explicitly computed (z = B^{−1}y means: solve the linear system Bz = y).
SLIDE 67
Preconditioner location
Several possibilities exist to solve Ax = b:
◮ Left preconditioning: MAx = Mb.
◮ Right preconditioning: AMy = b with x = My.
◮ Split preconditioning, if M = M_1 M_2: M_2AM_1y = M_2b with x = M_1y.
Notice that the spectra of MA, AM and M_2AM_1 are identical (for any square matrices B and C, the eigenvalues of BC are the same as those of CB).
SLIDE 68 Preconditioner location vs. stopping criterion
The stopping criterion is based on the backward error
η^N_{A,b} = ‖b − Ax‖ / (‖A‖‖x‖ + ‖b‖) < ε   or   η^N_b = ‖b − Ax‖ / ‖b‖ < ε.
In PCG we can still compute η. For GMRES, using a preconditioner means running GMRES on:
Left precond.: MAx = Mb   |   Right precond.: AMy = b   |   Split precond.: M_2AM_1y = M_2b
The free estimate of the residual norm provided by GMRES is associated with the preconditioned system.
SLIDE 69 Preconditioner location vs. stopping criterion (cont)

             Left precond.                    Right precond.               Split precond.
η_M(A, b):   ‖MAx − Mb‖/(‖MA‖‖x‖ + ‖Mb‖)     ‖AMy − b‖/(‖AM‖‖y‖ + ‖b‖)    ‖M_2AM_1y − M_2b‖/(‖M_2AM_1‖‖y‖ + ‖M_2b‖)
η_M(b):      ‖MAx − Mb‖/‖Mb‖                 ‖AMy − b‖/‖b‖                ‖M_2AM_1y − M_2b‖/‖M_2b‖

Using η_M(b) for a right preconditioned linear system will monitor the convergence as if no preconditioner were used.
SLIDE 70
Some classical algebraic preconditioners
◮ Incomplete factorizations: IC, ILU(p), ILU(p, τ).
◮ SPAI (Sparse Approximate Inverse): compute a sparse approximate inverse M by minimizing the Frobenius norm ‖MA − I‖_F.
◮ FSAI (Factorized Sparse Approximate Inverse): compute a sparse approximate inverse G of the Cholesky factor by minimizing the Frobenius norm ‖I − GL‖_F.
◮ AINV (Approximate Inverse): compute a sparse approximate inverse of the LDU or LDL^T factors using an incomplete biconjugation process.
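A hedged end-to-end example with SciPy (the parameter names, e.g. `rtol`, are those of recent SciPy versions; older ones use `tol`): an incomplete LU factorization used as an implicit preconditioner for GMRES, where z = Mr is computed by two sparse triangular solves inside `ilu.solve` and never by forming an inverse.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# 2D Poisson matrix (5-point stencil), a standard sparse test problem
n = 50
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A = sp.kronsum(T, T).tocsc()
b = np.ones(A.shape[0])

ilu = spla.spilu(A, drop_tol=1e-4, fill_factor=10)   # threshold-based ILU
M = spla.LinearOperator(A.shape, ilu.solve)          # z = M r <=> solve (LU) z = r

x, info = spla.gmres(A, b, M=M, rtol=1e-10, restart=30)
print(info, np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```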
SLIDE 71 Outline
Reliability of the calculation
Algorithm selection
Why search for solutions in Krylov subspaces?
Unsymmetric Krylov solvers based on the Arnoldi procedure
Algebraic preconditioning techniques
Bibliography
SLIDE 72
- M. Benzi, Preconditioning Techniques for Large Linear Systems: A Survey, Journal of Computational Physics, 182, pp. 418-477, 2002.
- A. Greenbaum, Iterative Methods for Solving Linear Systems, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1997.
- J. Liesen and Z. Strakoš, Krylov Subspace Methods: Principles and Analysis, Oxford Science Publications, 2013.
- G. Meurant, Computer Solution of Large Linear Systems, vol. 28 of Studies in Mathematics and its Applications, North-Holland Publishing Co., Amsterdam, 1999.
- Y. Saad, Iterative Methods for Sparse Linear Systems, SIAM, second edition, 2003.
- V. Simoncini and D. Szyld, Recent computational developments in Krylov subspace methods for linear systems, Numer. Linear Algebra Appl., 14, pp. 1-59, 2007.
- H. A. van der Vorst, Iterative Krylov Methods for Large Linear Systems, Cambridge Monographs on Applied and Computational Mathematics, Cambridge University Press, Cambridge, 2003.