Analysis of Krylov subspace methods: Moments, error estimators, - PowerPoint PPT Presentation

Analysis of Krylov subspace methods: Moments, error estimators, numerical stability and unexpected consequences Zdenˇ ek Strakoš Academy of Sciences and Charles University, Prague http://www.cs.cas.cz/˜strakos A joint work with Dianne P . O’Leary, Chris C. Paige and Petr Tichý New trends in scientific computing, CIRM, Luminy, February 2009.

Projections on nested subspaces A ∈ R N × N A x = b, A n x n = b n x n approximates the solution x using the subspace of small dimension. Z. Strakoš 2

Projection processes x n ∈ x 0 + S n , r n ≡ b − Ax n ∈ r 0 + A S n , where the constraints needed to determine x n are given by n ⊕ A S n = R N , C ⊥ r n ⊥ C n , S n is the search space, C n is the constraint space. r n ∈ C ⊥ r 0 = r 0 | A S n + r 0 | C ⊥ n ≡ r 0 | A S n + r n , n , the projection should be called orthogonal if C n = A S n , and it should be called oblique otherwise. Illustrations using conjugate gradients and GMRES will be given in the next few slides. Z. Strakoš 3

Krylov subspace methods S n ≡ K n ≡ K n ( A, r 0 ) ≡ span { r 0 , Ar 0 , · · · , A n − 1 r 0 } . Krylov subspaces accumulate the dominant information of A with respect to r 0 . Unlike in the power method for computing the single dominant eigenspace, here all the information accumulated along the way is used, see Parlett (1980), Example 12.1.1. The idea of projections using Krylov subspaces is in a fundamental way linked with the problem of moments. Z. Strakoš 4

Conjugate gradients (CG) � x − x n � A = u ∈ x 0 + K n ( A,r 0 ) � x − u � A min with the formulation via the Lanczos process, w 1 = r 0 / � r 0 � , A W n = W n T n + δ n +1 w n +1 e T T n = W T n , n ( A ) A W n ( A ) , and the CG approximation given by T n y n = � r 0 � e 1 , x n = x 0 + W n y n . In terms of projection processes r n = ( − 1) n w n +1 � r n � ⊥ R ( W n ) = C n . S n = C n ≡ K n ( A, r 0 ) = R ( W n ) , 5 Z. Strakoš

Motivation I: CG in finite precision arithmetic 0 10 −10 10 squared energy norm − FP CG squared energy norm − exact CG 0 5 10 15 20 iteration n 0 difference 10 −10 10 0 5 10 15 20 iteration n Z. Strakoš 6

Motivation I: CG in finite precision arithmetic A is diagonal positive definite, see S (1991), Greenbaum, S (1992), λ i = λ 1 + i − 1 N − 1 ( λ N − λ 1 ) γ N − i , i = 2 , . . . , N − 1 , λ 1 = 0 . 1 , λ N = 100 , N = 24 , γ = 0 . 55 . In the experiment we take Initial residual has been generated randomly. 7 Z. Strakoš

Motivation I: Observations ● Rounding errors can cause large delays. ● They may cause an irregular staircase -like behaviour. ● Local decrease of error says nothing about the total error. ● Stopping criteria must be based on global information. ● It must be justified by rigorous rounding error analysis. Golub and S (1994), S and Tichý (2002, 2005), Meurant and S (Acta Numerica, 2006) Comput. Methods Appl. Mech. Engrg. (2003). Z. Strakoš 8

Error estimation in numerical PDE Three reasons for including the algebraic errors into the a -posteriori error estimates: ||| p − p ( n ) ● We need to estimate h ||| , not ||| p − p h ||| . ● It can make computations efficient by stopping the iterations after the algebraic error drops to the level not affecting significantly the total error. ● The mesh refinement can in the presence of singularity pass the difficulty from discretisation to computation. Are there practical examples, when the discretisation error is small while the algebraic error due to roundoff is large? S and Liesen (2005), Jiránek, Vohralík and S (2008). Z. Strakoš 9

Generalized minimal residuals (GMRES) � b − A x n � = u ∈ x 0 + K n ( A,r 0 ) � b − A u � min with the formulation via the Arnoldi process H n +1 ,n = W T A W n = W n +1 H n +1 ,n , n +1 ( A ) A W n ( A ) , and the GMRES approximation given by y n = arg min �� r 0 � e 1 − H n +1 ,n u � , x n = x 0 + W n y n . u In terms of projection processes S n ≡ K n ( A, r 0 ) = R ( W n ) , C n ≡ A K n ( A, r 0 ) = R ( AW n ) , r n ⊥ C n . Z. Strakoš 10

Motivation II: MGS GMRES in finite precision 0 10 −5 10 −10 10 residual smooth ubound −15 backward error 10 loss of orthogonality approximate solution error 0 100 200 300 400 500 600 700 800 iteration number Sherman2 from Matrix market, problem rhs. Z. Strakoš 11

Motivation II: Observations ● Despite the loss of orthogonality, the modified Gram -Schmidt implementation is as accurate as the Householder reflections-based implementation. ● There is no delay due to rounding errors. ● Loss of orthogonality seems inversely proportional to the normwise backward error. ● Full loss of orthogonality means that the normwise backward error is proportional to machine precision. Z. Strakoš 12

Outline 1. Gauss -Christoffel quadrature, moments and CG 2. Convergence of CG in the presence of close eigenvalues 3. Gauss-Christoffel quadrature can be sensitive to small perturbations of the distribution function 4. Sensitivity of Jacobi matrices to spectral data 5. Back to CG in finite precision arithmetic 6. From MGS GMRES to updating of singular values 7. Golub-Kahan bidiagonalization as a fundamental decomposition of data and core problems in errors-in-variables modeling 8. MGS GMRES is normwise backward stable Z. Strakoš 13

From Gauss -Christoffel quadrature to conjugate gradients and back 1. Gauss-Christoffel quadrature, moments and CG Z. Strakoš 14

1 : Matching moments Consider a non -decreasing distribution function ω ( λ ) , λ ≥ 0 with the moments � ∞ λ k dω ( λ ) , ξ k = k = 0 , 1 , . . . . 0 λ ( n ) Find the distribution function ω ( n ) ( λ ) with n points of increase i which matches the first 2 n moments for the distribution function ω ( λ ) , � ∞ n λ k dω ( n ) ( λ ) ≡ ) k = ξ k , ω ( n ) ( λ ( n ) � k = 0 , 1 , . . . , 2 n − 1 . i i 0 i =1 Z. Strakoš 15

1 : Gauss -Christoffel quadrature Clearly, � ∞ n λ k dω ( λ ) = ) k , ω ( n ) ( λ ( n ) � k = 0 , 1 , . . . , 2 n − 1 i i 0 i =1 represents the n -point Gauss-Christoffel quadrature, see C. F . Gauss, Methodus nova integralium valores per approximationem inveniendi, (1814) C. G. J. Jacobi, Uber Gauss’ neue Methode, die Werthe der Integrale näherungsweise zu finden, (1826) With no loss of generality we assume ξ 0 = 1 . Z. Strakoš 16

1 : Stieltjes recurrence Let p 1 ( λ ) ≡ 1 , p 2 ( λ ) , . . . , p n +1 ( λ ) be the first n + 1 orthonormal ω ( λ ) . polynomials corresponding to the distribution function P n ( λ ) = ( p 1 ( λ ) , . . . , p n ( λ )) T , Then, writing λ P n ( λ ) = T n P n ( λ ) + δ n +1 p n +1 ( λ ) e n represents the Stieltjes recurrence, with the Jacobi matrix   γ 1 δ 2 ...   δ 2 γ 2   T n ≡ , δ l > 0 .   ... ... δ n       δ n γ n Z. Strakoš 17

1 : Algebraic connection - Lanczos Algebraically, T n represents the result of the Lanczos process applied to T n with the starting vector e 1 . Therefore p 1 ( λ ) ≡ 1 , p 2 ( λ ) , . . . , p n ( λ ) are orthonormal with respect to the innerproduct n , e 1 ) | 2 p s ( θ ( n ) | ( z ( n ) ) p t ( θ ( n ) � ( p s , p t ) ≡ ) , i i i i =1 z ( n ) where is the orthonormal eigenvector of T n corresponding to the i θ ( n ) θ ( n ) eigenvalue , and p n +1 ( λ ) has the roots , i = 1 , . . . , n . i i Consequently, , e 1 ) | 2 , = | ( z ( n ) λ ( n ) = θ ( n ) ω n , i i i i Golub and Welsh (1969), . . . . . . , Meurant and S (2006). Z. Strakoš 18

1 : Vector moment problem in CG A ∈ R N × N , r 0 = b − Ax 0 , w 1 = r 0 / � r 0 � . Given Ax = b with an SPD dim( K n ( A, r 0 )) = n . Assume, for simplicity of notation, Find a linear SPD operator A n on K n ( A, r 0 ) such that A n w 1 = A w 1 , A n ( A w 1 ) ≡ A 2 A 2 w 1 , n w 1 = . . . A n ( A n − 2 w 1 ) ≡ A n − 1 A n − 1 w 1 , w 1 = n A n ( A n − 1 w 1 ) ≡ A n Q n ( A n w 1 ) , n w 1 = where Q n projects onto K n orthogonally to C n ≡ K n . Vorobyev (1958 R., 1965 E.), Brezinski (1997), S (2008) Z. Strakoš 19

1 : Scalar and vector moment problems With the spectral decomposition of A and A n we get the scalar formulation for matching the 2 n scalar moments given above. Vorobyev (1958 R.), Chapter III, with references to Lanczos (1950, 1952), Hestenes and Stiefel (1952), Ljusternik (1956 R., Solution of problems in linear algebra by the method of continued fractions ), see also Stiefel (1958), Rutishauser (1954, 1959), . . . Extensions to the non -Hermitian Lanzczos and Arnoldi algorithms given in S (2008). Z. Strakoš 20

1 : Matching moments model reduction Estimation of errors in iterative methods, numerical approximation of the c ∗ A − 1 b scattering amplitude, application in sciences Dahlquist, Golub and Nash (1978), . . . , Golub and Meurant (1994), . . . , Meurant, S (2006); Gordon (1968), . . . Fischer and Freund (1992), Freund and Hochbruck (1993), . . . , Saylor and Smolarski (2001), Golub, Stoll and Wathen (2009), S and Tichý (2009) Model reduction in approximation of large scale dynamical systems, in T ( s ) = c ∗ ( sE − A ) − 1 b , s ∈ C A ⊂ C terms of the transfer function Gragg and Lindquist (1983), Villemagne and Skelton (1987), Gallivan, Grimme and Van Dooren (1994), . . . , Bai (2002), Freund (2003), Antoulas (2005), . . . , Guercin and Willcox (2008). Z. Strakoš 21

Analysis of Krylov subspace methods: Moments, error estimators, - PowerPoint PPT Presentation

Analysis of Krylov subspace methods: Moments, error estimators, numerical stability and unexpected consequences Zden ek Strako Academy of Sciences and Charles University, Prague http://www.cs.cas.cz/strakos A joint work with Dianne P

Iterative Krylov Subspace Methods for Sparse Reconstruction James Nagy Mathematics and Computer

Multilevel Krylov Methods Deflation Deflation, DD, MG Reinhard Nabben Multilevel Krylov

Whats so great about Krylov subspaces? David S. Watkins Department of Mathematics Washington

Subspace Polynomials and Cyclic Subspace Codes Netanel Raviv Joint work with: Prof. Tuvi Etzion

Inverse Free Preconditioned Krylov Subspace Method for Symmetric Generalized Eigenvalue Problems

Moments, Krylov subspace methods and model reduction Zden ek Strako Academy of Sciences and

Iterative methods Krylov subspace methods for symmetric eigenvalue problems Jacobi-Davidson

Subspace Information Criterion Subspace Information Criterion for Image Restoration for Image

Graph based Subspace Segmentation Canyi Lu National University of Singapore Nov. 21, 2013

The use of stopping criteria for iterative Krylov methods in designing adaptive methods for PDEs

Modern Krylov subspace methods (and applications to parabolic control problems) Daniel B. Szyld

Krylov subspace methods for Perron-Frobenius operators in RKHS Yuka Hashimoto Takashi Nodera

Communication-avoiding Krylov subspace methods Mark Hoemmen mhoemmen@cs.berkeley.edu University

Krylov subspace methods for eigenvalue problems David S. Watkins watkins@math.wsu.edu

Stability of Krylov Subspace Spectral Methods James V. Lambers Department of Energy Resources

Exponential Integrators using Matrix Functions: Krylov Subspace Methods and Chebyshev Expansion

Using Routing Policies Using Routing Policies Mark Prior Mark Prior Core Engineering Group

Comparing the Performance of Abstract Syntax Notation One (ASN.1) vs. eXtensible Markup Language

The Role of the Head in the Interpretation of English Deverbal Compounds Gianina Iordchioaia i ,

Corporation for National and Community Service Programs and Funding Nov. 6, 2019 Welcome

Cryptographic Approaches for Securing Routing Protocols Adrian Perrig perrig@cmu.edu Why Secure

Eigenspace estimation for source localization using large random matrices Pascal Vallet (1) Joint

Brief Introduction to the Internet Standard Management Framework Part 2: Structure of Management

Prolongable Satisfaction Classes and Iterations of Uniform Reflection over PA Mateusz eyk