 
              Analysis of Krylov subspace methods: Moments, error estimators, numerical stability and unexpected consequences Zdenˇ ek Strakoš Academy of Sciences and Charles University, Prague http://www.cs.cas.cz/˜strakos A joint work with Dianne P . O’Leary, Chris C. Paige and Petr Tichý New trends in scientific computing, CIRM, Luminy, February 2009.
Projections on nested subspaces A ∈ R N × N A x = b, A n x n = b n x n approximates the solution x using the subspace of small dimension. Z. Strakoš 2
Projection processes x n ∈ x 0 + S n , r n ≡ b − Ax n ∈ r 0 + A S n , where the constraints needed to determine x n are given by n ⊕ A S n = R N , C ⊥ r n ⊥ C n , S n is the search space, C n is the constraint space. r n ∈ C ⊥ r 0 = r 0 | A S n + r 0 | C ⊥ n ≡ r 0 | A S n + r n , n , the projection should be called orthogonal if C n = A S n , and it should be called oblique otherwise. Illustrations using conjugate gradients and GMRES will be given in the next few slides. Z. Strakoš 3
Krylov subspace methods S n ≡ K n ≡ K n ( A, r 0 ) ≡ span { r 0 , Ar 0 , · · · , A n − 1 r 0 } . Krylov subspaces accumulate the dominant information of A with respect to r 0 . Unlike in the power method for computing the single dominant eigenspace, here all the information accumulated along the way is used, see Parlett (1980), Example 12.1.1. The idea of projections using Krylov subspaces is in a fundamental way linked with the problem of moments. Z. Strakoš 4
Conjugate gradients (CG) � x − x n � A = u ∈ x 0 + K n ( A,r 0 ) � x − u � A min with the formulation via the Lanczos process, w 1 = r 0 / � r 0 � , A W n = W n T n + δ n +1 w n +1 e T T n = W T n , n ( A ) A W n ( A ) , and the CG approximation given by T n y n = � r 0 � e 1 , x n = x 0 + W n y n . In terms of projection processes r n = ( − 1) n w n +1 � r n � ⊥ R ( W n ) = C n . S n = C n ≡ K n ( A, r 0 ) = R ( W n ) , 5 Z. Strakoš
Motivation I: CG in finite precision arithmetic 0 10 −10 10 squared energy norm − FP CG squared energy norm − exact CG 0 5 10 15 20 iteration n 0 difference 10 −10 10 0 5 10 15 20 iteration n Z. Strakoš 6
Motivation I: CG in finite precision arithmetic A is diagonal positive definite, see S (1991), Greenbaum, S (1992), λ i = λ 1 + i − 1 N − 1 ( λ N − λ 1 ) γ N − i , i = 2 , . . . , N − 1 , λ 1 = 0 . 1 , λ N = 100 , N = 24 , γ = 0 . 55 . In the experiment we take Initial residual has been generated randomly. 7 Z. Strakoš
Motivation I: Observations ● Rounding errors can cause large delays. ● They may cause an irregular staircase -like behaviour. ● Local decrease of error says nothing about the total error. ● Stopping criteria must be based on global information. ● It must be justified by rigorous rounding error analysis. Golub and S (1994), S and Tichý (2002, 2005), Meurant and S (Acta Numerica, 2006) Comput. Methods Appl. Mech. Engrg. (2003). Z. Strakoš 8
Error estimation in numerical PDE Three reasons for including the algebraic errors into the a -posteriori error estimates: ||| p − p ( n ) ● We need to estimate h ||| , not ||| p − p h ||| . ● It can make computations efficient by stopping the iterations after the algebraic error drops to the level not affecting significantly the total error. ● The mesh refinement can in the presence of singularity pass the difficulty from discretisation to computation. Are there practical examples, when the discretisation error is small while the algebraic error due to roundoff is large? S and Liesen (2005), Jiránek, Vohralík and S (2008). Z. Strakoš 9
Generalized minimal residuals (GMRES) � b − A x n � = u ∈ x 0 + K n ( A,r 0 ) � b − A u � min with the formulation via the Arnoldi process H n +1 ,n = W T A W n = W n +1 H n +1 ,n , n +1 ( A ) A W n ( A ) , and the GMRES approximation given by y n = arg min �� r 0 � e 1 − H n +1 ,n u � , x n = x 0 + W n y n . u In terms of projection processes S n ≡ K n ( A, r 0 ) = R ( W n ) , C n ≡ A K n ( A, r 0 ) = R ( AW n ) , r n ⊥ C n . Z. Strakoš 10
Motivation II: MGS GMRES in finite precision 0 10 −5 10 −10 10 residual smooth ubound −15 backward error 10 loss of orthogonality approximate solution error 0 100 200 300 400 500 600 700 800 iteration number Sherman2 from Matrix market, problem rhs. Z. Strakoš 11
Motivation II: Observations ● Despite the loss of orthogonality, the modified Gram -Schmidt implementation is as accurate as the Householder reflections-based implementation. ● There is no delay due to rounding errors. ● Loss of orthogonality seems inversely proportional to the normwise backward error. ● Full loss of orthogonality means that the normwise backward error is proportional to machine precision. Z. Strakoš 12
Outline 1. Gauss -Christoffel quadrature, moments and CG 2. Convergence of CG in the presence of close eigenvalues 3. Gauss-Christoffel quadrature can be sensitive to small perturbations of the distribution function 4. Sensitivity of Jacobi matrices to spectral data 5. Back to CG in finite precision arithmetic 6. From MGS GMRES to updating of singular values 7. Golub-Kahan bidiagonalization as a fundamental decomposition of data and core problems in errors-in-variables modeling 8. MGS GMRES is normwise backward stable Z. Strakoš 13
From Gauss -Christoffel quadrature to conjugate gradients and back 1. Gauss-Christoffel quadrature, moments and CG Z. Strakoš 14
1 : Matching moments Consider a non -decreasing distribution function ω ( λ ) , λ ≥ 0 with the moments � ∞ λ k dω ( λ ) , ξ k = k = 0 , 1 , . . . . 0 λ ( n ) Find the distribution function ω ( n ) ( λ ) with n points of increase i which matches the first 2 n moments for the distribution function ω ( λ ) , � ∞ n λ k dω ( n ) ( λ ) ≡ ) k = ξ k , ω ( n ) ( λ ( n ) � k = 0 , 1 , . . . , 2 n − 1 . i i 0 i =1 Z. Strakoš 15
1 : Gauss -Christoffel quadrature Clearly, � ∞ n λ k dω ( λ ) = ) k , ω ( n ) ( λ ( n ) � k = 0 , 1 , . . . , 2 n − 1 i i 0 i =1 represents the n -point Gauss-Christoffel quadrature, see C. F . Gauss, Methodus nova integralium valores per approximationem inveniendi, (1814) C. G. J. Jacobi, Uber Gauss’ neue Methode, die Werthe der Integrale näherungsweise zu finden, (1826) With no loss of generality we assume ξ 0 = 1 . Z. Strakoš 16
1 : Stieltjes recurrence Let p 1 ( λ ) ≡ 1 , p 2 ( λ ) , . . . , p n +1 ( λ ) be the first n + 1 orthonormal ω ( λ ) . polynomials corresponding to the distribution function P n ( λ ) = ( p 1 ( λ ) , . . . , p n ( λ )) T , Then, writing λ P n ( λ ) = T n P n ( λ ) + δ n +1 p n +1 ( λ ) e n represents the Stieltjes recurrence, with the Jacobi matrix   γ 1 δ 2 ...   δ 2 γ 2   T n ≡ , δ l > 0 .   ... ... δ n       δ n γ n Z. Strakoš 17
1 : Algebraic connection - Lanczos Algebraically, T n represents the result of the Lanczos process applied to T n with the starting vector e 1 . Therefore p 1 ( λ ) ≡ 1 , p 2 ( λ ) , . . . , p n ( λ ) are orthonormal with respect to the innerproduct n , e 1 ) | 2 p s ( θ ( n ) | ( z ( n ) ) p t ( θ ( n ) � ( p s , p t ) ≡ ) , i i i i =1 z ( n ) where is the orthonormal eigenvector of T n corresponding to the i θ ( n ) θ ( n ) eigenvalue , and p n +1 ( λ ) has the roots , i = 1 , . . . , n . i i Consequently, , e 1 ) | 2 , = | ( z ( n ) λ ( n ) = θ ( n ) ω n , i i i i Golub and Welsh (1969), . . . . . . , Meurant and S (2006). Z. Strakoš 18
1 : Vector moment problem in CG A ∈ R N × N , r 0 = b − Ax 0 , w 1 = r 0 / � r 0 � . Given Ax = b with an SPD dim( K n ( A, r 0 )) = n . Assume, for simplicity of notation, Find a linear SPD operator A n on K n ( A, r 0 ) such that A n w 1 = A w 1 , A n ( A w 1 ) ≡ A 2 A 2 w 1 , n w 1 = . . . A n ( A n − 2 w 1 ) ≡ A n − 1 A n − 1 w 1 , w 1 = n A n ( A n − 1 w 1 ) ≡ A n Q n ( A n w 1 ) , n w 1 = where Q n projects onto K n orthogonally to C n ≡ K n . Vorobyev (1958 R., 1965 E.), Brezinski (1997), S (2008) Z. Strakoš 19
1 : Scalar and vector moment problems With the spectral decomposition of A and A n we get the scalar formulation for matching the 2 n scalar moments given above. Vorobyev (1958 R.), Chapter III, with references to Lanczos (1950, 1952), Hestenes and Stiefel (1952), Ljusternik (1956 R., Solution of problems in linear algebra by the method of continued fractions ), see also Stiefel (1958), Rutishauser (1954, 1959), . . . Extensions to the non -Hermitian Lanzczos and Arnoldi algorithms given in S (2008). Z. Strakoš 20
1 : Matching moments model reduction Estimation of errors in iterative methods, numerical approximation of the c ∗ A − 1 b scattering amplitude, application in sciences Dahlquist, Golub and Nash (1978), . . . , Golub and Meurant (1994), . . . , Meurant, S (2006); Gordon (1968), . . . Fischer and Freund (1992), Freund and Hochbruck (1993), . . . , Saylor and Smolarski (2001), Golub, Stoll and Wathen (2009), S and Tichý (2009) Model reduction in approximation of large scale dynamical systems, in T ( s ) = c ∗ ( sE − A ) − 1 b , s ∈ C A ⊂ C terms of the transfer function Gragg and Lindquist (1983), Villemagne and Skelton (1987), Gallivan, Grimme and Van Dooren (1994), . . . , Bai (2002), Freund (2003), Antoulas (2005), . . . , Guercin and Willcox (2008). Z. Strakoš 21
Recommend
More recommend