

SLIDE 1

GMRES Methods for Least Squares Problems

Ken Hayami, Jun-Feng Yin
National Institute of Informatics, Tokyo

Tokushi Ito
Business Design Laboratory Co., Ltd., Nagoya

HARRACHOV 2007: Computational Linear Algebra with Applications
Harrachov, Czech Republic, August 23rd, 2007

SLIDE 2

Problem:

min_{x∈R^n} ||b − Ax||_2,   A ∈ R^{m×n}

  • m > n, m = n, or m < n
  • not necessarily full rank
  • large and sparse

A^T A x = A^T b   (normal equation)

SLIDE 3

For m < n: solve A A^T y = b, then set x = A^T y
↓
minimum-norm solution
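Both reductions are easy to check numerically. Below is a small NumPy sketch (random test data of my own choosing, not from the slides): the normal equations reproduce the least squares solution in the over-determined case, and x = A^T y reproduces the pseudo-inverse (minimum-norm) solution in the under-determined case.

```python
import numpy as np

rng = np.random.default_rng(0)

# Over-determined, full column rank: normal equations A^T A x = A^T b (slide 2).
m, n = 8, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x_ne = np.linalg.solve(A.T @ A, A.T @ b)
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
assert np.allclose(x_ne, x_ls)

# Under-determined, full row rank: A A^T y = b, x = A^T y (this slide).
m2, n2 = 3, 8
A2 = rng.standard_normal((m2, n2))
b2 = rng.standard_normal(m2)
y = np.linalg.solve(A2 @ A2.T, b2)
x_mn = A2.T @ y
assert np.allclose(x_mn, np.linalg.pinv(A2) @ b2)   # minimum-norm solution
```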

SLIDE 4

1. Iterative methods using the normal equation

A^T A: square, symmetric, positive definite (if rank A = n)
↓ Apply the Conjugate Gradient method.
↓ CGLS (Conjugate Gradient Least Squares) method

SLIDE 5

The CGLS (CGNR) method

Choose x_0.
r_0 = b − A x_0,  p_0 = s_0 = A^T r_0,  γ_0 = ||s_0||_2^2
for i = 0, 1, 2, . . . until γ_i < ε
    q_i = A p_i
    α_i = γ_i / ||q_i||_2^2
    x_{i+1} = x_i + α_i p_i
    r_{i+1} = r_i − α_i q_i
    s_{i+1} = A^T r_{i+1}
    γ_{i+1} = ||s_{i+1}||_2^2,  β_i = γ_{i+1} / γ_i
    p_{i+1} = s_{i+1} + β_i p_i
end for
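The CGLS iteration above transcribes directly into NumPy. A minimal sketch (the stopping tolerance and the random test problem are my own choices):

```python
import numpy as np

def cgls(A, b, x0, eps=1e-14, maxit=1000):
    """CGLS sketch, following the iteration above; stops when
    gamma_i = ||A^T r_i||_2^2 drops below eps."""
    x = x0.copy()
    r = b - A @ x
    p = s = A.T @ r
    gamma = s @ s                       # gamma_0 = ||s_0||_2^2
    for _ in range(maxit):
        if gamma < eps:
            break
        q = A @ p                       # q_i = A p_i
        alpha = gamma / (q @ q)         # alpha_i = gamma_i / ||q_i||_2^2
        x = x + alpha * p
        r = r - alpha * q
        s = A.T @ r
        gamma_new = s @ s
        beta = gamma_new / gamma        # beta_i = gamma_{i+1} / gamma_i
        gamma = gamma_new
        p = s + beta * p
    return x

# usage: a random over-determined problem
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 6))
b = rng.standard_normal(20)
x = cgls(A, b, np.zeros(6))
assert np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0])
```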

slide-6
SLIDE 6

However, condition number of ATA: square of A. ↓ Slow convergence Preconditioning necessary

  • diagonal scaling
  • incomplete Cholesky decomposition

(Meijerink, van der Vorst ’77)

  • incomplete QR decomposition

(Jennings, Ajiz ’84, Saad ’88)

  • incomplete Givens orthogonalization

(Bai et al. ’01)

  • robust incomplete factorization (RIF)

(Benzi, Tuma ’03) etc.

SLIDE 7

2. Iterative methods not based on the normal equation

CR-LS(k) method (Zhang and Oyanagi '89):
Apply the Orthomin(k) method to min_{x∈R^n} ||b − Ax||_2 by introducing a mapping matrix B ∈ R^{n×m} and using the Krylov subspace generated by AB ∈ R^{m×m}.

SLIDE 8

Consider using the GMRES (Generalized Minimal RESidual) method instead.
GMRES: an efficient and robust method for Ax = b, where A ∈ R^{n×n} is nonsingular.

SLIDE 9

GMRES(k) method

Choose x_0.
∗ r_0 = b − A x_0;  v_1 = r_0 / ||r_0||_2
for i = 1, 2, . . . , k
    w_i = A v_i
    for j = 1, 2, . . . , i
        h_{j,i} = (w_i, v_j)
        w_i = w_i − h_{j,i} v_j
    end for
    h_{i+1,i} = ||w_i||_2
    v_{i+1} = w_i / h_{i+1,i}
    Find y_i ∈ R^i which minimizes ||r_i||_2 = || ||r_0||_2 e_1 − H̄_i y ||_2.
    if ||r_i||_2 < ε then
        x_i = x_0 + [v_1, . . . , v_i] y_i;  stop
    end if
end for
x_k = x_0 + [v_1, . . . , v_k] y_k;  x_0 = x_k
Go to ∗.
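The GMRES(k) loop above (Arnoldi with modified Gram-Schmidt, plus a small (i+1) × i least squares problem for y_i) can be sketched as follows; the breakdown threshold, tolerances, and test matrix are my own choices:

```python
import numpy as np

def gmres_restarted(A, b, x0, k=30, tol=1e-12, max_restarts=10):
    """GMRES(k) sketch: Arnoldi with modified Gram-Schmidt, then the small
    least squares problem for y; restarts from the last iterate."""
    x = x0.copy()
    for _ in range(max_restarts):
        r = b - A @ x
        beta = np.linalg.norm(r)
        if beta < tol:
            break
        V = np.zeros((len(b), k + 1))
        H = np.zeros((k + 1, k))
        V[:, 0] = r / beta
        steps = k
        for i in range(k):
            w = A @ V[:, i]
            for j in range(i + 1):              # modified Gram-Schmidt
                H[j, i] = w @ V[:, j]
                w = w - H[j, i] * V[:, j]
            H[i + 1, i] = np.linalg.norm(w)
            if H[i + 1, i] < 1e-12:             # breakdown: Krylov space exhausted
                steps = i + 1
                break
            V[:, i + 1] = w / H[i + 1, i]
        e1 = np.zeros(steps + 1)
        e1[0] = beta                            # ||r_0||_2 e_1
        y = np.linalg.lstsq(H[:steps + 1, :steps], e1, rcond=None)[0]
        x = x + V[:, :steps] @ y
    return x

# usage: a well-conditioned nonsingular test matrix
rng = np.random.default_rng(0)
n = 20
A = np.eye(n) + 0.05 * rng.standard_normal((n, n))
b = rng.standard_normal(n)
x = gmres_restarted(A, b, np.zeros(n), k=n)
assert np.allclose(x, np.linalg.solve(A, b))
```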

SLIDE 10

( Here, H̄_i = (h_{pq}) ∈ R^{(i+1)×i},  e_1 = (1, 0, . . . , 0)^T ∈ R^{i+1}. )

GMRES minimizes ||r_k||_2 over x_k = x_0 + ⟨v_1, · · · , v_k⟩, where
⟨v_1, · · · , v_k⟩ = ⟨r_0, A r_0, · · · , A^{k−1} r_0⟩,  (v_i, v_j) = δ_{ij}.

k = ∞ : (full) GMRES.
h_{i+1,i} = 0 : breakdown.

When A is nonsingular, GMRES does not break down until it has found the solution of Ax = b, ∀b, ∀x_0 ∈ R^n (within n steps).

SLIDE 11

When A is singular:

Theorem 1 (Brown, Walker '97). GMRES determines a least squares solution of min_{x∈R^n} ||b − Ax||_2 without breakdown, ∀b, ∀x_0 ∈ R^n
⇔ R(A) = R(A^T). ✷

R(M): range space of M.

SLIDE 12

How can we apply GMRES to the least squares problem
min_{x∈R^n} ||b − Ax||_2, where A ∈ R^{m×n}?

A ∈ R^{m×n},  r_0 = b − A x_0 ∈ R^m.
↓ Cannot create a Krylov subspace from A and r_0 (A r_0 is undefined unless m = n).

Basically two ways to overcome this, both using a mapping matrix B ∈ R^{n×m}.

SLIDE 13

1. AB-GMRES method

Use the Krylov subspace K_i(AB, r_0) := ⟨r_0, AB r_0, . . . , (AB)^{i−1} r_0⟩ in R^m, where AB ∈ R^{m×m}.
(cf. CR-LS(k) method by Zhang, Oyanagi)

SLIDE 14

First note:

Lemma 2. min_{x∈R^n} ||b − Ax||_2 = min_{z∈R^m} ||b − ABz||_2, ∀b ∈ R^m
⇔ R(A) = R(AB). ✷

SLIDE 15

Using
Lemma 3. R(A A^T) = R(A) ✷
we can show
Lemma 4. R(A^T) = R(B) ⟹ R(A) = R(AB). ✷

SLIDE 16

Thus, assume R(A) = R(AB). Consider solving
min_{z∈R^m} ||b − ABz||_2 = min_{x∈R^n} ||b − Ax||_2
using GMRES with initial approximation z_0 ∈ R^m such that AB z_0 = A x_0. Then, we have the following.

SLIDE 17

AB-GMRES(k) method

Choose x_0 (with A x_0 = AB z_0).
∗ r_0 = b − A x_0 (= b − AB z_0);  v_1 = r_0 / ||r_0||_2
for i = 1, 2, . . . , k
    w_i = AB v_i
    for j = 1, 2, . . . , i
        h_{j,i} = (w_i, v_j)
        w_i = w_i − h_{j,i} v_j
    end for
    h_{i+1,i} = ||w_i||_2
    v_{i+1} = w_i / h_{i+1,i}
    Find y_i ∈ R^i which minimizes ||r_i||_2 = || ||r_0||_2 e_1 − H̄_i y ||_2
    x_i = x_0 + B [v_1, . . . , v_i] y_i;  r_i = b − A x_i
    if ||A^T r_i||_2 < ε then stop
end for
x_0 = x_k
Go to ∗.
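One cycle of the AB-GMRES method above can be sketched as follows: it is GMRES except that the operator is AB and the correction is mapped back through B. The usage example takes B = A^T on a consistent under-determined problem, where the result should be the minimum-norm solution (cf. slide 37); the breakdown threshold and test data are my own choices.

```python
import numpy as np

def ab_gmres(A, B, b, x0, k):
    """AB-GMRES sketch: GMRES on min_z ||b - (AB) z||_2 in R^m,
    with the iterate mapped back as x_i = x_0 + B [v_1..v_i] y_i."""
    r = b - A @ x0
    beta = np.linalg.norm(r)
    V = np.zeros((len(b), k + 1))
    H = np.zeros((k + 1, k))
    V[:, 0] = r / beta
    steps = k
    for i in range(k):
        w = A @ (B @ V[:, i])                  # operator is AB
        for j in range(i + 1):                 # modified Gram-Schmidt
            H[j, i] = w @ V[:, j]
            w = w - H[j, i] * V[:, j]
        H[i + 1, i] = np.linalg.norm(w)
        if H[i + 1, i] < 1e-12:                # breakdown: Krylov space exhausted
            steps = i + 1
            break
        V[:, i + 1] = w / H[i + 1, i]
    e1 = np.zeros(steps + 1)
    e1[0] = beta
    y = np.linalg.lstsq(H[:steps + 1, :steps], e1, rcond=None)[0]
    return x0 + B @ (V[:, :steps] @ y)

# usage: under-determined, full row rank, B = A^T
rng = np.random.default_rng(0)
m, n = 4, 10
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x = ab_gmres(A, A.T, b, np.zeros(n), k=m)
assert np.allclose(x, np.linalg.pinv(A) @ b)   # minimum-norm solution
```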

SLIDE 18

Does the AB-GMRES method determine the least squares solution without breakdown? Recall the following.

Theorem 5 (Brown, Walker '97). Let Ã ∈ R^{m×m}. The GMRES method determines a least squares solution of min_{z∈R^m} ||b − Ãz||_2, ∀b ∈ R^m, ∀z_0 ∈ R^m, without breakdown
⇔ R(Ã) = R(Ã^T). ✷

SLIDE 19

Let Ã := AB. Noting
Theorem 6. If R(A^T) = R(B), then R(AB) = R(B^T A^T) ⇔ R(A) = R(B^T). ✷
we obtain
Theorem 7. If R(A^T) = R(B), then the AB-GMRES method determines a least squares solution of min_{x∈R^n} ||b − Ax||_2, ∀b ∈ R^m, ∀x_0 ∈ R^n, without breakdown
⇔ R(A) = R(B^T). ✷

SLIDE 20

Corollary 8. R(A^T) = R(B), R(A) = R(B^T)
⇓
The AB-GMRES method determines a least squares solution of min_{x∈R^n} ||b − Ax||_2, ∀b ∈ R^m, ∀x_0 ∈ R^n, without breakdown. ✷

Remark 1. R(A) = R(B^T)
⇑ B^T = A C^T, C^T nonsingular, i.e. B = C A^T, C nonsingular.

SLIDE 21

2. BA-GMRES method

The other alternative: use a matrix B ∈ R^{n×m} to map r_0 ∈ R^m to r̃_0 = B r_0 ∈ R^n.
Then create the Krylov subspace ⟨r̃_0, BA r̃_0, . . . , (BA)^{i−1} r̃_0⟩ in R^n, where BA ∈ R^{n×n}.

SLIDE 22

First note:

Theorem 9. ||b − A x_∗||_2 = min_{x∈R^n} ||b − Ax||_2
and
||Bb − BA x_∗||_2 = min_{x∈R^n} ||Bb − BAx||_2
are equivalent for all b ∈ R^m
⇔ R(A) = R(B^T BA). ✷
SLIDE 23

Also note:
Lemma 10. R(A) = R(B^T) ⟹ R(BA) = R(B). ✷
Lemma 11. R(BA) = R(B) ⟹ R(B^T BA) = R(B^T). ✷
Lemma 12. R(A) = R(B^T) ⟹ R(A) = R(B^T BA). ✷

SLIDE 24

Thus, assume R(A) = R(B^T BA), and consider applying the GMRES(k) method to
min_{x∈R^n} ||Bb − BAx||_2
with initial approximation x_0.

SLIDE 25

BA-GMRES(k) method

Choose x_0.
∗ r̃_0 = B(b − A x_0);  v_1 = r̃_0 / ||r̃_0||_2
for i = 1, 2, . . . , k until convergence
    w_i = BA v_i
    for j = 1, 2, . . . , i
        h_{j,i} = (w_i, v_j)
        w_i = w_i − h_{j,i} v_j
    end for
    h_{i+1,i} = ||w_i||_2
    v_{i+1} = w_i / h_{i+1,i}
    Find y_i ∈ R^i which minimizes ||r̃_i||_2 = || ||r̃_0||_2 e_1 − H̄_i y ||_2
    x_i = x_0 + [v_1, . . . , v_i] y_i;  r_i = b − A x_i
    if ||A^T r_i||_2 < ε then stop
end for
Go to ∗.
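The BA-GMRES cycle above can be sketched in NumPy; the Krylov vectors now live in R^n and the iterate is updated directly, without mapping through B. The usage example takes B = C A^T with C = {diag(A^T A)}^{−1} (the diagonal choice of slide 29); the breakdown threshold and test data are my own choices.

```python
import numpy as np

def ba_gmres(A, B, b, x0, k):
    """BA-GMRES sketch: GMRES on min_x ||Bb - (BA) x||_2; the Krylov
    vectors live in R^n and x_i = x_0 + [v_1..v_i] y_i directly."""
    n = A.shape[1]
    rt = B @ (b - A @ x0)                      # r~_0 = B(b - A x_0)
    beta = np.linalg.norm(rt)
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    V[:, 0] = rt / beta
    steps = k
    for i in range(k):
        w = B @ (A @ V[:, i])                  # operator is BA
        for j in range(i + 1):                 # modified Gram-Schmidt
            H[j, i] = w @ V[:, j]
            w = w - H[j, i] * V[:, j]
        H[i + 1, i] = np.linalg.norm(w)
        if H[i + 1, i] < 1e-12:                # breakdown: Krylov space exhausted
            steps = i + 1
            break
        V[:, i + 1] = w / H[i + 1, i]
    e1 = np.zeros(steps + 1)
    e1[0] = beta
    y = np.linalg.lstsq(H[:steps + 1, :steps], e1, rcond=None)[0]
    return x0 + V[:, :steps] @ y

# usage: over-determined, full column rank, B = C A^T with diagonal C
rng = np.random.default_rng(0)
m, n = 12, 5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
C = np.diag(1.0 / np.diag(A.T @ A))
x = ba_gmres(A, C @ A.T, b, np.zeros(n), k=n)
assert np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0])
```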

SLIDE 26

Does the BA-GMRES method give the least squares solution without breakdown? Noting
Theorem 13. If R(A) = R(B^T), then R(BA) = R(A^T B^T) ⇔ R(A^T) = R(B). ✷

SLIDE 27

Theorem 14. If R(A) = R(B^T), then the BA-GMRES method determines a least squares solution of min_{x∈R^n} ||b − Ax||_2, ∀b ∈ R^m, ∀x_0 ∈ R^n, without breakdown
⇔ R(A^T) = R(B). ✷

Corollary 15. R(A) = R(B^T), R(A^T) = R(B)
⇓
The BA-GMRES method determines a least squares solution of min_{x∈R^n} ||b − Ax||_2, ∀b ∈ R^m, ∀x_0 ∈ R^n, without breakdown. ✷

SLIDE 28

Summary of conditions on B

General case: rank A ≤ min(m, n).
R(A) = R(B^T), R(A^T) = R(B)
⇓
The AB-GMRES and BA-GMRES methods give a least squares solution of min_{x∈R^n} ||b − Ax||_2, ∀b ∈ R^m, ∀x_0 ∈ R^n, without breakdown.

e.g. Let B = αA^T, where 0 ≠ α ∈ R.

SLIDE 29

Full rank case: m ≥ n = rank A (over-determined case)

Let B = C A^T, where C ∈ R^{n×n} is nonsingular (B, A^T ∈ R^{n×m}).
⇓ B^T = A C^T
⇓ R(A) = R(B^T)
⇓ n = rank A^T = rank A = rank B^T = rank B
⇓ R(A^T) = R(B) = R^n.

Since AB ∈ R^{m×m}, BA ∈ R^{n×n}, and m ≥ n, use the BA-GMRES method.
e.g. Let C := {diag(A^T A)}^{−1}, i.e. BA = {diag(A^T A)}^{−1} A^T A.

SLIDE 30

Convergence Analysis

Theorem 16. Let A ∈ R^{m×n}, m ≥ n, B := C A^T, C ∈ R^{n×n} symmetric positive definite, and let σ_i (1 ≤ i ≤ n) be the singular values of Ã := A C^{1/2}.
Then σ_i^2 (1 ≤ i ≤ n) are the eigenvalues of AB and of BA.
If m > n, all other eigenvalues of AB are 0. ✷
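Theorem 16 is easy to verify numerically for the diagonal choice C = {diag(A^T A)}^{−1}. A sketch on random data of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 9, 4
A = rng.standard_normal((m, n))
C = np.diag(1.0 / np.diag(A.T @ A))   # diagonal choice of slide 29 (SPD)
B = C @ A.T

# singular values of A~ = A C^{1/2} (C diagonal, so elementwise sqrt is C^{1/2})
sigma = np.linalg.svd(A @ np.sqrt(C), compute_uv=False)

# eigenvalues of BA are the sigma_i^2
eig_BA = np.sort(np.linalg.eigvals(B @ A).real)
assert np.allclose(eig_BA, np.sort(sigma**2))

# AB = A C A^T is symmetric: same n nonzero eigenvalues plus m - n zeros
eig_AB = np.sort(np.linalg.eigvalsh(A @ B))
assert np.allclose(eig_AB[m - n:], np.sort(sigma**2))
assert np.allclose(eig_AB[:m - n], 0.0)
```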

SLIDE 31

[Proof] Let Ã := A C^{1/2} = U Σ V^T (SVD), where U ∈ R^{m×m}, V ∈ R^{n×n} are orthogonal matrices, Σ = diag(σ_1, . . . , σ_n) ∈ R^{m×n}, and σ_1 ≥ . . . ≥ σ_n ≥ 0 are the singular values of Ã. Then

AB = A C A^T = Ã Ã^T = U Σ Σ^T U^T,
BA = C A^T A = C^{1/2} Ã^T Ã C^{−1/2} = C^{1/2} V Σ^T Σ (C^{1/2} V)^{−1}. ✷

cf. If rank A = n, C := {diag(A^T A)}^{−1} is symmetric positive definite. Similarly for RIF.

SLIDE 32

Theorem 17. The residual r_k = b − A x_k at the k-th step of AB-GMRES satisfies
|| r_k|_{R(A)} ||_2 ≤ 2 ( (σ_1 − σ_n)/(σ_1 + σ_n) )^k || r_0|_{R(A)} ||_2,
where r|_{R(A)} denotes the R(A)-component of r. ✷

Theorem 18. The residual r_k = b − A x_k at the k-th step of BA-GMRES satisfies
||B r_k||_2 = ||C A^T r_k||_2 ≤ 2 κ(C) ( (σ_1 − σ_n)/(σ_1 + σ_n) )^k ||B r_0||_2. ✷

SLIDE 33

CGLS method: Preconditionings

L D L^T ≈ A^T A,  L̃ = L D^{1/2}  (e.g. RIF)
diag(A^T A) ≈ A^T A,  L̃ = (diag(A^T A))^{1/2}.

Apply CG to A′ x′ = b′  (∗), where
A′ = L̃^{−1} A^T A L̃^{−T},  x′ = L̃^T x,  b′ = L̃^{−1} A^T b.
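This split preconditioning can be sketched with the diagonal choice L̃ = (diag(A^T A))^{1/2} and a plain CG routine (the CG helper, tolerances, and test problem are my own minimal versions, not from the slides):

```python
import numpy as np

def cg(M, c, tol=1e-20, maxit=200):
    """Plain CG for an SPD system M x = c; stops when ||r||_2^2 < tol."""
    x = np.zeros_like(c)
    r = c.copy()
    p = r.copy()
    rho = r @ r
    for _ in range(maxit):
        if rho < tol:
            break
        q = M @ p
        alpha = rho / (p @ q)
        x = x + alpha * p
        r = r - alpha * q
        rho_new = r @ r
        p = r + (rho_new / rho) * p
        rho = rho_new
    return x

rng = np.random.default_rng(0)
m, n = 12, 5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

Ltilde = np.diag(np.sqrt(np.diag(A.T @ A)))   # L~ = (diag(A^T A))^{1/2}
Linv = np.diag(1.0 / np.diag(Ltilde))         # L~^{-1} (= L~^{-T}: diagonal)
Aprime = Linv @ (A.T @ A) @ Linv              # A' = L~^{-1} A^T A L~^{-T}
bprime = Linv @ (A.T @ b)                     # b' = L~^{-1} A^T b
xprime = cg(Aprime, bprime)
x = Linv @ xprime                             # recover x = L~^{-T} x'
assert np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0])
```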

SLIDE 34

Since (L̃^T)^{−1} A′ L̃^T = C A^T A = BA, we have λ_i(A′) = λ_i(BA) = σ_i^2, i = 1, . . . , n.

Theorem 19. The residual r_k = b − A x_k at the k-th step of preconditioned CGLS (CG applied to (∗)) satisfies
||A^T r_k||_{(A^T A)^{−1}} = ||e_k||_{A^T A} ≤ 2 ( (σ_1 − σ_n)/(σ_1 + σ_n) )^k ||A^T r_0||_{(A^T A)^{−1}},
where e_k is the error of the k-th iterate. ✷

SLIDE 35

Similar convergence behaviours expected for AB-GMRES, BA-GMRES, and preconditioned CGLS methods, using the same preconditioner C.

SLIDE 36

rank A = m ≤ n (under-determined case)

Let B = A^T C, where C ∈ R^{m×m} is nonsingular (B, A^T ∈ R^{n×m}).
⇓ R(A^T) = R(B)
⇓ m = rank A = rank A^T = rank B = rank B^T
⇓ R(A) = R(B^T) = R^m.

Since AB ∈ R^{m×m}, BA ∈ R^{n×n}, and m ≤ n, use the AB-GMRES method.
e.g. Let C := {diag(A A^T)}^{−1}, i.e. AB = A A^T {diag(A A^T)}^{−1}.

SLIDE 37

Note that when rank A = m, R(A) = R^m ∋ b, so that
min_{x∈R^n} ||b − Ax||_2 = min_{z∈R^m} ||b − ABz||_2 = min_{z∈R^m} ||b − A A^T Cz||_2 = 0.

Hence, the AB-GMRES method with B = A^T C gives the minimum-norm least squares solution
x_∗ = B z_∗ = A^T (C z_∗)  of  min_{x∈R^n} ||b − Ax||_2,
since A A^T (C z_∗) = b.

SLIDE 38

Theorem 20. Let A ∈ R^{m×n}, m ≤ n, B := A^T C, C ∈ R^{m×m} symmetric positive definite, and let σ_i (1 ≤ i ≤ m) be the singular values of Ã := C^{1/2} A.
Then σ_i^2 (1 ≤ i ≤ m) are the eigenvalues of AB and of BA.
If m < n, all other eigenvalues of BA are 0. ✷

SLIDE 39

[Proof] Let Ã := C^{1/2} A = U Σ V^T (SVD), where U ∈ R^{m×m}, V ∈ R^{n×n} are orthogonal matrices, Σ = diag(σ_1, . . . , σ_m) ∈ R^{m×n}, and σ_1 ≥ . . . ≥ σ_m ≥ 0 are the singular values of Ã. Then

AB = A A^T C = C^{−1/2} Ã Ã^T C^{1/2} = C^{−1/2} U Σ Σ^T (C^{−1/2} U)^{−1},
BA = A^T C A = Ã^T Ã = V Σ^T Σ V^T. ✷

cf. If rank A = m, C := {diag(A A^T)}^{−1} is symmetric positive definite. Similarly for RIF.

SLIDE 40

Theorem 21. The residual r_k = b − A x_k at the k-th step of AB-GMRES satisfies
||r_k||_2 ≤ 2 κ_2(C) ( (σ_1 − σ_m)/(σ_1 + σ_m) )^k ||r_0||_2. ✷

Theorem 22. The residual r_k = b − A x_k at the k-th step of BA-GMRES satisfies
|| B r_k|_{R(B)} ||_2 ≤ 2 ( (σ_1 − σ_m)/(σ_1 + σ_m) )^k || B r_0|_{R(B)} ||_2,
where B r|_{R(B)} denotes the R(B)-component of B r. ✷

SLIDE 41

Preconditioned CGLS

C = (L̃ L̃^T)^{−1} ∈ R^{m×m}, symmetric positive definite.
Precondition A A^T y = b as A′ y′ = b′  (∗∗), where
A′ = L̃^{−1} A A^T L̃^{−T},  y′ = L̃^T y,  b′ = L̃^{−1} b.

Since L̃ A′ L̃^{−1} = A A^T C = AB, we have λ_i(A′) = λ_i(AB) = σ_i^2, i = 1, . . . , m.

SLIDE 42

Theorem 23. The residual r_k = b − A A^T y_k at the k-th step of preconditioned CGLS (CG applied to (∗∗)) satisfies
||r_k||_{(A A^T)^{−1}} = ||e_k||_{A A^T} ≤ 2 ( (σ_1 − σ_m)/(σ_1 + σ_m) )^k ||r_0||_{(A A^T)^{−1}},
where e_k is the error of the k-th iterate. ✷

SLIDE 43

Similar convergence behaviours expected for AB-GMRES, BA-GMRES, and preconditioned CGLS methods, using the same preconditioner C.

SLIDE 44

Choice of B

Besides satisfying R(A) = R(B^T) and R(A^T) = R(B), B should satisfy AB ≈ I_m or BA ≈ I_n.

Simple choice:
B = C A^T, C := {diag(A^T A)}^{−1} when m ≥ n = rank A;
B = A^T C, C := {diag(A A^T)}^{−1} when rank A = m ≤ n.

SLIDE 45

Application of Robust Incomplete Factorization (RIF) (Benzi and Tuma '03)

A^T A ≈ L D L^T,  A^T A ≈ Z^{−T} D Z^{−1},
where A ∈ R^{m×n} with m ≥ n = rank A, Z: upper triangular, L: lower triangular.
Uses (incomplete) A^T A-orthogonalization. Low memory requirement.

SLIDE 46

RIF Method

Let Z = [z_1, . . . , z_n] = [e_1, e_2, . . . , e_n]
For j = 1, . . . , n Do:
    Compute u_j = A z_j
    Compute d_j = (u_j, u_j)
    For i = j + 1, . . . , n Do:
        Compute v_i = A e_i
        Compute θ_{ij} = (v_i, u_j) / d_j
        If |θ_{ij}| > τ, store L(i, j) = θ_{ij}
        Compute z_i = z_i − θ_{ij} z_j
        Drop the elements of z_i smaller than τ in magnitude
    EndDo
EndDo
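A dense NumPy sketch of the RIF loop above, for illustration only (the point of RIF is its sparse, low-memory implementation): with τ = 0 the factorization is exact, giving A^T A = L diag(d) L^T and Z^T (A^T A) Z = diag(d). The test data are my own.

```python
import numpy as np

def rif(A, tau=0.0):
    """RIF sketch via A^T A-orthogonalization: returns L, d, Z with
    A^T A ≈ L diag(d) L^T and Z^T (A^T A) Z ≈ diag(d) (exact for tau = 0)."""
    n = A.shape[1]
    Z = np.eye(n)                   # columns z_1..z_n, initialized to e_1..e_n
    L = np.eye(n)                   # unit lower triangular
    d = np.zeros(n)
    for j in range(n):
        u = A @ Z[:, j]             # u_j = A z_j
        d[j] = u @ u                # d_j = (u_j, u_j)
        for i in range(j + 1, n):
            theta = (A[:, i] @ u) / d[j]        # theta_ij = (A e_i, u_j) / d_j
            if abs(theta) > tau:                # drop small multipliers
                L[i, j] = theta
            Z[:, i] -= theta * Z[:, j]
            Z[np.abs(Z[:, i]) < tau, i] = 0.0   # drop small entries of z_i
    return L, d, Z

# usage: tau = 0 gives the complete factorization
rng = np.random.default_rng(0)
A = rng.standard_normal((7, 4))
L, d, Z = rif(A, tau=0.0)
assert np.allclose(L @ np.diag(d) @ L.T, A.T @ A)
assert np.allclose(Z.T @ (A.T @ A) @ Z, np.diag(d))
```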

SLIDE 47

Numerical experiments

PC: Dell Precision 690; CPU: 3.00 GHz; Memory: 16 GB
Compiler: GNU C/C++ 3.4.3

Linear least squares problems min_{x∈R^n} ||b − Ax||_2, A ∈ R^{m×n}

SLIDE 48

1. Over-determined case (m ≥ n)

Test matrices A:
  • Generated by the MATLAB command "sprandn".
  • Specified density and condition number.
  • Values of nonzero elements: random (normal distribution).
  • Pattern of nonzero elements: random.

b: also random, inconsistent (b ∉ R(A)).
Convergence judged by ||A^T r||_2 / ||A^T b||_2, r = b − Ax.  x_0 = 0.

SLIDE 49

Test matrices (L): m = 30,000, n = 3,000, density: 0.1%

Name     Condition number
RANDL1   1.9 × 10
RANDL2   1.6 × 10^2
RANDL3   1.3 × 10^3
RANDL4   2.0 × 10^4
RANDL5   1.3 × 10^5
RANDL6   1.3 × 10^6
RANDL7   1.3 × 10^7

SLIDE 50

Fig. 1: Comparison of full AB-GMRES and BA-GMRES with diagonal-scaling and RIF preconditioners (τ = 0.8) on RANDL3 (||A^T r||_2 / ||A^T b||_2 vs. iterations).

SLIDE 51

[Figure 1: relative residual ||A^T r||_2 / ||A^T b||_2 (10^−14 to 10^0) vs. iterations (up to 1000), for AB-GMRES-Diag., BA-GMRES-Diag., AB-GMRES-RIF, BA-GMRES-RIF.]

SLIDE 52

The effect of the restart period k in the BA-GMRES(k)-RIF method (over-determined problems).

RANDL5 (τ = 0.8)
k      100     140     180     220     260     ≥ 295
iter   755     668     587     498     453     295
time   25.05   24.81   24.14   22.62   22.15   *14.27

RANDL6 (τ = 0.07)
k      160     200     240     280     300     ≥ 318
iter   2,470   2,085   1,856   1,311   599     318
time   144.68  130.08  121.72  92.88   48.92   *26.73

RANDL7 (τ = 0.02)
k      200     280     320     340     360     ≥ 362
iter   2,594   2,168   1,567   900     506     362
time   179.83  165.79  129.07  80.13   50.99   *37.26

k: restart period; iter: number of iterations; time: computation time (sec.)
Convergence criterion: ||A^T r||_2 / ||A^T b||_2 < 10^−6.

slide-53
SLIDE 53

The effect of the RIF parameter τ for problem RANDL3. τ 0.9 0.8 0.7 0.6 0.5 iter 83 72 77 65 67 1 CGLS-RIF time 5.39 ∗5.38 5.42 5.43 5.59 68.00 iter 83 73 78 66 68 1 LSQR-RIF time 5.39 ∗5.38 5.42 5.43 5.60 68.00 iter 79 70 73 64 66 1 RCGLS-RIF time 5.99 5.86 5.96 5.85 6.08 67.99 iter 70 60 73 64 66 1 BA-GMRES-RIF time 5.54 5.47 5.64 5.59 5.63 67.99 iter: number of iterations time: computation time (sec.) Convergence criterion: ATr2/ATb2 < 10−6.

slide-54
SLIDE 54

The effect of the RIF parameter τ for problem RANDL6. τ 0.09 0.08 0.07 0.06 0.05 iter 670 674 615 706 658 1 CGLS-RIF time 34.38 35.39 33.67 38.17 37.31 60.00 iter 680 685 645 736 701 1 LSQR-RIF time 34.87 35.50 34.15 37.21 37.41 60.00 iter 333 324 317 313 307 1 RCGLS-RIF time 34.11 33.54 33.22 33.39 33.48 60.01 iter 335 325 318 315 309 1 BA-GMRES-RIF time 27.03 26.77 ∗26.73 27.16 27.45 60.01 iter: number of iterations time: computation time (sec.) Convergence criterion: ATr2/ATb2 < 10−6.

SLIDE 55

Comparison of convergence: ||A^T r||_2 / ||A^T b||_2 vs. number of iterations.

Fig. 2: Diagonal scaling (RANDL5)

SLIDE 56

[Figure 2: relative residual (10^−10 to 10^0) vs. iterations (up to 6000), for CGLS-Diag., LSQR-Diag., RCGLS-Diag., BA-GMRES-Diag.]

SLIDE 57

Fig.3: RIF (RANDL5)

SLIDE 58

[Figure 3: relative residual (10^−10 to 10^0) vs. iterations (up to 600), for CGLS-RIF, LSQR-RIF, RCGLS-RIF, BA-GMRES-RIF.]

SLIDE 59

Fig.4: Diagonal scaling (RANDL6)

SLIDE 60

[Figure 4: relative residual (10^−10 to 10^0) vs. iterations (up to 10000), for CGLS-Diag., LSQR-Diag., RCGLS-Diag., BA-GMRES-Diag.]

SLIDE 61

Fig.5: RIF (RANDL6)

SLIDE 62

[Figure 5: relative residual (10^−10 to 10^0) vs. iterations (up to 800), for CGLS-RIF, LSQR-RIF, RCGLS-RIF, BA-GMRES-RIF.]

SLIDE 63

Comparison of CPU time

              CGLS             LSQR             RCGLS            BA-GMRES
              diag.    RIF     diag.    RIF     diag.    RIF     diag.    RIF
RANDL1        35       14      35       14      35       14      35       14
(τ = 0.5)     *0.10    4.98    *0.10    4.98    0.16     4.99    0.14     4.99
RANDL2        214      21      214      21      208      21      193      21
(τ = 0.7)     *0.61    5.10    *0.61    5.10    2.77     5.13    2.28     5.11
RANDL3        742      72      740      73      697      70      622      60
(τ = 0.8)     2.08     5.38    *2.07    5.38    26.30    5.62    20.70    5.47
RANDL4        1,147    85      1,154    85      1,062    84      1,069    82
(τ = 0.5)     *3.22    6.38    3.38     6.38    59.35    7.17    59.41    6.62
RANDL5        4,897    470     5,064    401     1,521    305     1,522    299
(τ = 0.9)     13.74    *12.78  14.03    11.79   119.70   14.42   118.87   13.91
RANDL6        10,551   615     11,088   645     1,861    317     1,862    318
(τ = 0.07)    29.65    33.67   30.21    34.15   177.94   26.93   176.93   *26.73
RANDL7        32,143   1,951   35,034   2,443   1,914    371     1,899    362
(τ = 0.02)    89.93    102.28  91.31    128.40  195.63   40.07   183.90   *37.26

First row: number of iterations; second row: computation time (seconds).
Convergence criterion: ||A^T r||_2 / ||A^T b||_2 < 10^−6.

SLIDE 64

Problems from animal breeding studies and meteorology (HIRLAM)

Name     m          n          nnz        density
SMALL    3,140      1,988      8,510      1.4%
MEDIUM   9,397      6,119      25,013     0.04%
LARGE    28,524     17,264     75,018     0.02%
VLARGE   174,193    105,882    463,303    0.003%
HIRLAM   1,385,270  452,200    2,718,200  0.0004%

SLIDE 65

              CGLS             LSQR             BA-GMRES
              diag.    RIF     diag.    RIF     diag.    RIF
SMALL         125      59      125      59      124      58
(τ = 0.2)     *0.06    0.33    *0.06    0.33    0.54     0.49
MEDIUM        128      64      128      64      123      62
(τ = 0.3)     *0.20    2.10    *0.20    2.10    1.67     2.45
LARGE         133      73      133      73      131      71
(τ = 0.5)     *0.67    15.83   *0.67    15.83   5.81     17.93
VLARGE        171      137     170      137     166      136
(τ = 0.8)     6.00     36.56   *5.97    36.57   58.45    73.45
HIRLAM        180      164     180      164     170      163
(τ = 0.9)     *48.16   204.44  48.20    204.46  294.05   464.93

First row: number of iterations; second row: computation time (seconds).
Convergence criterion: ||A^T r||_2 / ||A^T b||_2 < 10^−6.

SLIDE 66

Explanation for the better convergence of GMRES:

The modified Gram-Schmidt process in GMRES (explicit orthogonalization) is robust against rounding error, especially in the ill-conditioned case, vs. the three-term recurrence in CG (implicit orthogonalization).

SLIDE 67

2. Under-determined case (m < n)

min_{x∈R^n} ||b − Ax||_2,  A ∈ R^{m×n} (m < n).

Minimum-norm least squares (pseudo-inverse) solution:
A A^T y = b,  x = A^T y

Previous approach: apply (preconditioned) CG: CGNE.

SLIDE 68

Test random matrices RANDLnT: 3,000 × 30,000, density: 1.5% (transposes of RANDLn).

b = A x_∗, where x_∗ = (1, . . . , 1)^T.

Convergence judged by ||r||_2 / ||b||_2 (rank A = m: consistent system).

SLIDE 69

Fig. 6: Comparison of full AB-GMRES and BA-GMRES with diagonal-scaling and RIF preconditioners on RANDL3T (||r||_2 / ||b||_2 vs. iterations).

SLIDE 70

[Figure 6: relative residual (10^−14 to 10^2) vs. iterations (up to 1000), for AB-GMRES-Diag., BA-GMRES-Diag., AB-GMRES-RIF, BA-GMRES-RIF.]

SLIDE 71

Comparison of convergence: ||r||_2 / ||b||_2 vs. number of iterations.

Fig. 7: Diagonal scaling (RANDL5T)

SLIDE 72

[Figure 7: relative residual (10^−12 to 10^4) vs. iterations (up to 7000), for CGNE-Diag., LSQR-Diag., RCGNE-Diag., AB-GMRES-Diag.]

SLIDE 73

Fig.8: RIF (RANDL5T)

SLIDE 74

[Figure 8: relative residual (10^−12 to 10^4) vs. iterations (up to 700), for CGNE-RIF, LSQR-RIF, RCGNE-RIF, AB-GMRES-RIF.]

SLIDE 75

Fig.9: Diagonal scaling (RANDL6T)

SLIDE 76

[Figure 9: relative residual (10^−12 to 10^4) vs. iterations (up to 16000), for CGNE-Diag., LSQR-Diag., RCGNE-Diag., AB-GMRES-Diag.]

SLIDE 77

Fig.10: RIF (RANDL6T)

SLIDE 78

[Figure 10: relative residual (10^−12 to 10^4) vs. iterations (up to 700), for CGNE-RIF, LSQR-RIF, RCGNE-RIF, AB-GMRES-RIF.]

SLIDE 79

Comparison of CPU time

              CGNE             LSQR             RCGNE            AB-GMRES
              diag.    RIF     diag.    RIF     diag.    RIF     diag.    RIF
RANDL1T       37       12      37       12      37       12      37       12
(τ = 0.4)     *0.15    5.01    *0.15    5.02    0.23     5.02    0.17     5.01
RANDL2T       259      25      256      25      252      25      247      25
(τ = 0.7)     *1.06    5.16    *1.06    5.16    4.22     5.17    3.75     5.17
RANDL3T       838      81      823      80      783      77      754      75
(τ = 0.8)     *3.43    5.53    *3.43    5.53    33.87    5.81    30.56    5.67
RANDL4T       1,464    116     1,407    114     1,223    106     1,187    106
(τ = 0.5)     5.97     6.94    *5.96    6.94    79.67    7.45    73.67    7.21
RANDL5T       5,548    544     5,414    539     1,535    322     1,533    322
(τ = 0.9)     22.61    13.88   22.71    *13.86  123.80   15.58   121.58   15.11
RANDL6T       12,837   514     12,486   502     1,873    219     1,871    218
(τ = 0.01)    52.38    39.04   50.10    38.99   182.51   27.43   179.85   *26.99
RANDL7T       38,397   3,078   37,792   2,979   2,240    451     2,238    450
(τ = 0.04)    156.01   94.19   152.07   91.69   366.88   44.49   353.92   *43.76

First row: number of iterations; second row: computation time (seconds).
Convergence criterion: ||r||_2 / ||b||_2 < 10^−6.

SLIDE 80

Memory requirements

Intermediate memory required by each method for k iterations.

Method        dim(m)   dim(n)      dim(k)  total
CGLS          r, Ap    x, p, A^T r         2m + 3n
CGNE          r, Ap    x, p, A^T r         2m + 3n
LSQR          u        v, w, x             m + 3n
AB-GMRES(k)   V, w     x           y, e    (k + 1)m + n + 2k + k^2/2
BA-GMRES(k)            V, w, x     y, e    (k + 2)n + 2k + k^2/2

SLIDE 81

Conclusions

Proposed two methods for least squares problems using GMRES on
1. min_{z∈R^m} ||b − ABz||_2,
2. min_{x∈R^n} ||Bb − BAx||_2.

Sufficient conditions for B: R(B) = R(A^T), R(B^T) = R(A).
Convergence analysis.
Numerical experiments: GMRES is faster than CGLS-type methods when used with RIF for ill-conditioned problems.

SLIDE 82

Thank you!

Preprint: http://research.nii.ac.jp/TechReports/07-009E.pdf