The Faber-Manteuffel Theorem and its Consequences
Petr Tichý
joint work with
Vance Faber, Jörg Liesen
Czech Academy of Sciences
July 21, 2011 ICIAM 2011, Vancouver, BC, Canada
1
Optimal Krylov subspace methods and low memory requirements?
Consider a system of linear algebraic equations Ax = b, where A ∈ R^{n×n} is nonsingular and b ∈ R^n. Given x0, find an optimal approximation xj so that the error is minimized in a given vector norm. What are necessary and sufficient conditions on A so that the optimal xj can be computed using short recurrences? (Only a constant number of vectors is needed.)
2
Optimal Krylov subspace methods with short recurrences
CG [Hestenes, Stiefel 1952], MINRES, SYMMLQ [Paige, Saunders 1975]. Optimal in the sense that they minimize some error norm: ‖x − xj‖_A in CG, ‖x − xj‖_{A^T A} = ‖rj‖ in MINRES, and ‖x − xj‖ in SYMMLQ, where xj ∈ x0 + A Kj(A, r0). They generate an orthogonal (or A-orthogonal) Krylov subspace basis using a three-term recurrence,
rj+1 = γj A rj − αj rj − βj rj−1 .
An important assumption: A is symmetric (MINRES, SYMMLQ) and positive definite (CG).
3
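As an illustration, here is a minimal CG sketch (in the standard coupled two-term form rather than the three-term residual recurrence above; a small random SPD system is assumed as test data), showing that only a fixed number of vectors is kept in memory:

```python
import numpy as np

def cg(A, b, x0, tol=1e-10, maxit=100):
    """Conjugate gradients: only x, r, p are stored (constant memory)."""
    x = x0.copy()
    r = b - A @ x          # residual
    p = r.copy()           # direction vector
    rr = r @ r
    for _ in range(maxit):
        Ap = A @ p
        alpha = rr / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rr_new = r @ r
        if np.sqrt(rr_new) < tol:
            break
        # short recurrence: the new direction needs only r and the previous p
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

# hypothetical small SPD test problem
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 20))
A = M @ M.T + 20 * np.eye(20)      # symmetric positive definite
b = rng.standard_normal(20)
x = cg(A, b, np.zeros(20))
err = np.linalg.norm(A @ x - b)
```

In exact arithmetic CG terminates in at most n steps; the point here is that no growing basis is stored.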
By the end of the 1970s it was unknown whether such methods also existed for general unsymmetric A. At Gatlinburg VIII (now the Householder Symposium), held in Oxford in 1981, “a prize of $500 has been offered for the construction of a 3-term conjugate gradient like descent method for non-symmetric real matrices or a proof that there can be no such method”.
4
We want to solve Ax = b using a CG-like descent method: the error is minimized in some given inner product norm, ‖·‖_B = ⟨·, ·⟩_B^{1/2}.
Starting from x0, compute xj+1 = xj + αj pj , j = 0, 1, . . . , where pj is a direction vector and αj a scalar (to be determined), with span{p0, . . . , pj} = Kj+1(A, r0), r0 = b − Ax0. The norm ‖x − xj+1‖_B is minimal iff
αj = ⟨x − xj, pj⟩_B / ⟨pj, pj⟩_B and ⟨pj, pi⟩_B = 0 for i < j.
Hence p0, . . . , pj has to be a B-orthogonal basis of Kj+1(A, r0).
5
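The B-orthogonality requirement can be illustrated by B-orthogonalizing the Krylov sequence v, Av, A²v, . . . with full Gram–Schmidt in the B-inner product; the theorem's question is precisely when this full loop can be truncated. A sketch with random test data (assumed, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
A = rng.standard_normal((n, n))
M = rng.standard_normal((n, n))
B = M @ M.T + n * np.eye(n)        # SPD matrix defining <x,y>_B = y^* B x

def b_inner(x, y):
    return y @ (B @ x)

# B-orthogonalize the Krylov sequence (full modified Gram-Schmidt:
# every previous direction is needed in general)
v = rng.standard_normal(n)
P = []
w = v.copy()
for j in range(5):
    for p in P:
        w = w - (b_inner(w, p) / b_inner(p, p)) * p
    w = w / np.linalg.norm(w)      # normalize for numerical stability
    P.append(w)
    w = A @ P[-1]

# check pairwise B-orthogonality of the directions
max_off = max(abs(b_inner(P[i], P[j]))
              for i in range(5) for j in range(5) if i != j)
```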
The question about the existence of an optimal Krylov subspace method with short recurrences can be reduced to the question: For which A is it possible to generate a B-orthogonal basis of the Krylov subspace using short recurrences? (for each initial starting vector)
6
Faber and Manteuffel gave the answer in 1984: For a general matrix A there exists no short recurrence for generating orthogonal Krylov subspace bases. What are the details of this statement?
7
1. The Faber-Manteuffel theorem
2. Ideas of a new proof
3. Consequences
4. Other types of recurrences
8
B-inner product, Input and Notation
Without loss of generality, B = I. Otherwise change the basis:
⟨x, y⟩_B = ⟨B^{1/2}x, B^{1/2}y⟩ , Â ≡ B^{1/2} A B^{−1/2} , v̂ ≡ B^{1/2} v .
Input data: A ∈ C^{n×n}, a nonsingular matrix; v ∈ C^n, an initial vector.
Notation: dmin(A) . . . the degree of the minimal polynomial of A. d = d(A, v) . . . the grade of v with respect to A, i.e. the smallest d such that Kd(A, v) is invariant under multiplication with A.
10
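The change of basis can be checked numerically. A sketch with a random HPD B (assumed test data; B^{1/2} computed via the spectral decomposition):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
M = rng.standard_normal((n, n))
B = M @ M.T + n * np.eye(n)                 # HPD, defines <x,y>_B = y^* B x

# B^{1/2} via the spectral decomposition of B
w, Q = np.linalg.eigh(B)
B_half = Q @ np.diag(np.sqrt(w)) @ Q.T

A = rng.standard_normal((n, n))
A_hat = B_half @ A @ np.linalg.inv(B_half)  # similarity transform of A

x, y = rng.standard_normal(n), rng.standard_normal(n)

lhs = y @ (B @ x)                           # <x, y>_B
rhs = (B_half @ y) @ (B_half @ x)           # <B^{1/2} x, B^{1/2} y>
diff_inner = abs(lhs - rhs)

# the B-adjoint of A corresponds to the ordinary adjoint of A_hat
A_plus = np.linalg.solve(B, A.T @ B)        # A^+ = B^{-1} A^* B
diff_adj = np.linalg.norm(B_half @ A_plus @ np.linalg.inv(B_half) - A_hat.T)
```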
Our Goal
Generate a basis v1, . . . , vd of Kd(A, v) such that
span{v1, . . . , vj} = Kj(A, v) for j = 1, . . . , d,
⟨vi, vj⟩ = 0 for i ≠ j, i, j = 1, . . . , d.
The Arnoldi algorithm is the standard way to generate the orthogonal basis (no normalization, for convenience):
v1 ≡ v , vj+1 = A vj − Σ_{i=1}^{j} hi,j vi , hi,j = ⟨A vj, vi⟩ / ⟨vi, vi⟩ , j = 1, . . . , d − 1.
11
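The unnormalized Arnoldi recurrence can be sketched directly (a modified Gram–Schmidt variant for numerical stability; random test data assumed):

```python
import numpy as np

def arnoldi_unnormalized(A, v, steps):
    """Arnoldi with the standard inner product and no normalization:
    v_{j+1} = A v_j - sum_i h_{i,j} v_i, h_{i,j} = <A v_j, v_i>/<v_i, v_i>."""
    V = [v]
    for j in range(steps):
        w = A @ V[j]
        for vi in V:                       # orthogonalize against all previous
            w = w - ((w @ vi) / (vi @ vi)) * vi
        V.append(w)
    return np.column_stack(V)

rng = np.random.default_rng(3)
n = 7
A = rng.standard_normal((n, n))
V = arnoldi_unnormalized(A, rng.standard_normal(n), 4)

G = V.T @ V                                # should be diagonal: V^* V diagonal
off = np.linalg.norm(G - np.diag(np.diag(G)))
off_rel = off / np.linalg.norm(G)
```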
The Arnoldi algorithm - matrix representation
In matrix notation: v1 = v ,
A [v1, . . . , vd−1] = [v1, . . . , vd] Hd,d−1 ,
where Hd,d−1 is unreduced upper Hessenberg, with entries hi,j in the upper triangle and ones on the subdiagonal, V∗d Vd is diagonal, and d = dim Kn(A, v).
An (s + 2)-term recurrence then reads
vj+1 = A vj − Σ_{i=j−s}^{j} hi,j vi .
12
Optimal short recurrences (Definition - Liesen, Strakoš 2008)
A admits an optimal (s + 2)-term recurrence if, for any v, Hd,d−1 is at most (s + 2)-band Hessenberg (hi,j = 0 for i < j − s), and, for at least one v, Hd,d−1 is exactly (s + 2)-band Hessenberg.
What are necessary and sufficient conditions on A?
13
If A is normal and A∗ = p(A) for a polynomial p of the smallest possible degree s, then A is called normal(s).
Theorem
[Faber, Manteuffel 1984], [Liesen, Strakoš 2008]
Given a nonsingular A and a nonnegative integer s with s + 2 < dmin(A): A admits an optimal (s + 2)-term recurrence if and only if A is normal(s). Sufficiency is straightforward; necessity is not. Key words from the proof of necessity in [Faber, Manteuffel 1984] include “continuous function” (analysis), “closed set of smaller dimension” (topology), and “wedge product” (multilinear algebra).
14
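The simplest instance of the theorem can be seen numerically: a Hermitian A satisfies A∗ = A, a polynomial of degree 1, so A is normal(1) and admits a 3-term recurrence; the Arnoldi Hessenberg matrix is then tridiagonal. A sketch with standard normalized Arnoldi and a random Hermitian test matrix (assumed data):

```python
import numpy as np

def arnoldi(A, v, m):
    """Standard (normalized) Arnoldi; returns the m x m Hessenberg matrix."""
    n = len(v)
    V = np.zeros((n, m + 1), dtype=complex)
    H = np.zeros((m + 1, m), dtype=complex)
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):             # modified Gram-Schmidt
            H[i, j] = np.vdot(V[:, i], w)
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    return H[:m, :m]

rng = np.random.default_rng(4)
n, m = 12, 6
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2                   # Hermitian: A* = A, i.e. normal(1)
H = arnoldi(A, rng.standard_normal(n), m)

# entries above the first superdiagonal must vanish: a 3-term recurrence
above_band = max(abs(H[i, j]) for j in range(m) for i in range(j - 1))
```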
Motivated by the paper [Liesen, Strakoš 2008], which contains a completely reworked theory of short recurrences for generating orthogonal Krylov subspace bases:
“It is unknown if a simpler proof of the necessity part can be found. In view of the fundamental nature of the Faber-Manteuffel Theorem, such proof would be a welcome addition to the existing literature, enlightening some (possibly unexpected) relationships, and it would also be more suitable for classroom teaching.”
In [Faber, Liesen, T. 2008] we give two new proofs of the Faber-Manteuffel theorem that use more elementary tools.
16
Matrix representation of A in Vd
Since Kd(A, v) is invariant, A vd ∈ Kd(A, v) and
A vd = Σ_{i=1}^{d} hi,d vi .
Hence A Vd = Vd Hd,d , where Hd,d is unreduced upper Hessenberg: the (s + 2)-band structure of Hd,d−1 extended by a generally full last column.
17
Unitary transformation of the upper Hessenberg matrix
(For simplicity, we omit the indices of Vd and Hd,d.) Proof by contradiction: let A admit an optimal (s + 2)-term recurrence and let A not be normal(s). Then there exists a starting vector v such that h1,d ≠ 0, and
A (VG) = (VG) (G∗HG) .
Find a unitary G such that G∗HG is unreduced upper Hessenberg, but G∗HG is not (s + 2)-band (up to the last column).
18
Generating an orthogonal basis of Kd(A, v) via an Arnoldi-type (s + 2)-term recurrence.
When is A normal(s)? A is normal and A∗ = p(A) with deg(p) = s. [Faber, Manteuffel 1984], [Khavinson, Świątek 2003], [Liesen, Strakoš 2008]:
s = 1: the eigenvalues of A are collinear, i.e. they lie on a line in C.
s ≥ 2: A has at most 3s − 2 different eigenvalues.
All classes of “interesting” matrices are known.
19
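The s = 1 case can be checked numerically: for A = αH + βI with H Hermitian and α, β complex, the eigenvalues lie on a line in C and A∗ is a degree-1 polynomial in A. A sketch with a random Hermitian H (assumed data):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (M + M.conj().T) / 2            # Hermitian
alpha, beta = 1 + 2j, 3 - 1j        # rotate and shift the real eigenvalue line
A = alpha * H + beta * np.eye(n)    # eigenvalues of A lie on a line in C

# A* = conj(alpha) H + conj(beta) I is a degree-1 polynomial in A:
p_A = (np.conj(alpha) / alpha) * (A - beta * np.eye(n)) + np.conj(beta) * np.eye(n)
resid = np.linalg.norm(A.conj().T - p_A)
```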
When is A orthogonally reducible to (s + 2)-band Hessenberg form?
The matrix representation of the Arnoldi algorithm can be extended by one column to A Vd = Vd Hd, where Hd ∈ C^{d×d} is an unreduced upper Hessenberg matrix. We say that A is orthogonally reducible to (s + 2)-band Hessenberg form if Hd is an (s + 2)-band Hessenberg matrix for each starting vector v1. What are necessary and sufficient conditions on A to be orthogonally reducible to (s + 2)-band Hessenberg form?
21
When is A orthogonally reducible to (s + 2)-band Hessenberg form?
A is normal(s), A∗ = p(A) ⟺ A admits an optimal (s + 2)-term recurrence ⟺ A is orthogonally reducible to (s + 2)-band Hessenberg form.
22
When is A orthogonally reducible to (s + 2)-band Hessenberg form?
Theorem
[Liesen, Strakoš 2008]
Let s be a nonnegative integer, s + 2 < dmin(A). Then the following three assertions are equivalent:
1. A is normal(s);
2. A admits an optimal (s + 2)-term recurrence;
3. A is orthogonally reducible to (s + 2)-band Hessenberg form.
1 ⟺ 2: [Faber, Manteuffel 1984]. 2 ⟺ 3: a simple proof in [Faber, Liesen, T. 2009]. The subtle difference between 1. and 3. has been a source of confusion; see [Voevodin, Tyrtyshnikov 1981], [Liesen, Saylor 2005].
23
Faber-Manteuffel theorem
Let B ∈ C^{n×n} be Hermitian positive definite (HPD), defining the B-inner product ⟨x, y⟩_B ≡ y∗Bx. B-normal(s) matrices: there exists a polynomial ps of the smallest possible degree s such that A+ ≡ B−1A∗B = ps(A), where A+ is the B-adjoint of A.
Theorem
[Faber, Manteuffel 1984], [Liesen, Strakoš 2008]
For A, B as above, and an integer s ≥ 0 with s + 2 < dmin(A): A admits for the given B an optimal (s + 2)-term recurrence if and only if A is B-normal(s).
24
The only interesting case: B-normal(1) matrices
If A is diagonalizable and the eigenvalues are collinear, then there exists an HPD B such that A is B-normal(1).
[Liesen, Strakoš 2008] → complete parametrization of all B’s.
Find a preconditioner P so that PA is B-normal(1) for some B, e.g. [Concus, Golub 1976], [Widlund 1978], [Eisenstat 1983],
[Bramble, Pasciak 1988], [Stoll, Wathen 2008].
Saddle point matrix:
A = [ A1    A2^T ]
    [ −A2   A3   ] ,
Bγ = [ A1 − γIm   A2^T      ]
     [ A2          γIk − A3 ] ,
where A1 = A1^T > 0, A3 = A3^T ≥ 0, and A2 has full rank. This matrix satisfies Bγ−1 A^T Bγ = A. How to choose γ such that Bγ is positive definite?
[Fischer et al. 1998], [Benzi, Simoncini 2006], [Liesen, Parlett 2007].
25
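A quick numerical check of the self-adjointness identity, taking Bγ = [[A1 − γIm, A2^T], [A2, γIk − A3]] and random blocks of assumed sizes; note the identity Bγ A = A^T Bγ holds for any γ, while positive definiteness of Bγ requires a suitable choice of γ (the question above, studied in the cited references):

```python
import numpy as np

rng = np.random.default_rng(6)
m, k = 6, 4
M1 = rng.standard_normal((m, m))
A1 = M1 @ M1.T + m * np.eye(m)          # A1 = A1^T > 0
M3 = rng.standard_normal((k, k))
A3 = M3 @ M3.T                          # A3 = A3^T >= 0
A2 = rng.standard_normal((k, m))        # full rank (generically)

A_sp = np.block([[A1, A2.T], [-A2, A3]])

gamma = 0.5
B_gamma = np.block([[A1 - gamma * np.eye(m), A2.T],
                    [A2, gamma * np.eye(k) - A3]])

# B_gamma-self-adjointness: B_gamma A = A^T B_gamma,
# equivalently B_gamma^{-1} A^T B_gamma = A
resid = np.linalg.norm(B_gamma @ A_sp - A_sp.T @ B_gamma)
```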
The existence of an optimal Krylov subspace method with short recurrences
For which A is it possible to generate an orthogonal basis of the Krylov subspaces using short recurrences?
We can use recurrences of a different kind than the Arnoldi-like ones. For (shifted) unitary matrices: the isometric Arnoldi process
[Gragg 1982; Jagels, Reichel 1994].
Generalized by [Barth, Manteuffel 2000] to an (ℓ, m)-recursion. A sufficient condition: A∗ is a low-degree rational function of A. Practical use: matrices with concyclic eigenvalues [Liesen 2007].
[Barth, Manteuffel 2000]: Short multiple recursion for A such that
∆ ≡ A∗qm(A) − pℓ(A) has low rank.
[Beckermann, Reichel 2008]: GMRES-like algorithm with short
recurrences for A such that ∆ ≡ A∗ − A is of low rank. Application: Path following methods.
27
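A small sketch of the low-rank situation exploited by [Beckermann, Reichel 2008]: for a symmetric matrix plus a rank-r perturbation, Δ = A∗ − A has rank at most 2r (random test data assumed):

```python
import numpy as np

rng = np.random.default_rng(7)
n, r = 50, 2
M = rng.standard_normal((n, n))
H = (M + M.T) / 2                       # symmetric part
U = rng.standard_normal((n, r))
W = rng.standard_normal((n, r))
A = H + U @ W.T                         # symmetric plus rank-r perturbation

# Delta = A^T - A = W U^T - U W^T, so rank(Delta) <= 2r
Delta = A.T - A
rank_Delta = np.linalg.matrix_rank(Delta)
```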
We characterized matrices for which it is possible to generate an orthogonal basis of Krylov subspaces via short recurrences, presented ideas of a new proof of the Faber-Manteuffel theorem, and studied its consequences.
Practical case: if the eigenvalues of A are collinear or concyclic, then there exists an HPD matrix B such that A admits short recurrences for generating a B-orthogonal basis. Examples: finding a preconditioner P so that short recurrences exist for PA; saddle point matrices.
An interesting case to study: short multiple recursion for A such that A∗qm(A) − pℓ(A) has low rank. Practical cases? Algorithmic realizations?
28
V. Faber and T. Manteuffel, Necessary and sufficient conditions for the existence of a conjugate gradient method, SIAM J. Numer. Anal., 21 (1984).
J. Liesen, When is the adjoint of a matrix a low degree rational function of the matrix?, SIAM J. Matrix Anal. Appl., 29 (2007), pp. 1171-1180.
V. Faber, J. Liesen and P. Tichý, The Faber-Manteuffel theorem for linear operators, SIAM J. Numer. Anal., 46 (2008), pp. 1323-1337.
V. Faber, J. Liesen and P. Tichý, On orthogonal reduction to Hessenberg form with small bandwidth, Numer. Algorithms, 51 (2009), pp. 133-142.
Thank you for your attention!
29