

Slide 1

On a New Proof of the Faber-Manteuffel Theorem

Petr Tichý

joint work with

Jörg Liesen and Vance Faber

Institute of Computer Science, Academy of Sciences of the Czech Republic

Householder Symposium XVII, Zeuthen, Germany, June 2, 2008

Slide 2

Outline

1. Introduction
2. Formulation of the problem
3. The Faber-Manteuffel theorem
4. The ideas of a new proof


Slide 6

Krylov subspace methods

Basis

Methods based on projection onto the Krylov subspaces

K_j(A, v) ≡ span(v, Av, …, A^{j−1}v), j = 1, 2, …, with A ∈ ℝ^{n×n}, v ∈ ℝ^n.

Each method must generate a basis of K_j(A, v). The trivial choice v, Av, …, A^{j−1}v is computationally infeasible (recall the Power Method: A^{j−1}v converges to a dominant eigenvector, so these vectors become nearly linearly dependent).

For numerical stability: a well-conditioned basis.
For computational efficiency: a short recurrence.

Best of both worlds: an orthogonal basis computed by a short recurrence.


Slide 10

Optimal Krylov subspace methods

with short recurrences

CG (1952), MINRES, SYMMLQ (1975):

are based on three-term recurrences r_{j+1} = γ_j A r_j − α_j r_j − β_j r_{j−1},
generate an orthogonal (or A-orthogonal) Krylov subspace basis,
are optimal in the sense that they minimize some error norm: ‖x − x_j‖_A in CG, ‖x − x_j‖_{AᵀA} = ‖r_j‖ in MINRES, ‖x − x_j‖ in SYMMLQ (here x_j ∈ x_0 + A K_j(A, r_0)).

An important assumption on A: A is symmetric (MINRES, SYMMLQ) and positive definite (CG).
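Why symmetry buys a three-term recurrence can be checked numerically: for symmetric A, the Arnoldi coefficients h_{i,j} with i < j − 1 vanish in exact arithmetic, so the Hessenberg matrix is tridiagonal. A minimal numpy sketch (the random test matrix and variable names are our illustration, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 12, 10
A = rng.standard_normal((n, n))
A = A + A.T                      # symmetric, as CG/MINRES/SYMMLQ assume

# Arnoldi with full Gram-Schmidt: orthogonalize A q_j against ALL previous q_i
Q = np.zeros((n, m + 1))
H = np.zeros((m + 1, m))
q0 = rng.standard_normal(n)
Q[:, 0] = q0 / np.linalg.norm(q0)
for j in range(m):
    w = A @ Q[:, j]
    for i in range(j + 1):
        H[i, j] = Q[:, i] @ w
        w -= H[i, j] * Q[:, i]
    H[j + 1, j] = np.linalg.norm(w)
    Q[:, j + 1] = w / H[j + 1, j]

# For symmetric A, every coefficient with i < j - 1 vanishes in exact
# arithmetic: the long recurrence collapses to the three-term one above.
band_leak = np.abs(np.triu(H[:m, :], 2)).max()
```

Here `band_leak` sits at roundoff level, while the same experiment with an unsymmetric A generically fills the whole upper triangle of H.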

Slide 11

Gene Golub (G. H. Golub, 1932–2007)

By the end of the 1970s it was unknown whether such methods also existed for general unsymmetric A. Gatlinburg VIII (now Householder VIII) was held in Oxford from July 5 to 11, 1981:

"A prize of $500 has been offered by Gene Golub for the construction of a 3-term conjugate gradient like descent method for non-symmetric real matrices or a proof that there can be no such method."


Slide 15

What kind of method Golub had in mind

We want to solve Ax = b using a CG-like descent method: the error is minimized in some given inner product norm, ‖·‖_B = ⟨·,·⟩_B^{1/2}.

Starting from x_0, compute x_{j+1} = x_j + α_j p_j, j = 0, 1, …, where p_j is a direction vector, α_j is a scalar (to be determined), and span{p_0, …, p_j} = K_{j+1}(A, r_0), r_0 = b − A x_0.

‖x − x_{j+1}‖_B is minimal iff

α_j = ⟨x − x_j, p_j⟩_B / ⟨p_j, p_j⟩_B and ⟨p_j, p_i⟩_B = 0 for i < j.

Hence p_0, …, p_j has to be a B-orthogonal basis of K_{j+1}(A, r_0).
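With B = A and A symmetric positive definite, ⟨x − x_j, p_j⟩_A = ⟨b − A x_j, p_j⟩ = ⟨r_j, p_j⟩ is computable, and the descent method above is exactly CG. A sketch under that assumption (the SPD test matrix is our own):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)       # SPD, so <.,.>_A is an inner product
x_true = rng.standard_normal(n)
b = A @ x_true

x = np.zeros(n)
r = b.copy()
p = r.copy()
errs = [np.sqrt((x_true - x) @ A @ (x_true - x))]   # A-norm of the error
for _ in range(n):
    # alpha_j = <x - x_j, p_j>_A / <p_j, p_j>_A = <r_j, p_j> / <p_j, p_j>_A
    alpha = (r @ p) / (p @ A @ p)
    x = x + alpha * p
    r = b - A @ x
    errs.append(np.sqrt((x_true - x) @ A @ (x_true - x)))
    if np.linalg.norm(r) < 1e-13 * np.linalg.norm(b):
        break
    beta = -(r @ (A @ p)) / (p @ A @ p)   # keep the p's A-orthogonal
    p = r + beta * p
```

The A-norm error decreases monotonically and the method terminates (in exact arithmetic) after at most d steps, in line with the slide's optimality characterization.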

Slide 16

Faber and Manteuffel, 1984

Faber and Manteuffel gave the answer in 1984: for a general matrix A there exists no short recurrence for generating orthogonal Krylov subspace bases. What are the details of this statement?



Slide 20

Formulation of the problem

B-inner product, input and notation

Without loss of generality, B = I. Otherwise change the basis:

⟨x, y⟩_B = ⟨B^{1/2}x, B^{1/2}y⟩, Â ≡ B^{1/2} A B^{−1/2}, v̂ ≡ B^{1/2} v.

Input data: A ∈ ℂ^{n×n}, a nonsingular matrix; v ∈ ℂ^n, an initial vector.

Notation: d_min(A) … the degree of the minimal polynomial of A; d = d(A, v) … the grade of v with respect to A, i.e. the smallest d such that K_d(A, v) is invariant under multiplication with A.
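The reduction to B = I can be checked directly: with B^{1/2} built from the spectral decomposition of B, B-inner products become Euclidean ones and Krylov vectors map consistently under the change of basis. A small numpy sketch (test matrices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
M = rng.standard_normal((n, n))
B = M @ M.T + n * np.eye(n)        # SPD matrix defining <.,.>_B
A = rng.standard_normal((n, n))
v = rng.standard_normal(n)

# B^{1/2} and B^{-1/2} via the spectral decomposition of B
lam, U = np.linalg.eigh(B)
Bh = U @ np.diag(np.sqrt(lam)) @ U.T
Bhi = U @ np.diag(1.0 / np.sqrt(lam)) @ U.T

Ahat = Bh @ A @ Bhi                # A-hat = B^{1/2} A B^{-1/2}
vhat = Bh @ v                      # v-hat = B^{1/2} v

# <x, y>_B equals the Euclidean inner product of B^{1/2} x and B^{1/2} y
x, y = rng.standard_normal(n), rng.standard_normal(n)
ip_B = x @ B @ y
ip_euclid = (Bh @ x) @ (Bh @ y)

# Krylov vectors transform consistently: B^{1/2} (A^3 v) = Ahat^3 vhat
lhs = Bh @ np.linalg.matrix_power(A, 3) @ v
rhs = np.linalg.matrix_power(Ahat, 3) @ vhat
```

So any statement about Euclidean-orthogonal Krylov bases of (Â, v̂) translates into one about B-orthogonal Krylov bases of (A, v).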


Slide 22

Formulation of the problem

Our Goal

Generate a basis v_1, …, v_d of K_d(A, v) such that

1. span{v_1, …, v_j} = K_j(A, v), for j = 1, …, d,
2. ⟨v_i, v_j⟩ = 0, for i ≠ j, i, j = 1, …, d.

Arnoldi's method is the standard way of generating the orthogonal basis (no normalization, for convenience):

v_1 ≡ v, v_{j+1} = A v_j − Σ_{i=1}^{j} h_{i,j} v_i, h_{i,j} = ⟨A v_j, v_i⟩ / ⟨v_i, v_i⟩, j = 1, …, d − 1.
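The slide's unnormalized Arnoldi recurrence can be run as written; below we only use the modified Gram-Schmidt form of h_{i,j} (mathematically equivalent) with a second orthogonalization pass for numerical robustness. A sketch with random test data of our choosing:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 7
A = rng.standard_normal((n, n))
v = rng.standard_normal(n)

# v_1 = v; v_{j+1} = A v_j - sum_i h_{i,j} v_i, no normalization
V = [v]
for j in range(n - 1):             # grade of v is n for generic data
    w = A @ V[j]
    for _ in range(2):             # two passes for numerical robustness
        for vi in V:
            w = w - ((w @ vi) / (vi @ vi)) * vi
    V.append(w)

Vmat = np.column_stack(V)
gram = Vmat.T @ Vmat               # should be (numerically) diagonal
norms = np.sqrt(np.diag(gram))
corr = gram / np.outer(norms, norms)
max_off = np.abs(corr - np.eye(n)).max()
```

The Gram matrix V*V comes out diagonal (condition 2), and since each v_{j+1} is A v_j minus older basis vectors, span{v_1, …, v_j} = K_j(A, v) holds by construction (condition 1).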

Slide 23

Formulation of the problem

Arnoldi's method: matrix formulation

In matrix notation: v_1 = v and

A V_{d−1} = V_d H_{d,d−1},

where V_{d−1} ≡ [v_1, …, v_{d−1}], V_d ≡ [v_1, …, v_d], and H_{d,d−1} is an unreduced upper Hessenberg matrix with the entries h_{i,j} on and above the diagonal and 1's on the subdiagonal (no normalization); V_d* V_d is diagonal, and d = dim K_n(A, v).

Slide 24

Formulation of the problem

Optimal short recurrences (definition: Liesen and Strakoš, 2008)

A admits an optimal (s + 2)-term recurrence if, for any v, H_{d,d−1} is at most (s + 2)-band Hessenberg, and for at least one v, H_{d,d−1} is (s + 2)-band Hessenberg:

A V_{d−1} = V_d H_{d,d−1}.

Here (s + 2)-band Hessenberg means h_{i,j} = 0 for i < j − s: each column of the d × (d − 1) matrix H_{d,d−1} has at most s + 1 nonzero entries on or above the diagonal, plus the subdiagonal entry. (The slide pictures this band structure.)


Slide 28

Formulation of the problem

Basic question

What are sufficient and necessary conditions for A to admit an optimal (s + 2)-term recurrence?

In other words, how can we characterize matrices A such that for any v, Arnoldi's method applied to A and v generates an orthogonal basis via a short recurrence of length s + 2?

Example of sufficiency: if A* = A, then s = 1 and A admits an optimal 3-term recurrence.

Definition. If A* = p_s(A), where p_s is a polynomial of the smallest possible degree s, then A is called normal(s).
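A concrete normal(1) matrix that is not Hermitian, assuming the rotated-line construction below (our illustration, not from the slides): if a normal A has eigenvalues on a line a + b·t, t ∈ ℝ, in the complex plane, then conj(λ) is a degree-1 polynomial in λ, hence A* = p_1(A).

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6

# Normal matrix whose eigenvalues lie on the (rotated) line a + b*t, t real
a, b = 1.0 + 2.0j, np.exp(0.7j)
t = rng.standard_normal(n)
lam = a + b * t
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q, _ = np.linalg.qr(Z)             # random unitary eigenvector basis
A = Q @ np.diag(lam) @ Q.conj().T

# Degree-1 polynomial p with p(lam_k) = conj(lam_k) for every eigenvalue:
#   conj(a + b t) = c1 * (a + b t) + c0, c1 = conj(b)/b, c0 = conj(a) - c1*a
c1 = np.conj(b) / b
c0 = np.conj(a) - c1 * a
residual = np.abs(A.conj().T - (c1 * A + c0 * np.eye(n))).max()
```

So this A is normal(1) while clearly not Hermitian; by the theorem on the following slides it still admits an optimal 3-term recurrence.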



Slide 31

The Faber-Manteuffel theorem

Theorem [Faber and Manteuffel, 1984], [Liesen and Strakoš, 2008]. Let A be a nonsingular matrix with minimal polynomial degree d_min(A), and let s be a nonnegative integer with s + 2 < d_min(A). Then A admits an optimal (s + 2)-term recurrence if and only if A is normal(s).

Sufficiency is rather straightforward; necessity is not. Key words from the proof of necessity in [Faber and Manteuffel, 1984] include "continuous function" (analysis), "closed set of smaller dimension" (topology), and "wedge product" (multilinear algebra).
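Both directions of the theorem can be observed numerically. The sketch below (test matrices and the helper `arnoldi_H` are ours) runs Arnoldi on a normal(1) matrix of the form e^{iθ}(H + μI) with H Hermitian, whose eigenvalues lie on a line, and on a generic non-normal matrix, and measures how much of the Hessenberg matrix leaks outside the 3-band structure:

```python
import numpy as np

def arnoldi_H(A, v, m):
    """Standard (normalized) Arnoldi; returns the leading m x m Hessenberg block."""
    n = len(v)
    Q = np.zeros((n, m + 1), dtype=complex)
    H = np.zeros((m + 1, m), dtype=complex)
    Q[:, 0] = v / np.linalg.norm(v)
    for j in range(m):
        w = A @ Q[:, j]
        for i in range(j + 1):
            H[i, j] = Q[:, i].conj() @ w
            w = w - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        Q[:, j + 1] = w / H[j + 1, j]
    return H[:m, :m]

rng = np.random.default_rng(6)
n, m = 10, 8
Herm = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Herm = Herm + Herm.conj().T
# eigenvalues on a line in C  =>  A* is a degree-1 polynomial in A: normal(1)
A_normal1 = np.exp(0.3j) * (Herm + (2 + 5j) * np.eye(n))
A_general = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)

leak_normal1 = np.abs(np.triu(arnoldi_H(A_normal1, v, m), 2)).max()
leak_general = np.abs(np.triu(arnoldi_H(A_general, v, m), 2)).max()
```

For the normal(1) matrix the entries above the band are at roundoff level (a 3-term recurrence suffices); for the generic matrix they are O(1).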

Slide 32

The Faber-Manteuffel theorem

Why is necessity so hard?

Optimal (s + 2)-term recurrence:

A V_{d−1} = V_d H_{d,d−1}, with H_{d,d−1} (s + 2)-band Hessenberg (the slide pictures this band structure).

One must prove something about the linear operator A without complete knowledge of the structure of its matrix representation.

Slide 33

The Faber-Manteuffel theorem

Why is necessity so hard?

Since K_d(A, v) is invariant, A v_d ∈ K_d(A, v) and A v_d = Σ_{i=1}^{d} h_{i,d} v_i, so

A V_d = V_d H_{d,d},

where H_{d,d} is (s + 2)-band Hessenberg up to its last column, which may be full (the slide pictures this structure).



Slide 36

V. Faber, J. Liesen and P. Tichý, 2008

The Faber-Manteuffel Theorem for Linear Operators

Motivated by the paper [J. Liesen and Z. Strakoš, 2008], which contains a completely reworked theory of short recurrences for generating orthogonal Krylov subspace bases:

"It is unknown if a simpler proof of the necessity part can be found. In view of the fundamental nature of the Faber-Manteuffel Theorem, such proof would be a welcome addition to the existing literature. It would lead to a better understanding of the theorem by enlightening some (possibly unexpected) relationships, and it would also be more suitable for classroom teaching."

We give two new proofs of the Faber-Manteuffel theorem that use more elementary tools:
first proof: an improved version of the Faber-Manteuffel proof;
second proof: a completely new proof based on orthogonal transformations of upper Hessenberg matrices.


Slide 39

Idea of the second proof (1) [V. Faber, J. Liesen and P. Tichý, 2008]

(For simplicity, we omit the indices of V_d and H_{d,d}.) Let A admit an optimal (s + 2)-term recurrence,

A V = V H, V*V = I.

Up to the last column, H is (s + 2)-band Hessenberg.

Let G be a d × d unitary matrix, G*G = I. Then

A (VG) = (VG)(G*HG), i.e. A W = W H̃ with W ≡ VG, H̃ ≡ G*HG.

W is unitary. If G is chosen such that H̃ is again an unreduced upper Hessenberg matrix, then A W = W H̃ represents the result of Arnoldi's method applied to A and w_1. Hence, up to the last column, H̃ has to be (s + 2)-band Hessenberg.


Slide 42

Idea of the second proof (2) [V. Faber, J. Liesen and P. Tichý, 2008]

Proof by contradiction. Let A admit an optimal (s + 2)-term recurrence and let A not be normal(s). Then there exists a starting vector v such that h_{1,d} ≠ 0:

A V = V H, with H (s + 2)-band Hessenberg up to its last column, and h_{1,d} ≠ 0 in that last column.

Transform: A (VG) = (VG) (G* H G). Find a unitary G (a product of Givens rotations) such that H̃ = G*HG is unreduced upper Hessenberg, but H̃ is not (s + 2)-band up to the last column: a contradiction.


Slide 57

Idea of the second proof (3) [V. Faber, J. Liesen and P. Tichý, 2008]

Let v be a starting vector such that h_{1,8} ≠ 0 (here d = 8). Choose a Givens rotation G_{7,8}, then apply further rotations G_{6,7}, G_{5,6}, …, G_{1,2}, each chosen so that the transformed matrix remains unreduced upper Hessenberg (the original slides step through this sequence rotation by rotation). Altogether,

G ≡ G_{7,8} G_{6,7} · · · G_{1,2}, H̃ ≡ G* H G.

We proved: it is possible to choose G_{7,8} such that

h_{1,8} ≠ 0 ⟹ h̃_{1,7} ≠ 0 or h̃_{2,7} ≠ 0.
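The rotation sweep can be sketched numerically, assuming the banded test matrix and rotation angle below (a hypothetical s = 1, d = 8 example, not the paper's exact construction): after a generic G_{7,8}, each subsequent rotation is chosen to zero the stray below-subdiagonal entry the previous one created, and at the end the planted h_{1,8} has escaped the band into column 7.

```python
import numpy as np

def plane_rotation(d, j, c, s):
    """Givens rotation acting in the (j, j+1) plane (0-based), real case."""
    G = np.eye(d)
    G[j, j], G[j, j + 1] = c, s
    G[j + 1, j], G[j + 1, j + 1] = -s, c
    return G

rng = np.random.default_rng(7)
d, s_band = 8, 1                   # s = 1: 3-band (tridiagonal) Hessenberg

# Unreduced 3-band Hessenberg H, except the free last column, where we
# plant h_{1,d} != 0: the entry that exists when A is not normal(s)
H = np.zeros((d, d))
for j in range(d):
    for i in range(max(j - s_band, 0), min(j + 2, d)):
        H[i, j] = rng.uniform(0.5, 1.5)
H[0, d - 1] = 1.0

# G_{7,8} with a generic angle, then G_{6,7}, ..., G_{1,2}, each chosen so
# that the similarity transform stays unreduced upper Hessenberg
Ht = H.copy()
G = plane_rotation(d, d - 2, np.cos(0.5), np.sin(0.5))
Ht = G.T @ Ht @ G
for j in range(d - 3, -1, -1):
    a, b = Ht[j + 2, j], Ht[j + 2, j + 1]   # zero the stray entry (j+2, j)
    r = np.hypot(a, b)
    G = plane_rotation(d, j, b / r, a / r)
    Ht = G.T @ Ht @ G

hessenberg_ok = np.abs(np.tril(Ht, -2)).max() < 1e-12
# The planted nonzero escaped the band: h-tilde_{1,7} or h-tilde_{2,7} != 0
escaped = max(abs(Ht[0, d - 2]), abs(Ht[1, d - 2]))
```

The result H̃ is again unreduced upper Hessenberg, so it corresponds to Arnoldi applied to A and w_1, yet its column 7 violates the 3-band structure: exactly the contradiction the proof needs.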


Slide 61

Summary

Generating an orthogonal basis of K_d(A, v) via short recurrences:

An Arnoldi-type (s + 2)-term recurrence exists ⟺ A is normal(s), i.e. A* = p_s(A).

When is A normal(s)? A is normal and, by [Faber and Manteuffel, 1984], [Khavinson and Świątek, 2003], [Liesen and Strakoš, 2008]:

1. s = 1 if and only if the eigenvalues of A lie on a line in ℂ.
2. If the eigenvalues of A are not on a line, then d_min(A) ≤ 3s − 2.

Hence the only interesting case is s = 1, collinear eigenvalues. All classes of "interesting" matrices are known.


Slide 63

Related papers

J. Liesen and Z. Strakoš, On optimal short recurrences for generating orthogonal Krylov subspace bases, to appear in SIAM Review, 2008. (A completely reworked theory of short recurrences for generating orthogonal Krylov subspace bases.)

V. Faber, J. Liesen and P. Tichý, The Faber-Manteuffel Theorem for Linear Operators, SIAM J. Numer. Anal., 46 (2008), pp. 1323-1337. (New proofs of the fundamental theorem of Faber and Manteuffel.)

More details can be found at
http://www.cs.cas.cz/tichy
http://www.math.tu-berlin.de/~liesen
http://www.cs.cas.cz/strakos

Thank you for your attention!