
SLIDE 1

QR ALGORITHM

Lesson 22

SLIDE 2
  • The QR algorithm is a method for calculating all eigenvalues
  • We will see that the pure QR algorithm is equivalent to power iteration applied to multiple vectors at once
  • It therefore suffers the same problems as power iteration
  • To make the algorithm practical, we use shifts, as in Rayleigh quotient iteration
  • We also reduce matrices to tridiagonal form
  • We also introduce the Schur decomposition and calculate the SVD
SLIDE 3

SIMULTANEOUS ITERATION

SLIDE 5
  • Suppose we want to find the two largest eigenvalues λ1 and λ2
  • The idea is to run power iteration on two column vectors simultaneously:

$$V_0 = \big[\, v_1 \,\big|\, v_2 \,\big], \qquad V_k = A V_{k-1}$$

  • Now each normalized column converges to the same vector q1
  • But we claim that the span of the columns of Vk converges in a sense to the span of the two largest eigenvectors
  • In other words, Qk → [±q1 | ±q2], where QkRk = Vk

SLIDE 8
  • We can make this precise: we have

$$V_0 = \left[\, \sum_{j=1}^n c_{j1} q_j \;\middle|\; \sum_{j=1}^n c_{j2} q_j \,\right] = Q \begin{bmatrix} c_{11} & c_{12} \\ \vdots & \vdots \\ c_{n1} & c_{n2} \end{bmatrix} = QC$$

  • Then we have, since AQ = QΛ,

$$V_k = A^k Q C = Q \Lambda^k C$$

  • Assuming that $C_{1:2,1:2} := \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix}$ is nonsingular, we have, without changing the column space,

$$V_k C_{1:2,1:2}^{-1} \begin{bmatrix} \lambda_1^{-k} & \\ & \lambda_2^{-k} \end{bmatrix} = Q \begin{bmatrix} 1 & \\ & 1 \\ (\lambda_3/\lambda_1)^k \tilde c_{31} & (\lambda_3/\lambda_2)^k \tilde c_{32} \\ \vdots & \vdots \\ (\lambda_n/\lambda_1)^k \tilde c_{n1} & (\lambda_n/\lambda_2)^k \tilde c_{n2} \end{bmatrix} \to Q \begin{bmatrix} 1 & \\ & 1 \\ & \\ \vdots & \vdots \end{bmatrix}$$

where the $\tilde c_{ij}$ are the entries of $C\, C_{1:2,1:2}^{-1}$

SLIDE 11
  • A more numerically practical approach is to orthogonalize at each step:

$$V_0 = QC, \qquad \bar Q_k \bar R_k = V_k, \qquad V_k = A \bar Q_{k-1}$$

  • This is the same as

$$V_k = A \bar Q_{k-1} = A \bar Q_{k-1} \bar R_{k-1} \bar R_{k-1}^{-1} = A V_{k-1} \bar R_{k-1}^{-1} = \cdots = A^k V_0 \bar R_0^{-1} \cdots \bar R_{k-1}^{-1}$$

  • In other words, we are calculating the QR decomposition of $A^k V_0$:

$$A^k V_0 = V_k \bar R_{k-1} \cdots \bar R_0 = \bar Q_k \bar R_k \bar R_{k-1} \cdots \bar R_0 = \bar Q_k \tilde R_k$$
  • Note that we can easily apply this to 3, 4, or even n columns
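The telescoping identity $A^k V_0 = \bar Q_k \tilde R_k$ is easy to verify numerically. An illustrative NumPy sketch (NumPy's `qr` does not enforce $R_{ii} > 0$, but the identity holds for whatever QR pair it returns):

```python
# Orthogonalized simultaneous iteration, checking A^k V0 = Qbar_k Rtilde_k.
import numpy as np

rng = np.random.default_rng(1)
n, k = 5, 4
M = rng.standard_normal((n, n))
A = (M + M.T) / 2
V0 = rng.standard_normal((n, 2))

Qb, R = np.linalg.qr(V0)          # Qbar_0 Rbar_0 = V0
Rt = R                            # running product Rtilde_k = Rbar_k ... Rbar_0
for _ in range(k):
    V = A @ Qb                    # Vk = A Qbar_{k-1}
    Qb, R = np.linalg.qr(V)       # Qbar_k Rbar_k = Vk
    Rt = R @ Rt

lhs = np.linalg.matrix_power(A, k) @ V0
rhs = Qb @ Rt
err = np.abs(lhs - rhs).max()     # agreement to rounding error
```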
SLIDE 12

QR ALGORITHM (NOT DECOMPOSITION)

SLIDE 15
  • The pure QR algorithm consists of calculating a QR decomposition and recombining in the other order: for A_0 = A, define

$$Q_k R_k = A_{k-1}, \qquad A_k = R_k Q_k$$

  • Note that $A_k = R_k Q_k = Q_k^\top A_{k-1} Q_k$ has the same eigenvalues as A
  • The remarkable fact is that, provided the eigenvalues are strictly ordered (|λ1| > |λ2| > · · · > |λn|), this converges to a diagonal matrix: $A_k \to \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$
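A sketch of the pure QR iteration on a small symmetric matrix constructed to have the known, strictly ordered eigenvalues 4, 2, 1 (matrix and names are illustrative):

```python
# Pure QR iteration: Ak converges to a diagonal matrix of eigenvalues.
import numpy as np

rng = np.random.default_rng(2)
lam = np.array([4.0, 2.0, 1.0])                 # strictly ordered: |l1| > |l2| > |l3|
Qo, _ = np.linalg.qr(rng.standard_normal((3, 3)))
A0 = Qo @ np.diag(lam) @ Qo.T
Ak = A0
for _ in range(60):
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q                                  # A_k = R_k Q_k = Q_k^T A_{k-1} Q_k
off = np.abs(Ak - np.diag(np.diag(Ak))).max()   # off-diagonal part -> 0
diag_err = np.abs(np.diag(Ak) - lam).max()      # diagonal -> eigenvalues in magnitude order
```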

SLIDE 16

[Figure: error of the QR iterates against iteration count k; y-axis from 10⁻¹¹ to 10⁻², x-axis up to 20000]

  • Consider a random symmetric Gaussian matrix

$$G_n = \frac{1}{\sqrt n} \begin{bmatrix} G_{11} & \cdots & G_{1n} \\ \vdots & \ddots & \vdots \\ G_{1n} & \cdots & G_{nn} \end{bmatrix}$$

where the $G_{ij}$ are random Gaussian entries with mean zero, and with twice the variance on the diagonal as off the diagonal

SLIDE 17

QR ALGORITHM = SIMULTANEOUS ITERATION

SLIDE 20
  • We use the QR decomposition with $R_{ii} > 0$, which is unique (via Gram–Schmidt)
  • Simultaneous iteration on n vectors with initial guess I is

$$\bar Q_0 = I, \qquad Z_k = A \bar Q_{k-1}, \qquad \bar Q_k \bar R_k = Z_k$$

  • The QR algorithm is

$$A_0 = A, \qquad Q_k R_k = A_{k-1}, \qquad A_k = R_k Q_k$$

  • We claim that

$$\bar Q_k = Q_1 \cdots Q_k, \qquad \bar R_k = R_k \qquad \text{and} \qquad A_k = \bar Q_k^\top A \bar Q_k$$
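The last identity, $A_k = \bar Q_k^\top A \bar Q_k$ with $\bar Q_k = Q_1 \cdots Q_k$, can be checked directly. A short sketch (it holds for any QR sign convention, so NumPy's `qr` suffices):

```python
# QR-algorithm iterates are orthogonal conjugations of A by Q1...Qk.
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2
Ak = A
Qbar = np.eye(4)
for _ in range(5):
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q                 # one QR-algorithm step
    Qbar = Qbar @ Q            # accumulate Qbar_k = Q1 ... Qk
err = np.abs(Ak - Qbar.T @ A @ Qbar).max()
```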

SLIDE 21
  • The QR algorithm converges provided that Q = LU exists; the proof is straightforward when the eigenvalues are strictly ordered |λ1| > |λ2| > · · · > |λn|
  • Importantly, the (n, n) entry converges to the smallest eigenvalue; we only care about convergence of the smallest eigenvalue, as we can use shifts to make a small eigenvalue smaller and accelerate convergence

SLIDE 22

REDUCTION TO TRIDIAGONAL FORM

SLIDE 25
  • Suppose A is symmetric tridiagonal:

$$A = \begin{bmatrix} \alpha_1 & \beta_1 \\ \beta_1 & \alpha_2 & \beta_2 \\ & \ddots & \ddots & \ddots \\ & & \beta_{n-2} & \alpha_{n-1} & \beta_{n-1} \\ & & & \beta_{n-1} & \alpha_n \end{bmatrix}$$

  • Consider the sequence of Givens rotations that introduce zeros one at a time, which we saw introduces one superdiagonal:

$$R = Q^\top A = Q_{n-1}^\top \cdots Q_1^\top A = \begin{bmatrix} \times & \times & \times \\ & \times & \times & \times \\ & & \ddots & \ddots & \ddots \\ & & & \times & \times & \times \\ & & & & \times & \times \\ & & & & & \times \end{bmatrix}$$

  • $Q_k^\top$ acts only on the kth and (k+1)th rows when acting on the left, so $Q_k$ acts only on the kth and (k+1)th columns when acting on the right
  • So the next stage in the QR algorithm has only one subdiagonal:

$$RQ = R Q_1 \cdots Q_{n-1} = \begin{bmatrix} \times & \times & \times & \cdots & \times \\ \times & \times & \times & \cdots & \times \\ & \times & \times & \cdots & \times \\ & & \ddots & \ddots & \vdots \\ & & & \times & \times \end{bmatrix}$$
SLIDE 27
  • But we know that $RQ = Q^\top Q R Q = Q^\top A Q$ is symmetric:

$$e_j^\top Q^\top A Q\, e_k = q_j^\top A q_k = q_k^\top A q_j = e_k^\top Q^\top A Q\, e_j$$

  • Thus if it only has one subdiagonal, symmetry implies it has only one superdiagonal:

$$RQ = \begin{bmatrix} \times & \times \\ \times & \times & \times \\ & \ddots & \ddots & \ddots \\ & & \times & \times & \times \\ & & & \times & \times \end{bmatrix}$$

  • So each iteration of the QR algorithm is also tridiagonal, and requires only O(n) operations
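A quick numerical check that one QR step of a symmetric tridiagonal matrix stays tridiagonal (illustrative sketch):

```python
# RQ = Q^T A Q of a symmetric tridiagonal A stays within the tridiagonal band.
import numpy as np

rng = np.random.default_rng(4)
n = 6
a, b = rng.standard_normal(n), rng.standard_normal(n - 1)
A = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)   # symmetric tridiagonal
Q, R = np.linalg.qr(A)
RQ = R @ Q                                        # one pure QR step
band = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :]) > 1
leak = np.abs(RQ[band]).max()                     # entries outside the band: rounding only
```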
SLIDE 30
  • Our goal for general matrices A is to try to find an orthogonal transformation V so that $V^\top A V$ is tridiagonal
  • Consider Givens rotations that introduce zeros using the second row:

$$Q_1^\top A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \times & \times & \cdots & \times \\ & \times & \cdots & \times \\ & \vdots & \ddots & \vdots \\ & \times & \cdots & \times \end{bmatrix}$$

  • Then $AQ_1$ only alters the second through nth columns:

$$Q_1^\top A Q_1 = \begin{bmatrix} a_{11} & \times & \cdots & \times \\ \times & \times & \cdots & \times \\ & \times & \cdots & \times \\ & \vdots & \ddots & \vdots \\ & \times & \cdots & \times \end{bmatrix}$$

SLIDE 31
  • Constructing $Q_k$ that acts only on the (k+1)th through nth rows to introduce zeros, we have

$$Q_{n-1}^\top \cdots Q_1^\top A\, Q_1 \cdots Q_{n-1} = \begin{bmatrix} a_{11} & \times & \times & \cdots & \times \\ \times & \times & \times & \cdots & \times \\ & \times & \times & \cdots & \times \\ & & \ddots & \ddots & \vdots \\ & & & \times & \times \end{bmatrix}$$

  • Symmetry implies this is actually tridiagonal
  • The total cost of m iterations of the QR algorithm with the initial tridiagonalization step is then O(n³ + mn)
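The slides build the reduction from Givens rotations; the sketch below uses Householder reflectors instead, an equivalent standard construction (the function name and details are illustrative assumptions, not the slides' code):

```python
# Householder tridiagonalization of a symmetric matrix (illustrative sketch).
import numpy as np

def tridiagonalize(A):
    """Return (T, V) with V orthogonal and T = V.T @ A @ V tridiagonal."""
    T = np.array(A, dtype=float)
    n = T.shape[0]
    V = np.eye(n)
    for k in range(n - 2):
        x = T[k + 1:, k].copy()
        s = 1.0 if x[0] >= 0 else -1.0
        v = x
        v[0] += s * np.linalg.norm(x)              # Householder vector (cancellation-safe sign)
        nv = np.linalg.norm(v)
        if nv == 0.0:                              # column already reduced
            continue
        v /= nv
        H = np.eye(n)
        H[k + 1:, k + 1:] -= 2.0 * np.outer(v, v)  # reflector on rows/columns k+2..n
        T = H @ T @ H                              # symmetric similarity transform
        V = V @ H
    return T, V

rng = np.random.default_rng(5)
M = rng.standard_normal((6, 6))
A = (M + M.T) / 2
T, V = tridiagonalize(A)
band = np.abs(np.arange(6)[:, None] - np.arange(6)[None, :]) > 1
leak = np.abs(T[band]).max()                       # outside-band entries: rounding only
sim_err = np.abs(V.T @ A @ V - T).max()            # similarity holds
```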
SLIDE 32

QR ALGORITHM WITH SHIFTS

SLIDE 33
  • We now want to incorporate shifts, using guesses of the current eigenvalue to induce faster convergence, as in Rayleigh quotient iteration
  • Provided we can come up with a guess $\mu_k$, we use the following:

$$Q_k R_k = A_{k-1} - \mu_k I, \qquad A_k = R_k Q_k + \mu_k I$$

Note that tridiagonality is preserved
  • We are left with two problems: choosing $\mu_k$, and deflation (splitting into sub-problems once an eigenvalue has converged)
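A single shifted step can be sketched as follows. The shift cancels in exact arithmetic, so $A_k = Q^\top A_{k-1} Q$: the eigenvalues are unchanged and the tridiagonal structure survives (illustrative sketch):

```python
# One shifted QR step: similarity transform, tridiagonal structure preserved.
import numpy as np

rng = np.random.default_rng(6)
n = 5
a, b = rng.standard_normal(n), rng.standard_normal(n - 1)
A = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)   # symmetric tridiagonal
mu = A[-1, -1]                                    # simplest shift: the (n, n) entry
Q, R = np.linalg.qr(A - mu * np.eye(n))
A1 = R @ Q + mu * np.eye(n)                       # A_k = R_k Q_k + mu_k I = Q^T A Q
eig_err = np.abs(np.linalg.eigvalsh(A1) - np.linalg.eigvalsh(A)).max()
band = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :]) > 1
leak = np.abs(A1[band]).max()
```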

SLIDE 37
  • The simplest choice of $\mu_k$ is the (n, n) entry of $A_{k-1}$:

$$\mu_k = e_n^\top A_{k-1} e_n$$

The idea is that this entry eventually converges to an eigenvalue

[Table: iterates $A_k$ at k = 2, 4, 6 comparing the shifted QR algorithm with the pure QR algorithm; matrix entries omitted]

SLIDE 38
  • At some point we converge to an eigenvalue, in which case the guess $\mu_k$ will not help convergence
  • Fortunately, at the point the eigenvalue has converged, we have decoupled the problem:

$$A_k = \begin{bmatrix} \times & \times \\ \times & \times & \times \\ & \ddots & \ddots & \ddots \\ & & \times & \times & \\ & & & & \times \end{bmatrix} = \begin{bmatrix} \tilde A & \\ & \lambda \end{bmatrix}$$

  • We can now run the algorithm on $\tilde A$, which is only (n − 1) × (n − 1); once its (n − 1), (n − 1) entry has converged, we deflate again

SLIDE 40
  • This choice of shifts can have problems: consider

$$A = \begin{bmatrix} & 1 \\ 1 & \end{bmatrix}$$

  • Unshifted QR doesn't work: A itself is orthogonal, so the QR decomposition doesn't do anything: $Q_1 I = A_0 = A$, and $A_1 = I Q_1 = Q_1 = A_0$
  • But the simple shift of taking the (2, 2) entry of $A_k$ doesn't help: its (2, 2) entry is always zero
  • To prevent this in general, the Wilkinson shift can be used: take the eigenvalue of the bottom 2 × 2 block of A that is closest to the last entry of A
  • This can apparently be shown to always work
SLIDE 41

QR ALGORITHM WITH SHIFTS

  • Assume A has been tridiagonalized using Givens rotations
  • While A is larger than 2 × 2:
      Let µ be the eigenvalue of $A_{n-1:n,\, n-1:n}$ closest to $A_{nn}$
      $QR = A - \mu I$
      $A = RQ + \mu I$
      If $|A_{n-1,n}| < \epsilon$, add $A_{nn}$ to the list of eigenvalues and deflate: $A = A_{1:n-1,\, 1:n-1}$
  • Now A is 2 × 2 and its eigenvalues can be calculated directly
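The pseudocode above can be sketched in NumPy as follows. This is a dense-matrix illustration only: a practical implementation would exploit the tridiagonal structure for O(n) work per step, and the function name and tolerance here are assumptions:

```python
# Shifted QR with deflation for a symmetric tridiagonal matrix (illustrative sketch).
import numpy as np

def qr_eigenvalues(A, tol=1e-12):
    A = np.array(A, dtype=float)
    eigs = []
    while A.shape[0] > 2:
        n = A.shape[0]
        # Wilkinson-style shift: eigenvalue of the trailing 2x2 block closest to A[n-1, n-1]
        w = np.linalg.eigvalsh(A[-2:, -2:])
        mu = w[np.argmin(np.abs(w - A[-1, -1]))]
        Q, R = np.linalg.qr(A - mu * np.eye(n))
        A = R @ Q + mu * np.eye(n)
        if abs(A[-1, -2]) < tol:                  # last eigenvalue converged: deflate
            eigs.append(A[-1, -1])
            A = A[:-1, :-1]
    eigs.extend(np.linalg.eigvalsh(A))            # remaining 2x2 solved directly
    return np.sort(np.array(eigs))

rng = np.random.default_rng(7)
n = 8
a, b = rng.standard_normal(n), rng.standard_normal(n - 1)
A = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)
err = np.abs(qr_eigenvalues(A) - np.sort(np.linalg.eigvalsh(A))).max()
```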
SLIDE 42

SCHUR DECOMPOSITION

SLIDE 43
  • We consider decompositions for non-symmetric matrices
  • We introduce the Schur decomposition

$$A = QUQ^*$$

where Q is unitary and U is upper triangular; the QR algorithm computes the Schur decomposition when applied to non-symmetric matrices
  • The eigenvalues of A are the diagonal of U
SLIDE 45
  • Recall that every matrix has an eigenvalue decomposition into Jordan canonical form $A = VJV^{-1}$, where J is upper triangular with eigenvalues along the diagonal, and only ones and zeros above the diagonal
  • Jordan canonical form is completely useless for numerics: a small perturbation causes matrices to be diagonalizable, but the eigenvectors can be very badly conditioned:

$$\begin{bmatrix} 1 & 1 \\ \epsilon & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ \sqrt\epsilon & -\sqrt\epsilon \end{bmatrix} \begin{bmatrix} 1 + \sqrt\epsilon & \\ & 1 - \sqrt\epsilon \end{bmatrix} \begin{bmatrix} 1 & 1 \\ \sqrt\epsilon & -\sqrt\epsilon \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 1 \\ \sqrt\epsilon & -\sqrt\epsilon \end{bmatrix} \begin{bmatrix} 1 + \sqrt\epsilon & \\ & 1 - \sqrt\epsilon \end{bmatrix} \begin{bmatrix} \tfrac12 & \tfrac{1}{2\sqrt\epsilon} \\ \tfrac12 & -\tfrac{1}{2\sqrt\epsilon} \end{bmatrix}$$

  • The norm of $V^{-1}$ is huge
SLIDE 47
  • However, if we orthogonalize the eigenvectors V = QR, we obtain the Schur decomposition $A = VJV^{-1} = QRJR^{-1}Q^* = QUQ^*$, where U is upper triangular
  • The Schur decomposition is robust for numerics: it is stable to a small perturbation:

$$\begin{bmatrix} 1 & 1 \\ \epsilon & 1 \end{bmatrix} = Q \begin{bmatrix} 1 - \sqrt\epsilon & 1 - \epsilon \\ & 1 + \sqrt\epsilon \end{bmatrix} Q^* \qquad \text{for} \qquad Q = \frac{1}{\sqrt{1+\epsilon}} \begin{bmatrix} -1 & -\sqrt\epsilon \\ \sqrt\epsilon & -1 \end{bmatrix}$$

  • Q is unitary, therefore always well-conditioned
SLIDE 49
  • The Schur decomposition includes the eigenvalues along the diagonal, but also contains more information about the operator
  • It reduces powers of general matrices to powers of upper triangular matrices:

$$A^k = Q U^k Q^*$$

The SVD is not useful for describing powers of matrices
  • It is also useful for efficiently calculating pseudo-spectra:

$$\|(A - \lambda I)^{-1}\| = \|Q (U - \lambda I)^{-1} Q^*\| = \|(U - \lambda I)^{-1}\|$$
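Both claims can be illustrated by building A from a chosen Schur pair (Q, U) rather than computing one (all names and the sample λ are illustrative):

```python
# Schur-form power identity and resolvent-norm invariance (illustrative sketch).
import numpy as np

rng = np.random.default_rng(8)
n = 4
U = np.triu(rng.standard_normal((n, n)))        # upper triangular, eigenvalues on diagonal
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ U @ Q.T                                 # A = Q U Q^* (real orthogonal case)
pow_err = np.abs(np.linalg.matrix_power(A, 3)
                 - Q @ np.linalg.matrix_power(U, 3) @ Q.T).max()

# unitary invariance of the 2-norm gives the pseudo-spectra identity:
lam = 0.1 + 0.2j
res_A = np.linalg.norm(np.linalg.inv(A - lam * np.eye(n)), 2)
res_U = np.linalg.norm(np.linalg.inv(U - lam * np.eye(n)), 2)
```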
SLIDE 50
  • When A is not symmetric, we cannot reduce to tridiagonal form, but we can reduce to upper-Hessenberg form, which has only one subdiagonal
  • If QR = A is upper-Hessenberg then RQ is also upper-Hessenberg
  • The QR algorithm with shifts applied to non-symmetric matrices almost always converges to upper triangular form in practice; proving this is an open problem (I think)
  • The complexity is now O(n³ + mn²)
SLIDE 51

COMPUTATION OF THE SVD

SLIDE 53
  • The simplest approach for calculating the singular values of $A = U\Sigma V^*$ is to apply the QR algorithm with shifts to $A^*A = V\Sigma^2 V^*$; this suffers from bad conditioning
  • Instead, we can double the dimension and apply the QR algorithm with shifts to

$$B = \begin{bmatrix} & A^* \\ A & \end{bmatrix}$$

  • This is symmetric/Hermitian, and satisfies

$$B = Q \begin{bmatrix} \Sigma & \\ & -\Sigma \end{bmatrix} Q^* \qquad \text{for} \qquad Q = \frac{1}{\sqrt 2} \begin{bmatrix} V & V \\ U & -U \end{bmatrix}$$
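The ±Σ eigenvalue structure of B is easy to confirm numerically (illustrative sketch, using the real/transpose case):

```python
# Eigenvalues of the doubled-dimension symmetric block matrix are +-(singular values).
import numpy as np

rng = np.random.default_rng(9)
n = 4
A = rng.standard_normal((n, n))
Z = np.zeros((n, n))
B = np.block([[Z, A.T], [A, Z]])                 # symmetric by construction
sigma = np.linalg.svd(A, compute_uv=False)
w = np.sort(np.linalg.eigvalsh(B))
err = np.abs(w - np.sort(np.concatenate([sigma, -sigma]))).max()
```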

SLIDE 55
  • We can bi-diagonalize A before we apply the QR algorithm:

$$H = Q_1^* A Q_2 = \begin{bmatrix} \times & \times \\ & \times & \times \\ & & \ddots & \ddots \\ & & & \times & \times \\ & & & & \times \end{bmatrix}$$

  • We can permute to get a tridiagonal matrix:

$$P^\top \begin{bmatrix} & H^* \\ H & \end{bmatrix} P = \begin{bmatrix} \times & \times \\ \times & \times & \times \\ & \ddots & \ddots & \ddots \\ & & \times & \times & \times \\ & & & \times & \times \end{bmatrix}$$