LESSON 22: THE QR ALGORITHM
- The QR algorithm is a method for calculating all eigenvalues
- We will see that the pure QR algorithm is equivalent to power iteration applied to multiple vectors at once
- It therefore suffers the same problems as power iteration
- To make the algorithm practical, we use shifts, as in Rayleigh quotient iteration
- We also reduce matrices to tridiagonal form
- We also introduce the Schur decomposition and calculate the SVD
SIMULTANEOUS ITERATION
- Suppose we want to find the two largest eigenvalues λ1 and λ2
- The idea is to run power iteration on two column vectors simultaneously:

      V0 = [v1 | v2],   Vk = A Vk−1

- Each normalized column of Vk converges to the same vector q1
- But we claim that the span of the columns of Vk converges, in a sense, to the span [q1 | q2] of the two largest eigenvectors
- In other words, Qk → [±q1 | ±q2], where Qk Rk = Vk
- We can make this precise: we have

      V0 = [ Σ_{j=1}^n c_{j1} q_j | Σ_{j=1}^n c_{j2} q_j ] = Q [c11 c12; c21 c22; … ; cn1 cn2] = QC

- Then we have, since AQ = QΛ,

      Vk = A^k Q C = Q Λ^k C

- Assuming that C_{1:2,1:2} := [c11 c12; c21 c22] is nonsingular, we have, without changing the column space,

      Vk C_{1:2,1:2}^{−1} diag(λ1^{−k}, λ2^{−k}) = Q [1 0; 0 1; (λ3/λ1)^k c̃31 (λ3/λ2)^k c̃32; … ; (λn/λ1)^k c̃n1 (λn/λ2)^k c̃n2] → Q [1 0; 0 1; 0 0; … ; 0 0] = [q1 | q2]

  where the c̃jk are the entries of C C_{1:2,1:2}^{−1}
- A more numerically practical approach is to orthogonalize at each step:

      V0 = QC,   Q̄k R̄k = Vk,   Vk = A Q̄k−1

- This is the same as

      Vk = A Q̄k−1 = A Q̄k−1 R̄k−1 R̄k−1^{−1} = A Vk−1 R̄k−1^{−1} = · · · = A^k V0 R̄0^{−1} · · · R̄k−1^{−1}

- In other words, we are calculating the QR decomposition of A^k V0:

      A^k V0 = Vk R̄k−1 · · · R̄0 = Q̄k R̄k R̄k−1 · · · R̄0 = Q̄k R̃k

- Note that we can easily apply this to 3, 4, or even n columns
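The orthogonalized iteration above translates almost line for line into NumPy. Below is an illustrative sketch (the function name and test matrix are ours, not from the lecture), run on a symmetric matrix with known, well-separated eigenvalues:

```python
import numpy as np

def simultaneous_iteration(A, p, iters=200):
    """Orthogonalized power iteration on p columns: Vk = A Qbar_{k-1}, Qbar_k Rbar_k = Vk."""
    V = np.eye(A.shape[0], p)          # initial guess V0 = [e1 | ... | ep]
    for _ in range(iters):
        V, _ = np.linalg.qr(A @ V)     # orthogonalize at every step
    return V                           # columns span the dominant eigenspace

# symmetric test matrix with known eigenvalues 10 > 5 > 2 > 1
rng = np.random.default_rng(0)
W, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = W @ np.diag([10.0, 5.0, 2.0, 1.0]) @ W.T

Q2 = simultaneous_iteration(A, 2)
lam = np.sort(np.diag(Q2.T @ A @ Q2))[::-1]   # Rayleigh quotients of the two columns
```

The convergence rate of the second column is governed by |λ3|/|λ2|, exactly as for power iteration on the first.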
QR ALGORITHM (NOT DECOMPOSITION)
- The pure QR algorithm consists of calculating a QR decomposition and recombining in the other order: for A0 = A, define

      Qk Rk = Ak−1,   Ak = Rk Qk

- Note that Ak = Rk Qk = Qk⊤ Ak−1 Qk has the same eigenvalues as A
- The remarkable fact is that, provided the eigenvalues are strictly ordered (|λ1| > |λ2| > · · · > |λn|), this converges to a diagonal matrix: Ak → diag(λ1, . . . , λn)
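The pure iteration is only a few lines of NumPy. This is an illustrative sketch (ours, not the lecture's code) on a symmetric matrix with strictly ordered eigenvalues:

```python
import numpy as np

def pure_qr(A, iters=200):
    """Pure QR algorithm: factor A_{k-1} = Qk Rk, recombine as Ak = Rk Qk."""
    Ak = A.copy()
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q          # = Q.T @ Ak @ Q, a similarity transform
    return Ak

# symmetric matrix with strictly ordered eigenvalues 4 > 3 > 2 > 1
rng = np.random.default_rng(1)
W, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = W @ np.diag([4.0, 3.0, 2.0, 1.0]) @ W.T

Ak = pure_qr(A)   # converges to a diagonal matrix of the eigenvalues
```

The off-diagonal entries decay like (λ_{i+1}/λ_i)^k, which is why the unshifted algorithm can be very slow when eigenvalues are close in modulus.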
- Consider a random symmetric Gaussian matrix

      Gn = (1/√n) [G11 · · · G1n; ⋮ ⋱ ⋮; G1n · · · Gnn]

  where the Gij are mean-zero Gaussian entries, with twice the variance on the diagonal as off the diagonal
[Figure: convergence of the QR iterates Ak for Gn, plotted against the iteration count k; the error decays from about 10⁻² to below 10⁻¹¹ over roughly 20000 iterations]
QR ALGORITHM = SIMULTANEOUS ITERATION
- We use the QR decomposition with Rii > 0, which is unique (via Gram–Schmidt)
- Simultaneous iteration on n vectors with initial guess I is

      Q̄0 = I,   Zk = A Q̄k−1,   Q̄k R̄k = Zk

- The QR algorithm is

      A0 = A,   Qk Rk = Ak−1,   Ak = Rk Qk

- We claim that

      Q̄k = Q1 · · · Qk,   R̄k = Rk,   and   Ak = Q̄k⊤ A Q̄k
- The QR algorithm converges provided that Q = LU exists; the proof is straightforward when the eigenvalues are strictly ordered |λ1| > |λ2| > · · · > |λn|
- Importantly, the (n, n) entry converges to the smallest eigenvalue
- We only care about convergence of the smallest eigenvalue, as we can use shifts to make a small eigenvalue smaller and accelerate convergence
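The claimed identities can be checked numerically. The sketch below (ours, not the lecture's) enforces the Rii > 0 normalization, under which the two iterations produce identical factors:

```python
import numpy as np

def pos_qr(M):
    """QR decomposition normalized so that R has positive diagonal (the unique one)."""
    Q, R = np.linalg.qr(M)
    s = np.sign(np.diag(R)).copy()
    s[s == 0] = 1.0
    return Q * s, R * s[:, None]   # (Q s)(s R) = Q R since s^2 = 1

rng = np.random.default_rng(2)
W, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = W @ np.diag([4.0, 3.0, 2.0, 1.0]) @ W.T
k = 5

# simultaneous iteration on n vectors, initial guess Qbar0 = I
Qbar = np.eye(4)
for _ in range(k):
    Qbar, Rbar = pos_qr(A @ Qbar)

# QR algorithm, accumulating the Qj and the product Rtilde = Rk ... R1
Ak = A.copy()
Qprod, Rtilde = np.eye(4), np.eye(4)
for _ in range(k):
    Q, R = pos_qr(Ak)
    Ak = R @ Q
    Qprod = Qprod @ Q
    Rtilde = R @ Rtilde
```

After k steps, Qbar equals Q1···Qk, the last R factors agree, Ak equals Q̄k⊤ A Q̄k, and A^k = Q̄k R̃k, which is exactly the correspondence used in the proof.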
REDUCTION TO TRIDIAGONAL FORM
- Suppose A is symmetric tridiagonal:

      A = [α1 β1; β1 α2 β2; ⋱ ⋱ ⋱; βn−2 αn−1 βn−1; βn−1 αn]

- Consider the sequence of Givens rotations that introduce zeros one at a time, which we saw introduces one superdiagonal:

      R = Q⊤A = Qn−1⊤ · · · Q1⊤ A = [× × ×; × × ×; ⋱ ⋱ ⋱; × ×; ×]

  i.e. R is upper triangular with two superdiagonals
- Qk⊤ acts only on the kth and (k+1)th rows when acting on the left, so Qk acts only on the kth and (k+1)th columns when acting on the right
- So the next stage in the QR algorithm has only one subdiagonal:

      RQ = R Q1 · · · Qn−1 = [× × × · · · ×; × × × · · · ×; × × · · · ×; ⋱ ⋱ ⋮; × ×]

- But we know that RQ = Q⊤QRQ = Q⊤AQ is symmetric:

      ej⊤ Q⊤AQ ek = qj⊤ A qk = qk⊤ A qj = ek⊤ Q⊤AQ ej

- Thus if it only has one subdiagonal, symmetry implies it has only one superdiagonal:

      RQ = [× ×; × × ×; ⋱ ⋱ ⋱; × × ×; × ×]

- So each iterate of the QR algorithm is also tridiagonal, and each iteration requires only O(n) operations
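This structure is easy to check numerically; an illustrative sketch:

```python
import numpy as np

# a symmetric tridiagonal matrix
alpha = np.array([2.0, 3.0, 4.0, 5.0, 6.0])
beta = np.array([1.0, 1.0, 1.0, 1.0])
A = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

Q, R = np.linalg.qr(A)
A1 = R @ Q   # one step of the pure QR algorithm

# R is upper triangular with at most two superdiagonals,
# and the next iterate A1 = Q.T A Q is again symmetric tridiagonal
```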
- Our goal for a general matrix A is to find an orthogonal transformation V so that V⊤AV is tridiagonal
- Consider Givens rotations that introduce zeros in the first column using the second row:

      Q1⊤ A = [a11 a12 · · · a1n; × × · · · ×; 0 × · · · ×; ⋮ ⋮ ⋱ ⋮; 0 × · · · ×]

- Then right-multiplication by Q1 only alters the second through nth columns:

      Q1⊤ A Q1 = [a11 × · · · ×; × × · · · ×; 0 × · · · ×; ⋮ ⋮ ⋱ ⋮; 0 × · · · ×]

- Constructing Qk to act only on the (k+1)th through nth rows to introduce zeros in the kth column, we have

      Qn−1⊤ · · · Q1⊤ A Q1 · · · Qn−1 = [a11 × × · · · ×; × × × · · · ×; × × ⋱ ⋮; ⋱ ⋱ ×; × ×]

  which is upper Hessenberg
- Symmetry implies this is actually tridiagonal
- The total cost of m iterations of the QR algorithm, with the initial tridiagonalization step, is then O(n³ + mn)
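A compact (and deliberately unoptimized) sketch of this reduction, applying each Givens rotation as a similarity transform; the function name and test matrix are ours:

```python
import numpy as np

def tridiagonalize(A):
    """Reduce a symmetric matrix to tridiagonal form via Givens similarity
    transforms G A G.T (O(n^3) as written; a real code avoids forming G)."""
    A = A.copy()
    n = A.shape[0]
    for j in range(n - 2):                 # eliminate A[j+2:, j]
        for i in range(j + 2, n):
            a, b = A[j + 1, j], A[i, j]
            r = np.hypot(a, b)
            if r == 0.0:
                continue
            c, s = a / r, b / r
            G = np.eye(n)                  # rotation in the (j+1, i) plane
            G[j + 1, j + 1] = G[i, i] = c
            G[j + 1, i], G[i, j + 1] = s, -s
            A = G @ A @ G.T                # similarity: eigenvalues preserved
    return A

rng = np.random.default_rng(3)
B = rng.standard_normal((6, 6))
A = B + B.T
T = tridiagonalize(A)
```

Because each rotation pairs row j+1 (or later) with a lower row, the right-multiplication never touches the column just zeroed, so the zeros survive, and symmetry makes the result tridiagonal.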
QR ALGORITHM WITH SHIFTS
- We now want to incorporate shifts, using guesses of the current eigenvalue to induce faster convergence, as in Rayleigh quotient iteration
- Provided we can come up with a guess µk, we use the following:

      Qk Rk = Ak−1 − µk I,   Ak = Rk Qk + µk I

  Note that tridiagonality is preserved
- We are left with two problems: choosing µk, and deflation (splitting into sub-problems once an eigenvalue has converged)
- The simplest choice of µk is the (n, n) entry of Ak−1:

      µk = en⊤ Ak−1 en

  The idea is that this entry eventually converges to an eigenvalue
[Figure: the iterates Ak at k = 2, 4, 6 for the shifted QR algorithm versus the pure QR algorithm; with shifts the last off-diagonal entry has already dropped to ≈ 3.4 × 10⁻⁶ by k = 4 and the (n, n) entry has converged to an eigenvalue, while the pure QR iterates converge much more slowly]
- At some point we converge to an eigenvalue, at which point the guess µk will not help convergence
- Fortunately, at the point the eigenvalue has converged, we have decoupled the problem:

      Ak = [× ×; × × ×; ⋱ ⋱ ⋱; × × ×; × ×] ≈ [× ×; × × ×; ⋱ ⋱ ⋱; × ×; ×] = [Ã 0; 0 λ]

  where λ is the converged eigenvalue and the entries coupling the last row and column are negligible
- We can now run the algorithm on Ã, which is only (n − 1) × (n − 1). Once its (n − 1), (n − 1) entry has converged, we deflate again
- This choice of shifts can have problems: consider

      A = [0 1; 1 0]

- Unshifted QR doesn't work: A itself is orthogonal, so the QR decomposition does nothing: Q1 I = A0 = A, and A1 = I Q1 = Q1 = A0
- But the simple shift of taking the (2, 2) entry of Ak doesn't help either: its (2, 2) entry is always zero
- To prevent this in general, the Wilkinson shift can be used: take the eigenvalue of the bottom 2 × 2 block of Ak that is closest to the last entry of Ak
- This can apparently be shown to always work
- In summary, assume A has been tridiagonalized using Givens rotations
- While dim(A) > 2:
      Let µ be the eigenvalue of An−1:n,n−1:n closest to Ann
      QR = A − µI
      A = RQ + µI
      If |An−1,n| < ε, add Ann to the list of eigenvalues and deflate: A = A1:n−1,1:n−1
- Once A is 2 × 2, its eigenvalues can be calculated directly
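Putting the pieces together, here is a sketch of the shifted QR algorithm with Wilkinson shifts and deflation (ours, not the lecture's; it operates on the dense symmetric matrix for brevity, whereas a real implementation would tridiagonalize first and keep each step O(n)):

```python
import numpy as np

def wilkinson_shift(A):
    """Eigenvalue of the bottom-right 2x2 block closest to A[-1, -1]."""
    a, b, c = A[-2, -2], A[-2, -1], A[-1, -1]
    if b == 0.0:
        return c
    d = 0.5 * (a - c)
    sign = 1.0 if d >= 0 else -1.0
    return c - sign * b * b / (abs(d) + np.hypot(d, b))

def shifted_qr_eigvals(A, tol=1e-12):
    """Shifted QR with deflation for a symmetric matrix."""
    A = np.array(A, dtype=float)
    eigs = []
    while A.shape[0] > 1:
        for _ in range(100):                       # safety cap per eigenvalue
            mu = wilkinson_shift(A)
            Q, R = np.linalg.qr(A - mu * np.eye(A.shape[0]))
            A = R @ Q + mu * np.eye(A.shape[0])
            if np.linalg.norm(A[-1, :-1]) < tol:   # last row decoupled
                break
        eigs.append(A[-1, -1])                     # converged eigenvalue
        A = A[:-1, :-1]                            # deflate
    eigs.append(A[0, 0])
    return np.sort(np.array(eigs))

rng = np.random.default_rng(4)
B = rng.standard_normal((6, 6))
A = B + B.T
ev = shifted_qr_eigvals(A)
```

With the Wilkinson shift, each eigenvalue typically converges in a handful of iterations, which is why the per-eigenvalue cost is treated as O(1) sweeps.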
SCHUR DECOMPOSITION
- We now consider decompositions for non-symmetric matrices
- We introduce the Schur decomposition

      A = Q U Q⋆

  where Q is unitary and U is upper triangular. The QR algorithm computes the Schur decomposition when applied to non-symmetric matrices
- The eigenvalues of A are the diagonal entries of U
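For instance (an illustrative sketch, ours): running the pure QR iteration on a non-symmetric matrix with real, well-separated eigenvalues produces the Schur form, upper triangular with the eigenvalues on its diagonal:

```python
import numpy as np

# non-symmetric matrix with known real eigenvalues 3, 2, 1
rng = np.random.default_rng(5)
V = rng.standard_normal((3, 3)) + 3.0 * np.eye(3)
A = V @ np.diag([3.0, 2.0, 1.0]) @ np.linalg.inv(V)

Ak = A.copy()
Qtot = np.eye(3)
for _ in range(300):
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q
    Qtot = Qtot @ Q
# Ak -> U (upper triangular); A = Qtot @ Ak @ Qtot.T is the Schur decomposition
```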
- Recall that every matrix has an eigenvalue decomposition into Jordan canonical form A = V J V⁻¹, where J is upper triangular with eigenvalues along the diagonal, and only ones and zeros above the diagonal
- Jordan canonical form is completely useless for numerics: a small perturbation causes matrices to become diagonalizable, but with very badly conditioned eigenvectors:

      [1 1; ε 1] = V [1+√ε 0; 0 1−√ε] V⁻¹   for   V = [1 1; √ε −√ε],   V⁻¹ = [1/2 1/(2√ε); 1/2 −1/(2√ε)]

- The norm of V⁻¹ is huge
- However, if we orthogonalize the eigenvectors V = QR we obtain the Schur decomposition:

      A = V J V⁻¹ = Q R J R⁻¹ Q⋆ = Q U Q⋆

  where U = R J R⁻¹ is upper triangular
- The Schur decomposition is robust for numerics: it is stable to a small perturbation:

      [1 1; ε 1] = Q [1−√ε 1−ε; 0 1+√ε] Q⋆   for   Q = (1/√(1+ε)) [−1 −√ε; √ε −1]

- Q is unitary, therefore always well-conditioned
- The Schur decomposition includes the eigenvalues along the diagonal, but also contains more information about the operator
- It reduces powers of general matrices to powers of upper triangular matrices:

      A^k = Q U^k Q⋆

  (The SVD is not useful for describing powers of matrices)
- It is also useful for efficiently calculating pseudospectra:

      ‖(A − λI)⁻¹‖₂ = ‖Q (U − λI)⁻¹ Q⋆‖₂ = ‖(U − λI)⁻¹‖₂
- When A is not symmetric, we cannot reduce to tridiagonal form, but we can reduce to upper Hessenberg form, which has only one subdiagonal
- If QR = A is upper Hessenberg, then RQ is also upper Hessenberg
- The QR algorithm with shifts applied to non-symmetric matrices almost always converges to upper triangular form in practice. Proving this is an open problem (I think)
- The complexity is now O(n³ + mn²)
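The Hessenberg invariance is easy to verify numerically (an illustrative sketch):

```python
import numpy as np

rng = np.random.default_rng(6)
A = np.triu(rng.standard_normal((6, 6)), -1)   # upper Hessenberg: one subdiagonal

Q, R = np.linalg.qr(A)
A1 = R @ Q   # one QR-algorithm step
# A1 is again upper Hessenberg: Q inherits one subdiagonal from A = QR,
# and (upper triangular) x (Hessenberg) stays Hessenberg
```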
COMPUTATION OF THE SVD
- The simplest approach for calculating the singular values of A = UΣV⋆ is to apply the QR algorithm with shifts to A⋆A = VΣ²V⋆. This suffers from bad conditioning
- Instead, we can double the dimension and apply the QR algorithm with shifts to

      B = [0 A⋆; A 0]

- This is symmetric/Hermitian, and satisfies

      B = Q [Σ 0; 0 −Σ] Q⋆   for   Q = (1/√2) [V V; U −U]
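A quick numerical check of the doubled matrix's spectrum (an illustrative sketch, ours): its eigenvalues are ±σi, so a symmetric eigensolver recovers the singular values.

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 4))

# the doubled (symmetric) matrix B = [[0, A*], [A, 0]]
B = np.block([[np.zeros((4, 4)), A.T],
              [A, np.zeros((4, 4))]])

sigma = np.linalg.svd(A, compute_uv=False)   # descending: sigma1 >= ... >= sigma4
ev = np.linalg.eigvalsh(B)                   # ascending: -sigma1, ..., sigma4, ..., sigma1
```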
- We can bidiagonalize A before we apply the QR algorithm:

      H = Q1⋆ A Q2 = [× ×; × ×; ⋱ ⋱; × ×; ×]

- We can permute to get a tridiagonal matrix:

      P⊤ [0 H⋆; H 0] P = [0 ×; × 0 ×; × 0 ×; ⋱ ⋱ ⋱; × 0]