Lesson 21: COMPUTATION OF EIGENVALUES
- We consider the problem of computation of eigenvalues
- To make life (much!) simpler, let's assume we have a symmetric matrix A ∈ R^{n×n},
which has the eigendecomposition A = QΛQᵀ
- Unlike previous algorithms, which were deterministic, eigenvalue algorithms must
be iterative
- In this lecture, we overview classical iterative algorithms for calculating eigenvalues
First we look at algorithms that calculate a single eigenvalue, then we introduce the QR algorithm for calculating all eigenvalues
BAD IDEA: SOLVE CHARACTERISTIC EQUATION
- Probably in second(?) year you were asked to calculate eigenvalues for small matrices
by solving the characteristic equation: det(A − λI) = 0
- However, there is no exact formula for the roots of polynomials of degree five or
greater: we cannot solve this equation exactly
- What about numerically?
Newton iteration requires good initial guesses for convergence, and it is difficult to find all roots. Indeed, recall that we reduced the polynomial root-finding problem to calculating eigenvalues of a companion matrix
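As a reminder of that reduction, here is a minimal NumPy sketch (variable names are my own, not from the lecture): the roots of a monic polynomial are recovered as the eigenvalues of its companion matrix.

```python
import numpy as np

# Companion matrix of the monic polynomial
# p(x) = x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3):
# ones on the subdiagonal, minus the coefficients in the last column.
coeffs = [-6.0, 11.0, -6.0]          # c0, c1, c2 of p(x) = x^3 + c2 x^2 + c1 x + c0
n = len(coeffs)
C = np.zeros((n, n))
C[1:, :-1] = np.eye(n - 1)           # subdiagonal of ones
C[:, -1] = [-c for c in coeffs]      # last column: -c0, ..., -c_{n-1}
roots = np.sort(np.linalg.eigvals(C).real)
print(roots)                          # approximately [1, 2, 3]
```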
ONE EIGENVALUE ALGORITHM: POWER ITERATION
- We introduce the Rayleigh quotient
r(x) = xᵀAx / xᵀx
- Note that for a normalized eigenvector qk = Qek, we have
r(qk) = qkᵀ A qk = λk
Thus, if we can find an eigenvector, we get the eigenvalue as well
- For general x, r(x) is the constant that "acts closest" to an eigenvalue, i.e., it is the
α that minimizes ‖Ax − αx‖ (Why?)
- Expanding x = Σ_{k=1}^n xk qk we get
r(x) = Σ_{k=1}^n λk xk (qkᵀx) / ‖x‖² = Σ_{k=1}^n λk xk² / ‖x‖²
- Thus if we are near an eigenvector, x = qj + h with qjᵀh = 0 and ‖h‖ small, we have quadratically
small error:
r(x) = (λj + Σ_{k≠j} λk hk²) / ‖x‖² = λj + O(‖h‖²)
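A quick numerical check of this quadratic accuracy (a sketch, not from the lecture; the helper name `rayleigh` is my own): perturb an eigenvector of a symmetric matrix by ε in an orthogonal direction and watch the eigenvalue error shrink like ε².

```python
import numpy as np

# Build a random symmetric test matrix and take an exact eigenpair.
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                     # symmetric
lam, Q = np.linalg.eigh(A)
q = Q[:, -1]                          # eigenvector for the largest eigenvalue

def rayleigh(A, x):
    return (x @ A @ x) / (x @ x)

# Perturb orthogonally to q; the Rayleigh quotient error is O(eps^2).
errs = []
for eps in [1e-1, 1e-2, 1e-3]:
    h = eps * Q[:, 0]                 # h is orthogonal to q
    errs.append(abs(rayleigh(A, q + h) - lam[-1]))
print(errs)                           # shrinks roughly like eps**2
```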
- Another simple approach is to use power iteration: for some initial guess for an eigenvector v0, we continually apply A and normalize:
vk = A vk−1 / ‖A vk−1‖ = A^k v0 / ‖A^k v0‖
- Then we hope that vk → q1 = Qe1 for large k, so that
λ1 ≈ r(vk) = vkᵀ A vk
(here λ1 denotes the largest eigenvalue in absolute value)
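The loop above can be sketched in a few lines of NumPy (the function name `power_iteration` is my own; the diagonal test matrix is just an illustration):

```python
import numpy as np

# Minimal power iteration for a symmetric matrix: repeatedly apply A
# and normalize, then read off the Rayleigh quotient.
def power_iteration(A, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = A @ v
        v /= np.linalg.norm(v)
    return (v @ A @ v), v             # eigenvalue estimate, eigenvector estimate

A = np.diag([3.0, 1.0, 0.5])          # known eigenvalues for checking
lam, v = power_iteration(A)
print(lam)                            # ≈ 3.0
```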
[Figure: eigenvalue and eigenvector convergence of power iteration, n = 20]
- Consider the Hankel operator
H = ( 1     1/4   1/9   1/16  ⋯
      1/4   1/9   1/16  1/25
      1/9   1/16  1/25  1/36
      1/16  1/25  1/36  1/49
      ⋮                       ⋱ )
i.e., Hjk = 1/(j + k − 1)²
- We apply power iteration to the finite section Hn = PnHPn
[Figure: eigenvalues of Hn]
[Figures: eigenvalue and eigenvector convergence for n = 40 and n = 50, with the corresponding eigenvalues]
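This experiment can be reproduced in a few lines (a sketch with my own helper name `hankel_section`; the lecture's projection Pn corresponds to taking the leading n × n block):

```python
import numpy as np

# Finite section H_n of the Hankel operator, with entries
# H[j, k] = 1/(j + k - 1)^2 in 1-based indexing.
def hankel_section(n):
    j, k = np.meshgrid(np.arange(1, n + 1), np.arange(1, n + 1), indexing="ij")
    return 1.0 / (j + k - 1) ** 2

H = hankel_section(20)
v = np.ones(20) / np.sqrt(20)         # simple positive initial guess
for _ in range(100):
    v = H @ v
    v /= np.linalg.norm(v)
print(v @ H @ v)                      # largest eigenvalue of H_20
```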
- Consider a random symmetric Gaussian matrix
Gn = (1/√n) ( G11 ⋯ G1n
               ⋮  ⋱  ⋮
              G1n ⋯ Gnn )
where the Gij are mean-zero Gaussian entries whose variance on the diagonal is twice that off the diagonal
[Figure: eigenvalue and eigenvector convergence of power iteration, and the eigenvalues of Gn, n = 100]
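One way to sample such a matrix (a sketch; the construction via B + Bᵀ is my own choice, giving off-diagonal variance 1 and diagonal variance 2 before the 1/√n scaling):

```python
import numpy as np

# Symmetrize an iid Gaussian matrix: (B + B^T)/sqrt(2) has off-diagonal
# variance 1 and diagonal variance 2, as in the lecture's G_n.
def gaussian_symmetric(n, seed=0):
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((n, n))
    G = (B + B.T) / np.sqrt(2)
    return G / np.sqrt(n)

G = gaussian_symmetric(100)
print(G.shape, np.allclose(G, G.T))
```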
Theorem: Power iteration converges provided |λ1| > |λ2| and q1ᵀv0 ≠ 0, at the rate
‖vk − (±q1)‖ = O(|λ2/λ1|^k) and |r(vk) − λ1| = O(|λ2/λ1|^{2k})
- Note: for a random initial guess, the probability of q1ᵀv0 = 0 is zero
- Proof sketch: we can expand v0 = a1q1 + ⋯ + anqn so that
vk = ck A^k v0 = ck (a1λ1^k q1 + a2λ2^k q2 + ⋯ + anλn^k qn)
   = ck λ1^k (a1 q1 + a2 (λ2/λ1)^k q2 + ⋯ + an (λn/λ1)^k qn)
- Since aj(λj/λ1)^k → 0 for j ≥ 2, we must have (by normalization) that |ck λ1^k a1| → 1, which
shows the first bound
- The second bound follows from the first since r is a Rayleigh quotient (quadratic accuracy near an eigenvector)
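The stated rates are easy to observe numerically (a sketch, not from the lecture): on diag(2, 1, 0.5) we have |λ2/λ1| = 1/2, so successive eigenvalue errors should shrink by a factor of (1/2)² = 0.25.

```python
import numpy as np

# Track the Rayleigh-quotient error of power iteration on a diagonal
# matrix with lambda1 = 2, lambda2 = 1.
A = np.diag([2.0, 1.0, 0.5])
v = np.ones(3) / np.sqrt(3.0)
errs = []
for k in range(12):
    v = A @ v
    v /= np.linalg.norm(v)
    errs.append(abs(v @ A @ v - 2.0))
ratios = [errs[k + 1] / errs[k] for k in range(len(errs) - 1)]
print(ratios)                         # each ratio approaches (1/2)**2 = 0.25
```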
INVERSE ITERATION
- Suppose we have a good initial guess of an eigenvalue µ ≈ λk
- Then A − µI has an eigenvalue that is very small
- Then (A − µI)⁻¹ has largest (in absolute value) eigenvalue (λk − µ)⁻¹, and power iteration will
calculate this:
vk+1 = (A − µI)⁻¹vk / ‖(A − µI)⁻¹vk‖
- The eigenvectors of (A − µI)⁻¹ are the same as those of A, so we still have
λ ≈ vkᵀ A vk
where λ is the eigenvalue closest to µ
[Figure: eigenvalue and eigenvector convergence of inverse iteration, and the eigenvalues, n = 100]
- Consider the random symmetric Gaussian matrix Gn as before
- We take µ = 0 and find the eigenvalue smallest in absolute value
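A sketch of inverse iteration (the function name is my own): rather than forming (A − µI)⁻¹ explicitly, each step does a linear solve, which is cheaper and more stable.

```python
import numpy as np

# Inverse iteration: power iteration on (A - mu*I)^{-1}, implemented
# with a solve per step; returns the eigenvalue of A closest to mu.
def inverse_iteration(A, mu, iters=50, seed=0):
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    M = A - mu * np.eye(n)
    for _ in range(iters):
        v = np.linalg.solve(M, v)
        v /= np.linalg.norm(v)
    return v @ A @ v

A = np.diag([3.0, 1.0, -2.0])         # known spectrum for checking
lam = inverse_iteration(A, mu=0.9)
print(lam)                            # ≈ 1.0, the eigenvalue closest to 0.9
```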
RAYLEIGH QUOTIENT ITERATION
- Use the current guess with inverse iteration:
λ(k) = vk−1ᵀ A vk−1
vk = (A − λ(k)I)⁻¹vk−1 / ‖(A − λ(k)I)⁻¹vk−1‖
- Convergence is cubic in the limit:
‖vk+1 − (±qj)‖ = O(‖vk − (±qj)‖³)
and
|λ(k+1) − λj| = O(|λ(k) − λj|³)
[Figure: eigenvalue and eigenvector convergence of Rayleigh quotient iteration, and the eigenvalues, n = 100]
- Consider the random symmetric Gaussian matrix Gn as before
- We can't predict a priori which eigenvalue we will calculate
In this experiment we get λ ≈ −0.136, but it depends on the random initial vector
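A sketch of the iteration (function name my own): the shift is updated from the current Rayleigh quotient at every step. Once the shift lands exactly on an eigenvalue the solve becomes singular, which we treat as convergence.

```python
import numpy as np

# Rayleigh quotient iteration: inverse iteration whose shift is the
# current Rayleigh quotient; converges cubically, to an eigenpair
# that depends on the (seeded) initial vector.
def rayleigh_iteration(A, iters=20, seed=1):
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    lam = v @ A @ v
    for _ in range(iters):
        try:
            v = np.linalg.solve(A - lam * np.eye(n), v)
        except np.linalg.LinAlgError:
            break                     # shift hit an eigenvalue exactly
        v /= np.linalg.norm(v)
        lam = v @ A @ v
    return lam

A = np.diag([3.0, 1.0, -2.0])
lam = rayleigh_iteration(A)
print(lam)                            # one of the eigenvalues of A
```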
QR ALGORITHM (NOT DECOMPOSITION)
- The pure QR algorithm consists of calculating a QR decomposition and recombining
in the other order: for A0 = A, define
QkRk = Ak−1, Ak = RkQk
- Note that Ak = RkQk = Qkᵀ Ak−1 Qk has the same eigenvalues as A
- The remarkable fact is that, provided the eigenvalues are strictly ordered in absolute value (|λ1| >
|λ2| > ⋯ > |λn|), this converges to a diagonal matrix: Ak → diag(λ1, …, λn)
[Figure: convergence of Ak to diagonal form against the iteration k, for a random symmetric Gaussian matrix Gn]
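The pure QR algorithm is only a few lines of NumPy (a sketch; the small tridiagonal test matrix is my own choice, picked so that the eigenvalue magnitudes are well separated):

```python
import numpy as np

# Pure (unshifted) QR algorithm: factor A_{k-1} = Q_k R_k, then set
# A_k = R_k Q_k. Each A_k is similar to A_0, and the iterates
# approach a diagonal matrix of eigenvalues.
A0 = np.array([[2.0, 1.0, 0.0],
               [1.0, 3.0, 1.0],
               [0.0, 1.0, 4.0]])      # eigenvalues 3 - sqrt(3), 3, 3 + sqrt(3)
A = A0.copy()
for _ in range(200):
    Q, R = np.linalg.qr(A)            # QR decomposition of A_{k-1}
    A = R @ Q                         # recombine in the other order
print(np.sort(np.diag(A)))            # ≈ np.linalg.eigvalsh(A0)
```

In practice the QR algorithm is run on a Hessenberg (here: tridiagonal) reduction with shifts and deflation; the unshifted loop above is the textbook version only.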