SLIDE 1

COMPUTATION OF EIGENVALUES

Lesson 21

SLIDE 2
  • We consider the problem of computation of eigenvalues
  • To make life (much!) simpler, let's assume we have a symmetric matrix A ∈ R^{n×n}, which has the decomposition A = QΛQ⊤
  • Unlike previous algorithms, which were deterministic, eigenvalue algorithms must be iterative
  • In this lecture, we overview classical iterative algorithms for calculating eigenvalues

First we look at algorithms that calculate a single eigenvalue, then we introduce the QR algorithm for calculating all eigenvalues

SLIDE 3

BAD IDEA: SOLVE CHARACTERISTIC EQUATION

SLIDE 4
  • Probably in second(?) year you were asked to calculate eigenvalues for small matrices by solving the characteristic equation: det(A − λI) = 0
  • However, there is no exact formula for the roots of polynomials of degree five or greater: we cannot solve this equation exactly
  • What about numerically?

Newton iteration requires good initial guesses for convergence, and it is difficult to find all roots. Indeed, recall we reduced the polynomial root finding problem to calculating eigenvalues of a companion matrix, as the sketch below illustrates

SLIDE 5

ONE EIGENVALUE ALGORITHM: POWER ITERATION

SLIDES 6–9

  • We introduce the Rayleigh quotient

        r(x) = x⊤Ax / x⊤x

  • Note that for a normalized eigenvector qk = Qek, we have

        r(qk) = qk⊤Aqk = λk

    Thus, if we can find an eigenvector, we get the eigenvalue as well
  • For general x, r(x) is the constant that "acts closest" to the eigenvalue, i.e., it is the α that minimizes ‖Ax − αx‖ (Why?)
  • Expanding x = Σ_{k=1}^n xk qk, we get

        r(x) = (Σ_{k=1}^n λk xk x⊤qk) / ‖x‖² = (Σ_{k=1}^n λk xk²) / ‖x‖²

  • Thus if we are near an eigenvector, x = qj + h with h⊤qj = 0 and ‖h‖ = ε, we have quadratically small error:

        r(x) = λj (1 + Σ_{k≠j} hk² λk/λj) / ‖x‖² = λj + O(ε²)
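A quick numerical check of this quadratic accuracy, as a sketch (the test matrix and perturbation sizes are my arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                          # arbitrary symmetric test matrix
lam, Q = np.linalg.eigh(A)                 # reference eigendecomposition A = QΛQ'
qj, lj = Q[:, -1], lam[-1]                 # an exact eigenpair

for eps in [1e-1, 1e-2, 1e-3]:
    h = rng.standard_normal(n)
    h -= (h @ qj) * qj                     # make the perturbation orthogonal to qj
    h *= eps / np.linalg.norm(h)           # so that ||h|| = eps
    x = qj + h
    r = (x @ A @ x) / (x @ x)              # Rayleigh quotient
    print(eps, abs(r - lj))                # error decays like eps**2
```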
SLIDES 10–11

  • Another simple approach is to use power iteration: for some initial guess v0 for an eigenvector, we continually apply A and normalize:

        vk = A vk−1 / ‖A vk−1‖ = A^k v0 / ‖A^k v0‖

  • Then we hope that vk ≈ q1 = Qe1 for large k, so that

        λ1 ≈ r(vk) = vk⊤ A vk

    (here λ1 denotes the largest eigenvalue)
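A minimal sketch of power iteration (the function name, iteration count, and random start are my choices, not from the slides):

```python
import numpy as np

def power_iteration(A, num_iters=1000, rng=None):
    """Power iteration: v_k = A v_{k-1} / ||A v_{k-1}||.

    Returns the Rayleigh quotient r(v_k), approximating the eigenvalue
    of largest magnitude, together with the final vector.
    """
    rng = rng or np.random.default_rng()
    v = rng.standard_normal(A.shape[0])   # random start: q1'v0 = 0 with probability zero
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        v = A @ v                         # apply A ...
        v /= np.linalg.norm(v)            # ... and renormalize
    return v @ A @ v, v
```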

SLIDES 12–13

  • Consider the Hankel operator

        H = [ 1     1/4   1/9   1/16  ··· ]
            [ 1/4   1/9   1/16  1/25  ··· ]
            [ 1/9   1/16  1/25  1/36  ··· ]
            [ 1/16  1/25  1/36  1/49  ··· ]
            [  ⋮     ⋮     ⋮     ⋮    ⋱  ]

    i.e., Hij = 1/(i + j − 1)²
  • We apply power iteration to the finite section Hn = PnHPn

[Plots: eigenvalues of Hn; convergence of the eigenvalue and eigenvector estimates for n = 20 and n = 40, reaching ~10⁻¹⁵ within about 15 iterations]

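A sketch of this experiment (the size, seed, and iteration count are my choices; the dense eigensolver is used only as a reference):

```python
import numpy as np

def hankel_section(n):
    """Finite section Hn = PnHPn of the Hankel operator Hij = 1/(i+j-1)^2."""
    i = np.arange(1, n + 1)
    return 1.0 / (i[:, None] + i[None, :] - 1.0) ** 2

Hn = hankel_section(40)
v = np.random.default_rng(1).standard_normal(40)
for _ in range(20):                        # the spectral gap is large, so few steps suffice
    v = Hn @ v
    v /= np.linalg.norm(v)
print(v @ Hn @ v)                          # power-iteration estimate of the top eigenvalue
print(np.linalg.eigvalsh(Hn)[-1])          # dense reference value
```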
SLIDES 14–15

  • Consider a random symmetric Gaussian matrix

        Gn = (1/√n) [ G11 ··· G1n ]
                    [  ⋮   ⋱   ⋮  ]
                    [ G1n ··· Gnn ]

    where the Gij are mean-zero Gaussian entries, with twice the variance on the diagonal as off the diagonal
  • We apply power iteration to Gn

[Plots: eigenvalues of Gn; convergence of the eigenvalue and eigenvector estimates for n = 50 (≈300 iterations) and n = 100 (≈2000 iterations)]
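A sketch of one way to generate such a matrix (the symmetrize-and-scale construction is my choice; it reproduces the stated variance ratio):

```python
import numpy as np

def gaussian_symmetric(n, rng=None):
    """Random symmetric matrix: mean-zero Gaussian entries whose diagonal
    variance is twice the off-diagonal variance, scaled by 1/sqrt(n)."""
    rng = rng or np.random.default_rng()
    B = rng.standard_normal((n, n))
    # B + B.T has off-diagonal variance 2 and diagonal variance 4;
    # dividing by sqrt(2n) gives variances 1/n and 2/n respectively.
    return (B + B.T) / np.sqrt(2 * n)
```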

SLIDES 16–18

Theorem: Power iteration converges provided |λ1| > |λ2| and q1⊤v0 ≠ 0, at the rate

        ‖vk − (±q1)‖ = O(|λ2/λ1|^k)    and    |r(vk) − λ1| = O(|λ2/λ1|^{2k})

  • Note: for a random initial guess, the probability of q1⊤v0 = 0 is zero
  • We can expand v0 = a1q1 + ··· + anqn, so that

        vk = ck A^k v0 = ck (a1λ1^k q1 + a2λ2^k q2 + ··· + anλn^k qn)
           = ck λ1^k (a1q1 + a2(λ2/λ1)^k q2 + ··· + an(λn/λ1)^k qn)

  • Since aj(λj/λ1)^k → 0 for j ≥ 2, we must have (by normalization) that ckλ1^k a1 → ±1, which shows the first bound
  • The second bound follows from the first since the Rayleigh quotient has quadratically small error near an eigenvector
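These rates are easy to observe numerically; a sketch with a hand-picked diagonal test matrix (eigenvalues 1, 1/2, 1/4, so λ2/λ1 = 1/2):

```python
import numpy as np

A = np.diag([1.0, 0.5, 0.25])             # test matrix with known eigenvalues
q1 = np.array([1.0, 0.0, 0.0])            # dominant eigenvector
v = np.ones(3) / np.sqrt(3.0)             # q1'v0 != 0
for k in range(1, 11):
    v = A @ v
    v /= np.linalg.norm(v)
    print(k,
          np.linalg.norm(v - q1),         # roughly halves each step: O(|λ2/λ1|^k)
          abs(v @ A @ v - 1.0))           # roughly quarters each step: O(|λ2/λ1|^(2k))
```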
SLIDE 19

INVERSE ITERATION

  • Suppose we have a good initial guess µ ≈ λk of an eigenvalue
  • Then A − µI has an eigenvalue that is very small
  • Then (A − µI)⁻¹ has largest eigenvalue (λk − µ)⁻¹, and power iteration will calculate this:

        vk+1 = (A − µI)⁻¹vk / ‖(A − µI)⁻¹vk‖

  • The eigenvectors of (A − µI)⁻¹ are the same as those of A, so we still have

        λ ≈ vk⊤ A vk

    where λ is the eigenvalue closest to µ
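A sketch of inverse iteration (names and iteration count are mine); factoring A − µI once and solving at each step avoids ever forming the inverse explicitly:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def inverse_iteration(A, mu, num_iters=100, rng=None):
    """Power iteration applied to (A - mu*I)^(-1), via a single LU factorization."""
    rng = rng or np.random.default_rng()
    n = A.shape[0]
    lu = lu_factor(A - mu * np.eye(n))     # factor once, reuse every iteration
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        v = lu_solve(lu, v)                # one power-iteration step on the inverse
        v /= np.linalg.norm(v)
    return v @ A @ v, v                    # eigenvalue of A closest to mu, and eigenvector
```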

SLIDE 20

  • Consider the same random symmetric Gaussian matrix Gn as before, with n = 100
  • We take µ = 0 and find the smallest eigenvalue

[Plots: eigenvalues of Gn; convergence of the eigenvalue and eigenvector estimates over ≈500 iterations]

SLIDES 21–22

RAYLEIGH QUOTIENT ITERATION

  • Use the current eigenvalue guess with inverse iteration:

        λ(k) = vk−1⊤ A vk−1
        vk = (A − λ(k)I)⁻¹vk−1 / ‖(A − λ(k)I)⁻¹vk−1‖

  • Convergence is cubic in the limit:

        ‖vk+1 − (±qj)‖ = O(‖vk − (±qj)‖³)    and    |λ(k+1) − λj| = O(|λ(k) − λj|³)
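A sketch of the iteration (the fixed iteration count is my simplification; a production version would test a residual instead). The only change from inverse iteration is that the shift is refreshed every step, so a new system must be solved each time:

```python
import numpy as np

def rayleigh_quotient_iteration(A, num_iters=20, rng=None):
    """Inverse iteration with the shift updated to the current Rayleigh quotient."""
    rng = rng or np.random.default_rng()
    n = A.shape[0]
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        lam = v @ A @ v                               # current eigenvalue estimate
        # Near convergence A - lam*I is nearly singular; in floating point the
        # solve then strongly amplifies the eigenvector direction, which is
        # exactly what we want.
        v = np.linalg.solve(A - lam * np.eye(n), v)
        v /= np.linalg.norm(v)
    return v @ A @ v, v
```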
SLIDE 23

  • Consider the same random symmetric Gaussian matrix Gn as before, with n = 100
  • We can't predict which eigenvalue we calculate a priori: in this experiment we get λ ≈ −0.136, but it depends on the random initial vector

[Plots: eigenvalues of Gn; convergence of the eigenvalue and eigenvector estimates, reaching ~10⁻¹⁴ within about 25 iterations]

SLIDE 24

QR ALGORITHM (NOT DECOMPOSITION)

SLIDE 25

  • The pure QR algorithm consists of calculating a QR decomposition and recombining in the other order: for A0 = A, define

        QkRk = Ak−1,    Ak = RkQk

  • Note that Ak = RkQk = Qk⊤Ak−1Qk has the same eigenvalues as A
  • The remarkable fact is that, provided the eigenvalues are strictly ordered (|λ1| > |λ2| > ··· > |λn|), this converges to a diagonal matrix:

        Ak → diag(λ1, ..., λn)
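A sketch of the pure algorithm (the iteration count and test matrix are my choices; practical implementations add shifts and deflation):

```python
import numpy as np

def qr_algorithm(A, num_iters=1000):
    """Pure (unshifted) QR algorithm: Qk Rk = A_{k-1}, then A_k = Rk Qk."""
    Ak = np.array(A, dtype=float)
    for _ in range(num_iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q                        # similarity transform: Q' A_{k-1} Q
    return np.diag(Ak)                    # diagonal converges to the eigenvalues

# Example on an arbitrary symmetric test matrix:
B = np.random.default_rng(2).standard_normal((5, 5))
A = (B + B.T) / 2
print(np.sort(qr_algorithm(A)))
print(np.linalg.eigvalsh(A))              # agrees with a dense eigensolver
```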

SLIDE 26

  • Consider the same random symmetric Gaussian matrix Gn as before
  • We apply the QR algorithm to Gn

[Plot: convergence of Ak to diagonal form against the iteration count k, over ≈20000 iterations]