AM 205: Lecture 22


  1. AM 205: lecture 22
  - Final project proposal due by 6pm on Thu Nov 17. Email Chris or the TFs to set up a meeting. Those who have completed this will see four proposal points on Canvas.
  - Today: eigenvalue algorithms, QR algorithm

  2. Sensitivity of Eigenvalue Problems
Weyl's Theorem: Let $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$ and $\tilde{\lambda}_1 \le \tilde{\lambda}_2 \le \cdots \le \tilde{\lambda}_n$ be the eigenvalues of hermitian matrices $A$ and $A + \delta A$, respectively. Then
$$\max_{i=1,\dots,n} |\lambda_i - \tilde{\lambda}_i| \le \|\delta A\|_2.$$
Hence in the hermitian case, each perturbed eigenvalue must be in the disk¹ of radius $\|\delta A\|_2$ about its corresponding unperturbed eigenvalue!
¹ In fact, eigenvalues of a hermitian matrix are real, so the disk here is actually an interval in $\mathbb{R}$.
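Weyl's bound is easy to verify numerically; here is a minimal sketch, where the test matrix, perturbation size, and random seed are arbitrary choices of ours:

```python
import numpy as np

rng = np.random.default_rng(42)
B = rng.standard_normal((6, 6))
A = B + B.T                              # real symmetric (hermitian) test matrix
dB = 1e-3 * rng.standard_normal((6, 6))
dA = dB + dB.T                           # symmetric perturbation

# eigh returns eigenvalues in ascending order, matching the theorem's labeling
lam = np.linalg.eigh(A)[0]
lam_tilde = np.linalg.eigh(A + dA)[0]

print(np.max(np.abs(lam - lam_tilde)))   # max_i |lambda_i - lambda_tilde_i|
print(np.linalg.norm(dA, 2))             # the bound ||dA||_2 (spectral norm)
```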

  3. Sensitivity of Eigenvalue Problems
The Bauer–Fike Theorem relates to perturbations of the whole spectrum. We can also consider perturbations of individual eigenvalues.
Suppose, for simplicity, that $A \in \mathbb{C}^{n \times n}$ is hermitian, and consider the perturbed eigenvalue problem
$$(A + E)(v + \Delta v) = (\lambda + \Delta \lambda)(v + \Delta v).$$
Expanding this equation, dropping second-order terms, and using $Av = \lambda v$ gives
$$A \Delta v + E v \approx \Delta \lambda\, v + \lambda \Delta v.$$

  4. Sensitivity of Eigenvalue Problems
Premultiply $A \Delta v + E v \approx \Delta \lambda\, v + \lambda \Delta v$ by $v^*$ to obtain
$$v^* A \Delta v + v^* E v \approx \Delta \lambda\, v^* v + \lambda v^* \Delta v.$$
Noting that $v^* A \Delta v = (A v)^* \Delta v = \lambda v^* \Delta v$ (since $A$ is hermitian and $\lambda$ is real) leads to
$$v^* E v \approx \Delta \lambda\, v^* v, \quad \text{or} \quad \Delta \lambda \approx \frac{v^* E v}{v^* v}.$$

  5. Sensitivity of Eigenvalue Problems
Finally, we obtain
$$|\Delta \lambda| \approx \frac{|v^* E v|}{\|v\|_2^2} \le \frac{\|v\|_2 \|E v\|_2}{\|v\|_2^2} \le \|E\|_2,$$
so that $|\Delta \lambda| \lesssim \|E\|_2$.
We observe that:
- the perturbation bound does not depend on $\mathrm{cond}(V)$ when we consider only an individual eigenvalue
- this individual eigenvalue perturbation bound is asymptotic; it is rigorous only in the limit that the perturbations $\to 0$

  6. Algorithms for Eigenvalue Problems

  7. Power Method

  8. Power Method
The power method is perhaps the simplest eigenvalue algorithm. It finds the eigenvalue of $A \in \mathbb{C}^{n \times n}$ with largest modulus.
1: choose $x_0 \in \mathbb{C}^n$ arbitrarily
2: for $k = 1, 2, \dots$ do
3:   $x_k = A x_{k-1}$
4: end for
Question: How does this algorithm work?
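As a sketch, the raw iterations translate directly into Python; the matrix and starting vector below are illustrative choices, not from the lecture:

```python
import numpy as np

def power_method_raw(A, x0, num_iters=20):
    """Unnormalized power iterations: x_k = A x_{k-1}."""
    x = x0
    for _ in range(num_iters):
        x = A @ x
    return x

# Illustrative 2x2 example: the iterates align with the dominant
# eigenvector, but their norm grows like |lambda_n|^k.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x = power_method_raw(A, np.array([1.0, 0.0]))
print(x / np.linalg.norm(x))  # direction of the dominant eigenvector
```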

  9. Power Method
Assuming $A$ is nondefective, the eigenvectors $v_1, v_2, \dots, v_n$ provide a basis for $\mathbb{C}^n$. Therefore there exist coefficients $\alpha_i$ such that $x_0 = \sum_{j=1}^n \alpha_j v_j$.
Then, we have
$$x_k = A x_{k-1} = A^2 x_{k-2} = \cdots = A^k x_0 = A^k \left( \sum_{j=1}^n \alpha_j v_j \right) = \sum_{j=1}^n \alpha_j A^k v_j = \sum_{j=1}^n \alpha_j \lambda_j^k v_j = \lambda_n^k \left( \alpha_n v_n + \sum_{j=1}^{n-1} \alpha_j \left( \frac{\lambda_j}{\lambda_n} \right)^k v_j \right).$$

  10. Power Method
Then if $|\lambda_n| > |\lambda_j|$ for $1 \le j < n$, we see that $x_k \to \lambda_n^k \alpha_n v_n$ as $k \to \infty$.
This algorithm converges linearly: the error terms are scaled by a factor of at most $|\lambda_{n-1}| / |\lambda_n|$ at each iteration.
Also, we see that the method converges faster if $\lambda_n$ is well separated from the rest of the spectrum.

  11. Power Method
However, in practice the exponential factor $\lambda_n^k$ could cause overflow or underflow after relatively few iterations. Therefore the standard form of the power method is actually the normalized power method:
1: choose $x_0 \in \mathbb{C}^n$ arbitrarily
2: for $k = 1, 2, \dots$ do
3:   $y_k = A x_{k-1}$
4:   $x_k = y_k / \|y_k\|$
5: end for
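A minimal Python sketch of the normalized method; the function name and the Rayleigh-quotient eigenvalue estimate at the end are our choices, not prescribed by the slide:

```python
import numpy as np

def power_method(A, x0, num_iters=100):
    """Normalized power method: approximates the dominant eigenpair,
    assuming the largest-modulus eigenvalue is simple."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(num_iters):
        y = A @ x
        x = y / np.linalg.norm(y)
    # One way to recover the eigenvalue: the Rayleigh quotient of x
    # (cf. the entry-comparison idea a couple of slides below).
    lam = np.vdot(x, A @ x)
    return lam, x
```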

  12. Power Method
Convergence analysis of the normalized power method is essentially the same as in the un-normalized case. The only difference is that we now get an extra scaling factor, $c_k \in \mathbb{R}$, due to the normalization at each step:
$$x_k = c_k \lambda_n^k \left( \alpha_n v_n + \sum_{j=1}^{n-1} \alpha_j \left( \frac{\lambda_j}{\lambda_n} \right)^k v_j \right).$$

  13. Power Method
This algorithm directly produces the eigenvector $v_n$. One way to recover $\lambda_n$ is to note that $y_k = A x_{k-1} \approx \lambda_n x_{k-1}$, hence we can compare an entry of $y_k$ and $x_{k-1}$ to approximate $\lambda_n$.
We also note two potential issues:
1. We require $x_0$ to have a nonzero component of $v_n$
2. There may be more than one eigenvalue with maximum modulus

  14. Power Method
Issue 1:
- In practice, it is very unlikely that $x_0$ will be orthogonal to $v_n$
- Even if $x_0^* v_n = 0$, rounding error will introduce a component of $v_n$ during the power iterations
Issue 2:
- We cannot ignore the possibility that there is more than one "max. eigenvalue"
- In this case $x_k$ would converge to a member of the corresponding eigenspace

  15. Power Method
An important idea in eigenvalue computations is to consider the "shifted" matrix $A - \sigma I$, for $\sigma \in \mathbb{R}$. We see that
$$(A - \sigma I) v_i = (\lambda_i - \sigma) v_i,$$
and hence the spectrum of $A - \sigma I$ is shifted by $-\sigma$, while the eigenvectors are the same.
For example, if all the eigenvalues are real, a shift can be used with the power method to converge to $\lambda_1$ instead of $\lambda_n$.

  16. Power Method
Python example: Consider the power method and the shifted power method for
$$A = \begin{pmatrix} 4 & 1 \\ 1 & -2 \end{pmatrix},$$
which has eigenvalues $\lambda_1 = -2.1623$, $\lambda_2 = 4.1623$.
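The lecture's actual script is not reproduced here; the following is a plausible reconstruction using the power_method sketch above, where the starting vector and the shift sigma = 4 are our choices:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, -2.0]])
x0 = np.array([1.0, 1.0])

# Plain power method: converges to the largest-modulus eigenvalue,
# lambda_2 ≈ 4.1623.
lam, _ = power_method(A, x0)
print(lam)

# Shifted power method: with sigma = 4 (our choice), the eigenvalues of
# A - 4I are approximately 0.1623 and -6.1623, so the iteration now
# converges to lambda_1 - sigma; adding sigma back gives lambda_1 ≈ -2.1623.
sigma = 4.0
lam_shifted, _ = power_method(A - sigma * np.eye(2), x0)
print(lam_shifted + sigma)
```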

  17. Inverse Iteration

  18. Inverse Iteration
The eigenvalues of $A^{-1}$ are the reciprocals of the eigenvalues of $A$, since
$$A v = \lambda v \iff A^{-1} v = \frac{1}{\lambda} v.$$
Question: What happens if we apply the power method to $A^{-1}$?

  19. Inverse Iteration
Answer: We converge to the largest (in modulus) eigenvalue of $A^{-1}$, which is $1/\lambda_1$ (recall that $\lambda_1$ is the smallest eigenvalue of $A$). This is called inverse iteration:
1: choose $x_0 \in \mathbb{C}^n$ arbitrarily
2: for $k = 1, 2, \dots$ do
3:   solve $A y_k = x_{k-1}$ for $y_k$
4:   $x_k = y_k / \|y_k\|$
5: end for
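A Python sketch; factoring $A$ once with scipy.linalg.lu_factor so that each iteration costs only a pair of triangular solves is a standard implementation choice, not something the slide specifies:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def inverse_iteration(A, x0, num_iters=50):
    """Inverse iteration: the power method applied to A^{-1}.
    A is factored once so each iteration is a cheap solve."""
    lu_piv = lu_factor(A)
    x = x0 / np.linalg.norm(x0)
    for _ in range(num_iters):
        y = lu_solve(lu_piv, x)       # solve A y = x
        x = y / np.linalg.norm(y)
    lam = np.vdot(x, A @ x)           # Rayleigh quotient estimate
    return lam, x
```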

  20. Inverse Iteration
Hence inverse iteration gives $\lambda_1$ without requiring a shift. This is helpful since it may be difficult to determine what shift is required to get $\lambda_1$ in the power method.
Question: What happens if we apply inverse iteration to the shifted matrix $A - \sigma I$?

  21. Inverse Iteration
The smallest (in modulus) eigenvalue of $A - \sigma I$ is $(\lambda_{i^*} - \sigma)$, where
$$i^* = \arg \min_{i=1,2,\dots,n} |\lambda_i - \sigma|,$$
and hence...
Answer: We converge to $\tilde{\lambda} = 1/(\lambda_{i^*} - \sigma)$, then recover $\lambda_{i^*}$ via $\lambda_{i^*} = \frac{1}{\tilde{\lambda}} + \sigma$.
Inverse iteration with shift allows us to find the eigenvalue closest to $\sigma$.
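For example, reusing the inverse_iteration sketch above on the shifted matrix; the matrix is the lecture's 2x2 example, while the shift is our illustrative choice:

```python
import numpy as np

# Find the eigenvalue of A closest to sigma by running inverse
# iteration on A - sigma*I and undoing the shift at the end.
A = np.array([[4.0, 1.0],
              [1.0, -2.0]])
sigma = -2.0  # we expect the nearby eigenvalue lambda_1 ≈ -2.1623

lam_shifted, v = inverse_iteration(A - sigma * np.eye(2),
                                   np.array([1.0, 1.0]))
print(lam_shifted + sigma)  # ≈ -2.1623, the eigenvalue closest to sigma
```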

  22. Rayleigh Quotient Iteration

  23. Rayleigh Quotient
For the remainder of this section (Rayleigh Quotient Iteration, QR Algorithm) we will assume that $A \in \mathbb{R}^{n \times n}$ is real and symmetric.²
The Rayleigh quotient is defined as
$$r(x) \equiv \frac{x^T A x}{x^T x}.$$
If $(\lambda, v) \in \mathbb{R} \times \mathbb{R}^n$ is an eigenpair, then
$$r(v) = \frac{v^T A v}{v^T v} = \frac{\lambda v^T v}{v^T v} = \lambda.$$
² Much of the material generalizes to complex non-hermitian matrices, but the symmetric case is simpler.
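A quick numerical sanity check of the identity $r(v) = \lambda$, with an arbitrary random symmetric matrix of our choosing:

```python
import numpy as np

def rayleigh_quotient(A, x):
    return (x @ (A @ x)) / (x @ x)

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B + B.T  # random symmetric matrix

lams, V = np.linalg.eigh(A)  # eigenvalues ascending, eigenvectors in columns
print(rayleigh_quotient(A, V[:, 0]), lams[0])  # the two should agree
```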

  24. Rayleigh Quotient
Theorem: Suppose $A \in \mathbb{R}^{n \times n}$ is a symmetric matrix; then for any $x \in \mathbb{R}^n$ we have $\lambda_1 \le r(x) \le \lambda_n$.
Proof: We write $x$ as a linear combination of (orthonormal) eigenvectors, $x = \sum_{j=1}^n \alpha_j v_j$, and the lower bound follows from
$$r(x) = \frac{x^T A x}{x^T x} = \frac{\sum_{j=1}^n \lambda_j \alpha_j^2}{\sum_{j=1}^n \alpha_j^2} \ge \lambda_1 \frac{\sum_{j=1}^n \alpha_j^2}{\sum_{j=1}^n \alpha_j^2} = \lambda_1.$$
The proof of the upper bound $r(x) \le \lambda_n$ is analogous. □

  25. Rayleigh Quotient
Corollary: A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is positive definite if and only if all of its eigenvalues are positive.
Proof: (⇒) Suppose $A$ is symmetric positive definite (SPD); then for any nonzero $x \in \mathbb{R}^n$ we have $x^T A x > 0$, and hence
$$\lambda_1 = r(v_1) = \frac{v_1^T A v_1}{v_1^T v_1} > 0.$$
(⇐) Suppose $A$ has positive eigenvalues; then for any nonzero $x \in \mathbb{R}^n$,
$$x^T A x = r(x)(x^T x) \ge \lambda_1 \|x\|_2^2 > 0. \;\square$$

  26. Rayleigh Quotient
But also, if $x$ is an approximate eigenvector, then $r(x)$ gives us a good approximation to the eigenvalue.
This is because estimating an eigenvalue from an approximate eigenvector is an $n \times 1$ linear least squares problem
$$x \lambda \approx A x,$$
where $x \in \mathbb{R}^n$ is our "tall thin matrix" and $Ax \in \mathbb{R}^n$ is our right-hand side.
Hence the normal equation for $x \lambda \approx A x$ yields the Rayleigh quotient, i.e.
$$x^T x \lambda = x^T A x \implies \lambda = \frac{x^T A x}{x^T x} = r(x).$$
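This equivalence can be checked directly against numpy's least squares solver; the test matrix and vector below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = B + B.T
x = rng.standard_normal(4)

# Solve the n-by-1 least squares problem  x * lam ≈ A x
lam_lsq, *_ = np.linalg.lstsq(x.reshape(-1, 1), A @ x, rcond=None)

print(lam_lsq[0], (x @ A @ x) / (x @ x))  # identical values
```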

  27. Rayleigh Quotient
Question: How accurate is the Rayleigh quotient approximation to an eigenvalue?
Let's consider $r$ as a function of $x$, so $r : \mathbb{R}^n \to \mathbb{R}$. Then
$$\frac{\partial r(x)}{\partial x_j} = \frac{\frac{\partial}{\partial x_j}(x^T A x)\,(x^T x) - (x^T A x)\,\frac{\partial}{\partial x_j}(x^T x)}{(x^T x)^2} = \frac{2 (A x)_j}{x^T x} - \frac{(x^T A x)\, 2 x_j}{(x^T x)^2} = \frac{2}{x^T x} \left( A x - r(x) x \right)_j.$$
(Note that the second equality relies on the symmetry of $A$.)

  28. Rayleigh Quotient
Therefore
$$\nabla r(x) = \frac{2}{x^T x} \left( A x - r(x) x \right).$$
For an eigenpair $(\lambda, v)$ we have $r(v) = \lambda$, and hence
$$\nabla r(v) = \frac{2}{v^T v} (A v - \lambda v) = 0.$$
This shows that eigenvectors of $A$ are stationary points of $r$.
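A finite-difference check of this gradient formula; the step size, matrix, and test point are arbitrary choices:

```python
import numpy as np

def r(A, x):
    return (x @ (A @ x)) / (x @ x)

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
A = B + B.T
x = rng.standard_normal(4)

grad = 2.0 / (x @ x) * (A @ x - r(A, x) * x)

# Compare against central finite differences, component by component
h = 1e-6
fd = np.zeros(4)
for j in range(4):
    e = np.zeros(4)
    e[j] = h
    fd[j] = (r(A, x + e) - r(A, x - e)) / (2 * h)

print(np.max(np.abs(grad - fd)))  # small (finite-difference error only)
```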

  29. Rayleigh Quotient
Suppose $(\lambda, v)$ is an eigenpair of $A$, and let us consider a Taylor expansion of $r(x)$ about $v$:
$$r(x) = r(v) + \nabla r(v)^T (x - v) + \tfrac{1}{2} (x - v)^T H_r(v) (x - v) + \text{h.o.t.} = r(v) + \tfrac{1}{2} (x - v)^T H_r(v) (x - v) + \text{h.o.t.}$$
Hence as $x \to v$, the error in a Rayleigh quotient approximation is
$$|r(x) - \lambda| = O(\|x - v\|_2^2).$$
That is, the Rayleigh quotient approximation to an eigenvalue squares the error in a corresponding eigenvector approximation.
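A numerical illustration of this quadratic accuracy; the matrix, perturbation direction, and perturbation sizes are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((5, 5))
A = B + B.T

lams, V = np.linalg.eigh(A)
lam, v = lams[0], V[:, 0]
d = rng.standard_normal(5)
d /= np.linalg.norm(d)

# Halving the eigenvector error should roughly quarter the eigenvalue error
for eps in [1e-1, 5e-2, 2.5e-2]:
    x = v + eps * d
    err = abs((x @ A @ x) / (x @ x) - lam)
    print(eps, err)  # err scales like eps**2
```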
