Notes: The Power Method (cs542g, term 1, 2007)

Notes
• Assignment 1 due tonight (email me by tomorrow morning)

The Power Method
• Start with some random vector v, ||v||_2 = 1
• Iterate v = Av / ||Av||
• The eigenvector with the largest eigenvalue tends to dominate
• How fast? Linear convergence, slowed down by close eigenvalues (see the sketch at the end of these notes)

Shift and Invert (Rayleigh Iteration)
• Say the eigenvalue we want is approximately σ
• The matrix (A − σI)^(−1) has the same eigenvectors as A, but its eigenvalues are µ_i = 1 / (λ_i − σ)
• Use this in the power method instead
• Even better, update the guess at the eigenvalue each iteration: σ_{k+1} = v_{k+1}^T A v_{k+1}
• Gives cubic convergence! (triples the number of significant digits each iteration, when converging; a sketch follows below)

Maximality and Orthogonality
• Unit eigenvectors v_1 of the maximum-magnitude eigenvalue satisfy
  ||A v_1||_2 = max_{||u||_2 = 1} ||A u||_2
• Unit eigenvectors v_k of the k-th eigenvalue satisfy
  ||A v_k||_2 = max { ||A u||_2 : ||u||_2 = 1, u^T v_i = 0 for i < k }
• Can pick them off one by one, or….

Orthogonal Iteration
• Solve for lots (or all) of the eigenvectors simultaneously
• Start with an initial guess V
• For k = 1, 2, …
  - Z = AV
  - VR = Z (QR decomposition: orthogonalize Z)
• Easy, but slow (linear convergence; nearby eigenvalues slow things down a lot)

Rayleigh-Ritz
• Aside: find a subset of the eigenpairs, e.g. the largest k or the smallest k
• Orthogonal estimate V (n × k) of the eigenvectors
• Simple Rayleigh estimate of the eigenvalues: diag(V^T A V)
• Rayleigh-Ritz approach (combined with orthogonal iteration in the sketch below):
  - Solve the k × k eigenproblem V^T A V
  - Use those eigenvalues (the Ritz values) and the associated orthogonal combinations of the columns of V
• Note: another instance of "assume the solution lies in the span of a few basis vectors, solve a reduced-dimension problem"
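
A minimal NumPy sketch of the power method as stated above; the Rayleigh-quotient eigenvalue estimate, the residual-based convergence test, the tolerance, and the random seed are illustrative choices, not part of the notes.

```python
import numpy as np

def power_method(A, tol=1e-10, max_iter=1000, seed=0):
    """Power method: iterate v = Av / ||Av||_2 from a random unit vector.

    Converges linearly to an eigenvector of the largest-magnitude
    eigenvalue; the rate degrades when the top eigenvalues are close.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)                 # random unit start, ||v||_2 = 1
    for _ in range(max_iter):
        w = A @ v
        v = w / np.linalg.norm(w)          # v = Av / ||Av||
        lam = v @ A @ v                    # Rayleigh-quotient eigenvalue estimate
        if np.linalg.norm(A @ v - lam * v) < tol:
            break                          # residual small: converged
    return lam, v
```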
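
A sketch of the shift-and-invert slide with the Rayleigh-quotient shift update (Rayleigh quotient iteration), assuming a symmetric A. Solving the shifted system instead of explicitly forming (A − σI)^(−1) is standard practice; the starting shift, iteration cap, and tolerance are illustrative.

```python
import numpy as np

def rayleigh_quotient_iteration(A, sigma, max_iter=20, tol=1e-12, seed=0):
    """Power method on (A - sigma I)^{-1}, updating sigma to the current
    Rayleigh quotient each step; cubic convergence near a simple
    eigenvalue of a symmetric A."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    I = np.eye(n)
    for _ in range(max_iter):
        try:
            # One step on (A - sigma I)^{-1}: solve, do not invert.
            w = np.linalg.solve(A - sigma * I, v)
        except np.linalg.LinAlgError:
            break                          # shift hit an eigenvalue exactly
        v = w / np.linalg.norm(w)
        sigma = v @ A @ v                  # sigma_{k+1} = v_{k+1}^T A v_{k+1}
        if np.linalg.norm(A @ v - sigma * v) < tol:
            break
    return sigma, v
```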
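
A sketch combining the Orthogonal Iteration and Rayleigh-Ritz slides for a symmetric A; the fixed iteration count and the use of numpy.linalg.eigh for the small k × k eigenproblem are my choices for the illustration.

```python
import numpy as np

def orthogonal_iteration(A, k, num_iter=200, seed=0):
    """Orthogonal (subspace) iteration for the k largest-magnitude
    eigenpairs of a symmetric A, finished with a Rayleigh-Ritz step."""
    rng = np.random.default_rng(seed)
    V, _ = np.linalg.qr(rng.standard_normal((A.shape[0], k)))  # initial guess
    for _ in range(num_iter):
        Z = A @ V                          # Z = AV
        V, _ = np.linalg.qr(Z)             # VR = Z: orthogonalize Z
    # Rayleigh-Ritz: solve the small k-by-k eigenproblem V^T A V and take
    # the corresponding orthogonal combinations of the columns of V.
    ritz_values, S = np.linalg.eigh(V.T @ A @ V)
    return ritz_values, V @ S              # Ritz values and Ritz vectors
```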

Solving the Full Problem
• Orthogonal iteration works, but it's slow
• First speed-up: make A tridiagonal (demonstrated in the first sketch below)
  - Sequence of symmetric Householder reflections
  - Then Z = AV runs in O(n^2) instead of O(n^3)
• Other ingredients:
  - Shifting: if we shift A by an exact eigenvalue, A − λI, we get an exact eigenvector out of QR (the last column), which improves on linear convergence
  - Division: once an off-diagonal entry is almost zero, the problem separates into decoupled blocks

Nonlinear Optimization
• Switch gears a little: we have already seen plenty of instances of minimizing, with linear least-squares
• What about nonlinear problems?
• Find x = argmin_x f(x)
• f(x) is called the objective
• This is an unconstrained problem, since there are no limits on x

Classes of Methods
• Only evaluate f:
  - Stochastic search, pattern search, cyclic coordinate descent (Gauss-Seidel), genetic algorithms, etc.
• Also evaluate ∂f/∂x (the gradient vector):
  - Steepest descent and relatives
  - Quasi-Newton methods
• Also evaluate ∂²f/∂x² (the Hessian matrix):
  - Newton's method and relatives

Steepest Descent
• The gradient is the direction of fastest change
  - Locally, f(x + δx) is smallest when δx is in the direction of the negative gradient −∇f
• The algorithm (see the second sketch after these notes):
  - Start with a guess x^(0)
  - Until converged:
    - Find the direction d^(k) = −∇f(x^(k))
    - Choose a step size α^(k)
    - Next guess is x^(k+1) = x^(k) + α^(k) d^(k)

Convergence?
• At the global minimum, the gradient is zero:
  - Can test whether the gradient is smaller than some threshold for convergence
  - Note the scaling problem: min A·f(B·x) + C has the same minimizer but a rescaled gradient, so any fixed threshold is somewhat arbitrary
• However, the gradient is also zero at
  - Every local minimum
  - Every local maximum
  - Every saddle point

Convexity
• A function is convex if f(θx + (1 − θ)y) ≤ θ f(x) + (1 − θ) f(y) for all θ ∈ [0, 1]
• Eliminates the possibility of multiple strict local minima
• Strictly convex: at most one local minimum
• A very good property for a problem to have!
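
The notes describe the tridiagonalization only as a sequence of symmetric Householder reflections; scipy.linalg.hessenberg is a convenient stand-in to demonstrate the result, since for a symmetric A the Householder Hessenberg form is tridiagonal. The random 6 × 6 test matrix is just for illustration.

```python
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
A = (A + A.T) / 2                      # make A symmetric

# Householder reduction: A = Q T Q^T with T upper Hessenberg;
# symmetry of A makes T (numerically) tridiagonal.
T, Q = hessenberg(A, calc_q=True)
print(np.allclose(Q @ T @ Q.T, A))     # True: similarity, same eigenvalues
off = T - np.triu(np.tril(T, 1), -1)   # entries outside the tridiagonal band
print(np.max(np.abs(off)) < 1e-12)     # True: T is tridiagonal
```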
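
A sketch of the steepest descent loop from the slide above, using a fixed step size α for simplicity (step-size selection is the subject of the next slide); the tolerance, iteration cap, and the convex quadratic test problem are illustrative choices.

```python
import numpy as np

def steepest_descent(grad, x0, alpha=0.2, tol=1e-8, max_iter=10000):
    """Steepest descent with a fixed step size alpha."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        d = -grad(x)                       # direction d^(k) = -grad f(x^(k))
        if np.linalg.norm(d) < tol:        # gradient ~ 0: stationary point
            break
        x = x + alpha * d                  # x^(k+1) = x^(k) + alpha d^(k)
    return x

# Convex quadratic f(x) = 1/2 x^T A x - b^T x, whose minimizer solves Ax = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
x_star = steepest_descent(lambda x: A @ x - b, x0=np.zeros(2))
print(np.allclose(x_star, np.linalg.solve(A, b), atol=1e-6))  # True
```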

Selecting a Step Size
• The scaling problem again: the physical dimensions of x and of the gradient may not match
• Choosing a step too large:
  - May end up further from the minimum
• Choosing a step too small:
  - Slow, maybe too slow to actually converge
• Line search: keep picking different step sizes until satisfied (one common strategy is sketched below)
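
The slide leaves the acceptance test open ("until satisfied"); backtracking with the Armijo sufficient-decrease condition is one common concrete choice, sketched here. It is not necessarily the variant intended in the notes, and the parameters alpha, rho, and c are conventional defaults.

```python
import numpy as np

def backtracking_line_search(f, grad, x, d, alpha=1.0, rho=0.5, c=1e-4):
    """Shrink alpha until f decreases by at least a fraction c of the
    decrease predicted by the gradient (the Armijo condition)."""
    fx = f(x)
    slope = grad(x) @ d                    # directional derivative; < 0 if d descends
    while f(x + alpha * d) > fx + c * alpha * slope:
        alpha *= rho                       # step too large: shrink and retry
        if alpha < 1e-16:
            break                          # give up: d may not be a descent direction
    return alpha
```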
