

1. Communication-avoiding Krylov subspace methods
   Mark Hoemmen (mhoemmen@cs.berkeley.edu), University of California Berkeley EECS
   SIAM Parallel Processing for Scientific Computing 2008
   Outline: Motivation; Break the dependency; Previous work; Preconditioning; Future work; Summary

2. Overview
   - Current Krylov methods are communication-limited.
   - We can rearrange them to avoid communication...
   - ...and do so in a numerically stable way.
   - This requires rethinking preconditioning.

3. Motivation
   - Krylov methods are built from two communication-bound kernels.
   - Each kernel can be rearranged to avoid communication, but...
   - ...the data dependency between the two precludes the rearrangement...
   - ...unless you rearrange the Krylov method itself!

4. Krylov methods: two communication-bound kernels
   - Sparse matrix-vector multiplication (SpMV):
     - must share/communicate the source vector with neighbors;
     - low computational intensity per processor.
   - Orthogonalization: Θ(1) reductions per vector (MGS sketch below).
     - Arnoldi/GMRES: Modified Gram-Schmidt or Householder QR.
     - Lanczos/CG: the recurrence orthogonalizes implicitly.

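To see where the Θ(1) reductions per vector come from, here is a minimal NumPy sketch of Modified Gram-Schmidt: each dot product below is a separate global reduction (all-reduce) on a distributed-memory machine, and they cannot be batched. Function and variable names are illustrative, not from the talk.

    import numpy as np

    def mgs_orthogonalize(V, w):
        """Orthogonalize w against the columns of V with Modified Gram-Schmidt.

        Each np.dot below is a dot product; in parallel each one is a
        separate all-reduce, hence a latency-bound reduction for every
        previously computed basis vector."""
        h = np.zeros(V.shape[1] + 1)
        for j in range(V.shape[1]):
            h[j] = np.dot(V[:, j], w)    # one reduction
            w = w - h[j] * V[:, j]
        h[-1] = np.linalg.norm(w)        # one more reduction
        return w / h[-1], h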

5. Potential to avoid communication
   - SpMV: the matrix powers kernel (Marghoob; see the sketch below).
     - Compute [v, Av, A^2 v, ..., A^s v].
     - Tiling to reuse matrix entries.
     - Parallel: same latency cost as one SpMV.
     - Sequential: reads the matrix only O(1) times.
   - Orthogonalization: TSQR (Julien).
     - Just as stable as Householder QR.
     - Parallel: same latency cost as one reduction.
     - Sequential: reads the vectors only once.

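In its simplest (monomial-basis) form, the matrix powers kernel just produces the block [v, Av, ..., A^s v]. The reference NumPy sketch below shows only what is computed; the communication-avoiding implementation produces the same block with tiling/ghost regions so that, in parallel, it pays the latency of a single SpMV and, sequentially, it reads A only O(1) times. Names are illustrative.

    import numpy as np

    def matrix_powers(A, v, s):
        """Return the n x (s+1) block [v, A v, A^2 v, ..., A^s v].

        A may be a dense array or a scipy.sparse matrix.  This reference
        version does s separate SpMVs; the communication-avoiding kernel
        computes the same block while reading A only O(1) times."""
        n = v.shape[0]
        W = np.empty((n, s + 1))
        W[:, 0] = v
        for k in range(s):
            W[:, k + 1] = A @ W[:, k]    # one SpMV per column in this naive form
        return W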

6. Problem: data dependencies limit reuse
   - Krylov methods advance one vector at a time:
   - SpMV, then orthogonalize, then SpMV, ...
   [Figure: data dependencies in Krylov subspace methods.]

7. s-step Krylov methods: break the dependency
   - Matrix powers kernel: compute a basis of span{v, Av, A^2 v, ..., A^s v}.
   - TSQR: orthogonalize the basis (see the sketch below).
   - Use the R factor to reconstruct the upper Hessenberg H (resp. tridiagonal T).
   - Solve a least squares problem (resp. linear system) with H (resp. T) for the coefficients of the solution update.

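A minimal sketch of the TSQR idea, assuming NumPy and that each row block has at least as many rows as W has columns: factor each block locally, then reduce the small R factors pairwise up a tree. A real implementation also keeps the Q factors implicitly; the block count and names here are illustrative.

    import numpy as np

    def tsqr_R(W, nblocks=4):
        """Compute the R factor of a tall-skinny W by a binary reduction tree.

        Each level does independent small QR factorizations; in parallel the
        whole factorization costs the latency of one reduction, and it is as
        stable as Householder QR.  (This sketch only returns R; the Q factor
        is kept implicitly in a real implementation.)"""
        blocks = np.array_split(W, nblocks, axis=0)
        Rs = [np.linalg.qr(B, mode='r') for B in blocks]    # local QRs
        while len(Rs) > 1:                                   # reduction tree
            Rs = [np.linalg.qr(np.vstack(Rs[i:i + 2]), mode='r')
                  for i in range(0, len(Rs), 2)]
        return Rs[0]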

8. Example: GMRES

9. Original GMRES
   1: for k = 1 to s do
   2:   w = A v_{k-1}
   3:   Orthogonalize w against v_0, ..., v_{k-1} using Modified Gram-Schmidt
   4: end for
   5: Compute the solution using H
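For reference, a runnable NumPy version of one restart cycle of that loop (Arnoldi with Modified Gram-Schmidt, then a small least-squares solve for the update). No breakdown handling; names are illustrative.

    import numpy as np

    def gmres_cycle(A, b, x0, s):
        """One restart cycle of standard GMRES: s Arnoldi steps with
        Modified Gram-Schmidt, then a small least-squares solve."""
        n = b.shape[0]
        r0 = b - A @ x0
        beta = np.linalg.norm(r0)
        V = np.zeros((n, s + 1))
        H = np.zeros((s + 1, s))
        V[:, 0] = r0 / beta
        for k in range(s):
            w = A @ V[:, k]                    # SpMV
            for j in range(k + 1):             # orthogonalize (MGS)
                H[j, k] = np.dot(V[:, j], w)
                w -= H[j, k] * V[:, j]
            H[k + 1, k] = np.linalg.norm(w)
            V[:, k + 1] = w / H[k + 1, k]
        # min || beta*e1 - H y || gives the coefficients of the update
        rhs = np.zeros(s + 1)
        rhs[0] = beta
        y, *_ = np.linalg.lstsq(H, rhs, rcond=None)
        return x0 + V[:, :s] @ y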

10. Version 2: matrix powers kernel and TSQR
   1: W = [v_0, A v_0, A^2 v_0, ..., A^s v_0]
   2: [Q, R] = TSQR(W)
   3: Compute H using R
   4: Compute the solution using H
   - s powers of A for no extra latency cost.
   - s steps of QR for one step of latency.
   - But...
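A sketch of how this version could look for the monomial basis, assuming exact arithmetic, with NumPy's QR standing in for TSQR and a plain loop standing in for the matrix powers kernel. Recovering H from R uses the monomial change-of-basis relation A W[:, :s] = W B; this is an illustrative reading of the slide, not the talk's actual implementation.

    import numpy as np

    def ca_gmres_cycle(A, b, x0, s):
        # One restart cycle of the rearranged GMRES, monomial-basis sketch.
        # No breakdown or ill-conditioning handling.
        n = b.shape[0]
        r0 = b - A @ x0
        beta = np.linalg.norm(r0)
        W = np.empty((n, s + 1))                 # [v, Av, ..., A^s v]
        W[:, 0] = r0 / beta
        for k in range(s):
            W[:, k + 1] = A @ W[:, k]            # matrix powers kernel (reference)
        Q, R = np.linalg.qr(W)                   # TSQR in the real algorithm
        # Monomial change of basis: A W[:, :s] = W B with B the shift matrix,
        # so A Q[:, :s] = Q H with H = R B inv(R[:s, :s]) upper Hessenberg.
        B = np.eye(s + 1, s, k=-1)
        H = R @ B @ np.linalg.inv(R[:s, :s])
        # GMRES least squares in the Q basis: r0 = beta * R[0, 0] * Q[:, 0].
        rhs = np.zeros(s + 1)
        rhs[0] = beta * R[0, 0]
        y, *_ = np.linalg.lstsq(H, rhs, rcond=None)
        return x0 + Q[:, :s] @ y

In exact arithmetic this returns the same update as the loop-based cycle above, but W and its QR factorization can each be computed in one latency-bound communication phase. The "But..." is the basis stability problem discussed next.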

11. Basis computation not stable
   - v, Av, A^2 v, ... looks familiar: it is the power method!
   - It converges to the principal eigenvector of A.
   - Expect increasing linear dependence...
   - The basis condition number grows exponentially in s (small demo below).

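The growth is easy to observe numerically. The small demo below (the 1-D Laplacian test matrix and sizes are illustrative choices, not the talk's test problems) prints the condition number of the monomial Krylov block as s grows:

    import numpy as np

    # Illustrative: condition number of the monomial basis vs. s
    n = 200
    A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1-D Laplacian
    v = np.random.default_rng(0).standard_normal(n)
    v /= np.linalg.norm(v)

    W = v[:, None].copy()
    for s in range(1, 25):
        W = np.hstack([W, (A @ W[:, -1])[:, None]])        # append A^s v
        print(s, np.linalg.cond(W))                        # grows rapidly with s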

12. Version 3: a different basis
   - Just as in polynomial interpolation, use a different basis, e.g.:
   - Newton basis (sketched below): W = [v, (A − θ_1 I) v, (A − θ_2 I)(A − θ_1 I) v, ...]
     - The shifts θ_i come for free: they are Ritz values.
     - The shifts can change with each group of s steps.
   - Chebyshev basis: W = [v, T_1(A) v, T_2(A) v, ...]
     - Use condition number bounds to scale T_k(z).
     - The sensitivity of κ_2(W) to those bounds is uncertain.
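A reference sketch of the Newton-basis matrix powers kernel, with the shifts passed in as a parameter (in the s-step methods they are Ritz values obtained for free from a previous group of s steps). Names are illustrative.

    import numpy as np

    def newton_basis(A, v, shifts):
        """Newton-basis Krylov block (reference version):
        W = [v, (A - theta_1 I)v, (A - theta_2 I)(A - theta_1 I)v, ...].

        'shifts' = [theta_1, ..., theta_s]; choosing them as Ritz values of A
        keeps the basis much better conditioned than the monomial basis."""
        n = v.shape[0]
        s = len(shifts)
        W = np.empty((n, s + 1))
        W[:, 0] = v
        for k, theta in enumerate(shifts):
            W[:, k + 1] = A @ W[:, k] - theta * W[:, k]
        return W

The Chebyshev basis is generated analogously from the three-term Chebyshev recurrence in A, scaled using the bounds mentioned on the slide.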

13. Basis condition number
   [Figure: condition number of the various bases as a function of basis length s. The matrix A is a 10^6 × 10^6 2-D Poisson operator.]

14. Numerical experiments
   - Diagonal 10^4 × 10^4 matrix with κ_2(A) = 10^8, s = 24.
   - Newton basis: condition number about 10^14.
   - Monomial basis: condition number about 10^16.

15. Better basis pays off: restarting
   [Figure: GMRES(24,1) relative residuals for cond(A) = 1e8, n = 1e4. Log (base 10) of the 2-norm relative residual versus iteration count, for the Standard(24,1), Monomial(24,1), and Newton(24,1) variants. Restart after every group of s steps.]

16. Better basis pays off: less restarting
   [Figure: GMRES(24,8) relative residuals for cond(A) = 1e8, n = 1e4. Log (base 10) of the 2-norm relative residual versus iteration count, for the Standard(24,8), Monomial(24,8), and Newton(24,8) variants. Restart after 8 groups of s = 24 steps.]
