
A parallel eigensolver using contour integration for generalized eigenvalue problems in molecular simulation

Tetsuya Sakurai


  1. A parallel eigensolver using contour integration for generalized eigenvalue problems in molecular simulation. Tetsuya Sakurai (University of Tsukuba), Hiroto Tadano (Kyoto University), Umpei Nagashima (AIST)

  2. Contents • Introduction: background; target problem & computing environment • An Eigensolver using Contour Integration: an algorithm; numerical properties; parallel implementation • Numerical Examples • Conclusions

  3. Molecular Orbital Computation. Design of anticancer drugs; target: EGFR (Epidermal Growth Factor Receptor). Schrödinger equation $H\Psi = E\Psi$ → Hartree-Fock approximation → generalized eigenvalue problems.

  4. Matrix Generation. A large molecule is separated into small segments. FMO (Fragment MO) method: SCF for each segment (small eigenproblems). FMO-MO method: large-scale eigenproblem.

  5. Required Orbitals. Eigenvectors related to chemical activities: HOMO, LUMO (energy state). Interior eigenvalue problems.

  6. Matrix Properties • The size of matrix: 2K ~ 200K • The number of nonzero elements: 1M ~ 400M - relatively large number of nonzero elements - unstructured sparsity pattern. Example: Fock matrix of Lysozyme + H2O.

  7. Computing Environment. Client/Server (figure labels: Client, Clusters). FMO-MO method is suitable for GRID computing. A highly parallelized eigensolver is required.

  8. Contents • Introduction: background; target problem & computing environment • An Eigensolver using Contour Integration: an algorithm; numerical properties; parallel implementation • Numerical Examples • Conclusions

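  9. Generalized Eigenvalue Problem. The generalized eigenvalue problem: $A u = \lambda B u$, where $A, B \in \mathbb{R}^{n \times n}$ are symmetric and $B$ is positive definite. $(\lambda_j, u_j)$: eigenpair of the matrix pencil $(A, B)$. We find eigenpairs in a given interval. [Figure: eigenvalues $\lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_{m-1}, \lambda_m, \lambda_{m+1}$ marked on the real axis, with $\gamma$ indicated.]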

  10. Rayleigh-Ritz Procedure. Algorithm with an inner loop and an outer loop. Quantities involved: projected pencil, Ritz value, Ritz vector.
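
The Rayleigh-Ritz step named on this slide can be illustrated with a short NumPy sketch. This is a generic textbook version, not the authors' code: the function name `rayleigh_ritz`, the Cholesky-based reduction of the projected pencil, and the tiny test matrices are all assumptions chosen for illustration.

```python
import numpy as np

def rayleigh_ritz(A, B, S):
    """Rayleigh-Ritz procedure for the pencil (A, B) on the subspace span(S)."""
    # Orthonormalize the basis of the subspace.
    Q, _ = np.linalg.qr(S)
    # Projected pencil (Q^T A Q, Q^T B Q).
    A_proj = Q.T @ A @ Q
    B_proj = Q.T @ B @ Q
    # Reduce the small generalized problem to a standard symmetric one
    # using the Cholesky factor of the projected B (B_proj = C C^T).
    C = np.linalg.cholesky(B_proj)
    C_inv = np.linalg.inv(C)
    theta, W = np.linalg.eigh(C_inv @ A_proj @ C_inv.T)
    # Ritz values are theta; Ritz vectors are Q C^{-T} W.
    X = Q @ (C_inv.T @ W)
    return theta, X

# Hypothetical 4 x 4 example: the pencil has eigenvalues 0.5, 1.0, 1.5, 2.0.
A = np.diag([1.0, 2.0, 3.0, 4.0])
B = 2.0 * np.eye(4)
S = np.eye(4)[:, :2]          # subspace spanned by the first two eigenvectors
theta, X = rayleigh_ritz(A, B, S)
```

When the subspace is an exact invariant subspace, as in this toy case, the Ritz pairs coincide with eigenpairs; for an approximate subspace they are only approximations.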

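  11. Contour Integral of Resolvent. To avoid inner/outer loops, we use a contour integral in the construction of a subspace. For a nonzero vector $v$, the vectors $s_k$ are defined by a contour integral of the resolvent along $\Gamma$, where $\Gamma$ is a Jordan curve that includes $\lambda_1, \ldots, \lambda_m$. Then $\mathrm{span}(s_0, \ldots, s_{m-1}) = \mathrm{span}(u_1, \ldots, u_m)$ [S and Tadano (2007)].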
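
The defining integral is not legible in the slide text. A form consistent with the surrounding definitions is sketched below; the exact expression, the shift $\gamma$, and the normalization $u_i^{T} B u_j = \delta_{ij}$ are assumptions rather than quotations from the slide.

```latex
% Assumed moment definition (not quoted from the slide)
s_k = \frac{1}{2\pi i} \oint_{\Gamma} (z - \gamma)^k (zB - A)^{-1} B v \, dz ,
\qquad k = 0, 1, \ldots, m-1 .

% Since (zB - A) u_j = (z - \lambda_j) B u_j, expanding v in B-orthonormal eigenvectors gives
(zB - A)^{-1} B v = \sum_{j} \frac{u_j^{T} B v}{z - \lambda_j} \, u_j ,

% and the residue theorem keeps only the eigenvalues enclosed by \Gamma:
s_k = \sum_{j=1}^{m} (\lambda_j - \gamma)^k \, (u_j^{T} B v) \, u_j .
```

Under these assumptions each $s_k$ lies in $\mathrm{span}(u_1, \ldots, u_m)$, and the spans agree when the enclosed eigenvalues are distinct and every $u_j^{T} B v$ is nonzero.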

  12. Approximation for Contour Integral. $\Gamma$: circle with center $\gamma$ and radius $\rho$. Equidistributed points $\omega_0, \ldots, \omega_{N-1}$ on the circle. The $s_k$ are approximated by the $N$-point trapezoidal rule.
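
The quadrature formulas are not legible in the slide text. One common way to write the $N$-point trapezoidal rule on a circle is given below as a sketch; the half-step offset in the node angles and the hat notation $\hat{s}_k$ are assumptions.

```latex
% Assumed node placement and trapezoidal rule (not quoted from the slide)
\omega_j = \gamma + \rho \, e^{\frac{2\pi i}{N}\left(j + \frac{1}{2}\right)} ,
\qquad j = 0, 1, \ldots, N-1 ,

y_j = (\omega_j B - A)^{-1} B v ,
\qquad
\hat{s}_k = \frac{1}{N} \sum_{j=0}^{N-1} (\omega_j - \gamma)^{k+1} \, y_j .
```

The exponent $k+1$ arises because $dz = i (z - \gamma) \, d\theta$ on the circle $z = \gamma + \rho e^{i\theta}$. Each $y_j$ requires one linear solve with $\omega_j B - A$, and these solves are independent of one another.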

  13. Contour Integral Rayleigh-Ritz Method. Algorithm of the CIRR (Contour Integral Rayleigh-Ritz) method: (1) construct a subspace; (2) apply the Rayleigh-Ritz procedure.
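
The two steps can be sketched end to end for a small dense problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `cirr`, the parameter `M` (number of moments), the random start block, the half-step node offset, the use of only the upper half of the circle (valid for real symmetric matrices by conjugate symmetry), the SVD rank truncation, and dense `numpy.linalg.solve` in place of a sparse or iterative solver are all choices made here.

```python
import numpy as np

def cirr(A, B, gamma, rho, N=32, L=4, M=2, tol=1e-10, seed=0):
    """Contour-integral Rayleigh-Ritz sketch for dense real symmetric A and SPD B.

    Returns the Ritz pairs whose Ritz values lie in (gamma - rho, gamma + rho).
    """
    n = A.shape[0]
    V = np.random.default_rng(seed).standard_normal((n, L))  # block of start vectors
    BV = B @ V

    # Quadrature nodes on the upper half of the circle (N is assumed even).
    angles = 2.0 * np.pi * (np.arange(N // 2) + 0.5) / N
    w = np.exp(1j * angles)          # nodes on the unit circle
    omega = gamma + rho * w          # nodes on the target circle

    # One independent linear solve per node: (omega_j B - A) Y_j = B V.
    Y = [np.linalg.solve(om * B - A, BV) for om in omega]

    # Scaled moments S_k = (2/N) Re sum_j w_j^(k+1) Y_j for k = 0, ..., M-1.
    S = np.hstack([
        (2.0 / N) * sum(wj ** (k + 1) * Yj for wj, Yj in zip(w, Y)).real
        for k in range(M)
    ])

    # Step 1: orthonormal basis of the subspace, dropping tiny singular values.
    U, sigma, _ = np.linalg.svd(S, full_matrices=False)
    Q = U[:, sigma > tol * sigma[0]]

    # Step 2: Rayleigh-Ritz on span(Q), reducing the projected pencil by Cholesky.
    C = np.linalg.cholesky(Q.T @ B @ Q)
    C_inv = np.linalg.inv(C)
    theta, W = np.linalg.eigh(C_inv @ (Q.T @ A @ Q) @ C_inv.T)
    X = Q @ (C_inv.T @ W)

    inside = np.abs(theta - gamma) < rho
    return theta[inside], X[:, inside]

# Hypothetical test pencil: A = diag(1..20), B = 2I, eigenvalues 0.5, 1.0, ..., 10.0.
A = np.diag(np.arange(1.0, 21.0))
B = 2.0 * np.eye(20)
eigvals, eigvecs = cirr(A, B, gamma=2.0, rho=0.8)
residuals = np.linalg.norm(A @ eigvecs - (B @ eigvecs) * eigvals, axis=0)
```

With `L = 1` the sketch reduces to a single start vector; a larger `L` corresponds to using a block of vectors. The moments are divided by powers of `rho`, which leaves the spanned subspace unchanged.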

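  14. Influence of Quadrature Error. Let … Then … [Figure: eigenvalues $\lambda_1, \ldots, \lambda_m$ and $\lambda_{m+1}, \lambda_{m+2}, \lambda_{m+3}, \lambda_{m+4}$ marked on the real axis, with the radius $\rho$ indicated.]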

  15. Block Method. A block variant is also obtained by using a matrix instead of a vector $v$.

  16. Parallel Implementation. [Figure]

  17. Parallel Implementation. [Figure labels: Client, Server/Client, Server/Rank0.]
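
Because the linear systems at different quadrature points do not depend on each other, they can be farmed out to separate workers and the results gathered afterwards. The sketch below uses a Python thread pool as a stand-in for the servers and ranks in the figure; it is not the authors' client/server implementation, and the function names, the dense solver, and the random test matrices are assumptions.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def solve_at_node(args):
    """Worker task: solve (omega B - A) Y = B V for one quadrature node."""
    omega, A, B, BV = args
    return np.linalg.solve(omega * B - A, BV)

def parallel_node_solves(A, B, V, omega, max_workers=4):
    """Distribute the independent node solves over a pool of workers and gather the results."""
    BV = B @ V
    tasks = [(om, A, B, BV) for om in omega]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(solve_at_node, tasks))   # results come back in node order

# Hypothetical small test problem.
rng = np.random.default_rng(0)
n, L, N = 50, 3, 8
M0 = rng.standard_normal((n, n))
A = (M0 + M0.T) / 2.0                                  # real symmetric
B = np.eye(n) + 0.1 * np.diag(rng.random(n))           # symmetric positive definite
V = rng.standard_normal((n, L))
omega = np.exp(2j * np.pi * (np.arange(N) + 0.5) / N)  # unit circle centered at 0

Y_parallel = parallel_node_solves(A, B, V, omega)
Y_serial = [np.linalg.solve(om * B - A, B @ V) for om in omega]
```

Only the matrices and start vectors go out to the workers, and only the solution blocks come back, so communication is limited to one distribution step and one collection step.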

  18. Flow of the Eigensolver. FMO-MO method → matrix data → set appropriate circles → profiles → compute eigenpairs using contour integration → eigenpairs → post-processing → molecular orbitals.

  19. Contents • Introduction: background; target problem & computing environment • An Eigensolver using Contour Integration: an algorithm; numerical properties; parallel implementation • Numerical Examples • Conclusions

  20. Numerical Example (1). Test problem: • Model of 8 DNA base pairs • Matrix size: 1,980 × 1,980 • nnz: 728,080. Test environment: • OS: MacOSX 10.5 • CPU: Core 2 Duo 2.2GHz (2GB memory) • Software: MATLAB 7.5 • Solver: UMFPACK (sparse direct solver)

  21. Numerical Example (1). L = 12, N = 16, center = -0.22, radius = 0.02, 38 eigs. [Plot: residual error.]

  22. Numerical Example (1). L = 16, N = 24, center = -0.22, radius = 0.02, 38 eigs. [Plot: residual error.]

  23. Numerical Example (1). L = 20, N = 24, center = -0.22, radius = 0.02, 38 eigs. [Plot: residual error.]

  24. Numerical Example (2). Test problem: • Lysozyme + H2O • Matrix: [Okada, S and Teranishi (2007)] • Basis function: STO-3G • Size: 20,758 × 20,758 • nnz: 20,064,444. Test environment: • OS: MacOSX 10.5 • CPU: Core 2 Duo 2.2GHz (2GB memory) • Compiler: icc 10.1, ifort 10.1 • Solver: COCG method [van der Vorst and Melissen (1990)] • Preconditioner: Complete Factorization for Approximate • Sparse Direct Solver for Preconditioner: PARDISO

  25. Numerical Example (2). L = 12, N = 24, center: -0.22, radius: 0.03, 18 eigs. Wall-clock time: 233.2 sec. ARPACK+PARDISO: 316.1 sec, 20 eigs, max(res) = 6.6e-6 (Xeon 3.2GHz 2MB Memory). [Plot: residuals.]

  26. Numerical Example (3). Test problem: • EGF (Epidermal Growth Factor) • Matrix: [Okada, S and Teranishi (2007)] • Basis function: 6-31G • Size: 43,612 × 43,612 • nnz: 73,175,935. Test environment: • OS: MacOSX 10.5 • CPU: Core 2 Duo 2.2GHz (2GB memory) • Compiler: icc 10.1, ifort 10.1, MKL 10.0 • Solver: COCG method [van der Vorst and Melissen (1990)] • Preconditioner: Complete Factorization for Approximate • Sparse Direct Solver for Preconditioner: PARDISO

  27. Numerical Example (3). L = 8, N = 24: 1583.1 sec. L = 12, N = 24: 2017.7 sec. [Plots: residual for each case.]

  28. Numerical Example (3). L = 12, N = 24, 2017.7 sec. Timing result (serial case): systems $\omega_0 B - A$, $\omega_1 B - A$, ..., $\omega_{N/2-1} B - A$; preconditioner: 36.8 sec; iteration for $v_1, \ldots, v_L$: 75.8 sec; then RR.

  29. Numerical Example (3). L = 12, N = 24. Timing result estimation (parallel case 1): Broadcast → CPU 1: $\omega_0 B - A$, $v_1, \ldots, v_L$ (36.8 sec, 75.8 sec); CPU 2: $\omega_1 B - A$, $v_1, \ldots, v_L$; ...; CPU L: $\omega_{N/2-1} B - A$ → Gather.

  30. Numerical Example (3). L = 12, N = 24. Timing result estimation (parallel case 2): CPU 1: $\omega_0 B - A$, $v_1$; CPU 2: $\omega_0 B - A$, $v_L$; ...; CPU L*(N/2): $\omega_{N/2-1} B - A$, $v_L$.

  31. Summary • A Rayleigh-Ritz type method using the contour integral was proposed. • This method finds a limited number of eigenpairs in a given interval: efficient for molecular orbital computation; easy to implement for distributed computing. • Find a good preconditioner. • Application to other problems (not only the SPD case).
