A parallel eigensolver using contour integration for generalized eigenvalue problems in molecular simulation



slide-1
SLIDE 1

A parallel eigensolver using contour integration for generalized eigenvalue problems in molecular simulation

Tetsuya Sakurai (University of Tsukuba) Hiroto Tadano (Kyoto University) Umpei Nagashima (AIST)

slide-2
SLIDE 2

Contents

Introduction

  • Background
  • Target problem & computing environment

An Eigensolver using Contour Integration

  • An algorithm
  • Numerical Properties
  • Parallel Implementation

Numerical Examples

Conclusions

slide-3
SLIDE 3

Molecular Orbital Computation

Design of Anticancer Drugs: EGFR (Epidermal Growth Factor Receptor)

Schrödinger Equation: $H\Psi = E\Psi$

Hartree-Fock approximation

Generalized Eigenvalue Problems

slide-4
SLIDE 4

Matrix Generation

A large molecule is separated into small segments.

  • FMO (Fragment MO) method: SCF for each segment (small eigenproblems)
  • FMO-MO method: large-scale eigenproblem

slide-5
SLIDE 5

Required Orbitals

Eigenvectors related to chemical activities: HOMO, LUMO

[Figure: energy states, with HOMO and LUMO marked]

Interior eigenvalue problems

slide-6
SLIDE 6

Matrix Properties

  • The size of matrix: 2K ~ 200K
  • The number of nonzero elements: 1M ~ 400M
  • Relatively large number of nonzero elements
  • Unstructured sparsity pattern

[Figure: Fock matrix of Lysozyme + H2O]

slide-7
SLIDE 7

Computing Environment

[Figure: Client, Client/Server, Clusters]

  • FMO-MO method is suitable for GRID computing.
  • A highly parallelized eigensolver is required.

slide-8
SLIDE 8

Contents

Introduction

  • Background
  • Target problem & computing environment

An Eigensolver using Contour Integration

  • An algorithm
  • Numerical Properties
  • Parallel Implementation

Numerical Examples

Conclusions

slide-9
SLIDE 9

Generalized Eigenvalue Problem

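The generalized eigenvalue problem: $Ax = \lambda Bx$, where $A, B \in \mathbb{R}^{n \times n}$ are symmetric and $B$ is positive definite.

$(\lambda_j, u_j)$: eigenpair of the matrix pencil $(A, B)$

We find eigenpairs in a given interval.

[Figure: eigenvalues $\lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_{m-1}, \lambda_m, \lambda_{m+1}, \ldots$ marked on the real axis, together with the point $\gamma$]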

slide-10
SLIDE 10

Rayleigh-Ritz Procedure

Algorithm (annotated with an inner loop and an outer loop):

  • Projected pencil
  • Ritz value
  • Ritz vector
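The formulas of this slide are missing from the extracted text. As a reminder of the standard procedure behind the labels "projected pencil", "Ritz value" and "Ritz vector", here is a minimal Python sketch for a dense pencil and a given basis matrix `S`. The function name `rayleigh_ritz` and the toy matrices are assumptions made for illustration; they are not taken from the slides.

```python
import numpy as np
from scipy.linalg import eigh, qr

def rayleigh_ritz(A, B, S):
    """Return Ritz values and Ritz vectors of the pencil (A, B) from span(S)."""
    Q, _ = qr(S, mode="economic")      # orthonormal basis of the subspace
    A_proj = Q.T @ A @ Q               # projected pencil (A_proj, B_proj)
    B_proj = Q.T @ B @ Q
    theta, W = eigh(A_proj, B_proj)    # small dense generalized eigenproblem
    return theta, Q @ W                # Ritz values, Ritz vectors

# Toy pencil (assumed for illustration): A = diag(1, ..., 6), B = identity.
A = np.diag(np.arange(1.0, 7.0))
B = np.eye(6)

# A subspace that contains the eigenvectors for the eigenvalues 1, 2, 3 exactly.
rng = np.random.default_rng(0)
S = np.eye(6)[:, :3] @ (rng.standard_normal((3, 3)) + 3.0 * np.eye(3))

theta, X = rayleigh_ritz(A, B, S)
residual = np.linalg.norm(A @ X - (B @ X) * theta, axis=0)
print(theta, residual.max())
```

When the subspace is only an approximation of the wanted eigenspace, the quality of the Ritz pairs depends on how that subspace is built, which is the part the contour integral takes over in the following slides.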

slide-11
SLIDE 11

Contour Integral of Resolvent

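To avoid inner/outer loops, we use a contour integral in the construction of a subspace.

For a nonzero vector $v$, let $s_k$ ($k = 0, 1, \ldots$) be defined by a contour integral of the resolvent along $\Gamma$, where $\Gamma$ is a Jordan curve that includes $\lambda_1, \ldots, \lambda_m$. Then

$\mathrm{span}(s_0, \ldots, s_{m-1}) = \mathrm{span}(u_1, \ldots, u_m)$ [S and Tadano (2007)]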
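The integral itself is missing from the extracted slide. The following is a hedged reconstruction, assuming the form commonly used for this method, with $\gamma$ a point inside $\Gamma$ and eigenvectors normalized so that $u_i^T B u_j = \delta_{ij}$. The exact notation on the original slide may differ.

```latex
s_k = \frac{1}{2\pi i} \oint_{\Gamma} (z - \gamma)^k (zB - A)^{-1} B v \, dz,
\qquad k = 0, 1, \ldots

% Expand v = \sum_j c_j u_j and use (zB - A)^{-1} B u_j = u_j / (z - \lambda_j):
s_k = \sum_{\lambda_j \text{ inside } \Gamma} (\lambda_j - \gamma)^k \, c_j \, u_j
```

By the residue theorem only the eigenvalues inside $\Gamma$ contribute to $s_k$. If the $m$ enclosed eigenvalues are distinct and the coefficients $c_j$ are nonzero, the vectors $s_0, \ldots, s_{m-1}$ are linearly independent combinations of $u_1, \ldots, u_m$, which gives the span identity stated on the slide.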

slide-12
SLIDE 12

Approximation for Contour Integral

$\Gamma$: circle with center $\gamma$ and radius $\rho$

Equidistributed points on the circle: $\omega_0, \ldots, \omega_{N-1}$

The $s_k$ are approximated by the $N$-point trapezoidal rule.
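The quadrature formula is not in the extracted text, so the sketch below states its own assumptions: nodes $\omega_q = \gamma + \rho e^{2\pi i (q + 1/2)/N}$ and the rule $\hat{s}_k = \frac{1}{N} \sum_q (\omega_q - \gamma)^{k+1} (\omega_q B - A)^{-1} B v$, which is what the trapezoidal rule gives for a circle. Because $A$, $B$ and $v$ are real, only the upper half of the nodes is solved for and the real part is doubled. The helper names `make_pencil` and `moments` and the toy eigenvalues are hypothetical.

```python
import numpy as np
from scipy.linalg import eigh

def make_pencil(eigs, seed=0):
    """Toy pencil (assumption): symmetric A, SPD B with prescribed eigenvalues."""
    rng = np.random.default_rng(seed)
    n = len(eigs)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    C = np.eye(n) + 0.1 * np.tril(rng.standard_normal((n, n)), -1)
    A = C @ (Q * eigs) @ Q.T @ C.T          # A = C (Q diag(eigs) Q^T) C^T
    return (A + A.T) / 2, C @ C.T           # B = C C^T is positive definite

def moments(A, B, v, gamma, rho, N, M):
    """Approximate s_0, ..., s_{M-1} by the N-point trapezoidal rule (N even)."""
    S = np.zeros((A.shape[0], M))
    Bv = B @ v
    for q in range(N // 2):                 # upper half circle; the rest is its conjugate
        z = rho * np.exp(2j * np.pi * (q + 0.5) / N)   # omega_q - gamma
        y = np.linalg.solve((gamma + z) * B - A, Bv)   # one shifted linear system
        for k in range(M):
            S[:, k] += (2.0 / N) * np.real(z ** (k + 1) * y)
    return S

# Four eigenvalues inside the circle, 26 well outside (toy values).
inside = np.array([-0.04, -0.01, 0.02, 0.04])
outside = np.concatenate([np.linspace(-3.0, -0.6, 13), np.linspace(0.6, 3.0, 13)])
A, B = make_pencil(np.concatenate([inside, outside]))
gamma, rho, N, M = 0.0, 0.1, 32, 2
v = np.random.default_rng(1).standard_normal(A.shape[0])
S_hat = moments(A, B, v, gamma, rho, N, M)

# Reference: exact moments from a dense eigendecomposition (B-orthonormal U).
lam, U = eigh(A, B)
sel = np.abs(lam - gamma) < rho
coeff = U[:, sel].T @ B @ v
S_ref = np.column_stack(
    [U[:, sel] @ ((lam[sel] - gamma) ** k * coeff) for k in range(M)]
)
err = np.abs(S_hat - S_ref).max()
print(err)
```

Each quadrature point costs one linear solve with a shifted matrix $\omega_q B - A$, and the solves do not depend on each other.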

slide-13
SLIDE 13

Contour Integral Rayleigh-Ritz Method

Algorithm of the CIRR (Contour Integral Rayleigh-Ritz) method:

  • Construct a subspace
  • Rayleigh-Ritz procedure
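The two stages named on this slide can be sketched end to end on a small dense problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the nodes are half-shifted points on the circle, the moments are scaled by powers of the radius, a truncated SVD removes dependent directions, and dense solves replace the sparse solvers used in the deck. The names `make_pencil` and `cirr`, the defaults, and the toy eigenvalues are all hypothetical.

```python
import numpy as np
from scipy.linalg import eigh, svd

def make_pencil(eigs, seed=0):
    """Toy pencil (assumption): symmetric A, SPD B with prescribed eigenvalues."""
    rng = np.random.default_rng(seed)
    n = len(eigs)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    C = np.eye(n) + 0.1 * np.tril(rng.standard_normal((n, n)), -1)
    A = C @ (Q * eigs) @ Q.T @ C.T
    return (A + A.T) / 2, C @ C.T

def cirr(A, B, gamma, rho, N=32, M=6, tol=1e-8, seed=1):
    """Sketch of CIRR: eigenpairs of (A, B) with |lambda - gamma| < rho."""
    n = A.shape[0]
    Bv = B @ np.random.default_rng(seed).standard_normal(n)
    S = np.zeros((n, M))
    # Stage 1: construct the subspace from M scaled moments.
    for q in range(N // 2):                          # upper half circle only
        w = np.exp(2j * np.pi * (q + 0.5) / N)       # (omega_q - gamma) / rho
        y = np.linalg.solve((gamma + rho * w) * B - A, Bv)
        for k in range(M):
            S[:, k] += (2.0 / N) * np.real(rho * w ** (k + 1) * y)
    # Drop numerically dependent directions before projecting.
    U, sv, _ = svd(S, full_matrices=False)
    Q = U[:, sv > tol * sv[0]]
    # Stage 2: Rayleigh-Ritz procedure on the projected pencil.
    theta, W = eigh(Q.T @ A @ Q, Q.T @ B @ Q)
    keep = np.abs(theta - gamma) < rho               # Ritz values inside the circle
    return theta[keep], (Q @ W)[:, keep]

inside = np.array([-0.04, -0.01, 0.02, 0.04])
outside = np.concatenate([np.linspace(-3.0, -0.6, 13), np.linspace(0.6, 3.0, 13)])
A, B = make_pencil(np.concatenate([inside, outside]))

lam_cirr, X = cirr(A, B, gamma=0.0, rho=0.1)
res = np.linalg.norm(A @ X - (B @ X) * lam_cirr, axis=0)
print(lam_cirr, res.max())
```

The toy spectrum has a wide gap between the enclosed eigenvalues and the rest, which keeps the quadrature error small; eigenvalues close to the circle would call for more quadrature points or a larger subspace.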

slide-14
SLIDE 14

Influence of Quadrature Error

[Figure: eigenvalues λ1, ..., λm, λm+1, λm+2, λm+3, λm+4 marked on the real axis, together with a circle of radius ρ]
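The formulas of this slide were lost, but the effect of the quadrature error can be illustrated numerically. Assuming half-shifted nodes $\omega_q = \gamma + \rho e^{2\pi i (q + 1/2)/N}$ (an assumption, the slide's node choice is not recoverable), the $N$-point rule multiplies the component along an eigenvector with eigenvalue $\lambda$ by $1/(1 + x^N)$ with $x = (\lambda - \gamma)/\rho$, instead of exactly 1 inside and 0 outside. The sketch checks this closed form; `filter_factor` is a hypothetical name, and the center, radius and $N$ are the values quoted for Numerical Example (1).

```python
import numpy as np

def filter_factor(lam, gamma, rho, N):
    """Weight that the N-point rule gives to the eigencomponent at lam (k = 0)."""
    theta = 2 * np.pi * (np.arange(N) + 0.5) / N
    z = rho * np.exp(1j * theta)                  # omega_q - gamma
    return np.mean(z / (gamma + z - lam)).real

gamma, rho, N = -0.22, 0.02, 16
x = np.array([0.0, 0.5, 2.0, 6.0])                # scaled distance from the center
lams = gamma + rho * x
numeric = np.array([filter_factor(l, gamma, rho, N) for l in lams])
closed = 1.0 / (1.0 + x ** N)                     # closed form for these nodes
for xi, f in zip(x, numeric):
    print(f"x = {xi:3.1f}  filter = {f:.3e}")
```

Eigenvalues just outside the circle are therefore damped rather than removed, so the constructed subspace is contaminated by nearby outside eigenvectors unless $N$ is large enough or the subspace is taken larger than the number of enclosed eigenvalues.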

slide-15
SLIDE 15

Block Method

A block variant is also obtained by using a matrix V = [v1, ..., vL] instead of a vector v.
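A minimal sketch of the block idea, assuming the same half-shifted nodes and scaled moments as a single-vector version would use: the right-hand side becomes an n-by-L matrix, so each shifted system is solved once for L columns and fewer moments are needed to reach a given subspace dimension. The names `make_pencil` and `block_moments` and the toy problem are hypothetical.

```python
import numpy as np
from scipy.linalg import eigh, svd

def make_pencil(eigs, seed=0):
    """Toy pencil (assumption): symmetric A, SPD B with prescribed eigenvalues."""
    rng = np.random.default_rng(seed)
    n = len(eigs)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    C = np.eye(n) + 0.1 * np.tril(rng.standard_normal((n, n)), -1)
    A = C @ (Q * eigs) @ Q.T @ C.T
    return (A + A.T) / 2, C @ C.T

def block_moments(A, B, V, gamma, rho, N, M):
    """Scaled block moments [S_0, ..., S_{M-1}] built from an n-by-L matrix V."""
    n, L = V.shape
    S = np.zeros((n, L * M))
    BV = B @ V
    for q in range(N // 2):                          # upper half circle only
        w = np.exp(2j * np.pi * (q + 0.5) / N)       # (omega_q - gamma) / rho
        Y = np.linalg.solve((gamma + rho * w) * B - A, BV)   # L right-hand sides
        for k in range(M):
            S[:, k * L:(k + 1) * L] += (2.0 / N) * np.real(rho * w ** (k + 1) * Y)
    return S

inside = np.array([-0.04, -0.01, 0.02, 0.04])
outside = np.concatenate([np.linspace(-3.0, -0.6, 13), np.linspace(0.6, 3.0, 13)])
A, B = make_pencil(np.concatenate([inside, outside]))
gamma, rho = 0.0, 0.1

V = np.random.default_rng(2).standard_normal((A.shape[0], 3))   # L = 3
S = block_moments(A, B, V, gamma, rho, N=32, M=2)               # 6 columns

# Numerical rank of S and distance of the wanted eigenvectors from span(S).
Us, sv, _ = svd(S, full_matrices=False)
rank = int(np.sum(sv > 1e-8 * sv[0]))
Q = Us[:, :rank]
lam, U = eigh(A, B)
U_in = U[:, np.abs(lam - gamma) < rho]
span_err = np.linalg.norm(U_in - Q @ (Q.T @ U_in)) / np.linalg.norm(U_in)
print(rank, span_err)
```

With L = 3 and two block moments the six columns already contain the four wanted eigenvectors, whereas a single vector would need at least four moments.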

slide-16
SLIDE 16

Parallel Implementation


slide-17
SLIDE 17

Parallel Implementation

[Figure: hierarchical layout with the levels Client, Server/Client, and Server/Rank0]
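The parallelism comes from the fact that the linear systems at different quadrature points are independent. The sketch below illustrates this with Python's `concurrent.futures.ThreadPoolExecutor` as a stand-in for the client/server hierarchy in the figure; the deck's actual middleware is not shown in the text, and the function names and toy matrices here are assumptions.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def solve_at_point(args):
    """Work for one quadrature point: one shifted solve with all right-hand sides."""
    A, B, BV, gamma, rho, N, q = args
    w = np.exp(2j * np.pi * (q + 0.5) / N)           # (omega_q - gamma) / rho
    Y = np.linalg.solve((gamma + rho * w) * B - A, BV)
    return (2.0 / N) * np.real(rho * w * Y)          # contribution to S_0

def subspace_serial(A, B, BV, gamma, rho, N):
    return sum(solve_at_point((A, B, BV, gamma, rho, N, q)) for q in range(N // 2))

def subspace_parallel(A, B, BV, gamma, rho, N, workers=4):
    tasks = [(A, B, BV, gamma, rho, N, q) for q in range(N // 2)]
    with ThreadPoolExecutor(max_workers=workers) as pool:   # stand-in for remote nodes
        parts = list(pool.map(solve_at_point, tasks))       # independent solves
    return sum(parts)                                       # gather step

# Toy dense pencil (assumed): symmetric A, symmetric positive definite B.
rng = np.random.default_rng(0)
n = 20
R = rng.standard_normal((n, n))
G = rng.standard_normal((n, n))
A = (R + R.T) / 2
B = np.eye(n) + 0.01 * G @ G.T
BV = B @ rng.standard_normal((n, 3))

S_ser = subspace_serial(A, B, BV, gamma=0.0, rho=0.5, N=16)
S_par = subspace_parallel(A, B, BV, gamma=0.0, rho=0.5, N=16)
print(np.abs(S_ser - S_par).max())
```

Only the matrices and right-hand sides go out and only the solution blocks come back, so communication happens once before and once after the solves. The same decomposition can be refined to one task per pair of quadrature point and right-hand-side column.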

slide-18
SLIDE 18

Flow of the Eigensolver

  • FMO-MO method: produces Matrix Data
  • Set appropriate circles
  • Compute eigenpairs using contour integration: produces Eigenpairs
  • Post processing: produces Molecular Orbitals

(Other label in the diagram: Profiles)

slide-19
SLIDE 19

Contents

Introduction

  • Background
  • Target problem & computing environment

An Eigensolver using Contour Integration

  • An algorithm
  • Numerical Properties
  • Parallel Implementation

Numerical Examples

Conclusions

slide-20
SLIDE 20

Numerical Example (1)

Test problem:

  • Model of 8 DNA base pairs
  • Matrix size: 1,980 × 1,980
  • nnz: 728,080

Test environment:

  • OS: Mac OS X 10.5
  • CPU: Core 2 Duo 2.2GHz (2GB memory)
  • Software: MATLAB 7.5
  • Solver: UMFPACK (sparse direct solver)
slide-21
SLIDE 21

Numerical Example (1)

[Figure: error and residual]

L = 12, N = 16, center = -0.22, radius = 0.02, 38 eigenvalues

slide-22
SLIDE 22

Numerical Example (1)

[Figure: error and residual]

L = 16, N = 24, center = -0.22, radius = 0.02, 38 eigenvalues

slide-23
SLIDE 23

Numerical Example (1)

[Figure: error and residual]

L = 20, N = 24, center = -0.22, radius = 0.02, 38 eigenvalues

slide-24
SLIDE 24

Numerical Example (2)

Test problem:

  • Lysozyme + H2O
  • Basis function: STO-3G
  • Size: 20,758 × 20,758
  • nnz: 20,064,444

Test environment:

  • OS: Mac OS X 10.5
  • CPU: Core 2 Duo 2.2GHz (2GB memory)
  • Compiler: icc 10.1, ifort 10.1
  • Solver: COCG method [van der Vorst and Melissen (1990)]
  • Preconditioner: Complete Factorization for Approximate Matrix [Okada, S and Teranishi (2007)]
  • Sparse Direct Solver for Preconditioner: PARDISO
slide-25
SLIDE 25

Numerical Example (2)

  • Parameters: L = 12, N = 24, center = -0.22, radius = 0.03
  • Result: 18 eigenvalues, wall-clock time 233.2 sec
  • For comparison, ARPACK + PARDISO: 316.1 sec, 20 eigenvalues, max(res) = 6.6e-6 (Xeon 3.2GHz 2MB Memory)

[Figure: residuals]

slide-26
SLIDE 26

Numerical Example (3)

Test problem:

  • EGF (Epidermal Growth Factor)
  • Basis function: 6-31G
  • Size: 43,612 × 43,612
  • nnz: 73,175,935

Test environment:

  • OS: Mac OS X 10.5
  • CPU: Core 2 Duo 2.2GHz (2GB memory)
  • Compiler: icc 10.1, ifort 10.1, MKL 10.0
  • Solver: COCG method [van der Vorst and Melissen (1990)]
  • Preconditioner: Complete Factorization for Approximate Matrix [Okada, S and Teranishi (2007)]
  • Sparse Direct Solver for Preconditioner: PARDISO
slide-27
SLIDE 27

Numerical Example (3)

  • L = 8, N = 24: 1583.1 sec [Figure: residual]
  • L = 12, N = 24: 2017.7 sec [Figure: residual]

slide-28
SLIDE 28

Numerical Example (3)

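Timing result (serial case): L = 12, N = 24, 2017.7 sec

[Figure: timeline of the serial run. The systems for ω0B − A, ω1B − A, ..., ωN/2−1B − A are processed one after another, each with a Preconditioner step and an Iteration step for v1, ..., vL (timings shown: 36.8 sec and 75.8 sec), followed by RR]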

slide-29
SLIDE 29

Numerical Example (3)

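Timing result estimation (parallel case 1): L = 12, N = 24

[Figure: after a Broadcast step, the systems for ω0B − A, ω1B − A, ..., ωN/2−1B − A with right-hand sides v1, ..., vL are distributed over CPU 1, CPU 2, ..., CPU L, followed by a Gather step; timings shown: 36.8 sec and 75.8 sec]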

slide-30
SLIDE 30

Numerical Example (3)

Timing result estimation (parallel case 2): L = 12, N = 24

[Figure: each pair of a system ω0B − A, ..., ωN/2−1B − A and a single vector v1, ..., vL is assigned to its own CPU: CPU 1, CPU 2, ..., CPU L*(N/2)]

slide-31
SLIDE 31

Summary

A Rayleigh-Ritz type method using the contour integral was proposed.

This method finds a limited number of eigenpairs in a given interval.

  • Efficient for molecular orbital computation.
  • Easy to implement for distributed computing.

Find a good preconditioner.

Application to other problems (not only the SPD case).