Member of the Helmholtz Association
Enabling large scale LAPW DFT calculations by a scalable iterative eigensolver
CSE15, Salt Lake City. March 17th
E. Di Napoli, D. Wortmann, and M. Berljafa
1. $\Phi(x_1;s_1, x_2;s_2, \ldots, x_n;s_n) =$ …
2. density of states $n(r) = \sum_a f_a\,|\varphi_a(r)|^2$
3. In the Schrödinger equation the exact Coulomb interaction is substituted …
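1. Initial guess for charge density
2. Compute discretized Kohn-Sham equations
3. Solve a set of eigenproblems $P^{(\ell)}_{k_1} \ldots P^{(\ell)}_{k_N}$
4. Compute new charge density
5. Converged? If not, repeat from step 2 with the new charge density
6. OUTPUT: Electronic structure, …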
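The cycle above can be sketched on a toy model. This is only an illustration under stated assumptions, not the FLEUR implementation: the density-dependent potential (`coupling * diag(density)`), the linear mixing, and the names `scf_cycle`, `h0_list`, `coupling`, and `mix` are all hypothetical, and the overlap matrix is taken to be the identity so that each k-point gives a standard eigenproblem.

```python
import numpy as np

def scf_cycle(h0_list, n_occ, coupling=0.1, mix=0.5, tol=1e-10, max_cycles=50):
    """Toy self-consistent loop; h0_list holds one Hermitian matrix per k-point."""
    n = h0_list[0].shape[0]
    density = np.full(n, n_occ / n)              # initial guess for the charge density
    for cycle in range(1, max_cycles + 1):
        new_density = np.zeros(n)
        for h0 in h0_list:                       # one eigenproblem per k-point
            h = h0 + coupling * np.diag(density) # toy density-dependent potential (assumption)
            _, vecs = np.linalg.eigh(h)          # eigenvalues returned in ascending order
            occupied = vecs[:, :n_occ]           # lowest n_occ eigenvectors
            new_density += np.sum(np.abs(occupied) ** 2, axis=1)
        new_density /= len(h0_list)              # average over k-points
        change = np.linalg.norm(new_density - density)
        density = (1.0 - mix) * density + mix * new_density  # simple linear mixing
        if change < tol:                         # converged?
            return density, cycle, True
    return density, max_cycles, False

# Toy input: a 1D discrete Laplacian plus a k-dependent diagonal term.
n = 8
laplacian = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
h0_list = [laplacian + s * np.diag(np.arange(n)) for s in (0.0, 0.1, 0.2)]
density, cycles, converged = scf_cycle(h0_list, n_occ=2)
```

The point of the sketch is the structure: the eigenproblems at different k-points within one cycle are independent, while consecutive cycles are coupled through the density.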
$-\frac{\hbar^2}{2m}\nabla^2 + V_0(r)$
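$$\psi_{k,\nu}(r) = \sum_{|G+k| \le G_{max}} c^{G}_{k,\nu}\,\varphi_G(k,r)$$
$$\varphi_G(k,r) = \begin{cases} e^{i(k+G)\cdot r} & \text{interstitial region} \\ \sum_{\ell,m}\left[a^{\alpha,G}_{\ell m}(k)\,u^{\alpha}_{\ell}(r) + b^{\alpha,G}_{\ell m}(k)\,\dot{u}^{\alpha}_{\ell}(r)\right] Y_{\ell m}(\hat{r}_\alpha) & \text{muffin-tin sphere } \alpha \end{cases}$$
Coefficients: $a^{\alpha,G}_{\ell m}(k)$ and $b^{\alpha,G}_{\ell m}(k)$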
1. every $P^{(\ell)}_k : A^{(\ell)}_k c_k = B^{(\ell)}_k \lambda c_k$ is a generalized eigenvalue problem;
2. A and B are dense and Hermitian (B is positive definite);
3. required: the lowest 2 to 10 % of eigenpairs;
4. momentum vector index: k runs from 1 to between 10 and 100;
5. iteration cycle index: ℓ runs from 1 to between 20 and 50.
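A minimal NumPy sketch of one such problem, assuming the usual route of reducing the generalized problem to standard form with a Cholesky factorization of B. The function name `lowest_eigenpairs` and the `fraction` parameter are hypothetical. For brevity the sketch computes the full spectrum and truncates it, which a production direct solver would avoid.

```python
import numpy as np

def lowest_eigenpairs(A, B, fraction=0.1):
    """Lowest `fraction` of eigenpairs of A c = lambda B c (A Hermitian, B Hermitian positive definite)."""
    n = A.shape[0]
    nev = max(1, int(np.ceil(fraction * n)))
    L = np.linalg.cholesky(B)                  # B = L L^H
    Y = np.linalg.solve(L, A)                  # Y = L^{-1} A
    H = np.linalg.solve(L, Y.conj().T)         # H = L^{-1} A L^{-H}, the standard-form matrix
    H = (H + H.conj().T) / 2.0                 # remove rounding-level asymmetry
    eigvals, Z = np.linalg.eigh(H)             # ascending eigenvalues
    C = np.linalg.solve(L.conj().T, Z[:, :nev])  # back-transform: c = L^{-H} z
    return eigvals[:nev], C

# Random dense test problem (illustrative data only).
rng = np.random.default_rng(0)
n = 50
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2.0                     # Hermitian
N = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = N @ N.conj().T + n * np.eye(n)             # Hermitian positive definite
lam, C = lowest_eigenpairs(A, B, fraction=0.1)
residual = np.linalg.norm(A @ C - B @ C * lam)
```

The returned eigenvectors are B-orthonormal, i.e. `C.conj().T @ B @ C` is the identity up to rounding.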
Adjacent iteration cycles
[Diagram: in two adjacent iteration cycles, the eigenproblems at k1, k2, ..., kN are each solved independently by a direct solver, yielding the eigenvalues Λ(ℓ)_k1, Λ(ℓ)_k2, ..., Λ(ℓ)_kN and the corresponding eigenvectors.]
An example
[Figure: Evolution of subspace angle for eigenvectors of k-point 1 and lowest 75 eigs (AuAg). x-axis: Iterations (2 -> 22); y-axis: Angle between eigenvectors of adjacent iterations, logarithmic scale.]
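The quantity plotted here, the angle between corresponding eigenvectors of adjacent iterations, could be computed along the following lines. This is a sketch under assumptions: the Euclidean inner product is used (the slides do not say which inner product was used), and `eigvec_angles` is a hypothetical helper name. The angle is obtained from `arctan2` of the sine and cosine parts because `arccos` alone cannot resolve angles as small as 1e-10.

```python
import numpy as np

def eigvec_angles(X_prev, X_next):
    """Angle between column i of X_prev and column i of X_next, for every i."""
    Xp = X_prev / np.linalg.norm(X_prev, axis=0)   # normalize columns
    Xn = X_next / np.linalg.norm(X_next, axis=0)
    overlap = np.sum(Xp.conj() * Xn, axis=0)       # <x_i, y_i> for each column pair
    perp = Xn - Xp * overlap                       # part of y_i orthogonal to x_i
    sin_part = np.linalg.norm(perp, axis=0)
    cos_part = np.abs(overlap)                     # insensitive to sign or phase of the vectors
    return np.arctan2(sin_part, cos_part)

t = 1e-9
X_prev = np.array([[1.0, 1.0, 1.0],
                   [0.0, 0.0, 0.0]])
X_next = np.array([[0.0, -1.0, np.cos(t)],
                   [1.0,  0.0, np.sin(t)]])
angles = eigvec_angles(X_prev, X_next)   # orthogonal pair, sign-flipped pair, slightly rotated pair
```

Small angles between adjacent cycles are what make the eigenvectors of cycle ℓ useful as input for cycle ℓ+1.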
Adjacent cycles
[Diagram: in two adjacent cycles, the eigenproblems at k1, k2, ..., kN are each solved by an iterative solver, yielding the eigenvalues Λ(ℓ)_k1, Λ(ℓ)_k2, ..., Λ(ℓ)_kN and the corresponding eigenvectors.]
Properties and algorithm evolution
Chebyshev polynomials
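A generic vector $v = \sum_{i=1}^{n} s_i x_i$ is very quickly aligned in the direction of the eigenvector corresponding to the extremal eigenvalue $\lambda_1$:
$$v^m = p_m(H)\,v = \sum_{i=1}^{n} s_i\,p_m(H)\,x_i = \sum_{i=1}^{n} s_i\,p_m(\lambda_i)\,x_i = s_1 x_1 + \sum_{i=2}^{n} s_i\,\frac{C_m\!\left(\frac{\lambda_i - c}{e}\right)}{C_m\!\left(\frac{\lambda_1 - c}{e}\right)}\,x_i \sim s_1 x_1$$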
In practice
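A Chebyshev filter is normally applied without forming the polynomial explicitly, using the three-term recurrence $C_{j+1}(t) = 2t\,C_j(t) - C_{j-1}(t)$. The sketch below assumes that $[\alpha, \beta]$ is the interval to be suppressed, with center $c = (\alpha+\beta)/2$ and half-width $e = (\beta-\alpha)/2$. The name `chebyshev_filter` is hypothetical, and the sketch is unscaled, so very high degrees could overflow where a production code would rescale.

```python
import numpy as np

def chebyshev_filter(H, V, degree, alpha, beta):
    """Return C_degree((H - c I) / e) @ V via the three-term recurrence."""
    c = (alpha + beta) / 2.0          # center of the suppressed interval
    e = (beta - alpha) / 2.0          # half-width of the suppressed interval
    if degree == 0:
        return V.copy()
    W_prev = V                        # C_0 term
    W = (H @ V - c * V) / e           # C_1 term
    for _ in range(2, degree + 1):
        W_next = 2.0 * (H @ W - c * W) / e - W_prev
        W_prev, W = W, W_next
    return W

# Diagonal toy matrix: one eigenvalue (-3) outside [0, 1], three inside or at its ends.
eigenvalues = np.array([-3.0, 0.0, 0.5, 1.0])
H = np.diag(eigenvalues)
V = np.ones((4, 1))
filtered = chebyshev_filter(H, V, degree=6, alpha=0.0, beta=1.0)
```

In this toy case the component along the eigenvalue −3 is multiplied by $C_6(-7) = 3650401$, while the components inside $[0, 1]$ keep magnitude at most 1.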
Convergence ratio and residuals
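The convergence ratio for the eigenvector $x_i$ corresponding to eigenvalue $\lambda_i \notin [\alpha,\beta]$ is defined as
$$\tau(\lambda_i) = |\rho_i|^{-1} = \min_{\pm}\left|\frac{\lambda_i - c}{e} \pm \sqrt{\left(\frac{\lambda_i - c}{e}\right)^2 - 1}\,\right|$$
where $c = (\alpha+\beta)/2$ and $e = (\beta-\alpha)/2$ are the center and half-width of $[\alpha,\beta]$.
The further away $\lambda_i$ is from the interval $[\alpha,\beta]$, the smaller $|\rho_i|^{-1}$ is and the faster the convergence to $x_i$.
$$\mathrm{Res}(v_i^{m}) \sim \left|\frac{1}{\rho_i}\right|^{m} \quad\Longrightarrow\quad \frac{\mathrm{Res}(v_i^{m})}{\mathrm{Res}(v_i^{m_0})} \sim \left|\frac{1}{\rho_i}\right|^{m - m_0}$$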
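As a numerical illustration, the convergence ratio can be evaluated directly, and a filter degree can be estimated from it if one assumes that residuals shrink roughly like $|\rho_i|^{-m}$. The helper names `convergence_ratio` and `extra_degree` are hypothetical, and the degree estimate is a sketch derived from that assumed decay, not necessarily the exact rule used by the authors' optimizer.

```python
import math

def convergence_ratio(lam, alpha, beta):
    """tau(lam) = 1/|rho| for an eigenvalue lam outside [alpha, beta]."""
    c = (alpha + beta) / 2.0
    e = (beta - alpha) / 2.0
    t = abs((lam - c) / e)
    if t <= 1.0:
        raise ValueError("lam must lie outside [alpha, beta]")
    # |t| + sqrt(t^2 - 1) and |t| - sqrt(t^2 - 1) multiply to 1, so the
    # minimum over the two signs equals 1 / (|t| + sqrt(t^2 - 1)).
    rho = t + math.sqrt(t * t - 1.0)
    return 1.0 / rho

def extra_degree(residual, tol, tau):
    """Smallest integer m with residual * tau**m <= tol, assuming geometric decay."""
    if residual <= tol:
        return 0
    return math.ceil(math.log(residual / tol) / -math.log(tau))

tau_far = convergence_ratio(-3.0, 0.0, 1.0)    # eigenvalue far from [0, 1]
tau_near = convergence_ratio(-0.1, 0.0, 1.0)   # eigenvalue close to [0, 1]
m = extra_degree(1e-2, 1e-10, tau_far)
```

Eigenvalues close to the interval have a ratio near 1 and therefore need a much higher polynomial degree for the same residual reduction.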
1. Chebyshev filter. Initial filter $W \leftarrow$ …
2. Re-orthogonalize $W = QR$ and compute the Rayleigh quotient $G = Q^\dagger H Q$.
3. Solve the reduced problem $GY = Y\Lambda$ and compute the approximate Ritz …
4. Optimizer. Compute the polynomial degrees $m_i \ge \ln$ … $\mathrm{Res}(w_i)$ …
5. Chebyshev filter. Filter $W \leftarrow$ …
6. Re-orthogonalize $W = QR$ and compute the Rayleigh quotient $G = Q^\dagger H Q$.
7. Solve the reduced problem $GY = Y\Lambda$ and compute the approximate Ritz …
8. Lock the converged vectors.
9. Store the residuals $\mathrm{Res}(w_i)$ of the unconverged vectors.
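The filter, re-orthogonalize, and Rayleigh-Ritz steps listed above can be sketched as a loop. This is a deliberately simplified version, not the authors' implementation: it uses one fixed polynomial degree for all vectors (no optimizer step), does not lock converged vectors, and takes the bounds `alpha` and `beta` of the unwanted part of the spectrum as given. The names `chebyshev_filter` and `chfsi_sketch` are hypothetical.

```python
import numpy as np

def chebyshev_filter(H, V, degree, alpha, beta):
    """Apply C_degree((H - c I)/e) to V by three-term recurrence (degree >= 1)."""
    c, e = (alpha + beta) / 2.0, (beta - alpha) / 2.0
    W_prev, W = V, (H @ V - c * V) / e
    for _ in range(2, degree + 1):
        W_prev, W = W, 2.0 * (H @ W - c * W) / e - W_prev
    return W

def chfsi_sketch(H, W, alpha, beta, degree=8, tol=1e-10, max_iter=20):
    """Simplified Chebyshev-filtered subspace iteration for the lowest eigenpairs."""
    for iteration in range(1, max_iter + 1):
        W = chebyshev_filter(H, W, degree, alpha, beta)   # amplify wanted directions
        Q, _ = np.linalg.qr(W)                            # re-orthogonalize W = QR
        G = Q.conj().T @ H @ Q                            # Rayleigh quotient
        ritz_values, Y = np.linalg.eigh(G)                # reduced problem G Y = Y Lambda
        W = Q @ Y                                         # approximate Ritz vectors
        residuals = np.linalg.norm(H @ W - W * ritz_values, axis=0)
        if np.all(residuals < tol):
            break
    return ritz_values, W, iteration

# Toy matrix with four wanted eigenvalues below the interval [0, 1].
rng = np.random.default_rng(0)
n = 40
spectrum = np.concatenate(([-2.0, -1.8, -1.6, -1.4], np.linspace(0.0, 1.0, n - 4)))
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
H = U @ np.diag(spectrum) @ U.T
H = (H + H.T) / 2.0
W0 = rng.standard_normal((n, 4))                          # random starting vectors
values, vectors, iterations = chfsi_sketch(H, W0, alpha=0.0, beta=1.0)
```

Passing approximate eigenvectors from a previous cycle as `W0` instead of random vectors is the use case the slides target, since a better starting subspace needs fewer filter applications.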
As a function of iteration cycles
Speed-up = CPU time (input random vectors) / CPU time (input approximate eigenvectors)
For the size of eigenproblems tested here, the ScaLAPACK implementation of BXINV … with ScaLAPACK is not included.
[Figure: time [s] vs. eigensystem index ℓ. Series: EleMRRR; EleChFSI, 1e-10; EleChFSI OPT, 1e-10. Panels: 64 cores and 128 cores.]
[Figure: time [s] vs. number of eigenpairs sought after (620, 1358, 2037). Panels: ℓ = 4, ℓ = 11, ℓ = 18. Series: EleMRRR; ChFSI, 1e-10; ChFSI, 1e-08.]
The role of affinity on Xeon Phi
[Figure: Time [s] vs. iteration. Series: scatter; compact; compact small pages.]
GPUs vs CPUs
[Figure: Time [s] vs. iteration. Series: 2x Intel Xeon E5-2670 (16 cores), 1x NVIDIA K20m, and 2x NVIDIA K20m, each in a standard and an optimized variant.]
multi-cores vs many-cores
[Figure: Time [s] vs. iteration. Series: 2x Intel Xeon E5-2670 (16 cores); 1x NVIDIA K20m; 2x NVIDIA K20m; 1x XeonPhi optimized (compact).]
1. Explore hybrid parallelizations of the code.
2. Implement a mixed direct-iterative solver in FLEUR.