Optimization for Electronic Structure Calculation Zaiwen Wen - - PowerPoint PPT Presentation

optimization for electronic structure calculation
SMART_READER_LITE
LIVE PREVIEW

Optimization for Electronic Structure Calculation Zaiwen Wen - - PowerPoint PPT Presentation

Optimization for Electronic Structure Calculation Zaiwen Wen Beijing International Center For Mathematical Research Peking University Thanks: Yaxiang Yuan, Xin Liu, Xiao Wang, Xin Zhang, Jinwei Zhu, Aihui Zhou, Michael Ulbrich, Chao Yang BICMR


slide-1
SLIDE 1

Optimization for Electronic Structure Calculation

Zaiwen Wen

Beijing International Center For Mathematical Research Peking University Thanks: Yaxiang Yuan, Xin Liu, Xiao Wang, Xin Zhang, Jinwei Zhu, Aihui Zhou, Michael Ulbrich, Chao Yang

BICMR - PKU, 2014

slide-2
SLIDE 2

Electronic Structure Calculation

N particle Schrodinger equation: Physics of material systems — atomic and molecular properties, almost correct (nonrelativistic) phyiscs is quantum mechanics

(a) Thanks: Hege et. al. ZIB Berlin (b) Thanks: Reinhold Schneider

Numerical simulation of material on atomic and molecular scale

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 2 / 39

slide-3
SLIDE 3

Electronic Structure Calculation

Main goal: Given atomic positions {Rα}M

α=1, compute the ground

state electron energy Ee({Rα}). Ground state electron wavefunction Ψe(r1, . . . , rN; {Rα}):  −1 2

N

  • i=1

∆i −

M

  • α=1

N

  • j=1

Zα |ri − Rα| + 1 2

N

  • i,j=1,i=j

1 |ri − rj|   Ψe = Ee({Rα})Ψe Curse of dimensionality: Computational work goes as 103N, where N is the number of electrons

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 3 / 39

slide-4
SLIDE 4

Density Functional Theory (DFT)

The unknown is simple — the electron density ρ Hohenberg-Kohn Theory (1964)

There is a unique mapping between the ground state energy from Schrödinger equation and the electron density Exact form of the functional is unknown

Independent particle model

Electrons move independently in an average effective potential field Add correction for correlation

Best compromise between efficiency and accuracy. Most widely used electronic structure theory for condensed matter systems.

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 4 / 39

slide-5
SLIDE 5

Kohn-Sham Formulation

Replace many-particle wavefunctions, Ψi, with single particle wavefunction, ψi Write Kohn-Sham total energy as EKS({ψi}) = 1 2

ne

  • i=1

|∇ψi|2 +

Vion(ρ) +1 2

ρ(r)ρ(r ′) |r − r ′| drdr ′ + Exc(ρ) ρ(r) =

ne

  • i=1

|ψi(r)|2,

ψiψj = δi,j Exchange-correlation term, Exc, contains quantum mechanical contribution, plus, part of K.E. not converged by first term when using single-particle wavefunctions

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 5 / 39

slide-6
SLIDE 6

Towards Large-scale Simulation

Thanks: Taisuke Ozaki

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 6 / 39

slide-7
SLIDE 7

Discretized Kohn-Sham Formulation

Goal: find ground state energy/density by minimizing EKS. Finite dimensional problem: min

X ∗X=I EKS(X) := Ekinetic(X) + Eion(X) + EHartree(X) + Exc(X),

where X ∈ CK×N, Ekinetic(X) =

1 2tr(X ∗LX)

Eion(X) = tr(X ∗VionX) +

  • i
  • l

|x∗

i wl|2

EHartree(X) =

1 2ρ(X)⊤L†ρ(X)

Exc(X) = e⊤ǫxc(ρ(X)), e = (1, . . . , 1)⊤ ρ(X) = diag(XX ∗) = (

N

  • j=1

|xij|2)1≤i≤K

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 7 / 39

slide-8
SLIDE 8

KKT Conditions

Lagrange function: L(X, Λ) = EKS(X) − 1

2tr(Λ(X ∗X − I))

First-order optimality conditions:

  • ∇XL(X, Λ) = 0.

X ∗X = I, = ⇒

  • H(X)X = XΛ,

X ∗X = I. Λ = X ∗H(X)X, not necessarily a diagonal matrix Kohn-Sham Hamiltonian: H(X) := 1

2L+Vion+

  • l wlw∗

l +diag

  • ℜ(L†)ρ(X) + ∂ρǫxc(ρ(X))⊤e
  • .

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 8 / 39

slide-9
SLIDE 9

Orbital Free DFT (OFDFT)

Expresses the system by only using the charge density Avoids computing N eigenpairs Pros: main group elements and nearly-free-electron-like metals Cons: not for covalently bonded and ionic systems Orbital Free total energy: EOF(ρ) = TOF(ρ) + Eext(ρ) + EH(ρ) + Exc(ρ) + EII TOF(ρ): kinetic energy density functional (KEDF)

TTFW(ρ) = CTFTTF(ρ) + µTvW(ρ), TLR(ρ) = TTF(ρ) + µTvW(ρ) + CTF

  • R3
  • R3 K(r − r ′)ρα(r)ρβ(r ′)drdr ′

Other terms are the same as KSDFT

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 9 / 39

slide-10
SLIDE 10

Orbital Free DFT (OFDFT)

Variational problem inf EOF(ρ) s.t. ρ ∈ L1(R3), ρ

1 2 ∈ H1(R3), ρ ≥ 0,

  • R3 ρ(r)dr = N.

KKT Conditions:            HOFϕ  −µ 2∆ + δ

  • TOF(ρ) − µTvW(ρ)
  • δρ

+ Veff(ρ)   ϕ = λϕ,

  • R3 ϕ2 = N,

Discretized Form: min

c∈Rn EOF(ρ(c)),

s.t. c⊤Bc = 1.

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 10 / 39

slide-11
SLIDE 11

Self Consistent Field Iteration (SCF)

SCF Algorithm:

  • 1. Find the p-smallest eigenpairs (X, Λ):

H(ρk)X = XΛ X ∗X = I

  • 2. Calculate ρout(X) = diag(XX ∗).
  • 3. ρk+1 = (1 − α)ρk + αρout.
  • 4. Increment k and go to step 1 until ρk+1 − ρk is small enough.

Our motivation: Computation for the linear eigenvalue problem can be expensive Convergence of SCF is not clear Optimization Algorithms for solving DFT directly?

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 11 / 39

slide-12
SLIDE 12

Gradient-Type Approach: Wen and Yin ’12

Consider min E(X), subject to X ⊤X = I. At iteration i X (i+1) ← Orthogonalize ← X (i) − σW (i)X (i)

  • X (i+1) ← solutionY(τ) of Y = X (i) + σ

2W (i)(X (i) + Y)

W (i) is a skew-symmetric matrix defined by W (i) = ∇E(X (i))

  • X (i)∗

− X (i) ∇E(X (i)) ∗ [Y (i)]′(0) = −W (i)X (i) = tangential part of −∇E(X (i))

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 12 / 39

slide-13
SLIDE 13

Understanding SCF: the Hessian of EKS

Convenient scaling: Es(X) := 1

2EKS(X).

Gradient: ∇Es(X) := H(X)X.

Exact Hessian

Suppose that ǫxc(ρ) is twice differentiable with respect to ρ. Given a direction S ∈ CK×N, the Hessian-direction product for Es(X) is ∇2(Es(X))[S] = H(X)S + diag

  • J

X ⊙ S + X ⊙ ¯ S)e

  • X,

where J = ℜL† + ∂2

ρ(ǫ⊤ xce).

Note: The second part corresponds to (H′(X)[S])X. Good news: the Hessian-direction product is not too expensive.

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 13 / 39

slide-14
SLIDE 14

SCF from the Viewpoint of Optimization

see also: Yang Meza, Wang ’07

The linear eigenvalue problem in each SCF iteration is equivalent to: min q(X) := 1

2 H(Xk)X, X

s.t. X ∗X = I. On the other hand, a direct calculation reveals:

1 2 H(Xk)X, X = ℜ H(Xk)Xk, X − Xk

+ 1

2ℜ H(Xk)(X − Xk), X − Xk + const.

The second part H′(Xk)[X − Xk]X is omitted in SCF . similar to Gauss-Newton methods Our Goals: Provable global convergence + fast local convergence.

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 14 / 39

slide-15
SLIDE 15

Levenberg-Marquardt Type Regularization

SCF iteration is similar to Gauss-Newton (GN) method. Regularization of SCF by Levenberg-Marquardt type approach: min mL

k(X) := 1 2 H(Xk)X, X + τk

2 X − Xk2

F

s.t. X ∗X = I, with regularization parameter τk > 0. First-order optimality conditions: The solution X = Xk+1 satisfies (H(Xk) + τkI) Xk+1 = Xk+1Λk+1 + τkXk and X ∗

k+1Xk+1 = I,

where Λk+1 = Λ∗

k+1 ∈ CN×N is a Lagrange multiplier.

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 15 / 39

slide-16
SLIDE 16

Exact Hessian + Adaptive Regularization

using the exact Hessian:

mN

k (Xk + S) := ℜ H(Xk)Xk, S + 1 2ℜ H(Xk)S, S

+ 1

2ℜ

  • S, diag(J((¯

Xk ⊙ S + Xk ⊙ ¯ S)e)Xk

  • + τk

ν Sν

F,

τk ν Sν F: trust region like strategy for ensuring global convergence.

Compute the regularized Newton step: min mN

k (X)

s.t. X ∗X = I. Cartis, Gould, Toint ’10, ’11, ’12 on cubic regularization

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 16 / 39

slide-17
SLIDE 17

Convergence Results

Assumption: The gradient ∇Es(X) = H(X)X is Lipschitz on the convex hull of the Stiefel manifold {X ; X ∗X = I}. Let Gk = ∇Es(Xk) = H(Xk)Xk and define Wk = GkX ∗

k − XkG∗ k

Global Convergence Result: Wl = 0 for some l ≥ 0

  • r

lim

k→∞ WkF = 0.

Note: WkXk = tangential part of Gk in the canonical inner product.

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 17 / 39

slide-18
SLIDE 18

Formulating the KS Equation as a Fixed Point Map

Nonlinear equations with respect to ρ as ρ = diag(X(ρ)X(ρ)T). X is determined by the eigenvalue problem: ˆ H(ρ)X = XΛ, X TX = I, the Hamiltonian matrix ˆ H(ρ) := 1 2L + Vion + Diag(L†ρ) + Diag(µxc(ρ)Te)

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 18 / 39

slide-19
SLIDE 19

Formulating the KS Equation as a Fixed Point Map

The Hamiltonian matrix H(V) := 1 2L + Vion + Diag(V) The potential V := V(ρ) = L†ρ + µxc(ρ)Te Nonlinear equations with respect to

  • V = V(Fφ(V)),

Fφ(V) = diag(X(V)X(V)T).

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 19 / 39

slide-20
SLIDE 20

The Jacobian of the Fixed Point Maps

Let {λi(V), qi(V)} be the eigenpairs of H(V): λ1(V) ≤ . . . ≤ λp(V) ≤ λp+1(V) ≤ . . . ≤ λn(V). The eigenvalue decomposition of H(V): H(V) = Q(V)Π(V)Q(V)T, The function Fφ(V) in (19) is equivalent to Fφ(V) = diag(Q(V)φ(Π(V))Q(V)T), where φ(Π) = Diag(φ(λ1(V)), φ(λ2(V)), . . . , φ(λn(V))) and φ(t) :=

  • 1

for t ≤ λp(V)+λp+1(V)

2

, for t > λp(V)+λp+1(V)

2

.

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 20 / 39

slide-21
SLIDE 21

The Jacobian of the Fixed Point Maps

Suppose that λp+1(V) > λp(V). Then the directional derivative of Fφ(V) at V is

∂VFφ(V)[z] = diag

  • Q(V)
  • gφ(Π(V)) ◦
  • Q(V)TDiag (z) Q(V)
  • Q(V)T

,

where gφ(Π(V)) ∈ Rn×n is defined as

(gφ(Π(V)))ij =     

1 λi(V)−λj(V)

if i ∈ αk, j ∈ αl, k ≤ rp(V), l > rp(V),

−1 λi(V)−λj(V)

if i ∈ αk, j ∈ αl, k > rp(V), l ≤ rp(V),

  • therwise.

The Jacobian of V(Fφ(V)) at V is ∂VV(Fφ(V))[z] = J(Fφ(V))∂VFφ(V)[z], for all z ∈ Rn.

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 21 / 39

slide-22
SLIDE 22

Convergence of the SCF iteration

SCF recursively computes: H(V i)X(V i+1) = X(V i+1)Λ(V i+1), X(V i+1)TX(V i+1) = I, and then the potential is updated as V i+1 = V(Fφ(V i)). The simple mixing scheme replaces (22) by updating V i+1 = V i − α(V i − V(Fφ(V i)))

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 22 / 39

slide-23
SLIDE 23

Convergence of the SCF iteration

Assumption: the second-order derivatives of ǫxc(ρ): ∂µxc(ρ)e2 ≤ θ, for all ρ ∈ Rn. It holds ∂VFφ(V)2 ≤ 1 δ and ∂VV(Fφ(V))2 ≤ L†2 + θ δ .

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 23 / 39

slide-24
SLIDE 24

Local Convergence of the SCF iteration

Let V ∗ be a solution of the KS equation. Suppose that the eigenvalue gap satisfies δ > −λ∗

min := − min{0, λmin(J(Fφ(V ∗)))}.

There exists an open neighborhood Ω of V ∗, such that the sequence {V i} generated by the simple mixing scheme using V 0 ∈ Ω and a step size α ∈

  • 0,

2δ ||L†||2 + θ + δ

  • converges to V ∗ with R-linear convergence rate no more than

max

  • 1 − αδ + λ∗

min

δ

  • ,
  • α||L†||2 + θ + δ

2δ − 1

  • .

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 24 / 39

slide-25
SLIDE 25

Local Convergence of the SCF iteration

The condition δ > −λ∗

min holds if

max(θ − λmin(L†), 0) < δ. when J(Fφ(V ∗)) is positive semidefinite, we have λ∗

min = 0 and the

condition δ > −λ∗

min holds.

Convergence of using Fermi-Dirac distribution: 4 β > −λ∗

min,

where λ∗

min := min{0, λmin(J(Fφ(V ∗)))}.

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 25 / 39

slide-26
SLIDE 26

Approximate Newton Methods

The Jacobian of V(Fφ(V)) at V is ∂VV(Fφ(V))[z] = J(Fφ(V))∂VFφ(V)[z], for all z ∈ Rn, where

∂VFφ(V)[z] = diag

  • Q(V)
  • gφ(Π(V)) ◦
  • Q(V)TDiag (z) Q(V)
  • Q(V)T

.

Newton method V i+1 = V i − α

  • I − J(Fφ(V i))∂VFφ(V i)

−1 V i − V

  • Fφ(V i)
  • ,

Approximate Newton method V i+1 = V i − α

  • I − Di−1

V i − V

  • Fφ(V i)
  • ,

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 26 / 39

slide-27
SLIDE 27

Approximate Newton Methods

Setting Di := τ iJ(ρ): V i+1 = V i − α

  • I − τ iJ(Fφ(V i))

−1 V i − V

  • Fφ(V i)
  • .

Setting Di = τ iL†: V i+1 = V i − α

  • I − τ iL†−1

V i − V

  • Fφ(V i)
  • .

The Kerker preconditioner (pointed out by Lin Lin) Setting Di = L†W i for a diagonal W i: V i+1 = V i − α

  • I − L†W i−1

V i − V

  • Fφ(V i)
  • .

The method of elliptic preconditioner of Lin and Chao.

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 27 / 39

slide-28
SLIDE 28

Speedup factors

Definition speedup-factor(k0, k) = wall clock time for a k0-core run wall clock time for a k-core run . T0: the calculation of the total energy E(X) and its partial derivative EX. T1: all other wall clock time in each algorithm.

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 28 / 39

slide-29
SLIDE 29

Speedup factor with respect to T0

1 2 4 8 16 1 3 5 7 9 11 number of cores parallel speedup factor w.r.t. 1 core C60 SCF OptM−WY OptM−QR 1 2 4 8 16 32 1 4 7 10 13 16 number of cores parallel speedup factor w.r.t. 1 core alanine SCF OptM−WY OptM−QR 16 32 64 128 1 3 5 7 number of cores parallel speedup factor w.r.t. 16 core 2JMO SCF OptM−WY OptM−QR 32 64 128 256 1 3 5 7 number of cores parallel speedup factor w.r.t. 32 core Fasciculin2 SCF OptM−WY OptM−QR Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 29 / 39

slide-30
SLIDE 30

Speedup factor with respect to T1

1 2 4 8 16 1 3 5 7 9 11 13 number of cores parallel speedup factor w.r.t. 1 core C60 SCF OptM−WY OptM−QR 1 2 4 8 16 32 1 5 9 13 17 21 25 number of cores parallel speedup factor w.r.t. 1 core alanine SCF OptM−WY OptM−QR 16 32 64 128 1 3 5 7 number of cores parallel speedup factor w.r.t. 16 core 2JMO SCF OptM−WY OptM−QR 32 64 128 256 1 3 5 7 number of cores parallel speedup factor w.r.t. 32 core Fasciculin2 SCF OptM−WY OptM−QR Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 30 / 39

slide-31
SLIDE 31

Ratio: T0/(T0 + T1)

1 2 4 8 16 0.2 0.4 0.6 0.8 1 number of cores ratio of T0 C60 SCF OptM−WY OptM−QR 1 2 4 8 16 32 0.2 0.4 0.6 0.8 1 number of cores ratio of T0 alanine SCF OptM−WY OptM−QR 16 32 64 128 0.2 0.4 0.6 0.8 1 number of cores ratio of T0 2JMO SCF OptM−WY OptM−QR 32 64 128 256 0.2 0.4 0.6 0.8 1 number of cores ratio of T0 Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 31 / 39

slide-32
SLIDE 32

Numerical Results on OFDFT

Figure: (a) and (c) are the contours of the ground state charge density at plane z = 0 for Al1688 and Al4631, respectively. (b) and (d) are the corresponding adaptive mesh distribution of (a) and (c), respectively.

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 32 / 39

slide-33
SLIDE 33

Numerical Results: Residual

20 40 60 80 100 10

−7

10

−6

10

−5

10

−4

10

−3

10

−2

10

−1

10 iteration residual SCF TRDCM OptM TRQ TRQH TRCH

(a) al

20 40 60 80 100 10

−8

10

−6

10

−4

10

−2

10 10

2

iteration residual SCF TRDCM OptM TRQ TRQH TRCH

(b) graphene30

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 33 / 39

slide-34
SLIDE 34

Quadratic Convergence is Observable

1 2 3 4 5 6 7 10

−7

10

−6

10

−5

10

−4

10

−3

10

−2

10

−1

10 10

1

iteration residual TRQH, 1e−10 TRCH, 1e−10 TRQH adaptive TRCH adaptive

(a) alanine

1 2 3 4 5 6 7 8 10

−10

10

−9

10

−8

10

−7

10

−6

10

−5

10

−4

10

−3

10

−2

10

−1

10 10

1

iteration residual TRQH, 1e−10 TRCH, 1e−10 TRQH adaptive TRCH adaptive

(b) c12h26

Zaiwen Wen (BICMR, PKU) Optimization for DFT BICMR, 2014 34 / 39