Ensemble Quasi-Newton HMC, Xiao-Yong Jin and James Osborn, Argonne (PowerPoint PPT Presentation)
SLIDE 1

Ensemble Quasi-Newton HMC

Xiao-Yong Jin and James Osborn Argonne National Laboratory

July 23, 2018 The 36th International Symposium on Lattice Field Theory East Lansing, MI

SLIDE 2

Reduce critical slowing down

  • Part of the US DOE-funded Exascale Computing Project (ECP)
  • Supports research in lattice QCD to prepare for exascale
  • Reducing critical slowing down, led by Norman Christ, is part of USQCD's effort in ECP
  • See Norman's slides for a list of people actively involved

SLIDE 3

Outline

  • Generate ensemble-assisted Markov chains
  • Apply quasi-Newton HMC
  • Test on 2D U(1) pure gauge theory (work in progress)

SLIDE 4

Generate multiple Markov chains

  • Can we exchange information between chains?
  • Use info from other chains
  • Extra info from itself (not explored in this talk)
  • Any advantage?

[Diagram: a Markov chain ⋯ 1 2 3 extended to ⋯ 1 2 3 0′ 1′ 2′ 3′]

SLIDE 5

[Diagram: forward updates ℱ({1,2,3}), ℱ({2,3,0′}), ℱ({3,0′,1′}), ℱ({0′,1′,2′}) extend the chain ⋯ 1 2 3 to ⋯ 1 2 3 0′ 1′ 2′ 3′; Reverse evolves from (3′,2′,1′,0′)]

Generate the next state

  • Generate the next state of each Markov chain with information from other chains: ℱ(a set of configs)
  • Detailed balance: evolve backward from (3′,2′,1′,0′)
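The windowed update pattern in the diagram can be sketched in a few lines of Python. Here `extend_chain` and the toy kernel are hypothetical stand-ins for the talk's ℱ(a set of configs); they only illustrate the data flow of proposing each new state from the trailing window of previous states, not the physics.

```python
def extend_chain(states, n_new, window, kernel):
    """Append n_new states, each generated from the last `window` states."""
    out = list(states)
    for _ in range(n_new):
        out.append(kernel(out[-window:]))
    return out

# Deterministic toy kernel, just to exercise the pattern; the four
# appended states play the role of 0', 1', 2', 3' in the diagram.
chain = extend_chain([1, 2, 3], 4, 3, lambda w: sum(w) % 7)
print(chain)  # [1, 2, 3, 6, 4, 6, 2]
```

Detailed balance would additionally require the reverse pass shown in the diagram, which this sketch does not model.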

SLIDE 6

Ensemble-assisted Markov chains: in parallel

  • Embedding Markov chains in Markov chains

[Diagram: states updated in pairs: ⋯ (0,1) (2,3) extended by ℱ({2,3}) to (0′,1′) and then (2′,3′); Reverse uses ℱ({0′,1′})]

SLIDE 7

Ensemble-assisted Markov chains: multi-state

  • Embedding Markov chains in Markov chains

[Diagram: groups of four states ⋯ (0,0′,1,1′) (2,2′,3,3′) extended by ℱ({2,2′,3,3′}) to (0′′,0′′′,1′′,1′′′) and (2′′,2′′′,3′′,3′′′); Reverse uses ℱ({0′′,0′′′,1′′,1′′′})]

SLIDE 8

What kind of information from other chains?

  • How do we generate the next state?
  • Modify MD evolution:
    “Quasi-Newton MCMC” — Zhang & Sutton (2011)
    “Ensemble precondition” — Matthews et al (2016)
    “Quasi-Newton Langevin” — Simsekli et al (2016)
    “Magnetic HMC” — Tripuraneni et al (2016)
    “Wormhole” — Lan et al (2013)
  • Modify Metropolis-Hastings:
    “Multi-try” — Liu, Liang, and Wong (2000)
  • Other techniques? Machine learning!!!

SLIDE 9

Quasi-Newton method for HMC Hamiltonian

  • BFGS approximation of the Hessian: update an old approximation to a new one
  • Approximate the Hessian from configs of the other MC streams (Nstream of them): repeatedly apply the update according to the secant condition G′s = y
  • Use the approximate Hessian for the mass matrix
  • Note: Fourier acceleration ≃ local free-field Hessian

G′ = G + (yy†)/(y†s) − (Gss†G)/(s†Gs),  with G′s = y

s = ln(U′U†)  (step),  y = ∇S(U′) − ∇S(U)  (yield)

H = S(U) + ½ p†G⁻¹p
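The BFGS update above can be checked numerically. The NumPy sketch below is a flat-space toy (a quadratic action on vectors stands in for the gauge links, so s = x′ − x instead of ln(U′U†)); it verifies that the updated matrix satisfies the secant condition by construction.

```python
import numpy as np

def bfgs_update(G, s, y):
    """One BFGS update, G' = G + (y y^T)/(y^T s) - (G s s^T G)/(s^T G s).

    By construction G' satisfies the secant condition G' s = y.
    """
    Gs = G @ s
    return G + np.outer(y, y) / (y @ s) - np.outer(Gs, Gs) / (s @ Gs)

# Toy quadratic action S(x) = x^T A x / 2 with exact Hessian A.
A = np.array([[4., 1., 0.],
              [1., 5., 2.],
              [0., 2., 6.]])
x  = np.array([1., 0., -1.])
xp = np.array([0.5, 1., 0.])
s = xp - x                      # step
y = A @ xp - A @ x              # yield: grad S(x') - grad S(x)
Gp = bfgs_update(np.eye(3), s, y)
assert np.allclose(Gp @ s, y)   # secant condition holds
```

Repeating the update over pairs of configurations from the other streams builds up the global approximate Hessian used as the mass matrix.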

SLIDE 10

Quasi-Newton method

  • Avoids the slowdown of steepest descent in narrow valleys
  • Caveats in the current study:
  • The approximated Hessian is global
  • We do not use the current location

SLIDE 11

Benefits of rank-2 update (BFGS style)

  • Factorizable matrix (Brodlie et al 1973)
  • Initializing random momenta
  • Exactly invertible
  • MD evolution
  • Computing the kinetic energy

G′ = G + ww† − zz†  →  G′ = (1 − uv†) G (1 − vu†)

G′⁻¹ = (1 − vu†/(v†u − 1)) G⁻¹ (1 − uv†/(v†u − 1))
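A small NumPy check of the factorized form and its closed-form inverse, using arbitrary deterministic u, v and a symmetric positive-definite G (illustrative only; the particular vectors and matrix are assumptions, not from the talk):

```python
import numpy as np

n = 5
I = np.eye(n)
G = np.diag([1., 2., 3., 4., 5.]) + 0.5 * np.ones((n, n))  # SPD
u = np.arange(1., n + 1)        # u = (1, 2, 3, 4, 5)
v = np.ones(n)                  # v^T u = 15, so v^T u - 1 != 0

# G' = (1 - u v^T) G (1 - v u^T): a rank-2 change of G
Gp = (I - np.outer(u, v)) @ G @ (I - np.outer(v, u))

# Exact inverse in the same factorized form
c = v @ u - 1.0
Gp_inv = (I - np.outer(v, u) / c) @ np.linalg.inv(G) @ (I - np.outer(u, v) / c)

assert np.allclose(Gp @ Gp_inv, I)   # exactly invertible, as claimed
```

The factorized form is what makes the bulleted operations cheap: momenta can be drawn through the factors, and the kinetic energy p†G⁻¹p needs only the factored inverse, never a dense solve.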

SLIDE 12

Gauge fixing of 2D U(1) lattice

  • Removes exact zero modes from the real Hessian
  • Frozen degrees of freedom take the same values
  • We choose maximal-tree gauge fixing
  • Fix two more non-gauge degrees of freedom
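One concrete maximal-tree choice can be sketched as follows. The particular tree below (row of x-links plus columns of y-links) is an assumption for illustration; the talk only states that a maximal tree is used.

```python
def maximal_tree_links(Lx, Ly):
    """Links forming a maximal tree on a periodic Lx x Ly lattice:
    all x-links on the row y = 0 except the wrap-around one, plus all
    y-links except the y-wrap-around ones.  The tree touches every
    site and contains no loop, so the fixed links carry no physics."""
    tree = [((x, 0), 'x') for x in range(Lx - 1)]
    tree += [((x, y), 'y') for x in range(Lx) for y in range(Ly - 1)]
    return tree

tree = maximal_tree_links(32, 32)
V = 32 * 32
# A maximal tree on V sites always fixes V - 1 of the 2V links.
assert len(tree) == V - 1
```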

SLIDE 13

Regulate the approximated Hessian matrix

  • Remove low modes in the approximate global Hessian
  • Add one more term to keep the rank-2 update
  • Works in practice, but not a strict bound
  • Caveat:
  • Mildly violates the secant condition G′s = y
  • Still no upper bound

G′ = G + (yy†)/(y†s) − (1 − λ s†s/(s†Gs)) (Gss†G)/(s†Gs)
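A quick NumPy check of the regulated update (toy vectors are illustrative assumptions): setting λ = 0 recovers the plain BFGS update with an exact secant condition, and for λ > 0 the violation of G′s = y is exactly λ (s†s/s†Gs) Gs, so it stays mild for small λ.

```python
import numpy as np

def regulated_bfgs(G, s, y, lam):
    """BFGS update with the extra regulator term; lam = 0 recovers the
    plain update.  The regulator keeps the update rank-2, but the
    secant condition G' s = y holds only up to O(lam)."""
    Gs = G @ s
    sGs = s @ Gs
    return (G + np.outer(y, y) / (y @ s)
              - (1.0 - lam * (s @ s) / sGs) * np.outer(Gs, Gs) / sGs)

G = np.diag([1., 2., 3., 4.])
s = np.array([1., 0.5, -0.5, 1.])
y = np.array([2., 1., -1., 3.])     # y^T s = 6 > 0

# lam = 0: exact secant condition
assert np.allclose(regulated_bfgs(G, s, y, 0.0) @ s, y)

# lam > 0: violation is exactly lam * (s^T s / s^T G s) * G s
Gs = G @ s
viol = regulated_bfgs(G, s, y, 0.1) @ s - y
assert np.allclose(viol, 0.1 * (s @ s) / (s @ Gs) * Gs)
```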

SLIDE 14

Test on 2D U(1) theory (work in progress)

  • Fixed β = 5.8, lattice size 32 × 32
  • Serial version of the ensemble Markov chain
  • Second-order Omelyan integrator (did not tune λ)
  • Look at the autocorrelation of the topological susceptibility, ⟨Q²/V⟩
  • Topological charge: Q = (1/2π) Σₓ Arg □ₓ, with Arg : ℂ → (−π, π)
  • Topological charge is an exact integer with periodic boundary conditions
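The geometric charge above is easy to reproduce numerically. The sketch below (array layout and random test angles are assumptions for illustration) computes Q from compact link angles and checks the integer property: with periodic boundaries the plaquette angles telescope to zero, so Q picks up only the 2π windings.

```python
import numpy as np

def topological_charge(theta_x, theta_y):
    """Q = (1/2pi) sum_x Arg(plaquette_x) on a periodic 2D U(1) lattice.

    theta_x, theta_y: compact link angles, arrays of shape (L, L) with
    axis 0 = x and axis 1 = y (a layout assumption for this sketch).
    """
    # plaquette angle: th_x(x) + th_y(x+xhat) - th_x(x+yhat) - th_y(x)
    plaq = (theta_x
            + np.roll(theta_y, -1, axis=0)
            - np.roll(theta_x, -1, axis=1)
            - theta_y)
    arg = np.angle(np.exp(1j * plaq))   # principal value in (-pi, pi]
    return arg.sum() / (2.0 * np.pi)

rng = np.random.default_rng(7)
L = 8
Q = topological_charge(rng.uniform(-np.pi, np.pi, (L, L)),
                       rng.uniform(-np.pi, np.pi, (L, L)))
assert abs(Q - round(Q)) < 1e-9         # exact integer with PBC
```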

SLIDE 15

Acceptance tuning

[Plot: acceptance tuning for HMC vs QNHMC with 8 streams (μ = 0.1), 8 streams (μ = 0.01), and 16 streams (μ = 0.01)]

SLIDE 16

Autocorrelation of topological susceptibility

  • Trajectory length has no effect on HMC
  • <2× for Gfix HMC (update half lattice)

SLIDE 17

Autocorrelation of topological susceptibility

  • Cost grows if we allow lower eigenmodes
  • We need more tuning

SLIDE 18

Summary & Outlook

  • We devise an algorithm creating multiple Markov chains in parallel, allowing exchange of information while generating the Markov chains
  • We modify HMC to use information from neighboring Markov chains: the BFGS-approximated Hessian serves as the mass matrix of the MD Hamiltonian, with a custom regulator for stability
  • We still need more tuning and testing (parameters / observables)
  • Ways to improve the algorithm:
  • Exploit the ensemble of Markov chains (multi-scale?)
  • Other methods for constructing the mass matrix
  • Use other information / observables to augment MD / Metropolis
  • Machine learning!