Solving quantum many-body Hamiltonians with artificial neural networks

Solving quantum many-body Hamiltonians with artificial neural networks Yusuke NOMURA Univ. of Tokyo TNSAA 2018 2018/12/03 YN, A. Darmawan, Y. Yamaji, and M. Imada, PRB 96 , 205152 (2017) G. Carleo, YN, and M. Imada, arXiv:1802.09558, to appear


slide-1
SLIDE 1

Solving quantum many-body Hamiltonians with artificial neural networks

Yusuke NOMURA

  • Univ. of Tokyo

TNSAA 2018 2018/12/03

YN, A. Darmawan, Y. Yamaji, and M. Imada, PRB 96, 205152 (2017)

  • G. Carleo, YN, and M. Imada, arXiv:1802.09558, to appear in Nature Communications

Collaborators: Andrew S. Darmawan, Youhei Yamaji, Giuseppe Carleo, Masatoshi Imada

slide-2
SLIDE 2

Artificial Neural Networks in Condensed Matter Physics

Phase classification (discriminative model)

  • J. Carrasquilla and R. G. Melko, Nat. Phys. 13, 431 (2017).
  • E. P. L. van Nieuwenburg et al., Nat. Phys. 13, 435 (2017).
  • T. Ohtsuki and T. Ohtsuki, JPSJ 85, 123706 (2016).
  • A. Tanaka and A. Tomiya, JPSJ 86, 063001 (2017).
  • T. Ohtsuki and T. Ohtsuki, JPSJ 86, 044708 (2017).
  • N. Yoshioka et al., arXiv:1709.05790.

Many-body solver (generative model)

  • G. Carleo and M. Troyer, Science 355, 602 (2017)
  • See also: H. Saito, JPSJ 86, 093001 (2017); H. Saito and M. Kato, JPSJ 87, 014001 (2018); YN et al., PRB 96, 205152 (2017)

Monte Carlo speed up (generative model)

  • L. Huang and L. Wang, PRB 95, 035105 (2017)
  • L. Wang, PRE 96, 051301 (2017)
  • (Related) G. Torlai and R. G. Melko, PRB 94, 165134 (2016)

slide-3
SLIDE 3

Restricted Boltzmann machine (RBM)

RBM parameters (figure labels):

  • Interaction W_ij (coupling between visible and hidden units)
  • Mag. field (bias term) b_j on hidden units
  • Mag. field (bias term) a_i on visible units

Paul Smolensky (1986)

  • G. E. Hinton and R. R. Salakhutdinov, Science 313, 504 (2006)

Single hidden layer + interlayer coupling only → restricted Boltzmann machine (RBM)

The marginal distribution can represent any distribution over {0,1}^N in the limit of infinite M (number of hidden units).

Quantities defined on the slide: energy function, Boltzmann distribution, marginal distribution

  • K. Hornik, Neural Networks 4, 251 (1991); G. Cybenko, Mathematics of Control, Signals and Systems 2, 303 (1989); N. Le Roux and Y. Bengio, Neural Computation 20, 1631 (2008).
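The formulas for the energy function, Boltzmann distribution and marginal distribution did not survive extraction. As a reminder, a sketch of the textbook RBM definitions is given here, assuming visible units v_i, hidden units h_j and the sign convention written below; the convention used on the original slide may differ.

```latex
\begin{align}
E(v,h) &= -\sum_i a_i v_i - \sum_{i,j} v_i W_{ij} h_j - \sum_j b_j h_j
  && \text{(energy function)} \\
p(v,h) &= \frac{e^{-E(v,h)}}{Z}, \qquad Z = \sum_{v,h} e^{-E(v,h)}
  && \text{(Boltzmann distribution)} \\
p(v) &= \sum_{h} p(v,h)
  && \text{(marginal distribution)}
\end{align}
```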
slide-4
SLIDE 4

Using artificial neural network to solve quantum many body problems

Quantum correlations among physical spins via artificial neural network

Single hidden layer + interlayer coupling only → restricted Boltzmann machine (RBM)

  • Interaction W_ij
  • Mag. field (bias term) b_j
  • Mag. field (bias term) a_i

$$\Psi(\sigma^z) = \sum_{\{h_j\}} \exp\Big( \sum_i a_i \sigma_i^z + \sum_{i,j} \sigma_i^z W_{ij} h_j + \sum_j b_j h_j \Big), \qquad \sum_{\sigma^z} |\Psi(\sigma^z)|^2 = 1$$

  • $\sigma^z = (\sigma_1^z, \sigma_2^z, \ldots, \sigma_N^z)$: real space spin config.
  • $h_j = \pm 1$: spin of hidden neuron

Variational wave function

$$\Psi(\sigma^z) = e^{\sum_i a_i \sigma_i^z} \times \prod_j 2\cosh\Big( b_j + \sum_i W_{ij} \sigma_i^z \Big)$$

  • G. Carleo and M. Troyer, Science 355, 602 (2017)
  • See also: H. Saito, JPSJ 86, 093001 (2017); H. Saito and M. Kato, arXiv:1709.05468
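The step from the sum over hidden spins to the product of cosh factors can be checked numerically. The following is a minimal sketch, not code from the talk: the function names, the system size and the random parameters are all illustrative assumptions.

```python
import itertools
import numpy as np

def rbm_amplitude_bruteforce(sigma, a, b, W):
    """Sum exp(a.sigma + sigma.W.h + b.h) over all hidden configurations h_j = +-1."""
    total = 0.0
    for h in itertools.product([-1, 1], repeat=len(b)):
        h = np.array(h)
        total += np.exp(a @ sigma + sigma @ W @ h + b @ h)
    return total

def rbm_amplitude_closed(sigma, a, b, W):
    """Same amplitude after tracing out the hidden spins analytically."""
    theta = b + sigma @ W            # effective field b_j + sum_i W_ij sigma_i
    return np.exp(a @ sigma) * np.prod(2.0 * np.cosh(theta))

# Illustrative (hypothetical) parameters: N = 4 visible spins, M = 3 hidden spins
rng = np.random.default_rng(0)
N, M = 4, 3
a = rng.normal(scale=0.1, size=N)
b = rng.normal(scale=0.1, size=M)
W = rng.normal(scale=0.1, size=(N, M))
sigma = np.array([1, -1, 1, -1])

psi_sum = rbm_amplitude_bruteforce(sigma, a, b, W)
psi_closed = rbm_amplitude_closed(sigma, a, b, W)
print(psi_sum, psi_closed)
```

The brute-force sum costs 2^M terms, while the closed form costs O(NM), which is why the single-hidden-layer restriction makes the amplitude cheap to evaluate.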

slide-5
SLIDE 5

Optimization strategy

Many-body wave function = vector with exponentially large dimension

→ extract the essential pattern using machine learning and represent it with a polynomial number of parameters

  • 1. We know the exact ψ(x)

→ exact ψ(x) = teacher (supervised learning)

  • 2. We do not know the form of ψ(x), but we can observe real-space configurations x generated according to ψ(x)

→ density estimation: estimate the underlying probability density function (unsupervised learning)

  • 3. We do not know the form of ψ(x), and we cannot observe real-space configurations x either (most challenging)

→ finding an unknown ground state

→ we teach the machine the "rule of the game" (commutation relation, Hamiltonian, measurement of energy, …)

→ optimize parameters following the variational principle = minimization of energy

※ For the moment, we will consider positive-definite wave functions → the wave function can be regarded as a probability density function
slide-6
SLIDE 6

Example: 1D Antiferromagnetic Heisenberg model (8site)

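[Figure: wave function ψ(x) versus configuration x (real and positive for any x) for the exact state, the initial RBM and the optimized RBM; RBM interaction parameter W, initial and optimized]

Gauge transformation σ^{x,y} → −σ^{x,y} for one of the sublattices

Optimization following the variational principle:

$$\text{Energy:}\quad \langle H \rangle = \frac{\langle\Psi|H|\Psi\rangle}{\langle\Psi|\Psi\rangle} \ge E_0 \qquad (E_0\text{: ground state energy})$$

$$\langle H \rangle = \sum_x p(x)\, E_{\rm loc}(x), \qquad p(x) = \frac{|\Psi(x)|^2}{\sum_x |\Psi(x)|^2}, \qquad E_{\rm loc}(x) = \sum_{x'} \langle x|H|x'\rangle \frac{\Psi(x')}{\Psi(x)}$$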
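The energy formulas on this slide can be verified on the 8-site chain by exact enumeration. The sketch below is an illustration only: the helper `heisenberg_chain`, the periodic boundary, the coupling J = 1 and the random positive trial wave function are assumptions, not details given in the talk. In a real calculation the sum over x is replaced by Monte Carlo sampling from p(x).

```python
import numpy as np

def heisenberg_chain(N, J=1.0):
    """Dense spin-1/2 Heisenberg ring after the sublattice gauge transformation,
    i.e. H = J * sum_i (Sz Sz - Sx Sx - Sy Sy) on neighbouring sites."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex) / 2
    sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2

    def site_op(op, i):
        out = np.array([[1.0 + 0j]])
        for k in range(N):
            out = np.kron(out, op if k == i else np.eye(2))
        return out

    H = np.zeros((2**N, 2**N), dtype=complex)
    for i in range(N):
        j = (i + 1) % N
        H += J * site_op(sz, i) @ site_op(sz, j)
        H -= J * site_op(sx, i) @ site_op(sx, j)
        H -= J * site_op(sy, i) @ site_op(sy, j)
    return H.real                      # all matrix elements are real

N = 8
H = heisenberg_chain(N)

# Arbitrary positive trial wave function (stands in for an unoptimized RBM)
rng = np.random.default_rng(1)
psi = rng.uniform(0.1, 1.0, size=2**N)

p = psi**2 / np.sum(psi**2)            # p(x) = |psi(x)|^2 / sum_x |psi(x)|^2
e_loc = (H @ psi) / psi                # E_loc(x) = sum_x' <x|H|x'> psi(x') / psi(x)

energy_from_eloc = np.sum(p * e_loc)
energy_rayleigh = psi @ H @ psi / (psi @ psi)
E0 = np.linalg.eigvalsh(H)[0]          # exact ground-state energy
print(energy_from_eloc, energy_rayleigh, E0)
```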

slide-7
SLIDE 7
  • G. Carleo and M. Troyer Science 355, 602 (2017)

Using artificial neural network to solve quantum many body problems

α: hidden variable density = (# hidden units)/(# physical spins)

1D Transverse-field Ising model

80 spins, periodic boundary condition h: transverse field (h=1: critical)

1D AF Heisenberg model

80 spins, periodic boundary condition

2D AF Heisenberg model

10x10 square lattice periodic boundary condition

slide-8
SLIDE 8

Properties of RBM wave function

  • 1. Complex RBM → can be applied to general wave functions
  • 2. Universal approximator

→ k nonzero complex amplitudes ψ(x) can be represented by an RBM with k hidden units

  • 3. Short-range RBM

→ can be mapped onto entangled plaquette states (EPS) → area-law entanglement entropy

$$\Psi(\sigma^z) = \prod_{p=1}^{P} C_p(\sigma_p^z)$$

  • 4. Long-range RBM

→ can be mapped onto string-bond states (= product of MPS) → volume-law entanglement entropy

$$\Psi(\sigma^z) = \prod_i \mathrm{Tr} \prod_{j \in i} A_{i,j}(\sigma_j^z), \qquad A_{i,j}(\sigma_j^z) = \begin{pmatrix} e^{b_i/N + W_{ij}\sigma_j^z} & 0 \\ 0 & e^{-b_i/N - W_{ij}\sigma_j^z} \end{pmatrix}$$

  • D.-L. Deng et al., PRX 7, 021021 (2017); J. Chen et al., PRB 97, 085104 (2018)
  • I. Glasser et al., PRX 8, 011006 (2018); S. R. Clark, J. Phys. A: Math. Theor. 51, 135301 (2018)
  • Y. Huang and J. E. Moore, arXiv:1701.06246
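The string-bond form with diagonal 2x2 matrices reproduces the RBM factor 2cosh(b_i + Σ_j W_ij σ_j) for each hidden unit i. A minimal numerical check follows, assuming the string of hidden unit i runs over all N sites; the function names and parameter values are illustrative.

```python
import numpy as np

def sbs_factor(sigma, b_i, W_i):
    """Tr prod_j A_{i,j}(sigma_j) for one hidden unit i, with diagonal matrices A."""
    N = len(sigma)
    prod = np.eye(2)
    for j in range(N):
        x = b_i / N + W_i[j] * sigma[j]
        prod = prod @ np.diag([np.exp(x), np.exp(-x)])
    return np.trace(prod)

def rbm_factor(sigma, b_i, W_i):
    """Contribution of the same hidden unit in the RBM wave function."""
    return 2.0 * np.cosh(b_i + W_i @ sigma)

rng = np.random.default_rng(3)
sigma = np.array([1, 1, -1, 1, -1, -1])
b_i = 0.3
W_i = rng.normal(scale=0.2, size=len(sigma))

lhs = sbs_factor(sigma, b_i, W_i)
rhs = rbm_factor(sigma, b_i, W_i)
print(lhs, rhs)
```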
slide-9
SLIDE 9

RBM vs diagonal SBS

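RBM wave function written as a diagonal SBS:

$$\Psi(\sigma^z) = \prod_i \mathrm{Tr} \prod_{j \in i} A_{i,j}(\sigma_j^z), \qquad A_{i,j}(\sigma_j^z) = \begin{pmatrix} e^{b_i/N + W_{ij}\sigma_j^z} & 0 \\ 0 & e^{-b_i/N - W_{ij}\sigma_j^z} \end{pmatrix}$$

  • I. Glasser et al., PRX 8, 011006 (2018)

[Figure: RBM vs diagonal SBS; couplings J and J_χ]

J = 1, J_χ = 1, square lattice, 10x10, open boundary → 98 % overlap with Laughlin state in 4x4 lattice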

slide-10
SLIDE 10

NetKet: open-source package

https://www.netket.org

slide-11
SLIDE 11

How to improve RBM wave function?

  • 1. Combine concepts from machine learning and physics

YN, A. Darmawan, Y. Yamaji, and M. Imada, PRB 96, 205152 (2017)

  • 2. Adding additional hidden layer (deep Boltzmann machine)

G. Carleo, YN, and M. Imada, arXiv:1802.09558, to appear in Nat. Commun.

(3. Extension to fermion-boson coupled Hamiltonians)

slide-12
SLIDE 12

How to improve RBM wave function?

  • 1. Combine concepts from machine learning and physics

YN, A. Darmawan, Y. Yamaji, and M. Imada, PRB 96, 205152 (2017)

  • 2. Adding additional hidden layer (deep Boltzmann machine)

G. Carleo, YN, and M. Imada, arXiv:1802.09558, to appear in Nat. Commun.

(3. Extension to fermion-boson coupled Hamiltonians)
slide-13
SLIDE 13

Restricted Boltzmann machine (RBM) wave function

  • Product-basis RBM (P-RBM) for quantum spins
  • Fermi-sea-based RBM (F-RBM) for fermions

Labels in the formula: neural-network correlation factor; product state

The neural-network correlation factor can be calculated efficiently because the neuron spins are noninteracting

  • G. Carleo and M. Troyer Science 355, 602 (2017)

YN, A. Darmawan, Y. Yamaji, and M. Imada, PRB 96, 205152 (2017)

slide-14
SLIDE 14

RBM+PP wave function ①

restricted Boltzmann machine + pair-product

combine concepts from machine learning (RBM) and physics (pair-product(PP) state)

Pair-product state (geminal wave function):

  • Product-basis RBM (P-RBM): no entanglement if the hidden layer is absent
  • RBM+PP: direct entanglement in the visible layer → helps RBM to learn the ground state

YN, A. Darmawan, Y. Yamaji, and M. Imada, PRB 96, 205152 (2017)

slide-15
SLIDE 15

RBM+PP wave function ②

restricted Boltzmann machine + pair-product

RBM+PP wave function

  • cf. many variable VMC wave function

[Figure labels: Gutzwiller, Jastrow; neural-network correlation factor with interaction W_ij, mag. field (bias term) b_j and mag. field (bias term) a_i]

  • G. Carleo and M. Troyer, Science 355, 602 (2017)
  • D. Tahara and M. Imada, JPSJ 77, 114701 (2008)

(number of visible variables) = N_site (Heisenberg), 2N_site (Hubbard)

slide-16
SLIDE 16

Ability of RBM to represent Gutzwiller-Jastrow factor

W1 W2

YN, A. Darmawan, Y. Yamaji, and M. Imada, PRB 96, 205152 (2017)

Gutzwiller factor at site i

rewrite

except for constant factor and one-body potential

RBM form

except for constant factor
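The formulas on this slide were lost in extraction. As an illustration of how a hidden spin can carry an on-site Gutzwiller factor, the sketch below checks the standard discrete Hubbard-Stratonovich identity exp(−g n↑ n↓) = ½ Σ_{h=±1} exp[λ h (n↑ − n↓) − (g/2)(n↑ + n↓)] with cosh λ = e^{g/2}. The couplings +λ and −λ play the role of two RBM weights, and the factor exp[−(g/2)(n↑ + n↓)] is a one-body potential. Whether this is exactly the rewriting shown in the talk is an assumption; the value of g is arbitrary.

```python
import itertools
import numpy as np

g = 0.7                                  # Gutzwiller parameter (illustrative value)
lam = np.arccosh(np.exp(g / 2.0))        # hidden-spin coupling, cosh(lam) = exp(g/2)

def gutzwiller(n_up, n_dn):
    """On-site Gutzwiller factor exp(-g n_up n_dn)."""
    return np.exp(-g * n_up * n_dn)

def hidden_spin_form(n_up, n_dn):
    """One-body potential times the trace over one hidden spin h = +-1."""
    one_body = np.exp(-0.5 * g * (n_up + n_dn))
    trace_h = sum(np.exp(lam * h * (n_up - n_dn)) for h in (-1, 1))
    return 0.5 * one_body * trace_h

checks = [(n_up, n_dn, gutzwiller(n_up, n_dn), hidden_spin_form(n_up, n_dn))
          for n_up, n_dn in itertools.product([0, 1], repeat=2)]
for row in checks:
    print(row)
```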

slide-17
SLIDE 17

Result for 2D Heisenberg model 2D Hubbard model

Parameters are numerically optimized following the variational principle:

→ finding the set of parameters W which minimizes the energy, using machine learning techniques

  • stochastic reconfiguration method (condensed-matter physics community)
  • natural gradient (artificial intelligence community)

  • S. Sorella, PRB 64, 024512 (2001)

S.-I. Amari, K. Kurata, and H. Nagaoka, IEEE Transactions on Neural Networks 3, 260 (1992) S.-I. Amari, Neural Comput. 10, 251 (1998).
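A single stochastic reconfiguration (natural gradient) step can be sketched as follows. This is a generic textbook form for real parameters, not the implementation used in the talk: the function name `sr_update`, the learning rate `eta`, the diagonal shift `eps` and the random input arrays are all hypothetical.

```python
import numpy as np

def sr_update(O, e_loc, eta=0.05, eps=1e-4):
    """One stochastic-reconfiguration step for real variational parameters.

    O     : (n_samples, n_params) log-derivatives d ln psi(x) / d theta_k
    e_loc : (n_samples,) local energies E_loc(x) at the sampled configurations
    Returns the parameter change, the metric S and the force vector F.
    """
    n = O.shape[0]
    O_c = O - O.mean(axis=0)                  # centred log-derivatives
    e_c = e_loc - e_loc.mean()                # centred local energies
    S = O_c.T @ O_c / n                       # covariance (metric) matrix
    F = O_c.T @ e_c / n                       # half of the energy gradient
    S_reg = S + eps * np.eye(S.shape[0])      # small diagonal shift for stability
    delta = -eta * np.linalg.solve(S_reg, F)  # natural-gradient descent direction
    return delta, S, F

# Hypothetical Monte Carlo data, only to exercise the function
rng = np.random.default_rng(2)
O = rng.normal(size=(200, 5))
e_loc = rng.normal(size=200)

delta, S, F = sr_update(O, e_loc)
# If every local energy is identical (an exact eigenstate), the update vanishes
delta_const, _, _ = sr_update(O, np.full(200, -1.3))
print(delta, delta_const)
```

Replacing S by the identity matrix recovers plain gradient descent, which is one way to see SR as gradient descent in the metric of the wave function.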

slide-18
SLIDE 18

Application to 2D antiferromagnetic Heisenberg model

8x8 square lattice with periodic boundary condition

YN, A. Darmawan, Y. Yamaji, and M. Imada, PRB 96, 205152 (2017)

Tensor network data: L. Wang et al., PRB 83, 134421 (2011); F. Mezzacapo et al., New J. Phys. 11, 083026 (2009).

[Figure legend: tensor network (EPS and PEPS for virtual bond dim. 16); simple RBM; RBM+PP (RVB); RBM+PP (RVB) + trans. sym.]

α = (# hidden units)/(# physical spins)

  • RBM+PP substantially improves accuracy compared to P-RBM
  • Improving the reference function helps RBM

slide-19
SLIDE 19

Application to 2D Hubbard model

8x8 square lattice, half-filling (periodic-antiperiodic boundary condition)

YN, A. Darmawan, Y. Yamaji, and M. Imada, PRB 96, 205152 (2017)

TNVMC data: H.-H. Zhao et al., PRB 96, 085103 (2017).

α = (# hidden units)/(# physical spins)

  • RBM+PP substantially improves accuracy compared to F-RBM
  • Improving the reference function helps RBM

slide-20
SLIDE 20

Application to 2D Hubbard model

8x8 square lattice, half-filling (periodic-antiperiodic boundary condition)

YN, A. Darmawan, Y. Yamaji, and M. Imada, PRB 96, 205152 (2017)

[Figures: RBM+PP result for spin structure factor (figure label: Heisenberg result); U dependence of RBM+PP energy (α = 32)]

RBM+PP works better at larger U, in contrast to mVMC

slide-21
SLIDE 21

Discussion

Representability of RBM

  • N(x) with real variables can represent any probability distribution over x with infinite α
  • K. Hornik, Neural Networks 4, 251 (1991); G. Cybenko, Mathematics of Control, Signals and Systems 2, 303 (1989); N. Le Roux and Y. Bengio, Neural Computation 20, 1631 (2008).

Representability of RBM+PP for ground-state wave functions

  • Heisenberg: in the bipartite Heisenberg model, a gauge transformation makes the wave function positive definite → relative error should go to zero as α goes to infinity
  • Hubbard: in the Hubbard model, the nodal structure of the wave function is important → PP wave function needs to take account of nodes

Does the introduction of a complex RBM help to improve the nodal structure?

YN, A. Darmawan, Y. Yamaji, and M. Imada, PRB 96, 205152 (2017)

slide-22
SLIDE 22

Short summary

Summary of RBM+PP

  • Outperforms the original RBM and mVMC method
  • Flexible applicability (both to bosons and fermions)
  • No negative sign

→ can be applied, e.g., to frustrated spin systems and the doped Hubbard model, where QMC is not applicable

RBM+PP: a new powerful solver for strongly correlated quantum systems

YN, A. Darmawan, Y. Yamaji, and M. Imada, PRB 96, 205152 (2017)

Perspectives

  • Application to other systems
  • Introduction of another hidden layer (deep Boltzmann machine (DBM))

Figure from X. Gao and L.-M. Duan, Nat. Commun. 8, 662 (2017).

→ next part: exact DBM constructions to represent ground states of many-body Hamiltonians

  • G. Carleo, Y. Nomura, and M. Imada, arXiv:1802.09558 (see also N. Freitas et al., arXiv:1803.02118)
slide-23
SLIDE 23

How to improve RBM wave function?

  • 1. Combine concepts from machine learning and physics

YN, A. Darmawan, Y. Yamaji, and M. Imada, PRB 96, 205152 (2017)

  • 2. Adding additional hidden layer (deep Boltzmann machine)

G. Carleo, YN, and M. Imada, arXiv:1802.09558, to appear in Nat. Commun.

(3. Extension to fermion-boson coupled Hamiltonians)
slide-24
SLIDE 24

DBM (deep Boltzmann machine) wave function

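[Figure: DBM network with visible layer σ^z_1, σ^z_2, σ^z_3, …, σ^z_N, hidden layer h_1, h_2, h_3, …, h_M and deep layer d_1, d_2, d_3, …, d_{M'}; couplings W_ij between visible and hidden layers and W'_jk between hidden and deep layers]

$$\Psi(\sigma) = \sum_{h,d} \exp\Big( \sum_i a_i \sigma_i^z + \sum_{i,j} \sigma_i^z W_{ij} h_j + \sum_j b_j h_j + \sum_{j,k} h_j W'_{jk} d_k + \sum_k b'_k d_k \Big)$$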
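The DBM amplitude can be evaluated by brute force for a tiny network, which also shows why only one of the two hidden layers can be summed analytically: after tracing out d, the remaining sum over h no longer factorizes. The sketch below uses hypothetical function names and random parameters.

```python
import itertools
import numpy as np

def dbm_bruteforce(sigma, a, b, bp, W, Wp):
    """Explicit sum over all hidden (h) and deep (d) spin configurations."""
    total = 0.0
    for h in itertools.product([-1, 1], repeat=len(b)):
        h = np.array(h)
        for d in itertools.product([-1, 1], repeat=len(bp)):
            d = np.array(d)
            total += np.exp(a @ sigma + sigma @ W @ h + b @ h
                            + h @ Wp @ d + bp @ d)
    return total

def dbm_trace_deep_only(sigma, a, b, bp, W, Wp):
    """Deep spins traced out analytically; the sum over h must stay explicit
    because the cosh factors couple different h_j to each other."""
    total = 0.0
    for h in itertools.product([-1, 1], repeat=len(b)):
        h = np.array(h)
        deep = np.prod(2.0 * np.cosh(bp + h @ Wp))
        total += np.exp(a @ sigma + sigma @ W @ h + b @ h) * deep
    return total

# Illustrative sizes: N = 3 visible, M = 3 hidden, M' = 2 deep spins
rng = np.random.default_rng(4)
N, M, Mp = 3, 3, 2
a = rng.normal(scale=0.1, size=N)
b = rng.normal(scale=0.1, size=M)
bp = rng.normal(scale=0.1, size=Mp)
W = rng.normal(scale=0.2, size=(N, M))
Wp = rng.normal(scale=0.2, size=(M, Mp))
sigma = np.array([1, -1, 1])

psi_full = dbm_bruteforce(sigma, a, b, bp, W, Wp)
psi_partial = dbm_trace_deep_only(sigma, a, b, bp, W, Wp)
print(psi_full, psi_partial)
```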

slide-25
SLIDE 25

DBM representation of ground states

Key idea

  • Reproduce imaginary-time evolution by dynamically modifying the DBM network (no need to perform stochastic optimization of parameters; everything is deterministic)

$$|\Psi(\tau)\rangle = e^{-H_1 \delta\tau/2}\, e^{-H_2 \delta\tau} \cdots e^{-H_2 \delta\tau}\, e^{-H_1 \delta\tau/2} |\Psi_0\rangle$$

  • Physical quantities are measured by MC sampling of classical visible and hidden spins

Novel class of quantum-to-classical mapping

DBM compared with RBM

  • Pros: much more flexible representability
  • Cons: cannot trace out both h and d analytically (need to sample hidden spins to obtain wave function)

  • G. Carleo, Y. Nomura, and M. Imada, arXiv:1802.09558 (see also N. Freitas et al., arXiv:1803.02118)
  • X. Gao and L.-M. Duan, Nat. Commun. 8, 662 (2017).

slide-26
SLIDE 26

Example: Transverse-Field Ising model

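Hamiltonian:

$$H = H_1 + H_2$$

Interaction (classical):

$$H_1 = \sum_{l<m} V_{lm}\, \sigma_l^z \sigma_m^z$$

Transverse-field:

$$H_2 = -\sum_l \Gamma_l\, \sigma_l^x$$

How to express short-time propagators by DBM?

Interaction propagator:

$$e^{-\delta\tau V_{lm} \sigma_l^z \sigma_m^z} |\mathrm{DBM}\rangle$$

Transverse-field propagator:

$$e^{\delta\tau \Gamma_l \sigma_l^x} |\mathrm{DBM}\rangle$$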

  • G. Carleo, YN, and M. Imada, arXiv:1802.09558
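Before mapping the two propagators onto a network, it can help to see the Trotterized imaginary-time evolution act on a plain state vector. The sketch below applies e^{−H1 δτ/2} e^{−H2 δτ} e^{−H1 δτ/2} repeatedly to a small ring and compares with exact diagonalization. The system size, the nearest-neighbour coupling V = −1, the uniform field Γ = 1 and the number of steps are illustrative assumptions, not parameters from the talk.

```python
import numpy as np

N, V, Gamma, dtau, n_steps = 4, -1.0, 1.0, 0.01, 1000
dim = 2**N
idx = np.arange(dim)

# sigma^z configurations: sigma_l = +1 if bit l of the basis index is 0, else -1
configs = np.array([[1 - 2 * ((x >> l) & 1) for l in range(N)] for x in range(dim)])

# H1 is diagonal: V * sigma^z_l sigma^z_{l+1} summed over the bonds of a ring
h1_diag = V * sum(configs[:, l] * configs[:, (l + 1) % N] for l in range(N))

def apply_sx(psi, l):
    """sigma^x_l flips spin l, i.e. permutes amplitudes between x and x ^ (1 << l)."""
    return psi[idx ^ (1 << l)]

# Dense Hamiltonian H = H1 + H2, used only for the exact reference
H = np.diag(h1_diag.astype(float))
for l in range(N):
    H[idx, idx ^ (1 << l)] -= Gamma

half_h1 = np.exp(-0.5 * dtau * h1_diag)                # e^{-H1 dtau / 2}, diagonal
c, s = np.cosh(dtau * Gamma), np.sinh(dtau * Gamma)    # e^{x sx} = cosh x + sinh x * sx

psi = np.ones(dim)                                     # uniform initial state
for _ in range(n_steps):
    psi = half_h1 * psi
    for l in range(N):
        psi = c * psi + s * apply_sx(psi, l)           # e^{dtau Gamma sigma^x_l}
    psi = half_h1 * psi
    psi /= np.linalg.norm(psi)

energy = psi @ H @ psi
E0 = np.linalg.eigvalsh(H)[0]
print(energy, E0)
```

The DBM construction in the following slides performs the same sequence of propagators, but by adding units and couplings to the network instead of updating 2^N amplitudes.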
slide-27
SLIDE 27

Interaction propagator (diagonal)

$$e^{-\delta\tau V_{lm} \sigma_l^z \sigma_m^z} |\mathrm{DBM}\rangle$$

Initial network (arbitrary), with visible spins σ^z_l and σ^z_m

slide-28
SLIDE 28

Interaction propagator (diagonal)

$$e^{-\delta\tau V_{lm} \sigma_l^z \sigma_m^z} |\mathrm{DBM}\rangle$$

Network after the propagator, with a new hidden unit h[lm] and new couplings W

RBM architecture is enough to represent classical interaction
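That one extra hidden unit suffices for the diagonal propagator can be checked directly: Σ_{h=±1} exp[h (W1 σ_l + W2 σ_m)] = 2cosh(W1 σ_l + W2 σ_m) depends only on the product σ_l σ_m when W2 = ±W1. The sketch below uses one choice of couplings that satisfies the identity, W1 = ½ arccosh(e^{2|V|δτ}), W2 = −sign(V) W1 and prefactor C = e^{−|V|δτ}/2. The slide text does not spell out the expressions used in the talk, so treat this parametrization and the helper names as assumptions.

```python
import itertools
import numpy as np

def couplings(V, dtau):
    """Weights of the new hidden unit h[lm] and the overall constant (one valid choice)."""
    W1 = 0.5 * np.arccosh(np.exp(2.0 * abs(V) * dtau))
    W2 = -np.sign(V) * W1
    C = 0.5 * np.exp(-abs(V) * dtau)
    return W1, W2, C

def propagator(V, dtau, sl, sm):
    """Diagonal matrix element of exp(-dtau V sigma_l sigma_m)."""
    return np.exp(-dtau * V * sl * sm)

def hidden_unit_form(V, dtau, sl, sm):
    """Same quantity written as a trace over one hidden spin h = +-1."""
    W1, W2, C = couplings(V, dtau)
    return C * sum(np.exp(h * (W1 * sl + W2 * sm)) for h in (-1, 1))

results = []
for V in (0.8, -0.8):                       # antiferro- and ferromagnetic bond
    for sl, sm in itertools.product([-1, 1], repeat=2):
        results.append((propagator(V, 0.01, sl, sm),
                        hidden_unit_form(V, 0.01, sl, sm)))
print(results)
```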

slide-29
SLIDE 29

Transverse-field propagator (off-diagonal)

$$e^{\delta\tau \Gamma_l \sigma_l^x} |\mathrm{DBM}\rangle$$

Initial network (arbitrary), with visible spin σ^z_l

slide-30
SLIDE 30

Transverse-field propagator (off-diagonal)

$$e^{\delta\tau \Gamma_l \sigma_l^x} |\mathrm{DBM}\rangle$$

(Intermediate step) new deep unit d[l] and new couplings W'

Deep layer makes it possible to derive an analytical expression for the quantum propagator

slide-31
SLIDE 31

Transverse-field propagator (off-diagonal)

$$e^{\delta\tau \Gamma_l \sigma_l^x} |\mathrm{DBM}\rangle$$

(Intermediate step) modify W

$$\bar{W}_{lj} = W_{lj} + \Delta W_{lj} = 0$$

slide-32
SLIDE 32

Transverse-field propagator (off-diagonal)

$$e^{\delta\tau \Gamma_l \sigma_l^x} |\mathrm{DBM}\rangle$$

New hidden unit h[l] and new W, W'; obtain the new network
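The reason an off-diagonal propagator can be absorbed into a network of classical spins is the standard quantum-to-classical identity ⟨σ'|e^{δτ Γ σ^x}|σ⟩ = A exp(K σ σ'), with K = −½ ln tanh(δτ Γ) and A = [sinh(δτ Γ) cosh(δτ Γ)]^{1/2}. The sketch below checks only this identity. How the talk distributes the resulting coupling over d[l], h[l], W and W' is not recoverable from the slide text, so no claim is made about that part.

```python
import numpy as np

Gamma, dtau = 1.0, 0.01                  # illustrative values
x = dtau * Gamma

K = -0.5 * np.log(np.tanh(x))            # classical coupling between sigma and sigma'
A = np.sqrt(np.sinh(x) * np.cosh(x))     # overall constant

# Exact single-site propagator: exp(x sx) = cosh(x) 1 + sinh(x) sx, since sx^2 = 1
sx = np.array([[0.0, 1.0], [1.0, 0.0]])
exact = np.cosh(x) * np.eye(2) + np.sinh(x) * sx

# Classical form A * exp(K sigma sigma'), rows and columns ordered as sigma = +1, -1
spins = np.array([1, -1])
classical = A * np.exp(K * np.outer(spins, spins))
print(exact)
print(classical)
```

Because K is real and positive for Γ > 0, the new coupling introduces no negative signs in this case.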

slide-33
SLIDE 33

DBM construction for Heisenberg model

[Figure: construction steps]

  • Step 0: Initial state
  • Step 1: New d and W'
  • Step 2: Modify W
  • Step 3: New h, W, W' (real)
  • Step 4: New h, W, W' (constraint)

[Figure: three DBM representations, each shown as initial network → network after time evolution, with visible spins σ^z_l, σ^z_m, deep units d[l], d[m], d[lm] and hidden units h[l], h[m], h[lm1], …, h[lm6]]

  • a) 1d3h
  • b) 2d6h
  • c) 2d4h

$$e^{-\tau J\, \vec{\sigma}_l \cdot \vec{\sigma}_m} |\mathrm{DBM}\rangle$$

  • G. Carleo, YN, and M. Imada, arXiv:1802.09558

slide-34
SLIDE 34

In this way, we can follow the imaginary-time Hamiltonian evolution exactly within the DBM framework.

(# hidden units) ∝ (system size) × (imaginary time)

slide-35
SLIDE 35

Numerical result (1)

1D Transverse-Field Ising

N = 20, Jδτ = 0.01

1D Antiferromagnetic Heisenberg

N = 80, Jδτ = 0.01

DBM reproduces exact time-evolution

from empty network from pre-optimized RBM

better initial state => faster convergence

  • G. Carleo, YN, and M. Imada, arXiv:1802.09558
slide-36
SLIDE 36

Discussion

  • G. Carleo, YN, and M. Imada, arXiv:1802.09558

DBM representation is not unique

In the Heisenberg model, we have found (at least) 3 representations with different topology:

  • 1d3h => local W, non-local W'
  • 2d6h => local W, local W' (equivalent to path-integral when h spins are traced out)
  • 2d4h => non-local W, non-local W'

Possible sign problem

  • No sign problem for bipartite spin models
  • For frustrated systems, negative signs appear (in general, W and W' become complex)
  • Do different representations give different amounts of negative signs?
  • Starting from a pre-optimized state, can we reach the ground state before negative signs become severe?

slide-37
SLIDE 37

Numerical result (2)

2D J1-J2 Antiferromagnetic Heisenberg model

N = 4x4 = 16, Jδτ = 0.001

$$|\Psi\rangle = \Big[ \prod_i \exp(-H_i \delta\tau) \Big]^{N_{\rm slice}} |\text{pair-product (RVB)}\rangle$$

[Figure: relative error (energy) versus τ for J2/J1 = 0.0 and J2/J1 = 0.4; symbols: DBM, solid curves: exact]

slide-38
SLIDE 38

Summary and Perspective

Summary

  • Showed a deterministic construction of DBM to represent ground states
  • The number of hidden units grows linearly with system size and with imaginary time
  • Additional hidden (deep) layer: "additional dimension" in statistical mechanics
  • DBM representation => new quantum-to-classical mapping

Perspective

  • Application to frustrated spin systems: how does the negative-sign rate differ among the 3 representations?
  • How to compress the network? An approximate mapping from DBM to RBM?

  • G. Carleo, Y. Nomura, and M. Imada, arXiv:1802.09558 (see also N. Freitas et al., arXiv:1803.02118)

slide-39
SLIDE 39

Future directions

  • Calculations of excited states
  • Finite temperature calculations
  • Dynamics
  • Mutual understanding between tensor and neural networks