

1.–3. [Title slide, shown three times]
Prospects of Lattice Field Theory Simulations powered by Deep Neural Networks
Julian Urban, ITP Heidelberg, 2019/11/06
"this will never work" / "this is revolutionary"

4. Overview
• Stochastic estimation of Euclidean path integrals
• Overrelaxation with Generative Adversarial Networks (GAN)*
• Ergodic sampling with Invertible Neural Networks (INN)†
• Some results for real, scalar φ⁴-theory in d = 2

* Urban, Pawlowski (2018), "Reducing Autocorrelation Times in Lattice Simulations with Generative Adversarial Networks", arXiv:1811.03533
† Albergo, Kanwar, Shanahan (2019), "Flow-based generative models for Markov chain Monte Carlo in lattice field theory", arXiv:1904.12072

5. Markov Chain Monte Carlo

⟨O(φ)⟩_{φ ∼ e^{−S(φ)}} = ∫ Dφ e^{−S(φ)} O(φ) / ∫ Dφ e^{−S(φ)} ≈ (1/N) ∑_{i=1}^{N} O(φ_i)

[Diagram: Markov chain update φ → φ′]

• accept φ′ with probability T_A(φ′ | φ) = min(1, e^{−ΔS})
• autocorrelation function: C_O(t) = ⟨O_i O_{i+t}⟩ − ⟨O_i⟩⟨O_{i+t}⟩
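To make the accept/reject step concrete, here is a minimal numpy sketch (not from the slides; `S` stands for any callable returning the total action, for instance the φ⁴ action defined on the next slide):

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis_step(phi, S, width=0.5):
    """One local Metropolis update: shift a random site by Gaussian noise
    and accept the proposal with probability min(1, exp(-dS))."""
    site = tuple(rng.integers(0, n) for n in phi.shape)
    phi_new = phi.copy()
    phi_new[site] += width * rng.normal()
    dS = S(phi_new) - S(phi)
    if dS <= 0 or rng.random() < np.exp(-dS):
        return phi_new
    return phi

def autocorrelation(obs, t):
    """C_O(t) = <O_i O_{i+t}> - <O_i><O_{i+t}> for a measurement series, t >= 1."""
    o = np.asarray(obs, dtype=float)
    return np.mean(o[:-t] * o[t:]) - np.mean(o[:-t]) * np.mean(o[t:])
```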

6. Real, Scalar φ⁴-Theory on the Lattice
• φ(x) ∈ ℝ discretized on a d-cubic Euclidean lattice with volume V = L^d and periodic boundary conditions

S = ∑_x [ −2κ ∑_{μ=1}^{d} φ(x) φ(x + μ̂) + (1 − 2λ) φ(x)² + λ φ(x)⁴ ]

• magnetization M = (1/V) ∑_x φ(x)
• connected susceptibility χ₂ = V (⟨M²⟩ − ⟨M⟩²)
• connected two-point correlation function G(x, y) = ⟨φ(x) φ(y)⟩ − ⟨φ(x)⟩⟨φ(y)⟩
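A direct transcription of the action and observables above into numpy (a sketch; `np.roll` supplies the periodic boundary conditions, and `action` can serve as the `S` callable in the Metropolis sketch on the previous slide):

```python
import numpy as np

def action(phi, kappa, lam):
    """S = sum_x [ -2k * sum_mu phi(x) phi(x+mu) + (1-2l) phi(x)^2 + l phi(x)^4 ]"""
    hopping = sum(phi * np.roll(phi, -1, axis=mu) for mu in range(phi.ndim))
    return float(np.sum(-2.0 * kappa * hopping
                        + (1.0 - 2.0 * lam) * phi**2
                        + lam * phi**4))

def magnetization(phi):
    """M = (1/V) sum_x phi(x)"""
    return float(phi.mean())

def susceptibility(mag_series, volume):
    """chi_2 = V * (<M^2> - <M>^2), estimated from a series of measurements."""
    m = np.asarray(mag_series, dtype=float)
    return volume * (np.mean(m**2) - np.mean(m)**2)
```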

7. Real, Scalar φ⁴-Theory on the Lattice
d = 2
[Figure: phase diagram in the (κ, λ) plane, color-coded by ⟨|M|⟩]

8. Real, Scalar φ⁴-Theory on the Lattice
d = 2, V = 8², λ = 0.02
[Figure: connected susceptibility χ₂ versus κ, with an inset showing ⟨|M|⟩ versus κ]

9. Independent (Black-Box) Sampling
Replace p(φ) by an approximate distribution q(φ) generated from a function g: ℝ^V → ℝ^V, χ ↦ φ, where the components of χ are i.i.d. random variables (commonly N(0, 1)); a one-line sketch follows below.
Theoretical / computational requirements:
• ergodic in p(φ):
  – p(φ) ≠ 0 ⇒ q(φ) ≠ 0
• sufficient overlap between q and p for practical use on human timescales
• balanced and asymptotically exact:
  – statistical selection or weighting procedure for asymptotically unbiased estimation, similar to an accept/reject correction
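In code, such a black-box sampler is just a push-forward of Gaussian noise (a trivial sketch; `g` is any generator mapping latent vectors to field configurations):

```python
import numpy as np

def draw_independent(g, lattice_shape, rng=np.random.default_rng()):
    """Independent proposal phi = g(chi): push i.i.d. N(0,1) latent noise
    through the generator g; q(phi) is the push-forward of the normal
    distribution under g."""
    chi = rng.normal(size=lattice_shape)
    return g(chi)
```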

10. Overrelaxation
[Diagram: overrelaxation step φ → φ′ with S(φ′) = S(φ), interleaved with n normal MC steps]
T_A(φ′ | φ) = 1 for ΔS = 0
• sampling on hypersurfaces of constant S
• ergodicity through normal MC steps
• requirements:
  – ability to reproduce all possible S
  – symmetric a priori selection probability

11. Generative Adversarial Networks
[Diagram: random numbers → Generator → fake samples; fake and real samples → Discriminator → loss]
• overrelaxation step: find χ such that S[g(χ)] = S[φ]
• iterative gradient descent solution of χ′ = argmin_χ |S[g(χ)] − S[φ]| (sketched in code below)
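A PyTorch sketch of this latent-space search (illustrative only; `g` is a trained, frozen generator with a hypothetical `latent_dim` attribute, and `action` must be written in torch so gradients can flow through it):

```python
import torch

def latent_overrelaxation(g, action, phi, n_steps=200, lr=1e-2):
    """Solve chi' = argmin_chi |S[g(chi)] - S[phi]| by gradient descent,
    so the proposed configuration g(chi') has (approximately) the same
    action as phi."""
    target = action(phi).detach()
    chi = torch.randn(g.latent_dim, requires_grad=True)
    opt = torch.optim.Adam([chi], lr=lr)
    for _ in range(n_steps):
        opt.zero_grad()
        loss = (action(g(chi)) - target).abs()
        loss.backward()
        opt.step()
    return g(chi).detach()
```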

12. Sample Examples
d = 2, V = 32², κ = 0.21, λ = 0.022
[Figure: one real (HMC) configuration and three generated configurations, shown as 2D heatmaps of φ on a common color scale from −2 to 2]

13. Magnetization & Action Distributions
[Figure: histograms of the magnetization M (top) and the action S (bottom), comparing HMC with GAN and HMC + GAN sampling]

14. Reduced Autocorrelations
[Figure: magnetization autocorrelation function C_M(t), comparing local HMC with runs at n_H = 1, 2, 3]

15. Problems with this Approach
• GAN:
  – relies on the existence of an exhaustive dataset
  – no direct access to sample probability
  – adversarial learning complicates quantitative error assessment
  – convergence/stability issues such as mode collapse
• Overrelaxation:
  – still relies on traditional MC algorithms
  – symmetry of the selection probability
  – little effect on autocorrelations of observables coupled to S
  – latent space search is computationally rather demanding

16. Proper Reweighting to Model Distribution

⟨O⟩_{φ ∼ p(φ)} = ∫ Dφ p(φ) O(φ) = ∫ Dφ q(φ) [p(φ)/q(φ)] O(φ) = ⟨ [p(φ)/q(φ)] O(φ) ⟩_{φ ∼ q(φ)}

Generate q(φ) through a parametrizable, invertible function g(χ | ω) with tractable Jacobian determinant:

q(φ) = r(χ(φ)) |det ∂g⁻¹(φ)/∂φ|

Optimal choice for q(φ) ⟷ minimal relative entropy / Kullback-Leibler divergence:

D_KL(q ∥ p) = −∫ Dφ q(φ) log[p(φ)/q(φ)] = −⟨ log[p(φ)/q(φ)] ⟩_{φ ∼ q(φ)}
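The reweighted estimator and the reverse-KL training loss in torch (a sketch; note that p(φ) = e^{−S(φ)}/Z is only known up to the partition function Z, so the weights are self-normalized and the loss is computed up to an irrelevant constant log Z):

```python
import torch

def reweighted_mean(obs, log_p, log_q):
    """<O>_p from q-samples: self-normalized importance weights w_i
    proportional to p/q; the normalization cancels the unknown Z."""
    w = torch.softmax(log_p - log_q, dim=0)
    return torch.sum(w * obs)

def kl_loss(log_p, log_q):
    """D_KL(q||p) = -<log(p/q)>_q up to log Z, which is constant in the
    flow parameters. Here log_p = -S(phi) and log_q comes from the flow's
    change-of-variables formula."""
    return torch.mean(log_q - log_p)
```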

17. INN / Real NVP Flow
Ardizzone, Klessen, Köthe, Kruse, Maier-Hein, Pellegrini, Rahner, Rother, Wirkert (2018), "Analyzing Inverse Problems with Invertible Neural Networks", arXiv:1808.04730
Ardizzone, Köthe, Kruse, Lüth, Rother, Wirkert (2019), "Guided Image Generation with Conditional Invertible Neural Networks", arXiv:1907.02392
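The building block of such flows is the affine coupling layer; below is a generic real NVP-style sketch in PyTorch (not the specific architecture of the papers above): half of the variables are transformed element-wise, conditioned on the other half, which keeps both the inverse and the Jacobian determinant cheap.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """y1 = x1;  y2 = x2 * exp(s(x1)) + t(x1).  The Jacobian is triangular,
    so log|det| is just the sum of the scale outputs s.  `dim` (the number
    of lattice sites) is assumed even."""
    def __init__(self, dim, hidden=512):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim // 2, hidden), nn.ReLU(),
                                 nn.Linear(hidden, dim))  # outputs (s, t)

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=-1)
        s, t = self.net(x1).chunk(2, dim=-1)
        return torch.cat([x1, x2 * torch.exp(s) + t], dim=-1), s.sum(dim=-1)

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=-1)
        s, t = self.net(y1).chunk(2, dim=-1)
        return torch.cat([y1, (y2 - t) * torch.exp(-s)], dim=-1), -s.sum(dim=-1)
```

Stacking several such blocks, with the roles of the two halves alternating, yields an expressive yet exactly invertible map g with a tractable log q(φ).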

18. Advantages of this Approach
• learning is completely data-independent
• improved error metrics:
  – Metropolis-Hastings acceptance rate (see the sketch below)
  – convergence properties of D_KL
• ergodicity & balance + asymptotic exactness satisfied a priori
• no latent space deformation required
Objective: maximization of overlap between q(φ) and p(φ).
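A sketch of the resulting exact sampling scheme, following the independence Metropolis-Hastings construction of Albergo et al. (arXiv:1904.12072, cited on the Overview slide; function names here are illustrative): proposals are drawn independently from the flow and accepted with probability min(1, [p(φ′)/q(φ′)] / [p(φ)/q(φ)]), which enforces balance and asymptotic exactness even for an imperfect q.

```python
import torch

def flow_mh_chain(sample_q, log_q, log_p, n_steps):
    """Independence Metropolis-Hastings with flow proposals.  Only the
    unnormalized log_p = -S is needed: Z cancels in the acceptance ratio."""
    phi = sample_q()
    log_w = log_p(phi) - log_q(phi)
    chain = [phi]
    for _ in range(n_steps - 1):
        cand = sample_q()
        log_w_cand = log_p(cand) - log_q(cand)
        if torch.rand(()) < torch.exp(torch.clamp(log_w_cand - log_w, max=0.0)):
            phi, log_w = cand, log_w_cand
        chain.append(phi)
    return chain
```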

19. Comparison with HMC Results
d = 2, V = 8², λ = 0.02; INN: 8 layers, 4 hidden layers, 512 neurons / layer
[Figure: connected susceptibility χ₂ versus κ (inset: ⟨|M|⟩), comparing HMC with bare, reweighted, and Metropolis-corrected INN sampling]

20. Comparison with HMC Results
κ = 0.2
[Figure: connected two-point correlation function G(s) versus separation s, comparing HMC with bare, reweighted, and Metropolis-corrected INN sampling]

21. Potential Applications & Future Work
• accelerated simulations of physically interesting theories (QCD, Yukawa, Gauge-Higgs, condensed matter)
• additional conditioning (cINN) to encode arbitrary couplings κ, λ
• tackling sign problems with generalized thimble / path optimization approaches via latent space disentanglement
• efficient minimization of D_KL in terms of the ground state energy of an interacting hybrid classical-quantum system

22. Challenges & Problems
• scalability to higher dimensions / larger volumes / more d.o.f. (e.g. QCD: ∼10⁹ floats per configuration):
  – multi-GPU parallelization
  – progressive growing to successively larger volumes
• architectures that intrinsically respect symmetries and topological properties of the theory:
  – gauge symmetry / equivariance
• critical slowing down

23. Thank you!
