Machine learning for lattice theories

  1. Machine learning for lattice theories. Michael S. Albergo, Gurtej Kanwar, Phiala E. Shanahan, Center for Theoretical Physics, MIT. Albergo, GK, Shanahan [PRD 100 (2019) 034515]. Deep Learning and Physics, Kyoto, Japan (November 1, 2019)

  2. Machine learning for lattice theories: Real-world lattices

  3. Machine learning for lattice theories: Real-world lattices → Quantum field theories

  4. Lattices in the real world ● Many materials have degrees of freedom pinned to a lattice structure [Ella Maru Studio] [Mazurenko et al. 1612.08436]

  5. Lattices in the real world ● Thermodynamics describes the collective behavior of many degrees of freedom ● At some temperature T, microstates s follow the Boltzmann distribution p(s) ∝ e^{-E(s)/T}, normalized by the partition function Z

  6. Lattices in the real world ● Thermodynamics describes the collective behavior of many degrees of freedom ● At some temperature T, microstates s follow the Boltzmann distribution p(s) ∝ e^{-E(s)/T} ● The Ising model has a spin s = {↑,↓} per site, with an energy penalty for neighboring spins that differ. Typical microstates have patches of the same spin at some scale. ["Ising Model and Metropolis Algorithm", MathWorks Physics Team]
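
As a concrete illustration (my addition, not part of the slides), a minimal sketch of the Ising energy and its unnormalized Boltzmann weight; the coupling J, temperature T (with k_B = 1), and lattice size are illustrative choices:

    import numpy as np

    def ising_energy(spins, J=1.0):
        """Nearest-neighbor Ising energy E = -J * sum_<ij> s_i s_j on a periodic 2D lattice."""
        # Count each bond once: interactions with the right and down neighbors
        return -J * np.sum(spins * np.roll(spins, 1, axis=0)
                           + spins * np.roll(spins, 1, axis=1))

    def boltzmann_weight(spins, T=2.0):
        """Unnormalized Boltzmann weight exp(-E(s)/T)."""
        return np.exp(-ising_energy(spins) / T)

    spins = np.random.choice([-1, 1], size=(8, 8))   # a random 8x8 configuration
    print(ising_energy(spins), boltzmann_weight(spins))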

  7. Lattices in the real world ● Derive thermodynamic observables by averaging over microstates weighted by the Boltzmann distribution: the partition function, total energy, Helmholtz free energy, correlation functions, ...

  8. Lattices for quantum field theories ● Quantum-mechanical properties are also computed as statistical expectation values, via a path integral similar in form to a partition function
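
For reference (standard lattice field theory notation; the slide shows this only as an image), the Euclidean expectation value of an observable O takes the form

    \[
      \langle \mathcal{O} \rangle \;=\; \frac{1}{Z} \int \mathcal{D}\varrho \;
      \mathcal{O}(\varrho)\, e^{-S(\varrho)},
      \qquad
      Z = \int \mathcal{D}\varrho \; e^{-S(\varrho)},
    \]

in direct analogy with the thermodynamic average ⟨O⟩ = (1/Z) Σ_s O(s) e^{-E(s)/T}.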

  9. Lattice Quantum Chromodynamics ● Predictions are relevant to interpreting upcoming high-energy experiments ○ The Electron-Ion Collider will investigate detailed nuclear structure [bnl.gov/eic] ○ The Deep Underground Neutrino Experiment requires nuclear cross sections with neutrinos [dunescience.org] (so far! see Hong-Ye's talk for holography ideas) ● Pen-and-paper methods fail; numerical evaluation of the path integral is required [D. Leinweber, Visual QCD Archive]

  10. Computational approach to lattice theories ● Partition functions and path integrals are typically intractable analytically ● Numerical approximation by Monte Carlo sampling: sample configurations according to p(𝜚), then estimate observables as averages over the samples ● Markov Chain Monte Carlo produces a chain of configurations that converges to samples approximately distributed ~ p(𝜚)
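
As an illustration of such a local-update Markov chain (my sketch, not code from the talk), a single-spin-flip Metropolis sweep for the Ising example above:

    import numpy as np

    def metropolis_sweep(spins, T=2.0, J=1.0, rng=np.random.default_rng()):
        """One sweep of single-spin-flip Metropolis updates on a periodic 2D Ising lattice."""
        L = spins.shape[0]
        for _ in range(L * L):
            i, j = rng.integers(L), rng.integers(L)
            # Energy change from flipping spin (i, j) depends only on its four neighbors
            nbrs = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
                    + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
            dE = 2.0 * J * spins[i, j] * nbrs
            # Accept the flip with probability min(1, exp(-dE / T))
            if dE <= 0 or rng.random() < np.exp(-dE / T):
                spins[i, j] *= -1
        return spins

    spins = np.random.choice([-1, 1], size=(16, 16))
    for sweep in range(100):              # burn-in and sampling loop
        spins = metropolis_sweep(spins)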

  11. Machine learning for lattice theories: Real-world lattices → Quantum field theories → lattice theories

  12. Machine learning for lattice theories: Real-world lattices → Quantum field theories → lattice theories → numerical methods – Thermodynamics – Collective phenomena – Spectrum – ...

  13. Machine learning for lattice theories: Real-world lattices → Quantum field theories → lattice theories → numerical methods (✘ hard to reach the continuum limit / critical point in some theories) – Thermodynamics – Collective phenomena – Spectrum – ...

  14. Machine learning for lattice theories: Real-world lattices → Quantum field theories → lattice theories → numerical methods + ML (✘ hard to reach the continuum limit / critical point in some theories) – Thermodynamics – Collective phenomena – Spectrum – ...

  15. Machine learning for lattice theories: Real-world lattices → Quantum field theories → lattice theories → numerical methods + ML – Thermodynamics – Collective phenomena – Spectrum – ... Talk outline: 1 Critical slowing down, 2 Sampling using ML, 3 Toy model results

  16. Difficulties with Markov Chain Monte Carlo ● Need to wait for a "burn-in period" before the chain produces samples ~ p(𝜚) ● Configurations close to each other on the chain are correlated, so many steps must be taken before drawing independent samples ● Burn-in and correlations are both related to the Markov chain "autocorrelation time" → a smaller autocorrelation time means less computational cost! Typically quantified with the integrated autocorrelation time, in the standard definition τ_int = 1/2 + Σ_{t≥1} ρ(t), where ρ(t) is the normalized autocorrelation of an observable at chain separation t
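
A minimal sketch (not from the slides) of estimating the integrated autocorrelation time from a Monte Carlo time series of some observable, with a simple cutoff once the autocorrelation drops to zero:

    import numpy as np

    def integrated_autocorr_time(series, max_lag=200):
        """Estimate tau_int = 1/2 + sum_t rho(t) for a 1D Monte Carlo time series."""
        x = np.asarray(series, dtype=float)
        x = x - x.mean()
        var = np.mean(x * x)
        tau = 0.5
        for t in range(1, min(max_lag, len(x) - 1)):
            rho = np.mean(x[:-t] * x[t:]) / var   # normalized autocorrelation at lag t
            if rho <= 0:                          # crude window: stop once correlations vanish
                break
            tau += rho
        return tau

    # tau = integrated_autocorr_time(magnetization_history)  # hypothetical observable series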

  17. Critical slowing down ● As the parameters defining the distribution approach criticality (the continuum limit), the autocorrelation time diverges for Markov chains using local updates ● Fitting τ_int to power-law behavior (e.g. τ_int ~ ξ^z, with ξ the correlation length) gives dynamical critical exponents z ● A smaller dynamical critical exponent = a cheaper, closer approach to criticality

  18. Critical slowing down ● As the parameters defining the distribution approach criticality (the continuum limit), the autocorrelation time diverges for Markov chains using local updates ● Fitting τ_int to power-law behavior gives dynamical critical exponents ● A smaller dynamical critical exponent = a cheaper, closer approach to criticality ● CSD appears in the scalar theory used in this work, and also affects more realistic, complex models: ○ CP^{N-1} [Flynn et al. 1504.06292] ○ O(N) [Frick et al. PRL 63, 2613] ○ QCD [ALPHA collaboration 1009.5228] ○ ...

  19. Machine learning for lattice theories: Real-world lattices → Quantum field theories → lattice theories → numerical methods + ML – Thermodynamics – Collective phenomena – Spectrum – ... Talk outline: 1 Critical slowing down, 2 Sampling using ML, 3 Toy model results

  20. Sampling lattice configs [figure: example lattice configurations labeled likely (log prob = 22), likely (log prob = 25), likely (log prob = 5), and unlikely (log prob = -6107)]

  21. Sampling lattice configs ≅ generating images [figure: example generated images labeled likely / unlikely; Karras, Laine, Aila / NVIDIA 1812.04948]

  22. Unique features of the lattice sampling problem ✓ Probability density computable (up to normalization) ✓ Many symmetries in physics ○ Lattice symmetries like translation, rotation, and reflection ○ Per-site symmetries like negation ✘ High-dimensional (10^9 to 10^12 variables) samples ✘ Few (~1000) samples available ahead of time (fewer than the number of variables!) ○ Hard to use training paradigms that rely on existing samples from the distribution

  23. Image generation via ML 1. Likelihood-free methods, e.g. Generative Adversarial Networks (GANs) [Goodfellow et al. 1406.2661] ✘ Needs many real samples ✘ No associated likelihood for each produced sample 2. Autoencoding, e.g. Variational Auto-Encoders (VAEs) [Kingma & Welling 1312.6114] [Shen & Liu 1612.05363] ✔ Good for human interpretability ✘ Same issues as GANs 3. Normalizing flows [Rezende & Mohamed 1505.05770]: flow-based models learn a change of variables that transforms a known distribution into the desired distribution ✔ Exactly known likelihood for each sample ✔ Can be trained with samples from itself


  25. Many related approaches ● Continuous flows [Zhang, E, Wang 1809.10188] ● Normalizing flows for many-body systems [Noé, Olsson, Köhler, Wu, Science 365 (2019) Iss. 6457, 982] ● Hamiltonian transforms [Li, Dong, Zhang, Wang 1910.00024] ● Self-Learning Monte Carlo [Liu, Qi, Meng, Fu 1610.03137] ● See talks by Junwei Liu, Lei Wang, and Hong-Ye Hu

  26. Flow-based generative models: Using a change of variables, produce a distribution approximating the one you want. [Rezende & Mohamed 1505.05770]

  27. Flow-based generative models: Using a change of variables, produce a distribution approximating the one you want. [Rezende & Mohamed 1505.05770] [diagram: an easily sampled prior distribution is mapped through an invertible function with a tractable Jacobian to an output distribution that approximates the desired distribution]
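
Spelled out in standard normalizing-flow notation (my addition; the slide shows this as a diagram): if z is drawn from an easily sampled prior r(z) and 𝜚 = f(z) with f invertible, the model density is

    \[
      \tilde{p}_f(\varrho) \;=\; r(z)\,
      \left| \det \frac{\partial f(z)}{\partial z} \right|^{-1},
      \qquad \varrho = f(z),
    \]

so a tractable Jacobian determinant gives an exactly known model likelihood for every sample.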

  28. Flow-based generative models: We chose real non-volume-preserving (real NVP) flows for our work. [Dinh et al. 1605.08803] Many simple layers, each invertible with a tractable Jacobian, are composed to produce f, mapping the easily sampled prior to an approximation of the desired distribution.


  30. Real NVP coupling layer ● Application of g_i^{-1}: 1. Freeze half of the inputs, z_a 2. Feed the frozen variables into neural networks s and t 3. Apply the scale exp(-s) and offset -t to the unfrozen half, z_b ● Simple inverse and Jacobian
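
A minimal PyTorch sketch of one such affine coupling layer (my illustration; the even split of variables, the tiny fully-connected s and t networks, and the layer sizes are assumptions, not the architecture used in the paper):

    import torch
    import torch.nn as nn

    class AffineCoupling(nn.Module):
        """One real NVP coupling layer acting on flattened configurations of size 2*n."""
        def __init__(self, n, hidden=64):
            super().__init__()
            # Arbitrary neural nets s and t, conditioned only on the frozen half z_a
            self.s = nn.Sequential(nn.Linear(n, hidden), nn.Tanh(), nn.Linear(hidden, n))
            self.t = nn.Sequential(nn.Linear(n, hidden), nn.Tanh(), nn.Linear(hidden, n))

        def forward(self, z):
            """Scale by exp(-s) and offset by -t on the unfrozen half; return output and log|det J|."""
            z_a, z_b = z.chunk(2, dim=-1)        # 1. freeze half of the inputs
            s, t = self.s(z_a), self.t(z_a)      # 2. feed frozen vars into s and t
            x_b = (z_b - t) * torch.exp(-s)      # 3. affine transform of the unfrozen half
            log_det = -s.sum(dim=-1)             # triangular Jacobian: log|det| = -sum(s)
            return torch.cat([z_a, x_b], dim=-1), log_det

        def inverse(self, x):
            """Exact inverse: the frozen half is unchanged, so s and t can be recomputed."""
            x_a, x_b = x.chunk(2, dim=-1)
            s, t = self.s(x_a), self.t(x_a)
            z_b = x_b * torch.exp(s) + t
            return torch.cat([x_a, z_b], dim=-1), s.sum(dim=-1)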

  31. Loss function ● Use the known target probability density p(𝜚) = e^{-S(𝜚)}/Z (known up to the normalization Z) ● For our application, train to minimize a shifted KL divergence; the shift removes the unknown normalization Z ● Can apply self-training: sample the model distribution p̃_f(𝜚) to estimate the loss
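
Written out (a standard form of this loss; I am assuming it matches the convention on the slide):

    \[
      L(\tilde{p}_f) \;=\; D_{\mathrm{KL}}\!\left(\tilde{p}_f \,\|\, p\right) - \log Z
      \;=\; \int d\varrho \; \tilde{p}_f(\varrho)
            \left[ \log \tilde{p}_f(\varrho) + S(\varrho) \right],
    \]

which depends only on the model likelihood and the action S, not on Z, and can be estimated by averaging the bracket over samples drawn from the model itself.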

  32. Correcting for model error ● With known model and target densities, there are many options to correct for the error ● We use MCMC with proposals from the ML model (interoperable with standard MC updates) ● Metropolis-Hastings step: the model proposal is independent of the previous sample [diagram: a Markov chain built from accepted and rejected (✘) ML model proposals]
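
For such an independence proposal 𝜚' drawn from p̃_f, the standard Metropolis-Hastings acceptance probability (not written out on the slide) is

    \[
      A(\varrho \to \varrho') \;=\;
      \min\!\left(1,\;
        \frac{p(\varrho')}{p(\varrho)}\,
        \frac{\tilde{p}_f(\varrho)}{\tilde{p}_f(\varrho')}
      \right),
    \]

which needs only the exactly known model likelihoods and the target density up to Z, and guarantees the chain converges to p(𝜚) even when the model is imperfect.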

  33. Overview of algorithm ● Parameterize the flow using real NVP coupling layers ● Each layer contains arbitrary neural nets s and t
