Machine learning for lattice theories

SLIDE 1

Machine learning for lattice theories¹

Michael S. Albergo, Gurtej Kanwar, Phiala E. Shanahan
Center for Theoretical Physics, MIT

Deep Learning and Physics, Kyoto, Japan (November 1, 2019)

¹ Albergo, GK, Shanahan [PRD 100 (2019) 034515]

SLIDE 2

Real-world lattices

SLIDE 3

Real-world lattices, quantum field theories

slide-4
SLIDE 4

Lattices in the real world

  • Many materials have degrees of freedom pinned to a lattice structure


[Ella Maru Studio] [Mazurenko et al. 1612.08436]


SLIDE 6

Lattices in the real world

  • Thermodynamics describes the collective behavior of many degrees of freedom
  • At temperature T, microstates s follow the Boltzmann distribution

      p(s) = e^{-E(s)/kT} / Z,   with   Z = Σ_s e^{-E(s)/kT}

[ "Ising Model and Metropolis Algorithm", MathWorks Physics Team ]

The Ising model has a spin s_x ∊ {↑,↓} on each site x, with an energy penalty when neighboring spins differ. Typical microstates have patches of aligned spins at some characteristic scale.
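To make the microstate picture concrete, here is a minimal sketch (not from the talk) of local Metropolis sampling for the 2D Ising model; the lattice size, temperature, and sweep count are illustrative:

```python
import numpy as np

def metropolis_ising(L=32, beta=0.6, n_sweeps=500, seed=0):
    """Sample 2D Ising microstates s ~ exp(-beta * E(s)) / Z, with
    E(s) = -sum_<xy> s_x s_y, using local Metropolis updates."""
    rng = np.random.default_rng(seed)
    s = rng.choice([-1, 1], size=(L, L))
    for _ in range(n_sweeps):
        for _ in range(L * L):  # one sweep = L^2 single-site proposals
            i, j = rng.integers(L, size=2)
            # energy change from flipping s[i, j] (periodic boundaries)
            nn = s[(i + 1) % L, j] + s[(i - 1) % L, j] \
               + s[i, (j + 1) % L] + s[i, (j - 1) % L]
            dE = 2 * s[i, j] * nn
            if dE <= 0 or rng.random() < np.exp(-beta * dE):
                s[i, j] = -s[i, j]
    return s
```

Near criticality the aligned patches grow, and exactly there these local updates decorrelate slowly; that is the critical slowing down quantified later in the talk.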

SLIDE 7

Lattices in the real world

  • Derive thermodynamic observables by averaging over microstates:

      Partition function:       Z = Σ_s e^{-E(s)/kT}
      Boltzmann distribution:   p(s) = e^{-E(s)/kT} / Z
      Total energy:             ⟨E⟩ = Σ_s E(s) p(s)
      Helmholtz free energy:    F = -kT log Z
      Correlation function:     ⟨s_x s_y⟩ = Σ_s s_x s_y p(s)
      ...

SLIDE 8

Lattices for quantum field theories

  • Quantum-mechanical properties are also computed as statistical expectation values via the path integral:

      ⟨O⟩ = (1/Z) ∫ D𝜚 O(𝜚) e^{-S(𝜚)},   with   Z = ∫ D𝜚 e^{-S(𝜚)}   (similar to a partition function)

SLIDE 9

Lattice Quantum Chromodynamics

  • Predictions are needed to interpret upcoming high-energy experiments
    ○ The Electron-Ion Collider will investigate detailed nuclear structure
    ○ The Deep Underground Neutrino Experiment requires nuclear cross sections with neutrinos
  • Pen-and-paper methods fail; numerical evaluation of the path integral is required (so far! see Hong-Ye's talk for holography ideas)

bnl.gov/eic · dunescience.org · [D. Leinweber, Visual QCD Archive]

SLIDE 10

Computational approach to lattice theories

  • Partition functions and path integrals are typically intractable analytically
  • Numerical approximation by Monte Carlo sampling: draw configurations 𝜚₁, 𝜚₂, ..., 𝜚_N approximately distributed ~ p(𝜚), then estimate observables as sample averages,

      ⟨O⟩ ≈ (1/N) Σ_i O(𝜚_i)

  • Markov chain Monte Carlo converges to samples from p(𝜚)
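A sketch of the resulting estimator (the `observable` callable and the configurations, e.g. from the Ising sampler above, are assumptions for illustration):

```python
import numpy as np

def estimate_observable(samples, observable):
    """Monte Carlo estimate of <O> and a naive standard error from
    configurations approximately distributed ~ p. The naive error bar
    assumes independent samples; correlated Markov chains need the
    integrated autocorrelation time introduced on the next slides."""
    vals = np.array([observable(s) for s in samples])
    return vals.mean(), vals.std(ddof=1) / np.sqrt(len(vals))

# e.g.: mean absolute magnetization per site over Ising microstates
# mag, err = estimate_observable(configs, lambda s: abs(s.mean()))
```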

SLIDES 11-15 (outline, built up step by step)

Lattice theories (real-world lattices, quantum field theories)
  → studied with numerical methods (thermodynamics, collective phenomena, spectrum, ...)
  → but it is hard to reach the continuum limit / critical point in some theories
  → + ML

Rest of the talk:
  1. Critical slowing down
  2. Sampling using ML
  3. Toy model results

SLIDE 16

Difficulties with Markov chain Monte Carlo

  • Need to wait for a "burn-in" period
  • Configurations close to each other on the chain are correlated, so many steps must be taken before drawing independent samples
  • Burn-in and correlations are both related to the Markov chain "autocorrelation time"

→ a smaller autocorrelation time means less computational cost!

Typically quantified with the integrated autocorrelation time,

    τ_int = 1/2 + Σ_{τ=1}^∞ ρ(τ),

where ρ(τ) is the normalized autocorrelation of an observable at separation τ along the chain.
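A sketch of estimating τ_int from a chain of scalar measurements (the simple truncate-at-first-negative window below is one common heuristic, an assumption rather than the talk's exact prescription):

```python
import numpy as np

def integrated_autocorr_time(x, max_lag=None):
    """tau_int = 1/2 + sum_tau rho(tau) for measurements x along a
    Markov chain, truncating the sum once rho drops below zero."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    var = np.dot(x, x) / n
    tau = 0.5
    for t in range(1, max_lag or n // 10):
        rho = np.dot(x[:-t], x[t:]) / ((n - t) * var)
        if rho <= 0:  # noise-dominated tail; stop summing
            break
        tau += rho
    return tau
```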


SLIDE 18

Critical slowing down

  • As the parameters defining the distribution approach criticality (the continuum limit), the autocorrelation time diverges for Markov chains using local updates
  • Fitting τ_int to power-law behavior, τ_int ∝ ξ^z, gives the dynamical critical exponent z
  • A smaller dynamical critical exponent means a cheaper, closer approach to criticality

CSD also affects more realistic, complex models: CP^{N-1}, O(N), QCD, ...
[ALPHA collaboration 1009.5228] [Frick et al. PRL 63, 2613] [Flynn et al. 1504.06292]

(Figure: CSD in the scalar theory used in this work.)

SLIDE 19

(Outline recap; moving to part 2: sampling using ML.)

SLIDE 20

Sampling lattice configs

(Figure: four example configurations labeled likely (log prob = 22), likely (log prob = 5), unlikely (log prob = -6107), likely (log prob = 25).)

SLIDE 21

Sampling lattice configs ≅ generating images

(Figure: lattice configurations alongside GAN-generated face images, each labeled likely/unlikely.)

[Karras, Laine, Aila / NVIDIA 1812.04948]

SLIDE 22

Unique features of the lattice sampling problem

  • Probability density computable (up to normalization)
  • Many symmetries in physics
    ○ Lattice symmetries like translation, rotation, and reflection
    ○ Per-site symmetries like negation
  • High-dimensional (10⁹ to 10¹² degrees of freedom) samples
  • Few (~1000) samples available ahead of time (fewer than the number of variables!)
    ○ Hard to use training paradigms that rely on existing samples from the distribution

SLIDE 23

Image generation via ML

1. Likelihood-free methods, e.g. Generative Adversarial Networks (GANs)
   ✘ Need many real samples
   ✘ No associated likelihood for each produced sample

2. Autoencoding, e.g. Variational Auto-Encoders (VAEs)
   ✔ Good for human interpretability
   ✘ Same issues as GANs

3. Normalizing flows: flow-based models learn a change-of-variables that transforms a known distribution to the desired distribution
   ✔ Exactly known likelihood for each sample
   ✔ Can be trained with samples from itself

[Goodfellow et al. 1406.2661] [Kingma & Welling 1312.6114] [Rezende & Mohamed 1505.05770] [Shen & Liu 1612.05363]


SLIDE 25

Many related approaches

  • Normalizing flows for many-body systems [Noé, Olsson, Köhler, Wu, Science 365 (2019), Iss. 6457, 982]
  • Continuous flows [Zhang, E, Wang 1809.10188]
  • Self-Learning Monte Carlo [Liu, Qi, Meng, Fu 1610.03137] (see talks by Junwei Liu, Lei Wang, and Hong-Ye Hu)
  • Hamiltonian transforms [Li, Dong, Zhang, Wang 1910.00024]

SLIDES 26-27

Flow-based generative models

Using a change-of-variables 𝜚 = f(z), produce a distribution approximating the one you want:

    prior r(z) (easily sampled) → f (invertible, tractable Jacobian) → p̃_f(𝜚) (approximates the desired dist.),

with p̃_f(𝜚) = r(z) |det ∂f/∂z|⁻¹.

[Rezende & Mohamed 1505.05770]

SLIDES 28-29

Flow-based generative models

We chose real non-volume-preserving (real NVP) flows for our work: many simple layers are composed to produce f, which stays invertible with a tractable Jacobian, is easily sampled, and approximates the desired distribution.

[Dinh et al. 1605.08803]

SLIDE 30

Real NVP coupling layer

Application of g_i:
  1. Freeze half of the inputs, z_a
  2. Feed the frozen variables into neural networks s and t
  3. Apply the scale exp(-s) and offset -t to the unfrozen half, z_b

  • Simple inverse and Jacobian (see the sketch below)
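A minimal PyTorch sketch of one such coupling layer (the class name, net shapes, and fully connected s and t are illustrative assumptions, not the paper's exact code; the ordering of the exp(-s) scale and -t offset is one self-consistent choice):

```python
import torch
import torch.nn as nn

class CouplingLayer(nn.Module):
    """Real NVP coupling layer on flattened configurations.
    mask is 1 on the frozen half z_a, 0 on the updated half z_b."""
    def __init__(self, n_sites, mask, hidden=100):
        super().__init__()
        self.register_buffer("mask", mask)
        def make_net():
            return nn.Sequential(
                nn.Linear(n_sites, hidden), nn.LeakyReLU(),
                nn.Linear(hidden, n_sites))
        self.s, self.t = make_net(), make_net()

    def forward(self, z):
        za = self.mask * z                      # frozen variables
        s, t = self.s(za), self.t(za)
        zb = (1 - self.mask) * (z - t) * torch.exp(-s)
        logJ = ((1 - self.mask) * (-s)).sum(dim=-1)  # log det Jacobian
        return za + zb, logJ

    def inverse(self, x):
        xa = self.mask * x
        s, t = self.s(xa), self.t(xa)
        xb = (1 - self.mask) * (x * torch.exp(s) + t)
        return xa + xb, ((1 - self.mask) * s).sum(dim=-1)
```

Because s and t only ever see the frozen half, the Jacobian is triangular and its log-determinant is just the sum of -s over the updated sites, which is what keeps the likelihood tractable.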
SLIDE 31

Loss function

  • Use the known target probability density: p(𝜚) = e^{-S(𝜚)} / Z
  • For our application, train to minimize a shifted KL divergence,

      L(p̃_f) = D_KL(p̃_f ‖ p) - log Z = E_{𝜚~p̃_f}[ log p̃_f(𝜚) + S(𝜚) ]

    (the shift removes the unknown normalization Z)
  • Can apply self-training: sample the model distribution p̃_f(𝜚) to estimate the loss
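A sketch of the self-training loss (`sample_with_logprob` is an assumed model interface, sketched in the backup slides below; `action` evaluates S(𝜚) per configuration):

```python
def shifted_kl_loss(model, action, batch_size=1024):
    """E_{phi ~ q_f}[ log q_f(phi) + S(phi) ] = KL(q_f || p) - log Z,
    estimated entirely from the model's own samples (self-training)."""
    phi, logq = model.sample_with_logprob(batch_size)
    return (logq + action(phi)).mean()
```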

SLIDE 32

Correcting for model error

  • With known model and target densities, there are many options to correct for model error
  • We use MCMC with proposals from the ML model (interoperable with standard MC updates)
  • Metropolis-Hastings step: the model proposal 𝜚' is drawn independently of the previous sample 𝜚 and accepted with probability

      A(𝜚 → 𝜚') = min( 1, [ p(𝜚') p̃_f(𝜚) ] / [ p(𝜚) p̃_f(𝜚') ] )
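A sketch of the resulting independence Metropolis-Hastings chain (same assumed interfaces as above):

```python
import torch

def flow_mh_chain(model, action, n_steps):
    """Markov chain with independent flow proposals, accepted with
    probability min(1, p(phi') q(phi) / (p(phi) q(phi')))."""
    phi, logq = model.sample_with_logprob(1)
    chain = []
    for _ in range(n_steps):
        phi_new, logq_new = model.sample_with_logprob(1)
        # log acceptance ratio: [-S(phi') - log q(phi')] - [-S(phi) - log q(phi)]
        log_acc = (action(phi) - action(phi_new)) + (logq - logq_new)
        if torch.log(torch.rand(())) < log_acc:
            phi, logq = phi_new, logq_new
        chain.append(phi)
    return chain
```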
SLIDES 33-35

Overview of algorithm

Parameterize the flow using real NVP coupling layers; each layer contains arbitrary neural nets s and t.

Training step (repeat until the desired accuracy, then save the trained model; see the sketch below):
  1. Draw samples from the model
  2. Compute the loss function
  3. Gradient descent

Then build a Markov chain using samples from the model; generating samples is "embarrassingly parallel".
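Putting the pieces together, a training-loop sketch under the same assumed interfaces (hyperparameters illustrative):

```python
import torch

def train(model, action, n_iters=1000, batch_size=1024, lr=1e-3):
    """Draw samples from the model, compute the shifted KL loss, take a
    gradient step, repeat. Afterwards the trained model feeds the Markov
    chain, and proposal generation parallelizes trivially."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(n_iters):
        loss = shifted_kl_loss(model, action, batch_size)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```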

SLIDE 36

(Outline recap; moving to part 3: toy model results.)

SLIDES 37-38

Toy model: scalar 𝜚⁴ lattice field theory

  • One real number 𝜚(x) ∊ (-∞, ∞) per lattice site x (2D lattice)
  • Action: relativistic scalar with quartic coupling
  • 5 lattice sizes L² ∊ {6², 8², 10², 12², 14²} with bare parameters tuned to follow a line of constant physics (symmetric phase)
  • HMC and local Metropolis compared against our ML method

SLIDE 39

Samples from ML model vs standard algorithms

By eye, the ML model produces varied samples with correlations at the right scale.
SLIDE 40

Toy model: scalar 𝜚⁴ lattice field theory

  • Measured observables (estimator sketch below):
    ○ Correlation functions, relating to masses in the (discretized) quantum field theory
    ○ Response of the vacuum to an impulse (two-point susceptibility)
    ○ Energy measurement relating to the Ising model microstate energy in a particular limit of 𝜚⁴ theory
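As one concrete sketch, the connected two-point function and susceptibility can be estimated from an ensemble of configurations (the array layout and FFT-based translation average are illustrative choices, not the paper's exact estimators):

```python
import numpy as np

def connected_two_point(samples):
    """G_c(x) = <phi(y) phi(y+x)> - <phi>^2, averaged over y and the
    ensemble, for configurations of shape [N, L, L]; the translation
    average uses a circular correlation computed with FFTs."""
    phi = np.asarray(samples, dtype=float)
    fk = np.fft.fft2(phi)                  # FFT over the two lattice axes
    corr = np.fft.ifft2(fk * np.conj(fk)).real.mean(axis=0) / phi[0].size
    return corr - phi.mean() ** 2

def susceptibility(samples):
    """Two-point susceptibility: chi_2 = sum_x G_c(x)."""
    return connected_two_point(samples).sum()
```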

SLIDE 41

Comparing observables (1)

Ising energy and two-point susceptibility agree.

SLIDE 42

Comparing observables (2)

Correlation functions and pole masses agree. (Figure: correlation falls off with separation in both directions on the periodic lattice.)

SLIDE 43

No critical slowing down in ML approach

(Figure: autocorrelation times for HMC, local Metropolis, and ML models.)

With ML models, by spending time training up-front, autocorrelations are fixed during sampling.

SLIDE 44

Moving towards QCD

  • Developing the method to apply to lattice QCD, tackling difficulties with scaling the volume and number of dimensions, and with extending to gauge theories

With: David Murphy, Dan Hackett, Denis Boyda

SLIDE 45

Convolutional architectures

  • Work in progress using convolutional networks for s and t, natural due to the locality of physical distributions (sketch below)
  • Convolutions + checkerboard mask = translational invariance (up to parity)
  • Convolutions also make scaling the physical volume easy

(Figure: a net trained for 3.5 days on a 14 × 14 lattice transfers to 20 × 20 with ~10 minutes of retraining.)
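A sketch of convolutional s/t networks with periodic (circular) padding, so that together with the checkerboard mask the coupling layer commutes with lattice translations (depths and widths illustrative):

```python
import torch
import torch.nn as nn

def conv_net(hidden=8, kernel=3):
    """Small CNN usable for s or t on [batch, 1, L, L] inputs; circular
    padding matches the periodic lattice, preserving translation
    equivariance and allowing the same weights at any volume."""
    pad = kernel // 2
    return nn.Sequential(
        nn.Conv2d(1, hidden, kernel, padding=pad, padding_mode="circular"),
        nn.LeakyReLU(),
        nn.Conv2d(hidden, 1, kernel, padding=pad, padding_mode="circular"))

def checkerboard_mask(L, parity=0):
    """0/1 checkerboard on an L x L lattice; alternate parity per layer."""
    xs = torch.arange(L)
    return ((xs[:, None] + xs[None, :] + parity) % 2).float()
```

Because the weights are independent of L, a net trained at one volume can be reused at a larger one, which is the transfer trick on this slide.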

SLIDE 46

Hierarchical models

  • Multiple flows, refining the lattice between each
  • Intuition: build correlations at multiple scales efficiently
  • Computational gains allow scaling the ML method

(Figure: flow → refine → flow → ... ; hierarchical flow for 32 × 32 𝜚⁴.)

[Dinh, Sohl-Dickstein, Bengio 1605.08803] [Li & Wang PRL 121 (2018) 260601]

SLIDE 47

Towards 3D and 4D lattices

  • No theoretical obstacle to moving from two dimensions to 3D and 4D
  • Work in progress to implement 4D convolutions efficiently in PyTorch (see the sketch below)

(Figure: an 8 × 8 × 8 model is easily trained to 30% acceptance.)
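PyTorch provides convolutions only up to Conv3d, so one common workaround (a sketch, not necessarily the group's implementation) builds a 4D convolution from a stack of Conv3d's applied to shifted slices along the fourth axis:

```python
import torch
import torch.nn as nn

class Conv4d(nn.Module):
    """Naive 4D convolution: the i-th Conv3d handles the i-th slice of
    the kernel along the T axis, implemented by rolling the input along
    T (circular boundaries) and summing. Input: [B, C, T, X, Y, Z]."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.k = k
        self.convs = nn.ModuleList(
            nn.Conv3d(in_ch, out_ch, k, padding=k // 2,
                      padding_mode="circular", bias=(i == 0))
            for i in range(k))  # bias on one slice only, to add it once

    def forward(self, x):
        B, C, T, X, Y, Z = x.shape
        out = 0
        for i, conv in enumerate(self.convs):
            xi = torch.roll(x, shifts=self.k // 2 - i, dims=2)  # shift along T
            xi = xi.permute(0, 2, 1, 3, 4, 5).reshape(B * T, C, X, Y, Z)
            yi = conv(xi).reshape(B, T, -1, X, Y, Z).permute(0, 2, 1, 3, 4, 5)
            out = out + yi
        return out
```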

SLIDE 48

Towards gauge theories

  • Real NVP only directly works on fields taking real values 𝜚(x) ∊ (-∞, ∞)
  • Gauge theories (and O(N), CP^{N-1}, ...) have compact domains
  • Early success applying stereographic projection + real NVP in collaboration with a DeepMind team [Gemici, Rezende, Mohamed 1611.02304] (sketch below)
  • Applications to discrete models (Ising, Potts, etc.)?
    ○ Some recent ideas emerging [Ziegler & Rush 1901.10548]
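The idea in one dimension (a toy sketch for a single circle-valued variable; the construction actually needed for gauge groups is more involved): map the compact angle to the real line and track the Jacobian so that flow densities stay exact.

```python
import torch

def circle_to_line(theta):
    """Stereographic-style map from theta in (-pi, pi) to the real
    line, x = tan(theta / 2), with its log-Jacobian."""
    x = torch.tan(theta / 2)
    log_jac = -torch.log(2 * torch.cos(theta / 2) ** 2)  # log |dx/dtheta|
    return x, log_jac

def line_to_circle(x):
    """Inverse map back to the compact domain."""
    theta = 2 * torch.atan(x)
    log_jac = torch.log(2 / (1 + x ** 2))                # log |dtheta/dx|
    return theta, log_jac
```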

SLIDE 49

Thanks! Questions?

SLIDE 50

Backup slides

SLIDE 51

ML method for scalar lattice field theory

  • Prior distribution chosen to be an uncorrelated Gaussian, i.e. for each site x, z(x) ~ 𝒩(0, 1)
  • Real NVP model:
    ○ 8-12 real NVP coupling layers
    ○ Alternating checkerboard pattern for the variable split
    ○ 2-6 fully connected layers with 100-1024 hidden units
  • Trained using the shifted KL loss with the Adam optimizer
    ○ Models trained until 50% and 70% acceptance rates in the ML MCMC
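Tying the earlier sketches together (reusing the hypothetical CouplingLayer and checkerboard_mask from above; the layer count follows this slide, everything else is illustrative):

```python
import math
import torch
import torch.nn as nn

class RealNVPFlow(nn.Module):
    """Stack of coupling layers with alternating checkerboard masks,
    mapping the per-site unit-Gaussian prior to the model density."""
    def __init__(self, L, n_layers=8):
        super().__init__()
        self.n_sites = L * L
        self.layers = nn.ModuleList(
            CouplingLayer(self.n_sites,
                          checkerboard_mask(L, parity=i % 2).flatten())
            for i in range(n_layers))

    def sample_with_logprob(self, batch_size):
        z = torch.randn(batch_size, self.n_sites)       # prior N(0, 1)
        logq = (-0.5 * (z ** 2).sum(dim=-1)
                - 0.5 * self.n_sites * math.log(2 * math.pi))
        for layer in self.layers:
            z, logJ = layer(z)
            logq = logq - logJ       # change of variables, layer by layer
        return z, logq
```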

SLIDE 52

Autocorrelation time

  • Well-behaved Markov chains have a "mixing time" determining how many updates are required to burn in / decorrelate samples
    ○ Hard to compute directly except for very special chains
    ○ Dominated by the slowest mixing mode
  • Practically useful alternative: the integrated autocorrelation time for an observable O,

      τ_int = 1/2 + Σ_{τ=1}^∞ ρ_O(τ),

    where ρ_O(τ) is the normalized two-point correlation of O at a separation of τ Markov chain steps