Self-learning Monte Carlo method and all optical neural network
Junwei Liu
Department of Physics, Hong Kong University of Science and Technology
Deep Learning and Physics, YITP, Kyoto, Japan, 2019
The general motivation
- As is well known, there have been great developments in machine learning techniques, such as principal component analysis, deep neural networks, convolutional neural networks, generative neural networks, reinforcement learning, and so on.
- Using these methods, many great achievements have been made, such as image recognition, self-driving cars, and AlphaGo.
- Can we use these methods in physics to solve some problems?
If so, what kind of problems can we solve and how?
2/46
Part 1: Self-learning Monte Carlo (ML → Physics)
Monte Carlo simulation
3/46
Collaborators and References
1. Self-Learning Monte Carlo Method, PRB 95, 041101(R) (2017)
2. Self-Learning Monte Carlo Method in Fermion Systems, PRB 95, 241104(R) (2017)
3. Self-Learning Quantum Monte Carlo Method in Interacting Fermion Systems, PRB 96, 041119(R) (2017)
4. Self-Learning Monte Carlo Method: Continuous-Time Algorithm, PRB 96, 161102 (2017)
5. Self-Learning Monte Carlo with Deep Neural Networks, PRB 97, 205140 (2018)

Related work from Prof. Lei Wang's group at IOP:
1. Accelerated Monte Carlo simulations with restricted Boltzmann machines, PRB 95, 035105 (2017)
2. Recommender engine for continuous-time quantum Monte Carlo methods, PRE 95, 031301(R) (2017)

Collaborators: Liang Fu (MIT), Yang Qi (Fudan), Ziyang Meng (IOP), Huitao Shen (MIT), Xiaoyan Xu (UCSD), Yuki Nagai (JAEA)
4/46
Monte Carlo methods
- Consider a statistical mechanics problem:

  Z = Σ_C e^{−βH[C]} = Σ_C W(C)

  ⟨O⟩ = (1/Z) Σ_C O(C) e^{−βH[C]} = (1/Z) Σ_C O(C) W(C)

- Pick N configurations (samples) C_i in the configuration space according to the importance W(C_i)/Z. Then we can estimate observables as

  ⟨O⟩ ≈ (1/N) Σ_{i=1}^{N} O(C_i)

- The statistical error is proportional to 1/√N. In high dimensions (d > 8), Monte Carlo is the most important, and sometimes the only available, method to perform the integral/summation.
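As a toy illustration of the estimator above (a minimal sketch, not from the talk; the 4-site Ising ring and all parameter values are illustrative assumptions), one can enumerate Z exactly for a tiny system, draw N samples with probability W(C)/Z, and check that the sample mean approaches the exact ⟨E⟩ with error ~ 1/√N:

```python
import numpy as np
from itertools import product

# Toy statistical-mechanics problem: 4-site Ising ring, H(C) = -J * sum_i s_i s_{i+1}
J, beta = 1.0, 0.5
configs = [np.array(s) for s in product([-1, 1], repeat=4)]
energy = lambda s: -J * np.sum(s * np.roll(s, 1))

# Exact weights W(C) = exp(-beta * H(C)) and partition function Z = sum_C W(C)
energies = np.array([energy(s) for s in configs])
W = np.exp(-beta * energies)
Z = W.sum()

# Draw N samples with probability W(C)/Z, then <O> ~ (1/N) sum_i O(C_i)
rng = np.random.default_rng(0)
N = 10_000
idx = rng.choice(len(configs), size=N, p=W / Z)
print("MC estimate of <E>:", energies[idx].mean())   # statistical error ~ 1/sqrt(N)
print("exact <E>:", np.sum(W * energies) / Z)
```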
5/46
Quantum Monte Carlo methods
- Consider a quantum system characterized by a Hamiltonian H:

  Z = Σ_ψ ⟨ψ| e^{−βH} |ψ⟩

  Method 1: Trotter decomposition

  Z = Σ_{ψ1 ψ2 … ψM} ⟨ψ1| e^{−βH/M} |ψ2⟩ ⟨ψ2| e^{−βH/M} |ψ3⟩ … ⟨ψM| e^{−βH/M} |ψ1⟩

  Method 2: series expansion

  Z = Σ_ψ ⟨ψ| e^{−βH} |ψ⟩ = Σ_ψ Σ_n ⟨ψ| (−βH)^n / n! |ψ⟩

- Map the N-dimensional quantum model to an (N+1)-dimensional "classical" model, and then use the Monte Carlo method to simulate this "classical" model.
6/46
Markov chain Monte Carlo (MCMC)
- MCMC is a way to do importance sampling based on the distribution W(C). Configurations are generated one by one by the following procedure:
- 1. First propose the next trial configuration C_t
- 2. If Rand() ≤ T(C_i → C_t), then the next configuration is C_{i+1} = C_t; otherwise, C_{i+1} = C_i
- 3. Repeat steps 1 and 2:

  ⋯ → C_{i−1} → C_i → C_{i+1} → ⋯

- Clearly, the next step depends only on the current configuration and the transition matrix T, and one can show that the detailed balance condition

  T(C → D) / T(D → C) = W(D) / W(C)

  guarantees that the Markov process converges to the desired distribution.
- N. Metropolis et al., J. Chem. Phys. 21, 1087 (1953)
7/46
Metropolis-Hastings algorithm
- The transition probability can be further decomposed as

  T(C → D) = S(C → D) p(C → D)

  S(C → D): probability of proposing configuration D starting from configuration C
  p(C → D): probability of accepting the proposed configuration D

- In the Metropolis algorithm, the proposal is symmetric and the acceptance probability is

  p(C → D) = min(1, W(D) / W(C))

- N. Metropolis et al., J. Chem. Phys. 21, 1087 (1953)
- In the Metropolis-Hastings algorithm, the acceptance ratio is

  p(C → D) = min(1, [W(D) S(D → C)] / [W(C) S(C → D)])

- W. K. Hastings, Biometrika 57, 97 (1970)
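A minimal sketch of this procedure in Python (not from the talk; the function names and the Gaussian test weight are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

def metropolis_hastings(W, propose, S_ratio, C0, n_steps):
    """Generic Metropolis-Hastings chain (illustrative sketch).

    W(C)         : unnormalized weight of configuration C
    propose(C)   : draw a trial configuration D from S(C -> D)
    S_ratio(C, D): proposal ratio S(D -> C) / S(C -> D)
    """
    chain, C = [C0], C0
    for _ in range(n_steps):
        D = propose(C)
        # acceptance p(C -> D) = min(1, W(D) S(D->C) / (W(C) S(C->D)))
        if rng.random() <= min(1.0, W(D) / W(C) * S_ratio(C, D)):
            C = D                      # accept: C_{i+1} = D
        chain.append(C)                # a rejection keeps C_{i+1} = C_i
    return chain

# Example: sample a standard Gaussian weight with a symmetric random-walk proposal
chain = metropolis_hastings(
    W=lambda x: np.exp(-0.5 * x * x),
    propose=lambda x: x + rng.normal(scale=1.0),
    S_ratio=lambda x, y: 1.0,          # symmetric proposal -> plain Metropolis
    C0=0.0, n_steps=50_000)
print("sample mean/var:", np.mean(chain), np.var(chain))
```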
8/46
Independent Samples
- As is well known, only the independent samples matter for the statistical measurements.
- However, since the configurations are generated sequentially, one by one, as a random walk in configuration space, it is inevitable that the configurations in the Markov chain are correlated with each other and not independent.
- The configurations generated by different Monte Carlo methods have different autocorrelations.
9/46
Different update algorithms
- Too small a step length: small difference, high acceptance.
- Ideal step length: big difference and high acceptance, exploring the low-energy configurations.
- Too large a step length: big difference, low acceptance.

  acceptance factor: e^{−β(E(D) − E(C))}
10/46
How to justify different Monte Carlo methods?
Time consumption t × τ to get two statistically independent configurations:
- Autocorrelation time τ: the number of steps needed to get independent configurations. Bigger differences → independent, but low acceptance ratio; similar energies → high acceptance ratio, but not independent.
- Time consumption t to get one configuration (mainly for the calculation of the weight)
Self-learning Monte Carlo methods are designed to improve both parts, and thus can speed up the calculations dramatically.
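A rough sketch of how τ can be estimated from a chain of measurements (illustrative only; production analyses use more careful windowing or binning, and the AR(1) test series below is an assumption for demonstration):

```python
import numpy as np

def integrated_autocorr_time(x, window=None):
    """Crude estimate of the integrated autocorrelation time tau of a series x:
    tau = 1 + 2 * sum_t C(t), with C(t) the normalized autocorrelation."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    acf /= acf[0]
    window = window or len(x) // 100      # naive fixed summation window
    return 1.0 + 2.0 * np.sum(acf[1:window])

# Demo on a synthetic AR(1) series with correlation 0.9 (exact tau = 19)
rng = np.random.default_rng(2)
x = np.zeros(10_000)
for t in range(1, len(x)):
    x[t] = 0.9 * x[t - 1] + rng.normal()
print("estimated tau:", integrated_autocorr_time(x))
```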
11/46
Local update
- Local update: S(C → D) = S(D → C) = 1/N
- N. Metropolis et al., J. Chem. Phys. 21, 1087 (1953)
- Acceptance ratio:

  α(C → D) = min(1, [W(D) S(D → C)] / [W(C) S(C → D)]) = min(1, e^{−β(E(D) − E(C))})

- Very general: applies to any model
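A minimal sketch of this local update for the 2D Ising model (not from the slides; lattice size, coupling, and temperature are illustrative choices):

```python
import numpy as np

L, J, beta = 16, 1.0, 0.4
rng = np.random.default_rng(3)
spins = rng.choice([-1, 1], size=(L, L))

def local_update(spins):
    """One Metropolis step: propose flipping a single uniformly chosen site,
    so S(C->D) = S(D->C) = 1/N and p = min(1, exp(-beta * dE))."""
    i, j = rng.integers(L, size=2)
    # energy change of flipping spin (i, j), periodic neighbors, H = -J sum s_i s_j
    nn = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
          + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
    dE = 2.0 * J * spins[i, j] * nn
    if rng.random() <= np.exp(-beta * dE):
        spins[i, j] *= -1

for _ in range(100_000):
    local_update(spins)
print("magnetization per site:", spins.mean())
```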
12/46
Critical slowing down
- The dynamical relaxation time diverges at the critical point: convergence is very slow in a critical system.
- For the 2D Ising model, the autocorrelation time τ ∝ L^z, with z = 2.125
13/46
How to get high acceptance ratio?
α(C → D) = min(1, [W(D) S(D → C)] / [W(C) S(C → D)])

The acceptance ratio equals 1 when W(D) / W(C) = S(C → D) / S(D → C).
14/46
Global update -- Wolff algorithm in Ising model
- 1. Randomly choose one site i
- 2. If an adjacent site has the same spin, add it to the cluster with probability p = 1 − e^{−2βJ}
- 3. Repeat step 2 for all the sites in the cluster
- 4. Flip the spins of all the sites in the cluster (a code sketch follows below)
- R. H. Swendsen and J.-S. Wang, Phys. Rev. Lett. 58, 86 (1987)
- U. Wolff, Phys. Rev. Lett. 62, 361 (1989)
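A minimal sketch of the Wolff update described above (illustrative parameters; breadth-first cluster growth is one common way to implement steps 2-3):

```python
import numpy as np
from collections import deque

L, J, beta = 16, 1.0, 0.44
rng = np.random.default_rng(4)
spins = rng.choice([-1, 1], size=(L, L))

def wolff_update(spins):
    """One Wolff step: grow a cluster from a random seed, adding each aligned
    neighbor with p = 1 - exp(-2*beta*J), then flip the whole cluster."""
    p_add = 1.0 - np.exp(-2.0 * beta * J)
    seed = tuple(rng.integers(L, size=2))
    s0 = spins[seed]
    cluster, frontier = {seed}, deque([seed])
    while frontier:
        i, j = frontier.popleft()
        for ni, nj in ((i+1) % L, j), ((i-1) % L, j), (i, (j+1) % L), (i, (j-1) % L):
            if (ni, nj) not in cluster and spins[ni, nj] == s0 \
                    and rng.random() < p_add:
                cluster.add((ni, nj))
                frontier.append((ni, nj))
    for i, j in cluster:               # the flip is accepted with probability 1
        spins[i, j] *= -1

for _ in range(1000):
    wolff_update(spins)
print("|m| per site:", abs(spins.mean()))
```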
15/46
Reduce critical slowing down
Swendsen and Wang, Phys. Rev. Lett. 58, 86 (1987)
16/46
Local update and global update
Local update
- Locally update the configuration by changing one site per MC step
- Very general
- Inefficient around phase transition points (critical slowing down)

Global update
- Globally update the configuration by simultaneously changing many sites per MC step
- High efficiency
- Designed for specific models and hard to generalize to other models
17/46
How to get high acceptance ratio?
α(C → D) = min(1, [W(D) S(D → C)] / [W(C) S(C → D)]), so we want W(D) / W(C) = S(C → D) / S(D → C).

Exactly achieving this is very hard. Instead, aim for

W(D) / W(C) ≈ S(C → D) / S(D → C)
18/46
My initial naïve idea
- Use machine learning to learn some common features of these "important" configurations, and then generate new configurations based on these learned features.
- It seems promising, but it does not work, because we don't know S(C → D) / S(D → C), and so we cannot calculate the right acceptance probability.
- However, this idea tells us that the generated configurations carry other important information, beyond the estimate ⟨O⟩ ≈ (1/N) Σ_{i=1}^{N} O(C_i) based on them.
19/46
Hidden information in generated configurations
- The generated configurations have a distribution close to the original distribution!
- It is obvious and seems trivial, but we do not use it seriously beyond calculating the averages of operators.
- Can we use it further, and how?
- The answer is YES, and we can do it in the self-learning Monte Carlo method.
20/46
The right way to use the hidden information in generated configurations
[Figure: configurations generated by the original model are used to learn an effective model (reflecting the critical properties and universality class), which in turn guides the sampling]
Yang Qi (Fudan)
21/46
Core ideas of self-learning Monte Carlo
- Learn an approximate, simpler model that has efficient global update methods and whose weights can be evaluated faster
- Use the simpler model to guide the simulation of the original hard model
- J. Liu et al., PRB 95, 041101(R) (2017)

First learn, then earn. How to train? How to propose?
22/46
SLMC in Boson system
- The original Hamiltonian has both two-body and four-body interactions, and we do not have a global update method for it.
23/46
Fit the parameters in H_eff
- Generate configurations with the local update at T = 5 > Tc, away from the critical point
- Perform linear regression to fit the parameters in H_eff
- Effective Hamiltonian: a short-range Ising model with fitted couplings
- Generate configurations at Tc with the learned model and retrain iteratively (reinforced learning); a fitting sketch follows below
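A minimal sketch of such a fit (a toy stand-in, not the paper's setup: here the "original" model adds a four-spin plaquette term to the Ising model, the training set is random configurations rather than local-update samples at T > Tc, and all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
L, J, K = 8, 1.0, 0.2   # couplings of the toy "original" model

def nn_sum(s):
    """Sum of nearest-neighbor products, periodic boundaries."""
    return np.sum(s * np.roll(s, 1, 0)) + np.sum(s * np.roll(s, 1, 1))

def plaquette_sum(s):
    """Four-spin plaquette term of the toy original model."""
    return np.sum(s * np.roll(s, 1, 0) * np.roll(s, 1, 1)
                  * np.roll(np.roll(s, 1, 0), 1, 1))

# Training set: random configurations standing in for MC samples at T > Tc
samples = [rng.choice([-1, 1], size=(L, L)) for _ in range(500)]
E_true = np.array([-J * nn_sum(s) - K * plaquette_sum(s) for s in samples])

# Linear regression for H_eff(C) = E0 - J1 * sum_<ij> s_i s_j
A = np.column_stack([np.ones(len(samples)),
                     np.array([-nn_sum(s) for s in samples])])
(E0, J1), *rest = np.linalg.lstsq(A, E_true, rcond=None)
print(f"fitted effective coupling J1 = {J1:.3f} (bare J = {J})")
```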
24/46
Build the cluster based on H_eff
- Self-learning update: the cluster is constructed using the Wolff update, obeying the detailed balance condition of the effective model H_eff:

  S(D → C) / S(C → D) = W_eff(C) / W_eff(D)

- Acceptance ratio:

  α(C → D) = min(1, [W(D) S(D → C)] / [W(C) S(C → D)]) = min(1, [W(D) W_eff(C)] / [W(C) W_eff(D)]) = min(1, e^{−β[(E(D) − E_eff(D)) − (E(C) − E_eff(C))]})
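The acceptance test is the only place where the original weight enters. A minimal sketch (an illustrative function, not code from the papers):

```python
import numpy as np

def slmc_accept(E_new, Eeff_new, E_old, Eeff_old, beta, rng):
    """SLMC acceptance for a move proposed by the effective model:
    alpha = min(1, exp(-beta * [(E(D) - E_eff(D)) - (E(C) - E_eff(C))])).
    Only the mismatch between H and H_eff enters, so a good H_eff keeps
    the acceptance of these large cluster moves close to 1."""
    diff = (E_new - Eeff_new) - (E_old - Eeff_old)
    return rng.random() <= min(1.0, np.exp(-beta * diff))

# Usage: propose D from C with a Wolff cluster built on H_eff, then call
# slmc_accept(E(D), E_eff(D), E(C), E_eff(C), beta, np.random.default_rng())
```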
25/46
For different sizes L × L
- J. Liu, et al PRB 95, 041101(R) (2017)
- SLMC is 20-40 times faster than the local update.
- We can easily get the result for 320 × 320.
26/46
How to justify different Monte Carlo methods?
Time consumption t × τ to get two statistically independent configurations:
- Autocorrelation time τ: the number of steps needed to get independent configurations. Bigger differences → independent, but low acceptance ratio; similar energies → high acceptance ratio, but not independent.
- Time consumption t to get one configuration (mainly for the calculation of the weight)
Self-learning Monte Carlo methods are designed to improve either or both parts, depending on the model, and thus can speed up the calculations dramatically.
27/46
Weight (Free energy) in Fermion systems
- In general fermion systems, the partition function can be
calculated based on the following form
- We have to deal with heavy matrix operations to calculate the weight (free energy), which is very time-consuming: O(N³).
- However, the weight only depends on the configuration of the bosonic field, which can be described by a pure boson model.
28/46
Complexity of SLMC
- Complexity to obtain two uncorrelated configurations with the conventional method: Nτ × N³ = τN⁴
- Complexity to obtain two uncorrelated configurations with SLMC: (Nτ + N³)/p, where p is the acceptance ratio of the global moves
29/46
Example: double exchange model
- Free fermions are coupled to classical Heisenberg spins: magnetic order in a metal
- Weight: the eigenvalues E_n are obtained by exact diagonalization (ED) of H for a given spin configuration
- Computational complexity of the conventional method: O(τL^d · L^{3d})
30/46
Learning the effective model
- Effective model
- Reproduce the RKKY feature
- Reproduce the right Boltzmann distribution
- J. Liu, et al PRB 95, 241104(R) (2017)
31/46
No critical slowing down
- The autocorrelation time is much shorter in SLMC than in the conventional method
- There is no critical slowing down in SLMC
- Computational complexity of the conventional method: O(τL^d · L^{3d})
- Computational complexity of SLMC: O(τL^d + L^{3d})
- SLMC reduces the complexity by O(τL^d), and can be more than 1000 times faster
- J. Liu, et al PRB 95, 241104(R) (2017)
32/46
Example II: SLMC in DQMC (Interacting Fermion)
- In conventional DQMC, the computational complexity is O(τβN³)
- In SLMC with the cumulative update, the complexity is O(βNτ + N³ + βN²):
- a. Cumulative update: O(βNτ)
- b. Detailed balance: O(N³)
- c. Sweeping the Green's function: O(βN²)
Ziyang Meng (IOP) Xiaoyan Xu (UCSD)
- X. Y. Xu, et al PRB 96, 041119(R) (2017)
33/46
Results for a particular model
For the first time, we can reach system sizes as large as 100 × 100 in two dimensions.
- X. Y. Xu, et al PRB 96, 041119(R) (2017)
34/46
Example III: SLMC in CT-AUX
For the single-impurity Anderson model at low temperature or large interaction, one can achieve a speedup of roughly O(βU).
Yuki Nagai (JAEA)
Yuki Nagai et al., PRB 96, 161102 (2017)
[Figure: single-impurity Anderson model]
35/46
SLMC with Deep Neural Networks
- H. Shen, J. Liu, L. Fu, PRB 97, 205140 (2018)
36/46
Asymmetric Anderson model with a single impurity
- By a Hubbard-Stratonovich transformation, we can decouple the interaction terms into an auxiliary bosonic field coupled to free fermions.
- Then we can use Monte Carlo to sample the auxiliary-field configuration space, thus simulating the original asymmetric Anderson model.
- It is not easy to write down an explicit effective model for the bosonic action here, so we use neural networks to represent it.
- d̂_σ and ĉ_k are the fermion annihilation operators for the impurity and for the conduction electrons, and n̂_{d,σ} ≡ d̂†_σ d̂_σ, n̂_d = n̂_{d↑} + n̂_{d↓}
37/46
Fully connected neural network
38/46
Insights from fully connected NN
- Translation symmetry
- Far fewer parameters
- Easy to extend
- H. Shen, J. Liu, L. Fu, PRB 97, 205140 (2018)
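One way to read this slide (a sketch under assumptions, not the architecture of the paper): translation symmetry suggests a convolutional effective action whose kernels are shared across sites, so the parameter count does not grow with system size. An illustrative numpy version:

```python
import numpy as np

rng = np.random.default_rng(6)

def conv_effective_action(field, kernels, bias):
    """Translation-invariant effective action (illustrative sketch):
    E_eff = bias + sum_k sum_x tanh((w_k * field)(x)), with periodic wrap.
    Sharing each kernel w_k across all sites keeps the parameter count
    independent of the system size, unlike a fully connected network."""
    E = bias
    for w in kernels:
        pad = len(w) // 2
        ext = np.concatenate([field[-pad:], field, field[:pad]])  # periodic
        act = np.convolve(ext, w, mode="valid")                   # shared weights
        E += np.sum(np.tanh(act))                                 # pointwise nonlinearity
    return E

# A field configuration of 64 auxiliary spins, two 3-tap kernels
field = rng.choice([-1.0, 1.0], size=64)
kernels = [rng.normal(size=3), rng.normal(size=3)]
print("E_eff =", conv_effective_action(field, kernels, bias=0.0))
```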
39/46
Results of convolutional network
Huitao Shen (MIT)
- H. Shen, J. Liu, L. Fu,
PRB 97, 205140 (2018)
- DNNs generally work across different chemical potentials and temperatures.
40/46
Part 2: All optical neural network (Physics → ML)
- At the speed of light
- Intrinsic infinite parallelism
- Robust against local errors
Shengwang Du (HKUST)
41/46
Spatial light modulator: linear transformation
- Use a grating to control the relation between the input light and the output light: a linear transformation I_out = W·I_in
42/46
Test the linear transformation
[Figure: optical setup with Camera1, coupling optics, SLM1, L1, L2, FM, L3, SLM2, Camera2, L4, M]
43/46
Non-linear function in optics
Electromagnetically induced transparency
44/46
Test the fully connected neural network
- We can reproduce well the results of the fully connected neural network (16, 4, 2) simulated on a computer.
Ying Zuo, et al. Optica 6, 1132 (2019)
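For orientation, a numerical sketch of the (16, 4, 2) network mentioned above (my stand-in, not the experiment: the SLM stage is modeled as a matrix acting on intensities, and the EIT nonlinearity is replaced by an assumed saturable response):

```python
import numpy as np

rng = np.random.default_rng(7)

def eit_nonlinearity(I, I_sat=1.0):
    """Stand-in for the EIT optical nonlinearity (a modeling assumption:
    saturable transmission, not the measured EIT response)."""
    return I / (1.0 + I / I_sat)

def optical_forward(x, W1, W2):
    """(16, 4, 2) network: each SLM realizes a linear map I_out = W @ I_in,
    and the EIT cell supplies the nonlinear activation between layers."""
    return W2 @ eit_nonlinearity(W1 @ x)

W1 = np.abs(rng.normal(size=(4, 16)))   # light intensities are non-negative
W2 = np.abs(rng.normal(size=(2, 4)))
x = np.abs(rng.normal(size=16))
print("outputs:", optical_forward(x, W1, W2))
```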
45/46
Summary
1. SLMC can be generally applied in both bosonic and fermionic systems to speed up MC simulations.
2. SLMC is a bridge between numerical simulation and analytic studies of many-body physics:
- Better understanding from analytic studies can give us a better effective model and better efficiency.
- The learned effective model in SLMC is a good starting point, and can give us more insights for analytic studies.
3. SLMC also offers a framework to integrate machine learning techniques into Monte Carlo simulations.
4. We have already realized the AONN in experiments.
46/46