

SLIDE 1

Monte Carlo simulation inspired by computational optimization

Colin Fox, fox@physics.otago.ac.nz
Al Parker, John Bardsley

MCQMC, February 2012, Sydney

SLIDE 2

Sampling from π(x)

Consider: x is high-dimensional (10^4–10^8) and π is expensive to compute.

MH-MCMC repeatedly simulates a fixed transition kernel P with πP = π via detailed balance (P is self-adjoint in π).

n-step distribution: π^(n) = π^(n−1) P = π^(0) P^n

n-step error in distribution: π^(n) − π = (π^(0) − π) P^n

Error polynomial: P^n = (I − (I − P))^n, i.e. P_n(λ) = (1 − λ)^n. Hence convergence is geometric. This was the state of stationary linear solvers in the 1950s, and is now considered very slow.

Polynomial acceleration: (choose x^(i) w.p. α_i) ∼ π^(0) Σ_{i=1}^n α_i P^i, with error polynomial Q_n(λ) = Σ_{i=1}^n α_i (1 − λ)^i.

SLIDE 3

(Identical to Slide 2.)

SLIDE 4

Gibbs sampling

Gibbs sampling [a] repeatedly samples from (block) conditional distributions. A forward and reverse sweep simulates a reversible kernel, as in MH-MCMC.

Normal distributions

π(x) = √(det(A) / (2π)^n) exp(−(1/2) x^T A x + b^T x)

  • precision matrix A, covariance matrix Σ = A^{-1} (both SPD); w.l.o.g. treat zero mean

Particularly interested in the case where A is sparse (a GMRF) and n is large. When π^(0) is also normal, then so is the n-step distribution, with A^(n) → A and Σ^(n) → Σ.

In what sense is “stochastic relaxation” related to “relaxation”? What decomposition of A is this performing?

[a] Glauber 1963 (heat-bath algorithm), Turcin 1971, Geman and Geman 1984
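For concreteness, here is a minimal NumPy sketch of component-wise Gibbs sampling in the zero-mean case π = N(0, A^{-1}), using the standard GMRF conditional x_i | x_{-i} ∼ N(−(1/A_ii) Σ_{j≠i} A_ij x_j, 1/A_ii). The 3 × 3 precision matrix is a toy example, not taken from the slides.

```python
import numpy as np

def gibbs_sweep(A, x, rng):
    """One component-wise Gibbs sweep for x ~ N(0, A^{-1}) (precision matrix A).

    Uses the Gaussian full conditional:
        x_i | x_{-i} ~ N( -(1/A_ii) * sum_{j != i} A_ij x_j , 1/A_ii ).
    """
    for i in range(len(x)):
        s = A[i, :] @ x - A[i, i] * x[i]          # sum_{j != i} A_ij x_j
        x[i] = -s / A[i, i] + rng.standard_normal() / np.sqrt(A[i, i])
    return x

# Toy usage: the empirical covariance should approach inv(A).
rng = np.random.default_rng(0)
A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
x = np.zeros(3)
samples = np.array([gibbs_sweep(A, x, rng).copy() for _ in range(20000)])
print(np.cov(samples[2000:].T))
print(np.linalg.inv(A))
```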

SLIDE 5

Gibbs samplers and equivalent linear solvers

  • Gauss-Seidel (optimization) ↔ Gibbs (sampling)
  • Cheby-GS ↔ Cheby-Gibbs
  • CG/Lanczos ↔ Lanczos

SLIDE 6

Matrix formulation of Gibbs sampling from N(0, A^{-1})

Let y = (y_1, y_2, ..., y_n)^T. Component-wise Gibbs updates each component in sequence from the (normal) conditional distributions. One ‘sweep’ over all n components can be written

y^(k+1) = −D^{-1} L y^(k+1) − D^{-1} L^T y^(k) + D^{-1/2} z^(k)

where D = diag(A), L is the strictly lower triangular part of A, and z^(k) ∼ N(0, I). Hence

y^(k+1) = G y^(k) + c^(k)

where c^(k) is iid ‘noise’ with zero mean and finite covariance: a stochastic AR(1) process, i.e. a first-order stationary iteration plus noise.

Goodman & Sokal, 1989
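A matrix-form sketch of the same sweep, assuming the notation above (D = diag(A), L the strictly lower triangle of A): the update y^(k+1) = −D^{-1} L y^(k+1) − D^{-1} L^T y^(k) + D^{-1/2} z^(k) is just one forward triangular solve with D + L.

```python
import numpy as np
from scipy.linalg import solve_triangular

def gibbs_sweep_matrix(A, y, rng):
    """One Gibbs sweep written in matrix (AR(1)) form: a sketch.

    Rearranging y_new = -D^{-1} L y_new - D^{-1} L^T y + D^{-1/2} z gives
        (D + L) y_new = -L^T y + D^{1/2} z,   z ~ N(0, I),
    which is a single forward substitution with the lower triangle D + L.
    """
    d = np.diag(A)
    L = np.tril(A, k=-1)                        # strictly lower triangular part of A
    z = rng.standard_normal(len(y))
    rhs = -L.T @ y + np.sqrt(d) * z             # -L^T y + D^{1/2} z
    return solve_triangular(np.diag(d) + L, rhs, lower=True)
```

This produces the same chain as the component-wise sweep above: one triangular solve per sweep instead of n scalar updates.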

SLIDE 7

Matrix splitting form of stationary iterative methods

The splitting A = M − N converts the linear system Ax = b to Mx = Nx + b. If M is nonsingular,

x = M^{-1} N x + M^{-1} b.

Iterative methods compute successively better approximations by

x^(k+1) = M^{-1} N x^(k) + M^{-1} b = G x^(k) + g.

Many splittings use terms in A = L + D + U. Gauss-Seidel sets M = L + D:

x^(k+1) = −D^{-1} L x^(k+1) − D^{-1} L^T x^(k) + D^{-1} b

Spot the similarity to Gibbs:

y^(k+1) = −D^{-1} L y^(k+1) − D^{-1} L^T y^(k) + D^{-1/2} z^(k)

Goodman & Sokal, 1989; Amit & Grenander, 1991
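Below is a hedged sketch of the generic splitting iteration x^(k+1) = M^{-1} N x^(k) + M^{-1} b, with Gauss-Seidel (M = D + L) as the example; the dense np.linalg.solve stands in for whatever cheap triangular or diagonal solve the chosen M actually allows.

```python
import numpy as np

def stationary_solve(A, b, M, n_iter=200, x0=None):
    """Generic matrix-splitting iteration x^{k+1} = M^{-1}(N x^{k} + b), N = M - A.

    A sketch only: in practice M is chosen so that solving M u = r is cheap
    (diagonal for Jacobi, triangular for Gauss-Seidel/SOR).
    """
    x = np.zeros_like(b) if x0 is None else x0.copy()
    N = M - A
    for _ in range(n_iter):
        x = np.linalg.solve(M, N @ x + b)
    return x

# Gauss-Seidel splitting: M = D + L (lower triangle of A, including the diagonal).
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
print(stationary_solve(A, b, np.tril(A)))
print(np.linalg.solve(A, b))            # the two should agree
```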

SLIDE 8

(Identical to Slide 7.)

SLIDE 9

Gibbs converges ⇐⇒ solver converges

Theorem 1. Let A = M − N with M invertible. The stationary linear solver

x^(k+1) = M^{-1} N x^(k) + M^{-1} b = G x^(k) + M^{-1} b

converges if and only if the random iteration

y^(k+1) = M^{-1} N y^(k) + M^{-1} c^(k) = G y^(k) + M^{-1} c^(k)

converges in distribution. Here c^(k) iid ∼ π_n has zero mean and finite variance.

  • Proof: both converge iff ρ(G) < 1.

Convergent splittings generate convergent (generalized) Gibbs samplers. The mean converges with asymptotic convergence factor ρ(G), the covariance with ρ(G)^2.

Young 1971 Thm 3-5.1; Duflo 1997 Thm 2.3.18-4; Goodman & Sokal 1989; Galli & Gao 2001; F & Parker 2012
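A small numerical check of Theorem 1 on a toy matrix (not from the slides): propagate the exact covariance of the stochastic iteration y^(k+1) = G y^(k) + M^{-1} c^(k) with c^(k) ∼ N(0, M^T + N) and watch it converge to A^{-1} precisely because ρ(G) < 1.

```python
import numpy as np

A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
M = np.tril(A)                      # Gauss-Seidel / Gibbs splitting, M = D + L
N = M - A
G = np.linalg.solve(M, N)

# Covariance of the additive term M^{-1} c^{k}, i.e. M^{-1} (M^T + N) M^{-T}.
S = M.T + N
noise_cov = np.linalg.solve(M, np.linalg.solve(M, S.T).T)

print("rho(G) =", max(abs(np.linalg.eigvals(G))))   # < 1, so both iterations converge
V = np.zeros_like(A)                # start from the point mass at y^(0) = 0
for _ in range(100):
    V = G @ V @ G.T + noise_cov     # exact covariance recursion of the sampler
print("max |V - inv(A)| =", np.max(np.abs(V - np.linalg.inv(A))))   # -> 0
```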

SLIDE 10

Some not so common Gibbs samplers for N(0, A^{-1})

Splittings and the noise covariance Var(c^(k)) = M^T + N they require:

  • Richardson: M = (1/ω) I; Var(c^(k)) = (2/ω) I − A; converges if 0 < ω < 2/ρ(A)
  • Jacobi: M = D; Var(c^(k)) = 2D − A; converges if A is SDD
  • GS/Gibbs: M = D + L; Var(c^(k)) = D; always converges
  • SOR/B&F: M = (1/ω) D + L; Var(c^(k)) = ((2 − ω)/ω) D; converges if 0 < ω < 2
  • SSOR/REGS: M = (ω/(2 − ω)) M_SOR D^{-1} M_SOR^T; Var(c^(k)) = (ω/(2 − ω)) (M_SOR D^{-1} M_SOR^T + N_SOR^T D^{-1} N_SOR); converges if 0 < ω < 2

  • Want: convenient to solve Mu = r and to sample from N(0, M^T + N)

The relaxation parameter ω can accelerate Gibbs. SSOR is a forwards and backwards sweep of SOR, giving a symmetric splitting.

SOR: Adler 1981; Barone & Frigessi 1990, Amit & Grenander 1991, SSOR: Roberts & Sahu 1997
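A sketch of a helper that returns M and the required noise covariance Var(c^(k)) = M^T + N = M + M^T − A for a few rows of the table above (the SSOR row is omitted for brevity; names and parameterisation follow the table).

```python
import numpy as np

def splitting(A, kind="gibbs", omega=1.0):
    """Return (M, noise_cov) for some splittings in the table; a sketch only."""
    D = np.diag(np.diag(A))
    L = np.tril(A, k=-1)
    if kind == "richardson":
        M = np.eye(len(A)) / omega
    elif kind == "jacobi":
        M = D
    elif kind == "gibbs":            # Gauss-Seidel / Gibbs
        M = D + L
    elif kind == "sor":              # SOR / Barone & Frigessi
        M = D / omega + L
    else:
        raise ValueError(kind)
    noise_cov = M + M.T - A          # = M^T + N; must be SPD for the sampler
    return M, noise_cov
```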

SLIDE 11

A closer look at convergence

To sample from N(µ, A^{-1}) where Aµ = b: split A = M − N with M invertible, set G = M^{-1} N, and take c^(k) iid ∼ N(0, M^T + N). The iteration

y^(k+1) = G y^(k) + M^{-1} (c^(k) + b)

implies

E[y^(m)] − µ = G^m (E[y^(0)] − µ)

and

Var(y^(m)) − A^{-1} = G^m (Var(y^(0)) − A^{-1}) (G^m)^T

(hence asymptotic average convergence factors ρ(G) for the mean and ρ(G)^2 for the covariance). Errors go down as the polynomial

P_m(I − G) = (I − (I − G))^m = (I − M^{-1} A)^m,  i.e.  P_m(λ) = (1 − λ)^m.

Solver and sampler have exactly the same error polynomial.

SLIDE 12

(Identical to Slide 11.)

SLIDE 13

Controlling the error polynomial

The splitting

A = (1/τ) M + ((1 − 1/τ) M − N)

gives the iteration operator G_τ = I − τ M^{-1} A and error polynomial P_m(λ) = (1 − τλ)^m.

A sequence of parameters τ_1, τ_2, ..., τ_m gives the error polynomial

P_m(λ) = Π_{l=1}^m (1 − τ_l λ)

... so we can choose the zeros of P_m! This gives a non-stationary solver ≡ non-homogeneous Markov chain (equivalently, one can post-process the chain by taking a linear combination of states).

Golub & Varga 1961, Golub & van Loan 1989, Axelsson 1996, Saad 2003, F & Parker 2012
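A sketch of “choosing the zeros”: if the τ_l are the reciprocals of the eigenvalues of M^{-1} A (affordable only because this toy problem is tiny), the error polynomial vanishes at every eigenvalue and the non-stationary iteration x^(k+1) = x^(k) + τ_k M^{-1}(b − A x^(k)) terminates exactly after m steps.

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = rng.standard_normal(3)
M = np.diag(np.diag(A))                      # Jacobi splitting, for illustration

lams = np.linalg.eigvals(np.linalg.solve(M, A)).real
taus = 1.0 / lams                            # place the zeros of P_m at the eigenvalues

x = np.zeros(3)
for tau in taus:                             # iteration operator G_tau = I - tau * M^{-1} A
    x = x + tau * np.linalg.solve(M, b - A @ x)
print(np.max(np.abs(x - np.linalg.solve(A, b))))   # ~ machine precision after m = 3 steps
```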

SLIDE 14

The best (Chebyshev) polynomial

10 iterations, factor of 300 improvement. Choose

1/τ_l = (λ_n + λ_1)/2 + ((λ_n − λ_1)/2) cos(π (2l + 1) / (2p)),  l = 0, 1, 2, ..., p − 1

where λ_1 and λ_n are the extreme eigenvalues of M^{-1} A.
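A sketch that computes the Chebyshev parameters from the formula above and compares the resulting error polynomial with the stationary one over [λ_1, λ_n]; the eigenvalue bounds and p = 10 are made-up illustrative values, not numbers from the talk.

```python
import numpy as np

def cheby_taus(lam1, lamn, p):
    """Chebyshev relaxation parameters: 1/tau_l as given on the slide."""
    l = np.arange(p)
    inv_tau = (lamn + lam1) / 2 + (lamn - lam1) / 2 * np.cos(np.pi * (2 * l + 1) / (2 * p))
    return 1.0 / inv_tau

lam1, lamn, p = 0.05, 1.95, 10                  # assumed extreme eigenvalues of M^{-1} A
taus = cheby_taus(lam1, lamn, p)

lam = np.linspace(lam1, lamn, 2001)
P_cheby = np.prod(1 - np.outer(lam, taus), axis=1)   # error polynomial with Chebyshev zeros
P_stat = (1 - lam) ** p                              # stationary error polynomial (tau = 1)
print(np.max(np.abs(P_cheby)), np.max(np.abs(P_stat)))   # Chebyshev is far smaller on [lam1, lamn]
```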

SLIDE 15

Second-order accelerated sampler

The first-order accelerated iteration turns out to be unstable. Numerical stability, and optimality at each step, is given by the second-order iteration

y^(k+1) = (1 − α_k) y^(k−1) + α_k y^(k) + α_k τ_k M^{-1} (c^(k) − A y^(k))

with α_k and τ_k chosen so that the error polynomial satisfies the Chebyshev recursion.

Theorem 2. 2nd-order solver converges ⇒ 2nd-order sampler converges (given the correct noise distribution).

The error polynomial is optimal, at each step, for both mean and covariance. The asymptotic average reduction factor (Axelsson 1996) is

σ = (1 − √(λ_1/λ_n)) / (1 + √(λ_1/λ_n))

Axelsson 1996, F & Parker 2012

SLIDE 16

Algorithm 1: Chebyshev accelerated SSOR sampling from N(0, A^{-1})

input: the SSOR splitting M, N of A; smallest eigenvalue λ_min of M^{-1} A; largest eigenvalue λ_max of M^{-1} A; relaxation parameter ω; initial state y^(0); k_max
output: y^(k) approximately distributed as N(0, A^{-1})

set γ = (2/ω − 1)^{1/2}, δ = ((λ_max − λ_min)/4)^2, θ = (λ_max + λ_min)/2;
set α = 1, β = 2/θ, τ = 1/θ, τ_c = 2/τ − 1;
for k = 0, 1, ..., k_max do
    if k = 0 then
        b = 2/α − 1, a = τ_c b, κ = τ;
    else
        b = 2(1 − α)/β + 1, a = τ_c + (b − 1)(1/τ + 1/κ − 1), κ = β + (1 − α)κ;
    end
    sample z ∼ N(0, I); c = γ b^{1/2} D^{1/2} z;
    x = y^(k) + M^{-1}(c − A y^(k));
    sample z ∼ N(0, I); c = γ a^{1/2} D^{1/2} z;
    w = x − y^(k) + M^{-T}(c − A x);
    y^(k+1) = α(y^(k) − y^(k−1) + τ w) + y^(k−1);
    β = (θ − βδ)^{-1}; α = θβ;
end

SLIDE 17

10 × 10 lattice (d = 100) sparse precision matrix

[A]_ij = 10^{-4} δ_ij + { n_i if i = j;  −1 if i ≠ j and ||s_i − s_j||_2 ≤ 1;  0 otherwise }

(n_i is the number of neighbouring sites s_j with ||s_i − s_j||_2 ≤ 1.)

[Figure: relative error vs. flops (×10^6) for SSOR (ω = 1), SSOR (ω = 0.2122), Cheby-SSOR (ω = 1), Cheby-SSOR (ω = 0.2122), and Cholesky.]

≈ 10^4 times faster
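A sketch constructing the lattice precision matrix above with scipy.sparse, assuming n_i counts the first-order lattice neighbours of site s_i.

```python
import numpy as np
import scipy.sparse as sp

def lattice_precision(m=10, eps=1e-4):
    """Sparse precision matrix for an m x m lattice:
    [A]_ii = eps + n_i, [A]_ij = -1 for neighbouring sites (||s_i - s_j||_2 <= 1)."""
    idx = lambda r, c: r * m + c
    rows, cols, vals = [], [], []
    for r in range(m):
        for c in range(m):
            i = idx(r, c)
            nbrs = [(r + dr, c + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= r + dr < m and 0 <= c + dc < m]
            rows.append(i); cols.append(i); vals.append(eps + len(nbrs))
            for rr, cc in nbrs:
                rows.append(i); cols.append(idx(rr, cc)); vals.append(-1.0)
    return sp.csr_matrix((vals, (rows, cols)), shape=(m * m, m * m))

A = lattice_precision()                 # d = 100; sparse and SPD
print(A.shape, A.nnz)
```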

SLIDE 18

100 × 100 × 100 lattice (d = 10^6) sparse precision matrix


  • Only used sparsity, no other special structure
SLIDE 19

CG sampling

[Figures: two panels in the (x_1, x_2) plane, “CG sampling & optimization” and “Gibbs sampling & Gauss-Seidel”, showing iterates x^(0), x^(1), x^(2) and search directions p^(0), p^(1).]

Mutually conjugate vectors (w.r.t. A) are independent directions:

V^T A V = D  ⇒  A^{-1} = V D^{-1} V^T

so if z ∼ N(0, I) then x = V D^{-1/2} z ∼ N(0, A^{-1}).

Schneider & Willsky 2003, F 2008, Parker & F SISC 2012
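A sketch of the conjugate-directions identity above: build mutually conjugate (A-orthogonal) vectors by Gram-Schmidt in the A inner product and push iid standard normals through V D^{-1/2}. CG/Lanczos produces such directions implicitly; this only illustrates why that yields N(0, A^{-1}).

```python
import numpy as np

rng = np.random.default_rng(2)
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
n = A.shape[0]

# A-orthogonalise the standard basis: afterwards V^T A V = D is diagonal.
V = np.eye(n)
for i in range(n):
    for j in range(i):
        V[:, i] -= (V[:, j] @ A @ V[:, i]) / (V[:, j] @ A @ V[:, j]) * V[:, j]
d = np.einsum('ji,jk,ki->i', V, A, V)            # diagonal of V^T A V

# x = V D^{-1/2} z has covariance V D^{-1} V^T = A^{-1} (V spans the whole space here).
Z = rng.standard_normal((n, 100_000))
X = V @ (Z / np.sqrt(d)[:, None])
print(np.max(np.abs(np.cov(X) - np.linalg.inv(A))))   # small (Monte Carlo error only)
```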

SLIDE 20

CG sampling algorithm

  • Apart from step 3, this is exactly (linear) CG optimization (a sketch follows below)
  • Var(y_k) is the CG polynomial

Parker & F SISC 2012
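The CG sampling algorithm itself is not reproduced in this transcript, so the following is a hedged sketch of the idea as described in Parker & F SISC 2012: run ordinary linear CG on Ax = b and, at each iteration (presumably the “step 3” referred to above), add an independent N(0, 1) multiple of the A-normalised search direction to a sample vector, so that Var(y_k) approximates A^{-1} on the Krylov space explored so far (cf. Theorem 3 on the next slide).

```python
import numpy as np

def cg_sample(A, b, tol=1e-10, max_iter=None, rng=None):
    """Sketch of a CG sampler: ordinary CG on A x = b plus one sampling step.

    Alongside the usual solution update, accumulate
        y <- y + z_k * p_k / sqrt(p_k^T A p_k),   z_k ~ N(0, 1),
    so Var(y) approximates A^{-1} on the Krylov space (a sketch, not the
    published algorithm verbatim).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(b)
    max_iter = n if max_iter is None else max_iter
    x = np.zeros(n)                     # CG solution iterate
    y = np.zeros(n)                     # the sample
    r = b - A @ x
    p = r.copy()
    for _ in range(max_iter):
        Ap = A @ p
        dk = p @ Ap                     # p_k^T A p_k
        gamma = (r @ r) / dk
        x = x + gamma * p               # usual CG update of the solution
        y = y + rng.standard_normal() * p / np.sqrt(dk)   # extra sampling step
        r_new = r - gamma * Ap
        if np.linalg.norm(r_new) < tol:
            break
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return x, y
```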

SLIDE 21

Best approximation property

Theorem 3 (Parker 2009). The covariance matrix Var(y_k | x_0, b_0) = V_k T_k^{-1} V_k^T has k non-zero eigenvalues, which are the Lanczos estimates of the eigenvalues of A^{-1}:

σ_i(Var(y_k | x_0, b_0)) = 1/θ_i^k.

The eigenvectors of Var(y_k | x_0, b_0) are the Ritz vectors V_k v_i, which estimate the eigenvectors of A. When ||r_k||_2 = 0, then Var(b_k | x_0, b_0) = V_k T_k V_k^T and the k non-zero eigenpairs of Var(b_k | x_0, b_0) are the Lanczos Ritz pairs (θ_i^k, V_k v_i).

Var(y_k | x_0, b_0) approximates A^{-1} and Var(b_k | x_0, b_0) approximates A in the eigenspaces corresponding to the extreme and well-separated eigenvalues of A.

Parker & F SISC 2012

SLIDE 22

CG sampling example

So CG (by itself) over-smooths: initialize Gibbs with CG

Parker & F SISC 2012, Schneider & Willsky 2003

SLIDE 23

Some observations

In the Gaussian setting

  • GS ≡ GS (Gauss-Seidel ≡ Gibbs sampler)
  • ... stochastic relaxation is fundamentally equivalent to classical relaxation
  • ... existing solvers (multigrid, fast multipole, parallel tools) can be used as is
  • sampling has the same computational cost as solving
  • ... and is sometimes cheaper

more generally

  • acceleration of convergence in mean and covariance is not limited to Gaussian targets
  • ... and has been applied in some settings