Zig-Zag Monte Carlo. Joris Bierkens, Delft University of Technology

  1. Zig-Zag Monte Carlo. Joris Bierkens (TU Delft), Delft University of Technology. February 7, 2017.

  2. Acknowledgements
     Collaborators: Andrew Duncan, Paul Fearnhead, Antonietta Mira, Gareth Roberts.
     Financial support: [sponsor names not preserved in this transcript]

  3. Outline
     1. Motivation: Markov Chain Monte Carlo
     2. One-dimensional Zig-Zag process
     3. Multi-dimensional ZZP
     4. Subsampling
     5. Doubly intractable likelihood

  4. Bayesian inference
     In Bayesian inference we typically deal with a posterior density
     $$\pi(x) = \pi(x; y) \propto L(y \mid x)\,\pi_0(x), \qquad x \in \mathbb{R}^d,$$
     where $L(y \mid x)$ is the likelihood of the data $y$ given the parameter $x \in \mathbb{R}^d$, and $\pi_0$ is a prior density for $x$. Quantities of interest include:
     • posterior mean $\int x\,\pi(x)\,dx$,
     • posterior variance $\int x^2\,\pi(x)\,dx - \left(\int x\,\pi(x)\,dx\right)^2$,
     • tail probability $\int \mathbb{1}_{\{x \ge c\}}\,\pi(x)\,dx$.
     All of these involve integrals of the form $\int h(x)\,\pi(x)\,dx$.

  8. Evaluating $\int h(x)\,\pi(x)\,dx$
     Possible approaches:
     1. Explicit (analytic) integration. Rarely possible.
     2. Numerical integration. Suffers from the curse of dimensionality.
     3. Monte Carlo. Draw independent samples $(X_1, X_2, \dots)$ from $\pi$ and use the law of large numbers. Requires independent samples from $\pi$.
     4. Markov Chain Monte Carlo. Construct an ergodic Markov chain $(X_1, X_2, \dots)$ with invariant distribution $\pi(x)\,dx$ and use Birkhoff's ergodic theorem.
     For approaches 3 and 4:
     $$\int h(x)\,\pi(x)\,dx = \lim_{K \to \infty} \frac{1}{K} \sum_{k=1}^{K} h(X_k).$$
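     A minimal sketch of approach 3 (plain Monte Carlo), assuming for illustration that $\pi$ is the standard normal density so that independent draws are available. The helper names `monte_carlo` and `std_normal`, the sample size, and the threshold $c = 1$ are hypothetical choices, not part of the slides.

```python
import random


def monte_carlo(h, sampler, K, seed=0):
    """Approximate the integral of h(x) * pi(x) dx by (1/K) * sum of h(X_k)."""
    rng = random.Random(seed)
    return sum(h(sampler(rng)) for _ in range(K)) / K


def std_normal(rng):
    """Assumed target: independent draws from N(0, 1)."""
    return rng.gauss(0.0, 1.0)


K = 100_000
mean_est = monte_carlo(lambda x: x, std_normal, K)            # posterior mean
second_moment = monte_carlo(lambda x: x * x, std_normal, K)   # E[x^2]
var_est = second_moment - mean_est ** 2                       # posterior variance
tail_est = monte_carlo(lambda x: 1.0 if x >= 1.0 else 0.0, std_normal, K)  # P(x >= 1)

print(mean_est, var_est, tail_est)
```

     Each quantity of interest is the same estimator with a different choice of $h$. For approach 4 the only change is that the draws come from a Markov chain instead of being independent.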

  9. One-dimensional Zig-Zag process
     Dynamics:
     • Continuous time.
     • Current state $(X(t), \Theta(t)) \in \mathbb{R} \times \{-1, +1\}$.
     • Move $X(t)$ in direction $\Theta(t) = \pm 1$ until a switch occurs.
     • The switching intensity is $\lambda(X(t), \Theta(t))$.
     [Figure: sample trajectory of the process; horizontal axis from 0 to 100, vertical axis from -2.5 to 2.]

  13. Relation between switching rate and potential
     $$L f(x, \theta) = \theta \frac{df}{dx} + \lambda(x, \theta)\,\bigl(f(x, -\theta) - f(x, \theta)\bigr), \qquad x \in \mathbb{R},\ \theta \in \{-1, +1\}.$$
     • Potential $U(x) = -\log \pi(x)$.
     • $\pi$ is invariant if and only if $\lambda(x, +1) - \lambda(x, -1) = U'(x)$ for all $x$.
     • Equivalently, $\lambda(x, \theta) = \gamma(x) + \max(0, \theta U'(x))$, with $\gamma(x) \ge 0$.
     Example: Gaussian distribution $N(0, \sigma^2)$
     • Density $\pi(x) \propto \exp(-x^2/(2\sigma^2))$
     • Potential $U(x) = x^2/(2\sigma^2)$
     • Derivative $U'(x) = x/\sigma^2$
     • Switching rates $\lambda(x, \theta) = (\theta x/\sigma^2)^+ + \gamma(x)$
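     The Gaussian example can be simulated directly. The sketch below assumes $\gamma(x) = 0$ and samples each switching time by inverting the integrated rate: along a segment started at $x$ with direction $\theta$, the rate at elapsed time $s$ is $(\theta x + s)^+/\sigma^2$, so with $a = \theta x$ and $E \sim \mathrm{Exp}(1)$ the switching time is $\tau = -a + \sqrt{\max(a, 0)^2 + 2\sigma^2 E}$. The function name `zigzag_gaussian`, the number of switches, and the use of time averages over the piecewise-linear path are illustrative assumptions, not taken from the slides.

```python
import math
import random


def zigzag_gaussian(sigma=1.0, n_switches=50_000, seed=0):
    """Simulate a 1D Zig-Zag process targeting N(0, sigma^2) with gamma = 0.

    Returns time-average estimates of the mean and variance of the target.
    """
    rng = random.Random(seed)
    x, theta = 0.0, 1          # position and direction (+1 or -1)
    total_time = 0.0
    int_x = 0.0                # integral of X(t) dt along the path
    int_x2 = 0.0               # integral of X(t)^2 dt along the path

    for _ in range(n_switches):
        a = theta * x
        e = rng.expovariate(1.0)
        # Solve: integral_0^tau max(0, a + s) / sigma^2 ds = e
        tau = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * sigma ** 2 * e)

        # Exact integrals of x + theta*s and (x + theta*s)^2 over [0, tau]
        int_x += x * tau + theta * tau ** 2 / 2.0
        int_x2 += x ** 2 * tau + x * theta * tau ** 2 + tau ** 3 / 3.0
        total_time += tau

        x += theta * tau       # move along the straight segment
        theta = -theta         # flip direction at the switch

    mean = int_x / total_time
    var = int_x2 / total_time - mean ** 2
    return mean, var


mean_est, var_est = zigzag_gaussian(sigma=1.0)
print(mean_est, var_est)
```

     Because the path is piecewise linear, the time averages of $x$ and $x^2$ can be integrated exactly on each segment, so no discretisation of the trajectory is needed.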

  19. Proof of invariance of $\pi \propto \exp(-U)$
     $$L f(x, \theta) = \theta \frac{\partial f}{\partial x}(x, \theta) + \lambda(x, \theta)\,\bigl(f(x, -\theta) - f(x, \theta)\bigr), \qquad \lambda(x, +1) - \lambda(x, -1) = U'(x).$$
     Markov semigroup: $P(t) f(x, \theta) = \mathbb{E}_{x,\theta}\, f(X(t), \Theta(t))$.
     $\pi$ stationary means that
     $$\sum_{\theta = \pm 1} \int_{\mathbb{R}} P(t) f(x, \theta)\,\pi(x)\,dx = \sum_{\theta = \pm 1} \int_{\mathbb{R}} f(x, \theta)\,\pi(x)\,dx, \qquad f \in D(L),\ t \ge 0.$$
     Differentiating gives the equivalent condition: $\sum_{\theta = \pm 1} \int_{\mathbb{R}} L f(x, \theta)\,\pi(x)\,dx = 0$, $f \in D(L)$.
     $$\sum_{\theta = \pm 1} \int_{\mathbb{R}} \lambda(x, \theta)\,\bigl(f(x, -\theta) - f(x, \theta)\bigr)\,\pi(x)\,dx = \int_{\mathbb{R}} \bigl\{ \lambda(x, +1)\,\bigl(f(x, -1) - f(x, +1)\bigr) + \lambda(x, -1)\,\bigl(f(x, +1) - f(x, -1)\bigr) \bigr\}\,\pi(x)\,dx$$
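     The transcript ends partway through this computation. One way the remaining steps could go is sketched below, assuming that $f(\cdot, \theta)\,\pi$ vanishes at $\pm\infty$ so that the boundary terms in the integration by parts drop out (an assumption not stated on the slide), and using $\pi'(x) = -U'(x)\,\pi(x)$.

```latex
\begin{align*}
% Switching part: collect terms and apply \lambda(x,+1) - \lambda(x,-1) = U'(x)
\sum_{\theta = \pm 1} \int_{\mathbb{R}} \lambda(x,\theta)\,\bigl(f(x,-\theta) - f(x,\theta)\bigr)\,\pi(x)\,dx
  &= \int_{\mathbb{R}} \bigl(\lambda(x,+1) - \lambda(x,-1)\bigr)\,\bigl(f(x,-1) - f(x,+1)\bigr)\,\pi(x)\,dx \\
  &= -\int_{\mathbb{R}} U'(x)\,\bigl(f(x,+1) - f(x,-1)\bigr)\,\pi(x)\,dx, \\[4pt]
% Transport part: integrate by parts, with \pi' = -U' \pi
\sum_{\theta = \pm 1} \int_{\mathbb{R}} \theta\,\frac{\partial f}{\partial x}(x,\theta)\,\pi(x)\,dx
  &= -\sum_{\theta = \pm 1} \int_{\mathbb{R}} \theta\,f(x,\theta)\,\pi'(x)\,dx \\
  &= \int_{\mathbb{R}} U'(x)\,\bigl(f(x,+1) - f(x,-1)\bigr)\,\pi(x)\,dx.
\end{align*}
```

     The two parts cancel, which gives $\sum_{\theta = \pm 1} \int_{\mathbb{R}} L f(x, \theta)\,\pi(x)\,dx = 0$.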

