Understanding MCMC




1. Understanding MCMC
   Marcel Lüthi, University of Basel. Slides based on a presentation by Sandro Schönborn.

2. The big picture
   The Metropolis-Hastings algorithm induces a Markov chain which satisfies the detailed balance condition for p(x). If the Markov chain is aperiodic and irreducible, it converges to its equilibrium distribution, and it then generates samples from the distribution p(x).

3. Understanding Markov Chains

4. Markov Chain
   • A sequence of random variables X_1, X_2, ..., X_N with state space S, X_i ∈ S, and joint distribution
     P(X_1, X_2, ..., X_N) = P(X_1) ∏_{i=2}^{N} P(X_i | X_{i−1})
     i.e. an initial distribution P(X_1) and transition probabilities P(X_i | X_{i−1}).
   • Simplifications (for our analysis):
     • Discrete state space: S = {1, 2, ..., K} (automatically true if we use computers, e.g. 32-bit floats)
     • Homogeneous chain: P(X_i = k | X_{i−1} = l) = T_kl
   (Slide figure: a four-state chain, states 1 to 4, with transition probabilities such as 1/6, 1/3, 1/2 and 1.)
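To make the definition concrete, here is a minimal sketch (not from the slides; the matrix values are hypothetical) of a homogeneous chain on a discrete state space: a column-stochastic transition matrix T with T[k, l] = P(X_i = k | X_{i−1} = l), sampled column by column:

```python
import numpy as np

# Hypothetical 2-state transition matrix; T[k, l] = P(X_i = k | X_{i-1} = l),
# so every column must sum to 1 (column-stochastic).
T = np.array([[0.9, 0.3],
              [0.1, 0.7]])
assert np.allclose(T.sum(axis=0), 1.0)

def step(T, x, rng):
    """One transition: the current state x selects column x of T,
    which is the distribution of the next state."""
    return rng.choice(T.shape[0], p=T[:, x])

rng = np.random.default_rng(0)
x, trajectory = 0, []
for _ in range(20):
    x = step(T, x, rng)
    trajectory.append(x)
print(trajectory)
```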

5. Example: Markov Chain
   • Simple weather model: dry (D) or rainy (R) in each hour
   • State space S = {D, R}
   • Condition in the next hour, X_{t+1}:
     • Stochastic: P(X_{t+1} | X_t)
     • Depends only on the current condition X_t
     • Transition probabilities: D stays dry with 0.95 (turns rainy with 0.05); R stays rainy with 0.8 (turns dry with 0.2)
   • Draw samples from the chain, starting at X_0 = D and evolving via P(X_{t+1} | X_t):
     DDDDDDDDRRRRRRRRRRRDDDDDDDDDDD
     DDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
     DDDDDDDDDRDD...
   • Long-term behavior: does it converge, and what is the average probability of rain?
   • Dynamics: how quickly does it converge?
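This weather chain is easy to simulate directly; a small sketch (assuming the transition numbers from the slide) that produces sample strings like the ones shown:

```python
import numpy as np

# Columns = current state, rows = next state; state order (D, R).
T = np.array([[0.95, 0.2],    # P(D <- D), P(D <- R)
              [0.05, 0.8]])   # P(R <- D), P(R <- R)
states = "DR"

rng = np.random.default_rng(1)
x = 0                          # X_0 = D
hours = []
for _ in range(90):
    x = rng.choice(2, p=T[:, x])
    hours.append(states[x])
print("".join(hours))          # e.g. DDDDDDDRRRRRDDDD...
```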

6. Discrete Homogeneous Markov Chain
   Formally, in linear algebra terms:
   • Distribution (vector):
     P(X_i):  p_i = (P(X_i = 1), ..., P(X_i = K))^T
   • Transition probability (transition matrix):
     P(X_i | X_{i−1}):  T = ( P(1 ← 1) ⋯ P(1 ← K)
                              ⋮        ⋱  ⋮
                              P(K ← 1) ⋯ P(K ← K) )
     T_kl = P(k ← l) = P(X_i = k | X_{i−1} = l)

7. Evolution of the Initial Distribution
   • Evolution of P(X_1) → P(X_2):
     P(X_2 = k) = Σ_{l ∈ S} P(k ← l) P(X_1 = l),  i.e.  p_2 = T p_1
   • Evolution over n steps:  p_{n+1} = T^n p_1
   • Is there a stable distribution p* (steady state)?
     p* = T p*: a stable distribution is an eigenvector of T with eigenvalue λ = 1
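Both views, the power iteration p_{n+1} = T^n p_1 and the eigenvector characterization, can be checked numerically; a sketch using the stable-weather matrix that appears on slide 12:

```python
import numpy as np

# Stable-weather transition matrix (slide 12); columns are "from" states
# (D, R), rows are "to" states.
T = np.array([[0.95, 0.2],
              [0.05, 0.8]])

# p_{n+1} = T^n p_1: push an arbitrary initial distribution forward.
p = np.array([0.5, 0.5])
for _ in range(50):
    p = T @ p
print("after 50 steps:", p)           # ~ [0.8, 0.2]

# A stable distribution is an eigenvector of T with eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(T)
k = np.argmin(np.abs(eigvals - 1.0))
p_star = np.real(eigvecs[:, k])
p_star /= p_star.sum()                # normalize to a probability vector
print("p* =", p_star)                 # [0.8, 0.2]
```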

8. Steady-State Distribution p*
   • It exists:
     • T is subject to the normalization constraint Σ_k T_kl = 1, i.e. (1 1 ... 1) T = (1 1 ... 1), so the all-ones vector is a left eigenvector with eigenvalue 1.
     • Left and right eigenvalues are the same, so T also has the eigenvalue λ = 1 with a right eigenvector: the steady-state distribution, T p* = p*.
   • Does an arbitrary initial distribution evolve to p*? Convergence? Uniqueness?
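The existence argument is just the column-sum constraint; a two-line numeric check with the same weather matrix:

```python
import numpy as np

T = np.array([[0.95, 0.2],
              [0.05, 0.8]])

# Columns sum to 1, so (1 1) T = (1 1): the all-ones vector is a left
# eigenvector with eigenvalue 1, hence 1 is also a right eigenvalue of T.
print(np.ones(2) @ T)            # [1. 1.]
print(np.linalg.eigvals(T))      # [1.   0.75]
```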

9. Equilibrium Distribution p*
   • Additional requirement for T:  (T^n)_kl > 0 for n > N_0
     • All states are connected using at most N_0 steps
     • Return intervals to a certain state are irregular
     The chain is then called irreducible and aperiodic (which implies ergodic).
   • Perron-Frobenius theorem for positive matrices:
     • PF1: λ_1 = 1 is a simple eigenvalue with a 1-d eigenspace (uniqueness)
     • PF2: λ_1 = 1 is dominant, |λ_i| < 1 for all i ≠ 1 (convergence)
   • p* is a stable attractor, called the equilibrium distribution:  T p* = p*
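A quick illustration of why aperiodicity matters (the counterexample matrix is my own, not from the slides): a chain that deterministically alternates between two states has an eigenvalue −1 of modulus 1, so PF2 fails and the distribution never settles; the weather chain, by contrast, already has all-positive entries at n = 1.

```python
import numpy as np

# Periodic chain: deterministic alternation between the two states.
# No power of T_periodic is strictly positive, and the spectrum contains
# -1, so |lambda_2| < 1 fails: the distribution oscillates forever.
T_periodic = np.array([[0.0, 1.0],
                       [1.0, 0.0]])
print(np.linalg.eigvals(T_periodic))                 # [ 1. -1.]

# The weather chain is irreducible and aperiodic: T^1 > 0 already.
T = np.array([[0.95, 0.2],
              [0.05, 0.8]])
print(np.all(np.linalg.matrix_power(T, 1) > 0))      # True
```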

10. Convergence
    • Time evolution of an arbitrary distribution p_0:  p_n = T^n p_0
    • Expand p_0 in the eigenbasis of T, where T e_i = λ_i e_i, |λ_i| < λ_1 = 1 and |λ_k| ≥ |λ_{k+1}|:
      p_0 = Σ_{i=1}^{K} c_i e_i
      T p_0 = Σ_i c_i λ_i e_i
      T^n p_0 = Σ_i c_i λ_i^n e_i = c_1 e_1 + λ_2^n c_2 e_2 + λ_3^n c_3 e_3 + ⋯

11. Convergence (II)
    T^n p_0 = Σ_i c_i λ_i^n e_i = c_1 e_1 + λ_2^n c_2 e_2 + λ_3^n c_3 e_3 + ⋯ ≈ p* + λ_2^n c_2 e_2   (n ≫ 1)
    with normalizations Σ_j p*_j = 1 and c_1 e_1 = p*.
    • We have convergence:  T^n p_0 → p*  as  n → ∞
    • Rate of convergence:  ‖p_n − p*‖ ≈ |λ_2|^n ‖c_2 e_2‖
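The geometric rate |λ_2|^n is visible numerically; a sketch with the stable-weather matrix (λ_2 = 0.75), starting from "certainly rainy":

```python
import numpy as np

T = np.array([[0.95, 0.2],
              [0.05, 0.8]])
p_star = np.array([0.8, 0.2])          # equilibrium of this chain

p = np.array([0.0, 1.0])               # start from "certainly rainy"
prev_err = np.linalg.norm(p - p_star)
for n in range(1, 9):
    p = T @ p
    err = np.linalg.norm(p - p_star)
    print(f"n={n}  error={err:.5f}  ratio={err / prev_err:.3f}")
    prev_err = err                     # ratio settles at |lambda_2| = 0.75
```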

12. Example: Weather Dynamics
    Rain forecast for stable versus mixed weather (same states D, R):
    • Stable:  T_s = ( 0.95  0.2
                       0.05  0.8 ),  eigenvalues 1 and 0.75
    • Mixed:   T_m = ( 0.85  0.6
                       0.15  0.4 ),  eigenvalues 1 and 0.25
    Both share the equilibrium p* = (0.8, 0.2): the long-term average probability of rain is 20%.
    Rainy now; what do the next hours look like?
    • Stable:  RRRRDDDDDDDDDDDDDDDDDDDDDDDDD...
    • Mixed:   RDDDDDDDDDDDDDDDRDDDRDDDDDDDD...
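Comparing the two forecasts in code (matrices taken from this slide) shows the identical equilibrium but very different mixing speeds:

```python
import numpy as np

# Same equilibrium, different dynamics: the second eigenvalue sets how
# quickly the forecast forgets the current weather.
T_stable = np.array([[0.95, 0.2], [0.05, 0.8]])
T_mixed  = np.array([[0.85, 0.6], [0.15, 0.4]])

for name, T in [("stable", T_stable), ("mixed", T_mixed)]:
    print(name, "eigenvalues:", np.sort(np.abs(np.linalg.eigvals(T)))[::-1])
    p = np.array([0.0, 1.0])           # rainy now
    probs = []
    for _ in range(6):
        p = T @ p
        probs.append(p[1])             # probability of rain
    print("  P(rain) over the next hours:", np.round(probs, 3))
```

The stable chain decays to the 20% baseline at rate 0.75 per hour, the mixed chain at rate 0.25, which is why its rain probability drops almost immediately.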

13. Markov Chain: First Results
    • Aperiodic and irreducible chains are ergodic (every state is reachable after > N_0 steps, with irregular return times):
      they converge towards a unique equilibrium distribution p*.
    • Equilibrium distribution p*: eigenvector of T with eigenvalue λ = 1,  T p* = p*.
    • Rate of convergence: exponential decay governed by the second-largest eigenvalue, ∝ |λ_2|^n.
    Only useful if we can design a chain with a desired equilibrium distribution!

14. Detailed Balance
    • A special property of some Markov chains: a distribution p satisfies detailed balance if the total flow of probability between every pair of states is equal (a local equilibrium):
      P(k ← l) p_l = P(l ← k) p_k
    • Detailed balance implies that p is the equilibrium distribution:
      (T p)_k = Σ_l T_kl p_l = Σ_l T_lk p_k = p_k
    • Most MCMC methods construct chains which satisfy detailed balance.
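Detailed balance is a pairwise symmetry and is easy to test numerically; a small check using the weather matrix and its equilibrium from the earlier slides:

```python
import numpy as np

T = np.array([[0.95, 0.2],
              [0.05, 0.8]])
p = np.array([0.8, 0.2])

# Detailed balance: T[k, l] * p[l] == T[l, k] * p[k] for every pair (k, l).
flow = T * p                        # flow[k, l] = P(k <- l) p_l
print(np.allclose(flow, flow.T))    # True: the flow matrix is symmetric
```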

15. The Metropolis-Hastings Algorithm
    MCMC to draw samples from an arbitrary distribution

16. Idea of the Metropolis-Hastings algorithm
    • Design a Markov chain which satisfies the detailed balance condition
      T_MH(x′ ← x) P(x) = T_MH(x ← x′) P(x′)
    • Ergodicity then ensures that the chain converges to this distribution.

17. Attempt 1: A simple algorithm
    • Initialize with a sample x
    • Generate the next sample from the current sample x:
      1. Draw a sample x′ from Q(x′ | x) (the "proposal")
      2. Emit x′ as the new sample
    • It is a Markov chain
    • But detailed balance would require choosing a matching Q for every target P:
      Q(x′ ← x) P(x) = Q(x ← x′) P(x′)

18. Attempt 2: A more general solution
    • Initialize with a sample x
    • Generate the next sample from the current sample x:
      1. Draw a sample x′ from Q(x′ | x) ("proposal")
      2. With probability a(x′ | x), emit x′ as the new sample
      3. With probability 1 − a(x′ | x), emit x as the new sample
    • It is a Markov chain
    • The acceptance rule a decouples the proposal Q from the target P
    • How do we choose a?

19. What is the acceptance function a?
    T_MH(x′ ← x) P(x) = T_MH(x ← x′) P(x′)
    a(x′ | x) Q(x′ | x) P(x) = a(x | x′) Q(x | x′) P(x′)
    Case A: x′ = x
    • Detailed balance is trivially satisfied for every a(x′ | x)
    Case B: x′ ≠ x
    • We obtain the requirement
      a(x′ | x) / a(x | x′) = [ Q(x | x′) P(x′) ] / [ Q(x′ | x) P(x) ]

20. What is the acceptance function a?
    Requirement: choose a(x′ | x) such that
      a(x′ | x) / a(x | x′) = [ Q(x | x′) P(x′) ] / [ Q(x′ | x) P(x) ]
    • a(x′ | x) is a probability: 0 ≤ a(x′ | x) ≤ 1
    • It is easy to check that
      a(x′ | x) = min( 1, [ Q(x | x′) P(x′) ] / [ Q(x′ | x) P(x) ] )
    satisfies this property. (A minimal implementation sketch follows below.)
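Putting slides 17 to 20 together, here is a minimal Metropolis-Hastings sketch. It is not from the slides: the target (a 1-d Gaussian known only up to normalization) and the Gaussian random-walk proposal are my own illustrative choices. Because the random-walk proposal is symmetric, the Q-ratio cancels in a(x′ | x).

```python
import numpy as np

rng = np.random.default_rng(0)

def p_unnorm(x):
    """Hypothetical target density, known only up to normalization."""
    return np.exp(-0.5 * (x - 3.0) ** 2)       # Gaussian with mean 3, std 1

def metropolis_hastings(n_samples, x0=0.0, step=1.0):
    """Random-walk MH: Q(x'|x) = N(x, step^2) is symmetric, so the ratio
    Q(x|x')/Q(x'|x) cancels in the acceptance probability a(x'|x)."""
    x = x0
    samples = []
    for _ in range(n_samples):
        x_prime = x + step * rng.normal()      # 1. propose x' ~ Q(x'|x)
        a = min(1.0, p_unnorm(x_prime) / p_unnorm(x))
        if rng.uniform() < a:                  # 2. accept with probability a
            x = x_prime
        samples.append(x)                      # 3. emit the current state
    return np.array(samples)

xs = metropolis_hastings(50_000)
print(xs[1000:].mean(), xs[1000:].std())       # ~3.0 and ~1.0 after burn-in
```

With an asymmetric proposal, the factor Q(x | x′) / Q(x′ | x) has to be kept in the acceptance ratio, exactly as in the formula on the slide.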

