SLIDE 1

Gaussian Processes to Speed up Hamiltonian Monte Carlo

Matthieu Lê

Journal Club 11/04/14

Neal, Radford M. (2011). "MCMC Using Hamiltonian Dynamics." In Steve Brooks, Andrew Gelman, Galin L. Jones, and Xiao-Li Meng (eds.), Handbook of Markov Chain Monte Carlo. Chapman and Hall/CRC.
Rasmussen, Carl Edward (2003). "Gaussian Processes to Speed up Hybrid Monte Carlo for Expensive Bayesian Integrals." In Bayesian Statistics 7: Proceedings of the 7th Valencia International Meeting. Oxford University Press.
Murray, Iain. MCMC tutorial, http://videolectures.net/mlss09uk_murray_mcmc/

SLIDE 2

MCMC


  • Monte Carlo : rejection sampling, importance sampling, …
  • MCMC : Markov chain Monte Carlo
  • A sampling technique to estimate a probability distribution

Draw samples from complex probability distributions

In general : ∫ f(x) P(x) dx ≈ (1/T) Σ_{t=1}^{T} f(x^(t)),  with x^(t) ~ P(x)
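This estimator can be checked with a short sketch (a minimal illustration using NumPy; the function names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_estimate(f, sampler, T=100_000):
    """Approximate the integral of f(x) P(x) dx by the average of f
    over T samples x^(t) drawn from P(x)."""
    return np.mean(f(sampler(T)))

# Example: under a standard normal, E[x^2] = Var(x) = 1.
est = mc_estimate(lambda x: x**2, lambda T: rng.standard_normal(T))
```

With 100,000 samples the Monte Carlo error is of order 1/√T, so the estimate lands close to the exact value 1.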

SLIDE 3

Bayesian Inference :

  • θ : model parameter
  • x_i : observations

P(θ | x₁ … xₙ) = P(x₁ … xₙ | θ) P(θ) / ∫ P(x₁ … xₙ | θ) P(θ) dθ

P(θ | x₁ … xₙ) = (1/Z) P(x₁ … xₙ | θ)

MCMC : sample P(θ | x₁ … xₙ) without knowledge of Z

Objective


  • P(θ) : prior on the model parameters
  • P(x₁ … xₙ | θ) : likelihood, potentially expensive to evaluate
SLIDE 4
  • 1. Propose a new θ′ from the transition function Q(θ′, θ) (e.g. N(θ, σ²)). The transition from one parameter to another is a Markov chain.
  • 2. Accept the new parameter θ′ with probability min(1, P(θ′) Q(θ, θ′) / (P(θ) Q(θ′, θ)))

Q must be chosen to fulfill some technical requirements. Samples are not independent.

Metropolis-Hastings


We want to draw samples from P(θ)

Sample u ~ U(0, 1) for the accept/reject decision
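The two steps above can be sketched as a minimal random-walk Metropolis sampler; with a symmetric Gaussian proposal the Q terms cancel in the acceptance ratio, and only the unnormalized density is needed (names and the test target are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def metropolis_hastings(log_p, theta0, step=0.1, n_samples=1000):
    """Random-walk Metropolis: propose theta' ~ N(theta, step^2) and accept
    with probability min(1, P(theta') / P(theta)). Since log_p need only be
    known up to a constant, the normalization Z never appears."""
    theta = theta0
    samples, rejected = [], 0
    for _ in range(n_samples):
        proposal = theta + step * rng.standard_normal()
        # Accept if u < P(theta') / P(theta), i.e. log u < difference of log densities.
        if np.log(rng.uniform()) < log_p(proposal) - log_p(theta):
            theta = proposal
        else:
            rejected += 1
        samples.append(theta)
    return np.array(samples), rejected / n_samples

# Target: standard normal, log-density known only up to a constant.
samples, rej = metropolis_hastings(lambda t: -0.5 * t**2, theta0=0.0)
```

Note that a rejected proposal still produces a sample (the current θ is repeated), which is what makes the chain's stationary distribution correct.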

SLIDE 5

Metropolis-Hastings


Problem : the proposed samples come from a Gaussian, so the rejection rate can be high. Metropolis-Hastings : 1000 samples, step 0.1, rejection rate : 44%

SLIDE 6

Hamiltonian Monte Carlo


Same idea as Metropolis-Hastings BUT the proposed samples now come from the Hamiltonian dynamics (θ, p) → (θ∗, p∗) :

∂θ/∂t = ∂E_kin/∂p = p/m
∂p/∂t = −∂E_pot/∂θ

with H = E_pot + E_kin, E_pot = −log P(θ), E_kin = p²/2m

SLIDE 7

Hamiltonian Monte Carlo


The energy is conserved, so the acceptance probability should theoretically be 1. Because of finite numerical precision, we still need a Metropolis-Hastings type decision at the end. Algorithm :

  • Sample p according to its known distribution
  • Run the Hamiltonian dynamics for a time T
  • Accept the new sample with probability :

min(1, exp(−E_pot∗ + E_pot) exp(−E_kin∗ + E_kin))
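The algorithm above can be sketched with the standard leapfrog integrator for the dynamics (a minimal 1-D sketch; function names and the test target are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

def hmc_step(log_p, grad_log_p, theta, eps=0.01, L=200, m=1.0):
    """One HMC transition for a scalar parameter: draw a momentum p ~ N(0, m),
    integrate Hamilton's equations with L leapfrog steps of size eps, then
    accept/reject on the change in H = E_pot + E_kin, with
    E_pot = -log P(theta) and E_kin = p^2 / (2m)."""
    p0 = np.sqrt(m) * rng.standard_normal()
    theta_new, p = theta, p0
    p += 0.5 * eps * grad_log_p(theta_new)        # half momentum step
    for i in range(L):
        theta_new += eps * p / m                  # full position step
        if i < L - 1:
            p += eps * grad_log_p(theta_new)      # full momentum step
    p += 0.5 * eps * grad_log_p(theta_new)        # final half momentum step
    h_old = -log_p(theta) + p0**2 / (2 * m)
    h_new = -log_p(theta_new) + p**2 / (2 * m)
    # Accept with probability min(1, exp(-E_pot* + E_pot) exp(-E_kin* + E_kin)).
    if np.log(rng.uniform()) < h_old - h_new:
        return theta_new, True
    return theta, False

# Target: standard normal. With a small step size the energy is almost exactly
# conserved, so nearly every proposal is accepted.
theta, n_accept = 0.0, 0
for _ in range(100):
    theta, accepted = hmc_step(lambda t: -0.5 * t**2, lambda t: -t, theta)
    n_accept += accepted
```

Since ∂p/∂t = −∂E_pot/∂θ = ∂(log P)/∂θ, the momentum updates use the gradient of the log density directly.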

SLIDE 8

Hamiltonian Monte Carlo


[Figure : comparison of Gibbs sampling, Metropolis-Hastings, and Hamiltonian Monte Carlo]

SLIDE 9

Hamiltonian Monte Carlo


Advantage : the Hamiltonian stays (approximately) constant during the dynamics, hence a lower rejection rate ! Problem : computing the Hamiltonian dynamics requires the model's partial derivatives, i.e. a high number of model evaluations !

Neal, Radford M. (2011). "MCMC Using Hamiltonian Dynamics." In Steve Brooks, Andrew Gelman, Galin L. Jones, and Xiao-Li Meng (eds.), Handbook of Markov Chain Monte Carlo. Chapman and Hall/CRC.

1000 samples, L = 200, ε = 0.01, rejection rate = 0%

SLIDE 10

Gaussian Process HMC


Same algorithm as HMC BUT the Hamiltonian dynamics are computed using a Gaussian process approximating E_pot in ∂p/∂t = −∂E_pot/∂θ. A Gaussian process is a distribution over smooth functions used to approximate E_pot :

P(E_pot | θ) ~ N(0, Σ),  Σ_pq = w₀ exp(−(1/2) Σ_{d=1}^{D} (x_d^p − x_d^q)² / w_d²)
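This covariance is the squared-exponential kernel; a minimal sketch (the hyperparameter names w0 for the amplitude and w for the length scales are placeholders):

```python
import numpy as np

def se_covariance(X1, X2, w0=1.0, w=1.0):
    """Sigma_pq = w0 * exp(-1/2 * sum_d (x_d^p - x_d^q)^2 / w_d^2)
    for two sets of points of shape (n1, D) and (n2, D)."""
    diff = X1[:, None, :] - X2[None, :, :]                       # (n1, n2, D)
    return w0 * np.exp(-0.5 * np.sum(diff**2 / np.asarray(w) ** 2, axis=-1))

# Nearby points are strongly correlated, distant points nearly independent.
K = se_covariance(np.array([[0.0], [1.0]]), np.array([[0.0], [1.0]]))
```

The smoothness assumption encoded by this kernel is what lets a modest number of expensive E_pot evaluations stand in for the full model.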

SLIDE 11

Gaussian Process HMC


Once the Gaussian process is defined with a covariance matrix, we can predict new values :

P(E_pot(θ∗) | θ, E_pot, θ∗) ~ N(μ, σ²)

If the Gaussian process is "good", μ(θ∗) ≈ target density. Algorithm :

  • 1. Initialization : evaluate the target density at D random points to define the Gaussian process.
  • 2. Exploratory phase : HMC with E_pot = μ − σ : evaluation of points with high target value and high uncertainty. Evaluate the real target density at the end of each iteration.
  • 3. Sampling phase : HMC with E_pot = μ.
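Both phases rely on the GP posterior mean μ and standard deviation σ at a new point θ∗; a minimal sketch of that prediction step for a zero-mean GP (the kernel and training data here are illustrative, not from the paper):

```python
import numpy as np

def gp_predict(X_train, y_train, X_test, kernel, jitter=1e-8):
    """Posterior mean mu and standard deviation sigma of a zero-mean GP at
    X_test, conditioned on observed energy values y_train at X_train."""
    K = kernel(X_train, X_train) + jitter * np.eye(len(X_train))
    K_s = kernel(X_test, X_train)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mu = K_s @ alpha                                  # posterior mean
    v = np.linalg.solve(L, K_s.T)
    var = np.diag(kernel(X_test, X_test)) - np.sum(v**2, axis=0)
    return mu, np.sqrt(np.maximum(var, 0.0))

# 1-D squared-exponential kernel on column vectors of shape (n, 1).
kernel = lambda A, B: np.exp(-0.5 * (A[:, None, 0] - B[None, :, 0]) ** 2)

X = np.array([[0.0], [1.0], [2.0]])
y = np.array([0.0, 1.0, 4.0])
mu, sigma = gp_predict(X, y, X, kernel)
# At training points the posterior interpolates: mu ≈ y, sigma ≈ 0.
```

The exploratory phase would then run HMC on μ − σ (drawn toward regions that are promising or uncertain), and the sampling phase on μ alone.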
SLIDE 12

Gaussian Process HMC


Rasmussen, Carl Edward. "Gaussian processes to speed up hybrid Monte Carlo for expensive Bayesian integrals." Bayesian Statistics 7: Proceedings of the 7th Valencia International Meeting. Oxford University Press, 2003.

SLIDE 13

Conclusion


  • Metropolis-Hastings : few model evaluations per iteration but a high rejection rate
  • Hamiltonian Monte Carlo : many model evaluations per iteration but a low rejection rate
  • GPHMC : few model evaluations per iteration and a low rejection rate
  • BUT : the initialization requires model evaluations to define a "good" Gaussian process
  • BUT : the exploratory phase requires one model evaluation per iteration