Bayesian parameter estimation using Multilevel and multi-index Monte Carlo




1. Bayesian parameter estimation using Multilevel and multi-index Monte Carlo
Kody Law, joint work with A. Jasra (NUS), K. Kamatani (Osaka), Y. Xu (NUS*), and Y. Zhou (Cubist)
Monash Workshop on Numerical Differential Equations and Applications, Monash University, AU, February 12, 2020

2. Outline
1. Multilevel Monte Carlo sampling
2. Bayesian inference problem
3. Our Bayesian inference problem
4. Approximate coupling
5. Particle Markov chain Monte Carlo
6. Particle Markov chain Multilevel Monte Carlo
7. Sequential Monte Carlo²
8. Sequential Multilevel Monte Carlo²
9. Numerical simulations
10. Multi-index Monte Carlo sampling
11. Summary

3. Orientation
Aim: approximate posterior expectations of the state path and static parameters associated with an S(P)DE which must be finitely approximated.
Solution: apply an approximate coupling strategy so that multi-index Monte Carlo (MIMC) methods can be used within particle MCMC [B02, AR08, ADH10] and SMC² [CJP13].
MLMC ($d = 1$) [H00, G08] and MIMC ($d > 1$) [HNT15] methods reduce the cost to achieve mean-squared error $= O(\varepsilon^2)$. Recently this methodology has been applied to inference, mostly in cases where the target can be evaluated up to a normalizing constant [HSS13, DKST15, HTL16, BJLTZ17]. Here we can only simulate a non-negative unbiased estimator (up to a normalizing constant); using PMCMC we are able to sample consistently from an approximate coupling of successive targets [JKLZ18.i, JKLZ18.ii], and this is extended to the sequential context via SMC² [JLX19].


5. Example: expectation for an SDE [G08]
Estimation of the expectation of the solution of an intractable stochastic differential equation (SDE):
$dX = f(X)\,dt + \sigma(X)\,dW$, $X_0 = x_0$.
Aim: estimate $E(g(X_T))$. We need to
(1) approximate, e.g. by the Euler-Maruyama method with resolution $h$:
$X_{n+1} = X_n + h f(X_n) + \sqrt{h}\,\sigma(X_n)\,\xi_n$, $\xi_n \sim N(0,1)$;
(2) sample $\{ X^{(i)}_{N_T} \}_{i=1}^{N}$, $N_T = T/h$.
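The two steps above might be sketched as follows; the geometric Brownian motion drift, diffusion, and test function $g(x) = x$ in the usage example are illustrative assumptions, not from the slides.

```python
import numpy as np

def euler_maruyama(f, sigma, x0, T, h, rng):
    """Simulate one Euler-Maruyama path of dX = f(X)dt + sigma(X)dW on [0, T]."""
    n_steps = int(round(T / h))
    x = x0
    for _ in range(n_steps):
        xi = rng.standard_normal()                    # xi_n ~ N(0, 1)
        x = x + h * f(x) + np.sqrt(h) * sigma(x) * xi
    return x

def mc_estimate(g, f, sigma, x0, T, h, N, rng):
    """Plain Monte Carlo estimate of E[g(X_T)] from N independent paths."""
    return float(np.mean([g(euler_maruyama(f, sigma, x0, T, h, rng))
                          for _ in range(N)]))
```

For example, with $dX = 0.05 X\,dt + 0.2 X\,dW$, $X_0 = 1$, the true value $E(X_1) = e^{0.05}$ is recovered up to Monte Carlo and discretization error.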

6. Multilevel Monte Carlo (MLMC)
Aim: approximate $\eta_\infty(g) := E_{\eta_\infty}(g)$ for $g : E \to \mathbb{R}$.
Single-level estimator: $\frac{1}{N} \sum_{i=1}^{N} g(U^{(i)}_L)$, $U^{(i)}_L \sim \eta_L$ i.i.d.
Cost to achieve MSE $= O(\varepsilon^2)$ is $C = \mathrm{Cost}(U^{(i)}_L) \times \varepsilon^{-2}$.
Multilevel estimator*:
$\sum_{l=0}^{L} \frac{1}{N_l} \sum_{i=1}^{N_l} \{ g(U^{(i)}_l) - g(U^{(i)}_{l-1}) \}$,
$(U_l, U_{l-1})^{(i)} \sim \bar\eta_l$ i.i.d. such that $\int \bar\eta_l(\cdot, du_{l-1}) = \eta_l$, for $l = 0, \dots, L$. (* $g(U^{(i)}_{-1}) := 0$.)
Cost is $C_{ML} = \sum_{l=0}^{L} C_l N_l$, where $C_l$ is the cost to obtain a sample from $\bar\eta_l$.
Fix the bias by choosing $L$. Minimizing the cost $C_{ML}(\{N_l\}_{l=0}^{L})$ for fixed variance $\sum_{l=0}^{L} V_l / N_l$ gives $N_l \propto \sqrt{V_l / C_l}$.
Example: Milstein solution of an SDE, for MSE $= O(\varepsilon^2)$: $C = O(\varepsilon^{-3})$ vs. $C_{ML} = O(\varepsilon^{-2})$.

7. Illustration of pairwise coupling
Pairwise coupling of trajectories of an SDE:
$X^1_{n+1} = X^1_n + h f(X^1_n) + \sqrt{h}\,\sigma(X^1_n)\,\xi_n$, $\xi_n \sim N(0,1)$, $n = 0, \dots, N^1$,
$X^0_{n+1} = X^0_n + (2h) f(X^0_n) + \sqrt{h}\,\sigma(X^0_n)(\xi_{2n} + \xi_{2n+1})$, $n = 0, \dots, N^1/2$.
[Figure: (a) coupled Wiener processes, $W^1_n = \sqrt{h} \sum_{i=0}^{n} \xi_i$, $W^0_n = W^1_{2n}$; (b) the coupled stochastic processes driven by the Wiener process.]
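The pair above (fine step $h$, coarse step $2h$ driven by the summed Gaussian increments) might be coded as below; the geometric Brownian motion in the usage example is an illustrative assumption. The point of the coupling is that the variance of $g(X^1_T) - g(X^0_T)$ is much smaller than the variance of either level alone.

```python
import numpy as np

def coupled_pair(f, sigma, x0, T, h, rng):
    """One coupled (fine, coarse) Euler-Maruyama pair for
    dX = f(X)dt + sigma(X)dW: the coarse path (step 2h) reuses the fine
    path's Gaussian increments xi_{2n} + xi_{2n+1}."""
    n = int(round(T / h))        # number of fine steps (assumed even)
    xf, xc = x0, x0
    for _ in range(n // 2):
        xi1 = rng.standard_normal()
        xi2 = rng.standard_normal()
        # two fine steps of size h
        xf = xf + h * f(xf) + np.sqrt(h) * sigma(xf) * xi1
        xf = xf + h * f(xf) + np.sqrt(h) * sigma(xf) * xi2
        # one coarse step of size 2h with the summed increment
        xc = xc + 2 * h * f(xc) + np.sqrt(h) * sigma(xc) * (xi1 + xi2)
    return xf, xc
```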


9. Bayesian inference is about approximating integrals
Suppose we know how to evaluate $\gamma(x)$ for $x \in X$. Let
$\eta(dx) = \frac{\gamma(x)\,dx}{\int_X \gamma(x)\,dx}$,
and $\varphi : X \to \mathbb{R}$, and suppose we want to estimate
$\eta(\varphi) := \int_X \varphi(x)\,\eta(dx)$.
$X$ may be of quite high dimension, e.g. $\mathbb{R}^d$ with $d = 100$ easily, or even 1000, 10000, etc.

10. Monte Carlo
If we could obtain i.i.d. samples $x_i \sim \eta$, then we could use
$\eta(\varphi) \approx \frac{1}{N} \sum_{i=1}^{N} \varphi(x_i)$.
The convergence rate (of the MSE) is $O(1/N)$, independently of $d$. Unfortunately we cannot get i.i.d. samples.

11. Importance sampling and ratio estimators
Suppose we can get i.i.d. samples $x_i \sim \nu$, where $0 < G(x) := \frac{\gamma(x)}{\nu(x)} < C$. Then we can use the self-normalized importance sampling estimator
$\eta(\varphi) \approx \frac{\sum_{i=1}^{N} G(x_i)\,\varphi(x_i)}{\sum_{i=1}^{N} G(x_i)}$.
The rate will still be $O(1/N)$, but typically with a constant $O(e^d)$, depending on $E(G(x) - E\,G(x))^2$. We may as well use quadrature.
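The ratio estimator above might be sketched as follows; working with log-weights is a standard stabilization, and the target/proposal pair in the usage example (unnormalized $N(1,1)$ target, $N(0,4)$ proposal) is an illustrative assumption.

```python
import numpy as np

def snis(phi, log_gamma, log_nu, xs):
    """Self-normalized importance sampling estimate of eta(phi) from i.i.d.
    draws xs ~ nu, with weights G(x) = gamma(x)/nu(x) handled in log space.
    Additive constants in log_gamma and log_nu cancel in the ratio."""
    logw = log_gamma(xs) - log_nu(xs)
    w = np.exp(logw - logw.max())     # the max-shift also cancels in the ratio
    return float(np.sum(w * phi(xs)) / np.sum(w))
```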

12. Markov chain Monte Carlo
Suppose we can construct a Markov chain $K$, that is, an operator with the property $K : B(X) \to B(X)$ and $K^* : P(X) \to P(X)$, where $B(X)$ are the bounded measurable functions and $P(X)$ the probability measures, such that
$(\eta K)(dx) = \int_X \eta(dx')\,K(x', dx) = \eta(dx)$,
and for all $A \subset X$ and $x, x' \in X$,
$\int_A K(x, dz) \le \int_A K(x', dz)$.

13. Markov chain Monte Carlo
Then we can run the Markov chain to collect samples, $x_0 \in X$ and $x_i \sim K(x_{i-1}, \cdot) = K^i(x_0, \cdot)$, and use these for Monte Carlo:
$\eta(\varphi) \approx \frac{1}{N} \sum_{i=N_b+1}^{N_b+N} \varphi(x_i)$.
Again Monte Carlo provides rate $O(1/N)$, but now, under quite general conditions, one may achieve a polynomial constant $O(d)$.

14. Example: Metropolis-Hastings
Let $Q$ denote a Markov kernel on $X$. Let $x_0 \in X$ and $i = 0$.
1. Sample $x^* \sim Q(x_i, \cdot)$.
2. Set $x_{i+1} = x^*$ with probability
$\min\left\{ 1, \frac{\gamma(x^*)\,Q(x^*, x_i)}{\gamma(x_i)\,Q(x_i, x^*)} \right\}$;
otherwise set $x_{i+1} = x_i$.
3. Set $i = i + 1$ and return to step 1.
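The algorithm might be sketched as follows with a Gaussian random-walk proposal; since that $Q$ is symmetric, the $Q$-ratio in the acceptance probability cancels (a simplifying assumption, not forced by the slide). The standard normal target in the usage example is also illustrative.

```python
import numpy as np

def metropolis_hastings(log_gamma, x0, n_iters, step, rng):
    """Random-walk Metropolis targeting eta(x) proportional to gamma(x);
    Q(x, .) = N(x, step^2) is symmetric, so the acceptance probability
    reduces to min(1, gamma(x*) / gamma(x_i)).  Returns the chain."""
    chain = np.empty(n_iters)
    x = x0
    lg = log_gamma(x)
    for i in range(n_iters):
        x_star = x + step * rng.standard_normal()   # x* ~ Q(x_i, .)
        lg_star = log_gamma(x_star)
        if np.log(rng.uniform()) < lg_star - lg:    # accept
            x, lg = x_star, lg_star
        chain[i] = x                                # else x_{i+1} = x_i
    return chain
```

Discarding a burn-in of $N_b$ iterations before averaging, as in the estimator on the previous slide, recovers the moments of the target.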


16. Parameter inference
Estimate the posterior expectation of a function $\varphi$ of the joint path $X_{0:T}$ and parameters $\theta$ of an intractable S(P)DE
$dX = f_\theta(X)\,dt + \sigma_\theta(X)\,dW$, $X_0 \sim \mu_\theta$,
given noisy partial observations $Y_n \sim g_\theta(X_n, \cdot)$, $n = 1, \dots, T$.
Aim: estimate $E[\varphi(\theta, X_{0:T}) \mid y_{1:T}]$, where $y_{1:T} := \{ y_1, \dots, y_T \}$.
The hidden process $\{X_n\}$ is a Markov chain. Discretize with resolution $h$ and denote the transition kernel $F_{\theta,h}(x_{p-1}, dx_p)$: this kernel can be simulated from, but its density cannot be evaluated.

17. Return to ML (SDE, for simplicity)
The joint measure (suppressing the fixed $y_p$ in the notation) is
$\Pi^h(d\theta, dx_{0:n}) \propto \Pi(d\theta)\,\mu_\theta(dx_0) \prod_{p=1}^{n} g_\theta(x_p, y_p)\,F_{\theta,h}(x_{p-1}, dx_p)$.
For $+\infty > h_0 > \cdots > h_L > 0$, we would like to compute
$E_{\Pi^{h_L}}[\varphi(\theta, X_{0:n})] = \sum_{l=0}^{L} \left( E_{\Pi^{h_l}}[\varphi(\theta, X_{0:n})] - E_{\Pi^{h_{l-1}}}[\varphi(\theta, X_{0:n})] \right)$,
where $E_{\Pi^{h_{-1}}}[\cdot] := 0$.

