ODE Filtering: A Gaussian Decision Agent for Forward Problems


  1. ODE Filtering: A Gaussian Decision Agent for Forward Problems
  Hans Kersting, Alan Turing Institute, 12 April 2018.
  Some of the presented work is supported by the European Research Council.

  2. ODEs from a Bayesian machine learning perspective
  How we think about ODEs:
      \dot{x}(t) = f(x(t)), \quad t \in [0, T], \qquad x(0) = x_0 \in \mathbb{R}^d.   (1)
  We model all unknown (even if deterministic) objects, i.e.
  - the solution x \in C^1([0, T]; \mathbb{R}^d),
  - the vector field f \in C^0(\mathbb{R}^d; \mathbb{R}^d),
  by random variables or stochastic processes (prior information), and define which information we obtain in the course of the numerical computation of the solution (measurement model).
  Prior information + measurement model → application of Bayes' rule.
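  To fix notation in code, here is a minimal sketch of the ingredients of (1) for a concrete one-dimensional instance; the logistic ODE is an illustrative choice of mine, not an example from the talk. The vector field f, the initial value x0, and the horizon T are all a solver gets to see:

      import numpy as np

      # A concrete instance of the IVP (1) with d = 1: the logistic ODE
      # x'(t) = x(t) (1 - x(t)) on [0, T]. The solver only ever queries f
      # pointwise; the solution x itself stays unknown and is modelled
      # probabilistically.
      def f(x):
          return x * (1.0 - x)

      x0 = np.array([0.1])  # initial value x(0) = x_0 in R^d
      T = 10.0              # time horizon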

  3. Prior information on x
  (For prior information on f, see our publications.)
  We a priori model x and \dot{x} with an arbitrary Gauss–Markov process, i.e. with a linear SDE
      dX_t = F X_t \, dt + L \, dB_t,
  with Gaussian initial condition X_0 \sim N(m_0, P_0).
  For an integrated Brownian motion (Wiener process) prior,
      \begin{pmatrix} x(t) \\ \dot{x}(t) \end{pmatrix} \sim \begin{pmatrix} X_t \\ \dot{X}_t \end{pmatrix}, \qquad
      d\begin{pmatrix} X_t \\ \dot{X}_t \end{pmatrix}
        = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
          \begin{pmatrix} X_t \\ \dot{X}_t \end{pmatrix} dt
        + \begin{pmatrix} 0 \\ \sigma \end{pmatrix} dB_t,   (2)
  the ODE filter coincides with Runge–Kutta and Nordsieck methods in a certain sense [SSH18].
  An Ornstein–Uhlenbeck prior,
      d\begin{pmatrix} X_t \\ \dot{X}_t \end{pmatrix}
        = \begin{pmatrix} 0 & 1 \\ 0 & -\theta \end{pmatrix}
          \begin{pmatrix} X_t \\ \dot{X}_t \end{pmatrix} dt
        + \begin{pmatrix} 0 \\ \sigma \end{pmatrix} dB_t,   (3)
  has also been studied [MKSH17].
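  For the integrated Wiener process prior (2), the law of X_{t+h} given X_t is available in closed form. The following minimal sketch (standard formulas for this SDE; the function name is mine) computes the discrete-time transition matrices A(h) and Q(h) that the filter below uses in its prediction step:

      import numpy as np

      def iwp_transition(h, sigma=1.0):
          # Discretization of the SDE (2): X_{t+h} | X_t ~ N(A(h) X_t, Q(h)),
          # with state (x, x')^T. These are the closed-form transition moments
          # of the once-integrated Wiener process with diffusion sigma.
          A = np.array([[1.0, h],
                        [0.0, 1.0]])
          Q = sigma**2 * np.array([[h**3 / 3.0, h**2 / 2.0],
                                   [h**2 / 2.0, h]])
          return A, Q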

  4. Numerical solutions of IVPs
  (Plots: Runge–Kutta of order 3.)
  How classical solvers extrapolate forward from time t_0 to t_0 + h:
  - Estimate \dot{x}(t_i), t_0 \le t_1 \le \dots \le t_n \le t_0 + h, by evaluating y_i \approx f(\hat{x}(t_i)), where \hat{x}(t) is itself an estimate for x(t).
  - Use this data, y_i := \hat{\dot{x}}(t_i), to estimate x(t_0 + h), i.e.
        x(t_0 + h) \approx x(t_0) + h \sum_{i=1}^{b} w_i y_i.
  [Plot omitted: x(t) over t, with evaluation nodes t_0, t_0 + c_1, t_0 + c_2, t_0 + h.]
  Information in these calculations:
      \dot{x}(t) = f(x(t)) \approx f(\hat{x}(t)).   (4)
  For information, f is evaluated at (or around) the current numerical estimate \hat{x} of x.
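  As a concrete instance of this extrapolation scheme, here is a minimal sketch of one step of a third-order Runge–Kutta method. Kutta's classical tableau is an assumption on my part; the talk does not say which order-3 method its plots use.

      def rk3_step(f, x_t0, h):
          # One step of Kutta's third-order method for x'(t) = f(x(t)).
          # Each y_i is "data": an evaluation of f at an estimated state
          # x_hat(t_0 + c_i h), cf. y_i ≈ f(x_hat(t_i)) above.
          y1 = f(x_t0)                          # c_1 = 0
          y2 = f(x_t0 + 0.5 * h * y1)           # c_2 = 1/2
          y3 = f(x_t0 - h * y1 + 2.0 * h * y2)  # c_3 = 1
          # x(t_0 + h) ≈ x(t_0) + h * sum_i w_i y_i, with w = (1/6, 2/3, 1/6)
          return x_t0 + h * (y1 + 4.0 * y2 + y3) / 6.0

  For example, rk3_step(f, x0, 0.1) advances the logistic IVP sketched above by one step of size h = 0.1.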

  5. Measurement models
  In principle, given a Gaussian belief
      \begin{pmatrix} x(t) \\ \dot{x}(t) \end{pmatrix}
        \sim N\left( \begin{pmatrix} m(t) \\ \dot{m}(t) \end{pmatrix},
                     \begin{pmatrix} P_{00} & P_{01} \\ P_{10} & P_{11} \end{pmatrix} \right),   (5)
  the 'true' information on \dot{x}(t) would be the pushforward measure f_* N(m(t), P_{00}).
  For computational speed, we want a Gaussian instead, with matched moments
      y = \int f(\xi) \, dN(\xi; m(t), P_{00}),   (6)
  and covariance
      R = \int f(\xi) f^T(\xi) \, dN(\xi; m(t), P_{00}).   (7)
  Suitable ways to approximate these integrals have been studied in [KH16]. For maximum speed, we can just use y = f(m(t)) and R = 0, as proposed in [SSH18]. This yields a (Kalman) filtering algorithm for ODEs.
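  The two regimes on this slide can be put side by side in code. In the sketch below, Monte Carlo is just one simple choice of quadrature for (6)-(7) ([KH16] studies more refined approximations), and the helper name is mine:

      import numpy as np

      def measure_gradient(f, m, P00, n_samples=0, rng=None):
          # Gaussian 'measurement' of x'(t) from the current belief N(m, P00).
          # n_samples == 0: the maximum-speed choice y = f(m), R = 0 [SSH18].
          if n_samples == 0:
              return f(m), np.zeros((m.size, m.size))
          # Otherwise approximate the integrals (6) and (7) by Monte Carlo.
          rng = np.random.default_rng() if rng is None else rng
          xi = rng.multivariate_normal(m, P00, size=n_samples)
          fx = np.array([f(x) for x in xi])
          y = fx.mean(axis=0)          # y ≈ ∫ f(ξ) dN(ξ; m, P00)        (6)
          R = fx.T @ fx / n_samples    # R ≈ ∫ f(ξ) f(ξ)^T dN(ξ; m, P00)  (7)
          return y, R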

  6. Filtering-based probabilistic ODE solvers
  Gaussian filtering [SDH14]: we interpret (x, \dot{x}, x^{(2)}, \dots, x^{(q-1)}) as a draw from a q-times-integrated Wiener process (X_t)_{t \in [0, T]} = (X_t^{(1)}, \dots, X_t^{(q)})^T given by a linear time-invariant SDE:
      dX_t = F X_t \, dt + Q \, dW_t, \qquad X_0 = \xi, \quad \xi \sim N(m(0), P(0)).
  Calculation of the posterior by Gaussian filtering.
  Prediction step:
      m^-_{t+h} = A(h) \, m_t,
      P^-_{t+h} = A(h) \, P_t \, A(h)^T + Q(h).
  Gradient prediction at t + h: approximate
      y \approx \int f(\xi) \, dN(\xi; m(t), P_{00}),
      R \approx \int f(\xi) f^T(\xi) \, dN(\xi; m(t), P_{00}).
  Update step:
      z = y - e_n^T m^-_{t+h},
      S = e_n^T P^-_{t+h} e_n + R,
      K = P^-_{t+h} e_n S^{-1},
      m_{t+h} = m^-_{t+h} + K z,
      P_{t+h} = P^-_{t+h} - K e_n^T P^-_{t+h}.
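  Putting the pieces together: a minimal end-to-end sketch of one filter step, assuming scalar x, the integrated Wiener process prior (2), and the maximum-speed measurement y = f(m), R = 0. This is a simplification of the slide's equations (here e_n selects the derivative component of the state (x, x')^T), not the authors' reference implementation:

      import numpy as np

      def ode_filter_step(f, m, P, h, sigma=1.0):
          # Prediction step: m^- = A(h) m, P^- = A(h) P A(h)^T + Q(h),
          # with A(h), Q(h) the closed-form transition of the prior (2).
          A = np.array([[1.0, h], [0.0, 1.0]])
          Q = sigma**2 * np.array([[h**3 / 3.0, h**2 / 2.0],
                                   [h**2 / 2.0, h]])
          m_pred = A @ m
          P_pred = A @ P @ A.T + Q
          # Gradient measurement with y = f(m), R = 0 (maximum speed).
          e = np.array([0.0, 1.0])    # e_n: selects x' from the state (x, x')
          y = f(m_pred[0])
          z = y - e @ m_pred          # innovation          z = y - e_n^T m^-
          S = e @ P_pred @ e          # innovation variance S = e_n^T P^- e_n + R
          K = P_pred @ e / S          # Kalman gain         K = P^- e_n S^{-1}
          m_new = m_pred + K * z                     # m = m^- + K z
          P_new = P_pred - np.outer(K, e @ P_pred)   # P = P^- - K e_n^T P^-
          return m_new, P_new

  Iterating this step from m = (x_0, f(x_0))^T with a small initial covariance traces out the filtering mean and covariance over [0, T].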

  7. Research questions
  1. Worst-case convergence rates vs. average-case convergence rates (over a measure on f).
  2. Trade-off between computational speed (with Gaussians) and statistical accuracy (with samples).
  3. Properties of different priors on x.
  4. In which sense are 'Bayesian' algorithms (like the above) approximations of Bayesian algorithms in the sense of [COSG17]?
  5. Can PN algorithms for ODEs be extended to SDEs?
  6. Bayesian inverse problems: an inner-loop vs. outer-loop trade-off like in Bayesian optimization?
  7. Different filters (particle filter, ensemble Kalman filter)?

  8. Thank you for listening!

  9. Bibliography
  [COSG17] J. Cockayne, C.J. Oates, T. Sullivan, and M.A. Girolami. Bayesian probabilistic numerical methods. arXiv:1702.03673 [stat.ME], February 2017.
  [KH16] H. Kersting and P. Hennig. Active Uncertainty Calibration in Bayesian ODE Solvers. Uncertainty in Artificial Intelligence (UAI), 2016.
  [MKSH17] E. Magnani, H. Kersting, M. Schober, and P. Hennig. Bayesian Filtering for ODEs with Bounded Derivatives. arXiv:1709.08471 [cs.NA], September 2017.
  [SDH14] M. Schober, D. Duvenaud, and P. Hennig. Probabilistic ODE Solvers with Runge–Kutta Means. Advances in Neural Information Processing Systems (NIPS), 2014.
  [SSH18] M. Schober, S. Särkkä, and P. Hennig. A probabilistic model for the numerical solution of initial value problems. Statistics and Computing, January 2018.
