SLIDE 1

ODE Filtering: A Gaussian Decision Agent for Forward Problems

Hans Kersting, Alan Turing Institute, 12 April 2018

Some of the presented work is supported by the European Research Council.

SLIDE 2

ODEs from a Bayesian machine learning perspective

How we think about ODEs...

$$ \dot{x}(t) = f(x(t)), \quad t \in [0, T], \qquad x(0) = x_0 \in \mathbb{R}^d \tag{1} $$

We model all unknown (even if deterministic) objects, i.e.

• the solution $x \in C^1([0, T]; \mathbb{R}^d)$,
• the vector field $f \in C^0(\mathbb{R}^d; \mathbb{R}^d)$,

by random variables or stochastic processes (prior information), and define which information we obtain in the course of the numerical computation of the solution (measurement model).

Prior information + measurement model → application of Bayes' rule.
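In symbols, this pipeline yields a posterior of the familiar form. As a minimal sketch (the factorization assumes conditionally independent measurements $y_i$, an assumption of this sketch rather than a statement on the slide):

$$ p\big(x \mid y_1, \dots, y_N\big) \;\propto\; p(x) \prod_{i=1}^{N} p\big(y_i \mid x(t_i)\big), $$

where $p(x)$ is the prior over the solution (next slide) and $p(y_i \mid x(t_i))$ is the measurement model linking evaluations of $f$ to the derivative of $x$ (slide 8).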

SLIDE 3

Prior information on x

For prior information on $f$, see our publications.

We a priori model $x$ and $\dot{x}$ with an arbitrary Gauss–Markov process, i.e. with a linear SDE

$$ dX_t = F X_t \, dt + L \, dB_t, $$

with Gaussian initial condition $X_0 \sim \mathcal{N}(m_0, P_0)$.

For an integrated Brownian motion (Wiener process) prior on $(x(t), \dot{x}(t))$,

$$ d \begin{pmatrix} X_t \\ \dot{X}_t \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} X_t \\ \dot{X}_t \end{pmatrix} dt + \begin{pmatrix} 0 \\ \sigma \end{pmatrix} dB_t, \tag{2} $$

the ODE filter coincides with Runge–Kutta and Nordsieck methods in a certain sense [SSH18]. An Ornstein–Uhlenbeck prior,

$$ d \begin{pmatrix} X_t \\ \dot{X}_t \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & -\theta \end{pmatrix} \begin{pmatrix} X_t \\ \dot{X}_t \end{pmatrix} dt + \begin{pmatrix} 0 \\ \sigma \end{pmatrix} dB_t, \tag{3} $$

has also been studied [MKSH17].
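For the integrated Wiener process prior (2), the discrete-time transition used by the filter on the later slides has a closed form. A minimal sketch (the standard discretization of this linear SDE; the function name is ours):

```python
import numpy as np

def iwp_transition(h, sigma):
    """Closed-form discretization of the once-integrated Wiener process
    dX_t = F X_t dt + L dB_t with F = [[0, 1], [0, 0]] and L = (0, sigma)^T,
    so that X_{t+h} | X_t ~ N(A(h) X_t, Q(h))."""
    A = np.array([[1.0, h],
                  [0.0, 1.0]])                         # A(h) = exp(F h)
    Q = sigma**2 * np.array([[h**3 / 3.0, h**2 / 2.0],
                             [h**2 / 2.0, h]])         # Q(h) = ∫ e^{Fs} L L^T e^{F^T s} ds
    return A, Q
```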

SLIDE 4–7

Numerical solutions of IVPs

[plot: Runge–Kutta of order 3]

How classical solvers extrapolate forward from time $t_0$ to $t_0 + h$:

• Estimate $\dot{x}(t_i)$ at nodes $t_0 \le t_1 \le \dots \le t_n \le t_0 + h$ by evaluating $y_i \approx f(\hat{x}(t_i))$, where $\hat{x}(t)$ is itself an estimate of $x(t)$.

• Use this data $y_i := \dot{x}(t_i)$ to estimate $x(t_0 + h)$, i.e.

$$ \hat{x}(t_0 + h) \approx x(t_0) + h \sum_{i=1}^{b} w_i y_i. $$

[plot: trajectory $x(t)$ with evaluation nodes at $t_0$, $t_0 + c_1$, $t_0 + c_2$, $t_0 + h$]

Information in these calculations:

$$ \dot{x}(t) = f(x(t)) \approx f(\hat{x}(t)) \tag{4} $$

To gain information, $f$ is evaluated at (or around) the current numerical estimate $\hat{x}$ of $x$.
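As a concrete instance of this pattern, here is a minimal sketch of one step of Kutta's third-order Runge–Kutta method (the classical nodes and weights, chosen to match the plotted order; they are not spelled out on the slide):

```python
def rk3_step(f, x0, h):
    """One step of Kutta's third-order Runge-Kutta method.
    Each k_i estimates x'(t) at a node in [t0, t0 + h] by evaluating f
    at a running estimate of x there; the step is the weighted sum
    x0 + h * sum_i w_i k_i with weights (1/6, 2/3, 1/6)."""
    k1 = f(x0)                          # slope estimate at t0
    k2 = f(x0 + 0.5 * h * k1)           # slope estimate at t0 + h/2
    k3 = f(x0 - h * k1 + 2.0 * h * k2)  # slope estimate at t0 + h
    return x0 + h * (k1 + 4.0 * k2 + k3) / 6.0

# Example: logistic growth x' = x(1 - x), x(0) = 0.1
x = 0.1
for _ in range(100):
    x = rk3_step(lambda z: z * (1.0 - z), x, 0.1)
```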

SLIDE 8

Measurement Models

In principle, given a Gaussian belief

$$ \begin{pmatrix} x(t) \\ \dot{x}(t) \end{pmatrix} \sim \mathcal{N}\!\left( \begin{pmatrix} m(t) \\ \dot{m}(t) \end{pmatrix}, \begin{pmatrix} P_{00} & P_{01} \\ P_{10} & P_{11} \end{pmatrix} \right), \tag{5} $$

the 'true' information on $\dot{x}(t)$ would be the pushforward measure $f_* \mathcal{N}(m(t), P_{00})$. For computational speed, we want a Gaussian with matched moments: mean

$$ y = \int f(\xi) \, d\mathcal{N}(\xi; m(t), P_{00}) \tag{6} $$

and covariance

$$ R = \int f(\xi) f^T(\xi) \, d\mathcal{N}(\xi; m(t), P_{00}). \tag{7} $$

Suitable ways to approximate these integrals have been studied in [KH16]. For maximum speed, we can just use $y = f(m(t))$ and $R = 0$, as proposed in [SSH18]. This yields a (Kalman) filtering algorithm for ODEs.
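A minimal sketch of the two extremes of this design choice, the $y = f(m)$, $R = 0$ shortcut and a Monte Carlo approximation of the moment integrals (6)-(7) (function names and the sample-based estimator are ours, not from [KH16]):

```python
import numpy as np

def measure_zeroth_order(f, m, P00):
    """Fastest option [SSH18]: evaluate f at the mean and report zero noise."""
    y = np.atleast_1d(f(m))
    return y, np.zeros((y.size, y.size))

def measure_monte_carlo(f, m, P00, n_samples=1000, rng=None):
    """Moment-matched Gaussian approximation of the pushforward f_* N(m, P00),
    with the integrals (6) and (7) approximated by Monte Carlo."""
    rng = np.random.default_rng() if rng is None else rng
    xi = rng.multivariate_normal(m, P00, size=n_samples)
    fx = np.array([f(x) for x in xi])
    y = fx.mean(axis=0)              # eq. (6)
    R = (fx.T @ fx) / n_samples      # eq. (7), second moment as on the slide
    return y, R
```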

SLIDE 9–13

Filtering-based probabilistic ODE solvers

Gaussian filtering [SDH14]

We interpret $(x, \dot{x}, x^{(2)}, \dots, x^{(q-1)})$ as a draw from a $q$-times-integrated Wiener process $(X_t)_{t \in [0,T]} = (X_t^{(1)}, \dots, X_t^{(q)})^T_{t \in [0,T]}$ given by a linear time-invariant SDE:

$$ dX_t = F X_t \, dt + Q \, dW_t, \qquad X_0 = \xi, \quad \xi \sim \mathcal{N}(m(0), P(0)). $$

Calculation of the posterior by Gaussian filtering:

Prediction step:

$$ m^-_{t+h} = A(h) m_t, \qquad P^-_{t+h} = A(h) P_t A(h)^T + Q(h). $$

Gradient prediction at $t + h$: approximate

$$ y \approx \int f(\xi) \, d\mathcal{N}(\xi; m(t), P_{00}), \qquad R \approx \int f(\xi) f^T(\xi) \, d\mathcal{N}(\xi; m(t), P_{00}). $$

Update step:

$$ z = y - e_n^T m^-_{t+h}, \qquad S = e_n^T P^-_{t+h} e_n + R, \qquad K = P^-_{t+h} e_n S^{-1}, $$

$$ m_{t+h} = m^-_{t+h} + K z, \qquad P_{t+h} = P^-_{t+h} - K e_n^T P^-_{t+h}. $$
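Putting the pieces together, here is a minimal sketch of the resulting Kalman ODE filter for a scalar IVP with the once-integrated Wiener process prior (2) and the $y = f(m)$, $R = 0$ measurement model of [SSH18]. It reuses the `iwp_transition` helper sketched earlier; the initialization and all other names are ours:

```python
import numpy as np

def ode_filter(f, x0, sigma, h, T):
    """Kalman filtering for x'(t) = f(x(t)), x(0) = x0 (scalar case)."""
    A, Q = iwp_transition(h, sigma)   # A(h), Q(h) for the IWP prior
    e = np.array([0.0, 1.0])          # observe the derivative component
    m = np.array([x0, f(x0)])         # exact initial mean (x(0), x'(0))
    P = np.zeros((2, 2))              # certain initial condition
    means = [m]
    for _ in range(round(T / h)):
        # Prediction step
        m_pred = A @ m
        P_pred = A @ P @ A.T + Q
        # Gradient "measurement": y = f(m), R = 0  [SSH18]
        y, R = f(m_pred[0]), 0.0
        # Update step
        z = y - e @ m_pred                 # innovation
        S = e @ P_pred @ e + R             # innovation variance
        K = P_pred @ e / S                 # Kalman gain
        m = m_pred + K * z
        P = P_pred - np.outer(K, e @ P_pred)
        means.append(m)
    return np.array(means)

# Example: x' = -x, x(0) = 1; means[:, 0] approximates exp(-t)
means = ode_filter(lambda x: -x, 1.0, sigma=1.0, h=0.01, T=1.0)
```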

SLIDE 14

Research Questions

  1. Worst-case convergence rates vs. average convergence rates (over a measure on f)?
  2. Trade-off between computational speed (with Gaussians) and statistical accuracy (with samples)?
  3. Properties of different priors on x?
  4. In which sense are 'Bayesian' algorithms (like the above) approximations of Bayesian algorithms in the sense of [COSG17]?
  5. Can PN algorithms for ODEs be extended to SDEs?
  6. Bayesian inverse problems: inner-loop vs. outer-loop trade-off as in Bayesian optimization?
  7. Different filters (particle filter, ensemble Kalman filter)?

SLIDE 15

Thank you for listening!

SLIDE 16

Bibliography

◮ [COSG17] J. Cockayne, C. J. Oates, T. Sullivan, and M. A. Girolami. Bayesian probabilistic numerical methods. arXiv:1702.03673 [stat.ME], February 2017.
◮ [KH16] H. Kersting and P. Hennig. Active Uncertainty Calibration in Bayesian ODE Solvers. Uncertainty in Artificial Intelligence (UAI), 2016.
◮ [MKSH17] E. Magnani, H. Kersting, M. Schober, and P. Hennig. Bayesian Filtering for ODEs with Bounded Derivatives. arXiv:1709.08471 [cs.NA], September 2017.
◮ [SDH14] M. Schober, D. Duvenaud, and P. Hennig. Probabilistic ODE Solvers with Runge–Kutta Means. Advances in Neural Information Processing Systems (NIPS), 2014.
◮ [SSH18] M. Schober, S. Särkkä, and P. Hennig. A probabilistic model for the numerical solution of initial value problems. Statistics and Computing, January 2018.
