SLIDE 1

Learning unknown forces in nonlinear models with Gaussian processes and autoregressive flows

Wil O C Ward w.ward@sheffield.ac.uk

Department of Physics and Astronomy, The University of Sheffield

GPSS Workshop: Structurally Constrained Gaussian Processes 12 Sep 2019

SLIDE 2

Collaborative Work

Mauricio Alvarez, Tom Ryder, Dennis Prangle

SLIDE 3

Gaussian Processes

- GPs generalise the Gaussian distribution
- Infinite-dimensional and non-parametric
- Defined in terms of a mean and a covariance function:

$$f(t) \sim \mathcal{GP}\big(m(t),\, k(t, t')\big)$$

[Figure: two panels of sample paths f(t) drawn from GP priors, plotted against t.]

SLIDE 4

Motivating Example

Consider the model
$$\frac{\mathrm{d}}{\mathrm{d}t}x = \alpha(x(t), \theta) + u(t)$$
where α : ℝ² × Θ → ℝ² are the known dynamics,
$$\alpha(x, \theta) = \begin{bmatrix} \theta_1 x_1 - \theta_2 x_1 x_2 \\ \theta_2 x_1 x_2 - \theta_3 x_2 \end{bmatrix}$$
...but θ and u(t) are unknown. How can we infer x(t) and u(t) given some noisy observations y = [x(τ_j) + ε_j]_{j=0}^{N}?
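To make the setup concrete, here is a minimal simulation sketch of this system (my illustration, not from the talk): Euler integration of the predator-prey dynamics with an assumed forcing u(t) acting on the first component, plus noisy observations. All numerical values are arbitrary assumptions.

```python
import numpy as np

def alpha(x, theta):
    """Known Lotka-Volterra dynamics alpha(x, theta)."""
    th1, th2, th3 = theta
    return np.array([th1 * x[0] - th2 * x[0] * x[1],
                     th2 * x[0] * x[1] - th3 * x[1]])

theta = (0.5, 0.25, 0.3)                              # assumed "unknown" parameters
u = lambda t: np.array([0.5 * np.sin(0.5 * t), 0.0])  # hypothetical latent force
dt = 0.01
ts = np.arange(0.0, 40.0, dt)

x = np.array([2.0, 1.0])
xs = np.empty((len(ts), 2))
for i, t in enumerate(ts):
    x = x + (alpha(x, theta) + u(t)) * dt             # simple Euler step
    xs[i] = x

# Noisy observations y_j = x(tau_j) + eps_j at N + 1 times tau_j
rng = np.random.default_rng(0)
idx = np.linspace(0, len(ts) - 1, 21).astype(int)
y = xs[idx] + 0.1 * rng.standard_normal((len(idx), 2))
```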

SLIDE 5

Motivating Example

[Figure: simulated trajectories x₁(t) and x₂(t) over t ∈ [0, 40]; the latent force u(t); and the phase-plane view of x₁ against x₂.]

SLIDE 6

Contents

1. Stochastic Differential Equations and Gaussian Processes
2. Variational Solutions to Non-Linear Latent Force Models
3. Approximate Gaussian Processes
4. Some Results
5. Recap
6. Open Issues

SLIDE 7

Contents

→ 1. Stochastic Differential Equations and Gaussian Processes

SLIDE 8

Itô Processes

Consider an ordinary differential equation describing the dynamics of some (vector-valued) function x : ℝ → ℝᵈ. The dynamics α_k : ℝᵈ → ℝᵈ are known, but the system is driven by a white-noise process whose covariance is a function of x, Σ : ℝᵈ → ℝᵈˣᵈ.

Ordinary Differential Equation with White Noise
$$\sum_{k=0}^{n}\alpha_k(x, t; \theta)\,\frac{\mathrm{d}^k}{\mathrm{d}t^k}x(t) = \Sigma^{1/2}(x, t; \theta)\,w(t)$$

SLIDE 9

Itô Processes

The same equation, now read as a Stochastic Differential Equation with drift and diffusion identified:
$$\underbrace{\sum_{k=0}^{n}\alpha_k(x, t; \theta)\,\frac{\mathrm{d}^k}{\mathrm{d}t^k}x(t)}_{\text{drift terms}} = \underbrace{\Sigma^{1/2}(x, t; \theta)}_{\text{diffusion}}\,w(t)$$

SLIDE 10

Solutions to Itô Processes

- If the system has linear dynamics, it can be solved exactly using Kalman filtering / Rauch–Tung–Striebel smoothing
- For non-linear systems, there are a number of approximation methods
- A stochastic extension of the Euler method gives iterative discrete-time estimation

Euler–Maruyama Discretisation
$$x(t_{k+1}) - x(t_k) \sim \mathcal{N}\big(\alpha(x(t_k))\,\Delta t,\ \Sigma\,\Delta t\big)$$

SLIDE 11

Solutions to Itô Processes

The same discretisation, rearranged as a generative prior:

Euler–Maruyama Discretisation as a Generative Prior
$$x(t_{k+1}) \mid x(t_k) \sim \mathcal{N}\big(x(t_k) + \alpha(x(t_k))\,\Delta t,\ \Sigma\,\Delta t\big)$$
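A minimal sketch (mine, not from the slides) of using this discretisation as a generative prior: repeatedly sample the next state from the Gaussian transition. The drift, diffusion, and step size in the example are illustrative assumptions.

```python
import numpy as np

def euler_maruyama_sample(alpha, Sigma, x0, dt, n_steps, rng):
    """Draw one path from the Euler-Maruyama generative prior:
    x_{k+1} | x_k ~ N(x_k + alpha(x_k) dt, Sigma dt)."""
    d = len(x0)
    chol = np.linalg.cholesky(Sigma * dt)   # for correlated Gaussian noise
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        x = xs[-1]
        mean = x + alpha(x) * dt
        xs.append(mean + chol @ rng.standard_normal(d))
    return np.array(xs)

# Example: a 1-D Ornstein-Uhlenbeck-like drift (assumed for illustration)
rng = np.random.default_rng(0)
path = euler_maruyama_sample(lambda x: -0.5 * x, np.eye(1), [1.0], 0.01, 1000, rng)
```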

SLIDE 12

Gaussian Processes as SDEs

Examples

- White noise process:
$$w(t) \sim \mathcal{GP}\big(0,\ \varsigma^2\,\delta(t - t')\big)$$
- Half-integer (ν = p + 1/2) Matérn models:
$$f_\nu(t) \sim \mathcal{GP}\left(0,\ \sigma^2\exp\big(-\lambda|t - t'|\big)\,\frac{p!}{(2p)!}\sum_{i=0}^{p}\frac{(p+i)!}{i!\,(p-i)!}\big(2\lambda|t - t'|\big)^{p-i}\right)$$
- Gaussian Radial Basis / Exponentiated Quadratic (ν → ∞):
$$f(t) \sim \mathcal{GP}\big(0,\ \sigma^2\exp(-\lambda|t - t'|^2)\big)$$
SLIDE 13

Gaussian Processes as SDEs

Examples

- White noise process: dw(t) = ς dβ
- Half-integer (ν = p + 1/2) Matérn models satisfy
$$\sum_{i=1}^{p+1}\binom{p+1}{i}\lambda^{p+1-i}\,\frac{\mathrm{d}^i}{\mathrm{d}t^i}f(t) = -\lambda^{p+1}f(t) + w(t)$$
- Gaussian Radial Basis / Exponentiated Quadratic (ν → ∞): infinitely differentiable, so it cannot be represented exactly as an Itô process

SLIDE 14

Gaussian Processes as SDEs

Examples

- White noise process: dw(t) = ς dβ
- Half-integer (ν = p + 1/2) Matérn models in companion (state-space) form:
$$\mathrm{d}f(t) = \underbrace{\begin{bmatrix} & 1 & & \\ & & \ddots & \\ & & & 1 \\ -a_1\lambda^{p+1} & -a_2\lambda^{p} & \cdots & -a_p\lambda \end{bmatrix}}_{G}\underbrace{\begin{bmatrix} f(t) \\ \mathrm{d}f/\mathrm{d}t \\ \vdots \\ \mathrm{d}^{p-1}f/\mathrm{d}t^{p-1} \end{bmatrix}}_{f(t)}\mathrm{d}t \;+\; \varsigma_\nu\begin{bmatrix} 0 \\ \vdots \\ 1 \end{bmatrix}\underbrace{\mathrm{d}\beta}_{w(t)\,\mathrm{d}t}$$
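For instance, a sketch under my own notational assumptions (the a_i taken as the binomial coefficients from the previous slide, and the diffusion scale ς_ν set to 1): build the companion matrix G and simulate the SDE with Euler–Maruyama.

```python
import numpy as np
from math import comb

def matern_companion(p, lam):
    """Companion matrix G for the half-integer Matern SDE (nu = p + 1/2).
    State: (f, df/dt, ..., d^p f/dt^p); a_i taken as binomial coefficients."""
    d = p + 1
    G = np.diag(np.ones(d - 1), k=1)   # ones on the superdiagonal
    G[-1, :] = [-comb(p + 1, i) * lam ** (p + 1 - i) for i in range(d)]
    return G

G = matern_companion(p=1, lam=1.0)     # Matern 3/2: 2-D state (f, df/dt)
L = np.array([0.0, 1.0])
dt, rng = 0.01, np.random.default_rng(1)
f = np.zeros(2)
path = np.empty(4000)
for k in range(4000):
    f = f + G @ f * dt + L * np.sqrt(dt) * rng.standard_normal()
    path[k] = f[0]                     # a draw from (approximately) the GP prior
```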

SLIDE 15

Stochastic Latent Force Models

Recall our motivating example: a mixture of known dynamics with some hidden input function. General form:
$$\alpha_0(x, t; \theta)\,x(t) + \alpha_1(x, t; \theta)\,\frac{\mathrm{d}}{\mathrm{d}t}x(t) + \ldots = u(t)$$
Placing a GP prior over u(t) gives what are termed latent force models.

M. A. Alvarez, D. Luengo, and N. D. Lawrence. Linear latent force models using Gaussian processes. IEEE Trans. Pattern Anal. Mach. Intell., 35(11):2693–2705, 2013.

SLIDE 16

Companion Form LFMs

It is easy enough to reframe an nth-order differential equation as first-order:
$$\mathrm{d}f/\mathrm{d}t = D(f(t), \theta) + L\,w(t)$$

SLIDE 17

Companion Form LFMs

The same first-order reframing, df/dt = D(f(t), θ) + L w(t), written out in full:

Companion Form
$$f(\tau) = \begin{bmatrix} x(\tau) & \left.\tfrac{\mathrm{d}x}{\mathrm{d}t}\right|_{t=\tau} & \cdots & \left.\tfrac{\mathrm{d}^{n-1}x}{\mathrm{d}t^{n-1}}\right|_{t=\tau} & u(\tau) & \left.\tfrac{\mathrm{d}u}{\mathrm{d}t}\right|_{t=\tau} & \cdots & \left.\tfrac{\mathrm{d}^{m-1}u}{\mathrm{d}t^{m-1}}\right|_{t=\tau} \end{bmatrix}^\top$$

$$D(f(t), \theta) = \begin{bmatrix} f_2 \\ f_3 \\ \vdots \\ \breve{\alpha}_0 f_1 + \sum_{i=1}^{n-1}\breve{\alpha}_i f_{i+1} + f_{n+1} \\ f_{n+2} \\ f_{n+3} \\ \vdots \\ a_0 f_{n+1} + \sum_{i=1}^{m-1} a_i f_{n+i+1} \end{bmatrix}, \qquad L = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}$$

SLIDE 18

Contents

→ 2. Variational Solutions to Non-Linear Latent Force Models

SLIDE 19

Inferring the Joint Posterior of a Non-Linear LFM

Problem: infer f and θ in
$$\frac{\mathrm{d}}{\mathrm{d}t}f(t) = D(f(t), \theta) + L\,w(t)$$

- We cannot infer f exactly if D is non-linear, since the joint posterior is intractable
- Pseudo-chaos under some systems
- Non-linear versions of filters/smoothers exist, e.g. E/UKF, ADF, SMC
- Joint parameter estimation is difficult, as is using autodifferentiation

J. Hartikainen, M. Seppänen, and S. Särkkä. State-space inference for non-linear latent force models with application to satellite orbit prediction. In ICML, pages 723–730, 2012.

SLIDE 20

Variational Bridge Constructs

We want to build a variational approximation of the conditional posterior p(x, u, θ | y).

Variational Bayes: find q* ∈ 𝒬 such that
$$q^* = \arg\min_{q\in\mathcal{Q}}\ \mathrm{KL}\big[\,q(x, u, \theta)\ \big\|\ p(x, u, \theta \mid y)\,\big]$$
where 𝒬 is a family of distributions parameterised by φ.

SLIDE 21

Variational Bridge Constructs

Equivalently, in the companion-form state, approximate p(f, θ | y):
$$q^* = \arg\min_{q\in\mathcal{Q}}\ \mathrm{KL}\big[\,q(f, \theta)\ \big\|\ p(f, \theta \mid y)\,\big]$$
where 𝒬 is a family of distributions parameterised by φ.

SLIDE 22

Variational Bridge Constructs

Evidence Lower Bound (ELBO)
$$\mathcal{L}(\phi) = \mathbb{E}_{f,\theta\sim q}\big[\log p(f, \theta, y) - \log q(f, \theta)\big]$$

SLIDE 23

Variational Bridge Constructs

Unbiased Evidence Lower Bound (ELBO)
$$\hat{\mathcal{L}}(\phi) = \frac{1}{n_s}\sum_{i=1}^{n_s}\log\frac{p(\theta^{(i)})\,p(f^{(i)} \mid \theta^{(i)})\,p(y \mid f^{(i)}, \theta^{(i)})}{q(\theta^{(i)})\,q(f^{(i)} \mid \theta^{(i)})}$$
where f^{(i)} ∼ q(f | θ^{(i)}) and θ^{(i)} ∼ q(θ), for i = 1, …, n_s.
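In code, this estimator is just an average of log-ratios over joint samples. A sketch (mine; all the density and sampler arguments are placeholders standing in for the model's terms):

```python
import numpy as np

def elbo_hat(log_p_theta, log_p_f_given_theta, log_lik,
             log_q_theta, log_q_f_given_theta,
             sample_theta, sample_f, n_s=32):
    """Unbiased Monte Carlo estimate of the ELBO."""
    total = 0.0
    for _ in range(n_s):
        theta = sample_theta()          # theta^(i) ~ q(theta)
        f = sample_f(theta)             # f^(i) ~ q(f | theta^(i))
        total += (log_p_theta(theta) + log_p_f_given_theta(f, theta)
                  + log_lik(f, theta)
                  - log_q_theta(theta) - log_q_f_given_theta(f, theta))
    return total / n_s
```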

SLIDE 24

Variational Bridge Constructs

Likelihood Agnostic

The unbiased ELBO above is valid for any (differentiable?) observation model p(y | f, θ).

SLIDE 25

Black-box Variational Inference

Black-box variational inference (BBVI) is predicated on the fact that the gradient of the ELBO can be written as an unbiased average. That is straightforward here, since we already have L̂(φ) as an unbiased average.

SLIDE 26

Black-box Variational Inference

Monte Carlo approximation of the ELBO gradient:
$$\nabla_\phi\mathcal{L}(\phi) \approx \frac{1}{n_s}\sum_{i=1}^{n_s}\nabla_\phi\log q(f^{(i)}, \theta^{(i)})\,\log\frac{p(f^{(i)}, \theta^{(i)}, y)}{q(f^{(i)}, \theta^{(i)})}$$
where f^{(i)} ∼ q(f | θ^{(i)}) and θ^{(i)} ∼ q(θ), for i = 1, …, n_s.

R. Ranganath, S. Gerrish, and D. Blei. Black box variational inference. In Artificial Intelligence and Statistics, 2014.
D. Duvenaud and R. P. Adams. Black-box stochastic variational inference in five lines of Python. In NIPS Workshop on Black-box Learning and Inference, 2015.

SLIDE 27

Black-box Variational Inference

Algorithm 1: BBVI with gradient ascent

    Initialise φ₀ (randomly); j ← 0
    while not converged do
        Calculate ∇_φ L(φ_j)
        Update φ w.r.t. the ELBO gradient, e.g. φ_{j+1} ← φ_j + h ∇_φ L(φ_j)
        j ← j + 1
    end while

The result is a variational approximation q(f, θ | φ_j) ≈ p(f, θ | y).
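As a toy end-to-end illustration of Algorithm 1 (my sketch, not the authors' code): score-function BBVI fitting q(θ) = N(m, s²) to an assumed unnormalised log-posterior. Step size, sample count, and the target are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
log_p = lambda th: -0.5 * (th - 2.0) ** 2    # toy unnormalised log-posterior

m, log_s, h, n_s = 0.0, 0.0, 0.01, 64         # phi = (m, log s)
for j in range(2000):
    s = np.exp(log_s)
    th = m + s * rng.standard_normal(n_s)     # theta^(i) ~ q(theta)
    logq = -0.5 * ((th - m) / s) ** 2 - np.log(s) - 0.5 * np.log(2 * np.pi)
    w = log_p(th) - logq                      # log (p / q)
    # score function of log q w.r.t. (m, log s)
    dm = (th - m) / s ** 2
    dls = ((th - m) / s) ** 2 - 1.0
    m += h * np.mean(dm * w)                  # gradient ascent on the ELBO
    log_s += h * np.mean(dls * w)
# q should end up near N(2, 1): print(m, np.exp(log_s))
```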

SLIDE 28

Parameter Estimation: q(θ)

Commonly in system estimation the model parameters θ are unknown. We can give these a Bayesian treatment too, using a variational representation of their posterior. Any variational approach works here, e.g. mean-field:
$$q(\theta) = \prod_i \mathcal{N}(\theta_i \mid m_i, s_i)$$
Here the free parameters are scalars, φ_θ = {(m_i, s_i)}_{∀i}.

SLIDE 29

Filtering Density: p(f | θ)

Represent the stochastic process f as a filtering distribution with moments m(t) and P(t).

Mean term:
$$\frac{\mathrm{d}}{\mathrm{d}t}m(t) = D(m, t; \theta)$$

SLIDE 30

Filtering Density: p(f | θ)

Mean term as above. Covariance term: ??

SLIDE 31

Extended Filtering Density: p(f | θ)

Covariance term (linearising D about the mean, as in the extended Kalman filter):
$$\frac{\mathrm{d}}{\mathrm{d}t}P(t) = J_D(m, t; \theta)\,P(t) + P(t)\,J_D(m, t; \theta)^\top + L\varsigma^2 L^\top$$

SLIDE 32

Extended Filtering Density: p(f | θ)

Assume steady state: dP/dt = 0.

SLIDE 33

Extended Filtering Density: p(f | θ)

Denote the covariance in steady state by Σ̃ and solve
$$J_D(m, t; \theta)\,\tilde{\Sigma} + \tilde{\Sigma}\,J_D(m, t; \theta)^\top = -L\varsigma^2 L^\top$$

SLIDE 34

Extended Filtering Density: p(f | θ)

This is an example of a continuous Lyapunov equation: easy to solve numerically, but we need a differentiable form of Σ̃.
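For fixed values, the numerical route is a one-liner. A sketch using SciPy (the matrices here are illustrative assumptions; note that this numerical solve does not by itself give gradients, which is exactly the problem flagged above):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Illustrative Jacobian and diffusion for a 2-D state (assumed, stable values)
J = np.array([[-0.2, 1.0],
              [0.0, -1.5]])        # J_D(m, t; theta) at some fixed point
varsigma2 = 3.0                    # spectral density of the white noise
L = np.array([[0.0], [1.0]])

# Solve J Sigma + Sigma J^T = -L varsigma^2 L^T for the steady state
Sigma = solve_continuous_lyapunov(J, -varsigma2 * (L @ L.T))
assert np.allclose(J @ Sigma + Sigma @ J.T, -varsigma2 * (L @ L.T))
```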

SLIDE 35

Extended Filtering Density: p(f | θ)

Note that Σ̃ may be a function of m(t) and θ, so it is stochastic too.

SLIDE 40

Transition density: p(f_k | f_{k−1}, θ)

We construct a discrete-time transition density using Euler–Maruyama:
$$p(f_k \mid f_{k-1}, \theta) = \mathcal{N}(f_k \mid \mu_\Delta, \Sigma_\Delta),$$
where
$$\mu_\Delta = f_{k-1} + D(f_{k-1}, t_k; \theta)\,\Delta t,$$
$$\Sigma_\Delta = \tilde{\Sigma}(f_{k-1}, t_k; \theta) - \exp(\Delta t\,J_D)\,\tilde{\Sigma}(f_{k-1}, t_k; \theta)\,\exp(\Delta t\,J_D)^\top$$

SLIDE 41

Generative model: p(f | θ)

Marginal:
$$p(f \mid \theta) = p(f_0 \mid \theta)\prod_{k=1}^{T} p(f_k \mid f_{k-1}, \theta)$$

SLIDE 42

Generative model: p(f | θ)

Additional points on the marginal above:
- f_k = f(t_k) and f_{k+1} = f(t_{k+1}) = f(t_k + Δt)
- p(f_k | f_{k−1}) ≡ p(x_k | x_{k−1}, u_k) p(u_k | u_{k−1})

SLIDE 43

Variational Approximation: q(f | θ)

- A family of distributions parameterised by φ
- Needs to be flexible, sampleable, and invertible (for autodifferentiation)

SLIDE 44

Variational Approximation: q(f | θ)

- Look to (Bayesian) neural networks and other deep models

SLIDE 45

Variational Approximation: q(f | θ)

- Need to encode temporal (recurrent) structure

SLIDE 46

Variational Approximation: q(f | θ)

Candidates:
- RNNs with priors on the weights

SLIDE 47

Variational Approximation: q(f | θ)

Family of distributions parameterised by φ Needs to be flexible, sampleable and invertible (for autodifferentiation) Look to (Bayesian) neural networks and other deep models Need to encode temporal (recurrent) structure

RNNs with priors on the weights Normalising flows

SLIDE 48

Parametrising q with an RNN

Pros
- Can represent high-dimensional recurrent structure
- A bi-directional RNN can represent the first-order Markov properties of the model
- Priors over the weights, optimising in weight-space

SLIDE 49

Parametrising q with an RNN

Cons
- Need to sample sequentially
- Backpropagation through time (BPTT) is inefficient for propagating gradients
- Doesn't handle latent dimensions well

SLIDE 50

Inverse Autoregressive Flows

- We want to define a distribution for f that is invertible and expressive
- Inverse autoregressive flows (IAFs) introduce a base random vector z₀ ∼ N(0, I)
- Layers of this random variable are shifted and scaled through 1-D convolutions to create an autoregressive model
- Very flexible, and can be sampled in parallel

SLIDE 51

Inverse Autoregressive Flows

Autoregressive Flows
$$z_j = \sigma_j \odot z_{j-1} + \mu_j$$
where [μ_j, s_j] = autoregressiveNN(z_{j−1}, y, θ), σ_j = log(1 + exp s_j), and finally f = bijector(z_N).
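A minimal numpy sketch of one such flow layer (my illustration; the real model uses an autoregressive neural network, replaced here by a masked random linear map so the example is self-contained):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50                       # length of the rolled-out state sequence
z = rng.standard_normal(T)   # base sample z_0 ~ N(0, I)

def ar_layer(z, rng):
    """One inverse autoregressive flow layer: z_j = sigma_j * z_{j-1} + mu_j.
    [mu, s] come from an autoregressive map (stand-in: masked linear layers)."""
    mask = np.tril(np.ones((T, T)), k=-1)         # each output sees only the past
    W_mu = mask * rng.standard_normal((T, T)) * 0.1
    W_s = mask * rng.standard_normal((T, T)) * 0.1
    mu, s = W_mu @ z, W_s @ z
    sigma = np.log1p(np.exp(s))                   # softplus, as on the slide
    return sigma * z + mu, np.sum(np.log(sigma))  # new z and log|det Jacobian|

logdet = 0.0
for _ in range(3):                                # N = 3 flow layers
    z, ld = ar_layer(z, rng)
    logdet += ld
```

Because the map is autoregressive, its Jacobian is triangular, so the log-determinant is just the sum of log σ terms accumulated above.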

SLIDE 52

Autoregressive Neural Network Layers

Algorithm 2: jth Autoregressive Neural Network Layer

    ξ^(0a) ← conv1d(z_{j−1}, y, t)
    ξ^(0b) ← dense(θ)
    ξ^(1) ← elu(ξ^(0a) + ξ^(0b))
    for i = 2 … n_ℓ do
        ξ^(i) ← batchnorm(conv1d(elu(ξ^(i−1))))
    end for
    [μ_j, s_j] ← conv1d(ξ^(n_ℓ))
    σ_j ← softplus(s_j)
    z_j ← σ_j ⊙ z_{j−1} + μ_j
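A rough PyTorch transcription of Algorithm 2 (my sketch; the layer widths, kernel sizes, and how z, y, t, and θ are stacked into channels are all assumptions, and the causal/local masking discussed on the next slide is omitted):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ARLayer(nn.Module):
    """Sketch of Algorithm 2: one autoregressive NN layer producing (mu, s)."""
    def __init__(self, in_ch=3, hidden=32, n_hidden=2, theta_dim=4):
        super().__init__()
        self.conv0 = nn.Conv1d(in_ch, hidden, 3, padding=1)
        self.dense0 = nn.Linear(theta_dim, hidden)
        self.convs = nn.ModuleList(
            [nn.Conv1d(hidden, hidden, 3, padding=1) for _ in range(n_hidden)])
        self.norms = nn.ModuleList(
            [nn.BatchNorm1d(hidden) for _ in range(n_hidden)])
        self.conv_out = nn.Conv1d(hidden, 2, 3, padding=1)  # channels: mu and s

    def forward(self, z, y, t, theta):
        # z, y, t: (batch, T); theta: (batch, theta_dim)
        xi = F.elu(self.conv0(torch.stack([z, y, t], dim=1))
                   + self.dense0(theta).unsqueeze(-1))
        for conv, norm in zip(self.convs, self.norms):
            xi = norm(conv(F.elu(xi)))
        mu, s = self.conv_out(xi).unbind(dim=1)
        sigma = F.softplus(s)
        return sigma * z + mu, sigma
```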

SLIDE 53

Locally Masked Multivariate Inverse Autoregressive Flows

- Passing the entire flow vector z_j can lead to complex (unrepresentative) temporal dependencies
- We use a local receptive field to update flow layers (similar to WaveNet)
- The multidimensional state f_k is rolled out in sequence
- Hacks and tricks to approximate a locally informed flow state

SLIDE 54

Locally Masked Multivariate Inverse Autoregressive Flows

[Diagram: flow layers z₀ → z₁ → z₂ generating the state sequence f_{t_{k−1}}, f_{t_k}, f_{t_{k+1}}, with each update drawing on a local receptive field.]

SLIDE 55

Variational Log Density

$$\log q(f \mid \theta) = -\frac{1}{2}z_0^\top z_0 - \frac{T}{2}\log 2\pi - \sum_{t=1}^{T}\sum_{j=1}^{N}\log\sigma_j + \log\big|J^{-1}(f)\big|$$

SLIDE 56

Variational Log Density

For each sample, with z^{(i)} ∼ N(0, I):
$$\log q(f^{(i)} \mid \theta^{(i)}) = -\frac{1}{2}z_0^{(i)\top} z_0^{(i)} - \frac{T}{2}\log 2\pi - \sum_{t=1}^{T}\sum_{j=1}^{N}\log\sigma_j^{(i)} + \log\big|J^{-1}(f^{(i)})\big|$$

SLIDE 57

Unbiased Evidence Lower Bound

ELBO
$$\hat{\mathcal{L}}(\phi) = \frac{1}{n_s}\sum_{i=1}^{n_s}\log\frac{p(\theta^{(i)})\,p(f^{(i)} \mid \theta^{(i)})\,p(y \mid f^{(i)}, \theta^{(i)})}{q(\theta^{(i)})\,q(f^{(i)} \mid \theta^{(i)})}$$
where f^{(i)} ∼ q(f | θ^{(i)}) and θ^{(i)} ∼ q(θ), for i = 1, …, n_s.

SLIDE 58

Contents

→ 3. Approximate Gaussian Processes

SLIDE 59

Exponential Gaussian Process

$$f(t) \sim \mathcal{GP}\big(0,\ \sigma_f^2\exp(-\lambda|t - t'|)\big)$$

SLIDE 60

Exponential Gaussian Process

$$f(t) \sim \mathcal{GP}\big(0,\ \sigma_f^2\exp(-\lambda|t - t'|)\big)$$
$$\mathrm{d}f(t) = -\lambda f(t)\,\mathrm{d}t + \sqrt{2\sigma_f^2\lambda}\,\mathrm{d}\beta(t)$$
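This correspondence is easy to check numerically (my sketch): simulate the Ornstein–Uhlenbeck SDE by Euler–Maruyama and compare its stationary variance to σ_f².

```python
import numpy as np

lam, sigma_f2, dt = 1.0, 1.0, 0.01
rng = np.random.default_rng(0)

f = 0.0
samples = []
for k in range(100_000):
    f += -lam * f * dt + np.sqrt(2 * sigma_f2 * lam * dt) * rng.standard_normal()
    if k > 10_000:            # discard burn-in before the process is stationary
        samples.append(f)

print(np.var(samples))        # should be close to sigma_f2 = 1.0
```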

SLIDE 61

Exponential Gaussian Process

Samples from q(f | θ)

[Figure: variational sample paths f(t) over t, with observations marked.]

SLIDE 62

Exponential Gaussian Process

Samples from q(f | θ) and mean and covariance of p(f | y, θ)

[Figure: variational paths and observations, overlaid with the exact GP mean fit and 95% CI.]

SLIDE 63

Model Criticism

- Visual confirmation is fine: it looks like a good estimate
- But empirical evidence for reliability is needed
- Map corresponding samples from p and q into an RKHS
- Two-sample test with MMD to validate the approximation

Maximum Mean Discrepancy (MMD)
- MMD is a measure of distance between two probability distributions
- Samples are embedded in an RKHS
- The metric describes distance as a norm in that RKHS
- Two-sample testing for H₀: MMD²(μ_p, μ_q) = 0

A. Gretton, et al. A kernel two-sample test. JMLR, 2012.
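A compact sketch of the (biased, V-statistic) MMD² estimate with an RBF kernel, as one might use for this kind of two-sample check (the bandwidth choice and sample data are assumptions):

```python
import numpy as np

def mmd2(X, Y, gamma=1.0):
    """Biased MMD^2 estimate between samples X (n,d) and Y (m,d),
    using an RBF kernel k(a,b) = exp(-gamma ||a-b||^2)."""
    def k(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))           # e.g. paths sampled from p
Y = rng.standard_normal((100, 2)) + 0.5     # e.g. paths sampled from q
print(mmd2(X, Y))
```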

SLIDE 64

Model Criticism

MMD² values comparing samples from q(f | θ) and p(f | y, θ), fit on different numbers of observations N:

    Epoch     10       100      500      1 000    2 500    25 000
    N = 6     0.1111   0.1267   0.0596   0.0484   0.0556   –
    N = 20    0.2731   0.1147   0.0654   0.0696   0.0471   0.0316

Thresholds for rejection at 95% confidence: 0.0371 (N = 6) and 0.0337 (N = 20).

SLIDE 65

Matérn Covariances and Non-Gaussian Likelihoods

Summary statistics for p(y | f^{(i)}, θ), f^{(i)} ∼ q(f | θ), plotted against the true latent function f₂(t).

[Figure: f₂(t), observations, mean and median variational paths, and the 95% CI of the variational paths, over t.]

SLIDE 66

Matérn Covariances and Non-Gaussian Likelihoods

Approximating Matérn 3/2 GPs

SLIDE 67

Contents

→ 4. Some Results

SLIDE 68

Toy Non-Linear ODE

$$\frac{\mathrm{d}}{\mathrm{d}t}x(t) = -\frac{2}{3}\sin(\omega x(t)) + u(t)$$

SLIDE 69

Toy Non-Linear ODE

$$\frac{\mathrm{d}}{\mathrm{d}t}x(t) = -\frac{2}{3}\sin(\omega x(t)) + u(t), \qquad u(t) \sim \mathcal{GP}\big(0,\ k_{\nu=1/2}(t, t')\big)$$

SLIDE 70

Toy Non-Linear ODE

In companion form:
$$\frac{\mathrm{d}}{\mathrm{d}t}\underbrace{\begin{bmatrix} x(t) \\ u(t) \end{bmatrix}}_{f(t)} = \underbrace{\begin{bmatrix} -2\cos(\omega f_1)/3 + f_2 \\ -\lambda f_2 \end{bmatrix}}_{D(f(t),\,\theta)} + \underbrace{\begin{bmatrix} 0 \\ 1 \end{bmatrix}}_{L}\,w(t)$$

SLIDE 71

Toy Non-Linear ODE

The Jacobian of D(f(t), θ) w.r.t. f is
$$J_D(f(t)) = \begin{bmatrix} 2\omega\sin(\omega f_1)/3 & 1 \\ 0 & -\lambda \end{bmatrix}$$

SLIDE 72

Toy Non-Linear ODE

The steady-state covariance Σ̃ then satisfies
$$J_D(f(t))\,\tilde{\Sigma} + \tilde{\Sigma}\,\big[J_D(f(t))\big]^\top + 2\lambda\sigma^2 L L^\top = 0$$

SLIDE 73

Toy Non-Linear ODE

Solving the Lyapunov equation above gives
$$\tilde{\Sigma} = \begin{bmatrix} \dfrac{\sigma^2\lambda}{\frac{2\lambda\omega\sin(\omega f_1)}{3}\left(\frac{2\omega\sin(\omega f_1)}{3} - \lambda\right)} & \dfrac{\sigma^2\lambda}{\lambda^2 - \frac{2\lambda\omega\sin(\omega f_1)}{3}} \\[2ex] \dfrac{\sigma^2\lambda}{\lambda^2 - \frac{2\lambda\omega\sin(\omega f_1)}{3}} & \sigma^2 \end{bmatrix}$$
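A quick numerical sanity check of this closed form (my sketch; parameter values are arbitrary, with f₁ chosen so that J_D is stable):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

lam, sigma2, omega, f1 = 1.5, 0.8, 2.0, -0.3   # arbitrary test values
a = 2 * omega * np.sin(omega * f1) / 3          # recurring term in J_D

J = np.array([[a, 1.0], [0.0, -lam]])
L = np.array([[0.0], [1.0]])
numeric = solve_continuous_lyapunov(J, -2 * lam * sigma2 * (L @ L.T))

closed = np.array([
    [sigma2 * lam / (lam * a * (a - lam)), sigma2 * lam / (lam**2 - lam * a)],
    [sigma2 * lam / (lam**2 - lam * a),    sigma2],
])
print(np.allclose(numeric, closed))             # expect True
```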

SLIDE 74

Toy Non-Linear ODE

Samples of joint posterior of x, u and θ

SLIDE 75

Real World: Gene Expression Data

Multi-output system with a non-linear dependency on the input:
$$\frac{\mathrm{d}}{\mathrm{d}t}x_d(t) = a_d - b_d x_d(t) + s_d\,\frac{u(t)}{\gamma_d + u(t)}$$
- x_d is a model of gene expression, which is noisily observable
- u models the concentration of the transcription factor regulating the observed genes

SLIDE 76

Real World: Gene Expression Data

with
$$x_d(t), u(t) > 0, \qquad \theta_d = \{a_d, b_d, s_d, \gamma_d\}, \qquad d = 1, \ldots, ?$$

SLIDE 77

Real World: Gene Expression Data

Place a GP prior over exp u(t).

SLIDE 78

Real World: Gene Expression Data

…and infer θ_d simultaneously.

SLIDE 79

Real World: Gene Expression Data

Inferred TF concentration and predicted gene expressions for tnfrsf10b (blue) and p26 sesn1 (red)

SLIDE 80

Contents

→ 5. Recap

SLIDE 81

Overview

- GP priors on non-linear forced models are non-linear SDEs

SLIDE 82

Overview

- Filtering approaches struggle with joint parameter estimation
- Sequential inference is slow for propagating gradients

SLIDE 83

Overview

- With inverse autoregressive flows we can batch-sample time series

SLIDE 84

Overview

- We can construct an approximate model for the joint posterior and infer state, input, and parameters by optimising NN weights

SLIDE 85

Overview

- The approximation of GPs is quantifiably good

SLIDE 86

Contents

→ 6. Open Issues

SLIDE 87

Open Issue: Calculating Steady State Covariance

Solving the continuous Lyapunov equation
$$J_D(f, t; \theta)\,\tilde{\Sigma} + \tilde{\Sigma}\,J_D(f, t; \theta)^\top = -L\varsigma^2 L^\top$$
is possible for fixed values of f, t, and θ using numerical solvers, but hard to do online, so no gradients! Solving it manually becomes increasingly difficult as the dimension grows: the solution is a system of d(d − 1)/2 equations.

SLIDE 88

Open Issue: Dimensionality

- Smoother GPs require more orders of differentiation in the state
- Approximations of infinitely-differentiable covariance functions (e.g. periodic, Gaussian RBF) require a series approximation
- The state dimension is proportional to the series truncation threshold

SLIDE 89

References

- M. A. Alvarez, D. Luengo, and N. D. Lawrence. Linear latent force models using Gaussian processes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(11):2693–2705, 2013.
- D. Duvenaud and R. P. Adams. Black-box stochastic variational inference in five lines of Python. In NIPS Workshop on Black-box Learning and Inference, 2015.
- A. Gretton, K. M. Borgwardt, M. J. Rasch, B. Schölkopf, and A. Smola. A kernel two-sample test. Journal of Machine Learning Research, 13:723–773, 2012.
- J. Hartikainen, M. Seppänen, and S. Särkkä. State-space inference for non-linear latent force models with application to satellite orbit prediction. In Proceedings of the 29th International Conference on Machine Learning, pages 723–730, 2012.
- R. Ranganath, S. Gerrish, and D. Blei. Black box variational inference. In Artificial Intelligence and Statistics, pages 814–822, 2014.