

  1. AReS and MaRS - Adversarial and MMD-Minimizing Regression for SDEs
     Gabriele Abbati*1, Philippe Wenk*23, Michael A. Osborne1, Andreas Krause2, Bernhard Schölkopf4, Stefan Bauer4
     1 University of Oxford, 2 ETH Zürich, 3 Max Planck ETH Center for Learning Systems, 4 Max Planck Institute for Intelligent Systems
     Thirty-sixth International Conference on Machine Learning

  2. Stochastic Differential Equations in the Wild
     [Figure: (a) Robotics (source: Athena robot, MPI-IS); (b) Atmospheric Modeling (source: Wikipedia); (c) Stock Markets (source: Yahoo Finance)]

  3. Gradient Matching
     ODE: $\dot{x} = f(x, \theta)$, $y = x + \epsilon$ with $\epsilon \sim \mathcal{N}(0, \sigma_y)$; given $f$ and $y$, infer $x$ and $\theta$.
     SDE: $dx = f(x, \theta)\,dt + G\,dW$, $y = x + \epsilon$ with $\epsilon \sim \mathcal{N}(0, \sigma_y)$; given $f$ and $y$, infer $x$, $G$ and $\theta$.
     Integration-based methods: parameters → trajectory. Integration-free methods: trajectory → parameters.

  4. Classic Gradient Matching - Model
     (1) Gaussian process prior on the states:
         $p(x \mid \phi) = \mathcal{N}(x \mid \mu, C_\phi)$
         $p(\dot{x} \mid x, \phi) = \mathcal{N}(\dot{x} \mid D x, A)$
     (2) ODE model:
         $p(\dot{x} \mid x, \theta, \gamma) = \mathcal{N}(\dot{x} \mid f(x, \theta), \gamma I)$
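The derivative moments above follow from standard GP identities: $D = C'_\phi C_\phi^{-1}$ and $A = C''_\phi - C'_\phi C_\phi^{-1} C_\phi'^{T}$. A minimal numpy sketch, assuming an RBF kernel with arbitrary hyperparameters (neither is specified on the slide):

```python
import numpy as np

def rbf_kernel_blocks(t, lengthscale=1.0, variance=1.0):
    """RBF kernel C_phi on a time grid t, plus its derivative blocks
    w.r.t. the first argument (dC) and both arguments (ddC)."""
    diff = t[:, None] - t[None, :]
    C = variance * np.exp(-0.5 * (diff / lengthscale) ** 2)
    dC = -(diff / lengthscale**2) * C                                # d k(s, u) / ds
    ddC = (1.0 / lengthscale**2 - (diff / lengthscale**2) ** 2) * C  # d^2 k / ds du
    return C, dC, ddC

t = np.linspace(0.0, 10.0, 50)
C, dC, ddC = rbf_kernel_blocks(t)
Cinv = np.linalg.inv(C + 1e-8 * np.eye(len(t)))   # jitter for numerical stability

D = dC @ Cinv                  # p(xdot | x, phi) = N(xdot | D x, A)
A = ddC - dC @ Cinv @ dC.T
```

The same two matrices reappear later for the latent process $z$, where the data-based model is $p(\dot{z} \mid z, \phi) = \mathcal{N}(\dot{z} \mid Dz, A)$.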

  5. Classic Gradient Matching - Inference
     Calderhead, Girolami, and Lawrence (2009) and Dondelinger et al. (2013), product of experts:
         $p(\dot{x}) \propto p_{\mathrm{data}}(\dot{x})\, p_{\mathrm{ODE}}(\dot{x})$
     Wenk et al. (2018), FGPGM, forced equality:
         $p(\dot{x}) \propto p_{\mathrm{data}}(\dot{x}_{\mathrm{data}})\, p_{\mathrm{ODE}}(\dot{x}_{\mathrm{ODE}})\, \delta(\dot{x}_{\mathrm{data}} - \dot{x})\, \delta(\dot{x}_{\mathrm{ODE}} - \dot{x})$
     Wenk*, Abbati* et al. (2019), ODIN: ODEs as constraints.

  6. Stochastic Differential Equations
     General problem (SDE): $dx = f(x, \theta)\,dt + G\,dW$, $y = x + \epsilon$ with $\epsilon \sim \mathcal{N}(0, \sigma_y)$; given $f$ and $y$, infer $x$, $G$ and $\theta$.
     Example: $dx = \theta_0 x (\theta_1 - x^2)\,dt + G\,dw$; given $f$ and $y$, infer $x$, $G$ and $\theta$.
     [Figure: sample path of the example SDE over $t \in [0, 20]$]
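For intuition, the example can be forward-simulated with Euler-Maruyama. A minimal sketch; step size, horizon, initial condition and all parameter values are illustrative assumptions, not the paper's experimental settings:

```python
import numpy as np

rng = np.random.default_rng(0)
theta0, theta1, G, sigma_y = 0.1, 4.0, 0.5, 0.1   # assumed values
dt, T = 0.01, 20.0
n = int(T / dt)

x = np.empty(n + 1)
x[0] = 1.0
for k in range(n):
    drift = theta0 * x[k] * (theta1 - x[k] ** 2)  # f(x, theta)
    x[k + 1] = x[k] + drift * dt + G * np.sqrt(dt) * rng.standard_normal()

y = x + sigma_y * rng.standard_normal(x.shape)     # y = x + eps
```

The drift pushes paths toward the two wells at $\pm\sqrt{\theta_1}$, so the stationary distribution is bimodal.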

  7. Stochastic Gradient Matching?
     Problems:
     - Both observation and process noise
     - Stochastic sample paths
     - Paths are not differentiable
     [Figure: noisy sample path over $t \in [0, 20]$]

  8. The Doss-Sussmann Transformation
     General problem (SDE): $dx = f(x, \theta)\,dt + G\,dW$, $y = x + \epsilon$ with $\epsilon \sim \mathcal{N}(0, \sigma_y)$; given $f$ and $y$, infer $x$, $G$ and $\theta$.
     Definition (Ornstein-Uhlenbeck process): a stochastic process $o$ defined by the equation $do = -o\,dt + G\,dW$.
     We introduce the latent variable $z = x - o$: the Brownian terms cancel, leaving differentiable paths with gradients
         $dz(t) = \{ f(z(t) + o(t), \theta) + o(t) \}\,dt$
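The cancellation is easy to verify numerically: driving the SDE and the OU process with the same Brownian increments makes $z = x - o$ satisfy the ODE above. A sketch with the double-well drift from the example (all constants are assumptions); in this Euler-Maruyama discretization the increments cancel exactly, so both sides agree to floating-point precision:

```python
import numpy as np

rng = np.random.default_rng(1)
theta0, theta1, G = 0.1, 4.0, 0.5
dt, n = 1e-3, 20_000

f = lambda x: theta0 * x * (theta1 - x ** 2)
x, o = 1.0, 0.0
z, rhs = [x - o], []
for _ in range(n):
    dW = np.sqrt(dt) * rng.standard_normal()
    rhs.append(f((x - o) + o) + o)     # f(z + o, theta) + o
    x += f(x) * dt + G * dW            # dx = f(x, theta) dt + G dW
    o += -o * dt + G * dW              # do = -o dt + G dW, same dW
    z.append(x - o)

zdot = np.diff(np.array(z)) / dt       # finite-difference derivative of z
print(np.max(np.abs(zdot - np.array(rhs))))  # ~1e-12: z is differentiable
```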


  11. A Novel Generative Model
     Previous generative model: $Y = X + E$. New generative model: $Y = Z + O + E$.
     Resulting observation marginal distribution:
         $p(\tilde{y} \mid \phi, G, \sigma) = \mathcal{N}(0,\; C_\phi + B \Omega B^T + T)$
     where $C_\phi$ comes from the Gaussian prior, $B \Omega B^T$ from the OU process, and $T$ from the observation noise.
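A sketch of how this marginal might be assembled for a single, directly observed state (so $B = I$ here, an assumption), using the covariance $\Omega_{ij} = \tfrac{G^2}{2}\,(e^{-|t_i - t_j|} - e^{-(t_i + t_j)})$ of an OU process started at zero with unit mean-reversion rate; the GP fit maximizes this log marginal likelihood over $\phi$, $G$ and $\sigma$:

```python
import numpy as np

def rbf(t, lengthscale=1.0, variance=1.0):
    d = t[:, None] - t[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def ou_covariance(t, G):
    """Covariance of o with do = -o dt + G dW and o(0) = 0."""
    s, u = np.meshgrid(t, t, indexing="ij")
    return 0.5 * G**2 * (np.exp(-np.abs(s - u)) - np.exp(-(s + u)))

def log_marginal(y, t, G, sigma, **kern):
    # K = C_phi + Omega + sigma^2 I  (B = I assumed)
    K = rbf(t, **kern) + ou_covariance(t, G) + sigma**2 * np.eye(len(t))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha - np.log(np.diag(L)).sum()
            - 0.5 * len(t) * np.log(2 * np.pi))

rng = np.random.default_rng(2)
t = np.linspace(0.0, 10.0, 40)
y = np.sin(t) + 0.1 * rng.standard_normal(t.size)   # stand-in observations
print(log_marginal(y, t, G=0.5, sigma=0.1))
```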


  13. A Tale of Two Graphical Models
     SDE-based model: $p(\dot{z} \mid o, z, \theta) = \delta(\dot{z} - f(z + o, \theta) - o)$
     Data-based model: $p(\dot{z} \mid z, \phi) = \mathcal{N}(\dot{z} \mid D z, A)$
     Bringing the two into agreement, $p(\dot{z} \mid o, z, \theta) \sim p(\dot{z} \mid z, \phi)$, yields a good estimate of $\theta$.
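A minimal sketch of one draw from each side, with stand-in values for the GP quantities $D$, $A$ and the OU path $o$ (in the method itself these come from the fitted generative model above):

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_zdot_data(z, D, A):
    # Data-based model: Gaussian GP conditional, zdot | z ~ N(D z, A)
    return rng.multivariate_normal(D @ z, A)

def sample_zdot_sde(z, o, f, theta):
    # SDE-based model: a Dirac delta, zdot = f(z + o, theta) + o
    return f(z + o, theta) + o

n = 50
z = rng.standard_normal(n)                   # stand-in latent states
o = 0.1 * rng.standard_normal(n)             # stand-in OU draw
D, A = np.zeros((n, n)), 1e-2 * np.eye(n)    # stand-in GP moments
f = lambda x, th: th[0] * x * (th[1] - x**2)

zdot_data = sample_zdot_data(z, D, A)
zdot_sde = sample_zdot_sde(z, o, f, (0.1, 4.0))
```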


  15. Sample-based Parameter Inference
     GP fit (Z + O + E noise model), then draw derivative samples from both models:
         $\dot{z}_{\mathrm{SDE}} \sim p(\dot{z} \mid o, z, \theta)$ (samples from $p_{\mathrm{SDE}}$)
         $\dot{z}_{\mathrm{data}} \sim p(\dot{z} \mid z, \phi)$ (samples from $p_{\mathrm{data}}$)
     Iterative gradient-based optimization pushes $\dot{z}_{\mathrm{SDE}} \sim \dot{z}_{\mathrm{data}}$:
         (1) AReS (WGAN): $\theta \leftarrow \theta - \nabla_\theta \left[ \tfrac{1}{M} \sum_{i=1}^{M} f_\omega(\dot{z}^{(i)}_{\mathrm{SDE}}) \right]$
         (2) MaRS (MMD): $\theta \leftarrow \theta - \nabla_\theta\, \mathrm{MMD}^2_u[\dot{z}_{\mathrm{SDE}}, \dot{z}_{\mathrm{data}}]$
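A minimal sketch of the unbiased $\mathrm{MMD}^2_u$ estimator behind the MaRS update; the RBF kernel and its bandwidth are assumptions, and in practice the update comes from differentiating this loss w.r.t. $\theta$ with autodiff (AReS instead trains a WGAN critic $f_\omega$ and descends its score on the SDE samples):

```python
import numpy as np

def mmd2_unbiased(X, Y, bandwidth=1.0):
    """U-statistic estimate of MMD^2 between samples X (M, d) and Y (N, d)."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * bandwidth**2))
    Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
    M, N = len(X), len(Y)
    return ((Kxx.sum() - np.trace(Kxx)) / (M * (M - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (N * (N - 1))
            - 2.0 * Kxy.mean())

rng = np.random.default_rng(4)
zdot_sde = rng.normal(0.2, 1.0, size=(200, 1))    # stand-in model samples
zdot_data = rng.normal(0.0, 1.0, size=(200, 1))   # stand-in GP samples
print(mmd2_unbiased(zdot_sde, zdot_data))          # small when the two match
```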


  19. Samples during Training
     [Figure: data-based vs. model-based derivative samples $\dot{z}(t)$ over $t$: (a) samples before training, (b) samples after training]

  20. Experimental Results - Lotka-Volterra
     $dx_1(t) = [\theta_0 x_1(t) - \theta_1 x_1(t) x_2(t)]\,dt + G_{11}\,dw_1(t)$
     $dx_2(t) = [-\theta_2 x_2(t) + \theta_3 x_1(t) x_2(t)]\,dt + G_{21}\,dw_1(t) + G_{22}\,dw_2(t)$

     LV     GT    NPSDE          ESGF           AReS           MaRS
     θ0     2     1.58 ± 0.71    2.04 ± 0.09    2.36 ± 0.18    2.00 ± 0.09
     θ1     1     0.74 ± 0.31    1.02 ± 0.05    1.18 ± 0.9     1.00 ± 0.04
     θ2     4     2.26 ± 1.51    3.87 ± 0.59    3.70 ± 0.51    3.97 ± 0.63
     θ3     1     0.49 ± 0.35    0.96 ± 0.14    0.91 ± 0.14    0.98 ± 0.18

     Diffusion estimates (GT, followed by estimates):
     H1,1 = 0.05: 0.03 ± 0.004, 0.01 ± 0.03; H1,2 = 0.03: 0.02 ± 0.01, 0.01 ± 0.01; H2,1 = 0.03: 0.02 ± 0.01, 0.01 ± 0.01; H2,2 = 0.09: 0.09 ± 0.03, 0.03 ± 0.02
     [Figure: Lotka-Volterra sample paths over $t \in [0, 2]$]

  21. Experimental Results - Double-Well Potential
     $dx(t) = \theta_0 x (\theta_1 - x^2)\,dt + G\,dw(t)$

     DW     GT     NPSDE            VGPA           ESGF           AReS           MaRS
     θ0     0.1    0.09 ± 7.00      0.05 ± 0.04    0.01 ± 0.03    0.09 ± 0.04    0.10 ± 0.05
     θ1     4      3.36 ± 248.82    1.11 ± 0.66    0.11 ± 0.16    3.68 ± 1.34    3.85 ± 1.10

     Diffusion estimates (GT, followed by estimates): H = 0.25: 0.21 ± 0.09, 0.00 ± 0.02, 0.20 ± 0.05
     [Figure: double-well sample path over $t \in [0, 20]$]

  22. Contributions
     - We extend classical gradient matching to SDEs.
     - We introduce a novel statistical framework combining the Doss-Sussmann transformation and GPs.
     - We introduce a novel parameter inference scheme that leverages adversarial and moment-matching loss functions.
     - We improve parameter inference accuracy in systems of SDEs.

  23. Thank You
     Come and catch us at poster #216.
     Bonus round: check out our paper on classic gradient matching!
     Wenk*, P., Abbati*, G., Bauer, S., Osborne, M. A., Krause, A., Schölkopf, B. (2019). ODIN: ODE-Informed Regression for Parameter and State Inference in Time-Continuous Dynamical Systems. arXiv preprint arXiv:1902.06278.
