Safe Learning of Regions of Attraction for Uncertain, Nonlinear - PowerPoint PPT Presentation

Safe Learning of Regions of Attraction for Uncertain, Nonlinear Systems with Gaussian Processes. Felix Berkenkamp, Riccardo Moriconi, Angela P. Schoellig, Andreas Krause. CDC, December 2016.


  1. Safe Learning of Regions of Attraction for Uncertain, Nonlinear Systems with Gaussian Processes Felix Berkenkamp, Riccardo Moriconi, Angela P. Schoellig, Andreas Krause @CDC, December 2016

  2. What is control? Modelling, Model, Control theory, Implement

  3. One small assumption… Model. Degraded performance. Instability.

  4. What is control? Modelling, Model, Control theory, Implement

  5. Why is learning not commonly used? Because safety matters!

  6. What can go wrong? Modelling, Model, Control theory, Feedback, Implement. Excitation? Stability?

  7. Problem definition. Can we learn about the dynamics while remaining stable? Assumptions: Lipschitz continuous; bounded RKHS norm. Where is this control policy safe to use? You can experiment, but no system failures!

  8. Challenges with Bayesian learning. Exploration (excitation); Stability certificates (robustness). ✓ ✓ Linear systems [L. Jung, SAP’98]; Linear controllers [F. Berkenkamp et al., ECC’15]. ✓ ? Finite domains [R. I. Brafman et al., JMLR’02]; Nonlinear systems [A. K. Akametalu et al., CDC’14]. ? Nonlinear, continuous. This paper: Use ideas from sensor placement; Lyapunov stability (nonlinear, uncertain systems) with high probability.

  9. Region of attraction

  10. Lyapunov functions [A.M. Lyapunov 1966]

  11. What about unknown dynamics? Known systems: [R. Bobiti, M. Lazar, CDC 2016]

  12. Gaussian process models: high-probability confidence intervals; Lipschitz continuous.

  13. What about unknown dynamics? True system is stable within with high probability!

  14. Exploring the safe set

  15. Challenges with Bayesian learning. Exploration (excitation); Stability certificates (robustness). ✓ ✓ Linear systems [L. Jung, SAP’98]; Linear controllers [F. Berkenkamp et al., ECC’15]. ✓ ? Finite domains [R. I. Brafman et al., JMLR’02]; Nonlinear systems [A. K. Akametalu et al., CDC’14]. ? Nonlinear, continuous. This paper: Use ideas from sensor placement; Lyapunov stability (nonlinear, uncertain systems) with high probability.

  16. How to explore? How to actively explore? Do we converge to the maximum safe set? The policy is safe: keeps us in Apply

  17. Theoretical result. Close-to-optimal measurements: [A. Krause, C. Guestrin, UAI’05]. Theorem: Guaranteed to converge to the maximum safe level set up to a certain accuracy after a finite number of data points, without leaving this safe level set, with high probability. Bound depends on: size of the maximum safe level set; information capacity of the Gaussian process model; accuracy.

  18. Inverted pendulum. Maximum torque limited! Safe exploration so that the pendulum doesn’t fall. Controller: LQR with prior mean model. Quadratic Lyapunov function.

  19. Safe learning for an inverted pendulum

  20. Conclusion. Can simultaneously learn system dynamics and give stability guarantees. Lyapunov stability for nonlinear, uncertain systems (with high probability, discretization). Convergence guarantees. There is hope for safe reinforcement learning! Code is open source. Example notebooks. More safe learning at http://berkenkamp.me
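
The certificate idea in slides 10 to 13 (a Lyapunov function checked against Gaussian process confidence intervals on a discretized state space) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and not taken from the talk: a hypothetical one-dimensional discrete-time system with prior model `prior_model(x) = 0.5 x` and unknown error `model_error(x) = 0.1 x^3`, a quadratic Lyapunov function, a hand-written NumPy GP with made-up hyperparameters, a fixed confidence scaling `beta`, an ad hoc `margin` standing in for the discretization slack, and a small neighbourhood of the origin (`eps`) that is skipped. It is not the authors' implementation.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.5):
    """Squared-exponential kernel with unit prior variance (1-D inputs)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise_std=0.01):
    """Posterior mean and standard deviation of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train) + noise_std**2 * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    mean = K_s.T @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def prior_model(x):
    """Known prior model h(x) (hypothetical)."""
    return 0.5 * x

def model_error(x):
    """Model error g(x); unknown to the learner, used only to generate data."""
    return 0.1 * x**3

def lyapunov(x):
    """Quadratic Lyapunov candidate v(x) = x^2."""
    return x**2

def certified_level(grid, mean, std, beta=2.0, margin=1e-3, eps=0.1):
    """Largest c such that every checked grid point with v(x) < c
    satisfies the decrease condition under the confidence bounds."""
    lower = prior_model(grid) + mean - beta * std
    upper = prior_model(grid) + mean + beta * std
    # v is convex, so its maximum over [lower, upper] is at an endpoint.
    v_next_max = np.maximum(lyapunov(lower), lyapunov(upper))
    decreases = v_next_max - lyapunov(grid) < -margin
    # Assumption: points with |x| < eps are not checked.
    violating = (np.abs(grid) >= eps) & ~decreases
    if not violating.any():
        return np.inf
    return lyapunov(grid[violating]).min()

# Data near the origin, then a certificate on a discretized state space.
x_train = np.linspace(-0.5, 0.5, 11)
y_train = model_error(x_train)
grid = np.linspace(-3.0, 3.0, 601)
mean, std = gp_posterior(x_train, y_train, grid)
c = certified_level(grid, mean, std)
print(f"certified level set: v(x) < {c:.3f}")
print("true decrease region of this toy system: v(x) < 5")
```

For this toy system the Lyapunov function truly decreases exactly where x^2 < 5, so the printed level can be compared against that value. The certified level is expected to be smaller, because the confidence intervals widen away from the data.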

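The exploration question in slides 14 to 17 (where to measure next without leaving the safe set) can be sketched as a loop that always samples the most uncertain state inside the currently certified level set. This is a minimal sketch under assumptions that are not from the talk: a hypothetical one-dimensional system with prior model `0.5 x` and unknown error `0.1 x^3`, a quadratic Lyapunov function, a hand-written NumPy GP with made-up constants (`LENGTHSCALE`, `NOISE_STD`, `BETA`), and measurements obtained by evaluating `model_error` directly instead of running a real system. The "most uncertain safe point" rule is a simplified stand-in for the selection rule in the paper, and the sketch carries none of its guarantees.

```python
import numpy as np

# Hypothetical hyperparameters, chosen only for illustration.
LENGTHSCALE, NOISE_STD, BETA = 0.5, 0.01, 2.0

def kernel(a, b):
    """Squared-exponential kernel with unit prior variance (1-D inputs)."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / LENGTHSCALE) ** 2)

def gp_posterior(x_train, y_train, x_test):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = kernel(x_train, x_train) + NOISE_STD**2 * np.eye(len(x_train))
    K_s = kernel(x_train, x_test)
    mean = K_s.T @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def prior_model(x):
    return 0.5 * x

def model_error(x):
    """Stands in for a measurement on the real system (assumption)."""
    return 0.1 * x**3

def lyapunov(x):
    return x**2

def certified_level(grid, mean, std, margin=1e-3, eps=0.1):
    """Smallest Lyapunov value at which the decrease check fails."""
    lower = prior_model(grid) + mean - BETA * std
    upper = prior_model(grid) + mean + BETA * std
    v_next_max = np.maximum(lyapunov(lower), lyapunov(upper))
    decreases = v_next_max - lyapunov(grid) < -margin
    violating = (np.abs(grid) >= eps) & ~decreases
    return lyapunov(grid[violating]).min() if violating.any() else np.inf

def explore(grid, x_init, n_steps):
    """Repeatedly measure the most uncertain state inside the safe set."""
    x_data = np.asarray(x_init, dtype=float)
    history = []  # (certified level, chosen state) for each step
    for _ in range(n_steps):
        mean, std = gp_posterior(x_data, model_error(x_data), grid)
        c = certified_level(grid, mean, std)
        safe = lyapunov(grid) < c
        x_next = grid[safe][np.argmax(std[safe])]
        history.append((c, x_next))
        x_data = np.append(x_data, x_next)
    return x_data, history

grid = np.linspace(-3.0, 3.0, 601)
x_data, history = explore(grid, [-0.2, 0.0, 0.2], n_steps=10)
for step, (c, x_next) in enumerate(history):
    print(f"step {step}: level {c:.3f}, next measurement at x = {x_next:+.2f}")
```

Each chosen state lies inside the level set that was certified at that step, which mirrors the requirement of exploring without leaving the safe set. Whether the certified level grows from step to step depends on the model and hyperparameters; the sketch does not enforce it.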