Safe Learning of Regions of Attraction for Uncertain, Nonlinear Systems with Gaussian Processes. Felix Berkenkamp, Riccardo Moriconi, Angela P. Schoellig, Andreas Krause. @CDC, December 2016.


SLIDE 1

Safe Learning of Regions of Attraction for Uncertain, Nonlinear Systems with Gaussian Processes

Felix Berkenkamp, Riccardo Moriconi, Angela P. Schoellig, Andreas Krause

@CDC, December 2016

SLIDE 2

What is control?

2 Felix Berkenkamp

[Diagram labels: Model, Control theory, Modelling, Implement]

SLIDE 3

One small assumption…

Model

Degraded performance, instability

SLIDE 4

What is control?

[Diagram labels: Model, Modelling, Implement, Control theory]

SLIDE 5

Why is learning not commonly used? Because safety matters!

SLIDE 6

What can go wrong?

[Diagram labels: Model, Modelling, Implement, Control theory]

Excitation? Stability? Feedback

SLIDE 7

Problem definition

[System dynamics equation not recovered; its terms are annotated "Lipschitz continuous" and "Bounded RKHS norm"]

Where is this control policy safe to use?

You can experiment, but no system failures!

Can we learn about the dynamics while remaining stable?

SLIDE 8

Challenges with Bayesian learning

Exploration (excitation)
  • Linear systems
  • Finite domains
  • Nonlinear, continuous
  • Use ideas from sensor placement

Stability certificates (robustness)
  • Linear controllers
  • Nonlinear systems
  • This paper: Lyapunov stability (nonlinear, uncertain systems) with high probability

?

?

[F. Berkenkamp et al., ECC’15] [L. Jung, SAP’98]

[R.I. Brafman et al., JMLR’02] [A.K. Akametalu et al., CDC’14]

SLIDE 9

Region of attraction

SLIDE 10

Lyapunov functions

[A.M. Lyapunov 1966]
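
The equations on this slide were not recovered. As a reminder only, a standard continuous-time statement of Lyapunov's direct method is sketched below. The symbols f, V, c and the continuous-time setting are notation assumed here, not taken from the slides.

```latex
\begin{align*}
% Closed-loop dynamics with an equilibrium at the origin (assumed notation)
\dot{x} &= f(x), \qquad f(0) = 0 \\
% Lyapunov function candidate: zero at the origin, positive elsewhere
V(0) &= 0, \qquad V(x) > 0 \quad \forall x \neq 0 \\
% Level set of V, assumed bounded
\mathcal{V}(c) &= \{\, x \mid V(x) \le c \,\} \\
% Decrease condition: if it holds, V(c) lies inside the region of attraction
\dot{V}(x) &= \frac{\partial V(x)}{\partial x}\, f(x) < 0
  \quad \forall x \in \mathcal{V}(c) \setminus \{0\}
\end{align*}
```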

SLIDE 11

What about unknown dynamics?

known systems: [R. Bobiti, M. Lazar, CDC 2016]

SLIDE 12

Gaussian process models

High-probability confidence intervals

Lipschitz continuous
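
To make the confidence intervals concrete, here is a minimal sketch of a 1-D Gaussian process posterior using only the Python standard library. The squared-exponential kernel, its hyperparameters, the noise variance, and the scaling factor `beta` are all assumptions chosen for illustration, not values from the talk.

```python
import math

def rbf(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel for scalar inputs (assumed kernel choice)."""
    return variance * math.exp(-0.5 * ((a - b) / lengthscale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_confidence(x_train, y_train, x_query, noise_var=1e-4, beta=2.0):
    """Return (lower, mean, upper) = mean -/+ beta * std of the GP posterior."""
    K = [[rbf(a, b) + (noise_var if i == j else 0.0)
          for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    k_star = [rbf(x_query, a) for a in x_train]
    alpha = solve(K, y_train)   # K^{-1} y
    v = solve(K, k_star)        # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_query, x_query) - sum(k * w for k, w in zip(k_star, v))
    std = math.sqrt(max(var, 0.0))  # clip tiny negative values from round-off
    return mean - beta * std, mean, mean + beta * std

# Toy data (assumed): three noisy-free samples of sin(x)
x_train = [-1.0, 0.0, 1.0]
y_train = [math.sin(x) for x in x_train]
near = gp_confidence(x_train, y_train, 0.0)    # narrow interval at a data point
far = gp_confidence(x_train, y_train, 10.0)    # wide interval far from the data
```

The interval is narrow where data has been collected and falls back to the prior width far from it, which is what makes the model usable for cautious decisions.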

SLIDE 13

What about unknown dynamics?

True system is stable within [level set symbol not recovered] with high probability!
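
One way to read this statement: on a grid of states, find the largest Lyapunov level set on which an upper confidence bound on the Lyapunov derivative is still negative. The sketch below illustrates that idea under stated assumptions. The function name `largest_safe_level`, the `margin` parameter (a stand-in for a discretization correction), the toy system, and the constant error bound `err` are all hypothetical and not taken from the slides.

```python
def largest_safe_level(v_values, vdot_upper, margin=0.0):
    """Largest c such that every grid state with V(x) <= c has
    an upper bound on dV/dt below -margin. Returns 0.0 if none qualifies.
    The grid is assumed to exclude the origin, where dV/dt is zero."""
    failing = [v for v, u in zip(v_values, vdot_upper) if u >= -margin]
    if not failing:
        return max(v_values)
    v_fail = min(failing)  # lowest level at which the condition breaks
    safe = [v for v in v_values if v < v_fail]
    return max(safe) if safe else 0.0

# Toy example (assumed): dx/dt = -x + x**3 with V(x) = x**2.
grid = [k / 10 for k in range(-15, 16) if k != 0]  # origin excluded
err = 0.05  # assumed confidence-interval half-width on the dynamics
v_values = [x * x for x in grid]
# Nominal dV/dt plus the worst-case effect of the model error
vdot_upper = [2 * x * (-x + x**3) + 2 * abs(x) * err for x in grid]
c_safe = largest_safe_level(v_values, vdot_upper)
```

In this toy case the nominal system is stable for |x| < 1, and the certified level set stops at the last grid point before that boundary.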

SLIDE 14

Exploring the safe set

SLIDE 15

Challenges with Bayesian learning

Exploration (excitation)
  • Linear systems
  • Finite domains
  • Nonlinear, continuous
  • Use ideas from sensor placement

Stability certificates (robustness)
  • Linear controllers
  • Nonlinear systems
  • This paper: Lyapunov stability (nonlinear, uncertain systems) with high probability

?

?

[F. Berkenkamp et al., ECC’15] [L. Jung, SAP’98]

[R.I. Brafman et al., JMLR’02] [A.K. Akametalu et al., CDC’14]

SLIDE 16

How to explore?

How to actively explore?

Do we converge to the maximum safe set?

The policy is safe: keeps us in [set symbol not recovered]

Apply [symbol not recovered]

SLIDE 17

Theoretical result

Close-to-optimal measurements: [equation not recovered]

Theorem: Guaranteed to converge to the maximum safe level set up to a certain accuracy after a finite number of data points, without leaving this safe level set with high probability.

Bound depends on

  • Size of the maximum safe level set
  • Information capacity of the Gaussian process model
  • Accuracy

[A. Krause, C. Guestrin, UAI’05]
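
The measurement rule itself was not recovered from the slide. A common uncertainty-sampling heuristic in the spirit of sensor placement is to measure where the model is least certain, restricted to the current safe level set. The sketch below shows that heuristic only. The function name `next_measurement` and the toy numbers are hypothetical, and this should not be read as the exact rule behind the theorem.

```python
def next_measurement(states, v_values, stds, c_safe):
    """Pick the state with the largest predictive standard deviation
    among states inside the safe level set {x : V(x) <= c_safe}."""
    candidates = [i for i, v in enumerate(v_values) if v <= c_safe]
    if not candidates:
        raise ValueError("no states inside the safe level set")
    best = max(candidates, key=lambda i: stds[i])
    return states[best]

# Toy numbers (assumed): V(x) = x**2, made-up model uncertainties
states = [-1.0, -0.5, 0.5, 1.0]
v_values = [x * x for x in states]
stds = [0.9, 0.2, 0.4, 0.8]
chosen = next_measurement(states, v_values, stds, c_safe=0.3)
```

The most uncertain states overall (at ±1.0) are skipped because they lie outside the safe level set; the choice falls on the most uncertain state that is still certified safe.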

SLIDE 18

Inverted pendulum

Maximum torque limited!

Safe exploration so that the pendulum doesn’t fall.

Controller: LQR with prior mean model

Quadratic Lyapunov function

SLIDE 19

Safe learning for an inverted pendulum

SLIDE 20

Conclusion

  • Can simultaneously learn system dynamics and give stability guarantees
  • Lyapunov stability for nonlinear, uncertain systems (with high probability, discretization)
  • Convergence guarantees

There is hope for safe reinforcement learning!

Code is open source

Example notebooks

More safe learning at http://berkenkamp.me