SLIDE 1

Safe Reinforcement Learning in Robotics with Bayesian Models

Felix Berkenkamp, Matteo Turchetta, Angela P. Schoellig, Andreas Krause

@Workshop on Reliable AI, October 2017

SLIDE 2

A new era of autonomy

Felix Berkenkamp

Images: Rethink Robotics, Waymo, iRobot

SLIDE 3

Policy

Reinforcement learning

Image: Plainicon, https://flaticon.com

Exploration, Policy update

SLIDE 4

Dangers of autonomous learning

Image: Freepik, https://flaticon.com

Safety despite uncertainty, Safe exploration

SLIDE 5

Policy

Safe reinforcement learning

Image: Plainicon, https://flaticon.com

Exploration, Policy update, Bayesian models for safety, Model-free, Model-based

SLIDE 6

Model-free reinforcement learning

Tracking performance, Safety constraint, Few experiments, Safety for all experiments

SLIDE 7

Gaussian process
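
The slide names the Gaussian process as the Bayesian model. As a rough illustration of what such a model provides, the sketch below computes a GP posterior mean and standard deviation with NumPy and builds confidence bounds from them. The kernel, its hyperparameters, the toy data, and the scaling factor `beta` are assumptions chosen for illustration; none of them come from the talk.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)


def gp_posterior(x_train, y_train, x_test, noise_var=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_test."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    k_ss_diag = np.diag(rbf_kernel(x_test, x_test))
    chol = np.linalg.cholesky(k_tt)  # k_tt = chol @ chol.T
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha
    v = np.linalg.solve(chol, k_ts)
    var = k_ss_diag - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


# Toy data (assumed): three noisy-free samples of sin(x).
x_train = np.array([-2.0, 0.0, 1.5])
y_train = np.sin(x_train)
x_test = np.linspace(-3.0, 3.0, 61)

mean, std = gp_posterior(x_train, y_train, x_test)
beta = 2.0  # assumed scaling of the confidence interval
lower, upper = mean - beta * std, mean + beta * std
```

The interval `[lower, upper]` is narrow near the observed inputs and widens away from them, which is the property a safety-aware learner can exploit: it only trusts the model where data has made it confident.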

SLIDE 8

Constrained Bayesian optimization
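
To make the idea of optimizing performance under a safety constraint concrete, here is a minimal sketch of one selection step over a finite set of candidate controller parameters. It assumes confidence bounds on performance and on the safety measure are already available (for example from a Gaussian process). The function name, the threshold, and the numbers are hypothetical, and this simplified rule is not claimed to be the algorithm presented in the talk.

```python
import numpy as np


def select_safe_candidate(perf_upper, safety_lower, safety_threshold=0.0):
    """Pick the candidate with the best optimistic performance among those
    whose pessimistic safety estimate still satisfies the constraint.

    Returns the candidate index, or None if no candidate is certified safe.
    """
    safe = safety_lower >= safety_threshold
    if not np.any(safe):
        return None
    scores = np.where(safe, perf_upper, -np.inf)
    return int(np.argmax(scores))


# Hypothetical confidence bounds for five candidate parameter settings.
perf_upper = np.array([0.2, 0.9, 0.6, 1.4, 0.5])
safety_lower = np.array([0.3, 0.1, 0.4, -0.2, 0.05])

best = select_safe_candidate(perf_upper, safety_lower)
```

In this example the fourth candidate has the highest optimistic performance, but its safety lower bound is negative, so it is excluded and the second candidate is chosen instead. Evaluating only candidates that pass the pessimistic check is what keeps every experiment safe with high probability, provided the confidence bounds hold.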

SLIDE 9

Video available at http://tiny.cc/icra16_video

SLIDE 10

SLIDE 11

Policy

Safe reinforcement learning

Image: Plainicon, https://flaticon.com

Exploration, Policy update, Bayesian models for safety, Model-free, Model-based

SLIDE 12

Model-based reinforcement learning

Model, Modelling, Implement, Control Theory

SLIDE 13

Policy update

Approximate dynamic programming

Dynamics, Expected cost
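
The formulas on this slide did not survive extraction, so the following is only a generic sketch of approximate dynamic programming, not a reconstruction of the slide. It runs value iteration on a discretized one-dimensional system and interpolates the value function between grid points. The dynamics, stage cost, discount factor, and grids are all assumptions; the dynamics here are known and deterministic, so the expectation in the expected cost reduces to a single term.

```python
import numpy as np

# Assumed discretization of states and actions.
states = np.linspace(-1.0, 1.0, 41)
actions = np.linspace(-1.0, 1.0, 11)
gamma = 0.9  # assumed discount factor


def step(x, u, dt=0.1):
    """Hypothetical deterministic dynamics: a clipped single integrator."""
    return np.clip(x + dt * u, -1.0, 1.0)


def stage_cost(x, u):
    """Hypothetical quadratic cost on state and action."""
    return x ** 2 + 0.1 * u ** 2


values = np.zeros_like(states)
for _ in range(200):
    # q[i, j]: cost of applying action j in state i, then following `values`.
    next_states = step(states[:, None], actions[None, :])
    future = np.interp(next_states, states, values)
    q = stage_cost(states[:, None], actions[None, :]) + gamma * future
    values = q.min(axis=1)

# Greedy policy with respect to the final value estimate.
policy = actions[q.argmin(axis=1)]
```

The resulting policy pushes the state toward the origin, where the cost is zero. With an uncertain dynamics model, the single `step` call would be replaced by an expectation over the model's predictions.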

SLIDE 14

Uncertain dynamics

Dynamics model

Safety-critical

SLIDE 15

Approximate dynamic programming

Dynamics

SLIDE 16

Policy

Reinforcement learning

Image: Plainicon, https://flaticon.com

Exploration, Policy update, Safe exploration, Safe policy update

SLIDE 17

Region of attraction

SLIDE 18

Lyapunov functions

[A.M. Lyapunov 1892]
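
As a small illustration of how a Lyapunov function can certify a region of attraction, the sketch below checks the decrease condition V(f(x)) < V(x) on a grid of states and returns the largest level set on which it holds. The closed-loop dynamics, the candidate function V(x) = x², and the grid are assumptions chosen so the answer is easy to verify by hand. Unlike the setting of the talk, the dynamics are treated as known, and the check ignores model uncertainty and the gaps between grid points.

```python
import numpy as np


def dynamics(x, dt=0.1):
    """Hypothetical closed-loop system: Euler step of dx/dt = -x + x**3."""
    return x + dt * (-x + x ** 3)


def lyapunov(x):
    """Candidate Lyapunov function V(x) = x**2."""
    return x ** 2


def largest_safe_level(states, tol=1e-9):
    """Largest c such that V strictly decreases at every nonzero grid
    state with V(x) < c."""
    v_now = lyapunov(states)
    v_next = lyapunov(dynamics(states))
    nonzero = v_now > tol  # the equilibrium itself is exempt
    violates = nonzero & (v_next - v_now >= -tol)
    if not np.any(violates):
        return float(np.max(v_now))
    return float(np.min(v_now[violates]))


states = np.linspace(-2.0, 2.0, 401)
c_level = largest_safe_level(states)
region = states[lyapunov(states) < c_level]
```

For this system one step maps x to x(0.9 + 0.1x²), so V decreases exactly when |x| < 1, and the grid check recovers a level close to 1. States inside the level set stay inside it and converge to the origin, which is why such a set can serve as a safe region for learning.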

SLIDE 19

Safe policy optimization (NIPS 2017)

Optimize policy for performance, Determine safe region, Policy update

SLIDE 20

Policy optimization

Policy

SLIDE 21

Policy optimization

Need to explore!

SLIDE 22

Obtaining data

SLIDE 23

Experimental results

SLIDE 24

Policy performance

SLIDE 25

Conclusion

Safe reinforcement learning!

Can use statistical models to give high-probability safety guarantees

Theoretical guarantees in the paper

Code at github.com/befelix

More safe learning at http://berkenkamp.me