Safe model-based learning for robot control: Breaking your robot is only fun in simulation. Felix Berkenkamp, Andreas Krause, Angela P. Schoellig. @LCCC Workshop on Learning and Adaptation for Sensorimotor Control, Lund University, October 2018

The Promise of Robotics = Physical Interaction. Virtual world of data & information. Angela Schoellig 2

The Promise of Robotics = Physical Interaction. Virtual world of data & information. Real world: exponential increase in complexity! Angela Schoellig 3

The Real World Is Complex | Robots Today… and Tomorrow. Today, dedicated environments: manually programmed; based on a-priori knowledge; robots are limited by our understanding of the system/environment. Tomorrow, human-centered environments: unknown, unpredictable and changing; need safe and high-performance behavior; robots must safely learn and adapt. Angela Schoellig 4

Characteristics of Robot Learning. Robots are feedback systems. Strict safety requirements. Resource constraints (data, payload, communication). [Figure: agent-environment loop with action, state and reward, from Reinforcement Learning: An Introduction, R. Sutton, A.G. Barto, 1998.] Results to date have been limited to learning single tasks, and demonstrated in simulation or lab settings. NEXT CHALLENGE: realistic application scenarios (safety, data efficiency, online learning). Angela Schoellig 5

Work at the Dynamic Systems Lab (Prof. Schoellig). Approach: control theory = science of feedback (stability, performance, robustness), combined with machine learning. Research characteristics: algorithms that run on real robots in a closed-loop system. • Data efficiency • Online adaptation and learning • Safety guarantees during learning. Angela Schoellig 6

Performance and Safety: Fast Swarm Flight Angela Schoellig 7

Safety: Off-Road Driving Angela Schoellig 8

Prerequisites for safe reinforcement learning. (1) Understand model and learning dynamics. (2) Define safety, analyze a model for safety. (3) Algorithm to safely acquire data. Together: Safe Model-based Reinforcement Learning. Felix Berkenkamp 9

Overview. (1) Understand model and learning dynamics. (2) Define safety, analyze a model for safety. (3) Algorithm to safely acquire data. Together: Safe Model-based Reinforcement Learning. Felix Berkenkamp 10

Learning a model of the dynamics. Model error must decrease with measurements. Need to quantify model error. Felix Berkenkamp 11

Gaussian process Felix Berkenkamp 12

A Bayesian dynamics model. References: On Kernelized Multi-armed Bandits, S.R. Chowdhury, A. Gopalan, ICML 2017; Online Learning of Linearly Parameterized Control Problems, Y. Abbasi-Yadkori, PhD thesis 2012. Felix Berkenkamp 19
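The two requirements on the model (error that shrinks with measurements, and a way to quantify it) can be illustrated with a minimal Gaussian process regression sketch. Everything concrete here is an assumption made for illustration and does not come from the slides: the one-dimensional `true_dynamics`, the squared-exponential kernel, its hyperparameters, and the noise level.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise_std=0.05):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + noise_std**2 * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(x_test, x_test)) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def true_dynamics(x):
    # Hypothetical unknown dynamics, used only to generate measurements.
    return np.sin(x)

x_test = np.linspace(-3.0, 3.0, 50)
x_few = np.array([-2.0, 0.5])          # two measurements
x_many = np.linspace(-3.0, 3.0, 15)    # fifteen measurements

mean_few, std_few = gp_posterior(x_few, true_dynamics(x_few), x_test)
mean_many, std_many = gp_posterior(x_many, true_dynamics(x_many), x_test)

# The posterior standard deviation quantifies the remaining model error
# and shrinks as measurements are added.
print("max std with 2 points: ", std_few.max())
print("max std with 15 points:", std_many.max())
```

A confidence interval of the form mean ± β·std (with a scaling factor β chosen by the analysis) is the kind of error bound that the safety arguments later in the talk rely on.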

Samples from the Gaussian process prior. [Figure: sampled trajectories, state over time.] The transition dynamics are correlated! Felix Berkenkamp 20
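A rough sketch of what sampling trajectories from a GP prior can look like: one function is drawn from the prior and then reused at every time step, which is why transitions along a trajectory are correlated rather than independently perturbed. The nominal model `0.9 * x`, the kernel parameters, and the grid-plus-interpolation shortcut are assumptions for illustration only.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

rng = np.random.default_rng(1)
grid = np.linspace(-2.0, 2.0, 81)

# Draw three functions g ~ GP(0, k) on the grid via a Cholesky factor.
K = rbf_kernel(grid, grid, lengthscale=1.0, variance=0.05)
L = np.linalg.cholesky(K + 1e-6 * np.eye(len(grid)))  # jitter for stability
samples = (L @ rng.standard_normal((len(grid), 3))).T

def rollout(g_values, x0=1.0, steps=20):
    """Simulate x_next = 0.9 * x + g(x) with ONE fixed sampled function g."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        g = np.interp(x, grid, g_values)  # same sample reused at every step
        xs.append(float(np.clip(0.9 * x + g, grid[0], grid[-1])))
    return np.array(xs)

trajectories = np.array([rollout(g) for g in samples])
print(trajectories.shape)  # one row per sampled dynamics function
```

Drawing fresh, independent noise at every step instead would ignore that the unknown dynamics are a single fixed function.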

Safety definition: robust, control-invariant. [Figure labels: prior knowledge, unsafe.] Felix Berkenkamp 24

Safety for learned models. Dynamics + Policy: Stability? Felix Berkenkamp 25

Lyapunov functions [A.M. Lyapunov 1892] Felix Berkenkamp 26

Lyapunov functions Felix Berkenkamp 27
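As a small worked example of the Lyapunov idea, the sketch below builds a quadratic function V(x) = xᵀPx for a stable linear closed-loop system and checks that V decreases along every sampled transition. The matrices `A` and `Q` are hypothetical values chosen for illustration; the talk itself deals with learned, nonlinear dynamics.

```python
import numpy as np

# Hypothetical stable closed-loop system x_next = A @ x.
A = np.array([[0.9, 0.2],
              [0.0, 0.8]])
Q = np.eye(2)

def lyapunov_matrix(A, Q, iterations=500):
    """P = sum_k (A^T)^k Q A^k, which satisfies A^T P A - P = -Q for stable A."""
    P = np.zeros_like(Q)
    term = Q.copy()
    for _ in range(iterations):
        P += term
        term = A.T @ term @ A
    return P

P = lyapunov_matrix(A, Q)

def V(x):
    """Quadratic Lyapunov candidate."""
    return x @ P @ x

# Decrease condition: V(x_next) < V(x) for every nonzero state.
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(1000, 2))
all_decrease = all(V(A @ x) < V(x) for x in states)
print("V decreases on all sampled states:", all_decrease)
```

Because V strictly decreases along trajectories, every sublevel set of V is invariant, which is the property the safety analysis builds on.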

Region of attraction. Safe Model-based Reinforcement Learning with Stability Guarantees, F. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017. Initial safe policy. Theorem (informally): Under suitable conditions, we can identify a (near-)maximal subset of X on which π is stable, while never leaving the safe set. Felix Berkenkamp 28
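One common way to estimate a region of attraction is to find the largest level set of a Lyapunov function inside which the decrease condition holds. The sketch below does this on a grid for a hypothetical, fully known scalar system. It is a simplification: the cited paper works with confidence bounds from a learned model and accounts for discretization error, neither of which is done here.

```python
import numpy as np

DT = 0.1

def closed_loop(x):
    # Hypothetical closed-loop dynamics under a fixed policy;
    # the origin is stable only for |x| < 1.
    return x + DT * (-x + x**3)

def V(x):
    """Lyapunov candidate."""
    return x**2

grid = np.linspace(-2.0, 2.0, 4001)
at_origin = np.isclose(grid, 0.0)

# Decrease condition V(f(x)) < V(x), checked at every grid point.
decreasing = (V(closed_loop(grid)) < V(grid)) | at_origin

# Largest level c such that the condition holds wherever V(x) < c.
c_max = V(grid[~decreasing]).min()
roa_estimate = grid[V(grid) < c_max]

print("level set bound c_max:", c_max)
print("estimated region of attraction:", roa_estimate.min(), roa_estimate.max())
```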

Illustration of safe learning. Need to safely explore! [Figure: policy.] Safe Model-based Reinforcement Learning with Stability Guarantees, F. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017. Felix Berkenkamp 29

Illustration of safe learning. [Figure: policy.] Safe Model-based Reinforcement Learning with Stability Guarantees, F. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017. Felix Berkenkamp 30

Lyapunov function. Finding the right Lyapunov function is difficult! Weights: positive-definite. Nonlinearities: trivial nullspace. [Figure: decision boundary.] The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamic Systems, S.M. Richards, F. Berkenkamp, A. Krause, CoRL 2018. Felix Berkenkamp 31
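The two structural ingredients named on the slide (weights built from a positive-definite block, nonlinearities with a trivial nullspace) can be sketched as follows. This is one possible construction written from those two properties, not the authors' code: the layer sizes, the constant `EPS`, the random untrained weights, and the choice of `tanh` are all assumptions. The point is only that V(x) = φ(x)ᵀφ(x) is then zero at the origin and positive elsewhere by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
EPS = 1e-3

def make_layer(n_in, n_extra):
    """Weight matrix with a positive-definite top block, so W @ x = 0 only for x = 0."""
    G1 = rng.standard_normal((n_in, n_in))
    G2 = rng.standard_normal((n_extra, n_in))
    return np.vstack([G1.T @ G1 + EPS * np.eye(n_in), G2])

# Two layers: 2 -> 5 -> 8 features (sizes are arbitrary).
layers = [make_layer(2, 3), make_layer(5, 3)]

def V(x):
    """Candidate Lyapunov function V(x) = phi(x)^T phi(x)."""
    phi = np.asarray(x, dtype=float)
    for W in layers:
        phi = np.tanh(W @ phi)  # tanh(z) = 0 only at z = 0
    return float(phi @ phi)

test_states = rng.standard_normal((100, 2))
print("V(0) =", V(np.zeros(2)))
print("min V on random states:", min(V(x) for x in test_states))
```

Training would then adjust the free parameters so that the level sets of V match the shape of the true region of attraction; that step is omitted here.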

Model predictive control. Makes decisions based on predictions about the future. Includes input / state constraints. Felix Berkenkamp 33
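A minimal receding-horizon sketch of both properties: at each step, candidate input sequences are simulated over a short horizon, sequences that violate the state constraint are discarded, and only the first input of the cheapest remaining sequence is applied. The double-integrator model, the cost weights, the finite input set, and the brute-force enumeration are assumptions chosen to keep the example short; practical MPC uses a numerical optimizer.

```python
import itertools
import numpy as np

# Hypothetical double integrator: state = (position, velocity).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([0.005, 0.1])

INPUTS = (-1.0, 0.0, 1.0)  # input constraint: finite set of admissible inputs
V_MAX = 0.5                # state constraint: |velocity| <= V_MAX
HORIZON = 5

def mpc_step(x):
    """Return the first input of the cheapest feasible predicted sequence."""
    best_cost, best_u = np.inf, 0.0
    for seq in itertools.product(INPUTS, repeat=HORIZON):
        x_pred, cost, feasible = x.copy(), 0.0, True
        for u in seq:
            x_pred = A @ x_pred + B * u
            if abs(x_pred[1]) > V_MAX:  # predicted constraint violation
                feasible = False
                break
            cost += x_pred[0] ** 2 + 0.1 * x_pred[1] ** 2 + 0.01 * u ** 2
        if feasible and cost < best_cost:
            best_cost, best_u = cost, seq[0]
    return best_u

x = np.array([1.0, 0.0])
max_speed = 0.0
for _ in range(60):
    u = mpc_step(x)        # re-plan at every step
    x = A @ x + B * u      # apply only the first planned input
    max_speed = max(max_speed, abs(x[1]))

print("final state:", x, "max |velocity|:", max_speed)
```

Here the prediction model equals the true system, which is exactly the assumption that breaks down when the dynamics are unknown.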

Model predictive control on a robot. Video at https://youtu.be/3xRNmNv5Efk. Robust constrained learning-based NMPC enabling reliable mobile robot path tracking, C.J. Ostafew, A.P. Schoellig, T.D. Barfoot, IJRR, 2016. Felix Berkenkamp 34

Model predictive control Problem: True dynamics are unknown! Felix Berkenkamp 35

Forward-propagating uncertainty. Outer approximation contains true dynamics for all time steps with probability at least [bound lost in extraction]. Learning-based Model Predictive Control for Safe Exploration, T. Koller, F. Berkenkamp, M. Turchetta, A. Krause, CDC, 2018. Felix Berkenkamp 36
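The idea of an outer approximation can be sketched with simple intervals: at each step the set of possible states is pushed through the nominal model and then inflated by a bound on the model error, so the true trajectory stays inside. This sketch uses intervals and a made-up error bound for a scalar system; it is not the construction from the cited paper, and `error_bound` merely stands in for a confidence bound obtained from a learned model.

```python
import numpy as np

A_NOM, B_NOM = 0.9, 0.5  # hypothetical nominal model: x_next = A_NOM*x + B_NOM*u

def error_bound(lo, hi):
    """Hypothetical worst-case model error over the interval [lo, hi]."""
    return 0.02 + 0.05 * max(abs(lo), abs(hi))

def propagate(lo, hi, u):
    """Outer-approximate the next state set (valid because A_NOM > 0)."""
    e = error_bound(lo, hi)
    return A_NOM * lo + B_NOM * u - e, A_NOM * hi + B_NOM * u + e

def true_step(x, u):
    # "Unknown" true dynamics; the mismatch never exceeds error_bound.
    return A_NOM * x + B_NOM * u + 0.02 * np.sin(3.0 * x)

inputs = [0.4, 0.2, -0.1, -0.3, 0.0]
lo, hi = -0.05, 0.05  # initial state uncertainty
x = 0.03              # true initial state
intervals, states = [(lo, hi)], [x]
for u in inputs:
    lo, hi = propagate(lo, hi, u)
    x = true_step(x, u)
    intervals.append((lo, hi))
    states.append(x)

contained = all(l <= s <= h for (l, h), s in zip(intervals, states))
print("true trajectory inside the outer approximation:", contained)
```

The price of this guarantee is conservatism: the sets typically grow along the horizon, which limits how far ahead the planner can usefully predict.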

Safe model-based learning framework. [Figure: exploration trajectory and safety trajectory sharing the same first step; unsafe region.] Theorem (informally): Under suitable conditions, we can always guarantee that we are able to return to the safe set. Felix Berkenkamp 37
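The "same first step" idea can be sketched as a filter on the exploration input: an input is applied only if a safety trajectory starting with that same input is guaranteed to end inside the safe set under worst-case disturbances. The scalar system, the disturbance bound, the backup controller, and the greedy one-step exploration objective below are all hypothetical simplifications, not the method from the talk.

```python
import numpy as np

W_MAX = 0.05   # bound on the unknown disturbance
U_MAX = 0.3    # input constraint
SAFE = 0.5     # safe set: |x| <= SAFE
HORIZON = 5    # length of the safety trajectory, including the first step

def backup_input(x):
    """Simple controller that steers back toward the origin."""
    return float(np.clip(-x, -U_MAX, U_MAX))

def can_return(x, u_first):
    """Worst-case check: does first input + backup plan end in the safe set?"""
    lo, hi = x + u_first - W_MAX, x + u_first + W_MAX
    for _ in range(HORIZON - 1):
        u = backup_input(0.5 * (lo + hi))
        lo, hi = lo + u - W_MAX, hi + u + W_MAX
    return -SAFE <= lo and hi <= SAFE

def choose_input(x):
    """Most exploratory input whose safety trajectory still returns."""
    candidates = list(np.linspace(-U_MAX, U_MAX, 13)) + [backup_input(x)]
    feasible = [u for u in candidates if can_return(x, u)]
    if not feasible:
        return backup_input(x)
    return max(feasible)  # exploration objective: move as far right as allowed

rng = np.random.default_rng(0)
x, xs = 0.0, [0.0]
for _ in range(30):
    u = choose_input(x)
    x = x + u + rng.uniform(-W_MAX, W_MAX)  # true system with disturbance
    xs.append(x)

print("largest state reached:", max(xs))
```

In this toy setting the state leaves the safe set to explore, but never goes further than the backup plan can recover from within the horizon.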

Safe model-based learning framework. [Figure: exploration trajectory and safety trajectory sharing the same first step; unsafe region.] Exploration limited by size of the safe set! Felix Berkenkamp 38

How should we collect data for a control task? Felix Berkenkamp 39

Optimizing expected performance. We design our cost functions to be helpful for optimization. Exploration objective: [formula lost in extraction]. [Figure captions: driving too fast; slow down for safety; faster driving after learning.] Felix Berkenkamp 40

Example. Video at https://youtu.be/3xRNmNv5Efk. Robust constrained learning-based NMPC enabling reliable mobile robot path tracking, C.J. Ostafew, A.P. Schoellig, T.D. Barfoot, IJRR, 2016. Felix Berkenkamp 41

Summary and Outlook. (1) Understand model and learning dynamics: Gaussian processes. (2) Define safety, analyze a model for safety: Lyapunov stability. (3) Algorithm to safely acquire data: model predictive control. Together: Safe Model-based Reinforcement Learning. https://berkenkamp.me www.dynsyslab.org Felix Berkenkamp 42

Thanks To… My Team – Industrial Partners – Funding Agencies. www.dynsyslab.org. My outstanding collaborators at U of T (Tim Barfoot) and ETH (Andreas Krause, Raffaello D’Andrea and the whole FMA team). Angela Schoellig 43
