Safe Learning-Based Control using Gaussian Processes (Prof. Angela Schoellig)



slide-1
SLIDE 1

Safe Learning-Based Control using Gaussian Processes

IFAC World Congress 2020 – Learning for Control Tutorial

  • Prof. Angela Schoellig
slide-2
SLIDE 2

The Future of Automation

Angela Schoellig 2

Large prior uncertainties. Active decision making. Expect safe and high-performance behavior.

slide-3
SLIDE 3

Robots in My Lab

Model uncertainties that limit performance:

  • Unknown terrain and topography
  • Unknown aerodynamic effects
  • Unknown weather conditions
  • Interaction with unknown objects

slide-4
SLIDE 4

Learning from data can improve performance.

[Block diagram. Labels: Desired Output; Iteratively Learned Reference; Ref. Signal; Baseline Closed-Loop System (Baseline Controller, System, System State); Actual Output.]

[Plot. Labels: Desired trajectory; Output for different trials; Repetitive error; Reference input with earlier and larger amplitude; Input.]
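The idea of iteratively learning a reference signal for a fixed baseline closed loop can be illustrated with a small iterative learning control sketch. Everything in it is an assumption made for illustration and is not taken from the slides: the baseline closed loop is modeled as a first-order lag `y[t+1] = a*y[t] + b*r[t]`, the update law is the simple rule "add the one-step-ahead tracking error to the reference", and the names `simulate_trial` and `learning_gain` are hypothetical.

```python
import numpy as np

def simulate_trial(reference, a=0.8, b=0.2):
    """Hypothetical baseline closed loop: a first-order lag driven by the reference."""
    y = np.zeros(len(reference) + 1)
    for t in range(len(reference)):
        y[t + 1] = a * y[t] + b * reference[t]
    return y

N = 100
t = np.arange(N + 1)
desired = np.sin(2 * np.pi * t / N)   # desired output, starts at zero

reference = desired[:-1].copy()       # first trial: feed the desired output directly
learning_gain = 1.0                   # assumed gain, chosen so the error contracts
error_norms = []

for trial in range(50):
    output = simulate_trial(reference)
    error = desired - output          # repetitive tracking error of this trial
    error_norms.append(np.linalg.norm(error))
    # Correct the reference at time t with the error observed at time t + 1,
    # because r[t] first influences the output one step later.
    reference = reference + learning_gain * error[1:]

print(f"error norm, first trial: {error_norms[0]:.3f}")
print(f"error norm, last trial:  {error_norms[-1]:.2e}")
```

For this particular lag model the learned reference ends up leading the desired trajectory with a slightly larger amplitude, which loosely mirrors the "earlier and larger amplitude" annotation in the plot. The correction at time t uses the error at time t + 1, so it is acausal within a trial and only possible because the task is repeated.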

slide-5
SLIDE 5

Learning from data can improve performance.

[Block diagram as on Slide 4, with a video at 2x speed. Labels: Reference input with earlier and larger amplitude; Input.]

slide-6
SLIDE 6

Learned Triple Flip [ICRA’10] https://youtu.be/bWExDW9J9sA

slide-7
SLIDE 7

Learning from data can improve performance.

Learning a single task through repetition [ECC’09, IROS’12, AURO’12]

[Block diagram. Labels: Desired Output; Iteratively Learned Reference; Ref. Signal; Baseline Closed-Loop System (Baseline Controller, System, System State); Actual Output.]

Offline learning of inverse model [ICRA’16, CDC’17, RAL’18, ECC’19]

[Block diagram. Labels: Desired Output; Deep Neural Network; Offline Learning; Baseline Closed-Loop System (Baseline Controller, System, System State); Actual Output.]

slide-8
SLIDE 8

Mobile Manipulator Control [IROS’20] http://tiny.cc/ball_catch


slide-9
SLIDE 9

Learning from data can improve performance.

  • Input-output stability if baseline system is stable
  • Acausal corrections possible
  • Baseline controller required
  • Training phase
  • State constraints not considered

Learning a single task through repetition [ECC’09, IROS’12, AURO’12]

Offline learning of inverse model [ICRA’16, CDC’17, RAL’18, ECC’19]

[Block diagrams as on Slide 7.]

slide-10
SLIDE 10

Problem Statement

Design a controller for systems with prior uncertainty that learns online and continuously improves performance while satisfying safety constraints.

Considered system dynamics: [equation not recoverable from the slide text]

Key features:

  • Nonparametric model
  • Improved performance with more data

Compare to (simplified view): [equation not recoverable from the slide text] with a-priori given sets

  • Robust control: finds controller that achieves stability and performance for all possible [symbol missing]
  • Adaptive control: estimates [symbol missing] and uses estimate in controller

slide-11
SLIDE 11

Approach

  • Nonparametric model for unknown model error
  • Defining and analyzing closed-loop safety
  • Algorithm to safely acquire data and optimize task

  • Gaussian processes: reliable confidence intervals
  • Robust control: stability & performance under uncertainty
  • Lyapunov analysis: stability of learned models

= safe model-based reinforcement learning

[Block diagram. Labels: Desired Output; Robust Controller; System; System State; Stochastic Disturbance Model; Actual Output.]

slide-13
SLIDE 13

Gaussian Process

Theorem (informally): The function is contained in the scaled Gaussian process confidence intervals with probability at least 1 − δ.

Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design

  • N. Srinivas, A. Krause, S. Kakade, M. Seeger, ICML 2010
slide-14
SLIDE 14

Gaussian Process

  • Can model arbitrary smooth functions.
  • For a given input, it provides an interval in which the function value lies with high probability.
  • As more data is gathered, the uncertainty is reduced.

Our model framework for developing reinforcement learning algorithms with safety guarantees.
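The two properties listed above (an interval around the prediction, and uncertainty that shrinks with more data) can be illustrated with a minimal Gaussian process regression sketch in NumPy. The squared-exponential kernel, its hyperparameters, the test function, and the scaling factor `beta = 2.0` are all assumptions made for this example; in particular, `beta` is a placeholder and not the scaling prescribed by the theorem of Srinivas et al.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2, variance=1.0):
    """Squared-exponential kernel between two sets of scalar inputs."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise_std=0.05):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + noise_std**2 * np.eye(len(x_train))
    K_star = rbf_kernel(x_train, x_test)
    chol = np.linalg.cholesky(K)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = K_star.T @ alpha
    v = np.linalg.solve(chol, K_star)
    var = np.diag(rbf_kernel(x_test, x_test)) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def confidence_interval(mean, std, beta=2.0):
    """Scaled interval mean +/- beta * std (beta is an assumed placeholder)."""
    return mean - beta * std, mean + beta * std

def unknown_function(x):
    """Stand-in for the unknown model error; purely illustrative."""
    return np.sin(2 * np.pi * x)

rng = np.random.default_rng(0)
x_test = np.linspace(0.0, 1.0, 50)

# A small data set, and a larger one that contains the small one.
x_few = np.array([0.1, 0.5, 0.9])
x_many = np.concatenate([x_few, np.linspace(0.0, 1.0, 12)])
y_few = unknown_function(x_few) + 0.05 * rng.normal(size=x_few.shape)
y_many = unknown_function(x_many) + 0.05 * rng.normal(size=x_many.shape)

mean_few, std_few = gp_posterior(x_few, y_few, x_test)
mean_many, std_many = gp_posterior(x_many, y_many, x_test)
lower, upper = confidence_interval(mean_many, std_many)

print(f"mean posterior std with  3 points: {std_few.mean():.3f}")
print(f"mean posterior std with 15 points: {std_many.mean():.3f}")
```

Because the larger data set contains the smaller one, the posterior standard deviation cannot increase at any test input, which is the "more data, less uncertainty" behavior in its simplest form.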

slide-15
SLIDE 15

Approach

  • Nonparametric model for unknown model error
  • Defining and analyzing closed-loop safety
  • Algorithm to safely acquire data and optimize task

  • Gaussian processes: reliable confidence intervals
  • Robust control: stability & performance under uncertainty
  • Lyapunov analysis: stability of learned models

= safe model-based reinforcement learning

Robust control:

  1. Linear
  2. Nonlinear
  3. Nonlinear, predictive

slide-16
SLIDE 16

Linear Robust Control [ECC’15]

  • Gaussian Process Model
  • Linear Robust Control
      • Task: stabilization of an operating point
      • Linear robust control: linearization about operating point
  • Local Stability Guarantees
      • Local asymptotic stability around true operating point with high probability
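The step "linearization about operating point" and the notion of one controller working for every model in an uncertainty set can be sketched as follows. This is not the method of [ECC’15]: the inverted-pendulum model, the uncertain pendulum length, the hand-picked gains, and the helper names `linearize` and `stabilizes_all` are assumptions for illustration, and checking a finite grid of models is only a sampled check, not a guarantee.

```python
import numpy as np

def pendulum(x, u, length=1.0, damping=0.1, gravity=9.81):
    """Hypothetical inverted pendulum; the upright position is theta = 0."""
    theta, omega = x
    return np.array([omega, (gravity / length) * np.sin(theta) - damping * omega + u])

def linearize(f, x0, u0, eps=1e-6):
    """Central finite-difference Jacobians A = df/dx and B = df/du at (x0, u0)."""
    n = len(x0)
    A = np.zeros((n, n))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        A[:, i] = (f(x0 + dx, u0) - f(x0 - dx, u0)) / (2 * eps)
    B = ((f(x0, u0 + eps) - f(x0, u0 - eps)) / (2 * eps)).reshape(n, 1)
    return A, B

def stabilizes_all(K, lengths, x0=np.zeros(2), u0=0.0):
    """True if u = -K x stabilizes the linearization for every sampled length."""
    for length in lengths:
        A, B = linearize(lambda x, u: pendulum(x, u, length=length), x0, u0)
        if not np.all(np.linalg.eigvals(A - B @ K).real < 0):
            return False
    return True

# Assumed uncertainty set: the pendulum length is only known to lie in [0.5, 1.5].
lengths = np.linspace(0.5, 1.5, 11)

K_robust = np.array([[30.0, 8.0]])   # hand-picked gain, large enough for all lengths
K_weak = np.array([[12.0, 8.0]])     # hand-picked gain that fails for short pendulums

A_nominal, B_nominal = linearize(pendulum, np.zeros(2), 0.0)
print("nominal A:\n", np.round(A_nominal, 3))
print("robust gain stabilizes all sampled models:", stabilizes_all(K_robust, lengths))
print("weak gain stabilizes all sampled models:  ", stabilizes_all(K_weak, lengths))
```

In the approach on the slide, the set of plausible models would come from the Gaussian process confidence intervals rather than from a fixed parameter range, and it would shrink as data is collected.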

slide-17
SLIDE 17

Linear Robust Control [ECC’15] https://youtu.be/YqhLnCm0KXY


slide-20
SLIDE 20

Nonlinear Robust Control for Differentially Flat Systems [L-CSS’20]

  • Model / Assumptions
      • Differentially flat, control-affine real dynamics and prior model
      • Gaussian Process models inverse nonlinear mismatch
  • Linear Robust Control
      • Task: high-performance tracking
      • Linear robust control for feedback-linearized system
  • Global Tracking Guarantees
      • Tracking error is uniformly ultimately bounded with high probability

[Block diagram. Labels: Differentially Flat System (Linear Dynamics, Nonlinear Term); Nominal LQR; Nominal Feedback Linearization; Desired Input to Linear; Actual Input to Linear; Nonlinear Mismatch; Inverse Nonlinear Mismatch; Gaussian Process; Bound; Robustness Term.]
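To see why a mismatch between the real dynamics and the prior model leads to a bounded rather than vanishing tracking error, the following sketch simulates nominal feedback linearization on a simple pendulum-like system. It is an illustration under stated assumptions, not the controller of [L-CSS’20]: the dynamics `f_true`, the prior model `f_nominal`, and the hand-picked gains `k1`, `k2` (standing in for the nominal LQR) are hypothetical, and the Gaussian process correction and robustness term from the block diagram are deliberately left out.

```python
import numpy as np

def f_true(x):
    """Assumed real dynamics term (unknown to the controller)."""
    return -9.81 * np.sin(x[0]) - 0.1 * x[1]

def f_nominal(x):
    """Assumed prior model: gravity term off by 20 %, no damping."""
    return -0.8 * 9.81 * np.sin(x[0])

def simulate(k1=25.0, k2=10.0, dt=1e-3, t_end=10.0):
    """Track r(t) = sin(t) with nominal feedback linearization; return the error history."""
    steps = int(t_end / dt)
    x = np.array([0.5, 0.0])                  # state: position and velocity
    errors = np.zeros(steps)
    for i in range(steps):
        t = i * dt
        r, r_dot, r_ddot = np.sin(t), np.cos(t), -np.sin(t)
        e, e_dot = x[0] - r, x[1] - r_dot
        v = r_ddot - k1 * e - k2 * e_dot      # desired input to the linear system
        u = v - f_nominal(x)                  # cancel the nonlinearity using the prior model
        # Explicit Euler step of the real system x1' = x2, x2' = f_true(x) + u.
        x = x + dt * np.array([x[1], f_true(x) + u])
        errors[i] = e
    return errors

errors = simulate()
tail_max = np.abs(errors[-2000:]).max()       # largest error over the last 2 seconds
print(f"largest tracking error over the last 2 s: {tail_max:.3f}")
```

The error settles into a small band whose size is set by the residual mismatch `f_true - f_nominal` and the feedback gains. Learning the mismatch and adding a robustness term, as in the block diagram, aims at shrinking that band as data accumulates.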

slide-21
SLIDE 21

Nonlinear Robust Control for Differentially Flat Systems [L-CSS’20]

Cart-pendulum example with model parameter uncertainties:

Robust, online learning control with global guarantees on tracking error.

  • Predictive capabilities
  • State constraints

slide-22
SLIDE 22

Robust Predictive Control [IJRR’16, JFR’16]

  • Gaussian Process Model
  • Nonlinear, Robust Model Predictive Control
      • Task: high-performance tracking
      • Approximations in prediction and nonlinear optimization step
  • Guarantees [e.g., Tomlin’13, Krause’18, Zeilinger’18]
      • Robustly asymptotically stable
      • Robust constraint satisfaction
      • Recursively guaranteeing the existence of safe control actions

Unscented Transform for prediction
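The unscented transform mentioned above propagates a mean and covariance through a nonlinear function using a small set of deterministically chosen sigma points. The sketch below is a basic textbook variant with a single spread parameter `kappa`; the unicycle step used as the nonlinear function and all numerical values are assumptions for illustration and are not taken from [IJRR’16, JFR’16].

```python
import numpy as np

def unscented_transform(mean, cov, f, kappa=1.0):
    """Approximate the mean and covariance of f(x) for x with the given mean and covariance."""
    n = len(mean)
    root = np.linalg.cholesky((n + kappa) * cov)   # columns span the sigma-point offsets
    sigma_points = [mean]
    sigma_points += [mean + root[:, i] for i in range(n)]
    sigma_points += [mean - root[:, i] for i in range(n)]
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    outputs = np.array([f(s) for s in sigma_points])
    out_mean = weights @ outputs
    diff = outputs - out_mean
    out_cov = (weights[:, None] * diff).T @ diff
    return out_mean, out_cov

def unicycle_step(state, speed=1.0, turn_rate=0.2, dt=0.1):
    """Hypothetical one-step prediction of a mobile robot with state [x, y, heading]."""
    x, y, heading = state
    return np.array([
        x + dt * speed * np.cos(heading),
        y + dt * speed * np.sin(heading),
        heading + dt * turn_rate,
    ])

state_mean = np.array([0.0, 0.0, 0.1])
state_cov = np.diag([0.01, 0.01, 0.05])

pred_mean, pred_cov = unscented_transform(state_mean, state_cov, unicycle_step)
print("predicted mean:", np.round(pred_mean, 4))
print("predicted std: ", np.round(np.sqrt(np.diag(pred_cov)), 4))
```

For a linear function the transform reproduces the exact mean and covariance, which is a convenient sanity check. In a predictive controller the step would be applied repeatedly along the horizon, with the learned model supplying part of the dynamics.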

slide-23
SLIDE 23

Robust Predictive Control [IJRR’16, JFR’16]

Example: Mobile robot path following

  • Problem setup: [equation not recoverable from the slide text]
  • Learning: [equation not recoverable from the slide text]

[Figure. Labels: Driving too fast; Slow down for safety; Faster driving after learning.]

slide-24
SLIDE 24

Robust Predictive Control [IJRR’16, JFR’16] https://youtu.be/3xRNmNv5Efk


slide-25
SLIDE 25

Summary

  • Nonparametric model for unknown model error
  • Defining and analyzing closed-loop safety
  • Algorithm to safely acquire data and optimize task

  • Gaussian processes: reliable confidence intervals
  • Robust control: stability & performance under uncertainty
  • Lyapunov stability: stability of learned models

Robust control:

  1. Linear: local stability guarantees
  2. Nonlinear: global tracking error guarantees
  3. Nonlinear, predictive: probabilistic constraint satisfaction and stability

Design a controller for systems with prior uncertainty that learns online and continuously improves performance while satisfying safety constraints.

slide-26
SLIDE 26

Acknowledgements

www.dynsyslab.org

Senior collaborators: Andreas Krause, Tim Barfoot, Raffaello D’Andrea

slide-27
SLIDE 27

Other Learning Control Results from My Lab

  • Systems with changing dynamics [ICRA’17, IROS’18, RAL’18, JACSP’19, RAL’19]
  • Transfer learning between similar systems (similarity metric from robust control) [IROS’17, ICRA’17, RAL’18, ACSP’18]
  • Collaborative learning of interconnected systems [AURO’19]
  • Active learning [ICRA’16, NeurIPS’17, CDC’19]

  • M. Paton, “Expanding the Limits of Vision-Based Autonomous Path Following,” 2017.