Knowledge Transfer Between Robots with Similar Dynamics for - PowerPoint PPT Presentation

Knowledge Transfer Between Robots with Similar Dynamics for High-Accuracy Impromptu Trajectory Tracking
European Control Conference, June 26, 2019
SiQi Zhou (1), Andriy Sarabakha (2), Erdal Kayacan (3), Mohamed K. Helwa (1), and Angela P. Schoellig (1)
(1) Dynamic Systems Lab, University of Toronto Institute for Aerospace Studies
(2) School of Mechanical and Aerospace Engineering, Nanyang Technological University
(3) Department of Engineering, Aarhus University

Introduction: Designing control systems for high-accuracy tracking can be challenging
[Figure: baseline closed-loop system (baseline controller and plant) mapping the desired output to the actual output. Labels: nonlinearities, unmodeled effects, tracking error between the desired trajectory and the actual trajectory.]

Introduction: Neural networks as add-on blocks to enhance ‘black-box’ systems
[Figure: a DNN offline learning module (source system inverse) placed ahead of the baseline closed-loop system. Its inputs are the desired output and the state; its output is the DNN reference sent to the baseline controller and plant. Labels: nonlinearities, unmodeled effects, tracking error between the desired trajectory and the actual trajectory.]

Introduction: Neural networks as add-on blocks to enhance ‘black-box’ systems (video slide)

Note: If the video on the previous slide does not play, the full version can be viewed here: https://youtu.be/C_teLkJDq3Y

Introduction: Neural networks as add-on blocks to enhance ‘black-box’ systems
Result: an average 62% error reduction over 30 test trajectories.
[Figure: DNN offline learning module (source system inverse) ahead of the baseline closed-loop system; histogram of count versus % RMS error reduction; desired and actual trajectories with tracking error.]
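
The "% RMS error reduction" reported above can be computed per trajectory as sketched below. This is only an illustrative sketch: the function names and the sample values are made up here and are not data from the talk.

```python
import math

def rms_error(desired, actual):
    """Root-mean-square tracking error between two equal-length trajectories."""
    if len(desired) != len(actual):
        raise ValueError("trajectories must have the same length")
    squared = sum((d - a) ** 2 for d, a in zip(desired, actual))
    return math.sqrt(squared / len(desired))

def percent_rms_reduction(desired, baseline, enhanced):
    """Percentage by which the enhanced run reduces the baseline RMS error."""
    e_base = rms_error(desired, baseline)
    e_enh = rms_error(desired, enhanced)
    return 100.0 * (e_base - e_enh) / e_base

# Made-up samples for illustration only (not measurements from the talk).
desired = [0.0, 1.0, 2.0, 3.0]
baseline = [0.0, 0.5, 1.5, 2.5]   # lags the desired trajectory
enhanced = [0.0, 0.9, 1.9, 2.9]   # smaller lag
reduction = percent_rms_reduction(desired, baseline, enhanced)
print(f"RMS error reduction: {reduction:.1f}%")
```

Averaging this quantity over a set of test trajectories gives a single summary figure of the kind quoted on the slide.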

Research Question: What if we have a team of robots with different dynamics?

Research Question: Implication of similarity?
[Figure: a source robot and several target robots, with question marks on the links between them.]

Related Literature: Transfer experience to accelerate learning on new tasks or for new robots
Knowledge transfer: leverage existing data or learned experience to accelerate or improve subsequent learning.
[Diagram: knowledge transfer (robotics) divides into cross-task transfer and cross-robot transfer.]

Related Literature: Approaches for transferring data across robots
Knowledge transfer: leverage existing data or learned experience to accelerate or improve subsequent learning.
Cross-robot transfer approaches:
• Alignment-based: map from source to target (e.g., [Bócsi et al., 2013; Helwa & Schoellig, 2017])
• Invariant feature learning: exploiting a common feature space (e.g., [Gupta et al., 2017; Daftry et al., 2016])

Related Literature: Approaches for transferring data across robots (alignment-based)
[Figure: alignment-based transfer maps source data to target data (e.g., [Bócsi et al., 2013; Helwa & Schoellig, 2017]).]

Related Literature: Approaches for transferring data across robots (invariant feature learning)
[Figure, from [Gupta et al., 2017]: encoder and decoder pairs for the source state and the target state, exploiting a common feature space (e.g., [Gupta et al., 2017; Daftry et al., 2016]).]

Related Literature: Maximizing learning efficiency on physical robots shares a broader interest
Knowledge transfer: leverage existing data or learned experience to accelerate or improve subsequent learning.
Related interests:
• Sim-to-Real (e.g., [Marco et al., 2017])
• Meta-Learning (e.g., [Finn et al., 2017])
• Modularity (e.g., [Devin et al., 2017])
• …
Cross-robot transfer approaches:
• Alignment-based: map from source to target (e.g., [Bócsi et al., 2013; Helwa & Schoellig, 2017])
• Invariant feature learning: exploiting a common feature space (e.g., [Gupta et al., 2017; Daftry et al., 2016])

Contributions
1. Impromptu knowledge transfer (i.e., without additional a-priori data collection on the robots)
2. Stability analysis of the transfer-enhanced system and its connection to system similarity (linear case)
3. Verification of the knowledge transfer approach with quadrotor impromptu tracking experiments
[Figure: source and target robots.]

Theoretical Results: Problem definition
Setup: Consider closed-loop source and target systems represented in the following form [equations not recoverable from the slide text].
Assumption: The source and the target systems a) are minimum phase, and b) have well-defined and equal relative degrees.
Goal: To enhance the target baseline system with a minimal amount of data (re)collection and training.
[Figure: target baseline closed-loop system (baseline controller and plant) mapping the system reference to the actual output.]

Theoretical Results: Leveraging the DNN inverse module from the source system
Offline learning module: approximates the inverse of the source robot system [CDC 17]. The inverse is approximated by a DNN when the underlying system functions are unknown.
How to leverage the source DNN model?
• Update source DNN
• Online correction learning
[Figure: DNN offline learning module (source system inverse) ahead of the target baseline closed-loop system. Its inputs are the desired output and the state; its output is the DNN reference sent to the baseline controller and plant.]
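
To illustrate what "system inverse" means here, the sketch below uses a toy scalar closed-loop system x(k+1) = a·x(k) + b·u(k), y(k) = x(k), whose inverse can be written analytically. The system, its parameters, and the function names are assumptions made for illustration; in the talk the inverse is learned by a DNN and not derived from a known model.

```python
def inverse_reference(y_des_next, x, a, b):
    """Reference u(k) that makes y(k+1) equal y_des(k+1)
    for the assumed system x(k+1) = a*x(k) + b*u(k), y = x."""
    return (y_des_next - a * x) / b

def simulate(y_des, a, b, use_inverse):
    """Run the assumed scalar closed-loop system and return its outputs."""
    x, outputs = 0.0, [0.0]
    for k in range(len(y_des) - 1):
        if use_inverse:
            u = inverse_reference(y_des[k + 1], x, a, b)
        else:
            u = y_des[k + 1]          # baseline: feed the desired output directly
        x = a * x + b * u
        outputs.append(x)
    return outputs

a, b = 0.8, 0.2                      # hypothetical closed-loop parameters
y_des = [0.0, 0.1, 0.3, 0.6, 1.0]    # made-up desired trajectory
y_baseline = simulate(y_des, a, b, use_inverse=False)
y_inverse = simulate(y_des, a, b, use_inverse=True)
print("baseline:", y_baseline)
print("inverse :", y_inverse)
```

With the inverse in place the output matches the desired trajectory one step ahead, while the baseline run lags behind. Applying the same source inverse to a target system with different a and b would leave a residual error, which is the gap the online correction is meant to address.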

Theoretical Results: Using online learning to adapt to the differences
Online learning module for reference adjustments (inverse correction).
Ideal expressions for exact tracking [expressions not recoverable from the slide text].
Error prediction: based on the predicted output of the target system when a given reference is sent to the system.
[Figure: the DNN offline learning module (source system inverse) produces the DNN reference from the desired output and the state; the online learning module (inverse correction), with a reference adaptation gain and an error prediction block, adjusts it into the system reference for the target baseline closed-loop system.]

Theoretical Results: Using online learning to adapt to the differences
Online learning module for reference adjustments (inverse correction).
Ideal expressions for exact tracking [expressions not recoverable from the slide text].
Online training dataset [contents not recoverable from the slide text].
Online learning of error predictor (based on latest observations).
[Figure: the DNN offline learning module (source system inverse) produces the DNN reference from the desired output and the state; the online learning module (inverse correction), with a reference adaptation gain and an error prediction block, adjusts it into the system reference for the target baseline closed-loop system.]
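
A minimal sketch of learning an error predictor from the latest observations follows. It substitutes a scalar least-squares fit over a sliding window for whatever regression model the authors actually use, and the class name, the feature `z`, the window length, and the sign convention of the correction are all assumptions made for illustration.

```python
from collections import deque

class SlidingWindowErrorPredictor:
    """Fits error = w * z by least squares over the most recent observations."""

    def __init__(self, window=20):
        # A bounded deque drops the oldest pair once the window is full.
        self.data = deque(maxlen=window)

    def observe(self, z, error):
        """Store one (feature, observed output error) pair."""
        self.data.append((z, error))

    def predict(self, z):
        """Predict the error for feature z from the current window."""
        szz = sum(zi * zi for zi, _ in self.data)
        if szz == 0.0:
            return 0.0                # no informative data yet
        w = sum(zi * ei for zi, ei in self.data) / szz
        return w * z

def adjusted_reference(u_offline, predicted_error, gain):
    """Offline reference plus a correction proportional to the predicted error
    (sign convention assumed here, with error taken as desired minus actual)."""
    return u_offline + gain * predicted_error

predictor = SlidingWindowErrorPredictor(window=5)
for z in (1.0, 2.0, 3.0):
    predictor.observe(z, 0.5 * z)     # made-up observations with error = 0.5 * z
prediction = predictor.predict(4.0)
u_sys = adjusted_reference(1.0, prediction, gain=0.5)
print(prediction, u_sys)
```

Because only the last `window` pairs are kept, the fit follows the most recent behavior of the system and forgets older data.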

Theoretical Results: Characterizing similarity between the source and the target systems
Linear case: input-output equation [equation not recoverable from the slide text], whose coefficients are the state-to-output gain and the input-to-output gain.
Similarity characterization: stated in terms of the target and source gains [expressions not recoverable from the slide text].
[Figure: target baseline closed-loop system with the DNN offline learning module (source system inverse) and the online learning module (inverse correction).]
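
For a discrete-time linear system x(k+1) = A x(k) + B u(k), y(k) = C x(k) with relative degree r, the standard input-output relation is y(k+r) = C A^r x(k) + C A^(r-1) B u(k). The sketch below assumes the slide's state-to-output and input-to-output gains are these two coefficients, which the slide text does not confirm. The matrices are made-up examples, and comparing the gains by simple differences is only one possible way to look at similarity, not the measure from the talk.

```python
import numpy as np

def relative_degree(A, B, C, tol=1e-12):
    """Smallest r >= 1 such that C A^(r-1) B is nonzero (single input, single output)."""
    M = C.copy()
    for r in range(1, A.shape[0] + 1):
        if abs((M @ B).item()) > tol:
            return r
        M = M @ A
    raise ValueError("no well-defined relative degree up to the system order")

def io_gains(A, B, C):
    """Return (r, state-to-output gain C A^r, input-to-output gain C A^(r-1) B)."""
    r = relative_degree(A, B, C)
    state_gain = C @ np.linalg.matrix_power(A, r)
    input_gain = (C @ np.linalg.matrix_power(A, r - 1) @ B).item()
    return r, state_gain, input_gain

C = np.array([[1.0, 0.0]])
# Hypothetical source and target closed-loop models (illustration only).
A_s, B_s = np.array([[1.0, 0.1], [-0.5, 0.7]]), np.array([[0.0], [0.5]])
A_t, B_t = np.array([[1.0, 0.1], [-0.4, 0.6]]), np.array([[0.0], [0.4]])

r_s, state_s, input_s = io_gains(A_s, B_s, C)
r_t, state_t, input_t = io_gains(A_t, B_t, C)
print("relative degrees:", r_s, r_t)
print("state-to-output gain difference:", state_t - state_s)
print("input-to-output gain ratio:", input_t / input_s)
```

Both example systems have relative degree 2, matching the assumption that the source and target share the same relative degree.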

Theoretical Results: Higher similarity leads to higher tolerances for learning error
Assumptions:
1. Input-to-state stable
2. Offline module corresponds to the source inverse
3. Error of the online learning module is bounded [bound not recoverable from the slide text]
Stability of the overall learning-enhanced target system: conditions are given for two cases, one of which corresponds to the online learning module not being active [expressions not recoverable from the slide text].
[Figure: target baseline closed-loop system with the DNN offline learning module (source system inverse) and the online learning module (inverse correction).]

Experiments: We test our online learning approach on arbitrary hand drawings
[Figure: samples of arbitrary hand-drawn test trajectories.]

Experiments: We test our online learning approach on arbitrary hand drawings
With offline transfer alone: 38% error reduction
With online transfer: 67% error reduction
Plot annotation: compensates for slow response.
[Figure: transfer from source to target; trajectories in the x and z directions and path in the x-z plane, comparing desired, baseline, w/ DNN, and w/ online.]

Experiments: We can effectively reduce the amount of data required for training robots
With offline transfer alone: 46% error reduction
With online transfer: 74% error reduction (comparable to fully-trained DNNs)
[Figure: transfer from the source robot to target robots.]
