
Robot coaching of manipulation tasks using haptics and vision: PhD thesis proposal



  1. Robot coaching of manipulation tasks using haptics and vision. PhD thesis proposal. Presented by: Leonel D. Rozo C. Advisors: Carme Torras, Pablo Jiménez. Barcelona, Spain, September 29th, 2008.

  2. Outline: 1. Objectives; 2. State of the art; 3. Expected contributions; 4. Work planning; 5. Resources; 6. Conclusions.

  3. Objectives. Main objective: to provide robots with manipulation skills acquired from demonstrated examples given by a human who acts as a coach.

  4. Objectives. Specific objectives. To analyze (and adapt) different learning algorithms based on robot learning by demonstration, with the aim of finding those that best suit the features of the manipulation task: incremental learning, fast learning, robust learning. To identify the relevant features of manipulation tasks from sensory information, with the aim of including them as inputs to the learning stage: what to imitate?

  5. Objectives. To develop a set-up where robot learning of manipulation tasks by demonstration will take place. It will be composed of a robot (the learner) teleoperated through a haptic device driven by a human user (the coach). To fuse haptic and visual information in order to improve and speed up the learning stage.

  6. State of the art. Introduction. Why should robots learn? Two main approaches exist for endowing robots with learning capabilities: self-learning and learning from examples.

  7. State of the art. LbD: history and concepts. Learning by demonstration (LbD): symbolic approaches. Exact reproduction of the demonstrated task (playback) (A. Billard et al., 2008). State-action-state representation; an unsuitable approach when uncertainty appears. If-then rules.

  8. State of the art. LbD: history and concepts. Machine learning inclusion in programming by demonstration. Supervised methods: a training dataset composed of labelled inputs and desired outputs is given. Goal: given a new input, predict its corresponding output. Some methods are artificial neural networks, decision trees, Bayesian statistics, Gaussian process regression, nearest neighbour, and support vector machines. Unsupervised methods: an input dataset is presented but no feedback about it is given. Goal: find a representation of particular input patterns that reflects the statistical structure of the overall collection of input patterns.
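
The supervised goal stated above (given a new input, predict its output) can be illustrated with nearest neighbour, the simplest of the listed methods. This is only a minimal sketch: the sensor readings, the action labels, and the function name are hypothetical and do not come from the presentation.

```python
import math

def nearest_neighbour_predict(train_inputs, train_outputs, query):
    """Return the output paired with the training input closest to `query`."""
    best_index = min(
        range(len(train_inputs)),
        key=lambda i: math.dist(train_inputs[i], query),
    )
    return train_outputs[best_index]

# Hypothetical labelled dataset: 2-D sensor reading -> action label.
inputs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
outputs = ["hold", "push", "lift"]

print(nearest_neighbour_predict(inputs, outputs, (0.9, 0.1)))  # prints "push"
```

An unsupervised method would receive only `inputs`, without `outputs`, and would have to find structure in them on its own.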

  9. State of the art. LbD: history and concepts. Imitation learning. What is imitation? Biological inspiration: from an act witnessed, learn to do an act (Thorndike). Robotics: imitation takes place when an agent learns a behaviour from observing the execution of that behaviour by a teacher (Bakker and Kuniyoshi, 1996). Current challenges (P. Bakker & Y. Kuniyoshi, 1996).

  10. State of the art. LbD: history and concepts. Movement primitives (MP). Inductive approach: MPs are sequences of actions that accomplish a complete goal-directed behaviour and allow a compact state-action representation (Schaal, 1999).

  11. State of the art. LbD: history and concepts. Movement primitives (MP). Biological inspiration: a behaviour-based control approach (Mataric). How to interpret and understand observed behaviours? How to integrate the perception and motion control system to reconstruct what was observed? (Computational Neuroscience and Humanoid Robotics Department, ATR laboratories). To use a control system that is based on a set of behaviours (MPs), which are real-time processes that take inputs from sensors or other behaviours and send output commands to effectors or other system behaviours.

  12. State of the art. LbD: history and concepts. Control policies. The motor control problem can be conceived as finding a task-specific control policy, that is, a mapping from states to motor commands (diagram labels: motor commands, policy, states, algorithm parameters). Imitation learning can be defined as the problem of how control policies can be learned by observing a demonstration: imitation by direct policy learning, imitation by learning policies from demonstrated trajectories, and imitation by model-based policy learning.
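
As a rough illustration of imitation by direct policy learning, the sketch below fits a policy directly to demonstrated (state, motor command) pairs. It assumes a one-dimensional state and a linear policy fitted by least squares; both choices, as well as the names and the data, are assumptions made for the example and are not taken from the presentation.

```python
def fit_linear_policy(states, commands):
    """Least-squares fit of u = gain * x + offset from demonstrated (x, u) pairs."""
    n = len(states)
    mean_x = sum(states) / n
    mean_u = sum(commands) / n
    cov_xu = sum((x - mean_x) * (u - mean_u) for x, u in zip(states, commands))
    var_x = sum((x - mean_x) ** 2 for x in states)
    gain = cov_xu / var_x
    offset = mean_u - gain * mean_x
    return gain, offset

def policy(state, params):
    """Evaluate the learned policy: motor command for a given state."""
    gain, offset = params
    return gain * state + offset

# Hypothetical demonstration: the coach's commands follow u = -0.5 * x + 1.
demo_states = [0.0, 1.0, 2.0, 3.0]
demo_commands = [1.0, 0.5, 0.0, -0.5]

params = fit_linear_policy(demo_states, demo_commands)
print(policy(4.0, params))  # command for a state that was never demonstrated
```

Here the pair `(gain, offset)` plays the role of the policy parameters: the demonstration determines them, and the robot then applies the policy to new states.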

  13. State of the art. LbD: history and concepts. What to imitate? Learning invariances over demonstrations. Finding those features of the task that are relevant to the reproduction: those that appear most repeatedly in different demonstrations of the task, i.e., the invariants in time (Billard et al., 2004). Diagram labels: observation process, imitation task, execution process. Categorization of the human actions (Dillmann, 2004): performative, commenting, commanding.
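
One crude way to picture "invariants over demonstrations" is to flag the features whose values barely change from one demonstration to the next. The sketch below does this with a plain variance threshold. The feature names, the values, and the threshold are hypothetical, and a real system would need to normalise features with different units before comparing their variances.

```python
from statistics import pvariance

def invariant_features(demonstrations, threshold):
    """Names of features whose variance across demonstrations is below threshold."""
    names = demonstrations[0].keys()
    return sorted(
        name
        for name in names
        if pvariance([demo[name] for demo in demonstrations]) < threshold
    )

# Hypothetical summaries of three demonstrations of the same task.
demos = [
    {"grasp_height": 0.10, "approach_angle": 0.2, "final_force": 5.0},
    {"grasp_height": 0.11, "approach_angle": 1.1, "final_force": 5.1},
    {"grasp_height": 0.10, "approach_angle": -0.7, "final_force": 4.9},
]

print(invariant_features(demos, threshold=0.05))
```

In this toy data the approach angle varies widely between demonstrations, so it is treated as irrelevant to the reproduction, while grasp height and final force are kept.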

  14. State of the art. LbD: history and concepts. Improving imitation learning. A task learned from imitation can be improved, corrected or refined in two ways. By using reinforcement learning (RL): the given demonstrations restrict the search in the state-action space to a smaller subspace, which means RL is focused on the areas indicated by the demonstration data. This approach is based on a self-improvement process, where the robot improves the learned skill by interacting with its environment (A. Billard et al., 2008).

  15. State of the art. LbD: history and concepts. By using active teaching: the action learned from imitation is corrected or refined through the teacher's support. Two cases are contrasted: the information goes from the teacher to the robot, or the information flow is bi-directional because a social activity is being carried out. S. Calinon and A. Billard. What is the teacher's role in robot programming by demonstration? Toward benchmarks for improved learning. Interaction Studies, 8(3):441-464, 2007.

  16. State of the art. LbD: history and concepts. Incremental learning. Whenever new data are generated (new demonstrations, corrections, refinements), these should be included in the learning framework. It is necessary to work with learning algorithms that meet at least the following requirements: online learning, inexpensive computations, robustness against the interference problem, and fast learning in high-dimensional state-action spaces.

  17. State of the art. LbD: history and concepts. Locally weighted learning (LWL). LWL methods approximate nonlinear functions by means of piecewise linear models. Memory-based methods: locally weighted regression (LWR) and locally weighted partial least squares (LWPLS).
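
A minimal sketch of memory-based locally weighted regression for a one-dimensional input follows. It keeps all training data, and for each query it fits a straight line by weighted least squares, with weights that decay with distance from the query. The Gaussian kernel, the `bandwidth` parameter, and the sine test data are assumptions chosen for the example.

```python
import math

def lwr_predict(xs, ys, query, bandwidth):
    """Fit a weighted straight line around `query` and evaluate it there."""
    # Gaussian kernel: points near the query dominate the local fit.
    weights = [math.exp(-0.5 * ((x - query) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    var_x = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    cov_xy = sum(
        w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys)
    )
    slope = cov_xy / var_x if var_x > 0.0 else 0.0
    return mean_y + slope * (query - mean_x)

# Nonlinear target approximated by local linear fits.
xs = [i * 0.1 for i in range(64)]
ys = [math.sin(x) for x in xs]
print(lwr_predict(xs, ys, query=1.0, bandwidth=0.2), math.sin(1.0))
```

Evaluating `lwr_predict` over a grid of queries yields the piecewise linear approximation described above. The bandwidth is one of the open parameters that has to be tuned by hand.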

  18. State of the art. LbD: history and concepts. Non-memory-based methods: receptive field weighted regression (RFWR) and locally weighted projection regression (LWPR) (S. Vijayakumar & S. Schaal, 2000; S. Schaal & C. Atkeson, 1998). LWPR is an incremental learning algorithm that is able to deal with high-dimensional data streams. In addition, it is computationally cheap and numerically robust. Shortcoming: too many open parameters have to be tuned manually.
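
The non-memory-based idea can be sketched with local linear models that absorb each sample into running sums and then discard it. This is a deliberately simplified illustration and not RFWR or LWPR themselves: the receptive field centres and widths are fixed by hand here, whereas those algorithms adapt them (and LWPR additionally projects high-dimensional inputs). All names and values are assumptions.

```python
import math

class ReceptiveField:
    """Local linear model updated from a data stream without storing samples."""

    def __init__(self, centre, width):
        self.centre = centre
        self.width = width
        # Running weighted sums (sufficient statistics of the local fit).
        self.sw = self.swx = self.swy = self.swxx = self.swxy = 0.0

    def activation(self, x):
        return math.exp(-0.5 * ((x - self.centre) / self.width) ** 2)

    def update(self, x, y):
        w = self.activation(x)
        self.sw += w
        self.swx += w * x
        self.swy += w * y
        self.swxx += w * x * x
        self.swxy += w * x * y

    def predict(self, x):
        if self.sw == 0.0:
            return 0.0  # no data seen yet
        mean_x = self.swx / self.sw
        mean_y = self.swy / self.sw
        var_x = self.swxx / self.sw - mean_x ** 2
        cov_xy = self.swxy / self.sw - mean_x * mean_y
        slope = cov_xy / var_x if var_x > 1e-12 else 0.0
        return mean_y + slope * (x - mean_x)

def blended_prediction(fields, x):
    """Activation-weighted average of the local model predictions."""
    activations = [field.activation(x) for field in fields]
    weighted = sum(a * field.predict(x) for a, field in zip(activations, fields))
    return weighted / sum(activations)

fields = [ReceptiveField(centre, 0.5) for centre in (0.0, 1.0, 2.0, 3.0)]
for i in range(31):  # samples arrive one at a time
    x = i * 0.1
    for field in fields:
        field.update(x, math.sin(x))
print(blended_prediction(fields, 1.5), math.sin(1.5))
```

Even in this reduced form, the number of fields, their centres, and their widths are open parameters, which hints at the tuning burden named as the shortcoming.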

  19. State of the art. LbD: history and concepts. LWL-based Bayesian learning. These methods deal with the problem of manually tuning the open parameters in LWL algorithms. Bayesian locally weighted regression (BLWR): it treats all open parameters probabilistically and learns the appropriate local regime for each linearization problem, based on the LWR algorithm approach; it is a Bayesian formulation of spatially local adaptive kernels for LWR. Randomly varying coefficient (RVC): a probabilistic method based on the paradigm of Bayesian probabilistic online learning; it treats each open parameter in LWPR as a probability distribution. Gaussian processes. Incremental Gaussian mixture model (GMM), direct update method: it is based on the temporal coherence properties of data streams; the data are assumed to vary smoothly in time, so that the GMM parameters can be adjusted when new data are observed, reformulating the problem for a generic observation of multiple datapoints. Incremental GMM, generative method: it uses Expectation-Maximization performed on data generated by Gaussian mixture regression (GMR). Sparse online Gaussian processes (SOGP).
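
To make "adjusting the parameters when new data are observed" concrete, the sketch below updates the mean and variance of a single one-dimensional Gaussian one sample at a time, without keeping past data. It uses Welford's running-statistics update as a stand-in; it is not the direct update method or the generative method named above, which operate on full mixtures.

```python
class RunningGaussian:
    """Mean and variance of a 1-D Gaussian, updated one sample at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.sum_sq_dev = 0.0  # sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.sum_sq_dev += delta * (x - self.mean)

    @property
    def variance(self):
        return self.sum_sq_dev / self.count if self.count else 0.0

gaussian = RunningGaussian()
for sample in [1.0, 2.0, 3.0, 4.0]:  # samples from a data stream
    gaussian.update(sample)
print(gaussian.mean, gaussian.variance)  # prints 2.5 1.25
```

A mixture version would apply a similar update to every component, weighted by that component's responsibility for the new sample.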

  20. State of the art. LbD: history and concepts. Coaching. It can be divided into two processes: imitation learning (observation, execution) and active teaching (observation and evaluation, corrections and refinements) (A. Billard et al., 2008). Coaching makes it possible to acquire new knowledge, to focus attention on relevant task features, to give a strategy for correction, and to help iteratively define the characteristics of a successful outcome.

  21. State of the art. LbD: entire systems. Systems based on vision. Application areas: manipulation tasks, playing air hockey, gestures, human motion. Techniques: optimization criteria, Bayesian methods, Gaussian processes, HMM, PCA.

  22. State of the art. LbD: entire systems. Learning basketball officials' signals. Motion sensors. Preprocessing stage using PCA. Actions are encoded probabilistically using GMM. GMR is applied to reconstruct a general form of the signals. S. Calinon and A. Billard. Incremental learning of gestures by imitation in a humanoid robot. 2007.
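
The last step of that pipeline, Gaussian mixture regression, can be sketched for the simplest case of a scalar input (time) and a scalar output (one signal dimension). The mixture parameters below are hypothetical, hand-written values; in the cited system they would come from a GMM fitted to the PCA-reduced demonstrations.

```python
import math

def gaussian_pdf(value, mean, variance):
    """Density of a 1-D Gaussian at `value`."""
    return math.exp(-0.5 * (value - mean) ** 2 / variance) / math.sqrt(
        2.0 * math.pi * variance
    )

def gmr(t, components):
    """Expected output at time t under a GMM over (t, x) pairs.

    Each component is a dict with keys: prior, mean_t, mean_x, var_t, cov_tx.
    """
    # Responsibility of each component for the query time.
    resp = [c["prior"] * gaussian_pdf(t, c["mean_t"], c["var_t"]) for c in components]
    total = sum(resp)
    # Blend the per-component conditional means E[x | t].
    return sum(
        (r / total) * (c["mean_x"] + c["cov_tx"] / c["var_t"] * (t - c["mean_t"]))
        for r, c in zip(resp, components)
    )

# Hypothetical two-component model of a gesture signal.
model = [
    {"prior": 0.5, "mean_t": 0.0, "mean_x": 0.0, "var_t": 1.0, "cov_tx": 0.0},
    {"prior": 0.5, "mean_t": 2.0, "mean_x": 4.0, "var_t": 1.0, "cov_tx": 0.0},
]
print([round(gmr(t, model), 3) for t in (0.0, 1.0, 2.0)])
```

Sweeping `t` over the duration of the gesture produces a smooth generalised trajectory. Queries far from every component can make `total` underflow to zero, which a fuller implementation would need to guard against.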

  23. State of the art. LbD: entire systems. Systems based on haptics. Application areas: virtual environments, assembly tasks. Techniques: neural networks, optimization criteria, fuzzy logic, HMM, LWR.
