Deep Robotic Learning
Sergey Levine, UC Berkeley / Google Brain
(PowerPoint presentation)



SLIDE 1

Deep Robotic Learning

Sergey Levine

UC Berkeley Google Brain

SLIDE 2
SLIDE 3

robotic control pipeline

observations → state estimation (e.g. vision) → modeling & prediction → planning → low-level control → controls

SLIDE 4

standard computer vision: features (e.g. HOG) → mid-level features (e.g. DPM) → classifier (e.g. SVM)

deep learning: end-to-end learned layers replace the hand-designed stages

Felzenszwalb ‘08

robotic control pipeline

observations → state estimation (e.g. vision) → modeling & prediction → planning → low-level control → controls

deep robotic learning

observations → state estimation (e.g. vision) → modeling & prediction → planning → low-level control → controls

end-to-end training across the entire pipeline
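The staged pipeline and end-to-end training contrasted above can be sketched in code; this is a toy illustration, and every function, layer size, and constant here is hypothetical rather than taken from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

def modular_policy(image):
    """Staged pipeline: each box is designed and tuned separately."""
    state = image.mean(axis=(0, 1))        # state estimation (e.g. vision)
    prediction = 0.9 * state               # modeling & prediction
    plan = prediction - 0.1                # planning
    return np.clip(plan, -1.0, 1.0)        # low-level control -> controls

class EndToEndPolicy:
    """One network from raw observations to controls; every weight is
    trained against the final task objective instead of a per-stage one."""
    def __init__(self, obs_dim, act_dim, hidden=32):
        self.w1 = rng.normal(0.0, 0.1, (obs_dim, hidden))
        self.w2 = rng.normal(0.0, 0.1, (hidden, act_dim))

    def __call__(self, obs):
        return np.tanh(np.tanh(obs @ self.w1) @ self.w2)

image = rng.random((8, 8, 3))              # stand-in camera observation
pipeline_controls = modular_policy(image)
e2e_controls = EndToEndPolicy(obs_dim=image.size, act_dim=3)(image.ravel())
```

The point of the sketch is structural: the pipeline exposes hand-designed interfaces between stages, while the end-to-end policy exposes none, so training can shape its internal representation for the task.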

SLIDE 5

no direct supervision; actions have consequences

SLIDE 6

1. Does end-to-end learning produce better sensorimotor skills?
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
3. Can we scale up deep robotic learning and produce skills that generalize?
4. How can we learn safely and efficiently in safety-critical domains?
5. Can we transfer skills from simulation to the real world, and from one robot to another?

SLIDE 7

1. Does end-to-end learning produce better sensorimotor skills?
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
3. Can we scale up deep robotic learning and produce skills that generalize?
4. How can we learn safely and efficiently in safety-critical domains?
5. Can we transfer skills from simulation to the real world, and from one robot to another?

SLIDE 8

Chelsea Finn

SLIDE 9

pose prediction (trained on pose only): 0% success rate
end-to-end training: 96.3% success rate

L.*, Finn*, Darrell, Abbeel, ‘16

SLIDE 10

1. Does end-to-end learning produce better sensorimotor skills?
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
3. Can we scale up deep robotic learning and produce skills that generalize?
4. How can we learn safely and efficiently in safety-critical domains?
5. Can we transfer skills from simulation to the real world, and from one robot to another?

SLIDE 11

Deep Robotic Learning Applications

  • manipulation, locomotion (with N. Wagener, P. Abbeel; V. Kumar, A. Gupta, E. Todorov; V. Koltun)
  • aerial vehicles (with G. Kahn, T. Zhang, P. Abbeel)
  • tensegrity robot (with X. Geng, M. Zhang, J. Bruce, K. Caluwaerts, M. Vespignani, V. SunSpiral, P. Abbeel)
  • dexterous hands, soft hands (with C. Eppner, A. Gupta, P. Abbeel)

SLIDE 12

1. Does end-to-end learning produce better sensorimotor skills?
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
3. Can we scale up deep robotic learning and produce skills that generalize?
4. How can we learn safely and efficiently in safety-critical domains?
5. Can we transfer skills from simulation to the real world, and from one robot to another?

SLIDE 13

ingredients for success in learning:

supervised learning:      computation ✓   algorithms ✓   data ✓
learning robotic skills:  computation ✓   algorithms ~   data ?

SLIDE 14

Grasping with Learned Hand-Eye Coordination

setup: monocular RGB camera, 7 DoF arm, 2-finger gripper, object, bin

  • monocular camera (no depth)
  • no camera calibration either
  • 2-5 Hz update
  • continuous arm control
  • servo the gripper to target
  • fix mistakes
  • no prior knowledge

L., Pastor, Krizhevsky, Quillen ‘16

Peter Pastor Alex Krizhevsky Deirdre Quillen
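The continuous servoing idea above (score candidate gripper motions with a learned network, pick the best, re-observe, repeat at 2-5 Hz) can be sketched roughly as follows; the scoring function and the cross-entropy-style search here are simplified, invented stand-ins, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def grasp_score(image_features, motion):
    """Hypothetical stand-in for the learned network that scores a
    candidate gripper motion given the current monocular image."""
    target = image_features[:3]            # pretend the features encode the object
    return -np.linalg.norm(motion - target)

def servo_step(image_features, n_samples=64, n_elite=6, n_iters=3):
    """One servoing step: search over candidate motions with a
    cross-entropy-style loop and return the best command found."""
    mean, std = np.zeros(3), np.ones(3)
    for _ in range(n_iters):
        candidates = rng.normal(mean, std, size=(n_samples, 3))
        scores = np.array([grasp_score(image_features, c) for c in candidates])
        elite = candidates[np.argsort(scores)[-n_elite:]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean

features = rng.random(8)                   # stand-in image features
command = servo_step(features)             # command to servo the gripper
```

Because the policy re-plans from a fresh image every step, it can fix mistakes mid-grasp, which is the behavior the bullet list above highlights.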

SLIDE 15

Grasping Experiments

SLIDE 16

Policy Learning with Multiple Robots

alternate between: rollout execution → local policy optimization → global policy optimization

Mrinal Kalakrishnan Yevgen Chebotar Adrian Li Ali Yahya

SLIDE 17

Yahya, Li, Kalakrishnan, Chebotar, L., ‘16
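A heavily simplified sketch of the alternation named above: each robot optimizes a local policy from its own rollouts, and a shared global policy is fit to all of them, pooling experience across robots. Every numeric detail below is invented for illustration (the local policies are reduced to scalars):

```python
import numpy as np

rng = np.random.default_rng(5)

def local_policy_update(params, robot_data):
    """Each robot improves its own local policy from its own rollouts
    (here just a toy nudge toward the mean of its data)."""
    return 0.9 * params + 0.1 * robot_data.mean()

def global_policy_fit(all_local_params):
    """The shared global policy is fit with supervised learning to
    match every robot's local policy (reduced here to a plain average
    of toy scalar parameters)."""
    return float(np.mean(all_local_params))

# four robots, each facing a slightly different task instance
local_params = [rng.normal(loc=i, scale=0.1) for i in range(4)]
for _ in range(10):                        # alternate the three stages
    rollouts = [rng.normal(loc=i, scale=0.5, size=20) for i in range(4)]
    local_params = [local_policy_update(p, d)
                    for p, d in zip(local_params, rollouts)]
    global_params = global_policy_fit(local_params)
```

The design point is that rollout collection parallelizes across robots, so wall-clock learning time shrinks as robots are added.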

SLIDE 18

Policy Learning with Multiple Robots: Deep RL with NAF

Gu*, Holly*, Lillicrap, L., ‘16

Shane Gu Ethan Holly Tim Lillicrap
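NAF (normalized advantage functions) makes Q-learning tractable with continuous actions by restricting the advantage to a quadratic in the action, Q(x, u) = V(x) - 1/2 (u - mu(x))^T P(x) (u - mu(x)), so the greedy action is simply mu(x) and no inner maximization is needed. A minimal numeric sketch (toy numbers; in the real method V, mu, and the Cholesky factor L are all outputs of one neural network):

```python
import numpy as np

def naf_q(state_value, mu, L, u):
    """NAF decomposition Q(x,u) = V(x) + A(x,u), where the advantage is
    a negative-definite quadratic in the action, maximized at u = mu."""
    P = L @ L.T                            # positive semi-definite precision
    diff = u - mu
    advantage = -0.5 * diff @ P @ diff
    return state_value + advantage

mu = np.array([0.2, -0.1])                 # greedy action for this state
L = np.array([[1.0, 0.0], [0.3, 0.8]])     # lower-triangular factor of P
q_at_mu = naf_q(1.5, mu, L, mu)            # advantage is zero at u = mu
q_off = naf_q(1.5, mu, L, mu + 0.5)        # any other action scores lower
```

Parameterizing P through its Cholesky factor L guarantees the quadratic is concave in u, which is what makes the argmax closed-form.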

SLIDE 19

Learning a Predictive Model of Natural Images

original video vs. video predictions

Chelsea Finn

SLIDE 20

1. Does end-to-end learning produce better sensorimotor skills?
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
3. Can we scale up deep robotic learning and produce skills that generalize?
4. How can we learn safely and efficiently in safety-critical domains?
5. Can we transfer skills from simulation to the real world, and from one robot to another?

SLIDE 21

Safe Uncertainty-Aware Learning

unknown environment

1. Learn a collision prediction model: raw image → neural network ensemble → command velocities
2. Speed-dependent, uncertainty-aware collision cost
3. Iteratively train with on-policy samples

Key idea: to learn about collisions, one must experience collisions (but safely!)

Kahn, Pong, Abbeel, L. '16 (Greg Kahn)
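A rough sketch of the speed-dependent, uncertainty-aware cost idea: penalize the mean predicted collision probability plus the ensemble's disagreement, scaled by commanded speed, so that uncertain predictions only permit cautious, low-speed motion. This is illustrative; the paper's exact cost may differ:

```python
import numpy as np

def collision_cost(ensemble_probs, speed, risk_weight=1.0):
    """Combine mean predicted collision probability with ensemble
    disagreement, scaled by speed: high uncertainty at high speed is
    expensive, while slow motion under uncertainty stays cheap."""
    mean = ensemble_probs.mean()
    disagreement = ensemble_probs.std()
    return speed * (mean + risk_weight * disagreement)

# three ensemble members score the same candidate velocity command
confident_safe = np.array([0.05, 0.06, 0.04])
uncertain = np.array([0.05, 0.60, 0.30])

cautious = collision_cost(confident_safe, speed=1.0)
risky = collision_cost(uncertain, speed=1.0)
slow_risky = collision_cost(uncertain, speed=0.2)
```

This is what makes the collisions experienced during training safe: the cost pushes the robot to slow down exactly where its model disagrees with itself.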

SLIDE 22

Safe Uncertainty-Aware Learning

Kahn, Pong, Abbeel, L. ‘16

SLIDE 23

1. Does end-to-end learning produce better sensorimotor skills?
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
3. Can we scale up deep robotic learning and produce skills that generalize?
4. How can we learn safely and efficiently in safety-critical domains?
5. Can we transfer skills from simulation to the real world, and from one robot to another?

SLIDE 24

Training in Simulation: CAD2RL

Sadeghi, L. '16 (Fereshteh Sadeghi)

SLIDE 25

Training in Simulation: CAD2RL

Sadeghi, L. ‘16

SLIDE 26

Training in Simulation: CAD2RL

Sadeghi, L. ‘16

SLIDE 27

Sadeghi, L. ‘16
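Training entirely in simulation relies on randomizing the rendered scenes so that the policy never overfits to any one rendering and treats the real world as just another variation. A schematic sketch of that randomization step (the field names and ranges below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_simulated_scene():
    """Draw one randomized synthetic training scene: textures, lighting,
    and layout all vary, so no two renderings look alike."""
    return {
        "wall_texture": int(rng.integers(0, 50)),   # texture index
        "lighting": float(rng.uniform(0.3, 1.5)),   # brightness scale
        "hallway_width": float(rng.uniform(2.0, 5.0)),  # meters
        "obstacles": int(rng.integers(0, 6)),       # clutter count
    }

# a training run draws a fresh scene for every episode
scenes = [sample_simulated_scene() for _ in range(1000)]
```

The breadth of the distribution matters more than the realism of any single sample: a policy that flies in all of these hallways has a chance of flying in a real one.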

SLIDE 28

Learning with Transfer in Mind: Ensemble Policy Optimization (EPOpt)

train / test / adapt: training on a single torso mass vs. training on a model ensemble; unmodeled effects → ensemble adaptation

Aravind Rajeswaran
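The ensemble objective can be sketched as evaluating the policy on many sampled models and optimizing only the worst-performing fraction (a CVaR-style objective), which rewards robustness to model error over performance on one nominal model. The dynamics "simulator" and return function below are invented toy stand-ins:

```python
import numpy as np

rng = np.random.default_rng(4)

def rollout_return(torso_mass, policy_gain):
    """Invented toy simulator: a policy tuned to one torso mass degrades
    quadratically as the true mass moves away from that value."""
    return 10.0 - (policy_gain - torso_mass) ** 2

def epopt_objective(policy_gain, masses, worst_frac=0.2):
    """Average the returns of the worst-performing fraction of the
    model ensemble, rather than the mean over all models."""
    returns = np.sort([rollout_return(m, policy_gain) for m in masses])
    k = max(1, int(worst_frac * len(masses)))
    return returns[:k].mean()

masses = rng.uniform(0.8, 1.2, size=100)   # ensemble of torso masses
tuned_to_one_model = epopt_objective(policy_gain=0.8, masses=masses)
robust_to_ensemble = epopt_objective(policy_gain=1.0, masses=masses)
```

A policy centered on the ensemble scores better under the worst-case objective than one tuned to a single mass at the edge of the distribution, which is the train/test/adapt story the slide tells.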

SLIDE 29

1. Does end-to-end learning produce better sensorimotor skills?
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
3. Can we scale up deep robotic learning and produce skills that generalize?
4. How can we learn safely and efficiently in safety-critical domains?
5. Can we transfer skills from simulation to the real world, and from one robot to another?
6. How can we get sufficient supervision to learn in unstructured real-world environments?

SLIDE 30

Learning what Success Means

can we learn the goal with visual features?

Finn, Abbeel, L. ‘16

SLIDE 31

Learning what Success Means

Sermanet, Xu, L. ‘16

SLIDE 32

ingredients for success in learning:

supervised learning:      computation ✓   algorithms ✓   data ✓
learning robotic skills:  computation ✓   algorithms ~   data ?

SLIDE 33

Announcement: new conference. Conference on Robot Learning (CoRL), www.robot-learning.org

Goal: bring together robotics & machine learning in a focused conference format

Conference: November 2017; papers deadline: late June 2017

Steering committee: Ken Goldberg (UC Berkeley), Sergey Levine (UC Berkeley), Vincent Vanhoucke (Google), Abhinav Gupta (CMU), Stefan Schaal (USC, MPI), Michael I. Jordan (UC Berkeley), Raia Hadsell (DeepMind), Dieter Fox (UW), Joelle Pineau (McGill), J. Andrew Bagnell (CMU), Aude Billard (EPFL), Stefanie Tellex (Brown), Minoru Asada (Osaka), Wolfram Burgard (Freiburg), Pieter Abbeel (UC Berkeley)

Collaborators: Chelsea Finn, Peter Pastor, Alex Krizhevsky, Deirdre Quillen, Mrinal Kalakrishnan, Yevgen Chebotar, Adrian Li, Ali Yahya, Shane Gu, Ethan Holly, Tim Lillicrap, Greg Kahn, Fereshteh Sadeghi, Aravind Rajeswaran, Pieter Abbeel, Trevor Darrell