Depth Sensing and Deep Learning: Grasping and Segmenting 3D Objects from Real Depth Images using Synthetic Data


SLIDE 1

Depth Sensing and Deep Learning: Grasping and Segmenting 3D Objects from Real Depth Images using Synthetic Data

Michael Danielczuk, Jeffrey Mahler, Matthew Matl, Saurabh Gupta, Andrew Lee, Andrew Li, Vishal Satish, Bill DeRose, Stephen McKinley, Ken Goldberg

SLIDE 2
SLIDE 3
SLIDE 4
SLIDE 5
SLIDE 6

Black Grouse

SLIDE 7

Imagenet: 14M labeled images, 20K categories

SLIDE 8

Classical Computer Vision Pipeline.

CV experts:
1. Select / develop features: SURF, HoG, SIFT, RIFT, …
2. Add machine learning for multi-class recognition and train a classifier

Pipeline: Feature Extraction (SIFT, HoG, …) → Detection → Classification → Recognition

Classical CV feature definition is domain-specific and time-consuming
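The two steps above can be sketched end-to-end in a toy form. Below is a minimal, self-contained stand-in in plain NumPy: `hog_like_features` is a hypothetical helper (not a real library call) that computes a crude HoG-style orientation histogram, and classification is done by nearest class centroid rather than a trained SVM.

```python
import numpy as np

def hog_like_features(img, n_bins=9):
    """Toy HoG-style descriptor (hypothetical helper): a global histogram
    of gradient orientations, weighted by gradient magnitude. Real
    pipelines use cell/block-normalized HoG or SIFT descriptors."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi                   # unsigned orientation in [0, pi)
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    return hist / (hist.sum() + 1e-8)                  # L1-normalize

# Step 2: hand the fixed features to a classifier (here, nearest class
# centroid as a stand-in for a trained multi-class SVM).
def nearest_class(feat, centroids):
    return min(centroids, key=lambda c: np.linalg.norm(feat - centroids[c]))
```

The point of the sketch is the division of labor: the feature function is fixed by hand, and only the classifier on top of it is learned, which is exactly what deep networks later replaced.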

Slide from Boris Ginzburg

SLIDE 9

Imagenet Classification 2012

Slide from Boris Ginzburg

SLIDE 10

http://www.image-net.org/challenges/LSVRC/2012/results.html

Imagenet 2012 Leaderboard

N  Error-5  Algorithm                                      Team                Authors
1  0.153    Deep Conv. Neural Network                      Univ. of Toronto    Krizhevsky et al.
2  0.262    Features + Fisher Vectors + Linear classifier  ISI                 Gunji et al.
3  0.270    Features + FV + SVM                            OXFORD_VGG          Simonyan et al.
4  0.271    SIFT + FV + PQ + SVM                           XRCE/INRIA          Perronin et al.
5  0.300    Color desc. + SVM                              Univ. of Amsterdam  van de Sande et al.

Slide from Boris Ginzburg

SLIDE 11

Imagenet 2013 Leaderboard

http://www.image-net.org/challenges/LSVRC/2013/results.php

N  Error-5  Algorithm                                      Team                   Authors
1  0.117    Deep Convolutional Neural Network              Clarifai               Zeiler
2  0.129    Deep Convolutional Neural Networks             Nat. Univ. Singapore   Min Lin
3  0.135    Deep Convolutional Neural Networks             NYU                    Zeiler, Fergus
4  0.135    Deep Convolutional Neural Networks             Andrew Howard          Andrew Howard
5  0.137    Deep Convolutional Neural Networks (Overfeat)  NYU                    Pierre Sermanet et al.

Slide from Boris Ginzburg

SLIDE 12

Slide from Boris Ginzburg

Imagenet Classification 2013

SLIDE 13

Today’s Lecture

  • (Brief) Intro to Convolutional Neural Networks (CNNs)
  • Learning Instance Specific Grasping
  • Learning Grasp Quality CNNs
  • Learning Instance Segmentation CNNs
SLIDE 14

Today’s Lecture

  • (Brief) Intro to Convolutional Neural Networks (CNNs)

  • Learning Instance Specific Grasping
  • Learning Grasp Quality CNNs
  • Learning Instance Segmentation CNNs
SLIDE 15
SLIDE 16
SLIDE 17
SLIDE 18
SLIDE 19
SLIDE 20
SLIDE 21
SLIDE 22
SLIDE 23
SLIDE 24
SLIDE 25

Why is Convolution good?

  • Shared Parameters
  • Only store filter parameters: 5*5*3 (for kernel) + 1 (for bias), for example
  • Local Connectivity
  • Each neuron only corresponds to a local patch of the image (not the whole thing)
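As a quick sanity check on the numbers above, the parameter counts can be computed directly. The 5×5×3 kernel is from the slide; the 32×32×3 input size used for the fully-connected comparison is an assumption for illustration.

```python
# Shared parameters: one 5x5x3 filter plus a bias, reused at every
# spatial position of the input.
k_h, k_w, in_ch = 5, 5, 3
conv_params = k_h * k_w * in_ch + 1
print(conv_params)  # 76

# Contrast: a fully connected layer with one neuron per output position
# (assuming a 32x32x3 input) needs a weight for every input value.
h, w = 32, 32
fc_params = (h * w) * (h * w * in_ch + 1)
print(fc_params)  # 3146752
```

76 parameters versus roughly 3.1 million: weight sharing is what makes convolution tractable on images.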

SLIDE 26
SLIDE 27
SLIDE 28
SLIDE 29

Idea: Learn Features

SLIDE 30

Today’s Lecture

  • (Brief) Intro to Convolutional Neural Networks (CNNs)
  • Learning Instance-Specific Grasping

  • Learning Grasp Quality CNNs
  • Learning Instance Segmentation CNNs
SLIDE 31

SLIDE 32

Grand Challenge

The ability to “grasp millions of different sized and shaped objects… would have significant impact on deployment of robots in factories, in warehouses, and in homes.” - Rod Brooks

SLIDE 33

SLIDE 34

SLIDE 35

Universal Picking: diversely shaped and sized objects
SLIDE 36

Motivation: Instance-Specific Grasping

Desired Object: Dasani Drops

SLIDE 37

Grasping under Uncertainty

SLIDE 38

UNCERTAINTY in Physics, Control, and Perception

SLIDE 39

First Wave: Analytic Methods

REULEAUX, 1876 HANAFUSA & ASADA, 1977 LI & SASTRY, 1988 NGUYEN, 1988 FERRARI & CANNY, 1992 BICCHI, 1994 SHIMOGA, 1996 BICCHI & KUMAR, 2001 ROA & SUAREZ, 2006 KRUGER ET AL., 2012 POKORNY ET AL., 2013 HAAS-HEGER ET AL., 2006

R(x, u) ∈ {0, 1},  u* = π(x) = argmax_u R(x, u)
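The policy above can be made concrete with a toy sketch. Both functions below are hypothetical illustrations: `reward` stands in for an analytic binary quality metric such as force closure, and `policy` simply takes the argmax over a discrete set of candidate grasps.

```python
def reward(x, u):
    """Hypothetical binary grasp metric R(x, u) in {0, 1}: a stand-in
    for an analytic quality measure (e.g. force closure). Here a grasp
    u succeeds iff it lies within 0.1 of the object state x."""
    return 1 if abs(u - x) < 0.1 else 0

def policy(x, candidates):
    """u* = pi(x) = argmax_u R(x, u) over a discrete candidate set."""
    return max(candidates, key=lambda u: reward(x, u))

print(policy(0.52, [0.0, 0.5, 1.0]))  # 0.5
```

The key assumption of these first-wave methods, and the one the lecture goes on to question, is that x (object pose, geometry, friction) is known exactly, so R can be evaluated analytically.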

SLIDE 40

Uncertainty in object identity: aggregate pre-computed grasp information from multiple objects in database for robustness to recognition errors.

Data-driven Grasping under Uncertainty

Ciocarlie, Pantofaru, Hsiao, Bradski, Brook, and Dreyfuss. “A Side of Data with My Robot: Three Datasets for Mobile Manipulation in Human Environments.”, R&A Magazine, 2011

Uncertainty in object pose: green grasps are robust to pose errors, red ones are not.

Computations performed over a database of ~200 household objects.
SLIDE 41

Image-Based Grasp Planning

Pipeline: Input Image → Filter Bank → Regressor → Grasp Proposals. Image sources: Saxena et al., 2008; Pinto & Gupta, 2016

Color Images: [Saxena et al., 2008] [Stark et al., 2008] [Bohg & Kragic, 2010] [Le et al., 2010]

Point Clouds: [Saxena et al., 2008] [Detry et al., 2009] [Hubner & Kragic, 2010] [Boularis et al., 2011] [Fishchinger & Vincze, 2012] [Herzog et al., 2014] [Oberlin & Tellex, 2015] [ten Pas et al., 2015]

SLIDE 42

Deep Grasp Planning

Pipeline: Input Image → Convolutional Neural Network → Grasp

Color Images: [Lenz et al., 2015] [Redmon & Angelova, 2016] [Pinto et al., 2016] [Levine et al., 2017]

Point Clouds: [Kappler et al., 2015] [Gualtieri et al., 2016] [Johns et al., 2016] [Viereck et al., 2017]

Image source: Pinto & Gupta, 2016

SLIDE 43

Data Sources

Human labeling: Lenz et al. “Deep Learning for Detecting Robotic Grasps.” IJRR 2015. 1035 labeled images: 80% grasp success.

Self-supervision: S. Levine et al. “Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection.” IJRR 2017. ~1 robot year of trials: 80-90% grasp success.

SLIDE 44

Dexterity Network (Dex-Net)

SLIDE 45

Synthetic LIDAR Images Synthetic Point Clouds

SLIDE 46

Dex-Net: 6.7 million examples, labeled positive or negative.

SLIDE 47

Grasp Quality CNN

SLIDE 48

GQ-CNN for Bins

[Figure: Iter 0, Iter 1, Plan]

SLIDE 49
SLIDE 50
SLIDE 51

Mahler, Jeffrey, Matthew Matl, Vishal Satish, Michael Danielczuk, Bill DeRose, Stephen McKinley, and Ken Goldberg. "Learning ambidextrous robot grasping policies." Science Robotics 4, no. 26 (2019): eaau4984.
SLIDE 52
SLIDE 53
SLIDE 54

Motivation: Instance-Specific Grasping

Desired Object: Dasani Drops

SLIDE 55

One Approach

Segment all objects in the scene, Classify all segments, Grasp the segment that matches the target!
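A minimal sketch of that three-step loop. All three components passed in (`segmenter`, `classifier`, `grasp_planner`) are hypothetical placeholder callables, not any real API from the talk.

```python
def pick_target(depth_image, target_label, segmenter, classifier, grasp_planner):
    """Sketch of the slide's pipeline, with placeholder components:
      segmenter(img)           -> list of object masks in the scene
      classifier(img, mask)    -> class label for one segment
      grasp_planner(img, mask) -> grasp for the chosen segment
    """
    for mask in segmenter(depth_image):                    # 1. segment all objects
        if classifier(depth_image, mask) == target_label:  # 2. classify each segment
            return grasp_planner(depth_image, mask)        # 3. grasp the match
    return None  # target object not visible in the scene
```

Note that step 1 is where the rest of the lecture focuses: a segmenter that works on real depth images of unseen objects is exactly what SD Mask R-CNN provides.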

SLIDE 56

Slide from: Kaiming He

SLIDE 57

Mask R-CNN

SLIDE 58

Mask R-CNN

  • State-of-the-art instance segmentation network
  • Requires massive hand-labeled datasets for training
  • Does not generalize to unseen classes

He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask r-cnn. In Proceedings of the IEEE international conference on computer vision (pp. 2961-2969).

SLIDE 59

Automated Dataset Generation

  • WISDOM-Sim: 50,000 images with 320,000 object labels in just 3.5 hours

  • 80/20 train/test split for both images and objects
  • 1600 unique objects, 10,000 unique heaps
SLIDE 60

SLIDE 61

Sampling Distributions

SLIDE 62

Data Augmentation

SLIDE 63

Modal Segmentation Masks

SLIDE 64

SLIDE 65

Amodal Segmentation Masks

SLIDE 66

SD Mask R-CNN

  • State-of-the-art performance for object instance segmentation on real depth images
  • No hand-labeling required; trained solely on synthetic depth images
  • Generalizes to unseen objects
  • Outperforms point cloud clustering baselines by 15% in average precision and 20% in average recall

SLIDE 67

COCO Metrics

SLIDE 68

COCO Metrics

Precision: TP / (TP + FP)
Recall: TP / (TP + FN)
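These definitions compute directly from true-positive, false-positive, and false-negative counts; the example numbers below are made up for illustration.

```python
def precision(tp, fp):
    """Fraction of predicted masks that match a ground-truth object."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of ground-truth objects covered by a predicted mask."""
    return tp / (tp + fn)

# e.g. 8 of 10 predicted masks are correct, and 4 objects were missed:
print(precision(8, 2))  # 0.8
print(recall(8, 4))     # 0.666...
```

The COCO benchmark averages these over a sweep of mask-overlap (IoU) thresholds, which is why the slides report *average* precision and recall.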

SLIDE 69
SLIDE 70
SLIDE 71
SLIDE 72

Application: Instance-Specific Grasping

Target: Dasani Drops

slide-73
SLIDE 73

Thank You!

Project Website, Supplementary Material, and Datasets: https://bit.ly/2letCuE
Code: https://github.com/BerkeleyAutomation/sd-maskrcnn