Deep Learning Approach for Pose Estimation, Talk #23444, MSc Kanter van Deurzen

Apr 08, 2023

SLIDE 1

Deep Learning Approach for Pose Estimation

Talk #23444 MSc Kanter van Deurzen

SLIDE 2

Introduction

Delft Robotics

Founded 2014
Staff of 11
Specialized in vision-guided robotics through AI
Winner Amazon Picking Challenge 2016

Kanter van Deurzen

CTO Delft Robotics

SLIDE 3

Application

Bin-picking

One type of object
Multiple objects
Restricted area

Order-picking

Large number of different objects
Multiple objects
Restricted area

Service robots

Large number of different objects
No area restriction

Any orientation possible (6DOF)!!

SLIDE 4

Typical pipeline

Data acquisition

Segmentation

Deep learning

Rough pose estimation

RANSAC
4PCS
Fast global registration

Pose optimization

ICP

Grasp pose estimation

SLIDE 5

CAD pipeline

1. Locate object in 2D/3D
2. Use CAD model to find global optimum of object pose
3. Use CAD model to refine pose locally
4. Determine grasp pose using estimated pose
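
Step 4 can be illustrated as a composition of rigid transforms. The sketch below is not from the talk: it assumes the grasp is defined once in the object (CAD) frame, and the names `T_robot_object` and `T_object_grasp` and all numeric values are hypothetical.

```python
import math

def matmul4(a, b):
    """Multiply two 4x4 homogeneous transforms given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rot_z(theta, tx=0.0, ty=0.0, tz=0.0):
    """Homogeneous transform: rotation about Z by theta plus a translation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

# Hypothetical estimated object pose in the robot frame (result of steps 1-3).
T_robot_object = rot_z(math.pi / 2, tx=0.5, ty=0.2, tz=0.1)
# Hypothetical grasp defined on the CAD model, expressed in the object frame.
T_object_grasp = rot_z(0.0, tx=0.05, ty=0.0, tz=0.02)

# Step 4: the grasp pose in the robot frame is the product of the two.
T_robot_grasp = matmul4(T_robot_object, T_object_grasp)
grasp_position = [row[3] for row in T_robot_grasp[:3]]
```

Because the grasp is fixed relative to the object, any error in the estimated object pose carries over directly into the grasp pose.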

SLIDE 6

CAD pipeline

Advantages:

– Complete 6DOF pose is known
– Pose in gripper approximately known
– Scales to other objects

Disadvantages:

– Symmetry causes ambiguity, risk of local minima
– Requires CAD model

SLIDE 7

Non-rigid/CAD-less pipeline

1. Locate object in 2D/3D
2. Fit shape of gripper anywhere on object

SLIDE 8

Non-rigid/CAD-less pipeline

Advantages:

– No CAD model necessary
– No knowledge of object necessary

Disadvantages:

– Pose in gripper unknown
– No knowledge of object used (fragility/weight/...)

SLIDE 9

Why deep learning?

Rough pose estimation is slow
Rough pose estimation risks local minima

SLIDE 10

Attempted DL Approaches

Classification
Regression
Key-points

SLIDE 11

Classification

Concept

– Classify pose as one of X classes

[Figure: example poses labeled Class 1, Class 2, Class 3, Class ...]

SLIDE 12

Classification results

Conditions:

– Rendered images
– Rotations about 3 axes
– 10 classes per axis (36 degrees per class)
– For only 1 axis, great (Z-axis, 99% accuracy)

Conclusions:

– Scales badly to multiple axes (10 classes per axis, 10 × 10 × 10 = 1000 total classes).
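
The class count can be made concrete with a small sketch. It assumes each axis angle is binned independently and the three bin indices are combined into one joint label. The function names are illustrative and not taken from the talk.

```python
CLASSES_PER_AXIS = 10
BIN_WIDTH_DEG = 360 / CLASSES_PER_AXIS  # 36 degrees per class

def angle_to_class(angle_deg):
    """Map an angle in degrees to its bin index in [0, CLASSES_PER_AXIS)."""
    return int((angle_deg % 360) // BIN_WIDTH_DEG)

def pose_to_class(rx_deg, ry_deg, rz_deg):
    """Combine the three per-axis bins into a single joint class label."""
    ix, iy, iz = (angle_to_class(a) for a in (rx_deg, ry_deg, rz_deg))
    return (ix * CLASSES_PER_AXIS + iy) * CLASSES_PER_AXIS + iz

# The joint label space grows as the cube of the per-axis resolution.
total_classes = CLASSES_PER_AXIS ** 3
```

Halving the bin width to 18 degrees would already give 20 × 20 × 20 = 8000 joint classes, which illustrates why the approach scales badly.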

SLIDE 13

Regression

Treat as a regression problem
Regress 4 values (quaternion)
Use RGB(D) as input

[Figure: ground truth vs. predicted pose]
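
A network that regresses 4 values does not automatically produce a valid rotation, so the output is usually normalized to unit length before use. The sketch below assumes (w, x, y, z) component order. The `raw_output` values are hypothetical and no specific network from the talk is implied.

```python
import math

def normalize(q):
    """Scale a raw 4-vector to unit length so it represents a rotation."""
    n = math.sqrt(sum(c * c for c in q))
    return [c / n for c in q]

def quat_to_matrix(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

raw_output = [2.0, 0.0, 0.0, 2.0]  # hypothetical unnormalized network output
q = normalize(raw_output)          # 90 degree rotation about Z
R = quat_to_matrix(q)
```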

SLIDE 14

Regression results

Conclusions:

– Works well on asymmetric objects
– Symmetric objects cause ambiguity
– Difficult to find correct loss function
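
One commonly used option for the loss, shown here only as an illustration and not as the loss used in the talk, is 1 - |<q_pred, q_true>|, which treats q and -q as the same rotation. For symmetric objects, a possible extension is to take the minimum over all symmetry-equivalent ground-truth poses. The list `symmetries` is an assumption: it must be supplied per object, as unit quaternions in the object frame.

```python
def quat_mul(a, b):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return [aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw]

def quat_loss(q_pred, q_true):
    """1 - |dot|: zero when both unit quaternions encode the same rotation."""
    dot = sum(p * t for p, t in zip(q_pred, q_true))
    return 1.0 - abs(dot)

def symmetric_quat_loss(q_pred, q_true, symmetries):
    """Smallest loss over all ground-truth poses that look identical."""
    return min(quat_loss(q_pred, quat_mul(q_true, s)) for s in symmetries)

# Hypothetical object that looks the same after a 180 degree turn about Z.
symmetries = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
q_true = [1.0, 0.0, 0.0, 0.0]
q_pred = [0.0, 0.0, 0.0, 1.0]  # rotated 180 degrees about Z

plain = quat_loss(q_pred, q_true)
aware = symmetric_quat_loss(q_pred, q_true, symmetries)
```

Here the plain loss penalizes a prediction that is visually indistinguishable from the ground truth, while the symmetry-aware variant does not.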

SLIDE 15

Key-points

Concept:

– Recognize 3 or more keypoints
– Determine pose based on these keypoints
– Refine using (local) heuristics
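
As a minimal sketch of the second step, a pose can be recovered from exactly three non-collinear 3D keypoints by building an orthonormal frame on the model points and on the observed points. This assumes the keypoints are available in 3D and are matched one-to-one. With more than three keypoints, a least-squares fit would be the more typical choice. All names and values are illustrative.

```python
import math

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def unit(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def frame(p0, p1, p2):
    """Right-handed orthonormal axes built from 3 non-collinear points."""
    x = unit(sub(p1, p0))
    z = unit(cross(x, sub(p2, p0)))
    y = cross(z, x)
    return [x, y, z]

def pose_from_keypoints(model_pts, observed_pts):
    """Return (R, t) such that observed = R * model + t for the 3 points."""
    fm, fo = frame(*model_pts), frame(*observed_pts)
    # R maps each model axis onto the corresponding observed axis.
    R = [[sum(fo[k][i] * fm[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    rp0 = [sum(R[i][j] * model_pts[0][j] for j in range(3)) for i in range(3)]
    return R, sub(observed_pts[0], rp0)

# Hypothetical keypoints: model rotated 90 degrees about Z, shifted by (1, 2, 3).
model_pts = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
observed_pts = [[1.0, 2.0, 3.0], [1.0, 3.0, 3.0], [0.0, 2.0, 3.0]]
R, t = pose_from_keypoints(model_pts, observed_pts)
```

The result is only as good as the detected keypoints, which is why a local refinement step follows.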

SLIDE 16

Key-points results

Conclusions:

– Easily integrated end-to-end in network
– Works only for well-defined keypoints, which is non-trivial
– Results in good initial guesses

SLIDE 17

Future research

Investigate other DL approaches
Process point clouds directly with DL

– challenges: data, data, data, lack of research

Use other forms of DL (reinforcement learning? GANs?)

SLIDE 18

Thank you for your attention

Kanter van Deurzen
k.vandeurzen@delftrobotics.com
0651753705
