SLIDE 1
Deep Learning Approach for Pose Estimation
Talk #23444
MSc Kanter van Deurzen
Introduction
Delft Robotics
- Founded 2014
- Staff of 11
- Specialized in vision-guided robotics through AI
- Winner Amazon
Kanter van Deurzen
- CTO Delft Robotics
SLIDE 2
SLIDE 3
Application
Bin-picking
- One type of object
- Multiple objects
- Restricted area
Order-picking
- Large number of different objects
- Multiple objects
- Restricted area
Service robots
- Large number of different objects
- No area restriction
Any orientation possible (6DOF)!!
SLIDE 4
Typical pipeline
Data acquisition
- 2D
- 3D
Segmentation
- Deep learning
Rough pose estimation
- RANSAC
- 4PCS
- Fast global registration
Pose optimization
- ICP
Grasp pose estimation
SLIDE 5
CAD pipeline
- 1. Locate object in 2D/3D
- 2. Use CAD model to find global optimum of object pose
- 3. Use CAD model to refine pose locally
- 4. Determine grasp pose using estimated pose
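Step 4 can be read as composing the estimated object pose with a grasp that was defined once in the CAD model's own frame. The block below is a minimal sketch of that composition, assuming poses are stored as 4x4 homogeneous transforms. The names `T_cam_obj` and `T_obj_grasp` and all numeric values are hypothetical and not taken from the talk.

```python
import numpy as np

def make_transform(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

def rot_z(angle_rad):
    """Rotation matrix for a rotation of angle_rad about the z-axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Hypothetical estimated object pose in the camera frame (result of steps 1-3).
T_cam_obj = make_transform(rot_z(np.pi / 2), [0.10, 0.00, 0.50])

# Hypothetical grasp defined in the CAD model frame: 5 cm along the object's x-axis.
T_obj_grasp = make_transform(np.eye(3), [0.05, 0.00, 0.00])

# Grasp pose in the camera frame: object pose first, then the grasp offset.
T_cam_grasp = T_cam_obj @ T_obj_grasp
grasp_position = T_cam_grasp[:3, 3]
```

Because the grasp is expressed relative to the object, any error in the estimated pose carries over directly to the grasp pose.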
SLIDE 6
CAD pipeline
- Advantages:
– Complete 6DOF pose is known
– Pose in gripper approximately known
– Scales to other objects
- Disadvantages:
– Symmetry causes ambiguity, risk of local minima
– Requires CAD model
SLIDE 7
Non-rigid/CAD-less pipeline
- 1. Locate object in 2D/3D
- 2. Fit shape of gripper anywhere on object
SLIDE 8
Non-rigid/CAD-less pipeline
- Advantages:
– No CAD model necessary
– No knowledge of object necessary
- Disadvantages:
– Pose in gripper unknown
– No knowledge of object used (fragility/weight/...)
SLIDE 9
Why deep learning?
- Rough pose estimation is slow
- Rough pose estimation risks local minima
SLIDE 10
Attempted DL Approaches
- Classification
- Regression
- Key-points
SLIDE 11
Classification
- Concept
– Classify pose as one of X classes
Class 1 Class 2 Class 3 Class ...
SLIDE 12
Classification results
- Conditions:
– Rendered images
– Rotation about 3 axes
– 10 classes per axis (36 degrees per class)
– Works well for a single axis (Z-axis, 99% accuracy)
- Conclusions:
– Scales badly to multiple axes (10 classes per axis, 10*10*10 = 1000 total classes).
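The class count above follows from binning each rotation axis independently. The block below is a minimal sketch of such a binning, assuming the pose is given as three Euler angles in degrees and that the per-axis bins are flattened into a single class index. The function names are hypothetical and the talk does not specify how classes were encoded.

```python
CLASSES_PER_AXIS = 10
BIN_WIDTH_DEG = 360.0 / CLASSES_PER_AXIS  # 36 degrees per class

def angle_to_bin(angle_deg):
    """Map an angle in degrees to a bin index in [0, CLASSES_PER_AXIS)."""
    index = int((angle_deg % 360.0) // BIN_WIDTH_DEG)
    # Clamp guards against floating-point wrap-around landing exactly on 360.
    return min(index, CLASSES_PER_AXIS - 1)

def pose_to_class(rx_deg, ry_deg, rz_deg):
    """Flatten the three per-axis bins into one class index."""
    bx, by, bz = (angle_to_bin(a) for a in (rx_deg, ry_deg, rz_deg))
    return (bx * CLASSES_PER_AXIS + by) * CLASSES_PER_AXIS + bz

def total_classes(num_axes):
    """Number of output classes when num_axes axes are binned jointly."""
    return CLASSES_PER_AXIS ** num_axes
```

With this encoding, `total_classes(1)` is 10 and `total_classes(3)` is 1000. Halving the bin width to 18 degrees would raise the three-axis count to 8000, which illustrates why the approach scales badly.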
SLIDE 13
Regression
- Treat as a regression problem
- Regress 4 values (quaternion)
- Use RGB(D) as input
Ground truth Predicted pose
SLIDE 14
Regression results
- Conclusions:
– Works well on asymmetric objects
– Symmetric objects cause ambiguity
– Difficult to find correct loss function
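One reason the loss function is hard to choose is that the quaternions q and -q describe the same rotation, so a plain squared error can penalize a correct prediction. The block below is a minimal sketch of a sign-invariant alternative, one common choice and not necessarily the loss used in the talk. It assumes quaternions in (w, x, y, z) order, and the function names are hypothetical.

```python
import math

def normalize(q):
    """Scale a quaternion to unit length."""
    norm = math.sqrt(sum(c * c for c in q))
    return [c / norm for c in q]

def l2_loss(q_pred, q_true):
    """Plain squared error between quaternion components."""
    return sum((a - b) ** 2 for a, b in zip(q_pred, q_true))

def sign_invariant_loss(q_pred, q_true):
    """1 - |<q_pred, q_true>| on unit quaternions; zero for q and -q alike."""
    p, t = normalize(q_pred), normalize(q_true)
    dot = sum(a * b for a, b in zip(p, t))
    return 1.0 - abs(dot)

q_true = [1.0, 0.0, 0.0, 0.0]    # identity rotation
q_pred = [-1.0, 0.0, 0.0, 0.0]   # same rotation, opposite sign

naive = l2_loss(q_pred, q_true)               # large despite a correct pose
robust = sign_invariant_loss(q_pred, q_true)  # zero for the same pose
```

This only removes the sign ambiguity of the quaternion representation. It does not address the ambiguity caused by symmetric objects, where several distinct rotations produce the same appearance.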
SLIDE 15
Key-points
- Concept:
– Recognize 3 or more keypoints
– Determine pose based on these keypoints
– Refine using (local) heuristics
SLIDE 16
Key-points results
- Conclusions:
– Easily integrated end-to-end in network
– Works only for well-defined keypoints, which is non-trivial
– Results in good initial guesses
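Given three or more matched keypoints, an initial pose can be computed in closed form. The block below is a minimal sketch using the Kabsch (SVD-based) least-squares alignment, assuming the detected keypoints are available as 3D points (for example from depth data) and are already matched to their counterparts on the model. The talk does not state which solver was used, and the function name and sample coordinates are hypothetical.

```python
import numpy as np

def pose_from_keypoints(model_pts, observed_pts):
    """Least-squares rigid transform (R, t) such that observed ≈ R @ model + t."""
    model_pts = np.asarray(model_pts, dtype=float)
    observed_pts = np.asarray(observed_pts, dtype=float)
    model_centroid = model_pts.mean(axis=0)
    observed_centroid = observed_pts.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (model_pts - model_centroid).T @ (observed_pts - observed_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (determinant -1) solution.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = observed_centroid - R @ model_centroid
    return R, t

# Hypothetical model keypoints (three non-collinear points) and a known pose.
model_keypoints = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])  # 90 deg about z
t_true = np.array([0.1, 0.2, 0.3])
observed_keypoints = model_keypoints @ R_true.T + t_true

R_est, t_est = pose_from_keypoints(model_keypoints, observed_keypoints)
```

With noisy or poorly localized keypoints the result is only approximate, which is consistent with treating it as an initial guess to be refined afterwards.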
SLIDE 17
Future research
- Investigate other DL approaches
- Process point clouds directly with DL
– challenges: data, data, data, lack of research
- Use other forms of DL (reinforcement learning? GANs?)
SLIDE 18