[PPT] - Geometric and semantic SLAM using high level features Shichao Yang PowerPoint Presentation

SLIDE 1

Shichao Yang, Michael Kaess, Sebastian Scherer

Geometric and semantic SLAM using high level features

SLIDE 2

Autonomous Robots

- Widely used in searching, monitoring, mapping, etc.
- Focus on the monocular camera.

Images: bridge inspection, riverine mapping

SLIDE 3

SLAM

- Simultaneous Localization and Mapping

Examples: ORB SLAM (R. Mur-Artal et al., 2015); LSD, DSO (Jakob Engel et al., 2016)

SLIDE 4

SLAM not enough?

- Traditional SLAM might fail in challenging low-texture cases.

Images: DSO, ORB SLAM

SLIDE 5

SLAM not enough?

- Need objects and planes, in addition to points.
- Reason about the position of 3D objects and layouts.

Applications:
- Autonomous driving (Mousavian, 2016)
- Virtual reality: placing furniture
- Scene understanding (Hedau, Hoiem, 2010)
- Robotic manipulation (Seung-Joon Yi, 2015)

SLIDE 6

Methods

Diagram: Scene understanding, SLAM

Jointly solve SLAM and scene understanding, demonstrating that they can benefit each other.

High-level features: lines, planes, objects…

SLIDE 7

Outline

- Line VO/SLAM
- Plane SLAM
- Object SLAM

SLIDE 8

Outline

- Line VO/SLAM
- Plane SLAM
- Object SLAM

SLIDE 9

Line VO/SLAM

- Lines are an important feature
  - Exist in low-texture environments
  - Provide long-range constraints
- Challenges:
  - Line parameterization
  - Sensitive to occlusion, less reliable than points

Images: feature points, lines

Shichao Yang, Sebastian Scherer. "Direct Monocular Odometry Using Points and Lines." ICRA, 2017

SLIDE 10

Related Work: Points + Line

- Only geometric error

Error types shown: point-point geometric error, line-line geometric error, point-line geometric error

References: Albert, et al., ICRA, 2017; Juan, et al., ICCV, 2015; Manohar, et al., ICRA, 2016

SLIDE 11

Proposed Line VO

- Points + lines, with two error types.
- Contributions:
  - Combine points and lines with two types of error, especially suitable for low-texture environments
  - Provide an uncertainty analysis and probabilistic fusion in tracking and mapping
  - Real-time VO, outperforming or comparable to existing VO

Direct method: point-point photometric error $I_1 - I_2$
Feature-based method: point-line geometric error $d$
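A minimal sketch of the two residual types named on this slide, assuming pixels are given as (u, v) and grayscale images are NumPy arrays indexed as image[v, u]. The function names and the nearest-pixel lookup are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def point_line_distance(p, a, b):
    """Signed perpendicular distance from pixel p to the 2D line through a and b."""
    p, a, b = (np.asarray(v, dtype=float) for v in (p, a, b))
    direction = b - a
    # Rotate the line direction by 90 degrees to get the line normal.
    normal = np.array([-direction[1], direction[0]])
    normal /= np.linalg.norm(normal)
    return float(normal @ (p - a))


def photometric_residual(img_ref, img_cur, px_ref, px_cur):
    """Intensity difference between a reference pixel and its matched pixel."""
    (ur, vr), (uc, vc) = px_ref, px_cur
    return float(img_ref[vr, ur]) - float(img_cur[vc, uc])
```

A tracker in this style would sum squared photometric residuals over high-gradient pixels and squared point-line distances over line pixels; how the two terms are weighted is not specified here.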

SLIDE 12

Proposed Line VO

- Pipeline, as an extension of point-based SDVO [1]

Pipeline diagram: images → point extraction (high-gradient pixels) and line detection (line pixels) → tracking → mapping
Tracking: estimate camera pose by minimizing two kinds of errors.
Mapping: estimate the keyframe's depth through line regularization.

[1] Jakob Engel, et al. "Semi-dense Visual Odometry for a Monocular Camera." ICCV, 2013

SLIDE 13

Experiments - Line VO

Relative Position Error (cm/s) on TUM Dataset

SLIDE 14

Experiments - Line VO

- Datasets with various textures

https://youtu.be/wu4jL2jQEac

SLIDE 15

Outline

- Line VO/SLAM
- Plane SLAM
- Object SLAM

SLIDE 16

Introduction - Plane SLAM

- Manhattan corridors
  - Similar layout structures
  - Low-texture: few visual features. Difficult for traditional v-SLAM

SLAM with Layout Planes, IROS, 2016

Images: texture-less but structured corridor; sparse and inaccurate map of ORB-SLAM

SLIDE 17

Related Work – Layout Understanding

- Hoiem, 2007: decision tree segmentation + pop up
- Hedau, 2009: cuboidal room using vanishing points
- Lee, 2009: fixed corridor models

- Usually works for Manhattan box environments or fixed corridor configurations and viewpoints
- Not real time.

SLIDE 18

Related Work - SLAM + Layout

- Sequential approach, solving problems separately.

Diagram labels: scene understanding, SLAM, point cloud, detect plane and object, 3D layout, post dense mapping

References: Sid Yingze Bao, 2014; Concha, Alejo, 2015

Limitations
- If one module fails, the other also fails.

SLIDE 19

Proposed methods

- Contributions
  - Jointly optimize scene layouts with camera poses in a SLAM framework, in large environments, for the first time.
  - Real-time system applicable to robot navigation.

Diagram: Scene understanding, SLAM

SLIDE 20

Plane model from Single Image

- Layout plane extraction from a single image

Shichao Yang, Daniel Maturana, Sebastian Scherer. "Real-time 3D scene layout from a single image using convolutional neural networks." ICRA, 2016

Pipeline: ground segmentation → boundary line fitting → pop-up 3D model
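A minimal sketch of the pop-up step, assuming a pinhole camera with intrinsic matrix K, a level camera (x right, y down, z forward) at a known height above a flat ground, so the ground plane is y = cam_height in the camera frame. These assumptions and the function name are illustrative and are not taken from the slides.

```python
import numpy as np


def pop_up_ground_pixel(u, v, K, cam_height):
    """Back-project a ground pixel (u, v) to a 3D point on the ground plane.

    Assumes the ground plane is y = cam_height in the camera frame.
    """
    # Viewing ray through the pixel, scaled so that its z component is 1.
    ray = np.linalg.solve(K, np.array([u, v, 1.0]))
    if ray[1] <= 0:
        raise ValueError("pixel is at or above the horizon; ray never hits the ground")
    # Scale the ray so that its y component equals the camera height.
    return ray * (cam_height / ray[1])
```

Applying this to the pixels of a fitted wall-ground boundary line gives the 3D line where a wall meets the ground; a vertical wall plane can then be erected through that line.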

SLIDE 21

Plane model from Single Image

- Generalizes to various environment structures

Legend: wall, ground
Comparison columns: Input, Ours, Hoiem 2007, Hedau 2009, Lee 2009

SLIDE 22

Plane model from Single Image

- Fast: runs in real time at 60 Hz
- More accurate and robust to various environments.

Video (4x speed): https://youtu.be/2CvFHy5jk1c

SLIDE 23

Plane model from Single Image

- Previous method: ground segmentation → line fitting → pop-up
- Improvement: detect edge → select edge → pop-up
  - Plane matches true layout, invariant across frames, suitable for SLAM.
  - More accurate 3D model

SLIDE 24

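Plane model from Single Image

- Optimal edge set selection:
  $$\max_{S \subseteq V} F(S), \quad \text{s.t. } S \in \mathcal{I}$$
  - Submodular problem, greedy solution:
    $$S \leftarrow S \cup \Big\{ \operatorname*{argmax}_{e \notin S:\; S \cup \{e\} \in \mathcal{I}} \Delta(e \mid S) \Big\}$$
    $\Delta(e \mid S)$ is the marginal gain of adding edge $e$.
  - Select edges one by one.

Images: all edges, ground segmentation, selected ground edges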
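A minimal sketch of greedy selection for a constrained set-function maximization of the kind on this slide. The score function, the feasibility test, and all names are hypothetical placeholders: the slides do not say how edges are scored or which constraints apply.

```python
def greedy_select(candidates, score, is_feasible):
    """Greedily build a set by adding the feasible element with the largest marginal gain.

    score(S) returns the value of a list of elements S.
    is_feasible(S) returns True if S satisfies the constraints.
    """
    selected = []
    remaining = list(candidates)
    while remaining:
        base = score(selected)
        best, best_gain = None, 0.0
        for e in remaining:
            trial = selected + [e]
            if not is_feasible(trial):
                continue
            gain = score(trial) - base  # marginal gain of adding e
            if gain > best_gain:
                best, best_gain = e, gain
        if best is None:  # nothing feasible improves the score
            break
        selected.append(best)
        remaining.remove(best)
    return selected
```

As a toy usage, candidates can be integer intervals scored by the total length they cover, with a feasibility rule that caps the number of selected intervals; coverage is a standard example of a submodular score.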

SLIDE 25

Pop-up Plane SLAM

- Factor graph
- Edges
  - Plane measurement: $c_i$, from the single-image pop-up process
  - Re-pop to update the measurement after camera poses change
- Nodes
  - Plane: $\pi_i = \{q \in \mathbb{R}^4, \|q\| = 1\}$. 3-DoF quaternion as minimal representation for manifold optimization [1].
  - Camera pose: $x_i$, 6 DoF, SE(3)

Shichao Yang, Yu Song, Michael Kaess, Sebastian Scherer. "Pop-up SLAM: a Semantic Monocular Plane SLAM for Low-texture Environments." IROS, 2016
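A minimal sketch of the unit-norm plane representation used for the plane nodes, assuming a plane is stored as a homogeneous 4-vector (normal, offset) scaled to unit length, and that T_wc is a 4x4 matrix mapping camera-frame points to world-frame points. The function names are illustrative, and the sketch does not implement the manifold optimization itself.

```python
import numpy as np


def normalize_plane(n, d):
    """Stack normal n and offset d into a unit-norm homogeneous plane vector."""
    pi = np.append(np.asarray(n, dtype=float), float(d))
    return pi / np.linalg.norm(pi)


def plane_world_to_camera(pi_world, T_wc):
    """Express a world-frame plane in the camera frame.

    Points satisfy pi_world . p_world = 0 and p_world = T_wc @ p_cam,
    so the camera-frame plane is T_wc^T @ pi_world (renormalized).
    """
    pi_cam = T_wc.T @ pi_world
    return pi_cam / np.linalg.norm(pi_cam)
```

Because the 4-vector has unit norm, it lies on the same sphere as a unit quaternion and carries only 3 degrees of freedom, which is what makes the quaternion-style minimal update mentioned on the slide applicable.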

SLIDE 26

Pop-up Plane SLAM

- Data association
  - Uses geometry, not visual features, because of low texture
  - Angle difference between plane normals $n_1$, $n_2$
  - Overlap ratio from projecting $\pi_2$ onto $\pi_1$
- Loop closing
  - Bag-of-words place recognition
  - Planes have different appearance and size across frames. Landmarks are merged after being created for some frames, so factors need to be shifted.

Figure: shift factors
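A minimal sketch of the geometric data-association test on this slide (normal angle difference plus overlap ratio), assuming each plane patch is given by a normal and a set of 3D boundary points. The overlap here is approximated with axis-aligned bounding boxes in the first plane's 2D coordinates, which is a simplification chosen for brevity and not the authors' method.

```python
import numpy as np


def normal_angle_deg(n1, n2):
    """Angle in degrees between two plane normals."""
    n1 = np.asarray(n1, dtype=float) / np.linalg.norm(n1)
    n2 = np.asarray(n2, dtype=float) / np.linalg.norm(n2)
    return float(np.degrees(np.arccos(np.clip(n1 @ n2, -1.0, 1.0))))


def plane_basis(n):
    """Two orthonormal in-plane axes for a plane with normal n."""
    n = np.asarray(n, dtype=float) / np.linalg.norm(n)
    helper = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(n, helper)
    u /= np.linalg.norm(u)
    v = np.cross(n, u)
    return u, v


def overlap_ratio(pts1, pts2, n1):
    """Fraction of patch 2's bounding box that overlaps patch 1's bounding box,
    after projecting both point sets onto plane 1's in-plane axes."""
    u, v = plane_basis(n1)
    basis = np.stack([u, v], axis=1)          # 3x2 projection onto plane 1
    a = np.asarray(pts1, dtype=float) @ basis
    b = np.asarray(pts2, dtype=float) @ basis
    lo = np.maximum(a.min(axis=0), b.min(axis=0))
    hi = np.minimum(a.max(axis=0), b.max(axis=0))
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    area_b = np.prod(b.max(axis=0) - b.min(axis=0))
    return float(inter / area_b) if area_b > 0 else 0.0
```

Two observations would be treated as the same landmark when the angle is small and the overlap ratio is large; the thresholds are tuning parameters that the slides do not give.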

SLIDE 27

Point-plane Fusion

- Plane SLAM alone is sometimes not enough
- Point SLAM is not accurate in forward corridor motion with low parallax

Images: RGB image; pop-up depth map (much better than stereo triangulation)

SLIDE 28

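Point-plane Fusion

- Depth fusion:
  - Integrate LSD depth $d_l$ and pop-up depth $d_p$ in a filtering approach:
    $$\mathcal{N}\left( \frac{\sigma_l^2 d_p + \sigma_p^2 d_l}{\sigma_l^2 + \sigma_p^2},\; \frac{\sigma_l^2 \sigma_p^2}{\sigma_l^2 + \sigma_p^2} \right)$$
    $\sigma_l^2$ and $\sigma_p^2$ are the covariances of the depth measurements.
  - Pop-up covariance $\sigma_p^2$ is computed through the error propagation rule.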
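A minimal sketch of the depth fusion rule on this slide: two independent Gaussian depth estimates for the same pixel are combined by weighting each depth with the other's variance. The function and argument names are illustrative assumptions.

```python
def fuse_depths(d_lsd, var_lsd, d_popup, var_popup):
    """Fuse an LSD depth estimate and a pop-up depth estimate for the same pixel.

    Each depth is weighted by the variance of the other estimate, so the
    more certain measurement dominates. Returns (fused depth, fused variance).
    """
    total = var_lsd + var_popup
    fused_depth = (var_lsd * d_popup + var_popup * d_lsd) / total
    fused_var = (var_lsd * var_popup) / total
    return fused_depth, fused_var
```

For example, fusing an LSD depth of 2.0 m (variance 0.04) with a pop-up depth of 3.0 m (variance 0.01) pulls the result toward the pop-up value, and the fused variance is smaller than either input variance.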

SLIDE 29

Experiments of Plane SLAM

- On public and collected datasets.

https://youtu.be/TOSOWdxmtkw

SLIDE 30

Experiments of Plane SLAM

- Comparison with LSD and ORB SLAM
  - On TUM dataset.

        Plane normal error   Depth error   Depth error < 0.1 m
Value   2.83                 6.2 cm        86.8%

Existing point SLAM fails.

SLIDE 31

Experiments of Plane SLAM

- On our data I

Panels: input image; LSD SLAM; ORB SLAM; our algorithms (Depth Enhanced LSD SLAM, LSD Pop-up SLAM)

SLIDE 32

Experiments of Plane SLAM

- On our data II

Panels: input image; LSD SLAM; ORB SLAM; our algorithms. Loop error 0.67%.

SLIDE 33

Outline

- Line VO/SLAM
- Plane SLAM
- Object SLAM

SLIDE 34

Introduction – Object SLAM

- SLAM with objects and planes.

Completed work: Plane SLAM
Proposed work: Plane and Object SLAM

SLIDE 35

Related Work – 3D Object Understanding

- Schwing, 2013: object aligned with room
- Choi, 2013: prior CAD model
- Murthy, 2017: keypoint model

Limitations
- Need prior object CAD model or shape priors.

SLIDE 36

Related Work – Object SLAM

- Bao, Sid Yingze, et al. 2012
- SLAM++ (RGBD), Salas-Moreno, et al. 2013
- Dorian Gálvez-López, et al. 2016

Figure note: (only two images)

Limitations
- Work only in small workspaces
- Require a known object model

SLIDE 37

Single image 3D object detection

- Without 3D CAD or keypoint model.

SLIDE 38

Object SLAM

- On TUM sequence (preliminary result)

- 3D object detection in a single image, without a prior object model
- Multi-view object SLAM
- Existing point SLAM methods all fail
- Each object has a 6 DoF pose plus length, width, and height
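A minimal sketch of the object parameterization described on this slide (6 DoF pose plus length, width, and height), assuming the object is modeled as a cuboid centered at its pose origin. The rotation-matrix input and the function name are illustrative assumptions.

```python
import numpy as np


def cuboid_corners(R, t, dims):
    """Return the 8 corners (8x3 array) of a cuboid object in the world frame.

    R: 3x3 rotation of the object, t: 3-vector position of its center,
    dims: (length, width, height) along the object's local x, y, z axes.
    """
    signs = np.array(
        [[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
        dtype=float,
    )
    local = signs * (np.asarray(dims, dtype=float) / 2.0)  # corners in object frame
    return local @ np.asarray(R, dtype=float).T + np.asarray(t, dtype=float)
```

Projecting these corners into each image with the camera pose gives a 2D footprint that can be compared against a detection, which is one way a multi-view object measurement could be formed.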

SLIDE 39

Conclusion

- SLAM with high-level features, from scene understanding.
- Improves both state estimation and mapping.
- Without prior CAD model or room model.

Diagram labels: line, plane, object; plane, object, points
Image modified from Salas-Moreno, 2014

SLIDE 40

Future work

- More complicated environments? Support relations?
- Jointly handle points, planes, and objects?

Images: segmentation, intersection, occlusion
