RGB-D Mapping: Using Depth Cameras for Dense 3D Modeling of Indoor Environments (PowerPoint Presentation)



SLIDE 1

RGB-D Mapping: Using Depth Cameras for Dense 3D Modeling of Indoor Environments

Peter Henry¹, Michael Krainin¹, Evan Herbst¹, Xiaofeng Ren², and Dieter Fox¹,²

¹University of Washington, Computer Science and Engineering
²Intel Labs Seattle (now ISTC at UW)

SLIDE 2

The Kinect

SLIDE 3

PrimeSense Technology

SLIDE 4

RGB-D Data

SLIDE 5

SLIDE 6

Related Work

 SLAM

  [Davison et al., PAMI 2007] (monocular)
  [Konolige et al., IJR 2010] (stereo)
  [Pollefeys et al., IJCV 2007] (multi-view stereo)
  [Borrmann et al., RAS 2008] (3D laser)
  [May et al., JFR 2009] (ToF sensor)

 Loop Closure Detection

  [Nister et al., CVPR 2006]
  [Paul et al., ICRA 2010]

 Photo Collections

  [Snavely et al., SIGGRAPH 2006]
  [Furukawa et al., ICCV 2009]

SLIDE 7

System Overview

 1. Frame-to-frame alignment
 2. Loop closure detection and global consistency
 3. Map representation

SLIDE 8

Alignment (RANSAC)

 Visual features (from the image) located in 3D (from depth)
 RANSAC alignment requires no initial estimate
 Requires sufficiently many matched features
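
The alignment step above can be sketched in a few lines. This is an illustrative version, not the authors' implementation (function names and thresholds are made up): repeatedly sample three matched 3D feature points, fit a rigid transform in closed form (Kabsch/SVD), and keep the transform with the most inliers.

```python
import numpy as np

def rigid_transform(P, Q):
    """Least-squares rigid transform (R, t) mapping points P onto Q (Kabsch/SVD)."""
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - p_mean).T @ (Q - q_mean))
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, q_mean - R @ p_mean

def ransac_align(src, dst, iters=200, thresh=0.05, seed=0):
    """src, dst: (N, 3) matched 3D feature points. Returns (R, t, inlier mask)."""
    rng = np.random.default_rng(seed)
    best = (None, None, np.zeros(len(src), dtype=bool))
    for _ in range(iters):
        idx = rng.choice(len(src), size=3, replace=False)   # minimal sample
        R, t = rigid_transform(src[idx], dst[idx])
        err = np.linalg.norm(src @ R.T + t - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best[2].sum():
            best = (R, t, inliers)
    R, t, inliers = best
    if inliers.sum() >= 3:                                   # refine on all inliers
        R, t = rigid_transform(src[inliers], dst[inliers])
    return R, t, inliers
```

Because the minimal sample fully determines a rigid transform, no initial estimate is needed; the method fails when too few correct feature matches exist (the failure case shown on the next slides).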

SLIDE 9

RANSAC Details

 Feature Detector / Descriptor Options

  SIFT (SiftGPU)
  SURF
  FAST detector / Calonder descriptor
  (All available in OpenCV)

 Matching

  L2 descriptor distance
  Either SIFT-style matching or window matching
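
SIFT-style matching means Lowe's ratio test: accept a match only when the best L2 distance is clearly smaller than the second best. A minimal numpy sketch (illustrative; the talk's actual pipeline uses OpenCV matchers):

```python
import numpy as np

def ratio_test_match(desc1, desc2, ratio=0.8):
    """Match rows of desc1 to rows of desc2 by L2 distance with Lowe's ratio test.
    Returns a list of (i, j) index pairs that pass the test."""
    # pairwise squared L2 distances, shape (N1, N2)
    d2 = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(axis=2)
    matches = []
    for i, row in enumerate(d2):
        j1, j2 = np.argsort(row)[:2]          # best and second-best candidates
        if np.sqrt(row[j1]) < ratio * np.sqrt(row[j2]):
            matches.append((i, j1))
    return matches
```

Ambiguous descriptors (two near-equal candidates) are rejected, which suppresses false correspondences before RANSAC.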

SLIDE 10

RANSAC Failure Case

SLIDE 11

Alignment (ICP)

 Does not need visual features or color
 Requires sufficient geometry and an initial estimate
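
A bare-bones point-to-point ICP loop (illustrative sketch; the system in the talk uses a point-to-plane variant and real nearest-neighbor structures): alternate nearest-neighbor association with a closed-form rigid fit until the error stops changing. Note the loop only converges from a reasonable initial estimate.

```python
import numpy as np

def rigid_transform(P, Q):
    """Least-squares rigid transform (R, t) mapping P onto Q via SVD (Kabsch)."""
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - p_mean).T @ (Q - q_mean))
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, q_mean - R @ p_mean

def icp(src, dst, iters=50, tol=1e-8):
    """Point-to-point ICP aligning src to dst. Returns the accumulated (R, t)."""
    R_acc, t_acc = np.eye(3), np.zeros(3)
    cur = src.copy()
    prev_err = np.inf
    for _ in range(iters):
        # brute-force nearest-neighbor association (a k-d tree in practice)
        d = np.linalg.norm(cur[:, None, :] - dst[None, :, :], axis=2)
        nn = d.argmin(axis=1)
        R, t = rigid_transform(cur, dst[nn])
        cur = cur @ R.T + t
        R_acc, t_acc = R @ R_acc, R @ t_acc + t   # compose incremental transforms
        err = np.linalg.norm(cur - dst[nn], axis=1).mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R_acc, t_acc
```

No color or visual features are used, only geometry, which is why flat, featureless scenes (the failure case on the next slide) defeat it.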

SLIDE 12

ICP Failure Case

SLIDE 13

Joint Optimization (RGBD-ICP)
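
The joint RGBD-ICP objective combines the sparse visual-feature term (associations A_f fixed by RANSAC) with the dense point-to-plane ICP term (associations A_d re-estimated every iteration), balanced by a weight α. This is sketched from memory and paraphrased, so the exact normalization in the paper may differ:

$$
T^{*} = \operatorname*{argmin}_{T}\;
\alpha \, \frac{1}{|A_f|} \sum_{(i,j) \in A_f} \bigl\| T(f_i) - f_j \bigr\|^{2}
\;+\;
(1 - \alpha) \, \frac{1}{|A_d|} \sum_{(i,j) \in A_d} \bigl| \bigl( T(p_i) - q_j \bigr) \cdot n_j \bigr|^{2}
$$

Keeping the feature term in the loop anchors ICP even when geometry alone is ambiguous, while the dense term rescues frames with too few feature matches.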

SLIDE 14

Optimal Transformation

SLIDE 15

Two-Stage Alternative

SLIDE 16

Loop Closure

 Sequential alignments accumulate error
 Revisiting a previous location results in an inconsistent map

SLIDE 17

SLIDE 18

Loop Closure Detection

 Detect by running RANSAC against previous frames
 Pre-filter options (for efficiency):

  Only a subset of frames (keyframes)
  Only keyframes with a similar estimated 3D pose
  Place recognition using a vocabulary tree

 Post-filter (to avoid false positives):

  Estimate the maximum expected drift and reject detections that change the pose too greatly

SLIDE 19

Loop Closure Correction (TORO)

 TORO [Grisetti 2007]:

  Constraints between camera locations form a pose graph
  Outputs maximum-likelihood global camera poses

SLIDE 20

Loop Closure Correction (SBA)

 Minimize reprojection error of features
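
Unlike the pose-graph approach, sparse bundle adjustment optimizes camera poses T_j and 3D feature points X_i jointly, minimizing the total image-plane reprojection error (standard formulation):

$$
\min_{\{T_j\}, \{X_i\}} \; \sum_{i,j} \bigl\| \pi\!\left( T_j X_i \right) - x_{ij} \bigr\|^{2}
$$

where π is the pinhole projection and x_ij the observed pixel location of feature i in frame j.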

SLIDE 21

Comparison (TORO)

SLIDE 22

Comparison (SBA)

SLIDE 23

A Second Comparison

[Figure: side-by-side comparison of the TORO and SBA results]

SLIDE 24

SBA on a tricky sequence

SLIDE 25

SLIDE 26

Map Representation: Surfels

 Surface elements [Pfister 2000, Weise 2009, Krainin 2010]
 Circular surface patches
 Accumulate color / orientation / size information
 Incremental, independent updates
 Incorporate occlusion reasoning
 750 million points reduced to 9 million surfels
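
A surfel's incremental, per-measurement update can be sketched as confidence-weighted running averages. The data structure and field names below are illustrative assumptions, not the paper's exact representation; the size-update rule (keep the finest observed radius) is one common choice:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Surfel:
    position: np.ndarray     # 3D patch center
    normal: np.ndarray       # unit orientation
    color: np.ndarray        # accumulated RGB
    radius: float            # patch size
    weight: float = 1.0      # accumulated confidence

    def update(self, pos, normal, color, radius, w=1.0):
        """Fold one new measurement into the surfel as a weighted average.
        Each surfel updates independently, so the map grows incrementally."""
        total = self.weight + w
        self.position = (self.weight * self.position + w * pos) / total
        self.color = (self.weight * self.color + w * color) / total
        n = self.weight * self.normal + w * normal
        self.normal = n / np.linalg.norm(n)       # re-normalize the orientation
        self.radius = min(self.radius, radius)    # keep the finest observed size
        self.weight = total
```

Averaging many raw depth readings into one patch per surface spot is what compresses 750 million points into 9 million surfels.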

SLIDE 27

SLIDE 28

Experiments

 Reprojection error is better for RANSAC:
 Errors for variations of the algorithm:
 Timing for variations of the algorithm:

SLIDE 29

Experiments: Overlay 1

SLIDE 30

Experiments: Overlay 2

SLIDE 31

Application: Measurements

SLIDE 32

Application: Quadrocopter

 Collaboration with Albert Huang, Abe Bachrach, and Nicholas Roy from MIT

SLIDE 33

SLIDE 34

SLIDE 35

SLIDE 36

Occupancy Map

SLIDE 37

SLIDE 38

Resulting Map

SLIDE 39

Conclusion

 Kinect-style depth cameras have recently become available as consumer products
 RGB-D Mapping can generate rich 3D maps using these cameras
 RGBD-ICP combines visual and shape information for robust frame-to-frame alignment
 Global consistency is achieved via loop closure detection and optimization (RANSAC, TORO, SBA)
 Surfels provide a compact map representation
 ROS + OpenCV are powerful tools to enable these applications

SLIDE 40

Open Questions

 Which are the best features to use?
 How to find more loop closure constraints between frames?
 What is the right representation (point clouds, surfels, meshes, volumetric, geometric primitives, objects)?
 How to generate increasingly photorealistic maps?
 Autonomous exploration for map completeness?
 Can we use these rich 3D maps for semantic mapping?

SLIDE 41

Thank You!

SLIDE 42

Links

 www.ros.org
 The following have nice ROS integration but also work separately:

  http://opencv.willowgarage.com/wiki/
  http://www.pointclouds.org/

 Paper link on this page:

  http://www.cs.washington.edu/homes/fox/abstracts/3d-mapping-iser-10.abstract.html
