SLIDE 1 RGB-D Mapping: Using Depth Cameras for Dense 3D Modeling of Indoor Environments
Peter Henry (1), Michael Krainin (1), Evan Herbst (1), Xiaofeng Ren (2), and Dieter Fox (1,2)
(1) University of Washington, Computer Science and Engineering
(2) Intel Labs Seattle (now ISTC at UW)
SLIDE 2
The Kinect
SLIDE 3
PrimeSense Technology
SLIDE 4
RGB-D Data
SLIDE 5
SLIDE 6 Related Work
- SLAM
  - [Davison et al., PAMI 2007] (monocular)
  - [Konolige et al., IJR 2010] (stereo)
  - [Pollefeys et al., IJCV 2007] (multi-view stereo)
  - [Borrmann et al., RAS 2008] (3D laser)
  - [May et al., JFR 2009] (ToF sensor)
- Loop Closure Detection
  - [Nister et al., CVPR 2006]
  - [Paul et al., ICRA 2010]
- Photo collections
  - [Snavely et al., SIGGRAPH 2006]
  - [Furukawa et al., ICCV 2009]
SLIDE 7 System Overview
1. Frame-to-frame alignment
2. Loop closure detection and global consistency
3. Map representation
SLIDE 8
Alignment (RANSAC)
- Visual features (from image) in 3D (from depth)
- RANSAC alignment requires no initial estimate
- Requires sufficient matched features
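The alignment step described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the matched features have already been lifted to 3D using depth, and the function names (`fit_rigid`, `ransac_align`), the three-point sample size, the iteration count, and the 3 cm inlier threshold are all assumptions chosen for the example.

```python
import numpy as np


def fit_rigid(src, dst):
    """Least-squares rotation R and translation t with dst ~ src @ R.T + t (Kabsch)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def ransac_align(src, dst, iters=200, inlier_thresh=0.03, seed=0):
    """src[i] and dst[i] are matched 3D feature positions (N x 3 arrays)."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        # Hypothesis from a minimal sample of three matches; no initial guess needed.
        sample = rng.choice(len(src), size=3, replace=False)
        R, t = fit_rigid(src[sample], dst[sample])
        err = np.linalg.norm(src @ R.T + t - dst, axis=1)
        inliers = err < inlier_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("not enough inliers for a rigid fit")
    # Refit on all inliers of the best hypothesis.
    R, t = fit_rigid(src[best_inliers], dst[best_inliers])
    return R, t, best_inliers
```

The final `ValueError` corresponds to the "requires sufficient matched features" caveat: with too few consistent matches there is nothing to fit.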
SLIDE 9
RANSAC Details
Feature Detector / Descriptor Options:
- SIFT (SiftGPU)
- SURF
- FAST Detector / Calonder Descriptor
- (All available in OpenCV)
Matching:
- L2 descriptor distance
- Either SIFT-style matching or window matching
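As an illustration of the matching step, the sketch below pairs descriptors by L2 distance and keeps a match only if the nearest neighbour is clearly closer than the second nearest. It assumes that "SIFT-style matching" refers to this nearest/second-nearest ratio test; the function name `match_ratio_test` and the 0.8 ratio are assumptions for the example, and window matching is not shown.

```python
import numpy as np


def match_ratio_test(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_b[j] is the L2-nearest neighbour of desc_a[i]
    and is clearly closer than the second-nearest candidate.

    desc_a and desc_b are 2D arrays of descriptors; desc_b needs at least two rows.
    """
    # Pairwise L2 distances, shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(dists):
        order = np.argsort(row)
        best, second = order[0], order[1]
        # Ambiguous descriptors (two similar candidates) are dropped.
        if row[best] < ratio * row[second]:
            matches.append((i, int(best)))
    return matches
```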
SLIDE 10
RANSAC Failure Case
SLIDE 11
Alignment (ICP)
- Does not need visual features or color
- Requires sufficient geometry and an initial estimate
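A minimal sketch of the ICP loop, to show why an initial estimate matters: each iteration associates points by proximity under the current transform, so a poor starting pose yields wrong associations. This is a plain point-to-point variant with brute-force nearest neighbours, chosen for brevity; it is not claimed to be the variant used in the talk, and the names `best_rigid` and `icp` are assumptions.

```python
import numpy as np


def best_rigid(src, dst):
    """Least-squares R, t with dst ~ src @ R.T + t for already-paired points."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    U, _, Vt = np.linalg.svd((src - src_c).T @ (dst - dst_c))
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, dst_c - R @ src_c


def icp(src, dst, R0=None, t0=None, iters=30):
    """Refine an initial estimate (R0, t0) that maps src onto dst (N x 3 and M x 3)."""
    R = np.eye(3) if R0 is None else R0
    t = np.zeros(3) if t0 is None else t0
    for _ in range(iters):
        moved = src @ R.T + t
        # Associate every moved source point with its closest target point.
        d2 = ((moved[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        nearest = dst[d2.argmin(axis=1)]
        # Re-estimate the transform from the current associations.
        R, t = best_rigid(src, nearest)
    return R, t
```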
SLIDE 12
ICP Failure Case
SLIDE 13
Joint Optimization (RGBD-ICP)
SLIDE 14
Optimal Transformation
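The slide text does not carry the formula itself. As a hedged sketch of what a joint visual-plus-shape objective can look like, assume a set of feature associations A_f (source feature f_s, target feature f_t), a set of dense point associations A_d (source point p_s, target point p_t with surface normal n_t), per-association weights w, and a mixing weight alpha in [0, 1]. All of these symbols are assumptions introduced here for illustration.

```latex
T^{*} = \operatorname*{arg\,min}_{T} \left[
  \alpha \, \frac{1}{|A_f|} \sum_{i \in A_f} w_i \left\| T(f_s^{i}) - f_t^{i} \right\|^{2}
  + (1 - \alpha) \, \frac{1}{|A_d|} \sum_{j \in A_d} w_j
    \left( \left( T(p_s^{j}) - p_t^{j} \right) \cdot n_t^{j} \right)^{2}
\right]
```

Under these assumptions, alpha = 1 uses only the visual feature term and alpha = 0 uses only the geometric (point-to-plane) term.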
SLIDE 15
Two-Stage Alternative
SLIDE 16
Loop Closure
- Sequential alignments accumulate error
- Revisiting a previous location results in an inconsistent map
SLIDE 17
SLIDE 18 Loop Closure Detection
- Detect by running RANSAC against previous frames
- Pre-filter options (for efficiency):
  - Only a subset of frames (keyframes)
  - Only keyframes with similar estimated 3D pose
  - Place recognition using vocabulary tree
- Post-filter (avoid false positives):
  - Estimate maximum expected drift and reject detections that change the pose too greatly
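The pose-based pre-filter and the drift-based post-filter can be sketched as two small checks. This is an illustration under simplifying assumptions: poses are reduced to 3D positions, drift is modelled as growing linearly with the number of frames, and the names and thresholds (`candidate_keyframes`, `accept_closure`, `max_dist`, `drift_per_frame`) are hypothetical.

```python
import math


def candidate_keyframes(current_pose, keyframe_poses, max_dist=3.0):
    """Pre-filter: indices of keyframes whose estimated position is near the current one.

    Poses are (x, y, z) positions here; orientation is ignored for brevity.
    """
    return [i for i, p in enumerate(keyframe_poses)
            if math.dist(current_pose, p) <= max_dist]


def accept_closure(pose_before, pose_after, frames_since_keyframe,
                   drift_per_frame=0.02):
    """Post-filter: reject a detection that would move the current pose farther
    than the drift that could plausibly have accumulated."""
    max_expected_drift = drift_per_frame * frames_since_keyframe
    return math.dist(pose_before, pose_after) <= max_expected_drift
```

Only the keyframes returned by the pre-filter would then be passed to the (more expensive) RANSAC check.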
SLIDE 19
Loop Closure Correction (TORO)
TORO [Grisetti 2007]:
- Constraints between camera locations in pose graph
- Maximum likelihood global camera poses
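To make the pose-graph idea concrete, the sketch below solves a deliberately simplified version: poses are 2D positions only (no rotation), every constraint has equal weight, and the solution comes from ordinary linear least squares. This is not TORO's algorithm; it only illustrates how relative constraints, including a loop closure, determine globally consistent poses. The function name `optimize_positions` is an assumption.

```python
import numpy as np


def optimize_positions(n_poses, constraints):
    """constraints: list of (i, j, measured) meaning x[j] - x[i] ~ measured (2D offsets).

    Pose 0 is fixed at the origin to remove the global translation freedom.
    """
    A = np.zeros((len(constraints) + 1, n_poses))
    b = np.zeros((len(constraints) + 1, 2))
    for row, (i, j, measured) in enumerate(constraints):
        A[row, i], A[row, j] = -1.0, 1.0
        b[row] = measured
    A[-1, 0] = 1.0  # anchor row: x[0] = (0, 0)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x


# Square trajectory whose first odometry step is 0.2 too long; the last
# constraint is the loop closure back to pose 0.
edges = [(0, 1, (1.2, 0.0)), (1, 2, (0.0, 1.0)),
         (2, 3, (-1.0, 0.0)), (3, 0, (0.0, -1.0))]
poses = optimize_positions(4, edges)
```

With equal weights, the 0.2 inconsistency is spread evenly over the four constraints instead of appearing as a gap where the loop closes.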
SLIDE 20
Loop Closure Correction (SBA)
Minimize reprojection error of features
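The quantity being minimized can be written down in a few lines. The sketch below only evaluates the reprojection error for given camera poses and 3D feature positions; the optimization itself is not shown. It assumes an ideal pinhole camera with intrinsics K and no lens distortion, and the names `project` and `reprojection_error` are hypothetical.

```python
import numpy as np


def project(K, R, t, point_world):
    """Pinhole projection of a 3D world point into pixel coordinates."""
    p_cam = R @ point_world + t
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]


def reprojection_error(K, poses, points, observations):
    """Sum of squared pixel residuals.

    poses: list of (R, t) per camera; points: list of 3D points;
    observations: list of (camera_index, point_index, observed_uv).
    """
    total = 0.0
    for cam, pt, uv in observations:
        R, t = poses[cam]
        residual = project(K, R, t, points[pt]) - np.asarray(uv, dtype=float)
        total += float(residual @ residual)
    return total
```

A bundle adjuster varies both the camera poses and the 3D points to drive this sum down.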
SLIDE 21
Comparison (TORO)
SLIDE 22
Comparison (SBA)
SLIDE 23
A Second Comparison
Panels: TORO, SBA
SLIDE 24
SBA on a tricky sequence
SLIDE 25
SLIDE 26
Map Representation: Surfels
- Surface Elements [Pfister 2000, Weise 2009, Krainin 2010]
- Circular surface patches
- Accumulate color / orientation / size information
- Incremental, independent updates
- Incorporate occlusion reasoning
- 750 million points reduced to 9 million surfels
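A surfel and its incremental update can be sketched as a small data structure. The update rule below (a weighted running average, keeping the smallest radius seen) is an assumption made for illustration and is not taken from the cited papers; occlusion reasoning is omitted. The class and field names are hypothetical.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Surfel:
    position: np.ndarray  # centre of the circular patch
    normal: np.ndarray    # unit orientation vector
    color: np.ndarray     # RGB
    radius: float         # patch size
    weight: float = 1.0   # accumulated measurement weight

    def update(self, position, normal, color, radius, weight=1.0):
        """Fold one new measurement into this surfel, independently of all others."""
        total = self.weight + weight
        self.position = (self.weight * self.position + weight * position) / total
        self.color = (self.weight * self.color + weight * color) / total
        n = self.weight * self.normal + weight * normal
        self.normal = n / np.linalg.norm(n)
        # Keep the smallest radius seen, i.e. the highest-resolution observation.
        self.radius = min(self.radius, radius)
        self.weight = total
```

Because many overlapping depth measurements fold into one surfel, the map grows far more slowly than the raw point count, which is the kind of reduction the slide reports.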
SLIDE 27
SLIDE 28
Experiments
- Reprojection error is better for RANSAC:
- Errors for variations of the algorithm:
- Timing for variations of the algorithm:
SLIDE 29
Experiments: Overlay 1
SLIDE 30
Experiments: Overlay 2
SLIDE 31
Application: Measurements
SLIDE 32
Application: Quadrocopter
Collaboration with Albert Huang, Abe Bachrach, and Nicholas Roy from MIT
SLIDE 33
SLIDE 34
SLIDE 35
SLIDE 36
Occupancy Map
SLIDE 37
SLIDE 38
Resulting Map
SLIDE 39 Conclusion
- Kinect-style depth cameras have recently become available as consumer products
- RGB-D Mapping can generate rich 3D maps using these cameras
- RGBD-ICP combines visual and shape information for robust frame-to-frame alignment
- Global consistency achieved via loop closure detection and optimization (RANSAC, TORO, SBA)
- Surfels provide a compact map representation
- ROS + OpenCV are powerful tools to enable these applications
SLIDE 40 Open Questions
- Which are the best features to use?
- How to find more loop closure constraints between frames?
- What is the right representation (point clouds, surfels, meshes, volumetric, geometric primitives, objects)?
- How to generate increasingly photorealistic maps?
- Autonomous exploration for map completeness?
- Can we use these rich 3D maps for semantic mapping?
SLIDE 41
Thank You!
SLIDE 42 Links
- www.ros.org
- The following have nice ROS integration but also work separately:
  - http://opencv.willowgarage.com/wiki/
  - http://www.pointclouds.org/
- Paper link on this page:
  - http://www.cs.washington.edu/homes/fox/abstracts/3d-mapping-iser-10.abstract.html