  1. Direct Visual SLAM Instructor - Simon Lucey 16-623 - Designing Computer Vision Apps

  2. Reminder: SLAM • Simultaneous Localization and Mapping. • On mobile we are interested primarily in Visual SLAM (VSLAM). • Sometimes called MonoSLAM if there is only one camera. • Can be viewed as an online SfM problem.

  3. Reminder: VO vs. VSLAM vs. SFM. [Figure: the relationship between SFM, VSLAM, and VO.] Taken from D. Scaramuzza, “Tutorial on Visual Odometry”.

  4. Reminder: Keyframe-based SLAM. [Figure: keyframe 1, keyframe 2, current frame, new keyframe; initial pointcloud and newly triangulated points.] Taken from D. Scaramuzza, “Tutorial on Visual Odometry”. [Nister’04, PTAM’07, LIBVISO’08, LSD-SLAM’14, SVO’14, ORB-SLAM’15]

  5. A Tale of Two Threads. One thread tracks the pose θ_f of the current frame; a second thread refines the full set of poses {θ_f}_{f=1}^F (and the map) in the background. Adapted from S. Lovegrove & A. J. Davison, “Real-Time Spherical Mosaicing using Whole Image Alignment”, ECCV 2010.

  6. Example - ORB-SLAM. “Thread 1 - Visual Odometry”; “Thread 2 - Local BA”. R. Mur-Artal, J. M. M. Montiel, and J. D. Tardós, “ORB-SLAM: a Versatile and Accurate Monocular SLAM System”, IEEE Trans. Robotics 2015.

  7. Today • Direct vs. feature-based methods • Dense SLAM • Semi-Dense SLAM

  8. ECCV 1999

  9. Feature-Based Methods

  10. Feature-Based Methods. The image is reduced to a sparse set of keypoints, usually matched with feature descriptors.
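
The matching step on this slide can be sketched concretely. Below is a minimal brute-force matcher for binary descriptors (as produced by e.g. ORB): pairwise Hamming distance plus a ratio test. The function name and toy data are illustrative, not from the slides:

```python
import numpy as np

def match_descriptors(desc1, desc2, ratio=0.8):
    """Match binary descriptors (rows of uint8 bytes) by Hamming distance,
    keeping only matches that pass a Lowe-style ratio test."""
    # Pairwise Hamming distance: count differing bits between all pairs.
    xor = desc1[:, None, :] ^ desc2[None, :, :]       # (N1, N2, bytes)
    dist = np.unpackbits(xor, axis=2).sum(axis=2)     # (N1, N2) bit counts
    matches = []
    for i, row in enumerate(dist):
        order = np.argsort(row)
        best, second = order[0], order[1]
        if row[best] < ratio * row[second]:           # ratio test
            matches.append((i, int(best)))
    return matches

# Toy example: 3 descriptors of 4 bytes each, with a near-perfect copy.
rng = np.random.default_rng(0)
desc2 = rng.integers(0, 256, size=(3, 4), dtype=np.uint8)
desc1 = desc2.copy()
desc1[0, 0] ^= 1      # flip one bit so descriptor 0 is only nearly identical
print(match_descriptors(desc1, desc2))    # -> [(0, 0), (1, 1), (2, 2)]
```

The ratio test rejects ambiguous keypoints whose best and second-best matches are similar, which is what makes sparse matching robust in practice.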

  12. Feature-Based Advantages • Easier transition from images to geometry (e.g. vanishing points). • Wide-baseline matching. • Illumination invariance, using invariant descriptors. [Figures from Mikolajczyk, 2007.]

  13. Feature-Based Challenges • Creates only a sparse map of the world. • Does not sample across all available image data (edges & weak intensities). • Needs a high-resolution camera mode (bad for efficiency and battery life). [Figure: Direct Method (ours) vs. Feature-Based Method (ORB+RANSAC).]

  16. Today • Direct vs. feature-based methods • Dense SLAM • Semi-Dense SLAM

  17. Reminder: Warp Functions. [Figure: a point x in the “Template” and its corresponding point in the “Source”.]

  18. Reminder: Warp Functions. Our goal is to find the warp parameter vector p, where: x = coordinate in the template, [x, y]^T; x′ = corresponding coordinate in the source, [x′, y′]^T; W(x; p) = warping function such that x′ = W(x; p); p = parameter vector describing the warp.
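
A minimal numerical sketch of such a warp function, using the standard 6-parameter affine warp as W(x; p) (the function name and test points are illustrative):

```python
import numpy as np

def warp_affine(x, p):
    """W(x; p): 2D affine warp with parameters p = [p1..p6].
    Maps template coordinate x = [x, y] to
    x' = [(1+p1)x + p3*y + p5,  p2*x + (1+p4)y + p6]."""
    A = np.array([[1 + p[0], p[2],     p[4]],
                  [p[1],     1 + p[3], p[6 - 1]]])
    return A @ np.array([x[0], x[1], 1.0])

# Zero parameters give the identity warp: each point maps to itself.
print(warp_affine([3.0, 4.0], np.zeros(6)))                # -> [3. 4.]
# Nonzero (p5, p6) give a pure translation.
print(warp_affine([3.0, 4.0], [0, 0, 0, 0, 1.0, -2.0]))    # -> [4. 2.]
```

The same calling convention (point in, point out, parameters fixed) carries over to the pinhole warp introduced on the next slides.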

  19. Review: Pinhole Camera. In a real camera the image is inverted; instead we model an impossible but more convenient virtual image in front of the pinhole. Adapted from: Computer Vision: Models, Learning and Inference, Simon J.D. Prince.

  20. Relating Points between Views. First camera: λ x̃ = w. Second camera: λ′ x̃′ = Ω w + τ. Substituting: λ′ x̃′ = λ Ω x̃ + τ, i.e. x′ = π(λ Ω x̃ + τ). Adapted from: Computer Vision: Models, Learning and Inference, Simon J.D. Prince.

  21. Pinhole Warp Function. • One can represent the relationship of points between views of pinhole cameras as a warp function, W(x; θ, λ) = π(λ Ω x̃ + τ), where π([u, v, w]^T) = [u/w, v/w]^T is the pinhole projection, and T = [Ω τ; 0^T 1] ∈ SE(3) holds the pose parameters.

  22. Pinhole Warp Function. • The same warp, W(x; θ, λ) = π(λ Ω x̃ + τ), with the pose now parameterized by the exponential map, T(θ) = exp(Σ_{i=1}^6 θ_i A_i) ∈ SE(3), where the A_i are the generators of SE(3).
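
The warp on slides 21-22 can be sketched numerically. This is NumPy-only; the truncated-series matrix exponential, the generator ordering (translations first, rotations second), and all values are illustrative choices, not from the slides:

```python
import numpy as np

def se3_generators():
    """The six 4x4 basis matrices A_i of se(3): 3 translations, 3 rotations."""
    A = np.zeros((6, 4, 4))
    A[0, 0, 3] = A[1, 1, 3] = A[2, 2, 3] = 1.0   # translations along x, y, z
    A[3, 1, 2], A[3, 2, 1] = -1.0, 1.0           # rotation about x
    A[4, 0, 2], A[4, 2, 0] = 1.0, -1.0           # rotation about y
    A[5, 0, 1], A[5, 1, 0] = -1.0, 1.0           # rotation about z
    return A

def expm(M, terms=30):
    """Matrix exponential via truncated Taylor series (fine for small twists)."""
    out, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def pinhole_warp(x, theta, lam):
    """W(x; theta, lam) = pi(lam * Omega * x_tilde + tau), where
    T(theta) = exp(sum_i theta_i A_i) = [[Omega, tau], [0, 1]] in SE(3)."""
    T = expm(np.tensordot(theta, se3_generators(), axes=1))
    Omega, tau = T[:3, :3], T[:3, 3]
    x_tilde = np.array([x[0], x[1], 1.0])     # homogeneous normalized pixel
    X = lam * Omega @ x_tilde + tau           # 3D point in the second camera
    return X[:2] / X[2]                       # pinhole projection pi

# Zero twist: the warp is the identity for any depth lam.
print(pinhole_warp([0.2, 0.1], np.zeros(6), lam=2.0))
# A small translation along x shifts the projection by tau_x / depth = 0.05.
print(pinhole_warp([0.2, 0.1], [0.1, 0, 0, 0, 0, 0], lam=2.0))
```

Note how depth λ enters the warp: for a pure translation, nearby points (small λ⁻¹... large λ⁻¹) move more than distant ones, which is exactly why direct methods can recover depth.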

  23. Photometric Relationship. • We can employ this warp function to express the problem as T(x_n) = I_f(W{x_n; θ_f, λ_n}), where T is the keyframe template, I_f is the f-th image, and λ_n is the depth of the point with homogeneous coordinate x̃_n. “An Invitation to 3D Vision”, Ma et al.

  24. Linearizing the Image for Pose. T(x_n) = I_f(W{x_n; θ_f ∘ Δθ, λ_n}) ≈ I_f(W{x_n; θ_f, λ_n}) + A_fn Δθ_f. Baker, Simon, and Iain Matthews, “Equivalence and efficiency of image alignment algorithms”, CVPR 2001.

  26. Direct Camera Tracking. • Assuming known depths {λ_n}_{n=1}^N, solve arg min_{Δθ_f} Σ_{n=1}^N ||T(x_n) − I_f(W{x_n; θ_f, λ_n}) − A_fn Δθ_f||²₂, where T is the keyframe template and I_f the f-th image. “An Invitation to 3D Vision”, Ma et al.
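
The linearize-and-solve loop of slides 24-26 can be demonstrated end to end in 1D, with a single translation parameter standing in for the 6-DoF pose (a deliberate simplification; all data and names are illustrative):

```python
import numpy as np

def sample(img, x):
    """Linearly interpolate a 1D image at fractional positions x."""
    x = np.clip(x, 0.0, len(img) - 1.001)
    i = np.floor(x).astype(int)
    f = x - i
    return (1 - f) * img[i] + f * img[i + 1]

# Keyframe template T and a source image I that is T shifted by 2.5 pixels.
xs = np.arange(100, dtype=float)
T = np.exp(-0.5 * ((xs - 50) / 8.0) ** 2)          # smooth blob
I = np.exp(-0.5 * ((xs - 50 - 2.5) / 8.0) ** 2)

theta = 0.0                                        # pose parameter: translation
for _ in range(20):
    r = T - sample(I, xs + theta)                  # residuals T(x) - I(W{x; theta})
    # A_n: image gradient at the warped points (the linearization term).
    A = sample(I, xs + theta + 0.5) - sample(I, xs + theta - 0.5)
    # Normal equations for arg min_dtheta ||r - A*dtheta||^2 (Gauss-Newton).
    dtheta = (A @ r) / (A @ A)
    theta += dtheta                                # compose: addition for translation

print(f"estimated shift: {theta:.2f}")             # ~2.5, the true shift
```

Each iteration re-linearizes the image about the current warp, which is the Lucas-Kanade pattern most direct methods reuse for full SE(3) pose.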

  28. Direct Camera Tracking. • Most methods employ a variant of the Lucas-Kanade algorithm for estimating camera pose. • Engel et al. demonstrated that using a “dense” number of points does not improve the performance of camera tracking (i.e. pose estimation). • The advantage of density stems mainly from the map estimation. How do we update the depths? J. Engel, V. Koltun, and D. Cremers, “Direct Sparse Odometry”, arXiv:1607.02565, 2016. J. Engel, T. Schöps, and D. Cremers, “LSD-SLAM: Large-Scale Direct Monocular SLAM”, ECCV 2014, pp. 834–849.

  29. Direct Map Estimation. • Assuming known pose parameters {θ_f}_{f=1}^F, naively we could solve for the depths independently: λ_n = arg min_λ C(x_n, λ), with C(x, λ) = (1/F) Σ_{f=1}^F ||T(x) − I_f(W{x; θ_f, λ})||₁. R. A. Newcombe, S. J. Lovegrove and A. J. Davison, “DTAM: Dense Tracking and Mapping in Real-Time”, ICCV 2011.
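
The per-pixel cost C(x, λ) above can be sketched in a simplified 1D fronto-parallel setting, where the pinhole warp reduces to a disparity of baseline × inverse depth (that reduction, and all numbers, are illustrative assumptions):

```python
import numpy as np

def sample(img, x):
    """Linearly interpolate a 1D image at fractional positions x."""
    x = np.clip(x, 0.0, len(img) - 1.001)
    i = np.floor(x).astype(int)
    f = x - i
    return (1 - f) * img[i] + f * img[i + 1]

# Keyframe template (needs texture for the photometric cost to be informative).
xs = np.arange(200, dtype=float)
T = np.sin(xs / 7.0) + 0.3 * np.sin(xs / 3.0)

# F source views: fronto-parallel scene, so each warp is a pure shift of
# baseline * inverse depth (focal length folded into the baseline).
true_inv_depth = 0.25
baselines = [8.0, 16.0, 24.0]
images = [sample(T, xs - b * true_inv_depth) for b in baselines]

# C(x, rho) = (1/F) sum_f |T(x) - I_f(x + b_f * rho)|, evaluated over a
# sampled range of inverse depths rho, then minimized (brute force).
x = 100
inv_depths = np.linspace(0.05, 0.5, 46)
C = [np.mean([abs(T[x] - sample(If, np.array([x + b * rho]))[0])
              for If, b in zip(images, baselines)])
     for rho in inv_depths]
rho_hat = float(inv_depths[int(np.argmin(C))])
print(round(rho_hat, 3))        # ~0.25, the true inverse depth
```

Stacking this cost for every pixel and every sampled inverse depth gives exactly the cost volume DTAM builds on the next slides.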

  31. DTAM. • Newcombe et al. proposed Dense Tracking and Mapping. • Attempted to substitute the feature-based tracking and mapping modules of traditional VSLAM (e.g. PTAM) with dense methods. [Figure: for each pixel x of the template T, the photometric cost against images I_f, f = 1:F, is sampled across inverse depths between λ⁻¹_min and λ⁻¹_max.] R. A. Newcombe, S. J. Lovegrove and A. J. Davison, “DTAM: Dense Tracking and Mapping in Real-Time”, ICCV 2011.

  32.-34. DTAM - Example. [Figures: the photometric cost functions C(a, λ), C(b, λ), C(c, λ) plotted against inverse depth λ⁻¹ for three example pixels a, b, c.] R. A. Newcombe, S. J. Lovegrove and A. J. Davison, “DTAM: Dense Tracking and Mapping in Real-Time”, ICCV 2011.

  36. DTAM - Geometric Prior. • Newcombe et al. proposed the employment of a geometric prior on the depths: arg min_λ Σ_{n=1}^N C(x_n, λ_n) + g(x_n) ||∇λ_n⁻¹||_ε, with edge weights g(x) = exp(−α ||∇T(x)||₂^β). What do you think the prior is doing?
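
The effect of the prior can be sketched in 1D. Note the simplifications: a quadratic smoothness term stands in for DTAM's Huber-regularized weighted TV, the data term is a simple quadratic anchor to noisy per-pixel depths, and α, β, w are illustrative values:

```python
import numpy as np

# Edge-aware weight g(x) = exp(-alpha * |grad T(x)|^beta): near 1 in flat
# image regions (smooth the depth there), near 0 at strong template edges
# (allow depth discontinuities to survive).
alpha, beta = 10.0, 1.0
T_img = np.concatenate([np.zeros(50), np.ones(50)])   # template with one edge
g = np.exp(-alpha * np.abs(np.gradient(T_img)) ** beta)

# Noisy per-pixel inverse depths (what the data term alone would give),
# with a true depth discontinuity aligned with the image edge.
rng = np.random.default_rng(1)
true = np.concatenate([np.full(50, 1.0), np.full(50, 2.0)])
noisy = true + 0.2 * rng.standard_normal(100)

# Gradient descent on  sum_n (lam_n - noisy_n)^2 + w * g_n * (grad lam)_n^2.
lam, w, step = noisy.copy(), 2.0, 0.1
for _ in range(500):
    smooth_force = np.gradient(g * np.gradient(lam))  # ~ div(g * grad lam)
    lam -= step * (2.0 * (lam - noisy) - 2.0 * w * smooth_force)

# Noise is averaged away inside each region, while the depth jump at the
# image edge survives because g is tiny there.
print(f"error std before: {np.std(noisy - true):.3f}, after: {np.std(lam - true):.3f}")
print(f"preserved depth jump: {lam[55] - lam[45]:.2f}")
```

So the prior is doing edge-preserving denoising of the depth map: it propagates depth across textureless regions, where the photometric cost alone is flat, without smoothing across object boundaries.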

  37. DTAM - Video
