visual slam for mobile

Visual SLAM for Mobile Instructor - Simon Lucey 16-623 - Designing - PowerPoint PPT Presentation



  1. Visual SLAM for Mobile • Instructor: Simon Lucey • 16-623: Designing Computer Vision Apps

  2. Example of SLAM for AR Taken from: H. Liu et al. “Robust Keyframe-based Monocular SLAM for Augmented Reality”, ISMAR 2016.

  3. Example of SLAM for AR Taken from: H. Liu et al. “Robust Keyframe-based Monocular SLAM for Augmented Reality”, ISMAR 2016.

  4. Example of SLAM for AR Taken from: H. Liu et al. “Robust Keyframe-based Monocular SLAM for Augmented Reality”, ISMAR 2016.

  5. What is SLAM? • Simultaneous Localization and Mapping. • On mobile, we are interested primarily in Visual SLAM (VSLAM). • Sometimes called Mono SLAM if there is only one camera. • Can be viewed as an online SfM problem.

  6. Today • SfM - Bundle Adjustment • VSLAM - Keyframe vs. Filtering • Visual Odometry • Loop Closure

  7. Reminder - Bundle Adjustment The cathedral dataset: • 480 camera matrices $[\Omega_i, \tau_i]$ • Total dof = 480 × (3 + 3) = 2880 • 91178 3D points • Total dof = 91178 × 3 = 273534 Adapted from: Optimization Methods in Computer Vision, Anders Eriksson.
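The parameter count on this slide can be reproduced with a few lines of Python. This is only a sketch: the function name `bundle_adjustment_dof` is made up for illustration, and it assumes 6 degrees of freedom per camera (3 rotation + 3 translation) and 3 per point, as stated on the slide.

```python
def bundle_adjustment_dof(num_cameras: int, num_points: int) -> dict:
    """Count the unknowns in a bundle adjustment problem (hypothetical helper)."""
    camera_dof = num_cameras * (3 + 3)  # 3 rotation + 3 translation per camera
    point_dof = num_points * 3          # x, y, z per 3D point
    return {
        "cameras": camera_dof,
        "points": point_dof,
        "total": camera_dof + point_dof,
    }


# Cathedral dataset sizes taken from the slide.
dof = bundle_adjustment_dof(num_cameras=480, num_points=91178)
print(dof)
```

The point unknowns dominate the camera unknowns by roughly two orders of magnitude, which is why the sparsity of the problem matters so much later on.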

  8. Reminder - Two view reconstruction Start with a pair of images taken from slightly different viewpoints

  9. Reminder - Two view reconstruction Find features using a corner detection algorithm

  10. Reminder - Two view reconstruction Match features using a greedy algorithm

  11. Reminder - Two view reconstruction Fit the fundamental matrix using a robust algorithm such as RANSAC

  12. Reminder - Two view reconstruction Find matching points that agree with the fundamental matrix

  13. Reminder - Two view reconstruction • Extract the essential matrix from the fundamental matrix. • Extract the rotation $\Omega$ and translation $\tau$ from the essential matrix. • Reconstruct the 3D positions $w$ of the points: $\lambda \tilde{x} = \Omega w + \tau$. • We refer to these matrices as belonging to the Special Euclidean Group - SE(3): $T = \begin{bmatrix} \Omega & \tau \\ 0^T & 1 \end{bmatrix} \in SE(3)$
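The first two bullets above can be sketched with NumPy. This is an illustration under stated assumptions, not the course's implementation: both views are assumed to share one intrinsic matrix `K` (the values below are made up), the function names are hypothetical, and the standard SVD-based decomposition is used. It returns four (rotation, translation) candidates; picking the one that places points in front of both cameras is omitted.

```python
import numpy as np


def skew(v):
    """Cross-product matrix [v]x, so that skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])


def essential_from_fundamental(F, K):
    """E = K^T F K, assuming both views share the intrinsic matrix K."""
    return K.T @ F @ K


def decompose_essential(E):
    """Return the four (Omega, tau) candidates encoded by an essential matrix."""
    U, _, Vt = np.linalg.svd(E)
    # Flip signs so both factors are proper rotations (determinant +1).
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]  # translation direction only: scale and sign are ambiguous
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]


# Synthetic check: build F from a known motion, then recover the motion.
angle = 0.1
Omega_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                       [0.0, 1.0, 0.0],
                       [-np.sin(angle), 0.0, np.cos(angle)]])
tau_true = np.array([1.0, 0.0, 0.2])
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])  # assumed intrinsics, for illustration only
E_true = skew(tau_true) @ Omega_true
F = np.linalg.inv(K).T @ E_true @ np.linalg.inv(K)

candidates = decompose_essential(essential_from_fundamental(F, K))
```

Note that the translation is recovered only up to scale, which is the usual scale ambiguity of monocular reconstruction.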

  14. Reminder: Lie Algebra • Exponential maps on the SO(3), SL(3) and SE(3) groups are related to the much broader topic of Lie Algebra. • More details on this topic can be found in Murray et al. 1994. $T = \begin{bmatrix} \Omega & \tau \\ 0^T & 1 \end{bmatrix} \in SE(3)$ [Portrait: “Sophus Lie”]

  15. Reminder: Lie Algebra • Exponential maps on the SO(3), SL(3) and SE(3) groups are related to the much broader topic of Lie Algebra. • More details on this topic can be found in Murray et al. 1994. $T(\theta) = \exp\left( \sum_{i=1}^{6} \theta_i A_i \right) \in SE(3)$ [Portrait: “Sophus Lie”]

  16. Reminder: Lie Algebra • Exponential maps on the SO(3), SL(3) and SE(3) groups are related to the much broader topic of Lie Algebra. • More details on this topic can be found in Murray et al. 1994. $T(\theta) = \exp\left( \sum_{i=1}^{6} \theta_i A_i \right) \in SE(3)$ [Portrait: “Sophus Lie”]
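A minimal NumPy sketch of the exponential map on SE(3) follows. The ordering of the six generators (three translations, then three rotations) is an assumption, since the slides do not fix one, and the matrix exponential is approximated with a truncated power series rather than a closed form. The function names are hypothetical.

```python
import numpy as np


def se3_generators():
    """Six 4x4 generators A_1..A_6 (assumed order: translations, then rotations)."""
    A = np.zeros((6, 4, 4))
    A[0, 0, 3] = A[1, 1, 3] = A[2, 2, 3] = 1.0  # translation along x, y, z
    A[3, 2, 1], A[3, 1, 2] = 1.0, -1.0          # rotation about x
    A[4, 0, 2], A[4, 2, 0] = 1.0, -1.0          # rotation about y
    A[5, 1, 0], A[5, 0, 1] = 1.0, -1.0          # rotation about z
    return A


def expm_series(M, terms=30):
    """Truncated power series for exp(M); adequate for small 4x4 inputs."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result


def se3_exp(theta):
    """T(theta) = exp(sum_i theta_i A_i), a 4x4 rigid transform."""
    return expm_series(np.tensordot(theta, se3_generators(), axes=1))


# A quarter turn about the z axis.
T = se3_exp(np.array([0.0, 0.0, 0.0, 0.0, 0.0, np.pi / 2]))

# Composing two transforms is not the same as adding their parameters.
a = np.array([0.0, 0.0, 0.0, 0.5, 0.0, 0.0])
b = np.array([0.0, 0.0, 0.0, 0.0, 0.5, 0.0])
composed = se3_exp(a) @ se3_exp(b)
added = se3_exp(a + b)
```

The last three lines illustrate why pose updates are usually applied by composition: rotations about different axes do not commute, so `composed` and `added` differ.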

  17. SfM - Bundle Adjustment $\arg\min_{w, \theta} \sum_{f=1}^{F} \sum_{n=1}^{N} \| x_n^f - \pi(w_n; \theta_f) \|_2^2$ • x ← 2D projection • w ← 3D point • θ ← extrinsics • N ← no. of points • π ← projection function • F ← no. of frames

  18. SfM - Linearization $\pi(w_n + \Delta w_n; \theta_f \circ \Delta\theta_f) \approx \pi(w_n; \theta_f) + J_n^f \begin{bmatrix} \Delta\theta_f \\ \Delta w_n \end{bmatrix}$

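  19. SfM - Linearization $\pi(w_n + \Delta w_n; \theta_f \circ \Delta\theta_f) \approx \pi(w_n; \theta_f) + J_n^f \begin{bmatrix} \Delta\theta_f \\ \Delta w_n \end{bmatrix}$ • The pose update $\theta_f \circ \Delta\theta_f$ is a composition: why not additive?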

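  20. SfM - Linearization $\pi(w_n + \Delta w_n; \theta_f \circ \Delta\theta_f) \approx \pi(w_n; \theta_f) + J_n^f \begin{bmatrix} \Delta\theta_f \\ \Delta w_n \end{bmatrix}$, which gives $\arg\min_{\Delta\theta, \Delta w} \sum_{f=1}^{F} \sum_{n=1}^{N} \left\| x_n^f - \pi(w_n; \theta_f) - J_n^f \begin{bmatrix} \Delta\theta_f \\ \Delta w_n \end{bmatrix} \right\|_2^2$ • x ← 2D projection • w ← 3D point • θ ← extrinsics • N ← no. of points • π ← projection function • F ← no. of frames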

  21. Visibility of Points $\Upsilon = \begin{bmatrix} \rho_1^1 & \cdots & \rho_1^F \\ \vdots & \ddots & \vdots \\ \rho_N^1 & \cdots & \rho_N^F \end{bmatrix}$ (“visibility matrix”)

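  22. SfM - Bundle Adjustment $\pi(w_n + \Delta w_n; \theta_f \circ \Delta\theta_f) \approx \pi(w_n; \theta_f) + J_n^f \begin{bmatrix} \Delta\theta_f \\ \Delta w_n \end{bmatrix}$, which gives $\arg\min_{\Delta\theta, \Delta w} \sum_{f=1}^{F} \sum_{n=1}^{N} \rho_n^f \left\| x_n^f - \pi(w_n; \theta_f) - J_n^f \begin{bmatrix} \Delta\theta_f \\ \Delta w_n \end{bmatrix} \right\|_2^2$ • x ← 2D projection • w ← 3D point • θ ← extrinsics • N ← no. of points • π ← projection function • ρ ← visibility ∈ [0, 1] • F ← no. of frames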
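As a rough sketch of how the visibility weights enter the objective, the snippet below evaluates the (non-linearized) visibility-weighted reprojection cost on a tiny synthetic scene. It assumes normalized image coordinates (identity intrinsics), stores `rho` as an N × F array matching the visibility matrix, and uses made-up names (`project`, `bundle_adjustment_cost`) and made-up data.

```python
import numpy as np


def project(w, Omega, tau):
    """pi(w; theta): rigid transform followed by perspective division.

    Assumes normalized image coordinates, i.e. identity intrinsics.
    """
    x_cam = Omega @ w + tau
    return x_cam[:2] / x_cam[2]


def bundle_adjustment_cost(x_obs, rho, points, poses):
    """Sum over frames f and points n of rho[n, f] * ||x_obs[f, n] - pi(w_n; theta_f)||^2."""
    cost = 0.0
    for f, (Omega, tau) in enumerate(poses):
        for n, w in enumerate(points):
            if rho[n, f] == 0:
                continue  # point n is not visible in frame f
            r = x_obs[f, n] - project(w, Omega, tau)
            cost += rho[n, f] * float(r @ r)
    return cost


# Tiny synthetic scene: 3 points, 2 frames, exact observations.
points = np.array([[0.0, 0.0, 5.0], [1.0, -1.0, 4.0], [-1.0, 0.5, 6.0]])
poses = [(np.eye(3), np.zeros(3)), (np.eye(3), np.array([0.5, 0.0, 0.0]))]
x_obs = np.array([[project(w, O, t) for w in points] for O, t in poses])

rho = np.ones((len(points), len(poses)))
x_obs[1, 2] += 10.0  # corrupt the observation of point 2 in frame 1 ...
rho[2, 1] = 0.0      # ... and mark it as not visible, so it is ignored
cost = bundle_adjustment_cost(x_obs, rho, points, poses)
```

With the corrupted observation masked out, the remaining residuals are zero; setting `rho[2, 1]` back to 1 lets that single bad measurement dominate the cost.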

  23. SfM - Bundle Adjustment $\arg\min_{\Delta\theta, \Delta w} \left\| b - A \begin{bmatrix} \Delta\theta \\ \Delta w \end{bmatrix} \right\|_2^2$, where $\Delta\theta$ stacks the pose updates and $\Delta w$ the landmark updates. • Can be solved efficiently using sparse linear solvers such as: • Google Ceres Solver - http://ceres-solver.org • G2o - https://openslam.org/g2o.html • Then iteratively apply the GN or LM algorithm.

  24. SfM - Bundle Adjustment $\arg\min_{\Delta\theta, \Delta w} \left\| b - A \begin{bmatrix} \Delta\theta \\ \Delta w \end{bmatrix} \right\|_2^2$, where $\Delta\theta$ stacks the pose updates and $\Delta w$ the landmark updates, and $A$ has $2FN$ rows and $6F + 3N$ columns. • Can be solved efficiently using sparse linear solvers such as: • Google Ceres Solver - http://ceres-solver.org • G2o - https://openslam.org/g2o.html • Then iteratively apply the GN or LM algorithm.
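To see why sparse solvers pay off here, the sketch below counts the size and fill of the stacked Jacobian A. It assumes every point is visible in every frame (so there are 2FN rows) and that each observation touches only the 6 parameters of its own pose and the 3 coordinates of its own point. The helper name `ba_system_shape` and the toy sizes are made up for illustration.

```python
def ba_system_shape(num_frames: int, num_points: int):
    """Rows, columns and fill ratio of A, assuming full visibility."""
    rows = 2 * num_frames * num_points      # two residuals per observation
    cols = 6 * num_frames + 3 * num_points  # pose and landmark unknowns
    nonzeros = rows * (6 + 3)               # each row touches one pose, one point
    density = nonzeros / (rows * cols)
    return rows, cols, density


rows, cols, density = ba_system_shape(num_frames=3, num_points=4)
print(rows, cols, density)
```

Under these assumptions the fill ratio is 9 / (6F + 3N), so it shrinks as frames and points are added: the matrix becomes sparser as the problem grows.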

  25. Reminder: Gauss-Newton Algorithm • The Gauss-Newton (GN) algorithm is a common strategy for optimizing non-linear least-squares problems: $\arg\min_{y} \| x - F(y) \|_2^2$ s.t. $F : \mathbb{R}^N \to \mathbb{R}^M$ • Step 1: $\arg\min_{\Delta y} \left\| x - F(y) - \frac{\partial F(y)}{\partial y^T} \Delta y \right\|_2^2$ • Step 2: $y \leftarrow y + \Delta y$ • Keep applying the two steps until $\Delta y$ converges. [Portraits: “Carl Friedrich Gauss”, “Isaac Newton”]

  26. Reminder: Gauss-Newton Algorithm • The Gauss-Newton (GN) algorithm is a common strategy for optimizing non-linear least-squares problems: $\arg\min_{y} \| x - F(y) \|_2^2$ s.t. $F : \mathbb{R}^N \to \mathbb{R}^M$ • Step 1: $\arg\min_{\Delta y} \left\| x - F(y) - \frac{\partial F(y)}{\partial y^T} \Delta y \right\|_2^2$ • Step 2: $y \leftarrow y + \Delta y$ (“Is the update additive?”) • Keep applying the two steps until $\Delta y$ converges. [Portraits: “Carl Friedrich Gauss”, “Isaac Newton”]
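The two steps can be sketched in NumPy on a toy curve-fitting problem. The exponential model, its analytic Jacobian, the starting point and the stopping tolerance are all made up for illustration. The update here is additive because the unknowns live in a vector space; for poses the additive step would be replaced by a composition.

```python
import numpy as np


def gauss_newton(F, jacobian, x, y0, iterations=20, tol=1e-10):
    """Minimize ||x - F(y)||^2 by repeated linearization (illustrative sketch)."""
    y = np.asarray(y0, dtype=float)
    for _ in range(iterations):
        residual = x - F(y)
        J = jacobian(y)
        # Step 1: solve the linearized least-squares problem for delta_y.
        delta_y, *_ = np.linalg.lstsq(J, residual, rcond=None)
        # Step 2: additive update.
        y = y + delta_y
        if np.linalg.norm(delta_y) < tol:
            break
    return y


# Toy model F(y) = y0 * exp(y1 * t), sampled at 20 points in [0, 1].
t = np.linspace(0.0, 1.0, 20)


def model(y):
    return y[0] * np.exp(y[1] * t)


def model_jacobian(y):
    e = np.exp(y[1] * t)
    return np.stack([e, y[0] * t * e], axis=1)  # columns: dF/dy0, dF/dy1


x = model(np.array([2.0, -1.0]))  # noise-free measurements
y_hat = gauss_newton(model, model_jacobian, x, y0=[1.0, 0.0])
```

Levenberg-Marquardt differs mainly in adding a damping term to the linear solve in Step 1, which helps when the starting point is far from the solution.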

  27. Today • SfM - Bundle Adjustment • VSLAM - Keyframe vs. Filtering • Visual Odometry • Loop Closure

  28. Mono SLAM = Online SFM • Monocular SLAM is just another name for “online” SFM. • If computation were not an issue, one would just apply Bundle Adjustment after every new frame: $\arg\min_{w, \theta} \sum_{f=1}^{F} \sum_{n=1}^{N} \| x_n^f - \pi(w_n; \theta_f) \|_2^2$ • x ← 2D projection • w ← 3D point • θ ← extrinsics • N ← no. of points • π ← projection function • F ← no. of frames

  29. Mono SLAM - MRF • One can view the problem of SfM - Bundle Adjustment as doing inference on a Markov Random Field (MRF). • Problem - becomes exponentially harder as time goes on. [Figure: graph of camera poses (θ_1 ... θ_4) and 3D points (w_1 ... w_6), with “edges based on visibility” ρ.] Source: H. Strasdat, J. M. M. Montiel, and A. J. Davison, “Visual SLAM: Why filter?” Image and Vision Computing, vol. 30, no. 2, pp. 65–77, 2012.

  30. Mono SLAM - Filtering • The classic way of resolving this was to pose the BA problem as a filter, such as an Extended Kalman Filter (EKF). • Problem - wastes processing time on frames that add very little information. [Figure: graph of camera poses (θ_1 ... θ_4) and 3D points (w_1 ... w_6); “marginalizing out previous poses also results in unwanted direct connections between 3D points”.] Source: H. Strasdat, J. M. M. Montiel, and A. J. Davison, “Visual SLAM: Why filter?” Image and Vision Computing, vol. 30, no. 2, pp. 65–77, 2012.

  31. Mono SLAM - Filtering • Filtering approaches are often problematic (e.g. think of when the device stops moving). • When frames are taken at nearby positions compared to the scene distance, 3D points will exhibit large uncertainty. Taken from D. Scaramuzza “Tutorial on Visual Odometry”.

  32. Mono SLAM - Keyframe • A better strategy is to employ keyframe BA. • Made popular by Klein & Murray’s Parallel Tracking and Mapping (PTAM) algorithm. [Figure: graph of camera poses (θ_1 ... θ_4) and 3D points (w_1 ... w_6); “remove all but a small subset of keyframes”.] Sources: G. Klein and D. Murray, “Parallel tracking and mapping for small AR workspaces”, ISMAR 2007. H. Strasdat, J. M. M. Montiel, and A. J. Davison, “Visual SLAM: Why filter?” Image and Vision Computing, vol. 30, no. 2, pp. 65–77, 2012.

  33. Keyframe Selection • One way to avoid this large uncertainty consists of skipping frames until the average uncertainty of the 3D points decreases below a certain threshold. The selected frames are called keyframes. • Rule of thumb: add a keyframe when (keyframe distance) / (average depth) > threshold (~10-20 %). Taken from D. Scaramuzza “Tutorial on Visual Odometry”.
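The rule of thumb can be written as a one-line test. This sketch reads the rule as a ratio of the distance to the last keyframe over the average scene depth; the function name `is_new_keyframe` and the default threshold of 0.15 (picked from the middle of the quoted 10-20 % range) are assumptions for illustration.

```python
def is_new_keyframe(keyframe_distance: float,
                    average_depth: float,
                    threshold: float = 0.15) -> bool:
    """Return True when the baseline is large enough relative to scene depth.

    keyframe_distance: distance travelled since the last keyframe.
    average_depth: mean depth of the tracked 3D points, in the same units.
    threshold: assumed value inside the 10-20 % range from the rule of thumb.
    """
    if average_depth <= 0:
        raise ValueError("average_depth must be positive")
    return keyframe_distance / average_depth > threshold


# Moved 0.5 m while looking at a scene about 2 m away: ratio 0.25.
print(is_new_keyframe(0.5, 2.0))
# Moved 0.1 m in the same scene: ratio 0.05.
print(is_new_keyframe(0.1, 2.0))
```

Because the test is a ratio, the same threshold applies whether the device is scanning a desk or a building facade: a distant scene simply requires a longer baseline before a new keyframe is added.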

  34. Keyframe-based SLAM [Figure labels: Keyframe 1, Keyframe 2, current frame, new keyframe, initial point cloud, new triangulated points.] Taken from D. Scaramuzza “Tutorial on Visual Odometry”. [Nister’04, PTAM’07, LIBVISO’08, LSD SLAM’14, SVO’14, ORB SLAM’15]
