Visual SLAM for Mobile
Instructor: Simon Lucey
16-423: Designing Computer Vision Apps
Example of SLAM for AR Taken from: H. Liu et al. “Robust Keyframe-based Monocular SLAM for Augmented Reality”, ISMAR 2016.
What is SLAM? • Simultaneous Localization and Mapping. • On mobile we are interested primarily in Visual SLAM (VSLAM). • Sometimes called Mono SLAM if there is only one camera. • Can be viewed as an online SfM problem.
Today • SfM - Bundle Adjustment • VSLAM - Keyframe vs. Filtering • Visual Odometry • Loop Closure
Reminder - Bundle Adjustment
The cathedral dataset:
• 480 camera matrices $[\Omega_i, \tau_i]$
• Total dof = 480 × (3 + 3) = 2880
• 91178 3D points
• Total dof = 91178 × 3 = 273534
Adapted from: Optimization Methods in Computer Vision, Anders Eriksson.
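As a quick check of the bookkeeping, the counts for the cathedral dataset can be tallied in a few lines. This is only a sketch: the helper name `ba_unknowns` is hypothetical, and it assumes 6 dof per camera (3 rotation + 3 translation) and 3 dof per point, as on the slide. Evaluating the point product directly gives 91178 × 3 = 273534.

```python
def ba_unknowns(num_cameras, num_points, dof_per_camera=6, dof_per_point=3):
    """Return (camera dof, point dof, total dof) for a bundle adjustment problem."""
    camera_dof = num_cameras * dof_per_camera
    point_dof = num_points * dof_per_point
    return camera_dof, point_dof, camera_dof + point_dof

# Cathedral dataset figures quoted on the slide.
camera_dof, point_dof, total_dof = ba_unknowns(480, 91178)
print(camera_dof, point_dof, total_dof)
```

The 3D points dominate the unknowns by roughly two orders of magnitude, which is why the sparsity of the problem matters so much later on.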
Reminder - Two view reconstruction Start with a pair of images taken from slightly different viewpoints
Reminder - Two view reconstruction Find features using a corner detection algorithm
Reminder - Two view reconstruction Match features using a greedy algorithm
Reminder - Two view reconstruction Fit the fundamental matrix using a robust algorithm such as RANSAC
Reminder - Two view reconstruction Find matching points that agree with the fundamental matrix
Reminder - Two view reconstruction
• Extract the essential matrix from the fundamental matrix.
• Extract the rotation $\Omega$ and translation $\tau$ from the essential matrix.
• Reconstruct the 3D positions $w$ of the points: $\lambda \tilde{x} = \Omega w + \tau$.
• We refer to these matrices as belonging to the Special Euclidean group, SE(3):
$$T = \begin{bmatrix} \Omega & \tau \\ 0^T & 1 \end{bmatrix} \in \mathrm{SE}(3)$$
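The step "extract rotation and translation from the essential matrix" can be sketched with the standard SVD-based decomposition. This is a minimal illustration, not the lecture's own code: it assumes the convention $E \simeq [\tau]_\times \Omega$ for $x_2 = \Omega x_1 + \tau$, the function names are hypothetical, and the pose used for the check is made up. The decomposition yields four candidate pairs; a real pipeline would pick the one that places the triangulated points in front of both cameras.

```python
import numpy as np

def skew(t):
    """Return the 3x3 cross-product matrix [t]_x."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def decompose_essential(E):
    """Return the four candidate (Omega, tau) pairs for E ~ [tau]_x Omega."""
    U, _, Vt = np.linalg.svd(E)
    # E is only defined up to sign, so flip factors to obtain proper rotations.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    rotations = [U @ W @ Vt, U @ W.T @ Vt]
    tau = U[:, 2]  # unit vector: the scale of the translation is unobservable
    return [(R, s * tau) for R in rotations for s in (1.0, -1.0)]

# Synthetic check with a known (hypothetical) pose.
angle = 0.3
Omega_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                       [0.0, 1.0, 0.0],
                       [-np.sin(angle), 0.0, np.cos(angle)]])
tau_true = np.array([1.0, 0.2, -0.1])
E = skew(tau_true) @ Omega_true

candidates = decompose_essential(E)
tau_unit = tau_true / np.linalg.norm(tau_true)
found = any(np.allclose(R, Omega_true, atol=1e-8) and np.allclose(t, tau_unit, atol=1e-8)
            for R, t in candidates)
print("true pose among the candidates:", found)
```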
Reminder: Lie Algebra
• Exponential maps on the SO(3), SL(3) and SE(3) groups are related to the much broader topic of Lie algebras.
• More details on this topic can be found in Murray et al. 1994.
$$T = \begin{bmatrix} \Omega & \tau \\ 0^T & 1 \end{bmatrix} \in \mathrm{SE}(3), \qquad T(\theta) = \exp\left( \sum_{i=1}^{6} \theta_i A_i \right) \in \mathrm{SE}(3)$$
[Portrait: Sophus Lie]
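To make the exponential map concrete, the sketch below builds six generators $A_i$ and evaluates $T(\theta) = \exp(\sum_i \theta_i A_i)$ numerically. The ordering of the generators (three translations, then three rotations) is an assumption, since the slides do not fix one, and `expm_series` is a simple scaling-and-squaring Taylor approximation written here for self-containment rather than a library routine.

```python
import numpy as np

def se3_generators():
    """Six 4x4 generators A_i: translations along x, y, z, then rotations about x, y, z."""
    A = np.zeros((6, 4, 4))
    A[0, 0, 3] = A[1, 1, 3] = A[2, 2, 3] = 1.0   # translations
    A[3, 2, 1], A[3, 1, 2] = 1.0, -1.0           # rotation about x
    A[4, 0, 2], A[4, 2, 0] = 1.0, -1.0           # rotation about y
    A[5, 1, 0], A[5, 0, 1] = 1.0, -1.0           # rotation about z
    return A

def expm_series(M, squarings=10, terms=12):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    M_scaled = M / (2 ** squarings)
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ M_scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def se3_exp(theta):
    """Map a 6-vector theta to a 4x4 transform T(theta) in SE(3)."""
    return expm_series(np.tensordot(theta, se3_generators(), axes=1))

T_translation = se3_exp(np.array([1.0, 2.0, 3.0, 0.0, 0.0, 0.0]))
T_rotation = se3_exp(np.array([0.0, 0.0, 0.0, 0.0, 0.0, np.pi / 2]))
T_general = se3_exp(np.array([0.1, -0.2, 0.3, 0.4, -0.5, 0.6]))
print(np.round(T_rotation, 3))
```

Whatever $\theta$ is supplied, the result has an orthonormal rotation block and a bottom row of $[0\ 0\ 0\ 1]$, which is what makes this parameterization convenient for optimization.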
SfM - Bundle Adjustment
$$\arg\min_{w, \theta} \sum_{f=1}^{F} \sum_{n=1}^{N} \| x_n^f - \pi(w_n; \theta_f) \|_2^2$$
x ← 2D projection, w ← 3D point, θ ← extrinsics, π ← projection function, N ← no. of points, F ← no. of frames
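The bundle adjustment objective is simply a sum of squared reprojection errors over all frames and points. A minimal sketch of evaluating it follows, assuming a pinhole projection with identity intrinsics and extrinsics stored as a rotation matrix and translation per frame (both are simplifications for illustration). The toy scene is made up.

```python
import numpy as np

def project(w, Omega, tau):
    """Pinhole projection pi(w; theta) with identity intrinsics (an assumption)."""
    x_cam = Omega @ w + tau
    return x_cam[:2] / x_cam[2]

def ba_cost(x, w, poses):
    """Sum over frames f and points n of ||x[f, n] - pi(w[n]; theta_f)||^2."""
    total = 0.0
    for f, (Omega, tau) in enumerate(poses):
        for n in range(w.shape[0]):
            r = x[f, n] - project(w[n], Omega, tau)
            total += float(r @ r)
    return total

# Hypothetical toy scene: 4 points observed in 2 frames.
w = np.array([[0.0, 0.0, 4.0], [1.0, 0.0, 5.0], [0.0, 1.0, 6.0], [-1.0, -1.0, 5.0]])
poses = [(np.eye(3), np.zeros(3)), (np.eye(3), np.array([0.5, 0.0, 0.0]))]
x = np.array([[project(w_n, Omega, tau) for w_n in w] for Omega, tau in poses])

cost_at_truth = ba_cost(x, w, poses)          # noise-free observations
cost_perturbed = ba_cost(x, w + 0.1, poses)   # shifted 3D points
print(cost_at_truth, cost_perturbed)
```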
SfM - Linearization
$$\pi(w_n + \Delta w_n; \theta_f \circ \Delta\theta_f) \approx \pi(w_n; \theta_f) + J_n^f \begin{bmatrix} \Delta\theta_f \\ \Delta w_n \end{bmatrix}$$
Why is the pose update not additive? The extrinsics live on SE(3), which is not closed under addition, so the increment $\Delta\theta_f$ is composed with the current pose rather than added to it.
$$\arg\min_{\Delta\theta, \Delta w} \sum_{f=1}^{F} \sum_{n=1}^{N} \left\| x_n^f - \pi(w_n; \theta_f) - J_n^f \begin{bmatrix} \Delta\theta_f \\ \Delta w_n \end{bmatrix} \right\|_2^2$$
x ← 2D projection, w ← 3D point, θ ← extrinsics, π ← projection function, N ← no. of points, F ← no. of frames
Visibility of Points
$$\Upsilon = \begin{bmatrix} \rho_1^1 & \cdots & \rho_1^F \\ \vdots & \ddots & \vdots \\ \rho_N^1 & \cdots & \rho_N^F \end{bmatrix} \qquad \text{“visibility matrix”}$$
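SfM - Bundle Adjustment
$$\pi(w_n + \Delta w_n; \theta_f \circ \Delta\theta_f) \approx \pi(w_n; \theta_f) + J_n^f \begin{bmatrix} \Delta\theta_f \\ \Delta w_n \end{bmatrix}$$
$$\arg\min_{\Delta\theta, \Delta w} \sum_{f=1}^{F} \sum_{n=1}^{N} \rho_n^f \left\| x_n^f - \pi(w_n; \theta_f) - J_n^f \begin{bmatrix} \Delta\theta_f \\ \Delta w_n \end{bmatrix} \right\|_2^2$$
x ← 2D projection, w ← 3D point, θ ← extrinsics, π ← projection function, ρ ← visibility ∈ [0, 1], N ← no. of points, F ← no. of frames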
SfM - Bundle Adjustment
$$\arg\min_{\Delta\theta, \Delta w} \left\| b - A \begin{bmatrix} \Delta\theta \\ \Delta w \end{bmatrix} \right\|_2^2$$
A is 2FN × (6F + 3N); its columns split into a pose block followed by a landmark block.
• Can be solved efficiently using sparse linear solvers such as:
  • Google Ceres Solver - http://ceres-solver.org
  • g2o - https://openslam.org/g2o.html
• Then iteratively apply the Gauss-Newton (GN) or Levenberg-Marquardt (LM) algorithm.
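The reason sparse solvers pay off is visible in the structure of A: each observation contributes two rows that touch only one pose block and one landmark block. The sketch below builds the boolean sparsity pattern from a visibility array. It assumes `visibility` is an N × F array laid out like the visibility matrix, with 6 dof per pose and 3 per point; the function name is hypothetical, and a dense boolean array is used purely for illustration.

```python
import numpy as np

def ba_sparsity(visibility, pose_dof=6, point_dof=3):
    """Boolean pattern of A: 2 rows per visible observation,
    pose_dof * F + point_dof * N columns (poses first, then landmarks)."""
    N, F = visibility.shape
    num_cols = pose_dof * F + point_dof * N
    rows = []
    for f in range(F):
        for n in range(N):
            if visibility[n, f]:
                row = np.zeros(num_cols, dtype=bool)
                row[pose_dof * f: pose_dof * (f + 1)] = True       # pose block of frame f
                start = pose_dof * F + point_dof * n
                row[start: start + point_dof] = True               # landmark block of point n
                rows.extend([row, row.copy()])                     # u and v residuals
    return np.array(rows)

# Hypothetical example: 4 points, 3 frames, point 2 not seen in frame 1.
visibility = np.ones((4, 3), dtype=bool)
visibility[2, 1] = False
pattern = ba_sparsity(visibility)
print(pattern.shape, pattern.sum(axis=1).max())
```

Every row has only 9 non-zero entries regardless of how many frames and points there are, so the fraction of non-zeros shrinks quickly as the problem grows.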
Reminder: Gauss-Newton Algorithm
• The Gauss-Newton (GN) algorithm is a common strategy for optimizing non-linear least-squares problems.
$$\arg\min_{y} \| x - F(y) \|_2^2 \quad \text{s.t.} \quad F : \mathbb{R}^N \to \mathbb{R}^M$$
Step 1: $$\arg\min_{\Delta y} \left\| x - F(y) - \frac{\partial F(y)}{\partial y^T} \Delta y \right\|_2^2$$
Step 2: $y \leftarrow y + \Delta y$ (“Is the update additive?”)
Keep applying both steps until $\Delta y$ converges.
[Portraits: Carl Friedrich Gauss, Isaac Newton]
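The two steps can be sketched on a small curve-fitting problem. This is an illustrative example only: the model $F(y) = y_0 e^{y_1 t}$, the starting point and the stopping rule are all assumptions chosen to keep the code short, and the update here is plainly additive because $y$ lives in a vector space.

```python
import numpy as np

def gauss_newton(F, jacobian, x, y0, iterations=20, tol=1e-12):
    """Minimize ||x - F(y)||^2 by repeated linearization."""
    y = np.asarray(y0, dtype=float)
    for _ in range(iterations):
        r = x - F(y)
        J = jacobian(y)  # M x N matrix dF/dy^T
        # Step 1: solve the linearized least-squares problem for delta_y.
        delta_y, *_ = np.linalg.lstsq(J, r, rcond=None)
        # Step 2: additive update.
        y = y + delta_y
        if np.linalg.norm(delta_y) < tol:
            break
    return y

# Toy model (hypothetical): F(y) = y[0] * exp(y[1] * t).
t = np.linspace(0.0, 1.0, 20)

def F(y):
    return y[0] * np.exp(y[1] * t)

def jacobian(y):
    return np.column_stack([np.exp(y[1] * t), y[0] * t * np.exp(y[1] * t)])

y_true = np.array([2.0, -1.5])
x = F(y_true)  # noise-free observations
y_hat = gauss_newton(F, jacobian, x, y0=[1.0, 0.0])
print(y_hat)
```

Levenberg-Marquardt differs only in Step 1, where a damping term is added to the normal equations to keep the step inside a region where the linearization is trustworthy.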
Today • SfM - Bundle Adjustment • VSLAM - Keyframe vs. Filtering • Visual Odometry • Loop Closure
Mono SLAM = Online SfM
• Monocular SLAM is just another name for “online” SfM.
• If computation were not an issue, one would just apply bundle adjustment after every new frame:
$$\arg\min_{w, \theta} \sum_{f=1}^{F} \sum_{n=1}^{N} \| x_n^f - \pi(w_n; \theta_f) \|_2^2$$
x ← 2D projection, w ← 3D point, θ ← extrinsics, π ← projection function, N ← no. of points, F ← no. of frames
Mono SLAM - MRF
• One can view the problem of SfM bundle adjustment as doing inference on a Markov Random Field (MRF).
• Problem: inference becomes steadily harder as time goes on, since every new frame adds variables and edges to the graph.
[Figure: graph linking poses T_0 ... T_3 (θ_1 ... θ_4 in this lecture's notation) to 3D points x_1 ... x_6 (w_1 ... w_6), with “edges based on visibility” ρ.]
H. Strasdat, J. M. M. Montiel, and A. J. Davison, “Visual SLAM: Why filter?” Image and Vision Computing, vol. 30, no. 2, pp. 65–77, 2012.
Mono SLAM - Filtering
• The classic way of resolving this was to pose the BA problem as a filter, such as an Extended Kalman Filter (EKF).
• Problem: wastes processing time on frames that add very little information.
[Figure: the pose/point graph under filtering. “marginalizing out previous poses also results in unwanted direct connections between 3D points”]
H. Strasdat, J. M. M. Montiel, and A. J. Davison, “Visual SLAM: Why filter?” Image and Vision Computing, vol. 30, no. 2, pp. 65–77, 2012.
Mono SLAM - Filtering
• Filtering approaches are often problematic (e.g. consider what happens when the device stops moving).
• When frames are taken at nearby positions compared to the scene distance, 3D points will exhibit large uncertainty.
Taken from D. Scaramuzza “Tutorial on Visual Odometry”.
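The effect of a short baseline on depth uncertainty can be illustrated with the usual first-order error model for two parallel cameras, $z = f b / d$ and hence $\sigma_z \approx \frac{z^2}{f b} \sigma_d$. This rectified two-view model is a simplification brought in here for illustration and is not taken from the slides; the focal length, depth and disparity noise below are made-up values.

```python
def depth_std(depth, baseline, focal_px, disparity_std_px=0.5):
    """First-order depth standard deviation for a rectified two-view setup.

    Assumes z = f * b / d, so sigma_z ~= z^2 / (f * b) * sigma_d.
    """
    return depth ** 2 / (focal_px * baseline) * disparity_std_px

# Hypothetical numbers: a point 5 m away, focal length 500 px.
sigma_small_baseline = depth_std(depth=5.0, baseline=0.05, focal_px=500.0)
sigma_large_baseline = depth_std(depth=5.0, baseline=0.50, focal_px=500.0)
print(sigma_small_baseline, sigma_large_baseline)
```

Under this model, a tenfold increase in baseline gives a tenfold reduction in depth uncertainty, which is the intuition behind waiting for enough camera motion before triangulating.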
Mono SLAM - Keyframe
• A better strategy is to employ keyframe BA.
• Made popular by Klein & Murray’s Parallel Tracking and Mapping (PTAM) algorithm.
[Figure: the pose/point graph under keyframe BA. “remove all but a small subset of keyframes”]
G. Klein and D. Murray, “Parallel tracking and mapping for small AR workspaces”, ISMAR 2007.
H. Strasdat, J. M. M. Montiel, and A. J. Davison, “Visual SLAM: Why filter?” Image and Vision Computing, vol. 30, no. 2, pp. 65–77, 2012.
Keyframe Selection
• One way to avoid large depth uncertainty is to skip frames until the average uncertainty of the 3D points decreases below a certain threshold. The selected frames are called keyframes.
• Rule of thumb: add a keyframe when keyframe distance / average depth > threshold (~10-20%).
Taken from D. Scaramuzza “Tutorial on Visual Odometry”.
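The rule of thumb translates into a one-line test. The sketch below is a hypothetical helper, not code from any of the cited systems: it assumes camera centres are given in a common world frame, that depths of the currently tracked points are available, and it uses a threshold of 0.15, which sits in the middle of the quoted 10-20% range.

```python
import numpy as np

def is_new_keyframe(cam_center, last_keyframe_center, point_depths, threshold=0.15):
    """Return True when keyframe distance / average depth exceeds the threshold."""
    distance = np.linalg.norm(np.asarray(cam_center, dtype=float)
                              - np.asarray(last_keyframe_center, dtype=float))
    average_depth = float(np.mean(point_depths))
    return bool(distance / average_depth > threshold)

# Hypothetical scene with points roughly 4 m away.
depths = [3.5, 4.0, 4.5]
small_motion = is_new_keyframe([0.1, 0.0, 0.0], [0.0, 0.0, 0.0], depths)  # ratio 0.025
large_motion = is_new_keyframe([0.8, 0.0, 0.0], [0.0, 0.0, 0.0], depths)  # ratio 0.2
print(small_motion, large_motion)
```

Because the test is a ratio, the same threshold works for a desktop scene and for a street scene: what matters is the motion relative to the scene depth, not the motion in metres.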
Keyframe-based SLAM
[Figure labels: Keyframe 1, Keyframe 2, Current frame, New keyframe, Initial point cloud, New triangulated points.]
Taken from D. Scaramuzza “Tutorial on Visual Odometry”.
[Nister’04, PTAM’07, LIBVISO’08, LSD SLAM’14, SVO’14, ORB SLAM’15]