Dense Object Reconstruction on Mobile Instructor - Simon Lucey 16-423 - Designing Computer Vision Apps
Reminder - Project Presentation • Each team will be given approximately 2.5 minutes per member to present (for example, a 2-member team will have 5 minutes allotted). • Each team will fill out the following form, providing a short (must be shorter than your allotted time) YouTube clip describing your App in action. • Teams can submit their YouTube clips through the form http://goo.gl/forms/YoeQt0c1Hf. • 16-423 staff will select the best presentations, with the winner receiving the best project prize.
Today • 3D Object Reconstruction through Motion • 3D Object Reconstruction through Learning
123D Catch
Feature-Based Methods • The image is reduced to a sparse set of keypoints. • Keypoints are usually matched with feature descriptors.
Feature-Based Advantages • Easier transition from images to geometry (e.g. vanishing points). • Wide-baseline matching. • Illumination invariance, using invariant descriptors. (Figures: Mikolajczyk, 2007)
Feature-Based Challenges 1. High depth uncertainty when the baseline is small. 2. Degenerate case of two-view methods. 3. Reconstructions tend to be sparse and lack detail. H. Ha, et al., "High-quality Depth from Uncalibrated Small Motion Clip", CVPR 2016.
(Figure: ECCV 1999)
Feature-Based Challenges • Creates only a sparse map of the world. • Does not sample across all available image data - edges & weak intensities. • Needs high-resolution camera mode (bad for efficiency and battery life). (Comparison: Direct Method (ours) vs. Feature-Based Method (ORB+RANSAC))
Direct Methods • Although not always perfect, a common measure of photometric image similarity is the sum of squared differences (SSD), in "vector form":

SSD(p) = || I(p) - T(0) ||_2^2

where the source image and template samples are stacked as

I(p) = [ I(W(x_1; p)), ..., I(W(x_N; p)) ]^T   "source image"
T(0) = [ T(W(x_1; 0)), ..., T(W(x_N; 0)) ]^T   "template"
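As a concrete sketch (not from the lecture), the vector-form SSD above can be computed by stacking sampled intensities into vectors; the pure-translation `warp`, the choice of `points`, and the nearest-neighbour sampling are illustrative assumptions:

```python
import numpy as np

def ssd(source, template, warp, points, p):
    """Sum of squared differences between warped source samples and
    template samples (the vector form of SSD(p) above).

    source, template : 2D grayscale images (H x W float arrays)
    warp             : warp(x, p) -> warped coordinate [x', y']
    points           : (N, 2) array of template coordinates x_1..x_N
    p                : warp parameter vector
    """
    def sample(img, xy):
        # nearest-neighbour sampling; a real system would interpolate
        x, y = int(round(xy[0])), int(round(xy[1]))
        return img[y, x]

    # I(p): source sampled at warped coordinates
    I = np.array([sample(source, warp(x, p)) for x in points])
    # T(0): template sampled at the identity warp
    T = np.array([sample(template, warp(x, np.zeros_like(p))) for x in points])
    return np.sum((I - T) ** 2)

# Hypothetical example warp: pure translation, W(x; p) = x + p
warp = lambda x, p: x + p
```

The SSD is zero when the warped source exactly matches the template at the sampled points, and grows as the warp parameters drift away from alignment.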
Review - LK Algorithm • Lucas & Kanade (1981) realized this and proposed a method for estimating warp displacement W(x; p) using the principles of gradients and spatial coherence. • The technique applies a Taylor series approximation to any spatially coherent area governed by the warp:

I(p + Δp) ≈ I(p) + ∂I(p)/∂p^T Δp

• T(0): "We consider this image to always be static - referred to as the template." • The Jacobian ∂I(p)/∂p^T is a block-diagonal matrix of image gradients multiplied by the stacked warp Jacobians; row n is

∂I(x'_n)/∂x'^T · ∂W(x_n; p)/∂p^T,   where x'_n = W(x_n; p)
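A minimal sketch of one Lucas-Kanade (Gauss-Newton) step, assuming a pure-translation warp W(x; p) = x + p so that ∂W/∂p^T is the identity and each Jacobian row reduces to the image gradient (function and variable names are hypothetical):

```python
import numpy as np

def lk_translation_step(I, T, points, p):
    """One Lucas-Kanade update for a translational warp W(x; p) = x + p,
    a minimal sketch of the linearization above.

    I, T   : source and template images (H x W float arrays)
    T      : the static template T(0)
    points : (N, 2) integer template coordinates [x, y]
    p      : current 2-vector translation estimate
    Returns the additive update dp.
    """
    # image gradients of the source, dI/dx' (central differences)
    gy, gx = np.gradient(I)
    J = np.zeros((len(points), 2))   # stacked Jacobian dI/dp^T
    r = np.zeros(len(points))        # residuals T(0) - I(p)
    for n, x in enumerate(points):
        xw = np.round(x + p).astype(int)     # warped coordinate x' = W(x; p)
        # for translation dW/dp^T = identity, so the row is just the gradient
        J[n] = [gx[xw[1], xw[0]], gy[xw[1], xw[0]]]
        r[n] = T[x[1], x[0]] - I[xw[1], xw[0]]
    # least-squares solve of the linearized objective: J dp ≈ r
    dp, *_ = np.linalg.lstsq(J, r, rcond=None)
    return dp
```

In practice this step is iterated (p ← p + dp) until the update is small, often over a coarse-to-fine pyramid to widen the basin of convergence.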
Results Direct Method (ours) vs. Feature-Based Method (ORB+RANSAC) H. Alismail, B. Browning, S. Lucey, "Enhancing Direct Camera Tracking with Feature Descriptors", ACCV 2016. H. Alismail, B. Browning, M. Kaess, S. Lucey, "Direct Visual Odometry in Low Light using Binary Descriptors", IEEE International Conference on Robotics and Automation (ICRA) 2017.
Today • Direct vs. Feature-Based Methods • Dense SLAM • Semi-Dense SLAM
Reminder: Warp Functions Our goal is to find the warp parameter vector p. • x = coordinate in template, [x, y]^T • x' = corresponding coordinate in source, [x', y']^T • W(x; p) = warping function such that x' = W(x; p) • p = parameter vector describing the warp
Review: Pinhole Camera • In a real camera the image is inverted. • Instead we model an impossible but more convenient virtual image in front of the pinhole. Adapted from: Computer vision: models, learning and inference. Simon J.D. Prince
Relating Points between Views • First camera: a point at depth λ along the homogeneous ray x̃ = [x, y, 1]^T is w = λ x̃. • Second camera: the same point expressed in the second camera's frame is w' = Ω w + τ. • Substituting: w' = λ Ω x̃ + τ, which projects to x' = π(λ Ω x̃ + τ). Adapted from: Computer vision: models, learning and inference. Simon J.D. Prince
Pinhole Warp Function • One can represent the relationship of points between views of pinhole cameras as a warp function:

W(x; θ, λ) = π(λ Ω x̃ + τ)   "warp function"

π([u, v, w]^T) = [u/w, v/w]^T   "pinhole projection"

T(θ) = [Ω τ; 0^T 1] = exp( Σ_{i=1}^{6} θ_i A_i ) ∈ SE(3)   "pose parameters"
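The warp above can be sketched in a few lines of NumPy. The twist convention (translation components first) and the truncated-series matrix exponential are illustrative assumptions, not the lecture's exact parameterization of the generators A_i:

```python
import numpy as np

def se3_exp(theta):
    """Matrix exponential of a twist theta in R^6 -> T in SE(3).
    Assumed convention: first three components translation, last three
    rotation (generator conventions vary)."""
    v, w = theta[:3], theta[3:]
    # skew-symmetric matrix of the rotation part
    wx = np.array([[0., -w[2], w[1]],
                   [w[2], 0., -w[0]],
                   [-w[1], w[0], 0.]])
    A = np.zeros((4, 4))
    A[:3, :3] = wx
    A[:3, 3] = v
    # truncated power series exp(A) = sum_k A^k / k!
    # (scipy.linalg.expm would also work)
    T = np.eye(4)
    term = np.eye(4)
    for k in range(1, 12):
        term = term @ A / k
        T = T + term
    return T

def pinhole_warp(x, theta, lam):
    """W(x; theta, lambda) = pi(lambda * Omega * x_tilde + tau)."""
    T = se3_exp(theta)
    Omega, tau = T[:3, :3], T[:3, 3]
    x_tilde = np.array([x[0], x[1], 1.0])     # homogeneous ray
    u, v, w = lam * Omega @ x_tilde + tau     # point in second camera frame
    return np.array([u / w, v / w])           # pinhole projection pi
```

Note that with an identity pose the projection returns x for any depth λ, which is exactly why depth is unobservable from a single view.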
Photometric Relationship • We can employ this warp function to now express the problem as

T(x_n) = I(W{x_n; θ_f, λ_n})

where T is the keyframe template, I_f is the f-th image, θ_f is the pose, and λ_n is the depth along the ray x̃_n. "An Invitation to 3-D Vision", Ma, Soatto, Košecká, Sastry
Linearizing the Image for Pose

T(x_n) = I_f(W{x_n; θ_f ∘ Δθ, λ_n}) ≈ I_f(W{x_n; θ_f, λ_n}) + A_f^n Δθ_f

Baker, Simon, and Iain Matthews. "Equivalence and efficiency of image alignment algorithms." CVPR 2001.
Direct Camera Tracking • Assuming known depths {λ_n}_{n=1}^N, solve

arg min_{Δθ_f} Σ_{n=1}^{N} || T(x_n) − I_f(W{x_n; θ_f, λ_n}) − A_f^n Δθ_f ||_2^2

where T is the keyframe template, I_f is the f-th image, and λ_n is the depth along the ray x̃_n. "An Invitation to 3-D Vision", Ma, Soatto, Košecká, Sastry
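Given precomputed photometric residuals and the stacked Jacobian rows A_f^n, the linearized objective above is an ordinary linear least-squares problem in Δθ_f; a minimal sketch (names are hypothetical):

```python
import numpy as np

def pose_update(residuals, jacobians):
    """Solve the linearized direct-tracking objective above:
      argmin_dtheta  sum_n || r_n - A_n dtheta ||_2^2

    residuals : (N,) stacked errors r_n = T(x_n) - I_f(W{x_n; theta_f, lam_n})
    jacobians : (N, 6) stacked rows A_f^n
    Returns the 6-vector pose update dtheta.
    """
    # least-squares solve of A dtheta ≈ r (normal equations via SVD)
    dtheta, *_ = np.linalg.lstsq(jacobians, residuals, rcond=None)
    return dtheta
```

As with the LK review earlier, this update is applied to the pose (composed via the SE(3) exponential) and the system is re-linearized until convergence.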
Direct Camera Tracking • Most methods employ a variant of the Lucas-Kanade algorithm for estimating camera pose. • Engel et al. demonstrated that using a "dense" number of points does not improve the performance of camera tracking (i.e. pose estimation). • The advantage of density stems mainly from the map estimation. J. Engel, V. Koltun, and D. Cremers, "Direct Sparse Odometry", arXiv preprint arXiv:1607.02565, 2016. J. Engel, T. Schöps, and D. Cremers, "LSD-SLAM: Large-Scale Direct Monocular SLAM", European Conference on Computer Vision, pages 834-849, Springer, 2014.