From 2D to 3D: Monocular Vision
With application to robotics/AR
Motivation: how many sensors do we really need?
Motivation: what is the limit of what can be inferred from a single embodied (moving) camera frame?
Aim: AR with a hand-held camera (Davison 2003)
$$\begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} \mapsto \begin{pmatrix} fX \\ fY \\ Z \end{pmatrix} = \begin{bmatrix} f & & & 0 \\ & f & & 0 \\ & & 1 & 0 \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}$$
With the principal point offset:
$$\begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} \mapsto \begin{pmatrix} fX + Zp_x \\ fY + Zp_y \\ Z \end{pmatrix} = \begin{bmatrix} f & & p_x & 0 \\ & f & p_y & 0 \\ & & 1 & 0 \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}$$
Calibration matrix:
$$K = \begin{bmatrix} f & & p_x \\ & f & p_y \\ & & 1 \end{bmatrix}$$
Principal point: $(p_x, p_y)$
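To make the mapping concrete, here is a minimal numerical sketch of applying K to points given in the camera frame; the focal length and principal point values are made up for illustration and are not from the slides.

```python
import numpy as np

# Illustrative projection with an assumed calibration matrix
# K = [[f, 0, px], [0, f, py], [0, 0, 1]] (hypothetical values).
f, px, py = 500.0, 320.0, 240.0
K = np.array([[f, 0.0, px],
              [0.0, f, py],
              [0.0, 0.0, 1.0]])

def project(K, X_cam):
    """Perspective projection of camera-frame 3D points.
    X_cam: (N, 3) array; returns (N, 2) pixel coordinates."""
    x_h = (K @ X_cam.T).T            # homogeneous image points (fX + Z*px, fY + Z*py, Z)
    return x_h[:, :2] / x_h[:, 2:3]  # divide by Z to get pixels

X_cam = np.array([[0.1, -0.2, 2.0],
                  [0.5,  0.3, 4.0]])
print(project(K, X_cam))
```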
Camera with rotation R and centre $\tilde{C}$ expressed in world coordinates:
$$\tilde{X}_{\mathrm{cam}} = R\,(\tilde{X} - \tilde{C})$$
$$X_{\mathrm{cam}} = \begin{bmatrix} R & -R\tilde{C} \\ 0 & 1 \end{bmatrix} \begin{pmatrix} \tilde{X} \\ 1 \end{pmatrix} = \begin{bmatrix} R & -R\tilde{C} \\ 0 & 1 \end{bmatrix} X$$
$$x = K\,[\,I \mid 0\,]\,X_{\mathrm{cam}} = K\,[\,R \mid -R\tilde{C}\,]\,X$$
$$P = K\,[\,R \mid t\,], \qquad t = -R\tilde{C}$$
In non-homogeneous coordinates: $x = K\,R\,(\tilde{X} - \tilde{C})$. Note: the camera centre C is the null space of the camera projection matrix (PC = 0).
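A small sketch of assembling P = K[R | t] with t = -RC̃ and checking the null-space property numerically; K, R, and C̃ below are arbitrary made-up values, not taken from the slides.

```python
import numpy as np

# Build P = K [R | t] with t = -R @ C_tilde and verify that the homogeneous
# camera centre C = (C_tilde, 1) satisfies P C = 0.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
theta = np.deg2rad(10.0)                       # small rotation about the y-axis
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
C_tilde = np.array([0.5, 0.1, -1.0])           # camera centre in world coordinates
t = -R @ C_tilde
P = K @ np.hstack([R, t[:, None]])             # 3x4 projection matrix

C = np.append(C_tilde, 1.0)                    # homogeneous camera centre
print(P @ C)                                   # ~ [0, 0, 0]: PC = 0

X = np.array([1.0, 0.5, 4.0, 1.0])             # a homogeneous world point
x = P @ X
print(x[:2] / x[2])                            # its pixel coordinates
```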
Triangulation: given projections of a 3D point in two or more images (with known camera matrices), find the coordinates of the point.
[Figure: camera centres O1 and O2 observe projections x1 and x2 of the unknown point X]
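One standard way to solve this is linear (DLT) triangulation; the sketch below is only illustrative, not necessarily the estimator the slides go on to use, and the cameras and point are synthetic.

```python
import numpy as np

# Linear (DLT) triangulation: each observation x of a point X under camera P
# contributes two rows to A X = 0; the solution is the null vector of A (via SVD).
def triangulate(P1, P2, x1, x2):
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X / X[3]                 # de-homogenize

# Tiny synthetic check with two assumed cameras and a known point.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])                  # [I | 0]
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])  # translated along x
X_true = np.array([0.2, -0.1, 5.0, 1.0])
x1 = P1 @ X_true
x1 = x1[:2] / x1[2]
x2 = P2 @ X_true
x2 = x2[:2] / x2[2]
print(triangulate(P1, P2, x1, x2))   # ~ [0.2, -0.1, 5.0, 1.0]
```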
Structure from motion: given the projections $x_{ij} = P_i X_j$, $i = 1, \ldots, m$, $j = 1, \ldots, n$, estimate the cameras $P_i$ and the 3D points $X_j$ from the mn correspondences $x_{ij}$.
[Figure: cameras P1, P2, P3 observing a point Xj at projections x1j, x2j, x3j]
If we scale the entire scene by a factor of k and, at the same time, scale the camera matrices by the factor of 1/k, the projections of the scene points in the image remain exactly the same: $x = PX = \left(\tfrac{1}{k}P\right)(kX)$
It is impossible to recover the absolute scale of the scene!
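The ambiguity can be checked numerically: in the sketch below (all values made up), scaling the scene points and the camera translation by the same factor k leaves every pixel coordinate unchanged.

```python
import numpy as np

# Scale ambiguity demo: scaling the structure and the camera translation by k
# produces exactly the same image projections, so absolute scale is unobservable.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.2, 0.0, 1.0])
X_tilde = np.array([0.4, -0.3, 6.0])          # a 3D point (non-homogeneous)

def pixel(K, R, t, X_tilde):
    x = K @ (R @ X_tilde + t)
    return x[:2] / x[2]

k = 3.7
print(pixel(K, R, t, X_tilde))                # original scene
print(pixel(K, R, k * t, k * X_tilde))        # scene and translation scaled by k: same pixels
```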
Problem: given $x_{ij} = P_i X_j$, $i = 1, \ldots, m$, $j = 1, \ldots, n$, estimate the m projection matrices $P_i$ and the n 3D points $X_j$ from the mn correspondences $x_{ij}$.
Without calibration information, the cameras and points can only be recovered up to a 4x4 projective transformation Q: $X \to QX$, $P \to PQ^{-1}$.
Counting: the correspondences give 2mn equations; the unknowns are 11m (cameras) plus 3n (points), minus 15 for the projective ambiguity, so we need $2mn \geq 11m + 3n - 15$.
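As a quick sanity check of this counting argument (under the assumption, stated above, that the projective ambiguity removes 15 degrees of freedom), consider the two-view case:

```latex
% Two views: m = 2
2mn \ge 11m + 3n - 15
\;\Longrightarrow\; 4n \ge 22 + 3n - 15
\;\Longrightarrow\; n \ge 7
```

This matches the familiar minimum of seven point correspondences for recovering the geometry of two uncalibrated views.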
Bundle adjustment: non-linear method for refining structure and motion (Levenberg-Marquardt), minimizing the total reprojection error:
$$E(P, X) = \sum_{i=1}^{m} \sum_{j=1}^{n} D\!\left(x_{ij},\, P_i X_j\right)^2$$
[Figure: reprojection error — measured points x1j, x2j, x3j vs. predicted projections P1Xj, P2Xj, P3Xj of the point Xj]
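The objective above translates directly into a least-squares problem. The sketch below evaluates E as a stack of reprojection residuals and, for brevity, refines only the 3D points with the cameras held fixed; a full bundle adjustment would put the camera parameters into the same Levenberg-Marquardt problem. All data are synthetic.

```python
import numpy as np
from scipy.optimize import least_squares

def project(P, X):
    """Project homogeneous 3D point X (4,) with 3x4 camera P to a pixel (2,)."""
    x = P @ X
    return x[:2] / x[2]

# Two synthetic cameras and three synthetic points.
cameras = [np.hstack([np.eye(3), np.zeros((3, 1))]),
           np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])]
X_true = np.array([[0.3, 0.1, 4.0], [-0.2, 0.4, 5.0], [0.0, -0.3, 6.0]])

# Noisy observations x_ij of every point in every camera.
rng = np.random.default_rng(0)
obs = {(i, j): project(P, np.append(Xj, 1.0)) + rng.normal(0, 1e-3, 2)
       for i, P in enumerate(cameras) for j, Xj in enumerate(X_true)}

def residuals(params):
    """Stacked reprojection errors D(x_ij, P_i X_j) for all observations."""
    X = params.reshape(-1, 3)
    return np.concatenate([project(cameras[i], np.append(X[j], 1.0)) - obs[(i, j)]
                           for (i, j) in obs])

X0 = X_true + rng.normal(0, 0.1, X_true.shape)        # perturbed initial structure
result = least_squares(residuals, X0.ravel(), method="lm")  # Levenberg-Marquardt
print(result.x.reshape(-1, 3))                        # refined points, close to X_true
```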
Self-calibration: determining intrinsic camera parameters directly from uncalibrated images.
For example, with a single moving camera, we can use the constraint that the intrinsic parameter matrix remains fixed for all the images:
compute an initial projective reconstruction and find a 3D projective transformation matrix Q such that all camera matrices are in the form Pi = K [Ri | ti].
Can also use constraints on the form of the calibration matrix: zero skew.
http://www.youtube.com/watch?v=sQegEro5Bfo
http://www.youtube.com/watch?v=p16frKJLVi0
SLAM: the robot explores an unknown environment; the robot path and the map are both unknown. Errors in the robot path correlate errors in the map.
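A toy Monte Carlo sketch of that correlation (all numbers made up): two landmarks observed relative to the same uncertain robot position end up with strongly correlated position errors.

```python
import numpy as np

# 1-D world: the robot's position is uncertain, and each landmark is observed
# as a (slightly noisy) offset relative to the robot.
rng = np.random.default_rng(2)
pose_samples = rng.normal(0.0, 0.5, 10000)       # uncertain robot position
obs_l1 = 2.0 + rng.normal(0.0, 0.05, 10000)      # relative observation of landmark 1
obs_l2 = 5.0 + rng.normal(0.0, 0.05, 10000)      # relative observation of landmark 2

est_l1 = pose_samples + obs_l1                   # estimated absolute landmark positions
est_l2 = pose_samples + obs_l2
print(np.corrcoef(est_l1, est_l2)[0, 1])         # close to 1: map errors are correlated
```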
Robot pose uncertainty
Wrong data association (matching an observation to the wrong landmark) can have catastrophic consequences.
Online SLAM:
$$p(x_t, m \mid z_{1:t}, u_{1:t}) = \int\!\!\int \cdots \int p(x_{1:t}, m \mid z_{1:t}, u_{1:t})\; dx_1\, dx_2 \ldots dx_{t-1}$$
Integrations are typically done one at a time. Estimates the most recent pose and the map.
Full SLAM:
$$p(x_{1:t}, m \mid z_{1:t}, u_{1:t})$$
Estimates the entire path and the map.
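The relationship between the two posteriors is just a marginalization. A toy discrete sketch (all probabilities made up) of summing out an old pose:

```python
import numpy as np

# Joint p(x1, x2, m | z, u) over two discretized poses and a map variable;
# summing out the old pose x1 yields the online posterior p(x2, m | z, u).
rng = np.random.default_rng(1)
joint = rng.random((4, 4, 3))        # axes: x1 (4 cells), x2 (4 cells), m (3 map hypotheses)
joint /= joint.sum()                  # normalize into a proper distribution

online = joint.sum(axis=0)            # "integrate out" x1
print(online.shape)                   # (4, 3): distribution over (x2, m)
print(online.sum())                   # still sums to 1
```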
Incremental maximum-likelihood update: maximize the likelihood of the current pose and map relative to the (t-1)-th pose and map:
$$\hat{x}_t = \arg\max_{x_t} \big\{\, p(z_t \mid x_t, \hat{m}_{t-1}) \cdot p(x_t \mid u_t, \hat{x}_{t-1}) \,\big\}$$
Here $p(z_t \mid x_t, \hat{m}_{t-1})$ scores the current measurement against the map constructed so far, and $p(x_t \mid u_t, \hat{x}_{t-1})$ is the robot motion model.
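A toy sketch of this argmax update: candidate poses on a grid are scored by the product of a measurement likelihood and a motion prior. Both models below are placeholder Gaussians, not any particular sensor or odometry model from the slides.

```python
import numpy as np

def measurement_likelihood(x, landmark, z_range):
    """Placeholder: likelihood of a range measurement to a known landmark from pose x."""
    predicted = np.linalg.norm(landmark - x)
    return np.exp(-0.5 * ((z_range - predicted) / 0.1) ** 2)

def motion_prior(x, x_prev, u):
    """Placeholder: Gaussian prior around the odometry-predicted pose x_prev + u."""
    return np.exp(-0.5 * (np.linalg.norm(x - (x_prev + u)) / 0.2) ** 2)

landmark = np.array([2.0, 1.0])       # one landmark in the map built so far
x_prev = np.array([0.0, 0.0])         # previous pose estimate
u = np.array([1.0, 0.0])              # control: move 1 m along x
z_range = 1.35                        # current range measurement to the landmark

# Grid search over candidate poses (a scan matcher searches in a similar spirit).
best, best_score = None, -1.0
for x in np.linspace(0.5, 1.5, 51):
    for y in np.linspace(-0.5, 0.5, 51):
        pose = np.array([x, y])
        score = measurement_likelihood(pose, landmark, z_range) * motion_prior(pose, x_prev, u)
        if score > best_score:
            best, best_score = pose, score
print(best)   # pose balancing the motion prediction and the measurement
```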
New map points are triangulated from frames with known poses, based on the poses and the correspondences.
Map representation: keyframes (with their poses) and map points.
Bundle adjustment refines the poses of all keyframes (or use only last N keyframes) and the map points.
The mapping thread has idle time – use this to improve the map; data association is reversible (measurements can be re-made or removed later).
Tracking against the keyframe map: determine which map points are visible, and at what pyramid level; match each point using its patch (warped from the source keyframe) around the predicted position; refine to subpixel position (for some patches).
where ej is the re-projection error vector
When a new keyframe is added, a local bundle adjustment is made. That is, local bundle adjustment adjusts the pose of the most recent keyframe and its closest neighbors, and all of the map points seen by these, using all of the measurements ever made of these points.
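A sketch of how such a local bundle-adjustment window could be assembled; the data layout (dicts of keyframe poses and observations) is assumed here for illustration and is not PTAM's actual structure.

```python
import numpy as np

def local_ba_window(keyframes, observations, newest_id, n_neighbors=4):
    """keyframes: {kf_id: 3-vector camera position}; observations: {(kf_id, point_id): pixel}."""
    newest_pos = keyframes[newest_id]
    # Closest neighbors by camera-position distance (a placeholder proximity metric).
    neighbors = sorted((kf for kf in keyframes if kf != newest_id),
                       key=lambda kf: np.linalg.norm(keyframes[kf] - newest_pos))
    adjusted_kfs = {newest_id, *neighbors[:n_neighbors]}

    # All map points seen by the keyframes being adjusted.
    points = {pid for (kf, pid) in observations if kf in adjusted_kfs}

    # All measurements ever made of those points; other keyframes stay fixed as anchors.
    measurements = {(kf, pid): px for (kf, pid), px in observations.items() if pid in points}
    return adjusted_kfs, points, measurements

# Tiny usage example with made-up data.
keyframes = {i: np.array([float(i), 0.0, 0.0]) for i in range(6)}
observations = {(i, p): np.zeros(2) for i in range(6) for p in range(3) if (i + p) % 2 == 0}
print(local_ba_window(keyframes, observations, newest_id=5, n_neighbors=2))
```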
http://www.youtube.com/watch?v=Y9HMn6bd-v8
http://www.youtube.com/watch?v=pBI5HwitBX4
Multi-scale compactly supported basis functions
Bundle-adjusted point cloud with PTAM
http://www.youtube.com/watch?v=CZiSK7OMANw
robustly removed
keyframes can now be estimated
Applications (e.g., navigation, obstacle avoidance, scene understanding)
http://www.cs.washington.edu/ai/Mobile_Robotics/projects/rgbd-3d-mapping/
http://research.microsoft.com/apps/video/dl.aspx?id=152815
How many sensors do we really need? In many cases, a single (moving) camera is enough.