Structure from Motion
Computer Vision CS 143, Brown James Hays
11/18/11
Many slides adapted from Derek Hoiem, Lana Lazebnik, Silvio Savarese, Steve Seitz, and Martial Hebert
This class: structure from motion
– Recap of epipolar geometry
– Depth from two views
– Epipole: the intersection of the cameras’ baseline with the image plane
Structure from motion: given a set of corresponding points in two or more images, compute the camera parameters and the 3D point coordinates
[Figure: Camera 1, Camera 2, and Camera 3 with unknown poses (R1, t1), (R2, t2), (R3, t3)]
Slide credit: Noah Snavely
If we scale the entire scene by some factor k and, at the same time, scale the camera matrices by the factor of 1/k, the projections of the scene points in the image remain exactly the same. It is impossible to recover the absolute scale of the scene!
How do we know the scale of image content?
More generally, if we transform the scene using a transformation Q and apply the inverse transformation to the camera matrices, then the images do not change:
$x = PX = (PQ^{-1})(QX)$
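The ambiguity can be checked numerically. The sketch below is only an illustration under assumed values: the camera P, the points X, and the transformations Q and Q_scale are made-up examples, not data from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical camera P = [I | 0] and five homogeneous 3D points in front of it.
P = np.hstack([np.eye(3), np.zeros((3, 1))])
X = np.vstack([
    rng.uniform(-1, 1, (2, 5)),   # X and Y coordinates
    rng.uniform(4, 6, (1, 5)),    # Z (depth), kept away from zero
    np.ones((1, 5)),              # homogeneous coordinate
])

def project(P, X):
    """Project homogeneous 3D points and return inhomogeneous 2D image points."""
    x = P @ X
    return x[:2] / x[2]

# An arbitrary (assumed) invertible 4x4 transformation of the scene.
Q = np.eye(4) + 0.1 * rng.standard_normal((4, 4))

# Transforming the scene by Q and the camera by Q^-1 leaves the image unchanged.
x_original = project(P, X)
x_transformed = project(P @ np.linalg.inv(Q), Q @ X)

# Scale ambiguity is the special case Q = diag(k, k, k, 1).
k = 3.0
Q_scale = np.diag([k, k, k, 1.0])
x_scaled = project(P @ np.linalg.inv(Q_scale), Q_scale @ X)

print(np.allclose(x_original, x_transformed), np.allclose(x_original, x_scaled))
```

Both comparisons agree up to floating-point error, which is why image measurements alone cannot pin down Q.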
Problem: estimate the m projection matrices Pi and the n 3D points Xj from the mn corresponding points xij
[Figure: 3D point Xj observed as x1j, x2j, x3j by cameras P1, P2, P3]
Slides from Lana Lazebnik
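– Projective (15 dof): $\begin{bmatrix} A & t \\ v^T & v \end{bmatrix}$. Preserves intersection and tangency.
– Affine (12 dof): $\begin{bmatrix} A & t \\ 0^T & 1 \end{bmatrix}$. Preserves parallelism, volume ratios.
– Similarity (7 dof): $\begin{bmatrix} sR & t \\ 0^T & 1 \end{bmatrix}$. Preserves angles, ratios of length.
– Euclidean (6 dof): $\begin{bmatrix} R & t \\ 0^T & 1 \end{bmatrix}$. Preserves angles, lengths.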
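With no constraints on the camera calibration matrix or on the scene, we get a projective reconstruction
Additional information is needed to upgrade the reconstruction to affine, similarity, or Euclidean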
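Projective ambiguity: $x = PX = (PQ_P^{-1})(Q_P X)$, with $Q_P = \begin{bmatrix} A & t \\ v^T & v \end{bmatrix}$
Affine ambiguity: $x = PX = (PQ_A^{-1})(Q_A X)$, with $Q_A = \begin{bmatrix} A & t \\ 0^T & 1 \end{bmatrix}$
Similarity ambiguity: $x = PX = (PQ_S^{-1})(Q_S X)$, with $Q_S = \begin{bmatrix} sR & t \\ 0^T & 1 \end{bmatrix}$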
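Minimizing the reprojection error (bundle adjustment):
$E(P, X) = \sum_{i=1}^{m} \sum_{j=1}^{n} d(x_{ij}, P_i X_j)^2$
where $d(\cdot, \cdot)$ is the image distance between the observed point xij and the projection Pi Xj.
[Figure: point Xj, its observations x1j, x2j, x3j in cameras P1, P2, P3, and its reprojections P1Xj, P2Xj, P3Xj]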
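A minimal sketch of this cost, assuming every point is visible in every camera and using Euclidean image distance. The function names, array layouts, and synthetic cameras are illustrative assumptions. A real bundle adjuster would pass this cost (or its residuals) to a non-linear least-squares solver, which is not shown here.

```python
import numpy as np

def project_all(Ps, Xs):
    """Project n homogeneous 3D points (n, 4) through m cameras (m, 3, 4) -> (m, n, 2)."""
    proj = np.einsum("mab,nb->mna", Ps, Xs)   # homogeneous image points, (m, n, 3)
    return proj[..., :2] / proj[..., 2:3]      # divide by the third coordinate

def reprojection_error(Ps, Xs, xs):
    """Sum over cameras i and points j of the squared distance between x_ij and P_i X_j."""
    return float(np.sum((xs - project_all(Ps, Xs)) ** 2))

# Hypothetical scene: 3 cameras translated along the x axis, 6 points in front of them.
rng = np.random.default_rng(0)
m, n = 3, 6
Ps = np.stack([
    np.hstack([np.eye(3), np.array([[0.5 * i], [0.0], [0.0]])]) for i in range(m)
])
Xs = np.column_stack([rng.uniform(-1, 1, (n, 2)), rng.uniform(4, 6, n), np.ones(n)])

xs_exact = project_all(Ps, Xs)
error_exact = reprojection_error(Ps, Xs, xs_exact)            # perfect observations
error_shifted = reprojection_error(Ps, Xs, xs_exact + 0.01)   # every coordinate off by 0.01
print(error_exact, error_shifted)
```

With every coordinate shifted by 0.01, the error is m · n · 2 · 0.01², which makes a convenient sanity check.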
Noah Snavely, Steven M. Seitz, Richard Szeliski, "Photo tourism: Exploring photo collections in 3D," SIGGRAPH 2006 http://photosynth.net/
Affine camera: the camera center is at infinity
Affine projection is a linear mapping plus a translation in inhomogeneous coordinates
1. We are given corresponding 2D points (x) in several frames
2. We want to estimate the 3D points (X) and the affine parameters of each camera (A)
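$\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \end{pmatrix} + \begin{pmatrix} t_x \\ t_y \end{pmatrix} = A\mathbf{X} + \mathbf{t}$
where t is the projection of the world origin, and a1, a2 denote the rows of A.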
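Centering: subtract the centroid of the image points in each view. For simplicity, assume that the origin of the world coordinate system is at the centroid of the 3D points. Then
$\hat{x}_{ij} = x_{ij} - \frac{1}{n}\sum_{k=1}^{n} x_{ik} = A_i X_j + b_i - \frac{1}{n}\sum_{k=1}^{n} (A_i X_k + b_i) = A_i \left( X_j - \frac{1}{n}\sum_{k=1}^{n} X_k \right) = A_i \hat{X}_j$
where b_i is the translation (the projection of the world origin) of camera i. After centering, each normalized point $\hat{x}_{ij}$ is related to the 3D point Xj by
$\hat{x}_{ij} = A_i X_j$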
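The effect of centering can be verified on synthetic data. This is a sketch under assumed values: the camera (A, b) and the points X are random illustrative inputs, with the world origin placed at the centroid of the 3D points as the derivation assumes.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8

A = rng.standard_normal((2, 3))        # hypothetical 2x3 affine camera
b = rng.standard_normal((2, 1))        # its translation (projection of the world origin)
X = rng.standard_normal((3, n))
X -= X.mean(axis=1, keepdims=True)     # put the world origin at the centroid of the points

x = A @ X + b                          # affine projection of all n points
centroid = x.mean(axis=1, keepdims=True)
x_hat = x - centroid                   # centered image points

# The image centroid equals the translation, and centering removes it.
print(np.allclose(centroid, b), np.allclose(x_hat, A @ X))
```

After this step the translation no longer appears, so only A and X remain to be estimated.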
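$D = \begin{bmatrix} \hat{x}_{11} & \hat{x}_{12} & \cdots & \hat{x}_{1n} \\ \hat{x}_{21} & \hat{x}_{22} & \cdots & \hat{x}_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{x}_{m1} & \hat{x}_{m2} & \cdots & \hat{x}_{mn} \end{bmatrix} = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_m \end{bmatrix} \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix}$
2D image points (2m × n) = camera parameters (2m × 3) × 3D points (3 × n)
The rows of D index the cameras (2m rows) and the columns index the points (n columns).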
Source: M. Hebert
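Because D is the product of a 2m × 3 matrix and a 3 × n matrix, its rank is at most 3. The sketch below builds D from assumed random affine cameras and points (illustrative values only) and inspects the rank and singular values.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 4, 12                                   # hypothetical: 4 cameras, 12 points
X = rng.standard_normal((3, n))
cameras = [(rng.standard_normal((2, 3)), rng.standard_normal((2, 1))) for _ in range(m)]

blocks = []
for A_i, b_i in cameras:
    x_i = A_i @ X + b_i                        # 2 x n image points in view i
    blocks.append(x_i - x_i.mean(axis=1, keepdims=True))   # center each view
D = np.vstack(blocks)                          # 2m x n measurement matrix

rank = np.linalg.matrix_rank(D)
singular_values = np.linalg.svd(D, compute_uv=False)
print(D.shape, rank)
print(singular_values)                         # only the first three are non-negligible
```

With noisy measurements the remaining singular values are small rather than zero, which is what motivates keeping only the top three.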
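This decomposition (the rank-3 truncation of the SVD of D) minimizes $\|D - MS\|^2$, where M is the 2m × 3 motion (camera) matrix and S is the 3 × n shape (3D point) matrix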
Source: M. Hebert
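Eliminating the affine ambiguity: the factorization $D = \tilde{A}\tilde{X}$ is unique only up to an invertible 3 × 3 matrix C, since $\tilde{A}\tilde{X} = (\tilde{A}C)(C^{-1}\tilde{X})$
Orthographic camera: the image axes a1 and a2 (the rows of each Ai) are perpendicular and the scale is 1:
$a_1 \cdot a_2 = 0, \quad |a_1|^2 = |a_2|^2 = 1$
With $L = CC^T$, this gives 3m linear equations in L:
$\tilde{A}_i L \tilde{A}_i^T = \mathrm{Id}, \quad i = 1, \dots, m$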
Source: M. Hebert
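One way to use the orthographic constraints is sketched below. It assumes the unknown 3 × 3 correction is written C with L = C Cᵀ, solves the 3m linear equations for the six entries of the symmetric L by least squares, and recovers C by Cholesky decomposition. The function name `metric_upgrade`, the synthetic cameras, and the distortion C0 are illustrative assumptions. Cholesky requires L to be positive definite, which can fail on noisy data.

```python
import numpy as np

def metric_upgrade(A_tilde):
    """Return a 3x3 C such that each 2x3 block of A_tilde @ C has orthonormal rows."""
    def coeffs(u, v):
        # Coefficients of u^T L v in the unknowns (l11, l12, l13, l22, l23, l33).
        return np.array([
            u[0] * v[0],
            u[0] * v[1] + u[1] * v[0],
            u[0] * v[2] + u[2] * v[0],
            u[1] * v[1],
            u[1] * v[2] + u[2] * v[1],
            u[2] * v[2],
        ])

    rows, rhs = [], []
    for a1, a2 in zip(A_tilde[0::2], A_tilde[1::2]):   # the two rows of each camera
        rows += [coeffs(a1, a1), coeffs(a2, a2), coeffs(a1, a2)]
        rhs += [1.0, 1.0, 0.0]                         # unit length, unit length, perpendicular
    l, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    return np.linalg.cholesky(L)                       # lower-triangular C with L = C C^T

# Hypothetical test data: 5 orthographic cameras distorted by a known matrix C0.
rng = np.random.default_rng(4)
m = 5
A_true = np.vstack([np.linalg.qr(rng.standard_normal((3, 3)))[0][:2] for _ in range(m)])
C0 = np.array([[2.0, 0.3, 0.0], [0.1, 1.5, 0.2], [0.0, 0.4, 1.0]])
A_tilde = A_true @ np.linalg.inv(C0)                   # affinely distorted cameras

C = metric_upgrade(A_tilde)
A_upgraded = A_tilde @ C
gram_blocks = [A_upgraded[2 * i:2 * i + 2] @ A_upgraded[2 * i:2 * i + 2].T for i in range(m)]
print(np.allclose(gram_blocks, np.eye(2)))
```

The recovered C generally differs from C0 by a rotation, which is the remaining (harmless) ambiguity in the choice of world frame. The points are updated with the inverse, C⁻¹ applied to the factorized points.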
– Column j contains the projection of point j in all views
– Row i contains one coordinate of the projections of all the n points in image i
– Compute SVD: $D = U W V^T$
– Create U3 by taking the first 3 columns of U
– Create V3 by taking the first 3 columns of V
– Create W3 by taking the upper left 3 × 3 block of W
– $A = U_3 W_3^{1/2}$ and $X = W_3^{1/2} V_3^T$
Source: M. Hebert
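The SVD steps above can be sketched in a few lines of NumPy. The function name `affine_factorization` and the synthetic data are assumptions for illustration. The input D is expected to be the centered 2m × n measurement matrix with no missing entries.

```python
import numpy as np

def affine_factorization(D):
    """Factor a centered 2m x n measurement matrix into A (2m x 3) and X (3 x n)."""
    U, w, Vt = np.linalg.svd(D, full_matrices=False)   # D = U diag(w) Vt
    sqrt_W3 = np.diag(np.sqrt(w[:3]))                  # W3^(1/2), top three singular values
    A = U[:, :3] @ sqrt_W3                             # A = U3 W3^(1/2)
    X = sqrt_W3 @ Vt[:3, :]                            # X = W3^(1/2) V3^T
    return A, X

# Hypothetical noise-free data: 5 affine cameras, 20 centered 3D points.
rng = np.random.default_rng(3)
m, n = 5, 20
X_true = rng.standard_normal((3, n))
X_true -= X_true.mean(axis=1, keepdims=True)
A_true = rng.standard_normal((2 * m, 3))
D = A_true @ X_true

A, X = affine_factorization(D)
print(np.allclose(A @ X, D))          # the product reproduces the measurements
print(np.allclose(A, A_true))         # generally False: A is recovered up to a 3x3 matrix
```

Splitting W3 evenly between A and X is a convention. Any other split gives the same product, which is the affine ambiguity that the orthographic constraints remove.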
– Solve using a dense submatrix of visible points
– Iteratively add new cameras