
Structure From Motion, EECS 442, David Fouhey, Fall 2019, University of Michigan



  1. Structure From Motion. EECS 442, David Fouhey. Fall 2019, University of Michigan. http://web.eecs.umich.edu/~fouhey/teaching/EECS442_F19/

  2. Structure from Motion

  3. Structure from motion. Have: 2D points $p_{ij}$ seen in $m$ images. Assume: points generated from $n$ fixed 3D points $X_j$ and cameras $M_i$, or $p_{ij} \equiv M_i X_j$. Want: cameras $M_i$, points $X_j$. (Remember) $M_i \equiv K_i [R_i, t_i]$ and $\lambda p_{ij} = M_i X_j$, $\lambda \neq 0$. [Diagram: a 3D point $X_j$ observed as $p_{1j}$, $p_{2j}$, $p_{3j}$ by cameras $M_1$, $M_2$, $M_3$, with a known/unknown legend.] Diagram credit: S. Lazebnik

  4. Is SFM always uniquely solvable? • Necker cube. Source: N. Snavely

  5. Structure from motion ambiguities. Let's first find one easy ambiguity: $p_{ij} \equiv M_i X_j$, where $p_{ij}$ is 3x1, $M_i$ is 3x4, and $X_j$ is 4x1.

  6. Zoolander, 2001

  7. Structure from motion ambiguities. Let's first find one easy ambiguity: $p_{ij} \equiv M_i X_j$. Can pick any arbitrary scaling factor $k$ and adjust the cameras and points: $p_{ij} \equiv M_i k^{-1} k X_j$. (Can usually be fixed in practice: just need a number, obtainable from heights of known objects or an IMU.)

  8. Structure from motion ambiguity. Does this diagram change meaning if I use this coordinate system? Versus this coordinate system? Coordinate system irrelevant! So global R, t also ambiguous. [Diagram: the same point $X_j$, cameras $M_1$, $M_2$, $M_3$, and projections $p_{1j}$, $p_{2j}$, $p_{3j}$, drawn with two different x, y, z axes and origin 0.]

  9. Structure from motion ambiguities. Not just limited to scale. Given $p_{ij} \equiv M_i X_j$, can insert any global transform $H$: $p_{ij} \equiv M_i X_j = M_i H^{-1} H X_j$. $H$ is a 3D homography / perspective transform / projective transform.
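A minimal NumPy sketch of this ambiguity, under illustrative assumptions: the camera `M`, the point `X`, and the transform `H` below are made-up values (not from the slides), chosen only so that `H` is invertible. It checks that the transformed camera and point produce the same pixel.

```python
import numpy as np

def project(M, X):
    """Project a homogeneous 3D point (4-vector) and divide out the scale."""
    p = M @ X
    return p[:2] / p[2]

# Hypothetical camera, 3D point, and global 4x4 transform (illustrative values).
M = np.array([[800.0, 0.0, 320.0, 0.0],
              [0.0, 800.0, 240.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
X = np.array([1.0, 2.0, 10.0, 1.0])
H = np.array([[2.0, 0.0, 0.0, 1.0],
              [0.0, 1.0, 0.5, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.1, 0.0, 0.0, 1.0]])  # determinant 1.9, so invertible

M_alt = M @ np.linalg.inv(H)  # camera absorbs H^{-1}
X_alt = H @ X                 # point absorbs H

pixel = project(M, X)
pixel_alt = project(M_alt, X_alt)
print(pixel, pixel_alt)
```

Both projections agree, so image measurements alone cannot distinguish `(M, X)` from `(M_alt, X_alt)`.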

  10. Similarity/Affine/Perspective. Perspective preserves lines: $\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$. Affine adds parallelism: $\begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix}$. Similarity adds angles: $\begin{bmatrix} sR & t \\ 0 & 1 \end{bmatrix}$. 3D: same idea, different dimensions. House image: A. Efros

  11. Projective ambiguity. With no constraints on camera matrices and scene, can only reconstruct up to a perspective ambiguity $H$: $p_{ij} \equiv M_i X_j = M_i H^{-1} H X_j$. Slide credit: S. Lazebnik

  12. Projective ambiguity Slide credit: S. Lazebnik

  13. Affine ambiguity. If we have constraints in the form of what lines are parallel, can reduce ambiguity to affine ambiguity: $H = \begin{bmatrix} A & t \\ 0 & 1 \end{bmatrix}$ (affine), with $p_{ij} \equiv M_i X_j = M_i H^{-1} H X_j$. Slide credit: S. Lazebnik

  14. Affine ambiguity Slide credit: S. Lazebnik

  15. Similarity ambiguity. If we have orthogonality constraints, get up to similarity transform. Really the best we can do. We get this if we have calibrated cameras. $H = \begin{bmatrix} sR & t \\ 0 & 1 \end{bmatrix}$, with $p_{ij} \equiv M_i X_j = M_i H^{-1} H X_j$. Slide credit: S. Lazebnik

  16. Similarity ambiguity Slide credit: S. Lazebnik

  17. Affine structure from motion. We'll do the math with affine / weak perspective cameras (math is much easier). [Images: perspective vs. weak perspective.]

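  18. Recall: orthographic projection. Orthographic camera: things infinitely far away but you have an amazing camera. Projection along the z direction, from world to image: $\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \rightarrow \begin{bmatrix} x \\ y \end{bmatrix}$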

  19. Field of view and focal length. [Images: standard, wide-angle, telephoto.] Slide Credit: F. Durand

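  20. Affine Camera. $M = \begin{bmatrix} A_{2D} & t_{2D} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} A_{3D} & t_{3D} \\ 0 & 1 \end{bmatrix}$ (3x3 matrix, affine 2D; 3x4 orthographic projection; 4x4 matrix, affine 3D). Tedious math… $M = \begin{bmatrix} a_{11} & a_{12} & a_{13} & b_1 \\ a_{21} & a_{22} & a_{23} & b_2 \\ 0 & 0 & 0 & 1 \end{bmatrix}$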

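  21. Affine Camera. So what? Who cares? Examine the projection: $\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \equiv \begin{bmatrix} a_{11} & a_{12} & a_{13} & b_1 \\ a_{21} & a_{22} & a_{23} & b_2 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$. Projection becomes linear mapping + translation and doesn't involve homogeneous coordinates! $\begin{bmatrix} u \\ v \end{bmatrix} \equiv \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$. $b$ is projection of origin. Can anyone see why?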
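A small sketch of the affine projection, with made-up values for `A`, `b`, and the 3D point (assumptions for illustration only). It compares the homogeneous 3x4 form with the plain linear map plus translation, and also projects the world origin.

```python
import numpy as np

# Hypothetical affine camera parameters (illustrative values).
A = np.array([[1.0, 0.2, 0.0],
              [0.0, 0.9, 0.3]])
b = np.array([5.0, -2.0])

# Homogeneous 3x4 form: [[A, b], [0 0 0 1]].
M = np.vstack([np.hstack([A, b[:, None]]), [0.0, 0.0, 0.0, 1.0]])

X = np.array([2.0, 1.0, 4.0])      # a 3D point (not homogeneous)

p_hom = M @ np.append(X, 1.0)      # third coordinate stays exactly 1
p_lin = A @ X + b                  # same pixel without homogeneous coordinates
origin_proj = A @ np.zeros(3) + b  # the origin maps to b, since A @ 0 = 0

print(p_hom, p_lin, origin_proj)
```

Because the last row of `M` is `[0, 0, 0, 1]`, no division by depth ever happens, which is what makes the later factorization linear.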

  22. Affine structure from motion. General structure from motion: $p_{ij} \equiv M_i X_j$ ($p_{ij}$ is 3x1, $M_i$ is 3x4, $X_j$ is 4x1). Assume $M$ is affine camera: $p_{ij} = A_i X_j + b_i$ ($p_{ij}$ is 2x1, $A_i$ is 2x3, $X_j$ is 3x1, $b_i$ is 2x1). $mn$ 2D points, $m$ cameras, $n$ 3D points, up to arbitrary 3D affine (12 DOF). Need: $2mn \geq 8m + 3n - 12$. ($m = 2$): $n \geq 4$ (for all $m$!)
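The "(for all m!)" remark follows from rearranging the counting inequality. A short derivation, assuming $m \geq 2$ so that $2m - 3 > 0$ (each affine camera contributes 8 unknowns, each point 3, and 12 are removed by the affine ambiguity):

```latex
\begin{aligned}
2mn &\geq 8m + 3n - 12 \\
2mn - 3n &\geq 8m - 12 \\
n\,(2m - 3) &\geq 4\,(2m - 3) \\
n &\geq 4 \quad \text{for every } m \geq 2
\end{aligned}
```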

  23. One simplifying trick. Subtract off the average 2D point: $\hat{p}_{ij} = p_{ij} - \frac{1}{n} \sum_{k=1}^{n} p_{ik} = A_i X_j + b_i - \frac{1}{n} \sum_{k=1}^{n} (A_i X_k + b_i)$. Gather terms involving $A_i$, push out $b_i$: $\hat{p}_{ij} = A_i \left( X_j - \frac{1}{n} \sum_{k=1}^{n} X_k \right) + b_i - \frac{1}{n} \sum_{k=1}^{n} b_i$, where the $b_i$ terms cancel to 0. Set origin to mean of 3D points. Can do this entirely in terms of $A$! $\hat{p}_{ij} = A_i X_j$
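The centering trick can be checked numerically. The following is a sketch on synthetic data (random camera, offsets, and points generated here for illustration, not taken from the slides): after subtracting the mean 2D point, the offset `b` disappears and only `A` acting on mean-centered 3D points remains.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Synthetic, illustrative data: one affine camera and n 3D points (columns).
A = rng.normal(size=(2, 3))
b = rng.normal(size=(2, 1))
X = rng.normal(size=(3, n))

p = A @ X + b                                    # 2 x n observed points
p_hat = p - p.mean(axis=1, keepdims=True)        # subtract the average 2D point
X_centered = X - X.mean(axis=1, keepdims=True)   # origin at mean of 3D points

# The translation b has dropped out entirely.
max_diff = np.abs(p_hat - A @ X_centered).max()
print(max_diff)
```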

  24. Affine structure from motion. First, make data measurement matrix consisting of all the points stacked together: $\begin{bmatrix} \hat{p}_{11} & \cdots & \hat{p}_{1n} \\ \vdots & \ddots & \vdots \\ \hat{p}_{m1} & \cdots & \hat{p}_{mn} \end{bmatrix} = \begin{bmatrix} \hat{u}_{11} & \cdots & \hat{u}_{1n} \\ \hat{v}_{11} & \cdots & \hat{v}_{1n} \\ \vdots & \ddots & \vdots \\ \hat{u}_{m1} & \cdots & \hat{u}_{mn} \\ \hat{v}_{m1} & \cdots & \hat{v}_{mn} \end{bmatrix}$, with rows indexed by the $m$ cameras and columns by the $n$ points. How big is this matrix? C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method. IJCV, 9(2):137-154, November 1992.

  25. Affine structure from motion. Then, write all the equations in one in terms of product of cameras and points: $D = \begin{bmatrix} \hat{p}_{11} & \cdots & \hat{p}_{1n} \\ \vdots & \ddots & \vdots \\ \hat{p}_{m1} & \cdots & \hat{p}_{mn} \end{bmatrix} = \begin{bmatrix} A_1 \\ \vdots \\ A_m \end{bmatrix} \begin{bmatrix} X_1 & \cdots & X_n \end{bmatrix}$, i.e. $D$ (2m x n) $= M$ (2m x 3) $S$ (3 x n). What's the rank of $D$? 3! C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method. IJCV, 9(2):137-154, November 1992.

  26. Making Matrices Rank Deficient. Repeat of epipolar geometry class, but important enough to see twice. Given matrix $M$: $M \rightarrow U \Sigma V^T$, with rotation matrices $U_{m \times m}$, $V_{n \times n}$ and diagonal scaling matrix $\Sigma_{m \times n}$ with diagonal entries $\sigma_1, \ldots, \sigma_m$. Keep only the $k$ biggest $\sigma$; set others to 0, giving $\hat{\Sigma}$ and $\hat{M} \leftarrow U \hat{\Sigma} V^T$. Minimizes $\|M - \hat{M}\|_F$ (sum of squares) subject to rank($\hat{M}$) $\leq k$. See Eckart-Young-Mirsky theorem if you're interested.
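A minimal sketch of the truncation step using `numpy.linalg.svd`. The helper name `low_rank_approx` and the random test matrix are assumptions for illustration. It also checks the known identity that the Frobenius error of the truncation equals the root of the sum of the squared dropped singular values.

```python
import numpy as np

def low_rank_approx(M, k):
    """Best rank-k approximation of M in Frobenius norm via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_hat = s.copy()
    s_hat[k:] = 0.0                  # keep only the k biggest singular values
    return (U * s_hat) @ Vt, s       # (U * s_hat) scales the columns of U

rng = np.random.default_rng(0)
M = rng.normal(size=(6, 4))          # illustrative random matrix
k = 2

M_hat, s = low_rank_approx(M, k)
err = np.linalg.norm(M - M_hat)      # Frobenius norm by default for matrices
expected = np.sqrt(np.sum(s[k:] ** 2))
print(err, expected)
```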

  27. Affine structure from motion. We'd like to take the measurements and convert them into $M$, $S$: $D$ (2m x n) $= M$ (2m x 3) x $S$ (3 x n). Remake of M. Hebert diagram

  28. Affine structure from motion. Do SVD (typically you don't make full $U$, $\Sigma$, $V$): $D = U$ x $\Sigma$ x $V^T$, where $D$ is 2m x n. Truncate to top 3 singular values: $D = U_3$ x $\Sigma_3$ x $V_3^T$. [Diagram: matrix blocks labeled with sizes 2m and n.] Remake of M. Hebert diagram

  29. Affine structure from motion. Nearly there apart from this annoying $\Sigma_3$: $D = U_3$ x $\Sigma_3$ x $V_3^T$. One solution (split $\Sigma_3$ in two): $D = (U_3 \Sigma_3^{1/2}) (\Sigma_3^{1/2} V_3^T)$, with $M = U_3 \Sigma_3^{1/2}$ and $S = \Sigma_3^{1/2} V_3^T$, so $D = M$ x $S$. But remember that we can put $H H^{-1}$ in the middle. Remake of M. Hebert diagram
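Putting the centering, SVD, and square-root split together gives a short factorization routine. This is a sketch on noise-free synthetic data, not the course's reference implementation: the function name `affine_factorization`, the `(m, 2, n)` input layout, and the random cameras and points are all assumptions made for illustration.

```python
import numpy as np

def affine_factorization(p):
    """Factor 2D tracks p of shape (m, 2, n) into M (2m x 3) and S (3 x n)."""
    m, _, n = p.shape
    D = p.reshape(2 * m, n)                  # rows: u and v for each camera
    D = D - D.mean(axis=1, keepdims=True)    # subtract the average 2D point
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    root = np.sqrt(s[:3])                    # split Sigma_3 in two
    M = U[:, :3] * root                      # U_3 Sigma_3^(1/2)
    S = root[:, None] * Vt[:3]               # Sigma_3^(1/2) V_3^T
    return D, M, S

# Synthetic, illustrative data: m affine cameras viewing n points.
rng = np.random.default_rng(0)
m, n = 4, 10
A_true = rng.normal(size=(m, 2, 3))
b_true = rng.normal(size=(m, 2, 1))
X_true = rng.normal(size=(3, n))
p = A_true @ X_true + b_true                 # shape (m, 2, n)

D, M, S = affine_factorization(p)
residual = np.linalg.norm(D - M @ S)
print(residual)
```

The recovered `M` and `S` reproduce `D` but generally differ from `A_true` and the centered `X_true` by an unknown invertible 3x3 matrix, which is the affine ambiguity the next slide addresses.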

  30. Eliminating the affine ambiguity. The rows of $A_i$ (call them $a_1$, $a_2$) give the axes of the camera. Can multiply each projection $A_i$ with $C$ to make $A_i C$ whose rows satisfy: $a_1^T a_2 = 0$, $\|a_1\| = 1$, $\|a_2\| = 1$. Gives 3 equations per camera, can set $A_i C$ to new camera, and $C^{-1} S$ to new points. In general, a recipe for eliminating ambiguities. [Diagram: camera axes $a_1$, $a_2$, a 3D point $X$ and its projection $p$.] Remake of M. Hebert diagram
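One common way to solve for $C$, sketched here as an assumption rather than as the slide's exact procedure: the three constraints are linear in the symmetric matrix $L = C C^T$, so $L$ can be found by least squares and $C$ recovered with a Cholesky factorization. The helper names `sym_row` and `metric_upgrade`, the distortion `G`, and the synthetic cameras are hypothetical.

```python
import numpy as np

def sym_row(a, b):
    """Coefficients of a^T L b in the 6 unique entries of a symmetric 3x3 L."""
    return np.array([a[0] * b[0], a[0] * b[1] + a[1] * b[0],
                     a[0] * b[2] + a[2] * b[0], a[1] * b[1],
                     a[1] * b[2] + a[2] * b[1], a[2] * b[2]])

def metric_upgrade(M):
    """Find C so that each 2x3 block of M @ C has orthonormal rows."""
    rows, rhs = [], []
    for i in range(0, M.shape[0], 2):
        a1, a2 = M[i], M[i + 1]
        rows += [sym_row(a1, a1), sym_row(a2, a2), sym_row(a1, a2)]
        rhs += [1.0, 1.0, 0.0]               # unit norms, zero dot product
    l = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    return np.linalg.cholesky(L)             # raises if L is not positive definite

# Synthetic check: orthonormal-row cameras distorted by a made-up affine map G.
rng = np.random.default_rng(1)
G = np.array([[2.0, 0.3, 0.0],
              [0.0, 1.0, 0.5],
              [0.1, 0.0, 1.5]])
M_true = np.vstack([np.linalg.qr(rng.normal(size=(3, 3)))[0][:2] for _ in range(4)])
M_affine = M_true @ np.linalg.inv(G)         # what a factorization might return

C = metric_upgrade(M_affine)
M_metric = M_affine @ C                      # new cameras; new points would be inv(C) @ S
```

With noisy data the least-squares `L` may fail to be positive definite, in which case the Cholesky step raises an error and a different strategy would be needed.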

  31. Reconstruction results C. Tomasi and T. Kanade, Shape and motion from image streams under orthography: A factorization method, IJCV 1992

  32. Dealing with missing data. So far, assume we can see all points in all views. In reality, measurement matrix typically looks like this: [Figure: sparse cameras-by-points visibility matrix.] Possible solution: find dense blocks, solve in block, fuse. In general, finding these dense blocks is NP-complete. Figure Credit: S. Lazebnik

  33. But cameras aren't affine! Want: $m$ cameras $M_i$, $n$ 3D points $X_j$. Given: $mn$ 2D points $p_{ij}$. $p_{ij} \equiv M_i X_j = M_i H^{-1} H X_j$

  34. When is this Possible? Want: $m$ cameras $M_i$, $n$ 3D points $X_j$. Given: $mn$ 2D points $p_{ij}$. $p_{ij} \equiv M_i X_j = M_i H^{-1} H X_j$. Degrees of freedom: 2D point (2), 3x4 camera matrix (11, why?), 3D point (3), 4x4 homography (15, why?). Need $2mn \geq 11m + 3n - 15$. ($m = 2$): $n \geq 7$. ($m = 3$): $n \geq 6$ (doesn't get better after). ($m = 1$): $n \leq 4$
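The counting argument is easy to tabulate. A small sketch, with a hypothetical helper `min_points` that searches for the smallest $n$ satisfying the inequality (intended for $m \geq 2$; the parameter names and the search bound are assumptions):

```python
def min_points(m, cam_dof=11, ambiguity_dof=15, max_n=1000):
    """Smallest n with 2*m*n >= cam_dof*m + 3*n - ambiguity_dof, for m >= 2."""
    for n in range(1, max_n + 1):
        measurements = 2 * m * n                      # two numbers per 2D point
        unknowns = cam_dof * m + 3 * n - ambiguity_dof
        if measurements >= unknowns:
            return n
    return None

projective = [min_points(m) for m in (2, 3, 4, 100)]
affine = [min_points(m, cam_dof=8, ambiguity_dof=12) for m in (2, 3, 100)]
print(projective)  # [7, 6, 6, 6]
print(affine)      # [4, 4, 4]
```

This is only a necessary condition from counting equations and unknowns; it says nothing about degenerate configurations.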

  35. Two Camera Case. For two cameras, we need 7 points. Hmm. What else (in theory) requires 7 points? Compute fundamental matrix $F$ and epipole $b$ s.t. $F^T b = 0$. Then: $M_1 = [I, 0]$, $M_2 = [-[b]_\times F, b]$. Remember: this is up to a projective ambiguity! [Diagram: 3D point $X$ seen as $p$ and $p'$ by cameras $M_1$ and $M_2$, with epipole $b$.]
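A sketch of this construction in NumPy, assuming the convention $p'^T F p = 0$ with $p$ in the first image. The helper names `skew` and `cameras_from_F`, and the example $F$ (built from a made-up camera pair), are illustrative. The check projects random points through the recovered cameras and confirms they satisfy the epipolar constraint for the original $F$.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def cameras_from_F(F):
    """Canonical camera pair M1 = [I, 0], M2 = [-[b]_x F, b] with F^T b = 0."""
    U, _, _ = np.linalg.svd(F)
    b = U[:, -1]                       # left null vector of a rank-2 F
    M1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    M2 = np.hstack([-skew(b) @ F, b[:, None]])
    return M1, M2

# Illustrative rank-2 F built from a made-up second camera [A_cam | t_cam].
A_cam = np.array([[1.0, 0.1, 0.0],
                  [0.0, 1.0, 0.2],
                  [0.1, 0.0, 1.0]])
t_cam = np.array([1.0, 0.5, 0.2])
F = skew(t_cam) @ A_cam

M1, M2 = cameras_from_F(F)
b = M2[:, 3]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(3, 5)), np.ones((1, 5))])  # homogeneous points
p1, p2 = M1 @ X, M2 @ X
residuals = np.einsum("ij,ik,kj->j", p2, F, p1)            # p2^T F p1 per point
print(np.abs(residuals).max())
```

The recovered pair generally differs from the cameras used to build `F`, consistent with the projective ambiguity noted on the slide.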

  36. Incremental SFM. Key idea: incrementally add cameras, points. [Diagram: cameras-by-points grid with rows $M_1$, $M_2$ and unknown (?) entries.] Remake of S. Lazebnik material. Note: numbers of points aren't to scale.

  37. Incremental SFM. Key idea: incrementally add cameras, points. 1. Initialize motion $M_i = [R_i, t_i]$ with fundamental matrix. [Diagram: cameras-by-points grid with rows $M_1$, $M_2$ and unknown (?) entries.] Remake of S. Lazebnik material. Note: numbers of points aren't to scale.
