C18 Computer Vision Lecture 5: Imaging geometry, camera calibration



  1. C18 Computer Vision Lecture 5 Imaging geometry, camera calibration Victor Adrian Prisacariu http://www.robots.ox.ac.uk/~victor

  2. InfiniDense DEMO

  3. Course Content • Projective geometry, camera calibration. • Salient feature detection. • Recovering 3D from two images I: epipolar geometry. • Recovering 3D from two images II: stereo correspondences, triangulation, neural nets. Slides at http://www.robots.ox.ac.uk/~victor -> Teaching. Lots borrowed from David Murray + AV C18.

  4. Useful Texts • Multiple View Geometry in Computer Vision, Richard Hartley, Andrew Zisserman • Computer Vision: A Modern Approach, David Forsyth, Jean Ponce; Prentice Hall; ISBN:0130851981 • 3-Dimensional Computer Vision: A Geometric Viewpoint, Olivier Faugeras

  5. Computer Vision: This time… 5. Imaging geometry, camera calibration: 5.1 Introduction. 5.2 The perspective camera as a geometric device. 5.3 Perspective using homogeneous coordinates. 5.4 Calibrating the elements of the perspective model. 6. Salient feature detection and description. 7. Recovering 3D from two images I: epipolar geometry. 8. Recovering 3D from two images II: stereo correspondences, triangulation, neural nets.

  6. 5.1 Introduction The aim in geometric computational vision is to take a number of 2D images and obtain an understanding of the 3D environment: what is in it, and how it evolves over time. What do we have here …? … seems very easy …

  7. It isn’t …

  8. Organizing the tricks … Although human and (3D) computer vision might be bags of tricks, it is useful to place the tricks within larger processing paradigms. For example: a) Data-driven, bottom-up processing. b) Model-driven, top-down, generative processing. c) Dynamic Vision (mixes bottom-up with top-down feedback). d) Active Vision (task oriented). e) Data-driven discriminative approach (machine learning). These are neither all-embracing nor exclusive.

  9. (a) Data-driven, bottom-up processing • Image processing produces a map of salient 2D features. • Features are input into a range of shape-from-X processes whose output is the 2.5D sketch. • Only in the last stage do we get a fully 3D object-centered description.

  10. (b) Model-driven, and (c) Dynamic vision • Model-driven, top-down, generative processing: – A model of the scene is assumed known. – Supply a pose for the object relative to the camera, and use projection to predict where salient features should be found in the image space. – Search for the features, and refine the pose by minimizing the observed deviation. • Dynamic vision: mixes bottom-up and top-down processing by introducing top-down dynamic feedback.

  11. (d) Active Vision • Introduces task-oriented sensing-perception-action loops: – Visual data need only be “good enough” to drive the particular action. • No need to build and maintain an overarching representation of the surroundings. • Computational resources are focused where they are needed.

  12. (e) Data-driven approach • The aim is to learn a description of the transformation between input and output using exemplars. • Geometry is not forgotten, but implicit learned representations are favored.

  13. 5.2 The perspective camera as a geometric device

  14. This is (a picture of) my cat [Figure: image of a cat with axes running from 0 to 520; the cat's nose is marked at $\mathbf{x} = (295, 308)^T$.]

  15. My cat lives in a 3D world $\mathbf{X} = (X_1, X_2, X_3)^T$, $\mathbf{x} = (x_1, x_2)^T$. The point $\mathbf{X}$ in world space projects to the point $\mathbf{x}$ in image space.

  16. Going from X in 3D to x in 2D? $\mathbf{X} = (X_1, X_2, X_3)^T$, $\mathbf{x} = (x_1, x_2)^T$. [Figure: film/sensor facing the cat.] The output would be blurry if the film were simply exposed to the cat.

  17. Going from X in 3D to x in 2D? $\mathbf{X} = (X_1, X_2, X_3)^T$, $\mathbf{x} = (x_1, x_2)^T$. [Figure: film/sensor, barrier, cat.] Blur reduced, looks good ☺

  18. Pinhole Camera $\mathbf{X} = (X_1, X_2, X_3)^T$, $\mathbf{x} = (x_1, x_2)^T$. [Figure: image plane, pinhole, cat.] All rays pass through the center of projection (a single point). The image forms on the image plane.

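  19. Pinhole Camera [Figure: image plane, optical axis, f, p, o.] The 3D point $\mathbf{X} = (X_1, X_2, X_3)^T$ is imaged into $\mathbf{x} = (x_1, x_2)^T$ as: $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} f X_1 / X_3 \\ f X_2 / X_3 \end{bmatrix}$, where f is the focal length, o the camera origin, and p the principal point.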
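
The pinhole equation above can be tried out numerically. The following is a minimal sketch, assuming the point is already expressed in the camera frame with the optical axis along $X_3$; the function name `project_pinhole` and the numbers are illustrative and not from the slides.

```python
def project_pinhole(X, f):
    """Project a 3D point X = (X1, X2, X3), given in the camera frame,
    to image coordinates (x1, x2) = (f*X1/X3, f*X2/X3)."""
    X1, X2, X3 = X
    if X3 == 0:
        # The ray is parallel to the image plane: no finite image point.
        raise ValueError("X3 = 0: point has no finite image")
    return (f * X1 / X3, f * X2 / X3)


# Hypothetical values: focal length 0.05, same point at two depths.
near = project_pinhole((0.2, 0.1, 2.0), f=0.05)
far = project_pinhole((0.2, 0.1, 4.0), f=0.05)
print(near, far)
```

Doubling the depth $X_3$ halves both image coordinates, which is the non-linearity (division by $X_3$) that the perspective model has to deal with.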

  20. Homogeneous coordinates • The projection $\mathbf{x} = f\mathbf{X}/X_3$ is non-linear. • It can be made linear using homogeneous coordinates, which involves representing the image and scene in a higher-dimensional space. • Limiting cases, e.g. vanishing points, are handled better. • Homogeneous coordinates allow transformations to be concatenated more easily.

  21. 3D Euclidean transforms: inhomogeneous coordinates • My cat moves through 3D space. • The movement of the tip of the nose can be described using a Euclidean transform: $\mathbf{X}'_{3\times1} = \mathbf{R}_{3\times3}\mathbf{X}_{3\times1} + \mathbf{t}_{3\times1}$, where $\mathbf{R}$ is the rotation and $\mathbf{t}$ the translation.

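  22. 3D Euclidean transforms: inhomogeneous coordinates • Euclidean transform: $\mathbf{X}'_{3\times1} = \mathbf{R}_{3\times3}\mathbf{X}_{3\times1} + \mathbf{t}_{3\times1}$ • Concatenation of successive transforms is a mess! • $\mathbf{X}_1 = \mathbf{R}_1\mathbf{X} + \mathbf{t}_1$ • $\mathbf{X}_2 = \mathbf{R}_2\mathbf{X}_1 + \mathbf{t}_2$ • $\mathbf{X}_2 = \mathbf{R}_2(\mathbf{R}_1\mathbf{X} + \mathbf{t}_1) + \mathbf{t}_2 = \mathbf{R}_2\mathbf{R}_1\mathbf{X} + \mathbf{R}_2\mathbf{t}_1 + \mathbf{t}_2$.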
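
As a worked illustration that is not on the slides, adding a hypothetical third transform $\mathbf{X}_3 = \mathbf{R}_3\mathbf{X}_2 + \mathbf{t}_3$ shows how the translation term keeps growing with every extra step:

```latex
\begin{aligned}
\mathbf{X}_3 &= \mathbf{R}_3 \mathbf{X}_2 + \mathbf{t}_3 \\
             &= \mathbf{R}_3 \left( \mathbf{R}_2 \mathbf{R}_1 \mathbf{X} + \mathbf{R}_2 \mathbf{t}_1 + \mathbf{t}_2 \right) + \mathbf{t}_3 \\
             &= \mathbf{R}_3 \mathbf{R}_2 \mathbf{R}_1 \mathbf{X}
                + \mathbf{R}_3 \mathbf{R}_2 \mathbf{t}_1
                + \mathbf{R}_3 \mathbf{t}_2
                + \mathbf{t}_3
\end{aligned}
```

Each additional transform adds one more translation term, and every earlier translation picks up another rotation factor.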

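  23. 3D Euclidean transforms: homogeneous coordinates • We replace the 3D point $(X, Y, Z)^T$ with a four-vector $(X, Y, Z, 1)^T$. • The Euclidean transform becomes: $\begin{bmatrix} \mathbf{X}' \\ 1 \end{bmatrix} = \begin{bmatrix} \mathbf{R} & \mathbf{t} \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} \mathbf{X} \\ 1 \end{bmatrix} = \mathbf{E} \begin{bmatrix} \mathbf{X} \\ 1 \end{bmatrix}$ • Transformations can now be concatenated by matrix multiplication: $\begin{bmatrix} \mathbf{X}_1 \\ 1 \end{bmatrix} = \mathbf{E}_{10} \begin{bmatrix} \mathbf{X}_0 \\ 1 \end{bmatrix}$, $\begin{bmatrix} \mathbf{X}_2 \\ 1 \end{bmatrix} = \mathbf{E}_{21} \begin{bmatrix} \mathbf{X}_1 \\ 1 \end{bmatrix}$ → $\begin{bmatrix} \mathbf{X}_2 \\ 1 \end{bmatrix} = \mathbf{E}_{21}\mathbf{E}_{10} \begin{bmatrix} \mathbf{X}_0 \\ 1 \end{bmatrix}$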
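
The concatenation rule can be checked with a few lines of plain Python. This is only a sketch: the helpers `euclidean_matrix`, `matmul`, `matvec` and `rot_z`, and the two example transforms (90° rotations about the third axis with made-up translations), are assumptions for illustration.

```python
import math


def euclidean_matrix(R, t):
    """Build the 4x4 matrix E = [[R, t], [0^T, 1]] from a 3x3 R and a 3-vector t."""
    return [list(R[i]) + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def matmul(A, B):
    """Matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matvec(A, v):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def rot_z(theta):
    """Rotation by theta (radians) about the third axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Two hypothetical Euclidean transforms.
E10 = euclidean_matrix(rot_z(math.pi / 2), [1.0, 0.0, 0.0])
E21 = euclidean_matrix(rot_z(math.pi / 2), [0.0, 2.0, 0.0])
X0 = [1.0, 0.0, 0.0, 1.0]  # homogeneous 4-vector

step_by_step = matvec(E21, matvec(E10, X0))  # apply E10, then E21
E20 = matmul(E21, E10)                       # single combined matrix
combined = matvec(E20, X0)
print(step_by_step, combined)
```

The last column of `E20` holds $\mathbf{R}_2\mathbf{t}_1 + \mathbf{t}_2$, the same term that has to be tracked by hand in the inhomogeneous form.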

  24. Homogeneous coordinates – definition in $\mathbb{R}^3$ • $\mathbf{X} = (X, Y, Z)^T$ is represented in homogeneous coordinates by any 4-vector $(X_1, X_2, X_3, X_4)^T$ such that $X = X_1/X_4$, $Y = X_2/X_4$, and $Z = X_3/X_4$. • So the following homogeneous vectors represent the same point, for any $\lambda \neq 0$: $(X_1, X_2, X_3, X_4)^T$ and $\lambda (X_1, X_2, X_3, X_4)^T$. • E.g. $(2, 3, 5, 1)^T$ is the same as $(-3, -4.5, -7.5, -1.5)^T$, and both represent the same inhomogeneous point $(2, 3, 5)^T$.

  25. Homogeneous coordinates – definition in $\mathbb{R}^2$ • $\mathbf{x} = (x, y)^T$ is represented in homogeneous coordinates by any 3-vector $(x_1, x_2, x_3)^T$ such that $x = x_1/x_3$, $y = x_2/x_3$. • E.g. $(1, 2, 3)^T$ is the same as $(3, 6, 9)^T$, and both represent the same inhomogeneous point $(1/3, 2/3)^T \approx (0.33, 0.67)^T$.
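
The "same point up to scale" property from the two definitions can be tested directly. A small sketch, with `dehomogenize` and `same_point` as hypothetical helper names; it reuses the numerical examples given on the slides.

```python
def dehomogenize(v):
    """Divide by the last component and drop it."""
    w = v[-1]
    if w == 0:
        raise ValueError("last component is 0: no inhomogeneous equivalent")
    return [c / w for c in v[:-1]]


def same_point(u, v, tol=1e-9):
    """True if two homogeneous vectors represent the same inhomogeneous point."""
    a, b = dehomogenize(u), dehomogenize(v)
    return len(a) == len(b) and all(abs(p - q) <= tol for p, q in zip(a, b))


in_3d = same_point([2, 3, 5, 1], [-3, -4.5, -7.5, -1.5])  # lambda = -1.5
in_2d = same_point([1, 2, 3], [3, 6, 9])                   # lambda = 3
different = same_point([1, 2, 3], [1, 2, 4])
print(in_3d, in_2d, different)
```

Comparing after division by the last component is one simple choice; it does not cover vectors whose last component is zero, which this sketch rejects explicitly.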

  26. Homogeneous notation – rules for use 1. Convert the inhomogeneous point to a homogeneous vector: $(X, Y, Z)^T \rightarrow (X, Y, Z, 1)^T$. 2. Apply a $4 \times 4$ transform. 3. Dehomogenize the resulting vector: $(X_1, X_2, X_3, X_4)^T \rightarrow (X_1/X_4, X_2/X_4, X_3/X_4)^T$.
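
The three rules map onto three small functions. The sketch below assumes a made-up non-singular 4×4 matrix `M` whose last row is not $(0, 0, 0, 1)$, chosen so that the fourth component changes and the final division visibly matters; none of the names come from the slides.

```python
def to_homogeneous(X):
    """Rule 1: append a 1 to the inhomogeneous point."""
    return list(X) + [1.0]


def apply_transform(M, v):
    """Rule 2: multiply by the 4x4 matrix M."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]


def from_homogeneous(v):
    """Rule 3: divide by the last component and drop it."""
    w = v[-1]
    if w == 0:
        raise ValueError("last component is 0: cannot dehomogenize")
    return [c / w for c in v[:-1]]


# Hypothetical transform: identity except for the last row (determinant 1).
M = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0, 1.0]]

Xh = to_homogeneous((2.0, 4.0, 1.0))  # [2, 4, 1, 1]
Yh = apply_transform(M, Xh)           # fourth component becomes X3 + 1
Y = from_homogeneous(Yh)
print(Xh, Yh, Y)
```

For a Euclidean transform the fourth component stays 1 and rule 3 is a no-op; it only does real work once the last row of the matrix is non-trivial.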

  27. Projective transformations • A projective transformation is a linear transformation on homogeneous 4-vectors represented by a non-singular 4x4 matrix: $\begin{bmatrix} X'_1 \\ X'_2 \\ X'_3 \\ X'_4 \end{bmatrix} = \begin{bmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ p_{31} & p_{32} & p_{33} & p_{34} \\ p_{41} & p_{42} & p_{43} & p_{44} \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \end{bmatrix}$ • The effect on the homogeneous points is that the original and transformed points are linked through a projection center. • The 4x4 matrix is defined up to scale, and so has 15 degrees of freedom.

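  28. More 3D-3D and 2D-2D Transforms
      3D-3D:
      • Projective (15 dof): $(X'_1, X'_2, X'_3, X'_4)^T = \mathbf{P}_{4\times4} (X_1, X_2, X_3, X_4)^T$
      • Affine (12 dof): $\begin{bmatrix} \mathbf{X}' \\ 1 \end{bmatrix} = \begin{bmatrix} \mathbf{A}_{3\times3} & \mathbf{t}_3 \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} \mathbf{X} \\ 1 \end{bmatrix}$
      • Similarity (7 dof): $\begin{bmatrix} \mathbf{X}' \\ 1 \end{bmatrix} = \begin{bmatrix} s\mathbf{R}_{3\times3} & \mathbf{t}_3 \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} \mathbf{X} \\ 1 \end{bmatrix}$
      • Euclidean (6 dof): $\begin{bmatrix} \mathbf{X}' \\ 1 \end{bmatrix} = \begin{bmatrix} \mathbf{R}_{3\times3} & \mathbf{t}_3 \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} \mathbf{X} \\ 1 \end{bmatrix}$
      2D-2D:
      • Projective (aka Homography, 8 dof): $(x'_1, x'_2, x'_3)^T = \mathbf{H}_{3\times3} (x_1, x_2, x_3)^T$
      • Affine (6 dof): $\begin{bmatrix} \mathbf{x}' \\ 1 \end{bmatrix} = \begin{bmatrix} \mathbf{A}_{2\times2} & \mathbf{t}_2 \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ 1 \end{bmatrix}$
      • Similarity (4 dof): $\begin{bmatrix} \mathbf{x}' \\ 1 \end{bmatrix} = \begin{bmatrix} s\mathbf{R}_{2\times2} & \mathbf{t}_2 \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ 1 \end{bmatrix}$
      • Euclidean (3 dof): $\begin{bmatrix} \mathbf{x}' \\ 1 \end{bmatrix} = \begin{bmatrix} \mathbf{R}_{2\times2} & \mathbf{t}_2 \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ 1 \end{bmatrix}$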

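  29. 2D-2D Transform Examples
      • Euclidean (3 DoF): $\begin{bmatrix} \cos\theta & -\sin\theta & t_x \\ \sin\theta & \cos\theta & t_y \\ 0 & 0 & 1 \end{bmatrix}$
      • Similarity (4 DoF): $\begin{bmatrix} s\cos\theta & -s\sin\theta & t_x \\ s\sin\theta & s\cos\theta & t_y \\ 0 & 0 & 1 \end{bmatrix}$
      • Affine (6 DoF): $\begin{bmatrix} a_{11} & a_{12} & t_x \\ a_{21} & a_{22} & t_y \\ 0 & 0 & 1 \end{bmatrix}$
      • Projective (8 DoF): $\begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix}$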
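
To see what the extra degree of freedom in the similarity buys, the matrices above can be built and applied to a pair of points. This is a sketch with hypothetical names (`similarity_2d`, `transform_point`, `dist`) and made-up parameter values; a Euclidean transform is obtained as the special case $s = 1$.

```python
import math


def similarity_2d(s, theta, tx, ty):
    """3x3 similarity matrix; s = 1 gives a Euclidean transform."""
    c, sn = math.cos(theta), math.sin(theta)
    return [[s * c, -s * sn, tx],
            [s * sn, s * c, ty],
            [0.0, 0.0, 1.0]]


def transform_point(H, p):
    """Apply a 3x3 matrix to an inhomogeneous 2D point."""
    x, y = p
    v = [row[0] * x + row[1] * y + row[2] for row in H]
    return (v[0] / v[2], v[1] / v[2])


def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


a, b = (0.0, 0.0), (3.0, 4.0)
H_euc = similarity_2d(1.0, math.radians(30), 5.0, -2.0)
H_sim = similarity_2d(2.0, math.radians(30), 5.0, -2.0)

d_orig = dist(a, b)
d_euc = dist(transform_point(H_euc, a), transform_point(H_euc, b))
d_sim = dist(transform_point(H_sim, a), transform_point(H_sim, b))
print(d_orig, d_euc, d_sim)
```

The Euclidean transform leaves the distance between the two points unchanged, while the similarity multiplies it by $s$.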

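  30. Perspective 3D-2D Transforms
      • Similar to a 3D-3D projective transform, but constrain the transformed point to a plane $z = f$, so that in homogeneous form $\mathbf{X}_{\text{image}} = (x_1, x_2, f, 1)^T$.
      • Because $z = f$ is fixed, we can write: $\lambda \begin{bmatrix} x_1 \\ x_2 \\ f \\ 1 \end{bmatrix} = \begin{bmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ f p_{31} & f p_{32} & f p_{33} & f p_{34} \\ p_{31} & p_{32} & p_{33} & p_{34} \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ 1 \end{bmatrix}$
      • The 3rd row is redundant, so: $\lambda \begin{bmatrix} x_1 \\ x_2 \\ 1 \end{bmatrix} = \begin{bmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ p_{31} & p_{32} & p_{33} & p_{34} \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ 1 \end{bmatrix} = \mathbf{P}_{3\times4} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ 1 \end{bmatrix}$
      • $\mathbf{P}_{3\times4}$ is the projection matrix, and this is a perspective transform.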
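
A short numerical sketch of the 3×4 projection follows. It assumes the simplest matrix that reproduces the pinhole equations $x_1 = f X_1/X_3$, $x_2 = f X_2/X_3$, namely rows $(f, 0, 0, 0)$, $(0, f, 0, 0)$, $(0, 0, 1, 0)$; a general projection matrix has more non-zero entries. The function name `project` and the numbers are illustrative.

```python
def project(P, X):
    """Apply a 3x4 projection matrix to an inhomogeneous 3D point and
    dehomogenize: lambda * (x1, x2, 1)^T = P (X1, X2, X3, 1)^T."""
    Xh = list(X) + [1.0]
    u = [sum(p * c for p, c in zip(row, Xh)) for row in P]
    lam = u[2]
    if lam == 0:
        raise ValueError("lambda = 0: point has no finite image")
    return (u[0] / lam, u[1] / lam)


f = 0.05  # hypothetical focal length
P = [[f, 0.0, 0.0, 0.0],
     [0.0, f, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]

X = (0.2, 0.1, 2.0)
x = project(P, X)

# Scaling P by any non-zero constant leaves the image point unchanged.
P_scaled = [[3.0 * p for p in row] for row in P]
x_scaled = project(P_scaled, X)
print(x, x_scaled)
```

With this choice of matrix, $\lambda$ equals the depth $X_3$, so the division by $\lambda$ is exactly the non-linear step of the pinhole model, now postponed until after a single linear map.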
