Victor Adrian Prisacariu
http://www.robots.ox.ac.uk/~victor
C18 Computer Vision
Lecture 7
Recovering 3D from two images I: epipolar geometry
Computer Vision: This time
5. Imaging geometry, camera calibration.
6. Salient feature detection
7. Recovering 3D from two images I: epipolar geometry.
   1. Introduction
   2. Epipolar Geometry
   3. Algebraic Representation and the Fundamental Matrix
   4. Computing the Fundamental Matrix
   5. The Essential Matrix and Ego-Motion
correspondences, triangulation, neural nets.
7.1 Introduction: Forward and Inverse Mappings
Entities in the image backproject to entities of higher dimensionality in the scene.
Points backproject to lines in the scene, lines to planes, etc.
Introduction: What do single-view ambiguities tell us?
Single views are NOT sufficient to solve geometric problems in data-driven vision.
We need multiple views.
– Shape-from-stereo: different cameras, different viewpoints, same time.
– Structure-from-motion: same camera, different viewpoints, different times.
in model-based (easy) or data-driven (just recently) vision.
Introduction: Reconstruction from two views
In principle, recovering 3D structure is straightforward: find a bit of the scene that is observed by the two cameras, backproject the two rays, and find their intersection in the world.
There are three things to cover:
1. Understanding the geometry: epipolar geometry.
2. Determining which points in the images are from the same scene location: the correspondence problem.
3. Determining the 3D structure by back-projecting rays: triangulation.
Simple case: two identical parallel cameras
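– Some 3D world.
– Two views.
– Two identical cameras: same camera params ($f$).
– Separation only on x: $t_x$.
$$Y_{C1} = Y_{C2}, \quad Z_{C1} = Z_{C2}, \quad X_{C1} = X_{C2} + t_x$$
$$\frac{1}{Z} = \frac{1}{f\,t_x}\,(x_{C1} - x_{C2})$$
Reciprocal depth $1/Z$ is proportional to the horizontal disparity $x_{C1} - x_{C2}$.
[Figure: View 1 and View 2 of the 3D world, taken by cameras $C_1$ and $C_2$ separated by $t_x$.]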
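The disparity relation can be checked numerically. The following is a minimal sketch, assuming the sign convention $X_{C1} = X_{C2} + t_x$ stated on this slide; the function name `depth_from_disparity` and all numeric values are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(x_c1, x_c2, f, t_x):
    """Depth Z for two identical parallel cameras.

    Rearranges 1/Z = (x_c1 - x_c2) / (f * t_x).
    """
    disparity = x_c1 - x_c2
    if disparity == 0:
        return float("inf")  # zero disparity corresponds to a point at infinity
    return f * t_x / disparity


# Hypothetical set-up: f = 500 (pixels), t_x = 0.1, scene point at X_C2 = 0.2, Z = 2.0.
f, t_x, X_c2, Z = 500.0, 0.1, 0.2, 2.0
x_c2 = f * X_c2 / Z            # image x-coordinate in camera 2
x_c1 = f * (X_c2 + t_x) / Z    # image x-coordinate in camera 1 (X_C1 = X_C2 + t_x)
Z_recovered = depth_from_disparity(x_c1, x_c2, f, t_x)
```

Doubling the depth halves the disparity, which is why distant points are hard to range with a short baseline.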
Simple case: two identical parallel cameras
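$$\begin{cases} x_{C2} = \dfrac{f}{Z}(X + t_x) \\[4pt] y_{C2} = \dfrac{f}{Z}\,Y \end{cases} \;\Rightarrow\; \begin{cases} x_{C2} = x + \dfrac{f\,t_x}{Z} \\[4pt] y_{C2} = \dfrac{f}{Z}\,Y \end{cases}$$
[Figure: View 1 and View 2 of the 3D world, taken by cameras $C_1$ and $C_2$ separated by $t_x$.]
$Z$ ranges from 0 to ∞.
All possible matches lie on a straight line.
Finding a correspondence is a 1D search.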
backprojected ray in the other one.
always a straight line, and is called the epipolar line, which we label with $\mathbf{l}'$.
epipolar plane.
epipolar line.
line.
baseline, forming a pencil of planes.
epipole $\mathbf{e}'$, where the baseline pierces the image plane.
projection of the optical center of camera C into C'.
1. Converging cameras
Notice that the epipoles must often lie off the physical image planes. What would be a quick test you could carry out to see whether the epipole was on the image plane?
2. (Close to) parallel cameras
Epipolar geometry depends only on the relative pose of the cameras (i.e. the rotation and translation between them) and on their internal parameters.
It does not depend on the scene structure. Can you reason qualitatively why not?
7.3 Algebraic representation and the F matrix
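Before formulating the algebraic representation we must cover three things:
1. Points at ∞ in homogeneous coordinates.
2. Homogeneous notation for lines. You must notice a duality between lines and points …
3. Matrix representation of vector products.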
1> Points at ∞ in homogeneous coordinates
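Points on a 3D line through the point $\mathbf{A}$ with direction $\mathbf{D}$:
$$\mathbf{X} = \mathbf{A} + \mu\mathbf{D}$$
In homogeneous coordinates (equality up to scale in the last step):
$$\begin{bmatrix} X_1(\mu) \\ X_2(\mu) \\ X_3(\mu) \\ X_4(\mu) \end{bmatrix} = \begin{bmatrix} \mathbf{A} + \mu\mathbf{D} \\ 1 \end{bmatrix} = \begin{bmatrix} \mathbf{A} \\ 1 \end{bmatrix} + \mu \begin{bmatrix} \mathbf{D} \\ 0 \end{bmatrix} \simeq \frac{1}{\mu} \begin{bmatrix} \mathbf{A} \\ 1 \end{bmatrix} + \begin{bmatrix} \mathbf{D} \\ 0 \end{bmatrix}$$
As $\mu \to \infty$ the point tends to $\begin{bmatrix} \mathbf{D} \\ 0 \end{bmatrix}$.
– Points at infinity are equivalent to directions.
– Parallel lines in the scene meet at the same point.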
1> Points at ∞ in homogeneous coordinates
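The image of a point at infinity (a direction $\mathbf{D}$) is a vanishing point:
$$\mathbf{v} = K\,[R\,|\,\mathbf{t}] \begin{bmatrix} \mathbf{D} \\ 0 \end{bmatrix} = K R\,\mathbf{D}$$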
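The claim that parallel scene lines meet at one image point can be illustrated numerically. This is a sketch only: the calibration `K`, pose `R`, `t`, direction `D` and the helper `project` are all hypothetical values and names introduced for the example.

```python
import numpy as np

# Hypothetical camera P = K [R | t].
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.1, -0.2, 0.5])
P = K @ np.column_stack([R, t])

D = np.array([1.0, 0.0, 1.0])      # shared direction of the parallel lines
v = P @ np.append(D, 0.0)          # image of the point at infinity (D, 0)
v = v[:2] / v[2]                   # vanishing point in pixel coordinates


def project(X):
    """Project an inhomogeneous 3D point and return pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


# Points very far along two different lines that share the direction D.
mu = 1e9
far_on_line_1 = project(np.array([0.0, 0.0, 2.0]) + mu * D)
far_on_line_2 = project(np.array([1.0, 3.0, 5.0]) + mu * D)
```

Both far points land (to within a tiny fraction of a pixel) on `v`, which depends on `K`, `R` and `D` but not on `t` or on where each line starts.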
2> Homogeneous notation for lines
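A point in the image is written in homogeneous coordinates as $\mathbf{x} = (x_1, x_2, x_3)^\top$.
A line $l_1 x + l_2 y + l_3 = 0$ in 2D is represented by the homogeneous 3-vector $\mathbf{l} = (l_1, l_2, l_3)^\top$.
The point $\mathbf{x}$ lies on the line $\mathbf{l}$ if
$$\mathbf{l}^\top\mathbf{x} = \mathbf{x}^\top\mathbf{l} = 0$$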
2> Homogeneous notation for lines
Reminders:
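– A point $\mathbf{x}$ lies on a line $\mathbf{l}$ if $\mathbf{l} \cdot \mathbf{x} = 0$.
– The line joining two points $\mathbf{p}$ and $\mathbf{q}$ is $\mathbf{l} = \mathbf{p} \times \mathbf{q}$.
– The point where two lines $\mathbf{l}_1$ and $\mathbf{l}_2$ meet is $\mathbf{x} = \mathbf{l}_1 \times \mathbf{l}_2$.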
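The three reminders can be tried out directly with cross products. The points and lines below are made-up examples, not taken from the lecture.

```python
import numpy as np

# Two image points in homogeneous coordinates: (0, 0) and (1, 1).
p = np.array([0.0, 0.0, 1.0])
q = np.array([1.0, 1.0, 1.0])

l1 = np.cross(p, q)                  # line through p and q: -x + y = 0
l2 = np.array([1.0, 0.0, -2.0])      # the line x = 2

x = np.cross(l1, l2)                 # intersection of the two lines
x_inhom = x[:2] / x[2]               # back to inhomogeneous coordinates

# A line parallel to l1 (y = x + 1) meets it at a point at infinity:
l3 = np.array([-1.0, 1.0, -1.0])
x_inf = np.cross(l1, l3)             # third coordinate is zero
```

The same operation (a vector product) joins two points into a line and intersects two lines into a point, which is the duality the slide refers to.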
3> Matrix representation of vector products
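The vector product $\mathbf{a} \times \mathbf{b}$ is:
$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = \begin{bmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{bmatrix} = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = [\mathbf{a}]_\times \mathbf{b}$$
$[\mathbf{a}]_\times$ is a 3 × 3 skew-symmetric matrix and has rank 2.
$\mathbf{a}$ is the kernel of $[\mathbf{a}]_\times$, i.e. $[\mathbf{a}]_\times \mathbf{a} = \mathbf{0}$.
Example: compute the vector product of $\mathbf{l} = (1, 2, 3)^\top$ and $\mathbf{m} = (2, 3, 4)^\top$.
Pseudo-determinant method gives $(-1, 2, -1)^\top$.
Skew-sym method gives:
$$\begin{bmatrix} 0 & -3 & 2 \\ 3 & 0 & -1 \\ -2 & 1 & 0 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix} = \begin{bmatrix} -1 \\ 2 \\ -1 \end{bmatrix}$$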
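The worked example is easy to reproduce. The helper name `skew` is an assumption made for this sketch; NumPy has no built-in function of that name.

```python
import numpy as np


def skew(a):
    """Return the 3x3 skew-symmetric matrix [a]_x with skew(a) @ b == a x b."""
    a1, a2, a3 = a
    return np.array([[0.0, -a3, a2],
                     [a3, 0.0, -a1],
                     [-a2, a1, 0.0]])


l = np.array([1.0, 2.0, 3.0])
m = np.array([2.0, 3.0, 4.0])

via_matrix = skew(l) @ m                   # skew-symmetric matrix method
via_cross = np.cross(l, m)                 # direct vector product
rank = np.linalg.matrix_rank(skew(l))      # rank of [l]_x
kernel_check = skew(l) @ l                 # l is the null-vector of [l]_x
```

Both routes give $(-1, 2, -1)^\top$, and the matrix form is what lets a vector product be folded into a chain of matrix multiplications.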
Algebraic representation of Epipolar Geometry
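Epipolar geometry defines a mapping from point $\mathbf{x}$ to line $\mathbf{l}'$.
The mapping does not depend on the scene structure, so it depends only on the overall projection matrices $P$ and $P'$.
The mapping can be written as $\mathbf{l}' = F\mathbf{x}$, where $F$ is the fundamental matrix.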
Algebraic representation of Epipolar Geometry
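Take the first camera frame as the world coordinate system, so the overall projection matrix is:
$$P = K\,[I\,|\,\mathbf{0}]$$
Write the transformation between camera frames as:
$$\mathbf{X}' = \begin{bmatrix} R & \mathbf{t} \\ \mathbf{0}^\top & 1 \end{bmatrix} \mathbf{X}$$
so the second projection matrix is:
$$P' = K'\,[R\,|\,\mathbf{t}]$$
The intrinsic matrices $K$ and $K'$ may be different.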
Algebraic representation of Epipolar Geometry
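Step 1: back-project a ray from C.
A scene point $\mathbf{X}(\zeta)$ on the ray projects to $\mathbf{x}$:
$$\mathbf{x} = K\,[I\,|\,\mathbf{0}]\,\mathbf{X}(\zeta)$$
$$\mathbf{x} = \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = K\,[I\,|\,\mathbf{0}] \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = K \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \;\Rightarrow\; \mathbf{X}(\zeta) = \begin{bmatrix} \zeta K^{-1}\mathbf{x} \\ 1 \end{bmatrix} \text{ and } \mathbf{X}(\infty) = \begin{bmatrix} K^{-1}\mathbf{x} \\ 0 \end{bmatrix}$$
In effect, $K^{-1}$ corrects the direction of the ray. Direction $\mathbf{x}$ was measured in a non-ideal camera.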
Algebraic representation of Epipolar Geometry
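Step 2: choose two points on the ray and project them into the second camera C'.
– Point 1 with $\zeta = 0$ is the optical center $\begin{bmatrix} \mathbf{0} \\ 1 \end{bmatrix}$.
– Point 2 with $\zeta = \infty$ is the point at infinity $\begin{bmatrix} K^{-1}\mathbf{x} \\ 0 \end{bmatrix}$.
Point 1 projects to the epipole:
$$\mathbf{e}' = K'\,[R\,|\,\mathbf{t}] \begin{bmatrix} \mathbf{0} \\ 1 \end{bmatrix} = K'\mathbf{t}$$
Point 2 projects to:
$$\mathbf{q} = K'\,[R\,|\,\mathbf{t}] \begin{bmatrix} K^{-1}\mathbf{x} \\ 0 \end{bmatrix} = K' R K^{-1}\mathbf{x}$$
To simplify the vector product of the two projected points:
– Use $M\mathbf{a} \times M\mathbf{b} = M^{-\top}(\mathbf{a} \times \mathbf{b})$ (exact up to the scalar factor $\det M$, which does not matter for homogeneous quantities), where $M^{-\top} = (M^{-1})^\top$.
– Use the matrix representation of a vector product, i.e. the matrix $[\mathbf{t}]_\times$.
The result is the fundamental matrix $F$, with the epipolar line $\mathbf{l}' = F\mathbf{x}$ and the property that $\mathbf{x}'^\top F\mathbf{x} = 0$.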
Algebraic representation of Epipolar Geometry
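Step 3: compute the vector product of the two projected points to find the epipolar line
$$\mathbf{l}' = K'\mathbf{t} \times K' R K^{-1}\mathbf{x}$$
The epipoles are the null-vectors of $F$: $F\mathbf{e} = \mathbf{0}$ and $F^\top\mathbf{e}' = \mathbf{0}$.
The fundamental matrix is derived as
$$F = K'^{-\top}\,[\mathbf{t}]_\times\,R\,K^{-1}$$
where $-\top$ denotes the transpose of the inverse.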
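The formula for $F$ can be sanity-checked on synthetic data. The sketch below assumes the camera convention of this section ($P = K[I|\mathbf{0}]$, $P' = K'[R|\mathbf{t}]$); the function names and every numeric value are hypothetical.

```python
import numpy as np


def skew(t):
    """3x3 skew-symmetric matrix [t]_x such that skew(t) @ b == np.cross(t, b)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])


def fundamental_from_cameras(K, K_prime, R, t):
    """F = K'^{-T} [t]_x R K^{-1} for P = K[I|0] and P' = K'[R|t]."""
    return np.linalg.inv(K_prime).T @ skew(t) @ R @ np.linalg.inv(K)


# Hypothetical calibration and relative pose, chosen only for illustration.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0.0, s],
              [0.0, 1.0, 0.0],
              [-s, 0.0, c]])
t = np.array([-1.0, 0.0, 0.2])
F = fundamental_from_cameras(K, K, R, t)

# Project one scene point (given in the first camera frame) into both images.
X = np.array([0.3, -0.2, 4.0])
x = K @ X
x = x / x[2]
x_prime = K @ (R @ X + t)
x_prime = x_prime / x_prime[2]

epipolar_line = F @ x                 # l' = F x
residual = x_prime @ epipolar_line    # x'^T F x, zero up to rounding
e_prime = K @ t                       # epipole in the second image
```

The residual vanishes for any scene point, and `F.T @ e_prime` is the zero vector, matching the null-vector property of the epipole.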
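The basis for several methods of computing $F$ lies in re-writing the constraint $\mathbf{x}'^\top F\mathbf{x} = 0$ for each match $\mathbf{x} \leftrightarrow \mathbf{x}'$ as:
$$\begin{bmatrix} x' & y' & 1 \end{bmatrix} \begin{bmatrix} F_1 & F_2 & F_3 \\ F_4 & F_5 & F_6 \\ F_7 & F_8 & F_9 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = 0$$
$$x'x F_1 + x'y F_2 + x' F_3 + y'x F_4 + y'y F_5 + y' F_6 + x F_7 + y F_8 + F_9 = 0$$
$$\begin{bmatrix} x'x & x'y & x' & y'x & y'y & y' & x & y & 1 \end{bmatrix} \begin{bmatrix} F_1 \\ \vdots \\ F_9 \end{bmatrix} = 0$$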
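$$A_{n \times 9}\,\mathbf{f} = \begin{bmatrix} x_1' x_1 & x_1' y_1 & x_1' & y_1' x_1 & y_1' y_1 & y_1' & x_1 & y_1 & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x_n' x_n & x_n' y_n & x_n' & y_n' x_n & y_n' y_n & y_n' & x_n & y_n & 1 \end{bmatrix} \begin{bmatrix} F_1 \\ \vdots \\ F_9 \end{bmatrix} = \mathbf{0}$$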
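7 points: 7-point algorithm, nonlinear equations.
8 points: 8-point algorithm, linear solution.
With 7 matches, $A$ is $7 \times 9$ and has a 2-dimensional null-space.
$\mathbf{f}$ must lie in the null-space of $A$.
Use the rank constraint to find an $\mathbf{f}$ that gives $\det F = 0$.
Given an $\mathbf{f}$ which is a solution, any scaling (positive or negative) of $\mathbf{f}$ is also a solution.
So write $\mathbf{f} = \alpha\mathbf{v} + \mathbf{w}$, where $\mathbf{v}$ and $\mathbf{w}$ span the null-space and $\alpha$ runs from $-\infty$ to $+\infty$.
$$\det F = \begin{vmatrix} \alpha v_1 + w_1 & \alpha v_2 + w_2 & \alpha v_3 + w_3 \\ \alpha v_4 + w_4 & \alpha v_5 + w_5 & \alpha v_6 + w_6 \\ \alpha v_7 + w_7 & \alpha v_8 + w_8 & \alpha v_9 + w_9 \end{vmatrix} = 0$$
This is a cubic in $\alpha$ whose coefficients are a function of $\mathbf{v}$ and $\mathbf{w}$.
The algorithm:
– Generate the matches.
– Statistically center all sets of $x$ and $y$ values.
– Build $A$ from seven of the matches.
– Use SVD to find the two vectors $\mathbf{v}$ and $\mathbf{w}$ spanning the null-space.
– Use $\mathbf{v}$ and $\mathbf{w}$ to find the coefficients of the cubic.
– Solve the cubic, and test which $\alpha$ is best.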
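The steps of the 7-point algorithm can be sketched as follows. This is an illustration under stated assumptions, not the lecture's implementation: the matches are noise-free synthetic points in ideal image coordinates (so the centering step is skipped), the cubic coefficients are obtained by fitting a degree-3 polynomial through four samples of $\det(\alpha V + W)$ rather than by expanding the determinant symbolically, and the names `design_matrix`, `seven_point` and `held_out_error` are hypothetical.

```python
import numpy as np


def design_matrix(x1, x2):
    """One row (x'x, x'y, x', y'x, y'y, y', x, y, 1) per match x1 <-> x2."""
    x, y = x1[:, 0], x1[:, 1]
    xp, yp = x2[:, 0], x2[:, 1]
    return np.column_stack([xp * x, xp * y, xp, yp * x, yp * y, yp,
                            x, y, np.ones_like(x)])


def seven_point(x1, x2):
    """Return the candidate F matrices (one per real root of the cubic)."""
    A = design_matrix(x1, x2)                  # 7 x 9
    _, _, Vt = np.linalg.svd(A)                # full SVD: Vt is 9 x 9
    V = Vt[-2].reshape(3, 3)                   # the two null-space vectors,
    W = Vt[-1].reshape(3, 3)                   # reshaped row-major into 3 x 3
    alphas = np.array([-1.0, 0.0, 1.0, 2.0])
    dets = [np.linalg.det(a * V + W) for a in alphas]
    coeffs = np.polyfit(alphas, dets, 3)       # exact fit: 4 samples of a cubic
    roots = np.roots(coeffs)
    real_roots = roots[np.abs(roots.imag) < 1e-8].real
    return [(a * V + W) / np.linalg.norm(a * V + W) for a in real_roots]


# Synthetic scene: 10 points, a hypothetical rotation about y and translation.
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(-1, 1, 10), rng.uniform(-1, 1, 10),
                     rng.uniform(4, 8, 10)])
c, s = np.cos(0.2), np.sin(0.2)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t = np.array([1.0, 0.2, 0.1])
X2 = X @ R.T + t
x1 = X[:, :2] / X[:, 2:]                       # ideal image coordinates, view 1
x2 = X2[:, :2] / X2[:, 2:]                     # ideal image coordinates, view 2

candidates = seven_point(x1[:7], x2[:7])


def held_out_error(F):
    """Largest |x'^T F x| over the three matches not used for estimation."""
    h1 = np.column_stack([x1[7:], np.ones(3)])
    h2 = np.column_stack([x2[7:], np.ones(3)])
    return np.abs(np.einsum("ij,jk,ik->i", h2, F, h1)).max()


best_F = min(candidates, key=held_out_error)   # "test which alpha is best"
```

The cubic has one or three real roots, so up to three candidate matrices are returned; extra matches are one way to decide between them.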
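$$A_{n \times 9}\,\mathbf{f} = \begin{bmatrix} x_1' x_1 & x_1' y_1 & x_1' & y_1' x_1 & y_1' y_1 & y_1' & x_1 & y_1 & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x_n' x_n & x_n' y_n & x_n' & y_n' x_n & y_n' y_n & y_n' & x_n & y_n & 1 \end{bmatrix} \begin{bmatrix} F_1 \\ \vdots \\ F_9 \end{bmatrix} = \mathbf{0}$$
With 8 matches in general position, $A$ has rank 8.
$\mathbf{f}$ is the null-space of $A$, and so $F$ is determined up to a scale (as expected).
With noisy data it is better to use more matches, $n > 8$. This is done using least squares.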
A least squares version of the 8-point algorithm
Due to noise, there will not be an exact solution to $A\mathbf{f} = \mathbf{0}$ ($A$ has full rank).
Least squares formulation:
Find the unit vector $\mathbf{f}$ that minimizes the norm of the residual $\mathbf{r} = A\mathbf{f}$:
$$\mathbf{f}^* = \arg\min_{\mathbf{f}:\,\|\mathbf{f}\| = 1} \|A\mathbf{f}\|^2$$
Solution with eigenvalues:
Compute the eigen-decomposition of the matrix $M = A^\top A$ and set $\mathbf{f}$ to the (unit) eigenvector $\mathbf{e}_1$ corresponding to the smallest eigenvalue $\lambda_1$.
Solution with SVD:
Compute the SVD of the matrix $A$ and set $\mathbf{f}$ to the (unit) right singular vector $\mathbf{e}_1$ corresponding to the smallest singular value $\sigma_1$.
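The squared residual is $\|A\mathbf{f}\|^2 = \mathbf{f}^\top A^\top A\,\mathbf{f} = \mathbf{f}^\top M\mathbf{f}$.
$M$ is symmetric and can be decomposed as:
$$M = V \Lambda V^\top = V \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix} V^\top = \sum_{i=1}^{n} \lambda_i\,\mathbf{e}_i \mathbf{e}_i^\top$$
where $V = [\mathbf{e}_1 \dots \mathbf{e}_n]$ is the orthonormal matrix of eigenvectors and $0 \le \lambda_1 \le \dots \le \lambda_n$ are non-decreasing eigenvalues.
The eigenvalues are non-negative (strictly positive when $A$ has full rank):
$$M\mathbf{e}_i = \mathbf{e}_i \lambda_i \;\Rightarrow\; \mathbf{e}_i^\top M \mathbf{e}_i = \mathbf{e}_i^\top \mathbf{e}_i\,\lambda_i \;\Rightarrow\; (A\mathbf{e}_i)^\top (A\mathbf{e}_i) = \lambda_i \ge 0$$
Therefore:
$$\mathbf{f}^\top M \mathbf{f} = \lambda_1 (\mathbf{f}^\top \mathbf{e}_1)^2 + \dots + \lambda_n (\mathbf{f}^\top \mathbf{e}_n)^2$$
This is minimised when $\mathbf{f} = \mathbf{e}_1$.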
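The SVD of $A$ is
$$A_{m \times n} = U_{m \times n} \begin{bmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \end{bmatrix}_{n \times n} V^\top_{n \times n}$$
with the singular values ordered so $0 \le \sigma_1 \le \dots \le \sigma_n$.
The SVD is usually preferred to the eigenvalue decomposition because it is numerically more stable.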
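Both least-squares routes can be compared on noisy synthetic matches. The sketch below assumes ideal image coordinates and a hypothetical scene and pose; `design_matrix` is an illustrative helper name. Note that `numpy.linalg.svd` returns singular values in decreasing order (the opposite of the ordering used above), so the wanted vector is the last row of `Vt`, whereas `numpy.linalg.eigh` returns eigenvalues in increasing order.

```python
import numpy as np


def design_matrix(x1, x2):
    """One row (x'x, x'y, x', y'x, y'y, y', x, y, 1) per match x1 <-> x2."""
    x, y = x1[:, 0], x1[:, 1]
    xp, yp = x2[:, 0], x2[:, 1]
    return np.column_stack([xp * x, xp * y, xp, yp * x, yp * y, yp,
                            x, y, np.ones_like(x)])


# Synthetic matches (n = 20 > 8) with a small amount of Gaussian noise.
rng = np.random.default_rng(1)
n = 20
X = np.column_stack([rng.uniform(-2, 2, n), rng.uniform(-2, 2, n),
                     rng.uniform(4, 8, n)])
c, s = np.cos(0.2), np.sin(0.2)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t = np.array([1.0, 0.2, 0.1])
X2 = X @ R.T + t
x1 = X[:, :2] / X[:, 2:] + rng.normal(0.0, 1e-4, (n, 2))
x2 = X2[:, :2] / X2[:, 2:] + rng.normal(0.0, 1e-4, (n, 2))

A = design_matrix(x1, x2)

# Route 1: SVD of A; take the right singular vector of the smallest singular value.
_, S, Vt = np.linalg.svd(A)
f_svd = Vt[-1]

# Route 2: eigen-decomposition of M = A^T A; take the eigenvector of the smallest eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(A.T @ A)
f_eig = eigenvectors[:, 0]

F = f_svd.reshape(3, 3)                 # row-major: (F1..F9) -> 3 x 3
residual_norm = np.linalg.norm(A @ f_svd)
```

The two vectors agree up to sign, and the smallest eigenvalue of $M$ equals the square of the smallest singular value of $A$; forming $A^\top A$ squares the condition number, which is the usual argument for preferring the SVD.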
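The epipolar constraint requires $\mathbf{x}'^\top F\mathbf{x} = 0$.
If we know the intrinsic calibration, we can transform the matching points into their respective ideal images: $\hat{\mathbf{x}} = K^{-1}\mathbf{x}$ and $\hat{\mathbf{x}}' = K'^{-1}\mathbf{x}'$.
The constraint becomes $\hat{\mathbf{x}}'^\top E\,\hat{\mathbf{x}} = 0$, where $E = [\mathbf{t}]_\times R$ is the Essential Matrix.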
We know that $F = K'^{-\top}\,[\mathbf{t}]_\times R\,K^{-1}$. We now show how to recover $R$ and $\mathbf{t}$ from $F$ (given $K$ and $K'$).
1. Compute the essential matrix $E = [\mathbf{t}]_\times R = K'^\top F K$.
2. Compute $\mathbf{t}$ as the null-vector of $E^\top$ (i.e. $E^\top\mathbf{t} = \mathbf{0}$).
   – We can only determine $\mathbf{t}$ up to a scaling factor $\mu$.
   – There are two solutions $\pm\mu\mathbf{t}$.
3. Compute $R$ from $E$.
   – The algorithm for this step is given later.
   – It returns two solutions $R_1$ and $R_2$.
4. Overall, there are four solutions for the projection matrix:
   $P' = K'[R_1\,|\,\mu\mathbf{t}]$, $P' = K'[R_2\,|\,\mu\mathbf{t}]$, $P' = K'[R_1\,|\,-\mu\mathbf{t}]$, $P' = K'[R_2\,|\,-\mu\mathbf{t}]$
5. Exclude 3 of these using a visibility test.
The scene point is in front of both cameras in only one of the four solutions.
A ray from the scene point must pass through the image on its way to the optic centre!
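Computing $R_{1,2}$ from the essential matrix $E$
Recall that $E = [\mathbf{t}]_\times R$; we now recover $R$ from $E$. Algorithm:
1. Compute the singular value decomposition (SVD) of $E$:
$$U \begin{bmatrix} 1 & & \\ & 1 & \\ & & 0 \end{bmatrix} V^\top \leftarrow E$$
2. Define
$$W = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
3. The two solutions are $R_1 = U W V^\top$ and $R_2 = U W^\top V^\top$.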
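The recovery of $R$ and $\mathbf{t}$, including the visibility test, can be sketched end to end. Several choices below are assumptions not stated on the slides: the signs of $U$ and $V^\top$ are flipped where needed so that both candidate rotations have determinant $+1$ (legitimate because $E$ is only defined up to scale), and depths for the visibility test are found by solving $Z_1 R\hat{\mathbf{x}} - Z_2\hat{\mathbf{x}}' = -\mathbf{t}$ in the least-squares sense rather than by a full triangulation. All names and numeric values are hypothetical.

```python
import numpy as np


def skew(t):
    """3x3 skew-symmetric matrix [t]_x."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])


def decompose_essential(E):
    """Return R1, R2 and a unit t (defined up to sign) from E = [t]_x R."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:       # E is defined up to sign, so flipping
        U = -U                     # U or Vt is allowed and makes R1, R2
    if np.linalg.det(Vt) < 0:      # proper rotations (det = +1)
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    return U @ W @ Vt, U @ W.T @ Vt, U[:, 2]   # U[:, 2] is the null-vector of E^T


def depths(R, t, xh1, xh2):
    """Depths (Z1, Z2) of one match in the two cameras for a candidate (R, t)."""
    M = np.column_stack([R @ xh1, -xh2])
    return np.linalg.lstsq(M, -t, rcond=None)[0]


# Hypothetical ground truth used to build E and one match in ideal coordinates.
c, s = np.cos(0.2), np.sin(0.2)
R_true = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t_true = np.array([1.0, 0.2, 0.1])
E = skew(t_true) @ R_true

X = np.array([0.3, -0.2, 5.0])                 # scene point, first camera frame
xh1 = X / X[2]
X2 = R_true @ X + t_true
xh2 = X2 / X2[2]

R1, R2, t_unit = decompose_essential(E)
candidates = [(R1, t_unit), (R2, t_unit), (R1, -t_unit), (R2, -t_unit)]

# Visibility test: keep the candidates that put the point in front of both cameras.
valid = [(R, t) for R, t in candidates if np.all(depths(R, t, xh1, xh2) > 0)]
R_best, t_best = valid[0]
```

Only the direction of $\mathbf{t}$ is recovered, so `t_best` is compared with `t_true` after normalising the latter to unit length; the overall scale of the scene remains unknown.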
Introduction
Epipolar Geometry
whiskers for converging cameras.
Algebraic Representation and the Fundamental Matrix
in one image and epipolar line in the other
Computing the F matrix
SVD.
The Essential Matrix and the ego-motion from $F$.