3D Geometry for Computer Graphics
Lesson 2: PCA & SVD
Last week – eigendecomposition
We want to learn how the matrix A works.
If we look at arbitrary vectors, it doesn't tell us much.
Spectra and diagonalization
If A is symmetric, the eigenvectors are orthogonal (and there is always an eigenbasis):
A = U Λ U^T, Λ = diag(λ1, λ2, …, λn), A ui = λi ui
Why SVD…
A diagonalizable matrix is essentially a scaling. Most matrices are not diagonalizable – they do other things along with scaling (such as rotation).
So, to understand how general matrices behave, eigenvalues alone are not enough.
SVD tells us how general linear transformations behave, and other things…
The plan for today
First we'll see some applications of PCA – Principal Component Analysis – which uses spectral decomposition.
Then we'll look at the theory: SVD
- Basic intuition
- Formal definition
- Applications
PCA – the general idea
PCA finds an orthogonal basis that best represents a given data set: the sum of squared distances from the x' axis is minimized.
PCA – the general idea
For a 3D point set in the standard basis, PCA finds a best approximating plane (again, in terms of Σ distances²).
Application: finding tight bounding box
An axis-aligned bounding box agrees with the axes: it spans [minX, maxX] × [minY, maxY].
Application: finding tight bounding box
Oriented bounding box: we find better axes x', y'!
Application: finding tight bounding box
In 3D, the axis-aligned box (in x, y, z) is not the optimal bounding box.
Application: finding tight bounding box
Oriented bounding box in 3D: we find better axes!
Usage of bounding boxes (bounding volumes)
Serve as a very simple "approximation" of the object: fast collision detection, visibility queries, or whenever we need to know the dimensions (size) of the object.
- The models consist of thousands of polygons.
- To quickly test that they don't intersect, the bounding boxes are tested.
- Sometimes a hierarchy of BBs is used.
- The tighter the BB, the fewer "false alarms" we have.
Scanned meshes
Point clouds
Scanners give us raw point-cloud data. How do we compute normals to shade the surface?
Point clouds
Apply PCA locally (to each point's neighborhood) and take the third eigenvector – the direction of least variance – as the normal.
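The local-PCA idea above can be sketched in NumPy. This is a minimal illustration, not a production normal estimator; the function name `estimate_normal` and the brute-force neighbor search are choices made here for clarity.

```python
import numpy as np

def estimate_normal(points, index, k=10):
    """Normal at points[index]: the eigenvector of the local scatter
    matrix with the smallest eigenvalue (the 'third vector' in 3D)."""
    p = points[index]
    # k nearest neighbors by Euclidean distance (brute force for clarity)
    dists = np.linalg.norm(points - p, axis=1)
    nbrs = points[np.argsort(dists)[:k]]
    centered = nbrs - nbrs.mean(axis=0)
    S = centered.T @ centered                 # 3x3 local scatter matrix
    eigvals, eigvecs = np.linalg.eigh(S)      # eigenvalues in ascending order
    return eigvecs[:, 0]                      # smallest-variance direction

# Example: noisy samples of the plane z = 0; the normal should be ~(0, 0, 1).
rng = np.random.default_rng(0)
pts = np.column_stack([rng.uniform(-1, 1, 200),
                       rng.uniform(-1, 1, 200),
                       rng.normal(0, 1e-3, 200)])
n = estimate_normal(pts, 0, k=20)
print(abs(n[2]))   # close to 1
```

Note the sign of the normal is ambiguous; orienting normals consistently across a point cloud is a separate problem.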
Notations
Denote our data points by x1, x2, …, xn ∈ R^d:
xk = (xk^1, xk^2, …, xk^d)^T, k = 1, 2, …, n
The origin of the new axes
The origin is the zero-order approximation of our data set (a point).
It will be the center of mass:
m = (1/n) Σ_{i=1}^n xi
It can be shown that:
m = argmin_x Σ_{i=1}^n ||xi − x||²
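The argmin property above is easy to check numerically: no candidate point beats the center of mass on the sum of squared distances. A quick sketch (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))          # 50 data points in R^3
m = X.mean(axis=0)                    # center of mass

def cost(x):
    # sum_i ||x_i - x||^2
    return np.sum(np.linalg.norm(X - x, axis=1) ** 2)

best = cost(m)
for _ in range(1000):
    candidate = m + rng.normal(scale=0.5, size=3)
    assert cost(candidate) >= best    # the centroid is never beaten
```

This follows from the identity Σ||xi − x||² = Σ||xi − m||² + n||m − x||², so any x ≠ m strictly increases the cost.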
Scatter matrix
Denote yi = xi − m, i = 1, 2, …, n. Then
S = Y Y^T
where Y is the d×n matrix with the yk as its columns (k = 1, 2, …, n).
Variance of projected points
In a way, S measures the variance (= "scatterness") of the data in different directions.
Let's look at a line L through the center of mass m, and project our points xi onto it. The variance of the projected points x'i is:
var(L) = (1/n) Σ_{i=1}^n ||x'i − m||²
(Figure: the original set; a direction with small variance; a direction with large variance.)
Variance of projected points
Given a direction v, ||v|| = 1, the projection of xi onto L = m + vt satisfies:
||x'i − m|| = <xi − m, v> / ||v|| = <yi, v>
Variance of projected points
So,
var(L) = (1/n) Σ_{i=1}^n ||x'i − m||² = (1/n) Σ_{i=1}^n <yi, v>² = (1/n) ||Y^T v||²
       = (1/n) (Y^T v)^T (Y^T v) = (1/n) v^T Y Y^T v = (1/n) v^T S v = (1/n) <Sv, v>
The constant 1/n does not affect where the extrema are attained, so below we simply write var(L) = <Sv, v>.
Directions of maximal variance
So, we have: var(L) = <Sv, v>
Theorem:
Let f : {v ∈ R^d | ||v|| = 1} → R, f(v) = <Sv, v>, where S is a symmetric matrix. Then the extrema of f are attained at the eigenvectors of S.
So, eigenvectors of S are directions of maximal/minimal variance!
Summary so far
- Take the centered data vectors y1, y2, …, yn ∈ R^d.
- Construct the scatter matrix S = Y Y^T; it measures the variance of the data points.
- Eigenvectors of S are directions of maximal variance.
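The three steps above fit in a few lines of NumPy. As a sanity check we use synthetic data stretched along the x-axis, so the top eigenvector of S should come out close to (1, 0, 0). The scale factors are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(3)
# 500 points in R^3, strongly stretched along x (scales 5, 1, 0.2)
X = rng.normal(size=(3, 500)) * np.array([[5.0], [1.0], [0.2]])

Y = X - X.mean(axis=1, keepdims=True)   # step 1: center the data
S = Y @ Y.T                             # step 2: scatter matrix
eigvals, eigvecs = np.linalg.eigh(S)    # step 3: eigendecomposition (ascending)

v_max = eigvecs[:, -1]                  # direction of maximal variance
print(abs(v_max[0]))                    # close to 1
```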
Scatter matrix – eigendecomposition
S is symmetric, so S has an eigendecomposition:
S = V Λ V^T, V = [v1 v2 … vd], Λ = diag(λ1, λ2, …, λd)
The eigenvectors form an orthogonal basis.
Principal components
Eigenvectors that correspond to big eigenvalues are the directions in which the data has strong components (= large variance).
If the eigenvalues are more or less the same, there is no preferable direction.
Note: the eigenvalues are always non-negative.
Principal components
- No preferable direction: S looks like S = V diag(λ, λ) V^T, and any vector is an eigenvector.
- A clear preferable direction: S looks like S = V diag(λ, μ) V^T, with μ close to zero, much smaller than λ.
How to use what we got
For finding the oriented bounding box, we simply compute the bounding box with respect to the axes defined by the eigenvectors v1, v2, v3. The origin is at the mean point m.
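A minimal sketch of that recipe: express the points in the PCA axes and take per-axis min/max. The function name `oriented_bbox` and the return convention are choices made for this illustration.

```python
import numpy as np

def oriented_bbox(points):
    """points: (n, d) array. Returns (m, axes, lo, hi): the box consists
    of points m + axes @ c for c in the axis-aligned box [lo, hi]."""
    m = points.mean(axis=0)
    Y = (points - m).T                     # d x n centered data
    _, eigvecs = np.linalg.eigh(Y @ Y.T)   # columns are the PCA axes
    coords = (points - m) @ eigvecs        # coordinates in the new axes
    return m, eigvecs, coords.min(axis=0), coords.max(axis=0)

# Example: a 4 x 1 axis-aligned rectangle rotated by 45 degrees; the OBB
# extents recover the original dimensions (up to axis ordering).
rng = np.random.default_rng(4)
rect = np.column_stack([rng.uniform(-2, 2, 1000),
                        rng.uniform(-0.5, 0.5, 1000)])
c, s = np.cos(np.pi / 4), np.sin(np.pi / 4)
R = np.array([[c, -s], [s, c]])
m, axes, lo, hi = oriented_bbox(rect @ R.T)
print(np.sort(hi - lo))                    # roughly [1, 4]
```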
For approximation
The line through m in direction v1 approximates the original data set; the projected data set approximates the original data set.
For approximation
In general dimension d, the eigenvalues are sorted in descending order:
λ1 ≥ λ2 ≥ … ≥ λd
The eigenvectors are sorted accordingly. To get an approximation of dimension d' < d, we take the d' first eigenvectors and look at the subspace they span (d' = 1 is a line, d' = 2 is a plane…).
For approximation
To get an approximating set, we project the original data points onto the chosen subspace:
xi = m + α1v1 + α2v2 + … + αd'vd' + … + αdvd
Projection:
xi' = m + α1v1 + α2v2 + … + αd'vd' + 0⋅vd'+1 + … + 0⋅vd
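The projection formula above, sketched in NumPy for d = 3 and d' = 2 (the αi are the coordinates <xi − m, vi>). The data scales are arbitrary demo choices; the final check uses the fact that the squared approximation error equals the sum of the discarded eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 3)) * np.array([4.0, 1.0, 0.1])  # rows = points
m = X.mean(axis=0)
Y = (X - m).T
eigvals, V = np.linalg.eigh(Y @ Y.T)  # ascending eigenvalues
V = V[:, ::-1]                        # reorder columns: v1, v2, v3 (descending)

d_prime = 2
alphas = (X - m) @ V[:, :d_prime]     # coordinates in the top-d' basis
X_approx = m + alphas @ V[:, :d_prime].T

# The squared error is exactly the discarded (smallest) eigenvalue.
err = np.linalg.norm(X - X_approx) ** 2
assert np.isclose(err, eigvals[0])
```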
SVD
Geometric analysis of linear transformations
We want to know what a linear transformation A does. We need some simple and "comprehensible" representation of the matrix of A.
Let's look at what A does to some vectors. Since A(αv) = αA(v), it's enough to look at vectors v of unit length.
The geometry of linear transformations
A linear (non-singular) transform A always takes hyper-spheres to hyper-ellipses.
The geometry of linear transformations
Thus, one good way to understand what A does is to find which vectors are mapped to the "main axes" of the ellipsoid.
Geometric analysis of linear transformations
If we are lucky: A = V Λ V^T with V orthogonal (true if A is symmetric). The eigenvectors of A are then the axes of the ellipse.
Symmetric matrix: eigendecomposition
In this case A is just a scaling matrix. The eigendecomposition of A tells us which orthogonal axes it scales, and by how much:
A = [v1 v2 … vn] diag(λ1, λ2, …, λn) [v1 v2 … vn]^T
A vi = λi vi
General linear transformations: SVD
In general A will also contain rotations, not just scales:
A = U Σ V^T = [u1 u2 … un] diag(σ1, σ2, …, σn) [v1 v2 … vn]^T
General linear transformations: SVD
Equivalently, AV = UΣ:
A vi = σi ui, σi ≥ 0
where the columns of U and the columns of V are each orthonormal.
SVD more formally
SVD exists for any matrix. Formal definition:
For square matrices A ∈ R^{n×n}, there exist orthogonal matrices U, V ∈ R^{n×n} and a diagonal matrix Σ, such that all the diagonal values σi of Σ are non-negative and
A = U Σ V^T
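The definition can be checked numerically with NumPy's built-in SVD (note `np.linalg.svd` returns V^T, not V):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.normal(size=(4, 4))
U, sigma, Vt = np.linalg.svd(A)     # A = U @ diag(sigma) @ Vt

assert np.allclose(A, U @ np.diag(sigma) @ Vt)   # decomposition holds
assert np.allclose(U.T @ U, np.eye(4))           # U orthogonal
assert np.allclose(Vt @ Vt.T, np.eye(4))         # V orthogonal
assert np.all(sigma >= 0)                        # non-negative singular values
assert np.all(np.diff(sigma) <= 0)               # sorted in descending order
```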
SVD more formally
The diagonal values of Σ (σ1, …, σn) are called the singular values. It is customary to sort them: σ1 ≥ σ2 ≥ … ≥ σn.
The columns of U (u1, …, un) are called the left singular vectors. They are the axes of the ellipsoid.
The columns of V (v1, …, vn) are called the right singular vectors. They are the preimages of the axes of the ellipsoid.
Reduced SVD
For rectangular matrices, we have two forms of SVD. The reduced SVD looks like this:
A = U Σ V^T (for an m×n matrix A with m > n: U is m×n, Σ and V are n×n)
The columns of U are orthonormal. This is the cheaper form for computation and storage.
Full SVD
We can complete U to a full orthogonal m×m matrix and pad Σ with zero rows accordingly:
A = U Σ V^T
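The two forms differ only in shapes; NumPy exposes them through the `full_matrices` flag:

```python
import numpy as np

A = np.arange(12.0).reshape(4, 3)           # m = 4, n = 3

# Reduced SVD: U is m x n; cheaper to compute and store.
U_r, s_r, Vt_r = np.linalg.svd(A, full_matrices=False)
assert U_r.shape == (4, 3) and s_r.shape == (3,) and Vt_r.shape == (3, 3)

# Full SVD: U completed to m x m; Sigma padded with a zero row.
U_f, s_f, Vt_f = np.linalg.svd(A, full_matrices=True)
Sigma = np.zeros((4, 3))
Sigma[:3, :3] = np.diag(s_f)
assert U_f.shape == (4, 4)
assert np.allclose(A, U_f @ Sigma @ Vt_f)
```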
Some history
SVD was discovered independently by several people:
- E. Beltrami (1835–1900)
- C. Jordan (1838–1922)
- J. Sylvester (1814–1897)
- H. Weyl (1885–1955)
- E. Schmidt (1876–1959)
SVD is the "workhorse" of linear algebra
There are numerical algorithms to compute SVD. Once you have it, you have many things:
- Matrix inverse → can solve square linear systems
- Numerical rank of a matrix
- Can solve least-squares systems
- PCA
- Many more…
Matrix inverse and solving linear systems
Matrix inverse:
A^{-1} = (U Σ V^T)^{-1} = (V^T)^{-1} Σ^{-1} U^{-1} = V Σ^{-1} U^T
Σ^{-1} = diag(1/σ1, …, 1/σn)
So, to solve Ax = b:
x = V Σ^{-1} U^T b
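The formula x = V Σ^{-1} U^T b, checked against NumPy's direct solver:

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.normal(size=(5, 5))         # a random (almost surely invertible) matrix
b = rng.normal(size=5)

U, sigma, Vt = np.linalg.svd(A)
x = Vt.T @ ((U.T @ b) / sigma)      # x = V Sigma^{-1} U^T b

assert np.allclose(A @ x, b)                     # solves the system
assert np.allclose(x, np.linalg.solve(A, b))     # agrees with the direct solver
```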
Matrix rank
The rank of A is the number of non-zero singular values σi.
Numerical rank
If there are very small singular values, then A is close to being singular. We can set a threshold t, so that numeric_rank(A) = #{σi | σi > t}.
If rank(A) < n then A is singular: it maps the entire space R^n onto some subspace, like a plane (so A is some sort of projection).
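A sketch of the thresholding rule: a rank-2 matrix perturbed by tiny noise has exact rank 3, but its numerical rank is still 2. The threshold value here is an arbitrary demo choice.

```python
import numpy as np

rng = np.random.default_rng(8)
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
# B @ B.T has rank 2; the noise makes the exact rank 3
A = B @ B.T + 1e-12 * rng.normal(size=(3, 3))

sigma = np.linalg.svd(A, compute_uv=False)
t = 1e-8
numeric_rank = int(np.sum(sigma > t))     # #{sigma_i | sigma_i > t}
assert numeric_rank == 2
```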
Back to PCA
We wanted to find the principal components (the axes x', y').
PCA
Move the center of mass to the origin:
pi' = pi − m
PCA
Construct the matrix X whose columns are the centered data points:
X = [p1' p2' … pn']
The principal axes are the eigenvectors of S = XX^T:
XX^T = S = U diag(λ1, …, λd) U^T
PCA
We can compute the principal components by the SVD of X:
X = U Σ V^T
XX^T = (U Σ V^T)(U Σ V^T)^T = U Σ V^T V Σ^T U^T = U (Σ Σ^T) U^T = U Σ² U^T
Thus, the left singular vectors of X are the principal components! We sort them by the size of the singular values of X.
Shape matching
We have two objects in correspondence. We want to find the rigid transformation that aligns them.
Shape matching
When the objects are aligned, the lengths of the connecting lines are small.
Shape matching – formalization
Align two point sets P = {p1, …, pn} and Q = {q1, …, qn}.
Find a translation vector t and a rotation matrix R so that
Σ_{i=1}^n ||pi − (R qi + t)||² is minimized.
Shape matching – solution
It turns out we can solve for the translation and the rotation separately.
Theorem: if (R, t) is the optimal transformation, then the point sets {pi} and {R qi + t} have the same centers of mass.
p̄ = (1/n) Σ_{i=1}^n pi,  q̄ = (1/n) Σ_{i=1}^n qi
p̄ = (1/n) Σ_{i=1}^n (R qi + t) = R ((1/n) Σ_{i=1}^n qi) + t = R q̄ + t
⇒ t = p̄ − R q̄
Finding the rotation R
To find the optimal R, we bring the centroids of both point sets to the origin:
pi' = pi − p̄, qi' = qi − q̄
We want to find R that minimizes
Σ_{i=1}^n ||pi' − R qi'||²
Finding the rotation R
Σ_{i=1}^n ||pi' − R qi'||² = Σ_{i=1}^n (pi' − R qi')^T (pi' − R qi')
  = Σ_{i=1}^n (pi'^T pi' − pi'^T R qi' − qi'^T R^T pi' + qi'^T R^T R qi')
Since R^T R = I, the first and last terms do not depend on R, so we can ignore them in the minimization.
Finding the rotation R
min_R Σ_{i=1}^n (−pi'^T R qi' − qi'^T R^T pi')  ⇔  max_R Σ_{i=1}^n (pi'^T R qi' + qi'^T R^T pi')
Since pi'^T R qi' is a scalar, qi'^T R^T pi' = (pi'^T R qi')^T = pi'^T R qi'.
⇒ R = argmax_R 2 Σ_{i=1}^n pi'^T R qi'
Finding the rotation R
Σ_{i=1}^n pi'^T R qi' = Σ_{i=1}^n Trace(R qi' pi'^T) = Trace(R Σ_{i=1}^n qi' pi'^T) = Trace(RH)
where H = Σ_{i=1}^n qi' pi'^T. Each qi' is 3×1 and each pi'^T is 1×3, so each qi' pi'^T is 3×3. (Recall Trace(A) = Σ_i aii.)
Finding the rotation R
So, we want to find R that maximizes Trace(RH).
Theorem: if M is symmetric positive definite (all eigenvalues of M are positive) and B is any orthogonal matrix, then
Trace(M) ≥ Trace(BM)
So, let's find R so that RH is symmetric positive definite. Then we know for sure that Trace(RH) is maximal.
Finding the rotation R
This is easy! Compute the SVD of H:
H = U Σ V^T
Define R:
R = V U^T
Check RH:
RH = (V U^T)(U Σ V^T) = V Σ V^T
This is a symmetric matrix whose eigenvalues are σi ≥ 0, so RH is positive (semi-)definite!
Summary of rigid alignment
- Translate the input points to the centroids: pi' = pi − p̄, qi' = qi − q̄
- Compute the "covariance matrix" H = Σ_{i=1}^n qi' pi'^T
- Compute the SVD of H: H = U Σ V^T
- The optimal rotation is R = V U^T
- The translation vector is t = p̄ − R q̄
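The whole recipe can be sketched in a short NumPy function. The `det(R)` check is a standard safeguard against reflections that is not spelled out in the slides; the function name `rigid_align` is a choice made here.

```python
import numpy as np

def rigid_align(P, Q):
    """Find (R, t) minimizing sum ||p_i - (R q_i + t)||^2. P, Q: (n, 3)."""
    p_bar, q_bar = P.mean(axis=0), Q.mean(axis=0)
    Pc, Qc = P - p_bar, Q - q_bar
    H = Qc.T @ Pc                       # sum_i q_i' p_i'^T
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T                      # R = V U^T
    if np.linalg.det(R) < 0:            # reflection case: flip the last axis
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, p_bar - R @ q_bar         # t = p_bar - R q_bar

# Example: recover a known rotation + translation from exact correspondences.
rng = np.random.default_rng(10)
Q = rng.normal(size=(20, 3))
theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([1.0, -2.0, 0.5])
P = Q @ R_true.T + t_true

R, t = rigid_align(P, Q)
assert np.allclose(R, R_true)
assert np.allclose(t, t_true)
```

With noisy correspondences the same function returns the least-squares-optimal rigid transform rather than an exact recovery.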
Complexity
Numerical SVD is an expensive operation. We always need to pay attention to the dimensions of the matrix we're applying SVD to.