 
              3D Geometry for Computer Graphics Lesson 2: PCA & SVD
Last week - eigendecomposition � We want to learn how the matrix A works: A 2
Last week - eigendecomposition � If we look at arbitrary vectors, it doesn’t tell us much. A 3
Spectra and diagonalization � If A is symmetric, the eigenvectors are orthogonal (and there’s always an eigenbasis). A T A = U Λ U λ ⎛ ⎞ ⎜ 1 ⎟ Λ = A u i = λ i u i λ ⎜ ⎟ 2 ⎜ ⎟ � ⎜ ⎟ ⎜ ⎟ λ ⎝ ⎠ n 4
Why SVD… � Diagonalizable matrix is essentially a scaling. � Most matrices are not diagonalizable – they do other things along with scaling (such as rotation) � So, to understand how general matrices behave, only eigenvalues are not enough � SVD tells us how general linear transformations behave, and other things… 5
The plan for today � First we’ll see some applications of PCA – Principal Component Analysis that uses spectral decomposition. � Then look at the theory. � SVD � Basic intuition � Formal definition � Applications 6
PCA – the general idea � PCA finds an orthogonal basis that best represents given data set. y’ y x’ x � The sum of distances 2 from the x’ axis is minimized. 7
PCA – the general idea � PCA finds an orthogonal basis that best represents given data set. z 3D point set in standard basis y x � PCA finds a best approximating plane (again, in terms of Σ distances 2 ) 8
Application: finding tight bounding box � An axis-aligned bounding box: agrees with the axes y maxY minX maxX x minY 9
Application: finding tight bounding box � Oriented bounding box: we find better axes! x’ y’ 10
Application: finding tight bounding box � This is not the optimal bounding box z y x 11
Application: finding tight bounding box � Oriented bounding box: we find better axes! 12
Usage of bounding boxes (bounding volumes) � Serve as very simple “approximation” of the object � Fast collision detection, visibility queries � Whenever we need to know the dimensions (size) of the object The models consist of � thousands of polygons To quickly test that they � don’t intersect, the bounding boxes are tested Sometimes a hierarchy � of BB’s is used The tighter the BB – the � less “false alarms” we have Sample 13
Scanned meshes 14
Point clouds � Scanners give us raw point cloud data � How to compute normals to shade the surface? normal 15
Point clouds � Local PCA, take the third vector 16
Notations � Denote our data points by x 1 , x 2 , …, x n ∈ R d ⎛ ⎞ ⎛ ⎞ ⎛ ⎞ 1 1 1 x x x 1 2 n ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ 2 2 2 x x x ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ = = = ⎜ 1 2 n x , x , … , x ⎜ ⎟ ⎜ ⎟ ⎟ 1 2 n � � � ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ d d d x x x ⎝ ⎠ ⎝ ⎠ ⎝ ⎠ 1 2 n 17
The origin of the new axes � The origin is zero-order approximation of our data set (a point) � It will be the center of mass: n ∑ = m x 1 i n i = 1 � It can be shown that: n ∑ 2 = m argmin x - x i x i = 1 18
Scatter matrix � Denote y i = x i – m , i = 1, 2, …, n = T S YY where Y is d × n matrix with y k as columns ( k = 1, 2, …, n ) ⎛ ⎞⎛ ⎞ 1 1 1 1 2 d y y � y y y � y 1 2 n 1 1 1 ⎜ ⎟⎜ ⎟ 2 2 2 1 2 d y y y y y � y ⎜ ⎟⎜ ⎟ = ⎜ 1 2 n 2 2 2 S ⎟⎜ ⎟ � � � � � ⎜ ⎟⎜ ⎟ ⎜ ⎟⎜ ⎟ d d d 1 2 d y y � y y y � y ⎝ ⎠⎝ ⎠ 1 2 n n n n Y T Y 19
Variance of projected points � In a way, S measures variance (= scatterness) of the data in different directions. � Let’s look at a line L through the center of mass m , and project our points x i onto it. The variance of the projected points x’ i is: n ∑ x ′ = − 2 var( ) L || m || 1 i n = i 1 L L L L Original set Small variance Large variance 20
Variance of projected points � Given a direction v , || v || = 1 , the projection of x i onto L = m + v t is: ′ − = < − > = < > = T || x m || v x , m /|| v || v y , v y i i i i x i x’ i v L m 21
Variance of projected points So, � n n ∑ ∑ ′ = = = = 2 T 2 T 2 var( ) L || x -m || ( v y ) || v Y || 1 1 1 i i n n n = = i 1 i 1 = = < > = = = < > T 2 T T T T T || Y v || Y v , Y v v YY v v S v S v v , 1 1 1 1 1 n n n n n 2 2 ⎛ ⎞ ⎛ ⎞ ⎛ ⎞ 1 1 1 1 � y y y y ⎜ i ⎟ 1 2 n ⎜ ⎟ ⎜ ⎟ 2 2 2 2 y y y y n n ⎜ ( ) ⎟ ( ) ⎜ ⎟ ⎜ ⎟ ∑ ∑ = = = T 2 1 2 d i 1 2 d 1 2 n T 2 ( v y ) v v � v v v � v || v Y || ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ i � � � � = = i 1 i 1 ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ d d d d y y y � y ⎝ ⎠ ⎝ ⎠ ⎝ ⎠ i 1 2 n 22
Directions of maximal variance � So, we have: var( L ) = < S v , v > � Theorem: Let f : { v ∈ R d | || v || = 1} → R, f ( v ) = <S v , v > (and S is a symmetric matrix). Then, the extrema of f are attained at the eigenvectors of S . � So, eigenvectors of S are directions of maximal/minimal variance! 23
Summary so far � We take the centered data vectors y 1 , y 2 , …, y n ∈ R d = � Construct the scatter matrix T S YY � S measures the variance of the data points � Eigenvectors of S are directions of maximal variance. 24
Scatter matrix - eigendecomposition � S is symmetric ⇒ S has eigendecomposition: S = V Λ V T λ 1 v 1 λ 2 v 2 S = v 1 v 2 v d λ d v d The eigenvectors form orthogonal basis 25
Principal components � Eigenvectors that correspond to big eigenvalues are the directions in which the data has strong components (= large variance). � If the eigenvalues are more or less the same – there is no preferable direction. � Note: the eigenvalues are always non-negative. 26
Principal components There’s no preferable There is a clear preferable � � direction direction � S looks like this: � S looks like this: λ λ ⎛ ⎞ ⎛ ⎞ ⎜ ⎟ T T V V V V ⎜ ⎟ ⎜ ⎟ λ ⎝ ⎠ ⎝ ⎠ μ μ is close to zero, much Any vector is an eigenvector � � smaller than λ . 27
How to use what we got � For finding oriented bounding box – we simply compute the bounding box with respect to the axes defined by the eigenvectors. The origin is at the mean point m . v 3 v 2 v 1 28
For approximation y v 2 v 1 x y y x x The projected data set This line segment approximates approximates the original data set the original data set 29
For approximation � In general dimension d , the eigenvalues are sorted in descending order: λ 1 ≥ λ 2 ≥ … ≥ λ d � The eigenvectors are sorted accordingly. � To get an approximation of dimension d’ < d , we take the d’ first eigenvectors and look at the subspace they span ( d’ = 1 is a line, d’ = 2 is a plane…) 30
For approximation � To get an approximating set, we project the original data points onto the chosen subspace: x i = m + α 1 v 1 + α 2 v 2 +…+ α d’ v d’ +…+ α d v d Projection : x i ’ = m + α 1 v 1 + α 2 v 2 +…+ α d’ v d’ +0 ⋅ v d’+1 +…+ 0 ⋅ v d 31
SVD
Geometric analysis of linear transformations � We want to know what a linear transformation A does � Need some simple and “comprehendible” representation of the matrix of A . � Let’s look what A does to some vectors � Since A ( α v ) = α A ( v ) , it’s enough to look at vectors v of unit length A 33
The geometry of linear transformations � A linear (non-singular) transform A always takes hyper-spheres to hyper-ellipses. A A 34
The geometry of linear transformations � Thus, one good way to understand what A does is to find which vectors are mapped to the “main axes” of the ellipsoid. A A 35
Geometric analysis of linear transformations � If we are lucky: A = V Λ V T , V orthogonal (true if A is symmetric) � The eigenvectors of A are the axes of the ellipse A 36
Symmetric matrix: eigen decomposition � In this case A is just a scaling matrix. The eigen decomposition of A tells us which orthogonal axes it scales, and by how much: 1 λ 1 1 λ 2 A λ ⎡ ⎤ 1 ⎢ ⎥ λ [ ] [ ] ⎢ ⎥ T = 2 A v v … v v v … v ⎢ ⎥ 1 2 n 1 2 n � ⎢ ⎥ λ ⎢ ⎥ ⎣ ⎦ n = λ A v v i i i 37
General linear transformations: SVD � In general A will also contain rotations, not just scales: σ 1 1 1 σ 2 A = ∑ T A U V σ ⎡ ⎤ 1 ⎢ ⎥ σ [ ] [ ] ⎢ ⎥ T = 2 A u u … u v v … v ⎢ ⎥ 1 2 n 1 2 n � ⎢ ⎥ σ ⎢ ⎥ ⎣ ⎦ n 38
General linear transformations: SVD σ 1 1 1 σ 2 A = ∑ AV U σ ⎡ ⎤ 1 ⎢ ⎥ orthonormal orthonormal σ [ ] [ ] ⎢ ⎥ = 2 A v v … v u u … u ⎢ ⎥ 1 2 n 1 2 n � ⎢ ⎥ σ ⎢ ⎥ ⎣ ⎦ n = σ σ ≥ A v u , 0 i i i i 39
Recommend
More recommend