SLIDE 1

Principal component analysis

Ingo Blechschmidt

December 17th, 2014

Kleine Bayessche AG

SLIDE 21

Outline

1 Theory
  • Singular value decomposition
  • Pseudoinverses
  • Low-rank approximation

2 Applications
  • Image compression
  • Proper orthogonal decomposition
  • Principal component analysis
  • Eigenfaces
  • Digit recognition


SLIDE 22

Singular value decomposition

Let $A \in \mathbb{R}^{n \times m}$. Then there exist numbers $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_m \geq 0$, an orthonormal basis $v_1, \ldots, v_m$ of $\mathbb{R}^m$, and an orthonormal basis $w_1, \ldots, w_n$ of $\mathbb{R}^n$, such that
$$A v_i = \sigma_i w_i, \qquad i = 1, \ldots, m.$$
In matrix language: $A = W \Sigma V^t$, where $V = (v_1 | \cdots | v_m) \in \mathbb{R}^{m \times m}$ is orthogonal, $W = (w_1 | \cdots | w_n) \in \mathbb{R}^{n \times n}$ is orthogonal, and $\Sigma = \operatorname{diag}(\sigma_1, \ldots, \sigma_m) \in \mathbb{R}^{n \times m}$.
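As a quick numerical illustration (not from the original slides), NumPy's numpy.linalg.svd returns exactly these factors $W$, $\sigma_i$ and $V^t$; a minimal check that $A = W \Sigma V^t$:

```python
import numpy as np

# A hypothetical rectangular matrix A ∈ R^(n×m) with n = 4, m = 3.
A = np.array([[3.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 3.0],
              [2.0, 0.0, 1.0]])

# full_matrices=True yields W ∈ R^(n×n) and V^t ∈ R^(m×m);
# sigma contains the singular values σ1 ≥ ... ≥ σm ≥ 0.
W, sigma, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild Σ ∈ R^(n×m) and verify the factorization A = W Σ V^t.
Sigma = np.zeros(A.shape)
np.fill_diagonal(Sigma, sigma)
print(np.allclose(A, W @ Sigma @ Vt))   # True up to rounding
```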


SLIDE 23
  • The singular value decomposition (SVD) exists for any real matrix, even rectangular ones.
  • The singular values $\sigma_i$ are unique.
  • The basis vectors are not unique.
  • If $A$ is orthogonally diagonalizable with eigenvalues $\lambda_i$ (for instance, if $A$ is symmetric), then $\sigma_i = |\lambda_i|$.
  • $\|A\|_{\mathrm{Frobenius}} = \sqrt{\sum_{ij} A_{ij}^2} = \sqrt{\operatorname{tr}(A^t A)} = \sqrt{\sum_i \sigma_i^2}$.
  • There exists a generalization to complex matrices. In this case, the matrix $A$ can be decomposed as $W \Sigma V^\star$, where $V^\star$ is the complex conjugate of $V^t$ and $W$ and $V$ are unitary matrices.
  • The singular value decomposition can also be formulated in a basis-free manner as a result about linear maps between finite-dimensional Hilbert spaces.

SLIDE 24

Existence proof (sketch):

  1. Consider the eigenvalue decomposition of the symmetric and positive-semidefinite matrix $A^t A$: We have an orthonormal basis $v_i$ of eigenvectors corresponding to eigenvalues $\lambda_i$.
  2. Set $\sigma_i := \sqrt{\lambda_i}$.
  3. Set $w_i := \frac{1}{\sigma_i} A v_i$ (for those $i$ with $\lambda_i \neq 0$).
  4. Then $A v_i = \sigma_i w_i$ holds trivially.
  5. The $w_i$ are orthonormal: $(w_i, w_j) = \frac{1}{\sigma_i \sigma_j} (A^t A v_i, v_j) = \frac{\lambda_i \delta_{ij}}{\sigma_i \sigma_j}$.
  6. If necessary, extend the $w_i$ to an orthonormal basis.

This proof gives rise to an algorithm for calculating the SVD, but unless $A^t A$ is small, it has undesirable numerical properties. (But note that one can also use $A A^t$!) Since the 1960s, there exists a stable iterative algorithm by Golub and Kahan.
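A minimal NumPy sketch of this construction (assuming, for simplicity, that $A$ has full column rank so that step 6 is unnecessary; as noted above, this route is only advisable for small matrices):

```python
import numpy as np

def svd_via_eig(A):
    """Naive SVD following the existence proof, via the eigendecomposition of A^t A.
    Assumes rank A = m, so all singular values are positive and step 6 is skipped."""
    lam, V = np.linalg.eigh(A.T @ A)          # step 1: A^t A = V diag(λ) V^t (λ ascending)
    lam, V = lam[::-1], V[:, ::-1]            # reorder so that σ1 ≥ σ2 ≥ ...
    sigma = np.sqrt(np.clip(lam, 0.0, None))  # step 2: σ_i = sqrt(λ_i)
    W = (A @ V) / sigma                       # step 3: w_i = A v_i / σ_i (column-wise)
    return W, sigma, V                        # steps 4-5: A v_i = σ_i w_i, w_i orthonormal

A = np.random.randn(5, 3)                     # hypothetical full-rank test matrix
W, sigma, V = svd_via_eig(A)
print(np.allclose(A, W @ np.diag(sigma) @ V.T))   # True
print(np.allclose(W.T @ W, np.eye(3)))            # the w_i are indeed orthonormal
```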

SLIDE 25

The pseudoinverse of a matrix

Let $A \in \mathbb{R}^{n \times m}$ and $b \in \mathbb{R}^n$. Then the solutions to the optimization problem
$$\|Ax - b\|^2 \longrightarrow \min \quad \text{under } x \in \mathbb{R}^m$$
are given by $x = A^+ b + v$ with $v \in \ker A$, where $A = W \Sigma V^t$ is the SVD and
$$A^+ = V \Sigma^+ W^t, \qquad \Sigma^+ = \operatorname{diag}(\sigma_1^{-1}, \ldots, \sigma_m^{-1}) \in \mathbb{R}^{m \times n}.$$


SLIDE 26
  • In the formula for $\Sigma^+$, set $0^{-1} := 0$.
  • If $A$ happens to be invertible, then $A^+ = A^{-1}$.
  • The pseudoinverse can be used for polynomial approximation: Let data points $(x_i, y_i) \in \mathbb{R}^2$, $1 \leq i \leq N$, be given. Want to find a polynomial $p(z) = \sum_{k=0}^{n} \alpha_k z^k$, $n \ll N$, such that
    $$\sum_{i=1}^{N} |p(x_i) - y_i|^2 \longrightarrow \min.$$
    In matrix language, this problem is written $\|Au - y\|^2 \longrightarrow \min$, where $u = (\alpha_0, \ldots, \alpha_n)^T \in \mathbb{R}^{n+1}$ and
    $$A = \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^n \\ 1 & x_2 & x_2^2 & \cdots & x_2^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_N & x_N^2 & \cdots & x_N^n \end{pmatrix} \in \mathbb{R}^{N \times (n+1)}, \qquad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix} \in \mathbb{R}^N.$$
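A short sketch of this polynomial fit in NumPy (the data points are made up for illustration); numpy.linalg.pinv computes $A^+$ via the SVD as defined above:

```python
import numpy as np

# Hypothetical noisy samples (x_i, y_i) of a quadratic polynomial.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)                   # N = 50 data points
y = 2.0 - x + 3.0 * x**2 + 0.1 * rng.standard_normal(x.size)

n = 2                                            # polynomial degree, n << N
A = np.vander(x, n + 1, increasing=True)         # columns: 1, x, x^2, ..., x^n
u = np.linalg.pinv(A) @ y                        # u = A^+ y minimizes ||A u - y||^2
print(u)                                         # ≈ (α_0, α_1, α_2) = (2, -1, 3)
```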

SLIDE 27

Low-rank approximation

Let $A = W \Sigma V^t \in \mathbb{R}^{n \times m}$ and $1 \leq r \leq n, m$. Then a solution to the optimization problem
$$\|A - M\|_{\mathrm{Frobenius}} \longrightarrow \min \quad \text{under all matrices } M \text{ with } \operatorname{rank} M \leq r$$
is given by $M = W \Sigma_r V^t$, where $\Sigma_r = \operatorname{diag}(\sigma_1, \ldots, \sigma_r, 0, \ldots, 0)$. The approximation error is
$$\|A - W \Sigma_r V^t\|_F = \sqrt{\sigma_{r+1}^2 + \cdots + \sigma_m^2}.$$
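A quick numerical check of this statement (random test matrix, not from the slides):

```python
import numpy as np

A = np.random.randn(8, 6)
W, sigma, Vt = np.linalg.svd(A, full_matrices=False)

r = 2
A_r = W[:, :r] @ np.diag(sigma[:r]) @ Vt[:r, :]   # best rank-r approximation
print(np.linalg.matrix_rank(A_r))                 # r
# The Frobenius error equals sqrt(σ_{r+1}^2 + ... + σ_m^2).
print(np.linalg.norm(A - A_r, 'fro'), np.sqrt(np.sum(sigma[r:]**2)))
```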


SLIDE 28
  • This is the Eckart–Young(–Mirsky) theorem.
  • Beware of false and incomplete proofs in the literature!
SLIDE 29

Image compression

Think of images as matrices. Substitute a matrix $W \Sigma V^t$ by $W \Sigma_r V^t$ with $r$ small. To reconstruct $W \Sigma_r V^t$, one only needs to know

  • the $r$ singular values $\sigma_1, \ldots, \sigma_r$ ($r$ numbers),
  • the first $r$ columns of $W$ (height · $r$ numbers), and
  • the top $r$ rows of $V^t$ (width · $r$ numbers).

Total amount: $r \cdot (1 + \text{height} + \text{width}) \ll \text{height} \cdot \text{width}$.
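As a rough illustration with made-up numbers: for a hypothetical 512 × 512 grayscale image and $r = 20$, one stores $20 \cdot (1 + 512 + 512) = 20{,}500$ numbers instead of $512 \cdot 512 = 262{,}144$, i.e. less than 8 % of the original.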


SLIDE 30
  • See http://speicherleck.de/iblech/stuff/pca-images.pdf for sample compressions and http://pizzaseminar.speicherleck.de/skript4/08-principal-component-analysis/svd-image.py for the Python code producing these images.
  • Image compression by singular value decomposition is mostly of academic interest only.
  • This might be for the following reasons: other compression algorithms have more efficient implementations; other algorithms are tailored to the specific properties of human vision; the basis vectors of other approaches (for instance, DCT) are similar to the most important singular basis vectors of a sufficiently large corpus of images.

  • See http://dsp.stackexchange.com/questions/7859/relationship-between-dct-and-pca.
SLIDE 31

Proper orthogonal decomposition

Given data points $x_i \in \mathbb{R}^N$, want to find a low-dimensional linear subspace which approximately contains the $x_i$. Minimize
$$J(U) := \sum_i \|x_i - P_U(x_i)\|^2$$
under all $r$-dimensional subspaces $U \subseteq \mathbb{R}^N$, $r \ll N$, where $P_U : \mathbb{R}^N \to \mathbb{R}^N$ is the orthogonal projection onto $U$.


SLIDE 32

Proper orthogonal decomposition

More concrete formulation: Minimize
$$J(u_1, \ldots, u_r) := \sum_i \Big\| x_i - \sum_{j=1}^{r} \langle x_i, u_j \rangle\, u_j \Big\|^2,$$
where $u_1, \ldots, u_r \in \mathbb{R}^N$ and $\langle u_j, u_k \rangle = \delta_{jk}$.


SLIDE 33
  • In the first formulation, the optimization domain is the Grassmannian of $r$-dimensional subspaces in $\mathbb{R}^N$. It is a compact topological space (in fact a manifold of dimension $r \cdot (N - r)$). Since $J(U)$ depends continuously on $U$, the optimization problem is guaranteed to have a solution.
  • The solution is in general not unique, not even locally. For instance, consider the four data points $(\pm 1, \pm 1)$ in $\mathbb{R}^2$. Then any line $U$ through the origin solves the optimization problem, with functional value $J(U) = 4$.
  • In the more concrete formulation, we look for an orthonormal basis of a suitable subspace. In this case, the optimization domain is a compact subset of $\mathbb{R}^{N \times r}$.
  • Since a given subspace possesses infinitely many orthonormal bases (at least for $r \geq 2$), solutions to this refined problem are never unique, not even locally.
  • Note that this is a non-convex optimization problem. Therefore common numerical techniques do not apply.

SLIDE 34

Collect the data points $x_i$ as columns of a matrix $X = (x_1 | \cdots | x_\ell) \in \mathbb{R}^{N \times \ell}$ and consider its singular value decomposition $X = W \Sigma V^t$. Then a solution to the minimization problem is given by the first $r$ columns of $W$, with approximation error
$$J = \sum_i \|x_i\|^2 - \sum_{j=1}^{r} \sigma_j^2.$$
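A minimal NumPy sketch of this recipe (synthetic data; if an affine subspace is wanted, the points should first be centered, as noted on the next slide):

```python
import numpy as np

# Synthetic data: ℓ = 200 points in R^50 lying close to a 3-dimensional subspace.
rng = np.random.default_rng(1)
basis = np.linalg.qr(rng.standard_normal((50, 3)))[0]
X = basis @ rng.standard_normal((3, 200)) + 0.01 * rng.standard_normal((50, 200))

W, sigma, Vt = np.linalg.svd(X, full_matrices=False)
r = 3
U = W[:, :r]                 # POD basis: the first r columns of W
PX = U @ (U.T @ X)           # projections P_U(x_i) of the data points

# J = sum_i ||x_i - P_U(x_i)||^2 equals sum_i ||x_i||^2 - sum_{j<=r} σ_j^2.
J = np.sum((X - PX)**2)
print(J, np.sum(X**2) - np.sum(sigma[:r]**2))   # the two values agree
```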

SLIDE 35
  • Proper orthogonal decomposition (POD) cannot be used to find low-dimensional submanifolds which approximately contain given data points. But check out kernel principal component analysis.
  • Also, POD does not work well with affine subspaces. But in this case, the fix is easy: Simply shift the data points so that their mean is zero.
  • POD is a general method for dimension reduction and can be used as a kind of “preconditioner” for many other algorithms: Simply substitute the given points $x_i$ by their projections $P_U(x_i)$.

SLIDE 36

Principal component analysis

Given observations $x_i^{(k)}$ of random variables $X^{(k)}$, want to find linearly uncorrelated principal components. Write $X = (x_1 | \cdots | x_\ell) \in \mathbb{R}^{N \times \ell}$. Calculate $X = W \Sigma V^t$. Then the principal components are the variables
$$Y^{(j)} = \sum_k W_{kj} X^{(k)}.$$
Most of the variance is captured by $Y^{(1)}$; the second most is captured by $Y^{(2)}$; and so on.
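A hedged sketch in NumPy (synthetic data anticipating the circle example in the remarks on the next slide; note the mean-centering discussed there):

```python
import numpy as np

# Three linearly correlated variables: radius, diameter, circumference of circles.
rng = np.random.default_rng(2)
radius = rng.uniform(1.0, 10.0, 500)                       # ℓ = 500 observations
X = np.vstack([radius, 2 * radius, 2 * np.pi * radius])
X = X + 0.01 * rng.standard_normal(X.shape)                # small measurement noise

X = X - X.mean(axis=1, keepdims=True)   # zero empirical mean for each variable
W, sigma, Vt = np.linalg.svd(X, full_matrices=False)

Y = W.T @ X                             # rows are the principal components Y^(j)
print(sigma**2 / np.sum(sigma**2))      # ≈ (1, 0, 0): one component captures the variance
print(np.round(np.cov(Y), 6))           # nearly diagonal: the Y^(j) are uncorrelated
```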


SLIDE 37
  • For instance, in a study about circles, the variables radius, diameter, and circumference are linearly correlated.
  • Principal component analysis (PCA) would automatically pick one of these attributes as a principal component.
  • We have to normalize the data to have zero empirical mean first. Then $X X^t$ is the empirical covariance matrix.
  • Note that, in the given sample, the $Y^{(j)}$ are indeed uncorrelated: the empirical covariance of $Y^{(j)}$ and $Y^{(k)}$ is
    $$(W e_j)^t X X^t (W e_k) = e_j^t W^t W \Sigma V^t V \Sigma^t W^t W e_k = e_j^t \Sigma \Sigma^t e_k,$$
    which vanishes for $j \neq k$ since $\Sigma \Sigma^t$ is diagonal.
  • Beware that PCA cannot resolve nonlinear correlation.
  • Also note that PCA is sensitive to outliers and is not scaling-independent.

SLIDE 38

Eigenfaces

Record sample faces $x_1, \ldots, x_N \in \mathbb{R}^{\text{width} \cdot \text{height}}$. Calculate a POD basis of eigenfaces. Recognize faces by looking at the coefficients of the most important eigenfaces.

Eigenfaces resemble faces.
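A rough sketch of the recipe (hypothetical array names and shapes; a real implementation would load actual face images and handle alignment):

```python
import numpy as np

def eigenface_coefficients(faces, new_face, r=20):
    """faces: array of shape (num_faces, height*width), one flattened face per row.
    Returns the coefficients of new_face in the first r eigenfaces."""
    mean_face = faces.mean(axis=0)
    X = (faces - mean_face).T                    # columns = centered sample faces
    W, sigma, Vt = np.linalg.svd(X, full_matrices=False)
    eigenfaces = W[:, :r]                        # POD basis of "eigenfaces"
    return eigenfaces.T @ (new_face - mean_face)

# Recognition then compares such coefficient vectors, e.g. by nearest neighbour.
```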


SLIDE 39

More eigenfaces


SLIDE 40
  • A naive approach is very sensitive to lighting, scale and translation.
  • But extensions are possible, for instance considering the eyes, the nose, and the mouth separately; this leads to eigeneyes, eigennoses, and eigenmouths.
  • Image credit:
    http://upload.wikimedia.org/wikipedia/commons/6/67/Eigenfaces.png
    http://www.cenparmi.concordia.ca/~jdong/eigenface.gif
  • Live demo:
    http://cognitrn.psych.indiana.edu/nsfgrant/FaceMachine/faceMachine.html
  • Examples:
    http://www.cs.princeton.edu/~cdecoro/eigenfaces/

SLIDE 41

Digit recognition

Apply POD for dimension reduction, then use some similarity measure or clustering technique. Results: [figure: POD coefficients of sample digits, described in the SLIDE 42 notes]
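The original code (linked in the notes below) only visualizes the coefficients; a hypothetical minimal classifier on top of the POD coefficients could look like this sketch:

```python
import numpy as np

def pod_basis(train_images, r=10):
    """train_images: array of shape (num_samples, 28*28) of flattened digit images."""
    mean = train_images.mean(axis=0)
    X = (train_images - mean).T                  # columns = centered training samples
    W, _, _ = np.linalg.svd(X, full_matrices=False)
    return mean, W[:, :r]                        # POD basis ("eigendigits")

def classify(image, mean, basis, train_coeffs, train_labels):
    """Nearest neighbour in the r-dimensional POD coefficient space,
    where train_coeffs = (train_images - mean) @ basis."""
    c = basis.T @ (image - mean)
    return train_labels[np.argmin(np.linalg.norm(train_coeffs - c, axis=1))]
```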


SLIDE 42
  • These images were produced by a Python program. The actual numerical code is very short (a few lines). Of course, the MNIST data set was used.
    http://pizzaseminar.speicherleck.de/skript4/08-principal-component-analysis/digit-recognition.py
  • The nine images show the values of the first ten POD coefficients of the first 30 samples of the digits 1 to 9. Each column corresponds to a different sample. The first four POD coefficients are drawn using more vertical space, so that visual weight aligns with importance.
  • One can clearly see that the POD coefficients differ for the different digits.
  • Also, one can see that the difference is not so great for similar digits like 5 and 8 or 7 and 9.

SLIDE 43

Eigendigits


SLIDE 44
  • These images show the first 12 POD basis vectors. The first basis vector is a kind of “prototypical digit”. The other basis vectors give subsequent “higher-order terms”.
  • Because I didn’t implement a similarity measure or clustering technique, I couldn’t calculate the percentage of correctly classified digits. However, presumably the success rate would not be too high: Like the eigenfaces approach, this naive implementation is sensitive to the specific position of the digits in the bounding box. Refined techniques are discussed in the literature.
