ST 810-006 Statistics and Financial Risk
Section 1: Principal Component Analysis

Background

• Principal Component Analysis (PCA) is a tool for looking at multivariate data.
• General setup: we observe several variables for each of several cases.
• In our context, the variables are financial:
  • interest rates for various maturities;
  • log returns for various stocks;
  • exchange rates between USD and various other currencies.
• Each case consists of the values of those variables on a given date.

• The general idea behind PCA (and Factor Analysis, FA) is that the way the variables covary can be attributed to common underlying forces.
• For example, stock market returns are all affected by overall market sentiment.
• We look for:
  • common modes of variation (PCA);
  • unobserved (latent) factors (FA).

Matrix methods

• Write $y_{t,j}$ for the value of the $j$th variable on the $t$th date.
• Assemble these into a data matrix $X$, where $x_{t,j}$ might be:
  • raw data $y_{t,j}$;
  • centered data $y_{t,j} - \bar{y}_j$, where $\bar{y}_j$ is the average, over time, of the $j$th variable:
    $$\bar{y}_j = \frac{1}{T} \sum_{t=1}^{T} y_{t,j};$$
  • standardized (or scaled) data $(y_{t,j} - \bar{y}_j) / s_j$, where $s_j$ is the standard deviation, again over time, of the $j$th variable:
    $$s_j = \sqrt{\frac{1}{T} \sum_{t=1}^{T} (y_{t,j} - \bar{y}_j)^2}.$$

• The data are always centered by default.
• But when all variables vary naturally around zero, such as log returns of tradable assets, centering is not necessary.
• If the variables are in different units, they must be scaled to make them comparable.
• Even when they have common units, their variances may be very different, and scaling is again necessary.
• Scaling by the standard deviation is convenient, but nothing more.
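The three choices of data matrix (raw, centered, standardized) can be sketched in NumPy as follows. This is a minimal illustration, not part of the original slides: the function name `build_data_matrix` and the sample values are hypothetical, and the standard deviation uses the $1/T$ divisor from the definition of $s_j$.

```python
import numpy as np

def build_data_matrix(y, center=True, scale=False):
    """Return the data matrix X from raw observations y (T dates x J variables)."""
    y = np.asarray(y, dtype=float)
    x = y.copy()
    if center or scale:
        x = x - y.mean(axis=0)       # subtract the time average of each variable
    if scale:
        s = y.std(axis=0, ddof=0)    # ddof=0 gives the 1/T divisor used for s_j
        x = x / s
    return x

# Hypothetical example: 5 dates, 2 variables on very different scales.
y = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 200.0],
              [4.0, 500.0],
              [5.0, 400.0]])
x_centered = build_data_matrix(y, center=True)
x_scaled = build_data_matrix(y, scale=True)
```

After scaling, every column has mean 0 and standard deviation 1, so no single variable dominates merely because of its units.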

Modes of Variation

• Each mode of variation is a part of $X$ of the form $d\,u v'$, where:
  • $d > 0$ is a scalar multiplier;
  • $u$ is a column vector of length $T$, with one entry for each date;
  • $v'$ is a row vector of length $J$, with one entry for each variable;
  • in PCA, $u$ and $v'$ are normalized: $u'u = v'v = 1$.

• Note that $d\,u v'$ is a rank-1 matrix, and that any rank-1 matrix can be written in this form.
• Terminology:
  • The entries of the (normalized) row vector $v'$ are called the loadings for the mode.
  • The entries of the (unnormalized) column vector $d\,u$ are called the scores for the mode.
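As a small illustration of the claim that any rank-1 matrix can be written as $d\,u v'$, the sketch below takes an arbitrary outer product $a b'$ and normalizes it. The helper `normalize_rank_one` and the vectors `a` and `b` are hypothetical, and both vectors are assumed to be nonzero.

```python
import numpy as np

def normalize_rank_one(a, b):
    """Rewrite the outer product a b' as d * u v' with u'u = v'v = 1 and d > 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    norm_a, norm_b = np.linalg.norm(a), np.linalg.norm(b)
    d = norm_a * norm_b          # all of the scale is collected in d
    u = a / norm_a
    v = b / norm_b
    return d, u, v

a = np.array([3.0, 0.0, 4.0])        # one entry per date (T = 3)
b = np.array([1.0, 2.0, 2.0, 0.0])   # one entry per variable (J = 4)
d, u, v = normalize_rank_one(a, b)

mode = d * np.outer(u, v)   # the T x J rank-1 mode of variation
loadings = v                # normalized, one per variable
scores = d * u              # unnormalized, one per date
```

The outer product of the scores and the loadings reproduces the mode exactly.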

Principal Component

• PCA and FA differ in how the loadings and scores are constructed.
• In PCA, the first (or dominant) component is defined to be the best approximation to $X$ in the Frobenius norm:
  $$d_1 u_1 v_1' = \operatorname*{argmin}_{d, u, v} \| X - d\,u v' \|_F,$$
  where for any $T \times J$ matrix $A$,
  $$\| A \|_F = \sqrt{\sum_{t=1}^{T} \sum_{j=1}^{J} a_{t,j}^2}.$$

• The next component is the one that gives the best rank-2 approximation:
  $$d_2 u_2 v_2' = \operatorname*{argmin}_{d, u, v} \| X - d_1 u_1 v_1' - d\,u v' \|_F.$$
• If, as here, we fix the first component and optimize over only the second, the solution can be shown to have the orthogonality properties
  $$u_1' u_2 = v_1' v_2 = 0. \qquad (1)$$
• If, instead, we optimize over both components simultaneously, we need to impose a constraint like (1), and the solution is essentially the same.

• Components 3 through $J$ are defined similarly, either:
  • incrementally, in which case they automatically satisfy the generalization of (1);
  • or simultaneously, constrained by the generalization of (1).
• Again, the solution is the same either way.
• Note that for each component, $d_k u_k v_k' = (-d_k u_k)(-v_k')$.
• That is, the loadings and scores are determined only up to multiplication by $-1$.
• You should feel free to change the sign if it simplifies interpretation, provided you change both the loadings and the scores.
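The incremental definition can be sketched numerically. The code below is only an illustration under stated assumptions: it fits each rank-1 mode by alternating least squares (equivalent to power iteration), which is a technique chosen here for the example and not one named in the slides. The functions `dominant_mode` and `incremental_components` are hypothetical, the test matrix is constructed to have singular values 4, 2, and 1, and the number of requested components is assumed not to exceed the rank of the matrix.

```python
import numpy as np

def dominant_mode(x, n_iter=200, seed=0):
    """Best rank-1 fit d * u v' to x in the Frobenius norm (alternating least squares)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(x.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        u = x @ v                # best u for the current v, up to scale
        u /= np.linalg.norm(u)
        v = x.T @ u              # best v for the current u, up to scale
        v /= np.linalg.norm(v)
    d = u @ x @ v                # optimal multiplier for normalized u and v
    return d, u, v

def incremental_components(x, k):
    """Fit k components one at a time, subtracting each fitted mode from the residual."""
    residual = np.array(x, dtype=float)
    components = []
    for _ in range(k):
        d, u, v = dominant_mode(residual)
        components.append((d, u, v))
        residual = residual - d * np.outer(u, v)
    return components

# Hypothetical 6 x 3 matrix built to have singular values 4, 2, 1.
rng = np.random.default_rng(1)
u0, _ = np.linalg.qr(rng.standard_normal((6, 3)))
v0, _ = np.linalg.qr(rng.standard_normal((3, 3)))
x = u0 @ np.diag([4.0, 2.0, 1.0]) @ v0.T

(d1, u1, v1), (d2, u2, v2) = incremental_components(x, 2)
flipped = np.outer(-d1 * u1, -v1)   # same mode with both signs changed
```

In this sketch the second component comes out orthogonal to the first without any explicit constraint, and flipping the sign of both the scores and the loadings leaves the mode unchanged.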

Singular Value Decomposition

• PCA can be carried out using the Singular Value Decomposition (SVD).
• Any $T \times J$ matrix $X$, $T \ge J$, can be factorized as
  $$X = U D V' \qquad (2)$$
  where:
  • $U$ is $T \times J$ with $U'U = I_J$;
  • $D$ is $J \times J$ diagonal, with diagonal entries $d_1 \ge d_2 \ge \dots \ge d_J \ge 0$;
  • $V$ is $J \times J$ with $V'V = I_J$.

• Equation (2) can also be written
  $$X = \sum_{k=1}^{J} d_k u_k v_k',$$
  where $u_k$ is the $k$th column of $U$ and $v_k'$ is the $k$th row of $V'$.
• Easily shown: $d_k u_k v_k'$ is the $k$th PCA component.
• Terminology: $d_k$, $u_k$, and $v_k'$ are the $k$th singular value, left singular vector, and right singular vector, respectively.
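A minimal NumPy sketch of the factorization (2) and its expansion into rank-1 components, assuming simulated data in place of real financial series. `numpy.linalg.svd` with `full_matrices=False` returns the thin form used above: $U$ is $T \times J$, the singular values come back as a vector in decreasing order, and the third output is $V'$ rather than $V$.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 3))   # hypothetical T = 8 dates, J = 3 variables
x = x - x.mean(axis=0)            # centered data matrix

# Thin SVD: u is T x J, d holds d_1 >= ... >= d_J, vt is V' (J x J).
u, d, vt = np.linalg.svd(x, full_matrices=False)

# The k-th PCA component is d_k * u_k v_k'.
components = [d[k] * np.outer(u[:, k], vt[k, :]) for k in range(len(d))]
x_rebuilt = sum(components)

# Frobenius error left after keeping only the first component.
rank_one_error = np.linalg.norm(x - components[0])
```

Summing all $J$ components recovers $X$, and the error of the rank-1 approximation equals $\sqrt{d_2^2 + \dots + d_J^2}$.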

Loadings and Scores

• Note that the SVD factorization $X = U D V'$ and the orthogonality conditions $U'U = V'V = I_J$ imply that
  $$U = X V D^{-1}, \quad D = U' X V, \quad \text{and} \quad V' = D^{-1} U' X,$$
  where the expressions involving $D^{-1}$ require every $d_k > 0$.
• That is, any one of $X$, $U$, $D$, and $V'$ can be calculated directly from the other three.
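These identities are easy to check numerically. The sketch below assumes a simulated full-rank matrix, so that every singular value is positive and $D^{-1}$ exists; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal((10, 4))   # hypothetical T = 10, J = 4, full rank

u, d, vt = np.linalg.svd(x, full_matrices=False)
v = vt.T
d_inv = np.diag(1.0 / d)           # D^{-1}, valid because every d_k > 0 here

u_from_rest = x @ v @ d_inv        # U  = X V D^{-1}
d_from_rest = u.T @ x @ v          # D  = U' X V
vt_from_rest = d_inv @ u.T @ x     # V' = D^{-1} U' X

scores = x @ v                     # equals U D: column k holds the scores d_k u_k
```

Each recovered matrix matches the corresponding SVD output up to floating-point error.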

Covariance and Correlation

• PCA is often described in terms of the covariance or correlation matrix, rather than the data matrix.
• If $X$ is the centered data matrix, then $\frac{1}{T} X'X$ is the sample covariance matrix.
• If $X$ is the standardized data matrix, then $\frac{1}{T} X'X$ is the sample correlation matrix.

• In either case, the SVD shows that
  $$\frac{1}{T} X'X = V \left( \frac{1}{T} D^2 \right) V'.$$
• That is, the eigenvectors of $\frac{1}{T} X'X$ are the columns of $V$, which are the transposes of the rows of loadings.
• Also, the eigenvalues of $\frac{1}{T} X'X$ are $\frac{1}{T} d_k^2$.
• So the loadings and singular values can be found from the spectral decomposition of the correlation matrix or covariance matrix, as appropriate.
• For the scores, you need the original data matrix: $U D = X V$.
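The link between the spectral decomposition and the SVD can be checked with a short sketch. The data are simulated and the mixing matrix is arbitrary; `numpy.linalg.eigh` returns eigenvalues in ascending order, so they are reversed to match the ordering of the singular values. Eigenvectors are compared only up to sign, since each is determined only up to multiplication by $-1$.

```python
import numpy as np

rng = np.random.default_rng(3)
mix = np.array([[2.0, 0.5, 0.0],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 0.5]])            # arbitrary mixing, for correlated columns
y = rng.standard_normal((200, 3)) @ mix
x = y - y.mean(axis=0)                       # centered data matrix
T = x.shape[0]

cov = x.T @ x / T                            # sample covariance with 1/T divisor

eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
order = np.argsort(eigvals)[::-1]            # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

u, d, vt = np.linalg.svd(x, full_matrices=False)

scores_from_eig = x @ eigvecs                # scores need the data matrix: U D = X V
```

The eigenvalues agree with $d_k^2 / T$, and the eigenvectors agree with the columns of $V$ up to sign.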

• Note that the variances of the variables are the diagonal entries of $\frac{1}{T} X'X$.
• The total variance is
  $$\operatorname{tr}\left( \frac{1}{T} X'X \right) = \operatorname{tr}\left( V \left( \frac{1}{T} D^2 \right) V' \right) = \frac{1}{T} \operatorname{tr} D^2.$$
• That is, each squared singular value measures the contribution of the component to the total variance.
• If the data were scaled, each variance is 1, and
  $$\operatorname{tr}\left( \frac{1}{T} X'X \right) = \frac{1}{T} \operatorname{tr} D^2 = J.$$
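A common use of this result is to report the share of total variance carried by each component. The sketch below assumes simulated, standardized data (so the total variance should equal $J$); the names `component_variance`, `share`, and `cumulative_share` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
y = rng.standard_normal((250, 4)) * np.array([1.0, 5.0, 0.2, 2.0])
x = (y - y.mean(axis=0)) / y.std(axis=0)     # standardized, 1/T divisor
T, J = x.shape

d = np.linalg.svd(x, compute_uv=False)       # singular values only

component_variance = d ** 2 / T              # contribution of each component
total_variance = np.trace(x.T @ x / T)       # equals J for standardized data
share = component_variance / component_variance.sum()
cumulative_share = np.cumsum(share)
```

The cumulative share is one way to see how many components are needed to account for most of the variation.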
