Lecture 3: Principal Component Analysis
Lin ZHANG, PhD
School of Software Engineering, Tongji University
Fall 2020
Content
- Matrix Differentiation
- Lagrange Multiplier
- Principal Component Analysis
- Eigen-face based face classification
Matrix differentiation
- Function is a vector and the variable is a scalar
Definition: for $\mathbf{f}(t) = \left[ f_1(t), f_2(t), \ldots, f_n(t) \right]^T$,

$$\frac{d\mathbf{f}}{dt} = \left[ \frac{df_1(t)}{dt}, \frac{df_2(t)}{dt}, \ldots, \frac{df_n(t)}{dt} \right]^T$$
- Function is a matrix and the variable is a scalar
Definition: for $\mathbf{F}(t) = \left[ f_{ij}(t) \right]_{n \times m} = \begin{bmatrix} f_{11}(t) & f_{12}(t) & \cdots & f_{1m}(t) \\ \vdots & \vdots & & \vdots \\ f_{n1}(t) & f_{n2}(t) & \cdots & f_{nm}(t) \end{bmatrix}$,

$$\frac{d\mathbf{F}}{dt} = \left[ \frac{df_{ij}(t)}{dt} \right]_{n \times m} = \begin{bmatrix} \frac{df_{11}(t)}{dt} & \cdots & \frac{df_{1m}(t)}{dt} \\ \vdots & & \vdots \\ \frac{df_{n1}(t)}{dt} & \cdots & \frac{df_{nm}(t)}{dt} \end{bmatrix}$$
- Function is a scalar and the variable is a vector
Definition: for $f(\mathbf{x})$ with $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$,

$$\frac{df}{d\mathbf{x}} = \left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right]^T$$

In a similar way, for $f(\mathbf{x})$ with the row vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)$,

$$\frac{df}{d\mathbf{x}} = \left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right]$$
- Function is a vector and the variable is a vector
Definition: for $\mathbf{x} = [x_1, x_2, \ldots, x_n]^T$ and $\mathbf{y}(\mathbf{x}) = [y_1(\mathbf{x}), y_2(\mathbf{x}), \ldots, y_m(\mathbf{x})]^T$,

$$\frac{d\mathbf{y}^T}{d\mathbf{x}} = \begin{bmatrix} \frac{\partial y_1(\mathbf{x})}{\partial x_1} & \frac{\partial y_2(\mathbf{x})}{\partial x_1} & \cdots & \frac{\partial y_m(\mathbf{x})}{\partial x_1} \\ \vdots & \vdots & & \vdots \\ \frac{\partial y_1(\mathbf{x})}{\partial x_n} & \frac{\partial y_2(\mathbf{x})}{\partial x_n} & \cdots & \frac{\partial y_m(\mathbf{x})}{\partial x_n} \end{bmatrix} \in \mathbb{R}^{n \times m}$$
In a similar way,

$$\frac{d\mathbf{y}}{d\mathbf{x}^T} = \begin{bmatrix} \frac{\partial y_1(\mathbf{x})}{\partial x_1} & \frac{\partial y_1(\mathbf{x})}{\partial x_2} & \cdots & \frac{\partial y_1(\mathbf{x})}{\partial x_n} \\ \vdots & \vdots & & \vdots \\ \frac{\partial y_m(\mathbf{x})}{\partial x_1} & \frac{\partial y_m(\mathbf{x})}{\partial x_2} & \cdots & \frac{\partial y_m(\mathbf{x})}{\partial x_n} \end{bmatrix} \in \mathbb{R}^{m \times n}$$
Example: let $\mathbf{x} = (x_1, x_2)^T$ and $\mathbf{y}(\mathbf{x}) = [y_1(\mathbf{x}), y_2(\mathbf{x}), y_3(\mathbf{x})]^T$ with $y_1(\mathbf{x}) = x_1 x_2^2$, $y_2(\mathbf{x}) = x_1 - x_2^2$, $y_3(\mathbf{x}) = 3x_1 + x_2^3$. Then

$$\frac{d\mathbf{y}^T}{d\mathbf{x}} = \begin{bmatrix} \frac{\partial y_1(\mathbf{x})}{\partial x_1} & \frac{\partial y_2(\mathbf{x})}{\partial x_1} & \frac{\partial y_3(\mathbf{x})}{\partial x_1} \\ \frac{\partial y_1(\mathbf{x})}{\partial x_2} & \frac{\partial y_2(\mathbf{x})}{\partial x_2} & \frac{\partial y_3(\mathbf{x})}{\partial x_2} \end{bmatrix} = \begin{bmatrix} x_2^2 & 1 & 3 \\ 2x_1 x_2 & -2x_2 & 3x_2^2 \end{bmatrix}$$
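As a quick check of the example above (whose functions are as reconstructed here), a few lines of sympy compute the same matrix:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
x = sp.Matrix([x1, x2])
y = sp.Matrix([x1 * x2**2, x1 - x2**2, 3*x1 + x2**3])

# y.jacobian(x) has entry (i, j) = dy_i/dx_j, so dy^T/dx is its transpose
print(y.jacobian(x).T)
# Matrix([[x2**2, 1, 3], [2*x1*x2, -2*x2, 3*x2**2]])
```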
- Function is a scalar and the variable is a matrix
Definition: for $f(\mathbf{X})$ with $\mathbf{X} \in \mathbb{R}^{m \times n}$,

$$\frac{df}{d\mathbf{X}} = \begin{bmatrix} \frac{\partial f}{\partial x_{11}} & \frac{\partial f}{\partial x_{12}} & \cdots & \frac{\partial f}{\partial x_{1n}} \\ \vdots & \vdots & & \vdots \\ \frac{\partial f}{\partial x_{m1}} & \frac{\partial f}{\partial x_{m2}} & \cdots & \frac{\partial f}{\partial x_{mn}} \end{bmatrix}$$
- Useful results

(1) $\mathbf{a}, \mathbf{x} \in \mathbb{R}^{n \times 1}$. Then $\dfrac{d(\mathbf{a}^T\mathbf{x})}{d\mathbf{x}} = \dfrac{d(\mathbf{x}^T\mathbf{a})}{d\mathbf{x}} = \mathbf{a}$. (How to prove?)

(2) $A \in \mathbb{R}^{m \times n}$, $\mathbf{x} \in \mathbb{R}^{n \times 1}$. Then $\dfrac{d(A\mathbf{x})}{d\mathbf{x}} = A^T$.

(3) $A \in \mathbb{R}^{m \times n}$, $\mathbf{x} \in \mathbb{R}^{n \times 1}$. Then $\dfrac{d(\mathbf{x}^T A^T)}{d\mathbf{x}} = A^T$.

(4) $A \in \mathbb{R}^{n \times n}$, $\mathbf{x} \in \mathbb{R}^{n \times 1}$. Then $\dfrac{d(\mathbf{x}^T A \mathbf{x})}{d\mathbf{x}} = (A + A^T)\mathbf{x}$.

(5) $\mathbf{X} \in \mathbb{R}^{m \times n}$, $\mathbf{a} \in \mathbb{R}^{m \times 1}$, $\mathbf{b} \in \mathbb{R}^{n \times 1}$. Then $\dfrac{d(\mathbf{a}^T \mathbf{X} \mathbf{b})}{d\mathbf{X}} = \mathbf{a}\mathbf{b}^T$.

(6) $\mathbf{X} \in \mathbb{R}^{n \times m}$, $\mathbf{a} \in \mathbb{R}^{m \times 1}$, $\mathbf{b} \in \mathbb{R}^{n \times 1}$. Then $\dfrac{d(\mathbf{a}^T \mathbf{X}^T \mathbf{b})}{d\mathbf{X}} = \mathbf{b}\mathbf{a}^T$.

(7) $\mathbf{x} \in \mathbb{R}^{n \times 1}$. Then $\dfrac{d(\mathbf{x}^T\mathbf{x})}{d\mathbf{x}} = 2\mathbf{x}$.
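These identities are easy to verify numerically. Below is a minimal numpy sketch for result (4), comparing the analytic gradient of $f(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$ with central finite differences; all variable names are mine:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)

# Analytic gradient from result (4)
grad_analytic = (A + A.T) @ x

# Central finite differences as a numerical check
f = lambda v: v @ A @ v
eps = 1e-6
grad_numeric = np.zeros(n)
for i in range(n):
    e = np.zeros(n)
    e[i] = eps
    grad_numeric[i] = (f(x + e) - f(x - e)) / (2 * eps)

print(np.allclose(grad_analytic, grad_numeric, atol=1e-5))  # True
```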
Content
- Matrix Differentiation
- Lagrange Multiplier
- Principal Component Analysis
- Eigen-face based face classification
Lagrange multiplier
- Single-variable function
f(x) is differentiable in (a, b). If f(x) achieves an extremum at $x_0 \in (a, b)$, then

$$\left. \frac{df}{dx} \right|_{x_0} = 0$$

- Two-variable function

f(x, y) is differentiable in its domain. If f(x, y) achieves an extremum at $(x_0, y_0)$, then

$$\left. \frac{\partial f}{\partial x} \right|_{(x_0, y_0)} = 0, \quad \left. \frac{\partial f}{\partial y} \right|_{(x_0, y_0)} = 0$$
Lagrange multiplier
- In the general case

If $f(\mathbf{x})$, $\mathbf{x} \in \mathbb{R}^{n \times 1}$, achieves a local extremum at $\mathbf{x}_0$ and is differentiable at $\mathbf{x}_0$, then $\mathbf{x}_0$ is a stationary point of $f(\mathbf{x})$, i.e.,

$$\left. \frac{\partial f}{\partial x_1} \right|_{\mathbf{x}_0} = 0, \; \left. \frac{\partial f}{\partial x_2} \right|_{\mathbf{x}_0} = 0, \; \ldots, \; \left. \frac{\partial f}{\partial x_n} \right|_{\mathbf{x}_0} = 0$$

Or in other words, $\nabla f(\mathbf{x}) \big|_{\mathbf{x} = \mathbf{x}_0} = \mathbf{0}$.
Lagrange multiplier
- Lagrange multiplier is a strategy for finding the stationary points of a function subject to equality constraints

Problem: find stationary points of $y = f(\mathbf{x})$, $\mathbf{x} \in \mathbb{R}^{n \times 1}$, under the m constraints $g_k(\mathbf{x}) = 0$, $k = 1, 2, \ldots, m$.

Solution: construct

$$F(\mathbf{x}; \lambda_1, \ldots, \lambda_m) = f(\mathbf{x}) + \sum_{k=1}^{m} \lambda_k g_k(\mathbf{x})$$

If $(\mathbf{x}_0, \lambda_{10}, \lambda_{20}, \ldots, \lambda_{m0})$ is a stationary point of F, then $\mathbf{x}_0$ is a stationary point of $f(\mathbf{x})$ with the constraints.

(Joseph-Louis Lagrange, Jan. 25, 1736 to Apr. 10, 1813)
$(\mathbf{x}_0, \lambda_{10}, \ldots, \lambda_{m0})$ is a stationary point of F when, at that point,

$$\frac{\partial F}{\partial x_1} = 0, \; \ldots, \; \frac{\partial F}{\partial x_n} = 0, \quad \frac{\partial F}{\partial \lambda_1} = 0, \; \ldots, \; \frac{\partial F}{\partial \lambda_m} = 0$$

n + m equations!
Lagrange multiplier
- Example

Problem: for a given point p0 = (1, 0), among all the points lying on the line y = x, identify the one having the least distance to p0.

The (squared) distance is

$$f(x, y) = (x - 1)^2 + (y - 0)^2$$

Now we want to find the stationary point of f(x, y) under the constraint

$$g(x, y) = y - x = 0$$

According to the Lagrange multiplier method, construct another function

$$F(x, y, \lambda) = f(x, y) + \lambda g(x, y) = (x - 1)^2 + y^2 + \lambda(y - x)$$

and find the stationary point of $F(x, y, \lambda)$.
Setting all partial derivatives of F to zero:

$$\frac{\partial F}{\partial x} = 2(x - 1) - \lambda = 0, \quad \frac{\partial F}{\partial y} = 2y + \lambda = 0, \quad \frac{\partial F}{\partial \lambda} = y - x = 0 \;\Rightarrow\; x = 0.5, \; y = 0.5, \; \lambda = -1$$

So (0.5, 0.5, −1) is a stationary point of $F(x, y, \lambda)$, and hence (0.5, 0.5) is a stationary point of f(x, y) under the constraint.
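The same stationary point can be found symbolically; a short sympy sketch of this example:

```python
import sympy as sp

x, y, lam = sp.symbols('x y lambda')
F = (x - 1)**2 + y**2 + lam * (y - x)

# Stationary point of F: all first-order partial derivatives vanish
sol = sp.solve([sp.diff(F, v) for v in (x, y, lam)], [x, y, lam], dict=True)
print(sol)  # [{x: 1/2, y: 1/2, lambda: -1}]
```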
Content
- Matrix Differentiation
- Lagrange Multiplier
- Principal Component Analysis
- Eigen-face based face classification
Principal Component Analysis (PCA)
- PCA converts a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components
- This transformation is defined in such a way that the first principal component has the largest possible variance, and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (i.e., uncorrelated with) the preceding components
Principal Component Analysis (PCA)
- Illustration

Data points (x, y): (2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0), (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)

Along which orientation do the data points scatter most? How can we find it? De-correlation!
Principal Component Analysis (PCA)
- Identify the orientation with the largest variance

Suppose X contains n data points, and each data point is p-dimensional, that is,

$$\mathbf{X} = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n\}, \quad \mathbf{x}_i \in \mathbb{R}^{p \times 1}, \quad \mathbf{X} \in \mathbb{R}^{p \times n}$$

Now we want to find a unit vector $\alpha_1$ such that

$$\alpha_1 = \arg\max_{\alpha} \operatorname{var}\left(\mathbf{X}^T \alpha\right), \quad \alpha \in \mathbb{R}^{p \times 1}$$
$$\operatorname{var}\left(\mathbf{X}^T \alpha\right) = \frac{1}{n-1} \sum_{i=1}^{n} \left( \alpha^T \mathbf{x}_i - \alpha^T \mu \right)^2 = \frac{1}{n-1} \sum_{i=1}^{n} \alpha^T (\mathbf{x}_i - \mu)(\mathbf{x}_i - \mu)^T \alpha = \alpha^T C \alpha$$

where $\mu = \frac{1}{n} \sum_{i=1}^{n} \mathbf{x}_i$, and

$$C = \frac{1}{n-1} \sum_{i=1}^{n} (\mathbf{x}_i - \mu)(\mathbf{x}_i - \mu)^T$$

is the covariance matrix. (Note that $\alpha^T(\mathbf{x}_i - \mu) = (\mathbf{x}_i - \mu)^T \alpha$.)
Since $\alpha$ is a unit vector, $\alpha^T \alpha = 1$. Based on the Lagrange multiplier method, we need to find

$$\arg\max_{\alpha} \left( \alpha^T C \alpha - \lambda \left( \alpha^T \alpha - 1 \right) \right)$$

Since

$$\frac{d \left( \alpha^T C \alpha - \lambda \left( \alpha^T \alpha - 1 \right) \right)}{d\alpha} = 2C\alpha - 2\lambda\alpha,$$

setting the derivative to zero gives $C\alpha = \lambda\alpha$, i.e., $\alpha$ is an eigenvector of C. Thus,

$$\max \operatorname{var}\left(\mathbf{X}^T \alpha\right) = \max \alpha^T C \alpha = \max \alpha^T \lambda \alpha = \max \lambda$$
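As a numerical illustration of this conclusion, the sketch below (random data, names are mine) checks that no random unit vector attains a larger variance $\alpha^T C \alpha$ than the eigenvector of the largest eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(0)
Y = np.diag([3.0, 1.0, 0.3]) @ rng.standard_normal((3, 500))  # anisotropic data
C = np.cov(Y)
vals, vecs = np.linalg.eigh(C)           # ascending eigenvalues
best = vecs[:, -1] @ C @ vecs[:, -1]     # variance along the top eigenvector

alphas = rng.standard_normal((3, 1000))
alphas /= np.linalg.norm(alphas, axis=0)          # random unit vectors
variances = ((alphas.T @ C) * alphas.T).sum(axis=1)
print(variances.max() <= best + 1e-12)            # True
```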
Thus, $\alpha_1$ should be the eigenvector of C corresponding to the largest eigenvalue of C.

What is another orientation $\alpha_2$, orthogonal to $\alpha_1$, along which the data have the second largest variation? Answer: it is the eigenvector associated with the second largest eigenvalue $\lambda_2$ of C, and such a variance is $\lambda_2$. (Assignment!)
Results: the eigenvectors of C form a set of orthogonal basis vectors, and they are referred to as the Principal Components (PCs) of the original data X. You can consider the PCs as a set of orthogonal coordinate axes: under such a coordinate system, the variables are not correlated.
Principal Component Analysis (PCA)
- Express data in PCs

Suppose $\{\alpha_1, \alpha_2, \ldots, \alpha_p\}$ are the PCs derived from $\mathbf{X} \in \mathbb{R}^{p \times n}$. Then a data point $\mathbf{x}_i \in \mathbb{R}^{p \times 1}$ can be linearly represented by $\{\alpha_1, \alpha_2, \ldots, \alpha_p\}$, and the representation coefficients are

$$\mathbf{c}_i = \begin{bmatrix} \alpha_1^T \\ \alpha_2^T \\ \vdots \\ \alpha_p^T \end{bmatrix} \mathbf{x}_i$$

Actually, $\mathbf{c}_i$ gives the coordinates of $\mathbf{x}_i$ in the new coordinate system spanned by $\{\alpha_1, \alpha_2, \ldots, \alpha_p\}$.
Principal Component Analysis (PCA)
- Summary

$\mathbf{X} \in \mathbb{R}^{p \times n}$ is a data matrix; each column is a data sample. Suppose each of its features has zero mean. Then

$$\operatorname{cov}(\mathbf{X}) = \frac{1}{n-1} \mathbf{X} \mathbf{X}^T \equiv \mathbf{U} \Sigma \mathbf{U}^T$$

$\mathbf{U} = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_p]$ spans a new space, and the data in the new space are represented as

$$\mathbf{X}' = \mathbf{U}^T \mathbf{X}$$

In the new space, the dimensions of the data are not correlated.
Principal Component Analysis (PCA)
- Illustration

For the ten data points listed before,

$$\mathbf{X} = \begin{bmatrix} 2.5 & 0.5 & 2.2 & 1.9 & 3.1 & 2.3 & 2.0 & 1.0 & 1.5 & 1.1 \\ 2.4 & 0.7 & 2.9 & 2.2 & 3.0 & 2.7 & 1.6 & 1.1 & 1.6 & 0.9 \end{bmatrix}, \quad \operatorname{cov}(\mathbf{X}) = \begin{bmatrix} 5.549 & 5.539 \\ 5.539 & 6.449 \end{bmatrix}$$

(Strictly, the matrix shown is $\sum_i (\mathbf{x}_i - \mu)(\mathbf{x}_i - \mu)^T$, i.e., $(n-1)$ times the covariance; its eigenvectors are the same.) Eigenvalues: 11.5562 and 0.4418. Corresponding eigenvectors:

$$\alpha_1 = \begin{bmatrix} 0.6779 \\ 0.7352 \end{bmatrix}, \quad \alpha_2 = \begin{bmatrix} -0.7352 \\ 0.6779 \end{bmatrix}$$
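These numbers can be reproduced with a few lines of numpy:

```python
import numpy as np

X = np.array([[2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1],
              [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]])
Xc = X - X.mean(axis=1, keepdims=True)   # subtract each feature's mean

S = Xc @ Xc.T                            # scatter matrix: (n-1) * covariance
print(np.round(S, 3))                    # [[5.549 5.539] [5.539 6.449]]

vals, vecs = np.linalg.eigh(S)           # eigh: symmetric, ascending eigenvalues
print(np.round(vals, 4))                 # [ 0.4418 11.5562]
print(np.round(vecs, 4))                 # columns: eigenvectors (up to sign)
```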
Coordinates of the data points in the new coordinate system:

$$\text{newC} = \begin{bmatrix} \alpha_1^T \\ \alpha_2^T \end{bmatrix} \mathbf{X} = \begin{bmatrix} 0.6779 & 0.7352 \\ -0.7352 & 0.6779 \end{bmatrix} \mathbf{X} = \begin{bmatrix} 3.459 & 0.854 & 3.623 & 2.905 & 4.307 & 3.544 & 2.532 & 1.487 & 2.193 & 1.407 \\ -0.211 & 0.107 & 0.348 & 0.094 & -0.245 & 0.139 & -0.386 & 0.011 & -0.018 & -0.199 \end{bmatrix}$$
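Continuing the snippet above, the projection reproduces these coordinates (as on the slide, the raw rather than the mean-subtracted data are projected):

```python
alpha = np.array([[0.6779, 0.7352],      # alpha_1^T
                  [-0.7352, 0.6779]])    # alpha_2^T
newC = alpha @ X
print(np.round(newC, 3))
# [[ 3.459  0.854  3.623  2.905  4.307  3.544  2.532  1.487  2.193  1.407]
#  [-0.211  0.107  0.348  0.094 -0.245  0.139 -0.386  0.011 -0.018 -0.199]]
```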
Drawing newC on the plot shows the coordinates of the data points in the new coordinate system: in such a new system, the two variables are uncorrelated!
Principal Component Analysis (PCA)
- Data dimension reduction with PCA

Suppose $\mathbf{X} = \{\mathbf{x}_i\}_{i=1}^{n}$, $\mathbf{x}_i \in \mathbb{R}^{p \times 1}$, and $\{\alpha_i\}_{i=1}^{p}$, $\alpha_i \in \mathbb{R}^{p \times 1}$, are the PCs.

If all of $\{\alpha_i\}_{i=1}^{p}$ are used,

$$\mathbf{c}_i = \begin{bmatrix} \alpha_1^T \\ \alpha_2^T \\ \vdots \\ \alpha_p^T \end{bmatrix} \mathbf{x}_i$$

is still p-dimensional. If only $\{\alpha_i\}_{i=1}^{m}$, $m < p$, are used, $\mathbf{c}_i$ will be m-dimensional. That is, the dimension of the data is reduced!
Principal Component Analysis (PCA)
Suppose $\mathbf{X} = \{\mathbf{x}_i\}_{i=1}^{n}$, $\mathbf{x}_i \in \mathbb{R}^{p \times 1}$, and let

$$\operatorname{cov}(\mathbf{X}) \equiv \mathbf{U} \Sigma \mathbf{U}^T, \quad \mathbf{U} = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m, \ldots, \mathbf{u}_p]$$

where U spans a new space. For dimension reduction, only $\mathbf{u}_1 \sim \mathbf{u}_m$ are used:

$$\mathbf{U}_m = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m] \in \mathbb{R}^{p \times m}$$

Data expressed in $\mathbf{U}_m$:

$$\mathbf{X}_{dr} = \mathbf{U}_m^T \mathbf{X} \in \mathbb{R}^{m \times n}$$
Principal Component Analysis (PCA)
- Recovering the dimension-reduced data

Suppose $\mathbf{X}_{dr} \in \mathbb{R}^{m \times n}$ are the low-dimensional representations of the signals $\mathbf{X} \in \mathbb{R}^{p \times n}$. How can $\mathbf{X}_{dr}$ be recovered to the original p-d space?

$$\mathbf{X}_{re} = \mathbf{U}_m \mathbf{X}_{dr} = [\mathbf{x}_{re1}, \mathbf{x}_{re2}, \ldots, \mathbf{x}_{ren}] \in \mathbb{R}^{p \times n}$$
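A minimal sketch of this reduce-and-recover round trip on the illustration data; the function and variable names are mine, not the slides':

```python
import numpy as np

def pca_basis(X):
    """Return U whose columns are PCs sorted by decreasing eigenvalue."""
    vals, vecs = np.linalg.eigh(np.cov(X))   # np.cov centers and uses 1/(n-1)
    return vecs[:, np.argsort(vals)[::-1]]

U = pca_basis(X)       # X: the 2 x 10 data matrix from the illustration
Um = U[:, :1]          # keep m = 1 principal component
X_dr = Um.T @ X        # 1 x 10 low-dimensional representation
X_re = Um @ X_dr       # 2 x 10 recovery in the original p-d space
```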
Principal Component Analysis (PCA)
- Illustration

Coordinates of the data points in the new coordinate system:

$$\text{newC} = \begin{bmatrix} 0.6779 & 0.7352 \\ -0.7352 & 0.6779 \end{bmatrix} \mathbf{X}$$

If only the first PC (the one corresponding to the largest eigenvalue) is retained:

$$\text{newC} = \begin{pmatrix} 0.6779 & 0.7352 \end{pmatrix} \mathbf{X} = \begin{pmatrix} 3.459 & 0.854 & 3.623 & 2.905 & 4.307 & 3.544 & 2.532 & 1.487 & 2.193 & 1.407 \end{pmatrix}$$
Principal Component Analysis (PCA)
- Illustration: all PCs used vs. only 1 PC used (figure). Dimension reduction!
Principal Component Analysis (PCA)
- Illustration

If only the first PC (the one corresponding to the largest eigenvalue) is retained,

$$\text{newC} = \begin{pmatrix} 3.459 & 0.854 & 3.623 & 2.905 & 4.307 & 3.544 & 2.532 & 1.487 & 2.193 & 1.407 \end{pmatrix}$$

How to recover newC to the original space? Easy:

$$\text{newC}_{re} = \begin{pmatrix} 0.6779 & 0.7352 \end{pmatrix}^T \text{newC} = \begin{pmatrix} 0.6779 \\ 0.7352 \end{pmatrix} \begin{pmatrix} 3.459 & 0.854 & \cdots & 1.407 \end{pmatrix}$$
Principal Component Analysis (PCA)
- Illustration: data recovered using only 1 PC vs. the original data (figure)
Content
- Matrix Differentiation
- Lagrange Multiplier
- Principal Component Analysis
- Eigen-face based face classification
Eigen-face based face recognition
- Proposed in [1]
- Key ideas
  - Images in the original space are highly correlated
  - So, compress them to a low-dimensional subspace that captures key appearance characteristics of the visual DOFs
  - Use PCA for estimating the subspace (dimensionality reduction)
  - Compare two faces by projecting the images into the subspace and measuring the Euclidean distance between them

[1] M. Turk and A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience, 1991
Eigen-face based face recognition
- Training period
  - Step 1: prepare the images $\{\mathbf{x}_i\}$ for the training set
  - Step 2: compute the mean image and the covariance matrix
  - Step 3: compute the eigen-faces (eigenvectors) from the covariance matrix and only keep the M eigen-faces $(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_M)$ corresponding to the largest eigenvalues; these M eigen-faces define the face space
  - Step 4: compute the representation coefficients of each training image $\mathbf{x}_i$ on the M-d subspace:

$$\mathbf{r}_i = \begin{bmatrix} \mathbf{u}_1^T \\ \mathbf{u}_2^T \\ \vdots \\ \mathbf{u}_M^T \end{bmatrix} \mathbf{x}_i$$
Eigen-face based face recognition
- Testing period
  - Step 1: project the test image onto the M-d subspace to get its representation coefficients
  - Step 2: classify the coefficient pattern as either a known person or as unknown (usually the Euclidean distance is used here)
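Putting the training and testing periods together, a compact sketch of the pipeline (my own naming; it mean-subtracts faces before projecting, a common variant, and uses the small-sample trick described below):

```python
import numpy as np

def train_eigenfaces(X, M):
    """X: p x n, one vectorized training face per column. Returns (mean, U, R)."""
    mu = X.mean(axis=1, keepdims=True)
    A = X - mu                               # mean-subtracted faces
    vals, V = np.linalg.eigh(A.T @ A)        # small n x n eigen-problem
    order = np.argsort(vals)[::-1][:M]
    U = A @ V[:, order]                      # eigenfaces: eigenvectors of A A^T
    U /= np.linalg.norm(U, axis=0)           # normalize each eigenface
    R = U.T @ A                              # M x n training coefficients
    return mu, U, R

def classify(x, mu, U, R, threshold):
    """Nearest training face in the subspace, or None if all are too far."""
    t = U.T @ (x.reshape(-1, 1) - mu)        # M x 1 test coefficients
    d = np.linalg.norm(R - t, axis=0)        # l2 distances to each r_i
    i = int(np.argmin(d))
    return i if d[i] <= threshold else None
```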
Eigen-face based face recognition
- One technique to perform eigen-value decomposition on a large matrix

Usually, the number of training examples is much smaller than the dimensionality of the images. If each image is 100×100, the covariance matrix C is 10000×10000, and it is formidable to perform PCA on such a large matrix. However, the rank of the covariance matrix is limited by the number of training examples: if there are n training examples, there will be at most n−1 eigenvectors with non-zero eigenvalues.
Eigen-face based face recognition
The principal components can be computed more easily as follows. Let $\mathbf{X} \in \mathbb{R}^{p \times n}$ be the matrix of the n preprocessed training examples, where each column (p-d) contains one mean-subtracted image. The corresponding covariance matrix is $\frac{1}{n-1}\mathbf{X}\mathbf{X}^T \in \mathbb{R}^{p \times p}$: very large.

Instead, we perform eigen-value decomposition on $\mathbf{X}^T\mathbf{X} \in \mathbb{R}^{n \times n}$ (note $p \gg n$):

$$\mathbf{X}^T \mathbf{X} \mathbf{v}_i = \lambda_i \mathbf{v}_i$$

Pre-multiplying both sides by X,

$$\mathbf{X}\mathbf{X}^T \left( \mathbf{X}\mathbf{v}_i \right) = \lambda_i \left( \mathbf{X}\mathbf{v}_i \right)$$

so $\mathbf{X}\mathbf{v}_i$ is an eigenvector of $\mathbf{X}\mathbf{X}^T$.
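A quick numerical check of this trick; the sizes are chosen so that the p×p matrix is never formed:

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 10000, 8                          # pixel count >> number of samples
X = rng.standard_normal((p, n))
X -= X.mean(axis=1, keepdims=True)       # mean-subtract each pixel

vals, V = np.linalg.eigh(X.T @ X)        # small n x n problem
u = X @ V[:, -1]                         # candidate eigenvector of X X^T
# Check X X^T u = lambda u without ever forming the p x p matrix:
print(np.allclose(X @ (X.T @ u), vals[-1] * u))  # True
```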
Eigen-face based face recognition
- Example: training stage

4 classes, 8 samples altogether. Vectorize the 8 images and stack them into a data matrix X. Compute the eigen-faces (PCs) based on X. In this example, we retain the first 6 eigen-faces to span the subspace.
Eigen-face based face recognition
- Example: training stage

Reshaped into matrix form, the 6 eigen-faces u1 to u6 appear as "ghost faces" (figure). Then each training face is projected onto the learned subspace:

$$\mathbf{r}_i = \begin{bmatrix} \mathbf{u}_1^T \\ \mathbf{u}_2^T \\ \vdots \\ \mathbf{u}_6^T \end{bmatrix} \mathbf{x}_i$$
Eigen-face based face recognition
- Example: training stage

For instance, the 7th training image x7 (figure) decomposes as x7 = 0.33u1 − 0.74u2 + 0.07u3 − 0.24u4 + 0.28u5 + 0.43u6, so r7 = (0.33, −0.74, 0.07, −0.24, 0.28, 0.43)^T is the representation vector of the 7th training image.
Eigen-face based face recognition
- Example: testing stage

A new image testI comes; project it onto the learned subspace (figure): testI = 0.52u1 + 0.17u2 − 0.01u3 − 0.39u4 + 0.67u5 − 0.29u6, i.e.,

$$\mathbf{t} = \begin{bmatrix} \mathbf{u}_1^T \\ \mathbf{u}_2^T \\ \vdots \\ \mathbf{u}_6^T \end{bmatrix} \text{testI}$$

t = (0.52, 0.17, −0.01, −0.39, 0.67, −0.29)^T is the representation vector of this testing image.
Eigen-face based face recognition
- Example: testing stage

l2-norm based distances between t and the representation vectors r1 to r8 of the 8 training images: 1.62, 1.57, 1.70, 1.43, 0.22, 1.18, 1.54, 1.26. The smallest distance, 0.22, is far below all the others: this guy should be Lin!
Eigen-face based face recognition
- Example: testing stage

For another test image, the l2-norm based distances to r1 to r8 are: 1.34, 1.36, 1.85, 1.60, 0.92, 0.65, 1.66, 1.43. We set the threshold to 0.50; since even the smallest distance (0.65) exceeds it, this guy does not exist in the dataset!