Lecture 24: Principal Component Analysis
Aykut Erdem
January 2017 Hacettepe University
This week: Motivation, PCA algorithms, Applications, PCA shortcomings, Autoencoders, Kernel PCA
slide by Barnabás Póczos and Aarti Singh
Instances (rows) and features (columns):

      H-WBC    H-RBC    H-Hgb     H-Hct     H-MCV      H-MCH     H-MCHC
A1    8.0000   4.8200   14.1000   41.0000   85.0000    29.0000   34.0000
A2    7.3000   5.0200   14.7000   43.0000   86.0000    29.0000   34.0000
A3    4.3000   4.4800   14.1000   41.0000   91.0000    32.0000   35.0000
A4    7.5000   4.4700   14.9000   45.0000   101.0000   33.0000   33.0000
A5    7.3000   5.5200   15.4000   46.0000   84.0000    28.0000   33.0000
A6    6.9000   4.8600   16.0000   47.0000   97.0000    33.0000   34.0000
A7    7.8000   4.6800   14.7000   43.0000   92.0000    31.0000   34.0000
A8    8.6000   4.8200   15.8000   42.0000   88.0000    33.0000   37.0000
A9    5.1000   4.7100   14.0000   43.0000   92.0000    30.0000   32.0000

Difficult to see the correlations between the features...
[Figure: one curve per patient; horizontal axis: measurement, vertical axis: value]
Difficult to compare the different patients...
[Figure: 2D scatter plot of C-Triglycerides vs. C-LDH, and 3D scatter plot of C-Triglycerides, C-LDH and M-EPI]
How can we visualize the other variables? ... difficult to see in 4 or higher dimensional spaces...
Given the centered data {x_1, ..., x_m}, compute the principal vectors:

1st PCA vector (we maximize the variance of the projection of x):

$$w_1 = \arg\max_{\|w\|=1} \frac{1}{m}\sum_{i=1}^{m}\{(w^T x_i)^2\}$$

2nd PCA vector (we maximize the variance of the projection in the residual subspace):

$$w_2 = \arg\max_{\|w\|=1} \frac{1}{m}\sum_{i=1}^{m}\{[w^T(x_i - w_1 w_1^T x_i)]^2\}$$

[Figure: a point x, the direction w_1, the PCA reconstruction x' = w_1(w_1^T x), and the residual x - x']
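The variance-maximization objective for the first PCA vector can be checked numerically. The sketch below is an illustration only: the toy data, the random seed and the helper name `projected_variance` are assumptions, not part of the lecture. It takes the top eigenvector of the sample covariance as w_1 and compares its projected variance against many random unit vectors.

```python
import numpy as np

def projected_variance(w, X):
    """(1/m) * sum_i (w^T x_i)^2 for the centered columns x_i of X (shape N x m)."""
    return np.mean((w @ X) ** 2)

rng = np.random.default_rng(0)
# Toy data (assumed): m = 500 points in N = 3 dimensions with unequal spread.
X = rng.normal(size=(3, 500)) * np.array([[3.0], [1.0], [0.3]])
X = X - X.mean(axis=1, keepdims=True)      # center each feature (row)

# w_1 is the top eigenvector of the sample covariance (1/m) X X^T.
eigvals, eigvecs = np.linalg.eigh(X @ X.T / X.shape[1])
w1 = eigvecs[:, -1]                        # eigh sorts eigenvalues ascending

# No random unit vector should achieve a larger projected variance than w_1.
W = rng.normal(size=(1000, 3))
W /= np.linalg.norm(W, axis=1, keepdims=True)
best_random = max(projected_variance(w, X) for w in W)
```

The projected variance of w_1 equals the largest eigenvalue of the covariance, which is why the eigenvector solves the arg max.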
Given w_1, ..., w_{k-1}, we calculate the kth principal vector w_k as before, maximizing the variance of the projection of x in the residual subspace:

$$w_k = \arg\max_{\|w\|=1} \frac{1}{m}\sum_{i=1}^{m}\Big\{\Big[w^T\Big(x_i - \sum_{j=1}^{k-1} w_j w_j^T x_i\Big)\Big]^2\Big\}$$

[Figure: a point x, the directions w_1 and w_2, the components w_1(w_1^T x) and w_2(w_2^T x), and the PCA reconstruction x' = w_1(w_1^T x) + w_2(w_2^T x)]
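One way to read the sequential definition is as a deflation loop: find the best direction, subtract the component along it from every point, and repeat on the residual. The following is a minimal sketch under that reading; the function names `top_direction` and `sequential_pca` and the toy data are assumptions made for illustration.

```python
import numpy as np

def top_direction(R):
    """Unit vector w maximizing (1/m) * sum_i (w^T r_i)^2 over the columns r_i of R."""
    _, vecs = np.linalg.eigh(R @ R.T / R.shape[1])
    return vecs[:, -1]                     # eigenvector of the largest eigenvalue

def sequential_pca(X, k):
    """Return an N x k matrix with columns w_1, ..., w_k for centered X (N x m)."""
    R = X.copy()                           # residual data, initially the data itself
    directions = []
    for _ in range(k):
        w = top_direction(R)
        directions.append(w)
        R = R - np.outer(w, w @ R)         # r_i <- r_i - w (w^T r_i)
    return np.column_stack(directions)

rng = np.random.default_rng(0)
# Toy data (assumed): 500 centered points in 3 dimensions.
X = rng.normal(size=(3, 500)) * np.array([[3.0], [1.0], [0.3]])
X = X - X.mean(axis=1, keepdims=True)
W = sequential_pca(X, 2)
```

Because each new direction lies in the residual subspace, the columns of `W` come out orthonormal, and up to sign they agree with the leading eigenvectors of the covariance matrix.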
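$$\Sigma = \frac{1}{m}\sum_{i=1}^{m}(x_i - \bar{x})(x_i - \bar{x})^T \quad \text{where} \quad \bar{x} = \frac{1}{m}\sum_{i=1}^{m} x_i$$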
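PCA algorithm(X, k): top k eigenvalues/eigenvectors
% X = N x m data matrix,
% ... each data point x_i = column vector, i = 1..m
  x̄ ← (1/m) Σ_{i=1..m} x_i
  X ← subtract the mean x̄ from each column vector x_i in X
  Σ ← X X^T                          % covariance matrix of X
  {λ_i, u_i}, i = 1..N ← eigenvalues/eigenvectors of Σ, with λ_1 ≥ λ_2 ≥ ... ≥ λ_N
  return {λ_i, u_i}, i = 1..k        % top k PCA components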
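The algorithm above translates almost line by line into NumPy. This is a sketch rather than a reference implementation: the function name `pca_eig` is an assumption, and the covariance is scaled by 1/m here, which only rescales the eigenvalues and leaves the eigenvectors unchanged.

```python
import numpy as np

def pca_eig(X, k):
    """Top-k eigenvalues and eigenvectors of the covariance of X.

    X is an N x m data matrix whose columns are the data points.
    Returns (eigenvalues of shape (k,), eigenvectors of shape (N, k)).
    """
    x_bar = X.mean(axis=1, keepdims=True)        # mean of the m data points
    Xc = X - x_bar                               # subtract the mean from each column
    Sigma = Xc @ Xc.T / X.shape[1]               # N x N sample covariance
    eigvals, eigvecs = np.linalg.eigh(Sigma)     # symmetric solver, ascending order
    order = np.argsort(eigvals)[::-1][:k]        # indices of the k largest eigenvalues
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 200))                    # toy data (assumed): N = 5, m = 200
lam, U = pca_eig(X, k=2)
```

`np.linalg.eigh` is used instead of `np.linalg.eig` because the covariance matrix is symmetric, so the eigenvalues are real and the eigenvectors orthonormal.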
Singular Value Decomposition of the centered data matrix X.
[Figure: SVD of X, with the samples dimension marked; the leading components are labelled "significant" and the remaining ones "noise"]
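The SVD route avoids forming the covariance matrix at all. The sketch below assumes the same N x m column layout as the earlier algorithm; the name `pca_svd` is an assumption. The left singular vectors of the centered X are the principal directions, and the squared singular values divided by m are the covariance eigenvalues.

```python
import numpy as np

def pca_svd(X, k):
    """Top-k PCA components of X (N x m, columns = data points) via the SVD."""
    Xc = X - X.mean(axis=1, keepdims=True)              # center the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)   # Xc = U diag(s) Vt
    eigvals = s[:k] ** 2 / X.shape[1]                   # eigenvalues of (1/m) Xc Xc^T
    return eigvals, U[:, :k]

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 50))                            # toy data (assumed)
lam, U = pca_svd(X, k=2)
```

`np.linalg.svd` returns the singular values in descending order, so no extra sorting is needed to pick the significant components.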
Can’t just use the given 256 x 256 pixels
Example data set: Images of faces
[Turk & Pentland], [Sirovich & Kirby]
Each face x is ...
Form X = [x_1, ..., x_m], the centered data matrix.
Compute Σ = X X^T.
Problem: Σ is 64K x 64K ... HUGE!!!

[Figure: data matrix X with columns x_1, ..., x_m; m faces, each with 256 x 256 real values]

Method A: Build a PCA subspace for each person and check which subspace can reconstruct the test image the best.
Method B: Build one PCA database for the whole dataset and then classify based on the weights.
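Method A can be sketched as follows. Everything concrete here is an assumption for illustration: the helper names, the choice of k, and above all the synthetic 20-pixel "images", which stand in for real face data. Each person gets a mean and a PCA basis, and a test image is assigned to the person whose subspace leaves the smallest reconstruction error.

```python
import numpy as np

def fit_subspace(X, k):
    """Mean and top-k PCA basis for one person's images X (N x m, columns = images)."""
    mean = X.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(X - mean, full_matrices=False)
    return mean, U[:, :k]

def reconstruction_error(x, mean, U):
    """Norm of the part of (x - mean) that the subspace spanned by U cannot explain."""
    xc = x.reshape(-1, 1) - mean
    return float(np.linalg.norm(xc - U @ (U.T @ xc)))

def classify(x, subspaces):
    """Label of the subspace that reconstructs x best."""
    return min(subspaces, key=lambda label: reconstruction_error(x, *subspaces[label]))

# Synthetic stand-in for face data (hypothetical): two "people", each with
# 30 images of 20 pixels lying near a person-specific 2-D subspace.
rng = np.random.default_rng(0)
basis = {"a": rng.normal(size=(20, 2)), "b": rng.normal(size=(20, 2))}
train = {p: B @ rng.normal(size=(2, 30)) + 0.01 * rng.normal(size=(20, 30))
         for p, B in basis.items()}
subspaces = {p: fit_subspace(X, k=2) for p, X in train.items()}

test_image = basis["a"] @ np.array([1.0, -2.0])   # a new image of person "a"
predicted = classify(test_image, subspaces)
```

Method B would instead fit a single basis on all images and compare the projection weights U^T x of the test image with those of the training images.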
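If v is an eigenvector of L = X^T X, then Xv is an eigenvector of Σ = X X^T.

Proof:

$$L v = \gamma v \;\Rightarrow\; X^T X v = \gamma v \;\Rightarrow\; X (X^T X v) = \gamma X v \;\Rightarrow\; (X X^T)(X v) = \gamma (X v) \;\Rightarrow\; \Sigma (X v) = \gamma (X v)$$

[Figure: data matrix X with columns x_1, ..., x_m; m faces, each with 256 x 256 real values]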
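The identity can be verified numerically on a small case. In this sketch the sizes N and m, the random data and the tolerance used to discard null directions are all assumptions. Only the m x m matrix L is decomposed, and the N x N matrix Σ is never formed.

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 1000, 20                        # N pixels per image, m images, N >> m (assumed)
X = rng.normal(size=(N, m))
X -= X.mean(axis=1, keepdims=True)     # centered data matrix, columns = images

L = X.T @ X                            # small m x m matrix instead of N x N
gammas, V = np.linalg.eigh(L)
order = np.argsort(gammas)[::-1]       # sort eigenpairs by decreasing eigenvalue
gammas, V = gammas[order], V[:, order]

# Centering removes one degree of freedom, so one eigenvalue is numerically zero.
keep = gammas > 1e-8 * gammas[0]
U = X @ V[:, keep]                     # each column X v is an eigenvector of X X^T
U /= np.linalg.norm(U, axis=0)         # rescale to unit length

# Check Sigma (X v) = gamma (X v) using only matrix-vector style products.
lhs = X @ (X.T @ U)
rhs = U * gammas[keep]
```

The cost of the eigen-decomposition now depends on the number of images m rather than on the number of pixels N.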
... faster if train with...
http://en.wikipedia.org/wiki/Discrete_cosine_transform
[Figure: input x, projection U x, reconstruction x']
PCA doesn’t know labels!
Fisher Linear Discriminant
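A small synthetic example can make the contrast concrete. The data, seed and variable names below are assumptions, and in this block the rows (not the columns) are the data points. Two classes share a large variance along the first axis but differ only along the second. PCA, which ignores labels, picks the first axis, while the standard two-class Fisher direction, proportional to S_W^{-1}(mu_1 - mu_0), picks the second.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Both classes spread widely along axis 0; they are separated only along axis 1.
class0 = rng.normal(size=(n, 2)) * [5.0, 0.3] + [0.0, -1.0]
class1 = rng.normal(size=(n, 2)) * [5.0, 0.3] + [0.0, 1.0]
data = np.vstack([class0, class1])

# PCA: top eigenvector of the covariance of all points, labels ignored.
centered = data - data.mean(axis=0)
_, vecs = np.linalg.eigh(centered.T @ centered / len(data))
w_pca = vecs[:, -1]

# Fisher linear discriminant: w proportional to S_W^{-1} (mu1 - mu0),
# where S_W is the within-class scatter matrix.
mu0, mu1 = class0.mean(axis=0), class1.mean(axis=0)
S_w = (class0 - mu0).T @ (class0 - mu0) + (class1 - mu1).T @ (class1 - mu1)
w_fld = np.linalg.solve(S_w, mu1 - mu0)
w_fld /= np.linalg.norm(w_fld)
```

Projecting onto `w_pca` mixes the two classes almost completely, whereas projecting onto `w_fld` keeps them apart.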
slide by Javier Hernandez Rivera