PCA
CS 446
Supervised learning
So far, we've done supervised learning: given $((x_i, y_i))_{i=1}^n$, find $f$ with $f(x_i) \approx y_i$ (k-nn, decision trees, ...).
Typical approach: minimize $\widehat{R}(f) = \frac{1}{n}\sum_{i=1}^n \ell(f(x_i), y_i)$, hope $R$ is small.
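As a small illustration of the average loss $\frac{1}{n}\sum_{i=1}^n \ell(f(x_i), y_i)$, here is a minimal sketch. The toy data, the predictor `f`, and the squared loss are assumptions made for this example only; they do not come from the slides.

```python
import numpy as np

def empirical_risk(f, loss, xs, ys):
    """Average loss of predictor f over the sample ((x_i, y_i))_{i=1}^n."""
    return float(np.mean([loss(f(x), y) for x, y in zip(xs, ys)]))

# Hypothetical toy data and predictor, chosen only for illustration.
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([0.0, 2.0, 4.0, 7.0])
f = lambda x: 2.0 * x                      # assumed predictor f(x) = 2x
sq_loss = lambda yhat, y: (yhat - y) ** 2  # assumed loss: squared error

risk = empirical_risk(f, sq_loss, xs, ys)
print(risk)
```

Only the last point is mispredicted (by 1), so the average over four points is small but nonzero.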
Now we are given only $(x_i)_{i=1}^n$, and the goal is...?
SVD of the data matrix $X \in \mathbb{R}^{n\times d}$ with rank $r$:
\[ X = \sum_{i=1}^{r} s_i u_i v_i^\top . \]
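The rank-one expansion $\sum_i s_i u_i v_i^\top$ can be checked numerically. The sketch below assumes NumPy and a small random matrix standing in for the data; it rebuilds the matrix from its singular triples.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))          # hypothetical data matrix, n=6, d=4

# Thin SVD: U is n x r, s holds singular values (descending), Vt is r x d.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Rebuild X as a sum of rank-one terms s_i * u_i * v_i^T.
X_sum = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))

print(np.allclose(X, X_sum))
```

Note that `np.linalg.svd` returns $V^\top$ (here `Vt`), so row `i` of `Vt` is $v_i$.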
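PCA with $k$ components: encoder $V_k$, decoder $V_k^\top$, encoded data $XV_k = U_kS_k$, where $U_k$, $S_k$, $V_k$ keep the top $k$ singular triples of the SVD $X = USV^\top$.
\[ \min_{D\in\mathbb{R}^{k\times d},\ E\in\mathbb{R}^{d\times k}} \|X - XED\|_F^2 = \|X - XV_kV_k^\top\|_F^2 . \]
$V_kV_k^\top$ performs orthogonal projection onto the subspace spanned by $V_k$.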
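The encode/decode steps can be sketched as follows, assuming NumPy and a random stand-in for $X$. The variable names (`Z`, `X_hat`, `P`) are illustrative and not from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 5))          # hypothetical data, n=8, d=5
k = 2

U, s, Vt = np.linalg.svd(X, full_matrices=False)
V_k = Vt[:k].T                           # d x k, top-k right singular vectors
U_k, S_k = U[:, :k], np.diag(s[:k])

Z = X @ V_k                              # encoded data, n x k
X_hat = Z @ V_k.T                        # decoded data, n x d
P = V_k @ V_k.T                          # d x d matrix V_k V_k^T

print(np.allclose(Z, U_k @ S_k))                   # encoded data equals U_k S_k
print(np.allclose(P @ P, P), np.allclose(P, P.T))  # P is idempotent and symmetric
```

Idempotence and symmetry together are what make `P` an orthogonal projection.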
\[ \min_{D\in\mathbb{R}^{k\times d},\ E\in\mathbb{R}^{d\times k}} \|X - XED\|_F^2 = \min_{D\in\mathbb{R}^{d\times k},\ D^\top D = I} \|X - XDD^\top\|_F^2 = \|X - XV_kV_k^\top\|_F^2 = \sum_{i=k+1}^{r} s_i^2 . \]
\[ \min_{D\in\mathbb{R}^{d\times k},\ D^\top D = I} \|X - XDD^\top\|_F^2 = \|X\|_F^2 - \max_{D\in\mathbb{R}^{d\times k},\ D^\top D = I} \|XD\|_F^2 = \|X\|_F^2 - \|XV_k\|_F^2 = \|X\|_F^2 - \sum_{i=1}^{k} s_i^2 . \]
Remark: $\sum_{i=1}^{k} s_i^2$ is identical across SVD choices.
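A quick numerical check of these values, as a sketch assuming NumPy and a random stand-in for $X$: the reconstruction error at $V_k$ should match the tail sum of squared singular values, and the captured energy $\|XV_k\|_F^2$ should match the head sum.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((10, 6))         # hypothetical data, n=10, d=6
k = 3

_, s, Vt = np.linalg.svd(X, full_matrices=False)
V_k = Vt[:k].T                           # top-k right singular vectors

err = np.linalg.norm(X - X @ V_k @ V_k.T, "fro") ** 2   # ||X - X V_k V_k^T||_F^2
captured = np.linalg.norm(X @ V_k, "fro") ** 2          # ||X V_k||_F^2
total = np.linalg.norm(X, "fro") ** 2                   # ||X||_F^2

print(np.isclose(err, np.sum(s[k:] ** 2)))
print(np.isclose(captured, np.sum(s[:k] ** 2)))
print(np.isclose(err, total - captured))
```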
For mean-centered $X$:
$\frac{1}{n} X^\top X \in \mathbb{R}^{d\times d}$ is the data covariance;
$\frac{1}{n} (XD)^\top (XD)$ is the data covariance after projection;
\[ \frac{1}{n}\|XD\|_F^2 = \frac{1}{n}\operatorname{tr}\big((XD)^\top (XD)\big) = \frac{1}{n}\sum_{i=1}^{k} (XDe_i)^\top (XDe_i), \]
the total variance of the data along the directions $De_i$.
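The covariance reading can be sketched numerically. This example assumes NumPy, synthetic centered data, and a random $D$ with orthonormal columns (obtained via QR); none of these choices come from the slides.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, k = 200, 4, 2
X = rng.standard_normal((n, d)) @ np.diag([3.0, 2.0, 1.0, 0.5])
X = X - X.mean(axis=0)                   # center the data

cov = X.T @ X / n                        # (1/n) X^T X
# bias=True selects the 1/n normalization; rowvar=False treats columns as variables.
cov_np = np.cov(X, rowvar=False, bias=True)

D, _ = np.linalg.qr(rng.standard_normal((d, k)))   # d x k, orthonormal columns

lhs = np.linalg.norm(X @ D, "fro") ** 2 / n
rhs = sum((X @ D[:, i]) @ (X @ D[:, i]) for i in range(k)) / n
per_dir_var = np.var(X @ D, axis=0)      # variance along each direction D e_i (1/n form)

print(np.allclose(cov, cov_np), np.isclose(lhs, rhs), np.isclose(lhs, per_dir_var.sum()))
```

Centering matters here: without it, $\frac{1}{n}X^\top X$ is a second-moment matrix rather than a covariance.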
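Example: data $(x_i)_{i=1}^n$ with $x_i \in \mathbb{R}^{784}$.
[Figure: variance captured by the first $k$ directions, coordinate projections vs. PCA projections. PCA projections: $\sum_{j=1}^{k}$ (variance in direction $v_j$). Coordinate projections: $\sum_{j=1}^{k}$ (variance in direction $e_j$) $= \sum_{j=1}^{k} (Xe_j)^\top (Xe_j)$. Each quantity is normalized by a squared Frobenius norm.]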
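The comparison between coordinate projections and PCA projections can be sketched on synthetic data. This example assumes NumPy, a random correlated data set instead of the 784-dimensional data above, and normalization by $\|X\|_F^2$, which is an assumption about the figure.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 300, 10
A = rng.standard_normal((d, d))
X = rng.standard_normal((n, d)) @ A      # hypothetical correlated features
X = X - X.mean(axis=0)

total = np.linalg.norm(X, "fro") ** 2
_, s, _ = np.linalg.svd(X, full_matrices=False)

# Fraction of ||X||_F^2 captured by the first k PCA directions v_1..v_k.
pca_frac = np.cumsum(s ** 2) / total
# Fraction captured by the first k coordinate directions e_1..e_k:
# (X e_j)^T (X e_j) is the squared norm of column j.
coord_frac = np.cumsum(np.sum(X ** 2, axis=0)) / total

for k in (1, 3, 5, 10):
    print(k, round(pca_frac[k - 1], 3), round(coord_frac[k - 1], 3))
```

For every $k$ the PCA curve should sit on or above the coordinate curve, and both reach 1 at $k = d$.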
[Figure panels: Mean, $\lambda_1 = 3.4 \cdot 10^5$, $\lambda_2 = 2.8 \cdot 10^5$, $\lambda_3 = 2.4 \cdot 10^5$, $\lambda_4 = 1.6 \cdot 10^5$.]
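\[ \min_{M\in\mathbb{R}^{d\times d},\ \operatorname{rank}(M)=k} \|X - XM\|_F^2 = \min_{D\in\mathbb{R}^{k\times d},\ E\in\mathbb{R}^{d\times k}} \|X - XED\|_F^2 = \min_{D\in\mathbb{R}^{d\times k},\ D^\top D = I} \|X - XDD^\top\|_F^2 . \]
Every element of $\{DD^\top : D \in \mathbb{R}^{d\times k},\ D^\top D = I\}$ has rank $k$, so it remains to show
\[ \min_{D\in\mathbb{R}^{d\times k},\ D^\top D = I} \|X - XDD^\top\|_F^2 \le \min_{M\in\mathbb{R}^{d\times d},\ \operatorname{rank}(M)=k} \|X - XM\|_F^2 . \]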
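[...] $k$),
\[ \|X - XM\|_F^2 = \|X - XV_kV_k^\top + XV_kV_k^\top - XM\|_F^2 \]
\[ = \|X - XV_kV_k^\top\|_F^2 + 2\operatorname{tr}\big((X - XV_kV_k^\top)^\top (XV_kV_k^\top - XM)\big) + \|XV_kV_k^\top - XM\|_F^2 . \]
When the trace term is zero, this gives
\[ \|X - XM\|_F^2 = \|X - XV_kV_k^\top\|_F^2 + \|XV_kV_k^\top - XM\|_F^2 \ge \|X - XV_kV_k^\top\|_F^2 . \]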
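The trace term vanishes, using $I - V_kV_k^\top = \sum_{i=1}^{d} v_iv_i^\top - \sum_{i=1}^{k} v_iv_i^\top$ and $v_i^\top v_j = 0$ for $i > k \ge j$:
\[ \operatorname{tr}\big((X - XV_kV_k^\top)^\top (XV_kV_k^\top - XM)\big) = \cdots = 0 . \]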
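For $D \in \mathbb{R}^{d\times k}$ with $D^\top D = I$: $\|X - XDD^\top\|_F^2 = \|X\|_F^2 - \|XD\|_F^2$, and
\[ \min_{D\in\mathbb{R}^{d\times k},\ D^\top D = I} \|X - XDD^\top\|_F^2 = \|X\|_F^2 - \max_{D\in\mathbb{R}^{d\times k},\ D^\top D = I} \|XD\|_F^2, \]
\[ \operatorname*{arg\,min}_{D\in\mathbb{R}^{d\times k},\ D^\top D = I} \|X - XDD^\top\|_F^2 = \operatorname*{arg\,max}_{D\in\mathbb{R}^{d\times k},\ D^\top D = I} \|XD\|_F^2 . \]
Proof:
\[ \|XDD^\top\|_F^2 = \operatorname{tr}\big((XDD^\top)^\top (XDD^\top)\big) = \operatorname{tr}\big((XD)^\top (XDD^\top D)\big) = \operatorname{tr}\big((XD)^\top (XD)\big) = \|XD\|_F^2, \]
\[ \|X - XDD^\top\|_F^2 = \|X\|_F^2 - 2\operatorname{tr}\big((XDD^\top)^\top X\big) + \|XDD^\top\|_F^2 = \|X\|_F^2 - \|XD\|_F^2 . \]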
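The identity $\|X - XDD^\top\|_F^2 = \|X\|_F^2 - \|XD\|_F^2$ holds for any $D$ with orthonormal columns, not only for $V_k$. A minimal check, assuming NumPy and a random $D$ built by QR factorization:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((7, 5))          # hypothetical data, n=7, d=5

# Reduced QR of a random 5 x 2 matrix gives D with D^T D = I.
D, _ = np.linalg.qr(rng.standard_normal((5, 2)))

lhs = np.linalg.norm(X - X @ D @ D.T, "fro") ** 2
rhs = np.linalg.norm(X, "fro") ** 2 - np.linalg.norm(X @ D, "fro") ** 2
proj_norm = np.linalg.norm(X @ D @ D.T, "fro") ** 2   # ||X D D^T||_F^2

print(np.isclose(lhs, rhs), np.isclose(proj_norm, np.linalg.norm(X @ D, "fro") ** 2))
```

This is why minimizing reconstruction error and maximizing $\|XD\|_F^2$ select the same $D$.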
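\[ \max_{D\in\mathbb{R}^{d\times k},\ D^\top D = I} \|XD\|_F^2 = \|XV_k\|_F^2 = \sum_{i=1}^{k} s_i^2 . \]
Let $S_1 := \{D \in \mathbb{R}^{d\times k} : D^\top D = I\}$ and $S_2 := \{VD : D \in S_1\}$; since $V$ is orthogonal, $S_1 = S_2$. With $D_{ji}$ the entry of $D$ in row $j$ and column $i$,
\[ \max_{D\in S_1} \|XD\|_F^2 = \max_{M\in S_2} \|XM\|_F^2 = \max_{D\in S_1} \|XVD\|_F^2 = \max_{D\in S_1} \|USV^\top VD\|_F^2 \]
\[ = \max_{D\in S_1} \operatorname{tr}\big((USD)^\top (USD)\big) = \max_{D\in S_1} \operatorname{tr}\big(DD^\top S^\top S\big) = \max_{D\in S_1} \sum_{j=1}^{r} s_j^2 \sum_{i=1}^{k} D_{ji}^2 . \]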
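It remains to show
\[ \max_{D\in\mathbb{R}^{d\times k},\ D^\top D = I} \sum_{j=1}^{r} s_j^2 \sum_{i=1}^{k} D_{ji}^2 = \|XV_k\|_F^2, \]
where $\|XV_k\|_F^2 = \operatorname{tr}\big((U_kS_k)^\top (U_kS_k)\big) = \sum_{i=1}^{k} s_i^2$. Lastly: since $V_k^\top V_k = I$,
\[ \max_{D\in\mathbb{R}^{d\times k},\ D^\top D = I} \sum_{j=1}^{r} s_j^2 \sum_{i=1}^{k} D_{ji}^2 \ge \|XV_k\|_F^2 . \]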
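For the upper bound, extend $D$ to an orthogonal matrix $M \in \mathbb{R}^{d\times d}$ whose first $k$ columns are $D$; then for every row $j$, $\sum_{i=1}^{k} D_{ji}^2 \le \sum_{i=1}^{d} M_{ji}^2 = 1$. Moreover, $\sum_{i,j} D_{ji}^2 \le k$, so
\[ \max_{D\in\mathbb{R}^{d\times k},\ D^\top D = I} \sum_{j=1}^{r} s_j^2 \sum_{i=1}^{k} D_{ji}^2 \le \max_{w\in[0,1]^d,\ \sum_j w_j^2 \le k} \sum_{j=1}^{r} s_j^2 w_j^2 = \sum_{j=1}^{k} s_j^2 . \]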
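The maximality of $V_k$ can be probed empirically. The sketch below assumes NumPy and random matrices $D$ with orthonormal columns; it is a sanity check on sampled candidates and not a proof.

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal((12, 6))         # hypothetical data, n=12, d=6
k = 2

_, s, Vt = np.linalg.svd(X, full_matrices=False)
V_k = Vt[:k].T
best = np.linalg.norm(X @ V_k) ** 2      # ||X V_k||_F^2 (Frobenius is the default for matrices)

# Sample many random D with D^T D = I and record ||X D||_F^2.
trials = []
for _ in range(1000):
    D, _ = np.linalg.qr(rng.standard_normal((6, k)))
    trials.append(np.linalg.norm(X @ D) ** 2)

print(max(trials) <= best + 1e-9)
print(np.isclose(best, np.sum(s[:k] ** 2)))
```

Every sampled $D$ should capture at most $\sum_{i=1}^{k} s_i^2$, with equality attained at $D = V_k$.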