MACHINE LEARNING – 2013
Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA
Practicals: a practical session on computer takes place next week.
both zero mean):
1... 1... 1 1 1 1
i M i j j N N T N
N N j T T
[Figure: raw 2D dataset, and the same dataset projected onto the first two principal components.]
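The projection shown in this figure can be sketched in a few lines of NumPy. This is a minimal illustration, not the code used for the slide: the function name `pca`, the synthetic Gaussian dataset, and the choice of dividing the covariance by M are all assumptions made for the example.

```python
import numpy as np

def pca(X, q):
    """Project the rows of X (M datapoints, N dimensions) onto the first q principal components."""
    X_centered = X - X.mean(axis=0)                  # PCA assumes zero-mean data
    C = X_centered.T @ X_centered / X.shape[0]       # N x N covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)             # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:q]            # indices of the q largest eigenvalues
    A = eigvecs[:, order].T                          # q x N projection matrix (rows = eigenvectors)
    Y = X_centered @ A.T                             # M x q projected data
    return Y, A, eigvals[order]

# Hypothetical raw 2D dataset: a correlated Gaussian cloud.
rng = np.random.default_rng(0)
X = rng.multivariate_normal([1.0, -2.0], [[3.0, 1.5], [1.5, 1.0]], size=500)
Y, A, eigvals = pca(X, q=2)
```

After the projection the two coordinates of `Y` are uncorrelated, and their variances equal the eigenvalues of the covariance matrix.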
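Dataset: $X = \{x^i_j\}$, with $i = 1 \dots M$ datapoints and $j = 1, \dots, N$ dimensions.
PCA seeks a linear projection $A: \mathbb{R}^N \to \mathbb{R}^q$, $q \le N$, giving $y^i = A x^i$ for $i = 1 \dots M$.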
1
i i q
T i i
1 2 1 1 1 1
q N q q q q N
Projecting a set of images of two classes (two different persons) onto the first two principal components extracts features particular to each class, which can then be used for classification.
[Figure: two classes with 20 and 16 examples each; projection of the image datapoints onto the first and second PC; separating line.]
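The idea can be illustrated on synthetic data. The sketch below does not use real face images: the 64-pixel "images", the noise level, and the nearest-centroid rule used to draw a separating line are all assumptions chosen for the example, with only the class sizes (20 and 16) taken from the slide.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 64                                               # pixels per synthetic "image" (assumption)
mean_a, mean_b = rng.normal(size=N), rng.normal(size=N)
class_a = mean_a + 0.3 * rng.normal(size=(20, N))    # 20 examples of the first person
class_b = mean_b + 0.3 * rng.normal(size=(16, N))    # 16 examples of the second person
X = np.vstack([class_a, class_b])
labels = np.array([0] * 20 + [1] * 16)

# PCA via SVD of the centered data; rows of Vt are the principal directions.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Y = Xc @ Vt[:2].T                                    # 36 x 2: projection onto the first and second PC

# Separating line in the projected space: perpendicular bisector of the two class centroids.
c0 = Y[labels == 0].mean(axis=0)
c1 = Y[labels == 1].mean(axis=0)
normal = c1 - c0
offset = normal @ (c0 + c1) / 2
predicted = (Y @ normal > offset).astype(int)
accuracy = (predicted == labels).mean()
```

Any linear classifier could replace the nearest-centroid rule; the point is only that two projected coordinates are enough to separate the classes when the first principal components capture the between-class variation.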
Dataset: $X = \{x^i_j\}$, with $i = 1 \dots M$ and $j = 1 \dots N$.
Probabilistic PCA model: $x = Wz + \mu + \epsilon$, with latent variable $z \sim p(z) = \mathcal{N}(0, I)$ and noise $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$.
The noise covariance is diagonal, which implies conditional independence of the observables given the latent variables; $z$ encapsulates all correlations across the original dimensions.
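[Figure: the (x1, x2) plane with latent direction w and latent density p(z); for a latent value z1, the conditional p(x|z1) is centred at z1·|w| along w.]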
[Figure: same sketch in the (x1, x2) plane, with p(z), p(x|z1) at z1·|w| along w, and the marginal p(x) drawn as an ellipse.]
Open parameters; can be learned through maximum likelihood.
Axes of the ellipse correspond to the columns of W, i.e. to the eigenvectors of the covariance matrix $XX^T$.
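For reference, the maximum likelihood solution of probabilistic PCA has a well-known closed form (Tipping and Bishop): the columns of W are scaled leading eigenvectors of the sample covariance, and the noise variance is the mean of the discarded eigenvalues. The sketch below assumes the standard model x = Wz + mu + noise with isotropic noise variance sigma2 and sets the arbitrary rotation of W to the identity; the names `ppca_ml`, `W_true`, and the synthetic data are hypothetical, and the lecture notes may use a different notation.

```python
import numpy as np

def ppca_ml(X, q):
    """Closed-form ML estimate of (W, mu, sigma2) for PPCA. X has shape (M, N)."""
    mu = X.mean(axis=0)
    Xc = X - mu
    S = Xc.T @ Xc / X.shape[0]                       # sample covariance, N x N
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]                # sort eigenpairs in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[q:].mean()                      # average variance lost in the discarded directions
    W = eigvecs[:, :q] * np.sqrt(eigvals[:q] - sigma2)   # scaled eigenvectors (rotation set to identity)
    return W, mu, sigma2

# Synthetic data generated from a 1D latent variable embedded in 3D (assumed values).
rng = np.random.default_rng(2)
W_true = np.array([[2.0], [1.0], [0.5]])
Z = rng.normal(size=(2000, 1))
X = Z @ W_true.T + np.array([1.0, 0.0, -1.0]) + 0.1 * rng.normal(size=(2000, 3))

W, mu, sigma2 = ppca_ml(X, q=1)
model_cov = W @ W.T + sigma2 * np.eye(3)             # covariance of the marginal p(x)
```

The matrix `model_cov` is the covariance of the Gaussian marginal p(x); its principal axes are aligned with the columns of W, which matches the ellipse picture on the slide.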
1
2 1
T M
See lecture notes for values of B and + exercises for derivation
[Figure: posterior p(z|x) of the latent variable z for a datapoint x in the (x1, x2) plane, with the projection $ww^T(x - \bar{x})$ onto direction w and the marginal p(x) drawn as an ellipse.]
$p(z|x)$ is again Gaussian!
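A small numerical sketch of this Gaussian posterior, assuming the standard probabilistic PCA model x = Wz + mu + noise with z ~ N(0, I) and isotropic noise variance sigma2. Under that assumption the posterior mean is M⁻¹Wᵀ(x − mu) and the posterior covariance is sigma2·M⁻¹, with M = WᵀW + sigma2·I. The function name `ppca_posterior` and all numerical values are hypothetical.

```python
import numpy as np

def ppca_posterior(x, W, mu, sigma2):
    """Mean and covariance of the Gaussian posterior p(z|x) under the PPCA model."""
    q = W.shape[1]
    M_mat = W.T @ W + sigma2 * np.eye(q)             # q x q matrix M = W^T W + sigma2 I
    M_inv = np.linalg.inv(M_mat)
    mean = M_inv @ W.T @ (x - mu)                    # posterior mean of z
    cov = sigma2 * M_inv                             # posterior covariance of z
    return mean, cov

# Illustrative 2D example with a single latent dimension (assumed values).
W = np.array([[2.0], [1.0]])
mu = np.array([1.0, -1.0])
sigma2 = 0.25
x = np.array([3.0, 0.0])
z_mean, z_cov = ppca_posterior(x, W, mu, sigma2)
```

As sigma2 goes to zero the posterior mean tends to the orthogonal projection of x − mu onto the column space of W expressed in latent coordinates, which recovers ordinary PCA.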
1
T
1 1 2 2
q
T q
[Figure, repeated: latent density p(z), conditional p(x|z1) at z1·|w| along w, and marginal p(x) in the (x1, x2) plane.]
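[Figure: original space (x1, x2) with principal directions e1, e2, and its image in feature space H.]
The datapoints $x^i \in \mathbb{R}^N$, $i = 1 \dots M$, are mapped to $\phi(x^1), \dots, \phi(x^M)$ in feature space $H$.
Kernel PCA performs linear PCA in feature space.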
Kernel: $K_{ij} = k(x^i, x^j) = \langle \phi(x^i), \phi(x^j) \rangle$
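Examples of kernels:
Polynomial kernel of degree $p$, e.g. $k(x, x') = \langle x, x' \rangle^p$
Gaussian (RBF) kernel: $k(x, x') = \exp\left(-\frac{\|x - x'\|^2}{2\sigma^2}\right)$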
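A short sketch of how kernel functions are evaluated on a dataset to build the Gram matrix K. It assumes the slide refers to the homogeneous polynomial kernel of degree p and the Gaussian (RBF) kernel of width sigma; the function names and the three sample points are hypothetical.

```python
import numpy as np

def polynomial_kernel(X, Y, p=2):
    """Homogeneous polynomial kernel k(x, x') = <x, x'>^p for all pairs of rows."""
    return (X @ Y.T) ** p

def gaussian_kernel(X, Y, sigma=1.0):
    """Gaussian (RBF) kernel k(x, x') = exp(-||x - x'||^2 / (2 sigma^2)) for all pairs of rows."""
    sq_dist = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-np.maximum(sq_dist, 0.0) / (2 * sigma**2))   # clip tiny negative round-off

# Three illustrative 2D points.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K_poly = polynomial_kernel(X, X, p=2)                # 3 x 3 Gram matrix
K_rbf = gaussian_kernel(X, X, sigma=1.0)             # 3 x 3 Gram matrix
```

Both Gram matrices are symmetric and positive semi-definite, which is what allows them to stand in for inner products in feature space.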
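Assume the data are centred in feature space: $\sum_{i=1}^{M} \phi(x^i) = 0$.
Covariance matrix in feature space: $C_\phi = \frac{1}{M} \sum_{j=1}^{M} \phi(x^j)\, \phi(x^j)^T$.
Eigenvalue problem: $C_\phi v^i = \lambda_i v^i$.
Substituting $C_\phi$: $v^i = \frac{1}{M \lambda_i} \sum_{j=1}^{M} \phi(x^j)\, \phi(x^j)^T v^i = \sum_{j=1}^{M} \alpha^i_j\, \phi(x^j)$, where $\alpha^i_j = \frac{1}{M \lambda_i} \phi(x^j)^T v^i$ is a scalar.
Multiplying the eigenvalue problem on the left by $\phi(x^l)^T$ and substituting the expansion of $v^i$: $\frac{1}{M} \sum_{j=1}^{M} \sum_{n=1}^{M} \alpha^i_n \langle \phi(x^l), \phi(x^j) \rangle \langle \phi(x^j), \phi(x^n) \rangle = \lambda_i \sum_{j=1}^{M} \alpha^i_j \langle \phi(x^l), \phi(x^j) \rangle$, for $l = 1 \dots M$.
Kernel trick: $K_{ij} = k(x^i, x^j) = \langle \phi(x^i), \phi(x^j) \rangle$, so the problem becomes $K^2 \alpha^i = M \lambda_i K \alpha^i$, which is solved by the eigenvectors of $K \alpha^i = M \lambda_i \alpha^i$.
Normalization: $\langle v^i, v^i \rangle = 1$ requires $M \lambda_i \langle \alpha^i, \alpha^i \rangle = 1$.
Projection of a query point $x$ onto eigenvector $v^i$ (sum over all training points): $\langle v^i, \phi(x) \rangle = \sum_{j=1}^{M} \alpha^i_j\, k(x^j, x)$.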
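The kernel PCA procedure can be sketched as follows: build the Gram matrix, centre it, solve the eigenvalue problem for the coefficient vectors, normalise them, and project points as a kernel-weighted sum over all training points. This is a minimal illustration under assumptions: a Gaussian kernel of width `sigma`, explicit centring of the Gram matrix in place of the centred-data assumption, and a synthetic two-ring dataset. The function name `kernel_pca` is hypothetical.

```python
import numpy as np

def kernel_pca(X, q, sigma=1.0):
    """Kernel PCA with a Gaussian kernel. Returns projections of the training points,
    the coefficient vectors alpha, and the q largest eigenvalues of the centred Gram matrix."""
    M = X.shape[0]
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-sq_dist / (2 * sigma**2))            # Gram matrix K_ij = k(x^i, x^j)
    one = np.full((M, M), 1.0 / M)
    Kc = K - one @ K - K @ one + one @ K @ one       # centre the data in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)            # ascending order
    order = np.argsort(eigvals)[::-1][:q]            # keep the q largest eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    alphas = eigvecs / np.sqrt(eigvals)              # scale so that alpha^T Kc alpha = 1
    projections = Kc @ alphas                        # sum over all training points
    return projections, alphas, eigvals

# Two concentric noisy rings: not linearly separable in the original space.
rng = np.random.default_rng(3)
angles = rng.uniform(0, 2 * np.pi, size=100)
radii = np.where(np.arange(100) < 50, 1.0, 3.0)
X = np.c_[radii * np.cos(angles), radii * np.sin(angles)] + 0.05 * rng.normal(size=(100, 2))
Y, alphas, eigvals = kernel_pca(X, q=2, sigma=1.0)
```

Here the eigenvalues returned by `eigh` are those of the centred Gram matrix, so dividing each unit eigenvector by the square root of its eigenvalue gives feature-space eigenvectors of unit norm.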
[Figure: feature space H. From Schölkopf & Smola, 2002.]
From Scholkopf & Smola, 2002
From Scholkopf & Smola, 2002
[Figure annotation: points cluster here.]
1. Report on coding project (10 pages maximum, 10pt minimum, single column; code from mini-project must be submitted together with the report).
2. Reports on lit. survey: if you do the lit. survey as a team of two, the maximum length of the survey should be 20 pages (10pt minimum, single column).