CS 559: Machine Learning Fundamentals and Applications 5th Set of Notes
Instructor: Philippos Mordohai
Webpage: www.cs.stevens.edu/~mordohai
E-mail: Philippos.Mordohai@stevens.edu
Office: Lieb 215
– UCI ML repository: http://archive.ics.uci.edu/ml/
– Google public data: http://www.google.com/publicdata/directory
– dmoz: www.dmoz.org/Computers/Artificial_Intelligence/Machine_Learning/Datasets/
– Netflix Challenge: http://www.cs.uic.edu/~liub/Netflix-KDD-Cup-2007.html
– Kaggle: https://www.kaggle.com/competitions and https://www.kaggle.com/datasets
– Enron email dataset: http://www.cs.cmu.edu/~enron/
– Web page classification: http://www-2.cs.cmu.edu/~webkb/
– Stanford OCR dataset: http://ai.stanford.edu/~btaskar/ocr/
– MNIST digits dataset: http://yann.lecun.com/exdb/mnist/
– Caltech 101: http://www.vision.caltech.edu/Image_Datasets/Caltech101/
– Caltech 256: http://www.vision.caltech.edu/Image_Datasets/Caltech256/
– MIT LabelMe: http://labelme.csail.mit.edu/
– PASCAL Visual Object Classes: http://pascallin.ecs.soton.ac.uk/challenges/VOC/
– Oxford buildings: http://www.robots.ox.ac.uk/~vgg/data/oxbuildings/index.html
– ETH Computer Vision datasets: http://www.vision.ee.ethz.ch/datasets/
– ImageNet: http://www.image-net.org/
– Scene classification: http://lsun.cs.princeton.edu/2016/
– Yale face database: http://cvc.yale.edu/projects/yalefaces/yalefaces.html
– Labeled Faces in the Wild: http://vis-www.cs.umass.edu/lfw/ (see also http://vis-www.cs.umass.edu/fddb/)
– BioID with labeled facial features: https://www.bioid.com/About/BioID-Face-Database
– https://www.facedetection.com/datasets/
– University of Washington: http://rgbd-dataset.cs.washington.edu/
– Cornell: http://pr.cs.cornell.edu/sceneunderstanding/data/data.php
– NYU: http://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html
– Princeton: http://rgbd.cs.princeton.edu/
– Camera optical axes are not orthogonal to each other
– Determine that the dynamics are along a single axis
– Determine the important axis
– Reflects the method which gathered the data
– Restricts the set of potential bases
– Implicitly assumes continuity in the data (superposition and interpolation are possible)
– Geometrically, it is a rotation and stretch
– The rows of P, {p_1, …, p_m}, are the new basis vectors for the columns of X
– Each element of y_i is a dot product of x_i with the corresponding row of P (a projection of x_i onto p_j)
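The change of basis Y = PX described above can be sketched with NumPy. This is a minimal illustration only; the data and the rotation matrix are invented for the example:

```python
import numpy as np

# Columns of X are the original data vectors x_i.
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 5))

# Rows of P are the new orthonormal basis vectors p_j
# (here a 45-degree rotation, chosen only for illustration).
theta = np.pi / 4
P = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

Y = P @ X  # change of basis: each column y_i = P x_i

# Each element of y_i is the dot product of x_i with a row of P.
assert np.allclose(Y[0, 0], P[0] @ X[:, 0])
assert np.allclose(P @ P.T, np.eye(2))  # P is orthonormal (rotation)
```

Because P is orthonormal, the transform rotates the data without distorting distances between points.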
– Noise
– Rotation
– Redundancy
– Subtract mean from all vectors in X
– Covariance between two measurement types
– Variance terms (diagonal): large → interesting dynamics; small → noise
– Covariance terms (off-diagonal): large → high redundancy; small → low redundancy
– Goal: the off-diagonal elements should be zero
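A minimal NumPy sketch of these ideas, with toy data invented for illustration: the diagonal of the covariance matrix holds the variances, while the off-diagonal entries expose redundancy between measurement types:

```python
import numpy as np

# Toy data: m samples (rows), n = 3 measurement types (columns).
# Types 0 and 1 carry nearly the same signal (redundant); type 2 is noise.
rng = np.random.default_rng(1)
t = rng.normal(size=200)
X = np.column_stack([t,
                     2.0 * t + 0.05 * rng.normal(size=200),
                     rng.normal(size=200)])

Xc = X - X.mean(axis=0)          # subtract the mean from all vectors
C = Xc.T @ Xc / (len(X) - 1)     # n x n covariance matrix

# Diagonal entries = variances; off-diagonal entries = covariances.
# Redundant pair (0, 1) -> large covariance; independent pair (0, 2) -> near 0.
assert abs(C[0, 1]) > abs(C[0, 2])
```

PCA seeks the rotated basis in which C becomes diagonal, i.e. in which this redundancy disappears.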
The eigenvectors of a matrix A form a basis for working with A. However, for a rectangular matrix A (m x n), dim(Ax) ≠ dim(x) and the concept of eigenvectors does not exist. Yet A^T A (n x n) is a symmetric, real matrix (A is real) and therefore there is an orthonormal basis of eigenvectors {u_k} for A^T A. Consider the vectors v_k = A u_k / √λ_k. They are also orthonormal, since:

v_j^T v_k = (A u_j)^T (A u_k) / √(λ_j λ_k) = u_j^T (A^T A u_k) / √(λ_j λ_k) = (λ_k / √(λ_j λ_k)) u_j^T u_k = δ_jk
Note: here each row of A is a measurement in time and each column a measurement type
Since A^T A is positive semidefinite, its eigenvalues are non-negative: λ_k ≥ 0. Define the singular values of A as σ_k = √λ_k, 1 ≤ k ≤ n, and order them in non-increasing order: σ_1 ≥ σ_2 ≥ … ≥ σ_n.

Motivation: one can see that if A itself is square and symmetric, then {u_k, σ_k} are the set of its own eigenvectors and eigenvalues. For a general matrix A, assume σ_1 ≥ σ_2 ≥ … ≥ σ_r > 0 = σ_{r+1} = σ_{r+2} = … = σ_n, and set v_k = A u_k / σ_k for 1 ≤ k ≤ r (the v_k are vectors in R^m).
Now we can write:

A U = V Σ, i.e. A [u_1 … u_n] = [v_1 … v_r v_{r+1} … v_m] Σ

where U (n x n) has the u_k as columns, V (m x m) has the v_k as columns (completed to an orthonormal basis of R^m for k > r), and Σ (m x n) is diagonal with σ_1, …, σ_r on its diagonal and zeros elsewhere. Since U is orthonormal, multiplying on the right by U^T gives the singular value decomposition:

A = V Σ U^T
Example:

A = [ 2  2
     −1  1 ]

A^T A = [ 5  3
          3  5 ]

Characteristic polynomial: λ² − 10λ + 16 = 0, so λ = (10 ± √(100 − 64)) / 2, giving λ_1 = 8, λ_2 = 2.

Singular values: σ_1 = √8 = 2√2, σ_2 = √2.

Eigenvectors of A^T A:

u_1 = (1/√2) (1, 1)^T ;  u_2 = (1/√2) (−1, 1)^T

v_1 = A u_1 / σ_1 = (1/(2√2)) (1/√2) (4, 0)^T = (1, 0)^T

v_2 = A u_2 / σ_2 = (1/√2) (1/√2) (0, 2)^T = (0, 1)^T

Hence:

A = V Σ U^T = [ 1  0 ] [ 2√2  0  ] (1/√2) [ 1  1 ]
              [ 0  1 ] [  0  √2 ]         [−1  1 ]^T
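The example can be checked numerically with NumPy (assuming the 2 x 2 matrix A = [2 2; −1 1] of the worked example):

```python
import numpy as np

A = np.array([[2.0, 2.0],
              [-1.0, 1.0]])

# Eigenvalues of A^T A are the squared singular values.
lam = np.linalg.eigvalsh(A.T @ A)       # ascending order: [2, 8]
sigma = np.sqrt(lam[::-1])              # sigma_1 = 2*sqrt(2), sigma_2 = sqrt(2)

# Cross-check against NumPy's SVD (NumPy factors A = U_np @ diag(S) @ Vh).
U_np, S, Vh = np.linalg.svd(A)
assert np.allclose(S, sigma)
assert np.allclose(U_np @ np.diag(S) @ Vh, A)
```

Note that NumPy's naming convention is transposed relative to the notes: its `U_np` plays the role of V above, and its `Vh` is U^T.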
x^i = (x_1^i, …, x_n^i), i = 1…m
C ∝ X_c^T X_c, where X_c is the mean-subtracted data matrix
Note: Covariance matrix is n×n (measurement types) (But there may be exceptions)
– Must either search (e.g., nearest neighbors) or store large probability density functions.
– Idea: fit a line; the classifier measures the distance to the line
Convert x into v_1, v_2 coordinates.
What does the v_2 coordinate measure?
What does the v_1 coordinate measure?
– We can represent the orange points with only their v1 coordinates
– This makes it much cheaper to store and compare points – A bigger deal for higher dimensional problems
Consider the variation along a direction v among all of the orange points:
– What unit vector v minimizes var?
– What unit vector v maximizes var?
Solution: v_1 is the eigenvector of A with the largest eigenvalue; v_2 is the eigenvector of A with the smallest eigenvalue.
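A small NumPy sketch of this variance-extremization view, using a synthetic elongated point cloud (the data is invented for illustration):

```python
import numpy as np

# Synthetic cloud of points, stretched along one direction.
rng = np.random.default_rng(2)
pts = rng.normal(size=(500, 2)) * np.array([3.0, 0.3])

centered = pts - pts.mean(axis=0)
A = centered.T @ centered / (len(pts) - 1)   # 2 x 2 covariance matrix

# Variance along a unit vector v is v^T A v (a Rayleigh quotient),
# so it is extremized by the eigenvectors of A.
evals, evecs = np.linalg.eigh(A)             # eigenvalues in ascending order
v_min, v_max = evecs[:, 0], evecs[:, 1]

assert v_max @ A @ v_max >= v_min @ A @ v_min
# No particular unit vector beats the top eigenvector's variance:
u = np.array([1.0, 0.0])
assert v_max @ A @ v_max >= u @ A @ u - 1e-12
```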
– Same procedure applies: the eigenvectors of A define a new coordinate system for the training vectors x
– We can compress the data by only using the top few eigenvectors
– This corresponds to representing points on a line, plane, or “hyper-plane”
– These top eigenvectors are known as the principal components
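A minimal compression sketch along these lines, with synthetic data that is nearly two-dimensional (all names and numbers are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
# 100 points in 5-D that lie (up to small noise) on a 2-D plane.
plane = np.linalg.qr(rng.normal(size=(5, 2)))[0]       # orthonormal 5 x 2
coords = rng.normal(size=(100, 2)) * np.array([5.0, 2.0])
X = coords @ plane.T + 0.01 * rng.normal(size=(100, 5))

mean = X.mean(axis=0)
Xc = X - mean
evals, evecs = np.linalg.eigh(Xc.T @ Xc)
top2 = evecs[:, -2:]                     # top 2 principal components

# Compress: store 2 coefficients per point instead of 5 coordinates.
weights = Xc @ top2
X_approx = mean + weights @ top2.T

# Near-perfect reconstruction, because the data is almost 2-D.
rel_err = np.linalg.norm(X - X_approx) / np.linalg.norm(X)
assert rel_err < 0.05
```

The savings grow with dimensionality: for images, a handful of coefficients can replace thousands of pixel values.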
The set of faces is a “subspace” of the set of images
– Suppose it is K-dimensional
– We can find the best subspace using PCA
– This is like fitting a “hyper-plane” to the set of faces, spanned by vectors v_1, v_2, …, v_K
– Gives a set of vectors v_1, v_2, v_3, …
– Each one of these vectors is a direction in face space
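A rough sketch with stand-in data (random vectors in place of real face images; the sizes are arbitrary assumptions) showing that each eigenvector has the same shape as a flattened image, i.e. is a direction in face space:

```python
import numpy as np

rng = np.random.default_rng(4)
# Stand-in "faces": 50 images of 8 x 8 pixels, flattened to 64-vectors.
faces = rng.normal(size=(50, 64))

mean_face = faces.mean(axis=0)
centered = faces - mean_face

# PCA on the faces: each eigenvector is itself a 64-vector, i.e. an
# image-shaped direction in face space (an "eigenface").
evals, evecs = np.linalg.eigh(centered.T @ centered)
eigenfaces = evecs[:, ::-1].T            # rows v_1, v_2, ... by eigenvalue

assert eigenfaces[0].shape == (64,)      # same shape as a flattened face
assert np.isclose(np.linalg.norm(eigenfaces[0]), 1.0)
```

Reshaping a row of `eigenfaces` back to 8 x 8 would display it as a ghostly face-like image, which is where the name comes from.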
That is, distance in eigenspace is approximately equal to the distance between the original images:

‖w_1 − w_2‖ ≈ ‖x_1 − x_2‖

where w_i denotes the vector of eigenspace coefficients of image x_i.
– Dark: small distance
– Bright: large distance
– Project on eigenfaces and compute weights
– Take a weighted sum of eigenfaces to synthesize a face image
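These two steps can be sketched as follows, again with stand-in random “faces” and an assumed K = 10 eigenfaces:

```python
import numpy as np

rng = np.random.default_rng(5)
faces = rng.normal(size=(40, 64))        # stand-in flattened face images
mean_face = faces.mean(axis=0)
centered = faces - mean_face

evals, evecs = np.linalg.eigh(centered.T @ centered)
V = evecs[:, ::-1][:, :10]               # columns: top K = 10 eigenfaces

x = faces[0]
w = V.T @ (x - mean_face)                # 1) project -> K weights
x_hat = mean_face + V @ w                # 2) weighted sum of eigenfaces

# With all 64 eigenvectors the reconstruction would be exact;
# with K = 10 it is only an approximation of the input face.
assert w.shape == (10,)
```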