 
APPLIED MACHINE LEARNING
Methods for Reduction of Dimensionality through Linear Projection: Principal Component Analysis (PCA)

Curse of Dimensionality
[Figure: computational cost as a function of N for O(N^2) and O(N) growth; N: number of dimensions.]
A linear increase in cost is much preferred over a polynomial or exponential increase.

Curse of Dimensionality
The computational costs of several methods for classification and regression grow polynomially (e.g. O(N^2)) or even exponentially with the dimension N of the data.
[Figure: computational cost as a function of N for O(N^2) and O(N) growth; N: number of dimensions.]
When the increase is exponential or polynomial → reduce the dimensionality of the data prior to further processing.

Principal Component Analysis (PCA)
PCA is a method to reduce the dimensionality of a dataset. It does so by projecting the dataset onto a lower-dimensional space.

Examples: PCA – dimensionality reduction
Record human motion when writing the letters A, B and C.
[Figure: joint angle trajectories x_1, x_2, x_3, x_4 plotted against time.]
The joint angle trajectories convey redundant information → reduce the information with PCA.

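Examples: PCA – dimensionality reduction
4-dimensional state space: $x \in \mathbb{R}^4$.
Project onto a 2-dimensional space through the matrix $A \in \mathbb{R}^{2 \times 4}$:
$$y = A x, \qquad y \in \mathbb{R}^2$$
[Figure: the four trajectories x_1, ..., x_4 over time are mapped to the two trajectories y_1, y_2 over time.]
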
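The projection y = Ax can be sketched numerically. The example below is an illustration under stated assumptions, not the lecture's own code: the four "joint angle" trajectories are synthetic (two hypothetical latent signals mixed into four dimensions), and A is obtained from a singular value decomposition of the centered data, which is one standard way to compute principal directions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for recorded joint angles: 4 trajectories over M time steps,
# built from only 2 underlying signals, so the 4 dimensions are redundant.
M = 200
t = np.linspace(0.0, 1.0, M)
latent = np.vstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])  # 2 x M
mixing = rng.normal(size=(4, 2))                                    # hypothetical mixing
X = mixing @ latent                                                 # 4 x M

# Center the data, then take the two leading left singular vectors as rows of A.
X_centered = X - X.mean(axis=1, keepdims=True)
U, s, _ = np.linalg.svd(X_centered, full_matrices=False)
A = U[:, :2].T                                                      # 2 x 4

# Project to 2-D and map back to 4-D to check how much was lost.
Y = A @ X_centered                                                  # 2 x M
X_back = A.T @ Y                                                    # 4 x M
max_error = np.abs(X_centered - X_back).max()
print(Y.shape, max_error)
```

Because the synthetic data has only two independent directions, the 2-dimensional projection loses essentially nothing; real recordings would leave a nonzero residual.
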
Examples: PCA – dimensionality reduction
[Figure: projected trajectories y_1 and y_2.]
Rotate the trajectories y onto the plane where the robot writes. Use inverse kinematics to drive the robot's motion.

PCA: Exercise 1.1
Reducing dimensionality of dataset
[Figure: scatter plot "Original Data".]
Dataset:
$$X = \begin{bmatrix} 1 & -2 & -3 & 1.5 \\ 2 & -4 & -6 & 3 \end{bmatrix}$$
Can you find a way to reduce the amount of information needed to store the coordinates of these 4 datapoints?

PCA: Exercise 1.2
Reducing dimensionality of dataset
[Figure: scatter plot "Original Data".]
Dataset:
$$X = \begin{bmatrix} 1.1 & -2 & -2.9 & 1.5 \\ 2 & -4 & -6.2 & 3 \end{bmatrix}$$
If you use the same solution as before, how much error do you get?

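One possible reading of "the same solution" is to store only the first coordinate of each datapoint and rebuild the second as twice the first, since that relation holds exactly in Exercise 1.1. The sketch below measures the error this rule makes on the perturbed dataset; the rule and the variable names are assumptions made for illustration.

```python
import numpy as np

# Dataset from Exercise 1.2: one datapoint per column.
X = np.array([[1.1, -2.0, -2.9, 1.5],
              [2.0, -4.0, -6.2, 3.0]])

# Assumed compression rule carried over from Exercise 1.1:
# keep only the first coordinate and rebuild the second as 2 * first.
stored = X[0]
X_rebuilt = np.vstack([stored, 2.0 * stored])

# Absolute error on the rebuilt coordinate of each datapoint, and total squared error.
residual = X - X_rebuilt
per_point_error = np.abs(residual[1])
total_squared_error = float(np.sum(residual ** 2))
print(per_point_error, total_squared_error)
```

Applied to the exact dataset of Exercise 1.1, the same rule reproduces every datapoint with zero error.
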
PCA: Exercise 1.3
Reducing dimensionality of dataset
[Figure: scatter plot "Original Data".]
Dataset:
$$X = \begin{bmatrix} 1.1 & -2 & -2.9 & 1.5 \\ 2 & -4 & -6.2 & 3 \end{bmatrix}$$
If we project each datapoint $x \in \mathbb{R}^2$ onto a one-dimensional space along a vector $a$, which of these choices of $a$ minimizes the reconstruction error?
$a = [0 \; 1]^T$, $a = [1 \; 0]^T$, $a = [1 \; 2]^T$

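The three candidates can be compared numerically. The sketch below assumes that "reconstruction error" means the summed squared distance between each datapoint and its orthogonal projection onto the line spanned by a, and that the data is projected as given, without centering; the function name `reconstruction_error` is illustrative.

```python
import numpy as np

# Dataset from Exercise 1.3: one datapoint per column.
X = np.array([[1.1, -2.0, -2.9, 1.5],
              [2.0, -4.0, -6.2, 3.0]])

candidates = {
    "[0 1]": np.array([0.0, 1.0]),
    "[1 0]": np.array([1.0, 0.0]),
    "[1 2]": np.array([1.0, 2.0]),
}

def reconstruction_error(a, X):
    """Summed squared error after projecting onto span(a) and mapping back."""
    a = a / np.linalg.norm(a)      # use a unit-length direction
    y = a @ X                      # 1-D coordinate of each datapoint
    X_rebuilt = np.outer(a, y)     # back in the original 2-D space
    return float(np.sum((X - X_rebuilt) ** 2))

errors = {name: reconstruction_error(a, X) for name, a in candidates.items()}
best = min(errors, key=errors.get)
print(errors, best)
```

Under these assumptions the direction [1 2] gives by far the smallest error, because the datapoints lie almost exactly on the line x_2 = 2 x_1.
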
PCA: Reduction of dimensionality
There is an infinite number of choices for projecting the data → need criteria to reduce the choice:
1: minimum information loss (minimal reconstruction error)
[Figure: 2D dataset in the (x_1, x_2) plane.]
What is the 2D to 1D projection that minimizes the reconstruction error?

PCA: Reduction of dimensionality
There is an infinite number of choices for projecting the data → need criteria to reduce the choice:
1: minimum information loss (minimal reconstruction error)
2: equivalent to finding the direction with maximum variance
[Figure: 2D dataset in the (x_1, x_2) plane and its reconstruction after projection; the largest breadth of the data is conserved, the smallest breadth of the data is lost.]
What is the 2D to 1D projection that minimizes the reconstruction error?

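Why the two criteria coincide can be seen with a short derivation. The sketch below assumes zero-mean datapoints $x^i$, $i = 1 \dots M$, and a unit-norm direction $a$; both assumptions are added here for the argument and are not stated on the slide.

```latex
\begin{align}
\|x^i\|^2 &= (a^T x^i)^2 + \|x^i - a\,a^T x^i\|^2, \qquad \|a\| = 1 \\
\sum_{i=1}^{M} \|x^i - a\,a^T x^i\|^2 &= \sum_{i=1}^{M} \|x^i\|^2 - \sum_{i=1}^{M} (a^T x^i)^2
\end{align}
```

The first sum on the right does not depend on $a$, so minimizing the reconstruction error on the left is the same as maximizing $\sum_i (a^T x^i)^2$, which for zero-mean data is $M$ times the variance of the projected data.
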
Principal Component Analysis
PCA is a method to reduce the dimensionality of a dataset. It is used as:
• Pre-processing method before classification to reduce computational costs of the classifier.
• Compression method for ease of data storage and retrieval.
• Feature extraction method.

Examples: PCA – preprocessing for classification
Dataset with samples of two classes (red and green class).
Each image is a high-dimensional vector $x \in \mathbb{R}^{230400}$, with $230400 = 320 \times 240 \times 3$.

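Examples: PCA – preprocessing for classification
Project the images onto a lower-dimensional space $y \in \mathbb{R}^2$ through the matrix $A \in \mathbb{R}^{2 \times 230400}$:
$$y = A x$$
[Figure: projected images in the 2-dimensional space, with a separating line between the two classes.]
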
PCA: Exercise 2
[Figure: scatter plot "Original Data" and its projection.]
Dataset:
$$X = \begin{bmatrix} 1 & -2 & -3 & 1.5 \\ 2 & -4 & -6 & 3 \end{bmatrix}$$
Linear classification:
- Separate the two groups of datapoints with a line.
- Which projections make separation infeasible?
PCA does not seek projections that make data more separable! However, among the projections, some may make data more separable.

Examples: PCA for Feature Extraction
Extract features and reduce dimensionality: 50 principal components extracted from a set of 100 faces, originally coded in a high-dimensional pixel space (e.g. 54 150 dimensions).
[Figure: first 4 projections (principal components).]
Hancock, P. et al. (1996). Face processing: human perception and principal components analysis. Memory and Cognition, 24(1), pp. 26-40.

Examples: PCA for Feature Extraction
[Figure: examples of six facial expressions (happy, sad, anger, fear, disgust and surprise) in their original format (full-image, top row) and morphed to average face shape (shape-free, bottom row).]
Calder et al. (2001). A principal component analysis of facial expressions. Vision Research, 41(9), pp. 1179-1208.

Examples: PCA for Feature Extraction
[Figure: the first eight eigenfaces abstracted from a PCA of facial expressions.]
Calder et al. (2001). A principal component analysis of facial expressions. Vision Research, 41(9), pp. 1179-1208.

PCA: Exercise 3
To extract features common among groups and subgroups of images.
Which projection would extract one common feature across the 3 images?
[Figure: three images, each written as a column of pixel values; the first three rows of pixel values read (12, 24, 6), (2, 4, 1) and (14, 28, 7).]

PCA: Exercise 3
To extract features common among groups and subgroups of images.
Which projection would extract some common feature across the 3 images?
The first 3 dimensions can be compressed into 1.
[Figure: the same three images, labelled x^1, x^2 and x^3, each written as a column of pixel values.]

PCA ~ Feature Extraction
To extract features common among groups and subgroups of images.
A first step toward clustering and classification. PCA is not classification!
This projection embeds a feature: the regions of the eyes are darker than the middle region.
$$a = [6 \;\; 1 \;\; 7 \;\; \dots]^T$$
[Figure: the three images x^1, x^2, x^3 and the projection vector a.]

Constructing a projection
Formally, a projection can be described as follows:
Let $X = [x^1 \dots x^M]$ be a set of $M$ $N$-dimensional datapoints, $x^i \in \mathbb{R}^N$, $i = 1 \dots M$.
A projection $Y$ of $X$ through a linear map $A: x \in \mathbb{R}^N \to y \in \mathbb{R}^p$, $p \le N$, is given by:
$$Y = A X$$
with $X: N \times M$, $A: p \times N$ and $Y: p \times M$.

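The dimension bookkeeping in this definition can be checked with a few lines of code. The sketch below is illustrative only: the helper `project` and the sizes N = 5, M = 8, p = 2 are hypothetical, and the matrices are filled with random numbers just to exercise the shapes.

```python
import numpy as np

def project(A, X):
    """Return Y = A X after checking that A is p x N, X is N x M and p <= N."""
    p, N = A.shape
    if X.shape[0] != N:
        raise ValueError(f"A expects {N}-dimensional datapoints, got {X.shape[0]}")
    if p > N:
        raise ValueError("p must not exceed N for a dimensionality reduction")
    return A @ X

# Hypothetical sizes: M = 8 datapoints of dimension N = 5, projected to p = 2.
N, M, p = 5, 8, 2
rng = np.random.default_rng(1)
X = rng.normal(size=(N, M))   # one datapoint per column
A = rng.normal(size=(p, N))   # any p x N matrix defines a linear projection
Y = project(A, X)
print(Y.shape)  # (2, 8)
```

Storing one datapoint per column keeps the code in line with the convention above, where Y = AX has one projected datapoint per column.
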
Constructing a projection: Exercise 4
Example: 2-dimensional projection through a matrix $A$.
$$X = \begin{bmatrix} 0 & 15 & 0 & 15 & 0 & 15 & 0 & 15 \\ 0 & 0 & 15 & 15 & 0 & 0 & 15 & 15 \\ 0 & 0 & 0 & 0 & 15 & 15 & 15 & 15 \end{bmatrix}$$
Find a matrix $A$ which groups the points into 4 groups.
[Figure: 3D scatter plot "Original data X" with axes x1, x2 and x3, each ranging from 0 to 15.]

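One candidate answer can be verified numerically. The sketch below assumes the choice of A that simply drops the third coordinate; it is one valid solution among several (dropping x1 or x2 instead works equally well), not necessarily the one intended in the lecture.

```python
import numpy as np

# The 8 corners of a cube with side 15, one datapoint per column.
X = np.array([[0, 15, 0, 15, 0, 15, 0, 15],
              [0, 0, 15, 15, 0, 0, 15, 15],
              [0, 0, 0, 0, 15, 15, 15, 15]])

# Assumed candidate: keep x1 and x2, discard x3.
A = np.array([[1, 0, 0],
              [0, 1, 0]])

Y = A @ X                        # 2 x 8 projected datapoints
groups = np.unique(Y, axis=1)    # distinct projected locations, one per column
print(groups.shape[1])
```

Each pair of corners that differ only in x3 lands on the same 2-dimensional point, so the 8 datapoints collapse onto 4 distinct locations.
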