

SLIDE 1

Kernel-Based Dimensionality Reduction Methods on Synthesized and Facial Image Data

Jonathan L. Fabish

Statistical Data Mining and Visualization REU, UNCW

July 25, 2017

SLIDE 2

Outline

High Dimensionality: Introduction
Dimensionality Reduction Methods: Principal Component Analysis, Kernel Principal Component Analysis, Supervised Kernel Principal Component Analysis, Fisher Discriminant Analysis
Application to Simulated Data: Simulations
Application to Morph-II: Morph-II, Gender Classification
Wrap-up: Conclusions

SLIDE 3

High Dimensionality and Non-Linearity

◮ High dimensionality, which can hinder the use of traditional classification and regression algorithms, is associated with having a large number of often correlated predictors and a small sample size.

◮ Dimensionality reduction methods reduce noise, extract the main features of a dataset, and decrease computational cost.

◮ Since classes in a real dataset are rarely linearly separable, non-linear dimensionality reduction methods that employ the kernel trick, such as Kernel Fisher Discriminant Analysis (KFDA), Kernel Principal Component Analysis (KPCA), and Supervised Kernel Principal Component Analysis (SKPCA), are often effective.

SLIDE 4

PCA

◮ In linear PCA, the directions of highest variation in a dataset are identified by computing the eigenvectors which correspond to the largest eigenvalues of the covariance matrix of the centered data, $C = \frac{1}{M} \sum_{j=1}^{M} x_j x_j^T$.

◮ This is accomplished by solving the eigenvalue equation $\lambda v = C v$.

◮ The PCs are obtained from the eigenvectors corresponding to the largest eigenvalues.
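As a concrete illustration (not part of the original slides), here is a minimal R sketch of linear PCA via the eigendecomposition above; the toy data matrix and the choice of two components are arbitrary placeholders.

# Minimal linear PCA sketch: eigendecompose the covariance matrix of centered data.
set.seed(1)
X <- matrix(rnorm(100 * 5), nrow = 100)          # toy data: 100 observations, 5 features

Xc  <- scale(X, center = TRUE, scale = FALSE)    # center the data
C   <- crossprod(Xc) / nrow(Xc)                  # C = (1/M) sum_j x_j x_j^T
eig <- eigen(C, symmetric = TRUE)                # solves lambda v = C v; eigenvalues in decreasing order

V      <- eig$vectors[, 1:2]                     # eigenvectors for the two largest eigenvalues
scores <- Xc %*% V                               # projections of the data onto the first two PCs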

SLIDE 5

PCA visualized in 3-dimensions.

SLIDE 6

Kernel Principal Component Analysis

◮ The kernel trick is widely used to facilitate the task of classification on a non-linearly separable dataset.

◮ Project the data to a subspace of arbitrarily high dimensionality using an implicit non-linear map $\Phi$, which maps the elements of $\mathbb{R}^n$ to a high-dimensional space $F$, to explore non-linear relationships among the features.

◮ This is accomplished by replacing all occurrences of the inner product $x^T x'$ in the PCA algorithm with a kernel, in our case the Gaussian kernel, $K(x, x') = \Phi(x)^T \Phi(x') = \exp\left(-\gamma \|x - x'\|^2\right)$.
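The following short R sketch (an illustration, not code from the presentation) computes the Gaussian kernel matrix for a data matrix X; the bandwidth gamma is a placeholder tuning parameter, not a value used in the study.

# Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)
gaussian_kernel <- function(X, gamma = 0.1) {
  sq_norms <- rowSums(X^2)                                   # ||x_i||^2 for each row
  D2 <- outer(sq_norms, sq_norms, "+") - 2 * tcrossprod(X)   # squared pairwise distances
  exp(-gamma * pmax(D2, 0))                                  # clamp tiny negative values from rounding
}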

SLIDE 7

KPCA Algorithm

1. Compute the kernel matrix, $K_{ij} = k(x_i, x_j)$.
2. Solve $M\lambda\alpha = K\alpha$ by diagonalizing $K$, where $\lambda_M \geq \lambda_{M-1} \geq \cdots \geq \lambda_1$ are the eigenvalues and $\alpha^M, \ldots, \alpha^1$ are the corresponding column vectors of expansion coefficients.
3. Normalize the expansion coefficients $\alpha^n$ by requiring $\lambda_n (\alpha^n \cdot \alpha^n) = 1$ for $n = p, \ldots, M$, where $\lambda_p$ corresponds to the first non-zero eigenvalue.
4. Compute the projections of test data onto the normalized eigenvectors $V^n$ in the high-dimensional feature space $F$ using $(V^n \cdot \Phi(x)) = \sum_{i=1}^{M} \alpha_i^n k(x_i, x)$.

These steps are sketched in R below.
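A minimal R sketch of steps 1 through 4, reusing the gaussian_kernel() helper from the previous sketch; centering the kernel matrix in feature space is a standard extra step assumed here, and the number of components kept is arbitrary.

# KPCA sketch following steps 1-4 (assumes gaussian_kernel() from the earlier sketch).
kpca_fit <- function(X, gamma = 0.1, n_comp = 2) {
  M  <- nrow(X)
  K  <- gaussian_kernel(X, gamma)                  # step 1: kernel matrix K_ij = k(x_i, x_j)
  H  <- diag(M) - matrix(1 / M, M, M)
  Kc <- H %*% K %*% H                              # center the mapped data in feature space
  eig  <- eigen(Kc, symmetric = TRUE)              # step 2: diagonalize K (eigenvalues in decreasing order)
  keep <- seq_len(n_comp)
  alpha <- sweep(eig$vectors[, keep, drop = FALSE], 2,
                 sqrt(eig$values[keep]), "/")      # step 3: enforce lambda_n * (alpha^n . alpha^n) = 1
  scores <- Kc %*% alpha                           # step 4: projections of the data onto V^n
  list(alpha = alpha, scores = scores)
}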

SLIDE 8

Supervised Kernel Principal Component Analysis

◮ SKPCA is similar to KPCA, except that class labels are used to maximize the dependency of the covariates on the response in question. This is done using an estimate of the Hilbert–Schmidt Independence Criterion,
$$\mathrm{HSIC}(Z, F, G) = \frac{1}{(n-1)^2}\,\mathrm{tr}(KHLH),$$
where $H, K, L \in \mathbb{R}^{n \times n}$ with $K_{ij} = k(x_i, x_j)$, $L_{ij} = l(y_i, y_j)$, and $H_{ij} = \mathbb{1}(i = j) - 1/n$, which measures the strength of association between two random variables.
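To make the estimator concrete, a small R sketch of the empirical HSIC value is given below; treating the label kernel l(y_i, y_j) as a simple delta kernel on class labels is an assumption for illustration, not a detail stated on the slide.

# Empirical HSIC estimate: tr(K H L H) / (n - 1)^2
# K: kernel matrix on the covariates; y: vector of class labels (delta kernel assumed for L).
hsic <- function(K, y) {
  n <- nrow(K)
  y <- as.character(y)                         # work with plain labels
  L <- outer(y, y, "==") * 1                   # L_ij = 1 if y_i and y_j share a class, else 0
  H <- diag(n) - matrix(1 / n, n, n)           # centering matrix H_ij = 1(i = j) - 1/n
  sum(diag(K %*% H %*% L %*% H)) / (n - 1)^2
}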

SLIDE 9

Fisher Discriminant Analysis

◮ FDA is a supervised method which approximates the theoretically optimal Bayes classifier.

◮ In FDA we find the vector $w$ which maximizes the objective function
$$J(w) = \frac{w^T S_B w}{w^T S_W w},$$
where $S_B = (m_1 - m_2)(m_1 - m_2)^T$, $S_W = \sum_{i=1}^{c} \sum_{x \in X_i} (x - m_i)(x - m_i)^T$, and $m_i = \frac{1}{l_i} \sum_{j=1}^{l_i} x^i_j$ is the mean of class $i$.
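In the two-class case, the maximizer of J(w) is proportional to $S_W^{-1}(m_1 - m_2)$; the R sketch below illustrates this closed-form solution on generic inputs (it is not code from the study).

# Two-class Fisher discriminant sketch: w is proportional to S_W^{-1} (m_1 - m_2).
fda_direction <- function(X, y) {
  classes <- unique(y)
  X1 <- X[y == classes[1], , drop = FALSE]
  X2 <- X[y == classes[2], , drop = FALSE]
  m1 <- colMeans(X1); m2 <- colMeans(X2)                        # class means m_i
  Sw <- crossprod(scale(X1, center = m1, scale = FALSE)) +
        crossprod(scale(X2, center = m2, scale = FALSE))        # within-class scatter S_W
  w  <- solve(Sw, m1 - m2)                                      # direction maximizing J(w)
  w / sqrt(sum(w^2))                                            # return a unit vector
}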

SLIDE 10

Kernel Fisher Discriminant Analysis

Written in terms of kernels, the KFDA objective is
$$J(\alpha) = \frac{\alpha^T P \alpha}{\alpha^T Q \alpha},$$
where
$$P = \sum_c N_c \left[\kappa_c \kappa_c^T - \kappa \kappa^T\right], \qquad Q = KK^T - \sum_c N_c \kappa_c \kappa_c^T,$$
$$(\kappa_c)_j = \frac{1}{N_c} \sum_{i \in c} K_{ij}, \qquad \kappa_j = \frac{1}{N} \sum_i K_{ij}, \qquad K_{ij} = k(x_i, x_j).$$
To solve for $\alpha$, we find the leading eigenvector of $Q^{-1}P$, which maximizes the ratio of between-class variation to within-class variation in a high-dimensional feature space.
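The sketch below builds P and Q from the kernel matrix as defined above and extracts the leading eigenvector of Q^{-1}P; the small ridge term added to Q for numerical stability is an implementation choice, not something stated on the slide (gaussian_kernel() is reused from the earlier sketch).

# KFDA sketch: leading eigenvector of Q^{-1} P (assumes gaussian_kernel() from earlier).
kfda_alpha <- function(X, y, gamma = 0.1, ridge = 1e-6) {
  K <- gaussian_kernel(X, gamma)                     # K_ij = k(x_i, x_j)
  N <- nrow(K)
  kappa <- colMeans(K)                               # kappa_j = (1/N) sum_i K_ij
  P <- matrix(0, N, N)
  Q <- K %*% t(K)
  for (cl in unique(y)) {
    idx     <- which(y == cl)
    Nc      <- length(idx)
    kappa_c <- colMeans(K[idx, , drop = FALSE])      # (kappa_c)_j = (1/Nc) sum_{i in c} K_ij
    P <- P + Nc * (tcrossprod(kappa_c) - tcrossprod(kappa))
    Q <- Q - Nc * tcrossprod(kappa_c)
  }
  Q <- Q + ridge * diag(N)                           # small ridge term for invertibility
  Re(eigen(solve(Q, P))$vectors[, 1])                # leading eigenvector of Q^{-1} P
}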

SLIDE 11

Wine Chocolate Data

SLIDE 12

Swiss Roll Data

SLIDE 13

Concentric Segmented Ring Data

SLIDE 14

Morph-II Dataset

◮ Contains over 55,000 mugshots of 13,617 individuals, collected over 5 years.

◮ Ages range from 16 to 77 years.

◮ Provides race, gender, date of birth, date of arrest, age, time difference since last picture, subject identifier, and image number for each picture in the database.

SLIDE 15

Morph-II Features

◮ Due to the high computational cost of kernel-based DR methods, subsets of 1,000 random histogram-equalized images were selected from MORPH-II for analysis.

◮ Each subject has a single image in the subset.

◮ The ratio of male to female is 3:1 and black to white is 1:1.

◮ We applied the previously described DR methods to features extracted using two methods, namely local binary patterns (LBP) and biologically inspired features (BIF).

SLIDE 16

Gender Classification

◮ After application of the previously described dimensionality reduction methods, we used a linear support vector machine with 5-fold cross-validation to predict gender and to record the classification accuracy, its standard deviation across folds, and the computational time, using the R programming language. A sketch of this step follows.
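A hedged R sketch of this classification step is shown here; it uses the e1071 interface to LIBSVM as one possible implementation, and the random matrix stands in for the dimension-reduced LBP/BIF features actually used in the study.

# Linear SVM with 5-fold cross-validation (e1071 assumed; toy data stand in for real features).
library(e1071)

set.seed(1)
features <- matrix(rnorm(200 * 10), nrow = 200)                 # placeholder feature matrix
gender   <- factor(sample(c("F", "M"), 200, replace = TRUE))    # placeholder gender labels

timing <- system.time(
  fit <- svm(x = features, y = gender, kernel = "linear", cross = 5)
)

fit$tot.accuracy        # cross-validated accuracy (%)
sd(fit$accuracies)      # standard deviation of accuracy across the 5 folds
timing["elapsed"]       # computational time in seconds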

SLIDE 17

Conclusions

◮ All simulated datasets were linearly separable by class after application of KFDA, while results were mixed for KPCA and SKPCA, which seem to be more sensitive to tuning parameters.

◮ Higher gender classification accuracies were obtained using BIFs than LBPs, with 91.3% accuracy from SKPCA.

◮ More research should be done with respect to tuning parameter selection, as those used in this study are not necessarily optimal.

SLIDE 18

References

Schölkopf, B., Smola, A., and Müller, K.-R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10(5), 1299–1319.

Barshan, E., Ghodsi, A., Azimifar, Z., and Jahromi, M. Z. (2011). Supervised principal component analysis: Visualization, classification and regression on subspaces and submanifolds. Pattern Recognition, 44(7), 1357–1371.

Wang, Y., Chen, C., Watkins, V., and Ricanek, K. (2015). Modified supervised kernel PCA for gender classification. In: He, X. et al. (eds.), Intelligence Science and Big Data Engineering: Image and Video Data Engineering. Lecture Notes in Computer Science, vol. 9242. Springer, Cham.

Mika, S., Rätsch, G., Weston, J., Schölkopf, B., and Müller, K.-R. (1999). Fisher discriminant analysis with kernels. In: Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop, Madison, WI, 41–48.

Welling, M. (2005). Fisher linear discriminant analysis. Department of Computer Science, University of Toronto.