

SLIDE 1

Kernel-Based Dimensionality Reduction Methods on Synthesized and Facial Image Data

Jonathan L. Fabish

Statistical Data Mining and Visualization REU, UNCW

July 25, 2017

SLIDE 2

Outline

High Dimensionality: Introduction
Dimensionality Reduction Methods: Principal Component Analysis, Kernel Principal Component Analysis, Supervised Kernel Principal Component Analysis, Fisher Discriminant Analysis
Application to Simulated Data: Simulations
Application to Morph-II: Morph-II, Gender Classification
Wrap-up: Conclusions

SLIDE 3

High Dimensionality and Non-Linearity

◮ High dimensionality, which can hinder the use of traditional classification and regression algorithms, is associated with having a large number of often correlated predictors and a small sample size.

◮ Dimensionality reduction methods reduce noise, extract the main features of a dataset, and decrease computational cost.

◮ Since classes in a real dataset are rarely linearly separable, non-linear dimensionality reduction methods that employ the kernel trick, such as Kernel Fisher Discriminant Analysis (KFDA), Kernel Principal Component Analysis (KPCA), and Supervised Kernel Principal Component Analysis (SKPCA), are often effective.

SLIDE 4

PCA

◮ In linear PCA, the directions of highest variation in a dataset are identified by computing the eigenvectors which correspond to the largest eigenvalues of the covariance matrix of the centered data, $C = \frac{1}{M} \sum_{j=1}^{M} x_j x_j^T$.

◮ This is accomplished by solving the eigenvalue equation $\lambda v = C v$.

◮ The PCs are obtained from the eigenvectors corresponding to the largest eigenvalues.
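As a concrete illustration (not part of the original slides), here is a minimal R sketch of linear PCA via the eigendecomposition above; the toy data matrix and the choice of two components are arbitrary placeholders.

# Minimal linear PCA sketch: eigendecompose the covariance matrix of centered data.
set.seed(1)
X <- matrix(rnorm(100 * 5), nrow = 100)          # toy data: 100 observations, 5 features

Xc  <- scale(X, center = TRUE, scale = FALSE)    # center the data
C   <- crossprod(Xc) / nrow(Xc)                  # C = (1/M) sum_j x_j x_j^T
eig <- eigen(C, symmetric = TRUE)                # solves lambda v = C v; eigenvalues in decreasing order

V      <- eig$vectors[, 1:2]                     # eigenvectors for the two largest eigenvalues
scores <- Xc %*% V                               # projections of the data onto the first two PCs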

SLIDE 5

PCA visualized in 3-dimensions.

SLIDE 6

Kernel Principal Component Analysis

◮ The kernel trick is widely used to facilitate the task of classification on a non-linearly separable dataset.

◮ Project the data to a subspace of arbitrarily high dimensionality using an implicit non-linear map $\Phi$, which maps the elements of $\mathbb{R}^n$ to a high-dimensional space $F$, to explore non-linear relationships among the features.

◮ This is accomplished by replacing all occurrences of the inner product $x^T x'$ in the PCA algorithm with a kernel, in our case the Gaussian kernel, $K(x, x') = \Phi(x)^T \Phi(x') = \exp\left(-\gamma \|x - x'\|^2\right)$.
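The following short R sketch (an illustration, not code from the presentation) computes the Gaussian kernel matrix for a data matrix X; the bandwidth gamma is a placeholder tuning parameter, not a value used in the study.

# Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)
gaussian_kernel <- function(X, gamma = 0.1) {
  sq_norms <- rowSums(X^2)                                   # ||x_i||^2 for each row
  D2 <- outer(sq_norms, sq_norms, "+") - 2 * tcrossprod(X)   # squared pairwise distances
  exp(-gamma * pmax(D2, 0))                                  # clamp tiny negative values from rounding
}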

SLIDE 7

KPCA Algorithm

1. Compute the kernel matrix, $K_{ij} = k(x_i, x_j)$.
2. Solve $M\lambda\alpha = K\alpha$ by diagonalizing $K$, where $\lambda_M \geq \lambda_{M-1} \geq \cdots \geq \lambda_1$ are the eigenvalues and $\alpha^M, \ldots, \alpha^1$ are the corresponding column vectors of expansion coefficients.
3. Normalize the expansion coefficients $\alpha^n$ by requiring $\lambda_n (\alpha^n \cdot \alpha^n) = 1$ for $n = p, \ldots, M$, where $\lambda_p$ corresponds to the first non-zero eigenvalue.
4. Compute the projections of test data onto the normalized eigenvectors $V^n$ in the high-dimensional feature space $F$ using $(V^n \cdot \Phi(x)) = \sum_{i=1}^{M} \alpha_i^n k(x_i, x)$.

These steps are sketched in R below.
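A minimal R sketch of steps 1 through 4, reusing the gaussian_kernel() helper from the previous sketch; centering the kernel matrix in feature space is a standard extra step assumed here, and the number of components kept is arbitrary.

# KPCA sketch following steps 1-4 (assumes gaussian_kernel() from the earlier sketch).
kpca_fit <- function(X, gamma = 0.1, n_comp = 2) {
  M  <- nrow(X)
  K  <- gaussian_kernel(X, gamma)                  # step 1: kernel matrix K_ij = k(x_i, x_j)
  H  <- diag(M) - matrix(1 / M, M, M)
  Kc <- H %*% K %*% H                              # center the mapped data in feature space
  eig  <- eigen(Kc, symmetric = TRUE)              # step 2: diagonalize K (eigenvalues in decreasing order)
  keep <- seq_len(n_comp)
  alpha <- sweep(eig$vectors[, keep, drop = FALSE], 2,
                 sqrt(eig$values[keep]), "/")      # step 3: enforce lambda_n * (alpha^n . alpha^n) = 1
  scores <- Kc %*% alpha                           # step 4: projections of the data onto V^n
  list(alpha = alpha, scores = scores)
}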

SLIDE 8

Supervised Kernel Principal Component Analysis

◮ SKPCA is similar to KPCA, except that class labels are used to maximize the dependency of the covariates on the response in question. This is done using an estimate of the Hilbert–Schmidt Independence Criterion,
$$\mathrm{HSIC}(Z, F, G) = \frac{1}{(n-1)^2}\,\mathrm{tr}(KHLH),$$
where $H, K, L \in \mathbb{R}^{n \times n}$ with $K_{ij} = k(x_i, x_j)$, $L_{ij} = l(y_i, y_j)$, and $H_{ij} = \mathbb{1}(i = j) - 1/n$, which measures the strength of association between two random variables.
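To make the estimator concrete, a small R sketch of the empirical HSIC value is given below; treating the label kernel l(y_i, y_j) as a simple delta kernel on class labels is an assumption for illustration, not a detail stated on the slide.

# Empirical HSIC estimate: tr(K H L H) / (n - 1)^2
# K: kernel matrix on the covariates; y: vector of class labels (delta kernel assumed for L).
hsic <- function(K, y) {
  n <- nrow(K)
  y <- as.character(y)                         # work with plain labels
  L <- outer(y, y, "==") * 1                   # L_ij = 1 if y_i and y_j share a class, else 0
  H <- diag(n) - matrix(1 / n, n, n)           # centering matrix H_ij = 1(i = j) - 1/n
  sum(diag(K %*% H %*% L %*% H)) / (n - 1)^2
}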

SLIDE 9

Fisher Discriminant Analysis

◮ FDA is a supervised method which approximates the theoretically optimal Bayes classifier.

◮ In FDA we find the vector $w$ which maximizes the objective function
$$J(w) = \frac{w^T S_B w}{w^T S_W w},$$
where $S_B = (m_1 - m_2)(m_1 - m_2)^T$, $S_W = \sum_{i=1}^{c} \sum_{x \in X_i} (x - m_i)(x - m_i)^T$, and $m_i = \frac{1}{l_i} \sum_{j=1}^{l_i} x^i_j$ is the mean of class $i$.
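In the two-class case, the maximizer of J(w) is proportional to $S_W^{-1}(m_1 - m_2)$; the R sketch below illustrates this closed-form solution on generic inputs (it is not code from the study).

# Two-class Fisher discriminant sketch: w is proportional to S_W^{-1} (m_1 - m_2).
fda_direction <- function(X, y) {
  classes <- unique(y)
  X1 <- X[y == classes[1], , drop = FALSE]
  X2 <- X[y == classes[2], , drop = FALSE]
  m1 <- colMeans(X1); m2 <- colMeans(X2)                        # class means m_i
  Sw <- crossprod(scale(X1, center = m1, scale = FALSE)) +
        crossprod(scale(X2, center = m2, scale = FALSE))        # within-class scatter S_W
  w  <- solve(Sw, m1 - m2)                                      # direction maximizing J(w)
  w / sqrt(sum(w^2))                                            # return a unit vector
}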

SLIDE 10

Kernel Fisher Discriminant Analysis

Written in terms of kernels, the KFDA objective is
$$J(\alpha) = \frac{\alpha^T P \alpha}{\alpha^T Q \alpha},$$
where
$$P = \sum_c N_c \left[\kappa_c \kappa_c^T - \kappa \kappa^T\right], \qquad Q = KK^T - \sum_c N_c \kappa_c \kappa_c^T,$$
$$(\kappa_c)_j = \frac{1}{N_c} \sum_{i \in c} K_{ij}, \qquad \kappa_j = \frac{1}{N} \sum_i K_{ij}, \qquad K_{ij} = k(x_i, x_j).$$
To solve for $\alpha$, we find the leading eigenvector of $Q^{-1}P$, which maximizes the ratio of between-class variation to within-class variation in a high-dimensional feature space.
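The sketch below builds P and Q from the kernel matrix as defined above and extracts the leading eigenvector of Q^{-1}P; the small ridge term added to Q for numerical stability is an implementation choice, not something stated on the slide (gaussian_kernel() is reused from the earlier sketch).

# KFDA sketch: leading eigenvector of Q^{-1} P (assumes gaussian_kernel() from earlier).
kfda_alpha <- function(X, y, gamma = 0.1, ridge = 1e-6) {
  K <- gaussian_kernel(X, gamma)                     # K_ij = k(x_i, x_j)
  N <- nrow(K)
  kappa <- colMeans(K)                               # kappa_j = (1/N) sum_i K_ij
  P <- matrix(0, N, N)
  Q <- K %*% t(K)
  for (cl in unique(y)) {
    idx     <- which(y == cl)
    Nc      <- length(idx)
    kappa_c <- colMeans(K[idx, , drop = FALSE])      # (kappa_c)_j = (1/Nc) sum_{i in c} K_ij
    P <- P + Nc * (tcrossprod(kappa_c) - tcrossprod(kappa))
    Q <- Q - Nc * tcrossprod(kappa_c)
  }
  Q <- Q + ridge * diag(N)                           # small ridge term for invertibility
  Re(eigen(solve(Q, P))$vectors[, 1])                # leading eigenvector of Q^{-1} P
}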

SLIDE 11

Wine Chocolate Data

SLIDE 12

Swiss Roll Data

SLIDE 13

Concentric Segmented Ring Data

SLIDE 14

Morph-II Dataset

◮ Contains over 55,000 mugshots of 13,617 individuals, collected over 5 years.

◮ Ages range from 16 to 77 years.

◮ Provides race, gender, date of birth, date of arrest, age, time difference since last picture, subject identifier, and image number for each picture in the database.

SLIDE 15

Morph-II Features

◮ Due to the high computational cost of kernel-based DR methods, subsets of 1,000 random histogram-equalized images were selected from MORPH-II for analysis.

◮ Each subject has a single image in the subset.

◮ The ratio of male to female is 3:1 and black to white is 1:1.

◮ We applied the previously described DR methods to features extracted using two methods, namely local binary patterns (LBP) and biologically inspired features (BIF).

SLIDE 16

Gender Classification

◮ After application of the previously described dimensionality reduction methods, we used a linear support vector machine with 5-fold cross-validation to predict gender and to record the classification accuracy, its standard deviation across folds, and the computational time, using the R programming language. A sketch of this step follows.
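A hedged R sketch of this classification step is shown here; it uses the e1071 interface to LIBSVM as one possible implementation, and the random matrix stands in for the dimension-reduced LBP/BIF features actually used in the study.

# Linear SVM with 5-fold cross-validation (e1071 assumed; toy data stand in for real features).
library(e1071)

set.seed(1)
features <- matrix(rnorm(200 * 10), nrow = 200)                 # placeholder feature matrix
gender   <- factor(sample(c("F", "M"), 200, replace = TRUE))    # placeholder gender labels

timing <- system.time(
  fit <- svm(x = features, y = gender, kernel = "linear", cross = 5)
)

fit$tot.accuracy        # cross-validated accuracy (%)
sd(fit$accuracies)      # standard deviation of accuracy across the 5 folds
timing["elapsed"]       # computational time in seconds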

SLIDE 17

Conclusions

◮ All simulated datasets were linearly separable by class after application of KFDA, while results were mixed for KPCA and SKPCA, which seem to be more sensitive to tuning parameters.

◮ Higher gender classification accuracies were obtained using BIFs than LBPs, with 91.3% accuracy from SKPCA.

◮ More research should be done with respect to tuning parameter selection, as those used in this study are not necessarily optimal.

SLIDE 18

References

Schölkopf, B., Smola, A., and Müller, K.-R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10(5), 1299–1319.

Barshan, E., Ghodsi, A., Azimifar, Z., and Jahromi, M. Z. (2011). Supervised principal component analysis: Visualization, classification and regression on subspaces and submanifolds. Pattern Recognition, 44(7), 1357–1371.

Wang, Y., Chen, C., Watkins, V., and Ricanek, K. (2015). Modified supervised kernel PCA for gender classification. In: He, X. et al. (eds.), Intelligence Science and Big Data Engineering: Image and Video Data Engineering. Lecture Notes in Computer Science, vol. 9242. Springer, Cham.

Mika, S., Rätsch, G., Weston, J., Schölkopf, B., and Müller, K.-R. (1999). Fisher discriminant analysis with kernels. In: Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop, Madison, WI, 41–48.

Welling, M. (2005). Fisher linear discriminant analysis. Department of Computer Science, University of Toronto.