SLIDE 1

NONLINEAR COMPONENT ANALYSIS AS A KERNEL EIGENVALUE PROBLEM

Paper by Bernhard Schölkopf, Alexander Smola, and Klaus-Robert Müller. Presented by Karthik, Naman, Shubham, Zhenye, and Ziyu, Department of Industrial and Enterprise Systems Engineering.

SLIDE 2

Overview

  • Introduction and Motivation

○ Review of Principal Component Analysis
○ Problem of PCA
○ Strategy Implementation
○ Computational Hurdles
○ Introduction of Kernels

  • Technical Background

○ Kernel Methods

  • Summary of Main Results

○ Pseudocode and Algorithm
○ Experimental Results of the Paper

  • Application Examples

○ Toy Example
○ IRIS Clustering
○ USPS Classification

  • Summary and Connection to the Course

  • References
SLIDE 3

INTRODUCTION AND MOTIVATION

SLIDE 4

Review: Principal Component Analysis

Motivation:

  • Reduce the dimensions of the dataset with minimal loss of information.

Definition:

  • PCA is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.

How to perform linear PCA?

SLIDE 5

Principal Component Analysis in Action:

  • Determining the axis (component) of maximum variance.
  • Finding all such orthogonal components.
  • Projecting the data onto those components.

A minimal code sketch of these steps follows below.
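The following is a minimal NumPy sketch of linear PCA (an illustration, not the presenters' MATLAB code): center the data, eigendecompose the covariance matrix, and project onto the leading components.

    import numpy as np

    def linear_pca(X, n_components=2):
        """Minimal linear PCA; X is an (M observations x N features) array."""
        X_centered = X - X.mean(axis=0)            # center each feature
        cov = np.cov(X_centered, rowvar=False)     # N x N covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
        order = np.argsort(eigvals)[::-1]          # sort by decreasing variance
        components = eigvecs[:, order[:n_components]]
        return X_centered @ components             # project data onto the PCs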

SLIDE 6

Principal Component Analysis in Action:

  • Problem: Determining the axis (component) of maximum variance.
SLIDE 7

Principal Component Analysis in Action:

  • Other examples:

○ Facial images with emotional expressions
○ Images of an object whose orientation is variable
○ Data that can't be separated by linear boundaries

SLIDE 8

Problem of PCA

Problem Statement:

  • Unable to find components that represent nonlinear data effectively.
  • Information loss with projected data.

Strategy to tackle this problem:

  • Map the data to a higher-dimensional space.

○ Assumption: the data will be linearly structured (e.g., linearly separable) in the higher-dimensional space.

  • Perform PCA in that space.
  • Project the data points onto those principal components.
SLIDE 9

Strategy Implementation

  • F - Feature Space
  • Φ - Transforming function
  • M - Total number of observations
  • N - Total number of features
  • x - Original data with M observations and N features

        F1    F2    ...   FN
Obs1    x11   x12   ...   x1N
Obs2    x21   x22   ...   x2N
...     ...   ...   ...   ...
ObsM    xM1   xM2   ...   xMN

SLIDE 10

Strategy Implementation
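The equations on these Strategy Implementation slides are not reproduced in the transcript. In outline (following the paper, and assuming the mapped data are centered in F), feature-space PCA is set up as

$\bar{C} = \frac{1}{M}\sum_{j=1}^{M}\Phi(x_j)\,\Phi(x_j)^{\top}, \qquad \lambda V = \bar{C}V, \qquad V = \sum_{i=1}^{M}\alpha_i\,\Phi(x_i),$

i.e., the covariance matrix in the feature space F, its eigenvalue problem, and the observation that every eigenvector with nonzero eigenvalue lies in the span of the mapped points.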

SLIDE 11

Strategy Implementation

SLIDE 12

Strategy Implementation

SLIDE 13

Computational Hurdles

  • Problem:

○ We want to take advantage of mapping into a high-dimensional space.
○ The mapping, however, can be arbitrary, with a very high or even infinite dimensionality.
○ Computing the mapping of each data point into that space would be computationally expensive.

SLIDE 14

Introduction of Kernels

One method to solve that computational problem is to use ‘KERNELS’.

Definition:

  • Kernels are functions that compute the dot product in the transformed space.
  • Some examples of kernels:
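The slide's kernel examples are not reproduced in the transcript. Two standard kernels used later in this presentation (and in the paper) are the polynomial and Gaussian (RBF) kernels:

$k(x, y) = (x \cdot y)^{d}$ (polynomial kernel of degree $d$)

$k(x, y) = \exp\!\left(-\dfrac{\lVert x - y\rVert^{2}}{2\sigma^{2}}\right)$ (Gaussian / RBF kernel)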
SLIDE 15

Introduction of Kernels

Why are 'KERNELS' computationally efficient?

Reason:

  • They compute the dot product in the transformed space without explicitly carrying out the entire data transformation.

Example:
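The worked example on this slide is not reproduced in the transcript. A standard illustration (used in the paper for the polynomial kernel) is the degree-2 polynomial kernel on $\mathbb{R}^{2}$:

$k(x, y) = (x \cdot y)^{2} = (x_1 y_1 + x_2 y_2)^{2} = \Phi(x)\cdot\Phi(y), \qquad \Phi(x) = (x_1^{2},\ \sqrt{2}\,x_1 x_2,\ x_2^{2}).$

The feature map $\Phi$ never has to be evaluated explicitly: computing the kernel on the original two-dimensional inputs gives the same dot product as working in the transformed space.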

SLIDE 16

TECHNICAL BACKGROUND

SLIDE 17

Algebraic Manipulations

SLIDE 18

Algebraic Manipulations

SLIDE 19

Algebraic Manipulations
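The algebraic manipulations shown on these slides are not reproduced in the transcript. In the paper they amount to taking dot products of the eigenvalue equation with each mapped point and introducing the kernel (Gram) matrix:

$\lambda\,\big(\Phi(x_k)\cdot V\big) = \big(\Phi(x_k)\cdot \bar{C}V\big) \quad \text{for all } k = 1,\dots,M, \qquad K_{ij} := \big(\Phi(x_i)\cdot\Phi(x_j)\big) = k(x_i, x_j).$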

SLIDE 20

Kernel Method for PCA

SLIDE 21

Kernel Method for PCA

Note: the resulting equations look like an eigenvalue decomposition of the matrix K (reproduced below from the paper).
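The slide's equations are not reproduced in the transcript. As derived in the paper, substituting the expansion of V and the kernel matrix K into the feature-space eigenvalue problem gives

$M\lambda\,K\alpha = K^{2}\alpha,$

which is solved through the equivalent eigenvalue problem

$M\lambda\,\alpha = K\alpha.$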

SLIDE 22

Projection Using Kernel Method
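The projection equations on this slide are not reproduced in the transcript. From the paper, the projection of a mapped test point $\Phi(x)$ onto the k-th kernel principal component $V^{k}$ is

$\big(V^{k}\cdot\Phi(x)\big) = \sum_{i=1}^{M}\alpha_i^{k}\,k(x_i, x),$

so test points are projected using only kernel evaluations against the training points.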

SLIDE 23

Visual Representation: KPCA

SLIDE 24

KPCA steps in a nutshell

The following steps were necessary to compute the principal components:

1. Compute the kernel matrix K.
2. Compute its eigenvectors and normalize them in F.
3. Compute projections of a test point onto the eigenvectors.
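Step 2's normalization "in F" requires the feature-space eigenvectors $V^{k}$ to have unit length; as in the paper, this translates into a condition on the eigenvectors $\alpha^{k}$ of K:

$\lambda_k\,\big(\alpha^{k}\cdot\alpha^{k}\big) = 1.$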

SLIDE 25

SUMMARY OF MAIN RESULTS

SLIDE 26

Kernel PCA: Pseudocode

  • Loading the test data
  • Centering the test data
  • Creating the kernel matrix K
  • Centering the kernel matrix K in the feature space F
  • Eigenvalue decomposition of the centered K matrix
  • Sorting eigenvalues in descending order
  • Selecting the significant eigenvectors corresponding to these eigenvalues
  • Normalizing all significant sorted eigenvectors of K
  • Projecting the data into the principal-component coordinate system (a minimal code sketch follows below)
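The following NumPy sketch follows the pseudocode above (an illustrative reimplementation, not the presenters' MATLAB code or the linked repository), using a Gaussian kernel as one possible choice:

    import numpy as np

    def rbf_kernel(X, Y, sigma=1.0):
        """Gaussian kernel matrix between rows of X and rows of Y."""
        sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq_dists / (2 * sigma ** 2))

    def kernel_pca(X, n_components=2, sigma=1.0):
        """Minimal kernel PCA with an RBF kernel; X is (M observations x N features)."""
        M = X.shape[0]
        K = rbf_kernel(X, X, sigma)                            # kernel matrix
        one_M = np.ones((M, M)) / M
        K_c = K - one_M @ K - K @ one_M + one_M @ K @ one_M    # center K in feature space
        eigvals, eigvecs = np.linalg.eigh(K_c)                 # ascending eigenvalues
        idx = np.argsort(eigvals)[::-1][:n_components]         # keep the leading components
        alphas, lambdas = eigvecs[:, idx], eigvals[idx]
        alphas = alphas / np.sqrt(lambdas)                     # enforce lambda_k * (a_k . a_k) = 1
        return K_c @ alphas                                    # projections of the training points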

SLIDE 27

Algorithm For Kernel PCA

SLIDE 28

COMPUTATIONAL COMPLEXITY

  • A fifth-order polynomial kernel on a 256-dimensional input space yields a roughly 10^10-dimensional feature space (see the worked count below).
  • We have to evaluate the kernel function M times for each extracted principal component, rather than just evaluating one dot product as for linear PCA.
  • Finally, although kernel principal component extraction is computationally more expensive than its linear counterpart, this additional investment can pay off afterward.
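As an added check (not shown on the slide), the dimensionality quoted in the first bullet comes from counting the monomials of degree d = 5 in N = 256 variables:

$\binom{N + d - 1}{d} = \binom{260}{5} = \frac{260 \cdot 259 \cdot 258 \cdot 257 \cdot 256}{5!} \approx 9.5 \times 10^{9} \approx 10^{10}.$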

SLIDE 29

USPS Handwriting Dataset

The dataset refers to numeric data obtained from the scanning of handwritten digits from envelopes by the U.S. Postal Service. The images have been de-slanted and size normalized, resulting in 16 x 16 grayscale images (Le Cun et al., 1990).

LINK TO USPS REPO : https://cs.nyu.edu/~roweis/data.html

SLIDE 30

Experimental Results of the Article

  • Nonlinear PCs afforded better recognition rates than corresponding numbers of linear PCs.
  • Performance for nonlinear components can be improved by using more components than is possible in the linear case.

[Chart: Test Error Rates on the USPS Handwritten Digit Database]

SLIDE 31

APPLICATION EXAMPLES

SLIDE 32

EXAMPLE APPLICATIONS

1. Toy Example
2. IRIS Clustering
3. USPS Classification

LINK TO OUR GITHUB REPO : https://github.com/Zhenye-Na/npca

SLIDE 33

Toy Example

  • The idea is to test kernels before implementing them on larger datasets.
  • Created our own dataset.
  • Programming language used: MATLAB.

A minimal sketch of such a toy experiment follows below.
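As an illustration (not the presenters' MATLAB code), one way such a toy experiment could look in Python: generate a dataset that linear PCA cannot separate (two concentric circles) and project it with kernel PCA using an RBF kernel; the gamma value is an assumed, tunable choice.

    from sklearn.datasets import make_circles
    from sklearn.decomposition import KernelPCA

    # Two concentric circles: a toy dataset that linear PCA cannot separate.
    X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

    # Kernel PCA with a Gaussian (RBF) kernel.
    kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10.0)
    X_kpca = kpca.fit_transform(X)

    # In the projected space the two circles typically become separable
    # along the first kernel principal component.
    print(X_kpca[:5])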

SLIDE 34

Toy Example

Case 1: Linear Kernel is used

SLIDE 35

Toy Example

Case 2: Gaussian kernel is used

SLIDE 36

Toy Example

Case 3: Polynomial (Degree = 0.5) is used

SLIDE 37

Toy Example

Case 4: Polynomial (Degree = 2) is used

SLIDE 38

IRIS Clustering

The idea is to see whether we could cluster the iris flower dataset and find more inherent clusters.

Programming language used: MATLAB
Repository: UCI Machine Learning Repository

LINK TO UCI REPO : https://archive.ics.uci.edu/ml/datasets/iris

SLIDE 39

IRIS DATASET

  • Same dataset as in the computational assignment.
  • The dataset was obtained from the UCI database. Three flower species were considered and there are four features.
  • Observations were taken in rows and features in columns.
  • Only two visible clusters were obtained from linear PCA.
  • We expected to obtain more information through kernel PCA, but got only two clusters although there are three species of flowers.
SLIDE 40

IRIS Clustering

Case 1: Linear Kernel is used

Results: No apparent data separation is observed

SLIDE 41

IRIS Clustering

Case 2: Gaussian Kernel is used

SLIDE 42

IRIS Clustering

Case 3: Polynomial (Degree = 2) Kernel is used

SLIDE 43

IRIS Clustering

Case 4: Polynomial (Degree = 3) Kernel is used

SLIDE 44

IRIS Clustering

Case 5: Polynomial (Degree = 0.5) Kernel is used

SLIDE 45

IRIS Classification

PCA → SVM: perform kernel PCA with an RBF kernel on the original data, then train an SVM. The scores in the chart below are the mean accuracy on the given test data and labels.
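A minimal scikit-learn version of this pipeline (an illustrative sketch; the number of components and kernel width are assumed values, not the presenters' settings):

    from sklearn.datasets import load_iris
    from sklearn.decomposition import KernelPCA
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Kernel PCA with an RBF kernel followed by an SVM classifier.
    model = make_pipeline(KernelPCA(n_components=3, kernel="rbf", gamma=0.5), SVC())
    model.fit(X_train, y_train)

    # score() returns the mean accuracy on the given test data and labels.
    print(model.score(X_test, y_test))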

SLIDE 46

USPS HANDWRITING RECOGNITION

➔ The USPS dataset contains numeric data obtained from the scanning of handwritten digits from envelopes by the U.S. Postal Service.
➔ Feature extraction is done via PCA and kernel PCA with a polynomial kernel.
➔ Training set: 8000 x 256; test set: 3000 x 256.
➔ An SVM classifier (with a linear kernel) is trained and tested on the split USPS dataset.
➔ We expected kernel PCA to yield higher classification accuracy than linear PCA.
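A minimal sketch of this pipeline (illustrative only; the number of components and polynomial degree are assumptions, and the USPS arrays are expected to be loaded separately, e.g. from the repository linked above):

    from sklearn.decomposition import KernelPCA
    from sklearn.svm import LinearSVC

    def kpca_linear_svm(X_train, y_train, X_test, y_test, n_components=64, degree=3):
        """Polynomial kernel PCA feature extraction followed by a linear SVM.

        X_train is expected to be 8000 x 256 and X_test 3000 x 256, as on the slide;
        n_components and degree are illustrative choices, not the presenters' settings.
        """
        kpca = KernelPCA(n_components=n_components, kernel="poly", degree=degree)
        Z_train = kpca.fit_transform(X_train)
        Z_test = kpca.transform(X_test)
        clf = LinearSVC().fit(Z_train, y_train)
        return clf.score(Z_test, y_test)   # mean accuracy on the test set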

SLIDE 47

Experiments Setup

SLIDE 48

Data Preprocessing - Feature Scaling

Standardize features by removing the mean and scaling to unit variance.

[Plots: feature distributions before and after scaling]
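This is the standardization implemented by scikit-learn's StandardScaler; a minimal illustration (the presenters' actual preprocessing script is not shown in the transcript):

    import numpy as np
    from sklearn.preprocessing import StandardScaler

    X = np.array([[0.0, 255.0], [128.0, 64.0], [255.0, 0.0]])   # toy pixel-like data
    X_scaled = StandardScaler().fit_transform(X)                # per-feature mean 0, unit variance
    print(X_scaled.mean(axis=0), X_scaled.std(axis=0))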

SLIDE 49

SVM - Introductory Overview

Support Vector Machines are based on the concept of decision planes that define decision boundaries. A decision plane is one that separates between a set of objects having different class memberships. Any new object falling to the right is labeled, i.e., classified, as GREEN (or classified as RED should it fall to the left of the separating line).

https://www.youtube.com/watch?v=_PwhiWxHK8o&list=RDQM83CF7-lddZA

SLIDE 50

SVM - Introductory Overview

Here we see the original objects (left side of the schematic) mapped, i.e., rearranged, using kernels. Note that in this new setting, the mapped objects (right side of the schematic) are linearly separable and, thus, instead of constructing the complex curve (left schematic), all we have to do is find an optimal line that can separate the GREEN and the RED objects.
SLIDE 51

USPS Data Classification

[Figure: columns 1-12; rows: Original Image, Features PCA, Features KPCA (deg 2), Features KPCA (deg 3), SVM (linear), SVM (deg 2), SVM (deg 3)]

SLIDE 52

SVM results summary

SLIDE 53

SVM results summary

SLIDE 54

SVM results summary

SLIDE 55

SVM results summary

SLIDE 56

SUMMARY AND COURSE CONNECTION

SLIDE 57

Course Connection

Principal Component Analysis:

  • Able to extract useful features from the dataset.
  • 'Kernel method': potentially extracts more features than regular PCA.

Clustering:

  • More features do not necessarily perform better in the visual description of data separation (example: IRIS).

Classification:

  • A classifier can predict better if more relevant features are supplied for training.

SLIDE 58

Summary

ADVANTAGES OF KPCA OVER PCA

  • Able to extract M features (where M is the number of observations).
  • Able to analyse nonlinear variance.
  • The classifier has an opportunity to train itself better, as the number of extracted features now depends on the number of observations.

DISADVANTAGES OF KPCA OVER PCA

  • The projection into higher dimensions does not necessarily have a pre-image.
  • Tough to predict contour lines intuitively.
  • Clustering (or data separation) does not necessarily work better, as the extracted features are abstract in nature.

SLIDE 59

Summary

  • Kernels can be used to find projections onto principal components without going through a computationally intensive data transformation.
  • The kernel method can potentially extract more features than linear PCA.
  • Those features capture the maximum variance and hence are more representative of the original data.
  • Results obtained with a linear classifier:

○ Better performance: higher accuracy.
○ Running time: considerably low compared to transforming the entire data and then doing PCA.

SLIDE 60

REFERENCES

[1] Wang, Quan. "Kernel principal component analysis and its applications in face recognition and active shape models." arXiv preprint arXiv:1207.3538 (2012).
[2] Schölkopf, Bernhard, Alexander Smola, and Klaus-Robert Müller. "Nonlinear component analysis as a kernel eigenvalue problem." Neural Computation 10.5 (1998): 1299-1319.
[3] Saegusa, Ryo, Hitoshi Sakano, and Shuji Hashimoto. "A nonlinear principal component analysis of image data." IEICE Transactions on Information and Systems 88.10 (2005): 2242-2248.