Local Fisher Discriminant Analysis for Supervised Dimensionality Reduction

Masashi Sugiyama, Tokyo Institute of Technology, Japan
ICML 2006, Pittsburgh, USA, June 25-29, 2006

2. Dimensionality Reduction
• High-dimensional data is not easy to handle, so we need to reduce its dimensionality.
• We focus on:
  - Linear dimensionality reduction: [formula not captured]
  - Supervised dimensionality reduction: [formula not captured]

3. Within-Class Multimodality
One of the classes has several modes.
[Figure: Class 1 (blue) and Class 2 (red)]
• Medical checkup: hormone imbalance (high/low) vs. normal
• Digit recognition: even (0, 2, 4, 6, 8) vs. odd (1, 3, 5, 7, 9)
• Multi-class classification: one vs. rest

4. Goal of This Research
• We want to embed multimodal data so that
  - between-class separability is maximized, and
  - within-class multimodality is preserved.
[Figure: embeddings of clusters A, C, and B by FDA, LPP, and LFDA]
• FDA: separable, but within-class multimodality is lost
• LPP: within-class multimodality is preserved, but the classes are not separable
• LFDA: separable, and within-class multimodality is preserved

5. Fisher Discriminant Analysis (FDA), Fisher (1936)
• Within-class scatter matrix: [formula not captured]
• Between-class scatter matrix: [formula not captured]
• FDA criterion: [formula not captured]
  - Within-class scatter is made small
  - Between-class scatter is made large
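
The formulas on this slide are missing from the transcript. For reference, the standard FDA definitions are sketched below. The notation is an assumption and may differ from the slide's: samples $x_i$ with labels $y_i \in \{1, \dots, C\}$, $n_\ell$ samples in class $\ell$, class means $\mu_\ell$, overall mean $\mu$, and transformation matrix $T$.

```latex
\begin{align}
S^{(w)} &= \sum_{\ell=1}^{C} \sum_{i:\, y_i = \ell} (x_i - \mu_\ell)(x_i - \mu_\ell)^\top \\
S^{(b)} &= \sum_{\ell=1}^{C} n_\ell\, (\mu_\ell - \mu)(\mu_\ell - \mu)^\top \\
T_{\mathrm{FDA}} &= \operatorname*{argmax}_{T}\; \operatorname{tr}\!\left( \left(T^\top S^{(w)} T\right)^{-1} T^\top S^{(b)} T \right)
\end{align}
```

Maximizing the trace ratio makes the projected between-class scatter large relative to the projected within-class scatter, which matches the two sub-points on the slide.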

6. Interpretation of FDA
• Pairwise expressions: [formulas not captured], written in terms of the number of samples in each class and the total number of samples
  - Samples in the same class are brought close together
  - Samples in different classes are pushed apart
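
The pairwise view can be checked numerically. The sketch below assumes the usual pairwise weights (1/n_l for same-class pairs in the within-class matrix; 1/n - 1/n_l for same-class pairs and 1/n for different-class pairs in the between-class matrix) and compares them with the class-mean definitions. The function names and the NumPy layout (one sample per row of `X`) are illustrative choices, not taken from the slides.

```python
import numpy as np

def scatter_classical(X, y):
    """Within- and between-class scatter computed from class means."""
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        mu_c = Xc.mean(axis=0)
        Sw += (Xc - mu_c).T @ (Xc - mu_c)
        Sb += len(Xc) * np.outer(mu_c - mu, mu_c - mu)
    return Sw, Sb

def scatter_pairwise(X, y):
    """The same matrices written as weighted sums over sample pairs."""
    n = len(y)
    _, inverse, counts = np.unique(y, return_inverse=True, return_counts=True)
    n_own = counts[inverse].astype(float)  # size of each sample's own class
    same = y[:, None] == y[None, :]
    Ww = np.where(same, 1.0 / n_own[:, None], 0.0)
    Wb = np.where(same, 1.0 / n - 1.0 / n_own[:, None], 1.0 / n)

    def weighted(W):
        # For symmetric W:
        # 1/2 * sum_ij W_ij (x_i - x_j)(x_i - x_j)^T = X^T (D - W) X,
        # where D is the diagonal matrix of row sums of W.
        return X.T @ (np.diag(W.sum(axis=1)) - W) @ X

    return weighted(Ww), weighted(Wb)
```

Same-class pairs carry a positive weight in the within-class matrix, so minimizing it pulls those pairs together; different-class pairs carry a positive weight in the between-class matrix, so maximizing it pushes them apart.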

7. Examples of FDA
[Figure: FDA on three 2-D toy data sets (Simple, Label-mixed cluster, Multimodal); both axes run from −10 to 10; annotations mark which sample groups are made close and which are made apart]
• FDA does not take within-class multimodality into account.
• Note: FDA can extract only C−1 features since [formula not captured], where C is the number of classes.
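
The reason behind the C−1 limit is missing from the transcript. A standard explanation, assumed here, is that the between-class scatter matrix has rank at most C−1, because the weighted class-mean deviations sum to zero. The sketch below checks this on random data; the helper name and the data are illustrative only.

```python
import numpy as np

def between_class_scatter(X, y):
    """Between-class scatter matrix built from class means (one sample per row)."""
    mu = X.mean(axis=0)
    Sb = np.zeros((X.shape[1], X.shape[1]))
    for label in np.unique(y):
        Xc = X[y == label]
        diff = Xc.mean(axis=0) - mu
        Sb += len(Xc) * np.outer(diff, diff)
    return Sb

rng = np.random.default_rng(0)
n_classes, dim = 3, 10
X = rng.normal(size=(150, dim))
y = np.arange(150) % n_classes  # three classes of 50 samples each

rank = np.linalg.matrix_rank(between_class_scatter(X, y))
print("rank of between-class scatter:", rank, "| C - 1 =", n_classes - 1)
```

Only directions with a nonzero generalized eigenvalue are informative, so a rank bound of C−1 caps the number of useful FDA features regardless of the input dimension.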

8. Locality Preserving Projection (LPP), He & Niyogi (NIPS 2003)
• Locality matrix: [formula not captured]
• Affinity matrix: e.g., [formula not captured]
• LPP criterion: [formula not captured]
  - Nearby samples in the original space are made close
  - The constraint is there to avoid [formula not captured]
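
Since the LPP formulas are missing here, the following is a minimal sketch of the usual formulation, not a reconstruction of the slide. It assumes a Gaussian (heat-kernel) affinity with a hypothetical bandwidth `sigma`, the graph Laplacian L = D − A, and the normalization constraint TᵀXᵀDXT = I, solved as a generalized eigenvalue problem whose smallest eigenvalues are kept. All function names are illustrative.

```python
import numpy as np

def heat_kernel_affinity(X, sigma=1.0):
    """A_ij = exp(-||x_i - x_j||^2 / sigma^2); sigma is an assumed bandwidth."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / sigma**2)

def generalized_eigh(A, B):
    """Solve A v = lam B v for symmetric A and symmetric positive-definite B.

    Eigenvalues are returned in ascending order; eigenvectors satisfy V^T B V = I.
    """
    Lc = np.linalg.cholesky(B)       # B = Lc Lc^T
    Linv = np.linalg.inv(Lc)
    C = Linv @ A @ Linv.T            # whitened, ordinary symmetric problem
    lam, U = np.linalg.eigh((C + C.T) / 2)
    return lam, Linv.T @ U

def lpp(X, n_components, sigma=1.0):
    """Return a d x n_components projection matrix (one sample per row of X)."""
    A = heat_kernel_affinity(X, sigma)
    D = np.diag(A.sum(axis=1))
    L = D - A                        # graph Laplacian
    # Assumes X^T D X is positive definite (more samples than dimensions).
    lam, V = generalized_eigh(X.T @ L @ X, X.T @ D @ X)
    return V[:, :n_components]       # smallest eigenvalues keep neighbors close
```

Without a normalization constraint the objective is minimized by shrinking every projected point to the origin, which is presumably the degenerate case the slide's constraint rules out.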

9. Examples of LPP
[Figure: LPP on the three 2-D toy data sets (Simple, Label-mixed cluster, Multimodal); both axes run from −10 to 10; annotations mark which sample groups are made close]
• LPP does not take between-class separability into account (it is unsupervised).

10. Our Approach
We combine FDA and LPP:
• Nearby samples in the same class are made close
• Far-apart samples in the same class are not made close ("don't care")
• Samples in different classes are made apart
[Figure: multimodal toy data with pairs annotated "close", "don't care", and "apart"; both axes run from −10 to 10]

11. Local Fisher Discriminant Analysis
• Local within-class scatter matrix: [formula not captured]
• Local between-class scatter matrix: [formula not captured]
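
The two matrices are missing from the transcript. As a sketch, LFDA is usually defined by weighting the pairwise FDA expressions with an affinity $A_{ij}$ between samples. The notation below is an assumption: $n$ samples in total, $n_\ell$ samples in class $\ell$, and labels $y_i$.

```latex
\begin{align}
\tilde{S}^{(w)} &= \frac{1}{2} \sum_{i,j=1}^{n} \tilde{W}^{(w)}_{ij}\, (x_i - x_j)(x_i - x_j)^\top,
&
\tilde{W}^{(w)}_{ij} &=
\begin{cases}
A_{ij} / n_\ell & \text{if } y_i = y_j = \ell, \\
0 & \text{if } y_i \neq y_j,
\end{cases}
\\
\tilde{S}^{(b)} &= \frac{1}{2} \sum_{i,j=1}^{n} \tilde{W}^{(b)}_{ij}\, (x_i - x_j)(x_i - x_j)^\top,
&
\tilde{W}^{(b)}_{ij} &=
\begin{cases}
A_{ij} \left( 1/n - 1/n_\ell \right) & \text{if } y_i = y_j = \ell, \\
1/n & \text{if } y_i \neq y_j.
\end{cases}
\end{align}
```

Setting $A_{ij} = 1$ for all pairs recovers the pairwise form of ordinary FDA. With a local affinity, far-apart samples in the same class get a weight near zero, so they are not forced together.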

12. How to Obtain the Solution
• Since LFDA has a similar form to FDA, the solution can be obtained just by solving a generalized eigenvalue problem: [formula not captured]
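
A minimal NumPy sketch of this step follows. It assumes the affinity-weighted scatter matrices commonly used for LFDA and a precomputed symmetric affinity matrix `A`. The function name `lfda`, the small ridge term `reg` (added so the Cholesky factorization succeeds), and the omission of any eigenvector rescaling are choices made here, not details from the slides.

```python
import numpy as np

def lfda(X, y, A, n_components, reg=1e-8):
    """Sketch of LFDA: X is n x d, y holds labels, A is a symmetric n x n affinity.

    Returns (T, eigenvalues) with T of shape d x n_components.
    """
    n, d = X.shape
    _, inverse, counts = np.unique(y, return_inverse=True, return_counts=True)
    n_own = counts[inverse].astype(float)  # size of each sample's own class
    same = y[:, None] == y[None, :]

    # Affinity-weighted pairwise weights (assumed standard LFDA form).
    Ww = np.where(same, A / n_own[:, None], 0.0)
    Wb = np.where(same, A * (1.0 / n - 1.0 / n_own[:, None]), 1.0 / n)

    def scatter(W):
        # 1/2 * sum_ij W_ij (x_i - x_j)(x_i - x_j)^T for symmetric W
        return X.T @ (np.diag(W.sum(axis=1)) - W) @ X

    Sw = scatter(Ww) + reg * np.eye(d)     # ridge term is an added assumption
    Sb = scatter(Wb)

    # Generalized problem Sb v = lam Sw v, reduced to an ordinary symmetric
    # eigenproblem through the Cholesky factor of Sw.
    Lc = np.linalg.cholesky(Sw)
    Linv = np.linalg.inv(Lc)
    C = Linv @ Sb @ Linv.T
    lam, U = np.linalg.eigh((C + C.T) / 2)
    order = np.argsort(lam)[::-1][:n_components]  # largest eigenvalues first
    return Linv.T @ U[:, order], lam[order]
```

Because the whole computation is one eigendecomposition, no iterative optimization or initialization is involved, which is the practical advantage over methods that need gradient-based search.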

13. Examples of LFDA
[Figure: LFDA on the three 2-D toy data sets (Simple, Label-mixed cluster, Multimodal); both axes run from −10 to 10]
• LFDA works well in all three cases.
• Note: Usually [formula not captured], so LFDA can extract more than C features (cf. FDA).

14. Neighborhood Component Analysis (NCA), Goldberger, Roweis, Hinton & Salakhutdinov (NIPS 2004)
• Minimizes the leave-one-out error of a stochastic k-nearest neighbor classifier
• The obtained embedding is separable
• NCA involves non-convex optimization, so there are local optima
• No analytic solution is available, so a slow iterative algorithm is needed
• LFDA has an analytic form of the global solution
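
To make the first point concrete, here is a sketch of the quantity NCA optimizes in its commonly cited form, assumed here rather than taken from the slide: each sample picks a neighbor with probability given by a softmax over negative squared distances in the embedded space, and the objective is the expected number of samples whose chosen neighbor shares their label. The function name is illustrative.

```python
import numpy as np

def nca_objective(T, X, y):
    """Expected number of correctly classified samples under stochastic
    nearest-neighbor selection (one sample per row of X, T is d x m)."""
    Z = X @ T
    sq = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq, np.inf)                 # leave-one-out: never pick yourself
    logits = -sq
    logits -= logits.max(axis=1, keepdims=True)  # stabilize the softmax
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)            # P[i, j]: prob. that i picks j
    same = y[:, None] == y[None, :]
    return float(np.sum(P * same))
```

Maximizing this value over `T` is equivalent to minimizing the expected leave-one-out error. The objective is not concave in `T`, which is why gradient-based solvers can stop at local optima.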

15. Maximally Collapsing Metric Learning (MCML), Globerson & Roweis (NIPS 2005)
• The idea is similar to FDA
  - Samples in the same class are made close ("one point")
  - Samples in different classes are made apart
• MCML involves non-convex optimization
• There exists a nice convex approximation, but it yields a non-global solution
• No analytic solution is available, so a slow iterative algorithm is needed

16. Simulations
• Visualization of UCI data sets:
  - Letter recognition (D=16)
  - Segment (D=18)
  - Thyroid disease (D=5)
  - Iris (D=4)
• Extract 3 classes from the original data
• Merge 2 classes
[Figure: Class 1 (blue) and Class 2 (red)]

17. Summary of Simulation Results
Columns: Lett, Segm, Thyr, Iris, Comments. The per-data-set marks were graphical and are not captured in this transcript; only the comments survive.

Method   Comments
FDA      No multimodality
LPP      No label separability
LFDA     (none)
NCA      Slow, local optima
MCML     Slow, no multimodality

Legend for the marks:
• Separable and multimodality preserved
• Separable but no multimodality
• Multimodality preserved but no separability

18. Letter Recognition
[Figure: embeddings by FDA, LPP, LFDA, NCA, and MCML; clusters A, C, and B; blue vs. red]

19. Segment
[Figure: embeddings by FDA, LPP, LFDA, NCA, and MCML; clusters Brickface, Sky, and Foliage; blue vs. red]

20. Thyroid Disease
[Figure: embeddings by FDA, LPP, LFDA, NCA, and MCML; clusters Hyper, Hypo, and Normal; blue vs. red]

21. Iris
[Figure: embeddings by FDA, LPP, LFDA, NCA, and MCML; clusters Setosa, Virginica, and Versicolour; blue vs. red]

22. Kernelization
• LFDA can be made non-linear with the kernel trick.
• FDA: Kernel FDA, Mika et al. (NNSP 1999)
• LPP: Laplacian eigenmap, Belkin & Niyogi (NIPS 2001)
• MCML: Kernel MCML, Globerson & Roweis (NIPS 2005)
• NCA: not available yet?

23. Conclusions
• LFDA effectively combines FDA and LPP.
• LFDA is suitable for embedding multimodal data.
• Like FDA, LFDA has an analytic optimal solution and is therefore computationally efficient.
• Like LPP, LFDA requires the affinity matrix to be specified in advance.
• We used the local scaling method for computing the affinity, which does not include any tuning parameter. Zelnik-Manor & Perona (NIPS 2004)
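
For illustration, a minimal sketch of a local-scaling affinity in the spirit of Zelnik-Manor & Perona: each sample gets its own scale, taken as the distance to its k-th nearest neighbor, and the affinity of a pair is a Gaussian of their distance normalized by both scales. The function name and the fixed heuristic `k=7` are assumptions here; the slides do not state the value used.

```python
import numpy as np

def local_scaling_affinity(X, k=7):
    """A_ij = exp(-||x_i - x_j||^2 / (sigma_i * sigma_j)), one sample per row.

    sigma_i is the distance from x_i to its k-th nearest neighbor.
    """
    n = X.shape[0]
    dist = np.sqrt(np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
    k_eff = min(k, n - 1)                    # guard for very small data sets
    # Column 0 of each sorted row is the zero distance to the sample itself,
    # so column k_eff holds the distance to the k-th nearest neighbor.
    sigma = np.sort(dist, axis=1)[:, k_eff]
    sigma = np.maximum(sigma, 1e-12)         # avoid division by zero for duplicates
    return np.exp(-dist**2 / (sigma[:, None] * sigma[None, :]))
```

Because the scales adapt to the local density around each sample, no global bandwidth has to be tuned, which fits the slide's remark about the absence of a tuning parameter.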
