  1. Esteban García-Cuesta, Researcher at Universidad Carlos III, Spain. WHEN "LESS IS MORE": TECHNIQUES AND APPLICATIONS. Pittsburgh, February 24th, 2010

  2. "Less is More" [figure: a 3D point cloud projected down to 2D]

  3. Summary
     This talk is about:
     - High dimensional datasets
     - Two proposals developed during my PhD studies
     - How each proposal's point of view can help in a robotics context
     It is not specifically about:
     - A machine learning algorithm
     - Computer vision
     - Data mining

  4. Outline
     - Introduction to dimensionality reduction
     - Feature selection using eigenvector coefficients (Part I)
       - Introduction: Principal Component Analysis
       - How to use the PCA coefficients for feature selection
       - Application to a remote sensing scenario
     - Feature extraction models (Part II)
       - Homogeneous structures
       - Remote sensing application
       - Facial motion feature points selection
       - Map building without localization by DR
     - Recent trends in dimensionality reduction
       - Graphs and embedding graphs

  5. Introduction to Dimensionality Reduction
     - Motivation
     - Problems related with high dimensional data

  6. Motivation
     - Modern technologies routinely produce massive amounts of data
     - Scientific progress now heavily depends on the ability to process and analyze high dimensional data
     - The heart of these analyses is the reduction of dimensionality, either by selecting a subset of the original features or by obtaining a well-chosen combination of them

  7. Problems Related with High Dimensionality
     - Most machine learning and data mining techniques are not effective on high dimensional datasets
     - Irrelevant features
     - Redundant features
     - The so-called "curse of dimensionality" (CoD) [Bellman '61]
     [figure: examples of irrelevant and redundant features]

  8. CoD
     - The number of training instances needed to 'populate' a space grows exponentially with its dimensionality. What can we do?
     - Unexpected properties of uniformly sampled points:
       - Euclidean distances tend to concentrate (the sketch below shows this)
       - Gaussian behavior of uniformly sampled points
     (Credit: Laurens van der Maaten, DM Summer School 2008, Maastricht)
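A minimal sketch (my own example, not from the talk) of the distance concentration behind the CoD: as the dimensionality of uniformly sampled points grows, the spread of pairwise Euclidean distances shrinks relative to their mean, so all points start to look equally far apart.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
for d in (2, 10, 100, 1000):
    X = rng.uniform(size=(500, d))   # 500 points in the d-dimensional unit cube
    dist = pdist(X)                  # all pairwise Euclidean distances
    # the relative spread shrinks as d grows: distances concentrate
    print(f"d={d:4d}  std/mean of distances = {dist.std() / dist.mean():.3f}")
```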

  9. Dimensionality Reduction
     - The "intrinsic" dimensionality may be smaller than the number of features
       - Def: the minimum number of features necessary to preserve the data properties
     - Other reasons for dimensionality reduction: compress data; visualize high dimensional data
     - Feature Selection: only a subset of the original features is selected (discrete); preserves comprehensibility
     - Feature Extraction: all features are used (continuous)
     (The sketch below contrasts the two.)
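A tiny illustrative contrast (my example, with arbitrary data): selection keeps a discrete subset of the original columns, while extraction builds continuous combinations of all of them.

```python
import numpy as np

X = np.random.default_rng(0).normal(size=(100, 6))  # 100 samples, 6 features

# Feature selection: keep original features, e.g. columns 0, 3 and 5 (discrete)
selected = X[:, [0, 3, 5]]

# Feature extraction: every new feature mixes all original ones (continuous)
W = np.random.default_rng(1).normal(size=(6, 2))    # a 6 -> 2 linear map
extracted = X @ W
print(selected.shape, extracted.shape)              # (100, 3) (100, 2)
```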

  10. Remote Sensing Application
     FORWARD MODEL: the combustion gases (CO2, H2O) emit a spectrum of energy that is measured by an infrared sensor [figure: intensity (a.u.) vs. wavelength (cm-1)]
     INVERSE MODEL: from the measured spectrum, invert the RTE to retrieve the temperature along the flame length [figure: retrieved temperature profile vs. length]

  11. Machine Learning Approach
     We have gathered a dataset X:
     - N data samples (different flame observations)
     - D features/variables/dimensions (one per wavelength)
     We want to 'learn' from this data:
     - The inverse of the RTE: a regression problem mapping spectra (intensity vs. wavelength) to temperature profiles along the flame length (see the sketch below)
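A schematic sketch of this regression setup using synthetic stand-in arrays (the shapes N, D, P and the use of scikit-learn's MLPRegressor here are my assumptions for illustration; the talk's actual data are measured flame spectra and temperature profiles):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
N, D, P = 200, 500, 10          # N observations, D wavelengths, P profile points
X = rng.normal(size=(N, D))     # stand-in for measured spectra (a.u.)
y = rng.normal(size=(N, P))     # stand-in for temperature profiles (K)

# learn the spectrum -> temperature map (the inverse of the RTE)
model = MLPRegressor(hidden_layer_sizes=(30,), max_iter=2000)
model.fit(X, y)
```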

  12. Why It Is Important to Solve the Inverse RTE
     Combustion contributes to global warming and poses health dangers. The goal is automatic control and diagnosis of combustion, in order to obtain energy efficiently and minimize pollutant emissions.

  13. Feature Selection Using the Eigenvector Coefficients
     - Introduction: Principal Component Analysis
     - How to use PCA to select a subset of the original features
     - Application to remote sensing data

  14. Feature Selection
     - Def: a process that chooses an optimal subset of features according to an objective function
     - Objectives:
       - Reduce dimensionality and remove noise
       - Improve mining performance: speed of learning, accuracy, simplicity and comprehensibility
     - Supervised: exploits input-output relations; unstable due to multicollinearity; wrapper approach (there are many candidate subsets)
     - Unsupervised: feature ranking based on a quality metric, e.g. variance and separability (PCA)

  15. Subset Search Problem [Kohavi & John '97]
     [figure: search over the space of candidate feature subsets]

  16. Feature Selection
     In high dimensional data:
     - There is a large number of features to work with
     - There are many irrelevant features and, more importantly, many redundant ones
     - Individual feature evaluation (filter approach): focuses on identifying relevant features without handling feature redundancy or feature relations
     - Feature subset selection (wrapper approach): relies on evaluating whole subsets to handle redundancy (but there are too many possible subsets)

  17. Multicollinearity
     [figure: highly correlated features] (A toy demonstration follows.)
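A toy demonstration (my example) of why multicollinearity makes supervised selection unstable, as noted on slide 14: with two nearly identical predictors, least-squares weights swing wildly between the redundant columns from one noise draw to the next, even though their sum stays stable.

```python
import numpy as np

rng = np.random.default_rng(0)
for trial in range(3):
    x1 = rng.normal(size=200)
    x2 = x1 + 1e-6 * rng.normal(size=200)    # almost perfectly collinear copy
    y = x1 + rng.normal(scale=0.1, size=200)  # the true signal uses only x1
    X = np.column_stack([x1, x2])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(trial, w)   # individual weights vary wildly; their sum stays near 1
```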

  18. PCA
     Its main objective is to reduce the dimensionality while conserving the total variance.
     Notation:
     - $X$: input data matrix, one observation per column
     - $\Sigma = XX^T$: the $[p \times p]$ covariance matrix
     - $A$: the $[p \times k]$ eigenvector matrix, with $\alpha_k$ the column vector of the k-th eigenvector
     - $\Lambda$: the $[k \times k]$ diagonal eigenvalue matrix
     - $Y = A^T X$: the k-dimensional projection
     (A code sketch follows.)
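A compact sketch of this decomposition in code (my implementation of the standard method, matching the slide's notation: Sigma the covariance, A the eigenvector matrix, Y the k-dimensional projection):

```python
import numpy as np

def pca(X, k):
    """X: (p, N) matrix with one observation per column."""
    Xc = X - X.mean(axis=1, keepdims=True)   # center each feature
    Sigma = Xc @ Xc.T / Xc.shape[1]          # (p, p) covariance matrix
    eigvals, A = np.linalg.eigh(Sigma)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]    # top-k by explained variance
    A_k = A[:, order]                        # (p, k) eigenvector matrix
    Y = A_k.T @ Xc                           # (k, N) projection
    return Y, A_k, eigvals[order]

X = np.random.default_rng(0).normal(size=(50, 300))  # p=50 features, N=300
Y, A_k, lam = pca(X, k=3)
print(Y.shape, A_k.shape)                    # (3, 300) (50, 3)
```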

  19. PCA Coefficients
     [figure: eigenvector 1, highlighting the coefficients of feature i]

  20. PCA Coefficients
     Key idea: coefficients with a high absolute value mean more influence, so relevant features correspond to high absolute value coefficients.

  21. B4 Method [Jolliffe '02]
     - Very appealing because of its simplicity: each retained eigenvector contributes the feature with its largest absolute coefficient (a minimal sketch follows)
     - It lacks redundancy control
     [figure: eigenvector coefficients alpha_k (a.u.) vs. feature number]
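A minimal sketch of B4-style selection as I read the slide (the helper name is mine): one feature per retained eigenvector, chosen by largest absolute coefficient. Nothing here prevents two eigenvectors from pointing at redundant, highly correlated features, which is exactly the missing redundancy control.

```python
import numpy as np

def b4_select(A_k):
    """A_k: (p, k) eigenvector matrix; returns one feature index per column."""
    return [int(np.argmax(np.abs(A_k[:, j]))) for j in range(A_k.shape[1])]
```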

  22. Analysis of PCA Coefficients
     - Key idea: similar absolute coefficient values mean high correlation between the associated features
       - Irrelevant features: coefficients close to 0
       - Redundant features: similar coefficients
     - At the other extreme, very independent features have maximum coefficient distance
     - Different eigenvectors form uncorrelated bases
     (A toy check of this idea follows.)
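A toy check of this key idea (my example): a redundant copy of a feature receives a nearly identical coefficient on the leading eigenvector, while an irrelevant noise feature receives a coefficient near zero.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.normal(size=300)
X = np.column_stack([
    signal,                                # feature 0: informative
    signal + 0.01 * rng.normal(size=300),  # feature 1: redundant copy of 0
    0.01 * rng.normal(size=300),           # feature 2: irrelevant noise
]).T                                       # (p=3, N=300), observations in columns

Sigma = np.cov(X)                          # (3, 3) covariance matrix
eigvals, A = np.linalg.eigh(Sigma)
alpha1 = A[:, -1]                          # leading eigenvector
print(alpha1)  # features 0 and 1 share nearly equal weight; feature 2 is ~0
```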

  23. Redundancy Control
     - Select the feature with the highest coefficient value within each of several ranges
     - It is difficult to choose the threshold that defines the ranges
     [figure: eigenvector coefficients alpha_k (a.u.) vs. feature number]

  24. Using A Priori Specific Knowledge
     The gases emit and absorb at characteristic wavelengths, so adjacent wavelengths/features carry similar spatial information.
     [figure: infrared sensor spectrum, wavelength (cm-1) vs. X-space]

  25. Guided Feature Selection [García-Cuesta '08]
     "Multilayer perceptron as inverse model in a ground-based remote sensing temperature retrieval problem", J. Eng. Appl. Artif. Intell., Vol. 21, Issue 1, pp. 26-34, February 2008.
     - Select features with high and mutually different coefficient values
     - Similar features carry similar information, so locally find the features with high coefficient values
     [figure: eigenvector coefficients alpha_k (a.u.) vs. feature number]

  26. Algorithm
     PCA:
     1. Calculate the covariance input matrix Sigma = XX^T
     2. Obtain the eigenvectors alpha and the eigenvalues Lambda of Sigma, and select alpha_q
     Guided feature selection:
     3. Select a subset of features by applying a maximum value algorithm to alpha_q
     4. Use the selected subset of features as input to a machine learning algorithm
     (A sketch of these steps follows.)
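A sketch of steps 1-4 in code (my reading of the algorithm; the fixed-width window used as the "maximum value algorithm" is an assumption based on slide 25's idea of locally finding features with high coefficient values):

```python
import numpy as np

def guided_feature_selection(X, q, window):
    """X: (p, N) spectra, one observation per column.
    q: index of the eigenvector to use; window: width of each local range."""
    Xc = X - X.mean(axis=1, keepdims=True)
    Sigma = Xc @ Xc.T                       # step 1: covariance input matrix
    eigvals, A = np.linalg.eigh(Sigma)      # step 2: eigenvectors and eigenvalues
    alpha_q = np.abs(A[:, np.argsort(eigvals)[::-1][q]])
    # step 3: adjacent wavelengths carry similar information, so keep one
    # local maximum per window instead of k global maxima (which would be
    # redundant); the selected indices are original features (wavelengths)
    selected = [int(s + np.argmax(alpha_q[s:s + window]))
                for s in range(0, len(alpha_q), window)]
    return selected                         # step 4: feed these to the learner
```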

  27. Guided Feature Selection
     [figure: subset of selected original features marked on the eigenvector (a.u.) vs. wavelength number (cm-1)]

  28. Remote Sensing Application Results
     - An MLP neural network has been used for the estimation
     - Cross-validation, with trials over different numbers of hidden neurons
     - The proposed GFS improves on B4 and converges faster
     - The error increases when more features are added
     [figure: error (K) vs. number of selected features (20-100), comparing B4 and GFS]

  29. Remote Sensing Application Results
     [figure: further results of the remote sensing application]

  30. Feature Selection
     - We developed a feature selection method based on PCA that reveals the dependencies between features
     - It allows introducing a priori knowledge
     - Selecting original features makes it possible to design specific sensors, which:
       - Reduces the cost of the equipment
       - Reduces the cost of massive data storage
