
Sparse PCA / LDA



  1. Baback Moghaddam, Machine Learning Group, baback@jpl.nasa.gov. NASA Sounder Science Team Meeting, November 3rd-5th, 2010

  2. Sparse PCA / LDA. General problem formulation: $\max_x \frac{x^T A x}{x^T B x}$, where $A$ and $B$ are given symmetric matrices, subject to $\mathrm{card}(x) \le k$, i.e. $\|x\|_0 \le k$
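To make the formulation on slide 2 concrete, here is a minimal sketch that solves it by exhaustive search over supports. It is not the presenter's algorithm. It assumes NumPy is available, that B is positive definite on every candidate support, and that n is small enough to enumerate subsets. The function name `best_sparse_direction` is hypothetical. For a fixed support S the optimum is the largest generalized eigenvalue of the submatrix pair (A_S, B_S), and enlarging a support never lowers that value, so it suffices to scan supports of size exactly k.

```python
import itertools
import numpy as np

def best_sparse_direction(A, B, k):
    """Exhaustively maximize x^T A x / x^T B x subject to card(x) <= k.

    Assumes A, B are symmetric and every k-by-k principal submatrix of B
    is positive definite. Cost grows as C(n, k), so this is for tiny n only.
    """
    n = A.shape[0]
    best_val, best_S, best_x = -np.inf, None, None
    for S in itertools.combinations(range(n), k):
        idx = np.ix_(S, S)
        A_S, B_S = A[idx], B[idx]
        # Reduce the generalized problem to a standard symmetric one:
        # B_S = L L^T, then M = L^{-1} A_S L^{-T} has the same eigenvalues.
        L = np.linalg.cholesky(B_S)
        L_inv = np.linalg.inv(L)
        M = L_inv @ A_S @ L_inv.T
        w, V = np.linalg.eigh(M)  # eigenvalues in ascending order
        if w[-1] > best_val:
            x = np.zeros(n)
            x[list(S)] = L_inv.T @ V[:, -1]  # map back to original coordinates
            best_val, best_S, best_x = w[-1], S, x
    return best_val, best_S, best_x
```

With B set to the identity this reduces to sparse PCA on A. Because the search is exponential in n, it only serves to illustrate the objective on toy problems.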

  3. DNA Microarray • n ~ 10^3 to 10^4 genes • m ~ 10^2 samples • 2 classes: cancer vs. healthy • subject to: card(x) = k

  4. [Figure: plot with axes x1 and x2]

  5. AQUA/AIRS radiance data (June 4th, 2007): 98% of the spectral variance is in 500 frequencies (out of a total of 2500), yielding 5:1 compression
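The 5:1 figure on slide 5 is simply 2500 / 500. The sketch below shows one way such a count could be obtained, under a loudly stated assumption: it ranks channels by their individual variance and keeps the fewest that reach a target fraction of the total. The slide does not say how the 500 frequencies were selected, so this ranking rule and both function names are illustrative only.

```python
def channels_for_variance(variances, target=0.98):
    """Return indices of the highest-variance channels whose summed
    variance first reaches `target` times the total variance.

    Assumption: channels are ranked by per-channel variance, which may
    differ from the selection method behind the slide.
    """
    total = sum(variances)
    ranked = sorted(range(len(variances)), key=lambda i: variances[i], reverse=True)
    kept, acc = [], 0.0
    for i in ranked:
        kept.append(i)
        acc += variances[i]
        if acc >= target * total:
            break
    return kept

def compression_ratio(n_total, n_kept):
    """Ratio of original to retained channel count."""
    return n_total / n_kept
```

For example, `compression_ratio(2500, 500)` gives 5.0, matching the ratio quoted on the slide.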

  6. Hyperion Sulfur Detection: Sparse Classifier (best 12 of 242 channels). 1024-by-256 imagery of the sulfur-rich Borup Fiord glacier, also measured by the 242-band Hyperion sensor during 2006-07. Detection Performance (ROC): for FPR > 1%, the 12-band detection rate is as good as using all 242 bands, yielding a 20:1 compression ratio

  7. AIRS Dataset: July 2005, ~20,000 spectra. Section of the Pacific with stratocumulus, cumulus and deep convective clouds

  8. AIRS Spectrum

  9. Cloudy/Clear Classifier: 1 freq (H2O)

  10. Cloudy/Clear Classifier: 2 freqs (CO2, H2O)

  11. Cloudy/Clear Classifier: 5 freqs (CO2, O3, H2O)

  12. Cloudy/Clear Classifier: 50 freqs (CO2, O3, H2O)

  13. Current Work
      • Algorithmic Enhancements
        • Formulated Sparse-LDA as a sparse regression problem
        • This speeds up optimization, reducing CPU time by a factor of ~10^3
      • Dataset Preparation
        • Selected suitable hyperspectral datasets from the AIRS archive
        • IR spectra for a whole month (huge data matrix = 20000 x 1843)
        • Visual data in four frequency bands (from the AIRS VIS instrument)
      • Demo of AIRS Cloudy/Clear sparse classifier
        • Separation of cloudy from clear data based on Level 1 data
        • Worked with Prof. Yung (Caltech) using their method of cloud separation
        • Tested 2 methods of cloud separation by AIRS Project Scientist G. Aumann
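The slide does not spell out how Sparse-LDA was recast as sparse regression. As a hedged illustration, the classical two-class equivalence between LDA and least squares, with a cardinality constraint added, can be written as follows. The symbols $X$, $y$, $\beta$, $S_W$, $\mu_1$, $\mu_2$, $n_1$, $n_2$ are notation introduced here, not taken from the presentation.

```latex
% Assumptions: X is the column-centered n-by-p data matrix, S_W is the
% pooled within-class scatter, and X^T X is invertible.
% Code the two classes as y_i = n/n_1 (class 1) and y_i = -n/n_2 (class 2).
\[
  \hat{\beta} \;=\; \arg\min_{\beta} \; \| y - X\beta \|_2^2
  \quad\Longrightarrow\quad
  \hat{\beta} \;\propto\; S_W^{-1} (\mu_1 - \mu_2),
\]
% so the least-squares coefficients point along the LDA direction.
% A sparse variant restricts the number of nonzero coefficients:
\[
  \min_{\beta} \; \| y - X\beta \|_2^2
  \quad \text{subject to} \quad \|\beta\|_0 \le k .
\]
```

One plausible reason such a reformulation helps is that regression objectives admit fast incremental subset-selection updates, whereas the Rayleigh-quotient form needs an eigenproblem per candidate support. The slide itself only reports the ~10^3 speedup.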

  14. Future Work
      • Methodology
        • Test the current algorithm with a 3rd cloud separation criterion (based on CO2 retrievals), as suggested/used by Bill Irion (JPL)
        • Select more varied AIRS datasets (ocean and land regions separated) and perform cross-dataset validation
      • Missions
        • Meet with the AIRS Science Team to discuss and propose a new product (L1D) based on our preliminary Sparse-LDA algorithm results

  15. Publications
