UTLC: Unsupervised Transfer Learning Challenge




  1. UTLC: Unsupervised Transfer Learning Challenge
  Grégoire Mesnil 1,2, Yann Dauphin 1, Xavier Glorot 1, Salah Rifai 1, Yoshua Bengio 1, et al.
  1 LISA, Université de Montréal, Canada
  2 LITIS, Université de Rouen, France
  July 2nd 2011, UTL Challenge, ICML Workshop

  2. Plan
  1 Introduction
  2 Deep Architecture: Preprocessing, Feature Extraction, Postprocessing
  3 Results
  4 Summary

  3. UTL Challenge: Presentation
  Dates:
  Phase 1: Unsupervised Learning; start: January 3, end: March 4.
  Phase 2: Transfer Learning; start: March 4, end: April 15.
  Five different data sets:

  data set                          # samples   dimension   sparsity
  AVICENNA   Arabic manuscripts       150205         120        0 %
  HARRY      Human actions             69652        5000       98 %
  RITA       CIFAR-10                 111808        7200        1 %
  SYLVESTER  Ecology                  572820         100        0 %
  TERRY      NLP                      217034       47236       99 %

  4. UTL Challenge: Evaluation
  ALC: Area under the Learning Curve, computed over 1 to 64 labeled samples per class.
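The ALC score above can be sketched in code. This is a minimal illustration of "area under a learning curve over 1 to 64 samples per class", not the exact challenge protocol (which, among other details, normalizes scores against random guessing); the log2 x-axis and the trapezoidal rule are assumptions:

```python
import numpy as np

def alc(sample_sizes, scores):
    """Area under the learning curve.

    sample_sizes : labeled examples per class (e.g. 1, 2, 4, ..., 64)
    scores       : classifier score obtained at each sample size
    The x-axis is taken in log2 scale and the area is normalized so that
    a learner scoring 1.0 everywhere gets ALC = 1.0.
    """
    x = np.log2(np.asarray(sample_sizes, dtype=float))
    y = np.asarray(scores, dtype=float)
    # trapezoidal integration, then normalize by the x-range
    area = np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x))
    return area / (x[-1] - x[0])

sizes = [1, 2, 4, 8, 16, 32, 64]
perfect = alc(sizes, [1.0] * 7)   # a perfect learner gives 1.0
```

Because the curve is integrated on a log scale, performance with very few labels (1 to 8 examples per class) dominates the score, which is exactly what makes good unsupervised representations pay off.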

  10. UTL Challenge: Performance
  How do you evaluate the performance of a model without any labels or prior knowledge of the training set?
  - proxy: valid ALC versus test ALC (Phase 1)
  - valid ALC returned by the competition servers (Phases 1 & 2)
  - ALC with the given labels (Phase 2)
  From Phase 1 to Phase 2, we extensively explored the hyperparameters of our models to take the 1st place.

  11. Deep Architecture: Stack Different Blocks
  We used this template:
  1. Pre-processing: PCA with/without whitening, Contrast Normalization, Uniformization
  2. Feature Extraction: Rectifiers, DAE, CAE, µ-ss-RBM
  3. Post-processing: Transductive PCA



  16. Preprocessing
  Given a training set D = {x^(j)}_{j=1..n} where x^(j) ∈ R^d:
  Uniformization (t-IDF): rank all the x_i^(j) and map them to [0, 1].
  Contrast Normalization: for each x^(j), compute its mean µ^(j) = (1/d) Σ_{i=1}^d x_i^(j) and its standard deviation σ^(j), then set x^(j) ← (x^(j) − µ^(j)) / σ^(j).
  Principal Component Analysis, with or without whitening, i.e. dividing by the square root of the eigenvalues or not.

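The three preprocessing steps can each be written in a few lines of NumPy. This is a plausible reading of the slide, not the authors' exact code; in particular, the rank-based uniformization here ranks each feature over the training set, and ties are broken arbitrarily:

```python
import numpy as np

def uniformize(X):
    """Rank each feature's values over the training set, mapped to [0, 1]."""
    ranks = X.argsort(axis=0).argsort(axis=0)
    return ranks / (len(X) - 1)

def contrast_normalize(X, eps=1e-8):
    """Per-sample standardization: subtract the mean, divide by the std."""
    mu = X.mean(axis=1, keepdims=True)
    sigma = X.std(axis=1, keepdims=True)
    return (X - mu) / (sigma + eps)

def pca(X, k, whiten=False, eps=1e-8):
    """Project onto the top-k principal components; optionally whiten,
    i.e. divide each component by the square root of its eigenvalue."""
    mu = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
    Z = (X - mu) @ Vt[:k].T
    if whiten:
        Z /= S[:k] / np.sqrt(len(X) - 1) + eps
    return Z

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
```

Note that whitening rescales every retained component to unit variance, which removes the dominance of the leading directions before the feature-extraction layer sees the data.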


  22. Feature Extraction: µ-ss-RBM
  The µ-Spike-and-Slab Restricted Boltzmann Machine models the interaction between three random vectors:
  1. a visible vector v representing the observed data
  2. binary "spike" variables h
  3. real-valued "slab" variables s
  It is defined by the energy function:
  E(v, s, h) = − \sum_{i=1}^N v^T W_i s_i h_i + \frac{1}{2} v^T \Big( \Lambda + \sum_{i=1}^N \Phi_i h_i \Big) v + \frac{1}{2} \sum_{i=1}^N s_i^T \alpha_i s_i − \sum_{i=1}^N \mu_i^T \alpha_i s_i h_i − \sum_{i=1}^N b_i h_i + \sum_{i=1}^N \mu_i^T \alpha_i \mu_i h_i
  For training, we use Persistent Contrastive Divergence with a Gibbs sampling procedure.
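The energy function can be evaluated directly. The sketch below simplifies the slabs to scalars (the paper uses vector-valued slabs per hidden unit), and the shapes are an assumption made for illustration; it only computes the energy, not the PCD/Gibbs training loop:

```python
import numpy as np

def mu_ssrbm_energy(v, s, h, W, Lam, Phi, alpha, mu, b):
    """Energy of the µ-ss-RBM with scalar slab variables (a simplification).
    Assumed shapes: v (D,); s, h, alpha, mu, b (N,); W (D, N);
    Lam (D, D); Phi (N, D, D)."""
    interaction = -np.sum((v @ W) * s * h)                  # -sum v.W_i s_i h_i
    quad = 0.5 * v @ (Lam + np.tensordot(h, Phi, axes=1)) @ v
    slab = 0.5 * np.sum(alpha * s**2)                       # quadratic slab cost
    cross = -np.sum(alpha * mu * s * h)                     # spike-slab coupling
    bias = -np.sum(b * h) + np.sum(alpha * mu**2 * h)       # hidden-unit terms
    return interaction + quad + slab + cross + bias

# With v = 0 and s = 0, only the bias terms survive:
D, N = 3, 2
E0 = mu_ssrbm_energy(np.zeros(D), np.zeros(N), np.array([1.0, 0.0]),
                     np.zeros((D, N)), np.eye(D), np.zeros((N, D, D)),
                     np.ones(N), np.array([2.0, 0.0]), np.array([1.0, 2.0]))
```

Because the precision term Λ + Σ Φ_i h_i depends on h, the conditional distribution of v given the hidden configuration has a hidden-unit-dependent covariance, which is what lets the model capture the covariance structure of image patches.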

  23. Feature Extraction: µ-ss-RBM
  More details in A. Courville, J. Bergstra and Y. Bengio, Unsupervised Models of Images by Spike-and-Slab RBMs, ICML 2011.
  [Figure: pools of filters learned on CIFAR-10]

  25. Feature Extraction: Denoising Autoencoders
  A Denoising Autoencoder is an autoencoder trained to denoise artificially corrupted training samples.
  Corruption, e.g. x̃ = x + ε where ε ∼ N(0, σ²)
  Encoder: h(x̃) = s(W x̃ + b), where s is the sigmoid function.
  Decoder: r(x̃) = W^T h(x̃) + b′ (tied weights).
  Different loss functions can be minimized using stochastic gradient descent:
  ‖r(x̃) − x‖²₂  (linear reconstruction, MSE)
  ‖s(r(x̃)) − x‖²₂  (non-linear reconstruction)
  − Σ_i [ x_i log r(x̃)_i + (1 − x_i) log(1 − r(x̃)_i) ]  (cross-entropy)
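The corruption, encoder, decoder and MSE loss above fit in one stochastic-gradient step. This is a minimal sketch with hand-derived gradients for the linear-reconstruction case; the learning rate, sizes and noise level are arbitrary choices, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dae_step(x, W, b, bp, sigma=0.05, lr=0.01):
    """One SGD step of a tied-weight DAE with Gaussian corruption
    and linear reconstruction + MSE loss. Updates W, b, bp in place."""
    x_tilde = x + rng.normal(0.0, sigma, size=x.shape)  # corruption
    h = sigmoid(W @ x_tilde + b)                        # encoder
    r = W.T @ h + bp                                    # linear decoder, tied W
    g = 2.0 * (r - x)                                   # dL/dr for MSE
    delta = (W @ g) * h * (1.0 - h)                     # backprop through sigmoid
    W -= lr * (np.outer(delta, x_tilde) + np.outer(h, g))  # encoder + decoder paths
    b -= lr * delta
    bp -= lr * g
    return np.sum((r - x) ** 2)                         # loss before the update

D, H = 8, 4
W = 0.1 * rng.normal(size=(H, D))
b, bp = np.zeros(H), np.zeros(D)
x = rng.normal(size=D)
losses = [dae_step(x, W, b, bp) for _ in range(200)]
```

Because the target is the clean x while the input is the corrupted x̃, the reconstruction error never reaches zero; what matters is that the denoising loss drops as the filters adapt.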
