Iterative Hybrid Algorithm for Semi-supervised Classification


  1. Iterative Hybrid Algorithm for Semi-supervised Classification
  Martin SAVESKI
  Supervised by Professor Thierry Artières
  University Pierre and Marie Curie
  June 19, 2012

  2. Outline
  - Intro to Semi-supervised Learning
  - The Iterative Hybrid Algorithm
  - Other methods
  - Experiments
  - Performance comparison and observations

  3. Classical Supervised Learning Scenario
  [Diagram: a learning algorithm takes a labeled dataset $\{(x_1, c_1), (x_2, c_2), \dots, (x_n, c_n)\} = (X, C)$ and produces model parameters, which are then used to label new samples]

  4. Semi-Supervised Learning
  [Diagram: in addition to the labeled dataset $\{(x_1, c_1), (x_2, c_2), \dots, (x_n, c_n)\} = (X_L, C_L)$, the learning algorithm also receives unlabeled data $\{x_1, x_2, \dots, x_n\} = X_U$]
  How to use the unlabeled data to build better classifiers?

  5.-7. Generative vs. Discriminative Models (one slide, built up incrementally)

  Generative Models
  - Model how samples from a particular class are generated, modeling inputs, hidden variables, and outputs jointly
  - Strong modeling power; can easily handle missing values

  $$\mathcal{L}_G(\theta) = p(X, C, \theta) = p(\theta) \prod_{n=1}^{N} p(x_n, c_n \mid \theta_{c_n})$$

  Discriminative Models
  - Concerned with defining the boundaries between the classes
  - Directly optimize the boundary
  - Tend to achieve better accuracy

  $$\mathcal{L}_D(\theta) = p(C \mid X, \theta) = \prod_{n=1}^{N} p(c_n \mid x_n, \theta)$$

  No easy way to combine them!
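To make the two objectives concrete, here is a tiny numerical illustration. It is our own toy example, not from the slides: one-dimensional inputs, two unit-variance Gaussian classes with equal priors, and the $p(\theta)$ prior ignored. All data values and variable names are assumed for illustration.

```python
import numpy as np
from scipy.stats import norm

x = np.array([-1.2, -0.3, 0.8, 1.5])       # toy inputs
c = np.array([0, 0, 1, 1])                 # their class labels
mu = np.array([-1.0, 1.0])                 # one mean per class (theta)

joint = 0.5 * norm.pdf(x, loc=mu[c])       # p(x_n, c_n), with p(c) = 1/2
log_L_G = np.sum(np.log(joint))            # generative objective (log L_G)

both = 0.5 * norm.pdf(x[:, None], loc=mu)  # p(x_n, c') for every class c'
cond = joint / both.sum(axis=1)            # p(c_n | x_n) by Bayes' rule
log_L_D = np.sum(np.log(cond))             # discriminative objective (log L_D)
print(log_L_G, log_L_D)
```

The generative objective rewards explaining the $(x, c)$ pairs jointly, while the discriminative objective only rewards separating the classes given $x$; a model can score well on one and poorly on the other, which is why combining them is non-trivial.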

  8. Iterative Hybrid Algorithm
  Input: labeled data $L$ and unlabeled data $U$
  [Flow diagram: a generative model is learned on $L$, then refined on $L + U$; a discriminative model derived from it is tuned on $L$ and used to label part of $U$; the generative model is then re-learned on $L$ plus the labeled part of $U$, and the cycle repeats]

  9. Iterative Hybrid Algorithm (more formally)
  1. Learn $\tilde{\theta}$ on $L \rightarrow \tilde{\theta}^{(0)}$, by maximizing the following objective function:
  $$\sum_{x \in L} \log p(x \mid c, \tilde{\theta})$$
  2. Learn $\tilde{\theta}$ on $L \cup U \rightarrow \tilde{\theta}^{(1)}$, starting from $\tilde{\theta}^{(0)}$, maximizing:
  $$\sum_{x \in L} \log p(x \mid c, \tilde{\theta}) + \lambda \sum_{x \in U} \log \sum_{c'} p(x \mid c', \tilde{\theta})$$

  10. Iterative Hybrid Algorithm (more formally)
  Loop for $n$ iterations, or until convergence:
  1. Learn $\theta$ on $L \rightarrow \theta^{(i)}$, starting from $\tilde{\theta}^{(i)}$, maximizing:
  $$-\frac{1}{2} \|\theta - \tilde{\theta}^{(i)}\|^2 + \sum_{x \in L} \log p(c \mid x, \theta)$$
  2. Use $\theta^{(i)}$ to label part of $U \rightarrow U_{Labeled}$, where the labels are assigned as:
  $$x \rightarrow c = \arg\max_{c} \; p(c \mid x, \theta^{(i)})$$
  3. Learn $\tilde{\theta}$ on $L + U_{Labeled} \rightarrow \tilde{\theta}^{(i)}$, maximizing:
  $$\sum_{x \in L} \log p(x \mid c, \tilde{\theta}) + \lambda \sum_{x \in U_{Labeled}} \log p(x \mid c, \tilde{\theta})$$
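A minimal numpy sketch of the whole procedure, under simplifying assumptions: two classes, unit-variance isotropic Gaussian class-conditionals with equal priors (so $p(c \mid x, \theta)$ is logistic and the boundary is linear), the preliminary EM step on $L \cup U$ from slide 9 omitted, and the most confident fraction of $U$ chosen as the "part of $U$" to label, since the slides do not specify that rule. All function names here are ours, not the author's.

```python
import numpy as np

def fit_generative(XL, yL, X_extra=None, y_extra=None, lam=1.0):
    """Maximum-likelihood class means for unit-variance Gaussians;
    self-labeled points are down-weighted by lambda (step 3)."""
    means = []
    for c in (0, 1):
        Xc, Wc = XL[yL == c], np.ones(np.sum(yL == c))
        if X_extra is not None and np.sum(y_extra == c) > 0:
            Xc = np.vstack([Xc, X_extra[y_extra == c]])
            Wc = np.concatenate([Wc, lam * np.ones(np.sum(y_extra == c))])
        means.append(np.average(Xc, axis=0, weights=Wc))
    return np.array(means)

def means_to_linear(means):
    """Equal priors + unit variance make p(c | x) logistic in x."""
    w = means[1] - means[0]
    b = 0.5 * (means[0] @ means[0] - means[1] @ means[1])
    return w, b

def discriminative_step(XL, yL, w0, b0, steps=500, lr=0.5):
    """Step 1: ascend sum_L log p(c | x, theta) plus the
    -1/2 ||theta - theta~(i)||^2 pull toward the generative solution."""
    w, b = w0.copy(), float(b0)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(XL @ w + b)))   # p(c = 1 | x)
        gw = XL.T @ (yL - p) - (w - w0)           # log-lik + penalty gradients
        gb = np.sum(yL - p) - (b - b0)
        w, b = w + lr * gw / len(XL), b + lr * gb / len(XL)
    return w, b

def iterative_hybrid(XL, yL, XU, n_iter=10, lam=0.5, frac=0.5):
    means = fit_generative(XL, yL)                # theta~ learned on L
    for _ in range(n_iter):
        w, b = discriminative_step(XL, yL, *means_to_linear(means))
        p = 1.0 / (1.0 + np.exp(-(XU @ w + b)))   # step 2: label part of U
        conf = np.abs(p - 0.5)
        keep = conf >= np.quantile(conf, 1 - frac)
        means = fit_generative(XL, yL, XU[keep],  # step 3: re-fit theta~
                               (p[keep] > 0.5).astype(int), lam)
    return w, b
```

The quadratic penalty in `discriminative_step` is what keeps the discriminative parameters close to the generative ones; that coupling, alternated with re-fitting the generative model on the self-labeled data, is the core of the hybrid idea.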

  11.-12. Other methods (one slide, built up incrementally)

  Hybrid Model (Bishop and Lasserre, 2007)
  - Multi-criteria objective function
  - Combines generative and discriminative models with specific priors
  - Optimizes:
  $$p(\theta, \tilde{\theta}) \prod_{n \in L} p(C_n \mid X_n, \theta) \prod_{m \in L \cup U} p(X_m \mid \tilde{\theta})$$

  Entropy Minimization (Grandvalet and Bengio, 2005)
  - Uses the label entropy on unlabeled data as a regularizer
  - Assumes a prior which prefers minimal class overlap
  - Optimizes:
  $$\sum_{x \in L} \log p(c \mid x, \theta) + \lambda \sum_{x \in U} \sum_{c' \in C} p(c' \mid x, \theta) \log p(c' \mid x, \theta)$$
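For the Hybrid Model, a small sketch of evaluating its log-objective may help. It is our own simplification, not the paper's implementation: one-dimensional unit-variance Gaussian classes with equal priors, and a Gaussian coupling between $\theta$ and $\tilde{\theta}$ standing in for the "specific priors". The function name, toy data, and `sigma` parameter are all assumptions.

```python
import numpy as np
from scipy.stats import norm

def hybrid_log_objective(mu, mu_g, XL, yL, XU, sigma=0.5):
    """log of the multi-criteria objective: coupling prior +
    discriminative term on L + generative term on L and U."""
    # log p(theta, theta~): an assumed Gaussian coupling of the two sets
    log_prior = -np.sum((mu - mu_g) ** 2) / (2 * sigma ** 2)
    # log prod_{n in L} p(c_n | x_n, theta), via Bayes' rule
    pxc = 0.5 * norm.pdf(XL[:, None], loc=mu)      # p(x, c') per class
    log_disc = np.sum(np.log(pxc[np.arange(len(XL)), yL] / pxc.sum(axis=1)))
    # log prod_{m in L+U} p(x_m | theta~)
    X_all = np.concatenate([XL, XU])
    log_gen = np.sum(np.log((0.5 * norm.pdf(X_all[:, None], loc=mu_g)).sum(axis=1)))
    return log_prior + log_disc + log_gen

# Toy call: two labeled points per class, a few unlabeled ones.
XL, yL = np.array([-1.0, -0.5, 0.6, 1.2]), np.array([0, 0, 1, 1])
XU = np.array([-0.8, 0.1, 0.9])
print(hybrid_log_objective(np.array([-1.0, 1.0]), np.array([-0.9, 1.1]), XL, yL, XU))
```

A tighter coupling (smaller `sigma`) pushes the solution toward the generative end of the spectrum; a looser one lets the discriminative term dominate.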

  13. Experiments
  Data Set
  - Synthetic data (2 dimensions, 2 classes), generated by elongated Gaussian distributions
  - 2 labeled points per class, 200 unlabeled points per class, 200 test samples per class
  Model
  - $p(x \mid c)$ → isotropic Gaussian distribution
  - Symmetric distribution (model misspecification)
  Setup
  - Generate random data and label random points
  - Run all algorithms for all hyper-parameter values
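A sketch of how such a data set might be generated; the specific means and covariance of the elongated Gaussians are our own assumptions, since the slides do not give them:

```python
import numpy as np

rng = np.random.default_rng(0)
cov = np.array([[3.0, 2.5], [2.5, 3.0]])    # elongated: strongly correlated axes
mu0, mu1 = np.array([-1.5, 0.0]), np.array([1.5, 0.0])

def sample(mu, n):
    return rng.multivariate_normal(mu, cov, size=n)

XL = np.vstack([sample(mu0, 2), sample(mu1, 2)])      # 2 labeled per class
yL = np.array([0, 0, 1, 1])
XU = np.vstack([sample(mu0, 200), sample(mu1, 200)])  # 200 unlabeled per class
Xte = np.vstack([sample(mu0, 200), sample(mu1, 200)]) # 200 test per class
yte = np.repeat([0, 1], 200)
```

Fitting an isotropic Gaussian per class to data drawn from these correlated Gaussians reproduces the model misspecification the experiments rely on.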

  14. Example Data Set

  15. Results with Two Labeled Points
  [Plot: test performance (0.65-0.85) vs. hyper-parameter value (0.0-1.0) for the Iterative Hybrid Algorithm, Hybrid Model, and Entropy Minimization]
  - The hyper-parameters have different semantics, so the curves are not directly comparable
  - Hybrid Model > Iterative Hybrid Algorithm > Entropy Minimization

  16. Results with Two Labeled Points (cont.)
  [Grid of per-run plots: performance vs. hyper-parameter value across many random data sets, for the Iterative Hybrid Algorithm, Hybrid Model, and Entropy Minimization]
  - Hard to fix the hyper-parameters
  - Unstable behavior of the Entropy Minimization method
  - IHA and HM have stable behavior (an iterative process is possible)

  17. Particular Cases
  - Labeled points fixed manually, so that the boundary induced by the labeled points is far from the real one
  - Important feature: overlap on the x axis between the labeled points of the two classes
  - If NO overlap → both methods perform well
  - If overlap → the Hybrid Model is superior

  18. Particular Cases (HM superior scenario)
  Figure 5: A case where there is an overlap between the labeled points of each class on the x axis. The Iterative Hybrid Algorithm (top) correctly classifies the labeled points, but fails to converge to the real boundary between the classes. The Hybrid Model (bottom), however, converges to a satisfactory solution for $\alpha = 0.8$.

  19. Increasing the number of labeled examples
  [Three plots: performance (0.65-0.85) vs. hyper-parameter value for (a) two, (b) four, and (c) six labeled points, comparing the Iterative Hybrid Algorithm, Hybrid Model, and Entropy Minimization]
  As the number of labeled examples increases:
  - The difference between IHA and HM diminishes
  - Entropy Minimization improves, but still lags behind

  20. To sum up
  - An iterative algorithm for combining generative and discriminative models
  - Compared with two other methods (the Hybrid Model and Entropy Minimization)
  - Experiments on synthetic data
  - IHA dominates Entropy Minimization, but is outperformed by the Hybrid Model
  - The difference vanishes as $|L|$ increases

  21. It is your turn now... Questions?

  22. Hybrid Model (details)

  23. Entropy Minimization (details)
  Entropy Minimization (Grandvalet and Bengio, 2005)
  - Uses the label entropy on unlabeled data as a regularizer
  - Assumes a prior which prefers minimal class overlap
  - Optimizes:
  $$\sum_{x \in L} \log p(c \mid x, \theta) + \lambda \sum_{x \in U} \sum_{c' \in C} p(c' \mid x, \theta) \log p(c' \mid x, \theta)$$
  - Uses $U$ to estimate the conditional entropy $H(Y \mid X)$ (a measure of class overlap); see the sketch below
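As a small illustration (our own, not from the slides), the regularizer can be read as an empirical estimate of $-H(Y \mid X)$ computed from the model's predicted class probabilities on $U$:

```python
import numpy as np

def conditional_entropy(probs_U, eps=1e-12):
    """Average of -sum_c' p(c'|x) log p(c'|x) over the unlabeled x;
    probs_U is an (n_unlabeled, n_classes) array of p(c' | x, theta)."""
    p = np.clip(probs_U, eps, 1.0)
    return -np.mean(np.sum(p * np.log(p), axis=1))

# Confident predictions -> low entropy (little class overlap) ...
print(conditional_entropy(np.array([[0.99, 0.01], [0.02, 0.98]])))
# ... versus near-uniform predictions -> high entropy (heavy overlap).
print(conditional_entropy(np.array([[0.55, 0.45], [0.48, 0.52]])))
```

Maximizing the slide's objective with $\lambda > 0$ adds $+\lambda \sum p \log p$, i.e. it rewards driving this entropy down.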

  24. Why Discriminative?
