Deep multi-task learning with evolving weights


  1. Deep multi-task learning with evolving weights (ESANN 2016). Soufiane Belharbi, Romain Hérault, Clément Chatelain, Sébastien Adam. soufiane.belharbi@insa-rouen.fr. LITIS lab., DocApp team - INSA de Rouen, France. 27 April, 2016

  2. Context: Training deep neural networks. Deep neural networks are interesting models (complex/hierarchical features, complex mappings) ⇒ improved performance. Training deep neural networks is difficult: ⇒ vanishing gradient ⇒ more parameters ⇒ need for more data. Some solutions: ⇒ the pre-training technique [Y. Bengio et al. 06, G. E. Hinton et al. 06] ⇒ use unlabeled data

  4. Context: Semi-supervised learning. General case: Data = { labeled data (expensive in money and time, few) ; unlabeled data (cheap, abundant) }. E.g., medical images. ⇒ Semi-supervised learning: exploit the unlabeled data to improve generalization

  6. Context: Pre-training and semi-supervised learning. The pre-training technique can exploit unlabeled data. It is a sequential transfer learning performed in 2 steps: 1) unsupervised task (x: labeled and unlabeled data); 2) supervised task ((x, y): labeled data)

  7. Pre-training technique and semi-supervised learning. Layer-wise pre-training with auto-encoders. [Diagram: the DNN to train, with inputs x1, ..., x5 and outputs ŷ1, ŷ2]

  8. Pre-training technique and semi-supervised learning. Layer-wise pre-training with auto-encoders. Step 1: unsupervised layer-wise training. Train layer by layer, sequentially, using only x (labeled or unlabeled). [Diagram: an auto-encoder reconstructing the input, xi → x̂i]

  9. Pre-training technique and semi-supervised learning. Layer-wise pre-training with auto-encoders. Step 1: unsupervised layer-wise training. Train layer by layer, sequentially, using only x (labeled or unlabeled). [Diagram: the first hidden layer h1,i computed from the inputs x1, ..., x5]

  10. Pre-training technique and semi-supervised learning. Layer-wise pre-training with auto-encoders. Step 1: unsupervised layer-wise training. Train layer by layer, sequentially, using only x (labeled or unlabeled). [Diagram: the first-layer auto-encoder, reconstructing x̂i from the codes h1,i]

  11. Pre-training technique and semi-supervised learning. Layer-wise pre-training with auto-encoders. Step 1: unsupervised layer-wise training. Train layer by layer, sequentially, using only x (labeled or unlabeled). [Diagram: the second hidden layer h2,i]

  12. Pre-training technique and semi-supervised learning. Layer-wise pre-training with auto-encoders. Step 1: unsupervised layer-wise training. Train layer by layer, sequentially, using only x (labeled or unlabeled). [Diagram: the auto-encoder for the second layer]

  13. Pre-training technique and semi-supervised learning. Layer-wise pre-training with auto-encoders. Step 1: unsupervised layer-wise training. Train layer by layer, sequentially, using only x (labeled or unlabeled). [Diagram: the third hidden layer h3,i]

  14. Pre-training technique and semi-supervised learning. Layer-wise pre-training with auto-encoders. Step 1: unsupervised layer-wise training. Train layer by layer, sequentially, using only x (labeled or unlabeled). [Diagram: the fully pre-trained stack] At each layer: ⇒ When to stop training? ⇒ Which hyper-parameters to use? ⇒ How to make sure that the training improves the supervised task?
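To make step 1 concrete, here is a minimal sketch of greedy layer-wise auto-encoder pre-training. PyTorch, the layer sizes, the sigmoid activations, and the training settings are assumptions for illustration only; the slides do not fix a framework or hyper-parameters.

```python
import torch
import torch.nn as nn

# Illustrative encoder stack for a 5-dimensional input (sizes are arbitrary here).
encoder_layers = nn.ModuleList([nn.Linear(5, 4), nn.Linear(4, 3), nn.Linear(3, 3)])

def pretrain_layer(layer, data, epochs=50, lr=1e-2):
    """Train one layer as an auto-encoder on its inputs (labeled or unlabeled)."""
    decoder = nn.Linear(layer.out_features, layer.in_features)  # throw-away decoder
    opt = torch.optim.SGD(list(layer.parameters()) + list(decoder.parameters()), lr=lr)
    mse = nn.MSELoss()
    for _ in range(epochs):
        h = torch.sigmoid(layer(data))       # code
        x_hat = torch.sigmoid(decoder(h))    # reconstruction
        loss = mse(x_hat, data)              # unsupervised reconstruction criterion
        opt.zero_grad(); loss.backward(); opt.step()
    return torch.sigmoid(layer(data)).detach()  # representation fed to the next layer

x_all = torch.rand(256, 5)  # all inputs x; labels y are not needed in this step
h = x_all
for layer in encoder_layers:
    h = pretrain_layer(layer, h)  # greedy: each layer trains on the previous layer's codes
```

This is exactly where the open questions on the slide bite: every call to pretrain_layer needs its own stopping point and hyper-parameters, with no guarantee that they help the later supervised task.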

  15. Pre-training technique and semi-supervised learning. Layer-wise pre-training with auto-encoders. Step 2: supervised training. Train the whole network on (x, y) by back-propagation. [Diagram: the full DNN with inputs x1, ..., x5 and outputs ŷ1, ŷ2]
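Continuing the sketch above (it reuses encoder_layers from step 1), step 2 caps the pre-trained stack with an output layer and fine-tunes the whole network by back-propagation on the labeled pairs (x, y). The two-class output and the cross-entropy criterion are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Step 2: stack the pre-trained layers, add an output layer, fine-tune end to end.
class FineTunedNet(nn.Module):
    def __init__(self, pretrained_layers, n_out=2):
        super().__init__()
        self.hidden = pretrained_layers  # weights initialized by the unsupervised step
        self.out = nn.Linear(pretrained_layers[-1].out_features, n_out)

    def forward(self, x):
        for layer in self.hidden:
            x = torch.sigmoid(layer(x))
        return self.out(x)

net = FineTunedNet(encoder_layers)
opt = torch.optim.SGD(net.parameters(), lr=1e-2)
ce = nn.CrossEntropyLoss()

x_lab, y_lab = torch.rand(64, 5), torch.randint(0, 2, (64,))  # labeled data only
for _ in range(100):
    loss = ce(net(x_lab), y_lab)  # supervised criterion
    opt.zero_grad(); loss.backward(); opt.step()
```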

  16. Pre-training technique and semi-supervised learning. Pre-training: pros and cons. Pros: improves generalization; can exploit unlabeled data; provides a better initialization than random; allows training deep networks ⇒ circumvents the vanishing gradient problem. Cons: adds more hyper-parameters; no good stopping criterion during the pre-training phase (a good criterion for the unsupervised task may not be good for the supervised task)

  18. Pre-training technique and semi-supervised learning. Proposed solution. Why is it difficult in practice? ⇒ Because the transfer learning is sequential. Possible solution: ⇒ parallel transfer learning. Why in parallel? Interaction between the tasks; fewer hyper-parameters to tune; a single stopping criterion

  21. Proposed approach. Parallel transfer learning: tasks combination.
  Training cost = supervised task + unsupervised (reconstruction) task, with l labeled samples, u unlabeled samples, and w_sh the shared parameters.
  Reconstruction (auto-encoder) task: J_r(D; w′ = {w_sh, w_r}) = Σ_{i=1}^{l+u} C_r(R(x_i; w′), x_i).
  Supervised task: J_s(D; w = {w_sh, w_s}) = Σ_{i=1}^{l} C_s(M(x_i; w), y_i).
  Weighted tasks combination: J(D; {w_sh, w_s, w_r}) = λ_s · J_s(D; {w_sh, w_s}) + λ_r · J_r(D; {w_sh, w_r}), with λ_s, λ_r ∈ [0, 1] the importance weights and λ_s + λ_r = 1.
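A minimal sketch of this weighted combination, assuming PyTorch, a shared encoder for w_sh, a linear classification head for w_s, a linear reconstruction head for w_r, and arbitrary sizes. The linear schedule that gradually moves the importance weight from reconstruction to supervision is only one plausible way the weights could evolve, not necessarily the paper's exact schedule.

```python
import torch
import torch.nn as nn

# Shared encoder (w_sh), supervised head (w_s), reconstruction head (w_r); sizes illustrative.
shared = nn.Sequential(nn.Linear(5, 4), nn.Sigmoid(), nn.Linear(4, 3), nn.Sigmoid())
sup_head = nn.Linear(3, 2)   # M(x; {w_sh, w_s})
rec_head = nn.Linear(3, 5)   # R(x; {w_sh, w_r})

params = list(shared.parameters()) + list(sup_head.parameters()) + list(rec_head.parameters())
opt = torch.optim.SGD(params, lr=1e-2)
ce, mse = nn.CrossEntropyLoss(), nn.MSELoss()

x_lab, y_lab = torch.rand(64, 5), torch.randint(0, 2, (64,))  # l labeled samples
x_all = torch.rand(256, 5)                                    # l + u samples (labeled + unlabeled)

epochs = 100
for t in range(epochs):
    # Evolving importance weights: start reconstruction-heavy, end supervision-heavy
    # (linear schedule assumed for illustration; lambda_s + lambda_r = 1 throughout).
    lam_s = min(1.0, t / (0.8 * epochs))
    lam_r = 1.0 - lam_s

    J_s = ce(sup_head(shared(x_lab)), y_lab)   # supervised task, labeled data only
    J_r = mse(rec_head(shared(x_all)), x_all)  # reconstruction task, all data
    loss = lam_s * J_s + lam_r * J_r           # J = lambda_s * J_s + lambda_r * J_r
    opt.zero_grad(); loss.backward(); opt.step()
```

Both gradients flow through the shared encoder at every step, so the two tasks interact during training instead of being run one after the other, and a single stopping criterion (e.g., on supervised validation performance) can be used.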
