Neural Networks Regularization Through Representation Learning. Presentation of my PhD work. Japanese-French workshop on optimization for machine learning (Riken & LITIS), INSA de Rouen, Sept. 25th, 2017. Soufiane Belharbi, Romain Hérault, Clément Chatelain, Sébastien Adam. soufiane.belharbi@insa-rouen.fr (https://sbelharbi.github.io). LITIS lab., Apprentissage team, INSA de Rouen, France.

Introduction. My PhD work. Key words: neural networks, regularization, representation learning. Selected work:
1. A regularization framework for training neural networks for structured output problems. S. Belharbi, C. Chatelain, R. Hérault, S. Adam, "Multi-task Learning for Structured Output Prediction". Under review, Neurocomputing. ArXiv: arxiv.org/abs/1504.07550. 2017.
2. A regularization framework for training neural networks for classification. S. Belharbi, C. Chatelain, R. Hérault, S. Adam, "Neural Networks Regularization Through Class-wise Invariant Representation Learning". In preparation for IEEE TNNLS. ArXiv: arxiv.org/abs/1709.01867. 2017.
3. Transfer learning in neural networks: an application to the medical domain. S. Belharbi, R. Hérault, C. Chatelain, R. Modzelewski, S. Adam, M. Chastan, S. Thureau, "Spotting L3 slice in CT scans using deep convolutional network and transfer learning". Medical Image Analysis (MIA). 2017.

Introduction. Machine Learning. What is Machine Learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using example data or past experience.
Learning a task: learn a general model from data to perform a specific task f.
f_w : x → y, where x is the input, y is the output (target, label), and w are the parameters of f(·), so that f(x; w) = y.
Find w by minimizing E_train = E_{(x, y) ~ P_data} [ l(f(x; w), y) ].
Learning is the capability to generalize:
1. Generalization: E_train ≈ E_test (the challenge!).
2. Overfitting (model capacity, maximum likelihood estimate).
3. The no free lunch theorem: no training algorithm can be the best at every task; focus on the task at hand.
4. Regularization: to generalize better, use prior knowledge about the task.
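A minimal sketch of the f(x; w) / empirical-risk view above, using a linear model and a squared loss; the data, model, and learning rate are illustrative assumptions, not settings from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: y depends linearly on x, plus noise.
x_train, x_test = rng.normal(size=(100, 3)), rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y_train = x_train @ w_true + 0.1 * rng.normal(size=100)
y_test = x_test @ w_true + 0.1 * rng.normal(size=100)

def f(x, w):            # the model f(x; w)
    return x @ w

def risk(x, y, w):      # empirical risk: mean of l(f(x; w), y) with squared loss
    return np.mean((f(x, w) - y) ** 2)

# Find w by gradient descent on E_train.
w = np.zeros(3)
for _ in range(200):
    grad = 2 * x_train.T @ (f(x_train, w) - y_train) / len(y_train)
    w -= 0.1 * grad

print("E_train =", risk(x_train, y_train, w))
print("E_test  =", risk(x_test, y_test, w))   # generalization: E_train ≈ E_test
```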

A regularization scheme for structured output problems. My PhD work. Key words: neural networks, regularization, representation learning. Selected work:
1. A regularization framework for training neural networks for structured output problems. S. Belharbi, C. Chatelain, R. Hérault, S. Adam, "Multi-task Learning for Structured Output Prediction". Under review, Neurocomputing. ArXiv: arxiv.org/abs/1504.07550. 2017.
2. A regularization framework for training neural networks for classification. S. Belharbi, C. Chatelain, R. Hérault, S. Adam, "Neural Networks Regularization Through Class-wise Invariant Representation Learning". In preparation for IEEE TNNLS. ArXiv: arxiv.org/abs/1709.01867. 2017.
3. Transfer learning in neural networks: an application to the medical domain. S. Belharbi, R. Hérault, C. Chatelain, R. Modzelewski, S. Adam, M. Chastan, S. Thureau, "Spotting L3 slice in CT scans using deep convolutional network and transfer learning". Medical Image Analysis (MIA). 2017.

A regularization scheme for structured output problems. Traditional machine learning problems: f : X → y. Inputs X ∈ R^d: any type of input. Output y ∈ R for the task: classification, regression, etc. Machine learning for structured output problems: f : X → Y. Inputs X ∈ R^d: any type of input. Output Y ∈ R^{d'}, d' > 1: a structured object (dependencies). See C. Lampert's slides.


A regularization scheme for structured output problems. Data = representation (values) + structure (dependencies). Examples of structured data: text (part-of-speech tagging, translation), speech ⇄ text, protein folding, images.

A regularization scheme for structured output problems. Approaches that deal with structured output data: kernel-based methods (Kernel Density Estimation, KDE); discriminative methods (structured output SVM); graphical models (HMM, CRF, MRF, ...). Drawbacks: they perform a single data transformation and are difficult to apply to high-dimensional data. An ideal approach would handle structured output problems, high-dimensional data, and multiple data transformations (complex mapping functions). Deep neural networks?


A regularization scheme for structured output problems. Traditional deep neural network. [Figure: fully connected network with an input layer (x1..x7), hidden layers 1-4, and an output layer (y1: car, y2: bus, y3: bike).] High-dimensional data: OK. Multiple data transformations (complex mapping functions): OK. Structured output problems: NO.
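A minimal sketch of the plain feed-forward classifier on this slide (7 inputs, 4 hidden layers, 3 class scores such as car/bus/bike); the layer widths, activations, and batch are illustrative assumptions, not the deck's settings.

```python
import torch
import torch.nn as nn

# Plain deep network: input layer, 4 hidden layers, class-score output layer.
model = nn.Sequential(
    nn.Linear(7, 32), nn.ReLU(),   # hidden layer 1
    nn.Linear(32, 32), nn.ReLU(),  # hidden layer 2
    nn.Linear(32, 32), nn.ReLU(),  # hidden layer 3
    nn.Linear(32, 32), nn.ReLU(),  # hidden layer 4
    nn.Linear(32, 3),              # output: scores for car / bus / bike
)

x = torch.randn(16, 7)             # a batch of 16 inputs
y = torch.randint(0, 3, (16,))     # class labels
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()                    # gradients for one supervised SGD step
```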

A regularization scheme for structured output problems. High-dimensional output. [Figure: fully connected network mapping inputs x1..x7 through hidden layers 1-4 to outputs y1..y7, which together form a structured object.]
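For a structured output Y ∈ R^{d'}, the same architecture only needs a wider last layer; a hedged sketch with d' = 7 follows (sizes and loss are assumptions). Note that a per-dimension loss like this treats the output components independently, which is exactly why a plain network does not capture the dependencies inside the structured object.

```python
import torch
import torch.nn as nn

d, d_prime = 7, 7                   # input and structured-output dimensions (illustrative)
model = nn.Sequential(
    nn.Linear(d, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, d_prime),         # one unit per output component y_1 ... y_{d'}
)

x = torch.randn(16, d)
Y = torch.randn(16, d_prime)        # structured target (e.g., a contour or a map)
loss = nn.MSELoss()(model(x), Y)    # per-dimension loss: ignores dependencies between outputs
loss.backward()
```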

A regularization scheme for structured output problems. Unsupervised learning: layer-wise pre-training, auto-encoders. [Figure: a deep network to train, with inputs x1..x6 and outputs ŷ1, ŷ2.] 1. Use unsupervised training to initialize the network. 2. Fine-tune the network using supervised data.
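A minimal auto-encoder sketch, the building block used for the unsupervised initialization above: an encoder/decoder pair trained to reconstruct x, so no labels are required. The sizes (6 inputs, 5 hidden units) follow the figure; the activations and loss are illustrative assumptions.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(6, 5), nn.Sigmoid())   # x -> h
decoder = nn.Sequential(nn.Linear(5, 6), nn.Sigmoid())   # h -> x_hat

x = torch.rand(32, 6)                 # unlabeled inputs are enough
x_hat = decoder(encoder(x))
loss = nn.MSELoss()(x_hat, x)         # reconstruction loss, no y needed
loss.backward()
```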

A regularization scheme for structured output problems. Unsupervised learning: layer-wise pre-training, auto-encoders. Step 1: unsupervised layer-wise pre-training. Train layer by layer, sequentially, using only x (labeled or unlabeled). [Figure sequence: an auto-encoder first reconstructs the input x1..x6 and yields the first hidden representation h1,1..h1,5; a second auto-encoder is trained to reconstruct h1 and yields h2,1..h2,3; a third auto-encoder reconstructs h2 and yields h3,1..h3,3.]
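A sketch of the whole procedure on these slides: greedily pre-train one auto-encoder per layer on the previous layer's codes, stack the pre-trained encoders, add a supervised output layer, and fine-tune. The widths (6 → 5 → 3 → 3) follow the figures; the losses, optimizer, number of epochs, and the 2-class supervised head are illustrative assumptions.

```python
import torch
import torch.nn as nn

sizes = [6, 5, 3, 3]                 # input, h1, h2, h3 (widths from the figures)
x = torch.rand(128, sizes[0])        # inputs; labels are not needed in step 1
encoders = []

# Step 1: unsupervised layer-wise pre-training, using only x.
h = x
for d_in, d_out in zip(sizes[:-1], sizes[1:]):
    enc = nn.Sequential(nn.Linear(d_in, d_out), nn.Sigmoid())
    dec = nn.Linear(d_out, d_in)
    opt = torch.optim.SGD(list(enc.parameters()) + list(dec.parameters()), lr=0.1)
    for _ in range(100):             # train this auto-encoder to reconstruct its input
        opt.zero_grad()
        loss = nn.MSELoss()(dec(enc(h)), h)
        loss.backward()
        opt.step()
    encoders.append(enc)
    h = enc(h).detach()              # codes become the next layer's training data

# Step 2: stack the pre-trained encoders, add an output layer, fine-tune with labels.
net = nn.Sequential(*encoders, nn.Linear(sizes[-1], 2))
y = torch.randint(0, 2, (128,))      # illustrative supervised targets
opt = torch.optim.SGD(net.parameters(), lr=0.1)
for _ in range(100):
    opt.zero_grad()
    loss = nn.CrossEntropyLoss()(net(x), y)
    loss.backward()
    opt.step()
```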
