Neural Networks Regularization Through Representation Learning
Presentation of my PhD work
Japanese-French workshop on optimization for machine learning (RIKEN & LITIS), INSA de Rouen, Sept. 25th, 2017
Soufiane Belharbi, Romain Hérault, Clément Chatelain, Sébastien Adam
soufiane.belharbi@insa-rouen.fr (https://sbelharbi.github.io)
LITIS lab., Apprentissage team, INSA de Rouen, France
Introduction — My PhD work

Keywords: neural networks, regularization, representation learning.

Selected work:
1. A regularization framework for training neural networks for structured output problems.
   S. Belharbi, C. Chatelain, R. Hérault, S. Adam, "Multi-task Learning for Structured Output Prediction". Under review, Neurocomputing. arXiv: arxiv.org/abs/1504.07550. 2017.
2. A regularization framework for training neural networks for classification.
   S. Belharbi, C. Chatelain, R. Hérault, S. Adam, "Neural Networks Regularization Through Class-wise Invariant Representation Learning". In preparation for IEEE TNNLS. arXiv: arxiv.org/abs/1709.01867. 2017.
3. Transfer learning in neural networks: an application to the medical domain.
   S. Belharbi, R. Hérault, C. Chatelain, R. Modzelewski, S. Adam, M. Chastan, S. Thureau, "Spotting L3 slice in CT scans using deep convolutional network and transfer learning". Medical Image Analysis (MIA). 2017.
Introduction — Machine Learning

What is Machine Learning (ML)?
ML is programming computers (algorithms) to optimize a performance criterion using example data or past experience.

Learning a task
Learn general models from data to perform a specific task f:
f_w : x → y, where x is the input, y the output (target, label), and w the parameters of f(·), such that f(x; w) = y.
Find w by minimizing E_train = E_{(x, y) ∼ P_data}[ l(f(x; w), y) ].

Learning is the capability to generalize
1. Generalization: E_train ≈ E_test (the challenge!).
2. Overfitting (model capacity, maximum likelihood estimation).
3. The no free lunch theorem: no training algorithm can be the best at every task; focus on the task at hand.
4. Regularization: to generalize better, use prior knowledge about the task.
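In practice, finding w means minimizing the empirical risk E_train by gradient descent. A minimal sketch of that loop (the toy model, synthetic data, and PyTorch setup are all assumptions, not the talk's actual code):

```python
# Empirical risk minimization: fit the parameters w of a model f(x; w)
# by minimizing the average loss l(f(x; w), y) over training pairs.
import torch
import torch.nn as nn

f = nn.Linear(10, 1)                       # a toy model f(.; w)
loss_fn = nn.MSELoss()                     # the loss l(f(x; w), y)
opt = torch.optim.SGD(f.parameters(), lr=0.01)

x = torch.randn(64, 10)                    # synthetic training inputs
y = torch.randn(64, 1)                     # synthetic training targets

for epoch in range(100):
    opt.zero_grad()
    e_train = loss_fn(f(x), y)             # E_train over the training set
    e_train.backward()
    opt.step()
```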
A regularization scheme for structured output problems
(Section slide: repeats the selected-work list from the introduction, with contribution 1, "Multi-task Learning for Structured Output Prediction", as the current topic.)
A regularization scheme for structured output problems

Traditional machine learning problems: f : X → y
- Inputs X ∈ R^d: any type of input.
- Output y ∈ R for the task: classification, regression, ...

Machine learning for structured output problems: f : X → Y
- Inputs X ∈ R^d: any type of input.
- Output Y ∈ R^{d'}, d' > 1: a structured object (dependencies between components).

See C. Lampert's slides.
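To make the distinction concrete, a small sketch (all sizes are assumptions): the same network body can end either in a scalar output or in a d'-dimensional structured output.

```python
# Scalar vs. structured output heads on the same body (a sketch;
# the sizes d and d' are assumptions).
import torch.nn as nn

d, d_prime = 100, 64                       # input size d, output size d'
body = nn.Sequential(nn.Linear(d, 50), nn.ReLU())

scalar_head = nn.Linear(50, 1)             # f: X -> y,  y in R
structured_head = nn.Linear(50, d_prime)   # f: X -> Y,  Y in R^{d'}, d' > 1
```

Note that the structured head alone says nothing about the dependencies between the components of Y; that is the difficulty the following slides address.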
A regularization scheme for structured output problems — Structured data

Data = representation (values) + structure (dependencies). Examples:
- Text: part-of-speech tagging, translation.
- Speech ⇄ text.
- Protein folding.
- Images.
A regularization scheme for structured output problems

Approaches that deal with structured output data:
- Kernel-based methods: Kernel Density Estimation (KDE).
- Discriminative methods: structured output SVM.
- Graphical models: HMM, CRF, MRF, ...

Drawbacks:
- Perform one single data transformation.
- Difficult to deal with high-dimensional data.

An ideal approach would handle:
- structured output problems,
- high-dimensional data,
- multiple data transformations (complex mapping functions).

Deep neural networks?
A regularization scheme for structured output problems — Traditional deep neural network

[Figure: a feed-forward network with an input layer (x_1 ... x_7), four hidden layers, and an output layer of three classes (y_1: car, y_2: bus, y_3: bike).]

- High-dimensional data: OK
- Multiple data transformations (complex mapping functions): OK
- Structured output problems: NO
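For concreteness, a sketch of the pictured classifier (only the 7 inputs, 4 hidden layers, and 3 class outputs are read off the diagram; the hidden widths are assumptions):

```python
# The pictured feed-forward classifier: 7 inputs -> 4 hidden layers ->
# 3 class scores (car, bus, bike). Hidden widths are assumptions.
import torch.nn as nn

dnn = nn.Sequential(
    nn.Linear(7, 16), nn.ReLU(),    # hidden layer 1
    nn.Linear(16, 16), nn.ReLU(),   # hidden layer 2
    nn.Linear(16, 16), nn.ReLU(),   # hidden layer 3
    nn.Linear(16, 16), nn.ReLU(),   # hidden layer 4
    nn.Linear(16, 3),               # output layer: scores for car/bus/bike
)
```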
A regularization scheme for structured output problems — High-dimensional output

[Figure: the same network with a high-dimensional output layer (y_1 ... y_7) forming one structured object.]
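The same body with a wide output head, as in the figure (again a sketch, widths assumed). Trained with a plain element-wise loss, nothing in this architecture captures the dependencies between the y_i, which is the gap the following slides work on:

```python
# Same architecture, but the output layer now emits a 7-dimensional
# structured object y_1..y_7 (a sketch; hidden widths are assumptions).
import torch.nn as nn

structured_dnn = nn.Sequential(
    nn.Linear(7, 16), nn.ReLU(),
    nn.Linear(16, 16), nn.ReLU(),
    nn.Linear(16, 7),               # y_1 ... y_7: one structured object
)
```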
A regularization scheme for structured output problems — Unsupervised learning: layer-wise pre-training, auto-encoders

[Figure: a deep network to be trained, with inputs x_1 ... x_6 and outputs ŷ_1, ŷ_2.]

Given a DNN to train:
1. Use unsupervised training to initialize the network.
2. Finetune the network using the supervised data.
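The building block named on the slide is the auto-encoder: a network trained to reconstruct its own input, so that its hidden code is a learned representation. A minimal sketch (the one-hidden-layer architecture and the sigmoid non-linearity are assumptions):

```python
# A one-hidden-layer auto-encoder: encode x to a code h, decode h back
# to a reconstruction x_hat (a sketch; architecture details assumed).
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, d_in, d_hid):
        super().__init__()
        self.encoder = nn.Linear(d_in, d_hid)
        self.decoder = nn.Linear(d_hid, d_in)

    def forward(self, x):
        h = torch.sigmoid(self.encoder(x))   # the learned code h
        return self.decoder(h), h            # reconstruction x_hat, code h
```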
A regularization scheme for structured output problems — Unsupervised learning: layer-wise pre-training, auto-encoders

Step 1: unsupervised layer-wise pre-training. Train layer by layer, sequentially, using only x (labeled or unlabeled):
1. Train an auto-encoder to reconstruct the input (x → x̂); keep its encoder as hidden layer 1, producing codes h_1.
2. Train a second auto-encoder to reconstruct those codes (h_1 → ĥ_1); keep its encoder as hidden layer 2, producing codes h_2.
3. Repeat for the following layers (h_2 → ĥ_2 gives hidden layer 3 with codes h_3, and so on).

[Figures: animation frames showing each auto-encoder in turn and the growing stack of hidden layers h_{1,·}, h_{2,·}, h_{3,·}.]

A sketch of this recipe is given below.
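Putting step 1 together: a minimal sketch of greedy layer-wise pre-training (the hyper-parameters, layer sizes, and plain squared-error reconstruction loss are assumptions; it reuses the AutoEncoder class sketched above). Step 2, the supervised finetuning, would then stack the returned encoders, add an output layer, and train the whole network on (x, y).

```python
# Greedy layer-wise pre-training (a sketch; all hyper-parameters are
# assumptions). Each layer is trained as an auto-encoder on the codes
# produced by the previously trained layer, using only x.
import torch
import torch.nn as nn

def layerwise_pretrain(x, layer_sizes, epochs=50, lr=0.01):
    encoders, data = [], x
    for d_in, d_hid in zip(layer_sizes[:-1], layer_sizes[1:]):
        ae = AutoEncoder(d_in, d_hid)        # class from the sketch above
        opt = torch.optim.SGD(ae.parameters(), lr=lr)
        for _ in range(epochs):              # minimize the reconstruction error
            opt.zero_grad()
            x_hat, _ = ae(data)
            nn.functional.mse_loss(x_hat, data).backward()
            opt.step()
        with torch.no_grad():                # codes become the next layer's input
            _, data = ae(data)
        encoders.append(ae.encoder)
    return encoders                          # initialization for the DNN

# e.g. with sizes read off the diagrams (6 inputs, hidden widths 5, 3, 3):
# encoders = layerwise_pretrain(torch.randn(64, 6), [6, 5, 3, 3])
```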