Deep learning in TMVA
Benchmarking TMVA DNN and Integration of a Deep Autoencoder
Marc Huwiler, CERN
August 28, 2017
Marc Huwiler (CERN) Deep learning in TMVA August 28, 2017 1 / 34
Outline

1. Introduction: About me; ROOT, TMVA and Machine learning
2. Benchmarking TMVA DNN vs PyKeras: Methodology; Benchmarking implementation; Results
3. Implementing Autoencoder classes for TMVA: Concepts; Autoencoder integration in TMVA; Results
4. Further possibilities
5. Acknowledgements
Introduction: About me

- Master's in High Energy Physics at EPFL, Lausanne
- Hiking, travelling, reading
Introduction: ROOT, TMVA and Machine learning

ROOT
- Data analysis framework for HEP, developed mainly at CERN
- Written in C++ (fully interpreted)

TMVA (Toolkit for Multivariate Analysis)
- Includes several machine learning algorithms, such as: Likelihood, kNN, Fisher, MLP, SVM, BDT, neural networks, etc.
Machine learning is a type of artificial intelligence (AI) that allows software applications to become more accurate in predicting outcomes without being explicitly programmed: algorithms receive input data and use statistical analysis to predict an output value within an acceptable range.[1]

[1] Source: http://whatis.techtarget.com/definition/machine-learning
[Figure: a single artificial neuron; its output is an activation applied to the weighted sum w · x + b]
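The single-neuron picture on this slide can be sketched in a few lines of Python. The sigmoid activation is an assumption for illustration; the slide itself only shows the affine part w · x + b.

```python
import math

def neuron(x, w, b):
    """One artificial neuron: an activation applied to the weighted sum w . x + b.
    The sigmoid used here is one common choice of activation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Two inputs with hand-picked weights; the sigmoid output always lies in (0, 1).
y = neuron([1.0, 2.0], [0.5, -0.25], 0.1)
```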
Autoencoders: a special sort of neural network

- The inputs {x_i} are encoded into a lower-dimensional set of variables that contains a compressed representation of the data.
- They can then be decoded again into outputs {x'_i} of the same dimension as the input, aimed to be as close as possible to it.
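A minimal sketch of this encode/decode idea, assuming a linear autoencoder with a one-dimensional code trained by plain gradient descent on the squared reconstruction error (pure Python, no ML library; the autoencoders in TMVA and Keras use nonlinear, multi-layer networks):

```python
import random

def train_linear_autoencoder(data, lr=0.1, epochs=2000):
    """Encode x -> h = We . x (a single code variable), decode h -> x' = Wd * h,
    minimising the squared reconstruction error |x' - x|^2 by gradient descent."""
    random.seed(0)
    dim = len(data[0])
    We = [random.uniform(-0.5, 0.5) for _ in range(dim)]   # encoder weights
    Wd = [random.uniform(-0.5, 0.5) for _ in range(dim)]   # decoder weights
    for _ in range(epochs):
        for x in data:
            h = sum(we * xi for we, xi in zip(We, x))       # encode
            xr = [wd * h for wd in Wd]                      # decode
            err = [xri - xi for xri, xi in zip(xr, x)]
            # gradients of 0.5 * |err|^2 w.r.t. We and Wd
            gWe = [sum(e * wd for e, wd in zip(err, Wd)) * xi for xi in x]
            gWd = [e * h for e in err]
            We = [we - lr * g for we, g in zip(We, gWe)]
            Wd = [wd - lr * g for wd, g in zip(Wd, gWd)]
    return We, Wd

def reconstruct(x, We, Wd):
    h = sum(we * xi for we, xi in zip(We, x))
    return [wd * h for wd in Wd]

# Toy data lying on a line: a one-dimensional code suffices to reconstruct it.
data = [[t, 0.5 * t] for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
We, Wd = train_linear_autoencoder(data)
x_rec = reconstruct([1.0, 0.5], We, Wd)   # close to the input [1.0, 0.5]
```

With a linear code this is essentially PCA; the deep, nonlinear version discussed later generalises it.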
Benchmarking TMVA DNN vs PyKeras
Methodology

How the benchmarking is performed:
- Run the same training on a similar neural network layout in both TMVA DNN and PyKeras
- Benchmarks: ROC curve integral, CPU time, real time
- Study the benchmarks as a function of any common parameter
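The ROC curve integral used as a benchmark can be computed directly from the two sets of classifier scores. A simple sketch (equivalent to the Mann-Whitney U statistic; not TMVA's own implementation):

```python
def roc_integral(signal_scores, background_scores):
    """Area under the ROC curve: the probability that a randomly chosen
    signal event scores higher than a randomly chosen background event
    (ties count one half). 1.0 = perfect separation, 0.5 = random guessing."""
    pairs = len(signal_scores) * len(background_scores)
    wins = 0.0
    for s in signal_scores:
        for b in background_scores:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5
    return wins / pairs

auc = roc_integral([0.9, 0.8, 0.6], [0.1, 0.3, 0.7])   # 8 of 9 pairs ordered correctly
```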
Input parameters in common

    TMVA DNN               PyKeras
    network layout         network layout
    WeightInitialization   initializer
    ErrorStrategy          loss
    LearningRate           lr
    Momentum               momentum
    ConvergenceSteps       TriesEarlyStopping
    DropConfig             Dropout

Sampling and preprocessing are performed by TMVA (Factory).
There are also parameters that have no counterpart in the other framework.
Benchmarking implementation

[Diagram: batch script generates an input file that feeds both the TMVA DNN macro and the PyKeras macro]

A batch script does the following:
- Generate a common input file with the given parameters
- Feed it into the macros for both TMVA DNN and PyKeras
- The macros write the benchmarks into a common file
- These steps are repeated, changing a given parameter
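The loop the batch script performs can be sketched as follows. The key/value input-file layout mirrors the parameter list on the next slide; the macro file names are hypothetical, so their invocations are left commented out.

```python
BASE_PARAMETERS = {
    "nNeurons": 100, "nLayers": 3, "Learningrate": 0.1,
    "Momentum": 0.0, "Batchsize": 128, "Convergencesteps": 100, "Dropout": 0.0,
}

def write_input_file(path, **overrides):
    """Write the common key/value input file that both macros read."""
    params = dict(BASE_PARAMETERS, **overrides)
    lines = ["{} {}".format(key, value) for key, value in params.items()]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return lines

def scan_parameter(name, values):
    """Repeat the run for each value of one parameter; each macro appends its
    benchmarks (ROC integral, CPU time, real time) to a common output file."""
    for value in values:
        write_input_file("input.txt", **{name: value})
        # subprocess.run(["root", "-l", "-b", "-q", "tmva_dnn_macro.C"])  # hypothetical macro name
        # subprocess.run(["python", "pykeras_macro.py"])                  # hypothetical macro name

scan_parameter("nNeurons", [50, 100, 150, 200, 250])
```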
Input parameters:

    nNeurons           100
    nLayers            3
    Activation         2
    Lastactivation     3
    Initializer        1
    Lossfunction
    Transformations    "N,D"
    Factorystring      !V:!Silent:Color:DrawProgressBar:Transformations=I:AnalysisType=Classification
    Learningrate       0.1
    Momentum           0.0
    Batchsize          128
    Convergencesteps   100
    Dropout            0.0
    Ntrainsignal       50000
    Ntrainbackground   50000
    Ntestsignal        100000
    Ntestbackground    100000
Results

Example of a ROC curve

[Figure: signal efficiency (sensitivity) vs. background rejection (specificity) for the DNN CPU method; nNeurons = 20, convergence steps = 100, batch size = 128, dropout = 0]
Varying the number of neurons

[Figure: ROC curve integral vs. number of neurons per layer (50 to 250) for TMVA DNN and PyKeras with TensorFlow/Theano backends and SGD/Adam optimizers; convergence steps = 100, batch size = 128, dropout = 0]
[Figure: CPU time [s] vs. number of neurons per layer for TMVA DNN and PyKeras (TensorFlow/Theano, SGD/Adam); convergence steps = 100, batch size = 128, dropout = 0]
[Figure: real time [s] vs. number of neurons per layer for TMVA DNN and PyKeras (TensorFlow/Theano, SGD/Adam); convergence steps = 100, batch size = 128, dropout = 0]
Statistical validity

[Figure: ROC curve integral vs. number of neurons per layer for five independent TMVA DNN runs and their mean; convergence steps = 100, batch size = 128, dropout = 0]
[Figure: CPU time [s] vs. number of neurons per layer for five independent TMVA DNN runs and their mean; convergence steps = 100, batch size = 128, dropout = 0]
[Figure: real time [s] vs. number of neurons per layer for five independent TMVA DNN runs and their mean; convergence steps = 100, batch size = 128, dropout = 0]
Implementing Autoencoder classes for TMVA
Concepts

- Unsupervised learning: the algorithm trains on data without actual information on what the output should be
- Supervised learning: the true value of the predicted output is known during the learning process
- TMVA does classification and regression ⇒ supervised
- An autoencoder (AE) is unsupervised
Autoencoder integration in TMVA

Envisaged approaches

Making unsupervised learning possible in TMVA:
- Change the data loading process (classes or targets no longer required)
- Design new evaluation methods:
  - Classification: ROC curve and integral
  - Regression: deviation of the output versus targets/inputs
  - Autoencoder: deviation between {x_i} and {x'_i}?
  - Other unsupervised methods: ???

Finding autoencoder use cases that better fit the TMVA framework:
- Dimensionality reduction
- Anomaly detection
- Weight initialisation for deep neural network (DNN) layers
- Classification (with a logistic regression layer)
Variable transformation

- Used to reduce the dimensionality of a dataset by figuring out complex patterns among the variables
- The encoded inputs are passed as outputs and can be fed into any other method or pretransformation in TMVA (≈ non-linear PCA)
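What "the encoded inputs are passed as outputs" means in practice: each event is mapped through the trained encoder to a lower-dimensional code, and downstream methods see the code variables as their inputs. A sketch with hypothetical 3 -> 2 encoder weights (activations omitted for brevity):

```python
def encode(event, encoder_weights):
    """Map one event through a linear encoder layer:
    one code variable per row of weights."""
    return [sum(w * x for w, x in zip(row, event)) for row in encoder_weights]

# Hypothetical trained weights compressing 3 input variables into 2 codes;
# the 2 code variables would replace the 3 inputs when feeding e.g. a BDT.
W_ENCODER = [[0.5, 0.5, 0.0],
             [0.0, 0.5, 0.5]]
code = encode([1.0, 2.0, 3.0], W_ENCODER)   # -> [1.5, 2.5]
```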
Weight initialisation of DNN layers

- Greedy layerwise approach: pretrain the layers of a DNN with the autoencoder approach
- Better initialisation than random, since it is done according to the data
- Better and faster convergence
Anomaly detection

- Train the DAE on normal data
- Compare the decoded output to the actual input
- A significant mismatch could suggest that the autoencoder was not trained on such an event ⇒ anomaly
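The three steps above can be sketched directly: score each event by its reconstruction error and flag it when the error exceeds a threshold chosen from known-normal events. The toy "autoencoder" below is a hypothetical stand-in that can only reproduce points on one line (its one-dimensional code), not a trained DAE.

```python
def reconstruction_error(x, x_rec):
    """Mean squared difference between an event and its decoded output."""
    return sum((a - b) ** 2 for a, b in zip(x, x_rec)) / len(x)

def is_anomaly(event, autoencoder, threshold):
    """Flag events the autoencoder cannot reconstruct well."""
    return reconstruction_error(event, autoencoder(event)) > threshold

def toy_autoencoder(x):
    # Stand-in for a trained DAE: projects onto the line x2 = 2 * x1,
    # so only events near that line reconstruct with small error.
    t = (x[0] + 2.0 * x[1]) / 5.0
    return [t, 2.0 * t]

print(is_anomaly([1.0, 2.0], toy_autoencoder, 0.05))    # False: on the line
print(is_anomaly([2.0, -1.0], toy_autoencoder, 0.05))   # True: far off the line
```

In practice the threshold would be set from the tail of the reconstruction-error distribution of the normal training sample.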
Results

Autoencoder training accuracy

[Figure: histograms of x_i - x'_i for Var 1 to Var 4; N epochs = 100, corruption level = 0.1, dropout = 0]
[Figure: histograms of x_i - x'_i for Var 1 to Var 4; N epochs = 15,000, corruption level = 0.2]
"Cheated" example: VariableDAETransform + BDT

[Figure: TMVA BDT classifier response (signal vs. background) with DAE pretransformation]
[Figure: TMVA BDT classifier response (signal vs. background) without pretransformation]

Dataset: tmva_class_example.root
More realistic example: VariableDAETransform + BDT

[Figure: TMVA BDT classifier response (signal vs. background) with DAE pretransformation]
[Figure: TMVA BDT classifier response (signal vs. background) without pretransformation]

Dataset: tmva_class_example.root
Further possibilities
- Pass parameters to VariableDAETransform (and other pretransformations)
- Implement the anomaly detection
- Implement the AE weight initialisation for DNN layers
- Extend the TMVA framework for unsupervised learning
- ...
Acknowledgements
- Lorenzo Moneta and Sergei Gleyzer, who gave me the opportunity to work at CERN and supported me during the project
- Akshay Vashistha, who did tremendous work on the AE in the new deep learning module
- Kim Albertsson, who was always available for questions or discussions