Deep learning in TMVA
Benchmarking TMVA DNN and Integration of a Deep Autoencoder
Marc Huwiler, CERN
August 28, 2017
Marc Huwiler (CERN) Deep learning in TMVA August 28, 2017 1 / 34
Outline

1. Introduction: About me; ROOT, TMVA and Machine learning
2. Benchmarking TMVA DNN vs PyKeras: Methodology; Benchmarking implementation; Results
3. Implementing Autoencoder classes for TMVA: Concepts; Autoencoder integration in TMVA; Results
4. Further possibilities
5. Acknowledgements
Introduction: About me

- Master's in High Energy Physics at EPFL, Lausanne
- Hiking, travelling, reading
Introduction: ROOT, TMVA and Machine learning

ROOT
- Data analysis framework for HEP, developed mainly at CERN
- Written in C++ (fully interpreted)

TMVA (Toolkit for Multivariate Analysis)
- Includes several machine learning algorithms, such as: Likelihood, kNN, Fisher, MLP, SVM, BDT, neural networks, etc.
Machine learning is a type of artificial intelligence (AI) that allows software applications to become more accurate in predicting outcomes without being explicitly programmed: algorithms receive input data and use statistical analysis to predict an output value within an acceptable range.[1]

[1] Source: http://whatis.techtarget.com/definition/machine-learning
[Figure: a single artificial neuron; its output is an activation applied to the weighted sum w · x + b]
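The single-neuron picture on this slide can be sketched in a few lines of Python. The sigmoid activation is an assumption for illustration; the slide itself only shows the affine part w · x + b.

```python
import math

def neuron(x, w, b):
    """One artificial neuron: an activation applied to the weighted sum w . x + b.
    The sigmoid used here is one common choice of activation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Two inputs with hand-picked weights; the sigmoid output always lies in (0, 1).
y = neuron([1.0, 2.0], [0.5, -0.25], 0.1)
```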
Autoencoders: a special sort of neural network

- The inputs {x_i} are encoded into a lower-dimensional set of variables that contains a compressed representation of the data.
- They can then be decoded again into outputs {x'_i} of the same dimension as the input, aimed to be as close as possible to it.
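A minimal sketch of this encode/decode idea, assuming a linear autoencoder with a one-dimensional code trained by plain gradient descent on the squared reconstruction error (pure Python, no ML library; the autoencoders in TMVA and Keras use nonlinear, multi-layer networks):

```python
import random

def train_linear_autoencoder(data, lr=0.1, epochs=2000):
    """Encode x -> h = We . x (a single code variable), decode h -> x' = Wd * h,
    minimising the squared reconstruction error |x' - x|^2 by gradient descent."""
    random.seed(0)
    dim = len(data[0])
    We = [random.uniform(-0.5, 0.5) for _ in range(dim)]   # encoder weights
    Wd = [random.uniform(-0.5, 0.5) for _ in range(dim)]   # decoder weights
    for _ in range(epochs):
        for x in data:
            h = sum(we * xi for we, xi in zip(We, x))       # encode
            xr = [wd * h for wd in Wd]                      # decode
            err = [xri - xi for xri, xi in zip(xr, x)]
            # gradients of 0.5 * |err|^2 w.r.t. We and Wd
            gWe = [sum(e * wd for e, wd in zip(err, Wd)) * xi for xi in x]
            gWd = [e * h for e in err]
            We = [we - lr * g for we, g in zip(We, gWe)]
            Wd = [wd - lr * g for wd, g in zip(Wd, gWd)]
    return We, Wd

def reconstruct(x, We, Wd):
    h = sum(we * xi for we, xi in zip(We, x))
    return [wd * h for wd in Wd]

# Toy data lying on a line: a one-dimensional code suffices to reconstruct it.
data = [[t, 0.5 * t] for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
We, Wd = train_linear_autoencoder(data)
x_rec = reconstruct([1.0, 0.5], We, Wd)   # close to the input [1.0, 0.5]
```

With a linear code this is essentially PCA; the deep, nonlinear version discussed later generalises it.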
Benchmarking TMVA DNN vs PyKeras
Methodology

How the benchmarking is performed:
- Run the same training on a similar neural network layout in both TMVA DNN and PyKeras
- Benchmarks: ROC curve integral, CPU time, real time
- Study the benchmarks as a function of any common parameter
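The ROC curve integral used as a benchmark can be computed directly from the two sets of classifier scores. A simple sketch (equivalent to the Mann-Whitney U statistic; not TMVA's own implementation):

```python
def roc_integral(signal_scores, background_scores):
    """Area under the ROC curve: the probability that a randomly chosen
    signal event scores higher than a randomly chosen background event
    (ties count one half). 1.0 = perfect separation, 0.5 = random guessing."""
    pairs = len(signal_scores) * len(background_scores)
    wins = 0.0
    for s in signal_scores:
        for b in background_scores:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5
    return wins / pairs

auc = roc_integral([0.9, 0.8, 0.6], [0.1, 0.3, 0.7])   # 8 of 9 pairs ordered correctly
```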
Input parameters in common

    TMVA DNN               PyKeras
    network layout         network layout
    WeightInitialization   initializer
    ErrorStrategy          loss
    LearningRate           lr
    Momentum               momentum
    ConvergenceSteps       TriesEarlyStopping
    DropConfig             Dropout

Sampling and preprocessing are performed by TMVA (Factory).
There are also parameters that have no counterpart in the other framework.
Benchmarking implementation

[Diagram: batch script generates an input file that feeds both the TMVA DNN macro and the PyKeras macro]

A batch script does the following:
- Generate a common input file with the given parameters
- Feed it into the macros for both TMVA DNN and PyKeras
- The macros write the benchmarks into a common file
- These steps are repeated, changing a given parameter
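The loop the batch script performs can be sketched as follows. The key/value input-file layout mirrors the parameter list on the next slide; the macro file names are hypothetical, so their invocations are left commented out.

```python
BASE_PARAMETERS = {
    "nNeurons": 100, "nLayers": 3, "Learningrate": 0.1,
    "Momentum": 0.0, "Batchsize": 128, "Convergencesteps": 100, "Dropout": 0.0,
}

def write_input_file(path, **overrides):
    """Write the common key/value input file that both macros read."""
    params = dict(BASE_PARAMETERS, **overrides)
    lines = ["{} {}".format(key, value) for key, value in params.items()]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return lines

def scan_parameter(name, values):
    """Repeat the run for each value of one parameter; each macro appends its
    benchmarks (ROC integral, CPU time, real time) to a common output file."""
    for value in values:
        write_input_file("input.txt", **{name: value})
        # subprocess.run(["root", "-l", "-b", "-q", "tmva_dnn_macro.C"])  # hypothetical macro name
        # subprocess.run(["python", "pykeras_macro.py"])                  # hypothetical macro name

scan_parameter("nNeurons", [50, 100, 150, 200, 250])
```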
Input parameters:

    nNeurons           100
    nLayers            3
    Activation         2
    Lastactivation     3
    Initializer        1
    Lossfunction
    Transformations    "N,D"
    Factorystring      !V:!Silent:Color:DrawProgressBar:Transformations=I:AnalysisType=Classification
    Learningrate       0.1
    Momentum           0.0
    Batchsize          128
    Convergencesteps   100
    Dropout            0.0
    Ntrainsignal       50000
    Ntrainbackground   50000
    Ntestsignal        100000
    Ntestbackground    100000
Results

Example of a ROC curve

[Figure: signal efficiency (sensitivity) vs. background rejection (specificity) for the DNN CPU method; nNeurons = 20, convergence steps = 100, batch size = 128, dropout = 0]
Varying the number of neurons

[Figure: ROC curve integral vs. number of neurons per layer (50 to 250) for TMVA DNN and PyKeras with TensorFlow/Theano backends and SGD/Adam optimizers; convergence steps = 100, batch size = 128, dropout = 0]
[Figure: CPU time [s] vs. number of neurons per layer for TMVA DNN and PyKeras (TensorFlow/Theano, SGD/Adam); convergence steps = 100, batch size = 128, dropout = 0]
[Figure: real time [s] vs. number of neurons per layer for TMVA DNN and PyKeras (TensorFlow/Theano, SGD/Adam); convergence steps = 100, batch size = 128, dropout = 0]
Statistical validity

[Figure: ROC curve integral vs. number of neurons per layer for five independent TMVA DNN runs and their mean; convergence steps = 100, batch size = 128, dropout = 0]
[Figure: CPU time [s] vs. number of neurons per layer for five independent TMVA DNN runs and their mean; convergence steps = 100, batch size = 128, dropout = 0]
[Figure: real time [s] vs. number of neurons per layer for five independent TMVA DNN runs and their mean; convergence steps = 100, batch size = 128, dropout = 0]
Implementing Autoencoder classes for TMVA
Concepts

- Unsupervised learning: the algorithm trains on data without actual information on what the output should be
- Supervised learning: the true value of the predicted output is known during the learning process
- TMVA does classification and regression ⇒ supervised
- An autoencoder (AE) is unsupervised
Autoencoder integration in TMVA

Envisaged approaches

Making unsupervised learning possible in TMVA:
- Change the data loading process (classes or targets no longer required)
- Design new evaluation methods:
  - Classification: ROC curve and integral
  - Regression: deviation of the output versus targets/inputs
  - Autoencoder: deviation between {x_i} and {x'_i}?
  - Other unsupervised methods: ???

Finding autoencoder use cases that better fit the TMVA framework:
- Dimensionality reduction
- Anomaly detection
- Weight initialisation for deep neural network (DNN) layers
- Classification (with a logistic regression layer)
Variable transformation

- Used to reduce the dimensionality of a dataset by figuring out complex patterns among the variables
- The encoded inputs are passed as outputs and can be fed into any other method or pretransformation in TMVA (≈ non-linear PCA)
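What "the encoded inputs are passed as outputs" means in practice: each event is mapped through the trained encoder to a lower-dimensional code, and downstream methods see the code variables as their inputs. A sketch with hypothetical 3 -> 2 encoder weights (activations omitted for brevity):

```python
def encode(event, encoder_weights):
    """Map one event through a linear encoder layer:
    one code variable per row of weights."""
    return [sum(w * x for w, x in zip(row, event)) for row in encoder_weights]

# Hypothetical trained weights compressing 3 input variables into 2 codes;
# the 2 code variables would replace the 3 inputs when feeding e.g. a BDT.
W_ENCODER = [[0.5, 0.5, 0.0],
             [0.0, 0.5, 0.5]]
code = encode([1.0, 2.0, 3.0], W_ENCODER)   # -> [1.5, 2.5]
```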
Weight initialisation of DNN layers

- Greedy layerwise approach: pretrain the layers of a DNN with the autoencoder approach
- Better initialisation than random, since it is done according to the data
- Better and faster convergence
Anomaly detection

- Train the DAE on normal data
- Compare the decoded output to the actual input
- A significant mismatch could suggest that the autoencoder was not trained on such an event ⇒ anomaly
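The three steps above can be sketched directly: score each event by its reconstruction error and flag it when the error exceeds a threshold chosen from known-normal events. The toy "autoencoder" below is a hypothetical stand-in that can only reproduce points on one line (its one-dimensional code), not a trained DAE.

```python
def reconstruction_error(x, x_rec):
    """Mean squared difference between an event and its decoded output."""
    return sum((a - b) ** 2 for a, b in zip(x, x_rec)) / len(x)

def is_anomaly(event, autoencoder, threshold):
    """Flag events the autoencoder cannot reconstruct well."""
    return reconstruction_error(event, autoencoder(event)) > threshold

def toy_autoencoder(x):
    # Stand-in for a trained DAE: projects onto the line x2 = 2 * x1,
    # so only events near that line reconstruct with small error.
    t = (x[0] + 2.0 * x[1]) / 5.0
    return [t, 2.0 * t]

print(is_anomaly([1.0, 2.0], toy_autoencoder, 0.05))    # False: on the line
print(is_anomaly([2.0, -1.0], toy_autoencoder, 0.05))   # True: far off the line
```

In practice the threshold would be set from the tail of the reconstruction-error distribution of the normal training sample.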
Results

Autoencoder training accuracy

[Figure: histograms of x_i - x'_i for Var 1 to Var 4; N epochs = 100, corruption level = 0.1, dropout = 0]
[Figure: histograms of x_i - x'_i for Var 1 to Var 4; N epochs = 15,000, corruption level = 0.2]
"Cheated" example: VariableDAETransform + BDT

[Figure: TMVA BDT classifier response (signal vs. background) with DAE pretransformation]
[Figure: TMVA BDT classifier response (signal vs. background) without pretransformation]

Dataset: tmva_class_example.root
More realistic example: VariableDAETransform + BDT

[Figure: TMVA BDT classifier response (signal vs. background) with DAE pretransformation]
[Figure: TMVA BDT classifier response (signal vs. background) without pretransformation]

Dataset: tmva_class_example.root
Further possibilities
- Pass parameters to VariableDAETransform (and other pretransformations)
- Implement the anomaly detection
- Implement the AE weight initialisation for DNN layers
- Extend the TMVA framework for unsupervised learning
- ...
Acknowledgements
- Lorenzo Moneta and Sergei Gleyzer, who gave me the opportunity to work at CERN and supported me during the project
- Akshay Vashistha, who did tremendous work on the AE in the new deep learning module
- Kim Albertsson, who was always available for questions or discussions