SLIDE 1

Deep learning in TMVA

Benchmarking TMVA DNN, Integration of a Deep Autoencoder

Marc Huwiler

CERN

August 28, 2017

Marc Huwiler (CERN) Deep learning in TMVA August 28, 2017 1 / 34

SLIDE 2

Outline

1. Introduction: About me; ROOT, TMVA and Machine learning
2. Benchmarking TMVA DNN vs PyKeras: Methodology; Benchmarking implementation; Results
3. Implementing Autoencoder classes for TMVA: Concepts; Autoencoder integration in TMVA; Results
4. Further possibilities
5. Acknowledgements

SLIDE 3

Introduction About me

About me

Master in High Energy Physics at EPFL in Lausanne
Hiking, travelling, reading

SLIDE 5

Introduction ROOT, TMVA and Machine learning

ROOT and TMVA

ROOT: data analysis framework for HEP, developed mainly at CERN; written in C++ (fully interpreted).

TMVA: Toolkit for Multivariate Analysis; includes several machine learning algorithms such as: Likelihood, kNN, Fisher, MLP, SVM, neural networks, BDT, etc.

SLIDE 6

Introduction ROOT, TMVA and Machine learning

Machine learning

"Machine learning is a type of artificial intelligence (AI) that allows software applications to become more accurate in predicting outcomes without being explicitly programmed (...) by receiving input data and using statistical analysis to predict an output value within an acceptable range."¹

¹ Source: http://whatis.techtarget.com/definition/machine-learning

SLIDE 7

Introduction ROOT, TMVA and Machine learning

Neural networks

output = α(ω · x + b), with α the activation function
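The layer equation above can be sketched in a few lines of Python with numpy. The sigmoid activation and the specific weights below are illustrative choices, not anything prescribed by TMVA:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_forward(x, w, b, activation=sigmoid):
    """Compute output = alpha(w . x + b) for one dense layer."""
    return activation(w @ x + b)

# A 3-input, 2-neuron layer applied to one input vector.
w = np.array([[0.5, -0.2, 0.1],
              [0.3,  0.8, -0.5]])
b = np.array([0.0, 0.1])
x = np.array([1.0, 2.0, 3.0])
y = layer_forward(x, w, b)
print(y.shape)  # (2,)
```

Stacking several such layers, each feeding its output into the next, gives the deep networks benchmarked in this talk.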

SLIDE 10

Introduction ROOT, TMVA and Machine learning

Autoencoder

A special sort of neural network: the inputs {x_i} are encoded into a lower-dimensional set of variables that contains a compressed representation of the data. They can then be decoded again into an output {x'_i} of the same dimension as the input, aimed to be as close as possible to it.
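As an illustration of the encode/decode idea (not TMVA's implementation), here is a minimal linear autoencoder in numpy: 4-dimensional inputs that actually lie on a 2-dimensional plane are compressed through a 2-neuron bottleneck and reconstructed almost exactly. All names and hyperparameters are ad hoc:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples in 4 dimensions that really live on a 2-D plane,
# so a 2-neuron bottleneck can represent them almost exactly.
z_true = rng.normal(size=(200, 2))
mix = rng.normal(size=(2, 4))
x = z_true @ mix

# Encoder w_e: 4 -> 2, decoder w_d: 2 -> 4 (linear, no biases for brevity).
w_e = rng.normal(scale=0.1, size=(4, 2))
w_d = rng.normal(scale=0.1, size=(2, 4))

lr = 0.01
for _ in range(5000):
    z = x @ w_e          # encode: compressed representation {z_i}
    x_rec = z @ w_d      # decode: reconstruction {x'_i}
    err = x_rec - x
    # Gradients of the mean squared reconstruction error.
    grad_d = z.T @ err / len(x)
    grad_e = x.T @ (err @ w_d.T) / len(x)
    w_d -= lr * grad_d
    w_e -= lr * grad_e

mse = np.mean((x - (x @ w_e) @ w_d) ** 2)
print(f"reconstruction MSE: {mse:.2e}")
```

Real autoencoders add non-linear activations and deeper stacks, but the objective is the same: make {x'_i} as close as possible to {x_i} through a narrow bottleneck.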

SLIDE 11

Benchmarking TMVA DNN vs PyKeras

SLIDE 12

Benchmarking TMVA DNN vs PyKeras Methodology

Procedure

How the benchmarking is performed

- Run the same training on a similar neural network layout in both TMVA DNN and PyKeras
- Benchmarks: ROC curve integral, CPU time, real time
- Study the benchmarks as a function of any common parameter
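The first benchmark, the ROC curve integral, can be computed from classifier scores by scanning the cut threshold. A self-contained sketch of the calculation (not the TMVA implementation):

```python
import numpy as np

def roc_integral(signal_scores, background_scores):
    """Integrate background rejection (specificity) over signal
    efficiency (sensitivity), scanning the cut from high to low."""
    thresholds = np.sort(np.concatenate([signal_scores, background_scores]))[::-1]
    eff, rej = [0.0], [1.0]
    for t in thresholds:
        eff.append(float(np.mean(signal_scores >= t)))
        rej.append(float(np.mean(background_scores < t)))
    eff.append(1.0)
    rej.append(0.0)
    # Trapezoid rule over the (efficiency, rejection) curve.
    return sum(0.5 * (rej[i] + rej[i + 1]) * (eff[i + 1] - eff[i])
               for i in range(len(eff) - 1))

# Perfectly separated toy scores give an integral of 1.0;
# identical score distributions would give about 0.5.
print(roc_integral(np.array([1.0, 2.0, 3.0]), np.array([-1.0, -2.0, -3.0])))
```

A value near 1 means the classifier separates signal from background well; a value near 0.5 means it does no better than random.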

SLIDE 14

Benchmarking TMVA DNN vs PyKeras Methodology

Common basis between TMVA DNN and PyKeras

Input parameters in common

TMVA DNN               PyKeras
--------------------   -------------------
network layout         network layout
WeightInitialization   initializer
ErrorStrategy          loss
LearningRate           lr
Momentum               momentum
ConvergenceSteps       TriesEarlyStopping
DropConfig             Dropout

Sampling and preprocessing are performed by TMVA (Factory). There are also parameters that have no counterpart in the other framework.

SLIDE 15

Benchmarking TMVA DNN vs PyKeras Benchmarking implementation

Benchmarking workflow

[Workflow: batch script → input file → TMVA DNN macro and PyKeras macro]

The batch script does the following:
- Generates a common input file with the given parameters
- Feeds it into the macros for both TMVA DNN and PyKeras
- The macros write the benchmarks into a common file
- These steps are repeated while varying a given parameter
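The loop-and-record pattern of that workflow can be sketched as follows. The `run_tmva_dnn` and `run_pykeras` functions here are hypothetical stand-ins for launching the actual macros (which the talk drove from a batch script); only the scan-and-write structure is the point:

```python
import csv
import io

def run_tmva_dnn(n_neurons):
    """Stand-in for launching the TMVA DNN macro;
    returns (ROC integral, CPU time, real time) -- placeholder numbers."""
    return (0.9, 100.0 * n_neurons, 50.0 * n_neurons)

def run_pykeras(n_neurons):
    """Stand-in for launching the PyKeras macro."""
    return (0.9, 120.0 * n_neurons, 60.0 * n_neurons)

# Both methods write their benchmarks into one common table,
# scanning a single parameter (here the number of neurons per layer).
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["nNeurons", "method", "roc", "cpu_s", "real_s"])
for n_neurons in (50, 100, 150, 200, 250):
    for method, run in (("TMVA DNN", run_tmva_dnn), ("PyKeras", run_pykeras)):
        roc, cpu, real = run(n_neurons)
        writer.writerow([n_neurons, method, roc, cpu, real])
print(out.getvalue().splitlines()[0])
```

In the real setup the inner calls launch the two training macros on the same generated input file, so every row of the table compares like with like.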

SLIDE 16

Benchmarking TMVA DNN vs PyKeras Benchmarking implementation

Input parameters used

Input parameter      Value
nNeurons             100
nLayers              3
Activation           2
Lastactivation       3
Initializer          1
Lossfunction
Transformations      "N,D"
Factorystring        !V:!Silent:Color:DrawProgressBar:Transformations=I:AnalysisType=Classification
Learningrate         0.1
Momentum             0.0
Batchsize            128
Convergencesteps     100
Dropout              0.0
Ntrainsignal         50000
Ntrainbackground     50000
Ntestsignal          100000
Ntestbackground      100000

SLIDE 18

Benchmarking TMVA DNN vs PyKeras Results

ROC curve

Example of a ROC curve

[Figure: example ROC curve, signal efficiency (sensitivity) vs. background rejection (specificity), from the TMVA DNN (CPU); nNeurons = 20, convergence steps = 100, batch size = 128, dropout = 0]

SLIDE 19

Benchmarking TMVA DNN vs PyKeras Results

ROC curve integral

Varying the number of neurons

[Figure: ROC curve integral vs. number of neurons per layer, for TMVA DNN and PyKeras (TensorFlow and Theano backends, SGD and Adam optimisers); convergence steps = 100, batch size = 128, dropout = 0]

SLIDE 20

Benchmarking TMVA DNN vs PyKeras Results

CPU time

Varying the number of neurons

[Figure: CPU time [s] vs. number of neurons per layer, for TMVA DNN and PyKeras (TensorFlow and Theano backends, SGD and Adam optimisers); convergence steps = 100, batch size = 128, dropout = 0]

SLIDE 21

Benchmarking TMVA DNN vs PyKeras Results

Real time

Varying the number of neurons

[Figure: real time [s] vs. number of neurons per layer, for TMVA DNN and PyKeras (TensorFlow and Theano backends, SGD and Adam optimisers); convergence steps = 100, batch size = 128, dropout = 0]

SLIDE 22

Benchmarking TMVA DNN vs PyKeras Results

ROC curve integral

Statistical validity

[Figure: ROC curve integral vs. number of neurons per layer for five independent TMVA DNN runs and their mean; convergence steps = 100, batch size = 128, dropout = 0]

SLIDE 23

Benchmarking TMVA DNN vs PyKeras Results

CPU time

Statistical validity

[Figure: CPU time [s] vs. number of neurons per layer for five independent TMVA DNN runs and their mean; convergence steps = 100, batch size = 128, dropout = 0]

SLIDE 24

Benchmarking TMVA DNN vs PyKeras Results

Real time

Statistical validity

[Figure: real time [s] vs. number of neurons per layer for five independent TMVA DNN runs and their mean; convergence steps = 100, batch size = 128, dropout = 0]

SLIDE 25

Implementing Autoencoder classes for TMVA

SLIDE 27

Implementing Autoencoder classes for TMVA Concepts

Supervised vs unsupervised Learning

- Unsupervised learning: the algorithm trains on data without information on what the output should be
- Supervised learning: the true value of the predicted output is known during the learning process
- TMVA does classification and regression ⇒ supervised
- An autoencoder (AE) is unsupervised

SLIDE 29

Implementing Autoencoder classes for TMVA Autoencoder integration in TMVA

Autoencoder integration in TMVA

Envisaged approaches

Making unsupervised learning possible in TMVA:
- Change the data loading process (classes or targets no longer required)
- Design new evaluation methods:
  - Classification: ROC curve and integral
  - Regression: deviation of the output versus targets/inputs
  - Autoencoder: deviation between {x_i} and {x'_i}?
  - Other unsupervised methods: ???

Finding autoencoder use cases that better fit the TMVA framework:
- Dimensionality reduction
- Anomaly detection
- Weight initialisation for deep neural network (DNN) layers
- Classification (with a logistic regression layer)

SLIDE 30

Implementing Autoencoder classes for TMVA Autoencoder integration in TMVA

Possible use cases

Variable transformation

Used to reduce the dimensionality of a dataset by learning complex patterns among the variables. The encoded inputs are passed as outputs and can be fed into any other method or pretransformation in TMVA (≈ non-linear PCA).

SLIDE 31

Implementing Autoencoder classes for TMVA Autoencoder integration in TMVA

Possible use cases

Weight initialisation of DNN layers

- Greedy layerwise approach: pretrain the layers of a DNN with the autoencoder approach
- Better initialisation than random, since it is done according to the data
- Better and faster convergence
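A minimal sketch of the greedy layerwise idea, in numpy rather than TMVA code: each hidden layer is pretrained as a small autoencoder on the activations of the layer below, and the learned encoder weights then initialise that layer of the DNN. Linear layers and all hyperparameters here are simplifications for illustration:

```python
import numpy as np

def pretrain_layer(h, n_out, lr=0.01, steps=500, seed=0):
    """Train a linear autoencoder h -> n_out -> h on activations h
    and return the learned encoder weights."""
    rng = np.random.default_rng(seed)
    w_e = rng.normal(scale=0.1, size=(h.shape[1], n_out))
    w_d = rng.normal(scale=0.1, size=(n_out, h.shape[1]))
    for _ in range(steps):
        z = h @ w_e
        err = z @ w_d - h
        w_d -= lr * z.T @ err / len(h)
        w_e -= lr * h.T @ (err @ w_d.T) / len(h)
    return w_e

rng = np.random.default_rng(2)
x = rng.normal(size=(500, 8))
layer_sizes = [6, 4]          # hidden layer widths of the DNN to initialise
weights, h = [], x
for n in layer_sizes:
    w = pretrain_layer(h, n)  # pretrain this layer as an autoencoder
    weights.append(w)
    h = h @ w                 # its activations feed the next layer's pretraining
print([w.shape for w in weights])
```

The stacked encoder weights then replace random initialisation before the usual supervised fine-tuning of the full network.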

SLIDE 32

Implementing Autoencoder classes for TMVA Autoencoder integration in TMVA

Possible use cases

Anomaly detection

- Train the DAE on normal data
- Compare the decoded output to the actual input
- A significant mismatch could suggest that the autoencoder was not trained on such an event ⇒ anomaly
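The reconstruction-error idea can be demonstrated with a toy numpy sketch; the data, the linear one-neuron autoencoder, and the comparison are all invented for illustration, not taken from TMVA:

```python
import numpy as np

rng = np.random.default_rng(3)

# "Normal" data lives near a 1-D line in 3-D; the anomaly does not.
t = rng.normal(size=(300, 1))
normal = t @ np.array([[1.0, 2.0, -1.0]]) + 0.05 * rng.normal(size=(300, 3))
anomaly = np.array([[5.0, -5.0, 5.0]])

# Linear autoencoder with a 1-neuron bottleneck, trained on normal data only.
w_e = rng.normal(scale=0.1, size=(3, 1))
w_d = rng.normal(scale=0.1, size=(1, 3))
for _ in range(2000):
    z = normal @ w_e
    err = z @ w_d - normal
    w_d -= 0.01 * z.T @ err / len(normal)
    w_e -= 0.01 * normal.T @ (err @ w_d.T) / len(normal)

def reconstruction_error(x):
    """Per-event mean squared deviation between input and reconstruction."""
    return np.mean((x @ w_e @ w_d - x) ** 2, axis=1)

normal_err = reconstruction_error(normal)
# The anomaly reconstructs far worse than anything seen in training.
print(reconstruction_error(anomaly)[0] > normal_err.max())
```

In practice one would set a threshold on the reconstruction error (e.g. from its distribution on normal data) and flag events above it as anomalies.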

SLIDE 33

Implementing Autoencoder classes for TMVA Results

Results

Autoencoder training accuracy

[Figure: histograms of x_i − x'_i for Var 1-4 (N epochs = 100, corruption level = 0.1, dropout = 0)]

SLIDE 34

Implementing Autoencoder classes for TMVA Results

Results

Autoencoder training accuracy

[Figure: histograms of x_i − x'_i for Var 1-4 (N epochs = 15'000, corruption level = 0.2)]

SLIDE 35

Implementing Autoencoder classes for TMVA Results

Results

Cheated

VariableDAETransform + BDT

[Figure: BDT classifier response (signal vs. background) with DAE pretransformation]
[Figure: BDT classifier response (signal vs. background) without pretransformation]

Dataset: tmva_class_example.root

SLIDE 36

Implementing Autoencoder classes for TMVA Results

Results

More realistic

VariableDAETransform + BDT

[Figure: BDT classifier response (signal vs. background) with DAE pretransformation]
[Figure: BDT classifier response (signal vs. background) without pretransformation]

Dataset: tmva_class_example.root

SLIDE 37

Further possibilities

SLIDE 38

Further possibilities

Future Possibilities

- Pass parameters to VariableDAETransform (and other pretransformations)
- Implement the anomaly detection
- Implement the AE weight initialisation for DNN layers
- Extend the TMVA framework for unsupervised learning
- ...

SLIDE 39

Acknowledgements

SLIDE 40

Acknowledgements

- Lorenzo Moneta and Sergei Gleyzer, who gave me the opportunity to work at CERN and supported me during the project
- Akshay Vashistha, who did tremendous work on the AE in the new deep learning module
- Kim Albertsson, who was always available for questions or discussions

SLIDE 41

Acknowledgements

Thanks

Thanks for your attention!
