Machine Learning Techniques for HEP Data Analysis with T MVA Andreas - PowerPoint PPT Presentation

Machine Learning Techniques for HEP Data Analysis with T MVA Andreas Hoecker ( * ) (CERN) Seminar, LAL Orsay, June 21, 2007 ( * ) On behalf of the author team: A. Hoecker, P. Speckmayer, J. Stelzer, F. Tegenfeldt, H. Voss, K. Voss And the contributors: A. Christov, S. Henrot-Versillé, M. Jachowski, A. Krasznahorkay Jr., Y. Mahalalel, R. Ospanov, X. Prudent, M. Wolter, A. Zemla See acknowledgments on page 43 On the web: http://tmva.sf.net/ (home), https://twiki.cern.ch/twiki/bin/view/TMVA/WebHome (tutorial) LAL Seminar, June 21, 2007 A. Hoecker: Machine Learning with T MVA 1 / 41

a d v e r t i s e m e n t We (finally) have a Users Guide ! Available on http://tmva.sf.net T MVA Users Guide 97pp, incl. code examples arXiv physics/0703039 LAL Seminar, June 21, 2007 A. Hoecker: Machine Learning with T MVA 2 / 41

Event Classification Suppose data sample with two types of events: H 0 , H 1 We have found discriminating input variables x 1 , x 2 , … What decision boundary should we use to select events of type H 1 ? Rectangular cuts? A linear boundary? A nonlinear one? x 2 x 2 x 2 H 1 H 1 H 1 H 0 H 0 H 0 x 1 x 1 x 1 How can we decide this in an optimal way ?  Let the machine learn it ! LAL Seminar, June 21, 2007 A. Hoecker: Machine Learning with T MVA 3 / 41

Multivariate Event Classification All multivariate classifiers have in common to condense (correlated) multi-variable input information in a single scalar output variable It is a R n → R regression problem; classification is in fact a discretised regression y ( H 0 ) → 0, y ( H 1 ) → 1 … LAL Seminar, June 21, 2007 A. Hoecker: Machine Learning with T MVA 4 / 41

Event Classification in High-Energy Physics (HEP) Most HEP analyses require discrimination of signal from background: Event level (Higgs searches, …) Cone level (Tau-vs-jet reconstruction, …) Track level (particle identification, …) Lifetime and flavour tagging ( b -tagging, …) Parameter estimation ( CP violation in B system, …) etc. The multivariate input information used for this has various sources Kinematic variables (masses, momenta, decay angles, …) Event properties (jet/lepton multiplicity, sum of charges, …) Event shape (sphericity, Fox-Wolfram moments, …) Detector response (silicon hits, dE / dx , Cherenkov angle, shower profiles, muon hits, …) etc. Traditionally few powerful input variables were combined; new methods allow to use up to 100 and more variables w/o loss of classification power LAL Seminar, June 21, 2007 A. Hoecker: Machine Learning with T MVA 5 / 41

T M V A T M V A LAL Seminar, June 21, 2007 A. Hoecker: Machine Learning with T MVA 6 / 41

What is T MVA The various classifiers have very different properties Ideally, all should be tested for a given problem Systematically choose the best performing and simplest classifier Comparisons between classifiers improves the understanding and takes away mysticism T MVA ― Toolkit for multivariate data analysis Framework for parallel training , testing , evaluation and application of MV classifiers Training events can have weights A large number of linear, nonlinear, likelihood and rule-based classifiers implemented The classifiers rank the input variables The input variables can be decorrelated or projected upon their principal components Training results and full configuration are written to weight files Application to data classification using a Reader or standalone C++ classes LAL Seminar, June 21, 2007 A. Hoecker: Machine Learning with T MVA 7 / 41

T MVA Development and Distribution T MVA is a sourceforge (SF) package for world-wide access Home page ……………….http://tmva.sf.net/ SF project page …………. http://sf.net/projects/tmva View CVS …………………http://tmva.cvs.sf.net/tmva/TMVA/ Mailing list .………………..http://sf.net/mail/?group_id=152074 Tutorial TWiki ……………. https://twiki.cern.ch/twiki/bin/view/TMVA/WebHome Active project  fast response time on feature requests Currently 6 main developers, and 27 registered contributors at SF >1200 downloads since March 2006 (not accounting cvs checkouts and ROOT users) Written in C++, relying on core ROOT functionality Full examples distributed with T MVA, including analysis macros and GUI Scripts are provided for T MVA use in ROOT macro, as C++ executable or with python Integrated and distributed with ROOT since ROOT v5.11/03 LAL Seminar, June 21, 2007 A. Hoecker: Machine Learning with T MVA 8 / 41

T h e T M V A C l a s s i f i e r s Currently implemented classifiers : Rectangular cut optimisation Projective and multidimensional likelihood estimator k-Nearest Neighbor algorithm Fisher and H-Matrix discriminants Function discriminant Artificial neural networks (3 different multilayer perceptron s) Boosted/bagged decision trees with automatic node pruning RuleFit Support Vector Machine LAL Seminar, June 21, 2007 A. Hoecker: Machine Learning with T MVA 9 / 41

Data Preprocessing: Decorrelation Commonly realised for all methods in T MVA (centrally in DataSet class) Removal of linear correlations by rotating input variables Determine square-root C ′ of covariance matrix C , i . e ., C = C ′ C ′ Transform original ( x ) into decorrelated variable space ( x ′ ) by: x ′ = C ′ − 1 x Various ways to choose basis for decorrelation (also implemented PCA) Note that decorrelation is only complete, if Correlations are linear Input variables are Gaussian distributed Not very accurate conjecture in general SQRT derorr. PCA derorr. original LAL Seminar, June 21, 2007 A. Hoecker: Machine Learning with T MVA 10 / 41

Rectangular Cut Optimisation Simplest method: cut in rectangular variable volume ( ) I x i 0,1 x i x , x ( ) { } ( ) � = � � � � � cut event v eve nt v ,min v ,ma x v variabl es { } � Technical challenge: how to find optimal cuts ? MINUIT fails due to non-unique solution space T MVA uses: Monte Carlo sampling , Genetic Algorithm , Simulated Annealing Huge speed improvement of volume search by sorting events in binary tree Cuts usually benefit from prior decorrelation of cut variables LAL Seminar, June 21, 2007 A. Hoecker: Machine Learning with T MVA 11 / 41

Projective Likelihood Estimator (PDE Approach) Much liked in HEP: probability density estimators for each input variable combined in likelihood estimator Likelihood ratio PDFs discriminating variables for event i event PDE introduces fuzzy logic signal p x ( i ) ( ) � k k event k { variables } � y i ( ) = L event � � Species: signal, U p x ( i ) ( ) � � � � background k k event � � U { species } k variable s � { } � � � types Ignores correlations between input variables Optimal approach if correlations are zero (or linear  decorrelation) Otherwise: significant performance loss LAL Seminar, June 21, 2007 A. Hoecker: Machine Learning with T MVA 12 / 41

PDE Approach: Estimating PDF Kernels Technical challenge: how to estimate the PDF shapes 3 ways: parametric fitting (function) nonparametric fitting event counting Difficult to automate Easy to automate, can create Automatic, unbiased, for arbitrary PDFs artefacts/suppress information but suboptimal We have chosen to implement nonparametric fitting in T MVA original distribution Binned shape interpolation using spline is Gaussian functions (orders: 1, 2, 3, 5) Unbinned kernel density estimation (KDE) with Gaussian smearing T MVA performs automatic validation of goodness-of-fit LAL Seminar, June 21, 2007 A. Hoecker: Machine Learning with T MVA 13 / 41

Multidimensional PDE Approach Use a single PDF per event class (sig, bkg), which spans N var dimensions PDE Range-Search: count number of signal and background events in Carli-Koblitz, NIM A501, 576 (2003) “vicinity” of test event  preset or adaptive volume defines “vicinity” The signal estimator is then given by (simplified, y i , V � 0.86 classifier: k-Nearest Neighbor – implemented by R. Ospanov (Texas U.) : ( ) PDERS event x 2 full formula accounts for event weights and training population) H 1 Better than searching within a volume (fixed or floating), count adjacent reference events till statistically significant number reached PDE-RS ratio chosen #signal events in V for event i event volume test Method intrinsically adaptive event n i , V ( ) Very fast search with kd-tree event sorting S event y i , V H 0 ( ) = PDERS event n i , V n i , V ( ) ( ) + S event B event #background events in V x 1 Improve y PDERS estimate within V by using various N var -D kernel estimators Enhance speed of event counting in volume by binary tree search LAL Seminar, June 21, 2007 A. Hoecker: Machine Learning with T MVA 14 / 41

Machine Learning Techniques for HEP Data Analysis with T MVA Andreas - PowerPoint PPT Presentation

Machine Learning Techniques for HEP Data Analysis with T MVA Andreas Hoecker ( * ) (CERN) Seminar, LAL Orsay, June 21, 2007 ( * ) On behalf of the author team: A. Hoecker, P. Speckmayer, J. Stelzer, F. Tegenfeldt, H. Voss, K. Voss And the

Quantum Machine Learning Adam Brown, HEP-AI Quantum Computing Machine Learning Quantum

Learning to Inflate Tom Rudelius IAS Based on 1810.05159/hep-th Outline Machine Learning

The dual life of giant gravitons David Berenstein UCSB Based on: hep-th/0306090, hep-th/0403110

Introduction to Machine Learning Introduction to Machine Learning Introduction to Machine

MACHINE LEARNING kernels 1 MACHINE LEARNING 2012 MACHINE LEARNING Kernels: Intuition How

MICROSOFT AZURE MACHINE LEARNING Oscar Naim Microsoft Microsoft Azure Machine Learning What is

MACHINE LEARNING Overview 1 1 APPLIED MACHINE LEARNING 2011-2012 APPLIED MACHINE LEARNING

MACHINE LEARNING Kernel Canonical Correlation Analysis 1 ADVANCED MACHINE LEARNING ADVANCED

APPLIED MACHINE LEARNING Methods for Clustering K-means, Soft K-means DBSCAN 1 MACHINE

A Machine Learning Approach A Machine Learning Approach A Machine Learning Approach A Machine

Supertubes and the 4D black hole Per Kraus, UCLA with I. Bena: hep-th/0402144, hep-th/0408186,

HEP DC Trips 2017 US HEP User Community Outreach & Advocacy to the Federal Government

Yasunori Nomura UC Berkeley; LBNL hep-ph/0509039 [PLB] Based on work with hep-ph/0509221 [PLB]

Machine learning for finance Nathan George Data Science Professor DataCamp Machine Learning

Machine Learning 1 Machine(Learning(in(a(Nutshell ( Data$ Model$ Performance$ Measure$

Welcome to the Machine Learning Toolbox! Machine Learning Toolbox Supervised learning caret

Natural Language Processing Classification I Dan Klein UC Berkeley 1 2 Classification

Generative and Discriminative Learning Machine Learning 1 What we saw most of the semester

Machine Learning Anders Holst SICS Big Data Analytics Analysis Big Data Big Value Big Data

Statistical Tools in Collider Experiments Multivariate analysis in high energy physics Lecture 3

Multivariate Data Analysis with T MVA Andreas Hoecker ( * ) (CERN) Statistical Tools Workshop,

Charming new results from STAR! NSD Staff Meeting, January 22, 2019 Sooraj Radhakrishnan

CSE 573: Artificial Intelligence Autumn 2010 Lecture 16: Machine Learning Topics 12/7/2010 Luke

Does Training Affect Match Performance? A Study Using Data Mining And Tracking Devices

Machine Learning Techniques for HEP Data Analysis with T MVA Andreas - PowerPoint PPT Presentation

Machine Learning Techniques for HEP Data Analysis with T MVA Andreas Hoecker ( * ) (CERN) Seminar, LAL Orsay, June 21, 2007 ( * ) On behalf of the author team: A. Hoecker, P. Speckmayer, J. Stelzer, F. Tegenfeldt, H. Voss, K. Voss And the

Quantum Machine Learning Adam Brown, HEP-AI Quantum Computing Machine Learning Quantum

Learning to Inflate Tom Rudelius IAS Based on 1810.05159/hep-th Outline Machine Learning

The dual life of giant gravitons David Berenstein UCSB Based on: hep-th/0306090, hep-th/0403110

Introduction to Machine Learning Introduction to Machine Learning Introduction to Machine

MACHINE LEARNING kernels 1 MACHINE LEARNING 2012 MACHINE LEARNING Kernels: Intuition How

MICROSOFT AZURE MACHINE LEARNING Oscar Naim Microsoft Microsoft Azure Machine Learning What is

MACHINE LEARNING Overview 1 1 APPLIED MACHINE LEARNING 2011-2012 APPLIED MACHINE LEARNING

MACHINE LEARNING Kernel Canonical Correlation Analysis 1 ADVANCED MACHINE LEARNING ADVANCED

APPLIED MACHINE LEARNING Methods for Clustering K-means, Soft K-means DBSCAN 1 MACHINE

A Machine Learning Approach A Machine Learning Approach A Machine Learning Approach A Machine

Supertubes and the 4D black hole Per Kraus, UCLA with I. Bena: hep-th/0402144, hep-th/0408186,

HEP DC Trips 2017 US HEP User Community Outreach &amp; Advocacy to the Federal Government

Yasunori Nomura UC Berkeley; LBNL hep-ph/0509039 [PLB] Based on work with hep-ph/0509221 [PLB]

Machine learning for finance Nathan George Data Science Professor DataCamp Machine Learning

Machine Learning 1 Machine(Learning(in(a(Nutshell ( Data$ Model$ Performance$ Measure$

Welcome to the Machine Learning Toolbox! Machine Learning Toolbox Supervised learning caret

Natural Language Processing Classification I Dan Klein UC Berkeley 1 2 Classification

Generative and Discriminative Learning Machine Learning 1 What we saw most of the semester

Machine Learning Anders Holst SICS Big Data Analytics Analysis Big Data Big Value Big Data

Statistical Tools in Collider Experiments Multivariate analysis in high energy physics Lecture 3

Multivariate Data Analysis with T MVA Andreas Hoecker ( * ) (CERN) Statistical Tools Workshop,

Charming new results from STAR! NSD Staff Meeting, January 22, 2019 Sooraj Radhakrishnan

CSE 573: Artificial Intelligence Autumn 2010 Lecture 16: Machine Learning Topics 12/7/2010 Luke

Does Training Affect Match Performance? A Study Using Data Mining And Tracking Devices

HEP DC Trips 2017 US HEP User Community Outreach & Advocacy to the Federal Government