Experimental Particle Physics
Detector by function
- Position:
– Beam Tracker
– Vertex Telescope
– Multi-wire Proportional Chambers (MWPCs)
- Energy:
– Zero Degree Calorimeter (ZDC)
- Charge:
– Quartz Blade (signal ∑s ∝ z²)
- PID:
– Hadron Absorber and Iron wall
13 Feb. 2008 2 Pedro Parracho - MEFT@CERN 2008
From position to track
- That is the job for … reconstruction:
1. Choose start and finish points
2. Try to fit the track to the targets
3. Add middle points
4. Check that all the groups have points in the track
Reconstructed event
Experimental Particle Physics
- Choose a particle and a particular decay channel (PDG)
- That choice determines what is most important for you in terms of detector and tracks
- For this presentation you're going to see:
K⁰ₛ → π⁺ π⁻
Choice of good events
- You need to make sure that all the detectors your study depends on were working correctly at data-taking time.
First mass spectrum
Cuts
This is 90% of the Work…
What these cuts are:
- Track distance, IV, PCA, Δz
They make sense because:
- Daughter particles of a V0 decay originate at the same point in space
- The particles have a decay length of 2.7 cm (becomes 72 cm in the laboratory frame)
After the Cuts
Background Subtraction
Combinatorial Fit
This is the other 90% of the Work…
The idea:
- Take a “particle” that could be real, but that you are sure is not:
– Each track is from a different collision
– The ditrack’s characteristics agree with the real ones
- Take enough of them
- Subtract their mass spectrum from your histogram
Acceptances
- The result you “see” has been biased by the detector and by the analysis steps.
- Now you must “unbias” it so that you can publish a result comparable with other results.
- This is again… 90% of the work
- But after this you are done… You just have to write the thesis/article ☺
Pause for questions
Multivariate analysis
- Multivariate statistical analysis is a collection of procedures which involve observation and analysis of more than one statistical variable at a time.
- Some classification methods:
– Fisher Linear Discriminant
– Gaussian Discriminant
– Random Grid Search
– Naïve Bayes (Likelihood Discriminant)
– Kernel Density Estimation
– Support Vector Machines
– Genetic Algorithms
– Binary Decision Trees
– Neural Networks
Decision Trees
A decision tree is a sequence of cuts. Choose cuts that partition the data into bins of increasing purity.
Key idea: do so recursively.
[Figure: tree diagram with Node and Leaf labels (MiniBoone, Byron Roe)]
TMVA, what is it?
- Toolkit for Multivariate Analysis
– software framework implementing several MVA techniques
– common processing of input data (decorrelation, cuts, ...)
– training, testing and evaluation (plots, log-file)
– reusable output of obtained models (C++ codelets, text files)
Implemented methods
- Rectangular cut optimisation
- Likelihood estimator
- Multi-dimensional likelihood estimator and k-nearest neighbour (kNN)
- Fisher discriminant and H-Matrix
- Artificial Neural Network (3 different implementations)
- Boosted/bagged Decision Trees
- Rule ensemble
- Support Vector Machine (SVM)
- Function Discriminant Analysis (FDA)
Advantages of TMVA
- Distributed with ROOT
- several methods under one ‘roof’
– easy to systematically compare many classifiers and find the best one for the problem at hand
– common input/output interfaces
– common evaluation of all classifiers in an objective way
– plug in as many classifiers as possible
- a GUI provides a set of performance plots
- the final model(s) are saved as simple text files and are reusable through a reader class
- also, the models may be saved as C++ classes (package independent), which can be inserted into any application
- it’s easy to use and flexible
- easy to implement the chosen classifier in user applications
Logical Flow
Plots
Correlation Plots
Comparison of all the methods
- In this plot we can see how good each of the methods is for our problem.
- The best method seems to be the BDT (boosted decision trees), which basically expands the usual cut method to more dimensions.
Methods output
All the methods output a number (the output classifier) that represents how well the given event matches the background. Here we can see the distributions of this value for two chosen methods (the best: BDT, and the worst: Function Discriminant Analysis). These plots can help us to pinpoint the cut value to choose for our study.
Where to cut
- TMVA produces this kind of plot, which is very useful to help decide how pure the selected signal can be.
Eye Candy
Eye Candy II
End
Backup
PID in NA60
This is the “muon part of NA60”: after the hadron absorber, only muons survive, and are tracked in the MWPCs.
Decision Trees
Geometrically, a decision tree is an n-dimensional histogram whose bins are constructed recursively.
Each bin is associated with some value of the desired function f(x).
[Figure: MT Hits vs Energy (GeV); bins labelled f(x) = 0 or f(x) = 1, with counts such as B = 10, S = 9; B = 1, S = 39; B = 37, S = 4 (MiniBoone, Byron Roe)]
Decision Trees
For each variable find the best cut:
Decrease in impurity = Impurity(parent) − Impurity(leftChild) − Impurity(rightChild)
and partition using the best of the best.
[Figure: same MT Hits vs Energy (GeV) bin diagram (MiniBoone, Byron Roe)]
Decision Trees
A common impurity measure is the Gini index:
Impurity = N · p · (1 − p)
where p = S / (S + B) and N = S + B.
[Figure: same MT Hits vs Energy (GeV) bin diagram (MiniBoone, Byron Roe)]
How to use TMVA
Train the methods
1. Book a “factory”
   TMVA::Factory* factory = new TMVA::Factory("<JobName>", targetFile, "<options>");
2. Add trees to the factory
   factory->AddSignalTree(sigTree, sigWeight);
   factory->AddBackgroundTree(bkgTreeA, bkgWeightA);
3. Add variables
   factory->AddVariable("<VarName>", 'I');
   factory->AddVariable("log(<VarName>)", 'F');
4. Book the methods to use
   factory->BookMethod(TMVA::Types::<method enum>, "<MethodName>", "<options>");
5. Train, test and evaluate the methods
   factory->TrainAllMethods();
   factory->TestAllMethods();
   factory->EvaluateAllMethods();
Apply the methods
- 1. Book a “reader”
   TMVA::Reader* reader = new TMVA::Reader();
- 2. Add the variables
   reader->AddVariable("<YourVar1>", &localVar1);
   reader->AddVariable("log(<YourVar1>)", &localVar1);
- 3. Book classifiers
   reader->BookMVA("<YourClassifierName>", "<WeightFile.weights.txt>");
- 4. Get the classifier output
   reader->EvaluateMVA("<YourClassifierName>");
   reader->EvaluateMVA("Cuts", signalEfficiency);