OPTIMIZATION OF MULTIVARIATE DISCRIMINATORS IN THE WH→lνbb CHANNEL AT DØ
Stephanie Hamilton Michigan State University
SIST Final Presentation
5 August 2013
Introduction
The Standard Model (SM)
- Current theory of known fundamental particles and
- Extremely successful!
  - Predicted the existence of the
The SM Higgs Boson
- A Higgs mechanism is an essential part of the SM
  - Gives mass to most particles – without it, the SM would
  - Provides explanation for electroweak symmetry
- A victory for the Standard Model!
  - A Higgs boson was discovered by ATLAS and CMS at
  - Simultaneously saw evidence for a new particle in the
- WH→lνbb is one of six analyses combined for this plot
  - Want to improve sensitivity because the Higgs boson has not been
- Expected production cross-section over predicted SM cross-
  - A measure of our sensitivity
    - Greater than 1 => cannot give a definite answer
    - Less than 1 => can definitively say whether or not the particle is there
[Figure: Tevatron Run II SM Higgs combination, Lint ≤ 10 fb⁻¹. 95% C.L. limit/SM versus mH (GeV/c²), 100 to 200 GeV/c². Curves: observed; expected w/o Higgs; expected ± 1 s.d.; expected ± 2 s.d.; expected if mH = 125 GeV/c². Reference line at SM = 1.]
Credit: CDF and D0, http://arxiv.org/pdf/1303.6346v3.pdf
- Direct search at √s = 1.96 TeV
- Two primary means of production
  - Gluon fusion
  - Associated production
- Decay branching ratios depend on the mass
- Multiple subdetectors
  - Tracking system
    - Silicon Microstrip Tracker
    - Central Fiber Tracker
  - Calorimeter
  - Muon system
- Neutrinos identified as
- Tiny Higgs signal against huge backgrounds
- Reducing the huge background
  - b-tagging, multivariate techniques
[Figure panels: WH→lνbb, Multijet, ttbar, V+jets, Diboson. Credit: Dr. Mike Cooke]
- First, what is a jet?
  - Attempting to separate a pair of quarks - takes less
  - Charged particles leave tracks in the tracker and the
- Identifying bottom quark jets
  - Look for:
    - A secondary vertex displaced from
    - Displaced impact parameter
- Toolkit for Multivariate Analysis (TMVA)
  - A library of ROOT, the statistical analysis framework used
- Multivariate Analysis (MVA)
  - Combining several moderately discriminating variables into
    - Discriminating => background distribution of the variable tends toward left of histogram, while signal tends toward right
  - Secondary MVAs
    - Higgs vs. specific background (ttbar, V+jets, diboson, multijet)
  - Final MVA
    - Higgs vs. all background
- Decision Trees (DT)
  - Subsequent cuts are made on different input variables
  - Each leaf has a specific signal-to-background ratio
- Boosted Decision Trees (BDT)
  - A “forest” of many DTs
  - The signal-to-background
Credit: Dr. Mike Cooke
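
The two ideas above (cuts leading to leaves with a known signal purity, and a forest whose trees are combined) can be sketched in a few lines of Python. This is only an illustration of how a trained forest might be evaluated, not TMVA's implementation: the `Node` class, the hand-built trees, the cut values, and the weighted-average combination are all assumptions made for the example. A real boosting algorithm derives the trees and their weights from training data.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """One node of a toy decision tree; a leaf has `purity` set and no children."""
    var: int = 0                      # index of the input variable to cut on
    cut: float = 0.0                  # events with x[var] > cut go to `right`
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    purity: Optional[float] = None    # S / (S + B) in the leaf (hypothetical values)


def tree_response(node: Node, x: Sequence[float]) -> float:
    """Follow the cuts from the root until a leaf is reached; return its purity."""
    while node.purity is None:
        node = node.right if x[node.var] > node.cut else node.left
    return node.purity


def forest_response(trees, weights, x) -> float:
    """Weighted average of the individual tree responses."""
    total = sum(weights)
    return sum(w * tree_response(t, x) for t, w in zip(trees, weights)) / total


# Two hand-built trees with made-up cuts and leaf purities.
tree_a = Node(var=0, cut=50.0,
              left=Node(purity=0.1),
              right=Node(var=1, cut=0.5,
                         left=Node(purity=0.4),
                         right=Node(purity=0.9)))
tree_b = Node(var=1, cut=0.3, left=Node(purity=0.2), right=Node(purity=0.7))

event = [60.0, 0.8]   # hypothetical values of two input variables
score = forest_response([tree_a, tree_b], [1.0, 1.0], event)
```

Here the event passes both cuts of `tree_a` and the single cut of `tree_b`, so the forest score is the average of the two signal-like leaf purities.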
- Possible to vary
  - BoostType – defines how TMVA uses the signal-to-
  - NTrees – number of trees in the forest
  - Shrinkage – defines the learning rate of the boosting
  - NNodesMax – maximum number of nodes any tree is
  - MaxDepth – how many “levels” a tree is allowed to have
  - GradBaggingFraction – defines the fraction of events that
  - And many more…
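
TMVA methods are configured through a colon-separated string of `Name=value` options, which is passed when a method is booked. As a minimal sketch, the helper below assembles such a string from the options listed above. The function name `make_bdt_options` and every numeric value are assumptions chosen for illustration, not the settings used in this analysis.

```python
def make_bdt_options(**settings) -> str:
    """Join keyword settings into a colon-separated 'Name=value' option string."""
    return ":".join(f"{name}={value}" for name, value in settings.items())


# Hypothetical values; only the option names come from the list above.
options = make_bdt_options(
    BoostType="Grad",
    NTrees=500,
    Shrinkage=0.1,
    NNodesMax=5,
    MaxDepth=3,
    GradBaggingFraction=0.6,
)
```

Keeping the settings as keyword arguments makes it easy to change one option at a time while leaving the others fixed.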
- Overtraining
  - TMVA begins to cut on statistical fluctuations rather than on
  - Compare “train” and “test” subsamples to determine the
    - KS test – considered passed if both background and signal results were above 1%
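
The train/test comparison can be illustrated with a two-sample Kolmogorov-Smirnov (KS) test written in plain Python. This is a sketch under stated assumptions: it works on unbinned lists of discriminator outputs and uses the asymptotic Kolmogorov distribution for the probability, whereas ROOT-based tools typically compare binned histograms, so the numbers need not agree. The function names are hypothetical; only the 1% threshold comes from the criterion above.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Maximum distance between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    distance = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        distance = max(distance, abs(cdf_a - cdf_b))
    return distance


def ks_probability(a, b, terms=100):
    """Asymptotic two-sample KS probability that a and b share one distribution."""
    n_eff = len(a) * len(b) / (len(a) + len(b))
    lam = math.sqrt(n_eff) * ks_statistic(a, b)
    if lam < 0.2:
        # The alternating series converges poorly here; the probability is ~1.
        return 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * series))


def passes_overtraining_check(sig_train, sig_test, bkg_train, bkg_test,
                              threshold=0.01):
    """Pass only if both signal and background KS probabilities exceed 1%."""
    return (ks_probability(sig_train, sig_test) > threshold
            and ks_probability(bkg_train, bkg_test) > threshold)
```

A low probability for either signal or background suggests that the discriminator behaves differently on the training and testing subsamples, which is the symptom of overtraining described above.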
- Background Rejection vs. Signal Acceptance Curve
  - How much signal is being kept after a certain amount
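
A curve of this kind can be built by scanning a cut on the discriminator output and recording, at each cut, the fraction of signal kept and the fraction of background removed. The sketch below assumes that larger scores are more signal-like; the function name and the example scores are hypothetical.

```python
def rejection_vs_acceptance(signal_scores, background_scores, cuts):
    """For each cut, return (signal acceptance, background rejection).

    Events with score > cut are kept, so acceptance is the kept signal
    fraction and rejection is the removed background fraction.
    """
    points = []
    for cut in cuts:
        acceptance = sum(s > cut for s in signal_scores) / len(signal_scores)
        rejection = sum(b <= cut for b in background_scores) / len(background_scores)
        points.append((acceptance, rejection))
    return points


# Made-up discriminator outputs for illustration only.
signal = [0.2, 0.6, 0.8, 0.9]
background = [0.1, 0.3, 0.4, 0.7]
curve = rejection_vs_acceptance(signal, background, cuts=[0.0, 0.5, 1.0])
```

A better discriminator pushes the curve toward the corner where acceptance and rejection are both close to 1.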
- When run, the optimization process would vary
  - NTrees
  - Shrinkage
  - NNodesMax
  - GradBaggingFraction
- Signal Acceptance vs. Background Rejection curve
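
One simple way to vary several options at once is an exhaustive scan over a grid of candidate values. The slides do not say how the scan was organized, so the sketch below is an assumption: `scan_options`, the grid values, and `toy_figure_of_merit` are all hypothetical. In a real scan the figure of merit would come from training and evaluating a discriminator with each set of options.

```python
from itertools import product


def scan_options(grid, evaluate):
    """Try every combination in `grid`; return (best_score, best_settings)."""
    names = list(grid)
    best = None
    for values in product(*(grid[name] for name in names)):
        settings = dict(zip(names, values))
        score = evaluate(settings)
        if best is None or score > best[0]:
            best = (score, settings)
    return best


def toy_figure_of_merit(settings):
    """Stand-in for a real training; peaks at NTrees=400, Shrinkage=0.1."""
    return (-abs(settings["NTrees"] - 400) / 400
            - abs(settings["Shrinkage"] - 0.1))


# Hypothetical candidate values for the four options named above.
grid = {
    "NTrees": [200, 400, 800],
    "Shrinkage": [0.05, 0.1, 0.3],
    "NNodesMax": [5, 10],
    "GradBaggingFraction": [0.5, 0.6],
}
best_score, best_settings = scan_options(grid, toy_figure_of_merit)
```

The number of trainings grows as the product of the list lengths (36 in this grid), which is why only a few options are usually scanned together.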
**WORK IN PROGRESS**
- Significant improvements in our expected sensitivity

             Before Summer 2013   After Summer 2013   Percent Increase
MVA el       6.28                 5.70                9.24%
MVA mu       6.52                 5.88                9.51%
MVA el+mu    4.42                 4.02                9.05%
- New optimization tools for Multivariate Analysis
  - Varies the values of different options used for training
- These tools played an important part in the
- Dr. Michael Cooke
- Dr. Ryuji Yamada
- My fellow summer students and the rest of the WH
- The SIST Committee
  - Linda Diepholz
  - Dianne Engram
  - Dr. Davenport
- The DØ Collaboration
- Fermi National Accelerator Laboratory