JET SUBSTRUCTURE AT THE LHC & BEYOND
Simone Marzani Università di Genova & INFN Sezione di Genova
Interpreting the LHC Run 2 data and Beyond ICTP Trieste 27th - 31st May 2019
OUTLINE
Jet substructure: where we are
Machine-learning for jet physics
Precision calculations in jet physics
Conclusions and Open Questions
the two major goals of the LHC:
search for new particles
characterise the particles we know
jets can be formed by QCD particles but also by the decay of massive particles (if they are sufficiently boosted)
how can we distinguish signal jets from background ones?
courtesy of G. Soyez
the final energy deposition pattern is influenced by the originating splitting
hard vs soft translates into 2-prong vs 1-prong structure
the picture is muddied by many effects (hadronisation, Underlying Event, pileup)
two-step procedure:
grooming: clean the jets up by removing soft radiation
tagging: identify the features of hard decays and cut on them
different energy deposition pattern
courtesy of G. Soyez
devise clever ways to project the multi-dimensional parameter space of final-state momenta into suitable lower-dimensional (typically 1-D) distributions
[Figure: normalised distributions (Pythia 8, √s = 13 TeV, 250 < pT/GeV < 300, 65 < mass/GeV < 95) of jet mass [GeV], τ21 and ΔR between subjets, comparing W → qq' with QCD dijets]
courtesy of G. Soyez
for an introduction see SM, Soyez, Spannowsky
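As a rough illustration of such 1-D projections, the sketch below computes a jet mass and an N-subjettiness ratio τ21 from a handful of constituents. It assumes massless constituents given as (pt, y, phi) tuples, the standard β = 1 N-subjettiness definition, and axes supplied by the caller; the function names and the normalisation radius `r0` are illustrative choices, not taken from the slides.

```python
import math

def delta_r(y1, phi1, y2, phi2):
    """Distance in the rapidity-azimuth plane, with the phi difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)

def jet_mass(constituents):
    """Invariant mass of the summed four-momentum of massless (pt, y, phi) constituents."""
    px = sum(pt * math.cos(phi) for pt, y, phi in constituents)
    py = sum(pt * math.sin(phi) for pt, y, phi in constituents)
    pz = sum(pt * math.sinh(y) for pt, y, phi in constituents)
    energy = sum(pt * math.cosh(y) for pt, y, phi in constituents)
    m2 = energy * energy - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors

def n_subjettiness(constituents, axes, r0=1.0):
    """tau_N = sum_i pt_i * min_k DeltaR(i, axis_k) / (r0 * sum_i pt_i), axes given as (y, phi)."""
    d0 = r0 * sum(pt for pt, _, _ in constituents)
    weighted = sum(
        pt * min(delta_r(y, phi, ya, phia) for ya, phia in axes)
        for pt, y, phi in constituents
    )
    return weighted / d0

# Toy two-prong jet: two equally hard constituents separated by 0.4 in rapidity.
two_prong = [(100.0, 0.0, 0.0), (100.0, 0.4, 0.0)]
mass = jet_mass(two_prong)
tau1 = n_subjettiness(two_prong, axes=[(0.2, 0.0)])              # one axis in the middle
tau2 = n_subjettiness(two_prong, axes=[(0.0, 0.0), (0.4, 0.0)])  # one axis per prong
tau21 = tau2 / tau1  # small for 2-prong jets
```

In a real analysis the axes would come from a clustering step rather than being set by hand; here they are fixed only to keep the example short.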
[Figure: performance vs. resilience (truth v. parton) of two-prong taggers at εS = 0.4, 65 < m < 105 GeV, Pythia8 (M13), anti-kt(1.0), pt > 500 GeV. Taggers shown: D2(2)[l⊗l/l], D2(2)[t⊗l/l], D2(2)[t⊗l/t], N2(2)[t⊗l/l], N2(2)[t⊗l/t], τ21(2)[t⊗l/l], τ21(2)[t⊗l/t], D2(1)[t⊗l/t], M2(2)[t⊗l/t], M2(2)[trim], M2(2)[l⊗l/l], D2(2,dichroic); markers: all studied, ATLAS-like, CMS-like]
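$$\zeta = \left( \frac{\Delta\epsilon_S^2}{\langle\epsilon\rangle_S^2} + \frac{\Delta\epsilon_B^2}{\langle\epsilon\rangle_B^2} \right)^{-1/2}$$
$$\Delta\epsilon_{S,B} = \epsilon_{S,B} - \epsilon'_{S,B}, \qquad \langle\epsilon\rangle_{S,B} = \frac{1}{2}\left(\epsilon_{S,B} + \epsilon'_{S,B}\right)$$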
the understanding of perturbative properties has reached remarkable levels
resilience measures a tagger’s robustness against non-perturbative effects (hadronisation and UE)
it is defined in terms of signal/background efficiencies with/without non-pert. contributions
Looking inside jets
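A minimal sketch of the resilience computation follows, assuming the definition ζ = (Δε_S²/⟨ε⟩_S² + Δε_B²/⟨ε⟩_B²)^(-1/2), with primed efficiencies measured at the second level (for example without non-perturbative effects). The function name, argument names and numerical inputs are hypothetical.

```python
import math

def resilience(eps_s, eps_s_prime, eps_b, eps_b_prime):
    """Resilience zeta from signal/background efficiencies at two simulation levels.

    Unprimed and primed values are the efficiencies with and without the effect
    under study (assumed convention). A tagger whose efficiencies do not change
    at all has infinite resilience.
    """
    delta_s = eps_s - eps_s_prime
    delta_b = eps_b - eps_b_prime
    mean_s = 0.5 * (eps_s + eps_s_prime)
    mean_b = 0.5 * (eps_b + eps_b_prime)
    # Quadrature sum of the relative efficiency shifts.
    spread = math.hypot(delta_s / mean_s, delta_b / mean_b)
    return math.inf if spread == 0.0 else 1.0 / spread

# Illustrative numbers only: signal efficiency stable, background shifts from 1.0% to 1.2%.
zeta = resilience(0.40, 0.40, 0.010, 0.012)
```

Larger ζ means the working point moves less when the extra effects are switched on or off.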
[Figure: CMS, 35.9 fb^-1 (13 TeV): soft-drop jet mass mSD (GeV) from 40 to 200 GeV, events / 7 GeV, 450 < pT < 1000 GeV, double-b tagger passing region; components: W, Z, tt, multijet, total background, H(bb), data; lower panel: (data − multijet − tt) / σ(data)]
Z: 5.1σ, H: 1.5σ
Phys.Rev.Lett. 120 (2018) no.7, 071802
reconstruction (anti-kt & particle-flow)
with energy correlation function $N_2^1$
$N_2^1 \to N_2^{1,\mathrm{DDT}}$
corrections to obtain Z+jets and W+jets
corrected for finite top mass effects
normalisation
[Figure: ATLAS Preliminary, √s = 13 TeV, 80.5 fb^-1, signal region: signal candidate large-R jet mass [GeV] from 80 to 220 GeV, events / 5 GeV; components: data, SM Higgs (μH = 5.8), V+jets (μV = 1.5), QCD fit, top, QCD fit ± 1σ; lower panels: background-subtracted data with ± 1σ bands for QCD and top, and for QCD, top and V+jets]
ATLAS-CONF-2018-052
reconstruction (anti-kt & topoclusters)
by requiring two track subjets with variable R
corrections to obtain Z+jets and W+jets
corrected for finite top mass effects
normalisation
more details in the backup
H→bb is the holy grail of jet substructure, where it all started … embarrassingly it’s not been observed yet!
Need more efficient tools? enter machine-learning
Tremendous work went into understanding groomers and taggers, what’s the best use of these methods?
deep thinking meets deep learning
precision measurements using jet substructure
a wave of machine learning algorithms has hit jet physics in the past 3-4 years
ML algorithms are powerful tools for classification: can we then apply them to our task?
if an algorithm can distinguish pictures of cats and dogs, can it also distinguish QCD jets from boosted objects?
the number of papers trying to answer this question has recently exploded!
very active and fast-developing field
[Figure: tagger comparison, Pythia 8, √s = 13 TeV, 250 < pT/GeV < 300, 65 < mass/GeV < 95; curves: combinations of mass, τ21 and ΔR, MaxOut, Convnet, Convnet-norm, Random]
jet images do what they say: project the jet into an n×n pixel image, where intensity is given by energy deposition
use a convolutional neural network (CNN) to classify
the right pre-processing is crucial for many reasons: we average over many events and Lorentz symmetry would wash away any pattern
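The pixelisation and a very basic pre-processing step can be sketched as follows. This is only an assumed, minimal recipe (translation to the pT-weighted centroid plus normalisation to unit total intensity); rotation, flipping and other steps used in the literature are left out, and the function name, pixel count and image half-width are illustrative.

```python
import numpy as np

def jet_image(pt, y, phi, n_pixels=25, half_width=1.0):
    """Build an n_pixels x n_pixels image of a jet with pT as pixel intensity.

    Pre-processing (assumed, minimal): centre the image on the pT-weighted
    centroid in (y, phi) and normalise the total intensity to one.
    """
    pt = np.asarray(pt, dtype=float)
    y = np.asarray(y, dtype=float)
    phi = np.asarray(phi, dtype=float)

    # Measure phi relative to the hardest constituent so the 2*pi wrap-around
    # does not split the jet across the image boundary.
    dphi = (phi - phi[np.argmax(pt)] + np.pi) % (2.0 * np.pi) - np.pi

    # Translation: shift the pT-weighted centroid to the origin.
    y_shift = y - np.average(y, weights=pt)
    phi_shift = dphi - np.average(dphi, weights=pt)

    edges = np.linspace(-half_width, half_width, n_pixels + 1)
    image, _, _ = np.histogram2d(y_shift, phi_shift, bins=(edges, edges), weights=pt)

    total = image.sum()
    return image / total if total > 0 else image

# Toy jet straddling the phi = pi boundary.
image = jet_image(pt=[100.0, 50.0, 1.0], y=[0.1, 0.3, -0.2], phi=[3.1, -3.1, 3.0])
```

The resulting array could then be passed to any CNN library as a single-channel image.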
[Figure: W’→ WZ event jet image (pixel pT [GeV]) passed through a CNN: convolutions, convolved feature layers, max-pooling, repeat]
Cogan, Kagan, Strauss, Schwartzman (2015) de Oliveira, Kagan, Mackey, Nachman, Schwartzman (2016)
analyses typically have access to more information than the energy deposit in the calorimeter: e.g. particle id, tracks, clustering history in a jet, etc.
build networks that take 4-momenta as inputs:
clever N-body phase-space parametrisation to maximise information
recurrent / recursive neural networks to model jet clustering history (using techniques borrowed from language recognition)
Louppe, Cho, Cranmer (2017) Datta, Larkoski (2017) Guest, Cranmer, Whiteson (2018)
inputs of ML algorithms can be low-level (calorimeter cells/particle 4-momenta) but also higher-level variables
physics intuition can lead us to construct better representations of a jet: the Lund jet plane
de-cluster the jet following the hard branch and record (kt, Δ) at each step
feed this representation to a log-likelihood or an ML algorithm
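The recording step can be sketched as below, assuming the declustering along the hard branch has already been done elsewhere and is available as a list of subjet pairs in (pt, y, phi) form. The conventions kt = pt_soft · Δ and z = pt_soft / (pt_hard + pt_soft), as well as the function names, are assumptions made for this illustration.

```python
import math

def lund_coordinates(subjet_1, subjet_2):
    """Lund-plane variables for one declustering step; subjets are (pt, y, phi)."""
    # Order the pair so that 'hard' carries the larger transverse momentum.
    hard, soft = sorted((subjet_1, subjet_2), key=lambda s: s[0], reverse=True)
    dphi = (hard[2] - soft[2] + math.pi) % (2.0 * math.pi) - math.pi
    delta = math.hypot(hard[1] - soft[1], dphi)  # angular separation
    kt = soft[0] * delta                         # transverse momentum of the emission
    z = soft[0] / (hard[0] + soft[0])            # momentum fraction of the softer branch
    return {"delta": delta, "kt": kt, "z": z}

def primary_lund_plane(declusterings):
    """One (kt, delta, z) record per step along the hard branch.

    `declusterings` is a hypothetical input: a list of subjet pairs produced
    by some external declustering code.
    """
    return [lund_coordinates(a, b) for a, b in declusterings]

# Single toy step: a 20 GeV emission at angle 0.5 from a 100 GeV branch.
points = primary_lund_plane([((20.0, 0.3, 0.4), (100.0, 0.0, 0.0))])
```

Each jet then becomes a sequence of points that can be histogrammed or fed to a sequence model.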
[Figure: primary Lund-plane regions in the (ln(R/Δ), ln(kt/GeV)) plane: soft-collinear, hard-collinear (large z), ISR (large Δ), non-pert. (small kt), MPI/UE; perturbative and non-perturbative domains marked]
Dreyer, Salam, Soyez (2018)
[Figure: learned filters in the translated rapidity y / translated azimuthal angle φ plane, shown on [0, 1] and on [−R, R] axes, with contours]

Observable O                            Map Φ                                            Function F
Mass $m$                                $p^\mu$                                          $F(x^\mu) = \sqrt{x^\mu x_\mu}$
Multiplicity $M$                        $1$                                              $F(x) = x$
Track Mass $m_{\text{track}}$           $p^\mu I_{\text{track}}$                         $F(x^\mu) = \sqrt{x^\mu x_\mu}$
Track Multiplicity $M_{\text{track}}$   $I_{\text{track}}$                               $F(x) = x$
Jet Charge $Q_\kappa$                   $(p_T, Q\,p_T^\kappa)$                           $F(x, y) = y/x^\kappa$
Eventropy $z \ln z$                     $(p_T, p_T \ln p_T)$                             $F(x, y) = y/x - \ln x$
Momentum Dispersion $p_T^D$             $(p_T, p_T^2)$                                   $F(x, y) = \sqrt{y/x^2}$
C parameter $C$                         $(|\vec p|, \vec p \otimes \vec p / |\vec p|)$   $F(x, Y) = \frac{3}{2x^2}\left[(\mathrm{Tr}\,Y)^2 - \mathrm{Tr}\,Y^2\right]$

EFN: $F\left(\sum_{i=1}^{M} z_i \Phi(\theta_i, \phi_i)\right)$
PFN: $F\left(\sum_{i=1}^{M} \Phi(p_i)\right)$

[Diagram: Energy/Particle Flow Network: particles → per-particle representation Φ → latent space (event representation) → F → observable]
Komiske, Metodiev, Thaler (2018)
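The structure F(∑ Φ(p_i)) can be illustrated without any neural network by plugging in the hand-written maps from the table above. In this sketch, which is not the authors' implementation, particles are assumed to be (E, px, py, pz) tuples and the helper names are hypothetical; in an actual Energy/Particle Flow Network, Φ and F would be learned.

```python
import math

def additive_observable(particles, per_particle_map, function):
    """Evaluate F(sum_i Phi(p_i)) for a non-empty list of particles."""
    latent = None
    for particle in particles:
        phi = per_particle_map(particle)
        # Sum the per-particle representations component by component.
        latent = phi if latent is None else tuple(a + b for a, b in zip(latent, phi))
    return function(latent)

def transverse_momentum(p):
    return math.hypot(p[1], p[2])

# Mass: Phi(p) = p^mu, F(x) = sqrt(x.x) with metric (+, -, -, -).
def mass(particles):
    return additive_observable(
        particles,
        lambda p: p,
        lambda x: math.sqrt(max(x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2, 0.0)),
    )

# Multiplicity: Phi(p) = 1, F(x) = x.
def multiplicity(particles):
    return additive_observable(particles, lambda p: (1,), lambda x: x[0])

# Momentum dispersion: Phi(p) = (pT, pT^2), F(x, y) = sqrt(y / x^2).
def momentum_dispersion(particles):
    return additive_observable(
        particles,
        lambda p: (transverse_momentum(p), transverse_momentum(p) ** 2),
        lambda x: math.sqrt(x[1] / x[0] ** 2),
    )

# Two massless toy particles, each with pT = 3.
toy = [(5.0, 3.0, 0.0, 4.0), (5.0, -3.0, 0.0, 4.0)]
toy_mass = mass(toy)
toy_multiplicity = multiplicity(toy)
toy_ptd = momentum_dispersion(toy)
```

Because the sum over particles is symmetric, the result does not depend on the order in which particles are listed.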
Kasieczka et al. (2019)
                               AUC     Accuracy   1/εB (εS = 0.3)   #Parameters
CNN                            0.981   0.930      780               610k
ResNeXt                        0.984   0.936      1140              1.46M
TopoDNN                        0.972   0.916      290               59k
Multi-body N-subjettiness 6    0.979   0.922      856               57k
Multi-body N-subjettiness 8    0.981   0.929      860               58k
RecNN                          0.981   0.929      810               13k
P-CNN                          0.980   0.930      760               348k
ParticleNet                    0.985   0.938      1280              498k
LBN                            0.981   0.931      860               705k
LoLa                           0.980   0.929      730               127k
Energy Flow Polynomials        0.980   0.932      380               1k
Energy Flow Network            0.979   0.927      600               82k
Particle Flow Network          0.982   0.932      880               82k

input categories: images, four-momenta, theory-inspired
all solutions offer a big improvement over the standard analysis (nsub+m)
similar performances
physics intuition useful to match performance
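The figures of merit quoted in the comparison (AUC and background rejection 1/εB at fixed εS) can be computed from classifier scores along the following lines. This is a simple sketch with hypothetical function names and toy scores; the threshold is taken from a quantile of the signal scores, which is one possible convention.

```python
import numpy as np

def auc(signal_scores, background_scores):
    """Area under the ROC curve: probability that a signal jet outscores a background jet."""
    s = np.asarray(signal_scores, dtype=float)[:, None]
    b = np.asarray(background_scores, dtype=float)[None, :]
    # Ties count as half; this pairwise form is O(N_s * N_b), fine for small samples.
    return float(np.mean((s > b) + 0.5 * (s == b)))

def background_rejection(signal_scores, background_scores, eps_s=0.3):
    """1 / eps_B at the score threshold that keeps a fraction eps_s of the signal."""
    threshold = np.quantile(np.asarray(signal_scores, dtype=float), 1.0 - eps_s)
    eps_b = float(np.mean(np.asarray(background_scores, dtype=float) >= threshold))
    return np.inf if eps_b == 0.0 else 1.0 / eps_b

# Toy scores, for illustration only.
sig = [0.9, 0.8, 0.7, 0.6]
bkg = [0.1, 0.2, 0.3, 0.85]
toy_auc = auc(sig, bkg)
toy_rejection = background_rejection(sig, bkg, eps_s=0.5)
```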
understanding of groomers and taggers led to the definition of theory-friendly efficient tools, e.g. soft drop:
good perturbative properties (convergence, absence of soft effects such as non-global logs)
small non-perturbative corrections
discussions at BOOST 2013
Frye, Larkoski, Schwartz, Yan (2016)
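For concreteness, soft drop can be sketched as follows, assuming the usual condition z > zcut (ΔR/R0)^β with z = min(pT1, pT2)/(pT1 + pT2). The toy Cambridge/Aachen reclustering below is a brute-force stand-in that treats all input particles as belonging to one jet; in practice a dedicated clustering library would be used. All names and the example momenta are illustrative.

```python
import math

def pt(p):
    return math.hypot(p[0], p[1])

def rapidity(p):
    return 0.5 * math.log((p[3] + p[2]) / (p[3] - p[2]))

def delta_r(a, b):
    dphi = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(rapidity(a) - rapidity(b), dphi)

def mass(p):
    return math.sqrt(max(p[3] ** 2 - p[0] ** 2 - p[1] ** 2 - p[2] ** 2, 0.0))

def from_pt_y_phi(pt_, y, phi):
    """Massless four-momentum (px, py, pz, E)."""
    return (pt_ * math.cos(phi), pt_ * math.sin(phi), pt_ * math.sinh(y), pt_ * math.cosh(y))

def cambridge_aachen_tree(momenta):
    """Merge the angularly closest pair until one node is left.

    Nodes are (momentum, child_a, child_b); leaves have no children.
    """
    nodes = [(p, None, None) for p in momenta]
    while len(nodes) > 1:
        pairs = [(i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes))]
        i, j = min(pairs, key=lambda ij: delta_r(nodes[ij[0]][0], nodes[ij[1]][0]))
        a, b = nodes[i], nodes[j]
        merged = (tuple(x + y for x, y in zip(a[0], b[0])), a, b)
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]

def soft_drop(root, zcut=0.1, beta=0.0, r0=1.0):
    """Undo the clustering, dropping the softer branch until the condition passes."""
    node = root
    while node[1] is not None:
        a, b = node[1], node[2]
        if pt(a[0]) < pt(b[0]):
            a, b = b, a  # make 'a' the harder branch
        z = pt(b[0]) / (pt(a[0]) + pt(b[0]))
        if z > zcut * (delta_r(a[0], b[0]) / r0) ** beta:
            return node[0]  # hard splitting found: this is the groomed jet
        node = a            # drop the soft branch and keep declustering
    return node[0]

# Two hard prongs plus one soft wide-angle particle.
particles = [
    from_pt_y_phi(100.0, 0.0, 0.0),
    from_pt_y_phi(100.0, 0.4, 0.0),
    from_pt_y_phi(1.0, -0.8, 0.5),
]
tree = cambridge_aachen_tree(particles)
groomed = soft_drop(tree, zcut=0.1, beta=0.0, r0=1.0)
```

In this toy jet the soft wide-angle particle is removed and the groomed mass is that of the two hard prongs.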
time is mature for theory / data comparison
reduced sensitivity to non-pert. physics (hadronisation and UE) should make the comparison more meaningful
what is the value of unfolded measurements / theory comparisons for “discovery” tools?
understanding systematics (e.g. kinks and bumps)
where non-pert. corrections are small, test perturbative showers in MCs
at low mass, hadronisation is large but UE is small: TUNE!
[Figure: non-perturbative correction factors to dσ/dlog(m) versus m [GeV], √s = 13 TeV, R = 0.8, zcut = 0.1, 460 < pt,jet < 550 GeV; two panels: UE correction and hadronisation correction; generators: Herwig6 (AUET2), Pythia6 (Perugia2011), Pythia6 (Z2), Pythia8 (4C), Pythia8 (Monash13)]
SM, Schunk, Soyez (2017,2018) see also Frye et al. (2016) and Kang et al. (2018)
[Figure: Δσ/Δlog(m) [nb] versus m (GeV), √s = 13 TeV, R = 0.8, zcut = 0.1; curves: NLO+LL and NLO+LL⊗NP]
large range of masses where non-pert. corrections are small and we can trust resummation
non-pert. corrections can be included through MC or analytical modelling
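One common way to include non-perturbative corrections through MC is a bin-by-bin factor, sketched below under the assumption that the factor is the ratio of hadron-level to parton-level Monte Carlo histograms and that it multiplies the perturbative prediction. The function names and the toy bin contents are hypothetical.

```python
import numpy as np

def np_correction_factor(hadron_level, parton_level):
    """Bin-by-bin ratio of hadron-level to parton-level MC; empty parton bins get factor 1."""
    hadron = np.asarray(hadron_level, dtype=float)
    parton = np.asarray(parton_level, dtype=float)
    factor = np.ones_like(parton)
    np.divide(hadron, parton, out=factor, where=parton > 0)
    return factor

def apply_np_correction(perturbative, hadron_level, parton_level):
    """Multiply a perturbative prediction by the MC-derived correction factor."""
    return np.asarray(perturbative, dtype=float) * np_correction_factor(hadron_level, parton_level)

# Toy three-bin example; the last bin has no parton-level MC events.
corrected = apply_np_correction(
    perturbative=[1.0, 2.0, 4.0],
    hadron_level=[2.0, 3.0, 0.0],
    parton_level=[1.0, 2.0, 0.0],
)
```

Taking the spread of such factors over several generators and tunes gives a rough handle on the associated uncertainty.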
JHEP 11 (2018) 113
determination of other fundamental parameters may benefit from grooming, e.g. the top quark mass
in the context of e+e- collisions SCET factorisation theorems allow for a precision determination of the top-jet mass
the picture at pp collisions is polluted by wide-angle soft radiation
grooming “turns” pp observables into e+e- ones
Hoang, Mantry, Pathak, Stewart (2017)
current precision below 1%, dominated by lattice extractions
LEP event shapes also very precise (5%)
however they are in tension with the world average
thrust (and C parameter) known with outstanding accuracy
[Figure: summary of αs determinations by category: τ-decays, lattice, structure functions, e+e- annihilation, hadron collider, electroweak precision fits; entries: Baikov, ABM, BBG, JR, MMHT, NNPDF, Davier, Pich, Boito, SM review, HPQCD (Wilson loops), HPQCD (c-c correlators), Maltmann (Wilson loops), JLQCD (Adler functions), Dissertori (3j), JADE (3j), DW (T), Abbate (T), Hoang (C), JADE (j&s), OPAL (j&s), ALEPH (jets&shapes), PACS-CS (vac. pol. fctns.), ETM (ghost-gluon vertex), BBGPSV (static energy), CMS (tt cross section), GFitter]
[Figure: thrust distribution (1/σ) dσ/dτ versus τ, fit at N3LL', with theory scan error and data from DELPHI, ALEPH, OPAL, L3, SLD]
[Figure: fit contours at 39% CL and 68% CL]
strong correlation with non-perturbative parameter
noticeable reduction of non-pert. corrections may allow us to disentangle the degeneracy
can we compute it at the same accuracy as standard event shapes?
NNLO calculations recently performed
Baron, SM, Theeuwes (2018) Kardos, Somogyi, Trocsanyi (2018)
fits to pseudo-data generated by SHERPA
preliminary results show reduced dependence on non-pert. corrections
subleading effects are under investigation
soft-drop allows us to extend the fit range
General question: is there a natural way to define soft-drop event shapes? e.g. bottom-up soft-drop
SM, Reichelt, Schumann, Soyez, and Theeuwes (soon to appear) Dreyer, Necib, Soyez, Thaler (2018) Baron (in preparation)
[Figure: fitted αs at Q = 91.2 GeV, β = 0, for plain and soft-drop (zcut = 0.05, 0.1, 0.2, 0.33) distributions: αs at FO, Res, NP (MC) and NP (ana) level; αs versus τmin and χ2/dof versus τmin with NP (ana)]
a detailed understanding of boosted massive particles decaying into jets is
even stronger for future colliders at higher energies
10+ years of jet substructure allowed us to reach a profound understanding of QCD dynamics at small scales
this understanding has been turned into algorithms which feature both performance and robustness
Is this enough? Do we need more efficient tools? E.g. boosted H→bb is the holy grail of jet substructure, where it all started … embarrassingly it’s not been observed yet! (~1.5σ)
In the context of ML, are we suspicious of black-boxes? Should we be?
can we move from machine-learning to learning-from-machines? Interpretable neural networks? Prescriptive analytics?
can we devise ML algorithms that preserve calculability? (jet topics, grooming through reinforcement learning …)
What’s the best use of first-principle knowledge in jet physics?
extraction of SM parameters?
PDFs with q/g tagging?
jet substructure probes of quark-gluon plasma in heavy ion collisions
(there are links to things I hadn’t time to discuss)
THANK YOU !
[Figure: groomed jet-mass distributions ρ/σ dσ/dρ versus ρ = m2/(pt2 R2) (and m [GeV]) for R = 1, pt = 3 TeV, zcut = ftrim = zprune = 0.1, Rtrim = 0.2, fprune = 0.5; curves: plain, SD (β = 2), mMDT, trimming, pruning, Y-pruning]
CMS favours soft drop, ATLAS trimming, why?
Performance does depend on the detail of the jet reconstruction procedure / detector
However, performance is not the only criterion!
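[Figure: emission phase space in the (log R/θ, log 1/z) plane, showing the trimmed region bounded by θ = Rsub and the soft-dropped region bounded by z = zcut (β > 0)]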
trimming has an abrupt change of behaviour due to fixed Rsub
loss of efficiency at high pT
in SD angular resolution controlled by the exponent β: phase-space appears smoother
SD under better theory control
[Figure: τ2/τ1 and τ21' versus ρ' for background and signal jets in pT bins 300-400, 500-600 and 1000-1100 GeV]
CMS analysis cuts on a shape to isolate 2-pronged jets
$N_2^1$ is a ratio of generalised energy correlation functions optimised to work after grooming
DDT is a procedure to de-correlate the mass from the jet shape cut, reducing sculpting
Moult, Necib, Thaler (2016) Dolen, Harris, SM, Nhan, Rappoccio (2016)
ATLAS analysis looks for 2 track jets using variable-R jets
Krohn, Thaler, Wang (2009)
$$d_{ij} = \min\left[p_{Ti}^{2n}, p_{Tj}^{2n}\right] R_{ij}^2, \qquad d_{iB} = p_{Ti}^{2n}\, R_{\text{eff}}(p_{Ti})^2, \qquad R_{\text{eff}}(p_T) = \min\left[\frac{\rho}{p_T}, R_{\max}\right]$$
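The variable-R distance measures can be written down directly, as in the sketch below. The function names are hypothetical and the values ρ = 30 GeV and Rmax = 0.4 in the example are illustrative assumptions, not parameters quoted in the slides.

```python
def effective_radius(pt, rho, r_max):
    """R_eff(pT) = min(rho / pT, R_max): the jet radius shrinks as pT grows."""
    return min(rho / pt, r_max)

def pair_distance(pt_i, pt_j, r_ij, n=0):
    """d_ij = min(pT_i^(2n), pT_j^(2n)) * R_ij^2."""
    return min(pt_i ** (2 * n), pt_j ** (2 * n)) * r_ij ** 2

def beam_distance(pt_i, rho, r_max, n=0):
    """d_iB = pT_i^(2n) * R_eff(pT_i)^2."""
    return pt_i ** (2 * n) * effective_radius(pt_i, rho, r_max) ** 2

# Illustrative values: rho = 30 GeV, R_max = 0.4.
r_high_pt = effective_radius(100.0, rho=30.0, r_max=0.4)  # radius set by rho / pT
r_low_pt = effective_radius(10.0, rho=30.0, r_max=0.4)    # radius capped at R_max
```

The exponent n selects the clustering family: n = 0 gives a Cambridge/Aachen-like measure, n = -1 an anti-kt-like one and n = 1 a kt-like one.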