Towards Data-Driven Particle Physics Classifiers




SLIDE 1

Towards Data-Driven Particle Physics Classifiers

Deep Learning in the Natural Sciences

University of Hamburg

Eric M. Metodiev

Center for Theoretical Physics, Massachusetts Institute of Technology

Based on work with Patrick Komiske, Benjamin Nachman, Matthew Schwartz, and Jesse Thaler

[1708.02949] [1801.10158] [1802.00008] [1809.01140]

March 1, 2019

SLIDE 2

Data-Driven Particle Physics Classifiers

Outline

  • Classification at Colliders
  • Training on Data
  • Disentangling Categories

Eric M. Metodiev, MIT


SLIDE 4

Jet Classification

u, d, s g c b W/Z t H

Or

New Physics

? ? ? ? ? ?



SLIDE 7

Jet Classification

quark gluon c b W/Z t H

Or

New Physics

? ? ? ? ? ?

  • New Physics signal: quark jets
  • QCD background: gluon jets

$C_q = 4/3$, $C_g = 3$: gluon jets are “twice as wide” as quark jets.
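The “twice as wide” statement can be checked with one line of arithmetic. This is only a rough reading, assuming that “width” tracks the amount of radiation and that radiation scales with the color factor (4/3 for quarks, 3 for gluons):

```latex
\frac{C_g}{C_q} = \frac{3}{4/3} = \frac{9}{4} = 2.25 \approx 2
```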

SLIDE 8

Machine Learning with Jets

Jet representations: Images, Observables, Sequences, Point Clouds, …

  • All supervised classification methods require training data.
  • It is impossible to isolate pure samples of quark jets and gluon jets.
  • Classifiers often rely on simulation, which is sensitive to mismodeling.

e.g. [L. de Oliveira, et al., 1511.05190] [P.T. Komiske, EMM, M.D. Schwartz, 1612.01551] [G. Louppe, et al., 1702.00748] [K. Datta, A. Larkoski, 1704.08249] [T. Cheng, 1711.02633] [P.T. Komiske, EMM, J. Thaler, 1712.07124] [G. Kasieczka, N. Kiefer, T. Plehn, J. Thompson, 1812.09223] [M. Andrews, et al., 1902.08276]

SLIDE 9

Simulation vs. Data

[Figure: two panels, Simulation and Data; axes: “number of particles in the jet” and “width of the jet”.] [ATLAS Collaboration, 1405.6583]

A simple two-feature quark vs. gluon jet classifier looks very different in simulation and in data. Is it possible to train classifiers on data?

SLIDE 10

Outline

  • Classification at Colliders: classifying jets based on their originating particles.
  • Training on Data
  • Disentangling Categories


SLIDE 12

Training on pure samples: Cat vs. Dog jets

[Diagram: a classifier trained on Dog Jets vs. Cat Jets.]

SLIDE 13

Training on mixed samples: Cat vs. Dog jets

[Diagram: a classifier trained on Dog-enriched Jets vs. Cat-enriched Jets.]

This defines an equivalent classifier to the pure case!

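SLIDE 14

Classification without labels (CWoLa)

$$L_{M_1/M_2}(\boldsymbol{x}) = \frac{p_{M_1}(\boldsymbol{x})}{p_{M_2}(\boldsymbol{x})} = \frac{f_1^{\text{cat}}\, p_{\text{cat}}(\boldsymbol{x}) + (1 - f_1^{\text{cat}})\, p_{\text{dog}}(\boldsymbol{x})}{f_2^{\text{cat}}\, p_{\text{cat}}(\boldsymbol{x}) + (1 - f_2^{\text{cat}})\, p_{\text{dog}}(\boldsymbol{x})} = \frac{f_1^{\text{cat}}\, L_{\text{cat/dog}}(\boldsymbol{x}) + (1 - f_1^{\text{cat}})}{f_2^{\text{cat}}\, L_{\text{cat/dog}}(\boldsymbol{x}) + (1 - f_2^{\text{cat}})}$$

The optimal mixed-sample classifier, $L_{M_1/M_2}$, is a monotonic rescaling of the optimal cat vs. dog classifier, $L_{\text{cat/dog}}$. Hence they define equivalent classifiers.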

[EMM, B. Nachman, J. Thaler, 1708.02949]; see also [L. Dery, B. Nachman, F. Rubbo, A. Schwartzman, 1702.00414] [T. Cohen, M. Freytsis, B. Ostdiek, 1706.09451] [P.T. Komiske, EMM, B. Nachman, M.D. Schwartz, 1801.10158]
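The monotonic relation between the mixed-sample and pure likelihood ratios can be checked numerically. The sketch below is a toy illustration, not the setup of the cited papers: the two unit-width Gaussians standing in for the cat and dog distributions, the cat fractions 0.8 and 0.2, and all function names are assumptions.

```python
import math

def gaussian_pdf(x, mean, sigma=1.0):
    """Normal density, used here as a stand-in for a 1D jet observable."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def p_cat(x):
    return gaussian_pdf(x, mean=1.0)   # hypothetical "cat" distribution

def p_dog(x):
    return gaussian_pdf(x, mean=-1.0)  # hypothetical "dog" distribution

def mixed_likelihood_ratio(x, f1, f2):
    """p_M1(x) / p_M2(x) for mixtures with cat fractions f1 and f2."""
    numerator = f1 * p_cat(x) + (1.0 - f1) * p_dog(x)
    denominator = f2 * p_cat(x) + (1.0 - f2) * p_dog(x)
    return numerator / denominator

f1, f2 = 0.8, 0.2  # assumed cat fractions of the two mixed samples
xs = [-3.0 + 0.1 * i for i in range(61)]
pure = [p_cat(x) / p_dog(x) for x in xs]
mixed = [mixed_likelihood_ratio(x, f1, f2) for x in xs]

# Order the points by the pure ratio and check the mixed ratio rises with it.
order = sorted(range(len(xs)), key=lambda i: pure[i])
mixed_sorted = [mixed[i] for i in order]
is_monotonic = all(a < b for a, b in zip(mixed_sorted, mixed_sorted[1:]))
print(is_monotonic, min(mixed), max(mixed))
```

With f1 > f2 the mixed ratio increases with the pure ratio and stays between (1 - f1)/(1 - f2) and f1/f2, so cutting on either quantity selects the same jets.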

SLIDE 15

Training on pure samples: Quark vs. Gluon jets

[Diagram: a classifier trained on Gluon Jets vs. Quark Jets.]

SLIDE 16

Training on mixed samples: Quark vs. Gluon jets

[Diagram: a classifier trained on gluon-enriched jets (dijets) vs. quark-enriched jets (Z + jet).]

This defines an equivalent classifier to the pure case!

SLIDE 17

Performance

[Figure: classifier performance; legend includes “Expert observables”.]

  • Can train on mixed samples!
  • Vary the mixture purity.
  • Works for very impure mixtures!
  • Compare 80-20% mixtures to pure samples.
  • Also works for convolutional neural networks and jet images.

[P.T. Komiske, EMM, B. Nachman, M.D. Schwartz, 1801.10158] [EMM, B. Nachman, J. Thaler, 1708.02949]
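The comparison of 80-20% mixtures with pure samples can be imitated in a toy setting. The following is a minimal sketch under stated assumptions, not the networks or data of the cited papers: the 2D Gaussian "jets", the plain logistic regression, the sample sizes and every function name are made up for illustration.

```python
import math
import random

def sample_jets(n, cat_fraction, rng):
    """Draw n toy 2D 'jets'; each is a cat with probability cat_fraction."""
    jets = []
    for _ in range(n):
        sign = 1.0 if rng.random() < cat_fraction else -1.0
        jets.append((rng.gauss(sign * 1.0, 1.0), rng.gauss(sign * 0.5, 1.0)))
    return jets

def train_logistic(sample_1, sample_0, steps=200, lr=0.5):
    """Plain gradient descent for logistic regression: sample_1 vs. sample_0."""
    data = [(x, 1.0) for x in sample_1] + [(x, 0.0) for x in sample_0]
    w0, w1, b = 0.0, 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        g0 = g1 = gb = 0.0
        for (x0, x1), y in data:
            p = 1.0 / (1.0 + math.exp(-(w0 * x0 + w1 * x1 + b)))
            g0 += (p - y) * x0
            g1 += (p - y) * x1
            gb += p - y
        w0 -= lr * g0 / n
        w1 -= lr * g1 / n
        b -= lr * gb / n
    return lambda x: w0 * x[0] + w1 * x[1] + b

def auc(score, positives, negatives):
    """Probability that a random positive scores above a random negative."""
    scored = sorted([(score(x), 1) for x in positives] + [(score(x), 0) for x in negatives])
    rank_sum = sum(rank for rank, (_, label) in enumerate(scored, start=1) if label == 1)
    n_pos, n_neg = len(positives), len(negatives)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

rng = random.Random(0)
pure_clf = train_logistic(sample_jets(1000, 1.0, rng), sample_jets(1000, 0.0, rng))
mixed_clf = train_logistic(sample_jets(1000, 0.8, rng), sample_jets(1000, 0.2, rng))

# Both classifiers are evaluated on the same pure test samples.
test_cats, test_dogs = sample_jets(1000, 1.0, rng), sample_jets(1000, 0.0, rng)
auc_pure = auc(pure_clf, test_cats, test_dogs)
auc_mixed = auc(mixed_clf, test_cats, test_dogs)
print(auc_pure, auc_mixed)
```

The mixed-sample classifier never sees a per-jet label, only which mixture each jet came from, yet its AUC on pure test samples should land close to that of the fully supervised one.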

SLIDE 18

Outline

  • Classification at Colliders: classifying jets based on their originating particles.
  • Training on Data: weak supervision with mixed jet samples.
  • Disentangling Categories


SLIDE 20

What do we even mean by quark and gluon jets?

  • Quarks are color triplets. Gluons are color octets. Hadrons in jets are color singlets.
  • No unambiguous definition of quark and gluon jets.
  • Various definitions of increasing verbosity.
  • We obtained a quark vs. gluon jet classifier without a definition…
  • Operational data-driven definition of quark and gluon jets.

[P. Gras, et al., 1704.03878]

SLIDE 21

Topic Modeling and Blind Source Separation

[Image: J. Bobin] [Image: D. Blei]

SLIDE 22

Disentangling Categories

Let’s model cats and dogs as random animal noise producers.

Growl Meow Growl Purr Woof Howl

SLIDE 23

Disentangling Categories

Listen to the animal noises from two different pet stores.

$$\frac{N_{\text{"Meow"}}^{\text{Store } A}}{N_{\text{"Meow"}}^{\text{Store } B}} = \frac{f_{\text{Cat}}^{\text{Store } A}}{f_{\text{Cat}}^{\text{Store } B}}$$

$$\frac{N_{\text{"Bark"}}^{\text{Store } A}}{N_{\text{"Bark"}}^{\text{Store } B}} = \frac{1 - f_{\text{Cat}}^{\text{Store } A}}{1 - f_{\text{Cat}}^{\text{Store } B}}$$

Meow Growl Purr Growl Woof Howl

SLIDE 24

Disentangling Categories

Disentangle cat and dog vocabularies from the animal noises at pet stores.

Pure cat and dog noise “phase space” is key:

$$\frac{N_{\text{"Meow"}}^{\text{Store } A}}{N_{\text{"Meow"}}^{\text{Store } B}} = \frac{f_{\text{Cat}}^{\text{Store } A}}{f_{\text{Cat}}^{\text{Store } B}}$$

$$\frac{N_{\text{"Bark"}}^{\text{Store } A}}{N_{\text{"Bark"}}^{\text{Store } B}} = \frac{1 - f_{\text{Cat}}^{\text{Store } A}}{1 - f_{\text{Cat}}^{\text{Store } B}}$$

Meow Growl Purr Growl Woof Howl
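The pet-store relations can be tried out with made-up numbers. In this sketch the noise probabilities, the store compositions (70% and 40% cats) and the function name are all assumptions; "Woof" plays the role of the dog-only noise, since it appears in the vocabulary list above.

```python
def noise_rates(cat_fraction, cat_vocab, dog_vocab):
    """Rate of each noise in a store whose animals are a cat/dog mixture."""
    noises = sorted(set(cat_vocab) | set(dog_vocab))
    return {
        noise: cat_fraction * cat_vocab.get(noise, 0.0)
        + (1.0 - cat_fraction) * dog_vocab.get(noise, 0.0)
        for noise in noises
    }

# Hypothetical vocabularies: "Growl" is shared, the other noises are pure.
cat_vocab = {"Meow": 0.5, "Purr": 0.3, "Growl": 0.2}
dog_vocab = {"Woof": 0.5, "Howl": 0.3, "Growl": 0.2}

store_a = noise_rates(0.7, cat_vocab, dog_vocab)  # assumed 70% cats
store_b = noise_rates(0.4, cat_vocab, dog_vocab)  # assumed 40% cats

cat_ratio = store_a["Meow"] / store_b["Meow"]  # equals f_A / f_B
dog_ratio = store_a["Woof"] / store_b["Woof"]  # equals (1 - f_A) / (1 - f_B)

# Invert the two ratios to recover the cat fractions of both stores.
f_b = (1.0 - dog_ratio) / (cat_ratio - dog_ratio)
f_a = cat_ratio * f_b
print(round(f_a, 6), round(f_b, 6))
```

Only the pure noises ("Meow", "Woof") give ratios that depend on the fractions alone. The shared "Growl" has the same ratio in every store mixture here and carries no information about the composition.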


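SLIDE 26

Disentangling Categories

$$\kappa_{AB} \equiv \min_{\boldsymbol{x}} \frac{p_A(\boldsymbol{x})}{p_B(\boldsymbol{x})} = \frac{1 - f_A^q}{1 - f_B^q}, \qquad \kappa_{BA} \equiv \min_{\boldsymbol{x}} \frac{p_B(\boldsymbol{x})}{p_A(\boldsymbol{x})} = \frac{f_B^q}{f_A^q}$$

With reducibility factors $\kappa_{AB}$ and $\kappa_{BA}$, solve for the quark and gluon distributions:

$$p_{\text{quark}}(\boldsymbol{x}) = \frac{p_A(\boldsymbol{x}) - \kappa_{AB}\, p_B(\boldsymbol{x})}{1 - \kappa_{AB}}, \qquad p_{\text{gluon}}(\boldsymbol{x}) = \frac{p_B(\boldsymbol{x}) - \kappa_{BA}\, p_A(\boldsymbol{x})}{1 - \kappa_{BA}}$$

[Figure axis: number of particles in the jet.]

An operational definition of quark and gluon jets.

[P.T. Komiske, EMM, J. Thaler, 1809.01140] [EMM, J. Thaler, 1802.00008]

Can also use machine learning to determine the feature space.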
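The demixing step can be sketched on binned distributions. This is a toy illustration under assumptions, not the procedure of the cited papers: the five-bin "quark" and "gluon" histograms, the quark fractions 0.7 and 0.3, and the function names are invented. The histograms are chosen to be mutually irreducible, meaning each has a bin where the other vanishes.

```python
def mix(f_quark, p_quark, p_gluon):
    """Binned distribution of a sample with quark fraction f_quark."""
    return [f_quark * q + (1.0 - f_quark) * g for q, g in zip(p_quark, p_gluon)]

def reducibility_factor(p_num, p_den):
    """Minimum over bins of p_num / p_den, skipping empty denominator bins."""
    return min(n / d for n, d in zip(p_num, p_den) if d > 0.0)

def demix(p_a, p_b):
    """Subtract as much of the other sample as possible, then renormalize."""
    kappa_ab = reducibility_factor(p_a, p_b)
    kappa_ba = reducibility_factor(p_b, p_a)
    topic_q = [(a - kappa_ab * b) / (1.0 - kappa_ab) for a, b in zip(p_a, p_b)]
    topic_g = [(b - kappa_ba * a) / (1.0 - kappa_ba) for a, b in zip(p_a, p_b)]
    return kappa_ab, kappa_ba, topic_q, topic_g

# Hypothetical pure distributions over five bins of some count observable.
true_quark = [0.4, 0.3, 0.2, 0.1, 0.0]
true_gluon = [0.0, 0.1, 0.2, 0.3, 0.4]

p_a = mix(0.7, true_quark, true_gluon)  # assumed quark-enriched sample A
p_b = mix(0.3, true_quark, true_gluon)  # assumed gluon-enriched sample B

kappa_ab, kappa_ba, topic_q, topic_g = demix(p_a, p_b)
print(kappa_ab, kappa_ba)
print([round(v, 6) for v in topic_q])
print([round(v, 6) for v in topic_g])
```

With exact histograms the extracted topics coincide with the pure inputs. With finite statistics the minimum of a ratio of histograms is noisy, so a real analysis would need a more careful estimate of the reducibility factors than this bin-wise minimum.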

SLIDE 27

Collider data as mixtures of jet types

  • Theoretical and experimental definition of jet categories.
  • Theoretically tractable: calculate reducibility factors from perturbative QCD for certain observables.
  • Can use the fractions to calibrate ROC curves.
  • Allows for any observable distributions to be extracted for quark and gluon jets separately.

See extra slides for more.

SLIDE 28

Summary

  • Classification at Colliders: classifying jets based on their originating particles.
  • Training on Data: weak supervision with mixed jet samples.
  • Disentangling Categories: topic modeling to define data-driven jet categories.

SLIDE 29

The End

Thank you!

SLIDE 30

Extra Slides

SLIDE 31

A/B Likelihood Ratio

$$L_{A/B}(\boldsymbol{x}) \equiv \frac{p_A(\boldsymbol{x})}{p_B(\boldsymbol{x})} = \frac{f_A^q\, L_{\text{quark/gluon}}(\boldsymbol{x}) + (1 - f_A^q)}{f_B^q\, L_{\text{quark/gluon}}(\boldsymbol{x}) + (1 - f_B^q)}$$

The A/B and quark/gluon likelihood ratios are monotonically related! The A/B likelihood ratio is bounded between $f_A^q / f_B^q$ and $(1 - f_A^q)/(1 - f_B^q)$!

Classification without labels (CWoLa)

  • Optimal A/B classifier is the optimal quark/gluon classifier.
  • Use machine learning to approximate A/B likelihood ratio.

Jet Topics

  • “Mutual irreducibility” means the bounds saturate.
  • Obtain the maxima and minima of the A/B likelihood ratio.
  • Solve for the quark/gluon fractions and distributions.

$$p_{\text{sample } A}(\boldsymbol{x}) = f_A^q\, p_{\text{quark}}(\boldsymbol{x}) + (1 - f_A^q)\, p_{\text{gluon}}(\boldsymbol{x})$$

$$p_{\text{sample } B}(\boldsymbol{x}) = f_B^q\, p_{\text{quark}}(\boldsymbol{x}) + (1 - f_B^q)\, p_{\text{gluon}}(\boldsymbol{x})$$

[EMM, B. Nachman, J. Thaler, 1708.02949] [EMM, J. Thaler, 1802.00008]
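The step “solve for the quark/gluon fractions” on this slide can be written out. This short derivation assumes that A is the quark-enriched sample and that the bounds saturate, so the minimum of $p_A/p_B$ is $\kappa_{AB} = (1 - f_A^q)/(1 - f_B^q)$ and the minimum of $p_B/p_A$ is $\kappa_{BA} = f_B^q / f_A^q$:

```latex
f_B^q = \kappa_{BA}\, f_A^q, \qquad
1 - f_A^q = \kappa_{AB}\,\bigl(1 - \kappa_{BA}\, f_A^q\bigr)
\;\Longrightarrow\;
f_A^q = \frac{1 - \kappa_{AB}}{1 - \kappa_{AB}\,\kappa_{BA}}, \qquad
f_B^q = \frac{\kappa_{BA}\,(1 - \kappa_{AB})}{1 - \kappa_{AB}\,\kappa_{BA}}
```

As a check with made-up numbers, $f_A^q = 0.8$ and $f_B^q = 0.2$ give $\kappa_{AB} = \kappa_{BA} = 1/4$, and the formulas return $f_A^q = 0.75/0.9375 = 0.8$ and $f_B^q = 0.2$.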

SLIDE 32

An operational definition of quark and gluon jets

Quark and Gluon Jet Definition (Operational): Given two samples A and B of QCD jets at a fixed $p_T$ obtained by a suitable jet-finding procedure, taking A to be “quark-enriched” compared to B, and a jet substructure feature space $\boldsymbol{x}$, quark and gluon jet distributions are defined to be:

$$p_{\text{quark}}(\boldsymbol{x}) \equiv \frac{p_A(\boldsymbol{x}) - \kappa_{AB}\, p_B(\boldsymbol{x})}{1 - \kappa_{AB}}, \qquad p_{\text{gluon}}(\boldsymbol{x}) \equiv \frac{p_B(\boldsymbol{x}) - \kappa_{BA}\, p_A(\boldsymbol{x})}{1 - \kappa_{BA}}$$

  • Well-defined and operational statement in terms of hadronic cross sections.
  • Not a per-jet flavor label, but rather an aggregate distribution label.
  • Jets themselves are operationally defined.

SLIDE 33

Extracting quark and gluon distributions

The extracted quark and gluon fractions can be used to obtain any quark/gluon distributions.

SLIDE 34

(Self-)calibrating quark and gluon classifiers

The extracted quark and gluon fractions can calibrate any data-driven quark/gluon classifiers.
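One way to read “calibrate” is as a two-by-two linear solve at each classifier threshold. The sketch below assumes that reading; it is not taken from the cited papers, and the function name, the fractions 0.8 and 0.3, and the efficiencies are made up. A cut's efficiency on each mixed sample is a fraction-weighted average of its quark and gluon efficiencies, so known fractions let the pure efficiencies be recovered.

```python
def pure_efficiencies(eff_a, eff_b, f_a, f_b):
    """Solve eff_X = f_X * eff_q + (1 - f_X) * eff_g for X in {A, B}."""
    det = f_a * (1.0 - f_b) - f_b * (1.0 - f_a)  # equals f_a - f_b
    if det == 0.0:
        raise ValueError("samples A and B need different quark fractions")
    eff_q = ((1.0 - f_b) * eff_a - (1.0 - f_a) * eff_b) / det
    eff_g = (f_a * eff_b - f_b * eff_a) / det
    return eff_q, eff_g

# Made-up inputs: quark fractions of samples A and B, and the (normally
# unknown) pure efficiencies of some threshold cut.
f_a, f_b = 0.8, 0.3
true_eff_q, true_eff_g = 0.6, 0.2

# What would actually be measured: the cut's efficiency on each mixed sample.
eff_a = f_a * true_eff_q + (1.0 - f_a) * true_eff_g
eff_b = f_b * true_eff_q + (1.0 - f_b) * true_eff_g

eff_q, eff_g = pure_efficiencies(eff_a, eff_b, f_a, f_b)
print(round(eff_q, 6), round(eff_g, 6))
```

Repeating the solve for every threshold traces out a quark/gluon ROC curve from mixed-sample measurements alone.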

SLIDE 35

How does “sample dependence” manifest in this language?

  • Pairs of samples define quark and gluon.
  • Different pairs of samples may yield different flavor definitions.
  • Comparing definitions from different pairs of samples (dijets, Z+jet, gamma+jet, …) in data could probe how universal quark and gluon are.
  • Can grooming improve this?

There are ways to quantify how “explainable” a new sample C is by quark and gluon:

$$\max\,(f_q + f_g) \quad \text{s.t.} \quad p_C(\boldsymbol{x}) = f_q\, p_q(\boldsymbol{x}) + f_g\, p_g(\boldsymbol{x}) + (1 - f_q - f_g)\, p_{\text{other}}(\boldsymbol{x})$$

Thus topic modeling techniques could be an interesting avenue to explore issues of sample dependence directly in data.
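For binned distributions, the “explainability” maximization becomes a small linear program. The leftover p_C - f_q p_q - f_g p_g must be non-negative in every bin, so that it can be a valid "other" distribution. The sketch below solves it by enumerating vertices of the feasible region. It is an assumption-laden toy: the four-bin histograms, the true composition (50% quark, 30% gluon, 20% other) and the function name are invented, and nothing here is taken from the cited papers.

```python
from itertools import combinations

def max_explained_fraction(p_c, p_q, p_g, tol=1e-9):
    """Maximize f_q + f_g with p_c - f_q*p_q - f_g*p_g >= 0 in every bin."""
    # Each constraint reads a*f_q + b*f_g <= c; the last two keep f_q, f_g >= 0.
    constraints = [(q, g, c) for q, g, c in zip(p_q, p_g, p_c)]
    constraints += [(-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]

    def feasible(fq, fg):
        return all(a * fq + b * fg <= c + tol for a, b, c in constraints)

    best = (0.0, 0.0, 0.0)  # (f_q + f_g, f_q, f_g); the origin is feasible
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel constraint lines have no unique intersection
        fq = (c1 * b2 - c2 * b1) / det
        fg = (a1 * c2 - a2 * c1) / det
        if feasible(fq, fg) and fq + fg > best[0]:
            best = (fq + fg, fq, fg)
    return best

# Hypothetical four-bin distributions; "other" lives in a bin of its own.
p_q = [0.7, 0.3, 0.0, 0.0]
p_g = [0.0, 0.3, 0.7, 0.0]
p_other = [0.0, 0.0, 0.0, 1.0]
p_c = [0.5 * q + 0.3 * g + 0.2 * o for q, g, o in zip(p_q, p_g, p_other)]

explained, f_q, f_g = max_explained_fraction(p_c, p_q, p_g)
print(round(explained, 6), round(f_q, 6), round(f_g, 6))
```

Vertex enumeration is only practical because there are two unknowns; a general-purpose LP solver would be the usual choice for larger problems.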

SLIDE 36

Jet topics from QCD: Casimir scaling

Jet mass (and many substructure observables) exhibits Casimir scaling at Leading Logarithmic accuracy:

$$\Sigma_g(m) = \Sigma_q(m)^{C_A/C_F}$$

with $C_F = 4/3$ for quarks and $C_A = 3$ for gluons.

The quark/gluon reducibility factors at LL for any Casimir scaling observable are:

$$\kappa_{qg} = \min_m \frac{p_q(m)}{p_g(m)} = \min_m \frac{\Sigma_q'(m)}{\Sigma_g'(m)} = \frac{C_F}{C_A} \min_m \Sigma_q(m)^{1 - C_A/C_F} = \frac{C_F}{C_A} = \frac{4}{9}$$

$$\kappa_{gq} = \min_m \frac{p_g(m)}{p_q(m)} = \min_m \frac{\Sigma_g'(m)}{\Sigma_q'(m)} = \frac{C_A}{C_F} \min_m \Sigma_q(m)^{C_A/C_F - 1} = 0$$
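The two Casimir-scaling reducibility factors can be checked numerically for a concrete cumulative distribution. The quark cumulative distribution used below, m squared on the interval from 0 to 1, is an arbitrary toy assumption and not a QCD result. Only the scaling relation between the quark and gluon cumulative distributions comes from the slide.

```python
C_F, C_A = 4.0 / 3.0, 3.0

def sigma_quark(m):
    """Assumed toy cumulative distribution on 0 <= m <= 1 (not a QCD result)."""
    return m * m

def sigma_gluon(m):
    """Casimir scaling: Sigma_g = Sigma_q ** (C_A / C_F)."""
    return sigma_quark(m) ** (C_A / C_F)

edges = [i / 1000.0 for i in range(1001)]
# Probability in each narrow bin, from differences of the cumulative distributions.
bin_q = [sigma_quark(hi) - sigma_quark(lo) for lo, hi in zip(edges, edges[1:])]
bin_g = [sigma_gluon(hi) - sigma_gluon(lo) for lo, hi in zip(edges, edges[1:])]

kappa_qg = min(q / g for q, g in zip(bin_q, bin_g))
kappa_gq = min(g / q for q, g in zip(bin_q, bin_g))
print(kappa_qg, C_F / C_A, kappa_gq)
```

The quark-over-gluon minimum sits at the upper end of the range and approaches 4/9 as the bins shrink, while the gluon-over-quark minimum sits at the lower end and tends to zero.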

SLIDE 37

Jet topics from QCD: Poisson scaling

Soft Drop Multiplicity (and other count observables) exhibits Poisson scaling at Leading Logarithmic accuracy:

$$p_q(n) = \text{Pois}(n;\, C_F \lambda), \qquad p_g(n) = \text{Pois}(n;\, C_A \lambda)$$

with $C_F = 4/3$ for quarks and $C_A = 3$ for gluons.

The quark/gluon reducibility factors at LL for any Poisson scaling observable are:

$$\kappa_{gq} = \min_n \frac{p_g(n)}{p_q(n)} = \min_n \frac{(C_A \lambda)^n e^{-C_A \lambda}}{(C_F \lambda)^n e^{-C_F \lambda}} = e^{\lambda (C_F - C_A)} \min_n \left(\frac{C_A}{C_F}\right)^n = e^{\lambda (C_F - C_A)}$$

$$\kappa_{qg} = \min_n \frac{p_q(n)}{p_g(n)} = \min_n \frac{(C_F \lambda)^n e^{-C_F \lambda}}{(C_A \lambda)^n e^{-C_A \lambda}} = e^{\lambda (C_A - C_F)} \min_n \left(\frac{C_F}{C_A}\right)^n = 0$$
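The Poisson-scaling reducibility factors can be verified directly from the Poisson probabilities. The rate parameter value 1.5 and the truncation at 40 counts are arbitrary assumptions chosen for illustration.

```python
import math

C_F, C_A = 4.0 / 3.0, 3.0
lam = 1.5  # assumed value of the common rate parameter

def poisson_pmf(n, mean):
    """Poisson probability of observing n counts for the given mean."""
    return math.exp(-mean) * mean ** n / math.factorial(n)

counts = range(40)  # truncated range of counts; the tail is negligible here
p_q = [poisson_pmf(n, C_F * lam) for n in counts]
p_g = [poisson_pmf(n, C_A * lam) for n in counts]

kappa_gq = min(g / q for q, g in zip(p_q, p_g))  # attained at n = 0
kappa_qg = min(q / g for q, g in zip(p_q, p_g))  # shrinks as n grows
print(kappa_gq, math.exp(lam * (C_F - C_A)), kappa_qg)
```

The gluon-over-quark ratio grows with the count, so its minimum is at zero counts and equals the exponential factor in the formula. The quark-over-gluon ratio falls geometrically, so its minimum goes to zero as the range of counts is extended.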

SLIDE 38

Exploring substructure feature spaces

  • Casimir scaling of mass and width is observed (gray).
  • Count observables come closer to saturating the bounds (black).
  • Lower bound easier to extract than upper (i.e. gluons are easy!).
  • Models CWoLa-trained. Fully data-driven.
  • Well-behaved likelihoods close to S/(S+B) expectation.
  • All different models manifest the same bounds.

SLIDE 39

Parton-labeled sample dependence in Pythia