

slide-1
SLIDE 1

A complete linear basis for (machine) learning jet substructure

Machine Learning for Jet Physics Workshop, 2017

Eric M. Metodiev

Center for Theoretical Physics, Massachusetts Institute of Technology

Based on work with Patrick T. Komiske and Jesse Thaler

December 12, 2017

slide-2
SLIDE 2

  • Energy Flow Polynomials (EFPs)
  • The Energy Flow Basis from IRC safety
  • Spanning Substructure with Linear Regression

Taming the (IRC-safe) Substructure Zoo: ECFs, Angularities, Planar Flow, ECFGs, Jet Mass, …

slide-3
SLIDE 3

Anatomy of an Energy Flow Polynomial:

In equations:

$$\mathrm{EFP}_G = \sum_{i_1=1}^{M} \sum_{i_2=1}^{M} \cdots \sum_{i_N=1}^{M} z_{i_1} z_{i_2} \cdots z_{i_N} \prod_{(k,l) \in G} \theta_{i_k i_l}$$

where $G$ is a multigraph.

In words:

  • Correlator: sum over all N-tuples of particles in the event
  • Energies: product of the N energy fractions
  • Angles: one $\theta_{i_k i_l}$ for each edge $(k,l) \in G$

Energy fraction $z_i$ and pairwise angular distance $\theta_{ij}$:

  • $e^+e^-$: $z_i = E_i / \sum_k E_k$, $\theta_{ij} = \left( 2 p_i^\mu p_{j\mu} / (E_i E_j) \right)^{\beta/2}$
  • Hadronic: $z_i = p_{Ti} / \sum_k p_{Tk}$, $\theta_{ij} = \left( \Delta y_{ij}^2 + \Delta\phi_{ij}^2 \right)^{\beta/2}$
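The hadronic definitions of the energy fraction and pairwise angular distance can be sketched in a few lines of NumPy. This is only an illustration: the function name `hadronic_measure`, the toy particle values, and the wrapping of the azimuthal difference are assumptions, not part of the talk.

```python
import numpy as np

def hadronic_measure(pt, y, phi, beta=2.0):
    """Return energy fractions z_i and pairwise distances theta_ij (hadronic measure)."""
    pt, y, phi = (np.asarray(a, dtype=float) for a in (pt, y, phi))
    z = pt / pt.sum()                      # z_i = pT_i / sum_k pT_k
    dy = y[:, None] - y[None, :]           # rapidity differences
    # Wrap azimuthal differences into [-pi, pi) (an assumption of this sketch).
    dphi = (phi[:, None] - phi[None, :] + np.pi) % (2 * np.pi) - np.pi
    theta = (dy**2 + dphi**2) ** (beta / 2)
    return z, theta

# Toy three-particle jet (assumed values).
z, theta = hadronic_measure([50.0, 30.0, 20.0], [0.1, -0.2, 0.0], [0.0, 0.3, -0.1])
```

With `beta=2`, `theta[i, j]` is simply the squared distance in the rapidity-azimuth plane, so it is symmetric with a zero diagonal.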

slide-4
SLIDE 4

Anatomy of an Energy Flow Polynomial:

In pictures: each vertex $j$ of the multigraph carries an energy fraction $z_{i_j}$, and each edge between vertices $k$ and $l$ carries an angular factor $\theta_{i_k i_l}$.

For example, the multigraph on vertices 1, 2, 3, 4 with edges (1,2), (2,3), (3,4) and a doubled edge (2,4) corresponds to

$$\sum_{i_1=1}^{M} \sum_{i_2=1}^{M} \sum_{i_3=1}^{M} \sum_{i_4=1}^{M} z_{i_1} z_{i_2} z_{i_3} z_{i_4} \, \theta_{i_1 i_2} \theta_{i_2 i_3} \theta_{i_3 i_4} \theta_{i_2 i_4}^2$$

(any index labelling works)
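The four-vertex example can be evaluated directly with `numpy.einsum`, using one subscript letter per vertex and one `theta` operand per edge. This is a brute-force sketch under assumed toy inputs (random particles, $\beta = 2$); the name `efp_example` is hypothetical and this is not the efficient algorithm referred to in the talk.

```python
import numpy as np

def efp_example(z, theta):
    """EFP of the 4-vertex example multigraph; letters a,b,c,d stand for i_1..i_4."""
    # One z per vertex; one theta per edge: (1,2), (2,3), (3,4), and (2,4) twice.
    return np.einsum("a,b,c,d,ab,bc,cd,bd,bd->",
                     z, z, z, z, theta, theta, theta, theta, theta)

# Toy event (assumed values): M particles at random (y, phi) positions, beta = 2.
rng = np.random.default_rng(0)
M = 6
z = rng.random(M)
z /= z.sum()
pos = rng.normal(scale=0.2, size=(M, 2))
theta = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=-1)

value = efp_example(z, theta)
```

Relabelling the vertices only permutes the subscript letters, which is why any index labelling gives the same value.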

slide-5
SLIDE 5

Multigraph/EFP Correspondence

Multigraph: EFP

  • Vertex $j$: energy fraction $z_{i_j}$
  • Edge $(k,l)$: angular factor $\theta_{i_k i_l}$
  • Number of vertices, $N$: N-particle correlator
  • Number of edges, $d$: degree of angular monomial
  • Treewidth + 1, $\chi$: optimal VE complexity
  • Connected: prime
  • Disconnected: composite

Example:

$$[\text{multigraph}] = \sum_{i_1=1}^{M} \sum_{i_2=1}^{M} \sum_{i_3=1}^{M} \sum_{i_4=1}^{M} z_{i_1} z_{i_2} z_{i_3} z_{i_4} \, \theta_{i_1 i_2} \theta_{i_2 i_3} \theta_{i_3 i_4} \theta_{i_2 i_4}^2$$

e.g. Tree graph EFPs are $O(M^2)$! Surprisingly efficient to compute. Stay tuned… See P. Komiske's talk.
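To illustrate why a tree graph can be evaluated in $O(M^2)$, consider the simplest nontrivial tree, a path on three vertices. Summing out the two leaf vertices first turns the triple sum into one matrix-vector product plus one dot product. This is a sketch of the general idea only; the function names and toy inputs are assumptions, and it is not presented as the authors' implementation.

```python
import numpy as np

def wedge_naive(z, theta):
    # Direct triple sum over (i1, i2, i3): O(M^3) terms.
    return np.einsum("a,b,c,ab,bc->", z, z, z, theta, theta)

def wedge_fast(z, theta):
    # Eliminate the two leaf indices first: s_b = sum_a theta_ba z_a costs O(M^2).
    s = theta @ z
    # The remaining sum over the central vertex costs O(M).
    return float(np.dot(z, s * s))

# Toy event (assumed values), beta = 2.
rng = np.random.default_rng(1)
M = 50
z = rng.random(M)
z /= z.sum()
pos = rng.normal(scale=0.2, size=(M, 2))
theta = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=-1)
```

Both functions return the same number; only the order in which the sums are carried out differs.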

slide-6
SLIDE 6

  • Energy Flow Polynomials (EFPs)
  • The Energy Flow Basis from IRC safety
  • Spanning Substructure with Linear Regression

Taming the (IRC-safe) Substructure Zoo: ECFs, Angularities, Planar Flow, ECFGs, Jet Mass, …

slide-7
SLIDE 7

IRC-safe Observable

EFPs linearly span IRC-safe observables


slide-8
SLIDE 8

EFPs linearly span IRC-safe observables

IRC-safe Observable

  • Energy Expansion: expand/approximate the observable in polynomials of the particle energies
  • IR safety: observable unchanged by the addition of an infinitesimally soft particle
  • Relabeling Symmetry: all ways of indexing particles are equivalent
  • C safety: observable unchanged by the collinear splitting of a particle

Energy correlators linearly span IRC-safe observables

New, direct argument from IRC safety. See also:

  • F. Tkachov, hep-ph/9601308
  • N. Sveshnikov and F. Tkachov, hep-ph/9512370

slide-9
SLIDE 9

EFPs linearly span IRC-safe observables

Continuing from the energy correlators, which linearly span IRC-safe observables:

  • Angular Expansion: expansion/approximation of the angular part of the correlators in pairwise angular distances
  • Analyze: identify the unique analytic structures that emerge as non-isomorphic multigraphs/EFPs

EFPs linearly span/approximate IRC-safe observables!

Similar expansions & emergent multigraphs in:

  • M. Hogervorst et al. arXiv:1409.1581
  • B. Henning et al. arXiv:1706.08520

slide-10
SLIDE 10

Organization of the basis

EFPs are truncated by angular degree d, the order of the angular expansion. Finite number at each order in d!

[Figure: all prime EFPs up to d=5]

Online Encyclopedia of Integer Sequences (OEIS):

  • A050535: # of multigraphs with d edges = # of EFPs of degree d
  • A076864: # of connected multigraphs with d edges = # of prime EFPs of degree d

Exactly 1000 EFPs up to degree d=7!

Image files for all of the prime EFP multigraphs up to d = 7 are available here.

slide-11
SLIDE 11

  • Energy Flow Polynomials (EFPs)
  • The Energy Flow Basis from IRC safety
  • Spanning Substructure with Linear Regression

Taming the (IRC-safe) Substructure Zoo: ECFs, Angularities, Planar Flow, ECFGs, Jet Mass, …

slide-12
SLIDE 12

Jet Observables with Energy Flow

Jet Mass: Dumbbell EFP

$$\frac{m_J^2}{p_{TJ}^2} = \sum_{i_1=1}^{M} \sum_{i_2=1}^{M} z_{i_1} z_{i_2} \left( \cosh \Delta y_{i_1 i_2} - \cos \Delta\phi_{i_1 i_2} \right) = \frac{1}{2} \, [\text{dumbbell multigraph}] + \cdots$$

Can include these using a fully general measure
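The jet mass relation can be checked numerically for massless particles. The sketch below assumes that $p_{TJ}$ denotes the scalar sum of the particle transverse momenta (so that it matches the normalization of $z_i$), uses made-up particle values, and takes $\beta = 2$ for the dumbbell EFP; none of these choices are stated explicitly on the slide.

```python
import numpy as np

# Toy narrow jet of massless particles (assumed values).
rng = np.random.default_rng(2)
M = 20
pt = rng.uniform(1.0, 50.0, size=M)
y = rng.uniform(-0.2, 0.2, size=M)
phi = rng.uniform(-0.2, 0.2, size=M)

# Massless four-momenta and the squared invariant mass of their sum.
E, px, py, pz = pt * np.cosh(y), pt * np.cos(phi), pt * np.sin(phi), pt * np.sinh(y)
m2 = E.sum() ** 2 - px.sum() ** 2 - py.sum() ** 2 - pz.sum() ** 2
lhs = m2 / pt.sum() ** 2            # assumes p_TJ = scalar sum of pT

# Right-hand side: double sum over z_i z_j (cosh dy - cos dphi).
z = pt / pt.sum()
dy = y[:, None] - y[None, :]
dphi = phi[:, None] - phi[None, :]
exact = np.einsum("a,b,ab->", z, z, np.cosh(dy) - np.cos(dphi))

# Leading term: one half of the dumbbell EFP with beta = 2.
dumbbell = np.einsum("a,b,ab->", z, z, dy**2 + dphi**2)
leading = 0.5 * dumbbell
```

Here `lhs` and `exact` agree to floating-point precision, while `leading` differs from them only by the higher-order terms of the expansion in the angular separations.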

slide-13
SLIDE 13

Jet Observables with Energy Flow

Jet Mass: Dumbbell EFP (see the previous slide)

Angularities: Star Graph EFPs

$$\lambda^{(\alpha)} = \sum_{i=1}^{M} z_i \theta_i^{\alpha}$$

using the pT-centroid axis.

The multigraph images in the following relations were lost in extraction:

  • $\lambda^{(4)} = [\text{multigraph}] - \frac{3}{4} [\text{multigraph}]$
  • $\lambda^{(6)} = [\text{multigraph}] - \frac{3}{2} [\text{multigraph}] + \frac{5}{8} [\text{multigraph}]$

(and so on, for all even angularities)

[C. Berger, T. Kucs, and G. Sterman, hep-ph/0303051] [S. Ellis, et al., arXiv:1001.0014] [L. Almeida, et al., arXiv:0807.0234] [A. Larkoski, J. Thaler, and W. Waalewijn, arXiv:1408.3122]
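The simplest even angularity, $\lambda^{(2)}$, shows how such relations arise. The short derivation below is a sketch under stated assumptions: $\beta = 2$, $\vec{x}_i = (y_i, \phi_i)$, the pT-centroid axis $\bar{x} = \sum_i z_i \vec{x}_i$, and $\sum_i z_i = 1$. The symbols $\vec{x}_i$ and $\bar{x}$ are introduced here for illustration.

```latex
\begin{align}
\frac{1}{2} \sum_{i,j} z_i z_j \, |\vec{x}_i - \vec{x}_j|^2
  &= \frac{1}{2} \sum_{i,j} z_i z_j \left( |\vec{x}_i|^2 + |\vec{x}_j|^2 - 2\, \vec{x}_i \cdot \vec{x}_j \right) \\
  &= \sum_i z_i |\vec{x}_i|^2 - |\bar{x}|^2 \\
  &= \sum_i z_i \, |\vec{x}_i - \bar{x}|^2 = \lambda^{(2)}
\end{align}
```

The left-hand side is one half of the dumbbell EFP, so $\lambda^{(2)}$ measured from the pT-centroid axis equals one half of the dumbbell EFP exactly.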

slide-14
SLIDE 14

Jet Observables with Energy Flow

Energy Correlation Functions: Complete Graph EFPs

$$e_N^{(\beta)} = \sum_{i_1=1}^{M} \sum_{i_2=1}^{M} \cdots \sum_{i_N=1}^{M} z_{i_1} z_{i_2} \cdots z_{i_N} \prod_{k<l \in \{1,\cdots,N\}} \theta_{i_k i_l}^{\beta}$$

$e_2^{(\beta)}$, $e_3^{(\beta)}$, $e_4^{(\beta)}$, $\cdots$ are each shown as the complete-graph multigraph on 2, 3, 4, … vertices (images lost in extraction), with measure choice of $\beta$.

[A. Larkoski, G. Salam, and J. Thaler, arXiv:1305.0007]

slide-15
SLIDE 15

Jet Observables with Energy Flow

Energy Correlation Functions: Complete Graph EFPs (see the previous slide)

Geometric Moments: Higher dumbbell EFPs

$$\mathbf{C} = \sum_{i=1}^{M} z_i \begin{pmatrix} \Delta y_i^2 & \Delta y_i \Delta\phi_i \\ \Delta\phi_i \Delta y_i & \Delta\phi_i^2 \end{pmatrix}$$

using the pT-centroid axis.

The multigraph images in the following relations were lost in extraction:

  • $\mathrm{tr}\, \mathbf{C} = \frac{1}{2} [\text{multigraph}]$
  • $\det \mathbf{C} = [\text{multigraph}] - \frac{1}{2} [\text{multigraph}]$
  • e.g. $\mathrm{Pf} = \dfrac{4 \det \mathbf{C}}{(\mathrm{tr}\, \mathbf{C})^2}$

[A. Larkoski, G. Salam, and J. Thaler, arXiv:1305.0007] [L. Almeida, et al., arXiv:0807.0234] [J. Gallicchio and M. Schwartz, arXiv:1211.7038] [J. Thaler and L.-T. Wang, arXiv:0806.0023]
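The geometric moment matrix and the ratio Pf can be computed in a few lines. The sketch below uses made-up particles, measures $\Delta y_i$ and $\Delta\phi_i$ from the pT-weighted centroid, and also checks numerically that the trace equals one half of the dumbbell EFP with $\beta = 2$; variable names are illustrative assumptions.

```python
import numpy as np

# Toy jet (assumed values): energy fractions and (y, phi) positions.
rng = np.random.default_rng(3)
M = 15
z = rng.random(M)
z /= z.sum()
pos = rng.normal(scale=0.2, size=(M, 2))       # columns: (y, phi)

centroid = z @ pos                              # pT-weighted centroid axis
d = pos - centroid                              # rows: (dy_i, dphi_i)
C = np.einsum("i,ia,ib->ab", z, d, d)           # 2x2 geometric moment matrix

tr_C = np.trace(C)
det_C = np.linalg.det(C)
planar_flow = 4.0 * det_C / tr_C**2             # Pf = 4 det C / (tr C)^2

# Dumbbell EFP with beta = 2, for comparison with the trace.
theta = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=-1)
dumbbell = np.einsum("a,b,ab->", z, z, theta)
```

Because `C` is a symmetric positive semi-definite 2x2 matrix, `planar_flow` lies between 0 and 1.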

slide-16
SLIDE 16

  • Energy Flow Polynomials (EFPs)
  • The Energy Flow Basis from IRC safety
  • Spanning Substructure with Linear Regression

Taming the (IRC-safe) Substructure Zoo: ECFs, Angularities, Planar Flow, ECFGs, Jet Mass, …

slide-17
SLIDE 17

Linear Models and Energy Flow

Linear methods:

  • Utilize the linear completeness of the Energy Flow basis.
  • Convex and few/no hyperparameters to tune.
  • Achieve global optimum via closed-form solution or convergent iteration.
  • Simple models with the minimum number of parameters/input.

$$S = \sum_{G} w_G \, \mathrm{EFP}_G$$

Machine learn these: the coefficients $w_G$.

Rich in tools and applications:

First few chapters of C. Bishop's Pattern Recognition and Machine Learning:

See P. Komiske's talk. This talk.
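As a reminder of what "closed-form solution" means for linear regression, the ordinary least-squares weights can be written explicitly. In this sketch the design matrix $X$ (one row per jet $n$, one column per multigraph $G$), the target vector $y$, and the assumption that $X^{\top} X$ is invertible are introduced here for illustration and do not appear on the slide.

```latex
\hat{w} = \arg\min_{w} \lVert X w - y \rVert_2^2 = (X^{\top} X)^{-1} X^{\top} y,
\qquad X_{nG} = \mathrm{EFP}_G(\text{jet } n)
```

The objective is convex in $w$, so this stationary point is the global optimum.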

slide-18
SLIDE 18

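Confirming Analytic Relationships with Regression

The multigraph images in the following relations were lost in extraction:

  • $\lambda^{(2)} = \frac{1}{2} [\text{multigraph}]$
  • $\lambda^{(4)} = [\text{multigraph}] - \frac{3}{4} [\text{multigraph}]$
  • $\lambda^{(6)} = [\text{multigraph}] - \frac{3}{2} [\text{multigraph}] + \frac{5}{8} [\text{multigraph}]$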
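A toy version of this check can be run on random "jets": regress $\lambda^{(2)}$ (measured from the pT-centroid axis) on a small set of EFPs and read off the fitted coefficients. The sketch below is an assumption-laden illustration, not the study shown in the talk: the jets are random points, $\beta = 2$, and only two features are used (the dumbbell, plus the three-vertex path graph as a distractor).

```python
import numpy as np

rng = np.random.default_rng(4)

def features_and_target(M):
    """Return ([dumbbell, wedge] EFPs, lambda^(2)) for one random toy jet."""
    z = rng.random(M)
    z /= z.sum()
    pos = rng.normal(scale=0.3, size=(M, 2))                          # (y, phi)
    theta = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=-1)   # beta = 2
    dumbbell = np.einsum("a,b,ab->", z, z, theta)
    wedge = np.einsum("a,b,c,ab,bc->", z, z, z, theta, theta)
    d = pos - z @ pos                                                 # offsets from centroid
    lam2 = np.dot(z, (d**2).sum(axis=-1))
    return [dumbbell, wedge], lam2

# 200 toy jets with varying multiplicity.
rows, targets = zip(*(features_and_target(int(rng.integers(5, 20))) for _ in range(200)))
X = np.array(rows)
y = np.array(targets)

# Ordinary least squares without an intercept.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The fitted weight on the dumbbell feature comes out at one half and the weight on the distractor at zero, up to floating-point error, matching the analytic relation for $\lambda^{(2)}$.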

slide-19
SLIDE 19

Linear Regression and IRC-safety

  • $m_J / p_{TJ}$: IRC safe. No Taylor expansion due to square root.
  • $\lambda^{(\alpha=1/2)}$: IRC safe. No simple analytic relationship.
  • $\tau_2$: IRC safe. Algorithmically defined.
  • $\tau_{21}$: Sudakov safe. Safe for 2-prong jets and higher.
  • $\tau_{32}$: Sudakov safe. Safe for 3-prong jets and higher.
  • Multiplicity: IRC unsafe.

[Figure: panels for QCD Jets (1 prong), W Jets (2 prong), Top Jets (3 prong). Expected to be IRC safe = Solid. Expected to be IRC unsafe = Dashed.]

[A. Larkoski, S. Marzani, and J. Thaler, arXiv:1502.01719]

slide-20
SLIDE 20

Conclusions

EFPs form a complete, linear representation of the jet

  • EFPs are energy correlators with monomial angular structure
  • Encompass many existing classes of expert variables
  • Opens the door to using linear methods for jet substructure
  • IRC-unsafe information? Combine!
  • Use EFPs & linearity to reduce radiation pattern to a single optimal observable

(Linear) Learning is easy

  • Linear models are convex & even closed-form at times
  • Few or no hyperparameters to tune at all
  • Guaranteed global optima
slide-21
SLIDE 21

  • Energy Flow Polynomials (EFPs)
  • The Energy Flow Basis from IRC safety
  • Spanning Substructure with Linear Regression

Taming the (IRC-safe) Substructure Zoo: ECFs, Angularities, Planar Flow, ECFGs, Jet Mass, …

The End