SLIDE 1

(Machine) Learning to Remove Pileup at the LHC

BSM/LHC/DM Journal Club Eric M. Metodiev

Center for Theoretical Physics, Massachusetts Institute of Technology

Based on: Patrick T. Komiske, EMM, Benjamin Nachman, Matthew D. Schwartz, arXiv:1707.08600

September 8, 2017

Eric M. Metodiev (MIT) PUMML September 8, 2017 1 / 25

SLIDE 2

Overview

Pileup
Jet Images
Pileup Mitigation with Machine Learning (PUMML)
Performance and Robustness
What is being learned?

SLIDE 3

Pileup

SLIDE 4

Pileup

Pileup problem in context

Presently: ∼20 pileup vertices per bunch crossing
Run 3: ∼80 pileup vertices per bunch crossing
HL-LHC: ∼200 pileup vertices per bunch crossing

SLIDE 5

Pileup

Pileup pT is roughly uniform in pseudorapidity and azimuth.
Charged particles with pT > 500 MeV can be identified as pileup from tracking.
The problem is thus to predict the neutral leading vertex (LV) pT.

SLIDE 6

Mitigation Approaches

Pileup Per Particle Identification (PUPPI)

Bertolini, Harris, Low, and Tran, arXiv:1407.6013
Correct particle/calorimeter energies based on the surrounding charged pileup distribution.

SoftKiller

Cacciari, Salam, Soyez, arXiv:1407.0408
Dynamically determined transverse momentum cut.

Jet Cleansing

Krohn, Low, Schwartz, Wang, arXiv:1309.4777
Rescale subjet four-momenta using charged leading vertex/pileup information.

Default parameters were used for each method to give a sense of performance.
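As a rough illustration of what a "dynamically determined" cut can look like, the sketch below follows the SoftKiller idea of tiling the event into patches and taking the median, over patches, of the hardest particle pT in each patch. The function name, the `eta_max` and `grid_size` defaults, and the assumption that phi lies in [-pi, pi) are illustrative choices here, not the reference implementation.

```python
import numpy as np


def softkiller_cut(eta, phi, pt, eta_max=2.5, grid_size=0.4):
    """Median, over grid patches, of the hardest particle pT in each patch."""
    eta, phi, pt = (np.asarray(a, dtype=float) for a in (eta, phi, pt))
    n_eta = max(1, int(round(2 * eta_max / grid_size)))
    n_phi = max(1, int(round(2 * np.pi / grid_size)))

    # Patch indices for each particle; phi is assumed to lie in [-pi, pi).
    i = np.floor((eta + eta_max) / (2 * eta_max) * n_eta).astype(int)
    j = np.floor((phi + np.pi) / (2 * np.pi) * n_phi).astype(int)
    inside = (i >= 0) & (i < n_eta) & (j >= 0) & (j < n_phi)

    # Hardest pT in each patch; empty patches keep the value 0.
    max_pt = np.zeros((n_eta, n_phi))
    np.maximum.at(max_pt, (i[inside], j[inside]), pt[inside])
    return float(np.median(max_pt))
```

Because the cut is a median over patches, it rises automatically as pileup fills more of the event, which is the sense in which it is dynamic.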

SLIDE 7

Machine Learning?

How to input the information?

The spirit is to organize all of our available local information.
We have information on whether charged particles are pileup or not.
Need low-level inputs.

What sort of architecture?

Use tools from modern machine learning.
Don’t necessarily have to go “deep”.

What sort of loss function?

SLIDE 8

Jet Images

Treat the detector as a camera and energy deposits as pixel intensities.

Cogan, Kagan, Strauss, Schwartzman. arXiv:1407.5675

Make use of the extensively developed computer vision technology, such as convolutional neural nets.

de Oliveira, Kagan, Mackey, Nachman, Schwartzman. arXiv:1511.05190

SLIDE 9

Modern ML in HEP

An overview of recent machine learning applications with jet images.

Classification

W vs QCD jets. (de Oliveira, Kagan, Mackey, Nachman, Schwartzman. arXiv:1511.05190)
Top vs QCD jets. (Kasieczka, Plehn, Russell, Schell. arXiv:1701.08784)
Quark vs Gluon jets. (Komiske, EMM, Schwartz. arXiv:1612.01551)
And more...

Generation

Generative model. (de Oliveira, Paganini, Nachman. arXiv:1701.05927)

Regression

This work.

SLIDE 10

Our Model

Inputs: three-channel RGB “pileup image”

red = pT of all neutrals
green = pT of charged PU
blue = pT of charged LV

Output: neutral image

output = pT of neutral LV
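A minimal sketch of how such a three-channel image could be assembled from a list of particles is given below. The function name, argument names, and the single shared grid (`half_width`, `n_pixels`) are assumptions made for illustration; the actual study uses different resolutions for charged and neutral channels.

```python
import numpy as np


def pileup_image(eta, phi, pt, is_charged, is_pileup, half_width=0.45, n_pixels=9):
    """Return an (n_pixels, n_pixels, 3) image with pT summed per pixel.

    Channel 0 (red):   all neutral particles (LV and pileup cannot be told apart).
    Channel 1 (green): charged particles tagged as pileup.
    Channel 2 (blue):  charged particles tagged as leading vertex.
    """
    eta, phi, pt = (np.asarray(a, dtype=float) for a in (eta, phi, pt))
    is_charged = np.asarray(is_charged, dtype=bool)
    is_pileup = np.asarray(is_pileup, dtype=bool)
    edges = np.linspace(-half_width, half_width, n_pixels + 1)

    def channel(mask):
        # Sum the pT of the selected particles in each (eta, phi) pixel.
        image, _, _ = np.histogram2d(
            eta[mask], phi[mask], bins=[edges, edges], weights=pt[mask]
        )
        return image

    red = channel(~is_charged)
    green = channel(is_charged & is_pileup)
    blue = channel(is_charged & ~is_pileup)
    return np.stack([red, green, blue], axis=-1)
```

The pileup flag is only used for charged particles, mirroring the fact that vertex information comes from tracking.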

SLIDE 11

Our Study

Process

Leading vertex: 500 GeV scalar to dijets with Pythia8
R = 0.4 anti-kT jets in |η| < 2 with pT > 100 GeV.
Pileup: soft QCD events overlaid, with a Poissonian number of pileup vertices, NPU = 140.

Image parameters:

Charged jet image pixel resolution: ∆η × ∆φ = 0.025 × 0.025
Neutral jet image pixel resolution: ∆η × ∆φ = 0.1 × 0.1
Jet image size: 0.9 × 0.9
Leading vertex/pileup information for charged particles with pT > 500 MeV
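The image parameters above fix the pixel counts: a 0.9 × 0.9 image has 36 × 36 charged pixels and 9 × 9 neutral pixels, so each neutral pixel covers a 4 × 4 block of charged pixels. The sketch below works through that arithmetic. The `upsample_neutral` helper is a hypothetical way of putting the two resolutions on a common grid and is not taken from the slides.

```python
import numpy as np

IMAGE_SIZE = 0.9
CHARGED_PIXEL = 0.025
NEUTRAL_PIXEL = 0.1

n_charged = round(IMAGE_SIZE / CHARGED_PIXEL)  # charged pixels per side
n_neutral = round(IMAGE_SIZE / NEUTRAL_PIXEL)  # neutral pixels per side
factor = n_charged // n_neutral                # charged pixels per neutral pixel, per side


def upsample_neutral(neutral_image, factor):
    """Spread each coarse pixel's pT evenly over a factor x factor block of fine pixels."""
    fine = np.kron(neutral_image, np.ones((factor, factor)))
    return fine / factor**2  # dividing keeps the total pT unchanged
```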

SLIDE 12

Architecture

What sort of neural network layers should we use?
Dense: Units connected to every input pixel with different weights
Locally connected: Units connected to local input patches with different weights
Convolutional: Units connected to local input patches with weight sharing
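To make the trade-off between the three layer types concrete, the sketch below counts weights (ignoring biases) for a single layer of each kind. The sizes used (a 36 × 36 input with 3 channels, 10 output feature maps, 6 × 6 patches, and an output with the same spatial size as the input) are illustrative assumptions.

```python
def dense_weights(height, width, c_in, c_out):
    """Every output unit sees every input pixel, each with its own weight."""
    n_in = height * width * c_in
    n_out = height * width * c_out
    return n_in * n_out


def locally_connected_weights(height, width, c_in, c_out, kernel):
    """Every output position has its own kernel over a local patch."""
    return height * width * kernel * kernel * c_in * c_out


def convolutional_weights(c_in, c_out, kernel):
    """One kernel per (input channel, output channel) pair, shared across positions."""
    return kernel * kernel * c_in * c_out


H = W = 36
C_IN, C_OUT, K = 3, 10, 6

n_dense = dense_weights(H, W, C_IN, C_OUT)
n_local = locally_connected_weights(H, W, C_IN, C_OUT, K)
n_conv = convolutional_weights(C_IN, C_OUT, K)
print(n_dense, n_local, n_conv)
```

Weight sharing is what removes the dependence on image size from the convolutional count.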

SLIDE 13

Architecture

Architecture: Two convolutional layers

6 × 6 filter sizes
10 filters per layer
Only 4711 parameters

Architecture is local:

Pileup removal of a pixel depends only on the information in a window around it
Can apply the trained model at the event level, jet level, or on any specified region
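The quoted parameter count can be reproduced with a short calculation. The sketch assumes three input channels, two 6 × 6 convolutional layers with 10 filters each (weights plus one bias per filter), and a final 1 × 1 readout that maps the 10 feature maps to the single output channel. The readout layer is an assumption: it is one reading of the architecture that yields 4711, not something stated on the slide.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus one bias per output filter for a square convolution."""
    return kernel * kernel * c_in * c_out + c_out


layer1 = conv_params(kernel=6, c_in=3, c_out=10)   # three-channel image -> 10 feature maps
layer2 = conv_params(kernel=6, c_in=10, c_out=10)  # 10 feature maps -> 10 feature maps
readout = conv_params(kernel=1, c_in=10, c_out=1)  # assumed 1 x 1 readout to one channel

total = layer1 + layer2 + readout
print(layer1, layer2, readout, total)
```

None of these terms depends on the image size, which is consistent with applying the same trained model to a jet, an event, or any other region.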

SLIDE 14

PUMML Framework

SLIDE 15

Subtracted Jets

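An example event with pileup, and the same event after subtraction with each method.

Loss function: Should we treat all pT errors equally or penalize hard/soft errors more?

$$\ell = \left\langle \left[ \log\left( \frac{p_T^{(\mathrm{pred})} + \bar{p}}{p_T^{(\mathrm{true})} + \bar{p}} \right) \right]^2 \right\rangle,$$

with $\bar{p} \to 0$ favoring soft pixels and $\bar{p} \to \infty$ favoring all pT equally.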
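The role of p̄ in this loss can be checked numerically. The sketch below reads the loss as the mean over pixels of the squared log of (pT_pred + p̄)/(pT_true + p̄). The function name and the toy pixel values are illustrative assumptions.

```python
import numpy as np


def log_ratio_loss(p_pred, p_true, p_bar):
    """Mean over pixels of log((p_pred + p_bar) / (p_true + p_bar)) squared."""
    p_pred = np.asarray(p_pred, dtype=float)
    p_true = np.asarray(p_true, dtype=float)
    log_ratio = np.log((p_pred + p_bar) / (p_true + p_bar))
    return float(np.mean(log_ratio**2))


# Same absolute error (1 GeV) on a soft pixel and on a hard pixel.
soft = log_ratio_loss([2.0], [1.0], p_bar=1e-3)
hard = log_ratio_loss([101.0], [100.0], p_bar=1e-3)
print(soft, hard)  # the soft-pixel error is penalized far more when p_bar is small

# For large p_bar the loss approaches the mean squared error divided by p_bar**2.
p_bar = 1e6
scaled = log_ratio_loss([2.0, 101.0], [1.0, 100.0], p_bar) * p_bar**2
print(scaled)
```

For large p̄ the logarithm linearizes to (pT_pred − pT_true)/p̄, which is why every pixel is then weighted by its absolute error alone.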

SLIDE 16

Subtracted Observables

Distributions before and after subtraction of jet pT and dijet mass.

SLIDE 17

Subtracted Observables

Distributions before and after subtraction of jet mass and N95.

SLIDE 18

Subtracted Observables

Distributions before and after subtraction of two energy correlation functions.

SLIDE 19

Model Robustness

Figure: Jet mass correlation coefficient versus NPU, comparing PUMML trained on NPU=20, PUMML trained on NPU=140, PUPPI, and SoftKiller.

Study robustness to pileup by training and testing with different NPU.

Figure: Jet mass correlation coefficient versus mφ (GeV), comparing PUMML (mφ = 200 GeV), PUMML (mφ = 2000 GeV), PUPPI, and SoftKiller.

Study robustness to the process by training and testing with different mφ.
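A minimal sketch of the robustness metric follows, assuming the jet mass correlation coefficient is the Pearson linear correlation between the true (leading-vertex-only) jet mass and the pileup-corrected jet mass. The function and argument names are illustrative.

```python
import numpy as np


def jet_mass_correlation(true_mass, corrected_mass):
    """Pearson correlation between true and pileup-corrected jet masses."""
    true_mass = np.asarray(true_mass, dtype=float)
    corrected_mass = np.asarray(corrected_mass, dtype=float)
    return float(np.corrcoef(true_mass, corrected_mass)[0, 1])
```

A value of 1 means the corrected mass tracks the true mass perfectly up to a linear rescaling, so the metric measures event-by-event resolution rather than an overall bias.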

SLIDE 20

What is being learned?

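Train a single 12 × 12 filter and inspect it.
Pixel-wise, PUMML learns:

$$p_{T,\,\mathrm{PUMML}}^{N,\mathrm{LV}} \approx p_T^{N,\mathrm{tot}} - \beta\, p_T^{C,\mathrm{PU}}$$

This is of the same parametric form as Linear Cleansing!

$$p_{T,\,\mathrm{Linear\ Cleansing}}^{N,\mathrm{LV}} = p_T^{N,\mathrm{tot}} + \left(1 - \frac{1}{\bar{\gamma}_0}\right) p_T^{C,\mathrm{PU}}$$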
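Matching the two expressions term by term gives β = 1/γ̄0 − 1. The sketch below checks that numerically on toy pixel values. The function names, the toy arrays, and the value γ̄0 = 0.6 are illustrative assumptions.

```python
import numpy as np


def beta_from_gamma0(gamma0_bar):
    """Coefficient beta for which the PUMML-style form equals Linear Cleansing."""
    return 1.0 / gamma0_bar - 1.0


def linear_subtract(neutral_total, charged_pileup, beta):
    """Per-pixel estimate: neutral LV pT = total neutral pT - beta * charged pileup pT."""
    return np.asarray(neutral_total, dtype=float) - beta * np.asarray(charged_pileup, dtype=float)


def linear_cleansing(neutral_total, charged_pileup, gamma0_bar):
    """Per-pixel estimate written in the Linear Cleansing form."""
    neutral_total = np.asarray(neutral_total, dtype=float)
    charged_pileup = np.asarray(charged_pileup, dtype=float)
    return neutral_total + (1.0 - 1.0 / gamma0_bar) * charged_pileup


neutral_total = np.array([4.0, 1.5, 0.0, 9.0])   # toy pixel values in GeV
charged_pileup = np.array([2.0, 0.5, 0.0, 3.0])  # toy pixel values in GeV
gamma0_bar = 0.6                                 # illustrative value only

a = linear_subtract(neutral_total, charged_pileup, beta_from_gamma0(gamma0_bar))
b = linear_cleansing(neutral_total, charged_pileup, gamma0_bar)
print(np.allclose(a, b))
```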

SLIDE 21

What is being learned?

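$$p_{T,\,\mathrm{PUMML}}^{N,\mathrm{LV}} \approx p_T^{N,\mathrm{tot}} - \beta\, p_T^{C,\mathrm{PU}}$$

Robust as NPU → 0 despite training on NPU=140.
Can we understand PUMML’s β? It depends on your loss function:

$$\ell = \left| p_{T,\mathrm{True}}^{N,\mathrm{LV}} - p_{T,\mathrm{Pred}}^{N,\mathrm{LV}} \right| \;\longrightarrow\; \beta^* = \frac{\left\langle p_T^{N,\mathrm{PU}} \right\rangle}{\left\langle p_T^{C,\mathrm{PU}} \right\rangle}$$

$$\ell = \left( p_{T,\mathrm{True}}^{N,\mathrm{LV}} - p_{T,\mathrm{Pred}}^{N,\mathrm{LV}} \right)^2 \;\longrightarrow\; \beta^* = \frac{\left\langle p_T^{N,\mathrm{PU}}\, p_T^{C,\mathrm{PU}} \right\rangle}{\left\langle \left( p_T^{C,\mathrm{PU}} \right)^2 \right\rangle}$$

Thinking about what PUMML learned suggested including charged/neutral PU correlations in the subtractor.
This perspective could extend Jet Cleansing in interesting ways.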
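The squared-error case can be checked numerically. Since the true neutral LV pT is the total neutral pT minus the neutral pileup pT, the residual of the linear subtractor is β pC,PU − pN,PU, and minimizing its mean square gives β* = ⟨pN,PU pC,PU⟩ / ⟨(pC,PU)²⟩, which is the least-squares slope through the origin. The sketch below verifies this on toy data. The distributions and the 0.7 slope are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy per-pixel pileup pT values (hypothetical distributions, for illustration only).
charged_pu = rng.gamma(shape=2.0, scale=1.0, size=10_000)
neutral_pu = 0.7 * charged_pu + rng.normal(0.0, 0.3, size=charged_pu.size)


def mse(beta):
    """Mean squared residual of the linear subtractor for a given beta."""
    return float(np.mean((neutral_pu - beta * charged_pu) ** 2))


# Closed form that includes the charged/neutral pileup correlation.
beta_formula = float(np.mean(neutral_pu * charged_pu) / np.mean(charged_pu**2))

# Independent check: least-squares fit of neutral_pu = beta * charged_pu.
beta_lstsq = float(np.linalg.lstsq(charged_pu[:, None], neutral_pu, rcond=None)[0][0])

print(beta_formula, beta_lstsq)
```

Unlike a ratio of averages, this β* depends on how charged and neutral pileup fluctuate together pixel by pixel.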

SLIDE 22

What is being learned?

Figure: PUMML Parameter Space. Axes: Number of Filters and Number of Layers. Labeled regions: Linear Cleansing, Non-Linear Cleansing, PUPPI.

SLIDE 23

Learning from Data

Training on simulation risks mis-modelling issues.
Prefer to train on data rather than simulation.

A data overlay approach using minimum bias and zero-bias events is already used by experimental groups in other contexts.
This is promising for training PUMML directly with data for the relevant application.

SLIDE 24

Concluding Remarks

We have developed an ML framework that successfully organizes all of the available local information to directly learn to mitigate pileup.
Can use tools from modern machine learning without going “deep”.
Thinking about what the machine is learning may teach us something.
Pileup mitigation can be a good proving ground for modern machine learning techniques in high energy physics.

SLIDE 25

The End

Thank You!
