Anomalies Detection for HEP Experiments

SLIDE 1

Maxim Borisyak, Denis Derkach, Fedor Ratnikov, Andrey Ustyuzhanin

HEP/ML Group, Yandex School of Data Analysis, National Research University Higher School of Economics, with incredible help from CMS colleagues

Anomalies Detection for HEP Experiments

SLIDE 2

Fedor.Ratnikov@cern.ch Anomalies in High Energy Physics

Content

◊ Yandex
◊ Supervised anomalies detection
◊ Decomposition of anomalies by source
◊ Rare Anomalies

SLIDE 3

Yandex in Wikipedia


IT resources: ∼ 10%×

SLIDE 4

Yandex in HEP

◊ Member of CERN OpenLab
◊ Member of LHCb Collaboration
  ◊ trigger
  ◊ B-tagging
  ◊ monitoring
  ◊ anomalies detection
  ◊ computing resources
◊ Member of SHiP Collaboration
  ◊ detector optimisation
  ◊ computing resources
◊ Cooperating with CMS Collaboration
  ◊ data certification
◊ Cooperating with ATLAS Collaboration
  ◊ GRID optimisation
◊ Contributing to other Particle Physics experiments beyond CERN

SLIDE 5

Levels of Data Quality Monitoring

◊ Detector Level

◊ hit maps, occupancies…

◊ Routine Physics Level

◊ basic physics objects: hadrons, leptons, photons…

◊ Physics Candles

◊ J/ψ, Z, W, top, …

SLIDE 6

Formalising the Problem

Use Routine Physics operation level

◊ Continuously supervised learning approach
  ◊ we have historical data processed by experts
  ◊ experts classified the data as “good” or “bad”
  ◊ the system learns typical patterns
  ◊ it establishes a procedure to split data samples into “black” (definitely bad), “white” (definitely good), and “grey” (expert intervention needed) zones
    ◊ “definitely bad” ≡ FalsePositive < cut_bad
    ◊ “definitely good” ≡ FalseNegative < cut_good
  ◊ let the system classify the “black” and “white” domains, and pass the “grey” domain to experts for a decision

As new data arrive, the supervisor continues labelling the complicated cases

Ultimate goal: take the burden of routine classification off the experts and let them deal with non-trivial cases

SLIDE 7

Practical Approach

◊ CMS 2010B run open data
  ◊ http://opendata.cern.ch/record/8
  ◊ Streams: MinimalBias, Muons, Photons
◊ LumiSections (minimal chunk of data defined in metadata) are labelled as “good” or “bad” by the experiment
◊ Objects: Particle Flow Jets, Calorimeter Jets, Photons, Muons
  ◊ (pT, η, φ, Vxyz, mass) for 5 particles at quantiles in pT
  ◊ 7 features for every variable:
    ◊ quantiles: 0, 0.25, 0.5, 0.75, 1 + mean + variance
    ◊ over objects of all events in the given LumiSection
◊ ∼2500 features describing every LumiSection
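
A minimal sketch of the per-variable summary described above: five quantiles plus mean and variance, computed over values pooled from all events in one LumiSection. The function names, the dictionary layout and the zero-filling of empty inputs are assumptions made for illustration. The selection of 5 particles per object type is not reproduced here.

```python
import numpy as np

# Quantile levels listed on the slide: 0, 0.25, 0.5, 0.75, 1
QUANTILES = (0.0, 0.25, 0.5, 0.75, 1.0)


def summarise_variable(values):
    """Return 7 features for one variable: 5 quantiles, mean, variance."""
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        # Assumption: an empty LumiSection channel is encoded as zeros.
        return np.zeros(len(QUANTILES) + 2)
    quantiles = np.quantile(values, QUANTILES)
    return np.concatenate([quantiles, [values.mean(), values.var()]])


def lumisection_features(variables):
    """Concatenate the 7 summary features of every variable.

    `variables` is a hypothetical dict mapping a variable name
    (e.g. "muon_pt") to the values pooled over all events of one
    LumiSection. Names are sorted so the feature order is stable.
    """
    return np.concatenate(
        [summarise_variable(variables[name]) for name in sorted(variables)]
    )
```

With this layout, every additional variable contributes exactly 7 entries to the feature vector of a LumiSection.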

SLIDE 8

Reference Performance

◊ Use the training part of all available data to train the classifier
  ◊ ultimate best case scenario
◊ Analyse the test part of the data
  ◊ get the probability for the given LS to be “good”
  ◊ select two probability thresholds: Cut_bad, Cut_good
◊ define three zones
  ◊ “black zone” - LS is definitely bad
  ◊ “white zone” - LS is definitely good
  ◊ “grey zone” - classifier is in doubt, expert decision is needed
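
The three zones can be sketched as a simple thresholding of the classifier output. The function name `assign_zones` and the handling of scores that fall exactly on a threshold (sent to the grey zone) are assumptions, not part of the original study.

```python
import numpy as np


def assign_zones(p_good, cut_bad, cut_good):
    """Map the probability of being "good" to black / grey / white zones.

    p_good   : predicted probabilities that each LumiSection is good
    cut_bad  : below this value the LumiSection is "definitely bad"
    cut_good : above this value the LumiSection is "definitely good"
    Scores between the two cuts (inclusive) are left for an expert.
    """
    if not 0.0 <= cut_bad <= cut_good <= 1.0:
        raise ValueError("expected 0 <= cut_bad <= cut_good <= 1")
    p_good = np.asarray(p_good, dtype=float)
    return np.where(
        p_good < cut_bad,
        "black",
        np.where(p_good > cut_good, "white", "grey"),
    )
```

For example, `assign_zones([0.05, 0.5, 0.99], cut_bad=0.1, cut_good=0.9)` places the three LumiSections in the black, grey and white zones respectively.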

[Figure: probability axis divided by Cut “bad” and Cut “good” into black zone, grey zone and white zone; labels: expert decision, automatic decision]

SLIDE 9

Performance

◊ Loss Rate
  ◊ a “good” LS is classified as “definitely bad” and is thus lost for physics
  ◊ LR = FN(“black”) / (TP(“white”) + FN(“black”))
◊ Pollution Rate
  ◊ a “bad” LS is classified as “definitely good” and thus pollutes certified data
  ◊ PR = FP(“white”) / (TP(“black”) + FP(“white”))
◊ Rejection Rate
  ◊ fraction of all LS which are not automatically classified as “definitely bad” or “definitely good”
  ◊ RR = (“grey”) / (“black” + “grey” + “white”)
  ◊ ManualWork = RejectionRate
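
The three rates above can be computed from zone assignments and expert labels as in the sketch below. It reads the formulas as counts of good and bad LumiSections in the black and white zones. The function name, the string zone labels and the convention of returning 0 for an empty denominator are assumptions.

```python
import numpy as np


def zone_rates(is_good, zones):
    """Loss, pollution and rejection rates for a zone assignment.

    is_good : expert labels, True for "good" LumiSections
    zones   : "black", "grey" or "white" for each LumiSection
    """
    is_good = np.asarray(is_good, dtype=bool)
    zones = np.asarray(zones)
    black, white, grey = zones == "black", zones == "white", zones == "grey"

    good_in_black = int(np.sum(is_good & black))   # good data lost
    good_in_white = int(np.sum(is_good & white))   # good data kept
    bad_in_white = int(np.sum(~is_good & white))   # bad data certified
    bad_in_black = int(np.sum(~is_good & black))   # bad data rejected

    def ratio(numerator, denominator):
        # Assumption: report 0.0 when the denominator is empty.
        return numerator / denominator if denominator else 0.0

    return {
        "loss_rate": ratio(good_in_black, good_in_white + good_in_black),
        "pollution_rate": ratio(bad_in_white, bad_in_black + bad_in_white),
        "rejection_rate": ratio(int(np.sum(grey)), zones.size),
    }
```

Scanning pairs of thresholds and keeping those with loss and pollution rates below a target gives the achievable rejection rate, that is, the remaining manual work.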


∼80% saving on manual work is feasible for PR and LR at 5‰

SLIDE 10

Decomposing Anomalies

◊ Study effect of anomalies on individual channels
  ◊ what channels are responsible for anomalies?
  ◊ if only photons are affected, may muon data still be used?
  ◊ which plots should receive more attention from Data Quality experts?
◊ Decomposition of Channels
  ◊ build a separate NN for every channel
  ◊ the corresponding NN scores each channel
  ◊ connect the networks by
    ◊ “min” operator with dropout
    ◊ exp(Σ_i (f_i − 1)), with f_i the score of subnetwork i: a kind of “fuzzy AND”
  ◊ train the network to approximate a global score
  ◊ each individual NN has high predictive power against anomalies within the corresponding channel
    ◊ this can be proven mathematically under assumptions that are reasonable for our case
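
A small sketch of the two ways of connecting the subnetworks, assuming the “fuzzy AND” on this slide is the exponential of the sum of (score − 1) over subnetworks. The function names are hypothetical and the dropout applied to the “min” operator is omitted.

```python
import numpy as np


def hard_and(scores):
    """Combine subnetwork scores with the "min" operator (no dropout)."""
    return np.min(np.asarray(scores, dtype=float), axis=-1)


def fuzzy_and(scores):
    """Combine subnetwork scores as exp(sum_i (f_i - 1)).

    Equals 1 only when every subnetwork score is 1 and decreases
    smoothly as soon as any single score drops towards 0.
    """
    scores = np.asarray(scores, dtype=float)
    return np.exp(np.sum(scores - 1.0, axis=-1))
```

Unlike the minimum, the exponential form is differentiable in every score, so each subnetwork receives a gradient for every sample.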

SLIDE 11

NN Design

[Figure labels: Cal, Particle Flow, Photon, Muon]

◊ Use a 3-layer NN
◊ Each subnetwork returns a score
  ◊ close to 1 for good lumisections
  ◊ close to 1 for anomalies “invisible” from the subnetwork’s channel data
  ◊ close to 0 for anomalies “visible” from the subnetwork’s channel data
◊ Thus the NN decomposes anomalies by channels

SLIDE 12

Decomposition Results

◊ Different channels contribute differently

[Figure panels: globally anomalous lumisections; globally good lumisections]

SLIDE 13

Correlations

◊ Reverse test
  ◊ trying to predict output of the network for subsystems. ROC AUCs:
    ◊ muon: 0.89
    ◊ photons: 0.95
    ◊ particle flow: 0.86
    ◊ calo: 0.94
  ◊ NB: expect == 1 under our assumptions

[Figure labels: muons, JetMet, EGamma, Tracks]

SLIDE 14

Rare Anomalies

◊ 2010 Open Data contains a significant fraction of bad data
  ◊ 1:2 bad-to-good lumisections
◊ Better data quality in Run 2
  ◊ 1:100 bad-to-good lumisections
  ◊ lack of anomaly data for supervised learning
  ◊ also the case for LHCb
◊ Need other approaches

SLIDE 15

Rare Anomalies

◊ Assumptions
  • 1. good samples are embedded in a small region of a low-dimensional subspace
  • 2. every point outside this region is an anomaly
◊ Technically, a two-class problem
  ◊ suffers from class imbalance
    ◊ very few anomalous data
  ◊ the assumptions allow using one-class methods
    ◊ but then the information still available about anomalies would not be used
◊ Need to merge one-class and two-class approaches

SLIDE 16

Mixed Objective

◊ Consider classification of objects of class 𝓓
  ◊ can use “one-class on 𝓓”, e.g. one-class SVM
◊ Add artificial noise data 𝓞 to fill the initial phase space
  ◊ then a classifier of 𝓓 against 𝓞 effectively separates 𝓓 from the rest of the phase space
  ◊ “one-class on 𝓓” = “𝓓 against everything”
◊ Now anomalous data may be added to the noise
◊ Loss function: 𝓜 = 𝓜+ + (1-𝛽)𝓜- + 𝛽𝓜noise
  ◊ 𝓜+, 𝓜-, 𝓜noise - losses on normal, anomalous, and noise examples
  ◊ 𝛽 - trade-off parameter
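
A minimal sketch of the mixed objective, under two assumptions that the slide does not state: each partial loss is a binary cross-entropy, and the artificial noise is drawn uniformly from the bounding box of the normal data. All function and parameter names are hypothetical.

```python
import numpy as np


def cross_entropy(pred, target, eps=1e-12):
    """Mean binary cross-entropy of predictions against one target (0 or 1)."""
    pred = np.clip(np.asarray(pred, dtype=float), eps, 1.0 - eps)
    losses = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return float(np.mean(losses))


def mixed_objective(pred_normal, pred_anomalous, pred_noise, tradeoff):
    """loss_pos + (1 - tradeoff) * loss_neg + tradeoff * loss_noise.

    pred_* are classifier outputs (probability of being normal) on
    normal, known anomalous, and artificial noise examples.
    """
    loss_pos = cross_entropy(pred_normal, 1.0)
    loss_neg = cross_entropy(pred_anomalous, 0.0)
    loss_noise = cross_entropy(pred_noise, 0.0)
    return loss_pos + (1.0 - tradeoff) * loss_neg + tradeoff * loss_noise


def uniform_noise(reference, n_samples, rng):
    """Assumption: fill the phase space uniformly within the data bounds."""
    low, high = reference.min(axis=0), reference.max(axis=0)
    return rng.uniform(low, high, size=(n_samples, reference.shape[1]))
```

With the trade-off parameter at 0 the objective reduces to an ordinary two-class loss on normal versus anomalous data. At 1 the known anomalies are ignored and only the noise shapes the boundary, as in a one-class method.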

SLIDE 17

Illustration

◊ If negative samples are near the positive region, produce a solution as in a classification problem
◊ Otherwise produce a one-class boundary
◊ Toy example: 2D Gaussian normal (green), random anomalous (red)
  ◊ mixed objective produces more accurate separation

SLIDE 18

Noise Injection

◊ Each layer of the deep network acts like a dimensionality reduction
  ◊ can inject noise more consistently into the middle of the net
◊ 𝓜noise imposes a bias proportional to the phase volume of the positive class
  ◊ positive class volume tends to collapse to a single point
  ◊ add an embedded autoencoder to penalise shrinking of the positive class volume
  ◊ 𝓜 = 𝓜+ + (1-𝛽)𝓜- + 𝛽𝓜noise + 𝛾𝓜AE
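
The extended objective can be sketched as below, assuming the autoencoder term is a mean squared reconstruction error (the slide does not specify its form). The names `reconstruction_loss`, `total_objective` and `ae_weight` are hypothetical.

```python
import numpy as np


def reconstruction_loss(inputs, reconstructions):
    """Assumed autoencoder penalty: mean squared reconstruction error."""
    inputs = np.asarray(inputs, dtype=float)
    reconstructions = np.asarray(reconstructions, dtype=float)
    return float(np.mean((inputs - reconstructions) ** 2))


def total_objective(loss_pos, loss_neg, loss_noise, loss_ae,
                    tradeoff, ae_weight):
    """Mixed objective plus a weighted autoencoder term."""
    return (loss_pos
            + (1.0 - tradeoff) * loss_neg
            + tradeoff * loss_noise
            + ae_weight * loss_ae)


# If the embedding collapses every normal sample to one point, a decoder
# can at best return a single constant output; for squared error the
# best constant is the mean, so the penalty equals the data variance.
samples = np.array([[0.0, 0.0], [2.0, 2.0]])
collapsed = np.tile(samples.mean(axis=0), (len(samples), 1))
penalty_collapsed = reconstruction_loss(samples, collapsed)
penalty_perfect = reconstruction_loss(samples, samples)
```

A collapsed embedding therefore pays a reconstruction penalty, while an embedding that keeps the samples distinguishable does not.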

SLIDE 19

Tests on the Same Problem

◊ Data from the decomposition studies

◊ train/test: 10K/10K - positive, 64/6.4K - negative

◊ 800 features (reduced)

◊ ROC AUC (32 experiments) - 0.85±0.02

◊ 0.80±0.05 without noise injection and autoencoder

[Figure labels: Noise Injection (×4)]

SLIDE 20

Access to Actual Data

◊ Actual studies need access to actual data
  ◊ historical (open) data may not represent the current status
◊ Both data science and detector operation expertise are necessary to implement advanced approaches into the detector operation chain
  ◊ cooperation between Yandex and CMS via CERN Open Lab is established
  ◊ conditional access to real time data is granted

SLIDE 21

Conclusions

◊ Yandex group develops procedures to detect anomalies in detector data
  ◊ different approaches may work for different run conditions
◊ Current data access policies allow technical access to data in real time (a breakthrough since last year)
  ◊ CERN Open Lab in action
◊ Started moving from academic studies to practical solutions
