Anomalies Detection for HEP Experiments
Maxim Borisyak, Denis Derkach, Fedor Ratnikov, Andrey Ustyuzhanin
HEP/ML Group, Yandex School of Data Analysis; National Research University Higher School of Economics
With incredible help from CMS
Fedor.Ratnikov@cern.ch Anomalies in High Energy Physics
Content
◊ Yandex
◊ Supervised anomalies detection
◊ Decomposition of anomalies by source
◊ Rare Anomalies
Yandex in Wikipedia
IT resources: ∼ 10%×
Yandex in HEP
◊ Member of CERN OpenLab
◊ Member of LHCb Collaboration: trigger, B-tagging, monitoring, anomalies detection, computing resources
◊ Member of SHIP Collaboration: detector optimisation, computing resources
◊ Cooperating with CMS Collaboration: data certification
◊ Cooperating with ATLAS Collaboration: GRID optimisation
◊ Contributing to other Particle Physics experiments beyond CERN
Levels of Data Quality Monitoring
◊ Detector Level
◊ hit maps, occupancies…
◊ Routine Physics Level
◊ basic physics objects: hadrons, leptons, photons…
◊ Physics Candles
◊ J/ψ, Z, W, top, …
Formalising the Problem
Use the Routine Physics operation level
◊ Continuously supervised learning approach
◊ we have historical data processed by experts
◊ experts classified the data as “good” or “bad”
◊ the system learns typical patterns
◊ it establishes a procedure to split data samples into “black” (definitely bad), “white” (definitely good), and “grey” (expert intervention needed) zones
◊ “definitely bad” ≡ FalsePositive < cut_bad
◊ “definitely good” ≡ FalseNegative < cut_good
◊ let the system classify the “black” and “white” domains and pass the “grey” domain to experts for a decision
As new data arrive, the supervisor keeps labelling the complicated cases
Ultimate goal: take the burden of routine classification off the experts and let them deal with non-trivial cases
Practical Approach
◊ CMS 2010B run open data
◊ http://opendata.cern.ch/record/8
◊ Streams: MinimalBias, Muons, Photons
◊ LumiSections (minimal chunk of data defined in metadata) are labelled as “good” or “bad” by the experiment
◊ Objects: Particle Flow Jets, Calorimeter Jets, Photons, Muons
◊ (pT, η, φ, Vxyz, mass) for 5 particles selected at quantiles of pT
◊ 7 features for every variable:
◊ quantiles 0, 0.25, 0.5, 0.75, 1, plus mean and variance
◊ computed over the objects of all events in a given LumiSection
◊ ∼2500 features describing every LumiSection
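A minimal sketch of the seven-number summary described above, using only the Python standard library. The variable names (`muon_pt`, `muon_eta`), the toy values, the interpolation rule for the quartiles, and the use of the population variance are all assumptions made for illustration; the slides do not specify them.

```python
import statistics

def summarize_variable(values):
    """Seven summary features: quantiles 0, 0.25, 0.5, 0.75, 1, then mean and variance."""
    values = [float(v) for v in values]
    if len(values) < 2:
        raise ValueError("need at least two values to compute quartiles")
    # method="inclusive" treats the sample minimum and maximum as the 0 and 1 quantiles
    q25, q50, q75 = statistics.quantiles(values, n=4, method="inclusive")
    return [
        min(values), q25, q50, q75, max(values),
        statistics.fmean(values),
        statistics.pvariance(values),  # population variance (an assumption)
    ]

def lumisection_features(variables):
    """Concatenate the summaries of all variables, in sorted name order.

    `variables` maps a (hypothetical) variable name to its values over the
    objects of all events in one LumiSection.
    """
    features = []
    for name in sorted(variables):
        features.extend(summarize_variable(variables[name]))
    return features

# Toy LumiSection with two variables: 2 variables x 7 summaries = 14 features
toy_lumisection = {
    "muon_pt": [12.0, 7.5, 30.1, 5.2, 18.4],
    "muon_eta": [-1.1, 0.3, 2.0, -0.4, 0.9],
}
features = lumisection_features(toy_lumisection)
print(len(features))
```

Repeating the same summary for every variable of every object type is what would bring the total to the order of the ∼2500 features quoted above.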
Reference Performance
◊ Use training part of all available data to train classifier
◊ ultimate best case scenario
◊ Analyse test part of data
◊ get the probability for a given LumiSection (LS) to be “good”
◊ select two probability thresholds: Cut_bad, Cut_good
◊ define three zones
◊ “black zone”: LS is definitely bad
◊ “white zone”: LS is definitely good
◊ “grey zone”: classifier is in doubt, expert decision is needed
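The three-zone split can be sketched as a simple thresholding of the classifier output. The function name `assign_zones` and the example probabilities and cuts are hypothetical; how ties at the thresholds are handled is also an assumption.

```python
def assign_zones(p_good, cut_bad, cut_good):
    """Map each probability of being "good" to the black, grey, or white zone."""
    if not 0.0 <= cut_bad <= cut_good <= 1.0:
        raise ValueError("expected 0 <= cut_bad <= cut_good <= 1")
    zones = []
    for p in p_good:
        if p < cut_bad:
            zones.append("black")   # definitely bad: automatic decision
        elif p > cut_good:
            zones.append("white")   # definitely good: automatic decision
        else:
            zones.append("grey")    # classifier is in doubt: expert decision
    return zones

zones = assign_zones([0.02, 0.40, 0.97, 0.85], cut_bad=0.1, cut_good=0.9)
print(zones)
```

Moving the two cuts apart widens the grey zone, which sends more LumiSections to the experts and fewer to the automatic decision.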
[Figure: classifier output axis divided by Cut “bad” and Cut “good” into black zone (automatic decision), grey zone (expert decision), and white zone (automatic decision)]
Performance
◊ Loss Rate
◊ a “good” LS is classified as “definitely bad” and thus is lost for physics
◊ LR = FN(“black”) / (TP(“white”) + FN(“black”)), with “good” taken as the positive class
◊ Pollution Rate
◊ a “bad” LS is classified as “definitely good” and thus pollutes certified data
◊ PR = FP(“white”) / (TN(“black”) + FP(“white”))
◊ Rejection Rate
◊ fraction of all LS which are not automatically classified as “definitely bad” or “definitely good”
◊ RR = (“grey”) / (“black” + “grey” + “white”)
◊ ManualWork = RejectionRate
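The three rates can be computed directly from expert labels and zone assignments. This is a sketch under stated assumptions: the function name `zone_rates` is hypothetical, the pollution-rate denominator is read as all “bad” LS that received an automatic decision, and an empty denominator is reported as 0.

```python
def zone_rates(is_good, zones):
    """Loss, pollution, and rejection rates from expert labels and zones."""
    counts = {(g, z): 0 for g in (True, False) for z in ("black", "grey", "white")}
    for good, zone in zip(is_good, zones):
        counts[(bool(good), zone)] += 1

    def ratio(num, den):
        return num / den if den else 0.0

    good_black = counts[(True, "black")]    # good LS lost for physics
    good_white = counts[(True, "white")]    # good LS certified automatically
    bad_white = counts[(False, "white")]    # bad LS polluting certified data
    bad_black = counts[(False, "black")]    # bad LS rejected automatically
    n_grey = counts[(True, "grey")] + counts[(False, "grey")]
    n_total = sum(counts.values())
    return {
        "loss_rate": ratio(good_black, good_white + good_black),
        "pollution_rate": ratio(bad_white, bad_black + bad_white),
        "rejection_rate": ratio(n_grey, n_total),
    }

labels = [True, True, True, True, False, False, False, False]
example_zones = ["white", "white", "black", "grey", "black", "black", "white", "grey"]
rates = zone_rates(labels, example_zones)
print(rates)
```

In this toy example one of three automatically classified good LS is lost, one of three automatically classified bad LS pollutes the certified data, and two of eight LS go to the experts.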
∼80% saving on manual work is feasible for PR and LR at 5‰
Decomposing Anomalies
◊ Study effect of anomalies on individual channels
◊ which channels are responsible for anomalies?
◊ if only photons are affected, can muon data still be used?
◊ which plots should receive more attention from Data Quality experts?
◊ Decomposition of Channels
◊ build a separate NN for every channel
◊ the corresponding NN scores each channel
◊ connect the networks by
◊ “min” operator with dropout
◊ exp(Σᵢ (fᵢ − 1)), where fᵢ is the score of subnetwork i: a kind of “fuzzy AND”
◊ train the network to approximate a global score
◊ each individual NN has high predictive power against anomalies within its corresponding channel
◊ this can be proven mathematically under assumptions that are reasonable for our case
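A small sketch of the two ways of combining subnetwork scores, assuming the “fuzzy AND” is exp(Σᵢ (fᵢ − 1)) over scores fᵢ in [0, 1]. The function names and the example scores are hypothetical, and the dropout applied together with the “min” operator is not shown.

```python
import math

def fuzzy_and(scores):
    """Smooth AND: equals 1 only if every score is 1, decays as any score drops."""
    return math.exp(sum(f - 1.0 for f in scores))

def hard_and(scores):
    """The "min" operator: the global score is the worst channel score."""
    return min(scores)

all_good = [1.0, 1.0, 1.0, 1.0]      # every channel looks fine
photon_bad = [1.0, 1.0, 0.0, 1.0]    # one channel sees the anomaly

print(fuzzy_and(all_good), hard_and(all_good))
print(fuzzy_and(photon_bad), hard_and(photon_bad))
```

Unlike “min”, the exponential form passes gradient to every subnetwork at once, which is one plausible reason for preferring it during training.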
NN Design
[Figure: subnetworks for the Cal, Particle Flow, Photon, and Muon channels]
◊ Use a 3-layer NN
◊ Each subnetwork returns a score
◊ close to 1 for good lumisections
◊ close to 1 for anomalies “invisible” from the subnetwork’s channel data
◊ close to 0 for anomalies “visible” from the subnetwork’s channel data
◊ Thus NN decomposes anomalies by channels
Decomposition Results
◊ Different channels contribute differently
[Figure panels: globally anomalous lumisections; globally good lumisections]
Correlations
◊ Reverse test
◊ trying to predict the output of the network for subsystems; ROC AUCs:
◊ muon: 0.89
◊ photons: 0.95
◊ particle flow: 0.86
◊ calo: 0.94
◊ NB: expected to equal 1 under our assumptions
[Figure labels: muons, JetMet, EGamma, Tracks]
Rare Anomalies
◊ 2010 Open Data contains significant fraction of bad data
◊ 1:2 bad-to-good lumisections
◊ Better data quality in Run 2
◊ 1:100 bad-to-good lumisections
◊ lack of anomalous data for supervised learning
◊ also the case for LHCb
◊ Need other approaches
Rare Anomalies
◊ Assumptions
- 1. good samples are embedded in a small region of a low-dimensional subspace
- 2. every point outside this region is an anomaly
◊ Technically, a two-class problem
◊ suffers from class imbalance
◊ very few anomalous data
◊ the assumptions allow using one-class methods
◊ but then the available information about anomalies would not be used
◊ Need to merge one-class and two-class approaches
Mixed Objective
◊ Consider classification of objects of class 𝓓
◊ can use “one-class on 𝓓”, e.g. one-class SVM
◊ Add artificial noise data 𝓞 to fill the initial phase space
◊ then a classifier of 𝓓 against 𝓞 effectively separates 𝓓 from the rest of the phase space
◊ “one-class on 𝓓” = “𝓓 against everything”
◊ Now anomalous data may be added to the noise
◊ Loss function: 𝓜 = 𝓜+ + (1-𝛽)𝓜- + 𝛽𝓜noise
◊ 𝓜+, 𝓜-, 𝓜noise: losses on normal, anomalous, and noise examples
◊ 𝛽: trade-off parameter
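A minimal sketch of the mixed objective, assuming each of the three terms is a mean binary cross-entropy on the classifier output (the slides do not say which per-sample loss is used). The function names and the example probabilities are hypothetical; `beta` stands for the trade-off parameter 𝛽.

```python
import math

def mean_log_loss(probabilities, target, eps=1e-12):
    """Mean binary cross-entropy of predicted P(normal) against a fixed target (0 or 1)."""
    total = 0.0
    for p in probabilities:
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total -= math.log(p) if target == 1 else math.log(1.0 - p)
    return total / len(probabilities)

def mixed_objective(p_normal, p_anomalous, p_noise, beta):
    """Loss on normal data + (1 - beta) * loss on anomalies + beta * loss on noise."""
    loss_pos = mean_log_loss(p_normal, target=1)       # normal examples should score 1
    loss_neg = mean_log_loss(p_anomalous, target=0)    # known anomalies should score 0
    loss_noise = mean_log_loss(p_noise, target=0)      # artificial noise should score 0
    return loss_pos + (1.0 - beta) * loss_neg + beta * loss_noise

# beta = 0 gives the ordinary two-class objective,
# beta = 1 gives the "one class against noise" objective
two_class = mixed_objective([0.9], [0.5], [0.1], beta=0.0)
one_class = mixed_objective([0.9], [0.5], [0.1], beta=1.0)
print(two_class, one_class)
```

Intermediate values of `beta` interpolate between the two regimes, so the few available anomalies still contribute while the noise term covers the rest of the phase space.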
Illustration
◊ If negative samples are near the positive region, the solution is the same as in a classification problem
◊ Otherwise the solution is a one-class boundary
◊ Toy example: 2D Gaussian normal (green), random anomalous (red)
◊ mixed objective produces more accurate separation
Noise Injection
◊ Each layer of the deep network acts like a dimensionality reduction
◊ noise can be injected more consistently into the middle of the net
◊ 𝓜noise imposes a bias proportional to the phase volume of the positive class
◊ positive class volume tends to collapse to a single point
◊ add an embedded autoencoder to penalise shrinking of the positive class volume
◊ 𝓜 = 𝓜+ + (1-𝛽)𝓜- + 𝛽𝓜noise + 𝛾𝓜AE
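A sketch of how the autoencoder term enters the total objective, assuming 𝓜AE is a mean squared reconstruction error (the slides do not define it). The function names and numbers are hypothetical; `beta` and `gamma` stand for 𝛽 and 𝛾.

```python
def reconstruction_loss(inputs, reconstructions):
    """Mean squared error between inputs and their autoencoder reconstructions."""
    total, count = 0.0, 0
    for x, x_rec in zip(inputs, reconstructions):
        for a, b in zip(x, x_rec):
            total += (a - b) ** 2
            count += 1
    return total / count

def total_objective(loss_pos, loss_neg, loss_noise, loss_ae, beta, gamma):
    """Mixed objective plus the autoencoder penalty weighted by gamma."""
    return loss_pos + (1.0 - beta) * loss_neg + beta * loss_noise + gamma * loss_ae

# One two-dimensional sample whose second coordinate is poorly reconstructed
loss_ae = reconstruction_loss([[1.0, 2.0]], [[1.0, 4.0]])
total = total_objective(1.0, 2.0, 3.0, loss_ae, beta=0.5, gamma=0.25)
print(loss_ae, total)
```

If the hidden representation of the positive class collapsed to a single point, the inputs could no longer be reconstructed from it, so the reconstruction term would grow and counteract the collapse.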
Tests on the Same Problem
◊ Data from the decomposition studies
◊ train/test: 10K/10K - positive, 64/6.4K - negative
◊ 800 features (reduced)
◊ ROC AUC (32 experiments) - 0.85±0.02
◊ 0.80±0.05 without noise injection and autoencoder
[Figure labels: Noise Injection (four occurrences)]
Access to Actual Data
◊ Actual studies need access to actual data
◊ historical (open) data may not represent the current status
◊ Both data science (DS) and detector operation expertise are necessary to bring advanced approaches into the detector operation chain
◊ cooperation between Yandex and CMS via CERN Open Lab is established
◊ conditional access to real-time data is granted
Conclusions
◊ Yandex group develops procedures to detect anomalies in detector data
◊ different approaches may work for different run conditions
◊ Current data access policies allow technical access to data in real time (a breakthrough since last year)
◊ CERN Open Lab in action
◊ Started moving from academic studies to practical solutions