Anomaly Detection for the CERN Large Hadron Collider injection magnets
Armin Halilovic
KU Leuven - Department of Computer Science
In cooperation with CERN
2018-07-27
Outline
1 Context
2 Data
3 Preprocessing
4 Anomaly Detection
5 Postprocessing
6 Evaluation
7 Results
8 Conclusion
1 Context - Anomaly Detection
◮ Classification
◮ Normal vs. abnormal/novel data
◮ One-class vs. multiclass classification
◮ High amount of normal data
◮ Very low amount of anomalous data
◮ Unsupervised machine learning models
◮ Assign “anomaly scores” to data
◮ = Outlier removal
1 Context - Problem Statement & Motivation
The goal is to develop an anomaly detection application that can detect anomalies in the behaviour of the injection kicker magnets of the Large Hadron Collider. This is useful because it can:
◮ Detect anomalous behaviour and thus predict failures
◮ Improve CERN’s response time
◮ Improve machine reliability
2 Data - Types I
◮ 6 types of data collections:
  1 Continuous
  2 Internal Post Operational Check (IPOC)
  3 State
  4 Controller
  5 LHC
  6 Electronic Logbook
◮ Continuous & discrete variables
◮ Fixed sampling rates & asynchronous sampling triggers
◮ 120 data collections
◮ Data from June 2015 to September 2016
2 Data - Types II
Continuous Data:
◮ Temperatures and pressures
◮ Fixed frequency sampling + save based on change in value
◮ Missing data: forward fill
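A minimal sketch of the forward-fill strategy for the Continuous data, using pandas. The column name and timestamps here are hypothetical stand-ins, not values from the CERN dataset.

```python
import pandas as pd

# Hypothetical continuous measurements with gaps (NaN) between saved changes
continuous = pd.DataFrame(
    {"TEMP_MAGNET_UP": [21.5, None, None, 22.1, None]},
    index=pd.to_datetime([
        "2016-01-01 00:00", "2016-01-01 00:10", "2016-01-01 00:20",
        "2016-01-01 00:30", "2016-01-01 00:40",
    ]),
)

# Forward fill: carry the last saved value forward until a new one arrives
filled = continuous.ffill()
```

This matches the save-on-change storage scheme: a missing sample means the value did not change, so repeating the last known value is a reasonable reconstruction.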
2 Data - Types IV
Internal Post Operational Check (IPOC) Data:
◮ Closely related to magnets: energy, strength, delay, . . .
◮ Only sampled when magnet generators pulse
◮ All IPOC measurements recorded simultaneously
◮ At most once every 10 seconds
◮ Many large gaps when experiments run
◮ Missing data: cannot fill
◮ Different timestamps for beams B1 and B2 → anomaly detection for the two MKI installations separately
2 Data - Types V
Figure: IPOC I STRENGTH measurements, 2016.
2 Data - Types VI
State Data:
◮ Not used
◮ No data for 2015

Controller Data:
◮ Not used
◮ Technical issues (duplicate timestamps) with received database
2 Data - Types VII
LHC Data:
◮ Particle beam measurements: beam intensity & beam length
◮ Sampled and stored in a similar way to Continuous measurements
◮ Missing data: forward fill
2 Data - Types VIII
Electronic Logbook Data:
◮ Manually created logbook entries (labels)
◮ Describe certain events
◮ Anomaly labels are not precise; each covers a 12 hour range

Label type     Beam 1   Beam 2
anomaly            23       24
fault              11       34
info               75      134
intervention       33       62
research           10       20
Total:            152      274
2 Data - IPOC Segments I
◮ Magnets only in use for certain time periods
◮ IPOC data sampled only when magnets in use
◮ IPOC segment = period of magnet usage
◮ Introduced to deal with uncertainty of anomaly labels
◮ Important semantic meaning
◮ Data is split into segments based on a “segmentation distance”
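The segmentation idea can be sketched as follows: whenever the gap between two consecutive IPOC timestamps exceeds the segmentation distance, a new segment starts. This is an illustrative reimplementation, not the thesis code; the timestamps are made up.

```python
import pandas as pd

def split_into_segments(timestamps, segmentation_distance):
    """Split sorted IPOC timestamps into segments wherever the gap
    between consecutive pulses exceeds the segmentation distance."""
    segments, current = [], [timestamps[0]]
    for prev, ts in zip(timestamps, timestamps[1:]):
        if ts - prev > segmentation_distance:
            segments.append(current)
            current = []
        current.append(ts)
    segments.append(current)
    return segments

ts = list(pd.to_datetime([
    "2016-05-01 10:00", "2016-05-01 10:01", "2016-05-01 10:02",  # one usage period
    "2016-05-01 12:30", "2016-05-01 12:31",                      # next usage period
]))
segments = split_into_segments(ts, pd.Timedelta(minutes=30))
```

With a 30-minute segmentation distance, the 2.5-hour gap splits the five pulses into two segments.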
3 Preprocessing - Data Filtering I
◮ Want to train models on correct/relevant data
◮ Sudden extremely high temperatures, negative timing, etc. are physically impossible

Measurement             Minimum          Maximum
PRESSURE                9 × 10⁻¹² mbar   5 × 10⁻⁹ mbar
TEMP MAGNET (DOWN|UP)   18 °C            60 °C
TEMP TUBE (DOWN|UP)     18 °C            120 °C
I STRENGTH              1 kA             N/A
T DELAY                 10 µs            N/A
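A sketch of how such range filtering could be applied with pandas. The validity bounds mirror the table above; the column names and the helper `filter_impossible` are illustrative, not the thesis implementation.

```python
import pandas as pd

# Validity ranges from the table above; None means no bound on that side
VALID_RANGES = {
    "PRESSURE": (9e-12, 5e-9),       # mbar
    "TEMP_MAGNET_UP": (18.0, 60.0),  # °C
    "I_STRENGTH": (1.0, None),       # kA
    "T_DELAY": (10.0, None),         # µs
}

def filter_impossible(df):
    """Drop rows whose measurements fall outside physically possible ranges."""
    mask = pd.Series(True, index=df.index)
    for col, (lo, hi) in VALID_RANGES.items():
        if col not in df.columns:
            continue
        if lo is not None:
            mask &= df[col] >= lo
        if hi is not None:
            mask &= df[col] <= hi
    return df[mask]

df = pd.DataFrame({"TEMP_MAGNET_UP": [25.0, 300.0, 19.0],
                   "T_DELAY": [12.0, 11.0, -5.0]})
clean = filter_impossible(df)
```

The 300 °C temperature and the negative time delay are both removed, leaving only the first row.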
3 Preprocessing - Data Filtering II
◮ True pattern emerges
3 Preprocessing - Data Filtering III
◮ Impossible time delays removed
3 Preprocessing - Features
◮ All IPOC data
◮ + Continuous data at IPOC data timestamps (with forward fill)
◮ + LHC data at IPOC data timestamps (with forward fill)
◮ + Temporal features on Continuous and LHC data:
- To catch temporal relationship in data
- Sliding window features: mean & sum
- Important parameter: sliding window size
◮ Done separately for both B1 and B2
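The sliding-window mean and sum features can be computed with pandas rolling windows. The series name and values are hypothetical; the window size is the key parameter, as noted above.

```python
import pandas as pd

# Hypothetical Continuous/LHC series already aligned at IPOC timestamps
df = pd.DataFrame({"BEAM_INTENSITY": [1.0, 2.0, 3.0, 4.0, 5.0]})

window = 3  # sliding window size: the important parameter
df["BEAM_INTENSITY_mean"] = df["BEAM_INTENSITY"].rolling(window, min_periods=1).mean()
df["BEAM_INTENSITY_sum"] = df["BEAM_INTENSITY"].rolling(window, min_periods=1).sum()
```

Each row then carries a summary of the recent past, which lets point-wise detectors pick up temporal relationships in the data.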
4 Anomaly Detection
◮ Train a machine learning model using the preprocessed data
◮ Use the model to generate anomaly scores
◮ Rescale scores to [0, 1]
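These three steps can be sketched with scikit-learn's Isolation Forest, one of the detectors used later. The random feature matrix is a placeholder for the preprocessed data; the min-max rescaling to [0, 1] is one straightforward choice.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))    # stand-in for the preprocessed features

# Train the model, then score every data point
model = IsolationForest(n_estimators=100, random_state=0).fit(X)
raw = -model.score_samples(X)    # negate so that higher = more anomalous

# Rescale anomaly scores to [0, 1]
scores = (raw - raw.min()) / (raw.max() - raw.min())
```

After rescaling, scores from different detectors live on a common [0, 1] scale, which the later thresholding and evaluation steps rely on.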
4 Anomaly Detection - Isolation Forest Anomaly Scores
4 Anomaly Detection - Gaussian Mixture Model Scores I
4 Anomaly Detection - Gaussian Mixture Model Scores II
4 Anomaly Detection - Dummy Detectors
◮ Simple detection strategies as a baseline to compare against
◮ Constant, uniformly random, stratified random
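The three baselines could look like the sketch below. The exact definitions used in the thesis are not given on the slide, so these implementations (constant value, uniform draws, and a 0/1 split matching the anomaly fraction) are assumptions.

```python
import numpy as np

def constant_scores(n, value=0.5):
    """Assign the same anomaly score to every point."""
    return np.full(n, value)

def uniform_random_scores(n, seed=0):
    """Draw anomaly scores uniformly at random from [0, 1]."""
    return np.random.default_rng(seed).uniform(0.0, 1.0, n)

def stratified_random_scores(n, anomaly_fraction, seed=0):
    """Score 1 for a random subset matching the anomaly fraction, else 0."""
    rng = np.random.default_rng(seed)
    scores = np.zeros(n)
    k = int(round(anomaly_fraction * n))
    scores[rng.choice(n, size=k, replace=False)] = 1.0
    return scores

s = stratified_random_scores(100, anomaly_fraction=0.1)
```

Any real detector should clearly outperform these strategies; if it does not, the learned model adds nothing over chance.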
5 Postprocessing I
◮ Anomaly labels are unspecific, with a 12 hour range
◮ The evaluation will use segments instead of individual data tuples
◮ Transform scored data into lists of IPOC segments
◮ A segment’s anomaly score is based on the anomaly scores of its data
◮ Anomalous behaviour likely occurs over multiple successive timestamps
◮ These timestamps should get higher anomaly scores
◮ The segments that contain these timestamps should then have higher anomaly scores
5 Postprocessing II
Methods for Segment Anomaly Score:
◮ Max
◮ Top K (10)
◮ Top Percentage (25%)

Ground Truth Annotation:
◮ Need to compare segment anomaly scores against a consistent basis of ground truth
◮ This allows for fair performance evaluation
◮ Mark segments as anomalous if they lie within the 12 hour range of an anomaly label
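The three aggregation methods can be sketched as one function. The slide does not state how the selected top scores are combined, so averaging them (for Top K and Top Percentage) is an assumption here.

```python
import numpy as np

def segment_score(point_scores, method="max", k=10, percentage=0.25):
    """Aggregate per-point anomaly scores into one score for a segment."""
    s = np.sort(np.asarray(point_scores))[::-1]  # sort descending
    if method == "max":
        return s[0]
    if method == "topk":
        return s[:k].mean()          # assumed: mean of the k highest scores
    if method == "top_percentage":
        n = max(1, int(np.ceil(percentage * len(s))))
        return s[:n].mean()          # assumed: mean of the top fraction
    raise ValueError(f"unknown method: {method}")

scores = [0.1, 0.9, 0.2, 0.8]
```

Max is the most sensitive to a single extreme point; Top K and Top Percentage reward segments where several successive points score high, matching the intuition on the previous slide.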
5 Postprocessing III
We now have:
◮ A set of IPOC segments with anomaly scores
◮ Knowledge of which segments are actually anomalous
6 Evaluation
◮ Anomaly scores lie in [0, 1]
◮ Ground truth is 0 or 1
◮ To evaluate performance, select a threshold anomaly score in order to count True Positives, False Positives, True Negatives, and False Negatives
◮ If the score is above the threshold, the prediction is Positive; otherwise it is Negative

                          Ground Truth
                       Positive   Negative
Prediction  Positive      TP         FP
            Negative      FN         TN
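Counting the four cells of the confusion matrix at a given threshold is straightforward; the scores and labels below are made-up examples.

```python
import numpy as np

def confusion_counts(scores, ground_truth, threshold):
    """Count TP/FP/TN/FN: the prediction is Positive when score > threshold."""
    pred = np.asarray(scores) > threshold
    truth = np.asarray(ground_truth).astype(bool)
    tp = int(np.sum(pred & truth))
    fp = int(np.sum(pred & ~truth))
    tn = int(np.sum(~pred & ~truth))
    fn = int(np.sum(~pred & truth))
    return tp, fp, tn, fn

tp, fp, tn, fn = confusion_counts(
    [0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0], threshold=0.5
)
```

Here the 0.8-scored normal segment becomes a False Positive and the 0.3-scored anomaly a False Negative, giving one count in each cell.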
6 Evaluation - Performance Metric
◮ Precision and Recall are useful in the context of imbalanced data
◮ Precision = TP / (TP + FP)
◮ Recall = TP / (TP + FN)
◮ But we want a single number as performance metric for automated comparisons
◮ Calculate Precision and Recall for each possible anomaly score threshold and plot the resulting curve
◮ Performance metric = Area under the Precision-Recall Curve (AUPR)
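Computing AUPR is a few lines with scikit-learn; the labels and scores below are a toy example where the detector ranks all anomalies above all normal segments, so the area is 1.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, auc

ground_truth = np.array([0, 0, 1, 0, 1, 1])
scores = np.array([0.1, 0.2, 0.9, 0.4, 0.8, 0.7])

# Precision and Recall at every possible threshold, then the area under the curve
precision, recall, thresholds = precision_recall_curve(ground_truth, scores)
aupr = auc(recall, precision)
```

Unlike accuracy, AUPR is not dominated by the large majority of normal segments, which is exactly why it suits this imbalanced problem.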
6 Evaluation - Grid Search
◮ The developed anomaly detection pipeline has many parameters
◮ Segmentation distance, data scaling, anomaly score method, anomaly detector, anomaly detector hyperparameters, labels
◮ Grid search for parameter optimization
◮ The pipeline is executed automatically for each predetermined combination of parameters in the grid
◮ Results are stored and sorted by AUPR so that the best performing parameters can be found easily
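A grid search over pipeline parameters can be sketched with scikit-learn's ParameterGrid. The grid values are illustrative, and `run_pipeline` is a hypothetical stand-in for executing preprocessing, detection, postprocessing, and evaluation; it is stubbed out here.

```python
from sklearn.model_selection import ParameterGrid

# Illustrative parameter grid mirroring the pipeline's options
grid = ParameterGrid({
    "segmentation_distance_min": [30, 60],
    "scale_data": [False, True],
    "anomaly_score_method": ["max", "topk", "top_percentage"],
})

results = []
for params in grid:
    # run_pipeline(params) is hypothetical: it would execute the full
    # pipeline and return its AUPR; stubbed with a dummy value here.
    aupr = 0.0
    results.append((aupr, params))

# Store results sorted by AUPR, best first
results.sort(key=lambda r: r[0], reverse=True)
```

With 2 × 2 × 3 grid values this runs the pipeline 12 times; the sorted result list makes the best-performing combination easy to read off.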
7 Results - Dummy Detectors
Figure: PR curves of Dummy detectors with evaluation parameters segmentation distance = 30 min, anomaly score method = topk, labels = all.
7 Results - GMM I
◮ Best PR Curve
Figure: Parameters: n components = 6, covariance type = full, scale data = False, segmentation distance = 60 min, anomaly score method = topk, labels = anomaly
7 Results - GMM II
◮ Predictions with 99th percentile anomaly score threshold
7 Results - Isolation Forest I
◮ Best PR Curve
Figure: Parameters: n estimators = 250, max samples = 5120, scale data = False, segmentation distance = 60 min, anomaly score method = max, labels = anomaly
7 Results - Isolation Forest II
◮ Predictions with 99th percentile anomaly score threshold
7 Results - Isolation Forest III
◮ Only IPOC features, best PR Curve
7 Results - Isolation Forest IV
◮ Only IPOC features, predictions with 99th percentile anomaly score threshold
7 Results - Source Code
◮ Written to be extensible
◮ Pipeline components in clear modules:
- preprocessing
- anomaly detection
- postprocessing
- evaluation
- pipeline
◮ Parameters can be varied easily
◮ https://github.com/arminnh/masters-thesis
8 Conclusion
◮ An anomaly detection application has been developed
◮ Some anomalies are detected very well
◮ Many are still not detected at all
◮ Experiments have shown that performance can still be improved significantly
◮ More experiments should be done around feature selection
8 Future Work
◮ Feature selection
◮ Controller data
◮ Integration of more anomaly detectors (e.g. one-class SVM or Local Outlier Factor)
◮ Better segmentation procedure without the segmentation distance parameter
◮ More efficient and autonomous parameter optimization using e.g. evolutionary algorithms or Bayesian optimization
8 Bibliography
CERN. Overview LHC. http://cern60.web.cern.ch/en/exhibitions/overview-lhc. Accessed 2018-06-27.

W. Herr and T. Pieloni. Beam-beam effects. arXiv:1601.05235, pages 1–29, 2014. Contribution to the CAS - CERN Accelerator School: Advanced Accelerator Physics Course, Trondheim, Norway, 18-29 Aug 2013.

F. T. Liu, K. M. Ting, and Z.-H. Zhou. Isolation forest. In Eighth IEEE International Conference on Data Mining (ICDM ’08), pages 413–422. IEEE, 2008.
Questions?
8 Extra - Comparison to Previous Work
◮ Enabled use of many machine learning models instead of just one
◮ Segmentation of input data instead of segmentation of output anomaly scores
◮ Consistent basis of ground truth → more correct comparison of results
◮ Evaluation metrics in terms of TP, FP, TN, FN instead of ambiguous terms
◮ PR curve using all anomaly score thresholds instead of calculating Precision and Recall for one threshold
◮ . . .
8 Extra - Isolation Forest
◮ Ensemble of simple decision trees which split randomly on features
◮ Trees are grown on random samples of the dataset until each data tuple forms a leaf node
◮ Average path length will be shorter for anomalies
◮ Works well in high-dimensional problems
◮ ≈ Density estimation, but without a density measure
Source: [3]