Anomaly Detection for the CERN Large Hadron Collider injection magnets

SLIDE 1

Anomaly Detection for the CERN Large Hadron Collider injection magnets

Armin Halilovic
KU Leuven - Department of Computer Science
In cooperation with CERN
2018-07-27

SLIDE 2

Outline

1 Context
2 Data
3 Preprocessing
4 Anomaly Detection
5 Postprocessing
6 Evaluation
7 Results
8 Conclusion

SLIDE 4

1 Context - Anomaly Detection

◮ Classification
◮ Normal vs. Abnormal/novel data
◮ One-class vs. Multiclass classification
◮ High amount of normal data
◮ Very low amount of anomalous data
◮ Unsupervised machine learning models
◮ Assign “anomaly scores” to data
◮ = Outlier removal

SLIDE 5

1 Context - Problem Statement & Motivation

The goal is to develop an anomaly detection application that can detect anomalies in the behaviour of the injection kicker magnets of the Large Hadron Collider. This is useful, because it can be used to:
◮ Detect anomalous behaviour and thus predict failures
◮ Improve CERN’s response time
◮ Improve machine reliability

SLIDE 7

2 Data - Types I

◮ 6 types of data collections:

  1 Continuous
  2 Internal Post Operational Check (IPOC)
  3 State
  4 Controller
  5 LHC
  6 Electronic Logbook

◮ Continuous & discrete variables
◮ Fixed sampling rates & asynchronous sampling triggers
◮ 120 data collections
◮ Data from June 2015 to September 2016

SLIDE 8

2 Data - Types II

Continuous Data:
◮ Temperature and pressures
◮ Fixed frequency sampling + save based on change in value
◮ Missing data: Forward Fill
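Forward fill can be illustrated with a minimal sketch. It assumes the log is a pair of sorted lists (timestamps and values) and that a row is stored only when the value changes; the function name and the sample numbers are hypothetical and not taken from the thesis code.

```python
from bisect import bisect_right

def forward_fill(timestamps, values, query_times):
    """Return, for each query time, the last value recorded at or before it.

    timestamps must be sorted ascending. Queries that fall before the
    first record have nothing to carry forward and yield None.
    """
    filled = []
    for t in query_times:
        i = bisect_right(timestamps, t)  # number of records with timestamp <= t
        filled.append(values[i - 1] if i > 0 else None)
    return filled

# Hypothetical temperature log: a row is stored only when the value changes.
log_times = [0, 40, 100]          # seconds
log_values = [21.5, 22.0, 21.8]   # degrees Celsius
print(forward_fill(log_times, log_values, [0, 10, 40, 99, 100, 250]))
```

The same lookup can be used to read a Continuous or LHC measurement at an arbitrary timestamp, for example at the time of a magnet pulse.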

SLIDE 9

2 Data - Types III

Continuous Data:
◮ Temperature and pressures
◮ Fixed frequency sampling + save based on change in value
◮ Missing data: Forward Fill

SLIDE 10

2 Data - Types IV

Internal Post Operational Check (IPOC) Data:
◮ Closely related to magnets: energy, strength, delay, ...
◮ Only sampled when magnet generators pulse
◮ All IPOC measurements recorded simultaneously
◮ At most once every 10 seconds
◮ Many large gaps when experiments run
◮ Missing data: cannot fill
◮ Different timestamps for beams B1 and B2
→ Anomaly detection for the two MKI installations separately

SLIDE 11

2 Data - Types V

IPOC, I STRENGTH, 2016:

SLIDE 12

2 Data - Types VI

State Data:
◮ Not used
◮ No data for 2015

Controller Data:
◮ Not used
◮ Technical issues (duplicate timestamps) with received database

SLIDE 13

2 Data - Types VII

LHC Data:
◮ Particle beam measurements: beam intensity & beam length
◮ Sampled and stored in similar way to Continuous measurements
◮ Missing data: Forward fill

SLIDE 14

2 Data - Types VIII

Electronic Logbook Data:
◮ Manually created logbook entries (labels)
◮ Describe certain events
◮ Anomaly labels not precise, but range of 12 hours

Label type      Beam 1   Beam 2
anomaly         23       24
fault           11       34
info            75       134
intervention    33       62
research        10       20
Total:          152      274

SLIDE 15

2 Data - IPOC Segments I

◮ Magnets only in use for certain time periods
◮ IPOC data sampled only when magnets in use
◮ IPOC segment = period of magnet usage
◮ Introduced to deal with uncertainty of anomaly labels
◮ Important semantic meaning
◮ Data is split into segments based on “segmentation distance”

SLIDE 16

2 Data - IPOC Segments II

◮ Data is split into segments based on “segmentation distance”
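One plausible reading of the segmentation step is sketched below: consecutive IPOC timestamps stay in the same segment as long as the gap between them does not exceed the segmentation distance. The function name, the use of seconds, and the sample pulse times are assumptions made for illustration.

```python
def split_into_segments(timestamps, segmentation_distance):
    """Group sorted timestamps into segments.

    A gap larger than segmentation_distance between two consecutive
    timestamps closes the current segment and starts a new one.
    """
    segments = []
    for t in timestamps:
        if segments and t - segments[-1][-1] <= segmentation_distance:
            segments[-1].append(t)   # still within the current period of use
        else:
            segments.append([t])     # first timestamp, or gap too large
    return segments

# Hypothetical pulse times in seconds, segmentation distance of 30 minutes.
pulse_times = [0, 10, 20, 4000, 4010, 9000]
print(split_into_segments(pulse_times, segmentation_distance=30 * 60))
```

A larger segmentation distance merges neighbouring periods of magnet usage into fewer, longer segments, which is why it appears later as a tunable parameter.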

SLIDE 18

3 Preprocessing - Data Filtering I

◮ Want to train models based on correct/relevant data
◮ Sudden extremely high temperatures, negative timing, etc. are impossible

Measurement              Minimum           Maximum
PRESSURE                 9 × 10^-12 mbar   5 × 10^-9 mbar
TEMP MAGNET (DOWN|UP)    18 °C             60 °C
TEMP TUBE (DOWN|UP)      18 °C             120 °C
I STRENGTH               1 kA              N/A
T DELAY                  10 µs             N/A
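The limits on this slide can be applied as a simple range filter. The sketch below is only an illustration: the dictionary keys are hypothetical column names, rows are assumed to be plain dictionaries, and dropping the whole row when one value is out of range is an assumption rather than a confirmed detail of the thesis pipeline.

```python
# (minimum, maximum) per measurement, using the limits listed on this slide.
# None means the limit is not applicable. Key names are hypothetical.
LIMITS = {
    "PRESSURE": (9e-12, 5e-9),        # mbar
    "TEMP_MAGNET_UP": (18.0, 60.0),   # degrees Celsius
    "TEMP_TUBE_UP": (18.0, 120.0),    # degrees Celsius
    "I_STRENGTH": (1.0, None),        # kA
    "T_DELAY": (10.0, None),          # microseconds
}

def is_plausible(row, limits=LIMITS):
    """Return True if every limited measurement present in row is in range."""
    for name, (low, high) in limits.items():
        value = row.get(name)
        if value is None:
            continue  # measurement not present in this row
        if low is not None and value < low:
            return False
        if high is not None and value > high:
            return False
    return True

def filter_rows(rows, limits=LIMITS):
    """Keep only the rows whose measurements are all physically plausible."""
    return [row for row in rows if is_plausible(row, limits)]

rows = [
    {"TEMP_MAGNET_UP": 25.0, "T_DELAY": 55.0},
    {"TEMP_MAGNET_UP": 900.0},   # impossible temperature
    {"T_DELAY": -3.0},           # negative timing
]
print(filter_rows(rows))
```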

SLIDE 19

3 Preprocessing - Data Filtering II

◮ True pattern emerges

SLIDE 20

3 Preprocessing - Data Filtering III

◮ Impossible time delays removed

SLIDE 21

3 Preprocessing - Features

◮ All IPOC data
◮ + Continuous data at IPOC data timestamps (with forward fill)
◮ + LHC data at IPOC data timestamps (with forward fill)
◮ + Temporal features on Continuous and LHC data:

  • To catch temporal relationship in data
  • Sliding window features: mean & sum
  • Important parameter: sliding window size

◮ Done separately for both B1 and B2
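The sliding window features can be sketched as follows. The sketch assumes a count-based window over already aligned samples (the slides do not say whether the window is defined in samples or in time), and the function name is hypothetical.

```python
from collections import deque

def sliding_window_features(values, window_size):
    """Return a (mean, sum) pair over the last window_size values per position.

    The first positions use the shorter window that is available so far.
    """
    window = deque(maxlen=window_size)
    running_sum = 0.0
    features = []
    for v in values:
        if len(window) == window_size:
            running_sum -= window[0]  # this value is evicted by the append below
        window.append(v)
        running_sum += v
        features.append((running_sum / len(window), running_sum))
    return features

print(sliding_window_features([1, 2, 3, 4], window_size=2))
```

The window size controls how much history each feature summarises, which is why it is called out as an important parameter.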

SLIDE 23

4 Anomaly Detection

◮ Train machine learning model using preprocessed data
◮ Use the model to generate anomaly scores
◮ Rescale scores to [0, 1]
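The slides do not state how the scores are rescaled. A common choice is min-max scaling, sketched here under the assumption that a higher raw score already means "more anomalous"; the function name is hypothetical.

```python
def rescale_scores(raw_scores):
    """Min-max rescale raw scores to [0, 1].

    If all scores are equal there is no spread to rescale, so every
    score is mapped to 0.0.
    """
    low, high = min(raw_scores), max(raw_scores)
    if high == low:
        return [0.0 for _ in raw_scores]
    return [(s - low) / (high - low) for s in raw_scores]

print(rescale_scores([2.0, 4.0, 6.0]))
```

Rescaling makes the scores of different detectors comparable on one axis, at the cost of making each score depend on the extremes of the scored dataset.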

SLIDE 24

4 Anomaly Detection - Isolation Forest Anomaly Scores

SLIDE 25

4 Anomaly Detection - Gaussian Mixture Model Scores I

SLIDE 26

4 Anomaly Detection - Gaussian Mixture Model Scores II

SLIDE 27

4 Anomaly Detection - Dummy Detectors

◮ Simple detection strategies as baseline to compare to
◮ Constant, uniformly random, stratified random
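The three baseline strategies could look roughly like this. The exact definitions used in the thesis are not given on the slide; in particular, "stratified" is interpreted here as flagging a data point with a probability equal to an assumed anomaly rate, and all function names are hypothetical.

```python
import random

def constant_detector(n, score=0.0):
    """Assign the same anomaly score to every data point."""
    return [score] * n

def uniform_detector(n, rng):
    """Assign a uniformly random anomaly score in [0, 1) to every data point."""
    return [rng.random() for _ in range(n)]

def stratified_detector(n, anomaly_rate, rng):
    """Assign score 1.0 with probability anomaly_rate, otherwise 0.0."""
    return [1.0 if rng.random() < anomaly_rate else 0.0 for _ in range(n)]

rng = random.Random(0)  # fixed seed so that runs are repeatable
print(constant_detector(3))
print(uniform_detector(3, rng))
print(stratified_detector(10, anomaly_rate=0.1, rng=rng))
```

Any real detector should beat these baselines; if it does not, its scores carry no more information than chance.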

SLIDE 29

5 Postprocessing I

◮ Anomaly labels are unspecific, 12 hour range
◮ Will use segments instead of individual data tuples in evaluation
◮ Transform scored data into lists of IPOC segments
◮ Segment anomaly score based on anomaly scores of its data
◮ Anomalous behavior likely occurs in multiple successive timestamps
◮ These timestamps should get higher anomaly scores
◮ The segments that contain these timestamps should then have higher anomaly scores

SLIDE 30

5 Postprocessing II

Methods for Segment Anomaly Score:
◮ Max
◮ Top K (10)
◮ Top Percentage (25%)

Ground Truth Annotation:
◮ Need to compare segment anomaly scores to consistent basis of ground truth
◮ This allows for fair performance evaluation
◮ Mark segments as anomalous if they lie in the 12 hour range of an anomaly label
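The three segment scoring methods can be sketched as one function. The slide names the methods but not how the selected scores are combined, so averaging the top K or top percentage of scores is an assumption, as are the function and parameter names.

```python
import math

def segment_score(scores, method="max", k=10, percentage=0.25):
    """Aggregate the anomaly scores of one non-empty segment into one score."""
    ranked = sorted(scores, reverse=True)
    if method == "max":
        return ranked[0]
    if method == "topk":
        top = ranked[:k]  # fewer than k scores: use all of them
    elif method == "top_percentage":
        top = ranked[:max(1, math.ceil(len(ranked) * percentage))]
    else:
        raise ValueError(f"unknown method: {method}")
    return sum(top) / len(top)  # assumed aggregation: mean of selected scores

scores = [0.25, 1.0, 0.5, 0.75]
print(segment_score(scores, "max"))
print(segment_score(scores, "topk", k=2))
print(segment_score(scores, "top_percentage", percentage=0.5))
```

Max reacts to a single high score, while the averaged variants require several high scores in the same segment, which matches the expectation that anomalous behaviour spans multiple successive timestamps.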

SLIDE 31

5 Postprocessing III

We now have:
◮ A set of IPOC segments with anomaly scores
◮ Knowledge of which segments are actually anomalous
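How the "actually anomalous" flag might be derived from the logbook labels is sketched below. The sketch assumes that each segment is a (start, end) pair of datetimes and that the 12 hour range is the window ending at the label timestamp; the slides do not say where the window sits relative to the label, so this placement and all names are assumptions.

```python
from datetime import datetime, timedelta

def annotate_segments(segments, label_times, label_range=timedelta(hours=12)):
    """Return 1 for each segment that overlaps the range of any label, else 0.

    The range of a label is assumed to be [label - label_range, label].
    """
    truth = []
    for start, end in segments:
        overlaps = any(
            start <= label and end >= label - label_range
            for label in label_times
        )
        truth.append(1 if overlaps else 0)
    return truth

label = datetime(2016, 5, 1, 12, 0)  # hypothetical anomaly label
segments = [
    (datetime(2016, 5, 1, 3, 0), datetime(2016, 5, 1, 4, 0)),      # inside range
    (datetime(2016, 4, 30, 20, 0), datetime(2016, 4, 30, 23, 0)),  # too early
    (datetime(2016, 5, 1, 13, 0), datetime(2016, 5, 1, 14, 0)),    # after label
    (datetime(2016, 5, 1, 11, 30), datetime(2016, 5, 1, 12, 30)),  # spans label
]
print(annotate_segments(segments, [label]))
```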

SLIDE 33

6 Evaluation

◮ Anomaly scores lie in [0, 1]
◮ Ground truth is 0 or 1
◮ To evaluate performance, need to select a threshold anomaly score in order to count True Positives, False Positives, True Negatives, and False Negatives
◮ If score above threshold, then prediction is Positive, else Negative

                       Ground Truth Positive   Ground Truth Negative
Prediction Positive    TP                      FP
Prediction Negative    FN                      TN
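Counting the four outcomes for one threshold takes only a few lines. This sketch follows the rule on the slide (score strictly above the threshold means Positive); the function name and the sample values are hypothetical.

```python
def confusion_counts(scores, truth, threshold):
    """Return (TP, FP, TN, FN) for segment scores against 0/1 ground truth."""
    tp = fp = tn = fn = 0
    for score, actual in zip(scores, truth):
        predicted = score > threshold  # Positive only if strictly above
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

scores = [0.9, 0.8, 0.7, 0.2, 0.1]
truth = [1, 1, 0, 1, 0]
print(confusion_counts(scores, truth, threshold=0.5))
```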

SLIDE 34

6 Evaluation - Performance Metric

◮ Precision and Recall are useful in the context of imbalanced data
◮ $\mathrm{Precision} = \frac{TP}{TP + FP}$
◮ $\mathrm{Recall} = \frac{TP}{TP + FN}$
◮ But, want single number as performance metric for automated comparisons
◮ Calculate Precision and Recall for each possible anomaly score threshold and plot the resulting curve
◮ Performance metric = Area under Precision-Recall Curve (AUPR)
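A minimal sketch of the curve and its area follows. It uses every distinct score as a threshold and treats a segment as Positive when its score is at or above that threshold, which is equivalent to placing a strict threshold just below each score. The area is computed as a step-wise sum (often called average precision); the slides do not say which integration rule the thesis uses, so that choice and the function names are assumptions. At least one positive in the ground truth is required.

```python
def precision_recall_curve(scores, truth):
    """Return (recall, precision) points from the highest threshold down."""
    total_positives = sum(truth)
    ranked = sorted(zip(scores, truth), key=lambda pair: pair[0], reverse=True)
    points = []
    tp = fp = 0
    for i, (score, actual) in enumerate(ranked):
        if actual:
            tp += 1
        else:
            fp += 1
        # Emit one point per distinct score, after the last of any tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((tp / total_positives, tp / (tp + fp)))
    return points

def area_under_pr(points):
    """Step-wise area: each recall increase is weighted by its precision."""
    area, previous_recall = 0.0, 0.0
    for recall, precision in points:
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area

points = precision_recall_curve([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(points)
print(area_under_pr(points))
```

A detector that ranks every anomalous segment above every normal one reaches an area of 1.0 under this definition.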

SLIDE 35

6 Evaluation - Grid Search

◮ Many parameters for developed anomaly detection pipeline
◮ Segmentation distance, scale data, anomaly score method, anomaly detector, anomaly detector hyperparameters, labels
◮ Grid search for parameter optimization
◮ Pipeline is executed automatically with predetermined combinations of parameters built by a grid of parameters
◮ Results are stored and sorted by AUPR so that the best performing parameters can be found easily
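The grid search loop can be sketched with the standard library alone. Here `run_pipeline` stands for any function that runs the full pipeline and returns an AUPR; `toy_pipeline` is a placeholder with no physical meaning that only makes the sketch runnable, and the parameter names and grid values are illustrative assumptions.

```python
from itertools import product

def grid_search(param_grid, run_pipeline):
    """Run the pipeline for every parameter combination, best result first."""
    names = list(param_grid)
    results = []
    for combination in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combination))
        results.append((run_pipeline(**params), params))
    results.sort(key=lambda result: result[0], reverse=True)  # sort by AUPR
    return results

def toy_pipeline(segmentation_distance, anomaly_score_method, scale_data):
    # Placeholder score, NOT a real result: a real run would return the AUPR.
    return segmentation_distance / 1000 + len(anomaly_score_method) / 100

param_grid = {
    "segmentation_distance": [30, 60],        # minutes
    "anomaly_score_method": ["max", "topk"],
    "scale_data": [False, True],
}
results = grid_search(param_grid, toy_pipeline)
print(len(results), results[0])
```

The number of runs is the product of the list lengths in the grid, so adding one more parameter with a few values multiplies the total run time.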

SLIDE 37

7 Results - Dummy Detectors

Figure: PR curves of Dummy detectors with evaluation parameters segmentation distance = 30 min, anomaly score method = topk, labels = all.

SLIDE 38

7 Results - GMM I

◮ Best PR Curve

Figure: Parameters: n_components = 6, covariance_type = full, scale data = False, segmentation distance = 60 min, anomaly score method = topk, labels = anomaly

SLIDE 39

7 Results - GMM II

◮ Predictions with 99-th percentile anomaly score threshold

SLIDE 40

7 Results - Isolation Forest I

◮ Best PR Curve

Figure: Parameters: n_estimators = 250, max_samples = 5120, scale data = False, segmentation distance = 60 min, anomaly score method = max, labels = anomaly

SLIDE 41

7 Results - Isolation Forest II

◮ Predictions with 99-th percentile anomaly score threshold

SLIDE 42

7 Results - Isolation Forest III

◮ Only IPOC features, best PR Curve

SLIDE 43

7 Results - Isolation Forest IV

◮ Only IPOC features, predictions with 99-th percentile anomaly score threshold

SLIDE 44

7 Results - Source Code

◮ Written to be extensible
◮ Pipeline components in clear modules:

  • preprocessing
  • anomaly detection
  • postprocessing
  • evaluation
  • pipeline

◮ Parameters can be varied easily
◮ https://github.com/arminnh/masters-thesis

SLIDE 46

8 Conclusion

◮ Anomaly detection application has been developed
◮ Some anomalies are detected very well
◮ Many are still not detected at all
◮ Experiments have shown that performance can still be improved significantly
◮ More experiments should be done around feature selection

SLIDE 47

8 Future Work

◮ Feature selection
◮ Controller data
◮ Integration of more anomaly detectors (e.g. one class SVM or Local Outlier Factor)
◮ Better segmentation procedure without segmentation distance parameter
◮ More efficient and autonomous parameter optimization using e.g. Evolutionary algorithms or Bayesian Optimization

SLIDE 48

8 Bibliography

[1] CERN. Overview lhc. http://cern60.web.cern.ch/en/exhibitions/overview-lhc. Accessed 2018-06-27.
[2] W Herr and T Pieloni. Beam-beam effects. (arXiv:1601.05235):1–29, 2014. Contribution to the CAS - CERN Accelerator School: Advanced Accelerator Physics Course, Trondheim, Norway, 18-29 Aug 2013.
[3] Fei Tony Liu, Kai Ming Ting, and Zhi-Hua Zhou. Isolation forest. In Data Mining, 2008. ICDM’08. Eighth IEEE International Conference on, pages 413–422. IEEE, 2008.

SLIDE 49

Questions?

SLIDE 50

8 Extra - Comparison to Previous Work

◮ Enabled use of many machine learning models instead of just 1
◮ Segmentation of input data instead of segmentation of output anomaly scores
◮ Consistent basis of ground truth → more correct comparison of results
◮ Evaluation metrics in terms of TP, FP, TN, FN instead of ambiguous terms
◮ PR curve using all anomaly score thresholds instead of calculating Precision and Recall for one threshold
◮ ...

SLIDE 51

8 Extra - Isolation Forest

◮ Ensemble of simple decision trees which split randomly

  • n features

◮ Trees are grown for random samples of dataset until each data tuple forms a leaf node
◮ Average path length will be shorter for anomalies
◮ Works well in high dimensional problems
◮ ≈ Density estimation, but without a density measure

Source: [3]
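The idea can be made concrete with a small pure-Python sketch. It is not the implementation used for the results above, which the parameter names suggest came from a library. Following the Isolation Forest paper [3], trees are capped at a height of ceil(log2(sample size)) rather than grown until every tuple is a leaf, and the score is 2^(-E[h(x)] / c(n)); all function names are hypothetical and at least two data points are required.

```python
import math
import random

def c(n):
    """Average path length of an unsuccessful search in a binary tree of n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Split on a random feature at a random value until points are isolated."""
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    low = min(p[feature] for p in points)
    high = max(p[feature] for p in points)
    if low == high:
        return {"size": len(points)}  # cannot split on this feature
    split = rng.uniform(low, high)
    left = [p for p in points if p[feature] < split]
    right = [p for p in points if p[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }

def path_length(tree, point, depth=0):
    """Depth at which point lands, plus c(size) for an unfinished leaf."""
    if "split" not in tree:
        return depth + c(tree["size"])
    branch = "left" if point[tree["feature"]] < tree["split"] else "right"
    return path_length(tree[branch], point, depth + 1)

def anomaly_scores(points, n_trees=100, sample_size=64, seed=0):
    """Score each point in (0, 1]; shorter average paths give higher scores."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    scores = []
    for p in points:
        mean_path = sum(path_length(t, p) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_path / c(sample_size)))
    return scores

# Hypothetical data: a tight cluster plus one point far away from it.
data_rng = random.Random(1)
cluster = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(50)]
scores = anomaly_scores(cluster + [(10.0, 10.0)])
print(max(range(len(scores)), key=scores.__getitem__))  # index of top score
```

The far-away point is separated from the cluster by the first few random splits in most trees, so its average path is short and its score is the highest.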
