- 1-
Correlation-aware Deep Generative Model for Unsupervised Anomaly Detection
Haoyi Fan 1, Fengbin Zhang 1, Ruidong Wang 1, Liang Xi 1, Zuoyong Li 2
1 Harbin University of Science and Technology, 2 Minjiang University
isfanhy@hrbust.edu.cn
- 2-
Background
[Figure: normal and anomalous samples shown in the observed space and in the latent space]
- 3-
Background
Applications: Fraud Detection, Disease Detection, Fault Detection, Intrusion Detection

Image credits:
https://www.explosion.com/135494/5-effective-strategies-of-fraud-detection-and-prevention-for-ecommerce/
https://blog.exporthub.com/working-with-chinese-manufacturers/
https://planforgermany.com/switching-private-public-health-insurance-germany/
https://towardsdatascience.com/building-an-intrusion-detection-system-using-deep-learning-b9488332b321
- 7-
Background
Unsupervised Anomaly Detection from the Density Estimation Perspective

Training samples: $X_{train} = \{x_1, x_2, \dots, x_N\}$, where each $x_i$ is assumed to be normal.
Model: a density $p(x)$ fitted on $X_{train}$.
Test samples: $X_{test} = \{x_1, x_2, \dots, x_T\}$, where the label of each $x_u$ is unknown.
If $p(x_u) < \epsilon$, $x_u$ is abnormal; if $p(x_u) \ge \epsilon$, $x_u$ is normal.

Anomalies reside in the low-probability-density areas.
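The density-estimation recipe above fits in a few lines of code. Purely as an illustration, this sketch assumes the density model $p(x)$ is a diagonal Gaussian fitted to the training samples and the threshold $\epsilon$ is a low quantile of the training log-densities; neither choice is prescribed by the slide.

```python
import numpy as np

def fit_gaussian(X_train):
    """Fit a diagonal Gaussian density p(x) to training samples (assumed normal)."""
    mu = X_train.mean(axis=0)
    var = X_train.var(axis=0) + 1e-6  # small floor for numerical stability
    return mu, var

def log_density(X, mu, var):
    """Per-sample log of the fitted density."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (X - mu) ** 2 / var, axis=1)

rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(500, 2))   # normal data
X_test = np.array([[0.1, -0.2], [6.0, 6.0]])    # one normal sample, one anomaly

mu, var = fit_gaussian(X_train)
# threshold epsilon: 5th percentile of training log-densities (an assumption)
eps = np.quantile(log_density(X_train, mu, var), 0.05)
is_anomaly = log_density(X_test, mu, var) < eps  # flags the far-away point
```

Any density model (KDE, normalizing flow, GMM) can replace the Gaussian here; the decision rule $p(x_u) < \epsilon$ stays the same.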
- 8-
Background
Correlation among data samples
[Figure: conventional feature learning feeds the feature space directly to anomaly detection, while correlation-aware feature learning adds graph modeling of the structure space]

How can the normal pattern be discovered at both the feature level and the structural level?
- 9-
Problem Statement
Anomaly Detection
Given a set of input samples $X = \{x_i \mid i = 1, \dots, N\}$, each associated with a $d$-dimensional feature $x_i \in \mathbb{R}^d$, we aim to learn a score function $f(x_i): \mathbb{R}^d \mapsto \mathbb{R}$ and classify sample $x_i$ based on a threshold $\lambda$:

$y_i = \begin{cases} 1, & \text{if } f(x_i) \ge \lambda \\ 0, & \text{otherwise} \end{cases}$

where $y_i$ denotes the label of sample $x_i$, with 0 being the normal class and 1 the anomalous class.
Notations
$G$: graph. $V$: set of nodes in a graph. $E$: set of edges in a graph. $N$: number of nodes. $d$: dimension of the attribute. $A \in \mathbb{R}^{N \times N}$: adjacency matrix of a network. $X \in \mathbb{R}^{N \times d}$: feature matrix of all nodes.
- 10-
Method
CADGMM
Architecture overview: Graph Construction, Dual-Encoder, Feature Decoder, Estimation Network
- 11-
Method
CADGMM
Graph Construction: K-Nearest Neighbor (e.g., K = 5)

Original features: $X = \{x_i \mid i = 1, \dots, N\}$
Find neighbors by K-NN: $N_i = \{x_{ik} \mid k = 1, \dots, K\}$
Model the correlation as a graph: $G = \{V, E, X\}$, where
$V = \{v_i = x_i \mid i = 1, \dots, N\}$, $E = \{e_{ik} = (v_i, v_{ik}) \mid v_{ik} \in N_i\}$
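The K-NN graph construction step above can be sketched directly from the definitions: each node $v_i = x_i$ gets a directed edge to each of its $K$ nearest neighbors. This is a minimal numpy illustration using Euclidean distance (the distance metric is this sketch's assumption).

```python
import numpy as np

def knn_graph(X, K=5):
    """Build the adjacency matrix A of a K-NN graph from a feature matrix X (N x d).
    Node v_i = x_i is connected to its K nearest neighbors by Euclidean distance."""
    N = X.shape[0]
    # pairwise squared distances between all samples
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)           # exclude self as a neighbor
    nbrs = np.argsort(d2, axis=1)[:, :K]   # indices of the K nearest neighbors
    A = np.zeros((N, N))
    rows = np.repeat(np.arange(N), K)
    A[rows, nbrs.ravel()] = 1.0            # directed edge e_ik = (v_i, v_ik)
    return A

X = np.random.default_rng(1).normal(size=(20, 4))
A = knn_graph(X, K=5)   # every row has exactly K = 5 outgoing edges
```

For large N, a KD-tree or approximate nearest-neighbor index would replace the O(N^2) distance matrix, but the resulting graph is the same.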
- 12-
Method
CADGMM
Dual-Encoder:
- Feature Encoder (e.g., MLP, CNN, LSTM)
- Graph Encoder (e.g., GAT)
Feature Decoder
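The graph-encoder branch can be sketched as a single-head, GAT-style attention aggregation over the K-NN graph: each node attends to its neighbors and combines their transformed features. This numpy sketch uses random placeholder parameters (`W`, `a_src`, `a_dst` would be learned in the actual model) and is an illustration of the mechanism, not the paper's implementation.

```python
import numpy as np

def gat_layer(H, A, W, a_src, a_dst):
    """Simplified single-head graph-attention aggregation:
    e_ij = LeakyReLU(a_src . W h_i + a_dst . W h_j) for neighbors j of i,
    alpha_ij = softmax over i's neighbors, output_i = sum_j alpha_ij * W h_j."""
    HW = H @ W                                           # (N, d_out) transformed features
    logits = (HW @ a_src)[:, None] + (HW @ a_dst)[None, :]
    logits = np.where(logits > 0, logits, 0.2 * logits)  # LeakyReLU, slope 0.2
    logits = np.where(A > 0, logits, -np.inf)            # attend to neighbors only
    alpha = np.exp(logits - logits.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)     # row-wise softmax
    return alpha @ HW                                    # (N, d_out) node embeddings

rng = np.random.default_rng(0)
N, d_in, d_out = 6, 4, 3
H = rng.normal(size=(N, d_in))
A = np.ones((N, N)) - np.eye(N)          # toy graph: every other node is a neighbor
Z_graph = gat_layer(H, A, rng.normal(size=(d_in, d_out)),
                    rng.normal(size=d_out), rng.normal(size=d_out))
```

The feature-encoder branch (an MLP on `X`) produces a second embedding; the two are fused before the decoder and the estimation network.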
- 13-
Method
CADGMM
Estimation network
Gaussian Mixture Model
Initial embedding: $Z$

Membership prediction:
$Z^{(m)} = \sigma\!\left(Z^{(m-1)} W^{(m-1)} + b^{(m-1)}\right), \quad Z^{(0)} = Z$
$\hat{\gamma} = \mathrm{Softmax}\!\left(Z^{(M)}\right), \quad \hat{\gamma} \in \mathbb{R}^{N \times M}$

Parameter estimation:
$\hat{\mu}_m = \frac{\sum_{i=1}^{N} \hat{\gamma}_{i,m} Z_i}{\sum_{i=1}^{N} \hat{\gamma}_{i,m}}, \quad \hat{\Sigma}_m = \frac{\sum_{i=1}^{N} \hat{\gamma}_{i,m} (Z_i - \hat{\mu}_m)(Z_i - \hat{\mu}_m)^{T}}{\sum_{i=1}^{N} \hat{\gamma}_{i,m}}$

Sample energy:
$E_Z = -\log \sum_{m=1}^{M} \frac{\sum_{i=1}^{N} \hat{\gamma}_{i,m}}{N} \cdot \frac{\exp\!\left(-\frac{1}{2} (Z - \hat{\mu}_m)^{T} \hat{\Sigma}_m^{-1} (Z - \hat{\mu}_m)\right)}{\left|2\pi \hat{\Sigma}_m\right|^{1/2}}$
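Given the soft memberships $\hat{\gamma}$, the GMM parameter estimation and the energy computation above can be written out directly. This numpy sketch mirrors the equations (a small diagonal term is added to each covariance for numerical stability, an implementation detail not on the slide).

```python
import numpy as np

def estimate_gmm(Z, gamma):
    """GMM parameters from embeddings Z (N x d) and soft memberships gamma (N x M),
    per the slide: mixture weights phi_m, means mu_m, covariances Sigma_m."""
    N, d = Z.shape
    Nm = gamma.sum(axis=0)                 # (M,) soft counts
    phi = Nm / N                           # phi_m = sum_i gamma_im / N
    mu = (gamma.T @ Z) / Nm[:, None]       # (M, d) weighted means
    Sigma = np.zeros((gamma.shape[1], d, d))
    for m in range(gamma.shape[1]):
        diff = Z - mu[m]
        Sigma[m] = (gamma[:, m, None, None] * diff[:, :, None] * diff[:, None, :]).sum(0) / Nm[m]
        Sigma[m] += 1e-6 * np.eye(d)       # stability term (implementation detail)
    return phi, mu, Sigma

def energy(Z, phi, mu, Sigma):
    """E_Z = -log sum_m phi_m * N(Z | mu_m, Sigma_m); high energy = low density."""
    comps = []
    for m in range(len(phi)):
        diff = Z - mu[m]
        maha = np.einsum('nd,de,ne->n', diff, np.linalg.inv(Sigma[m]), diff)
        logdet = np.linalg.slogdet(2 * np.pi * Sigma[m])[1]
        comps.append(np.log(phi[m]) - 0.5 * maha - 0.5 * logdet)
    return -np.log(np.exp(np.stack(comps, axis=1)).sum(axis=1) + 1e-12)

rng = np.random.default_rng(0)
Z = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(6, 1, (100, 2))])
gamma = np.zeros((200, 2)); gamma[:100, 0] = 1.0; gamma[100:, 1] = 1.0
phi, mu, Sigma = estimate_gmm(Z, gamma)
E_in = energy(Z, phi, mu, Sigma)                       # low for in-distribution points
E_out = energy(np.array([[20.0, 20.0]]), phi, mu, Sigma)  # high for a far-away point
```

In the actual model, $\hat{\gamma}$ comes from the learned estimation network rather than hard cluster assignments, and everything is differentiated end to end.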
- 14-
Method

Loss and Anomaly Score

Loss function:
$\mathcal{L} = \| X - \hat{X} \|_2^2 + \lambda_1 E_Z + \lambda_2 \sum_{m=1}^{M} \sum_{j=1}^{d} \frac{1}{(\hat{\Sigma}_m)_{jj}} + \lambda_3 \| Z \|_2^2$
(reconstruction error + energy + covariance penalty + embedding penalty)

Anomaly score: $score = E_Z$, with the threshold $\lambda$ determined from the distribution of scores.

Solution to the problem: $y_i = 1$ if $f(x_i) \ge \lambda$, 0 otherwise.
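Assembled in code, the loss is a weighted sum of the four terms above. The $\lambda$ values below are illustrative placeholders, not the paper's actual settings.

```python
import numpy as np

def cadgmm_loss(X, X_rec, Z, E_Z, Sigma, lam1=0.1, lam2=0.005, lam3=0.0001):
    """Total loss per the slide: reconstruction error + energy
    + covariance penalty (discourages singular covariances) + embedding penalty.
    lam1/lam2/lam3 are placeholder weights for illustration only."""
    rec = ((X - X_rec) ** 2).sum()                            # ||X - X_hat||^2
    cov_pen = (1.0 / np.diagonal(Sigma, axis1=1, axis2=2)).sum()  # sum_m sum_j 1/(Sigma_m)_jj
    emb_pen = (Z ** 2).sum()                                  # ||Z||^2
    return rec + lam1 * E_Z.mean() + lam2 * cov_pen + lam3 * emb_pen

# toy shapes: 4 samples, 3 input features, 2-d embedding, 2 mixture components
X = np.ones((4, 3)); X_rec = np.zeros((4, 3))
Z = np.ones((4, 2)); E_Z = np.ones(4)
Sigma = np.stack([np.eye(2), 2 * np.eye(2)])
loss = cadgmm_loss(X, X_rec, Z, E_Z, Sigma)
```

At test time the anomaly score is simply $E_Z$, and the threshold $\lambda$ can be chosen from the score distribution, e.g. as a high percentile of the training-set scores.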
- 15-
Experiment
Datasets Baselines Evaluation Metrics
Evaluation metrics: Precision, Recall, F1-Score

Baselines: OC-SVM (Chen et al., 2001), IF (Liu et al., 2008), DSEBM (Zhai et al., 2016), DAGMM (Zong et al., 2018), AnoGAN (Schlegl et al., 2017), ALAD (Zenati et al., 2018)
- 16-
Experiment
Results
Consistent performance improvement!
- 17-
Experiment
Results
Less sensitive to noisy data! More robust!
- 18-
Experiment
Results
- Fig. Impact of different K values of the K-NN algorithm in graph construction.

Less sensitive to hyper-parameters! Easy to use!
- 19-
Experiment
Results
Explainable and Effective!
- Fig. Embedding visualization on KDD99 (blue indicates the normal samples and orange the anomalies). (a) DAGMM. (b) CADGMM.
- 20-
Conclusion and Future Works
- Conventional feature learning models cannot effectively capture the correlation among data samples for anomaly detection.
- We propose a general representation learning framework to model the complex correlation among data samples for unsupervised anomaly detection.
- We plan to explore the correlation among samples in extremely high-dimensional data sources such as images and video.
- We plan to develop an adaptive and learnable graph construction module for more reasonable correlation modeling.
- 21-
Reference
- [OC-SVM] Chen, Y., Zhou, X.S., Huang, T.S.: One-class SVM for learning in image retrieval. ICIP 2001.
- [IF] Liu, F.T., Ting, K.M., Zhou, Z.H.: Isolation forest. ICDM 2008.
- [DSEBM] Zhai, S., Cheng, Y., Lu, W., Zhang, Z.: Deep structured energy based models for anomaly detection. ICML 2016.
- [DAGMM] Zong, B., Song, Q., Min, M.R., Cheng, W., Lumezanu, C., Cho, D., Chen, H.: Deep autoencoding Gaussian mixture model for unsupervised anomaly detection. ICLR 2018.
- [AnoGAN] Schlegl, T., Seeböck, P., Waldstein, S.M., Schmidt-Erfurth, U., Langs, G.: Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. IPMI 2017.
- [ALAD] Zenati, H., Romain, M., Foo, C.S., Lecouat, B., Chandrasekhar, V.: Adversarially learned anomaly detection. ICDM 2018.
- 22-