Incorporating Feedback into Tree-based Anomaly Detection
Shubhomoy Das, Weng-Keen Wong, Alan Fern, Thomas G. Dietterich and Md Amran Siddiqui
School of EECS
Anomaly Detection
Goal: Identify rare or strange objects
[Diagram: Anomaly Detector (scoring function) produces a Ranking]
[Diagram, built up step by step: items in the Ranking are labeled Nominal or Anomaly one at a time]
[Diagram: tree whose internal nodes split into < and ≥ branches]
Random feature and random split point
Deeper leaf indicates nominal
Shallow leaf indicates anomaly
Typically 100 trees in practice
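The properties listed on this slide (a random feature and random split point at each node, with leaf depth as the signal) can be illustrated with a minimal sketch. It is not the authors' implementation: the names `build_tree`, `leaf_depth` and `mean_depth`, the depth cap of 8, the synthetic data, and the choice to grow every tree on the full data set rather than a subsample are all assumptions made for illustration.

```python
import random

def build_tree(points, rng, depth=0, max_depth=8):
    """Grow one tree by splitting on a random feature at a random point."""
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}              # leaf
    feature = rng.randrange(len(points[0]))       # random feature
    lo = min(p[feature] for p in points)
    hi = max(p[feature] for p in points)
    if lo == hi:
        return {"size": len(points)}              # nothing left to split on
    split = rng.uniform(lo, hi)                   # random split point
    left = [p for p in points if p[feature] < split]
    right = [p for p in points if p[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, rng, depth + 1, max_depth),
        "right": build_tree(right, rng, depth + 1, max_depth),
    }

def leaf_depth(tree, x):
    """Depth of the leaf that x falls into."""
    depth = 0
    while "feature" in tree:
        branch = "left" if x[tree["feature"]] < tree["split"] else "right"
        tree = tree[branch]
        depth += 1
    return depth

def mean_depth(forest, x):
    """Average leaf depth of x over all trees; shallower suggests anomaly."""
    return sum(leaf_depth(t, x) for t in forest) / len(forest)

# Assumed toy data: 200 clustered points plus one far-away point.
rng = random.Random(0)
nominal = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
outlier = (10.0, 10.0)
data = nominal + [outlier]

forest = [build_tree(data, rng) for _ in range(100)]   # 100 trees
outlier_depth = mean_depth(forest, outlier)
nominal_depth = mean_depth(forest, (0.0, 0.0))
print(f"outlier: {outlier_depth:.2f}, nominal: {nominal_depth:.2f}")
```

In this toy setting the isolated point tends to land in a much shallower leaf than a point in the dense cluster, which is the contrast the slide describes.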
(extremely sparse)
[Diagram, built up step by step: three successive iterations of the feedback loop; the label received at the first iteration is Nominal and at the second is Anomaly]
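The loop that these build-up slides depict (rank, show the top item, receive a Nominal or Anomaly label, adjust, repeat) can be sketched as follows. The update rule is a placeholder assumption, a simple additive nudge of per-feature weights, and is not the method presented in the talk. The function name `run_queries`, the toy feature vectors and the learning rate are likewise hypothetical.

```python
def run_queries(features, labels, budget, learning_rate=0.0):
    """Query the top-scored unlabeled item each round; return anomalies found.

    labels[i] is 1 for anomaly and 0 for nominal (simulated analyst feedback).
    With learning_rate=0.0 the ranking never changes (no-feedback baseline).
    """
    weights = [1.0] * len(features[0])
    unlabeled = set(range(len(features)))
    found = 0
    for _ in range(budget):
        def score(i):
            return sum(w * z for w, z in zip(weights, features[i]))
        top = max(unlabeled, key=score)       # item shown to the analyst
        unlabeled.remove(top)
        found += labels[top]
        # Assumed update: move toward confirmed anomalies, away from nominals.
        sign = 1.0 if labels[top] else -1.0
        weights = [w + sign * learning_rate * z
                   for w, z in zip(weights, features[top])]
    return found

# Toy data (assumed): feature 0 is misleading, feature 1 is informative.
features = [(1.0, 0.0), (0.8, 0.1), (0.1, 0.7),
            (0.0, 0.6), (0.2, 0.5), (0.3, 0.1)]
labels = [0, 0, 1, 1, 1, 0]

found_without = run_queries(features, labels, budget=3)
found_with = run_queries(features, labels, budget=3, learning_rate=0.5)
print(found_without, found_with)
```

On this toy data the feedback run finds more anomalies within the same budget because the first Nominal label reduces the weight of the misleading feature.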
Synthetic Dataset
Baseline discovers 12 anomalies in 35 iterations
AAD discovers 23 anomalies in 35 iterations
[Plot legend: True anomalies]
[Figure panels, built up step by step: 0 Feedback, 10 Feedback, 20 Feedback, 25 Feedback, 35 Feedback]
[Figure: 2-D scatter plots (axes x, y). Panels: Abalone Baseline, Abalone IF-AAD, ANN Thyroid 1v3 Baseline, ANN Thyroid 1v3 IF-AAD. Legend: False Positive, False Negative, True Positive, True Negative. Annotations: "anomalies have been discovered previously"; "unpromising regions"]
[Figure: number of anomalies seen versus iteration for IForest Baseline, IF-AAD and LODA-AAD. Panels: Abalone, Covtype, Mammography, Cardiotocography, KDDCup99, ANN Thyroid 1v3]
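As a small worked example of the quantity on the vertical axis of these plots, the sketch below assumes that "# anomalies seen" is a running count of true anomalies among the items queried so far. The function name `anomalies_seen` and the label sequence are made up for illustration.

```python
from itertools import accumulate

def anomalies_seen(query_labels):
    """Cumulative number of anomalies after each query (1 = anomaly, 0 = nominal)."""
    return list(accumulate(query_labels))

# Hypothetical feedback sequence over five iterations.
curve = anomalies_seen([0, 1, 1, 0, 1])
print(curve)  # [0, 1, 2, 2, 3]
```

A method whose curve rises faster reaches the same number of discovered anomalies with fewer queries.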
[Figure: number of anomalies seen versus iteration for IForest Baseline, IF-AAD and IF-AAD-Tree. Panels: Cardiotocography, KDDCup99, ANN Thyroid 1v3, Abalone, Covtype, Mammography]
[Figure: time (secs) versus number of queries for IF-AAD. Panels: Covtype, Mammography, Shuttle]