Anomaly Detection
Jia-Bin Huang Virginia Tech
Spring 2019
ECE-5424G / CS-5824
Administrative
Anomaly Detection
Motivation
Developing an anomaly detection system
Anomaly detection vs. supervised learning
Choosing what features
[Figure: training examples plotted with axes $x_1$ (heat) and $x_2$ (vibration)]
Model $p(x)$
[Figure: the same axes, $x_1$ (heat) and $x_2$ (vibration)]
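$\sigma$: standard deviation
$p(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$
$\mu = \frac{1}{m} \sum_{i=1}^{m} x^{(i)}$
$\sigma^2 = \frac{1}{m} \sum_{i=1}^{m} (x^{(i)} - \mu)^2$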
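The two estimates above can be sketched in a few lines of plain Python. This is only an illustration, not code from the lecture: the function names `fit_gaussian` and `gaussian_pdf` and the `heat` readings are made up for the example.

```python
import math

def fit_gaussian(xs):
    """Return (mu, sigma2) using the 1/m estimates from the slide."""
    m = len(xs)
    mu = sum(xs) / m
    sigma2 = sum((x - mu) ** 2 for x in xs) / m
    return mu, sigma2

def gaussian_pdf(x, mu, sigma2):
    """Density p(x; mu, sigma^2) of a univariate Gaussian."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

# Hypothetical heat readings, chosen only to make the arithmetic easy to follow.
heat = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu, sigma2 = fit_gaussian(heat)
p_at_mean = gaussian_pdf(mu, mu, sigma2)
```

Note that the variance here divides by m, as on the slide, rather than by m - 1.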
$p(x) = p(x_1; \mu_1, \sigma_1^2)\, p(x_2; \mu_2, \sigma_2^2) \cdots p(x_n; \mu_n, \sigma_n^2) = \prod_{j=1}^{n} p(x_j; \mu_j, \sigma_j^2)$
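Parameters $\mu_1, \dots, \mu_n, \sigma_1^2, \sigma_2^2, \dots, \sigma_n^2$ estimated from the $m$ training examples:
$\mu_j = \frac{1}{m} \sum_{i=1}^{m} x_j^{(i)}$
$\sigma_j^2 = \frac{1}{m} \sum_{i=1}^{m} (x_j^{(i)} - \mu_j)^2$
$p(x) = \prod_{j=1}^{n} p(x_j; \mu_j, \sigma_j^2)$
Anomaly if $p(x) < \varepsilon$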
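As a rough sketch of this procedure, the following plain-Python example fits one mean and one variance per feature, multiplies the per-feature densities, and compares the result with a threshold. The helper names, the tiny training set, and the value of `EPSILON` are all assumptions made for illustration.

```python
import math

def fit_features(X):
    """Per-feature mean and variance (1/m estimates) for a list of rows."""
    m, n = len(X), len(X[0])
    mus = [sum(row[j] for row in X) / m for j in range(n)]
    sigma2s = [sum((row[j] - mus[j]) ** 2 for row in X) / m for j in range(n)]
    return mus, sigma2s

def density(x, mus, sigma2s):
    """p(x) as the product of independent univariate Gaussian densities."""
    p = 1.0
    for xj, mu, s2 in zip(x, mus, sigma2s):
        p *= math.exp(-((xj - mu) ** 2) / (2.0 * s2)) / math.sqrt(2.0 * math.pi * s2)
    return p

def is_anomaly(x, mus, sigma2s, epsilon):
    """Flag x when its density falls below the threshold epsilon."""
    return density(x, mus, sigma2s) < epsilon

# Hypothetical training rows of (heat, vibration) for normal machines.
X_train = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [2.0, 9.0], [2.0, 13.0]]
mus, sigma2s = fit_features(X_train)
EPSILON = 1e-3  # assumed threshold; in practice it is tuned on labeled data

typical = is_anomaly([2.0, 11.0], mus, sigma2s, EPSILON)   # near the mean
outlier = is_anomaly([6.0, 20.0], mus, sigma2s, EPSILON)   # far from the mean
```

With many features the product can underflow, so a real implementation would likely sum log-densities instead.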
Labeled data of normal and anomalous examples ($y = 0$ if normal, $y = 1$ if anomalous)
Cross-validation set: $(x_{cv}^{(1)}, y_{cv}^{(1)}), (x_{cv}^{(2)}, y_{cv}^{(2)}), \dots, (x_{cv}^{(m_{cv})}, y_{cv}^{(m_{cv})})$
Test set: $(x_{test}^{(1)}, y_{test}^{(1)}), (x_{test}^{(2)}, y_{test}^{(2)}), \dots, (x_{test}^{(m_{test})}, y_{test}^{(m_{test})})$
Predict an anomaly if $p(x) < \varepsilon$, normal if $p(x) \ge \varepsilon$
$F_1$ score: $\frac{2PR}{P + R}$, with precision $P$ and recall $R$
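One common way to use the labeled cross-validation set is to try several candidate thresholds and keep the one with the best F1 score, where F1 = 2PR / (P + R) for precision P and recall R. The sketch below assumes that procedure. The function names, the densities in `p_cv`, and the candidate list are hypothetical.

```python
def f1_score(y_true, y_pred):
    """F1 = 2PR / (P + R), treating y = 1 (anomaly) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision or recall is zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)

def select_epsilon(p_cv, y_cv, candidates):
    """Return (epsilon, f1) for the candidate with the highest F1 on the CV set."""
    best_eps, best_f1 = None, -1.0
    for eps in candidates:
        y_pred = [1 if p < eps else 0 for p in p_cv]
        score = f1_score(y_cv, y_pred)
        if score > best_f1:
            best_eps, best_f1 = eps, score
    return best_eps, best_f1

# Hypothetical densities p(x) for six cross-validation examples and their labels.
p_cv = [0.20, 0.15, 0.001, 0.18, 0.0005, 0.02]
y_cv = [0, 0, 1, 0, 1, 0]
candidates = [0.0001, 0.005, 0.05, 0.5]
best_eps, best_f1 = select_epsilon(p_cv, y_cv, candidates)
```

Accuracy is a poor guide here because anomalies are rare, which is why the sketch scores each threshold with F1.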
Anomaly detection
Very small number of positive examples (y=1) (0-20 is common)
Large number of negative (y=0) examples
Hard for any algorithm to learn from positive examples what the anomalies look like
Future anomalies may look nothing like any of the anomalous examples we have seen so far
Supervised learning
Large number of positive and negative examples
Enough positive examples for the algorithm to get a sense of what positive examples are like; future positive examples are likely to be similar to the ones in the training set.
$\log(x)$
Want $p(x)$ large for normal examples $x$ and $p(x)$ small for anomalous examples $x$
Most common problem: $p(x)$ is comparable (say, both large) for normal and anomalous examples
Choose features that might take on unusually large or small values in the event of an anomaly
$x_5 = \frac{\text{CPU load}}{\text{network traffic}}$
$\frac{(\text{CPU load})^2}{\text{network traffic}}$
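A small sketch of why such ratio features help: a machine with high CPU load but almost no network traffic looks unremarkable on each raw feature, yet its ratio is far larger than usual. The function name and the sample readings are hypothetical, and the code assumes network traffic is never zero.

```python
def add_ratio_features(cpu_load, network_traffic):
    """Return [CPU load / traffic, (CPU load)^2 / traffic] for one machine.

    Assumes network_traffic > 0; a real pipeline would need to guard against zero.
    """
    return [cpu_load / network_traffic, cpu_load ** 2 / network_traffic]

# Hypothetical readings: a busy but healthy machine, and one stuck in a loop
# (high CPU load, almost no traffic).
healthy = add_ratio_features(0.5, 50.0)
stuck = add_ratio_features(0.95, 1.0)
```

Here the first ratio for the stuck machine is roughly two orders of magnitude larger than for the healthy one, so a density model fitted on healthy machines would assign it a very small $p(x)$.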
[Figure: two plots with axes $x_1$ (CPU load) and $x_2$ (memory use)]
$p(x; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^\top \Sigma^{-1} (x - \mu)\right)$
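[Figure: $p(x; \mu, \Sigma)$ over $x_1$, $x_2$ for $\Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, $\Sigma = \begin{bmatrix} 0.6 & 0 \\ 0 & 0.6 \end{bmatrix}$, $\Sigma = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}$]
[Figure: $p(x; \mu, \Sigma)$ over $x_1$, $x_2$ for $\Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, $\Sigma = \begin{bmatrix} 0.6 & 0 \\ 0 & 1 \end{bmatrix}$, $\Sigma = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$]
[Figure: $p(x; \mu, \Sigma)$ over $x_1$, $x_2$ for $\Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, $\Sigma = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix}$, $\Sigma = \begin{bmatrix} 1 & 0.8 \\ 0.8 & 1 \end{bmatrix}$]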
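1. Fit the parameters on the training set:
$\mu = \frac{1}{m} \sum_{i=1}^{m} x^{(i)}$
$\Sigma = \frac{1}{m} \sum_{i=1}^{m} (x^{(i)} - \mu)(x^{(i)} - \mu)^\top$
2. Given a new example $x$, compute
$p(x; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^\top \Sigma^{-1} (x - \mu)\right)$
Flag an anomaly if $p(x) < \varepsilon$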
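The fit-then-score steps can be sketched for the two-feature case, where the 2x2 inverse and determinant have closed forms and no linear algebra library is needed. This is an illustration under stated assumptions: the function names, the training rows, and `EPSILON` are invented, and the code assumes $\Sigma$ is invertible.

```python
import math

def fit_multivariate_2d(X):
    """Return (mu, Sigma) for rows of two features, using 1/m estimates."""
    m = len(X)
    mu = [sum(r[0] for r in X) / m, sum(r[1] for r in X) / m]
    s11 = sum((r[0] - mu[0]) ** 2 for r in X) / m
    s22 = sum((r[1] - mu[1]) ** 2 for r in X) / m
    s12 = sum((r[0] - mu[0]) * (r[1] - mu[1]) for r in X) / m
    return mu, [[s11, s12], [s12, s22]]

def density_2d(x, mu, Sigma):
    """Multivariate Gaussian density for n = 2 (Sigma assumed invertible)."""
    (a, b), (c, d) = Sigma
    det = a * d - b * c
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    # (x - mu)^T Sigma^{-1} (x - mu), using the closed-form 2x2 inverse
    quad = (d * d0 * d0 - (b + c) * d0 * d1 + a * d1 * d1) / det
    # (2*pi)^{n/2} |Sigma|^{1/2} with n = 2 is 2*pi*sqrt(det)
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

# Hypothetical (CPU load, memory use) rows in which the two features rise together.
X_train = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [2.0, 2.5], [2.0, 1.5]]
mu, Sigma = fit_multivariate_2d(X_train)
EPSILON = 1e-3  # assumed threshold

p_along = density_2d([3.0, 3.0], mu, Sigma)    # follows the correlation
p_against = density_2d([3.0, 1.0], mu, Sigma)  # unusual combination of values
```

Both test points are equally far from the mean on each axis, but only the second breaks the correlation, so only it falls below the threshold.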
Original model
$p(x_1; \mu_1, \sigma_1^2)\, p(x_2; \mu_2, \sigma_2^2) \cdots p(x_n; \mu_n, \sigma_n^2)$
Manually create features to capture anomalies where $x_1$, $x_2$ take unusual combinations of values
Computationally cheaper (alternatively, scales better)
OK even if training set size is small
Original model
$p(x; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^\top \Sigma^{-1} (x - \mu)\right)$
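A quick numerical check of how the two models relate: when $\Sigma$ is diagonal, with the per-feature variances on its diagonal, the multivariate density equals the product of univariate densities used by the original model. The sketch below verifies this for two features. All names and numbers are chosen for illustration only.

```python
import math

def univariate_pdf(x, mu, sigma2):
    """Univariate Gaussian density p(x; mu, sigma^2)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def product_model(x, mus, sigma2s):
    """Original model: product of per-feature univariate densities."""
    p = 1.0
    for xj, mu, s2 in zip(x, mus, sigma2s):
        p *= univariate_pdf(xj, mu, s2)
    return p

def multivariate_2d(x, mu, Sigma):
    """Multivariate Gaussian density for n = 2 (Sigma assumed invertible)."""
    (a, b), (c, d) = Sigma
    det = a * d - b * c
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    quad = (d * d0 * d0 - (b + c) * d0 * d1 + a * d1 * d1) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

# Arbitrary illustrative values.
x = [1.5, -0.5]
mus = [1.0, 0.0]
sigma2s = [0.6, 2.0]

p_orig = product_model(x, mus, sigma2s)
p_diag = multivariate_2d(x, mus, [[0.6, 0.0], [0.0, 2.0]])  # diagonal Sigma
p_corr = multivariate_2d(x, mus, [[0.6, 0.5], [0.5, 2.0]])  # off-diagonal terms added
```

Here `p_orig` and `p_diag` agree up to floating-point rounding, while `p_corr` differs because the off-diagonal entries model correlation that the original model cannot express.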