Interpretable, Multidimensional, Multimodal Anomaly Detection with Negative Sampling. ICML 2020. John Sipple (sipple@google.com)



  1. Interpretable, Multidimensional, Multimodal Anomaly Detection with Negative Sampling ICML 2020 John Sipple sipple@google.com July 2020

  2. Motivation: Outside range ● Correlations lost ● Complex patterns ● Novel failure modes ● Few failure examples

  3. Multidimensional ○ Correlated ○ Multimodal ○ Complex

  4. Anomaly Detection Problem
     What is “normal”? Why is it “anomalous”? How do we test?
     x: observed point in ℝ^D
     Normal: region in ℝ^D representing expected behavior

  5. Detect the Anomaly - What is “normal”? - How do we test?

  6. Anomaly Detection: Few/no failure labels challenge supervised approaches

     One-class Classifiers: Learn a transformation to separate the observed points from the origin.
     ● One-Class SVM (2001)
     ● Deep SVDD (2018)

     Density-Based Methods: Anomalous points occur in low-density regions.
     ● Local Outlier Factor (2000)
     ● Isolation Forest (2009) and Ext. Isolation Forest (2018)

     Autoencoders and Generative Models: Anomalies have larger reconstruction errors than Normal points.
     ● AnoGAN (2017)
     ● GANomaly (2018)
     ● DAE-DBC (2018)

     Negative Sampling: Explicitly define negative space for anomalies.
     ● Neg Selection Algorithms (NSA) (2002)
     ● Neg Sampling Classifiers (this work)

  7. Negative Sampling Anomaly Detection
     Positive Region = Observed ≈ Normal
     Negative Region = Complement of Positive ≈ Anomalous
     Train DNNs and Random Forests to predict P(x ∈ Normal)
     [Figure: example in ℝ^2 with axes “temp setpoint” and “temp observed”]

  8. Sampling the Training Set
     Positive Sample: Most observed points are normal, and anomalies are rare.
     Negative Sample: Computationally hard to define a tight hull of an arbitrary shape in ℝ^D. Alternatively, sample uniformly, with ∆u = 1.1∆v.
     Concentration Phenomenon: Volume increases exponentially with D.
     [Figure labels: ∆v (positive sample), ∆u (negative sample)]
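The sampling step on this slide can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the code from the talk: it assumes ∆v is the observed per-dimension range and ∆u the sampling range, and it pads both sides equally, which the slide does not specify. The function names (`negative_sample`, `build_training_set`, `positive_box_fraction`) are hypothetical.

```python
import random

def negative_sample(positive, n_neg, ratio=1.1, seed=0):
    """Draw n_neg points uniformly from a box slightly wider than the data."""
    rng = random.Random(seed)
    bounds = []
    for d in range(len(positive[0])):
        column = [row[d] for row in positive]
        lo, hi = min(column), max(column)
        delta_v = hi - lo                    # observed range in dimension d
        pad = (ratio - 1.0) * delta_v / 2.0  # assumption: pad both sides equally
        bounds.append((lo - pad, hi + pad))  # sampling range is ratio * delta_v wide
    negatives = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_neg)]
    return negatives, bounds

def build_training_set(positive, n_neg, ratio=1.1, seed=0):
    """Label observed points 1 (positive) and sampled points 0 (negative)."""
    negatives, _ = negative_sample(positive, n_neg, ratio, seed)
    X = positive + negatives
    y = [1] * len(positive) + [0] * len(negatives)
    return X, y

def positive_box_fraction(dims, ratio=1.1):
    """Share of the sampling box taken up by the observed bounding box."""
    return (1.0 / ratio) ** dims
```

One way to see the volume effect: with ratio = 1.1, the observed bounding box fills about 39% of the sampling box at D = 10 and under 1% at D = 50.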

  9. Anomaly Detection Pipeline: Select Positive Sample → Generate Negative Sample → Train Classifier → Classify Anomalies
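The four pipeline steps can be walked through end to end on toy data. The sketch below is illustrative only: the data generator is made up, and a k-nearest-neighbor vote stands in for the DNN or Random Forest named in the talk so that the example needs nothing beyond the standard library. All function names are hypothetical.

```python
import math
import random

def make_positive(n, rng):
    """Made-up 'normal' data: observed temperature tracks the setpoint."""
    rows = []
    for _ in range(n):
        setpoint = rng.uniform(18.0, 24.0)
        rows.append([setpoint, setpoint + rng.gauss(0.0, 0.2)])
    return rows

def make_negative(positive, n, rng, ratio=1.1):
    """Uniform sample over a box slightly wider than the observed ranges."""
    bounds = []
    for d in range(len(positive[0])):
        col = [row[d] for row in positive]
        pad = (ratio - 1.0) * (max(col) - min(col)) / 2.0
        bounds.append((min(col) - pad, max(col) + pad))
    return [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]

def p_normal(x, points, labels, k=15):
    """Stand-in classifier: share of positive labels among the k nearest points."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], x))
    return sum(labels[i] for i in order[:k]) / k

rng = random.Random(0)
positive = make_positive(300, rng)            # 1. select positive sample
negative = make_negative(positive, 300, rng)  # 2. generate negative sample
points = positive + negative                  # 3. "train" (kNN just stores the data)
labels = [1] * len(positive) + [0] * len(negative)

# 4. classify: both values of the second query lie inside their observed
# ranges, but the pair breaks the setpoint/observed correlation.
score_normal = p_normal([21.0, 21.0], points, labels)
score_anomaly = p_normal([19.0, 23.5], points, labels)
```

The second query shows why a per-dimension range check is not enough: each coordinate looks plausible alone, yet the point falls in the negative region and receives a low score.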

  10. Anomaly Detection Results

      ROC-AUC %         OC-SVM   Deep SVDD   Iso Forest   Extended Iso Forest   NegSample Rnd Forest   NegSample Neural Net
      Forest Cover *    53 ±20   69 ±7       85 ±4        93 ±1                 80 ±2                  86 ±4
      Shuttle *         93 ±0    88 ±9       96 ±1        91 ±1                 93 ±7                  96 ±5
      Mammography *     71 ±7    78 ±6       77 ±2        86 ±2                 85 ±4                  84 ±2
      Mulcross *        90 ±0    54 ±4       88 ±0        66 ±4                 94 ±1                  99 ±1
      Satellite *       51 ±1    62 ±3       67 ±2        71 ±3                 65 ±4                  73 ±3
      Smart Buildings   76 ±1    60 ±7       71 ±7        80 ±4                 95 ±1                  93 ±1

      * Courtesy of ODDS Library [http://odds.cs.stonybrook.edu]. Stony Brook, NY: Stony Brook University, Department of Computer Science

  11. Interpret the Anomaly - Why is it “anomalous”?

  12. Anomaly Interpretation
      Attribute influence with differentiable classifier function F(x), and Integrated Gradients (Sundararajan, 2017).
      Requires a neutral, baseline point, u*.
      (1) Choose a baseline set U* from the positive sample U, where U* are Normal.
      (2) Choose u* from U* with the minimum distance dist(∙,∙) to Anomaly x.
      Each dimension d gets a proportional blame B_d.
      By the Completeness Axiom, the sum across all dimensions should be nearly 1.
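The baseline choice and blame computation can be sketched as follows. This is an illustration under stated assumptions, not the talk's implementation: `F` is a hypothetical smooth function standing in for a trained classifier, gradients are taken by finite differences, the path integral is approximated with a midpoint sum, `dist` is taken to be Euclidean, and the 0.9 cutoff for membership in U* is arbitrary. The blame uses the standard Integrated Gradients form with the sign chosen so that the total is F(u*) − F(x).

```python
import math

def F(x, center=(20.0, 20.0), scale=(1.0, 1.0)):
    """Hypothetical smooth stand-in for a trained classifier's P(x in Normal)."""
    z = sum(((xi - ci) / si) ** 2 for xi, ci, si in zip(x, center, scale))
    return math.exp(-0.5 * z)

def gradient(f, x, eps=1e-5):
    """Central-difference estimate of the gradient of f at x."""
    grads = []
    for d in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[d] += eps
        lo[d] -= eps
        grads.append((f(hi) - f(lo)) / (2 * eps))
    return grads

def choose_baseline(f, positives, x, min_score=0.9):
    """Pick u*: the confidently Normal point (U*) closest to the anomaly x."""
    candidates = [u for u in positives if f(u) >= min_score]
    return min(candidates, key=lambda u: math.dist(u, x))

def blame(f, x, baseline, steps=200):
    """Integrated Gradients along the straight path from baseline to x."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint rule on [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for d, g in enumerate(gradient(f, point)):
            avg_grad[d] += g / steps
    # By completeness, the blames sum to f(baseline) - f(x).
    return [(b - xi) * g for xi, b, g in zip(x, baseline, avg_grad)]
```

In this toy setup, calling `blame(F, [20.0, 26.0], choose_baseline(F, positives, [20.0, 26.0]))` assigns essentially all of the blame to the second dimension, since only that coordinate differs from the baseline, and the blames sum to nearly 1 because F is close to 1 at the baseline and close to 0 at the anomaly.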

  13. Anomaly Detection Pipeline with Interpretability: Select Positive Sample → Generate Negative Sample → Train Classifier → Classify Anomalies → Choose Baseline → Blame Variables

  14. Case Study: Smart Buildings
      Objective: Make buildings smarter, secure and reduce energy use! Improve occupant comfort and productivity while also improving facilities’ operation efficiencies.
      120 million measurements daily, generated by over 15,000 climate control devices, in 145 Google buildings.
      Since going live in June 2019, FDD has created 458 facilities technician work orders, with a 44% True Positive rate.

  15. Thank You https://github.com/google/madi
