1. Automatic Outlier Detection: A Bayesian Approach
   Jo-Anne Ting, University of Southern California
   Aaron D'Souza, Google, Inc.
   Stefan Schaal, University of Southern California
   ICRA 2007, April 12, 2007

2. Outline
   • Motivation
   • Past & related work
   • Bayesian regression for automatic outlier detection
     – Batch version
     – Incremental version
   • Results
     – Synthetic data
     – Robotic data
   • Conclusions

3. Motivation
   • Real-world sensor data is susceptible to outliers
     – e.g., motion capture (MOCAP) data of a robotic dog

4. Outline
   • Motivation
   • Past & related work
   • Bayesian regression for automatic outlier detection
     – Batch version
     – Incremental version
   • Results
     – Synthetic data
     – Robotic data
   • Conclusions

5. Past & Related Work
   • Current methods for outlier detection may:
     – Require parameter tuning (i.e., an optimal threshold)
     – Require sampling (e.g., active sampling; Abe et al., 2006) or the setting of certain parameters, e.g., k in k-means clustering (MacQueen, 1967)
     – Assume an underlying data structure (e.g., mixture models; Fox et al., 1999)
     – Adopt a weighted linear regression model but model the weights with a heuristic function (e.g., robust least squares; Hoaglin, 1983)

6. Outline
   • Motivation
   • Past & related work
   • Bayesian regression for automatic outlier detection
     – Batch version
     – Incremental version
   • Results
     – Synthetic data
     – Robotic data
   • Conclusions

7. Bayesian Regression for Automatic Outlier Detection
   • Consider linear regression: $y_i = \mathbf{b}^T \mathbf{x}_i + \epsilon_{y_i}$
   • We can modify the above to get a weighted linear regression model (Gelman et al., 1995):
     $$y_i \sim \text{Normal}\!\left(\mathbf{b}^T \mathbf{x}_i,\; \sigma^2 / w_i\right), \qquad \mathbf{b} \sim \text{Normal}\!\left(\mathbf{b}_0,\; \Sigma_{b_0}\right)$$
   • Except now each sample has its own weight:
     $$w_i \sim \text{Gamma}\!\left(a_{w_i}, b_{w_i}\right)$$
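To make the generative model concrete, here is a minimal sampling sketch in Python/NumPy; the function name and the prior parameters `a_w`, `b_w` are illustrative placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_weighted_regression(X, b, sigma2, a_w=2.0, b_w=2.0):
    """Draw y_i ~ Normal(b^T x_i, sigma^2 / w_i), where each sample
    carries its own weight w_i ~ Gamma(a_w, b_w)."""
    N = X.shape[0]
    # NumPy's gamma takes a scale parameter: scale = 1 / rate
    w = rng.gamma(shape=a_w, scale=1.0 / b_w, size=N)
    y = X @ b + rng.normal(0.0, np.sqrt(sigma2 / w))
    return y, w
```

A sample with a small weight $w_i$ has inflated noise variance $\sigma^2 / w_i$; inferring the weights from data is what makes the outlier detection automatic.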

8. Bayesian Regression for Automatic Outlier Detection
   • This Bayesian treatment of weighted linear regression:
     – Is suitable for real-time outlier detection
     – Requires no model assumptions
     – Requires no parameter tuning

9. Bayesian Regression for Automatic Outlier Detection
   • Our goal is to infer the posterior distributions of $\mathbf{b}$ and $\mathbf{w}$
   • We can treat this as an EM problem (Dempster et al., 1977) and maximize the incomplete log likelihood $\log p(\mathbf{y} \mid \mathbf{X})$ by maximizing the expected complete log likelihood $E\!\left[\log p(\mathbf{y}, \mathbf{b}, \mathbf{w} \mid \mathbf{X})\right]$
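As a reminder of why maximizing the expected complete log likelihood works (standard EM reasoning, not spelled out on the slide): for any distribution $Q$ over the hidden variables, Jensen's inequality gives

$$\log p(\mathbf{y} \mid \mathbf{X}) \;\geq\; E_Q\!\left[\log p(\mathbf{y}, \mathbf{b}, \mathbf{w} \mid \mathbf{X})\right] - E_Q\!\left[\log Q(\mathbf{b}, \mathbf{w})\right],$$

so alternating between tightening this bound (E-step) and maximizing it (M-step) monotonically increases $\log p(\mathbf{y} \mid \mathbf{X})$.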

10. Bayesian Regression for Automatic Outlier Detection
   • In the E-step, we need to calculate $E_{Q(\mathbf{b}, \mathbf{w})}\!\left[\log p(\mathbf{y}, \mathbf{b}, \mathbf{w} \mid \mathbf{X})\right]$, but since the true posterior over all hidden variables is analytically intractable, we make a factorial variational approximation (Hinton & van Camp, 1993; Ghahramani & Beal, 2000):
     $$Q(\mathbf{b}, \mathbf{w}) = Q(\mathbf{b})\, Q(\mathbf{w})$$

11. Bayesian Regression for Automatic Outlier Detection
   • EM update equations (batch version). Reminder: $y_i \sim \text{Normal}(\mathbf{b}^T \mathbf{x}_i, \sigma^2 / w_i)$

   E-step:
   $$\Sigma_b = \left(\Sigma_{b_0}^{-1} + \frac{1}{\sigma^2} \sum_{i=1}^{N} \langle w_i \rangle\, \mathbf{x}_i \mathbf{x}_i^T\right)^{-1}$$
   $$\langle \mathbf{b} \rangle = \Sigma_b \left(\Sigma_{b_0}^{-1} \mathbf{b}_0 + \frac{1}{\sigma^2} \sum_{i=1}^{N} \langle w_i \rangle\, y_i \mathbf{x}_i\right)$$
   $$\langle w_i \rangle = \frac{a_{w_i,0} + \tfrac{1}{2}}{b_{w_i,0} + \tfrac{1}{2\sigma^2}\left[\left(y_i - \langle \mathbf{b} \rangle^T \mathbf{x}_i\right)^2 + \mathbf{x}_i^T \Sigma_b \mathbf{x}_i\right]}$$

   M-step:
   $$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} \langle w_i \rangle \left[\left(y_i - \langle \mathbf{b} \rangle^T \mathbf{x}_i\right)^2 + \mathbf{x}_i^T \Sigma_b \mathbf{x}_i\right]$$

   If a point's prediction error is very large, $\langle w_i \rangle$ goes to 0 and the point is downweighted.
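A compact sketch of these batch updates in Python/NumPy follows; the initialization, iteration count, and variable names are assumptions layered on the slide's equations, not the authors' code.

```python
import numpy as np

def batch_em(X, y, b0, Sigma_b0, a_w0=1.0, b_w0=1.0, n_iters=100):
    """Batch EM for Bayesian weighted regression with automatic
    outlier downweighting (following the updates above)."""
    N, d = X.shape
    E_w = np.ones(N)        # E[w_i]: start by treating all points as inliers
    sigma2 = np.var(y)      # initial noise-variance guess
    S0_inv = np.linalg.inv(Sigma_b0)
    for _ in range(n_iters):
        # E-step: posterior over b given the current weights
        Sigma_b = np.linalg.inv(S0_inv + (X.T * E_w) @ X / sigma2)
        E_b = Sigma_b @ (S0_inv @ b0 + X.T @ (E_w * y) / sigma2)
        # E-step: posterior weights; a large residual drives E[w_i] toward 0
        resid2 = (y - X @ E_b) ** 2
        quad = np.einsum("ij,jk,ik->i", X, Sigma_b, X)  # x_i^T Sigma_b x_i
        E_w = (a_w0 + 0.5) / (b_w0 + (resid2 + quad) / (2.0 * sigma2))
        # M-step: noise variance
        sigma2 = np.mean(E_w * (resid2 + quad))
    return E_b, E_w, sigma2
```

Points whose final $E[w_i]$ is near zero are the detected outliers; no threshold needs tuning because the Gamma posterior shrinks the weights continuously.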

12. Bayesian Regression for Automatic Outlier Detection
   • EM update equations (incremental version). Sufficient statistics are exponentially discounted by a forgetting factor $\lambda$, $0 \le \lambda \le 1$ (e.g., Ljung & Söderström, 1983):
   $$\langle wxx^T \rangle_k = \langle w_k \rangle\, \mathbf{x}_k \mathbf{x}_k^T + \lambda \langle wxx^T \rangle_{k-1}$$
   $$\langle wyx \rangle_k = \langle w_k \rangle\, y_k \mathbf{x}_k + \lambda \langle wyx \rangle_{k-1}$$
   $$\langle wy^2 \rangle_k = \langle w_k \rangle\, y_k^2 + \lambda \langle wy^2 \rangle_{k-1}$$
   $$N_k = 1 + \lambda N_{k-1}$$

   E-step:
   $$\Sigma_b = \left(\Sigma_{b_0}^{-1} + \frac{1}{\sigma^2} \langle wxx^T \rangle_k\right)^{-1}$$
   $$\langle \mathbf{b} \rangle = \Sigma_b \left(\Sigma_{b_0}^{-1} \mathbf{b}_0 + \frac{1}{\sigma^2} \langle wyx \rangle_k\right)$$
   $$\langle w_i \rangle = \frac{a_{w_i,0} + \tfrac{1}{2}}{b_{w_i,0} + \tfrac{1}{2\sigma^2}\left[\left(y_i - \langle \mathbf{b} \rangle^T \mathbf{x}_i\right)^2 + \mathbf{x}_i^T \Sigma_b \mathbf{x}_i\right]}$$

   M-step:
   $$\sigma^2 = \frac{1}{N_k}\left[\langle wy^2 \rangle_k - 2 \langle \mathbf{b} \rangle^T \langle wyx \rangle_k + \langle \mathbf{b} \rangle^T \langle wxx^T \rangle_k \langle \mathbf{b} \rangle + \mathrm{tr}\!\left(\Sigma_b \langle wxx^T \rangle_k\right)\right]$$
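A sketch of the incremental updates in the same Python/NumPy style; the class name, defaults, and initialization are assumptions, not the authors' code.

```python
import numpy as np

class IncrementalOutlierRegression:
    """Incremental Bayesian weighted regression with a forgetting
    factor lam that exponentially discounts sufficient statistics."""

    def __init__(self, d, b0=None, Sigma_b0=None, a_w0=1.0, b_w0=1.0, lam=0.999):
        self.lam, self.a_w0, self.b_w0 = lam, a_w0, b_w0
        self.b0 = np.zeros(d) if b0 is None else b0
        self.S0_inv = np.linalg.inv(np.eye(d) if Sigma_b0 is None else Sigma_b0)
        # exponentially discounted sufficient statistics
        self.wxx, self.wyx = np.zeros((d, d)), np.zeros(d)
        self.wy2, self.N = 0.0, 0.0
        self.E_b, self.Sigma_b = self.b0.copy(), np.linalg.inv(self.S0_inv)
        self.sigma2 = 1.0

    def update(self, x, y):
        # E-step for the new point: its weight under the current posterior
        resid2 = (y - self.E_b @ x) ** 2
        quad = x @ self.Sigma_b @ x
        E_w = (self.a_w0 + 0.5) / (self.b_w0 + (resid2 + quad) / (2.0 * self.sigma2))
        # discount the old statistics and absorb the new point
        self.wxx = E_w * np.outer(x, x) + self.lam * self.wxx
        self.wyx = E_w * y * x + self.lam * self.wyx
        self.wy2 = E_w * y * y + self.lam * self.wy2
        self.N = 1.0 + self.lam * self.N
        # E-step: posterior over b from the sufficient statistics
        self.Sigma_b = np.linalg.inv(self.S0_inv + self.wxx / self.sigma2)
        self.E_b = self.Sigma_b @ (self.S0_inv @ self.b0 + self.wyx / self.sigma2)
        # M-step: noise variance
        self.sigma2 = (self.wy2 - 2.0 * self.E_b @ self.wyx
                       + self.E_b @ self.wxx @ self.E_b
                       + np.trace(self.Sigma_b @ self.wxx)) / self.N
        return E_w  # a small E_w flags the point as an outlier
```

With $\lambda$ close to 1 (the slides that follow use $\lambda = 0.999$), old samples are forgotten slowly, so the estimator can track slow drift while still discounting stale data.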

13. Outline
   • Motivation
   • Past & related work
   • Bayesian regression for automatic outlier detection
     – Batch version
     – Incremental version
   • Results
     – Synthetic data
     – Robotic data
   • Conclusions

14. Results: Synthetic Data
   • Given noisy data (plus outliers) from a linear regression problem:
     – 5 input dimensions
     – 1000 samples
     – SNR = 10
     – 20% outliers
     – Outliers are 3σ from the output mean
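One plausible way to synthesize such a data set (the paper's exact recipe is not given here, so the noise and outlier constructions below are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

d, N = 5, 1000
X = rng.normal(size=(N, d))
b_true = rng.normal(size=d)
y_clean = X @ b_true                      # true conditional output mean

# additive Gaussian noise at SNR = 10 (signal variance / noise variance)
noise_var = np.var(y_clean) / 10.0
y = y_clean + rng.normal(0.0, np.sqrt(noise_var), size=N)

# corrupt 20% of the samples: place them at least 3*sigma from the output
# mean, where sigma is the std. dev. of the true conditional output mean
sigma = np.std(y_clean)
idx = rng.choice(N, size=N // 5, replace=False)
signs = rng.choice([-1.0, 1.0], size=idx.size)
y[idx] = y_clean.mean() + signs * (3.0 * sigma + rng.exponential(sigma, size=idx.size))
```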

15. Results: Synthetic Data Available in Batch Form

   Average normalized mean squared prediction error as a function of how far
   outliers are from inliers (columns: distance of outliers from the mean is
   at least the stated multiple of σ):

   Algorithm                                    +3σ      +2σ      +σ
   Thresholding (optimally tuned)               0.0903   0.0503   0.0232
   Mixture model                                0.1327   0.0688   0.0286
   Robust least squares                         0.1890   0.1518   0.0880
   Robust regression (Faul & Tipping, 2001)     0.1320   0.0683   0.0282
   Bayesian weighted regression                 0.0273   0.0270   0.0210

   Bayesian weighted regression achieves the lowest prediction error in every
   column. Data: globally linear with 5 input dimensions, evaluated in batch
   form, averaged over 10 trials (SNR = 10; σ is the standard deviation of the
   true conditional output mean).
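For reference, the normalized mean squared prediction error in the table is assumed here to be the MSE divided by the variance of the targets, a common convention:

```python
import numpy as np

def nmse(y_true, y_pred):
    """Mean squared error normalized by the target variance."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2) / np.var(y_true)
```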

16. Results: Synthetic Data Available Incrementally
   [Figure: prediction error over time with outliers at least 2σ away (λ = 0.999), annotated to highlight the lowest prediction error]

17. Results: Synthetic Data Available Incrementally
   [Figure: prediction error over time with outliers at least 3σ away (λ = 0.999), annotated to highlight the lowest prediction error]

18. Results: Robotic Orientation Data
   • Offset between MOCAP data & IMU data for LittleDog

19. Results: Predicted Output on LittleDog MOCAP Data

20. Outline
   • Motivation
   • Past & related work
   • Bayesian regression for automatic outlier detection
     – Batch version
     – Incremental version
   • Results
     – Synthetic data
     – Robotic data
   • Conclusions

21. Conclusions
   • We have an algorithm that:
     – Automatically detects outliers in real time
     – Requires no user intervention, parameter tuning, or sampling
     – Performs on par with, and in some cases exceeds, standard outlier detection methods
   • Extensions to the Kalman filter and other filters
