

  1. A Bayesian Hybrid Approach to Unsupervised Time Series Discretization. Yoshitaka Kameya (Tokyo Institute of Technology), Gabriel Synnaeve (Grenoble University), Andrei Doncescu (LAAS-CNRS), Katsumi Inoue (National Institute of Informatics), Taisuke Sato (Tokyo Institute of Technology). TAAI-2010, 20/Nov/2010

  2. Outline
  • Review: unsupervised discretization of time series data
    – Preliminary experimental results
  • Hybrid discretization method based on variational Bayes
  • Experimental results
  • Summary and future work

  3. Discretization
  • ... converts numeric data into symbolic data
    – e.g. 3.2 → medium, 2.8 → medium, 0.1 → low, 6.4 → high
  • ... is a preprocessing task in knowledge discovery
    – Selection → Preprocessing → Transformation → Data mining → Interpretation/Evaluation [Fayyad et al. 1995]
  • ... may lead to noise reduction and a good data abstraction
    – We wish to have interpretable discrete levels
  • ... may help symbolic data mining
    – Frequent pattern mining
    – Inductive logic programming
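The numeric-to-symbolic mapping on this slide can be sketched as a simple threshold rule. The thresholds below are hypothetical, chosen only so that the slide's example values map to the levels shown:

```python
def to_symbol(x, thresholds=(1.0, 5.0)):
    """Map a numeric value to a symbolic level.
    The (low, high) thresholds are illustrative, not from the paper."""
    lo, hi = thresholds
    if x < lo:
        return "low"
    if x >= hi:
        return "high"
    return "medium"

print([to_symbol(v) for v in (3.2, 2.8, 0.1, 6.4)])
# ['medium', 'medium', 'low', 'high']
```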

  4. Unsupervised discretization of time series data
  Common strategy (combined sequentially or simultaneously):
  – Smoothing along the time (x) axis
  – Binning or clustering along the measurement (y) axis
  • Binning:
    – Equal width binning
    – Equal frequency binning
  • Clustering:
    – Hierarchical clustering [Dimitrova et al. 05]
    – K-means
    – Gaussian mixture models [Mörchen et al. 05b]
  • Smoothing:
    – Smoothing filters (moving averaging, Savitzky-Golay filters [Mörchen et al. 05b])
    – Regression trees [Geurts 01]
  • All-in-one methods:
    – SAX [Lin et al. 07]
    – Persist [Mörchen et al. 05a]
    – Continuous hidden Markov models [Mörchen et al. 05a]
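The sequential version of the common strategy above can be sketched in a few lines: smooth first, then bin. This is a minimal illustration with a moving average and equal-width binning, not any particular method from the slide's references:

```python
import numpy as np

def discretize(series, n_bins=3, window=5):
    """Two-step strategy: smooth along the time axis, then apply
    equal-width binning along the measurement axis."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(series, kernel, mode="same")   # moving average
    # n_bins - 1 interior breakpoints, equally spaced over the value range
    edges = np.linspace(smoothed.min(), smoothed.max(), n_bins + 1)[1:-1]
    return np.digitize(smoothed, edges)                   # symbols 0..n_bins-1
```

A rising ramp, for instance, is mapped to the symbol sequence 0, ..., 1, ..., 2.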

  5. Persist [Mörchen et al. 05a]
  • Assumption: the time series tries to stay at one of the discrete levels (= states) as long as possible
  • Persist greedily chooses the breakpoints so that fewer state changes occur
     a role of smoothing
  (Figure: breakpoints and state changes over states S1–S4)
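The persistence idea can be sketched numerically: score a candidate set of breakpoints by how strongly each state's self-transition probability exceeds its marginal probability. The original Persist uses a symmetric Kullback-Leibler score of this kind; the function below is a simplified sketch of that idea, not the paper's exact algorithm:

```python
import numpy as np

def persistence_score(series, breakpoints):
    """Higher score = the discretized series stays in its states longer
    than chance would predict (simplified Persist-style score)."""
    states = np.digitize(np.asarray(series, float), breakpoints)
    k = len(breakpoints) + 1
    marginal = np.array([(states == s).mean() for s in range(k)])
    score = 0.0
    for s in range(k):
        idx = np.where(states[:-1] == s)[0]
        if len(idx) == 0 or marginal[s] in (0.0, 1.0):
            continue
        p = (states[idx + 1] == s).mean()      # self-transition probability
        p = min(max(p, 1e-9), 1 - 1e-9)
        m = marginal[s]
        # symmetric KL between Bernoulli(p) and Bernoulli(m)
        score += (p - m) * (np.log(p / m) - np.log((1 - p) / (1 - m)))
    return score / k
```

A series that dwells in each level scores much higher than a shuffled copy of itself, which is exactly the smoothing bias the slide describes.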

  6. Continuous hidden Markov models
  • Two-step procedure:
    – Train the HMM
    – Find the most probable state sequence with the Viterbi algorithm
     State sequence = discretized time series
  • Discrete states S1 ... S8 emit Gaussian outputs X1 ... X8 (the measurements); the positions and shapes of the Gaussians (one mean per state) are adjusted by EM
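The second step above, finding the most probable state sequence, is standard Viterbi decoding. Here is a generic sketch for an HMM with 1-D Gaussian emissions; the parameters (`pi`, `A`, `means`, `stds`) are assumed to have been trained already, e.g. by EM as on the slide:

```python
import numpy as np

def viterbi_gaussian(x, pi, A, means, stds):
    """Most probable state sequence for an HMM with 1-D Gaussian emissions."""
    x, pi, A = np.asarray(x, float), np.asarray(pi, float), np.asarray(A, float)
    means, stds = np.asarray(means, float), np.asarray(stds, float)
    K, T = len(pi), len(x)
    # log N(x_t; mean_k, std_k^2) for every state k and time t
    logB = (-0.5 * ((x[None, :] - means[:, None]) / stds[:, None]) ** 2
            - np.log(stds[:, None]) - 0.5 * np.log(2 * np.pi))
    delta = np.log(pi) + logB[:, 0]          # best log-prob ending in each state
    psi = np.zeros((K, T), dtype=int)        # back-pointers
    for t in range(1, T):
        cand = delta[:, None] + np.log(A)    # cand[i, j]: come from i, go to j
        psi[:, t] = np.argmax(cand, axis=0)
        delta = cand[psi[:, t], np.arange(K)] + logB[:, t]
    states = np.empty(T, dtype=int)          # backtrack from the best end state
    states[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):
        states[t - 1] = psi[states[t], t]
    return states
```

With sticky transitions and well-separated means, the decoded state sequence is exactly the discretized time series the slide describes.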

  7. Preliminary experiment [Mörchen et al. 05]
  • Comparison of the predictive performance of the discretizers
  • We used an artificial dataset called the "enduring-state" dataset (with noise and outliers)
  • How well do the discretizers recover the answers?
    – SAX
    – Persist
    – HMMs
    – Equal width binning (EQW)
    – Equal frequency binning (EQF)
    – Gaussian mixture model (GMM)

  8. Preliminary experiment (Cont'd)
  • Error analysis: Persist
    – The levels are correctly identified
    – However, many noisy points cross the level boundaries
  (Figure: 5 levels, 5% outliers)

  9. Preliminary experiment (Cont'd)
  • Error analysis: HMMs
    – Some levels are misidentified
    – Small noise is correctly smoothed out
  (Figure: 5 levels, 5% outliers)

  10. Motivation
  From the preliminary experiments, we can see:
  – Persist: robust in identifying the discrete levels (its heuristic score captures the global behavior of the time series)
  – HMMs: good at local smoothing
  Our proposal: hybridization of heterogeneous discretizers based on variational Bayes

  11. Variational Bayes
  • An efficient technique for Bayesian learning [Beal 03]
    – Empirically known to be robust against outliers
    – Gives a principled way of determining the number of discrete levels
  • An HMM is modeled as p(x, z, θ) = p(θ) p(x, z | θ), where
    – x: input time series (X1 ... X8)
    – z: hidden state sequence (S1 ... S8), i.e. the discretized time series
    – θ: parameters
    – p(θ): prior
    – p(x, z | θ): likelihood
  • Prior on the means and variances of the HMM's Gaussians: the Normal-Gamma distribution (conjugate prior), with hyperparameters

  12. Variational Bayes (Cont'd)
  Variational Bayesian EM in general form:
  – We try to find q = q* that maximizes the variational free energy F[q]:
      F[q] = \sum_z \int q(z, \theta) \log \frac{p(x, z, \theta)}{q(z, \theta)} \, d\theta
  – F[q] is a lower bound of the log marginal likelihood L(x):
      L(x) = \log p(x) = \log \sum_z \int p(x, z, \theta) \, d\theta \geq F[q]
     F[q*] is a good approximation of L(x)
  – To get q*, assuming q(z, θ) ≈ q(z) q(θ), we iterate the two steps alternately:
      VB-E step: q(z) \propto \exp\!\left( \int q(\theta) \log p(x, z \mid \theta) \, d\theta \right)
      VB-M step: q(\theta) \propto p(\theta) \exp\!\left( \sum_z q(z) \log p(x, z \mid \theta) \right)
  – Since L(x) - F[q*] = KL(q*(z, θ) || p(z, θ | x)), q* is a good approximation of the posterior distribution and is therefore used for discretization

  13. Hybridization
  • We aim to control the HMM through the settings of τ and m_k
  • The means of the Gaussians are updated from:
    – the expected mean of the Gaussian for level k
    – the expected count of staying at level k
    – the prior mean m_k of the Gaussian for level k
    – the weight τ (a pseudo count)
  • We simply set the prior means m_k from the breakpoints b_k obtained by Persist
  • In a similar way, we can also combine HMMs with SAX
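The slide's update formula is not legible in this transcript, but under the Normal-Gamma conjugate prior named on slide 11, the standard posterior-expected mean combines the quantities listed above as follows. This is a sketch of that standard update, assumed (not confirmed) to match the slide's formula:

```python
def expected_mean(tau, prior_mean, exp_count, exp_data_mean):
    """Posterior-expected Gaussian mean for one level under a conjugate
    Normal prior: a pseudo-count-weighted average of the Persist-derived
    prior mean and the data's expected mean. (Standard conjugate update,
    assumed to correspond to the slide's garbled formula.)"""
    return (tau * prior_mean + exp_count * exp_data_mean) / (tau + exp_count)
```

A large weight τ pins the Gaussian means to the Persist breakpoints (global level structure), while a small τ lets the HMM's expected counts dominate (local smoothing), which is exactly the trade-off the hybrid method tunes.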

  14.–18. Experiment 1: "Enduring-state" dataset
  (Figure slides: the raw time series (input) and its discretizations for weight τ = 0.5, 1, 5, 10, 20, 50, 70, 100, shown across varying ratios of outliers)

  19. Experiment 1: "Enduring-state" dataset (results)
  • Weight τ = 0.5, 1, 5, 10, 20, 50, 70, 100
  • Under accuracy, HMM+Persist is significantly better than Persist, except in several cases with a large number of levels and many outliers
  • Under NMI, HMM+Persist is significantly better than Persist in all cases, according to Wilcoxon's rank-sum test (p = 0.01)

  20. Experiment 2: Background
  • Also based on [Mörchen et al. 05a]
  • Data on the muscle activation of a professional inline speed skater
    – Nearly 30,000 points, recorded in log scale

  21. Experiment 2: Goal
  • Estimate a plausible number of discrete levels automatically with variational Bayes
  • An expert prefers to have 3 levels [Mörchen et al. 05a]:
    – last kick to the ground to move forward
    – gliding phase (the muscle is used to keep stability)

  22. Experiment 2: Settings
  • Having so many (30,000) data points, we need to:
    – use large pseudo counts (≥ 500)
    – use PAA (as used in SAX) to compress the time series (frame size = 50)
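PAA (Piecewise Aggregate Approximation), the compression step borrowed from SAX, simply replaces each non-overlapping frame with its mean. A minimal sketch:

```python
import numpy as np

def paa(series, frame_size):
    """Piecewise Aggregate Approximation: compress a series by replacing
    each non-overlapping frame of length frame_size with its mean."""
    series = np.asarray(series, float)
    n = len(series) // frame_size * frame_size   # drop any ragged tail
    return series[:n].reshape(-1, frame_size).mean(axis=1)
```

With frame size 50, the experiment's roughly 30,000 points compress to about 600 frame means.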

  23. Experiment 2: Discretization by CHMMs (Cont'd)
  • PAA disabled
  • Savitzky-Golay filter enabled, with half-window size = 100
  • Pseudo counts = 1
  (Figure: discretization result vs. number of levels)
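The Savitzky-Golay filter used in this setting fits a low-order polynomial by least squares in a sliding window and takes its value at the window centre. A plain-NumPy sketch (the slide uses half-window size 100; any size works here):

```python
import numpy as np

def savitzky_golay(y, half_window, polyorder=2):
    """Savitzky-Golay smoothing: least-squares polynomial fit in each
    sliding window, evaluated at the window centre."""
    hw = half_window
    t = np.arange(-hw, hw + 1)
    # Design matrix with columns [1, t, t^2, ...]; the first row of its
    # pseudo-inverse gives the weights that yield the fit's value at t = 0.
    A = np.vander(t, polyorder + 1, increasing=True)
    coeffs = np.linalg.pinv(A)[0]
    ypad = np.pad(np.asarray(y, float), hw, mode="edge")  # handle the borders
    return np.convolve(ypad, coeffs[::-1], mode="valid")
```

Because the fit is exact for polynomials up to `polyorder`, the filter preserves slow trends (a quadratic passes through unchanged in the interior) while averaging out high-frequency noise.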

  24. Experiment 2: Discretization by CHMMs (Cont'd)
  • PAA disabled
  • Pseudo counts = 1000
  (Figure: discretization result vs. number of levels)

  25. Experiment 2: Discretization by CHMMs (Cont'd)
  • PAA enabled, with frame size = 10
  • Pseudo counts = 1
  (Figure: discretization result vs. number of levels)

  26. Experiment 2: Discretization by CHMMs (Cont'd)
  • PAA enabled, with frame size = 20
  • Pseudo counts = 1
  (Figure: discretization result vs. number of levels)

  27. Summary
  • Unsupervised discretization of time series data
  • Hybridizing heterogeneous discretizers via variational Bayes
    – Fast approximate Bayesian inference
    – Robust against noise
    – Automatic finding of a plausible number of discrete levels
  • Future work
    – Histogram-based discretizer
