kernels to detect abrupt changes in time series

Kernels to detect abrupt changes in time series Alain Celisse 1 UMR - PowerPoint PPT Presentation


  1. Intro. Framework Algorithm Change-pts location? (D fixed) How many chg-pts? Kernels to detect abrupt changes in time series. Alain Celisse (1 UMR 8524 CNRS - Université Lille 1, 2 Modal INRIA team-project, 3 SSB group, Paris), joint work with S. Arlot, Z. Harchaoui, G. Rigaill, and G. Marot. “Computational and statistical trade-offs in learning”, IHES Paris, March 22nd, 2016. 1/47 Kernels to detect abrupt changes in time series, Alain Celisse

  2. Outline: (1) Motivating examples and framework (kernels); (2) KCP Algorithm and computational complexity; (3) Where are the change-points (D fixed)?; (4) How many change-points?

  3. Change-point detection: 1-D signal (example). [Figure: signal and regression function plotted against position t, from 0 to 100, with two question marks on the plot.]

  4. Detect abrupt changes... General purposes: (1) Detect changes in (features of) the distribution (not only in the mean).

  5. Abrupt changes in high-order moments → detecting changes in the mean is useless.

  6. Detect abrupt changes... General purposes: (1) Detect changes in (features of) the distribution (not only in the mean). (2) Complex data. High-dimension: measures in $\mathbb{R}^d$, curves, ... Structured: audio/video streams, graphs, DNA sequence, ...

  7. Motivating example 1: Structured objects. Description: video sequences from “Le grand échiquier”, a 70s-80s French talk show. At each time, one observes an image (high-dimensional). Each image is summarized by a histogram.

  8. Motivating example 2: Structured objects. Observe networks over time. Goal: detect abrupt changes in some features of the network.

  9. Detect abrupt changes... General purposes: (1) Detect changes in (features of) the distribution (not only in the mean). (2) Complex data. High-dimension: measures in $\mathbb{R}^d$, curves, ... Structured: audio/video streams, graphs, DNA sequence, ... (3) Fusion of heterogeneous data: deal simultaneously with different types of complex data. (4) Efficient algorithm able to handle large data sets (“Big data” challenge).

  10. I. Kernel framework

  11. Kernel and Reproducing Kernel Hilbert Space (RKHS)
      $X_1, \dots, X_n \in \mathcal{X}$: initial observations.
      $k(\cdot,\cdot) : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$: reproducing kernel (Aronszajn (1950)).
      $\mathcal{H}$: RKHS associated with $k(\cdot,\cdot)$ ($\phi : \mathcal{X} \to \mathcal{H}$ s.t. $\phi(x) = k(x,\cdot)$: canonical feature map).
      Assets: (i) versatile tool to work with different types of data; (ii) handles complex data (high dimensional/structured).

  12. Instances of kernels
      Gaussian kernel (with $\mathbb{R}^d$-valued data): $k_\delta(x, y) = \exp\left( -\frac{\|x - y\|^2}{\delta} \right)$, $\delta > 0$.
      $\chi^2$-kernel (with histogram-valued data): $k_I(p, q) = \exp\left( -\sum_{i=1}^{I} \frac{(p_i - q_i)^2}{p_i + q_i} \right)$.
      ...
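
As an illustration of the two kernels named on this slide, here is a minimal pure-Python sketch. The function names `gaussian_kernel` and `chi2_kernel` are hypothetical, and two conventions are assumptions rather than something the slides state: the chi-square kernel is taken without any extra normalizing constant, and bins where both histograms are empty contribute zero.

```python
import math
from typing import Sequence


def gaussian_kernel(x: Sequence[float], y: Sequence[float], delta: float = 1.0) -> float:
    """Gaussian kernel exp(-||x - y||^2 / delta) for vectors in R^d, delta > 0."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / delta)


def chi2_kernel(p: Sequence[float], q: Sequence[float]) -> float:
    """Chi-square kernel exp(-sum_i (p_i - q_i)^2 / (p_i + q_i)) for histograms.

    Assumption: a bin with p_i + q_i == 0 contributes 0 to the sum.
    """
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i + q_i > 0:
            total += (p_i - q_i) ** 2 / (p_i + q_i)
    return math.exp(-total)
```

Both functions return 1 when the two arguments coincide and decay toward 0 as they move apart, which is the behavior a similarity measure between observations needs here.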

  13. Model
      $\forall\, 1 \le i \le n$, $Y_i = \phi(X_i) = \mu_i^\star + \varepsilon_i \in \mathcal{H}$, where
      $\mu_i^\star \in \mathcal{H}$: mean element of $P_{X_i}$ (distribution of $X_i$),
      $\forall i$, $\varepsilon_i := Y_i - \mu_i^\star$, with $\mathbb{E}[\varepsilon_i] = 0$, $v_i := \mathbb{E}\left[ \|\varepsilon_i\|_{\mathcal{H}}^2 \right]$.
      Mean element of $P_{X_i}$ ($\mathcal{H}$ separable and $\mathbb{E}[k(X, X)] < +\infty$): $\langle \mu_i^\star, f \rangle_{\mathcal{H}} = \mathbb{E}_{X_i}\left[ \langle \phi(X_i), f \rangle_{\mathcal{H}} \right]$, $\forall f \in \mathcal{H}$.
      With characteristic kernels, $P_{X_i} \neq P_{X_j} \Rightarrow \mu_i^\star \neq \mu_j^\star$.

  14. Estimation rather than identification
      Assumption: $\mu^\star = (\mu_1^\star, \dots, \mu_n^\star)' \in \mathcal{H}^n$ is piecewise constant.
      Fact: with a finite sample, it is impossible to recover change-points in noisy regions.
      Purpose: estimate $\mu^\star$ to recover change-points.
      Performance measure: $\|\mu^\star - \mu\|^2 := \sum_{i=1}^{n} \|\mu_i^\star - \mu_i\|_{\mathcal{H}}^2$.
      [Figure: signal $Y$ and regression function $s$ against position, shown for positions 55 to 100.]

  15. II. Algorithm

  16. Notation
      Segmentation with $D$ segments: $\tau = (\tau_0, \dots, \tau_D)$, with $0 = \tau_0 < \tau_1 < \tau_2 < \dots < \tau_D = n$.
      Quality of a segmentation $\tau$: following Harchaoui and Cappé (2007),
      $\widehat{R}_n(\tau) = \frac{1}{n} \sum_{i=1}^{n} k(X_i, X_i) - \frac{1}{n} \sum_{\ell=1}^{D} \left[ \frac{1}{\tau_\ell - \tau_{\ell-1}} \sum_{i=\tau_{\ell-1}+1}^{\tau_\ell} \sum_{j=\tau_{\ell-1}+1}^{\tau_\ell} k(X_i, X_j) \right]$.
      Rk: with the linear kernel $k(x, x') = \langle x, x' \rangle$ on $\mathcal{X} = \mathbb{R}^d$, $\widehat{R}_n(\tau)$ reduces to the usual least-squares empirical risk.
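
The segmentation risk on this slide (the average of k(X_i, X_i) minus, for each segment, the within-segment Gram sum divided by the segment length, all scaled by 1/n) can be evaluated directly. The sketch below is only illustrative: `empirical_risk` and `linear_kernel` are hypothetical names, `tau` is passed as the list (tau_0, ..., tau_D), and the data are scalars to keep the example short.

```python
from typing import Callable, Sequence


def empirical_risk(xs: Sequence, tau: Sequence[int], kernel: Callable) -> float:
    """Kernel empirical risk of the segmentation tau = (tau_0, ..., tau_D)."""
    n = len(xs)
    if tau[0] != 0 or tau[-1] != n or any(a >= b for a, b in zip(tau, tau[1:])):
        raise ValueError("tau must satisfy 0 = tau_0 < tau_1 < ... < tau_D = n")
    diag = sum(kernel(x, x) for x in xs)
    within = 0.0
    for start, end in zip(tau, tau[1:]):
        segment = xs[start:end]  # X_{start+1}, ..., X_end in 1-based indexing
        gram_sum = sum(kernel(a, b) for a in segment for b in segment)
        within += gram_sum / (end - start)
    return (diag - within) / n


def linear_kernel(x: float, y: float) -> float:
    """Linear kernel on scalars; with it the risk is the least-squares risk."""
    return x * y


xs = [1.0, 1.0, 3.0, 5.0]
risk_two_segments = empirical_risk(xs, [0, 2, 4], linear_kernel)
risk_one_segment = empirical_risk(xs, [0, 4], linear_kernel)
```

With the linear kernel, `risk_two_segments` equals the mean squared deviation of each point from its segment mean (segment means 1 and 4), which matches the remark on the slide.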

  17. KCP Algorithm
      Input: observations $X_1, \dots, X_n \in \mathcal{X}$; kernel $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$.

  18. KCP Algorithm
      Input: observations $X_1, \dots, X_n \in \mathcal{X}$; kernel $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$.
      Step 1: $\forall\, 1 \le D \le D_{\max}$, compute $\widehat{\tau}(D) \in \operatorname{Argmin}_{\tau \in \mathcal{T}_n^D} \left\{ \widehat{R}_n(\tau) \right\}$ → dynamic programming,
      where $\mathcal{T}_n^D = \left\{ (\tau_0, \dots, \tau_D) \in \mathbb{N}^{D+1} \,/\, 0 = \tau_0 < \tau_1 < \tau_2 < \dots < \tau_D = n \right\}$.

  19. KCP Algorithm
      Input: observations $X_1, \dots, X_n \in \mathcal{X}$; kernel $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$.
      Step 1: $\forall\, 1 \le D \le D_{\max}$, compute $\widehat{\tau}(D) \in \operatorname{Argmin}_{\tau \in \mathcal{T}_n^D} \left\{ \widehat{R}_n(\tau) \right\}$ → dynamic programming,
      where $\mathcal{T}_n^D = \left\{ (\tau_0, \dots, \tau_D) \in \mathbb{N}^{D+1} \,/\, 0 = \tau_0 < \tau_1 < \tau_2 < \dots < \tau_D = n \right\}$.
      Step 2: find $\widehat{D} \in \operatorname{Argmin}_{1 \le D \le D_{\max}} \left\{ \widehat{R}_n(\widehat{\tau}(D)) + \operatorname{pen}(\widehat{\tau}(D)) \right\}$ → model selection.
      Output: sequence of change-points $\widehat{\tau} = \widehat{\tau}(\widehat{D})$.
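
Step 2 is a one-dimensional argmin over D once Step 1 has produced the best segmentation for each D. The sketch below shows only that selection logic. The name `select_segmentation` is hypothetical, and the penalty is supplied by the caller because this part of the slides does not specify its form; the linear penalty in the example is a placeholder, not the penalty used by KCP.

```python
from typing import Callable, Dict, List, Tuple


def select_segmentation(
    candidates: Dict[int, Tuple[List[int], float]],
    penalty: Callable[[List[int]], float],
) -> Tuple[int, List[int]]:
    """Pick D minimizing risk(tau_hat(D)) + pen(tau_hat(D)).

    candidates maps D to (tau_hat(D), empirical risk of tau_hat(D)).
    Ties are broken in favor of the smallest D (an arbitrary choice).
    """
    best_d = min(
        sorted(candidates),
        key=lambda d: candidates[d][1] + penalty(candidates[d][0]),
    )
    return best_d, candidates[best_d][0]


# Toy input: risks for D = 1, 2, 3 on a signal with one clear jump.
candidates = {
    1: ([0, 6], 6.25),
    2: ([0, 3, 6], 0.0),
    3: ([0, 1, 3, 6], 0.0),
}


def placeholder_penalty(tau: List[int]) -> float:
    """Hypothetical penalty, linear in the number of segments (illustration only)."""
    return 0.1 * (len(tau) - 1)


d_hat, tau_hat = select_segmentation(candidates, placeholder_penalty)
```

Here the risk stops decreasing after D = 2, so any penalty that grows with D selects the segmentation with a single change-point at position 3.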

  20. Computational complexity (Naive approach)
      Dynamic programming (DP) update rule: $\forall\, 2 \le D \le D_{\max}$, $L_{D,n} = \min_{t \le n-1} \left\{ L_{D-1,t} + C_{t,n} \right\}$, where
      $L_{D-1,t}$: cost of the best segmentation in $D-1$ segments up to time $t$,
      $C_{t,n}$: cost of the segment $[\![ t, n ]\!]$,
      $C_{s,t} = \sum_{i=s+1}^{t} k(X_i, X_i) - \frac{1}{t-s} \sum_{i=s+1}^{t} \sum_{j=s+1}^{t} k(X_i, X_j)$.
      Complexity (Naive approach): time $O(D_{\max} n^4)$ (computation of $\{C_{s,t}\}_{1 \le s,t \le n}$); space $O(n^2)$ (storage of the cost matrix).
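
A direct transcription of the naive approach can make the cost of this slide concrete: every segment cost is recomputed from scratch and the whole cost matrix is stored before the DP recursion runs. This is a sketch under assumptions: `segment_cost` and `naive_kcp_dp` are hypothetical names, the back-pointer table used to recover the change-points is an implementation detail not shown on the slide, and the returned costs are n times the empirical risk because the segment costs carry no 1/n factor.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def segment_cost(xs: Sequence, s: int, t: int, kernel: Callable) -> float:
    """C_{s,t}: cost of the segment X_{s+1}, ..., X_t (1-based), 0 <= s < t."""
    segment = xs[s:t]
    diag = sum(kernel(x, x) for x in segment)
    gram = sum(kernel(a, b) for a in segment for b in segment)
    return diag - gram / (t - s)


def naive_kcp_dp(xs: Sequence, kernel: Callable, d_max: int) -> Dict[int, Tuple[List[int], float]]:
    """Best segmentation and its cost for each D = 1, ..., d_max (naive version)."""
    if d_max < 1:
        raise ValueError("d_max must be at least 1")
    n = len(xs)
    # Full cost matrix: O(n^2) memory, each entry computed from scratch.
    cost = [[0.0] * (n + 1) for _ in range(n + 1)]
    for s in range(n):
        for t in range(s + 1, n + 1):
            cost[s][t] = segment_cost(xs, s, t, kernel)
    # best[d][t] plays the role of L_{d,t}; arg[d][t] stores the minimizing s.
    best = [[math.inf] * (n + 1) for _ in range(d_max + 1)]
    arg = [[0] * (n + 1) for _ in range(d_max + 1)]
    for t in range(1, n + 1):
        best[1][t] = cost[0][t]
    for d in range(2, d_max + 1):
        for t in range(d, n + 1):
            for s in range(d - 1, t):
                candidate = best[d - 1][s] + cost[s][t]
                if candidate < best[d][t]:
                    best[d][t], arg[d][t] = candidate, s
    # Backtrack from time n to recover tau_hat(D) for each D.
    result = {}
    for d in range(1, min(d_max, n) + 1):
        tau, t = [n], n
        for level in range(d, 1, -1):
            t = arg[level][t]
            tau.append(t)
        tau.append(0)
        result[d] = (tau[::-1], best[d][n])
    return result
```

For example, calling `naive_kcp_dp([0.0, 0.0, 0.0, 5.0, 5.0, 5.0], lambda a, b: a * b, 3)` places the single change-point of the two-segment solution at position 3.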

  21. Computational complexity (Improvement)
      Ideas (with G. Rigaill and G. Marot): never store the cost matrix; update each column $C_{\cdot,t+1}$ from $C_{\cdot,t}$.
      Pseudo-code:
        for t = 1 to n − 1 do
          Compute the (t + 1)-th column $C_{\cdot,t+1}$ from $C_{\cdot,t}$
          for D = 2 to min(t, D_max) do
            $L_{D,t+1} = \min_{s \le t} \left\{ L_{D-1,s} + C_{s,t+1} \right\}$
          end for
        end for
      Computational complexity: space $O(D_{\max} n)$ (only store $C_{\cdot,t} \in \mathbb{R}^n$); time $O(D_{\max} n^2)$ (update rule + DP complexity).
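
The slide does not spell out how one column is obtained from the previous one, so the sketch below uses one possible update, stated here as an assumption: for each start s it keeps the running diagonal sum and Gram sum of the segment, and when a new point arrives it adds the cross terms k(X_i, X_new), accumulated backward in s so that the whole column costs O(t) kernel evaluations. It assumes a symmetric kernel. The name `kcp_dp_columnwise` and the back-pointer table are hypothetical, and the loop index is shifted by one relative to the pseudo-code (column t is built at iteration t).

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def kcp_dp_columnwise(xs: Sequence, kernel: Callable, d_max: int) -> Dict[int, Tuple[List[int], float]]:
    """Best segmentation and cost for each D, without storing the cost matrix."""
    if d_max < 1:
        raise ValueError("d_max must be at least 1")
    n = len(xs)
    best = [[math.inf] * (n + 1) for _ in range(d_max + 1)]  # L_{d,t}
    arg = [[0] * (n + 1) for _ in range(d_max + 1)]          # minimizing s
    diag = [0.0] * n  # diag[s] = sum of k(X_i, X_i) over i = s+1..t
    gram = [0.0] * n  # gram[s] = sum of k(X_i, X_j) over i, j = s+1..t
    for t in range(1, n + 1):
        x_new = xs[t - 1]  # X_t in 1-based indexing
        k_new = kernel(x_new, x_new)
        cross = 0.0  # sum of k(X_i, X_t) over i = s+1..t-1, built backward in s
        column = [0.0] * t  # column[s] = C_{s,t}
        for s in range(t - 1, -1, -1):
            if s < t - 1:
                cross += kernel(xs[s], x_new)  # xs[s] is X_{s+1}
            diag[s] += k_new
            gram[s] += 2.0 * cross + k_new
            column[s] = diag[s] - gram[s] / (t - s)
        best[1][t] = column[0]
        for d in range(2, min(t, d_max) + 1):
            for s in range(d - 1, t):
                candidate = best[d - 1][s] + column[s]
                if candidate < best[d][t]:
                    best[d][t], arg[d][t] = candidate, s
    result = {}
    for d in range(1, min(d_max, n) + 1):
        tau, t = [n], n
        for level in range(d, 1, -1):
            t = arg[level][t]
            tau.append(t)
        tau.append(0)
        result[d] = (tau[::-1], best[d][n])
    return result
```

Only the current column, the two running-sum arrays, and the DP tables are kept, which is consistent with the O(D_max n) space bound on the slide; the double loop over d and s inside the loop over t gives the O(D_max n^2) time bound.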

  22. Runtime. Open questions: reduce computation time by low-rank matrix approximation; quantify what has been lost by the approximation.
