Restarted Bayesian Online Change-point Detector achieves Optimal Detection Delay


SLIDE 1

Restarted Bayesian Online Change-point Detector achieves Optimal Detection Delay

Reda ALAMI

Joint work with Odalric Maillard and Raphaël Féraud.

reda.alami@total.com Presented at ICML 2020

SLIDES 2–9

Overview

◮ A pruning version of the Bayesian Online Change-point Detector.
◮ High-probability guarantees in terms of:
  ◮ False alarm rate.
  ◮ Detection delay.
◮ The detection delay is asymptotically optimal (reaching the existing lower bound [Lai and Xing, 2010]).
◮ Empirical comparisons with the original BOCPD [Fearnhead and Liu, 2007] and the Improved Generalized Likelihood Ratio test [Maillard, 2019].

2/14

SLIDES 10–15

Setting & Notations

◮ B(µ_t): Bernoulli distribution of mean µ_t ∈ [0, 1].
◮ Piece-wise stationary process: ∀c ∈ [1, C], ∀t ∈ T_c = [τ_c, τ_{c+1}), µ_t = θ_c.
◮ Sequence of observations: x_{s:t} = (x_s, ..., x_t).
◮ Length: n_{s:t} = t − s + 1.

3/14
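The piece-wise stationary Bernoulli setting can be sketched in code. This is a minimal generator, not from the paper; the change-points below borrow the τ values of the later Benchmark 2, while the segment means θ_c are hypothetical:

```python
import numpy as np

def piecewise_bernoulli(taus, thetas, T, seed=None):
    """Sample x_1..x_T with x_t ~ B(mu_t), where mu_t = theta_c on the
    segment T_c = [tau_c, tau_{c+1}) (1-indexed, with tau_{C+1} = T + 1)."""
    rng = np.random.default_rng(seed)
    mu = np.empty(T)
    bounds = list(taus) + [T + 1]
    for theta, lo, hi in zip(thetas, bounds[:-1], bounds[1:]):
        mu[lo - 1:hi - 1] = theta          # fill segment [tau_c, tau_{c+1})
    return rng.binomial(1, mu), mu

# Change-points as in the later Benchmark 2 (segment means are hypothetical):
x, mu = piecewise_bernoulli([1, 301, 701, 1051], [0.2, 0.8, 0.4, 0.7],
                            T=1500, seed=0)
```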

SLIDES 16–20

Bayesian Online Change-point Detector

Runlength inference

Runlength r_t: number of time steps since the last change-point. For all r_t ∈ [0, t − 1]:

p(r_t | x_{1:t}) ∝ Σ_{r_{t−1} ∈ [0, t−2]} p(r_t | r_{t−1}) p(x_t | r_{t−1}, x_{1:t−1}) p(r_{t−1} | x_{1:t−1}),

where p(r_t | x_{1:t}) is the runlength distribution at time t, p(r_t | r_{t−1}) the hazard, p(x_t | r_{t−1}, x_{1:t−1}) the underlying predictive model (UPM), and p(r_{t−1} | x_{1:t−1}) the previous posterior.

Constant hazard rate assumption (h ∈ (0, 1), i.e. geometric inter-arrival times of change-points):

p(r_t = r_{t−1} + 1 | x_{1:t}) ∝ (1 − h) p(x_t | r_{t−1}, x_{1:t−1}) p(r_{t−1} | x_{1:t−1}),
p(r_t = 0 | x_{1:t}) ∝ h Σ_{r_{t−1}} p(x_t | r_{t−1}, x_{1:t−1}) p(r_{t−1} | x_{1:t−1}).

p(x_t | r_{t−1}, x_{1:t−1}) is computed via the Laplace predictor (an add-one smoothed estimate):

Lp(x_{t+1} = 1 | x_{s:t}) := (Σ_{i=s}^{t} x_i + 1) / (n_{s:t} + 2),
Lp(x_{t+1} = 0 | x_{s:t}) := (Σ_{i=s}^{t} (1 − x_i) + 1) / (n_{s:t} + 2).

4/14
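The Laplace predictor above is plain add-one smoothing on the current segment. A minimal sketch (function names are mine):

```python
import math

def laplace_predictor(segment, x_next):
    """Lp(x_{t+1} | x_{s:t}): add-one (Laplace) smoothed Bernoulli prediction.
    (sum(segment) + 1)/(n + 2) for x_next = 1, and the complement for 0."""
    n = len(segment)                       # n_{s:t}
    p_one = (sum(segment) + 1) / (n + 2)
    return p_one if x_next == 1 else 1.0 - p_one

def inst_loss(segment, x_next):
    """Instantaneous loss l_{s,t} = -log Lp(x_t | x_{s:t-1})."""
    return -math.log(laplace_predictor(segment, x_next))

# An empty segment predicts 1/2 for either symbol:
laplace_predictor([], 1)         # 0.5
laplace_predictor([1, 1, 0], 1)  # (2 + 1)/(3 + 2) = 0.6
```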

SLIDES 21–25

Bayesian Online Change-point Detector

Forecaster Learning

Instead of the runlength r_t ∈ [0, t − 1], use the forecaster notion. Forecaster weight: ∀s ∈ [1, t], v_{s,t} := p(r_t = t − s | x_{s:t}).

Recursive form:

v_{s,t} = (1 − h) exp(−l_{s,t}) v_{s,t−1}               ∀s < t,
v_{t,t} = h Σ_{i=1}^{t−1} exp(−l_{i,t}) v_{i,t−1}        s = t.

Closed form:

v_{s,t} = (1 − h)^{n_{s:t}} h^{I{s≠1}} exp(−L_{s:t}) V_s   ∀s < t,
v_{t,t} = h V_t                                             s = t.

Instantaneous loss: l_{s,t} := −log Lp(x_t | x_{s:t−1}).

◮ L_{s:t} := Σ_{s′=s}^{t} l_{s,s′}: cumulative loss, and V_t = Σ_{s=1}^{t} v_{s,t}.

5/14
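The recursive form of the weights is a standard message-passing update: each surviving forecaster is discounted by (1 − h) and its likelihood, while a new forecaster is born with the mixed mass. A minimal sketch of one step (dict-based, unnormalized; the helper name is mine):

```python
import math

def forecaster_step(v_prev, losses, h):
    """One BOCPD step in forecaster form, per the slide's recursion:
      v_{s,t} = (1 - h) exp(-l_{s,t}) v_{s,t-1}           for s < t
      v_{t,t} = h * sum_i exp(-l_{i,t}) v_{i,t-1}
    v_prev: {s: v_{s,t-1}},  losses: {s: l_{s,t}}.  Returns {s: v_{s,t}}."""
    if not v_prev:                         # t = 1: a single forecaster
        return {1: 1.0}
    t = max(v_prev) + 1
    mix = sum(math.exp(-losses[s]) * w for s, w in v_prev.items())
    v = {s: (1 - h) * math.exp(-losses[s]) * w for s, w in v_prev.items()}
    v[t] = h * mix                         # newborn forecaster s = t
    return v
```

Normalizing the returned weights recovers the runlength posterior p(r_t | x_{1:t}); note the per-step cost grows with t, which is what motivates pruning.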

SLIDES 26–31

Main difficulty in providing the theoretical guarantees

Lemma (Computing the initial weight V_t)

V_t = (1 − h)^{t−2} Σ_{k=1}^{t−1} (h / (1 − h))^{k−1} Ṽ_{k:t},

where:

Ṽ_{k:t} = Σ_{i_1=1}^{t−k} Σ_{i_2=i_1+1}^{t−(k−1)} ... Σ_{i_{k−1}=i_{k−2}+1}^{t−2} exp(−L_{1:i_1}) × Π_{j=1}^{k−2} exp(−L_{i_j+1:i_{j+1}}) × exp(−L_{i_{k−1}+1:t−1}),

with:

Σ_{i_1=1}^{t−k} Σ_{i_2=i_1+1}^{t−(k−1)} ... Σ_{i_{k−1}=i_{k−2}+1}^{t−2} 1 = (t−2 choose k−1)

and L_{s:t} := Σ_{s′=s}^{t} l_{s,s′}.

Combinatorial number of cumulative losses: very difficult to apply classical concentration results.

6/14

SLIDES 32–39

From BOCPD to Restarted-BOCPD

Some modifications of BOCPD to tackle the theoretical difficulty:

◮ Restart time r ≥ 0 (updated each time a change-point is raised).
◮ Initial weight function: V_{r:t−1} := exp(−L_{r:t−1}) instead of V_t.
◮ Hyper-parameter η_{r,s,t} instead of the hazard rate h ∈ (0, 1).
◮ Restart criterion: Restart_{r:t} = I{∃s ∈ (r, t] : ϑ_{r,s,t} > ϑ_{r,r,t}}.

R-BOCPD update rule

For some starting time r:

ϑ_{r,s,t} ← (η_{r,s,t} / η_{r,s,t−1}) exp(−l_{s,t}) ϑ_{r,s,t−1}   ∀s < t,
ϑ_{r,t,t} ← η_{r,t,t} × V_{r:t−1}                                  s = t.

Recall the BOCPD update rule:

v_{s,t} ← (1 − h) exp(−l_{s,t}) v_{s,t−1}   ∀s < t,
v_{t,t} ← h × V_t                            s = t.

7/14
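Unwinding the update rule, the restart criterion compares each split forecaster against the no-change forecaster started at r. Dropping the telescoping η factors (a simplifying assumption on my part), this can be read as a penalized comparison of Laplace code lengths: restart when some split of x_{r:t} compresses the data better than the single-segment model by more than log(1/η_{r,s,t}) ≈ log(t − r + 1) in the α ≈ 1 regime. The sketch below implements that reading, not the paper's exact procedure:

```python
import math

def laplace_codelen(ones, n):
    """Cumulative Laplace code length L of a Bernoulli segment:
    -log prod_t Lp(x_t | past) = log (n+1)! - log ones! - log (n-ones)!"""
    return math.lgamma(n + 2) - math.lgamma(ones + 1) - math.lgamma(n - ones + 1)

def restart_detector(xs, penalty=lambda r, t: math.log(t - r + 1)):
    """Restart when a split (r, s-1] / [s, t] beats the joint code length of
    x_{r:t} by more than the penalty (stand-in for log(1/eta))."""
    restarts, r, prefix = [], 1, [0]       # prefix[k] = ones in x_{r:r+k-1}
    for t, x in enumerate(xs, start=1):
        prefix.append(prefix[-1] + x)
        n = t - r + 1                      # n_{r:t}
        joint = laplace_codelen(prefix[-1], n)
        for s in range(r + 1, t + 1):      # candidate change-points in (r, t]
            k = s - r                      # length of x_{r:s-1}
            left = laplace_codelen(prefix[k], k)
            right = laplace_codelen(prefix[-1] - prefix[k], n - k)
            if joint - (left + right) > penalty(r, t):
                restarts.append(t)         # change-point raised at time t
                r, prefix = t + 1, [0]     # restart from scratch
                break
    return restarts
```

The scan over s makes this O(T²) overall; the pruning discussed in the paper is what keeps the real detector practical.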


SLIDE 41

Analysis of R-BOCPD

False alarm control

Theorem (False alarm rate control)

Assume that (x_r, ..., x_t) ∼ B(θ) and let α > 1. If, for all t ∈ [r, τ) and s ∈ (r, t],

η_{r,s,t} < (√(n_{r:s−1} × n_{s:t}) / (10 n_{r:t+1})) × log(4α + 2) δ² / (4 n_{r:t} log((α + 3) n_{r:t}))^α,

then, for all δ ∈ (0, 1), with probability higher than 1 − δ no false alarm occurs on the interval [r, τ):

P_θ(∃ t ∈ [r, τ) : Restart_{r:t} = 1) ≤ δ.

For α ≈ 1, this gives η_{r,s,t} = O(1/(t − r + 1)).

8/14
SLIDE 42

Analysis of R-BOCPD

Detection delay control

Theorem (Detection delay control)

Let (x_r, ..., x_{τ−1}) ∼ B(θ_1), (x_τ, ..., x_t) ∼ B(θ_2) and ∆ = |θ_1 − θ_2|: the change-point gap. Let:

f_{r,s,t} = log n_{r:s} + log n_{s:t+1} − (1/2) log n_{r:t} + 9/8.

If η_{r,s,t} > exp(−2 n_{r:s−1} (∆ − C_{r,s,t,δ})² + f_{r,s,t}), then the change-point τ is detected (with probability at least 1 − δ) with a delay not exceeding D_{∆,r,τ}, such that:

D_{∆,r,τ} = min{ d ∈ ℕ* : d > ((1 − C_{r,τ,d+τ−1,δ}/∆)^{−2} / (2∆²)) × (−log η_{r,τ,d+τ−1} + f_{r,τ,d+τ−1}) × (1 + (log η_{r,τ,d+τ−1} − f_{r,τ,d+τ−1}) / (2 n_{r:τ−1} (∆ − C_{r,τ,d+τ−1,δ})²)) },

with:

C_{r,s,t,δ} = (√2/2) [ √( ((1 + 1/n_{r:s−1}) / n_{r:s−1}) log(2 √(n_{r:s}) / δ) ) + √( ((1 + 1/n_{s:t}) / n_{s:t}) log(2 n_{r:t} √(n_{s:t+1}) log²(n_{r:t}) / (log(2) δ)) ) ].

In orders of magnitude: η_{r,s,t} = Ω(exp(−n_{r:t})) and C_{r,s,t,δ} = O(√(log(n_{r:s}/δ)/n_{r:s−1}) + √(log(n_{s:t+1}/δ)/n_{s:t})).

9/14
SLIDES 43–47

Analysis of R-BOCPD

Asymptotic analysis of the detection delay

Asymptotic Analysis

If η_{r,s,t} = 1/(t − r + 1), then in the asymptotic regime:

D_{|θ_2−θ_1|, r, τ}  →  log(1/δ) / (2 |θ_2 − θ_1|²)   as τ → ∞,

which is O(log(1/δ) / KL(θ_2, θ_1)), matching the existing lower bound [Lai and Xing, 2010].

10/14
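As a quick numeric reading of the limit (the δ and gap values are illustrative, not from the deck): with δ = 0.05 and a gap of 0.1, the asymptotic delay is log(20)/(2 × 0.01) ≈ 150 steps.

```python
import math

def asymptotic_delay(delta, gap):
    """Asymptotic detection delay log(1/delta) / (2 * gap^2) from the slide."""
    return math.log(1.0 / delta) / (2.0 * gap ** 2)

d = asymptotic_delay(0.05, 0.1)   # ~ 149.8 steps
```

Note that Pinsker's inequality gives KL(θ_2, θ_1) ≥ 2(θ_2 − θ_1)², which is why the 2|θ_2 − θ_1|² limit is consistent with the O(log(1/δ)/KL) form of the lower bound.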

SLIDES 48–54

Empirical comparisons

Comparison with the original BOCPD: Benchmark 1

Benchmark 1: Highlighting the use of the function V_{r:t−1} instead of V_t

◮ Generate 2500 trajectories (sequences) of length T = 5000.
◮ Vary the number of observations before the change in [10, 1000].
◮ Vary the change-point gap ∆ in [0.01, 1].
◮ Plot the differences in detection delay between R-BOCPD and BOCPD.

11/14
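The benchmark protocol can be framed as a small harness. This is a sketch of my own; `detector` is a placeholder for R-BOCPD or BOCPD (anything returning a first alarm time), and the pre-change mean θ_1 = 0.25 is an illustrative choice:

```python
import numpy as np

def one_trial(detector, pre_change, gap, T=5000, theta1=0.25, seed=None):
    """One Benchmark-1 style trial: `pre_change` samples at theta1, then a
    change of size `gap` at tau = pre_change + 1. Returns alarm_time - tau,
    or None on a miss or a false alarm."""
    rng = np.random.default_rng(seed)
    theta2 = min(1.0, theta1 + gap)
    x = np.concatenate([rng.binomial(1, theta1, pre_change),
                        rng.binomial(1, theta2, T - pre_change)])
    alarm = detector(x)                    # first alarm time (1-indexed) or None
    tau = pre_change + 1
    if alarm is None or alarm < tau:
        return None                        # missed change or false alarm
    return alarm - tau
```

Averaging `one_trial` over 2500 seeds while sweeping `pre_change` over [10, 1000] and `gap` over [0.01, 1], once for each detector, reproduces the delay-difference plots described on the slide.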

SLIDES 55–60

Empirical comparisons

Comparison with the original BOCPD: Benchmark 2

Benchmark 2: Highlighting the use of the restart procedure Restart_{r:t}

◮ Piece-wise stationary Bernoulli process with τ_1 = 1, τ_2 = 301, τ_3 = 701, τ_4 = 1051.
◮ Run R-BOCPD and BOCPD.
◮ Plot the change-point estimate τ_t for both R-BOCPD and BOCPD.

12/14

SLIDES 61–63

Empirical comparisons

Comparison with the Improved GLR [Maillard, 2019]

Improved GLR final formulation:

IMPGLR_δ(y_1, ..., y_t) = I{ ∃s ∈ [1, t) : | (1/s) Σ_{i=1}^{s} y_i − (1/(t−s)) Σ_{i=s+1}^{t} y_i | ≥ C_{δ,s,t} },

where:

C_{δ,s,t} = (√2/2) [ √( (1/s + 1/s²) log(2 √(s+1) / δ) ) + √( (1/(t−s) + 1/(t−s)²) log(2t √(t−s+1) log²(t) / (log(2) δ)) ) ].

13/14
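The test above is directly implementable with prefix sums. A sketch, with the threshold following the slide's formula as I have reconstructed it (so treat the constants as approximate):

```python
import math

def C_glr(delta, s, t):
    """Threshold C_{delta,s,t} from [Maillard, 2019], as read off the slide."""
    a = (1.0 / s + 1.0 / s**2) * math.log(2.0 * math.sqrt(s + 1) / delta)
    b = (1.0 / (t - s) + 1.0 / (t - s)**2) * math.log(
        2.0 * t * math.sqrt(t - s + 1) * math.log(t)**2 / (math.log(2.0) * delta))
    return (math.sqrt(2.0) / 2.0) * (math.sqrt(a) + math.sqrt(b))

def imp_glr(ys, delta=0.05):
    """1 iff some split s has |mean(y_1..y_s) - mean(y_{s+1}..y_t)| >= C_{delta,s,t}."""
    t = len(ys)
    prefix = [0.0]
    for y in ys:
        prefix.append(prefix[-1] + y)
    for s in range(1, t):
        gap = abs(prefix[s] / s - (prefix[t] - prefix[s]) / (t - s))
        if gap >= C_glr(delta, s, t):
            return 1
    return 0
```

Like the restart criterion of R-BOCPD, this scans all splits of the window, so a sequential deployment costs O(T²) without further pruning.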
SLIDES 64–69

Empirical comparisons

Comparison with the Improved GLR [Maillard, 2019]

Benchmark:

◮ Generate 2500 trajectories (sequences) of length T = 2500.
◮ Vary the number of observations before the change in [10, 500].
◮ Vary the change-point gap ∆ ∈ [0.01, 1].
◮ Plot the differences in detection delay between R-BOCPD and the Improved GLR.

14/14

SLIDE 70

References

Fearnhead, P. and Liu, Z. (2007). On-line inference for multiple changepoint problems. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(4):589–605.

Lai, T. L. and Xing, H. (2010). Sequential change-point detection when the pre- and post-change parameters are unknown. Sequential Analysis, 29(2):162–175.

Maillard, O.-A. (2019). Sequential change-point detection: Laplace concentration of scan statistics and non-asymptotic delay bounds. In Algorithmic Learning Theory, pages 610–632.

14/14