Statistical issues at online surveillance (Marianne Frisén)



SLIDE 1

Marianne Frisén DIMACS 03 1

Statistical issues at online surveillance

Marianne Frisén

Statistical Research Unit Göteborg University Sweden

SLIDE 2

Outline

  • I Inferential framework
  • II Demonstration of computer program
  • III Complicated problems - examples
SLIDE 3

Statistical methods to separate important changes from stochastic variation.

[Figure: time series plot]

Enough information for decision?

SLIDE 4

Continual observation of a time series, with the goal of detecting an important change in the underlying process as soon as possible after it has occurred.

  • Monitoring
  • Surveillance
  • Change-point analysis
  • SPC
  • Control charts
  • Early warnings
  • Just in time

SLIDE 5

Monitoring of health

POPULATIONS:

  • control of epidemic diseases
  • surveillance of known risk factors
  • detection of new environmental risks

INDIVIDUALS:

  • natural family planning
  • hormone cycles
  • regular health controls
  • pregnancy
  • intensive care
  • fetal heart rate
  • surveillance after intervention
  • kidney transplant

SLIDE 6

Surveillance

  • Repeated measurements
  • Repeated decisions
  • No fixed hypothesis
  • Time important
SLIDE 7

Sources of knowledge

  • Quality control
  • Stopping rules in probability theory
  • Medicine
  • Inference

SLIDE 8

Change in distribution

The first $(\tau - 1)$ observations, $x_{\tau-1} = \{x(1), \ldots, x(\tau-1)\}$, have density $f_D$. The following observations have density $f_C$.

[Figure: time series with the change point $\tau$ and the alarm time $t_A$ marked]

SLIDE 9

Timely detection of a change in a process from state D to state C

SLIDE 10

Evaluations

  • Quick detection
  • Few false alarms
  • Frisén, M. (1992). Evaluations of methods for statistical surveillance. Statistics in Medicine, 11, 1489-1502.
SLIDE 11

False alarms

  • The Average Run Length at no change, $\mathrm{ARL0} = E(t_A \mid D)$
  • The false alarm probability, $P(t_A < \tau)$

SLIDE 12

Motivated alarms

  • ARL1: the Average Run Length until detection of a change (that occurred at the same time as the inspection started), $E(t_A \mid \tau = 1)$.
  • $ED(t) = E[\max(0, t_A - t) \mid \tau = t]$
  • $\mathrm{ARL1} = ED(1)$
  • $CED(t) = E[t_A - t \mid \tau = t,\ t_A \ge t]$
  • $ED = E_\tau[ED(\tau)]$
  • Probability of Successful Detection
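
The run-length and delay measures above can be estimated by simulation. The following is a minimal sketch, not part of the original slides. It assumes N(0,1) observations before the change and N(mu,1) from the change point on, and a simple rule that alarms as soon as one observation exceeds `limit`. The function names and default values are hypothetical.

```python
import random
from statistics import mean


def alarm_time(tau, mu, limit, rng, max_steps=100_000):
    """First time s with x(s) > limit (assumed rule, for illustration only)."""
    for s in range(1, max_steps + 1):
        level = mu if s >= tau else 0.0  # state C from time tau on, state D before
        if rng.gauss(level, 1.0) > limit:
            return s
    return max_steps  # truncation guard; practically never reached here


def simulate(tau, mu=1.0, limit=2.0, runs=10_000, seed=1):
    """Simulated alarm times t_A for a fixed change point tau."""
    rng = random.Random(seed)
    return [alarm_time(tau, mu, limit, rng) for _ in range(runs)]


def arl0(**kw):
    """Estimate of E(t_A | D): the change never occurs (tau = infinity)."""
    return mean(simulate(float("inf"), **kw))


def arl1(**kw):
    """Estimate of E(t_A | tau = 1): the change is present from the start."""
    return mean(simulate(1, **kw))


def ced(t, **kw):
    """Estimate of E[t_A - t | tau = t, t_A >= t]; runs with an earlier alarm are dropped."""
    return mean(ta - t for ta in simulate(t, **kw) if ta >= t)
```

For this particular rule the alarm time is geometric, so the estimates can be checked against 1/p, where p is the per-step alarm probability in the relevant state.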
SLIDE 13

Predictive value

The predictive value reflects the trust you should have in an alarm.

$\Pr(\tau \le t \mid t_A = t)$
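
As an illustration, the predictive value can be computed exactly for a simple rule that alarms as soon as a single observation exceeds a limit. This is a sketch under assumptions that are not in the slides: a geometric change point with intensity `nu`, N(0,1) observations before the change and N(mu,1) after it. The names `predictive_value` and `limit` are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def predictive_value(t, nu=0.1, mu=1.0, limit=2.0):
    """Pr(tau <= t | t_A = t) for the rule 'alarm when x(s) > limit'."""
    p0 = 1.0 - norm_cdf(limit)       # per-step alarm probability before the change
    p1 = 1.0 - norm_cdf(limit - mu)  # per-step alarm probability after the change
    # P(tau = k, t_A = t) summed over k <= t: the alarm is motivated.
    motivated = sum(
        nu * (1 - nu) ** (k - 1)     # assumed geometric prior, P(tau = k)
        * (1 - p0) ** (k - 1)        # no alarm at times 1..k-1
        * (1 - p1) ** (t - k) * p1   # no alarm at k..t-1, alarm at t
        for k in range(1, t + 1)
    )
    # P(tau > t, t_A = t): the alarm is false.
    false = (1 - nu) ** t * (1 - p0) ** (t - 1) * p0
    return motivated / (motivated + false)
```

Under these assumptions the predictive value of this rule increases with t, so an early alarm deserves less trust than a late one.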

SLIDE 14

Optimality

  • ARL-optimality
  • ED-optimality
  • Minimax-optimality
  • Frisén, M. and de Maré, J. (1991). Optimal surveillance. Biometrika, 78, 271-80.
  • Frisén, M. (in press). Statistical surveillance. Optimality and methods. International Statistical Review.
  • Frisén, M. and Sonesson, C. (2003). Optimal surveillance by exponentially moving average methods. Submitted.

SLIDE 15

ARL Optimality

  • Minimal ARL1 for fixed ARL0
  • Observe that τ=1
  • Consequences demonstrated in

– Frisén, M. (in press). Statistical surveillance. Optimality and methods. International Statistical Review.
– Frisén, M. and Sonesson, C. (2003). Optimal surveillance by exponentially moving average methods. Submitted.

  • Use only with care!
SLIDE 16

Utility

  • The loss of a false alarm is a function of the time between the alarm and the change point.
  • The gain of an alarm is a linear function of the same difference.

$$
u(t_A, \tau) =
\begin{cases}
h(t_A - \tau), & t_A < \tau \\
a_1 \cdot (t_A - \tau) + a_2, & t_A \ge \tau
\end{cases}
$$

Shiryaev, A. N. (1963), "On Optimum Methods in Quickest Detection Problems," Theory of Probability and its Applications, 8, 22-46
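
The two branches of the utility can be written out directly. This is only an illustrative sketch: the slides leave the loss function h and the constants a1 and a2 unspecified, so they are passed in as arguments, and the function name `utility` is hypothetical.

```python
def utility(t_alarm, tau, h, a1, a2):
    """u(t_A, tau): loss h for a false alarm, linear gain for a motivated alarm."""
    diff = t_alarm - tau
    if t_alarm < tau:
        return h(diff)       # false alarm: alarm before the change point
    return a1 * diff + a2    # motivated alarm: linear in the delay
```

For example, with a constant false-alarm loss `h = lambda d: -1.0`, `a1 = -0.5` and `a2 = 2.0`, an alarm two steps after the change gives utility 1.0.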

SLIDE 17

ED Optimality

Minimal expected delay, $ED$, for a fixed false alarm probability, $P[t_A < \tau]$

Maximizes the utility by Shiryaev

SLIDE 18

Minimax Optimality

  • Minimal expected delay for the worst value of τ and for the worst history of observations before τ

– Pollak, M. (1985), "Optimal Detection of a Change in Distribution," The Annals of Statistics, 13, 206-227.
– Lai, T. L. (1995), "Sequential Changepoint Detection in Quality Control and Dynamical Systems," Journal of the Royal Statistical Society Ser. B, 57, 613-658.

SLIDE 19

Methods

  • LR

– Shiryaev-Roberts

  • Shewhart
  • EWMA

– Moving average

  • CUSUM
SLIDE 20

Partial likelihood ratio

– Detection of $\tau = t$
– $C = \{\tau = t\}$, $D = \{\tau > s\}$
– $L(s, t) = f_{X_s}(x_s \mid \tau = t) / f_{X_s}(x_s \mid \tau > s)$
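
For a concrete case, the partial likelihood ratio has a closed form when the change is a shift in the mean of independent normal observations. The sketch below assumes N(0,1) before the change and N(mu,1) from time t on; this model and the name `partial_lr` are assumptions for illustration, not taken from the slides.

```python
import math


def partial_lr(x, t, mu):
    """L(s, t) with s = len(x), for an assumed N(0,1) -> N(mu,1) shift at time t."""
    s = len(x)
    if not 1 <= t <= s:
        raise ValueError("t must satisfy 1 <= t <= s")
    # Observations before t have the same density under C and D and cancel.
    # Each later observation contributes the factor exp(mu * x(u) - mu^2 / 2).
    return math.exp(sum(mu * xu - mu**2 / 2 for xu in x[t - 1:]))
```

Here `x[0]` holds x(1), so the slice `x[t - 1:]` covers the observations at times t, ..., s.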

SLIDE 21

LR

  • Full likelihood ratio

– $LR(s) = f_{X_s}(x_s \mid C) / f_{X_s}(x_s \mid D)$
– $C = \{\tau \le s\}$, $D = \{\tau > s\}$
– LR(s)=
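
With C = {τ ≤ s}, the law of total probability gives f(x_s | C) as a mixture of the densities f(x_s | τ = t) with weights P(τ = t) / P(τ ≤ s). The full likelihood ratio can therefore be computed as a weighted sum of partial likelihood ratios. The sketch below illustrates this under two assumptions that are not in the slides: a geometric prior with intensity `nu` and an N(0,1) to N(mu,1) shift. The function names are hypothetical.

```python
import math


def partial_lr(x, t, mu):
    """L(s, t) for an assumed N(0,1) -> N(mu,1) shift at time t, s = len(x)."""
    return math.exp(sum(mu * xu - mu**2 / 2 for xu in x[t - 1:]))


def full_lr(x, mu, nu):
    """LR(s) as the sum over t = 1..s of w(s, t) * L(s, t)."""
    s = len(x)
    prior = [nu * (1 - nu) ** (t - 1) for t in range(1, s + 1)]  # P(tau = t)
    p_change = sum(prior)                                        # P(tau <= s)
    return sum(
        (p / p_change) * partial_lr(x, t, mu)  # weight w(s, t) = P(tau = t) / P(tau <= s)
        for t, p in enumerate(prior, start=1)
    )
```

At s = 1 there is a single term with weight one, so LR(1) coincides with L(1, 1).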

SLIDE 22

LR

  • Fulfills several optimality criteria, e.g.
  • Maximum expected utility
  • Frisén, M. and de Maré, J. (1991). Optimal surveillance. Biometrika, 78, 271-80.

SLIDE 23

LR

  • Alarm rule equivalent to a rule with a constant limit for the posterior probability

– if there are only two states, C and D.

Frisén, M. and de Maré, J. (1991). Optimal surveillance. Biometrika, 78, 271-80.

  • "The Bayes method"
  • Frequentist inference possible
  • Comparison: Hidden Markov Modeling and LR

– Andersson, E., Bock, D. and Frisén, M. (2002). Statistical surveillance of cyclical processes with application to turns in business cycles. Submitted.
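
The link between the LR alarm rule and a constant limit on the posterior probability can be made explicit with Bayes' theorem. The derivation below is a sketch using the slide's definitions C = {τ ≤ s} and D = {τ > s}. The symbols PP(s), o(s) and K (a constant with 0 < K < 1) are notation introduced here for illustration.

```latex
\begin{align*}
PP(s) &= P(\tau \le s \mid x_s)
       = \frac{P(\tau \le s)\, f_{X_s}(x_s \mid C)}
              {P(\tau \le s)\, f_{X_s}(x_s \mid C) + P(\tau > s)\, f_{X_s}(x_s \mid D)}
       = \frac{o(s)\, LR(s)}{1 + o(s)\, LR(s)},
\qquad o(s) = \frac{P(\tau \le s)}{P(\tau > s)}, \\[4pt]
PP(s) > K &\iff LR(s) > \frac{K}{1 - K} \cdot \frac{P(\tau > s)}{P(\tau \le s)} .
\end{align*}
```

A constant limit K on the posterior probability thus corresponds to a limit on LR(s) that changes with s through the prior odds.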

SLIDE 24

Shiryaev-Roberts

  • The LR method with a non-informative prior.
  • The limit of the LR method when the intensity ν tends to zero.
  • Can often be used as an approximation of LR, even for rather large values of ν

Frisén, M., and Wessman, P. (1999), "Evaluations of Likelihood Ratio Methods for Surveillance. Differences and Robustness," Communications in Statistics. Simulations and Computations, 28, 597-622.
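
The Shiryaev-Roberts statistic is usually written as the unweighted sum of the partial likelihood ratios L(s, t) over t = 1, ..., s, which can be updated recursively. The sketch below assumes an N(0,1) to N(mu,1) shift. The alarm limit `limit` and the function name are hypothetical.

```python
import math


def shiryaev_roberts_alarm(x, mu, limit):
    """First s at which the sum of L(s, t) over t = 1..s exceeds limit, else None."""
    r = 0.0
    for s, xs in enumerate(x, start=1):
        # Recursion R(s) = (1 + R(s-1)) * exp(mu * x(s) - mu^2 / 2), with R(0) = 0.
        r = (1.0 + r) * math.exp(mu * xs - mu**2 / 2)
        if r > limit:
            return s
    return None
```

Expanding the recursion gives R(2) = L(2, 2) + L(2, 1), and so on, so no history of observations needs to be stored.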

SLIDE 25

Shewhart

  • Alarm statistic: $X(s) = L(s, s)$
  • Alarm limit: constant (often $3\sigma$)
  • Alarm rule: $t_A = \min\{s : X(s) > 3\sigma\}$

[Figure: time series plot]
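
A minimal sketch of the Shewhart alarm rule, which looks only at the latest observation. It assumes a one-sided rule and measures deviations from a target level of zero; the parameter names `sigma`, `k` and `target` are hypothetical.

```python
def shewhart_alarm(x, sigma=1.0, k=3.0, target=0.0):
    """First s with x(s) - target > k * sigma, else None."""
    for s, xs in enumerate(x, start=1):
        if xs - target > k * sigma:  # only the current observation is used
            return s
    return None
```

For instance, `shewhart_alarm([0.1, 2.9, 3.2, 5.0])` signals at the third observation, the first to exceed 3σ.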

SLIDE 26

EWMA

  • Alarm statistic
  • Approximates LR if $\lambda = 1 - \exp(-\mu^2/2)/(1 - \nu)$

– Frisén, M. (in press). Statistical surveillance. Optimality and methods. International Statistical Review.
– Frisén, M. and Sonesson, C. (2003). Optimal surveillance by exponentially moving average methods. Submitted.
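
The weight λ from the slide can be plugged into an exponentially weighted moving average. The slide's own alarm statistic did not survive extraction, so the sketch below assumes the common recursion Z(s) = (1 - λ) Z(s-1) + λ x(s) with starting value Z(0) = 0. The function names are hypothetical.

```python
import math


def ewma_lambda(mu, nu):
    """Weight from the slide: lambda = 1 - exp(-mu^2 / 2) / (1 - nu)."""
    return 1.0 - math.exp(-mu**2 / 2) / (1.0 - nu)


def ewma(x, lam, z0=0.0):
    """EWMA path Z(1..s) under the assumed recursion Z(s) = (1 - lam) * Z(s-1) + lam * x(s)."""
    z, path = z0, []
    for xs in x:
        z = (1.0 - lam) * z + lam * xs
        path.append(z)
    return path
```

With μ = 1 and ν = 0.1 the formula gives λ ≈ 0.326. Note that λ is only positive when exp(-μ²/2) < 1 - ν.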

SLIDE 27

CUSUM

  • Alarm rule

– $\max\{L(s, t);\ t = 1, 2, \ldots, s\} > G$

  • Minimax optimality
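
The CUSUM alarm rule above can be evaluated without recomputing every L(s, t), because the maximum of the log likelihood ratios obeys a simple recursion. The sketch below assumes an N(0,1) to N(mu,1) shift; the function name and the argument `limit` (the G of the slide) are illustrative.

```python
import math


def cusum_alarm(x, mu, limit):
    """First s with max over t = 1..s of L(s, t) > limit, else None."""
    log_limit = math.log(limit)
    w = 0.0  # max over t of log L(s, t); the initial value is floored away at s = 1
    for s, xs in enumerate(x, start=1):
        # Either continue the best earlier candidate change point (if its
        # log-ratio is positive) or restart at t = s, then add the new increment.
        w = max(w, 0.0) + mu * xs - mu**2 / 2
        if w > log_limit:
            return s
    return None
```

Comparing on the log scale avoids overflow for long series while leaving the alarm times unchanged.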
SLIDE 28

Alarm limits at the second observation

[Figure: alarm limits in the (x(1), x(2)) plane for the LR, EWMA, Shewhart and CUSUM methods]

SLIDE 29

Parameters for optimizing

The Shewhart method has no parameters. The CUSUM and the Shiryaev-Roberts methods have one parameter, M, to optimize for the size of the shift μ. The LR method has, besides M, also the parameter V to optimize for the intensity ν.

SLIDE 30

Similarity

The LR, Shiryaev-Roberts and CUSUM methods tend to the Shewhart method when the parameter M tends to infinity.

This explains some earlier claims of similarities between some methods. These studies were made for very large values of M.

SLIDE 31

Predictive value

Shewhart: many early alarms. These alarms are often false.

The LR and the Shiryaev-Roberts methods have relatively constant predictive values. A constant predictive value makes the same kind of action appropriate for both early and late alarms.