SLIDE 1

Motivation Point Process Modelling Inference Data Analysis Summary

A Space-Time Conditional Intensity Model for Invasive Meningococcal Disease Occurrence

Sebastian Meyer (1,3), Johannes Elias (4), Michael Höhle (2,3)

(1) Division of Biostatistics, Institute for Social & Preventive Medicine, Univ. of Zürich
(2) Department for Infectious Disease Epidemiology, Robert Koch Institute, Berlin
(3) (previously) Department of Statistics, Ludwig-Maximilians-Universität, München
(4) German Reference Centre for Meningococci, University of Würzburg, Würzburg

QMUL – Institute of Zoology, London, United Kingdom
7 September 2012

SLIDE 2

Outline

1. Motivation
2. Space-Time Point Process Modelling
3. Inference
4. Data Analysis
5. Summary

SLIDE 3

Motivation and Aim

  • Understanding the spread of an infectious disease is a step towards its control
  • There is increasing agreement that such dynamics are stochastic phenomena operating in a heterogeneous population
  • The spatial and temporal resolution of infectious disease data keeps improving

Aim: Establish a regression framework for point-referenced infectious disease surveillance data, in which the transmission dynamics and their dependency on covariates can be quantified within the context of a spatio-temporal stochastic process.

SLIDE 4

Application: Invasive meningococcal disease (IMD)

Description
  • Life-threatening infectious disease triggered by the bacterium Neisseria meningitidis (aka meningococcus)
  • Involves meningitis (50%), septicemia (5–20%), pneumonia (5–15%)
  • Transmission by mucous secretions, also airborne

Epidemiology
  • Yearly incidence (Germany, 2001–2008): 0.5–1 infections per 100 000 inhabitants
  • Mainly affected are infants and adolescents
  • Lethality: 8.4%, for meningococcal sepsis: ≈ 40%

SLIDE 5

Available IMD data

  • Two most common finetypes in Germany in 2002–2008: 336 cases of B:P1.7-2,4:F1-5, 300 cases of C:P1.5,2:F3-3
  • Case variables: date, residence postcode, age, gender

[Figure: number of cases per month, one panel per finetype (B:P1.7-2,4:F1-5 and C:P1.5,2:F3-3); x-axis: Time (month), 2002 to 2009; y-axis: Number of cases]

SLIDE 6

Spatial distribution

[Figure: spatial distribution maps, one panel per finetype (B:P1.7-2,4:F1-5 and C:P1.5,2:F3-3); axes: longitude 6°E to 14°E, latitude 48°N to 54°N; legend scale 500 to 4500]

Scientific question: Do the finetypes spread differently?
My task: Quantify the transmission dynamics.

SLIDE 7

Relationship of IMD and influenza

[Figure: weekly numbers of SurvNet influenza cases; x-axis: Week, y-axis: Number of influenza cases; legend: years 2002 to 2008]

[Figure: weekly numbers of SurvNet IMD cases; x-axis: Week, y-axis: Number of IMD cases; legend: years 2002 to 2008]

Scientific question: Do waves of influenza predispose to IMD accumulations?
Statistical solution: Quantify and test the local effect of (lagged) numbers of influenza cases on occurrences of IMD.

SLIDE 9

Conditional intensity function (CIF)

A regular spatio-temporal point process $N$ on $\mathbb{R}_+ \times \mathbb{R}^2$ can be uniquely characterised by its left-continuous CIF $\lambda^*(t, s)$.

Definition

$$\lambda^*(t, s) = \lim_{\Delta t \to 0,\, |ds| \to 0} \frac{P\big(N([t, t + \Delta t) \times ds) = 1 \,\big|\, \mathcal{H}_{t-}\big)}{\Delta t \, |ds|}$$

  • Instantaneous event rate at $(t, s)$ given all past events
  • Key to modelling, likelihood analysis and simulation of evolutionary ("self-exciting") point processes
  • In application, $N$ is only defined on a subset $(0, T] \times W \subset \mathbb{R}_+ \times \mathbb{R}^2$ (observation period and region)

SLIDE 10

Proposed additive-multiplicative continuous space-time intensity model (twinstim)

$$\lambda^*(t, s) = h(t, s) + e^*(t, s)$$

Inspiration
  • Additive-multiplicative SIR (susceptible-infectious-recovered) compartmental model (Höhle, 2009) for a fixed population
  • Spatio-temporal ETAS (epidemic-type aftershock-sequences) model (Ogata, 1998)

SLIDE 11

Proposed additive-multiplicative continuous space-time intensity model (twinstim)

$$\lambda^*(t, s) = h(t, s) + e^*(t, s)$$

Multiplicative endemic component

$$h(t, s) = \exp\big(o_{\xi(s)} + \beta' z_{\tau(t),\xi(s)}\big)$$

  • Piecewise constant function on a spatio-temporal grid $\{C_1, \dots, C_D\} \times \{A_1, \dots, A_M\}$ with time interval index $\tau(t)$ and region index $\xi(s)$
  • Region-specific offset $o_{\xi(s)}$, e.g., log-population density
  • Endemic linear predictor $\beta' z_{\tau(t),\xi(s)}$ includes discretised time trend and exogenous effects, e.g., influenza cases
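The grid lookup behind the endemic component can be sketched in a few lines. This is an illustrative Python sketch, not the twinstim implementation: the function name `endemic_intensity`, the toy grid, the region rule and all numbers are hypothetical, and time blocks are assumed to be left-open intervals.

```python
import math

def endemic_intensity(t, s, time_breaks, region_of, offset, z, beta):
    """h(t, s) = exp(o_{xi(s)} + beta' z_{tau(t), xi(s)}) on a space-time grid."""
    # tau(t): index of the time block (time_breaks[k], time_breaks[k+1]] containing t
    tau = next(k for k in range(len(time_breaks) - 1)
               if time_breaks[k] < t <= time_breaks[k + 1])
    xi = region_of(s)  # xi(s): label of the region containing location s
    linear_predictor = sum(b * x for b, x in zip(beta, z[(tau, xi)]))
    return math.exp(offset[xi] + linear_predictor)

# Hypothetical toy grid: two weekly time blocks, two regions split at x = 0.
time_breaks = [0.0, 7.0, 14.0]
region_of = lambda s: "west" if s[0] < 0 else "east"
offset = {"west": math.log(100.0), "east": math.log(400.0)}  # log-population density
z = {(0, "west"): [1.0, 0.0], (0, "east"): [1.0, 2.0],
     (1, "west"): [1.0, 1.0], (1, "east"): [1.0, 3.0]}        # intercept, one covariate
beta = [-10.0, 0.1]

h_value = endemic_intensity(3.5, (1.0, 0.0), time_breaks, region_of, offset, z, beta)
```

Because both the offset and the covariates are looked up by block and region, the value is the same for every point of the same grid cell, which is what "piecewise constant" means here.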

SLIDE 12

Proposed additive-multiplicative continuous space-time intensity model (twinstim)

$$\lambda^*(t, s) = h(t, s) + e^*(t, s)$$

Additive epidemic (self-exciting) component

$$e^*(t, s) = \sum_{j \in I^*(t, s; \epsilon, \delta)} e^{\eta_j} \, g_\alpha(t - t_j) \, f_\sigma(s - s_j)$$

  • Individual infectivity weighting through linear predictor $\eta_j = \gamma' m_j$ based on the vector of unpredictable marks
  • Positive parametric interaction functions, e.g., $f_\sigma(s) = \exp\left(-\frac{\|s\|^2}{2\sigma^2}\right)$ and $g_\alpha(t) = e^{-\alpha t}$
  • Set of active infectives $I^*(t, s; \epsilon, \delta)$ depends on fixed maximum temporal and spatial interaction ranges $\epsilon$ and $\delta$
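The epidemic sum can be evaluated directly from a list of past events. The following Python sketch uses the example interaction functions from this slide (Gaussian in space, exponential decay in time). The exact membership rule for the active set (strictly earlier events, at most $\epsilon$ ago and at most $\delta$ away) is an assumption, and the function name and toy data are hypothetical.

```python
import math

def epidemic_intensity(t, s, events, gamma, alpha, sigma, eps, delta):
    """e*(t, s): summed infection pressure of the active past events.

    events: list of (t_j, (x_j, y_j), m_j) with mark vector m_j.
    """
    total = 0.0
    for t_j, s_j, m_j in events:
        lag = t - t_j
        dist = math.hypot(s[0] - s_j[0], s[1] - s_j[1])
        # Assumed active set: t_j < t, t - t_j <= eps and ||s - s_j|| <= delta
        if not (0 < lag <= eps and dist <= delta):
            continue
        eta_j = sum(g * m for g, m in zip(gamma, m_j))      # eta_j = gamma' m_j
        g_val = math.exp(-alpha * lag)                      # g_alpha(t - t_j)
        f_val = math.exp(-dist ** 2 / (2 * sigma ** 2))     # f_sigma(s - s_j)
        total += math.exp(eta_j) * g_val * f_val
    return total

# Hypothetical events: time in days, planar coordinates in km, one mark (an intercept).
events = [(0.0, (0.0, 0.0), [1.0]),
          (5.0, (3.0, 4.0), [1.0]),
          (1.0, (500.0, 0.0), [1.0])]   # too far away to contribute at the origin
e_now = epidemic_intensity(10.0, (0.0, 0.0), events, gamma=[-2.0],
                           alpha=0.1, sigma=10.0, eps=30.0, delta=200.0)
e_later = epidemic_intensity(40.0, (0.0, 0.0), events, gamma=[-2.0],
                             alpha=0.1, sigma=10.0, eps=30.0, delta=200.0)
```

In the toy data, `e_later` is zero because every event is more than `eps` days old by then, which shows how the fixed interaction ranges switch infectives off.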

SLIDE 13

Marked extension with event type

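  • Motivation: joint modelling of both finetypes of IMD
  • Additional dimension $\mathcal{K} = \{1, \dots, K\}$ for event type $\kappa \in \mathcal{K}$

Marked CIF

$$\lambda^*(t, s, \kappa) = \exp\big(\beta_{0,\kappa} + o_{\xi(s)} + \beta' z_{\tau(t),\xi(s)}\big) + \sum_{j \in I^*(t, s, \kappa; \epsilon, \delta)} q_{\kappa_j,\kappa} \, e^{\eta_j} \, g_\alpha(t - t_j \mid \kappa_j) \, f_\sigma(s - s_j \mid \kappa_j)$$

  • Type-specific endemic intercept
  • Type-specific transmission, $q_{k,l} \in \{0, 1\}$, $k, l \in \mathcal{K}$
  • Type-specific infection pressure $\eta_j = \gamma' m_j$, $\kappa_j$ is part of $m_j$
  • Type-specific interaction functions, e.g., variances $\sigma^2_\kappa$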

SLIDE 19

Log-likelihood of the proposed model

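  • Observed spatio-temporal marked point pattern: $\{(t_i, s_i, m_i) : i = 1, \dots, n\}$
  • Covariate information $z_{\tau,\xi}$ on a spatio-temporal grid

$$l(\theta) = \left[\sum_{i=1}^{n} \log \lambda^*_\theta(t_i, s_i, \kappa_i)\right] - \int_0^T \int_W \sum_{\kappa \in \mathcal{K}} \lambda^*_\theta(t, s, \kappa) \, dt \, ds$$

$$\theta = (\beta_0', \beta', \gamma', \sigma', \alpha')'$$

Integration of the epidemic component $e^*_\theta(t, s, \kappa)$ involves

$$\int_0^{\min\{T - t_j;\, \epsilon\}} g_\alpha(t \mid \kappa_j) \, dt \quad \text{and} \quad \int_{[W \cap b(s_j; \delta)] - s_j} f_\sigma(s \mid \kappa_j) \, ds$$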
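As a worked illustration (not taken from the slides), the two integrals have closed forms for the example interaction functions $g_\alpha(t) = e^{-\alpha t}$ and the Gaussian $f_\sigma$, ignoring the type dependence. The spatial result assumes that the disc $b(s_j; \delta)$ lies entirely inside $W$; near the boundary of $W$ the integral has to be evaluated numerically.

```latex
% Temporal part: exponential decay truncated at min{T - t_j, eps}
\int_0^{\min\{T - t_j;\, \epsilon\}} e^{-\alpha t} \, dt
  = \frac{1 - e^{-\alpha \min\{T - t_j;\, \epsilon\}}}{\alpha}

% Spatial part, assuming b(s_j; delta) \subset W, in polar coordinates
\int_{b(0; \delta)} \exp\!\left(-\frac{\|s\|^2}{2\sigma^2}\right) ds
  = \int_0^{2\pi} \int_0^{\delta} e^{-r^2 / (2\sigma^2)} \, r \, dr \, d\varphi
  = 2\pi\sigma^2 \left(1 - e^{-\delta^2 / (2\sigma^2)}\right)
```

For a constant temporal interaction function, $g(t) = 1$, the temporal integral reduces to $\min\{T - t_j;\, \epsilon\}$.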

SLIDE 20

Numerical log-likelihood maximisation

  • For a polygonal region $R$, perform approximation
    $$\int_R f_\sigma(s) \, ds \approx \sum_{j=1}^{n} w_j \, f_\sigma(s_j)$$
    with fixed evaluation points $s_1, \dots, s_n$ and weights $w_j$
  • Benchmark experiment ⇒ two-dimensional midpoint rule with adaptive bandwidth choice depending on the value of $\sigma$ as best trade-off between accuracy and speed
  • Rathbun (1996): existence, consistency and asymptotic normality of a local maximum $\hat\theta_{ML}$ as $T \to \infty$ for fixed $W$
  • Newton algorithm using R's nlminb function with analytical score function and expected Fisher information
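A minimal Python sketch of a two-dimensional midpoint rule over a polygon follows. It is not the cubature code used in the surveillance package: the grid is laid over the bounding box, cells are kept when their midpoint falls inside the polygon, and the rule "bandwidth = sigma / 5" in the usage lines is a hypothetical stand-in for the adaptive choice mentioned above.

```python
import math

def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def midpoint_cubature(f, polygon, bandwidth):
    """Approximate the integral of f over the polygon by sum_j w_j f(s_j),
    with s_j the cell midpoints inside the polygon and w_j the cell area."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    xmin, xmax, ymin, ymax = min(xs), max(xs), min(ys), max(ys)
    nx = max(1, math.ceil((xmax - xmin) / bandwidth))
    ny = max(1, math.ceil((ymax - ymin) / bandwidth))
    hx, hy = (xmax - xmin) / nx, (ymax - ymin) / ny
    total = 0.0
    for i in range(nx):
        for j in range(ny):
            x = xmin + (i + 0.5) * hx
            y = ymin + (j + 0.5) * hy
            if point_in_polygon(x, y, polygon):
                total += hx * hy * f(x, y)
    return total

# Usage: Gaussian kernel with sigma = 10 over a 100 x 100 square centred at 0.
sigma = 10.0
f_sigma = lambda x, y: math.exp(-(x * x + y * y) / (2 * sigma ** 2))
square = [(-50.0, -50.0), (50.0, -50.0), (50.0, 50.0), (-50.0, 50.0)]
approx = midpoint_cubature(f_sigma, square, bandwidth=sigma / 5)  # hypothetical rule
exact_whole_plane = 2 * math.pi * sigma ** 2
```

Tying the cell width to $\sigma$ keeps the number of evaluation points per kernel width roughly constant, which is the intuition behind an adaptive bandwidth.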

SLIDE 21

Goodness-of-fit and simulation

  • Define residuals $Y_i = \hat\Lambda^*(t_i) - \hat\Lambda^*(t_{i-1})$, $i = 2, \dots, n$, where $\hat\Lambda^*(t)$ is the cumulative intensity function
  • If the estimated CIF describes the true CIF well in the temporal dimension, then $U_i = 1 - \exp(-Y_i) \overset{\text{iid}}{\sim} U(0, 1)$
  • Use the Kolmogorov-Smirnov test and plot the empirical distribution function of the $U_i$'s to check for deviations
  • Alternative: compare the observed epidemic with simulations from the model using Ogata's modified thinning (Daley & Vere-Jones, 2003, Algorithm 7.5.V.)
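The residual transformation and the Kolmogorov-Smirnov distance can be sketched as follows. This Python sketch assumes a known cumulative intensity function supplied as a callable (here, hypothetically, that of a homogeneous process with rate 2); it only computes the test statistic, not a p-value, and is not the package's checkResidualProcess.

```python
import math
import random

def residual_uniforms(event_times, cumulative_intensity):
    """U_i = 1 - exp(-Y_i) with Y_i = Lambda(t_i) - Lambda(t_{i-1})."""
    lam = [cumulative_intensity(t) for t in sorted(event_times)]
    return [1.0 - math.exp(-(b - a)) for a, b in zip(lam, lam[1:])]

def ks_statistic(u):
    """Largest distance between the empirical CDF of u and the U(0, 1) CDF."""
    u = sorted(u)
    n = len(u)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(u))

# Usage with hypothetical data: a homogeneous Poisson process with rate 2,
# whose cumulative intensity is Lambda(t) = 2 t.
rate = 2.0
rng = random.Random(1)
times, t = [], 0.0
for _ in range(200):
    t += rng.expovariate(rate)   # exponential waiting times
    times.append(t)
u = residual_uniforms(times, lambda x: rate * x)
d = ks_statistic(u)              # small values are compatible with U(0, 1)
```

If the cumulative intensity were misspecified, for example too small a rate, the $U_i$ would pile up near 0 and the distance `d` would grow.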

SLIDE 23

Data representation: epidataCS class

R> library("surveillance")
R> # [... loads of data preparation ...]
R> imdepi <- as.epidataCS(events, stgrid, W = germany, qmatrix = diag(2))
R> print(imdepi, n=5, digits=5)
History of an epidemic
Observation period: 0 -- 2562
Observation window (bounding box): [4034.1, 4670.4] x [2686.7, 3543.2]
Spatio-temporal grid (not shown): 366 time blocks, 413 tiles
Types of events: 'B' 'C'
Overall number of events: 636
    coordinates         ID  time  tile  type eps.t eps.s age sex    BLOCK
103 (4112.19, 3202.79)   1  0.99  05554 B       30   200  17 male       1
402 (4122.51, 3076.97)   2  1.00  05382 C       30   200   3 male       1
312 (4412.47, 2915.94)   3  6.00  09574 B       30   200  34 female     1
314 (4202.64, 2879.7)    4  8.00  08212 B       30   200  15 female     2
629 (4128.33, 3223.31)   5 23.00  05554 C       30   200  15 male       4
    start popdensity influenza0 influenza1 influenza2 influenza3
103           260.86
402           519.36
312           209.45
314     7    1665.61
629    21     260.86
[....]

SLIDE 24

IMD model selection by AIC

  • Joint analysis of the two finetypes
  • Temporal interaction function g: constant
  • ϵ = 30 days, δ = 200 km
  • District-specific population density as endemic offset
  • Compare all models composed of subsets of the following terms:
      - Common or finetype-specific endemic intercept
      - Linear time trend and sine-cosine time-of-year effects
      - Linear effect of weekly number of influenza cases registered in the district of a point (lag 0 to lag 3)
      - Epidemic predictor with gender, age (categorized as 0-2, 3-18, ≥19), finetype and age-finetype interaction
      - Spatial interaction function ƒ: Gaussian or constant

SLIDE 25

Example code

R> fit <- twinstim(
+      endemic = ~ 1 + offset(log(popdensity)) + I(start/365) +
+          sin(start*2*pi/365) + cos(start*2*pi/365),
+      epidemic = ~ 1 + type + agegrp,
+      siaf = siaf.gaussian(1),
+      tiaf = tiaf.constant(),
+      data = imdepi, subset = !is.na(agegrp),
+      nCub = 36, nCub.adaptive = TRUE,
+      optim.args = list(par = startvalues), model = TRUE
+  )

SLIDE 26

Model summary (1)

R> toLatex(summary(fit), digits=2, withAIC=FALSE)

                        Estimate  Std. Error  z value  P(|Z| > |z|)
h.(Intercept)            -20.365       0.087   -233.5  < 2 · 10^-16
h.I(start/365)            -0.049       0.022     -2.2  0.03
h.sin(start*2*pi/365)      0.262       0.065      4.0  6 · 10^-05
h.cos(start*2*pi/365)      0.267       0.064      4.1  3 · 10^-05
e.(Intercept)            -12.575       0.313    -40.2  < 2 · 10^-16
e.typeC                   -0.850       0.257     -3.3  0.001
e.agegrp[3,19)             0.646       0.320      2.0  0.04
e.agegrp[19,Inf)          -0.187       0.432     -0.4  0.67
e.siaf.1                   2.829       0.082

  • endemic: common intercept, no influenza effect
  • epidemic: no gender effect, no age-finetype interaction, Gaussian ƒ

Basic reproduction numbers
  • $\hat\mu_B = 0.25$ (95% CI: 0.19 to 0.34)
  • $\hat\mu_C = 0.11$ (95% CI: 0.07 to 0.17)

SLIDE 27

Model summary (2)

R> intensityplot(fit, which = "total intensity", aggregate = "time",
+      types = 1, col = "orangered", ylim = c(0,0.3))

[Figure: fitted intensity process over time, one panel per finetype (B:P1.7-2,4:F1-5 and C:P1.5,2:F3-3), showing total intensity and endemic intensity; x-axis: Time [days], 0 to 2500; y-axis: Fitted intensity process, 0.00 to 0.30]

SLIDE 28

Model summary (3)

[Figure: multiplicative effect over time, 2002 to 2008, with point estimate and 95% Wald CI; x-axis: Time, y-axis: Multiplicative effect]

Typical IMD peak in late February and minimum in August

[Figure: $e^{\hat\gamma_C I_C(\kappa_j)} f_{\hat\sigma}(\|s - s_j\|)$ against distance $\|s - s_j\|$ from host (0 to 200), with point estimates and 95% Wald CIs for types B and C]

Effective interaction range ≈ 50 km

SLIDE 29

Goodness-of-fit (residual analysis)

R> checkResidualProcess(fit, plot=1)

[Figure: cumulative distribution of the u(i) on [0, 1], two panels: deterministic tie-breaking and U(0, 1)-scheme; x-axis: u(i), y-axis: Cumulative distribution]

SLIDE 30

Goodness-of-fit (simulation)

R> simulate(fit, nsim = 100,
+      data = imdepi,
+      tiles = districts,
+      W = germany)

  • Compare observed 7-year incidences with (2.5%, 97.5%) quantiles from 100 simulations from the fitted CIF model
  • Many excess districts around Aachen at the border to the Netherlands
  • Edge effects hide potential transmissions across the border

[Figure: map accompanying the comparison; legend scale 2 to 10]
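The idea of simulation by thinning can be illustrated with a deliberately simplified case. The Python sketch below handles a purely temporal self-exciting process with constant endemic rate `h` and an exponentially decaying epidemic kernel truncated at `eps`. It is not the spatio-temporal marked algorithm behind `simulate()` nor a transcription of Algorithm 7.5.V; all names and parameter values are hypothetical.

```python
import math
import random

def intensity(t, events, h, eta, alpha, eps):
    """lambda*(t) = h + sum over events with 0 <= t - t_j <= eps of exp(eta - alpha (t - t_j))."""
    excitation = sum(math.exp(eta - alpha * (t - t_j))
                     for t_j in events if 0 <= t - t_j <= eps)
    return h + excitation

def simulate_by_thinning(T, h, eta, alpha, eps, seed=None):
    """Simulate event times on (0, T] by thinning a dominating Poisson process."""
    rng = random.Random(seed)
    events, t = [], 0.0
    while True:
        # Between events the intensity only decays, so its value just after
        # the current time is an upper bound until the next event.
        bound = intensity(t, events, h, eta, alpha, eps)
        if bound <= 0:
            break                        # nothing can happen any more
        t += rng.expovariate(bound)      # candidate from the dominating process
        if t > T:
            break
        # Accept the candidate with probability lambda*(t) / bound
        if rng.random() * bound <= intensity(t, events, h, eta, alpha, eps):
            events.append(t)
    return events

# Usage: exp(eta) / alpha is about 0.27 < 1, so each event triggers
# fewer than one further event on average.
sim = simulate_by_thinning(T=100.0, h=0.5, eta=-2.0, alpha=0.5, eps=30.0, seed=42)
```

Repeating such simulations and summarising them, for example by quantiles of the counts, gives the kind of reference band against which the observed data are compared on this slide.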

SLIDE 31

Summary

  • twinstim is a comprehensive framework for the modelling, inference and simulation of general self-exciting spatio-temporal point processes, e.g., epidemics, forest fires, residential burglaries, riots, ...
  • Details in Meyer, Elias & Höhle (2012)
  • ... and most importantly ... The twinstim implementation is available in the popular R package surveillance (Höhle, Meyer & Paul, 2012)

SLIDE 32

Acknowledgements

  • Michael Höhle, Robert Koch Institute, for fruitful collaboration
  • Johannes Elias and Ulrich Vogel, University of Würzburg, for supplying the IMD data and for discussions on the microbiological aspects
  • Ludwig Fahrmeir, Ludwig-Maximilians-Universität München, for providing helpful suggestions and comments
  • Stephen Price for inviting me to this seminar and for providing an interesting application
  • The Munich Center of Health Sciences and the Swiss National Science Foundation for financial support

SLIDE 33

Literature

  • Daley, D. J. & Vere-Jones, D. (2003). An introduction to the theory of point processes (2nd ed., Vol. I: Elementary Theory and Methods). New York: Springer-Verlag.
  • Höhle, M. (2009, December). Additive-multiplicative regression models for spatio-temporal epidemics. Biometrical Journal, 51(6), 961–978. doi: 10.1002/bimj.200900050
  • Höhle, M., Meyer, S. & Paul, M. (2012). surveillance: Temporal and spatio-temporal modeling and monitoring of epidemic phenomena [Computer software manual]. Retrieved from http://surveillance.r-forge.r-project.org/ (R package version 1.4-2)
  • Meyer, S., Elias, J. & Höhle, M. (2012). A space-time conditional intensity model for invasive meningococcal disease occurrence. Biometrics, 68(2), 607–616. doi: 10.1111/j.1541-0420.2011.01684.x
  • Ogata, Y. (1998, June). Space-time point-process models for earthquake occurrences. Annals of the Institute of Statistical Mathematics, 50(2), 379–402. doi: 10.1023/A:1003403601725
  • Rathbun, S. L. (1996). Asymptotic properties of the maximum likelihood estimator for spatio-temporal point processes. Journal of Statistical Planning and Inference, 51(1), 55–74. doi: 10.1016/0378-3758(95)00070-4
