

slide-1
SLIDE 1

Social contact data in endemic-epidemic models and probabilistic forecasting with surveillance

Sebastian Meyer
Institute of Medical Informatics, Biometry, and Epidemiology
Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
5 July 2017
Joint work with Johannes Bracher and Leonhard Held (University of Zurich)

slide-3
SLIDE 3

World Health Organization 2014

Forecasting disease outbreaks is still in its infancy, however, unlike weather forecasting, where substantial progress has been made in recent years.

Key requirements to forecast infectious disease incidence

  • Multivariate view to predict incidence in different regions and subgroups
  • Stratified count time series from routine public health surveillance
  • Useful statistical models to reflect forecast uncertainty
  • Predictive model assessment

Sebastian Meyer | FAU | hhh4contacts and probabilistic forecasting 5 July 2017 1

slide-4
SLIDE 4

Infectious disease spread ~ social contacts

library("hhh4contacts")
plotC(contactmatrix(grouping = NULL))

[Figure: heat map of the average number of contact persons (colour scale 0.0 to 4.0), by age group of participant and age group of contact, in 15 groups from 00-04 to 70+]

EU-funded POLYMOD study [Mossong et al. 2008]:

  • 7 290 participants from eight European countries recorded contacts during one day
  • Contact characteristics were similar across countries
  • Remarkable mixing patterns with respect to age

slide-5
SLIDE 5

Infectious disease spread ~ location and distance

  • Tobler’s First Law of Geography: “Everything is related to everything else, but near things are more related than distant things.”
  • Specifically [e.g., Meyer and Held 2017]: spatial interaction decays as a power law.
  • Regional characteristics may also affect disease spread, e.g., rural vs. urban municipalities

slide-6
SLIDE 6

Infectious disease spread ~ time

[Figure: number of susceptible, infectious and removed individuals over time [days]]

  • Occasional outbreaks
  • Limited infectious period

[Figure: seasonal effect by calendar week]

  • Seasonality (influenza, measles, norovirus gastroenteritis, ...)

slide-7
SLIDE 7

Case study: norovirus gastroenteritis in Berlin, 2011–2016

noroBEg <- noroBE(by = "agegroups", timeRange = c("2011-w27", "2016-w26"))

[Figure: weekly incidence [per 100 000 inhabitants], 2012 to 2016, one panel per age group (00-04, 05-14, 15-24, 25-44, 45-64, 65+)]

Lab-confirmed counts from survstat.rki.de, stratified by 12 city districts and 6 age groups

slide-11
SLIDE 11

An age-stratified, spatio-temporal model

Negative binomial likelihood for infectious disease counts $Y_{grt}$ with endemic-epidemic mean decomposition:

$$\mu_{grt} = \nu_{grt} + \phi_{grt} \sum_{g',r'} c_{g'g} \, w_{r'r} \, Y_{g',r',t-1}$$

Log-linear predictors $\nu_{grt}$ and $\phi_{grt}$:

  • Population offsets
  • Seasonality
  • Group-specific susceptibility
  • Covariates, e.g., vaccination coverage

Aggregated POLYMOD contactmatrix() ($c_{g'g}$)

[Figure: heat map of the aggregated contact matrix (colour scale 0.0 to 0.5), by age group of participant and age group of contact (00-04, 05-14, 15-24, 25-44, 45-64, 65+)]

Spatial weights, e.g., power-law decay $w_{r'r} = (o_{r'r} + 1)^{-\rho}$

[Figure: map of the 12 city districts labelled with example weights between 1.5% and 46.8%]
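The mean decomposition can be traced by hand on a toy example. The base R sketch below uses invented numbers throughout (2 age groups, 3 regions, a hypothetical contact matrix `C`, adjacency orders `o`, decay parameter `rho`, and constant `nu` and `phi`). It does not call the surveillance package and only illustrates how the double sum combines contact weights and spatial weights.

```r
# Toy illustration of
#   mu[g, r] = nu[g, r] + phi[g, r] * sum_{g', r'} C[g', g] * W[r', r] * Yprev[g', r']
# All numbers are invented for illustration.
nG <- 2; nR <- 3                     # number of age groups and regions

C <- matrix(c(0.8, 0.2,              # C[g', g]: contact weights (rows sum to 1)
              0.3, 0.7), nrow = nG, byrow = TRUE)
o <- matrix(c(0, 1, 2,               # o[r', r]: adjacency order between regions
              1, 0, 1,
              2, 1, 0), nrow = nR, byrow = TRUE)
rho <- 2                             # assumed power-law decay parameter
W <- (o + 1)^(-rho)                  # W[r', r] = (o[r', r] + 1)^(-rho)

Yprev <- matrix(c(5, 0, 1,           # Yprev[g', r']: counts in week t - 1
                  2, 3, 0), nrow = nG, byrow = TRUE)
nu  <- matrix(0.5, nG, nR)           # endemic component
phi <- matrix(0.6, nG, nR)           # epidemic scaling

# The double sum for all (g, r) at once:
# (t(C) %*% Yprev %*% W)[g, r] = sum_{g'} sum_{r'} C[g', g] * Yprev[g', r'] * W[r', r]
infectious_pressure <- t(C) %*% Yprev %*% W
mu <- nu + phi * infectious_pressure
mu
```

Writing the sum as a matrix product avoids explicit loops over groups and regions; in a real analysis the weights and predictors would come from the fitted model rather than being fixed by hand.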

slide-12
SLIDE 12

Power-adjustment of the contact matrix: $C^\kappa := E \Lambda^\kappa E^{-1}$

powerC <- make_powerC(contactmatrix(), normalize = TRUE)

[Figure: heat maps of powerC(0), powerC(0.5), powerC(1) and powerC(2) (colour scale 0.0 to 1.0), by age group of participant and age group of contact]
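The power adjustment can be reproduced in base R on a small example. The sketch below assumes that E holds the eigenvectors and Λ the diagonal matrix of eigenvalues of the contact matrix; the 3 x 3 matrix `C` and the helper `power_matrix()` are made up for illustration, and make_powerC() may treat details (such as normalization) differently.

```r
# Power-adjust a row-normalized contact matrix via its eigendecomposition.
# 'C' is an invented example with real, positive eigenvalues (1, 0.5, 0.3),
# not the POLYMOD matrix.
C <- matrix(c(0.6, 0.3, 0.1,
              0.2, 0.6, 0.2,
              0.1, 0.3, 0.6), nrow = 3, byrow = TRUE)

power_matrix <- function(C, kappa) {
  eig <- eigen(C)
  E <- eig$vectors
  Lambda_kappa <- diag(eig$values^kappa, nrow = nrow(C))
  # Re() drops zero imaginary parts in case eigen() returned complex values
  Re(E %*% Lambda_kappa %*% solve(E))
}

power_matrix(C, 0)     # identity matrix: no mixing between groups
power_matrix(C, 0.5)   # between the identity and C
power_matrix(C, 1)     # reproduces C
power_matrix(C, 2)     # equals C %*% C: more mixing than C
```

The exponent κ thus moves the matrix continuously between no between-group mixing (κ = 0), the original contact pattern (κ = 1) and stronger mixing (κ > 1).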

slide-13
SLIDE 13

Model estimation

Likelihood inference using surveillance::hhh4() [Meyer, Held, and Höhle 2017]

A “simple”, age-stratified, spatio-temporal model [1]:

noroBEall <- noroBE(by = "all", flatten = TRUE,  # 6 x 12 = 72 columns
                    timeRange = c("2011-w27", "2016-w26"))
fit <- hhh4(stsObj = noroBEall, control = list(
    end = list(f = addSeason2formula(~1),
               offset = prop.table(population(noroBEall), 1)),
    ne = list(f = ~1 + log(pop),
              weights = W_powerlaw(maxlag = 5, log = TRUE),
              scale = expandC(contactmatrix(), 12)),
    data = list(pop = prop.table(population(noroBEall), 1)),
    family = "NegBin1", subset = 2:(4*52)))

[1] Full models in demo("hhh4contacts", package = "hhh4contacts")

slide-14
SLIDE 14

Fitted mean by age group aggregated over districts

[Figure: fitted mean, 2011 to 2016, one panel per age group (00-04, 05-14, 15-24, 25-44, 45-64, 65+), decomposed into the components "from other age groups", "within age group" and "endemic"]

slide-15
SLIDE 15

Prediction and validation

  • AIC-based model comparison selects most complex model
  • Is this choice also supported by predictive model assessment based on the last season?
  • We can quantify sharpness and calibration of probabilistic
      • one-week-ahead forecasts (negative binomial)
      • long-term forecasts (via Monte Carlo simulation)
  • Proper scoring rules as overall performance measures [Gneiting and Katzfuss 2014]
      • Assign penalty score based on the predictive distribution F and the actual observation $y_{\mathrm{obs}}$
      • Example: Dawid-Sebastiani score

$$\mathrm{DSS}(F, y_{\mathrm{obs}}) = \log|\Sigma| + (y_{\mathrm{obs}} - \mu)^\top \Sigma^{-1} (y_{\mathrm{obs}} - \mu)$$
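The Dawid-Sebastiani score only needs the mean vector µ and covariance matrix Σ of the predictive distribution F. The base R sketch below is a hand-written illustration, not the implementation in surveillance; the helper name `dss_score()` and all numbers are assumptions. The last lines show how µ and Σ could be estimated from Monte Carlo samples when no closed form is available, using independent negative binomial draws purely as stand-in simulations.

```r
# Dawid-Sebastiani score from the mean and covariance of the forecast F
dss_score <- function(y_obs, mu, Sigma) {
  Sigma <- as.matrix(Sigma)
  resid <- y_obs - mu
  log_det <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  log_det + as.numeric(t(resid) %*% solve(Sigma, resid))
}

# Univariate case: negative binomial forecast with assumed mean 20 and size 5,
# so the variance is mu + mu^2 / size = 100
mu <- 20; size <- 5
sigma2 <- mu + mu^2 / size
dss_score(y_obs = 35, mu = mu, Sigma = sigma2)   # log(100) + 15^2 / 100

# Multivariate case with moments estimated from simulated paths
set.seed(1)
sims <- cbind(rnbinom(10000, mu = 20, size = 5),   # stand-in for simulated
              rnbinom(10000, mu = 10, size = 5))   # forecasts of two targets
dss_score(y_obs = c(35, 8), mu = colMeans(sims), Sigma = cov(sims))
```

Lower scores are better: the log-determinant term rewards sharp forecasts, while the quadratic term penalizes observations far from the predictive mean.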

slide-16
SLIDE 16

Target quantity: overall epidemic curve

[Figure: predictive distribution of the number of cases over time (2015 to 2016), shown as 1%, 25%, 50%, 75% and 99% quantiles; panels: purely endemic model, full model]

slide-17
SLIDE 17

Target quantity: final size by age group

[Figure: predictive distribution of the final size by age group (00-04, 05-14, 15-24, 25-44, 45-64, 65+), shown as 1%, 25%, 50%, 75% and 99% quantiles; panels: purely endemic model, full model]

slide-19
SLIDE 19

Summary and outlook

  • Models do not perfectly represent individual-level disease transmission, but: still useful for prediction of aggregate-level surveillance counts
  • Improved model fit and predictions by incorporating spatial weights and social contact data [Held, Meyer, and Bracher 2017]
  • If the modelling goal is forecasting, use proper scoring rules to assess the quality of probabilistic forecasts
  • surveillance currently implements the following univariate scores for Poisson and NegBin predictions: rps, dss, logs
  • For continuous distributions: package scoringRules
  • User interface for multivariate scoring rules
  • Binomial hhh4 models
  • hhh4 add-on package (Johannes Bracher):
      • Analytical DSS of multivariate path forecasts
      • Distributed higher-order time lags
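To make the univariate scores concrete, the following base R sketch evaluates the logarithmic score and the ranked probability score for a single negative binomial forecast. The helpers `logs_nb()` and `rps_nb()` and the truncation point `kmax` are illustrative assumptions written from the standard definitions of these scores; they are not the functions shipped with surveillance, which remain the reference in practice.

```r
# Logarithmic score: negative log predictive probability of the observed count
logs_nb <- function(y, mu, size) {
  -dnbinom(y, mu = mu, size = size, log = TRUE)
}

# Ranked probability score: sum over k of (F(k) - 1(y <= k))^2,
# with the infinite sum truncated at 'kmax' (adequate for moderate means)
rps_nb <- function(y, mu, size, kmax = 10000) {
  k <- 0:kmax
  Fk <- pnbinom(k, mu = mu, size = size)   # predictive CDF
  sum((Fk - (y <= k))^2)
}

# Two forecasts for the same observed count of 12 (made-up numbers):
y <- 12
c(logs = logs_nb(y, mu = 10, size = 5), rps = rps_nb(y, mu = 10, size = 5))
c(logs = logs_nb(y, mu = 30, size = 5), rps = rps_nb(y, mu = 30, size = 5))
```

Both scores are negatively oriented, so the forecast centred near the observation should receive the lower values.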

slide-20
SLIDE 20

References

Gneiting, Tilmann and Katzfuss, Matthias (2014). “Probabilistic forecasting”. In: Annual Review of Statistics and Its Application 1.1, pp. 125–151. DOI: 10.1146/annurev-statistics-062713-085831.

Held, Leonhard, Meyer, Sebastian, and Bracher, Johannes (2017). “Probabilistic forecasting in infectious disease epidemiology: The 13th Armitage lecture”. In: Statistics in Medicine (in press). DOI: 10.1002/sim.7363.

Meyer, Sebastian and Held, Leonhard (2017). “Incorporating social contact data in spatio-temporal models for infectious disease spread”. In: Biostatistics 18.2, pp. 338–351. DOI: 10.1093/biostatistics/kxw051.

Meyer, Sebastian, Held, Leonhard, and Höhle, Michael (2017). “Spatio-temporal analysis of epidemic phenomena using the R package surveillance”. In: Journal of Statistical Software 77.11, pp. 1–55. DOI: 10.18637/jss.v077.i11.

Mossong, Joël et al. (2008). “Social contacts and mixing patterns relevant to the spread of infectious diseases”. In: PLoS Medicine 5.3, e74. DOI: 10.1371/journal.pmed.0050074.

World Health Organization (2014). “Anticipating epidemics”. In: Weekly Epidemiological Record 89.22, p. 244. URL: http://www.who.int/wer.

Questions? Comments? seb.meyer@fau.de

slide-21
SLIDE 21

Appendix

slide-22
SLIDE 22

Disease incidence map

noroBEr <- noroBE(by = "districts", timeRange = c("2011-w27", "2016-w26"))
scalebar <- layout.scalebar(noroBEr@map, corner = c(0.7, 0.9), scale = 10,
                            labels = c(0, "10 km"), cex = 0.6, height = 0.02)
plot(noroBEr, type = observed ~ unit, sub = "Mean yearly incidence",
     population = 100000 / (sum(pop2011) * (nrow(noroBEr) / 52)),
     labels = list(cex = 0.8), sp.layout = scalebar)

[Figure: map of mean yearly incidence by city district (chwi, frkr, lich, mahe, mitt, neuk, pank, rein, span, zehl, scho, trko), 2011/27 to 2016/26, with a 10 km scale bar]

slide-23
SLIDE 23

noroBErbyg <- noroBE(by = "all", timeRange = c("2011-w27", "2016-w26"))

[Figure: disease incidence maps by city district, one panel per age group (00-04, 05-14, 15-24, 25-44, 45-64, 65+), each with a 10 km scale bar]

animation::saveHTML(animate(noroBErbyg[["00-04"]], tps = 1:52, timeplot = list(as.Date = TRUE)))


slide-24
SLIDE 24

Model formulation for the norovirus data

$$
\mu_{grt} = e_{gr} \exp\left( \alpha_g^{(\nu)} + \alpha_r^{(\nu)} + \beta x_t + \gamma_g^{(\nu)} \sin(\omega t) + \delta_g^{(\nu)} \cos(\omega t) \right)
+ \exp\left( \alpha_g^{(\phi)} + \alpha_r^{(\phi)} + \tau \log(e_{gr}) + \gamma^{(\phi)} \sin(\omega t) + \delta^{(\phi)} \cos(\omega t) \right) \sum_{g',r'} \lfloor (C^\kappa)_{g'g} \, (o_{r'r} + 1)^{-\rho} \rfloor \, Y_{g',r',t-1}
$$

  • Group- and district-specific effects $\alpha_g^{(\cdot)}$ and $\alpha_r^{(\cdot)}$
  • Christmas break indicator $x_t$ → reduced reporting
  • Group-specific endemic seasonality (sinusoidal log-rates, $\omega = 2\pi/52$)
  • “Gravity model” $e_{gr}^{\tau}$ → force of infection scales with population size
  • $C^\kappa$: power-adjusted contact matrix
  • Power-law weights $w_{r'r} = (o_{r'r} + 1)^{-\rho}$

+ group-specific overdispersion parameters
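The sinusoidal terms translate into a multiplicative seasonal effect on the rate scale. As a minimal base R sketch with made-up coefficients `gamma` and `delta` (not estimates from the norovirus model), the effect and its equivalent amplitude-phase form can be computed as follows.

```r
# Multiplicative seasonal effect exp(gamma * sin(omega * t) + delta * cos(omega * t))
# for weekly data; gamma and delta are invented coefficients.
omega <- 2 * pi / 52
gamma <- 0.4; delta <- 0.9
week <- 1:104                                  # two years of weekly time points
season <- exp(gamma * sin(omega * week) + delta * cos(omega * week))

# Equivalent amplitude-phase form: exp(A * cos(omega * t - phase))
A <- sqrt(gamma^2 + delta^2)
phase <- atan2(gamma, delta)
season_alt <- exp(A * cos(omega * week - phase))

peak_time <- phase / omega                     # (possibly fractional) time index of the peak
range(season)                                  # bounded by exp(-A) and exp(A)
```

The amplitude-phase form makes the interpretation easier: exp(A) is the largest multiplicative effect within a year and `peak_time` locates it, repeating every 52 weeks.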

slide-25
SLIDE 25

Estimated seasonality

[Figure: two panels by calendar week, one line per age group (00-04, 05-14, 15-24, 25-44, 45-64, 65+): endemic seasonality (multiplicative effect) and epidemic proportion]

slide-26
SLIDE 26

Estimated power-law weights

[Figure: weight by adjacency order (1 to 4) for four specifications: uniform spatial spread, power law, unconstrained, local transmission only]
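How the decay parameter shapes the weights can be explored with a few lines of base R. The sketch below uses the form (o + 1)^(−ρ) from the model formulation; the helper `powerlaw_weights()`, the chosen values of `rho` and the normalization over a fixed set of adjacency orders are simplifying assumptions for illustration and do not reproduce the estimates shown in the talk.

```r
# Power-law weights w = (o + 1)^(-rho) as a function of adjacency order o,
# normalized to sum to one over the orders considered here.
powerlaw_weights <- function(o, rho) {
  w <- (o + 1)^(-rho)
  w / sum(w)
}

orders <- 0:4
round(powerlaw_weights(orders, rho = 0), 3)    # rho = 0: uniform spatial spread
round(powerlaw_weights(orders, rho = 2), 3)    # moderate decay with distance
round(powerlaw_weights(orders, rho = 20), 3)   # large rho: almost purely local transmission
```

The two limiting cases correspond to the uniform and local-only specifications, with the power law interpolating between them through a single parameter.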