SLIDE 1

Inverse Problems in Epidemiology

Karyn Sutton1,2

1 Center for Research in Scientific Computation & 2 Center for Quantitative Studies in Biomedicine

North Carolina State University

3 Department of Mathematics and Statistics

Arizona State University

Collaborators: H.T. Banks1,2, Carlos Castillo-Chávez3

Wednesday, October 29, 2008

Workshop on Inverse and Partial Information Problems: Methodology and Applications

SLIDE 2

Public Health Challenges in Infectious Diseases

  • Prescribing and implementing control strategies (prevention and/or treatment)
  • Collecting and analyzing surveillance data
  • One strategy is likely not effective in all populations
    – Heterogeneous populations
    – Drugs or vaccines may be inappropriate for population

SLIDE 3

Mathematical Approaches & Inverse Problems

  • Physiological structure can be incorporated into population models
  • Determine appropriate level of detail
  • Determine if mechanisms/terms should be included in model
  • Calibrate a mathematical model to population of interest
  • Theoretically study prevention and/or treatment strategies
  • Assess impact/effectiveness of implemented prevention or treatment strategy
  • Improve surveillance data collection

    – ‘Types’ of data
    – How many longitudinal data points
    – Frequency of longitudinal observations

SLIDE 4

Pneumococcal Diseases as Example

Invasive infections caused by Streptococcus pneumoniae include pneumonia, meningitis, bacteremia, sepsis.

  • Population heterogeneity plays a role in infection dynamics
  • Multiple serotypes complicate prevention and treatment; serotypes vary by:
    – Geographic region
    – Age groups affected
    – Ability to colonize individuals
    – Ability to cause infection in colonized individuals

SLIDE 5

Pneumococcal Diseases as Example

  • Vaccine development active research area

– Distinct in structural design → distinct in induced immunity

  • Polysaccharide vaccine: licensed in 1983, effective in elderly
  • BUT most affected group is children,

– 1 million children under the age of 5 die from pneumococcal pneumonia annually (WHO, 1999)

  • Protein conjugate vaccine: licensed in 2001, effective in children
  • BUT may induce undesirable evolutionary changes in endemic pneumococci, changing landscape of infections in unknown and potentially serious ways.

SLIDE 6

Key Assumptions in Mathematical Model of Pneumococcal Diseases

  • Asymptomatic nasopharyngeal colonization (or ‘carriage’) results from casual contacts

  • Infection established only if colonies cannot be cleared
  • Seasonality in infection rates due to changes in host susceptibility (comorbidity)
  • Susceptible and colonized individuals vaccinated at same rate
  • Vaccines may induce protection against infection and colonization (conjugate)
  • Vaccine protection may be lost

SLIDE 7

Pneumococcal Disease Dynamics Model

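[Figure: compartmental flow diagram. Unvaccinated classes S (susceptible), E (colonized), I (infected) and vaccinated classes SV, EV, IV. Colonization of S and SV depends on the fraction (E + EV + I + IV)/N; flows out of E and EV into the infected classes carry the seasonal rate κ(t); each class has mortality rate µ.]

Seasonal infection rate: $\kappa(t) = \kappa_0 \left(1 + \kappa_1 \cos[\omega(t - \tau)]\right)$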
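
The seasonal infection rate κ(t) can be sketched as a small Python function. This is only an illustration: the parameter values, the choice of months as the time unit, and the 12-month period are assumptions, not values from the talk.

```python
import math

def kappa(t, kappa0, kappa1, omega, tau):
    """Seasonal infection rate kappa0 * (1 + kappa1 * cos(omega * (t - tau)))."""
    return kappa0 * (1.0 + kappa1 * math.cos(omega * (t - tau)))

# Hypothetical values: t measured in months, one seasonal cycle per 12 months.
omega = 2.0 * math.pi / 12.0
peak = kappa(7.0, kappa0=1e-3, kappa1=0.5, omega=omega, tau=7.0)     # t = tau
trough = kappa(13.0, kappa0=1e-3, kappa1=0.5, omega=omega, tau=7.0)  # half a period later
```

With these inputs κ0 sets the mean rate, κ1 the relative amplitude, and τ the time of the seasonal peak.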

SLIDE 8

Surveillance Data from Australian NNDS

Observations from 2001-2004, before conjugate vaccine

  • Total Cases (monthly): $Y^{(1)}_j \sim \int_{t_j}^{t_{j+1}} [\kappa(s)E(s) + \delta\kappa(s)E_V(s)]\,ds$; $(t_1, \ldots, t_{37}) = (1, \ldots, 37)$.
  • Unvaccinated Cases (annually): $Y^{(2)}_k \sim \int_{t_k}^{t_{k+1}} [\kappa(s)E(s)]\,ds$; $(t_1, \ldots, t_4) = (1, 13, 25, 37)$.
  • Vaccinated Cases (annually): $Y^{(3)}_k \sim \int_{t_k}^{t_{k+1}} [\delta\kappa(s)E_V(s)]\,ds$; $(t_1, \ldots, t_4) = (1, 13, 25, 37)$.

Observations after conjugate vaccine freely available

  • Total Cases Jan 2005 - Jun 2007 (monthly): $Y^{(1)}_j \sim \int_{t_j}^{t_{j+1}} [\kappa(s)E(s) + \delta\kappa(s)E_V(s)]\,ds$; $(t_1, \ldots, t_{31}) = (1, \ldots, 31)$.
  • Unvaccinated Cases 2005 (annually): $Y^{(2)}_k \sim \int_{t_k}^{t_{k+1}} [\kappa(s)E(s)]\,ds$; $(t_1, t_2) = (1, 13)$.
  • Vaccinated Cases 2005 (annually): $Y^{(3)}_k \sim \int_{t_k}^{t_{k+1}} [\delta\kappa(s)E_V(s)]\,ds$; $(t_1, t_2) = (1, 13)$.
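
As a rough illustration of how a model trajectory maps to these binned case counts, the sketch below integrates an incidence rate over each reporting interval with the trapezoid rule. The function `unvaccinated_rate` is a made-up stand-in for κ(s)E(s); in practice that quantity would come from a numerical solution of the model.

```python
import numpy as np

def binned_cases(rate_fn, t_edges, pts_per_bin=201):
    """Integrate rate_fn over each [t_j, t_{j+1}] using the trapezoid rule."""
    counts = []
    for a, b in zip(t_edges[:-1], t_edges[1:]):
        s = np.linspace(a, b, pts_per_bin)
        y = rate_fn(s)
        counts.append(float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(s))))
    return np.array(counts)

def unvaccinated_rate(s):
    # Hypothetical stand-in for kappa(s) * E(s), in cases per month.
    return 100.0 + 20.0 * np.cos(2.0 * np.pi * s / 12.0)

monthly_edges = np.arange(1, 38)           # (t_1, ..., t_37) = (1, ..., 37)
annual_edges = np.array([1, 13, 25, 37])   # (t_1, ..., t_4) = (1, 13, 25, 37)
monthly = binned_cases(unvaccinated_rate, monthly_edges)
annual = binned_cases(unvaccinated_rate, annual_edges)
```

The same helper serves monthly and annual reports; only the interval endpoints change.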

SLIDE 9

Statistical Model

  • Assume a statistical model

$Y^{(i)} = f^{(i)}(t, \theta_0) + \epsilon^{(i)}$, where $Y^{(1)} = \{Y^{(1)}_j\}_{j=1}^{36}$, $Y^{(2)} = \{Y^{(2)}_k\}_{k=1}^{3}$, $Y^{(3)} = \{Y^{(3)}_k\}_{k=1}^{3}$, and $f^{(i)}$, $\epsilon^{(i)}$ are defined similarly.

  • We further assume
    1. There exist ‘true parameters’ $\theta_0$ which generated observations.
    2. $\epsilon^{(i)}_j$ are i.i.d. for fixed i.
    3. mean $E[\epsilon^{(i)}_j] = 0$, and variance $\mathrm{var}[\epsilon^{(i)}_j] = \sigma^2_{0,i}$.
    4. $\sigma_{0,2} = \sigma_{0,3}$; data likely arose by same counting process.

SLIDE 10

Inverse Problem Formulation

  • $\hat\theta_{OLS}(Y) = \arg\min_{\theta \in \Theta} J(\theta, \sigma_1, \sigma_2)$, where $\Theta \subset \mathbb{R}^p$ is the feasible parameter space.

  • Objective function

$$J(Y, \theta, \sigma_1, \sigma_2) = \frac{1}{\sigma_1^2} \sum_{j=1}^{36} \left| f^{(1)}_j(t, \theta) - Y^{(1)}_j \right|^2 + \frac{1}{\sigma_2^2} \sum_{k=1}^{3} \left[ \left| f^{(2)}_k(t, \theta) - Y^{(2)}_k \right|^2 + \left| f^{(3)}_k(t, \theta) - Y^{(3)}_k \right|^2 \right]$$

  • Variance formulas

$$\hat\sigma_1^2 = \frac{1}{36 - p} \sum_{j=1}^{36} \left| f^{(1)}_j(t, \hat\theta) - Y^{(1)}_j \right|^2$$

$$\hat\sigma_2^2 = \frac{1}{6 - p} \sum_{k=1}^{3} \left[ \left| f^{(2)}_k(t, \hat\theta) - Y^{(2)}_k \right|^2 + \left| f^{(3)}_k(t, \hat\theta) - Y^{(3)}_k \right|^2 \right]$$
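
A minimal sketch of the weighted objective and the variance estimates, written in terms of residual vectors f − Y for the three data sets. The function names and the toy residuals are illustrative assumptions; the talk does not show an implementation.

```python
import numpy as np

def objective(res1, res2, res3, sigma1, sigma2):
    """Weighted least-squares cost J; data sets 2 and 3 share sigma2."""
    res1, res2, res3 = (np.asarray(r, dtype=float) for r in (res1, res2, res3))
    return (np.sum(res1 ** 2) / sigma1 ** 2
            + (np.sum(res2 ** 2) + np.sum(res3 ** 2)) / sigma2 ** 2)

def variance_estimates(res1, res2, res3, p):
    """Variance estimates with p estimated parameters (36 - p and 6 - p on the slide)."""
    res1, res2, res3 = (np.asarray(r, dtype=float) for r in (res1, res2, res3))
    var1 = np.sum(res1 ** 2) / (len(res1) - p)
    var2 = (np.sum(res2 ** 2) + np.sum(res3 ** 2)) / (len(res2) + len(res3) - p)
    return var1, var2

# Toy residuals: 36 monthly points and 3 + 3 annual points, p = 4 parameters.
r1, r2, r3 = np.full(36, 2.0), np.ones(3), np.ones(3)
J_val = objective(r1, r2, r3, sigma1=2.0, sigma2=1.0)
var1, var2 = variance_estimates(r1, r2, r3, p=4)
```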

SLIDE 11

Inverse Problem Formulation

  • Estimating $\theta = (\beta, \kappa_0, \kappa_1, \delta)^T$ requires simultaneous estimation of $\sigma_1^2, \sigma_2^2$ via an iterative process.

  • Estimate $\hat\psi = (\hat\theta, \hat\sigma_1^2, \hat\sigma_2^2)^T$ by:
    1. Guess $\hat\sigma_1^{(0)}, \hat\sigma_2^{(0)}$.
    2. $\hat\theta^{(0)} = \arg\min_{\theta \in \Theta} J(\theta, \hat\sigma_1^{(0)}, \hat\sigma_2^{(0)})$
    3. Calculate $\hat\sigma_1^2, \hat\sigma_2^2$ with $\hat\theta^{(0)}$.
    4. Continue updating $\hat\theta^{(k)}$ and then $\hat\sigma_1^2, \hat\sigma_2^2$ until $\left| \|\hat\psi^{(k)}\| - \|\hat\psi^{(k-1)}\| \right| \le 10^{-q}$, where q is a pre-determined constant.

  • Obtain standard errors from estimated covariance matrix:

$$SE(\hat\theta_k) \approx \sqrt{\hat\Sigma_{kk}}, \qquad \hat\Sigma = \left[ \sum_{i=1}^{3} \frac{1}{\hat\sigma_i^2} \chi_i^T(\hat\theta)\, \chi_i(\hat\theta) \right]^{-1}$$

where the (j, k)th entry of $\chi_i(\hat\theta)$ is $\partial f^{(i)}_j / \partial \theta_k$.
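
The iteration and the standard-error formula can be sketched on a toy problem. To keep the example self-contained, it assumes each model output is linear in the parameters, f_i(θ) = X_i θ, so that the minimization step has a closed form and the sensitivity matrix χ_i equals X_i. The epidemic model in the talk is nonlinear and would need a numerical optimizer and computed sensitivities instead. All data below are synthetic.

```python
import numpy as np

def iterate_estimates(X_list, Y_list, sigma2_init, q=8, max_iter=100):
    """Alternate weighted least squares for theta with variance updates."""
    p = X_list[0].shape[1]
    sigma2 = np.asarray(sigma2_init, dtype=float)
    psi_norm_old = np.inf
    for _ in range(max_iter):
        # Step 2: minimize J for the current variances (closed form, linear toy model).
        info = sum(X.T @ X / s2 for X, s2 in zip(X_list, sigma2))
        rhs = sum(X.T @ Y / s2 for X, Y, s2 in zip(X_list, Y_list, sigma2))
        theta = np.linalg.solve(info, rhs)
        # Step 3: update each variance from the residual sum of squares.
        sigma2 = np.array([np.sum((X @ theta - Y) ** 2) / (len(Y) - p)
                           for X, Y in zip(X_list, Y_list)])
        # Step 4: stop once the norm of psi changes by at most 10**-q.
        psi_norm = np.linalg.norm(np.concatenate([theta, sigma2]))
        if abs(psi_norm - psi_norm_old) <= 10.0 ** (-q):
            break
        psi_norm_old = psi_norm
    # Covariance estimate: inverse of the variance-weighted sum of chi_i^T chi_i.
    info = sum(X.T @ X / s2 for X, s2 in zip(X_list, sigma2))
    se = np.sqrt(np.diag(np.linalg.inv(info)))
    return theta, sigma2, se

# Synthetic data with hypothetical true parameters and noise levels.
rng = np.random.default_rng(0)
theta0 = np.array([1.0, 2.0])
t1, t2 = np.linspace(0.0, 1.0, 36), np.linspace(0.0, 1.0, 12)
X1 = np.column_stack([np.ones_like(t1), t1])
X2 = np.column_stack([np.ones_like(t2), t2])
Y1 = X1 @ theta0 + rng.normal(0.0, 0.1, size=36)
Y2 = X2 @ theta0 + rng.normal(0.0, 0.5, size=12)
theta_hat, sigma2_hat, se = iterate_estimates([X1, X2], [Y1, Y2], [1.0, 1.0])
```

The noisier data set receives a smaller weight once its variance estimate grows, which is the purpose of estimating the variances alongside θ.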

SLIDE 12

Model Calibration

[Figure: three panels. New Infections (cases vs. t in months); Unvaccinated Cases (cases vs. t in years); Vaccinated Cases (cases vs. t in years).]

Vaccine Assessment

[Figure: cases vs. t (months), Jan 05 thru Jun 07.]

Used calibrated model to show that vaccine becoming increasingly less effective ⇒ suggests need for quantitative monitoring

SLIDE 13

Age-Specific Surveillance Data

  • Age recorded with most infectious disease reports
  • Not reported frequently enough for reliable parameter estimation
  • Generated age-dependent data to explore following questions:

    – Which ‘types’ of information should be collected to estimate certain parameters?
    – How many longitudinal points are needed? How frequently? Over what length of time?
    – How can we tell if a model is ‘over-specified’?

SLIDE 14

Age-Structured Model

  • Age is a reasonable marker for physiological processes which govern infection dynamics
  • Analogous age-structured pneumococcal disease model with parameters/rates as functions of age
  • Discretize PDE to system of ODEs assuming stable age distribution
  • State variables represent age cohorts, possibly of different lengths
    – Consider parameters constant within each age class
    – Use smaller lengths in younger and older age classes

  • Facilitates computational studies and connection to surveillance data

SLIDE 15

Age-Structured Model

[Figure: age-structured flow diagram with classes S(a,t), E(a,t), I(a,t), SV(a,t), EV(a,t), IV(a,t); all transition rates are functions of age a, and each class has mortality rate µ(a).]

Boundary condition: $S(0, t) = \int_0^{\infty} f(a')\, n(a', t)\, da'$

SLIDE 16

Inverse Problem for Age-Structured Data

  • For m age classes, $n_k$ longitudinal observations, data are generated according to the following statistical model

$$Y^{(k,i)}_j = f^{(k,i)}(t_j, \theta_0) + \epsilon^{(k,i)}_j$$

i denotes age class, k denotes ‘type’ of observation, j denotes time.

  • ‘True’ parameters $\theta_0$ known!
  • $\epsilon^{(k,i)}_j \sim N(0, \sigma^2_{k,i})$
    – Variance independent of time, but scaled according to observation
    – $\sigma_{k,i} = \frac{l}{100} \cdot \mathrm{avg}_j\, f^{(k,i)}(t_j, \theta_0)$ for l% noisy data
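
The noise model can be illustrated with a short sketch that perturbs a model output vector with Gaussian noise whose standard deviation is l% of the time-averaged output. The vector `f_true` and the 5% noise level are made-up values for demonstration.

```python
import numpy as np

def add_noise(f_vals, noise_percent, rng):
    """Return (f + eps, sigma) with eps ~ N(0, sigma^2), sigma = (l/100) * mean(f)."""
    f_vals = np.asarray(f_vals, dtype=float)
    sigma = noise_percent / 100.0 * f_vals.mean()
    noisy = f_vals + rng.normal(0.0, sigma, size=f_vals.shape)
    return noisy, sigma

rng = np.random.default_rng(1)
f_true = np.array([80.0, 100.0, 120.0, 100.0])  # hypothetical model output for one (k, i)
y_noisy, sigma = add_noise(f_true, noise_percent=5.0, rng=rng)
```

Because σ depends on the average of f over time, every time point of a given observation type and age class receives the same noise variance.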

SLIDE 17

Age-Structured Generated Data

  • Total cases: $f^{(1,i)}_j = \int_{t_j}^{t_{j+1}} [\kappa_i(s)E_i(s) + \delta_i\kappa_i(s)E_{V,i}(s)]\,ds$, where $j = 1, \ldots, n_1$ for each $i = 1, \ldots, 5$,
  • Vaccinated Cases: $f^{(2,i)}_j = \int_{t_j}^{t_{j+1}} \delta_i\kappa_i(s)E_{V,i}(s)\,ds$, where $j = 1, \ldots, n_2$ for each $i = 1, \ldots, 5$,
  • Colonization Prevalence: $f^{(3,i)}_j = \frac{E_i(t_j) + E_{V,i}(t_j)}{N_i}$, where $j = 1, \ldots, n_4$ for each $i = 1, \ldots, 5$,
  • Vaccinated Colonization Prevalence: $f^{(4,i)}_j = \frac{E_{V,i}(t_j)}{N_i}$, where $j = 1, \ldots, n_5$ for each $i = 1, \ldots, 5$.

** Colonization prevalence data not currently collected, but public health officials have considered the benefits of this information

SLIDE 18

Model Calibration Results

  • Estimated mean infection rates $\kappa_0$ from reports of total cases $Y^{(1)}$
    – $\psi = (\kappa_{0,1}, \ldots, \kappa_{0,m}, \hat\sigma^2_{1,1}, \ldots, \hat\sigma^2_{1,m})$
  • Estimated mean infection rates $\kappa_0$ and vaccine infection protection δ from total $Y^{(1)}$ and vaccinated cases $Y^{(2)}$
  • Estimated force of infection Λ from colonization prevalence data $Y^{(3)}$
    – Assuming mixing structure (proportionate), can estimate contact matrix $c_{i,i'}$
    – Difficult to quantify ⇒ rarely, if ever, available in literature
    – Drives horizontal spread of infections

SLIDE 19

                          Annual, 6 yrs                  Monthly, 2 yrs
          ψ0        ψ̂^10       SE(θ̂^10)       ψ̂^10       SE(θ̂^10)
κ0,1      3e−4      3.22e−4    4.9e−6         3.00e−4    2.4e−6
κ0,2      2.5e−5    2.56e−4    1.1e−6         2.47e−5    6.4e−7
κ0,3      4e−5      3.89e−4    2.7e−6         3.96e−5    1.4e−6
κ0,4      6e−5      5.79e−5    5.6e−6         5.86e−5    2.8e−6
κ0,5      1.7e−4    1.82e−4    1.9e−5         1.74e−4    6.8e−6

              Annual, 6 yrs          Monthly, 2 yrs
σ^10_{1,1}    149    148.6287        5.5415    5.30
σ^10_{1,2}    51     51.2244         2.0388    1.90
σ^10_{1,3}    96     96.2737         4.8541    5.08
σ^10_{1,4}    54     54.4548         2.8422    3.34
σ^10_{1,5}    127    127.2959        5.4788    5.62

SLIDE 20

Model Calibration Results

[Figure: four panels of cases vs. t (months), each comparing data and model by age group. Total Cases, Age Groups 1,2; Total Cases, Age Groups 3,4,5; Vaccinated Cases, Age Group 4; Vaccinated Cases, Age Group 5.]

SLIDE 21

Vaccine Impact Assessment

  • Estimated vaccine infection protection δ in vaccinated classes from vaccinated data $Y^{(2)}$
  • Estimated vaccine colonization protection ε in younger age classes from vaccinated colonization prevalence $Y^{(4)}$
    – Studies attempting to quantify this parameter are controversial
    – Further motivation to collect colonization data as protein-based vaccines are implemented
  • Simultaneously estimated vaccine protection parameters ε and δ in relevant age classes from vaccinated infection $Y^{(2)}$ and colonization prevalence data $Y^{(4)}$
  • Unable to estimate ε or δ without vaccine information, i.e., from $Y^{(1)}$ and $Y^{(3)}$ alone

SLIDE 22

Age Class Refinement

  • Age is commonly reported, but it is often unclear how to meaningfully aggregate information
  • Consider models of this type with fewer age classes as special cases of models with more age classes
    – Model 1: (0,10], (10,20], (20,∞); Model 2: (0,20], (20,∞)
    – Age-dependent parameter α(a). Model 1: $\alpha_1, \alpha_2, \alpha_3$; Model 2: $\alpha_1 = \alpha_2, \alpha_3$
    – Estimating α(a) from data will likely result in a lower residual sum of squares (RSS) for Model 1 as compared to Model 2.
  • Models with more age classes will never give a higher residual when fit to data, because they have more degrees of freedom
  • Use an RSS-based statistic
    – Tells when improvement of fit/reduction of RSS is statistically significant
    – If improvement is significant, increased level of detail is warranted in the model

SLIDE 23

Model Comparison Statistic

  • Consider the estimator $\theta_{OLS}(Y) = \arg\min_{\theta \in \Theta} J_n(Y, \theta)$, where $\Theta \subset \mathbb{R}^p$ is the feasible parameter space.

  • Statistical model and corresponding assumptions unchanged: $Y_j = f(t_j, \theta_0) + \epsilon_j$

  • Define constrained parameter space $\Theta_H = \{\theta \in \Theta \mid H\theta = c\}$, where H is an r × p matrix and c is a known constant. (r is the difference in degrees of freedom)

  • Let $\theta_H(Y)$ denote the OLS estimator over $\Theta_H$.

SLIDE 24

Model Comparison Statistic

  • Testing the null hypothesis $H_0: \theta_0 \in \Theta_H$.

  • Define RSS-based statistic

$$U^r_n(Y) = \frac{n \left( J_n(Y, \theta_H(Y)) - J_n(Y, \theta(Y)) \right)}{J_n(Y, \theta(Y))}$$

  • If $H_0$ is true, $U_n \to U(r)$ in distribution as $n \to \infty$, where $U(r) \sim \chi^2(r)$.
  • Choose significance level α, then $\mathrm{Prob}\{U > \tau\} = \alpha$ for threshold τ.
  • If $U_n > \tau$, reject $H_0$. Otherwise, do not reject $H_0$.
  • Back to example of model 1 vs. model 2:
    – $H = (1 \;\; -1)$, $c = 0$ and $r = 1$.
    – Compare $U_n$ to $\chi^2(1)$ at some pre-determined α.
    – If $U_n > \tau$, reject $H_0$ ⇒ considering additional age classes is warranted by observations.
    – If $U_n < \tau$, additional age classes do not explain observations better; using the ‘simpler’ model (model 2) is reasonable.
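
A small sketch of the test, with the statistic written as n times the increase in cost under the constraint, divided by the unconstrained cost, so that it is non-negative. The cost values are invented for illustration; τ = 3.84 is the standard χ²(1) threshold at α = 0.05 and is passed in by hand rather than computed.

```python
def comparison_statistic(n, J_full, J_constrained):
    """U_n = n * (J_n(theta_H) - J_n(theta_hat)) / J_n(theta_hat)."""
    return n * (J_constrained - J_full) / J_full

def reject_null(n, J_full, J_constrained, tau):
    """Reject H0 (the constrained model suffices) when U_n exceeds tau."""
    return comparison_statistic(n, J_full, J_constrained) > tau

# Hypothetical costs for a model with more age classes (full) and fewer (constrained).
U = comparison_statistic(n=60, J_full=2000.0, J_constrained=2100.0)
decision = reject_null(60, 2000.0, 2100.0, tau=3.84)
```

Here U = 3.0 falls below the threshold, so the simpler model would be retained for these made-up numbers.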

SLIDE 25

Model Comparison Results

Generated total case reports Y (1) with 14-age class model: (0,2mos], (2,4 mos], (4,6 mos], (6,24 mos], (2,5 yrs], (5,10], (10,15], (15,50], (50,65], (65,70], (70,75], (75,80], (80,85], 85+.

Aggregated age groups       r (for χ²(r))    τ        J_60       U^r_60
(0,2 y], (2,15], (65,∞)     9                29.67    9,475.1    212.8
(2,15], (65,∞)              6                24.10    3,093.7    29.08
(65,∞)                      4                20.00    2,244.1    4.62

  • Differences among older age classes not important ⇒ reasonable to group these into 1 age class
  • Considering smaller age groups between 2 and 15 years of age provides a better fit.

  • Discretizing the youngest age classes also provides a significantly better fit.

SLIDE 26

[Figure: four panels of cases vs. t (months), each comparing data and model: age group 1; age group 2; age groups 3 and 4; age group 5.]

Model with 5 age classes fit to infection data grouped into 5 age classes.

SLIDE 27

[Figure: four panels of cases vs. t (months), each comparing data and model: age groups 1 to 4; age groups 5 to 7; age groups 8 and 9; age groups 10 to 14.]

Model with 5 age classes fit to infection data grouped into 14 age classes.
