Bayesian inference for age-structured population model of infectious - - PowerPoint PPT Presentation



SLIDE 1

Bayesian inference for age-structured population model of infectious disease with application to varicella in Poland

Piotr Gwiazda, Błażej Miasojedow, Magdalena Rosińska 02.XII.2016

SLIDE 2

Varicella

Varicella, or chickenpox, is a viral disease that typically occurs in childhood, with peak incidence at the age of 4–5, when children enter preschool or school.

source: wikipedia

SLIDE 3

Structured Population Model

For varicella it is reasonable to consider a model with only two states, i.e. the susceptible and those who have ever been infected:

∂tq(t, a) + ∂aq(t, a) = −λ(t, a)q(t, a) for (t, a) ∈ R × R+.

In this model q(t, a) represents the proportion of susceptible individuals of age a at time t. If we assume that all individuals are susceptible at birth, this equation may be supplemented with the boundary condition

q(t, 0) = 1 for all t ∈ R.

Note that in this problem no initial condition is needed.
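Because the transport equation moves each birth cohort along the line t − a = const, the susceptible fraction can be written as q(t, a) = exp(−∫₀ᵃ λ(t − a + s, s) ds), with q = 1 at age 0. The sketch below evaluates this with a midpoint rule. The function name, the step count and the constant force of infection in the usage line are illustrative assumptions, not part of the slides.

```python
import math

def susceptible_fraction(lam, t, a, n_steps=2000):
    """Approximate q(t, a) = exp(-integral_0^a lam(t - a + s, s) ds).

    lam(time, age) is the force of infection; the integral follows the
    cohort born at time t - a and is evaluated with the midpoint rule.
    """
    if a <= 0.0:
        return 1.0  # everyone is susceptible at birth
    h = a / n_steps
    total = 0.0
    for k in range(n_steps):
        s = (k + 0.5) * h          # midpoint of the k-th age sub-interval
        total += lam(t - a + s, s)
    return math.exp(-h * total)

# Hypothetical constant force of infection, used only to check the closed form
# exp(-0.2 * 5) that holds in that special case.
q = susceptible_fraction(lambda t, a: 0.2, t=2003.0, a=5.0)
```

With a constant λ the result reduces to exp(−λa), which gives a quick sanity check before plugging in a time- and age-dependent λ.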

SLIDE 4

The goal

Estimate the unknown force of infection λ(t, a) based on the available data.

SLIDE 5

Source of data

The available data were derived from a database of the POLYMOD project. Samples from individuals aged 1–19 years (by the date of birth) at the time of the sample collection (2000–2004) were extracted from an existing bio-bank and tested for anti-VZV with a commercial testing kit. The bio-bank contained samples collected mainly for the purpose of routine check-ups or investigations before surgical procedures. Altogether 1244 samples were included in the study; the number per year ranged from 108 to 500. The number of individuals in single Age–Year cells ranged from 1 to 45 and was generally smaller for the 2000–2001 period.

SLIDE 6

Models based on discretization

Approximate the Age–Year box by discretization. Build a GAM model to estimate the force of infection.

λγ(a, t) = 20(sin(γt) + 1.1)

[Figure: x-axis γ; curves: True posterior, 1 cohort, 4 cohorts, 16 cohorts]

SLIDE 7

Bayesian inverse problem

We want to find the solution u to the equation

y = G(u),

where u and y are elements of some Banach space. With a noisy measurement:

y = G(u) + η

  • A. M. Stuart, Inverse problems: A Bayesian perspective, Acta Numerica 19 (2010).
  • S. L. Cotter, M. Dashti, A. M. Stuart, Approximation of Bayesian inverse problems for PDEs, SIAM Journal on Numerical Analysis 48 (1) (2010).

SLIDE 8

Data model

We first describe the seroprevalence data. This type of data characterizes individuals who have been tested to establish whether they have ever had contact with a disease. The observations are generally of the form (Yij, tij, aij), where Yij is a random variable indicating whether person i in sample j has had contact with the disease, tij is the exact test time, and aij is the exact age at test. Let’s assume that:

P(Yij = 1 | tij, aij) = q(tij, aij)

SLIDE 9

Data model

The data are aggregated, and the exact values of tij and aij are unknown. Let’s define pj as

$$p_j = \int_{\mathbb{R}\times\mathbb{R}_+} \Psi_j(t, a)\, q(t, a)\, dt\, da = \mathbb{E}_{\Psi_j}(q),$$

then $Y_j = \sum_{i=1}^{N_j} Y_{ij}$ is distributed according to the binomial distribution Bin(pj, Nj), where Nj denotes the total number of individuals in sample j.

SLIDE 10

Likelihood function

Next, let us denote the likelihood of the observations by

$$L(\theta|Y) = \prod_j p_{\theta,j}^{Y_j}\,(1 - p_{\theta,j})^{N_j - Y_j}.$$

To complete the description of the Bayesian model we need to set a prior distribution on θ, denoted by f(θ). Then the posterior distribution satisfies

π(θ|Y) ∝ L(θ|Y)f(θ)
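As a minimal sketch of how this likelihood and posterior might be evaluated numerically, the code below works on the log scale to avoid underflow. The function names, and the `cell_probs` and `log_prior` callables, are hypothetical placeholders; the binomial coefficient is omitted because it does not depend on θ.

```python
import math

def log_likelihood(p, y, n):
    """Sum over cells j of Y_j*log(p_j) + (N_j - Y_j)*log(1 - p_j).

    Assumes every p_j lies strictly between 0 and 1.
    """
    total = 0.0
    for pj, yj, nj in zip(p, y, n):
        total += yj * math.log(pj) + (nj - yj) * math.log(1.0 - pj)
    return total

def log_posterior(theta, y, n, cell_probs, log_prior):
    """Unnormalised log posterior: log L(theta | Y) + log f(theta).

    cell_probs(theta) returns the list of p_{theta,j}; log_prior(theta)
    returns log f(theta). Both are supplied by the caller.
    """
    return log_likelihood(cell_probs(theta), y, n) + log_prior(theta)

# Toy usage: one cell, theta is the cell probability itself, flat prior.
value = log_posterior(0.5, [1], [2], lambda th: [th], lambda th: 0.0)
```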

SLIDE 11

Application to real data: varicella in Poland

We model the force of infection λ(t, a) by

$$\lambda(t, a) = \lambda_1(a)\,\bigl(\sin(\gamma_1 t + \gamma_2) + 1 + \gamma_3\bigr) \quad\text{with}\quad \lambda_1(a) = \sum_{i=1}^{k} \alpha_i\, \mathbf{1}(a \in A_i), \tag{1}$$

where λ1(a) is a step function describing possibly different levels of infection in k different age groups Ai of the form Ai = (ai−1, ai]. We choose four groups: children before preschool education, A1 = (1, 3]; children in preschool education, A2 = (3, 7]; primary school students, A3 = (7, 15]; and others, A4 = (15, 20]. The force of infection is fully specified by the following unknown parameters: αi ∈ R+ for i = 1, . . . , 4, γ1 ∈ R+, γ2 ∈ [0, 2π) and γ3 ∈ R+.
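The parametrisation above could be coded roughly as follows. The age breakpoints are taken from the four groups on this slide; returning zero outside (1, 20] is an assumption made here, since the slide does not say what happens at other ages, and the function names are illustrative.

```python
import math

AGE_BREAKS = [1.0, 3.0, 7.0, 15.0, 20.0]  # A_i = (a_{i-1}, a_i]

def lambda1(a, alpha):
    """Step function: alpha[i] when a falls in the (i+1)-th age group."""
    for i in range(len(alpha)):
        if AGE_BREAKS[i] < a <= AGE_BREAKS[i + 1]:
            return alpha[i]
    return 0.0  # assumption: no infection pressure outside (1, 20]

def force_of_infection(t, a, alpha, gamma1, gamma2, gamma3):
    """lambda(t, a) = lambda1(a) * (sin(gamma1*t + gamma2) + 1 + gamma3)."""
    return lambda1(a, alpha) * (math.sin(gamma1 * t + gamma2) + 1.0 + gamma3)

# Hypothetical parameter values, for illustration only.
alpha = [0.1, 0.2, 0.3, 0.4]
lam = force_of_infection(0.0, 5.0, alpha, gamma1=1.0, gamma2=0.0, gamma3=0.5)
```

Since sin(·) + 1 ≥ 0 and γ3 ≥ 0, the seasonal factor is never negative, so λ stays non-negative whenever the αi are.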

SLIDE 12

Application to real data: varicella in Poland

We set the following priors:

αi ∼ Exp(10) for i = 1, . . . , k
γ1 ∼ Exp(0.8)
γ2 ∼ Unif([0, 2π])
γ3 ∼ Exp(1)

The choice of hyper-parameters is consistent with prior knowledge of the observed incidence of varicella in Poland as described above. Finally, we choose Ψj to be a smoothed uniform distribution on the Age×Year box.
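A possible log-density for these priors is sketched below. It assumes that Exp(c) denotes an exponential distribution with rate c (so Exp(10) has mean 0.1); the slides do not state whether the argument is a rate or a mean, so this reading is an assumption. The parameter ordering in `theta` is also a convention chosen here.

```python
import math

def log_exp_pdf(x, rate):
    """Log-density of an exponential distribution with the given rate."""
    return math.log(rate) - rate * x if x >= 0.0 else -math.inf

def log_prior(theta):
    """theta = (alpha_1, ..., alpha_4, gamma1, gamma2, gamma3)."""
    *alpha, g1, g2, g3 = theta
    if not 0.0 <= g2 <= 2.0 * math.pi:
        return -math.inf                          # outside the uniform support
    lp = sum(log_exp_pdf(a, 10.0) for a in alpha)  # alpha_i ~ Exp(10), rate assumed
    lp += log_exp_pdf(g1, 0.8) + log_exp_pdf(g3, 1.0)
    lp -= math.log(2.0 * math.pi)                 # gamma2 ~ Unif([0, 2*pi])
    return lp
```

Returning −∞ outside the support lets a Metropolis sampler reject such proposals without evaluating the likelihood.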

SLIDE 13

MCMC sampler

The pseudo-marginal MCMC approach assumes the existence of an unbiased, positive estimator of the likelihood function, $\hat L(\theta|Y)$, which is used to introduce an auxiliary target of the form

$$\pi(\theta, u) \propto \hat L(\theta|Y)\, f(\theta)\, p(u),$$

where u is a random variable with distribution p which satisfies

$$\mathbb{E}[\hat L(\theta|Y)] = \int \hat L(\theta|Y)\, p(u)\, du = L(\theta|Y).$$

Clearly the marginal distribution of θ is exactly π(θ|Y).
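The marginalisation behind the last claim can be written out in one line. The subscript u on the estimator is added here only to make its dependence on the auxiliary variable explicit; it is not notation from the slides.

```latex
\int \pi(\theta, u)\, du
  \;\propto\; f(\theta) \int \hat L_u(\theta|Y)\, p(u)\, du
  \;=\; f(\theta)\, L(\theta|Y)
  \;\propto\; \pi(\theta|Y)
```

The middle equality is exactly the unbiasedness condition, so a chain that targets π(θ, u) produces θ-samples from the true posterior even though the likelihood is only estimated.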

SLIDE 14

Pseudo-marginal random walk Metropolis

Initialize θ0 and draw the corresponding $\hat L(\theta_0|Y)$, where $\hat L(\theta|Y)$ is an unbiased, positive estimator of L(θ|Y).

for n = 1 to N do
    Sample proposal ϑ ∼ N(θn−1, σ²I).
    Draw an estimator $\hat L(\vartheta|Y)$.
    With probability
    $$\min\left\{ \frac{\hat L(\vartheta|Y)\, f(\vartheta)}{\hat L(\theta_{n-1}|Y)\, f(\theta_{n-1})},\ 1 \right\}$$
    set θn = ϑ, otherwise θn = θn−1.
end for
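The loop above might be implemented along the following lines, working with logarithms for numerical stability. The key detail is that the likelihood estimate attached to the current state is stored and reused, not redrawn, at each iteration. All names are illustrative, and the toy target (a standard normal log-likelihood perturbed by mean-one lognormal noise) is an assumption chosen only to exercise the sampler.

```python
import math
import random

def pseudo_marginal_rwm(log_lik_hat, log_prior, theta0, n_iter, sigma, rng):
    """Random walk Metropolis driven by a noisy likelihood estimate.

    log_lik_hat(theta, rng) returns the log of an unbiased, positive
    likelihood estimate. theta0 is assumed to have finite log prior and
    a finite estimate.
    """
    theta = list(theta0)
    log_target = log_lik_hat(theta, rng) + log_prior(theta)
    chain = []
    for _ in range(n_iter):
        proposal = [x + sigma * rng.gauss(0.0, 1.0) for x in theta]
        lp = log_prior(proposal)
        if lp > -math.inf:  # skip the estimator when the prior rules the proposal out
            log_target_prop = log_lik_hat(proposal, rng) + lp
            # 1 - random() lies in (0, 1], so the logarithm is always defined
            if math.log(1.0 - rng.random()) < log_target_prop - log_target:
                theta, log_target = proposal, log_target_prop  # keep the accepted estimate
        chain.append(list(theta))
    return chain

def noisy_log_lik(theta, rng, s=0.5):
    # Toy example: exact value -theta^2/2 plus log of lognormal noise with mean 1.
    return -0.5 * theta[0] ** 2 + s * rng.gauss(0.0, 1.0) - 0.5 * s * s

rng = random.Random(0)
chain = pseudo_marginal_rwm(noisy_log_lik, lambda th: 0.0, [0.0], 5000, 1.0, rng)
```

Redrawing the estimate for the current state at every step would break the exactness argument, which is why `log_target` is only updated on acceptance.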

SLIDE 15

Unbiased estimator of likelihood

Consider a sequence of independent random variables (Tj,m, Aj,m) ∼ Ψj for j = 1, . . . , J and m = 1, . . . , M, where J is the number of subsamples in the model and M ≥ 1 is an arbitrary integer. We define an unbiased estimator of pθ,j by

$$\hat p_{\theta,j,i} = \frac{1}{M} \sum_{m=1}^{M} q_\theta(T_{j,m}, A_{j,m}), \quad \text{for } i = 1, \dots, N_j.$$

Next we define $\hat L(\theta|Y)$ by

$$\hat L(\theta|Y) = \prod_j \prod_{i=1}^{N_j} \hat p_{\theta,j,i}^{\,\mathbf{1}(i \le Y_j)}\, (1 - \hat p_{\theta,j,i})^{\mathbf{1}(i > Y_j)}.$$
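A rough sketch of this estimator follows. It makes two assumptions that go beyond the slide: Ψj is taken to be a plain (unsmoothed) uniform distribution on a rectangular Age×Year box, and a fresh set of M points is drawn for every individual i, which is how the index i on the estimator is read here, since independence across i is what makes the product unbiased. Function and argument names are illustrative.

```python
import math
import random

def log_lik_hat(q, boxes, y, n, m, rng):
    """Log of prod_j prod_i p_hat^{1(i <= Y_j)} * (1 - p_hat)^{1(i > Y_j)}.

    q(t, a) is the susceptible fraction for the current theta, assumed to
    lie strictly between 0 and 1. boxes[j] = (t_lo, t_hi, a_lo, a_hi).
    """
    total = 0.0
    for (t_lo, t_hi, a_lo, a_hi), yj, nj in zip(boxes, y, n):
        for i in range(1, nj + 1):
            # Independent Monte Carlo average for individual i (assumption).
            p_hat = sum(
                q(rng.uniform(t_lo, t_hi), rng.uniform(a_lo, a_hi))
                for _ in range(m)
            ) / m
            total += math.log(p_hat if i <= yj else 1.0 - p_hat)
    return total

# Toy check with a constant q, for which the estimator is exact.
rng = random.Random(1)
value = log_lik_hat(lambda t, a: 0.3, [(2000.0, 2001.0, 1.0, 2.0)], [2], [5], 10, rng)
```

The cost is Nj × M evaluations of q per cell, so M trades estimator variance against computing time.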

SLIDE 16

Results

We approximate the posterior distribution based on data from the years 2000–2003 and predict prevalence for 2004.

[Figure: x-axis: Age (years); y-axis: Prevalence per 100]

SLIDE 17

Parameter estimation

[Figure: panels α1, α2, α3, α4; x-axis: Value of parameter]

SLIDE 18

Parameter estimation

[Figure: panels γ1, γ2, γ3; x-axis: Value of parameter]

SLIDE 19

Convergence of MCMC algorithm

[Figure: panels α1, α2, α3, α4, γ1, γ2, γ3; x-axis: lag; y-axis: acf]

SLIDE 20

Thank You!