Regression modelling of misclassified correlated interval-censored - - PowerPoint PPT Presentation


Arnošt Komárek, Dept. of Probability and Mathematical Statistics. Regression modelling of misclassified correlated interval-censored data. Workshop on Flexible Models for Longitudinal and Survival Data with Applications in Biostatistics.


slide-1
SLIDE 1

Arnošt Komárek

Dept. of Probability and Mathematical Statistics

Regression modelling of misclassified correlated interval-censored data

Workshop on Flexible Models for Longitudinal and Survival Data with Applications in Biostatistics Warwick, July 27 – 29, 2015

slide-2
SLIDE 2

Joint work with María José García-Zattera and Alejandro Jara

Pontificia Universidad Católica de Chile

Santiago de Chile

slide-3
SLIDE 3

Outline

1. Misclassified interval-censored data.
2. Model for misclassified interval-censored data.
   a. Misclassification model.
   b. Event-time model.
3. Estimation and inference.
4. Simulation study.
5. Models comparison.
6. The Signal Tandmobiel study.
7. Summary and conclusions.

3/87 Arnošt Komárek

slide-4
SLIDE 4

Part I Misclassified interval-censored data

slide-5
SLIDE 5

Motivating dataset

The Signal Tandmobiel study

Longitudinal dental study, Flanders (Belgium), 1996 – 2001.

2 315 boys and 2 153 girls followed from 7 until 12 years of age (primary school time).

Annual dental examinations. Sixteen trained dental examiners.

In general, each child was examined by a different examiner each year.

→ Clinical data.

Data on oral hygiene and dietary habits.

slide-6
SLIDE 6

Main aim

Model the relationship between time to caries experience (CE) and potential risk factors.

Gender (boys vs. girls). Presence of sealants. Frequency of brushing (daily / not daily). Geographical location.

slide-9
SLIDE 9

Caries experience (CE)

[Figure: stages of caries experience, from a reversible stage to caries (irreversible).]

slide-10
SLIDE 10

Statistical modelling challenges

CE is a progressive disease → we deal with a monotone 0/1 process.

CE status is checked only at discrete occasions (visits/dental examinations) → interval censoring.

Teeth in one mouth share a common environment, genetic dispositions, ... → dependence among the processes on different teeth in one mouth.

slide-11
SLIDE 11

CE process & interval censoring

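[Figure: path of the 0/1 process $Y_{(i,j)}(t)$ against time $t$, jumping from 0 to 1 at $T_{(i,j)}$, with the recorded values 0, 0, 1, 1 at the visit times $v_{(i,1)}, \ldots, v_{(i,4)}$.]

$T_{(i,j)} \in (v_{(i,2)}, v_{(i,3)}]$, $\quad Y_{(i,j)} = (0, 0, 1, 1)^\top$.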

slide-12
SLIDE 12

Summary of notation

$T_{(i,j)}$: event (CE) time of tooth $j$ of subject $i$, $i = 1, \ldots, N$, $j = 1, \ldots, J$.

$Y_{(i,j)}(t)$: 0/1 CE status of tooth $(i, j)$ at time $t$.

$x_{(i,j)}$: potential risk factors, covariates to explain $T_{(i,j)}$ (equivalently $Y_{(i,j)}(t)$).

$0 = v_{(i,0)} < v_{(i,1)} < v_{(i,2)} < \cdots < v_{(i,K_i)} < v_{(i,K_i+1)} = \infty$: visit times (of dental examinations) for subject $i$.

$Y_{(i,j)} = (Y_{(i,j,1)}, \ldots, Y_{(i,j,K_i)})^\top$: recorded 0/1 CE status of tooth $(i, j)$ at the performed visits.

slide-13
SLIDE 13

Interval-censored data

Interest in: regression $T_{(i,j)} \sim x_{(i,j)}$ $\equiv$ $Y_{(i,j)}(t) \sim x_{(i,j)}$.

Observed data: monotone 0/1 sequence $Y_{(i,j)} = (Y_{(i,j,1)}, \ldots, Y_{(i,j,K_i)})^\top$ together with the visit times $v_{(i,1)}, \ldots, v_{(i,K_i)}$

$\equiv$ intervals $(L_{(i,j)}, U_{(i,j)}]$ such that $T_{(i,j)} \in (L_{(i,j)}, U_{(i,j)}]$ and $L_{(i,j)}, U_{(i,j)} \in \{0, v_{(i,1)}, \ldots, v_{(i,K_i)}, \infty\}$.

$L_{(i,j)}$: the last visit time when $Y_{(i,j,*)} = 0$; $U_{(i,j)}$: the first visit time when $Y_{(i,j,*)} = 1$.
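The mapping from a monotone 0/1 sequence to the interval (L, U] can be sketched in a few lines of Python. This is only an illustration: the function name `censoring_interval` and the example visit times are hypothetical and not part of the talk.

```python
import math


def censoring_interval(visits, y):
    """Return (L, U) such that T lies in (L, U], for a monotone 0/1 sequence.

    visits: increasing visit times v_1 < ... < v_K
    y:      recorded 0/1 statuses at those visits (must be non-decreasing)
    """
    if len(visits) != len(y):
        raise ValueError("visits and y must have the same length")
    if any(a > b for a, b in zip(y, y[1:])):
        raise ValueError("sequence is not monotone; no single interval is implied")
    # L: last visit with status 0 (0 if there is none)
    lower = max((v for v, s in zip(visits, y) if s == 0), default=0.0)
    # U: first visit with status 1 (infinity if there is none)
    upper = min((v for v, s in zip(visits, y) if s == 1), default=math.inf)
    return lower, upper


# Hypothetical visit times (ages in years) and the sequence (0, 0, 1, 1)
interval = censoring_interval([7.1, 8.0, 9.2, 10.1], [0, 0, 1, 1])
```

A sequence of zeros only gives a right-censored observation (U = ∞), and a sequence of ones only gives a left-censored one (L = 0).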

slide-15
SLIDE 15

Life is not so easy. . .

Diagnosis of CE is not easy and is somewhat subjective

→ misclassification in the recorded values Y(i,j,1), . . . , Y(i,j,Ki).

→ sensitivity/specificity of the diagnostic test for caries is not equal to one.

Misclassified correlated interval-censored data.

slide-16
SLIDE 16

CE process & misclassified interval-censored data

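[Figure: path of the true 0/1 process $Y_{(i,j)}(t)$ with its jump at $T_{(i,j)}$, and the recorded values 0, 0, 0, 1 at the visit times $v_{(i,1)}, \ldots, v_{(i,4)}$.]

$T_{(i,j)} \in (v_{(i,3)}, v_{(i,4)}]$ really?, $\quad Y_{(i,j)} = (0, 0, 0, 1)^\top$.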

slide-17
SLIDE 17

CE process & misclassified interval-censored data

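[Figure: path of the true 0/1 process $Y_{(i,j)}(t)$ with its jump at $T_{(i,j)}$, and the recorded values 0, 1, 0, 1 at the visit times $v_{(i,1)}, \ldots, v_{(i,4)}$.]

$T_{(i,j)} \in\ ???$, $\quad Y_{(i,j)} = (0, 1, 0, 1)^\top$.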

slide-18
SLIDE 18

Misclassified interval-censored data

Interest in: regression $T_{(i,j)} \sim x_{(i,j)}$ $\equiv$ $Y_{(i,j)}(t) \sim x_{(i,j)}$.

Observed data: $T_{(i,j)}$ (equivalently $Y_{(i,j)}(t)$) is observed only indirectly through $Y_{(i,j)} = (Y_{(i,j,1)}, \ldots, Y_{(i,j,K_i)})^\top$

→ a not necessarily monotone sequence of 0/1, possibly misclassified, CE status indicators from visits performed at times $v_{(i,1)}, \ldots, v_{(i,K_i)}$.

slide-19
SLIDE 19

Study design that leads to misclassified interval-censored data

Longitudinal follow-up. Event status checked at pre-specified time points.

Assumption here: visit times are independent of the event time.

Occurrence of the event is determined by a diagnostic test (with possibly imperfect sensitivity and/or specificity).

Frequent for many non-death events. Nevertheless, data are mostly analyzed as if both sensitivity and specificity were equal to one, and hence as if there were no event status misclassification.

Follow-up is not scheduled to stop after the first positive result.

Frequent in longitudinal studies where the event is not the primary study outcome.

slide-22
SLIDE 22

Principal questions

Using just the observed data Y(i,j):

1. Can we draw valid statistical inference on the time to event T(i,j) in the presence of event misclassification even if no external information is available on the magnitude of the misclassification? (No external information on the sensitivity/specificity values.)

2. Can we evaluate the magnitude of misclassification? Can we estimate the sensitivity/specificity of the event classification?

3. Do we get valid inference on the time to event T(i,j) if misclassification is ignored and it is assumed that T(i,j) lies in the first “possible” observed interval?

slide-23
SLIDE 23

Part II Modelling approach

slide-25
SLIDE 25

Hierarchical model

Hierarchically specified model (likelihood) for the observed data $Y_i = (Y_{(i,1)}, \ldots, Y_{(i,J)})$.

Start with a joint likelihood of the unobservable $T_i$ and the observed $Y_i$:

$$p(Y_i, T_i) = p(Y_i \mid T_i)\, p(T_i).$$

$p(Y_i \mid T_i)$: (mis)classification model

→ visit times $v_{(i,1)}, \ldots, v_{(i,K_i)}$ act as covariates here.

$p(T_i)$: survival model for (correlated) event times

→ risk factors $x_{(i,1)}, \ldots, x_{(i,J)}$ act as covariates here.

Likelihood of the observed data on subject $i$:

$$p(Y_i) = \int_{\mathbb{R}_+^J} p(Y_i, T_i)\, dT_i.$$

slide-26
SLIDE 26

Overall likelihood

Independence among subjects (children):

$$p(Y_1, \ldots, Y_N) = \prod_{i=1}^{N} p(Y_i),$$

$$p(Y_i) = \int_{\mathbb{R}_+^J} p(Y_i, T_i)\, dT_i, \qquad p(Y_i, T_i) = \underbrace{p(Y_i \mid T_i)}_{\text{misclassification model}}\; \underbrace{p(T_i)}_{\text{event-time model}}.$$

slide-27
SLIDE 27

Part III Misclassification model

slide-28
SLIDE 28

(Mis)classification model $p(Y_i \mid T_i)$

For each $i$ (each child), some conditional independence is assumed.

The event classification $Y_{(i,j,k)}$ for a given unit (tooth $j$) at a given time ($k$) is (conditionally) independent of

(a) the event classification $Y_{(i,j^*,l)}$ for other units (other teeth, $j^* \neq j$) at arbitrary times (arbitrary $l$);
(b) the event classification $Y_{(i,j,k^*)}$ for the same unit (the same tooth) at other times ($k^* \neq k$);
(c) the event times $T_{(i,j^*)}$ of other units (other teeth, $j^* \neq j$).

→

$$p(Y_i \mid T_i) = \prod_{j=1}^{J} \prod_{k=1}^{K_i} p(Y_{(i,j,k)} \mid T_{(i,j)}).$$

In the rest: the form of $p(Y_{(i,j,k)} \mid T_{(i,j)})$ for a given $j$ (tooth) and $k$ (visit time).

slide-29
SLIDE 29

Simple (mis)classification model $p(Y_i \mid T_i)$

Only one examiner

Model parameters:

$\alpha$: examiner’s sensitivity, $\alpha = P(Y_{(i,j,k)} = 1 \mid T_{(i,j)} \le v_{(i,k)})$.

$\eta$: examiner’s specificity, $\eta = P(Y_{(i,j,k)} = 0 \mid T_{(i,j)} > v_{(i,k)})$.

Likelihood contribution:

$$p(Y_{(i,j,k)} \mid T_{(i,j)}) = p(Y_{(i,j,k)} \mid T_{(i,j)};\ \alpha, \eta, v_{(i,k)}) = \begin{cases} \alpha^{Y_{(i,j,k)}}\, (1 - \alpha)^{1 - Y_{(i,j,k)}}, & \text{if } T_{(i,j)} \le v_{(i,k)} \text{ (correct } Y_{(i,j,k)} \text{ equals 1)}, \\ (1 - \eta)^{Y_{(i,j,k)}}\, \eta^{1 - Y_{(i,j,k)}}, & \text{if } T_{(i,j)} > v_{(i,k)} \text{ (correct } Y_{(i,j,k)} \text{ equals 0)}. \end{cases}$$
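As a small illustration of the single-examiner likelihood contribution, the two cases can be written as a Python function. The name `classification_prob` and the numerical values in the example are hypothetical.

```python
def classification_prob(y, event_time, visit_time, alpha, eta):
    """P(Y = y | T) for one visit, with sensitivity alpha and specificity eta."""
    if event_time <= visit_time:
        # The event has already happened: the correct status is 1
        return alpha if y == 1 else 1.0 - alpha
    # The event has not happened yet: the correct status is 0
    return 1.0 - eta if y == 1 else eta


# Hypothetical values: event at age 8.5, visit at age 9.0
p_true_positive = classification_prob(1, 8.5, 9.0, alpha=0.8, eta=0.9)
p_false_negative = classification_prob(0, 8.5, 9.0, alpha=0.8, eta=0.9)
```

Under the conditional independence assumptions stated earlier, the contribution of a whole tooth is the product of such terms over its visits.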

slide-31
SLIDE 31

Slightly more complicated (mis)classification model $p(Y_i \mid T_i)$

More (Q > 1) examiners involved in a study

→ Signal Tandmobiel study: Q = 16.

Different examiners have different abilities to detect the event (caries)

→ sensitivity/specificity should be allowed to depend on the examiner.

Caries is not necessarily equally easy to detect on all teeth (j = 1, . . . , J) in the mouth

→ sensitivity/specificity should be allowed to depend on the tooth (j).

slide-32
SLIDE 32

Slightly more complicated (mis)classification model $p(Y_i \mid T_i)$

$Q$ examiners, dependence of sensitivity/specificity on tooth ($j$)

One more set of covariates in the model: $\xi_{(i,k)} \in \{1, \ldots, Q\}$, the index (id) of the examiner who scored (all) teeth of the $i$th child during his/her $k$th visit at time $v_{(i,k)}$.

Slightly more unknown parameters of the model ($q = 1, \ldots, Q$): $\alpha_q = (\alpha_{(q,1)}, \ldots, \alpha_{(q,J)})^\top$, $\eta_q = (\eta_{(q,1)}, \ldots, \eta_{(q,J)})^\top$.

$\alpha_{(q,j)}$, $\eta_{(q,j)}$: sensitivity and specificity if the event classification of tooth $j$ is performed by examiner $q$, i.e.,

$$\alpha_{(q,j)} = P(Y_{(i,j,k)} = 1 \mid T_{(i,j)} \le v_{(i,k)};\ \xi_{(i,k)} = q),$$

$$\eta_{(q,j)} = P(Y_{(i,j,k)} = 0 \mid T_{(i,j)} > v_{(i,k)};\ \xi_{(i,k)} = q).$$

slide-33
SLIDE 33

Slightly more complicated (mis)classification model $p(Y_i \mid T_i)$

$Q$ examiners, dependence of sensitivity/specificity on tooth ($j$)

Model parameters:

$\alpha = (\alpha_1^\top, \ldots, \alpha_Q^\top)^\top$: sensitivities of all examiners for all teeth.

$\eta = (\eta_1^\top, \ldots, \eta_Q^\top)^\top$: specificities of all examiners for all teeth.

Likelihood contribution:

$$p(Y_{(i,j,k)} \mid T_{(i,j)}) = p(Y_{(i,j,k)} \mid T_{(i,j)};\ \alpha, \eta, v_{(i,k)}, \xi_{(i,k)}) = \begin{cases} \alpha_{(\xi_{(i,k)},j)}^{Y_{(i,j,k)}}\, (1 - \alpha_{(\xi_{(i,k)},j)})^{1 - Y_{(i,j,k)}}, & \text{if } T_{(i,j)} \le v_{(i,k)} \text{ (correct } Y_{(i,j,k)} \text{ equals 1)}, \\ (1 - \eta_{(\xi_{(i,k)},j)})^{Y_{(i,j,k)}}\, \eta_{(\xi_{(i,k)},j)}^{1 - Y_{(i,j,k)}}, & \text{if } T_{(i,j)} > v_{(i,k)} \text{ (correct } Y_{(i,j,k)} \text{ equals 0)}. \end{cases}$$

slide-34
SLIDE 34

Even more complicated (mis)classification model $p(Y_i \mid T_i)$

Sensitivities/specificities can further be modelled in a structural way as functions of characteristics (covariates) of the examiners/teeth:

age of examiner, gender of examiner, tooth position in the mouth, ...

“Only” one more hierarchical level of the model.

slide-35
SLIDE 35

Hierarchical model

Reminder

Likelihood contribution of the observed data of the $i$th child:

$$p(Y_i) = \int_{\mathbb{R}_+^J} p(Y_i, T_i)\, dT_i = \int_{\mathbb{R}_+^J} p(Y_i \mid T_i)\, p(T_i)\, dT_i.$$

slide-36
SLIDE 36

Part IV Event-time model

slide-37
SLIDE 37

Event time model $p(T_i)$

One more reminder

$T_i = (T_{(i,1)}, \ldots, T_{(i,J)})$ ≡ possibly correlated times to CE for the $J$ teeth of the $i$th child.

$x_i = (x_{(i,1)}, \ldots, x_{(i,J)})$ ≡ covariates that may explain $T_i$.

The form of $p(T_i)$ can in principle be derived from any regression model for correlated event-time data (if we believe that the given model is suitable for the data at hand):

frailty Cox model, random intercept accelerated failure time (AFT) model, ...
slide-38
SLIDE 38

Event time model $p(T_i)$

Random intercept AFT model

$$\log T_{(i,j)} = x_{(i,j)}^\top \beta + b_i + \varepsilon_{(i,j)}, \qquad i = 1, \ldots, N,\ j = 1, \ldots, J.$$

$\beta$: regression coefficients.

$\varepsilon_{(1,1)}, \ldots, \varepsilon_{(N,J)}$: i.i.d. with zero-mean density $g_\varepsilon(\cdot)$.

$b_1, \ldots, b_N$: i.i.d. with density $g_b(\cdot)$.

$b_i$ (common for all $j$) induces dependence between $T_{(i,1)}, \ldots, T_{(i,J)}$.

slide-39
SLIDE 39

Event time model $p(T_i)$

Distributional assumptions

$g_\varepsilon(\cdot) \sim N(0, \sigma_\varepsilon^2)$.

$g_b(\cdot) \sim$ penalized Gaussian mixture (PGM):

[Figure: four panels of penalized Gaussian mixture densities over the knots $\kappa_{-9}, \ldots, \kappa_9$.]

slide-40
SLIDE 40

Event time model $p(T_i)$

Distributional assumptions

$g_\varepsilon(\cdot) \sim N(0, \sigma_\varepsilon^2)$,

$$g_b(\cdot) \sim \mu + \tau \sum_{l=-M}^{M} w_l\, N(\kappa_l, \zeta^2) \qquad \text{(penalized Gaussian mixture, PGM)}.$$

Model parameters: $\sigma_\varepsilon^2$, $w = (w_{-M}, \ldots, w_M)^\top$, $\mu$, $\tau^2$.

Penalized Gaussian mixture:

$M \approx 15$, $\zeta \approx 0.2$, $\kappa_{-M}, \ldots, \kappa_M$: equidistant knots on the interval approx. $[-4.5, 4.5]$;

flexible model for a distribution with approximately zero mean and unit variance.

Regularization using penalized differences of the (transformed) weights $w_{-M}, \ldots, w_M$.
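To make the PGM concrete, the density of $b = \mu + \tau Z$, with $Z$ following the Gaussian mixture on equidistant knots, can be evaluated as in the following sketch. The function name `pgm_density`, its argument names, and the example weights are hypothetical; the defaults only mirror the approximate values quoted above (ζ ≈ 0.2, knots on [−4.5, 4.5]).

```python
import math


def pgm_density(b, weights, mu=0.0, tau=1.0, zeta=0.2, knot_range=4.5):
    """Density at b of mu + tau * Z, where Z ~ sum_l w_l N(kappa_l, zeta^2).

    weights: the 2M + 1 mixture weights (non-negative, summing to one);
    the knots are placed equidistantly on [-knot_range, knot_range].
    """
    n = len(weights)
    step = 2.0 * knot_range / (n - 1)
    knots = [-knot_range + l * step for l in range(n)]
    z = (b - mu) / tau  # back-transform to the standardized scale
    norm_const = zeta * math.sqrt(2.0 * math.pi)
    mixture = sum(
        w * math.exp(-0.5 * ((z - k) / zeta) ** 2) / norm_const
        for w, k in zip(weights, knots)
    )
    return mixture / tau  # Jacobian of the location-scale transformation


# Example with M = 15 (31 knots) and weights proportional to a standard
# normal density at the knots, which gives a roughly normal-shaped PGM
M = 15
knots = [-4.5 + 0.3 * l for l in range(2 * M + 1)]
raw = [math.exp(-0.5 * k * k) for k in knots]
weights = [r / sum(raw) for r in raw]
value_at_zero = pgm_density(0.0, weights)
```

Other choices of the weights give skewed or multimodal shapes on the same fixed set of knots, which is what makes the mixture flexible.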
slide-41
SLIDE 41

Event time model $p(T_i)$

$$\log T_{(i,j)} = x_{(i,j)}^\top \beta + b_i + \varepsilon_{(i,j)}, \qquad i = 1, \ldots, N,\ j = 1, \ldots, J.$$

Distribution of the event time $T_{(i,j)}$

Up to the log-transformation: convolution of a fully parametric normal and a semi-parametric PGM. Hence the distribution of the event time is also specified semi-parametrically.

More details:

Komárek, Lesaffre & Hilton (2005, J. of Computat. and Graphical Stat.),

Komárek, Lesaffre & Legrand (2007, Statistics in Medicine),

Komárek & Lesaffre (2008, J. of the American Statistical Association).
slide-42
SLIDE 42

Part V Estimation and inference

slide-43
SLIDE 43

Likelihood

$$p(Y_1, \ldots, Y_N) = \prod_{i=1}^{N} p(Y_i) = \prod_{i=1}^{N} \int_{\mathbb{R}_+^J} p(Y_i, T_i)\, dT_i = \prod_{i=1}^{N} \int_{\mathbb{R}_+^J} p(Y_i \mid T_i)\, p(T_i)\, dT_i.$$

$p(Y_i \mid T_i)$: (mis)classification model

unknown parameters: $\alpha = (\alpha_{(1,1)}, \ldots, \alpha_{(Q,J)})^\top$, $\eta = (\eta_{(1,1)}, \ldots, \eta_{(Q,J)})^\top$: sensitivities and specificities for examiners and teeth.

$p(T_i)$: event-time model

random intercept AFT model with a PGM distribution of the random intercept; unknown parameters: regression coefficients $\beta$, intercept $\mu$, variances $\tau^2$ and $\sigma_\varepsilon^2$, mixture weights $w$.

slide-44
SLIDE 44

Likelihood

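For each $i = 1, \ldots, N$:

$$p(Y_i) = \int_{\mathbb{R}_+^J} p(Y_i \mid T_i)\, p(T_i)\, dT_i = \int_{\mathbb{R}_+^J} \Bigl\{ \prod_{j=1}^{J} \prod_{k=1}^{K_i} p(Y_{(i,j,k)} \mid T_{(i,j)}) \Bigr\}\, p(T_i)\, dT_i.$$

Misclassification part

$$p(Y_{(i,j,k)} \mid T_{(i,j)}) = \Bigl\{ \alpha_{(\xi_{(i,k)},j)}^{Y_{(i,j,k)}} (1 - \alpha_{(\xi_{(i,k)},j)})^{1 - Y_{(i,j,k)}} \Bigr\}^{\mathbb{I}\{T_{(i,j)} \in (0,\, v_{(i,k)}]\}} \times \Bigl\{ (1 - \eta_{(\xi_{(i,k)},j)})^{Y_{(i,j,k)}}\, \eta_{(\xi_{(i,k)},j)}^{1 - Y_{(i,j,k)}} \Bigr\}^{\mathbb{I}\{T_{(i,j)} \in (v_{(i,k)},\, +\infty)\}}$$

$$= \prod_{l=1}^{k} \Bigl\{ \alpha_{(\xi_{(i,k)},j)}^{Y_{(i,j,k)}} (1 - \alpha_{(\xi_{(i,k)},j)})^{1 - Y_{(i,j,k)}} \Bigr\}^{\mathbb{I}\{T_{(i,j)} \in (v_{(i,l-1)},\, v_{(i,l)}]\}} \times \prod_{l=k+1}^{K_i+1} \Bigl\{ (1 - \eta_{(\xi_{(i,k)},j)})^{Y_{(i,j,k)}}\, \eta_{(\xi_{(i,k)},j)}^{1 - Y_{(i,j,k)}} \Bigr\}^{\mathbb{I}\{T_{(i,j)} \in (v_{(i,l-1)},\, v_{(i,l)}]\}}.$$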
slide-45
SLIDE 45

Likelihood

For each $i = 1, \ldots, N$:

$$p(Y_i) = \int_{\mathbb{R}_+^J} p(Y_i \mid T_i)\, p(T_i)\, dT_i.$$

Event-time part

$$p(T_i) = \int_{\mathbb{R}} \Bigl\{ \prod_{j=1}^{J} p(T_{(i,j)} \mid b_i) \Bigr\}\, p(b_i)\, db_i,$$

$p(T_{(i,j)} \mid b_i)$: log-normal, following from the AFT model with a normal error.

Unknown parameters: $\beta$, $\sigma_\varepsilon^2$.

$p(b_i)$: normal mixture, following from the PGM model.

Unknown parameters: $w$, $\mu$, $\tau^2$.

slide-46
SLIDE 46

Estimation and inference

Maximum likelihood is clearly not tractable.

Bayesian specification of the model (with weakly informative priors) and MCMC-based inference: possible.

All integrals in the likelihood disappear from the calculations if Bayesian data augmentation is used

→ unobserved event times T(i,j);

→ random effects (frailties) bi.

R package bayesSurv (≥ 2.3).

slide-47
SLIDE 47

Prior distributions

AFT regression parameters: $\beta \sim$ Normal (with large variances);

(Inverted) variance of the AFT error terms: $\sigma_\varepsilon^{-2} \sim$ Gamma (with small rate and shape parameters);

Location of the random intercepts: $\mu \sim$ Normal (with large variance);

(Inverted) squared scale of the random intercepts: $\tau^{-2} \sim$ Gamma (with small rate and shape parameters).

slide-48
SLIDE 48

Random intercept distribution

Remember: $b_i \sim$ penalized Gaussian mixture (PGM)

[Figure: four panels of penalized Gaussian mixture densities over the knots $\kappa_{-9}, \ldots, \kappa_9$.]

slide-49
SLIDE 49

Prior for PGM weights

Mixture weights (from the PGM model for the distribution of the random intercept)

Remember: $b_i \sim \mu + \tau \sum_{l=-M}^{M} w_l\, N(\kappa_l, \zeta^2)$, where $M$ is relatively large.

The weights $w$ should sum to one. We work primarily with the transformed weights $a = (a_{-M}, \ldots, a_M)^\top$:

$$w_l = \frac{\exp(a_l)}{\sum_{m=-M}^{M} \exp(a_m)}, \qquad l = -M, \ldots, M, \qquad a_0 = 0.$$

Regularization prior for the (transformed) weights.

slide-50
SLIDE 50

Prior for PGM weights

Mixture weights (from the PGM model for the distribution of the random intercept)

Regularization prior for the (transformed) weights:

$$p(a \mid \lambda) \propto \exp\Bigl\{ -\frac{\lambda}{2} \sum_{j=-M+o}^{M} (\Delta^o a_j)^2 \Bigr\} = \exp\Bigl\{ -\frac{\lambda}{2}\, a^\top P_o^\top P_o\, a \Bigr\}.$$

$\Delta^o$: difference operator of order $o$

→ $P_o$: corresponding difference operator matrix.

$\lambda$: smoothing hyperparameter

→ prior: $\lambda \sim$ Gamma.
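The equality between the sum of squared differences and the quadratic form can be checked numerically. The sketch below builds a difference operator matrix by repeatedly differencing the rows of an identity matrix and evaluates the log of the prior kernel. The function names and the default order 3 are assumptions made for illustration; the slides do not state which order was used.

```python
def difference_matrix(n, order):
    """Matrix P such that P a is the vector of order-th differences of a (length n)."""
    rows = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(order):
        # Differencing consecutive rows raises the order of the operator by one
        rows = [
            [hi - lo for hi, lo in zip(rows[r + 1], rows[r])]
            for r in range(len(rows) - 1)
        ]
    return rows


def log_prior_kernel(a, lam, order=3):
    """-(lambda / 2) * a' P' P a, i.e. the log of the unnormalized prior of a."""
    P = difference_matrix(len(a), order)
    diffs = [sum(p * x for p, x in zip(row, a)) for row in P]
    return -0.5 * lam * sum(d * d for d in diffs)


# The squares 0, 1, 4, 9, 16 have constant second differences (all equal to 2)
# and vanishing third differences
a = [0.0, 1.0, 4.0, 9.0, 16.0]
second_order = log_prior_kernel(a, lam=2.0, order=2)
third_order = log_prior_kernel(a, lam=2.0, order=3)
```

A penalty of order o leaves polynomial trends of degree below o in the transformed weights unpenalized, so larger values of λ pull the weights toward such smooth shapes.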

slide-51
SLIDE 51

Prior for misclassification parameters

Sensitivities and specificities of the event classification

For each $q$ (examiner) and $j$ (unit, i.e. tooth):

$0 < \alpha_{(q,j)} < 1$: sensitivity of examiner $q$ when scoring the $j$th unit (tooth);

$0 < \eta_{(q,j)} < 1$: specificity of examiner $q$ when scoring the $j$th unit (tooth).

Identification constraint: $\alpha_{(q,j)} + \eta_{(q,j)} > 1$.

Prior: $(\alpha_{(q,j)}, \eta_{(q,j)}) \sim$ Beta × Beta truncated by the identification constraint.
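A draw from a Beta × Beta distribution truncated to the region α + η > 1 can be obtained by simple rejection, as sketched below. The function name and the Beta(1, 1) defaults are hypothetical, and the sketch samples from the prior only, not from the full conditionals used inside the MCMC.

```python
import random


def sample_sens_spec(rng, a_alpha=1.0, b_alpha=1.0, a_eta=1.0, b_eta=1.0):
    """Draw (alpha, eta) from Beta x Beta restricted to alpha + eta > 1."""
    while True:
        alpha = rng.betavariate(a_alpha, b_alpha)
        eta = rng.betavariate(a_eta, b_eta)
        if alpha + eta > 1.0:  # keep only draws satisfying the constraint
            return alpha, eta


rng = random.Random(2015)
draws = [sample_sens_spec(rng) for _ in range(1000)]
```

The constraint excludes the mirror-image solution in which the labels 0 and 1 are effectively swapped, which is why it serves as an identification device.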

slide-55
SLIDE 55

Markov chain Monte Carlo

MCMC – block Gibbs sampler

Parameters of the event-time model ($\beta$, $\sigma_\varepsilon^2$, PGM parameters $w/a$, $\lambda$, $\mu$, $\tau^2$ and augmented random effects $b_1, \ldots, b_N$):

Nothing new compared to the situation without misclassification, see the earlier papers Komárek, Lesaffre (& Legrand) (2007, 2008).

Augmented event times $T_{(i,j)}$:

Sampling from a mixture of truncated log-normals. Truncation: intervals between the visit times. Mixture weights: binomial probabilities that depend on the sensitivities/specificities and the observed 0/1 event classifications $Y_{(i,j,l)}$. What would change if a model other than the AFT model with normal errors were assumed for the event times?

Sensitivities ($\alpha$’s) and specificities ($\eta$’s):

Sampling from truncated Beta distributions.
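The update of one augmented event time can be sketched as follows. This is an illustrative reading of the step described above, not the code of the bayesSurv package. All function and argument names are hypothetical: `mean` stands for the linear predictor plus random intercept, `sd` for the error standard deviation, and `sens[k]`, `spec[k]` for the sensitivity and specificity that apply at visit k.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def interval_weights(visits, y, sens, spec, mean, sd):
    """Unnormalized weights of the K + 1 intervals between visits for one tooth."""
    bounds = [0.0] + list(visits) + [math.inf]

    def cdf(t):
        # Log-normal CDF with the conventions F(0) = 0 and F(inf) = 1
        if t == 0.0:
            return 0.0
        if math.isinf(t):
            return 1.0
        return STD_NORMAL.cdf((math.log(t) - mean) / sd)

    cdfs = [cdf(t) for t in bounds]
    weights = []
    for l in range(1, len(bounds)):  # interval (bounds[l-1], bounds[l]]
        w = cdfs[l] - cdfs[l - 1]  # log-normal mass of the interval
        for k, (yk, a, e) in enumerate(zip(y, sens, spec), start=1):
            if k >= l:  # visit at or after the event: correct status is 1
                w *= a if yk == 1 else 1.0 - a
            else:  # visit before the event: correct status is 0
                w *= 1.0 - e if yk == 1 else e
        weights.append(w)
    return bounds, cdfs, weights


def sample_event_time(rng, visits, y, sens, spec, mean, sd):
    """Draw T from the mixture of log-normals truncated to the visit intervals."""
    bounds, cdfs, weights = interval_weights(visits, y, sens, spec, mean, sd)
    l = rng.choices(range(1, len(bounds)), weights=weights)[0]
    # Inverse-CDF draw from the log-normal truncated to the chosen interval
    u = rng.uniform(cdfs[l - 1], cdfs[l])
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf away from 0 and 1
    return math.exp(mean + sd * STD_NORMAL.inv_cdf(u))
```

With perfect classification (all sensitivities and specificities equal to one) and a monotone sequence, only one interval keeps a non-zero weight and the step reduces to the usual data augmentation for interval-censored data.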

slide-56
SLIDE 56

Part VI Simulation study

slide-57
SLIDE 57

Simulation study

$J = 4$ (teeth), $N = 500,\ 1\,000,\ 2\,000$ (children).

$$\log T_{(i,j)} = 2.0 + 0.2\, x_{(i,j),1} - 0.1\, x_{(i,j),2} + b_i + \varepsilon_{(i,j)}.$$

$x_{(i,j),1} \sim \text{Uniform}(0, 1)$, $x_{(i,j),2} \sim \text{Bernoulli}(0.5)$.

$\text{var}(b_i) + \text{var}(\varepsilon_{(i,j)}) = 0.1$.

$\sqrt{\text{var}(b_i) / \text{var}(\varepsilon_{(i,j)})} = \sigma_b / \sigma_\varepsilon = 0.5,\ 1,\ 2,\ 5$.

$g_b$: (a) normal, (b) clearly bimodal two-component normal mixture, (c) Gumbel.

$K_i = 10$ visits (at random intervals). $Q = 5$ examiners randomly assigned to the visits. Sensitivities and specificities ranging over 0.60 – 0.96.
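A data-generating scheme along these lines might look like the sketch below. Several details are not given on the slide and are assumptions here: the normal random intercept (case (a) only), the visit scheme with Uniform(0.8, 1.2) gaps, and the examiner-specific (not tooth-specific) sensitivities and specificities on an even grid between 0.60 and 0.96. The function name `simulate_child` is hypothetical.

```python
import math
import random


def simulate_child(rng, ratio, n_teeth=4, n_visits=10, n_examiners=5, total_var=0.1):
    """Simulate misclassified 0/1 sequences for the teeth of one child."""
    # Split the total variance 0.1 according to ratio = sd_b / sd_eps
    sd_eps = math.sqrt(total_var / (1.0 + ratio ** 2))
    sd_b = ratio * sd_eps
    # Assumed examiner-specific sensitivities and specificities
    sens = [0.60 + 0.09 * q for q in range(n_examiners)]
    spec = [0.96 - 0.09 * q for q in range(n_examiners)]
    # Assumed visit scheme: gaps between consecutive visits are Uniform(0.8, 1.2)
    visits, t = [], 0.0
    for _ in range(n_visits):
        t += rng.uniform(0.8, 1.2)
        visits.append(t)
    examiners = [rng.randrange(n_examiners) for _ in visits]
    b = rng.gauss(0.0, sd_b)  # random intercept shared by all teeth of the child
    teeth = []
    for _ in range(n_teeth):
        x1 = rng.random()
        x2 = 1 if rng.random() < 0.5 else 0
        log_t = 2.0 + 0.2 * x1 - 0.1 * x2 + b + rng.gauss(0.0, sd_eps)
        event_time = math.exp(log_t)
        y = []
        for v, q in zip(visits, examiners):
            # Probability of recording a 1 depends on the true status at the visit
            p_one = sens[q] if event_time <= v else 1.0 - spec[q]
            y.append(1 if rng.random() < p_one else 0)
        teeth.append({"x": (x1, x2), "T": event_time, "Y": y})
    return {"visits": visits, "examiners": examiners, "teeth": teeth}


rng = random.Random(1)
data = [simulate_child(rng, ratio=2.0) for _ in range(500)]
```

The recorded sequences are not forced to be monotone, so a simulated tooth can show patterns such as 0, 1, 0, 1.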

slide-58
SLIDE 58

Simulation study

500 data sets for each scenario. Each data set was also analyzed while ignoring misclassification: the first Y(i,j,k) = 1 determined the “observed” interval in which T(i,j) occurred, leading to “standard” interval-censored data.

slide-59
SLIDE 59

Sensitivity α(1,1) = 0.60

gb: bimodal two-component N mixture

[Figure: estimates of α11 for N = 500, 1000, 2000 within each σb/σε = 0.5, 1, 2, 5.]
slide-60
SLIDE 60

Sensitivity α(1,1) = 0.60

gb: Gumbel

[Figure: estimates of α11 for N = 500, 1000, 2000 within each σb/σε = 0.5, 1, 2, 5.]
slide-61
SLIDE 61

Sensitivity α(4,4) = 0.91

gb: bimodal two-component N mixture

[Figure: estimates of α44 for N = 500, 1000, 2000 within each σb/σε = 0.5, 1, 2, 5.]
slide-62
SLIDE 62

Sensitivity α(4,4) = 0.91

gb: Gumbel

[Figure: estimates of α44 for N = 500, 1000, 2000 within each σb/σε = 0.5, 1, 2, 5.]
slide-63
SLIDE 63

Regression parameter β1 = 0.20

gb: bimodal two-component N mixture

[Figure: estimates of β1 for N = 500, 1000, 2000 within each σb/σε = 0.5, 1, 2, 5.]
slide-64
SLIDE 64

Regression parameter β1 = 0.20

gb: Gumbel

[Figure: estimates of β1 for N = 500, 1000, 2000 within each σb/σε = 0.5, 1, 2, 5.]
slide-65
SLIDE 65

Regression parameter β1 = 0.20

gb: bimodal two-component N mixture

IGNORED MISCLASSIFICATION

[Figure: estimates of β1 for N = 500, 1000, 2000 within each σb/σε = 0.5, 1, 2, 5; vertical axis from −0.10 to 0.20.]
slide-66
SLIDE 66

Regression parameter β1 = 0.20

gb: Gumbel

IGNORED MISCLASSIFICATION

[Figure: estimates of β1 for N = 500, 1000, 2000 within each σb/σε = 0.5, 1, 2, 5; vertical axis from −0.10 to 0.20.]
slide-67
SLIDE 67

Survival function for a certain covariates combination

σb/σε = 5

[Figure: survival function S(t) against time in two rows of panels (gb: bimodal two-component N mixture; gb: Gumbel), each with N = 500, 1000 and 2000.]
slide-68
SLIDE 68

Survival function for a certain covariates combination

σb/σε = 0.5

[Figure: survival function S(t) against time in two rows of panels (gb: bimodal two-component N mixture; gb: Gumbel), each with N = 500, 1000 and 2000.]
slide-69
SLIDE 69

Survival function for a certain covariates combination

σb/σε = 5

IGNORED MISCLASSIFICATION

[Figure: survival function S(t) against time in two rows of panels (gb: bimodal two-component N mixture; gb: Gumbel), each with N = 500, 1000 and 2000.]
slide-70
SLIDE 70

Survival function for a certain covariates combination

σb/σε = 0.5

IGNORED MISCLASSIFICATION

[Figure: survival function S(t) against time in two rows of panels (gb: bimodal two-component N mixture; gb: Gumbel), each with N = 500, 1000 and 2000.]
slide-71
SLIDE 71

Simulation study 2

What if no misclassification is present but we use the model that accounts for possible misclassification?

Simulation study 2: data generated without misclassification (all sensitivities and specificities equal to one).

slide-72
SLIDE 72

Sensitivity α(1,1) = 1.00

gb: bimodal two-component N mixture

[Figure: estimates of α11 for N = 500, 1000, 2000 within each σb/σε = 0.5, 1, 2, 5.]
slide-73
SLIDE 73

Sensitivity α(1,1) = 1.00

gb: Gumbel

[Figure: estimates of α11 for N = 500, 1000, 2000 within each σb/σε = 0.5, 1, 2, 5.]
slide-74
SLIDE 74

Regression parameter β1 = 0.20

gb: bimodal two-component N mixture

[Figure: estimates of β1 for N = 500, 1000, 2000 within each σb/σε = 0.5, 1, 2, 5.]
slide-75
SLIDE 75

Regression parameter β1 = 0.20

gb: Gumbel

[Figure: estimates of β1 for N = 500, 1000, 2000 within each σb/σε = 0.5, 1, 2, 5.]
slide-76
SLIDE 76

Survival function for a certain covariates combination

σb/σε = 5

[Figure: survival function S(t) against time in two rows of panels (gb: bimodal two-component N mixture; gb: Gumbel), each with N = 500, 1000 and 2000.]
slide-77
SLIDE 77

Survival function for a certain covariates combination

σb/σε = 0.5

[Figure: survival function S(t) against time in two rows of panels (gb: bimodal two-component N mixture; gb: Gumbel), each with N = 500, 1000 and 2000.]
slide-78
SLIDE 78

Part VII Models comparison

slide-79
SLIDE 79

Models comparison

Two competing models $M_1$ and $M_2$.

They may differ in the specification of the event-time and/or the misclassification model.

Pseudo Bayes factor (PsBF) (Geisser and Eddy, 1979, JASA; Gelfand and Dey, 1994, JRSS, B):

$$\text{PsBF}(M_1, M_2) = \frac{\text{PsML}_{M_1}}{\text{PsML}_{M_2}},$$

$\text{PsML}_M$: pseudo marginal likelihood given model $M$:

$$\text{PsML}_M = \prod_{i=1}^{N} \prod_{j=1}^{J} p_M\bigl(Y_{(i,j,1)}, \ldots, Y_{(i,j,K_i)} \bigm| Y_{[-(i,j)]}\bigr).$$

$Y_{[-(i,j)]}$: data without the observation of unit (tooth) $j$ of subject (child) $i$; $p_M(\cdot \mid \cdot)$: posterior predictive distribution.

slide-80
SLIDE 80

Pseudo marginal likelihood

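Approximation based on the proposal of Gelfand and Dey (1994, JRSS B):

$$
\begin{aligned}
p_M\bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)} \,\big|\, Y_{[-(i,j)]}\bigr)
&= \left\{ \mathrm{E}_{\alpha, \eta, \beta, b_i, \sigma_\varepsilon^2 \mid Y}
\left[ \frac{1}{P\bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)} \,\big|\, \alpha, \eta, \beta, b_i, \sigma_\varepsilon^2\bigr)} \right] \right\}^{-1} \\
&\approx \left\{ \frac{1}{B} \sum_{b=1}^{B}
\frac{1}{P\bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)} \,\big|\, \alpha^{(b)}, \eta^{(b)}, \beta^{(b)}, b_i^{(b)}, \sigma_\varepsilon^{2(b)}\bigr)} \right\}^{-1}.
\end{aligned}
$$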
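The Monte Carlo approximation on this slide is a harmonic mean of likelihood values over posterior draws. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names `log_cpo`, `log_psml` and the input `log_lik_draws` are hypothetical, and the superscript (b) is read as indexing B retained MCMC draws. Each input value is assumed to be the log of P(Y_(i,j,1), ..., Y_(i,j,Ki) | parameters) evaluated at one draw; working on the log scale avoids numerical underflow.

```python
import math

def log_cpo(log_lik_draws):
    """Log of the harmonic-mean estimate {(1/B) * sum_b 1/P_b}^(-1).

    log_lik_draws: list with log P_b for one unit (i, j), one value per
    MCMC draw (hypothetical input, produced elsewhere).
    """
    B = len(log_lik_draws)
    neg = [-v for v in log_lik_draws]      # log(1 / P_b)
    m = max(neg)                           # shift for a stable log-sum-exp
    log_sum = m + math.log(sum(math.exp(v - m) for v in neg))
    return -(log_sum - math.log(B))        # minus the log of the mean of 1/P_b

def log_psml(log_lik_by_unit):
    """Log pseudo marginal likelihood: sum of the log estimates over all units."""
    return sum(log_cpo(draws) for draws in log_lik_by_unit)
```

The difference of two `log_psml` values, one per model, gives the log of the pseudo Bayes factor.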

slide-81
SLIDE 81

Pseudo marginal likelihood

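$$
P\bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)} \,\big|\, \alpha, \eta, \beta, b_i, \sigma_\varepsilon^2\bigr)
= \sum_{k=1}^{K_i+1} \left\{ \int_{v_{(i,k-1)}}^{v_{(i,k)}} \frac{1}{t}\,
\varphi\bigl(\log t \,\big|\, x_{(i,j)}^{\top}\beta + b_i,\; \sigma_\varepsilon^2\bigr)\, \mathrm{d}t \right\}
\times W_{(i,j,k)}\bigl(Y_{(i,j)}, \alpha, \eta\bigr).
$$

W_{(i,j,k)}(Y_{(i,j)}, α, η): quantity for which we have a closed-form expression and which is also needed in the MCMC procedure.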
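The sum over intervals can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the conventions v_0 = 0 and v_{K+1} = ∞, a single common sensitivity `sens` and specificity `spec` in place of the parameters α and η, and the product form used in `w_weight`, which stands in for the closed-form W that this slide does not spell out. The argument `mu` plays the role of x⊤β + b_i and `sigma` that of σ_ε; the integral of the log-normal density over an interval reduces to a difference of normal distribution functions.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def w_weight(y, k, sens, spec):
    """Hypothetical W: probability of the observed 0/1 results y given that
    the event occurred in interval k (visits before k are truly event-free,
    visits from k onwards are truly positive)."""
    w = 1.0
    for l, y_l in enumerate(y, start=1):
        if l < k:   # truly event-free at visit l
            w *= spec if y_l == 0 else 1.0 - spec
        else:       # event has already occurred at visit l
            w *= sens if y_l == 1 else 1.0 - sens
    return w

def likelihood(y, visits, mu, sigma, sens, spec):
    """Sum over k = 1..K+1 of P(v_{k-1} < T <= v_k) * W_k, log T ~ N(mu, sigma^2)."""
    cuts = [0.0] + list(visits) + [math.inf]   # assumed v_0 = 0, v_{K+1} = infinity

    def cdf(t):
        if t == 0.0:
            return 0.0
        if t == math.inf:
            return 1.0
        return norm_cdf((math.log(t) - mu) / sigma)

    total = 0.0
    for k in range(1, len(cuts)):
        total += (cdf(cuts[k]) - cdf(cuts[k - 1])) * w_weight(y, k, sens, spec)
    return total
```

With perfect classification (`sens = spec = 1`) a non-monotone record such as (0, 1, 0, 1) has likelihood zero, whereas with imperfect classification every record receives positive probability.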

slide-82
SLIDE 82

Part VIII The Signal Tandmobiel study

slide-83
SLIDE 83

Models

Event-time model

$$
\log T_{(i,j)} = b_i + x_{(i,j)}^{\top}\beta + \varepsilon_{(i,j)}
$$

T_{(i,j)}: age at which caries develops on tooth j (∈ {1, 2, 3, 4}) of child i.

x_{(i,j)}: gender, presence of sealants, frequency of brushing, x and y geographical coordinates.

Misclassification models

16 examiners.

Model M1: sensitivities/specificities both examiner and tooth specific (64 + 64 sensitivities and specificities).

Model M2: sensitivities/specificities only examiner specific (16 + 16 sensitivities and specificities).
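To make the structure of the event-time model concrete, here is a minimal simulation sketch. It is not the model fitted in the talk: normal distributions for b_i and ε_(i,j) are a simplifying assumption of this sketch, and the function name `simulate_child` as well as all numerical values (an intercept of 2.2, a sealant effect of 0.2, the standard deviations) are hypothetical.

```python
import math
import random

def simulate_child(x_rows, beta, sigma_b, sigma_eps, rng):
    """Draw event times T_(i,j) for the teeth of one child.

    x_rows: one covariate vector per tooth; beta: regression coefficients.
    Normal b_i and eps_(i,j) are a simplifying assumption of this sketch.
    """
    b_i = rng.gauss(0.0, sigma_b)               # shared by all teeth of the child
    times = []
    for x in x_rows:
        linpred = sum(xv * bv for xv, bv in zip(x, beta))
        eps = rng.gauss(0.0, sigma_eps)         # tooth-specific error
        times.append(math.exp(b_i + linpred + eps))
    return times

rng = random.Random(1)
# Hypothetical design: column 1 is an intercept, column 2 flags a sealed tooth.
x_rows = [[1, 1], [1, 1], [1, 0], [1, 0]]
print(simulate_child(x_rows, beta=[2.2, 0.2], sigma_b=0.3, sigma_eps=0.1, rng=rng))
```

The shared b_i is what induces correlation among the four teeth of the same child, and a covariate acts multiplicatively on the event time through exp(x⊤β).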

slide-84
SLIDE 84

Models comparison

Pseudo marginal log-likelihoods: M1: −16 545, M2: −16 515.

PsBF(M1, M2) = exp(−30) ≈ 10^(−13).

From a predictive point of view, the simpler model M2 (sensitivities/specificities only examiner specific) is better.
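The arithmetic behind the pseudo Bayes factor can be checked directly from the two reported pseudo marginal log-likelihoods; the variable names below are illustrative only.

```python
import math

log_psml_m1 = -16545.0   # reported pseudo marginal log-likelihood of M1
log_psml_m2 = -16515.0   # reported pseudo marginal log-likelihood of M2

# On the log scale the ratio PsML_M1 / PsML_M2 becomes a difference.
log_psbf = log_psml_m1 - log_psml_m2
psbf = math.exp(log_psbf)          # of the order 10^-13

print(log_psbf, psbf)
```

Computing the difference on the log scale matters here: exponentiating either log-likelihood on its own would underflow to zero in double precision.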

slide-85
SLIDE 85

Sensitivities

(posterior means and 95% HPD credible intervals)

[Figure: sensitivity (vertical axis, 0.70 to 1.00) for each of examiners 1 to 16.]

slide-86
SLIDE 86

Specificities

(posterior means and 95% HPD credible intervals)

[Figure: specificity (vertical axis, 0.70 to 1.00) for each of examiners 1 to 16.]

slide-87
SLIDE 87

Random intercept density

(standardized, pointwise posterior means)

[Figure: density g(b) (vertical axis, 0.0 to 1.2) against b (−3 to 3).]

slide-88
SLIDE 88

Posterior summary for regression parameters

Parameter                  Model   Posterior mean   Posterior median   95% HPD
Gender (Girl)              1       −0.05971         −0.05964           (−0.09115; −0.02854)
                           2       −0.05984         −0.05993           (−0.09098; −0.02814)
Sealants (Present)         1        0.19027          0.19016           ( 0.16319;  0.21762)
                           2        0.19067          0.19054           ( 0.16378;  0.21787)
Freq. of Brush. (Daily)    1        0.16564          0.16562           ( 0.12168;  0.21056)
                           2        0.16538          0.16542           ( 0.12242;  0.20938)
x-ordinate                 1       −0.00092         −0.00092           (−0.00122; −0.00062)
                           2       −0.00092         −0.00092           (−0.00122; −0.00062)
y-ordinate                 1       −0.00002         −0.00004           (−0.00010;  0.00087)
                           2       −0.00007         −0.00007           (−0.00101;  0.00080)
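Because the event-time model is specified on the log scale, a regression coefficient β translates into a multiplicative factor exp(β) on the time to caries. The sketch below applies this to the posterior means reported for model M2 in the posterior summary. It is an illustration only: the function name `acceleration_factor` is hypothetical, and the exponential of a posterior mean is not the posterior mean of exp(β), so the values are rough guides rather than reported results.

```python
import math

# Posterior means under model M2, copied from the posterior summary.
posterior_mean = {
    "Gender (Girl)": -0.05984,
    "Sealants (Present)": 0.19067,
    "Freq. of Brush. (Daily)": 0.16538,
}

def acceleration_factor(beta):
    """Multiplicative effect on the event time implied by a log-scale coefficient."""
    return math.exp(beta)

for name, beta in posterior_mean.items():
    print(f"{name}: exp({beta}) = {acceleration_factor(beta):.3f}")
```

Read this way, sealed teeth have a time to caries roughly 1.2 times that of unsealed teeth, other covariates being equal, and the factor for girls is slightly below 1.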
slide-89
SLIDE 89

Survival functions

(pointwise posterior means)

[Figure: S(t) (0.0 to 1.0) against age (axis ticks at 5, 10, 15) for eight covariate combinations. Legend, in the order shown: Boy, seal, more freq.; Girl, seal, more freq.; Boy, seal, less freq.; Boy, no seal, more freq.; Girl, seal, less freq.; Girl, no seal, more freq.; Boy, no seal, less freq.; Girl, no seal, less freq.]

slide-90
SLIDE 90

Hazard functions

(pointwise posterior means)

[Figure: h(t) (0.0 to 0.4) against age (axis ticks at 5, 10, 15) for eight covariate combinations. Legend, in the order shown: Girl, no seal, less freq.; Boy, no seal, less freq.; Girl, no seal, more freq.; Girl, seal, less freq.; Boy, no seal, more freq.; Boy, seal, less freq.; Girl, seal, more freq.; Boy, seal, more freq.]

slide-91
SLIDE 91

Part IX Summary and conclusions

slide-92
SLIDE 92

Misclassified interval-censored data

[Figure: observed event status Y_(i,j)(t), taking values 0 and 1, at the visit times v_(i,1), v_(i,2), v_(i,3), v_(i,4), with the true event time T_(i,j) marked on the time axis t.]

T_(i,j) ∈ ???, Y_(i,j) = (0, 1, 0, 1)⊤.

slide-94
SLIDE 94

Conclusions

Interval-censored event-time data are encountered whenever a certain evaluation (examination, laboratory test, . . . ) is needed to determine the event status.

Event status evaluation is often subject to misclassification.

Not only human examiners but also laboratory procedures usually have sensitivity and/or specificity < 1.

Ignoring misclassification may lead to seriously biased results of the event-time analysis.

Joint modelling of the misclassification and event-time processes allows for unbiased/consistent estimation of the parameters of:

the event-time process (survival functions, regression parameters, . . . );

the misclassification process (sensitivities, specificities).

No external (validation) data are needed to obtain the sensitivities/specificities of the classification.

slide-96
SLIDE 96

Possible extensions/modifications

An event-time model other than the AFT model with random intercept.

Only small parts of the MCMC scheme would have to be modified.

Examiner-specific covariates to model sensitivities/specificities in the misclassification model.

Logit model.

Time-dependent sensitivities/specificities.

Useful if learning by doing can be expected in event classification.

Likely not possible with our “joint” approach due to identifiability problems.

External (validation) data needed to estimate the parameters of the misclassification process.
slide-98
SLIDE 98

Applicability

Designed longitudinal studies with pre-specified visit times (independent of the event times).

Event status checked at each visit, independently of previous examination results, by an imperfect diagnostic procedure.

At least three visits (for at least some subjects) are needed to identify the parameters of the misclassification process (sensitivities and specificities).

The above conditions are quite often satisfied in practice, and yet misclassification is ignored. . .

Practically nothing is lost if misclassification is accounted for even when it is not present.

slide-99
SLIDE 99

THANK YOU FOR YOUR ATTENTION!

slide-100
SLIDE 100

References

GARCÍA-ZATTERA, JARA, KOMÁREK (2015+). A flexible AFT model for misclassified clustered interval-censored data. Under review.

KOMÁREK, LESAFFRE, HILTON (2005). Accelerated failure time model for arbitrarily censored data with smoothed error distribution. Journal of Computational and Graphical Statistics, 14(3), 726–745.

KOMÁREK, LESAFFRE, LEGRAND (2007). Baseline and treatment effect heterogeneity for survival times between centers using a random effects accelerated failure time model with flexible error distribution. Statistics in Medicine, 26(30), 5457–5472.

KOMÁREK, LESAFFRE (2008). Bayesian accelerated failure time model with multivariate doubly-interval-censored data and flexible distributional assumptions. Journal of the American Statistical Association, 103(482), 523–533.

GARCÍA-ZATTERA, MUTSVARI, JARA, DECLERCK, LESAFFRE (2010). Correcting for misclassification for a monotone disease process with an application in dental research. Statistics in Medicine, 29(30), 3103–3117.

GARCÍA-ZATTERA, JARA, LESAFFRE, MARSHALL (2012). Modeling of multivariate monotone disease processes in the presence of misclassification. Journal of the American Statistical Association, 107(499), 976–989.
