SLIDE 1

Case-base sampling for fitting and validating prognostic models

Workshop on Statistical Issues in Biomarker and Drug Co-development, Fields Institute, Toronto

Olli Saarela

Dalla Lana School of Public Health, University of Toronto

November 8, 2014

Olli Saarela (University of Toronto) Case-base sampling for prognostic modeling Nov 8, 2014 1 / 23

SLIDE 2

Outline

1. Case-base sampling

2. Application: estimation of ROC/AUC from time-to-event data

SLIDE 7

Case-base sampling

Motivation: Cox regression and absolute risk

Time matching/risk set sampling (including the Cox partial likelihood) eliminates the baseline hazard from the likelihood expression for the hazard ratios.

If, however, the absolute risks are of interest, they have to be recovered in a second step using the semi-parametric Breslow estimator.

Are there alternative approaches for fitting flexible hazard models for estimating absolute risks that do not require this two-step approach?

There are; the idea originates from Mantel (1973) and Hanley & Miettinen (2009).

SLIDE 11

Case-base sampling

An alternative framework for survival analysis

Case-base sampling combined with logistic/multinomial regression provides an alternative to risk set sampling-based semi-parametric survival analysis methods.

This enables easy fitting of smooth-in-time and non-proportional hazard models with multiple time scales.

It also provides an alternative to Kaplan-Meier-based methods for estimating discrimination statistics (e.g. ROC, AUC, risk reclassification probabilities) from censored survival data.

SLIDE 12

Case-base sampling

Study base

[Figure: population-time of the study base (55243 person-years in total); x-axis: follow-up years, y-axis: population.]

SLIDE 13

Case-base sampling

Case series

[Figure: case series of incident CVD events (493 in total); x-axis: follow-up years, y-axis: population.]

SLIDE 14

Case-base sampling

Time matching

[Figure: incident CVD events (493 in total); x-axis: follow-up years, y-axis: population.]

SLIDE 15

Case-base sampling

Start again

[Figure: incident CVD events (493 in total); x-axis: follow-up years, y-axis: population.]

SLIDE 16

Case-base sampling

Base series

[Figure: base series (4930 person-moments); x-axis: follow-up years, y-axis: population.]
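As a rough illustration of how a base series of person-moments might be drawn, the following sketch samples moments uniformly from the total population-time. The function name `sample_base_series`, the follow-up durations and the base series size are hypothetical and chosen only for illustration; the slides do not specify an implementation.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def sample_base_series(follow_up, b, rng):
    """Draw b person-moments uniformly from the total population-time.

    follow_up: follow-up durations in years, one per individual.
    Returns a list of (individual index, time since start of follow-up).
    """
    cum = list(accumulate(follow_up))      # cumulative person-time
    total = cum[-1]                        # total population-time
    moments = []
    for _ in range(b):
        u = rng.random() * total           # uniform point on the stacked time axis
        i = min(bisect_right(cum, u), len(follow_up) - 1)  # whose segment holds u
        start = cum[i - 1] if i > 0 else 0.0
        moments.append((i, u - start))
    return moments

rng = random.Random(2014)
follow_up = [10.0, 3.5, 7.2, 10.0, 1.3]    # hypothetical durations (years)
base = sample_base_series(follow_up, b=50, rng=rng)
```

Individuals with longer follow-up contribute more person-moments on average, which is what sampling in proportion to population-time means here.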

SLIDE 17

Case-base sampling

Age as the time scale

[Figure: study base and case series with age as the time scale; x-axis: age, y-axis: population.]

SLIDE 18

Case-base sampling

Base series

[Figure: case series and base series with age as the time scale; x-axis: age, y-axis: population.]

SLIDE 19

Case-base sampling

Base series matched by the Framingham score

[Figure: case series and base series matched by the Framingham score, with age as the time scale; x-axis: age, y-axis: population.]

SLIDE 23

Case-base sampling

Likelihood expression (Saarela & Arjas, 2014)

The hazard regression can now be fitted using the conditional likelihood expression

$$
L(\theta) \equiv \prod_{i=1}^{n} \prod_{t \in (0,\tau]} \left( \frac{\lambda_i(t;\theta)^{dN_i(t)}}{\rho_i(t) + \lambda_i(t;\theta)} \right)^{dM_i(t)},
$$

where $N_i(t)$ counts the cases, and $M_i(t)$ counts both the case and base series person-moments contributed by individual $i$.

This is of logistic regression form, with the offset term $\rho_i(t)$ accounting for the base series sampling mechanism.

It generalizes to multinomial regression when competing causes are present.
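To see why the expression has logistic regression form, it may help to write out a single factor. The sketch below assumes that the base series is sampled uniformly from the study base, so that $\rho_i(t)$ is a constant equal to $b/B$, with $b$ the number of base series person-moments and $B$ the total population-time. The symbols $p_i(t)$, $b$ and $B$ are introduced here for illustration and do not appear on the slides.

```latex
\begin{align*}
p_i(t) &= \frac{\lambda_i(t;\theta)}{\rho_i(t) + \lambda_i(t;\theta)}
  && \text{case moment: } dN_i(t) = dM_i(t) = 1 \\
\frac{1}{\rho_i(t) + \lambda_i(t;\theta)} &= \frac{1 - p_i(t)}{\rho_i(t)}
  && \text{base moment: } dN_i(t) = 0,\ dM_i(t) = 1 \\
\operatorname{logit} p_i(t) &= \log \lambda_i(t;\theta) - \log \rho_i(t)
  = \log \lambda_i(t;\theta) + \log(B/b)
\end{align*}
```

Because $\rho_i(t)$ does not depend on $\theta$, the base moment factor is proportional to $1 - p_i(t)$, so the likelihood is that of a logistic regression for case versus base status with linear predictor $\log \lambda_i(t;\theta)$ and offset $\log(B/b)$. Under the same uniform-sampling assumption, the figures quoted earlier (55243 person-years and 4930 person-moments) would give an offset of $\log(55243/4930) \approx 2.42$ with time measured in years.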

SLIDE 25

Case-base sampling

Model specification

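Consider the following specification of the hazard function:

$$
\begin{aligned}
\lambda_i(t;\theta) = \exp\{ & \theta_0 + f_1(t, \theta_1) + f_2(\text{age at baseline}_i + t, \theta_2) + f_3(\text{troponin I}_i, \theta_3) \\
& + \theta_4 \times \text{HDL cholesterol}_i + \theta_5 \times \text{non-HDL cholesterol}_i \\
& + \theta_6 \times \text{treated systolic blood pressure}_i + \theta_7 \times \text{untreated systolic blood pressure}_i \\
& + \theta_8 \times \text{smoker}_i + \theta_9 \times \text{prevalent diabetes}_i \}.
\end{aligned}
$$

Here $f_1$, $f_2$ and $f_3$ are appropriate spline basis functions.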
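Given a fitted hazard of this kind, the absolute risk follows by integrating the hazard over time. As a sketch, and assuming no competing causes of failure, the risk over a horizon $s$ for an individual with prognostic factors $x$ could be written as follows. The notation $\lambda(t; x, \theta)$ and the horizon $s$ are introduced here for illustration.

```latex
\pi(x;\theta) = 1 - \exp\left\{ -\int_{0}^{s} \lambda(t; x, \theta)\, dt \right\}
```

Taking $s = 10$ years corresponds to a 10-year risk. Since $f_1$ and $f_2$ depend on $t$, the integral would generally have to be evaluated numerically.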

SLIDE 31

Case-base sampling

Fitting the hazard model

The likelihood expression does not feature the cumulative hazard, only the hazard function itself evaluated at a discrete number of points.

The hazard model can be fitted using standard logistic regression procedures.

The baseline hazard, and consequently, the absolute risk, is obtained as part of the model fit.

Easy to incorporate multiple time scales and interactions between time and other covariates.

The time effects themselves can be fitted using flexible specifications, such as regression splines (Hanley & Miettinen, 2009; Saarela & Hanley, 2014).
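The points above can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions that are not taken from the slides: a constant baseline hazard, a single binary covariate, uniform sampling of the base series, and a hand-written Newton-Raphson fit in place of a standard logistic regression routine. All function names and parameter values are hypothetical.

```python
import math
import random

def simulate(n, rng, base_rate=0.05, hazard_ratio=2.0, tau=10.0):
    """Exponential event times with a binary covariate, censored at tau years."""
    data = []
    for _ in range(n):
        x = rng.randint(0, 1)
        t = rng.expovariate(base_rate * hazard_ratio ** x)
        data.append((x, min(t, tau), t <= tau))   # (covariate, follow-up, event?)
    return data

def case_base_rows(data, ratio, rng):
    """Case series plus a base series of ratio * (number of cases) person-moments."""
    cases = [(x, 1) for x, _, event in data if event]
    b = ratio * len(cases)
    follow_up = [fu for _, fu, _ in data]
    total = sum(follow_up)                        # total population-time B
    # Uniform sampling of person-moments: individuals are picked with
    # probability proportional to follow-up time. The sampled time itself is
    # not stored because the hazard in this sketch is constant in time.
    picks = rng.choices(data, weights=follow_up, k=b)
    base = [(x, 0) for x, _, _ in picks]
    return cases + base, math.log(total / b)      # rows and offset log(B/b)

def fit_logistic_offset(rows, offset, theta, iters=25):
    """Newton-Raphson for logit P(y = 1) = theta0 + theta1 * x + offset."""
    t0, t1 = theta
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in rows:
            p = 1.0 / (1.0 + math.exp(-(t0 + t1 * x + offset)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        t0 += (h11 * g0 - h01 * g1) / det         # solve the 2x2 Newton system
        t1 += (h00 * g1 - h01 * g0) / det
    return t0, t1

rng = random.Random(1)
data = simulate(3000, rng)
rows, offset = case_base_rows(data, ratio=10, rng=rng)
n_cases = sum(y for _, y in rows)
person_time = sum(fu for _, fu, _ in data)
start = (math.log(n_cases / person_time), 0.0)    # crude event rate as start value
theta0, theta1 = fit_logistic_offset(rows, offset, start)

# The intercept is the log baseline hazard, so absolute risks follow directly.
risk10 = {x: 1.0 - math.exp(-10.0 * math.exp(theta0 + theta1 * x)) for x in (0, 1)}
```

In this setup `theta0` estimates the log baseline hazard and `theta1` the log hazard ratio, so the 10-year risks in `risk10` come straight from the logistic fit without a separate baseline hazard estimation step.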

SLIDE 36

Application: estimation of ROC/AUC from time-to-event data

Discrimination measures

Since the hazard model specification was fully parametric, Bayesian measures of uncertainty may be calculated for any function of these parameters.

Consequently, we can obtain posterior predictive distributions for discrimination measures such as ROC curves, areas under the curve (AUC), or risk reclassification probabilities.

Overfitting? The procedure works similarly if the risk score has been derived in another sample.

SLIDE 40

Application: estimation of ROC/AUC from time-to-event data

Calculating sensitivity/specificity

Consider for example sensitivity, that is, the probability of the estimated 10-year risk $\pi(X;\theta)$ being at least some threshold risk $\pi^*$, given the occurrence of the event during the 10 years, and data $D$:

$$
P(\pi(X;\theta) \ge \pi^* \mid N(10) = 1, \theta, D) = \frac{\int_x \mathbf{1}_{\{\pi(x;\theta) \ge \pi^*\}}\, \pi(x;\theta)\, P(dx \mid D)}{\int_x \pi(x;\theta)\, P(dx \mid D)}.
$$

The sources of uncertainty here are the unknown parameters $\theta$ of the hazard regression model, and the unknown predictive distribution $P(X \mid D)$ of the prognostic factors.

If we take $P(dx \mid D) = \sum_{i=1}^{n} \frac{1}{n} \delta_{x_i}(dx)$, a point estimate is given by

$$
\frac{\sum_{i=1}^{n} \mathbf{1}_{\{\pi(x_i;\hat{\theta}) \ge \pi^*\}}\, \pi(x_i;\hat{\theta})}{\sum_{i=1}^{n} \pi(x_i;\hat{\theta})}.
$$
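The point estimate above translates directly into code. In the sketch below, `sensitivity` implements the displayed ratio; `specificity` is the analogous expression with non-event probabilities $1 - \pi$, which is an assumption made here by symmetry rather than a formula shown on the slides. The risk values are hypothetical.

```python
def sensitivity(risks, threshold):
    """Sum of predicted risks at or above the threshold over the sum of all risks."""
    return sum(r for r in risks if r >= threshold) / sum(risks)

def specificity(risks, threshold):
    """Analogous ratio for non-events, using 1 - risk as the weight."""
    below = sum(1.0 - r for r in risks if r < threshold)
    return below / sum(1.0 - r for r in risks)

risks = [0.1, 0.2, 0.3, 0.4]          # hypothetical estimated 10-year risks
sens = sensitivity(risks, 0.25)       # (0.3 + 0.4) / 1.0
spec = specificity(risks, 0.25)       # (0.9 + 0.8) / 3.0
```

Evaluating both functions over a grid of thresholds would trace out a model-based ROC curve.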

SLIDE 41

Application: estimation of ROC/AUC from time-to-event data

Parametric ROC curves

[Figure: parametric ROC curves, with points labelled 0.05, 0.1, 0.15 and 0.2; x-axis: false positive probability, y-axis: true positive probability.]

AUC=0.662 (Age only)

AUC=0.738 (Framingham score)

AUC=0.746 (Age + classic risk factors)

AUC=0.751 (Age + classic risk factors + Troponin I)

AUC=0.753 (Age + classic risk factors + HS Troponin I)

SLIDE 42

Application: estimation of ROC/AUC from time-to-event data

Kaplan-Meier ROC curves (Heagerty et al. 2000)

[Figure: Kaplan-Meier ROC curves, with points labelled 0.05, 0.1, 0.15 and 0.2; x-axis: false positive probability, y-axis: true positive probability.]

AUC=0.661 (Age only)

AUC=0.750 (Framingham score)

AUC=0.763 (Age + classic risk factors)

AUC=0.766 (Age + classic risk factors + Troponin I)

AUC=0.772 (Age + classic risk factors + HS Troponin I)

SLIDE 47

Application: estimation of ROC/AUC from time-to-event data

Posterior predictive distribution for AUC

The hazard model parameters $\theta$ are drawn from the posterior distribution $P(d\theta \mid D) \propto L(\theta) P(d\theta)$.

The posterior predictive distribution of the prognostic factors may be approximated by the Bayesian bootstrap (Rubin, 1981).

This corresponds to $P(dx \mid D) = \sum_{i=1}^{n} w_i \delta_{x_i}(dx)$, where $(w_1, \ldots, w_n) \sim \mathrm{Dirichlet}(1, \ldots, 1)$.

The ROC curve and corresponding AUC are calculated at each realization of $\theta$ and $(w_1, \ldots, w_n)$.
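The Bayesian bootstrap step can be sketched as follows. For brevity the sketch holds the predicted risks fixed, that is, it varies only the weights $(w_1, \ldots, w_n)$ and omits the posterior draws of $\theta$. The AUC is computed as the weighted probability that a case has a higher predicted risk than a non-case, with ties counted as one half; this particular formula and the risk values are assumptions made for illustration.

```python
import random

def dirichlet_uniform(n, rng):
    """One draw from Dirichlet(1, ..., 1) via normalized Exp(1) variables."""
    g = [rng.expovariate(1.0) for _ in range(n)]
    s = sum(g)
    return [v / s for v in g]

def weighted_auc(risks, weights):
    """Model-based AUC under the discrete distribution sum_i w_i * delta_{x_i}.

    Point i acts as a case with weight w_i * risk_i and as a non-case with
    weight w_i * (1 - risk_i).
    """
    num = 0.0
    for ri, wi in zip(risks, weights):
        for rj, wj in zip(risks, weights):
            concordance = 1.0 if ri > rj else 0.5 if ri == rj else 0.0
            num += concordance * wi * ri * wj * (1.0 - rj)
    cases = sum(w * r for r, w in zip(risks, weights))
    noncases = sum(w * (1.0 - r) for r, w in zip(risks, weights))
    return num / (cases * noncases)

rng = random.Random(1981)
risks = [0.02, 0.05, 0.08, 0.12, 0.20, 0.35]   # hypothetical 10-year risks
draws = [weighted_auc(risks, dirichlet_uniform(len(risks), rng))
         for _ in range(200)]
```

The spread of `draws` reflects only the uncertainty in the distribution of the prognostic factors; recomputing the risks from a fresh draw of $\theta$ inside the loop would add the parameter uncertainty.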

SLIDE 48

Application: estimation of ROC/AUC from time-to-event data

Posterior AUCs for the five models

[Figure: posterior densities of the AUC for the five models (Age only; Framingham score; Age + classic risk factors; Age + classic risk factors + Troponin I; Age + classic risk factors + HS Troponin I); x-axis: AUC, y-axis: density.]

SLIDE 54

Application: estimation of ROC/AUC from time-to-event data

Remarks

Case-base sampling combined with logistic/multinomial regression provides an alternative to risk set sampling-based semi-parametric survival analysis methods.

This enables easy fitting of smooth-in-time and non-proportional hazard models with multiple time scales.

Similarly, this provides an alternative to Kaplan-Meier-based methods for estimating discrimination statistics (e.g. ROC, AUC, risk reclassification probabilities) from censored survival data.

Bayesian measures of uncertainty can be obtained for these.

Improving the prediction of CVD in healthy populations, beyond the classic risk factors of CVD, has been challenging.

SLIDE 55

Application: estimation of ROC/AUC from time-to-event data

References

Hanley JA, Miettinen OS (2009). Fitting Smooth-In-Time Prognostic Risk Functions via Logistic Regression. The International Journal of Biostatistics 5(1).

Heagerty P, Lumley T, Pepe MS (2000). Time-Dependent ROC Curves for Censored Survival Data and a Diagnostic Marker. Biometrics 56, 337–344.

Mantel N (1973). Synthetic Retrospective Studies and Related Topics. Biometrics 29, 479–486.

Rubin DB (1981). The Bayesian bootstrap. The Annals of Statistics 9, 130–134.

Saarela O, Arjas E (2014). Non-parametric Bayesian hazard regression for chronic disease risk assessment. Scandinavian Journal of Statistics. doi:10.1111/sjos.12125.

Saarela O, Hanley JA (2014). Case-base methods for studying vaccination safety. Biometrics. doi:10.1111/biom.12222.