Generalizing inferences about failure-time outcomes from randomized - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Generalizing inferences about failure-time outcomes from randomized individuals to a target population

Sarah Robertson

Brown University sarah robertson@brown.edu

June 2, 2019

1 / 34

slide-2
SLIDE 2

Acknowledgments

PCORI awards ME-1306-03758 and ME-1502-27794, NIH grant R37 AI102634, AHRQ T32AGHS00001

Joint work with JA Steingrimsson, MA Hernán, IJ Dahabreh

2 / 34

slide-3
SLIDE 3

Problem: Clinical trials may have limited applicability

  • Even if a trial is run perfectly, its results often do not apply/extend to a target population
  • Target population = trial-eligible individuals
  • Want estimates of treatment effects in the target population, not just in the trial
  • Need methods to extend trial results by combining randomized and non-randomized data

3 / 34

slide-4
SLIDE 4

Using a nested study design to extend trial results

Nested study design: Trial embedded in a cohort of trial-eligible individuals (e.g., health-care system)∗

∗Image from Choudhry, 2017 NEJM 4 / 34

slide-5
SLIDE 5

Nested study design

[Figure: a hypothetical target population is sampled to form a cohort of eligible patients; the trial is nested within the cohort, and each cohort member either participates in the trial (yes) or does not (no)]

S = indicator for trial participation; A = treatment indicator; Y = outcome; X = baseline covariates

5 / 34

slide-6
SLIDE 6

Applied example of a nested study design

Coronary Artery Surgery Study (CASS), August 1975 to December 1996

Outcome: all-cause mortality

Study design:
  • Met trial-eligibility criteria (N=2099)
  • Randomized individuals (N=780): surgery (N=390), medical therapy (N=390)
  • Non-randomized individuals (N=1310)

Median follow-up time was 14 years

Some outcomes censored in the last intervals of the study

6 / 34

slide-7
SLIDE 7

CASS trial-only analysis: survival curves (Kaplan-Meier)

[Figure: survival probability (0.6 to 1.0) versus time in years (4 to 16) for the Surgery and Medical arms. Panel labels: Trial-only, IPW.]

7 / 34

slide-8
SLIDE 8

CASS trial-only analysis: survival probabilities

16-year survival probability, %
Estimator    Surgery              Medical
Trial-only   66.3 (61.5, 71.1)    65.8 (61.5, 71.1)

  • Counterfactual survival analysis: survival status, $Y_t^{a,c=0}$, at time interval t, under treatment a and no censoring (c = 0)
  • Want to compare counterfactual survival curves in the target population, with follow-up over time intervals j = 1, ..., t (e.g., follow-up in yearly intervals)
  • Use methods that adjust for informative censoring/drop-out

8 / 34

slide-9
SLIDE 9

Observed and missing data in the clinical trial

Yj: indicator for having the event by interval j
Cj: indicator for being censored by interval j

Consider one trial participant (id 1), followed for j = 1, 2 years

[Figure: data layout by individual (id) and interval (j), with columns S, X, A, Cj, Yj, $Y^{0,c=0}$, $Y^{1,c=0}$]

Goal: use the observed data (and assumptions) to identify the counterfactuals $Y^{0,c=0}$ and $Y^{1,c=0}$
  • Assume no treatment switching and no measurement error in the trial
  • Drop-out occurs in the trial

9 / 34

slide-14
SLIDE 14

Identifying the causal effect in the trial

[Figure: data layout by individual (id) and interval (j), with columns S, X, A, Cj, Yj, $Y^{0,c=0}$, $Y^{1,c=0}$]

  • Consistency: the observed outcome equals the counterfactual outcome under the assigned treatment and no censoring in an interval
  • Positivity of treatment assignment: nonzero probability of being randomized to either treatment
  • Positivity of being observed during follow-up: nonzero probability of being observed
  • Non-informative censoring: the counterfactual outcome is independent of censoring status given covariates
  • Exchangeability over treatment: the counterfactual outcome is independent of treatment given covariates

14 / 34

slide-16
SLIDE 16

Additional assumptions for generalizing trial results

[Figure: data layout by individual (id) and interval (j), with columns S, X, A, Cj, Yj, $Y^{0,c=0}$, $Y^{1,c=0}$]

  • Positivity of trial participation: nonzero probability of participating
  • Exchangeability over trial participation: know enough factors that determine the outcome so that trial participation itself is unimportant

16 / 34

slide-17
SLIDE 17

Methods for extrapolating trial results

  • Missing data problem in the non-randomized
  • Estimate the effect of the intervention had it been applied to the target population
  • 3 classes of estimators∗:
      • Outcome model-based estimator, g-formula computation
      • Probability of trial participation estimator, IPW
      • Doubly robust estimator, DR

∗Dahabreh et al., Biometrics 2018, and https://arxiv.org/abs/1805.00550 17 / 34

slide-18
SLIDE 18

Outcome model-based estimator (g-formula)

  • Regression-based extrapolation using pooled logistic regression
  • Model the hazard of the outcome (dying) in each interval among those remaining in the risk set, conditional on baseline covariates, in each treatment arm of the trial
  • Represent time flexibly (e.g., squared term), include time and treatment interactions (allow non-proportionality)
  • Predict over all trial-eligible individuals and marginalize to get estimated counterfactual survival at different intervals in the target population
  • Consistent when outcome model is correctly specified
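The modeling step above needs the trial data in person-period (long) form before a pooled logistic regression can be fit. The following is a minimal sketch of that expansion, not the authors' code: the function names, the input coding (`event_interval`, `censor_interval`, with 0 meaning "never"), and the toy values are all assumptions made for illustration. The resulting design matrix could then be passed to any logistic regression routine, fit separately in each treatment arm.

```python
import numpy as np

def to_person_period(event_interval, censor_interval, n_intervals):
    """Expand one record per person into one row per person-interval at risk.

    event_interval[i]: interval in which person i had the event (0 if none).
    censor_interval[i]: interval in which person i was censored (0 if never).
    A row is kept for interval j only if the person is event-free through
    j - 1 and uncensored through j (the risk set of the hazard model).
    """
    person, interval, y = [], [], []
    for i, (ev, ce) in enumerate(zip(event_interval, censor_interval)):
        for j in range(1, n_intervals + 1):
            if ce != 0 and j >= ce:
                break  # censored by interval j: no longer observed
            person.append(i)
            interval.append(j)
            y.append(int(ev == j))
            if ev == j:
                break  # event occurred: leaves the risk set
    return np.array(person), np.array(interval), np.array(y)

def design_matrix(person, interval, X):
    """Columns: intercept, time, time squared, then baseline covariates."""
    t = interval.astype(float)
    return np.column_stack([np.ones_like(t), t, t ** 2, X[person]])

# Hypothetical toy data: 3 trial participants in one arm, 3 yearly intervals.
event_interval = [2, 0, 0]    # person 0 dies in interval 2
censor_interval = [0, 3, 0]   # person 1 is censored in interval 3
X = np.array([[50.0], [61.0], [47.0]])  # one made-up baseline covariate
person, interval, y = to_person_period(event_interval, censor_interval, 3)
D = design_matrix(person, interval, X)
```

Each row of `D` pairs one at-risk interval with that person's baseline covariates, and `y` is the interval-specific event indicator that the hazard model would be fit to.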

18 / 34

slide-19
SLIDE 19

Baseline covariates in CASS

Variable                  Level     Non participants   Trial participants
                                                       Surgery        Medical
N                                   955                368            363
Age, years                          50.9 (7.7)         51.4 (7.2)     50.9 (7.4)
Angina                    None      195 (20.4)         83 (22.6)      81 (22.3)
                          Present   760 (79.6)         285 (77.4)     282 (77.7)
History of MI             No        406 (42.5)         159 (43.2)     135 (37.2)
                          Yes       549 (57.5)         209 (56.8)     228 (62.8)
LAD % obstruction                   39.1 (38.7)        36.4 (38.0)    34.9 (37.0)
Left ventricular score              7.1 (2.7)          7.4 (2.9)      7.3 (2.8)
Diseased vessels                    347 (36.3)         146 (39.7)     133 (36.6)
                          ≥ 1       608 (63.7)         222 (60.3)     230 (63.4)
Ejection fraction, %                60.2 (12.3)        60.9 (13.1)    59.8 (12.8)

Results presented as mean (standard deviation) for continuous variables and count (%) for discrete variables.
CASS = Coronary Artery Surgery Study; LAD = left anterior descending coronary artery; MI = myocardial infarction.

19 / 34

slide-20
SLIDE 20

CASS re-analysis: survival curves (OM)

Survival curves in the trial vs survival curves in the target population

[Figure: survival probability (0.6 to 1.0) versus time in years (4 to 16) for the Surgery and Medical arms. Panel labels: Trial-only; OM, IPW, AIPW.]

20 / 34

slide-21
SLIDE 21

CASS re-analysis: survival probabilities (OM)

16-year survival probability, %
Estimator    Surgery              Medical
Trial-only   66.3 (61.5, 71.1)    65.8 (61.5, 71.1)
OM           63.9 (58.6, 69.0)    64.0 (58.9, 69.0)

21 / 34

slide-22
SLIDE 22

Inverse participation weighting estimator (IPW)

  • Weighted Kaplan-Meier estimator
  • Weights depend on correctly specifying models for the probability of participation and the probability of being censored in each interval
  • Estimate the participation model conditional on baseline covariates
  • Estimate the censoring model among trial participants, conditional on remaining in the risk set and on baseline covariates, in each treatment arm

22 / 34

slide-23
SLIDE 23

CASS re-analysis: survival curves (OM, IPW)

[Figure: survival probability (0.6 to 1.0) versus time in years (4 to 16) for the Surgery and Medical arms. Panels: Trial-only, OM, IPW, AIPW.]

23 / 34

slide-24
SLIDE 24

CASS re-analysis: survival probabilities (OM, IPW)

16-year survival probability, %
Estimator    Surgery              Medical
Trial-only   66.3 (61.5, 71.1)    65.8 (61.5, 71.1)
OM           63.9 (58.6, 69.0)    64.0 (58.9, 69.0)
IPW          63.5 (58.3, 68.8)    63.6 (58.5, 68.8)

24 / 34

slide-25
SLIDE 25

Doubly robust/ Augmented inverse probability weighting estimator (AIPW)

  • Combines the working models from the outcome model-based estimator and IPW, giving two opportunities for valid inference
  • At least as efficient as IPW, in large samples, when all models are correctly specified
  • Necessary when the working models are estimated with machine learning methods

25 / 34

slide-26
SLIDE 26

CASS re-analysis: survival curves (OM, IPW, AIPW)

[Figure: survival probability (0.6 to 1.0) versus time in years (4 to 16) for the Surgery and Medical arms. Panels: Trial-only, OM, IPW, AIPW.]

26 / 34

slide-27
SLIDE 27

CASS re-analysis: survival probabilities (OM, IPW, AIPW)

16-year survival probability, %
Estimator    Surgery              Medical
Trial-only   66.3 (61.5, 71.1)    65.8 (61.5, 71.1)
OM           63.9 (58.6, 69.0)    64.0 (58.9, 69.0)
IPW          63.5 (58.3, 68.8)    63.6 (58.5, 68.8)
AIPW         63.7 (58.5, 69.0)    63.7 (58.6, 68.8)

27 / 34

slide-28
SLIDE 28

Conclusions

  • Three different estimators for extrapolating trial results to a target population
  • Estimators rely on different modeling assumptions (model assessment + robustness)
  • Validity depends on explicit but untestable assumptions
  • Should be combined with formal sensitivity analyses
  • Useful for randomized trials nested in large healthcare systems, but can be modified for non-nested designs

28 / 34

slide-29
SLIDE 29

Appendix: Identifying the causal effect in the trial

  • Consistency: $Y_j = Y_j^{a,c=0}$ when $A = a$ and $C_j = 0$
  • Positivity of treatment assignment: $0 < \Pr[A = a \mid X, S = 1] < 1$
  • Positivity of being observed during follow-up: $\Pr[C_j = 0 \mid X = x, S = 1, A = a, Y_{j-1} = 0, C_{j-1} = 0] > 0$
  • Non-informative censoring: $Y_j^{a} \perp\!\!\!\perp C_j \mid X, S = 1, A = a, Y_{j-1} = 0, C_{j-1} = 0$
  • Exchangeability over treatment: $Y_j^{a} \perp\!\!\!\perp A \mid X, S = 1$

29 / 34

slide-30
SLIDE 30

Appendix: Additional assumptions for generalizing trial results

  • Positivity of trial participation: $\Pr[S = 1 \mid X] > 0$
  • Exchangeability over trial participation: $Y_j^{a} \perp\!\!\!\perp S \mid X$ for each $j = 1, \ldots, t$
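Combined with the trial assumptions on the previous appendix slide, these two conditions point to a g-formula style expression for counterfactual survival in the target population. The display below is a sketch written in the notation of the slides rather than a formula quoted from the talk. It assumes $Y_j$ and $C_j$ are coded as cumulative ("by interval j") indicators, and the outer expectation is taken over the distribution of $X$ among all trial-eligible individuals.

```latex
\[
\Pr\left[ Y_t^{a} = 0 \right]
  = \mathrm{E}\left[ \prod_{j=1}^{t}
      \left\{ 1 - \Pr\left[ Y_j = 1 \mid X, S = 1, A = a, Y_{j-1} = 0, C_j = 0 \right] \right\}
    \right]
\]
```

Replacing the inner conditional probability with a fitted hazard and the outer expectation with a sample average over the cohort gives an expression of the same form as the outcome model-based estimator in the appendix.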

30 / 34

slide-31
SLIDE 31

Appendix: Data structure and notation

Use discrete-time structure, so follow-up time is split into evenly spaced intervals j = 1, . . . , J, where J is the administrative end of the study

Notation

The observed data is: $(X_i, S_i, S_i \times A_i, S_i \times Y_{j,i}, S_i \times C_{j,i})$; $j = 0, \ldots, J$; $i = 1, \ldots, n$
  • $n$: total number of individuals
  • $S$: indicator for being randomized
  • $X$: vector of baseline covariates
  • $A$: the (randomized) treatment
  • $Y_{j,i}$: indicator for having the event by interval j
  • $C_{j,i}$: indicator for being censored by interval j

31 / 34

slide-32
SLIDE 32

Appendix: Outcome model-based estimator

Model the hazard of the outcome among S = 1 and A = a, e.g., using a pooled logistic regression:

\[
\hat{\theta}_{OM}(a, t) = \frac{1}{n} \sum_{i=1}^{n} \prod_{j=1}^{t} \left\{ 1 - \hat{h}(a, j; X_i) \right\},
\]

where $\hat{h}(a, j; X)$ is an estimator for $\Pr[Y_j = 1 \mid X, S = 1, A = a, Y_{j-1} = 0, C_j = 0]$
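The marginalization in this estimator can be sketched in a few lines once a fitted hazard is available. The code below is an illustration under assumptions, not the authors' implementation: `theta_om` and `toy_hazard` are hypothetical names, and the coefficients inside `toy_hazard` are made up to stand in for a fitted pooled logistic regression.

```python
import numpy as np

def theta_om(hazard, X, a, t):
    """Average over all n individuals of prod_{j<=t} {1 - h(a, j; X_i)}.

    hazard(a, j, X) must return an array of n estimated discrete hazards.
    X holds baseline covariates for ALL trial-eligible individuals.
    """
    surv = np.ones(X.shape[0])
    for j in range(1, t + 1):
        surv = surv * (1.0 - hazard(a, j, X))
    return float(surv.mean())

def toy_hazard(a, j, X):
    """Hypothetical fitted hazard; every coefficient here is made up."""
    lin = -3.0 + 0.05 * j - 0.2 * a + 0.02 * (X[:, 0] - 50.0)
    return 1.0 / (1.0 + np.exp(-lin))

# Made-up covariate values for four trial-eligible individuals.
X_all = np.array([[45.0], [50.0], [58.0], [63.0]])
s1 = theta_om(toy_hazard, X_all, a=1, t=16)  # estimated survival under a = 1
s0 = theta_om(toy_hazard, X_all, a=0, t=16)  # estimated survival under a = 0
```

Predicting for every row of `X_all`, including non-randomized individuals, is what carries the trial-based hazard model over to the target population.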

32 / 34

slide-33
SLIDE 33

Appendix: Inverse participation weighting estimator (IPW)

Similar to a weighted Kaplan-Meier:

\[
\hat{\theta}_{IPW}(a, t) = \prod_{j=1}^{t} \left\{ 1 - \frac{\sum_{i=1}^{n} I(S_i = 1, A_i = a, Y_{j,i} = 1, C_{j,i} = 0)\, \hat{w}(a, j; X_i)}{\sum_{i=1}^{n} I(S_i = 1, A_i = a, C_{j,i} = 0)\, \hat{w}(a, j; X_i)} \right\},
\]

with weights

\[
\hat{w}(a, j; X_i) = \left\{ \hat{p}(X_i)\, \hat{e}(a; X_i) \prod_{m=1}^{j} \left[ 1 - \hat{\ell}(a, m; X_i) \right] \right\}^{-1},
\]

  • $\hat{p}(X_i)$ estimates $\Pr[S = 1 \mid X]$
  • $\hat{e}(a; X_i)$ estimates $\Pr[A = a \mid X, S = 1]$
  • $\hat{\ell}(a, m; X_i)$ estimates $\Pr[C_m = 1 \mid X, S = 1, A = a, Y_{m-1} = 0, C_{m-1} = 0]$
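A minimal sketch of the weighted Kaplan-Meier computation follows, assuming the nuisance estimates are already available as arrays. The function name `theta_ipw`, the array layout, and the toy data are assumptions for illustration. The sketch also restricts each interval's sums to individuals who are event-free through the previous interval, which is one reading of the risk set when $Y_j$ is a cumulative indicator and is not stated explicitly on the slide.

```python
import numpy as np

def theta_ipw(a, t, S, A, Y, C, p_hat, e_hat, ell_hat):
    """Weighted Kaplan-Meier estimate of survival to interval t under arm a.

    Y, C: (n, J) arrays of cumulative event / censoring indicators.
    p_hat: (n,) estimates of Pr[S = 1 | X].
    e_hat: (n,) estimates of Pr[A = a | X, S = 1].
    ell_hat: (n, J) estimated censoring hazards for arm a.
    """
    n = len(S)
    in_arm = (S == 1) & (A == a)
    uncensored = np.cumprod(1.0 - ell_hat, axis=1)  # prod_{m<=j} (1 - ell)
    surv = 1.0
    for j in range(1, t + 1):
        w = 1.0 / (p_hat * e_hat * uncensored[:, j - 1])
        y_prev = Y[:, j - 2] if j > 1 else np.zeros(n)
        # Assumed risk set: in arm a, event-free through j - 1, uncensored at j.
        at_risk = in_arm & (y_prev == 0) & (C[:, j - 1] == 0)
        events = at_risk & (Y[:, j - 1] == 1)
        surv *= 1.0 - np.sum(w[events]) / np.sum(w[at_risk])
    return float(surv)

# Hypothetical toy cohort: 8 participants (4 per arm), 4 non-participants.
S = np.array([1] * 8 + [0] * 4)
A = np.array([1, 1, 1, 1, 0, 0, 0, 0, -1, -1, -1, -1])  # -1: not randomized
Y = np.zeros((12, 2), dtype=int)
Y[0] = [1, 1]   # arm 1, event in interval 1
Y[1] = [0, 1]   # arm 1, event in interval 2
Y[4] = [1, 1]   # arm 0, event in interval 1
C = np.zeros((12, 2), dtype=int)        # no censoring in this toy example
p_hat = np.full(12, 2.0 / 3.0)          # made-up participation probabilities
e_hat = np.full(12, 0.5)                # made-up randomization probabilities
ell_hat = np.zeros((12, 2))             # made-up censoring hazards
ipw_1 = theta_ipw(1, 2, S, A, Y, C, p_hat, e_hat, ell_hat)
ipw_0 = theta_ipw(0, 2, S, A, Y, C, p_hat, e_hat, ell_hat)
```

With constant weights, as in this toy cohort, the estimator reduces to the ordinary Kaplan-Meier estimate within each arm; the weights only matter once the participation or censoring models vary with the covariates.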

33 / 34

slide-34
SLIDE 34

Appendix: Doubly robust/ Augmented inverse probability weighted estimator

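\[
\hat{\theta}_{AIPW}(a, t) = \hat{\theta}_{OM}(a, t) + \frac{1}{n} \sum_{i=1}^{n} \hat{Z}(a, t; X_i, A_i, S_i),
\]

where

\[
\hat{Z}(a, t; X_i, A_i, S_i) = - \sum_{j \le t} I(S_i = 1, A_i = a)\, \hat{w}(a, j; X_i)\, \frac{\prod_{m=1}^{t} \left[ 1 - \hat{h}(a, m; X_i) \right]}{\prod_{m=1}^{j} \left[ 1 - \hat{h}(a, m; X_i) \right]} \times I(Y_{j-1,i} = 0, C_{j,i} = 0) \left\{ I(Y_{j,i} = 1) - \hat{h}(a, j; X_i) \right\}.
\]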
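The augmented estimator can be sketched by adding a weighted residual term to the outcome model-based average. The code below is an illustration under assumptions: the function name `theta_aipw`, the array layout, and the toy data are hypothetical, and the hazard and weight estimates are supplied as ready-made arrays rather than fit from data.

```python
import numpy as np

def theta_aipw(a, t, S, A, Y, C, p_hat, e_hat, ell_hat, h_hat):
    """Outcome-model average plus the mean of the augmentation term Z.

    Y, C: (n, J) cumulative event / censoring indicators.
    h_hat, ell_hat: (n, J) estimated outcome and censoring hazards for arm a,
    evaluated for every individual in the cohort.
    """
    n = len(S)
    surv_j = np.cumprod(1.0 - h_hat[:, :t], axis=1)   # prod_{m<=j} (1 - h)
    surv_t = surv_j[:, -1]                            # prod_{m<=t} (1 - h)
    uncensored = np.cumprod(1.0 - ell_hat[:, :t], axis=1)
    in_arm = (S == 1) & (A == a)
    Z = np.zeros(n)
    for j in range(1, t + 1):
        w = 1.0 / (p_hat * e_hat * uncensored[:, j - 1])
        y_prev = Y[:, j - 2] if j > 1 else np.zeros(n)
        at_risk = in_arm & (y_prev == 0) & (C[:, j - 1] == 0)
        resid = (Y[:, j - 1] == 1).astype(float) - h_hat[:, j - 1]
        Z -= at_risk * w * (surv_t / surv_j[:, j - 1]) * resid
    return float(surv_t.mean() + Z.mean())

# Hypothetical toy cohort: 8 participants (4 per arm), 4 non-participants.
S = np.array([1] * 8 + [0] * 4)
A = np.array([1, 1, 1, 1, 0, 0, 0, 0, -1, -1, -1, -1])  # -1: not randomized
Y = np.zeros((12, 2), dtype=int)
Y[0] = [1, 1]   # arm 1, event in interval 1
Y[1] = [0, 1]   # arm 1, event in interval 2
C = np.zeros((12, 2), dtype=int)        # no censoring in this toy example
p_hat = np.full(12, 2.0 / 3.0)          # matches the 8 / 12 participation rate
e_hat = np.full(12, 0.5)                # matches 1:1 randomization
ell_hat = np.zeros((12, 2))
h_hat = np.full((12, 2), 0.1)           # deliberately poor outcome model
aipw_1 = theta_aipw(1, 2, S, A, Y, C, p_hat, e_hat, ell_hat, h_hat)
om_only = float(np.prod(1.0 - h_hat, axis=1).mean())
```

In this toy cohort the outcome model alone gives 0.81, while the augmented estimate returns the arm's observed two-interval survival of 0.5 because the weights are correct; this is the "two opportunities for valid inference" idea in miniature.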

34 / 34