

SLIDE 1

Bayesian Causal Inference in High Dimensional Data Settings

Jacob Spertus and Sharon-Lise Normand

Harvard Medical School & School of Public Health
Funded by R01-GM111339 and U01-FDA004493
Thanks to Organizer: Mariel Finucane, Mathematica Policy Research

June 2017

sharon@hcp.med.harvard.edu (HMS) AcademyHealth 2017 June 2017 1 / 14

SLIDE 2

Introduction

OUTLINE

Motivating problem
What has been done
What we add to the literature
Revisit motivating problem
Concluding remarks
Thanks


SLIDE 3

Introduction Motivating problem

DRUG ELUTING (DES) VERSUS BARE METAL (BMS) CORONARY STENTS

DES (approved 2003) and BMS (approved in the 1990s) are frequently implanted to keep treated arteries clear & supported
DES improves target-vessel revascularization (TVR) more than BMS
DES associated with late stent thrombosis (death)
Have 9000 patients and 500 confounders
Do DES cause fewer revascularizations compared to BMS?

MASSACHUSETTS, 2011

Characteristic               BMS     DES
Outcomes, %
  1-Year Mortality           10.2    3.3
  1-Year TVR                  9.0    6.5
Confounders
  Age, yrs                   66.4   63.7
  STEMI, %                   35.7   18.2
  Cardiomyopathy or LVSD, %  11.1    8.4
  Emergent, %                38.3   20.3
  Shock, %                    3.8    0.8

STEMI = ST-elevated myocardial infarction; LVSD = left ventricular systolic dysfunction
Spertus and Normand, 2017 (under review)

SLIDE 4

Introduction Motivating problem

CONSIDERATIONS

Observational setting: no randomization
Combination of clinical and claims data

Clinical data obtained by trained data managers in each hospital
Approximately 131 confounders from clinical data (demographics, pre-existing conditions, presentation severity, procedural characteristics)
Approximately 500 confounders in claims data (Present on Admission codes)
Sparsity: considered only claims diagnoses coded as present for at least 10 patients

Goals: avoid strong parametric specifications, adhere to the ignorable treatment assignment assumption, and adopt a design-based approach
Final dataset: 8718 patients and 495 potential confounders
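The sparsity screen described above can be sketched in a few lines. Everything here (the 0/1 indicator matrix, the dimensions, the rarity of the codes) is simulated purely for illustration; this is not the authors' preprocessing code.

```python
import numpy as np

# Minimal sketch of the sparsity screen: keep a claims diagnosis code
# as a candidate confounder only if it is coded as present for at
# least 10 patients. Data and dimensions are made up.
rng = np.random.default_rng(0)
n_patients, n_codes = 9000, 500

# Simulated 0/1 Present-on-Admission indicator matrix (most codes rare).
X_claims = (rng.random((n_patients, n_codes)) < 0.002).astype(int)

min_count = 10
keep = X_claims.sum(axis=0) >= min_count   # boolean mask over columns
X_kept = X_claims[:, keep]

print(f"retained {keep.sum()} of {n_codes} claims codes")
```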


SLIDE 5

What has been done

PROPENSITY SCORES

With high-dimensional data

Schneeweiss et al. (2009) proposed an algorithm for the high-dimensional setting

But there is no accounting for uncertainty in variable selection, & the outcome is used to inform the propensity score model

Bayesian?

McCandless et al., 2009: uses the propensity score as a latent variable and jointly models the latent score and the outcome model
Kaplan and Chen, 2012: 2-step approach (propensity score model, then a parametric outcome model)
Saarela et al., 2015: marginal model specification coupled with inverse probability of treatment weighting; 2-step procedure

†Zigler and Dominici, 2014: include the propensity score as a linear predictor in the outcome model & use Bayesian model averaging

†Only Bayesian method proposed in the high-dimensional setting

SLIDE 6

What we add

APPROACH

WANT: Δ = E(Y_DES) − E(Y_BMS)

Two-step approach:
1. Estimate the propensity score model via regularization
2. Assume a binomial likelihood for the binary outcome, weighted to generate a pseudo-population

Not Bayesian per se
Incorporates uncertainty from propensity score estimation
Addresses the large-k problem
Simple diagnostic tools to assess balancing properties
Maintains separation between treatment and outcome
Outcome model does not assume a parametric function of treatment


SLIDE 7

What we add Models

MODELS (T Binary Treatment, Y Binary Outcome, X Confounders, k Large)

Step 1: Treatment Model

T_i ~ Bern(π(X_i))
π(X_i) = logit⁻¹( β_0 + Σ_{j=1}^{k} β_j X_{ij} )
π(X_i) = propensity score

Priors required for β_j
Typically centered at 0
Horseshoe prior (Carvalho et al., 2010):
  β_j ~ N(0, λ_j² τ²)
  λ_j, τ ~ Cauchy⁺(0, 1)
Mimics Bayesian Model Averaging (with heavy-tailed discrete mixtures)
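The horseshoe specification can be illustrated by sampling from the prior itself. This is a minimal sketch with invented seed and dimensions, not the slide's fitting code; it only shows the prior's characteristic shape: most coefficients are shrunk near zero, while the heavy Cauchy tails occasionally produce large draws that escape shrinkage.

```python
import numpy as np

# Sample from the horseshoe prior:
#   beta_j | lambda_j, tau ~ N(0, lambda_j^2 * tau^2)
#   lambda_j, tau ~ half-Cauchy(0, 1)
rng = np.random.default_rng(1)
k = 100_000

tau = abs(rng.standard_cauchy())         # global shrinkage scale
lam = np.abs(rng.standard_cauchy(k))     # local scales, one per coefficient
beta = rng.normal(0.0, lam * tau)        # conditionally normal coefficients

share_small = np.mean(np.abs(beta) < 0.1 * tau)
print(f"fraction of draws with |beta_j| < 0.1 * tau: {share_small:.2f}")
```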


SLIDE 8

What we add Models

MODELS (T Binary Treatment, Y Binary Outcome, X Confounders, k Large)

Step 2: Outcome Model

Y_T | n_T, p_T, π(X) ~ Binomial(n_T, p_T)
p_T | α_T0, α_T1 ~ Beta(α_T0, α_T1)

n_T = number of subjects receiving treatment T
p_T = probability of outcome under treatment T
Independent samples in the treatment groups

A posteriori: p_T | Y, π(X) ~ Beta(a_T, b_T)


SLIDE 9

What we add Models

MODEL (T Binary Treatment, Y Binary Outcome, X Confounders, k Large)

Posterior Distribution: p_T | Y, π(X) ~ Beta(a_T, b_T)

a_1 = α_11 + γ_1 Σ_{i=1}^{n} T_i Y_i / π_i
b_1 = α_10 + γ_1 Σ_{i=1}^{n} T_i (1 − Y_i) / π_i
a_0 = α_00 + γ_0 Σ_{i=1}^{n} (1 − T_i) Y_i / (1 − π_i)
b_0 = α_01 + γ_0 Σ_{i=1}^{n} (1 − T_i) (1 − Y_i) / (1 − π_i)

(each summand, e.g. T_i Y_i / π_i, is an inverse-probability weight applied to the outcome indicator)

Renormalization terms:
γ_1 = Σ_{i=1}^{n} T_i / Σ_{i=1}^{n} (T_i / π_i)
γ_0 = Σ_{i=1}^{n} (1 − T_i) / Σ_{i=1}^{n} ((1 − T_i) / (1 − π_i))
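The closed-form Beta posteriors lend themselves to a direct numerical sketch. Everything below is an assumption for illustration: the data are simulated with a null treatment effect, the priors are taken flat (all α's = 1), and the propensity scores are stand-ins; in the actual analysis π would be a posterior draw from the Step-1 treatment model.

```python
import numpy as np

# Sketch of Step 2: compute a_T, b_T from the formulas above and draw
# from the resulting Beta posteriors for p_1, p_0, and Delta.
rng = np.random.default_rng(2)
n = 1000
pi = rng.uniform(0.2, 0.8, n)    # stand-in propensity scores
T = rng.binomial(1, pi)          # binary treatment
Y = rng.binomial(1, 0.1, n)      # binary outcome (null effect)

# Renormalization terms gamma_1 and gamma_0.
g1 = T.sum() / (T / pi).sum()
g0 = (1 - T).sum() / ((1 - T) / (1 - pi)).sum()

# Beta posterior parameters, with flat Beta(1, 1) priors.
a1 = 1 + g1 * (T * Y / pi).sum()
b1 = 1 + g1 * (T * (1 - Y) / pi).sum()
a0 = 1 + g0 * ((1 - T) * Y / (1 - pi)).sum()
b0 = 1 + g0 * ((1 - T) * (1 - Y) / (1 - pi)).sum()

# Posterior draws of p_1, p_0, and the causal contrast Delta.
p1 = rng.beta(a1, b1, 5000)
p0 = rng.beta(a0, b0, 5000)
delta = p1 - p0
print(f"posterior mean of Delta: {delta.mean():.3f}")
```

Note that the renormalization makes the pseudo-counts sum to the actual group sizes: a_1 + b_1 equals n_1 plus the prior mass, which is what justifies the Binomial(n_T, p_T) likelihood on the weighted pseudo-population.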

SLIDE 10

What we add Operating characteristics

DOES UNCERTAINTY IN STEP 1 MATTER?

500 simulations, n = 1000, k = 100 (18 β_j = 0), P(Y_i = 1) ≈ 0.10
BART = Bayesian Additive Regression Trees

                              Δ̂ Bias   95% CI Width   95% CI Coverage
Integrated Propensity Score
  Student-t_3(0, 3²)           −.011       .220            95.2%
  Horseshoe Priors              .016       .110            93.0%
  BART                          .011       .123            96.8%
Mean Propensity Score
  Student-t_3(0, 3²)           −.001       .095            79.2%
  Horseshoe Priors              .018       .092            86.0%
  BART                          .015       .093            87.2%
Other Methods
  Naive Estimate                .030       .092            73.0%
  IPW                          −.001       .151            92.8%
  TMLE                          .006       .075            81.4%

Bottom Line: useful to integrate over the propensity score distribution for large k
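The bottom line can be illustrated with a toy simulation. Here the "posterior draws" of the propensity score are faked as noisy logit perturbations and the data are simulated with a null effect, so this is a sketch of the mechanism (integrating over propensity draws captures variability that a plug-in at the posterior-mean score ignores), not a replication of the table.

```python
import numpy as np

# Compare the estimand integrated over propensity-score draws against
# a plug-in at the mean score. All data and "draws" are invented.
rng = np.random.default_rng(3)
n, n_draws = 1000, 500

true_pi = rng.uniform(0.2, 0.8, n)
T = rng.binomial(1, true_pi)
Y = rng.binomial(1, 0.1, n)      # same outcome rate in both arms

# Fake "posterior draws" of each propensity score on the logit scale.
logit = np.log(true_pi / (1 - true_pi))
pi_draws = 1 / (1 + np.exp(-(logit + rng.normal(0, 0.5, (n_draws, n)))))

def delta_hat(pi):
    """Renormalized (Hajek-style) IPW estimate of E(Y_1) - E(Y_0)."""
    p1 = (T * Y / pi).sum() / (T / pi).sum()
    p0 = ((1 - T) * Y / (1 - pi)).sum() / ((1 - T) / (1 - pi)).sum()
    return p1 - p0

deltas = np.array([delta_hat(p) for p in pi_draws])   # integrated
delta_mean_pi = delta_hat(pi_draws.mean(axis=0))      # plug-in at mean score

print(f"spread of Delta across propensity draws: {deltas.std():.4f}")
print(f"plug-in Delta at posterior-mean score:   {delta_mean_pi:.4f}")
```

The plug-in value is a single number, so an interval built around it omits the spread visible in `deltas`, which is the mechanism behind the undercoverage in the Mean Propensity Score rows.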


SLIDE 11

What we add Causal estimates for BMS vs DES

WEIGHTED STANDARDIZED MEAN DIFFERENCES IN CONFOUNDERS


SLIDE 12

What we add Causal estimates for BMS vs DES

1-YEAR TARGET VESSEL REVASCULARIZATION

[Forest plot: 1-year target vessel revascularization difference (%), axis from −8 to 2; frequentist estimators (Naive, IPW, TMLE) and Bayesian estimators (t(0,9), HShoe, BART); left of zero favours DES, right favours BMS]



SLIDE 14

Conclusions

SUMMARY OF OUR APPROACH

Advantages
  Not fully Bayesian
  Propensity score is simple, widely used, and familiar
  Easy to assess balancing properties
  Fixing the propensity score at its posterior mean:
    Good coverage for small k
    Underestimates variance for large k
    Some bias
  Maintains separation between treatment and outcome models
  Honestly reflects uncertainty

Disadvantages
  Not fully Bayesian
  Robustness?
  Binomial likelihood
  Balancing property may be lost when incorporating uncertainty

Thank you!
