Doubly robust treatment effect estimation with missing attributes

Effect of tranexamic acid on mortality of patients with traumatic brain injury
Imke Mayer, Julie Josse, Stefan Wager, Tobias Gauss, Jean-Denis Moyer
EHESS; École Polytechnique; Stanford Business School; Traumabase® Group
Statistique, Mathématique et Applications, Fréjus, 3 September 2019

Introduction

Traumabase
• 20,000 patients
• 250 continuous and categorical variables: heterogeneous
• 16 hospitals: multilevel data
• 4,000 new patients per year
Center   Accident  Age  Sex  Weight  Lactates  BP   Shock  ...
Beaujon  fall      54   m    85      NM        180  yes
Pitie    gun       26   m    NR      NA        131  no
Beaujon  moto      63   m    80      3.9       145  yes
Pitie    moto      30   w    NR      Imp       107  no
HEGP     knife     16   m    98      2.5       118  no
...
⇒ Estimate causal effect: administration of the treatment "tranexamic acid" (within 3 hours after the accident) on the outcome mortality for traumatic brain injury patients.

Missing values
[Figure: bar chart of the percentage of missing values (0 to 100%) for each Traumabase variable, split by type of missing value: Impossible, Not Applicable, Not made, Not Informed, NA. Variables shown include AIS.head, TBI, Tranexamic.acid, Death.in.ICU, Glasgow.discharge and Cause.of.death.]

Causal inference: classical framework

Potential outcome framework (Neyman, 1923; Rubin, 1974)
Causal effect
Binary treatment $w \in \{0, 1\}$ on the $i$-th individual with potential outcomes $Y_i(1)$ and $Y_i(0)$.
Individual causal effect of the treatment: $\Delta_i = Y_i(1) - Y_i(0)$
• Problem: $\Delta_i$ is never observed (only one outcome is observed per individual). Causal inference as a missing value problem?
Covariates       Treatment  Outcome(s)
X1    X2   X3    W          Y(0)  Y(1)
1.1   20   F     1          NA    T
-6    45   F     0          F     NA
0     15   M     1          NA    F
...   ...  ...   ...        ...   ...
-2    52   M     0          T     NA
• Average treatment effect (ATE) $\tau = E[\Delta_i] = E[Y_i(1) - Y_i(0)]$: the ATE is the difference between the average outcome had everyone been treated and the average outcome had nobody been treated.
⇒ First solution: estimate $\tau$ with randomized controlled trials (RCT).
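To see why randomization is a solution, a short derivation may help. It is a sketch under two assumptions not spelled out on this slide: the observed outcome satisfies $Y_i = W_i Y_i(1) + (1 - W_i) Y_i(0)$, and in an RCT the assignment $W_i$ is independent of the potential outcomes.

```latex
\begin{align*}
E[Y_i \mid W_i = 1] - E[Y_i \mid W_i = 0]
  &= E[Y_i(1) \mid W_i = 1] - E[Y_i(0) \mid W_i = 0]
     && \text{(observed outcome = potential outcome of the arm)} \\
  &= E[Y_i(1)] - E[Y_i(0)]
     && \text{(randomization: } W_i \perp\!\!\!\perp \{Y_i(0), Y_i(1)\}\text{)} \\
  &= \tau .
\end{align*}
```

Under these assumptions the plain difference in group means is unbiased for $\tau$; the second equality is exactly the step that fails with observational data.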

Observational data
Non-random assignment → confounding
Mortality rate: 20% overall; treated 38%; not treated 16%: treatment kills?
                     survived      deceased    Pr(survived | treatment)  Pr(deceased | treatment)
TA not administered  2,167 (68%)   399 (13%)   0.84                      0.16
TA administered      374 (12%)     228 (7%)    0.62                      0.38
Table 1: Occurrence and frequency table for traumatic brain injury patients (total number: 3,168).
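The rates quoted on this slide can be recomputed from the four counts in Table 1. The snippet below is only an arithmetic check; the variable and function names are illustrative, and the "naive risk difference" is the unadjusted comparison that confounding makes misleading.

```python
# Counts from Table 1 (traumatic brain injury patients, n = 3,168).
counts = {
    "TA not administered": {"survived": 2167, "deceased": 399},
    "TA administered": {"survived": 374, "deceased": 228},
}


def mortality(group):
    """Share of deceased patients within one treatment group."""
    c = counts[group]
    return c["deceased"] / (c["survived"] + c["deceased"])


total = sum(c["survived"] + c["deceased"] for c in counts.values())
deaths = sum(c["deceased"] for c in counts.values())

overall = deaths / total
treated = mortality("TA administered")
untreated = mortality("TA not administered")
# Unadjusted comparison: treated minus untreated mortality.
naive_difference = treated - untreated

print(f"overall {overall:.2f}, treated {treated:.2f}, untreated {untreated:.2f}")
print(f"naive risk difference {naive_difference:.2f}")
```

The unadjusted difference is about 22 percentage points against the treatment, which is what prompts the "treatment kills?" question.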

Unconfoundedness and the propensity score
Assumptions
• $n$ iid samples $(X_i, Y_i, W_i)$,
• $Y_i = W_i Y_i(1) + (1 - W_i) Y_i(0)$ (SUTVA)
• Treatment assignment is random conditionally on $X_i$: $\{Y_i(0), Y_i(1)\} \perp\!\!\!\perp W_i \mid X_i$ ≡ unconfoundedness assumption.
Propensity score and overlap assumption
$e(x) \triangleq P(W_i = 1 \mid X_i = x) \quad \forall x \in \mathcal{X}$.
We will assume overlap, i.e. $0 < e(x) < 1 \quad \forall x \in \mathcal{X}$.
Key property
$e$ is a balancing score, i.e. under unconfoundedness, it satisfies $\{Y_i(0), Y_i(1)\} \perp\!\!\!\perp W_i \mid e(X_i)$.
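The three assumptions can be combined in a short worked derivation showing why weighting by the inverse propensity score recovers a mean potential outcome. This is a sketch that uses only the assumptions listed on this slide, with the true propensity score $e$ rather than an estimate.

```latex
\begin{align*}
E\!\left[\frac{W_i Y_i}{e(X_i)}\right]
  &= E\!\left[\frac{W_i Y_i(1)}{e(X_i)}\right]
     && \text{(SUTVA: } W_i Y_i = W_i Y_i(1)\text{)} \\
  &= E\!\left[\frac{E[W_i \mid X_i]\, E[Y_i(1) \mid X_i]}{e(X_i)}\right]
     && \text{(tower property, unconfoundedness)} \\
  &= E\big[E[Y_i(1) \mid X_i]\big] = E[Y_i(1)]
     && \text{(overlap: } e(X_i) = E[W_i \mid X_i] > 0\text{)}
\end{align*}
```

The symmetric argument with $(1 - W_i) / (1 - e(X_i))$ gives $E[Y_i(0)]$, so the difference of the two weighted means identifies $\tau$.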

Propensity-based estimators
Inverse Propensity Weighted estimator
$$\hat\tau_{IPW} \triangleq \frac{1}{n} \sum_{i=1}^{n} \left( \frac{W_i Y_i}{\hat e(X_i)} - \frac{(1 - W_i) Y_i}{1 - \hat e(X_i)} \right)$$
Augmented IPW: a doubly robust estimator
Define $\mu_{(w)}(x) := E[Y_i(w) \mid X_i = x]$.
$$\hat\tau_{AIPW} := \frac{1}{n} \sum_{i=1}^{n} \left( \hat\mu_{(1)}(X_i) - \hat\mu_{(0)}(X_i) + W_i \frac{Y_i - \hat\mu_{(1)}(X_i)}{\hat e(X_i)} - (1 - W_i) \frac{Y_i - \hat\mu_{(0)}(X_i)}{1 - \hat e(X_i)} \right)$$
is consistent if either the $\hat\mu_{(w)}(x)$ are consistent or $\hat e(x)$ is consistent.
⇒ The AIPW has better statistical properties than IPW (Robins et al., 1994; Chernozhukov et al., 2018).
⇒ Possibility to use any (machine learning) procedure such as random forests, deep nets, etc. to estimate $\hat e(x)$ and $\hat\mu_{(w)}(x)$ without harming the interpretability of the causal effect estimation. R package grf (Athey et al., 2019)
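The two estimators can be written out directly from their formulas. The sketch below is not the grf implementation: it is a pure-Python illustration on simulated data, where the data-generating process, the true effect of 2.0 and all function names are assumptions made for the example. The nuisance functions are plugged in as oracle or deliberately wrong values to show the double robustness property.

```python
import math
import random


def ipw_ate(y, w, e_hat):
    """Inverse propensity weighted estimate of the ATE."""
    terms = (wi * yi / ei - (1 - wi) * yi / (1 - ei)
             for yi, wi, ei in zip(y, w, e_hat))
    return sum(terms) / len(y)


def aipw_ate(y, w, e_hat, mu1_hat, mu0_hat):
    """Augmented IPW estimate: outcome-model contrast plus weighted residuals."""
    total = 0.0
    for yi, wi, ei, m1, m0 in zip(y, w, e_hat, mu1_hat, mu0_hat):
        total += m1 - m0 + wi * (yi - m1) / ei - (1 - wi) * (yi - m0) / (1 - ei)
    return total / len(y)


# Hypothetical simulation: one confounder x drives both treatment and outcome.
rng = random.Random(0)
n, tau = 20000, 2.0
x = [rng.gauss(0, 1) for _ in range(n)]
e_true = [1 / (1 + math.exp(-xi)) for xi in x]          # true propensity score
w = [1 if rng.random() < ei else 0 for ei in e_true]
y = [xi + tau * wi + rng.gauss(0, 1) for xi, wi in zip(x, w)]
mu0_true = x                                             # E[Y(0) | X = x]
mu1_true = [xi + tau for xi in x]                        # E[Y(1) | X = x]

n_treated = sum(w)
tau_naive = (sum(yi for yi, wi in zip(y, w) if wi) / n_treated
             - sum(yi for yi, wi in zip(y, w) if not wi) / (n - n_treated))
tau_ipw = ipw_ate(y, w, e_true)
# Wrong propensity (constant 0.5) but correct outcome models.
tau_aipw_wrong_e = aipw_ate(y, w, [0.5] * n, mu1_true, mu0_true)
# Correct propensity but wrong outcome models (identically zero).
tau_aipw_wrong_mu = aipw_ate(y, w, e_true, [0.0] * n, [0.0] * n)

print(tau_naive, tau_ipw, tau_aipw_wrong_e, tau_aipw_wrong_mu)
```

In this simulation the naive difference in means overshoots the true effect because treated units have larger x, while each AIPW variant stays close to 2.0 as long as one of the two nuisance components is correct.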

Causal inference: with missing attributes?

Unconfoundedness with missing attributes?
Without any changes to the previous framework, the only straightforward (but generally biased) solution is complete-case analysis.
With potential outcomes:
Covariates       Treatment  Outcome(s)
X1    X2   X3    W          Y(0)  Y(1)
NA    20   F     1          NA    T
-6    45   NA    0          F     NA
0     NA   M     1          NA    F
NA    32   F     1          NA    T
1     63   M     1          NA    F
-2    NA   M     0          T     NA
As observed:
Covariates       Treatment  Outcome
X1    X2   X3    W          Y
NA    20   F     1          T
-6    45   NA    0          F
0     NA   M     1          F
NA    32   F     1          T
1     63   M     1          F
-2    NA   M     0          T
→ Often not a good idea! What are the alternatives?
Two families of methods
• Unconfoundedness despite missingness
• Classical missing values mechanisms (MCAR, MAR, MNAR; Rubin, 1976)
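What complete-case analysis does to the toy table can be made concrete in a few lines. The rows below are the six observed rows as read from the slide, with None standing for NA; the helper name is illustrative.

```python
# Toy table from the slide: (X1, X2, X3, W, Y); None stands for NA.
rows = [
    (None, 20, "F", 1, "T"),
    (-6, 45, None, 0, "F"),
    (0, None, "M", 1, "F"),
    (None, 32, "F", 1, "T"),
    (1, 63, "M", 1, "F"),
    (-2, None, "M", 0, "T"),
]


def is_complete(row):
    """True when none of the three covariates is missing."""
    return all(value is not None for value in row[:3])


# Complete-case analysis keeps only the fully observed rows.
complete_cases = [row for row in rows if is_complete(row)]
n_control_kept = sum(1 for row in complete_cases if row[3] == 0)

print(len(complete_cases), "of", len(rows), "rows kept")
print(n_control_kept, "control units kept")
```

On this toy table a single treated row survives and no control row does, so no treated-versus-control comparison is possible at all; on real data the loss is less extreme but the remaining sample is generally not representative.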

Unconfoundedness with missing attributes?
Unconfoundedness despite missingness
Adapt the initial assumptions so that treatment assignment is unconfounded given only the observed information, that is, the observed covariates and the response pattern.

Unconfoundedness with missing attributes?
Notations
• response pattern $R \in \{\mathrm{NA}, 1\}^p$, $R_j \triangleq \mathbb{1}\{X_j \text{ is observed}\} + \mathrm{NA}\, \mathbb{1}\{X_j \text{ is missing}\}$,
• $X^* = R \odot X \in (\mathbb{R} \cup \{\mathrm{NA}\})^p$
Unconfoundedness despite missingness
Treatment is unconfounded given $X^*$:
$$\{Y_i(1), Y_i(0)\} \perp\!\!\!\perp W_i \mid X_i^*, \qquad (1)$$
or alternatively:
$$\{Y_i(1), Y_i(0)\} \perp\!\!\!\perp W_i \mid X_i, R_i, \quad \text{and} \quad \begin{cases} \text{CIT: } W_i \perp\!\!\!\perp X_i \mid X_i^*, R_i \\ \text{or} \\ \text{CIO: } Y_i(t) \perp\!\!\!\perp X_i \mid X_i^*, R_i \text{ for } t \in \{0, 1\} \end{cases} \qquad (2)$$
[Figure: causal graphs over X, X*, R, W and Y(w) under (a) CIT and (b) CIO.]
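The notation for the response pattern and the masked covariates can be illustrated on a single individual. This is a minimal sketch: None plays the role of NA, and the covariate vector and function names are hypothetical.

```python
NA = None  # stand-in for a missing entry


def mask(x, r):
    """X* = R ⊙ X: keep X_j where R_j = 1, NA where R_j = NA."""
    return [xj if rj == 1 else NA for xj, rj in zip(x, r)]


def response_pattern(x_star):
    """Recover R from X*: 1 for observed entries, NA for missing ones."""
    return [1 if v is not NA else NA for v in x_star]


x = [0.0, 15.0, 3.2]   # hypothetical complete covariate vector
r = [1, NA, 1]         # the second covariate is missing
x_star = mask(x, r)

print(x_star)
print(response_pattern(x_star))
```

Since R can be read off X*, conditioning on X* already conditions on the response pattern, which is why assumption (1) only needs X* on the right-hand side.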

Generalized propensity score and random forests
Generalized propensity score (Rosenbaum and Rubin, 1984)
$$e^*(X^*) = P(W = 1 \mid X^*).$$
→ Balances treatment and control groups on the observed information $X^*$ when values are missing, under assumption (1).
→ Random forests can incorporate missing values directly since they handle semi-discrete variables (e.g. $X^* \in (\mathbb{R} \cup \{\mathrm{NA}\})^p$).
→ With a specific encoding of missing values (MIA, missing incorporated in attributes), splits are possible either on observed variables or on the response pattern (Josse et al., 2019).
→ Recursively find the partition that minimizes the empirical risk. For every covariate $X_j$ and threshold $z$, there are three possibilities:
$\{X_j^* \le z \text{ or } X_j^* = \mathrm{NA}\}$ vs $\{X_j^* > z\}$
$\{X_j^* \le z\}$ vs $\{X_j^* > z \text{ or } X_j^* = \mathrm{NA}\}$
$\{X_j^* \ne \mathrm{NA}\}$ vs $\{X_j^* = \mathrm{NA}\}$
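The three kinds of splits can be sketched for a single covariate. The code below is an illustration only, not the grf or Josse et al. implementation: it assumes a squared-error risk, uses midpoints between observed values as candidate thresholds, and represents NA by None; all names are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_mia_split(xs, ys):
    """Best split of one covariate under MIA.

    xs: covariate values, None standing for NA; ys: responses.
    Returns (risk, kind, threshold); threshold is None for the pattern split.
    """
    pairs = list(zip(xs, ys))
    observed = sorted({x for x in xs if x is not None})
    thresholds = [(a + b) / 2 for a, b in zip(observed, observed[1:])]
    missing = [y for x, y in pairs if x is None]
    candidates = []
    for z in thresholds:
        low = [y for x, y in pairs if x is not None and x <= z]
        high = [y for x, y in pairs if x is not None and x > z]
        # Possibility 1: NA sent with the left child.
        candidates.append((sse(low + missing) + sse(high), "NA goes left", z))
        # Possibility 2: NA sent with the right child.
        candidates.append((sse(low) + sse(high + missing), "NA goes right", z))
    # Possibility 3: split on the response pattern itself.
    present = [y for x, y in pairs if x is not None]
    candidates.append((sse(present) + sse(missing), "observed vs NA", None))
    return min(candidates, key=lambda c: c[0])


# Missing entry behaves like a large value: it is sent right at z = 2.5.
split_a = best_mia_split([1.0, 2.0, None, 3.0, 4.0], [0.0, 0.0, 1.0, 1.0, 1.0])
# Missingness itself is informative: the pattern split is selected.
split_b = best_mia_split([1.0, 2.0, None, None, 3.0, 4.0],
                         [0.0, 0.0, 5.0, 5.0, 0.0, 0.0])
print(split_a)
print(split_b)
```

The second example shows why splitting on the response pattern matters: when the fact that a value is missing carries information about the response, no threshold on the observed values can separate the two groups.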
