
Causal inference with missing values: Effect of tranexamic acid on mortality for head trauma patients
Julie Josse (INRIA XPOP - X), Imke Mayer
22 January 2019, Statistics seminar, Nice

Research activities
• Dimensionality reduction methods to visualize complex data (PCA based): multi-source, textual, arrays, questionnaires
• Low-rank estimation, selection of regularization parameters
• Missing values, matrix completion
• Causal inference
• Fields of application: bio-sciences (agronomy, sensory analysis), health data (hospital data)
• R community: book R for Stat, R Foundation, taskforce, packages:
  - FactoMineR: explore continuous, categorical, multiple contingency tables (correspondence analysis), combine clustering and PC, ...
  - missMDA: single and multiple imputation, PCA with missing values
  - denoiseR: denoise data with low-rank estimation
  - R-miss-tastic: missing values platform

Overview
1. Introduction
2. Causal inference
   • Inverse-propensity weighting
   • Double robust methods
3. Handling missing values
   • Single imputation with PCA
   • Supervised learning with missing values
   • Logistic regression with missing values
4. Results
5. Conclusion

Introduction

Collaborators
• Imke Mayer, Wei Jiang, Genevieve Robin, Polytechnique students, Jean-Pierre Nadal
• Traumabase (APHP): Tobias Gauss, Sophie Hamada, Jean-Denis Moyer
• Capgemini

Traumabase
15000 patients / 250 variables / 11 hospitals, from 2011 (4000 new patients / year)

     Center             Accident         Age  Sex  Weight  Height  BMI    BP   SBP
1    Beaujon            Fall             54   m    85      NR      NR     180  110
2    Lille              Other            33   m    80      1.8     24.69  130  62
3    Pitie Salpetriere  Gun              26   m    NR      NR      NR     131  62
4    Beaujon            AVP moto         63   m    80      1.8     24.69  145  89
6    Pitie Salpetriere  AVP bicycle      33   m    75      NR      NR     104  86
7    Pitie Salpetriere  AVP pedestrian   30   w    NR      NR      NR     107  66
9    HEGP               White weapon     16   m    98      1.92    26.58  118  54
10   Toulon             White weapon     20   m    NR      NR      NR     124  73
...

     SpO2  Temperature  Lactates  Hb    Glasgow  Transfusion  ...
1    97    35.6         <NA>      12.7  12       yes
2    100   36.5         4.8       11.1  15       no
3    100   36           3.9       11.4  3        no
4    100   36.7         1.66      13    15       yes
6    100   36           NM        14.4  15       no
7    100   36.6         NM        14.3  15       yes
9    100   37.5         13        15.9  15       yes
10   100   36.9         NM        13.7  15       no

⇒ Estimate the causal effect of administering the treatment "tranexamic acid" (within the first 3 hours after the accident) on mortality (outcome) for traumatic brain injury (TBI) patients.

Causal inference for traumatic brain injury with missing values
• 3050 patients with a brain injury (a lesion visible on the CT scan)
• Treatment: tranexamic acid (binary)
• Outcome: in-ICU death (binary), causes: brain death, withdrawal of care, head injury and multiple organ failure.
• 45 quantitative & categorical covariates selected by experts (Delphi process). Pre-hospital (blood pressure, patient's reactivity, type of accident, anamnesis, etc.) and hospital data.
[Figure: percentage of missing values (0 to 100) for each variable, from Choc.hemorragique, AIS.face and Trauma.cranien to Temperature.min and Delai.DC, broken down by missingness code: null.data, na.data, nr.data, nf.data, imp.data.]

Outline
⇒ Causal inference
   Causal inference methodology: estimate causal relationships between an intervention (tranexamic acid administration) and an outcome (mortality), when the study is potentially confounded by selection bias due to the absence of randomization.
⇒ How to handle missing values?
⇒ Causal inference with missing values, analysis of the data

Causal inference

Potential outcome framework (Rubin, 1974)
Causal effect: binary treatment $w \in \{0, 1\}$ on the $i$-th individual with potential outcomes $Y_i(1)$ and $Y_i(0)$. Individual causal effect of the treatment: $\Delta_i = Y_i(1) - Y_i(0)$
• Problem: $\Delta_i$ is never observed (only one outcome is observed per individual). Causal inference as a missing value problem?
• Average treatment effect (ATE): $\tau = \mathbb{E}[\Delta_i] = \mathbb{E}[Y_i(1) - Y_i(0)]$. The ATE is the difference between the average outcome had everyone been treated and the average outcome had nobody been treated.
⇒ First solution: estimate $\tau$ with randomized controlled trials (RCT).

Average treatment effect estimation in RCTs
Assumptions: observe $n$ iid samples $(Y_i, W_i)$, each satisfying:
• $Y_i = Y_i(W_i)$ (SUTVA)
• $W_i \perp\!\!\!\perp \{Y_i(0), Y_i(1)\}$ (random treatment assignment)
Difference-in-means estimator:
$\hat{\tau}_{DM} = \frac{1}{n_1} \sum_{W_i = 1} Y_i - \frac{1}{n_0} \sum_{W_i = 0} Y_i$
Properties of $\hat{\tau}_{DM}$: it is unbiased and $\sqrt{n}$-consistent, with
$\sqrt{n}\,(\hat{\tau}_{DM} - \tau) \xrightarrow[n \to \infty]{d} \mathcal{N}(0, V_{DM})$, where $V_{DM} = \frac{\mathrm{Var}(Y_i(0))}{\mathbb{P}(W_i = 0)} + \frac{\mathrm{Var}(Y_i(1))}{\mathbb{P}(W_i = 1)}$.
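The difference-in-means estimator can be illustrated on simulated data. The sketch below is only an illustration under stated assumptions: the data-generating process (Gaussian outcomes, a constant effect of 2, fair coin-flip assignment) and the function names are hypothetical and do not come from the talk.

```python
import math
import random
import statistics


def difference_in_means(y, w):
    """Return (tau_hat, standard_error) for outcomes y and binary treatments w."""
    treated = [yi for yi, wi in zip(y, w) if wi == 1]
    control = [yi for yi, wi in zip(y, w) if wi == 0]
    tau_hat = statistics.fmean(treated) - statistics.fmean(control)
    # Plug-in standard error: Var(Y | W=1) / n1 + Var(Y | W=0) / n0.
    se = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(control) / len(control)
    )
    return tau_hat, se


def simulate_rct(n, tau, seed=0):
    """Hypothetical RCT: W is a fair coin flip, independent of the potential outcomes."""
    rng = random.Random(seed)
    y, w = [], []
    for _ in range(n):
        y0 = rng.gauss(0.0, 1.0)         # potential outcome without treatment
        y1 = y0 + tau                    # potential outcome with treatment
        wi = 1 if rng.random() < 0.5 else 0
        w.append(wi)
        y.append(y1 if wi == 1 else y0)  # SUTVA: only Y_i(W_i) is observed
    return y, w


y, w = simulate_rct(n=20000, tau=2.0)
tau_hat, se = difference_in_means(y, w)
print(f"tau_hat = {tau_hat:.3f} (se = {se:.3f})")
```

Under these assumptions the estimate should land within a few standard errors of the simulated effect, and the plug-in standard error is the finite-sample counterpart of $\sqrt{V_{DM} / n}$.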

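Average treatment effect estimation in RCTs (regression adjustment)
Recall $\hat{\tau}_{DM} = \frac{1}{n_1} \sum_{W_i = 1} Y_i - \frac{1}{n_0} \sum_{W_i = 0} Y_i$. Furthermore assume a linear model for the two potential outcomes.
Linear assumptions: $n$ iid samples $(X_i, Y_i, W_i)$ with
• $Y_i(w) = c_{(w)} + X_i \beta_{(w)} + \varepsilon_i(w)$, $w \in \{0, 1\}$, i.e. $Y_i(w) = \mu_{(w)}(X_i) + \varepsilon_i(w)$
• $\mathbb{E}[\varepsilon_i(w) \mid X_i] = 0$ and $\mathrm{Var}(\varepsilon_i(w) \mid X_i) = \sigma^2$.
OLS estimator:
$\hat{\tau}_{OLS} = \hat{c}_{(1)} - \hat{c}_{(0)} + \bar{X} (\hat{\beta}_{(1)} - \hat{\beta}_{(0)}) = \frac{1}{n} \sum_{i} \left[ (\hat{c}_{(1)} + X_i \hat{\beta}_{(1)}) - (\hat{c}_{(0)} + X_i \hat{\beta}_{(0)}) \right] = \frac{1}{n} \sum_{i} \left( \hat{\mu}_{(1)}(X_i) - \hat{\mu}_{(0)}(X_i) \right)$
Properties of $\hat{\tau}_{OLS}$:
$\sqrt{n}\,(\hat{\tau}_{OLS} - \tau) \xrightarrow[n \to \infty]{d} \mathcal{N}(0, V_{OLS})$, and $V_{DM} = V_{OLS} + \| \beta_{(0)} + \beta_{(1)} \|_A^2$.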
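As a rough illustration of the regression-adjusted estimator, the sketch below fits one least-squares line per treatment arm and averages the predicted difference over all units. It assumes a single covariate and a hypothetical simulated RCT (intercepts 1 and 3, slopes 2 and 3, $\mathbb{E}[X] = 1$, so the simulated ATE is 3); none of these values come from the talk.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = c + beta * x; returns (c, beta)."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return y_bar - beta * x_bar, beta


def tau_ols(x, y, w):
    """Fit one regression per arm, then average the predicted difference over all X_i."""
    x1 = [xi for xi, wi in zip(x, w) if wi == 1]
    y1 = [yi for yi, wi in zip(y, w) if wi == 1]
    x0 = [xi for xi, wi in zip(x, w) if wi == 0]
    y0 = [yi for yi, wi in zip(y, w) if wi == 0]
    c1, b1 = fit_line(x1, y1)
    c0, b0 = fit_line(x0, y0)
    return c1 - c0 + statistics.fmean(x) * (b1 - b0)


def tau_dm(y, w):
    """Plain difference in means, for comparison."""
    treated = [yi for yi, wi in zip(y, w) if wi == 1]
    control = [yi for yi, wi in zip(y, w) if wi == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


# Hypothetical RCT with linear potential outcomes and unit noise variance.
rng = random.Random(1)
x, y, w = [], [], []
for _ in range(20000):
    xi = rng.gauss(1.0, 1.0)
    wi = 1 if rng.random() < 0.5 else 0
    y0i = 1.0 + 2.0 * xi + rng.gauss(0.0, 1.0)
    y1i = 3.0 + 3.0 * xi + rng.gauss(0.0, 1.0)
    x.append(xi)
    w.append(wi)
    y.append(y1i if wi == 1 else y0i)

est_ols = tau_ols(x, y, w)
est_dm = tau_dm(y, w)
print(f"OLS: {est_ols:.3f}  DM: {est_dm:.3f}  (simulated ATE = 3)")
```

Both estimators target the same quantity here; the regression adjustment mainly reduces variance, which is the point of the relation between $V_{DM}$ and $V_{OLS}$.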

Observational data. Non-random assignment: confounding
Mortality rate: 16% overall, 28% among treated patients, 13% among untreated patients. Does the treatment kill?

           Died             P(Outcome | Treatment)
Treated    0       1        0        1
FALSE      2225    340      0.867    0.133
TRUE       436     168      0.722    0.278

Strong indication for confounding factors that need to be controlled for.
[Figure: covariate balance plot. Absolute standardized mean differences between treated and control patients (x-axis, 0 to 1) in the unadjusted sample, one point per covariate (TP.pourcentage, Hb, Fibrinogene.1, ..., AIS.externe, Alcool).]
Treated patients are more severely injured, with a higher risk of death (graphical model).
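The rates quoted above follow directly from the counts in the contingency table. A short arithmetic check, where only the variable names are introduced for illustration:

```python
# Counts from the table above: rows are treatment status, columns are survived / died.
counts = {
    "untreated": {"survived": 2225, "died": 340},
    "treated": {"survived": 436, "died": 168},
}


def mortality(row):
    """Share of patients in the row who died."""
    return row["died"] / (row["survived"] + row["died"])


rate_untreated = mortality(counts["untreated"])
rate_treated = mortality(counts["treated"])

total_died = sum(row["died"] for row in counts.values())
total_n = sum(row["survived"] + row["died"] for row in counts.values())
rate_overall = total_died / total_n

# Naive (unadjusted) risk difference: not a causal effect, because of confounding.
risk_difference = rate_treated - rate_untreated

print(f"untreated: {rate_untreated:.3f}")
print(f"treated:   {rate_treated:.3f}")
print(f"overall:   {rate_overall:.3f}")
print(f"naive risk difference: {risk_difference:.3f}")
```

The naive difference of roughly 15 percentage points is exactly the quantity that cannot be read causally when sicker patients are more likely to be treated.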

Solutions to estimate ATE with observational data
• Matching: pair each treated (resp. untreated) patient with one or more similar untreated (resp. treated) patients (R package Matching, function Match)
• Inverse-propensity weighting: to adjust for biases in the treatment assignment
  [Figure: two panels, "Propensity Score before Weighting" and "Propensity Score after Weighting", showing the scaled distribution of pscore (x-axis, 0 to 1) by treatment group (0 / 1).]
• Double robust methods for model misspecifications: covariate balancing propensity score, augmented IPW (Robins et al., 1994)
• Regression adjustment, regression-adjusted matching, etc.
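To make inverse-propensity weighting and augmented IPW concrete, here is a minimal sketch on simulated confounded data. Everything in it is an assumption made for illustration: a single binary "severity" covariate, the chosen probabilities, a simulated treatment effect of -0.05 on the death probability, and nuisance functions estimated by simple within-stratum frequencies rather than by the models used in the talk.

```python
import random
from collections import defaultdict


def simulate(n, seed=0):
    """Hypothetical confounded data: severe patients are treated more often and die more often."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.3 else 0    # 1 = severe patient
        e = 0.6 if x == 1 else 0.1            # true propensity score e(x)
        w = 1 if rng.random() < e else 0
        p_death = 0.1 + 0.4 * x - 0.05 * w    # treatment lowers the risk by 0.05
        y = 1 if rng.random() < p_death else 0
        data.append((x, w, y))
    return data


def stratum_estimates(data):
    """Estimate e(x) and mu_w(x) = E[Y | X=x, W=w] by frequencies within each stratum."""
    n_x, n_xw, sum_y = defaultdict(int), defaultdict(int), defaultdict(int)
    for x, w, y in data:
        n_x[x] += 1
        n_xw[(x, w)] += 1
        sum_y[(x, w)] += y
    e_hat = {x: n_xw[(x, 1)] / n_x[x] for x in n_x}
    mu_hat = {key: sum_y[key] / n_xw[key] for key in n_xw}
    return e_hat, mu_hat


def naive(data):
    """Unadjusted difference in mortality between treated and untreated."""
    treated = [y for _, w, y in data if w == 1]
    control = [y for _, w, y in data if w == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw(data, e_hat):
    """Inverse-propensity weighted estimate of the ATE."""
    total = 0.0
    for x, w, y in data:
        e = e_hat[x]
        total += w * y / e - (1 - w) * y / (1 - e)
    return total / len(data)


def aipw(data, e_hat, mu_hat):
    """Augmented IPW: outcome-model prediction plus weighted residual correction."""
    total = 0.0
    for x, w, y in data:
        e, m1, m0 = e_hat[x], mu_hat[(x, 1)], mu_hat[(x, 0)]
        total += m1 - m0 + w * (y - m1) / e - (1 - w) * (y - m0) / (1 - e)
    return total / len(data)


data = simulate(50000)
e_hat, mu_hat = stratum_estimates(data)
est_naive = naive(data)
est_ipw = ipw(data, e_hat)
est_aipw = aipw(data, e_hat, mu_hat)
print(f"naive: {est_naive:+.3f}  IPW: {est_ipw:+.3f}  AIPW: {est_aipw:+.3f}  (simulated ATE = -0.05)")
```

In this toy setting the naive comparison suggests the treatment is harmful, while both weighted estimators recover a negative effect. Because both nuisance models are fully saturated here, IPW and AIPW coincide; with real covariates and fitted models they generally differ.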

Unconfoundedness and the propensity score
Assumptions
• $n$ iid samples $(X_i, Y_i, W_i)$,
• Treatment assignment is random conditionally on $X_i$: $\{Y_i(0), Y_i(1)\} \perp\!\!\!\perp W_i \mid X_i$ (unconfoundedness assumption).
Measure enough covariates to capture any dependence between $W_i$ and the potential outcomes (PO).
Propensity score: $e(x) = \mathbb{P}(W_i = 1 \mid X_i = x)$ for all $x \in \mathcal{X}$.
Key property: $e$ is a balancing score, i.e. under unconfoundedness it satisfies $\{Y_i(0), Y_i(1)\} \perp\!\!\!\perp W_i \mid e(X_i)$.
As a consequence, it suffices to control for $e(X)$ (rather than $X$) to remove biases associated with non-random treatment assignment.

Unconfoundedness and the propensity score (proof of the key property)
Proof. $W_i$ is binary, so its conditional distribution is fully specified by its conditional mean. Therefore it suffices to prove that
$\mathbb{E}[W_i \mid \{Y_i(0), Y_i(1)\}, X_i] = \mathbb{E}[W_i \mid X_i] \;\Rightarrow\; \mathbb{E}[W_i \mid \{Y_i(0), Y_i(1)\}, e(X_i)] = \mathbb{E}[W_i \mid e(X_i)]$
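The transcript stops before the argument is carried out. One standard way to finish it, sketched here with the tower property and using that $e(X_i)$ is a function of $X_i$ (this step is an addition, not the speakers' own wording):

```latex
\begin{align*}
\mathbb{E}\big[W_i \mid Y_i(0), Y_i(1), e(X_i)\big]
  &= \mathbb{E}\Big[\mathbb{E}\big[W_i \mid Y_i(0), Y_i(1), X_i\big] \,\Big|\, Y_i(0), Y_i(1), e(X_i)\Big] \\
  &= \mathbb{E}\Big[\mathbb{E}\big[W_i \mid X_i\big] \,\Big|\, Y_i(0), Y_i(1), e(X_i)\Big]
     && \text{(unconfoundedness)} \\
  &= \mathbb{E}\big[e(X_i) \mid Y_i(0), Y_i(1), e(X_i)\big] = e(X_i), \\
\mathbb{E}\big[W_i \mid e(X_i)\big]
  &= \mathbb{E}\Big[\mathbb{E}\big[W_i \mid X_i\big] \,\Big|\, e(X_i)\Big] = e(X_i).
\end{align*}
```

Both conditional means equal $e(X_i)$, so conditioning on the potential outcomes in addition to $e(X_i)$ does not change the distribution of $W_i$, which is the balancing property.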
