Classifying Treatment Responders Under Causal Effect Monotonicity

June 12, 2019 ICML 2019 Classifying Treatment Responders Under Causal Effect Monotonicity Nathan Kallus ORIE and Cornell Tech, Cornell University

Heterogeneous Treatment Effect Estimation

| Features X (Age, Weight, BMI, SysBP) | T (Anticoagulant) | Y (Hemorrhage) |
| --- | --- | --- |
| 49, 106, 31 | Warfarin | 1 |
| 54, 89, 26 | None | 0 |
| 43, 130, 38 | None | 1 |
| ... | ... | ... |

Fit the CATE $\tau(X) = \mathbb{E}[Y(1) - Y(0) \mid X]$ to data on X, T, Y.

E.g.: Causal Forest (Wager & Athey ’17), TARNet (Shalit et al. ’17), ...
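As a minimal sketch of what fitting $\tau(X)$ means, assume a randomized treatment and a single discrete feature. The CATE can then be estimated by a difference in mean outcomes within each feature stratum. The toy data, the feature values, and the helper name `cate_by_stratum` are hypothetical. The methods cited on the slide (Causal Forest, TARNet) handle continuous, high-dimensional X and are not reproduced here.

```python
from collections import defaultdict


def cate_by_stratum(rows):
    """Estimate tau(x) = E[Y | T=1, X=x] - E[Y | T=0, X=x] for discrete x.

    rows: iterable of (x, t, y) with t, y in {0, 1}.
    Assumes T is randomized, so the within-stratum difference in means
    has a causal interpretation.
    """
    # x -> [sum of y among treated, n treated, sum of y among control, n control]
    sums = defaultdict(lambda: [0, 0, 0, 0])
    for x, t, y in rows:
        s = sums[x]
        if t == 1:
            s[0] += y
            s[1] += 1
        else:
            s[2] += y
            s[3] += 1
    # Skip strata that lack either treated or control units.
    return {x: s[0] / s[1] - s[2] / s[3] for x, s in sums.items() if s[1] and s[3]}


# Hypothetical toy data: (age_group, treated, outcome).
rows = [
    ("young", 1, 1), ("young", 1, 0), ("young", 0, 0), ("young", 0, 0),
    ("old", 1, 1), ("old", 1, 1), ("old", 0, 1), ("old", 0, 1),
]
tau_hat = cate_by_stratum(rows)
print(tau_hat)
```

In this toy data the treatment raises the outcome rate only in the "young" stratum, so the estimated effect differs across feature values, which is what "heterogeneous" refers to.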

Often Outcome is Binary

| Treatment (T) | Outcome Observed (Y) |
| --- | --- |
| Give anticoagulant | Hemorrhage? |
| Personalized discount | Buy? |
| Target job training | Employed in 6 months? |
| Homelessness prevention program | Re-enter? |
| Recidivism prevention program | Recidivate? |
| Support for minority CS students | Drop out? |

Often We Want to Predict Response

| Treatment (T) | Individual Label of Interest ($Y(1) - Y(0)$) |
| --- | --- |
| Give anticoagulant | Hemorrhage iff medicated |
| Personalized discount | Would buy iff discounted |
| Target job training | Would get job iff trained |
| Homelessness prevention program | Re-enter iff not targeted |
| Recidivism prevention program | Recidivate iff not targeted |
| Support for minority CS students | Drop out iff not targeted |

Classifying Responders: The Problem

◮ Each unit consists of
  ◮ Features $X$
  ◮ Potential outcomes $Y(1), Y(0) \in \{0, 1\}$
◮ “Non-responder” has $Y(0) = Y(1)$
  ◮ Would’ve bought (or not bought) regardless of discount
  ◮ Would’ve hemorrhaged (or not) regardless of anticoagulant
◮ “Responder” has $Y(1) = 1 > 0 = Y(0)$
  ◮ Would’ve bought if and only if offered discount
◮ $R = \mathbb{I}[Y(1) > Y(0)]$
◮ Ground truth NOT observed in X, T, Y data
◮ Want classifier $f : X \to \{0, 1\}$ with small loss

$$L_\theta(f) = \theta\, P(\text{false positive}) + (1 - \theta)\, P(\text{false negative}) = \theta\, P(f(X) = 1, R = 0) + (1 - \theta)\, P(f(X) = 0, R = 1).$$
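To make the loss concrete, the following sketch computes an empirical version of $L_\theta(f)$ on a handful of hypothetical units. It is only computable here because both potential outcomes are written down by hand; in real X, T, Y data the label R is never observed. The function name `weighted_loss` and all values are illustrative assumptions.

```python
def weighted_loss(preds, responders, theta):
    """Empirical L_theta: theta * P(false positive) + (1 - theta) * P(false negative)."""
    n = len(preds)
    false_pos = sum(1 for f, r in zip(preds, responders) if f == 1 and r == 0)
    false_neg = sum(1 for f, r in zip(preds, responders) if f == 0 and r == 1)
    return theta * false_pos / n + (1 - theta) * false_neg / n


# Hypothetical potential outcomes for six units (never jointly observed in practice).
y1 = [1, 1, 0, 1, 0, 1]
y0 = [0, 1, 0, 0, 0, 1]
R = [int(a > b) for a, b in zip(y1, y0)]  # responder indicator I[Y(1) > Y(0)]

f = [1, 1, 0, 0, 0, 0]  # predictions of some hypothetical classifier

loss = weighted_loss(f, R, theta=0.3)
print(R, loss)
```

Here the classifier makes one false positive (the second unit) and one false negative (the fourth unit), so the loss is $0.3 \cdot 1/6 + 0.7 \cdot 1/6 = 1/6$.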

Monotonicity

◮ Monotone treatment response assumption: $Y(1) \ge Y(0)$
  ◮ Discount never causes a would-be buyer to not buy
  ◮ Job training never causes someone to not get employed?
◮ Under monotonicity, $R = Y(1) - Y(0) \in \{0, 1\}$
◮ So, $P(R = 1 \mid X) = \tau(X) = \mathbb{E}[Y(1) - Y(0) \mid X]$
◮ $f(X) = \mathbb{I}[\tau(X) \ge \theta]$ minimizes $L_\theta(f)$
◮ Can take plug-in approach using any CATE estimator $\hat{\tau}$
◮ Question: any value to a direct classification approach?
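The threshold rule can be checked pointwise: given X, predicting 1 costs $\theta (1 - \tau(X))$ in expectation and predicting 0 costs $(1 - \theta)\, \tau(X)$, so predicting 1 is no worse exactly when $\tau(X) \ge \theta$. The sketch below confirms this by brute force on a hypothetical population with three feature values. The probabilities, the values of `tau`, and the helper `exact_loss` are assumptions for illustration, and `tau` stands in for whatever CATE estimate a plug-in approach would use.

```python
from itertools import product


def exact_loss(f, p_x, tau, theta):
    """Exact L_theta when P(R=1 | X=x) = tau[x] (monotonicity) and P(X=x) = p_x[x]."""
    return sum(
        p_x[x] * (theta * f[x] * (1 - tau[x]) + (1 - theta) * (1 - f[x]) * tau[x])
        for x in p_x
    )


# Hypothetical population with three feature values.
p_x = {"a": 0.5, "b": 0.3, "c": 0.2}
tau = {"a": 0.1, "b": 0.4, "c": 0.8}  # stands in for a CATE estimate
theta = 0.3

# Plug-in rule: classify as responder when tau(x) >= theta.
plug_in = {x: int(tau[x] >= theta) for x in tau}

# Brute force over all 2^3 classifiers to find the loss minimizer.
best = min(
    (dict(zip(p_x, bits)) for bits in product([0, 1], repeat=len(p_x))),
    key=lambda f: exact_loss(f, p_x, tau, theta),
)
print(plug_in, best, exact_loss(plug_in, p_x, tau, theta))
```

With an estimated $\hat{\tau}$ in place of the true $\tau$, the same thresholding step gives the plug-in classifier, though its loss then depends on how accurate the estimate is.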

Classifying Responders

◮ For simplicity, consider completely randomized data with $P(T = 1) = 0.5$
◮ Let $Z = \mathbb{I}[Y = T]$ (observable!)
  ◮ $R = 1 \implies Z = 1$
  ◮ $R = 0 \implies Z \sim \text{Bernoulli}(0.5)$
◮ $Z$ is like a corrupted observation of $R$
  ◮ Seeing $Z = 0$ is more informative about $R$
◮ Using $Z$ as a surrogate label for $R$ leads to new direct approaches to the classification problem
◮ Two instantiations of this are RespSVM, RespNet
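The two implications above can be checked in a small simulation. They also imply $P(Z = 1) = \tau + (1 - \tau)/2 = (1 + \tau)/2$, so the responder rate can be recovered as $2\, P(Z = 1) - 1$. The sketch below assumes monotonicity, a completely randomized treatment with $P(T = 1) = 0.5$, and a hypothetical responder rate of 0.3 with no features. It does not reproduce RespSVM or RespNet.

```python
import random

random.seed(0)
n = 20000
tau = 0.3  # hypothetical responder rate P(R = 1)

records = []
for _ in range(n):
    r = int(random.random() < tau)            # responder indicator (unobserved in practice)
    if r:
        y0, y1 = 0, 1                         # responder: Y(1) = 1 > 0 = Y(0)
    else:
        y0 = y1 = int(random.random() < 0.5)  # non-responder: Y(0) = Y(1)
    t = int(random.random() < 0.5)            # completely randomized, P(T = 1) = 0.5
    y = y1 if t else y0                       # observed outcome
    z = int(y == t)                           # observable surrogate label Z = I[Y = T]
    records.append((r, z))

# Among non-responders, Z should behave like a fair coin.
n_nonresp = sum(1 for r, _ in records if r == 0)
z_rate_nonresp = sum(z for r, z in records if r == 0) / n_nonresp

# Invert P(Z = 1) = (1 + tau) / 2 to recover the responder rate from Z alone.
z_mean = sum(z for _, z in records) / n
tau_from_z = 2 * z_mean - 1
print(z_rate_nonresp, tau_from_z)
```

Every unit with Z = 0 in this simulation is a non-responder, which is the sense in which seeing Z = 0 is the more informative observation.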

Empirical Results: Synthetic

[Figure: scatter plots of the synthetic data on axes from −3 to 3. Top panels: the true label R (responder vs. non-responder) and the observable label Z (Z = +1 vs. Z = −1). Bottom panels: T = +1 and T = 0, with points marked by (T, Y) combination: T = +1, Y = +1; T = −1, Y = +1; T = +1, Y = −1; T = −1, Y = −1.]

Empirical Results: Synthetic

[Figure: accuracy (0.5 to 1.0) against a log-scale horizontal axis (10^1 to 10^3), with panels for d = 2, d = 10, and d = 20. Top row: linear responder classification boundary. Bottom row: spherical responder classification boundary. Methods in the legend: RespSVM lin, RespSVM RBF, RespLR-gen, RespLR-disc, RespNet-gen, RespNet-disc, RF, CF, TARNet.]

Empirical Results: Census Data

◮ Predict whether the sex-at-birth of a mother’s first two kids being the same influences her decision to have a third
◮ Follows data construction by Angrist & Evans ’96
◮ Covariates: ethnicity of mother and father; their ages at marriage, at census, at 1st kid, and at 2nd kid; year of marriage; and education level

| Method | $L_\theta$ (in 0.01) | % 1st | % 2nd | % 3rd |
| --- | --- | --- | --- | --- |
| RespSVM lin | 49 ± 2.7 | 100% | | |
| RespLR-gen | 57 ± 2.4 | | 100% | |
| RespLR-disc | 58 ± 2.3 | | | 2% |
| LR | 58 ± 2.3 | | | 92% |
| RF | 58 ± 2.3 | | | 6% |

Thank you! Poster: Today 6:30pm @ Pacific Ballroom #74
