Causal Inference and Response Surface Modeling Inference - PowerPoint PPT Presentation

Causal Inference and Response Surface Modeling
Inference and Representation, DS-GA-1005, Fall 2015
Guest lecturer: Uri Shalit

What is Causal Inference?
source: xkcd.com/552/

Causal questions as counterfactual questions
• Does this medication improve patients' health?
  – Counterfactual: taking vs. not taking
• Is the new design bringing more customers?
  – Counterfactual: new design vs. old design
• Is online teaching better than in-class?
  – Counterfactual: …

Potential Outcomes Framework (Rubin's Causal Model)
• Each unit (patient, customer, student, cell culture) has two potential outcomes: $(y^0, y^1)$
  – $y^0$ is the potential outcome had the unit not been treated: "control outcome"
  – $y^1$ is the potential outcome had the unit been treated: "treatment outcome"
• Treatment effect for unit $i$ = $y_i^1 - y_i^0$
• Often interested in the mean or expected treatment effect

Hypothetical example – effect of fish oil supplement on blood pressure (Hill & Gelman)

| Unit    | female | age | treatment | potential outcome $y_i^0$ | potential outcome $y_i^1$ | observed outcome $y_i$ |
|---------|--------|-----|-----------|---------------------------|---------------------------|------------------------|
| Audrey  | 1      | 40  | 0         | 140                       | 135                       | 140                    |
| Anna    | 1      | 40  | 0         | 140                       | 135                       | 140                    |
| Bob     | 0      | 50  | 0         | 150                       | 140                       | 150                    |
| Bill    | 0      | 50  | 0         | 150                       | 140                       | 150                    |
| Caitlin | 1      | 60  | 1         | 160                       | 155                       | 155                    |
| Cara    | 1      | 60  | 1         | 160                       | 155                       | 155                    |
| Dave    | 0      | 70  | 1         | 170                       | 160                       | 160                    |
| Doug    | 0      | 70  | 1         | 170                       | 160                       | 160                    |

Source: Jennifer Hill

Mean($y_i^1 - y_i^0$) = -7.5
Mean($y_i$ | treatment=1) - Mean($y_i$ | treatment=0) = 12.5
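The two numbers on this slide can be reproduced directly from the table. The sketch below is only an illustration: the variable names are made up here, and the data rows are copied from the table above.

```python
from statistics import mean

# Rows copied from the table: (unit, treatment, y0, y1)
units = [
    ("Audrey", 0, 140, 135),
    ("Anna", 0, 140, 135),
    ("Bob", 0, 150, 140),
    ("Bill", 0, 150, 140),
    ("Caitlin", 1, 160, 155),
    ("Cara", 1, 160, 155),
    ("Dave", 1, 170, 160),
    ("Doug", 1, 170, 160),
]

# True mean treatment effect: needs BOTH potential outcomes for every unit,
# which is only possible in a hypothetical example like this one.
true_effect = mean(y1 - y0 for _, _, y0, y1 in units)

# What we actually observe: y1 for treated units, y0 for control units.
observed = [(t, y1 if t == 1 else y0) for _, t, y0, y1 in units]

# Naive comparison of group means on the observed outcomes.
naive_difference = (
    mean(y for t, y in observed if t == 1)
    - mean(y for t, y in observed if t == 0)
)

print(true_effect, naive_difference)  # -7.5 12.5
```

The naive comparison even has the wrong sign, because the treated units in this table are all older than the control units.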

The fundamental problem of causal inference:
We only ever observe one of the two outcomes
• How to deal with The Problem:
  – Close substitutes
  – Randomization
  – Statistical Adjustment

Fundamental Problem (I): Close Substitutes
• Does chemical X corrode material M? Create a piece of material M, break it in two. Place chemical X on one piece.
• Does removing meat from my diet reduce my weight? My weight before the diet is a close substitute for what my weight would be after the diet period had I not gone on the new diet
• Separated twin studies.
What assumptions have we made here?

Fundamental Problem (II): Randomization
• Assume the outcomes are generated from a distribution.
• Therefore if we sample enough times, we can estimate the mean effect:
  – Obtain a sample of the items of interest. Assign half to treatment and half to control, at random
  – This yields two sets of observed outcomes: $y_1^0, \ldots, y_n^0$ (control) and $y_{n+1}^1, \ldots, y_{2n}^1$ (treatment)
  – Average each set; the difference between the two averages estimates the mean effect
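A minimal simulation sketch of the randomization argument follows. Everything in it is an assumption made for illustration: the population is synthetic, every unit is given the same true effect of -7.5, and the sample size and seed are arbitrary.

```python
import random

random.seed(0)

TRUE_EFFECT = -7.5  # assumed constant effect, for illustration only
n = 5000

# Hypothetical population of 2n units, each with both potential outcomes (y0, y1).
population = []
for _ in range(2 * n):
    y0 = random.gauss(150, 10)
    population.append((y0, y0 + TRUE_EFFECT))

# Random assignment: shuffle, then split in half.
random.shuffle(population)
control, treated = population[:n], population[n:]

# Only one potential outcome is observed per unit.
control_obs = [y0 for y0, _ in control]
treated_obs = [y1 for _, y1 in treated]

# Difference of the two group averages estimates the mean effect.
estimate = sum(treated_obs) / n - sum(control_obs) / n
print(round(estimate, 2))
```

Because assignment is random, the two groups have the same distribution of $y^0$ on average, so the difference in means is close to the assumed effect.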

Fundamental Problem (III): Statistical Adjustment
• Sometimes we can't find close substitutes, and can't randomize, for example:
  – Non-compliance: some of the people did not follow the new diet prescribed in the experiment.
  – Ethical: does breathing asbestos cause cancer?
  – Impractical: do stricter gun laws lead to safer communities?
  – Retrospective: we have data from the past, for example educational attainment and college attendance.
• Control and treatment populations are different

Fundamental Problem (III): Statistical Adjustment
• Treatment and control groups are not similar – what can we do?
  – Estimate the outcomes using a model, such as linear regression, random forests, BART (later today). Known as Response Surface Modeling
  – Divide the sample into similar subgroups
  – Re-weight the units to be more representative
Today we will focus on statistical adjustment with response surface modeling

Response Surface Modeling: Linear Regression
True model:
$$y_i = \beta_0 + \beta_1 T_i + \beta_2 x_i + \varepsilon_i$$
Fit without the confounding variable $x_i$:
$$y_i = \beta_0^* + \beta_1^* T_i + \varepsilon_i$$
Represent $x_i$ as a function of $T_i$:
$$x_i = \gamma_0 + \gamma_1 T_i + \theta_i$$
Obtain:
$$\beta_1^* = \beta_1 + \beta_2 \gamma_1$$
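The result follows by substituting the expression for $x_i$ into the true model, which gives a coefficient of $\beta_1 + \beta_2 \gamma_1$ on $T_i$. The sketch below checks this numerically on synthetic data. The coefficient values, noise scales, sample size and the helper `ols_slope` are all assumptions chosen for illustration.

```python
import random

random.seed(1)

# Assumed coefficients for the illustration (not from the slides).
beta0, beta1, beta2 = 1.0, 2.0, 3.0
gamma0, gamma1 = 0.5, 1.5
n = 20000

T, x, y = [], [], []
for _ in range(n):
    t = random.randint(0, 1)
    xi = gamma0 + gamma1 * t + random.gauss(0, 1)             # x_i as a function of T_i
    yi = beta0 + beta1 * t + beta2 * xi + random.gauss(0, 1)  # true model
    T.append(t)
    x.append(xi)
    y.append(yi)


def ols_slope(u, v):
    """Slope from a simple least-squares regression of v on u (with intercept)."""
    mu = sum(u) / len(u)
    mv = sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var = sum((a - mu) ** 2 for a in u)
    return cov / var


# Regression of y on T alone, omitting the confounder x.
beta1_star = ols_slope(T, y)

print(round(beta1_star, 2), beta1 + beta2 * gamma1)  # estimate vs. 6.5
```

The fitted slope lands near $\beta_1 + \beta_2 \gamma_1 = 6.5$ rather than the true $\beta_1 = 2$, so the omitted confounder is absorbed into the treatment coefficient.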

When will this work?
• No hidden confounders
• Model is correct
• Both assumptions are patently false. How can we make them less false?

[Figure: causal graph with nodes for the hidden confounder $h$, the observed confounder $x$, the treatment $T$, and the observed outcome $y$]

Pearl's do-calculus and structural equation modeling
[Figure: structural equation model with noise variables $U_T$, $U_x$, $U_y$]
• Treatment: $T = f_T(x, U_T)$
• Observed confounder: $x = f_x(U_x)$
• Observed outcome: $y = f_y(x, T, U_y)$

Pearl's do-calculus and structural equation modeling
[Figure: the same model after intervening on the treatment, with noise variables $U_x$, $U_y$]
• Treatment: $T = t$
• Observed confounder: $x = f_x(U_x)$
• Observed outcome: $y = f_y(x, t, U_y)$
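As a rough sketch of the difference between the two slides, the structural equations can be written as ordinary functions, with the intervention replacing $f_T$ by a constant $t$. The specific functional forms and noise distributions below are invented for illustration and are not taken from the lecture.

```python
import random

random.seed(2)

# Hypothetical structural equations (assumed forms, for illustration only).
def f_x(u_x):
    return u_x

def f_T(x, u_t):
    return 1 if x + u_t > 0 else 0

def f_y(x, t, u_y):
    return 2.0 * t + 3.0 * x + u_y

n = 20000
observational = []  # (T, y) pairs where T follows its own structural equation
do_differences = []  # y under do(T=1) minus y under do(T=0), same unit and noise

for _ in range(n):
    u_x, u_t, u_y = (random.gauss(0, 1) for _ in range(3))
    x = f_x(u_x)

    # Observational regime: treatment depends on the confounder x.
    t = f_T(x, u_t)
    observational.append((t, f_y(x, t, u_y)))

    # Interventional regime: T is set to t directly, ignoring f_T.
    do_differences.append(f_y(x, 1, u_y) - f_y(x, 0, u_y))

y_treated = [y for t, y in observational if t == 1]
y_control = [y for t, y in observational if t == 0]
naive = sum(y_treated) / len(y_treated) - sum(y_control) / len(y_control)
effect_under_do = sum(do_differences) / n

print(round(naive, 2), round(effect_under_do, 2))
```

Under the intervention the effect equals the coefficient 2.0 assumed in `f_y`, while the observational contrast is inflated because `x` drives both the treatment and the outcome.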

Response Surface Modeling
• We wish to model $U_x$, $f_x(U_x)$, $U_y$, and $f_y(U_y, x, t)$.
• In principle any regression method can work: use $t = T_i$ as a feature, predict for both $T_i = 0$ and $T_i = 1$.
• Linear regression is far too weak for most problems of interest!
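A minimal sketch of the "treatment as a feature, predict both arms" recipe, using plain least squares on the observed columns of the fish oil example (treatment, age, observed outcome). The helper functions `solve` and `fit_ols` are written here for illustration, and using age as the only covariate is an assumption of this sketch.

```python
# Observed rows from the fish oil example: (treatment, age, observed outcome)
data = [
    (0, 40, 140), (0, 40, 140), (0, 50, 150), (0, 50, 150),
    (1, 60, 155), (1, 60, 155), (1, 70, 160), (1, 70, 160),
]


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - tail) / M[i][i]
    return w


def fit_ols(X, y):
    """Least squares via the normal equations (X'X) w = X'y."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


# Features: intercept, treatment indicator, age.
X = [[1.0, float(t), float(age)] for t, age, _ in data]
y = [float(obs) for _, _, obs in data]
w = fit_ols(X, y)


def predict(t, age):
    return w[0] + w[1] * t + w[2] * age


# Predict every unit under both treatment values and average the difference.
effects = [predict(1, age) - predict(0, age) for _, age, _ in data]
estimated_effect = sum(effects) / len(effects)
print(round(estimated_effect, 2))  # -2.5
```

The estimate of about -2.5 has the right sign, unlike the naive 12.5, but it still misses the true -7.5 from the hypothetical table. The treated and control ages do not overlap, so the linear fit has to extrapolate.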

Response Surface Modeling: BART
• In principle any regression method can work: use $T_i$ as a feature, predict for both $T_i = 0$ and $T_i = 1$.
• In 2008, Chipman, George and McCulloch introduced Bayesian Additive Regression Trees (BART).
• BART is non-linear, yet easy to fit and empirically robust to model misspecification.
• It has proven very successful for causal inference and is especially widely adopted in the social sciences.

Bayesian Additive Regression Trees (BART)
Chipman, H. A., George, E. I., & McCulloch, R. E. (2010). BART: Bayesian additive regression trees. The Annals of Applied Statistics, 266-298.
bartMachine
Kapelner, A., & Bleich, J. (2013). bartMachine: Machine Learning with Bayesian Additive Regression Trees. arXiv preprint arXiv:1312.2171.

What's a regression tree?
source: Matthew Pratola, OSU
[Figure: a regression tree with splits $x_5 < c$ / $x_5 \ge c$ and $x_2 < d$ / $x_2 \ge d$, and leaves $\mu_1$, $\mu_2$, $\mu_3$]
$\mu_k(x)$ can be e.g. a linear function, a Gaussian process, or just a constant.
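A regression tree of this kind is just a nested set of threshold tests that routes an input to one leaf. The sketch below uses constant leaves. The layout (which branch of the $x_5$ split carries the $x_2$ split) cannot be read off the flattened figure, so the one chosen here is an assumption, as are the function name and the example thresholds and leaf values.

```python
def regression_tree(x, c, d, mu):
    """Evaluate a small regression tree with constant leaves.

    x  : dict mapping feature index to value (only x[5] and x[2] are used)
    c  : threshold for the split on x_5
    d  : threshold for the split on x_2
    mu : dict mapping leaf index (1, 2, 3) to its constant prediction

    ASSUMED layout: x_5 >= c goes straight to leaf 3; otherwise
    the tree splits on x_2 into leaf 1 (x_2 < d) and leaf 2 (x_2 >= d).
    """
    if x[5] >= c:
        return mu[3]
    if x[2] < d:
        return mu[1]
    return mu[2]


# Hypothetical thresholds and leaf values.
leaves = {1: 10.0, 2: 20.0, 3: 30.0}
pred_a = regression_tree({2: 0.1, 5: 0.2}, c=0.5, d=0.3, mu=leaves)
pred_b = regression_tree({2: 0.9, 5: 0.2}, c=0.5, d=0.3, mu=leaves)
pred_c = regression_tree({2: 0.1, 5: 0.8}, c=0.5, d=0.3, mu=leaves)
print(pred_a, pred_b, pred_c)  # 10.0 20.0 30.0
```

Replacing the constants in `mu` with functions of `x` would give the linear or Gaussian-process leaves mentioned on the slide.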

[Figure; source: Matthew Pratola, OSU]

Bayesian Regression Trees
• Each tree is a function $g(\cdot\,; T, M)$ parameterized by:
  – Tree structure $T$
  – Leaf functions $M$
• Bayesian framework:
  – Data is generated as $y(x) = g(x; T, M) + \varepsilon$, $\varepsilon \sim N(0, \sigma^2)$
  – Prior: $\pi(M, T, \sigma^2) = \pi(M \mid T, \sigma^2)\,\pi(T \mid \sigma^2)\,\pi(\sigma^2)$
