When Doubly Robust is Not Robust Enough: Nonparametric Matching Methods under Treatment Heterogeneity
Hui Shao PhD, Charles Stoecker PhD, Lizheng Shi PhD
Department of Global Health Management and Policy, School of Public Health and Tropical Medicine, Tulane University

Population Heterogeneity and Treatment Heterogeneity
• Patient populations are heterogeneous: age, sex, income, disease etiology and severity, presence of comorbidities, etc.
• Varying characteristics can potentially confound or modify the treatment effect.
Tulane University, GHMP

Population Heterogeneity and Treatment Heterogeneity
[Figure: diagrams relating a covariate X, treatment, and outcome. Panels: X confounding the treatment effect; X modifying the treatment effect; X modifying the treatment effect with treatment selected based on the treatment effect.]

Treatment Selection
[Figure: treatment selection based on treatment effect. x-axis: probability of receiving treatment (0 to 1); y-axis: treatment effect (0 to 1). Annotations mark the average treatment effect on the treated (ATT) and the average treatment effect (ATE).]
• Technological advances have increased our ability to 1) study treatment heterogeneity and 2) deliver precision-targeted, individualized treatment plans.
• Patients who stand to receive a better treatment effect are more likely to choose the treatment.
• Physicians are more likely to recommend the treatment to patients who would benefit more from it.
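To make the ATT/ATE distinction concrete, here is a minimal Python sketch. The three strata, their population shares, propensity scores, and treatment effects are hypothetical values chosen only for illustration; they do not come from the presentation.

```python
# Hypothetical strata: (population share, propensity score, treatment effect).
# Effects rise with the propensity score, i.e. selection on expected benefit.
strata = [
    (0.25, 0.2, 0.2),
    (0.50, 0.5, 0.5),
    (0.25, 0.8, 0.8),
]

def ate(strata):
    """Average treatment effect: stratum effects weighted by population share."""
    return sum(share * effect for share, _, effect in strata)

def att(strata):
    """Average treatment effect on the treated: stratum effects weighted by
    each stratum's share of the treated population."""
    treated_share = sum(share * p for share, p, _ in strata)
    return sum(share * p * effect for share, p, effect in strata) / treated_share

print(f"ATE = {ate(strata):.3f}, ATT = {att(strata):.3f}")
```

When patients with larger effects are more likely to be treated, as assumed here, the ATT exceeds the ATE because the treated group over-represents high-benefit strata.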

What can we get from OLS?
$$\mathrm{Outcome}_i = \beta_0 + \beta_1 \cdot \mathrm{Treatment}_i + \gamma \cdot X_i + \varepsilon_i$$
1) Modern analytical frameworks (e.g. OLS) rarely address treatment heterogeneity.
2) $\beta_1$ is usually assumed to be the population average across individuals.
3) What does $\beta_1$ actually represent: the ATT or the ATE?

What can we get from OLS?
$$\beta_1 = \sum_{K} \left\{ E[Y_i \mid T_i = 1, X_i = X_K] - E[Y_i \mid T_i = 0, X_i = X_K] \right\} \cdot W_K$$
$\beta_1$ denotes the treatment effect from OLS.
$K$ indexes the possible combinations of $X_i$ (strata).
$W_K$ denotes the weight for stratum $K$.

What can we get from OLS?
Theoretically, we would want
$$W_K = \Pr(X_i = X_K)$$
However,
$$W_K = \frac{\mathrm{var}(T_i \mid X_i = X_K) \cdot \Pr(X_i = X_K)}{\sum_{K'} \mathrm{var}(T_i \mid X_i = X_{K'}) \cdot \Pr(X_i = X_{K'})}$$
Acknowledging that
$$\mathrm{var}(T_i \mid X_i = X_K) = \Pr(T_i = 1 \mid X_i = X_K) \cdot \left(1 - \Pr(T_i = 1 \mid X_i = X_K)\right)$$
That is, individuals with a propensity score closer to 0.5 receive higher weights.
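The weighting result can be checked numerically. The sketch below is an illustration under assumed values, not the authors' simulation: the five strata, their propensity scores, and their treatment effects are hypothetical. It regresses the outcome on treatment plus a full set of stratum dummies and compares the OLS coefficient with the variance-weighted average of within-stratum mean differences.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical strata: propensity scores and stratum-level treatment effects.
propensity = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
effect = np.array([0.1, 0.2, 0.4, 0.7, 1.0])
n_per_stratum = 2000

stratum = np.repeat(np.arange(5), n_per_stratum)
treated = (rng.random(stratum.size) < propensity[stratum]).astype(float)
outcome = 0.5 * stratum + effect[stratum] * treated + rng.normal(size=stratum.size)

# OLS of the outcome on treatment plus one dummy per stratum (no intercept).
dummies = (stratum[:, None] == np.arange(5)).astype(float)
design = np.column_stack([treated, dummies])
beta_ols = np.linalg.lstsq(design, outcome, rcond=None)[0][0]

# Within-stratum mean differences, weighted by var(T | stratum) * Pr(stratum).
diffs, weights = [], []
for k in range(5):
    in_k = stratum == k
    t_k, y_k = treated[in_k], outcome[in_k]
    diffs.append(y_k[t_k == 1].mean() - y_k[t_k == 0].mean())
    p_hat = t_k.mean()
    weights.append(in_k.mean() * p_hat * (1.0 - p_hat))
weights = np.array(weights) / np.sum(weights)
beta_weighted = float(np.dot(weights, diffs))

print(f"OLS coefficient: {beta_ols:.4f}, variance-weighted average: {beta_weighted:.4f}")
```

With stratum dummies the two numbers agree up to floating-point error. For the assumed population values, the ATE is 0.48, the ATT is 0.664, and the variance-weighted target is about 0.456, so the OLS coefficient tracks neither the ATE nor the ATT.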

What can we get from OLS?
[Figure: treatment chosen based on treatment effect. x-axis: probability of receiving treatment (0 to 1); y-axis: treatment effect (0 to 1). OLS weights across the propensity range, left to right: low, medium, high, medium, low.]
OLS regression provides an estimate of the treatment effect that is neither the ATT nor the ATE, but an average of stratum-level treatment effects weighted by the variance of treatment assignment in each stratum.

PSM to estimate ATT
• 1-1: one-to-one matching
• K-NN: k-nearest-neighbor matching
• PSW-uncap: propensity score weighting without capping
• PSW-cap: propensity score weighting with capping on both sides
• LLM: local linear matching
• Kernel: kernel matching
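Two of the listed estimators can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the presenters' implementation: matching is done on a known propensity score, nearest-neighbor matching is with replacement, and the kernel is Gaussian (the slides do not say which kernel was used). The function names and the toy data are hypothetical.

```python
import numpy as np

def att_nearest_neighbor(ps, treated, outcome, k=1):
    """ATT by k-nearest-neighbor matching on the propensity score (with replacement)."""
    ps_t, y_t = ps[treated == 1], outcome[treated == 1]
    ps_c, y_c = ps[treated == 0], outcome[treated == 0]
    # Distance from every treated unit to every control unit.
    dist = np.abs(ps_t[:, None] - ps_c[None, :])
    nearest = np.argsort(dist, axis=1)[:, :k]
    counterfactual = y_c[nearest].mean(axis=1)
    return float(np.mean(y_t - counterfactual))

def att_kernel(ps, treated, outcome, bandwidth=0.1):
    """ATT by kernel matching: each treated unit is compared with a
    Gaussian-kernel-weighted average of all control outcomes."""
    ps_t, y_t = ps[treated == 1], outcome[treated == 1]
    ps_c, y_c = ps[treated == 0], outcome[treated == 0]
    w = np.exp(-0.5 * ((ps_t[:, None] - ps_c[None, :]) / bandwidth) ** 2)
    counterfactual = (w @ y_c) / w.sum(axis=1)
    return float(np.mean(y_t - counterfactual))

# Toy data (hypothetical): two treated and two control units.
ps = np.array([0.2, 0.2, 0.8, 0.8])
treated = np.array([1, 0, 1, 0])
outcome = np.array([3.0, 1.0, 5.0, 2.0])
print(att_nearest_neighbor(ps, treated, outcome, k=1))
print(att_kernel(ps, treated, outcome, bandwidth=0.01))
```

On the toy data each treated unit has an exact control match, so both estimators return 2.5. With `k=1` the nearest-neighbor function is one-to-one matching; a larger `bandwidth` makes the kernel estimator average over more distant controls.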

How does matching handle this bias and provide the ATT?
[Figure: density of X (0 to 1) for the treatment and control groups, before matching (annotated "High Weights") and after matching (annotated "Equal weights for each individual").]
After matching,
$$\mathrm{var}(T_i \mid X_i = X_K) = \mathrm{var}(T_i)$$
Then
$$W_K = \frac{\mathrm{var}(T_i) \cdot \Pr(X_i = X_K)}{\sum_{K'} \mathrm{var}(T_i) \cdot \Pr(X_i = X_{K'})} = \Pr(X_i = X_K)$$

Doubly Robust
The “doubly robust” method refers to applying OLS regression to a matched sample to achieve “optimal estimation accuracy”.
Step 1: Use matching to resample the data into two balanced groups (pseudo-randomization).
Step 2: Run a regression on the matched data.
Evidence shows an accuracy improvement after applying the doubly robust method [2]. However, previous studies did not discuss heterogeneity scenarios.
2. Bang H, Robins JM. Doubly robust estimation in missing data and causal inference models. Biometrics 2005;
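The two steps can be sketched as follows. This is a minimal Python illustration of "regression on a matched sample" as the slide defines it, assuming one-to-one nearest-neighbor matching with replacement on a known propensity score and a regression that controls linearly for that score. The function names and toy data are hypothetical.

```python
import numpy as np

def matched_sample(ps, treated, outcome):
    """Step 1: pair each treated unit with its nearest control (with replacement)."""
    idx_t = np.flatnonzero(treated == 1)
    idx_c = np.flatnonzero(treated == 0)
    dist = np.abs(ps[idx_t][:, None] - ps[idx_c][None, :])
    idx_m = idx_c[np.argmin(dist, axis=1)]
    keep = np.concatenate([idx_t, idx_m])
    return ps[keep], treated[keep], outcome[keep]

def regression_on_matched(ps, treated, outcome):
    """Step 2: OLS of the outcome on treatment and the score in the matched sample."""
    ps_m, t_m, y_m = matched_sample(ps, treated, outcome)
    design = np.column_stack([np.ones_like(ps_m), t_m, ps_m])
    return float(np.linalg.lstsq(design, y_m, rcond=None)[0][1])

def mean_difference_on_matched(ps, treated, outcome):
    """Direct mean comparison on the same matched sample."""
    _, t_m, y_m = matched_sample(ps, treated, outcome)
    return float(y_m[t_m == 1].mean() - y_m[t_m == 0].mean())

# Toy data (hypothetical): two treated and two control units.
ps = np.array([0.2, 0.2, 0.8, 0.8])
treated = np.array([1, 0, 1, 0])
outcome = np.array([3.0, 1.0, 5.0, 2.0])
print(regression_on_matched(ps, treated, outcome))
print(mean_difference_on_matched(ps, treated, outcome))
```

On this perfectly balanced toy sample the regression coefficient and the mean difference coincide at 2.5. They can diverge when the matched sample is not balanced within strata, which is the situation the presentation examines.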

Doubly Robust
[Figure: weight patterns in the matched sample under small, medium, and large kernel bandwidths; annotations include "Equal weights" and "Low Weights".]
• With 1-to-1 matching, or kernel matching with a narrow bandwidth, it is relatively easy to achieve the proposed matching pattern.
• Kernel matching with a large bandwidth is likely to produce the matching pattern shown on the right.
• Applying OLS regression to this matched sample will still assign different weights to different strata based on $\mathrm{var}(T_i \mid X_i = X_K)$.

Objectives
To test whether the “doubly robust” method improves estimation accuracy compared with a direct mean comparison after matching.

Generating Propensity Scores
Three representative propensity score distributions were designed:
• More patients in the control group
• More patients in the treatment group
• Balanced (most common)
1. Johnson, N., "Systems of Frequency Curves Generated by Methods of Translation," Biometrika 36 (1949)

Treatment Heterogeneity
For each density design, three treatment heterogeneity scenarios were designed:
• Linear
• Peak at 0.7
• More random

Monte-Carlo Simulation
[Flowchart: generate data; run OLS and record $\beta_{OLS}$; apply 1-by-1, KNN, PSW, kernel, and local linear matching and record the ATT from each.]
• 9 data-generating processes (DGPs)
• 500 individuals were generated
• 1000 iterations were conducted
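The simulation loop can be sketched as below. This is a simplified Python illustration, not the authors' design: it uses a single assumed DGP with propensity scores drawn from a Beta(2, 2) distribution in place of the Johnson-system densities, a treatment effect equal to the propensity score (a "linear" scenario), Gaussian-kernel matching with a fixed bandwidth, and 200 iterations instead of 1000 to keep the run short. All names and parameter values are hypothetical.

```python
import numpy as np

def simulate_once(rng, n=500, bandwidth=0.1):
    """One iteration: generate data, then record the true ATT, the OLS
    coefficient on treatment, and the kernel-matching ATT estimate."""
    ps = rng.beta(2.0, 2.0, size=n)                # assumed propensity distribution
    treated = (rng.random(n) < ps).astype(float)
    effect = ps                                    # assumed linear heterogeneity
    outcome = ps + effect * treated + rng.normal(scale=0.5, size=n)
    att_true = effect[treated == 1].mean()

    # OLS of the outcome on an intercept, treatment, and the propensity score.
    design = np.column_stack([np.ones(n), treated, ps])
    beta_ols = np.linalg.lstsq(design, outcome, rcond=None)[0][1]

    # Gaussian-kernel matching estimate of the ATT.
    ps_t, y_t = ps[treated == 1], outcome[treated == 1]
    ps_c, y_c = ps[treated == 0], outcome[treated == 0]
    w = np.exp(-0.5 * ((ps_t[:, None] - ps_c[None, :]) / bandwidth) ** 2)
    att_kernel = np.mean(y_t - (w @ y_c) / w.sum(axis=1))
    return att_true, beta_ols, att_kernel

rng = np.random.default_rng(2024)
results = np.array([simulate_once(rng) for _ in range(200)])
att_true, beta_ols, att_kernel = results.T

mse_ols = np.mean((beta_ols - att_true) ** 2)
mse_kernel = np.mean((att_kernel - att_true) ** 2)
print(f"MSE OLS: {mse_ols:.5f}, MSE kernel: {mse_kernel:.5f}")
```

Extending the sketch to the full design would mean looping over the nine DGPs and adding the remaining estimators inside `simulate_once`.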

Treatment Heterogeneity
$$\mathrm{MSE} = \frac{\sum_{i=1}^{I} \left( ATT_{\mathrm{matching},i} - ATT_{\mathrm{true},i} \right)^2}{I}$$
• Mean squared error (MSE) was used to measure the bias of each matching method.
• $i$ denotes each iteration, and $I$ the total number of iterations.
• Relative bias is the measure of estimation accuracy in this study:
$$\text{Relative Bias} = \frac{\mathrm{MSE}_{\mathrm{matching}}}{\mathrm{MSE}_{\mathrm{OLS}}}$$
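The two accuracy measures translate directly into code. The following Python sketch uses hypothetical function names and made-up numbers purely to show the arithmetic.

```python
def mean_squared_error(estimates, truths):
    """Average squared gap between per-iteration estimates and true ATTs."""
    if len(estimates) != len(truths) or len(estimates) == 0:
        raise ValueError("estimates and truths must be non-empty and equal in length")
    return sum((e - t) ** 2 for e, t in zip(estimates, truths)) / len(estimates)

def relative_bias(mse_matching, mse_ols):
    """Ratio of a matching method's MSE to the OLS MSE (multiply by 100 for percent)."""
    return mse_matching / mse_ols

# Made-up per-iteration values, for illustration only.
att_true = [0.60, 0.58, 0.62]
att_matching = [0.61, 0.55, 0.64]
att_ols = [0.50, 0.49, 0.52]

mse_matching = mean_squared_error(att_matching, att_true)
mse_ols = mean_squared_error(att_ols, att_true)
print(f"Relative bias: {100 * relative_bias(mse_matching, mse_ols):.1f}%")
```

A relative bias below 100% means the matching method has a lower MSE than OLS.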

Results & Discussion
[Figure: relative bias (%) of each method for Density 1, Heterogeneity 1, with the bias of OLS as the reference line. Values recovered from the chart (method labels lost): 86.2, 71.38, 54.14, 48.3, 43.45, 32.07, 32.03.]

Results & Discussion
• The red line denotes the relative bias from OLS regression.
• One-to-one matching provides unreliable results.
• In 5 of the 9 simulated DGPs, one-to-one matching yielded higher relative bias than simple OLS regression.

Results & Discussion
• PSW also provides unreliable estimates.
• In 7 of the 9 simulated DGPs, PSW yielded worse estimates than simple OLS regression.
• Capping provides better estimates.

Results & Discussion
• KNN matching provides consistently lower relative bias than OLS regression.
• However, the efficacy of this method is sensitive to the number of matches assigned to each individual.

Results & Discussion
• Kernel matching provides the best estimates among all PSM methods.
• This finding is consistent across all DGPs.

Results & Discussion
[Figure: kernel matching. x-axis: bandwidth (0 to 2); y-axis: relative bias to OLS (0 to 100). The bandwidth of 0.29 selected by the leave-one-out algorithm is marked.]
• The efficacy of kernel matching relies on the choice of bandwidth.
• The leave-one-out algorithm [3] is a widely used algorithm for selecting the bandwidth.
• Our simulation showed that this algorithm provided reliable estimation.
3. Frölich, M., Finite-sample properties of propensity-score matching and weighting estimators. Review of Economics and Statistics, 2004. 86 (1): p. 77-90.
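One common form of leave-one-out bandwidth selection can be sketched in Python as follows. This is an assumption-laden illustration, not necessarily the exact procedure used in the presentation: it predicts each control outcome from the other controls with a Gaussian-kernel regression on the propensity score and picks the bandwidth with the smallest mean squared prediction error. The function names, the simulated control data, and the bandwidth grid are hypothetical.

```python
import numpy as np

def loo_cv_score(x, y, bandwidth):
    """Mean squared leave-one-out prediction error of a Gaussian-kernel regression."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / bandwidth) ** 2)
    np.fill_diagonal(w, 0.0)           # exclude each point from its own prediction
    denom = w.sum(axis=1)
    ok = denom > 0                      # skip points with no usable neighbors
    if not ok.any():
        return float("inf")
    pred = (w[ok] @ y) / denom[ok]
    return float(np.mean((y[ok] - pred) ** 2))

def select_bandwidth(x, y, grid):
    """Return the grid value with the lowest leave-one-out score."""
    scores = [loo_cv_score(x, y, h) for h in grid]
    return float(grid[int(np.argmin(scores))])

# Simulated control group (hypothetical): outcome is a smooth function of the score.
rng = np.random.default_rng(1)
ps_control = rng.uniform(0.05, 0.95, size=300)
y_control = np.sin(3 * ps_control) + rng.normal(scale=0.3, size=300)

grid = np.linspace(0.02, 1.0, 50)
best = select_bandwidth(ps_control, y_control, grid)
print(f"Selected bandwidth: {best:.3f}")
```

The selected value depends on the data and the grid, so it should not be expected to match the 0.29 reported in the figure.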

Doubly Robust
• A particular bandwidth can be chosen to achieve the lowest relative bias.
• Bias from the doubly robust method increases as the bandwidth increases.

Discussion
Step 1: Use matching to provide pseudo-randomization.
Step 2: Run a regression on the matched data.
• We also applied the doubly robust method to the other PSM methods.
• The doubly robust method consistently yields estimates equal to or worse than a direct mean comparison.

Conclusion
• Treatment selection based on the expected treatment effect (treatment heterogeneity) is likely to bias estimates from OLS regression.
• PSM methods can reduce this bias. Among all the PSM methods, kernel matching consistently yields the best estimates.
• The doubly robust method is not recommended.
