SLIDE 1

Policy Evaluation with Latent Confounders via Optimal Balance

Andrew Bennett¹, Cornell University, awb222@cornell.edu
Nathan Kallus¹, Cornell University, kallus@cornell.edu

¹Alphabetical order.

SLIDE 2

Policy Learning Problem

Given observational data on individuals described by covariates (X), interventions performed on those individuals (T), and resulting outcomes (Y), we wish to estimate the utility of policies that assign treatment to individuals based on their covariates. This is a challenging problem when the relationship between T and Y in the logged data is confounded, even after controlling for X. Examples:

Drug assignment policy: X is the patient information available to doctors, T is the drug assigned, Y is the medical outcome, and confounding arises because factors not fully captured by X (e.g., socioeconomics) influenced drug assignment in the observational data.

Personalized education: X contains individual student statistics, T is an educational intervention, Y is a measure of post-intervention student outcomes, and confounding arises because X poorly captures the criteria used by decision makers in the observational data (e.g., X contains a standardized test score but decisions were made based on actual student ability).

SLIDE 3

Setup - Latent Confounder Framework

Logged Data Model:
Latent Confounders: Z ∈ 𝒵 ⊆ R^p
Observed Proxies: X ∈ 𝒳 ⊆ R^q
Treatment: T ∈ {1, . . . , m}
Potential Outcomes: Y(t) ∈ R

Assumption (Z are true confounders)

For every t ∈ {1, . . . , m}, the variables X, T, Y(t) are mutually independent conditioned on Z.

[Causal diagram: Z has arrows into each of X, T, and Y.]

SLIDE 4

Setup - Logging and Behavior Policies

Evaluation Policy: π_t(x) denotes the probability that the evaluation policy assigns treatment T = t given observed proxies X = x.

Logging Policy: e_t(z) denotes the probability that the logging policy assigns treatment T = t given the latent confounders Z = z. η_t(x) denotes the probability that the logging policy assigns treatment T = t given observed proxies X = x.

SLIDE 5

Setup - Policy Evaluation Goal

Definition (Policy Value)

\[ \tau^\pi = \mathbb{E}\Big[\sum_{t=1}^m \pi_t(X)\,Y(t)\Big]. \]

Goal: Estimate the policy value τ^π given iid logged data of the form ((X_1, T_1, Y_1), . . . , (X_n, T_n, Y_n)). We want to find an estimator τ̂^π that minimizes the MSE E[(τ̂^π − τ^π)²].

SLIDE 6

Setup - Latent Confounder Model

[Causal diagram: Z has arrows into each of X, T, and Y.]

We denote by ϕ(z; x, t) the conditional density of Z given X = x, T = t.

Assumption (Latent Confounder Model)

We assume that we have an identified model for ϕ(z; x, t), and that we can calculate conditional densities and sample Z values using this model

SLIDE 7

Setup - Observed Proxies

[Causal diagram: Z has arrows into each of X, T, and Y.]

We do not assume ignorability given X. This means standard approaches based on inverse propensity scores are bound to fail. Instead, the proxies X can be used (along with T) to compute the posterior of the true confounders Z, which can then be used for evaluation.
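For concreteness (this derivation is implicit rather than stated on the slide), the latent confounder assumption gives T ⊥ X | Z, so the posterior follows from Bayes' rule:

\[ \phi(z; x, t) = \frac{p(z)\,p(x \mid z)\,e_t(z)}{\int p(z')\,p(x \mid z')\,e_t(z')\,dz'}. \]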

SLIDE 8

Setup - Additional Assumptions

Assumption (Weak Overlap)

\[ \mathbb{E}\big[e_t(Z)^{-2}\big] < \infty \quad \text{for each } t \in \{1, \ldots, m\}. \]

Assumption (Bounded Variance)

The conditional variance of our potential outcomes given X, T is bounded: V[Y(t) | X, T] ≤ σ².

SLIDE 9

Setup - Mean Value Functions

Define the following mean value functions:

\[ \mu_t(z) = \mathbb{E}[Y(t) \mid Z = z] \]
\[ \nu_t(x, t') = \mathbb{E}[Y(t) \mid X = x, T = t'] = \mathbb{E}[\mu_t(Z) \mid X = x, T = t'] \]
\[ \rho_t(x) = \mathbb{E}[Y(t) \mid X = x] = \mathbb{E}[\mu_t(Z) \mid X = x] \]

Note that we can equivalently rewrite the policy value as:

\[ \tau^\pi = \mathbb{E}\Big[\sum_{t=1}^m \pi_t(X)\,Y(t)\Big] = \mathbb{E}\Big[\sum_{t=1}^m \pi_t(X)\,\mu_t(Z)\Big] = \mathbb{E}\Big[\sum_{t=1}^m \pi_t(X)\,\nu_t(X, T)\Big] \]
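As a sanity check (added here, not on the slide), the final equality follows from the tower property and the latent confounder assumption:

\[ \mathbb{E}\big[\pi_t(X)\,\nu_t(X, T)\big] = \mathbb{E}\big[\pi_t(X)\,\mathbb{E}[\mu_t(Z) \mid X, T]\big] = \mathbb{E}\big[\pi_t(X)\,\mu_t(Z)\big] = \mathbb{E}\big[\pi_t(X)\,Y(t)\big], \]

where the middle step uses that π_t(X) is a function of X alone, and the outer steps use Y(t) ⊥ (X, T) | Z.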

SLIDE 10

Past Work - Standard Estimator Types

Weighted, Direct, and Doubly Robust estimators:

\[ \hat\tau^\pi_W = \frac{1}{n}\sum_{i=1}^n W_i Y_i \]
\[ \hat\tau^\pi_{\hat\rho} = \frac{1}{n}\sum_{i=1}^n \sum_{t=1}^m \pi_t(X_i)\,\hat\rho_t(X_i) \]
\[ \hat\tau^\pi_{W,\hat\rho} = \frac{1}{n}\sum_{i=1}^n \sum_{t=1}^m \pi_t(X_i)\,\hat\rho_t(X_i) + \frac{1}{n}\sum_{i=1}^n W_i\big(Y_i - \hat\rho_{T_i}(X_i)\big) \]

Note that ρ̂_t is not straightforward to estimate via regression, since ρ_t(x) = E[Y(t) | X = x] ≠ E[Y | X = x, T = t] under confounding. The correct IPW weights W_i = π_{T_i}(X_i)/e_{T_i}(Z_i) are infeasible since Z_i is not observed, and the naively misspecified IPW weights W_i = π_{T_i}(X_i)/η_{T_i}(X_i) lead to biased evaluation.
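To make the three estimator types concrete, here is a minimal NumPy sketch (added for illustration, not from the paper). The weights W, the fitted outcome model values rho_X, and the policy probabilities pi_X are assumed inputs; choosing them well is exactly what the rest of the talk addresses.

    import numpy as np

    def weighted_estimator(W, Y):
        # tau_hat_W = (1/n) * sum_i W_i * Y_i
        return np.mean(W * Y)

    def direct_estimator(pi_X, rho_X):
        # tau_hat_rho = (1/n) * sum_i sum_t pi_t(X_i) * rho_hat_t(X_i)
        # pi_X, rho_X: (n, m) arrays of pi_t(X_i) and rho_hat_t(X_i)
        return np.mean(np.sum(pi_X * rho_X, axis=1))

    def doubly_robust_estimator(W, Y, T, pi_X, rho_X):
        # Direct term plus a weighted correction on the residuals Y_i - rho_hat_{T_i}(X_i)
        n = len(Y)
        residuals = Y - rho_X[np.arange(n), T]
        return direct_estimator(pi_X, rho_X) + np.mean(W * residuals)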

SLIDE 11

Past Work - Optimal Balancing

Optimal Balancing (Kallus 2018) seeks a set of weights W_i that minimize an estimate of the worst-case MSE of τ̂^π_W for policy evaluation, given a class of functions for the unknown mean value function. Define CMSE(W, μ) to be the conditional mean squared error, given the logged data, of τ̂^π_W as an estimate of the sample average policy effect (SAPE) if the mean value function were given by μ. Choose the weights W* for evaluation according to the rule:

\[ W^* = \arg\min_{W \in \mathcal{W}} \sup_{\mu \in \mathcal{F}} \mathrm{CMSE}(W, \mu). \]

This permits a simple QP algorithm when F is a class of RKHS functions.

SLIDE 12

Generalized IPS Weights I

Suppose we want to define weights W(X, T) IPS-style such that the weighted estimator is unbiased term-by-term; this requires solving, for each t:

\[ \mathbb{E}\big[W(X, T)\,\delta_{Tt}\,Y(t)\big] = \mathbb{E}\big[\pi_t(X)\,Y(t)\big], \qquad \text{where } \delta_{Tt} = \mathbb{1}\{T = t\}. \]

One can easily verify that, if we assume ignorability given X, this equation is solved by the standard IPS weights W(X, T) = π_T(X)/η_T(X).
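A quick check of that claim (added for completeness): under ignorability given X we have Y(t) ⊥ T | X, so

\[ \mathbb{E}\Big[\frac{\pi_T(X)}{\eta_T(X)}\,\delta_{Tt}\,Y(t)\Big] = \mathbb{E}\Big[\frac{\pi_t(X)}{\eta_t(X)}\,\mathbb{E}[\delta_{Tt} \mid X]\,\mathbb{E}[Y(t) \mid X]\Big] = \mathbb{E}\big[\pi_t(X)\,\mathbb{E}[Y(t) \mid X]\big] = \mathbb{E}\big[\pi_t(X)\,Y(t)\big], \]

using E[δ_{Tt} | X] = η_t(X).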

Theorem (Generalized IPS Weights)

If W(x, t) satisfies the above equation, then for each t ∈ {1, . . . , m}

\[ W(x, t) = \frac{\pi_t(x)\sum_{t'=1}^m \eta_{t'}(x)\,\nu_t(x, t') + \Omega_t(x)}{\eta_t(x)\,\nu_t(x, t)}, \]

for some Ω_t(x) such that E[Ω_t(X)] = 0 for all t.

SLIDE 13

Generalized IPS Weights II

Calculating these generalized IPS weights is not straightforward, since it involves the counterfactual estimation of ν_t(x, t′) for t ≠ t′ (which requires knowledge of Z). In addition, we would expect high variance from error in estimating ν_t, due to its position in the denominator. However, the fact that such weights exist supports the idea of using an optimal-balancing-style approach and choosing weights that balance a flexible class of possible mean outcome functions.

SLIDE 14

Adversarial Objective Motivation

Define the following, where we embed the dependence on μ inside ν_t implicitly:

\[ f_{it} = W_i\,\delta_{T_i t} - \pi_t(X_i) \]
\[ J(W, \mu) = \Big(\frac{1}{n}\sum_{i=1}^n \sum_{t=1}^m f_{it}\,\nu_t(X_i, T_i)\Big)^2 + \frac{2\sigma^2}{n^2}\,\|W\|_2^2 \]

Theorem (CMSE Upper Bound)

\[ \mathbb{E}\big[(\hat\tau^\pi_W - \tau^\pi)^2 \mid X_{1:n}, T_{1:n}\big] \le 2\,J(W, \mu) + O_p(1/n). \]

Lemma (CMSE Convergence Implies Consistency)

If E[(τ̂^π_W − τ^π)² | X_{1:n}, T_{1:n}] = O_p(1/n), then τ̂^π_W = τ^π + O_p(1/√n).
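A small NumPy sketch of the objective J (added for illustration; the (n, m) matrix nu of values ν_t(X_i, T_i) for a candidate μ would in practice be computed from posterior samples of Z):

    import numpy as np

    def adversarial_objective(W, nu, pi_X, T, sigma2):
        # J(W, mu) for one candidate mean value function mu.
        # W:     (n,) candidate balancing weights
        # nu:    (n, m) matrix with nu[i, t] = nu_t(X_i, T_i) under this mu
        # pi_X:  (n, m) evaluation-policy probabilities pi_t(X_i)
        # T:     (n,) observed treatments in {0, ..., m-1}
        n, m = pi_X.shape
        # f_it = W_i * delta(T_i, t) - pi_t(X_i)
        f = -pi_X.copy()
        f[np.arange(n), T] += W
        bias_term = (np.sum(f * nu) / n) ** 2
        variance_term = 2.0 * sigma2 * np.sum(W ** 2) / n ** 2
        return bias_term + variance_term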

SLIDE 15

Balancing Objective

Our optimal balancing objective is to choose weights W* for evaluation according to the following optimization problem:

\[ W^* = \arg\min_{W \in \mathcal{W}} \sup_{\mu \in \mathcal{F}} J(W, \mu). \]

SLIDE 16

Feasibility of Balancing Objective I

Minimizing J(W, μ) over some class of μ ∈ F corresponds to balancing some class of functions ν implicitly indexed by μ, since:

\[ J(W, \mu) = \Big(\frac{1}{n}\sum_{i=1}^n W_i\,\nu_{T_i}(X_i, T_i) - \frac{1}{n}\sum_{i=1}^n \sum_{t=1}^m \pi_t(X_i)\,\nu_t(X_i, T_i)\Big)^2 + \frac{2\sigma^2}{n^2}\,\|W\|_2^2 \]

Note that such balancing would be impossible over a generic flexible class of functions ν ignoring Z, due to the ν_t(x, t′) terms for t ≠ t′.

SLIDE 17

Feasibility of Balancing Objective II

The following lemma suggests that this fundamental counterfactual issue may not be a problem given our implicit constraint imposed by indexing using µ and our overlap assumption:

Lemma (Mean Value Function Overlap)

Assuming ‖μ_t‖_∞ ≤ b, under our weak overlap assumption, for all x ∈ 𝒳 and t, t′, t′′ ∈ {1, . . . , m} we have

\[ |\nu_t(x, t'')| \le \frac{\eta_{t'}(x)}{\eta_{t''}(x)} \sqrt{8\,b\,\mathbb{E}\big[e_t(Z)^{-2} \mid X = x,\, T = t'\big]\,|\nu_t(x, t')|}. \]

SLIDE 18

Assumptions for Consistent Evaluation I

Define F_t = {μ_t : ∃(μ′_1, . . . , μ′_m) ∈ F with μ′_t = μ_t}; then we make the following assumptions:

Assumption (Normed)

For each t ∈ {1, . . . , m} there exists a norm ‖·‖_t on span(F_t), and there exists a norm ‖·‖ on span(F) defined, given some R^m norm, by ‖μ‖ = ‖(‖μ_1‖_1, . . . , ‖μ_m‖_m)‖.

Assumption (Absolutely Star Shaped)

For every µ ∈ F and |λ| ≤ 1, we have λµ ∈ F.

Assumption (Convex Compact)

F is convex and compact

SLIDE 19

Assumptions for Consistent Evaluation II

Assumption (Square Integrable)

For each t ∈ {1, . . . , m} the space F_t is a subset of L²(𝒵), and its norm dominates the L² norm (i.e., inf_{μ_t ∈ F_t} ‖μ_t‖_t / ‖μ_t‖_{L²} > 0).

Assumption (Nondegeneracy)

Define B(γ) = {μ ∈ span(F) : ‖μ‖ ≤ γ}. Then we have B(γ) ⊆ F for some γ > 0.

Assumption (Boundedness)

sup_{μ∈F} ‖μ‖_∞ < ∞.

SLIDE 20

Assumptions for Consistent Evaluation III

Definition (Rademacher Complexity)

\[ \mathcal{R}_n(\mathcal{F}) = \mathbb{E}\Big[\sup_{f \in \mathcal{F}} \frac{1}{n}\sum_{i=1}^n \epsilon_i f(Z_i)\Big], \]

where the ε_i are iid Rademacher random variables.

Assumption (Complexity)

For each t ∈ {1, . . . , m} we have R_n(F_t) = o(1).
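For intuition (not from the slides), here is a Monte Carlo sketch estimating the empirical Rademacher complexity of a toy function class, using norm-bounded linear functions as a stand-in assumption, for which the inner sup has a closed form:

    import numpy as np

    def empirical_rademacher_linear(Z, radius=1.0, num_draws=1000, seed=0):
        # Estimate R_n(F) for F = {z -> w.z : ||w||_2 <= radius}.
        # For this class: sup_w (1/n) sum_i eps_i * w.Z_i = radius * ||(1/n) sum_i eps_i Z_i||_2
        rng = np.random.default_rng(seed)
        n = len(Z)
        total = 0.0
        for _ in range(num_draws):
            eps = rng.choice([-1.0, 1.0], size=n)   # iid Rademacher signs
            total += radius * np.linalg.norm(eps @ Z / n)
        return total / num_draws

    # Usage: the estimate shrinks as n grows, matching the o(1) assumption.
    for n in [100, 1000, 10000]:
        Z = np.random.default_rng(1).normal(size=(n, 2))
        print(n, empirical_rademacher_linear(Z))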

SLIDE 21

Optimization Problem Convergence

Lemma (Minimax Lemma)

Let

\[ B(W, \mu) = \frac{1}{n}\sum_{i=1}^n \sum_{t=1}^m f_{it}\,\nu_t(X_i, T_i). \]

Then, under our consistency assumptions, for every M > 0 we have the bound

\[ \min_{W} \sup_{\mu \in \mathcal{F}} J(W, \mu) \le \sup_{\mu \in \mathcal{F}} \min_{\|W\|_2 \le M} B(W, \mu)^2 + \frac{2\sigma^2}{n^2} M^2. \]

Lemma (Optimization Problem Convergence)

Under our consistency assumptions we have inf_W sup_{μ∈F} J(W, μ) = O_p(1/n).

SLIDE 22

Convergence Proof Sketch

First, the Minimax Lemma tells us that it is sufficient to prove the O_p(1/n) bound by picking a W in response to each possible μ such that:

1. B(W(μ), μ) = 0 for all μ
2. sup_{μ∈F} ‖W(μ)‖₂ = O_p(√n)

Choose W(μ) as the solution to: arg min_W ‖W‖₂ s.t. B(W, μ) = 0. By Lagrangian duality we can find a closed-form solution to this problem (sketched below), and prove the O_p(√n) bound for the solution using empirical process arguments and the previous Mean Value Function Overlap lemma.
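For completeness (this step is only sketched on the slide): B(W, μ) = 0 is a single linear constraint on W. Writing v_i = ν_{T_i}(X_i, T_i) and c = ∑_i ∑_t π_t(X_i) ν_t(X_i, T_i), the constraint reads vᵀW = c, and Lagrangian duality gives the minimum-norm solution

\[ W(\mu) = \arg\min_{v^\top W = c} \|W\|_2^2 = \frac{c}{\|v\|_2^2}\,v, \qquad \|W(\mu)\|_2 = \frac{|c|}{\|v\|_2}, \]

so the O_p(√n) bound amounts to controlling |c|/‖v‖₂ uniformly over μ ∈ F.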

SLIDE 23

Consistent Evaluation Theorem

Theorem (Root-n Consistency)

Under our consistency assumptions, and assuming that μ ∈ F, we have

\[ \hat\tau^\pi_{W^*} = \tau^\pi + O_p(1/\sqrt{n}). \]

Proof idea: Define W* as the solution to inf_W sup_{μ∈F} J(W, μ). Then, assuming μ ∈ F, it must be the case that J(W*, μ) = O_p(1/n). Given this, √n-consistency follows immediately from the previous theorems and lemmas.

SLIDE 24

RKHS Class for Policy Evaluation I

Definition (Kernel Class)

\[ \mathcal{F}_K = \{\mu : \|\mu\| \le 1\}, \quad \text{where } \|(\mu_1, \ldots, \mu_m)\|^2 = \sum_{t=1}^m \|\mu_t\|_K^2. \]

Theorem (Root-n Consistent Evaluation with Kernel Class)

Assuming K is a Mercer kernel (continuous and positive definite) and is bounded, F_K satisfies our assumptions for consistency. Note that these assumptions are easily met, for instance, by the commonly used Gaussian kernel.

SLIDE 25

RKHS Class for Policy Evaluation II

Note that restricting F_K to maximum norm 1 is without loss of generality: if we wanted the maximum norm to instead be γ, we could replace the Σ matrix in our objective function by Γ = (1/γ)Σ, resulting in an equivalent re-scaled optimization problem. Therefore we replace the Σ matrix in the objective with Γ, which is treated as a regularization hyperparameter.

SLIDE 26

Kernel Balancing Algorithm I

Theorem

Define

\[ Q_{ij} = \mathbb{E}[K(Z_i, Z'_j)], \qquad G_{ij} = \frac{1}{n^2}\big(Q_{ij}\,\delta_{T_i T_j} + \Gamma_{ij}\big), \qquad a_i = \frac{2}{n^2}\sum_{j=1}^n Q_{ij}\,\pi_{T_j}(X_i), \]

where for each i, Z_i and Z'_i are iid shadow variables distributed according to the posterior ϕ(·; X_i, T_i). Then, for some c that is constant in W, we have the identity

\[ \sup_{\mu \in \mathcal{F}_K} J(W, \mu) = W^\top G W - a^\top W + c. \]

Note that this means we can calculate our weights for consistent policy evaluation by solving a QP. We can estimate Q given our assumption that we have an identified model for the posterior ϕ(z; x, t).

SLIDE 27

Kernel Balancing Algorithm II

Algorithm 1 Optimal Kernel Balancing

Input: Data (X_{1:n}, T_{1:n}), policy π, kernel function K, posterior density ϕ, regularization matrix Γ, number of samples B, optional weight space 𝒲 (defaults to R^n if not provided)
Output: Optimal balancing weights W_{1:n}

1: for i ∈ {1, . . . , n} do
2:     Sample Data. Draw B data points Z^b_i from the posterior ϕ(·; X_i, T_i)
3: end for
4: Estimate Q. Calculate Q_{ij} = (1/B²) ∑_{b=1}^B ∑_{c=1}^B K(Z^b_i, Z^c_j)
5: Calculate QP Inputs. Calculate G_{ij} = Q_{ij} δ_{T_i T_j} + Γ_{ij} and a_i = 2 ∑_{j=1}^n Q_{ij} π_{T_j}(X_i)
6: Solve Quadratic Program. Calculate W = arg min_{W ∈ 𝒲} WᵀGW − aᵀW

(Relative to the theorem on the previous slide, the 1/n² factors in G and a are dropped here; this rescales the objective by n² and does not change the arg min.)
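A minimal NumPy sketch of Algorithm 1, assuming a user-supplied posterior sampler sample_posterior(x, t, B) and kernel K (hypothetical helpers, not from the paper), and solving the unconstrained case 𝒲 = R^n in closed form:

    import numpy as np

    def optimal_kernel_balancing(X, T, pi, K, sample_posterior, Gamma, B=50):
        # Sketch of Algorithm 1 with the default weight space W = R^n.
        # X: (n, q) proxies; T: (n,) integer treatments; pi: (n, m) with pi[i, t] = pi_t(X_i)
        # K(z, z'): kernel on Z values; Gamma: (n, n) regularization matrix
        # sample_posterior(x, t, B): (B, p) draws from phi(.; x, t)  [assumed helper]
        n = len(T)
        # Steps 1-3: draw B posterior samples for each data point
        Zs = [sample_posterior(X[i], T[i], B) for i in range(n)]
        # Step 4: Q_ij ~= (1/B^2) * sum_{b,c} K(Z_i^b, Z_j^c)
        Q = np.empty((n, n))
        for i in range(n):
            for j in range(n):
                Q[i, j] = np.mean([K(zb, zc) for zb in Zs[i] for zc in Zs[j]])
        # Step 5: G_ij = Q_ij * delta(T_i, T_j) + Gamma_ij; a_i = 2 * sum_j Q_ij * pi_{T_j}(X_i)
        G = Q * (T[:, None] == T[None, :]) + Gamma
        a = 2.0 * np.sum(Q * pi[:, T], axis=1)
        # Step 6: minimize W^T G W - a^T W; the unconstrained minimizer solves (G + G^T) W = a
        return np.linalg.solve(G + G.T, a)

With a constrained weight space 𝒲, the final step would instead call a QP solver on (G, a).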

SLIDE 28

Experiment Setup - Data Generating Process and Policy

Assume the following GLM-style data generating process:

\[ Z \sim N(0, 1), \qquad X \sim N(\alpha^\top Z + \alpha_0,\ \sigma_X^2) \]
\[ P_T = \beta^\top Z + \beta_0, \qquad T \sim \mathrm{softmax}(P_T) \]
\[ W(t) \sim N(\zeta(t)^\top Z + \zeta_0(t),\ \sigma_Y^2), \qquad Y(t) = g(W(t)) \]

We assume Z is 1-dimensional, X is 10-dimensional, and use 2 treatment levels. We experiment with the following functions for g:

step: g(w) = 3·𝟙{w ≥ 0} − 6
exp: g(w) = exp(w)
cubic: g(w) = w³
linear: g(w) = w

We experiment with evaluating the following parameterized policy:

\[ \pi_t(X) = \frac{\exp(\psi_t^\top X)}{\exp(\psi_1^\top X) + \exp(\psi_2^\top X)} \]
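A runnable sketch of this data generating process (the coefficient values below are illustrative placeholders, not the paper's settings):

    import numpy as np

    def simulate(n, g, psi, rng=None):
        # GLM-style DGP: scalar Z, 10-dim proxies X, 2 treatment levels.
        rng = rng or np.random.default_rng(0)
        d_x, m = 10, 2
        # Illustrative placeholder coefficients (not the paper's values)
        alpha, alpha0 = rng.normal(size=d_x), 0.0
        beta, beta0 = np.array([1.0, -1.0]), np.zeros(m)
        zeta, zeta0 = np.array([1.0, 2.0]), np.array([0.0, 0.5])
        sigma_X, sigma_Y = 1.0, 1.0

        Z = rng.normal(size=n)                                   # Z ~ N(0, 1)
        X = alpha[None, :] * Z[:, None] + alpha0 \
            + sigma_X * rng.normal(size=(n, d_x))                # X ~ N(alpha*Z + alpha0, sigma_X^2)
        logits = beta[None, :] * Z[:, None] + beta0              # P_T = beta*Z + beta0
        probs = np.exp(logits) / np.exp(logits).sum(1, keepdims=True)
        T = np.array([rng.choice(m, p=p) for p in probs])        # T ~ softmax(P_T)
        Wlat = zeta[None, :] * Z[:, None] + zeta0 \
            + sigma_Y * rng.normal(size=(n, m))                  # W(t) ~ N(zeta_t*Z + zeta0_t, sigma_Y^2)
        Y = g(Wlat)[np.arange(n), T]                             # observed Y = g(W(T))
        pl = X @ psi.T                                           # evaluation-policy logits psi_t . X
        pi = np.exp(pl) / np.exp(pl).sum(1, keepdims=True)
        return X, T, Y, Z, pi

    # Example: the "step" outcome function g(w) = 3*1{w >= 0} - 6
    step = lambda w: 3.0 * (w >= 0) - 6.0
    X, T, Y, Z, pi = simulate(2000, step, psi=np.random.default_rng(1).normal(size=(2, 10)))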

SLIDE 29

Experiment Setup - Method and Baselines

We experiment with the following methods in our evaluation:

1. OptZ: Our method, using Γ = γ·I_n for γ ∈ {0.001, 0.2, 1.0, 5.0}
2. IPS: IPS weights based on X, using the estimated η̂_t
3. OptX: The optimal weighting method of Kallus (2018), with the same values of Γ as our method
4. DirX: Direct method, fitting ρ̂_t(x) while incorrectly assuming ignorability given X
5. DirZ: Direct method, first fitting μ̂_t using posterior samples from ϕ, then using the estimate ρ̂_t(x) = (1/D) ∑_{i=1}^D μ̂_t(z′_i), where the z′_i are sampled from ϕ(·; x, t)
6. D:W: Doubly robust estimation using direct estimator D and weighted estimator W
SLIDE 30

Experiment Results - RMSE Convergence

n      OptZ0.001   OptZ0.2     OptZ1.0     OptZ5.0
200    .39 ± .07   .24 ± .02   .36 ± .02   .81 ± .02
500    .19 ± .02   .18 ± .02   .23 ± .02   .49 ± .02
1000   .11 ± .01   .11 ± .01   .13 ± .01   .27 ± .01
2000   .08 ± .01   .08 ± .01   .09 ± .01   .17 ± .01

n      DirX        DirZ        DirX:OptZ0.001   DirZ:OptZ0.001
200    .52 ± .02   2.6 ± .02   .57 ± .06        .41 ± .07
500    .48 ± .02   2.6 ± .01   .55 ± .03        .20 ± .02
1000   .39 ± .02   2.0 ± .01   .49 ± .02        .11 ± .01
2000   .40 ± .01   2.0 ± .01   .48 ± .01        .08 ± .01

n      IPS         OptX0.001   OptX0.2     OptX1.0     OptX5.0
200    .47 ± .03   2.0 ± .03   2.1 ± .03   2.3 ± .02   2.5 ± .02
500    .48 ± .03   2.0 ± .02   2.1 ± .02   2.3 ± .02   2.6 ± .02
1000   .39 ± .02   2.0 ± .01   2.1 ± .01   2.3 ± .01   2.5 ± .01
2000   .40 ± .01   2.0 ± .01   2.1 ± .01   2.3 ± .01   2.5 ± .01

SLIDE 31

Experiment Results - Bias Convergence

n      OptZ0.001   OptZ0.2     OptZ1.0     OptZ5.0
200    .03 ± .39   .11 ± .21   .29 ± .21   .78 ± .18
500    .09 ± .17   .10 ± .15   .17 ± .16   .47 ± .15
1000   .02 ± .11   .05 ± .09   .08 ± .09   .25 ± .09
2000   .03 ± .07   .05 ± .06   .07 ± .07   .16 ± .07

n      DirX        DirZ        DirX:OptZ0.001   DirZ:OptZ0.001
200    .49 ± .18   2.6 ± .14   .43 ± .38        .05 ± .40
500    .45 ± .16   2.6 ± .12   .51 ± .19        .10 ± .18
1000   .46 ± .15   2.6 ± .11   .47 ± .13        .04 ± .11
2000   .42 ± .17   2.6 ± .11   .47 ± .09        .03 ± .07

n      IPS         OptX0.001   OptX0.2     OptX1.0     OptX5.0
200    .40 ± .25   1.9 ± .21   2.1 ± .20   2.3 ± .19   2.5 ± .18
500    .43 ± .21   2.0 ± .16   2.1 ± .15   2.3 ± .14   2.6 ± .13
1000   .37 ± .12   2.0 ± .10   2.1 ± .09   2.3 ± .09   2.5 ± .08
2000   .39 ± .10   2.0 ± .08   2.1 ± .07   2.3 ± .07   2.5 ± .07

SLIDE 32

Experimental Results - Analysis

The experimental results seem to support our theory of the consistency of our policy value estimator. Standard baselines naively assuming ignorability given X were all biased. Direct estimation was not consistent even when taking the latent structure into account. Doubly robust estimation did not help over simple weighted estimation.

SLIDE 33

Possible Questions for Future Work

How can we perform inference on policy value estimates using our method?
How can we perform policy improvement using our method?
Is there a better, consistent way to fit ρ̂_t for direct evaluation?
How can we optimize the adversarial objective over different function classes (e.g., neural networks)?
Can we establish semiparametric efficiency, or extend the methodology to achieve the semiparametric lower bound?
Can we obtain finite-sample bounds for our method?
How do we extend the methodology to situations where we don't have an identified model for ϕ(z; x, t)?
