Suchi Saria, Assistant Professor, Computer Science, Applied Math & Stats, and Health Policy; Institute for Computational Medicine
Reliable Decision Support using Counterfactual Models
w/ Peter Schulam, PhD candidate
Example: Customer Churn
Supervised Learning
Supervised ML models can be biased for decision-making problems!
[Figure: customers receiving actions (ad emails, discounts, etc.); test policy label π_test(P̂)]
Supervised ML leads to models that are unstable to shifts in the policy between the train and test environments.
Adverse Event Onset: Is the patient at risk of septic shock?
sepsis and death
high temperature
A supervised learning model learns that high temperature is associated with low risk.
Dyagilev and Saria, Machine Learning 2015
Increasing discrepancy in physician prescription behavior in train vs. test environment
Treat based on temp vs. treat based on WBC
Predictive model trained using classical supervised ML creates unsafe scenarios where sick patients are overlooked.
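A minimal simulation sketch of this failure mode. Every number here is invented for illustration (the risk levels, the treatment effect, and the two treatment policies are assumptions, not values from the cited study). Under the training policy almost every febrile patient is treated, so the observed association between high temperature and shock reverses; under a test policy that rarely treats on temperature, the same feature signals high risk.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, p_treat_if_fever, rng):
    """Simulate patients under a hypothetical treatment policy."""
    fever = rng.random(n) < 0.5  # high temperature present?
    # Policy: febrile patients are treated with probability p_treat_if_fever.
    treated = fever & (rng.random(n) < p_treat_if_fever)
    # Assumed untreated risk of shock: 0.6 with fever, 0.2 without.
    risk = np.where(fever, 0.6, 0.2)
    # Assumed effect of treatment: risk drops to 0.05.
    risk = np.where(treated, 0.05, risk)
    shock = rng.random(n) < risk
    return fever, shock

def observed_risk(fever, shock):
    """Empirical P(shock | fever) and P(shock | no fever)."""
    return shock[fever].mean(), shock[~fever].mean()

# Training environment: clinicians treat based on temperature.
train_fever_risk, train_no_fever_risk = observed_risk(*simulate(200_000, 0.95, rng))
# Test environment: clinicians rarely treat based on temperature.
test_fever_risk, test_no_fever_risk = observed_risk(*simulate(200_000, 0.10, rng))

print(f"train: P(shock | fever) = {train_fever_risk:.2f}, "
      f"P(shock | no fever) = {train_no_fever_risk:.2f}")
print(f"test:  P(shock | fever) = {test_fever_risk:.2f}, "
      f"P(shock | no fever) = {test_no_fever_risk:.2f}")
```

A classifier fit to the training environment would rank febrile patients as low risk, which is exactly the group at highest risk once the policy changes.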
Outcome under 10% discount vs. outcome under 20% discount (to each clone)
Notation labels: set of actions, random variable, action
Potential outcomes model the observed outcome under each possible action (or intervention)
Neyman et al., 1923; Rubin, 1974; Rubin, 2005
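A small sketch of the potential outcomes idea, using the customer churn example. The outcome values are simulated and the variable names (`potential`, `assigned`, `observed`) are illustrative assumptions. Each customer has one potential outcome per action, but the data only ever reveal the outcome for the action that was actually taken.

```python
import numpy as np

rng = np.random.default_rng(1)
n_customers = 6
actions = ["10% discount", "20% discount"]  # illustrative action set

# potential[i, k] is Y_i(a_k): outcome of customer i under action k
# (1 = customer stays, 0 = customer churns). Values are simulated.
potential = rng.integers(0, 2, size=(n_customers, len(actions)))

# In real data each customer receives exactly one action ...
assigned = rng.integers(0, len(actions), size=n_customers)

# ... so only the potential outcome of that action is observed.
observed = potential[np.arange(n_customers), assigned]

for i in range(n_customers):
    print(f"customer {i}: got {actions[assigned[i]]}, observed Y = {observed[i]}")
```

The unobserved column for each customer is the counterfactual, which is why it has to be estimated from a model rather than read from the data.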
[Figure: lung capacity (PFVC) vs. years since first symptom; predicted trajectories labeled E[Y(a) | H = h], one per action a]
2017 for discussion of related work.
for a policy when learning from offline data, e.g., Dudik et al., 2011; Paduraru et al., 2013; Jiang and Li, 2016
Brodersen et al., 2015
ads; single intervention
Bottou et al., 2013; Taubman et al., 2009
epidemiology; multiple sequential interventions
Xu, Xu, Saria, 2016
sparse, irregularly sampled longitudinal data; functional outcomes
Lok et al., 2008; Schulam and Saria, 2017
important assumptions:
Neyman et al., 1923; Rubin, 1974; Rubin, 2005
with the potential outcome of the observed treatment
$\{y_i(a_i), a_i, x_i\}_{i=1}^{n}$
covariates we need to assume a non-zero probability of seeing each treatment
[Figure: graph over the variables x_BMI, y_BP, and Exerc]
(observational) data:
Estimation requires a statistical model for estimating conditionals
potential outcome models
If assumptions 1-3 hold, then this is possible!
UAI Tutorial: Saria and Soleimani, 2017
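A hedged sketch of estimating an effect from observational data by adjusting for a measured covariate. The data-generating process is invented: a binary `high_bmi` covariate affects both the chance of exercising and blood pressure, and the true effect of exercise is set to -5. The adjustment shown is simple stratification on the covariate, used here as a stand-in for whatever statistical model one chooses for the conditionals.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Covariate x: high BMI (assumed binary for simplicity).
high_bmi = rng.random(n) < 0.5
# Observational policy: high-BMI individuals exercise less often.
# Both probabilities are strictly between 0 and 1 (non-zero chance of each action).
exercise = rng.random(n) < np.where(high_bmi, 0.2, 0.8)

true_effect = -5.0  # assumed effect of exercise on blood pressure
bp = 120 + 15 * high_bmi + true_effect * exercise + rng.normal(0, 5, n)

# Naive contrast: compares groups that differ in BMI as well as exercise.
naive = bp[exercise].mean() - bp[~exercise].mean()

# Adjusted contrast: compare within each BMI stratum, then average over P(x).
adjusted = 0.0
for x in (False, True):
    stratum = high_bmi == x
    diff = bp[stratum & exercise].mean() - bp[stratum & ~exercise].mean()
    adjusted += stratum.mean() * diff

print(f"naive estimate:    {naive:.2f}")
print(f"adjusted estimate: {adjusted:.2f} (true effect {true_effect})")
```

The adjusted estimate is only trustworthy here because the single confounder is measured and every stratum contains both exercisers and non-exercisers; dropping either condition breaks it.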
Timing between measurements is irregular and random
Creatinine is a test used to measure kidney function.
And so are times between treatments
In the discrete-time setting, we did not treat the timing of events as random
Set of finite sequences of actions
[Figure: marker values (pfvc, pdlco, rvsp) vs. years since diagnosis; medication legend: Prednisone, Methotrex, Cyclophosphamide, Cytoxan]
Treatments administered according to unknown policy (i.e. not an RCT)
Learning is especially difficult because there is time-dependent feedback between actions and outcomes
Robins, 1986
Schulam and Saria, NIPS 2017
z_y: Did we measure an outcome?
z_a: Did we take an action?
y: What is the value of the outcome?
a: What action did we take?
Probability of event happening at this time; probability of mark given event time. Star denotes dependence on history.
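$\ell(\theta) = \sum_{j=1}^{n} \log p^{*}_{\theta}(y_j \mid t_j, z_{y_j}) + \sum_{j=1}^{n} \log \lambda^{*}_{\theta}(t_j)\, p^{*}_{\theta}(a_j, z_{y_j}, z_{a_j} \mid t_j, y_j) - \int_{0}^{\tau} \lambda^{*}_{\theta}(s)\, ds$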
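To make the structure of this log-likelihood concrete, here is a deliberately simplified sketch. It assumes a constant event intensity (`rate`), Gaussian outcome marks that do not depend on history, and no action marks, so it is not the model from the cited paper; the function name and parameters are hypothetical. It keeps the same three pieces: a mark term, a sum of log intensities at the event times, and the integrated intensity over the observation window [0, tau].

```python
import numpy as np

def log_likelihood(times, marks, tau, rate, mark_mean, mark_sd):
    """Log-likelihood of a homogeneous marked point process on [0, tau].

    Simplifying assumptions: constant intensity `rate`, and marks drawn
    i.i.d. from Normal(mark_mean, mark_sd) regardless of event time.
    """
    times = np.asarray(times, dtype=float)
    marks = np.asarray(marks, dtype=float)
    # Mark term: sum_j log p(y_j | t_j) under the Gaussian assumption.
    z = (marks - mark_mean) / mark_sd
    mark_term = np.sum(-0.5 * z**2 - np.log(mark_sd) - 0.5 * np.log(2 * np.pi))
    # Event-time term: sum_j log(rate) minus the integral of rate over [0, tau].
    time_term = times.size * np.log(rate) - rate * tau
    return mark_term + time_term

# Toy data: three measurements in a window of length 4.
times = [0.5, 1.2, 3.0]
marks = [80.0, 78.5, 75.0]
tau = 4.0

ll_at_mle_rate = log_likelihood(times, marks, tau, rate=3 / tau, mark_mean=78.0, mark_sd=3.0)
ll_low_rate = log_likelihood(times, marks, tau, rate=0.5, mark_mean=78.0, mark_sd=3.0)
ll_high_rate = log_likelihood(times, marks, tau, rate=1.0, mark_mean=78.0, mark_sd=3.0)
```

With a constant intensity the event-time term is maximized at rate = (number of events) / tau, which gives a quick sanity check on any implementation.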
Model the conditional probability of the outcome using a GP
independent of potential outcomes
[Figure: lung capacity (PFVC) vs. years since first symptom, with the history H_t marked]
$P(\{Y_s(a) : s > t\} \mid H_t)$
Counterfactual GP (CGP) risk scores are stable across regime A and B training data
Baseline GP scores change
CGP relative risk across patients is also stable across training data A and B
Baseline GP’s relative risk changes
CGP AUC is constant across regimes A and B
Baseline GP’s AUC is unstable
CGP risk scores are unstable if the policy in the training data violates our assumptions
[Figure: factual vs. counterfactual (no treatment) vs. counterfactual (CVVHD) trajectories over time (hours) for BUN, Potassium, HR, Creatinine, Calcium, and Blood Pressure]
Continuous-time actions, continuous-time multi-variate trajectories
Input x(t) convolved with impulse-response h(t) to generate response ρ(t)
[Figure: input x(t), impulse response h(t), and response ρ(t) = x(t) ∗ h(t); example 2nd-order and 3rd-order impulse responses, including a 2nd-order response with complex roots]
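$\rho(t) = x(t) * h(t) = \int_{-\infty}^{\infty} x(\tau)\, h(t - \tau)\, d\tau$
Example: $h(t) = \frac{\alpha\beta}{\beta - \alpha}\left(e^{-\alpha t} - e^{-\beta t}\right)\mathbb{1}(t \ge 0)$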
To allow sharing across signals:
$g_d(t) = \psi\,\underbrace{\rho_0(t)}_{\text{shared}} + (1 - \psi)\,\underbrace{\rho_d(t)}_{\text{signal-specific}}, \quad \psi \in [0, 1]$
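A numerical sketch of the two formulas above, assuming a uniform time grid and a discretized convolution. The dosing input, the values of alpha, beta, and psi, and the helper names (`impulse_response`, `response`) are all illustrative choices, not values from the slides.

```python
import numpy as np

def impulse_response(t, alpha, beta):
    """h(t) = alpha*beta/(beta-alpha) * (exp(-alpha t) - exp(-beta t)) for t >= 0."""
    t = np.asarray(t, dtype=float)
    h = alpha * beta / (beta - alpha) * (np.exp(-alpha * t) - np.exp(-beta * t))
    return np.where(t >= 0, h, 0.0)

def response(x, h, dt):
    """Riemann-sum approximation of rho(t) = integral x(tau) h(t - tau) d tau."""
    return np.convolve(x, h)[: len(x)] * dt

dt = 0.01
t = np.arange(0.0, 20.0, dt)
# Hypothetical input: a unit-rate treatment given for 1 <= t < 2.
x = ((t >= 1.0) & (t < 2.0)).astype(float)

# Shared response rho_0 and signal-specific response rho_d (parameters assumed).
rho_shared = response(x, impulse_response(t, alpha=0.5, beta=2.0), dt)
rho_signal = response(x, impulse_response(t, alpha=0.2, beta=1.0), dt)

# Mix the two with weight psi in [0, 1].
psi = 0.7
g_d = psi * rho_shared + (1 - psi) * rho_signal

print(f"peak of shared response at t = {t[np.argmax(rho_shared)]:.2f}")
```

This particular h(t) integrates to 1, so the area under each response matches the area under the input; the two rate parameters only control how quickly the effect builds up and decays.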
Similar ideas in pharmacokinetics:
Cutler, 1978; Shargel et al., 2005; Rich et al., 2016
Soleimani, Subbaswamy, Saria, UAI 2017
Better relative performance at longer prediction horizons. For horizon 7, on test regions with treatment: 15% better than BART and 8% better than LSTM.
[Figure: NRMSE vs. prediction horizon (1 to 7 days) for the proposed model, RNN/LSTM, and BART]
and accountability
Dyagilev and Saria, Machine Learning 2015
Soleimani, Subbaswamy, Saria, UAI 2017
Schulam and Saria, NIPS 2017
Xu, Xu, Saria, MLHC 2016 (JMLR, to appear)
Robins, 1986
Neyman et al., 1923
Rubin, 1974
Rubin, 2005
Soleimani and Saria, UAI 2017
Robins and Hernan, 2009
All references throughout the slides are active links and clickable. For errors and edits, please contact: ssaria@cs.jhu.edu Thanks!