Machine Learning for Healthcare 6.871, HST.956 Lecture 14: Causal Inference Part 1


  1. Machine Learning for Healthcare 6.871, HST.956 Lecture 14: Causal Inference Part 1 David Sontag

  2. Does gastric bypass surgery prevent onset of diabetes?
     [Figure: maps for 1994, 2000, and 2013; legend bins: <4.5%, 4.5%–5.9%, 6.0%–7.4%, 7.5%–8.9%, >9.0%]
     • In Lecture 4 & PS2 we used machine learning for early detection of Type 2 diabetes
     • The health system doesn’t want to know how to predict diabetes; it wants to know how to prevent it
     • Gastric bypass surgery has the highest negative weight (9th most predictive feature)
       – Does this mean it would be a good intervention?

  3. What is the likelihood this patient, with breast cancer, will survive 5 years?
     • Such predictive models are widely used to stage patients. Should we initiate treatment? How aggressive?
     • What could go wrong if we trained a model to predict survival, and then used it to guide patient care?
     [Figure: timeline for patient “Mary” along a Time axis with Diagnosis, Treatment, and Death marked; labels $X$ and $Y$]
     A long survival time may be because of treatment!

  4. What treatment should we give this patient?
     [Figure: Expansion pathology (image from Andy Beck)]
     • People respond differently to treatment
     • Goal: use data from other patients and their journeys to guide future treatment decisions
     • What could go wrong if we trained a model to predict (past) treatment decisions?
     [Figure: patients “David”, “John”, and “Juana” labeled with Treatment A, Treatment B, and Treatment A]
     Best this can do is match current medical practice!

  5. Does smoking cause lung cancer?
     • Doing a randomized controlled trial is unethical
     • Could we simply answer this question by comparing Pr(lung cancer | smoker) vs Pr(lung cancer | nonsmoker)?
     • No! Answering such questions from observational data is difficult because of confounding

  6. To properly answer, we need to formulate these as causal questions:
     • Patient, $X$ (including all confounding factors)
     • Intervention, $T$ (e.g. medication, procedure)
     • Outcome, $Y$
     [Figure: diagram of Patient, Intervention, and Outcome with a “?”; labels: High dimensional, Observational data]

  7. Potential Outcomes Framework (Rubin-Neyman Causal Model)
     • Each unit (individual) $x_i$ has two potential outcomes:
       – $Y_0(x_i)$ is the potential outcome had the unit not been treated: “control outcome”
       – $Y_1(x_i)$ is the potential outcome had the unit been treated: “treated outcome”
     • Conditional average treatment effect for unit $i$:
       $\mathrm{CATE}(x_i) = \mathbb{E}_{Y_1 \sim p(Y_1 \mid x_i)}[Y_1 \mid x_i] - \mathbb{E}_{Y_0 \sim p(Y_0 \mid x_i)}[Y_0 \mid x_i]$
     • Average Treatment Effect:
       $\mathrm{ATE} := \mathbb{E}[Y_1 - Y_0] = \mathbb{E}_{x \sim p(x)}[\mathrm{CATE}(x)]$
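
The relationship between CATE and ATE can be made concrete with a small sketch. The potential-outcome curves below are hypothetical (invented, noise-free functions of age, not taken from the lecture), so that $\mathbb{E}[Y_t \mid x]$ is simply $Y_t(x)$; the population of ages is likewise an assumption.

```python
# Hypothetical, noise-free potential-outcome curves (made-up numbers, not from the lecture).
def y0(age):
    """Control outcome Y_0(x): blood pressure without treatment."""
    return 100.0 + 0.5 * age


def y1(age):
    """Treated outcome Y_1(x): blood pressure with treatment."""
    return 100.0 + 0.25 * age


def cate(age):
    """CATE(x) = E[Y_1 | x] - E[Y_0 | x]; exact here because the curves are noise-free."""
    return y1(age) - y0(age)


def ate(ages):
    """ATE = E_x[CATE(x)], with the expectation taken over the given population."""
    return sum(cate(a) for a in ages) / len(ages)


ages = [40, 50, 60, 70]          # hypothetical population of covariates x
per_unit = [cate(a) for a in ages]
population_effect = ate(ages)
```

In this sketch the effect differs by age (the CATE), while the ATE is one number summarizing the whole population.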

  8. Potential Outcomes Framework (Rubin-Neyman Causal Model)
     • Each unit (individual) $x_i$ has two potential outcomes:
       – $Y_0(x_i)$ is the potential outcome had the unit not been treated: “control outcome”
       – $Y_1(x_i)$ is the potential outcome had the unit been treated: “treated outcome”
     • Observed factual outcome:
       $y_i = t_i Y_1(x_i) + (1 - t_i) Y_0(x_i)$
     • Unobserved counterfactual outcome:
       $y_i^{CF} = (1 - t_i) Y_1(x_i) + t_i Y_0(x_i)$
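
The factual and counterfactual definitions translate directly into code. This is a minimal sketch: the function name and the example values are assumptions for illustration, and in real data only the factual value is ever available.

```python
def factual_and_counterfactual(t, y0, y1):
    """Return (factual, counterfactual) for one unit.

    t  : 1 if the unit was treated, 0 otherwise
    y0 : control potential outcome Y_0(x_i)
    y1 : treated potential outcome Y_1(x_i)
    """
    factual = t * y1 + (1 - t) * y0
    counterfactual = (1 - t) * y1 + t * y0
    return factual, counterfactual


# Hypothetical unit with Y_0 = 6.0 and Y_1 = 5.5.
treated_case = factual_and_counterfactual(1, 6.0, 5.5)
control_case = factual_and_counterfactual(0, 6.0, 5.5)
```

Flipping `t` swaps which potential outcome is observed and which stays hidden.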

  9. The fundamental problem of causal inference
     “The fundamental problem of causal inference”: we only ever observe one of the two outcomes

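  10. Example: Blood pressure and age
      [Figure: $y$ = blood_pres. versus $x$ = age; curves $Y_1(x)$ and $Y_0(x)$; label: Treated]

  11. Blood pressure and age
      [Figure: $y$ = blood_pres. versus $x$ = age; curves $Y_1(x)$ and $Y_0(x)$, with $\mathrm{CATE}(x)$ marked]

  12. Blood pressure and age
      [Figure: $y$ = blood_pres. versus $x$ = age; curves $Y_1(x)$ and $Y_0(x)$, with $\mathrm{ATE}$ marked]

  13. Blood pressure and age
      [Figure: $y$ = blood_pres. versus $x$ = age; curves $Y_1(x)$ and $Y_0(x)$; legend: Treated, Control]

  14. Blood pressure and age
      [Figure: $y$ = blood_pres. versus $x$ = age; curves $Y_1(x)$ and $Y_0(x)$; legend: Treated, Control, Counterfactual treated, Counterfactual control]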

  15. Example from Uri Shalit:

      | (age, gender, exercise, treatment) | Sugar levels had they received medication A | Sugar levels had they received medication B | Observed sugar levels |
      |---|---|---|---|
      | (45, F, 0, A) | 6 | 5.5 | 6 |
      | (45, F, 1, B) | 7 | 6.5 | 6.5 |
      | (55, M, 0, A) | 7 | 6 | 7 |
      | (55, M, 1, B) | 9 | 8 | 8 |
      | (65, F, 0, B) | 8.5 | 8 | 8 |
      | (65, F, 1, A) | 7.5 | 7 | 7.5 |
      | (75, M, 0, B) | 10 | 9 | 9 |
      | (75, M, 1, A) | 8 | 7 | 8 |

  16. Example from Uri Shalit:

      | (age, gender, exercise) | Sugar levels had they received medication A | Sugar levels had they received medication B | Observed sugar levels |
      |---|---|---|---|
      | (45, F, 0) | 6 | 5.5 | 6 |
      | (45, F, 1) | 7 | 6.5 | 6.5 |
      | (55, M, 0) | 7 | 6 | 7 |
      | (55, M, 1) | 9 | 8 | 8 |
      | (65, F, 0) | 8.5 | 8 | 8 |
      | (65, F, 1) | 7.5 | 7 | 7.5 |
      | (75, M, 0) | 10 | 9 | 9 |
      | (75, M, 1) | 8 | 7 | 8 |

  17. Example from Uri Shalit:

      | (age, gender, exercise) | $Y_0$: Sugar levels had they received medication A | $Y_1$: Sugar levels had they received medication B | Observed sugar levels |
      |---|---|---|---|
      | (45, F, 0) | 6 | 5.5 | 6 |
      | (45, F, 1) | 7 | 6.5 | 6.5 |
      | (55, M, 0) | 7 | 6 | 7 |
      | (55, M, 1) | 9 | 8 | 8 |
      | (65, F, 0) | 8.5 | 8 | 8 |
      | (65, F, 1) | 7.5 | 7 | 7.5 |
      | (75, M, 0) | 10 | 9 | 9 |
      | (75, M, 1) | 8 | 7 | 8 |

  18. Example from Uri Shalit:

      | (age, gender, exercise) | Sugar levels had they received medication A | Sugar levels had they received medication B | Observed sugar levels |
      |---|---|---|---|
      | (45, F, 0) | 6 | 5.5 | 6 |
      | (45, F, 1) | 7 | 6.5 | 6.5 |
      | (55, M, 0) | 7 | 6 | 7 |
      | (55, M, 1) | 9 | 8 | 8 |
      | (65, F, 0) | 8.5 | 8 | 8 |
      | (65, F, 1) | 7.5 | 7 | 7.5 |
      | (75, M, 0) | 10 | 9 | 9 |
      | (75, M, 1) | 8 | 7 | 8 |

      mean(sugar | medication B) - mean(sugar | medication A) = ?
      mean(sugar | had they received B) - mean(sugar | had they received A) = ?

  19. Example from Uri Shalit:

      | (age, gender, exercise) | Sugar levels had they received medication A | Sugar levels had they received medication B | Observed sugar levels |
      |---|---|---|---|
      | (45, F, 0) | 6 | 5.5 | 6 |
      | (45, F, 1) | 7 | 6.5 | 6.5 |
      | (55, M, 0) | 7 | 6 | 7 |
      | (55, M, 1) | 9 | 8 | 8 |
      | (65, F, 0) | 8.5 | 8 | 8 |
      | (65, F, 1) | 7.5 | 7 | 7.5 |
      | (75, M, 0) | 10 | 9 | 9 |
      | (75, M, 1) | 8 | 7 | 8 |

      mean(sugar | medication B) - mean(sugar | medication A) = 7.875 - 7.125 = 0.75
      mean(sugar | had they received B) - mean(sugar | had they received A) = 7.125 - 7.875 = -0.75
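
The two differences in this example can be checked with a short script. This is a sketch: the rows are the table values, the treatment each unit received is taken from the version of the table that lists it, and the variable names are illustrative.

```python
# (age, gender, exercise, treatment received, sugar if given A, sugar if given B)
rows = [
    (45, "F", 0, "A", 6.0, 5.5),
    (45, "F", 1, "B", 7.0, 6.5),
    (55, "M", 0, "A", 7.0, 6.0),
    (55, "M", 1, "B", 9.0, 8.0),
    (65, "F", 0, "B", 8.5, 8.0),
    (65, "F", 1, "A", 7.5, 7.0),
    (75, "M", 0, "B", 10.0, 9.0),
    (75, "M", 1, "A", 8.0, 7.0),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Observed outcome: the potential outcome under the treatment actually received.
observed = [(t, a if t == "A" else b) for _, _, _, t, a, b in rows]

# Observational contrast: compare the units that happened to receive B with those that received A.
naive_diff = (mean(y for t, y in observed if t == "B")
              - mean(y for t, y in observed if t == "A"))

# Causal contrast: compare both potential outcomes over all units
# (possible only because this toy table lists both).
causal_diff = mean(b for *_, a, b in rows) - mean(a for *_, a, b in rows)
```

The observational contrast comes out at +0.75 while the causal contrast is -0.75: medication B lowers sugar for every unit, yet the units that received B have higher sugar levels on average.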

  20. Typical assumption: no unmeasured confounders
      $Y_0, Y_1$: potential outcomes for control and treated
      $x$: unit covariates (features)
      $T$: treatment assignment
      We assume: $(Y_0, Y_1) \perp\!\!\!\perp T \mid x$
      The potential outcomes are independent of treatment assignment, conditioned on covariates $x$

  21. Typical assumption: no unmeasured confounders
      $Y_0, Y_1$: potential outcomes for control and treated
      $x$: unit covariates (features)
      $T$: treatment assignment
      We assume: $(Y_0, Y_1) \perp\!\!\!\perp T \mid x$ (ignorability)

  22. Ignorability
      [Figure: graph with nodes $x$, $T$, $Y_0$, $Y_1$]
      $x$: covariates (features)
      $T$: treatment
      $Y_0, Y_1$: potential outcomes
      $(Y_0, Y_1) \perp\!\!\!\perp T \mid x$

  23. Ignorability
      [Figure: graph with nodes $x$, $T$, $Y_0$, $Y_1$]
      $x$: age, gender, weight, diet, heart rate at rest,…
      $T$: anti-hypertensive medication
      $Y_0$: blood pressure after medication A
      $Y_1$: blood pressure after medication B
      $(Y_0, Y_1) \perp\!\!\!\perp T \mid x$

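  24. No Ignorability
      [Figure: graph with nodes $x$, $h$, $T$, $Y_0$, $Y_1$]
      $x$: age, gender, weight, diet, heart rate at rest,…
      $h$: diabetic (a variable outside the covariates $x$)
      $T$: anti-hypertensive medication
      $Y_0$: blood pressure after medication A
      $Y_1$: blood pressure after medication B
      Here $(Y_0, Y_1) \perp\!\!\!\perp T \mid x$ does not hold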

  25. Typical assumption: common support
      $Y_0, Y_1$: potential outcomes for control and treated
      $x$: unit covariates (features)
      $T$: treatment assignment
      We assume: $p(T = t \mid X = x) > 0 \quad \forall t, x$
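
A rough empirical check of common support is to look for covariate strata in which some treatment never appears. The sketch below reuses the eight units from the sugar-level example; the helper name and the choice of strata are assumptions, and with finite data such a check can only flag apparent violations, not prove the assumption.

```python
from collections import defaultdict

# ((age, gender, exercise), treatment received) for the eight units in the sugar-level example.
units = [
    ((45, "F", 0), "A"), ((45, "F", 1), "B"),
    ((55, "M", 0), "A"), ((55, "M", 1), "B"),
    ((65, "F", 0), "B"), ((65, "F", 1), "A"),
    ((75, "M", 0), "B"), ((75, "M", 1), "A"),
]


def strata_missing_a_treatment(units, key):
    """Return the strata, defined by key(covariates), in which some treatment never occurs."""
    all_treatments = {t for _, t in units}
    seen = defaultdict(set)
    for covariates, t in units:
        seen[key(covariates)].add(t)
    return sorted(s for s, ts in seen.items() if ts != all_treatments)


# Stratifying on age alone: every age group contains both medications.
missing_by_age = strata_missing_a_treatment(units, key=lambda c: c[0])

# Stratifying on all covariates: every stratum holds a single unit, so one medication is always absent.
missing_by_all = strata_missing_a_treatment(units, key=lambda c: c)
```

The finer the stratification, the harder it is to find both treatments in every stratum, which is why common support is a practical concern with high-dimensional covariates.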

  26. Framing the question
      1. Where could we go for data to answer these questions?
      2. What should X, T, and Y be to satisfy ignorability?
      3. What is the specific causal inference question that we are interested in?
      4. Are you worried about common support?

  27. Outline for lecture
      • How to recognize a causal inference problem
      • Potential outcomes framework
        – Average treatment effect (ATE)
        – Conditional average treatment effect (CATE)
      • Algorithms for estimating ATE and CATE

  28. Average Treatment Effect
      The expected causal effect of $T$ on $Y$: $\mathrm{ATE} := \mathbb{E}[Y_1 - Y_0]$

  29. Average Treatment Effect: the adjustment formula
      • Assuming ignorability, we will derive the adjustment formula (Hernán & Robins 2010, Pearl 2009)
      • The adjustment formula is extremely useful in causal inference
      • Also called the G-formula

  30. Average Treatment Effect
      The expected causal effect of $T$ on $Y$: $\mathrm{ATE} := \mathbb{E}[Y_1 - Y_0]$

  31. Average Treatment Effect
      The expected causal effect of $T$ on $Y$: $\mathrm{ATE} := \mathbb{E}[Y_1 - Y_0]$
      $\mathbb{E}[Y_1] = \mathbb{E}_{x \sim p(x)}\big[\mathbb{E}_{Y_1 \sim p(Y_1 \mid x)}[Y_1 \mid x]\big]$ (law of total expectation)

  32. Average Treatment Effect
      The expected causal effect of $T$ on $Y$: $\mathrm{ATE} := \mathbb{E}[Y_1 - Y_0]$
      $\mathbb{E}[Y_1] = \mathbb{E}_{x \sim p(x)}\big[\mathbb{E}_{Y_1 \sim p(Y_1 \mid x)}[Y_1 \mid x]\big]$
      $= \mathbb{E}_{x \sim p(x)}\big[\mathbb{E}_{Y_1 \sim p(Y_1 \mid x)}[Y_1 \mid x, T = 1]\big]$ (ignorability: $(Y_0, Y_1) \perp\!\!\!\perp T \mid x$)

  33. Average Treatment Effect
      The expected causal effect of $T$ on $Y$: $\mathrm{ATE} := \mathbb{E}[Y_1 - Y_0]$
      $\mathbb{E}[Y_1] = \mathbb{E}_{x \sim p(x)}\big[\mathbb{E}_{Y_1 \sim p(Y_1 \mid x)}[Y_1 \mid x]\big]$
      $= \mathbb{E}_{x \sim p(x)}\big[\mathbb{E}_{Y_1 \sim p(Y_1 \mid x)}[Y_1 \mid x, T = 1]\big]$
      $= \mathbb{E}_{x \sim p(x)}\big[\mathbb{E}[Y_1 \mid x, T = 1]\big]$ (shorter notation)
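
Applying the same steps to $Y_0$, and assuming the observed outcome equals the potential outcome under the treatment actually received, suggests the plug-in estimate $\mathbb{E}_{x \sim p(x)}\big[\mathbb{E}[Y \mid x, T = 1] - \mathbb{E}[Y \mid x, T = 0]\big]$. The sketch below computes it for a discrete covariate by stratification. The records are hypothetical numbers chosen for illustration, the function names are assumptions, and the estimate only makes sense under ignorability and common support.

```python
from collections import defaultdict

# Hypothetical observational records (x, t, y): binary covariate, binary treatment, outcome.
records = [
    (0, 1, 5.0), (0, 0, 6.0), (0, 0, 6.0), (0, 0, 6.0),
    (1, 1, 9.0), (1, 1, 9.0), (1, 1, 9.0), (1, 0, 10.0),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def adjusted_ate(records):
    """Plug-in estimate of E_x[ E[Y | x, T=1] - E[Y | x, T=0] ] for discrete x."""
    by_x = defaultdict(list)
    for x, t, y in records:
        by_x[x].append((t, y))
    total = 0.0
    for group in by_x.values():
        # Both means must exist in every stratum, which is the common support requirement.
        treated = mean(y for t, y in group if t == 1)
        control = mean(y for t, y in group if t == 0)
        weight = len(group) / len(records)  # empirical p(x)
        total += weight * (treated - control)
    return total


def naive_difference(records):
    """Unadjusted contrast E[Y | T=1] - E[Y | T=0], ignoring x."""
    return (mean(y for _, t, y in records if t == 1)
            - mean(y for _, t, y in records if t == 0))


adjusted = adjusted_ate(records)
naive = naive_difference(records)
```

In these made-up records, units with `x = 1` have higher outcomes and are treated more often, so the unadjusted contrast is +1.0 while the adjusted estimate is -1.0.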
