Semi-supervised Prediction of Comorbid Rare Conditions, Chirag Nagpal

  1. Semi-supervised Prediction of Comorbid Rare Conditions
     Chirag Nagpal¹, K Miller¹, T Pellathy², M Hravnak², G Clermont², M Pinsky², A Dubrawski¹
     ¹ Auton Lab, Carnegie Mellon University
     ² University of Pittsburgh
     chiragn@cs.cmu.edu
     November 18, 2017

  2. Overview
     1 Background
        Motivation
        Prior Work
     2 Dataset Description
        Sources
        Feature Extraction
        Ground Truth
     3 Approach
        Baselines
        PreCoRC: Prediction of Comorbid Rare Conditions
     4 Results
     5 Future Work

  8. Motivation
     • Rare conditions are potentially under-reported in EHRs.
     • Prevent Failure-to-Rescue (FTR): the death of a hospitalised patient from a treatable condition.
     • The ability to predict conditions would allow for proactive healthcare.
     • Challenge: FTR conditions are under-reported, so the available data is too sparse for standard machine learning.

  12. Motivation
      • Leverage historical EHR data to build an early warning system that identifies patients at risk.
      • Augment scarce ground truth to obtain operationally useful models.
      • Keep the model interpretable by the end user, the medical practitioner.

  16. Tree Featurization
      • Tree Featurization [Singh et al., 2014]: explicitly leverage the ICD hierarchy in the feature representation.
        Example path: Influenza (487) → Pneumonia & Influenza (480-488) → Respiratory System (460-519)
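
The idea above can be sketched in a few lines of Python: each ICD-9 code on a record is expanded with all of its ancestors, so two records that share only a higher-level category still share a feature. This is an illustration, not the exact featurization of Singh et al.; `PARENT` and `tree_features` are hypothetical names, and the map holds only the three-level example from the slide rather than the full ICD-9 hierarchy.

```python
# Hypothetical fragment of the ICD-9 hierarchy (child -> parent),
# limited to the example shown on the slide.
PARENT = {
    "487": "480-488",
    "480-488": "460-519",
}


def tree_features(codes, parent=PARENT):
    """Return the input codes together with all of their ancestors."""
    features = set()
    for code in codes:
        node = code
        # Walk up the hierarchy; stop at the root or at a node whose
        # ancestors have already been added.
        while node is not None and node not in features:
            features.add(node)
            node = parent.get(node)
    return features


print(sorted(tree_features(["487"])))
```

Codes missing from the map (for example a code from another chapter) are kept as leaf-only features, which matches a flat one-hot encoding for those codes.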

  23. OoD Embedding Learning
      • Out-of-Domain Embedding Learning [Liu et al., 2016]: learn embeddings from external sources to obtain a dense representation of ICD codes.
        [Figure: external corpora (PubMed, PubMed Central Open Access); One-Hot Encoding → CBOW → Dense Encoding]

  28. Feature Extraction
      Static Data
         1 Age
         2 Gender
         3 Ethnicity
      Admission Data
         1 ICD-9 Codes
            • Diagnosis Codes
            • Procedure Codes
            • Admission Codes
         2 Diagnosis Related Groups
      Aggregated Records
         1 $X_{T_n} = \{1, 0, \dots, 0, 1\}$
         2 $X'_{T_n} = \sum \{X_{T_1}, \dots, X_{T_n}\}$
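
A minimal sketch of the aggregated-records step, assuming that each $X_{T_k}$ is the binary code vector of a patient's k-th record and that the sum is taken element-wise over all records up to and including the current one. The function name `aggregate_records` is hypothetical.

```python
from itertools import accumulate


def aggregate_records(records):
    """Running element-wise sum over one patient's chronological records.

    records: list of equal-length 0/1 vectors, oldest first.
    Returns one aggregated vector per record.
    """
    def add(a, b):
        return [x + y for x, y in zip(a, b)]

    return list(accumulate(records, add))


history = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]
print(aggregate_records(history))  # [[1, 0, 1], [1, 0, 2], [2, 1, 2]]
```

Under this reading, the aggregated vector for the n-th record counts how often each code has appeared in the patient's history so far.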

  31. Clinical Tasks
      • Intubation & Mechanical Ventilation (Task-IMV)
        A treatment scenario occurring in the context of Failure-to-Rescue (FTR) cases.
        ICD codes: 96.04, 96.71, 96.72, 518.81
      • Venous Thromboembolism (Task-VTE)
        Includes patients diagnosed with pulmonary embolism as well as those with deep vein thrombosis; an under-reported, life-threatening condition.
        ICD codes: 415.1, 451.11, 451.2, 451.81, 453.8
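
One way the ground-truth labels could be derived from these code lists is sketched below. It assumes a record is positive for a task when it carries at least one of the task's ICD codes, matched exactly; the slides do not state the matching rule, so this is an assumption, and `TASK_CODES` and `label_record` are hypothetical names.

```python
# ICD codes per task, as listed on the slide.
TASK_CODES = {
    "Task-IMV": {"96.04", "96.71", "96.72", "518.81"},
    "Task-VTE": {"415.1", "451.11", "451.2", "451.81", "453.8"},
}


def label_record(record_codes, task):
    """Return 1 if any code on the record belongs to the task, else 0.

    Assumption: exact string match on the code; sub-codes are not expanded.
    """
    task_codes = TASK_CODES[task]
    return int(any(code in task_codes for code in record_codes))


print(label_record(["487", "96.71"], "Task-IMV"))  # 1
print(label_record(["487", "96.71"], "Task-VTE"))  # 0
```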

  36. Clinical Tasks
      Intubation & Mechanical Ventilation (Task-IMV): 1266 positives ≈ 1.173%
         Task-IMV-10: uses 10% of the labelled data
         Task-IMV-90: uses 90% of the labelled data
      Venous Thromboembolism (Task-VTE): 56 positives ≈ 0.0519%

  42. Baselines
      • Logistic Regression with ℓ2 penalty (LR)
      • Random Forest ensemble (RF)
      • Principal Component Analysis (PCA-LR, PCA-RF)
      • Non-Negative Matrix Factorisation (NMF-LR, NMF-RF)

  46. PreCoRC Pipeline
      [Figure: PreCoRC pipeline. Components shown: Historical Data, Test Data, ICD-9 Hierarchy, Graph Structure (T-Edges, O-Edges, I-Edges, P-Edges), Binary Classifier, Prior Prediction Score, Label Re-Distribution, Final Prediction.]

  50. Graph Construction
      [Figure: graph with three groups of nodes. Patients: Patient A, Patient B. Records: Record 1 ... Record n for each patient. ICD-9 Ontology: 487 Influenza; 480-488 Pneumonia & Influenza; 460-519 Respiratory Diseases. Edge types: I-Edges, O-Edges, P-Edges, T-Edges.]

  53. Label Propagation
      Harmonic Energy Minimization [Zhu et al., 2003]
         $$E(f) = \sum_{i \in \mathcal{L}} (y_i - f_i)^2 D_{ii} + \lambda \sum_{i,j} (f_i - f_j)^2 A_{ij}$$
      Soft Label HEM [Wang et al., 2013]
         $$E(f) = \sum_{i \in \mathcal{L}} (y_i - f_i)^2 D_{ii} + \lambda \left( w_0 \sum_{i \in \mathcal{U}} (f_i - \pi_i)^2 D_{ii} + \sum_{i,j} (f_i - f_j)^2 A_{ij} \right)$$
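
A minimal sketch of how such an energy could be minimised, not the authors' implementation. It makes several assumptions: the soft-label energy has the form labelled fit + λ (w₀ · prior fit on unlabelled nodes + smoothness over edges); A is a symmetric adjacency matrix with zero diagonal; D is the degree matrix with D_ii = Σ_j A_ij; the pairwise sum runs over ordered pairs; and no node is isolated. Setting the derivative with respect to each f_k to zero gives a closed-form update per node, which the hypothetical function `soft_label_hem` applies in repeated Gauss-Seidel sweeps.

```python
def soft_label_hem(adj, labels, prior, lam=1.0, w0=1.0, sweeps=500):
    """Coordinate-wise minimisation of the soft-label energy (sketch).

    adj:    symmetric weight matrix (list of lists), zero diagonal
    labels: dict mapping labelled node index -> y_i
    prior:  list of prior scores pi_i, one per node
    """
    n = len(adj)
    degree = [sum(row) for row in adj]  # D_ii, assumed to be the node degree
    f = [labels.get(i, prior[i]) for i in range(n)]
    for _ in range(sweeps):
        for k in range(n):
            neighbour_sum = sum(adj[k][j] * f[j] for j in range(n) if j != k)
            if k in labels:
                # Stationarity: D_kk (f_k - y_k) + 2*lam*sum_j A_kj (f_k - f_j) = 0
                f[k] = (degree[k] * labels[k] + 2 * lam * neighbour_sum) / (
                    degree[k] * (1 + 2 * lam)
                )
            else:
                # Stationarity: w0 D_kk (f_k - pi_k) + 2*sum_j A_kj (f_k - f_j) = 0
                f[k] = (w0 * degree[k] * prior[k] + 2 * neighbour_sum) / (
                    degree[k] * (w0 + 2)
                )
    return f


# Path graph 0 - 1 - 2: node 0 labelled positive, node 2 negative, node 1 unlabelled.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
scores = soft_label_hem(adj, labels={0: 1.0, 2: 0.0}, prior=[0.5, 0.5, 0.5])
print(scores)
```

With `w0=0` the unlabelled update reduces to the degree-weighted average of the neighbours, which corresponds to the first energy; a positive `w0` pulls unlabelled scores toward the prior π_i, for example the score of a binary classifier.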
