
A Deep Learning Pipeline for Patient Diagnosis Prediction Using Electronic Health Records
BIOKDD 2020, 19th International Workshop on Data Mining in Bioinformatics
Leopold Franz, Dr. Yash Raj Shrestha, Dr. Bibek Paudel
29.05.2020


SLIDE 1

A Deep Learning Pipeline for Patient Diagnosis Prediction Using Electronic Health Records

BIOKDD 2020 • 19th International Workshop on Data Mining in Bioinformatics

Leopold Franz, Dr. Yash Raj Shrestha, Dr. Bibek Paudel

SLIDE 2

Multimorbidity: A growing problem

6% 9% 16%

Ageing Population

Increasing Prevalence

General Increase of Multimorbidity

SLIDE 3

Diagnosing Multiple Diseases

Doctors underdiagnose multimorbid patients 71% of the time.

[Hausmann-Thürig et al., 2019]


Solution:

Data Analytics for Disease Diagnosis (EHR)

SLIDE 4

Related Work

Column groups: Literature, Data, Method, Evaluation Task

Columns: Title; Authors; Year; Publication Venue; Healthcare Dataset; Knowledge Graph; Text; Structured Numerical Data; Other; FCNN; RNN; Transformer; Other; Readmission Prediction; Mortality Prediction; Diagnosis Detection

Rows (filled cells only, in slide order):

Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records [Miotto et al., 2016]; R.M., L.L., B.K., J.D.; 2016; Scientific Reports; Mount Sinai Data Warehouse; Yes; Yes; Stacked Denoising Autoencoder; Yes

MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare [Choi et al., 2018]; E.C., C.X., W.S., J.S.; 2018; NeurIPS; Sutter Health; Yes; Med2Vec, GRAM; GRU; Yes

Scalable and accurate deep learning with electronic health records [Rajkomar et al., 2018]; A.R., E.O., K.C., J.D.; 2018; npj Digital Medicine; UCSF, UCM Hospital Data; Yes; Yes; Boosted NN; LSTM; Attention TANN; Length of Stay Prediction; Yes; Yes; Yes

Improved Hierarchical Patient Classification with Language Model Pretraining over Clinical Notes [Kemp et al., 2019]; J.K., A.R., A.D.; 2019; Pre-print (ArXiv); MIMIC-III; Yes; Yes; Hierarchical RNN; Length of Stay Prediction; Yes; Yes; Yes

ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission [Huang et al., 2019]; K.H., J.A., R.R.; 2019; Pre-print (ArXiv); MIMIC-III; Yes; BERT; Language Tasks (NER, RE, Q&A); Yes

SLIDE 5

Data-driven Methods


Data Representation Learning

SLIDE 6

Data

MIMIC-III

Intensive Care Unit (ICU), Beth Israel Deaconess Medical Center, Boston, Massachusetts

SLIDE 7

Representation

MIMIC-III

SLIDE 8

Representation

SLIDE 9

Deep Learning Models

EHR

Structured Data: Pre-processing
Unstructured Data: Pre-processing

Numeric sample rows shown on the slide:
2,0.8570960175731908,0.217059
3,0.8600811744347597,0.232499
4,93.94117647058823,27.890168
5,107.05833333333334,18.43740
6,128.0,28.160255680657446
24,1.4332023841777235,0.32984
25,152.64563907386574,56.9523
26,234.93350328315236,146.972
28,53.892805755395685,33.9506
29,-52.037351426321635,56.657
34,20.11627906976744,6.942653

SLIDE 10

Metrics

AUROC, AUPRC, Recall@Prec80

Text Analytics Numerical Analytics
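The three metrics named on this slide can be computed from binary labels and predicted scores. The following is a minimal pure-Python sketch, not the authors' evaluation code. It assumes that Recall@Prec80 means the highest recall reachable at any score threshold where precision is at least 80%, and it computes AUPRC as average precision; both readings, and all function names, are assumptions.

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pr_points(labels, scores):
    """(precision, recall) at each distinct threshold, from the highest score down."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    total_pos = sum(labels)
    points, tp, fp = [], 0, 0
    for i, (score, y) in enumerate(ranked):
        tp += y
        fp += 1 - y
        # Emit a point only when the threshold actually changes.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((tp / (tp + fp), tp / total_pos))
    return points


def auprc(labels, scores):
    """Average precision: precision weighted by each increase in recall."""
    area, prev_recall = 0.0, 0.0
    for precision, recall in pr_points(labels, scores):
        area += precision * (recall - prev_recall)
        prev_recall = recall
    return area


def recall_at_precision(labels, scores, min_precision=0.8):
    """Highest recall among thresholds whose precision is at least min_precision."""
    ok = [r for p, r in pr_points(labels, scores) if p >= min_precision]
    return max(ok, default=0.0)


# Toy example with made-up labels and scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
print(auroc(labels, scores), auprc(labels, scores), recall_at_precision(labels, scores))
```

In a multi-label setting such as diagnosis-code prediction, these functions would be applied per code and then averaged; how the averaging is done is not stated on the slide.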

SLIDE 11

Results

SLIDE 12

Results

SLIDE 13

Interpretation

SLIDE 14

Limitations

Treating Multimorbidity
Limited Data
Interpretability
Ethics
Acceptance

SLIDE 15

Comments and Questions


https://arxiv.org/abs/2006.16926

SLIDE 16

Appendix

SLIDE 17

References

[Atella et al., 2019] Atella, V., Piano Mortari, A., Kopinska, J., Belotti, F., Lapi, F., Cricelli, C., and Fontana, L. (2019). Trends in age-related disease burden and healthcare utilization. Aging Cell, 18(1):e12861. PMID: 30488641.

[UN, 2020] UN (2020). World Population Ageing 2019. United Nations, Department of Economic and Social Affairs, Population Division.

[WHO, 2016] WHO (2016). Multimorbidity. Technical Series on Safer Primary Care. World Health Organization.

[Deetjen et al., 2020] Deetjen, U., Biesdorf, S., Guiliani, G., and Oberhänsli, W. (2020). Unleashing the power of digital health through ecosystems.

[Hausmann-Thürig et al., 2019] Hausmann-Thürig, D., Kiesel, V., Zimmerli, L., Schlatter, N., von Gunten, A., Wattinger, N., and Rosemann, T. (2019). Sensitivity for multimorbidity: The role of diagnostic uncertainty of physicians when evaluating multimorbid video case-based vignettes. PLoS ONE, 14(4):e0215049.

SLIDE 18

Column groups: Literature, Data, Method, Evaluation Task

Columns: Title; Authors; Year; Publication Venue; Healthcare Dataset; Knowledge Graph; Text; Structured Numerical Data; Other; FCNN; RNN; Transformer; Other; Readmission Prediction; Mortality Prediction; Diagnosis Detection

Rows (filled cells only, in slide order):

Knowledge graph solutions in healthcare for improved clinical outcomes [Aasman and Mirhaji, 2018]; J.A., P.M.; 2018; CEUR; UMLS, SNOMEDCT, OMOP; Yes; Build Knowledge Graph

Learning a Health Knowledge Graph from Electronic Medical Records [Rotmensch et al., 2017]; M.R., Y.H., A.T., D.S.; 2017; Scientific Reports; Unknown tertiary teaching hospital + GHKG; Yes; Bayesian Network with Noisy OR gates; Build Knowledge Graph

Learning to summarize radiology findings [Zhang et al., 2018]; Y.Z., D.D., T.Q., C.L.; 2018; EMNLP-LOUHI; Stanford University Hospital; Yes; Bidirectional LSTM with Attention; Summarize Radiology Report

Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records [Miotto et al., 2016]; R.M., L.L., B.K., J.D.; 2016; Scientific Reports; Mount Sinai Data Warehouse; Yes; Yes; Stacked Denoising Autoencoder; Yes

MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare [Choi et al., 2018]; E.C., C.X., W.S., J.S.; 2018; NeurIPS; Sutter Health; Yes; Med2Vec, GRAM; GRU; Yes

Neural networks versus Logistic regression for 30 days all-cause readmission prediction [Allam et al., 2019]; A.A., M.N., G.T., M.K.; 2019; Scientific Reports; HF Dataset from HCUP; Yes; Logistic Regression, CRF, CNN; RNN, LSTM; Yes

Scalable and accurate deep learning with electronic health records [Rajkomar et al., 2018]; A.R., E.O., K.C., J.D.; 2018; npj Digital Medicine; UCSF, UCM Hospital Data; Yes; Yes; Boosted NN; LSTM; Attention TANN; Length of Stay Prediction; Yes; Yes; Yes

Improved Hierarchical Patient Classification with Language Model Pretraining over Clinical Notes [Kemp et al., 2019]; J.K., A.R., A.D.; 2019; Pre-print (ArXiv); MIMIC-III; Yes; Yes; Hierarchical RNN; Length of Stay Prediction; Yes; Yes; Yes

BioBERT: a pre-trained biomedical language representation model for biomedical text mining [Lee et al., 2019]; J.L., W.Y., S.K., J.K.; 2019; Bioinformatics; PubMed Abstracts, PMC Articles; Yes; BERT; Language Tasks (NER, RE, Q&A)

Publicly Available Clinical BERT Embeddings [Alsentzer et al., 2019]; E.A., J.M., W.B., M.M.; 2019; ClinicalNLP, NAACL, WS; MIMIC-III; Yes; BERT; Language Tasks (NER, RE, Q&A)

ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission [Huang et al., 2019]; K.H., J.A., R.R.; 2019; Pre-print (ArXiv); MIMIC-III; Yes; BERT; Language Tasks (NER, RE, Q&A); Yes

SLIDE 19

DeepObserver Pre-processing

1. Filter Data
2. Time to Discharge
3. Group Time Values
4. Change Shape
5. Normalize Values
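The five pre-processing steps named on this slide can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the event layout `(item_id, chart_hour, value)`, the 24-hour bins, z-score normalization with a per-item mean and standard deviation, and zero-filling of empty cells are all illustrative choices, and `preprocess` and its parameters are hypothetical names.

```python
from collections import defaultdict
from statistics import mean


def preprocess(events, discharge_hour, item_ids, stats, bin_hours=24, n_bins=3):
    """Turn one admission's events into an n_bins x len(item_ids) matrix.

    events: (item_id, chart_hour, value) tuples (assumed layout).
    stats:  item_id -> (mean, std) used for z-score normalization (assumed).
    Cells with no measurement stay at 0.0, i.e. the normalized mean.
    """
    grouped = defaultdict(list)
    for item_id, chart_hour, value in events:
        # 1. Filter data: keep only the selected items.
        if item_id not in item_ids:
            continue
        # 2. Time to discharge, in hours.
        hours_left = discharge_hour - chart_hour
        # 3. Group time values into fixed-width bins.
        time_bin = int(hours_left // bin_hours)
        if 0 <= time_bin < n_bins:
            grouped[(time_bin, item_id)].append(value)
    # 4. Change shape: one row per time bin, one column per item.
    matrix = [[0.0] * len(item_ids) for _ in range(n_bins)]
    for (time_bin, item_id), values in grouped.items():
        mu, sigma = stats[item_id]
        # 5. Normalize the bin average with the item's mean and std.
        matrix[time_bin][item_ids.index(item_id)] = (mean(values) - mu) / sigma
    return matrix


# Made-up example: two tracked items, one untracked item (9), discharge at hour 72.
item_ids = [4, 5]
stats = {4: (90.0, 10.0), 5: (100.0, 20.0)}
events = [(4, 70, 100.0), (4, 60, 80.0), (5, 30, 120.0), (9, 10, 1.0)]
matrix = preprocess(events, discharge_hour=72, item_ids=item_ids, stats=stats)
print(matrix)
```

The resulting time-by-feature matrix is the kind of input a convolution along the time axis could consume; the real bin width, window length, and missing-value handling are not given on the slide.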

SLIDE 20

DeepObserver CNN Model

Time Dimensional Filters
Number of CCS codes to predict
Regularization Technique (three occurrences in the diagram)

Loss: Binary Cross Entropy
Learning Rate: 10^-3
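To make the loss and learning rate concrete, here is a minimal pure-Python sketch of multi-label binary cross-entropy, with one output per CCS code, followed by a single gradient step at a learning rate of 10^-3. The slide does not name the optimizer, so the plain gradient-descent update is an assumption, as are the function names and the toy values.

```python
import math


def sigmoid(z):
    """Map a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(targets, probs, eps=1e-7):
    """Mean binary cross-entropy over all labels (one label per code)."""
    total = 0.0
    for y, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(targets)


# Toy example: three codes, the patient has only the first one.
logits = [2.0, -1.0, 0.5]
targets = [1, 0, 0]
learning_rate = 1e-3

probs = [sigmoid(z) for z in logits]
loss_before = binary_cross_entropy(targets, probs)

# The gradient of the mean BCE with respect to each logit is (p - y) / n.
grads = [(p - y) / len(logits) for p, y in zip(probs, targets)]
logits_after = [z - learning_rate * g for z, g in zip(logits, grads)]
loss_after = binary_cross_entropy(targets, [sigmoid(z) for z in logits_after])
print(loss_before, loss_after)
```

Because each code gets its own sigmoid output, a patient can be assigned several diagnoses at once, which suits the multimorbidity setting the talk addresses.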

SLIDE 21

ClinicalBERT Pre-processing

1. Filter data
2. Split data according to task
3. Clean text: A) Lowercase, B) Replace abbreviations, C) Remove superfluous characters
4. Cut text to equal length
5. Split train:val:test
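The text-cleaning, length-cutting, and splitting steps named on this slide can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code. The abbreviation map, the set of characters kept, the word-based chunking, and the 80:10:10 split ratio are all assumptions, and every function name is hypothetical.

```python
import random
import re

# Hypothetical abbreviation map; the slides do not list the actual replacements.
ABBREVIATIONS = {"pt": "patient", "hx": "history", "dx": "diagnosis"}
ABBREVIATION_PATTERN = re.compile(r"\b(" + "|".join(ABBREVIATIONS) + r")\b")


def clean_note(text):
    """Apply the three cleaning sub-steps to one clinical note."""
    text = text.lower()  # A) lowercase
    # B) replace whole-word abbreviations
    text = ABBREVIATION_PATTERN.sub(lambda m: ABBREVIATIONS[m.group(1)], text)
    # C) drop everything except letters, digits, spaces, periods and commas
    text = re.sub(r"[^a-z0-9 .,]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def chunk(text, length):
    """Cut a note into pieces of `length` words; the last piece may be shorter."""
    words = text.split()
    return [" ".join(words[i:i + length]) for i in range(0, len(words), length)]


def split_dataset(items, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle deterministically, then split into train, validation and test."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * ratios[0])
    n_val = int(len(items) * ratios[1])
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]


cleaned = clean_note("Pt has Hx of   DX:\n[**diabetes**]")
train, val, test = split_dataset(range(10))
print(cleaned, chunk(cleaned, 4), len(train), len(val), len(test))
```

In practice the split would normally be made per patient rather than per text chunk, so that pieces of one patient's notes do not land in both training and test sets; the slide does not say which unit was used.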

SLIDE 22

ClinicalBERT_Multi Model

SLIDE 23

Interpretation

query, key