SLIDE 1 Convolutional Neural Networks For Modeling Temporal Biomarkers And Disease Predictions
Narges Razavian
New York University Langone Medical Center GTC 2017
In collaboration with: David Sontag, PhD; Saul Blecker, MD; Ann-Marie Schmidt, MD; Enrico Bertini, PhD; Rahul Krishnan; YD Choi; Josua Krause; Somesh Nigam; Aaron Smith-McLallen; Ravi Chawla
SLIDE 2 Parallel Developments
Healthcare world getting digital: EHR adoption by healthcare centers in the US
Deep learning progress: Error rate on the ImageNet object recognition challenge
SLIDE 3
What is captured in the EHR?
Source: healthcare.gov
SLIDE 4
Healthcare has joined the data-rich world
SLIDE 5 Moving from Treatment to Prevention
Challenges:
Each individual has a different ‘healthy’ baseline.
- Temporal patterns/trends are predictive
Each biomarker varies at a different speed in our bodies.
Measurements are sparse, asynchronous and correlated.
Many correlated outcomes are observed per patient.
- Can we leverage this correlation?
SLIDE 6 Biomarkers and Outcomes
Biomarkers measurements
SLIDE 7 Biomarkers and Outcomes
Biomarkers measurements
Phenotype (diseases)
SLIDE 8 Biomarkers and Outcomes
Biomarkers measurements
Phenotype (diseases)
SLIDE 9 Biomarkers and Outcomes
Biomarkers measurements
Phenotype (diseases)
SLIDE 10
Step 1 Learn each biomarker from other biomarkers time-series
SLIDE 11 Kernel Regression
Figure: observations X (measurement time series) plotted against time; the unobserved time points are the values we want to estimate.
SLIDE 12 Kernel Regression
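Figure: observations X (measurement time series) plotted against time; the unobserved time points are the values we want to estimate.
$$E[X(v)] = \int x \, P(x \mid t = v, X_{\text{train}}) \, dx$$
$$E[X \mid t = v, X_{\text{train}}] = \int x \, \frac{P(x, t = v \mid X_{\text{train}})}{P(t = v \mid X_{\text{train}})} \, dx$$
$$E[X \mid t = v, X_{\text{train}}] = \int x \, \frac{\sum_{x_i, t_i} K(x - x_i, v - t_i)}{\sum_{t_i} K(v - t_i)} \, dx$$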
SLIDE 13 Kernel Regression
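Figure: observations X (measurement time series) plotted against time; the unobserved time points are the values we want to estimate.
$$E[X(v)] = \int x \, P(x \mid t = v, X_{\text{train}}) \, dx$$
$$E[X \mid t = v, X_{\text{train}}] = \int x \, \frac{P(x, t = v \mid X_{\text{train}})}{P(t = v \mid X_{\text{train}})} \, dx$$
$$E[X \mid t = v, X_{\text{train}}] = \int x \, \frac{\sum_{x_i, t_i} K(x - x_i, v - t_i)}{\sum_{t_i} K(v - t_i)} \, dx$$
$$E[X \mid t = v, X_{\text{train}}] = \frac{(K \otimes X_{\text{train}})(v)}{(K \otimes I(X_{\text{train}} : \text{Observed}))(v)}$$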
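The last identity on this slide writes the estimate as a ratio of two convolutions: the kernel applied to the observed values, divided by the kernel applied to the "observed" indicator. A minimal sketch of that ratio follows, assuming measurements already placed on a regular time grid and a fixed Gaussian kernel. The function names, the grid, and the bandwidth are illustrative assumptions, not taken from the talk or from the deepDiagnosis code.

```python
import math

def gaussian_kernel(half_width, bandwidth):
    """Discrete Gaussian weights on offsets -half_width..half_width (assumed kernel shape)."""
    return [math.exp(-0.5 * (d / bandwidth) ** 2)
            for d in range(-half_width, half_width + 1)]

def convolve_same(signal, kernel):
    """Centered, same-length convolution with zero padding; kernel length must be odd."""
    half = len(kernel) // 2
    out = []
    for v in range(len(signal)):
        total = 0.0
        for j, w in enumerate(kernel):
            t = v - (j - half)  # convolution index: signal[v - offset]
            if 0 <= t < len(signal):
                total += w * signal[t]
        out.append(total)
    return out

def kernel_regression(values, observed, kernel):
    """Estimate E[X | t = v] as (K conv X)(v) / (K conv I_observed)(v) on a regular grid."""
    masked = [x if m else 0.0 for x, m in zip(values, observed)]
    indicator = [1.0 if m else 0.0 for m in observed]
    numerator = convolve_same(masked, kernel)
    denominator = convolve_same(indicator, kernel)
    # No observed neighbor inside the kernel support: the estimate is undefined.
    return [n / d if d > 0 else None for n, d in zip(numerator, denominator)]

# Toy series: two observed points (1.0 at t=0, 3.0 at t=2), everything else missing.
toy_values = [1.0, 0.0, 3.0, 0.0, 0.0]
toy_observed = [True, False, True, False, False]
print(kernel_regression(toy_values, toy_observed, [1.0, 1.0, 1.0]))
# prints [1.0, 2.0, 3.0, 3.0, None]
```

With the flat three-point kernel, the missing value at t=1 is filled with the average of its two observed neighbors, and t=4 has no observed neighbor in range, so no estimate is produced there.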
SLIDE 14
Use convolution framework to LEARN those kernels
SLIDE 15 Benefits
We can learn the kernel (no need for parametric forms or cross-validation)
Easily extensible to the multivariate case!
Unsupervised: all that is needed is an (asynchronous) sequence of
Fast to train. Fast to apply.
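One way to picture "learning the kernel" is to treat the kernel weights as free parameters and fit them by gradient descent on the squared error of predicting each observed point from its observed neighbors. The sketch below does this for a single univariate series. It is an illustration under stated assumptions, not the authors' network: the leave-one-out objective, the exponential parameterization that keeps weights positive, the step-halving rule, and all function names are choices made here for clarity.

```python
import math

def neighbors(values, observed, theta, v):
    """(kernel index, weight, value) for observed neighbors of v, excluding v itself."""
    half = len(theta) // 2
    out = []
    for j, th in enumerate(theta):
        t = v - (j - half)
        if j != half and 0 <= t < len(values) and observed[t]:
            out.append((j, math.exp(th), values[t]))  # exp keeps weights positive
    return out

def loss_and_grad(values, observed, theta):
    """Mean squared leave-one-out error over observed points, and its gradient in theta."""
    grad = [0.0] * len(theta)
    total, count = 0.0, 0
    for v in range(len(values)):
        if not observed[v]:
            continue
        nb = neighbors(values, observed, theta, v)
        if not nb:
            continue
        den = sum(w for _, w, _ in nb)
        pred = sum(w * x for _, w, x in nb) / den
        err = pred - values[v]
        total += err * err
        count += 1
        for j, w, x in nb:
            # d pred / d theta_j = w * (x - pred) / den
            grad[j] += 2.0 * err * w * (x - pred) / den
    if count == 0:
        return 0.0, grad
    return total / count, [g / count for g in grad]

def fit_kernel(values, observed, width=5, steps=100, lr=0.5):
    """Gradient descent that accepts only improving steps; width must be odd."""
    theta = [0.0] * width  # start from a flat kernel; the center weight is never used
    loss, grad = loss_and_grad(values, observed, theta)
    for _ in range(steps):
        trial = [th - lr * g for th, g in zip(theta, grad)]
        trial_loss, trial_grad = loss_and_grad(values, observed, trial)
        if trial_loss < loss:
            theta, loss, grad = trial, trial_loss, trial_grad
        else:
            lr *= 0.5  # step was too large: shrink it and retry
    return [math.exp(th) for th in theta], loss

# Synthetic example: a smooth signal with every third point missing.
example_values = [math.sin(0.3 * t) for t in range(40)]
example_observed = [t % 3 != 0 for t in range(40)]
learned_kernel, final_loss = fit_kernel(example_values, example_observed)
print(learned_kernel, final_loss)
```

Because a step is kept only when it lowers the loss, the fitted kernel is never worse on this objective than the flat kernel it starts from. No parametric kernel family or bandwidth search is needed.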
SLIDE 16 Data: 30K individuals from the original training set.
Dataset split equally between train, test and validation sets.
Loss: MSE.
Train and evaluate only on (lab, person) pairs with more than 1 observation.
Multivariate kernels learned for each input dimension (total 18)
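The evaluation rule on this slide (MSE, restricted to (lab, person) pairs with more than one observation) can be sketched as below. The data layout, a dictionary from (lab, person) keys to observation masks, and the lab and person names are hypothetical and chosen only for illustration.

```python
def eligible_pairs(observed_by_pair):
    """Keep the (lab, person) keys whose series has more than one observation."""
    return [key for key, mask in observed_by_pair.items() if sum(mask) > 1]

def masked_mse(predictions, targets, observed):
    """Mean squared error computed only at time points that were actually observed."""
    squared = [(p - y) ** 2
               for p, y, m in zip(predictions, targets, observed) if m]
    return sum(squared) / len(squared) if squared else None

# Hypothetical masks: p1 has two glucose measurements, p2 has only one.
masks = {
    ("glucose", "p1"): [True, False, True],
    ("glucose", "p2"): [True, False, False],
}
print(eligible_pairs(masks))  # prints [('glucose', 'p1')]
print(masked_mse([1.0, 5.0, 2.0], [1.0, 0.0, 4.0], [True, False, True]))  # prints 2.0
```

Unobserved time points contribute nothing to the loss, so the placeholder values stored at those positions cannot affect training or evaluation.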
SLIDE 17
More details in our ICLR paper
Narges Razavian, David Sontag. Temporal Convolutional Neural Networks for Diagnosis from Lab Tests. http://arxiv.org/abs/1511.07938
Open-source code available (Torch/Lua implementation): https://github.com/clinicalml/deepDiagnosis
SLIDE 18
Step 2 Predict 200+ correlated outcomes using multi-resolution convolutional neural networks and multi-task learning
SLIDE 19
Multi-Resolution Convolution Networks
The Architecture - model (1)
SLIDE 20
Multi-Resolution Convolution Networks
The Architecture - model (2)
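The architecture diagrams are not preserved in this transcript, so the following is only a sketch of the general multi-resolution idea: view the same series at several temporal resolutions, apply a convolution at each one, and concatenate the results. The pooling factors (1, 2, 4), average pooling, and the single shared kernel are assumptions made for illustration and may differ from the models shown on the slides.

```python
def average_pool(series, factor):
    """Non-overlapping mean pooling; a trailing partial window is averaged as well."""
    pooled = []
    for i in range(0, len(series), factor):
        window = series[i:i + factor]
        pooled.append(sum(window) / len(window))
    return pooled

def convolve_valid(series, kernel):
    """'Valid' 1D cross-correlation, the usual convention for convolution layers."""
    k = len(kernel)
    return [sum(w * x for w, x in zip(kernel, series[i:i + k]))
            for i in range(len(series) - k + 1)]

def multi_resolution_features(series, kernel, factors=(1, 2, 4)):
    """Concatenate filter responses computed at several temporal resolutions."""
    features = []
    for factor in factors:
        features.extend(convolve_valid(average_pool(series, factor), kernel))
    return features

# A steadily rising series seen through a difference filter at three resolutions.
rising = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(multi_resolution_features(rising, [1.0, -1.0]))
# prints seven -1.0 values, then three -2.0 values, then one -4.0
```

The same filter responds to the per-step change at the finest resolution and to the change over longer windows at coarser ones, which is how a single small kernel can pick up both fast and slow trends.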
SLIDE 21
Prediction AUCs on the held-out test set
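For readers unfamiliar with the metric, AUC is the probability that a randomly chosen patient who develops the outcome receives a higher score than a randomly chosen patient who does not. A minimal sketch of that definition follows; in a multi-task setting it would be computed separately for each outcome. The function name and the toy labels and scores are illustrative, not results from the talk.

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        return None  # AUC is undefined when one class is absent
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Toy example: one positive (score 0.35) is ranked below one negative (score 0.4).
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # prints 0.75
```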
SLIDE 22
More details in our JMLR paper
Narges Razavian, Jake Marcus, David Sontag. Multi-task Prediction of Disease Onsets from Longitudinal Lab Tests. JMLR, 2016. http://arxiv.org/abs/1608.00647
Open-source code available (Torch/Lua implementation): https://github.com/clinicalml/deepDiagnosis
SLIDE 23 Following up in clinical world
- Prediction models built and deployed for
– Nurse calls and home visits for 250,000+ NYUMC patients at high risk for a number of these outcomes
– Improved documentation in EHR
- Automation of mandatory visits/screening/follow-ups
- Best practice alerts
- Reimbursement for intense lifestyle management programs
- Extending to broader outcomes and domains
SLIDE 24
New York University (i2b2) Database
SLIDE 25 New York University (i2b2) Database
Figure panels: Nuclear Medicine Procedures; Magnetic Resonance Imaging
SLIDE 26 Conclusions
- Applications of deep learning in healthcare are unlimited
- Unsupervised learning + back-propagation + deep learning can recover biomarker models from asynchronous high-dimensional time-series data
- Multi-task learning benefits prediction tasks with smaller datasets.
SLIDE 27
Thanks!
Questions/comments: Narges.Razavian@nyumc.org
Open Source Package: https://github.com/clinicalml/deepDiagnosis