Using natural language processing to assess documentation of features of critical illness in discharge documents of ARDS survivors



SLIDE 1

Using natural language processing to assess documentation of features of critical illness in discharge documents of ARDS survivors

AcademyHealth ARM 2016: Leveraging Data to Improve Quality and Outcomes

June 27, 2016

Gary E. Weissman, MD (1,2); Michael O. Harhay, MPH (2,3); Ricardo M. Lugo, MD, MA (4); Barry D. Fuchs, MD, MS (1); Scott D. Halpern, MD, PhD (1,2,3); Mark E. Mikkelsen, MD, MSCE (1,3)

1 Pulmonary, Allergy, and Critical Care Division, Hospital of the University of Pennsylvania, Philadelphia, PA

2 Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, PA

3 Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania, Philadelphia, PA

4 Division of Cardiovascular Medicine, Vanderbilt University School of Medicine, Nashville, TN

@garyweissman

SLIDE 2

Disclosures

 Funding

  • NIH/NHLBI T32-HL098054

 Conflicts of interest

  • None

SLIDE 3

Background

 Acute respiratory distress syndrome (ARDS)

  • 190,600 cases/yr in USA
  • Improved mortality → more survivors

Rubenfeld et al. N Engl J Med 2005;353:1685-1693. Abel et al. Thorax 1998;53:292-294. Elliott et al. Crit Care Med 2014;42:2518-2526. Needham et al. Crit Care Med 2012;40:502-509.

 Post-intensive care syndrome (PICS)

  • 80% prevalence among ARDS survivors
  • Deficits across 3 domains:

    – Cognitive
    – Psychiatric
    – Functional

 The handoff

  • Often the only chance to communicate between inpatient and outpatient care
  • JCAHO: “reason for hospitalization” and “significant findings”
  • Review: primary diagnosis (17.5%), hospital course (14.5%)

Kind et al. AHRQ 2008. Kripalani et al. JAMA 2007;297:831-841.

SLIDE 4

Questions

 Clinical

  • Which features of critical illness are documented at hospital discharge?

 Methodologic

  • Which NLP tasks are important for identification of features of critical illness in hospital discharge documents?

SLIDE 5

Natural language processing (NLP)

 “…a subfield of linguistics and computer science that deals with computer applications whose input is natural language.”

Bretonnel Cohen and Demner-Fushman. Biomedical Natural Language Processing. 2014. Yim et al. JAMA Oncol 2016;797-804.

SLIDE 6

Methods

 Population

 Keywords

 Natural language processing (NLP)

 Sensitivity analysis

 Manual review

 Multivariable modified Poisson regression

SLIDE 7

Methods: Population

 Prospective, electronic, real-time ARDS identification, 2013-2015

 Sensitivity 97.6%, specificity 97.6%

 Exclusions:

  • Heart failure
  • Neurosurgery
  • 48h post-operative
  • FiO2 < 0.5 at diagnosis
  • Death in hospital
  • Discharge to inpatient hospice

 Final sample

  • 1,797 → 815 eligible discharge documents

Koenig et al. Crit Care Med 2011;39:98-104.

SLIDE 8

Methods: Population

SLIDE 9

Methods: Keywords (incomplete)

 ARDS

  • Acute lung injury
  • Acute respiratory distress syndrome
  • ALI
  • ALI/ARDS
  • ARDS

 Mechanical ventilation

  • Extubate
  • Intubate
  • Mechanical ventilation
  • VDRF
  • Vent
  • Vent dependent respiratory failure
  • Ventilator

 ICU admission

  • CCU
  • Critical care
  • Critical illness
  • Critically ill
  • CTICU
  • CTSICU
  • ICU
  • Intensive care
  • Intensive care unit
  • MICU

 Symptoms of PICS

  • Anxiety
  • Brain dysfunction
  • Cognitive impairment
  • Confusion
  • Delirium
  • Depression
  • Executive dysfunction
  • ICU delirium
  • Immobility
  • Memory dysfunction

SLIDE 10

Methods: NLP

 R statistical computing language

  • packages: tm, RWeka, data.table, stringdist, ggplot2

 Tasks

  • Standard preprocessing
  • Morphologic decomposition
  • Tokenization
  • Spelling error identification

 Not included

  • Named entity recognition
  • Relation extraction (negation, temporal)
  • Word sense disambiguation
  • Problem-specific segmentation

 Goal: keyword-based document classifier
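
A minimal sketch of what a keyword-based document classifier could look like. The analysis in this deck used R; Python is used here only for illustration. The keyword lists are a small subset of those on slide 9, the names `KEYWORDS`, `normalize`, and `classify` are hypothetical, and whole-word matching on lowercased text is an assumption rather than the study's documented matching rule.

```python
import re

# Illustrative subset of the slide 9 keyword lists, not the full study set.
KEYWORDS = {
    "ARDS": ["acute respiratory distress syndrome", "acute lung injury", "ards", "ali"],
    "Mechanical ventilation": ["mechanical ventilation", "intubate", "extubate", "ventilator", "vdrf"],
    "ICU admission": ["intensive care unit", "critical care", "icu", "micu"],
}

def normalize(text):
    """Lowercase and replace every run of non-alphanumeric characters with one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def classify(document):
    """Return {category: bool}: does any keyword occur as a whole word or phrase?"""
    padded = " " + normalize(document) + " "
    return {
        category: any(" " + normalize(kw) + " " in padded for kw in keywords)
        for category, keywords in KEYWORDS.items()
    }

note = "Patient was admitted to the MICU and required mechanical ventilation for ARDS."
print(classify(note))
```

Under this sketch, "intubated" does not match the keyword "intubate", which is the kind of gap that stemming and spelling-error handling are meant to close.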

SLIDE 11

Methods: NLP task detail

 Morphologic decomposition – “stemming”

  • Acronyms treated separately

patient emergently intubated respiratory failure

→ patient emerg intub respiratori failur

SLIDE 12

Methods: NLP task detail

 Spelling error identification

  • Sensitivity analysis
  • Mr. Smith suspected of having acute respiratroy distress syndrome

Restricted Damerau-Levenshtein distance:

> stringsim('respiratroy', 'respiratory')
[1] 0.9090909
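
The similarity score above can be reproduced by hand. The following is a sketch in Python (the deck itself used R's stringdist package) of the restricted Damerau-Levenshtein distance, also called optimal string alignment, rescaled to a similarity by dividing by the longer string's length. The function names are hypothetical, and the rescaling is assumed to match what `stringsim` does by default.

```python
def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # A swap of two adjacent characters counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]

def similarity(a, b):
    """Distance rescaled to [0, 1] by the length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - osa_distance(a, b) / longest

print(osa_distance("respiratroy", "respiratory"))          # 1
print(round(similarity("respiratroy", "respiratory"), 7))  # 0.9090909
```

The misspelling differs from the keyword by one transposition ("ro" for "or"), so the distance is 1 and the similarity is 1 - 1/11.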

SLIDE 13

Results

SLIDE 14

Results: Modified Poisson regression

SLIDE 15

Results: sensitivity analysis

SLIDE 16

Results: Manual review

SLIDE 17

Error analysis

 Word sense disambiguation

  • “…depression of left ventricular systolic function…”
  • “…weak gag...”

 Sentence boundary detection

  • “...was admitted to the MICU.MICU course complicated by...”
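
A short sketch of how these two error types can arise with context-free keyword matching. It is written in Python for illustration only, `has_keyword` is a hypothetical helper, and the deck does not state the exact mechanism behind the sentence-boundary errors. The second case assumes punctuation is deleted rather than replaced with a space, as in the preprocessing example where "x-ray" becomes "xray".

```python
import re

def has_keyword(text, keyword):
    """Whole-word, case-insensitive keyword match with no awareness of context."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# Word sense: the psychiatric keyword fires on a cardiac finding.
cardiac = "depression of left ventricular systolic function"
print(has_keyword(cardiac, "depression"))  # True, a false positive for a PICS symptom

# Sentence boundary: with no space after the period, deleting punctuation
# fuses the last word of one sentence with the first word of the next.
fused = "was admitted to the MICU.MICU course complicated by"
stripped = re.sub(r"[^\w\s]", "", fused).lower()
print(stripped.split())
print(has_keyword(stripped, "micu"))  # False, both mentions are lost
```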

SLIDE 18

Summary points

 ARDS is often not documented at discharge because it is not recognized in the ICU, not because of “forgetting”

 Mechanical ventilation and ICU admission are frequently mentioned in discharge summaries of ARDS survivors

 A keyword-based document classifier has excellent accuracy for identifying ARDS, mechanical ventilation, and ICU admission in hospital discharge documents

 Named entity recognition and disambiguation (NERD) and sentence boundary detection tasks may improve performance of PICS symptom identification

Weissman GE, et al. Annals of the American Thoracic Society. Epub. 2016.

SLIDE 19

Next steps

 EHR redesign

  • Easily accessible, searchable queries for researchers
  • Robust metadata for structured and unstructured fields

 Data challenges

  • Location-specific jargon
  • Share code

 Generalizability

  • Need for decision support for ARDS recognition in real time
  • Method not the results

SLIDE 20

Acknowledgments

 Acute care health services research group, University of Pennsylvania

 Penn Data Analytics Center

 Anonymous reviewers and editorial staff at Annals of the American Thoracic Society

SLIDE 21

SLIDE 22

Results

SLIDE 23

Results

SLIDE 24

Methods: NLP task detail

 Preprocessing

  • Remove extra whitespace, numbers, punctuation, and stopwords; convert to lower case

The x-ray demonstrated a 4 cm mass. No effusion or pneumothorax.

→ xray demonstrated cm mass effusion pneumothorax
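
The preprocessing steps can be sketched as follows. This is an illustration in Python, not the R pipeline used in the study. `STOPWORDS` is a deliberately tiny, hypothetical list (the real stopword list would be much longer), and the order of the steps is an assumption.

```python
import re

# Tiny illustrative stopword list; a real list is much longer.
STOPWORDS = {"the", "a", "an", "no", "or", "and", "of", "to", "was", "is"}

def preprocess(text):
    """Lowercase, drop numbers and punctuation, remove stopwords, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[0-9]+", "", text)   # remove numbers
    text = re.sub(r"[^\w\s]", "", text)  # delete punctuation, so "x-ray" becomes "xray"
    # split() with no arguments also collapses runs of whitespace
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return " ".join(tokens)

print(preprocess("The x-ray demonstrated a 4 cm mass. No effusion or pneumothorax."))
# xray demonstrated cm mass effusion pneumothorax
```

Note that "No" is dropped as a stopword, so the negation is lost. This is consistent with negation handling being listed under tasks not included.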

SLIDE 25

Methods: NLP task detail

 Tokenization

  • N-grams (1 ≤ N ≤ 4)

patient suspected acute respiratory distress syndrome intubated critical illness

N = 2: patient suspected, suspected acute, acute respiratory, respiratory distress, distress syndrome, syndrome intubated, intubated critical, critical illness
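
A minimal sketch of n-gram tokenization for 1 ≤ N ≤ 4, in Python for illustration (the study used R). The function names `ngrams` and `all_ngrams` are hypothetical, and splitting on whitespace assumes the text has already been preprocessed.

```python
def ngrams(tokens, n):
    """All contiguous runs of n tokens, each joined with single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def all_ngrams(text, max_n=4):
    """Map each n in 1..max_n to the list of n-grams in the text."""
    tokens = text.split()
    return {n: ngrams(tokens, n) for n in range(1, max_n + 1)}

text = "patient suspected acute respiratory distress syndrome intubated critical illness"
grams = all_ngrams(text)
print(grams[2])
print([len(grams[n]) for n in (1, 2, 3, 4)])  # [9, 8, 7, 6]
```

Multi-word keywords such as "acute respiratory distress syndrome" appear as a single 4-gram, which is presumably why N extends to 4.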