Comparative Analysis of Algorithmic Approaches for Auto-Coding with ICD-10-AM and ACHI
Rajvir Kaur
Master of Research
Authors
Rajvir Kaur, Jeewani Anupama Ginige
Introduction
- Electronic Health Records (EHRs): digitised versions of paper-based medical records
- What is Clinical Coding?
- Assignment of alphanumeric codes
- Manually assigned by clinical coders
- Uses:
- Funding, insurance claim processing
- Research
- Government and policy makers use coded data.
Image Credit: https://medium.com/@Petuum/automated-icd-coding-using-deep-learning-1e9170652175
Classification system in different countries
- Country-specific classification systems:
ICD-10-CM (Clinical Modification)
ICD-10-CA (Canadian Modification)
ICD-10-GM (German Modification)
ICD-10-AM (Australian Modification)
- Ireland, Singapore, Saudi Arabia
Image Credit: https://www.slideshare.net/EduardoPorras2
Challenges in manual coding
- Complexity of codes
ICD-9: 3,882 codes
ICD-10: approx. 70,000 codes
- Coding throughput: 15-42 records per coder per day
- Annual cost: 25 billion dollars (U.S.)
- Training and recruitment cost
- Highly prone to errors
Image Credit: http://bestptbilling.com/how-to-reduce-icd-10-transition-pain-for-physical-therapy-practice-owners/
“Boy, this new system is so confusing. Your ICD-9 code says that you’re here for a sprained ankle, but your ICD-10 code says it’s complete and irreversible skeletal failure.”
Our Contribution
- We focus on:
- ICD-10-AM and ACHI classification system
- Comparing and analysing various approaches based on standard evaluation criteria
- Our research concentrates on only two ICD-10-AM and ACHI chapters
- Digestive System:
Chapter 11: Diseases of the digestive system (ICD-10-AM)
Chapter 10: Procedures on digestive system (ACHI)
- Respiratory System:
Chapter 10: Diseases of the respiratory system (ICD-10-AM)
Chapter 7: Procedures on respiratory system (ACHI)
Ethics Approval
- Western Sydney University Ethics No.: H12628190
- Dataset:
- Total 190 clinical records (Gold Standard)
- Collected from hospitals across Australia
- Archived by the National Centre for Classification in Health (NCCH)
Sample data
Paper-based to electronic version
- PDF or image file to tabular format
- Created text narratives
- Information extracted from medical records includes:
Principal Diagnoses (PDx)
Additional Diagnoses (ADx)
Smoking-related diagnosis
Diabetes condition
Supplementary conditions
Past Medical History (PMHx)
Family Medical History
Principal Procedure
Additional Procedure
Type of anaesthesia
Ventilation details
Allied health intervention
Dataset
- 190 original records
- Additional 45 records similar to digestive and respiratory diseases and interventions
- The 45 additional records comprise 15 digestive system and 30 respiratory system records
- Total clinical records: 190 + 45 = 235

| Dataset | Digestive system records | Respiratory system records |
|---|---|---|
| Data190 | 116 | 74 |
| Data235 | 131 | 104 |
Overview of the Proposed work
Clinical Text Processing Using ICD-10-AM/ACHI
- TASK 1: ICD-10-AM/ACHI Chapter Classification (Digestive System, Respiratory System)
- TASK 2: ICD-10-AM/ACHI Code Assignment (Pattern Matching, Rule-Based, Machine Learning)
Approaches and Techniques
Clinical Text Processing: Approaches and Techniques
Pattern Matching
- Regular Expression
- Evaluation: 1. Precision 2. Recall 3. F-score 4. Accuracy 5. Hamming Loss 6. Jaccard Similarity
Rule-based
- Pre-processing: 1. Sentence splitting 2. Abbreviation Expansion 3. Tokenisation 4. Spell Check
- Defining Rules
- Evaluation: 1. Precision 2. Recall 3. F-score 4. Accuracy 5. Hamming Loss 6. Jaccard Similarity
Machine Learning
- Pre-processing: 1. Sentence splitting 2. Abbreviation Expansion 3. Tokenisation 4. Spell Check 5. Stop word removal 6. Negation detection
- Feature Extraction: 1-gram, 2-gram, 3-gram, 4-gram
- Classification: SVM, Naïve Bayes, Decision Tree, Random Forest, AdaBoost, kNN, MLP
- Evaluation: 1. Precision 2. Recall 3. F-score 4. Accuracy 5. Hamming Loss 6. Jaccard Similarity
Pattern Matching
- Simplest approach
- Search a text-string within the text
- Match character for character
- Use Regular Expression
Word variants: bronchi, bronchus, bronchial, bronchitis
Example text: “A 51 year old patient has serious cough but no sign of pneumonia”
Keywords: cough, pneumonia
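The pattern-matching idea can be sketched with Python's built-in re module. This is only an illustration: the slides do not show the actual expressions used, so both patterns and the helper name find_keywords are assumptions.

```python
import re

# Hypothetical pattern: one stem-based expression covering the four variants.
BRONCH_PATTERN = re.compile(r"\bbronch(?:i|us|ial|itis)\b", re.IGNORECASE)

# Hypothetical keyword pattern built from the example sentence.
KEYWORD_PATTERN = re.compile(r"\b(?:cough|pneumonia)\b", re.IGNORECASE)


def find_keywords(text, pattern):
    """Return every match of `pattern` in `text`, lower-cased, in order."""
    return [m.group(0).lower() for m in pattern.finditer(text)]


sentence = "A 51 year old patient has serious cough but no sign of pneumonia"
variants = "bronchi, bronchus, bronchial, bronchitis"

keywords_found = find_keywords(sentence, KEYWORD_PATTERN)
variants_found = find_keywords(variants, BRONCH_PATTERN)
print(keywords_found)
print(variants_found)
```

Note that plain matching reports "pneumonia" even though the sentence says there is no sign of it, which is the limitation that negation detection addresses.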
Rule-based approach
- Use logical expression and Boolean operations
if (logical expression) then (category)
ICD-10 code:
K05.2 Acute periodontitis
Includes: Acute pericoronitis, Parodontal abscess, Periodontal abscess
Excludes: acute apical periodontitis (K04.4), periapical abscess (K04.7), periapical abscess with sinus (K04.6)
Generated rule:
If document contains (acute periodontitis OR acute pericoronitis OR parodontal abscess OR periodontal abscess) AND document does NOT contain (acute apical periodontitis OR periapical abscess OR periapical abscess with sinus) then assign code K05.2
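The K05.2 rule can be written as a small Boolean check. The sketch below assumes simple case-insensitive substring matching, which the slides do not specify; the function name assign_k05_2 is hypothetical.

```python
# Terms taken from the K05.2 entry; "periodontal" follows the standard spelling.
INCLUDE_TERMS = [
    "acute periodontitis",
    "acute pericoronitis",
    "parodontal abscess",
    "periodontal abscess",
]
EXCLUDE_TERMS = [
    "acute apical periodontitis",
    "periapical abscess",
    "periapical abscess with sinus",
]


def assign_k05_2(document):
    """Return "K05.2" if any include term occurs and no exclude term occurs."""
    text = document.lower()
    has_include = any(term in text for term in INCLUDE_TERMS)
    has_exclude = any(term in text for term in EXCLUDE_TERMS)
    return "K05.2" if has_include and not has_exclude else None


print(assign_k05_2("Patient presents with acute pericoronitis."))
print(assign_k05_2("Periodontal abscess noted, also periapical abscess."))
```

Substring matching is the simplest choice here; a fuller rule engine would also need to handle negated mentions and word boundaries.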
Machine Learning
Image Credit: https://www.newtium.com/Software/Predictive
Data Preprocessing
- 1. Abbreviation Expansion
Admission Date: ****
Discharge Date: ****
Presenting Problems
Respiratory - cough
PRINCIPAL DIAGNOSIS
Infective exacerbation of bronchiectasis
Acute-on-chronic Type 2 respiratory failure
Summary of Progress
Dear Doctor,
Thank you for your ongoing care of ****, who presented to **** hospital on **** with SOB, cough and chest pain, on a background of bronchiectasis. The patient was admitted under the case of Dr **** (Respiratory) for management of infective exacerbation of bronchiectasis.
Background
Bronchiectasis
- Known to Dr **** (Respiratory)
- Bronchiectasis diagnosed 20 years ago, secondary to childhood pertussis
Left ventricular failure
- Known to Dr **** (Cardiology)
Cough, SOB, Pleuritic chest pain
| Abbreviation | Full form |
|---|---|
| COPD | Chronic obstructive pulmonary disease |
| SBO | Small bowel obstruction |
| IHD | Ischaemic heart disease |
| SOB | Shortness of breath |
| HTN | Hypertension |
| T2DM | Type 2 diabetes mellitus |
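Abbreviation expansion can be sketched as a dictionary lookup applied with a regular expression. The mapping below uses only the six abbreviations listed above; the function name and the case-sensitive, whole-word matching are assumptions, not the method used in the study.

```python
import re

# Abbreviation table from the slide (full forms lower-cased for running text).
ABBREVIATIONS = {
    "COPD": "chronic obstructive pulmonary disease",
    "SBO": "small bowel obstruction",
    "IHD": "ischaemic heart disease",
    "SOB": "shortness of breath",
    "HTN": "hypertension",
    "T2DM": "type 2 diabetes mellitus",
}

# Whole-word, case-sensitive match so ordinary words are left untouched.
_ABBREV_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(a) for a in ABBREVIATIONS) + r")\b"
)


def expand_abbreviations(text):
    """Replace each known abbreviation in `text` with its full form."""
    return _ABBREV_PATTERN.sub(lambda m: ABBREVIATIONS[m.group(1)], text)


expanded = expand_abbreviations("presented with SOB, cough and chest pain")
print(expanded)
```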
Data Preprocessing
- 2. Spell Check
Used: NLTK and PyEnchant Python libraries

| Australian English | American English |
|---|---|
| oesophagus | esophagus |
| tumour | tumor |
| anaemia | anemia |
| anaesthetic | anesthetic |
| ischaemic | ischemic |
| diarrhoea | diarrhea |
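One way to handle the spelling differences is to normalise to a single variant before matching. The sketch below does not use NLTK or PyEnchant; it substitutes a hand-written lookup built from the six word pairs above, and the direction of normalisation (American to Australian) is an assumption.

```python
import re

# American -> Australian spellings, limited to the pairs shown on the slide.
US_TO_AU = {
    "esophagus": "oesophagus",
    "tumor": "tumour",
    "anemia": "anaemia",
    "anesthetic": "anaesthetic",
    "ischemic": "ischaemic",
    "diarrhea": "diarrhoea",
}

_WORD = re.compile(r"[A-Za-z]+")


def normalise_spelling(text):
    """Rewrite known American spellings, preserving a leading capital."""
    def replace(match):
        word = match.group(0)
        australian = US_TO_AU.get(word.lower())
        if australian is None:
            return word
        return australian.capitalize() if word[0].isupper() else australian

    return _WORD.sub(replace, text)


normalised = normalise_spelling("Anemia and diarrhea noted; esophagus tumor excluded.")
print(normalised)
```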
Data Preprocessing
- 3. Stop word removal
Removed: ‘again’, ‘about’, ‘there’, ‘once’, ‘during’, ‘out’, ‘they’, ‘own’, ‘an’, ‘some’, ‘its’, ‘yours’, ‘such’, ‘into’, ‘most’, ‘itself’, ‘other’, ‘off’, ‘am’, ‘who’, ‘as’, ‘him’, ‘each’, ‘themselves’, ‘until’, ‘we’, ‘these’, ‘your’, ‘his’, ‘through’, ‘me’, ‘her’, ‘more’, ‘himself’, ‘this’, ‘down’, ‘should’, ‘our’, ‘their’, ‘while’, ‘above’, ‘both’, ‘up’, ‘ours’, ‘she’, ‘all’, ‘when’, ‘at’, ‘any’, ‘before’, ‘them’, ‘same’, ‘yourselves’, ‘because’, ‘what’, ‘over’, ‘why’, ‘now’, ‘he’, ‘you’, ‘herself’, ‘just’, ‘ourselves’, ‘hers’, ‘yourself’, ‘how’, ‘theirs’, ‘further’, ‘doing’, ‘where’, ‘too’, ‘whom’, ‘those’
Not removed: no, not, nil, never
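A minimal sketch of stop word removal that keeps the negation words might look as follows. The stop word set is a small hypothetical subset of the list above, and whitespace tokenisation is an assumption made to keep the example short.

```python
# Hypothetical subset of the stop word list shown on the slide.
STOP_WORDS = {
    "again", "about", "there", "during", "an", "some", "such", "into",
    "who", "as", "this", "while", "when", "at", "any", "before",
    "because", "now", "just", "too",
}

# Negation words are kept so that later negation detection can use them.
NEGATION_WORDS = {"no", "not", "nil", "never"}


def remove_stop_words(tokens):
    """Drop stop words from `tokens` but always keep negation words."""
    kept = []
    for token in tokens:
        lowered = token.lower()
        if lowered in NEGATION_WORDS or lowered not in STOP_WORDS:
            kept.append(token)
    return kept


filtered = remove_stop_words("there was no cough at this admission".split())
print(filtered)
```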
Data Preprocessing
- 4. Negation Detection
Example: The patient is suffering from serious cough but no evidence of pneumonia.
Keywords: cough, pneumonia
Negated term: pneumonia
Negated findings: (pneumonia, ‘True’): do not assign code
Non-negated findings: (cough, ‘True’): assign code
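Negation detection can be approximated by looking for a negation trigger shortly before each keyword. The sketch below is a simplified window-based heuristic, not the method used in the study; the window size, the scope-breaking words and the function name are all assumptions.

```python
import re

NEGATION_TRIGGERS = {"no", "not", "nil", "never"}
SCOPE_BREAKERS = {"but", "however"}  # hypothetical: words that end a negation scope
WINDOW = 5  # hypothetical: number of preceding tokens to inspect


def detect_negation(sentence, keywords):
    """Map each keyword found in `sentence` to True if it appears negated."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    findings = {}
    for index, token in enumerate(tokens):
        if token not in keywords:
            continue
        negated = False
        # Walk backwards from the keyword, stopping at a scope breaker.
        for previous in reversed(tokens[max(0, index - WINDOW):index]):
            if previous in SCOPE_BREAKERS:
                break
            if previous in NEGATION_TRIGGERS:
                negated = True
                break
        findings[token] = negated
    return findings


result = detect_negation(
    "The patient is suffering from serious cough but no evidence of pneumonia.",
    {"cough", "pneumonia"},
)
print(result)
```

Under these assumptions, a code would be assigned for cough and withheld for pneumonia.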
Feature Extraction
Bag of words representation
X: The infant was admitted to the hospital for bronchiolitis with worse cough and wheeze
Y: The old male presented for vomiting and diarrhoea

| Word | Count |
|---|---|
| admitted | 1 |
| and | 2 |
| bronchiolitis | 1 |
| cough | 1 |
| diarrhoea | 1 |
| for | 2 |
| hospital | 1 |
| infant | 1 |
| male | 1 |
| old | 1 |
| presented | 1 |
| to | 1 |
| the | 3 |
| vomiting | 1 |
| was | 1 |
| wheeze | 1 |
| with | 1 |
| worse | 1 |
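The bag-of-words counts for sentences X and Y, and the n-gram features mentioned earlier, can be reproduced with a short sketch. It assumes lower-casing and whitespace tokenisation; the function names are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lower-case and split on whitespace (a simplifying assumption)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Return the list of n-grams (joined by spaces) over `tokens`."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(documents, n=1):
    """Count every n-gram across all documents."""
    counts = Counter()
    for document in documents:
        counts.update(ngrams(tokenize(document), n))
    return counts


X = "The infant was admitted to the hospital for bronchiolitis with worse cough and wheeze"
Y = "The old male presented for vomiting and diarrhoea"

unigrams = bag_of_words([X, Y], n=1)
bigrams = bag_of_words([X], n=2)
print(unigrams["the"], unigrams["and"], unigrams["for"])
```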
Classification
Seven classifiers:
Support Vector Machine (SVM)
Naïve Bayes (NB)
Decision Tree (DT)
Random Forest (RF)
AdaBoost
k-Nearest Neighbor (kNN)
Multi Layer Perceptron (MLP)
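To illustrate how one of these classifiers works on word counts, here is a minimal pure-Python sketch of multinomial Naïve Bayes with add-one smoothing. It is not the implementation used in the study, and the four training narratives are invented toy examples, not records from the dataset.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over word counts."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for document, label in zip(documents, labels):
            self.word_counts[label].update(document.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, document):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in document.lower().split():
                if word in self.vocab:  # words unseen in training are ignored
                    score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy narratives, for illustration only.
train_docs = [
    "vomiting and diarrhoea with abdominal pain",
    "small bowel obstruction with vomiting",
    "cough and wheeze with bronchiolitis",
    "shortness of breath and cough with bronchiectasis",
]
train_labels = ["digestive", "digestive", "respiratory", "respiratory"]

model = NaiveBayes().fit(train_docs, train_labels)
prediction = model.predict("infant admitted with worse cough and wheeze")
print(prediction)
```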
Evaluation
Yi: Ground truth label
Zi: Predicted label
N: Number of records
M: Set of all labels

| | Predicted: Positive | Predicted: Negative |
|---|---|---|
| Ground Truth: Positive | True Positive (TP) | False Negative (FN) |
| Ground Truth: Negative | False Positive (FP) | True Negative (TN) |
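The exact formulas are not given here, so the sketch below assumes common example-based multi-label definitions in terms of Yi, Zi, N and M: precision and recall averaged over records, F-score as the harmonic mean of those averages, accuracy as exact match, Hamming loss as the mean fraction of mismatched labels, and Jaccard similarity as mean intersection over union. The study may have used different variants.

```python
def evaluate(y_true, y_pred, all_labels):
    """Compute example-based metrics.

    y_true, y_pred: lists of label sets (Yi and Zi); all_labels: the set M.
    """
    n = len(y_true)
    precision = recall = jaccard = exact = hamming = 0.0
    for truth, predicted in zip(y_true, y_pred):
        intersection = len(truth & predicted)
        union = len(truth | predicted)
        precision += intersection / len(predicted) if predicted else 0.0
        recall += intersection / len(truth) if truth else 0.0
        jaccard += intersection / union if union else 1.0
        exact += 1.0 if truth == predicted else 0.0
        hamming += len(truth ^ predicted) / len(all_labels)
    precision, recall = precision / n, recall / n
    total = precision + recall
    return {
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / total if total else 0.0,
        "accuracy": exact / n,  # assumed: exact-match accuracy
        "hamming_loss": hamming / n,
        "jaccard": jaccard / n,
    }


# Placeholder labels, not real codes.
M = {"A", "B", "C", "D"}
scores = evaluate([{"A", "B"}, {"C"}], [{"A"}, {"C", "D"}], M)
print(scores)
```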
Results: TASK 1 ICD-10-AM/ACHI Chapter Classification
TASK 1: ICD-10-AM/ACHI Chapter Classification
- Digestive System: Gastrointestinal class
- Respiratory System: Respiratory class
| Classifier | Dataset | Precision | Recall | F-score | Accuracy | Hamming Loss | Jaccard Similarity |
|---|---|---|---|---|---|---|---|
| Multi Layer Perceptron | Data190 | 0.95 | 0.95 | 0.95 | 0.9474 | 0.05263 | 0.94736 |
| Multi Layer Perceptron | Data235 | 0.87 | 0.87 | 0.87 | 0.8723 | 0.12765 | 0.87234 |
| Support Vector Machine | Data190 | 0.93 | 0.92 | 0.92 | 0.9211 | 0.07894 | 0.92105 |
| Support Vector Machine | Data235 | 0.98 | 0.98 | 0.98 | 0.9787 | 0.02127 | 0.97872 |
| Naïve Bayes | Data190 | 0.89 | 0.87 | 0.86 | 0.8684 | 0.13157 | 0.86842 |
| Naïve Bayes | Data235 | 0.88 | 0.87 | 0.87 | 0.8723 | 0.12765 | 0.87234 |
| Decision Tree | Data190 | 0.76 | 0.55 | 0.42 | 0.5526 | 0.44736 | 0.55263 |
| Decision Tree | Data235 | 0.84 | 0.81 | 0.8 | 0.8085 | 0.19148 | 0.80851 |
| Random Forest | Data190 | 0.84 | 0.84 | 0.84 | 0.8421 | 0.15789 | 0.84211 |
| Random Forest | Data235 | 0.9 | 0.89 | 0.89 | 0.8936 | 0.10638 | 0.89361 |
| k-Nearest Neighbor | Data190 | 0.85 | 0.84 | 0.84 | 0.8421 | 0.15789 | 0.84211 |
| k-Nearest Neighbor | Data235 | 0.89 | 0.89 | 0.89 | 0.8936 | 0.10638 | 0.89361 |
| AdaBoost | Data190 | 0.88 | 0.87 | 0.87 | 0.8684 | 0.13157 | 0.86842 |
| AdaBoost | Data235 | 0.9 | 0.89 | 0.89 | 0.8936 | 0.10638 | 0.89361 |
Task 2: ICD-10-AM/ACHI Code Assignment
- Pattern Matching, Rule-Based: training-testing not required
- Machine Learning: training-testing required

Number of medical records in test data (20%):

| Dataset | Digestive system | Respiratory system | Total |
|---|---|---|---|
| Data190 | 22 | 16 | 38 |
| Data235 | 26 | 21 | 47 |
Results: TASK 2 ICD-10-AM/ACHI Code Assignment
[Figure: bar charts for Data190 and Data235 comparing Pattern Matching and Rule-Based on Precision, Recall, F-score, Accuracy, Hamming Loss (HL) and Jaccard Similarity (JS)]

| Approach | Dataset | Precision | Recall | F-score | Accuracy | Hamming Loss | Jaccard Similarity |
|---|---|---|---|---|---|---|---|
| Pattern Matching | Data190 | 0.7953 | 0.4184 | 0.5277 | 0.4027 | 0.043 | 0.4365 |
| Pattern Matching | Data235 | 0.8029 | 0.409 | 0.5201 | 0.3945 | 0.0405 | 0.4255 |
| Rule-based | Data190 | 0.7913 | 0.6916 | 0.7257 | 0.6053 | 0.1728 | 0.5803 |
| Rule-based | Data235 | 0.792 | 0.6872 | 0.7222 | 0.6011 | 0.1745 | 0.5768 |
TASK 2 Results: Machine Learning
| Classifier | Dataset | Precision | Recall | F-score | Accuracy | Hamming Loss | Jaccard Similarity |
|---|---|---|---|---|---|---|---|
| kNN | Data190 | 0.76798 | 0.45175 | 0.54361 | 0.44051 | 0.03706 | 0.44776 |
| kNN | Data235 | 0.89308 | 0.55191 | 0.65373 | 0.54143 | 0.01955 | 0.52697 |
| SVM | Data190 | 0.62534 | 0.63168 | 0.57465 | 0.44051 | 0.67841 | 0.42014 |
| SVM | Data235 | 0.72891 | 0.61722 | 0.61821 | 0.49643 | 0.35158 | 0.48805 |
| Naïve Bayes | Data190 | 0.58333 | 0.25586 | 0.33523 | 0.25389 | 0.01392 | 0.27135 |
| Naïve Bayes | Data235 | 0.66666 | 0.30773 | 0.39793 | 0.29717 | 0.02453 | 0.32365 |
| Random Forest | Data190 | 0.81421 | 0.81329 | 0.79115 | 0.66831 | 0.23514 | 0.65517 |
| Random Forest | Data235 | 0.92392 | 0.92019 | 0.91412 | 0.86118 | 0.09458 | 0.82945 |
| AdaBoost | Data190 | 0.92062 | 0.85015 | 0.87305 | 0.79201 | 0.08776 | 0.74537 |
| AdaBoost | Data235 | 0.91407 | 0.91295 | 0.90351 | 0.84462 | 0.11271 | 0.79245 |
| Decision Tree | Data190 | 0.62938 | 0.29488 | 0.37559 | 0.29073 | 0.02192 | 0.29 |
| Decision Tree | Data235 | 0.63475 | 0.34756 | 0.38689 | 0.34537 | 0.00942 | 0.33055 |
| MLP | Data190 | 0.68001 | 0.46388 | 0.51485 | 0.38567 | 0.34411 | 0.36667 |
| MLP | Data235 | 0.57679 | 0.46974 | 0.40582 | 0.40993 | 0.24057 | 0.3913 |
Data190 results using 4-gram and Data235 results using 2-gram feature set
Comparison of approaches
[Figure: bar charts comparing Pattern Matching, Rule-based and Machine Learning approaches for Data190 and Data235]
Conclusion and Future Work
- Conclusion:
- Due to the adoption of EHRs and advanced classification systems, there is a need to automate the clinical workflow
- Computer Assisted Coding has the capability to overcome the challenges of manual coding
- The Machine Learning approach is capable of predicting correct ICD-10-AM and ACHI codes
- Future Work:
- To work on large-scale data
- To work on other chapters of ICD-10-AM and ACHI classification system
- To apply Deep Learning and Hybrid approaches for Computer Assisted Coding