ARTIFICIAL INTELLIGENCE FOR DIGITAL PATHOLOGY
Kyunghyun Paeng, Co-founder and Research Scientist, Lunit Inc.
AGENDA
1. BACKGROUND: DIGITAL PATHOLOGY
2. APPLICATIONS: BREAST CANCER, PROSTATE CANCER
3. DEMONSTRATIONS
4. CONCLUSION
BACKGROUND: DIGITAL PATHOLOGY
Patient → Detection: Radiology (X-ray, CT, MRI, ...) → Diagnosis: Pathology (biopsy, resection, ...) → Treatment: Oncology
BACKGROUND: DIGITAL PATHOLOGY
Diagnosis: Pathology (biopsy, resection, ...)
Slide → Report
(-) Archiving (-) Workflow (-) Analysis
BACKGROUND: DIGITAL PATHOLOGY
Diagnosis: Pathology (biopsy, resection, ...)
(+) Archiving (+) Workflow (+) Analysis
BACKGROUND: DIGITAL PATHOLOGY
“Diagnostic Concordance Among Pathologists Interpreting Breast Biopsy Specimens,” JAMA, 2015.
BACKGROUND: DIGITAL PATHOLOGY
Grade 1 Grade 2 Grade 3
3! 4! 3? 4?
~ 100,000 pixels
APPLICATION #1: BREAST CANCER
Mitosis
Proliferation score (mitoses counted in 10 consecutive high-power fields, HPFs)
Score 1: ~6 mitoses
Score 2: 6~10 mitoses
Score 3: 10~ mitoses
Prognosis: good (Score 1) → bad (Score 3)
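The scoring rule above can be sketched as a small function. This is a minimal sketch, not the challenge's reference implementation: the function name and the two cutoff parameters are hypothetical, and because the slide writes the ranges as "~6", "6~10", and "10~", the treatment of the boundary values 6 and 10 is an assumption.

```python
def proliferation_score(mitotic_count, low_cutoff=6, high_cutoff=10):
    """Map a mitotic count over 10 consecutive HPFs to a score of 1, 2, or 3.

    Assumption: counts up to and including low_cutoff get score 1, and
    counts above high_cutoff get score 3. The slide leaves these boundary
    values ambiguous.
    """
    if mitotic_count < 0:
        raise ValueError("mitotic count cannot be negative")
    if mitotic_count <= low_cutoff:
        return 1
    if mitotic_count <= high_cutoff:
        return 2
    return 3
```

Passing different cutoffs makes it easy to test how sensitive the final score is to the boundary convention.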
APPLICATION #1: BREAST CANCER
Tumor Proliferation Assessment Challenge 2016
TUPAC16 | MICCAI Grand Challenge
Training dataset: 500 slides, each labeled with a proliferation score
Auxiliary dataset: 656 ROIs from 73 slides, each labeled with mitosis locations, Mitosis #1 (x,y) ... Mitosis #N (x,y)
Test dataset: 321 slides, proliferation score to be predicted
APPLICATION #1: BREAST CANCER
Phase 1: Handling whole slide images
Whole slide image → Tissue region extraction → Patch extraction at x40 → ROI detection using cell density → Stain normalization
Phase 2: Mitosis detection
Mitosis Detection Network (with the auxiliary set for mitosis detection)
Phase 3: Score prediction
Feature vector based on statistical information → Support Vector Machine → Proliferation score
APPLICATION #1: BREAST CANCER
Whole slide image → Tissue region extraction → Patch extraction at x40 → ROI detection using cell density → Stain normalization
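The "ROI detection using cell density" step can be illustrated with a sliding window over a grid of cell counts. This is a minimal sketch under stated assumptions: the function name, the grid representation (one cell count per tile), and the window size are all hypothetical, and the slide does not say how cells are counted or how ROIs are actually ranked.

```python
def top_density_rois(cell_counts, window=2, num_rois=1):
    """Return the top-left (row, col) of the windows containing the most cells.

    cell_counts is a 2D grid (list of rows); each entry is the number of
    cells found in one tile of the tissue region. Windows may overlap.
    """
    rows, cols = len(cell_counts), len(cell_counts[0])
    scored = []
    for r in range(rows - window + 1):
        for c in range(cols - window + 1):
            total = sum(
                cell_counts[r + dr][c + dc]
                for dr in range(window)
                for dc in range(window)
            )
            scored.append((total, r, c))
    # Highest density first; ties resolved by the smaller (row, col).
    scored.sort(key=lambda item: (-item[0], item[1], item[2]))
    return [(r, c) for _, r, c in scored[:num_rois]]


grid = [
    [0, 1, 0, 0],
    [1, 9, 8, 0],
    [0, 7, 9, 1],
    [0, 0, 1, 0],
]
densest = top_density_rois(grid)  # window with the highest total cell count
```

Restricting the expensive x40 mitosis detection to a few dense windows is one way to keep the cost of a whole slide image manageable.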
APPLICATION #1: BREAST CANCER
Mitosis Detection Network (with the auxiliary set for mitosis detection)
Input patch: 128 x 128 (other figure labels: 16, 8)
conv 1, 3x3, 16
resblock 1.1 ... resblock 1.3, 3x3, 32
resblock 2.1 ... resblock 2.3, 3x3, 64
resblock 3.1 ... resblock 3.3, 3x3, 128
Global pooling layer
Output: mitosis / normal
Training step, Inference step (figures)
APPLICATION #1: BREAST CANCER
Feature vector based on statistical information → Support Vector Machine → Proliferation score
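One way to read "feature vector based on statistical information" is a fixed-length summary of the mitosis counts detected in each ROI of a slide. The sketch below is illustrative only: the function name and the particular statistics (number of ROIs, mean, standard deviation, minimum, maximum) are assumptions, since the slide does not list the features that were used.

```python
import statistics


def mitosis_feature_vector(counts_per_roi):
    """Summarize the per-ROI mitosis counts of one slide as a 5-element vector.

    Layout (assumed): [number of ROIs, mean, population std, min, max].
    An empty input yields a vector of zeros.
    """
    if not counts_per_roi:
        return [0.0, 0.0, 0.0, 0.0, 0.0]
    return [
        float(len(counts_per_roi)),
        statistics.fmean(counts_per_roi),
        statistics.pstdev(counts_per_roi),
        float(min(counts_per_roi)),
        float(max(counts_per_roi)),
    ]


features = mitosis_feature_vector([2, 4, 4, 4, 5, 5, 7, 9])
```

A vector like this, one per slide, would then be the input to the Support Vector Machine that predicts the proliferation score.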
APPLICATION #1: BREAST CANCER
Tumor Proliferation Assessment Challenge 2016
TUPAC16 | MICCAI Grand Challenge
APPLICATION #2: PROSTATE CANCER
Grades: Grade 1, Grade 2, Grade 3, Grade 4, Grade 5 (grouped as Grade 1, 2 / Grade 3 / Grade 4 / Grade 5)
Core-level scores:
Core #1: 5+5
Core #2: 0
Core #3: 3+4
Core #4: 0
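Core-level scores such as "5+5" and "3+4" combine the most prevalent grade with the second most prevalent one. The sketch below shows that combination on patch-level grades. It is a simplified illustration, not the method used in the talk: the function name is hypothetical, prevalence is approximated by patch counts, and ties are resolved in favor of the higher grade by assumption.

```python
from collections import Counter


def core_gleason_score(patch_grades):
    """Combine the patch-level grades of one core into 'primary+secondary'.

    patch_grades holds one integer per patch: 0 for normal tissue,
    otherwise the grade (3, 4, or 5).
    """
    tumor = Counter(g for g in patch_grades if g != 0)
    if not tumor:
        return "0"  # no tumor patches, as in "Core #2: 0"
    # Most common grade first; ties go to the higher grade (assumption).
    ranked = sorted(tumor, key=lambda g: (-tumor[g], -g))
    primary = ranked[0]
    # With a single pattern present, the grade is counted twice (e.g. 5+5).
    secondary = ranked[1] if len(ranked) > 1 else primary
    return f"{primary}+{secondary}"
```

For example, a core whose tumor patches are all Grade 5 yields "5+5", and a core with mostly Grade 3 and some Grade 4 patches yields "3+4".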
APPLICATION #2: PROSTATE CANCER
Dataset from medical centers
Training dataset: 900 slides, annotated with { Grade, Contours }
Test dataset: 50 slides, annotated with { Grade, Contours }
APPLICATION #2: PROSTATE CANCER
Patch-based classification
Gleason score classification network: Normal, Grade 3, Grade 4, Grade 5
Ranking loss with thermometer code: 1000, 1100, 1110, 1111
Memory network-based refinement (25 neighbors)
Query vector + Embedded memory vector → Embedding → Memory network → Refined output
Figure legend: Normal, Grade 3, Grade 4, Grade 5
APPLICATION #2: PROSTATE CANCER
Key features for improving performance
Baseline settings
Not a classification problem! Ordering problem!
1000 1100 1110 1111
Network decodes from the left-most bit to the right-most bit.
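The thermometer code and the left-to-right decoding can be sketched as follows. This is a minimal sketch: the function names are hypothetical, the class order (Normal, Grade 3, Grade 4, Grade 5) assigned to the four codes is an assumption, and so is the 0.5 decision threshold applied to each bit.

```python
CLASSES = ["Normal", "Grade 3", "Grade 4", "Grade 5"]  # assumed class order


def encode_thermometer(class_index, num_classes=4):
    """Class k becomes k+1 ones followed by zeros, e.g. 0 -> [1, 0, 0, 0]."""
    return [1 if bit <= class_index else 0 for bit in range(num_classes)]


def decode_thermometer(bit_scores, threshold=0.5):
    """Read scores from the left-most bit and stop at the first "off" bit.

    Returns the index of the last bit that was still on. The first bit is
    on in every code shown on the slide, so the result is clamped at 0.
    """
    ones = 0
    for score in bit_scores:
        if score < threshold:
            break
        ones += 1
    return max(ones - 1, 0)


predicted = CLASSES[decode_thermometer([0.9, 0.8, 0.3, 0.7])]
```

Because decoding stops at the first bit below the threshold, a stray high score further to the right (0.7 in the example) does not change the result, which is what makes the output respect the ordering of the grades.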
APPLICATION #2: PROSTATE CANCER
Patch-level outputs (25 neighbors): Query vector (1x4 dim), Memory vector (25x4 dim)
Embedding: query 1x1024, memory 25x1024
Inner product → Attention vector (25x1)
Weighting → 1D-CNN → Softmax → Refined output
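The attention part of this refinement (inner product, attention vector, weighting) can be sketched in a few lines. This is a simplified illustration, not the network from the talk: the function names are hypothetical, the learned embedding to 1024 dimensions is replaced by the identity, the attention weights are normalized with a softmax by assumption, and the 1D-CNN + Softmax output stage is left out.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def refine_with_neighbors(query, memory):
    """Attention-weighted average of neighbor outputs for one query patch.

    query is the 4-dim output of the patch being refined; memory is a list
    of 4-dim outputs of its neighbors (25 on the slide).
    """
    # Inner product between the query and every memory vector.
    scores = [sum(q * m for q, m in zip(query, vec)) for vec in memory]
    attention = softmax(scores)  # one weight per neighbor
    return [
        sum(w * vec[d] for w, vec in zip(attention, memory))
        for d in range(len(query))
    ]


refined = refine_with_neighbors([1, 1, 0, 0], [[1, 1, 0, 0], [0, 0, 0, 0]])
```

Neighbors whose outputs resemble the query receive larger weights, so an isolated patch prediction is pulled toward what similar surrounding patches say.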
APPLICATION #2: PROSTATE CANCER
Patch-level performance Core-level performance
+ Data cleansing: 80%
+ Ranking loss: 82.8%
+ Memnet refinement: 87.5%
CONCLUSION
Challenge #1. How to handle gigapixel images (i.e., whole slide images)?
✓ Consider how to sample patches (patch size, sampling step, ...) → with pathologists.
✓ Consider how to construct the whole pipeline from gigapixel images to diagnosis.
Challenge #2. How to handle quality variation between slides?
✓ Design image processing modules carefully.
✓ Do cross-validation to avoid overfitting.
Challenge #3. How to handle ambiguous ground truth?
✓ Design a task-specific loss.
✓ Sanitize the training dataset as much as you can.
✓ Don’t be satisfied with patch-based results.
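For Challenge #1, the effect of patch size and sampling step can be explored with a small grid generator. This is a minimal sketch with a hypothetical function name; the 100,000-pixel side length and the 128-pixel patch size are taken from earlier slides purely as illustrative inputs, and partial patches at the right and bottom edges are skipped by assumption.

```python
def patch_origins(width, height, patch_size, step):
    """Yield the top-left (x, y) of every patch that fits inside the image.

    step < patch_size gives overlapping patches; step == patch_size tiles
    the image without overlap.
    """
    if patch_size <= 0 or step <= 0:
        raise ValueError("patch_size and step must be positive")
    for y in range(0, height - patch_size + 1, step):
        for x in range(0, width - patch_size + 1, step):
            yield (x, y)


# Patches per side for a 100,000-pixel-wide slide, 128-pixel patches, no overlap.
per_side = len(range(0, 100_000 - 128 + 1, 128))
total_patches = per_side * per_side
```

Even without overlap this gives several hundred thousand patches per slide, which is why the sampling strategy is worth settling together with pathologists before training.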