

SLIDE 1

A Reader Study on a 14‐head Microscope

Brandon D. Gallas, Qi Gong
FDA/CDRH/OSEL/DIDSR, Silver Spring, MD, US
Jamal Benhamida, Matthew G. Hanna, S. Joseph Sirintrapun, Kazuhiro Tabata, Yukako Yagi
Memorial Sloan Kettering Cancer Center (MSKCC), Pathology Informatics, New York, NY, US
Partha P. Mitra
Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, US

5/23/2018 www.fda.gov

SLIDE 2


Purpose

  • Purpose of this work
– Demonstration … proof of concept … technology demonstration … method development
  • Technology evaluation, not clinical performance
  • Task-based evaluation of image quality
– Task: detection and classification of mitotic figures (MFs)
– Images: glass slides and whole-slide images (WSI)
– Readers: pathologists
– Performance: within- and between-reader agreement


Clinically relevant task: part of every pathologist's training. Challenging task (substantial reader variability). Convenient samples. Agreement, because there is no ground truth: count differences (calibration), pairwise concordance (correlation). "MRMC" analyses account for variability from Multiple Readers and Multiple Cases.

SLIDE 3

Microscope still the gold standard

Remove search from the technology evaluation

  • Eliminate location variability for faster and more precise results.
  • eeDAP: Evaluation Environment for Digital and Analog Pathology
  • Registration allows pathologists to evaluate the same fields of view (see the registration sketch below)
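The registration step can be sketched as a coordinate mapping. A minimal illustration in Python, assuming made-up fiducial points and a simple affine fit; eeDAP's actual registration procedure is more involved than this:

import numpy as np

# Hypothetical fiducial points: WSI pixel locations and the matching
# motorized-stage locations (um). Values are made up for illustration.
wsi_pts = np.array([[100.0, 200.0], [5000.0, 300.0], [800.0, 4000.0]])
stage_pts = np.array([[25.0, 50.0], [1250.0, 75.0], [200.0, 1000.0]])

# Fit stage = [wsi, 1] @ coef (affine map via least squares).
X = np.hstack([wsi_pts, np.ones((len(wsi_pts), 1))])
coef, *_ = np.linalg.lstsq(X, stage_pts, rcond=None)

def wsi_to_stage(xy):
    """Map one WSI pixel coordinate to stage coordinates (um)."""
    return np.hstack([xy, 1.0]) @ coef

# Drive the stage here so every pathologist sees the same field of view.
print(wsi_to_stage(np.array([2000.0, 1000.0])))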


Clinical practice: Pathologists choose which fields of view to evaluate (Pathologists 1-4 each select their own).

Technology evaluation: All pathologists evaluate the same fields of view, matched between the camera image of the glass slide and the WSI patch.

SLIDE 4

NIH Mitotic Counting Study

  • NIH study data (Mark Simpson)
– FOV locations saved for each pathologist in digital mode
– Preliminary agreement results given during the WSIWG meeting


Counts come from different tissue! Clinical practice vs. technology evaluation.

[Images: pHH3 20x, H&E 20x, H&E 40x]

SLIDE 5

eeDAP on the road last year …


[Photo: monitor, computer, motorized stage with joystick, microscope with mounted camera, reticle in the eyepiece]

SLIDE 6

Mitotic Counting and Classification

Install, demo, and train at MSKCC. Study design:

  • 4 slides from Mark Simpson at NCI
– H&E: canine oral melanoma
  • 10 ROIs per slide from tumor
– ROI = 800 x 800 pixels @ 0.25 um/pixel = 200 um x 200 um = 17% of the entire FOV (0.24 mm2); quick check below
  • 4 pathologists from MSKCC

eeDAP on loan to MSKCC.
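A quick check of the ROI arithmetic above (Python; the 0.24 mm2 FOV area is taken from the slide):

# ROI geometry from the slide: 800 x 800 pixels at 0.25 um/pixel.
px, um_per_px = 800, 0.25
side_um = px * um_per_px               # 200 um per side
roi_mm2 = (side_um / 1000.0) ** 2      # 0.04 mm^2
fov_mm2 = 0.24                         # eyepiece field of view (slide value)
print(f"{side_um:.0f} um square, {roi_mm2:.2f} mm^2, "
      f"{100 * roi_mm2 / fov_mm2:.0f}% of the FOV")   # -> 17%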

SLIDE 7

Quick look at first study

  • Circles: mitotic figures identified by pathologists.
  • "Candidate MFs" = marked cells
  • Each color corresponds to a different pathologist.

[WSI image]

SLIDE 8

Readers per Candidate MF

Readers Per Candidate, total = 92

[Histogram (readersPerCandidate1): candidates marked by 1 reader: 45; 2 readers: 12; 3 readers: 14; 4 readers: 21]

  • 45/92 = 49% marked by only one reader
  • 21/92 = 23% unanimously marked
  • Build these candidate MFs into the next study: classification task
  • Need some low-probability candidates from ROIs with zero or one candidates -> yield 34 more
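The tallies above are easy to reproduce once marks are clustered into candidates. A minimal sketch, assuming a hypothetical table of (candidate_id, reader_id) pairs rather than the study's actual data format:

from collections import Counter

# Hypothetical (candidate_id, reader_id) pairs; the real pairs would come
# from clustering the pathologists' marks in the eeDAP output.
marks = {("c1", "r1"), ("c1", "r2"), ("c1", "r3"), ("c1", "r4"),
         ("c2", "r2"), ("c3", "r1"), ("c3", "r4")}

readers_per_candidate = Counter(cand for cand, _ in marks)
hist = Counter(readers_per_candidate.values())  # {n readers: n candidates}

total = len(readers_per_candidate)
print(f"marked by one reader: {hist[1] / total:.0%}")   # 45/92 = 49% in study
print(f"unanimously marked:   {hist[4] / total:.0%}")   # 21/92 = 23% in study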


SLIDE 9

Can we use eeDAP on this multi‐headed microscope?

  • Same microscope frame … 14 heads!

  • Stage mounts fine
  • Camera mounts fine
  • Let’s do it.


SLIDE 10

Mitotic Counting and Classification: Multi-head Microscope

High-throughput reader study. Study design:


  • 4 slides from Mark Simpson at NCI
– H&E: canine oral melanoma
  • 10 ROIs per slide from tumor
– ROI = 800 x 800 pixels @ 0.25 um/pixel = 200 um x 200 um = 17% of the entire FOV (0.24 mm2)
  • 126 (= 92 + 34) candidate MFs
  • 10 pathologists*
  • Collect data on paper
– ~1 hour training
– ~2 hours for data collection

SLIDE 11

Mitotic Counting and Classification: Multi-head Microscope

High-throughput reader study. Workflow:


  • Mark and count in ROI
  • Classify candidates in same ROI
SLIDE 12

Readers per Candidate: Multi‐head study

  • Similar characteristics as before
  • 79/158 = 50% marked by only one reader
  • 21/158 = 13% unanimously marked
– 13 agree with the previous study, 8 are new


Readers Per Candidate, total = 158

[Histogram (readersPerCandidate2): candidates marked by 1 reader: 79; 2: 12; 3: 7; 4: 6; 5: 5; 6: 9; 7: 4; 8: 9; 9: 6; 10: 21]

SLIDE 13

Counting Results

  • Each point =
– one ROI and a pair of readers
– appears twice (x and y transposed)
– has noise added for visualization
  • How do we summarize this?


[Figure: between-reader scatter plot of MF counts, 14-head microscope vs. 14-head microscope]

Agreement, because there is no ground truth: count differences (calibration), pairwise concordance (correlation). "MRMC" analyses account for variability from Multiple Readers and Multiple Cases.

SLIDE 14

Results: Count Differences

  • Rotate 45° and rescale the x-axis -> Bland-Altman plot


[Figure: between-reader scatter plot of MF counts, 14-head microscope vs. 14-head microscope]

SLIDE 15

Results: Count differences

[Figure: Bland-Altman plot, Between-Reader Differences: micro14, N = 3420; x-axis: count averages, y-axis: count differences]

  • Rotate 45° and rescale the x-axis -> Bland-Altman plot
  • Limits of agreement
– Characterize the spread of the differences
– σ = 1.07
  • Not the standard error
– The SE characterizes the spread of the mean difference


"MRMC" analyses account for variability from Multiple Readers and Multiple Cases (sketch below).
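A minimal sketch of the difference summaries on synthetic counts (a readers-by-ROIs array stands in for the study data); the real MRMC analysis, not shown here, estimates the SE while accounting for reader and case correlations:

import numpy as np

rng = np.random.default_rng(0)
counts = rng.poisson(1.5, size=(10, 40))  # synthetic reader-by-ROI MF counts

# Every ordered reader pair on every ROI (each pair appears twice,
# matching the scatter plot above).
i, j = np.nonzero(~np.eye(counts.shape[0], dtype=bool))
diffs = (counts[i] - counts[j]).ravel()        # count differences (y-axis)
means = ((counts[i] + counts[j]) / 2).ravel()  # count averages (x-axis)

sigma = diffs.std(ddof=1)                      # spread of the differences
print(f"sigma = {sigma:.2f}, limits of agreement ~ +/-{1.96 * sigma:.2f}")

# Caution: sigma / sqrt(len(diffs)) is NOT a valid SE for the mean
# difference; the differences share readers and ROIs, so an MRMC variance
# analysis is needed to get an honest standard error.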

SLIDE 18

Results: Count Differences

  • Study 1:
– More MFs with the microscope
– Count differences were larger with digital
  • Study 2:
– Microscope results consistent with Study 1


                               Average Counts   SE of Average Counts   Std of Between-Reader Count Differences
Study 1: Digital                    1.22              0.23                       1.29
Study 1: Microscope                 1.48              0.27                       1.12
Study 1: Microscope - Digital       0.26              0.12                       1.20
Study 2: 14-head Microscope         1.54              0.25                       1.07

SLIDE 19

Pairwise Concordance

A probability that tracks with correlation

  • Select two ROIs
  • Consider the counts from two pathologists
– Pathologist 1: X1, X2
– Pathologist 2: Y1, Y2
  • Possible outcomes (table below)


Concordance:                X1 > X2, Y1 > Y2
Discordance:                X1 > X2, Y1 < Y2
Tie for Pathologist 1:      X1 = X2, Y1 ≠ Y2
Tie for Pathologist 2:      X1 ≠ X2, Y1 = Y2
Tie for both pathologists:  X1 = X2, Y1 = Y2
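A minimal sketch of the pairwise-concordance computation for one pair of pathologists (Python; how the study's estimator weights the tie outcomes is not shown on the slide, so ties are simply tallied separately):

from itertools import combinations

def pairwise_concordance(x, y):
    """Fraction of concordant ROI pairs for two readers' counts."""
    outcomes = {"concordant": 0, "discordant": 0,
                "tie_reader1": 0, "tie_reader2": 0, "tie_both": 0}
    for a, b in combinations(range(len(x)), 2):
        dx, dy = x[a] - x[b], y[a] - y[b]
        if dx == 0 and dy == 0:
            outcomes["tie_both"] += 1
        elif dx == 0:
            outcomes["tie_reader1"] += 1
        elif dy == 0:
            outcomes["tie_reader2"] += 1
        elif (dx > 0) == (dy > 0):
            outcomes["concordant"] += 1
        else:
            outcomes["discordant"] += 1
    n_pairs = sum(outcomes.values())
    return outcomes["concordant"] / n_pairs, outcomes

# Example: counts from two pathologists over five ROIs
print(pairwise_concordance([3, 1, 4, 1, 5], [2, 0, 5, 1, 4]))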

[Figure: between-reader scatter plot of MF counts, 14-head microscope vs. 14-head microscope]

No time for concordance results

SLIDE 20

Classification scores

[Figure: between-reader scatter plot of classification scores (0-100), 14-head microscope vs. 14-head microscope]

  • Summarize with concordance
  • For some cases:
– "Definitely not a MF"
– "Definitely is a MF"
  • Can this impact AI training?
  • Can still binarize this data (sketch below)
– 15% in the red zone

No time for concordance results
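A sketch of the binarization idea on made-up scores; the 50 cut point and the 30-70 "red zone" band are assumptions for illustration, not the study's definitions:

import numpy as np

scores = np.array([2, 15, 48, 55, 71, 88, 97])  # synthetic 0-100 scores

is_mf = scores >= 50                       # binarized MF call (assumed cut)
red_zone = (scores > 30) & (scores < 70)   # ambiguous middle band (assumed)
print(f"called MF: {is_mf.mean():.0%}; in the red zone: {red_zone.mean():.0%}")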

SLIDE 21

Generalize to evaluating computational pathology

  • FDA qualification of images with annotations
– MDDT: Medical Device Development Tools
– Support FDA submissions of computational pathology
  • Generate candidates from pathologists AND algorithm(s) (merging sketch below)
  • Candidates cover a range in the likelihood that the candidate is a MF
  • Use the same agreement measures

5/23/2018 www.fda.gov 21

Readers Per Candidate, total = 158

[Histogram (readersPerCandidate2): candidates marked by 1 reader: 79; 2: 12; 3: 7; 4: 6; 5: 5; 6: 9; 7: 4; 8: 9; 9: 6; 10: 21]

Pooling candidates reduces bias in the comparison. A similar plot would show AI likelihood instead of readers per candidate.
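One way to pool candidates from pathologists and algorithms is to take the union of marked locations and merge near-duplicates by distance. A minimal sketch; the 10 um matching radius and the greedy merge are assumptions, not the study's method:

import numpy as np

def merge_candidates(path_xy, algo_xy, radius_um=10.0):
    """Union of pathologist and algorithm candidate locations (um).

    Candidates closer than `radius_um` are treated as the same cell.
    """
    merged = list(map(tuple, path_xy))
    for x, y in algo_xy:
        d = [np.hypot(x - mx, y - my) for mx, my in merged]
        if not d or min(d) > radius_um:
            merged.append((x, y))
    return merged

# Example: two pathologist marks, two algorithm detections (one overlaps)
print(merge_candidates([(10.0, 10.0), (50.0, 40.0)],
                       [(12.0, 9.0), (80.0, 80.0)]))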

SLIDE 22

Summary

  • Collected and analyzing:
– MF counts, locations, and classifications
  • Agreement analyses
– MRMC analysis
– Calibration
– Correlation
– Unit of analysis: cells > ROIs > slides
  • Limitations
– Anecdotal feedback
  • Pathologists felt rushed
  • Focus handling not perfect
  • No reticles in eyepieces
– No ground truth
  • Future work
– Generalize to other ROIs?
– Generalize to other specimens (organs)?
  • Evaluate AI algorithms
– Use a similar study design
– Use similar analysis tools
– Need "candidates" from algorithms and pathologists for an unbiased evaluation
  • FDA qualification of images with annotations
– MDDT: Medical Device Development Tools
– Test sets for FDA submissions of computational pathology
