Image-based profiling using deep learning
Juan C. Caicedo Ph.D Broad Institute of MIT and Harvard
Images can be quantified for all kinds of phenotypes
Example images: mitosis, 3D, muscle structure, patient biopsy tissue, image mass spec, control human iPS, isogenic Duchenne-like iPS
Image credits: David Thomas, Margaret Shipp/Scott Rodig, Michael Angelo, Allen Institute for Cell Science, Olivier Pourquie
Clinical trials underway for Alisertib in adults with AMKL. Wen Q, et al. (2012). Cell 150(3):575-89
Figure: DNA stain with outlines identifying the nuclei. Treatments shown: DMSO, SU6656, TKK 0.1 uM, AZ138 0.01 uM
Caicedo J.C., Singh S., Carpenter A. "Applications of Image-Based Profiling of Perturbations". Current Opinion in Biotechnology - 2016.
Gustafsdottir, et al. PLOS ONE 2013; Bray, et al. Nature Protocols 2016
Are treatments significantly different / effective?
Caicedo, J.C., et al. "Data-analysis strategies for image-based cell profiling." Nature Methods 14.9, 2017
Segmentation
Training: example image + manual annotation (labeled objects) -> train model
Applying: new image -> run model -> labeled objects
Caicedo, J.C., et al. "Evaluation of Deep Learning Strategies for Nucleus Segmentation in Fluorescence Images." BioRxiv (2019): 335216.
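The "labeled objects" in the segmentation workflow are commonly stored as an integer mask in which every object has its own positive ID and the background is 0. The following is a minimal sketch of that data structure only, assuming a binary foreground mask is already available. It uses classical 4-connected component labeling, not the deep learning models evaluated in the cited work, and the function name label_objects is hypothetical.

```python
from collections import deque


def label_objects(binary_mask):
    """Give each 4-connected foreground region its own positive integer ID.

    Illustrative assumption: binary_mask is a non-empty list of equal-length
    rows containing 0 (background) or 1 (foreground).
    """
    rows, cols = len(binary_mask), len(binary_mask[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if binary_mask[r][c] and labels[r][c] == 0:
                # Found an unlabeled foreground pixel: start a new object.
                next_label += 1
                labels[r][c] = next_label
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and binary_mask[ny][nx] and labels[ny][nx] == 0:
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels, next_label


mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
labels, count = label_objects(mask)
```

In this toy mask the two separate regions receive IDs 1 and 2; touching nuclei would be merged by this approach, which is one reason learned segmentation models are attractive.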
65,333 experiments, 3,634 teams, 3 months
Figure: Distribution of scores in second-stage evaluation. Axes: competition score and number of participant teams. Marked entries: 1 - [ods.ai] topcoders, 2 - jacobkie, 3 - Deep Retina, 4 - Nuclear Vision, 5 - Inom Mirzaev, CellProfiler reference*
Figure: Accuracy of top-3 models. x-axis: Intersection over Union threshold; y-axis: accuracy (F1-score). Series: 1st place ([ods.ai] topcoders), 2nd place (jacobkie), 3rd place (Deep Retina), reference
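The accuracy measure named on these slides, an F1-score at an Intersection over Union (IoU) threshold, can be illustrated with a small sketch. It assumes objects are represented as sets of pixel coordinates and that predictions are matched greedily to ground-truth objects; the function names iou and f1_at_iou are hypothetical, and the exact matching procedure used in the cited evaluation may differ.

```python
def iou(a, b):
    """Intersection over union of two objects given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def f1_at_iou(truth, predicted, threshold):
    """F1-score counting a prediction as a true positive when its best
    unmatched ground-truth object has IoU >= threshold.

    Assumption for illustration: greedy one-to-one matching in prediction order.
    """
    matched = set()
    true_positives = 0
    for pred in predicted:
        best_idx, best_iou = None, 0.0
        for idx, obj in enumerate(truth):
            if idx in matched:
                continue
            score = iou(obj, pred)
            if score > best_iou:
                best_idx, best_iou = idx, score
        if best_idx is not None and best_iou >= threshold:
            matched.add(best_idx)
            true_positives += 1
    false_positives = len(predicted) - true_positives
    false_negatives = len(truth) - true_positives
    denominator = 2 * true_positives + false_positives + false_negatives
    # With no objects at all there is nothing to get wrong.
    return 2 * true_positives / denominator if denominator else 1.0


truth = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(5, 5), (5, 6), (6, 5), (6, 6)}]
predicted = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(5, 5), (5, 6), (7, 7), (7, 8)}]
strict = f1_at_iou(truth, predicted, 0.7)   # second prediction overlaps too little
lenient = f1_at_iou(truth, predicted, 0.3)  # both predictions count as matches
```

Raising the threshold demands tighter agreement between predicted and annotated outlines, so the same predictions score lower at 0.7 than at 0.3 in this toy case.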
Figure: Accuracy by image type. Accuracy is the F1-score @ 0.7 IoU for the 1st place, 2nd place, and 3rd place models and the reference. Image types: small fluorescent, pink and purple tissue, purple tissue, large fluorescent, grayscale tissue
Figure: Dataset distribution by image type for the training and test sets. Percentages shown: 80.6%, 0.6%, 15.5%, 0.9%, 2.4%, 67.9%, 4.7%, 15.1%, 11.3%, 0.9%
Caicedo et al. 2019 Nature Methods. In Press.
in several parts of her body
The patient accepts an experimental treatment called immunotherapy.
Taken from https://www.bbc.co.uk/news/health-44338276
Tumor sequencing revealed 62 mutations. A treatment was known for 7 of them. The experiment worked with only 4 mutations.
Anne Carpenter, Shantanu Singh, Juan Caicedo, Mohammad Rohban
Cell line: A549; over-expression; 8 replicates; 50 million+ single cells
Examples shown: CONTROL, EGFR_WT, ARAF_WT, CTNNB1_WT, FBXW7_WT, KRAS_WT, KEAP1_WT, MAPK7_WT, RIT1_WT, STK11_WT, ARAF_p.S214F
Are treatments significantly different / effective?
Caicedo, et al. 2017 Nature Methods
Engineer measurements: define and compute useful properties
Examples: area, shape, color distribution, nuclei size
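Engineered measurements such as area and simple shape descriptors can be computed directly from a labeled segmentation mask. The sketch below is an illustration under stated assumptions: the mask is a list of rows of integer object IDs with 0 as background, and the function name measure_objects and the chosen features (area, centroid, bounding box, extent) are hypothetical examples, not the feature set used by CellProfiler.

```python
def measure_objects(labels):
    """Compute area, centroid, bounding-box size, and extent per labeled object.

    Assumption for illustration: labels is a list of rows of integer IDs,
    with 0 marking background pixels.
    """
    stats = {}
    for r, row in enumerate(labels):
        for c, label in enumerate(row):
            if label == 0:
                continue
            s = stats.setdefault(label, {
                "area": 0, "sum_r": 0, "sum_c": 0,
                "min_r": r, "max_r": r, "min_c": c, "max_c": c,
            })
            s["area"] += 1
            s["sum_r"] += r
            s["sum_c"] += c
            s["min_r"], s["max_r"] = min(s["min_r"], r), max(s["max_r"], r)
            s["min_c"], s["max_c"] = min(s["min_c"], c), max(s["max_c"], c)

    features = {}
    for label, s in stats.items():
        height = s["max_r"] - s["min_r"] + 1
        width = s["max_c"] - s["min_c"] + 1
        features[label] = {
            "area": s["area"],
            "centroid": (s["sum_r"] / s["area"], s["sum_c"] / s["area"]),
            "bbox_height": height,
            "bbox_width": width,
            # Extent: fraction of the bounding box covered by the object.
            "extent": s["area"] / (height * width),
        }
    return features


labels = [
    [1, 1, 0, 0],
    [1, 0, 0, 2],
    [0, 0, 2, 2],
]
features = measure_objects(labels)
```

Each object yields one row of numbers, so a whole experiment becomes a table of cells by features that downstream profiling steps can aggregate and compare.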
Learn features to solve a task: train a deep neural network (for example, on a classification task)
Discover associations
Compounds: Compound A, Compound B, Compound C, Compound D
Mechanistic classes: Mechanistic Class X, Mechanistic Class Y
Batch 1: A1, A2, B1, B2, C1, C2
Batch 2: A3, D1, D2, B3, C3, D3
Model: CNN with softmax output
Main goal: treatment-level profiling
Auxiliary task: single-cell treatment classification
Caicedo, J. C., et al. "Weakly supervised learning of single-cell feature embeddings". Computer Vision and Pattern Recognition, IEEE CVPR 2018
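Treatment-level profiling needs a way to go from many single-cell feature vectors to one profile per treatment. The sketch below shows one simple option, assuming each cell already has a feature vector (for example, taken from a trained network) and that profiles are built by averaging and compared by cosine similarity. Mean aggregation, cosine similarity, and the function names are assumptions for illustration; the cited paper may aggregate and compare differently.

```python
import math


def treatment_profiles(cells):
    """Average single-cell feature vectors into one profile per treatment.

    Assumption for illustration: cells is an iterable of
    (treatment_name, feature_vector) pairs with equal-length vectors.
    """
    sums, counts = {}, {}
    for treatment, vector in cells:
        if treatment not in sums:
            sums[treatment] = [0.0] * len(vector)
            counts[treatment] = 0
        for i, value in enumerate(vector):
            sums[treatment][i] += value
        counts[treatment] += 1
    return {t: [v / counts[t] for v in total] for t, total in sums.items()}


def cosine_similarity(u, v):
    """Cosine of the angle between two profiles; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy single-cell embeddings for two hypothetical treatments.
cells = [
    ("Compound A", [1.0, 0.0]),
    ("Compound A", [3.0, 0.0]),
    ("Compound B", [0.0, 2.0]),
    ("Compound B", [0.0, 4.0]),
]
profiles = treatment_profiles(cells)
similarity = cosine_similarity(profiles["Compound A"], profiles["Compound B"])
```

Treatments whose profiles have high similarity are candidates for sharing a mechanistic class, which is how the single-cell classification task can serve the treatment-level goal.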
Figure: t-SNE projections of the phenotypic space for three feature types: CellProfiler, weakly supervised learning, and transfer learning. In all plots, the x-axis is t-SNE1 and the y-axis is t-SNE2.
Legend: Control, TP53.WT, STK11.WT, NFE2L2.WT, MDM2.WT, KRAS_p.G12V, KEAP1.WT, EGFR_p.T790M, EGFR_p.L858R, EGFR.WT
Figure: EGFR_p.S645C compared across EGFR wild type (WT), control, and EGFR mutant (MUT)
Variant impact: 66.9%
Figure: Morphological Variant Impact Score (%), axis from 0.0% to 80.0%, for EGFR_p.T790M, p.L858R.o; EGFR_p.L858R; EGFR_p.S654C; EGFR_p.K754E; EGFR_p.Q102H; EGFR_p.R222L
Find more results at: https://broad.io/cp-luad
Imaging is a rich source of information.
Computer vision has powerful tools for image analysis.
Many computer vision tasks can be fully automated.
Imaging data can be connected to other sources of data.
Broad Imaging Platform: Anne Carpenter
Hamdah Abbasi, Jeanelle Ackerman, Beth Cimini, Minh Doan, Allen Goodman
Profiling group: Shantanu Singh, Tim Becker, Marzieh Haghighi, Matt Smith
Many thanks to our many biology collaborators!
Recent major funding for this work provided by: