Bioimage Informatics: Computer Vision for Biology (Luis Pedro Coelho)

SLIDE 1

Bioimage Informatics: Computer Vision for Biology

Luis Pedro Coelho

Institute for Molecular Medicine, Lisbon Mhlanga Lab

November 2011

SLIDE 2

High Throughput Science

“The real measure of success is the number of experiments that can be crowded into twenty-four hours.” — Thomas Edison

Luis Pedro Coelho (Institute for Molecular Medicine) ⋆ Bioimage Informatics ⋆ Nov 2011 (2 / 43)

SLIDE 3

High Throughput High Content Biology

Lab Technologies

  • Liquid handling robots
  • Multi-well plates
  • Automated microscopes

One can generate thousands of images per hour.

SLIDE 4

Images

8 2 2 1 1 1 2 2 8 8 2 2 2 2 2 8 21 8 8 2 2 2 8 8 21 8 8 8 2 8 8 8 21 8 8 8 8 8 8 8 21 8 8 8 2 8 8 8 21 8 8 2 2 2 8 8 8 8 2 2 2 2 2 8

This is the raw data.

SLIDE 5

Image Processing

Typical Tasks

  • Denoising
  • Particle detection
  • Segmentation
  • …

At the end of these steps, you still have an image which must be interpreted by computer or human. I am not discussing any of this today. See Alexandre’s talk.

SLIDE 8

First Task

Classification

Given labeled data, can we learn a classification model?

Labeled Data

A small dataset of images with labels. The goal is to then assign labels to other images.
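
As a rough illustration of learning from a small labeled dataset, here is a minimal sketch of a nearest-centroid classifier. It assumes each image has already been reduced to a short list of numbers; the function names, the choice of classifier, and the toy data are all assumptions made for this example and are not taken from the talk.

```python
from math import dist  # Euclidean distance, Python 3.8+


def train_centroids(features, labels):
    """Average the feature vectors of each class (one centroid per label)."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def classify(centroids, vec):
    """Assign the label whose centroid is closest to the new feature vector."""
    return min(centroids, key=lambda lab: dist(centroids[lab], vec))


# Toy, made-up data: two numbers per image, two classes.
train_x = [[1.0, 1.2], [0.8, 1.0], [3.0, 3.1], [3.2, 2.9]]
train_y = ["nuclear", "nuclear", "mitochondria", "mitochondria"]

model = train_centroids(train_x, train_y)
prediction = classify(model, [2.9, 3.0])  # label for a previously unseen image
```

Any other classifier could stand in for the centroid rule; the point is only the two phases: fit on labeled images, then label new ones.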

SLIDE 9

Example

SLIDE 11

Features

Feature Based Approach

  • Represent the image by a small number of features.
  • Proposed by Boland and Murphy (1998) for subcellular location.
  • Very successful for many applications.

SLIDE 12

Features

A feature is any number you can compute from the image. For a good feature, you wish to simultaneously:

1. Capture the important variations.
2. Disregard the unimportant variations.

These are naturally problem dependent, but machine learning helps.

SLIDE 13

Example Feature

12 6 5 4 3 5 11 10 4 6 7 4 4 5 3 10 8 9 3 4 12 9 8 14 7 12 10 8 11 13

SLIDE 16

Algorithm

For each 3 × 3 region:

  • Find the maximum and the minimum.
  • Subtract the minimum from the maximum.

You end up with a number per region (per pixel). For an image-level feature, average this number.

1. What is this feature sensitive to?
2. What is this feature invariant to?
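
The algorithm above can be sketched in a few lines of plain Python. The slide does not say how image borders are handled, so this version assumes that only 3 × 3 windows lying fully inside the image are used; the function name and the toy images are made up for illustration.

```python
def local_range_feature(image):
    """Mean over all full 3x3 windows of (window maximum - window minimum)."""
    rows, cols = len(image), len(image[0])
    if rows < 3 or cols < 3:
        raise ValueError("image must be at least 3x3")
    ranges = []
    for r in range(rows - 2):
        for c in range(cols - 2):
            # Collect the nine pixels of the window whose top-left corner is (r, c).
            window = [image[r + dr][c + dc] for dr in range(3) for dc in range(3)]
            ranges.append(max(window) - min(window))
    return sum(ranges) / len(ranges)


# Toy images (assumed data): a constant image, a checkerboard, and the
# checkerboard with a constant offset added to every pixel.
flat = [[5] * 4 for _ in range(4)]
textured = [[0, 9, 0, 9], [9, 0, 9, 0], [0, 9, 0, 9], [9, 0, 9, 0]]
shifted = [[v + 100 for v in row] for row in textured]

flat_value = local_range_feature(flat)
textured_value = local_range_feature(textured)
shifted_value = local_range_feature(shifted)
```

On these toy inputs the constant image scores zero, while the checkerboard and its brightened copy score the same, which hints at one possible answer to each of the two questions.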

SLIDE 19

Example

[Figure: histogram of feature values (x-axis: value, y-axis: count) for Nuclear and Mitochondria images; a second version of the plot adds Nucleoli.]

SLIDE 21

Complex Examples

Alternatives

  • Manually design features by trial and error
  • Machine learning approach

Machine Learning

1. Use many generic features (tens to hundreds)
2. Automatically learn which features are important
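
One simple way to see what "learning which features are important" can mean is to score each feature by how well it separates two classes. The sketch below uses a Fisher-style ratio (squared gap between class means over the summed class variances); this particular score, the function name, and the toy data are assumptions chosen for illustration, not the method used in the talk.

```python
from statistics import mean, pvariance


def fisher_scores(features, labels):
    """Score each feature column: squared class-mean gap over summed class variances."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    scores = []
    for j in range(len(features[0])):
        a = [f[j] for f, lab in zip(features, labels) if lab == classes[0]]
        b = [f[j] for f, lab in zip(features, labels) if lab == classes[1]]
        gap = (mean(a) - mean(b)) ** 2
        spread = pvariance(a) + pvariance(b)
        if spread > 0:
            scores.append(gap / spread)
        else:
            # No within-class spread: perfectly separating or entirely constant.
            scores.append(float("inf") if gap > 0 else 0.0)
    return scores


# Toy, made-up data: feature 0 differs between classes, feature 1 does not.
features = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 4.0], [3.2, 5.0], [2.8, 3.0]]
labels = ["A", "A", "A", "B", "B", "B"]

scores = fisher_scores(features, labels)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

Scoring features one at a time ignores interactions between them, so this is best read as a first look rather than a full feature-selection procedure.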

SLIDE 23

Typical Features

  • Texture (Haralick, Gabor, …)
  • Edginess, smoothness, …
  • Local features, …
  • …

The literature is vast.

SLIDE 31

Classifiers

[Figure: two plots illustrating classifiers]

SLIDE 33

Results

Confusion matrix (classes: Cyto, Cytosk, Lyso, PM, Mito, N, NO); each row lists its non-empty cells in order:

Cyto: 115 10 3 15 8 4
Cytosk: 14 147 3 2 30 1
Lyso: 3 1 14 50 1
PM: 31 6 2 9 2 1
Mito: 22 30 15 126 6 1
N: 25 1 1 219 9
NO: 1 1 16 95

Average: 72%

SLIDE 34

HeLa Dataset

Confusion matrix (classes: dna, er, gi, gii, l, m, n, a, e, t); each row lists its non-empty cells in order:

dna: 86 1
er: 84 1 1
gi: 84 2 1
gii: 4 79 1 1
l: 1 72 1 10
m: 3 1 1 64 3 1
n: 1 1 78
a: 98
e: 2 3 5 1 79 1
t: 1 1 1 88

Average: 94%
Human performance: 83% (Murphy et al., 2003)

SLIDE 36

Typical Results

  • Comparable to or better than human!
  • Better with multiple replicates.
  • Classification times: a few seconds per image.

SLIDE 37

Other Problems

Other Typical Classification Problems

  • Phenotype in a screen
  • Stem cell differentiation
  • …

SLIDE 38

Segmentation as Classification

(Coelho et al., 2009) (Chen et al., 2011)

SLIDE 39

Learning to Count

(Lempitsky & Zisserman, 2010)

SLIDE 40

Conclusions

  • Computers can do very well at classification.
  • Flexible tool if you have the training data.

SLIDE 41

Mixture Patterns Classification

Previously reported methods work well for simple classes, like “endosomes” or “mitochondria.” What if a protein is present in both endosomes and mitochondria?

SLIDE 43

Mixture Pattern Example

SLIDE 46

Supervised Unmixing Problem

Given examples of pure patterns and a mixed pattern, can we identify how much each pure pattern contributes to the mixture? Using an object-based approach, we can solve this. (T. Zhao et al., 2005) (T. Peng, G. Bonami et al., 2010)
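
To make the unmixing question concrete, here is a minimal sketch under strong assumptions: each pattern is summarized as the fraction of its objects falling into a few object types, and a mixed pattern is a linear blend of two pure ones. The mixing fraction is then a one-dimensional least-squares fit. The representation, the function name, and the numbers are hypothetical and are not claimed to be the procedure of the cited papers.

```python
def mixing_fraction(pure_a, pure_b, mixed):
    """Least-squares alpha such that mixed ~ alpha * pure_a + (1 - alpha) * pure_b."""
    direction = [a - b for a, b in zip(pure_a, pure_b)]
    denom = sum(d * d for d in direction)
    if denom == 0:
        raise ValueError("the two pure patterns are identical")
    # Project (mixed - pure_b) onto the line from pure_b to pure_a.
    alpha = sum((m - b) * d for m, b, d in zip(mixed, pure_b, direction)) / denom
    return min(1.0, max(0.0, alpha))  # keep the fraction within [0, 1]


# Made-up object-type frequencies for two pure patterns (each sums to 1).
pure_a = [0.7, 0.2, 0.1]
pure_b = [0.1, 0.3, 0.6]

# Synthetic mixture: 25% of pattern A and 75% of pattern B.
mixed = [0.25 * a + 0.75 * b for a, b in zip(pure_a, pure_b)]

alpha = mixing_fraction(pure_a, pure_b, mixed)
```

With more than two pure patterns the same idea becomes a constrained least-squares problem over several fractions that sum to one.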

SLIDE 48

Unsupervised Unmixing Problem

What if we don’t know the pure patterns? Given a collection of untagged images, can we identify the pure and mixed patterns?

SLIDE 50

Process

SLIDE 54

Results: Mixing Bases

(Coelho et al., 2010)

SLIDE 55

Results: Mixing Fractions

[Figure: two panels with axes mitotracker concentration (700, 411, 242, 142, 83, 49, 29) and lysotracker concentration (300, 214, 153, 109, 78, 55, 39)]

Correlation: 91% (Coelho et al., 2010)

SLIDE 57

Pattern unmixing works both in supervised and unsupervised modes.

SLIDE 58

Other Heterogeneous Problems

Problems

  • Multiple cells in a field
  • Multiple cells in a tissue
  • …

SLIDE 59

Multiple Heterogeneous Cells

Approach

1. Segment cells
2. Classify cells independently
3. Group classifications

(Altschuler & Wu, 2010)
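
The three steps can be sketched as a small pipeline. The segmentation and per-cell classifier are passed in as functions because the slide does not specify them; the grouping rule here (class fractions plus a majority label) is an assumption chosen for simplicity, and all names and data are hypothetical.

```python
from collections import Counter


def group_classifications(cell_labels):
    """Summarize per-cell labels as a majority label and per-class fractions."""
    if not cell_labels:
        raise ValueError("no cells to group")
    counts = Counter(cell_labels)
    total = sum(counts.values())
    fractions = {lab: n / total for lab, n in counts.items()}
    majority = counts.most_common(1)[0][0]
    return majority, fractions


def analyze_field(image, segment, classify_cell):
    """Segment the field, classify each cell independently, then group the labels."""
    cells = segment(image)
    labels = [classify_cell(cell) for cell in cells]
    return group_classifications(labels)


# Toy stand-ins: the "image" is already a list of per-cell measurements.
toy_image = [0.9, 0.8, 0.7, 0.1]
majority, fractions = analyze_field(
    toy_image,
    segment=lambda img: img,
    classify_cell=lambda cell: "positive" if cell > 0.5 else "negative",
)
```

Keeping the per-class fractions, rather than only the majority label, preserves the heterogeneity of the field for later analysis.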

SLIDE 60

Positive Example

SLIDE 61

Negative Example

SLIDE 63

K-Nearest Neighbour Test

(Henze, 1988) (T. Zhao et al., 2006)
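
The slide gives only the name of the test and its references. As a hedged sketch of the general idea behind nearest-neighbour two-sample tests: pool the two samples, count how often a point's k nearest neighbours come from the same sample as the point itself, and compare that count with what random relabeling produces. Using permutations for significance, the function names, and the toy data are assumptions made for this example, not details taken from the cited work.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def knn_statistic(points, labels, k):
    """Fraction of (point, neighbour) pairs among the k nearest that share a label."""
    same = 0
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        same += sum(labels[j] == labels[i] for j in others[:k])
    return same / (len(points) * k)


def knn_two_sample_test(sample_a, sample_b, k=3, n_permutations=200, seed=0):
    """Return the observed statistic and a permutation p-value."""
    points = list(sample_a) + list(sample_b)
    labels = [0] * len(sample_a) + [1] * len(sample_b)
    observed = knn_statistic(points, labels, k)
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(n_permutations):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # break any link between position and sample
        if knn_statistic(points, shuffled, k) >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


# Toy, made-up feature vectors: two clearly separated groups of cells.
sample_a = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.0), (0.0, 0.3)]
sample_b = [(x + 5.0, y + 5.0) for x, y in sample_a]

observed, p_value = knn_two_sample_test(sample_a, sample_b)
```

A statistic close to 1 with a small p-value suggests the two samples come from different distributions; when the samples are alike, neighbours mix freely and the statistic drops toward its chance level.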

SLIDE 70

Where we are going

Data Integration

  • Multiple image types
  • Non-image data

(This was my PhD dissertation, but it is still unpublished)

SLIDE 71

Where we are going

Active Learning

  • Let the computer choose the experiment.
  • Cut the human out of the loop.

(King et al., 2009) (Murphy, 2011)

SLIDE 72

Conclusions & Guidelines

  • Automated methods can give better answers than humans (if the question is well defined)
  • Interpretation need not be the bottleneck, even in high-throughput settings
  • Not many user-friendly tools are available
  • Collaboration can get you an expert
  • Start your collaboration before you collect data

SLIDE 73

Acknowledgments

  • Prof. Robert F. Murphy
  • Dr. Tao Peng
  • Aabid Shariff
  • Dr. Estelle Glory-Afshar
  • Dr. Elvira Garcia-Osuna
  • Armaghan Naik
  • Joshua Kangas
  • …
  • Prof. Gustavo Rohde
  • Cheng Chen

Funding Agencies

  • Fulbright Program
  • National Institutes of Health
  • Fundação Para Ciência e Tecnologia
  • Siebel Scholars Foundation

SLIDE 74

thank you…

SLIDE 75

Slides

These slides (and complete references to all papers mentioned) are available at http://luispedro.org/talks/2011/embo
