Detecting and visualizing cell phenotype differences from microscopy images using transport-based morphometry

SLIDE 1

Introduction Optimal Mass Transport Linearized Optimal Transport Results

Detecting and visualizing cell phenotype differences from microscopy images using transport-based morphometry

Soheil Kolouri

Ph.D. qualifying exam presentation
Biomedical Engineering Department
Carnegie Mellon University

SLIDE 3

Introduction

◮ Thanks to the recent development of advanced high-resolution microscopes, imaging experiments have become the main source of data for biologists to test and validate their hypotheses.

SLIDE 4

Input

Relevant questions

  1. Are they different?
  2. If so, how are they different?

Applications

◮ Benign vs. malignant cells (pathology)
◮ Understanding the role of different proteins
◮ Discovering drugs
◮ Understanding the effects of genes (RNAi)
◮ Understanding cell signaling mechanisms
◮ ...

SLIDE 5

State of the art

◮ Predetermined numerical features are conventionally used to quantify cells extracted from histology images and to classify different populations (e.g., normal vs. diseased).

Common features

◮ Area
◮ Perimeter
◮ Axis lengths
◮ Haralick
◮ Gabor
◮ Wavelet
◮ Gray level statistics
◮ ...
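
As a rough illustration of what such predetermined features look like in practice, the sketch below computes a few of them (area, a crude pixel-count perimeter, and gray-level statistics) for one segmented cell. The function name `basic_features`, the 4-neighbour boundary definition, and the toy image are assumptions made for illustration; they are not taken from the presentation.

```python
import numpy as np

def basic_features(image, mask):
    """Return a few hand-crafted features for one segmented cell.

    image: 2-D float array of gray levels; mask: 2-D boolean array (True = cell).
    """
    mask = mask.astype(bool)
    area = int(mask.sum())
    # A pixel is interior if all four of its neighbours are cell pixels.
    padded = np.pad(mask, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    perimeter = int((mask & ~interior).sum())  # boundary pixel count (crude estimate)
    values = image[mask]
    return {"area": area, "perimeter": perimeter,
            "mean_gray": float(values.mean()), "std_gray": float(values.std())}

# Toy example: a 4x4 square cell of constant intensity inside a 6x6 image.
mask = np.zeros((6, 6), dtype=bool)
mask[1:5, 1:5] = True
image = np.where(mask, 0.5, 0.0)
features = basic_features(image, mask)
```

Each cell is reduced to a handful of numbers here, which is the data reduction that the next slide lists as a drawback.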

SLIDE 6

Drawbacks

  1. The feature space is often much smaller than the image space, and hence one might lose important information in this data reduction process.
  2. Inferring biological information based on the features is not straightforward.

Why don't we use the image space itself?

SLIDE 7

While in feature space:

SLIDE 8

Alternative approach

◮ A good alternative to feature-based morphometry is transportation-based morphometry (Wang et al., 2010).
◮ In transportation-based morphometry, the target image is warped to the source image, and the distance between the images is defined as a function of the transformation.
◮ The distance defined in this manner uses all of the information embedded in the image and provides a meaningful similarity measure.

Figure labels: $d_{OT}(x, y) = 16.7$ and $d_{OT}(x, y) = 25.8$

SLIDE 9

Goal

◮ Our goal is to devise a method for detecting and visualizing differences between subpopulations (classes) in an image dataset.
◮ We use an optimal-transport-based approach to accomplish this goal.

SLIDE 10

Mass Preserving Mapping

Assume that $\Omega_0$ and $\Omega_1$ are two domains in $\mathbb{R}^d$, each with a positive density, $\mu_0$ and $\mu_1$ respectively, and the same total amount of mass,

$$\int_{\Omega_0} \mu_0(x)\,dx = \int_{\Omega_1} \mu_1(x)\,dx.$$

A mapping $f : \Omega_0 \to \Omega_1$ is called mass preserving (MP) if it satisfies

$$|Df(x)|\,\mu_1(f(x)) = \mu_0(x),$$

where $|Df(x)|$ is the determinant of the Jacobian matrix of $f$, which gives the factor by which $f$ expands or shrinks volume at $x$.

◮ There exist infinitely many such mappings.
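
The mass-preservation condition can be checked numerically in a simple one-dimensional case. The sketch below assumes Gaussian densities and affine candidate maps, and takes $|Df(x)|$ as the absolute value of the derivative; none of these choices come from the slides. It verifies that two different maps both satisfy $|Df(x)|\,\mu_1(f(x)) = \mu_0(x)$, a small instance of the non-uniqueness noted above.

```python
import numpy as np

def gaussian(x, mean, std):
    """Density of a 1-D Gaussian evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

mean1, std1 = 3.0, 2.0            # hypothetical parameters of the target density mu_1
x = np.linspace(-4.0, 4.0, 201)
mu0 = gaussian(x, 0.0, 1.0)       # source density mu_0(x), a standard Gaussian

# Two candidate maps: an increasing and a decreasing affine map.
f_increasing = mean1 + std1 * x
f_decreasing = mean1 - std1 * x
jacobian = abs(std1)              # |Df(x)| is constant for an affine map in 1-D

error_increasing = float(np.max(np.abs(jacobian * gaussian(f_increasing, mean1, std1) - mu0)))
error_decreasing = float(np.max(np.abs(jacobian * gaussian(f_decreasing, mean1, std1) - mu0)))
```

Both errors are at the level of floating-point round-off, so both maps are mass preserving even though they move mass in very different ways.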

SLIDE 11

Kantorovich-Wasserstein metric

◮ We use the Kantorovich-Wasserstein (KW) formulation of the optimal transport problem.

$L_p$ Kantorovich-Wasserstein Metric

The $L_p$ Kantorovich-Wasserstein metric for two densities $\mu_0$ on $\Omega_0$ and $\mu_1$ on $\Omega_1$ (normalized images) is defined as

$$d_p(\mu_0, \mu_1)^p = \min_{f \in MP} \int_{\Omega_0} \|f(x) - x\|_p^p \, \mu_0(x)\,dx,$$

in which $MP = \{ f \mid |Df(x)|\,\mu_1(f(x)) = \mu_0(x),\ \forall x \in \Omega_0 \}$. The choice $p = 2$ is common and leads to a unique solution $f$.

◮ Solving the above optimization problem to obtain the distance, although possible, is not an easy task. Hence, we use the particle-based formulation of the KW distance.

SLIDE 12

Particle-based Kantorovich-Wasserstein

◮ We assume that our images are of the following form,

$$\mu_0(x) = \sum_i p_i\,\delta(x - x_i), \qquad \mu_1(y) = \sum_j q_j\,\delta(y - y_j),$$

where $\delta$ is the Dirac delta function, and $p_i$ and $q_j$ are the particle masses (intensities).

Wang, ..., Rohde, Cytometry, 2010

◮ Then the set of all coupling matrices between $\mu_0$ and $\mu_1$ is defined as

$$\Pi(\mu_0, \mu_1) = \Big\{ F[i,j] = f_{i,j} \ \Big|\ f_{i,j} \ge 0,\ \sum_j f_{i,j} = p_i,\ \sum_i f_{i,j} = q_j \Big\}.$$

◮ The $L_2$ Kantorovich-Wasserstein metric for this case can be written as

$$d_{OT}^2(\mu_0, \mu_1) = \min_{f \in \Pi(\mu_0, \mu_1)} \sum_i \sum_j \|x_i - y_j\|_2^2 \, f_{i,j},$$

which is then solved by linear programming.
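
The particle-based problem above is a standard linear program, so it can be sketched with a general-purpose solver. The example below uses `scipy.optimize.linprog` with the HiGHS backend; the function name `ot_plan` and the two-particle toy data are assumptions for illustration, and the presentation does not say which solver was actually used.

```python
import numpy as np
from scipy.optimize import linprog

def ot_plan(x, p, y, q):
    """Solve the discrete L2 OT problem between particles (x, p) and (y, q).

    x: (N, d) positions, p: (N,) masses; y: (M, d) positions, q: (M,) masses.
    Returns the squared OT distance and the (N, M) coupling matrix.
    """
    n, m = len(p), len(q)
    # cost[i, j] = ||x_i - y_j||_2^2
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)
    # Equality constraints on the flattened plan (row-major, index i * m + j).
    row_sums = np.kron(np.eye(n), np.ones((1, m)))   # sum_j f_ij = p_i
    col_sums = np.kron(np.ones((1, n)), np.eye(m))   # sum_i f_ij = q_j
    result = linprog(cost.ravel(),
                     A_eq=np.vstack([row_sums, col_sums]),
                     b_eq=np.concatenate([p, q]),
                     bounds=(0, None), method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return result.fun, result.x.reshape(n, m)

# Toy example: two particles shifted by one unit along the second axis.
x = np.array([[0.0, 0.0], [1.0, 0.0]])
y = np.array([[0.0, 1.0], [1.0, 1.0]])
p = np.array([0.5, 0.5])
q = np.array([0.5, 0.5])
d2, plan = ot_plan(x, p, y, q)
```

The number of unknowns grows as N times M, which is one reason solving this program for every pair of images in a dataset becomes expensive.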

SLIDE 13

Example

SLIDE 16

LOT distance

◮ Due to the computational complexity of linear programming, calculating the pairwise OT distances across the dataset is not desirable.
◮ Alternatively, we choose a reference image,

$$\sigma(z) = \sum_k m_k\,\delta(z - z_k),$$

and measure the OT distance of all images from this reference image.

Wang et al., Cytometry, 2010

◮ Let

$$x_k = \frac{1}{m_k} \sum_i f_{k,i}\, x_i \quad \text{and} \quad y_k = \frac{1}{m_k} \sum_j g_{k,j}\, y_j$$

be the centroids of the forward image of the particle $m_k\,\delta(z - z_k)$ under the transportation plans $f$ (from $\sigma$ to $\mu_0$) and $g$ (from $\sigma$ to $\mu_1$), respectively. Here the subscript $k$ indexes the reference particles, while $i$ and $j$ index the particles of $\mu_0$ and $\mu_1$. The LOT distance between $\mu_0$ and $\mu_1$ is then defined as

$$d_{LOT,\sigma}^2(\mu_0, \mu_1) = \sum_k \|x_k - y_k\|_2^2 \, m_k.$$
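
The centroid and distance formulas translate almost directly into array operations. In the sketch below the function names and the toy data are hypothetical, and the two coupling matrices are written by hand to stand in for the transportation plans from the reference to each image; in practice they would come from solving the OT problem.

```python
import numpy as np

def centroids(plan, positions, ref_mass):
    """Centroid of where each reference particle is sent.

    plan: (K, N) coupling between the reference (rows) and an image (columns),
    positions: (N, d) particle locations of the image, ref_mass: (K,) masses m_k.
    """
    return plan @ positions / ref_mass[:, None]

def lot_distance_sq(plan_f, x, plan_g, y, ref_mass):
    """Squared LOT distance: sum_k m_k * ||x_k - y_k||^2 over reference particles."""
    xk = centroids(plan_f, x, ref_mass)
    yk = centroids(plan_g, y, ref_mass)
    return float((ref_mass * ((xk - yk) ** 2).sum(axis=1)).sum())

ref_mass = np.array([0.5, 0.5])                      # reference masses m_k
x = np.array([[0.0, 0.0], [2.0, 0.0]])               # particles of mu_0
y = np.array([[0.0, 1.0], [2.0, 1.0], [2.0, 3.0]])   # particles of mu_1
plan_f = np.array([[0.5, 0.0], [0.0, 0.5]])          # hand-written coupling, sigma to mu_0
plan_g = np.array([[0.5, 0.0, 0.0], [0.0, 0.25, 0.25]])  # hand-written coupling, sigma to mu_1

d2_lot = lot_distance_sq(plan_f, x, plan_g, y, ref_mass)
```

With one reference, a dataset of n images needs n transport problems instead of one per pair, after which every pairwise LOT distance is a cheap weighted sum.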

SLIDE 17

LOT distance

SLIDE 18

Isometric linear embedding

◮ The defined distance, $d_{LOT,\sigma}$, provides a method for mapping a sample measure (particle-approximated image) into a linear space.
◮ The squared $L_2$ distance in this linear space equals $d_{LOT,\sigma}^2$.
◮ For particle-approximated images $\mu_0(x)$ and $\mu_1(y)$, the linear embeddings, $\nu_0$ and $\nu_1$ respectively, are obtained by finding the discrete transportation maps between the reference image $\sigma(z) = \sum_k m_k\,\delta(z - z_k)$ and $\mu_0$ and $\mu_1$,

$$\begin{cases} \nu_0 = [\sqrt{m_1}\,x_1, \ldots, \sqrt{m_{N_\sigma}}\,x_{N_\sigma}]^T \\ \nu_1 = [\sqrt{m_1}\,y_1, \ldots, \sqrt{m_{N_\sigma}}\,y_{N_\sigma}]^T \end{cases}$$

where $x_k$ and $y_k$ are the centroids used in the LOT distance and $N_\sigma$ is the number of particles in $\sigma$.

◮ Hence, the squared $L_2$ distance between $\nu_0$ and $\nu_1$ is

$$\|\nu_0 - \nu_1\|_2^2 = \sum_k \|\sqrt{m_k}\,(x_k - y_k)\|_2^2 = \sum_k \|x_k - y_k\|_2^2 \, m_k = d_{LOT,\sigma}^2(\mu_0, \mu_1).$$
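
The isometry can be confirmed numerically: scaling each centroid by $\sqrt{m_k}$ and stacking the results gives vectors whose squared Euclidean distance matches the weighted sum defining $d_{LOT,\sigma}^2$. The function name `lot_embedding`, the masses, and the centroid values below are made up for illustration.

```python
import numpy as np

def lot_embedding(centroids, ref_mass):
    """Stack sqrt(m_k) * centroid_k into one vector of length K * d."""
    return (np.sqrt(ref_mass)[:, None] * centroids).ravel()

ref_mass = np.array([0.2, 0.3, 0.5])                  # hypothetical masses m_k
xk = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])   # centroids for mu_0
yk = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])   # centroids for mu_1

nu0 = lot_embedding(xk, ref_mass)
nu1 = lot_embedding(yk, ref_mass)

euclidean_sq = float(((nu0 - nu1) ** 2).sum())                    # ||nu0 - nu1||^2
lot_sq = float((ref_mass * ((xk - yk) ** 2).sum(axis=1)).sum())   # sum_k m_k ||x_k - y_k||^2
```

Because the embedding is an ordinary vector, standard linear tools such as PCA can be applied to it directly.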

SLIDE 19

Original histology image

With: John A. Ozolek, M.D., Univ. Pittsburgh

SLIDE 20

Extracting the luminance channel

SLIDE 21

Luminance channel

SLIDE 22

Inverting the luminance channel
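
The luminance extraction and inversion steps can be sketched in a few lines. The slides do not state which luminance formula was used; the Rec. 601 luma weights below are an assumption, as are the function name `inverted_luminance` and the convention that pixel values lie in [0, 1]. Inversion presumably makes the dark, stained structures carry the high intensities that later act as mass.

```python
import numpy as np

def inverted_luminance(rgb):
    """Convert an RGB image to luminance and invert it.

    rgb: (H, W, 3) float array with values in [0, 1].
    """
    weights = np.array([0.299, 0.587, 0.114])  # Rec. 601 luma weights (assumed)
    luminance = rgb @ weights                  # (H, W) luminance channel
    return 1.0 - luminance                     # dark pixels become bright

# Toy image: one white pixel and one black pixel.
rgb = np.array([[[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]])
inverted = inverted_luminance(rgb)
```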

SLIDE 23

Cell segmentation

SLIDE 24

Generating a translation and rotation invariant dataset
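
The slide does not say how translation and rotation are removed. One common approach, assumed here purely for illustration, is to move each cell's intensity-weighted centre of mass to the origin and rotate its principal axes onto the coordinate axes. The function name `normalize_pose` and the toy image are hypothetical.

```python
import numpy as np

def normalize_pose(image):
    """Return centred, axis-aligned pixel coordinates and normalized masses."""
    rows, cols = np.nonzero(image > 0)
    mass = image[rows, cols].astype(float)
    mass /= mass.sum()
    coords = np.column_stack([cols, rows]).astype(float)   # (x, y) per pixel
    centred = coords - mass @ coords                        # remove translation
    covariance = (centred * mass[:, None]).T @ centred      # intensity-weighted
    _, eigenvectors = np.linalg.eigh(covariance)            # ascending eigenvalues
    major_first = eigenvectors[:, ::-1]                     # largest variance first
    return centred @ major_first, mass                      # major axis is coordinate 0

# Toy image: a vertical bar of three pixels.
image = np.zeros((5, 7))
image[1:4, 3] = 1.0
aligned, mass = normalize_pose(image)
variance = (mass[:, None] * aligned ** 2).sum(axis=0)
```

This leaves a sign ambiguity along each axis (a flip), which would need an extra convention to resolve.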

SLIDE 25

Particle approximation

SLIDE 26

Reference image and its particle representation

◮ Finally, the linear embeddings of the particle-approximated images are calculated.
◮ The main phenotype variations in the dataset are found using standard principal component analysis (PCA).
◮ Statistically significant differences are computed by applying penalized linear discriminant analysis (pLDA) to the linear embedding of the dataset.
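
A minimal sketch of the PCA step follows, assuming the embeddings are stacked as rows of a matrix. It also shows how a point along a principal direction can be mapped back to particle locations by undoing the $\sqrt{m_k}$ scaling. The function names, the random stand-in data, and the choice of two spatial dimensions are assumptions; pLDA is not shown.

```python
import numpy as np

def pca(embeddings, n_components):
    """embeddings: (n_images, K * d) matrix with one LOT embedding per row."""
    mean = embeddings.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(embeddings - mean, full_matrices=False)
    std = singular_values / np.sqrt(len(embeddings) - 1)   # std along each direction
    return mean, vt[:n_components], std[:n_components]

def synthesize(mean, direction, std, t, ref_mass, dim=2):
    """Particle locations for the point mean + t * std * direction in LOT space."""
    nu = mean + t * std * direction
    return nu.reshape(len(ref_mass), dim) / np.sqrt(ref_mass)[:, None]

rng = np.random.default_rng(0)
ref_mass = np.full(4, 0.25)                 # hypothetical reference masses
embeddings = rng.normal(size=(10, 8))       # random stand-in for real embeddings

mean, components, std = pca(embeddings, n_components=2)
# Particle locations two standard deviations along the first principal direction.
locations = synthesize(mean, components[0], std[0], 2.0, ref_mass)
```

Sweeping `t` over a range such as -2 to 2 and rendering the resulting particles is one way to visualize a mode of variation.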

SLIDE 27

First dataset

◮ The first dataset contains microscopy images of liver tissue samples obtained from ten subjects: five cancer patients suffering from fetal-type hepatoblastoma (FHB) and five healthy individuals.

SLIDE 29

◮ The p-value (calculated from a t-test) is zero to within numerical precision.
◮ For this dataset, both feature-based and transportation-based morphometry lead to 100% patient-wise classification accuracy.
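
For reference, a two-sample t-test on one-dimensional projections can be run as sketched below. The data are synthetic and only illustrate the call; they are not the presentation's results, and the class means and sample sizes are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical 1-D projections of two classes onto a discriminant direction.
class_a = rng.normal(loc=0.0, scale=1.0, size=200)
class_b = rng.normal(loc=5.0, scale=1.0, size=200)

t_statistic, p_value = stats.ttest_ind(class_a, class_b)
```

When two classes are this well separated, the p-value becomes extremely small and can underflow to zero in floating point, which is what "zero to within numerical precision" refers to.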

SLIDE 30

Second dataset

◮ The second dataset contains microscopy images of thyroid tissue samples obtained from 85 subjects, including:

  ◮ 27 subjects diagnosed with follicular adenoma (FA)
  ◮ 20 subjects diagnosed with follicular carcinoma (FC)
  ◮ 28 subjects diagnosed with nodular goiter (NG)
  ◮ 10 subjects diagnosed with follicular variant papillary carcinoma (FVPC)

SLIDE 32

Third dataset

◮ The third dataset contains microscopy images of cytoplasm-to-nucleus translocation of the Forkhead (FKHR-EGFP) fusion protein in stably transfected human osteosarcoma cells (U2OS).

From image set BBBC013v1, provided by Ilya Ravkin, available from the Broad Bioimage Benchmark Collection.

SLIDE 34

From Thermo Fisher

SLIDE 35

Conclusion

◮ A new automatic method for describing variations in a set of images is reviewed. The method is based on a linearized OT framework and provides a linear embedding of morphological changes for a set of images.
◮ The linear embedding provided by LOT allows one to synthesize and visualize every point in this linear space.
◮ This unique capability of LOT, combined with linear modeling techniques such as PCA and pLDA, allows one to automatically visualize meaningful variations in appearance in an image set.

SLIDE 36

Acknowledgements

◮ Dr. Gustavo K. Rohde
◮ Dr. Saurav Basu
◮ Dr. Akif B. Tosun
◮ Dr. Wei Wang
◮ Ms. Serim Park
◮ Ms. Anupama Kuruvilla
◮ Mr. Mike McCann
◮ Ms. Shinjini Kundu

SLIDE 37

Thank you!
