Detecting and visualizing cell phenotype differences from microscopy images using transport-based morphometry
Soheil Kolouri
Ph.D. qualifying exam presentation
Biomedical Engineering Department, Carnegie Mellon University
Outline: Introduction, Optimal Mass Transport, Linearized Optimal Transport, Results
S. Kolouri, Optimal Transport

Introduction
◮ Thanks to recent advances in high-resolution microscopy, imaging experiments have become the main source of data that biologists use to test and validate their hypotheses.

Input
[figure]
Relevant questions
1. Are they different?
2. If so, how are they different?
Applications
◮ Benign vs. malignant cells (pathology)
◮ Understanding the role of different proteins
◮ Discovering drugs
◮ Understanding the effects of genes (RNAi)
◮ Understanding cell signaling mechanisms
◮ ...

State of the art
◮ Pre-determined numerical features are conventionally used for quantifying cells extracted from histology images and for classifying different populations (e.g. normal vs. diseased).
Common features
◮ Area
◮ Perimeter
◮ Axis lengths
◮ Haralick
◮ Gabor
◮ Wavelet
◮ Gray level statistics
◮ ...

Drawbacks
1. The feature space is often much smaller than the image space, so important information may be lost in this data reduction step.
2. Inferring biological information from the features is not straightforward.
Why don’t we use the image space itself?
While in feature space: [figure]

Alternative approach
◮ A good alternative to feature-based morphometry is transportation-based morphometry (Wang et al., 2010).
◮ In transportation-based morphometry, the target image is warped to the source image, and the distance between the images is defined as a function of the transformation.
◮ A distance defined in this manner uses all of the information embedded in the image and provides a meaningful similarity measure.
[figure labels: $d_{OT}(x, y) = 16.7$ and $d_{OT}(x, y) = 25.8$]

Goal
◮ Our goal is to devise a method for detecting and visualizing differences between subpopulations (classes) in an image dataset.
◮ We use an optimal-transport-based approach to accomplish this goal.

Optimal Mass Transport: Mass preserving mappings, Kantorovich-Wasserstein metric, Particle-based Kantorovich-Wasserstein metric

Mass Preserving Mapping
Assume that $\Omega_0$ and $\Omega_1$ are two domains in $\mathbb{R}^d$, each with a positive density, $\mu_0$ and $\mu_1$ respectively, and the same total mass,
$$\int_{\Omega_0} \mu_0(\vec{x}) \, d\vec{x} = \int_{\Omega_1} \mu_1(\vec{x}) \, d\vec{x}.$$
A mapping $f : \Omega_0 \to \Omega_1$ is called mass preserving (MP) if it satisfies
$$|Df(\vec{x})| \, \mu_1(f(\vec{x})) = \mu_0(\vec{x}),$$
where $|Df(\vec{x})|$ is the determinant of the Jacobian matrix of $f$, which gives the factor by which $f$ expands or shrinks volume at $\vec{x}$.
◮ There exist infinitely many such mappings.
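
As a quick check of the definition, here is a small one-dimensional worked example. The densities and maps are illustrative assumptions, not taken from the slides: take $\Omega_0 = \Omega_1 = [0, 1]$, $\mu_0(x) = 1$ and $\mu_1(y) = 1/(2\sqrt{y})$.

```latex
\begin{gather*}
\int_0^1 \mu_0(x)\,dx = 1 = \int_0^1 \frac{dy}{2\sqrt{y}} = \int_0^1 \mu_1(y)\,dy, \\
f(x) = x^2: \quad |Df(x)|\,\mu_1(f(x)) = 2x \cdot \frac{1}{2x} = 1 = \mu_0(x), \quad x \in (0, 1], \\
g(x) = (1-x)^2: \quad |Dg(x)|\,\mu_1(g(x)) = 2(1-x) \cdot \frac{1}{2(1-x)} = 1 = \mu_0(x), \quad x \in [0, 1).
\end{gather*}
```

Both $f$ and $g$ satisfy the mass preserving condition for the same pair of densities, so the constraint alone does not single out one map.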

Kantorovich-Wasserstein metric
◮ We use the Kantorovich-Wasserstein (KW) formulation of the optimal transport problem.
$L_p$ Kantorovich-Wasserstein metric
The $L_p$ Kantorovich-Wasserstein metric for two densities $\mu_0$ on $\Omega_0$ and $\mu_1$ on $\Omega_1$ (normalized images) is defined as
$$d_p(\mu_0, \mu_1)^p = \min_{f \in MP} \int_{\Omega_0} \| f(\vec{x}) - \vec{x} \|_p^p \, \mu_0(\vec{x}) \, d\vec{x},$$
in which $MP = \{ f \mid |Df(\vec{x})| \, \mu_1(f(\vec{x})) = \mu_0(\vec{x}), \ \forall \vec{x} \in \Omega_0 \}$. $p = 2$ is a common choice, which leads to a unique solution $f$.
◮ Solving the above optimization problem to obtain the distance, although possible, is not an easy task. Hence, we use the particle-based formulation of the KW distance.

Particle-based Kantorovich-Wasserstein
◮ We assume that our images are of the following form (Wang, ..., Rohde, Cytometry, 2010),
$$\mu_0(\vec{x}) = \sum_i p_i \, \delta(\vec{x} - \vec{x}_i), \qquad \mu_1(\vec{y}) = \sum_j q_j \, \delta(\vec{y} - \vec{y}_j),$$
where $\delta$ is the Dirac delta function, and $p_i$ and $q_j$ are the particle masses (intensities).
◮ The set of all coupling matrices between $\mu_0$ and $\mu_1$ is then defined as
$$\Pi(\mu_0, \mu_1) = \Big\{ F[i, j] = f_{i,j} \ \Big|\ f_{i,j} \ge 0, \ \sum_j f_{i,j} = p_i, \ \sum_i f_{i,j} = q_j \Big\}.$$
◮ The $L_2$ Kantorovich-Wasserstein metric for this case can be written as
$$d^2_{OT}(\mu_0, \mu_1) = \min_{f \in \Pi(\mu_0, \mu_1)} \sum_i \sum_j \| \vec{x}_i - \vec{y}_j \|_2^2 \, f_{i,j},$$
which is then solved by linear programming.
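
The particle-based problem above can be sketched as a small linear program. The slides do not name a solver, so the use of `scipy.optimize.linprog` with the HiGHS backend, the function name `discrete_ot`, and the toy particle sets are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def discrete_ot(x, p, y, q):
    """Solve the particle-based L2 transport problem as a linear program.

    x: (n, d) particle positions of mu_0, p: (n,) masses.
    y: (m, d) particle positions of mu_1, q: (m,) masses.
    Returns the squared distance and the (n, m) coupling matrix.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    p, q = np.asarray(p, float), np.asarray(q, float)
    n, m = len(p), len(q)

    # cost[i, j] = ||x_i - y_j||_2^2
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)

    # The plan is flattened row-major, so f[i, j] sits at index i * m + j.
    row_sums = np.kron(np.eye(n), np.ones((1, m)))  # sum_j f[i, j] = p[i]
    col_sums = np.kron(np.ones((1, n)), np.eye(m))  # sum_i f[i, j] = q[j]
    a_eq = np.vstack([row_sums, col_sums])
    b_eq = np.concatenate([p, q])

    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun, res.x.reshape(n, m)


# Toy example (hypothetical data): two particles shifted up by one unit.
x = [[0.0, 0.0], [1.0, 0.0]]
p = [0.5, 0.5]
y = [[0.0, 1.0], [1.0, 1.0]]
q = [0.5, 0.5]
d2, plan = discrete_ot(x, p, y, q)
print(d2)
print(plan)
```

In this toy case each particle moves straight up by one unit, so the squared distance comes out at approximately 1.0. The number of unknowns is the product of the two particle counts, which is why the cost grows quickly with image size.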

Example

Linearized Optimal Transport: LOT distance, Isometric linear embedding, Pipeline

LOT distance
◮ Because of the computational cost of linear programming, computing all pairwise OT distances in a dataset is not desirable.
◮ Alternatively, we choose a reference image, $\sigma(\vec{z}) = \sum_k m_k \, \delta(\vec{z} - \vec{z}_k)$, and measure the OT distance of every image from this reference image (Wang et al., Cytometry, 2010).
◮ Let $\bar{x}_k = \frac{1}{m_k} \sum_i f_{k,i} \, \vec{x}_i$ and $\bar{y}_k = \frac{1}{m_k} \sum_j g_{k,j} \, \vec{y}_j$ be the centroids of the forward images of the particle $m_k \, \delta(\vec{z} - \vec{z}_k)$ under the transportation plans $f$ (from $\sigma$ to $\mu_0$) and $g$ (from $\sigma$ to $\mu_1$), respectively. The LOT distance between $\mu_0$ and $\mu_1$ is then defined as
$$d^2_{LOT,\sigma}(\mu_0, \mu_1) = \sum_k \| \bar{x}_k - \bar{y}_k \|_2^2 \, m_k.$$
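
A minimal sketch of the centroid and LOT distance formulas, assuming the two transportation plans from the reference have already been computed. The function names and the hand-built plans below are hypothetical and only serve to make the arithmetic concrete.

```python
import numpy as np


def centroids(plan, points, m):
    """Centroid of the forward image of each reference particle.

    plan[k, i] is the mass sent from reference particle k to image
    particle i, points[i] is the position of image particle i, and
    m[k] is the mass of reference particle k.
    """
    plan = np.asarray(plan, float)
    points = np.asarray(points, float)
    m = np.asarray(m, float)
    return (plan @ points) / m[:, None]


def lot_distance_sq(f, x, g, y, m):
    """Squared LOT distance: sum_k m_k * ||xbar_k - ybar_k||^2."""
    m = np.asarray(m, float)
    x_bar = centroids(f, x, m)
    y_bar = centroids(g, y, m)
    return float(np.sum(m * np.sum((x_bar - y_bar) ** 2, axis=1)))


# Hypothetical reference with two particles of mass 0.5 each.
m = [0.5, 0.5]
# mu_0 has two particles; each reference particle maps to one of them.
x = [[0.0, 0.0], [2.0, 0.0]]
f = [[0.5, 0.0], [0.0, 0.5]]
# mu_1 has three particles; the second reference particle splits its mass.
y = [[0.0, 1.0], [2.0, 1.0], [2.0, 3.0]]
g = [[0.5, 0.0, 0.0], [0.0, 0.25, 0.25]]

lot_sq = lot_distance_sq(f, x, g, y, m)
print(lot_sq)
```

Here the centroids for $\mu_1$ are $(0, 1)$ and $(2, 2)$, so the result is $0.5 \cdot 1 + 0.5 \cdot 4 = 2.5$. Once every image has a plan to the reference, comparing any two images needs no further linear program.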

LOT distance

Isometric linear embedding
◮ The defined distance, $d_{LOT,\sigma}$, provides a method for mapping a sample measure (particle-approximated image) into a linear space.
◮ The squared $L_2$ distance in this linear space equals $d^2_{LOT,\sigma}$.
◮ For particle-approximated images $\mu_0(\vec{x})$ and $\mu_1(\vec{y})$, the linear embeddings, $\nu_0$ and $\nu_1$ respectively, are obtained by finding the discrete transportation maps between the reference image $\sigma(\vec{z}) = \sum_k m_k \, \delta(\vec{z} - \vec{z}_k)$ and $\mu_0$ and $\mu_1$,
$$\nu_0 = [\sqrt{m_1} \, \bar{x}_1, \ldots, \sqrt{m_{N_\sigma}} \, \bar{x}_{N_\sigma}]^T, \qquad \nu_1 = [\sqrt{m_1} \, \bar{y}_1, \ldots, \sqrt{m_{N_\sigma}} \, \bar{y}_{N_\sigma}]^T,$$
where $N_\sigma$ is the number of particles in $\sigma$.
◮ Hence, the squared $L_2$ distance between $\nu_0$ and $\nu_1$ is
$$\| \nu_0 - \nu_1 \|_2^2 = \sum_k \| \sqrt{m_k} \, (\bar{x}_k - \bar{y}_k) \|_2^2 = \sum_k \| \bar{x}_k - \bar{y}_k \|_2^2 \, m_k = d^2_{LOT,\sigma}(\mu_0, \mu_1).$$
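
The isometry can be checked numerically with a short sketch. It assumes the centroids for each image are already available; the function name `lot_embedding` and the centroid values are hypothetical.

```python
import numpy as np


def lot_embedding(centroid_list, m):
    """Stack sqrt(m_k) * centroid_k into one flat embedding vector."""
    c = np.asarray(centroid_list, float)
    m = np.asarray(m, float)
    return (np.sqrt(m)[:, None] * c).ravel()


# Hypothetical reference masses and centroids for two images.
m = np.array([0.5, 0.5])
x_bar = np.array([[0.0, 0.0], [2.0, 0.0]])
y_bar = np.array([[0.0, 1.0], [2.0, 2.0]])

nu0 = lot_embedding(x_bar, m)
nu1 = lot_embedding(y_bar, m)

# Squared Euclidean distance between the embedded vectors.
embedded_sq = float(np.sum((nu0 - nu1) ** 2))
# Mass-weighted sum of squared centroid differences.
weighted_sq = float(np.sum(m * np.sum((x_bar - y_bar) ** 2, axis=1)))
print(embedded_sq, weighted_sq)
```

The two printed values agree up to floating-point rounding. Because each image becomes an ordinary vector of fixed length, standard linear tools can be applied to the embedded dataset directly.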

Original histology image
With: John A. Ozolek, M.D., Univ. Pittsburgh

Extracting the luminance channel

Luminance channel
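
The slides do not say how the luminance channel is computed. As one possible sketch, the snippet below assumes an RGB image and the Rec. 601 luma weights, then rescales the result to unit total mass so that it can serve as a normalized image in the sense used earlier. Both function names are hypothetical.

```python
import numpy as np


def luminance(rgb):
    """Weighted sum of the R, G, B channels of an (H, W, 3) image."""
    rgb = np.asarray(rgb, dtype=float)
    # Rec. 601 luma weights; an assumption, the slides do not specify them.
    weights = np.array([0.299, 0.587, 0.114])
    return rgb @ weights


def normalize_mass(image):
    """Rescale a non-negative image so that its intensities sum to one."""
    image = np.asarray(image, dtype=float)
    total = image.sum()
    if total <= 0:
        raise ValueError("image must have positive total intensity")
    return image / total


# Tiny synthetic 1 x 2 RGB image: one white pixel and one pure red pixel.
rgb = np.array([[[1.0, 1.0, 1.0], [1.0, 0.0, 0.0]]])
lum = luminance(rgb)
density = normalize_mass(lum)
print(lum)
print(density.sum())
```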
