A tutorial in spatial statistics for microscopy data analysis


slide-1
SLIDE 1

A tutorial in spatial statistics for microscopy data analysis

Ed Cohen

Department of Mathematics, Imperial College London

wwwf.imperial.ac.uk/~eakc07

QBI 2019

slide-2
SLIDE 2

Spatial Statistics

Spatial Statistics: Statistical theory and methodology for modelling and analysing spatial data.

Fluorescence microscopy is concerned with imaging objects. We are interested in understanding the spatial organisation of objects to inform our understanding of biological mechanisms and processes. Therefore we will restrict ourselves to spatial point patterns.

slide-3
SLIDE 3

Spatial Point Pattern

Data in the form of a set of points, irregularly distributed within a region of space, are called a spatial point pattern. Such patterns arise in many different contexts, e.g.

◮ Locations of trees in a forest
◮ Locations of ants' nests in a compact geographical region
◮ Locations of a particular protein of interest in a cellular environment.

slide-4
SLIDE 4

Spatial Point Pattern

Mathematically, we can represent a spatial point pattern as a set of locations $\Phi = \{s_1, s_2, \ldots\}$, where each event $s_i$ belongs to $X$, a locally compact subset of $\mathbb{R}^d$. For example, in fluorescence microscopy imaging, $X$ is typically a square region in $\mathbb{R}^2$, and each fluorophore/event has a true position $s_i = (x_i, y_i)$.

slide-5
SLIDE 5

Spatial point processes

Informally, a point process is a stochastic mechanism that generates a countable set of events, i.e. a spatial point pattern. It is the probabilistic framework that governs how many events there are and where they occur.

It is analogous to a probability distribution for a random variable.

slide-6
SLIDE 6

STOCHASTIC MECHANISM → REALIZATIONS

◮ ROLLING TWO DICE: BIVARIATE DISCRETE UNIFORM DISTRIBUTION ON {1,2,3,4,5,6} × {1,2,3,4,5,6}
◮ CLATHRIN-COATED PITS ON CELL MEMBRANE: POISSON PROCESS WITH INTENSITY λ

slide-7
SLIDE 7

Describing and characterizing spatial point processes

We typically represent a spatial point process by $N$, where $N(A)$ is a random variable giving the number of events within some set $A$. For example, three realizations of the same process might give counts of 8, 0 and 4 events in $A$.

It is the probability distribution of $N(A)$ for all (nice) sets $A$ that characterizes a spatial point process.
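As a concrete illustration of the counting variable N(A), the sketch below stores one observed pattern as a list of (x, y) tuples and counts the events falling in a rectangular set A. The function name `count_events`, the coordinates and the choice of a rectangle for A are assumptions made purely for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def count_events(pattern: List[Point], x_min: float, x_max: float,
                 y_min: float, y_max: float) -> int:
    """Return N(A) for the rectangle A = [x_min, x_max) x [y_min, y_max)."""
    return sum(1 for x, y in pattern
               if x_min <= x < x_max and y_min <= y < y_max)


# Hypothetical pattern observed in the unit square (illustrative coordinates).
pattern = [(0.1, 0.2), (0.4, 0.9), (0.55, 0.5), (0.8, 0.1), (0.95, 0.95)]

# Counts in two disjoint sets that together cover the window.
n_left = count_events(pattern, 0.0, 0.5, 0.0, 1.0)
n_right = count_events(pattern, 0.5, 1.0, 0.0, 1.0)
print(n_left, n_right)
```

A different realization of the same process would generally give different counts for the same A, which is why N(A) is treated as a random variable.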

slide-8
SLIDE 8

Characterizing spatial point processes

Intensity (localized rate of events):
$$\lambda(s) = \lim_{|ds| \downarrow 0} \frac{E\{N(ds)\}}{|ds|}.$$

The second-order intensity of a spatial point process $N$ at points $s, u \in X$ is
$$\gamma(s, u) = \lim_{|ds|, |du| \downarrow 0} \frac{E\{N(ds)N(du)\}}{|ds||du|}.$$

The second-order covariance of a spatial point process $N$ at points $s, u \in X$ is
$$c(s, u) = \gamma(s, u) - \lambda(s)\lambda(u),$$
by analogy with $\mathrm{cov}(X, Y) = E(XY) - E(X)E(Y)$ for random variables.

Pair correlation function:
$$g(s, u) = \frac{\gamma(s, u)}{\lambda(s)\lambda(u)}.$$

slide-9
SLIDE 9

Characterizing first and second order moments of spatial point processes

Homogeneity: $\lambda(s)$ is constant for all $s \in X$. Translates as: the chance of getting an event at any particular point in space is the same across $X$.

Stationarity and isotropy: $\gamma(s, u) = \gamma(\|s - u\|) = \gamma(r)$. Translates as: the covariance between any two points in space having an event or not depends only on the distance between them.

These assumptions are not as restrictive as they first seem. If the heterogeneity is itself random then these notions can still hold.
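Under homogeneity the constant intensity can be estimated by the number of events per unit area, and counts in equal-sized quadrats should all have the same expected value. The sketch below shows both calculations for a rectangular window with its corner at the origin; the function names, the window and the example coordinates are assumptions for illustration, not part of the tutorial.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def intensity_estimate(pattern: List[Point], width: float, height: float) -> float:
    """Estimate a constant intensity as events per unit area."""
    return len(pattern) / (width * height)


def quadrat_counts(pattern: List[Point], width: float, height: float,
                   nx: int, ny: int) -> List[List[int]]:
    """Count events in an ny-by-nx grid of equal rectangles covering the window."""
    counts = [[0] * nx for _ in range(ny)]
    for x, y in pattern:
        # Clamp so that points on the upper boundary fall in the last cell.
        col = min(int(x / width * nx), nx - 1)
        row = min(int(y / height * ny), ny - 1)
        counts[row][col] += 1
    return counts


# Hypothetical pattern in the unit square, one event per quadrant.
pattern = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
lam_hat = intensity_estimate(pattern, 1.0, 1.0)
counts = quadrat_counts(pattern, 1.0, 1.0, nx=2, ny=2)
print(lam_hat, counts)
```

Large, systematic differences between quadrat counts would suggest that a constant intensity is not a good description of the data.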

slide-10
SLIDE 10

Poisson process

A spatial point process $N$ is Poisson if the following hold:

◮ For every (nice) subset $A \subset X$, the number of events $N(A)$ is Poisson distributed with expected value $\mu(A) = \int_A \lambda(s)\,ds$.
◮ For any collection of disjoint sets $A_1, \ldots, A_n$, the random variables $N(A_1), \ldots, N(A_n)$ are independent of one another.

Poisson processes have a memoryless property: all events are independent of each other. Homogeneous Poisson processes are said to exhibit complete spatial randomness (CSR).
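The two defining properties suggest a direct way to simulate a homogeneous Poisson process on a rectangle: draw the total count from a Poisson distribution with mean λ|A|, then place that many events independently and uniformly in the window. The sketch below does this with the Python standard library only; the function names and parameter values are assumptions for illustration.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def sample_poisson(mean: float, rng: random.Random) -> int:
    """Draw a Poisson(mean) count by counting unit-rate exponential arrivals in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_csr(intensity: float, width: float, height: float,
                 rng: random.Random) -> List[Point]:
    """Simulate a homogeneous Poisson process on [0, width] x [0, height]."""
    n = sample_poisson(intensity * width * height, rng)
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height)) for _ in range(n)]


rng = random.Random(0)
pattern = simulate_csr(intensity=100.0, width=1.0, height=1.0, rng=rng)
print(len(pattern))  # varies from realization to realization around 100
```

Repeated calls give patterns with different numbers of events, with counts scattered around the expected value λ|A|.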

slide-11
SLIDE 11

Complete spatial randomness

slide-12
SLIDE 12

Complete spatial randomness

[Figure: a grid of 0/1 values, one value per cell.]

slide-13
SLIDE 13

CSR vs Clustering vs Inhibition/Regularity

[Figure: three example point patterns. Panels: CSR, Clustered, Inhibited.]

slide-15
SLIDE 15

Ripley’s K-function

Ripley’s K-function is used extensively across the sciences, including microscopy, to detect and characterize clustering behavior. It is a theoretical property of the point process:
$$K(r) \equiv \int_0^r 2\pi r' g(r')\,dr' = \lambda^{-1} E\{\text{number of further events within distance } r \text{ of an arbitrary event}\}.$$

Its widespread use lies in its interpretability and the ease with which it can be estimated, using robust, well-studied estimators, from a single point pattern. Many of the recent developments in spatial data analysis have this function at their heart.

slide-16
SLIDE 16

Worked example: Poisson process

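The second-order intensity of a homogeneous Poisson process $N$ at points $s, u \in X$, $s \neq u$, is
$$\gamma(s, u) = \lim_{|ds|, |du| \downarrow 0} \frac{E\{N(ds)N(du)\}}{|ds||du|} = \lim_{|ds|, |du| \downarrow 0} \frac{E\{N(ds)\}E\{N(du)\}}{|ds||du|} = \lambda(s)\lambda(u),$$
where the second equality uses the independence of counts in disjoint regions. Hence
$$g(s, u) = \frac{\gamma(s, u)}{\lambda(s)\lambda(u)} = 1,$$
$$K(r) = \pi r^2,$$
$$L(r) - r \equiv \sqrt{K(r)/\pi} - r = 0.$$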
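The step from g = 1 to K(r) = πr² can be checked numerically by integrating 2πr′g(r′) from 0 to r. The sketch below uses a simple midpoint rule; the function names and the number of integration steps are assumptions for illustration.

```python
import math
from typing import Callable


def k_from_pair_correlation(g: Callable[[float], float], r: float,
                            steps: int = 10_000) -> float:
    """Approximate K(r) = integral_0^r 2*pi*r'*g(r') dr' with the midpoint rule."""
    h = r / steps
    total = 0.0
    for i in range(steps):
        mid = (i + 0.5) * h
        total += 2.0 * math.pi * mid * g(mid)
    return total * h


def l_minus_r(k_value: float, r: float) -> float:
    """Return L(r) - r, where L(r) = sqrt(K(r) / pi)."""
    return math.sqrt(k_value / math.pi) - r


r = 0.3
k_csr = k_from_pair_correlation(lambda _: 1.0, r)  # g = 1 under CSR
print(k_csr, math.pi * r ** 2, l_minus_r(k_csr, r))
```

Replacing the constant function with a pair correlation function that exceeds 1 at short distances gives K(r) > πr², and hence L(r) − r > 0.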

slide-17
SLIDE 17

K-function for different types of process

[Figure: L(r) − r plotted against r for three types of process. Panels: CSR, Clustered, Inhibited.]

slide-18
SLIDE 18

Estimation

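REMEMBER: we are interested in knowing the properties of the PROCESS. We need to estimate them from the pattern, and typically we only get one pattern with which to estimate them.
$$\hat{K}(r) = \frac{A}{n(n-1)} \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} I(0 < d_{ij} < r)$$

Here $n$ is the number of events in the pattern, $d_{ij}$ is the distance between events $i$ and $j$, $A$ is the area of the observation window and $w_{ij}$ is an edge-correction weight.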
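A minimal sketch of this estimator is given below. It assumes all weights w_ij = 1, i.e. no edge correction, which biases the estimate downwards for events near the window boundary; a practical analysis would use a proper edge correction. The function names and the three-point example are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def ripley_k(pattern: List[Point], area: float,
             radii: Sequence[float]) -> List[float]:
    """Estimate K(r) at each radius, with w_ij = 1 (no edge correction)."""
    n = len(pattern)
    if n < 2:
        raise ValueError("need at least two events")
    # Distances over all ordered pairs (i, j) with i != j.
    dists = [math.hypot(p[0] - q[0], p[1] - q[1])
             for i, p in enumerate(pattern)
             for j, q in enumerate(pattern) if i != j]
    scale = area / (n * (n - 1))
    return [scale * sum(1 for d in dists if 0 < d < r) for r in radii]


def l_minus_r(k_values: Sequence[float], radii: Sequence[float]) -> List[float]:
    """Transform K estimates to L(r) - r, which is 0 for all r under CSR."""
    return [math.sqrt(k / math.pi) - r for k, r in zip(k_values, radii)]


# Hypothetical pattern in the unit square with pairwise distances 0.3, 0.4, 0.5.
pattern = [(0.0, 0.0), (0.3, 0.0), (0.0, 0.4)]
radii = [0.35, 0.45, 0.6]
k_hat = ripley_k(pattern, area=1.0, radii=radii)
print(k_hat, l_minus_r(k_hat, radii))
```

The double sum makes the cost quadratic in the number of events, which is acceptable for a sketch but may be slow for large localization data sets.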

slide-22
SLIDE 22

Testing

Inference is typically performed through hypothesis testing:

$H_0$: the process is CSR vs $H_A$: the process is not CSR.

For this we need a test statistic and its distribution under the null (CSR):
$$T = \max_r \{|\hat{L}(r) - r|\}.$$

Lagache et al, Analysis of the Spatial Organization of Molecules with Robust Statistics, PLOS One, 2013.
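One generic way to approximate the null distribution of T is Monte Carlo simulation: simulate many CSR patterns with the same number of events in the same window, compute T for each, and compare with the observed value. The sketch below illustrates that idea only; it is not claimed to be the procedure of the cited paper. It assumes a rectangular window, no edge correction (w_ij = 1), and hypothetical function names and parameter values.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def max_abs_l_minus_r(pattern: List[Point], area: float,
                      radii: Sequence[float]) -> float:
    """T = max_r |L_hat(r) - r|, with K_hat computed using w_ij = 1."""
    n = len(pattern)  # assumes n >= 2
    dists = [math.hypot(p[0] - q[0], p[1] - q[1])
             for i, p in enumerate(pattern)
             for j, q in enumerate(pattern) if i != j]
    scale = area / (n * (n - 1))
    stat = 0.0
    for r in radii:
        k_hat = scale * sum(1 for d in dists if 0 < d < r)
        stat = max(stat, abs(math.sqrt(k_hat / math.pi) - r))
    return stat


def csr_monte_carlo_test(pattern: List[Point], width: float, height: float,
                         radii: Sequence[float], n_sim: int = 99,
                         seed: int = 0) -> Tuple[float, float]:
    """Return (T_obs, Monte Carlo p-value) for H0: the process is CSR."""
    rng = random.Random(seed)
    area = width * height
    t_obs = max_abs_l_minus_r(pattern, area, radii)
    exceed = 0
    for _ in range(n_sim):
        # Null pattern: same number of events, placed uniformly in the window.
        sim = [(rng.uniform(0.0, width), rng.uniform(0.0, height))
               for _ in pattern]
        if max_abs_l_minus_r(sim, area, radii) >= t_obs:
            exceed += 1
    return t_obs, (exceed + 1) / (n_sim + 1)


# Hypothetical, strongly clustered pattern: 40 events in a small square.
rng = random.Random(1)
clustered = [(0.5 + rng.uniform(0.0, 0.05), 0.5 + rng.uniform(0.0, 0.05))
             for _ in range(40)]
radii = [0.02 * k for k in range(1, 11)]
t_obs, p_value = csr_monte_carlo_test(clustered, 1.0, 1.0, radii)
print(t_obs, p_value)
```

Because the simulated patterns suffer the same edge effects as the observed one, the comparison remains fair even without edge correction, although the power of the test may be affected.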

slide-23
SLIDE 23

Testing

[Figure: three panels plotting L(r) − r against r.]

slide-24
SLIDE 24

Clustering

Clustering: the act of identifying and characterising clusters:

◮ size
◮ shape
◮ number of events in a cluster.

Caution should be taken when trying to extract these properties from the K-function.

Rubin-Delanchy et al, Bayesian cluster identification in single-molecule localization microscopy data, Nature Methods, 2015.

Griffié et al, 3D Bayesian cluster analysis of super-resolution data reveals LAT recruitment to the T cell synapse.

Staszowska et al, The Rényi divergence enables accurate and precise cluster analysis for localization microscopy.

slide-25
SLIDE 25

Colocalization

Inference is typically performed through hypothesis testing:

$H_0$: the two processes are independent vs $H_A$: the two processes are not independent.

[Figure: two example two-colour point patterns. Panels: INDEPENDENT, COLOCALIZED.]

slide-26
SLIDE 26

Colocalization

The test statistic is based on an estimator of the cross K-function
$$K_{12}(r) = \lambda_2^{-1} E\{\text{number of events of type 2 within distance } r \text{ of an arbitrary event of type 1}\}.$$

For two independent processes, $K_{12}(r) = \pi r^2$. However, it is notoriously troublesome to get the distribution of $\hat{K}_{12}(r)$ under the null.

Lagache et al, Mapping molecular assemblies with fluorescence microscopy and object-based spatial statistics, Nature Communications, 2018.
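The definition of the cross K-function translates into a simple plug-in estimate: average the number of type 2 events within distance r of each type 1 event, then divide by the estimated type 2 intensity n2/A. The sketch below assumes no edge correction and uses hypothetical function names and coordinates; it does not address the harder problem of the null distribution.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cross_k(type1: List[Point], type2: List[Point], area: float,
            radii: Sequence[float]) -> List[float]:
    """Estimate K_12(r) as A / (n1 * n2) * #{(i, j): d_ij < r}, no edge correction."""
    n1, n2 = len(type1), len(type2)
    if n1 == 0 or n2 == 0:
        raise ValueError("both patterns must contain at least one event")
    # Distances from every type 1 event to every type 2 event.
    dists = [math.hypot(p[0] - q[0], p[1] - q[1]) for p in type1 for q in type2]
    scale = area / (n1 * n2)
    return [scale * sum(1 for d in dists if d < r) for r in radii]


# Hypothetical two-colour pattern in the unit square.
type1 = [(0.5, 0.5)]
type2 = [(0.5, 0.6), (0.9, 0.9)]
radii = [0.05, 0.2, 0.6]
k12_hat = cross_k(type1, type2, area=1.0, radii=radii)
print(k12_hat)
```

Values of the estimate well above πr² at small r point towards colocalization, while values close to πr² are consistent with independence.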

slide-27
SLIDE 27

Parametric models

◮ Poisson cluster process: Matérn, Neyman-Scott process, Thomas process
◮ Markov point process: Strauss process
◮ Cox process: log-Gaussian Cox, fibre-driven Cox
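To make one of these models concrete, the sketch below simulates a Thomas process using its standard construction: parent locations form a homogeneous Poisson process, each parent has a Poisson number of offspring, and each offspring is displaced from its parent by an isotropic Gaussian. Only the offspring are kept. The function names and parameter values are assumptions for illustration. For simplicity, parents are generated only inside the window and offspring are not clipped to it, so the pattern is thinner near the boundary than a true Thomas process would be.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def sample_poisson(mean: float, rng: random.Random) -> int:
    """Draw a Poisson(mean) count by counting unit-rate exponential arrivals in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_thomas(parent_intensity: float, mean_offspring: float, sigma: float,
                    width: float, height: float, rng: random.Random) -> List[Point]:
    """Simulate offspring of a Thomas process with parents in [0, width] x [0, height]."""
    points: List[Point] = []
    n_parents = sample_poisson(parent_intensity * width * height, rng)
    for _ in range(n_parents):
        px, py = rng.uniform(0.0, width), rng.uniform(0.0, height)
        for _ in range(sample_poisson(mean_offspring, rng)):
            # Gaussian displacement of each offspring around its parent.
            points.append((rng.gauss(px, sigma), rng.gauss(py, sigma)))
    return points


rng = random.Random(0)
pattern = simulate_thomas(parent_intensity=10.0, mean_offspring=5.0, sigma=0.02,
                          width=1.0, height=1.0, rng=rng)
print(len(pattern))  # expected count is parent_intensity * mean_offspring * area
```

Patterns generated this way are clustered, so their estimated L(r) − r would typically be positive at distances comparable to sigma.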

slide-28
SLIDE 28

Resources

Books

◮ P. Diggle. Statistical Analysis of Spatial and Spatio-Temporal Point Patterns.
◮ J. Illian et al. Statistical Analysis and Modelling of Spatial Point Patterns.
◮ N. Cressie. Statistics for Spatial Data.
◮ Chiu and Stoyan. Stochastic Geometry and its Applications.

Software: spatstat in R.

slide-29
SLIDE 29

Extracting point patterns

Cohen et al, Resolution limit of image analysis algorithms, Nature Communications, Accepted.

[Diagram with labels: OBJECT EVENTS, IMAGING, RAW IMAGE, ALGORITHMS, IMAGE EVENTS.]
