What is a point pattern? For a specified, bounded region D , a set of - - PowerPoint PPT Presentation



slide-5
SLIDE 5

What is a point pattern?

For a specified, bounded region $D$, a set of locations $s_i$, $i = 1, 2, \ldots, n$.

- The locations are viewed as "random".
- Need not have variables at locations, just the pattern of points.
- Crude features of patterns, e.g., complete randomness, clustering/attraction, inhibition/repulsion, regular/systematic.
- Can add "marks", i.e., labels. Then there is a point pattern for each mark, and the patterns can be compared.

Spatial Point Patterns – p. 1/24

slide-6
SLIDE 6

spatial homogeneity

[Figure: six panels of point patterns; axes u and v, each ticked at 0.0, 0.4, 0.8.]

Spatial Point Patterns – p. 2/24

slide-7
SLIDE 7

cluster pattern; systematic pattern

[Figure: six panels of point patterns, three titled "Clustered" and three titled "Regular"; axes u and v, each ticked at 0.0, 0.4, 0.8.]

Spatial Point Patterns – p. 3/24

slide-8
SLIDE 8

spatial heterogeneity

[Figure: an intensity surface lambda plotted over (u, v), and a second panel on axes u and v, each ticked at 5, 10, 15, 20.]

Spatial Point Patterns – p. 4/24

slide-9
SLIDE 9

spatial heterogeneity

[Figure: six panels of point patterns; axes u and v, each ticked at 5, 10, 15, 20.]

Spatial Point Patterns – p. 5/24

slide-14
SLIDE 14

Examples

- Pattern of trees in a forest, say junipers and pinions.
- Pattern of disease cases, perhaps cases and controls.
- Breast cancer cases; treatment option: mastectomy or radiation.
- Perhaps over time, single family homes; urban development.
- Again over time, bovine tuberculosis.

Spatial Point Patterns – p. 6/24

slide-15
SLIDE 15

Distributions

- $N(B)$ is the number of points in the set $B$.
- $N(B) \sim\ ?$ Letting the counts be driven by an intensity surface $\lambda(s)$ yields a Poisson process:

$$N(B) \sim \mathrm{Po}(\lambda(B)), \qquad \lambda(B) = \int_B \lambda(s)\,ds.$$

- $\lambda(s) = \lambda$: homogeneous Poisson process, spatial homogeneity, complete spatial randomness (csr), with $\lambda(B) = \lambda|B|$.
- $\lambda(s)$ nonconstant, fixed: nonhomogeneous Poisson process.
- $\lambda(s)$ random: Cox process.
- More to follow.

Spatial Point Patterns – p. 7/24

slide-16
SLIDE 16

Counting Measure

- Again, for any set $B \subset D$, let $N(B)$ count the number of points in $B$.
- Given $N(B)$ for all $B$, we call $N$ a counting measure.
- A counting measure is equivalent to a point pattern.
- If the point pattern is random, then the $N(B)$'s are random.
- Model for the uncountable collection of sets?

Spatial Point Patterns – p. 8/24

slide-19
SLIDE 19

Exploratory data analysis

All directed at checking/criticizing the assumption of complete spatial randomness.

- Crude: partition $D$ into cells of equal area (the cells need not exhaust $D$). Compute $\bar{N}$, the mean of the cell counts, and $S_N^2$, the sample variance of the counts. Look at $S_N^2/\bar{N}$.
- Can extend to a standard $\chi^2$ test, treating the cell counts as i.i.d. Poisson random variables under csr.
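
The cell-count check can be sketched in a few lines of Python. This is a minimal illustration, not part of the slides: it assumes $D$ is the unit square, uses a hypothetical 5-by-5 grid, and draws a fixed number of uniform points as a stand-in for a csr pattern. Since a Poisson count has equal mean and variance, a ratio $S_N^2/\bar{N}$ near 1 is what csr predicts; $(m-1)S_N^2/\bar{N}$ is the usual index-of-dispersion statistic, referred to a $\chi^2$ distribution on $m-1$ degrees of freedom for $m$ cells.

```python
import random

def quadrat_counts(points, k):
    """Count points in a k-by-k grid of equal cells on the unit square."""
    counts = [0] * (k * k)
    for u, v in points:
        col = min(int(u * k), k - 1)  # clamp u == 1.0 into the last column
        row = min(int(v * k), k - 1)
        counts[row * k + col] += 1
    return counts

def dispersion_index(counts):
    """Return (mean, sample variance, variance/mean) of the cell counts."""
    m = len(counts)
    mean = sum(counts) / m
    var = sum((c - mean) ** 2 for c in counts) / (m - 1)
    return mean, var, var / mean

rng = random.Random(1)
# Assumption: 400 uniform points stand in for a csr realization.
points = [(rng.random(), rng.random()) for _ in range(400)]
counts = quadrat_counts(points, 5)
mean, var, ratio = dispersion_index(counts)
chi2 = (len(counts) - 1) * ratio  # compare with chi-square on m - 1 df
print(f"mean={mean:.2f} var={var:.2f} ratio={ratio:.2f} chi2={chi2:.2f}")
```

A ratio well above 1 points toward clustering and a ratio well below 1 toward regularity, although the answer can depend on the grid chosen.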

Spatial Point Patterns – p. 9/24

slide-20
SLIDE 20

F and G functions

Distance-based methods:

- $G(d)$ is the c.d.f. of the nearest neighbor distance, event to event, i.e., $G(d) = \Pr(\text{nearest event} \le d)$, measured from an event.
- $F(d)$ is the c.d.f. of the nearest neighbor distance, point to event, i.e., $F(d) = \Pr(\text{nearest event} \le d)$, measured from an arbitrary point.
- Under csr, $G(d) = F(d) = 1 - \exp(-\lambda \pi d^2)$.
- $\hat{G}$ is the empirical c.d.f. of the $n$ nearest neighbor distances (nearest neighbor distance for $s_1$, for $s_2$, etc.).
- $\hat{F}$ is the empirical c.d.f. arising from the $m$ nearest neighbor distances associated with a randomly selected set of $m$ points in $D$. $m$ is arbitrary, and under csr $\hat{G}$ and $\hat{F}$ estimate the same function.
- Edge correction is needed if $d > b_i$, where $b_i$ is the distance from $s_i$ to the edge of $D$.
- Compare $\hat{G}$ with $G$, and $\hat{F}$ with $F$: theoretical Q-Q plot.
- Shorter tails: clustering/attraction; longer tails: inhibition/repulsion.
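
As a rough sketch of the comparison of $\hat{G}$ with $G$, the following Python computes event-to-event nearest neighbor distances and sets their empirical c.d.f. beside $1 - \exp(-\hat{\lambda}\pi d^2)$. The assumptions are mine, not the slides': $D$ is the unit square, the pattern is 200 uniform points, $\lambda$ is estimated by $n/|D|$, and no edge correction is applied.

```python
import math
import random

def nearest_neighbor_distances(points):
    """Distance from each event to its nearest other event (O(n^2))."""
    dists = []
    for i, (ui, vi) in enumerate(points):
        best = min(
            math.hypot(ui - uj, vi - vj)
            for j, (uj, vj) in enumerate(points)
            if j != i
        )
        dists.append(best)
    return dists

def empirical_cdf(values, d):
    """Proportion of values that are <= d."""
    return sum(1 for x in values if x <= d) / len(values)

def g_csr(d, lam):
    """Theoretical G (and F) under csr: 1 - exp(-lam * pi * d^2)."""
    return 1.0 - math.exp(-lam * math.pi * d * d)

rng = random.Random(2)
n = 200
points = [(rng.random(), rng.random()) for _ in range(n)]
nn = nearest_neighbor_distances(points)
lam_hat = n / 1.0  # |D| = 1 for the unit square
for d in (0.02, 0.04, 0.06, 0.08):
    print(f"d={d:.2f} G_hat={empirical_cdf(nn, d):.3f} G_csr={g_csr(d, lam_hat):.3f}")
```

An estimate of $F$ would use the same two helper functions, with distances measured from $m$ randomly placed points to their nearest events instead of from the events themselves.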

Spatial Point Patterns – p. 10/24

slide-21
SLIDE 21

The K function

- The $K$ function considers the number of points within distance $d$ of an arbitrary point.
- Under complete spatial randomness, the expected number is the same for any point.
- $K$ is easy to interpret, easy to estimate.
- Formally, $K(d) = \lambda^{-1} E(\text{number of points within } d \text{ of an arbitrary point})$.

Spatial Point Patterns – p. 11/24

slide-22
SLIDE 22

cont.

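- Under complete spatial randomness, $K(d) = \lambda \pi d^2/\lambda = \pi d^2$ (the area of a circle of radius $d$).
- The estimate is

$$\hat{K}(d) = (n\hat{\lambda})^{-1} \sum_i r_i, \qquad r_i = \sum_{j \ne i} (w_{ij})^{-1} I(\|s_i - s_j\| \le d).$$

- $r_i$ is the (edge-corrected) number of $s_j$ within $d$ of $s_i$, and $\hat{\lambda} = n/|D|$.
- $w_{ij}$ is an edge correction: the proportion of the circumference of the circle centered at $s_i$ with radius $\|s_i - s_j\|$ that lies within $D$.
- Compare $\hat{K}(d)$ with $K(d) = \pi d^2$.
- Regularity/inhibition implies $K(d) < \pi d^2$; clustering implies $K(d) > \pi d^2$.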
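
A minimal sketch of $\hat{K}(d)$ in Python follows. It deliberately simplifies the estimator above by setting every edge weight $w_{ij}$ to 1, so it is not the edge-corrected version and will tend to run low for larger $d$. The unit-square region and the 300 uniform points are assumptions made for illustration.

```python
import math
import random

def k_hat(points, d, area):
    """Naive estimate of K(d) with every edge weight w_ij set to 1."""
    n = len(points)
    lam_hat = n / area
    total = 0
    for i, (ui, vi) in enumerate(points):
        for j, (uj, vj) in enumerate(points):
            if i != j and math.hypot(ui - uj, vi - vj) <= d:
                total += 1  # one more s_j counted in r_i
    return total / (n * lam_hat)

rng = random.Random(3)
points = [(rng.random(), rng.random()) for _ in range(300)]
for d in (0.02, 0.05, 0.10):
    print(f"d={d:.2f} K_hat={k_hat(points, d, 1.0):.4f} pi*d^2={math.pi * d * d:.4f}")
```

Replacing the `total += 1` step with `total += 1 / w_ij`, for a function that computes the circumference proportion, would give the corrected estimator.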

Spatial Point Patterns – p. 12/24

slide-23
SLIDE 23

Estimating the intensity

Crude approach: imagine a refined grid over $D$. Then

$$\lambda(\partial s) = \int_{\partial s} \lambda(s)\,ds \approx \lambda(s)\,|\partial s|.$$

So, for grid cell $A_l$, assume $\lambda$ is constant over $A_l$ and estimate it with $N(A_l)/|A_l|$. The result is a two-dimensional step function, or tile function, like a two-dimensional histogram.

Spatial Point Patterns – p. 13/24

slide-24
SLIDE 24

cont.

More sophisticated: a kernel intensity estimate (like a kernel density estimate),

$$\hat{\lambda}_\tau(s) = \sum_i h(\|s - s_i\|/\tau)/\tau^2, \qquad s \in D.$$

- $h$ is a radially symmetric bivariate pdf (usually a bivariate normal).
- $\tau$ is the "bandwidth" (it controls the smoothness of $\hat{\lambda}$). And it is $\tau^2$, not $\tau$, since we are in 2-dim space, not 1-dim space.
- Don't divide by $n$; we cumulate intensity.
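
The kernel estimate translates almost directly into code. The sketch below assumes $h$ is the standard bivariate normal density and ignores edge effects near the boundary of $D$; the three points and the bandwidth are made-up values for illustration.

```python
import math

def kernel_intensity(s, points, tau):
    """Kernel intensity estimate at s with a standard bivariate normal kernel h."""
    total = 0.0
    for u, v in points:
        r = math.hypot(s[0] - u, s[1] - v) / tau
        h = math.exp(-0.5 * r * r) / (2.0 * math.pi)  # radially symmetric pdf
        total += h / (tau * tau)  # tau^2 because we are in 2-dim space
    return total  # no division by n: intensity cumulates over the points

points = [(0.2, 0.2), (0.25, 0.2), (0.8, 0.7)]
print(kernel_intensity((0.22, 0.2), points, tau=0.1))  # near two of the points
print(kernel_intensity((0.5, 0.5), points, tau=0.1))   # far from all of them
```

Because there is no division by $n$, the estimate integrates over the plane to the number of points rather than to 1, which is what distinguishes it from a kernel density estimate.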

Spatial Point Patterns – p. 14/24

slide-25
SLIDE 25

Generating point patterns

Generating a realization from a HPP with intensity $\lambda$:

- Sample $n \sim \mathrm{Po}(\lambda|D|)$.
- Given $n$, sample $n$ locations uniformly over $D$.

Generating a realization from a NHPP given $\lambda(s)$:

- Compute $\lambda_{\max} = \max_{s \in D} \lambda(s)$.
- Sample $n \sim \mathrm{Po}(\lambda_{\max}|D|)$.
- Given $n$, sample $n$ locations uniformly over $D$.
- "Thin" by retaining $s_i$ with probability $\lambda(s_i)/\lambda_{\max}$. (Rejection method)
- Easier than computing $\lambda(D) = \int_D \lambda(s)\,ds$ to draw $n$ and then trying to sample from the density $\lambda(s)/\lambda(D)$.
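
Both recipes can be sketched with the Python standard library. Several details here are assumptions for illustration only: $D$ is a rectangle, the intensity surface `intensity` is a made-up function increasing from left to right, and the helper `poisson` draws a Poisson count by counting unit-rate exponential arrivals (adequate for moderate means, since the standard library has no Poisson sampler).

```python
import random

def poisson(rng, mean):
    """Draw a Poisson(mean) count by counting unit-rate exponential arrivals."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def sample_hpp(rng, lam, width, height):
    """Homogeneous Poisson process on the rectangle [0, width] x [0, height]."""
    n = poisson(rng, lam * width * height)
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]

def sample_nhpp(rng, intensity, lam_max, width, height):
    """Nonhomogeneous Poisson process by thinning an HPP with rate lam_max."""
    kept = []
    for s in sample_hpp(rng, lam_max, width, height):
        if rng.random() < intensity(s) / lam_max:  # retain with prob lambda(s)/lam_max
            kept.append(s)
    return kept

def intensity(s):
    # Illustrative surface (an assumption): increases from left to right.
    return 200.0 * s[0]

rng = random.Random(4)
pattern = sample_nhpp(rng, intensity, lam_max=200.0, width=1.0, height=1.0)
left = sum(1 for u, _ in pattern if u < 0.5)
print(len(pattern), left, len(pattern) - left)
```

The thinning step only requires an upper bound on $\lambda(s)$ over $D$; a looser bound still gives a valid realization but discards more candidate points.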

Spatial Point Patterns – p. 15/24

slide-26
SLIDE 26

NHPP likelihood

Two views.

Given $N(D) = n$,

$$f(s_1, s_2, \ldots, s_n \mid N(D) = n) = \frac{\prod_i \lambda(s_i)}{(\lambda(D))^n}.$$

So the "joint density" is

$$f(s_1, s_2, \ldots, s_n, N(D) = n) = \frac{\prod_i \lambda(s_i)}{(\lambda(D))^n} \cdot \frac{(\lambda(D))^n \exp(-\lambda(D))}{n!},$$

and the "likelihood" becomes

$$L(\lambda(s), s \in D;\ s_1, \ldots, s_n) = \prod_i \lambda(s_i)\,\exp(-\lambda(D)).$$

Alternatively, partition $D$ into a fine grid. From the Poisson assumption, the likelihood will be a product over the grid cells, i.e.,

$$\prod_l \exp(-\lambda(A_l))\,(\lambda(A_l))^{N(A_l)}.$$

The product of the exponential terms is $\exp(-\lambda(D))$, regardless of the grid. As the grid becomes finer, $N(A_l) = 1$ or $0$ according to whether or not some $s_i$ is in $A_l$.
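
On the log scale the likelihood is $\sum_i \log\lambda(s_i) - \lambda(D)$, which is easy to evaluate once $\lambda(D)$ is available. The sketch below assumes $D$ is the unit square and approximates $\lambda(D)$ with a midpoint rule on a grid; the log-linear surface $\exp(a + bu)$ and the four points are hypothetical choices for illustration.

```python
import math

def log_likelihood(points, intensity, k=50):
    """NHPP log-likelihood on the unit square: sum_i log lambda(s_i) - lambda(D).

    lambda(D) is approximated by the midpoint rule on a k-by-k grid.
    """
    cell = 1.0 / k
    lam_d = 0.0
    for r in range(k):
        for c in range(k):
            mid = ((c + 0.5) * cell, (r + 0.5) * cell)
            lam_d += intensity(mid) * cell * cell
    return sum(math.log(intensity(s)) for s in points) - lam_d

def make_loglinear(a, b):
    """Hypothetical parametric surface lambda(s) = exp(a + b * u)."""
    return lambda s: math.exp(a + b * s[0])

points = [(0.9, 0.1), (0.8, 0.5), (0.7, 0.9), (0.3, 0.4)]
for b in (-1.0, 0.0, 1.0, 2.0):
    value = log_likelihood(points, make_loglinear(math.log(4.0), b))
    print(f"b={b:+.1f} loglik={value:.4f}")
```

Evaluating the function over a range of `b` gives a crude profile of how strongly the data support a left-to-right trend in the intensity.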

Spatial Point Patterns – p. 16/24

slide-27
SLIDE 27

Modeling λ(s)

- $\lambda(s) = \sigma\lambda_0(s)$, $\sigma$ unknown.
- $\lambda(s)$ is a tiled surface over a grid, requiring $\lambda_l$ for $A_l$.
- $\lambda(s; \theta)$, a parametric function.
- $\lambda(s; \theta) = \lambda f(s; \theta)$, where $f$ is a bivariate density function truncated to $D$.
- $\lambda(s)$ is a surface over 2-dim space using basis functions, i.e., $\lambda(s) = \sum_j a_j f_j(s)$.
- $\lambda(s)$ a process realization, e.g., $\lambda(s) = \exp(z(s))$, where $z(s)$ is a realization from a spatial Gaussian process.
- From a Bayesian viewpoint, parameters are all random, so these are all Cox processes.

Spatial Point Patterns – p. 17/24

slide-28
SLIDE 28

Modeling clarification

- What are we modeling? From a Bayesian perspective, the joint distribution of $(n, \{s_1, s_2, \ldots, s_n\})$ and $\lambda_D = \{\lambda(s) : s \in D\}$.
- Note the "misalignment".
- Above, we provided the distribution of $(n, \{s_1, s_2, \ldots, s_n\}) \mid \lambda_D$. So we need to model $\lambda_D$ to complete the specification.
- In the cases above, this will be a finite dimensional specification unless $\lambda(s)$ is a process realization.
- As an aside, suppose we add a "response", say $Y(s_i)$ at location $s_i$, with a usual spatial model, $Y(s) = \mu(s) + w(s) + \epsilon(s)$, with $w(s)$ a GP. Then modeling $w_D = \{w(s) : s \in D\}$ has nothing to do with modeling $\lambda_D$. The former is a process model for $w(s)$.

Spatial Point Patterns – p. 18/24

slide-29
SLIDE 29

Bayesian model cont.

- For parametric cases, we write the likelihood as $L(\theta; s_1, \ldots, s_n)$, with a prior on $\theta$ as usual.
- Can include covariates in $\lambda(s; \theta)$.
- With a prior on $\theta$, the specification is complete. Replace conditioning on $\lambda_D$ with conditioning on $\theta$.
- But we will need to calculate $\int_D \lambda(s; \theta)\,ds$, perhaps explicitly, perhaps by numerical integration.
- A very flexible choice for $\lambda(s; \theta)$ is $\lambda f(s; \cdot)$, where $f(s; \cdot) = \sum_{l=1}^{L} \mathrm{BVN}(s; u_l, \Sigma)/L$. Here, the $u_l$ can be i.i.d. uniform over $D$, $\Sigma$ can be convenient, and $\theta = (\lambda, u_1, \ldots, u_L, \Sigma)$.

Spatial Point Patterns – p. 19/24

slide-30
SLIDE 30

cont.

- For the case where $\lambda(s)$ is a process realization, the nonparametric case, say $\lambda(s) = \exp(z(s))$ where $z(s)$ is a realization of a GP, i.e., a GP prior on $\log \lambda_D$.
- Can include covariates in $z(s)$.
- Practically, we can only work with a finite dimensional distribution. We replace $\lambda_D$ with $\lambda(s^*_l)$, where the $s^*_l$ are on a suitably fine grid over $D$. This introduces a multivariate normal prior for $\lambda$ on the $z$ scale.
- Still, we need $\lambda(s)$ at every $s \in D$. View the $z(s^*_l)$ as a tiled surface over $D$, or interpolate $z(s)$ from $\{z(s^*_l)\}$.
- Again, conceptually, the likelihood is $L(\lambda_D; s_1, \ldots, s_n)$ and we will need to calculate $\int_D \exp(z(s))\,ds$.
- Can't do this exactly. The finite dimensional set of $z(s^*_l)$'s allows a convenient "numerical" integration, although the limit is not $\lambda(D)$.

Spatial Point Patterns – p. 20/24

slide-31
SLIDE 31

cont.

- The "poor man's" version: overlay a grid on $D$ and work with the Poisson counts associated with the grid. Treat the $\log \lambda_l$'s as realizations from a CAR model. This reduces to a disease mapping problem. Choice of, and sensitivity to, the grid?
- In fact, can introduce covariates into $\lambda(s)$, i.e., $\log \lambda(s) = X^T(s)\beta + \phi(s)$.
- With marks, an intensity, hence a process, for each mark. Dependent or independent processes?

Spatial Point Patterns – p. 21/24

slide-32
SLIDE 32

Some special Cox processes

- Poisson cluster process (Neyman-Scott process):
  - Generate parent events from a NHPP with $\lambda(s)$.
  - Each parent produces a random number of offspring, e.g., the $N(s_i)$ are i.i.d. $\mathrm{Po}(\delta)$.
  - Positions of offspring are determined by i.i.d. realizations of a bivariate density.
  - Only offspring are retained as the point pattern.
- An inhibition process: generate a realization under csr. "Thin" by deleting all pairs of points less than $\delta$ apart, where $\delta$ is a "minimum permissible" distance.
- Pairwise interaction process: introduce an interaction function $\phi(s, s')$ into the likelihood.
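
The Poisson cluster recipe can be sketched as follows. The choices here go beyond the slide and are assumptions for illustration: $D$ is the unit square, parents come from a constant-intensity process (a special case of the NHPP), offspring are displaced from their parent by isotropic bivariate normal noise with standard deviation `sigma`, and offspring that land outside $D$ are dropped. Parents outside $D$ whose offspring would fall inside are ignored, so the pattern is slightly thin near the edges.

```python
import random

def poisson(rng, mean):
    """Draw a Poisson(mean) count by counting unit-rate exponential arrivals."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def neyman_scott(rng, parent_rate, delta, sigma):
    """Simulate a Poisson cluster process on the unit square."""
    n_parents = poisson(rng, parent_rate)  # |D| = 1
    parents = [(rng.random(), rng.random()) for _ in range(n_parents)]
    offspring = []
    for pu, pv in parents:
        for _ in range(poisson(rng, delta)):  # i.i.d. Po(delta) offspring counts
            u, v = rng.gauss(pu, sigma), rng.gauss(pv, sigma)
            if 0.0 <= u <= 1.0 and 0.0 <= v <= 1.0:  # keep offspring inside D
                offspring.append((u, v))
    return offspring  # parents are discarded

rng = random.Random(5)
pattern = neyman_scott(rng, parent_rate=10.0, delta=8.0, sigma=0.03)
print(len(pattern))
```

Smaller values of `sigma` give tighter clusters, while a large `sigma` spreads the offspring so widely that the pattern becomes hard to distinguish from csr.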

Spatial Point Patterns – p. 22/24

slide-33
SLIDE 33

Space-time point patterns

- What is the time resolution?
  - Discrete time, e.g., pattern of annual plants, annual pattern of disease cases.
  - Continuous time, e.g., when a house is built, when a bovine TB case occurred.
- In either case, we need to think about $\lambda(s, t)$.
- If time is discrete, then $\lambda_t(s)$ can be modeled dynamically, parametrically or nonparametrically.
- If time is continuous, then again a parametric $\lambda(s, t; \theta)$ or a nonparametric log-Gaussian spatio-temporal Cox process model, i.e., $z(t, s) = \log \lambda(s, t)$ is a realization from a space-time Gaussian process.
- Note that to see a pattern, we must integrate $\lambda(s, t)$ over a region in space and an interval in time.

Spatial Point Patterns – p. 23/24

slide-34
SLIDE 34

References

- Cressie book is still useful.
- Diggle book: more modern but not very complete.
- Diggle review papers: available from his website.
- Møller and Waagepetersen: technical, likelihood based.
- Gotway and Waller: easy reading, less modeling, more EDA style.
- New book by Illian, Penttinen, Stoyan, and Stoyan is very nice.

Spatial Point Patterns – p. 24/24