SLIDE 1

D. Gumprecht, W.G. Müller and J. Rodríguez-Díaz

University of Economics Vienna, Austria; Johannes-Kepler-University Linz, Austria; University of Salamanca, Spain

Optimal Design for Detecting Spatial Dependence

mODa 8, Almagro, Spain, June 2007

SLIDE 2

Spatial dependence

“All things are related but nearby things are more related than distant things.” (Tobler, 1970: the first law of geography)

“Spatial dependency is the extent to which the value of an attribute in one location depends on the values of the attribute in nearby locations.” (Fotheringham et al., 2002)

“Spatial autocorrelation (…) is the correlation among values of a single variable strictly attributable to the proximity of those values in geographic space (…).” (Griffith, 2003)

“Hell is a place with no spatial dependence.” (Goodchild, 2002)

SLIDE 3

source: Anselin, 1988 (Columbus, Ohio crime)

Random or Clustered?

SLIDE 4

source: M. Goodchild, 2002

Spatial Randomness

– values observed at a location do not depend on values observed at neighboring locations
– observed spatial pattern of values is equally likely as any other spatial pattern
– the location of values may be altered without affecting the information content of the data

SLIDE 5

adapted from Goodchild, 2002

Spatial Proximity (Weight) Matrix

Matrix W (n x n), where each element $w_{ij}$ represents a measure of nearness between regions $O_i$ and $O_j$.

Possible choices:

– $w_{ij} = 1$, if $O_i$ touches $O_j$
– $w_{ij} = 1$, if distance($O_i$, $O_j$) < d*

  A B C D E
A 0 1 0 1 0
B 1 0 1 1 1
C 0 1 0 0 1
D 1 1 0 0 1
E 0 1 1 1 0

SLIDE 6

Spatial weight matrices based on distance

Distances $d_{ij}$ are usually measured centroid to centroid.

Most common choices are the inverse distance

$$w_{ij} = \frac{1 - \mathbb{1}_{\{i=j\}}}{d_{ij}},$$

or the negative exponential

$$w_{ij} = \exp\{-\delta d_{ij}\} - \mathbb{1}_{\{i=j\}}.$$

Row standardization, replacing $w_{ij}$ by $w_{ij} / \sum_{j'} w_{ij'}$, is employed to keep spatial parameters comparable.
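The distance-based weights and the row standardization can be sketched in a few lines of Python. This is only an illustration: the centroid coordinates, the value of delta and all function names are assumptions made for the example, not part of the talk.

```python
import math


def inverse_distance_weights(dist):
    """w_ij = 1 / d_ij off the diagonal, 0 on the diagonal."""
    n = len(dist)
    return [[0.0 if i == j else 1.0 / dist[i][j] for j in range(n)]
            for i in range(n)]


def negative_exponential_weights(dist, delta):
    """w_ij = exp(-delta * d_ij) off the diagonal, 0 on the diagonal."""
    n = len(dist)
    return [[0.0 if i == j else math.exp(-delta * dist[i][j])
             for j in range(n)]
            for i in range(n)]


def row_standardize(w):
    """Divide each row by its row sum so that every row sums to one."""
    out = []
    for row in w:
        total = sum(row)
        out.append([x / total if total > 0 else 0.0 for x in row])
    return out


# Hypothetical centroids of four regions (illustrative values only).
centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
dist = [[math.dist(p, q) for q in centroids] for p in centroids]

W_inv = row_standardize(inverse_distance_weights(dist))
W_exp = row_standardize(negative_exponential_weights(dist, delta=0.5))
```

After standardization each row sums to one, so the spatial lag of a variable is a weighted average of its values at the other locations.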

SLIDE 7

source: Anselin, 1988

Moran Scatter Plots

We can now draw a scatter plot between a variable y, and the “spatial lag” of y, Wy.

The slope of the regression line is Moran’s I, which can be interpreted as the spatial autocorrelation, the correlation between variable y and the “spatial lag” Wy.
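A small numerical check of the slope interpretation, as a sketch only. It uses the five-region contiguity matrix (regions A to E) in row-standardized form; the values of y are hypothetical and chosen purely for illustration.

```python
# Binary contiguity matrix for five regions A..E ("touching" rule).
W_raw = [
    [0, 1, 0, 1, 0],
    [1, 0, 1, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 1, 1, 0],
]
# Row standardization: every row sums to one.
W = [[x / sum(row) for x in row] for row in W_raw]

y = [2.0, 3.0, 5.0, 1.0, 4.0]  # hypothetical observations
n = len(y)

# Spatial lag Wy: weighted average of the neighbours' values.
lag = [sum(W[i][j] * y[j] for j in range(n)) for i in range(n)]

# OLS slope of the regression of Wy on y (with intercept).
y_bar = sum(y) / n
lag_bar = sum(lag) / n
slope = (sum((y[i] - y_bar) * (lag[i] - lag_bar) for i in range(n))
         / sum((v - y_bar) ** 2 for v in y))

# Moran's I computed directly, with S0 the sum of all weights.
s0 = sum(sum(row) for row in W)
cross = sum(W[i][j] * (y[i] - y_bar) * (y[j] - y_bar)
            for i in range(n) for j in range(n))
morans_i = (n / s0) * cross / sum((v - y_bar) ** 2 for v in y)
```

With row-standardized weights the sum of all weights equals n, which is why `slope` and `morans_i` agree up to rounding.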

SLIDE 8

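Tests for Spatial Dependence

Moran, 1950:

$$I = \frac{n \sum_i \sum_j w_{ij} (y_i - \bar{y})(y_j - \bar{y})}{\sum_i (y_i - \bar{y})^2}$$

(this form presumes weights scaled to sum to one; in general the ratio is also divided by $\sum_i \sum_j w_{ij}$)

Cliff and Ord, 1981, for regression residuals from $y = X\beta + \varepsilon$:

$$I = \frac{y^T M \, \tfrac{1}{2}(W + W^T) \, M y}{y^T M y}, \qquad M = I_n - X (X^T X)^{-1} X^T$$

Anselin and Kelejian, 1997, investigate $y = X\beta + \rho W y + \varepsilon$.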

SLIDE 9

Random or Clustered?

Moran’s I = -0.003

Moran’s I = 0.511

SLIDE 10

Distribution of Moran’s I under H0: no spatial autocorrelation

Inference is usually based on a normal approximation, using a standardized z-value obtained from the mean and variance of the statistic, i.e. $z(I) = (I - E[I]) / \sqrt{Var[I]}$, which are given by (see Henshaw, 1966)

$$E[I \mid H_0] = \frac{tr(K)}{n-k},$$

$$Var[I \mid H_0] = \frac{2\{(n-k)\, tr(K^2) - tr(K)^2\}}{(n-k)^2 (n-k+2)},$$

where $K = \tfrac{1}{2} M (W + W^T) M$.

A saddle-point approximation and the exact distribution were derived by Tiefelsdorf, 2000.

Asymptotic distributions under deviations can be found in Kelejian and Prucha, 2001.
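The null moments can be evaluated directly from the weight matrix. The sketch below uses E[I | H0] = tr(K)/(n-k) and Var[I | H0] = 2{(n-k) tr(K^2) - tr(K)^2} / ((n-k)^2 (n-k+2)) with K = 0.5 M (W + W^T) M. It assumes an intercept-only regression (X is a column of ones, so k = 1 and M = I - 11^T/n) and reuses the five-region contiguity matrix in row-standardized form; the function names are illustrative.

```python
import math


def matmul(a, b):
    """Plain-Python product of two square matrices (lists of lists)."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def null_moments_intercept_only(W):
    """Mean and variance of Moran's I under H0.

    Assumption: intercept-only model, so k = 1 and M = I - (1/n) 1 1^T.
    """
    n = len(W)
    k = 1
    M = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
         for i in range(n)]
    # Symmetrized weights 0.5 * (W + W^T).
    S = [[0.5 * (W[i][j] + W[j][i]) for j in range(n)] for i in range(n)]
    K = matmul(matmul(M, S), M)
    tr_k = trace(K)
    tr_k2 = trace(matmul(K, K))
    mean = tr_k / (n - k)
    var = 2.0 * ((n - k) * tr_k2 - tr_k ** 2) / ((n - k) ** 2 * (n - k + 2))
    return mean, var


def z_value(i_obs, mean, var):
    """Standardized statistic z(I) = (I - E[I]) / sqrt(Var[I])."""
    return (i_obs - mean) / math.sqrt(var)


W_raw = [
    [0, 1, 0, 1, 0],
    [1, 0, 1, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 1, 1, 0],
]
W = [[x / sum(row) for x in row] for row in W_raw]
mean_h0, var_h0 = null_moments_intercept_only(W)
```

For an intercept-only model with row-standardized weights, tr(K) = -1, so the null mean reduces to the familiar -1/(n-1).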

SLIDE 11

Distribution of Moran’s I under HA: spatial autocorrelation

We assume that the data are generated by a so-called SAR model, i.e. $y = X\beta + \varepsilon$, where $\varepsilon = \rho W \varepsilon + u$, with $u$ being i.i.d.

The normal approximation holds and the mean and variance are now given by (see Tiefelsdorf, 2000)

$$E[I \mid H_A] = \int_0^\infty \prod_{i=1}^{n-k} (1 + 2\lambda_i t)^{-1/2} \cdot \sum_{i=1}^{n-k} \frac{h_{ii}^*}{1 + 2\lambda_i t} \, dt,$$

where the $h_{ii}^*$ are derived from functions of the covariance matrix of the errors, and with

$$Var[I \mid H_A] = E[I^2 \mid H_A] - E[I \mid H_A]^2,$$

$$E[I^2 \mid H_A] = \int_0^\infty \prod_{i=1}^{n-k} (1 + 2\lambda_i t)^{-1/2} \cdot \sum_{i=1}^{n-k} \sum_{j=1}^{n-k} \frac{h_{ii}^* h_{jj}^* + 2 (h_{ij}^*)^2}{(1 + 2\lambda_i t)(1 + 2\lambda_j t)} \, t \, dt.$$

SLIDE 12

Random or Clustered?

Moran’s I = 0.511, z(I) = 5.675

Moran’s I = -0.003, z(I) = 0.190

SLIDE 13

A Design Criterion

Purpose: minimize the Type II error, i.e. the probability that, given the alternative, Moran’s test accepts the null hypothesis of no spatial autocorrelation:

$$\min \; P\left( \frac{I - E(I \mid H_0)}{\sqrt{Var(I \mid H_0)}} \le \Phi^{-1}(1-\alpha) \;\Big|\; H_A \right)$$

This leads us to the following design problem:

$$\xi^* = \arg\min_{\xi \in X} \Psi = \arg\min_{\xi \in X} \Phi\left( \frac{\Phi^{-1}(1-\alpha) \sqrt{Var[I \mid H_0]} + E[I \mid H_0] - E[I \mid H_A]}{\sqrt{Var[I \mid H_A]}} \right)$$

Of course we cannot use classical design theory since the power $1 - \Psi$ is not convex.
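Given the four moments, the criterion is a single evaluation of the standard normal distribution function: Ψ = Φ((Φ^{-1}(1-α) sqrt(Var[I | H0]) + E[I | H0] - E[I | HA]) / sqrt(Var[I | HA])). A minimal sketch follows; the function name and the default level alpha = 0.05 are assumptions for illustration, and the moments are taken as already computed.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mean_h0, var_h0, mean_ha, var_ha, alpha=0.05):
    """Psi: probability of accepting H0 under HA (normal approximation).

    The test rejects when the standardized statistic exceeds the
    (1 - alpha) quantile of the standard normal distribution.
    """
    std_normal = NormalDist()
    quantile = std_normal.inv_cdf(1.0 - alpha)
    arg = (quantile * sqrt(var_h0) + mean_h0 - mean_ha) / sqrt(var_ha)
    return std_normal.cdf(arg)


# Sanity check: if HA has the same moments as H0, Psi equals 1 - alpha.
psi_same = type_ii_error(0.0, 1.0, 0.0, 1.0, alpha=0.05)

# A larger mean under HA (illustrative numbers) lowers Psi, i.e. raises power.
psi_shifted = type_ii_error(0.0, 1.0, 2.0, 1.0, alpha=0.05)
power_shifted = 1.0 - psi_shifted
```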

SLIDE 14

Example: Anselin data

Moran’s I = 0.511, z(I) = 5.675, 1 - Ψ = 0.799

SLIDE 15

Exchange type algorithms

E.g. from a given design ξ and a set of candidate points C, exchange the pair which maximizes the decrease in Ψ. (Fedorov, 1972, requires evaluation of the criterion n(N-n) times at each step.)

Iterate as long as there is improvement.

Variants by Wynn, 1970, Meyer & Nachtsheim, 1995, Nguyen, 2002, etc.

Simulated annealing, genetic algorithms as alternatives?
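A best-pair exchange of this kind can be sketched as follows. This is a generic illustration, not the authors' implementation: `criterion` stands for any function of the design to be minimized (Ψ in the talk), and the toy criterion in the usage lines is hypothetical and chosen only so that the result is easy to verify.

```python
from itertools import product


def exchange(design, candidates, criterion):
    """Fedorov-type exchange: repeatedly apply the single best swap.

    Each pass evaluates the criterion for all n * (N - n) swaps of a
    design point with a candidate outside the design and stops when no
    swap lowers the criterion.
    """
    design = list(design)
    best = criterion(design)
    improved = True
    while improved:
        improved = False
        outside = [c for c in candidates if c not in design]
        best_swap = None
        for pos, new_point in product(range(len(design)), outside):
            trial = design[:pos] + [new_point] + design[pos + 1:]
            value = criterion(trial)
            if value < best:
                best, best_swap, improved = value, (pos, new_point), True
        if best_swap is not None:
            pos, new_point = best_swap
            design[pos] = new_point
    return design, best


# Toy usage: candidates are the integers 0..9 and the hypothetical
# criterion is minus the sum of the chosen points.
final_design, final_value = exchange([0, 1, 2], range(10),
                                     lambda d: -sum(d))
```

Because the criterion is not convex, such a search can stop in a local optimum; restarting from several initial designs is a common precaution.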

SLIDE 16

Example 2: Anselin data

SLIDE 17

Example 2: Anselin data

Moran’s I = 0.511, z(I) = 5.675, 1 - Ψ = 0.799

Moran’s I = 0.417, z(I) = 1.914, 1 - Ψ = 0.983

SLIDE 18

References (www.ifas.jku.at)

Anselin, Luc. 1988. Spatial Econometrics: Methods and Models. Dordrecht, Amsterdam.

Cliff, Andrew, and Keith Ord. 1981. Spatial Processes: Models and Applications. London: Pion.

Müller, Werner G. 2007. Collecting Spatial Data. Springer-Verlag Berlin Heidelberg.

Tiefelsdorf, Michael. 2000. Modelling Spatial Processes. Springer-Verlag Berlin Heidelberg New York.

SLIDE 19

www.endlessforest.org

thank you for your attention!

SLIDE 20

source: O'Sullivan and Unwin, 2002

Is it Spatially Random? Tougher than it looks to decide!

Fact: It is observed that about twice as many people sit catty-corner rather than opposite at tables in a restaurant.

Conclusion: psychological preference for nearness.

In actuality: an outcome to be expected from a random process: two ways to sit opposite, but four ways to sit catty-corner.
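The count behind this example can be checked by enumeration. The sketch assumes a square table with one seat on each side (the slide does not spell out the layout) and counts the unordered ways two people can occupy two of the four seats.

```python
from itertools import combinations

# Assumed layout: a square table with one seat per side.
seats = ["N", "E", "S", "W"]
opposite_pairs = {frozenset(("N", "S")), frozenset(("E", "W"))}

# All unordered pairs of seats two people could occupy.
pairs = [frozenset(p) for p in combinations(seats, 2)]

n_opposite = sum(p in opposite_pairs for p in pairs)
n_corner = len(pairs) - n_opposite
ratio = n_corner / n_opposite
```

Under purely random seating the catty-corner arrangements outnumber the opposite ones two to one, so the observed ratio needs no behavioural explanation.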

SLIDE 21

source: M. Goodchild

Why Spatial Autocorrelation Matters

Spatial autocorrelation is of interest in its own right because it suggests the operation of a spatial process.

Additionally, most statistical analyses are based on the assumption that the values of observations in each sample are independent of one another.

– Positive spatial autocorrelation violates this, because samples taken from nearby areas are related to each other and are not independent.

In ordinary least squares regression (OLS), for example, the correlation coefficients will be biased and their precision exaggerated.

– Bias implies correlation coefficients may be higher than they really are. They are biased because the areas with higher concentrations of events will have a greater impact on the model estimate.

– Exaggerated precision (lower standard error) implies they are more likely to be found “statistically significant”. They will overestimate precision because, since events tend to be concentrated, there are actually fewer independent observations than is being assumed.

SLIDE 22

Example 1: Regression on Unit Square

The error covariance matrix Ω depends on the assumed parameter values ρ and δ, i.e. $\Omega = [(I_n - \rho W(\delta))^T (I_n - \rho W(\delta))]^{-1}$.

[Figure panels: intercept only; plane trend]