SLIDE 1
D. Gumprecht, W.G. Müller and J. Rodríguez-Díaz

University of Economics Vienna, Austria
Johannes-Kepler-University Linz, Austria
University of Salamanca, Spain

Optimal Design for Detecting Spatial Dependence

mODa 8, Almagro, Spain, June 2007

SLIDE 2

Spatial dependence

  • “All things are related but nearby things are more related than distant things.” (Tobler, 1970: the first law of geography)
  • “Spatial dependency is the extent to which the value of an attribute in one location depends on the values of the attribute in nearby locations.” (Fotheringham et al., 2002)
  • “Spatial autocorrelation (…) is the correlation among values of a single variable strictly attributable to the proximity of those values in geographic space (…).” (Griffith, 2003)
  • “Hell is a place with no spatial dependence.” (Goodchild, 2002)

SLIDE 3

source: Anselin, 1988 (Columbus, Ohio crime)

Random or Clustered?

SLIDE 4

source: M. Goodchild, 2002

Spatial Randomness

– values observed at a location do not depend on values observed at neighboring locations
– observed spatial pattern of values is equally likely as any other spatial pattern
– the location of values may be altered without affecting the information content of the data

SLIDE 5

adapted from Goodchild, 2002

Spatial Proximity (Weight) Matrix

  • Matrix $W$ ($n \times n$), where each element $w_{ij}$ represents a measure of nearness between regions $O_i$ and $O_j$
  • Possible choices (with $w_{ij} = 0$ otherwise):
      $w_{ij} = 1$ if $O_i$ touches $O_j$
      $w_{ij} = 1$ if $\mathrm{distance}(O_i, O_j) < d^{*}$
  • Example for five regions A to E:

        A  B  C  D  E
    A   0  1  0  1  0
    B   1  0  1  1  1
    C   0  1  0  0  1
    D   1  1  0  0  1
    E   0  1  1  1  0
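The two choices above can be sketched in a few lines of Python. This is only an illustration: the function names `contiguity_matrix` and `threshold_matrix` are made up for this sketch, the neighbour lists are read off the example matrix, and the strict inequality in the distance rule is an assumption.

```python
import numpy as np

# Neighbour lists read off the example matrix for regions A to E.
regions = ["A", "B", "C", "D", "E"]
touches = {
    "A": ["B", "D"],
    "B": ["A", "C", "D", "E"],
    "C": ["B", "E"],
    "D": ["A", "B", "E"],
    "E": ["B", "C", "D"],
}

def contiguity_matrix(regions, touches):
    """w_ij = 1 if region i touches region j, else 0 (hypothetical helper)."""
    index = {name: k for k, name in enumerate(regions)}
    W = np.zeros((len(regions), len(regions)), dtype=int)
    for name, neighbours in touches.items():
        for other in neighbours:
            W[index[name], index[other]] = 1
    return W

def threshold_matrix(coords, d_star):
    """w_ij = 1 if distance(i, j) < d_star and i != j, else 0 (hypothetical helper)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    W = (dist < d_star).astype(int)
    np.fill_diagonal(W, 0)  # a region is not its own neighbour
    return W

W = contiguity_matrix(regions, touches)
print(W)
```

Both constructions give a symmetric matrix with a zero diagonal, which is what the later formulas assume before row standardization.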

SLIDE 6

Spatial weight matrices based on distance

  • Distances $d_{ij}$ are usually measured centroid to centroid.
  • Most common choices are the inverse distance $w_{ij} = (1 - 1_{\{i=j\}})/d_{ij}$,
  • or the negative exponential $w_{ij} = \exp(-\delta d_{ij}) - 1_{\{i=j\}}$.
  • Row standardization $\tilde{w}_{ij} = w_{ij} / \sum_{j'} w_{ij'}$ is employed to keep spatial parameters comparable.
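A minimal sketch of these weighting schemes, assuming planar centroid coordinates and distinct locations. All function names and the example coordinates are hypothetical and only serve the illustration.

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean centroid-to-centroid distances d_ij."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def inverse_distance_weights(dist):
    """w_ij = 1 / d_ij off the diagonal, w_ii = 0."""
    W = np.zeros_like(dist)
    off_diagonal = ~np.eye(dist.shape[0], dtype=bool)
    W[off_diagonal] = 1.0 / dist[off_diagonal]
    return W

def negative_exponential_weights(dist, delta):
    """w_ij = exp(-delta * d_ij) - 1{i=j}, so the diagonal is exp(0) - 1 = 0."""
    return np.exp(-delta * dist) - np.eye(dist.shape[0])

def row_standardize(W):
    """Divide every row by its sum so that each row adds up to one."""
    return W / W.sum(axis=1, keepdims=True)

# Hypothetical centroids, for illustration only.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
dist = pairwise_distances(coords)
W_inv = row_standardize(inverse_distance_weights(dist))
W_exp = row_standardize(negative_exponential_weights(dist, delta=1.0))
```

After row standardization every row sums to one, so the spatial lag of a variable is a weighted average of its neighbours' values.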

SLIDE 7

source: Anselin, 1988

Moran Scatter Plots

We can now draw a scatter plot between a variable y, and the “spatial lag” of y, Wy.

The slope of the regression line is Moran’s I (for a row-standardized W), which can be interpreted as the spatial autocorrelation, the correlation between variable y and the “spatial lag” Wy
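The link between the scatter plot slope and Moran’s I can be checked numerically. The sketch below uses the five-region contiguity matrix from the earlier slide after row standardization. The attribute values `y` are invented, and both function names are hypothetical.

```python
import numpy as np

# Five-region contiguity matrix, row-standardized.
W = np.array([
    [0, 1, 0, 1, 0],
    [1, 0, 1, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 1, 1, 0],
], dtype=float)
W = W / W.sum(axis=1, keepdims=True)

# Invented attribute values, one per region (illustration only).
y = np.array([1.0, 2.0, 4.0, 3.0, 5.0])

def morans_i(y, W):
    """Classical Moran's I with the n / sum(w_ij) scaling."""
    z = y - y.mean()
    return (len(y) / W.sum()) * (z @ W @ z) / (z @ z)

def moran_scatter_slope(y, W):
    """Least-squares slope of the spatial lag Wy regressed on y."""
    slope, _intercept = np.polyfit(y, W @ y, 1)
    return slope

print(morans_i(y, W), moran_scatter_slope(y, W))
```

The two numbers agree because row standardization makes the sum of all weights equal to n, so the scaling factor in Moran’s I drops out.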

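SLIDE 8

Tests for Spatial Dependence

  • Moran, 1950:

$$I = \frac{n}{\sum_i \sum_j w_{ij}} \cdot \frac{\sum_i \sum_j w_{ij}\,(y_i - \bar{y})(y_j - \bar{y})}{\sum_i (y_i - \bar{y})^2}$$

  • Cliff and Ord, 1981, for regression residuals from $y = X\beta + \varepsilon$:

$$I = \frac{y^T M \,\tfrac{1}{2}(W + W^T)\, M y}{y^T M y}, \qquad M = I_n - X (X^T X)^{-1} X^T$$

    (with $I_n$ the $n \times n$ identity matrix)

  • Anselin and Kelejian, 1997, investigate $y = X\beta + \rho W y + \varepsilon$.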
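The residual-based statistic can be sketched as follows, assuming a row-standardized W and a full-rank regressor matrix X. The function names and the data are hypothetical.

```python
import numpy as np

def residual_projector(X):
    """M = I_n - X (X^T X)^{-1} X^T, the OLS residual-maker matrix."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    return np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

def morans_i_residuals(y, X, W):
    """Moran's I of OLS residuals: e^T (W + W^T)/2 e / (e^T e), with e = M y."""
    e = residual_projector(X) @ y
    W_sym = 0.5 * (W + W.T)
    return (e @ W_sym @ e) / (e @ e)

# Five-region contiguity matrix, row-standardized.
W = np.array([
    [0, 1, 0, 1, 0],
    [1, 0, 1, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 1, 1, 0],
], dtype=float)
W = W / W.sum(axis=1, keepdims=True)

# Invented data: intercept-only model, so residuals are deviations from the mean.
y = np.array([1.0, 2.0, 4.0, 3.0, 5.0])
X = np.ones((5, 1))

print(morans_i_residuals(y, X, W))
```

With an intercept-only X the residuals are the centered observations, so the result coincides with the classical Moran’s I for a row-standardized W.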

SLIDE 9

Random or Clustered?

Moran’s I = -0.003

Moran’s I = 0.511

SLIDE 10

Distribution of Moran’s I under H0: no spatial autocorrelation

  • Inference is usually based on a normal approximation, using a standardized z-value obtained from the mean and variance of the statistic, i.e. $z(I) = (I - \mathrm{E}[I]) / \sqrt{\mathrm{Var}[I]}$,
  • which are given by (see Henshaw, 1966)

$$\mathrm{E}[I \mid H_0] = \frac{\mathrm{tr}(K)}{n-k}, \qquad \mathrm{Var}[I \mid H_0] = \frac{2\{(n-k)\,\mathrm{tr}(K^2) - \mathrm{tr}(K)^2\}}{(n-k)^2\,(n-k+2)},$$

    where $K = \tfrac{1}{2} M (W + W^T) M$ and $k$ is the number of columns of $X$.
  • A saddle-point approximation and the exact distribution were derived by Tiefelsdorf, 2000.
  • Asymptotic distributions under deviations can be found in Kelejian and Prucha, 2001.
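The null moments and the z-value can be sketched directly from these trace formulas. The function names are hypothetical, and X is assumed to have full column rank.

```python
import numpy as np
from math import sqrt

def moran_null_moments(X, W):
    """Mean and variance of Moran's I under H0 from the trace formulas."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
    K = 0.5 * M @ (W + W.T) @ M
    tr_K = np.trace(K)
    tr_K2 = np.trace(K @ K)
    mean = tr_K / (n - k)
    var = 2.0 * ((n - k) * tr_K2 - tr_K ** 2) / ((n - k) ** 2 * (n - k + 2))
    return mean, var

def moran_z(i_value, mean, var):
    """Standardized z-value used in the normal approximation."""
    return (i_value - mean) / sqrt(var)

# Five-region contiguity matrix, row-standardized, with an intercept-only model.
W = np.array([
    [0, 1, 0, 1, 0],
    [1, 0, 1, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 1, 1, 0],
], dtype=float)
W = W / W.sum(axis=1, keepdims=True)
X = np.ones((5, 1))

mean0, var0 = moran_null_moments(X, W)
print(mean0, var0)
```

For an intercept-only model and a row-standardized W the mean reduces to the familiar value -1/(n-1), which is a convenient sanity check.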

SLIDE 11

Distribution of Moran’s I under HA: spatial autocorrelation

  • We assume that the data are generated by a so-called SAR model, i.e. $y = X\beta + \varepsilon$, where $\varepsilon = \rho W \varepsilon + u$, $u$ being i.i.d.
  • The normal approximation holds and the mean and variance are now given by (see Tiefelsdorf, 2000)

$$\mathrm{E}[I \mid H_A] = \int_0^\infty \left[\prod_{i=1}^{n-k} (1 + 2\lambda_i t)\right]^{-1/2} \cdot \sum_{i=1}^{n-k} \frac{h_{ii}^{*}}{1 + 2\lambda_i t}\; dt$$

$$\mathrm{Var}[I \mid H_A] = \mathrm{E}[I^2 \mid H_A] - \mathrm{E}[I \mid H_A]^2$$

    where the $h_{ii}^{*}$ are derived from functions of the covariance matrix of the errors, and with

$$\mathrm{E}[I^2 \mid H_A] = \int_0^\infty \left[\prod_{i=1}^{n-k} (1 + 2\lambda_i t)\right]^{-1/2} \cdot \sum_{i=1}^{n-k} \sum_{j=1}^{n-k} \frac{h_{ii}^{*} h_{jj}^{*} + 2\,(h_{ij}^{*})^2}{(1 + 2\lambda_i t)(1 + 2\lambda_j t)}\; t\; dt$$

SLIDE 12

Random or Clustered?

Moran’s I = 0.511, z(I) = 5.675
Moran’s I = -0.003, z(I) = 0.190

SLIDE 13

A Design Criterion

  • Purpose: minimize the Type II error, i.e. the probability that, given the alternative, Moran’s I test accepts the null hypothesis of no spatial autocorrelation.
  • This leads us to the following design problem:

$$\min\; P\!\left( \frac{I - \mathrm{E}(I \mid H_0)}{\sqrt{\mathrm{Var}(I \mid H_0)}} \le \Phi^{-1}(1-\alpha) \;\middle|\; H_A \right)$$

$$\xi^{*} = \arg\min_{\xi \in X} \Psi = \arg\min_{\xi \in X} \Phi\!\left( \frac{\Phi^{-1}(1-\alpha)\,\sqrt{\mathrm{Var}[I \mid H_0]} + \mathrm{E}[I \mid H_0] - \mathrm{E}[I \mid H_A]}{\sqrt{\mathrm{Var}[I \mid H_A]}} \right)$$

  • Of course we cannot use classical design theory since the power 1 − Ψ is not convex.
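Once the four moments are available, the criterion Ψ is a one-line evaluation of the normal distribution function. The sketch below uses only the Python standard library. The function name `type_ii_error` and the example moment values are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mean0, var0, mean_a, var_a, alpha=0.05):
    """Psi: probability that the one-sided test accepts H0 although HA holds.

    mean0, var0   : mean and variance of Moran's I under H0
    mean_a, var_a : mean and variance of Moran's I under HA
    """
    std_normal = NormalDist()
    # Critical value of I on its original scale under H0.
    critical = std_normal.inv_cdf(1.0 - alpha) * sqrt(var0) + mean0
    # Probability of staying below the critical value under HA.
    return std_normal.cdf((critical - mean_a) / sqrt(var_a))

# Invented moments: the further the HA mean moves away, the smaller Psi gets.
psi_near = type_ii_error(mean0=0.0, var0=1.0, mean_a=1.0, var_a=1.0)
psi_far = type_ii_error(mean0=0.0, var0=1.0, mean_a=3.0, var_a=1.0)
print(psi_near, psi_far, 1.0 - psi_far)
```

When the moments under HA equal those under H0, the function returns 1 − α, as expected for a test of size α.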
SLIDE 14

Example: Anselin data

Moran’s I = 0.511, z(I) = 5.675, 1 − Ψ = 0.799

SLIDE 15

Exchange type algorithms

  • E.g. from a given design ξ and a set of candidate points C, exchange the pair which maximizes the decrease in Ψ (Fedorov, 1972; requires evaluation of the criterion n(N−n) times at each step).
  • Iterate as long as there is improvement.
  • Variants by Wynn, 1970, Meyer & Nachtsheim, 1995, Nguyen, 2002, etc.
  • Simulated annealing, genetic algorithms as alternatives?
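A generic best-pair exchange loop in the spirit of the Fedorov-type step described above might look as follows. It is only a sketch: `exchange_algorithm` and `toy_criterion` are hypothetical names, and the toy criterion merely stands in for Ψ so that the mechanics can be checked.

```python
from itertools import product

def exchange_algorithm(design, candidates, criterion, max_iter=100):
    """Repeatedly swap the (design point, candidate) pair that lowers the criterion most.

    Each pass evaluates the criterion n * (N - n) times, where n is the design
    size and N - n the number of remaining candidate points.
    """
    design = list(design)
    candidates = [c for c in candidates if c not in design]
    best = criterion(design)
    for _ in range(max_iter):
        best_swap = None
        for i, j in product(range(len(design)), range(len(candidates))):
            trial = design[:i] + [candidates[j]] + design[i + 1:]
            value = criterion(trial)
            if value < best:  # keep the largest decrease found in this pass
                best, best_swap = value, (i, j)
        if best_swap is None:  # no exchange improves the criterion: stop
            break
        i, j = best_swap
        design[i], candidates[j] = candidates[j], design[i]
    return design, best

def toy_criterion(points):
    """Stand-in for Psi: sum of squared distances from the value 5."""
    return sum((x - 5) ** 2 for x in points)

final_design, final_value = exchange_algorithm([0, 1, 10], range(11), toy_criterion)
print(sorted(final_design), final_value)
```

In an actual application the criterion would evaluate Ψ for the design formed by the selected locations. Because Ψ is not convex, the loop can stop in a local optimum, which is why restarts or the stochastic alternatives mentioned above may be worth considering.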

SLIDE 16

Example 2: Anselin data

SLIDE 17

Example 2: Anselin data

Moran’s I = 0.511, z(I) = 5.675, 1 − Ψ = 0.799
Moran’s I = 0.417, z(I) = 1.914, 1 − Ψ = 0.983

SLIDE 18

References (www.ifas.jku.at)

  • Anselin, Luc. 1988. Spatial Econometrics: Methods and Models. Dordrecht, Amsterdam.
  • Cliff, Andrew, and Keith Ord. 1981. Spatial Processes: Models and Applications. London: Pion.
  • Müller, Werner G. 2007. Collecting Spatial Data. Springer-Verlag Berlin Heidelberg.
  • Tiefelsdorf, Michael. 2000. Modelling Spatial Processes. Springer-Verlag Berlin Heidelberg New York.

SLIDE 19

www.endlessforest.org

Thank you for your attention!

SLIDE 20

source: O'Sullivan and Unwin, 2002

Is it Spatially Random? Tougher than it looks to decide!

  • Fact: It is observed that about twice as many people sit catty-corner rather than opposite at tables in a restaurant
  • Conclusion: psychological preference for nearness
  • In actuality: an outcome to be expected from a random process: two ways to sit opposite, but four ways to sit catty-corner
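The counting argument can be verified by brute force. The sketch assumes a square table with one seat per side, labelled N, E, S, W. These labels are made up for the illustration.

```python
from itertools import combinations

# Assumption: a square table with one seat on each side.
seats = ["N", "E", "S", "W"]
opposite_pairs = {frozenset(("N", "S")), frozenset(("E", "W"))}

# Every unordered way for two people to occupy two of the four seats.
pairs = [frozenset(pair) for pair in combinations(seats, 2)]
n_opposite = sum(pair in opposite_pairs for pair in pairs)
n_corner = len(pairs) - n_opposite

print(n_opposite, n_corner)  # 2 4
```

Under purely random seating, catty-corner arrangements are therefore twice as likely as opposite ones, which matches the observed ratio without any preference for nearness.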

SLIDE 21

source: M. Goodchild

Why Spatial Autocorrelation Matters

  • Spatial autocorrelation is of interest in its own right because it suggests the operation of a spatial process
  • Additionally, most statistical analyses are based on the assumption that the values of observations in each sample are independent of one another
    – Positive spatial autocorrelation violates this, because samples taken from nearby areas are related to each other and are not independent
  • In ordinary least squares (OLS) regression, for example, the correlation coefficients will be biased and their precision exaggerated
    – Bias implies correlation coefficients may be higher than they really are
      • They are biased because the areas with higher concentrations of events will have a greater impact on the model estimate
    – Exaggerated precision (lower standard error) implies they are more likely to be found “statistically significant”
      • They will overestimate precision because, since events tend to be concentrated, there are actually fewer independent observations than is being assumed.
SLIDE 22

Example 1: Regression on Unit Square

The error covariance matrix Ω depends on the assumed parameter values ρ and δ, i.e. $\Omega = [(I - \rho W(\delta))^T (I - \rho W(\delta))]^{-1}$.

Panels: intercept only; plane trend
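A sketch of how Ω could be assembled for a set of points on the unit square. It assumes that W(δ) is the row-standardized negative exponential weight matrix from the earlier slide, which the slide itself does not state. The function name `sar_covariance` and the 3 × 3 grid of locations are hypothetical.

```python
import numpy as np

def sar_covariance(coords, rho, delta):
    """Omega = [(I - rho W(delta))^T (I - rho W(delta))]^{-1}.

    Assumption: W(delta) holds row-standardized negative exponential weights.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    W = np.exp(-delta * dist) - np.eye(n)  # zero diagonal
    W = W / W.sum(axis=1, keepdims=True)   # row standardization
    A = np.eye(n) - rho * W
    return np.linalg.inv(A.T @ A)

# Hypothetical design: a regular 3 x 3 grid on the unit square.
grid = np.linspace(0.0, 1.0, 3)
coords = [(x, y) for x in grid for y in grid]

Omega = sar_covariance(coords, rho=0.5, delta=1.0)
print(Omega.shape)
```

For ρ = 0 the matrix reduces to the identity, i.e. independent errors. For |ρ| < 1 and a row-standardized W the inverse exists and Ω is symmetric positive definite.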