lecture 20
play

Lecture 20 Spatial Random Effects Models + Point Reference Spatial - PowerPoint PPT Presentation

Lecture 20 Spatial Random Effects Models + Point Reference Spatial Data Colin Rundel 04/03/2017 1 Spatial Random Effects Models 2 Scottish Lip Cancer Data 3 Observed Expected 61 N 60 N 59 N value 80 58 N 60 40 57 N 20


  1. Lecture 20 Spatial Random Effects Models + Point Reference Spatial Data Colin Rundel 04/03/2017 1

  2. Spatial Random Effects Models 2

  3. Scottish Lip Cancer Data 3 Observed Expected 61 ° N 60 ° N 59 ° N value 80 58 ° N 60 40 57 ° N 20 0 56 ° N 55 ° N 8 ° W 6 ° W 4 ° W 2 ° W 8 ° W 6 ° W 4 ° W 2 ° W

  4. 4 Obs/Exp % Agg Fish Forest 61 ° N 61 ° N 60 ° N 60 ° N 59 ° N 59 ° N 6 58 ° N 58 ° N 20 4 15 10 2 57 ° N 57 ° N 5 0 0 56 ° N 56 ° N 55 ° N 55 ° N 8 ° W 6 ° W 4 ° W 2 ° W 8 ° W 6 ° W 4 ° W 2 ° W

  5. Neighborhood / weight matrix 5

  6. Moran’s I ape:: Moran.I (lip_cancer$Observed / lip_cancer$Expected, $ p.value : num 0 ## : num 0.0784 $ sd ## $ expected: num -0.0182 ## $ observed: num 0.666 ## ## List of 4 weight = diag ( rowSums (W)) %*% W) %>% str () ## [1] 0.5323161 morans_I = function (y, w) w = diag ( rowSums (W)) %*% W) morans_I (y = lip_cancer$Observed / lip_cancer$Expected, ## [1] 0.2258758 morans_I (y = lip_cancer$Observed, w = diag ( rowSums (W)) %*% W) } (n/ sum (w)) * (num/denom) { 6 n = length (y) y_bar = mean (y) num = sum (w * (y-y_bar) %*% t (y-y_bar)) denom = sum ( (y-y_bar)^2 )

  7. A hierachical model for lip cancer We have observed counts of lip cancer for 56 districts in Scotland. Let y i represent the number of lip cancer for district i . 7 y i ∼ Poisson ( λ i ) log ( λ i ) = log ( E i ) + x i β + ω i ω ∼ N ( 0 , σ 2 ( D − ϕ W ) − 1 ) where E i is the expected counts for each region (and serves as an offet).

  8. Data prep & JAGS model for(i in 1:2) { ## } phi ~ dunif(0,0.99) ## tau ~ dgamma(2, 2) ## sigma2 <- 1/tau ## omega ~ dmnorm(rep(0,length(y)), tau * (D - phi*W)) ## ## } ## beta[i] ~ dnorm(0,1) ## ## D = diag ( rowSums (W)) ## y = lip_cancer$Observed ## model{ ## for(i in 1:length(y)) { ## y[i] ~ dpois(lambda[i]) ## y_pred[i] ~ dpois(lambda[i]) ## log(lambda[i]) <- log_offset[i] + X[i,] %*% beta + omega[i] ## } 8 X = model.matrix (~ scale (lip_cancer$pcaff)) log_offset = log (lip_cancer$Expected)

  9. Model Results 9 Trace of beta[1] Density of beta[1] 2.0 0.2 1.0 −0.4 0.0 35000 40000 45000 50000 55000 60000 −0.5 0.0 0.5 Iterations N = 1000 Bandwidth = 0.04924 Trace of beta[2] Density of beta[2] 0.4 3 0.2 2 1 0.0 0 35000 40000 45000 50000 55000 60000 −0.1 0.0 0.1 0.2 0.3 0.4 0.5 Iterations N = 1000 Bandwidth = 0.02561

  10. 10 Trace of phi Density of phi 1.0 12 0.7 8 4 0.4 0 35000 40000 45000 50000 55000 60000 0.4 0.5 0.6 0.7 0.8 0.9 1.0 Iterations N = 1000 Bandwidth = 0.01076 Trace of sigma2 Density of sigma2 1.5 1.0 0.5 0.0 35000 40000 45000 50000 55000 60000 0.5 1.0 1.5 2.0 Iterations N = 1000 Bandwidth = 0.06232

  11. Predictions & Residuals 11 Predicted Cases Residuals 61 ° N 61 ° N 60 ° N 60 ° N 59 ° N 59 ° N 58 ° N 58 ° N 2 30 20 0 57 ° N 57 ° N 10 −2 56 ° N 56 ° N 55 ° N 55 ° N 8 ° W 6 ° W 4 ° W 2 ° W 8 ° W 6 ° W 4 ° W 2 ° W

  12. Residuals + RMSE + Moran’s I $ observed: num 0.0642 $ p.value : num 0.305 ## : num 0.0803 $ sd ## $ expected: num -0.0182 ## ## #RMSE ## List of 4 weight = diag ( rowSums (W)) %*% W) %>% str () ape:: Moran.I (lip_cancer_pred$resid, ## [1] 0.05661104 morans_I (y = lip_cancer_pred$resid, w = diag ( rowSums (W)) %*% W) #Moran’s I ## [1] 1.498675 lip_cancer_pred$resid %>% .^2 %>% mean () %>% sqrt () 12

  13. Point Referenced Data 13

  14. Example - PM2.5 from CSN The Chemical Speciation Network are a series of air quality monitors run by 14 2007 (n=191) for just PM2.5. EPA (221 locations in 2007). We’ll look at a subset of the data from Nov 11th, 50 pm25 45 10 20 30 40 40 lat pm25 35 40 30 20 10 30 25 −120 −100 −80 long

  15. csn ## 7 11011002 -86.25637 32.40712 2007-11-14 19.70000 ## 5 11130001 -84.99917 32.47639 2007-11-14 22.60000 ## 6 40139997 -112.09577 33.50383 2007-11-14 12.30000 40191028 -110.98230 32.29515 2007-11-14 -86.58637 34.68767 2007-11-14 13.40000 7.20000 ## 8 51190007 -92.28130 34.75619 2007-11-14 12.70000 ## 9 60070002 -121.84222 39.75750 2007-11-14 10.00000 ## 10 60190008 -119.77222 36.78139 2007-11-14 32.26205 ## # ... with 181 more rows ## 4 10890014 ## # A tibble: 191 × 5 <dbl> ## site longitude latitude date pm25 ## <int> <dbl> ## 3 <dttm> <dbl> ## 1 10730023 -86.81500 33.55306 2007-11-14 19.43555 ## 2 10732003 -86.92417 33.49972 2007-11-14 26.40000 15

  16. Aside - Splines 16

  17. We want to find a function f x that best fits our observed data y n while being as smooth as possible. x 2 dx Splines in 1d - Smoothing Splines y i in the tails) with knots at the observed data locations ( x s). by a mixture of weighted natural cubic splines (cubic splines that are linear Interestingly, this minimization problem has an exact solution which is given f 2 f x i i 1 These are a mathematical analogue to the drafting splines represented n f x arg min y 1 y using a penalized regression model. 17

  18. Splines in 1d - Smoothing Splines n in the tails) with knots at the observed data locations ( x s). by a mixture of weighted natural cubic splines (cubic splines that are linear Interestingly, this minimization problem has an exact solution which is given These are a mathematical analogue to the drafting splines represented 17 using a penalized regression model. arg min We want to find a function f ( x ) that best fits our observed data y = y 1 , . . . , y n while being as smooth as possible. ∫ ( y i − f ( x i )) 2 + λ ∑ f ′′ ( x ) 2 dx f ( x ) i = 1

  19. Splines in 2d - Thin Plate Splines dx dy n sum of radial basis functions with knots at the observed data locations The solution to this equation has a natural representation using a weighted 18 n arg min spline model in two dimensions, Now imagine we have observed data of the form ( x i , y i , z i ) where we wish to predict z i given x i and y i for all i . We can naturally extend the smoothing ∫ ∫ ( ∂ 2 f ∂ x 2 + 2 ∂ 2 f ∂ x ∂ y + ∂ 2 f ) ( z i − f ( x i , y i )) 2 + λ ∑ ∂ y 2 f ( x , y ) i = 1 ∑ f ( x , y ) = w i d ( x i , y i ) 2 log d ( x i , y i ) . i = 1

  20. Fitting a TPS plot (pm25_pred) library (fields) points (coords, pch=16, cex=0.5) 19 pm25_pred = r coords = select (csn, longitude, latitude) %>% as.matrix () tps = Tps (x = coords, Y=csn$pm25) pm25_pred[cells] = predict (tps, pred_coords) 50 45 35 30 40 25 20 15 35 10 5 30 25 −120 −110 −100 −90 −80 −70

  21. Gaussin Process Models / Kriging 20

  22. Variogram uvec = seq (0, max (d)/2, length.out=50)) %>% plot () library (geoR) 21 variog (coords = coords, data = csn$pm25, messages = FALSE, coords = csn %>% select (latitude, longitude) %>% as.matrix () d = dist (coords) %>% as.matrix () 100 semivariance 60 20 0 0 5 10 15 20 25 distance

  23. variog (coords = coords, data = csn$pm25, messages = FALSE, uvec = seq (0, max (d)/4, length.out=50)) %>% plot () 22 80 semivariance 60 40 20 0 0 2 4 6 8 10 12 14 distance

  24. Isotropy / Anisotropy v4 = variog4 (coords = coords, data = csn$pm25, messages = FALSE, 23 uvec = seq (0, max (d)/4, length.out = 50)) plot (v4) 0 ° 120 45 ° 90 ° 100 135 ° semivariance 80 60 40 20 0 0 2 4 6 8 10 12 14 distance

  25. 2 exp GP Spatial Model 0 s j s i r ij n 2 0 s i w s If we assume that our data is stationary and isotropic then we can use a Gaussian s w s s y s we can also view this as a spatial random effects model where Process model to fit the data. We will assume an exponential covariance structure. 24 y ∼ N ( µ 1 , Σ) { Σ } ij = σ 2 exp ( − r ∥ s i − s j ∥ ) + σ 2 n 1 i = j

  26. GP Spatial Model If we assume that our data is stationary and isotropic then we can use a Gaussian Process model to fit the data. We will assume an exponential covariance structure. we can also view this as a spatial random effects model where 24 y ∼ N ( µ 1 , Σ) { Σ } ij = σ 2 exp ( − r ∥ s i − s j ∥ ) + σ 2 n 1 i = j y ( s ) = µ ( s ) + w ( s ) + ϵ ( s ) w ( s ) ∼ N ( 0 , Σ ′ ) ϵ ( s i ) ∼ N ( 0 , σ 2 n ) { Σ ′ } ij = σ 2 exp ( − r ∥ s i − s j ∥ )

  27. Fitting with spBayes library (spBayes) coords = select (csn, longitude, latitude) %>% as.matrix () starting = list (phi = 3/3, sigma.sq = 33, tau.sq = 17) beta.Norm = list (0, 1000), phi.Unif = c (3/max_range, 6), sigma.sq.IG = c (2, 2), tau.sq.IG = c (2, 2) ) 25 n = nrow (csn) n_samp = 20000 max_range = max ( dist (coords)) / 4 tuning = list (”phi”=0.1, ”sigma.sq”=0.1, ”tau.sq”=0.1) priors = list (

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend