Lecture 1: Spatio-temporal data & Linear Models, Colin Rundel (PowerPoint presentation)



SLIDE 1

Lecture 1

Spatio-temporal data & Linear Models

Colin Rundel 1/18/2017

SLIDE 2

Spatio-temporal data

SLIDE 3

Time Series Data - Discrete

[Plot: S&P 500 Open (^GSPC), daily opening prices, Jan 03 to Jan 17, 2017; y axis 2255 to 2270]

SLIDE 4

Time Series Data - Continuous

[Plot: FRN Measured PM25, PM25 (µg/m3); x axis Jan 01 to Feb 15, y axis 5 to 20]

SLIDE 5

Spatial Data - Areal

[Areal map: SID79 by county; 84°W to 76°W, 33.5°N to 37°N]

SLIDE 6

Spatial Data - Point referenced

[Point maps: copper, lead, and zinc concentrations, Meuse River data; x 178500 to 181500, y 329000 to 334000]

SLIDE 7

Point Pattern Data - Time

[Plot: Old Faithful Eruption Duration; x axis time (500 to 1500), y axis duration (2 to 5)]

SLIDE 8

Point Pattern Data - Space

SLIDE 9

Point Pattern Data - Space + Time

SLIDE 10

(Bayesian) Linear Models

SLIDE 11

Linear Models

Pretty much everything we are going to see in this course will fall under the umbrella of linear or generalized linear models.

$$Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma^2)$$

which we can also express using matrix notation as

$$\underset{n \times 1}{Y} = \underset{n \times p}{X}\ \underset{p \times 1}{\beta} + \underset{n \times 1}{\epsilon}, \qquad \epsilon \sim N(\underset{n \times 1}{0},\ \sigma^2 \underset{n \times n}{I_n})$$

SLIDE 12

Multivariate Normal Distribution

An n-dimensional multivariate normal distribution with covariance $\Sigma$ (positive semidefinite) can be written as

$$\underset{n \times 1}{Y} \sim N(\underset{n \times 1}{\mu},\ \underset{n \times n}{\Sigma}) \quad \text{where} \quad \{\Sigma\}_{ij} = \rho_{ij} \sigma_i \sigma_j$$

$$\begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_n \end{pmatrix},\ \begin{pmatrix} \rho_{11}\sigma_1\sigma_1 & \cdots & \rho_{1n}\sigma_1\sigma_n \\ \vdots & \ddots & \vdots \\ \rho_{n1}\sigma_n\sigma_1 & \cdots & \rho_{nn}\sigma_n\sigma_n \end{pmatrix} \right)$$

SLIDE 13

Multivariate Normal Distribution - Density

For the n-dimensional multivariate normal given on the last slide, its density is given by

$$(2\pi)^{-n/2} \det(\Sigma)^{-1/2} \exp\left( -\frac{1}{2} \underset{1 \times n}{(Y - \mu)'}\ \underset{n \times n}{\Sigma^{-1}}\ \underset{n \times 1}{(Y - \mu)} \right)$$

and its log density is given by

$$-\frac{n}{2} \log 2\pi - \frac{1}{2} \log \det(\Sigma) - \frac{1}{2} \underset{1 \times n}{(Y - \mu)'}\ \underset{n \times n}{\Sigma^{-1}}\ \underset{n \times 1}{(Y - \mu)}$$

SLIDE 14

A Simple Linear Regression Example

Let's generate some simulated data where the underlying model is known and see how various regression procedures perform.

$$\beta_0 = 0.7, \quad \beta_1 = 1.5, \quad \beta_2 = -2.2, \quad \beta_3 = 0.1$$

$$n = 100, \qquad \epsilon_i \sim N(0, 1)$$

SLIDE 15

Generating the data

set.seed(01172017)

n = 100
beta = c(0.7, 1.5, -2.2, 0.1)
eps = rnorm(n)

X0 = rep(1, n)
X1 = rt(n, df=5)
X2 = rt(n, df=5)
X3 = rt(n, df=5)

X = cbind(X0, X1, X2, X3)
Y = X %*% beta + eps
d = data.frame(Y, X[,-1])

SLIDE 16

Least squares fit

Let $\hat{Y}$ be our estimate for $Y$ based on our estimate of $\beta$,

$$\hat{Y} = \hat\beta_0 + \hat\beta_1 X_1 + \hat\beta_2 X_2 + \hat\beta_3 X_3 = X \hat\beta$$

The least squares estimate, $\hat\beta_{ls}$, is given by

$$\hat\beta_{ls} = \underset{\beta}{\arg\min} \sum_{i=1}^n (Y_i - X_{i\cdot}\beta)^2$$

With a bit of calculus and algebra we can derive

$$\hat\beta_{ls} = (X^t X)^{-1} X^t Y$$

SLIDE 19

Maximum Likelihood
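
A brief sketch of the standard connection between maximum likelihood and least squares under this model (the derivation is not written out on the slide):

```latex
% Log likelihood under Y ~ N(X beta, sigma^2 I_n)
\ell(\beta, \sigma^2 \mid Y)
  = -\frac{n}{2} \log (2\pi\sigma^2)
    - \frac{1}{2\sigma^2} (Y - X\beta)'(Y - X\beta)

% For any fixed sigma^2, maximizing over beta is equivalent to
% minimizing the residual sum of squares, so
\hat\beta_{ml} = (X^t X)^{-1} X^t Y = \hat\beta_{ls}

% Plugging beta_hat back in and maximizing over sigma^2 gives
\hat\sigma^2_{ml} = \frac{1}{n} (Y - X\hat\beta)'(Y - X\hat\beta)
```

So under normal errors the MLE of $\beta$ coincides with the least squares estimate.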

SLIDE 20

Frequentist Fit

lm(Y ~ ., data=d)$coefficients

## (Intercept)          X1          X2          X3
##  0.73726738  1.65321096 -2.16499958  0.07996257

(beta_hat = solve(t(X) %*% X, t(X)) %*% Y)

##          [,1]
## X0  0.73726738
## X1  1.65321096
## X2 -2.16499958
## X3  0.07996257

SLIDE 21

Bayesian Model

$$Y_1, \ldots, Y_{100} \mid \beta, \sigma^2 \sim N(X_{i\cdot}\beta,\ \sigma^2)$$

$$\beta_0, \beta_1, \beta_2, \beta_3 \sim N(0,\ \sigma^2_\beta = 100)$$

$$\tau^2 = 1/\sigma^2 \sim \text{Gamma}(a = 1,\ b = 1)$$

SLIDE 22

Deriving the posterior

$$[\beta_0, \beta_1, \beta_2, \beta_3, \sigma^2 \mid Y] = \frac{[Y \mid \beta, \sigma^2]\,[\beta, \sigma^2]}{[Y]} \propto [Y \mid \beta, \sigma^2]\,[\beta]\,[\sigma^2]$$

where,

$$[Y \mid \beta, \sigma^2] = (2\pi\sigma^2)^{-n/2} \exp\left( -\frac{\sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_{i,1} - \beta_2 X_{i,2} - \beta_3 X_{i,3})^2}{2\sigma^2} \right)$$

$$[\beta_0, \beta_1, \beta_2, \beta_3 \mid \sigma^2_\beta] = (2\pi\sigma^2_\beta)^{-4/2} \exp\left( -\frac{\sum_{i=0}^3 \beta_i^2}{2\sigma^2_\beta} \right)$$

$$[\sigma^2 \mid a, b] = \frac{b^a}{\Gamma(a)}\,(\sigma^2)^{-a-1} \exp\left( -\frac{b}{\sigma^2} \right)$$

SLIDE 26

Deriving the posterior (cont.)

$$\begin{aligned}
[\beta_0, \beta_1, \beta_2, \beta_3, \sigma^2 \mid Y] \propto{}& (2\pi\sigma^2)^{-n/2} \exp\left( -\frac{\sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_{i,1} - \beta_2 X_{i,2} - \beta_3 X_{i,3})^2}{2\sigma^2} \right) \\
&\times (2\pi\sigma^2_\beta)^{-4/2} \exp\left( -\frac{\beta_0^2 + \beta_1^2 + \beta_2^2 + \beta_3^2}{2\sigma^2_\beta} \right) \\
&\times \frac{b^a}{\Gamma(a)}\,(\sigma^2)^{-a-1} \exp\left( -\frac{b}{\sigma^2} \right)
\end{aligned}$$

SLIDE 27

Deriving the Gibbs sampler (σ2 step)
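
The slide leaves this step blank; a sketch of the standard conjugate update (an assumption about what was worked through in lecture) goes as follows, collecting the terms of the joint posterior that involve $\sigma^2$:

```latex
% Terms of the joint posterior involving sigma^2, writing SSR for the
% residual sum of squares at the current beta:
[\sigma^2 \mid \beta, Y]
  \propto (\sigma^2)^{-n/2} \exp\left( -\frac{\mathrm{SSR}}{2\sigma^2} \right)
          (\sigma^2)^{-a-1} \exp\left( -\frac{b}{\sigma^2} \right)
  = (\sigma^2)^{-(a + n/2) - 1}
    \exp\left( -\frac{b + \mathrm{SSR}/2}{\sigma^2} \right)

% This is the kernel of an inverse gamma distribution, so
\sigma^2 \mid \beta, Y \sim \text{Inv-Gamma}\!\left( a + \frac{n}{2},\
    b + \frac{1}{2} \sum_{i=1}^n (Y_i - X_{i\cdot}\beta)^2 \right)
```

Equivalently, $\tau^2 = 1/\sigma^2 \mid \beta, Y \sim \text{Gamma}(a + n/2,\ b + \mathrm{SSR}/2)$, which is the form most samplers draw from.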

SLIDE 28

Deriving the Gibbs sampler (βi step)
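
This slide is likewise blank; a sketch of the corresponding normal update (again an assumption about the in-lecture derivation) isolates one coefficient $\beta_k$ and completes the square:

```latex
% Partial residual removing the contribution of the other coefficients:
r_i^{(k)} = Y_i - \sum_{j \neq k} \beta_j X_{i,j}

% Terms of the joint posterior involving beta_k:
[\beta_k \mid \beta_{-k}, \sigma^2, Y]
  \propto \exp\left( -\frac{\sum_{i=1}^n (r_i^{(k)} - \beta_k X_{i,k})^2}{2\sigma^2} \right)
          \exp\left( -\frac{\beta_k^2}{2\sigma^2_\beta} \right)

% Completing the square in beta_k gives a normal full conditional:
\beta_k \mid \beta_{-k}, \sigma^2, Y \sim N(\mu_k,\ s_k^2), \qquad
s_k^2 = \left( \frac{\sum_i X_{i,k}^2}{\sigma^2} + \frac{1}{\sigma^2_\beta} \right)^{-1},
\quad
\mu_k = s_k^2\, \frac{\sum_i X_{i,k}\, r_i^{(k)}}{\sigma^2}
```

Cycling these draws over $\beta_0, \ldots, \beta_3$ and $\sigma^2$ yields the Gibbs sampler for the model on slide 21.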
