Lecture 13 Gaussian Process Models - Part 2 Colin Rundel - - PowerPoint PPT Presentation

lecture 13
SMART_READER_LITE
LIVE PREVIEW

Lecture 13 Gaussian Process Models - Part 2 Colin Rundel - - PowerPoint PPT Presentation

Lecture 13 Gaussian Process Models - Part 2 Colin Rundel 03/01/2017 1 EDA and GPs 2 t i t j t i t j t i t j Variogram 2 Y t j Y t i E 2 can simplify to for all i and j ) then we t j t i If the process has constant mean (e.g. is called


slide-1
SLIDE 1

Lecture 13

Gaussian Process Models - Part 2

Colin Rundel 03/01/2017

1

slide-2
SLIDE 2

EDA and GPs

2

slide-3
SLIDE 3

Variogram

From the spatial modeling literature the typical approach is to examine an empirical variogram, first we’ll look at the theoretical variogram before looking at the connection to the covariance. Variogram: 2 ti tj Var Y ti Y tj E Y ti ti Y tj tj

2

where ti tj is called the semivariogram. If the process has constant mean (e.g. ti tj for all i and j) then we can simplify to 2 ti tj E Y ti Y tj

2 3

slide-4
SLIDE 4

Variogram

From the spatial modeling literature the typical approach is to examine an empirical variogram, first we’ll look at the theoretical variogram before looking at the connection to the covariance. Variogram: 2γ(ti, tj) = Var(Y(ti) − Y(tj))

= E([(Y(ti) − µ(ti)) − (Y(tj) − µ(tj))]2)

where γ(ti, tj) is called the semivariogram. If the process has constant mean (e.g. ti tj for all i and j) then we can simplify to 2 ti tj E Y ti Y tj

2 3

slide-5
SLIDE 5

Variogram

From the spatial modeling literature the typical approach is to examine an empirical variogram, first we’ll look at the theoretical variogram before looking at the connection to the covariance. Variogram: 2γ(ti, tj) = Var(Y(ti) − Y(tj))

= E([(Y(ti) − µ(ti)) − (Y(tj) − µ(tj))]2)

where γ(ti, tj) is called the semivariogram. If the process has constant mean (e.g. µ(ti) = µ(tj) for all i and j) then we can simplify to 2γ(ti, tj) = E([Y(ti) − Y(tj)]2)

3

slide-6
SLIDE 6

Some Properties of the theoretical Variogram / Semivariogram

  • both are non-negative

γ(ti, tj) ≥ 0

  • both are 0 at distance 0

γ(ti, ti) = 0

  • both are symmetric

γ(ti, tj) = γ(tj, ti)

  • there is no dependence if

2γ(ti, tj) = Var(Y(ti)) + Var(Y(tj)) for all i ̸= j

  • if the process is not stationary

2γ(ti, tj) = Var( Y(ti))

+ Var(

Y(tj))

− 2 Cov(

Y(ti), Y(tj))

  • if the process is stationary

2γ(ti, tj) = 2Var( Y(ti))

− 2 Cov(

Y(ti), Y(tj))

4

slide-7
SLIDE 7

Empirical Semivariogram

We will assume that our process of interest is stationary, in which case we will parameterize the semivariagram in terms of h = |ti − tj|. Empirical Semivariogram:

ˆ γ(h) =

1 2 N(h)

|ti−tj|∈(h−ϵ,h+ϵ)

(Y(ti) − Y(tj))2

Practically, for any data set with n observations there are

n 2

n possible data pairs to examine. Each individually is not very informative, so we aggregate into bins and calculate the empirical semivariogram for each bin.

5

slide-8
SLIDE 8

Empirical Semivariogram

We will assume that our process of interest is stationary, in which case we will parameterize the semivariagram in terms of h = |ti − tj|. Empirical Semivariogram:

ˆ γ(h) =

1 2 N(h)

|ti−tj|∈(h−ϵ,h+ϵ)

(Y(ti) − Y(tj))2

Practically, for any data set with n observations there are

(n

2

) + n possible

data pairs to examine. Each individually is not very informative, so we aggregate into bins and calculate the empirical semivariogram for each bin.

5

slide-9
SLIDE 9

Connection to Covariance

6

slide-10
SLIDE 10

Covariance vs Semivariogram - Exponential

exp cov exp semivar 0.0 0.5 1.0 1.5 0.0 0.5 1.0 1.5 0.00 0.25 0.50 0.75 1.00

d y l

1 1.7 2.3 3 3.7 4.3 5 5.7 6.3 7

7

slide-11
SLIDE 11

Covariance vs Semivariogram - Square Exponential

sq exp cov sq exp semivar 0.0 0.5 1.0 1.5 0.0 0.5 1.0 1.5 0.00 0.25 0.50 0.75 1.00

d y l

1 1.7 2.3 3 3.7 4.3 5 5.7 6.3 7

8

slide-12
SLIDE 12

From last time

−2 −1 1 0.00 0.25 0.50 0.75 1.00

t y

9

slide-13
SLIDE 13

Empirical semivariogram - no bins / cloud

2 4 0.00 0.25 0.50 0.75 1.00

h gamma

10

slide-14
SLIDE 14

Empirical semivariogram (binned)

binwidth=0.1 binwidth=0.15 binwidth=0.05 binwidth=0.075 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 1 2 3 4 1 2 3 4

h gamma

11

slide-15
SLIDE 15

Empirical semivariogram (binned + n)

binwidth=0.1 binwidth=0.15 binwidth=0.05 binwidth=0.075 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 1 2 3 4 1 2 3 4

h gamma n

5 10 15 20 25

12

slide-16
SLIDE 16

Theoretical vs empirical semivariogram

After fitting the model last time we came up with a posterior median of

σ2 = 1.89 and l = 5.86 for a square exponential covariance.

Cov h

2 exp

h l 2 h

2 2 exp

h l 2 1 89 1 89 exp 5 86 h 2 13

slide-17
SLIDE 17

Theoretical vs empirical semivariogram

After fitting the model last time we came up with a posterior median of

σ2 = 1.89 and l = 5.86 for a square exponential covariance.

Cov(h) = σ2 exp(

− (h l)2) γ(h) = σ2 − σ2 exp( − (h l)2) = 1.89 − 1.89 exp( − (5.86 h)2)

13

slide-18
SLIDE 18

Theoretical vs empirical semivariogram

After fitting the model last time we came up with a posterior median of

σ2 = 1.89 and l = 5.86 for a square exponential covariance.

Cov(h) = σ2 exp(

− (h l)2) γ(h) = σ2 − σ2 exp( − (h l)2) = 1.89 − 1.89 exp( − (5.86 h)2)

binwidth=0.05 binwidth=0.1 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 1 2 3

h gamma

13

slide-19
SLIDE 19

Variogram features

14

slide-20
SLIDE 20

PM2.5 Example

15

slide-21
SLIDE 21

FRN Data

Measured PM2.5 data from an EPA monitoring station in Columbia, NJ.

5 10 15 20 Jan 2007 Apr 2007 Jul 2007 Oct 2007 Jan 2008

date pm25

16

slide-22
SLIDE 22

FRN Data

site latitude longitude pm25 date day 230031011 46.682

  • 68.016

8.9 2007-01-03 3 230031011 46.682

  • 68.016

10.4 2007-01-06 6 230031011 46.682

  • 68.016

9.7 2007-01-15 15 230031011 46.682

  • 68.016

7.5 2007-01-18 18 230031011 46.682

  • 68.016

4.6 2007-01-21 21 230031011 46.682

  • 68.016

9.5 2007-01-24 24 230031011 46.682

  • 68.016

9.0 2007-01-27 27 230031011 46.682

  • 68.016

16.2 2007-01-30 30 230031011 46.682

  • 68.016

9.1 2007-02-05 36 230031011 46.682

  • 68.016

19.9 2007-02-11 42 230031011 46.682

  • 68.016

11.5 2007-02-14 45 230031011 46.682

  • 68.016

6.5 2007-02-17 48 230031011 46.682

  • 68.016

14.7 2007-02-23 54 230031011 46.682

  • 68.016

14.1 2007-02-26 57 230031011 46.682

  • 68.016

13.3 2007-03-01 60 230031011 46.682

  • 68.016

8.6 2007-03-04 63 230031011 46.682

  • 68.016

9.0 2007-03-07 66 230031011 46.682

  • 68.016

14.0 2007-03-10 69 230031011 46.682

  • 68.016

8.6 2007-03-13 72 230031011 46.682

  • 68.016

10.3 2007-03-16 75

17

slide-23
SLIDE 23

Mean Model

5 10 15 20 100 200 300

day pm25

## ## Call: ## lm(formula = pm25 ~ day + I(day^2), data = pm25) ## ## Coefficients: ## (Intercept) day I(day^2) ## 12.9644351

  • 0.0724639

0.0001751 ## ## Call: ## lm(formula = pm25 ~ day + I(day^2), data = pm25) ## ## Coefficients: ## (Intercept) day I(day^2) ## 12.9644351

  • 0.0724639

0.0001751

18

slide-24
SLIDE 24

Detrended Residuals

−5 5 10 100 200 300

day resid

Residuals 19

slide-25
SLIDE 25

Empirical Variogram

binwidth=3 binwidth=6 100 200 300 100 200 300 10 20 30 40

h gamma

50 100 150 200

n

20

slide-26
SLIDE 26

Empirical Variogram

binwidth=6 binwidth=9 50 100 150 50 100 150 5 10 15

h gamma

21

slide-27
SLIDE 27

Model

What does the model we are trying to fit actually look like? y d d w d w where d

1 d 2 d2

w d w

2 w 22

slide-28
SLIDE 28

Model

What does the model we are trying to fit actually look like? y(d) = µ(d) + w(d) + w where

µ(d) = β0 + β1 d + β2 d2

w(d) ∼ GP(0, Σ) w ∼ N(0, σ2

w) 22

slide-29
SLIDE 29

JAGS Model

## model{ ## y ~ dmnorm(mu, inverse(Sigma)) ## ## for (i in 1:N) { ## mu[i] <- beta[1]+ beta[2] * x[i] + beta[3] * x[i]^2 ## } ## ## for (i in 1:(N-1)) { ## for (j in (i+1):N) { ## Sigma[i,j] <- sigma2 * exp(- pow(l*d[i,j],2)) ## Sigma[j,i] <- Sigma[i,j] ## } ## } ## ## for (k in 1:N) { ## Sigma[k,k] <- sigma2 + sigma2_w ## } ## ## for (i in 1:3) { ## beta[i] ~ dt(0, 2.5, 1) ## } ## sigma2_w ~ dnorm(10, 1/25) T(0,) ## sigma2 ~ dnorm(10, 1/25) T(0,) ## l ~ dt(0, 2.5, 1) T(0,) ## } 23

slide-30
SLIDE 30

Posterior - Betas

15000 20000 25000 30000 35000 40000 10 Iterations

Trace of beta[1]

−5 5 10 15 20 0.00 0.08

Density of beta[1]

N = 715 Bandwidth = 1.543 15000 20000 25000 30000 35000 40000 −0.15 0.15 Iterations

Trace of beta[2]

−0.2 −0.1 0.0 0.1 0.2 4 8

Density of beta[2]

N = 715 Bandwidth = 0.01645 15000 20000 25000 30000 35000 40000 −4e−04 Iterations

Trace of beta[3]

−4e−04 −2e−04 0e+00 2e−04 4e−04 2500

Density of beta[3]

N = 715 Bandwidth = 3.873e−05

24

slide-31
SLIDE 31

Posterior - Covariance Parameters

15000 20000 25000 30000 35000 40000 0.0 1.0 Iterations

Trace of l

0.0 0.5 1.0 1.5 10

Density of l

N = 715 Bandwidth = 0.01888 15000 20000 25000 30000 35000 40000 15 Iterations

Trace of sigma2

5 10 15 20 25 30 0.00 0.05

Density of sigma2

N = 715 Bandwidth = 1.471 15000 20000 25000 30000 35000 40000 5 15 Iterations

Trace of sigma2_w

5 10 15 0.00 0.20

Density of sigma2_w

N = 715 Bandwidth = 0.5303

25

slide-32
SLIDE 32

Posterior

## # A tibble: 6 × 5 ## param post_mean post_med post_lower post_upper ## * <chr> <dbl> <dbl> <dbl> <dbl> ## 1 beta[1] 7.283488e+00 8.667009e+00 -0.7461648059 1.503065e+01 ## 2 beta[2] -1.627421e-02 -2.817415e-02 -0.0988863015 1.026401e-01 ## 3 beta[3] 5.858818e-05 8.569993e-05 -0.0002481874 2.567976e-04 ## 4 l 1.277712e-01 2.433287e-02 0.0060909947 8.443888e-01 ## 5 sigma2 9.379213e+00 9.016621e+00 1.5643832453 1.979094e+01 ## 6 sigma2_w 1.088809e+01 1.116626e+01 4.2665826402 1.448447e+01

26

slide-33
SLIDE 33

Fitted Variogram

27

slide-34
SLIDE 34

Empirical + Fitted Variogram

binwidth=3 binwidth=6 50 100 150 50 100 150 5 10 15 20

h gamma

28

slide-35
SLIDE 35

Fitted Model + Predictions

5 10 15 20 100 200 300

day pm25

29

slide-36
SLIDE 36

Empirical Variogram (again)

binwidth=3 binwidth=6 50 100 150 50 100 150 5 10 15

h gamma

30

slide-37
SLIDE 37

Empirical Variogram Model

binwidth=3 binwidth=6 50 100 150 50 100 150 5 10 15

h gamma

31

slide-38
SLIDE 38

Empirical Variogram Model + Predictions

5 10 15 20 100 200 300

day pm25

32