Lecture 14: Covariance Functions (3/08/2018)


SLIDE 1

Lecture 14

Covariance Functions

3/08/2018

SLIDE 2

More on Covariance Functions

SLIDE 3

Nugget Covariance

$$\text{Cov}(y_{t_i}, y_{t_j}) = \sigma^2 \, 1_{\{h=0\}} \quad \text{where } h = |t_i - t_j|$$

[Figure: nugget covariance C against h; two draws (Draw 1, Draw 2) of y against x.]

SLIDE 4

(- / Power / Square) Exponential Covariance

$$\text{Cov}(y_{t_i}, y_{t_j}) = \sigma^2 \exp\left(-(h \cdot l)^p\right) \quad \text{where } h = |t_i - t_j|$$

[Figure: covariance C against h and a draw of y against x for the Exponential, Powered Exponential (p=1.5), and Square Exponential covariances; l=12, sigma2=1.]

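The three named covariances on this slide differ only in the exponent p: p = 1 gives the exponential and p = 2 the square exponential. A minimal sketch of the formula follows; the function name `pow_exp_cov` and the restriction 0 < p <= 2 (the usual validity range for this family) are additions for illustration and do not appear on the slide.

```python
import math

def pow_exp_cov(h, sigma2=1.0, l=1.0, p=2.0):
    """Powered exponential covariance: sigma2 * exp(-(|h| * l)^p)."""
    # Assumption: the family is only a valid covariance for 0 < p <= 2.
    if not 0 < p <= 2:
        raise ValueError("p must lie in (0, 2]")
    return sigma2 * math.exp(-((abs(h) * l) ** p))

# p = 1: exponential, p = 1.5: powered exponential, p = 2: square exponential,
# using l = 12 and sigma2 = 1 as in the slide's figure caption.
for p in (1.0, 1.5, 2.0):
    values = [round(pow_exp_cov(h, l=12, p=p), 4) for h in (0.0, 0.05, 0.1, 0.25)]
    print(p, values)
```

Here l multiplies h, so a larger l means faster decay, matching the slide's parameterization.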
SLIDE 5

Matérn Covariance

$$\text{Cov}(y_{t_i}, y_{t_j}) = \sigma^2 \, \frac{2^{1-\nu}}{\Gamma(\nu)} \left(\sqrt{2\nu}\, h \cdot l\right)^{\nu} K_\nu\left(\sqrt{2\nu}\, h \cdot l\right) \quad \text{where } h = |t_i - t_j|$$

[Figure: covariance C against h and a draw of y against x for the Matérn covariance with v=1/2, v=3/2, and v=5/2; l=2, sigma2=1.]

SLIDE 6

Matérn Covariance

  • $K_\nu$ is the modified Bessel function of the second kind.
  • A Gaussian process with Matérn covariance has sample functions that are $\lceil \nu \rceil - 1$ times differentiable.
  • When $\nu = 1/2 + p$ for $p \in \mathbb{N}^+$, the Matérn has a simplified form (the product of an exponential and a polynomial of order $p$).
  • When $\nu = 1/2$ the Matérn is equivalent to the exponential covariance.
  • As $\nu \to \infty$ the Matérn converges to the square exponential covariance.

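The simplified half-integer forms mentioned above can be written without any Bessel function. The sketch below uses the standard closed forms for v = 1/2, 3/2, and 5/2 in the slide's parameterization (x = sqrt(2v) * h * l); the function name `matern_cov` is hypothetical and other values of v are deliberately not handled.

```python
import math

def matern_cov(h, nu, sigma2=1.0, l=1.0):
    """Matern covariance for nu in {1/2, 3/2, 5/2}: polynomial(x) * exp(-x)."""
    x = math.sqrt(2.0 * nu) * abs(h) * l
    if nu == 0.5:
        poly = 1.0                      # order-0 polynomial: plain exponential
    elif nu == 1.5:
        poly = 1.0 + x                  # order-1 polynomial
    elif nu == 2.5:
        poly = 1.0 + x + x * x / 3.0    # order-2 polynomial
    else:
        raise ValueError("this sketch only covers nu = 1/2, 3/2, 5/2")
    return sigma2 * poly * math.exp(-x)

# With nu = 1/2 the result is sigma2 * exp(-h * l), the exponential covariance.
print(matern_cov(0.3, 0.5, l=2.0), math.exp(-0.6))
```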
SLIDE 7

Rational Quadratic Covariance

$$\text{Cov}(y_{t_i}, y_{t_j}) = \sigma^2 \left(1 + \frac{h^2 \, l^2}{\alpha}\right)^{-\alpha} \quad \text{where } h = |t_i - t_j|$$

[Figure: covariance against h and draws of y against x for the Rational Quadratic covariance; l=12, sigma2=1. The legend lists alpha=1, 3, 10 while the panel titles list alpha=1, 10, 100.]

SLIDE 8

Rational Quadratic Covariance

  • Is a scale mixture of squared exponential covariance functions with different characteristic length-scales ($l$).
  • As $\alpha \to \infty$ the rational quadratic converges to the square exponential covariance.
  • Has sample functions that are infinitely differentiable for any value of $\alpha$.

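The limiting behaviour in alpha can be checked numerically. The sketch below compares the rational quadratic with the square exponential (the p = 2 case of the earlier exponential family) at a single lag; the function names and the chosen lag are illustrative assumptions.

```python
import math

def rq_cov(h, alpha, sigma2=1.0, l=1.0):
    """Rational quadratic covariance: sigma2 * (1 + (h*l)^2 / alpha)^(-alpha)."""
    return sigma2 * (1.0 + (h * l) ** 2 / alpha) ** (-alpha)

def sq_exp_cov(h, sigma2=1.0, l=1.0):
    """Square exponential covariance: sigma2 * exp(-(h*l)^2)."""
    return sigma2 * math.exp(-((h * l) ** 2))

h, l = 0.1, 12.0   # illustrative lag, with l = 12 as in the slide's figure
gaps = [abs(rq_cov(h, a, l=l) - sq_exp_cov(h, l=l)) for a in (1, 10, 100, 10000)]
print(gaps)        # the gap shrinks as alpha grows
```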
SLIDE 9

Spherical Covariance

$$
\text{Cov}(y_{t_i}, y_{t_j}) =
\begin{cases}
\sigma^2 \left(1 - \frac{3}{2}\, h \cdot l + \frac{1}{2} (h \cdot l)^3\right) & \text{if } 0 \le h < 1/l \\
0 & \text{otherwise}
\end{cases}
\quad \text{where } h = |t_i - t_j|
$$

[Figure: covariance against h and draws of y against x for the Spherical covariance with l=1, l=3, and l=10; sigma2=1.]

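Unlike the previous families, the spherical covariance is exactly zero beyond the range 1/l. A minimal sketch of the piecewise definition, with a hypothetical function name:

```python
def spherical_cov(h, sigma2=1.0, l=1.0):
    """Spherical covariance: cubic in h*l inside the range 1/l, zero outside."""
    x = abs(h) * l
    if x >= 1.0:
        return 0.0
    return sigma2 * (1.0 - 1.5 * x + 0.5 * x ** 3)

# Larger l means a shorter range: with l = 10 the covariance vanishes for h >= 0.1.
for l in (1, 3, 10):
    print(l, [round(spherical_cov(h, l=l), 4) for h in (0.0, 0.05, 0.2, 0.5)])
```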
SLIDE 10

Periodic Covariance

$$\text{Cov}(y_{t_i}, y_{t_j}) = \sigma^2 \exp\left(-2\, l^2 \sin^2\left(\frac{\pi h}{p}\right)\right) \quad \text{where } h = |t_i - t_j|$$

[Figure: covariance against h and draws of y against x for the Periodic covariance with p=1, p=2, and p=3; l=2, sigma2=1.]

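The defining property of this covariance is that it repeats exactly every p units of lag. The sketch below (function name hypothetical) evaluates the formula and shows that lags h and h + p give the same value, with the minimum exp(-2 l^2) reached at half a period.

```python
import math

def periodic_cov(h, sigma2=1.0, l=1.0, p=1.0):
    """Periodic covariance: sigma2 * exp(-2 * l^2 * sin^2(pi * h / p))."""
    return sigma2 * math.exp(-2.0 * l ** 2 * math.sin(math.pi * h / p) ** 2)

# l = 2 as in the slide's figure, period p = 2.
print(periodic_cov(0.3, l=2.0, p=2.0), periodic_cov(2.3, l=2.0, p=2.0))  # equal up to rounding
print(periodic_cov(1.0, l=2.0, p=2.0), math.exp(-8.0))                   # minimum at h = p / 2
```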
SLIDE 11

Linear Covariance

$$\text{Cov}(y_{t_i}, y_{t_j}) = \sigma_b^2 + \sigma_v^2 \, (t_i - c)(t_j - c)$$

[Figure: y against x for the linear covariance.]

SLIDE 12

Combining Covariances

If we define two valid covariance functions, $\text{Cov}_a(y_{t_i}, y_{t_j})$ and $\text{Cov}_b(y_{t_i}, y_{t_j})$, then the following are also valid covariance functions:

$$\text{Cov}_a(y_{t_i}, y_{t_j}) + \text{Cov}_b(y_{t_i}, y_{t_j})$$

$$\text{Cov}_a(y_{t_i}, y_{t_j}) \times \text{Cov}_b(y_{t_i}, y_{t_j})$$

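One way to see why both rules hold is to look at the covariance matrices $\Sigma_a$ and $\Sigma_b$ that the two functions produce at any finite set of points. The sketch below is a supporting argument rather than something stated on the slide; the product case leans on the Schur product theorem, with $\circ$ denoting the elementwise product.

```latex
\begin{aligned}
x^\top (\Sigma_a + \Sigma_b)\, x
  &= x^\top \Sigma_a\, x + x^\top \Sigma_b\, x \;\ge\; 0
  && \text{(sum of two nonnegative terms)} \\
x^\top (\Sigma_a \circ \Sigma_b)\, x
  &\ge 0
  && \text{(Schur product theorem)}
\end{aligned}
```

Both combinations are therefore positive semi-definite for every vector $x$, which is the requirement for a valid covariance.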
SLIDE 13

Linear × Linear → Quadratic

$$\text{Cov}_a(y_{t_i}, y_{t_j}) = 1 + 2\,(t_i \times t_j)$$

$$\text{Cov}_b(y_{t_i}, y_{t_j}) = 2 + 1\,(t_i \times t_j)$$

[Figure: y against x for Cov_a * Cov_b.]

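Multiplying the two linear covariances on this slide out term by term shows where the quadratic behaviour comes from. This is just the algebra, worked through as a sketch:

```latex
\begin{aligned}
\text{Cov}_a \times \text{Cov}_b
  &= (1 + 2\, t_i t_j)(2 + t_i t_j) \\
  &= 2 + t_i t_j + 4\, t_i t_j + 2\, t_i^2 t_j^2 \\
  &= 2 + 5\, t_i t_j + 2\, t_i^2 t_j^2
\end{aligned}
```

The product contains a constant term, a linear term $t_i t_j$, and a new term $t_i^2 t_j^2$, so it behaves like a covariance over the features $1$, $t$, and $t^2$.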
SLIDE 14

Linear × Periodic

$$\text{Cov}_a(y_{t_i}, y_{t_j}) = 1 + 1\,(t_i \times t_j)$$

$$\text{Cov}_b(y_{t_i}, y_{t_j}) = \exp\left(-2 \sin^2(2\pi\, h)\right)$$

[Figure: y against x for Cov_a * Cov_b.]

SLIDE 15

Linear + Periodic

$$\text{Cov}_a(y_{t_i}, y_{t_j}) = 1 + 1\,(t_i \times t_j)$$

$$\text{Cov}_b(h = |t_i - t_j|) = \exp\left(-2 \sin^2(2\pi\, h)\right)$$

[Figure: two draws (Draw 1, Draw 2) of y against x for Cov_a + Cov_b.]

SLIDE 16

Sq Exp × Periodic → Locally Periodic

$$\text{Cov}_a(h = |t_i - t_j|) = \exp\left(-(1/3)\, h^2\right)$$

$$\text{Cov}_b(h = |t_i - t_j|) = \exp\left(-2 \sin^2(\pi\, h)\right)$$

[Figure: y against x for Cov_a * Cov_b.]

SLIDE 17

Sq Exp (short) + Sq Exp (long)

$$\text{Cov}_a(h = |t_i - t_j|) = (1/4) \exp\left(-4\sqrt{3}\, h^2\right)$$

$$\text{Cov}_b(h = |t_i - t_j|) = \exp\left(-(\sqrt{3}/2)\, h^2\right)$$

[Figure: y against x for Cov_a + Cov_b.]

SLIDE 18

Sq Exp (short) + Sq Exp (long) (Seen another way)

[Figure: y against x in three panels: Cov_A (short), Cov_B (long), and Cov_A + Cov_B.]

SLIDE 19

BDA3 example

SLIDE 20

BDA3

http://research.cs.aalto.fi/pml/software/gpstuff/demo_births.shtml

SLIDE 21

Births (one year)

  • 1. Smooth long term trend (sq exp cov)
  • 2. Seven day periodic trend with decay (periodic × sq exp cov)
  • 3. Constant mean

SLIDE 22

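Component Contributions

We can view our GP in the following ways,

$$\mathbf{y} \sim \mathcal{N}(\boldsymbol{\mu},\ \boldsymbol{\Sigma}_1 + \boldsymbol{\Sigma}_2 + \sigma^2 \mathbf{I})$$

but with appropriate conditioning we can also think of $\mathbf{y}$ as being the sum of multiple independent GPs

$$\mathbf{y} = \boldsymbol{\mu} + w_1(\mathbf{t}) + w_2(\mathbf{t}) + w_3(\mathbf{t})$$

where

$$w_1(\mathbf{t}) \sim \mathcal{N}(0, \boldsymbol{\Sigma}_1) \qquad w_2(\mathbf{t}) \sim \mathcal{N}(0, \boldsymbol{\Sigma}_2) \qquad w_3(\mathbf{t}) \sim \mathcal{N}(0, \sigma^2 \mathbf{I})$$
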
SLIDE 23

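Decomposition of Covariance Components

$$
\begin{bmatrix} \mathbf{y} \\ w_1 \\ w_2 \end{bmatrix}
\sim \mathcal{N}\left(
\begin{bmatrix} \boldsymbol{\mu} \\ 0 \\ 0 \end{bmatrix},\
\begin{bmatrix}
\Sigma_1 + \Sigma_2 + \sigma^2 \mathbf{I} & \Sigma_1 & \Sigma_2 \\
\Sigma_1 & \Sigma_1 & 0 \\
\Sigma_2 & 0 & \Sigma_2
\end{bmatrix}
\right)
$$

therefore

$$
\begin{aligned}
w_1 \mid \mathbf{y}, \boldsymbol{\mu}, \boldsymbol{\theta} &\sim \mathcal{N}(\boldsymbol{\mu}_{cond},\ \boldsymbol{\Sigma}_{cond}) \\
\boldsymbol{\mu}_{cond} &= 0 + \Sigma_1 \, (\Sigma_1 + \Sigma_2 + \sigma^2 \mathbf{I})^{-1} (\mathbf{y} - \boldsymbol{\mu}) \\
\boldsymbol{\Sigma}_{cond} &= \Sigma_1 - \Sigma_1 \, (\Sigma_1 + \Sigma_2 + \sigma^2 \mathbf{I})^{-1} \, \Sigma_1^{t}
\end{aligned}
$$
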
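The conditional mean formula can be tried on a toy problem. The sketch below is not the births or CO2 analysis: the time points, observations, mean, and kernel parameters are all made-up values, and `solve` is an ad hoc Gaussian elimination helper so the example needs nothing beyond the standard library. It computes the conditional mean of each component given y; the three conditional means add back up to y - mu.

```python
import math

def sq_exp(h, sigma2, l):
    return sigma2 * math.exp(-((h * l) ** 2))

def periodic(h, sigma2, l, p):
    return sigma2 * math.exp(-2.0 * l ** 2 * math.sin(math.pi * h / p) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                     # back substitution
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

t = [0.0, 0.5, 1.0, 1.5, 2.0]    # hypothetical time points
y = [1.2, 0.4, 1.5, 0.3, 1.1]    # hypothetical observations
mu, s2 = 1.0, 0.1                # hypothetical constant mean and nugget variance
n = len(t)

S1 = [[sq_exp(abs(a - b), 1.0, 0.5) for b in t] for a in t]          # Sigma_1
S2 = [[periodic(abs(a - b), 1.0, 1.0, 1.0) for b in t] for a in t]   # Sigma_2
K = [[S1[i][j] + S2[i][j] + (s2 if i == j else 0.0) for j in range(n)]
     for i in range(n)]                                              # Sigma_1 + Sigma_2 + sigma^2 I

resid = [v - mu for v in y]
kinv_r = solve(K, resid)                 # (Sigma_1 + Sigma_2 + sigma^2 I)^{-1} (y - mu)
w1_mean = matvec(S1, kinv_r)             # E[w1 | y]
w2_mean = matvec(S2, kinv_r)             # E[w2 | y]
w3_mean = [s2 * v for v in kinv_r]       # E[w3 | y]

print([round(a + b + c, 6) for a, b, c in zip(w1_mean, w2_mean, w3_mean)])
print([round(r, 6) for r in resid])
```

Solving the linear system once and reusing the result for every component avoids forming the matrix inverse explicitly.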
SLIDE 24

Births (multiple years)

  • 1. slowly changing trend (sq exp cov)
  • 2. small time scale correlating noise (sq exp cov)
  • 3. 7 day periodical component capturing day of week effect (periodic × sq exp cov)
  • 4. 365.25 day periodical component capturing day of year effect (periodic × sq exp cov)
  • 5. component to take into account the special days and interaction with weekends (linear cov)
  • 6. independent Gaussian noise (nugget cov)
  • 7. constant mean

SLIDE 25

Mauna Loa Example

SLIDE 26

Atmospheric CO2

[Figure: atmospheric CO2 (y) against year (x), by Source: NOAA and Scripps (co2 in R).]

SLIDE 27

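GP Model

Based on Rasmussen 5.4.3 (we are using slightly different data and parameterization)

$$
\begin{aligned}
\mathbf{y} &\sim \mathcal{N}(\boldsymbol{\mu},\ \boldsymbol{\Sigma}_1 + \boldsymbol{\Sigma}_2 + \boldsymbol{\Sigma}_3 + \boldsymbol{\Sigma}_4 + \sigma^2 \mathbf{I}) \\
\{\boldsymbol{\mu}\}_i &= \bar{y} \\
\{\boldsymbol{\Sigma}_1\}_{ij} &= \sigma_1^2 \exp\left(-(l_1 \cdot d_{ij})^2\right) \\
\{\boldsymbol{\Sigma}_2\}_{ij} &= \sigma_2^2 \exp\left(-(l_2 \cdot d_{ij})^2\right) \exp\left(-2\,(l_3)^2 \sin^2(\pi\, d_{ij} / p)\right) \\
\{\boldsymbol{\Sigma}_3\}_{ij} &= \sigma_3^2 \left(1 + \frac{(l_4 \cdot d_{ij})^2}{\alpha}\right)^{-\alpha} \\
\{\boldsymbol{\Sigma}_4\}_{ij} &= \sigma_4^2 \exp\left(-(l_5 \cdot d_{ij})^2\right)
\end{aligned}
$$
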
SLIDE 28

JAGS Model

ml_model = "model{
  y ~ dmnorm(mu, inverse(Sigma))

  # Off-diagonal entries: sum of the four covariance components
  for (i in 1:(length(y)-1)) {
    for (j in (i+1):length(y)) {
      k1[i,j] <- sigma2[1] * exp(- pow(l[1] * d[i,j],2))
      k2[i,j] <- sigma2[2] * exp(- pow(l[2] * d[i,j],2) - 2 * pow(l[3] * sin(pi*d[i,j] / per), 2))
      k3[i,j] <- sigma2[3] * pow(1+pow(l[4] * d[i,j],2)/alpha, -alpha)
      k4[i,j] <- sigma2[4] * exp(- pow(l[5] * d[i,j],2))

      Sigma[i,j] <- k1[i,j] + k2[i,j] + k3[i,j] + k4[i,j]
      Sigma[j,i] <- Sigma[i,j]
    }
  }

  # Diagonal entries: the four component variances plus the nugget sigma2[5]
  for (i in 1:length(y)) {
    Sigma[i,i] <- sigma2[1] + sigma2[2] + sigma2[3] + sigma2[4] + sigma2[5]
  }

  # Half-t priors (truncated at zero) on all variances, scales, and alpha
  for(i in 1:5){
    sigma2[i] ~ dt(0, 2.5, 1) T(0,)
    l[i] ~ dt(0, 2.5, 1) T(0,)
  }
  alpha ~ dt(0, 2.5, 1) T(0,)
}"

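To make the structure of the off-diagonal entries easier to inspect outside JAGS, the same sum k1 + k2 + k3 + k4 can be written as an ordinary function of the distance d. This is a sketch that mirrors the model string term by term; the function name and the parameter values in the usage lines are hypothetical, not posterior estimates from the lecture.

```python
import math

def mauna_loa_cov(d, sigma2, l, alpha, per):
    """Covariance at distance d, mirroring k1 + k2 + k3 + k4 in the JAGS model.

    sigma2 and l are 0-indexed lists here, whereas JAGS indexes from 1.
    The nugget sigma2[5] is not included; it only enters on the diagonal.
    """
    k1 = sigma2[0] * math.exp(-(l[0] * d) ** 2)                      # sq exp
    k2 = sigma2[1] * math.exp(-(l[1] * d) ** 2
                              - 2 * (l[2] * math.sin(math.pi * d / per)) ** 2)  # sq exp x periodic
    k3 = sigma2[2] * (1 + (l[3] * d) ** 2 / alpha) ** (-alpha)      # rational quadratic
    k4 = sigma2[3] * math.exp(-(l[4] * d) ** 2)                      # sq exp
    return k1 + k2 + k3 + k4

# Hypothetical parameter values, for illustration only.
sigma2 = [1.0, 1.0, 1.0, 1.0]
l = [1.0, 1.0, 1.0, 1.0, 1.0]
print(mauna_loa_cov(0.0, sigma2, l, alpha=1.0, per=1.0))   # sum of the four variances at d = 0
print(mauna_loa_cov(1.0, sigma2, l, alpha=1.0, per=1.0))
```

At d = 0 the function returns sigma2[1] + ... + sigma2[4], which is the JAGS diagonal without the nugget term.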
SLIDE 29

Diagnostics

[Figure: trace plots of estimate against .iteration (0 to 1000) for alpha, l[1] to l[5], and sigma2[1] to sigma2[5].]

SLIDE 30

Fit Components

[Figure: post_mean against x (1960 to 1990) for each covariance component, in panels Sigma_1, Sigma_2, Sigma_3, and Sigma_4.]

SLIDE 31

Forecasting

[Figure: post_mean against x, 1960 to 2000 and beyond.]

SLIDE 32

Forecasting (zoom)

[Figure: post_mean against x, 2000 to 2015.]

SLIDE 33

Forecasting ARIMA (auto)

[Figure: forecasts from ARIMA(1,1,1)(1,1,2)[12]; co2 against Time, with interval levels 80 and 95.]

SLIDE 34

Forecasting RMSE

dates                  RMSE (arima)   RMSE (gp)
Jan 1998 - Jan 2003    1.103          1.911
Jan 1998 - Jan 2008    2.506          4.575
Jan 1998 - Jan 2013    3.824          7.706
Jan 1998 - Mar 2017    5.461          11.395

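For reference, the sketch below shows the usual root mean squared error calculation, assuming the table uses the standard definition (the slide does not spell it out). The observed and predicted values are made up for illustration and are not the Mauna Loa data.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    sq_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(sq_errors) / len(sq_errors))

obs = [370.0, 371.5, 373.0]    # made-up values
pred = [369.0, 372.5, 375.0]   # made-up values
print(round(rmse(obs, pred), 3))   # 1.414
```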
SLIDE 35

Forecasting Components

[Figure: post_mean against x (2000 to 2015) for each covariance component, in panels Sigma_1, Sigma_2, Sigma_3, and Sigma_4.]