Lecture 14 Covariance Functions 3/08/2018 1 More on Covariance - - PowerPoint PPT Presentation
Lecture 14 Covariance Functions 3/08/2018 1 More on Covariance - - PowerPoint PPT Presentation
Lecture 14 Covariance Functions 3/08/2018 1 More on Covariance Functions 2 Nugget Covariance 3 ( , ) = 2 1 {=0} where = | | 2 1.00 1 0.75 draw 0 Draw 1 C
More on Covariance Functions
2
Nugget Covariance
๐ท๐๐ค(๐ง๐ข๐, ๐ง๐ข๐) = ๐21{โ=0} where โ = |๐ข๐ โ ๐ข๐|
0.00 0.25 0.50 0.75 1.00 โ2 โ1 1 2 5 10 15 20 5 10 15 20
h x C y draw
Draw 1 Draw 2
3
(- / Power / Square) Exponential Covariance
๐ท๐๐ค(๐ง๐ข๐, ๐ง๐ข๐) = ๐2 exp (โ(โ ๐)๐) where โ = |๐ข๐ โ ๐ข๐|
0.00 0.25 0.50 0.75 1.00 โ4 โ2 2 โ3 โ2 โ1 1 โ2 2 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00
h x x x C y y y Cov
Exp Pow Exp (p=1.5) Sq Exp
Exponential Powered Exponential (p=1.5) Square Exponential Covariance โ l=12, sigma2=1
4
Matern Covariance
๐ท๐๐ค(๐ง๐ข๐, ๐ง๐ข๐) = ๐2 21โ๐ ฮ(๐) ( โ 2๐ โ โ ๐)
๐ ๐ฟ๐ (
โ 2๐ โ โ ๐) where โ = |๐ข๐โ๐ข๐|
0.00 0.25 0.50 0.75 1.00 โ2 โ1 1 2 โ3 โ2 โ1 1 โ1 1 2 2 4 6 2 4 6 2 4 6 2 4 6
h x x x C y y y
v=1/2 v=3/2 v=5/2
Matern โ v=1/2 Matern โ v=3/2 Matern โ v=5/2 Covariance โ l=2, sigma2=1
5
Matern Covariance
- ๐ฟ๐ is the modified Bessel function of the second kind.
- A Gaussian process with Matรฉrn covariance has sample functions that
are โ๐ โ 1โ times differentiable.
- When ๐ = 1/2 + ๐ for ๐ โ N+ then the Matern has a simplified form
(product of an exponential and a polynomial of order ๐).
- When ๐ = 1/2 the Matern is equivalent to the exponential covariance.
- As ๐ โ โ the Matern converges to the square exponential
covariance.
- A Gaussian process with Matรฉrn covariance has paths that are โ๐โ โ 1
times differentiable.
6
Rational Quadratic Covariance
๐ท๐๐ค(๐ง๐ข๐, ๐ง๐ข๐) = ๐2 (1 + โ2 ๐2 ๐ฝ )
โ๐ฝ
where โ = |๐ข๐ โ ๐ข๐|
0.00 0.25 0.50 0.75 1.00 โ2 โ1 1 โ1 1 2 โ1 1 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00
h x x x y y y
alpha=1 alpha=3 alpha=10
Rational Quadratic โ alpha=1 Rational Quadratic โ alpha=10 Rational Quadratic โ alpha=100 Covariance โ l=12, sigma2=1
7
Rational Quadratic Covariance
- is a scaled mixture of squared exponential covariance functions with
different characteristic length-scales (๐).
- As ๐ฝ โ โ the rational quadratic converges to the square
exponential covariance.
- Has sample functions that are infinitely differentiable for any value of
๐ฝ
8
Spherical Covariance
๐ท๐๐ค(๐ง๐ข๐, ๐ง๐ข๐) = {๐2 (1 โ 3
2โ โ ๐ + 1 2(โ โ ๐)3))
if 0 < โ < 1/๐
- therwise
where โ = |๐ข๐โ๐ข๐|
0.00 0.25 0.50 0.75 1.00 โ3 โ2 โ1 1 โ2 2 โ2 โ1 1 2 0.0 0.3 0.6 0.9 0.0 0.3 0.6 0.9 0.0 0.3 0.6 0.9 0.0 0.3 0.6 0.9
h x x x y y y
l=1 l=3 l=10
Spherical โ l=1 Spherical โ l=3 Spherical โ l=10 Covariance โ sigma2=1
9
Periodic Covariance
๐ท๐๐ค(๐ง๐ข๐, ๐ง๐ข๐) = ๐2 exp (โ2 ๐2 sin2 (๐โ ๐ )) where โ = |๐ข๐ โ ๐ข๐|
0.00 0.25 0.50 0.75 1.00 โ1 1 โ2 โ1 1 โ2 โ1 1 2 1 2 3 4 2 4 6 2 4 6 2 4 6
h x x x y y y forcats::as_factor(Cov)
p=1 p=2 p=3
Periodic โ p=1 Periodic โ p=2 Periodic โ p=3 Covariance โ l=2, sigma2=1
10
Linear Covariance
๐ท๐๐ค(๐ง๐ข๐, ๐ง๐ข๐) = ๐2
๐ + ๐2 ๐ค (๐ข๐ โ ๐)(๐ข๐ โ ๐)
โ1.0 โ0.5 0.0 0.5 1.0 0.00 0.25 0.50 0.75 1.00
x y
11
Combining Covariances
If we definite two valid covariance functions, ๐ท๐๐ค๐(๐ง๐ข๐, ๐ง๐ข๐) and
๐ท๐๐ค๐(๐ง๐ข๐, ๐ง๐ข๐) then the following are also valid covariance functions, ๐ท๐๐ค๐(๐ง๐ข๐, ๐ง๐ข๐) + ๐ท๐๐ค๐(๐ง๐ข๐, ๐ง๐ข๐) ๐ท๐๐ค๐(๐ง๐ข๐, ๐ง๐ข๐) ร ๐ท๐๐ค๐(๐ง๐ข๐, ๐ง๐ข๐)
12
Linear ร Linear โ Quadratic
๐ท๐๐ค๐(๐ง๐ข๐, ๐ง๐ข๐) = 1 + 2 (๐ข๐ ร ๐ข๐) ๐ท๐๐ค๐(๐ง๐ข๐, ๐ง๐ข๐) = 2 + 1 (๐ข๐ ร ๐ข๐)
โ10 โ5 5 โ2 โ1 1 2
x y
Cov_a * Cov_b
13
Linear ร Periodic
๐ท๐๐ค๐(๐ง๐ข๐, ๐ง๐ข๐) = 1 + 1 (๐ข๐ ร ๐ข๐) ๐ท๐๐ค๐(๐ง๐ข๐, ๐ง๐ข๐) = exp (โ2 sin2 (2๐ โ))
โ4 โ2 2 1 2 3
x y
Cov_a * Cov_b
14
Linear + Periodic
๐ท๐๐ค๐(๐ง๐ข๐, ๐ง๐ข๐) = 1 + 1 (๐ข๐ ร ๐ข๐) ๐ท๐๐ค๐(โ = |๐ข๐ โ ๐ข๐|) = exp (โ2 sin2 (2๐ โ))
โ5 โ4 โ3 โ2 โ1 1 2 3
x y draw
Draw 1 Draw 2
Cov_a + Cov_b
15
Sq Exp ร Periodic โ Locally Periodic
๐ท๐๐ค๐(โ = |๐ข๐ โ ๐ข๐|) = exp(โ(1/3)โ2) ๐ท๐๐ค๐(โ = |๐ข๐ โ ๐ข๐|) = exp (โ2 sin2 (๐ โ))
โ2 โ1 1 2 2 4 6
x y
Cov_a * Cov_b
16
Sq Exp (short) + Sq Exp (long)
๐ท๐๐ค๐(โ = |๐ข๐ โ ๐ข๐|) = (1/4) exp(โ4 โ 3โ2) ๐ท๐๐ค๐(โ = |๐ข๐ โ ๐ข๐|) = exp(โ( โ 3/2)โ2)
โ2 โ1 1 0.0 2.5 5.0 7.5 10.0
x y
Cov_a + Cov_b
17
Sq Exp (short) + Sq Exp (long) (Seen another way)
Cov_A (short) Cov_B (long) Cov_A + Cov_B 0.0 2.5 5.0 7.5 10.0 0.0 2.5 5.0 7.5 10.0 0.0 2.5 5.0 7.5 10.0 โ3 โ2 โ1 1 2
x y 18
BDA3 example
19
BDA3
http://research.cs.aalto.fi/pml/software/gpstuff/demo_births.shtml
20
Births (one year)
- 1. Smooth long term trend
(sq exp cov)
- 2. Seven day periodic trend with
decay (periodic ร sq exp cov)
- 3. Constant mean
21
Component Contributions
We can view our GP in the following ways,
๐ณ โผ ๐ช(๐, ๐ป1 + ๐ป2 + ๐2๐ )
but with appropriate conditioning we can also think of ๐ณ as being the sum
- f multipe independent GPs
๐ณ = ๐ + ๐ฅ1(๐ฎ) + ๐ฅ2(๐ฎ) + ๐ฅ3(๐ฎ)
where
๐ฅ1(๐ฎ) โผ ๐ช(0, ๐ป1) ๐ฅ2(๐ฎ) โผ ๐ช(0, ๐ป2) ๐ฅ3(๐ฎ) โผ ๐ช(0, ๐2๐ )
22
Decomposition of Covariance Components
โก โข โฃ ๐ง ๐ฅ1 ๐ฅ2 โค โฅ โฆ โผ ๐ช โ โ โ โก โข โฃ ๐ โค โฅ โฆ , โก โข โฃ ฮฃ1 + ฮฃ2 + ๐2๐ ฮฃ1 ฮฃ2 ฮฃ1 ฮฃ1 ฮฃ2 ฮฃ2 โค โฅ โฆ โ โ โ therefore
๐ฅ1 | ๐ณ, ๐, ๐พ โผ ๐ช(๐๐๐๐๐, ๐ป๐๐๐๐) ๐๐๐๐๐ = 0 + ฮฃ1 (ฮฃ1 + ฮฃ2 + ๐2๐ฝ)โ1(๐ณ โ ๐) ๐ป๐๐๐๐ = ฮฃ1 โ ฮฃ1(ฮฃ1 + ฮฃ2 + ๐2๐)โ1ฮฃ1
๐ข
23
Births (multiple years)
- 1. slowly changing trend (sq exp cov)
- 2. small time scale correlating noise (sq exp cov)
- 3. 7 day periodical component capturing day of week effect (periodic ร sq exp cov)
- 4. 365.25 day periodical component capturing day of year effect (periodic ร sq exp cov)
- 5. component to take into account the special days and interaction with weekends (linear
cov)
- 6. independent Gaussian noise (nugget cov)
- 7. constant mean
24
Mauna Loa Exampel
25
Atmospheric CO2
330 360 390 1960 1980 2000
x y Source
NOAA Scripps (co2 in R)
26
GP Model
Based on Rasmussen 5.4.3 (we are using slightly different data and parameterization)
๐ณ โผ ๐ช(๐, ๐ป1 + ๐ป2 + ๐ป3 + ๐ป4 + ๐2I ) {๐}๐ = ฬ ๐ง {๐ป1}๐๐ = ๐2
1 exp (โ(๐1 โ ๐๐๐)2)
{๐ป2}๐๐ = ๐2
2 exp (โ(๐2 โ ๐๐๐)2) exp (โ2 (๐3)2 sin2(๐ ๐๐๐/๐))
{๐ป3}๐๐ = ๐2
3 (1 + (๐4 โ ๐๐๐)2
๐ฝ )
โ๐ฝ
{๐ป4}๐๐ = ๐2
4 exp (โ(๐5 โ ๐๐๐)2)
27
JAGS Model
ml_model = โmodel{ y ~ dmnorm(mu, inverse(Sigma)) for (i in 1:(length(y)-1)) { for (j in (i+1):length(y)) { k1[i,j] <- sigma2[1] * exp(- pow(l[1] * d[i,j],2)) k2[i,j] <- sigma2[2] * exp(- pow(l[2] * d[i,j],2) - 2 * pow(l[3] * sin(pi*d[i,j] / per), 2)) k3[i,j] <- sigma2[3] * pow(1+pow(l[4] * d[i,j],2)/alpha, -alpha) k4[i,j] <- sigma2[4] * exp(- pow(l[5] * d[i,j],2)) Sigma[i,j] <- k1[i,j] + k2[i,j] + k3[i,j] + k4[i,j] Sigma[j,i] <- Sigma[i,j] } } for (i in 1:length(y)) { Sigma[i,i] <- sigma2[1] + sigma2[2] + sigma2[3] + sigma2[4] + sigma2[5] } for(i in 1:5){ sigma2[i] ~ dt(0, 2.5, 1) T(0,) l[i] ~ dt(0, 2.5, 1) T(0,) } alpha ~ dt(0, 2.5, 1) T(0,) }โ 28
Diagnostics
alpha l[1] l[2] l[3] l[4] l[5] sigma2[1] sigma2[2] sigma2[3] sigma2[4] sigma2[5] 0 250500750 1000 0 250500750 1000 0 250500750 1000 0 250500750 1000 0 250500750 1000 0.02 0.03 0.04 0.0 0.3 0.6 0.9 0.0 0.2 0.4 0.6 0.8 2 4 6 0.0 0.5 1.0 1.5 2.0 0.6 0.8 1.0 1.2 10 20 30 40 0.005 0.010 0.015 0.020 2000 4000 6000 0.02 0.04 0.06 2 4 6 8
.iteration estimate 29
Fit Components
Sigma_3 Sigma_4 Sigma_1 Sigma_2 1960 1970 1980 1990 1960 1970 1980 1990 โ4 4 8 โ1.5 โ1.0 โ0.5 0.0 0.5 1.0 โ20 โ10 10 20 30 โ2 โ1 1 2
x post_mean cov
Sigma_1 Sigma_2 Sigma_3 Sigma_4
30
Forecasting
330 360 390 1960 1980 2000
x post_mean 31
Forecasting (zoom)
360 370 380 390 400 410 2000 2005 2010 2015
x post_mean 32
Forecasting ARIMA (auto)
330 360 390 1960 1980 2000
Time co2 level
80 95
Forecasts from ARIMA(1,1,1)(1,1,2)[12]
33
Forecasting RMSE
dates RMSE (arima) RMSE (gp) Jan 1998 - Jan 2003 1.103 1.911 Jan 1998 - Jan 2008 2.506 4.575 Jan 1998 - Jan 2013 3.824 7.706 Jan 1998 - Mar 2017 5.461 11.395
34
Forecasting Components
Sigma_3 Sigma_4 Sigma_1 Sigma_2 2000 2005 2010 2015 2000 2005 2010 2015 โ5 5 โ1.0 โ0.5 0.0 0.5 1.0 30 40 50 60 โ1 1
x post_mean cov
Sigma_1 Sigma_2 Sigma_3 Sigma_4