Interpolation for huge spatial datasets: theory, implementations and ideas

Reinhard Furrer, I-Math/ICS, UZH
Uppsala, 2019/01/30
(Image: NZZ.ch)

Joint work with Emilio Porcu, Florian Gerber, Francois Bachoc, and with contributions of several others.
◮ Motivation
◮ Covariance approximations
◮ Gaussian equivalent measures
◮ Prediction and estimation
◮ Implementation and illustration

Questions and «Fast-Forward» appreciated! Slides at http://www.math.uzh.ch/furrer/slides/
Observations: y(s1), . . . , y(sn)

First law of geography (Waldo Tobler): Everything is related to everything else, but near things are more related than distant things.
Source: wikipedia.org
Predict the quantity of interest at an arbitrary location.

Why?
◮ Fill in missing data
◮ Force data onto a regular grid
◮ Smooth out the measurement error

How?
◮ By eye
◮ Linear interpolation
◮ The correct way . . .
Source: Gerber et al (2018), TGRS
Predict Z(s0) given y(s1), . . . , y(sn)

Y(s) = xᵀ(s)β + µ(s) + Z(s) + ε(s)

Minimize the mean squared prediction error (over all linear unbiased predictors):

BLUP = Cov(pred, obs) · Var(obs)⁻¹ · obs

(one spatial process, no trend, known covariance structure)
[Figure: covariance C against distance, lag h]

The covariance matrix Σ contains elements C(dist(si, sj)).
Cov(pred, obs) · Var(obs)⁻¹ · obs = cᵀ Σ⁻¹ y

◮ “Simple” spatial interpolation . . . on paper or in class!
◮ BUT:
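In code, the BLUP is a single linear solve. A minimal numpy sketch on a hypothetical 1-D toy with an exponential covariance (for illustration only, not the talk's implementation):

```python
# BLUP = c^T Sigma^{-1} y on simulated 1-D data (toy example).
import numpy as np

def cov_exp(h, sigma2=1.0, range_=0.3):
    # Exponential covariance C(h) = sigma2 * exp(-h / range)
    return sigma2 * np.exp(-h / range_)

rng = np.random.default_rng(0)
s = np.sort(rng.uniform(0.0, 1.0, 50))            # observation locations
Sigma = cov_exp(np.abs(s[:, None] - s[None, :]))  # Var(obs)
y = rng.multivariate_normal(np.zeros(50), Sigma)  # one realization

s0 = 0.5                                          # prediction location
c = cov_exp(np.abs(s - s0))                       # Cov(pred, obs)
z_hat = c @ np.linalg.solve(Sigma, y)             # BLUP = c^T Sigma^{-1} y
mspe = cov_exp(0.0) - c @ np.linalg.solve(Sigma, c)  # kriging variance
```

For n in the hundreds of thousands the dense solve above is exactly what breaks down; that is the motivation for the approximations that follow.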
Many R packages . . . . . . many black boxes . . . Heaton et al. JABES
◮ Motivation
◮ Covariance approximations
◮ Gaussian equivalent measures
◮ Prediction and estimation
◮ Implementation and illustration
Geostatistical model (GRF):

[Figure: locations s1, . . . , si, . . . , sn; covariance C(h) against distance, lag h; C(dist(s1, sn))]

Covariance matrix: Σ

Lattice model (GMRF):

E(Zi | z−i) = β Σ_{j∼i} zj,  Var(Zi | z−i) = τ²

Gaussianity and regularity conditions:
[Figure: correspondence between the geostatistical model (GRF) and the lattice model (GMRF)]
Use sparse covariance matrices for greater computational efficiency. Sparseness is guaranteed when
◮ the covariance function has compact support, or
◮ compact support is (artificially) imposed: tapering
[Figure: Wendland taper; Matérn ν = 1.5, Wendland, and the tapered product Matérn × Wendland, against distance, lag h]
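Imposing compact support can be sketched as follows. The Matérn ν = 3/2 form and the Wendland taper below are standard closed forms; the locations and the taper range are made up for illustration:

```python
# Multiply a Matern (nu = 3/2) covariance by a Wendland taper and
# store the result sparsely (toy example).
import numpy as np
import scipy.sparse as sp

def matern15(h):
    # Matern correlation with nu = 3/2 (unit range): (1 + h) exp(-h)
    return (1.0 + h) * np.exp(-h)

def wendland(h, rho):
    # Wendland taper with compact support [0, rho]; zero beyond rho
    t = np.clip(h / rho, 0.0, 1.0)
    return (1.0 - t) ** 4 * (1.0 + 4.0 * t)

rng = np.random.default_rng(1)
s = rng.uniform(0.0, 50.0, size=(500, 2))                   # 500 locations
h = np.linalg.norm(s[:, None, :] - s[None, :, :], axis=-1)  # distances

# entries beyond the taper range are exactly zero, so csr drops them
Sigma_tap = sp.csr_matrix(matern15(h) * wendland(h, rho=5.0))
fill = Sigma_tap.nnz / 500 ** 2   # fraction of stored nonzeros
```

With a taper range of 5 on a 50 × 50 domain only a few percent of the entries survive, which is what sparse Cholesky routines exploit.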
For an isotropic and stationary process with covariance C0(h), find a taper Cρ(h) such that kriging with C0(h)Cρ(h) is asymptotically optimal:

MSPE(s0, C0Cρ) / MSPE(s0, C0) = MSPE_taper / MSPE_true → 1

̺(s0, C0Cρ) / MSPE(s0, C0) = MSPE_naive / MSPE_true → 1

where ̺(s0, C) = C(0) − cᵀΣ⁻¹c.
Proofs based on infill asymptotics and “misspecified” covariances.
Conditions on the tail behaviour of the spectrum of the (tapered) covariance.
Furrer, Genton, Nychka (2006) JCGS; Stein (2013) JCGS; Bevilacqua et al (2019) AoS
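The first ratio can be checked numerically on a toy 1-D example (an illustration under assumed parameters, not part of the cited proofs): kriging weights come from the tapered covariance, and the MSPE is evaluated under the true Matérn model.

```python
# MSPE_taper / MSPE_true for a taper applied to a Matern nu = 3/2 model.
import numpy as np

def matern15(h):
    return (1.0 + h) * np.exp(-h)      # Matern nu = 3/2, C0(0) = 1

def wendland(h, rho):
    t = np.clip(h / rho, 0.0, 1.0)
    return (1.0 - t) ** 4 * (1.0 + 4.0 * t)

rng = np.random.default_rng(2)
s = np.sort(rng.uniform(0.0, 10.0, 100)); s0 = 5.0
H = np.abs(s[:, None] - s[None, :]); h0 = np.abs(s - s0)

Sigma0, c0 = matern15(H), matern15(h0)       # true covariance C0
Sigma1 = Sigma0 * wendland(H, 3.0)           # tapered C0 * C_rho
c1 = c0 * wendland(h0, 3.0)

w_true = np.linalg.solve(Sigma0, c0)         # optimal kriging weights
w_tap = np.linalg.solve(Sigma1, c1)          # weights from the taper

def mspe(w):                                 # MSPE under the true model
    return matern15(0.0) - 2.0 * w @ c0 + w @ Sigma0 @ w

ratio = mspe(w_tap) / mspe(w_true)           # >= 1, tends to 1
```

The ratio is at least 1 by construction (the true weights are optimal) and approaches 1 as the design densifies, which is the content of the theorem.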
Matérn:
Mν(r) = 2^(1−ν)/Γ(ν) · r^ν Kν(r), ν > 0
Mν,α,σ²(r) = σ² Mν(r/α), α > 0, σ² > 0

Generalized Wendland:
Wµ,κ(r) = 1/B(2κ, µ + 1) · ∫_r^∞ u (u² − r²)^(κ−1) (1 − u)₊^µ du, κ > 0, µ ≥ (d + 1)/2 + κ
Wµ,κ,β,σ²(r) = σ² Wµ,κ(r/β), β > 0, σ² > 0
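Both families can be evaluated directly from these definitions. A Python sketch using scipy's modified Bessel function and numerical quadrature (illustrative only; for κ < 1 the integrand has a weak endpoint singularity and dedicated routines are preferable):

```python
# Direct evaluation of the Matern and generalized Wendland functions.
import numpy as np
from scipy.integrate import quad
from scipy.special import beta, gamma, kv

def matern(r, nu):
    # M_nu(r) = 2^(1-nu) / Gamma(nu) * r^nu * K_nu(r), with M_nu(0) = 1
    r = np.atleast_1d(np.asarray(r, dtype=float))
    out = np.ones_like(r)
    pos = r > 0
    out[pos] = 2.0 ** (1 - nu) / gamma(nu) * r[pos] ** nu * kv(nu, r[pos])
    return out

def gen_wendland(r, mu, kappa):
    # W_{mu,kappa}(r) = 1/B(2 kappa, mu + 1) *
    #                   int_r^1 u (u^2 - r^2)^(kappa - 1) (1 - u)^mu du
    # (the (1 - u)_+ factor makes the integrand vanish for u > 1)
    if r >= 1.0:
        return 0.0
    val, _ = quad(lambda u: u * (u * u - r * r) ** (kappa - 1.0)
                  * (1.0 - u) ** mu, r, 1.0)
    return val / beta(2.0 * kappa, mu + 1.0)
```

Sanity checks: matern(1.0, 1.5)[0] equals (1 + 1)e⁻¹ by the closed form for ν = 3/2, and gen_wendland(0, µ, κ) = 1 because the integral at r = 0 reduces to B(2κ, µ + 1).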
◮ Motivation
◮ Covariance approximations
◮ Gaussian equivalent measures
◮ Prediction and estimation
◮ Implementation and illustration
P0, P1 two probability measures on a measurable space {Ω, S}.

P0 ≡ P1 (equivalent) if P0(A) = 1 implies P1(A) = 1, and vice versa.
◮ Restrict A to the σ-algebra generated by {Z(s), s ∈ D}. Here, ‘≡’ means equivalent on the paths of {Z(s), s ∈ D}.
◮ For Gaussian measures: equivalence or orthogonality; the first two moments discriminate: P(µ, C)
◮ Here: µ = 0 and D ⊂ R^d, d = 1, 2, 3
Covariance functions C0, C1, with associated spectral densities C0^F, C1^F.
D ⊂ R^d any bounded infinite set, s0 ∈ D.

If there exists a > 0 such that 0 < bℓ ≤ C0^F(z) z^a ≤ bu < ∞ as z → ∞, and if

∫^∞ z^(d−1) ( (C1^F(z) − C0^F(z)) / C0^F(z) )² dz < ∞,

then P(C0) ≡ P(C1).
Mν^F(z) = cte · 1/(1 + z²)^(ν+d/2), z ≥ 0.

Wµ,κ^F(z) = cte · ₁F₂( λ; λ + µ/2, λ + µ/2 + 1/2; −z²/4 ), λ = (d + 1)/2 + κ,
≍ z^(−2λ) for z → ∞.
Let P(Mν,α0,σ0²), P(Mν,α1,σ1²), D ⊂ R^d, d = 1, 2, 3.

P(Mν,α0,σ0²) ≡ P(Mν,α1,σ1²)  ⟺  σ0²/α0^(2ν) = σ1²/α1^(2ν)

◮ d ≤ 3: not all parameters consistently estimable; microergodic parameter σ²/α^(2ν)
◮ d ≥ 5: all fine (Anderes 2010)
◮ d = 4: open . . .
Let P(Wµ,κ,β0,σ0²), P(Wµ,κ,β1,σ1²).
For a given κ ≥ 0, let µ > d + 1/2 + κ:

P(Wµ,κ,β0,σ0²) ≡ P(Wµ,κ,β1,σ1²)  ⟺  σ0²/β0^(2κ+1) = σ1²/β1^(2κ+1)

Let P(Mν,α,σ0²) and P(Wµ,κ,β,σ1²), D ⊂ R^d, d = 1, 2, 3, κ > 0.
If ν = κ + 1/2 and µ > d + 1/2 + κ:

P(Mν,α,σ0²) ≡ P(Wµ,κ,β,σ1²)  ⟺  σ0²/α^(2ν) = cte · σ1²/β^(2κ+1)
◮ Motivation
◮ Covariance approximations
◮ Gaussian equivalent measures
◮ Prediction and estimation
◮ Implementation and illustration
Prediction of Z(s0) using the “BLUP” Ẑn(µ, κ, β) under misspecification.

P(Wµ,κ,β,σ²): if µ > (d + 1)/2 + κ, then for any fixed β1 > 0

Varµ,κ,β,σ²( Ẑn(µ, κ, β1) − Z(s0) ) / Varµ,κ,β,σ²( Ẑn(µ, κ, β) − Z(s0) ) → 1

P(Mν,α,σ²): if ν = κ + 1/2, then for any fixed β1 > 0

Varν,α,σ²( Ẑn(µ, κ, β1) − Z(s0) ) / Varν,α,σ²( Ẑn(ν, α) − Z(s0) ) → 1
Prediction of Z(s0) using the “BLUP” Ẑn(µ, κ, β) under misspecification.

P(Wµ,κ,β,σ²): if σ²/β^(2κ+1) = σ1²/β1^(2κ+1), then

Varµ,κ,β1,σ1²( Ẑn(µ, κ, β1) − Z(s0) ) / Varµ,κ,β,σ²( Ẑn(µ, κ, β1) − Z(s0) ) → 1

P(Mν,α,σ²): if ν = κ + 1/2 and σ²/α^(2ν) = cte · σ1²/β1^(2κ+1), then

Varµ,κ,β1,σ1²( Ẑn(µ, κ, β1) − Z(s0) ) / Varν,α,σ²( Ẑn(µ, κ, β1) − Z(s0) ) → 1
Likelihood: −(1/2) log det(Σ(θ) ◦ T) − (1/2) yᵀ (Σ(θ) ◦ T)⁻¹ y
Two-taper: −(1/2) log det(Σ(θ) ◦ T) − (1/2) yᵀ ( (Σ(θ) ◦ T)⁻¹ ◦ T ) y
◮ Dimension plays a crucial role (Kaufman et al. (2008) JASA)
◮ Two-taper approach cannot be used in practice
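A dense toy version of the one-taper expression above, with ‘◦’ the Hadamard product; a real implementation, e.g. with spam, instead exploits the sparsity of Σ(θ) ◦ T through a sparse Cholesky factorization:

```python
# One-taper Gaussian log-likelihood, dense toy version.
import numpy as np

def neg2_loglik_taper(Sigma, T, y):
    # -2 x (log-likelihood up to constants):
    # log det(Sigma o T) + y^T (Sigma o T)^{-1} y
    St = Sigma * T                      # Hadamard product, still pos. def.
    _, logdet = np.linalg.slogdet(St)
    return logdet + y @ np.linalg.solve(St, y)

# tiny check: with the all-ones "taper" the tapered and exact
# likelihoods coincide
rng = np.random.default_rng(3)
s = np.sort(rng.uniform(0.0, 1.0, 30))
Sigma = np.exp(-np.abs(s[:, None] - s[None, :]) / 0.2)
y = rng.multivariate_normal(np.zeros(30), Sigma)
same = neg2_loglik_taper(Sigma, np.ones((30, 30)), y)
exact = np.linalg.slogdet(Sigma)[1] + y @ np.linalg.solve(Sigma, y)
```

By the Schur product theorem Σ ◦ T stays positive definite whenever T is a valid (tapered) covariance, so the factorization is always well defined.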
Let P(Mν,α0,σ0²), with certain constraints on the parameter space.
Let {σ̂n², α̂n} be the ML estimator. Then, as n → ∞:

1. σ̂n²/α̂n^(2ν) →a.s. σ0²/α0^(2ν)

2. √n ( σ̂n²/α̂n^(2ν) − σ0²/α0^(2ν) ) →D N( 0, 2 (σ0²/α0^(2ν))² )

◮ Also holds for σ̂n²(α) with misspecified α
Let P(Wµ,κ,β0,σ0²), with certain constraints on the parameter space.
Let σ̂n²(β) be the ML estimator with “known” β. Then, as n → ∞:

1. σ̂n²(β)/β^(2κ+1) →a.s. σ0²/β0^(2κ+1)

2. √n ( σ̂n²(β)/β^(2κ+1) − σ0²/β0^(2κ+1) ) →D N( 0, 2 (σ0²/β0^(2κ+1))² )
Let P(Wµ,κ,β0,σ0²), with certain constraints on the parameter space.
Let {σ̂n²(β̂n), β̂n} be the ML estimator. Then, as n → ∞:

1. σ̂n²(β̂n)/β̂n^(2κ+1) →a.s. σ0²/β0^(2κ+1)

2. √n ( σ̂n²(β̂n)/β̂n^(2κ+1) − σ0²/β0^(2κ+1) ) →D N( 0, 2 (σ0²/β0^(2κ+1))² )
◮ Tapering is “obsolete”, but:
  – with practicable supports, convergence is slow
  – evaluation of the covariance function
◮ Univariate setting: proofs based on infill asymptotics and “misspecified” covariances
◮ Multivariate setting: proofs based on an increasing-domain framework; weak conditions on the taper (Furrer, Du, Bachoc 2016)
◮ Motivation
◮ Covariance approximations
◮ Gaussian equivalent measures
◮ Prediction and estimation
◮ Implementation and illustration
Software to exploit the sparse structure: spam / spam64
◮ an R package for sparse matrix algebra
◮ storage economical and fast
◮ versatile, intuitive and simple
See Furrer et al. (2006) JCGS; Furrer, Sain (2010) JSS
◮ R objects have at most 2^31 elements (almost)
◮ R does not ‘have’ 64-bit integers: they are stored as doubles
◮ 64-bit exploitation consists of type conversions between the R front end and pre-compiled code
Gerber, Mösinger, Furrer
◮ residual field from the AVHRR NDVI3g product: y = y(2000–2009) − y(1990–1999), n = 769,940 observations
◮ Nonstationary Gaussian random field (zero mean)
◮ κ > 0: scaling parameter
◮ D(β) = diag(exp(Xβ)): controls the strength via covariates
◮ τ ∈ [0, 1]: “no spatial correlation” vs “spatial correlation”
◮ I: identity matrix
◮ R: stationary correlation matrix:
  – compactly supported covariance
  – range 50 km, sparsity 0.2%
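One plausible way to assemble these ingredients (an assumption for illustration, since the slide does not spell out the formula) is Σ = κ D(β)^(1/2) (τ I + (1 − τ) R) D(β)^(1/2), keeping everything sparse:

```python
# Hypothetical assembly of the nonstationary covariance from the
# listed pieces; the exact combination is an assumption.
import numpy as np
import scipy.sparse as sp

def build_sigma(X, beta, R, kappa, tau):
    d = np.sqrt(np.exp(X @ beta))             # D(beta)^(1/2) diagonal
    n = R.shape[0]
    core = tau * sp.identity(n, format="csr") + (1.0 - tau) * R
    return kappa * sp.diags(d) @ core @ sp.diags(d)

# toy inputs: 4 locations, 2 covariates, sparse R with unit diagonal
rng = np.random.default_rng(4)
X = rng.normal(size=(4, 2))
beta = np.array([0.5, -0.2])
R = sp.csr_matrix(np.array([[1.0, 0.3, 0.0, 0.0],
                            [0.3, 1.0, 0.2, 0.0],
                            [0.0, 0.2, 1.0, 0.4],
                            [0.0, 0.0, 0.4, 1.0]]))
Sigma = build_sigma(X, beta, R, kappa=2.0, tau=0.3)
```

Under this assumed form the diagonal of Σ is κ exp(Xβ), so the covariates directly scale the local variance, matching the role described for D(β).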
◮ Fast is relative . . . optim suboptimal

Task        Function                  Time    Memory  Sparsity
Distances   h <- nearest.dist(...)    23 min  1.4 Gb  0.2‰
Covariance  T <- cov.Wend(h, ...)      2 min  1.4 Gb  0.2‰
Cholesky    chol(T, ...)              29 min  8.5 Gb  1.9‰
◮ Optimization strategy:
  – (iterative) grid search over τ; exploit the multicore architecture
  – for given τ, use a quasi-Newton optimizer for κ and β
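The two-level strategy can be sketched with a stand-in objective (the quadratic below is hypothetical; the real inner objective is the negative log-likelihood in κ and β for fixed τ):

```python
# Outer grid over tau, inner quasi-Newton over the remaining parameters.
import numpy as np
from scipy.optimize import minimize

def toy_negloglik(theta, tau):
    kappa, beta = theta                      # smooth stand-in objective
    return (kappa - 1.0) ** 2 + (beta + 2.0) ** 2 + (tau - 0.3) ** 2

best = None
for tau in np.linspace(0.0, 1.0, 11):        # grid over tau in [0, 1]
    # inner quasi-Newton step at this tau; independent tau values
    # could run on separate cores
    res = minimize(toy_negloglik, x0=[0.0, 0.0], args=(tau,),
                   method="L-BFGS-B")
    if best is None or res.fun < best[0]:
        best = (res.fun, tau, res.x)
```

Fixing τ on a grid keeps the inner problem smooth and low-dimensional, which is what makes the quasi-Newton step reliable.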
◮ with covariates “distance to nearest coast” and “elevation”
◮ D(β) = diag(exp(Xβ))
◮ BIC improvement compared to nonstationary model
data on the sphere
Collaboration with:
– Emilio Porcu
– Florian Gerber
– Francois Bachoc
– Moreno Bevilacqua
– Kaspar Mösinger
– former & present ‘Applied Statistics’ team
. . . and many more

143282, 144973, 175529