Predicting missing values in spatio-temporal satellite data
Reinhard Furrer, I-Math/ICS, UZH
ISNPS2018, Salerno, 2018/06/14
Joint work with Florian Gerber and Emilio Porcu, and with contributions of several others

Arctic NDVI
NDVI = (NIR − R) / (NIR + R)
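As a small illustration of the index on this slide, the following Python sketch computes NDVI from a near-infrared (NIR) and a red (R) reflectance. The function name `ndvi`, the handling of a zero denominator, and the reflectance values are assumptions made for illustration only; they are not taken from the talk.

```python
def ndvi(nir, red):
    """Return (nir - red) / (nir + red), or None if the denominator is zero."""
    total = nir + red
    if total == 0:
        return None  # assumption: treat an undefined ratio as a missing value
    return (nir - red) / total


# Hypothetical reflectances: green vegetation reflects much NIR and little red.
print(ndvi(0.50, 0.10))  # high "greenness"
print(ndvi(0.30, 0.30))  # no contrast between the two bands
```

The index is bounded between -1 and 1 for non-negative reflectances, which is why the plots in the deck use a fixed NDVI scale.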
Scientifically:
◮ Complex interplay between climate and vegetation/ecosystems
◮ Reflectance measurements as proxy for “greenness”
Statistically:
◮ Large, spatio-temporal datasets with complex structures at low resolution
◮ . . . many missing values
data y
MODIS NDVI data (satellite product MOD13A1)
[Figure: NDVI (scale 0.2 to 0.8) by day of the year (145, 161, 177, 193) and year (2004 to 2007)]
[Figure: NDVI (scale 0.2 to 0.8, low to high) by day of the year (177, 193) and year (2004 to 2007), with ranks r = 1, ..., 12]
Date:   193 doy 2004   177 doy 2006   177 doy 2005   193 doy 2006   193 doy 2005
Score:  0.65           0.71           0.77           0.88           0.91
Rank:   8              9              10             11             12
q̂:      NA             0.64           NA             0.12           0.77
[Figure, two panels, each by day of the year (145, 161, 177, 193) and year (2004 to 2007): "data and predictions" (NDVI, scale 0.2 to 0.8) and "uncertainties" (interval length, scale 0.2 to 0.8)]
RMSE ×10³
(l) Uncertainty contribution from the indicated four steps of the gapfill procedure.
(m) Average width of the 90% prediction intervals (40% missing values).
(r) Average interval widths and coverage rate per day of the year.
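The summaries named on these slides (RMSE, average interval width, coverage rate) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `rmse` and `interval_summary` and all numbers below are hypothetical and are not results from the talk.

```python
import math


def rmse(truth, pred):
    """Root mean squared error between held-out values and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))


def interval_summary(truth, lower, upper):
    """Return (average interval width, fraction of values inside their interval)."""
    widths = [u - l for l, u in zip(lower, upper)]
    covered = [l <= t <= u for t, l, u in zip(truth, lower, upper)]
    return sum(widths) / len(widths), sum(covered) / len(covered)


# Hypothetical held-out NDVI values, point predictions and interval bounds.
truth = [0.40, 0.55, 0.62, 0.70]
pred = [0.42, 0.50, 0.60, 0.71]
lower = [0.35, 0.45, 0.63, 0.60]
upper = [0.50, 0.60, 0.75, 0.80]

avg_width, coverage = interval_summary(truth, lower, upper)
print(rmse(truth, pred) * 1e3)  # reported on the "x 10^3" scale used on the slide
print(avg_width, coverage)
```

For a well-calibrated 90% prediction interval, the coverage rate computed this way on held-out data should be close to 0.9.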
Collaboration with:
– Emilio Porcu
– Alfredo Alegria
– Florian Gerber
– Kaspar Mösinger
– former & present ‘Applied Statistics’ team
. . . and many more
URPP Global Change and Biodiversity
143282, 144973, 175529
Furrer, Sain (2010) spam: A sparse matrix R package with emphasis on MCMC methods for Gaussian Markov random fields. JSS 36, 1–25.
Furrer et al (2017) spam: Sparse Matrix algebra. R package version 2.2-0.
Furrer et al (2017) spam64: 64-Bit Extension of the SPArse Matrix R Package ’spam’. R package version 2.2-0.
Gerber, Moesinger, Furrer (2017) Extending R Packages to Support 64-bit Compiled Code: An Illustration with spam64 and GIMMS NDVI3g Data. Comput Geosci 104, 109–119.
Gerber et al (2018) Predicting Missing Values in Spatio-Temporal Remote Sensing
Gerber, Moesinger, Furrer (2017) dotCall64: An Efficient Interface to Compiled C/C++ and Fortran Code Supporting Long Vectors. arXiv:1702.08188.
Heaton et al (2017) A Case Study Competition among Methods for Analyzing Large Spatial Data. arXiv:1710.05013.
Porcu, Alegria, Furrer (2018) Modeling Temporally Evolving and Spatially Globally Dependent Data. International Statistical Review, first published.
[Flowchart with the layers "R function", "dotCall64" and "Compiled code": Input, Preprocess, "Use 64-bit?", 32-bit compiled code or 64-bit compiled code, Postprocess, Output]
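The branching shown in the flowchart can be sketched in Python. This is not the dotCall64 interface: `INT32_MAX`, `needs_64bit`, `call_compiled` and the summation standing in for a compiled routine are all hypothetical names and choices. The sketch also assumes that the 64-bit path is taken when the number of elements exceeds what a signed 32-bit integer can index, which the slide itself does not spell out.

```python
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer


def needs_64bit(n_elements):
    """Assumed rule: switch to 64-bit code once 32-bit indexing overflows."""
    return n_elements > INT32_MAX


def call_compiled(values, n_elements=None):
    """Mimic Input -> Preprocess -> (32-bit | 64-bit) -> Postprocess -> Output.

    `n_elements` lets the caller pretend the input is huge without allocating it.
    """
    n = len(values) if n_elements is None else n_elements
    backend = "64-bit" if needs_64bit(n) else "32-bit"
    prepared = [float(v) for v in values]  # preprocess: coerce argument types
    result = sum(prepared)                 # stand-in for the compiled routine
    return {"backend": backend, "result": result}  # postprocess: wrap the output


print(call_compiled([1, 2, 3]))
print(call_compiled([1, 2, 3], n_elements=2**31))
```

Keeping the decision in one wrapper means that calling code does not need to know which compiled variant ran.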
◮ residual field from AVHRR NDVI3g product: y = y_{2000−2009} − y_{1990−1999}, n = 769,940 observations
◮ Nonstationary Gaussian random field (zero mean)
◮ κ > 0: scaling parameter
◮ D(β) = diag(exp(Xβ)): controls strength via covariates
◮ τ ∈ [0, 1]: “no spatial correlation” vs “spatial correlation”
◮ I: identity matrix
◮ R: stationary correlation matrix:
– compactly supported covariance
– range 50km, sparsity 0.2%
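A compactly supported correlation is exactly zero beyond its range, which is what makes R sparse. The sketch below uses a Wendland-type function, (1 − h/r)^4 (1 + 4h/r) for h < r and 0 otherwise, as one common example. Which function the talk actually uses, the names `wendland1`, `correlation_matrix` and `sparsity`, and the toy one-dimensional locations are assumptions for illustration.

```python
def wendland1(h, r):
    """Wendland-type correlation: (1 - h/r)^4 * (1 + 4h/r) for h < r, else 0."""
    s = h / r
    if s >= 1.0:
        return 0.0  # compact support: exactly zero at and beyond the range
    return (1.0 - s) ** 4 * (1.0 + 4.0 * s)


def correlation_matrix(locations, r):
    """Dense correlation matrix for 1-D locations (toy setting)."""
    return [[wendland1(abs(a - b), r) for b in locations] for a in locations]


def sparsity(matrix):
    """Fraction of nonzero entries."""
    n = len(matrix)
    nonzero = sum(1 for row in matrix for v in row if v != 0.0)
    return nonzero / (n * n)


locations = [0.0, 10.0, 40.0, 120.0, 300.0]  # hypothetical positions in km
R = correlation_matrix(locations, r=50.0)
print(sparsity(R))
```

With many locations spread over a large domain and a short range, almost all pairs are farther apart than r, so only a small fraction of entries needs to be stored.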
◮ Fast is relative . . . optim suboptimal

Task        Function                  Time    Memory   Sparsity
Distances   h <- nearest.dist(...)    23min   1.4 Gb   0.2‰
            T <- cov.Wend(h, ...)     2min    1.4 Gb   0.2‰
            chol(T, ...)              29min   8.5 Gb   1.9‰

◮ optimization strategy:
– (iterative) grid search over τ (exploits multicore architecture)
– for given τ, use a quasi-Newton optimizer to optimize κ, β
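The two-level strategy can be sketched on a toy problem. Everything here is an assumption for illustration: `toy_objective` is a hypothetical stand-in for the negative log-likelihood, only κ is optimized in the inner step (β is omitted), and a golden-section search replaces the quasi-Newton optimizer to keep the example dependency-free. The grid evaluations are independent of each other, so they could be distributed over cores; the loop below runs them sequentially.

```python
import math


def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimize a unimodal 1-D function on [lo, hi] (stand-in for quasi-Newton)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    x = (a + b) / 2.0
    return x, f(x)


def toy_objective(tau, kappa):
    # Hypothetical objective with its minimum at tau = 0.3, kappa = 0.6.
    return (kappa - 2.0 * tau) ** 2 + (tau - 0.3) ** 2


def profile(tau):
    """Inner step: for a given tau, optimize the remaining parameter."""
    kappa_hat, value = golden_section_min(lambda k: toy_objective(tau, k), 0.0, 10.0)
    return value, kappa_hat


def iterative_grid_search(lo=0.0, hi=1.0, n_grid=11, n_rounds=4):
    """Outer step: grid over tau, then refine the grid around the best point."""
    best = None
    for _ in range(n_rounds):
        step = (hi - lo) / (n_grid - 1)
        grid = [lo + i * step for i in range(n_grid)]
        results = [(profile(tau), tau) for tau in grid]  # independent evaluations
        (value, kappa_hat), tau_hat = min(results)
        best = (tau_hat, kappa_hat, value)
        lo = max(0.0, tau_hat - step)  # keep tau inside [0, 1]
        hi = min(1.0, tau_hat + step)
    return best


tau_hat, kappa_hat, value = iterative_grid_search()
print(tau_hat, kappa_hat, value)
```

Restricting the expensive inner optimization to a grid of τ values keeps each task self-contained, which is the property that makes a multicore implementation straightforward.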
◮ with covariates “distance to nearest coast” and “elevation”
◮ diag(
◮ BIC improvement (1%) compared to nonstationary model