Imputing missing values in satellite data: From parametric to non-parametric approaches
@ReinhardFurrer, I-Math/ICS, UZH
GI-Forum, ifgi, WWU Münster, 2018/11/13
Joint work with Florian Gerber, Emilio Porcu and Francois Bachoc, and with contributions of several others
◮ Motivation
◮ Parametric models and their issues
◮ A particular non-parametric approach
◮ Outcome of a biodiversity exercise
Questions and «Fast-Forward» appreciated!
Slides at http://www.math.uzh.ch/furrer/slides/
www.gcb.uzh.ch
◮ University Research Priority Program, started in 2013
◮ ≈ 50 members in 2 Faculties and 5 Institutes/Departments
Source: 10.1073/pnas.1703928114
“… we show that primary productivity, its temporal stability, and the decadal trend of a prolonged growing season strongly increase with biodiversity across heterogeneous landscapes, which is consistent over vast environmental, climatic, and altitudinal gradients. …”
Now: analysis in the Arctic, not Switzerland
◮ Species abundance plot scale measurements from the International Tundra Experiment (ITEX)
◮ NDVI satellite images and ASTER elevation data
Source: F. Gerber
MODIS NDVI data (satellite product MOD13A1, NDVI = (NIR − R) / (NIR + R))
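The NDVI ratio on this slide can be written as a small function. The following is a minimal Python sketch, not the MOD13A1 processing chain: the function name `ndvi`, the use of `None` for a missing band value (for example under cloud cover), and the handling of a zero denominator are assumptions made for illustration.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index, (NIR - R) / (NIR + R).

    Returns None when a band is missing (None) or when NIR + R is zero;
    this missing-value convention is an assumption of the sketch.
    """
    if nir is None or red is None:
        return None
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Hypothetical reflectances for three pixels; the second one is missing.
pixels = [(0.5, 0.1), (None, 0.2), (0.3, 0.3)]
values = [ndvi(nir, red) for nir, red in pixels]
```

Pixels that come back as `None` are the gaps that the rest of the talk is about filling.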
Source: Gerber et al (2018), TGRS
Observations: $y(s_1), \dots, y(s_n)$
Model:
$Y(s) = \text{signal} + \text{noise}$
$Y(s) = \text{trend} + \text{stochastic part} + \text{noise}$
$Y(s) = x^T(s)\beta + \alpha(s) + Z(s) + \varepsilon(s)$
Predict the quantity of interest at an arbitrary location.
Why?
◮ Fill in missing data
◮ Force data onto a regular grid
◮ Smooth out the measurement error
How?
◮ By eye
◮ Linear interpolation
◮ The correct way …
Describing the covariance structure
[Figure: covariance as a function of distance (lag h)]
Covariance matrix Σ contains elements $C(\mathrm{dist}(s_i, s_j))$
Predict $Z(s_0)$ given $y(s_1), \dots, y(s_n)$.
Minimize mean squared prediction error (over all linear unbiased predictors):
BLUP = $\mathrm{Cov}(\text{pred}, \text{obs}) \cdot \mathrm{Var}(\text{obs})^{-1} \cdot \text{obs}$
(one spatial process, no trend, known covariance structure;
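The predictor Cov(pred, obs) · Var(obs)^{-1} · obs can be illustrated with a few lines of code. The following is a minimal Python sketch under the slide's assumptions (one spatial process, no trend, known covariance) plus some of its own: the exponential covariance `exp_cov`, the absence of measurement noise, and all function names are illustrative choices and do not come from the talk.

```python
import math


def exp_cov(h, sill=1.0, range_=1.0):
    """Exponential covariance C(h) = sill * exp(-h / range_) (assumed model)."""
    return sill * math.exp(-h / range_)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def simple_krige(sites, y, s0, cov=exp_cov):
    """Return c Sigma^{-1} y for a zero-mean process with known covariance."""
    sigma = [[cov(math.dist(si, sj)) for sj in sites] for si in sites]
    c = [cov(math.dist(s0, si)) for si in sites]
    w = solve(sigma, y)  # Sigma^{-1} y
    return sum(ci * wi for ci, wi in zip(c, w))


# Hypothetical observations at three sites in the plane.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
y = [1.0, 2.0, 0.5]
prediction = simple_krige(sites, y, (0.5, 0.5))
```

Without a noise term the predictor reproduces the data at observed sites, and far from all sites it falls back to the (zero) mean. The dense solve costs on the order of n³ operations, which is why this approach does not carry over unchanged to large data sets.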
$\mathrm{Cov}(\text{pred}, \text{obs}) \cdot \mathrm{Var}(\text{obs})^{-1} \cdot \text{obs} = c\,\Sigma^{-1} y$
◮ “Simple” spatial interpolation … on paper or in class!
◮ BUT:

◮ Parametric structure typically ok
◮ Non-parametric structure often creates “model clash”

◮ (Method of moments estimation)
◮ Likelihood approaches

◮ Many R packages do perform kriging …
… many black boxes …
… to tailored situations
See Heaton et al. arXiv:1710.05013/JABES forthcoming
◮ Sparse covariance methods:
– Covariance Tapering (Furrer)
– Spatial Partitioning (Heaton)
◮ Sparse precision methods:
– Lattice Kriging (Nychka)
– Multiresolution Approximations (Katzfuss)
– Stochastic Partial Differential Equations (Lindgren)
– Periodic Embedding (Guinness)
– Nearest Neighbor Processes (Datta)
◮ Low rank approximation:
– Fixed Rank Kriging (Zammit-Mangion)
– Predictive Processes (Finley)
◮ Algorithmic approaches:
– Gapfill (Gerber)
– Local Approximate Gaussian Processes (Gramacy)
– Metakriging (Guhaniyogi)
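Geostatistical model (GRF):
[Figure: locations $s_1, \dots, s_i, \dots, s_n$ and covariance as a function of distance (lag h), with $C(\mathrm{dist}(s_1, s_n))$ marked]
Covariance matrix: Σ
Lattice model (GMRF):
$E(Z_i \mid z_{-i}) = \beta \sum_{j \sim i} z_j$, $\mathrm{Var}(Z_i \mid z_{-i}) = \tau^2$
Gaussianity and regularity conditions: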
Using sparse covariance functions for greater computational efficiency.
Sparseness is guaranteed when
◮ the covariance function has a compact support
◮ a compact support is (artificially) imposed: tapering
[Figure: covariance against distance (lag h) for Matérn ν = 1.5, the Wendland taper, and their product Matérn × Wendland]
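The effect of tapering can be sketched numerically. The following Python example multiplies a Matérn covariance (ν = 1.5) by a compactly supported Wendland function and counts the exact zeros this produces in a covariance matrix. The parametrization of the Matérn function, the particular Wendland function (1 − r)⁴(1 + 4r), the range values and the one-dimensional grid of sites are all assumptions made for illustration; the slides do not specify them.

```python
import math


def matern32(h, range_=10.0):
    """Matern covariance, smoothness nu = 1.5, unit variance (one common form)."""
    a = math.sqrt(3.0) * h / range_
    return (1.0 + a) * math.exp(-a)


def wendland1(h, taper_range=20.0):
    """Wendland taper (1 - r)^4 (1 + 4 r) for r = h / taper_range < 1, else 0."""
    r = h / taper_range
    return (1.0 - r) ** 4 * (1.0 + 4.0 * r) if r < 1.0 else 0.0


def tapered(h):
    """Tapered covariance: product of the Matern covariance and the taper."""
    return matern32(h) * wendland1(h)


def zero_fraction(n, cov):
    """Fraction of exact zeros in the covariance matrix of sites 0, 1, ..., n - 1."""
    zeros = sum(cov(abs(i - j)) == 0.0 for i in range(n) for j in range(n))
    return zeros / (n * n)


dense = zero_fraction(50, matern32)   # Matern alone: no exact zeros
sparse = zero_fraction(50, tapered)   # tapered: zero beyond the taper range
```

All pairs of sites further apart than the taper range get covariance exactly zero, so the matrix can be stored and factorized with sparse matrix algorithms.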
◮ Univariate setting:
Proofs based on infill asymptotics and “misspecified” covariances
Conditions on the tail behaviour of the spectrum of the (tapered) covariance
Furrer, Genton, Nychka (2006) JCGS; Kaufman, Schervish, Nychka (2008) JMVA; Stein (2013) JCGS; Bevilacqua et al (2018?) AoS
◮ Multivariate setting:
Proofs based on domain increasing framework
Weak conditions on the taper
Furrer, Du, Bachoc (2016) JMVA
Software to exploit the sparse structure
spam64 for R:
◮ an R package for sparse matrix algebra
◮ storage economical and fast
◮ versatile, intuitive and simple
See Furrer et al. (2006) JCGS; Furrer, Sain (2010) JSS
◮ R objects have at most $2^{31}$ elements (almost)
◮ R does not ‘have’ 64-bit integers: stored as doubles
◮ 64-bit exploitation consists of type conversions between front-end R and pre-compiled code
See Gerber, Moesinger, Furrer (2017) Comput Geosci; Gerber, Moesinger, Furrer (2017) SoftwareX
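Why storing large integers as doubles works can be checked directly. The following Python sketch is an illustration of the number formats only, not of spam64 or dotCall64: it uses the fact that a double-precision number represents every integer exactly up to 2^53, which is far beyond the 32-bit limit of 2^31 − 1. The helper names are hypothetical.

```python
INT32_MAX = 2**31 - 1   # largest 32-bit signed integer
EXACT_LIMIT = 2**53     # every integer up to this bound is exact as a double


def fits_int32(n):
    """True if n can be stored as a 32-bit signed integer."""
    return -2**31 <= n <= INT32_MAX


def exact_as_double(n):
    """True if converting n to a double and back returns n unchanged."""
    return int(float(n)) == n


# A hypothetical index just beyond the 32-bit range.
big_index = 2**31
ok_as_double = exact_as_double(big_index)    # still exact as a double
ok_as_int32 = fits_int32(big_index)          # no longer fits in 32 bits
```

Indices above 2^31 − 1 therefore survive the round trip through doubles, and the pre-compiled code can convert them to true 64-bit integers.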
MODIS NDVI data (satellite product MOD13A1, NDVI = (NIR − R) / (NIR + R))
[Figure: NDVI images by day of the year (145, 161, 177, 193) and year (2004 to 2007); NDVI scale 0.2 to 0.8]
[Figure: images for day of the year 177 and 193, years 2004 to 2007, with ranks r = 1, …, 12 (low to high); NDVI scale 0.2 to 0.8]
Date:   193 doy 2004   177 doy 2006   177 doy 2005   193 doy 2006   193 doy 2005
Score:  0.65           0.71           0.77           0.88           0.91
Rank:   8              9              10             11             12
q̂:      NA             0.64           NA             0.12           0.77
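The q̂ row of the table can be read as follows: in each neighbouring image where the target pixel is observed, q̂ is the empirical quantile of that pixel within its image, and NA marks images where the pixel is missing. The following Python sketch is a strongly simplified illustration of this idea and is not the gapfill procedure of Gerber et al (2018): the score and rank steps are left out, the final prediction uses a plain empirical quantile, and all function names and the data layout (images as flat lists with `None` for missing values) are assumptions.

```python
from statistics import mean


def empirical_quantile(values, x):
    """Fraction of the observed values that are <= x."""
    obs = [v for v in values if v is not None]
    return sum(v <= x for v in obs) / len(obs)


def quantile_value(values, q):
    """Observed value at quantile q (nearest-rank rule on the sorted values)."""
    obs = sorted(v for v in values if v is not None)
    idx = min(len(obs) - 1, max(0, round(q * (len(obs) - 1))))
    return obs[idx]


def fill_pixel(images, target, pixel):
    """Predict images[target][pixel] from the other images in the subset."""
    q_hats = []
    for k, img in enumerate(images):
        if k == target or img[pixel] is None:
            continue  # corresponds to an NA entry in the q-hat row
        q_hats.append(empirical_quantile(img, img[pixel]))
    if not q_hats:
        return None  # no image informs this pixel
    return quantile_value(images[target], mean(q_hats))


# Hypothetical subset of three small images; None marks missing pixels.
images = [
    [0.2, 0.4, 0.6, 0.8],
    [0.3, 0.5, None, 0.9],
    [0.1, None, 0.5, 0.7],
]
filled = fill_pixel(images, target=1, pixel=2)
```

The sketch only shows how observed pixels in other images can inform a missing one through their within-image quantiles.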
[Figure: data and predictions (NDVI) and uncertainties (interval length), by day of the year (145 to 193) and year (2004 to 2007)]
[Figure: RMSE ×10³]
[Figure: (l) Uncertainty contribution from the indicated four steps of the gapfill procedure. (m) Average width of the 90% prediction intervals (40% missing values). (r) Average interval widths and coverage rate per day of the year.]
H1: Plant productivity (quantified through NDVI) is positively correlated with plot scale biodiversity
H2: Landscape variability (quantified through NDVI and slope) is positively correlated with plot scale biodiversity
H3: Slope induces a drainage effect and increases plot scale biodiversity

◮ Species abundance plot scale measurements from the International Tundra Experiment (ITEX): Shannon biodiversity index on site and plot scale
◮ Landsat NDVI satellite images and ASTER elevation data: characterization of the landscape heterogeneity
Source: F. Gerber
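The Shannon biodiversity index mentioned above is computed from species abundances as H = −Σ pᵢ ln pᵢ, with pᵢ the relative abundance of species i. The following Python sketch shows the calculation; the use of the natural logarithm, the function name and the example counts are assumptions, and the slides do not say how the ITEX abundances were preprocessed.

```python
import math


def shannon_index(abundances):
    """Shannon index H = -sum(p_i * ln(p_i)) over species with abundance > 0."""
    total = sum(abundances)
    if total <= 0:
        return 0.0  # empty plot: define the index as zero (assumption)
    props = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in props)


# Hypothetical abundance counts for two plots with four species each.
even_plot = [10, 10, 10, 10]     # all species equally abundant
uneven_plot = [37, 1, 1, 1]      # one dominant species
h_even = shannon_index(even_plot)
h_uneven = shannon_index(uneven_plot)
```

For a fixed number of species the index is largest when all species are equally abundant, which is why the even plot scores higher than the uneven one.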
[Figure: % cloud cover (25 to 100) by month and year (1980 to 2010) for the sites abisko, alexfiord, anwr, atqasuk, barrow, bylot, kanger, kilpisjarvi, sadvent, sverdrup, toolik, zackenberg]
Source: F. Gerber
◮ The data did not provide evidence for the hypotheses H1–H3.
◮ Statistical power could be improved by adding more plot data.
◮ The limited number of Landsat images makes it difficult to measure their seasonal and annual variability. This confounds the temporal aggregation.
Source: F. Gerber
Collaboration with:
– Florian Gerber
– Gabriela Schaepman-Strub
– Rogier de Jong
– Emilio Porcu
– Francois Bachoc
– Alfredo Alegria
– Kaspar Moesinger
– former & present ‘Applied Statistics’ team
… and many more
143282, 144973, 175529
Source: www.gcb.uzh.ch
www.gcb.uzh.ch/en/Events/URPP-GCB-Conferences/conference2019.html
Furrer, Sain (2010) spam: A sparse matrix R package with emphasis on MCMC methods for Gaussian Markov random fields. JSS 36, 1–25
Furrer et al (2017) spam: Sparse Matrix algebra. R package version 2.2-0
Furrer et al (2017) spam64: 64-Bit Extension of the SPArse Matrix R Package 'spam'. R package version 2.2-0
Gerber, Moesinger, Furrer (2017) Extending R Packages to Support 64-bit Compiled Code: An Illustration with spam64 and GIMMS NDVI3g Data. Comput Geosci 104, 109–119
Gerber et al (2018) Predicting Missing Values in Spatio-Temporal Remote Sensing
Gerber, Moesinger, Furrer (2017) dotCall64: An Efficient Interface to Compiled C/C++ and Fortran Code Supporting Long Vectors. SoftwareX 7, 217–221
Heaton et al (2017f) A Case Study Competition among Methods for Analyzing Large Spatial Data. arXiv:1710.05013/JABES forthcoming
Porcu, Alegria, Furrer (2018) Modeling Temporally Evolving and Spatially Globally Dependent Data. International Statistical Review 86, 344–377
Complete list at: www.math.uzh.ch/furrer/research/publications.shtml