Statistical downscaling by EOFVAR-X models Jiang, Ci-Ren (Institute - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Statistical downscaling by EOFVAR-X models

Jiang, Ci-Ren (Institute of Statistical Science, Academia Sinica)
Chen, Lu-Hung* (Institute of Statistics, NCHU)
July 17, 2017

slide-2
SLIDE 2

Outline

  • Introduction
  • EOFVAR-X model
  • Real experiments
  • Conclusions and discussions


slide-3
SLIDE 3

Introduction

slide-4
SLIDE 4

Statistical downscaling

  • Station-wise multiple linear model (and its variants)
  • Bayesian spatial(-temporal) models
  • Neural networks
  • Support vector regression models
  • etc.


slide-5
SLIDE 5

Station-wise multiple linear model

Denote by $Y_t = [Y_t(s_1), \ldots, Y_t(s_m)]'$ the observations from $m$ stations and by $X_t = [X_t(\tilde{s}_1), \ldots, X_t(\tilde{s}_n)]'$ the model outputs on $n$ grid points. A station-wise multiple linear model considers

$$Y_t(s_i) = X_t' \boldsymbol{\beta}_i + \varepsilon_t$$

  • Independence assumptions between:
  • 1. different stations $Y_t(s_i)$ and $Y_t(s_j)$
  • 2. different observational time points $Y_t(s_i)$ and $Y_{t'}(s_i)$
  • But shouldn't a meteorological variable be spatially and temporally correlated?
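
As a rough illustration of the station-wise model, the sketch below fits one ordinary least-squares regression per station on synthetic data. The function names, the added intercept column, and the simulated data are assumptions made for illustration; they are not taken from the authors' code.

```python
import numpy as np

def fit_stationwise_mlr(X, Y):
    """Fit Y_t(s_i) = [1, X_t'] beta_i + eps_t separately for each station.

    X: (T, n) model outputs on n grid points; Y: (T, m) station observations.
    Returns B of shape (n + 1, m); column i holds the coefficients of station i.
    """
    # The intercept column is an assumption; the slide's formula has none.
    design = np.column_stack([np.ones(len(X)), X])
    # With a 2-D right-hand side, lstsq solves each column (station) independently.
    B, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return B

# Synthetic check: recover known coefficients from nearly noise-free data.
rng = np.random.default_rng(0)
T, n, m = 200, 3, 5
X = rng.normal(size=(T, n))
B_true = rng.normal(size=(n + 1, m))
Y = np.column_stack([np.ones(T), X]) @ B_true + 0.01 * rng.normal(size=(T, m))
B_hat = fit_stationwise_mlr(X, Y)
```

Because every station gets its own regression, nothing in this fit shares information across stations or across time, which is exactly the independence assumption questioned above.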


slide-6
SLIDE 6

Principal regression by Benestad et al. (2015)

  • Apply principal component analysis to both the observations $Y_t$ and the model outputs $X_t$. That is, let

$$Y_t = \sum_{k=1}^{K} \gamma_{t,k}\,\phi_k \quad\text{and}\quad X_t = \sum_{\ell=1}^{L} \chi_{t,\ell}\,\psi_\ell. \tag{1}$$

  • The spatial relationships of $Y_t$ and $X_t$ are retained in the eigenvectors $\phi_k$ and $\psi_\ell$, respectively.
  • By the orthonormal property of the $\phi_k$'s, the PC scores $\gamma_{t,k}$ are uncorrelated with each other, so it is safe to consider

$$\gamma_{t,k} = b_{k,0} + \sum_{\ell=1}^{L} b_{k,\ell}\,\chi_{t,\ell} + e_t \tag{2}$$

for each $k = 1, 2, \ldots, K$.
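
The decomposition in equation (1) can be sketched with a singular value decomposition. This is a minimal illustration on synthetic data: the helper name `pca_scores` is hypothetical, and subtracting the column means before the SVD is an assumption, since equation (1) is written without a mean term.

```python
import numpy as np

def pca_scores(Z, n_components):
    """Return (mean, vectors, scores) for a (T, d) data matrix Z.

    Rows of `vectors` are orthonormal eigenvectors (EOFs); `scores` holds the
    PC scores, so Z is approximated by mean + scores @ vectors.
    """
    mean = Z.mean(axis=0)
    _, _, Vt = np.linalg.svd(Z - mean, full_matrices=False)
    vectors = Vt[:n_components]
    scores = (Z - mean) @ vectors.T
    return mean, vectors, scores

# Synthetic, spatially correlated "stations" (illustrative data only).
rng = np.random.default_rng(1)
Y = rng.normal(size=(500, 6)) @ rng.normal(size=(6, 6))
mean, phi, gamma = pca_scores(Y, n_components=4)
score_cov = np.cov(gamma, rowvar=False)  # off-diagonal entries are numerically zero
```

The sample covariance of the scores is diagonal, which is the property that justifies regressing each score separately in equation (2).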


slide-7
SLIDE 7

Principal regression by Benestad et al. (2015)

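  • Denote by $\hat{b}_{k,0}, \hat{b}_{k,1}, \ldots, \hat{b}_{k,L}$ the estimates of $b_{k,0}, b_{k,1}, \ldots, b_{k,L}$. The downscaling of $X_t$ can then be accomplished by

$$\hat{Y}_t = \sum_{k=1}^{K} \hat{\gamma}_{t,k}\,\phi_k \quad\text{with}\quad \hat{\gamma}_{t,k} = \hat{b}_{k,0} + \sum_{\ell=1}^{L} \hat{b}_{k,\ell}\,\chi_{t,\ell}.$$

  • Empirical studies in Benestad et al. (2015) suggest that principal regression (PCR) with $K = 4$ and $L = 4$ generally performs better and is more robust than station-wise MLR.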
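
Putting equations (1) and (2) together, a PCR downscaling step might look like the following sketch. All function names and the synthetic data are hypothetical, the data are centered before PCA (an assumption), and the defaults `K = L = 4` simply mirror the values quoted above.

```python
import numpy as np

def pca(Z, k):
    """Column means and the first k orthonormal eigenvectors (as rows) of Z."""
    mean = Z.mean(axis=0)
    _, _, Vt = np.linalg.svd(Z - mean, full_matrices=False)
    return mean, Vt[:k]

def fit_pcr(X, Y, K=4, L=4):
    """Regress the K station PC scores on the L model-output PC scores."""
    y_mean, phi = pca(Y, K)
    x_mean, psi = pca(X, L)
    gamma = (Y - y_mean) @ phi.T                        # (T, K) scores of Y
    chi = (X - x_mean) @ psi.T                          # (T, L) scores of X
    design = np.column_stack([np.ones(len(chi)), chi])
    b, *_ = np.linalg.lstsq(design, gamma, rcond=None)  # (L + 1, K) coefficients
    return {"y_mean": y_mean, "phi": phi, "x_mean": x_mean, "psi": psi, "b": b}

def downscale_pcr(model, X_new):
    """Map model outputs to station space through the fitted score regression."""
    chi = (X_new - model["x_mean"]) @ model["psi"].T
    gamma_hat = np.column_stack([np.ones(len(chi)), chi]) @ model["b"]
    return model["y_mean"] + gamma_hat @ model["phi"]

# Synthetic data: both fields are driven by the same four latent signals.
rng = np.random.default_rng(2)
F = rng.normal(size=(300, 4))
X = F @ rng.normal(size=(4, 6)) + 0.001 * rng.normal(size=(300, 6))
Y = F @ rng.normal(size=(4, 10)) + 0.001 * rng.normal(size=(300, 10))
model = fit_pcr(X, Y)
Y_hat = downscale_pcr(model, X)
rmse = float(np.sqrt(np.mean((Y - Y_hat) ** 2)))
```

Note that the prediction at time t uses only the model output at time t; no past values enter.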


slide-8
SLIDE 8

EOFVAR-X model

slide-9
SLIDE 9

Hypothesis

  • Are temporal relationships also useful for statistical downscaling?


slide-10
SLIDE 10

EOFVAR-X model

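  • The PC scores $\boldsymbol{\gamma}_t = [\gamma_{t,1}, \gamma_{t,2}, \ldots, \gamma_{t,K}]'$ and $\boldsymbol{\chi}_t = [\chi_{t,1}, \chi_{t,2}, \ldots, \chi_{t,L}]'$ are temporally correlated owing to the nature of $Y_t$ and $X_t$.
  • Thus, they can be treated as vector time series.
  • For simplicity, we adopt the simplest model for analyzing vector time series, the vector autoregressive model, with $\boldsymbol{\chi}_t$ entering as exogenous input:

$$\gamma_{t,k} = \alpha_{k,0} + \sum_{\kappa=1}^{K} \alpha_{k,\kappa,1}\,\gamma_{t-1,\kappa} + \cdots + \sum_{\kappa=1}^{K} \alpha_{k,\kappa,p}\,\gamma_{t-p,\kappa} + \sum_{\ell=1}^{L} \rho_{k,\ell,0}\,\chi_{t,\ell} + \sum_{\ell=1}^{L} \rho_{k,\ell,1}\,\chi_{t-1,\ell} + \cdots + \sum_{\ell=1}^{L} \rho_{k,\ell,q}\,\chi_{t-q,\ell} + u_{t,k} \tag{3}$$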
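
To show what estimating equation (3) involves, here is a minimal sketch that stacks the lagged regressors and solves the system by ordinary least squares. This is an illustrative stand-in, not the authors' estimator. Function names, the coefficient layout, and the simulated series are assumptions.

```python
import numpy as np

def build_varx_design(gamma, chi, p, q):
    """Regressors of equation (3): intercept, gamma at lags 1..p, chi at lags 0..q."""
    T = len(gamma)
    start = max(p, q)                       # first time index with all lags available
    cols = [np.ones((T - start, 1))]
    for lag in range(1, p + 1):
        cols.append(gamma[start - lag:T - lag])
    for lag in range(0, q + 1):
        cols.append(chi[start - lag:T - lag])
    return np.hstack(cols), gamma[start:]

def fit_varx(gamma, chi, p, q):
    """Least-squares coefficients, shape (1 + K*p + L*(q + 1), K)."""
    design, target = build_varx_design(gamma, chi, p, q)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

# Simulate a small system with p = 1 and q = 0, then recover its coefficients.
rng = np.random.default_rng(0)
T, K, L = 2000, 2, 2
a0 = np.array([0.2, -0.1])
A = np.array([[0.5, 0.1], [0.0, 0.3]])      # effect of gamma_{t-1}
R = np.array([[1.0, 0.0], [0.5, -1.0]])     # effect of chi_t
chi = rng.normal(size=(T, L))
gamma = np.zeros((T, K))
for t in range(1, T):
    gamma[t] = a0 + A @ gamma[t - 1] + R @ chi[t] + 0.1 * rng.normal(size=K)
coef = fit_varx(gamma, chi, p=1, q=0)
```

In this layout, row 0 of `coef` is the intercept, the next `K*p` rows hold the autoregressive terms, and the remaining rows hold the exogenous terms; each block is the transpose of the corresponding coefficient matrix.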


slide-11
SLIDE 11

EOFVAR-X model

  • The lags $p$ and $q$ in equation (3) are tuning parameters and can be chosen by model selection, e.g., cross-validation, AIC, or BIC.
  • Equation (2) by Benestad et al. (2015) is a special case of equation (3) with $\alpha_{k,\kappa,1} = \alpha_{k,\kappa,2} = \cdots = \alpha_{k,\kappa,p} = 0$ and $\rho_{k,\ell,1} = \rho_{k,\ell,2} = \cdots = \rho_{k,\ell,q} = 0$ for all $k$, $\kappa$, and $\ell$.
  • Thus our hypothesis can be verified by hypothesis testing or model selection.
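
The special-case claim can be checked by direct substitution. The worked step below sets every lagged coefficient in equation (3) to zero; identifying the error term $e_t$ of equation (2) with $u_{t,k}$ is an assumption about notation.

```latex
\begin{align*}
\gamma_{t,k}
  &= \alpha_{k,0}
   + \sum_{j=1}^{p}\sum_{\kappa=1}^{K}
       \underbrace{\alpha_{k,\kappa,j}}_{=\,0}\,\gamma_{t-j,\kappa}
   + \sum_{\ell=1}^{L} \rho_{k,\ell,0}\,\chi_{t,\ell}
   + \sum_{j=1}^{q}\sum_{\ell=1}^{L}
       \underbrace{\rho_{k,\ell,j}}_{=\,0}\,\chi_{t-j,\ell}
   + u_{t,k} \\
  &= \alpha_{k,0} + \sum_{\ell=1}^{L} \rho_{k,\ell,0}\,\chi_{t,\ell} + u_{t,k},
\end{align*}
```

This is equation (2) with $b_{k,0} = \alpha_{k,0}$ and $b_{k,\ell} = \rho_{k,\ell,0}$, so testing whether the lagged coefficients are jointly zero amounts to comparing PCR against EOFVAR-X.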


slide-12
SLIDE 12

Implementation details

  • 1. Impute possible missing data in $Y_t$ by DINEOF (Data Interpolating Empirical Orthogonal Functions; Beckers and Rixen, 2003), provided by the R package sinkr.
  • 2. Apply PCA to both $X_t$ and $Y_t$ as described in equation (1).
  • 3. Estimate the coefficients in equation (3) with the R package MTS.
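
The imputation in step 1 can be pictured with the simplified sketch below. It is not the sinkr implementation: it keeps a fixed number of EOFs (`n_eof`, a hypothetical parameter), whereas DINEOF as described by Beckers and Rixen (2003) chooses that number by cross-validation on withheld observations. The synthetic data are likewise assumptions.

```python
import numpy as np

def dineof_impute(Y, n_eof, n_iter=100, tol=1e-6):
    """Simplified DINEOF-style filling of NaNs in a (T, m) matrix.

    Missing entries start at zero anomaly and are repeatedly replaced by a
    rank-`n_eof` SVD reconstruction until the filled values stop changing.
    """
    missing = np.isnan(Y)
    col_mean = np.nanmean(Y, axis=0)
    filled = np.where(missing, 0.0, Y - col_mean)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        recon = (U[:, :n_eof] * s[:n_eof]) @ Vt[:n_eof]
        change = np.sqrt(np.mean((recon[missing] - filled[missing]) ** 2))
        filled[missing] = recon[missing]    # observed entries are never altered
        if change < tol:
            break
    return filled + col_mean

# Synthetic low-rank field with about 10% of the entries removed at random.
rng = np.random.default_rng(3)
truth = 10.0 + rng.normal(size=(200, 2)) @ rng.normal(size=(2, 20))
mask = rng.random(truth.shape) < 0.1
observed = truth.copy()
observed[mask] = np.nan
imputed = dineof_impute(observed, n_eof=3)

# Compare against the naive alternative of filling with station means.
mean_fill = np.broadcast_to(np.nanmean(observed, axis=0), truth.shape)
err_dineof = np.sqrt(np.mean((imputed[mask] - truth[mask]) ** 2))
err_mean = np.sqrt(np.mean((mean_fill[mask] - truth[mask]) ** 2))
```

The filled matrix can then be passed to the PCA of step 2, which requires complete data.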


slide-13
SLIDE 13

Real experiments

slide-14
SLIDE 14

Data sets

Daily mean temperatures from the following two countries are used:

  • 1. Germany
      • 254 stations from ECAD, 1979–2016.
      • NCEP-DOE Reanalysis 2 (lon 5°–16°, lat 47°–56°)
      • Rolling cross-validation from 2001/1/1.
  • 2. Taiwan
      • 7 stations from ASOS, 2000–2016.
      • NCEP-DOE Reanalysis 2 (lon 118°–123°, lat 21°–26°)
      • Rolling cross-validation from 2011/1/1.

As suggested by Benestad et al. (2015), we set $K = L = 4$ in both experiments.


slide-15
SLIDE 15

Rolling cross-validation

Figure 1: Figure from https://robjhyndman.com/hyndsight/tscv/
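
A rolling (forecasting-origin) scheme can be sketched as a generator of train/test index ranges. The slides do not state the window type, forecast horizon, or step size, so the expanding window and the default values below are assumptions.

```python
def rolling_origin_splits(n_obs, first_test, horizon=1, step=1):
    """Yield (train_indices, test_indices) with an expanding training window.

    Each test block starts right after the training data, so a model is never
    evaluated on observations that precede its training period.
    """
    origin = first_test
    while origin + horizon <= n_obs:
        yield range(0, origin), range(origin, origin + horizon)
        origin += step

# Six observations, with evaluation starting at index 3.
splits = [(list(tr), list(te)) for tr, te in rolling_origin_splits(6, first_test=3)]
# splits == [([0, 1, 2], [3]), ([0, 1, 2, 3], [4]), ([0, 1, 2, 3, 4], [5])]
```

In the experiments, `first_test` would correspond to the index of 2001/1/1 (Germany) or 2011/1/1 (Taiwan) in the daily series.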


slide-16
SLIDE 16

Results

                Germany                     Taiwan
                RMSE        Correlation     RMSE        Correlation
  PCR           1.39±0.53   0.83±0.11       1.44±0.75   0.80±0.37
  EOFVAR-X      1.14±0.49   0.88±0.08       1.00±0.57   0.83±0.33

Table 1: Cross-validated RMSE and Pearson correlation coefficients (mean±sd)

The source code and notebooks for the experiments are available at the project's GitHub page: https://github.com/chenlu-hung/eofvarx
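
For reference, the two scores reported in Table 1 can be computed as in the sketch below. How the mean±sd values are aggregated (over stations, folds, or both) is not stated in the slides, so only the per-series metrics are shown, and the sample numbers are made up for illustration.

```python
import math

def rmse(obs, pred):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def pearson(obs, pred):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (sd_o * sd_p)

# A prediction with a constant bias of 0.5: RMSE is 0.5, correlation is 1.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.5, 3.5, 4.5]
```

The example shows why both scores are reported: correlation ignores a constant bias, while RMSE penalizes it.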


slide-17
SLIDE 17

Conclusions and discussions

slide-18
SLIDE 18

Conclusions and discussions

  • 1. We developed an EOFVAR-X model that incorporates both the spatial and the temporal relationships in a meteorological variable.
  • 2. Our experimental results for two different countries suggest that spatial and temporal relationships are both useful for statistical downscaling.
  • 3. Some future directions:
      • More sophisticated time series models, e.g., ARIMA, GARCH, etc.
      • Simultaneous downscaling of multiple meteorological variables.
      • Nonlinear models (e.g., deep learning).


slide-19
SLIDE 19

Thank you!


slide-20
SLIDE 20

References

Beckers, J.-M. and Rixen, M. (2003). EOF calculations and data filling from incomplete oceanographic datasets. Journal of Atmospheric and Oceanic Technology.

Benestad, R. E., Chen, D., Mezghani, A., Fan, L., and Parding, K. (2015). On using principal components to represent stations in empirical–statistical downscaling. Tellus A.
