Specification of Landmarks and Forecasting Water Temperature - - PowerPoint PPT Presentation

specification of landmarks and forecasting water
SMART_READER_LITE
LIVE PREVIEW

Specification of Landmarks and Forecasting Water Temperature - - PowerPoint PPT Presentation

Specification of Landmarks and Forecasting Water Temperature Water Management in the River Wupper G oran Kauermann Thomas Mestekemper Center for Statistics University Bielefeld University Bielefeld 14. August 2008 The River Wupper


slide-1
SLIDE 1

Specification of Landmarks and Forecasting Water Temperature – Water Management in the River Wupper

  • ran Kauermann

Center for Statistics University Bielefeld Thomas Mestekemper University Bielefeld

  • 14. August 2008
slide-2
SLIDE 2

The River Wupper

  • 14. August 2008

1

slide-3
SLIDE 3

The River Wupper

  • 14. August 2008

2

slide-4
SLIDE 4

The River Wupper and its Power Plants

  • 14. August 2008

3

slide-5
SLIDE 5

The EU Water Framework Directive

Commits European Union member states to achieve good qualitative and quantitative status of water bodies until 2015. Good surface water status means both, good ecological and chemical

  • status. The first refers to the quality and functioning of the aquatic eco-

system. For the Wupper this implies: Reduce electric power production “Too warm upstream water” = ⇒

  • r even shut down power plant

Definition of “Warm Water” depends on the fish life and reproduction cycle and the given threshold may vary over the year.

  • 14. August 2008

4

slide-6
SLIDE 6

Outline of Talk

  • Forecasting (upstream) Water Temperature
  • Specification of Landmarks (Threshold, dependent fish spawning cycle)
  • Discussion
  • 14. August 2008

5

slide-7
SLIDE 7

Literature on Water Temperature Forecasting

Hydrological Literature:

  • Seasonal and daily variations of water temperature are significantly

important for aquatic resources. (Caissie et al., 2005, Hydrological Proces-

ses)

  • Two model classes: physical (thermo-dynamic) and stochastic (stati-

stical) models. (Webb et al., 2008, Hydrological Processes) Statistical Literature:

  • Functional component models or dynamic factor models. (Cornillon et

al., 2008, CSDA; Stock & Watson, 2006, Handbook of Economic Forecasting)

  • Functional Time Series. (Ferraty & Vieu, 2006, Nonparametric FDA, Springer-

Verlag)

  • 14. August 2008

6

slide-8
SLIDE 8

Smooth Cyclic Estimation

Let index t = (y, d) denote time with year y, day in year d and wt and at be a 24-dimensional vectors of the hourly water and air temperature, respectively, which decompose to wt = µw(d) + ¯ wt, at = µa(d) + ¯ ayd. ↑ ↑ yearly trend yearly trend Functions µw(d) and µa(d) are fitted with “wrapped” B-splines, i. e. lim

d→365+

µw(d) = lim

d→1−

µw(d).

  • 14. August 2008

7

slide-9
SLIDE 9

Average Temperatures µw and µa

  • 14. August 2008

8

slide-10
SLIDE 10

Functional Principal Components Decomposition

¯ wt shall be decomposed to a dynamic factor model, that is, we reduce dimensions by extracting k suitable factors (done by PCA): ¯ wt = ftΛT

w + w,t

where Λw is a 24 × k dimensional loading matrix, ft a k dimensional factor and w,t a white noise residual. Accordingly for the air temperature we extract h suitable factors: ¯ at = gtΛT

a + a,t

  • 14. August 2008

9

slide-11
SLIDE 11

Fitted Principal Components

5 10 15 20 −0.2 −0.1 0.0 0.1 0.2 0.3

water temperature

hour factor loading

1 2

5 10 15 20 −0.3 −0.2 −0.1 0.0 0.1 0.2

air temperature

hour factor loading

1 2

  • 14. August 2008

10

slide-12
SLIDE 12

The Dynamic Factor Model

Using the backshift operator ∆a,bft = (ft−a, . . . , ft−b). We assume an autoregressive model for the factor ft: ft = (∆1,pft)βf + (∆0,qgt)βg + f,t. This implies that ft depends on:

  • water temperature factors of the p previous days
  • air temperature factors of the q previous days
  • the current day air temperature factors.

Note: In a forecasting setting the last point is only available as meteorological forecast.

  • 14. August 2008

11

slide-13
SLIDE 13

Estimation of the factors ft and gt

We want to compare three different approaches to estimate the factors. 1) We start with a quite simple Least Squares estimation method where the facor loadins are taken as

  • ft = ¯

wtΛw and

  • gt = ¯

atΛa Pro: The remaining parameters βf and βgcan easily be found using least squares regression. Con: The resulting estimates are not Maximum Likelihood-based. We need to incorporate our stochastic models in the estimation method.

  • 14. August 2008

12

slide-14
SLIDE 14

Estimation of the factors ft and gt (continued)

2) In a Maximum Likelihood approach we assume that the residuals in the former mentioned models follow normal distributions: w,t ∼ N

  • 0, diag(σ2

w)

  • and

f,t ∼ N

  • 0, diag(σ2

f)

  • .

3) We incorporate a stochastic autoregressive model for the air tempe- rature, as well, in a Full Maximum Likelihood estimation method: gt = (∆1,˜

qgt)˜

βg + g,t asuming a,t ∼ N

  • 0, diag(σ2

a)

  • and

g,t ∼ N

  • 0, diag(σ2

g)

  • .

The unknown parameters θ = (βf, βg, σ2

f, σ2 w) and ˜

θ = (θ, ˜ βg, ˜ σ2

g, ˜

σ2

a)

are now estimated using an EM-algorithm.

  • 14. August 2008

13

slide-15
SLIDE 15

Model Selection (in progress)

In order to select the best performig model we divide our dataset in a training and a forecasting sample. To measure the model quality one could, for example, make use of the Mean Squared Prediction Error defined by: MSPE = 1 n

n

  • t=i

(wt − wt)(wt − wt)T. We have to select:

  • k and h; the optimal number of factors for water and air temperature,

respectively,

  • p and q; the optimal number of time lags for water and air tempera-

ture, respectively, (we treat ˜ q = 2 as fixed)

  • the optimal estimation method.
  • 14. August 2008

14

slide-16
SLIDE 16

Demonstration

Warm spring days over Whitsun 2008

5 10 15 20 13 14 15 16 17 Temperature Hour

10.5.2008 11.5. 12.5. 13.5. 14.5. 15.5

  • 14. August 2008

15

slide-17
SLIDE 17

Demonstration

10 20 30 40 50 60 70 12 13 14 15 16 Hour Temperature

Real vs. Forecasted Temperature

real forecasted 5 10 15 20 0.20 0.25 0.30 0.35 0.40 Hour RMSE

Root Mean Square Error

  • 14. August 2008

16

slide-18
SLIDE 18

Multiple Day Forecast

Multiple day forecasts show discontinuities. Solution: To achieve a continuous m day forecast we divide our time axis into time intervals of length m, i. e. wm

t = w˜ t = (wyd1, . . . , wyd24, wy(d+1)1, . . . wy(d+m)24)

The above models are re-fitted in analogy to the 24h case.

  • 14. August 2008

17

slide-19
SLIDE 19

Demonstration

10 20 30 40 50 60 70 12 13 14 15 16 Hour Temperature

Real vs. forecasted Temperature

real forecasted 10 20 30 40 50 60 70 0.2 0.3 0.4 0.5 0.6 Hour RMSE

Root Mean Square Error

  • 14. August 2008

18

slide-20
SLIDE 20

Comparison to other modelling approaches

We compared our Least Squares model to three approaches to model the daily maximum temperature presented in Cassie et al. (1998, Can. J. Civ.

Eng.)

  • 1. ¯

wmax

t

= (∆0,2¯ amax

t

)β1 + 1

t resulted in an RMSE of 1.295◦C.

  • 2. ¯

wmax

t

= (∆1,2 ¯ wmax

t

)β2 + K ¯ amax

t

resulted in an RMSE of 2.439◦C.

  • 3. ¯

wmax

t

=

ζ0 1−δ1B¯

amax

t

+

1 1−φ1Bnt resulted in an RMSE of 1.018◦C.

For p = 2, q = 1, k = h = 3 and ˜ q = 2 our Least Squares Model yielded an RMSE of 0.42◦C.

  • 14. August 2008

19

slide-21
SLIDE 21

Finding Seasonal Pattern

Besides forecasting is the specification of seasonal pattern an important issue, since:

  • Water temperature has to stay below ecologically justified thresholds

to preserve the fish populations.

  • Threshold values depend on season, or more precisely on reproduc-

tion cycle of fish.

  • Seasons can vary like an early spring or late summer.
  • What is the “reference year”?
  • 14. August 2008

20

slide-22
SLIDE 22

Literature in ’Warping’ and ’Landmark Specification’

  • Landmark specification in growth curves. (Kneip & Gasser, 1992, Annals
  • f Statistics; Gasser & Kneip, 1995, JASA)
  • Automatic Warping (or self-modelling). (Ramsay & Li, 1998, JRSS B; Ger-

vini & Gasser, 2004, JRSS B)

  • We need an “online” warping, as data arrives over time.
  • 14. August 2008

21

slide-23
SLIDE 23

Structure of Water temperature

time (months) temperature (Celsius) 7 8 9 10 11 12 1 2 3 4 5 6 10 20 30 40 50 60

2001 2002 2002 2003 2003 2004 2004 2005 2005 2006 2006 2007

temperature curves

  • 14. August 2008

22

slide-24
SLIDE 24

Different modelling for landmark registration

Let t = (y, d, h) where h is the hour in day d. water: wt = wydh = ¯ wyd + xydh ↑ ↑ daily avg. temp. residual ↓ ↓ air: at = aydh = ¯ ayd + zydh A principal component analysis is run on the residuals xydh and zydh after substracting the mean daily temperature course: xydh = µx(d) + ¯ xydh and zydh = µz(d) + ¯ zydh.

  • 14. August 2008

23

slide-25
SLIDE 25

Seasonal Pattern in PCA coefficients

xydh = µx(h) +

Kx

  • k=1

fyd,kλx,k(h)

5 10 15 20 −0.4 0.0 0.2 0.4

1 2 3 1 2 3

principal components

2002 2003 2004 2005 2006 2007 −10 −5 5

first score

2002 2003 2004 2005 2006 2007 −4 −2 1 2

second score

2002 2003 2004 2005 2006 2007 −2 2 4

third score

  • 14. August 2008

24

slide-26
SLIDE 26

Landmark based on First PCA Score

We check, whether H0 : E(fyd,1) ≤ 0 is rejected.

time score −10 −5 5 10 2002 2003 2004 2005 2006 2007 253 95 267 80 283 90 236 93 237 123 279 96

first principal component scores

time p−value 0.05 0.95 2002 2003 2004 2005 2006 2007 253 95 267 80 283 90 236 93 237 123 279 96

p−value taking into account 15 consecutive days

time temperature 5 10 15 20 2002 2003 2004 2005 2006 2007 253 95 267 80 283 90 236 93 237 123 279 96

average daily temperature

  • 14. August 2008

25

slide-27
SLIDE 27

Correlation between Water and Air Temperature

Water: xydh = µx(h) +

K

  • k=1

fyd,kλx,k(h) Air: zydh = µz(h) +

K

  • k=1

gyd,kλz,k(h) Canonical correlation: For coefficient vectors δT

k and γT k we obtain the maximal correlation bet-

ween water and air temperature, i. e. max Cor(δT

k xt, γT k zt), k = 1, 2, . . .

  • 14. August 2008

26

slide-28
SLIDE 28

Canonical Correlation Landmark

5 10 15 20 −0.6 −0.2 0.2 daytime canonical component

1 2 3 1 2 3

canonical component water temperature

5 10 15 20 −1 1 2 daytime canonical component

1 2 3 1 2 3

canonical component air temperature

  • 14. August 2008

27

slide-29
SLIDE 29

Canonical Correlation Contributions

We look at the canonical correlation: water: ωt = δT

1 xt

air: νt = γT

1 zt

both: ωt · νt

−0.06 −0.04 −0.02 0.00 0.02 0.04 0.06 −0.08 −0.04 0.00 0.04 water temperature air temperature

canonical correlation scores

2003 2004 2005 2006 2007 −2 2 4 6 8 time (year) correlation

contribution to first canonical correlation

81 87 92 123 92 229 248 211 215 206

  • 14. August 2008

28

slide-30
SLIDE 30

Plotting the Landmarks

Time Temperature

Location of Landmarks

5 10 15 20 2003 2004 2005 2006 2007 PCA CANCOR 100/200DAYMEAN

  • 14. August 2008

29

slide-31
SLIDE 31

Warping the Years

Standard Time Real Time

late early

Time Warping Functions

02/03 03/04 04/05 05/06 06/07 J A S O N D J F M A M J J A S O N D J F M A M J

  • 14. August 2008

30

slide-32
SLIDE 32

Discussion

  • Analysis on Forecasting of Water Temperature is an important issue

(and is getting even more important based on new EU laws).

  • The issue is not fully covered by classical and newer approaches in

time series analysis.

  • Finding landmarks for seasonal variation is relevant from an ecologi-

cal point of view.

  • More to do: Compare our time warp results to observed fish spawning

cycles.

  • 14. August 2008

31