Analyzing multiple time series using a dynamic latent variables principal component analysis model (PowerPoint presentation)


S. Dossou-Gbété, February 9, 2011


SLIDE 1

Analyzing multiple time series using a dynamic latent variables principal component analysis model

S. Dossou-Gbété
February 9, 2011

SLIDE 2

Outline

1 Introduction
2 Probabilistic Principal Components Analysis and potential dynamic extension
3 Statistical method
4 Case study: Performances of a wastewater treatment plant


SLIDE 6

1 Introduction

SLIDE 7

Introduction

Probabilistic Principal Component Analysis (PPCA) [3, 2] and Principal Component Analysis (PCA) [1, 2] are two statistical methods designed for analyzing multivariate data. In this setting, the multivariate data are considered as response variables, assuming that latent variables (unobserved effects) could explain the variations among individual observations.

[1] Anderson T.W. (1984): Estimating Linear Statistical Relationships. Annals of Statistics, 12, pp. 1–45.
[2] Bishop C.M. (2006): Pattern Recognition and Machine Learning. Springer.
[3] Tipping M.E. & Bishop C.M. (1999): Probabilistic Principal Component Analysis. Journal of the Royal Statistical Society, Series B, 61(3), pp. 611–622.

SLIDE 8

Introduction ...

These methods have proved their ability to cope with a large number of variables without running into the scarce-degrees-of-freedom problems often faced in regression-based analyses. Similar considerations apply to multivariate time series if they are thought of as response variables, assuming that the variations over time of the individual observations could be explained by hidden, time-varying stochastic mechanisms.

SLIDE 9

Introduction ...

These latent time-varying components could describe trends in the observed time series as well as the relationships between them.

This motivates extending Probabilistic Principal Component Analysis so as to take explicitly into account the time component that is inherent to the analysis of multivariate time series.

SLIDE 10

2 Probabilistic Principal Components Analysis and potential dynamic extension

SLIDE 11

Dynamic extension of Probabilistic Principal Components Analysis

Let us consider a multivariate time series {x_t, t = 1:T} where x_t = (x_tj)_{j=1:p} is a vector of p ≥ 2 numerical measurements. It is assumed there is a k-dimensional latent Gaussian process Z_t, t = 0:T, such that for each t = 1:T the following assumptions are fulfilled:

  Z_0 ∼ N(0, I_k)
  Z_t − Z_{t−1} ∼ N(0, I_k)

where

  • η_t = Z_t − Z_{t−1} are i.i.d. Gaussian random vectors;
  • F^Z_s is the σ-field generated by {Z_u, u = 0:s}; F^Z_s gathers the information from the beginning of the process up to time s.

SLIDE 12

Dynamic extension of Probabilistic Principal Components Analysis ...

For each t = 1:T the measurement x_t is the realisation of a p-dimensional random variable X_t such that

  E[X_t | F^Z_t] = µ + A Z_t
  X_t − E[X_t | F^Z_t] ∼ N(0, σ² I_p)

where ε_t = X_t − E[X_t | F^Z_t] are i.i.d. Gaussian random vectors.

The model's specification is completed by the additional assumption that the Gaussian processes η_t and ε_t are mutually independent.
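Under these assumptions the model is straightforward to simulate. The sketch below (Python with NumPy; all dimensions and parameter values are illustrative choices, not taken from the slides) draws a latent random-walk Z_t and observations X_t = µ + A Z_t + ε_t:

```python
import numpy as np

def simulate(T=200, p=5, k=2, sigma=0.5, seed=0):
    """Simulate the dynamic latent-variable PCA model:
    Z_0 ~ N(0, I_k), Z_t = Z_{t-1} + eta_t with eta_t ~ N(0, I_k),
    X_t = mu + A Z_t + eps_t with eps_t ~ N(0, sigma^2 I_p)."""
    rng = np.random.default_rng(seed)
    mu = rng.normal(size=p)       # illustrative mean vector
    A = rng.normal(size=(p, k))   # illustrative loading matrix
    # Cumulative sum of i.i.d. N(0, I_k) increments gives the random walk,
    # with Z[0] ~ N(0, I_k) as required by the model.
    Z = np.cumsum(rng.normal(size=(T + 1, k)), axis=0)
    eps = sigma * rng.normal(size=(T, p))
    X = mu + Z[1:] @ A.T + eps    # observations for t = 1:T
    return X, Z, A, mu

X, Z, A, mu = simulate()
print(X.shape, Z.shape)  # (200, 5) (201, 2)
```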


SLIDE 13

Dynamic extension of Probabilistic Principal Components Analysis ...

This is a latent variables model that can be regarded as a specific case of the state space model. The random process Z_t is used to model unknown trend components and can be thought of as a state variable whose dynamic behavior is described by a random walk.

SLIDE 14

Dynamic extension of Probabilistic Principal Components Analysis ...

  • X_t is then a Gaussian process;
  • the marginal components of X_t are independent conditionally on F^Z_t.

SLIDE 15

Model's identifiability

The model's parameters are A, µ and σ², also called hyperparameters in the factor analysis terminology. As it stands, however, there is a lack of identifiability: for any rotation matrix Ω, the matrix of factor loadings and the trend component could be redefined as A* = AΩ and Z*_t = ᵗΩ Z_t respectively.

SLIDE 16

Model's identifiability ...

For probabilistic PCA, identifiability is achieved to some extent if the loading matrix A satisfies the constraint

  ᵗA A = diag(λ²_j, j = 1:k).

This constraint means that the columns of the matrix A are set to constitute an orthogonal system of vectors belonging to R^p.

SLIDE 17

Model's identifiability ...

The model may then be reparametrized by considering

  • the scalar parameters µ and σ² and the numerical sequence λ²_j, j = 1:k;
  • and a sequence of orthonormal vectors u_j ∈ R^p, j = 1:k, such that A = U Λ^{1/2} with U = [u_j]_{j=1:k} and Λ = diag(λ²_j, j = 1:k).
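The reparametrization A = U Λ^{1/2} can be checked numerically. In this sketch (illustrative values, not from the slides) an orthonormal U is obtained via a QR decomposition and the identifiability constraint ᵗA A = diag(λ²_j) is verified:

```python
import numpy as np

rng = np.random.default_rng(1)
p, k = 6, 3

# Orthonormal columns u_j via reduced QR of a random p x k matrix (illustrative)
U, _ = np.linalg.qr(rng.normal(size=(p, k)))
lam2 = np.array([4.0, 2.0, 0.5])      # the lambda_j^2 sequence (illustrative)
A = U @ np.diag(np.sqrt(lam2))        # A = U Lambda^{1/2}

# The constraint tA A = diag(lambda_j^2) holds by construction
gram = A.T @ A
assert np.allclose(gram, np.diag(lam2))
print(np.round(gram, 6))
```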

SLIDE 18

3 Statistical method

  • EM algorithm
  • Filtering, forecasting and smoothing the latent process
  • Algorithms


SLIDE 20

The log-likelihood of the model's parameters

Let Θ = (µ, σ², A) denote the unknown parameters of the statistical model. The density of the joint distribution of the sequence {(X_t)_{t=1:T}, (Z_t)_{t=0:T}} of random variables can be written as

  g_T((x_t)_{t=1:T}, (z_t)_{t=0:T}) = h_T(z_t, t = 0:T) ∏_{t=1}^{T} g_t(x_t | z_{t−j}, j = 0:t)

with

  h_T(z_t, t = 0:T) = h_0(z_0) ∏_{t=1}^{T} h_t(z_t | z_{t−j}, j = 1:t)

SLIDE 21

The log-likelihood of the model's parameters ...

Taking into account the model's specification as set out in the previous subsection, this density is proportional to the expression

  (1/σ²)^T exp(−(1/2) ‖z_0‖²) ∏_{t=1}^{T} exp(−(1/2) ‖z_t − z_{t−1}‖²) × ∏_{t=1}^{T} exp(−(1/(2σ²)) ‖x_t − µ − A z_t‖²)

SLIDE 22

The log-likelihood of the model's parameters ...

Hence the complete likelihood (which assumes that both x_t and z_t are observed) is

  L_{X,Z}(Θ) = cte − (1/2) ‖z_0‖² − (1/2) ∑_{t=1}^{T} ‖z_t − z_{t−1}‖² − T log(σ²) − (1/(2σ²)) ∑_{t=1}^{T} ‖x_t − µ − A z_t‖²

where Θ denotes the set of the model's parameters µ, σ², λ and U.
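The complete-data log-likelihood above translates directly into code. A minimal sketch (Python/NumPy; the inputs are simulated and the additive constant `cte` is dropped):

```python
import numpy as np

def complete_loglik(X, Z, mu, A, sigma2):
    """Complete-data log-likelihood L_{X,Z}(Theta) up to an additive constant:
    -1/2 ||z_0||^2 - 1/2 sum_t ||z_t - z_{t-1}||^2
    - T log(sigma^2) - 1/(2 sigma^2) sum_t ||x_t - mu - A z_t||^2."""
    T = X.shape[0]
    dz = np.diff(Z, axis=0)          # z_t - z_{t-1}, t = 1:T
    resid = X - mu - Z[1:] @ A.T     # x_t - mu - A z_t
    return (-0.5 * np.sum(Z[0] ** 2)
            - 0.5 * np.sum(dz ** 2)
            - T * np.log(sigma2)
            - np.sum(resid ** 2) / (2.0 * sigma2))

# Illustrative call on simulated data
rng = np.random.default_rng(2)
T, p, k = 50, 4, 2
Z = np.cumsum(rng.normal(size=(T + 1, k)), axis=0)
A = rng.normal(size=(p, k))
mu = np.zeros(p)
X = mu + Z[1:] @ A.T + 0.3 * rng.normal(size=(T, p))
ll = complete_loglik(X, Z, mu, A, 0.09)
print(float(ll))
```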

SLIDE 23

EM algorithm

Since the z_t are unobserved we do not have the complete data, and we resort to the EM algorithm [1], which provides an iterative procedure for computing the maximum likelihood estimates of the model's parameters based on the incomplete data x_t, t = 1:T. Moreover, the EM algorithm allows parts of the observation vector x_t to be missing at a number of observation times [2, ?].

The EM algorithm computes a maximum likelihood estimate of the unknown parameter Θ through an iterative scheme that alternates between:

  • evaluating the expected complete-data log-likelihood given the observations, which directly corresponds to solving the smoothing problem (E-step);
  • and maximizing this expectation with respect to the unknown parameters (M-step).

SLIDE 24

3 Statistical method

  • EM algorithm
  • Filtering, forecasting and smoothing the latent process
  • Algorithms

SLIDE 25

Preliminaries

Statistical evaluation of the unobservable state Z_t in terms of

  • X_s, s < t, is the forecasting task: Z^s_t with s < t;
  • X_s, s ≤ t, is the filtering task;
  • X_s, s = 1:T, is the smoothing task, which depends on past, present and future: Z^T_t with t = 1:T.

SLIDE 26

Kalman filter for latent process forecasting

The aim is to compute statistics that estimate the value z^s_t as a realisation of Z_t given the dataset x_u, u = 1:s, with s ≤ t. Define, for s ≤ t:

  • F^X_s, the σ-field generated by X_u, u = 1:s;
  • the conditional expectation Z^s_t = E[Z_t | F^X_s], with Z^0_0 = 0;
  • P^s_t = Var[Z_t | F^X_s] and P^s_{t,t−1} = Cov[Z_t, Z_{t−1} | F^X_s], with P^0_0 = I_k.

SLIDE 27

Kalman filter for latent process forecasting ...

Define the innovations

  e_t = X_t − E[X_t | F^X_{t−1}] = X_t − µ − A Z^{t−1}_t

e_t, t = 1:T, are centered Gaussian random vectors and their covariance matrices are

  Σ_t = A P^{t−1}_t ᵗA + σ² I_p

SLIDE 28

Kalman filter for latent process forecasting ...

One has

  Z^t_t = Z^{t−1}_t + K_t e_t

with

  K_t = P^{t−1}_t ᵗA (A P^{t−1}_t ᵗA + σ² I_p)^{−1} = P^{t−1}_t ᵗA Σ_t^{−1}

SLIDE 29

Kalman filter for latent process forecasting ...

Thus the following statements hold:

  Z^{t−1}_t = Z^{t−1}_{t−1}
  P^{t−1}_t = P^{t−1}_{t−1} + I_k
  P^t_t = [I − K_t A] P^{t−1}_t
  K_t = P^{t−1}_t ᵗA (A P^{t−1}_t ᵗA + σ² I_p)^{−1} = P^{t−1}_t ᵗA Σ_t^{−1}

K_t is the so-called Kalman gain. Given A and σ², the predicted and filtered values z^{t−1}_t and z^t_t of Z_t are obtained by evaluating the above set of recursions for t = 1:T.
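The filter recursions above, specialized to this model (random-walk state, observation matrix A, noise σ² I_p), can be sketched as follows. The data are assumed already centered (µ subtracted), and all inputs are simulated for illustration:

```python
import numpy as np

def kalman_filter(X, A, sigma2):
    """Kalman filter for the random-walk state model.
    Recursions from the slides: predict Z_t^{t-1} = Z_{t-1}^{t-1},
    P_t^{t-1} = P_{t-1}^{t-1} + I_k; update with gain
    K_t = P_t^{t-1} A' (A P_t^{t-1} A' + sigma2 I_p)^{-1}."""
    T, p = X.shape
    k = A.shape[1]
    z = np.zeros(k)                    # Z_0^0 = 0
    P = np.eye(k)                      # P_0^0 = I_k
    z_filt = np.empty((T, k))
    P_filt = np.empty((T, k, k))
    for t in range(T):
        z_pred = z                     # Z_t^{t-1} = Z_{t-1}^{t-1}
        P_pred = P + np.eye(k)         # P_t^{t-1} = P_{t-1}^{t-1} + I_k
        S = A @ P_pred @ A.T + sigma2 * np.eye(p)  # innovation covariance Sigma_t
        K = P_pred @ A.T @ np.linalg.inv(S)        # Kalman gain K_t
        e = X[t] - A @ z_pred                      # innovation e_t
        z = z_pred + K @ e                         # Z_t^t
        P = (np.eye(k) - K @ A) @ P_pred           # P_t^t
        z_filt[t], P_filt[t] = z, P
    return z_filt, P_filt

# Illustrative run on simulated, centered data
rng = np.random.default_rng(3)
T, p, k = 100, 4, 2
A = rng.normal(size=(p, k))
Ztrue = np.cumsum(rng.normal(size=(T, k)), axis=0)
X = Ztrue @ A.T + 0.5 * rng.normal(size=(T, p))
z_filt, P_filt = kalman_filter(X, A, 0.25)
print(z_filt.shape)  # (100, 2)
```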

SLIDE 30

Kalman smoother

The aim is to compute statistics that estimate the values z^T_t as realisations of Z_t, t = 1:T, given the whole dataset x_u, u = 1:T. Let F^X_T denote the σ-field generated by X_t, t = 1:T.

For t = 1:T, F^X_T is generated by

  • F^X_{t−1},
  • the innovation Z_t − Z^{t−1}_t = Z_t − E[Z_t | F^X_{t−1}],
  • η_s, s = t:T,
  • and ε_s, s = (t+1):T.

SLIDE 31

Kalman smoother ...

  E[Z_{t−1} | F^X_T] = E[Z_{t−1} | F^X_{t−1}] + E[Z_{t−1} | Z_t − Z^{t−1}_t]

and

  E[Z_{t−1} | Z_t − Z^{t−1}_t] = Cov[Z_{t−1}, Z_t − Z^{t−1}_t] (P^{t−1}_t)^{−1} (Z_t − Z^{t−1}_t)
                               = P^{t−1}_{t−1} (P^{t−1}_t)^{−1} (Z_t − Z^{t−1}_t)

SLIDE 32

Kalman smoother ...

Thus:

  • Z^T_{t−1} = Z^{t−1}_{t−1} + J_{t−1} (Z^T_t − Z^{t−1}_t), where J_{t−1} = P^{t−1}_{t−1} (P^{t−1}_t)^{−1};
  • P^T_{t−1} = P^{t−1}_{t−1} + J_{t−1} (P^T_t − P^{t−1}_t) ᵗJ_{t−1}.

[1] Dempster A.P., Laird N. & Rubin D.B. (1977): Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1), pp. 1–38.
[2] Shumway R.H. & Stoffer D.S. (2006): Time Series Analysis and Its Applications: With R Examples, 2nd edition. Springer-Verlag.
[3] Zuur A.F. et al. (2003): Estimating common trends in multivariate time series using dynamic factor analysis. Environmetrics, 14, pp. 665–685.
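The backward smoother recursions above need the filtered and predicted moments, so a compact version of the forward filter from the earlier slides is repeated here (simulated, centered inputs; illustrative only):

```python
import numpy as np

def kalman_filter_smoother(X, A, sigma2):
    """Forward Kalman filter then backward (RTS-type) smoother for the
    random-walk state model: J_{t-1} = P_{t-1}^{t-1} (P_t^{t-1})^{-1},
    Z_{t-1}^T = Z_{t-1}^{t-1} + J_{t-1} (Z_t^T - Z_t^{t-1})."""
    T, p = X.shape
    k = A.shape[1]
    Ik, Ip = np.eye(k), np.eye(p)
    z_filt = np.empty((T, k)); P_filt = np.empty((T, k, k))
    z_pred = np.empty((T, k)); P_pred = np.empty((T, k, k))
    z, P = np.zeros(k), Ik                 # Z_0^0 = 0, P_0^0 = I_k
    for t in range(T):                     # forward pass
        z_pred[t] = z
        P_pred[t] = P + Ik
        K = P_pred[t] @ A.T @ np.linalg.inv(A @ P_pred[t] @ A.T + sigma2 * Ip)
        z = z_pred[t] + K @ (X[t] - A @ z_pred[t])
        P = (Ik - K @ A) @ P_pred[t]
        z_filt[t], P_filt[t] = z, P
    z_sm = z_filt.copy(); P_sm = P_filt.copy()
    for t in range(T - 1, 0, -1):          # backward pass
        J = P_filt[t - 1] @ np.linalg.inv(P_pred[t])
        z_sm[t - 1] = z_filt[t - 1] + J @ (z_sm[t] - z_pred[t])
        P_sm[t - 1] = P_filt[t - 1] + J @ (P_sm[t] - P_pred[t]) @ J.T
    return z_sm, P_sm

# Illustrative run
rng = np.random.default_rng(4)
T, p, k = 80, 4, 2
A = rng.normal(size=(p, k))
Ztrue = np.cumsum(rng.normal(size=(T, k)), axis=0)
X = Ztrue @ A.T + 0.5 * rng.normal(size=(T, p))
z_sm, P_sm = kalman_filter_smoother(X, A, 0.25)
print(z_sm.shape)  # (80, 2)
```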

SLIDE 33

3 Statistical method

  • EM algorithm
  • Filtering, forecasting and smoothing the latent process
  • Algorithms

SLIDE 34

General feature of the EM algorithm

Let Θ denote the set of the model's parameters. The EM algorithm consists in alternating between

  • the computation of the conditional expectation of the complete-data likelihood (expectation step),
  • and a multivariate normal maximum likelihood maximization (maximization step) [2, ?].

SLIDE 35

Expectation step

The expectation step at the j-th iteration of the EM algorithm consists in the calculation of

  Q(Θ, Θ̂_{j−1}) = E_{Θ̂_{j−1}} {−L_{X,Z}(Θ | X)}

where Θ̂_{j−1} gathers the current values of the estimates of the model's parameters computed at the (j−1)-th iteration and X = (X_t)_{t=1:T}. This conditional expectation is obtained by using the Kalman filter and smoother [2, ?].

The M-step consists in the maximization of Q(Θ, Θ̂_{j−1}) with respect to Θ and results in a multivariate normal maximum likelihood maximization.
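The overall E/M alternation can be sketched as a loop. The M-step updates below are deliberately simplified illustrations, not the full derivation: they update A and σ² from the smoothed first moments only, ignoring the P^T_t correction terms that the exact expected complete-data likelihood would include, and no rotation constraint is enforced on A:

```python
import numpy as np

def smooth(X, A, sigma2):
    """Compact Kalman filter + backward smoother for the random-walk
    state model (recursions from the previous slides); returns z^T_t."""
    T, p = X.shape; k = A.shape[1]
    Ik, Ip = np.eye(k), np.eye(p)
    z, P = np.zeros(k), Ik
    z_f = np.empty((T, k)); P_f = np.empty((T, k, k))
    z_p = np.empty((T, k)); P_p = np.empty((T, k, k))
    for t in range(T):
        z_p[t], P_p[t] = z, P + Ik
        K = P_p[t] @ A.T @ np.linalg.inv(A @ P_p[t] @ A.T + sigma2 * Ip)
        z = z_p[t] + K @ (X[t] - A @ z_p[t])
        P = (Ik - K @ A) @ P_p[t]
        z_f[t], P_f[t] = z, P
    z_s = z_f.copy()
    for t in range(T - 1, 0, -1):
        J = P_f[t - 1] @ np.linalg.inv(P_p[t])
        z_s[t - 1] = z_f[t - 1] + J @ (z_s[t] - z_p[t])
    return z_s

def em_sketch(X, k, n_iter=15, seed=0):
    """EM alternation: E-step = smoothing under the current parameters,
    M-step = closed-form updates. Simplified: smoothed second moments
    are approximated by outer products of the smoothed means."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    mu, A, sigma2 = X.mean(axis=0), rng.normal(size=(p, k)), 1.0
    for _ in range(n_iter):
        z_s = smooth(X - mu, A, sigma2)                       # E-step
        A = (X - mu).T @ z_s @ np.linalg.inv(z_s.T @ z_s)     # M-step: loadings
        resid = X - mu - z_s @ A.T
        sigma2 = np.mean(resid ** 2)                          # M-step: noise variance
        mu = np.mean(X - z_s @ A.T, axis=0)                   # M-step: mean
    return mu, A, sigma2

# Illustrative run on simulated data
rng = np.random.default_rng(5)
T, p, k = 60, 4, 2
Ztrue = np.cumsum(rng.normal(size=(T, k)), axis=0)
Atrue = rng.normal(size=(p, k))
X = 1.0 + Ztrue @ Atrue.T + 0.4 * rng.normal(size=(T, p))
mu_hat, A_hat, s2_hat = em_sketch(X, k)
print(round(float(s2_hat), 4))
```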

SLIDE 36

4 Case study: Performances of a wastewater treatment plant

  • Data for case study


SLIDE 38

Data for case study

As an example of how the method works we consider a real-world dataset that describes the behavior of a wastewater treatment plant for 527 consecutive days, corresponding to the period 1990–1991. The dataset is available by anonymous ftp from the UCI machine learning repository of databases.

SLIDE 39

Data for case study ...

Each daily record is described by 38 variables. 29 variables are daily means of measurements of 8 quality indicators taken at several places in the plant:

  • at the input,
  • after the pretreatment (primary settler),
  • at the input of the biological reactor (secondary settler),
  • and at the water output of the plant.

SLIDE 40

Data for case study ...

The 9 remaining variables are calculated performances of the primary and secondary treatments and for the whole plant.

SLIDE 41

Preliminary investigation

The figure below shows the variations over time of four water quality indicators at the beginning of the water treatment process.

SLIDE 42

Preliminary investigation ...

Figure: variations over time of four water quality indicators [image not recovered in this extraction].

SLIDE 43

One of the main characteristics of these time series is that they exhibit local mean levels that vary over time. Such behavior of the individual time series could be well described by models that combine a stochastic trend component with an additive noise, such as a random walk model. Thus a latent variables model like the one studied here could provide a suitable method for analyzing these time series together and highlighting the relationships between them.

SLIDE 44

Some numerical results

1 Dimension of the latent trend components

  dimension   likelihood     AIC
  1           117876.98   117644.98
  2           110414.94   110106.94
  3           110559.51   110175.51
  4           110462.78   110002.78
  5           110491.58   109955.58
  6           110493.56   109881.56

SLIDE 45

Some numerical results ...

2 Trend components

SLIDE 46

Some numerical results ...

SLIDE 47

Loadings

  var        1          2          3          4
   1     2348.62    2717.45    2775.27   −3866.41
   2        0.05       0.04       0.11       0.02
   3        0.03       0.01       0.04       0.00
   4       −7.53      −3.73       2.15      15.42
   5       −7.11     −10.55      10.61      36.12
   6       −7.45       2.07       6.83       9.63
   7       −0.83      −1.66      −0.83       2.79
   8       −0.12      −0.03       0.24       0.40
   9       94.94     −41.24     130.55     138.40
  10        0.03       0.01       0.03       0.00
  11      −10.30      −3.54       1.64      17.02
  12       −9.56       3.02      10.25      13.22
  13       −0.43      −1.39      −0.44       2.36
  14       −0.16       0.02       0.39       0.52
  15       99.52     −38.84     139.29     139.79
  16        0.02       0.00       0.02       0.00
  17       −0.91       0.61       8.61       8.81
  18        4.80      −0.82      19.29      17.92
  19       −0.23       0.80       3.47       2.43
  20       −1.09      −1.56      −1.20       2.41
  21        0.00      −0.00       0.03       0.03
  22       95.37     −40.91     136.74     144.07
  23        0.00      −0.00      −0.02      −0.01
  24       −0.01       0.50       1.19       0.30
  25        1.58       0.61       5.77       3.56
  26        0.69       0.78       2.32       0.38
  27       −1.20      −0.70      −1.37       0.76
  28        0.00       0.00       0.01      −0.00
  29       96.15     −36.55     131.43     129.93
  30       −1.90      −0.44      −1.78       0.57
  31       −0.46       0.05       0.33       0.63
  32        0.70       0.98       1.90      −0.40
  33       −1.24      −0.02      −0.23       0.93
  34        0.13       0.13       0.39       0.08

(Minus signs appeared as list markers in the source extraction and have been reconstructed for the entries shown as negative.)