

SLIDE 1

ENSEMBLES FOR TIME SERIES FORECASTING

Mariana Oliveira & Luís Torgo

SLIDE 2

Outline

  • Introduction
  • Delay-coordinate embedding
  • Bagging for Time Series Forecasting
  • Bagging Variants
  • Experimental Evaluation
  • Time series
  • Results
  • Conclusion
  • Future work

Ensembles for Time Series Forecasting Mariana Oliveira & Luís Torgo (FCUP/LIAAD)


ACML 2014

SLIDE 3

Introduction

  • Ensembles are among the most competitive approaches to solving predictive tasks;
  • Diversity among ensemble members is essential;
  • We aim at improving the predictive performance of ensembles in time series forecasting.

SLIDE 4

Delay-coordinate embedding

  • Delay-coordinate embedding assumes that future values of the series depend only on a limited number of previous values;
  • Any regression tool can then be used to obtain a model of the form

Y_{t+h} = f(\langle Y_{t-k}, \ldots, Y_{t-1}, Y_t \rangle).

  • This requires setting the embed size (k), and often there is no single correct value.

SLIDE 5

Delay-coordinate embedding


Example series:

  time   1  2  3  4  5  6  7
  y      5  3  7  2  4  1  5

Embedded data set with k=3:

  t-3  t-2  t-1  t
   2    4    1   5
   7    2    4   1
   3    7    2   4
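The example on this slide can be reproduced with a few lines of code. The following is a minimal Python sketch, not the authors' implementation (their programs are in R). It assumes that the last column (t) is the prediction target and that the k columns before it are the predictors; the function name `delay_embed` is hypothetical.

```python
def delay_embed(series, k):
    """Split a series into rows of k consecutive past values and the value that follows them."""
    if not 1 <= k < len(series):
        raise ValueError("k must satisfy 1 <= k < len(series)")
    predictors, targets = [], []
    for end in range(k, len(series)):
        predictors.append(list(series[end - k:end]))  # values at t-k, ..., t-1
        targets.append(series[end])                    # value at t (assumed target)
    return predictors, targets


# The series shown on the slide, embedded with k=3.
X, y = delay_embed([5, 3, 7, 2, 4, 1, 5], k=3)
for row, target in zip(X, y):
    print(row, "->", target)
```

Each printed row corresponds to one training case that any regression tool could consume.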

SLIDE 6

Bagging for Time Series Forecasting

  • We propose variants of bagging of regression trees;
  • Our variants generate diversity by exploiting specific properties of time series prediction tasks;
  • We compare the performance of our proposals against that of standard bagging, our baseline.

SLIDE 7

Bagging for Time Series Forecasting

  • There are many possible ways of describing the recent dynamics of a time series through a set of predictors;
  • Our initial set of proposed bagging variants uses
    • different embed sizes, given a maximum embed size kmax;
    • summary statistics of recent values as additional predictors.
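The idea of giving ensemble members different embed sizes can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' R code: each row is assumed to list lagged values from oldest to newest, each member is trained on a bootstrap sample restricted to its most recent lags, and predictions are combined by a plain average. The base learner here is a hypothetical 1-nearest-neighbour stand-in, used only to keep the example self-contained; the slides use regression trees.

```python
import random


def fit_nearest_neighbour(X, y):
    """Hypothetical stand-in learner (NOT a regression tree): 1-nearest neighbour."""
    def predict(row):
        nearest = min(range(len(X)),
                      key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], row)))
        return y[nearest]
    return predict


def bagged_embed_ensemble(X, y, embed_sizes, fit, seed=0):
    """Train one model per entry of embed_sizes, each on its own bootstrap sample."""
    rng = random.Random(seed)
    members = []
    for size in embed_sizes:
        sample = [rng.randrange(len(X)) for _ in X]   # bootstrap: draw with replacement
        X_boot = [X[i][-size:] for i in sample]       # keep only the `size` most recent lags
        y_boot = [y[i] for i in sample]
        members.append((size, fit(X_boot, y_boot)))

    def predict(row):
        predictions = [model(row[-size:]) for size, model in members]
        return sum(predictions) / len(predictions)    # simple average of the members
    return predict


# Toy data with a maximum embed size of 4; members use embed sizes 4, 2 and 1.
X = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6], [4, 5, 6, 7]]
y = [5, 6, 7, 8]
predict = bagged_embed_ensemble(X, y, [4, 2, 1], fit_nearest_neighbour, seed=1)
prediction = predict([2, 3, 4, 5])
print(prediction)
```

Passing the learner as the `fit` argument keeps the sketch independent of any particular tree implementation.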

SLIDE 8

Bagging variants

Predictor sets without and with summary statistics:
  • t, t-1, t-2, ..., t-k
  • t, t-1, t-2, ..., t-k, μ, σ²
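Adding the two summary statistics to each row of lagged values could look like the Python sketch below. It is only an illustration: it assumes μ and σ² are computed over the same lagged values that form the row, and it uses the population variance, which the slides do not specify. The function name `add_summary_stats` is hypothetical.

```python
from statistics import fmean, pvariance


def add_summary_stats(rows):
    """Append the mean and (population) variance of each row's lagged values."""
    return [list(row) + [fmean(row), pvariance(row)] for row in rows]


# One row of three lagged values gains two extra predictors.
extended = add_summary_stats([[5, 3, 7]])
print(extended)
```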

SLIDE 9

Bagging variants

Predictor sets with different embed sizes:
  • t, t-1, t-2, t-3, t-4, t-5, t-6, ..., t-k
  • t, t-1, t-2, ..., t-k/2
  • t, ..., t-k/4

SLIDE 10

Bagging variants

Predictor sets with different embed sizes and summary statistics:
  • t, t-1, t-2, t-3, t-4, t-5, t-6, ..., t-k, μ, σ²
  • t, t-1, t-2, ..., t-k/2, μ, σ²
  • t, ..., t-k/4, μ, σ²

SLIDE 11

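Bagging variants

Predictor sets with different embed sizes, without and with summary statistics:
  • t, t-1, t-2, t-3, t-4, t-5, t-6, ..., t-k
  • t, t-1, t-2, ..., t-k/2
  • t, ..., t-k/4
  • t, t-1, t-2, t-3, t-4, t-5, t-6, ..., t-k, μ, σ²
  • t, t-1, t-2, ..., t-k/2, μ, σ²
  • t, ..., t-k/4, μ, σ²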
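One way to read the schematic on this slide is as an assignment of ensemble members to predictor sets. The Python sketch below illustrates that reading. It assumes the members are spread as evenly as possible over embed sizes k, k/2 and k/4 (rounded down), with or without summary statistics; the slides do not state how members are allocated, and the function name `member_configs` is hypothetical.

```python
def member_configs(n_models, kmax, with_stats=(False,)):
    """Return one (embed_size, use_summary_stats) pair per ensemble member."""
    embed_sizes = [kmax, max(1, kmax // 2), max(1, kmax // 4)]
    combos = [(k, stats) for stats in with_stats for k in embed_sizes]
    # Cycle through the combinations so each gets a near-equal share of the members.
    return [combos[i % len(combos)] for i in range(n_models)]


# Six members covering all six predictor sets, for a maximum embed size of 8.
configs = member_configs(6, 8, with_stats=(False, True))
for embed_size, use_stats in configs:
    print(embed_size, use_stats)
```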

SLIDE 12

Experimental Evaluation

  • Data: 14 real-world time series;
  • Metric: standard Mean Squared Error (MSE);
  • Experimental procedure: Monte Carlo simulations
    • 10 randomly selected points in time;
    • training on the previous 50% of observations;
    • testing on the following 25%;
  • Statistical significance:
    • Wilcoxon signed rank tests with p-value < 0.05;
  • Tested setups:
    • different numbers of models in the ensemble (M);
    • different values of the maximum embed size (kmax).
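The Monte Carlo procedure described above can be sketched in Python as follows. This is an illustration under assumptions, not the authors' R code: the percentages are taken relative to the full series length, the random points are drawn independently (so repeats are possible), and the names `monte_carlo_splits` and `mse` are hypothetical.

```python
import random


def monte_carlo_splits(n_obs, n_reps=10, train_frac=0.5, test_frac=0.25, seed=0):
    """Pick random time points; train on the window before each, test on the window after."""
    train_size = int(n_obs * train_frac)
    test_size = int(n_obs * test_frac)
    rng = random.Random(seed)
    splits = []
    for _ in range(n_reps):
        # The cut point must leave room for a full training and a full test window.
        cut = rng.randint(train_size, n_obs - test_size)
        splits.append((range(cut - train_size, cut), range(cut, cut + test_size)))
    return splits


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


splits = monte_carlo_splits(200)
train_idx, test_idx = splits[0]
print(len(splits), len(train_idx), len(test_idx))
```

Keeping the training window strictly before the test window preserves the temporal order of the observations in every repetition.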

SLIDE 13

Time series

We use the series of differences between successive values of each original time series. Each series was treated separately from the others in its data source.
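The differencing step amounts to subtracting each value from its successor. A minimal Python sketch follows; the function name `first_differences` is hypothetical and the input series is a made-up example, not one of the 14 series.

```python
def first_differences(series):
    """Return the differences between successive values of a series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


diffs = first_differences([5, 3, 7, 2, 4, 1, 5])
print(diffs)
```

The differenced series is one observation shorter than the original.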

SLIDE 14

Results

Paired comparisons: number of wins (statistically significant wins) / number of losses (statistically significant losses)

SLIDE 15

Results

Average and standard deviation of the rank of each method

SLIDE 16

Results

Average and standard deviation of the mean percentage difference with respect to the baseline

SLIDE 17

Results

\operatorname{sgn}(MSE_x - MSE_E) \cdot \log\left( \left| 100 \cdot \frac{MSE_x - MSE_E}{MSE_E} \right| + 1 \right)
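The transformation on this slide appears to be a signed logarithm of the percentage difference between two MSE values. The Python sketch below follows that reading and should be taken as an interpretation rather than the authors' definition: it assumes the second MSE is the reference error, that an absolute value is taken inside the logarithm, and that the logarithm is natural. The function name `signed_log_pct_diff` is hypothetical.

```python
import math


def signed_log_pct_diff(mse_x, mse_ref):
    """Signed log of the percentage difference between a method's MSE and a reference MSE."""
    diff = mse_x - mse_ref
    sign = (diff > 0) - (diff < 0)  # -1, 0 or 1
    return sign * math.log(abs(100.0 * diff / mse_ref) + 1.0)


better = signed_log_pct_diff(1.0, 2.0)   # lower error than the reference: negative
worse = signed_log_pct_diff(3.0, 2.0)    # higher error than the reference: positive
print(better, worse)
```

The logarithm compresses large percentage differences while the sign keeps track of which method has the lower error.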

SLIDE 18

Conclusion

  • We proposed an initial set of ways of injecting diversity into ensembles that take into account the specific challenges posed by time series;
  • The recent dynamics of a time series are represented using
    • different embed sizes and
    • the addition of variables summarizing the recently observed values;
  • This was implemented and tested in the context of bagging regression trees, obtaining a clear advantage over standard bagging on real-world data;
  • Our results suggest this is a promising research direction.

SLIDE 19

Future work

  • Exploring the possibility of
    • changing the amount of past data used by each model (varying training windows);
    • making the aggregation of the predictions time-dependent;
    • using other types of predictor variants.

Try it yourself:

  • All code and data necessary to replicate the results presented are available at http://www.dcc.fc.up.pt/~ltorgo/ACML2014/
  • All programs are written in the free and open source R software environment.
