SLIDE 1

Best Practices for Time Series Forecasting

Presentation by

André Bauer & Marwin Züfle

Umeå, June 20, 2019

SLIDE 2

2 André Bauer & Marwin Züfle - Best Practices for Time Series Forecasting

Road Map

  • Introduction
  • Data Pre-Processing
  • Feature Engineering
  • Method Selection
  • Model Fitting
  • Evaluation
  • Summary

Times: 09:00, 10:30, 11:00, 12:30

On what you can expect:

  • Foundations of Time Series
  • Basics of Forecasting
  • Basics of Feature Engineering
  • Comparing Forecasting Methods
  • R Code snippets
SLIDE 3

Who are we?


André Bauer In 3rd year of PhD Research interests:

  • Forecasting
  • Elasticity
  • Auto-scaling
  • Self-aware Computing

Predictive Data Analytics group is part of Descartes Research (Self-Aware Computing) headed by Samuel Kounev @ University of Würzburg

Marwin Züfle In 2nd year of PhD Research interests:

  • Forecasting
  • Failure Prediction
  • Data Analytics

Nikolas Herbst Post-Doc Research interests:

  • Predictive Data Analytics
  • Elasticity
  • Serverless

Published:

  1. Forecasting Method Selection: Examination and Ways Ahead @ICAC'19
  2. Challenges and Approaches: Forecasting for Autonomic Computing @OCDCC'18
  3. Telescope: A Hybrid Forecast Method for Univariate Time Series @ITISE'17
  4. Online Workload Forecasting. In: Self-Aware Computing Systems @Springer'17 (book chapter)

Under Review:

  1. Time Series Forecasting: Review and Evaluation of the State-of-the-Art (invited article to PIEEE)

SLIDE 4

Requirements


▪ Installation of R & RStudio

  • https://cran.rstudio.com/
  • https://www.rstudio.com/products/rstudio/download/#download

# if not installed
install.packages(c("forecast", "devtools", "zoo", "ggm"))
install.packages(c("xgboost", "randomForest", "e1071"))

SLIDE 5

Knowing the future makes life easier!


▪ If the shop owner buys

▪ Too few fresh fruits, customers are dissatisfied ▪ Too many fresh fruits, the remaining fruits have to be thrown away

▪ Collect sales figures

▪ Analyze purchasing behavior ▪ Forecast number of required fruits

▪ How to forecast and which method?

(Figure: shop owner asking "How many fresh fruits to order?")

SLIDE 6

Forecasting


▪ Expert knowledge

▪ Is expensive ▪ Cannot be automated

▪ “No-Free-Lunch Theorem”

▪ There is no forecasting method that performs best on all time series ▪ Each method has its benefits and drawbacks

Problem Definition → Data Analysis → Data Pre-Processing → Feature Engineering → Method Selection → Method Fitting → Forecasting → Evaluation

SLIDE 7

What is a time series?

▪ Univariate time series

▪ Y ≔ {y_t : t ∈ T} ▪ Ordered collection of values over a specific period

▪ Equidistant time steps

▪ Components

▪ Trend: long-term movement ▪ Seasonality: recurring patterns, e.g., produced by human habits ▪ Cycle: rises and falls without a fixed frequency ▪ Irregular: statistical noise

SLIDE 8

Stationarity

▪ Most forecasting methods assume

▪ Stationarity or ▪ Time series can be “stationarized”

▪ Stationarity: statistical properties (mean, variance, …) do not change over time ▪ In practice

▪ Time series often have a trend and/or a season ▪ They are hence non-stationary
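Differencing is a common way to "stationarize" such a series. A minimal sketch with the forecast package (the log transform for variance stabilization is an extra assumption, not from the slide):

```r
# load package
library(forecast)
# estimate how many first differences are needed to remove the trend
d <- ndiffs(AirPassengers)
# log stabilizes the growing variance, diff removes the trend
stationary <- diff(log(AirPassengers), differences = d)
plot(stationary)
```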

SLIDE 9

Missing and problematic values

▪ Most forecasting methods cannot handle missing values

▪ At the beginning: removal ▪ In between: reconstruction, e.g., interpolation

▪ Some forecasting methods (e.g., ETS) cannot handle negative values

▪ Shift time series before forecast to positive ▪ Shift time series back after forecast
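Both workarounds can be sketched in a few lines; the injected gaps and the offset of 300 are only assumptions made here to create missing and negative values:

```r
# load package
library(forecast)
# reconstruct missing values in between via interpolation
air <- ts(as.vector(AirPassengers), frequency = 12)
air[c(50, 51)] <- NA
filled <- na.interp(air)
# shift a series containing negative values to positive before ETS
x <- filled - 300                 # toy series with negative values
offset <- abs(min(x)) + 1
fc <- forecast(ets(x + offset), h = 12)
fc$mean <- fc$mean - offset       # shift the forecast back
```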

SLIDE 10

Detecting seasonal patterns

▪ Basic idea in mathematics

▪ Break down complex objects into simpler parts ▪ Time series is a weighted sum of sinusoidal components

▪ Periodogram

▪ Based on the Fourier transformation ▪ Each frequency gets a “probability”

Highest spectral density = most dominant frequency; its period is 1/frequency

SLIDE 11

Applying a Periodogram

# load package
library(forecast)
# plot AirPassengers time series
plot(AirPassengers)
# create and plot the periodogram
pgram <- spec.pgram(as.vector(AirPassengers))
# build data frame with the relevant info
pgram_df <- data.frame(freq = pgram$freq, spec = pgram$spec)
# determine the top 10 frequencies according to the spectrum
head(1/pgram_df[order(pgram_df$spec, decreasing = TRUE), 1], n = 10)

SLIDE 12

Anomaly Removal


▪ To increase accuracy, anomalies can be removed

▪ Generalized extreme studentized deviate test ▪ Replace anomalies by mean of non-anomaly neighbors ▪ Twitter offers package (https://github.com/twitter/AnomalyDetection)

▪ Detection may be too sensitive and report false positives
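The replacement step described above can be sketched as follows; the anomaly indices are injected by hand here rather than taken from a detector:

```r
# inject two anomalies into AirPassengers
air <- as.vector(AirPassengers)
anom.idx <- c(20, 100)              # indices a detector would report
air[anom.idx] <- air[anom.idx] * 5
# replace each anomaly by the mean of its nearest non-anomalous neighbors
clean <- air
for (i in anom.idx) {
  left  <- max(setdiff(1:(i - 1), anom.idx))
  right <- min(setdiff((i + 1):length(air), anom.idx))
  clean[i] <- mean(c(air[left], air[right]))
}
```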

SLIDE 13

Find Anomalies

# if not installed
devtools::install_github("twitter/AnomalyDetection")
# load package
library(AnomalyDetection)
# add anomalies
air <- as.vector(AirPassengers)
air[c(20, 100)] <- air[c(20, 100)] * 5
anom <- AnomalyDetectionVec(air, period = 12, direction = 'both', plot = TRUE)
# example data shipped with the package
data(raw_data)
anom <- AnomalyDetectionVec(raw_data[, 2], period = 1440, direction = 'both', plot = TRUE)

SLIDE 14

Feature Engineering


▪ “At the end of the day, some machine learning projects succeed and some fail. What makes the difference? Easily the most important factor is the features used” [P. M. Domingos 2012] ▪ Data transformation

▪ Simplifies the model ▪ May lead to better forecast

▪ Feature selection

▪ Most statistical methods support only the time series ▪ Machine learning methods rely on features

SLIDE 15

Time Series Transformation

▪ Time series may be complex

▪ High variance ▪ Multiplicative effects

▪ Transformation may lead to easier model

▪ Common transformation is logarithm ▪ Box-Cox transformation

SLIDE 16

Box-Cox Transformation

▪ Offers a family of power transformations:

  w_t = ln(y_t)              if λ = 0
  w_t = (y_t^λ − 1) / λ      otherwise

▪ Tries to “normal-shape” the data ▪ The power parameter λ can be estimated by the method of Guerrero

SLIDE 17

Box-Cox Transformation

# load package
library(forecast)
timeseries <- AirPassengers
# estimate best lambda
lambda <- BoxCox.lambda(timeseries)
# transform time series
trans <- BoxCox(timeseries, lambda = lambda)

SLIDE 18

Feature Extraction


▪ Additional info may increase the forecast accuracy

▪ Features from external (correlated) data sources

▪ Nearby sensors ▪ Weather ▪ …

▪ Features from the given time series

▪ Time series components ▪ Fourier terms ▪ Categorical information ▪ …

SLIDE 19

Time Series Decomposition


▪ Time series can be broken down into different components

▪ Trend, season, and irregular ▪ Linear and non-linear ▪ …

▪ Decomposition is

▪ Additive or ▪ Multiplicative or ▪ Mixed

▪ Components can be used as features or for modifying the data

SLIDE 20

STL Decomposition


▪ STL (Seasonal and Trend decomposition using Loess)

▪ Trend, season, and irregular ▪ Additive

▪ Y(t) = T(t) + S(t) + I(t) ▪ Y(t) = T(t) · S(t) · I(t) is equal to log Y(t) = log T(t) + log S(t) + log I(t)

▪ Time series must

▪ Be seasonal ▪ Have at least two full periods

▪ The parameter t.window controls the smoothing of the trend

SLIDE 21

STL Decomposition

# load package
library(forecast)
timeseries <- AirPassengers
# decompose time series
decomp <- stl(timeseries, s.window = 'periodic')
plot(decomp)
# smooth trend
decomp <- stl(timeseries, s.window = 'periodic', t.window = length(timeseries)/2)
plot(decomp)

SLIDE 22

STL Decomposition – Cont’d

# decompose ts with multiplicative decomposition
decomp <- stl(log(timeseries), s.window = 'periodic')
plot(decomp)
timeseries <- taylor
# decomposition with different periods
decomp <- stl(ts(timeseries, frequency = 24), s.window = 'periodic')
plot(decomp)
decomp <- stl(timeseries, s.window = 'periodic')
plot(decomp)
# stl with multiple seasons
decomp <- mstl(taylor, s.window = 'periodic')
plot(decomp)

SLIDE 23

Fourier Terms

▪ A time series can be written as a weighted sum of sinusoidal components:

  f(t) = a_0/2 + Σ_{k=1}^{∞} (a_k · cos(kt) + b_k · sin(kt))

▪ For each frequency from the periodogram, Fourier terms can be extracted

▪ Approximation of the time series only with dominant frequencies ▪ Additional features

SLIDE 24

Fourier Terms

# load package
library(forecast)
timeseries <- AirPassengers
# get top 10 frequencies
pgram <- spec.pgram(as.vector(timeseries))
pgram_df <- data.frame(freq = pgram$freq, spec = pgram$spec)
freqs <- head(1/pgram_df[order(pgram_df$spec, decreasing = TRUE), 1], n = 10)
# build multi-seasonal time series
mts <- msts(timeseries, seasonal.periods = freqs, ts.frequency = frequency(timeseries))

SLIDE 25

Fourier Terms – Cont’d

# get Fourier terms
fourierterms <- fourier(mts, K = rep(1, length(freqs)))
# plot Fourier terms
plot(fourierterms[, 1], type = 'l')
for (i in 2:20) {
  readline(prompt = "Press [enter] to continue")
  lines(fourierterms[, i], col = i)
}
# continue Fourier terms into the future
future.fourierterms <- fourier(mts, K = rep(1, length(freqs)), h = 30)

SLIDE 26

Categorical Information

▪ Idea: cluster periods of time series

▪ Split time series into periods ▪ Calculate statistical characteristics for each period

Calculate Characteristics → Create Feature Space → k-Means Clustering → Cluster Labels
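The steps above can be sketched like this; the three characteristics and k = 2 are illustrative choices, not prescribed by the slide:

```r
# one row per yearly period of AirPassengers (144 values = 12 years x 12 months)
periods <- matrix(as.vector(AirPassengers), ncol = 12, byrow = TRUE)
# per-period statistical characteristics form the feature space
feats <- data.frame(mean = rowMeans(periods),
                    sd   = apply(periods, 1, sd),
                    max  = apply(periods, 1, max))
# k-means clustering yields one categorical label per period
set.seed(42)
labels <- kmeans(scale(feats), centers = 2)$cluster
```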

SLIDE 27

Feature Selection

▪ Goal: reduce the number of features

▪ Prevent overfitting ▪ Speed up training/prediction time

▪ Statistical feature selection

▪ Correlation, ANOVA, …

▪ Model-internal feature selection

▪ Linear models, tree-based models

▪ Wrapper methods

▪ Forward selection, backward elimination

SLIDE 28

Forward Selection / Exhaustive Search

# load libraries
library(forecast)
library(ggm)
timeseries <- AirPassengers
split <- ceiling(length(timeseries) * 0.8)
end <- length(timeseries)
# get top 3 frequencies
pgram <- spec.pgram(as.vector(timeseries))
pgram_df <- data.frame(freq = pgram$freq, spec = pgram$spec)
freqs <- head(1/pgram_df[order(pgram_df$spec, decreasing = TRUE), 1], n = 3)

SLIDE 29

Forward Selection – Cont’d

# build multi-seasonal time series
mts <- msts(timeseries, seasonal.periods = freqs, ts.frequency = frequency(timeseries))
# decompose time series
decomp <- stl(timeseries, s.window = 'periodic')
# get Fourier terms
fourierterms <- fourier(mts, K = rep(1, length(freqs)))
features <- cbind(timeseries, fourierterms, decomp$time.series[, 1:2])
# get powerset of feature combinations
feature.powerset <- powerset(1:ncol(features))

SLIDE 30

Forward Selection – Cont’d

acc <- c()
# wrapper with exhaustive search
for (i in 1:length(feature.powerset)) {
  feature.set <- as.matrix(features[, feature.powerset[[i]]])
  model <- nnetar(timeseries[1:split], xreg = feature.set[1:split, ])
  fc <- forecast(model, xreg = feature.set[(split+1):end, ])
  # get MASE based on validation data
  acc[i] <- accuracy(fc, timeseries[(split+1):end])[12]
}
# get features with lowest MASE
best.set <- features[, feature.powerset[[which(acc == min(acc))]]]

SLIDE 31

Method Selection


▪ There exist many different forecasting methods

▪ Statistical methods ▪ Machine learning-based methods

▪ “No-Free-Lunch Theorem”

▪ There is no globally best performing forecasting method ▪ Each method has its benefits and drawbacks

▪ We need additional knowledge on which forecasting method to choose for a particular type of time series

SLIDE 32

Strengths & Weaknesses

SLIDE 33

How to select a proper forecasting method?


Expert Knowledge

Advantages:

  • No implementation overhead

Drawbacks:

  • Expensive
  • Does not scale with increasing amount of time series
  • Decision often cannot be explained objectively

Static Decision Rules

Advantages:

  • Scale with increasing amount of time series
  • Expert knowledge only required at design time

Drawbacks:

  • Cannot adapt to new conditions
  • Does not gain knowledge over time

Dynamic Recommendation System

Advantages:

  • New rules are learned over time
  • Ability to adapt to new conditions

Drawbacks:

  • More complex techniques
  • Implementation required
SLIDE 34

Static Rules for Method Selection


▪ Calculate time series characteristics

▪ Seasonality ▪ Trend ▪ Skewness ▪ Non-Linearity ▪ Chaos ▪ ...

▪ Define simple rules based on expert knowledge

▪ IF (Seasonality > 0.15): Do not use ETS ▪ IF (Skewness > 0.70 && Non-Linearity < 0.20): Use ARIMA ▪ …
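Such a rule could be sketched as follows; the STL-based seasonality-strength heuristic and the fallback method are assumptions for illustration, only the 0.15 threshold comes from the slide:

```r
# STL-based strength-of-seasonality characteristic (a common heuristic)
d <- stl(AirPassengers, s.window = "periodic")$time.series
r <- d[, "remainder"]
seasonality <- max(0, 1 - var(r) / var(r + d[, "seasonal"]))
# IF (Seasonality > 0.15): Do not use ETS
method <- if (seasonality > 0.15) "sARIMA" else "ETS"
```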

SLIDE 35

Dynamic Recommendation System


Forecasting Method Selection: Examination and Ways Ahead @ICAC’19

SLIDE 36

Model Fitting


▪ Fitting forecasting models in R is very easy since many libraries exist:

▪ forecast ▪ xgboost ▪ randomForest ▪ e1071

▪ Parameter optimization:

▪ Most statistical forecasting models do not require parameter optimization, or it is included in the provided implementation

▪ Machine-learning based forecasting methods highly depend on parameter optimization ➔ very time-consuming

SLIDE 37

Model Fitting


library(forecast)
history <- ts(train, frequency = freq)
# sNaive
fc <- snaive(history, h = horizon)
# sARIMA
fit <- auto.arima(history, stepwise = TRUE)
fc <- forecast(fit, h = horizon)
# ETS
fit <- ets(history)
fc <- forecast(fit, h = horizon)
# tBATS
fit <- tbats(history)
fc <- forecast(fit, h = horizon)
# ANN
fit <- nnetar(history)
fc <- forecast(fit, h = horizon)

SLIDE 38

Model Fitting – Cont’d


# used libraries
library(xgboost)
library(randomForest)
library(e1071)
# setting parameters
freq <- frequency(AirPassengers)
horizon <- 14
train <- ts(AirPassengers[1:130], frequency = freq)
len <- length(train)
# used for method training and prediction
ind <- seq(1, length(train))
period <- seq(1, length(train)) %% freq
covar <- as.matrix(cbind(ind, period))

SLIDE 39

Model Fitting – Cont’d


ind <- seq(len+1, len+horizon)
period <- seq(len+1, len+horizon) %% freq
future <- as.matrix(cbind(ind, period))
# XGBoost
fit <- xgboost(label = train, data = covar, nround = 10, nthread = 2)
fc <- predict(fit, future)
# Random Forest
fit <- randomForest(y = train, x = covar)
fc <- predict(fit, future)
# SVM
fit <- svm(y = train, x = covar)
fc <- predict(fit, future)

SLIDE 40

Model Fitting

(Figure: forecasts of the fitted methods on the AirPassengers series)

SLIDE 41

Evaluation


▪ Assessing forecast performance is a very important task ▪ Model error

▪ Build model ▪ Calculate residuals based on history

▪ Forecast error

▪ A-posteriori

▪ Comparison against the “future” values ▪ Mostly not available

▪ A-priori

▪ Split time series into train and test set ▪ Commonly 80% and 20%

SLIDE 42

Error Measure Categories


▪ Scale-dependent error measures

▪ Intuitive when the scale is known ▪ Not suitable for comparing different scales

▪ Percentage error measures

▪ Easy to interpret ▪ Scale has impact

▪ Scaled error measures

▪ Normalization with a baseline → scale-independent ▪ Less intuitive to understand, e.g., 12/10 ≫ 10002/10000

SLIDE 43

Error Measure Examples


▪ MAE = (1/n) · Σ_{j=1}^{n} |y_j − ŷ_j|   (scale-dependent error measure)

▪ RMSE = √( (1/n) · Σ_{j=1}^{n} (y_j − ŷ_j)² )   (scale-dependent error measure)

▪ MAPE = (100%/n) · Σ_{j=1}^{n} |y_j − ŷ_j| / |y_j|   (percentage error measure)

▪ sMAPE = (200%/n) · Σ_{j=1}^{n} |y_j − ŷ_j| / (y_j + ŷ_j)   (percentage error measure)

▪ MASE = Σ_{j=1}^{n} |y_j − ŷ_j| / ( (n/(n−m)) · Σ_{j=m+1}^{n} |y_j − y_{j−m}| )   (scaled error measure, m = seasonal lag)

▪ …

(y_j: actual value, ŷ_j: forecast value)

SLIDE 44

Evaluation


# used library
library(forecast)
model <- auto.arima(ts(AirPassengers[1:130], frequency = 12))
fc <- forecast(model, h = 14)
accuracy(fc)
#                   ME     RMSE      MAE    MPE    MAPE    MASE    ACF1
# Training set 0.44932  9.87073  7.45597 0.0858 2.88924 0.24895 0.01638
accuracy(fc, AirPassengers[131:144])
#                   ME     RMSE      MAE     MPE    MAPE    MASE    ACF1
# Training set 0.44932  9.87073  7.45597  0.0858 2.88924 0.31360 0.01638
# Test set     0.73502 15.17562 11.14010 -0.0154 2.45400 0.46856      NA

SLIDE 45

Comparing Forecasts


▪ Be careful when aggregating forecast error measures

▪ Varying scales of different time series ▪ Different treatment of positive and negative errors

▪ How to aggregate forecast error measures?

▪ Keep the forecast horizon equally long ▪ Use scaled error measures ▪ Normalize the range of time series
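For the last point, min-max scaling is one simple option (an illustrative choice, not prescribed by the slides) to bring series of very different scales onto a comparable range before aggregating errors:

```r
# normalize each series to [0, 1] before computing and aggregating errors
minmax <- function(x) (x - min(x)) / (max(x) - min(x))
a <- minmax(as.vector(AirPassengers))  # monthly passengers, scale ~100-600
b <- minmax(as.vector(UKgas))          # quarterly gas demand, other scale
```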

SLIDE 46

Putting it together


Telescope's pipeline (raw input values → forecast output):

Preprocessing

  • Removal of Anomalies – AnomalyDetection
  • Time Series Transformation – Box-Cox
  • Frequency Determination – FFT

Decomposition Task

  • Time Series Decomposition – STL (trend, season, remainder)

Season & Trend Forecasting

  • Season Forecasting – STL based
  • Trend Forecasting – ARIMA

Learning of Categorical Information

  • Clustering of Single Periods – k-Means
  • Centroid Forecasting – ANN

Remainder Forecasting & Composition

  • Boosted Random Trees with Covariates – XGBoost
  • Time Series Retransformation – Box-Cox
SLIDE 47

Telescope


(Figure: actual values; the part left of the purple line is used for learning, the part right of it is to be predicted)

SLIDE 48

Telescope


(Figure: actual values compared against the Telescope and tBATS forecasts)

SLIDE 49

Telescope


install.packages("devtools")
devtools::install_github("DescartesResearch/telescope")
# Alternative:
install.packages("remotes")
remotes::install_url(url = "https://github.com/DescartesResearch/telescope/archive/master.zip",
                     INSTALL_opt = "--no-multiarch")
# Loading the library
library(telescope)
# Example execution
forecast <- telescope.forecast(AirPassengers, horizon = 10)

SLIDE 50

Summary


▪ Forecasting is an important task for many autonomic systems ▪ Many existing libraries provide easy-to-use functions ▪ Preprocessing is always needed ▪ Feature engineering is essential for achieving accurate forecasts ▪ The error measure should be carefully selected, taking the properties of the aggregation into account