SLIDE 1
DSANet: Dual Self-Attention Network for Multivariate Time Series Forecasting

Siteng Huang, Donglin Wang, Xuehan Wu, Ao Tang

Presenter: Siteng Huang

Machine Intelligence Laboratory, Department of Engineering, Westlake University, Hangzhou, China November, 2019

SLIDE 2

  • 1. Introduction and Previous Works
  • 2. Proposed Model
  • 3. Experiments
  • 4. Conclusion

Outline

SLIDE 3

  • The purpose of time series forecasting is to predict future values based on historical data.
  • The difficulty lies in the fact that traditional methods fail to capture the complicated non-linear dependencies between time steps and between multiple time series.

Figure 1: An example of chaotic multivariate time series.

Introduction

SLIDE 4

Guokun Lai, Wei-Cheng Chang, Yiming Yang, Hanxiao Liu. Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks. SIGIR 2018: 95-104.
Shun-Yao Shih, Fan-Keng Sun, Hung-yi Lee. Temporal Pattern Attention for Multivariate Time Series Forecasting. Machine Learning 108(8-9): 1421-1441 (2019).

Figure 2: Long- and Short-term Time-series Network (LSTNet).
Figure 3: Temporal Pattern Attention.

Previous Works

SLIDE 5

Figure 4: Dual Self-Attention Network (DSANet).

Proposed Model

  • Global Temporal Convolution
  • Local Temporal Convolution
  • Self-attention Module
  • Autoregressive Component
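The four components listed above can be sketched end to end. The following is a minimal numpy illustration, not the authors' implementation: the filter counts, filter lengths, AR order, and random weights are placeholders standing in for learned parameters, and the paper's full model additionally uses multiple attention layers with position-wise feed-forward networks.

```python
import numpy as np

def self_attention(X):
    # Scaled dot-product self-attention; rows of X are the individual series.
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                      # series-to-series affinities
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                 # row-wise softmax
    return w @ X

def dsanet_sketch(Z, ar_order=4, n_filters=4, seed=0):
    # Z: (window, n_series) history; returns one forecast value per series.
    rng = np.random.default_rng(seed)
    T, n = Z.shape
    # Global temporal convolution: filters spanning the whole window, so each
    # filter summarizes a series' entire history in a single feature.
    Wg = rng.standard_normal((n_filters, T)) * 0.1
    global_feats = (Wg @ Z).T                          # (n_series, n_filters)
    # Local temporal convolution: short (length-3) filters + max-pooling over time.
    Wl = rng.standard_normal((n_filters, 3)) * 0.1
    conv = np.stack([Wl @ Z[t:t + 3] for t in range(T - 2)])  # (T-2, n_filters, n_series)
    local_feats = conv.max(axis=0).T                   # (n_series, n_filters)
    # Self-attention module on each branch, mixing information across series.
    g = self_attention(global_feats)
    l = self_attention(local_feats)
    Wo = rng.standard_normal(2 * n_filters) * 0.1
    nonlinear = np.concatenate([g, l], axis=1) @ Wo    # (n_series,)
    # Autoregressive component: a linear model on the last `ar_order` steps,
    # keeping the output sensitive to the scale of the inputs.
    w_ar = np.full(ar_order, 1.0 / ar_order)
    ar = Z[-ar_order:].T @ w_ar                        # (n_series,)
    return nonlinear + ar
```

The key design point is that attention operates across series (rows), not across time: temporal structure is first compressed by the two convolution branches, and the AR highway preserves linear scale sensitivity that the nonlinear path may lose.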
SLIDE 9

  • Dataset: a large multivariate time series dataset containing the daily revenue of geographically close gas stations.

  • Baselines: VAR, LRidge, LSVR, GRU, LSTNet-S, LSTNet-A, TPA
  • Problem Parameters:
  • window: the length of the input time series; value range {32, 64, 128}
  • horizon: the number of time steps ahead of the current time stamp to forecast; value range {3, 6, 12, 24}

Experimental Settings
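The window/horizon setup above determines how the raw series is sliced into training samples. A minimal sketch of that slicing (the function name and shapes are illustrative, not from the paper):

```python
import numpy as np

def make_windows(series, window, horizon):
    # series: (T, n_series) array. Each sample pairs `window` consecutive
    # steps of history with the values `horizon` steps past the last input step.
    T = len(series)
    X, y = [], []
    for start in range(T - window - horizon + 1):
        X.append(series[start:start + window])
        y.append(series[start + window + horizon - 1])
    return np.stack(X), np.stack(y)
```

With window=32 and horizon=24, each input covers 32 consecutive days and the target is the revenue vector 24 days after the window ends, so larger horizons both shrink the sample count and make the task harder.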

SLIDE 10

  • Evaluation Metrics:
  • Root relative squared error (RRSE):

$$\mathrm{RRSE} = \frac{\sqrt{\sum_{(i,t)\in\Omega} \big(\hat{Z}_{i,t} - Z_{i,t}\big)^2}}{\sqrt{\sum_{(i,t)\in\Omega} \big(Z_{i,t} - \mathrm{mean}(Z)\big)^2}}$$

  • Mean absolute error (MAE):

$$\mathrm{MAE} = \frac{1}{|\Omega|}\sum_{(i,t)\in\Omega} \big|\hat{Z}_{i,t} - Z_{i,t}\big|$$

  • Empirical correlation coefficient (CORR):

$$\mathrm{CORR} = \frac{1}{n}\sum_{i=1}^{n} \frac{\sum_{t}\big(Z_{i,t} - \mathrm{mean}(Z_i)\big)\big(\hat{Z}_{i,t} - \mathrm{mean}(\hat{Z}_i)\big)}{\sqrt{\sum_{t}\big(Z_{i,t} - \mathrm{mean}(Z_i)\big)^2 \sum_{t}\big(\hat{Z}_{i,t} - \mathrm{mean}(\hat{Z}_i)\big)^2}}$$

where $Z$ denotes the ground-truth values, $\hat{Z}$ the predictions, $\Omega$ the set of test entries $(i,t)$, and $n$ the number of series. Lower RRSE and MAE are better; higher CORR is better.

Experimental Settings
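The three metrics can be computed directly from the definitions above; a small numpy sketch (function names are illustrative):

```python
import numpy as np

def rrse(y_true, y_pred):
    # Root relative squared error over all (time, series) entries:
    # squared error normalized by the deviation from the global mean.
    num = np.sum((y_true - y_pred) ** 2)
    den = np.sum((y_true - y_true.mean()) ** 2)
    return np.sqrt(num / den)

def mae(y_true, y_pred):
    # Mean absolute error over all entries.
    return np.mean(np.abs(y_true - y_pred))

def corr(y_true, y_pred):
    # Pearson correlation computed per series (column), then averaged.
    yt = y_true - y_true.mean(axis=0)
    yp = y_pred - y_pred.mean(axis=0)
    per_series = (yt * yp).sum(axis=0) / np.sqrt(
        (yt ** 2).sum(axis=0) * (yp ** 2).sum(axis=0))
    return per_series.mean()
```

Note that RRSE normalizes globally while CORR is computed per series and then averaged, so a model can score well on one and poorly on the other when series have very different scales.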

SLIDE 11

Table 1: RRSE, MAE and CORR scores for our proposed DSANet and baselines when window=32.

Experimental Results

SLIDE 12

Table 2: RRSE, MAE and CORR scores for our proposed DSANet and baselines when window=64.

Experimental Results

SLIDE 13

Table 3: RRSE, MAE and CORR scores for our proposed DSANet and baselines when window=128.

Experimental Results

SLIDE 14

  • DSAwoGlobal: Remove the global temporal convolution branch;
  • DSAwoLocal: Remove the local temporal convolution branch;
  • DSAwoAR: Remove the autoregressive component.

Figure 5: Ablation test results of DSANet.

Ablation Study

SLIDE 15

  • Multivariate time series with dynamic-period or nonperiodic patterns are chaotic and hard to forecast.
  • Dual convolutions help to capture mixtures of global and local temporal patterns.
  • The self-attention mechanism helps to capture dependencies between different series.
  • Our model shows promising results and outperforms the baselines.
  • All components contribute to the effectiveness and robustness of the whole model.

Conclusion

SLIDE 16

Thanks for your attention. Questions?