Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)

Causal Inference in Time Series via Supervised Learning

Yoichi Chikahara and Akinori Fujino
NTT Communication Science Laboratories, Kyoto 619-0237, Japan
chikahara.yoichi@lab.ntt.co.jp, fujino.akinori@lab.ntt.co.jp

Abstract

Causal inference in time series is an important problem in many fields. Traditional methods use regression models for this problem. The inference accuracies of these methods depend greatly on whether or not the model can be well fitted to the data, and therefore we are required to select an appropriate regression model, which is difficult in practice. This paper proposes a supervised learning framework that utilizes a classifier instead of regression models. We present a feature representation that employs the distance between the conditional distributions given past variable values and show experimentally that the feature representation provides sufficiently different feature vectors for time series with different causal relationships. Furthermore, we extend our framework to multivariate time series and present experimental results where our method outperformed the model-based methods and the supervised learning method for i.i.d. data.

1 Introduction

Discovering temporal causal directions is an important task in time series analysis and has key applications in various fields. For instance, finding the causal direction indicating that the research and development (R&D) expenditure X influences the total sales Y, but not vice versa, is helpful for decision making in companies. In addition, identifying causal (regulatory) relationships between genes from time series gene expression data is one of the most important topics in bioinformatics.

As a definition of temporal causality, Granger causality [Granger, 1969] is widely used [Kar et al., 2011; Yao et al., 2015]. According to its definition, the variable X is the cause of the variable Y if the past values of X are helpful in predicting the future value of Y.

Traditional methods for identifying Granger causality use regression models [Bell et al., 1996; Cheng et al., 2014; Granger, 1969; Marinazzo et al., 2008; Sun, 2008], such as the vector autoregressive (VAR) model and generalized additive models (GAM). With these methods, we can determine that X is the cause of Y if the prediction errors of Y based only on its past values are significantly reduced by additionally using the past values of X. When the regression model can be well fitted to the data, we can infer correct causal directions. However, in practice, selecting an appropriate regression model for each time series dataset is difficult and requires a deep understanding of data analysis. Therefore, it is not easy to identify correct causal directions with these model-based methods.

The goal of this paper is to build an approach to causal inference in time series that does not require a deep understanding of data analysis. To realize this goal, we propose a supervised learning framework that utilizes a classifier instead of regression models. Specifically, we propose solving the problem of Granger causality identification by ternary classification, in other words, by training a classifier that assigns ternary causal labels (X → Y, X ← Y, or No Causation) to time series. In fact, several methods have already been proposed that perform classification to infer causal relationships from i.i.d. data, and they have worked well experimentally [Bontempi and Flauder, 2015; Guyon, 2013; Lopez-Paz et al., 2015; 2017]. To solve causal inference in time series via classification, we formulate a feature representation that provides sufficiently different feature vectors for time series with different causal relationships. The idea for obtaining such feature vectors is founded on the definition of Granger causality: X is the cause of Y if the following two conditional distributions of the future value of Y are different; one is given the past values of Y, and the other is given the past values of X and Y. To build the classifier for Granger causality identification, we utilize the distance between these distributions when preparing feature vectors. To compute the distance, by using kernel mean embedding, we map each distribution to a point in the feature space called the reproducing kernel Hilbert space (RKHS) and measure the distance between the points, which is termed the maximum mean discrepancy (MMD) [Gretton et al., 2007].

In experiments, our method sufficiently outperformed the model-based Granger causality methods and the supervised learning method for i.i.d. data by using the same feature representation and the same classifier. Furthermore, we describe how our approach can be extended to multivariate time series and show experimentally that the feature vectors have a sufficient difference that depends on Granger causality, which demonstrates the effectiveness of our proposed framework.
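To make the model-based baseline concrete, the following is a minimal sketch, not the authors' code, of a VAR-style Granger causality test: it checks whether adding the past of X significantly reduces the prediction error of Y. The synthetic data and the lag choice are illustrative assumptions; the test itself is the standard routine from statsmodels.

```python
# A minimal sketch (not the authors' code) of a VAR-style Granger causality test:
# does the past of X significantly reduce the prediction error of Y?
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
T = 500
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    # y depends on its own past and on the past of x, so X -> Y should be detected
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

# grangercausalitytests checks whether the 2nd column helps predict the 1st column
data = np.column_stack([y, x])
results = grangercausalitytests(data, maxlag=2)  # prints a summary per lag
p_value = results[1][0]["ssr_ftest"][1]  # F-test p-value at lag 1
print(f"H0: X does not Granger-cause Y, p = {p_value:.4g}")
```

As the paper notes, the reliability of this kind of test hinges on the VAR (or GAM) model fitting the data well, which motivates the classifier-based alternative.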
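The distance underlying the proposed feature representation is the MMD between kernel mean embeddings. As an illustration only, a biased empirical estimate of the squared MMD between two samples with a Gaussian kernel can be computed as below; the sample construction and the median-heuristic bandwidth are our assumptions, not the authors' exact settings.

```python
# Empirical (biased) estimate of the squared MMD between two samples with an RBF kernel.
# A minimal sketch for illustration; the median-heuristic bandwidth is an assumption.
import numpy as np

def rbf_kernel(A, B, bandwidth):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
    sq_dists = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def mmd2(X, Y, bandwidth):
    """Biased empirical estimate of MMD^2 between samples X and Y."""
    k_xx = rbf_kernel(X, X, bandwidth)
    k_yy = rbf_kernel(Y, Y, bandwidth)
    k_xy = rbf_kernel(X, Y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()

rng = np.random.default_rng(0)
# Two toy samples standing in for the two conditional distributions being compared
sample_a = rng.normal(0.0, 1.0, size=(200, 1))
sample_b = rng.normal(0.5, 1.0, size=(200, 1))

# Median heuristic for the kernel bandwidth (a common default, assumed here)
all_points = np.vstack([sample_a, sample_b])
pairwise = np.sqrt(np.maximum(
    np.sum(all_points**2, 1)[:, None] + np.sum(all_points**2, 1)[None, :]
    - 2.0 * all_points @ all_points.T, 0.0))
bandwidth = np.median(pairwise[pairwise > 0])

print(f"estimated MMD^2 = {mmd2(sample_a, sample_b, bandwidth):.4f}")
```

In the proposed framework, MMD values of this kind, computed between the conditional distributions of the future value of Y given the past of Y alone and given the past of X and Y, populate the feature vectors that are fed to the ternary classifier.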
