SLIDE 1

PATTERN DISCOVERY IN TIME SERIES: A SURVEY

DENIS KHRYASHCHEV, GRADUATE CENTER, CUNY, OCTOBER 2018

SLIDE 2

MOTIVATION

Often datasets represent processes that take place over long periods of time. Their outputs are measured at regular time intervals, creating discrete time series. For example, consider CitiBike demand and Fisher River temperature data.

Data sources:
1. https://s3.amazonaws.com/tripdata/index.html
2. https://datamarket.com/data/set/235d/mean-daily-temperature-fisher-river-near-dallas-jan-01-1988-to-dec-31-1991

(Figure: CitiBike demand and Fisher River temperature series.)

SLIDE 3

MOTIVATION. COMPLEXITY

Complexity quantifies the internal structure of the underlying process. EEG data can be classified [1] into interictal, preictal and seizure using their complexity.

[1] Petrosian, Arthur. "Kolmogorov complexity of finite sequences and recognition of different preictal EEG patterns." Proceedings of the Eighth IEEE Symposium on Computer-Based Medical Systems. IEEE, 1995.

(Figure: EEG voltage (µV) traces for interictal, preictal, and seizure states.)

SLIDE 4

MOTIVATION. PERIODICITY

Natural phenomena such as solar activity and the Earth's rotation and revolution drive periodic human activity at large scale. For example, New York City's human mobility is highly periodic, with clear ridership peaks from 6 AM to 10 AM and from 3 PM to 7 PM.

Image source: http://web.mta.info/mta/news/books/docs/Ridership_Trends_FINAL_Jul2018.pdf

SLIDE 5

MOTIVATION. PREDICTABILITY

Predictability estimates the expected accuracy of forecasting a given time series. Often there is a trade-off between the desired accuracy and the computation time [2].

[2] Zhao, Kai, et al. "Predicting taxi demand at high spatial resolution: approaching the limit of predictability." 2016 IEEE International Conference on Big Data (Big Data). IEEE, 2016.

SLIDE 6

MOTIVATION. CLUSTERING

Image source: Denis Khryashchev’s summer internship at Simulmedia (Jun – Aug 2018).

Often the task of grouping time series that are similar in some respect arises in the domains of transportation, finance, medicine, and others. Time-sensitive modifications of standard techniques are applied, e.g. k-means over autocorrelation functions.

(Figure: autocorrelation functions of time series clustered together.)

SLIDE 7

MOTIVATION. FORECASTING

Perhaps the most well-known and widely applied task related to time series is forecasting. Understanding time series periodicity, complexity, and predictability helps in selecting better predictors and optimizing their parameters. E.g., knowing the periodicity P = 5 of a series, one can forecast by averaging values at lag 5.

Video source: Denis Khryashchev’s summer internship at Simulmedia (Jun – Aug 2018).

SLIDE 8

NOTATION

Throughout the presentation we will consider time series of real values and will use the following notation:

$X = \{X_1, \dots, X_N\} = \{X_t\}_{t=1}^{N}, \quad X_t \in \mathbb{R}$

Not to be confused with set notation, $\{\cdot\}$ is used to denote sequences. A subsequence of the series $X$ that starts at period $i$ and ends at period $j$ is written as

$X_i^j = \{X_t\}_{t=i}^{j} = \{X_i, \dots, X_j\}, \quad i \le j$

SLIDE 9

ORGANIZATION OF THE PRESENTATION

SLIDE 10

1. KOLMOGOROV COMPLEXITY

For a time series $X$ we define the Kolmogorov complexity as the length of the shortest description of its sequence of values, ordered in time, in some fixed universal description language:

$K(X) = |d(X)|$

where $K$ is the Kolmogorov complexity and $d(X)$ is the shortest description of the time series $X$. Smaller values of $K(X)$ indicate lower complexity.

SLIDE 11

1. KOLMOGOROV COMPLEXITY. EXAMPLE

Given two time series $X = \{0, 1, 0, 1, 0, 1, 0, 1, 0, 1\}$ and $Y = \{1, 0, 0, 1, 1, 1, 0, 0, 1, 0\}$, and selecting Python as our description language, we have the shortest descriptions $d_P(X) = $ [0,1]*5 and $d_P(Y) = $ [1,0,0,1,1,1,0,0,1,0], quantifying a smaller "Pythonic" complexity for $X$ compared to $Y$:

$K_P(X) = |d_P(X)| = 7, \quad K_P(Y) = |d_P(Y)| = 21$

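Since $K$ is uncomputable in general (see the next slide), a compressed length is often used as a practical stand-in. A minimal sketch, assuming we accept zlib's encoding as the description language; the helper name is mine:

```python
import random
import zlib

def compression_complexity(series):
    """Approximate K(X) by the byte length of a zlib-compressed encoding."""
    raw = ",".join(str(v) for v in series).encode()
    return len(zlib.compress(raw, level=9))

random.seed(0)
regular = [0, 1] * 500                               # highly regular series
noisy = [random.randint(0, 1) for _ in range(1000)]  # random series
print(compression_complexity(regular))  # small: repetition compresses well
print(compression_complexity(noisy))    # larger: randomness resists compression
```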
SLIDE 12

1. KOLMOGOROV COMPLEXITY. LIMITATIONS

However, as proven by Kolmogorov in [3] and by Chaitin and Arslanov in [4], the complexity $K$ is not a computable function in general.

[3] Kolmogorov, Andrei N. "On tables of random numbers." Sankhyā: The Indian Journal of Statistics, Series A (1963): 369-376.
[4] Chaitin, G. J., A. Arslanov, and C. Calude. "Program-size complexity computes the halting problem." Department of Computer Science, The University of Auckland, New Zealand, Tech. Rep., 1995.

SLIDE 13

1. LEMPEL-ZIV COMPLEXITY

Lempel and Ziv [5] proposed a combinatorial approximation of the complexity of finite sequences based on their production history. For a time series $X$ a production history is a parsing into consecutive blocks

$H(X) = \{X_1^{h_1}, X_{h_1+1}^{h_2}, \dots, X_{h_{n-1}+1}^{N}\}$

For the series $X = \{0,0,0,1,1,0,1,0,0,1,0,0,0,1,0,1\}$ one of the production histories is

$H(X) = \{0\} \cup \{0,0,1\} \cup \{1,0\} \cup \{1,0,0\} \cup \{1,0,0,0\} \cup \{1,0,1\}$

The overall complexity is the size of the shortest possible production history:

$c(X) = \min_{H(X)} |H(X)|$

Disadvantage: the actual values $X_t$ are treated as symbols, e.g. $c(\{1, 2, 1, 5, 1, 2\}) = c(\{8, 0.5, 8, 0.1, 8, 0.5\})$.

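A rough sketch of the idea: an LZ78-style left-to-right parsing counts the distinct phrases needed to reproduce the sequence, a common computable proxy for the minimal production history above (it parses greedily rather than minimizing):

```python
def lz78_phrase_count(seq):
    """Number of distinct phrases in a greedy LZ78-style parsing of seq."""
    phrases, current = set(), ()
    for symbol in seq:
        current += (symbol,)
        if current not in phrases:   # new phrase: record it, start the next one
            phrases.add(current)
            current = ()
    # an unfinished (already seen) phrase at the end still counts once
    return len(phrases) + (1 if current else 0)

X = [0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1]
print(lz78_phrase_count(X))
# Values act as symbols, exactly as the disadvantage above notes:
print(lz78_phrase_count([1, 2, 1, 5, 1, 2]) == lz78_phrase_count([8, 0.5, 8, 0.1, 8, 0.5]))  # True
```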
[5] Lempel, Abraham, and Jacob Ziv. "On the complexity of finite sequences." IEEE Transactions on Information Theory 22.1 (1976): 75-81.

SLIDE 14

2. ENTROPY

Shannon and Weaver introduced entropy [6] as a measure of the information transmitted by a signal in a communication channel:

$H(X) = -\mathbb{E}[\log_2 P(X)]$

Rényi [7] generalized the definition for an ordinary discrete finite distribution of $X$, $\mathcal{P} = \{p_1, \dots, p_n\}$, $\sum_k p_k = 1$, to the entropy of order $\alpha$ ($\alpha \to 1$ recovers Shannon entropy):

$H_\alpha(X) = H_\alpha(\mathcal{P}) = \frac{1}{1-\alpha} \log_2 \sum_k p_k^\alpha$

Disadvantage: both definitions do not take the order of the values $X_t$ into account, e.g. $H(\{1, 2, 3, 1, 2, 3\}) = H(\{1, 3, 2, 2, 3, 1\})$.

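A minimal sketch of plug-in estimates of both quantities from empirical value frequencies (function names are mine):

```python
import math
from collections import Counter

def shannon_entropy(series):
    """Plug-in Shannon entropy (bits) of the empirical value distribution."""
    n = len(series)
    return -sum((c / n) * math.log2(c / n) for c in Counter(series).values())

def renyi_entropy(series, alpha):
    """Plug-in Renyi entropy of order alpha (alpha != 1)."""
    n = len(series)
    return math.log2(sum((c / n) ** alpha for c in Counter(series).values())) / (1 - alpha)

# Order-blindness: permuting the same values leaves the entropy unchanged.
print(shannon_entropy([1, 2, 3, 1, 2, 3]) == shannon_entropy([1, 3, 2, 2, 3, 1]))  # True
print(round(renyi_entropy([1, 2, 3, 1, 2, 3], alpha=2), 3))
```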
[6] Cover, Thomas M., and Joy A. Thomas. Elements of Information Theory. John Wiley & Sons, 2012.
[7] Rényi, Alfréd. "On measures of entropy and information." Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability. University of California Press, 1961.

SLIDE 15

2. KOLMOGOROV ENTROPY

Entropy is often used as an approximation of complexity. Among the most well-known approximations [8] of complexity is the Kolmogorov entropy, defined as

$K = -\lim_{\tau \to 0}\, \lim_{\ell \to 0}\, \lim_{d \to \infty}\, \frac{1}{d\tau} \sum_{i_1, \dots, i_d} p(i_1, \dots, i_d)\, \ln p(i_1, \dots, i_d)$

It describes the complexity of a dynamical system with $F$ degrees of freedom. The $F$-dimensional phase space is partitioned into boxes of size $\ell^F$, $\tau$ stands for the sampling time interval, and $p(i_1, \dots, i_d)$ is the joint probability that the $F$-dimensional point representing the values $X_{t = k\tau}$ is found inside box $i_1$ at time $\tau$, box $i_2$ at time $2\tau$, and so on.

Disadvantage: the approximation is computable for known, analytically defined models; however, it is hard to calculate given only the resulting series.

[8] Grassberger, Peter, and Itamar Procaccia. "Estimation of the Kolmogorov entropy from a chaotic signal." Physical Review A 28.4 (1983): 2591.

SLIDE 16

2. ENTROPY WITH TEMPORAL COMPONENT

Another definition [6] of entropy takes into account the temporal order of the values $X_t$:

$H_t(X) = -\sum_{i=1}^{N} \sum_{j=i}^{N} P(X_i^j) \log_2 P(X_i^j)$

where $P(X_i^j)$ is the probability of observing the subsequence $X_i^j$. Computing $H_t(X)$ is $O(2^N)$ complex.

The Lempel-Ziv estimator [9] approximates $H_t(X)$ and converges rapidly:

$H_{LZ}(X) = \left( \frac{1}{N} \sum_t \Lambda_t \right)^{-1} \ln N$

where $\Lambda_t$ is the length of the shortest subsequence starting at period $t$ that is observed for the first time.

Disadvantage: the values $X_t$ are treated as symbols, e.g. $H_{LZ}(\{1, 2, 1, 5\}) = H_{LZ}(\{2, 9, 2, 3\})$.

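A sketch of the estimator on symbolized data; the Λ_t search below is a naive scan that is fine for short series (names are mine):

```python
import math

def _seen_before(history, pattern):
    """True if pattern occurs as a contiguous subsequence of history."""
    m = len(pattern)
    return any(history[i:i + m] == pattern for i in range(len(history) - m + 1))

def lz_entropy(series):
    """Lempel-Ziv entropy estimate: (mean Lambda_t)^-1 * ln N."""
    x = list(series)
    n = len(x)
    lambdas = []
    for t in range(n):
        k = 1                                 # Lambda_t: shortest new subsequence at t
        while t + k <= n and _seen_before(x[:t], x[t:t + k]):
            k += 1
        lambdas.append(k)
    return math.log(n) / (sum(lambdas) / n)

# Values act as symbols: both series yield the same estimate.
print(lz_entropy([1, 2, 1, 5]) == lz_entropy([2, 9, 2, 3]))  # True
```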
[6] Cover, Thomas M., and Joy A. Thomas. Elements of Information Theory. John Wiley & Sons, 2012.
[9] Kontoyiannis, Ioannis, et al. "Nonparametric entropy estimation for stationary processes and random fields, with applications to English text." IEEE Transactions on Information Theory 44.3 (1998): 1319-1327.

SLIDE 17

2. PERMUTATION ENTROPY

Bandt and Pompe [10] proposed the permutation entropy of order $n$:

$H(n) = -\sum_{\pi} p(\pi) \log p(\pi), \quad p(\pi) = \frac{\#\{ t \mid 0 \le t \le T - n,\ \mathrm{type}(X_{t+1}, \dots, X_{t+n}) = \pi \}}{T - n + 1}$

where $p(\pi)$ is the frequency of permutations of type $\pi$. E.g., for $X = \{4, 7, 9, 10, 6, 11, 3\}$ and $n = 3$ we have $\pi(4, 7, 9) = \pi(7, 9, 10) = \pi_{012}$ ($X_t < X_{t+1} < X_{t+2}$), $\pi(9, 10, 6) = \pi(6, 11, 3) = \pi_{210}$ ($X_{t+2} < X_t < X_{t+1}$), and $\pi(10, 6, 11) = \pi_{102}$ ($X_{t+1} < X_t < X_{t+2}$). The entropy becomes

$H(3) = -2 \cdot \tfrac{2}{5} \log_2 \tfrac{2}{5} - \tfrac{1}{5} \log_2 \tfrac{1}{5} \approx 1.52$

Disadvantage: the definition requires $X_t \ne X_{t+1}$ (no ties) and has a complexity of $O(n!)$.

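A compact sketch; each window's pattern is taken as the argsort of its values, which matches the $\pi$ labels above up to relabeling:

```python
import math
from collections import Counter

def permutation_entropy(x, order=3):
    """Permutation entropy (base-2) of a series with no tied neighbors."""
    counts = Counter()
    for t in range(len(x) - order + 1):
        window = x[t:t + order]
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(round(permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=3), 2))  # 1.52
```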
[10] Bandt, Christoph, and Bernd Pompe. "Permutation entropy: a natural complexity measure for time series." Physical Review Letters 88.17 (2002): 174102.

SLIDE 18

3. PREDICTABILITY

Following the Fano inequality [11], the predictability of a series $X$ satisfies

$\Pi(X) \le \Pi_{max}(H(X), N)$

where $N$ is the number of unique values $X_t$ and $0 \le \Pi_{max} \le 1$ is the maximum predictability of $X$, with 0 standing for a completely unpredictable, chaotic series. In the work of Song et al. [12] the maximum predictability is shown to be the solution of

$H(X) = -\Pi_{max} \log_2 \Pi_{max} - (1 - \Pi_{max}) \log_2 (1 - \Pi_{max}) + (1 - \Pi_{max}) \log_2 (N - 1)$

Often the Lempel-Ziv estimator $H_{LZ}(X)$ is used to calculate the entropy of the series.

Disadvantage: depends on the selected measure of $H(X)$; the equation has no closed-form solution.

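Because there is no closed form, $\Pi_{max}$ is found numerically. A sketch solving the equation above by bisection; the right-hand side decreases in $\Pi_{max}$ on $(1/N, 1)$, so the root is unique (function names are mine):

```python
import math

def max_predictability(H, N, tol=1e-12):
    """Solve H = -p*log2(p) - (1-p)*log2(1-p) + (1-p)*log2(N-1) for p."""
    def rhs(p):
        return (-p * math.log2(p) - (1 - p) * math.log2(1 - p)
                + (1 - p) * math.log2(N - 1))
    lo, hi = 1.0 / N, 1.0 - 1e-15     # rhs falls from log2(N) to 0 on this interval
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if rhs(mid) > H:               # rhs still too large: move right
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(max_predictability(H=1.0, N=10), 3))  # entropy of 1 bit over 10 values
```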
[11] Fano, Robert M. The Transmission of Information. Cambridge, MA: Massachusetts Institute of Technology, Research Laboratory of Electronics, 1949.
[12] Song, C., Qu, Z., Blumm, N., & Barabási, A. L. "Limits of predictability in human mobility." Science 327.5968 (2010): 1018-1021.

SLIDE 19

3. PREDICTABILITY. SHUFFLING

An alternative approach to measuring predictability was proposed by Kaboudan [13], who defined it as a ratio of forecasting errors before and after shuffling the series:

$\eta(X) = 1 - \frac{(X - f(X))^T (X - f(X))}{(X_s - f(X_s))^T (X_s - f(X_s))}$

where $X_s$ is a shuffled copy of the series and $f$ is the selected predictor.

Disadvantage: the shuffling is done only once and can lead to inconsistent results when measured multiple times for the same time series $X$. A significant improvement of the approach would be to calculate $\eta(X)$ numerous times (e.g. 1000), computing a $p$-value and performing hypothesis testing.

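A sketch of the repeated-shuffle variant suggested above, with a naive persistence forecaster (each value predicted by its predecessor) standing in for $f$; any predictor with the same interface would do:

```python
import numpy as np

def persistence(a):
    """Naive predictor f: forecast each value by the previous one."""
    return np.concatenate(([a[0]], a[:-1]))

def kaboudan_eta(x, predict=persistence, n_shuffles=1000, seed=0):
    """Mean and spread of 1 - SSE(original) / SSE(shuffled) over many shuffles."""
    rng = np.random.default_rng(seed)
    sse = lambda a: float(np.sum((a - predict(a)) ** 2))
    base = sse(np.asarray(x, dtype=float))
    etas = [1.0 - base / sse(rng.permutation(x)) for _ in range(n_shuffles)]
    return float(np.mean(etas)), float(np.std(etas))

t = np.linspace(0, 20 * np.pi, 500)
x = np.sin(t) + 0.1 * np.random.default_rng(1).normal(size=500)
print(kaboudan_eta(x))  # mean near 1: shuffling destroys the smooth structure
```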
[13] Kaboudan, M. A. "A measure of time series' predictability using genetic programming applied to stock returns." Journal of Forecasting 18.5 (1999): 345-357.

SLIDE 20

3. PREDICTABILITY. REGRESSION MODELS

Often in econometrics predictability is quantified with linear regression models [14],

$X_{t+1} = \alpha + \beta X_t + \varepsilon_{t+1}$

that are used to calculate

$R^2 = 1 - \frac{\mathrm{Var}(\alpha + \beta X_t)}{\mathrm{Var}(X_{t+1})}$

ranging from 0 to 1, with the latter representing the most unpredictable series.

Disadvantage: captures only linear relationships unless non-linear regression models are used.

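A quick sketch with a lag-1 OLS fit via numpy's polyfit, computing the variance ratio exactly as defined above:

```python
import numpy as np

def regression_predictability(x):
    """1 - Var(alpha + beta * X_t) / Var(X_{t+1}); 1 = most unpredictable."""
    x = np.asarray(x, dtype=float)
    beta, alpha = np.polyfit(x[:-1], x[1:], deg=1)
    fitted = alpha + beta * x[:-1]
    return 1.0 - float(np.var(fitted) / np.var(x[1:]))

rng = np.random.default_rng(0)
print(round(regression_predictability(np.cumsum(rng.normal(size=1000))), 2))  # random walk: near 0
print(round(regression_predictability(rng.normal(size=1000)), 2))             # white noise: near 1
```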
[14] Campbell, John Y., and Motohiro Yogo. "Efficient tests of stock return predictability." Journal of Financial Economics 81.1 (2006): 27-60.

SLIDE 21

4. PERIODICITY. FOURIER

Numerous approaches to quantifying time series periodicity rely on the Fourier transform

$M_k = \sum_{t=0}^{N-1} X_t\, e^{-i 2\pi k t / N}$

where $M_k$ is the "magnitude" of the $k$-th frequency, quantifying the "relative chance" of a value $X_t$ repeating after the corresponding period of time. To decrease the amount of spurious artifacts, usually a windowed or short-time Fourier transform is applied [15]:

$M_k = \sum_{t=0}^{N-1} X_t\, w(t - \tau)\, e^{-i 2\pi k t / N}$

where $w$ is a window function of effective size $2\tau$. Often Blackman, Hamming, and Bartlett windows are used.

Disadvantage: not linear in period.

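A sketch of windowed period detection with numpy: taper with a Hamming window, take the FFT, and read the dominant period as $N$ divided by the strongest frequency bin (assumes a single dominant period and even sampling):

```python
import numpy as np

def dominant_period(x):
    """Period (in samples) of the strongest non-DC frequency bin."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    windowed = (x - x.mean()) * np.hamming(n)   # remove DC, taper the edges
    magnitudes = np.abs(np.fft.rfft(windowed))
    k = 1 + int(np.argmax(magnitudes[1:]))      # skip the k = 0 (DC) bin
    return n / k

t = np.arange(1000)
x = np.sin(2 * np.pi * t / 24) + 0.3 * np.random.default_rng(0).normal(size=1000)
print(round(dominant_period(x), 1))  # close to 24
```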
[15] Allen, Jont B., and Lawrence R. Rabiner. "A unified approach to short-time Fourier analysis and synthesis." Proceedings of the IEEE 65.11 (1977): 1558-1564.

SLIDE 22

4. PERIODICITY. FOURIER AND AUTOCORRELATION

The linear autocorrelation function can be used to evaluate periodicity due to the Wiener-Khinchin theorem [16][17], which states that

$S_{xx}(k) = \sum_{t=0}^{N-1} r_{xx}(t)\, e^{-i 2\pi k t / N}$

where $S_{xx}(k)$ is the power spectrum of $X$ and $r_{xx}$ is its autocorrelation function. In other words, larger values of the autocorrelation function of $X$ at lag $\tau$ can signify stronger periodicity of the series at lag $\tau$.

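The theorem also gives a fast way to compute the autocorrelation itself: inverse-transform the power spectrum. A sketch with numpy (zero-padding avoids circular wrap-around):

```python
import numpy as np

def autocorr_fft(x):
    """Autocorrelation via Wiener-Khinchin: inverse FFT of the power spectrum."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    spectrum = np.fft.rfft(x, n=2 * n)               # zero-pad against circularity
    acov = np.fft.irfft(np.abs(spectrum) ** 2)[:n] / n
    return acov / acov[0]                            # normalize: lag 0 equals 1

x = np.sin(2 * np.pi * np.arange(480) / 24)          # period-24 signal
r = autocorr_fft(x)
print(round(float(r[24]), 2))  # about 0.95: strong correlation at lag 24
```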
[16] Wiener, Norbert. "Generalized harmonic analysis." Acta Mathematica 55.1 (1930): 117-258.
[17] Cohen, Leon. "The generalization of the Wiener-Khinchin theorem." Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing. Vol. 3. IEEE, 1998.

SLIDE 23

4. PERIODICITY. FISHER TEST

A similar test, based on the notion of periodograms, was proposed by Fisher [18]:

$I(f) = \frac{2}{N} \left( \sum_{t=0}^{N-1} X_t \cos 2\pi f t \right)^2 + \frac{2}{N} \left( \sum_{t=0}^{N-1} X_t \sin 2\pi f t \right)^2, \quad T = \max_f I(f)$

where the frequency $-\tfrac{1}{2} \le f \le \tfrac{1}{2}$. The test assumes that $X = s + a$, with $s$ being the real periodic signal and $a \sim \mathcal{N}(0, \sigma^2)$ the unobserved noise. $T$ is measured to reject $H_0\!:\ s = 0$.

Disadvantage: similar to the original Fourier transform, spurious artifacts; can be improved with window functions.

[18] Fisher, Ronald Aylmer. "Tests of significance in harmonic analysis." Proceedings of the Royal Society of London A 125.796 (1929): 54-59.

SLIDE 24

4. PERIODICITY TRANSFORM

Sethares and Staley [19] proposed a linear-in-period transformation of time series. They defined $X$ to be $p$-periodic if $X_t = X_{t+p}$, and $\mathcal{P}_p$ to be the set of all $p$-periodic sequences. They introduced non-orthogonal basis sequences that are $p$-periodic:

$\delta_p^s(k) = \begin{cases} 1 & \text{if } (k - s) \bmod p = 0 \\ 0 & \text{otherwise} \end{cases}$

where $k$ is the time index of the series $X$ and $s$ is the time shift. The measure of periodicity is the projection

$\pi(X, \mathcal{P}_p) = \sum_{s=0}^{p-1} \alpha_s\, \delta_p^s, \quad \alpha_s = \frac{p}{N} \sum_{t=0}^{N/p - 1} X_{s + tp}$

Disadvantage: the sampling interval of $X$, or its entire length, must correspond to an integer multiple of $p$.

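A sketch of the projection onto $\mathcal{P}_p$: each $\alpha_s$ is the mean of the samples whose index is congruent to $s \bmod p$, and the norm of the projected sequence measures how much $p$-periodic energy $X$ contains (assumes the length is a multiple of $p$; the toy series is mine):

```python
import numpy as np

def project_periodic(x, p):
    """Project x onto the set of p-periodic sequences (requires len(x) % p == 0)."""
    x = np.asarray(x, dtype=float)
    alphas = x.reshape(-1, p).mean(axis=0)  # alpha_s: mean over residue class s mod p
    return np.tile(alphas, len(x) // p)

x = np.tile([5.0, 1.0, 0.0, 1.0], 25) + 0.2 * np.random.default_rng(0).normal(size=100)
x = x - x.mean()
for p in (2, 4, 5):
    print(p, round(float(np.linalg.norm(project_periodic(x, p))), 2))  # p = 4 dominates
```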
[19] Sethares, William A., and Thomas W. Staley. "Periodicity transforms." IEEE Transactions on Signal Processing 47.11 (1999): 2953-2964.

SLIDE 25

5. SIMILARITY AND DEPENDENCE. LINEAR CORRELATION

Linear correlation is a standard and well-known dependence measure. For two time series $X$ and $Y$ it is defined as

$\rho_{X,Y} = \rho(X, Y) = \frac{\mathbb{E}\left[ (X - \mathbb{E}[X])(Y - \mathbb{E}[Y]) \right]}{\sigma_X \sigma_Y}$

Often it is more convenient to standardize the series, $X' = \frac{X - \mathbb{E}[X]}{\sigma_X}$ and $Y' = \frac{Y - \mathbb{E}[Y]}{\sigma_Y}$, so that

$\rho(X, Y) = \mathbb{E}[X' Y']$

Ranging as $-1 \le \rho(X, Y) \le 1$, it captures linear dependence between $X$ and $Y$.

Limitation: does not capture non-linear relationships between $X$ and $Y$.

SLIDE 26

5. SIMILARITY AND DEPENDENCE. RENYI POSTULATES

Rényi [20] considered a general measure of dependence $\delta(X, Y)$ and postulated:

  • $\delta(X, Y)$ is defined for any pair of random variables $X, Y$, neither of them constant with probability 1
  • It is symmetric: $\delta(X, Y) = \delta(Y, X)$
  • $\delta(X, Y) = 0$ if and only if $X$ and $Y$ are independent
  • $0 \le \delta(X, Y) \le 1$
  • $\delta(X, Y) = 1$ only if there is a strict dependence between $X$ and $Y$
  • $\delta(X, Y) = \delta(f(X), g(Y))$ for one-to-one Borel-measurable functions $f$ and $g$
  • If the joint distribution of $X$ and $Y$ is normal, then $\delta(X, Y) = |\rho(X, Y)|$
[20] Rényi, Alfréd. "On measures of dependence." Acta Mathematica Hungarica 10.3-4 (1959): 441-451.

SLIDE 27

5. SIMILARITY AND DEPENDENCE. MAXIMAL CORRELATION COEFFICIENT

The maximal correlation coefficient proposed by Gebelein [21] satisfies all of Rényi's postulates and is defined as

$\rho_{max}(X, Y) = \max_{f, g} \rho(f(X), g(Y))$

where $f: \mathbb{R} \to \mathbb{R}$ and $g: \mathbb{R} \to \mathbb{R}$ are Borel-measurable functions with finite, non-degenerate variances, $0 < \mathrm{Var}\, f(X) < \infty$ and $0 < \mathrm{Var}\, g(Y) < \infty$. $\rho_{max}(X, Y) = 0$ if $X$ and $Y$ are independent; $\rho_{max}(X, Y) = 1$ if either $X = f(Y)$ or $Y = g(X)$.

Limitation: the functions $f^*, g^*$ that maximize $\rho(f(X), g(Y))$ are not guaranteed to have inverses $(f^*)^{-1}, (g^*)^{-1}$.

[21] Gebelein, Hans. "Das statistische Problem der Korrelation als Variations- und Eigenwertproblem und sein Zusammenhang mit der Ausgleichsrechnung." ZAMM - Journal of Applied Mathematics and Mechanics 21.6 (1941): 364-379.

SLIDE 28

5. MAXIMAL VS LINEAR AUTOCORRELATION

CitiBike pickup data has a daily periodicity depicted by the linear autocorrelation function. However, the maximal autocorrelation function demonstrates the presence of non-linear dependencies, since $\rho_{max}(X_1^{N-\tau}, X_\tau^N) > \rho(X_1^{N-\tau}, X_\tau^N)$ for every lag $\tau$.

SLIDE 29

5. SOME PROPERTIES OF MAXIMAL CORRELATION

Dembo et al. [22] demonstrated that for partial sums of i.i.d. random variables the maximal correlation coefficient equals

$\rho_{max}(S_m, S_n) = \sqrt{m/n}, \quad m \le n$

where $S_k = \sum_{i=1}^{k} Y_i$ and $Y_1, \dots, Y_N$ are independent identically distributed random variables with $\mathrm{Var}\, Y_i < \infty$.

Lancaster [23] has shown that if $(X, Y)$ is a bivariate Gaussian vector, the maximal correlation coefficient equals

$\rho_{max}(X, Y) = |\rho(X, Y)|$

[22] Dembo, Amir, Abram Kagan, and Lawrence A. Shepp. "Remarks on the maximum correlation coefficient." Bernoulli 7.2 (2001): 343-350.
[23] Lancaster, Henry Oliver. "Some properties of the bivariate normal distribution considered in the form of a contingency table." Biometrika 44.1/2 (1957): 289-292.

SLIDE 30

5. SOME PROPERTIES OF MAXIMAL CORRELATION

Witsenhausen [24] proposed a way to calculate the maximal correlation coefficient for discrete $X$ and $Y$:

  • First, the ordered sets $\alpha$ and $\beta$ that contain the unique values of $X$ and $Y$ are built.
  • Second, the probabilities $p_{ij} = P(X = \alpha_i, Y = \beta_j)$ are estimated from the contingency table of $X$ and $Y$.
  • Third, the normalized joint-probability matrix $R = (r_{ij})$ is computed,

$r_{ij} = \frac{p_{ij}}{\sqrt{p_{i\cdot}\, p_{\cdot j}}}$

Then the maximal correlation coefficient equals $\rho_{max}(X, Y) = \lambda_2$, where $\lambda_2$ is the second singular value of $R$.

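A sketch of this recipe with numpy: estimate the joint distribution from counts, normalize by the square roots of the marginals, and take the second singular value (the first one is always 1). Function and variable names are mine:

```python
import numpy as np

def maximal_correlation(x, y):
    """SVD estimate of the maximal correlation of two discrete samples."""
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    P = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(P, (xi, yi), 1.0)
    P /= P.sum()                              # joint probabilities p_ij
    px, py = P.sum(axis=1), P.sum(axis=0)     # marginals
    R = P / np.sqrt(np.outer(px, py))
    return float(np.linalg.svd(R, compute_uv=False)[1])

rng = np.random.default_rng(0)
x = rng.integers(-3, 4, size=5000)
print(round(maximal_correlation(x, x ** 2), 2))  # near 1: Y = f(X), though non-linear
```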
[24] Witsenhausen, Hans S. "On sequences of pairs of dependent random variables." SIAM Journal on Applied Mathematics 28.1 (1975): 100-113.

SLIDE 31

5. MONOTONE CORRELATION

The monotone correlation coefficient was proposed by Kimeldorf et al. [25] and is defined as follows: let $\mathcal{F} = \{ f: \mathbb{R} \to \mathbb{R} \mid f \text{ is monotone} \}$; then the monotone correlation is

$\rho_{mon}(X, Y) = \max_{f, g \in \mathcal{F}} \rho(f(X), g(Y))$

Overall, we have the following relationship between the linear, monotone, and maximal correlation coefficients:

$0 \le |\rho(X, Y)| \le \rho_{mon}(X, Y) \le \rho_{max}(X, Y) \le 1$

Limitation: there is no known formula that computes the value of the monotone correlation coefficient; it is a maximization problem.

[25] Kimeldorf, George, and Allan R. Sampson. "Monotone dependence." The Annals of Statistics (1978): 895-903.

SLIDE 32

6. CLUSTERING

We denote by $\mathbf{X}$ a collection of time series, $\mathbf{X} = \{X_1, \dots, X_N\}$, $X_i = \{X_{i,1}, X_{i,2}, \dots, X_{i,M}\}$, $X_{i,j} \in \mathbb{R}$. We define clustering as an unsupervised partitioning of $\mathbf{X}$ into $k$ groups, assigning one label from the set of labels $\{C_1, \dots, C_k\}$ to every time series $X_i$ such that each label $C_j$ is assigned at least once.

SLIDE 33

6. CLUSTERING. NAÏVE APPROACH

Clustering the discrete time series $X_1, X_2, X_3$ into two groups by naïvely applying k-means results in the clusters $\{X_2, X_3\}$ and $\{X_1\}$. However, it seems more reasonable to group $X_1$ and $X_2$ together. Using the autocorrelation functions $R = \{r_{xx}(X_1), r_{xx}(X_2), r_{xx}(X_3)\}$ instead of the actual values groups $X_1$ and $X_2$ in the same cluster, where

$r_{xx}(X) = \{ \rho(X, L^\tau X) \mid 2 \le \tau \le N - 1 \}, \quad L^\tau X_t = X_{t - \tau}$

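A sketch of this idea with numpy and scikit-learn: build an ACF feature vector per series, then run ordinary k-means on those features. The three toy series are mine; two share a period and one differs:

```python
import numpy as np
from sklearn.cluster import KMeans

def acf_features(x, max_lag=30):
    """Autocorrelation of x at lags 1..max_lag as a feature vector."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = float(np.dot(x, x))
    return np.array([np.dot(x[:-t], x[t:]) / denom for t in range(1, max_lag + 1)])

rng = np.random.default_rng(0)
t = np.arange(300)
X1 = np.sin(2 * np.pi * t / 12) + 0.3 * rng.normal(size=300)
X2 = 5 + 3 * np.sin(2 * np.pi * t / 12 + 1.0) + 0.3 * rng.normal(size=300)  # same period, new scale
X3 = np.sin(2 * np.pi * t / 50) + 0.3 * rng.normal(size=300)                # different period

features = np.stack([acf_features(x) for x in (X1, X2, X3)])
print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features))
# X1 and X2 share a label; k-means on raw values would instead split on scale
```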
SLIDE 34

6. TYPES OF CLUSTERING APPROACHES

  • Raw data. Standard clustering techniques applied to raw data with modified distance or dissimilarity measures.
  • Feature generation. New features are generated from raw data and clustered with standard techniques.
  • Model assumption. Clustering based on model parameters and hypothesis testing.
SLIDE 35

6. RAW DATA CLUSTERING

Košmelj and Batagelj [26] modified relocation clustering, introducing a new dissimilarity measure with time-varying weights:

$D(X, Y) = \sum_t \alpha_t\, d_t(X, Y), \quad \alpha_t \ge 0, \quad \sum_t \alpha_t = 1$

Golay et al. [27] proposed cross-correlation distances

$d_{cc}^1(X, Y) = \left( \frac{1 - \rho(X, Y)}{1 + \rho(X, Y)} \right)^{\beta}, \quad d_{cc}^2(X, Y) = \sqrt{2\,(1 - \rho(X, Y))}$

Limitations: most of these approaches do not take sequences of values $X_{t-k}, X_{t-k+1}, \dots, X_t$ into account, but instead compare neighboring values.

[26] Košmelj, Katarina, and Vladimir Batagelj. "Cross-sectional approach for clustering time varying data." Journal of Classification 7.1 (1990): 99-109.
[27] Golay, Xavier, et al. "A new correlation-based fuzzy logic clustering algorithm for fMRI." Magnetic Resonance in Medicine 40.2 (1998): 249-260.

SLIDE 36

6. RAW DATA CLUSTERING

Möller-Levet et al. [28] described the short time series distance

$d_{STS}^2(X, Y) = \sum_k \left( \frac{\delta X_k}{\delta t_k} - \frac{\delta Y_k}{\delta t_k} \right)^2, \quad \delta X_k = X_k - X_{k-1}$

Batista et al. [29] showed that existing distance measures are not efficient for complex time series and proposed a complexity-invariant distance measure (CID),

$CID(X, Y) = \| X - Y \| \cdot \frac{\max(CE(X), CE(Y))}{\min(CE(X), CE(Y))}$

where $CE(X) = \| X_2^N - X_1^{N-1} \|$ estimates the distance between the time series and its lagged counterpart.

Limitations: these measures do not take subsequences into account but compare the series pairwise.

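A sketch of CID with numpy; the complexity estimate CE is the norm of the first differences, and the correction factor penalizes comparing series of very different complexity (assumes neither series is constant, so CE > 0):

```python
import numpy as np

def complexity_estimate(x):
    """CE(X): norm of the first differences of the series."""
    return float(np.linalg.norm(np.diff(x)))

def cid(x, y):
    """Complexity-invariant distance: Euclidean distance times a CE ratio."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    ce_x, ce_y = complexity_estimate(x), complexity_estimate(y)
    return float(np.linalg.norm(x - y)) * max(ce_x, ce_y) / min(ce_x, ce_y)

t = np.linspace(0, 2 * np.pi, 100)
smooth = np.sin(t)
jagged = np.sin(t) + 0.4 * np.random.default_rng(0).normal(size=100)
euclidean = float(np.linalg.norm(smooth - jagged))
print(round(cid(smooth, jagged), 2), ">=", round(euclidean, 2))  # the CE ratio inflates the distance
```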
[28] Möller-Levet, Carla S., et al. "Fuzzy clustering of short time-series and unevenly distributed sampling points." International Symposium on Intelligent Data Analysis. Springer, Berlin, Heidelberg, 2003.
[29] Batista, Gustavo E. A. P. A., Xiaoyue Wang, and Eamonn J. Keogh. "A complexity-invariant distance measure for time series." Proceedings of the 2011 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, 2011.

SLIDE 37

6. FEATURE CLUSTERING

The approaches include clustering of autocorrelation functions, Fourier and wavelet transformations, dimensionality reduction, and other transformations of the raw time series. Vlachos et al. [30] proposed running the k-means clustering algorithm on level $j$ of the Haar wavelet representation of the data, projecting the final centers obtained for level $j$ from a space of dimension $2^j$ to one of dimension $2^{j+1}$ for level $j + 1$, and repeating the previous steps if any time series were swapped between clusters. Fu et al. [31] proposed time series smoothing and dimensionality reduction prior to clustering with self-organizing maps: only the Perceptually Important Points (PIP) that match predefined query points are kept, with the match measured by the mean squared difference between them.

Limitations: the approaches are very specific and require data engineering.

[30] Vlachos, Michail, et al. "A wavelet-based anytime algorithm for k-means clustering of time series." Proc. Workshop on Clustering High Dimensionality Data and Its Applications, 2003.
[31] Fu, Tak-chung, et al. "Pattern discovery from stock time series using self-organizing maps." Workshop Notes of KDD2001 Workshop on Temporal Data Mining, 2001.

SLIDE 38

6. MODEL-BASED CLUSTERING

The main assumption of these approaches is that there exists an underlying generating process that can be modeled with certain models (e.g. ARMA and its modifications). Maharaj [32] proposed a clustering approach based on a $\chi^2$ test statistic. They assumed that time series $X$ and $Y$ are generated by autoregressive processes of order $k$, AR($k$), with parameters $\theta^X = \{\theta_1^X, \dots, \theta_k^X\}$ and $\theta^Y = \{\theta_1^Y, \dots, \theta_k^Y\}$. Setting the null hypothesis $H_0\!:\ \theta^X = \theta^Y$, they cluster $X$ and $Y$ together if the $p$-value is greater than a predefined threshold.

Disadvantage: the main weakness of the approach is the simplicity and linearity of the AR($k$) model. In general it might be hard to select the appropriate model.

[32] Maharaj, Elizabeth Ann. "Cluster of time series." Journal of Classification 17.2 (2000): 297-314.

SLIDE 39

7. FORECASTING. LEMPEL-ZIV AND MARKOV CHAINS

The LZW algorithm partitions a time series $X$ into a collection of subsequences $S_0, S_1, \dots, S_M$, where $S_k$ starts at time $k$ and represents the shortest previously unobserved subsequence. The prediction is made as

$P(X_{t+1} = \beta \mid X_1^t) = \frac{C(S_k \beta \mid X_1^t)}{C(S_k \mid X_1^t)}$

where $C(S_k \beta \mid X_1^t)$ stands for the number of times $S_k$ is followed by $\beta$.

A Markov chain predictor assumes stationarity,

$P(X_{t+1} = \beta \mid X_1^t) = P(X_{t+k+1} = \beta \mid X_{k+1}^{t+k})$

Denoting the repeating subsequence (the context) $X_{t-k+1}^{t} = \{S_1, \dots, S_k\} = S$, we predict

$P(X_{t+1} = \beta \mid X_1^t) = \frac{C(S \beta \mid X_1^t)}{C(S \mid X_1^t)}$

Limitations: LZW and Markov chain-based predictors treat numeric values as symbols and cannot capture the actual magnitudes of the values $X_t$ of the time series.

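A sketch of the order-$k$ Markov predictor: count, over the history, what follows each occurrence of the length-$k$ context, then predict the most frequent successor (symbols only, exactly as the limitation notes):

```python
from collections import Counter, defaultdict

def markov_predict(seq, k=2):
    """Most likely next symbol given the last k symbols, with its probability."""
    followers = defaultdict(Counter)
    for t in range(len(seq) - k):
        followers[tuple(seq[t:t + k])][seq[t + k]] += 1
    counts = followers.get(tuple(seq[-k:]))
    if not counts:                        # context never seen: no prediction
        return None, 0.0
    symbol, c = counts.most_common(1)[0]
    return symbol, c / sum(counts.values())

seq = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2]
print(markov_predict(seq, k=2))           # (3, 1.0): context (1, 2) is always followed by 3
```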
SLIDE 40

7. FORECASTING. MACHINE LEARNING. SVM AND NN

We partition a time series $X$ into overlapping subseries of size $p$, creating feature vectors $\mathbf{x} \in \mathbb{R}^{(p-1) \times (N-p+1)}$ and a label vector $\mathbf{y} \in \mathbb{R}^{N-p+1}$:

$\mathbf{x} = \{X_1^{p-1}, X_2^{p}, \dots, X_{N-p+1}^{N-1}\}, \quad \mathbf{y} = X_p^N$

Standard models are fit on $\mathbf{x}$ and $\mathbf{y}$. Support Vector Machines (SVM) [33] minimize

$\frac{1}{N-p+1} \sum_{i=1}^{N-p+1} \max\left(0,\ 1 - y_i (w^T x_i - b)\right) + \lambda \|w\|^2$

Basic neural networks (NN) fit the model [34]

$f(x_i) = w_2^T\, \sigma(w_1^T x_i + b_1) + b_2$

where $\sigma$ is the activation function.

Limitations: this machine learning approach does not take into account the order within each vector $x_i$.

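A sketch of the windowing step with a scikit-learn support vector regressor standing in for the SVM above (SVR is the regression variant, which suits forecasting real values; the toy series is mine):

```python
import numpy as np
from sklearn.svm import SVR

def make_windows(x, p):
    """Overlapping subseries: features are p-1 consecutive values, label is the p-th."""
    X = np.array([x[i:i + p - 1] for i in range(len(x) - p + 1)])
    y = np.array([x[i + p - 1] for i in range(len(x) - p + 1)])
    return X, y

series = np.sin(2 * np.pi * np.arange(500) / 25)
X, y = make_windows(series, p=10)
model = SVR(kernel="rbf").fit(X[:-1], y[:-1])     # hold out the last window
print(round(float(model.predict(X[-1:])[0]), 2), "vs actual", round(float(y[-1]), 2))
```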
[33] Wu, Qiang, and Ding-Xuan Zhou. "SVM soft margin classifiers: linear programming versus quadratic programming." Neural Computation 17.5 (2005): 1160-1187.

SLIDE 41

7. FORECASTING. ARIMA MODELS

ARIMA models [34] seem to be the most natural predictors for time series: they combine autoregression and an integrated moving average. ARIMA is defined as

$\left(1 - \sum_{i=1}^{p} \phi_i L^i\right) (1 - L)^d X_t = \delta + \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \varepsilon_t$

where $\theta$ and $\phi$ are the moving average and autoregression parameters, $p$ and $q$ define the lags of the autoregression and the moving average, $d$ defines the integration, $\varepsilon_t$ stands for the normally distributed error, $L^k X_t = X_{t-k}$, and $\delta$ regulates the drift of the model. The parameters are often selected with the Akaike Information Criterion (AIC) [35]:

$\mathrm{AIC}(p, d, q) = -2 \log \mathcal{L} + 2(p + q + k + 1)$

where $\mathcal{L}$ is the maximized likelihood and $k = 1$ if there is a constant term.

Limitations: does not capture non-linear relations within the time series.

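A sketch with statsmodels, whose ARIMA class implements this family; order=(p, d, q) and the trend="c" constant correspond to the parameters named above:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
x = np.zeros(500)
for t in range(1, 500):                 # simulate an AR(1): X_t = 0.7 X_{t-1} + eps
    x[t] = 0.7 * x[t - 1] + rng.normal()

fit = ARIMA(x, order=(1, 0, 0), trend="c").fit()
print(fit.params)                       # constant and an AR coefficient near 0.7
print(fit.forecast(steps=5))            # 5-step-ahead forecast
print(round(fit.aic, 1))                # the AIC used for order selection
```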
[34] Box, George E. P., et al. Time Series Analysis: Forecasting and Control. John Wiley & Sons, 2015.
[35] Bozdogan, Hamparsum. "Model selection and Akaike's information criterion (AIC): The general theory and its analytical extensions." Psychometrika 52.3 (1987): 345-370.

SLIDE 42

7. FORECASTING. RNN

Recurrent Neural Network (RNN) models [36] are among the first to capture temporal dependencies within the subsequences of a time series. A fully interconnected model:

$\hat{X}_t = \sum_{i=1}^{I} w_i\, h_i(t), \quad h_i(t) = f\left( \sum_{j=1}^{\max(p,q)} \tilde{w}_{ij} X_{t-j} + \sum_{l=1}^{q} \sum_{m=1}^{I} \tilde{w}_{iml}\, h_m(t - l) + \theta_i \right)$

Limitations: the models are too complex, and it is impractical to store long subsequences.

[36] Connor, Jerome T., R. Douglas Martin, and Les E. Atlas. "Recurrent neural networks and robust time series prediction." IEEE Transactions on Neural Networks 5.2 (1994): 240-254.

SLIDE 43

7. FORECASTING. LSTM

Long Short-Term Memory (LSTM) networks [37] introduced neurons with internal structure. An LSTM cell contains three kinds of gates: input, output, and forget. Two input gates:

$i_1 = \tanh\left( b_{i_1} + X_t W_{i_1}^1 + h_{t-1} W_{i_1}^2 \right), \quad i_2 = \sigma\left( b_{i_2} + X_t W_{i_2}^1 + h_{t-1} W_{i_2}^2 \right)$

One forget gate:

$f = \sigma\left( b_f + X_t W_f^1 + h_{t-1} W_f^2 \right)$

And one output gate:

$o = \sigma\left( b_o + X_t W_o^1 + h_{t-1} W_o^2 \right)$

Overall, the model is the combination

$h_t = \tanh\left( f + i_1 \otimes i_2 \right) \otimes o$

[37] Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural Computation 9.8 (1997): 1735-1780.

SLIDE 44

SUMMARY

1. Combinatorial estimators of complexity and entropy, as well as predictors, including the Lempel-Ziv approximator, permutation entropy, Markov chains, and Lempel-Ziv predictors, treat a time series as a sequence of symbols and do not take the actual values into account.
2. Methods quantifying predictability rely on entropy or assumed models.
3. Dependency measures efficient at capturing non-linear patterns use transformations that are not guaranteed to have inverses.
4. The majority of the distance measures proposed for time series clustering compare values pairwise and do not take subsequences into account.
5. Machine learning approaches to clustering and time series forecasting do not take the order within features into account.
SLIDE 45

FUTURE DIRECTIONS

1. Creating new non-linear dependency measures with transformations that are guaranteed to have inverses (developing approximations for the maximal and monotone correlations).
2. Generalizing combinatorial methods for complexity and entropy estimation so that they take the magnitudes of time series into account.
3. Combining dependency measures with neural networks. Is there a link, and can one improve the other?
SLIDE 46

THANK YOU! QUESTIONS?