Data Mining Techniques
CS 6220 - Section 3 - Fall 2016
Lecture 18: Time Series
Jan-Willem van de Meent (credit: Aggarwal Chapter 14.3)
Time Series Data
source: http://www.capitalhubs.com/2012/08/the-correlation-between-apple-product.html
Goal: predict whether events will occur (and often when such events will occur)

Trends (here: seasonal effects + growth)
source: https://am241.wordpress.com/tag/time-series/
Moving-average smoothing (window size k):
y′_i = (1/k) Σ_{n=0}^{k−1} y_{i−n}
[Figure: IBM stock price vs. number of trading days. (a) Moving average smoothing: actual values with 20-day and 50-day moving averages. (b) Exponential smoothing.]
Exponential smoothing (smoothing factor α):
y′_i = α · y_i + (1 − α) · y′_{i−1}
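A minimal NumPy sketch of both smoothers; the random-walk series stands in for the IBM prices in the figure, and the window size and α are illustrative choices:

```python
import numpy as np

def moving_average(y, k):
    """Trailing moving average: y'_i = (1/k) * sum_{n=0}^{k-1} y_{i-n}.
    Element j of the output is the mean of y[j:j+k]."""
    return np.convolve(y, np.ones(k) / k, mode="valid")

def exponential_smoothing(y, alpha):
    """y'_i = alpha * y_i + (1 - alpha) * y'_{i-1}, with y'_0 = y_0."""
    out = np.empty(len(y))
    out[0] = y[0]
    for i in range(1, len(y)):
        out[i] = alpha * y[i] + (1 - alpha) * out[i - 1]
    return out

prices = 180 + np.cumsum(np.random.randn(250))  # stand-in for IBM prices
print(moving_average(prices, k=20)[:5])
print(exponential_smoothing(prices, alpha=0.1)[:5])
```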
Differencing: y_t − y_{t−1}
Log differencing: log y_t − log y_{t−1}
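For example, with NumPy (the price values are made up). Log differences are approximately relative changes, which puts small and large price levels on a comparable scale:

```python
import numpy as np

prices = np.array([10.0, 12.0, 15.0, 14.0, 20.0, 26.0, 30.0])

diff = np.diff(prices)              # y_t - y_{t-1}
log_diff = np.diff(np.log(prices))  # log y_t - log y_{t-1}
print(diff, log_diff)
```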
[Figure: price value vs. time index. (a) Unscaled series: original and differenced series. (b) Logarithmic scaling: original (log) and differenced (log) series.]

Definition 14.3.1 (Strictly Stationary Time Series): A strictly stationary time series is one in which the probabilistic distribution of the values in any time interval [a, b] is identical to that in the shifted interval [a + h, b + h] for any value of the time shift h.
y_t = c + ε_t, where E[ε_t] = 0
[Figure: autocorrelation vs. lag. (a) IBM stock price. (b) Sine wave (lag in degrees).]
Autocorrelation(L) = Covariance_t(y_t, y_{t+L}) / Variance_t(y_t)
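A direct NumPy implementation of this estimator, normalizing by the full-series variance; the sine wave mirrors the figure's second panel:

```python
import numpy as np

def autocorrelation(y, lag):
    """Autocorrelation(L) = Cov_t(y_t, y_{t+L}) / Var_t(y_t)."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    return np.dot(y[:-lag], y[lag:]) / np.dot(y, y)

t = np.arange(10_000)
wave = np.sin(2 * np.pi * t / 100)  # period of 100 steps
print(autocorrelation(wave, 100))   # ~ +1: lag of one full period
print(autocorrelation(wave, 50))    # ~ -1: lag of half a period
```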
Autoregressive: AR(p)
y_t = Σ_{i=1}^{p} a_i · y_{t−i} + c + ε_t

Moving-average: MA(q)
y_t = Σ_{i=1}^{q} b_i · ε_{t−i} + c + ε_t

Autoregressive moving-average: ARMA(p,q)
y_t = Σ_{i=1}^{p} a_i · y_{t−i} + Σ_{i=1}^{q} b_i · ε_{t−i} + c + ε_t

Autoregressive integrated moving-average: ARIMA(p,d,q)
Fit ARMA(p,q) to the d-times differenced series y^(d):
y^(d)_t = Σ_{i=1}^{p} a_i · y^(d)_{t−i} + Σ_{i=1}^{q} b_i · ε_{t−i} + c + ε_t

Do least-squares regression to estimate a, b, c (see the sketch below).
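Least squares is straightforward for the AR terms, since the regressors y_{t−i} are observed; the MA terms involve the unobserved noise values ε_{t−i} and are usually estimated by maximum likelihood instead. A minimal NumPy sketch for AR(p), with simulated data and illustrative coefficients:

```python
import numpy as np

def fit_ar(y, p):
    """Least-squares estimate of AR(p): y_t = sum_i a_i y_{t-i} + c + eps_t."""
    y = np.asarray(y, dtype=float)
    # Design matrix: the row for time t holds [y_{t-1}, ..., y_{t-p}, 1]
    X = np.column_stack([y[p - i:len(y) - i] for i in range(1, p + 1)]
                        + [np.ones(len(y) - p)])
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef[:-1], coef[-1]  # (a_1, ..., a_p), c

# Simulate an AR(2) process and recover its coefficients
rng = np.random.default_rng(0)
y = np.zeros(5000)
for t in range(2, len(y)):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + 1.0 + rng.normal()
print(fit_ar(y, p=2))  # approximately ([0.6, -0.3], 1.0)
```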
source: http://www.statsref.com/HTML/index.html?arima.html
(p,d,q) = (0,1,12)
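An order of (p, d, q) = (0, 1, 12) means: difference the series once, then fit an MA(12) model. In practice the fitting is usually delegated to a library; a sketch assuming the statsmodels package, with a random-walk placeholder series:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

y = np.random.randn(300).cumsum()  # placeholder series (random walk)

# order=(p, d, q): difference d=1 time, no AR terms, q=12 MA terms
result = ARIMA(y, order=(0, 1, 12)).fit()
print(result.summary())
print(result.forecast(steps=10))  # forecast the next 10 values
```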
[Figure panels: time series, histogram, mixture, posterior on states]
Estimate from GMM: each observation is assigned to a state independently.
Estimate from HMM: uses temporal structure (consecutive observations are more likely to be in the same state than in different states).
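A sketch of this comparison using scikit-learn and the third-party hmmlearn package; the two-regime series and all parameter settings are illustrative:

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from hmmlearn.hmm import GaussianHMM

# Synthetic series that dwells in two regimes (means 0 and 5)
rng = np.random.default_rng(0)
states = np.repeat([0, 1, 0, 1], 100)
x = np.where(states == 0, 0.0, 5.0) + rng.normal(size=400)
X = x.reshape(-1, 1)  # both libraries expect (n_samples, n_features)

gmm = GaussianMixture(n_components=2).fit(X)
hmm = GaussianHMM(n_components=2).fit(X)

# The GMM labels each point from its value alone; the HMM also uses the
# neighboring points, so it smooths over isolated outliers. (State labels
# are arbitrary in both models and may be permuted between runs.)
print(gmm.predict(X)[:20])
print(hmm.predict(X)[:20])
```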
[Figure: random-surfer transition diagram over three pages y, a, m, with edge probabilities y/2, a/2, m]

Model for random surfer (adapted from: Mining of Massive Datasets, http://www.mmds.org)
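A NumPy sketch of this chain, assuming the standard MMDS three-page example (y links to y and a, a links to y and m, m links to a); power iteration converges to the surfer's stationary distribution:

```python
import numpy as np

# Column-stochastic transition matrix: M[i, j] = p(next = i | current = j)
#              y    a    m
M = np.array([[0.5, 0.5, 0.0],   # y
              [0.5, 0.0, 1.0],   # a
              [0.0, 0.5, 0.0]])  # m

# Power iteration: repeatedly apply M to an initial distribution
r = np.ones(3) / 3
for _ in range(100):
    r = M @ r
print(r)  # stationary distribution, approximately [0.4, 0.4, 0.2]
```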
Gaussian Mixture:
z_n ∼ Discrete(π)
x_n | z_n = k ∼ Normal(µ_k, σ_k)

Gaussian HMM:
z_1 ∼ Discrete(π)
z_{t+1} | z_t = k ∼ Discrete(A_k)
x_t | z_t = k ∼ Normal(µ_k, σ_k)

A = M^⊤ (rows of A are the columns of the column-stochastic surfer matrix M)
Expectation Maximization (Gaussian mixture)

E-step (responsibilities):
γ^i_{tk} = p(z_t = k | x_t, θ^{i−1}) = p(x_t, z_t = k | θ^{i−1}) / Σ_l p(x_t, z_t = l | θ^{i−1})

M-step (weighted counts and parameter updates):
N^i_k = Σ_{t=1}^{T} γ^i_{tk}
µ^i_k = (1/N^i_k) Σ_{t=1}^{T} γ^i_{tk} x_t
σ^i_k = ( (1/N^i_k) Σ_{t=1}^{T} γ^i_{tk} (x_t − µ^i_k)² )^{1/2}
π^i_k = N^i_k / N
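A minimal NumPy/SciPy sketch of these updates for a one-dimensional mixture; the initialization and fixed iteration count are simplistic choices for illustration:

```python
import numpy as np
from scipy.stats import norm

def em_gmm(x, K, iters=50):
    """EM for a 1-D Gaussian mixture, following the updates above."""
    T = len(x)
    rng = np.random.default_rng(0)
    mu = rng.choice(x, size=K, replace=False)  # crude initialization
    sigma = np.full(K, x.std())
    pi = np.full(K, 1.0 / K)
    for _ in range(iters):
        # E-step: gamma[t, k] = p(z_t = k | x_t, theta)
        joint = pi * norm.pdf(x[:, None], mu, sigma)  # p(x_t, z_t = k)
        gamma = joint / joint.sum(axis=1, keepdims=True)
        # M-step: weighted counts, then means, scales, and weights
        N_k = gamma.sum(axis=0)
        mu = (gamma * x[:, None]).sum(axis=0) / N_k
        sigma = np.sqrt((gamma * (x[:, None] - mu) ** 2).sum(axis=0) / N_k)
        pi = N_k / T
    return pi, mu, sigma

x = np.concatenate([np.random.normal(0, 1, 300), np.random.normal(5, 1, 200)])
print(em_gmm(x, K=2))  # weights ~ (0.6, 0.4), means ~ (0, 5)
```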
Expectation step for HMM
Model: z_1 ∼ Discrete(π), z_{t+1} | z_t = k ∼ Discrete(A_k), x_t | z_t = k ∼ Normal(µ_k, σ_k)

Forward recursion:
α_{t,l} := p(x_{1:t}, z_t = l) = p(x_t | µ_l, σ_l) Σ_k A_{kl} α_{t−1,k}

Backward recursion:
β_{t,k} := p(x_{t+1:T} | z_t = k) = Σ_l β_{t+1,l} p(x_{t+1} | µ_l, σ_l) A_{kl}

Posterior marginals:
γ_{t,k} = p(z_t = k | x_{1:T}, θ) = p(x_{1:t}, z_t = k) p(x_{t+1:T} | z_t = k) / p(x_{1:T}) ∝ α_{t,k} β_{t,k}
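A NumPy/SciPy sketch of these recursions. It is unnormalized, so long sequences would underflow (real implementations rescale each step or work in log space); the example parameters are hypothetical:

```python
import numpy as np
from scipy.stats import norm

def forward_backward(x, pi, A, mu, sigma):
    """Posterior state marginals gamma[t, k] = p(z_t = k | x_{1:T})."""
    T, K = len(x), len(pi)
    like = norm.pdf(x[:, None], mu, sigma)  # like[t, l] = p(x_t | z_t = l)

    alpha = np.zeros((T, K))  # alpha[t, l] = p(x_{1:t}, z_t = l)
    alpha[0] = pi * like[0]
    for t in range(1, T):
        alpha[t] = like[t] * (alpha[t - 1] @ A)  # sum_k alpha[t-1,k] A[k,l]

    beta = np.zeros((T, K))  # beta[t, k] = p(x_{t+1:T} | z_t = k)
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (like[t + 1] * beta[t + 1])  # sum_l A[k,l] (...)

    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

# Hypothetical two-state example: sticky transitions, well-separated means
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
mu, sigma = np.array([0.0, 5.0]), np.array([1.0, 1.0])
x = np.array([0.1, -0.3, 4.8, 5.2, 0.2])
print(forward_backward(x, pi, A, mu, sigma))
```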
Handwritten Digits
RNA splicing