Data Mining Techniques CS 6220 - Section 3 - Fall 2016 Lecture 18: Time Series




SLIDE 1

Data Mining Techniques

CS 6220 - Section 3 - Fall 2016

Lecture 18: Time Series

Jan-Willem van de Meent (credit: Aggarwal Chapter 14.3)

SLIDE 2

Time Series Data

http://www.capitalhubs.com/2012/08/the-correlation-between-apple-product.html

SLIDE 3

Time Series Data

SLIDE 4

Time Series Data

  • Time series forecasting is fundamentally hard
  • Rare events often play a big role in changing trends
  • Impossible to know how such events will affect trends


(and often when such events will occur)

SLIDE 5

Time Series Data

  • In some cases there are clear trends


(here: seasonal effects + growth)

source: https://am241.wordpress.com/tag/time-series/

SLIDE 6

Autoregressive Models

SLIDE 7

Time Series Smoothing

Moving average: y′_i = (1/k) Σ_{n=0..k−1} y_{i−n}

Exponential smoothing: y′_i = α · y_i + (1 − α) · y′_{i−1}

[Figure: IBM stock price over 250 trading days; (a) moving-average smoothing with 20-day and 50-day windows, (b) exponential smoothing with α = 0.1 and α = 0.05]
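Both smoothers are easy to state in code. A minimal NumPy sketch (the function names are our own, not from the lecture):

```python
import numpy as np

def moving_average(y, k):
    """Moving-average smoothing: y'_i is the mean of the last k values."""
    y = np.asarray(y, dtype=float)
    return np.array([y[max(0, i - k + 1):i + 1].mean() for i in range(len(y))])

def exponential_smoothing(y, alpha):
    """Exponential smoothing: y'_i = alpha * y_i + (1 - alpha) * y'_{i-1}."""
    y = np.asarray(y, dtype=float)
    out = np.empty_like(y)
    out[0] = y[0]
    for i in range(1, len(y)):
        out[i] = alpha * y[i] + (1 - alpha) * out[i - 1]
    return out
```

A smaller α weights the history more heavily, which is why the α = 0.05 curve in the figure is smoother than the α = 0.1 curve.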

SLIDE 8

Stationary Time Series

Differencing: y′_t = y_t − y_{t−1}

Log differencing: y′_t = log y_t − log y_{t−1}

[Figure: (a) unscaled price series and its differenced series; (b) logarithm of the price series and its differenced series]

Definition 14.3.1 (Strictly Stationary Time Series): A strictly stationary time series is one in which the probabilistic distribution of the values in any time interval [a, b] is identical to that in the shifted interval [a + h, b + h] for any value of the time shift h.

y_t = c + ε_t, where E[ε_t] = 0
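The two transformations on this slide can be sketched directly with NumPy (a minimal illustration; the helper names are our own):

```python
import numpy as np

def difference(y):
    # First-order differencing y_t - y_{t-1}: removes a linear trend,
    # which often makes the series approximately stationary.
    return np.diff(np.asarray(y, dtype=float))

def log_difference(y):
    # Log differencing log(y_t) - log(y_{t-1}): roughly the relative
    # change per step, useful for series with multiplicative growth.
    return np.diff(np.log(np.asarray(y, dtype=float)))
```

For a geometrically growing series, log differencing yields a constant, i.e. a stationary series.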

SLIDE 9

Auto-correlation

[Figure: autocorrelation as a function of lag for the IBM stock price (left) and for a sine wave, with lag in degrees (right)]

Autocorrelation(L) = Covariance_t(y_t, y_{t+L}) / Variance_t(y_t)
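This formula is a one-liner in practice. A minimal sketch (our own helper, assuming a mean-centered sample estimate of the covariance):

```python
import numpy as np

def autocorrelation(y, lag):
    """Sample autocorrelation: Cov(y_t, y_{t+L}) / Var(y_t)."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    if lag == 0:
        return 1.0
    # Overlap the series with a shifted copy of itself.
    return float(np.dot(y[:-lag], y[lag:]) / np.dot(y, y))
```

For a sine wave this reproduces the periodic pattern in the right panel: near +1 at a full period of lag, near −1 at half a period.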

SLIDE 10

Autoregressive Models

Autoregressive: AR(p)
  y_t = Σ_{i=1..p} a_i · y_{t−i} + c + ε_t

Moving-average: MA(q)
  y_t = Σ_{i=1..q} b_i · ε_{t−i} + c + ε_t

Autoregressive moving-average: ARMA(p,q)
  y_t = Σ_{i=1..p} a_i · y_{t−i} + Σ_{i=1..q} b_i · ε_{t−i} + c + ε_t

Autoregressive integrated moving-average: ARIMA(p,d,q)
  y^(d)_t = Σ_{i=1..p} a_i · y^(d)_{t−i} + Σ_{i=1..q} b_i · ε_{t−i} + c + ε_t
  (where y^(d) is the series after differencing d times)

Do least-squares regression to estimate a, b, c.
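The least-squares fit is easiest to see for the pure AR(p) case, where the regressors are just the p lagged values. A minimal sketch (our own helper, not code from the lecture):

```python
import numpy as np

def fit_ar(y, p):
    """Fit AR(p): y_t = sum_i a_i * y_{t-i} + c, by least squares."""
    y = np.asarray(y, dtype=float)
    # Design matrix: row for time t holds y_{t-1}, ..., y_{t-p} and a constant 1.
    X = np.column_stack(
        [y[p - i - 1:len(y) - i - 1] for i in range(p)] + [np.ones(len(y) - p)]
    )
    target = y[p:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef[:p], coef[p]  # (a_1..a_p, c)
```

Fitting an MA or ARMA model is harder, since the past noise terms ε_{t−i} are not observed directly.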

SLIDE 11

ARIMA on Airline Data

source: http://www.statsref.com/HTML/index.html?arima.html

(p,d,q) = (0,1,12)

SLIDE 12

Hidden Markov Models

SLIDE 13

Time Series with Distinct States

SLIDE 14

Can we use a Gaussian Mixture Model?

[Figure: time series, histogram of values, fitted mixture, and posterior on states]

SLIDE 15

[Figure: time series, histogram of values, fitted mixture, and posterior on states]

Can we use a Gaussian Mixture Model?

SLIDE 16

Hidden Markov Models

[Figure: state estimates from a GMM vs. from an HMM]

  • Idea: Mixture model + Markov chain for states
  • Can model correlation between subsequent states


(more likely to be in same state than different state)


SLIDE 17

Reminder: Random Surfers in PageRank

[Figure: three-page web graph (y, a, m) with transition probabilities]

(adapted from: Mining of Massive Datasets, http://www.mmds.org)

Model for random Surfer:

  • At time t = 0 pick a page at random
  • At each subsequent time t, follow an outgoing link at random
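As a small worked example (a hypothetical three-page graph, not necessarily the exact one in the figure), the surfer's page distribution can be iterated to its stationary value:

```python
import numpy as np

# Hypothetical three-page web graph: y links to {y, a}, a links to {y, m},
# m links to {a}. Column j holds the out-link probabilities of page j.
M = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 1.0],
              [0.0, 0.5, 0.0]])

def surf(M, steps=100):
    """Random surfer: start from a uniform distribution over pages,
    then repeatedly follow an outgoing link chosen at random."""
    r = np.full(M.shape[0], 1.0 / M.shape[0])
    for _ in range(steps):
        r = M @ r
    return r
```

The limiting distribution is the PageRank vector of this graph; the same Markov-chain machinery reappears in the HMM transition model below.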
SLIDE 18

Reminder: Random Surfers in PageRank

[Figure: three-page web graph (y, a, m) with transition probabilities]

(adapted from: Mining of Massive Datasets, http://www.mmds.org)

SLIDE 19

Hidden Markov Models

Gaussian Mixture:
  z_n ∼ Discrete(π)
  x_n | z_n = k ∼ Normal(μ_k, σ_k)

Gaussian HMM:
  z_1 ∼ Discrete(π)
  z_{t+1} | z_t = k ∼ Discrete(A_k)
  x_t | z_t = k ∼ Normal(μ_k, σ_k)

A = M⊤ (the transition matrix is the transpose of the random-surfer link matrix M)
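The generative process above can be sampled directly. A minimal sketch using NumPy's random generator (the function name is our own):

```python
import numpy as np

def sample_hmm(pi, A, mu, sigma, T, rng):
    """Sample from a Gaussian HMM: z_1 ~ Discrete(pi),
    z_{t+1} | z_t = k ~ Discrete(A[k]), x_t | z_t = k ~ Normal(mu_k, sigma_k)."""
    K = len(pi)
    z = np.empty(T, dtype=int)
    x = np.empty(T)
    for t in range(T):
        # First state from pi, subsequent states from the row of A
        # belonging to the previous state.
        z[t] = rng.choice(K, p=pi) if t == 0 else rng.choice(K, p=A[z[t - 1]])
        x[t] = rng.normal(mu[z[t]], sigma[z[t]])
    return z, x
```

With a "sticky" transition matrix (large diagonal entries) the sampled series shows exactly the long same-state runs that motivate using an HMM instead of a plain mixture.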

SLIDE 20

Review: Gaussian Mixtures

z_n ∼ Discrete(π)
x_n | z_n = k ∼ Normal(μ_k, σ_k)

Expectation Maximization:

  • 1. Update cluster probabilities
    γ^i_{tk} = p(z_t = k | x_t, θ^{i−1}) = p(x_t, z_t = k | θ^{i−1}) / Σ_l p(x_t, z_t = l | θ^{i−1})

  • 2. Update parameters
    N^i_k = Σ_{t=1..T} γ^i_{tk}
    μ^i_k = (1/N^i_k) Σ_{t=1..T} γ^i_{tk} x_t
    σ^i_k = ( (1/N^i_k) Σ_{t=1..T} γ^i_{tk} (x_t − μ^i_k)² )^{1/2}
    π^i_k = N^i_k / N
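The two update steps translate almost line-for-line into NumPy. A minimal sketch for the 1-D case (our own helper names, not code from the lecture):

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def em_step(x, pi, mu, sigma):
    """One EM iteration for a 1-D Gaussian mixture (E-step then M-step)."""
    # E-step: responsibilities gamma[t, k] = p(z_t = k | x_t, theta).
    gamma = pi[None, :] * normal_pdf(x[:, None], mu[None, :], sigma[None, :])
    gamma /= gamma.sum(axis=1, keepdims=True)
    # M-step: re-estimate N_k, mu_k, sigma_k, pi_k from the responsibilities.
    N = gamma.sum(axis=0)
    mu = (gamma * x[:, None]).sum(axis=0) / N
    sigma = np.sqrt((gamma * (x[:, None] - mu[None, :]) ** 2).sum(axis=0) / N)
    pi = N / len(x)
    return pi, mu, sigma
```

Iterating `em_step` from a reasonable initialization converges to a local maximum of the likelihood.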

SLIDE 21

Forward-backward Algorithm

Expectation step for HMM

z1 ∼ Discrete(π) zt+1|zt = k ∼ Discrete(Ak) xt|zt = k ∼ Normal(µk, σk)

α_{t,l} := p(x_{1:t}, z_t = l) = p(x_t | μ_l, σ_l) Σ_k A_{kl} α_{t−1,k}

β_{t,k} := p(x_{t+1:T} | z_t = k) = Σ_l β_{t+1,l} p(x_{t+1} | μ_l, σ_l) A_{kl}

γ_{t,k} = p(z_t = k | x_{1:T}, θ) = p(x_{1:t}, z_t = k) p(x_{t+1:T} | z_t = k) / p(x_{1:T}) ∝ α_{t,k} β_{t,k}
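The forward and backward recursions can be sketched as two short loops. A minimal 1-D version (our own helper; the messages are renormalized at each step for numerical stability, which leaves γ unchanged since it is defined only up to normalization):

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def forward_backward(x, pi, A, mu, sigma):
    """E-step for a Gaussian HMM: returns gamma[t, k] = p(z_t = k | x_{1:T})."""
    T, K = len(x), len(pi)
    obs = normal_pdf(x[:, None], mu[None, :], sigma[None, :])  # obs[t, l]
    alpha = np.empty((T, K))
    beta = np.empty((T, K))
    alpha[0] = pi * obs[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = obs[t] * (alpha[t - 1] @ A)   # sum_k alpha_{t-1,k} A_{kl}
        alpha[t] /= alpha[t].sum()
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (obs[t + 1] * beta[t + 1])  # sum_l A_{kl} p(x_{t+1}|l) beta_{t+1,l}
        beta[t] /= beta[t].sum()
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)
```

Both passes cost O(TK²), so the full posterior over states is cheap even for long series.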

SLIDE 22

Other Examples for HMMs

Handwritten Digits

  • State 1: Sweeping arc
  • State 2: Horizontal line


RNA splicing

  • State 1: Exon (relevant)
  • State 2: Splice site
  • State 3: Intron (ignored)