Time-Series Data: Numerical data obtained at regular time intervals - PowerPoint PPT Presentation



SLIDE 1

SLIDE 2

Time-Series Data

 Numerical data obtained at regular time intervals
 The time intervals can be annual, quarterly, daily, hourly, etc.
 Example:

Year:  1999  2000  2001  2002  2003
Sales: 75.3  74.2  78.5  79.7  80.2

SLIDE 3

Time-Series Plot

A time-series plot is a two-dimensional plot of time-series data:
 the vertical axis measures the variable of interest
 the horizontal axis corresponds to the time periods

[Figure: U.S. Inflation Rate (%) by year, 1975-2001]

SLIDE 4

Time Series Example

SLIDE 5

Mining Time Series Data

 Prediction and Forecasting
 Similarity Search

SLIDE 6

The Importance of Forecasting

 Governments forecast unemployment, interest rates, and expected revenues from income taxes for policy purposes
 Marketing executives forecast demand, sales, and consumer preferences for strategic planning
 College administrators forecast enrollments to plan for facilities and for faculty recruitment
 Traders forecast stock prices, interest rates, and volatilities to make a profit

SLIDE 7

Time-Series Components

A time series can be decomposed into four components:
 Trend Component
 Seasonal Component
 Cyclical Component
 Irregular Component

SLIDE 8

Trend Component

 Long-run increase or decrease over time (overall upward or downward movement)
 Data taken over a long period of time

[Figure: Sales vs. Time]

SLIDE 9

Trend Component

 Trend can be upward or downward
 Trend can be linear or non-linear

[Figures: Sales vs. Time - downward linear trend; Sales vs. Time - upward nonlinear trend]

SLIDE 10

Seasonal Component

 Short-term regular wave-like patterns
 Observed within 1 year
 Often monthly or quarterly

[Figure: Quarterly sales vs. time, repeating Winter/Spring/Summer/Fall pattern]

SLIDE 11

Cyclical Component

 Long-term wave-like patterns
 Regularly occur but may vary in length
 Often measured peak to peak or trough to trough

[Figure: Sales vs. Year showing one full cycle]

SLIDE 12

Irregular Component

 Unpredictable, random, “residual” fluctuations
 Due to random variations of
   Nature
   Accidents or unusual events
 “Noise” in the time series

SLIDE 13

Additive Time-Series Model

 Used primarily for forecasting

Yi = Ti + Si + Ci + Ii

where
Ti = Trend value at time i
Si = Seasonal value at time i
Ci = Cyclical value at time i
Ii = Irregular (random) value at time i
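The additive model above can be sketched by composing a synthetic series from its four components. All component values below are made-up illustrative choices, not data from the slides:

```python
# Build a synthetic series from the additive model Yi = Ti + Si + Ci + Ii.
# Component shapes (growth rate, wave periods, noise level) are hypothetical.
import math
import random

random.seed(0)
n = 24  # two years of monthly observations (hypothetical)

trend = [2.0 * i for i in range(n)]                                   # Ti: steady growth
seasonal = [10.0 * math.sin(2 * math.pi * i / 12) for i in range(n)]  # Si: yearly wave
cyclical = [5.0 * math.sin(2 * math.pi * i / 24) for i in range(n)]   # Ci: longer wave
irregular = [random.gauss(0, 1) for _ in range(n)]                    # Ii: random noise

# The observed series is simply the sum of the four components at each time i.
y = [t + s + c + e for t, s, c, e in zip(trend, seasonal, cyclical, irregular)]
```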

SLIDE 14

Multiplicative Time-Series Model

 Used primarily for forecasting

Yi = Ti × Si × Ci × Ii

where
Ti = Trend value at time i
Si = Seasonal value at time i
Ci = Cyclical value at time i
Ii = Irregular (random) value at time i

SLIDE 15

Forecasting Time Series

 Smoothing-based forecasting (moving average)
 Trend-based forecasting
 Autoregressive models
 Many alternative models exist

SLIDE 16

Moving Averages

 Calculate moving averages to get an overall impression of the pattern of movement over time
 Moving Average: average of consecutive time-series values for a chosen period of length L

SLIDE 17

Moving Averages

 Used for smoothing
 A series of arithmetic means over time
 Result dependent upon choice of L (length of period for computing means)
 Examples:
   For a 5-year moving average, L = 5
   For a 7-year moving average, L = 7
   Etc.

SLIDE 18

Moving Averages

 Example: Five-year moving average
   First average: MA(5) = (Y1 + Y2 + Y3 + Y4 + Y5) / 5
   Second average: MA(5) = (Y2 + Y3 + Y4 + Y5 + Y6) / 5
   etc.

SLIDE 19

Example: Annual Data

Year:  1   2   3   4   5   6   7   8   9   10  11  etc.
Sales: 23  40  25  27  32  48  33  37  37  50  40  etc.

[Figure: Annual Sales vs. Year]

SLIDE 20

Calculating Moving Averages

 Each moving average is for a consecutive block of 5 years

Year  Sales      Average Year  5-Year Moving Average
1     23         3             29.4
2     40         4             34.4
3     25         5             33.0
4     27         6             35.4
5     32         7             37.4
6     48         8             41.0
7     33         9             39.4
8     37         ...           ...
9     37
10    50
11    40

First average (centered on year 3): (23 + 40 + 25 + 27 + 32) / 5 = 29.4
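The moving-average column above can be reproduced in a few lines; `moving_average` is a hypothetical helper written for this sketch, not something named on the slides:

```python
# Centered 5-year moving average of the annual sales series from the slides.
sales = [23, 40, 25, 27, 32, 48, 33, 37, 37, 50, 40]

def moving_average(y, L):
    """Average of each consecutive block of length L in the series y."""
    return [sum(y[i:i + L]) / L for i in range(len(y) - L + 1)]

ma5 = moving_average(sales, 5)
print([round(v, 1) for v in ma5])  # [29.4, 34.4, 33.0, 35.4, 37.4, 41.0, 39.4]
```

The first value, 29.4, is the average of years 1-5 and is plotted against the middle year of the block (year 3), matching the table.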

SLIDE 21

Annual vs. 5-Year Moving Average

[Figure: Annual sales and the 5-year moving average vs. Year]

 The 5-year moving average smooths the data and shows the underlying trend

SLIDE 22

Exponential Smoothing

 A weighted moving average
   Weights decline exponentially
   Most recent observation weighted most
 Used for smoothing and short-term forecasting (often one period into the future)

SLIDE 23

Exponential Smoothing

 The weight (smoothing coefficient) is W
   Subjectively chosen
   Ranges from 0 to 1
   Smaller W gives more smoothing, larger W gives less smoothing

SLIDE 24

Exponential Smoothing Model

E1 = Y1

Ei = W·Yi + (1 − W)·Ei-1    for i = 2, 3, 4, ...

where:
Ei = exponentially smoothed value for period i
Ei-1 = exponentially smoothed value already computed for period i − 1
Yi = observed value in period i
W = weight (smoothing coefficient), 0 < W < 1

SLIDE 25

Exponential Smoothing Example

 Suppose we use weight W = .2

Time Period (i)  Sales (Yi)  Forecast from prior period (Ei-1)  Smoothed value for this period (Ei)
1                23          --                                 23
2                40          23                                 (.2)(40) + (.8)(23)     = 26.4
3                25          26.4                               (.2)(25) + (.8)(26.4)   = 26.12
4                27          26.12                              (.2)(27) + (.8)(26.12)  = 26.296
5                32          26.296                             (.2)(32) + (.8)(26.296) = 27.437
6                48          27.437                             (.2)(48) + (.8)(27.437) = 31.549
7                33          31.549                             (.2)(33) + (.8)(31.549) = 31.840
8                37          31.840                             (.2)(37) + (.8)(31.840) = 32.872
9                37          32.872                             (.2)(37) + (.8)(32.872) = 33.697
10               50          33.697                             (.2)(50) + (.8)(33.697) = 36.958
etc.

Ei = W·Yi + (1 − W)·Ei-1

E1 = Y1 since no prior information exists
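The recurrence Ei = W·Yi + (1 − W)·Ei-1 with E1 = Y1 is a short loop; `exponential_smoothing` is a hypothetical helper name for this sketch:

```python
# Exponential smoothing of the slide's sales series with weight W = 0.2.
def exponential_smoothing(y, w):
    """E1 = Y1; Ei = w*Yi + (1 - w)*E(i-1) for i >= 2."""
    e = [y[0]]  # no prior information exists, so E1 = Y1
    for value in y[1:]:
        e.append(w * value + (1 - w) * e[-1])
    return e

sales = [23, 40, 25, 27, 32, 48, 33, 37, 37, 50]
smoothed = exponential_smoothing(sales, 0.2)
print(round(smoothed[-1], 3))  # 36.958, matching the last row of the table
```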

SLIDE 26

Sales vs. Smoothed Sales

 Fluctuations have been smoothed
 NOTE: the smoothed value in this case is generally a little low, since the trend is upward sloping and the weighting factor is only .2

[Figure: Sales and smoothed sales vs. Time Period]

SLIDE 27

Forecasting Time Period i + 1

 The smoothed value in the current period (i) is used as the forecast value for the next period (i + 1):

Ŷi+1 = Ei

SLIDE 28

Trend-Based Forecasting

 Estimate a trend line using regression analysis
 Use time (X) as the independent variable:

Ŷ = b0 + b1·X

Year  Time Period (X)  Sales (Y)
1999  0                20
2000  1                40
2001  2                30
2002  3                50
2003  4                70
2004  5                65

SLIDE 29

Trend-Based Forecasting

 The linear trend forecasting equation is:

Ŷi = 21.905 + 9.5714·Xi

Year  Time Period (X)  Sales (Y)
1999  0                20
2000  1                40
2001  2                30
2002  3                50
2003  4                70
2004  5                65

[Figure: Sales and fitted trend line vs. Year]

SLIDE 30

Trend-Based Forecasting

 Forecast for time period 6 (year 2005):

Ŷ = 21.905 + 9.5714·(6) = 79.33

Year  Time Period (X)  Sales (Y)
1999  0                20
2000  1                40
2001  2                30
2002  3                50
2003  4                70
2004  5                65
2005  6                ??
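The trend line and forecast can be checked with an ordinary least-squares fit. The coding of the time periods as X = 0..5 is an assumption made here; it is the coding that reproduces the intercept 21.905 shown on the slide:

```python
# Least-squares linear trend fit for the sales series, with time periods
# coded X = 0..5 for 1999-2004 (assumed coding; it reproduces the slide's
# intercept and slope).
import numpy as np

x = np.arange(6)                      # time periods 0..5
y = np.array([20, 40, 30, 50, 70, 65])

b1, b0 = np.polyfit(x, y, 1)          # polyfit returns [slope, intercept]
print(round(b0, 3), round(b1, 4))     # 21.905 9.5714
print(round(b0 + b1 * 6, 2))          # forecast for 2005 (X = 6): 79.33
```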

SLIDE 31

Nonlinear Trend Forecasting

 A nonlinear regression model can be used when the time series exhibits a nonlinear trend
 Quadratic form is one type of nonlinear model:

Yi = β0 + β1·Xi + β2·Xi² + εi

 Can try other functional forms to get the best fit

SLIDE 32

Model Selection Using Differences

 Use a linear trend model if the first differences are approximately constant:

(Y2 − Y1), (Y3 − Y2), ..., (Yn − Yn-1)

 Use a quadratic trend model if the second differences are approximately constant:

[(Y3 − Y2) − (Y2 − Y1)], [(Y4 − Y3) − (Y3 − Y2)], ..., [(Yn − Yn-1) − (Yn-1 − Yn-2)]

Why?
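The differencing check is easy to demonstrate; the two series below are made-up illustrative examples, not data from the slides:

```python
# First and second differences as a model-selection check: a linear series
# has constant first differences; a quadratic series has constant second
# differences. (Illustrative series, not from the slides.)
import numpy as np

linear = np.array([3, 5, 7, 9, 11])      # Y = 3 + 2X
quadratic = np.array([1, 4, 9, 16, 25])  # Y = X^2

print(np.diff(linear))           # [2 2 2 2]  -> constant: linear trend fits
print(np.diff(quadratic))        # [3 5 7 9]  -> not constant
print(np.diff(quadratic, n=2))   # [2 2 2]    -> constant: quadratic trend fits
```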

SLIDE 33

Autoregressive Models

 Used for forecasting
 Take advantage of autocorrelation
   1st order - correlation between consecutive values
   2nd order - correlation between values 2 periods apart
 pth-order autoregressive model:

Yi = A0 + A1·Yi-1 + A2·Yi-2 + ... + Ap·Yi-p + δi

where δi is the random error

SLIDE 34

Autoregressive Model: Example

The Office Concept Corp. has acquired a number of office units (in thousands of square feet) over the last eight years. Develop the second-order autoregressive model.

Year  Units
97    4
98    3
99    2
00    3
01    2
02    2
03    4
04    6

Yi = A0 + A1·Yi-1 + A2·Yi-2 + δi

SLIDE 35

Autoregressive Model: Example Solution

 Develop the 2nd-order lagged table
 Build a regression model

Year  Yi  Yi-1  Yi-2
97    4   --    --
98    3   4     --
99    2   3     4
00    3   2     3
01    2   3     2
02    2   2     3
03    4   2     2
04    6   4     2

Regression model output:
Intercept       3.5
X Variable 1    0.8125
X Variable 2   −0.9375

Fitted model:

Ŷi = 3.5 + 0.8125·Yi-1 − 0.9375·Yi-2

SLIDE 36

Autoregressive Model Example: Forecasting

Use the second-order equation to forecast the number of units for 2005:

Ŷi = 3.5 + 0.8125·Yi-1 − 0.9375·Yi-2
Ŷ2005 = 3.5 + 0.8125·Y2004 − 0.9375·Y2003
      = 3.5 + 0.8125·(6) − 0.9375·(4)
      = 4.625
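The regression coefficients and the 2005 forecast can be reproduced by fitting the lagged table with ordinary least squares:

```python
# Least-squares fit of the 2nd-order autoregressive model to the slide's
# office-units series (1997-2004), then a one-step forecast for 2005.
import numpy as np

units = [4, 3, 2, 3, 2, 2, 4, 6]

# Rows usable for a 2nd-order model start at index 2 (two lags required),
# mirroring the lagged table on the slide.
y = np.array(units[2:])
x = np.array([[1, units[i - 1], units[i - 2]] for i in range(2, len(units))])

coef, *_ = np.linalg.lstsq(x, y, rcond=None)
a0, a1, a2 = coef
print(round(a0, 4), round(a1, 4), round(a2, 4))  # 3.5 0.8125 -0.9375

forecast_2005 = a0 + a1 * units[-1] + a2 * units[-2]
print(round(forecast_2005, 3))  # 4.625
```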

SLIDE 37

Autoregressive Modeling Steps

1. Choose p
2. Form a series of “lagged predictor” variables Yi-1, Yi-2, ..., Yi-p
3. Build a regression model using all p lagged variables

SLIDE 38

Measuring Errors

 Choose the model that gives the smallest error measures

 Mean Absolute Deviation (MAD)
   Less sensitive to extreme observations

MAD = Σ |Yi − Ŷi| / n    (sum over i = 1, ..., n)

 Sum of Squared Errors (SSE)
   Sensitive to outliers

SSE = Σ (Yi − Ŷi)²    (sum over i = 1, ..., n)
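Both error measures are one-liners; the actual/forecast values below are made-up illustrative numbers, not from the slides:

```python
# MAD and SSE for comparing forecasting models (illustrative data).
def mad(actual, predicted):
    """Mean Absolute Deviation: average of |Yi - Yhat_i|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def sse(actual, predicted):
    """Sum of Squared Errors: sum of (Yi - Yhat_i)^2."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

actual = [20, 40, 30, 50]
predicted = [25, 35, 30, 45]   # hypothetical model forecasts
print(mad(actual, predicted))  # 3.75
print(sse(actual, predicted))  # 75
```

Because SSE squares each error, a single large miss dominates it, while MAD treats all errors proportionally, which is the sensitivity difference the bullets describe.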

SLIDE 39

Principle of Parsimony

 Suppose two or more models provide a good fit for the data
 Select the simplest model
 Simplest model types:
   Least-squares linear
   Least-squares quadratic
   1st-order autoregressive
 More complex types:
   2nd- and 3rd-order autoregressive
   Least-squares exponential

SLIDE 40

Time Series Similarity Search

SLIDE 41

Motivation

 Identify the same person in different actions
 Detect forgery

SLIDE 42

Motivation

 Find similar stocks or mutual funds
 Find a time period with similar inflation rate and unemployment rate

SLIDE 43

How to Measure Similarity

 Euclidean? Manhattan? Lp norm?
 Need a method that allows elastic shifting of the time axis to accommodate sequences that are similar but can be out of phase

[Figure: two similar but out-of-phase time series]

SLIDE 44

Why Dynamic Time Warping?

Any distance (Euclidean, Manhattan, ...) which aligns the i-th point on one time series with the i-th point on the other will produce a poor similarity score. A non-linear (elastic) alignment produces a more intuitive similarity measure, allowing similar shapes to match even if they are out of phase in the time axis.

[Figure: point-to-point vs. elastic alignment of two time series]

Slides based on one by Elena Tsiporkova

SLIDE 45

Warping Path

The best alignment between Time Series X and Time Series Y is the path through the N × M grid

P = p1, ..., pl, ..., pL    where pl = (nl, ml)

which minimizes the total distance between them. P is called a warping path.

[Figure: N × M grid with a warping path from cell (1,1) to cell (N,M)]

SLIDE 46

Warping Path

 Given X = (x1, ..., xN) and Y = (y1, ..., yM)
 An (N,M)-warping path is a sequence P = (p1, p2, ..., pL) satisfying three conditions:
   Boundary condition: starts at (1,1), ends at (N,M)
   Monotonicity condition: never goes back in time
   Step size condition: one step at a time

SLIDE 47

Which one is a warping path?

SLIDE 48

Cost Matrix

 Given X = (x1, ..., xN), Y = (y1, ..., yM), and a local distance (cost) measure c(xn, ym)
 The cost matrix C is defined by C(n, m) = c(xn, ym)
 The DTW algorithm finds a warping path such that the overall cost along the path is minimized

SLIDE 49

Cost Matrix Example

SLIDE 50

DTW Distance

 Total cost of a warping path P: the sum of the local costs c(x(nl), y(ml)) along the path
 The optimal warping path is the one with minimal total cost
 The DTW distance between X and Y is defined as the total cost of the optimal warping path

SLIDE 51

Dynamic Programming

 Let D(n, m) denote the DTW distance between the prefix sequences (x1, ..., xn) and (y1, ..., ym)
 D is called the accumulated cost matrix
 Obviously, DTW(X, Y) = D(N, M)
 D satisfies the following identities:

D(n, 1) = Σk=1..n c(xk, y1),    D(1, m) = Σk=1..m c(x1, yk)
D(n, m) = c(xn, ym) + min{ D(n−1, m−1), D(n−1, m), D(n, m−1) }

SLIDE 52

Classical DTW Algorithm

SLIDE 53

Classical DTW Example

SLIDE 54

Illustrating Example

DTW(X,Y) = 2
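The classical DTW algorithm above can be sketched directly from the dynamic-programming recurrence. This sketch uses absolute difference as the local cost and made-up example sequences (not the slide's example):

```python
# Minimal classical DTW: fill the accumulated cost matrix with
# D(n, m) = c(n, m) + min(D(n-1, m-1), D(n-1, m), D(n, m-1)),
# using |x - y| as the local cost measure.
def dtw(x, y):
    n, m = len(x), len(y)
    inf = float("inf")
    # Extra padding row/column of infinity enforces the boundary condition.
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            d[i][j] = cost + min(d[i - 1][j - 1], d[i - 1][j], d[i][j - 1])
    return d[n][m]

# Elastic alignment matches the repeated 1 against a single 1, so cost is 0
# even though the sequences have different lengths.
print(dtw([0, 1, 1, 2], [0, 1, 2]))  # 0.0
print(dtw([1, 2, 3], [2, 3, 4]))     # 2.0
```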

SLIDE 55

DTW Variations

Interesting read; not required for this course

 Local weights
 Global constraints