Handling Missingness (Manipulating Time Series Data in R: Case Studies)

SLIDE 1

MANIPULATING TIME SERIES DATA IN R: CASE STUDIES

Handling Missingness

SLIDE 2

Missingness

> citydata
              pop
1980-01-01 562994
1981-01-01 564179
1982-01-01 565361
1983-01-01 565491
1984-01-01 566723
1985-01-01     NA
1986-01-01     NA
1987-01-01     NA
1988-01-01 570867
1989-01-01 572222
1990-01-01 574823

[Plot: citydata, Jan 1980 to Jan 1990]

SLIDE 3

Fill NAs with Last Observation

> citydata_locf <- na.locf(citydata)
> plot.xts(citydata)
> plot.xts(citydata_locf)

  • Last observation carried forward (LOCF)

[Plots: citydata_locf and citydata, Jan 1980 to Jan 1990]
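
A minimal sketch of LOCF on the series from the Missingness slide. It assumes the xts package is installed (loading it also attaches zoo); the helper names dates and pop are introduced here for illustration only.

```r
library(xts)  # also attaches zoo

# Rebuild the annual series shown on the Missingness slide
dates <- seq(as.Date("1980-01-01"), by = "year", length.out = 11)
pop <- c(562994, 564179, 565361, 565491, 566723,
         NA, NA, NA, 570867, 572222, 574823)
citydata <- xts(cbind(pop = pop), order.by = dates)

# Carry the last observed value (1984) forward across the 1985-1987 gap
citydata_locf <- na.locf(citydata)

# Inspect the years that were missing
citydata_locf["1985/1987"]
```

Each of the three missing years takes the 1984 value, 566723, so the filled series is flat across the gap and then jumps in 1988.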

SLIDE 4

Fill NAs with Next Observation

  • Next observation carried backward (NOCB)

> citydata_nocb <- na.locf(citydata, fromLast = TRUE)
> plot.xts(citydata)
> plot.xts(citydata_nocb)

[Plots: citydata_nocb and citydata, Jan 1980 to Jan 1990]
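
A minimal sketch of NOCB on the same series, again assuming xts is installed. The helper names dates and pop are illustrative, not from the slides.

```r
library(xts)  # also attaches zoo

# Rebuild the annual series shown on the Missingness slide
dates <- seq(as.Date("1980-01-01"), by = "year", length.out = 11)
pop <- c(562994, 564179, 565361, 565491, 566723,
         NA, NA, NA, 570867, 572222, 574823)
citydata <- xts(cbind(pop = pop), order.by = dates)

# fromLast = TRUE fills backward: the next observed value (1988)
# is carried back across the 1985-1987 gap
citydata_nocb <- na.locf(citydata, fromLast = TRUE)

# Inspect the years that were missing
citydata_nocb["1985/1987"]
```

Here the gap takes the 1988 value, 570867, so the jump happens at the start of the gap rather than at the end.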

SLIDE 5

Linear Interpolation

> citydata_approx <- na.approx(citydata)
> plot.xts(citydata)
> plot.xts(citydata_approx)

[Plots: citydata_approx and citydata, Jan 1980 to Jan 1990]
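
A minimal sketch of linear interpolation on the same series, assuming xts is installed. The helper names dates, pop, and filled are illustrative. As far as I know, na.approx() interpolates against the time index by default, so the filled values need not be exactly evenly spaced when the gap spans a leap year.

```r
library(xts)  # also attaches zoo, which provides na.approx()

# Rebuild the annual series shown on the Missingness slide
dates <- seq(as.Date("1980-01-01"), by = "year", length.out = 11)
pop <- c(562994, 564179, 565361, 565491, 566723,
         NA, NA, NA, 570867, 572222, 574823)
citydata <- xts(cbind(pop = pop), order.by = dates)

# Draw a straight line between the 1984 and 1988 observations
citydata_approx <- na.approx(citydata)

# The filled values rise steadily between the two endpoints
filled <- as.numeric(citydata_approx["1985/1987"])
filled
```

Unlike LOCF and NOCB, interpolation avoids a sudden jump at either edge of the gap.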

SLIDE 6

Let’s practice!

SLIDE 7

Lagging and Differencing

SLIDE 8

Lagging

  • lag() offsets observations in time

lag(unemployment, k = 1, ...)

Jan 2010    9.6
Feb 2010    9.2
March 2010  8.9
April 2010  8.3
May 2010    8.2
June 2010   8.4
July 2010   8.3

Lagged values (k = 1): 9.6 9.2 8.9 8.3 8.2 8.4
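
A minimal sketch of lagging the seven monthly values from this slide, assuming unemployment is an xts object and xts is installed. The Date index and the column name rate are assumptions made for illustration. The behavior described applies to the xts method; base ts objects and dplyr::lag() behave differently.

```r
library(xts)

# Monthly values from the slide, indexed by the first day of each month
months <- seq(as.Date("2010-01-01"), by = "month", length.out = 7)
unemployment <- xts(cbind(rate = c(9.6, 9.2, 8.9, 8.3, 8.2, 8.4, 8.3)),
                    order.by = months)

# With xts, k = 1 moves each observation one period later:
# February receives January's value, and January becomes NA
unemployment_lag <- lag(unemployment, k = 1)

# Show the original and lagged columns side by side
merge(unemployment, unemployment_lag)
```

Placing the two columns side by side makes the one-month offset easy to check by eye.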

SLIDE 9

Differencing

  • diff() measures change between periods

diff(unemployment, lag = 1, ...)

Jan 2010    9.6
Feb 2010    9.2
March 2010  8.9
April 2010  8.3
May 2010    8.2
June 2010   8.4
July 2010   8.3

Month-over-month differences: -0.4 -0.3 -0.6 -0.1 0.2 -0.1
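
A minimal sketch of differencing the same seven monthly values, assuming unemployment is an xts object and xts is installed. The Date index and the column name rate are assumptions made for illustration.

```r
library(xts)

# Monthly values from the slide, indexed by the first day of each month
months <- seq(as.Date("2010-01-01"), by = "month", length.out = 7)
unemployment <- xts(cbind(rate = c(9.6, 9.2, 8.9, 8.3, 8.2, 8.4, 8.3)),
                    order.by = months)

# lag = 1 subtracts the previous month's value from each observation;
# the first month has no predecessor, so it is NA
unemployment_diff <- diff(unemployment, lag = 1)

# Round to one decimal place to hide floating-point noise when printing
round(unemployment_diff, 1)
```

Setting lag = 12 instead would compare each month with the same month one year earlier, which needs at least 13 observations to produce a non-missing value.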

SLIDE 10

Let’s practice!

SLIDE 11

Rolling Functions

SLIDE 12

Discrete Windows

  • Split the data according to period

> unemployment_yrs <- split(unemployment, f = "years")

  • Apply function within period

> unemployment_yrs <- lapply(unemployment_yrs, cummax)

  • Bind new data into xts object

> unemployment_ytd <- do.call(rbind, unemployment_yrs)
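
The three steps above can be sketched on a small series that spans two calendar years. The values below are invented purely for illustration (they are not real unemployment figures), and the names toy, toy_yrs, and toy_ytd are hypothetical. The sketch assumes xts is installed.

```r
library(xts)

# Invented values: Oct-Dec 2009 followed by Jan-Mar 2010
months <- seq(as.Date("2009-10-01"), by = "month", length.out = 6)
toy <- xts(cbind(value = c(5, 7, 6, 4, 3, 8)), order.by = months)

# 1. Split into a list with one xts object per calendar year
toy_yrs <- split(toy, f = "years")

# 2. Apply a cumulative maximum within each year
toy_yrs <- lapply(toy_yrs, cummax)

# 3. Bind the pieces back into a single xts object
toy_ytd <- do.call(rbind, toy_yrs)
toy_ytd
```

The running maximum restarts in January 2010, which is what distinguishes a discrete (per-period) window from a single cumulative maximum over the whole series.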

SLIDE 13

Rolling Windows

  • rollapply() applies a function to a rolling window

> unemployment_avg <- rollapply(unemployment, width = 12, FUN = mean)

[Plots: Unemployment (%), 2000 to 2010]
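
A minimal sketch of a rolling mean, using the seven monthly values from the Lagging slide. Because that sample is shorter than 12 months, the sketch uses width = 3 instead of the slide's width = 12, and it sets align = "right" explicitly so each average covers the current month and the two before it. The Date index and the column name rate are assumptions made for illustration.

```r
library(xts)  # also attaches zoo, which provides rollapply()

# Monthly values from the Lagging slide
months <- seq(as.Date("2010-01-01"), by = "month", length.out = 7)
unemployment <- xts(cbind(rate = c(9.6, 9.2, 8.9, 8.3, 8.2, 8.4, 8.3)),
                    order.by = months)

# Trailing three-month mean; the first two months lack a full window
unemployment_avg <- rollapply(unemployment, width = 3, FUN = mean,
                              align = "right")
unemployment_avg
```

Unlike the split-apply-bind approach for discrete windows, the window here slides forward one observation at a time and never resets at a period boundary.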

SLIDE 14

Let’s practice!