Time Series Database Interface UseR 2008 Paul Gilbert Bank of - - PDF document

time series database interface user 2008 paul gilbert
SMART_READER_LITE
LIVE PREVIEW

Time Series Database Interface UseR 2008 Paul Gilbert Bank of - - PDF document

Time Series Database Interface UseR 2008 Paul Gilbert Bank of Canada pgilbert@bank-banque-canada.ca August, 2008 The views expressed are those of the authors. No responsibility for them should be attributed to the Bank of Canada. 1


slide-1
SLIDE 1

Time Series Database Interface UseR 2008 Paul Gilbert Bank of Canada pgilbert@bank-banque-canada.ca August, 2008 The views expressed are those of the authors. No responsibility for them should be attributed to the Bank of Canada.

1

slide-2
SLIDE 2

Motivation: Provide an abstract layer (API) for time series database calls (and back-end data base structures)

  • What is a time series?
  • Query is (usually) really simple.

2

slide-3
SLIDE 3

Why an API layer?

  • Other

code can use multiple/different databases.

  • Independence from database/vendor.
  • Independence from database versions.
  • Independence from organization.

3

slide-4
SLIDE 4

Why SQL?

  • (not just SQL)
  • DBI, network interface, widely used and

supported

  • Do extra things like panels and vintages.
  • (I get to learn something about SQL)

4

slide-5
SLIDE 5
  • API (roughly defined)
  • Applications: R, ...
  • Interface: TSdbi (using DBI)
  • Database:

PADI, MySQL, SQLite, ... SQL, Fame, ..., getHistoricalQuote

  • (using TSpadi, TSMySQL, TSQLite, ...
  • (and RMySQL, RSQLite, ...)

5

slide-6
SLIDE 6
  • Historical note: PADI is old

6

slide-7
SLIDE 7

TSconnect(drv, dbname, ...) TSget(serIDs, con=options()$TSconnection, ...)

  • for SQL dbs this calls a standard SQL

function, TSgetSQL.

  • in theory the time representation can be

specified.

7

slide-8
SLIDE 8

TSput(x, serIDs=seriesNames(x), con=options()$TSconnection, ...) TSdates(serIDs, con=options()$TSconnection, ...) TSdescription(x, con=options()$TSconnection, ...) TSdescription assignment function too. TSexists TSdoc

8

slide-9
SLIDE 9

SQL implementation

  • tables for each ”frequency” of data
  • annual,

quarterly, monthly, semiannual, weekly, daily, business day, minutely, irregular data with a date, and irregular data with a date and time

  • A Q M S W D B I T U
  • Meta (documentation and to lookup what

table a series is on)

  • vintages, panels (optional)

9

slide-10
SLIDE 10

Example table setup dbGetQuery(con, "create table D ( id VARCHAR(40), date DATE, period INT, v double DEFAULT NULL );") dbGetQuery(con, "CREATE INDEX Dindex_id ON D (id);") dbGetQuery(con, "CREATE INDEX Dindex_date ON D (date);") dbGetQuery(con, "CREATE INDEX Dindex_period ON D (period);")

10

slide-11
SLIDE 11

Example (skipping details about setting up tables, permissions, loading data, etc) require("TSMySQL") m <- dbDriver("MySQL") con <- TSconnect("MySQL", dbname="FVvintages") # pass user/passwd/host in ~/.my.cnf z1 <- TSget(serIDs="M2+gross", con=con, vintage="v2001-07") z2 <- TSget(serIDs="M2+gross", con=con, vintage="current") z3 <- TSget(serIDs="M2+gross", con) # should default to current tfplot(z1, z2,z3, Title="Selected vintages of gross M2+", start=c(1990,1))

11

slide-12
SLIDE 12

1990 1995 2000 2005 5e+05 7e+05 9e+05 M2+gross

Selected vintages of gross M2+

12

slide-13
SLIDE 13
  • performance?
  • other back-ends?
  • extensions? start, end, manipulations?

13

slide-14
SLIDE 14

The End

14

slide-15
SLIDE 15
  • 15