Canada Data and Resources Hugh McCague Valerie Preston Walter - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Accessing Statistics Canada Data and Resources

Hugh McCague Valerie Preston Walter Giesbrecht Sara Tumpane

slide-2
SLIDE 2

Outline

  • Survey Terminology
  • Research Data Centre (RDC)
  • RDC versus Public Use Microdata Files (PUMF)
  • Accessing the RDC
  • Statistics Canada Surveys and Data
  • Statistical Software
  • Statistical Consulting Service
  • Resources
slide-3
SLIDE 3

Some Survey Terminology

  • Population
  • Elements
  • Sample: Simple Random Sample, Probability Sample

  • Response Rate
  • Weights: Simple Weights
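
To make the terms on this slide concrete, here is a minimal Python sketch of a response rate, a simple (equal) weight, and a weighted estimate. The function names and the numbers are illustrative assumptions, not taken from any Statistics Canada survey, and real surveys define response rates and weights in more detail.

```python
def response_rate(completed, sampled):
    """Share of sampled elements that actually responded."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    return completed / sampled


def simple_weight(population_size, respondents):
    """Equal weight for a simple random sample: each respondent
    'stands for' population_size / respondents population elements."""
    if respondents <= 0:
        raise ValueError("respondents must be positive")
    return population_size / respondents


def weighted_mean(values, weights):
    """Weighted mean of values; with equal weights this is the plain mean."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical example: population of 10,000, 500 sampled, 400 responded.
rate = response_rate(completed=400, sampled=500)
weight = simple_weight(population_size=10_000, respondents=400)

# Four made-up yes/no answers (1 = yes), each carrying the same weight.
answers = [1, 0, 1, 1]
estimate = weighted_mean(answers, [weight] * len(answers))
print(rate, weight, estimate)
```

With complex samples the weights differ between respondents, but the weighted mean is computed in the same way.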
slide-4
SLIDE 4

Some Survey Terminology

  • Demographics
  • Strata
  • Clusters (primary sampling units, PSUs)
  • Complex Sample
  • Complex Weights, Bootstrap and Jackknife Replicate Weights
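
As a rough illustration of how bootstrap replicate weights are used, the Python sketch below re-computes an estimate once per set of replicate weights and takes the average squared deviation from the full-sample estimate as the variance. This is one commonly used estimator, shown under assumptions: the data, weights, and function names are made up, real files supply many more replicates (often hundreds), and the survey documentation should be checked for the exact formula.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def bootstrap_variance(values, full_weights, replicate_weights):
    """Variance of the weighted mean from bootstrap replicate weights.

    replicate_weights is a list of weight vectors, one per replicate.
    The estimate is recomputed with each vector and compared with the
    full-sample estimate.
    """
    theta = weighted_mean(values, full_weights)
    replicate_estimates = [weighted_mean(values, w) for w in replicate_weights]
    squared_deviations = [(t - theta) ** 2 for t in replicate_estimates]
    return sum(squared_deviations) / len(replicate_estimates)


# Toy data: four respondents and four hypothetical replicate weight vectors.
values = [10, 20, 30, 40]
full_weights = [1, 1, 1, 1]
replicate_weights = [
    [2, 0, 1, 1],
    [0, 2, 1, 1],
    [1, 1, 2, 0],
    [1, 1, 0, 2],
]

variance = bootstrap_variance(values, full_weights, replicate_weights)
standard_error = math.sqrt(variance)
print(variance, standard_error)
```

Statistical packages with survey procedures automate this loop once the replicate weight variables are declared.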

slide-5
SLIDE 5

Some Survey Terminology

  • Cross-sectional data
  • Longitudinal data: periods, waves, cycles, trajectory, life course
  • Attrition: attrition rate.
  • Helpful reference: Ornstein, Michael. A Companion to Survey Research. London; Thousand Oaks, CA: SAGE, 2013.
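
The attrition rate in a longitudinal survey can be sketched as the share of first-wave respondents who no longer respond in a later wave. The Python example below is a minimal illustration; the respondent identifiers and the function name are hypothetical, and individual surveys may define attrition differently (for example, by excluding deaths or moves out of scope).

```python
def attrition_rate(baseline_ids, later_wave_ids):
    """Share of baseline respondents missing from a later wave."""
    baseline = set(baseline_ids)
    if not baseline:
        raise ValueError("baseline wave has no respondents")
    retained = baseline & set(later_wave_ids)
    return (len(baseline) - len(retained)) / len(baseline)


# Hypothetical respondent identifiers for two waves of a panel.
wave_1 = range(1, 11)               # respondents 1..10
wave_3 = [1, 2, 3, 4, 5, 6, 7, 11]  # 11 joined later and is not counted

rate = attrition_rate(wave_1, wave_3)
print(rate)
```

Only respondents present at baseline enter the calculation, which is why the later entrant is ignored.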

slide-6
SLIDE 6

Research Data Centre (RDC)

  • Access to Statistics Canada data and statistical software
  • Microdata & administrative data
  • For York students and faculty, access is free
  • A “secure” environment
  • Researchers are “deemed employees” of Statistics Canada
  • Must work in the RDC
  • Canadian Research Data Centre Network (CRDCN)
slide-7
SLIDE 7

The CRDCN Network

slide-8
SLIDE 8

York RDC

  • 282 York Lanes
  • Staffed by:
  • Analyst Sara Tumpane (yorkrdc2@yorku.ca)
  • Assistant Theresa Kim (yorkrdc3@yorku.ca)
  • 8 workstations
  • Open 3 days/week
  • http://www.isr.yorku.ca/rdc/

slide-9
SLIDE 9

Before you apply to the RDC…

  • Consider your options
  • Is what you need available in a more readily accessible source (either a PUMF or an aggregate file)?

slide-10
SLIDE 10

RDC or PUMF?

Confidential Microdata in Research Data Centres

Characteristics:

  • Contains most of the original information collected during the survey
  • Continuous variables are accessible
  • Longitudinal identifiers provided
  • Contains bootstrap weights used for calculating exact variance

Access is appropriate when:

  • Sensitive variables not provided in PUMF
  • A PUMF does not exist
  • Longitudinal data is necessary
  • Analytical work is complex in nature

Public Use Microdata Files accessed online

Characteristics:

  • Manipulated by aggregating, capping, or deleting variables that could be “identifiers”; survey respondents cannot be identified
  • Many continuous variables transformed into categorical variables
  • Longitudinal identifiers stripped

Access is appropriate when:

  • Immediate data access is required
  • Analysis is for a course paper or equivalent
  • Data exploration
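
The PUMF manipulations listed above (capping, turning continuous variables into categories, and deleting identifiers) can be pictured with a small Python sketch. Everything here is an assumption for illustration: the field names, the income cap, and the five-year age bands are invented and do not reflect Statistics Canada's actual disclosure-control rules.

```python
def cap(value, ceiling):
    """Top-code a continuous value so extreme cases cannot stand out."""
    return min(value, ceiling)


def to_age_group(age, width=5):
    """Replace an exact age with a band such as '45-49'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def make_public_record(master_record, income_cap=100_000):
    """Build a PUMF-style record from a hypothetical master-file record.

    Direct and longitudinal identifiers (person_id, postal_code) are
    simply not copied over.
    """
    return {
        "age_group": to_age_group(master_record["age"]),
        "income": cap(master_record["income"], income_cap),
    }


# Hypothetical master-file record.
master = {
    "person_id": 17,
    "postal_code": "A1A 1A1",
    "age": 47,
    "income": 250_000,
}

public = make_public_record(master)
print(public)
```

The trade-off is visible in the output: the public record is safer to release, but analyses that need exact ages, full incomes, or links across waves require the master file in the RDC.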
slide-11
SLIDE 11

Labour Force Survey

PUMF

  • Demographic variables
  • Geography
  • Age
  • Sex
  • Marital status

Master file

  • Demographic variables
  • Geography
  • Age
  • Sex
  • Marital status
  • Country of birth
  • Country completed highest post-secondary degree/certificate/diploma
  • Landed immigrant status
  • Detailed Aboriginal status
slide-12
SLIDE 12

CCHS 2012 Example 1

PUMF

  • 1381 variables
  • Sources of personal income
  • Employment inc.
  • EI/Worker's comp
  • Senior benefits
  • Other

Master File

  • 1815 variables
  • Sources of personal income
  • wages and salaries
  • income from self-employment
  • dividends and interest
  • employment insurance
  • worker's compensation
  • CPP or QPP
  • job related retirement pensions
  • RRSP/RRIF
  • OAS and GIS
  • social assistance/welfare
  • child tax benefits
  • child support
  • alimony
  • other
  • none
slide-13
SLIDE 13

CCHS 2012 Example 2

PUMF

  • Geography
  • Province of residence of respondent - (G)
  • Health Region - (G)
  • B.C. Health Authority (BCHA) - (D)

Master File

  • Geography
  • Province of residence of respondent
  • Postal code - (D)
  • Health region of residence of respondent - (D)
  • Sub-health region (Québec only) - (D)
  • Nova Scotia district health authority
  • British Columbia local health authority - (D)
  • Regional health authority (RHA) - Alberta - (D)
  • British Columbia health authority - (D)
  • Local health integrated networks - Ontario - (D)
  • 2006 census dissemination area
  • Federal electoral district - (D)
  • Census subdivision - (D)
  • Census division - (D)
  • Statistical area classification type - (D)
  • 2006 Census metropolitan area (CMA)
  • Health region peer group
  • Urban and rural areas
  • Urban and rural areas - 2 levels - (D)
  • Subzones for Alberta
  • Manitoba health authority - (D)
slide-14
SLIDE 14

Accessing PUMFs & master file metadata

  • Statistics Canada Nesstar data portal
  • metadata only, for PUMFs and master files
  • http://www62.statcan.ca/webview/
  • YUL: Data & Statistics library guide
  • http://researchguides.library.yorku.ca/data
  • <odesi> (OCUL)
  • http://www.library.yorku.ca/e/resolver/id/1165738
slide-15
SLIDE 15

http://www.andertoons.com/data/cartoon/6543/things-good-stuff-ok-i-reiterate-request-for-specific-data

slide-16
SLIDE 16

How to apply to an RDC and available datasets

  • RDC Application Pages
  • Data available in the RDCs
  • SSHRC Website
slide-17
SLIDE 17

Accessing the RDC

  • Action: Apply through the SSHRC website. Timeline: 1-2 Hours. Notes: Provide list of academic contributions, project proposal.
  • Action: Evaluation of the proposal. Timeline: 2-4 Weeks. Notes: Approval based on relevance of methods and data, and demonstrated need for microdata.
  • Action: Security screening process. Timeline: 1-3 Weeks for approval.
  • Action: Sign Microdata Research Contract. Timeline: 1-3 Weeks for approval.

slide-18
SLIDE 18

Project Proposal

  • The project proposal includes the following elements:
  • Title of the Project
  • Rationale and objectives of the study
  • Proposed data analysis and software requirements
  • Data requirements
  • Expected project start and end dates
  • Expected products
  • References
slide-19
SLIDE 19

Data at the RDC

  • Labour Force Survey (LFS): 1976 - 2014
  • Monthly estimates of employment & unemployment
  • Rotating 6-month panel, N= ~ 16,500
  • Paper: Seasonal Adjustment, Demography, and GDP Growth, Dunbar, G.R. (2013), Canadian Journal of Economics
  • Survey of Labour and Income Dynamics (SLID): 1993 – 2011
  • Changes in well-being over time
  • Overlapping 6-year panels, N= ~ 17,000
  • Paper: An Empirical Model of Tax Convexity and Self-Employment, Wen, J-F. & Gordon, D. (2014), The Review of Economics and Statistics
  • Workplace and Employee Survey (WES): 1999 – 2006
  • Employer: competitiveness, innovation, technology use: N= ~ 6,300
  • Employee: training, job stability, earnings: N= ~24,000
  • Paper: Organizational Redesign, Information Technologies and Workplace Productivity, Dostie, B. & Jayaraman, R. (2012), The B.E. Journal of Economic Analysis and Policy

slide-20
SLIDE 20

Data (continued)

  • Survey of Household Spending (SHS): 1986 - 2012
  • Spending, investments, and savings: household and person
  • Cross-sectional: N= ~17,000 (households)
  • Paper: Does One Size Fit All? The CPI and Canadian Seniors, Brzozowski, M. (2006), Canadian Public Policy
  • Survey of Financial Security (SFS): 1999 – 2012
  • Net worth (wealth) of Canadian families: assets, debt, employment, income, education
  • Cross-sectional: N= ~20,000 (households)
  • Paper: New Evidence on Taxes and Portfolio Choice, Atalay, K. et al. (2009), Social and Economic Dimensions of an Aging Population (SEDAP) Research Papers
  • Census & National Household Survey (NHS): 1911 – 2011
  • Demographic, social, and economic characteristics
  • Cross-sectional (mandatory): 20% sample, N= ~6,000,000
  • Paper: Quality of Life, Firm Productivity, and the Value of Amenities Across Canadian Cities, Albouy, D., Leibovici, F. & Warman, C. (2013), Canadian Journal of Economics

slide-21
SLIDE 21

Data by Themes

  • Health and Health Care
  • National Population Health Survey (NPHS)
  • Participation and Activity Limitation Survey (PALS)
  • Canadian Tobacco, Alcohol and Drugs Survey (CTADS)
  • Occupations and Organizations
  • Workplace and Employee Survey (WES)
  • Survey of Labour and Income Dynamics (SLID)
  • Census
  • Education
  • Youth in Transition Survey (YITS)
  • National Graduates Survey (NGS)
  • Race and Ethnicity
  • Aboriginal Peoples Survey (APS)
  • Longitudinal Survey of Immigrants to Canada (LSIC)
  • Ethnic Diversity Survey (EDS)
slide-22
SLIDE 22

Pilot Data

  • Canadian Cancer Registry (CCR)
  • Vital Statistics
  • Uniform Crime Reporting
  • Homicide Survey
  • Hate Crime Data
  • Ministry of Community and Social Services (MCSS)
  • Citizenship and Immigration Canada (CIC)
slide-23
SLIDE 23

Which Statistical Software to use at the York RDC? Features to Consider

  • SPSS 23
  • SAS 9.4
  • Stata 13
  • R 3.0.3

Statistical Software Resources: Institute for Digital Research and Education (IDRE), UCLA

http://www.ats.ucla.edu/stat/

slide-24
SLIDE 24

Statistical Consulting Service (SCS)

  • Statistical Consulting provided by a group of York faculty and graduate students with staff at the Institute for Social Research (ISR).

  • Usually, no fee for York faculty and student researchers
  • Online appointment scheduler
slide-25
SLIDE 25

http://truthfacts.com/truthfacts/2014/04/09

slide-26
SLIDE 26

Statistical Consulting Service (SCS)

  • ISR/SCS Short Courses and Spring Seminar Series on data analysis, qualitative research methods, survey methods, and related software

  • More details: http://www.isr.yorku.ca/centres/scs/
slide-27
SLIDE 27

Contact Information and Resources

  • http://www.isr.yorku.ca/econ
slide-28
SLIDE 28