
The UK Longitudinal Studies (LSs)



  1. The UK Longitudinal Studies (LSs)
     - Sensitive microdata: a sample from the Census linked to administrative data (births, deaths, marriages, health and other)
     - Restricted access:
       - Safe settings
         - ONS LS (England & Wales): London, Titchfield and Newport
         - SLS (Scotland): Edinburgh
         - NILS (Northern Ireland): Belfast
       - Remote access
         - Only variable names and labels are provided to the user
         - A Support Officer runs the analysis script on the real data
     Administrative Data Research Centre - Scotland | Beata Nowok | 10 March 2015

  2. Synthetic data for the UK LSs
     - Synthetic versions of data extracts to match individual user data requests
     - Provided to approved researchers for preliminary analysis and preparing code; the final analysis will be run on the real data in safe settings

  3. Data that look (structurally) like original data but contain artificial units only

     Original (input)

     Sex     Age  Education             Marital status  Income  Life satisfaction
     FEMALE  57   VOCATIONAL/GRAMMAR    MARRIED         800     PLEASED
     MALE    41   SECONDARY             UNMARRIED       1500    MIXED
     FEMALE  18   VOCATIONAL/GRAMMAR    UNMARRIED       NA      PLEASED
     FEMALE  78   PRIMARY/NO EDUCATION  WIDOWED         900     MIXED
     FEMALE  54   VOCATIONAL/GRAMMAR    MARRIED         1500    MOSTLY SATISFIED
     MALE    20   SECONDARY             UNMARRIED       -8      PLEASED
     FEMALE  39   SECONDARY             MARRIED         2000    MOSTLY SATISFIED
     MALE    39   SECONDARY             MARRIED         1197    MIXED

     Synthetic (output)

     Sex     Age  Education             Marital status  Income  Life satisfaction
     FEMALE  38   VOCATIONAL/GRAMMAR    MARRIED         NA      MOSTLY DISSATISFIED
     FEMALE  73   VOCATIONAL/GRAMMAR    WIDOWED         1700    PLEASED
     FEMALE  54   SECONDARY             WIDOWED         2000    MOSTLY SATISFIED
     MALE    81   PRIMARY/NO EDUCATION  MARRIED         2100    PLEASED
     MALE    30   VOCATIONAL/GRAMMAR    UNMARRIED       900     MOSTLY SATISFIED
     MALE    54   VOCATIONAL/GRAMMAR    MARRIED         1700    PLEASED
     MALE    68   SECONDARY             MARRIED         -8      DELIGHTED
     FEMALE  32   VOCATIONAL/GRAMMAR    DIVORCED        870     MIXED
     MALE    61   PRIMARY/NO EDUCATION  MARRIED         -8      MIXED
     FEMALE  98   PRIMARY/NO EDUCATION  MARRIED         800     MOSTLY DISSATISFIED
     FEMALE  50   PRIMARY/NO EDUCATION  MARRIED         NA      MOSTLY SATISFIED
     FEMALE  37   VOCATIONAL/GRAMMAR    MARRIED         158     PLEASED
     MALE    28   VOCATIONAL/GRAMMAR    NA              1500    MOSTLY SATISFIED
     FEMALE  62   PRIMARY/NO EDUCATION  MARRIED         830     MOSTLY SATISFIED
     MALE    78   PRIMARY/NO EDUCATION  MARRIED         NA      PLEASED
     FEMALE  29   SECONDARY             MARRIED         580     MOSTLY SATISFIED
     MALE    59   PRIMARY/NO EDUCATION  MARRIED         1300    MOSTLY SATISFIED
     MALE    41   SECONDARY             UNMARRIED       1500    MIXED
     MALE    18   SECONDARY             UNMARRIED       -8      PLEASED
     FEMALE  73   PRIMARY/NO EDUCATION  WIDOWED         1350    MOSTLY SATISFIED

  4. Data that behave (statistically) like original data

  5. The synthpop package: Generating Synthetic Versions of Sensitive Microdata for Statistical Disclosure Control
     http://cran.r-project.org/package=synthpop

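  6. Generating synthetic data: method
     - Sequentially replacing original data values with synthetic values generated from conditional probability distributions
     - Fit: Y_j ~ (Y_0, Y_1, ..., Y_{j-1}) on the observed data
     - Draw: synthetic values of Y_j from the fitted model
     [Diagram: observed -> fit -> draw -> synthetic]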
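
The fit-and-draw sequence on this slide can be illustrated with a minimal Python sketch. It is not synthpop's implementation: the function name `synthesise`, the toy records, and the use of exact empirical conditional frequencies (instead of a fitted model such as CART) are all assumptions made for illustration.

```python
import random


def synthesise(rows, order, seed=0):
    """Sequentially synthesise categorical columns (illustrative sketch only).

    rows  : list of dicts holding the observed records
    order : column names in visit order, i.e. Y_0, Y_1, ..., Y_j
    """
    rng = random.Random(seed)
    synthetic = [{} for _ in rows]
    for j, col in enumerate(order):
        predictors = order[:j]
        # "Fit": for every observed combination of the earlier columns,
        # collect the observed values of the current column.
        donors = {}
        for r in rows:
            key = tuple(r[p] for p in predictors)
            donors.setdefault(key, []).append(r[col])
        # "Draw": sample the current column conditional on the values
        # already synthesised for the earlier columns.
        for s in synthetic:
            key = tuple(s[p] for p in predictors)
            s[col] = rng.choice(donors[key])
    return synthetic


# Hypothetical toy records, not taken from any Longitudinal Study.
observed = [
    {"sex": "FEMALE", "education": "SECONDARY", "marital": "MARRIED"},
    {"sex": "FEMALE", "education": "SECONDARY", "marital": "WIDOWED"},
    {"sex": "MALE", "education": "SECONDARY", "marital": "UNMARRIED"},
    {"sex": "MALE", "education": "PRIMARY", "marital": "MARRIED"},
]
order = ["sex", "education", "marital"]
synthetic = synthesise(observed, order)
```

Because this toy version conditions on the exact combination of all earlier values, every synthetic row reproduces a combination that occurs in the observed data. A model-based conditional, such as the CART default mentioned later in the slides, pools similar records instead of matching them exactly.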

  7. Generating synthetic data: synthpop
     [Diagram: observed -> syn() -> synthetic]

  8. Generating synthetic data: synthpop
     - Synthesis can be run with default parameters (classification and regression tree models, CART): syn(data)
     - Methods to summarise and to make inferences from synthetic data

  9. syn() & common data problems
     - Missing-data patterns
     - Semi-continuous variables
     - Restricted values (interrelationships between variables)
     - Linear constraints
     - Non-negativity / non-normality
     - Deterministic relations
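
As one illustration of the first problem in this list, the sketch below synthesises a numeric column containing missing-data codes in two steps: it first draws whether a record is missing (and with which code), then draws a real value only for records that are not missing. This is a simplified sketch, not a description of how syn() works internally. The function name and the reading of -8 as a missing-data code are assumptions; the income values are those of the first eight rows of the table on slide 3, with `None` standing in for NA.

```python
import random

# Assumption: None stands for NA and -8 is a survey missing-data code.
MISSING_CODES = {None, -8}


def synthesise_with_missing(values, seed=0):
    """Two-step synthesis of a numeric column with missing-data codes."""
    rng = random.Random(seed)
    # Step 1 input: the missingness pattern, keeping each code distinct.
    states = [v if v in MISSING_CODES else "value" for v in values]
    # Step 2 input: the real (non-missing) observed values only.
    real_values = [v for v in values if v not in MISSING_CODES]
    out = []
    for _ in values:
        state = rng.choice(states)
        if state == "value":
            out.append(rng.choice(real_values))
        else:
            out.append(state)  # keep the missing code as a code
    return out


income = [800, 1500, None, 900, 1500, -8, 2000, 1197]
synthetic_income = synthesise_with_missing(income)
```

Treating the codes separately keeps a value such as -8 from being averaged or modelled as if it were a genuine income.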

  10. Conclusions
     - Synthetic data: expanding the use of confidential microdata
       - UK LSs: access to LS-like data on the user's own computer
       - ADRC-S: archiving linked data
       - Teaching
     - The synthpop package for R: facilitating generation and analysis of synthetic data
       - Direction: automation based on best practices and methods
