Modelling Measurement Error in Administrative and Survey Variables

Sander Scholtus, Bart Bakker, Arnout van Delden (s.scholtus@cbs.nl)

Outline
– Introduction
– Modelling measurement error
  ‐ structural equation models
  ‐ identification by means of an audit sample
– Application
  ‐ VAT data for Dutch quarterly turnover statistics
– Summary / discussion

Introduction
– Quality of administrative data for statistical purposes
  ‐ coverage of target population, timeliness, …
  ‐ measurement issues
– Administrative data: possible conceptual differences
– Compare admin. data to survey data
  ‐ previous presentation: survey data = gold standard
  ‐ current presentation: measurement errors in both sources

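Modelling measurement error
– Basic approach:
  ‐ link administrative data to survey data
  ‐ allow for measurement errors in both sources
  ‐ fit a structural equation model (SEM)
  ‐ latent variables represent “true” concepts
  ‐ standardised factor loadings reflect validity of measurement

  admin:  $y_1 = \tau_1 + \lambda_{11} \eta_1 + \varepsilon_1$
  survey: $y_2 = \tau_2 + \lambda_{21} \eta_1 + \varepsilon_2$

[Path diagram: the latent true variable $\eta_1$ is measured by the observed variables $y_1$ (admin) and $y_2$ (survey), with loadings $\lambda_{11}$, $\lambda_{21}$, intercepts $\tau_1$, $\tau_2$ (paths from the constant 1) and errors $\varepsilon_1$, $\varepsilon_2$. Legend: latent, observed, constant.]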
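The link between a standardised loading and validity can be illustrated with a small simulation. This is a minimal sketch, assuming a single latent variable with variance 1 and normally distributed, independent errors; the parameter values (τ = -0.02, λ = 0.8, error s.d. 0.6) and all function names are hypothetical and chosen only for illustration. Under these assumptions the standardised loading equals the correlation between the observed variable and the latent true value.

```python
import math
import random


def standardised_loading(lam, var_eta, var_eps):
    """Standardised loading of y = tau + lam * eta + eps, i.e. corr(y, eta)."""
    var_y = lam ** 2 * var_eta + var_eps
    return lam * math.sqrt(var_eta) / math.sqrt(var_y)


def pearson(x, y):
    """Plain Pearson correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n, tau, lam, sd_eps, seed=1):
    """Draw latent true values eta ~ N(0, 1) and one error-prone measurement y."""
    rng = random.Random(seed)
    eta = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [tau + lam * e + rng.gauss(0.0, sd_eps) for e in eta]
    return eta, y


# Hypothetical "admin" measurement of the latent variable.
eta, y_admin = simulate(20000, tau=-0.02, lam=0.8, sd_eps=0.6)
print("theoretical standardised loading:", standardised_loading(0.8, 1.0, 0.36))
print("simulated corr(y, eta):          ", pearson(eta, y_admin))
```

In real data the latent variable is not observed, so the correlation in the last line cannot be computed directly; that is why the loading has to be estimated from the covariances among several observed variables.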

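Modelling measurement error
– Complications: model identification ‘requires’
  ‐ multiple (≥ 3) related concepts
  ‐ multiple (≥ 2) observed variables for each concept
  ‐ choice of a metric for each latent variable (for evaluating bias)

[Path diagram: three related latent variables $\xi_1$, $\eta_1$, $\eta_2$ (disturbances $\zeta_1$, $\zeta_2$), each with two observed variables: $x_1$, $x_2$ (errors $\delta_1$, $\delta_2$) and $y_1, \dots, y_4$ (errors $\varepsilon_1, \dots, \varepsilon_4$). Legend: latent, observed, constant.]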
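A short moment count shows why two measurements of a single concept are not enough. The derivation below is a sketch that assumes the errors $\varepsilon_1$, $\varepsilon_2$ are uncorrelated with each other and with $\eta_1$; this assumption is not stated on the slide.

```latex
\begin{align*}
E(y_j) &= \tau_j + \lambda_{j1}\, E(\eta_1), \qquad j = 1, 2 \\
\mathrm{Var}(y_j) &= \lambda_{j1}^2\, \mathrm{Var}(\eta_1) + \mathrm{Var}(\varepsilon_j), \qquad j = 1, 2 \\
\mathrm{Cov}(y_1, y_2) &= \lambda_{11} \lambda_{21}\, \mathrm{Var}(\eta_1)
\end{align*}
```

These are 5 observable moments (2 means, 2 variances, 1 covariance) against 8 parameters ($\tau_1$, $\tau_2$, $\lambda_{11}$, $\lambda_{21}$, $\mathrm{Var}(\varepsilon_1)$, $\mathrm{Var}(\varepsilon_2)$, $E(\eta_1)$, $\mathrm{Var}(\eta_1)$). Fixing a metric for $\eta_1$ removes two parameters, which still leaves 6 unknowns for 5 moments, so additional related concepts and indicators are needed.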

Modelling measurement error
– Standard identification solutions yield ‘arbitrary’ metrics:
  ‐ reference indicators [e.g., $\tau_1 = 0$ and $\lambda_{11} = 1$]
  ‐ standardise latent variables [$E(\eta_1) = 0$ and $\mathrm{Var}(\eta_1) = 1$]
– Alternative solution: calibration
  ‐ collect additional gold standard data for a random subsample (audit sample / verification study)
  ‐ simulation results suggest: audit sample of 50 units is sufficient

  admin:  $y_1 = \tau_1 + \lambda_{11} \eta_1 + \varepsilon_1$
  survey: $y_2 = \tau_2 + \lambda_{21} \eta_1 + \varepsilon_2$
  audit:  $y_3 = \eta_1$

[Path diagram: the latent true variable $\eta_1$ with observed variables $y_1$ (admin), $y_2$ (survey) and $y_3$ (audit); the audit variable has loading 1 and no intercept or error term.]
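The idea behind calibration can be sketched without a full SEM: for units in the audit sample, $y_3 = \eta_1$ is observed, so $\tau_1$ and $\lambda_{11}$ are expressed in the metric of the true values. The example below is a deliberately simplified illustration, not the SEM estimation used in the presentation: it fits an ordinary least squares line to a hypothetical, error-free audit sample. The data values and the function name `ols` are assumptions made for this sketch.

```python
def ols(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# Hypothetical audit sample: y3 holds the gold-standard (true) values.
y3_audit = [1.0, 2.0, 3.0, 4.0, 5.0]
# Hypothetical admin values generated as y1 = -0.02 + 0.8 * eta (no noise).
y1_admin = [-0.02 + 0.8 * t for t in y3_audit]

tau1, lam11 = ols(y3_audit, y1_admin)
print("intercept tau_1:", tau1)
print("loading lambda_11:", lam11)
```

An intercept different from 0 or a loading different from 1 then indicates bias in the administrative variable relative to the true values. In the SEM, the audit units are combined with the non-audited units, which this regression sketch ignores.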

Application: VAT data
– Dutch quarterly turnover statistics
– Main question: VAT turnover fit for use?
  ‐ base cells in car trade and transport sector
  ‐ tax regulations exist, previous analysis inconclusive
  ‐ large and complex units excluded
– Sources of data:
  ‐ Business Register (BR)
  ‐ Profit Declarations (PD; admin. source)
  ‐ VAT data (admin. source)
  ‐ Structural Business Statistics (SBS; sample survey)
  ‐ Audit sample: re-edited SBS data (50 units per base cell)

Application: VAT data
– Model: (SBS data removed to avoid multicollinearity with audit data)

[Path diagram: latent variables No. Empl., Tot. Costs, Turnover and Purchase, with observed variables taken from BR, PD, VAT, SBS and the audit sample.]

Application: VAT data
– Model estimation
  ‐ used Pseudo Maximum Likelihood to account for
    • complex survey design (SBS + audit sample)
    • skewness of the data
  ‐ examined data transformations:
    • variables on original scale
    • variables divided by number of legal units (heteroscedasticity)
  ‐ used R packages lavaan and lavaan.survey

Application: VAT data
Results for NACE 45112 (“Sale/repair of passenger cars”)
– Robust (PML) fit measures: $\chi^2 = 66$ (df = 47, p = 0.03); CFI = 0.998; TLI = 0.999; RMSEA = 0.032

[Path diagram: the fitted model for the latent variables No. Empl., Tot. Costs, Turnover and Purchase, annotated with the estimated intercepts and factor loadings of the BR, PD, VAT and audit variables.]

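Application: VAT data
– Result from SEM on previous slide:
  Turnover(VAT) = -0.02 + 0.80 × Turnover(true) + ε
– Derive a correction formula through a second SEM:
  ‐ values shown in the diagram: $\lambda^* = 1.03$, $\tau^* = -0.01$, $\theta^* = 0.06$
  Turnover(true) = 0.18 + 1.13 × Turnover(VAT) + ζ ($R^2 = 0.90$) ($\sigma = 0.08$) ($\sigma = 0.06$)

[Path diagram: latent Turnover (disturbance ζ) regressed on the VAT variable with intercept α and slope β, and measured by the PD variable with error ε.]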
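Applying the correction formula is simple arithmetic, sketched below with the coefficients quoted on the slide (0.18 and 1.13). The function names and the input value are hypothetical, and the scale of the turnover variable is not stated on the slide, so the numbers are purely illustrative. For comparison, the sketch also inverts the measurement equation algebraically, which gives a different line, (y + 0.02) / 0.80 = 0.025 + 1.25 y. In general, the regression of the true value on the observed value is not the algebraic inverse of the measurement equation, which is presumably why a second SEM is used to derive the correction.

```python
def correct_vat(vat_turnover):
    """Predicted true turnover from the second SEM: 0.18 + 1.13 * VAT."""
    return 0.18 + 1.13 * vat_turnover


def naive_inverse(vat_turnover):
    """Algebraic inversion of VAT = -0.02 + 0.80 * true (ignores the error term)."""
    return (vat_turnover + 0.02) / 0.80


# Hypothetical VAT turnover value on the (unstated) analysis scale.
vat = 2.0
print("second-SEM correction:", correct_vat(vat))
print("naive inversion:      ", naive_inverse(vat))
```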

Summary / discussion
– Can assess validity and bias of admin. data with SEMs
– Advantages over direct comparison to survey data:
  ‐ allow for measurement errors in all sources
  ‐ objective evaluation of measurement quality
– Possible disadvantages:
  ‐ need multiple related concepts
  ‐ need an audit sample to identify bias
– Suggestion: apply a multi-stage approach
  1. Make a direct comparison to survey data (linear regression)
  2. If inconclusive, determine validity with SEM approach
  3. If validity high, collect audit sample to estimate bias as well
