SLIDE 1

Introduction The Proposed Methods Simulation Results Next Step

Reducing Nonresponse Bias through Responsive Design and External Benchmarks

Julia Lee

University of Michigan

July 17, 2012

Thesis committee: S. Heeringa, R. Little, T. Raghunathan, R. Valliant

Julia Lee CE 2012 Survey Methods Symposium

SLIDE 2

Goals of the Project

1. To improve respondent representativeness
2. To assess the nature of nonresponse
3. To adjust for nonresponse

SLIDE 3

Outline

  • Introduction
  • The proposed method
  • Simulation results
  • Next steps

SLIDE 4

Current Practice

Reduce nonresponse bias at the analysis stage:
  • Weighting class methods
  • Propensity score methods
  • Calibration (Imputation)
Challenges:
  • Need nonrespondent information
  • Assume ignorable nonresponse pattern
  • Extreme and highly variable weights occur
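
As a rough illustration of the first method listed above, the sketch below applies a weighting class adjustment: within each class, respondent weights are inflated by the ratio of the total weight to the responding weight. The class labels, base weights, and the function name `weighting_class_adjust` are hypothetical and not taken from the talk. The example also hints at two of the challenges: the class of every nonrespondent must be known, and a class with few respondents receives a large weight.

```python
from collections import defaultdict


def weighting_class_adjust(sample):
    """Return nonresponse-adjusted weights, one per sampled unit.

    sample: list of dicts with keys 'cell', 'weight', 'responded'.
    Respondents are scaled by (total weight in cell) / (responding weight
    in cell); nonrespondents get weight 0.0.
    """
    total = defaultdict(float)
    responding = defaultdict(float)
    for unit in sample:
        total[unit["cell"]] += unit["weight"]
        if unit["responded"]:
            responding[unit["cell"]] += unit["weight"]

    adjusted = []
    for unit in sample:
        if unit["responded"]:
            factor = total[unit["cell"]] / responding[unit["cell"]]
            adjusted.append(unit["weight"] * factor)
        else:
            adjusted.append(0.0)
    return adjusted


# Hypothetical sample: the "young" class has a 25% response rate,
# the "old" class a 100% response rate.
sample = [
    {"cell": "young", "weight": 10.0, "responded": True},
    {"cell": "young", "weight": 10.0, "responded": False},
    {"cell": "young", "weight": 10.0, "responded": False},
    {"cell": "young", "weight": 10.0, "responded": False},
    {"cell": "old", "weight": 10.0, "responded": True},
    {"cell": "old", "weight": 10.0, "responded": True},
]
adjusted_weights = weighting_class_adjust(sample)
print(adjusted_weights)
```

The single "young" respondent ends up carrying four times its base weight, which is the kind of variable weight the slide warns about.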

SLIDE 5

Alternatives

Reduce nonresponse bias at the design and data collection stages:
  • Actively control for nonresponse bias at the design stage by adaptively improving respondent representativeness.
  • Effectively use frame data, contextual data, paradata, and benchmark information to obviate the need for nonrespondent information.

SLIDE 6

Responsive Design Procedure

Objectives:
  • Obviate the need for nonrespondent information
  • Obtain a more representative respondent pool
Terminology:
  • Benchmark survey: captures the desired target population, such as the American Community Survey
  • Current survey: the survey that you are conducting

SLIDE 7

Responsive Design Procedure

Setting: Surveys with multi-phase data collection
The procedures:

1. Complete the first phase of data collection.
2. Combine with benchmark information.
3. Augment with frame data, contextual data, and paradata.
4. Model the origin of each data point (1 = benchmark, 0 = current survey) in terms of covariates.
5. Compute the ratio of propensity score densities (Rps) between the benchmark and the current survey.
6. Sample next-phase subjects using Rps.
7. Iterate steps 2 through 6 until acceptable representativeness or the budget is reached.
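
Steps 2 and 4 of the procedure can be sketched as follows, assuming a single shared covariate and a plain logistic regression fitted by gradient ascent. The talk does not say which model or software was used, so the data values, the learning rate, and the function names `fit_origin_model` and `propensity` are illustrative assumptions only.

```python
import math


def fit_origin_model(xs, origin, lr=0.1, n_iter=2000):
    """Fit logit P(origin = 1 | x) = b0 + b1 * x by gradient ascent.

    xs: covariate values for the stacked benchmark + current-survey data.
    origin: 1 for benchmark records, 0 for current-survey records.
    """
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for x, y in zip(xs, origin):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # gradient w.r.t. intercept
            g1 += (y - p) * x    # gradient w.r.t. slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def propensity(x, b0, b1):
    """Estimated probability that a record with covariate x is a benchmark record."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


# Hypothetical covariate values (step 2: stack the two sources).
benchmark_x = [0.5, 1.0, 1.5, 2.0, 2.5, 1.2]
current_x = [-0.5, 0.0, 0.5, 1.0, 1.5, 0.2]
xs = benchmark_x + current_x
origin = [1] * len(benchmark_x) + [0] * len(current_x)

# Step 4: model the origin of each data point.
b0, b1 = fit_origin_model(xs, origin)
scores = [propensity(x, b0, b1) for x in xs]
print(round(b0, 3), round(b1, 3))
```

Records whose covariate values look more like the benchmark receive higher scores; the densities of these scores in the two sources are what step 5 compares.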

SLIDE 8

The problem

How do we know propensity scores of next phase subjects before they respond?

SLIDE 9

Data structure

          Y1  Y2  Y3  Y4  X1  X2  X3  Z1  Z2
Bench 1   √   √   √   √   √   √   √   √   √
Bench 1   √   √   √   √   √   √   √   √   √
Bench 1   √   √   √   √   √   √   √   √   √
…     1   √   √   √   √   √   √   √   √   √
S1        √   √   √   √   √   √   √   √   √
S1        √   √   √   √   √   √   √   √   √
…         √   √   √   √   √   √   √   √   √
S2                        √   √   √   √   √
S2                        √   √   √   √   √
S2                        √   √   √   √   √
S2                        √   √   √   √   √
…                         √   √   √   √   √

Notation:
  • Ys are survey variables.
  • Xs are common covariates across the benchmark survey and the sample survey.
  • Zs are auxiliary or contextual data from the frame, registry, interview observations, etc.

Blank cells: missing data

SLIDE 10

The key step 1: Imputation

Estimate propensity score of next samples using imputed covariates

          Y1  Y2  Y3  Y4  X1  X2  X3  Z1  Z2
Bench 1   √   √   √   √   √   √   √   √   √
Bench 1   √   √   √   √   √   √   √   √   √
Bench 1   √   √   √   √   √   √   √   √   √
…     1   √   √   √   √   √   √   √   √   √
S1        √   √   √   √   √   √   √   √   √
S1        √   √   √   √   √   √   √   √   √
…         √   √   √   √   √   √   √   √   √
S2        ▲   ▲   ▲   ▲   √   √   √   √   √
S2        ▲   ▲   ▲   ▲   √   √   √   √   √
S2        ▲   ▲   ▲   ▲   √   √   √   √   √
S2        ▲   ▲   ▲   ▲   √   √   √   √   √
…         ▲   ▲   ▲   ▲   √   √   √   √   √

Notation:
  • Ys are survey variables.
  • Xs are common covariates across the benchmark survey and the sample survey.
  • Zs are auxiliary or contextual data from the frame, registry, interview observations, etc.

▲: imputation
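
The slides do not specify the imputation model, so the following is only a stand-in: the unobserved survey variable of a next-phase subject is filled in with the mean among already-observed units sharing the same covariate cell. The cell labels, values, and the function name `impute_by_cell_mean` are hypothetical.

```python
from collections import defaultdict


def impute_by_cell_mean(observed, to_impute):
    """Impute a survey variable from observed units in the same covariate cell.

    observed: list of (cell, y) pairs from units with y observed.
    to_impute: list of cell labels for units whose y is not yet observed.
    Cells with no observed units fall back to the overall observed mean.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cell, y in observed:
        sums[cell] += y
        counts[cell] += 1
    overall_mean = sum(y for _, y in observed) / len(observed)

    imputed = []
    for cell in to_impute:
        if counts.get(cell, 0) > 0:
            imputed.append(sums[cell] / counts[cell])
        else:
            imputed.append(overall_mean)
    return imputed


# Hypothetical phase-1 respondents (cell defined by X and Z) and phase-2 subjects.
phase1 = [("A", 2.0), ("A", 4.0), ("B", 10.0)]
phase2_cells = ["A", "B", "C"]
imputed_y = impute_by_cell_mean(phase1, phase2_cells)
print(imputed_y)
```

With the Ys filled in this way, every next-phase subject has a complete covariate vector, so a propensity score can be estimated before the subject responds.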

SLIDE 11

The key step 2: Rps

Define an acceptance/rejection process on the original sampling frame, to reduce or eliminate bias relative to the benchmark survey.

Must satisfy:

  $\pi P(Z \mid \text{accept}) + (1 - \pi) P(Z) = P_B(Z)$

where $\pi$ is the fraction of the combined data that are newly drawn.

What we want is $P(\text{accept} \mid Z)$. Choose $P(Z)$ to be the propensity score density and use Bayes' theorem to obtain

  $P(\text{accept} \mid Z) \propto \dfrac{P_B(Z)}{P(Z)}$
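
One way to turn the density ratio on this slide into an acceptance rule is sketched below. It estimates both propensity score densities with a simple histogram and rescales the ratio so that the largest value becomes an acceptance probability of 1. The histogram estimator, the bin edges, the rescaling, and all function names are assumptions made for illustration; the talk does not describe how the densities were estimated.

```python
import random


def bin_index(value, edges):
    """Index of the half-open bin [edges[i], edges[i+1]) containing value, or None."""
    for i in range(len(edges) - 1):
        if edges[i] <= value < edges[i + 1]:
            return i
    return None


def histogram_density(values, edges):
    """Histogram density estimate over the given bin edges."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bin_index(v, edges)
        if i is not None:
            counts[i] += 1
    n = sum(counts)
    return [c / (n * (edges[i + 1] - edges[i])) for i, c in enumerate(counts)]


def acceptance_probabilities(bench_density, current_density, floor=1e-9):
    """Per-bin acceptance probability proportional to bench / current density."""
    ratios = [pb / max(p, floor) for pb, p in zip(bench_density, current_density)]
    top = max(ratios)
    return [r / top for r in ratios]


def accept_candidate(score, edges, accept_probs, rng):
    """Accept a next-phase candidate with the probability of its score bin."""
    i = bin_index(score, edges)
    if i is None:
        return False
    return rng.random() < accept_probs[i]


# Hypothetical propensity scores for benchmark and current-survey records.
edges = [0.0, 0.5, 1.0]
bench_scores = [0.2, 0.6, 0.7, 0.9]
current_scores = [0.1, 0.2, 0.3, 0.8]

bench_density = histogram_density(bench_scores, edges)
current_density = histogram_density(current_scores, edges)
accept_probs = acceptance_probabilities(bench_density, current_density)

rng = random.Random(0)
decision_high = accept_candidate(0.8, edges, accept_probs, rng)
print(accept_probs, decision_high)
```

Candidates whose scores fall where the current survey is thin relative to the benchmark are favoured, which is how the next phase pulls the respondent pool toward the benchmark.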

SLIDE 12

NHIS vs BRFSS: Covariates in the propensity score model

The usual suspects:
  • Geographic region
  • Demographic: gender, age, race, marital status
  • Socio-economic status: education, income categories, work status

SLIDE 13

NHIS vs BRFSS: Observed Data

[Figure: density curves (y-axis: Density) for Phase 1, Bench, phase 2, phase 3, and phase 4]

SLIDE 14

NHIS vs BRFSS: Observed Data

[Figure: four panels, Calendar Quarter 1 through Calendar Quarter 4, each comparing BRFSS and NHIS on UNWT propensity scores (axis range 0.0 to 1.0)]

SLIDE 15

NHIS vs BRFSS: Responsive Design

[Figure: four density panels (y-axis: Density). Panel 1: Phase 1, Bench, Sample 2. Panel 2: Phase 2, Bench, Sample 3. Panel 3: Phase 3, Bench, Sample 4. Panel 4: Phase 4, Bench.]

SLIDE 16

ACS vs CE: Observed Data

[Figure: density curves (y-axis: Density) for Phase 1, Bench, phase 2, phase 3, and phase 4]

SLIDE 17

ACS vs CE: Observed Data

[Figure: four panels, Calendar Quarter 1 through Calendar Quarter 4, each comparing CE and ACS on UNWT propensity scores (axis range 0.0 to 1.0)]

SLIDE 18

ACS vs CE: Observed Data

[Figure: four panels, Calendar Quarter 1 through Calendar Quarter 4, each comparing CE and ACS on UNWT propensity scores (axis range 0.70 to 1.00)]

SLIDE 19

Model fit and similarity measures

  • Model fit diagnostics
  • Distance measure on densities

Hellinger distance to quantify the similarity between two probability distributions:

  $H^2(P, Q) = \dfrac{1}{2} \int \left( \sqrt{dP} - \sqrt{dQ} \right)^2$

where P and Q represent the propensity score density from the benchmark and the current survey, respectively.

  • Balance measure on covariates

Absolute distance
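
For binned propensity scores, the Hellinger distance on this slide has a simple discrete analogue, sketched here under the assumption that both densities have been summarised as bin probabilities over the same bins. The bin probabilities and the function name `hellinger_squared` are hypothetical.

```python
import math


def hellinger_squared(p, q):
    """Squared Hellinger distance between two discrete distributions.

    p, q: bin probabilities over the same bins, each summing to 1.
    Discrete analogue of H^2 = 1/2 * integral (sqrt(dP) - sqrt(dQ))^2.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    return 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))


# Identical distributions give 0; distributions with no overlap give 1.
h_same = hellinger_squared([0.5, 0.5], [0.5, 0.5])
h_disjoint = hellinger_squared([1.0, 0.0], [0.0, 1.0])

# Hypothetical benchmark vs. current-survey bin probabilities.
h_between = hellinger_squared([0.5, 0.5], [0.25, 0.75])
print(h_same, h_disjoint, h_between)
```

A value near 0 after a data collection phase would indicate that the current survey's propensity score distribution has moved close to the benchmark's.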

SLIDE 20

Thank you! Comments are appreciated! Contact: julialee@umich.edu
