Domain Estimation of Survey Discontinuities



SLIDE 1

NTTS 2017

Domain Estimation of Survey Discontinuities

Nikos Tzavidis¹

Joint work with Paul Smith (University of Southampton), Timo Schmid and Natalia Rojas-Perilla (Freie Universität Berlin), Jan van den Brakel (CBS & University of Maastricht), Silvia Manclossi, Chris McGowan & Lisa Walters (Welsh Government)

NTTS Conference, Brussels, March 13-17, 2017

¹ Southampton Statistical Sciences Research Institute, University of Southampton (n.tzavidis@soton.ac.uk)

SLIDE 2

What is a Survey Discontinuity?

◮ Surveys try to maintain consistent methodologies (sampling/survey design) over time
◮ Aids the comparability of survey estimates over time
◮ However, changes in design cannot be avoided
◮ Changes designed to increase efficiency/reduce costs
◮ Can create breaks in the series known as discontinuities

SLIDE 3

An Example: The National Survey for Wales

◮ The Welsh Government (WG) has reviewed the way in which social surveys are conducted in Wales
◮ WG instituted a new National Survey (NSn) from 2016
◮ The NSn collects information previously collected in 5 surveys
    ◮ The old National Survey (NSo)
    ◮ The Welsh Health Survey (WHS)
    ◮ The Active Adults Survey (AAS)
    ◮ The Arts in Wales Survey (AWS)
    ◮ The Welsh Outdoor Recreation Survey (WORS)

SLIDE 4

Reviewing Survey Operations: The National Survey for Wales

◮ Agreeing the NSn involved consultations with customers
◮ NSn, like the NSo, uses a rotating design
◮ Longer questionnaire but not all original questions included
◮ Sufficient sample for LAs and Welsh Health Boards
◮ Demonstrate that new methodology is appropriate
◮ Produce estimates of discontinuities
◮ Estimates at National/sub-national levels

SLIDE 5

Potential Sources of Discontinuities

◮ Several changes in the NSn are potentially important
◮ Change of contractor
◮ Mode: Telephone/self-completed to face-to-face
◮ Interviewer effects - Social acceptability (sensitive questions)
◮ Questions from 5 surveys combined in a single questionnaire
◮ Possible ordering and context effects
◮ Impact of new design on response propensity by subgroup

SLIDE 6

A Framework for Assessing Discontinuities

◮ WG put in place a large-scale pilot of the new design
◮ Similar design to the one used in the NSn (n = 2800)
◮ Discontinuities: Difference between the estimates from the old surveys and those from the pilot
◮ Account for sampling variance
◮ Focus on discontinuities greater than 5 percentage points (Government Statistical Service - Methodology Advisory Committee, 2016)

SLIDE 7

Assumptions

◮ Assumption 1: The time difference between the pilot and the old surveys can be ignored
◮ Assumption 2: The pilot is used as if it were the new survey
◮ Ideally estimate discontinuities by a split-sample experiment
◮ Old and new designs randomly administered to respondents
◮ This is not the case with the Welsh pilot survey
◮ We cannot say why discontinuities occur

SLIDE 8

Estimating Discontinuities with a Pilot Survey

National level

◮ Denote by H-T the Horvitz-Thompson estimator of θ
◮ $\hat{\theta}^{O}$ is the H-T estimator of θ from the old survey
◮ $\hat{\theta}^{N}$ is the H-T estimator of θ from the pilot survey
◮ Estimator of discontinuity: $\hat{D} = \hat{\theta}^{N} - \hat{\theta}^{O}$
◮ $\mathrm{Var}(\hat{D}) = \mathrm{Var}(\hat{\theta}^{N} - \hat{\theta}^{O})$
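A minimal sketch of the national-level calculation, assuming the two H-T estimates and their variance estimates are already available. The function name, the input values and the default of zero covariance (independent old and pilot samples) are illustrative assumptions, not part of the slides. The 5 percentage point threshold is the one quoted on the framework slide.

```python
import math

def discontinuity(theta_new, var_new, theta_old, var_old, cov=0.0):
    """Return the estimated discontinuity, its standard error and z statistic.

    cov is Cov(theta_new, theta_old). The default 0.0 is an ASSUMPTION
    (independent old and pilot samples), not something stated in the slides.
    """
    d_hat = theta_new - theta_old
    var_d = var_new + var_old - 2.0 * cov
    se = math.sqrt(var_d)
    return d_hat, se, d_hat / se

# Hypothetical proportions and variance estimates, for illustration only.
d_hat, se, z = discontinuity(theta_new=0.42, var_new=0.0004,
                             theta_old=0.50, var_old=0.0001)

flag_size = abs(d_hat) > 0.05  # larger than 5 percentage points
flag_sig = abs(z) > 1.96       # differs from zero at the 5% level (normal approximation)
```

Checking both the size and the significance of the difference reflects the two bullets on the framework slide: account for sampling variance, and focus on differences above 5 percentage points.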

SLIDE 9

Estimating Discontinuities with a Pilot Survey

Domain Level

◮ $\hat{\theta}^{O}_{k}$: direct estimator from the old survey in domain k
◮ $\hat{\theta}^{N}_{k}$: direct estimator from the pilot survey in domain k
◮ Direct estimator of discontinuity: $\hat{D}_{k} = \hat{\theta}^{N}_{k} - \hat{\theta}^{O}_{k}$
◮ Variance of the estimated discontinuity is possibly large due to the small sample size of the pilot
◮ Employ model-based estimation

SLIDE 10

Model-based Estimation of Discontinuities

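◮ Area-level model (Fay & Herriot, 1979, JASA; Van den Brakel et al., 2016, JRSS A)

$$\hat{\theta}^{N}_{k} = x_{k}^{T}\hat{\beta} + v_{k} + \epsilon_{k}, \quad v_{k} \sim N(0, \sigma^{2}_{v}), \quad \epsilon_{k} \sim N(0, \psi_{k})$$

$$\hat{\theta}^{EBLUP}_{k} = \hat{\gamma}_{k}\hat{\theta}^{N}_{k} + (1 - \hat{\gamma}_{k})\,x_{k}^{T}\hat{\beta}$$

◮ Estimated discontinuity: $\hat{D}^{M}_{k} = \hat{\theta}^{EBLUP}_{k} - \hat{\theta}^{O}_{k}$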
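The EBLUP and the model-based discontinuity can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes an intercept plus a single covariate, takes the random-effect variance `sigma2_v` as given (in practice it would be estimated, for example by REML), and uses made-up domain values. The function `fh_eblup` and all variable names are hypothetical.

```python
def fh_eblup(theta_new, x, psi, sigma2_v):
    """Fay-Herriot EBLUP with an intercept and one covariate.

    theta_new : direct pilot estimates, one per domain
    x         : one auxiliary covariate value per domain
    psi       : sampling variances of the direct estimates
    sigma2_v  : variance of the domain random effects, taken as given here
                (ASSUMPTION: in practice it is estimated from the data)
    """
    # Weighted least squares for beta, with weights 1 / (sigma2_v + psi_k).
    w = [1.0 / (sigma2_v + p) for p in psi]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, theta_new)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, theta_new))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Shrinkage factor, synthetic (regression) part and EBLUP per domain.
    gamma = [sigma2_v / (sigma2_v + p) for p in psi]
    synthetic = [intercept + slope * xi for xi in x]
    eblup = [g * y + (1.0 - g) * s
             for g, y, s in zip(gamma, theta_new, synthetic)]
    return eblup, gamma, synthetic

# Hypothetical domain-level inputs, for illustration only.
theta_new = [0.40, 0.55, 0.47, 0.60]
theta_old = [0.45, 0.50, 0.52, 0.58]
x = [0.2, 0.5, 0.3, 0.6]
psi = [0.004, 0.001, 0.009, 0.002]

eblup, gamma, synthetic = fh_eblup(theta_new, x, psi, sigma2_v=0.002)
d_model = [e - o for e, o in zip(eblup, theta_old)]  # model-based discontinuities
```

Domains with a large sampling variance get a small shrinkage factor, so their EBLUP sits closer to the regression prediction than to the noisy direct estimate.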

SLIDE 11

Practical Challenges with Model-based Estimation

◮ Sampling variance $\psi_{k}$ is assumed known/estimated accounting for the complex sampling design

$$\mathrm{Var}(\hat{D}^{M}_{k}) = \mathrm{MSE}(\hat{\theta}^{EBLUP}_{k}) + \mathrm{Var}(\hat{\theta}^{O}_{k}) - 2\,\mathrm{Cov}(\hat{\theta}^{N}_{k}, \hat{\theta}^{O}_{k})$$

◮ Estimating $\mathrm{Cov}(\hat{\theta}^{N}_{k}, \hat{\theta}^{O}_{k})$ is complex (Van den Brakel, 2016)
◮ Model covariates can be taken from survey or admin data
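As a rough illustration of how the variance expression feeds an interval, the sketch below plugs in an MSE estimate, the old-survey variance and an optional covariance. The inputs are hypothetical, the covariance defaults to zero only for convenience, and the normal-approximation 95% interval is an assumption rather than something specified in the slides.

```python
import math

def var_model_discontinuity(mse_eblup, var_old, cov_new_old=0.0):
    """MSE(EBLUP_k) + Var(theta_old_k) - 2 * Cov(theta_new_k, theta_old_k).

    cov_new_old defaults to 0.0 as a convenience ASSUMPTION; the slides
    note that estimating this covariance is complex.
    """
    return mse_eblup + var_old - 2.0 * cov_new_old

def ci95(d_model, variance):
    """Normal-approximation 95% interval (an ASSUMPTION, not from the slides)."""
    half_width = 1.96 * math.sqrt(variance)
    return d_model - half_width, d_model + half_width

# Hypothetical values for a single domain.
v_indep = var_model_discontinuity(mse_eblup=0.0012, var_old=0.0005)
v_cov = var_model_discontinuity(mse_eblup=0.0012, var_old=0.0005,
                                cov_new_old=0.0002)
lower, upper = ci95(-0.06, v_indep)
```

A positive covariance between the old and pilot estimates lowers the variance of the difference, so setting it to zero is conservative in that case.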

SLIDE 12

Practical Challenges with Model-based Estimation

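◮ Model covariates from surveys are treated as random variables
◮ Extend the area-level model to account for measurement error in the predictors (Ybarra & Lohr, 2008, Biometrika)

$$\hat{\theta}^{ME-EBLUP}_{k} = \hat{\gamma}_{k}\hat{\theta}^{N}_{k} + (1 - \hat{\gamma}_{k})\,\hat{x}_{k}^{T}\hat{\beta}$$

$$\gamma_{k} = \frac{\sigma^{2}_{v} + \beta^{T}V_{k}\beta}{\sigma^{2}_{v} + \beta^{T}V_{k}\beta + \psi_{k}}$$

◮ $V_{k}$ is the variance-covariance matrix of the estimated covariates $\hat{x}_{k}$
◮ Estimating $V_{k}$ is another practical challenge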
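For one domain, the measurement-error predictor can be sketched as below. The coefficients `beta` and the variance `sigma2_v` are taken as given (both would be estimated in practice), and every number is hypothetical. The helper names `quad_form` and `me_eblup` are illustrative only.

```python
def quad_form(beta, V):
    """Return beta^T V beta for a list beta and a nested-list matrix V."""
    n = len(beta)
    return sum(beta[i] * V[i][j] * beta[j] for i in range(n) for j in range(n))

def me_eblup(theta_new_k, x_hat_k, beta, V_k, psi_k, sigma2_v):
    """Measurement-error EBLUP for one domain.

    ASSUMPTION: beta and sigma2_v are supplied rather than estimated here.
    """
    extra = quad_form(beta, V_k)  # variance contributed by the noisy covariates
    gamma = (sigma2_v + extra) / (sigma2_v + extra + psi_k)
    synthetic = sum(b * xk for b, xk in zip(beta, x_hat_k))
    return gamma * theta_new_k + (1.0 - gamma) * synthetic, gamma

# Hypothetical inputs: intercept (measured without error) plus one survey covariate.
beta = [0.3, 0.5]
x_hat_k = [1.0, 0.4]
V_k = [[0.0, 0.0],
       [0.0, 0.01]]

estimate, gamma_me = me_eblup(theta_new_k=0.40, x_hat_k=x_hat_k, beta=beta,
                              V_k=V_k, psi_k=0.004, sigma2_v=0.002)
```

With `V_k` set to a zero matrix the shrinkage factor reduces to the standard F-H value, which is a quick way to check an implementation.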

SLIDE 13

Experimental Results for the National Survey for Wales

◮ We present selected anonymised results
◮ Ranges of direct point estimates (Horvitz-Thompson) of National discontinuities
◮ Plots of model and direct point estimates for domains
◮ Model-based estimates produced with the F-H model
◮ $\psi_{k}$: variance of the H-T estimator
◮ Domains defined by crossing areas with demographic groups

SLIDE 14

Experimental Results for the National Survey for Wales

Ranges of Significant (National) Discontinuities - Direct

Survey      Range
Survey 1    -0.108, -0.058
Survey 2    -0.111, 0.166
Survey 3    -0.082, 0.110
Survey 4    -0.152, 0.184

SLIDE 15

Experimental Results for the National Survey for Wales - Variable 1

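[Figure: point estimates of discontinuities by domain for Variable 1, in two panels: "Discontinuities - Direct" and "Discontinuities - FH Model". One axis indexes the domains; the other shows the estimated discontinuity.]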

SLIDE 16

Experimental Results for the National Survey for Wales - Variable 2

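[Figure: point estimates of discontinuities by domain for Variable 2, in two panels: "Discontinuities - Direct" and "Discontinuities - FH Model". One axis indexes the domains; the other shows the estimated discontinuity.]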

SLIDE 17

Comments

◮ For this example, consistent negative estimates of discontinuities for some key variables
◮ A number of variables show discontinuities that are larger than the nominally interesting five percentage points
◮ Evidence that an adjustment for continuity should be made
◮ Pattern of discontinuities differential by subgroup (domain)
◮ Width of 95% CI is smaller for model-based estimates

SLIDE 18

A Word of Caution

◮ Model-based estimate: Combine model & direct estimates
◮ Model-based estimates are affected by shrinkage
◮ Although variance is reduced, bias can increase
◮ Use model-based estimates cautiously
◮ Contrast to direct estimates
◮ Accounting for measurement error may reduce shrinkage

SLIDE 19

The Impact of Measurement Error in Covariates

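◮ Shrinkage Factor - F-H Model

$$\gamma_{k} = \frac{\sigma^{2}_{v}}{\sigma^{2}_{v} + \psi_{k}}$$

◮ Shrinkage Factor - F-H Model with Measurement Error

$$\gamma^{ME}_{k} = \frac{\sigma^{2}_{v} + \beta^{T}V_{k}\beta}{\sigma^{2}_{v} + \beta^{T}V_{k}\beta + \psi_{k}}$$

◮ $\frac{\gamma_{k}}{\gamma^{ME}_{k}} < 1 \Rightarrow \gamma^{ME}_{k} > \gamma_{k}$
◮ Higher weight to the direct estimator under the ME model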
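The inequality between the two shrinkage factors follows from a short calculation, worked out here as a supporting step. It assumes only that $V_{k}$ is a variance-covariance matrix (so the quadratic form is non-negative) and that $\psi_{k} > 0$; the shorthand $b_{k}$ is introduced for this derivation and does not appear in the slides.

```latex
% Shorthand (assumption for this derivation): b_k = \beta^T V_k \beta \ge 0
\begin{align*}
\gamma^{ME}_{k} - \gamma_{k}
  &= \frac{\sigma^{2}_{v} + b_{k}}{\sigma^{2}_{v} + b_{k} + \psi_{k}}
     - \frac{\sigma^{2}_{v}}{\sigma^{2}_{v} + \psi_{k}} \\
  &= \frac{(\sigma^{2}_{v} + b_{k})(\sigma^{2}_{v} + \psi_{k})
           - \sigma^{2}_{v}(\sigma^{2}_{v} + b_{k} + \psi_{k})}
          {(\sigma^{2}_{v} + b_{k} + \psi_{k})(\sigma^{2}_{v} + \psi_{k})} \\
  &= \frac{b_{k}\,\psi_{k}}
          {(\sigma^{2}_{v} + b_{k} + \psi_{k})(\sigma^{2}_{v} + \psi_{k})}
   \;\ge\; 0 .
\end{align*}
```

The difference is strictly positive whenever $b_{k} > 0$, that is, whenever the covariates carry measurement error in a direction that matters for the regression. When $V_{k} = 0$ the two factors coincide.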

SLIDE 20

Current Research

Alternative modelling frameworks

◮ Approach 1: Model the discontinuity directly
◮ Approach 2: Multivariate F-H model - Joint modelling of direct estimates from the old and pilot surveys

Further topics

◮ Implementing the measurement error F-H model
◮ Benchmarking of domain discontinuities to national estimates
