Variance Estimation in Complex Samples: The Finite Population Bootstrap Using Pseudo-Populations



SLIDE 1

03-11-2015, NTTS Brussels

Variance Estimation in Complex Samples: The Finite Population Bootstrap Using Pseudo-Populations

Andreas Quatember

SLIDE 2

Andreas Quatember: The Finite Population Bootstrap Using Pseudo-Populations 2/11

The Finite Population Bootstrap

The bootstrap method provides an alternative for variance estimation: “... probably the most flexible and efficient method of analyzing survey data” (Lahiri 2003)

Originally developed for the estimation of sampling distributions (of estimators) in i.i.d. situations (Efron 1979):

  • 1. i.i.d. random sample s of size n from a distribution
  • 2. Draw B i.i.d. random resamples of size n from s (MC version)
  • 3. In each resample, calculate the estimator under study
  • 4. For large B, the distribution of the B resample estimates approximates the sampling distribution of interest

How can this idea be applied to without-replacement sampling from finite populations?
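
Before turning to finite populations, the four steps of the i.i.d. bootstrap can be sketched in a few lines. This is only an illustration: the function name `iid_bootstrap`, the sample values, the choice of the mean as estimator and B = 2000 are all assumptions, not taken from the slides.

```python
import random
import statistics

def iid_bootstrap(sample, estimator, B=2000, seed=0):
    """Return B resample estimates of `estimator` (steps 2 and 3)."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(B):
        # Step 2: draw n values from s with replacement (an i.i.d. resample).
        resample = rng.choices(sample, k=n)
        # Step 3: evaluate the estimator under study on the resample.
        estimates.append(estimator(resample))
    return estimates

# Step 1: a hypothetical i.i.d. sample s of size n = 8.
s = [2.1, 3.4, 1.9, 5.0, 4.2, 2.8, 3.7, 4.4]
boot = iid_bootstrap(s, statistics.mean)

# Step 4: the spread of the B estimates approximates the sampling variance.
boot_variance = statistics.variance(boot)
```

For the sample mean, `boot_variance` should land close to `statistics.pvariance(s) / len(s)`, which is the value the resampling scheme targets.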

SLIDE 3

Different approaches are available (cf. Shao and Tu 1995):

“Ad-hoc approach” (cf. Ranalli and Mecatti 2012):

  • i.i.d. resampling plus an adequate choice of the resample sizes (cf. McCarthy and Snowden 1985)
  • i.i.d. resampling plus rescaling of observations (cf. Rao and Wu 1988)
  • Subsampling from the original sample under the original sampling scheme with an adapted sample size (cf. Sitter 1992)
  • Combining with- and without-replacement schemes (cf. Antal and Tillé 2011)

“Plug-in approach” (cf. Ranalli and Mecatti 2012):

  • Generating a bootstrap population (cf. Gross 1980)
SLIDE 4

Basic idea for SI samples and integer design-weights N/n:

  • 1. SI sample of size n from U
  • 2. Replicate each sample unit N/n times to generate a pseudo-population Up

HT approach: The idea behind $t_{HT} = \sum_{s} y_k \cdot \frac{1}{\pi_k} = \sum_{s} y_k \cdot d_k$: Sample value y1 is replicated d1 times, y2 is “cloned” d2 times, and so on. d1, d2, …, dn are “replication factors” of the HT approach.

  • 3. Draw B SI-resamples of size n from Up
  • 4. Calculate the estimator under study in each resample
  • 5. The MC distribution of these estimates serves as an estimator of the true sampling distribution of the estimator
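
The five steps above can be sketched as follows for the SI case with an integer weight N/n. The function name `pseudo_population_bootstrap`, the sample values, N = 20 and B = 2000 are hypothetical choices for illustration only.

```python
import random
import statistics

def pseudo_population_bootstrap(sample, N, estimator, B=2000, seed=0):
    """Bootstrap estimates from a pseudo-population (SI, integer N/n)."""
    n = len(sample)
    if N % n != 0:
        raise ValueError("this sketch assumes an integer design weight N/n")
    # Step 2: replicate every sample value N/n times to build Up.
    pseudo_population = [y for y in sample for _ in range(N // n)]
    rng = random.Random(seed)
    # Steps 3 and 4: draw B SI resamples (without replacement) of size n
    # from Up and evaluate the estimator in each of them.
    return [estimator(rng.sample(pseudo_population, n)) for _ in range(B)]

# Step 1: a hypothetical SI sample of size n = 5 from a population of N = 20.
s = [12.0, 15.5, 9.8, 20.1, 14.3]
N = 20
estimates = pseudo_population_bootstrap(s, N, statistics.mean)

# Step 5: the MC distribution of the estimates; here only its variance is used.
variance_estimate = statistics.variance(estimates)
```

Because the resamples are drawn without replacement from Up, `variance_estimate` carries the finite population correction (1 − n/N) automatically, which plain i.i.d. resampling from s would miss.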

SLIDE 5

The key to an efficient application of this procedure is the generation of an adequate pseudo-population Up.

For Up = U, this framework would perfectly simulate the SI sampling distribution of interest.

SLIDE 6

  • 1. Non-integer design weights

Booth et al. (1994): For the generation of Up, replicate each sampling unit k according to the integer part $i$ of its SI design weight $\frac{N}{n} = i + r$, resulting in $n \cdot i$ elements, and add $N - n \cdot i$ elements drawn by SI sampling from s.

Create C such pseudo-populations $U_p^{i}$ (i = 1, ..., C) and resample in each of them to account for the random nature of Up.
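
A minimal sketch of this construction, assuming SI sampling and the mean as estimator. The function names, the values of C and B, and the pooling of all C·B resample estimates into one list are assumptions made for illustration; the slide does not say how the results from the C pseudo-populations are combined.

```python
import random
import statistics

def booth_pseudo_population(sample, N, rng):
    """One random pseudo-population of size N for a non-integer N/n."""
    n = len(sample)
    i = N // n  # integer part of the SI design weight N/n
    # n*i elements: every sample unit replicated i times.
    pseudo = [y for y in sample for _ in range(i)]
    # N - n*i further elements drawn by SI sampling (without replacement) from s.
    pseudo.extend(rng.sample(sample, N - n * i))
    return pseudo

def booth_bootstrap(sample, N, estimator, C=20, B=200, seed=0):
    """Resample in each of C pseudo-populations; pooling is an assumption."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(C):
        pseudo = booth_pseudo_population(sample, N, rng)
        estimates.extend(estimator(rng.sample(pseudo, n)) for _ in range(B))
    return estimates

# Hypothetical data: n = 5 and N = 23, so N/n = 4.6 with integer part 4.
s = [12.0, 15.5, 9.8, 20.1, 14.3]
N = 23
pseudo = booth_pseudo_population(s, N, random.Random(1))
estimates = booth_bootstrap(s, N, statistics.mean)
```

In every generated pseudo-population, three of the five units appear five times and the other two appear four times; which units get the extra copy changes from one pseudo-population to the next.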

SLIDE 7

  • 2. General probability sampling with arbitrary πk’s

Holmberg (1998): For the generation of Up, replicate each sampling unit k according to the integer part ik of its design weight $d_k = i_k + r_k$ and randomly one more time with probability rk.
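
The replication rule can be sketched as below. The function name `holmberg_pseudo_population` and the example values of y and d are hypothetical; only the rule itself (i_k copies plus one more with probability r_k) comes from the slide.

```python
import math
import random

def holmberg_pseudo_population(y, d, rng):
    """Build one pseudo-population from sample values y and design weights d."""
    pseudo = []
    for y_k, d_k in zip(y, d):
        i_k = math.floor(d_k)  # integer part of the design weight
        r_k = d_k - i_k        # remainder in [0, 1)
        # i_k copies for sure, plus one more copy with probability r_k.
        copies = i_k + (1 if rng.random() < r_k else 0)
        pseudo.extend([y_k] * copies)
    return pseudo

# Hypothetical sample values and design weights d_k = 1 / pi_k.
y = [10.0, 4.0, 7.5]
d = [2.5, 4.0, 1.2]
pseudo = holmberg_pseudo_population(y, d, random.Random(0))
```

The size of the resulting pseudo-population is itself random whenever some r_k > 0, so it need not match the sum of the design weights.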

SLIDE 8

Three properties should apply (Barbiero and Mecatti 2010):

A) The total of auxiliary variable x in Up should be equal to t(x) in U
B) The total of y in Up should be equal to its HT estimator tHT
C) $E_{boot}(t_{HT,b}) = t_{HT}$

For Booth et al. (1994), or Holmberg (1998):

  • Violation of mimicking principle of the bootstrap approach by differing from the “nominal” Up for rk > 0
  • Recalculation of sample inclusion probabilities is necessary
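
One way to see how property B can fail under random rounding is to enumerate, for a very small sample, every pseudo-population a Holmberg-type rule can produce. This calculation is not on the slides; the function name `pseudo_total_distribution` and the data are assumptions chosen for illustration.

```python
import itertools
import math

def pseudo_total_distribution(y, d):
    """Exact distribution of the y-total of Up under random rounding of d_k."""
    base = sum(math.floor(d_k) * y_k for y_k, d_k in zip(y, d))
    remainders = [d_k - math.floor(d_k) for d_k in d]
    dist = {}
    # Enumerate every pattern of "extra copy" indicators (1 = one more copy).
    for extras in itertools.product([0, 1], repeat=len(y)):
        prob = 1.0
        for e, r_k in zip(extras, remainders):
            prob *= r_k if e else 1.0 - r_k
        if prob == 0.0:
            continue  # pattern cannot occur (needs an extra copy with r_k = 0)
        total = base + sum(e * y_k for e, y_k in zip(extras, y))
        dist[total] = dist.get(total, 0.0) + prob
    return dist

# Hypothetical sample values and design weights.
y = [10.0, 4.0, 7.5]
d = [2.5, 4.0, 1.25]
t_ht = sum(y_k * d_k for y_k, d_k in zip(y, d))  # HT estimator of the total

dist = pseudo_total_distribution(y, d)
expected_total = sum(t * p for t, p in dist.items())
prob_exact = sum(p for t, p in dist.items() if math.isclose(t, t_ht))
```

For these weights the total of Up equals tHT on average (`expected_total` matches `t_ht`), yet no single realisation reproduces it (`prob_exact` is zero), so every generated Up differs from the nominal one.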
SLIDE 9

The HT based bootstrap (HTB):

A natural development of the generation procedures proposed (Quatember 2014)

Based directly on the HT principle $t_{HT} = \sum_{s} y_k \cdot d_k$, allowing not only whole units with certain values from s in Up

Affects the drawing probabilities of the different units in Up

SLIDE 10

Summary of the results of a simulation study on the HTB finite population bootstrap approach:

  • Follows directly the mimicking principle (more understandable)
  • (Small) positive effect on the efficiency compared to other methods such as Holmberg (1998)
  • No recalculation of inclusion probabilities necessary (simpler algorithm)
  • Can still be used in situations where other methods fail (when some dk's are close to one)
  • Large pseudo-populations do not have to be generated physically when the probability mechanism can be used for the resampling process (cf. Ranalli and Mecatti 2012)

SLIDE 11

Thank you very much for your (hopefully non-pseudo) attention!