On balanced sampling and calibration estimation in survey sampling
Risto Lehtonen University of Helsinki
BaNoCoSS 2019, Örebro University, 16-20 June 2019
Topics to be addressed
Motivation
Representative strategy by Hájek
Balanced sampling
METRON - International Journal of Statistics 2011, vol. LXIX, n. 1, pp. 45-65 MATTI LANGEL – YVES TILLÉ
in the spirit of Jaroslav Hájek (1959, 1981)
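Balancing variables: $x_k = (x_{1k}, x_{2k}, \ldots, x_{Lk})'$
Calibration variables: $z_k = (z_{1k}, z_{2k}, \ldots, z_{Jk})'$
Balancing equations: $\sum_{k \in s} x_k / \pi_k = \sum_{k \in U} x_k$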
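The balancing requirement, that the HT estimate of each balancing variable's total reproduces the known population total, can be illustrated with a minimal sketch. It uses simple rejective sampling (redraw SRSWOR samples until the balancing equation holds within a tolerance) as a stand-in for the cube method, which is not implemented here. The population, the tolerance `tol`, and the function name `rejective_balanced_sample` are all assumptions made for illustration.

```python
import numpy as np

def rejective_balanced_sample(x, n, tol=0.01, max_tries=10_000, seed=0):
    """Draw SRSWOR samples until the balancing equation holds within tol.

    Rejective sampling is a stand-in here, not the cube method; it only
    illustrates what the balancing equation asks of a sample.
    """
    rng = np.random.default_rng(seed)
    N = len(x)
    pi = n / N                      # equal inclusion probabilities under SRSWOR
    t_x = x.sum()                   # known population total of x
    for _ in range(max_tries):
        s = rng.choice(N, size=n, replace=False)
        t_x_ht = (x[s] / pi).sum()  # HT estimate of the x-total
        if abs(t_x_ht - t_x) <= tol * t_x:
            return s, t_x_ht
    raise RuntimeError("no approximately balanced sample found")

rng = np.random.default_rng(1)
x = rng.uniform(1.0, 10.0, size=1000)   # hypothetical balancing variable
s, t_x_ht = rejective_balanced_sample(x, n=100)
rel_gap = abs(t_x_ht - x.sum()) / x.sum()
print(f"relative imbalance of the accepted sample: {rel_gap:.4f}")
```

Rejecting samples changes the inclusion probabilities somewhat, so this is only a picture of the constraint, not a design with exactly known inclusion probabilities.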
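HT estimator: $\hat{t}_{HT} = \sum_{k \in s} a_k y_k$, with design weights $a_k = 1/\pi_k$
Calibration equations: $\sum_{k \in s} w_k z_k = \sum_{k \in U} z_k$
Calibration estimator: $\hat{t}_{CAL} = \sum_{k \in s} w_k y_k$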
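Auxiliary vector: $z_k = (z_{1k}, z_{2k}, z_{3k}, z_{4k})'$
HT estimator for target variable $y_j$: $\hat{t}_{HTj} = \sum_{k \in s} a_k y_{jk}$, with $a_k = 1/\pi_k$
Calibration estimator: $\hat{t}_{CALj} = \sum_{k \in s} w_k y_{jk} = \hat{t}_{HTj} + (t_z - \hat{t}_{HTz})' \hat{B}_j$
where $t_z = \sum_{k \in U} z_k$, $\hat{t}_{HTz} = \sum_{k \in s} a_k z_k$ and $\hat{B}_j = \left( \sum_{k \in s} a_k z_k z_k' \right)^{-1} \sum_{k \in s} a_k z_k y_{jk}$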
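As a minimal sketch of linear calibration (the chi-square distance case of Deville and Särndal), the following computes calibrated weights from design weights and checks numerically that the weighted form and the regression form of the estimator agree. The population, the SRSWOR design, and all variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
N, n = 1000, 100

# Hypothetical population: intercept plus two auxiliary variables
Z = np.column_stack([np.ones(N), rng.uniform(1, 10, N), rng.normal(50, 5, N)])
y = 2.0 + 3.0 * Z[:, 1] + 0.5 * Z[:, 2] + rng.normal(0, 1, N)

s = rng.choice(N, size=n, replace=False)   # SRSWOR sample
a = np.full(n, N / n)                      # design weights a_k = 1/pi_k
Zs, ys = Z[s], y[s]

t_z = Z.sum(axis=0)                        # known population totals of z
t_z_ht = a @ Zs                            # HT estimates of those totals
M = (Zs * a[:, None]).T @ Zs               # sum over s of a_k z_k z_k'

# Linear calibration: w_k = a_k * (1 + z_k' lam), lam = M^{-1} (t_z - t_z_ht)
lam = np.linalg.solve(M, t_z - t_z_ht)
w = a * (1.0 + Zs @ lam)

t_ht = a @ ys                              # HT estimator of the y-total
t_cal = w @ ys                             # calibration estimator
B = np.linalg.solve(M, (Zs * a[:, None]).T @ ys)
t_greg = t_ht + (t_z - t_z_ht) @ B         # regression (GREG) form

print("calibration equations hold:", np.allclose(w @ Zs, t_z))
print("weighted and regression forms agree:", np.isclose(t_cal, t_greg))
```

Other distance functions (raking, logit) need an iterative solver; the closed form above is specific to the linear case.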
Extracted from Deville & Tillé (2004) p. 909 Table 1
Table 2. Correlation of auxiliary variables with target variables in the population, and R square for the regression model (N = 280)
Rows: auxiliary variables z1, z2, z3, z4 and R square. Columns: target variables y1, ..., y6.

Target variable   Balancing & HT   Balancing & CAL
y1                0.90             0.76
y2                0.91             0.87
y3                0.80             0.82
y4                0.21             0.11
y5                0.15             0.08
y6                0.26             0.14

Correlation of aux. var. z
       z1     z2     z3     z4
z1     1.00   0.99
z2     0.99   1.00
z3     0.98   0.99   1.00
COMMENT: An interesting empirical exploration of the interplay between balanced sampling and calibration estimation, based on simulation experiments with real survey data. Several strategies are compared by combining balanced and non-balanced sampling with Horvitz-Thompson and calibration estimators.
www.statisticsjournal.lt
Sampling            Penalized balanced sampling   Balanced sampling   Simple random sampling
Estimation          HT        LMM                 HT        LMM       LMM
m( ), Exponential   1         1.00                1.00      1.00      1.07
m( ), Exponential   1         0.66                0.99      0.66      0.66
Extracted from Table 1 in Breidt & Chauvet (2012) p. 953
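Domain total: $t_d = \sum_{k \in U_d} y_k$
Direct estimators, with $a_k = 1/\pi_k$ and $s_d = s \cap U_d$:
$\hat{t}_{dHT} = \sum_{k \in s_d} a_k y_k$
$\hat{t}_{dHA} = N_d \frac{\sum_{k \in s_d} a_k y_k}{\sum_{k \in s_d} a_k}$
Calibration estimators with calibrated weights $w_{dk}$:
$\hat{t}_{dCAL\text{-}HT} = \sum_{k \in s_d} w_{dk} y_k$
$\hat{t}_{dCAL\text{-}HA} = N_d \frac{\sum_{k \in s_d} w_{dk} y_k}{\sum_{k \in s_d} w_{dk}}$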
Calibration equations for domain d: $\sum_{k \in s_d} w_{dk} z_k = \sum_{k \in U_d} z_k$
Relative root mean squared error over K simulated samples $s_1, \ldots, s_K$:
$RRMSE(\hat{t}_d) = \frac{1}{t_d} \sqrt{\frac{1}{K} \sum_{i=1}^{K} \left( \hat{t}_d(s_i) - t_d \right)^2}$
Table 4. Median RRMSE (%) of design-based direct HT and Hájek estimators for totals for 40 domains in three domain sample size classes, in a simulation experiment of 10,000 SRSWOR samples of 2000 units from a synthetic population of one million units.

                       Expected domain sample size
Estimator              Minor (12)   Medium (40)   Major (122)   All
Horvitz-Thompson       29.00        15.77         8.79          15.80
Hájek                  4.60         1.85          0.91          1.96
MFC-HT                 8.82         1.62          0.78          1.72
MFC-HA                 6.39         1.89          0.91          1.98
MC-HT                  4.29         1.58          0.78          1.67
MC-HA                  4.53         1.85          0.91          1.96

Horvitz-Thompson: $\hat{t}_{dHT} = \sum_{k \in s_d} a_k y_k$
Hájek: $\hat{t}_{dHA} = N_d \frac{\sum_{k \in s_d} a_k y_k}{\sum_{k \in s_d} a_k}$
MFC. Calibration vectors: $z_k = (1, x_{1k}, x_{2k}, x_{3k})'$ and $z_k = (x_{1k}, x_{2k}, x_{3k})'$
Model-assisted calibration MC. Model: $y_k = x_k' \beta + u_d + \varepsilon_k$, $k \in U_d$, $d = 1, \ldots, D$
Model vector: $x_k = (1, x_{1k}, x_{2k}, x_{3k})'$
Calibration vectors: $z_k = (1, \hat{y}_k)'$ and $z_k = \hat{y}_k$
Extracted from Lehtonen & Veijanen (2019)
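The gap between the HT and Hájek rows can be reproduced in miniature. The sketch below is not the experiment of Lehtonen & Veijanen (2019); it uses a small hypothetical population, one domain, and far fewer replications, and it only illustrates why conditioning on the known domain size N_d removes the variability caused by the random domain sample size under SRSWOR.

```python
import numpy as np

rng = np.random.default_rng(2019)
N, n, K = 10_000, 200, 500            # population size, sample size, replications

# Hypothetical population with one domain of N_d = 1000 units
in_domain = np.zeros(N, dtype=bool)
in_domain[:1000] = True
y = rng.normal(100.0, 10.0, size=N)

N_d = in_domain.sum()
t_d = y[in_domain].sum()              # true domain total
a = N / n                             # constant design weight under SRSWOR

est_ht = np.empty(K)
est_ha = np.empty(K)
for i in range(K):
    s = rng.choice(N, size=n, replace=False)
    sd = s[in_domain[s]]              # sampled units that fall in the domain
    est_ht[i] = a * y[sd].sum()       # HT: sum of a_k y_k over s_d
    # Hajek: N_d * sum(a_k y_k) / sum(a_k); with constant a_k this is N_d * mean
    est_ha[i] = N_d * y[sd].mean() if sd.size > 0 else np.nan

def rrmse(est, truth):
    """Relative root mean squared error in percent, ignoring empty-domain draws."""
    return 100.0 * np.sqrt(np.nanmean((est - truth) ** 2)) / truth

rrmse_ht = rrmse(est_ht, t_d)
rrmse_ha = rrmse(est_ha, t_d)
print(f"RRMSE (%)  HT: {rrmse_ht:.2f}   Hajek: {rrmse_ha:.2f}")
```

With an expected domain sample size of 20, most of the HT error comes from the random number of sampled domain units, which the Hájek ratio form cancels.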
experiment of 100 SRSWOR samples from population U. Upper panel: HT type estimators; lower panel: Hájek type estimators.
Breidt, F.J. and Chauvet, G. (2012) Penalized balanced sampling. Biometrika, 99, 945–958.
Deville, J.-C. (2000) Generalized calibration and application to weighting for non-response. In: Bethlehem J.G. and van der Heijden, P.G.M. (eds) COMPSTAT. Physica, Heidelberg.
Deville, J.-C. and Särndal, C.-E. (1992) Calibration estimators in survey sampling. Journal
Deville, J.-C. and Tillé, Y. (2004) Efficient balanced sampling: The cube method. Biometrika, 91, 893–912.
Dirdaite, I. and Krapavickaite, D. (2016) Application of balanced sampling, non-response and calibrated estimator. Lithuanian Journal of Statistics, 55, 81–90.
Guggemos, F. and Tillé, Y. (2010) Penalized calibration in survey sampling: Design-based estimation assisted by mixed models. Journal of Statistical Planning and Inference, 140, 3199–3212.
Hájek, J. (1959) Optimum strategy and other problems in probability sampling. Casopis pro Pestováni Matematiky, 84, 387–423.
Hájek, J. (1981) Sampling from a Finite Population. New York: Marcel Dekker.
Lehtonen, R. and Veijanen, A. (2012) Small area poverty estimation by model calibration. Journal of the Indian Society of Agricultural Statistics, 66, 125–133.
Lehtonen, R. and Veijanen, A. (2016) Design-based methods to small area estimation and calibration approach. In: Pratesi M. (Ed.) Analysis of Poverty Data by Small Area Estimation. Chichester: Wiley.
Lehtonen, R. and Veijanen, A. (2017) A two-level hybrid calibration technique for small area
Lehtonen, R. and Veijanen, A. (2019) Small domain estimation with calibration methods. ITACOSM 2019 Conference, 5-7 June 2019, Florence, Italy.
Montanari, G.E. and Ranalli, M.G. (2009) Multiple and ridge model calibration. Proceedings of Workshop on Calibration and Estimation in Surveys 2009. Statistics Canada.
Särndal, C.-E. (2007) The calibration approach in survey theory and practice. Survey Methodology, 33, 99–119.
Tillé, Y. (2011) Ten years of balanced sampling with the cube method: An appraisal. Survey Methodology, 37, 215–226.
Wu, C. and Sitter, R.R. (2001) A model-calibration approach to using complete auxiliary information from survey data. Journal of the American Statistical Association, 96, 185–193.