Calibration of Small Area Estimates in Business Surveys

  1. CALIBRATION OF SMALL AREA ESTIMATES IN BUSINESS SURVEYS
     Rodolphe Priam, Natalie Shlomo
     Southampton Statistical Sciences Research Institute, University of Southampton, United Kingdom
     SAE, August 2011
     The BLUE-ETS Project is financed by grant agreement no. 244767 under Theme 8 of the 7th Framework Programme (FP7) of the European Union, Socio-economic Sciences and Humanities.
     Page 1, Trier, August 2011

  2. BUSINESS SURVEYS
     • Statistical units are organisational entities in a country
     • Interest lies in small area/domain estimates
     • Business registers provide unit-level covariates
     • Distributions are typically skewed, with outliers
     • Transformations, such as the log, are applied so that normality assumptions hold

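  3. SMALL AREA ESTIMATION
     • A central problem in many areas of social statistics, recently also used in business statistics.
     • Estimation of the mean in diverse domains: areas $i = 1, \dots, M$ have true means $\bar{Y}_1, \bar{Y}_2, \dots, \bar{Y}_i, \dots, \bar{Y}_m, \dots, \bar{Y}_M$ and design-based estimates $\hat{\bar{Y}}_{1;w}, \hat{\bar{Y}}_{2;w}, \dots, \hat{\bar{Y}}_{i;w}, \dots, \hat{\bar{Y}}_{m;w}, \dots, \hat{\bar{Y}}_{M;w}$.
     • The design-based estimate $\hat{\bar{Y}}_{i;w}$ of the true population mean $\bar{Y}_i$ is imprecise because of the small area sample size $n_i$.
     • Estimated small area mean (EBLUP): $\hat{\theta}_{i;y}$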

  4. SMALL AREA ESTIMATION AND BENCHMARKING
     • Small area estimation of the total in the different domains: true values $\bar{Y}_1, \bar{Y}_2, \dots, \bar{Y}_i, \dots, \bar{Y}_m, \dots, \bar{Y}_M$ and model-based estimates $\hat{\theta}_{1;y}, \hat{\theta}_{2;y}, \dots, \hat{\theta}_{i;y}, \dots, \hat{\theta}_{m;y}, \dots, \hat{\theta}_{M;y}$.
     • Problem: the total estimated by the model, $\tilde{T}_y = \sum_i w_i \hat{\theta}_{i;y}$, should match the design-based estimate of the population total, $\hat{T}_y = \sum_i w_i \hat{\bar{Y}}_{i;w}$.
     • Solution: benchmark the estimates with an appropriate method.
     • Consequence: estimation that is more robust to misspecification of the model.

  5. NESTED ERROR UNIT LEVEL MODEL
     • The Battese, Harter and Fuller (1988) (BHF) model for small areas $i = 1, \dots, M$:
       $$Y_i = X_i \beta + 1_{N_i} u_i + e_i$$
     • The target parameter of interest is the area mean:
       $$\bar{Y}_i = 1_{N_i}' Y_i / N_i$$
     • The EBLUP for non-negligible sampling fractions $f_i$:
       $$\hat{\theta}^{f}_{i;y} = f_i \bar{y}_i + (1 - f_i)\left[\bar{X}_{ic}' \hat{\beta}_{GLS} + \hat{u}_i\right]$$
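The EBLUP formula on this slide can be sketched numerically. The snippet below is a minimal illustration, not the authors' implementation: it assumes that $\hat{\beta}$, $\hat{\sigma}_u^2$ and $\hat{\sigma}_e^2$ have already been estimated, and that the random effect is predicted with the usual shrinkage form $\hat{u}_i = \hat{\gamma}_i(\bar{y}_i - \bar{x}_i'\hat{\beta})$ with $\hat{\gamma}_i = \hat{\sigma}_u^2 / (\hat{\sigma}_u^2 + \hat{\sigma}_e^2 / n_i)$, which the slide does not spell out. The function name and all input values are hypothetical.

```python
import numpy as np

def eblup_area_mean(y_s, x_s, xbar_c, N_i, beta, sigma2_u, sigma2_e):
    """EBLUP of one area mean under the nested error model.

    y_s: sampled responses (n_i,); x_s: sampled covariate rows (n_i, p);
    xbar_c: covariate mean of the non-sampled units (p,).
    beta, sigma2_u and sigma2_e are treated as already estimated (assumption).
    """
    n_i = len(y_s)
    f_i = n_i / N_i                                   # sampling fraction
    gamma_i = sigma2_u / (sigma2_u + sigma2_e / n_i)  # shrinkage factor (assumed form)
    u_hat = gamma_i * (y_s.mean() - x_s.mean(axis=0) @ beta)
    return f_i * y_s.mean() + (1.0 - f_i) * (xbar_c @ beta + u_hat)

# Hypothetical toy inputs: two sampled units out of N_i = 10
y_s = np.array([4.0, 6.0])
x_s = np.array([[1.0, 2.0], [1.0, 4.0]])
estimate = eblup_area_mean(y_s, x_s, xbar_c=np.array([1.0, 3.0]), N_i=10,
                           beta=np.array([1.0, 1.0]), sigma2_u=1.0, sigma2_e=2.0)
print(estimate)
```

With a small sampling fraction the estimate is dominated by the synthetic part $\bar{X}_{ic}'\hat{\beta} + \hat{u}_i$, which is why the quality of the model matters so much in small areas.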

  6. BENCHMARKING AT THE LINEAR SCALE (1/2)
     • Existing methods considered (see for instance Wang et al. (2008)):
       o The ratio method, with a multiplicative term:
         $$\hat{\theta}^{RT}_{i;y} = \hat{\theta}^{f}_{i;y}\,\big(\tilde{T}^{f}_y\big)^{-1}\,\hat{T}_y$$
       o An additive term with variance weighting:
         $$\hat{\theta}^{VAR}_{i;y} = \hat{\theta}^{f}_{i;y} + \frac{N_i\left(\hat{\sigma}_u^2 + \hat{\sigma}_e^2 / n_i\right)}{\sum_{k=1}^{m} N_k^2\left(\hat{\sigma}_u^2 + \hat{\sigma}_e^2 / n_k\right)}\,\big(\hat{T}_y - \tilde{T}^{f}_y\big)$$
       o Pfeffermann and Barnard (1991):
         $$\hat{\theta}^{PB}_{i;y} = f_i \bar{y}_i + (1 - f_i)\left[\bar{X}_{ic}' \hat{\beta}_{PB} + \hat{u}_{i;PB}\right]$$
         $$\hat{\eta}_{PB} = \hat{\eta} - \hat{C} R' \left(R \hat{\eta} - r\right) / \left(R \hat{C} R'\right)$$
         where $\hat{\eta} = (\hat{\beta}_{GLS}', \hat{u}_1, \dots, \hat{u}_M)'$, $r = \hat{T}_y - \sum_i n_i \bar{y}_i$, $\hat{C}$ is the estimated covariance matrix associated with $\hat{\eta}$, and
         $$R = \Big(\sum_{i=1}^{M} (N_i - n_i)\,\bar{X}_{ic}',\; N_1 - n_1,\; N_2 - n_2,\; \dots,\; N_m - n_m,\; N_{m+1},\; \dots,\; N_M\Big)$$
     • Ugarte et al. (2009) applied this constrained model to a business survey for several regions, with variance calculations
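The ratio and variance-weighted adjustments can be checked numerically. The sketch below assumes that the benchmark weights are the area population sizes ($w_i = N_i$), so that the constraint reads $\sum_i N_i \hat{\theta}_i = \hat{T}_y$; the function names and the toy inputs are hypothetical.

```python
import numpy as np

def ratio_benchmark(theta, N, T_hat):
    """Multiply every area estimate by T_hat / sum(N * theta)."""
    return theta * T_hat / np.sum(N * theta)

def variance_weighted_benchmark(theta, N, n, T_hat, sigma2_u, sigma2_e):
    """Add a share of the discrepancy, larger where the model variance is larger."""
    v = sigma2_u + sigma2_e / n              # per-area variance term
    weights = N * v / np.sum(N**2 * v)
    return theta + weights * (T_hat - np.sum(N * theta))

# Hypothetical inputs: three areas, design-based total T_hat
theta = np.array([10.0, 20.0, 30.0])   # unbenchmarked model estimates of area means
N = np.array([100, 200, 300])          # area population sizes
n = np.array([5, 10, 15])              # area sample sizes
T_hat = 15000.0

theta_rt = ratio_benchmark(theta, N, T_hat)
theta_var = variance_weighted_benchmark(theta, N, n, T_hat, sigma2_u=0.01, sigma2_e=0.09)
print(np.sum(N * theta), np.sum(N * theta_rt), np.sum(N * theta_var))
```

Both adjusted sets of estimates reproduce the design-based total; they differ in how the discrepancy is spread across areas (proportionally to the estimate for the ratio method, proportionally to $N_i$ times the variance term for the additive method).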

  7. BENCHMARKING AT THE LINEAR SCALE (2/2)
     • We propose the following method: augment the unconstrained least-squares system by adding one row and one column to the original GLS system:
       $$\begin{pmatrix} y_s \\ y_{+;a} \end{pmatrix} = \begin{pmatrix} X_s & w_a \\ X_{+;a}' & w_{+;a} \end{pmatrix} \beta^{PSW}_a + e_a$$
       where $w_a = (w_{1;a}', w_{2;a}', \dots, w_{m;a}')'$, and the added entries $w_{i;a}$, $X_{+;a}$, $w_{+;a}$ and $y_{+;a}$ are functions of $N_i$, $n_i$, $\hat{\gamma}_i$, $\bar{X}_{ic}$ and the area sample means.
     • The benchmarking equation is obtained by orthogonality of the residual to the newly added column

  8. SIMULATION FOR LINEAR CASE
     • Nested error unit level regression model
     • B = 1000 populations generated
     • M = 30 areas (no empty areas), $f_i \approx 4\%$
     • $\sigma_u = 0.1$, $\sigma_e = 0.3$ and $\beta = (2, 0.25)^T$
     • $x_{ij} \sim N(m_i, s_i)$; $m_i \sim N(10, 3)$; $s_i = 2$
     [Figure panels: one population generated; two areas in the population]
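One population of this simulation design can be generated along the following lines. This is only a sketch under stated assumptions: the slide does not give the area sizes, so a common $N_i = 250$ with $n_i = 10$ is assumed to obtain $f_i = 4\%$; $N(10, 3)$ is read as mean 10 and variance 3, $s_i = 2$ is read as a standard deviation, and simple random sampling without replacement within areas is assumed.

```python
import numpy as np

rng = np.random.default_rng(2011)

M, N_i, n_i = 30, 250, 10          # N_i and n_i are assumed; n_i / N_i = 4%
beta = np.array([2.0, 0.25])
sigma_u, sigma_e = 0.1, 0.3

m = rng.normal(10.0, np.sqrt(3.0), size=M)   # area covariate means (variance 3 assumed)
s = 2.0                                      # within-area spread (standard deviation assumed)
u = rng.normal(0.0, sigma_u, size=M)         # area random effects

x = rng.normal(m[:, None], s, size=(M, N_i))     # covariate x_ij
e = rng.normal(0.0, sigma_e, size=(M, N_i))      # unit-level errors
y = beta[0] + beta[1] * x + u[:, None] + e       # nested error model response y_ij

# Simple random sample without replacement inside each area (assumed design)
sample_idx = np.array([rng.choice(N_i, size=n_i, replace=False) for _ in range(M)])
y_s = np.take_along_axis(y, sample_idx, axis=1)

true_means = y.mean(axis=1)       # targets of estimation
direct_means = y_s.mean(axis=1)   # design-based estimates from the sample
print(np.abs(direct_means - true_means).mean())
```

Repeating this B = 1000 times and applying each estimator to every replicate would give the kind of Monte Carlo summaries reported on the following slides.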

  9. SIMULATION RESULT FOR LINEAR CASE (1/2)
     Estimators: 1 EBLUP ($\hat{\theta}^{f}_{i;y}$); 2 Ratio Benchmark ($\hat{\theta}^{RT}_{i;y}$); 3 Variance Weighted Benchmark ($\hat{\theta}^{VAR}_{i;y}$); 4 Pfeffermann and Barnard Benchmark ($\hat{\theta}^{PB}_{i;y}$); 5 Proposed Method Benchmark ($\hat{\theta}^{PSW}_{i;y}$)

     |         | 1       | 2     | 3     | 4     | 5     |
     |---------|---------|-------|-------|-------|-------|
     | BIASREL | 0.06%   | 0.58% | 0.60% | 0.60% | 0.60% |
     | AARB    | 0.04%   | 0.60% | 0.62% | 0.62% | 0.62% |
     | ARMSE   | 1.31%   | 1.45% | 1.46% | 1.46% | 1.47% |
     | DIFFTOT | 4.0x10^2 | 0.000 | 0.000 | 0.000 | 0.000 |

  10. SIMULATION RESULT FOR LINEAR CASE (2/2)
     [Figure: results by area (areas 1 to 30 on the horizontal axis, vertical axis from -0.004 to 0.012) for the five estimators: 1 EBLUP, 2 Ratio Benchmark, 3 Variance Weighted Benchmark, 4 Pfeffermann and Barnard Benchmark, 5 Proposed Method Benchmark]

  11. LOG TRANSFORMATION FOR SKEWED VARIABLE
     • In the BHF model, $y_{ij} = x_{ij}'\beta + u_i + e_{ij}$
     • In business surveys, distributions are skewed
       o Log-normal transformation: $z_{ij} = \exp\left(x_{ij}'\beta + u_i + e_{ij}\right)$
       o New formulation of the predictors

  12. BACK-TRANSFORMATION WITH BIAS CORRECTION
     • A nearly unbiased estimator is:
       $$\hat{\theta}^{f,sum}_{i;z} = f_i \bar{z}_i + (1 - f_i)\,\frac{1}{N_i - n_i} \sum_{j \in U_i \setminus s_i} \exp\left(\hat{y}_{ij} + \hat{\alpha}_i\right) \qquad (1)$$
       The bias correction $\hat{\alpha}_i$ can be defined at the unit level or at the area level (see Chambers, Dorfman (2003) and Molina (2009)).
     • Another formulation, from Kurnia, Notodiputro, Chambers (2009):
       $$\hat{\theta}^{*,\exp}_{i;z} = \exp\left(\hat{\theta}^{*}_{i;y} + \tilde{\alpha}_i\right) \qquad (2)$$
       o The bias correction $\tilde{\alpha}_i$ is the modified term at the area level
       o We propose the corrective term $\tilde{\alpha}_{2i}$ and compare it to $\tilde{\alpha}_{1i}$, where $\hat{\Sigma}_i$ is the covariance matrix of the covariates.

  13. BACK-TRANSFORMATION WITH BIAS CORRECTION
     • Approaches under model (1):
       o Chambers, Dorfman (2003) introduce several estimators, including the rast predictor and the smearing predictor
       o Fabrizi, Ferrante, Pacei (2007) compare estimators to a naïve predictor without a bias correction. The twiced smearing estimator performed best in simulation
       o Chandra, Chambers (2011) discuss calibration after a log-transformation

  14. BENCHMARKING AFTER BACK-TRANSFORMATION
     Compare benchmarking at different stages, with back-transformation and bias correction by:
       (a) $\hat{\alpha}_i = \left(\hat{\sigma}_u^2 + \hat{\sigma}_e^2\right) / 2$, or
       (b) $\tilde{\alpha}_{2i} = \hat{\alpha}_i + \hat{\beta}' \hat{\Sigma}_i \hat{\beta} / 2$
     • Ratio method under different scenarios:
       o $\hat{\theta}^{f,RT}_{i;z}$: no benchmark at log scale, back-transformed method (2), bias correction (a)
       o $\hat{\theta}^{VAR,RT}_{i;z}$, $\hat{\theta}^{PB,RT}_{i;z}$, $\hat{\theta}^{PSW,RT}_{i;z}$: benchmark at log scale, back-transformed method (2), bias correction (a)
       o $\hat{\theta}^{f,sum,RT}_{i;z}$: no benchmark at log scale, back-transformed method (1), bias correction (a)
       o $\hat{\theta}^{f2,RT}_{i;z}$: no benchmark at log scale, back-transformed method (2), bias correction (b)
     • $\hat{\theta}^{MLC}_{i;z}$: maximization of the log-likelihood of the BHF model under constraints, back-transformed method (2) and bias correction (b)
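Why a bias correction is needed after exponentiating can be seen with a short Monte Carlo check. The sketch below uses the variance components of the linear simulation and a hypothetical value of the linear predictor $x'\beta$; it treats the parameters as known and averages over both $u$ and $e$, so it only illustrates correction (a), not the full estimators compared on this slide.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_u, sigma_e = 0.1, 0.3
linear_part = 4.5          # hypothetical value of x'beta for one unit
draws = 200_000

# Log-scale responses y = x'beta + u + e, then back-transformed z = exp(y)
y = linear_part + rng.normal(0.0, sigma_u, draws) + rng.normal(0.0, sigma_e, draws)
mean_z = np.exp(y).mean()

alpha_a = (sigma_u**2 + sigma_e**2) / 2        # bias correction (a)
naive = np.exp(linear_part)                    # back-transform without correction
corrected = np.exp(linear_part + alpha_a)      # back-transform with correction (a)

print(naive / mean_z, corrected / mean_z)
```

The naive back-transform falls short of the simulated mean of $z$ by a factor close to $\exp(-\hat{\alpha}_i)$, while the corrected version is close to it; correction (b) adds a further term for the spread of the covariates within the area.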

  15. SIMULATION RESULT FOR NON-LINEAR CASE (1/2)
     Scenarios compared:
       o No benchmark at log scale, back-transformed method (2), bias correction (a), ratio adjusted
       o Benchmark at log scale, back-transformed method (2), bias correction (a), ratio adjusted
       o No benchmark at log scale, back-transformed method (1), bias correction (a), ratio adjusted
       o No benchmark at log scale, back-transformed method (2), bias correction (b), ratio adjusted
       o MLC adjustment, back-transformed method (2), bias correction (b)
     The not-benchmarked columns (1a to 6a) cover $\hat{\theta}^{f}_{i;z}$, $\hat{\theta}^{f2}_{i;z}$, $\hat{\theta}^{f,sum}_{i;z}$, $\hat{\theta}^{VAR}_{i;z}$, $\hat{\theta}^{PB}_{i;z}$ and $\hat{\theta}^{PSW}_{i;z}$; the benchmarked columns (1b to 7b) cover $\hat{\theta}^{f,RT}_{i;z}$, $\hat{\theta}^{f2,RT}_{i;z}$, $\hat{\theta}^{f,sum,RT}_{i;z}$, $\hat{\theta}^{VAR,RT}_{i;z}$, $\hat{\theta}^{PB,RT}_{i;z}$, $\hat{\theta}^{PSW,RT}_{i;z}$ and $\hat{\theta}^{MLC}_{i;z}$.

     NOT BENCHMARKED
     |         | 1a       | 2a       | 3a       | 4a       | 5a       | 6a       |
     |---------|----------|----------|----------|----------|----------|----------|
     | BIASREL | 0.39%    | 11.16%   | 0.47%    | 8.77%    | 8.77%    | 8.75%    |
     | AARB    | 0.66%    | 10.89%   | 0.28%    | 8.50%    | 8.49%    | 8.49%    |
     | ARMSE   | 5.81%    | 12.05%   | 5.75%    | 10.01%   | 10.01%   | 10.02%   |
     | DIFFTOT | 5.6x10^4 | 3.0x10^5 | 7.1x10^4 | 2.5x10^5 | 2.5x10^5 | 2.5x10^5 |

     BENCHMARKED
     |         | 1b    | 2b    | 3b    | 4b    | 5b    | 6b    | 7b    |
     |---------|-------|-------|-------|-------|-------|-------|-------|
     | BIASREL | 2.99% | 2.84% | 3.03% | 2.83% | 2.87% | 2.90% | 2.58% |
     | AARB    | 3.30% | 3.15% | 3.34% | 3.15% | 3.18% | 3.20% | 2.89% |
     | ARMSE   | 6.87% | 6.84% | 6.90% | 6.84% | 6.86% | 6.90% | 6.69% |
     | DIFFTOT | 0.00  | 0.00  | 0.00  | 0.00  | 0.00  | 0.00  | 0.00  |
