
Constructing better coverage intervals for some estimators computed from a complex probability sample: a round table summary


  1. Constructing better coverage intervals for some estimators computed from a complex probability sample: a round table summary. Phillip S. Kott, RTI International

  2. Suppose $\hat{t}$ is a nearly (i.e., asymptotically) unbiased estimator for a parameter $t$ estimated with data drawn via a probability survey. The one-sided Wald coverage intervals for $t$ are

     $$t \le \hat{t} + \Phi^{-1}(\alpha)\sqrt{v} \quad\text{and}\quad t \ge \hat{t} - \Phi^{-1}(\alpha)\sqrt{v},$$

     where $v$ is an estimator for $V$, the variance of $\hat{t}$, and $\Phi(\cdot)$ is the cumulative distribution function of a standard normal distribution.

  3. It is well known that when the sample size is large enough, each inequality holds for roughly $100\alpha$ percent of samples drawn using the same sampling design as the probability sample. A symmetric two-sided $100\alpha$-percent Wald interval is

     $$\hat{t} - \Phi^{-1}([1+\alpha]/2)\sqrt{v} \;\le\; t \;\le\; \hat{t} + \Phi^{-1}([1+\alpha]/2)\sqrt{v}.$$
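
As a minimal sketch of the Wald bounds described in these slides, the following Python uses only the standard library. The function names and the example inputs are illustrative assumptions, not values from the presentation, and $\alpha$ is treated as a proportion in (0, 1).

```python
from math import sqrt
from statistics import NormalDist


def wald_one_sided(t_hat, v, alpha):
    """Return (lower, upper) one-sided Wald bounds for t.

    Each bound is meant to be used on its own:
    t >= lower, or t <= upper, each at nominal level alpha.
    """
    z = NormalDist().inv_cdf(alpha)  # z = Phi^{-1}(alpha)
    half = z * sqrt(v)
    return t_hat - half, t_hat + half


def wald_two_sided(t_hat, v, alpha):
    """Return the symmetric two-sided Wald interval (lower, upper)."""
    z = NormalDist().inv_cdf((1 + alpha) / 2)  # Phi^{-1}([1 + alpha] / 2)
    half = z * sqrt(v)
    return t_hat - half, t_hat + half


# Hypothetical inputs: an estimate of 0.30 with estimated variance 0.0004.
lower, upper = wald_one_sided(t_hat=0.30, v=0.0004, alpha=0.95)
two_sided = wald_two_sided(t_hat=0.30, v=0.0004, alpha=0.95)
```

The one-sided and two-sided versions differ only in the quantile that is passed to `inv_cdf`.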

  4. Kott and Liu (2010) proposed using the following skewness-adjusted one-sided intervals in place of the Wald intervals:

     $$t \le \hat{t} + \omega + z\sqrt{v + \omega^2} \quad\text{and}\quad t \ge \hat{t} + \omega - z\sqrt{v + \omega^2},$$

     where

     $$\omega = \frac{1}{6}(1 - z^2)\frac{m_3}{v} + \frac{1}{2}z^2 b,$$

     $z = \Phi^{-1}(\alpha)$, $m_3$ is a nearly unbiased estimator for the third central moment of $\hat{t}$, $M_3 = E[(\hat{t} - t)^3]$, and

  5. $b$ is a nearly unbiased estimator for $B = E[v(\hat{t} - t)]/V$, the regression of $v$ on $\hat{t} - t$. In

     $$\omega = \frac{1}{6}(1 - z^2)\frac{m_3}{v} + \frac{1}{2}z^2 b,$$

     the term $\frac{1}{2}z^2 b$ accounts for $v$ varying with $\hat{t} - t$, and the term $\frac{1}{6}(1 - z^2)\frac{m_3}{v}$ accounts for $\hat{t}$ being skewed.
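
The skewness-adjusted bounds can be sketched in a few lines of Python. This sketch assumes the intervals take the form $\hat{t} + \omega \pm z\sqrt{v + \omega^2}$ with $\omega = \frac{1}{6}(1 - z^2)\,m_3/v + \frac{1}{2}z^2 b$. The function names and inputs are hypothetical, and the estimates `v`, `m3`, and `b` are assumed to have been computed elsewhere.

```python
from math import sqrt
from statistics import NormalDist


def omega(v, m3, b, z):
    """Shift term: (1/6)(1 - z^2) m3 / v + (1/2) z^2 b (assumed form)."""
    return (1 - z ** 2) * m3 / (6 * v) + z ** 2 * b / 2


def skew_adjusted_one_sided(t_hat, v, m3, b, alpha):
    """Return (lower, upper) skewness-adjusted one-sided bounds.

    Assumes bounds of the form t_hat + omega -/+ z * sqrt(v + omega^2).
    With m3 = 0 and b = 0 this reduces to the ordinary Wald bounds.
    """
    z = NormalDist().inv_cdf(alpha)
    w = omega(v, m3, b, z)
    half = z * sqrt(v + w ** 2)
    return t_hat + w - half, t_hat + w + half


# Hypothetical inputs; b is set to m3 / v purely for illustration.
v, m3 = 0.0004, 2e-6
bounds = skew_adjusted_one_sided(t_hat=0.30, v=v, m3=m3, b=m3 / v, alpha=0.95)
```

Setting both `m3` and `b` to zero is a quick sanity check, since the result should then match the Wald bounds.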

  6. If $b \approx m_3/v$, which is often true, then

     $$\omega \approx \left(\frac{1}{6} + \frac{z^2}{3}\right)\frac{m_3}{v}.$$

     Let $\hat{\gamma} = m_3/v^{3/2}$ be the estimated skewness of $\hat{t}$, and $\gamma = M_3/V^{3/2}$ be the measure $\hat{\gamma}$ is estimating. When $\hat{\gamma}^2 \approx 0$:

     $$t \le \hat{t} + \left(\frac{1}{6} + \frac{z^2}{3}\right)\hat{\gamma}\sqrt{v} + z\sqrt{v} \quad\text{and}\quad t \ge \hat{t} + \left(\frac{1}{6} + \frac{z^2}{3}\right)\hat{\gamma}\sqrt{v} - z\sqrt{v}.$$

     These are the Wald intervals shifted by $(1/6 + z^2/3)\,\hat{\gamma}\sqrt{v}$.
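
The algebra behind this simplification can be written out as follows. It is a worked sketch that assumes the full intervals have the form $\hat{t} + \omega \pm z\sqrt{v + \omega^2}$ and substitutes $b \approx m_3/v$ and $m_3 = \hat{\gamma}\,v^{3/2}$.

```latex
\begin{align*}
\omega &= \frac{1}{6}(1 - z^2)\frac{m_3}{v} + \frac{1}{2}z^2 b \\
       &\approx \left(\frac{1}{6} - \frac{z^2}{6} + \frac{z^2}{2}\right)\frac{m_3}{v}
        = \left(\frac{1}{6} + \frac{z^2}{3}\right)\frac{m_3}{v}
        = \left(\frac{1}{6} + \frac{z^2}{3}\right)\hat{\gamma}\sqrt{v}, \\
\sqrt{v + \omega^2}
       &\approx \sqrt{v}\,\sqrt{1 + \left(\frac{1}{6} + \frac{z^2}{3}\right)^{2}\hat{\gamma}^{2}}
        \approx \sqrt{v} \quad \text{when } \hat{\gamma}^{2} \approx 0.
\end{align*}
```

For $\alpha = 0.95$, $z \approx 1.645$, so the shift is roughly $1.07\,\hat{\gamma}\sqrt{v}$.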

  7. A Stratified Multistage Sample

     Consider now constructing a coverage interval for a parameter $t$ based on a stratified multistage sample when a nearly unbiased estimator for that parameter can be put in the form

     $$\hat{t} = \sum_{h=1}^{H} \frac{1}{n_h} \sum_{i=1}^{n_h} \hat{t}_{hi},$$

     where there are $n_h$ primary sampling units (PSUs) in stratum $h$, and each $\hat{t}_{hi}$ for a PSU $i$ in stratum $h$ is a nearly unbiased estimator for the same value.

  8. We make the common (but often inaccurate) assumption that the PSUs were selected randomly but with replacement. We focus on the difference between two domain means estimated using data from the same sample, $S$. The estimated difference in domain means can be expressed as

     $$\bar{y}_{(1)} - \bar{y}_{(2)} = \frac{\sum_{k \in S} w_k y_k d_k^{(1)}}{\sum_{k \in S} w_k d_k^{(1)}} - \frac{\sum_{k \in S} w_k y_k d_k^{(2)}}{\sum_{k \in S} w_k d_k^{(2)}}.$$

  9. When all $n_h \ge 3$, the following equalities can be used:

     $$v = \sum_{h=1}^{H} \sum_{i=1}^{n_h} \frac{(e_{hi} - \bar{e}_h)^2}{n_h (n_h - 1)}, \qquad m_3 = \sum_{h=1}^{H} \sum_{i=1}^{n_h} \frac{(e_{hi} - \bar{e}_h)^3}{n_h (n_h - 1)(n_h - 2)}, \qquad\text{and}\qquad b = \frac{m_3}{v},$$

     where each $e_{hi}$ has the following linearized expression:

     $$e_{hi} = n_h \sum_{k \in S_{hi}} w_k \left\{ \frac{d_k^{(1)}}{\hat{N}_1}\,[y_k - \bar{y}_{(1)}] - \frac{d_k^{(2)}}{\hat{N}_2}\,[y_k - \bar{y}_{(2)}] \right\}.$$
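
A minimal sketch of these stratum-level sums follows. It assumes $v$ and $m_3$ are the usual with-replacement estimators, with denominators $n_h(n_h - 1)$ and $n_h(n_h - 1)(n_h - 2)$, and that the linearized values $e_{hi}$ are already available. The function name and the input layout (one list of $e_{hi}$ values per stratum) are hypothetical.

```python
def stratified_moments(e_by_stratum):
    """Return (v, m3, b) from linearized PSU values grouped by stratum.

    e_by_stratum: list of lists; each inner list holds the e_hi values
    for the PSUs of one stratum (assumed layout).
    """
    v = 0.0
    m3 = 0.0
    for e_h in e_by_stratum:
        n_h = len(e_h)
        if n_h < 3:
            # The m3 denominator contains (n_h - 2), so three PSUs are needed.
            raise ValueError("each stratum needs at least three PSUs")
        e_bar = sum(e_h) / n_h
        v += sum((e - e_bar) ** 2 for e in e_h) / (n_h * (n_h - 1))
        m3 += sum((e - e_bar) ** 3 for e in e_h) / (n_h * (n_h - 1) * (n_h - 2))
    return v, m3, m3 / v


# Hypothetical e_hi values for two strata.
v, m3, b = stratified_moments([[1.0, 2.0, 6.0], [0.0, 0.0, 3.0, 1.0]])
```

Raising an error for strata with fewer than three PSUs makes the requirement explicit rather than letting a division by zero surface later.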

  10. Some Simple Approximations

      Even if software were available, or a statistician wanted to program the equations herself, there may not be three PSUs in every stratum. Unlike collapsing strata for variance estimation, the direction of the potential bias of $\hat{\gamma}$ can be positive or negative when the population means of the strata differ. Consequently, strata collapsed together should have (nearly) equal expected population means.

  11. A key to skewness-adjusted coverage intervals is the estimated value $b = m_3/v = \hat{\gamma}\sqrt{v}$. The value of this term for the difference between proportions estimated for two distinct domains from a simple random sample is approximately

      $$b = \frac{m_3}{v} = \frac{p_1(1 - p_1)(1 - 2p_1)/n_1^2 - p_2(1 - p_2)(1 - 2p_2)/n_2^2}{p_1(1 - p_1)/n_1 + p_2(1 - p_2)/n_2}.$$

  12. When $p_1 = p_2$, this collapses to

      $$b \approx (1 - 2p_1)\left(\frac{1}{n_1} - \frac{1}{n_2}\right).$$

      That appears to suggest that when assessing the difference between proportions in two distinct domains, one should divide the domain sample sizes by their respective design effects. But the design effect captures the impact of clustering, stratification, and unequal weighting on the variance of an estimator, not on its third central moment.
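
The simple-random-sample expression for $b$ and its collapse when $p_1 = p_2$ can be checked numerically. The sketch below assumes the binomial-type moments $p(1-p)(1-2p)/n^2$ and $p(1-p)/n$ for each domain; the function name and the example values are hypothetical.

```python
def b_diff_proportions(p1, n1, p2, n2):
    """b = m3 / v for the difference of two domain proportions under SRS."""
    m3 = (p1 * (1 - p1) * (1 - 2 * p1) / n1 ** 2
          - p2 * (1 - p2) * (1 - 2 * p2) / n2 ** 2)
    v = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2
    return m3 / v


# Hypothetical case with equal proportions and unequal domain sizes.
p, n1, n2 = 0.2, 100, 400
b = b_diff_proportions(p, n1, p, n2)
collapsed = (1 - 2 * p) * (1 / n1 - 1 / n2)  # closed form when p1 == p2
```

When the two domain sample sizes are also equal, `b` is zero, so the skewness adjustment vanishes in that case.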

  13. A wiser procedure for an estimated proportion $p = \sum_{k \in S} w_k y_k / \sum_{k \in S} w_k$, where each $y_k$ is 0 or 1, might be to estimate $B = M_3/V$ with

      $$b_{\text{simple}} = \frac{\sum_{k \in S} w_k^3}{\sum_{k \in S} w_k^2 \,\sum_{k \in S} w_k}\,(1 - 2p),$$

      and then insert $\hat{\gamma}_{\text{simple}} = b_{\text{simple}}/\sqrt{v}$ into the skewness-adjusted coverage intervals. This estimate ignores the impact of stratification and clustering on $b$.
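
A minimal sketch of this weighted calculation is given below, assuming $b_{\text{simple}} = (1 - 2p)\sum w_k^3 / (\sum w_k^2 \sum w_k)$. The function name and the toy data are hypothetical.

```python
def b_simple_proportion(weights, y):
    """b_simple for a weighted proportion; y holds 0/1 values."""
    sw = sum(weights)
    sw2 = sum(w ** 2 for w in weights)
    sw3 = sum(w ** 3 for w in weights)
    p = sum(w * yk for w, yk in zip(weights, y)) / sw
    return (1 - 2 * p) * sw3 / (sw2 * sw)


# Toy data: with equal weights the result reduces to (1 - 2p) / n.
b_equal = b_simple_proportion([1.0, 1.0, 1.0, 1.0], [1, 0, 0, 0])

# Unequal weights (hypothetical values).
b_unequal = b_simple_proportion([1.0, 2.0, 4.0, 1.0], [1, 0, 0, 1])
```

Dividing the result by $\sqrt{v}$ gives $\hat{\gamma}_{\text{simple}}$, where $v$ would come from the survey's usual variance estimator.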

  14. For the difference between two domain means: $\hat{\gamma}_{\text{simple}} = b_{\text{simple}}/\sqrt{v}$ with

      $$b_{\text{simple}} = \frac{p_1(1 - p_1)(1 - 2p_1)/n_1^2 - p_2(1 - p_2)(1 - 2p_2)/n_2^2}{p_1(1 - p_1)/n_1^* + p_2(1 - p_2)/n_2^*},$$

      where

      $$n_a^2 = \frac{\left(\sum_{k \in S_a} w_k\right)^3}{\sum_{k \in S_a} w_k^3} \quad\text{and}\quad n_a^* = \frac{\left(\sum_{k \in S_a} w_k\right)^2}{\sum_{k \in S_a} w_k^2}.$$
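
The weight-based sample sizes can be sketched as follows, assuming $n_a^2 = (\sum w_k)^3 / \sum w_k^3$ and $n_a^* = (\sum w_k)^2 / \sum w_k^2$ with sums over the sample in domain $a$. The function names and the example weights are hypothetical.

```python
def effective_sizes(weights):
    """Return (n_sq, n_star) for one domain's weights.

    n_sq plays the role of n_a^2 in the third-moment term;
    n_star plays the role of n_a^* in the variance term.
    """
    s1 = sum(weights)
    s2 = sum(w ** 2 for w in weights)
    s3 = sum(w ** 3 for w in weights)
    return s1 ** 3 / s3, s1 ** 2 / s2


def b_simple_diff(p1, weights1, p2, weights2):
    """b_simple for the difference of two weighted domain proportions."""
    n1_sq, n1_star = effective_sizes(weights1)
    n2_sq, n2_star = effective_sizes(weights2)
    m3 = (p1 * (1 - p1) * (1 - 2 * p1) / n1_sq
          - p2 * (1 - p2) * (1 - 2 * p2) / n2_sq)
    v = p1 * (1 - p1) / n1_star + p2 * (1 - p2) / n2_star
    return m3 / v


# With equal weights the sizes reduce to n^2 and n.
n_sq, n_star = effective_sizes([2.0] * 5)
```

With equal weights in both domains, `b_simple_diff` reproduces the simple-random-sample expression.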

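  15. For a more general population or domain mean of a $y$-variable, one can replace $\hat{\gamma}$ with $\hat{\gamma}_{\text{simple}} = b_{\text{simple}}/\sqrt{v}$, where

      $$b_{\text{simple}} = \frac{\sum_{k \in S} w_k^3 (y_k - \bar{y})^3}{\sum_{k \in S} w_k \,\sum_{k \in S} w_k^2 (y_k - \bar{y})^2} \quad\text{and}\quad \bar{y} = \sum_{k \in S} w_k y_k \Big/ \sum_{k \in S} w_k.$$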
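
The general-mean version can be sketched in the same way, assuming $b_{\text{simple}} = \sum w_k^3 (y_k - \bar{y})^3 / [\sum w_k \sum w_k^2 (y_k - \bar{y})^2]$. The function name and the data are hypothetical.

```python
def b_simple_mean(weights, y):
    """b_simple for a weighted mean of an arbitrary y-variable."""
    sw = sum(weights)
    y_bar = sum(w * yk for w, yk in zip(weights, y)) / sw
    num = sum(w ** 3 * (yk - y_bar) ** 3 for w, yk in zip(weights, y))
    den = sw * sum(w ** 2 * (yk - y_bar) ** 2 for w, yk in zip(weights, y))
    return num / den


# For 0/1 data with equal weights this matches (1 - 2p) / n.
b_binary = b_simple_mean([1.0, 1.0, 1.0, 1.0], [1.0, 0.0, 0.0, 0.0])

# A right-skewed toy variable gives a positive value.
b_skewed = b_simple_mean([1.0, 1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0, 8.0])
```

The binary case is a useful consistency check, because the result should agree with the proportion formula $(1 - 2p)\sum w_k^3 / (\sum w_k^2 \sum w_k)$.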

  16. Calibration Weighting and the Jackknife

      Calibration weighting often removes much of the impact of stratification and clustering from an estimated mean. As a result, estimating the skewness of an estimated proportion or mean using $b_{\text{simple}}$ may not be unreasonable, although it would often be better to replace the $y_k$ with a calibrated residual.

  17. If calibrated jackknife weights have been constructed to compute $v$ for an estimator $\hat{t}$, then these weights can also be used in estimating the third central moment of $\hat{t}$:

      $$m_{3(J)} = \sum_{h=1}^{H} \sum_{i=1}^{n_h} \frac{(n_h - 1)^2}{n_h (n_h - 2)}\,(\hat{t} - \hat{t}_{(hi)})^3,$$

      where $\hat{t}$ is computed with the calibrated weights, and $\hat{t}_{(hi)}$ is computed with the calibrated weights for the stratum-$h$ PSU-$i$ jackknife replicate.
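
A minimal sketch of the jackknife third-moment estimate follows. It assumes the replicate estimates $\hat{t}_{(hi)}$ are already available, grouped by stratum, and uses the difference $\hat{t} - \hat{t}_{(hi)}$ inside the cube. That sign is an assumption, chosen because it makes the result agree with the direct formula $\sum_h \sum_i (e_{hi} - \bar{e}_h)^3 / [n_h(n_h - 1)(n_h - 2)]$ for a linear estimator. The function name and the numbers are hypothetical.

```python
def jackknife_m3(t_hat, replicates_by_stratum):
    """Jackknife estimate of the third central moment of t_hat.

    replicates_by_stratum: list of lists; each inner list holds the
    delete-one-PSU replicate estimates for one stratum (assumed layout).
    """
    m3 = 0.0
    for reps in replicates_by_stratum:
        n_h = len(reps)
        if n_h < 3:
            # The factor below divides by (n_h - 2).
            raise ValueError("each stratum needs at least three PSUs")
        factor = (n_h - 1) ** 2 / (n_h * (n_h - 2))
        m3 += factor * sum((t_hat - t_rep) ** 3 for t_rep in reps)
    return m3


# Linear toy case: one stratum with PSU values 1, 2, 6, so t_hat = 3.
# Dropping each PSU in turn gives replicate means 4.0, 3.5, and 1.5.
m3_j = jackknife_m3(3.0, [[4.0, 3.5, 1.5]])
```

In this toy case the direct formula gives $[(-2)^3 + (-1)^3 + 3^3] / (3 \cdot 2 \cdot 1) = 3$, which is the value the jackknife version should reproduce.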

  18. National Institute of Statistical Sciences at the JSM. NISS/SAMSI Reception (National Institute of Statistical Sciences; Statistical and Applied Mathematical Sciences Institute): Mon, 7/29, 6:00 PM to 9:00 PM. Room: H-Capitol Ballroom 7.
