Parametric and Semiparametric Prediction of Finite Population Total - PowerPoint PPT Presentation



SLIDE 1

Eideh, A. Informative Nonignorable Universidad Carlos III de Madrid

Parametric and Semiparametric Prediction of Finite Population Total Under Informative Sampling and Nonignorable Nonresponse (IN) (Theory and In Progress)

Abdulhakeem Eideh
Department of Mathematics, Al-Quds University, Abu-Dees Campus, Al-Quds, Palestine
E-mail: msabdul@staff.alquds.edu
Date: 7 November 2019

• Depto. Estadística, Universidad Carlos III de Madrid

SLIDE 2

Outline
• Introduction
• Sample distribution
• Response distribution
• Estimation of response weights
• General theory
• Parametric prediction
• Semiparametric prediction
• Simple ratio population model
• Multiple regression population model
• Conclusions

SLIDE 3

Introduction
• According to Pfeffermann et al. (1998), survey data may be viewed as the outcome of two random processes:
  • The process generating the values in the finite population, often referred to as the 'superpopulation model'.
  • The process selecting the sample data from the finite population values, known as the 'sample selection mechanism'.
• Analytic inference from survey data relates to the superpopulation model,

SLIDE 4

• Standard analysis of survey data often fails to account for the complex nature of the sampling design (stratification, clustering, unequal probability of selection, informative sampling).
• The effect of the sample design on the analysis is due to the fact that the models in use typically do not incorporate all the design variables determining the sample selection, either because there may be too many of them or because they are not of substantive interest.
• However, if the sampling design is informative, standard estimates of the model parameters can be severely biased, possibly leading to false inference, since the sample distribution differs from that of the population.
• Three methods for dealing with the effect of unequal selection probabilities and informative sampling are discussed in the literature.

SLIDE 5

• Classical methods:
  • Probability weighting
  • Pseudo-likelihood estimation
• To overcome the difficulties associated with the use of classical inference procedures, Pfeffermann et al. (1998) proposed the use of the sample distribution (the third method) induced by the assumed population models, under informative sampling, in the case of cross-sectional survey data.
• Eideh (2002, PhD) fitted time series models for longitudinal survey data and two-stage clustered data (SAE; prediction and estimation) under informative sampling, and treated nonignorable nonresponse as informative sampling.

SLIDE 6

Sample distribution (distribution after selection)
• The sample distribution is the distribution of the observed outcomes given the selected sample.
• $U = \{1, \ldots, N\}$ denotes a finite population consisting of $N$ units.
• $y$ is the study variable of interest.
• $y_i$ is the value of $y$ for the $i$th population unit.
• $x_i$, $i \in U$, is the value of the auxiliary variable(s) $x$.

SLIDE 7

• $z = (z_1, \ldots, z_N)$ are the values of a known design variable, used for the sample selection process but not included in the working model under consideration (secondary data analysis).
• $\pi_i = \Pr(i \in s)$: first-order selection probabilities.
• $w_i = 1/\pi_i$: sampling weight, $i = 1, \ldots, N$.
• $y_1, \ldots, y_N$ are independent random variables with pdf $f_p(y_i \mid x_i, \theta)$, indexed by a vector parameter $\theta$ (distribution before selection).

SLIDE 8

• Pfeffermann et al. (1998): the sample pdf of $y_i$ is

$$ f_s(y_i \mid x_i, \theta, \gamma) = f_p(y_i \mid x_i, i \in s) = \frac{\Pr(i \in s \mid y_i, x_i, \gamma)\, f_p(y_i \mid x_i, \theta)}{\Pr(i \in s \mid x_i, \theta, \gamma)} = \frac{E_p(\pi_i \mid y_i, x_i, \gamma)\, f_p(y_i \mid x_i, \theta)}{E_p(\pi_i \mid x_i, \theta, \gamma)} $$

where

$$ E_p(\pi_i \mid x_i, \theta, \gamma) = \int E_p(\pi_i \mid y_i, x_i, \gamma)\, f_p(y_i \mid x_i, \theta)\, dy_i $$

Modeling $E_p(\pi_i \mid y_i, x_i, \gamma)$:
• Pfeffermann et al. (1999): linear and exponential.
• Eideh (2002): logit and probit.
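The sample pdf above can be illustrated numerically. The following is a minimal sketch, assuming a discrete study variable and a made-up inclusion model; the names `sample_pmf`, `f_p` and `pi_given_y` are hypothetical and not part of the presentation. It applies $f_s(y) = E_p(\pi \mid y) f_p(y) / E_p(\pi)$ and shows that the mean under the sample distribution differs from the population mean when selection depends on $y$.

```python
from fractions import Fraction


def sample_pmf(pop_pmf, incl_prob):
    """Return f_s(y) = E_p(pi | y) * f_p(y) / E_p(pi) for a discrete y."""
    # E_p(pi) = sum over y of E_p(pi | y) * f_p(y): the normalizing constant.
    norm = sum(incl_prob[y] * p for y, p in pop_pmf.items())
    return {y: incl_prob[y] * p / norm for y, p in pop_pmf.items()}


# Hypothetical population model: y uniform on {1, 2, 3, 4}.
f_p = {y: Fraction(1, 4) for y in (1, 2, 3, 4)}
# Hypothetical informative design: inclusion probability proportional to y.
pi_given_y = {y: Fraction(y, 10) for y in f_p}

f_s = sample_pmf(f_p, pi_given_y)
mean_p = sum(y * p for y, p in f_p.items())  # population mean
mean_s = sum(y * p for y, p in f_s.items())  # mean under the sample pdf
print(mean_p, mean_s)
```

Exact fractions are used so that the comparison between the two means is free of rounding error; with a noninformative design (constant `pi_given_y`) the two distributions coincide.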

SLIDE 9

• Based on the sample data $\{(y_i, x_i, w_i) : i \in s\}$, we can estimate $\theta$ in two steps:
• Step one: estimate $\gamma$, using

$$ E_s(w_i \mid y_i, x_i, \gamma) = \frac{1}{E_p(\pi_i \mid y_i, x_i, \gamma)} $$

• Step two: maximize

$$ l_{rs}(\theta, \tilde\gamma) = \sum_{i=1}^{n} \log f_p(y_i \mid x_i, \theta) + \sum_{i=1}^{n} \log E_s(w_i \mid x_i, \theta, \tilde\gamma) $$

Variance estimation of $\hat\theta$:
• Pfeffermann and Sverchkov (2003), Eideh and Nathan (2006, 2009), and Eideh (2009) have considered the use of the inverse information for estimating the variance of the maximum sample likelihood estimators, but they treat the informativeness parameter estimates as fixed. For a theoretical justification of this practice, see Bonnéry et al. (2018).

SLIDE 10

Response distribution
• Three random processes:
  1. The process generating the values in the finite population,
  2. The sample selection process (sampling design),
  3. The nonresponse process.
• See Pfeffermann et al. (1998) for informative sample selection.
• For inference problems, Little (1982) classifies the nonresponse mechanism as ignorable (MAR and MCAR) or nonignorable (NMAR).

SLIDE 11

Cross-classification of sampling design and nonresponse mechanism

Sampling design | Nonresponse: ignorable | Nonresponse: nonignorable
informative     | II - Observed          | IN - Missing
noninformative  | NI - Observed          | NN - Observed

• Brick (2013): … Thus, bias is often the largest component of mean square error of the estimates

SLIDE 12

Notation
• $U = \{1, \ldots, N\}$: finite population.
• $y_i$: value of $y$ for the $i$th population unit.
• $x_i$, $i \in U$: value of the auxiliary variable $x$.
• $z = (z_1, \ldots, z_N)$: values of known design variables.
• $\pi_i = \Pr(i \in s \mid x, y, z)$.
• $w_i = 1/\pi_i$: sampling weight, $i = 1, \ldots, N$.
• $y_1, \ldots, y_N$: independent random variables with pdf $f_p(y_i \mid x_i, \theta)$.

SLIDE 13

• $I_i = 1$ if unit $i \in U$ is sampled, $I_i = 0$ otherwise.
• $s = \{i \mid i \in U, I_i = 1\}$: sample, of size $n$.
• $\bar s = \{i \mid i \in U, I_i = 0\}$: nonsampled units.
• $R_i = 1$ if unit $i \in s$ is observed, $R_i = 0$ otherwise.
• $r = \{i \in s \mid R_i = 1\}$: response set.
• $\bar r = \{i \in s \mid R_i = 0\}$: nonresponse set.
• $\psi_i = \Pr(i \in r \mid x, y, v)$: response probability.
• $\phi_i = 1/\psi_i$: response weight.
SLIDE 14

• Nonsampled distribution, Sverchkov and Pfeffermann (2004):

$$ f_{\bar s}(y_i \mid x_i, \theta, \gamma) = f_p(y_i \mid x_i, \theta, \gamma, i \notin s) = \frac{E_p(1 - \pi_i \mid y_i, x_i, \gamma)\, f_p(y_i \mid x_i, \theta)}{E_p(1 - \pi_i \mid x_i, \theta, \gamma)} $$

• Response distribution, Eideh (2002, 2007):

$$ f_r(y_i \mid x_i, \theta, \gamma, \eta) = f_p(y_i \mid x_i, \theta, \gamma, \eta, i \in r) = \frac{E_s(\psi_i \mid y_i, x_i, \eta)\, f_s(y_i \mid x_i, \theta, \gamma)}{E_s(\psi_i \mid x_i, \theta, \gamma, \eta)} $$

• Nonresponse distribution, Eideh (2007, 2009):

$$ f_{\bar r}(y_i \mid x_i, \theta, \gamma, \eta) = f_p(y_i \mid x_i, \theta, \gamma, \eta, i \in \bar r) = \frac{E_s(1 - \psi_i \mid y_i, x_i, \eta)\, f_s(y_i \mid x_i, \theta, \gamma)}{E_s(1 - \psi_i \mid x_i, \theta, \gamma, \eta)} $$

SLIDE 15

Relationship between $E_p$, $E_s$, $E_{\bar s}$, $E_r$, $E_{\bar r}$
• Pfeffermann and Sverchkov (1999):

$$ E_p(y_i \mid x_i) = \frac{E_s(w_i y_i \mid x_i)}{E_s(w_i \mid x_i)} $$

• Sverchkov and Pfeffermann (2004):

$$ E_{\bar s}(y_i \mid x_i) = \frac{E_s[(w_i - 1) y_i \mid x_i]}{E_s[(w_i - 1) \mid x_i]} $$

• Eideh (2007, 2009):

$$ E_p(y_i \mid x_i) = \frac{E_r(\phi_i w_i y_i \mid x_i)}{E_r(\phi_i w_i \mid x_i)} $$

$$ E_{\bar r}(y_i \mid x_i) = \frac{E_r[(\phi_i - 1) y_i \mid x_i]}{E_r[(\phi_i - 1) \mid x_i]} $$
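These moment relationships can be checked exactly on a toy example. The sketch below is illustrative only: the discrete population model, the inclusion probabilities `pi` and the response probabilities `psi` are made-up assumptions, and `tilt` and `mean` are hypothetical helper names. It builds the sample and response distributions by reweighting and then verifies that weighting by $w_i$ (or by $\phi_i w_i$) recovers the population mean, while the unweighted respondent mean does not.

```python
from fractions import Fraction


def tilt(pmf, prob):
    """Reweight a pmf by a selection (or response) probability and renormalize."""
    norm = sum(prob[y] * p for y, p in pmf.items())
    return {y: prob[y] * p / norm for y, p in pmf.items()}


def mean(pmf, g):
    """Expectation of g(y) under a discrete pmf."""
    return sum(g(y) * p for y, p in pmf.items())


f_p = {y: Fraction(1, 4) for y in (1, 2, 3, 4)}   # assumed population pmf
pi = {y: Fraction(y, 10) for y in f_p}            # assumed informative design
psi = {y: Fraction(1, y + 1) for y in f_p}        # assumed NMAR response model

f_s = tilt(f_p, pi)      # sample distribution
f_r = tilt(f_s, psi)     # response distribution


def w(y):
    return 1 / pi[y]     # sampling weight


def phi(y):
    return 1 / psi[y]    # response weight


mean_p = mean(f_p, lambda y: y)
naive_r = mean(f_r, lambda y: y)                        # unweighted respondents
from_s = mean(f_s, lambda y: w(y) * y) / mean(f_s, w)   # E_s(wy) / E_s(w)
from_r = (mean(f_r, lambda y: phi(y) * w(y) * y)
          / mean(f_r, lambda y: phi(y) * w(y)))         # E_r(phi w y) / E_r(phi w)
nonresp = (mean(f_r, lambda y: (phi(y) - 1) * y)
           / mean(f_r, lambda y: phi(y) - 1))           # E_r((phi-1)y) / E_r(phi-1)
print(mean_p, naive_r, from_s, from_r, nonresp)
```

Because the weights here are deterministic functions of $y$, the identities hold exactly; in practice $\phi_i$ is unknown and must be estimated.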

SLIDE 16

$$ E_r(\phi_i \mid y_i) = \frac{1}{E_s(\psi_i \mid y_i)} $$

$$ E_s(y_i \mid x_i) = \frac{E_r(\phi_i y_i \mid x_i)}{E_r(\phi_i \mid x_i)} $$

$$ E_{\bar s}(y_i \mid x_i) = \frac{E_r[\phi_i (w_i - 1) y_i \mid x_i]}{E_r[\phi_i (w_i - 1) \mid x_i]} $$

$$ f_r(y_i \mid x_i) = \frac{E_r(\phi_i w_i \mid x_i)}{E_r(\phi_i w_i \mid x_i, y_i)}\, f_p(y_i \mid x_i) $$

$$ f_{\bar r}(y_i \mid x_i) = \frac{E_r[(\phi_i - 1) \mid x_i, y_i]}{E_r[(\phi_i - 1) \mid x_i]}\, f_r(y_i \mid x_i) $$

$$ f_{\bar s}(y_i \mid x_i) = \frac{E_r[\phi_i (w_i - 1) \mid x_i, y_i]}{E_r[\phi_i (w_i - 1) \mid x_i]}\, f_r(y_i \mid x_i) $$

SLIDE 17

Estimation of response weights $\phi_i = 1/\psi_i$ for all $i \in r$, Sverchkov (2008)
• If the nonresponse mechanism is NMAR, then the values of $y_i$ are available for $i \in r$ but not for $i \notin r$, so we can't fit the following model directly:

$$ \psi(x_i, y_i; \eta) = \Pr(R_i = 1 \mid y_i, x_i, i \in s) = \frac{\exp(\eta_0 + \eta_1 x_i + \eta_2 y_i)}{1 + \exp(\eta_0 + \eta_1 x_i + \eta_2 y_i)} $$

• $R_i \sim \text{Bernoulli}(\psi(x_i, y_i; \eta))$:

$$ f(r_i \mid x_i, y_i) = \psi(x_i, y_i; \eta)^{r_i}\, [1 - \psi(x_i, y_i; \eta)]^{1 - r_i} $$

• MLE of $\eta$:

SLIDE 18

$$ l(\eta) = \sum_{i \in r} \log \psi(x_i, y_i; \eta) + \sum_{i \in \bar r} \log [1 - \psi(x_i, y_i; \eta)] $$

• Using the missing information principle, take the expectation of the likelihood equation with respect to the superpopulation model:

$$ \sum_{i \in r} \frac{\partial \log \psi(x_i, y_i; \eta)}{\partial \eta} + \sum_{i \in \bar r} \frac{E_r\{[\psi^{-1}(x_i, y_i; \eta) - 1]\, \partial \log[1 - \psi(x_i, y_i; \eta)] / \partial \eta \mid O\}}{E_r\{[\psi^{-1}(x_i, y_i; \eta) - 1] \mid O\}} = 0, \qquad O = O_s \cup O_r $$

• $\hat\psi_i = \psi(x_i, y_i; \hat\eta)$. For more information, see Riddles, Kim (2016).
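To make the response model concrete, here is a minimal sketch of a logistic response probability, the implied response weight, and the full-data Bernoulli log-likelihood. The three-coefficient parametrization and the names `response_prob`, `response_weight` and `full_data_loglik` are assumptions made for illustration. The log-likelihood needs $y_i$ for nonrespondents as well, which is exactly what is missing under NMAR and why the slides replace it by its conditional expectation.

```python
import math


def response_prob(x, y, eta):
    """Logistic response probability psi(x, y; eta) with eta = (eta0, eta1, eta2)."""
    z = eta[0] + eta[1] * x + eta[2] * y
    return 1.0 / (1.0 + math.exp(-z))


def response_weight(x, y, eta):
    """Response weight phi = 1 / psi."""
    return 1.0 / response_prob(x, y, eta)


def full_data_loglik(data, eta):
    """Bernoulli log-likelihood over (x, y, r) triples.

    Requires y for nonrespondents (r == 0) too, so it cannot be
    evaluated directly from the observed data under NMAR.
    """
    total = 0.0
    for x, y, r in data:
        p = response_prob(x, y, eta)
        total += math.log(p) if r == 1 else math.log(1.0 - p)
    return total


# Hypothetical coefficients: response becomes less likely as y grows.
eta_example = (1.0, 0.0, -0.5)
print(response_prob(0.0, 2.0, eta_example), response_weight(0.0, 2.0, eta_example))
```

Setting the coefficient on `y` to zero gives a response probability that depends on `x` only, which corresponds to an ignorable (MAR) mechanism in this sketch.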

SLIDE 19

Parametric estimation under IN
• Assume the response measurements are independent:

$$ L_{r,in}(\theta, \gamma, \eta) = \prod_{i=1}^{m} f_r(y_i \mid x_i, \theta, \gamma, \eta) = \prod_{i=1}^{m} \frac{E_s(\psi_i \mid y_i, x_i, \eta)\, E_p(\pi_i \mid y_i, x_i, \gamma)\, f_p(y_i \mid x_i, \theta)}{E_s(\psi_i \mid x_i, \theta, \gamma, \eta)\, E_p(\pi_i \mid x_i, \theta, \gamma)} $$

• Weighted response likelihood – weights
• Estimation: four-step method
• Step 1: estimation of $\phi_i$: $\hat\phi_i = 1/\hat\psi_i$ (done).
• Step 2: estimation of the effect of the nonresponse mechanism:

$$ E_s(\psi_i \mid y_i, x_i, \eta) = \frac{1}{E_r(\phi_i \mid y_i, x_i, \eta)} $$

SLIDE 20

• $\eta$ can be estimated by regressing $\hat\phi_i$ on $(x_i, y_i)$, using the data set $\{(\hat\phi_i, y_i, x_i) : i \in r\}$.
• Step 3:

$$ E_p(y_i \mid x_i) = \frac{E_r(\phi_i w_i y_i \mid x_i)}{E_r(\phi_i w_i \mid x_i)} = E_r(l_i y_i \mid x_i), \qquad l_i = \frac{\phi_i w_i}{E_r(\phi_i w_i \mid x_i)} $$

• $\gamma$ can be estimated using regression analysis.
• Step 4: maximize

$$ l_{r,in}(\theta, \tilde\gamma, \tilde\eta) = l_{ni}(\theta) - \sum_{i=1}^{m} \log E_p(\pi_i \mid x_i, \theta, \tilde\gamma) - \sum_{i=1}^{m} \log E_s(\psi_i \mid x_i, \theta, \tilde\gamma, \tilde\eta) $$

where

$$ l_{ni}(\theta) = \sum_{i=1}^{m} \log f_p(y_i \mid x_i, \theta) $$

SLIDE 21

"New test" of nonignorable nonresponse (in progress)

$$ f_r(y_i \mid x_i) = \frac{E_r(\phi_i w_i \mid x_i)}{E_r(\phi_i w_i \mid x_i, y_i)}\, f_p(y_i \mid x_i) $$

Note that $f_r(y_i \mid x_i) \ne f_p(y_i \mid x_i)$ unless $E_r(\phi_i w_i \mid x_i, y_i) = E_r(\phi_i w_i \mid x_i)$.

• 1. Tests for missing data mechanisms: assume

$$ E_r(\hat\phi_i w_i \mid x_i, y_i) = \lambda_0 + \lambda_x x_i + \lambda_y y_i \quad \text{for all } i \in r $$

  • Test NMAR: test $H_0: \lambda_y = 0$.
  • Test MCAR: test $H_0: \lambda_x = \lambda_y = 0$.
  • Test MAR: test $H_0$:
• 2. Use the KL distance between $f_r(y_i \mid x_i)$ and $f_p(y_i \mid x_i)$.

SLIDE 22

• Suggestion: a new measure of representativeness (call it the generalized measure) of a response set: the variance of $\hat\phi_i w_i$.
• Existing measures assume that the sample is a simple random sample (srs) and that the nonresponse mechanism is MAR.
• Furthermore, regress $\hat\phi_i$ on $w_i$ (or on $w_i$ and $x_i$) to predict $\hat\phi_i$ for $i \in \bar r$ and $i \in \bar s$, so that $\hat\phi_i$ is available for all $i \in U$.
• Bias-reduction poststratification based on $\hat\phi_i w_i$.

SLIDE 23

Parametric prediction of finite population parameters under IN
• General theory: single-stage population model (two-stage in progress)

$$ T = \sum_{i=1}^{N} y_i = \sum_{i \in s} y_i + \sum_{i \in \bar s} y_i = \sum_{i \in r} y_i + \sum_{i \in \bar r} y_i + \sum_{i \in \bar s} y_i $$

• For the prediction process we have the following available information: $O = O_s \cup O_r$, where

$$ O_s = \{(x_i, I_i),\ i \in U;\ R_i,\ i \in s\} $$

$$ O_r = \{(y_i, x_i, \hat\phi_i),\ i \in r,\ i \in s;\ N, n, \text{ and } m\} $$

• $\hat T = \hat T(O)$: predictor of $T$ based on $O$.
SLIDE 24

• The mean square error (MSE)

$$ MSE_p(\hat T) = E_p[(\hat T - T)^2 \mid O] = [\hat T - E_p(T \mid O)]^2 + Var_p(T \mid O) $$

is minimized when $\hat T = E_p(T \mid O)$.
• Now we consider the following:

$$ E_p(T \mid O) = \sum_{i \in r} y_i + \sum_{i \in \bar r} E_{\bar r}(y_i \mid O) + \sum_{i \in \bar s} E_{\bar s}(y_i \mid O) $$

• The empirical predictor of $T$:

$$ \hat T = \hat E_p(T \mid O) = \sum_{i \in r} y_i + \sum_{i \in \bar r} \hat E_{\bar r}(y_i \mid O) + \sum_{i \in \bar s} \hat E_{\bar s}(y_i \mid O) $$

• Based on the relationships between moments:

SLIDE 25

$$ T = \sum_{i \in r} y_i + \sum_{i \in \bar r} \frac{E_r[(\phi_i - 1) y_i]}{E_r(\phi_i - 1)} + \sum_{i \in \bar s} \frac{E_r[\phi_i (w_i - 1) y_i]}{E_r[\phi_i (w_i - 1)]} $$

• Hence, $T$ can be estimated based only on the data in the response set, $\{(y_i, \hat\phi_i, w_i) : i \in r\}$.
• Using the method of moments estimator,

$$ \hat T_{in} = \sum_{i \in r} w_{in,i}\, y_i $$

$$ w_{in,i} = 1 + (n - m) \frac{\phi_i - 1}{\sum_{j \in r} (\phi_j - 1)} + (N - n) \frac{\phi_i (w_i - 1)}{\sum_{j \in r} \phi_j (w_j - 1)} $$

SLIDE 26

Note that
(a) $\sum_{i \in r} (\phi_i - 1) y_i$ is the Horvitz-Thompson estimator of $\sum_{i \in \bar r} y_i$.
(b) $\sum_{i \in r} \phi_i (w_i - 1) y_i$ is the Horvitz-Thompson estimator of $\sum_{i \in \bar s} y_i$.
(c) $(n - m) / \sum_{i \in r} (\phi_i - 1)$ is the "Hájek-type correction" for controlling the variability of the response weights.
(d) $(N - n) / \sum_{i \in r} \phi_i (w_i - 1)$ is the "Hájek-type correction" for controlling the variability of the product of the response weights and the sampling weights.

SLIDE 27

• It is easy to verify that
(a) Under NN:

$$ w_{nn,i} = 1 + (n - m) \frac{\phi_i - 1}{\sum_{j \in r} (\phi_j - 1)} + (N - n) \frac{\phi_i}{\sum_{j \in r} \phi_j} $$

(b) Under NI:

$$ w_{ni,i} = 1 + \frac{n - m}{m} + \frac{N - n}{m} = \frac{N}{m} $$

(c) Under II:

$$ w_{ii,i} = 1 + \frac{n - m}{m} + (N - n) \frac{w_i - 1}{\sum_{j \in r} (w_j - 1)} $$
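The method of moments weights can be sketched in a few lines. This is an illustrative sketch only: `in_weights` and `predict_total` are hypothetical names, the response weights are assumed to have been estimated already, and the guards for $n = m$ and $N = n$ are an added assumption for the degenerate cases. By construction the weights sum to $N$, and with constant sampling and response weights they reduce to $N/m$, as in the NI case.

```python
def in_weights(w, phi, N, n):
    """Prediction weights w_in,i for the m respondents.

    w   : sampling weights w_i of the respondents
    phi : response weights phi_i of the respondents
    N, n: population size and sample size
    """
    m = len(w)
    resp_part = [p - 1.0 for p in phi]                       # phi_i - 1
    samp_part = [p * (wi - 1.0) for p, wi in zip(phi, w)]    # phi_i (w_i - 1)
    sum_resp, sum_samp = sum(resp_part), sum(samp_part)
    weights = []
    for a, b in zip(resp_part, samp_part):
        # Share of the (n - m) nonrespondents and (N - n) nonsampled units.
        term_r = (n - m) * a / sum_resp if n > m else 0.0
        term_s = (N - n) * b / sum_samp if N > n else 0.0
        weights.append(1.0 + term_r + term_s)
    return weights


def predict_total(y, w, phi, N, n):
    """Method of moments predictor: sum over respondents of w_in,i * y_i."""
    return sum(wt * yi for wt, yi in zip(in_weights(w, phi, N, n), y))


# Hypothetical respondent data (m = 4) from a sample of n = 7 out of N = 60.
weights_example = in_weights([10.0, 20.0, 5.0, 8.0], [1.2, 2.0, 1.5, 3.0], N=60, n=7)
print(weights_example, sum(weights_example))
```

The sum-to-$N$ property follows because each Hájek-type ratio sums to one over the response set, so the three terms contribute $m$, $n - m$ and $N - n$ respectively.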

SLIDE 28

• We can show that
1.

$$ \hat T_{in} = \sum_{i \in r} y_i + \sum_{i \in \bar r} E_s(y_i \mid O) + \sum_{i \in \bar s} E_p(y_i \mid O) + \sum_{i \in \bar r} \frac{Cov_s(y_i, 1 - \psi_i \mid O)}{E_s(1 - \psi_i \mid O)} + \sum_{i \in \bar s} \frac{Cov_p(y_i, 1 - \pi_i \mid O)}{E_p(1 - \pi_i \mid O)} $$

2.

$$ B(\hat T_{in}) = E_p(\hat T_{in} - T) = \sum_{i \in \bar r} \left\{ [E_s(y_i) - E_p(y_i)] + \frac{Cov_s(y_i, 1 - \psi_i)}{E_s(1 - \psi_i)} \right\} + \sum_{i \in \bar s} \frac{Cov_p(y_i, 1 - \pi_i)}{E_p(1 - \pi_i)} $$

3. $\hat T_{in}$ is unbiased for $T$ if $Cov_r(\phi_i w_i, y_i) = 0$.
• Note that the stronger the relationship between the study variable and the response probability, and between the study variable and the first-order inclusion probabilities, the larger the bias.

SLIDE 29

• We can show that

 

                                     

                                           

   s i i r i i r i i r i i r i i r i i i r i r r i i r i i r i i r i i r i r i i i r i r i i r i i i r in

E w E w E w E y E y w E E E w E w E y E E y w E E w E y w Cov T B                 1 1 ,

• Hence, the bias $B(\hat T_{in})$ can be estimated based only on the data in the response set, $\{(y_i, \phi_i, w_i) : i \in r\}$, using the method of moments, that is, by replacing each moment under the response distribution by the corresponding average over the response set, for example

$$ \hat E_r(a_i) = \frac{1}{m} \sum_{i \in r} a_i $$

SLIDE 30

Particular cases:
Case 1: NN
Case 2: NI

$$ \hat T_{ni} = \sum_{i \in r} y_i + \sum_{i \in \bar r} E_p(y_i \mid O) + \sum_{i \in \bar s} E_p(y_i \mid O) $$

$$ B(\hat T_{ni}) = \sum_{i \in \bar r} [E_p(y_i) - E_{\bar r}(y_i)] + \sum_{i \in \bar s} [E_p(y_i) - E_{\bar s}(y_i)] = 0 $$

Case 3: II

SLIDE 31

Simple ratio population model (IN)

$$ y_i \mid x_i \sim_p N(\beta x_i, \sigma^2 x_i), \quad i = 1, \ldots, N, \text{ independent} $$

 

 

i i i i i p

y x x y E

1

exp ,     

,

 

 

i i i i i s

y x x y E

1

exp ,     

   

i i s i i

x x N x y

2 2 1

, ~ |     

,

n i , , 1  

   

i i r i i

x x N x y

2 2 1 2 1

, ~ |        

m i , 1 

 

       

 

       

    

     

                    

s i 1 1 2 1 1 1 2 1 2 1 ,

exp 1 exp exp 1 exp ) (               

p i p i i r i s i s i i i r i i U i i r i i R in

M x M x x M x M x x x x x y T

SLIDE 32

   

       

 

   

       

 

       

                                       

   

      s i p i p i i r i s i s i i i s i i s i p r i i r i p R in p R in

M x M x x M x M x x x y E y E y E y E T T E T B

1 1 2 1 1 1 2 1 2 1 , ,

exp 1 exp exp 1 exp              

Particular cases: Case 1: nn, (

 

): Case 2: ni (

 

) , (

 

):

   ) (

s i ,

     

      

     

r i i U i i r i i i r i i r i i R ni

x x y x x y T

, 

,

R ni

T B

Case 3: ii (

 

):

SLIDE 33

Prediction of census likelihood

$$ \log L_p(\theta) = \sum_{i=1}^{N} \log f_p(y_i \mid x_i, \theta) $$

$$ l_p(\theta) = \log L_p(\theta) = \sum_{i \in r} \log f_p(y_i \mid x_i, \theta) + \sum_{i \in \bar r} \log f_p(y_i \mid x_i, \theta) + \sum_{i \in \bar s} \log f_p(y_i \mid x_i, \theta) $$

MLE:

$$ \frac{\partial l_p(\theta)}{\partial \theta} = \sum_{i \in r} \frac{\partial \log f_p(y_i \mid x_i, \theta)}{\partial \theta} + \sum_{i \in \bar r} \frac{\partial \log f_p(y_i \mid x_i, \theta)}{\partial \theta} + \sum_{i \in \bar s} \frac{\partial \log f_p(y_i \mid x_i, \theta)}{\partial \theta} $$

• Taking the expectation with respect to the population model, the MLE satisfies

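SLIDE 34

$$ \frac{\partial l_p(\theta)}{\partial \theta} = \sum_{i \in r} \frac{\partial \log f_p(y_i \mid x_i, \theta)}{\partial \theta} + \sum_{i \in \bar r} E_{\bar r}\!\left[ \frac{\partial \log f_p(y_i \mid x_i, \theta)}{\partial \theta} \,\Big|\, O \right] + \sum_{i \in \bar s} E_{\bar s}\!\left[ \frac{\partial \log f_p(y_i \mid x_i, \theta)}{\partial \theta} \,\Big|\, O \right] $$

$$ \frac{\partial l_p(\theta)}{\partial \theta} = \sum_{i \in r} \frac{\partial \log f_p(y_i \mid x_i, \theta)}{\partial \theta} + \sum_{i \in \bar r} \frac{E_r\!\left[ (\phi_i - 1)\, \partial \log f_p(y_i \mid x_i, \theta) / \partial \theta \mid x_i \right]}{E_r[(\phi_i - 1) \mid x_i]} + \sum_{i \in \bar s} \frac{E_r\!\left[ \phi_i (w_i - 1)\, \partial \log f_p(y_i \mid x_i, \theta) / \partial \theta \mid x_i \right]}{E_r[\phi_i (w_i - 1) \mid x_i]} $$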

SLIDE 35

If we assume that

   

  

r i r

E x E |

, we get

 

, | log   

r i i i p r i

x y f   

     

  

     ˆ 1 ˆ 1 ˆ 1 ˆ 1          w av w n N m n

i i i r i

Pseudo MLE with adjusted response weights, $i \in r$.

Conclusions
• I hope that the new mathematical statistical results obtained will encourage further theoretical, empirical and practical research in these directions.

SLIDE 36

Semiparametric prediction of finite population total under IN
• Sverchkov and Pfeffermann (2004) studied the semiparametric prediction of finite population totals under informative sampling.
• Aim: develop semiparametric prediction of the finite population total under IN.
• Fuller (2009, p. 282) pointed out that "the analysis of data with unplanned nonresponse requires the specification of a model for the nonresponse".
• Thus, prediction of the finite population total,

$$ T = \sum_{i=1}^{N} y_i = \sum_{i \in r} y_i + \sum_{i \in \bar r} y_i + \sum_{i \in \bar s} y_i , $$

requires specification of nonsampled and nonresponse models.

SLIDE 37

• Assume that
(a) Sample-complement model:

$$ y_i = S_\beta(x_i) + \varepsilon_i \ \text{ for all } i \in \bar s, \quad E_{\bar s}(\varepsilon_i \mid x_i) = 0, \quad E_{\bar s}(\varepsilon_i^2 \mid x_i) = \sigma_\varepsilon^2 v(x_i), \quad E_{\bar s}(\varepsilon_k \varepsilon_l \mid x_k, x_l) = 0,\ k \ne l $$

(b) Response-complement model:

$$ y_i = Z_\alpha(x_i) + \delta_i \ \text{ for all } i \in \bar r, \quad E_{\bar r}(\delta_i \mid x_i) = 0, \quad E_{\bar r}(\delta_i^2 \mid x_i) = \sigma_\delta^2 u(x_i), \quad E_{\bar r}(\delta_k \delta_l \mid x_k, x_l) = 0,\ k \ne l $$

• $S_\beta(x_i)$ and $Z_\alpha(x_i)$ are known functions of $x_i$ that depend on the unknown vector parameters $\beta$ and $\alpha$, respectively.
• $\sigma_\varepsilon^2 v(x_i)$ and $\sigma_\delta^2 u(x_i)$ are assumed known except for $\sigma_\varepsilon^2$ and $\sigma_\delta^2$.

SLIDE 38

• For the prediction process, we need to estimate $S_\beta(x_i)$ and $Z_\alpha(x_i)$.
(i) Estimation of $S_\beta(x_i)$:
Method 1:

$$ S_\beta(x_i) = \arg\min_{S_{\tilde\beta}(x_i)} E_{\bar s}\!\left[ \frac{(y_i - S_{\tilde\beta}(x_i))^2}{v(x_i)} \,\Big|\, x_i \right] = \arg\min_{S_{\tilde\beta}(x_i)} E_r\!\left[ c_i \frac{(y_i - S_{\tilde\beta}(x_i))^2}{v(x_i)} \,\Big|\, x_i \right], \qquad c_i = \frac{\phi_i (w_i - 1)}{E_r[\phi_i (w_i - 1) \mid x_i]} $$

• Hence, the vector $\beta$ can be estimated by

$$ \hat\beta_1 = \arg\min_{\tilde\beta} \sum_{i \in r} \hat c_i \frac{(y_i - S_{\tilde\beta}(x_i))^2}{v(x_i)}, \qquad \hat c_i = \frac{\hat\phi_i (w_i - 1)}{\hat E_r[\hat\phi_i (w_i - 1) \mid x_i]} $$

• New smoothing weights

SLIDE 39

Method 2: Assume that

$$ E_r[\phi_i (w_i - 1) \mid x_i] = E_r[\phi_i (w_i - 1)] $$

We have:

$$ E_{\bar s}\!\left[ \frac{(y_i - S_{\tilde\beta}(x_i))^2}{v(x_i)} \right] = E_{\bar s}\!\left\{ E_{\bar s}\!\left[ \frac{(y_i - S_{\tilde\beta}(x_i))^2}{v(x_i)} \,\Big|\, x_i \right] \right\} = \frac{1}{E_r[\phi_i (w_i - 1)]}\, E_r\!\left[ \phi_i (w_i - 1) \frac{(y_i - S_{\tilde\beta}(x_i))^2}{v(x_i)} \right] $$

• Hence

$$ \hat\beta_2 = \arg\min_{\tilde\beta} \sum_{i \in r} \hat\phi_i (w_i - 1) \frac{(y_i - S_{\tilde\beta}(x_i))^2}{v(x_i)} $$

• since $E_r[\phi_i (w_i - 1)]$ is constant.

SLIDE 40

(ii) Estimation of $Z_\alpha(x_i)$
Method 1:

$$ Z_\alpha(x_i) = \arg\min_{Z_{\tilde\alpha}(x_i)} E_{\bar r}\!\left[ \frac{(y_i - Z_{\tilde\alpha}(x_i))^2}{u(x_i)} \,\Big|\, x_i \right] = \arg\min_{Z_{\tilde\alpha}(x_i)} E_r\!\left[ k_i \frac{(y_i - Z_{\tilde\alpha}(x_i))^2}{u(x_i)} \,\Big|\, x_i \right], \qquad k_i = \frac{\phi_i - 1}{E_r[(\phi_i - 1) \mid x_i]} $$

• Hence, the vector $\alpha$ can be estimated by

$$ \hat\alpha_1 = \arg\min_{\tilde\alpha} \sum_{i \in r} \hat k_i \frac{(y_i - Z_{\tilde\alpha}(x_i))^2}{u(x_i)}, \qquad \hat k_i = \frac{\hat\phi_i - 1}{\hat E_r[(\hat\phi_i - 1) \mid x_i]} $$

SLIDE 41

Method 2: Assume that

$$ E_r[(\phi_i - 1) \mid x_i] = E_r(\phi_i - 1) $$

• We have:

$$ E_{\bar r}\!\left[ \frac{(y_i - Z_{\tilde\alpha}(x_i))^2}{u(x_i)} \right] = E_r\!\left[ k_i \frac{(y_i - Z_{\tilde\alpha}(x_i))^2}{u(x_i)} \right] = \frac{1}{E_r(\phi_i - 1)}\, E_r\!\left[ (\phi_i - 1) \frac{(y_i - Z_{\tilde\alpha}(x_i))^2}{u(x_i)} \right] $$

• So that

$$ \hat\alpha_2 = \arg\min_{\tilde\alpha} \sum_{i \in r} (\hat\phi_i - 1) \frac{(y_i - Z_{\tilde\alpha}(x_i))^2}{u(x_i)} $$

• Hence

$$ \hat T_{in,1} = \hat E_p(T \mid O) = \sum_{i \in r} y_i + \sum_{i \in \bar r} Z_{\hat\alpha_1}(x_i) + \sum_{i \in \bar s} S_{\hat\beta_1}(x_i) $$

SLIDE 42

$$ \hat T_{in,2} = \hat E_p(T \mid O) = \sum_{i \in r} y_i + \sum_{i \in \bar r} Z_{\hat\alpha_2}(x_i) + \sum_{i \in \bar s} S_{\hat\beta_2}(x_i) $$

• The benefit of using the predictor $\hat T_{in,2}$ over the predictor $\hat T_{in,1}$ is that $\hat T_{in,2}$ does not require the identification and estimation of $E_r[(\phi_i - 1) \mid x_i]$.
• On the other hand, in situations where this expectation can be estimated properly, the predictor $\hat T_{in,1}$ is likely to be more accurate, since the weights $k_i = (\phi_i - 1) / E_r[(\phi_i - 1) \mid x_i]$ will often be less variable than the weights $(\phi_i - 1)$.

SLIDE 43

• This is because the weights $k_i = (\phi_i - 1) / E_r[(\phi_i - 1) \mid x_i]$ only account for the net effect of the response process on the target conditional distribution $f_r(y_i \mid x_i, \theta, \gamma, \eta)$, whereas the weights $(\phi_i - 1)$ account for the effect of the response process on the joint distribution $f_r(y_i, x_i; \theta, \gamma, \eta)$.

Particular cases
• For illustration we use Method 2 only, under several well-known models in survey sampling.
Case 1: Common mean model:
Case 2: Simple linear regression model
• Sample-complement model:

$$ E_{\bar s}(y_i \mid x_i) = \beta_0 + \beta_1 x_i; \qquad Var_{\bar s}(y_i) = \sigma_\varepsilon^2 $$

SLIDE 44

$$ \hat\beta_2 = \arg\min_{\tilde\beta} \sum_{i \in r} \hat\phi_i (w_i - 1) (y_i - \tilde\beta_0 - \tilde\beta_1 x_i)^2 $$

$$ S_{\hat\beta_2}(x_i) = \hat\beta_{0,2} + \hat\beta_{1,2} x_i = \bar y_w - \hat\beta_{1,2} \bar x_w + \hat\beta_{1,2} x_i = \bar y_w + \hat\beta_{1,2} (x_i - \bar x_w) $$

$$ \hat\beta_{0,2} = \bar y_w - \hat\beta_{1,2} \bar x_w, \qquad \hat\beta_{1,2} = \frac{\sum_{i \in r} \tilde w_i (x_i - \bar x_w)(y_i - \bar y_w)}{\sum_{i \in r} \tilde w_i (x_i - \bar x_w)^2} $$

$$ \tilde w_i = \hat\phi_i (w_i - 1), \qquad \bar x_w = \frac{\sum_{i \in r} \tilde w_i x_i}{\sum_{i \in r} \tilde w_i}, \qquad \bar y_w = \frac{\sum_{i \in r} \tilde w_i y_i}{\sum_{i \in r} \tilde w_i} $$

• Response-complement model:

$$ E_{\bar r}(y_i \mid x_i) = \alpha_0 + \alpha_1 x_i \quad \text{and} \quad Var_{\bar r}(y_i) = \sigma_\delta^2 $$

$$ Z_{\hat\alpha_2}(x_i) = \hat\alpha_{0,2} + \hat\alpha_{1,2} x_i = \bar y_\phi - \hat\alpha_{1,2} \bar x_\phi + \hat\alpha_{1,2} x_i = \bar y_\phi + \hat\alpha_{1,2} (x_i - \bar x_\phi) $$

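SLIDE 45

$$ \hat\alpha_{0,2} = \bar y_\phi - \hat\alpha_{1,2} \bar x_\phi, \qquad \hat\alpha_{1,2} = \frac{\sum_{i \in r} \tilde\phi_i (x_i - \bar x_\phi)(y_i - \bar y_\phi)}{\sum_{i \in r} \tilde\phi_i (x_i - \bar x_\phi)^2} $$

$$ \tilde\phi_i = \hat\phi_i - 1, \qquad \bar x_\phi = \frac{\sum_{i \in r} \tilde\phi_i x_i}{\sum_{i \in r} \tilde\phi_i}, \qquad \bar y_\phi = \frac{\sum_{i \in r} \tilde\phi_i y_i}{\sum_{i \in r} \tilde\phi_i} $$

• Thus

$$ \hat T_{in,2} = \sum_{i \in r} y_i + (n - m) \left[ \bar y_\phi + \hat\alpha_{1,2} (\bar x_{\bar r} - \bar x_\phi) \right] + (N - n) \left[ \bar y_w + \hat\beta_{1,2} (\bar x_{\bar s} - \bar x_w) \right] = \sum_{i \in r} w_{ir}\, y_i $$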

SLIDE 46

$$ w_{ir} = 1 + (n - m) \left[ \frac{\tilde\phi_i}{\sum_{j \in r} \tilde\phi_j} + \frac{\tilde\phi_i (x_i - \bar x_\phi)(\bar x_{\bar r} - \bar x_\phi)}{\sum_{j \in r} \tilde\phi_j (x_j - \bar x_\phi)^2} \right] + (N - n) \left[ \frac{\tilde w_i}{\sum_{j \in r} \tilde w_j} + \frac{\tilde w_i (x_i - \bar x_w)(\bar x_{\bar s} - \bar x_w)}{\sum_{j \in r} \tilde w_j (x_j - \bar x_w)^2} \right] $$

$$ \bar x_{\bar r} = \frac{1}{n - m} \sum_{i \in \bar r} x_i, \qquad \bar x_{\bar s} = \frac{1}{N - n} \sum_{i \in \bar s} x_i $$

• Note that, if $\hat\phi_i = \phi$ and $w_i = w$, then

$$ \hat T_{in,2} = N \bar y_r + (n - m)\, \hat\alpha_{1,2} (\bar x_{\bar r} - \bar x_r) + (N - n)\, \hat\beta_{1,2} (\bar x_{\bar s} - \bar x_r) = \sum_{i \in r} w_{ir}\, y_i $$

$$ w_{ir} = \frac{N}{m} + \frac{(x_i - \bar x_r)}{\sum_{j \in r} (x_j - \bar x_r)^2} \left[ (n - m)(\bar x_{\bar r} - \bar x_r) + (N - n)(\bar x_{\bar s} - \bar x_r) \right] $$

SLIDE 47

• In addition, if $\bar x_{\bar r} = \bar x_r$ (balanced response set with respect to $x$) and $\bar x_{\bar s} = \bar x_r$, then $\hat T_{in,2} = N \bar y_r = \hat Y_{EXP}$, which is the traditional expansion predictor.
• Särndal (2011) proposes what is referred to as balance indicators, which are intended to measure the similarity between the respondents and the sample under a noninformative sampling design and a MAR nonresponse mechanism.
Case 3: Simple ratio model:
Case 4: Generalized regression estimator (GREG) - multiple regression
• Assume that

$$ E_p(y_i \mid x_i) = x_i' \beta = \beta_1 x_{i1} + \cdots + \beta_p x_{ip} $$

• Then the GREG estimator is:

SLIDE 48

$$ \hat T = \sum_{i=1}^{N} \hat E_p(y_i) = \sum_{i=1}^{N} x_i' \hat\beta_w $$

• where

$$ \hat\beta_w = \left( \sum_{i \in r} \phi_i w_i x_i x_i' \right)^{-1} \sum_{i \in r} \phi_i w_i x_i y_i $$

• Justification: using

$$ E_p(y_i) = \frac{E_r(\phi_i w_i y_i)}{E_r(\phi_i w_i)}, $$

$$ \beta = \arg\min_{\tilde\beta} E_p (y_i - x_i' \tilde\beta)^2 = \arg\min_{\tilde\beta} E_r\!\left[ \phi_i w_i (y_i - x_i' \tilde\beta)^2 \right] $$

• So that

$$ \hat\beta = \arg\min_{\tilde\beta} \hat E_p (y_i - x_i' \tilde\beta)^2 = \arg\min_{\tilde\beta} \sum_{i \in r} \phi_i w_i (y_i - x_i' \tilde\beta)^2 $$

• Hence, $\hat\beta$ is the solution of the equation:

$$ \sum_{i \in r} \phi_i w_i x_i x_i' \hat\beta = \sum_{i \in r} \phi_i w_i x_i y_i $$

SLIDE 49

• Therefore,

$$ \hat\beta_w = \left( \sum_{i \in r} \phi_i w_i x_i x_i' \right)^{-1} \sum_{i \in r} \phi_i w_i x_i y_i = (X' W X)^{-1} X' W y $$

• where

$$ W = diag(\phi_1 w_1, \ldots, \phi_m w_m), \qquad X = (x_1, \ldots, x_m)' $$

Then

$$ \hat T = \sum_{i=1}^{N} \hat E_p(y_i) = \sum_{i=1}^{N} x_i' \hat\beta_w $$

• Now, if $x_i = (1, x_{i2}, \ldots, x_{ip})' = (1, \tilde x_i')'$, then

$$ \hat\beta_{1w} = \bar y_w - \hat\beta_{2w} \bar x_{2w} - \cdots - \hat\beta_{pw} \bar x_{pw} = \bar y_w - \bar{\tilde x}_w' \hat{\tilde\beta}_w $$

• where

SLIDE 50

$$ \bar x_{jw} = \frac{\sum_{i \in r} \phi_i w_i x_{ij}}{\sum_{i \in r} \phi_i w_i}, \qquad \bar y_w = \frac{\sum_{i \in r} \phi_i w_i y_i}{\sum_{i \in r} \phi_i w_i} $$

$$ \hat T = \sum_{i=1}^{N} \hat E_p(y_i) = \sum_{i=1}^{N} (1, \tilde x_i') \hat\beta_w = N \hat\beta_{1w} + N \bar{\tilde x}_U' \hat{\tilde\beta}_w = N (\bar y_w - \bar{\tilde x}_w' \hat{\tilde\beta}_w) + N \bar{\tilde x}_U' \hat{\tilde\beta}_w = N \left[ \bar y_w + (\bar{\tilde x}_U - \bar{\tilde x}_w)' \hat{\tilde\beta}_w \right] = \sum_{i \in r} \phi_i w_i g_i y_i $$

• where

$$ g_i = \frac{N}{\sum_{j \in r} \phi_j w_j} + N (\bar{\tilde x}_U - \bar{\tilde x}_w)' \left[ \sum_{j \in r} \phi_j w_j (\tilde x_j - \bar{\tilde x}_w)(\tilde x_j - \bar{\tilde x}_w)' \right]^{-1} (\tilde x_i - \bar{\tilde x}_w) $$

SLIDE 51

$$ \bar{\tilde x}_U = (\bar x_{2U}, \ldots, \bar x_{pU})', \qquad \bar x_{jU} = \frac{1}{N} \sum_{i=1}^{N} x_{ij} $$

• Note that, if $\bar{\tilde x}_U = \bar{\tilde x}_w$, that is, $\bar x_{jU} = \bar x_{jw}$, or

$$ \frac{1}{N} \sum_{i=1}^{N} x_{ij} = \frac{\sum_{i \in r} \phi_i w_i x_{ij}}{\sum_{i \in r} \phi_i w_i}, $$

then

$$ \hat T = N\, \frac{\sum_{i \in r} \phi_i w_i y_i}{\sum_{i \in r} \phi_i w_i} $$

• Note that $\hat T = \sum_{i \in r} \phi_i w_i g_i y_i$ belongs to the class of calibration estimators, since $\sum_{i \in r} \phi_i w_i g_i x_i = \sum_{i \in U} x_i$ (Deville and Särndal, 1992).
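The calibration property can be verified numerically in the simplest case of an intercept and one auxiliary variable. In this sketch `greg_g_weights` is a hypothetical name, `base_w` stands for the products $\phi_i w_i$ of the respondents, and the $g_i$ are computed in the equivalent form $g_i = X_U' (\sum_{j \in r} \phi_j w_j x_j x_j')^{-1} x_i$ with $x_i = (1, x_i)'$ and $X_U = (N, \sum_{i \in U} x_i)'$; the 2x2 system is solved in closed form to stay within the standard library.

```python
def greg_g_weights(x_r, base_w, x_total, N):
    """g-weights for x_i = (1, x_i)': g_i = X_U' A^{-1} x_i,
    where A = sum over respondents of base_w_i * x_i x_i'."""
    s0 = sum(base_w)
    s1 = sum(b * x for b, x in zip(base_w, x_r))
    s2 = sum(b * x * x for b, x in zip(base_w, x_r))
    det = s0 * s2 - s1 * s1
    # lam = A^{-1} (N, x_total)', using the closed-form inverse of a 2x2 matrix.
    lam0 = (s2 * N - s1 * x_total) / det
    lam1 = (-s1 * N + s0 * x_total) / det
    return [lam0 + lam1 * x for x in x_r]


# Hypothetical respondent data: base_w_i = phi_i * w_i.
x_resp = [1.0, 2.0, 4.0]
base_w = [3.0, 5.0, 2.0]
g = greg_g_weights(x_resp, base_w, x_total=50.0, N=20)

# Calibration constraints: the weighted sums reproduce N and the population x total.
calib_count = sum(b * gi for b, gi in zip(base_w, g))
calib_x = sum(b * gi * x for b, gi, x in zip(base_w, g, x_resp))

# GREG-type total for y lying exactly on y = 1 + 2x.
y_resp = [3.0, 5.0, 9.0]
t_hat = sum(b * gi * y for b, gi, y in zip(base_w, g, y_resp))
print(calib_count, calib_x, t_hat)
```

When $y$ is exactly linear in $x$, the calibrated total equals the linear function evaluated at the population totals, which is the sense in which calibration protects against imbalance in $x$.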

SLIDE 52

• It should be noted here that this equation can be considered as a calibration constraint when the sampling design is informative and the nonresponse mechanism is nonignorable. See also Kott (2015) on two-step calibration weighting.

WORK IN PROGRESS…

Conclusions
• I hope that the new mathematical statistical results obtained will encourage further theoretical, empirical and practical research in these directions.
