Assessing uncertainty of the temporal EBLUP: a resampling-based approach



SLIDE 1

Assessing uncertainty of the temporal EBLUP: a resampling-based approach

Luís N. Pereira, MSc (1)
Pedro S. Coelho, PhD (2)

(1) University of the Algarve - ESGHT, Portugal
(2) New University of Lisbon - ISEGI, Portugal

Rome, 11th July 2008

SLIDE 2

Outline

  • Background
  • Objectives
  • Methods
      – Rao-Yu Model
      – Uncertainty measures of the EBLUP
  • Monte Carlo Simulation Study
  • Application with real data
  • Conclusion

SLIDE 3

Background

Small Area Estimation (SAE) is about how to produce reliable estimates of domain characteristics when the sample sizes within the domains are very small or even zero.

We need to employ indirect estimators that borrow strength from related areas and time periods through linking models based on auxiliary information, such as recent census and current administrative data.

Such estimators are often based on linear mixed models (LMM).

SLIDE 4

Background

The Empirical Best Linear Unbiased Prediction (EBLUP) approach is the most popular method for the estimation of the parameters.

When time-series and cross-sectional data are available, longitudinal LMM are useful.

While EBLUP estimators are easy to obtain, measuring their quality is a challenging problem.

Although there is much theory about measuring the uncertainty of the EBLUP under general LMM, very little research has been done on comparing the approaches.

SLIDE 5

Objectives

  • To introduce a parametric bootstrap procedure and a weighted jackknife method for MSPE estimation of the EBLUP, under a cross-sectional and time-series stationary model
  • To compare the performance of the resampling-based methods with the Taylor series-based method
  • To apply these methods to real data from the Prices of the Habitation Transaction Survey

SLIDE 6

Rao-Yu Model

Rao and Yu (1994) proposed the following model:

$$\hat\theta_{it} = \theta_{it} + e_{it}$$

$$\theta_{it} = \mathbf{x}_{it}'\boldsymbol\beta + v_i + u_{it}$$

$$u_{it} = \rho\, u_{i,t-1} + \varepsilon_{it}, \qquad |\rho| < 1$$

Where:

  • $\theta_{it}$ is the parameter of inferential interest for the ith small area at the tth time point (i=1, …, m; t=1, …, T)
  • $\hat\theta_{it}$ is its design-unbiased direct survey estimator
  • $e_{it}$ are independent sampling errors, normally distributed with $E(e_{it} \mid \theta_{it}) = 0$
  • $\mathbf{x}_{it}$ (p×1) is a column vector of area-by-time specific auxiliary variables
  • $\boldsymbol\beta$ (p×1) is a column vector of regression parameters
  • $v_i$ are random area specific effects with $v_i \overset{iid}{\sim} N(0, \sigma_v^2)$
  • $u_{it}$ are random area-by-time specific effects with $\varepsilon_{it} \overset{iid}{\sim} N(0, \sigma^2)$, following a common AR(1) process for each i.
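The three model equations can be illustrated with a small simulation for a single area. This is only a sketch under stated assumptions: the function name `simulate_area` and its arguments are hypothetical, the first AR(1) value is drawn from the stationary distribution, and the sampling errors are given unit variance, as assumed later in the slides.

```python
import random

def simulate_area(x_rows, beta, sigma2_v, sigma2, rho, rng):
    """Simulate (theta, theta_hat) for one small area over T time points."""
    T = len(x_rows)
    v = rng.gauss(0.0, sigma2_v ** 0.5)  # area effect v_i ~ N(0, sigma_v^2)
    # Assumption: start u_i1 from the stationary law N(0, sigma^2 / (1 - rho^2)).
    u = rng.gauss(0.0, (sigma2 / (1.0 - rho ** 2)) ** 0.5)
    theta, theta_hat = [], []
    for t in range(T):
        if t > 0:
            # AR(1) recursion: u_it = rho * u_{i,t-1} + eps_it
            u = rho * u + rng.gauss(0.0, sigma2 ** 0.5)
        mean = sum(x * b for x, b in zip(x_rows[t], beta))  # x_it' beta
        th = mean + v + u                                   # linking model
        theta.append(th)
        theta_hat.append(th + rng.gauss(0.0, 1.0))          # sampling model, e_it ~ N(0, 1)
    return theta, theta_hat

if __name__ == "__main__":
    rng = random.Random(2008)
    x_rows = [[1.0, rng.random()] for _ in range(7)]        # T = 7, x_it = (1, x_it)'
    theta, theta_hat = simulate_area(x_rows, [1.0, 2.0], 0.5, 0.25, 0.4, rng)
    print(theta_hat)
```

Setting both variance components to zero reduces the simulated parameters to the regression part $\mathbf{x}_{it}'\boldsymbol\beta$, which is a convenient sanity check.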

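SLIDE 7

Rao-Yu Model

They showed that the model can be expressed in matrix form as:

$$\hat{\boldsymbol\theta} = \mathbf{X}\boldsymbol\beta + \mathbf{Z}\mathbf{v} + \mathbf{u} + \mathbf{e}$$

where $\mathbf{v} \sim N(\mathbf{0}, \sigma_v^2 \mathbf{I}_m)$, $\mathbf{u} \sim N(\mathbf{0}, \sigma^2 \mathbf{I}_m \otimes \boldsymbol\Gamma)$ and $\mathbf{e} \sim N(\mathbf{0}, \mathbf{R})$, with

$$\mathbf{R} = \mathrm{diag}_{1 \le i \le m;\, 1 \le t \le T}\left(\sigma_{it}^2\right), \qquad \boldsymbol\Gamma = \left\{ \rho^{|i-j|} / (1 - \rho^2) \right\}$$

Assuming $e_{it} \overset{iid}{\sim} N(0, 1)$, then:

$$\mathrm{Cov}(\hat{\boldsymbol\theta}) = \mathbf{V} = \mathrm{blockdiag}_{1 \le i \le m}(\mathbf{V}_i), \qquad \text{with} \quad \mathbf{V}_i = \sigma^2 \boldsymbol\Gamma + \sigma_v^2 \mathbf{J}_T + \mathbf{I}_T$$

Assuming that $\rho$ is known, then:

$$\boldsymbol\psi = \left[\sigma^2(\rho),\ \sigma_v^2(\rho)\right]'$$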
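As an illustration of the covariance structure, the sketch below builds $\boldsymbol\Gamma$ and one diagonal block $\mathbf{V}_i$ as nested lists. The function names are hypothetical, $\mathbf{J}_T$ is taken to be the T×T matrix of ones, and unit sampling variances are assumed so that the sampling part of $\mathbf{V}_i$ is $\mathbf{I}_T$.

```python
def ar1_gamma(T, rho):
    """Gamma = {rho^|i-j| / (1 - rho^2)}, a T x T matrix."""
    return [[rho ** abs(i - j) / (1.0 - rho ** 2) for j in range(T)]
            for i in range(T)]

def area_covariance(T, sigma2, sigma2_v, rho):
    """V_i = sigma2 * Gamma + sigma2_v * J_T + I_T (unit sampling variances)."""
    gamma = ar1_gamma(T, rho)
    return [[sigma2 * gamma[i][j]             # AR(1) area-by-time effects
             + sigma2_v                       # J_T: every entry equals 1
             + (1.0 if i == j else 0.0)       # I_T: sampling errors
             for j in range(T)] for i in range(T)]

if __name__ == "__main__":
    for row in area_covariance(3, 0.25, 0.5, 0.4):
        print([round(x, 4) for x in row])
```

With $\rho = 0$ the matrix $\boldsymbol\Gamma$ reduces to the identity, so the off-diagonal entries of $\mathbf{V}_i$ are simply $\sigma_v^2$.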

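SLIDE 8

Rao-Yu Model

The BLUP estimator (Rao and Yu, 1994):

$$\tilde\theta_{it}(\boldsymbol\psi) = \mathbf{x}_{it}'\tilde{\boldsymbol\beta} + \left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)' \mathbf{V}_i^{-1} \left(\hat{\boldsymbol\theta}_i - \mathbf{X}_i \tilde{\boldsymbol\beta}\right)$$

Where $\boldsymbol\gamma_t$ is the tth row of $\boldsymbol\Gamma = \left\{ \rho^{|i-j|} / (1 - \rho^2) \right\}$, and

$$\tilde{\boldsymbol\beta} = \left(\mathbf{X}'\mathbf{V}^{-1}\mathbf{X}\right)^{-1} \mathbf{X}'\mathbf{V}^{-1}\hat{\boldsymbol\theta}$$

The EBLUP estimator (Rao and Yu, 1994):

$$\tilde\theta_{it}(\hat{\boldsymbol\psi}) = \mathbf{x}_{it}'\hat{\boldsymbol\beta} + \left(\hat\sigma_v^2 \mathbf{1}_T + \hat\sigma^2 \boldsymbol\gamma_t\right)' \hat{\mathbf{V}}_i^{-1} \left(\hat{\boldsymbol\theta}_i - \mathbf{X}_i \hat{\boldsymbol\beta}\right)$$

Where

$$\hat{\boldsymbol\beta} = \left(\mathbf{X}'\hat{\mathbf{V}}^{-1}\mathbf{X}\right)^{-1} \mathbf{X}'\hat{\mathbf{V}}^{-1}\hat{\boldsymbol\theta}$$

and the elements of $\boldsymbol\psi$ were estimated through a method of moments.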
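The BLUP formula for a single area can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `solve` and `blup_area` are hypothetical, the regression vector is passed in as given (the generalised least squares step for $\tilde{\boldsymbol\beta}$ is omitted), and unit sampling variances are assumed so that $\mathbf{V}_i = \sigma^2\boldsymbol\Gamma + \sigma_v^2\mathbf{J}_T + \mathbf{I}_T$.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                     # back substitution
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def blup_area(theta_hat_i, x_rows, beta, sigma2, sigma2_v, rho):
    """BLUP of theta_it, t = 1..T, for one area with beta treated as known."""
    T = len(theta_hat_i)
    gamma = [[rho ** abs(s - t) / (1.0 - rho ** 2) for t in range(T)] for s in range(T)]
    V = [[sigma2 * gamma[s][t] + sigma2_v + (1.0 if s == t else 0.0)
          for t in range(T)] for s in range(T)]
    fitted = [sum(x * b for x, b in zip(row, beta)) for row in x_rows]
    resid = [y - f for y, f in zip(theta_hat_i, fitted)]
    z = solve(V, resid)                                # V_i^{-1} (theta_hat_i - X_i beta)
    # (sigma_v^2 1_T + sigma^2 gamma_t)' z, added to the regression prediction
    return [fitted[t] + sum((sigma2_v + sigma2 * gamma[t][s]) * z[s] for s in range(T))
            for t in range(T)]

if __name__ == "__main__":
    print(blup_area([3.0, 2.5, 2.0], [[1.0, 0.2], [1.0, 0.5], [1.0, 0.9]],
                    [1.0, 2.0], 0.25, 0.5, 0.4))
```

Replacing the variance components by their estimates in the same call gives the corresponding EBLUP, under the same assumptions.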

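SLIDE 9

Analytical approximation of the MSPE

Under the normality of $\{e_{it}\}$, $\{v_i\}$ and $\{u_{it}\}$ (Kackar and Harville, 1984):

$$\mathrm{MSE}\left[\tilde\theta_{it}(\boldsymbol\psi)\right] = g_{1it}(\boldsymbol\psi) + g_{2it}(\boldsymbol\psi)$$

$$\mathrm{MSE}\left[\tilde\theta_{it}(\hat{\boldsymbol\psi})\right] = g_{1it}(\boldsymbol\psi) + g_{2it}(\boldsymbol\psi) + E\left[\tilde\theta_{it}(\hat{\boldsymbol\psi}) - \tilde\theta_{it}(\boldsymbol\psi)\right]^2$$

In the context of the Rao-Yu model (Rao and Yu, 1994):

$$g_{1it}(\boldsymbol\psi) = \sigma_v^2 + \frac{\sigma^2}{1 - \rho^2} - \left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)' \mathbf{V}_i^{-1} \left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right), \qquad O(1)$$

$$g_{2it}(\boldsymbol\psi) = \left[\mathbf{x}_{it} - \mathbf{X}_i'\mathbf{V}_i^{-1}\left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)\right]' \left(\mathbf{X}'\mathbf{V}^{-1}\mathbf{X}\right)^{-1} \left[\mathbf{x}_{it} - \mathbf{X}_i'\mathbf{V}_i^{-1}\left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)\right], \qquad O(m^{-1})$$

The term $E\left[\tilde\theta_{it}(\hat{\boldsymbol\psi}) - \tilde\theta_{it}(\boldsymbol\psi)\right]^2$ cannot be analytically evaluated.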
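The leading term $g_{1it}(\boldsymbol\psi)$ can be evaluated numerically as in the sketch below. The names `solve` and `g1` are hypothetical, and unit sampling variances are assumed so that $\mathbf{V}_i = \sigma^2\boldsymbol\Gamma + \sigma_v^2\mathbf{J}_T + \mathbf{I}_T$.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def g1(t, T, sigma2, sigma2_v, rho):
    """g_1it for time index t (0-based) in a series of length T."""
    gamma = [[rho ** abs(s - k) / (1.0 - rho ** 2) for k in range(T)] for s in range(T)]
    V = [[sigma2 * gamma[s][k] + sigma2_v + (1.0 if s == k else 0.0)
          for k in range(T)] for s in range(T)]
    c = [sigma2_v + sigma2 * gamma[t][s] for s in range(T)]   # sigma_v^2 1_T + sigma^2 gamma_t
    z = solve(V, c)                                           # V_i^{-1} c
    total_var = sigma2_v + sigma2 / (1.0 - rho ** 2)          # Var(v_i + u_it)
    return total_var - sum(ci * zi for ci, zi in zip(c, z))

if __name__ == "__main__":
    print([round(g1(t, 7, 0.25, 0.5, 0.4), 4) for t in range(7)])
```

For T = 1 the expression collapses to $a/(a+1)$ with $a = \sigma_v^2 + \sigma^2/(1-\rho^2)$, which gives a quick check of the code.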

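SLIDE 10

Analytical approximation of the MSPE

Rao and Yu (1994) obtained the following approximation:

$$E\left[\tilde\theta_{it}(\hat{\boldsymbol\psi}) - \tilde\theta_{it}(\boldsymbol\psi)\right]^2 \approx \mathrm{tr}\left(\mathbf{A}_t \boldsymbol\Sigma^*\right) = g_{3it}(\boldsymbol\psi), \qquad O(m^{-1})$$

where $\mathbf{A}_t$ (2×2) is a matrix with:

$$a_{11} = \left[\boldsymbol\gamma_t - \boldsymbol\Gamma\mathbf{V}_i^{-1}\left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)\right]' \mathbf{V}_i^{-1} \left[\boldsymbol\gamma_t - \boldsymbol\Gamma\mathbf{V}_i^{-1}\left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)\right]$$

$$a_{22} = \left[\mathbf{1}_T - \mathbf{J}_T\mathbf{V}_i^{-1}\left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)\right]' \mathbf{V}_i^{-1} \left[\mathbf{1}_T - \mathbf{J}_T\mathbf{V}_i^{-1}\left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)\right]$$

$$a_{12} = a_{21} = \left[\boldsymbol\gamma_t - \boldsymbol\Gamma\mathbf{V}_i^{-1}\left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)\right]' \mathbf{V}_i^{-1} \left[\mathbf{1}_T - \mathbf{J}_T\mathbf{V}_i^{-1}\left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)\right]$$

and

$$\boldsymbol\Sigma^* = \mathrm{Cov}\left[\hat\sigma^2(\rho);\ \hat\sigma_v^2(\rho)\right]$$

Rao and Yu (1994) proposed the following approximately unbiased estimator:

$$\mathrm{mspe}_{RY}\left[\tilde\theta_{it}(\hat{\boldsymbol\psi})\right] = g_{1it}(\hat{\boldsymbol\psi}) + g_{2it}(\hat{\boldsymbol\psi}) + 2 g_{3it}(\hat{\boldsymbol\psi})$$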
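Once $g_{1it}$, $g_{2it}$, $\mathbf{A}_t$ and $\boldsymbol\Sigma^*$ have been evaluated at $\hat{\boldsymbol\psi}$, the estimator is a simple combination of them. The sketch below shows only that last step; the function names and the numeric inputs are hypothetical and purely illustrative.

```python
def trace_product_2x2(a, s):
    """tr(A S) for two 2 x 2 matrices given as nested lists."""
    return (a[0][0] * s[0][0] + a[0][1] * s[1][0]
            + a[1][0] * s[0][1] + a[1][1] * s[1][1])

def mspe_ry(g1, g2, a_t, sigma_star):
    """mspe_RY = g1 + g2 + 2 * g3, with g3 = tr(A_t Sigma*)."""
    g3 = trace_product_2x2(a_t, sigma_star)
    return g1 + g2 + 2.0 * g3

if __name__ == "__main__":
    a_t = [[0.30, 0.05], [0.05, 0.20]]          # illustrative values only
    sigma_star = [[0.02, -0.01], [-0.01, 0.04]]
    print(mspe_ry(0.45, 0.03, a_t, sigma_star))
```

The factor 2 on $g_{3it}$ reflects the usual bias correction for plugging $\hat{\boldsymbol\psi}$ into $g_{1it}$.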

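SLIDE 11

Bootstrap approximation of the MSPE

Following the ideas of Butar and Lahiri (2003) and González-Manteiga et al. (2008), the parametric bootstrap procedure works as follows:

1) Estimate $\hat{\boldsymbol\psi} = \left[\hat\sigma^2(\rho),\ \hat\sigma_v^2(\rho)\right]'$ using the method of moments, and then estimate $\hat{\boldsymbol\beta} = \hat{\boldsymbol\beta}(\hat{\boldsymbol\psi}, \hat{\boldsymbol\theta})$ based on the Rao-Yu model

2) Generate $\mathbf{v}^*$ with elements $v_i^* \sim N(0, \hat\sigma_v^2)$

3) Generate $\mathbf{e}^*$ with elements $e_{it}^* \sim N(0, 1)$, independently of $\mathbf{v}^*$

4) Generate $\boldsymbol\varepsilon^*$ with elements $\varepsilon_{it}^* \sim N(0, \hat\sigma^2)$, independently of $\mathbf{v}^*$ and $\mathbf{e}^*$. Then construct $\mathbf{u}^*$, assuming that $\rho$ is known

5) Construct the bootstrap data $\hat{\boldsymbol\theta}^* = \mathbf{X}\hat{\boldsymbol\beta} + \mathbf{Z}\mathbf{v}^* + \mathbf{u}^* + \mathbf{e}^*$

6) Fit the model to $\hat{\boldsymbol\theta}^*$ and estimate $\hat{\boldsymbol\beta}_B^* = \hat{\boldsymbol\beta}(\hat{\boldsymbol\psi}, \hat{\boldsymbol\theta}^*)$

7) Estimate $\hat{\boldsymbol\psi}^* = \left[\hat\sigma^{2*},\ \hat\sigma_v^{2*}\right]'$ from $\hat{\boldsymbol\theta}^*$. Then fit the model to $\hat{\boldsymbol\theta}^*$ and estimate $\hat{\boldsymbol\beta}_E^* = \hat{\boldsymbol\beta}(\hat{\boldsymbol\psi}^*, \hat{\boldsymbol\theta}^*)$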

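SLIDE 12

Bootstrap approximation of the MSPE

8) Compute the bootstrap temporal BLUP from $\hat{\boldsymbol\theta}^*$:

$$\tilde\theta_{it,B}^*(\hat{\boldsymbol\psi}) = \mathbf{x}_{it}'\hat{\boldsymbol\beta}_B^* + \left(\hat\sigma_v^2 \mathbf{1}_T + \hat\sigma^2 \boldsymbol\gamma_t\right)' \mathbf{V}_i^{-1}(\hat{\boldsymbol\psi}) \left(\hat{\boldsymbol\theta}_i^* - \mathbf{X}_i \hat{\boldsymbol\beta}_B^*\right)$$

9) Compute the bootstrap temporal EBLUP from $\hat{\boldsymbol\theta}^*$:

$$\tilde\theta_{it,E}^*(\hat{\boldsymbol\psi}^*) = \mathbf{x}_{it}'\hat{\boldsymbol\beta}_E^* + \left(\hat\sigma_v^{2*} \mathbf{1}_T + \hat\sigma^{2*} \boldsymbol\gamma_t\right)' \mathbf{V}_i^{-1}(\hat{\boldsymbol\psi}^*) \left(\hat{\boldsymbol\theta}_i^* - \mathbf{X}_i \hat{\boldsymbol\beta}_E^*\right)$$

10) Repeat 2)-9) B times to obtain $\tilde\theta_{it,B}^{*(b)}(\hat{\boldsymbol\psi})$ and $\tilde\theta_{it,E}^{*(b)}(\hat{\boldsymbol\psi}^*)$ (b=1, …, B)

11) Calculate a bootstrap estimate of $g_{3it}$:

$$g_{3it}^* = B^{-1} \sum_{b=1}^{B} \left[\tilde\theta_{it,E}^{*(b)}(\hat{\boldsymbol\psi}^*) - \tilde\theta_{it,B}^{*(b)}(\hat{\boldsymbol\psi})\right]^2$$

Following the lines of Butar and Lahiri (2003), a bias-corrected bootstrap estimator is:

$$\mathrm{mspe}_B\left[\tilde\theta_{it}(\hat{\boldsymbol\psi})\right] = 2\left[g_{1it}(\hat{\boldsymbol\psi}) + g_{2it}(\hat{\boldsymbol\psi})\right] - B^{-1} \sum_{b=1}^{B} \left[g_{1it}(\hat{\boldsymbol\psi}^{*(b)}) + g_{2it}(\hat{\boldsymbol\psi}^{*(b)})\right] + g_{3it}^*$$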
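The final combination step of the bootstrap procedure can be sketched as below, for one (i, t) cell. The function name and argument layout are hypothetical: the inputs are assumed to have been produced by steps 1) to 10), which are not implemented here.

```python
def bootstrap_mspe(g12_hat, g12_boot, blup_boot, eblup_boot):
    """Bias-corrected bootstrap MSPE estimate for one (i, t) cell.

    g12_hat    : g1 + g2 evaluated at psi_hat
    g12_boot   : list of g1 + g2 evaluated at each bootstrap psi_hat*(b)
    blup_boot  : list of bootstrap BLUPs  (step 8), one per replicate
    eblup_boot : list of bootstrap EBLUPs (step 9), one per replicate
    """
    B = len(g12_boot)
    # Step 11: g3* is the mean squared difference between EBLUP and BLUP replicates.
    g3_star = sum((e - b) ** 2 for e, b in zip(eblup_boot, blup_boot)) / B
    # Bias correction: 2 (g1 + g2)(psi_hat) minus the bootstrap mean of (g1 + g2).
    bias_corrected = 2.0 * g12_hat - sum(g12_boot) / B
    return bias_corrected + g3_star

if __name__ == "__main__":
    print(bootstrap_mspe(0.48, [0.45, 0.50, 0.47], [1.9, 2.1, 2.0], [2.0, 2.2, 1.9]))
```

Keeping the BLUP and EBLUP replicates paired by bootstrap sample matters, since the difference in step 11) is taken within each replicate.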

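SLIDE 13

Jackknife approximation of the MSPE

Following the ideas of Jiang et al. (2002) and Chen and Lahiri (2008), the Taylor series approximation of the jackknife MSPE estimator of the EBLUP is:

$$\mathrm{mspe}_J\left[\tilde\theta_{it}(\hat{\boldsymbol\psi})\right] = g_{1it}(\hat{\boldsymbol\psi}) + g_{2it}(\hat{\boldsymbol\psi}) - \hat{\mathbf{c}}_{WJ}' \nabla g_{1t}(\hat{\boldsymbol\psi}) + \mathrm{tr}\left[\mathbf{A}_t(\hat{\boldsymbol\psi})\, \hat{\boldsymbol\upsilon}_{WJ}\right] + \mathrm{tr}\left[\mathbf{L}_t(\hat{\boldsymbol\psi}) \left(\mathbf{y}_i - \mathbf{X}_i\hat{\boldsymbol\beta}(\hat{\boldsymbol\psi})\right) \left(\mathbf{y}_i - \mathbf{X}_i\hat{\boldsymbol\beta}(\hat{\boldsymbol\psi})\right)' \mathbf{L}_t(\hat{\boldsymbol\psi})'\, \hat{\boldsymbol\upsilon}_{WJ}\right]$$

where:

$$\hat{\mathbf{c}}_{WJ} = \sum_{j=1}^{m} w_{j,it} \left(\hat{\boldsymbol\psi}_{-j} - \hat{\boldsymbol\psi}\right)$$

$$\hat{\boldsymbol\upsilon}_{WJ} = \sum_{j=1}^{m} w_{j,it} \left(\hat{\boldsymbol\psi}_{-j} - \hat{\boldsymbol\psi}\right) \left(\hat{\boldsymbol\psi}_{-j} - \hat{\boldsymbol\psi}\right)'$$

  • $\hat{\boldsymbol\psi}_{-j}$ is the estimator of $\boldsymbol\psi$ after deleting the jth small-area data
  • $w_{j,it}$ are weights that satisfy $w_{j,it} = 1 + O(m^{-1})$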
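The two weighted jackknife quantities $\hat{\mathbf{c}}_{WJ}$ and $\hat{\boldsymbol\upsilon}_{WJ}$ can be computed from the delete-one-area estimates as in this sketch. The function name is hypothetical, and the delete-one estimates are assumed to be already available as pairs $(\hat\sigma^2, \hat\sigma_v^2)$.

```python
def jackknife_moments(psi_hat, psi_loo, weights):
    """Return (c_WJ, v_WJ) for a two-component psi = (sigma2, sigma2_v).

    psi_hat : full-sample estimate, a pair of floats
    psi_loo : list of m delete-one-area estimates psi_hat_{-j}
    weights : list of m jackknife weights w_j
    """
    c = [0.0, 0.0]
    v = [[0.0, 0.0], [0.0, 0.0]]
    for w, psi_j in zip(weights, psi_loo):
        d = [psi_j[0] - psi_hat[0], psi_j[1] - psi_hat[1]]   # psi_hat_{-j} - psi_hat
        for r in range(2):
            c[r] += w * d[r]                                 # weighted sum of differences
            for s in range(2):
                v[r][s] += w * d[r] * d[s]                   # weighted sum of outer products
    return c, v

if __name__ == "__main__":
    m = 4
    loo = [(0.26, 0.48), (0.24, 0.53), (0.25, 0.50), (0.27, 0.49)]
    print(jackknife_moments((0.25, 0.50), loo, [(m - 1) / m] * m))
```

The usage line applies equal weights (m-1)/m, one of the two weighting schemes used in the simulation study.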

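SLIDE 14

Jackknife approximation of the MSPE

Where $\nabla g_{1t}(\boldsymbol\psi)$ is the gradient of $g_{1it}(\boldsymbol\psi)$:

$$\nabla g_{1t}(\boldsymbol\psi) = \left(\frac{\partial g_{1it}}{\partial \sigma^2},\ \frac{\partial g_{1it}}{\partial \sigma_v^2}\right)'$$

with:

$$\frac{\partial g_{1it}}{\partial \sigma^2} = \frac{1}{1 - \rho^2} + \left[\left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)' \mathbf{V}_i^{-1} \boldsymbol\Gamma - 2\boldsymbol\gamma_t'\right] \mathbf{V}_i^{-1} \left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)$$

$$\frac{\partial g_{1it}}{\partial \sigma_v^2} = 1 + \left[\left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)' \mathbf{V}_i^{-1} \mathbf{J}_T - 2 \cdot \mathbf{1}_T'\right] \mathbf{V}_i^{-1} \left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)$$

$$\mathbf{L}_t(\boldsymbol\psi) = \left(\frac{\partial \mathbf{b}_t}{\partial \sigma^2},\ \frac{\partial \mathbf{b}_t}{\partial \sigma_v^2}\right)'$$

is a partitioned matrix (2T×1), with:

$$\frac{\partial \mathbf{b}_t}{\partial \sigma^2} = \mathbf{V}_i^{-1} \left[\boldsymbol\gamma_t - \boldsymbol\Gamma\mathbf{V}_i^{-1}\left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)\right]$$

$$\frac{\partial \mathbf{b}_t}{\partial \sigma_v^2} = \mathbf{V}_i^{-1} \left[\mathbf{1}_T - \mathbf{J}_T\mathbf{V}_i^{-1}\left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)\right]$$

$$\mathbf{b}_t = \mathbf{V}_i^{-1} \left(\sigma_v^2 \mathbf{1}_T + \sigma^2 \boldsymbol\gamma_t\right)$$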

SLIDE 15

Monte Carlo simulation study

  • m=28 ; T=7 ; p=2
  • $\boldsymbol\beta = (1, 2)'$
  • $\sigma_v^2 \in \{.5, 1.0\}$
  • $\sigma^2 \in \{.00, .25, .50, 1.00\}$
  • $\rho \in \{.0, .2, .4, .8\}$
  • $\mathbf{x}_{it} = (1, x_{it})'$, with $x_{it} \overset{iid}{\sim} \mathrm{Unif}(0, 1)$

We computed for each data set (l=1, …, 1000):

  • $\mathrm{mspe}_{RY}^{(l)}\left[\tilde\theta_{it}^{(l)}(\hat{\boldsymbol\psi}^{(l)})\right]$
  • $\mathrm{mspe}_{J1}^{(l)}\left[\tilde\theta_{it}^{(l)}(\hat{\boldsymbol\psi}^{(l)})\right]$, with $w_{j,it} = \dfrac{m-1}{m}$
  • $\mathrm{mspe}_{J2}^{(l)}\left[\tilde\theta_{it}^{(l)}(\hat{\boldsymbol\psi}^{(l)})\right]$, with $w_{j,it} = 1 - \mathbf{x}_{j,it}' \left(\sum_{i=1}^{m} \sum_{t=1}^{T} \mathbf{x}_{it}\mathbf{x}_{it}'\right)^{-1} \mathbf{x}_{j,it}$

SLIDE 16

Monte Carlo simulation study

We generated B=250 bootstrap data sets, and then we computed, for each initial data set (l=1, …, 1000):

$$\mathrm{mspe}_{B1}^{(l)}\left[\tilde\theta_{it}^{(l)}(\hat{\boldsymbol\psi}^{(l)})\right], \qquad \text{with} \quad g_{3it}^* = B^{-1} \sum_{b=1}^{B} \left[\tilde\theta_{it,E}^{*(b)}(\hat{\boldsymbol\psi}^*) - \tilde\theta_{it,B}^{*(b)}(\hat{\boldsymbol\psi})\right]^2$$

The MSPE estimators were evaluated through:

$$\mathrm{AARB} = \frac{1}{mT} \sum_{i=1}^{m} \sum_{t=1}^{T} \left| \frac{L^{-1} \sum_{l=1}^{L} \mathrm{mspe}_{it}^{a(l)} - \mathrm{MSPE}_{it}}{\mathrm{MSPE}_{it}} \right| \times 100$$

$$\mathrm{ARMSE} = \frac{1}{mTL} \sum_{i=1}^{m} \sum_{t=1}^{T} \sum_{l=1}^{L} \frac{\left(\mathrm{mspe}_{it}^{a(l)} - \mathrm{MSPE}_{it}\right)^2}{\mathrm{MSPE}_{it}} \times 100$$

with a∈{RY, B1, J1, J2}, and

RBN: the percentage of areas where the relative bias (RB) is negative.
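The bias-based summaries can be sketched as follows. The function names are hypothetical, and the code makes one assumption beyond the slide: RBN is computed over all (i, t) cells, each treated as one unit, since the slides do not say how the time points are aggregated within an area.

```python
def relative_bias(estimates, true_mspe):
    """RB (%) for one (i, t) cell: Monte Carlo mean of the estimator against the true MSPE."""
    mean_est = sum(estimates) / len(estimates)
    return 100.0 * (mean_est - true_mspe) / true_mspe

def summarize(est_by_cell, true_by_cell):
    """Return (AARB, RBN) in percent over all cells.

    est_by_cell  : list of lists, the L replicate MSPE estimates for each cell
    true_by_cell : list of true MSPE values, one per cell
    """
    rbs = [relative_bias(e, t) for e, t in zip(est_by_cell, true_by_cell)]
    aarb = sum(abs(rb) for rb in rbs) / len(rbs)          # average absolute relative bias
    rbn = 100.0 * sum(rb < 0 for rb in rbs) / len(rbs)    # share of cells with negative RB
    return aarb, rbn

if __name__ == "__main__":
    est = [[0.50, 0.55, 0.60], [0.40, 0.38, 0.36]]
    print(summarize(est, [0.50, 0.45]))
```

A large RBN signals that an estimator tends to underestimate the true MSPE, which is usually the more serious direction of error for uncertainty measures.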

SLIDE 17

Monte Carlo simulation study

Table 1: RBN, AARB and ARMSE of MSPE estimators, ρ=0.0

             $\sigma_v^2$   PR-MSPE   B1-MSPE   J1-MSPE   J2-MSPE
RBN (%)      0.5             77.679    54.607    38.000    47.500
             1.0             74.464    57.964    69.143    61.750
AARB (%)     0.5             25.042    35.209    22.617    22.575
             1.0             16.127    16.901    13.781    12.851
ARMSE (%)    0.5              4.464     8.789     4.313     3.888
             1.0              3.089     4.295     2.093     2.012

SLIDE 18

Monte Carlo simulation study

Figure 1: Box-and-whisker plots of RB of MSPE estimators, ρ=0.0

SLIDE 19

Monte Carlo simulation study

Table 2: RBN, AARB and ARMSE of MSPE estimators, ρ=0.2

             $\sigma^2$   $\sigma_v^2$   RY-MSPE   B1-MSPE   J1-MSPE   J2-MSPE
RBN (%)      0.25         0.5             49.408    40.194    51.449    46.235
             0.25         1.0             44.184    41.378    39.408    39.663
             0.50         0.5             48.490    42.265    44.867    44.673
             0.50         1.0             39.837    24.224    37.347    36.939
             1.00         0.5             41.133    19.214    38.694    38.204
             1.00         1.0             31.500    10.622    30.010    29.286
AARB (%)     0.25         0.5             26.388    32.833    30.168    31.100
             0.25         1.0             26.700    30.287    32.105    31.985
             0.50         0.5             16.401    26.399    17.374    17.278
             0.50         1.0             17.176    24.605    18.333    18.286
             1.00         0.5              9.770    14.992    10.156    10.122
             1.00         1.0             10.854    21.099    11.399    11.431
ARMSE (%)    0.25         0.5              2.886     4.578     4.466     4.678
             0.25         1.0              2.999     3.866     4.786     4.774
             0.50         0.5              1.677     1.690     1.864     1.838
             0.50         1.0              1.762     3.582     2.006     1.993
             1.00         0.5              0.752     1.619     0.811     0.805
             1.00         1.0              0.903     3.242     0.995     1.001

SLIDE 20

Monte Carlo simulation study

Figure 2: Box-and-whisker plots of RB of MSPE estimators, ρ=0.2

SLIDE 21

Monte Carlo simulation study

Table 3: RBN, AARB and ARMSE of MSPE estimators, ρ=0.4

             $\sigma^2$   $\sigma_v^2$   RY-MSPE   B1-MSPE   J1-MSPE   J2-MSPE
RBN (%)      0.25         0.5             76.541    71.980    67.612    67.367
             0.25         1.0             73.898    67.408    65.296    67.449
             0.50         0.5             73.337    66.969    64.337    63.908
             0.50         1.0             74.857    68.724    67.949    67.612
             1.00         0.5             51.418    44.806    47.235    47.020
             1.00         1.0             56.816    46.520    52.735    52.449
AARB (%)     0.25         0.5             31.607    35.633    38.391    38.503
             0.25         1.0             28.119    28.938    33.028    33.086
             0.50         0.5             22.858    21.495    25.819    25.864
             0.50         1.0             20.858    17.046    23.231    23.248
             1.00         0.5             12.236    11.776    13.477    13.508
             1.00         1.0             11.549    10.464    12.758    12.785
ARMSE (%)    0.25         0.5              4.587     6.693     7.559     7.632
             0.25         1.0              3.872     4.715     5.812     6.005
             0.50         0.5              3.394     3.453     4.854     4.881
             0.50         1.0              3.002     2.455     4.809     4.849
             1.00         0.5              1.176     1.092     1.439     1.446
             1.00         1.0              1.108     0.918     1.351     1.357

SLIDE 22

Monte Carlo simulation study

Figure 3: Box-and-whisker plots of RB of MSPE estimators, ρ=0.4

SLIDE 23

Application with real data

Real time series obtained from the Prices of the Habitation Transaction Survey (PHTS) and the Prices of Bank Evaluation in the Habitation Survey (PBEHS) were used.

  • Data available on a quarterly basis (T=7)
  • Main goal: mean price of the habitation transaction at NUTS III level
  • Auxiliary variable: mean price of bank evaluation at NUTS III level
  • 28 NUTS III regions were used as domains of interest (m=28)

SLIDE 24

Application with real data

Table 4: Sample size, mean estimates and coefficients of variation for the EBLUP estimator

Domain   $n_i$   $\mu_{it}$   CV anal.      Domain   $n_i$   $\mu_{it}$   CV anal.
1            1      646       5.9%          15          39      735       3.8%
2            1      714       5.3%          16          34      867       3.6%
3            7      661       5.6%          17          17      814       3.5%
4            6      718       5.2%          18          40      876       3.4%
5           31      763       5.1%          19          12      960       3.4%
6           18      704       5.1%          20          24      974       3.1%
7           19      670       4.9%          21          77      937       2.5%
8           56      756       4.6%          22          49      866       2.5%
9           12      769       4.4%          23          26     1128       2.4%
10          23      820       4.4%          24          90      956       1.8%
11          19      804       4.3%          25          89     1172       1.6%
12          17      666       4.2%          26         405     1041       1.3%
13          22      710       4.1%          27         263     1073       1.0%
14          27      658       3.9%          28         488     1321       0.7%

SLIDE 25

Application with real data

Figure 4: Coefficients of variation for the EBLUP estimator

SLIDE 26

Conclusion

  • It is difficult to find one MSPE estimator which performs better than the others in terms of both bias and precision;
  • All estimators have absolute relative biases and MSEs of the same order of magnitude;
  • The resampling-based approaches outperform the asymptotic analytical approximation in several situations;
  • The bootstrap estimator tends to show a similar performance to the jackknife estimators;
  • It seems suitable to use resampling-based methods to estimate the uncertainty of the temporal EBLUP, as an alternative to estimators based on long analytical developments.

SLIDE 27

Further Research

  • Assess different measures of uncertainty for:
      – Different numbers of small areas and time points;
      – Unknown ρ
  • Use resampling-based methods under more complex longitudinal small area models in which it is impossible to obtain analytical approximations of the MSPE of the EBLUP.

SLIDE 28

References

Butar, F.B., & Lahiri, P. (2003). On measures of uncertainty of empirical Bayes small area estimators. Journal of Statistical Planning and Inference, 112, 63-76.

Chen, S., & Lahiri, P. (2008). On mean squared prediction error estimation in small area estimation problems. Communications in Statistics - Theory and Methods, 37, 1792-1798.

González-Manteiga, W., Lombardía, M., Molina, I., Morales, D., & Santamaría, L. (2008). Bootstrap mean squared error of a small-area EBLUP. Journal of Statistical Computation and Simulation, 78, 443-462.

Jiang, J., Lahiri, P., & Wan, S.-M. (2002). A unified jackknife theory for empirical best prediction with M-estimation. The Annals of Statistics, 30, 1782-1810.

Kackar, R.N., & Harville, D.A. (1984). Approximations for standard errors of estimators of fixed and random effects in mixed linear models. Journal of the American Statistical Association, 79, 853-862.

Prasad, N.G.N., & Rao, J.N.K. (1990). The estimation of the mean squared error of small-area estimators. Journal of the American Statistical Association, 85, 163-171.

Rao, J.N.K., & Yu, M. (1994). Small-area estimation by combining time-series and cross-sectional data. The Canadian Journal of Statistics, 22(4), 511-528.

SLIDE 29

Acknowledgments

Presentation partially supported by:

Portuguese Foundation for Science and Technology: fellowship SFRH/BD/36764/2007

SLIDE 30

Assessing uncertainty of the temporal EBLUP: a resampling-based approach

Luís Nobre Pereira (Lmper@ualg.pt)
Pedro Simões Coelho (Psc@isegi.unl.pt)

11th June, 2008