A three-level M-quantile model for poverty mapping in Poland

SLIDE 1

Introduction Poverty rate M-quantile model Results Summary

A three-level M-quantile model for poverty mapping in Poland

Maciej Beręsewicz (1), Stefano Marchetti (2), Nicola Salvati (2), Marcin Szymkowiak (1), Łukasz Wawrowski (1)

(1) Poznan University of Economics and Business, (2) University of Pisa

19-08-2016

SAE 2016 Maastricht - The Netherlands

SLIDE 2


Outline

1 Introduction
2 Poverty rate
   Poverty rate definition
   Aim of the presentation
   Territorial aggregation of Poland
   Review of using SAE in Poland – poverty mapping
3 M-quantile model
   M-quantile model
   A three level M-quantile model
4 Results
5 Summary

SLIDE 3

Introduction

According to the Central Statistical Office (CSO) of Poland, the poverty indicator for the whole country in 2011, based on the EU-SILC survey, amounted to 17.7%.

The CSO does not publish the poverty indicator for lower levels of spatial aggregation (NUTS 2 and below). This information is available only for the whole country and at the regional level (NUTS 1).

Figure: The poverty rate at the level of regions (NUTS 1)

SLIDE 4

Introduction

In Poland there is a huge demand for information about poverty indicators at lower levels of spatial aggregation.

Poverty maps are required in Poland by the Ministry of Infrastructure and Development, the Ministry of Labour and Social Policy, local authorities and many other organizations.

Given the small sample sizes in the relevant cross-classifications of the EU-SILC survey, it is necessary to use indirect estimation techniques that draw on alternative data sources in order to estimate the parameters of interest at low levels of spatial aggregation with acceptable precision.

SLIDE 5

Poverty rate

Definition of poverty rate

Percentage of persons with an equivalised disposable income (after social transfers) below the at-risk-of-poverty threshold set at 60% of the national median of equivalised disposable income.

Formula for poverty rate

$$\mathrm{ARPR} = \frac{N_u}{N} \cdot 100, \tag{1}$$

where $N_u$ is the number of households living in poverty and $N$ is the number of all households.
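As an illustration of formula (1), the following minimal sketch computes the poverty rate from a list of equivalised incomes. The function name, the toy income values and the use of a single list for both the threshold and the head count are assumptions made for this example; in the application the threshold comes from the national median, not from the area itself.

```python
from statistics import median

def at_risk_of_poverty_rate(incomes, threshold_share=0.6):
    """Poverty rate in percent: share of incomes below 60% of the median."""
    threshold = threshold_share * median(incomes)
    n_poor = sum(1 for income in incomes if income < threshold)
    return 100.0 * n_poor / len(incomes)

# Hypothetical equivalised disposable incomes (illustrative values only).
incomes = [500, 900, 1000, 1100, 1200, 1500, 2000, 2500, 3000, 4000]
arpr = at_risk_of_poverty_rate(incomes)
```

Here the median is 1350, the threshold is 810, and only one of the ten incomes falls below it.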

SLIDE 6

Aim of the presentation

Estimation of the at-risk-of-poverty indicator (poverty rate, head count ratio, HCR) at LAU 1 level (Polish poviats) using a three-level M-quantile model.

SLIDE 7

Territorial aggregation of Poland

7 regions (NUTS 1)
16 voivodships (NUTS 2)
66 subregions (NUTS 3)
379 poviats (NUTS 4/LAU 1)
2478 gminas (NUTS 5/LAU 2)

SLIDE 12

Review of using SAE in Poland – poverty mapping

In 2013 the Center for Small Area Estimation, which is a special unit at the Statistical Office in Poznan, in cooperation with Central Statistical Office of Poland and the World Bank prepared a poverty map of Poland at the level of subregions (NUTS 3) using the Fay-Herriot approach. By implementing the Fay-Herriot area level model it was possible to produce estimates of the HCR in Poland at the level of subregions, i.e. at a lower level of aggregation than the direct estimates published by official statistics so far. This has increased the scope of information about poverty: it is now available at the level of 66 subregions.

SLIDE 13

Review of using SAE in Poland – poverty mapping

A preliminary analysis of the poverty map created using the SAE methodology has revealed a difference between Central and Eastern Poland (with a higher poverty rate) and Western Poland, characterized by a lower poverty rate.

This was the first step in using the SAE methodology to estimate the poverty rate in Poland.

The next step is to create a poverty map of Poland at LAU 1 level using data from EU-SILC, the Polish Local Data Bank and the Polish National Census of Population and Housing 2011, together with a three-level M-quantile model.

Figure: The poverty rate at the level of subregions (NUTS 3)

SLIDE 14

M-quantile model

The classical regression model summarizes the behavior of the mean of a random variable at each point in a set of covariates. Quantile regression instead summarizes the behavior of different parts of the conditional distribution $f(y \mid x)$ at each point in the set of the $x$'s. A linear regression model for the $q$-th conditional quantile of $f(y \mid x)$ is

$$Q_y(q \mid X) = X\beta(q), \tag{2}$$

where $X$ is a design matrix and $y_j$ is a scalar response variable corresponding to a realization of a continuous random variable with unknown continuous cumulative distribution function. Estimates of $\beta(q)$ are obtained by minimizing the following function:

$$\sum_{j=1}^{n} \left| y_j - x_j^T\beta(q) \right| \left\{ q\, I\big(y_j - x_j^T\beta(q) > 0\big) + (1-q)\, I\big(y_j - x_j^T\beta(q) \le 0\big) \right\}. \tag{3}$$
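To make the loss in (3) concrete, here is a small sketch for the intercept-only case, where $x_j^T\beta(q)$ reduces to a single location value. The helper names and the toy data are assumptions for illustration; the candidate locations are restricted to the observed values, which is enough to show that the minimizer moves with $q$ and is not pulled by the outlier.

```python
def check_loss(residual, q):
    """Asymmetric absolute loss of equation (3) for a single residual."""
    weight = q if residual > 0 else 1.0 - q
    return abs(residual) * weight

def total_loss(y, location, q):
    """Sum of the check losses of all observations around one location."""
    return sum(check_loss(value - location, q) for value in y)

# Toy data with one large outlier (illustrative values only).
y = [1.0, 2.0, 3.0, 4.0, 100.0]
candidates = sorted(y)

# The minimizer over the observed values is a sample quantile.
best_q50 = min(candidates, key=lambda m: total_loss(y, m, 0.5))
best_q70 = min(candidates, key=lambda m: total_loss(y, m, 0.7))
```

With these data the loss at $q = 0.5$ is smallest at the sample median 3, and at $q = 0.7$ it is smallest at 4.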

SLIDE 15

M-quantile model

M-quantile regression is a "quantile-like" generalization of regression based on influence functions (M-regression). The M-quantile of order $q$ of the conditional density $f(y \mid x)$, denoted by $m$, is defined as the solution to the estimating equation

$$\int \psi_q(y - m)\, f(y \mid x)\, dy = 0, \tag{4}$$

where $\psi_q$ is an asymmetric influence function, the derivative of an asymmetric loss function $\rho_q$. When a linear relation between the M-quantile $m$ and the auxiliary variables holds, the M-quantile regression model is

$$m_y(q \mid X) = X\beta_\psi(q). \tag{5}$$

SLIDE 16

M-quantile model

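An estimate of $\beta_\psi(q)$ is obtained by minimizing

$$\sum_{j=1}^{n} \rho_q\big(y_j - x_j^T\beta_\psi(q)\big). \tag{6}$$

The minimization of (6) reduces to the following estimating equation:

$$\sum_{j=1}^{n} 2\,\psi_q\big(s^{-1}(y_j - x_j^T\beta_\psi(q))\big)\, x_j = 0, \tag{7}$$

where $s$ is a suitable robust estimate of scale and $\psi_q$ is the derivative of the Huber loss function $\rho_q$, equal to

$$\psi_q(u) = \begin{cases} q\,\psi(u) & \text{if } u > 0, \\ (1-q)\,\psi(u) & \text{if } u \le 0, \end{cases} \tag{8}$$

with $\psi(u) = u\,I(|u| \le c) + \operatorname{sgn}(u)\,c\,I(|u| > c)$, where $c$ is a tuning constant. Provided that the tuning constant $c$ is strictly greater than zero, estimates of $\beta_\psi(q)$ are obtained using iterative weighted least squares (IWLS).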
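The IWLS idea behind (7) and (8) can be sketched as follows for a model with an intercept and a single covariate. This is a minimal illustration, not the authors' implementation: the function names, the tuning constant $c = 1.345$, the scale estimate (median absolute residual divided by 0.6745) and the simulated data are all assumptions made for the example.

```python
import random
from statistics import median

def huber_psi(u, c=1.345):
    """Huber influence function: u inside [-c, c], clipped to +/-c outside."""
    return max(-c, min(c, u))

def mq_weight(u, q, c=1.345):
    """IWLS weight psi_q(u) / u for a scaled residual u (equation 8)."""
    side = q if u > 0 else 1.0 - q
    if u == 0.0:
        return side  # psi(u) / u tends to 1 as u tends to 0
    return side * huber_psi(u, c) / u

def fit_mq_line(x, y, q, c=1.345, max_iter=500, tol=1e-9):
    """Fit y = a + b * x at M-quantile level q by IWLS."""
    w = [1.0] * len(x)
    a = b = 0.0
    for _ in range(max_iter):
        # Weighted least squares for one covariate, in closed form.
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        new_b = sxy / sxx
        new_a = my - new_b * mx
        res = [yi - new_a - new_b * xi for xi, yi in zip(x, y)]
        # Assumed scale estimate s: median absolute residual / 0.6745.
        s = median(abs(r) for r in res) / 0.6745 or 1.0
        w = [mq_weight(r / s, q, c) for r in res]
        done = abs(new_a - a) + abs(new_b - b) < tol
        a, b = new_a, new_b
        if done:
            break
    return a, b

# Simulated data (assumption): y = 2 + 3x plus standard normal noise.
rng = random.Random(0)
x = [i / 10 for i in range(200)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
a25, b25 = fit_mq_line(x, y, 0.25)
a50, b50 = fit_mq_line(x, y, 0.50)
a75, b75 = fit_mq_line(x, y, 0.75)
```

Each pass solves a weighted least squares problem and then recomputes the weights from the scaled residuals, so that at convergence the weighted normal equations coincide with the estimating equation (7). The three fitted lines should be ordered by $q$ in the middle of the covariate range.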

SLIDE 17

M-quantile model

Chambers and Tzavidis (2006) extended the use of M-quantile regression models to small area estimation. They characterize the conditional variability across the population of interest by the M-quantile coefficients of the population units. For unit $j$ with values $y_j$ and $x_j$, this coefficient is the value $\theta_j$ such that $m_y(\theta_j \mid x_j) = y_j$. The M-quantile coefficients are determined at the population level. Consequently, if a hierarchical structure explains part of the variability in the population data, then we expect units within areas (or domains) defined by this hierarchy to have similar M-quantile coefficients. Using the M-quantile coefficients it is possible to define an M-quantile small area model:

$$y_{jd} = x_{jd}^T\beta_\psi(\theta_d) + \epsilon_{jd}, \tag{9}$$

where $\beta_\psi(\theta_d)$ is the unknown vector of M-quantile regression parameters for the unknown area-specific M-quantile coefficient $\theta_d$, and $\epsilon_{jd}$ is the unit-level random error term with distribution function $G$, for which no explicit parametric assumptions are made.

SLIDE 18

M-quantile model

The area-specific M-quantile coefficient $\theta_d$ is estimated by averaging the M-quantile coefficients of the sample units in area $d$:

$$\hat\theta_d = n_d^{-1} \sum_{j=1}^{n_d} \theta_j.$$

Then $\beta_\psi(\hat\theta_d)$ is estimated by solving equation (7). The predicted values under the M-quantile small area model are

$$\hat y_{jd} = x_{jd}^T \hat\beta_\psi(\hat\theta_d).$$
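The unit coefficients $\theta_j$ and their area averages $\hat\theta_d$ can be illustrated with a deliberately simplified sketch. It assumes an intercept-only model (no covariates), a fixed scale estimate, a grid of levels $0.01, \dots, 0.99$ with linear interpolation, and made-up data; none of these choices is taken from the slides.

```python
from statistics import mean, median

def m_quantile(y, q, c=1.345, n_iter=200):
    """Intercept-only Huber M-quantile of y at level q, computed by IWLS."""
    m = median(y)
    # Assumed scale: median absolute deviation / 0.6745, held fixed.
    s = median(abs(v - m) for v in y) / 0.6745 or 1.0
    for _ in range(n_iter):
        weights = []
        for v in y:
            u = (v - m) / s
            side = q if u > 0 else 1.0 - q
            weights.append(side * (1.0 if abs(u) <= c else c / abs(u)))
        m = sum(w * v for w, v in zip(weights, y)) / sum(weights)
    return m

def unit_coefficients(y, grid):
    """theta_j: the level q at which the fitted M-quantile equals y_j."""
    fitted = [m_quantile(y, q) for q in grid]
    thetas = []
    for v in y:
        if v <= fitted[0]:
            thetas.append(grid[0])
        elif v >= fitted[-1]:
            thetas.append(grid[-1])
        else:
            # Linear interpolation between the two bracketing grid points.
            k = next(i for i in range(1, len(grid)) if fitted[i] >= v)
            frac = (v - fitted[k - 1]) / (fitted[k] - fitted[k - 1])
            thetas.append(grid[k - 1] + frac * (grid[k] - grid[k - 1]))
    return thetas

def area_coefficients(thetas, areas):
    """theta_hat_d: average of the unit coefficients within each area."""
    return {d: mean(t for t, a in zip(thetas, areas) if a == d)
            for d in sorted(set(areas))}

# Hypothetical incomes in two areas; area "B" is better off than area "A".
y = [8.0, 9.0, 10.0, 11.0, 12.0, 14.0, 15.0, 16.0, 18.0, 30.0]
areas = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]
grid = [i / 100 for i in range(1, 100)]
theta_hat = area_coefficients(unit_coefficients(y, grid), areas)
```

An area whose units sit high in the overall distribution receives a larger $\hat\theta_d$, which plays the role that an area random effect plays in a mixed model.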

SLIDE 19

A three level M-quantile model

In this part we present an extension of the M-quantile small area model (9) that mimics a three-level nested error model (whereas (9) mimics a two-level nested error model). With this extension it is possible to take into account a nested hierarchy in the data using the M-quantile approach. As an example, consider a nested hierarchy of households within primary sampling units within small areas. If there is evidence of departures from the normality assumptions required by a three-level mixed model, our extension of the M-quantile model is an alternative that can account for three variance components.

SLIDE 20

A three level M-quantile model

Consider a population divided into $D$ areas and $C$ primary sampling units (hereafter clusters), where the clusters partition each area, i.e. no two clusters intersect. Let there be $C_d$ clusters in area $d$, $d = 1, \dots, D$, and $N_{cd}$ units in each cluster. A possible specification of a three-level M-quantile model is as follows:

$$y_{jcd} = x_{jcd}^T\beta_\psi(\theta_d) + x_{jcd}^T\gamma_\psi(\theta_{cd}) + \epsilon_{jcd}, \tag{10}$$

where $y_{jcd}$ is the (continuous) target variable for unit $j$ in cluster $c$ in area $d$, $x_{jcd}$ is a $p$-vector of auxiliary variables known for all units in the population, $\theta_d$ is the unknown area M-quantile coefficient, $\beta_\psi(\theta_d)$ is the unknown $p$-vector of area-level regression parameters, $\theta_{cd}$ is the unknown cluster M-quantile coefficient, $\gamma_\psi(\theta_{cd})$ is the unknown $p$-vector of cluster-level regression parameters and $\epsilon_{jcd}$ is a unit error term. The subscript $\psi$ indicates a robust influence function (e.g. the Huber influence function).

SLIDE 21

A three level M-quantile model

Bearing in mind the M-quantile approach to small area estimation summarized before, the quantities $\theta_d$, $\beta_\psi(\theta_d)$, $\theta_{cd}$ and $\gamma_\psi(\theta_{cd})$ in equation (10) are unknown and can be estimated as follows:

(a) Starting from the M-quantile linear model

$$y_{jcd} = x_{jcd}^T\beta_\psi(\theta_d) + u_{jcd}, \tag{11}$$

estimate $\theta_d$ and then $\beta_\psi(\theta_d)$ according to the M-quantile approach to small area estimation summarized before. Let these estimates be $\hat\theta_d$ and $\hat\beta_\psi(\hat\theta_d)$.

(b) Then compute the residuals $\hat u_{jcd} = y_{jcd} - x_{jcd}^T\hat\beta_\psi(\hat\theta_d)$.

SLIDE 22

A three level M-quantile model

(c) Next, using the residuals $\hat u_{jcd}$ as the target variable in the M-quantile linear model

$$\hat u_{jcd} = x_{jcd}^T\gamma_\psi(\theta_{cd}) + \epsilon_{jcd}, \tag{12}$$

estimate $\theta_{cd}$ and then $\gamma_\psi(\theta_{cd})$, again using the M-quantile approach to small area estimation. Here $\hat u_{jcd}$ is the target variable, $\theta_{cd}$ is the domain (e.g. cluster) M-quantile coefficient and $\gamma_\psi(\theta_{cd})$ is the vector of regression parameters. Let these estimates be $\hat\theta_{cd}$ and $\hat\gamma_\psi(\hat\theta_{cd})$.

(d) Finally, the estimated model (10) is

$$y_{jcd} = x_{jcd}^T\hat\beta_\psi(\hat\theta_d) + x_{jcd}^T\hat\gamma_\psi(\hat\theta_{cd}) + \hat\epsilon_{jcd}, \tag{13}$$

where $\hat\epsilon_{jcd} = y_{jcd} - x_{jcd}^T\hat\beta_\psi(\hat\theta_d) - x_{jcd}^T\hat\gamma_\psi(\hat\theta_{cd})$ are the residuals of the three-level M-quantile model.
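The residual-on-residual structure of steps (a) to (d) can be sketched independently of how each level is fitted. In the sketch below the per-group fit is a pluggable function, and a plain within-group median stands in for the M-quantile fit; that substitution, the absence of covariates, the function names and the toy data are all assumptions made to keep the example short. It shows only how the area part, the cluster part and the unit residual are obtained in sequence.

```python
from statistics import median

def group_fit(values, groups, fit=median):
    """Fitted value for each observation: `fit` applied within its group."""
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    level = {group: fit(members) for group, members in by_group.items()}
    return [level[group] for group in groups]

def three_level_decomposition(y, areas, clusters, fit=median):
    """Split y into an area part, a cluster part and a unit residual."""
    # Steps (a)-(b): area-level fit, then residuals u_hat.
    area_part = group_fit(y, areas, fit)
    u_hat = [value - a for value, a in zip(y, area_part)]
    # Step (c): cluster-level fit with the residuals as target variable.
    cluster_part = group_fit(u_hat, clusters, fit)
    # Step (d): residuals of the three-level model.
    eps_hat = [u - c for u, c in zip(u_hat, cluster_part)]
    return area_part, cluster_part, eps_hat

# Toy data (assumption): two areas, each with two clusters of two units.
y = [10.0, 12.0, 20.0, 22.0, 50.0, 52.0, 70.0, 74.0]
areas = ["A", "A", "A", "A", "B", "B", "B", "B"]
clusters = ["a1", "a1", "a2", "a2", "b1", "b1", "b2", "b2"]
area_part, cluster_part, eps_hat = three_level_decomposition(y, areas, clusters)
```

By construction the three parts add back up to the observed value for every unit, mirroring equation (13). Replacing `fit` with an M-quantile fit at the estimated group coefficient would bring the sketch closer to the procedure on the slides.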

SLIDE 23

A three level M-quantile model

The Monte Carlo technique was used to estimate the poverty rate in Poland at LAU 1 level. With the Monte Carlo approach it is possible to estimate any parameter that is a function of the (continuous) target variable $y$, say $h(y)$. In the Polish EU-SILC 2011, units were sampled in $D = 375$ of the 379 poviats (small areas), leaving 4 out-of-sample areas. Let $c_d$ be the number of sampled census enumeration units (clusters; gminas in our case) out of the $C_d$ clusters in area $d$, $d = 1, \dots, D$. Sampled cluster $c$ in area $d$ has sample size $n_{cd}$. Model parameter estimates are available only for sampled areas and clusters. To micro-simulate the population using the auxiliary variables and the model parameter estimates, we chose a non-parametric approach. The estimation procedure for the population parameter of interest, $h(y)$, at small area level is as follows:

SLIDE 24

A three level M-quantile model

(a) Estimate model (10) using the sample data.

(b) Generate a synthetic population according to model (10):

$$y^{syn}_{jcd} = x_{jcd}^T\hat\beta_\psi(\hat\theta_d) + x_{jcd}^T\hat\gamma_\psi(\hat\theta_{cd}).$$

To predict the values of $y^{syn}_{jcd}$ for the $\sum_{d=1}^{D}(C_d - c_d)$ out-of-sample clusters, sample with replacement from the $\hat\theta_{cd}$. Let $\hat\theta^{*}_{cd}$ be the sampled M-quantile coefficient for out-of-sample cluster $c_{out}$ in area $d$; the predicted values in this cluster are

$$y^{syn}_{k c_{out} d} = x_{k c_{out} d}^T\hat\beta_\psi(\hat\theta_d) + x_{k c_{out} d}^T\hat\gamma_\psi(\hat\theta^{*}_{cd}).$$

(c) Generate the $k$-th Monte Carlo population by adding a disturbance to the synthetic values, $y^{k}_{jcd} = y^{syn}_{jcd} + \zeta_{jcd}$, where $\zeta_{jcd}$ can be obtained by sampling from the residuals $\hat\epsilon_{jcd}$ of model (10).

SLIDE 25

A three level M-quantile model

(d) Compute the area target statistic $h(y_d)$ on the $k$-th Monte Carlo population, $h(y^{k}_d)$, where $y^{k}_d$ is the vector of Monte Carlo population values in area $d$.

(e) Repeat steps (b) to (d) $K$ times and then estimate $h(y)$ by averaging over the $K$ Monte Carlo populations:

$$\hat h(y_d) = K^{-1}\sum_{k=1}^{K} h(y^{k}_d).$$

(f) To estimate the MSE of $\hat h(y_d)$, the random effect block bootstrap method proposed by Chambers and Chandra (2013) may be used.
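Steps (c) to (e) can be sketched as follows, taking the head count ratio as the target statistic $h$. The sketch assumes that the synthetic values from step (b) and a pool of model residuals are already available, and it keeps the synthetic values fixed across replicates instead of redrawing the coefficients of out-of-sample clusters each time; the function name, the threshold and all numbers are hypothetical.

```python
import random

def monte_carlo_hcr(y_syn_by_area, residuals, threshold, n_rep=200, seed=1):
    """Average head count ratio (percent) over n_rep Monte Carlo populations."""
    rng = random.Random(seed)
    estimates = {}
    for area, y_syn in y_syn_by_area.items():
        total = 0.0
        for _ in range(n_rep):
            # Step (c): add a residual drawn with replacement to each value.
            y_k = [value + rng.choice(residuals) for value in y_syn]
            # Step (d): target statistic h, here the share below the threshold.
            total += 100.0 * sum(value < threshold for value in y_k) / len(y_k)
        # Step (e): average over the Monte Carlo populations.
        estimates[area] = total / n_rep
    return estimates

# Hypothetical synthetic incomes for three areas and a small residual pool.
y_syn_by_area = {
    "poor": [700.0] * 60,
    "mixed": [800.0] * 60,
    "rich": [2000.0] * 60,
}
residuals = [-100.0, 0.0, 100.0]
hcr = monte_carlo_hcr(y_syn_by_area, residuals, threshold=810.0)
```

In the "mixed" area two of the three possible disturbances leave a unit below the threshold, so its estimate should be close to two thirds, while the other two areas are entirely below or entirely above it.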

SLIDE 26

Results

To estimate the poverty rate in Poland, data from the EU-SILC 2011, the Polish National Census of Population and Housing 2011 and the Polish Local Data Bank were used. The selected variables included, among others: the fraction of males in the household (males), the child dependency ratio in the household (children), the fraction of people aged 30–44 in the household (people30_44), the fraction of people aged 65 and over in the household (people65), the fraction of unemployed in the household (unemployed), the fraction of disabled people in the household (disabled), the fraction of people with basic/elementary education in the household (educ_elementary), the fraction of people with higher education in the household (educ_high), whether the household owns a flat with only one room (dummy variable, room1), whether the household owns a flat with more than three rooms (dummy variable, room3), and whether the household lives in a village or in a city with fewer than 20 thousand inhabitants (dummy variable, village_city20).

SLIDE 27

Results

Table: Summary statistics of small areas poverty rate estimates

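Poverty rate (%)

Estimator     Minimum   1st quartile   Median   Mean    3rd quartile   Maximum
M-quantile    8.34      20.97          28.31    28.30   34.80          57.56

Direct: 9.71, 16.62, 20.15, 28.45, 84.52 (only five of the six values survive in the source, so they cannot be aligned to the columns with certainty)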

SLIDE 28

Results

The first group comprises 48 poviats with the lowest poverty rates, from 8.3% to 17.3%. These domains consist mainly of the Polish capital (Warsaw), provincial capital cities (e.g. Poznan, Wroclaw, Krakow, Gdansk and Szczecin) and their agglomerations.

For peripherally located domains, the poverty rate increases with the distance from the capital cities. Other poviats with low poverty are characterized by well-developed industry in their area or in the direct vicinity, for instance copper, coal and brown coal basins, and the aviation and chemical industries.

Figure: The poverty rate at the level of poviats (LAU 1) using a three level M-quantile model

SLIDE 29


Results

The last two groups, characterized by the highest poverty rates (over 34.2%), consist of 101 poviats (27% of all poviats in Poland). In these domains more than 1/3 of households live below the poverty line.

Moreover, strong spatial clustering is visible in the east of Poland. In particular, poviats in the south-east, on the border with Ukraine, have the highest poverty. Another group of poviats with higher poverty rates lies in the central-west, along the border with Germany.

Figure: The poverty rate at the level of poviats (LAU 1) using a three level M-quantile model

SLIDE 30


Results

Table: Precision of obtained estimates of poverty rate

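Poverty rate

Columns: less than 16.6% | 16.6%–33.3% | 33.3% and more | N/A

Direct: 27, 337, 11
M-quantile: 9, 237, 129

(Each row lists three counts of poviats, summing to 375, for four columns; the empty cell cannot be identified from the source.)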

SLIDE 31

Results

Figure: Ratio of the estimated CVs of the direct and M-quantile estimates versus the sample size of each poviat

Figure: Comparison of the direct and M-quantile CVs of the poverty rate, grouped by sample size

SLIDE 32


Summary

Results obtained using a three-level M-quantile model are characterized by better precision than direct estimates.

This is one of the first applications of census unit-level data to poverty estimation in Poland.

Using a three-level M-quantile model increased the scope of information about poverty in Poland at the level of poviats.

Using a three-level M-quantile model also made it possible to produce the poverty map of Poland at this level of spatial aggregation, as well as territorial analyses of this phenomenon.

SLIDE 33

Literature

Chambers, R. and Chandra, H. (2013), A random effect block bootstrap for clustered data, Journal of Computational and Graphical Statistics, 22:452–470.

Chambers, R. and Tzavidis, N. (2006), M-quantile models for small area estimation, Biometrika, 93(2):255–268.

Molina, I. and Rao, J.N.K. (2010), Small area estimation of poverty indicators, The Canadian Journal of Statistics, 38(3):369–385.

SLIDE 34


Acknowledgments

The research leading to these results has received support under the European Commission's 7th Framework Programme (FP7/2013–2017) under grant agreement no. 312691, InGRID – Inclusive Growth Research Infrastructure Diffusion.