SLIDE 1

Decomposition by Subpopulations of the Bonferroni Indexes

I. Valli

Università degli Studi di Milano Bicocca

June 23, 2016

SLIDE 2

Intro

Let $Y$ be a non-negative variate, usually income, observed on $N$ units of a finite population, and let $0 \le y_{(1)} \le \dots \le y_{(i)} \le \dots \le y_{(N)}$, with $y_{(N)} > 0$, be the $N$ ordered values.

SLIDE 3

Intro

The study of concentration (hereafter “inequality”) dates back to the end of the 19th century. Since, for all $i = 1, \dots, N$,

$$p_{(i)} \ge q_{(i)}, \qquad \text{where} \quad p_{(i)} = \frac{i}{N}, \quad q_{(i)} = \frac{\sum_{t=1}^{i} y_{(t)}}{T(Y)},$$

Corrado Gini claimed in 1914 that inequality is stronger the stronger the above inequality is. In these terms, Gini suggested as point inequality measure the relative variation

$$R_{(i)}(Y) = \frac{p_{(i)} - q_{(i)}}{p_{(i)}}$$

and as synthetic measure their weighted mean

$$\tilde{R}(Y) = \frac{\sum_{i=1}^{N-1} \frac{p_{(i)} - q_{(i)}}{p_{(i)}} \cdot p_{(i)}}{\sum_{i=1}^{N-1} p_{(i)}} = \frac{\sum_{i=1}^{N-1} \left(p_{(i)} - q_{(i)}\right)}{\sum_{i=1}^{N-1} p_{(i)}}.$$

SLIDE 4

Intro

In 1930, Carlo Emilio Bonferroni suggested, as measures of inequality, the point index

$$V_{(i)}(Y) = \frac{M(Y) - \frac{1}{i}\sum_{t=1}^{i} y_{(t)}}{M(Y)}$$

and, as synthetic index, their arithmetic mean:

$$\tilde{V}(Y) = \frac{1}{N-1} \sum_{i=1}^{N-1} V_{(i)}(Y).$$

SLIDE 5

Intro

In 1940 Mario De Vergottini showed that

$$R_{(i)}(Y) = V_{(i)}(Y) \qquad \text{and} \qquad \tilde{R}(Y) = \frac{1}{N-1} \sum_{i=1}^{N-1} V_{(i)}(Y) \cdot \frac{2i}{N}.$$
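Both identities can be checked from the definitions given on the previous slides. The derivation below is a sketch added for clarity; it uses only $p_{(i)} = i/N$, $q_{(i)} = \sum_{t=1}^{i} y_{(t)} / T(Y)$ and $M(Y) = T(Y)/N$.

```latex
\begin{align*}
R_{(i)}(Y) &= 1 - \frac{q_{(i)}}{p_{(i)}}
            = 1 - \frac{\frac{1}{i}\sum_{t=1}^{i} y_{(t)}}{T(Y)/N}
            = \frac{M(Y) - \frac{1}{i}\sum_{t=1}^{i} y_{(t)}}{M(Y)}
            = V_{(i)}(Y), \\
\sum_{i=1}^{N-1} p_{(i)} &= \frac{1}{N}\cdot\frac{N(N-1)}{2} = \frac{N-1}{2}, \\
\tilde{R}(Y) &= \frac{\sum_{i=1}^{N-1} V_{(i)}(Y)\, p_{(i)}}{(N-1)/2}
              = \frac{1}{N-1}\sum_{i=1}^{N-1} V_{(i)}(Y)\cdot\frac{2i}{N}.
\end{align*}
```

The weights $2i/N$ average to one over $i = 1, \dots, N-1$, so Gini's synthetic measure is a weighted mean of the same point measures that Bonferroni averages with equal weights.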

SLIDE 6

Definitions and notation

Let

$$Q_{(i)}(Y) = \sum_{t=1}^{i} y_{(t)}, \qquad i = 1, \dots, N \tag{1}$$

be the income of the $i$ poorest population units;

$$T(Y) = Q_{(N)}(Y) = \sum_{i=1}^{N} y_{(i)}; \tag{2}$$

$$M^{-}_{(i)}(Y) = \frac{Q_{(i)}(Y)}{i}; \tag{3}$$

$$M(Y) = \frac{T(Y)}{N} = M^{-}_{(N)}(Y). \tag{4}$$

SLIDE 7

Definitions and notation

The Bonferroni (1930) point and synthetic inequality measures are, respectively:

$$V_{(i)}(Y) = \frac{M(Y) - M^{-}_{(i)}(Y)}{M(Y)}, \qquad i = 1, \dots, N \tag{5}$$

$$\tilde{V}(Y) = \frac{1}{N-1} \sum_{i=1}^{N-1} V_{(i)}(Y). \tag{6}$$

Note that $V_{(i)}(Y)$ is the relative variation of the lower mean $M^{-}_{(i)}(Y)$ w.r.t. $M(Y)$; hence $\tilde{V}(Y)$ is their (simple) arithmetic mean.

SLIDE 8

Definitions and notation

In the case of maximum inequality,

$$y_{(1)} = \dots = y_{(N-1)} = 0, \qquad y_{(N)} > 0,$$

it is known that $\tilde{V}(Y) = 1$ for all $N \ge 2$. Note that $\tilde{V}(Y)$ does not discern among maximum inequality cases with different values of $N$. In the case of maximum inequality, it seems more reasonable that the value of an inequality index $C_N(Y)$, evaluated on $N$ units, is such that:

(a) $C_N(Y)$ is an increasing and positive function of $N$;
(b) $\lim_{N \to \infty} C_N(Y) = 1$.

SLIDE 9

Definitions and notation

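Now, multiplying both sides of (6) by $\frac{N-1}{N}$, we have:

$$V'(Y) = \frac{N-1}{N} \cdot \tilde{V}(Y) = \frac{1}{N} \sum_{i=1}^{N-1} V_{(i)}(Y) = \frac{1}{N} \sum_{i=1}^{N} V_{(i)}(Y), \tag{7}$$

where the last equality holds because $V_{(N)}(Y) = 0$. Thus, in the case of maximum inequality, $V'(Y) = 1 - \frac{1}{N}$. Note that $1 - \frac{1}{N}$ is an increasing and positive function of $N$ such that

$$\lim_{N \to \infty} \left(1 - \frac{1}{N}\right) = 1.$$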
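The maximum-inequality behaviour of the two indexes can be checked numerically. The following is a minimal sketch, not part of the original slides; the function name `bonferroni_point_indexes` is an illustrative choice.

```python
def bonferroni_point_indexes(values):
    """Return V_(i)(Y), i = 1..N, as defined in (5), from raw observations."""
    y = sorted(values)
    mean = sum(y) / len(y)
    indexes, running_total = [], 0.0
    for i, value in enumerate(y, start=1):
        running_total += value                      # Q_(i)(Y)
        indexes.append((mean - running_total / i) / mean)
    return indexes


for n in (2, 5, 100):
    y = [0.0] * (n - 1) + [1.0]                     # maximum inequality
    v = bonferroni_point_indexes(y)
    v_tilde = sum(v[:-1]) / (n - 1)                 # formula (6)
    v_prime = sum(v) / n                            # formula (7)
    print(n, v_tilde, v_prime, 1 - 1 / n)
```

For every $N$ the value of $\tilde{V}(Y)$ stays at 1, while $V'(Y)$ follows $1 - 1/N$ and so distinguishes the cases.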

SLIDE 10

Definitions and notation

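Table 1: Distribution of $N = 10$ units and calculation of $V_{(i)}(Y)$ and $V'(Y) = 0.5157$

| $i$ | $y_{(i)}$ | $Q_{(i)}(Y)$ | $M^{-}_{(i)}(Y)$ | $V_{(i)}(Y)$ |
|---|---|---|---|---|
| 1 | 2 | 2 | 2.00 | 0.9333 |
| 2 | 2 | 4 | 2.00 | 0.9333 |
| 3 | 8 | 12 | 4.00 | 0.8667 |
| 4 | 24 | 36 | 9.00 | 0.70 |
| 5 | 29 | 65 | 13.00 | 0.5667 |
| 6 | 37 | 102 | 17.00 | 0.4333 |
| 7 | 37 | 139 | 19.8571 | 0.3381 |
| 8 | 37 | 176 | 22 | 0.2667 |
| 9 | 62 | 238 | 26.4444 | 0.1185 |
| 10 | 62 | 300 | 30 | 0.00 |
| Total | | | | 5.1566 |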
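The columns of Table 1 can be reproduced in a few lines. This is a sketch for illustration only; the variable names are not from the slides.

```python
from itertools import accumulate

y = [2, 2, 8, 24, 29, 37, 37, 37, 62, 62]          # ordered incomes of Table 1
N = len(y)
M = sum(y) / N                                     # overall mean M(Y)

Q = list(accumulate(y))                            # cumulative incomes Q_(i)(Y)
lower_means = [q / i for i, q in enumerate(Q, start=1)]
V = [(M - m) / M for m in lower_means]             # point indexes V_(i)(Y)

v_prime = sum(V) / N                               # formula (7)
v_tilde = sum(V[:-1]) / (N - 1)                    # formula (6)

for i, (yi, q, m, v) in enumerate(zip(y, Q, lower_means, V), start=1):
    print(f"{i:2d} {yi:3d} {q:4d} {m:8.4f} {v:7.4f}")
print(f"V'(Y) = {v_prime:.4f}")
```

The sum of the last column divided by $N = 10$ gives the value of $V'(Y)$ quoted in the caption.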

SLIDE 11

Definitions and notation

The value of $V'(Y) = \frac{1}{N} \sum_{i=1}^{N} V_{(i)}(Y)$ can be interpreted as the sum of the areas of $N$ rectangles, each with base $1/N$ and height $V_{(i)}(Y)$. To draw the inequality diagram of $V_{(i)}(Y)$, we first obtain the $N$ points with coordinates $\left(\frac{i}{N}, V_{(i)}(Y)\right)$. Then we obtain $N$ rectangles by the following procedure: the first rectangle has abscissas in the interval $\left[0, \frac{1}{N}\right]$ and ordinates in the interval $\left[0, V_{(1)}(Y)\right]$. The $i$-th rectangle, $i = 2, \dots, N$, has abscissas in the interval $\left(\frac{i-1}{N}, \frac{i}{N}\right]$ and ordinates in the interval $\left[0, V_{(i)}(Y)\right]$. Figure 1 reports the graph of $V_{(i)}(Y)$.

SLIDE 12

Definitions and notation

Figure 1: Graph of $V_{(i)}(Y)$ (horizontal axis: $p_i$; vertical axis: $V_{(i)}(Y)$; both from 0 to 1).

SLIDE 13

Definitions and notation in the frequency distribution framework

The last column of Table 1 shows that $V_{(i)}(Y)$ may not be constant for units taking the same value of $Y$. This behavior of $V_{(i)}(Y)$ is not reasonable in the decomposition by subpopulations, because units with the same value of $Y$ may belong to different subpopulations. We overcome this by replacing the values of $V_{(i)}(Y)$ of all units with the same value $y_h$ of $Y$ with the value $V_{(P_{h\cdot})}(Y)$, where $P_{h\cdot}$ is the number of units with $Y \le y_h$.

SLIDE 14

Definitions and notation in the frequency distribution framework

Let $\{0 \le y_1 < \dots < y_h < \dots < y_r\}$ denote the set of the $r$ distinct values assumed by the variate $Y$ over the $k$ subpopulations. It is possible to report the whole $r \times k$ bivariate distribution of the $N$ units as shown in Table 2, where:

  • $n_{hg}$ denotes the frequency of $y_h$ in the subpopulation $g$;
  • $n_{h\cdot} = \sum_{g=1}^{k} n_{hg}$ is the frequency of $y_h$ in the whole population;
  • $n_{\cdot g} = \sum_{h=1}^{r} n_{hg}$ is the size of the subpopulation $g$.

SLIDE 15

Definitions and notation in the frequency distribution framework

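Table 2: Bivariate $r \times k$ distribution of the whole population partitioned into $k$ subpopulations.

| | 1 | … | $g$ | … | $k$ | Total |
|---|---|---|---|---|---|---|
| $y_1$ | $n_{11}$ | … | $n_{1g}$ | … | $n_{1k}$ | $n_{1\cdot}$ |
| ⋮ | ⋮ | | ⋮ | | ⋮ | ⋮ |
| $y_h$ | $n_{h1}$ | … | $n_{hg}$ | … | $n_{hk}$ | $n_{h\cdot}$ |
| ⋮ | ⋮ | | ⋮ | | ⋮ | ⋮ |
| $y_r$ | $n_{r1}$ | … | $n_{rg}$ | … | $n_{rk}$ | $n_{r\cdot}$ |
| Total | $n_{\cdot 1}$ | … | $n_{\cdot g}$ | … | $n_{\cdot k}$ | $N$ |

(Columns 1 to $k$ are the subpopulations.)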

SLIDE 16

Definitions and notation in the frequency distribution framework

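Let us define, for the overall distribution $\{(y_h, n_{h\cdot}) : h = 1, \dots, r\}$:

$$P_{h\cdot} = P_{h\cdot}(Y) = \sum_{t=1}^{h} n_{t\cdot}, \tag{8}$$

$$Q_{h\cdot}(Y) = Q_{(P_{h\cdot})}(Y) = \sum_{t=1}^{h} y_t \cdot n_{t\cdot}, \tag{9}$$

$$T(Y) = Q_{r\cdot}(Y) = \sum_{h=1}^{r} y_h \cdot n_{h\cdot}, \tag{10}$$

$$M^{-}_{h\cdot}(Y) = \frac{Q_{h\cdot}(Y)}{P_{h\cdot}}. \tag{11}$$

Note that $M(Y) = M^{-}_{r\cdot}(Y)$.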

SLIDE 17

Definitions and notation in the frequency distribution framework

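For the distribution $\{(y_h, n_{hg}) : h = 1, \dots, r\}$ of the subpopulation $g$ ($g = 1, \dots, k$), let:

$$P_{hg} = P_{hg}(Y) = \sum_{t=1}^{h} n_{tg}, \tag{12}$$

$$Q_{hg}(Y) = Q_{(P_{hg})}(Y) = \sum_{t=1}^{h} y_t \cdot n_{tg}, \tag{13}$$

$$T_g(Y) = Q_{rg}(Y) = \sum_{h=1}^{r} y_h \cdot n_{hg}, \tag{14}$$

$$M_g(Y) = \frac{T_g(Y)}{n_{\cdot g}}, \tag{15}$$

$$o(g) = \min\{h : n_{hg} > 0\}, \tag{16}$$

$$M^{-}_{hg}(Y) = \begin{cases} y_{o(g)} & \text{for } h < o(g) \\[4pt] \dfrac{Q_{hg}(Y)}{P_{hg}} & \text{for } h \ge o(g) \end{cases} \tag{17}$$

where $M^{-}_{hg}(Y)$ in (17) denotes the mean of the $P_{hg}$ poorest units of the subpopulation $g$. Note that from (17) and (15) it follows that $M_g(Y) = M^{-}_{rg}(Y)$.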

SLIDE 18

Definitions and notation in the frequency distribution framework

Note that, from (12) to (14), we can deduce the quantities defined in (8)-(11):

$$P_{h\cdot} = \sum_{g=1}^{k} P_{hg}(Y), \tag{18}$$

$$Q_{h\cdot} = \sum_{g=1}^{k} Q_{hg}(Y), \tag{19}$$

$$T(Y) = \sum_{g=1}^{k} T_g(Y). \tag{20}$$

SLIDE 19

Definitions and notation in the frequency distribution framework

We can now define the Bonferroni inequality measures in the frequency distribution framework. From (7) we have:

$$V'(Y) = \frac{1}{N} \sum_{i=1}^{N} V_{(i)}(Y) = \frac{1}{N} \sum_{h=1}^{r} \; \sum_{i=1+P_{h\cdot}-n_{h\cdot}}^{P_{h\cdot}} V_{(i)}(Y). \tag{21}$$

SLIDE 20

Definitions and notation in the frequency distribution framework

In order to assign the same point inequality measure to units that have the same value $Y = y_h$, we set, for $1 + P_{h\cdot} - n_{h\cdot} \le i \le P_{h\cdot}$:

$$V_{(i)}(Y) = V_{(P_{h\cdot})}(Y) = \frac{M(Y) - M^{-}_{(P_{h\cdot})}(Y)}{M(Y)} = \frac{M(Y) - M^{-}_{h\cdot}(Y)}{M(Y)} = V_h(Y). \tag{22}$$

SLIDE 21

Definitions and notation in the frequency distribution framework

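Hence, from (22), we approximate $\sum_{i=1+P_{h\cdot}-n_{h\cdot}}^{P_{h\cdot}} V_{(i)}(Y)$ with

$$\sum_{i=1+P_{h\cdot}-n_{h\cdot}}^{P_{h\cdot}} V_{(P_{h\cdot})}(Y) = \sum_{i=1+P_{h\cdot}-n_{h\cdot}}^{P_{h\cdot}} V_h(Y) = V_h(Y) \cdot n_{h\cdot} \tag{23}$$

and we define the synthetic inequality index $V(Y)$,

$$V(Y) = \sum_{h=1}^{r} V_h(Y) \cdot \frac{n_{h\cdot}}{N}, \tag{24}$$

as the weighted mean of the point inequality measures $V_h(Y)$, with weights $n_{h\cdot}/N$. Since the lower means are non-decreasing in $i$, $V_h(Y) = V_{(P_{h\cdot})}(Y) \le V_{(i)}(Y)$ for $1 + P_{h\cdot} - n_{h\cdot} \le i \le P_{h\cdot}$, and it follows that $V(Y) \le V'(Y)$.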

SLIDE 22

Definitions and notation in the frequency distribution framework

The $N = 10$ values $y_{(i)}$ introduced in the previous section are used in Table 3 for the calculation of $M^{-}_{h\cdot}(Y)$, $V_h(Y)$ and $V(Y)$.

Table 3: Calculation of $M^{-}_{h\cdot}(Y)$, $V_h(Y)$ and $V(Y)$.

| $h$ | $y_h$ | $Q_{h\cdot}(Y)$ | $M^{-}_{h\cdot}(Y)$ | $V_h(Y)$ |
|---|---|---|---|---|
| 1 | 2 | 4 | 2 | 0.9333 |
| 2 | 8 | 12 | 4 | 0.8667 |
| 3 | 24 | 36 | 9 | 0.70 |
| 4 | 29 | 65 | 13 | 0.5667 |
| 5 | 37 | 176 | 22 | 0.2667 |
| 6 | 62 | 300 | 30 | 0.00 |

$V(Y) = 0.48$
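The frequency-distribution version of the index, formulas (22) and (24), can be sketched as follows. The function name `bonferroni_grouped` is hypothetical; the frequencies $n_{h\cdot}$ are those implied by Table 1 (2, 1, 1, 1, 3, 2).

```python
def bonferroni_grouped(values, freqs):
    """Return ([V_h(Y)], V(Y)) for distinct sorted values and their frequencies."""
    n_total = sum(freqs)
    mean = sum(y * n for y, n in zip(values, freqs)) / n_total
    cum_n, cum_q, point = 0, 0.0, []
    for y, n in zip(values, freqs):
        cum_n += n                                  # P_h.
        cum_q += y * n                              # Q_h.(Y)
        point.append((mean - cum_q / cum_n) / mean) # V_h(Y), formula (22)
    synthetic = sum(v * n for v, n in zip(point, freqs)) / n_total  # formula (24)
    return point, synthetic


values = [2, 8, 24, 29, 37, 62]
freqs = [2, 1, 1, 1, 3, 2]
point, synthetic = bonferroni_grouped(values, freqs)
print([round(v, 4) for v in point], round(synthetic, 4))
```

Units tied at the same income now share one point index, so the result (0.48) is smaller than the $V'(Y)$ of Table 1, in line with $V(Y) \le V'(Y)$.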

SLIDE 23

Definitions and notation in the frequency distribution framework

Figure 2 reports the graphs of $V_{(i)}(Y)$ and $V_h(Y)$.

Figure 2: Graphs of $V_{(i)}(Y)$ and $V_h(Y)$ (two panels; horizontal axis: $p_h$; both axes from 0 to 1).

SLIDE 24

Decomposition by subpopulations of the point Vh(Y ) and synthetic V (Y ) Bonferroni’s inequality indexes

In this section we decompose by subpopulations the point index $V_h(Y)$ and the synthetic index $V(Y)$ of Bonferroni using the “two-step” approach recently proposed by Zenga (2016) for the decomposition of the Zenga (2007) inequality indexes. In the first step, we decompose by subpopulations the Bonferroni point index $V_h(Y)$; then, substituting this decomposition into the relation $V(Y) = \sum_{h=1}^{r} V_h(Y) \cdot \frac{n_{h\cdot}}{N}$, we obtain the corresponding decomposition by subpopulations of $V(Y)$.

SLIDE 25

Decomposition by subpopulations of the point Vh(Y ) and synthetic V (Y ) Bonferroni’s inequality indexes

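The mean $M(Y)$ is related to the $k$ means $M_g(Y)$ by the relation

$$M(Y) = \sum_{g=1}^{k} M_g(Y) \cdot \frac{n_{\cdot g}}{N}, \tag{25}$$

and $M^{-}_{h\cdot}(Y)$ is related to the $k$ lower means $M^{-}_{h\ell}(Y)$ by the relation

$$M^{-}_{h\cdot}(Y) = \sum_{\ell=1}^{k} M^{-}_{h\ell}(Y) \cdot p(\ell \mid h), \tag{26}$$

where

$$p(\ell \mid h) = \frac{P_{h\ell}}{P_{h\cdot}}, \qquad h = 1, \dots, r; \; \ell = 1, \dots, k \tag{27}$$

is the relative frequency of the subpopulation $\ell$ in the lower group $\{Y \le y_h\}$. Note that $\sum_{\ell=1}^{k} p(\ell \mid h) = 1$, $\sum_{g=1}^{k} \frac{n_{\cdot g}}{N} = 1$ and $\sum_{\ell=1}^{k} \sum_{g=1}^{k} p(\ell \mid h) \cdot \frac{n_{\cdot g}}{N} = 1$.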

SLIDE 26

Decomposition by subpopulations of the point Vh(Y ) and synthetic V (Y ) Bonferroni’s inequality indexes

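Using (25) and (26), the following $k \times k$ additive decomposition of $M(Y) - M^{-}_{h\cdot}(Y)$ is obtained:

$$\begin{aligned} M(Y) - M^{-}_{h\cdot}(Y) &= \sum_{g=1}^{k} M_g(Y) \cdot \frac{n_{\cdot g}}{N} \cdot \sum_{\ell=1}^{k} p(\ell \mid h) \; - \; \sum_{\ell=1}^{k} M^{-}_{h\ell}(Y) \cdot p(\ell \mid h) \cdot \sum_{g=1}^{k} \frac{n_{\cdot g}}{N} \\ &= \sum_{\ell=1}^{k} \sum_{g=1}^{k} \left[ M_g(Y) - M^{-}_{h\ell}(Y) \right] \cdot p(\ell \mid h) \cdot \frac{n_{\cdot g}}{N}. \end{aligned} \tag{28}$$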

SLIDE 27

Decomposition by subpopulations of the point Vh(Y ) and synthetic V (Y ) Bonferroni’s inequality indexes

Now, dividing both sides of (28) by $M(Y)$, we obtain the following $k \times k$ additive decomposition of $V_h(Y)$:

$$V_h(Y) = \sum_{\ell=1}^{k} \sum_{g=1}^{k} V_{h\ell g}(Y), \tag{29}$$

where

$$V_{h\ell g}(Y) = \left[ \frac{M_g(Y) - M^{-}_{h\ell}(Y)}{M(Y)} \right] \cdot p(\ell \mid h) \cdot \frac{n_{\cdot g}}{N} \tag{30}$$

is the contribution to $V_h(Y)$ that derives from the comparison of $M^{-}_{h\ell}(Y)$ w.r.t. $M_g(Y)$.

SLIDE 28

Decomposition by subpopulations of the point Vh(Y ) and synthetic V (Y ) Bonferroni’s inequality indexes

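Finally, substituting (29) into (24), the following $k \times k$ additive decomposition of $V(Y)$ is obtained:

$$V(Y) = \sum_{h=1}^{r} \left[ \sum_{\ell=1}^{k} \sum_{g=1}^{k} V_{h\ell g}(Y) \right] \cdot \frac{n_{h\cdot}}{N} = \sum_{\ell=1}^{k} \sum_{g=1}^{k} V_{\cdot \ell g}(Y), \tag{31}$$

where

$$V_{\cdot \ell g}(Y) = \sum_{h=1}^{r} V_{h\ell g}(Y) \cdot \frac{n_{h\cdot}}{N} \tag{32}$$

is the weighted mean of $V_{h\ell g}(Y)$ with weights $\frac{n_{h\cdot}}{N}$.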

SLIDE 29

Decomposition by subpopulations of the point Vh(Y ) and synthetic V (Y ) Bonferroni’s inequality indexes

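Table 4: Lower means $M^{-}_{hg}(Y)$ and $M^{-}_{h\cdot}(Y)$ and relative frequencies $p(\ell \mid h) = P_{h\ell} / P_{h\cdot}$

| $h$ | $y_h$ | $M^{-}_{h1}$ | $M^{-}_{h2}$ | $M^{-}_{h3}$ | $M^{-}_{h\cdot}$ | $p(1 \mid h)$ | $p(2 \mid h)$ | $p(3 \mid h)$ |
|---|---|---|---|---|---|---|---|---|
| $o(g)$ | | 1 | 2 | 1 | | | | |
| 1 | 2 | 2 | 8 | 2 | 2 | 0.50 | 0.00 | 0.50 |
| 2 | 8 | 2 | 8 | 2 | 4 | 0.33 | 0.33 | 0.33 |
| 3 | 24 | 13 | 8 | 2 | 9 | 0.50 | 0.25 | 0.25 |
| 4 | 29 | 13 | 8 | 15.50 | 13 | 0.40 | 0.20 | 0.40 |
| 5 | 37 | 25 | 22.50 | 15.50 | 22 | 0.5 | 0.25 | 0.25 |
| 6 | 62 | 32.4 | 22.5 | 31.00 | 30 | 0.5 | 0.20 | 0.30 |
| $n_{\cdot g}/N$ | | 0.50 | 0.20 | 0.30 | | | | |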

SLIDE 30

Decomposition by subpopulations of the point Vh(Y ) and synthetic V (Y ) Bonferroni’s inequality indexes

Table 5: Values of the contributions $V_{h\ell g}(Y)$, $h = 1, \dots, 6$

| $h$ | $g$ | $\ell = 1$ | $\ell = 2$ | $\ell = 3$ | $V_h(Y)$ |
|---|---|---|---|---|---|
| 1 | 1 | 0.2533 | 0.00 | 0.2533 | 0.9333 |
| 1 | 2 | 0.0683 | 0.00 | 0.0683 | |
| 1 | 3 | 0.1450 | 0.00 | 0.1450 | |
| 2 | 1 | 0.1689 | 0.1356 | 0.1689 | 0.8667 |
| 2 | 2 | 0.0456 | 0.0322 | 0.0456 | |
| 2 | 3 | 0.0967 | 0.0767 | 0.0967 | |
| 3 | 1 | 0.1617 | 0.1017 | 0.1267 | 0.70 |
| 3 | 2 | 0.0317 | 0.0242 | 0.0342 | |
| 3 | 3 | 0.0900 | 0.0575 | 0.0725 | |
| 4 | 1 | 0.1293 | 0.0813 | 0.1127 | 0.5667 |
| 4 | 2 | 0.0253 | 0.0193 | 0.0187 | |
| 4 | 3 | 0.0720 | 0.0460 | 0.0620 | |
| 5 | 1 | 0.0617 | 0.0412 | 0.0704 | 0.2667 |
| 5 | 2 | -0.0083 | 0.00 | 0.0117 | |
| 5 | 3 | 0.0300 | 0.0212 | 0.0388 | |
| 6 | 1 | 0.00 | 0.033 | 0.007 | 0.00 |
| 6 | 2 | -0.033 | 0.00 | -0.017 | |
| 6 | 3 | -0.007 | 0.017 | 0.00 | |

SLIDE 31

Decomposition by subpopulations of the point Vh(Y ) and synthetic V (Y ) Bonferroni’s inequality indexes

Table 6: Values of the contributions $V_{\cdot \ell g}(Y)$

| $g$ | $\ell = 1$ | $\ell = 2$ | $\ell = 3$ |
|---|---|---|---|
| 1 | 0.1152 | 0.0508 | 0.1140 |
| 2 | 0.0148 | 0.0076 | 0.0236 |
| 3 | 0.0625 | 0.0278 | 0.0637 |

$V(Y) = 0.48$
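The two-step decomposition of formulas (30) and (32) can be sketched in code. The slides do not list the cell frequencies $n_{hg}$ of this example; the matrix `n` below is an assumption, inferred so that it reproduces the lower means and the $n_{\cdot g}/N$ row of Table 4. The function name `bonferroni_kxk` is also illustrative, and the sketch assumes every subpopulation is non-empty.

```python
def bonferroni_kxk(y, n):
    """Return ([V_h], V) with V[l][g] = V_.lg(Y), following (30) and (32)."""
    r, k = len(y), len(n[0])
    n_h = [sum(row) for row in n]                        # n_h.
    n_g = [sum(row[g] for row in n) for g in range(k)]   # n_.g
    N = sum(n_h)
    M = sum(y[h] * n_h[h] for h in range(r)) / N
    M_g = [sum(y[h] * n[h][g] for h in range(r)) / n_g[g] for g in range(k)]
    P = [0] * k        # cumulative frequencies P_hl
    Q = [0.0] * k      # cumulative incomes Q_hl
    V_h = []
    V = [[0.0] * k for _ in range(k)]
    for h in range(r):
        for l in range(k):
            P[l] += n[h][l]
            Q[l] += y[h] * n[h][l]
        P_h = sum(P)
        v_h = 0.0
        for l in range(k):
            if P[l] == 0:
                continue   # p(l|h) = 0: no unit of l with Y <= y_h
            for g in range(k):
                c = (M_g[g] - Q[l] / P[l]) / M * (P[l] / P_h) * (n_g[g] / N)
                v_h += c                                 # formula (29)
                V[l][g] += c * n_h[h] / N                # formula (32)
        V_h.append(v_h)
    return V_h, V


y = [2, 8, 24, 29, 37, 62]
# n[h][g]: assumed frequencies (inferred from Table 4, not given in the slides)
n = [[1, 0, 1], [0, 1, 0], [1, 0, 0], [0, 0, 1], [2, 1, 0], [1, 0, 1]]
V_h, V = bonferroni_kxk(y, n)
for g in range(3):                                       # rows g, columns l, as in Table 6
    print([round(V[l][g], 4) for l in range(3)])
```

Under that assumed distribution the printed matrix agrees with Table 6 to four decimals, and the sum of all nine entries returns $V(Y) = 0.48$.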

SLIDE 32

Contributions of each subpopulations to the point Vh(Y ) and the synthetic V (Y ) indexes

Starting from the $k \times k$ contributions in (29), it is possible to obtain other useful decompositions. Let:

$$V_{h\ell\cdot}(Y) = \sum_{g=1}^{k} V_{h\ell g}(Y). \tag{33}$$

Then, using (30) and (25) in (33) gives:

$$V_{h\ell\cdot}(Y) = \sum_{g=1}^{k} \frac{M_g(Y) - M^{-}_{h\ell}(Y)}{M(Y)} \cdot p(\ell \mid h) \cdot \frac{n_{\cdot g}}{N} = \frac{M(Y) - M^{-}_{h\ell}(Y)}{M(Y)} \cdot p(\ell \mid h). \tag{34}$$

Note that $\frac{M(Y) - M^{-}_{h\ell}(Y)}{M(Y)}$ is the relative variation of the lower mean $M^{-}_{h\ell}(Y)$ w.r.t. the mean $M(Y)$.

SLIDE 33

Contributions of each subpopulations to the point Vh(Y ) and the synthetic V (Y ) indexes

Now, from (29), (33) and (34), the following additive decomposition of $V_h(Y)$ into $k$ terms is obtained:

$$V_h(Y) = \sum_{\ell=1}^{k} \sum_{g=1}^{k} V_{h\ell g}(Y) = \sum_{\ell=1}^{k} V_{h\ell\cdot}(Y) = \sum_{\ell=1}^{k} \frac{M(Y) - M^{-}_{h\ell}(Y)}{M(Y)} \cdot p(\ell \mid h). \tag{35}$$

In conclusion, $V_h(Y)$ is the sum of the $k$ contributions $V_{h\ell\cdot}(Y)$. Formula (35) shows that the point index $V_h(Y)$ is the weighted mean of the $k$ relative variations $\frac{M(Y) - M^{-}_{h\ell}(Y)}{M(Y)}$ with weights $p(\ell \mid h)$. Thus, $V_{h\ell\cdot}(Y)$ can be interpreted as the contribution of the subpopulation $\ell$ to the point inequality index $V_h(Y)$.

SLIDE 34

Contributions of each subpopulations to the point Vh(Y ) and the synthetic V (Y ) indexes

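Finally, substituting (35) into (24), the following additive decomposition of $V(Y)$ into $k$ terms is obtained:

$$V(Y) = \sum_{h=1}^{r} \sum_{\ell=1}^{k} V_{h\ell\cdot}(Y) \cdot \frac{n_{h\cdot}}{N} = \sum_{\ell=1}^{k} V_{\cdot\ell\cdot}(Y), \tag{36}$$

where

$$V_{\cdot\ell\cdot}(Y) = \sum_{h=1}^{r} V_{h\ell\cdot}(Y) \cdot \frac{n_{h\cdot}}{N} \tag{37}$$

denotes the weighted mean of $V_{h\ell\cdot}(Y)$ with weights $\frac{n_{h\cdot}}{N}$. Thus, $V_{\cdot\ell\cdot}(Y)$ can be interpreted as the contribution of the subpopulation $\ell$ to the Bonferroni synthetic index $V(Y)$.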

SLIDE 35

Within and Between components of Vhℓ.(Y ), Vh(Y ) and V (Y )

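The contribution $V_{h\ell\cdot}(Y)$ of the subpopulation $\ell$ to the point index $V_h(Y)$ can be split into a “within” and a “between” component. From (33) we have:

$$V_{h\ell\cdot}(Y) = \sum_{g=1}^{k} V_{h\ell g}(Y) = V_{h\ell W}(Y) + V_{h\ell B}(Y), \tag{38}$$

where

$$V_{h\ell W}(Y) = V_{h\ell\ell}(Y) = \frac{M_\ell(Y) - M^{-}_{h\ell}(Y)}{M(Y)} \cdot p(\ell \mid h) \cdot \frac{n_{\cdot\ell}}{N}, \tag{39}$$

and

$$V_{h\ell B}(Y) = \sum_{g \ne \ell} V_{h\ell g}(Y) = \sum_{g \ne \ell} \frac{M_g(Y) - M^{-}_{h\ell}(Y)}{M(Y)} \cdot p(\ell \mid h) \cdot \frac{n_{\cdot g}}{N}. \tag{40}$$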

SLIDE 36

Within and Between components of Vhℓ.(Y ), Vh(Y ) and V (Y )

In (39) the value of $V_{h\ell\ell}(Y)$ derives from comparisons of “incomes” of the same subpopulation $\ell$; thus, $V_{h\ell W}(Y) = V_{h\ell\ell}(Y)$ can be interpreted as the “within” part of the contribution $V_{h\ell\cdot}(Y)$. Conversely, in (40) the value of $V_{h\ell B}(Y) = \sum_{g \ne \ell} V_{h\ell g}(Y)$ derives from the comparison of “incomes” of different subpopulations; thus, $V_{h\ell B}(Y)$ can be interpreted as the “between” part of $V_{h\ell\cdot}(Y)$.

SLIDE 37

Within and Between components of Vhℓ.(Y ), Vh(Y ) and V (Y )

From (35) and (38), we obtain:

$$V_h(Y) = \sum_{\ell=1}^{k} V_{h\ell\cdot}(Y) = \sum_{\ell=1}^{k} \left[ V_{h\ell W}(Y) + V_{h\ell B}(Y) \right] = V_{h\cdot W}(Y) + V_{h\cdot B}(Y), \tag{41}$$

where

$$V_{h\cdot W}(Y) = \sum_{\ell=1}^{k} V_{h\ell W}(Y) \tag{42}$$

is the sum of the “within” parts of the contributions $V_{h\ell\cdot}(Y)$ and can be interpreted as the “within” part of $V_h(Y)$, and

$$V_{h\cdot B}(Y) = \sum_{\ell=1}^{k} V_{h\ell B}(Y) \tag{43}$$

can be interpreted as the “between” part of the Bonferroni point index $V_h(Y)$.

SLIDE 38

Within and Between components of Vhℓ.(Y ), Vh(Y ) and V (Y )

Finally, substituting the decomposition (41) into (24) gives the following decomposition of the Bonferroni synthetic index $V(Y)$:

$$V(Y) = V_{\cdot\cdot W}(Y) + V_{\cdot\cdot B}(Y), \tag{44}$$

where

$$V_{\cdot\cdot W}(Y) = \sum_{h=1}^{r} V_{h\cdot W}(Y) \cdot \frac{n_{h\cdot}}{N} \tag{45}$$

and

$$V_{\cdot\cdot B}(Y) = \sum_{h=1}^{r} V_{h\cdot B}(Y) \cdot \frac{n_{h\cdot}}{N} \tag{46}$$

are the within and the between parts of $V(Y)$, respectively.

SLIDE 39

Within and Between components of Vhℓ.(Y ), Vh(Y ) and V (Y )

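Using the relations (42) and (39) in (45), the following useful relation is obtained:

$$V_{\cdot\cdot W}(Y) = \sum_{h=1}^{r} \sum_{\ell=1}^{k} V_{h\ell W}(Y) \cdot \frac{n_{h\cdot}}{N} = \sum_{\ell=1}^{k} \sum_{h=1}^{r} \left[ \frac{M_\ell(Y) - M^{-}_{h\ell}(Y)}{M(Y)} \right] \cdot p(\ell \mid h) \cdot \frac{n_{\cdot\ell}}{N} \cdot \frac{n_{h\cdot}}{N}. \tag{47}$$

Using the relations (43) and (40) in (46), the following useful relation is obtained:

$$\begin{aligned} V_{\cdot\cdot B}(Y) &= \sum_{h=1}^{r} \sum_{\ell=1}^{k} V_{h\ell B}(Y) \cdot \frac{n_{h\cdot}}{N} = \sum_{h=1}^{r} \left[ \sum_{\ell=1}^{k} \sum_{g \ne \ell} V_{h\ell g}(Y) \right] \cdot \frac{n_{h\cdot}}{N} \\ &= \sum_{\ell=1}^{k} \sum_{g \ne \ell} \sum_{h=1}^{r} \left[ \frac{M_g(Y) - M^{-}_{h\ell}(Y)}{M(Y)} \right] \cdot p(\ell \mid h) \cdot \frac{n_{\cdot g}}{N} \cdot \frac{n_{h\cdot}}{N}. \end{aligned} \tag{48}$$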

SLIDE 40

Application

The data used in this application come from the 2012 Central Bank of Italy sample survey on household income and wealth. This survey covers $N = 8151$ households. In this paper we deal with the household net disposable income $Y$, which is the sum of: the payroll income $X_1$, the pensions and net transfers $X_2$, the net self-employment income $X_3$, and the property income $X_4$. The $N = 8151$ households have been partitioned according to their area of residence: North ($g = 1$), Center ($g = 2$) and South with Islands ($g = 3$). In all computations that follow we use the weights $w_i > 0$ ($i = 1, 2, \dots, 8151$; $W = \sum_i w_i = 8151 = N$) supplied by the Central Bank of Italy for each household; these weights are defined as the inverse of the household's probability of inclusion in the sample (for further details see Banca d’Italia 2014). We remark that the frequency distribution of the total income $Y$ has $r = 7287$ distinct values.

SLIDE 41

Application

Table 7: Some aggregate characteristics by geographic area.

| | North | Center | South | Italy |
|---|---|---|---|---|
| $W_{\cdot\ell}$ | 3971.949 | 1537.372 | 2641.679 | 8151 = $W$ |
| $W_{\cdot\ell}/W$ | 0.4873 | 0.1886 | 0.3241 | 1.00 |
| Median | 27527.57 | 29824.24 | 19123.67 | 24590.10 |
| Mean | 33543.17 | 34000.09 | 23517.86 | 30380.22 |
| $V_{\cdot\ell}(Y)$ | 0.4740 | 0.4421 | 0.4695 | 0.4795 = $V(Y)$ |

SLIDE 42

Application

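The synthetic inequality index $V_{\cdot\ell}(Y)$ of the subpopulation $\ell$ is given by:

$$V_{\cdot\ell}(Y) = \sum_{h=1}^{r} V_{h\ell}(Y) \cdot \frac{n_{h\ell}}{n_{\cdot\ell}} = \sum_{h=1}^{r} \frac{M_\ell(Y) - M^{-}_{h\ell}(Y)}{M_\ell(Y)} \cdot \frac{n_{h\ell}}{n_{\cdot\ell}}, \tag{49}$$

where

$$V_{h\ell}(Y) = \frac{M_\ell(Y) - M^{-}_{h\ell}(Y)}{M_\ell(Y)} \tag{50}$$

is the point inequality index of the subpopulation $\ell$ and $n_{h\ell}$ is the sum of the weights of the households in the subpopulation $\ell$ with total income $Y = y_h$.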

SLIDE 43

Application

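For the whole population the abscissas and the ordinates are given respectively by

$$\left( p_{h\cdot} = \frac{P_{h\cdot}}{N}, \quad V_{h(p_{h\cdot})}(Y) = V_h(Y) \right),$$

where $P_{h\cdot} = \sum_{\ell=1}^{3} P_{h\ell}$ and $h(p_{h\cdot})$ is the index $h$ such that $P(Y \le y_h) = p_{h\cdot}$.

Figure 3: Graph of the Bonferroni point measure $V_{h(p_{h\cdot})}(Y)$ for Italy (horizontal axis: $p_{h\cdot}$; vertical axis: $V_{h(p_{h\cdot})}(Y)$; both from 0 to 1); $V(Y) = 0.479538$.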

SLIDE 44

Application

Figure 4 displays the graphs of the point inequality measures for: (a) the whole population; (b) the North, the Center and the South. For the subpopulation $\ell$ the abscissas and the ordinates are given respectively by

$$\left( p_{h\ell} = \frac{P_{h\ell}}{n_{\cdot\ell}}, \quad V_{h(p_{h\ell})}(Y) = V_{h\ell}(Y) \right),$$

where $P_{h\ell} = \sum_{t=1}^{h} W_{t\ell}$ and $h(p_{h\ell})$ is the index $h$ such that $P(Y \le y_h) = p_{h\ell}$, $h = 1, \dots, r$; $\ell = 1, \dots, k = 3$.

Figure 4: Graphs of the Bonferroni point measures $V_{h(p_{h\ell})\ell}(Y)$, $\ell = 1, 2, 3$ (curves: North, Center, South; horizontal axis: $p_{h\ell}$; vertical axis: $V_{h(p_{h\ell})}(Y)$; both from 0 to 1; the plot carries the annotated value 0.6981).

SLIDE 45

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

In this section we illustrate the decompositions of the point measure $V_{h(p_{h\cdot})}(Y)$ for three values of $p_{h\cdot}$, and the decompositions of the synthetic index $V(Y) = 0.4795$. For $p_{h\cdot}$ we have chosen the following values:

  • $p_{h\cdot} = 0.10$, because $V_{h(0.10)}(Y) = 0.7607$ compares the mean income of the poorest 10% of households with the mean income $M(Y)$;
  • $p_{h\cdot} = 0.50$, because $V_{h(0.50)}(Y) = 0.4859$ compares the mean income of the households with $Y \le \mathrm{Med}(Y)$ with the mean income $M(Y)$;
  • $p_{h\cdot} = 0.95$, because $V_{h(0.95)}(Y) = 0.1181$ compares the mean income of the lower group made up of 95% of the whole population with the mean income $M(Y)$.

SLIDE 46

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

Table 8 reports, for these three values of $p_{h\cdot}$, the corresponding values of $h(p_{h\cdot})$, $P_{h(p_{h\cdot})\cdot}$, $P_{h(p_{h\cdot})\cdot}/N$ and $y_{h(p_{h\cdot})}$. Note that:

$$h(p_{h\cdot}) = \min\left\{ h : \frac{P_{h\cdot}}{N} \ge p_{h\cdot} \right\}.$$

Table 8: Cumulative frequencies and quantiles for three values of $p_{h\cdot} = P(Y \le y_h)$

| $p_{h\cdot}$ | $h(p_{h\cdot})$ | $P_{h(p_{h\cdot})\cdot}$ | $P_{h(p_{h\cdot})\cdot}/N$ | $y_h$ |
|---|---|---|---|---|
| 0.10 | 460 | 815.20 | 0.10 | 10600.00 |
| 0.50 | 3064 | 4075.65 | 0.50 | 24590.10 |
| 0.95 | 6841 | 7743.48 | 0.95 | 68819.23 |
| 1.00 | 7287 | 8151.00 | 1.00 | 368689.70 |
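The rule $h(p) = \min\{h : P_{h\cdot}/N \ge p\}$ is a lookup on cumulative frequencies. The sketch below applies it to the small $N = 10$ example of Table 3 rather than to the survey data, which are not reproduced in the slides; the function name `lower_group_index` is hypothetical.

```python
from itertools import accumulate


def lower_group_index(freqs, p):
    """Smallest 1-based h such that P_h. / N >= p."""
    cumulative = list(accumulate(freqs))       # P_1., ..., P_r.
    total = cumulative[-1]                     # N (or W when weights are used)
    for h, P_h in enumerate(cumulative, start=1):
        if P_h >= p * total:
            return h
    raise ValueError("p must not exceed 1")


freqs = [2, 1, 1, 1, 3, 2]                     # n_h. of the N = 10 example
for p in (0.10, 0.50, 0.95):
    print(p, lower_group_index(freqs, p))
```

With weighted survey data the same lookup applies, with the frequencies replaced by sums of household weights.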

SLIDE 47

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

Table 9 reports all the values needed for the decompositions of $V_{h(0.10)}(Y) = 0.7607$. These decompositions are shown in Table 10.

Table 9: Means $M_\ell(Y)$ and $M^{-}_{h\ell}(Y)$, frequencies $P_{h\ell}$ and $P_{h\cdot}$, and relative frequencies $n_{\cdot\ell}/N$ and $p(\ell \mid h)$; $p_{h\cdot} = 0.10$, $h = 460$, $y_h = 10600.00$.

| | North ($\ell = 1$) | Center ($\ell = 2$) | South ($\ell = 3$) | Italy |
|---|---|---|---|---|
| $Y \le 10600.00$ | 275.78 | 114.01 | 425.40 | 815.20 |
| $Y > 10600.00$ | 3696.16 | 1423.36 | 2216.28 | 7335.80 |
| $W_{\cdot\ell}$ | 3971.95 | 1537.37 | 2641.68 | 8151 |
| Relative frequency $W_{\cdot\ell}/W$ | 0.4873 | 0.1886 | 0.3241 | 1.00 |
| Relative frequency $p(\ell \mid h)$ | 0.3383 | 0.1399 | 0.5218 | 1.00 |
| Mean $M_\ell(Y)$ | 33543.17 | 34000.09 | 23517.86 | 30380.22 |
| Lower mean $M^{-}_{h\ell}(Y)$ | 7091.45 | 7554.05 | 7310.03 | 7270.21 |

SLIDE 48

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

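Table 10: Decomposition of $V_{h(0.1)}(Y) = V_{460}(Y) = 0.7607$ into $V_{460\ell g}(Y)$, $V_{460\ell\cdot}(Y)$, $V_{460\ell W}(Y)$, $V_{460\ell B}(Y)$, $V_{460\cdot W}(Y)$, $V_{460\cdot B}(Y)$.

| | $\ell = 1$ | $\ell = 2$ | $\ell = 3$ | Total |
|---|---|---|---|---|
| $V_{460\ell g}(Y)$, $g = 1$ | 0.1435 | 0.0583 | 0.2196 | |
| $V_{460\ell g}(Y)$, $g = 2$ | 0.0565 | 0.0230 | 0.0865 | |
| $V_{460\ell g}(Y)$, $g = 3$ | 0.0593 | 0.0238 | 0.0902 | |
| $V_{460\ell\cdot}(Y)$ | 0.2593 | 0.1051 | 0.3963 | 0.7607 = $V_{460}(Y)$ |
| $V_{460\ell W}(Y)$ | 0.1435 | 0.0230 | 0.0902 | 0.2567 = $V_{460\cdot W}(Y)$ |
| $V_{460\ell B}(Y)$ | 0.1158 | 0.0821 | 0.3060 | 0.5040 = $V_{460\cdot B}(Y)$ |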

SLIDE 49

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

The greatest contribution $V_{h(0.1)\ell g}(Y)$ is $V_{h(0.1)31}(Y)$:

$$V_{h(0.1)31}(Y) = \frac{M_1(Y) - M^{-}_{h(0.1)3}(Y)}{M(Y)} \cdot \frac{n_{\cdot 1}}{N} \cdot p(3 \mid 460) = \frac{33543.17 - 7310.03}{30380.22} \cdot 0.4873 \cdot 0.5218 = 0.8635 \cdot 0.4873 \cdot 0.5218 = 0.2196.$$

This result depends on the difference between the lower mean of the South and the mean of the North, and on their relative weights $p(\ell \mid h)$ and $\frac{n_{\cdot g}}{N}$.

SLIDE 50

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

Conversely, for the contribution $V_{h(0.1)13}(Y)$ we have:

$$V_{h(0.1)13}(Y) = \frac{M_3(Y) - M^{-}_{h(0.1)1}(Y)}{M(Y)} \cdot \frac{n_{\cdot 3}}{N} \cdot p(1 \mid 460) = \frac{23517.86 - 7091.45}{30380.22} \cdot 0.3241 \cdot 0.3383 = 0.5407 \cdot 0.3241 \cdot 0.3383 = 0.0593.$$

In this way the difference between $V_{h(0.1)13}(Y)$ and $V_{h(0.1)31}(Y)$ is explained by the different means being compared and by the different weights.

SLIDE 51

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

Let us now consider the decomposition of the point index $V_{h(0.1)}(Y) = 0.7607$ into the three contributions $V_{h(0.1)\ell\cdot}(Y)$ of the macro-regions:

$$V_{h(0.1)1\cdot}(Y) = \frac{M(Y) - M^{-}_{h(0.1)1}(Y)}{M(Y)} \cdot p(1 \mid 460) = \frac{30380.22 - 7091.45}{30380.22} \cdot 0.3383 = 0.7666 \cdot 0.3383 = 0.2593;$$

$$V_{h(0.1)2\cdot}(Y) = \frac{M(Y) - M^{-}_{h(0.1)2}(Y)}{M(Y)} \cdot p(2 \mid 460) = \frac{30380.22 - 7554.05}{30380.22} \cdot 0.1399 = 0.7513 \cdot 0.1399 = 0.1051;$$

$$V_{h(0.1)3\cdot}(Y) = \frac{M(Y) - M^{-}_{h(0.1)3}(Y)}{M(Y)} \cdot p(3 \mid 460) = \frac{30380.22 - 7310.03}{30380.22} \cdot 0.5218 = 0.7594 \cdot 0.5218 = 0.3963.$$
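The arithmetic of Table 10 can be re-derived from the rounded inputs of Table 9. This is a sketch for checking purposes only; the variable names are illustrative, and because the inputs are rounded to four decimals the results match the published figures only to about three or four decimals.

```python
# Inputs copied from Table 9 (h = 460); order: North, Center, South.
M = 30380.22                                   # overall mean M(Y)
M_g = [33543.17, 34000.09, 23517.86]           # subpopulation means M_g(Y)
lower_mean = [7091.45, 7554.05, 7310.03]       # lower means for Y <= 10600
p = [0.3383, 0.1399, 0.5218]                   # p(l | h)
w = [0.4873, 0.1886, 0.3241]                   # n_.g / N
k = 3

# V[l][g] is the contribution V_{460 l g}(Y) of formula (30).
V = [[(M_g[g] - lower_mean[l]) / M * p[l] * w[g] for g in range(k)]
     for l in range(k)]

by_subpop = [sum(V[l]) for l in range(k)]      # formula (33)
within = [V[l][l] for l in range(k)]           # formula (39)
between = [by_subpop[l] - within[l] for l in range(k)]   # formula (40)
total = sum(by_subpop)

print([round(v, 4) for v in by_subpop], round(total, 4))
print(round(sum(within), 4), round(sum(between), 4))
```

The entry `V[2][0]` corresponds to $V_{h(0.1)31}(Y)$ and `V[0][2]` to $V_{h(0.1)13}(Y)$, the two contributions worked out by hand above.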

SLIDE 52

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

These values show that the relative variations of the lower means of the three macro-regions w.r.t. the mean of the whole population are similar, while their relative weights $p(\ell \mid h)$ are very different. This explains why the greatest contribution to the point index $V_{h(0.10)}(Y) = 0.7607$ comes from the South.

SLIDE 53

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

Finally, Table 11 reports all the values needed for the decompositions of $V_{h(0.95)}(Y) = 0.1181$, which are reported in Table 12.

Table 11: Means $M_\ell(Y)$ and $M^{-}_{h\ell}(Y)$, frequencies $P_{h\ell}$ and $P_{h\cdot}$, and relative frequencies $n_{\cdot\ell}/N$ and $p(\ell \mid h)$; $p_{h\cdot} = 0.95$, $h = 6841$, $y_h = 68819.23$.

| | North ($\ell = 1$) | Center ($\ell = 2$) | South ($\ell = 3$) | Italy |
|---|---|---|---|---|
| $Y \le 68819.23$ | 3710.67 | 1443.33 | 2589.48 | 7743.48 |
| $Y > 68819.23$ | 261.28 | 94.04 | 52.20 | 407.52 |
| $W_{\cdot\ell}$ | 3971.95 | 1537.37 | 2641.68 | 8151 |
| Relative frequency $W_{\cdot\ell}/W$ | 0.4873 | 0.1886 | 0.3241 | 1.00 |
| Relative frequency $p(\ell \mid h)$ | 0.4792 | 0.1864 | 0.3344 | 1.00 |
| Mean $M_\ell(Y)$ | 33543.17 | 34000.09 | 23517.86 | 30380.22 |
| Lower mean $M^{-}_{h\ell}(Y)$ | 28856.94 | 30401.45 | 21819.47 | 26791.44 |

SLIDE 54

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

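Table 12: Decomposition of $V_{h(0.95)}(Y) = V_{6841}(Y) = 0.1181$ into $V_{6841\ell g}(Y)$, $V_{6841\ell\cdot}(Y)$, $V_{6841\ell W}(Y)$, $V_{6841\ell B}(Y)$, $V_{6841\cdot W}(Y)$, $V_{6841\cdot B}(Y)$.

| | $\ell = 1$ | $\ell = 2$ | $\ell = 3$ | Total |
|---|---|---|---|---|
| $V_{6841\ell g}(Y)$, $g = 1$ | 0.0360 | 0.0094 | 0.0629 | |
| $V_{6841\ell g}(Y)$, $g = 2$ | 0.0153 | 0.0042 | 0.0253 | |
| $V_{6841\ell g}(Y)$, $g = 3$ | -0.0273 | -0.0137 | 0.0061 | |
| $V_{6841\ell\cdot}(Y)$ | 0.0240 | -0.0001 | 0.0942 | 0.1181 = $V_{6841}(Y)$ |
| $V_{6841\ell W}(Y)$ | 0.0360 | 0.0042 | 0.0061 | 0.0462 = $V_{6841\cdot W}(Y)$ |
| $V_{6841\ell B}(Y)$ | -0.0120 | -0.0043 | 0.0882 | 0.0719 = $V_{6841\cdot B}(Y)$ |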

SLIDE 55

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

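In conclusion, Table 13 reports the decompositions of the synthetic index $V(Y) = 0.4795$.

Table 13: Decomposition of the synthetic index $V(Y) = 0.4795$ into $V_{\cdot\ell g}(Y)$, $V_{\cdot\ell\cdot}(Y)$, $V_{\cdot\ell W}(Y)$, $V_{\cdot\ell B}(Y)$, $V_{\cdot\cdot W}(Y)$, $V_{\cdot\cdot B}(Y)$.

| | $\ell = 1$ | $\ell = 2$ | $\ell = 3$ | Total |
|---|---|---|---|---|
| $V_{\cdot\ell g}(Y)$, $g = 1$ | 0.1114 | 0.0364 | 0.1366 | |
| $V_{\cdot\ell g}(Y)$, $g = 2$ | 0.0443 | 0.0145 | 0.0541 | |
| $V_{\cdot\ell g}(Y)$, $g = 3$ | 0.0289 | 0.0084 | 0.0449 | |
| $V_{\cdot\ell\cdot}(Y)$ | 0.1847 | 0.0593 | 0.2355 | 0.4795 = $V(Y)$ |
| $V_{\cdot\ell W}(Y)$ | 0.1114 | 0.0145 | 0.0449 | 0.1708 = $V_{\cdot\cdot W}(Y)$ |
| $V_{\cdot\ell B}(Y)$ | 0.0733 | 0.0448 | 0.1907 | 0.3088 = $V_{\cdot\cdot B}(Y)$ |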

SLIDE 56

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

It is useful to recall that the contributions $V_{\cdot\ell g}(Y)$ reported in this table are the weighted means of the corresponding contributions $V_{h\ell g}(Y)$ with weights $n_{h\cdot}/N$:

$$V_{\cdot\ell g}(Y) = \sum_{h=1}^{r} V_{h\ell g}(Y) \cdot \frac{n_{h\cdot}}{N}.$$

This table confirms that the two greatest contributions $V_{\cdot\ell g}(Y)$ are $V_{\cdot 31}(Y) = 0.1366$ and $V_{\cdot 11}(Y) = 0.1114$.

SLIDE 57

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

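Finally, Table 14 reports for the three macro-regions their:

  • relative contributions to the point indexes

$$\nu_{h\ell\cdot}(Y) = \frac{V_{h\ell\cdot}(Y)}{V_h(Y)} = \frac{M(Y) - M^{-}_{h\ell}(Y)}{M(Y) - M^{-}_{h\cdot}(Y)} \cdot p(\ell \mid h) \qquad (h = 1, \dots, r - 1); \tag{51}$$

  • relative contributions to the synthetic index

$$\nu_{\cdot\ell\cdot}(Y) = \frac{V_{\cdot\ell\cdot}(Y)}{V(Y)}; \tag{52}$$

  • relative weights $\frac{n_{\cdot\ell}}{N}$.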

SLIDE 58

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

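Table 14: Relative contributions $\nu_{460\ell\cdot}(Y)$, $\nu_{3064\ell\cdot}(Y)$, $\nu_{6841\ell\cdot}(Y)$ and $\nu_{\cdot\ell\cdot}(Y)$.

| | North | Center | South | Total |
|---|---|---|---|---|
| $\nu_{460\ell\cdot}(Y)$ | 0.3409 | 0.1381 | 0.5209 | 1.00 |
| $\nu_{3064\ell\cdot}(Y)$ | 0.4084 | 0.1336 | 0.4581 | 1.00 |
| $\nu_{6841\ell\cdot}(Y)$ | 0.2034 | -0.0011 | 0.7977 | 1.00 |
| $\nu_{\cdot\ell\cdot}(Y)$ | 0.3851 | 0.1237 | 0.4912 | 1.00 |
| $n_{\cdot\ell}/N$ | 0.4873 | 0.1886 | 0.3241 | 1.00 |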

SLIDE 59

Application: decomposition by geographical areas of the point and synthetic inequality indexes of the whole country

From Table 14, we note that the South is the region that shows the greatest income inequality, while the opposite holds for the Center. We conclude this section by observing that, for the whole population, the within component is $0.1708 / 0.4795 = 35.62\%$ of the synthetic index.
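The aggregations behind the last rows of Table 13, the $\nu_{\cdot\ell\cdot}(Y)$ row of Table 14 and the within share can be sketched from the published $3 \times 3$ matrix. The variable names are illustrative; since the inputs are already rounded, the results can differ from the tables in the fourth decimal.

```python
# V_.lg(Y) copied from Table 13: rows g = 1..3, columns l = North, Center, South.
table13 = [
    [0.1114, 0.0364, 0.1366],
    [0.0443, 0.0145, 0.0541],
    [0.0289, 0.0084, 0.0449],
]
k = 3

by_subpop = [sum(table13[g][l] for g in range(k)) for l in range(k)]  # V_.l.(Y)
total = sum(by_subpop)                                                # V(Y)
within = sum(table13[l][l] for l in range(k))                         # V_..W(Y)
between = total - within                                              # V_..B(Y)

shares = [v / total for v in by_subpop]        # nu_.l.(Y), formula (52)
within_share = within / total

print([round(v, 4) for v in by_subpop], round(total, 4))
print(round(within, 4), round(between, 4), round(within_share, 4))
print([round(s, 4) for s in shares])
```

The within part is simply the diagonal of the matrix (each subpopulation compared with itself), and everything off the diagonal is the between part.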

SLIDE 60

Conclusion and final remarks

The decompositions illustrated here and those proposed by Tarsitano (1990) and by Silber (2013) are rather different. In fact, Tarsitano and Silber give only the decomposition by subpopulations of the Bonferroni synthetic index. It is worth pointing out that the principal result of the present proposal is the decomposition of the Bonferroni point measure $V_h(Y) = \frac{M(Y) - M^{-}_{h\cdot}(Y)}{M(Y)}$ into the following sum of $k \times k$ contributions $V_{h\ell g}(Y)$:

$$V_h(Y) = \sum_{\ell=1}^{k} \sum_{g=1}^{k} V_{h\ell g}(Y) = \sum_{\ell=1}^{k} \sum_{g=1}^{k} \frac{M_g(Y) - M^{-}_{h\ell}(Y)}{M(Y)} \cdot p(\ell \mid h) \cdot \frac{n_{\cdot g}}{N}.$$

From this $k \times k$ decomposition of $V_h(Y)$, the within and the between components and the contribution of each subpopulation to $V_h(Y)$ are obtained by simple aggregation.

SLIDE 61

Conclusion and final remarks

From formula (30) it follows that if $M^{-}_{h\ell}(Y) > M_g(Y)$ and $p(\ell \mid h) > 0$, then the corresponding contribution $V_{h\ell g}(Y)$ is negative. This characteristic is illustrated in Table 7 and in Table 16. Moreover, if the means $M_g(Y)$ and $M_\ell(Y)$ of two subpopulations are such that $M_g(Y) < M_\ell(Y)$, then $V_{h\ell g}(Y) < 0$ for some $h = 1, \dots, r$. Note that it is also possible to have negative contributions $V_{\cdot\ell g}(Y)$ for the synthetic Bonferroni index, as in the case of the following $6 \times 3$ bivariate distribution.

SLIDE 62

Conclusion and final remarks

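Table 15: Bivariate distribution

| $h$ | $y_h$ | $n_{h1}$ | $n_{h2}$ | $n_{h3}$ | $n_{h\cdot}$ |
|---|---|---|---|---|---|
| 1 | 2 | | 1 | 1 | 2 |
| 2 | 8 | | 1 | | 1 |
| 3 | 24 | | | 1 | 1 |
| 4 | 29 | 1 | | | 1 |
| 5 | 37 | 2 | | 1 | 3 |
| 6 | 62 | 2 | | | 2 |
| Total | | 5 | 2 | 3 | 10 |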

SLIDE 63

Conclusion and final remarks

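Table 16: Contributions $V_{\cdot\ell g}(Y)$ and $V_{\cdot\ell\cdot}(Y)$ of the distribution in Table 15

| | $\ell = 1$ | $\ell = 2$ | $\ell = 3$ |
|---|---|---|---|
| $V_{\cdot\ell g}(Y)$, $g = 1$ | 0.0262 | 0.2553 | 0.2152 |
| $V_{\cdot\ell g}(Y)$, $g = 2$ | -0.0521 | 0.0020 | -0.0205 |
| $V_{\cdot\ell g}(Y)$, $g = 3$ | -0.0410 | 0.0625 | 0.0325 |
| $V_{\cdot\ell\cdot}(Y)$ | -0.0669 | 0.3198 | 0.2272 |

$V(Y) = 0.48$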
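A negative contribution $V_{\cdot\ell\cdot}(Y)$ can be reproduced with formulas (34) and (37). In the extracted slides the empty cells of Table 15 are lost, so the placement of the frequencies in the matrix `n` below is an assumption: it is the placement consistent with the column totals 5, 2, 3 and with the values in Table 16. The function name is also hypothetical.

```python
def subpopulation_contributions(y, n):
    """Return [V_.l.(Y)] for l = 1..k, following (34) and (37)."""
    r, k = len(y), len(n[0])
    N = sum(sum(row) for row in n)
    M = sum(y[h] * sum(n[h]) for h in range(r)) / N
    P = [0] * k        # cumulative frequencies P_hl
    Q = [0.0] * k      # cumulative incomes Q_hl
    contributions = [0.0] * k
    for h in range(r):
        n_h = sum(n[h])
        for l in range(k):
            P[l] += n[h][l]
            Q[l] += y[h] * n[h][l]
        P_h = sum(P)
        for l in range(k):
            if P[l] > 0:   # otherwise p(l|h) = 0 and the term vanishes
                contributions[l] += (M - Q[l] / P[l]) / M * (P[l] / P_h) * n_h / N
    return contributions


y = [2, 8, 24, 29, 37, 62]
# n[h][g]: assumed cell placement for Table 15 (empty cells taken as 0)
n = [[0, 1, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0], [2, 0, 1], [2, 0, 0]]
contributions = subpopulation_contributions(y, n)
print([round(c, 4) for c in contributions], round(sum(contributions), 4))
# approximately [-0.067, 0.320, 0.227], summing to 0.48
```

Under this placement the first subpopulation holds only incomes of 29 or more, so its lower means exceed $M(Y) = 30$ for $h \ge 5$ and its overall contribution turns negative.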

SLIDE 64

Conclusion and final remarks

Moreover, it is also possible to have negative values for the contribution

$$V_{h\ell\cdot}(Y) = \frac{M(Y) - M^{-}_{h\ell}(Y)}{M(Y)} \cdot p(\ell \mid h)$$

of the subpopulation $\ell$ to the point index $V_h(Y)$. In the application reported in this paper:

  • for $h > 7055$, $M^{-}_{h1}(Y) > M(Y)$ and $V_{h1\cdot}(Y) < 0$;
  • for $h > 6850$, $M^{-}_{h2}(Y) > M(Y)$ and $V_{h2\cdot}(Y) < 0$.

We point out that it is possible to have negative contributions $V_{\cdot\ell\cdot}(Y)$ of the subpopulation $\ell$ to the synthetic index $V(Y)$, as illustrated in Table 20.

SLIDE 65

Conclusion and final remarks

Finally, we remark that it is not possible to have negative values for the contributions

$$B_{h\ell g}(Y) = \frac{M^{+}_{hg}(Y) - M^{-}_{h\ell}(Y)}{M^{+}_{h\cdot}(Y)} \cdot a(g \mid h) \cdot p(\ell \mid h)$$

of the Zenga point index

$$I_h(Y) = \frac{M^{+}_{h\cdot}(Y) - M^{-}_{h\cdot}(Y)}{M^{+}_{h\cdot}(Y)},$$

where $M^{+}_{h\cdot}(Y)$ is the mean of the upper group in the whole population $\{(y_{h+1}, n_{h+1\cdot}),\ h = 1, \dots, r - 1\}$, $M^{+}_{hg}(Y)$ is the mean of the upper group in the subpopulation $g$, and $a(g \mid h)$ is the relative frequency of the subpopulation $g$ in the upper group.

SLIDE 66

THANK YOU
