
Welfare, Inequality & Poverty, # 2 - Arthur CHARPENTIER

Arthur Charpentier, Welfare, Inequality and Poverty. charpentier.arthur@gmail.com, http://freakonometrics.hypotheses.org/. Université de Rennes 1, January 2015.


  1. Arthur CHARPENTIER - Welfare, Inequality and Poverty. Arthur Charpentier, charpentier.arthur@gmail.com, http://freakonometrics.hypotheses.org/. Université de Rennes 1, January 2015. Welfare, Inequality & Poverty, # 2.

  2. Modeling Income Distribution. Let $\{x_1, \dots, x_n\}$ denote some sample. Then $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} \frac{1}{n} x_i$. This can be used when we have census data.

         load(url("http://freakonometrics.free.fr/income_5.RData"))
         income <- sort(income)
         plot(1:5, income)

     [Figure: the five sorted incomes; axis: income.]

     It is also possible to use survey data. If $\pi_i$ denotes the probability that individual $i$ is drawn, use weights $\omega_i \propto \frac{1}{n \pi_i}$.

  3. The weighted average is then $\bar{x}_\omega = \sum_{i=1}^{n} \frac{\omega_i}{\omega} x_i$, where $\omega = \sum_{i=1}^{n} \omega_i$. This is an unbiased estimator of the population mean. Sometimes, data are obtained from stratified samples: before sampling, members of the population are grouped into homogeneous subgroups (called strata). Given $S$ strata, such that the population in stratum $s$ is $N_s$ (with $N = \sum_{s} N_s$), then $\bar{x}_S = \sum_{s=1}^{S} \frac{N_s}{N} \bar{x}_s$, where $\bar{x}_s = \frac{1}{N_s} \sum_{i \in \mathcal{S}_s} x_i$.
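
The two estimators above can be sketched in R. This is only an illustration: the incomes, the inclusion probabilities `prob`, the stratum labels and the stratum sizes `N_s` are all made up for the example and are not part of the original slides.

```r
# Hypothetical survey: five incomes with assumed inclusion probabilities
x    <- c(12000, 25000, 31000, 58000, 140000)
prob <- c(0.10, 0.10, 0.05, 0.05, 0.01)   # assumed pi_i

# Weights proportional to 1 / pi_i, then the weighted average
w      <- 1 / prob
xbar_w <- sum(w / sum(w) * x)             # same value as weighted.mean(x, w)

# Stratified mean: assumed two strata with known population sizes
stratum <- c("A", "A", "A", "B", "B")
N_s     <- c(A = 900, B = 100)
xbar_s  <- tapply(x, stratum, mean)       # mean within each stratum
xbar_S  <- sum(N_s / sum(N_s) * as.numeric(xbar_s[names(N_s)]))

xbar_w
xbar_S
```

Individuals with a small probability of being drawn receive a large weight, so they count for more in `xbar_w` than in the plain sample mean.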

  4. Statistical Tools Used to Describe the Distribution. Consider a sample $\{x_1, \dots, x_n\}$. Usually, the order is not important, so let us order those values, $x_{1:n} \le x_{2:n} \le \dots \le x_{n-1:n} \le x_{n:n}$, where $x_{1:n} = \min\{x_i\}$ and $x_{n:n} = \max\{x_i\}$. As usual, assume that the $x_i$'s were randomly drawn from an (unknown) distribution $F$. If $F$ denotes the cumulative distribution function, $F(x) = P(X \le x)$, one can prove that $F(x_{i:n}) = P(X \le x_{i:n}) \sim \frac{i}{n}$. The quantile function is defined as the inverse of the cumulative distribution function $F$: $Q(u) = F^{-1}(u)$, or $F(Q(u)) = P(X \le Q(u)) = u$.
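
A minimal sketch of these tools in R, on simulated data (the lognormal sample is an assumption made for illustration; it is not the dataset used in the slides). The empirical quantile is taken with `type = 1`, which is the inverse of the empirical distribution function.

```r
set.seed(1)
x <- rlnorm(200, meanlog = 10, sdlog = 1)   # hypothetical incomes

x_sorted <- sort(x)          # order statistics x_{1:n} <= ... <= x_{n:n}
n        <- length(x)

Fn <- ecdf(x)                # empirical cumulative distribution function
# For distinct values, the empirical cdf at the i-th order statistic is i/n
isTRUE(all.equal(Fn(x_sorted), (1:n) / n))

# Empirical quantile function (inverse of the empirical cdf)
Q <- function(u) quantile(x, probs = u, type = 1, names = FALSE)
Q(0.5)                       # empirical median
Fn(Q(0.5))                   # back to the probability level
```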

  5. Lorenz curve. The empirical version of the Lorenz curve is $L\left(\frac{i}{n}\right) = \frac{1}{n \bar{x}} \sum_{j \le i} x_{j:n}$.

         # income was sorted when it was loaded
         plot((0:5) / 5, c(0, cumsum(income) / sum(income)))

     [Figure: Lorenz curve; x-axis: p, y-axis: L(p).]

  6. Gini Coefficient. A Gini coefficient is defined as the ratio of areas, $\frac{A}{A + B}$. It can be defined using order statistics as $G = \frac{2}{n (n-1) \bar{x}} \sum_{i=1}^{n} i \cdot x_{i:n} - \frac{n+1}{n-1}$.

         n <- length(income)
         mu <- mean(income)
         2 * sum((1:n) * sort(income)) / (mu * n * (n - 1)) - (n + 1) / (n - 1)
         # [1] 0.5800019

     [Figure: Lorenz curve with area A between the diagonal and the curve and area B below the curve; x-axis: p, y-axis: L(p).]
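
The order-statistics formula above can be wrapped in a small function. The name `gini` is chosen here for convenience (a function with that name is called later in the slides, but its definition is not shown, so this is an assumed implementation). The two test vectors are made-up extreme cases.

```r
# Gini coefficient from order statistics (assumed helper, not from the slides)
gini <- function(x) {
  x <- sort(x)               # order statistics x_{1:n} <= ... <= x_{n:n}
  n <- length(x)
  2 * sum((1:n) * x) / (mean(x) * n * (n - 1)) - (n + 1) / (n - 1)
}

gini(rep(1000, 10))          # everyone has the same income
gini(c(0, 0, 0, 0, 100))     # one person holds all the income
```

With the $n - 1$ normalization used in this formula, perfect equality gives 0 and the case where one individual holds everything gives 1.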

  7. Distribution Fitting. Assume that we now have more observations,

         load(url("http://freakonometrics.free.fr/income_500.RData"))

     We can use a histogram to visualize the distribution of the income.

         summary(income)
         #    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
         #    2191   23830   42750   77010   87430 2003000
         sort(income)[495:500]
         # [1]  465354  489734  512231  539103  627292 2003241
         hist(income, breaks = seq(0, 2005000, by = 5000))

     [Figure: histogram of income; x-axis: income, y-axis: frequency.]

  8. Distribution Fitting. Because of the dispersion, look at the histogram of the logarithm of the data.

         hist(log(income, 10), breaks = seq(3, 6.5, length = 51))
         boxplot(income, horizontal = TRUE, log = "x")

     [Figure: histogram of log(income, 10), with frequency on the y-axis, above a horizontal boxplot of income on a log scale.]

  9. Distribution Fitting. The cumulative distribution function (on the log of the income):

         u <- sort(income)
         v <- (1:500) / 500
         plot(u, v, type = "s", log = "x")

     [Figure: empirical cumulative distribution function; x-axis: income (log scale), y-axis: cumulated probabilities.]

  10. Distribution Fitting. If we invert that graph, we have the quantile function.

         plot(v, u, type = "s", col = "red", log = "y")

     [Figure: empirical quantile function; x-axis: probabilities, y-axis: income (log scale).]

  11. Distribution Fitting. On that dataset, the Lorenz curve is

         # sort first, so that the cumulative sums start from the lowest incomes
         plot((0:500) / 500, c(0, cumsum(sort(income)) / sum(income)))

     [Figure: Lorenz curve; x-axis: p, y-axis: L(p).]

  12. Distribution and Confidence Intervals. There are two techniques to get the distribution of an estimator $\hat{\theta}$:
     - a parametric one, based on some assumptions on the underlying distribution,
     - a nonparametric one, based on sampling techniques.

     If the $X_i$'s have a $\mathcal{N}(\mu, \sigma^2)$ distribution, then $\bar{X} \sim \mathcal{N}\left(\mu, \frac{\sigma^2}{n}\right)$. But sometimes the distribution can only be obtained as an approximation, because of asymptotic properties. From the central limit theorem, $\bar{X} \to \mathcal{N}\left(\mu, \frac{\sigma^2}{n}\right)$ as $n \to \infty$. In the nonparametric case, the idea is to generate pseudo-samples of size $n$ by resampling from the original sample.

  13. Bootstrapping. Consider a sample $x = \{x_1, \dots, x_n\}$. At step $b = 1, 2, \dots, B$, generate a pseudo-sample $x_b$ by sampling (with replacement) within sample $x$. Then compute any statistic $\hat{\theta}(x_b)$.

         boot <- function(sample, f, b = 500) {
           F <- rep(NA, b)                 # one value of the statistic per replication
           n <- length(sample)
           for (i in 1:b) {
             # the call sample(...) still resolves to the base R function,
             # even though the data argument is also named `sample`
             idx <- sample(1:n, size = n, replace = TRUE)
             F[i] <- f(sample[idx])
           }
           return(F)
         }

  14. Bootstrapping. Let us generate 10,000 bootstrapped samples, and compute the Gini index on those.

         # gini is assumed to be a function returning the Gini index of a vector
         boot_gini <- boot(income, gini, 1e4)

     To visualize the distribution of the index:

         hist(boot_gini, probability = TRUE)
         u <- seq(.4, .7, length = 251)
         v <- dnorm(u, mean(boot_gini), sd(boot_gini))
         lines(u, v, col = "red", lty = 2)

     [Figure: histogram of boot_gini with the fitted Gaussian density as a dashed red line; x-axis: boot_gini, y-axis: density.]
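
One natural use of the bootstrapped values is a confidence interval for the Gini index. The sketch below is self-contained and runs on simulated incomes rather than the dataset of the slides; `income_sim`, `boot_stat` and the `gini` helper are names introduced here for illustration, and the percentile interval is one common choice among several, not necessarily the one the author has in mind.

```r
set.seed(123)
income_sim <- rlnorm(500, meanlog = 10.5, sdlog = 1)   # hypothetical incomes

# Assumed helper: Gini index from order statistics
gini <- function(x) {
  x <- sort(x)
  n <- length(x)
  2 * sum((1:n) * x) / (mean(x) * n * (n - 1)) - (n + 1) / (n - 1)
}

# Resample with replacement and recompute the statistic b times
boot_stat <- function(x, f, b = 500) {
  replicate(b, f(sample(x, size = length(x), replace = TRUE)))
}

boot_gini_sim <- boot_stat(income_sim, gini, b = 2000)

# 95% percentile interval, and the Gaussian approximation for comparison
ci_percentile <- quantile(boot_gini_sim, c(0.025, 0.975), names = FALSE)
ci_normal     <- gini(income_sim) + c(-1, 1) * qnorm(0.975) * sd(boot_gini_sim)

ci_percentile
ci_normal
```

If the histogram of the bootstrapped values is close to the Gaussian density, the two intervals should be close to each other.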

  15. Continuous Versions. The empirical cumulative distribution function is $\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(x_i \le x)$. Observe that $\hat{F}_n(x_{j:n}) = \frac{j}{n}$. If $F$ is absolutely continuous, $F(x) = \int_0^x f(t)\,dt$, i.e. $f(x) = \frac{dF(x)}{dx}$. Then $P(X \in [a, b]) = \int_a^b f(t)\,dt = F(b) - F(a)$.

  16. Continuous Versions. One can define quantiles as $x = Q(p) = F^{-1}(p)$. The expected value is $\mu = \int_0^\infty x f(x)\,dx = \int_0^\infty [1 - F(x)]\,dx = \int_0^1 Q(p)\,dp$. We can compute the average standard of living of the group below $z$. This is equivalent to the expectation of a truncated distribution, $\mu_z^- = \frac{1}{F(z)} \int_0^z x f(x)\,dx = \int_0^z \left(1 - \frac{F(x)}{F(z)}\right) dx$.
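
These identities can be checked numerically. The sketch below assumes a lognormal distribution with parameters `m` and `s` and a threshold `z`, all chosen arbitrarily for illustration, and uses `integrate()` for the integrals.

```r
m <- 0; s <- 0.5             # assumed lognormal parameters
z <- 1                       # assumed threshold

# Three expressions for the mean
mu_density  <- integrate(function(x) x * dlnorm(x, m, s), 0, Inf)$value
mu_survival <- integrate(function(x) 1 - plnorm(x, m, s), 0, Inf)$value
mu_quantile <- integrate(function(p) qlnorm(p, m, s), 0, 1)$value
c(mu_density, mu_survival, mu_quantile, exp(m + s^2 / 2))

# Two expressions for the mean of the group below z
Fz <- plnorm(z, m, s)
mu_below_1 <- integrate(function(x) x * dlnorm(x, m, s), 0, z)$value / Fz
mu_below_2 <- integrate(function(x) 1 - plnorm(x, m, s) / Fz, 0, z)$value
c(mu_below_1, mu_below_2)
```

The last element of the first vector is the closed-form mean of a lognormal distribution, $e^{m + s^2/2}$, which the three integrals should reproduce up to numerical error.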

  17. Continuous Versions. The Lorenz curve is $p \mapsto L(p)$ with $L(p) = \frac{1}{\mu} \int_0^{Q(p)} x f(x)\,dx$. Gastwirth (1971) proved that $L(p) = \frac{1}{\mu} \int_0^p Q(u)\,du = \frac{\int_0^p Q(u)\,du}{\int_0^1 Q(u)\,du}$. The numerator sums the incomes of the bottom $p$ proportion of the population. The denominator sums the incomes of the whole population. $L$ is a $[0, 1] \to [0, 1]$ function, continuous if $F$ is continuous. Observe that $L$ is increasing, since $\frac{dL(p)}{dp} = \frac{Q(p)}{\mu}$. Further, $L$ is convex, since $Q$ is increasing.
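
As an illustration of Gastwirth's formula, the sketch below integrates the quantile function of a lognormal distribution (parameters `m` and `s` are arbitrary assumptions) and compares the result with the known closed form of the lognormal Lorenz curve, $L(p) = \Phi(\Phi^{-1}(p) - s)$.

```r
m <- 0; s <- 0.8                         # assumed lognormal parameters
mu <- exp(m + s^2 / 2)                   # mean of the lognormal distribution

# Gastwirth: L(p) = (1 / mu) * integral of Q(u) over [0, p]
lorenz_gastwirth <- function(p) {
  sapply(p, function(pp) {
    integrate(function(u) qlnorm(u, m, s), 0, pp)$value / mu
  })
}

# Closed form for the lognormal case
lorenz_closed <- function(p) pnorm(qnorm(p) - s)

p <- c(0.1, 0.25, 0.5, 0.75, 0.9)
cbind(p, gastwirth = lorenz_gastwirth(p), closed_form = lorenz_closed(p))
```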

  18. The sample case: $L\left(\frac{i}{n}\right) = \frac{\sum_{j=1}^{i} x_{j:n}}{\sum_{j=1}^{n} x_{j:n}}$. The points $\{i/n, L(i/n)\}$ are then linearly interpolated to complete the corresponding Lorenz curve. The continuous distribution case: $L(p) = \frac{\int_0^{F^{-1}(p)} y\,dF(y)}{\int_0^\infty y\,dF(y)} = \frac{1}{E(X)} \int_0^p F^{-1}(u)\,du$, with $p \in (0, 1)$. Let $L$ be a continuous function on $[0, 1]$; then $L$ is a Lorenz curve if and only if $L(0) = 0$, $L(1) = 1$, $L'(0^+) \ge 0$ and $L''(p) \ge 0$ on $[0, 1]$.
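
The sample case, including the linear interpolation, can be sketched as follows. The data are simulated and the function name `lorenz_points` is introduced here for illustration only; `approxfun()` performs the linear interpolation between the points $\{i/n, L(i/n)\}$.

```r
# Points (i/n, L(i/n)) of the empirical Lorenz curve (assumed helper)
lorenz_points <- function(x) {
  x <- sort(x)
  n <- length(x)
  list(p = (0:n) / n, L = c(0, cumsum(x) / sum(x)))
}

set.seed(42)
x  <- rlnorm(300, meanlog = 10, sdlog = 0.8)   # hypothetical incomes
lp <- lorenz_points(x)

# Linear interpolation gives a function defined on the whole of [0, 1]
L_hat <- approxfun(lp$p, lp$L)
L_hat(0.5)                   # income share of the poorest half

# Properties of a Lorenz curve: L(0) = 0, L(1) = 1, increasing and convex
slopes <- diff(lp$L)         # proportional to the order statistics
c(L_hat(0), L_hat(1))
all(slopes >= 0)
all(diff(slopes) >= -1e-12)  # slopes are nondecreasing, up to rounding
```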

  19. From Lorenz to Bonferroni. The Bonferroni curve is $B(p) = \frac{L(p)}{p}$ and the Bonferroni index is $BI = 1 - \int_0^1 B(p)\,dp$. Define $P_i = \frac{i}{n}$ and $Q_i = \frac{1}{n \bar{x}} \sum_{j=1}^{i} x_{j:n}$; then $B = \frac{1}{n-1} \sum_{i=1}^{n-1} \frac{P_i - Q_i}{P_i}$.
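
A minimal sketch of the sample Bonferroni index, following the definitions of $P_i$ and $Q_i$ above and assuming the cumulative sums are taken over sorted incomes. The function name `bonferroni` and the two test vectors are introduced here for illustration.

```r
# Sample Bonferroni index (assumed helper, not from the slides)
bonferroni <- function(x) {
  x <- sort(x)               # cumulative sums start from the lowest incomes
  n <- length(x)
  i <- 1:(n - 1)
  P <- i / n                 # population shares
  Q <- cumsum(x)[i] / sum(x) # income shares, Q_i = L(i/n)
  sum((P - Q) / P) / (n - 1)
}

bonferroni(rep(1000, 10))        # everyone has the same income
bonferroni(c(0, 0, 0, 0, 100))   # one person holds all the income
```

Since $(P_i - Q_i)/P_i = 1 - B(P_i)$, the index averages the gap between the Bonferroni curve and 1 over the grid points $P_1, \dots, P_{n-1}$.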
