
  1. Aims of these Lectures • Discuss some basic statistical concepts/techniques. • Relate these to Economics. • Help you to help yourself. • These lectures are not, and are not intended to be, a substitute for the Applied Statistics for Economics and Business course.

  2. • A useful website: http://www.maths.murdoch.edu.au/units/statsnotes/ • A useful statistics package: MINITAB (Start > Networked Applications > General Software > Statistics and Graphing > Minitab 14)

  3. Two Big Issues • Why are Some Countries Richer than Others? • An old issue in Economics: Adam Smith, ‘Wealth of Nations’, 1776 • Why do Some Countries Grow faster than Others? • Countries are richer now because they grew faster in the past, e.g. compare 2000 with 1500

  4. Neoclassical Growth Theory • A higher saving rate, s, implies larger output per capita and a higher real wage rate in the long run. • A higher population growth rate, n, implies lower output per capita and a lower real wage rate in the long run. • Countries which are far below their long run equilibrium (steady state) will grow faster than countries which are close to it.

  5. The Penn World Tables • Panel data set of macroeconomic variables • 208 countries • 25 macro variables • 1950 to 2000 • Many missing values

  6. Accessing the PWT • Available at http://www.pwt.econ.upenn.edu/ • Select countries/years/variables • Choose the CSV option • Copy and paste the data into Notepad • Save as, e.g., “mydata.csv”, keeping the quotation marks • Open in Excel • (Alternatively: use Word and save as a text file.)

  7. Describing Data • Tabulate, List • Numerical Summary • Graphical Summary

  8. Frequency Table • Select a suitable set of class intervals bounded by class limits. • The class frequency is the number of data points in each interval. • The class mark is the midpoint of the class interval. • Class boundaries may differ from class limits (as a result of rounding). • The class size is the difference between the upper and lower class boundaries.
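The definitions on the frequency table slide can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table`, the half-open intervals `[lower, upper)` and the sample values are assumptions, not part of the lectures.

```python
def frequency_table(values, lower, width, n_classes):
    """Return (lower, upper, class mark, frequency) for equal-width classes.

    Assumption: each class is the half-open interval [lower, upper).
    """
    rows = []
    for k in range(n_classes):
        lo = lower + k * width
        hi = lo + width
        freq = sum(1 for v in values if lo <= v < hi)  # class frequency
        mark = (lo + hi) / 2                           # class mark (midpoint)
        rows.append((lo, hi, mark, freq))
    return rows


# Made-up sample values, for illustration only.
sample = [383, 950, 1200, 1750, 1999, 2305, 2800, 3970]
table = frequency_table(sample, lower=0, width=1000, n_classes=4)
for lo, hi, mark, freq in table:
    print(f"{lo:>5} to {hi:>5}   mark = {mark:>7.1f}   frequency = {freq}")
```

Here the class size is the `width` argument, and every value falls into exactly one class as long as the classes cover the range of the data.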

  9. Frequency Distributions Suppose we have a sample of n observations. The (absolute) frequency of any value is the number of times that value appears in the sample. The relative frequency of a value is the proportion of the sample which has that value. The empirical frequency distribution of a random variable is the sample analogue of its probability distribution. It can be graphed by constructing a histogram.
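As a small illustration of absolute and relative frequencies, the sketch below counts a made-up sample with Python's `collections.Counter`. The sample values are hypothetical.

```python
from collections import Counter

# Made-up sample of n = 10 observations, for illustration only.
sample = [2, 3, 3, 5, 3, 2, 5, 5, 5, 1]
n = len(sample)

absolute = Counter(sample)  # value -> number of times it appears
relative = {value: count / n for value, count in absolute.items()}

for value in sorted(absolute):
    print(f"value {value}: absolute = {absolute[value]}, relative = {relative[value]:.1f}")
```

The relative frequencies sum to one, which is what makes the empirical distribution the sample analogue of a probability distribution.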

  10. World Distribution of Real GDP per capita 1960, 2000
      Class          rgdp1960  rgdp2000
      0-999                23        15
      1000-1999            27        13
      2000-2999            21         7
      3000-3999            13        10
      4000-4999             5         8
      5000-5999             3         4
      6000-6999             1         4
      7000-7999             6         2
      8000-8999             2         2
      9000-9999             2         3
      10000-10999           4         2
      11000-14999           3         3
      15000-19999           0         6
      20000-24999           0        10
      25000+                0         9
      Note: Class Boundaries are, e.g. 2999.5-3999.5

  11. The Median • The median is the value with half of the sample below it and half of the sample above it. • Order the sample from smallest to largest; the median lies halfway up the order. • Let n be the sample size: if n is odd, the median is observation (n+1)/2; if n is even, average the two values at positions n/2 and (n/2)+1. • A useful property: the median is insensitive to (changes in) extreme sample values.
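The odd/even rule for the median can be written out directly. The sketch below is a minimal illustration; the function name `sample_median` and the sample values are hypothetical.

```python
def sample_median(values):
    """Median by the rule on the slide: sort, then take the middle position(s)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty sample is undefined")
    if n % 2 == 1:
        # Odd n: observation (n+1)/2 in 1-based counting, i.e. index (n-1)//2.
        return ordered[(n - 1) // 2]
    # Even n: average observations n/2 and (n/2)+1 in 1-based counting.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


print(sample_median([7, 1, 5]))     # odd sample size
print(sample_median([7, 1, 5, 3]))  # even sample size
```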

  12. Quantiles • The First Quartile, Q1, is larger than 25% of the sample values and smaller than 75%. • The Third Quartile, Q3, is larger than 75% of the sample values and smaller than 25%. • The Second Quartile, Q2, is the Median. • The Interquartile Range, Q3 - Q1, is a robust measure of the variability of the sample data. • Other frequently used quantiles are deciles and percentiles.
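Quartiles and the interquartile range can be computed with `statistics.quantiles` from the Python standard library (Python 3.8 or later). This is a sketch on made-up data. The function's default interpolation method is only one of several conventions, so other packages may report slightly different quartiles for the same sample.

```python
from statistics import median, quantiles

# Made-up sample, for illustration only.
sample = [10, 20, 30, 40, 50, 60, 70]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(sample, n=4)
iqr = q3 - q1  # interquartile range

print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, IQR = {iqr}")
print(f"median = {median(sample)}")
```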

  13. The Mean • Defined as $\bar{x} = \dfrac{x_1 + x_2 + \dots + x_n}{n} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$ • The ‘centre of gravity’ of the distribution. • Sensitive to extreme values. • Gives each sample value the same ‘weight’, 1/n.
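To see what "sensitive to extreme values" means in practice, the sketch below compares the mean and the median of a made-up sample before and after one value is replaced by an extreme one. The numbers are hypothetical.

```python
from statistics import mean, median

# Made-up samples: the second differs only in its largest value.
sample = [2, 4, 6, 8, 10]
with_extreme = [2, 4, 6, 8, 1000]

for label, data in [("original", sample), ("with extreme value", with_extreme)]:
    print(f"{label}: mean = {mean(data)}, median = {median(data)}")

# Each observation enters the mean with the same weight 1/n.
n = len(sample)
weighted_sum = sum(x * (1 / n) for x in sample)
```

The mean moves a long way while the median stays put, which is why the two are worth reporting together for skewed data such as GDP per capita.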

  14. Comparisons
                 RGDP1960  RGDP2000
      Mean           3332      9088
      Q1             1076      1669
      Median         2305      4361
      Q3             3970     15900
      IQR            2893     14231
      Minimum         383       482
      Maximum       14877     44009

  15. Graphical Methods • Stem and Leaf Plots • Box Plots • Bar Charts • Histograms

  16. Stem and Leaf Plots Given a set of numbers: • The leaf is the last digit considered. • The leaf unit specifies which digit. • The stem is the rest of the number. • The first column is the cumulative count of observations, counted from the nearer end of the data up to and including that stem. • For the stem where the median occurs, the count for that stem alone is shown, enclosed in parentheses.

  17. Stem and Leaf for World GDP, 1960
      N = 111, Leaf Unit = 100
       23    0  34445556677778889999999
       50    1  000001111123334455666778899
      (21)   2  011223333344566667799
       40    3  0000122234489
       27    4  12669
       22    5  238
       19    6  8
       18    7  334778
       12    8  12
       10    9  26
        8   10  1469
        4   11  55
        2   12  4
        1   13
        1   14  8
      i.e. lowest rgdp is in range 300-400, highest is in range 14800-14900
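The stem and leaf rules can be sketched as follows. The function name `stem_and_leaf` is hypothetical, and the sketch assumes that digits below the leaf unit are truncated rather than rounded. The sample is illustrative; 383 and 14877 are the 1960 minimum and maximum quoted on the Comparisons slide.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit):
    """Map each stem to its list of leaves (digits below leaf_unit are truncated)."""
    rows = defaultdict(list)
    for v in sorted(values):
        leaf = (v // leaf_unit) % 10     # the last digit considered
        stem = v // (leaf_unit * 10)     # the rest of the number
        rows[stem].append(leaf)
    return dict(rows)


sample = [383, 482, 1076, 2305, 3970, 14877]  # illustrative values
plot = stem_and_leaf(sample, leaf_unit=100)
for stem in range(min(plot), max(plot) + 1):
    leaves = "".join(str(digit) for digit in plot.get(stem, []))
    print(f"{stem:>3} | {leaves}")
```

With a leaf unit of 100, the value 383 lands on stem 0 with leaf 3 and 14877 on stem 14 with leaf 8, matching the first and last entries of the plot for 1960.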

  18. Departures from Symmetry Skewness: a measure of the asymmetry of a distribution. Skewness is zero for a symmetric distribution. Positive skewness: long tail to the right, mean greater than median. Negative skewness: long tail to the left, mean less than median. Kurtosis: a measure of the thickness of the tails.
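The slide does not give a formula for skewness. One common choice, assumed here, is the moment coefficient $m_3 / m_2^{3/2}$, where $m_k$ is the average of the k-th powers of the deviations from the mean. The sketch below applies it to made-up samples; the function name `skewness` is hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5 (an assumed definition)."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up samples, for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 2, 3, 4, 20]       # mean 6 exceeds median 3
left_tail = [-20, -4, -3, -2, -1]   # mean -6 is below median -3

for label, data in [("symmetric", symmetric),
                    ("right tail", right_tail),
                    ("left tail", left_tail)]:
    print(f"{label}: skewness = {skewness(data):.3f}")
```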

  19. Box Plots • Indicate symmetry and variability of the sample values. • Measuring along the horizontal or vertical axis, draw a box with edges at Q1 and Q3 so its length is the IQR. • The width is up to you. • Draw a line across the box at the median value. • Draw lines, called whiskers, from the box to the sample maximum and sample minimum values (excluding outliers). • Observations lying more than 1.5*IQR from the edges of the box are ‘outliers’ and are represented by asterisks.
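The numbers needed to draw a box plot can be computed as in the sketch below. The function name `box_plot_summary` and the data are hypothetical, and the quartiles come from `statistics.quantiles` with its default method, which may differ slightly from the convention used by a given statistics package.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Box edges, whisker ends and outliers under the 1.5*IQR rule."""
    ordered = sorted(values)
    q1, q2, q3 = quantiles(ordered, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in ordered if low_fence <= v <= high_fence]
    outliers = [v for v in ordered if v < low_fence or v > high_fence]
    return {
        "q1": q1, "median": q2, "q3": q3, "iqr": iqr,
        # Whiskers stop at the most extreme values that are not outliers.
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Made-up sample with one large value, for illustration only.
summary = box_plot_summary([10, 20, 30, 40, 50, 60, 70, 200])
for name, value in summary.items():
    print(f"{name}: {value}")
```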

  20. Boxplot of World Income Distribution, 1960 [Figure: boxplot of real GDP per capita in 1960; axis from 0 to 16000]

  21. The Evolution of World Income Distribution, 1960-2000 [Figure: distributions of rgdp1960, rgdp1970, rgdp1980, rgdp1990 and rgdp2000 on a common axis from 0 to 50000]

  22. Bar Chart of World GDP per capita, 1960 [Figure: bar chart of class frequencies for 1960, one bar per GDP class; frequency axis from 0 to 30]

  23. Bar Chart of World GDP per capita, 2000 [Figure: bar chart of class frequencies for 2000, one bar per GDP class; frequency axis from 0 to 16]

  24. Histograms Given a sample of size n, 1. Select a number of classes (‘bins’) of equal width. Each sample value falls into one of the classes. 2. Calculate the number of values in each class: the class frequency. 3. Construct a bar graph where (a) the base of each bar is the class width and (b) the height is the frequency or the relative frequency for that class. 4. A rule for bin width: $h = \dfrac{2\,\mathrm{IQR}}{n^{1/3}}$. Note: Sometimes useful to have unequal bin widths.
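The bin width rule on this slide, h = 2 IQR / n^(1/3), is usually known as the Freedman-Diaconis rule. As a rough sketch, it can be applied to the 1960 figures quoted elsewhere in these slides (IQR = 2893 from the Comparisons slide, N = 111 from the stem and leaf plot). The function name `bin_width` is hypothetical.

```python
def bin_width(iqr, n):
    """Bin width h = 2 * IQR / n**(1/3), as on the histogram slide."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 2 * iqr / n ** (1 / 3)


# 1960 values taken from the slides: IQR = 2893, N = 111.
h_1960 = bin_width(iqr=2893, n=111)
print(f"suggested bin width for rgdp1960: {h_1960:.0f}")
```

The result is a little over 1200, which is close to the class width of 1000 used in the frequency table for 1960.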

  25. Histogram of World GDP per capita, 1960 [Figure: histogram; x-axis rgdp1960 from 0 to 14000, y-axis Frequency from 0 to 40]

  26. Histogram of World GDP per capita, 2000 [Figure: histogram; x-axis rgdp2000 from 0 to 40000, y-axis Frequency from 0 to 35] Note: Badly chosen class intervals.

  27. Convergence Do Poorer Countries Grow Faster?

  28. Scatterplot of Growth, 1960-2000, vs Initial RGDP [Figure: scatterplot; x-axis rgdp1960 from 0 to 16000, y-axis avgrowth from -0.02 to 0.06]

  29. A Linear Relationship between Two Variables $Y = a + bX$ • Choose Y as the dependent variable and X as the independent variable. • Which a and b best represent the data?

  30. Fitting a Line to Data • Could join any two points, but the line may be a long way from the others. • Any line drawn through the data generates a set of residuals, some positive, some negative. • The distance of a point from the line can be measured by the squared residual. • The Least Squares criterion: ‘minimise the sum of the squared residuals’.

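  31. The ‘Least Squares’ Coefficients $b = \dfrac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(X_i - \bar{X})^2} = \dfrac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$, $a = \bar{Y} - b\bar{X}$, where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$ are deviations from the sample means.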

  32. A Regression Worksheet
      Obs        X      Y       x      y     xy      xx
      1      15.75   0.94   10.25   0.66   6.77  104.97
      2       3.49   0.08   -2.01  -0.21   0.42    4.05
      3       4.74   0.13   -0.76  -0.15   0.11    0.58
      4       5.49   0.36   -0.02   0.07   0.00    0.00
      5       5.41   0.47   -0.09   0.18  -0.02    0.01
      6       2.07   0.10   -3.44  -0.19   0.65   11.80
      7       7.69   0.41    2.18   0.13   0.27    4.76
      8       2.48   0.11   -3.02  -0.18   0.53    9.12
      9       5.44   0.13   -0.07  -0.15   0.01    0.00
      10      2.48   0.11   -3.02  -0.17   0.52    9.13
      Sums   55.05   2.84    0.00   0.00   9.26  144.42
      Means   5.50   0.28
      a = -0.07, b = 0.06
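The worksheet calculation can be reproduced in a few lines. The sketch below uses the X and Y columns as printed, which are rounded to two decimals, so the intermediate sums differ slightly from those on the slide while the coefficients agree to two decimal places. The function name `least_squares` is hypothetical.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line Y = a + bX fitted by least squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross products and squares of deviations from the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


# X and Y columns of the worksheet, as printed (rounded to two decimals).
X = [15.75, 3.49, 4.74, 5.49, 5.41, 2.07, 7.69, 2.48, 5.44, 2.48]
Y = [0.94, 0.08, 0.13, 0.36, 0.47, 0.10, 0.41, 0.11, 0.13, 0.11]

a, b = least_squares(X, Y)
print(f"a = {a:.2f}, b = {b:.2f}")
```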

  33. Average Growth vs RGDP1960: 30 Richest Countries [Figure: scatterplot; x-axis rgdp1960 from 5000 to 15000, y-axis avgrowth from 0.00 to 0.04]

  34. Average Growth vs RGDP1960: 30 Poorest Countries [Figure: scatterplot; x-axis rgdp1960 from 300 to 1200, y-axis avgrowth from -0.01 to 0.05]
