EE625: ECONOMETRICS
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Introduction
◮ Economics is concerned with relations among economic variables.
Econometrics is concerned with the analysis of data describing economic relationships.
◮ We may also ask, out of curiosity, whether a change in one variable
(x) causes a change in another variable (y).
◮ Examples of economic questions:
◮ What is the effect of school spending on student performance?
◮ Does having another year of education cause an increase in salary?
◮ What is the effect of registering debt collectors on low-income debtors?
◮ Does reducing class size cause an improvement in student performance?
◮ What is the effect of prohibiting political campaigns on voting outcomes?
◮ What is the effect of minimum wage on unemployment?
◮ and so on...
◮ These questions have something in common.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Causality and the Notion of Ceteris Paribus in Econometric Analysis
◮ What is the causal effect of x on y?
◮ For example, x could be "institutions" and y could be "economic development", or x could be "schooling" and y could be "wage".
◮ Suppose x is correlated with y. Can we interpret this relationship as causation?
◮ Consider the following story: in 1988, someone conducted a series of interviews with freshmen and found that those who had taken SAT preparation courses scored on average 63 points lower than those who hadn’t.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
◮ The person then concluded that SAT preparation courses were
not helpful. Is his/her conclusion valid?
◮ How can we isolate the effect of x on y, and quantitatively
establish that x matters for y?
◮ If we can run a controlled experiment, a simple correlation analysis may be enough to uncover causality. But this is rarely the case in economics.
◮ We generally must accept the conditions under which people act and the responses occur. Typically we cannot choose the level of a treatment and then record the outcome, but we can often observe people's behavior as recorded in nonexperimental or observational data.
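The SAT story can be mimicked with a small simulation. This is only a sketch under invented assumptions: the score scale, the selection rule (weaker students enroll more often), and the 20-point true benefit are hypothetical numbers, not taken from the 1988 interviews.

```python
import random
import statistics

def simulate_sat(n=20000, true_effect=20.0, randomized=False, seed=0):
    """Return (mean score with course) - (mean score without course)."""
    rng = random.Random(seed)
    took, skipped = [], []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)  # hypothetical, unobserved by the interviewer
        if randomized:
            # Controlled experiment: a coin flip decides who takes the course.
            takes_course = rng.random() < 0.5
        else:
            # Self-selection (assumed): weaker students are more likely to enroll.
            takes_course = ability + rng.gauss(0.0, 0.5) < -0.3
        score = 1000.0 + 100.0 * ability + rng.gauss(0.0, 30.0)
        if takes_course:
            score += true_effect  # the course helps by construction
            took.append(score)
        else:
            skipped.append(score)
    return statistics.fmean(took) - statistics.fmean(skipped)

observational_gap = simulate_sat(randomized=False)
experimental_gap = simulate_sat(randomized=True)
print(f"observational gap: {observational_gap:.1f}")
print(f"experimental gap:  {experimental_gap:.1f}")
```

In this sketch the course raises every taker's score, yet the observational comparison shows takers scoring lower, because who takes the course is related to ability. Under random assignment the simple difference in means lands close to the assumed benefit.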
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Experimentation
◮ For example, how can we isolate the effect of institutions on
economic performance, and quantitatively establish that institutions matter for economic development?
◮ Suppose we could conduct the following experiment: pick 2 identical economies, hold all other factors fixed, change the institutions of only one country, and then watch what happens to the economic development of these 2 countries.
◮ Then we can convincingly attribute the difference in
development paths to institutional change.
◮ Unfortunately, we cannot do that. Instead, we can use
econometric methods to effectively hold other factors fixed.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
◮ Experiments are conducted less often in the social sciences
than in the natural sciences.
◮ Thus experimental data, which are often collected in laboratory environments in the natural sciences, are more difficult to obtain in the social sciences.
◮ Although some social experiments can be devised, it is often expensive or morally repugnant to conduct the kinds of controlled experiments that would be needed to address economic issues.
◮ What we usually have are nonexperimental or observational data. Econometrics has evolved as a separate discipline from statistics because it focuses on the problems inherent in collecting and analyzing nonexperimental economic data.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
◮ A causal effect is the answer to the following counterfactual thought experiment: if only x changes exogenously, all else being equal, what would be the effect on y?
◮ Thus the notion of ceteris paribus, which means other (relevant) factors being equal, plays an important role in causal analysis.
◮ Answering such causal questions is quite challenging, because it is hard to hold all other relevant factors fixed.
◮ The key question in most empirical studies is: Have enough other factors been held fixed to make a case for causality? If some relevant variables are omitted, it is then difficult to isolate changes in endogenous variables that are not driven by omitted factors.
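The cost of omitting a relevant factor can be written down in the simplest linear case. The notation below is an assumption introduced for illustration (it does not appear in the slides): z is the omitted factor, u is an error term, and the model is taken to be linear.

```latex
% Assumed true model: y depends on x and on a second relevant factor z.
y = \beta_0 + \beta_1 x + \beta_2 z + u, \qquad E[u \mid x, z] = 0 .

% Regressing y on x alone (z omitted) gives a slope \tilde{\beta}_1 with
\tilde{\beta}_1 \;\xrightarrow{p}\; \beta_1 + \beta_2 \, \frac{\mathrm{Cov}(x, z)}{\mathrm{Var}(x)} .
```

The second term vanishes only if z does not matter for y (β₂ = 0) or z is uncorrelated with x. Otherwise the simple regression mixes the ceteris paribus effect of x with the effect of the factor that was not held fixed.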
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
◮ Econometrics is useful because if it is carefully applied, it can
simulate a ceteris paribus experiment.
◮ For example, we might be interested in the effect of another week of job training on wages, with all other factors being equal (in particular, education and experience).
◮ If we succeed in holding all other relevant factors fixed and
then find a link between job training and wages, we can conclude that job training has a causal effect on worker productivity.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
◮ Example: Measuring the Return to Education
◮ If a person is chosen from the population and given another year of education, by how much will his or her wage increase?
◮ This is a ceteris paribus question: all other factors are held fixed while another year of education is given to the person.
◮ Suppose we could conduct an experiment to measure the return to education. How would we set up this experiment?
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
◮ One way is to randomly pick people and assign each person an amount of education; some are given no education at all, some are given a high school education, some are given two years of college, and so on.
◮ Subsequently, we measure wages for each group of people.
◮ However, the experiment described above is infeasible. We cannot give someone only a high school education if he or she already has a college degree, and so on.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
◮ Even though experimental data cannot be obtained for
measuring the return to education, we can collect nonexperimental data on education levels and wages for a large group by sampling randomly from the population of working people.
◮ Example of such data: the Current Population Survey (CPS),
the Labor Force Survey (LFS).
◮ A common feature of observational data is self-selection. Usually people choose their own levels of education. Therefore education levels are probably not determined independently of all other factors affecting wage.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
◮ If we settle on a list of controls, and if all factors in the list can be observed, then estimating the causal effect of x on y is quite straightforward. But some factors in the list may not be observable.
◮ For example, to estimate the causal effect of education on wage, we might decide that the relevant controls are years of workforce experience and innate ability.
◮ Since pursuing more education generally requires postponing
entering the workforce, those with more education usually have less experience.
◮ Thus, in a nonexperimental data set on wages and education,
education is likely to be negatively associated with experience.
◮ People with more innate ability often choose higher levels of education. Since higher ability leads to higher wages, there should be a positive relationship between education and ability.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
◮ As experience is not difficult to measure, this variable is likely to be available in a nonexperimental data set.
◮ So accounting for observed factors, such as experience, when estimating the ceteris paribus effect of another variable, such as education, is relatively straightforward.
◮ Ability, on the other hand, is difficult to measure.
◮ Accounting for inherently unobservable factors, such as ability, is much more problematic. Many of the advances in econometric methods have tried to deal with unobserved factors in econometric models.
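A simulated data set can illustrate why unobserved ability is a problem. Everything in this sketch is hypothetical: the coefficients, the linear wage equation, and the way ability raises both education and wages are assumptions chosen for illustration. Because the data are simulated, ability is available here, which lets us compare the estimate that ignores it with the one that holds it fixed.

```python
import random
import statistics

def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals from regressing y on x (with an intercept)."""
    b = slope(x, y)
    a = statistics.fmean(y) - b * statistics.fmean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]

rng = random.Random(1)
n = 5000
true_return = 1.0  # hypothetical wage gain per extra year of education

# Assumed data-generating process: ability raises both education and wages.
ability = [rng.gauss(0.0, 1.0) for _ in range(n)]
educ = [12.0 + 2.0 * a + rng.gauss(0.0, 2.0) for a in ability]
wage = [5.0 + true_return * e + 3.0 * a + rng.gauss(0.0, 2.0)
        for e, a in zip(educ, ability)]

# Ability omitted: the education slope also picks up the effect of ability.
naive_return = slope(educ, wage)

# Ability held fixed: partial ability out of both variables first
# (the Frisch-Waugh-Lovell route to the multiple-regression coefficient).
controlled_return = slope(residuals(ability, educ), residuals(ability, wage))

print(f"ability omitted:    {naive_return:.2f}")
print(f"ability controlled: {controlled_return:.2f}")
```

With real data the second estimate is not available, since ability is not recorded. That is why methods for dealing with unobserved factors matter.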
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Types of Data
◮ Some econometric methods can be applied across different
kinds of data sets.
◮ But some data sets have special features that must be
accounted for or that can be exploited.
◮ The most important data structures encountered in applied
work are
◮ 1) Cross-Section Data
◮ 2) Time Series Data
◮ 3) Pooled Cross Sections
◮ 4) Panel Data
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Cross-Section Data
◮ A cross-sectional data set consists of sample units observed in a particular time period. The "sample units" could be persons, households, firms, cities, states, countries, etc.
◮ We can often assume in cross-sectional data that they have
been obtained by random sampling from the underlying population.
◮ For example, if 1,289 people are randomly drawn from the working population and we acquire information on wages, education, experience, and other characteristics, then we have a random sample from the population of working people.
◮ Random sampling simplifies the analysis of cross-sectional
data.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
◮ Example of cross-section data: a portion of CPS
◮ The variable obs is the observation number assigned to each person in the sample. Unlike the other variables, it is not a characteristic of the individual.
◮ It does not matter which person is labeled as observation 1,
which person is called observation 2, and so on.
◮ The fact that the ordering of the data does not matter for
econometric analysis is a key feature of cross-sectional data sets obtained from random sampling.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Time Series Data
◮ A time series data set consists of observations on a variable or several variables over time.
◮ The observations refer to a single unit such as a person, household, firm, village, province, country, and so on.
◮ Examples include rainfall, stock prices, M1, CPI, GDP, exchange rates, exports, imports, etc.
◮ Because past events can influence future events and lags in
behavior are prevalent in the social sciences, time is an important dimension in a time series data set.
◮ Unlike the arrangement of cross-sectional data, the
chronological ordering of observations in a time series conveys potentially important information.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
◮ Because observations in time series can rarely be assumed to
be independent across time, time series data are usually more difficult to analyze than cross-section data.
◮ Most time series are related to their recent histories.
◮ For example, knowing something about the GDP from last quarter tells us quite a bit about the likely range of the GDP during this quarter, because GDP tends to remain fairly stable from one quarter to the next. Similarly, the probability that it rains today is not independent of whether it rained yesterday.
◮ Although most econometric procedures can be used with both
cross-sectional and time series data, more needs to be done in specifying econometric models for time series data before standard econometric methods can be justified.
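The dependence of a series on its recent history, and the information carried by chronological ordering, can be seen in a simulated example. The AR(1) process and its persistence parameter of 0.9 below are assumptions chosen for illustration, not a model of any actual GDP series.

```python
import random
import statistics

def lag1_autocorr(series):
    """Sample correlation between each observation and the one before it."""
    mean = statistics.fmean(series)
    num = sum((a - mean) * (b - mean) for a, b in zip(series[:-1], series[1:]))
    den = sum((a - mean) ** 2 for a in series)
    return num / den

rng = random.Random(2)
rho = 0.9  # hypothetical persistence: today's value is 0.9 * yesterday's + noise

series = [0.0]
for _ in range(1999):
    series.append(rho * series[-1] + rng.gauss(0.0, 1.0))

# Shuffling keeps the same values but destroys the chronological ordering.
shuffled = series[:]
rng.shuffle(shuffled)

print(f"original order: {lag1_autocorr(series):.2f}")
print(f"shuffled order: {lag1_autocorr(shuffled):.2f}")
```

In the original order each value is strongly related to its predecessor; after shuffling the relation disappears. For a random cross section, by contrast, reordering the observations changes nothing of substance.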
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Pooled Cross Sections
◮ Some data sets have both cross-sectional and time series
features.
◮ For example, we can pool the cross-section data set SES1986 with another cross-section data set, SES2006. The pooled cross section contains more observations on the same variables.
◮ Pooling cross sections from before and after a change is an effective way to analyze the effects of a new government policy.
◮ We can also use it to see how a key relationship has changed over time.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Panel Data
◮ A panel data set consists of a time series for each cross-sectional
member in the data set.
◮ For example, suppose we have wage, education, and
employment history for a set of individuals followed over a ten-year period. Or we might collect investment and financial data of the same set of firms over a five-year time period.
◮ Panel data can also be collected on geographical units. For
example, we can collect data for the same set of villages on migration flows, remittances, employment, loan default, and so on, for the years 2000, 2001, and 2010.
◮ The key feature of panel data that distinguishes them from a
pooled cross section is that the same cross-sectional units (individuals, firms, or villages in the preceding examples) are followed over a given time period.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
◮ Having multiple observations on the same units allows us to
control for certain unobserved characteristics of individuals, firms, and so on.
◮ The use of more than one observation can facilitate causal
inference in situations where inferring causality would be very difficult if only a single cross section were available.
◮ We can even treat a panel data set as a pooled cross section. But the panel structure can be used to analyze questions that cannot be answered by simply viewing the data as a pooled cross section.
◮ Another advantage of panel data is that they often allow us to
study the importance of lags in behavior or the result of decision making.
◮ This information can be significant because many economic
policies can be expected to have an impact only after some time has passed.
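One way the panel structure helps with unobserved characteristics can be sketched with simulated data. The setup is hypothetical: each unit has a fixed unobserved trait a_i that affects both x and y, the true effect of x is set to 2, and there are two periods. Differencing the two periods for the same unit removes a_i. This is one common technique, shown here as an illustration rather than as the method of these slides.

```python
import random
import statistics

def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(3)
n_units = 3000
beta = 2.0  # hypothetical true effect of x on y

x1, y1, x2, y2 = [], [], [], []
for _ in range(n_units):
    a_i = rng.gauss(0.0, 1.0)  # unobserved, constant over time for this unit
    for xs, ys in ((x1, y1), (x2, y2)):
        x = a_i + rng.gauss(0.0, 1.0)  # x is correlated with a_i
        xs.append(x)
        ys.append(beta * x + 3.0 * a_i + rng.gauss(0.0, 1.0))

# Ignore the panel structure: treat both periods as one pooled cross section.
pooled_slope = slope(x1 + x2, y1 + y2)

# Use the panel structure: difference within each unit, which cancels a_i.
dx = [b - a for a, b in zip(x1, x2)]
dy = [b - a for a, b in zip(y1, y2)]
fd_slope = slope(dx, dy)

print(f"pooled cross section: {pooled_slope:.2f}")
print(f"first differences:    {fd_slope:.2f}")
```

The pooled estimate absorbs the effect of the unobserved trait, while the differenced estimate is close to the assumed value of 2. The differencing step requires that the same units appear in both periods, which is what distinguishes a panel from a pooled cross section.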
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Empirical Relations
◮ A portion of CPS
http://www.census.gov/programs-surveys/cps.html
◮ Is there any wage difference by gender, race, or union membership?
◮ In this case, y is wage, and the x variables are gender, race, and union membership. y is the dependent or explained variable and the x variables are the independent or explanatory variables.
◮ Typically, both y and x are assumed to be random variables
which means that the observations are supposed to be generated by a random experiment, in advance of which their values are unknown.
◮ This aims to capture the idea that the experiment might in
principle be performed repeatedly, in each case throwing up a new sample whose particular values are not predictable in advance.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
◮ First, look at differences in sample means of each group.
◮ On average, men receive $3.52522 per hour more than women.
◮ Whites receive $2.804217 per hour more than nonwhites.
◮ Union members receive $2.20683 per hour more than nonmembers.
◮ Can we conclude that being male causes earnings to be higher by $3.52522 per hour, and so on?
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
◮ There is systematic variation in the characteristics of each
group that confounds the difference in wages.
◮ 19.14% of men in the sample are members of unions but only
13% of women are.
◮ 86.27% of men are white while 83.15% of women are white.
◮ The average difference in wage between men and women
reflects union membership and racial differences as well as gender.
◮ We also expect that wages vary with education and
experience.
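The difference between a raw gap in group means and a gap among otherwise comparable workers can be shown on a toy data set. The six records below are invented for illustration and are not CPS observations. They are constructed so that men are more often union members, as in the sample described above.

```python
import statistics

# Hypothetical records: hourly wage, gender, and union membership.
records = [
    {"wage": 20, "male": True, "union": True},
    {"wage": 20, "male": True, "union": True},
    {"wage": 15, "male": True, "union": False},
    {"wage": 18, "male": False, "union": True},
    {"wage": 13, "male": False, "union": False},
    {"wage": 13, "male": False, "union": False},
]

def mean_gap(rows, group_key):
    """Mean wage where group_key is True minus mean wage where it is False."""
    inside = [r["wage"] for r in rows if r[group_key]]
    outside = [r["wage"] for r in rows if not r[group_key]]
    return statistics.fmean(inside) - statistics.fmean(outside)

# Raw comparison of men and women, ignoring union status.
raw_gap = mean_gap(records, "male")

# The same comparison within each union status (union status held fixed).
gap_union = mean_gap([r for r in records if r["union"]], "male")
gap_nonunion = mean_gap([r for r in records if not r["union"]], "male")

print(f"raw gap:              {raw_gap:.2f}")
print(f"gap among members:    {gap_union:.2f}")
print(f"gap among nonmembers: {gap_nonunion:.2f}")
```

In this toy sample the raw gap is larger than the gap within either union group, because part of the raw difference reflects the higher union membership of men rather than gender itself.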
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Figure: Scatter Plot of Hourly Wage against Education Level (x-axis: education in years; y-axis: hourly wage in dollars)
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Figure: Scatter plot of hourly wage in dollars against potential work experience in years (age − schooling − 6)
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Figure: Raw Data and Conditional Means (hourly wage and mean wage in dollars, by education in years)
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Figure: Raw Data and Conditional Means (hourly wage and mean wage in dollars, by potential work experience in years, age − schooling − 6)
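The conditional means shown alongside the raw data are simply the average wage at each value of the explanatory variable. A minimal sketch, using a few invented (education, wage) pairs rather than the CPS data behind the figures:

```python
import statistics
from collections import defaultdict

def conditional_means(pairs):
    """Map each x value to the mean of the y values observed at that x."""
    groups = defaultdict(list)
    for x, y in pairs:
        groups[x].append(y)
    return {x: statistics.fmean(ys) for x, ys in sorted(groups.items())}

# Hypothetical (years of education, hourly wage in dollars) observations.
sample = [(12, 10.0), (12, 14.0), (16, 20.0), (16, 24.0), (16, 22.0), (18, 30.0)]

means = conditional_means(sample)
for years, mean_wage in means.items():
    print(f"education = {years} years: mean wage = {mean_wage:.2f}")
```

Plotting these means against education traces out how the average wage varies with schooling, which is the pattern the "Mean Wage ($)" series displays.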
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .