Labor Supply and the Two-Step Estimator
James J. Heckman
University of Chicago
Econ 312
This draft, April 7, 2006
In this lecture, we look at a labor supply model and discuss various approaches to identify the key parameters of the model, including the two-step estimator.
1 A one period model
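Consider a simple one-period model of the labor supply choice (with total time normalized to 1):
$$\max_{\{C,L\}} U(C,L) = \left(\frac{C^{\alpha}-1}{\alpha}\right) + b\left(\frac{L^{\varphi}-1}{\varphi}\right)$$
such that $C + WL \le W + A$, where $C$ is consumption, $L$ is leisure, $W$ is the wage rate, and $A$ is non-wage income.
The Euler equation is $W = \dfrac{b L^{\varphi-1}}{C^{\alpha-1}}$ and the reservation wage is given by:
$$W_R = \left[\frac{b L^{\varphi-1}}{C^{\alpha-1}}\right]_{L=1} = \frac{b}{A^{\alpha-1}} \quad\Longrightarrow\quad \ln W_R = \ln b + (1-\alpha)\ln A.$$
Assume $\ln b = X\beta + \varepsilon$, $\varepsilon \perp (X, A)$, $\varepsilon \sim N(0, \sigma_\varepsilon^2)$.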
Now we will consider three cases:
- 1. Wages are observed for everyone, i.e. for those participating ($L < 1$) and also for those not participating ($L = 1$);
- 2. Wages are not known for everyone, but we know the functional form for wages; and
- 3. Wages are observed for workers only.
We discuss the commonly used methodologies to identify the key parameters of the model in each of these cases below.
2 Observe wages for everyone
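$$\Pr(\text{person works} \mid W, X, A) = \Pr(\ln W \ge \ln W_R \mid W, X, A) = \Pr\left(\frac{\varepsilon}{\sigma_\varepsilon} \le \frac{\ln W - X\beta - (1-\alpha)\ln A}{\sigma_\varepsilon}\right) = \Phi(c),$$
where $c = \dfrac{\ln W - X\beta - (1-\alpha)\ln A}{\sigma_\varepsilon}$ and $\Phi$ is the standard normal c.d.f.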
2.1 Grouped data estimator
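Each cell $i$ has common values of $(W, X, A)$. For each cell, obtain:
$$\hat P_i(D = 1 \mid W_i, X_i, A_i) = \text{cell proportion working} = \Phi(\hat c_i).$$
Then calculate $\hat c_i = \Phi^{-1}(\hat P_i)$. Regress
$$\hat c_i \ \text{ on } \ \frac{\ln W_i - X_i\beta - (1-\alpha)\ln A_i}{\sigma_\varepsilon}$$
and obtain estimates of $(\beta, \alpha, \sigma_\varepsilon)$: the coefficient on $\ln W_i$ estimates $1/\sigma_\varepsilon$, which then recovers $\beta$ and $\alpha$ from the coefficients on $X_i$ and $\ln A_i$.
Note that here, instead of the standard normal ($\Phi$), one could use a standard logistic model ($\Lambda(c) = \dfrac{e^{c}}{1+e^{c}}$) or a linear probability model ($F(c) = c$).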
2.2 Justifying the grouped data estimator
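Suppose $D_i = 1$ if agent $i$ works, 0 if not. For each cell, get $\hat P(D = 1 \mid W, X, A)$. By the WLLN and Slutsky:
$$\operatorname{plim}\,\Phi^{-1}(\hat P(D=1\mid W,X,A)) = \Phi^{-1}(\operatorname{plim}\,\hat P(D=1\mid W,X,A)) = \Phi^{-1}(P(D=1\mid W,X,A)) = \frac{\ln W - X\beta - (1-\alpha)\ln A}{\sigma_\varepsilon}.$$
Set up the regression function:
$$\Phi^{-1}(\hat P) = \Phi^{-1}(P) + \left[\Phi^{-1}(\hat P) - \Phi^{-1}(P)\right] = \frac{\ln W - X\beta - (1-\alpha)\ln A}{\sigma_\varepsilon} + V.$$
We need to characterize $V = \Phi^{-1}(\hat P) - \Phi^{-1}(P)$. By the delta method, assuming $f$ is continuously differentiable, we get:
$$\sqrt{N}\,(f(\hat P) - f(P)) = \left.\frac{\partial f}{\partial P}\right|_{P^*}\sqrt{N}\,(\hat P - P),$$
where $\min(\hat P, P) \le P^* \le \max(\hat P, P)$. Now assume
$$\sqrt{N}\,(\hat P - P) \xrightarrow{d} N(0, \sigma_P^2).$$
Applying this to our regression function, we get (using $c = \Phi^{-1}(P)$)
$$\sqrt{N_i}\,\left(\Phi^{-1}(\hat P_i) - \Phi^{-1}(P_i)\right) = \frac{1}{\phi(c_i)}\sqrt{N_i}\,(\hat P_i - P_i) + o_p(1), \qquad i = 1, \dots, I \text{ cells}.$$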
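Suppose $\{N_i\} \to \infty$ and the errors are uncorrelated asymptotically. Then
$$V_i \overset{a}{\sim} N\left(0, \left[\frac{1}{\phi(c_i)}\right]^2 \frac{P_i(1-P_i)}{N_i}\right),$$
where $P_i(1-P_i)$ is the variance of the binary random variable $D_i$, so that $P_i(1-P_i)/N_i$ is the variance of the cell proportion $\hat P_i$. We obtain the feasible GLS estimator by regressing:
$$\Phi^{-1}(\hat P_i)\,\frac{\phi(\hat c_i)}{\sqrt{\hat P_i(1-\hat P_i)/N_i}} \ \text{ on } \ \left(\frac{\ln W_i - X_i\beta - (1-\alpha)\ln A_i}{\sigma_\varepsilon}\right)\frac{\phi(\hat c_i)}{\sqrt{\hat P_i(1-\hat P_i)/N_i}}.$$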
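We can show that the estimates are asymptotically efficient and satisfy the orthogonality condition
$$\sum_{i=1}^{I} \operatorname{plim}\,\frac{1}{\phi(c_i)}\left(\frac{\ln W_i - X_i\beta - (1-\alpha)\ln A_i}{\sigma_\varepsilon}\right)\left(\Phi^{-1}(\hat P_i) - \Phi^{-1}(P_i)\right) = 0.$$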
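The grouped-data estimator with feasible GLS weights can be sketched on simulated cells. This is only an illustration under stated assumptions: the parameter values, the cell design, the scalar $X$, and all variable names are hypothetical and not taken from the lecture.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical "true" parameters, chosen only for illustration.
beta, alpha, sigma_eps = 0.5, 0.4, 0.8

# Each cell shares one value of (ln W, X, ln A); X is a scalar here.
n_cells, n_per_cell = 200, 2000
lnW = rng.uniform(0.0, 2.0, n_cells)
X = rng.uniform(0.0, 1.0, n_cells)
lnA = rng.uniform(0.0, 1.0, n_cells)

# True index c and simulated participation decisions within each cell.
c = (lnW - X * beta - (1 - alpha) * lnA) / sigma_eps
works = rng.random((n_cells, n_per_cell)) < norm.cdf(c)[:, None]

# Cell proportions, kept strictly inside (0, 1) so the inverse c.d.f. is finite.
p_hat = np.clip(works.mean(axis=1), 0.5 / n_per_cell, 1 - 0.5 / n_per_cell)
c_hat = norm.ppf(p_hat)

# Feasible GLS weights: phi(c_hat) / sqrt(p_hat (1 - p_hat) / N_i).
w = norm.pdf(c_hat) / np.sqrt(p_hat * (1 - p_hat) / n_per_cell)

# Weighted regression of c_hat on (ln W, X, ln A); the index has no intercept.
R = np.column_stack([lnW, X, lnA])
coef, *_ = np.linalg.lstsq(R * w[:, None], c_hat * w, rcond=None)

# Coefficients are (1/sigma, -beta/sigma, -(1-alpha)/sigma).
sigma_hat = 1.0 / coef[0]
beta_hat = -coef[1] * sigma_hat
alpha_hat = 1.0 + coef[2] * sigma_hat
print(sigma_hat, beta_hat, alpha_hat)
```

Because $\ln W$ enters the index with coefficient $1/\sigma_\varepsilon$, the scale is recovered from that coefficient, which is what makes $\beta$ and $\alpha$ separately recoverable when wages are observed for everyone.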
2.3 Microdata analogue
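$$\mathcal{L} = \prod_{i:\,D_i=1} \Phi(c_i) \prod_{i:\,D_i=0} \Phi(-c_i) = \prod_{i=1}^{N} \Phi\left([2D_i - 1]\,c_i\right)$$
MLE gives consistent and asymptotically normal estimates of $(\beta, \alpha, \sigma_\varepsilon)$.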
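A minimal sketch of the microdata probit, maximizing $\sum_i \ln \Phi([2D_i - 1]c_i)$ on simulated data. The data-generating values, the scalar $X$, and the function names `neg_loglik` and `neg_score` are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)

# Hypothetical true parameters (illustrative only).
beta, alpha, sigma_eps = 0.5, 0.4, 0.8
n = 40000

lnW = rng.uniform(0.0, 2.0, n)
X = rng.uniform(0.0, 1.0, n)
lnA = rng.uniform(0.0, 1.0, n)
eps = rng.normal(0.0, sigma_eps, n)

# D = 1 if the market wage is at least the reservation wage.
D = (lnW >= X * beta + (1 - alpha) * lnA + eps).astype(float)
q = 2.0 * D - 1.0

# Index c = R @ a with a = (1/sigma, -beta/sigma, -(1-alpha)/sigma).
R = np.column_stack([lnW, X, lnA])

def neg_loglik(a):
    # Minus the average of ln Phi([2D - 1] c).
    return -np.mean(norm.logcdf(q * (R @ a)))

def neg_score(a):
    # Analytic gradient of neg_loglik.
    z = q * (R @ a)
    ratio = np.exp(norm.logpdf(z) - norm.logcdf(z))  # phi(z) / Phi(z)
    return -(R * (q * ratio)[:, None]).mean(axis=0)

res = minimize(neg_loglik, x0=np.array([1.0, 0.0, 0.0]), jac=neg_score, method="BFGS")
a_hat = res.x

sigma_hat = 1.0 / a_hat[0]
beta_hat = -a_hat[1] * sigma_hat
alpha_hat = 1.0 + a_hat[2] * sigma_hat
print(sigma_hat, beta_hat, alpha_hat)
```

Working with `norm.logcdf` rather than `np.log(norm.cdf(...))` is a numerical-stability choice for observations far in the tails; it does not change the estimator.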
3 Do not observe wages, but wages follow specific functional form
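Here we do not observe wages for anyone, but do know that wages have the following functional form: $\ln W = Z\gamma + U$, where
$$U \sim N(0, \sigma_U^2), \qquad U \perp (X, Z, A),$$
and
$$(U - \varepsilon) \sim N(0, \sigma_U^2 + \sigma_\varepsilon^2 - 2\sigma_{U\varepsilon}) = N(0, (\sigma^*)^2),$$
where $(\sigma^*)^2 = \sigma_U^2 + \sigma_\varepsilon^2 - 2\sigma_{U\varepsilon}$.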
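Then:
$$\Pr(i \text{ works}) = \Phi\left(\frac{Z\gamma - X\beta - (1-\alpha)\ln A}{\sigma^*}\right).$$
Note: if $(U, \varepsilon)$ are independent Extreme Value (Type I), then $(U - \varepsilon)$ is logistic.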
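The extreme-value remark can be checked numerically. The sketch below assumes two independent standard Gumbel (Type I extreme value) draws and compares the empirical c.d.f. of their difference with the standard logistic c.d.f.; sample size and grid are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Two independent standard Type I extreme value (Gumbel) draws.
u = rng.gumbel(size=n)
e = rng.gumbel(size=n)
diff = np.sort(u - e)

# Empirical c.d.f. of the difference on a grid versus the logistic c.d.f.
grid = np.linspace(-4.0, 4.0, 81)
ecdf = np.searchsorted(diff, grid, side="right") / n
logistic_cdf = 1.0 / (1.0 + np.exp(-grid))

max_gap = np.max(np.abs(ecdf - logistic_cdf))
print(max_gap)
```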
3.1 Identification (2 cases)
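- 1. If $X$, $Z$ are distinct, then we can estimate $\left(\dfrac{\gamma}{\sigma^*}, \dfrac{\beta}{\sigma^*}, \dfrac{1-\alpha}{\sigma^*}\right)$, but can't estimate $\sigma^*$.
- 2. If $X$, $Z$ have elements in common ($X_j = Z_j$), then we can estimate only $\left(\dfrac{\gamma_j - \beta_j}{\sigma^*}, \dfrac{1-\alpha}{\sigma^*}\right)$ and again not $\sigma^*$.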
4 Observe wages for workers only
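Here we have:
$$\ln W = Z\gamma + U$$
$$\ln W_R = X\beta + (1-\alpha)\ln A + \varepsilon$$
$$(U, \varepsilon) \perp (X, Z, A)$$
$$I = \ln W - \ln W_R = Z\gamma - X\beta - (1-\alpha)\ln A + U - \varepsilon$$
Agent works if $I \ge 0$. This is the Roy model with 2 sectors: market ($W$) and nonmarket ($W_R$).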
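Following the derivations in the lecture ‘Empirical Content of Roy Model,’ we get the expression for the expected wages in the market sector, conditional on participation and the $X$ and $Z$ variables as:
$$E(\ln W \mid \ln W \ge \ln W_R, X, Z, A) = Z\gamma + \frac{\sigma_{U\eta}}{\sigma^*}\,\lambda\left(\frac{-Z\gamma + X\beta + (1-\alpha)\ln A}{\sigma^*}\right) = Z\gamma + \frac{\sigma_{U\eta}}{\sigma^*}\,\lambda(-c),$$
where $c = \dfrac{Z\gamma - X\beta - (1-\alpha)\ln A}{\sigma^*}$, $\eta = U - \varepsilon$, $\sigma_{U\eta} = \operatorname{Cov}(U, \eta)$, and $\lambda(x) = \dfrac{\phi(x)}{1 - \Phi(x)}$.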
4.1 2-step Estimator
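- Step 1: Run probit on the LFP (labor force participation) decision (as we did in section 2.3):
$$\left(\widehat{\frac{\gamma}{\sigma^*}}, \widehat{\frac{\beta}{\sigma^*}}, \widehat{\frac{1-\alpha}{\sigma^*}}\right) = \arg\max \sum_{i=1}^{N} \ln \Phi\left[(2D_i - 1)\,c_i\right]$$
Form:
$$\lambda(-\hat c_i) = \frac{\phi(-\hat c_i)}{1 - \Phi(-\hat c_i)}, \quad \text{where } \hat c_i = \frac{Z_i\hat\gamma - X_i\hat\beta - (1-\hat\alpha)\ln A_i}{\hat\sigma^*}.$$
- Step 2: Estimate $\left(\hat\gamma, \widehat{\sigma_{U\eta}/\sigma^*}\right)$, with $\sigma_{U\eta} = \operatorname{Cov}(U, U - \varepsilon)$, via OLS on
$$\ln W_i = Z_i\gamma + \frac{\sigma_{U\eta}}{\sigma^*}\,\lambda(-\hat c_i) + V_i$$
using the sample of workers only (refer to the expression for the conditional expectation of market wages derived above).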
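The two steps can be sketched on simulated data as follows. This is a minimal illustration, not the lecture's own implementation: the design (one excluded wage shifter `Z1`, non-wage income `lnA`, independent normal errors, equal intercepts in the two equations) and every parameter value are assumptions chosen for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(3)
n = 50000

# Hypothetical design: ln W = g0 + g1*Z1 + U, ln W_R = b0 + (1-alpha)*lnA + eps.
gamma0, gamma1 = 1.0, 0.5
beta0, one_minus_alpha = 1.0, 0.6
sigma_U, sigma_eps = 0.5, 0.5  # U and eps drawn independently: sigma_{U eps} = 0

Z1 = rng.normal(size=n)
lnA = rng.normal(size=n)
U = rng.normal(0.0, sigma_U, n)
eps = rng.normal(0.0, sigma_eps, n)

lnW = gamma0 + gamma1 * Z1 + U
lnWR = beta0 + one_minus_alpha * lnA + eps
D = lnW >= lnWR  # participation; wages are used only where D is True

# Step 1: probit of D on (1, Z1, lnA) by maximum likelihood.
R = np.column_stack([np.ones(n), Z1, lnA])
q = 2.0 * D - 1.0

def neg_loglik(a):
    return -np.mean(norm.logcdf(q * (R @ a)))

def neg_score(a):
    z = q * (R @ a)
    ratio = np.exp(norm.logpdf(z) - norm.logcdf(z))
    return -(R * (q * ratio)[:, None]).mean(axis=0)

a_hat = minimize(neg_loglik, np.zeros(3), jac=neg_score, method="BFGS").x
c_hat = R @ a_hat

# Selection term lambda(-c) = phi(c) / Phi(c) for workers.
mills = np.exp(norm.logpdf(c_hat) - norm.logcdf(c_hat))

# Step 2: OLS of ln W on (1, Z1, selection term), workers only.
Xw = np.column_stack([np.ones(D.sum()), Z1[D], mills[D]])
theta_hat, *_ = np.linalg.lstsq(Xw, lnW[D], rcond=None)

# For comparison: OLS on workers that ignores selection.
naive, *_ = np.linalg.lstsq(Xw[:, :2], lnW[D], rcond=None)

C_true = sigma_U**2 / np.sqrt(sigma_U**2 + sigma_eps**2)
print(theta_hat, naive, C_true)
```

In this design the coefficient on the selection term estimates $\sigma_{U\eta}/\sigma^*$, which equals $\sigma_U^2/\sqrt{\sigma_U^2 + \sigma_\varepsilon^2}$ because the two errors are drawn independently. The OLS standard errors from step 2 are not valid as they stand, since the selection term is itself estimated.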
4.2 Identification
4.3 With one exclusion restriction
With one exclusion restriction (one variable, call it $Z_1$, in $Z$ but not in $X$ or $\ln A$; let all other $Z$ be common with $X$), we can now identify everything: $(\gamma, \beta, \alpha, \sigma_U^2, \sigma_\varepsilon^2, \sigma_{U\varepsilon})$. We describe below how we recover all the relevant parameters:
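(i) Step 1 of the 2-step gives $\dfrac{\gamma_1}{\sigma^*}$ as well as $\left(\dfrac{\gamma_j - \beta_j}{\sigma^*}, \dfrac{1-\alpha}{\sigma^*}\right)$. Step 2 gives $\gamma_1$ as well as $\left(\gamma, \dfrac{\sigma_{U\eta}}{\sigma^*}\right)$, where $\sigma_{U\eta} = \operatorname{Cov}(U, U - \varepsilon)$. Solve for $\sigma^*$. Use $\sigma^*$ to solve for $(\beta, \alpha)$.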
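(ii) Look at residuals from step 2:
$$\ln W = Z\gamma + \frac{\sigma_{U\eta}}{\sigma^*}\,\lambda(-\hat c) + V$$
Then following results in the earlier lecture, we have, for workers:
$$E(V^2 \mid \ln W \ge \ln W_R, X, Z, A) = \sigma_U^2\left(\rho^2\left[1 - \lambda^2(-c) - c\,\lambda(-c)\right] + \left[1 - \rho^2\right]\right) = \sigma_U^2 - \frac{\sigma_{U\eta}^2}{(\sigma^*)^2}\left[\lambda^2(-c) + c\,\lambda(-c)\right],$$
where $\rho = \sigma_{U\eta}/(\sigma_U\sigma^*)$ and $\sigma_{U\eta} = \operatorname{Cov}(U, U - \varepsilon) = \sigma_U^2 - \sigma_{U\varepsilon}$. Regression of $\hat V^2$ on $\left[\hat\lambda^2 + \hat c\,\hat\lambda\right]$ gives consistent estimates of $\left(\sigma_U^2, \sigma_{U\eta}^2/(\sigma^*)^2\right)$ from the intercept and (minus) the slope. Solve for $\sigma_{U\varepsilon}$. Use $(\sigma^*)^2 = \sigma_U^2 + \sigma_\varepsilon^2 - 2\sigma_{U\varepsilon}$ to solve for $\sigma_\varepsilon^2$.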
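4.4 Without any exclusion restrictions on $Z$
Without any exclusion restrictions on $Z$, we can only identify $\left(\gamma, \sigma_U^2, \sigma_{U\eta}/\sigma^*\right)$, where $\sigma_{U\eta} = \sigma_U^2 - \sigma_{U\varepsilon}$: we cannot uniquely identify $\sigma_\varepsilon^2$ or $\sigma_{U\varepsilon}$.
(i) Step 2 of the 2-step gives $\left(\gamma, \dfrac{\sigma_{U\eta}}{\sigma^*}\right)$.
(ii) Obtain $\left(\sigma_U^2, \sigma_{U\eta}^2/(\sigma^*)^2\right)$ from the residual regression as above.
(iii) To solve for $(\sigma_\varepsilon^2, \sigma_{U\varepsilon})$ either normalize $\sigma_\varepsilon^2 = 1$ or $\sigma_{U\varepsilon} = 0$. If we assume $\sigma_\varepsilon^2 = 1$, then from step 2 we obtain $\dfrac{\sigma_U^2 - \sigma_{U\varepsilon}}{\sqrt{1 + \sigma_U^2 - 2\sigma_{U\varepsilon}}}$, from which we can obtain $\sigma_{U\varepsilon}$. If we assume $\sigma_{U\varepsilon} = 0$, then from step 2 we get $\dfrac{\sigma_U^2}{\sqrt{\sigma_U^2 + \sigma_\varepsilon^2}}$ and can solve for $\sigma_\varepsilon^2$.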
5 Durbin’s problem (1970)
(See also Newey (1984) and Newey and McFadden (1994), and also refer to the handout on the Durbin problem.) In step 2 of the two-step estimation, setting for simplicity $\hat c_i = X_i\hat\beta$, we have the regression:
$$\ln W_i = Z_i\gamma + C\,\lambda(\hat c_i) + C\left[\lambda(c_i) - \lambda(\hat c_i)\right] + V_i,$$
where $C$ is the coefficient on the selection term. OLS gives consistent estimates of $(\gamma, C)$, but the variance of the OLS estimates equals the usual OLS variance matrix plus an additional term due to the $C[\lambda(c_i) - \lambda(\hat c_i)]$ term, so we have heteroskedasticity and extra variability.
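$$\lambda(\hat c_i) = \lambda(c_i) + \frac{\partial\lambda(c_i)}{\partial\beta'}(\hat\beta - \beta) + o_p(\cdot)$$
$$\sqrt{N}\left[\lambda(\hat c_i) - \lambda(c_i)\right] = \frac{\partial\lambda(c_i)}{\partial\beta'}\sqrt{N}(\hat\beta - \beta) + o_p(\cdot)$$
Sampling distribution of the OLS coefficient is:
$$\begin{pmatrix}\hat\gamma \\ \hat C\end{pmatrix} = \begin{pmatrix}\gamma \\ C\end{pmatrix} + \begin{pmatrix} Z'Z & Z'\hat\lambda \\ \hat\lambda'Z & \hat\lambda'\hat\lambda \end{pmatrix}^{-1} \begin{pmatrix} Z' \\ \hat\lambda' \end{pmatrix} \left(C(\lambda - \hat\lambda) + V\right),$$
where $Z'Z = \sum_{i=1}^{N} Z_i'Z_i$, $Z'\hat\lambda = \sum_{i=1}^{N} Z_i'\hat\lambda_i$, etc.
Rearranging we get:
$$\sqrt{N}\left[\begin{pmatrix}\hat\gamma \\ \hat C\end{pmatrix} - \begin{pmatrix}\gamma \\ C\end{pmatrix}\right] = \begin{pmatrix} \dfrac{Z'Z}{N} & \dfrac{Z'\hat\lambda}{N} \\ \dfrac{\hat\lambda'Z}{N} & \dfrac{\hat\lambda'\hat\lambda}{N} \end{pmatrix}^{-1} \frac{1}{\sqrt{N}} \begin{pmatrix} Z' \\ \hat\lambda' \end{pmatrix} \left(C(\lambda - \hat\lambda) + V\right)$$
Taylor expanding around the true $\beta$ and taking probability limits of each element on the rhs gives: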
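Write $\lambda_\beta \equiv \partial\lambda/\partial\beta'$ for the $N \times p$ matrix of derivatives evaluated at the true $\beta$. Since $\hat\beta \xrightarrow{p} \beta$,
$$\frac{Z'\hat\lambda}{N} = \frac{Z'\left[\lambda + \lambda_\beta(\hat\beta - \beta)\right]}{N} + o_p(1) \xrightarrow{p} \operatorname{plim}\frac{Z'\lambda}{N},$$
$$\frac{\hat\lambda'\hat\lambda}{N} = \frac{\left[\lambda + \lambda_\beta(\hat\beta - \beta)\right]'\left[\lambda + \lambda_\beta(\hat\beta - \beta)\right]}{N} + o_p(1) = \frac{\lambda'\lambda}{N} + 2\,\frac{\lambda'\lambda_\beta(\hat\beta - \beta)}{N} + \frac{(\hat\beta - \beta)'\lambda_\beta'\lambda_\beta(\hat\beta - \beta)}{N} + o_p(1) \xrightarrow{p} \operatorname{plim}\frac{\lambda'\lambda}{N}.$$
For the first block of the right-hand-side vector, with $\sqrt{N}(\hat\beta - \beta) \xrightarrow{d} N(0, \Sigma_\beta)$ and $Q_{1Z} = \operatorname{plim} Z'\lambda_\beta/N$,
$$\frac{Z'C(\lambda - \hat\lambda)}{\sqrt{N}} = -C\left(\frac{Z'\lambda_\beta}{N}\right)\sqrt{N}(\hat\beta - \beta) + o_p(1) \xrightarrow{d} N\left(0, C^2 Q_{1Z}\Sigma_\beta Q_{1Z}'\right), \qquad \frac{Z'V}{\sqrt{N}} \xrightarrow{d} N\left(0, \sigma_V^2 \operatorname{plim}\frac{Z'Z}{N}\right).$$
For the second block,
$$\frac{\hat\lambda'\left[C(\lambda - \hat\lambda) + V\right]}{\sqrt{N}} = -C\,\frac{\left[\lambda + \lambda_\beta(\hat\beta - \beta)\right]'\lambda_\beta(\hat\beta - \beta)}{\sqrt{N}} + \frac{\left[\lambda + \lambda_\beta(\hat\beta - \beta)\right]'V}{\sqrt{N}} + o_p(1)$$
$$= -C\left(\frac{\lambda'\lambda_\beta}{N}\right)\sqrt{N}(\hat\beta - \beta) + \frac{\lambda'V}{\sqrt{N}} + o_p(1),$$
assuming
$$\operatorname{plim}\frac{\lambda_\beta'V}{N} = 0 \quad \text{and} \quad \frac{\lambda'V}{\sqrt{N}} \xrightarrow{d} N\left(0, \sigma_V^2 \operatorname{plim}\frac{\lambda'\lambda}{N}\right).$$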
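Putting this all together and assuming that the random components of the first and second step are independent (i.e. the sequence of estimates $\hat\beta$ is independent of $V$), we get:
$$\sqrt{N}\left[\begin{pmatrix}\hat\gamma \\ \hat C\end{pmatrix} - \begin{pmatrix}\gamma \\ C\end{pmatrix}\right] \xrightarrow{d} N[0, \Omega]$$
where
$$\Omega = \sigma_V^2 Q_0^{-1} + C^2 Q_0^{-1} Q_1 \Sigma_\beta Q_1' Q_0^{-1},$$
with $x = [Z, \lambda]$, $Q_0 = \operatorname{plim}\frac{1}{N}x'x$, $Q_1 = \operatorname{plim}\frac{1}{N}x'\frac{\partial\lambda}{\partial\beta'}$, and $\Sigma_\beta$ the asymptotic variance of $\sqrt{N}(\hat\beta - \beta)$.
Note that since $\hat\beta$ is estimated using MLE, we have:
$$\Sigma_\beta = \left[-\operatorname{plim}\frac{1}{N}\frac{\partial^2 \ln \mathcal{L}(\beta; X)}{\partial\beta\,\partial\beta'}\right]^{-1}$$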
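As a sketch of how the independent-case variance would be assembled from sample moments, the function below computes the usual OLS term $\sigma_V^2 Q_0^{-1}$ and the extra term $C^2 Q_0^{-1} Q_1 \Sigma_\beta Q_1' Q_0^{-1}$ that arises because the selection regressor is estimated. The function name `two_step_cov`, its arguments, and the toy inputs are hypothetical, and the second-step error is treated as homoskedastic with variance `sigma2_V` for simplicity.

```python
import numpy as np

def two_step_cov(x, dlam_dbeta, Sigma_beta, C, sigma2_V):
    """Return (usual, extra, Omega) for the independent case.

    x           : (n, k) second-step regressors [Z, lambda]
    dlam_dbeta  : (n, p) derivatives of lambda_i with respect to beta'
    Sigma_beta  : (p, p) asymptotic variance of sqrt(n) (beta_hat - beta)
    C, sigma2_V : coefficient on lambda and second-step error variance
    """
    n = x.shape[0]
    Q0 = x.T @ x / n
    Q1 = x.T @ dlam_dbeta / n
    Q0_inv = np.linalg.inv(Q0)
    usual = sigma2_V * Q0_inv
    extra = C**2 * (Q0_inv @ Q1 @ Sigma_beta @ Q1.T @ Q0_inv)
    return usual, extra, usual + extra

# Toy inputs, purely illustrative.
rng = np.random.default_rng(4)
n, k, p = 500, 3, 2
x = rng.normal(size=(n, k))
dlam = rng.normal(size=(n, p))
A = rng.normal(size=(p, p))
Sigma_beta = A @ A.T + np.eye(p)  # any positive definite matrix

usual, extra, Omega = two_step_cov(x, dlam, Sigma_beta, C=0.35, sigma2_V=0.2)
print(np.diag(usual), np.diag(Omega))
```

Because the extra term is a quadratic form in $\Sigma_\beta$, it is positive semidefinite, so in the independent case the corrected variances are never smaller than the uncorrected OLS ones.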
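In the non-independence case we have shown that:
$$\sqrt{N}\left[\begin{pmatrix}\hat\gamma \\ \hat C\end{pmatrix} - \begin{pmatrix}\gamma \\ C\end{pmatrix}\right] \xrightarrow{d} N[0, \Omega]$$
where:
$$\Omega = \sigma_V^2 Q_0^{-1} + Q_0^{-1}\left[C^2 Q_1 \Sigma_\beta Q_1' - C\,Q_1 \Sigma_\beta Q_2' - C\,Q_2 \Sigma_\beta Q_1'\right] Q_0^{-1}$$
with, writing $x = [Z, \lambda]$:
$$\Sigma_\beta = \left[-\operatorname{plim}\frac{1}{N}\frac{\partial^2 \ln \mathcal{L}(\beta; X)}{\partial\beta\,\partial\beta'}\right]^{-1}$$
$$Q_0 = \operatorname{plim}\frac{1}{N}\left[x'x\right]$$
$$Q_1 = \operatorname{plim}\frac{1}{N}\left[x'\frac{\partial\lambda}{\partial\beta'}\right]$$
$$Q_2 = \operatorname{plim}\frac{1}{N}\sum_{i=1}^{N} x_i' V_i \frac{\partial \ln \mathcal{L}_i(\beta; X_i)}{\partial\beta'}$$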