
SLIDE 1

Labor Supply and the Two-Step Estimator

James J. Heckman
University of Chicago
Econ 312
This draft, April 7, 2006

SLIDE 2

In this lecture, we look at a labor supply model and discuss various approaches to identify the key parameters of the model, including the two-step estimator.

1 A one period model

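Consider a simple one-period model of the labor supply choice (with total time normalized to 1):

$$\max_{\{C,\,L\}} U(C, L) = \left(\frac{C^{\alpha} - 1}{\alpha}\right) + b\left(\frac{L^{\varphi} - 1}{\varphi}\right)$$

such that $C + WL \le W + A$, where $C$ is consumption, $L$ is leisure, $W$ is the wage rate, and $A$ is non-wage income.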

SLIDE 3

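The Euler equation is $W = \dfrac{bL^{\varphi-1}}{C^{\alpha-1}}$ and the reservation wage is given by:

$$W_R = \left[\frac{bL^{\varphi-1}}{C^{\alpha-1}}\right]_{L=1,\;C=A} = \frac{b}{A^{\alpha-1}} \quad\Longrightarrow\quad \ln W_R = \ln b + (1-\alpha)\ln A.$$

Assume $\ln b = X\beta + \varepsilon$, with $\varepsilon$ independent of the conditioning variables and $\varepsilon \sim N(0, \sigma_\varepsilon^2)$.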
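As a small numerical illustration of the reservation-wage rule $\ln W_R = \ln b + (1-\alpha)\ln A$, the participation decision can be sketched as follows. The function names and the parameter values are hypothetical, chosen only for the example; they are not from the lecture.

```python
import math

def log_reservation_wage(log_b, alpha, nonwage_income):
    """ln W_R = ln b + (1 - alpha) * ln A: the log wage at which L = 1 is just optimal."""
    return log_b + (1.0 - alpha) * math.log(nonwage_income)

def participates(log_wage, log_b, alpha, nonwage_income):
    """The agent works when the offered log wage is at least the log reservation wage."""
    return log_wage >= log_reservation_wage(log_b, alpha, nonwage_income)

# Illustrative values (assumptions): ln b = 0.2, alpha = 0.5, A = 4.
lw_r = log_reservation_wage(0.2, 0.5, 4.0)
works_at_high_wage = participates(1.0, 0.2, 0.5, 4.0)
works_at_low_wage = participates(0.5, 0.2, 0.5, 4.0)
print(round(lw_r, 4), works_at_high_wage, works_at_low_wage)
```

With $\alpha < 1$, a higher non-wage income $A$ raises the reservation wage and so makes participation less likely at a given offered wage.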

SLIDE 4

Now we will consider three cases:

1. Wages are observed for everyone, i.e. for those participating ($L < 1$) and also for those not participating ($L = 1$);

2. Wages are not known for everyone, but we know the functional form for wages; and

3. Wages are observed for workers only.

We discuss the commonly used methodologies to identify the key parameters of the model in each of these cases below.

SLIDE 5

2 Observe wages for everyone

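$$\Pr(\text{Person works} \mid X, A, W) = \Pr(\ln W \ge \ln W_R \mid X, A, W) = \Pr\left(\frac{\varepsilon}{\sigma_\varepsilon} \le \frac{\ln W - X\beta - (1-\alpha)\ln A}{\sigma_\varepsilon}\right) = \Phi(c)$$

where $c = \dfrac{\ln W - X\beta - (1-\alpha)\ln A}{\sigma_\varepsilon}$.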

SLIDE 6

2.1 Grouped data estimator

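Each cell has common values of $(W, X, A)$. For each cell, obtain:

$$\hat P(D = 1 \mid W, X, A) = \text{cell proportion working} = \Phi(\hat c)$$

Then calculate $\hat c = \Phi^{-1}(\hat P)$. Regress $\hat c$ on

$$\frac{\ln W - X\beta - (1-\alpha)\ln A}{\sigma_\varepsilon},$$

i.e. on $\ln W$, $X$ and $\ln A$, whose coefficients are $1/\sigma_\varepsilon$, $-\beta/\sigma_\varepsilon$ and $-(1-\alpha)/\sigma_\varepsilon$, and obtain estimates of $(\beta, \alpha, \sigma_\varepsilon)$.

Note that here, instead of the standard normal $\Phi(\cdot)$, one could use a standard logistic model ($\Lambda(c) = \dfrac{e^{c}}{1 + e^{c}}$) or a linear probability model: $P(c) = c$.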
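The grouped data estimator can be sketched on simulated cells. This is an illustration under stated assumptions, not the lecture's own code: $X$ is reduced to a constant (so $X\beta$ is a single intercept `BETA0`), the "true" parameter values, the grid of cells, and the helper names `solve` and `ols` are all hypothetical, and the regression is plain unweighted OLS.

```python
import random
from statistics import NormalDist

ND = NormalDist()  # standard normal: cdf is Phi, inv_cdf is Phi^{-1}

def solve(A, g):
    """Solve A b = g by Gaussian elimination with partial pivoting."""
    k = len(g)
    A, g = [row[:] for row in A], g[:]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        g[c], g[p] = g[p], g[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k):
                A[r][j] -= f * A[c][j]
            g[r] -= f * g[c]
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (g[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    g = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(A, g)

random.seed(1)
SIGMA, BETA0, ALPHA = 0.5, 0.2, 0.4   # hypothetical true values
N_CELL = 5000                         # observations per cell
grid = [0.0, 0.25, 0.5, 0.75, 1.0]    # cell values of ln W and ln A

rows, c_hat = [], []
for ln_w in grid:
    for ln_a in grid:
        c = (ln_w - BETA0 - (1 - ALPHA) * ln_a) / SIGMA
        p = ND.cdf(c)                                   # true Pr(work) in the cell
        p_hat = sum(random.random() < p for _ in range(N_CELL)) / N_CELL
        rows.append([ln_w, 1.0, ln_a])
        c_hat.append(ND.inv_cdf(p_hat))                 # c_hat = Phi^{-1}(P_hat)

b_w, b_1, b_a = ols(rows, c_hat)   # estimates of 1/sigma, -beta0/sigma, -(1-alpha)/sigma
sigma_hat = 1.0 / b_w
beta0_hat = -b_1 * sigma_hat
alpha_hat = 1.0 + b_a * sigma_hat
print(round(sigma_hat, 3), round(beta0_hat, 3), round(alpha_hat, 3))
```

Because wages are observed, the coefficient on $\ln W$ pins down $1/\sigma_\varepsilon$, which is what lets the other coefficients be rescaled into $\beta$ and $\alpha$.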

SLIDE 7

2.2 Justifying the grouped data estimator

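Suppose $D_i = 1$ if agent $i$ works, 0 if not. For each cell, get $\hat P(D = 1 \mid W, X, A)$. By WLLN and Slutsky:

$$\operatorname{plim}\,\Phi^{-1}\!\left(\hat P(D=1 \mid W, X, A)\right) = \Phi^{-1}\!\left(\operatorname{plim}\,\hat P(D=1 \mid W, X, A)\right) = \Phi^{-1}\!\left(P(D=1 \mid W, X, A)\right) = \frac{\ln W - X\beta - (1-\alpha)\ln A}{\sigma_\varepsilon}$$

Set up the regression function:

$$\Phi^{-1}(\hat P) = \Phi^{-1}(P) + \left[\Phi^{-1}(\hat P) - \Phi^{-1}(P)\right] = \frac{\ln W - X\beta - (1-\alpha)\ln A}{\sigma_\varepsilon} + V$$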

SLIDE 8

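We need to characterize $V = \Phi^{-1}(\hat P) - \Phi^{-1}(P)$. By the delta method, assuming $g$ is continuously differentiable, we get:

$$\sqrt{N}\left(g(\hat\theta) - g(\theta)\right) = \left.\frac{\partial g}{\partial \theta}\right|_{\theta^*} \sqrt{N}\,(\hat\theta - \theta)$$

where $\min(\hat\theta, \theta) \le \theta^* \le \max(\hat\theta, \theta)$. Now assume

$$\sqrt{N}\,(\hat\theta - \theta) \xrightarrow{d} N(0, \sigma_\theta^2)$$

Applying this to our regression function, we get (using $c = \Phi^{-1}(P)$)

$$\sqrt{N_j}\left(\Phi^{-1}(\hat P_j) - \Phi^{-1}(P_j)\right) = \frac{1}{\phi(c_j^*)}\,\sqrt{N_j}\,(\hat P_j - P_j), \qquad j = 1, \dots, J \text{ cells},$$

with $c_j^*$ between $\hat c_j$ and $c_j$.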

SLIDE 9

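Suppose $N_j \to \infty$ in each cell and errors are uncorrelated asymptotically. Then

$$V_j \overset{a}{\sim} N\left(0,\ \left[\frac{1}{\phi(c_j)}\right]^2 \frac{P_j(1-P_j)}{N_j}\right)$$

where $P_j(1-P_j)$ is the variance of the binary random variable $D$. We obtain the feasible GLS estimator by regressing:

$$\Phi^{-1}(\hat P_j)\,\frac{\phi(\hat c_j)\sqrt{N_j}}{\sqrt{\hat P_j(1-\hat P_j)}} \quad\text{on}\quad \left(\frac{\ln W_j - X_j\beta - (1-\alpha)\ln A_j}{\sigma_\varepsilon}\right)\frac{\phi(\hat c_j)\sqrt{N_j}}{\sqrt{\hat P_j(1-\hat P_j)}}$$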
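The delta-method variance of $\Phi^{-1}(\hat P)$ in a cell, $P(1-P)/\big(N\,\phi(c)^2\big)$ with $c = \Phi^{-1}(P)$, can be checked by simulation. The sketch below is only an illustration: the cell probability, cell size, and number of replications are arbitrary assumptions.

```python
import random
from statistics import NormalDist, pvariance

ND = NormalDist()   # standard normal: pdf is phi, inv_cdf is Phi^{-1}
random.seed(2)

P_TRUE, N_CELL, REPS = 0.7, 400, 4000   # hypothetical cell probability, cell size, replications
c = ND.inv_cdf(P_TRUE)

draws = []
for _ in range(REPS):
    # Cell proportion working, then its probit transform.
    p_hat = sum(random.random() < P_TRUE for _ in range(N_CELL)) / N_CELL
    draws.append(ND.inv_cdf(p_hat))

simulated_var = pvariance(draws)                                 # Monte Carlo variance of Phi^{-1}(P_hat)
delta_var = P_TRUE * (1 - P_TRUE) / (N_CELL * ND.pdf(c) ** 2)    # delta-method approximation
ratio = simulated_var / delta_var
print(round(simulated_var, 5), round(delta_var, 5), round(ratio, 3))
```

The inverse of this variance is the weight that the feasible GLS regression applies to each cell, so cells with few observations or with $P$ near 0 or 1 are down-weighted.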

SLIDE 10

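We can show that the estimates are asymptotically efficient and satisfy the orthogonality condition

$$\sum_{j=1}^{J} \operatorname{plim}\ \frac{1}{\operatorname{Var}(V_j)} \left(\frac{\ln W_j - X_j\beta - (1-\alpha)\ln A_j}{\sigma_\varepsilon}\right)\left(\Phi^{-1}(\hat P_j) - \Phi^{-1}(P_j)\right) = 0$$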

SLIDE 11

2.3 Microdata analogue

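$$\mathcal{L} = \prod_{D_i = 1} \Phi(c_i) \prod_{D_i = 0} \Phi(-c_i) = \prod_{i} \Phi\left(c_i\,[2D_i - 1]\right)$$

MLE gives consistent and asymptotically normal estimates of $(\beta, \alpha, \sigma_\varepsilon)$.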
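The two ways of writing the probit likelihood, as separate products over workers and non-workers or as the single product $\prod_i \Phi(c_i[2D_i - 1])$, can be checked to agree numerically. The index values and work indicators below are made up for the illustration, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

ND = NormalDist()  # standard normal; cdf is Phi

def loglik_two_products(c, d):
    """ln L = sum over D_i = 1 of ln Phi(c_i) + sum over D_i = 0 of ln Phi(-c_i)."""
    return sum(math.log(ND.cdf(ci)) if di == 1 else math.log(ND.cdf(-ci))
               for ci, di in zip(c, d))

def loglik_compact(c, d):
    """ln L = sum over i of ln Phi(c_i * (2 D_i - 1))."""
    return sum(math.log(ND.cdf(ci * (2 * di - 1))) for ci, di in zip(c, d))

c_values = [-1.2, -0.3, 0.0, 0.8, 1.5]   # hypothetical index values c_i
d_values = [0, 1, 0, 1, 1]               # hypothetical work indicators D_i

ll_a = loglik_two_products(c_values, d_values)
ll_b = loglik_compact(c_values, d_values)
print(ll_a, ll_b)
```

The compact form works because $2D_i - 1$ equals $+1$ for workers and $-1$ for non-workers, and $1 - \Phi(c) = \Phi(-c)$.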

SLIDE 12

3 Do not observe wages, but wages follow specific functional form

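Here we do not observe wages for anyone, but do know that the wages have the following functional form:

$$\ln W = Z\gamma + U, \qquad U \sim N(0, \sigma_U^2),$$

with $U$ independent of the regressors, and

$$(\varepsilon - U) \sim N\left(0,\ \sigma_\varepsilon^2 + \sigma_U^2 - 2\sigma_{\varepsilon U}\right) = N\left(0, (\sigma^*)^2\right)$$

where $(\sigma^*)^2 = \left(\sigma_\varepsilon^2 + \sigma_U^2 - 2\sigma_{\varepsilon U}\right)$.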

SLIDE 13

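Then:

$$\Pr(i \text{ works}) = \Pr\left(\frac{\varepsilon - U}{\sigma^*} \le \frac{Z\gamma - X\beta - (1-\alpha)\ln A}{\sigma^*}\right)$$

Note: if $(\varepsilon, U)$ are independent Extreme Value (Type I), then $(\varepsilon - U)$ is logistic.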
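The note on extreme value errors can be checked by simulation: the difference of two independent standard Type I extreme value draws should follow the standard logistic CDF $1/(1 + e^{-t})$. This is a minimal sketch; the helper name `gumbel_draw`, the sample size, and the evaluation point $t = 1$ are arbitrary choices for the illustration.

```python
import math
import random

def gumbel_draw(rng):
    """Standard Type I extreme value draw via the inverse CDF, -ln(-ln(u))."""
    u = rng.random()
    while u == 0.0:          # random() can return 0.0, and ln(0) is undefined
        u = rng.random()
    return -math.log(-math.log(u))

rng = random.Random(3)
n = 100_000
t = 1.0

# Empirical Pr(eps - U <= t) for independent extreme value eps and U.
hits = sum((gumbel_draw(rng) - gumbel_draw(rng)) <= t for _ in range(n))
empirical = hits / n
logistic = 1.0 / (1.0 + math.exp(-t))
print(round(empirical, 4), round(logistic, 4))
```

The result relies on the two errors being independent with a common scale; with different scales the difference is no longer logistic.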

SLIDE 14

3.1 Identification (2 cases)

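1. If $X$, $Z$ are distinct, then we can estimate $\left(\dfrac{\gamma}{\sigma^*},\ \dfrac{\beta}{\sigma^*},\ \dfrac{1-\alpha}{\sigma^*}\right)$, but cannot estimate $\sigma^*$.

2. If $X$, $Z$ have elements in common ($X = Z$), then we can estimate only $\left(\dfrac{\gamma - \beta}{\sigma^*},\ \dfrac{1-\alpha}{\sigma^*}\right)$ and again not $\sigma^*$.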

SLIDE 15

4 Observe wages for workers only

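Here we have:

$$\ln W = Z\gamma + U, \qquad \ln W_R = X\beta + (1-\alpha)\ln A + \varepsilon,$$

with $(U, \varepsilon)$ jointly normal and independent of $(X, Z, A)$, and

$$I = \ln W - \ln W_R = Z\gamma - X\beta - (1-\alpha)\ln A + (U - \varepsilon).$$

Agent works if $I \ge 0$. This is the Roy model with 2 sectors: market ($W$) and nonmarket ($W_R$).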

SLIDE 16

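Following the derivations in the lecture ‘Empirical Content of Roy Model,’ we get the expression for the expected wages in the market sector, conditional on participation and the $Z$ and $X$ variables, as:

$$E(\ln W \mid \ln W \ge \ln W_R, Z, X, A) = Z\gamma + \frac{\sigma_{U\eta}}{\sigma^*}\,\lambda\!\left(\frac{-Z\gamma + X\beta + (1-\alpha)\ln A}{\sigma^*}\right) = Z\gamma + \frac{\sigma_{U\eta}}{\sigma^*}\,\lambda(-c)$$

where $c = \dfrac{Z\gamma - X\beta - (1-\alpha)\ln A}{\sigma^*}$, $\eta = U - \varepsilon$, $\sigma_{U\eta} = \operatorname{Cov}(U, \eta) = \sigma_U^2 - \sigma_{U\varepsilon}$, $(\sigma^*)^2 = \operatorname{Var}(\eta)$, and $\lambda(a) = \dfrac{\phi(a)}{1 - \Phi(a)}$.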

SLIDE 17

4.1 2-step Estimator

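Step 1: Run probit on LFP (labor force participation) decision (as we did in section 2.3):

$$\left(\widehat{\frac{\gamma}{\sigma^*}},\ \widehat{\frac{\beta}{\sigma^*}},\ \widehat{\frac{1-\alpha}{\sigma^*}}\right) = \arg\max \sum_{i} \ln \Phi\left(c_i\,(2D_i - 1)\right)$$

Form:

$$\lambda(-\hat c_i) = \frac{\phi(-\hat c_i)}{1 - \Phi(-\hat c_i)} \qquad\text{where}\qquad \hat c_i = \frac{Z_i\hat\gamma - X_i\hat\beta - (1-\hat\alpha)\ln A_i}{\hat\sigma^*}$$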

SLIDE 18

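Step 2: Estimate $\left(\hat\gamma,\ \widehat{\sigma_{U\eta}/\sigma^*}\right)$, where $\sigma_{U\eta} = \operatorname{Cov}(U, U - \varepsilon)$, via OLS on

$$\ln W_i = Z_i\gamma + \frac{\sigma_{U\eta}}{\sigma^*}\,\lambda(-\hat c_i) + V_i$$

using sample of workers only (refer to expression for conditional expectation of market wages derived above).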
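The two steps can be sketched on simulated data. Everything specific here is an assumption made for the illustration rather than part of the lecture: the participation index is a reduced form $a_0 + a_1 z + a_2 x$ with its error variance normalized to 1, $x$ plays the role of a variable that shifts participation but not wages, the parameter values are invented, and the helpers `solve`, `ols`, and `probit` are hypothetical hand-written routines (probit by Fisher scoring). With index $c$, the selection term for workers is $\lambda(-c) = \phi(c)/\Phi(c)$.

```python
import random
from statistics import NormalDist

ND = NormalDist()  # standard normal: pdf is phi, cdf is Phi

def solve(A, g):
    """Solve A b = g by Gaussian elimination with partial pivoting."""
    k = len(g)
    A, g = [row[:] for row in A], g[:]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        g[c], g[p] = g[p], g[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k):
                A[r][j] -= f * A[c][j]
            g[r] -= f * g[c]
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (g[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    g = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(A, g)

def probit(X, d, iters=15):
    """Probit MLE by Fisher scoring, started from zero coefficients."""
    k = len(X[0])
    b = [0.0] * k
    for _ in range(iters):
        A = [[0.0] * k for _ in range(k)]
        g = [0.0] * k
        for row, di in zip(X, d):
            xb = sum(bj * xj for bj, xj in zip(b, row))
            pdf, p, q = ND.pdf(xb), ND.cdf(xb), ND.cdf(-xb)
            score = pdf * (di - p) / (p * q)     # derivative of ln L_i with respect to xb
            weight = pdf * pdf / (p * q)         # expected information per observation
            for i in range(k):
                g[i] += row[i] * score
                for j in range(k):
                    A[i][j] += weight * row[i] * row[j]
        b = [bi + si for bi, si in zip(b, solve(A, g))]
    return b

random.seed(4)
N, RHO, GAMMA0, GAMMA1 = 6000, 0.9, 1.0, 1.0     # hypothetical design
rows, works, ln_w = [], [], []
for _ in range(N):
    z, x, eta = random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)
    u = RHO * eta + (1 - RHO ** 2) ** 0.5 * random.gauss(0, 1)  # Var(u) = 1, Cov(u, eta) = RHO
    rows.append([1.0, z, x])
    works.append(1.0 if z + x + eta >= 0 else 0.0)              # participation rule
    ln_w.append(GAMMA0 + GAMMA1 * z + u)                        # used below for workers only

a_hat = probit(rows, works)                      # Step 1: probit on participation

X_two_step, X_naive, y = [], [], []
for row, d_i, y_i in zip(rows, works, ln_w):
    if d_i == 1.0:                               # Step 2 uses workers only
        idx = sum(a * r for a, r in zip(a_hat, row))
        lam = ND.pdf(idx) / ND.cdf(idx)          # lambda(-c) = phi(c) / Phi(c)
        X_two_step.append([1.0, row[1], lam])
        X_naive.append([1.0, row[1]])
        y.append(y_i)

two_step = ols(X_two_step, y)   # [gamma0, gamma1, coefficient on lambda]
naive = ols(X_naive, y)         # OLS on workers that ignores selection
print("two-step:", [round(v, 3) for v in two_step])
print("naive   :", [round(v, 3) for v in naive])
```

In this design the coefficient on the selection term estimates $\operatorname{Cov}(u, \eta)$ (0.9 here), and the naive regression on workers understates the wage slope because the omitted selection term is negatively related to $z$.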

SLIDE 19

4.2 Identification

4.3 With one exclusion restriction

With one exclusion restriction (one variable, call it $Z_1$, in $Z$ but not in $X$ or $\ln A$; let all other $Z$ be common with $X$), we can now identify everything: $(\gamma, \beta, \alpha, \sigma_U^2, \sigma_\varepsilon^2, \sigma_{U\varepsilon})$. We describe below how we recover all the relevant parameters:

SLIDE 20

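(i) Step 1 of 2-step gives $\dfrac{\gamma_1}{\sigma^*}$ as well as $\left(\dfrac{\gamma - \beta}{\sigma^*},\ \dfrac{1-\alpha}{\sigma^*}\right)$. Step 2 gives $\gamma_1$ as well as $\left(\gamma,\ \dfrac{\sigma_U^2 - \sigma_{U\varepsilon}}{\sigma^*}\right)$. Solve for $\sigma^*$. Use $\sigma^*$ to solve for $(\beta, \alpha)$.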

SLIDE 21

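(ii) Look at residuals from step 2:

$$\ln W_i = Z_i\gamma + \frac{\sigma_{U\eta}}{\sigma^*}\,\lambda_i + V_i, \qquad \lambda_i = \frac{\phi(\hat c_i)}{\Phi(\hat c_i)}, \quad \sigma_{U\eta} = \sigma_U^2 - \sigma_{U\varepsilon}.$$

Then following results in the earlier lecture, we have, with $\rho = \operatorname{Corr}(U, U - \varepsilon)$:

$$E(V_i^2 \mid \text{works}) = \sigma_U^2\left(\rho^2\left[1 - \lambda_i^2 - c_i\lambda_i\right] + \left[1 - \rho^2\right]\right) = \sigma_U^2 + \rho^2\sigma_U^2\left[-\lambda_i^2 - c_i\lambda_i\right]$$

Regression of $\hat V_i^2$ on $\left[-\hat\lambda_i^2 - \hat c_i\hat\lambda_i\right]$ (with an intercept) gives consistent estimates of $(\sigma_U^2,\ \rho^2\sigma_U^2)$. Solve for $\sigma_{U\varepsilon}$. Use $(\sigma^*)^2 = \sigma_U^2 + \sigma_\varepsilon^2 - 2\sigma_{U\varepsilon}$ to solve for $\sigma_\varepsilon^2$.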

SLIDE 22

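4.4 Without any exclusion restrictions on $Z$

Without any exclusion restrictions on $Z$, we can only identify $(\gamma, \sigma_U^2)$: We cannot uniquely identify $\sigma_\varepsilon^2$ or $\sigma_{U\varepsilon}$.

(i) Step 2 of 2-step gives $\left(\gamma,\ \dfrac{\sigma_U^2 - \sigma_{U\varepsilon}}{\sigma^*}\right)$.

(ii) Obtain $\sigma_U^2$ from residual regression as above.

(iii) To solve for $(\sigma_\varepsilon^2, \sigma_{U\varepsilon})$ either normalize $\sigma_\varepsilon^2 = 1$ or $\sigma_{U\varepsilon} = 0$. If we assume $\sigma_\varepsilon^2 = 1$, then from step 2 we obtain $\dfrac{\sigma_U^2 - \sigma_{U\varepsilon}}{\sqrt{1 + \sigma_U^2 - 2\sigma_{U\varepsilon}}}$, from which we can obtain $\sigma_{U\varepsilon}$. If we assume $\sigma_{U\varepsilon} = 0$, then from step 2 we get $\dfrac{\sigma_U^2}{\sqrt{\sigma_U^2 + \sigma_\varepsilon^2}}$ and can solve for $\sigma_\varepsilon^2$.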

SLIDE 23

5 Durbin’s problem (1970)

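(See also Newey (1984) and Newey and McFadden (1994), and refer to the handout on the Durbin problem.) In step 2 of the two-step estimation, writing for simplicity $\hat\lambda = \lambda(\hat\theta)$, where $\hat\theta$ collects the first-step estimates and $\delta$ is the coefficient on the selection term, we have the regression:

$$\ln W = Z\gamma + \delta\lambda(\hat\theta) + \delta\left[\lambda(\theta) - \lambda(\hat\theta)\right] + V$$

OLS gives consistent estimates of $(\gamma, \delta)$, but the variance of the OLS estimates equals the usual OLS variance matrix plus an additional term due to the $\left(\lambda(\theta) - \lambda(\hat\theta)\right)$ term, so we have heteroskedasticity and extra variability.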

SLIDE 24

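$$\lambda(\hat\theta) = \lambda(\theta) + \frac{\partial\lambda}{\partial\theta'}(\hat\theta - \theta) + o_p(\cdot)$$

$$\sqrt{N}\left(\lambda(\hat\theta) - \lambda(\theta)\right) = \frac{\partial\lambda}{\partial\theta'}\,\sqrt{N}\,(\hat\theta - \theta) + o_p(\cdot)$$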

μ
0¶
=
³

(ˆ ) ´

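Sampling distribution of the OLS coefficient is:

$$\begin{pmatrix}\hat\gamma \\ \hat\delta\end{pmatrix} = \begin{pmatrix}\gamma \\ \delta\end{pmatrix} + \begin{pmatrix} Z'Z & Z'\hat\lambda \\ \hat\lambda'Z & \hat\lambda'\hat\lambda \end{pmatrix}^{-1} \begin{pmatrix} Z' \\ \hat\lambda' \end{pmatrix}\left(\delta\,(\lambda - \hat\lambda) + V\right)$$

where $Z'Z = \sum_{i=1}^{N} Z_i'Z_i$, $Z'\hat\lambda = \sum_{i=1}^{N} Z_i'\hat\lambda_i$, etc.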

SLIDE 25

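Rearranging we get:

$$\sqrt{N}\left[\begin{pmatrix}\hat\gamma \\ \hat\delta\end{pmatrix} - \begin{pmatrix}\gamma \\ \delta\end{pmatrix}\right] = \begin{pmatrix} \dfrac{Z'Z}{N} & \dfrac{Z'\hat\lambda}{N} \\ \dfrac{\hat\lambda'Z}{N} & \dfrac{\hat\lambda'\hat\lambda}{N} \end{pmatrix}^{-1} \frac{1}{\sqrt{N}}\begin{pmatrix} Z' \\ \hat\lambda' \end{pmatrix}\left(\delta\,(\lambda - \hat\lambda) + V\right)$$

Taylor expanding around the true $\theta$ and taking probability limits of each element on the rhs gives: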

SLIDE 26

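$$\frac{Z'\hat\lambda}{N} = \frac{Z'\left[\lambda + \frac{\partial\lambda}{\partial\theta'}(\hat\theta - \theta)\right]}{N} = \frac{Z'\lambda}{N} + \frac{Z'\frac{\partial\lambda}{\partial\theta'}}{N}(\hat\theta - \theta)$$

$$\frac{\hat\lambda'\hat\lambda}{N} = \frac{\left[\lambda + \frac{\partial\lambda}{\partial\theta'}(\hat\theta - \theta)\right]'\left[\lambda + \frac{\partial\lambda}{\partial\theta'}(\hat\theta - \theta)\right]}{N} = \frac{\lambda'\lambda}{N} + 2\,\frac{\lambda'\frac{\partial\lambda}{\partial\theta'}}{N}(\hat\theta - \theta) + (\hat\theta - \theta)'\,\frac{\left(\frac{\partial\lambda}{\partial\theta'}\right)'\frac{\partial\lambda}{\partial\theta'}}{N}\,(\hat\theta - \theta)$$

(both to first order in the Taylor expansion of $\lambda(\hat\theta)$).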

SLIDE 27

0(ˆ )

=

0( (ˆ ))

=
μ
¶
(ˆ

)

1(0 )

= (0 210

1)

(0 2

2 )


SLIDE 28

ˆ

0[( ˆ

) + ]

=
+

(ˆ ) ¸0 (ˆ ) ¸

+
+

(ˆ ) ¸0


SLIDE 29

=

(ˆ

) ¸

(ˆ

) ¸0 (ˆ ) ¸

+ 0
+
(ˆ

) ¸0


SLIDE 30

=

¸
(ˆ

)

¸0

¸

h

(ˆ ) i0 h (ˆ ) i + 0

+
¸0
(ˆ

) 29

SLIDE 31

¸

(ˆ )

¸0

¸

h

(ˆ ) i0 h (ˆ ] i + 0


SLIDE 32

2(0 )

= (0 220

2)

+(0 2

2 )

where 2 = plim

μ

¶¸

assuming

plim

¸0
= 0

and 0

(0 2

2 ).


SLIDE 33

Putting this all together and assuming that the random components of the first and second step are independent (i.e. the sequence of estimates $\hat\theta$ is independent of $V$), we get:

μ ˆ
ˆ
¶
μ
¶¸
[0 ]

where = 2

1

+ 21

0 10 11

μ 0 ¶ 1 μ 1 2 ¶

Note that since ˆ

is estimated using , we have: = 2(; ) ¸1 32

SLIDE 34

In the non-independence case we have shown that:

μ ˆ
ˆ
¶
μ
¶¸
[0 ]

where: = 2

1

+ 21 11

1 ()0 1

11

1 ()0 2 21 1 ()0 1

¸ 1 with: [] 1() = 2(; ) ¸1 0 = plim1 [ 0] 1 = plim1

¸¸

2 = plim1

X

=1

(; )
