
SLIDE 1

Main: Defining LATE Example of Normal Model Nonparametric Identification of the Roy Model

LATE and the Generalized Roy Model: Some Relationships

James J. Heckman
University of Chicago
Extract from: Building Bridges Between Structural and Program Evaluation Approaches to Evaluating Policy, James J. Heckman (JEL 2010)
Econ 312, Spring 2019

Heckman LATE and the Roy Model

SLIDE 2


1 Defining LATE
   LATE
   Identifying Policy Parameters
2 Example of Normal Model
3 Nonparametric Identification of the Roy Model

SLIDE 3

Defining LATE

SLIDE 4

Question:* Derive the MTE from the sample selection model.

What parameters are identified by the selection model that are not identified by MTE? Explain the advantages and disadvantages of each approach.

*Answer after reading these slides.

SLIDE 5

LATE

LATE is defined by the variation of an instrument.
The instrument in LATE plays the role of a randomized assignment.
Randomized assignment is an instrument.
Y0 and Y1 are potential ex-post outcomes.
Instrument Z assumes values in the set $\mathcal{Z}$, z ∈ $\mathcal{Z}$.

SLIDE 6

D(z): indicator of hypothetical choice representing what choice the individual would have made had the individual's Z been exogenously set to z.
D(z) = 1 if the person chooses (is assigned to) 1.
D(z) = 0, otherwise.
One can think of the values of z as fixed by an experiment or by some other mechanism independent of (Y0, Y1).

All policies are assumed to operate through their effects on Z.
It is assumed that Z can be varied conditional on X.

SLIDE 7

Three assumptions define LATE.

IA Assumption 1

$$(Y_0, Y_1, \{D(z)\}_{z \in \mathcal{Z}}) \perp\!\!\!\perp Z \mid X$$

IA Assumption 2

Pr(D = 1 | Z = z) is a nontrivial function of z conditional on X.

SLIDE 8

IA Assumption 3

For any two values of Z, say Z = z1 and Z = z2, either D(z1) ≥ D(z2) for all persons, or D(z1) ≤ D(z2) for all persons.

This condition is a statement across people.
This condition does not require that, for any other two values of Z, say z3 and z4, the inequalities on D(z3) and D(z4) be ordered in the same direction as they are for D(z1) and D(z2).
It only requires that the direction of the inequalities is the same across people.
Thus for any person, D(z) need not be monotonic in z.

SLIDE 9

Under LATE conditions, for two distinct values of Z, z1 and z2, IV applied to these two values identifies

LATE(z2, z1) = E(Y1 − Y0 | D(z2) = 1, D(z1) = 0),

if the change from z1 to z2 induces people into the program (D(z2) ≥ D(z1)).

This is the mean return to participation in the program for people induced to switch treatment status by the change from z1 to z2.
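
A small simulation can make this claim concrete. The sketch below is illustrative only: it assumes a threshold-crossing choice rule D(z) = 1(µD(z) ≥ V) with hypothetical thresholds `MU_D`, a randomly assigned binary instrument, and an invented gain Y1 − Y0 = 1 − 0.5V. It compares the IV (Wald) ratio with the mean gain among people who switch from D = 0 to D = 1 when Z moves from z1 to z2.

```python
import random

random.seed(0)
N = 200_000

# Hypothetical thresholds mu_D(z1) < mu_D(z2); V is the latent resistance to treatment.
MU_D = {1: -0.5, 2: 0.5}

sum_y = {1: 0.0, 2: 0.0}   # running sums of observed Y by instrument value
sum_d = {1: 0, 2: 0}       # running counts of D = 1 by instrument value
count = {1: 0, 2: 0}
switcher_gain, switcher_n = 0.0, 0

for _ in range(N):
    v = random.gauss(0.0, 1.0)
    y0 = random.gauss(0.0, 1.0)
    y1 = y0 + 1.0 - 0.5 * v        # assumed design: the gain Y1 - Y0 falls with V
    z = random.choice((1, 2))      # instrument drawn independently of (Y0, Y1, V)
    d_z1 = MU_D[1] >= v            # D(z1)
    d_z2 = MU_D[2] >= v            # D(z2) >= D(z1) for everyone (monotonicity)
    d = d_z2 if z == 2 else d_z1
    y = y1 if d else y0            # observed outcome
    sum_y[z] += y
    sum_d[z] += d
    count[z] += 1
    if d_z2 and not d_z1:          # induced to switch by the change z1 -> z2
        switcher_gain += y1 - y0
        switcher_n += 1

# Wald / IV ratio: difference in mean outcomes over difference in treatment rates.
wald = (sum_y[2] / count[2] - sum_y[1] / count[1]) / (
    sum_d[2] / count[2] - sum_d[1] / count[1]
)
# Mean gain among switchers, computable only because the simulation sees both Y0 and Y1.
late = switcher_gain / switcher_n
print(round(wald, 3), round(late, 3))
```

In real data the switchers cannot be flagged individually, which is why only the ratio `wald` is available to the analyst.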

SLIDE 10

LATE does not identify which people are induced to change their treatment status by the change in the instrument.
It leaves unanswered many policy questions.
For example, if a proposed program changes the same components of vector Z as used to identify LATE but at different values of Z (say z4, z3), LATE(z2, z1) does not identify LATE(z4, z3).

SLIDE 11

If the policy operates on different components of Z than are used to identify LATE, one cannot safely use LATE to identify marginal returns to the policy.
It does not, in general, identify treatment on the treated, ATE, or a variety of other criteria.
But using the implicit economics of the problem one can do better, as I show below.

SLIDE 12

Identifying Policy Parameters

$$Y_1 = \mu_1(X) + U_1, \qquad Y_0 = \mu_0(X) + U_0, \qquad C = \mu_C(Z) + U_C \tag{1}$$

(X, Z) are observed by the analyst.
U0, U1, UC are unobserved.

SLIDE 13

Define Z to include all of X.
Variables in Z not in X are instruments.
$$I_D = E(Y_1 - Y_0 - C \mid \mathcal{I}) = \mu_D(Z) - V$$

$$\mu_D(Z) = E(\mu_1(X) - \mu_0(X) - \mu_C(Z) \mid \mathcal{I}), \qquad V = -E(U_1 - U_0 - U_C \mid \mathcal{I}).$$

Choice equation:

$$D = \mathbf{1}(\mu_D(Z) \ge V). \tag{2}$$

Recall from Vytlacil's Theorem (2002) that (2) ⇔ IA Assumption 1–IA Assumption 3, where IA Assumption 3 is monotonicity. This is an equivalence, not an implication.
In the early literature that implemented this approach, µ0(X), µ1(X), and µC(Z) were assumed to be linear in the parameters, and the unobservables were assumed to be normal and distributed independently of X and Z.

SLIDE 14

The essential aspect of the structural approach is joint modeling of outcome and choice equations.
Structural econometricians have developed nonparametric identification analyses for the Roy and generalized Roy models.
Central to the whole LATE enterprise is Pr(D = 1 | X, Z) = P.
Remember $D = \mathbf{1}[F_V(\mu_D(Z)) \ge F_V(V)]$.
We keep X implicit.

SLIDE 15

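To Recapitulate

A useful fact: assume Z ⊥⊥ V (implied by IA Assumption 1). Then the choice probability is

$$P(z) = \Pr(D = 1 \mid Z = z) = \Pr(\mu_D(z) \ge V) = \Pr\!\left(\frac{\mu_D(z)}{\sigma_V} \ge \frac{V}{\sigma_V}\right)$$

$$P(z) = F_{V/\sigma_V}\!\left(\frac{\mu_D(z)}{\sigma_V}\right)$$

$$U_D = F_{V/\sigma_V}\!\left(\frac{V}{\sigma_V}\right); \qquad U_D \sim \text{Uniform}(0, 1)$$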

SLIDE 16

$$P(z) = \Pr\!\left(F_{V/\sigma_V}\!\left(\frac{\mu_D(z)}{\sigma_V}\right) \ge F_{V/\sigma_V}\!\left(\frac{V}{\sigma_V}\right)\right) = \Pr(P(z) \ge U_D)$$

P(z) is the P(z)th quantile of UD.
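
The change of variables from V to UD can be checked numerically. The sketch below assumes, purely for illustration, that V is normal with a hypothetical scale `sigma_v` and a hypothetical index value `mu_d`; it confirms that the two forms of the choice rule agree draw by draw and that UD behaves like a Uniform(0, 1) variable.

```python
import random
from statistics import NormalDist

random.seed(1)
std_normal = NormalDist()          # assumed distribution of V / sigma_V
sigma_v = 2.0                      # hypothetical scale of V
mu_d = 0.8                         # hypothetical value of mu_D(z)
p_z = std_normal.cdf(mu_d / sigma_v)   # P(z) = F(mu_D(z) / sigma_V)

n = 100_000
agree = 0
u_draws = []
for _ in range(n):
    v = random.gauss(0.0, sigma_v)
    u_d = std_normal.cdf(v / sigma_v)          # U_D = F(V / sigma_V)
    u_draws.append(u_d)
    # D computed from the index form and from the propensity-score form
    agree += (mu_d >= v) == (p_z >= u_d)

share_treated = sum(u <= p_z for u in u_draws) / n   # estimates Pr(P(z) >= U_D)
mean_u = sum(u_draws) / n                            # close to 0.5 if U_D is uniform
print(agree == n, round(share_treated, 3), round(p_z, 3), round(mean_u, 3))
```

Normality is used here only to have a concrete F; the argument on the slides needs only that F is the distribution function of V/σV.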

SLIDE 17

Recall

$$Y = D Y_1 + (1 - D) Y_0 = Y_0 + D (Y_1 - Y_0)$$

Keep X implicit (condition on X = x).

$$\begin{aligned}
E(Y \mid Z = z) &= E(Y_0) + E(Y_1 - Y_0 \mid D = 1, Z = z)\, P(z) \quad \text{(from the law of iterated expectations)} \\
&= E(Y_0) + E(Y_1 - Y_0 \mid P(z) \ge U_D)\, P(z)
\end{aligned}$$

∴ It depends on Z only through P(Z).

$$E(Y \mid Z = z') = E(Y_0) + E(Y_1 - Y_0 \mid P(z') \ge U_D)\, P(z')$$

SLIDE 18

What is E(Y1 − Y0 | P(z) ≥ UD)? (Treatment on the treated)
Assume (Y1, Y0, UD) (absolutely) continuous.
The joint density of (Y1 − Y0, UD): fY1−Y0,UD(y1 − y0, uD).
Does not depend on Z.
It may, in general, depend on X.
$$E(Y_1 - Y_0 \mid P(z) \ge U_D) = \frac{\displaystyle\int_{-\infty}^{\infty} \int_{0}^{P(z)} (y_1 - y_0)\, f_{Y_1 - Y_0, U_D}(y_1 - y_0, u_D)\, du_D\, d(y_1 - y_0)}{\Pr(P(z) \ge U_D)}$$

SLIDE 19

Recall that

$$U_D = F_{V/\sigma_V}\!\left(\frac{V}{\sigma_V}\right).$$
UD is a quantile of the V /σV distribution.

SLIDE 20

By construction, UD is Uniform(0, 1) (this is the definition of a quantile).
∴ $f_{U_D}(u_D) = 1$.
Also, Pr(P(z) ≥ UD) = P(z).
By the law of conditional probability,

$$f_{Y_1 - Y_0, U_D}(y_1 - y_0, u_D) = f_{Y_1 - Y_0 \mid U_D}(y_1 - y_0 \mid U_D = u_D)\, \underbrace{f_{U_D}(u_D)}_{=1}.$$

SLIDE 21

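$$\begin{aligned}
E(Y_1 - Y_0 \mid P(z) \ge U_D) &= \frac{\displaystyle\int_{0}^{P(z)} \int_{-\infty}^{\infty} (y_1 - y_0)\, f_{Y_1 - Y_0, U_D}(y_1 - y_0, u_D)\, d(y_1 - y_0)\, du_D}{P(z)} \\
&= \frac{\displaystyle\int_{0}^{P(z)} \int_{-\infty}^{\infty} (y_1 - y_0)\, f_{Y_1 - Y_0 \mid U_D}(y_1 - y_0 \mid U_D = u_D)\, d(y_1 - y_0)\, du_D}{P(z)} \\
&= \frac{\displaystyle\int_{0}^{P(z)} E(Y_1 - Y_0 \mid U_D = u_D)\, du_D}{P(z)}
\end{aligned}$$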

SLIDE 22

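$$\therefore\; E(Y \mid Z = z) = E(Y_0) + \int_{0}^{P(z)} E(Y_1 - Y_0 \mid U_D = u_D)\, du_D$$

$$\frac{\partial E(Y \mid Z = z)}{\partial P(z)} = \underbrace{E(Y_1 - Y_0 \mid U_D = P(z))}_{\text{marginal gains for people with } U_D = P(z)} = \text{MTE}(U_D) \text{ for } U_D = P(Z)$$

$$E(Y \mid Z = z') = E(Y_0) + \int_{0}^{P(z')} E(Y_1 - Y_0 \mid U_D = u_D)\, du_D$$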

SLIDE 23

Suppose P(z) > P(z′).

$$\begin{aligned}
\therefore\; E(Y \mid Z = z) - E(Y \mid Z = z') &= \int_{P(z')}^{P(z)} E(Y_1 - Y_0 \mid U_D = u_D)\, du_D \\
&= E(Y_1 - Y_0 \mid P(z) \ge U_D \ge P(z'))\, \Pr(P(z) \ge U_D \ge P(z'))
\end{aligned}$$

SLIDE 24

Notice

$$\Pr(P(z) \ge U_D \ge P(z')) = \int_{P(z')}^{P(z)} du_D = P(z) - P(z')$$

$$E(Y \mid Z = z) - E(Y \mid Z = z') = \underbrace{E(Y_1 - Y_0 \mid P(z) \ge U_D \ge P(z'))}_{\text{LATE}}\, (P(z) - P(z'))$$

SLIDE 25

$$\frac{E(Y \mid Z = z) - E(Y \mid Z = z')}{P(z) - P(z')} = \text{LATE}(z, z') = \frac{\displaystyle\int_{P(z')}^{P(z)} \text{MTE}(u_D)\, du_D}{P(z) - P(z')}$$

SLIDE 26

Question: In what sense is E(Y1 − Y0 | P(z) ≥ UD) a measure of surplus of agents for whom P(z) ≥ UD?

SLIDE 27

The Surplus From Treatment and the Marginal Treatment Effect

Using Vytlacil's theorem, it is possible to understand more deeply what economic questions LATE answers.
For P(Z) = p, the mean gross gain of moving from "0" to "1" for people with UD less than or equal to p is

$$E(Y_1 - Y_0 \mid P(Z) \ge U_D, P(Z) = p) = E(Y_1 - Y_0 \mid p \ge U_D) = E(Y_1 - Y_0 \mid \mu_D(z) \ge V). \tag{3}$$

SLIDE 28

The mean gross gain in the population (or gross surplus S(p)) that arises from participation in the program for people whose UD is at or below p is the product of the mean gain for those people and the proportion of people whose UD is at or below p:

$$E(Y_1 - Y_0 \mid p \ge U_D)\, p = S(p).$$

$$\begin{aligned}
E(Y \mid P(Z) = p) &= E(Y_0 + \mathbf{1}(p \ge U_D)(Y_1 - Y_0)) \\
&= E(Y_0) + \underbrace{E(Y_1 - Y_0 \mid p \ge U_D)\, p}_{S(p)}.
\end{aligned} \tag{4}$$

Can identify the left-hand side of (4) for all values of p in the support of P(Z).

SLIDE 29

It is not necessary to impose functional forms to obtain this expression, and one can avoid one of the criticisms directed against 1980s structural econometrics.
The surplus can be defined for all values of p ∈ [0, 1] whether or not the model is identified.

SLIDE 30

Marginal increment in outcomes is

$$\frac{\partial E(Y \mid P(Z) = p)}{\partial p} = \underbrace{E(Y_1 - Y_0 \mid U_D = p)}_{\text{MTE}} = \frac{\partial S(p)}{\partial p}. \tag{5}$$

The sample analogue of (5) is the local instrumental variable (LIV) estimator of Heckman and Vytlacil (1999, 2005).
Adopting a nonparametric approach to estimating E(Y | P(Z) = p) avoids extrapolation outside of the sample support of P(Z) and produces a data-sensitive structural analysis.
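
The idea behind (5) can be illustrated with a deliberately crude stand-in for LIV: bin the propensity score, average Y within bins, and take finite differences across adjacent bins. This is not the local polynomial estimator used in the literature, only a sketch of differentiating E(Y | P(Z) = p) with respect to p. The data-generating process, including the function `mte_true` and the support [0.05, 0.95] for P(Z), is hypothetical.

```python
import random

random.seed(2)

def mte_true(u):
    # Hypothetical MTE, declining in u_D (an assumption for illustration only).
    return 2.0 - 2.0 * u

N, BINS, LO, HI = 400_000, 9, 0.05, 0.95
width = (HI - LO) / BINS
sum_p = [0.0] * BINS
sum_y = [0.0] * BINS
cnt = [0] * BINS

for _ in range(N):
    p = random.uniform(LO, HI)      # propensity score P(Z), independent of U_D
    u_d = random.random()           # U_D ~ Uniform(0, 1)
    y0 = random.gauss(0.0, 0.2)
    gain = mte_true(u_d)            # Y1 - Y0 for this person
    y = y0 + (gain if p >= u_d else 0.0)   # Y = Y0 + D (Y1 - Y0), D = 1(P >= U_D)
    k = min(int((p - LO) / width), BINS - 1)
    sum_p[k] += p
    sum_y[k] += y
    cnt[k] += 1

p_bar = [sp / c for sp, c in zip(sum_p, cnt)]   # mean P(Z) within each bin
y_bar = [sy / c for sy, c in zip(sum_y, cnt)]   # estimate of E(Y | P in bin)

# Finite-difference slope of E(Y | P = p) between adjacent bins.
liv = []
for k in range(BINS - 1):
    slope = (y_bar[k + 1] - y_bar[k]) / (p_bar[k + 1] - p_bar[k])
    midpoint = 0.5 * (p_bar[k] + p_bar[k + 1])
    liv.append((midpoint, slope))

for midpoint, slope in liv:
    print(round(midpoint, 2), round(slope, 2), round(mte_true(midpoint), 2))
```

Slopes are only produced for values of p between LO and HI, which mirrors the point that the MTE is traced out only over the sample support of P(Z).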

SLIDE 31

A generalization of this parameter, defined for other points of evaluation of uD, is the Marginal Treatment Effect (MTE):

$$\text{MTE}(u_D) \equiv E(Y_1 - Y_0 \mid U_D = u_D).$$

Expression (4) can be simplified to

$$E(Y \mid P(Z) = p) = E(Y_0) + \underbrace{\int_{0}^{p} \text{MTE}(u_D)\, du_D}_{S(p)}, \tag{6}$$

$$\frac{\partial S(p)}{\partial p} = \text{MTE}(p).$$

SLIDE 32

Figure 1: Plots of E(Y | P(Z) = p) and the MTE derived from E(Y | P(Z) = p)

[Figure: plot of E(Y | P(Z) = p) against p.]

Source: Heckman and Vytlacil (2005).

SLIDE 33

Figure 1 (continued): Plots of E(Y | P(Z) = p) and the MTE derived from E(Y | P(Z) = p)

[Figure: plot of MTE(uD) against uD, the derivative of E(Y | P(Z) = p) evaluated at points p = uD.]

Source: Heckman and Vytlacil (2005).

SLIDE 34

From LIV, it is possible to identify returns at all quantiles of UD within the support of the distribution of P(Z) to determine which persons (identified by the quantile of the unobserved component of the desire to go to college, UD) are induced to go into college (D = 1) by a marginal change in P(z), i.e., analysts can define the margins of choice traced out by variations in different instruments as they shift P(z).
This clarifies what empirical versions of LATE identify by showing that all instruments operate through P(Z), and variations around different levels of P(Z) identify different stretches of the MTE.

SLIDE 35

The Fundamental Role of the Choice Probability in Understanding What Instrumental Variables Estimate When β Sorted by D

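For p2 > p1,

$$\begin{aligned}
S(p_2) - S(p_1) &= E(Y_1 - Y_0 \mid p_1 \le U_D \le p_2)\, \Pr(p_1 \le U_D \le p_2) \\
&= E(Y_1 - Y_0 \mid p_1 \le U_D \le p_2)\, (p_2 - p_1),
\end{aligned}$$

note that Pr(p1 ≤ UD ≤ p2) = p2 − p1.

Thus,

$$S(p_2) - S(p_1) = \int_{p_1}^{p_2} \text{MTE}(u_D)\, du_D.$$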

SLIDE 36

$$\text{LATE}(p_2, p_1) = \frac{\displaystyle\int_{p_1}^{p_2} \text{MTE}(u_D)\, du_D}{p_2 - p_1} = \frac{S(p_2) - S(p_1)}{p_2 - p_1}. \tag{7}$$

By the mean value theorem, LATE(p2, p1) = MTE(uD(p2, p1)), where uD(p2, p1) is a point of evaluation and uD(p2, p1) ∈ [p1, p2].
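
Expression (7) and the mean value theorem statement can be traced through a toy case. The sketch below assumes a hypothetical linear MTE(u) = 2 − 2u, computes S(p) by numerical integration, and then evaluates LATE as an average of the MTE over [p1, p2]. The function names `mte`, `surplus`, and `late` are illustrative, not part of any library.

```python
def mte(u):
    # Hypothetical linear MTE; any continuous function on [0, 1] could be used instead.
    return 2.0 - 2.0 * u

def surplus(p, steps=10_000):
    # S(p) = integral of MTE(u) over [0, p], approximated with the midpoint rule.
    h = p / steps
    return sum(mte((i + 0.5) * h) for i in range(steps)) * h

def late(p2, p1):
    # Expression (7): average of the MTE over the interval [p1, p2].
    return (surplus(p2) - surplus(p1)) / (p2 - p1)

p1, p2 = 0.2, 0.6
late_value = late(p2, p1)

# For this linear MTE the mean value point solves 2 - 2u = LATE, so u* = (2 - LATE) / 2.
u_star = (2.0 - late_value) / 2.0

# Shrinking the interval moves LATE toward the MTE at p1.
late_small = late(p1 + 1e-4, p1)

print(round(late_value, 6), round(u_star, 6), round(late_small, 4), mte(p1))
```

With a linear MTE the evaluation point sits exactly at the midpoint of [p1, p2]; with a curved MTE it would lie elsewhere in the interval.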

SLIDE 37

The model-generated LATE can be identified if there are values of Z, say $\tilde{z}$ and $\tilde{\tilde{z}}$, such that $\Pr(D = 1 \mid Z = \tilde{z}) = p_1$ and $\Pr(D = 1 \mid Z = \tilde{\tilde{z}}) = p_2$.
Under standard regularity conditions

$$\lim_{p_2 \to p_1} \text{LATE}(p_2, p_1) = \text{MTE}(p_1).$$

SLIDE 38

Partition the support of uD into M discrete and exhaustive intervals $[u_{D,0}, u_{D,1}), [u_{D,1}, u_{D,2}), \ldots, [u_{D,M-1}, u_{D,M}]$, where $u_{D,0} = 0$ and $u_{D,M} = 1$:

$$E(Y \mid U_D \le u_{D,k}) = E(Y_0) + \sum_{j=1}^{k} \text{LATE}(u_{D,j}, u_{D,j-1})\, \eta_j, \qquad \text{where } \eta_j = u_{D,j} - u_{D,j-1}.$$

Thus

$$E(Y) = E(Y_0) + \sum_{j=1}^{M} \text{LATE}(u_{D,j}, u_{D,j-1})\, \eta_j. \tag{8}$$

Counterpart to expression (6) when p = 1.
It shows how mean income can be represented as a sum of incremental gross surpluses above E(Y0).
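
One way to see why the sum of LATE terms collapses to a single surplus is a telescoping argument. The short derivation below is a sketch that uses only expression (7), applied with $p_2 = u_{D,j}$ and $p_1 = u_{D,j-1}$, and the fact that $S(0) = 0$ by the definition of $S(p)$ as an integral from 0 to $p$.

```latex
\begin{aligned}
\text{LATE}(u_{D,j}, u_{D,j-1})\, \eta_j
  &= \frac{S(u_{D,j}) - S(u_{D,j-1})}{u_{D,j} - u_{D,j-1}}\, (u_{D,j} - u_{D,j-1})
   = S(u_{D,j}) - S(u_{D,j-1}), \\
\sum_{j=1}^{k} \text{LATE}(u_{D,j}, u_{D,j-1})\, \eta_j
  &= S(u_{D,k}) - S(u_{D,0})
   = S(u_{D,k})
   = \int_{0}^{u_{D,k}} \text{MTE}(u_D)\, du_D .
\end{aligned}
```

Setting k = M gives S(1), the integral of the MTE over the whole unit interval, which is the term added to E(Y0) in (8).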

SLIDE 39

If Pr(D = 1 | Z = z) assumes values at only a discrete set of support points, say p1 < p2 < · · · < pL, we can only identify LATE in intervals with boundaries defined by uD,ℓ = pℓ, ℓ = 1, . . . , L.

SLIDE 40

MTE(uD) and the model-generated LATE (7) are structural parameters in the sense that changes in Z (conditional on X) do not affect MTE(uD) or theoretical LATE.
They are invariant with respect to all policy changes that operate through Z.
Conditional on X, one can transport MTE and the derived theoretical LATEs across different policy environments and different data sets.
These policy invariant parameters implement Marschak's Maxim since they are defined for combinations of the parameters of the generalized Roy model.

SLIDE 41

This deeper understanding of LATE facilitates its use in answering out-of-sample policy questions P2 and P3 for policies that operate through changing Z.
Thus if one computes a LATE for any two values Z = z1 and Z = z2, with associated probabilities Pr(D = 1 | Z = z1) = P(z1) = p1 and Pr(D = 1 | Z = z2) = P(z2) = p2, one can use it to evaluate any other pair of policies $\tilde{z}$ and $\tilde{\tilde{z}}$ such that $\Pr(D = 1 \mid Z = z_1) = \Pr(D = 1 \mid Z = \tilde{z}) = p_1$ and $\Pr(D = 1 \mid Z = z_2) = \Pr(D = 1 \mid Z = \tilde{\tilde{z}}) = p_2$.

SLIDE 42

Thus, one can use an empirical LATE determined for one set of instrument configurations to identify outcomes for other sets of instrument configurations that produce the same p1 and p2, i.e., we can compare any policy described by $\tilde{z} \in \{z \mid P(z) = p_1\}$ with any policy $\tilde{\tilde{z}} \in \{z \mid P(z) = p_2\}$ and not just the policies associated with z1 and z2 that identify the sample LATE.
This is a useful result and enables analysts to solve policy evaluation question P3 to evaluate new policies never previously implemented if they can be cast in terms of variations in P(Z) over the empirical support on Z.

SLIDE 43

Variation in different components of Z produces variation in P(Z).
Analysts can aggregate the variation in different components of Z into the induced variation in P(Z) to trace out MTE(uD) over more of the support of uD than would be possible using variation in any particular component of Z.
The structural approach enables analysts to determine what stretches of the MTE different instruments identify and to determine the margin of UD identified by the variation in an instrument.

SLIDE 44

Figure 2: MTE as a function of uD: what sections of the MTE different values of the instruments and different instruments approximate.

[Figure: MTE (mean marginal gain) plotted against uD, with p1, p2, p3, p4 marked on the uD axis; LATE(p2, p1) and LATE(p4, p3) are shown at the evaluation points uD(p2, p1) and uD(p4, p3).]

SLIDE 45

Instruments associated with higher values of P(Z), [p3, p4], identify the LATE in a different stretch of the MTE associated with higher values of uD.
Continuous instruments identify entire stretches of the MTE while discrete instruments define the MTE at discrete points of the support (i.e., the LATE associated with the interval defined by the values assumed by P(Z)).

SLIDE 46

If the MTE does not depend on uD,

$$E(Y \mid P(Z) = p) = E(Y_0) + (\mu_1 - \mu_0)\, p,$$

and all instruments identify the same parameter: $\bar{\beta} = \mu_1 - \mu_0$.
In this case, MTE is a flat line parallel to the uD axis.

SLIDE 47

A test of whether MTE(uD) depends on uD, or a test of nonlinearity of E(Y | P(Z) = p) in p, is a test of whether different instruments estimate the same parameter.
The LATE model and its extensions overturn the logic of the Durbin (1954)–Wu (1973)–Hausman (1978) test for overidentification.
Variability among the estimates from IV estimators based on different instruments may have nothing to do with the validity of any particular instrument, but may just depend on what stretch of the MTE they approximate.

SLIDE 48

Appendix: The Generalized Roy Model for the Normal Case

SLIDE 49

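$$Y_1 = \mu_1(X) + U_1, \qquad Y_0 = \mu_0(X) + U_0, \qquad C = \mu_C(Z) + U_C$$

Net Benefit: $I = Y_1 - Y_0 - C$

$$I = \underbrace{\mu_1(X) - \mu_0(X) - \mu_C(Z)}_{\mu_D(Z)} + \underbrace{U_1 - U_0 - U_C}_{-V}$$

$$(U_0, U_1, U_C) \perp\!\!\!\perp (X, Z), \qquad E(U_0, U_1, U_C) = (0, 0, 0), \qquad V \perp\!\!\!\perp (X, Z)$$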

SLIDE 50

Assume normally distributed errors.
Assume Z contains X but may contain other variables (exclusions).

$$\underbrace{Y = D Y_1 + (1 - D) Y_0}_{\text{observed } Y}$$

$$D = \mathbf{1}(I \ge 0) = \mathbf{1}(\mu_D(Z) \ge V)$$

Assume $V \sim N(0, \sigma_V^2)$.

SLIDE 51

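Propensity Score:

$$\Pr(D = 1 \mid Z = z) = \Phi\!\left(\frac{\mu_D(z)}{\sigma_V}\right)$$

$$E(Y \mid D = 1, X = x, Z = z) = \mu_1(x) + \underbrace{E(U_1 \mid \mu_D(z) \ge V)}_{K_1(P(z))}$$

because (X, Z) ⊥⊥ (U1, V).

Under normality we obtain

$$E\!\left(U_1 \,\Big|\, \frac{\mu_D(z)}{\sigma_V} \ge \frac{V}{\sigma_V}\right) = \frac{\text{Cov}\!\left(U_1, \frac{V}{\sigma_V}\right)}{\text{Var}\!\left(\frac{V}{\sigma_V}\right)}\, \tilde{\lambda}\!\left(\frac{\mu_D(z)}{\sigma_V}\right)$$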

SLIDE 52

Why?

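$$U_1 = \text{Cov}\!\left(U_1, \frac{V}{\sigma_V}\right) \frac{V}{\sigma_V} + \varepsilon_1, \qquad \varepsilon_1 \perp\!\!\!\perp V$$

$$\begin{aligned}
E\!\left(\frac{V}{\sigma_V} \,\Big|\, \frac{\mu_D(z)}{\sigma_V} \ge \frac{V}{\sigma_V}\right)
&= \frac{\displaystyle\int_{-\infty}^{\mu_D(z)/\sigma_V} t\, \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt}{\displaystyle\int_{-\infty}^{\mu_D(z)/\sigma_V} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt}
= \tilde{\lambda}\!\left(\frac{\mu_D(z)}{\sigma_V}\right) \\
&= \frac{-\frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{1}{2} \left(\frac{\mu_D(z)}{\sigma_V}\right)^2\right)}{\Phi\!\left(\frac{\mu_D(z)}{\sigma_V}\right)}
= \frac{-\phi\!\left(\frac{\mu_D(z)}{\sigma_V}\right)}{\Phi\!\left(\frac{\mu_D(z)}{\sigma_V}\right)}
\end{aligned}$$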
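
The closed form for the truncated normal mean can be checked against brute-force numerical integration. The sketch below is only a verification aid: the function names `lambda_tilde`, `lambda_upper`, and `truncated_mean_numeric` are hypothetical, and the lower integration limit of −10 stands in for −∞.

```python
from statistics import NormalDist

STD = NormalDist()   # standard normal, the distribution of V / sigma_V under normality

def lambda_tilde(c):
    # Closed form for E(T | T <= c) with T standard normal: -phi(c) / Phi(c).
    return -STD.pdf(c) / STD.cdf(c)

def lambda_upper(c):
    # Closed form for E(T | T >= c): phi(c) / (1 - Phi(c)).
    return STD.pdf(c) / (1.0 - STD.cdf(c))

def truncated_mean_numeric(c, lower=-10.0, steps=50_000):
    # Ratio of midpoint-rule integrals of t * phi(t) and phi(t) over [lower, c].
    h = (c - lower) / steps
    num = den = 0.0
    for i in range(steps):
        t = lower + (i + 0.5) * h
        w = STD.pdf(t)
        num += t * w
        den += w
    return num / den

for c in (-1.0, 0.0, 1.5):
    print(c, round(lambda_tilde(c), 6), round(truncated_mean_numeric(c), 6))
```

Because the standard normal has mean zero, Φ(c)·`lambda_tilde(c)` + (1 − Φ(c))·`lambda_upper(c)` equals zero for every c, which is the normal-model version of the adding-up restriction used later for the control functions.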

SLIDE 53

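Notice

$$\lim_{\mu_D(z) \to \infty} \tilde{\lambda}\!\left(\frac{\mu_D(z)}{\sigma_V}\right) = 0, \qquad \lim_{\mu_D(z) \to -\infty} \tilde{\lambda}\!\left(\frac{\mu_D(z)}{\sigma_V}\right) = -\infty$$

Propensity score:

$$P(z) = \Pr(D = 1 \mid Z = z) = \Phi\!\left(\frac{\mu_D(z)}{\sigma_V}\right)$$

$$\therefore\; \frac{\mu_D(z)}{\sigma_V} = \Phi^{-1}\big(\Pr(D = 1 \mid Z = z)\big)$$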

SLIDE 54

Thus we can replace $\mu_D(z)/\sigma_V$ with a known function of P(z).

SLIDE 55

Notice that because (X, Z) ⊥⊥ (U, V), Z enters the model (conditional on X) only through P(Z).
This is called index sufficiency.

SLIDE 56

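Put all of these results together to obtain

$$E(Y \mid D = 1, X = x, Z = z) = E(Y_1 \mid D = 1, X = x, Z = z) = \mu_1(x) + \frac{\text{Cov}\!\left(U_1, \frac{V}{\sigma_V}\right)}{\text{Var}\!\left(\frac{V}{\sigma_V}\right)}\, \tilde{\lambda}\!\left(\frac{\mu_D(z)}{\sigma_V}\right)$$

$$\tilde{\lambda}(z) = E\!\left(\frac{V}{\sigma_V} \,\Big|\, \frac{V}{\sigma_V} < \frac{\mu_D(z)}{\sigma_V}\right) < 0$$

$$\lambda(z) = E\!\left(\frac{V}{\sigma_V} \,\Big|\, \frac{V}{\sigma_V} \ge \frac{\mu_D(z)}{\sigma_V}\right) > 0$$

$$E(Y \mid D = 0, X = x, Z = z) = \mu_0(x) + \frac{\text{Cov}\!\left(U_0, \frac{V}{\sigma_V}\right)}{\text{Var}\!\left(\frac{V}{\sigma_V}\right)}\, \lambda\!\left(\frac{\mu_D(z)}{\sigma_V}\right)$$

$$\text{Var}\!\left(\frac{V}{\sigma_V}\right) = 1$$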

SLIDE 57

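$$\frac{V}{\sigma_V} = \frac{-(U_1 - U_0 - U_C)}{\sigma_V}$$

$$\text{Cov}\!\left(U_1, \frac{V}{\sigma_V}\right) = -\text{Cov}\!\left(U_1, \frac{U_1}{\sigma_V}\right) + \text{Cov}\!\left(U_1, \frac{U_0}{\sigma_V}\right) + \text{Cov}\!\left(U_1, \frac{U_C}{\sigma_V}\right)$$

In Roy model case (UC = 0),

$$\text{Cov}\!\left(U_1, \frac{V}{\sigma_V}\right) = -\text{Cov}\!\left(U_1, \frac{U_1 - U_0}{\sigma_V}\right) = \frac{-\text{Cov}(U_1 - U_0, U_1)}{\sqrt{\text{Var}(U_1 - U_0)}}$$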

SLIDE 58

We can identify µ1(x), µ0(x).
From the discrete choice model we can identify

$$\frac{\mu_D(z)}{\sigma_V} = \frac{\mu_1(x) - \mu_0(x) - \mu_C(z)}{\sigma_V}$$

If we have a regressor in X that does not affect µC(z) (say regressor xj, so $\partial \mu_C(z) / \partial x_j = 0$), we can identify σV and µC(z).
∴ We can identify the net benefit function and the cost function up to scale.
∴ We can compute ex-ante subjective net gains.

SLIDE 59

Method generalizes: don't need normality.
Need "Large Support" assumption to identify ATE and TT.

$$E(Y \mid D = 1, X = x, Z = z) = \mu_1(x) + \underbrace{K_1(P(z))}_{\text{control function}}$$

$$E(Y \mid D = 0, X = x, Z = z) = \mu_0(x) + \underbrace{\tilde{K}_0(P(z))}_{\text{control function}}$$

$$\lim_{P(z) \to 1} E(Y \mid D = 1, X = x, Z = z) = \mu_1(x)$$

$$\lim_{P(z) \to 0} E(Y \mid D = 0, X = x, Z = z) = \mu_0(x)$$

SLIDE 60

If we have this condition satisfied, we can identify ATE:

E(Y1 − Y0 | X = x) = µ1(x) − µ0(x)

ATE is defined in a limit set. This is true for any model with selection on unobservables (IV; selection models).

SLIDE 61

What about treatment on the treated?

E(Y1 − Y0 | D = 1, X = x, Z = z)

SLIDE 62

(a) From the data, we observe E(Y1 | D = 1, X = x, Z = z).
(b) Can also create it from the model.
(c) E(Y0 | D = 1, X = x, Z = z) is a counterfactual.

We know

$$E(Y_0 \mid D = 0, X = x, Z = z) = \mu_0(x) + \text{Cov}\!\left(U_0, \frac{V}{\sigma_V}\right) \lambda\!\left(\frac{\mu_D(z)}{\sigma_V}\right)$$

(this is data).

SLIDE 63

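(d) We seek

$$E(Y_0 \mid D = 1, X = x, Z = z) = \mu_0(x) + \text{Cov}\!\left(U_0, \frac{V}{\sigma_V}\right) \tilde{\lambda}\!\left(\frac{\mu_D(z)}{\sigma_V}\right)$$

But under normality, we know $\text{Cov}\!\left(U_0, \frac{V}{\sigma_V}\right)$.
We know $\mu_D(z)/\sigma_V$.
$\tilde{\lambda}(\cdot)$ is a known function.
Can form $\tilde{\lambda}\!\left(\mu_D(z)/\sigma_V\right)$ and can construct the counterfactual.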

SLIDE 64

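More generally, without normality but with (X, Z) ⊥⊥ (U, V),

$$E(Y_1 \mid D = 1, X = x, Z = z) = E(Y \mid D = 1, X = x, Z = z) = \mu_1(x) + K_1(P(z))$$

$$E(Y_0 \mid D = 0, X = x, Z = z) = E(Y \mid D = 0, X = x, Z = z) = \mu_0(x) + \tilde{K}_0(P(z))$$

where

$$K_1(P(z)) = E(U_1 \mid D = 1, X = x, Z = z) = E\!\left(U_1 \,\Big|\, \frac{\mu_D(z)}{\sigma_V} > \frac{V}{\sigma_V}\right)$$

$$\tilde{K}_1(P(z)) = E\!\left(U_1 \,\Big|\, \frac{\mu_D(z)}{\sigma_V} \le \frac{V}{\sigma_V}\right)$$

$$\tilde{K}_0(P(z)) = E\!\left(U_0 \,\Big|\, \frac{\mu_D(z)}{\sigma_V} \le \frac{V}{\sigma_V}\right)$$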

SLIDE 65

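Use the transformation

$$F_{V/\sigma_V}\!\left(\frac{\mu_D(z)}{\sigma_V}\right) = P(z), \qquad F_{V/\sigma_V}\!\left(\frac{V}{\sigma_V}\right) = U_D \quad \text{(a uniform random variable)}$$

$$D = \mathbf{1}\!\left(\frac{\mu_D(z)}{\sigma_V} \ge \frac{V}{\sigma_V}\right) = \mathbf{1}(P(z) \ge U_D)$$

$$K_1(P(z)) = E(U_1 \mid P(z) > U_D)$$

$$K_1(P(z))\, P(z) + \tilde{K}_1(P(z))\, (1 - P(z)) = 0$$

∴ we can construct $\tilde{K}_1(P(z))$.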

SLIDE 66

Symmetrically

$$\tilde{K}_0(P(z)) = E(U_0 \mid P(z) \le U_D), \qquad K_0(P(z)) = E(U_0 \mid P(z) > U_D)$$

$$(1 - P(z))\, \tilde{K}_0(P(z)) + P(z)\, K_0(P(z)) = 0$$
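
The adding-up restriction follows from the error having mean zero, and it does not rely on normality. The simulation below is a sketch under invented assumptions: a mean-zero, deliberately non-normal error `u1` that is correlated with UD, and a hypothetical propensity score value `p = 0.3`. It estimates K1 and K̃1 directly and then rebuilds K̃1 from K1 alone.

```python
import random

random.seed(3)
N = 200_000
p = 0.3                      # hypothetical value of P(z)

sum_in = sum_out = 0.0
n_in = n_out = 0
for _ in range(N):
    u_d = random.random()    # U_D ~ Uniform(0, 1)
    # Hypothetical mean-zero, non-normal error correlated with U_D.
    u1 = (u_d - 0.5) + (random.expovariate(1.0) - 1.0)
    if p > u_d:              # D = 1 group
        sum_in += u1
        n_in += 1
    else:                    # D = 0 group
        sum_out += u1
        n_out += 1

k1 = sum_in / n_in           # estimate of K1(p)  = E(U1 | p >  U_D)
k1_tilde = sum_out / n_out   # estimate of K~1(p) = E(U1 | p <= U_D)
p_hat = n_in / N

# Rebuild K~1 from K1 using K1 * P + K~1 * (1 - P) = 0.
k1_tilde_implied = -k1 * p_hat / (1.0 - p_hat)
print(round(k1, 3), round(k1_tilde, 3), round(k1_tilde_implied, 3))
```

In this design the population values are K1(0.3) = −0.35 and K̃1(0.3) = 0.15, so the directly estimated and the implied values should both be close to 0.15.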

SLIDE 67

∴ If we have "identification at infinity," we can construct

$$E(Y_1 - Y_0 \mid X = x) = \mu_1(x) - \mu_0(x).$$

We can construct TT:

$$E(Y_1 - Y_0 \mid D = 1, X = x, Z = z) = \underbrace{[\mu_1(x) + K_1(P(z))]}_{\text{factual}} - \underbrace{[\mu_0(x) + K_0(P(z))]}_{\text{counterfactual}}$$

We can form µ1(x) + K1(P(z)) from data.
We get µ0(x) from the limit set: P(z) → 0 identifies µ0(x).
We can form

$$K_0(P(z)) = -\tilde{K}_0(P(z))\, \frac{1 - P(z)}{P(z)},$$

which follows from $(1 - P(z))\, \tilde{K}_0(P(z)) + P(z)\, K_0(P(z)) = 0$.
∴ Can construct the desired counterfactual mean.

SLIDE 68

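Notice how we can get the Effect of Treatment for People at the Margin:

$$E(Y_1 - Y_0 \mid I = 0, X = x, Z = z)$$

Under normality we have (as a result of independence and normality)

$$\begin{aligned}
E(Y_1 - Y_0 \mid I = 0, X = x, Z = z) &= \mu_1(x) - \mu_0(x) + E\!\left(U_1 - U_0 \,\Big|\, \frac{\mu_D(z)}{\sigma_V} = \frac{V}{\sigma_V}, X = x, Z = z\right) \\
&= \mu_1(x) - \mu_0(x) + \text{Cov}\!\left(U_1 - U_0, \frac{V}{\sigma_V}\right) \frac{\mu_D(z)}{\sigma_V}
\end{aligned}$$

In the Roy model case where UC = 0 but µC(z) ≠ 0,

$$\begin{aligned}
E(Y_1 - Y_0 \mid I = 0, X = x, Z = z) &= \mu_1(x) - \mu_0(x) - \sigma_V\, \frac{\mu_D(z)}{\sigma_V} \\
&= \mu_1(x) - \mu_0(x) - \mu_D(z) \\
&= \mu_C(z) \quad \text{(marginal gain = marginal cost)}
\end{aligned}$$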

SLIDE 69

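MTE is

$$E\!\left(Y_1 - Y_0 \,\Big|\, \frac{V}{\sigma_V} = v, X = x, Z = z\right) = \mu_1(x) - \mu_0(x) + \text{Cov}\!\left(U_1 - U_0, \frac{V}{\sigma_V}\right) v$$

Effect of Treatment for People at the Margin picks $v = \mu_D(z)/\sigma_V$.

Notice we can use the result that

$$\frac{\mu_D(z)}{\sigma_V} = F^{-1}_{V/\sigma_V}(P(z)), \qquad \frac{V}{\sigma_V} = F^{-1}_{V/\sigma_V}(U_D)$$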

SLIDE 70

Effect of Treatment for People at Margin of Indifference Between Taking Treatment and Not:

$$E(Y_1 - Y_0 \mid I = 0, X = x, Z = z) = \mu_1(x) - \mu_0(x) + \text{Cov}\!\left(U_1 - U_0, \frac{V}{\sigma_V}\right) F^{-1}_{V/\sigma_V}(P(z))$$

MTE:

$$E(Y_1 - Y_0 \mid U_D = u_D, X = x, Z = z) = \mu_1(x) - \mu_0(x) + \text{Cov}\!\left(U_1 - U_0, \frac{V}{\sigma_V}\right) F^{-1}_{V/\sigma_V}(u_D)$$
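
Under normality, $F^{-1}_{V/\sigma_V}$ is the standard normal quantile function, so the MTE formula can be evaluated directly. The sketch below uses hypothetical parameter values (`MU1_MINUS_MU0` for µ1(x) − µ0(x) at a fixed x, and `COV` for Cov(U1 − U0, V/σV)) and averages the MTE over an interval of uD to obtain the implied LATE and, over the whole unit interval, the ATE.

```python
from statistics import NormalDist

STD = NormalDist()
MU1_MINUS_MU0 = 0.5   # hypothetical mu_1(x) - mu_0(x) at a fixed x
COV = -0.4            # hypothetical Cov(U1 - U0, V / sigma_V); negative gives a declining MTE

def mte(u_d):
    # Normal-model MTE: mu_1 - mu_0 + Cov(U1 - U0, V/sigma_V) * Phi^{-1}(u_D).
    return MU1_MINUS_MU0 + COV * STD.inv_cdf(u_d)

def average_mte(p1, p2, steps=20_000):
    # Midpoint-rule average of the MTE over [p1, p2]; equals LATE(p2, p1) in the model.
    h = (p2 - p1) / steps
    return sum(mte(p1 + (i + 0.5) * h) for i in range(steps)) * h / (p2 - p1)

late_02_06 = average_mte(0.2, 0.6)

# Closed form for comparison: the integral of Phi^{-1}(u) over [p1, p2]
# equals phi(Phi^{-1}(p1)) - phi(Phi^{-1}(p2)).
closed_form = MU1_MINUS_MU0 + COV * (
    STD.pdf(STD.inv_cdf(0.2)) - STD.pdf(STD.inv_cdf(0.6))
) / 0.4

ate = average_mte(0.0, 1.0)   # Phi^{-1} integrates to zero, so this returns mu_1 - mu_0
print(round(late_02_06, 6), round(closed_form, 6), round(ate, 6))
```

Averaging over the full unit interval requires P(Z) to reach both 0 and 1, which is the large-support requirement for ATE discussed on the earlier slides.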

SLIDE 71

Recent (1987 and Later!) Advances in Econometrics:

(a) Relax normality
(b) Do not assume linearity of µ1(X) and µ0(X) in terms of X
(c) Do not require identification at infinity but only because they abandon pursuit of ATE, TT, TUT or else assume that (Y1, Y0) ⊥⊥ D | X (matching assumption)
(d) Identification at infinity in some version or the other is required for ATE, TT, TUT as long as there is selection on unobservables (i.e., (Y1, Y0) is not independent of D given X: $(Y_1, Y_0) \not\perp\!\!\!\perp D \mid X$)

SLIDE 72

End of Example of Normal Model

SLIDE 73

Appendix: Nonparametric Identification of the Roy Model

(Y0, Y1): potential outcomes
I* = Y1 − Y0: choice index
Observe Y1 if Y1 ≥ Y0.
Observe Y0 if Y1 < Y0.
Cannot simultaneously observe Y0 and Y1.
Generalized Roy model: I = Y1 − Y0 − C.
C depends on Z.

SLIDE 74

Heuristically, we can conduct an identification analysis assuming we know

$$I = \frac{I^*}{\sigma_{Y_1 - Y_0}} = \frac{Y_1 - Y_0}{\sigma_{Y_1 - Y_0}}$$

for each person, where D = 1(I > 0).
See Cosslett (1983), Manski (1988), Matzkin (1992).
Assumes there is an instrument Z that shifts C.
Even though we do not ever observe I, we observe (Y0, D) and (Y1, D). We never observe the full triple (Y0, Y1, D) for anyone.
We only observe some components of C.

SLIDE 75

Under conditions specified in the literature, F(Y0, I | X, Z) and F(Y1, I | X, Z) are identified, where

$$Y_0 = \mu_0(X) + U_0, \qquad E(Y_0 \mid X) = \mu_0(X)$$

$$Y_1 = \mu_1(X) + U_1, \qquad E(Y_1 \mid X) = \mu_1(X)$$

$$I^* = \mu_I(X, Z) + U_I, \qquad I = \frac{\mu_I(X, Z)}{\sigma_{U_I}} + \frac{U_I}{\sigma_{U_I}}$$

Source: Heckman (1990), Heckman and Honoré (1990).
The key idea in these papers is "sufficient" variation in Z holding X fixed.

SLIDE 76

Sketch of the Proof

From the left-hand side of

$$\Pr(D = 1 \mid X, Z) = \Pr(\mu_I(X, Z) + U_I \ge 0 \mid X, Z),$$

we can identify the distribution of $U_I/\sigma_{U_I}$, as well as $\mu_I(X, Z)/\sigma_{U_I}$.
This is true under normality or any assumed form for the distribution of $U_I/\sigma_{U_I}$.
It is also true more generally.
One does not have to assume the distribution of UI is known or that the functional form of µI(X, Z) is linear, e.g., µI(X, Z) = XβI + Zγ.
See the conditions in the Matzkin (1992) paper and the survey in Matzkin (2007), Handbook of Econometrics.

SLIDE 77

This more general claim requires full support of Z and restrictions on µI(X, Z). See the "Matzkin conditions" in Cunha, Heckman, and Navarro (2007, IER).
A key condition is

$$\text{Support}\!\left(\frac{\mu_I(X, Z)}{\sigma_{U_I}}\right) \supseteq \text{Support}\!\left(\frac{U_I}{\sigma_{U_I}}\right)$$

and other regularity conditions.
Commonly it is assumed that for a fixed X

$$\text{Support}\!\left(\frac{\mu_I(X, Z)}{\sigma_{U_I}}\right) = (-\infty, \infty).$$

This is called "identification at infinity." When we vary Z over its conditional support (for each X) we trace out the full support of $U_I/\sigma_{U_I}$.

SLIDE 78

Identifying the Joint Distribution of (Y0, I)

From data, we know the conditional distribution of Y0:

F(Y0 | D = 0, X, Z) = Pr(Y0 ≤ y0 | µI(X, Z) + UI ≤ 0, X, Z)

Multiply this by Pr(D = 0 | X, Z):

F(Y0 | D = 0, X, Z) Pr(D = 0 | X, Z) = Pr(Y0 ≤ y0, I∗ ≤ 0 | X, Z) (*)

Follow the analysis of Heckman (1990), Heckman and Smith (1998), and Carneiro, Hansen, and Heckman (2003).

SLIDE 79

Left-hand side of (*) is known from the data.
Right-hand side:

Pr(Y0 ≤ y0, UI/σUI ≤ −µI(X, Z)/σUI | X, Z)

Since we know µI(X, Z)/σUI from the previous analysis, we can vary it for each fixed X.

SLIDE 80

If µI(X, Z) gets small (µI(X, Z) → −∞), we recover the marginal distribution of Y0: in this limit set we can identify the marginal distribution of Y0 = µ0(X) + U0. ∴ We can identify µ0(X) in the limit.

(See Heckman, 1990, and Heckman and Vytlacil, 2007.)
More generally, we can form:

Pr(U0 ≤ y0 − µ0(X), UI/σUI ≤ −µI(X, Z)/σUI | X, Z)

X and Z can be varied and y0 is a number. We can trace out the joint distribution of (U0, UI/σUI) by varying (y0, Z) for each fixed X.
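
A small simulation sketch of the limit-set argument, under assumed joint normality of (U0, UI/σUI). The values MU_0 and RHO and the helper name are hypothetical. As the index µI(X, Z)/σUI is pushed toward −∞, almost everyone has D = 0, so the mean of Y0 among the untreated approaches µ0(X) and the selection bias shrinks toward zero.

```python
import math
import random

random.seed(1)

MU_0 = 2.0   # hypothetical value of mu_0(X) at a fixed X
RHO = 0.6    # hypothetical correlation between U_0 and U_I / sigma_UI

def mean_y0_among_untreated(index, n=100_000):
    """Mean of Y0 over people with D = 0, i.e. index + U_I/sigma_UI <= 0."""
    total, count = 0.0, 0
    for _ in range(n):
        u_i = random.gauss(0.0, 1.0)
        u_0 = RHO * u_i + math.sqrt(1.0 - RHO**2) * random.gauss(0.0, 1.0)
        if index + u_i <= 0:
            total += MU_0 + u_0
            count += 1
    return total / count

# Selection bias E(Y0 | D = 0) - mu_0(X) at three values of the index.
bias_by_index = {idx: mean_y0_among_untreated(idx) - MU_0
                 for idx in (0.0, -1.0, -3.0)}
for idx, bias in bias_by_index.items():
    print(idx, round(bias, 3))
```

The bias is sizeable at an index of 0 and is close to zero at an index of −3, which illustrates why large support for Z is needed.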

SLIDE 81

∴ Recover the joint distribution of

(Y0, I) = (µ0(X) + U0, (µI(X, Z) + UI)/σUI).

Three key ingredients:

1 The independence of (U0, UI) and (X, Z).
2 The assumption that we can set µI(X, Z)/σUI to be very small (so we get the marginal distribution of Y0 and hence µ0(X)).
3 The assumption that µI(X, Z)/σUI can be varied independently of µ0(X).

Trace out the joint distribution of (U0, UI/σUI). Result generalizes easily to the vector case. (Carneiro, Hansen, and Heckman, 2003, IER; Heckman and Vytlacil, Part I).

SLIDE 82

Another way to see this is to write:

F(Y0 | D = 0, X, Z) Pr(D = 0 | X, Z)

This is a function of µ0(X) and µI(X, Z)/σUI (index sufficiency).

SLIDE 83

Varying µ0(X) and µI(X, Z)/σUI traces out the distribution of (U0, UI/σUI).
Effectively we observe the pairs (I/σUI, Y1) and (I/σUI, Y0).
We never observe the triple (I/σUI, Y0, Y1).

SLIDE 84

Use the intuition that we “know” I.
Actually we observe F(Y0 | I < 0, X, Z), F(Y1 | I ≥ 0, X, Z), and Pr(I ≥ 0 | X, Z).

Can construct the joint distributions F(Y0, I | X, Z) and F(Y1, I | X, Z).

SLIDE 85

Roy Case

Armed with normality (or the nonparametric assumptions in Heckman and Honoré, 1990), we can estimate

Cov(I, Y1) = [Var(Y1) − Cov(Y0, Y1)] / √(σ²Y1 + σ²Y0 − 2σY1,Y0)

Cov(I, Y0) = −[Var(Y0) − Cov(Y0, Y1)] / √(σ²Y1 + σ²Y0 − 2σY1,Y0).

We know Var Y1, Var Y0 (e.g., normal selection model or use limit sets).
∴ Cov(Y0, Y1) is identified (actually over-identified).
This line of argument does not generalize if we add a cost component (C) that is unobserved (or partly so).
It carries through exactly if C(Z) is solely a function of Z.
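
A numerical sketch of the over-identification claim in the Roy case. It is only an illustration: the population values are hypothetical, and the sample moments computed from the full simulated (Y0, Y1) stand in for the quantities Cov(I, Y1), Cov(I, Y0), Var Y1, and Var Y0 that the selection model identifies. With I = (Y1 − Y0)/σY1−Y0, each covariance equation can be solved for Cov(Y0, Y1), giving two routes to the same number.

```python
import math
import random

random.seed(2)

def cov(a, b):
    """Sample covariance with divisor n."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n

# Hypothetical population: Var(Y0) = 1, Var(Y1) = 2.25, Cov(Y0, Y1) = 0.5.
TRUE_COV = 0.5
y0, y1 = [], []
for _ in range(50_000):
    e0, e1 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    y0.append(e0)
    y1.append(0.5 * e0 + math.sqrt(2.0) * e1)

diff = [b - a for a, b in zip(y0, y1)]
sd_diff = math.sqrt(cov(diff, diff))
index = [d / sd_diff for d in diff]        # I = (Y1 - Y0) / sigma_{Y1-Y0}

c1, c0 = cov(index, y1), cov(index, y0)    # stand-ins for identified covariances
var1, var0 = cov(y1, y1), cov(y0, y0)      # stand-ins for identified variances

sigma = c1 - c0                            # Cov(I, Y1) - Cov(I, Y0) = sigma_{Y1-Y0}
cov_from_y1 = var1 - c1 * sigma            # solves the Cov(I, Y1) equation
cov_from_y0 = var0 + c0 * sigma            # solves the Cov(I, Y0) equation
print(cov_from_y1, cov_from_y0)
```

The two solutions agree, and both are close to the covariance used to generate the data. Once an unobserved cost C enters I, the difference of the two covariances no longer equals σY1−Y0 and this shortcut is lost.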

SLIDE 86

Intuition

In the Roy model the decision rule is generated solely by (Y1, Y0).
Knowing agent choices, we observe the relative order (and magnitude) of Y1 and Y0.
Thus we get a second valuable piece of information from agent choices. This information is ignored in statistical approaches to program evaluation.
But does this analysis generalize?

SLIDE 87

Generalized Roy Model

Add cost

I = Y1 − Y0 − C

Assume that we do not directly observe C.

Observe Y1 | I > 0 and observe Y0 | I < 0, where, after normalization,

I = (Y1 − Y0 − C) / √Var(Y1 − Y0 − C).

SLIDE 88

We can identify Var Y1 and can identify Var Y0.
But we cannot directly identify Cov(Y0, Y1), which measures comparative advantage in the Willis-Rosen model.
Notice, however, we can determine if

E(Y1 | I > 0) > E(Y1)
E(Y0 | I < 0) > E(Y0)

(Are people who work in a sector above average for the sector?)
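
A simulation sketch of the sector comparison in a generalized Roy model. All distributions and parameter values are hypothetical: Y0, Y1, and the cost C are drawn as independent normals. The code simply compares the mean outcome of those who select into each sector with the population mean for that sector.

```python
import random

random.seed(3)

n = 100_000
sum_y1 = sum_y0 = 0.0
sum_y1_in = sum_y0_out = 0.0
n_in = n_out = 0
for _ in range(n):
    y0 = random.gauss(0.0, 1.0)
    y1 = random.gauss(0.5, 1.0)
    c = random.gauss(0.2, 1.0)     # hypothetical cost, unobserved by the analyst
    sum_y1 += y1
    sum_y0 += y0
    if y1 - y0 - c > 0:            # D = 1: Y1 is observed
        sum_y1_in += y1
        n_in += 1
    else:                          # D = 0: Y0 is observed
        sum_y0_out += y0
        n_out += 1

mean_y1, mean_y0 = sum_y1 / n, sum_y0 / n
mean_y1_selected = sum_y1_in / n_in     # estimate of E(Y1 | I > 0)
mean_y0_selected = sum_y0_out / n_out   # estimate of E(Y0 | I < 0)
print(mean_y1_selected > mean_y1, mean_y0_selected > mean_y0)
```

In this particular design both comparisons come out positive; with correlated outcomes or costs the answer can differ, which is why it is an empirical question.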



Table of Contents

1 Main: Defining LATE
LATE
Identifying Policy Parameters
2 Example of Normal Model
3 Nonparametric Identification of the Roy Model

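Defining LATE

Question:* Derive the MTE from the sample selection model.
What parameters are identified by the selection model that are not identified by MTE? Explain the advantages and disadvantages of each approach. *Answer after reading these slides

LATE

The instrument in LATE plays the role of a randomized assignment.

D(z): indicator of hypothetical choice representing what choice the individual would have made had the individual’s Z been exogenously set to z.

One can think of the values of z as fixed by an experiment or by some other mechanism independent of (Y0, Y1).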

IA Assumption 1

(Y0, Y1, {D(z)}z∈Z) ⊥⊥ Z | X

IA Assumption 2

Pr(D = 1 | Z = z) is a nontrivial function of z conditional on X.

IA Assumption 3

For any two values of Z, say Z = z1 and Z = z2, either D(z1) ≥ D(z2) for all persons, or D(z1) ≤ D(z2) for all persons.

Z, say z3 and z4, the direction of the inequalities on D(z3) and D(z4) have to be ordered in the same direction as they are for D(z1) and D(z2).

same across people.

IV applied to LATE(z2, z1) = E(Y1 − Y0 | D(z2) = 1, D(z1) = 0), if the change from z1 to z2 induces people into the program (D(z2) ≥ D(z1)).

people induced to switch treatment status by the change from z1 to z2.

their treatment status by the change in the instrument.

components of vector Z as used to identify LATE but at different values of Z (say z4, z3), LATE(z2,z1) does not identify LATE(z4, z3).

used to identify LATE, one cannot safely use LATE to identify marginal returns to the policy.

better as I show below.

Identifying Policy Parameters

Y1 = µ1(X) + U1, Y0 = µ0(X) + U0, C = µC(Z) + UC, (1)

µD(Z) = E(µ1(X) − µ0(X) − µC(Z) | I)
V = −E(U1 − U0 − UC | I).

D = 1(µD(Z) ≥ V). (2)

µ1(X), and µC(Z) were assumed to be linear in the parameters, and the unobservables were assumed to be normal and distributed independently of X and Z.

modeling of outcome and choice equations.

identification analyses for the Roy and generalized Roy models.

To Recapitulate

A useful fact: Assume Z ⊥⊥ V (implied by IA Assumption 1). Then

Choice Probability: P(z) = Pr(D = 1 | Z = z) = Pr(µD(z) ≥ V) = Pr(µD(z)/σV ≥ V/σV) = F(µD(z)/σV),

where F is the distribution function of V/σV.

UD = F(V/σV) is distributed Uniform(0, 1).

P(z) = Pr(F(µD(z)/σV) ≥ F(V/σV)) = Pr(P(z) ≥ UD)

P(z) is the p(z)th quantile of UD.
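
A short simulation sketch of this change of variables, assuming V is normal with a hypothetical scale SIGMA_V and a hypothetical index value mu_D. It checks that UD = F(V/σV) behaves like a Uniform(0, 1) draw and that the two ways of writing the choice rule, D = 1(µD(z) ≥ V) and D = 1(P(z) ≥ UD), pick out the same people.

```python
import random
from statistics import NormalDist

random.seed(4)

SIGMA_V = 2.0                      # hypothetical scale of V
F = NormalDist().cdf               # CDF of V / sigma_V under assumed normality
mu_D = 1.0                         # hypothetical value of mu_D(z)
P_z = F(mu_D / SIGMA_V)            # P(z) = Pr(mu_D(z) >= V)

n = 100_000
agree = treated = 0
u_d_sum = 0.0
for _ in range(n):
    v = random.gauss(0.0, SIGMA_V)
    u_d = F(v / SIGMA_V)           # U_D = F(V / sigma_V)
    d_index = mu_D >= v            # D = 1(mu_D(z) >= V)
    d_quantile = P_z >= u_d        # D = 1(P(z) >= U_D)
    agree += d_index == d_quantile
    treated += d_index
    u_d_sum += u_d

share_agree = agree / n
share_treated = treated / n        # should be close to P(z)
mean_u_d = u_d_sum / n             # should be close to 1/2 for a uniform
print(share_agree, share_treated, P_z, mean_u_d)
```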

Recall Y = DY1 + (1 − D)Y0 = Y0 + D(Y1 − Y0)

Keep X implicit (condition on X = x)

E(Y | Z = z) = E(Y0) + E(Y1 − Y0 | D = 1, Z = z)P(z)
= E(Y0) + E(Y1 − Y0 | P(z) ≥ UD)P(z)

∴ It depends on Z only through P(Z).

E(Y | Z = z′) = E(Y0) + E(Y1 − Y0 | P(z′) ≥ UD)P(z′)

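E(Y1 − Y0 | P(z) ≥ UD) = [∫_{−∞}^{∞} ∫_{0}^{P(z)} (y1 − y0) fY1−Y0,UD(y1 − y0, uD) duD d(y1 − y0)] / Pr(P(z) ≥ UD),

where UD = F(V/σV) (UD is a quantile).

fY1−Y0,UD(y1 − y0, uD) = fY1−Y0|UD(y1 − y0 | UD = uD) fUD(uD), with fUD(uD) = 1.

E(Y1 − Y0 | P(z) ≥ UD) = [∫_{0}^{P(z)} ∫_{−∞}^{∞} (y1 − y0) fY1−Y0,UD(y1 − y0, uD) d(y1 − y0) duD] / P(z)

E(Y1 − Y0 | P(z) ≥ UD) = [∫_{0}^{P(z)} ∫_{−∞}^{∞} (y1 − y0) fY1−Y0|UD(y1 − y0 | UD = uD) d(y1 − y0) duD] / P(z)
= [∫_{0}^{P(z)} E(Y1 − Y0 | UD = uD) duD] / P(z)

∴ E(Y | Z = z) = E(Y0) + ∫_{0}^{P(z)} E(Y1 − Y0 | UD = uD) duD

∂E(Y | Z = z)/∂P(z) = E(Y1 − Y0 | UD = P(z)) (for people with UD = P(z))
= MTE(UD) for UD = P(Z)

E(Y | Z = z′) = E(Y0) + ∫_{0}^{P(z′)} E(Y1 − Y0 | UD = uD) duD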

∴ E(Y | Z = z) − E(Y | Z = z′) = ∫_{P(z′)}^{P(z)} E(Y1 − Y0 | UD = uD) duD
= E(Y1 − Y0 | P(z) ≥ UD ≥ P(z′)) Pr(P(z) ≥ UD ≥ P(z′))

Notice Pr(P(z) ≥ UD ≥ P(z′)) = ∫_{P(z′)}^{P(z)} duD = P(z) − P(z′)

E(Y | Z = z) − E(Y | Z = z′) = E(Y1 − Y0 | P(z) ≥ UD ≥ P(z′)) (P(z) − P(z′))

[E(Y | Z = z) − E(Y | Z = z′)] / [P(z) − P(z′)] = LATE(z, z′) = [∫_{P(z′)}^{P(z)} MTE(uD) duD] / [P(z) − P(z′)]
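
A simulation sketch of the last equality, under stated assumptions: a hypothetical linear MTE, two instrument values with hypothetical propensity scores P(z′) = 0.3 and P(z) = 0.7, and random assignment of the instrument. It compares the ratio of differences in means (the left-hand side) with the average of MTE(uD) over the interval between P(z′) and P(z).

```python
import random

random.seed(5)

P_LOW, P_HIGH = 0.3, 0.7           # hypothetical P(z') and P(z)

def mte(u):
    """Hypothetical marginal treatment effect, E(Y1 - Y0 | U_D = u)."""
    return 2.0 - 2.0 * u

n = 200_000
sums = {P_LOW: [0.0, 0, 0], P_HIGH: [0.0, 0, 0]}   # [sum of Y, sum of D, count]
for _ in range(n):
    p = random.choice((P_LOW, P_HIGH))             # instrument assigned at random
    u_d = random.random()                          # U_D ~ Uniform(0, 1)
    y0 = random.gauss(0.0, 1.0)
    gain = mte(u_d) + random.gauss(0.0, 0.5)       # Y1 - Y0
    d = 1 if p >= u_d else 0                       # D = 1(P(z) >= U_D)
    cell = sums[p]
    cell[0] += y0 + d * gain                       # Y = Y0 + D(Y1 - Y0)
    cell[1] += d
    cell[2] += 1

mean_y = {p: s[0] / s[2] for p, s in sums.items()}
mean_d = {p: s[1] / s[2] for p, s in sums.items()}
wald = (mean_y[P_HIGH] - mean_y[P_LOW]) / (mean_d[P_HIGH] - mean_d[P_LOW])

# Midpoint-rule integral of MTE over [P(z'), P(z)], divided by P(z) - P(z').
grid = 1_000
step = (P_HIGH - P_LOW) / grid
late_from_mte = (sum(mte(P_LOW + (k + 0.5) * step) for k in range(grid))
                 * step / (P_HIGH - P_LOW))
print(wald, late_from_mte)
```

The two numbers should agree up to sampling error. Changing P_LOW and P_HIGH changes the interval of uD being averaged, and hence the LATE.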

measure of surplus of agents for whom P(z) ≥ UD?

The Surplus From Treatment and the Marginal Treatment Effect

deeply what economic questions LATE answers.

for people with UD less than or equal to p is

E(Y1 − Y0 | P(Z) ≥ UD, P(Z) = p) = E(Y1 − Y0 | p ≥ UD) = E(Y1 − Y0 | µD(z) ≥ V). (3)

that arises from participation in the program for people whose UD is at or below p and the proportion of people whose UD is at or below p:

E(Y1 − Y0 | p ≥ UD)p = S(p).

E(Y | P(Z) = p) = E(Y0 + 1(p ≥ UD)(Y1 − Y0)) = E(Y0) + E(Y1 − Y0 | p ≥ UD)p (4)