Dynamic Programming: Estimation (ECON 34430: Topics in Labor Markets)



SLIDE 1

Dynamic Programming: Estimation

ECON 34430: Topics in Labor Markets

  • T. Lamadon (U of Chicago)

Winter 2016

SLIDE 2

Agenda

1 Introduction

  • General formulation
  • Assumptions
  • Estimation in general

2 Estimation of Rust models

  • Example of Rust and Phelan
  • NFXP
  • Partial likelihood
  • Hotz and Miller
  • I follow Aguirregabiria and Mira (2010)
SLIDE 3

Introduction

SLIDE 4

General formulation

  • time is discrete, indexed by t
  • agents are indexed by i
  • the state of the world at time t is sit
  • the control variable is ait
  • agent preferences are

$$\sum_{j=0}^{T} \beta^j \, U(a_{i,t+j}, s_{i,t+j})$$

  • agents have beliefs about state transitions F(si,t+1 | ait, sit)
SLIDE 5

Decision

  • the agent's Bellman equation is

$$V(s_{it}) = \max_{a \in A} \Big\{ U(a, s_{it}) + \beta \int V(s_{i,t+1})\, dF(s_{i,t+1} \mid a, s_{it}) \Big\}$$

  • we define the choice-specific value function, or Q-value:

$$v(a, s_{it}) = U(a, s_{it}) + \beta \int V(s_{i,t+1})\, dF(s_{i,t+1} \mid a, s_{it})$$

  • and the policy function:

$$\alpha(s_{it}) = \arg\max_{a \in A} v(a, s_{it})$$
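The Bellman fixed point above can be computed by value-function iteration when the state space is discrete. A minimal sketch, where the primitives u, F, and β are made-up numbers for illustration, not from the course:

```python
import numpy as np

beta = 0.95
u = np.array([[1.0, 0.5, 0.0],      # u[a, s]: flow payoff of action a in state s
              [0.0, 0.8, 1.2]])
F = np.array([[[0.7, 0.2, 0.1],     # F[a, s, s']: transition probability to s'
               [0.3, 0.5, 0.2],
               [0.1, 0.3, 0.6]],
              [[0.5, 0.4, 0.1],
               [0.2, 0.6, 0.2],
               [0.1, 0.2, 0.7]]])

V = np.zeros(3)
for _ in range(1000):
    # choice-specific values v(a, s) = u(a, s) + beta * sum_s' V(s') F(s'|a, s)
    v = u + beta * F @ V
    V_new = v.max(axis=0)           # Bellman operator: V(s) = max_a v(a, s)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

policy = v.argmax(axis=0)           # policy alpha(s) = argmax_a v(a, s)
```

The contraction property of the Bellman operator (for β < 1) guarantees convergence from any starting V.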
SLIDE 6

Data

  • ait is the action
  • xit is the observed subset of sit = (xit, εit)
  • εit gives a source of variation at the individual level
  • it can have a structural interpretation (preference shocks)
  • yit is a payoff variable, yit = Y(ait, xit, εit)
  • then U(ait, sit) = U(yit, ait, sit)
  • earnings is a good example
  • Data = {ait, xit, yit : i = 1, 2, ..., N; t = 1, 2, ..., Ti}
  • usually N is large and Ti is small
SLIDE 7

Estimation

  • the parameter θ affects U(a, sit) and F(si,t+1 | a, sit)
  • we have an estimation criterion gN(θ)
  • an example is the likelihood, gN(θ) = Σi li(θ), with

$$l_i(\theta) = \log \Pr\big[\alpha(x_{it}, \epsilon_{it}, \theta) = a_{it},\; Y(a_{it}, x_{it}, \epsilon_{it}, \theta) = y_{it},\; x_{it},\; t = 1..T_i \,\big|\, \theta\big]$$

  • in general we need to solve for α(·) at each value of θ
  • the particular form of li(θ) depends on the relation between observables and unobservables
SLIDE 8

Econometric assumptions

SLIDE 9

Assumptions

AS Additive separability

  • U(a, xit, εit) = u(a, xit) + εit(a)
  • εit(a) is one-dimensional, mean zero, with unbounded support
  • there is one shock per choice, so εit is (J + 1)-dimensional

IID IID unobservables

  • εit is iid across agents and time
  • with distribution Gε(εit)

CLOGIT Conditional logit

  • εit(a) are independent across alternatives, with a type-1 extreme value distribution
SLIDE 10

Assumptions

CI-X Conditional independence of future x

  • xi,t+1 ⊥ εit | ait, xit
  • θf describes F(xi,t+1 | ait, xit)
  • future realizations of the state do not depend on the current shock

CI-Y Conditional independence of y

  • yit ⊥ εit | ait, xit
  • θY describes F(yit | ait, xit)
  • this rules out Heckman-type selection

DIS Discrete support for x

  • the support of xit is finite
SLIDE 11

Example 1

SLIDE 12

Retirement model, Rust and Phelan (1997)

Model

  • consumption is cit = yit − hcit (hcit is health care expenditure)
  • earnings are yit = ait wit + (1 − ait) bit
  • mit is marital status (Markov), hit is health status (Markov)
  • ppit is pension points, with Fpp(pp_{i,t+1} | wit, ppit)
  • preferences:

$$U(a_{it}, x_{it}, \epsilon_{it}) = E\big[c_{it}^{\theta_{u1}} \mid a_{it}, x_{it}\big] \cdot \exp\Big(\theta_{u2} + \theta_{u3} h_{it} + \theta_{u4} m_{it} + \theta_{u5} \frac{t_{it}}{1 + t_{it}}\Big) - \theta_{u6} a_{it} + \epsilon_{it}(a_{it})$$

  • wages:

$$w_{it} = \exp\Big(\theta_{w1} + \theta_{w2} h_{it} + \theta_{w3} m_{it} + \theta_{w4} \frac{t_{it}}{1 + t_{it}} + \theta_{w5}\, pp_{it} + \xi_{it}\Big)$$
SLIDE 13

Retirement model, Rust and Phelan (1997)

Assumptions

  • CI-Y holds since ξit is
  • serially uncorrelated
  • independent of xit and εit
  • unknown at the time of the decision
  • AS, additive separability, is also assumed here
  • this implies that there is no uncertainty about future marginal utilities of consumption
  • CI-X also holds
  • future xit do not depend on the current shock, just the current action (it does depend on ξit though)
  • DIS and IID also hold
SLIDE 14

Retirement model, Rust and Phelan (1997)

Implications

  • under CI-X and IID we get that

$$F(x_{i,t+1}, \epsilon_{i,t+1} \mid a_{it}, x_{it}, \epsilon_{it}) = G_{\epsilon}(\epsilon_{i,t+1})\, F_x(x_{i,t+1} \mid a_{it}, x_{it})$$

  • the unobserved εit drops from the state space, so we can look at the integrated value function, or Emax function:

$$\bar V(x_{it}) = \int \max_{a \in A} \Big\{ u(a, x_{it}) + \epsilon_{it}(a) + \beta \sum_{x_{i,t+1}} \bar V(x_{i,t+1})\, f_x(x_{i,t+1} \mid a, x_{it}) \Big\}\, dG_{\epsilon}(\epsilon_{it})$$

  • the computational complexity is driven only by the size of the support of x
  • we define

$$v(a, x_{it}) = u(a, x_{it}) + \beta \sum_{x_{i,t+1}} \bar V(x_{i,t+1})\, f_x(x_{i,t+1} \mid a, x_{it})$$
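
With discrete support for x, the Emax fixed point can be approximated by simulating the shock integral with a fixed set of draws. A sketch under invented primitives (u, fx, β, and the Gumbel shocks are all hypothetical, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 0.9
n_x, n_a = 4, 2                       # hypothetical support sizes
u = rng.normal(size=(n_a, n_x))       # flow utilities u(a, x)
fx = rng.dirichlet(np.ones(n_x), size=(n_a, n_x))   # fx[a, x, x']
eps = rng.gumbel(size=(5000, n_a))    # iid shock draws, reused every iteration

Vbar = np.zeros(n_x)
for _ in range(500):
    v = u + beta * fx @ Vbar          # v[a, x]
    # Emax: average max_a {v(a, x) + eps(a)} over the simulated shocks
    Vbar_new = (v[None, :, :] + eps[:, :, None]).max(axis=1).mean(axis=0)
    if np.max(np.abs(Vbar_new - Vbar)) < 1e-9:
        Vbar = Vbar_new
        break
    Vbar = Vbar_new
```

Reusing the same draws across iterations keeps the simulated operator a contraction, so the loop converges to a fixed point of the simulated Emax.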

SLIDE 16

Retirement model, Rust and Phelan (1997)

Implications 2

  • under CI-X and IID the log-likelihood is separable:

$$l_i(\theta) = \sum_t \log P(a_{it} \mid x_{it}, \theta) + \sum_t \log f_Y(y_{it} \mid a_{it}, x_{it}, \theta_Y) + \sum_t \log f_X(x_{i,t+1} \mid a_{it}, x_{it}, \theta_f) + \log \Pr[x_{i1} \mid \theta]$$

  • note how each term can be tackled separately; in particular, the wage equation and the transition probabilities can be estimated directly from the data
  • P(ait | xit, θ) is referred to as the Conditional Choice Probability, or CCP
SLIDE 17

Retirement model, Rust and Phelan (1997)

Implications 3

  • the CCP is given by:

$$P(a_{it} = a \mid x_{it}, \theta) = \int I\big[\alpha(x_{it}, \epsilon_{it}; \theta) = a\big]\, dG_{\epsilon}(\epsilon_{it}) = \int I\big[v(a, x_{it}) + \epsilon_{it}(a) > v(a', x_{it}) + \epsilon_{it}(a') \text{ for all } a' \neq a\big]\, dG_{\epsilon}(\epsilon_{it})$$

  • if we add the CLOGIT assumption we get:

$$\bar V(x_{it}) = \log \sum_{a \in A} \exp\Big\{ u(a, x_{it}) + \beta \sum_{x_{i,t+1}} \bar V(x_{i,t+1})\, f_x(x_{i,t+1} \mid a, x_{it}) \Big\}$$

  • we do not even need to do a maximization!
SLIDE 18

Retirement model, Rust and Phelan (1997)

Implications 4

  • using:

$$v(a, x_{it}) = u(a, x_{it}) + \beta \sum_{x_{i,t+1}} \bar V(x_{i,t+1})\, f_x(x_{i,t+1} \mid a, x_{it})$$

  • we get our CCP:

$$P(a \mid x_{it}, \theta) = \frac{\exp\{v(a, x_{it})\}}{\sum_j \exp\{v(a_j, x_{it})\}}$$
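
Together, the logsum fixed point and the softmax CCP give a maximization-free algorithm: iterate the logsum to get V̄, then read the CCPs off the converged choice-specific values. A sketch with made-up primitives (u, fx, and β are illustrative numbers only):

```python
import numpy as np

beta = 0.9
u = np.array([[0.5, 1.0],            # u[a, x]
              [1.0, 0.2]])
fx = np.array([[[0.8, 0.2], [0.4, 0.6]],
               [[0.5, 0.5], [0.3, 0.7]]])   # fx[a, x, x']

Vbar = np.zeros(2)
for _ in range(1000):
    v = u + beta * fx @ Vbar                 # choice-specific values v(a, x)
    Vbar_new = np.log(np.exp(v).sum(axis=0)) # CLOGIT closed form: logsum over a
    if np.max(np.abs(Vbar_new - Vbar)) < 1e-12:
        Vbar = Vbar_new
        break
    Vbar = Vbar_new

# CCP: P(a | x) = exp(v(a, x)) / sum_j exp(v(a_j, x))
P = np.exp(v) / np.exp(v).sum(axis=0)
```

No max operator and no shock integration appear anywhere, which is exactly the point of the CLOGIT assumption.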
SLIDE 19

Estimation procedures

SLIDE 20

Nested fixed point

1 pick a parameter θ
2 solve for the policy α(·):

$$V(s_{it}) = \max_{a \in A} \Big\{ U(a, s_{it}) + \beta \int V(s_{i,t+1})\, dF(s_{i,t+1} \mid a, s_{it}) \Big\}$$

3 compute the full likelihood:

$$l_i(\theta) = \log \Pr\big[\alpha(x_{it}, \epsilon_{it}, \theta) = a_{it},\; Y(a_{it}, x_{it}, \epsilon_{it}, \theta) = y_{it},\; x_{it},\; t = 1..T_i \,\big|\, \theta\big]$$

4 update θ (use a gradient method or other)

  • very intuitive
  • provides the MLE estimate
  • very costly; sometimes you need to simulate the integral (the objective might not be smooth)
  • how closely to solve the inside problem?
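
The four steps above can be sketched as follows, using a CLOGIT inner solver and a crude grid search for the outer update. Everything here is hypothetical: the primitives, the data-generating θ = 1, and the grid are all invented for illustration.

```python
import numpy as np

beta = 0.9
fx = np.array([[[0.8, 0.2], [0.4, 0.6]],
               [[0.5, 0.5], [0.3, 0.7]]])   # fx[a, x, x'], treated as known

def solve_ccp(theta):
    """Inner fixed point: solve the CLOGIT logsum at this theta, return CCPs."""
    u = theta * np.array([[0.5, 1.0], [1.0, 0.2]])   # u(a, x; theta)
    Vbar = np.zeros(2)
    for _ in range(2000):
        v = u + beta * fx @ Vbar
        Vbar_new = np.log(np.exp(v).sum(axis=0))
        if np.max(np.abs(Vbar_new - Vbar)) < 1e-12:
            break
        Vbar = Vbar_new
    return np.exp(v) / np.exp(v).sum(axis=0)         # P[a, x]

# simulate choice data from theta_true = 1.0
rng = np.random.default_rng(1)
x = rng.integers(0, 2, size=2000)
P_true = solve_ccp(1.0)
a = (rng.random(2000) < P_true[1, x]).astype(int)

def neg_loglik(theta):
    P = solve_ccp(theta)             # inner DP solved at every candidate theta
    return -np.log(P[a, x]).sum()

# outer loop: here a grid search stands in for a gradient method
grid = np.linspace(0.2, 2.0, 91)
theta_hat = grid[np.argmin([neg_loglik(t) for t in grid])]
```

The cost structure is visible directly: every outer evaluation of the likelihood pays for a full inner fixed point, which is why NFXP becomes expensive in richer state spaces.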
SLIDE 22

Rust partial likelihood

  • use the separability of the likelihood under CI-X and IID:

$$l_i(\theta) = \sum_t \log P(a_{it} \mid x_{it}, \theta) + \sum_t \log f_Y(y_{it} \mid a_{it}, x_{it}, \theta_Y) + \sum_t \log f_X(x_{i,t+1} \mid a_{it}, x_{it}, \theta_f) + \log \Pr[x_{i1} \mid \theta]$$

1 estimate fY(yit | ait, xit, θY) and fX(xi,t+1 | ait, xit, θf)
2 then iterate on θu only, using a small NFXP and using CLOGIT

SLIDE 23

Rust partial likelihood

part 2

  • solve the Bellman equation, where fx(xi,t+1 | a, xit) is given:

$$\bar V(x_{it}) = \log \sum_{a \in A} \exp\Big\{ u(a, x_{it}) + \beta \sum_{x_{i,t+1}} \bar V(x_{i,t+1})\, f_x(x_{i,t+1} \mid a, x_{it}) \Big\}$$

  • with:

$$v(a, x_{it}) = u(a, x_{it}) + \beta \sum_{x_{i,t+1}} \bar V(x_{i,t+1})\, f_x(x_{i,t+1} \mid a, x_{it}), \qquad P(a \mid x_{it}, \theta) = \frac{\exp\{v(a, x_{it})\}}{\sum_j \exp\{v(a_j, x_{it})\}}$$
SLIDE 24

Rust partial likelihood

  • take advantage of log-likelihood separability
  • lose some estimator efficiency, but gain hugely in computational efficiency
  • CLOGIT and finite T allow for exact solutions
  • Rust-type models are the building block for tackling dynamic games
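
Step 1 of the partial likelihood is cheap because, with discrete x, the transition distribution can be estimated by simple cell frequencies before any dynamic programming. A sketch on simulated data (the true transition matrix and sample sizes are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_x, n_a = 5000, 3, 2
a = rng.integers(0, n_a, size=n)                      # observed actions
x = rng.integers(0, n_x, size=n)                      # observed states
true_fx = rng.dirichlet(np.ones(n_x), size=(n_a, n_x))
x_next = np.array([rng.choice(n_x, p=true_fx[a[i], x[i]]) for i in range(n)])

# frequency estimator of fX(x' | a, x): count transitions cell by cell
counts = np.zeros((n_a, n_x, n_x))
np.add.at(counts, (a, x, x_next), 1.0)
fx_hat = counts / counts.sum(axis=2, keepdims=True)
```

Step 2 then plugs fx_hat into the small NFXP over θu only, so the expensive fixed point never touches the transition parameters.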

SLIDE 25

Hotz and Miller (1993) approach

  • under the Rust assumptions, we don't need to solve the model
  • intuition: agents in the data have already done it for us!
  • the costly part of the Rust partial likelihood is to solve, at each θu:

$$\bar V(x_{it}) = \log \sum_{a \in A} \exp\Big\{ u(a, x_{it}) + \beta \sum_{x_{i,t+1}} \bar V(x_{i,t+1})\, f_x(x_{i,t+1} \mid a, x_{it}) \Big\}$$

  • the transitions in the data follow the correct policy, so we can estimate the CCP directly.

SLIDE 26

Hotz and Miller (1993) approach

  • consider a linear utility model u(a, x, θu) = z(a, x)′θu
  • Hotz and Miller show that

$$v(a, x_t, \theta) = \tilde z(a, x_t, \theta)' \theta_u + \tilde e(a, x_t, \theta)$$

  • where z̃(a, xt, θ) and ẽ(a, xt, θ) depend on θ only through the parameters of the transition probabilities Fx and the CCPs of the individuals.
SLIDE 27

Hotz and Miller (1993) approach

  • for instance:

$$\tilde z(a, x_t, \theta) = z(a, x_t) + \sum_{\tau=1}^{T-t} \beta^{\tau}\, E_{x_{t+\tau}, \epsilon_{t+\tau}}\big[ z(\alpha(x_{t+\tau}, \epsilon_{t+\tau}, \theta), x_{t+\tau}) \mid a_t = a, x_t \big] = z(a, x_t) + \sum_{\tau=1}^{T-t} \beta^{\tau}\, E_{x_{t+\tau}}\Big[ \sum_j P(a_{t+\tau} = a_j \mid x_{t+\tau})\, z(a_j, x_{t+\tau}) \,\Big|\, a_t = a, x_t \Big]$$

  • however, we can measure P(at+τ = aj | xt+τ) directly from the data
  • we end up with logit restrictions of the form

$$I(a_{it} = a_j) - \frac{\exp\big\{\tilde z(a_j, x_{it})' \theta_u + \tilde e(a_j, x_{it})\big\}}{\sum_{j'} \exp\big\{\tilde z(a_{j'}, x_{it})' \theta_u + \tilde e(a_{j'}, x_{it})\big\}} = 0, \quad \text{for each } j, t, x_{it}$$
SLIDE 28

Hotz and Miller (1993) approach

recap

1 estimate Fx and Fy using the partial likelihood
2 estimate the conditional choice probabilities P(aj | xit)
3 construct z̃ and ẽ
4 estimate θu using the logistic expression

  • this is computationally trivial (a convex problem)
  • we never have to solve a dynamic problem!
  • of course it relies on quite strong assumptions
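
The core Hotz-Miller inversion can be illustrated in a few lines: under CLOGIT, observed CCPs identify differences in choice-specific values without solving the DP, since log P(a|x) − log P(a0|x) = v(a, x) − v(a0, x). A sketch on simulated choices, where the "true" values v_true are made-up numbers:

```python
import numpy as np

rng = np.random.default_rng(3)
v_true = np.array([[0.2, 1.1],                 # v[a, x], hypothetical
                   [0.9, 0.4]])
P_true = np.exp(v_true) / np.exp(v_true).sum(axis=0)

# simulate a large sample of (x, a) pairs from the implied CCPs
n = 200000
x = rng.integers(0, 2, size=n)
a = (rng.random(n) < P_true[1, x]).astype(int)

# step 2 of the recap: frequency estimate of the CCP
P_hat = np.array([[np.mean(a[x == j] == i) for j in range(2)]
                  for i in range(2)])

# logit inversion: recover v(1, x) - v(0, x) from the CCPs alone
dv_hat = np.log(P_hat[1]) - np.log(P_hat[0])
```

The agents "solved the model for us": the value differences come out of choice frequencies, not out of any fixed-point computation.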
SLIDE 29

Eckstein-Keane-Wolpin type models

SLIDE 30

Occupation of young men, Keane and Wolpin (1997)

Model

  • start at age 16, go to age T
  • choose between staying home (ait = 0), school (ait = 4), and working in one of 3 occupations (ait = 1, 2, 3)
  • preferences, where hit is years of schooling:

$$U(0, s_{it}) = \omega_i(0) + \epsilon_{it}(0)$$
$$U(4, s_{it}) = \omega_i(4) - \theta_{tc1} I[h_{it} \geq 12] - \theta_{tc2} I[h_{it} \geq 16] + \epsilon_{it}(4)$$
$$U(a, s_{it}) = W_{it}(a), \quad a = 1, 2, 3$$

  • wages, where ra is an occupation price and kit(a) is experience:

$$W_{it}(a) = r_a \exp\big( \omega_i(a) + \theta_{a1} h_{it} + \theta_{a2} k_{it}(a) - \theta_{a3} k_{it}(a)^2 + \epsilon_{it}(a) \big)$$
SLIDE 31

Occupation of young men, Keane and Wolpin (1997)

Data structure

  • εit(a) is jointly normal with an unrestricted covariance structure, serially uncorrelated
  • ωi(a) is choice-specific permanent heterogeneity
  • unobservable states: εit(a), ωi(a)
  • observable states: xit = {hit, tit, kit(a) : a = 1, 2, 3}
  • fx(xi,t+1 | ait, xit) is deterministic
  • the payoff Wit(a) is only observed for the chosen ait
  • do IID, CI-Y, CI-X, AS, and CLOGIT hold here?
SLIDE 32

Occupation of young men, Keane and Wolpin (1997)

Assumptions

  • εit(a) + ωi(a) is serially correlated, so no IID
  • εit(a) are normal and correlated, so no CLOGIT
  • εit(a) is observed before the decision, so no CI-Y
  • we do have CI-X, but here Fx is trivial (just accumulation)
  • is there anything we can do?
SLIDE 33

Occupation of young men, Keane and Wolpin (1997)

Latent structure

  • conditional on ωi(a), the εit(a) do satisfy IID
  • using a discrete support Ω = {ωl(a)}l=1..L we can factor the likelihood:

$$l_i(\theta, \Omega, \pi) = \log\Big( \sum_l L_i(\theta, \omega_l)\, \pi_{l \mid x_{i1}} \Big)$$

  • where

$$L_i(\theta, \omega_l) = \prod_t p(a_{it} \mid x_{it}, \theta, \omega_l) \cdot f_Y(y_{it} \mid a_{it}, x_{it}, \theta, \omega_l) \prod_t f_X(x_{i,t+1} \mid a_{it}, x_{it}, \theta_f, \omega_l)$$

  • we can apply an EM approach
  • however, fY still depends on the full θ
  • there is an initial-condition difficulty
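
The E-step for this latent-type likelihood is just Bayes' rule on the per-type contributions L_i(θ, ωl). A toy sketch in which the likelihood values and type shares are invented numbers:

```python
import numpy as np

# L[i, l]: likelihood contribution of person i under type omega_l (hypothetical)
L = np.array([[0.02, 0.10],
              [0.08, 0.01],
              [0.05, 0.05]])
pi = np.array([0.4, 0.6])          # current guess of the type shares

# E-step: posterior type weights pi_i(l) proportional to L_i(theta, omega_l) * pi_l
post = L * pi
post /= post.sum(axis=1, keepdims=True)

# M-step piece: update the unconditional type shares
pi_new = post.mean(axis=0)
```

In the full algorithm the M-step also re-estimates θ with observations weighted by these posteriors, and the loop repeats until the likelihood stops improving.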
SLIDE 35

Occupation of young men, Keane and Wolpin (1997)

Latent structure

  • conditional on ω, we have IID:

$$\bar V_l(x_{it}) = \int \max_{a \in A} \Big\{ u(a, x_{it}, \epsilon_{it}, \omega_l) + \beta \sum_{x_{i,t+1}} \bar V_l(x_{i,t+1})\, f_x(x_{i,t+1} \mid a, x_{it}, \omega_l) \Big\}\, dG_{\epsilon}(\epsilon_{it})$$

  • getting α(xit, εit, ωl) requires solving the DP L times
  • integration is done by approximating V̄l(xit) with a regression model and using Monte Carlo: V̄l(x) = φ(x)′γl + ν
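
The regression approximation V̄l(x) = φ(x)′γl + ν can be sketched as: compute V̄ at a subset of states, fit γ by least squares on basis functions φ(x), and predict V̄ everywhere else. Here the "computed" values and the quadratic basis are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
x_grid = np.linspace(0.0, 10.0, 200)
# pretend these Emax values were computed exactly at every state
Vbar = 2.0 + 0.8 * x_grid - 0.03 * x_grid**2

# solve the DP only at a random subset of states
sub = rng.choice(200, size=40, replace=False)
phi = np.column_stack([np.ones(40), x_grid[sub], x_grid[sub]**2])   # basis phi(x)
gamma, *_ = np.linalg.lstsq(phi, Vbar[sub], rcond=None)             # fit gamma_l

# interpolate Vbar at all states from the fitted regression
phi_all = np.column_stack([np.ones(200), x_grid, x_grid**2])
Vbar_hat = phi_all @ gamma
```

The saving is that the expensive Emax evaluations happen at 40 states rather than 200, with the regression filling in the rest; in practice the basis choice governs the approximation error.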

SLIDE 36

Arcidiacono and Jones

  • if CI-Y holds conditional on ωl, we get separability within

$$L_i(\theta, \omega_l) = \prod_t p(a_{it} \mid x_{it}, \theta, \omega_l) \cdot f_Y(y_{it} \mid a_{it}, x_{it}, \theta_y, \omega_l) \prod_t f_X(x_{i,t+1} \mid a_{it}, x_{it}, \theta_f, \omega_l)$$

  • this means that within an ωl group, we can get fY and fX directly once we have the posterior probabilities of ωl
SLIDE 37

Recap

  • Full nested fixed point
  • for each θ, including π and ωl, evaluate the likelihood, and maximize
  • use numerical integration
  • EM approach
  • start with a guess of θ
  • compute the posterior πi(l)
  • update θ
  • use numerical integration
  • EM approach with separability (Arcidiacono and Jones)
  • start with a guess of θ
  • compute the posterior πi(l)
  • update θ using separability!
  • use numerical integration
SLIDE 38

References

Aguirregabiria, V., and P. Mira (2010): "Dynamic discrete choice structural models: A survey," Journal of Econometrics, 156(1), 38–67.

Hotz, V. J., and R. A. Miller (1993): "Conditional Choice Probabilities and the Estimation of Dynamic Models," Review of Economic Studies, 60(3), 497–529.

Keane, M. P., and K. I. Wolpin (1997): "The Career Decisions of Young Men," Journal of Political Economy, 105(3), 473–522.

Rust, J., and C. Phelan (1997): "How Social Security and Medicare Affect Retirement Behavior in a World of Incomplete Markets," Econometrica, 65(4), 781–831.