ADVANCED ECONOMETRICS I Theory (3/3) Instructor: Joaquim J. S. - - PowerPoint PPT Presentation

SLIDE 1

ADVANCED ECONOMETRICS I

Theory (3/3)

Instructor: Joaquim J. S. Ramalho
E-mail: jjsro@iscte-iul.pt
Personal Website: http://home.iscte-iul.pt/~jjsro
Office: D5.10
Course Website: https://jjsramalho.wixsite.com/advecoi
Fénix: https://fenix.iscte-iul.pt/disciplinas/03089

SLIDE 2

Joaquim J.S. Ramalho

Ordered choices:

Values for the dependent variable: $Y \in \{0, 1, \dots, M-1\}$

Latent model: $Y_i^* = x_i'\beta + u_i$

▪ $x_i$ cannot include an intercept

Individual behaviour observed only by intervals:

$$Y_i = \begin{cases} 0 & \text{if } Y_i^* \le \gamma_0 \\ m & \text{if } \gamma_{m-1} < Y_i^* \le \gamma_m, \quad 1 \le m \le M-2 \\ M-1 & \text{if } Y_i^* > \gamma_{M-2} \end{cases}$$

▪ Example:

– $Y_i^*$ is a latent measure of the health status

– $Y_i$ is an observed health indicator: poor, satisfactory, good, excellent

Assumption: the $\gamma_j$'s are not known

  • 3. Discrete Choice Models

3.2. Models for Ordered Choices

2020/2021 Advanced Econometrics I 2

SLIDE 3

Probabilities:

Aim:

▪ Modelling the probability of observing $Y_i^*$ in a given interval

Each probability is based on the same $G(\cdot)$ functions used with binary choices, being given by:

$$\begin{aligned}
\Pr(Y_i = m \mid x_i) &= \Pr(\gamma_{m-1} < Y_i^* \le \gamma_m \mid x_i) \\
&= \Pr(Y_i^* \le \gamma_m \mid x_i) - \Pr(Y_i^* \le \gamma_{m-1} \mid x_i) \\
&= \Pr(x_i'\beta + u_i \le \gamma_m \mid x_i) - \Pr(x_i'\beta + u_i \le \gamma_{m-1} \mid x_i) \\
&= \Pr(u_i \le \gamma_m - x_i'\beta \mid x_i) - \Pr(u_i \le \gamma_{m-1} - x_i'\beta \mid x_i) \\
&= G(\gamma_m - x_i'\beta) - G(\gamma_{m-1} - x_i'\beta)
\end{aligned}$$

Hence, the general case is:

$$\Pr(Y_i = m \mid x_i) = \begin{cases} G(\gamma_0 - x_i'\beta) & \text{if } m = 0 \\ G(\gamma_m - x_i'\beta) - G(\gamma_{m-1} - x_i'\beta) & \text{if } 1 \le m \le M-2 \\ 1 - G(\gamma_{M-2} - x_i'\beta) & \text{if } m = M-1 \end{cases}$$
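The three-case formula for the outcome probabilities can be checked numerically. The sketch below is only an illustration: it assumes a logistic $G(\cdot)$ (ordered logit), and the index value and thresholds are hypothetical numbers, not estimates from any data set.

```python
import math

def logistic_cdf(z):
    """Logistic CDF, used here as the G(.) function (ordered logit)."""
    return 1.0 / (1.0 + math.exp(-z))

def ordered_probs(xb, cutoffs, cdf=logistic_cdf):
    """Return [Pr(Y=0), ..., Pr(Y=M-1)] given the index x'beta and the
    increasing thresholds gamma_0 < ... < gamma_{M-2}."""
    cum = [cdf(c - xb) for c in cutoffs]      # Pr(Y <= m) for m = 0..M-2
    probs = [cum[0]]                          # case m = 0
    probs += [cum[m] - cum[m - 1] for m in range(1, len(cum))]  # interior cases
    probs.append(1.0 - cum[-1])               # case m = M-1
    return probs

# Hypothetical values: x'beta = 0.4 and three thresholds, so M = 4 outcomes.
probs = ordered_probs(0.4, [-1.0, 0.0, 1.5])
```

Because the thresholds are increasing, every probability is positive and the M probabilities add up to one.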

SLIDE 4

Estimation:

Parameters to be estimated:

▪ $\beta$
▪ $\gamma_0, \dots, \gamma_{M-2}$

Estimation method:

▪ Maximum likelihood

Most common models:

▪ Ordered logit
▪ Ordered probit

Stata

ologit Y X1 … Xk
oprobit Y X1 … Xk

SLIDE 5

Partial effects:

Each $X_j$ affects $M$ probabilities:

$$\Delta X_j = 1 \Longrightarrow \Delta \Pr(Y = m \mid X) = \begin{cases} -\beta_j\, g(\gamma_0 - x_i'\beta) & \text{if } m = 0 \\ \beta_j \left[ g(\gamma_{m-1} - x_i'\beta) - g(\gamma_m - x_i'\beta) \right] & \text{if } 1 \le m \le M-2 \\ \beta_j\, g(\gamma_{M-2} - x_i'\beta) & \text{if } m = M-1 \end{cases}$$

where $g(\cdot)$ is the derivative of $G(\cdot)$.

The sign of $\beta_j$ is informative about the direction of $\Delta \Pr(Y = 0 \mid X)$ and $\Delta \Pr(Y = M-1 \mid X)$ but not of the changes in the remaining probabilities
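A small numeric sketch of these partial effects, obtained by differentiating $G(\gamma_m - x_i'\beta) - G(\gamma_{m-1} - x_i'\beta)$ with respect to $X_j$. It assumes a logistic $G(\cdot)$; the coefficient, index and thresholds are hypothetical values chosen only for illustration.

```python
import math

def logistic_pdf(z):
    """Logistic density g(z) = G(z) * (1 - G(z))."""
    G = 1.0 / (1.0 + math.exp(-z))
    return G * (1.0 - G)

def ordered_partial_effects(beta_j, xb, cutoffs, pdf=logistic_pdf):
    """Partial effect of X_j on Pr(Y = m | X) for m = 0..M-1."""
    g = [pdf(c - xb) for c in cutoffs]
    effects = [-beta_j * g[0]]                                          # m = 0
    effects += [beta_j * (g[m - 1] - g[m]) for m in range(1, len(g))]   # interior
    effects.append(beta_j * g[-1])                                      # m = M-1
    return effects

# Hypothetical values: beta_j = 0.8, x'beta = 0.4, thresholds as listed.
effects = ordered_partial_effects(0.8, 0.4, [-1.0, 0.0, 1.5])
```

With a positive coefficient the effect on the lowest category is negative and the effect on the highest is positive, while the interior effects can take either sign; the effects always sum to zero because the probabilities sum to one.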

SLIDE 6

Multinomial choices:

Values for the dependent variable: $Y \in \{0, 1, \dots, M-1\}$

Latent model:

▪ Each individual has a given utility associated with each alternative:

$$U_{im} = x_{im}'\beta + u_{im}$$

▪ The selected alternative is the one that maximizes utility:

$$\Pr(Y_i = m \mid x_i) = \Pr\left(U_{im} = \max(U_{i0}, \dots, U_{i,M-1}) \mid x_i\right)$$

Main models:

▪ Multinomial Logit: $U_{im} \sim \text{Gumbel}$ and $U_{im}$ independent $\forall m$
▪ Multinomial Probit: $U_{im} \sim \text{Normal}$
▪ Nested Logit
▪ Random Parameters Logit

  • 3. Discrete Choice Models

3.3. Models for Multinomial Choices

2020/2021 Advanced Econometrics I 6

SLIDE 7

Explanatory variables:

$x_{im}$ may include:

▪ $x_{im}$: variables that are different across individuals and alternatives
▪ $x_m$: variables that differ across alternatives but not individuals
▪ $x_i$: variables that differ across individuals but not alternatives

Example:

▪ $Y_i$ - selected means of transport to go to work
▪ $x_{im}$ - time that each individual $i$ takes in going to work when using transport $m$
▪ $x_m$ - price of transport $m$
▪ $x_i$ - age of individual $i$

SLIDE 8

Multinomial logit:

$$\Pr(Y_i = m \mid x_{im}) = G_m(x_{im}'\beta + x_i'\beta_m) = \frac{e^{x_{im}'\beta + x_i'\beta_m}}{\sum_{j=0}^{M-1} e^{x_{ij}'\beta + x_i'\beta_j}}$$

$\beta_m$ has to be normalized, that is, for one alternative (the base outcome) its value is set to zero

$\beta$ cannot include a constant term

Independence of Irrelevant Alternatives (IIA): the odds ratio between two alternatives does not depend on the remaining alternatives:

$$\frac{\Pr(Y_i = m \mid x_{im})}{\Pr(Y_i = l \mid x_{il})} = \frac{e^{x_{im}'\beta + x_i'\beta_m}}{e^{x_{il}'\beta + x_i'\beta_l}}$$
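The IIA property can be seen numerically: adding an alternative changes every probability but leaves the odds between any two existing alternatives untouched. This is a minimal sketch; the systematic utilities below are hypothetical numbers standing in for $x_{im}'\beta + x_i'\beta_m$.

```python
import math

def mnl_probs(utilities):
    """Multinomial logit probabilities from the systematic utilities."""
    top = max(utilities)                          # subtract the max for numerical stability
    expu = [math.exp(v - top) for v in utilities]
    total = sum(expu)
    return [e / total for e in expu]

# Hypothetical utilities for three alternatives, then the same set plus a fourth.
p3 = mnl_probs([0.5, 1.0, -0.2])
p4 = mnl_probs([0.5, 1.0, -0.2, 2.0])

odds3 = p3[0] / p3[1]   # odds of alternative 0 versus 1 with three alternatives
odds4 = p4[0] / p4[1]   # same odds after adding a fourth alternative
```

Both odds equal $e^{0.5 - 1.0}$, the exponential of the utility difference, regardless of which other alternatives are available.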

SLIDE 9

When all explanatory variables are of the type $x_{im}$ and $x_m$, the choice between alternatives $m$ and $l$ is fully explained by differences in the alternative characteristics:

$$\frac{\Pr(Y_i = m \mid x_{im})}{\Pr(Y_i = l \mid x_{il})} = \frac{e^{x_{im}'\beta}}{e^{x_{il}'\beta}} = e^{(x_{im} - x_{il})'\beta}$$

▪ In this case, the model is often called 'conditional logit'

When all explanatory variables are of the type $x_i$, the choice between alternatives $m$ and $l$ is fully explained by differences between $\beta_m$ and $\beta_l$:

$$\frac{\Pr(Y_i = m \mid x_i)}{\Pr(Y_i = l \mid x_i)} = \frac{e^{x_i'\beta_m}}{e^{x_i'\beta_l}} = e^{x_i'(\beta_m - \beta_l)}$$

Stata

asclogit Y X1m …, case(id) alternatives(varname) casevars(Xi …) basealternative(name)
mlogit Y X1 … Xk, baseoutcome(0)

SLIDE 10

Estimation:

▪ Maximum likelihood based on the following log-likelihood function:

$$LL = \sum_{i=1}^{N} \sum_{m=0}^{M-1} d_{im} \log G_m(x_{im}'\beta + x_i'\beta_m)$$

▪ $d_{im} = 1$ if individual $i$ chooses alternative $m$ (and 0 otherwise)

Partial effects:

▪ $\Delta X_{ij} = 1 \Longrightarrow$

– $\Delta \Pr(Y_i = m \mid X) = \beta_j\, G_m(\cdot) \left[ d_{im} - G_m(\cdot) \right]$

– $\beta_j$ gives the sign of the partial effect

▪ $\Delta X_i = 1 \Longrightarrow$

– $\Delta \Pr(Y_i = m \mid X) = G_m(\cdot) \left( \beta_j - \bar{\beta} \right)$, where $\bar{\beta} = \sum_{m=1}^{M-1} \beta_m G_m(\cdot)$

– $\beta_j$ gives the sign of the partial effect relative to the base alternative, not the sign of the overall effect

SLIDE 11

Testing IIA

▪ Hausman test comparing:

– Full multinomial logit model
– Multinomial logit model excluding one or more alternatives

▪ If multinomial logit is the correct model, then both models produce consistent estimators (null hypothesis)
▪ If multinomial logit is not the correct model, then the results generated by both models will be different (alternative hypothesis)

Stata

mlogit Y X1 … Xk, baseoutcome(0) (or asclogit…)
estimates store Mod1
mlogit Y X1 … Xk if Y != 3, baseoutcome(0) (or asclogit…)
estimates store Mod2
hausman Mod1 Mod2

SLIDE 12

Multinomial probit:

Not affected by the IIA property

Very complex, requiring the computation of $M - 1$ integrals

The version implemented in Stata assumes independent errors, which eliminates the only advantage of multinomial probit over multinomial logit

Stata

mprobit Y X1 … Xk, baseoutcome(0)

SLIDE 13

Nested logit:

Not affected by the IIA property, grouping the choices in several sets in such a way that:

▪ Within each group, alternatives may be correlated
▪ Between groups, alternatives are independent

Results from a sequential decision process. Example for a two-level process:

▪ Level 1 – defining J groups
▪ Level 2 – defining $M_j$ choices in each group

[Diagram: decision tree for a financing choice. First level: own funds, bank debt, stock market. Second level: Bank 1, …, Bank N under bank debt; Lisbon and Frankfurt under stock market.]

Stata

nlogit …

SLIDE 14

Random parameters logit:

Latent model: $U_{im} = x_{im}'\beta_i + u_{im}$

Most common assumption: $\beta_i \sim N(\beta, \Sigma_\beta)$

Not affected by the IIA property

If $\Sigma_\beta = 0$, it reduces to the Multinomial Logit model; hence, comparing the two models allows the IIA property to be tested

SLIDE 15

4.1. Models for Nonnegative Outcomes
4.2. Models for Fractional Responses
4.3. Models for Discrete-Continuous Responses

  • 4. Models for Continuous Limited Dependent Variables

2020/2021 Advanced Econometrics I 15

SLIDE 16

Nonnegative outcomes can be:

▪ Continuous: $Y \in [0, +\infty[$

– Examples: prices, wages, …

▪ Discrete (counts): $Y \in \{0, 1, 2, 3, \dots\}$

– Examples: patents applied for by a firm in a year, times someone is arrested in a year, …

Linear regression models are not the most suitable option because:

▪ They may generate negative predictions for the dependent variable
▪ At least close to the lower bound of $Y$, it does not make sense to assume constant partial effects

  • 4. Models for Continuous Limited Dependent Variables

4.1. Models for Nonnegative Outcomes

2020/2021 Advanced Econometrics I 16

SLIDE 17

Log-linear regression model:

$$\ln Y_i = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki} + u_i$$

With this transformation, the dependent variable becomes unbounded: $Y \in\, ]0, +\infty[ \;\Longrightarrow\; \ln Y \in\, ]-\infty, +\infty[$

Assumption: $E(u_i \mid x) = 0$

However, two new problems arise:

▪ The log-linear model is not defined for $Y = 0$; adding a small constant value to $Y$ or dropping zeros are not in general good solutions
▪ Prediction is more interesting in the original scale, $\hat{Y}_i$, and not in the logarithmic scale, $\widehat{\ln Y_i}$; the log-linear model gives the latter directly but retransforming it to the original scale requires additional assumptions and calculations and/or the application of relatively complex methods (see the next slide to understand the problem)

SLIDE 18

Assumed model: $\ln Y_i = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki} + u_i$

▪ Consistent estimation requires $E(u_i \mid x) = 0$
▪ Under $E(u_i \mid x) = 0$:

$$E(\ln Y_i \mid x) = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki}$$

$$\widehat{\ln Y_i} = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \dots + \hat{\beta}_k x_{ki}$$

Prediction of $Y_i$:

▪ If $\ln Y_i = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki} + u_i$, then:

$$Y_i = e^{\beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki} + u_i}$$

and

$$E(Y_i \mid x) = e^{\beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki}}\, E(e^{u_i} \mid x)$$

▪ Consistent prediction of $Y_i$ would require assuming $E(e^{u_i} \mid x) = 1$; however, the assumption made, $E(u_i \mid x) = 0$, implies that, in general, $E(e^{u_i} \mid x) \ne 1$
▪ Alternatively, we need to get a consistent estimate of $E(e^{u_i} \mid x)$, which requires additional assumptions
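The retransformation problem can be illustrated with a tiny numeric example. The sketch below uses a hypothetical two-point error (values $-a$ and $+a$ with equal probability), chosen only because its moments can be computed exactly; it also evaluates the well-known normal-error case, where $E(e^{u}) = e^{\sigma^2/2}$.

```python
import math

# Hypothetical error distribution: u = -a or u = +a, each with probability 1/2.
a = 0.5
u_values = [-a, a]

mean_u = sum(u_values) / 2                              # E(u) = 0 by construction
mean_exp_u = sum(math.exp(u) for u in u_values) / 2     # E(e^u) = cosh(a) > 1

# If instead u ~ N(0, sigma^2), the retransformation factor is exp(sigma^2 / 2).
sigma = 0.5
normal_factor = math.exp(sigma ** 2 / 2)
```

In both cases the error has mean zero but the factor multiplying $e^{x'\beta}$ exceeds one, so simply exponentiating the fitted log values would underpredict $Y$.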

SLIDE 19

Exponential regression model:

$$Y = \exp(x'\beta + u)$$

$$E(Y \mid X) = \exp(x'\beta)$$

Assumption: $E(e^{u} \mid x) = 1$

Advantages:

▪ $\hat{Y}_i$ is always nonnegative
▪ Predictions are obtained directly in the original scale, without requiring any retransformations

Partial effects:

$$\Delta X_j = 1 \Longrightarrow \Delta E(Y \mid X) = \beta_j \exp(x'\beta)$$

▪ The sign of the effect is given by the sign of $\beta_j$
▪ $\beta_j$ can be interpreted as a semi-elasticity (see the next slide for a proof)

SLIDE 20

$$\begin{aligned}
\Delta X_j = 1 &\Longrightarrow \Delta E(Y \mid X) = \beta_j \exp(x'\beta) \\
&\Longrightarrow \Delta E(Y \mid X) = \beta_j\, E(Y \mid X) \\
&\Longrightarrow \frac{\Delta E(Y \mid X)}{E(Y \mid X)} = \beta_j \\
&\Longrightarrow 100\, \frac{\Delta E(Y \mid X)}{E(Y \mid X)} = 100 \beta_j \\
&\Longrightarrow \%\Delta E(Y \mid X) = 100 \beta_j \%
\end{aligned}$$
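The semi-elasticity reading relies on a derivative-based approximation; for a discrete one-unit change the exact proportional change in the conditional mean is $e^{\beta_j} - 1$, which is close to $\beta_j$ when $\beta_j$ is small. A short sketch with hypothetical coefficients (an intercept of 0.2 and a slope of 0.05):

```python
import math

def exp_mean(x, beta):
    """Conditional mean E(Y|X) = exp(x'beta) of the exponential regression model."""
    return math.exp(sum(xi * bi for xi, bi in zip(x, beta)))

beta = [0.2, 0.05]      # hypothetical: intercept and slope beta_j
x0 = [1.0, 3.0]         # regressor value before the change
x1 = [1.0, 4.0]         # X_j increased by one unit

# Exact proportional change in E(Y|X) caused by the one-unit increase.
rel_change = exp_mean(x1, beta) / exp_mean(x0, beta) - 1.0
```

Here the exact change is about 5.13%, against the 5% given by $100\beta_j\%$; the gap widens as $\beta_j$ grows.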

SLIDE 21

Assumptions and estimation methods according to the type of nonnegative outcome:

▪ Continuous response:

– Assumption: only $E(Y \mid X)$; estimation: QML

▪ Count data - two alternatives:

– Assumption: only $E(Y \mid X)$; estimation: QML
– Assumption: $E(Y \mid X)$ and $\Pr(Y = j \mid X)$; estimation: ML

Three main distribution functions are used as basis for QML and/or ML estimation:

▪ Poisson
▪ Negative Binomial 1
▪ Negative Binomial 2

SLIDE 22

Poisson regression model:

$$Y_i \sim \text{Poisson}(\lambda_i) \Longrightarrow \Pr(Y_i = y \mid x_i) = \frac{e^{-\lambda_i} \lambda_i^{y}}{y!}$$

where $\lambda_i = E(Y \mid X) = \exp(x'\beta)$

Estimation methods: ML (only count data) or QML, since the Poisson distribution belongs to the linear exponential family

By definition, $E(Y \mid X) = Var(Y \mid X)$ (equidispersion), which may be a strong assumption in some empirical applications

Stata

ML: poisson Y X1 … Xk
QML: poisson Y X1 … Xk, robust
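Equidispersion can be verified directly from the Poisson probability function. The sketch below evaluates the probabilities for a hypothetical index $x'\beta = 0.9$ and computes the mean and variance by summing over a truncated support (200 terms, which is an arbitrary cut-off that leaves a negligible tail for this mean).

```python
import math

def poisson_pmf(y, lam):
    """Pr(Y = y) = exp(-lam) * lam**y / y!, computed in logs for stability."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

lam = math.exp(0.3 + 0.5 * 1.2)     # hypothetical x'beta = 0.9, so lam = exp(0.9)
support = range(200)                # truncated support; the omitted tail is negligible

mean = sum(y * poisson_pmf(y, lam) for y in support)
var = sum((y - mean) ** 2 * poisson_pmf(y, lam) for y in support)
```

Both `mean` and `var` reproduce `lam`, which is the restriction that the negative binomial models relax.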

SLIDE 23

Negative binomial regression models:

Two variants, both allowing for overdispersion ($\delta > 0$):

▪ NEGBIN1: $Var(Y \mid X) = (1 + \delta)\, E(Y \mid X)$ - ML estimation
▪ NEGBIN2: $Var(Y \mid X) = \left[ 1 + \delta E(Y \mid X) \right] E(Y \mid X)$ - it belongs to the linear exponential family, enabling estimation by both ML (only count data) and QML

Overdispersion test:

$H_0: \delta = 0$ (Poisson model)
$H_1: \delta \ne 0$ (Negative Binomial 1 or 2 model)

Stata

NEGBIN1: nbreg Y X1 … Xk, dispersion(constant)
NEGBIN2 (ML): nbreg Y X1 … Xk, dispersion(mean)
NEGBIN2 (QML): nbreg Y X1 … Xk, dispersion(mean) robust

SLIDE 24

Base panel data model:

Continuous / count data:

$$E(Y_{it} \mid x_{it}, \alpha_i) = \exp(\gamma_i + x_{it}'\beta) = \alpha_i \exp(x_{it}'\beta)$$

Count data:

$$\Pr(Y_{it} = y \mid x_{it}, \alpha_i) = \frac{e^{-\lambda_{it}} \lambda_{it}^{y}}{y!}, \qquad \lambda_{it} = E(Y_{it} \mid x_{it}, \alpha_i) = \alpha_i \exp(x_{it}'\beta)$$

Pooled estimator:

Based on the cross-sectional assumption $E(Y_{it} \mid x_{it}) = \exp(x_{it}'\beta)$

Produces consistent estimators only if $E(\alpha_i \mid x_{it}) = 1$

Stata

poisson Y X1 … Xk, vce(cluster clustvar)

SLIDE 25

Random Effects Poisson Estimator:

Assumptions:

▪ $Y_{it} \sim \text{Poisson}(\lambda_{it})$
▪ $\lambda_{it} = E(Y_{it} \mid x_{it}, \alpha_i) = \alpha_i \exp(x_{it}'\beta)$
▪ $\alpha_i = \exp(\gamma_i) \sim \text{Gamma}(1, \eta)$, so that the multiplicative effect has mean 1

Resulting model:

▪ NEGBIN2-type model
▪ Estimation method: ML
▪ $E(Y_{it} \mid x_{it}) = \exp(x_{it}'\beta)$, which implies that the Pooled estimator is consistent under random effects of this type

Stata

xtpoisson Y X1 … Xk, re vce(robust)

SLIDE 26

Fixed Effects Estimators:

Fixed effects Poisson estimator (three equivalent versions):

▪ Pooled estimator with individual effects
▪ Estimator conditional on $\sum_{t=1}^{T} Y_{it}$, with $\sum_{t=1}^{T} Y_{it} \ne 0$
▪ Quasi mean-differenced GMM estimator (Hausman, Hall and Griliches, 1984)

Quasi-differences GMM estimator:

▪ Chamberlain (1992)
▪ Wooldridge (1997)

SLIDE 27

Fixed effects Poisson estimator:

May be derived using the three equivalent versions

Pooled estimator with individual effects:

▪ Adds individual dummies, associated with the $\gamma_i$'s
▪ As in linear models, $\beta$ is consistently estimated even in short panels (no incidental parameters problem)

The quasi mean-differenced GMM estimator is based on the following moment condition:

$$E\left( Y_{it} - \frac{\lambda_{it}}{\bar{\lambda}_i}\, \bar{Y}_i \,\Big|\, x_{it} \right) = 0, \quad \text{where } \lambda_{it} = \exp(x_{it}'\beta)$$

Requires strictly exogenous explanatory variables

Stata

xtpoisson Y X1 … Xk, fe vce(robust)

SLIDE 28

Quasi-differences GMM estimator:

Chamberlain (1992):

$$E\left( Y_{it} - \frac{\lambda_{it}}{\lambda_{i,t-1}}\, Y_{i,t-1} \,\Big|\, x_{it} \right) = 0$$

Wooldridge (1997):

$$E\left( \frac{Y_{it}}{\lambda_{it}} - \frac{Y_{i,t-1}}{\lambda_{i,t-1}} \,\Big|\, x_{it} \right) = 0$$

In both cases the explanatory variables do not need to be strictly exogenous, so these estimators are particularly useful in dynamic models

SLIDE 29

Fractional outcomes:

$Y \in [0, 1]$

Base specification:

$E(Y \mid X) = G(x'\beta)$, where the $G(\cdot)$ function must respect the restriction $0 \le G(\cdot) \le 1$

Main models:

Fractional regression model: assumes only $E(Y \mid X)$

Beta regression model: assumes also $\Pr(Y \mid X)$

Transformation regression models (assume only $E(Y \mid X)$):

▪ Linear transformation
▪ Exponential transformation

  • 4. Models for Continuous Limited Dependent Variables

4.2. Models for Fractional Responses

2020/2021 Advanced Econometrics I 29

SLIDE 30

Fractional regression models:

Very similar to binary regression models

▪ Main models: Logit, Probit, Cloglog
▪ Partial effects calculated using the same expressions
▪ Estimation also based on the Bernoulli function, but only by QML

Stata

glm Y X1 … Xk, family(binomial) link(logit) robust
glm Y X1 … Xk, family(binomial) link(probit) robust
glm Y X1 … Xk, family(binomial) link(cloglog) robust

SLIDE 31

Beta regression model:

Assumes also $E(Y \mid X) = G(x'\beta)$, using the same functions for $G(\cdot)$

Additional assumption: $Y_i \sim \text{Beta}$, with mean given by $G(x'\beta)$ and precision parameter $\phi$

Estimation only by ML: more efficient, less robust

Only available when $Y \in\, ]0, 1[$, that is, when there are no observations at the boundary values 0 and 1

SLIDE 32

Linear transformation:

$$Y_i = G(x_i'\beta + u_i)$$

$$H(Y_i) = x_i'\beta + u_i$$

Alternative specifications:

▪ Logit: $H(Y_i) = \ln \dfrac{Y_i}{1 - Y_i}$
▪ Probit: $H(Y_i) = \Phi^{-1}(Y_i)$
▪ Cloglog: $H(Y_i) = \ln\left[ -\ln(1 - Y_i) \right]$

Advantages:

▪ Estimation: OLS
▪ Easy to deal with panel data and endogenous variables

Limitations:

▪ $H(Y_i)$ is not defined for $Y_i = 0$ and $Y_i = 1$
▪ Prediction in the original scale requires additional assumptions and calculations and/or the application of relatively complex methods

Example for logit:

$$\begin{aligned}
Y_i &= \frac{e^{x_i'\beta + u_i}}{1 + e^{x_i'\beta + u_i}} \\
Y_i + Y_i\, e^{x_i'\beta + u_i} &= e^{x_i'\beta + u_i} \\
Y_i &= e^{x_i'\beta + u_i} - Y_i\, e^{x_i'\beta + u_i} \\
Y_i &= (1 - Y_i)\, e^{x_i'\beta + u_i} \\
\frac{Y_i}{1 - Y_i} &= e^{x_i'\beta + u_i} \\
\ln \frac{Y_i}{1 - Y_i} &= x_i'\beta + u_i
\end{aligned}$$

SLIDE 33

Exponential transformation:

$$Y_i = G(x_i'\beta + u_i) = G_1\left( \exp(x_i'\beta + u_i) \right)$$

$$H_1(Y_i) = \exp(x_i'\beta + u_i)$$

Alternative specifications:

▪ Logit: $H_1(Y_i) = \dfrac{Y_i}{1 - Y_i}$
▪ Cloglog: $H_1(Y_i) = -\ln(1 - Y_i)$

Advantages:

▪ Estimation: same methods as those used for nonnegative responses
▪ Easy to deal with panel data and endogenous variables

Limitations:

▪ Not applicable to the probit model
▪ $H_1(Y_i)$ is not defined for $Y_i = 1$ (but it is for $Y_i = 0$)
▪ Prediction in the original scale requires additional assumptions and calculations and/or the application of relatively complex methods

SLIDE 34

Multivariate fractional outcomes:

$Y_{im} \in [0, 1]$, $m = 0, \dots, M-1$

$\sum_{m=0}^{M-1} Y_{im} = 1$

Base specification:

$$E(Y_{im} \mid X_i) = G_m(x'\beta)$$

The $G_m(\cdot)$ function must respect the restrictions $0 \le G_m(\cdot) \le 1$ and $\sum_{m=0}^{M-1} G_m = 1$

Main models:

Multivariate fractional regression model

Dirichlet regression model

SLIDE 35

Multivariate fractional regression model:

Very similar to multinomial choice models

▪ Main models: Multinomial Logit, Nested Logit, Random Parameters Logit, …
▪ Partial effects calculated using the same expressions

QML estimation based on the multivariate Bernoulli function

Dirichlet regression model:

Assumes the same specifications for $G_m(\cdot)$

Additional assumption: $Y_i \sim \text{Dirichlet}$, with means given by $G_m(x'\beta)$ and precision parameter $\phi$

Estimation only by ML: more efficient, less robust

Only available when $Y_{im} \in\, ]0, 1[$

SLIDE 36

Panel data - base specification:

$$E(Y_{it} \mid x_{it}, \alpha_i) = G(\alpha_i + x_{it}'\beta)$$

Estimators:

Pooled estimator (requires $\alpha_i = \alpha$ for consistency)

Pooled with individual effects (requires $T \to \infty$ for consistency)

Random effects (assumes $\alpha_i \sim N(0, \sigma_\alpha^2)$)

Fixed effects (based on linear or exponential transformations)

SLIDE 37

Tobit Model
Two-Part Model
Sample Selection Model

  • 4. Models for Continuous Limited Dependent Variables

4.3. Models for Discrete-Continuous Responses

2020/2021 Advanced Econometrics I 37

SLIDE 38

Motivation:

Sometimes, the dependent variable has both discrete and continuous values; typically:

▪ Discrete value: for many individuals, $Y_i = 0$
▪ Continuous component: for the remaining individuals, $Y_i$ may take on some positive value, which may be bounded (fractional outcome) or not (nonnegative outcome)

Examples:

▪ Expenditures on durable goods, alcohol, …
▪ Work hours

SLIDE 39

Alternative models:

Tobit model: a single model explains all values

Two-part model: uses two independent models for explaining separately the zeros and the positive values

Sample selection model: uses two different, but interdependent, models for explaining the zeros and the positive values

SLIDE 40

Tobit model - specification:

Latent model: $Y_i^* = x_i'\beta + u_i$, $\; -\infty < Y_i^* < +\infty$

Instead of $Y_i^*$, it is observed:

$$Y_i = \begin{cases} 0 & \text{if } Y_i^* \le 0 \\ Y_i^* & \text{if } Y_i^* > 0 \end{cases}$$

Assumption: $u_i \sim N(0, \sigma^2)$

▪ Probability of a zero:

$$\begin{aligned}
\Pr(Y_i = 0 \mid x_i) &= \Pr(Y_i^* \le 0 \mid x_i) = \Pr(x_i'\beta + u_i \le 0 \mid x_i) = \Pr(u_i \le -x_i'\beta \mid x_i) \\
&= \Pr\left( \frac{u_i}{\sigma} \le -\frac{x_i'\beta}{\sigma} \,\Big|\, x_i \right) = \Phi\left( -\frac{x_i'\beta}{\sigma} \right) = 1 - \Phi\left( \frac{x_i'\beta}{\sigma} \right)
\end{aligned}$$

▪ Hence:

$$f(y_i \mid x_i) = \begin{cases} 1 - \Phi\left( \dfrac{x_i'\beta}{\sigma} \right) & \text{if } Y = 0 \\[2ex] \dfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(y_i - x_i'\beta)^2}{2\sigma^2}} & \text{if } Y > 0 \end{cases}$$

SLIDE 41

Estimation:

Method: ML

Parameters to be estimated: $\beta$ and $\sigma$

Log-likelihood function:

$$LL = \sum_{i=1}^{N} \left\{ (1 - d_i) \log\left[ 1 - \Phi\left( \frac{x_i'\beta}{\sigma} \right) \right] + d_i \log\left[ \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(y_i - x_i'\beta)^2}{2\sigma^2}} \right] \right\}$$

where

$$d_i = \begin{cases} 0 & \text{if } Y_i = 0 \\ 1 & \text{if } Y_i > 0 \end{cases}$$

Stata

tobit Y X1 … Xk, ll(0)
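The Tobit log-likelihood combines a probit-type term for the zeros with a normal density term for the positive values. The function below is a sketch for a given set of index values $x_i'\beta$ and a given $\sigma$; it only evaluates the function and does not maximize it. The data and parameter values are hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(y, xb, sigma):
    """Tobit log-likelihood for outcomes y censored at zero, given x'beta and sigma."""
    ll = 0.0
    for yi, xbi in zip(y, xb):
        if yi > 0:
            # d_i = 1: log of the normal density of y_i around x_i'beta
            ll += -0.5 * math.log(2.0 * math.pi * sigma ** 2) \
                  - (yi - xbi) ** 2 / (2.0 * sigma ** 2)
        else:
            # d_i = 0: log of Pr(Y_i = 0 | x_i) = 1 - Phi(x_i'beta / sigma)
            ll += math.log(1.0 - norm_cdf(xbi / sigma))
    return ll

# Hypothetical sample of four observations, two of them censored at zero.
y = [0.0, 1.3, 0.0, 2.1]
xb = [-0.5, 1.0, 0.2, 1.8]
ll = tobit_loglik(y, xb, 1.0)
```

In practice the maximization over $\beta$ and $\sigma$ is left to a numerical optimizer, which is what the `tobit` command does internally.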

SLIDE 42

Quantities of interest:

Conditional mean given that $Y_i$ is positive:

$$E(Y_i \mid x_i, Y_i > 0) = x_i'\beta + \sigma\, \lambda\left( \frac{x_i'\beta}{\sigma} \right)$$

where

$$\lambda\left( \frac{x_i'\beta}{\sigma} \right) = \frac{\phi\left( \frac{x_i'\beta}{\sigma} \right)}{\Phi\left( \frac{x_i'\beta}{\sigma} \right)}$$

is the inverse Mills ratio

Probability of observing positive values for $Y_i$:

$$\Pr(Y_i > 0 \mid x_i) = \Phi\left( \frac{x_i'\beta}{\sigma} \right)$$

Overall conditional mean:

$$\begin{aligned}
E(Y_i \mid x_i) &= \Pr(Y_i = 0 \mid x_i)\, E(Y_i \mid x_i, Y_i = 0) + \Pr(Y_i > 0 \mid x_i)\, E(Y_i \mid x_i, Y_i > 0) \\
&= \Phi\left( \frac{x_i'\beta}{\sigma} \right) x_i'\beta + \sigma\, \phi\left( \frac{x_i'\beta}{\sigma} \right)
\end{aligned}$$
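The three Tobit quantities are linked: the overall mean is the probability of a positive value times the mean of the positive values. The sketch below computes all three for a hypothetical index $x_i'\beta = 0.5$ and $\sigma = 2$.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_means(xb, sigma):
    """Return Pr(Y>0|x), E(Y|x, Y>0) and E(Y|x) for the Tobit model."""
    z = xb / sigma
    mills = norm_pdf(z) / norm_cdf(z)      # inverse Mills ratio lambda(z)
    prob_pos = norm_cdf(z)
    mean_pos = xb + sigma * mills
    mean_all = prob_pos * mean_pos         # the zeros contribute nothing to the mean
    return prob_pos, mean_pos, mean_all

# Hypothetical values: x'beta = 0.5 and sigma = 2.
prob_pos, mean_pos, mean_all = tobit_means(0.5, 2.0)
```

Expanding `prob_pos * mean_pos` gives $\Phi(z)\,x'\beta + \sigma\phi(z)$, the closed form of the overall conditional mean.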

SLIDE 43

Partial effects:

$\Delta X_j = 1 \Longrightarrow$

▪ $\Delta E(Y_i \mid x_i, Y_i > 0) = \beta_j \left\{ 1 - \lambda\left( \dfrac{x_i'\beta}{\sigma} \right) \left[ \dfrac{x_i'\beta}{\sigma} + \lambda\left( \dfrac{x_i'\beta}{\sigma} \right) \right] \right\}$

▪ $\Delta \Pr(Y_i > 0 \mid x_i) = \dfrac{\beta_j}{\sigma}\, \phi\left( \dfrac{x_i'\beta}{\sigma} \right)$

▪ $\Delta E(Y_i \mid x_i) = \beta_j\, \Phi\left( \dfrac{x_i'\beta}{\sigma} \right)$

The three effects have the same sign
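The three Tobit partial effects can be evaluated side by side. In this sketch the coefficient, index and $\sigma$ are hypothetical values; the function returns the effects on the mean of the positive values, on the probability of a positive value and on the overall mean, in that order.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_partial_effects(beta_j, xb, sigma):
    """Partial effects of X_j on E(Y|x,Y>0), Pr(Y>0|x) and E(Y|x)."""
    z = xb / sigma
    lam = norm_pdf(z) / norm_cdf(z)                  # inverse Mills ratio
    pe_pos = beta_j * (1.0 - lam * (z + lam))        # effect on E(Y | x, Y > 0)
    pe_prob = beta_j / sigma * norm_pdf(z)           # effect on Pr(Y > 0 | x)
    pe_all = beta_j * norm_cdf(z)                    # effect on E(Y | x)
    return pe_pos, pe_prob, pe_all

# Hypothetical values: beta_j = 0.8, x'beta = 0.5, sigma = 2.
pe_pos, pe_prob, pe_all = tobit_partial_effects(0.8, 0.5, 2.0)
```

Each effect is $\beta_j$ multiplied by a strictly positive factor, which is why all three share the sign of $\beta_j$.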

SLIDE 44

Two-part model specification:

First part – binary regression model:

$$\Pr(d_i = 1 \mid x_i) = G_1(x_i'\beta)$$

▪ $d_i = \begin{cases} 0 & \text{if } Y_i = 0 \\ 1 & \text{if } Y_i > 0 \end{cases}$

Second part – exponential or fractional regression model:

$$E(Y_i \mid x_i, d_i = 1) = G_2(x_i'\theta)$$

Overall conditional mean:

$$\begin{aligned}
E(Y_i \mid x_i) &= \Pr(Y_i = 0 \mid x_i)\, E(Y_i \mid x_i, Y_i = 0) + \Pr(Y_i > 0 \mid x_i)\, E(Y_i \mid x_i, Y_i > 0) \\
&= G_1(x_i'\beta)\, G_2(x_i'\theta)
\end{aligned}$$
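The overall conditional mean of the two-part model is just the product of the two fitted parts. The sketch below assumes a logit first part and an exponential second part, which is one of several admissible combinations; the index values are hypothetical.

```python
import math

def two_part_mean(xb_first, xtheta_second):
    """E(Y|x) = G1(x'beta) * G2(x'theta), with logit G1 and exponential G2."""
    prob_pos = 1.0 / (1.0 + math.exp(-xb_first))   # first part: Pr(d = 1 | x)
    mean_pos = math.exp(xtheta_second)             # second part: E(Y | x, d = 1)
    return prob_pos * mean_pos

# Hypothetical indices: x'beta = 0 (probability 0.5) and x'theta = ln(10).
overall = two_part_mean(0.0, math.log(10.0))
```

With a 50% chance of a positive outcome and a mean of 10 among the positives, the overall mean is 5.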

SLIDE 45

Estimation:

Each part of the model is estimated separately:

▪ In each part, use the standard methods for the type of data being analyzed
▪ In the first part of the model, use the full sample
▪ In the second part of the model, use the subsample for which $Y_i > 0$
▪ One may use different explanatory variables in each part of the model

Partial effects:

$\Delta \Pr(d_i = 1 \mid x_i)$

$\Delta E(Y_i \mid x_i, d_i = 1)$

$$\Delta E(Y_i \mid x_i) = \Delta \Pr(d_i = 1 \mid x_i)\, E(Y_i \mid x_i, d_i = 1) + \Pr(d_i = 1 \mid x_i)\, \Delta E(Y_i \mid x_i, d_i = 1)$$

SLIDE 46

Sample selection model - latent variables:

$Y_{2i}^*$: main variable

$Y_{1i}^*$: variable that determines whether $Y_{2i}^*$ is observed or not

Two equations:

Participation equation (e.g. to work or not):

$$Y_{1i} = \begin{cases} 0 & \text{if } Y_{1i}^* \le 0 \\ 1 & \text{if } Y_{1i}^* > 0 \end{cases}$$

Outcome equation (e.g. how much to work):

$$Y_{2i} = \begin{cases} \text{(not observed)} & \text{if } Y_{1i}^* \le 0 \\ Y_{2i}^* & \text{if } Y_{1i}^* > 0 \end{cases}$$

SLIDE 47

Latent linear models:

$$\begin{cases} Y_{1i}^* = x_{1i}'\beta_1 + u_{1i} \\ Y_{2i}^* = x_{2i}'\beta_2 + u_{2i} \end{cases}$$

Assumptions:

The error terms of the two equations are assumed to be correlated, having a bivariate normal distribution:

$$\begin{pmatrix} u_{1i} \\ u_{2i} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix} \right)$$

Only when $\sigma_{12} = 0$ will the two equations be independent (the selection mechanism is exogenous or ignorable):

▪ In this case, the second equation may be estimated by OLS using only the observed data

SLIDE 48

Quantities of interest:

Conditional mean of the main latent variable:

$$E(Y_{2i}^* \mid x_i) = x_{2i}'\beta_2$$

Conditional mean of the main observed dependent variable:

$$E(Y_{2i} \mid x_i, Y_{1i} = 1) = x_{2i}'\beta_2 + \sigma_{12}\, \lambda(x_{1i}'\beta_1)$$

Probability of observing positive values:

$$\Pr(Y_{2i} > 0 \mid x_i) = \Pr(Y_{1i} = 1 \mid x_i) = \Phi(x_{1i}'\beta_1)$$

SLIDE 49

Parameters to be estimated: $\beta$, $\sigma_{12}$, $\sigma_2$

Estimation methods:

ML

Heckman's two-step method

ML:

Based on the following log-likelihood function:

$$LL = \sum_{i=1}^{N} \left\{ (1 - d_i) \log \Pr(Y_{1i} = 0 \mid x_{1i}) + d_i \left[ \log f(Y_{1i} = 1 \mid Y_{2i}) + \log f(Y_{2i}) \right] \right\}$$

Stata

heckman Y2 X1 … Xk, select(Y1 X1 … Xk)

SLIDE 50

Heckman's two-step method:

Based on $E(Y_{2i} \mid x_i, Y_{1i} = 1) = x_{2i}'\beta_2 + \sigma_{12}\, \lambda(x_{1i}'\beta_1)$

First step: estimate the probit model $\Pr(Y_{1i} = 1 \mid x_i) = \Phi(x_{1i}'\beta_1)$ and get

$$\lambda(x_{1i}'\hat{\beta}_1) = \frac{\phi(x_{1i}'\hat{\beta}_1)}{\Phi(x_{1i}'\hat{\beta}_1)}$$

Second step: regress $Y_{2i}$ on $x_{2i}$ and $\lambda(x_{1i}'\hat{\beta}_1)$ by OLS, using only the fully observed individuals, and correct the variances

t test for $H_0: \sigma_{12} = 0$ (exogenous selection mechanism)

If the same regressors are used in both steps, multicollinearity may arise; to avoid it, it is usual to exclude from $x_{2i}$ some of the variables included in $x_{1i}$

Stata

heckman Y2 X1 … Xk, twostep select(Y1 X1 … Xk)
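The link between the two steps is the inverse Mills ratio evaluated at the first-step probit index. The sketch below only builds that extra regressor for the selected observations; the probit estimates, the regressor values and the selection indicators are hypothetical, and neither the probit estimation nor the second-step OLS with corrected variances is implemented here.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(z):
    """lambda(z) = phi(z) / Phi(z)."""
    return norm_pdf(z) / norm_cdf(z)

# Hypothetical first-step probit estimates (intercept and one slope).
beta1_hat = [0.2, 0.5]
# Hypothetical selection-equation regressors (constant, x) and selection indicator Y1.
x1 = [[1.0, -1.0], [1.0, 0.0], [1.0, 1.5], [1.0, 2.0]]
y1 = [0, 1, 1, 1]

# Extra second-step regressor, computed only for individuals with Y1 = 1.
mills_terms = [
    inverse_mills(sum(b * v for b, v in zip(beta1_hat, row)))
    for row, selected in zip(x1, y1) if selected == 1
]
```

The ratio decreases as the selection index rises, so individuals who are very likely to be selected receive a small correction term.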