
Classical Discrete Choice Theory
James J. Heckman, University of Chicago, Econ 312, Spring 2019

Classical regression model: $y = X\beta + \varepsilon$, $E(\varepsilon \mid X) = 0$, $\varepsilon \sim N(0, \sigma^2 I)$


1. Debreu (1960) Criticism of the Luce Model
• "Red Bus - Blue Bus Problem"
• Suppose the $(N+1)$-th alternative is identical to the first:
$$\Pr(\text{choose } 1 \text{ or } N+1 \mid s, B') = \frac{2\, e^{\theta(s)'x_{N+1}}}{\sum_{l=1}^{N+1} e^{\theta(s)'x_l}}$$
• $\Rightarrow$ Introduction of an identical good changes the probability of riding a bus.
• Not an attractive result.
• Comes from the need to make an iid assumption on the new alternative.
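The IIA failure on this slide can be checked numerically. A minimal sketch with made-up utility values (the function and numbers are illustrative, not from the slides): under the Luce model, duplicating the bus raises the total bus share from 1/2 to 2/3.

```python
import math

def logit_probs(v):
    # Luce / multinomial logit choice probabilities for utilities v
    e = [math.exp(x) for x in v]
    s = sum(e)
    return [x / s for x in e]

p_two = logit_probs([1.0, 1.0])          # [car, red bus]
p_three = logit_probs([1.0, 1.0, 1.0])   # [car, red bus, blue bus]

bus_share_before = p_two[1]                 # one bus: share 1/2
bus_share_after = p_three[1] + p_three[2]   # duplicate bus: share jumps to 2/3
```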

2. Debreu (1960) Criticism of the Luce Model: Some Alternative Assumptions
1 Could let $v_i = \ln(\theta(s)'x_i)$:
$$\Pr(j \mid s, B) = \frac{\theta(s)'x_j}{\sum_{l=1}^{N+1} \theta(s)'x_l}$$
If we also imposed $\sum_{l=1}^{N} \theta(s)'x_l = 1$, we would get the linear probability model, but this could violate IIA.
2 Could consider a model of the form
$$\Pr(j \mid s, B) = \frac{e^{\theta_j(s)'x_j}}{\sum_{l=1}^{N} e^{\theta_l(s)'x_l}}$$
but here we have lost our forecasting ability (cannot predict demand for a new good).
3 Universal Logit Model:
$$\Pr(i \mid s, x_1, \ldots, x_N) = \frac{e^{\phi_i(x_1,\ldots,x_N)\beta(s)}}{\sum_{l=1}^{N} e^{\phi_l(x_1,\ldots,x_N)\beta(s)}}$$
Here we lose both IIA and forecasting (Bernstein polynomial …).

3. Criteria for a Good PCS
Goal: We want a probabilistic choice model that
1 has a flexible functional form,
2 is computationally practical,
3 allows for flexibility in representing substitution patterns among choices,
4 is consistent with a random utility model (RUM) $\Rightarrow$ has a structural interpretation.

4. How Do You Verify that a Candidate PCS Is Consistent with a RUM?
Either (a) start with a RUM, $u_i = v(s, x_i) + \varepsilon(s, x_i)$, and solve the integral
$$\Pr(u_i > u_l,\ \forall\, l \neq i) = \Pr\big(i = \arg\max_l \{v_l + \varepsilon_l\}\big)$$
or (b) start with a candidate PCS and verify that it is consistent with a RUM (easier).
• McFadden provides sufficient conditions.
• See the discussion of the Daly-Zachary-Williams theorem.

5. Link to AIRUM Models

6. Daly-Zachary-Williams Theorem
• Daly-Zachary (1976) and Williams (1977) provide a set of conditions that makes it easy to derive a PCS from a RUM for a class of models ("generalized extreme value" (GEV) models).
• Define $G : G(Y_1, \ldots, Y_J)$.
• $G$ must satisfy:
1 nonnegative, defined on $Y_1, \ldots, Y_J \geq 0$;
2 homogeneous of degree one in its arguments;
3 $\lim_{Y_i \to \infty} G(Y_1, \ldots, Y_i, \ldots, Y_J) \to \infty$, $\forall\, i = 1, \ldots, J$;
4 $\dfrac{\partial^k G}{\partial Y_1 \cdots \partial Y_k}$ is nonnegative if $k$ is odd, nonpositive if $k$ is even. (1)

7.
• Then for a RUM with $u_i = v_i + \varepsilon_i$ and
$$F(\varepsilon_1, \ldots, \varepsilon_J) = \exp\left\{-G\left(e^{-\varepsilon_1}, \ldots, e^{-\varepsilon_J}\right)\right\}$$
• This cdf has Weibull marginals but allows for more dependence among the $\varepsilon$'s.
• The PCS is given by
$$P_i = \frac{\partial \ln G}{\partial v_i} = \frac{e^{v_i}\, G_i(e^{v_1}, \ldots, e^{v_J})}{G(e^{v_1}, \ldots, e^{v_J})}$$
• Note: McFadden shows that under certain conditions on the form of the indirect utility function (it satisfies the AIRUM form), the DZW result can be seen as a form of Roy's identity.

8.
• Let's apply this result.
• Multinomial logit model (MNL):
$$F(\varepsilon_1, \ldots, \varepsilon_J) = e^{-e^{-\varepsilon_1}} \cdots e^{-e^{-\varepsilon_J}} = e^{-\sum_{j=1}^{J} e^{-\varepsilon_j}} \quad \leftarrow \text{product of iid Weibull cdfs}$$
• Can verify that $G(e^{v_1}, \ldots, e^{v_J}) = \sum_{j=1}^{J} e^{v_j}$ satisfies the DZW conditions.
$$P(j) = \frac{\partial \ln G}{\partial v_j} = \frac{e^{v_j}}{\sum_{l=1}^{J} e^{v_l}} = \text{MNL model}$$
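The claim that $P(j) = \partial \ln G / \partial v_j$ reproduces the MNL formula for $G = \sum_j e^{v_j}$ can be verified numerically with a finite-difference check (a sketch; the utility values are arbitrary):

```python
import math

def lnG(v):
    # ln G(e^{v_1},...,e^{v_J}) for the MNL generator G = sum of its arguments
    return math.log(sum(math.exp(x) for x in v))

def prob_via_G(v, j, h=1e-6):
    # P(j) = d ln G / d v_j, approximated by a central finite difference
    vp = list(v); vp[j] += h
    vm = list(v); vm[j] -= h
    return (lnG(vp) - lnG(vm)) / (2 * h)

v = [0.2, 1.0, -0.5]                      # arbitrary utilities
p_fd = [prob_via_G(v, j) for j in range(len(v))]
p_mnl = [math.exp(x) / sum(math.exp(y) for y in v) for x in v]
```

The two vectors agree to finite-difference accuracy, which is the content of the derivative formula on this slide.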

9.
• Another GEV model: the nested logit model (addresses, to a limited extent, the IIA criticism).
• Let
$$G(e^{v_1}, \ldots, e^{v_J}) = \sum_{m=1}^{M} a_m \left(\sum_{i \in B_m} e^{\frac{v_i}{1-\sigma_m}}\right)^{1-\sigma_m}$$
($\sigma_m$ acts like an elasticity of substitution within branch $m$).

10.
• Idea: divide goods into branches; first choose a branch, then a good within the branch (e.g., car vs. bus, then red bus vs. blue bus).
• Will allow for correlation between errors (this is the role of $\sigma$).
• $B_m \subseteq \{1, \ldots, J\}$, $\bigcup_{m=1}^{M} B_m = B$; each $B_m$ is a single branch — need not have all choices on all branches.

11.
• Note: if $\sigma_m = 0$ for all $m$, we get the usual MNL form.
• Calculate:
$$p_i = \frac{\partial \ln G}{\partial v_i} = \frac{\partial}{\partial v_i} \ln\left[\sum_{m=1}^{M} a_m \left(\sum_{l \in B_m} e^{\frac{v_l}{1-\sigma_m}}\right)^{1-\sigma_m}\right]$$
$$= \frac{\sum_{m \ni i}\, a_m\, e^{\frac{v_i}{1-\sigma_m}} \left(\sum_{l \in B_m} e^{\frac{v_l}{1-\sigma_m}}\right)^{-\sigma_m}}{\sum_{m=1}^{M} a_m \left(\sum_{l \in B_m} e^{\frac{v_l}{1-\sigma_m}}\right)^{1-\sigma_m}} = \sum_{m=1}^{M} P(i \mid B_m)\, P(B_m)$$

12.
• where
$$P(i \mid B_m) = \frac{e^{\frac{v_i}{1-\sigma_m}}}{\sum_{l \in B_m} e^{\frac{v_l}{1-\sigma_m}}} \ \text{if } i \in B_m,\ 0 \text{ otherwise}$$
$$P(B_m) = \frac{a_m \left(\sum_{l \in B_m} e^{\frac{v_l}{1-\sigma_m}}\right)^{1-\sigma_m}}{\sum_{m'=1}^{M} a_{m'} \left(\sum_{l \in B_{m'}} e^{\frac{v_l}{1-\sigma_{m'}}}\right)^{1-\sigma_{m'}}}$$
• Note: if $P(B_m) = 1$, we get the logit form.
• Nested logit requires that the analyst make choices about the nesting structure.
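A small sketch of these formulas (the function name and example values are hypothetical): it computes $P(i) = P(i \mid B_m)\, P(B_m)$ for a given nesting structure and checks that $\sigma_m = 0$ collapses to plain MNL.

```python
import math

def nested_logit(v, nests, sigma, a=None):
    """Nested logit probabilities: P(i) = P(i | B_m) * P(B_m).
    v: utilities; nests: list of branches (lists of indices);
    sigma: one similarity parameter per branch; a: branch weights a_m."""
    if a is None:
        a = [1.0] * len(nests)
    # inclusive values: sum_{l in B_m} exp(v_l / (1 - sigma_m))
    incl = [sum(math.exp(v[i] / (1 - s)) for i in B) for B, s in zip(nests, sigma)]
    denom = sum(am * I ** (1 - s) for am, I, s in zip(a, incl, sigma))
    p = [0.0] * len(v)
    for B, s, am, I in zip(nests, sigma, a, incl):
        p_branch = am * I ** (1 - s) / denom            # P(B_m)
        for i in B:
            p[i] = p_branch * math.exp(v[i] / (1 - s)) / I  # times P(i | B_m)
    return p

v = [0.5, 1.0, -0.2]
p_nested = nested_logit(v, [[0], [1, 2]], [0.0, 0.5])  # car alone; buses nested
p_mnl = nested_logit(v, [[0], [1, 2]], [0.0, 0.0])     # sigma = 0 gives MNL
```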

13.
• How does the nested logit solve the red-bus/blue-bus problem?
• Suppose $Y_i = e^{v_i}$ and
$$G = Y_1 + \left(Y_2^{\frac{1}{1-\sigma}} + Y_3^{\frac{1}{1-\sigma}}\right)^{1-\sigma}$$

14.
$$P(1 \mid \{1,2,3\}) = \frac{\partial \ln G}{\partial v_1} = \frac{e^{v_1}}{e^{v_1} + \left(e^{\frac{v_2}{1-\sigma}} + e^{\frac{v_3}{1-\sigma}}\right)^{1-\sigma}}$$
$$P(2 \mid \{1,2,3\}) = \frac{\partial \ln G}{\partial v_2} = \frac{e^{\frac{v_2}{1-\sigma}} \left(e^{\frac{v_2}{1-\sigma}} + e^{\frac{v_3}{1-\sigma}}\right)^{-\sigma}}{e^{v_1} + \left(e^{\frac{v_2}{1-\sigma}} + e^{\frac{v_3}{1-\sigma}}\right)^{1-\sigma}}$$

15.
• As $v_3 \to -\infty$:
$$P(1 \mid \{1,2,3\}) = \frac{e^{v_1}}{e^{v_1} + e^{v_2}} \quad \text{(get logistic)}$$
• As $v_1 \to -\infty$:
$$P(2 \mid \{1,2,3\}) = \frac{e^{\frac{v_2}{1-\sigma}}}{e^{\frac{v_2}{1-\sigma}} + e^{\frac{v_3}{1-\sigma}}}$$

16. What Role Does $\sigma$ Play?
• $\sigma$ is the degree-of-substitutability parameter.
• Recall $F(\varepsilon_1, \varepsilon_2, \varepsilon_3) = \exp\{-G(e^{-\varepsilon_1}, e^{-\varepsilon_2}, e^{-\varepsilon_3})\}$.
• Here
$$\sigma = \frac{\mathrm{cov}(\varepsilon_2, \varepsilon_3)}{\sqrt{\mathrm{var}(\varepsilon_2)\,\mathrm{var}(\varepsilon_3)}} = \text{correlation coefficient}$$
• Thus we require $-1 \leq \sigma \leq 1$, but it turns out we also need $\sigma > 0$ for the DZW conditions to be satisfied. This is unfortunate because it does not allow the $\varepsilon$'s to be negatively correlated.
• Can show that
$$\lim_{\sigma \to 1} P(1 \mid \{1,2,3\}) = \frac{e^{v_1}}{e^{v_1} + \max(e^{v_2}, e^{v_3})} \quad \text{(L'Hôpital's rule)}$$

17.
• If $v_2 = v_3$, then
$$P(2 \mid \{1,2,3\}) = \frac{e^{\frac{v_2}{1-\sigma}} \left(2\, e^{\frac{v_2}{1-\sigma}}\right)^{-\sigma}}{e^{v_1} + \left(2\, e^{\frac{v_2}{1-\sigma}}\right)^{1-\sigma}} = \frac{2^{-\sigma}\, e^{v_2}}{e^{v_1} + 2^{1-\sigma}\, e^{v_2}}$$
$$\lim_{\sigma \to 1} P(2 \mid \{1,2,3\}) = \frac{e^{v_2}/2}{e^{v_1} + e^{v_2}}$$
↑ introducing a third identical alternative cuts the probability of choosing 2 in half
• Solves the red-bus/blue-bus problem.
• The probability is split in half across the two identical alternatives.
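The limit on this slide can be checked numerically. A sketch with two identical "bus" alternatives nested together and $\sigma$ close to 1 (all values are illustrative): the total bus share matches the two-alternative logit, and each bus gets half of it.

```python
import math

def nested_probs(v1, v2, v3, sigma):
    # G = Y1 + (Y2^{1/(1-s)} + Y3^{1/(1-s)})^{1-s}; alternatives 2, 3 nested
    s = sigma
    I = math.exp(v2 / (1 - s)) + math.exp(v3 / (1 - s))   # inclusive value
    denom = math.exp(v1) + I ** (1 - s)
    p_nest = I ** (1 - s) / denom
    p2 = p_nest * math.exp(v2 / (1 - s)) / I
    p3 = p_nest * math.exp(v3 / (1 - s)) / I
    return math.exp(v1) / denom, p2, p3

# Identical red and blue buses (v2 = v3 = v1 = 0), sigma near 1:
p_car, p_red, p_blue = nested_probs(0.0, 0.0, 0.0, 0.999)
# Total bus share should approach the binary logit value e^{v2}/(e^{v1}+e^{v2}) = 1/2.
```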

18.
(Tree: car on one branch; red bus and blue bus together on the other.)
• $\sigma$ is a measure of similarity between the red and the blue bus.
• When $\sigma$ is close to one, the conditional choice probability within the branch selects the better alternative with high probability.

19.
• Remark: We can expand the nested logit to accommodate multiple levels, e.g., three levels:
$$G = \sum_{q=1}^{Q} a_q \left[\sum_{m \in Q_q} a_m \left(\sum_{i \in B_m} y_i^{\frac{1}{1-\sigma_m}}\right)^{1-\sigma_m}\right]$$

20. Example: Two Choices
1 Neighborhood ($m$)
2 Transportation mode ($t$)
3 $P(m)$: choice of neighborhood
4 $P(i \mid B_m)$: probability of choosing the $i$-th mode, given neighborhood $m$

21.
• Not all modes are available in all neighborhoods.
$$P_{m,t} = \frac{e^{\frac{v(m,t)}{1-\sigma_m}} \left(\sum_{t'=1}^{T_m} e^{\frac{v(m,t')}{1-\sigma_m}}\right)^{-\sigma_m}}{\sum_{j=1}^{M} \left(\sum_{t'=1}^{T_j} e^{\frac{v(j,t')}{1-\sigma_j}}\right)^{1-\sigma_j}}$$
$$P_{t \mid m} = \frac{e^{\frac{v(m,t)}{1-\sigma_m}}}{\sum_{t'=1}^{T_m} e^{\frac{v(m,t')}{1-\sigma_m}}}$$
$$P_m = \frac{\left(\sum_{t'=1}^{T_m} e^{\frac{v(m,t')}{1-\sigma_m}}\right)^{1-\sigma_m}}{\sum_{j=1}^{M} \left(\sum_{t'=1}^{T_j} e^{\frac{v(j,t')}{1-\sigma_j}}\right)^{1-\sigma_j}} = P(B_m)$$

22.
• Standard type of utility function that people might use:
$$v(m,t) = z_t'\gamma + x_{mt}'\beta + y_m'\alpha$$

23.
• $z_t$ contains transportation-mode characteristics, $x_{mt}$ interactions, and $y_m$ neighborhood characteristics.
• Then
$$P_{t \mid m} = \frac{e^{\frac{z_t'\gamma + x_{mt}'\beta}{1-\sigma_m}}}{\sum_{t'=1}^{T_m} e^{\frac{z_{t'}'\gamma + x_{mt'}'\beta}{1-\sigma_m}}}$$
$$P_m = \frac{e^{y_m'\alpha} \left(\sum_{t=1}^{T_m} e^{\frac{z_t'\gamma + x_{mt}'\beta}{1-\sigma_m}}\right)^{1-\sigma_m}}{\sum_{j=1}^{M} e^{y_j'\alpha} \left(\sum_{t=1}^{T_j} e^{\frac{z_t'\gamma + x_{jt}'\beta}{1-\sigma_j}}\right)^{1-\sigma_j}}$$

24.
• Estimation (in two steps) (see Amemiya, Chapter 9).
• Let
$$I_m = \sum_{t=1}^{T_m} e^{\frac{z_t'\gamma + x_{mt}'\beta}{1-\sigma_m}}$$

25.
1 Within each neighborhood, estimate $\frac{\gamma}{1-\sigma_m}$ and $\frac{\beta}{1-\sigma_m}$ by logit.
2 Form $\hat{I}_m$.
3 Then estimate by MLE
$$\frac{e^{y_m'\alpha + (1-\sigma_m)\ln \hat{I}_m}}{\sum_{j=1}^{M} e^{y_j'\alpha + (1-\sigma_j)\ln \hat{I}_j}}$$
to get $\hat{\alpha}$, $\hat{\sigma}_m$.
• Assume $\sigma_m = \sigma_j$ for all $j, m$, or at least impose some restrictions across neighborhoods.
• Note: $\hat{I}_m$ is an estimated regressor ("Durbin problem"); need to correct the standard errors.

26. Multinomial Probit Models
• Also known as:
1 Thurstone Model V (1929; 1930)
2 Thurstone-Quandt Model
• Developed by Domencich-McFadden (1978) (on reading list).
$$u_i = v_i + \eta_i, \quad i = 1, \ldots, J; \qquad v_i = Z_i\beta \ \text{(linear-in-parameters form)}; \qquad u_i = Z_i\beta + \eta_i$$
• MNL: (i) $\beta$ fixed; (ii) $\eta_i$ iid.
• MNP: (i) $\beta$ random coefficient, $\beta \sim N(\bar{\beta}, \Sigma_\beta)$; (ii) $\beta$ independent of $\eta$, $\eta \sim (0, \Sigma_\eta)$.
• Allows general forms of correlation between errors.

27.
$$u_i = Z_i\bar{\beta} + Z_i(\beta - \bar{\beta}) + \eta_i$$
• $(\beta - \bar{\beta}) = \varepsilon$, and $Z_i(\beta - \bar{\beta}) + \eta_i$ is a composite heteroskedastic error term.
• $\beta$ random = taste heterogeneity; $\eta_i$ can be interpreted as unobserved attributes of goods.
• The main advantage of MNP over MNL is that it allows for a general error covariance structure.
• Note: To make computation easier, users sometimes set $\Sigma_\beta = 0$ (fixed-coefficient version).
• Allowing $\beta$ to be random permits random taste variation: different persons may value characteristics differently.

28. Problem of Identification and Normalization in the MNP Model
• Reference: David Bunch (1991), "Estimability in the Multinomial Probit Model," Transportation Research
• Domencich and McFadden
• Let ($J$ alternatives, $K$ characteristics, random $\beta \sim N(\bar{\beta}, \Sigma_\beta)$)
$$\tilde{Z}\bar{\beta} = \begin{pmatrix} Z_1\bar{\beta} \\ \vdots \\ Z_J\bar{\beta} \end{pmatrix}, \qquad \eta = \begin{pmatrix} \eta_1 \\ \vdots \\ \eta_J \end{pmatrix} \quad (2)$$

29. Problem of Identification and Normalization in the MNP Model
• $\Pr(\text{alternative } j \text{ selected}) = \Pr(u_j > u_i,\ \forall\, i \neq j)$
$$= \int_{u_j=-\infty}^{\infty} \int_{u_1=-\infty}^{u_j} \cdots \int_{u_J=-\infty}^{u_j} \Phi(u \mid V_\mu, \Sigma_\mu)\, du_1 \cdots du_J\, du_j$$
where $\Phi(u \mid V_\mu, \Sigma_\mu)$ is the $J$-dimensional multivariate normal density with mean $V_\mu$ and covariance $\Sigma_\mu$.
• Note: Unlike the MNL, there is no closed-form expression for the integral.
• The integrals are often evaluated using simulation methods (we will work an example).

30. How many parameters are there?
• $\bar{\beta}$: $K$ parameters.
• $\Sigma_\beta$: $K \times K$ symmetric matrix, $\frac{K^2 - K}{2} + K = \frac{K(K+1)}{2}$ parameters.
• $\Sigma_\eta$: $\frac{J(J+1)}{2}$ parameters.
• Note: When a person chooses $j$, all we know is relative utility, not absolute utility.
• This suggests that not all parameters in the model will be identified; normalizations are required.

31. Digression on Identification
• What does it mean to say a parameter is not identified in a model?
• A model with one parameterization is observationally equivalent to the same model with a different parameterization.

32. Digression on Identification
• Example: Binary Probit Model (fixed $\beta$)
$$\Pr(D = 1 \mid Z) = \Pr(v_1 + \varepsilon_1 > v_2 + \varepsilon_2) = \Pr(x_1\beta + \varepsilon_1 > x_2\beta + \varepsilon_2)$$
$$= \Pr\big((x_1 - x_2)\beta > \varepsilon_2 - \varepsilon_1\big) = \Pr\left(\frac{(x_1 - x_2)\beta}{\sigma} > \frac{\varepsilon_2 - \varepsilon_1}{\sigma}\right) = \Phi\left(\frac{\tilde{x}\beta}{\sigma}\right), \qquad \tilde{x} = x_1 - x_2$$
• $\Phi\left(\frac{\tilde{x}\beta}{\sigma}\right)$ is observationally equivalent to $\Phi\left(\frac{\tilde{x}\beta^*}{\sigma^*}\right)$ for $\frac{\beta}{\sigma} = \frac{\beta^*}{\sigma^*}$.

33.
• $\beta$ is not separately identified relative to $\sigma$, but the ratio is identified:
$$\Phi\left(\frac{\tilde{x}\beta}{\sigma}\right) = \Phi\left(\frac{\tilde{x}\beta^*}{\sigma^*}\right) \;\Rightarrow\; \Phi^{-1} \cdot \Phi\left(\frac{\tilde{x}\beta}{\sigma}\right) = \Phi^{-1} \cdot \Phi\left(\frac{\tilde{x}\beta^*}{\sigma^*}\right) \;\Rightarrow\; \frac{\beta}{\sigma} = \frac{\beta^*}{\sigma^*}$$
• The set $\{b : b = \beta \cdot \delta,\ \delta \text{ any positive scalar}\}$ is identified (say "$\beta$ is identified up to scale, and the sign is identified").
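Scale non-identification is easy to see numerically: scaling $\beta$ and $\sigma$ by the same constant leaves every choice probability unchanged (a sketch with arbitrary values):

```python
import math

def probit_prob(x_tilde, beta, sigma):
    # Pr(D = 1 | x) = Phi(x_tilde * beta / sigma), Phi via math.erf
    z = x_tilde * beta / sigma
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# (beta, sigma) and (2*beta, 2*sigma) are observationally equivalent
xs = [-1.0, 0.0, 0.7, 2.3]
p1 = [probit_prob(x, 1.5, 1.0) for x in xs]
p2 = [probit_prob(x, 3.0, 2.0) for x in xs]
```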

34. Identification in the MNP Model
$$\Pr(j \text{ selected} \mid V_\mu, \Sigma_\mu) = \Pr(u_i - u_j < 0,\ \forall\, i \neq j)$$
Define the $(J-1) \times J$ contrast matrix
$$\Delta_j = \begin{pmatrix} 1 & 0 & \cdots & -1 & \cdots & 0 \\ 0 & 1 & \cdots & -1 & \cdots & 0 \\ \vdots & & & \vdots & & \vdots \\ 0 & \cdots & \cdots & -1 & 0 & 1 \end{pmatrix}, \qquad \Delta_j \tilde{u} = \begin{pmatrix} u_1 - u_j \\ \vdots \\ u_J - u_j \end{pmatrix}$$

35. Identification in the MNP Model
$$\Pr(j \text{ selected} \mid V_\mu, \Sigma_\mu) = \Pr(\Delta_j \tilde{u} < 0 \mid V_\mu, \Sigma_\mu) = \Phi(0 \mid V_Z, \Sigma_Z)$$
• where
1 $V_Z$ is the mean of $\Delta_j \tilde{u}$: $\Delta_j \tilde{Z}\bar{\beta}$
2 $\Sigma_Z$ is the variance of $\Delta_j \tilde{u}$: $\Delta_j \tilde{Z} \Sigma_\beta \tilde{Z}'\Delta_j' + \Delta_j \Sigma_\eta \Delta_j'$
3 $V_Z$ is $(J-1) \times 1$
4 $\Sigma_Z$: $(J-1) \times (J-1)$
• We reduce the dimension of the integral by one.

36.
• This says that all of the information exists in the contrasts.
• Can't identify all the components because we only observe the contrasts.
• Now define $\tilde{\Delta}_j$ as $\Delta_j$ with the $J$-th column removed, and choose $J$ as the reference alternative with corresponding $\Delta_J$.
• Then one can verify that $\Delta_j = \tilde{\Delta}_j \cdot \Delta_J$.

37.
• For example, with three goods:
$$\underbrace{\begin{pmatrix} 1 & -1 \\ 0 & -1 \end{pmatrix}}_{\tilde{\Delta}_j\ (j=2,\ 3\text{rd column removed})} \times \underbrace{\begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \end{pmatrix}}_{\Delta_J\ (J=3,\ \text{reference alt.})} = \underbrace{\begin{pmatrix} 1 & -1 & 0 \\ 0 & -1 & 1 \end{pmatrix}}_{\Delta_j\ (j=2,\ 3\text{rd column included})}$$

38.
• Therefore, we can write
$$V_Z = \Delta_j \tilde{Z}\bar{\beta}, \qquad \Sigma_Z = \Delta_j \tilde{Z} \Sigma_\beta \tilde{Z}'\Delta_j' + \tilde{\Delta}_j \Delta_J \Sigma_\eta \Delta_J' \tilde{\Delta}_j'$$
• where $C_J = \Delta_J \Sigma_\eta \Delta_J'$ is $(J-1) \times (J-1)$ and has $\frac{(J-1)^2 - (J-1)}{2} + (J-1) = \frac{J(J-1)}{2}$ parameters total.
• Since the original model can always be expressed in terms of a model with $(\bar{\beta}, \Sigma_\beta, C_J)$, it follows that some of the parameters in the original model are not identified.

39. How many parameters are not identified?
• Original model: $K + \frac{K(K+1)}{2} + \frac{J(J+1)}{2}$
• Now: $K + \frac{K(K+1)}{2} + \frac{J(J-1)}{2}$, so $\frac{J(J+1)}{2} - \frac{J(J-1)}{2} = J$ parameters are not identified.
• It turns out that one additional parameter is not identified. Total: $J + 1$.
• Note: Evaluation of $\Phi(0 \mid k V_Z, k^2 \Sigma_Z)$ for $k > 0$ gives the same result as evaluating $\Phi(0 \mid V_Z, \Sigma_Z)$, so we can eliminate one more parameter by a suitable choice of $k$.

40. Illustration
$$J = 3, \qquad \Sigma_\eta = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix}$$
$$C_2 = \Delta_2 \Sigma_\eta \Delta_2' = \begin{pmatrix} 1 & -1 & 0 \\ 0 & -1 & 1 \end{pmatrix} \Sigma_\eta \begin{pmatrix} 1 & -1 & 0 \\ 0 & -1 & 1 \end{pmatrix}'$$
$$= \begin{pmatrix} \sigma_{11} - 2\sigma_{21} + \sigma_{22} & \sigma_{31} - \sigma_{21} - \sigma_{32} + \sigma_{22} \\ \sigma_{31} - \sigma_{21} - \sigma_{32} + \sigma_{22} & \sigma_{33} - 2\sigma_{32} + \sigma_{22} \end{pmatrix}$$

41. Illustration
$$C_2 = \tilde{\Delta}_2\, \Delta_3 \Sigma_\eta \Delta_3'\, \tilde{\Delta}_2' = \begin{pmatrix} 1 & -1 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \sigma_{11} - 2\sigma_{31} + \sigma_{33} & \sigma_{21} - \sigma_{31} - \sigma_{32} + \sigma_{33} \\ \sigma_{21} - \sigma_{31} - \sigma_{32} + \sigma_{33} & \sigma_{22} - 2\sigma_{32} + \sigma_{33} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -1 & -1 \end{pmatrix}$$

42. Normalization Approach of Albright, Lerman, and Manski (1978)
• Note: Need $J + 1$ restrictions on the VCV matrix.
• Fix $J$ parameters by setting the last row and last column of $\Sigma_\eta$ to 0.
• Fix the scale by constraining the diagonal elements of $\Sigma_\eta$ so that $\mathrm{trace}(\Sigma_\eta)$ equals the variance of a standard Weibull (to compare estimates with MNL and independent probit).

43. How do we solve the forecasting problem?
• Suppose that we have 2 goods and add a 3rd:
$$\Pr(1 \text{ chosen}) = \Pr(u_1 - u_2 \geq 0) = \Pr\big((Z_1 - Z_2)\bar{\beta} \geq \omega_2 - \omega_1\big)$$
• where $\omega_1 = Z_1(\beta - \bar{\beta}) + \eta_1$ and $\omega_2 = Z_2(\beta - \bar{\beta}) + \eta_2$, so
$$\Pr(1 \text{ chosen}) = \int_{-\infty}^{\frac{(Z_1 - Z_2)\bar{\beta}}{\left[\sigma_{11} + \sigma_{22} - 2\sigma_{12} + (Z_2 - Z_1)\Sigma_\beta(Z_2 - Z_1)'\right]^{1/2}}} \frac{e^{-t^2/2}}{\sqrt{2\pi}}\, dt$$
• Now add a 3rd good: $u_3 = Z_3\bar{\beta} + Z_3(\beta - \bar{\beta}) + \eta_3$.

44.
• Problem: We don't know the correlation of $\eta_3$ with the other errors.
• Suppose that $\eta_3 = 0$ (i.e., only preference heterogeneity). Then
$$\Pr(1 \text{ chosen}) = \int_{-\infty}^{a} \int_{-\infty}^{b} \text{B.V.N.}\; dt_1\, dt_2$$
where
$$a = \frac{(Z_1 - Z_2)\bar{\beta}}{\left[\sigma_{11} + \sigma_{22} - 2\sigma_{12} + (Z_2 - Z_1)\Sigma_\beta(Z_2 - Z_1)'\right]^{1/2}}, \qquad b = \frac{(Z_1 - Z_3)\bar{\beta}}{\left[\sigma_{11} + (Z_3 - Z_1)\Sigma_\beta(Z_3 - Z_1)'\right]^{1/2}}$$
• We could also solve the forecasting problem if we make an assumption like $\eta_2 = \eta_3$.
• We solve the red-bus/blue-bus problem if $\eta_2 = \eta_3 = 0$ and $Z_3 = Z_2$.

45.
$$\Pr(1 \text{ chosen}) = \Pr(u_1 - u_2 \geq 0,\ u_1 - u_3 \geq 0)$$
• but $\{u_1 - u_2 \geq 0\}$ and $\{u_1 - u_3 \geq 0\}$ are then the same event.
• ∴ adding the third choice does not change the probability of choosing 1.

46. Estimation Methods for MNP Models
• Models tend to be difficult to estimate because of high-dimensional integrals.
• Integrals need to be evaluated at each stage of estimating the likelihood.
• Simulation provides a means of estimating $P_{ij} = \Pr(i \text{ chooses } j)$.
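A minimal frequency simulator in the spirit of this slide (not the specific estimator on the reading list; the covariance structure and draw count are assumptions for illustration): draw correlated normal errors, find the utility-maximizing alternative, and average.

```python
import random

def simulate_mnp_probs(V, chol, R=100_000, seed=0):
    """Frequency simulator for MNP choice probabilities.
    V: mean utilities; chol: lower-triangular Cholesky factor of the
    error covariance (structure assumed here for illustration)."""
    rng = random.Random(seed)
    J = len(V)
    counts = [0] * J
    for _ in range(R):
        z = [rng.gauss(0, 1) for _ in range(J)]
        # eta = chol @ z gives errors with covariance chol @ chol'
        eta = [sum(chol[i][k] * z[k] for k in range(i + 1)) for i in range(J)]
        u = [V[i] + eta[i] for i in range(J)]
        counts[u.index(max(u))] += 1
    return [c / R for c in counts]

# Independent standard normal errors and equal means: by symmetry each
# alternative should be chosen with probability 1/3.
eye = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
p_sim = simulate_mnp_probs([0.0, 0.0, 0.0], eye)
```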

47. Computation and Estimation (Link to Appendix)

48. Classical Models for Estimating Models with Limited Dependent Variables
References:
• Amemiya, Ch. 10
• Different types of sampling (previously discussed): (a) random sampling, (b) censored sampling, (c) truncated sampling, (d) other non-random (exogenous stratified, choice-based)

49. Standard Tobit Model (Tobin, 1958), "Type I Tobit"
$$y_i^* = x_i\beta + u_i$$
• Observe
$$y_i = y_i^* \ \text{if } y_i^* \geq y_0, \qquad y_i = 0 \ \text{if } y_i^* < y_0$$
(equivalently, observe $y_i^*$ together with the indicator $1(y_i^* \geq y_0)$).
• Tobin's example: expenditure on a durable good is only observed if the good is purchased.

50. Figure 1 (scatter of expenditure against individuals, censored at $y_0$; figure omitted).
• Note: Censored observations might have bought the good if the price had been lower.
• Estimator: Assume $u_i \mid x_i \sim N(0, \sigma_u^2)$, so that $y_i^* \mid x_i \sim N(x_i\beta, \sigma_u^2)$.

51. Density of Latent Variables
$$g(y^*) = \pi_0 \Pr(y_i^* < y_0) + \pi_1 f(y_i^* \mid y_i^* \geq y_0) \Pr(y_i^* \geq y_0)$$
$$\Pr(y_i^* < y_0) = \Pr(x_i\beta + u_i < y_0) = \Pr\left(\frac{u_i}{\sigma_u} < \frac{y_0 - x_i\beta}{\sigma_u}\right) = \Phi\left(\frac{y_0 - x_i\beta}{\sigma_u}\right)$$
$$f(y_i^* \mid y_i^* \geq y_0) = \frac{\frac{1}{\sigma_u}\phi\left(\frac{y_i^* - x_i\beta}{\sigma_u}\right)}{1 - \Phi\left(\frac{y_0 - x_i\beta}{\sigma_u}\right)} \quad \text{(why?)}$$
$$\Pr(y^* = y_i^* \mid y_0 \leq y^*) = \Pr(x\beta + u = y_i^* \mid y_0 \leq x\beta + u) = \Pr\left(\frac{u}{\sigma_u} = \frac{y_i^* - x\beta}{\sigma_u} \,\Big|\, \frac{u}{\sigma_u} \geq \frac{y_0 - x\beta}{\sigma_u}\right)$$

52.
• Note that the likelihood can be written as:
$$L = \prod_0 \Phi\left(\frac{y_0 - x_i\beta}{\sigma_u}\right) \prod_1 \frac{\frac{1}{\sigma_u}\phi\left(\frac{y_i^* - x_i\beta}{\sigma_u}\right)}{1 - \Phi\left(\frac{y_0 - x_i\beta}{\sigma_u}\right)} \left[1 - \Phi\left(\frac{y_0 - x_i\beta}{\sigma_u}\right)\right]$$
where the $\Phi$ and $1 - \Phi$ factors are what you would get with just a simple probit; the truncated density carries the additional information.
• You could estimate $\beta$ up to scale using only the information on whether $y_i \gtrless y_0$, but you will get a more efficient estimate using the additional information.
• If you know $y_0$, you can estimate $\sigma_u$.
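The likelihood above can be sketched directly (scalar regressor; the function name and data are made up; note the $1 - \Phi$ factors cancel, so uncensored observations just contribute the normal density):

```python
import math

def tobit_loglik(beta, sigma, y, x, y0=0.0):
    """Type I Tobit log-likelihood (hypothetical scalar-regressor sketch):
    censored obs contribute log Phi((y0 - x*b)/s),
    uncensored obs contribute log[(1/s) phi((y - x*b)/s)]."""
    def Phi(z):
        return 0.5 * (1 + math.erf(z / math.sqrt(2)))
    def log_phi(z):
        return -0.5 * z * z - 0.5 * math.log(2 * math.pi)
    ll = 0.0
    for yi, xi in zip(y, x):
        mu = xi * beta
        if yi <= y0:   # censored observation
            ll += math.log(Phi((y0 - mu) / sigma))
        else:          # observed observation
            ll += log_phi((yi - mu) / sigma) - math.log(sigma)
    return ll

ll = tobit_loglik(1.0, 1.0, [0.0, 1.2, 2.5], [0.1, 1.0, 2.0], y0=0.0)
```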

53. Truncated Version of Type I Tobit
• Observe $y_i = y_i^*$ only if $y_i^* > 0$; observe nothing for censored observations.
• Example: only observe wages for workers.
$$L = \prod_1 \frac{\frac{1}{\sigma_u}\phi\left(\frac{y_i^* - x_i\beta}{\sigma_u}\right)}{\Phi\left(\frac{x_i\beta}{\sigma_u}\right)}$$
$$\Pr(y_i^* > 0) = \Pr(x\beta + u > 0) = \Pr\left(\frac{u}{\sigma_u} > -\frac{x\beta}{\sigma_u}\right) = \Pr\left(\frac{u}{\sigma_u} < \frac{x\beta}{\sigma_u}\right)$$

54. Different Ways of Estimating Tobit $\beta$
(a) If censored, could obtain estimates of $\beta/\sigma_u$ by simple probit.
(b) Run OLS on observations for which $y_i^*$ is observed (take $y_0 = 0$):
$$E(y_i \mid x_i\beta + u_i \geq 0) = x_i\beta + \sigma_u E\left(\frac{u_i}{\sigma_u} \,\Big|\, \frac{u_i}{\sigma_u} > -\frac{x_i\beta}{\sigma_u}\right)$$
• where the left side is the conditional mean of a truncated normal r.v. and
$$E\left(\frac{u_i}{\sigma_u} \,\Big|\, \frac{u_i}{\sigma_u} > -\frac{x_i\beta}{\sigma_u}\right) = \frac{\phi\left(\frac{x_i\beta}{\sigma_u}\right)}{\Phi\left(\frac{x_i\beta}{\sigma_u}\right)} = \lambda\left(\frac{x_i\beta}{\sigma_u}\right)$$
• $\lambda$ is known as the "Mills ratio"; the bias due to censoring can be viewed as an omitted-variables problem.
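The Mills-ratio formula $E(u \mid u > -c) = \sigma_u \lambda(c/\sigma_u)$ can be checked by simulation (a sketch; the parameter values are arbitrary):

```python
import math, random

def mills(z):
    # Inverse Mills ratio: lambda(z) = phi(z) / Phi(z)
    phi = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    Phi = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return phi / Phi

# Check E(u | u > -c) = sigma_u * lambda(c / sigma_u) by simulation,
# with u ~ N(0, sigma_u^2); sigma_u and c chosen arbitrarily.
rng = random.Random(1)
sigma_u, c = 2.0, 0.5
kept = [u for u in (rng.gauss(0, sigma_u) for _ in range(200_000)) if u > -c]
sim_mean = sum(kept) / len(kept)
theory = sigma_u * mills(c / sigma_u)
```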

55. Heckman Two-Step Procedure
• Step 1: estimate $\frac{\beta}{\sigma_u}$ by probit.
• Step 2: form $\hat{\lambda}\left(\frac{x_i\hat{\beta}}{\sigma}\right)$ and regress
$$y_i = x_i\beta + \sigma\,\hat{\lambda}\left(\frac{x_i\beta}{\sigma}\right) + v + \varepsilon$$
$$v = \sigma\left[\lambda\left(\frac{x_i\beta}{\sigma}\right) - \hat{\lambda}\left(\frac{x_i\beta}{\sigma}\right)\right], \qquad \varepsilon = u_i - E(u_i \mid u_i > -x_i\beta)$$

56.
• Note: the errors $(v + \varepsilon)$ will be heteroskedastic; need to account for the fact that $\lambda$ is estimated (Durbin problem).
• Ways of doing this:
(a) Delta method
(b) GMM (Newey, Economics Letters, 1984)
(c) Suppose you run OLS using all the data:
$$E(y_i) = \Pr(y_i^* \leq 0) \cdot 0 + \Pr(y_i^* > 0)\left[x_i\beta + \sigma_u E\left(\frac{u_i}{\sigma_u} \,\Big|\, \frac{u_i}{\sigma_u} > -\frac{x_i\beta}{\sigma_u}\right)\right] = \Phi\left(\frac{x_i\beta}{\sigma}\right)\left[x_i\beta + \sigma_u \lambda\left(\frac{x_i\beta}{\sigma}\right)\right]$$
You could estimate the model by replacing $\Phi$ with $\hat{\Phi}$ and $\lambda$ with $\hat{\lambda}$.
• For both (b) and (c), the errors are heteroskedastic, meaning that you could use weights to improve efficiency. Also need to adjust for the estimated regressor.
(d) Estimate the model by Tobit maximum likelihood directly.
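The selection-correction logic can be illustrated end-to-end on simulated data (all parameter values and sample sizes are made up; the true index is used inside $\lambda$, which sidesteps the first-stage probit): OLS on the selected sample is biased toward zero, while adding the Mills-ratio regressor restores consistency.

```python
import math, random

def phi(z): return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
def Phi(z): return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gaussian elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    for c in range(k):
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            A[r] = [ar - f * ac for ac, ar in zip(A[c], A[r])]
            b[r] -= f * b[c]
    out = [0.0] * k
    for c in range(k - 1, -1, -1):
        out[c] = (b[c] - sum(A[c][j] * out[j] for j in range(c + 1, k))) / A[c][c]
    return out

# Simulated Tobit design with beta = 1, sigma_u = 1 (illustrative values).
rng = random.Random(7)
data = [(x, x + rng.gauss(0, 1)) for x in (rng.gauss(0, 1) for _ in range(60_000))]
sel = [(x, y) for x, y in data if y > 0]        # selected (uncensored) sample

ys = [y for _, y in sel]
b_naive = ols([[1.0, x] for x, _ in sel], ys)                  # omits lambda: biased
b_corr = ols([[1.0, x, phi(x) / Phi(x)] for x, _ in sel], ys)  # adds Mills ratio
```

With the Mills-ratio regressor included, the slope on $x$ and the coefficient on $\lambda$ should both be near their true values (1 and $\sigma_u = 1$).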

57. Variations on the Standard Tobit Model
$$y_{1i}^* = x_{1i}\beta_1 + u_{1i}, \qquad y_{2i}^* = x_{2i}\beta_2 + u_{2i}$$
$$y_{2i} = y_{2i}^* \ \text{if } y_{1i}^* \geq 0, \qquad y_{2i} = 0 \ \text{else}$$
• Example: $y_{2i}$ = student test scores; $y_{1i}^*$ = index representing parents' propensity to enroll students in school.
• Test scores are only observed for the proportion enrolled.

58.
$$L = \prod_1 \big[\Pr(y_{1i}^* > 0)\, f(y_{2i} \mid y_{1i}^* > 0)\big] \prod_0 \big[\Pr(y_{1i}^* \leq 0)\big]$$
$$f(y_{2i}^* \mid y_{1i}^* \geq 0) = \frac{\int_0^\infty f(y_{1i}^*, y_{2i}^*)\, dy_{1i}^*}{\int_0^\infty f(y_{1i}^*)\, dy_{1i}^*} = f(y_{2i}) \frac{\int_0^\infty f(y_{1i}^* \mid y_{2i}^*)\, dy_{1i}^*}{\int_0^\infty f(y_{1i}^*)\, dy_{1i}^*} = \frac{1}{\sigma_2}\phi\left(\frac{y_{2i}^* - x_{2i}\beta_2}{\sigma_2}\right) \cdot \frac{\int_0^\infty f(y_{1i}^* \mid y_{2i}^*)\, dy_{1i}^*}{\Pr(y_{1i}^* > 0)}$$
$$y_{1i} \sim N(x_{1i}\beta_1, \sigma_1^2), \qquad y_{2i} \sim N(x_{2i}\beta_2, \sigma_2^2)$$

59.
$$y_{1i}^* \mid y_{2i}^* \sim N\left(x_{1i}\beta_1 + \frac{\sigma_{12}}{\sigma_2^2}(y_{2i} - x_{2i}\beta_2),\ \sigma_1^2 - \frac{\sigma_{12}^2}{\sigma_2^2}\right)$$
$$E(y_{1i}^* \mid u_{2i} = y_{2i}^* - x_{2i}\beta_2) = x_{1i}\beta_1 + E(u_{1i} \mid u_{2i} = y_{2i}^* - x_{2i}\beta_2)$$

60. Estimation by MLE
$$L = \prod_0 \left[1 - \Phi\left(\frac{x_{1i}\beta_1}{\sigma_1}\right)\right] \prod_1 \frac{1}{\sigma_2}\phi\left(\frac{y_{2i} - x_{2i}\beta_2}{\sigma_2}\right) \left[1 - \Phi\left(\frac{-\left(x_{1i}\beta_1 + \frac{\sigma_{12}}{\sigma_2^2}(y_{2i} - x_{2i}\beta_2)\right)}{\left(\sigma_1^2 - \frac{\sigma_{12}^2}{\sigma_2^2}\right)^{1/2}}\right)\right]$$

61. Estimation by Two-Step Approach
• Using data on $y_{2i}$ for which $y_{1i} > 0$:
$$E(y_{2i} \mid y_{1i}^* > 0) = x_{2i}\beta_2 + E(u_{2i} \mid x_{1i}\beta_1 + u_{1i} > 0) = x_{2i}\beta_2 + \sigma_2 E\left(\frac{u_{2i}}{\sigma_2} \,\Big|\, \frac{u_{1i}}{\sigma_1} > -\frac{x_{1i}\beta_1}{\sigma_1}\right)$$
$$= x_{2i}\beta_2 + \frac{\sigma_{12}}{\sigma_1} E\left(\frac{u_{1i}}{\sigma_1} \,\Big|\, \frac{u_{1i}}{\sigma_1} > -\frac{x_{1i}\beta_1}{\sigma_1}\right) = x_{2i}\beta_2 + \frac{\sigma_{12}}{\sigma_1}\lambda\left(\frac{x_{1i}\beta_1}{\sigma_1}\right)$$

62. Example: Female Labor Supply Model
$$\max u(L, x) \quad \text{s.t. } x = wH + v,\quad H = 1 - L$$
where $H$: hours worked; $v$: asset income; $w$ given; $P_x = 1$; $L$: time spent at home for child care.
$$\frac{\partial u / \partial L}{\partial u / \partial x} = w \quad \text{when } L < 1$$
$$\text{reservation wage} = \text{MRS}\big|_{H=0} = w^R$$

63. Example: Female Labor Supply Model
• We don't observe $w^R$ directly. Model:
$$w^0 = x\beta + u \ \text{(wage the person would earn if they worked)}, \qquad w^R = z\gamma + v$$
$$w_i = w_i^0 \ \text{if } w_i^R < w_i^0, \qquad w_i = 0 \ \text{else}$$
• Fits within the previous Tobit framework if we set
$$y_{1i}^* = w^0 - w^R = x\beta - z\gamma + u - v, \qquad y_{2i} = w_i$$
• Note: Gronau does not develop a model to explain hours of work.

64. Incorporate Choice of $H$
$$w^0 = x_{2i}\beta_2 + u_{2i} \ \text{given}$$
$$\text{MRS} = \frac{\partial u / \partial L}{\partial u / \partial x} = \gamma H_i + z_i'\alpha + v_i$$
(Assume a functional form for the utility function that yields this.)

65.
$$w^r(H_i = 0) = z_i'\alpha + v_i$$
$$\text{work if } w^0 = x_{2i}\beta_2 + u_{2i} > z_i'\alpha + v_i$$
$$\text{if work, then } w^0 = \text{MRS} \;\Rightarrow\; x_{2i}\beta_2 + u_{2i} = \gamma H_i + z_i'\alpha + v_i$$
$$\Rightarrow\; H_i = \frac{x_{2i}\beta_2 - z_i'\alpha + u_{2i} - v_i}{\gamma} = x_{1i}\beta_1 + u_{1i}$$
where $x_{1i}\beta_1 = (x_{2i}\beta_2 - z_i\alpha)\gamma^{-1}$ and $u_{1i} = (u_{2i} - v_i)\gamma^{-1}$.

66. Type 3 Tobit Model
$$y_{1i}^* = x_{1i}\beta_1 + u_{1i} \quad \longleftarrow \text{hours}$$
$$y_{2i}^* = x_{2i}\beta_2 + u_{2i} \quad \longleftarrow \text{wage}$$
$$y_{1i} = y_{1i}^* \ \text{if } y_{1i}^* > 0; \qquad y_{1i} = 0 \ \text{if } y_{1i}^* \leq 0$$
$$y_{2i} = y_{2i}^* \ \text{if } y_{1i}^* > 0; \qquad y_{2i} = 0 \ \text{if } y_{1i}^* \leq 0$$

67.
• Here $H_i = H_i^*$ if $H_i^* > 0$; $H_i = 0$ if $H_i^* \leq 0$; and $w_i = w_i^0$ if $H_i^* > 0$; $w_i = 0$ if $H_i^* \leq 0$.
• Note: Type IV Tobit simply adds $y_{3i} = y_{3i}^*$ if $y_{1i}^* > 0$; $y_{3i} = 0$ if $y_{1i}^* \leq 0$.

68.
• Can estimate by (1) maximum likelihood or (2) the two-step method:
$$E\left(w_i^0 \mid H_i > 0\right) = \gamma H_i + z_i\alpha + E(v_i \mid H_i > 0)$$

69. Type V Tobit: Model of Heckman (1978)
$$y_{1i}^* = \gamma_1 y_{2i} + x_{1i}\beta_1 + \delta_1 w_i + u_{1i}$$
$$y_{2i} = \gamma_2 y_{1i}^* + x_{2i}\beta_2 + \delta_2 w_i + u_{2i}$$
• Analysis of an antidiscrimination law on the average income of African Americans in the $i$-th state.
• Observe $x_{1i}$, $x_{2i}$, $y_{2i}$, and $w_i$, where
$$w_i = 1 \ \text{if } y_{1i}^* > 0, \qquad w_i = 0 \ \text{if } y_{1i}^* \leq 0$$
• $y_{2i}$ = average income of African Americans in the state; $y_{1i}^*$ = unobservable sentiment toward African Americans; $w_i$ = 1 if the law is in effect.

70.
• Adoption of the law is endogenous.
• Require the restriction $\gamma_1\delta_2 + \delta_1 = 0$ so that we can solve for $y_{1i}^*$ as a function that does not depend on $w_i$.
• This class of models is known as "dummy endogenous variable" models.
• Coherency problem (what if the restriction is not imposed?).

71. Relaxing Parametric Assumptions in the Selection Model
References:
• Heckman (AER, 1990), "Varieties of Selection Bias"
• Heckman (1980), "Addendum to Sample Selection Bias as Specification Error"
• Heckman and Robb (1985, 1986)
$$y_1^* = x\beta + u, \qquad y_2^* = z\gamma + v, \qquad y_1 = y_1^* \ \text{if } y_2^* > 0$$

72. Relaxing Parametric Assumptions in the Selection Model
$$E(y_1^* \mid \text{observed}) = x\beta + E(u \mid x, z\gamma + v > 0) + \big[u - E(u \mid x, z\gamma + v > 0)\big]$$
$$E(u \mid x, z\gamma + v > 0) = \frac{\int_{-\infty}^{\infty}\int_{-z\gamma}^{\infty} u\, f(u, v \mid x, z)\, dv\, du}{\int_{-\infty}^{\infty}\int_{-z\gamma}^{\infty} f(u, v \mid x, z)\, dv\, du}$$
• Note: $\Pr(y_2^* > 0 \mid z) = \Pr(z\gamma + v > 0 \mid z) = P(Z) = 1 - F_v(-z\gamma)$.

73.
$$\Rightarrow\; F_v(-z\gamma) = 1 - P(Z) \;\Rightarrow\; -z\gamma = F_v^{-1}(1 - P(Z))$$
• Can replace $-z\gamma$ in the integrals by $F_v^{-1}(1 - P(Z))$ if, in addition, $f(u, v \mid x, z) = f(u, v \mid z\gamma)$ (index sufficiency).
• Then
$$E(y_1^* \mid y_2^* > 0) = x\beta + g(P(Z)) + \varepsilon$$
where $g(P(Z))$ is the bias or "control function."
• Semiparametric selection model: approximate the bias function by a Taylor series in $P(z\gamma)$, a truncated power series.
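One way to see why a low-order polynomial in $P$ can stand in for the control function: under joint normality $g(P) \propto \phi(\Phi^{-1}(P))/P$, and a cubic in $P$ fits it closely over the interior of the unit interval (a numerical sketch; the proportionality constant $\sigma_{12}/\sigma_1$ is set to 1):

```python
import math

def Phi(z): return 0.5 * (1 + math.erf(z / math.sqrt(2)))
def phi(z): return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def Phi_inv(p):
    # Invert Phi by bisection -- accurate enough for a sketch
    lo, hi = -8.0, 8.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Normal-model control function g(P) = phi(Phi^{-1}(P)) / P on a grid
grid = [i / 100 for i in range(5, 96)]
g = [phi(Phi_inv(P)) / P for P in grid]

# Least-squares cubic in P via the normal equations
k = 4
A = [[sum(P ** (i + j) for P in grid) for j in range(k)] for i in range(k)]
b = [sum(gi * P ** i for P, gi in zip(grid, g)) for i in range(k)]
for c in range(k):                      # Gaussian elimination
    for r in range(c + 1, k):
        f = A[r][c] / A[c][c]
        A[r] = [ar - f * ac for ac, ar in zip(A[c], A[r])]
        b[r] -= f * b[c]
coef = [0.0] * k
for c in range(k - 1, -1, -1):          # back substitution
    coef[c] = (b[c] - sum(A[c][j] * coef[j] for j in range(c + 1, k))) / A[c][c]

max_err = max(abs(sum(coef[i] * P ** i for i in range(k)) - gi)
              for P, gi in zip(grid, g))
```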
