SLIDE 1

Classical Discrete Choice Theory

James J. Heckman University of Chicago Econ 312, Spring 2019

Heckman Classical Discrete Choice Theory

SLIDE 2
  • Classical regression model:

        y = xβ + ε,   E(ε|x) = 0,   ε ∼ N(0, σ²I)

  • Model untenable if y is binary:

        y = (1, 0, 1, ..., 1)′

  • Unless we invoke the linear probability model, which is discussed below
and has some unusual properties.

SLIDE 3
  • To model discrete choices, need to think of the ingredients that

give rise to choices.

  • For example, suppose we want to forecast demand for a new
good. We observe consumption data on old goods x1, ..., xI.

(Each good could represent a transportation mode, for example, or an occupation choice.)

  • Assume people choose a good that yields highest utility. When
we have a new good, we need a way of putting it on a common basis
with the old goods.

  • Earliest literature on discrete choice was developed in

psychometrics where researchers were concerned with modeling choice behavior (Thurstone).

  • These are also models of counterfactual utilities.

SLIDE 4

Two dominant modeling approaches

(i) Luce Model (1953) ⇐⇒ McFadden conditional logit model

(ii) Thurstone-Quandt Model (1929, 1930s) (multivariate probit/normal model)

SLIDE 5

Other approaches

(i) GEV models

1 Luce-McFadden Model

  • widely used in economics
  • easy to compute
  • identifiability of parameters understood
  • very restrictive substitution possibilities among goods
  • restrictive heterogeneity
  • imposes arbitrary preference shocks

2 Quandt-Thurstone Model

  • very general substitution possibilities
  • allows for more general forms of heterogeneity
  • more difficult to compute
  • identifiability less easily established
  • does not necessarily rely on preference shocks

SLIDE 6

Luce Model/McFadden Conditional Logit Model

  • References: Manski and McFadden, Chapter 5; Yellott paper
  • Notation:
  • X: universe of objects of choice
  • S: universe of attributes of persons
  • B: feasible choice set (x ∈ B ⊆ X)

SLIDE 7

Luce Model/McFadden Conditional Logit Model

  • Behavior rule mapping attributes into choices: h

h (B, S) = x

  • We might assume that there is a distribution of choice rules.
  • h might be random because

(a) in observation we lose some information governing choices

(unobserved characteristics of choice and person)

(b) there can be random variation in choices due to unmeasured

psychological factors

  • Define P(x|S, B) = Pr{h ∈ H : h(S, B) = x}
  • Probability that an individual drawn randomly from the

population with attributes S and alternative set B chooses x.

SLIDE 8

Luce Axioms

  • Maintain some restrictions on P (x|S, B) and derive

implications for the functional form of P.

  • Axiom #1: “Independence of Irrelevant Alternatives”

For x, y ∈ B, s ∈ S:

    P(x|s, {x, y}) / P(y|s, {x, y}) = P(x|s, B) / P(y|s, B),   B = larger choice set

SLIDE 9

Luce Axioms

  • Example: Suppose choice is career decision and individual is

choosing to be

  • an economist (E)
  • a fireman (F)
  • a policeman (P)

    Pr(E|s, {E, F}) / Pr(F|s, {E, F}) = Pr(E|s, {E, F, P}) / Pr(F|s, {E, F, P})

← one would think that introducing a 3rd alternative might increase this ratio

SLIDE 10

Luce Axioms

  • Another example: Red bus-Blue bus
  • Choices:
  • take car C
  • red bus RB
  • blue bus BB
  • Axiom #2

Pr(y|s, B) > 0 ∀y ∈ B (i.e. eliminate 0 probability choices)

SLIDE 11

Implications of above axioms

  • Define Pxy = P(x|s, {x, y})
  • Assume Pxx = 1/2
  • By the IIA axiom,

        P(y|s, B) = (Pyx/Pxy) P(x|s, B)

  • Summing,

        Σ_{y∈B} P(y|s, B) = 1  ⟹  P(x|s, B) = 1 / Σ_{y∈B} (Pyx/Pxy)

SLIDE 12

Implications of above axioms

  • Furthermore,

        P(y|s, B) = (Pyz/Pzy) P(z|s, B)
        P(x|s, B) = (Pxz/Pzx) P(z|s, B)
        P(y|s, B) = (Pyx/Pxy) P(x|s, B)

    ⟹  Pyx/Pxy = P(y|s, B) / P(x|s, B) = (Pyz/Pzy) / (Pxz/Pzx)

SLIDE 13

Implications of above axioms

  • Define

        ṽ(s, x, z) = ln(Pxz/Pzx)
        ṽ(s, y, z) = ln(Pyz/Pzy)

    ⟹  Pyx/Pxy = e^{ṽ(s,y,z)} / e^{ṽ(s,x,z)}

SLIDE 14
  • Axiom #3: Separability Assumption

        ṽ(s, x, z) = v(s, x) − v(s, z)

    ← v(s, z) can be interpreted as a utility indicator of representative tastes
  • Then

        P(x|s, B) = 1 / Σ_{y∈B} (Pyx/Pxy)
                  = 1 / Σ_{y∈B} [e^{v(s,y)−v(s,z)} / e^{v(s,x)−v(s,z)}]

        P(x|s, B) = e^{v(s,x)} / Σ_{y∈B} e^{v(s,y)}

    ← get logistic form from Luce Axioms
  • Now link model to familiar models in economics.
  • Marschak (1959) established link between Luce Model and random utility
models (RUMs).
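The closed form above is easy to check numerically. A minimal sketch (the utilities v below are illustrative, not from the lecture), verifying both the logistic form and the IIA property it implies:

```python
import math

def logit_probs(v):
    """Conditional logit: P(j) = exp(v_j) / sum_l exp(v_l)."""
    m = max(v)                      # subtract max for numerical stability
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

# Three alternatives with deterministic utilities v(s, x_j)
v = [1.0, 0.5, -0.2]
p = logit_probs(v)

# IIA: the ratio P(x)/P(y) is unchanged when the third alternative is dropped
p2 = logit_probs(v[:2])
ratio_full = p[0] / p[1]
ratio_pair = p2[0] / p2[1]
```

Both ratios equal e^{v1−v2}, as the axioms require.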

SLIDE 15

Random Utility Models: Thurstone (1927, 1930s)

  • Assume utility from choosing alternative j is

uj = v (s, xj) + ε(s, xj)

  • v(s, xj) is a nonstochastic function and ε(s, xj) is stochastic,
reflecting idiosyncratic tastes.

SLIDE 16

Random Utility Models: Thurstone (1927, 1930s)

  • Pr(j is maximal in set B)
        = Pr(u(s, xj) ≥ u(s, xl))  ∀l ≠ j
        = Pr(v(s, xj) + ε(s, xj) ≥ v(s, xl) + ε(s, xl))  ∀l ≠ j
        = Pr(v(s, xj) − v(s, xl) ≥ ε(s, xl) − ε(s, xj))  ∀l ≠ j

SLIDE 17
  • Specify a cdf F(ε1, ..., εN)
  • Then

    Pr(vj − vl ≥ εl − εj ∀l ≠ j) = Pr(vj − vl + εj ≥ εl ∀l ≠ j)

        = ∫_{−∞}^{∞} Fj(vj − v1 + εj, ..., vj − vj−1 + εj, εj, vj − vj+1 + εj, ..., vj − vJ + εj) dεj

    where Fj denotes the partial derivative of F with respect to its jth argument.

SLIDE 18
  • If ε is iid, then

        F(ε1, ..., εn) = ∏_{i=1}^{n} Fi(εi)

SLIDE 19
  • So

        Pr(vj − vl ≥ εl − εj ∀l ≠ j) = ∫_{−∞}^{∞} [ ∏_{i=1, i≠j}^{n} Fi(vj − vi + εj) ] fj(εj) dεj

SLIDE 20

Binary Example (N = 2)

    P(1|s, B) = ∫_{ε1=−∞}^{∞} ∫_{ε2=−∞}^{v1−v2+ε1} f(ε1, ε2) dε2 dε1

  • If ε1, ε2 are normal then ε1 − ε2 is normal, so
Pr(v1 − v2 ≥ ε1 − ε2) is a normal probability (probit).
  • If ε1, ε2 are Weibull then ε1 − ε2 is logistic:

        ε ∼ Weibull ⟹ Pr(ε < c) = e^{−e^{−c+α}}

SLIDE 21
  • Also called “double exponential” or “Type I extreme value”
  • For ε Weibull:

    Pr(v1 + ε1 > v2 + ε2) = Pr(v1 − v2 > ε2 − ε1) = Ω(v1 − v2)
        = e^{v1−v2} / (1 + e^{v1−v2}) = e^{v1} / (e^{v1} + e^{v2})
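The claim that the difference of two type I extreme value ("Weibull") errors yields the logistic formula can be checked by simulation. A sketch under assumed illustrative utilities v1, v2:

```python
import math, random

random.seed(0)

def gumbel():
    # Inverse-cdf draw from the type I extreme value: F(c) = exp(-exp(-c))
    u = random.random()
    return -math.log(-math.log(u))

v1, v2 = 0.4, -0.3
n = 200_000
hits = sum(v1 + gumbel() > v2 + gumbel() for _ in range(n))
mc = hits / n                                  # Monte Carlo estimate

logistic = math.exp(v1) / (math.exp(v1) + math.exp(v2))
```

The simulated frequency matches the closed-form logistic probability up to sampling noise.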

SLIDE 22
  • Result: Assuming that the errors follow a Weibull distribution

yields same logit model derived from the Luce Axioms.

  • This link was established by Marschak (1959)
  • Turns out that Weibull is sufficient but not necessary.
  • Some other distributions for ε generate a logit.
  • Yellot (1977) showed that if we require “invariance under

uniform expansions of the choice set” then only double exponential gives logit.

  • Example: Suppose choice set is {coffee, tea, milk}; then
“invariance” requires that probabilities stay the same if we double
the choice set (i.e., 2 coffees, 2 teas, 2 milks).

SLIDE 23

Some Important Properties of the Weibull

  • Developed 1928 (Fisher & Tippett showed it is one of 3 possible
limiting distributions for the maximum of a sequence of random
variables)
  • Closed under maximization (i.e. max of n Weibulls is a Weibull):

        Pr(max_i εi ≤ c) = ∏_i e^{−e^{−c+αi}} = e^{−Σ_i e^{−c} e^{αi}}
                         = e^{−e^{−c} Σ_i e^{αi}} = e^{−e^{−c+ln Σ_i e^{αi}}}
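The closure-under-max property can be verified by simulation. A sketch with assumed illustrative location parameters α:

```python
import math, random

random.seed(1)

def gumbel(alpha=0.0):
    # Inverse-cdf draw with Pr(eps <= c) = exp(-exp(-(c - alpha)))
    u = random.random()
    return alpha - math.log(-math.log(u))

alphas = [0.0, 0.5, 1.0]
n = 100_000
draws = [max(gumbel(a) for a in alphas) for _ in range(n)]

# Theory: the max is again type I extreme value, with location ln(sum_i e^{alpha_i})
loc = math.log(sum(math.exp(a) for a in alphas))
c = 1.0
empirical = sum(d <= c for d in draws) / n
theoretical = math.exp(-math.exp(-(c - loc)))
```

The empirical cdf of the max matches the shifted type I extreme value cdf.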

SLIDE 24
  • Difference between two Weibulls is logistic
  • Under Luce axioms (or a R.U.M. with the Weibull assumption),

        Pr(j|s, B) = e^{v(s,xj)} / Σ_{l=1}^{N} e^{v(s,xl)}

  • Now reconsider the forecasting problem:
  • Let xj = set of characteristics associated with choice j
  • Usually, it is assumed that v(s, xj) = θ(s)′xj
  • Dependence of θ on s reflects fact that individuals differ in their
evaluation of characteristics.

SLIDE 25
  • Get

        Pr(j|s, B) = e^{θ(s)′xj} / Σ_{l=1}^{N} e^{θ(s)′xl}

  • Solve by MLE

SLIDE 26
  • Solve by MLE:

        max_{θ(s)} ∏_{i=1}^{N} [ e^{θ(s)′x1} / Σ_{l=1}^{N} e^{θ(s)′xl} ]^{D1i} · · · [ e^{θ(s)′xN} / Σ_{l=1}^{N} e^{θ(s)′xl} ]^{DNi}

    where Dji = 1 if person i chooses alternative j, 0 otherwise.
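The MLE above can be sketched on simulated data. Everything below (the characteristics x, the true θ, the simple gradient-ascent solver) is an illustrative stand-in, not part of the lecture:

```python
import math, random

random.seed(2)

# Simulated data: n people each choose among J goods with characteristics x[j]
J, K, n = 3, 2, 5000
x = [[0.5, 1.0], [1.5, -0.5], [-1.0, 0.3]]     # illustrative x_j
theta_true = [0.8, -0.6]

def probs(theta):
    v = [sum(t * xjk for t, xjk in zip(theta, xj)) for xj in x]
    m = max(v)                                  # subtract max for stability
    e = [math.exp(vi - m) for vi in v]
    s = sum(e)
    return [ei / s for ei in e]

choices = random.choices(range(J), weights=probs(theta_true), k=n)
counts = [choices.count(j) for j in range(J)]

# Gradient ascent on the log likelihood; the score for person choosing j is
# x_{jk} - sum_l P(l) x_{lk}
theta = [0.0, 0.0]
for _ in range(500):
    p = probs(theta)
    xbar = [sum(p[j] * x[j][k] for j in range(J)) for k in range(K)]
    grad = [sum(counts[j] * x[j][k] for j in range(J)) - n * xbar[k]
            for k in range(K)]
    theta = [t + 0.5 * g / n for t, g in zip(theta, grad)]
```

With 5,000 simulated choices the estimate lands close to the true θ.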

SLIDE 27
  • If a new good has same characteristics, get probabilities by

        B′ = B ∪ {N + 1}

        P(N + 1|B′, s) = e^{θ(s)′x_{N+1}} / Σ_{l=1}^{N+1} e^{θ(s)′xl}

SLIDE 28

Debreu (1960) criticism of Luce Model

  • “Red Bus - Blue Bus Problem”
  • Suppose the (N + 1)th alternative is identical to the first:

        Pr(choose 1 or N + 1 | s, B′) = 2 e^{θ(s)′x_{N+1}} / Σ_{l=1}^{N+1} e^{θ(s)′xl}

  • ⟹ Introduction of an identical good changes probability of riding a
bus.

  • not an attractive result
  • comes from need to make iid assumption on new alternative
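Debreu's criticism can be made concrete with a tiny numerical example (utilities here are illustrative, all set equal):

```python
import math

def logit(v):
    s = sum(math.exp(x) for x in v)
    return [math.exp(x) / s for x in v]

# Car vs. red bus with equal utilities: each gets probability 1/2
p2 = logit([1.0, 1.0])

# Add a blue bus identical to the red bus: logit splits 1/3 each,
# so the total bus probability jumps from 1/2 to 2/3 -- the IIA problem
p3 = logit([1.0, 1.0, 1.0])
bus_share = p3[1] + p3[2]
```

An identical good should not draw share away from the car, yet under the logit it does.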

SLIDE 29

Debreu (1960) criticism of Luce Model: Some Alternative Assumptions

1 Could let vi = ln(θ(s)′xi):

        Pr(j|s, B) = θ(s)′xj / Σ_{l=1}^{N} θ(s)′xl

   If we also imposed Σ_{l=1}^{N} θ(s)′xl = 1, we would get the linear
probability model, but this could violate IIA.

2 Could consider model of form

        Pr(j|s, B) = e^{θj(s)′xj} / Σ_{l=1}^{N} e^{θl(s)′xl}

   but here we have lost our forecasting ability (cannot predict demand
for a new good).

3 Universal Logit Model

        Pr(i|s, x1, ..., xN) = e^{ϕi(x1,...,xN)β(s)} / Σ_{l=1}^{N} e^{ϕl(x1,...,xN)β(s)}

   Here we lose IIA and forecasting (Bernstein polynomial).
SLIDE 30

Criteria for a good PCS

1 Goal: We want a probabilistic choice model that
   1 has a flexible functional form
   2 is computationally practical
   3 allows for flexibility in representing substitution patterns among choices
   4 is consistent with a random utility model (RUM) ⟹ has a structural interpretation

SLIDE 31

How do you verify that a candidate PCS is consistent with a RUM?

1 (a) Either start with a R.U.M.

        ui = v(s, xi) + ε(s, xi)

   and solve the integral for

        Pr(ui > ul, ∀l ≠ i) = Pr(i = arg max_l (vl + εl))

   (b) or start with a candidate PCS and verify that it is consistent with
a R.U.M. (easier)

2 McFadden provides sufficient conditions
3 See discussion of Daly-Zachary-Williams theorem

SLIDE 32

Link to AIRUM Models

SLIDE 33

Daly-Zachary-Williams Theorem

  • Daly-Zachary (1976) and Williams (1977) provide a set of

conditions that makes it easy to derive a PCS from a RUM with a class of models (“generalized extreme value” (GEV) models)

  • Define G : G(Y1, . . . , YJ)
  • If G satisfies the following:

    1 nonnegative, defined on Y1, . . . , YJ ≥ 0
    2 homogeneous of degree one in its arguments
    3 lim_{Yi→∞} G(Y1, . . . , Yi, . . . , YJ) → ∞, ∀i = 1, . . . , J
    4 ∂^k G / (∂Y1 · · · ∂Yk) is nonnegative if k is odd, nonpositive if k is even   (1)

SLIDE 34
  • Then for a R.U.M. with ui = vi + εi and

        F(ε1, . . . , εJ) = exp[−G(e^{−ε1}, . . . , e^{−εJ})]

  • This cdf has Weibull marginals but allows for more dependence
among ε’s.
  • The PCS is given by

        Pi = ∂ ln G/∂vi = e^{vi} Gi(e^{v1}, . . . , e^{vJ}) / G(e^{v1}, . . . , e^{vJ})

  • Note: McFadden shows that under certain conditions on the
form of the indirect utility function (satisfies AIRUM form), the
DZW result can be seen as a form of Roy’s identity.

SLIDE 35
  • Let’s apply this result
  • Multinomial logit model (MNL):

        F(ε1, . . . , εJ) = e^{−e^{−ε1}} · · · e^{−e^{−εJ}} ← product of iid Weibulls
                         = e^{−Σ_{j=1}^{J} e^{−εj}}

  • Can verify that G(e^{v1}, . . . , e^{vJ}) = Σ_{j=1}^{J} e^{vj} satisfies the DZW
conditions:

        P(j) = ∂ ln G/∂vj = e^{vj} / Σ_{l=1}^{J} e^{vl} = MNL model
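The identity P(j) = ∂ ln G/∂vj for the linear generator can be checked by numerical differentiation. A sketch with illustrative utilities:

```python
import math

# GEV generator for MNL: G(Y1..YJ) = sum_j Yj; the PCS is P(j) = d ln G / d v_j
v = [0.2, -0.4, 1.1]

def lnG(vs):
    return math.log(sum(math.exp(x) for x in vs))

h = 1e-6
p_numeric = []
for j in range(len(v)):
    vp = v[:]; vp[j] += h
    vm = v[:]; vm[j] -= h
    p_numeric.append((lnG(vp) - lnG(vm)) / (2 * h))  # central difference

s = sum(math.exp(x) for x in v)
p_closed = [math.exp(x) / s for x in v]
```

The numerical derivatives of ln G reproduce the MNL probabilities.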

SLIDE 36
  • Another GEV model
  • Nested logit model (addresses to a limited extent the IIA

criticism)

  • Let

        G(e^{v1}, . . . , e^{vJ}) = Σ_{m=1}^{M} am [ Σ_{i∈Bm} e^{vi/(1−σm)} ]^{1−σm}

    (σm is like an elasticity of substitution)

SLIDE 37
  • Idea: divide goods into branches
  • First choose branch, then good within branch

    [Tree: {car} | {bus} → {red, blue}]

  • Will allow for correlation between errors (this is role of σm)
  • Bm ⊆ {1, . . . , J}, ∪_{m=1}^{M} Bm = B; each good is on a single
branch—need not have all choices on all branches

SLIDE 38
  • Note: if σm = 0, get usual MNL form
  • Calculate

    pi = ∂ ln G/∂vi = ∂ ln{ Σ_{m=1}^{M} am [Σ_{i∈Bm} e^{vi/(1−σm)}]^{1−σm} } / ∂vi

       = Σ_{m ∋ i∈Bm} am e^{vi/(1−σm)} [Σ_{i∈Bm} e^{vi/(1−σm)}]^{−σm} / Σ_{m=1}^{M} am [Σ_{i∈Bm} e^{vi/(1−σm)}]^{1−σm}

       = Σ_{m=1}^{M} P(i|Bm) P(Bm)

SLIDE 39
  • Where

        P(i|Bm) = e^{vi/(1−σm)} / Σ_{i∈Bm} e^{vi/(1−σm)}   if i ∈ Bm, 0 otherwise

        P(Bm) = am [Σ_{i∈Bm} e^{vi/(1−σm)}]^{1−σm} / Σ_{m=1}^{M} am [Σ_{i∈Bm} e^{vi/(1−σm)}]^{1−σm}

  • Note: If P(Bm) = 1, get logit form
  • Nested logit requires that analyst make choices about nesting
structure
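The decomposition P(i) = P(i|Bm)P(Bm) translates directly into code. A sketch of the nested logit probabilities (the nesting structure, σ values, and utilities below are illustrative):

```python
import math

def nested_logit(v, nests, sigma, a=None):
    """P(i) = P(i|Bm) * P(Bm) for the GEV generator
       G = sum_m a_m (sum_{i in Bm} e^{v_i/(1-s_m)})^{1-s_m}."""
    if a is None:
        a = [1.0] * len(nests)
    # inclusive value of each nest
    S = [sum(math.exp(v[i] / (1 - s)) for i in B) for B, s in zip(nests, sigma)]
    denom = sum(am * Sm ** (1 - s) for am, Sm, s in zip(a, S, sigma))
    p = [0.0] * len(v)
    for m, (B, s) in enumerate(zip(nests, sigma)):
        p_nest = a[m] * S[m] ** (1 - s) / denom      # P(Bm)
        for i in B:
            p[i] = p_nest * math.exp(v[i] / (1 - s)) / S[m]   # times P(i|Bm)
    return p

# Car alone in one nest; red and blue bus share a nest with sigma = 0.9
p = nested_logit([0.0, 0.0, 0.0], nests=[[0], [1, 2]], sigma=[0.0, 0.9])
```

With equal utilities and a highly correlated bus nest, the car's share moves above 1/3 toward the intuitive 1/2.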

SLIDE 40
  • How does nested logit solve red bus/blue bus problem?
  • Suppose

        G = Y1 + [ Y2^{1/(1−σ)} + Y3^{1/(1−σ)} ]^{1−σ},   Yi = e^{vi}

SLIDE 41

    P(1|{123}) = ∂ ln G/∂v1 = e^{v1} / { e^{v1} + [e^{v2/(1−σ)} + e^{v3/(1−σ)}]^{1−σ} }

    P(2|{123}) = ∂ ln G/∂v2 = e^{v2/(1−σ)} [e^{v2/(1−σ)} + e^{v3/(1−σ)}]^{−σ} / { e^{v1} + [e^{v2/(1−σ)} + e^{v3/(1−σ)}]^{1−σ} }

SLIDE 42
  • As v3 → −∞,

        P(1|{123}) = e^{v1} / (e^{v1} + e^{v2})   (get logistic)

  • As v1 → −∞, the choice between 2 and 3 is a binary logit in the
scaled utilities v2/(1−σ), v3/(1−σ)

SLIDE 43

What Role Does σ Play?

  • σ is the degree of substitutability parameter
  • Recall

        F(ε1, ε2, ε3) = exp{−G(e^{−ε1}, e^{−ε2}, e^{−ε3})}

  • Here

        σ = cov(ε2, ε3) / √(var ε2 · var ε3) = correlation coefficient

  • Thus we require −1 ≤ σ ≤ 1, but turns out we also need to
require σ > 0 for DZW conditions to be satisfied. This is
unfortunate because it does not allow ε’s to be negatively
correlated.
  • Can show that

        lim_{σ→1} P(1|{123}) = e^{v1} / (e^{v1} + max(e^{v2}, e^{v3}))   (L’Hôpital’s Rule)

SLIDE 44
  • If v2 = v3, then

        P(2|{123}) = e^{v2/(1−σ)} [2e^{v2/(1−σ)}]^{−σ} / { e^{v1} + [2e^{v2/(1−σ)}]^{1−σ} }
                   = 2^{−σ} e^{v2} / [ e^{v1} + e^{v2} 2^{1−σ} ]

        lim_{σ→1} P(2|{123}) = 2^{−1} e^{v2} / (e^{v1} + e^{v2})

    ↑ introduce 3rd identical alternative and cut the probability of
choosing 2 in half
  • Solves red-bus/blue-bus problem
  • Probability cut in half with two identical alternatives

SLIDE 45

[Tree: {car} | {red bus, blue bus}]

  • σ is a measure of similarity between red and blue bus.
  • When σ is close to one, the conditional choice probability within
the bus nest selects the better alternative with high probability.

SLIDE 46
  • Remark: We can expand logit to accommodate multiple levels,
e.g.

        G = Σ_{q=1}^{Q} aq { Σ_{m∈Qq} am [ Σ_{i∈Bm} Yi^{1/(1−σm)} ]^{1−σm} }   ← 3 levels

SLIDE 47
  • Example: Two Choices

    1 Neighborhood (m)
    2 Transportation mode (t)
    3 P(m): choice of neighborhood
    4 P(i|Bm): probability of choosing ith mode, given neighborhood m

SLIDE 48

1 Not all modes available in all neighborhoods

        P_{m,t} = e^{v(m,t)/(1−σm)} [Σ_{t=1}^{Tm} e^{v(m,t)/(1−σm)}]^{−σm} / Σ_{j=1}^{M} [Σ_{t=1}^{Tj} e^{v(j,t)/(1−σj)}]^{1−σj}

        P_{t|m} = e^{v(m,t)/(1−σm)} / Σ_{t=1}^{Tm} e^{v(m,t)/(1−σm)}

        P_m = [Σ_{t=1}^{Tm} e^{v(m,t)/(1−σm)}]^{1−σm} / Σ_{j=1}^{M} [Σ_{t=1}^{Tj} e^{v(j,t)/(1−σj)}]^{1−σj} = P(Bm)

SLIDE 49
  • Standard type of utility function that people might use:

        v(m, t) = z′t γ + x′mt β + y′m α

SLIDE 50
  • z′t is transportation mode characteristics, x′mt is interactions, and
y′m is neighborhood characteristics.
  • Then

        P_{t|m} = e^{(z′t γ + x′mt β)/(1−σm)} / Σ_{t=1}^{Tm} e^{(z′t γ + x′mt β)/(1−σm)}

        P_m = e^{y′m α} [Σ_{t=1}^{Tm} e^{(z′t γ + x′mt β)/(1−σm)}]^{1−σm} / Σ_{j=1}^{M} e^{y′j α} [Σ_{t=1}^{Tj} e^{(z′t γ + x′jt β)/(1−σj)}]^{1−σj}

SLIDE 51
  • Estimation (in two steps) (see Amemiya, Chapter 9)
  • Let

        Im = Σ_{t=1}^{Tm} e^{(z′t γ + x′mt β)/(1−σm)}

SLIDE 52

1 Within each neighborhood, get γ/(1−σm) and β/(1−σm) by logit
2 Form Îm
3 Then estimate by MLE

        e^{y′m α + (1−σm) ln Îm} / Σ_{j=1}^{M} e^{y′j α + (1−σj) ln Îj}

   to get α, σm

  • Assume σm = σj ∀j, m, or at least impose some restrictions across
multiple neighborhoods
  • Note: Îm is an estimated regressor (“Durbin problem”)
  • Need to correct standard errors
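The second stage above can be sketched in code. Everything numeric here (neighborhood characteristics y, α, σ, and the inclusive values Î) is an illustrative stand-in; the first-stage within-neighborhood logits are taken as already done:

```python
import math

def stage2_probs(y, alpha, sigma, I):
    """P(Bm) proportional to exp(y_m' alpha + (1 - sigma_m) * ln(I_m))."""
    u = [sum(a * yk for a, yk in zip(alpha, ym)) + (1 - s) * math.log(Im)
         for ym, s, Im in zip(y, sigma, I)]
    m = max(u)                          # subtract max for stability
    e = [math.exp(x - m) for x in u]
    tot = sum(e)
    return [x / tot for x in e]

y = [[1.0, 0.2], [0.5, -0.1], [0.0, 0.4]]   # illustrative y_m
alpha = [0.3, -0.5]
sigma = [0.2, 0.2, 0.2]
I = [2.0, 1.5, 3.0]                          # estimated inclusive values I_m
P = stage2_probs(y, alpha, sigma, I)
```

In the MLE step, α and σm are chosen to maximize the log likelihood built from these probabilities; raising a neighborhood's inclusive value raises its choice probability.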

SLIDE 53

Multinomial Probit Models

1 Also known as:
   1 Thurstone Model V (1929, 1930)
   2 Thurstone-Quandt Model
   3 Developed by Domencich-McFadden (1978) (on reading list)

        ui = vi + ηi,   i = 1, ..., J
        vi = Ziβ (linear-in-parameters form)
        ui = Ziβ + ηi

    MNL: (i) β fixed, (ii) ηi iid
    MNP: (i) β random coefficient, β ∼ N(β̄, Σβ); (ii) β independent of η, η ∼ (0, Ση)

  • Allow general forms of correlation between errors

SLIDE 54

        ui = Zi β̄ + Zi(β − β̄) + ηi

  • (β − β̄) = ε, and Zi(β − β̄) + ηi is a composite heteroskedastic
error term.
  • β random ⟹ taste heterogeneity
  • ηi can be interpreted as unobserved attributes of goods
  • Main advantage of MNP over MNL is that it allows for general
error covariance structure.
  • Note: To make computation easier, users sometimes set
Σβ = 0 (fixed coefficient version)
  • allowing for β random
  • permits random taste variation
  • allows for possibility that different persons value
characteristics differently

SLIDE 55

Problem of Identification and Normalization in the MNP Model

  • Reference: David Bunch (1979), “Estimability in the
Multinomial Probit Model,” in Transportation Research
  • Domencich and McFadden
  • Let

        Z β̄ = (Z1 β̄, . . . , ZJ β̄)′,   η̃ = (η1, . . . , ηJ)′

    J alternatives, K characteristics, β random, β ∼ N(β̄, Σβ)   (2)
SLIDE 56

Problem of Identification and Normalization in the MNP Model

  • Pr(alternative j selected)

        = Pr(uj > ui) ∀i ≠ j

        = ∫_{uj=−∞}^{∞} ∫_{u1=−∞}^{uj} · · · ∫_{uJ=−∞}^{uj} Φ(u|Vu, Σu) duJ · · · du1 duj

    where Φ(u|Vu, Σu) is the J-dimensional MVN density with mean Vu
and covariance Σu
  • Note: Unlike the MNL, no closed form expression for the
integral.
  • The integrals are often evaluated using simulation methods (we will
work an example).
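A crude frequency simulator illustrates the idea: draw utilities from the multivariate normal and count how often each alternative wins. The mean utilities and the Cholesky factor below are illustrative assumptions:

```python
import math, random

random.seed(3)

def mvn_draw(mean, chol):
    """Draw from N(mean, L L') given a lower-triangular Cholesky factor L."""
    z = [random.gauss(0, 1) for _ in mean]
    return [m + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]

# Illustrative 3-good example: utilities u = V + eta, eta ~ N(0, Sigma_eta)
V = [0.5, 0.0, -0.5]
L = [[1.0, 0.0, 0.0],       # assumed Cholesky factor of Sigma_eta
     [0.5, 0.8, 0.0],
     [0.2, 0.1, 0.9]]

R = 100_000
wins = [0, 0, 0]
for _ in range(R):
    u = mvn_draw(V, L)
    wins[u.index(max(u))] += 1          # frequency simulator for P(j chosen)
P = [w / R for w in wins]
```

Smoother simulators (e.g. GHK) are used in practice because the frequency simulator is not differentiable in the parameters.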

SLIDE 57

How many parameters are there?

  • β̄: K parameters
  • Σβ: K × K symmetric matrix: (K² − K)/2 + K = K(K+1)/2
  • Ση: J(J+1)/2

  • Note: When a person chooses j, all we know is relative utility,

not absolute utility.

  • This suggests that not all parameters in the model will be

identified.

  • Requires normalizations.

SLIDE 58

Digression on Identification

  • What does it mean to say a parameter is not identified in a

model?

  • Model with one parameterization is observationally equivalent

to another model with a different parameterization

SLIDE 59

Digression on Identification

  • Example: Binary Probit Model (fixed β)

    Pr(D = 1|Z) = Pr(v1 + ε1 > v2 + ε2)
                = Pr(x1β + ε1 > x2β + ε2)
                = Pr((x1 − x2)β > ε2 − ε1)
                = Pr((x1 − x2)β/σ > (ε2 − ε1)/σ)
                = Φ(x̃β/σ),   x̃ = x1 − x2

  • Φ(x̃β/σ) is observationally equivalent to Φ(x̃β∗/σ∗) for
β/σ = β∗/σ∗.

SLIDE 60
  • β not separately identified relative to σ, but the ratio is identified:

        Φ(x̃β/σ) = Φ(x̃β∗/σ∗)
        Φ⁻¹ · Φ(x̃β/σ) = Φ⁻¹ · Φ(x̃β∗/σ∗)
        β/σ = β∗/σ∗

  • Set {b : b = β · δ, δ any positive scalar} is identified (say “β is
identified up to scale and sign is identified”).

SLIDE 61

Identification in the MVP model

Pr(j selected|Vu, Σu) = Pr(ui − uj < 0 ∀i ≠ j)

Define the (J−1)×J contrast matrix Δj, each row having a single 1 for
one alternative i ≠ j and a −1 in column j, so that

        Δj ũ = (u1 − uj, . . . , uJ − uj)′   (omitting the jth contrast)

SLIDE 62

Identification in the MVP model

Pr(j selected|Vu, Σu) = Pr(Δj ũ < 0|Vu, Σu) = Φ(0|VZ, ΣZ)

  • Where
    1 VZ is the mean of Δj ũ: Δj Z̃ β̄
    2 ΣZ is the variance of Δj ũ: Δj Z̃ Σβ Z̃′ Δ′j + Δj Ση Δ′j
    3 VZ is (J − 1) × 1
    4 ΣZ is (J − 1) × (J − 1)
  • We reduce the dimension of the integral by one.

SLIDE 63
  • This says that all of the information exists in the contrasts.
  • Can’t identify all the components because we only observe the

contrasts.

  • Now define Δ̃j as Δj with the Jth column removed, and choose J as
the reference alternative with corresponding ΔJ.
  • Then can verify that

        Δj = Δ̃j · ΔJ

SLIDE 64
  • For example, with three goods (one consistent choice of contrast
matrices, with j = 2 and reference alternative J = 3):

        Δ̃2 = [ 1  −1 ]    Δ3 = [ 1  0  −1 ]    Δ2 = [ 1  −1  0 ]
             [ 0  −1 ]         [ 0  1  −1 ]         [ 0  −1  1 ]

        Δ̃2 · Δ3 = Δ2

    (Δ̃2: 3rd column removed; Δ3: reference alternative; Δ2: 3rd column
included)
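The identity can be checked mechanically. A sketch using one consistent row convention (Δj stacks the contrasts ui − uj, i ≠ j; the specific matrices are this note's choice, not from the text):

```python
# Check the contrast-matrix identity Delta_2 = Delta~_2 @ Delta_3 for J = 3.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

D3 = [[1, 0, -1],
      [0, 1, -1]]          # contrasts against reference alternative 3
D2 = [[1, -1, 0],
      [0, -1, 1]]          # contrasts against alternative 2
D2_tilde = [[1, -1],
            [0, -1]]       # D2 with the 3rd column removed

product = matmul(D2_tilde, D3)
```

Multiplying out row by row reproduces D2 exactly.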

SLIDE 65
  • Therefore, we can write

        VZ = Δj Z̃ β̄
        ΣZ = Δj Z̃ Σβ Z̃′ Δ′j + Δ̃j ΔJ Ση Δ′J Δ̃′j

  • where CJ = ΔJ Ση Δ′J is (J − 1) × (J − 1) and has
[(J−1)² − (J−1)]/2 + (J − 1) = J(J−1)/2 parameters total.
  • Since original model can always be expressed in terms of a
model with (β̄, Σβ, CJ), it follows that some of the parameters in
the original model are not identified.

SLIDE 66

How many parameters not identified?

  • Original model:

        K + K(K + 1)/2 + J(J + 1)/2

  • Now:

        K + K(K + 1)/2 + J(J − 1)/2;   [J² + J − (J² − J)]/2 = J not identified

  • Turns out that one additional parameter is not identified.
  • Total: J + 1
  • Note: Evaluation of Φ(0|kVZ, k²ΣZ), k > 0, gives same result
as evaluating Φ(0|VZ, ΣZ), so we can eliminate one more parameter
by suitable choice of k.

SLIDE 67

Illustration

J = 3,   Ση = [ σ11 σ12 σ13 ]
              [ σ21 σ22 σ23 ]
              [ σ31 σ32 σ33 ]

C2 = Δ2 Ση Δ′2 = [ 1 −1  0 ] Ση [ 1 −1  0 ]′
                 [ 0 −1  1 ]    [ 0 −1  1 ]

   = [ σ11 − 2σ21 + σ22          σ13 − σ21 − σ32 + σ22 ]
     [ σ13 − σ21 − σ32 + σ22     σ33 − 2σ32 + σ22      ]

SLIDE 68

Illustration

C2 = Δ̃2 Δ3 Ση Δ′3 Δ̃′2

   = [ 1  −1 ] [ σ11 − 2σ31 + σ33         σ21 − σ31 − σ32 + σ33 ] [  1   0 ]
     [ 0  −1 ] [ σ21 − σ31 − σ32 + σ33    σ22 − 2σ32 + σ33      ] [ −1  −1 ]

SLIDE 69

Normalization Approach of Albright, Lerman, and Manski (1978)

  • Note: Need J + 1 restrictions on VCV matrix.
  • Fix J parameters by setting last row and last column of Ση to 0
  • Fix scale by constraining diagonal elements of Ση so that
trace(Ση)/J equals variance of a standard Weibull. (To compare
estimates with MNL and independent probit)

SLIDE 70

How do we solve the forecasting problem?

  • Suppose that we have 2 goods and add a 3rd

    Pr(1 chosen) = Pr(u1 − u2 ≥ 0) = Pr((Z1 − Z2)β̄ ≥ ω2 − ω1)

  • where

        ω1 = Z1(β − β̄) + η1,   ω2 = Z2(β − β̄) + η2

    = ∫_{−∞}^{a} (1/√2π) e^{−t²/2} dt,
      a = (Z1 − Z2)β̄ / [σ11 + σ22 − 2σ12 + (Z2 − Z1)Σβ(Z2 − Z1)′]^{1/2}

  • Now add a 3rd good

        u3 = Z3 β̄ + Z3(β − β̄) + η3

SLIDE 71
  • Problem: We don’t know correlation of η3 with other errors.
  • Suppose that η3 = 0 (i.e. only preference heterogeneity). Then

    Pr(1 chosen) = ∫_{−∞}^{a} ∫_{−∞}^{b} B.V.N. dt1 dt2

    where

        a = (Z1 − Z2)β̄ / [σ11 + σ22 − 2σ12 + (Z2 − Z1)Σβ(Z2 − Z1)′]^{1/2}

        b = (Z1 − Z3)β̄ / [σ11 + (Z3 − Z1)Σβ(Z3 − Z1)′]^{1/2}

  • We could also solve the forecasting problem if we make an
assumption like η2 = η3.
  • We solve red-bus/blue-bus problem if η2 = η3 = 0 and
Z3 = Z2.

SLIDE 72

    Pr(1 chosen) = Pr(u1 − u2 ≥ 0, u1 − u3 ≥ 0)

  • but u1 − u2 ≥ 0 and u1 − u3 ≥ 0 are the same event.
  • ∴ adding a third choice does not change the probability of choosing 1.

SLIDE 73

Estimation Methods for MNP Models

  • Models tend to be difficult to estimate because of high

dimensional integrals.

  • Integrals need to be evaluated at each stage of estimating the

likelihood.

  • Simulation provides a means of estimating Pij = Pr(i chooses j)

SLIDE 74

Computation and Estimation Link to Appendix

SLIDE 75

Classical Models for Estimating Models with Limited Dependent Variables

References:

  • Amemiya, Ch. 10
  • Different types of sampling (previously discussed)

(a) random sampling
(b) censored sampling
(c) truncated sampling
(d) other non-random (exogenous stratified, choice-based)

SLIDE 76

Standard Tobit Model (Tobin, 1958) “Type I Tobit”

        y∗i = xiβ + ui

  • Observe

        yi = y∗i                              if y∗i ≥ y0
        yi censored (only 1(y∗i ≥ y0) observed)   if y∗i < y0

  • Tobin’s example: expenditure on a durable good only observed if
good is purchased

SLIDE 77

Figure 1: [scatter of expenditure y against individuals; censored
observations marked at the limit]

Note: Censored observations might have bought the good if price had
been lower.

  • Estimator. Assume ui|xi ∼ N(0, σ²u), so

        y∗i|xi ∼ N(xiβ, σ²u)

SLIDE 78

Density of Latent Variables

    g(y∗) = π0 Pr(y∗i < y0) + π1 f(y∗i | y∗i ≥ y0) · Pr(y∗i ≥ y0)

    Pr(y∗i < y0) = Pr(xiβ + ui < y0) = Pr(ui/σu < (y0 − xiβ)/σu)
                 = Φ((y0 − xiβ)/σu)

    f(y∗i | y∗i ≥ y0) = (1/σu) φ((y∗i − xiβ)/σu) / [1 − Φ((y0 − xiβ)/σu)]

why?

    Pr(y∗ = y∗i | y0 ≤ y∗) = Pr(xβ + u = y∗i | y0 ≤ xβ + u)
                           = Pr(u/σu = (y∗i − xβ)/σu | u/σu ≥ (y0 − xβ)/σu)

SLIDE 79
  • Note that likelihood can be written as:

        L = ∏0 Φ((y0 − xiβ)/σu) · ∏1 [1 − Φ((y0 − xiβ)/σu)]
            ← this part you would get with just a simple probit

          × ∏1 { (1/σu) φ((y∗i − xiβ)/σu) / [1 − Φ((y0 − xiβ)/σu)] }
            ← additional information

  • You could estimate β up to scale using only the information on
whether yi ≷ y0, but will get more efficient estimate using
additional information. If you know y0, you can estimate σu.
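The Tobit likelihood above translates directly into code. A minimal sketch (scalar x, toy numbers are illustrative):

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def norm_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def tobit_loglik(beta, sigma, data, y0=0.0):
    """Type I Tobit log likelihood with censoring point y0.
       data: list of (x_i, y_i, observed_i), scalar x for simplicity."""
    ll = 0.0
    for x, y, observed in data:
        mu = x * beta
        if observed:                     # uncensored: density term
            ll += math.log(norm_pdf((y - mu) / sigma) / sigma)
        else:                            # censored: Pr(y* < y0)
            ll += math.log(norm_cdf((y0 - mu) / sigma))
    return ll

# Toy data, illustrative only
data = [(1.0, 1.2, True), (0.5, 0.0, False), (2.0, 2.5, True), (-1.0, 0.0, False)]
ll = tobit_loglik(beta=1.0, sigma=1.0, data=data)
```

Maximizing this function over (β, σ) is Tobit MLE; note the censored observations contribute only the probit-style probability term.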

SLIDE 80

Truncated Version of Type I Tobit

  • Observe yi = y∗i if y∗i > 0; observe nothing for censored
observations
  • Example: only observe wages for workers

        L = ∏1 (1/σu) φ((y∗i − xiβ)/σu) / Φ(xiβ/σu)

        Pr(y∗i > 0) = Pr(xβ + u > 0) = Pr(u/σu > −xβ/σu)
                    = Pr(u/σu < xβ/σu) = Φ(xβ/σu)

SLIDE 81

Different Ways of Estimating Tobit

(a) if censored, could obtain estimates of β σu by simple probit (b) run OLS on observations for which y ∗ i is observed

E (yi|xiβ + ui ≥ 0) = xiβ + σuE ui σu | ui σu > −xβ σu

  • (y0 = 0)
  • where E (yi|xiβ + ui ≥ 0) is the conditional mean for truncated

normal r.v and σuE ui σu | ui σu > −xβ σu

→ λ xiβ σu

  • =

φ

  • −xβ

σu

  • Φ
  • πiβ

σu

  • λ
  • xiβ

σu

  • known as “Mill’s ratio” ; bias due to censoring, can be

viewed as an omitted variables problem

SLIDE 82

Heckman Two-Step Procedure

  • Step 1: estimate β/σu by probit
  • Step 2: form λ̂(xiβ̂/σ̂) and regress

        yi = xiβ + σ λ̂(xiβ/σ) + v + ε

    where

        v = σ [λ(xiβ/σ) − λ̂(xiβ/σ)]
        ε = ui − E(ui | ui > −xiβ)
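The second step can be sketched on simulated data. Here the data-generating parameters are illustrative, and for brevity the step-1 probit is assumed done (the true index β/σ is plugged in where the probit estimate would go):

```python
import math, random

random.seed(4)

def phi(z):  return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
def Phi(z):  return 0.5 * (1 + math.erf(z / math.sqrt(2)))
def mills(z): return phi(z) / Phi(z)        # inverse Mills ratio lambda(z)

# Simulate a censored regression: y* = 1 + 2x + u, observe y* only if y* > 0
beta0, beta1, sigma = 1.0, 2.0, 1.0
data = []
for _ in range(20000):
    x = random.gauss(0, 1)
    ystar = beta0 + beta1 * x + random.gauss(0, sigma)
    if ystar > 0:
        data.append((x, ystar))

# Step 2: OLS of y on [1, x, lambda(x'b/sigma)]; coefficient on lambda estimates sigma
rows = [((1.0, x, mills((beta0 + beta1 * x) / sigma)), y) for x, y in data]

K = 3                                       # solve the 3x3 normal equations
XtX = [[sum(r[0][i] * r[0][j] for r in rows) for j in range(K)] for i in range(K)]
Xty = [sum(r[0][i] * r[1] for r in rows) for i in range(K)]
for i in range(K):                          # Gaussian elimination
    piv = XtX[i][i]
    for j in range(i + 1, K):
        f = XtX[j][i] / piv
        XtX[j] = [a - f * b for a, b in zip(XtX[j], XtX[i])]
        Xty[j] -= f * Xty[i]
coef = [0.0] * K
for i in reversed(range(K)):                # back substitution
    coef[i] = (Xty[i] - sum(XtX[i][j] * coef[j] for j in range(i + 1, K))) / XtX[i][i]
```

The fitted coefficients recover (β0, β1, σ) ≈ (1, 2, 1) up to sampling noise, illustrating how the Mills-ratio regressor absorbs the selection bias.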

SLIDE 83
  • Note: errors (v + ε) will be heteroskedastic
  • need to account for fact that λ is estimated (Durbin problem)
  • Ways of doing this:

(a) Delta method
(b) GMM (Newey, Economics Letters, 1984)
(c) Suppose you run OLS using all the data:

        E(yi) = Pr(y∗i ≤ 0) · 0 + Pr(y∗i > 0) [xiβ + σu E(ui/σu | ui/σu > −xiβ/σ)]
              = Φ(xiβ/σ) [xiβ + σu λ(xiβ/σ)]

    could estimate model by replacing Φ with Φ̂ and λ with λ̂.

  • For both (b) and (c), errors are heteroskedastic, meaning that
you could use weights to improve efficiency.
  • Also need to adjust for estimated regressor.

(d) Estimate model by Tobit maximum likelihood directly.

SLIDE 84

Variations on Standard Tobit Model

        y∗1i = x1iβ1 + u1i
        y∗2i = x2iβ2 + u2i

        y2i = y∗2i   if y∗1i ≥ 0
            = not observed, else

  • Example
  • y2i: student test scores
  • y∗1i: index representing parents’ propensity to enroll students in
school
  • Test scores only observed for proportion enrolled

SLIDE 85

    L = ∏1 [Pr(y∗1i > 0) f(y2i | y∗1i > 0)] · ∏0 [Pr(y∗1i ≤ 0)]

    f(y∗2i | y∗1i ≥ 0) = ∫₀^∞ f(y∗1i, y∗2i) dy∗1i / ∫₀^∞ f(y∗1i) dy∗1i

                       = f(y2i) ∫₀^∞ f(y∗1i | y∗2i) dy∗1i / ∫₀^∞ f(y∗1i) dy∗1i

                       = (1/σ2) φ((y∗2i − x2iβ2)/σ2) · ∫₀^∞ f(y∗1i | y∗2i) dy∗1i / Pr(y∗1i > 0)

    y∗1i ∼ N(x1iβ1, σ²1),   y∗2i ∼ N(x2iβ2, σ²2)

SLIDE 86

    y∗1i | y∗2i ∼ N( x1iβ1 + (σ12/σ²2)(y2i − x2iβ2),  σ²1 − σ²12/σ²2 )

    E(y∗1i | u2i = y∗2i − x2iβ2) = x1iβ1 + E(u1i | u2i = y∗2i − x2iβ2)

SLIDE 87

Estimation by MLE

    L = ∏0 [1 − Φ(x1iβ1/σ1)]
      × ∏1 (1/σ2) φ((y∗2i − x2iβ2)/σ2)
          · [ 1 − Φ( −[x1iβ1 + (σ12/σ²2)(y2i − x2iβ2)] / (σ²1 − σ²12/σ²2)^{1/2} ) ]

SLIDE 88

Estimation by Two-Step Approach

  • Using data on y2i for which y1i > 0:

    E(y2i | y1i > 0) = x2iβ2 + E(u2i | x1iβ1 + u1i > 0)
                     = x2iβ2 + σ2 E(u2i/σ2 | u1i/σ1 > −x1iβ1/σ1)
                     = x2iβ2 + σ2 (σ12/(σ1σ2)) E(u1i/σ1 | u1i/σ1 > −x1iβ1/σ1)
                     = x2iβ2 + (σ12/σ1) λ(x1iβ1/σ1)

SLIDE 89

Example: Female labor supply model

    max u(L, x)   s.t.  x = wH + v,  H = 1 − L

where H: hours worked; v: asset income; w given; Px = 1;
L: time spent at home for child care

    (∂u/∂L)/(∂u/∂x) = w   when L < 1

    reservation wage = MRS|H=0 = wR
SLIDE 90

Example: Female labor supply model

  • We don’t observe wR directly.
  • Model:

        w0 = xβ + u   (wage person would earn if they worked)
        wR = zγ + v

        wi = w0i   if wRi < w0i
           = not observed, else

  • Fits within previous Tobit framework if we set

        y∗1i = xβ − zγ + u − v = w0 − wR,   y2i = wi

  • Note: Gronau does not develop a model to explain hours of
work.

SLIDE 91

Incorporate choice of H

    w0 = x2iβ2 + u2i   given

    MRS = (∂u/∂L)/(∂u/∂x) = γHi + z′iα + vi

(Assume functional form for utility function that yields this)

SLIDE 92

    wr(Hi = 0) = z′iα + vi

    work if w0 = x2iβ2 + u2i > z′iα + vi

    if work, then w0i = MRS
    ⟹ x2iβ2 + u2i = γHi + z′iα + vi
    ⟹ Hi = (x2iβ2 − z′iα + u2i − vi)/γ = x1iβ1 + u1i

    where x1iβ1 = (x2iβ2 − z′iα)γ⁻¹,  u1i = (u2i − vi)γ⁻¹

Heckman Classical Discrete Choice Theory

slide-93
SLIDE 93

Type 3 Tobit Model

y*1i = x1iβ1 + u1i  ←− hours
y*2i = x2iβ2 + u2i  ←− wage

y1i = y*1i if y*1i > 0;  y1i = 0 if y*1i ≤ 0
y2i = y*2i if y*1i > 0;  y2i = 0 if y*1i ≤ 0

slide-94
SLIDE 94

Here

Hi = H*i if H*i > 0;  Hi = 0 if H*i ≤ 0
wi = w0i if H*i > 0;  wi = 0 if H*i ≤ 0

  • Note: Type IV Tobit simply adds

y3i = y*3i if y*1i > 0;  y3i = 0 if y*1i ≤ 0

slide-95
SLIDE 95
  • Can estimate by

(1) maximum likelihood
(2) a two-step method:

E (w0i | Hi > 0) = γHi + z′iα + E (vi | Hi > 0)

slide-96
SLIDE 96

Type V Tobit Model of Heckman (1978)

y*1i = γ1y2i + x1iβ1 + δ1wi + u1i
y2i = γ2y*1i + x2iβ2 + δ2wi + u2i

  • Analysis of the effect of an antidiscrimination law on the average income of African Americans in the ith state.
  • Observe x1i, x2i, y2i and wi:

wi = 1 if y*1i > 0;  wi = 0 if y*1i ≤ 0

  • y2i = average income of African Americans in the state
  • y*1i = unobservable sentiment towards African Americans
  • wi = 1 if the law is in effect

slide-97
SLIDE 97
  • Adoption of the law is endogenous.
  • Require the restriction γ1δ2 + δ1 = 0 so that we can solve for y*1i as a function that does not depend on wi.
  • This class of models is known as “dummy endogenous variable” models. The restriction resolves the coherency problem: without it, the model may have no solution or multiple solutions for (y*1i, wi).

slide-98
SLIDE 98

Relaxing Parametric Assumptions in the Selection Model

References:

  • Heckman (1990), “Varieties of Selection Bias,” AER
  • Heckman (1980), “Addendum to Sample Selection Bias as Specification Error”
  • Heckman and Robb (1985, 1986)

y*1 = xβ + u
y*2 = zγ + v
y1 = y*1 if y*2 > 0

slide-99
SLIDE 99

Relaxing Parametric Assumptions in the Selection Model

E (y*1 | observed) = xβ + E (u | x, zγ + v > 0) + [u − E (u | x, zγ + v > 0)]

where

E (u | x, zγ + v > 0) = [∫_{−∞}^{∞} ∫_{−zγ}^{∞} u f (u, v | x, z) dv du] / [∫_{−∞}^{∞} ∫_{−zγ}^{∞} f (u, v | x, z) dv du]

  • Note:

Pr (y*2 > 0 | z) = Pr (zγ + v > 0 | z) = P (z) = 1 − Fv (−zγ)

slide-100
SLIDE 100

⇒ Fv (−zγ) = 1 − P (z) ⇒ −zγ = Fv⁻¹ (1 − P (z)) if Fv is invertible.

  • Can replace −zγ in the integrals by Fv⁻¹ (1 − P (z)) if, in addition, f (u, v | x, z) = f (u, v | zγ) (index sufficiency).
  • Then

E (y*1 | y*2 > 0) = xβ + g (P (z)) + ε

where g (P (z)) is the bias or “control function.”

  • Semiparametric selection model: approximate the bias function by a Taylor series in P (z), i.e., a truncated power series.

slide-101
SLIDE 101

AIRUM Models


slide-102
SLIDE 102

Notes on McFadden Chapter/Integrating Discrete Continuous (see Heckman, 1974b, 1978, change notation)

  • Notation:
  • i ∈ I: enumeration of discrete alternatives
  • x: divisible goods
  • wi: attributes of discrete choice i
  • r: price vector of x
  • qi: price of good i
  • y: income, with budget y = rx + qi
  • u : X × Ω × I → [0, 1], utility

  • Define the indirect utility function

v (y − qi, r, wi, i; u) = max_x { u (x, wi, i) : rx ≤ y − qi }

  • Maximize out over the continuous goods so we are left with the discrete goods.

slide-103
SLIDE 103

Assumptions

  • We assume v has the usual properties of an indirect utility function:
  • continuous, twice differentiable, homogeneous of degree 0 in (y − q, r), quasiconvex in r, with ∂v/∂(y − q) > 0.
  • Then we get

x (y − q, r, wi, i; u) = − (∂v/∂r) / (∂v/∂y)  (Roy’s Identity)

slide-104
SLIDE 104

Assumptions

  • For the discrete alternatives, we also get something like Roy’s Identity:

δj = D (j | B, s; u) = − (∂v*/∂qj) / (∂v*/∂y)

where

v* (y − qB, r, wB, B; u) = max_{i∈B} v (y − qi, r, wi, i; u)

δj = 1 if j ∈ B and vj ≥ vk ∀k ∈ B;  δj = 0 otherwise.

slide-105
SLIDE 105
  • If the IU assumptions are satisfied, we can write the relationship between the probability of choosing j and the utility function as P (j | B, s) = E_{u|s} D (j | B, s; u).
  • We seek sufficient conditions on preferences u such that we can integrate out over characteristics and come up with probabilities.
  • McFadden shows that v takes the AIRUM form

v (y − qi, r, wi, i; u) = [y − qi − α (r, wi, i; u)] / β (r)

where y > qi + α, and α, β are homogeneous of degree one with respect to r.

slide-106
SLIDE 106
  • Then

v̄ = E_{u|s} max_{i∈B} v (y − qi, r, wi, i; u) = (1/β (r)) [ y + E_{u|s} max_{i∈B} (−qi − α (r, wi, i; u)) ]

  • and

P (j) = E_{u|s} D (j | B, s) = − (∂v̄/∂qj) / (∂v̄/∂y)

  • v̄ is a utility function yielding the PCS (probabilistic choice system).
  • The demand distribution can be analyzed as if it were generated by a population with common tastes, with each representative consumer having fractional consumption rates for the discrete alternatives.

slide-107
SLIDE 107
  • Let

G (qB, r, wB, B, s) = E_{u|s} max_{i∈B} [−qi − α (r, wB, i; u)]  (∗)

the “social surplus function.”

  • Then

P (j | B, s) = −∂G (qB, r, wB, B, s) / ∂qj  (∗∗)

under the conditions given in McFadden’s chapter.

  • I.e., the choice probabilities are given by the gradient of the social surplus function.
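The gradient property (∗∗) can be verified numerically for the familiar extreme-value case, where G takes the "log-sum" form (a fact consistent with the logit results later in the deck; the constant Euler term is dropped since it does not affect derivatives). This is only an illustrative check, not McFadden's general proof:

```python
import numpy as np

def surplus(q, a):
    # Log-sum form of E max_i (-q_i - alpha_i + eps_i) for iid extreme-value eps
    return np.log(np.sum(np.exp(-q - a)))

def choice_prob(q, a):
    # Logit probabilities implied by the same utilities
    v = np.exp(-q - a)
    return v / v.sum()

q = np.array([1.0, 0.5, 2.0])   # prices (illustrative)
a = np.array([0.2, 0.4, 0.1])   # alpha terms (illustrative)
h = 1e-6
for j in range(3):
    dq = np.zeros(3); dq[j] = h
    grad_j = (surplus(q + dq, a) - surplus(q - dq, a)) / (2 * h)
    # P(j) = -dG/dq_j, checked by central differences
    assert abs(-grad_j - choice_prob(q, a)[j]) < 1e-6
```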

slide-108
SLIDE 108

Return to main text


slide-109
SLIDE 109

Appendix


slide-110
SLIDE 110

Variety of Simulation Methods

  • Simulated method of moments
  • Method of simulated scores
  • Simulated maximum likelihood

References:

  • Lerman and Manski (1981), in Structural Analysis of Discrete Data (online at McFadden’s website)
  • McFadden (1989), Econometrica
  • Ruud (1982), Journal of Econometrics
  • Hajivassiliou and McFadden (1990)
  • Hajivassiliou and Ruud (Ch. 40), Handbook of Econometrics
  • Stern (1992), Econometrica
  • Stern (1997), survey in the JEL
  • Bayesian MCMC (Chib et al., on reading list)

slide-111
SLIDE 111

Early Simulation Method: “Crude Frequency Method”

Model: uj = Zjβ + ηj with β fixed, ηj ∼ N (0, Ω), J choices.

Pij = probability i chooses j;  Yij = 1 if i chooses j, 0 else.

ℒ = Π_{i=1}^{N} Π_{j=1}^{J} (Pij)^{Yij}

log ℒ = Σ_{i=1}^{N} Σ_{j=1}^{J} Yij log Pij

slide-112
SLIDE 112

Simulation Algorithm

(i) For given β, Ω, generate R Monte Carlo draws u^r_j, j = 1...J, r = 1...R.

(ii) Let

P̃k = (1/R) Σ_{r=1}^{R} 1 (u^r_k = max {u^r_1, ..., u^r_J})

where P̃k is a “frequency simulator” of Pr (k chosen; β, Ω).

(iii) Maximize Σ_{i=1}^{N} log P̃ik over alternative values of β, Ω.
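Step (ii), the frequency simulator, is just a count of argmax draws. A minimal sketch in Python (function and variable names are illustrative; `chol_omega` is a Cholesky factor of Ω):

```python
import numpy as np

def frequency_simulator(beta, chol_omega, Z, R, rng):
    """Crude frequency simulator of choice probabilities.
    Z: (J, K) alternative characteristics; utilities u_j = Z_j beta + eta_j,
    eta ~ N(0, Omega) with Omega = chol_omega @ chol_omega.T."""
    mean = Z @ beta                                           # (J,)
    eta = rng.standard_normal((R, Z.shape[0])) @ chol_omega.T  # (R, J) draws
    choices = np.argmax(mean + eta, axis=1)                   # winner per draw
    return np.bincount(choices, minlength=Z.shape[0]) / R     # relative frequencies
```

For two alternatives with Ω = I the simulated P(1) should approach Φ((Z1β − Z2β)/√2), which gives a quick sanity check.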

slide-113
SLIDE 113
  • Note: Lerman and Manski found that this procedure performs poorly and requires a large number of draws, particularly when P is close to 0 or 1.

var ( (1/R) Σ_{r=1}^{R} 1 (·) ) = (1/R²) Σ_{r=1}^{R} var (1 (·)) = var (1 (·)) / R,  with var (1 (·)) evaluated at the true values.

  • McFadden (1989) provided some key insights into how to improve the simulation method. He showed that simulation is viable even for a small number of draws provided that:

(a) an unbiased simulator is used;
(b) the functions to be simulated appear linearly in the conditions defining the estimator;
(c) the same set of random draws is used to simulate the model at different parameter values.

  • Note: Condition (b) is violated for the crude frequency method, which has log P̃ik.

slide-114
SLIDE 114

Simulated Method of Moments (McFadden, Econometrica, 1989)

uij = Zijβi = Zijβ̄ + Zijεi,  βi = β̄ + εi  (earlier model with only preference heterogeneity)

Pij (γ) = P (i chooses j | wi, γ)  (wi are regressors)

  • Define Yij = 1 if i chooses j, Yij = 0 otherwise.

log ℒ = (1/N0) Σ_{i=1}^{N} Σ_{j=1}^{J} Yij ln Pij (γ),  N0 = NJ

∂ log ℒ/∂γ = (1/N0) Σ_{i=1}^{N} Σ_{j=1}^{J} Yij [∂Pij/∂γ] / Pij (γ) = 0  (3)

slide-115
SLIDE 115

Simulated Method of Moments (McFadden, Econometrica, 1989)

√N0 (γ̂_MLE − γ0) ∼ N (0, I_f⁻¹)

Î_f = (1/N0) Σ_{i=1}^{N} [ Σ_{j=1}^{J} Yij (∂Pij/∂γ) / Pij (γ) ] [ Σ_{j=1}^{J} Yij (∂Pij/∂γ) / Pij (γ) ]′  (outer product of the score vector)

slide-116
SLIDE 116
  • Now use the fact that Σ_{j=1}^{J} Pij (γ) = 1:

Σ_{j=1}^{J} ∂Pij/∂γ = 0 ⇒ Σ_{j=1}^{J} [(∂Pij/∂γ) / Pij] Pij = 0

  • Rewrite (3) as

(1/N0) Σ_{i=1}^{N} Σ_{j=1}^{J} (Yij − Pij) (∂Pij/∂γ) / Pij = 0

  • Note: E (Yij) = Pij.

slide-117
SLIDE 117
  • Letting εij = Yij − Pij and Zij = (∂Pij/∂γ) / Pij, we have

(1/N0) Σ_{i=1}^{N} Σ_{j=1}^{J} εij Zij = 0,

like a moment condition using Zij as the instrument; but so far Pij is still a (J − 1)-dimensional integral.

slide-118
SLIDE 118

Simulation Algorithm

  • Model

uij = Zijβ̄ + Zijεi,  J choices, K characteristics
uij : 1 × 1;  Zij : 1 × K;  β̄ : K × 1;  εi : K × 1

  • Rewrite as

ũi = Z̃iβ̄ + Z̃iΓẽi

where ΓΓ′ = Σε (Cholesky decomposition), ẽi ∼ N (0, I_K), εi = Γẽi
ũi : J × 1;  Z̃i : J × K;  Γ : K × K;  ẽi : K × 1

slide-119
SLIDE 119
  • Step (i). Generate draws ẽ^r_i for each i, iid across persons and distributed N (0, I_K). In total, generate N (sample size) · K (vector length) · R (number of Monte Carlo draws) values.
  • Step (ii). Fix the matrix Γ and obtain

ηij = ZijΓẽi,  where Zij : 1 × K; Γ : K × K; ẽi : K × 1.

  • Form the vector (Zi1Γẽi, Zi2Γẽi, ..., ZiJΓẽi)′ for each person.

slide-120
SLIDE 120
  • Step (iii). Fix β̄ and generate ũij = Zijβ̄ + ηij ∀i.
  • Step (iv). Find the relative frequency with which the ith person chooses alternative j across the Monte Carlo draws:

P̃ij (γ) = (1/R) Σ_{r=1}^{R} 1 (ũ^r_ij > ũ^r_im ∀m ≠ j)

  • where P̃ij (γ) is the “simulator” for Pij (γ). Stack to get P̃i (γ).

slide-121
SLIDE 121
  • Step (v). To get P̃i (γ) for different values of γ, repeat steps (ii) through (iv) using the same r.v.’s ẽi generated in step (i).
  • Step (vi). Define

wij = (∂Pij (γ)/∂γ) / Pij

  • A simulator w̃ij for wij (stacked as w̃i) can be obtained by a numerical derivative,

∂Pij (γ)/∂γm ≈ [P̃ij (γ + h lm) − P̃ij (γ − h lm)] / (2h)

where m indexes the elements of γ and lm is a vector with 1 in the mth place.

slide-122
SLIDE 122

Solve Moment Condition

  • Apply the Gauss-Newton method and iterate to convergence:

γ1 = γ0 + [ (1/N) Σ_{i=1}^{N} wi (γ0) w̃i (γ0)′ ]⁻¹ · (1/N) Σ_{i=1}^{N} wi (γ0) {yi − P̃i (γ0)}

slide-123
SLIDE 123

Solve Moment Condition

Digression on Gauss-Newton

  • Suppose the problem is

S = min_β (1/N) Σ_{i=1}^{N} [yi − fi (β)]²  (nonlinear least squares)

  • Taylor expand around an initial guess β̂1:

fi (β) = fi (β̂1) + (∂fi/∂β)|_{β̂1} (β − β̂1) + ...  (higher-order terms ignored)

slide-124
SLIDE 124
  • Substitution gives

min_β (1/N) Σ_{i=1}^{N} [ yi − fi (β̂1) − (∂fi/∂β)|_{β̂1} (β − β̂1) ]²

  • Solve for β̂2 to get

β̂2 = β̂1 + [ Σ_{i=1}^{N} (∂fi/∂β)|_{β̂1} (∂fi/∂β)′|_{β̂1} ]⁻¹ Σ_{i=1}^{N} (∂fi/∂β)|_{β̂1} [yi − fi (β̂1)]

  • Repeat until convergence (a problem arises if the matrix is singular).
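The Gauss-Newton digression can be sketched as a few lines of Python. This is a generic implementation of the iteration above for nonlinear least squares (all names are illustrative):

```python
import numpy as np

def gauss_newton(f, jac, y, beta0, tol=1e-10, max_iter=100):
    """Gauss-Newton for min_beta sum_i (y_i - f_i(beta))^2.
    f(beta) -> (N,) fitted values; jac(beta) -> (N, K) matrix of df_i/dbeta."""
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        r = y - f(beta)                              # residuals
        J = jac(beta)
        step = np.linalg.solve(J.T @ J, J.T @ r)     # (J'J)^{-1} J'r
        beta = beta + step
        if np.linalg.norm(step) < tol:
            break
    return beta
```

For example, fitting y = exp(b·x) to exactly generated data recovers b to machine precision in a handful of iterations.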

slide-125
SLIDE 125

Disadvantages of Simulation Methods

(1) P̃ij is not smooth, due to the indicator function (this causes difficulties in deriving the asymptotic distribution; need the methods developed by Pakes and Pollard (1989) for nondifferentiable functions). Smoothed SMM methods were developed by Stern, Hajivassiliou, and Ruud.

(2) P̃ij cannot be 0 (causes problems in the denominator when Pij is close to 0).

(3) Simulating small Pij may require a large number of draws.

slide-126
SLIDE 126

Disadvantages of Simulation Methods

  • Refinement: “smoothed simulated method of moments” replaces the indicator with a smooth function (Stern (1992), Econometrica); e.g., instead of the indicator, use

P̃ij (γ) = (1/R) Σ_{r=1}^{R} Φ (ũ^r_ij − ũ^r_im)

(written here for the binary comparison of j with alternative m).

slide-127
SLIDE 127

How does simulation affect the asymptotic distribution?

  • Without simulation, get

√N (γ̂_mme − γ0) ∼ N ( 0, [ plim_{N→∞} (1/N) Σ_i wi (yi − Pi) (yi − Pi)′ wi′ ]⁻¹ )

  • With simulation, the variance is slightly higher due to simulation error:

√N (γ̂_msm − γ0) ∼ N ( 0, plim_{N→∞} C⁻¹ {1 + 1/R} )

where R is the number of simulation draws.

slide-128
SLIDE 128

How does simulation affect the asymptotic distribution?

  • where

C = plim_{N→∞} (1/N) Σ_{i=1}^{N} wi (yi − Pi) (yi − Pi)′ wi′

  • As R → ∞,

√N (γ̂_msm − γ0) ∼ N (0, C⁻¹).

  • Note: The method does not require that the number of draws go to infinity.
slide-129
SLIDE 129

Choice-Based Sampling (See Heckman in New Palgrave)

  • References:
  • Chs. 1-2 of the Manski and McFadden volume
  • Manski and Lerman (1977, Econometrica)
  • Amemiya
  • Examples:
  • 1. Suppose we gather data on transportation mode choice at the train station, the subway station, and car checkpoints (toll booths, etc.)

slide-130
SLIDE 130
  • We observe characteristics of populations conditioned on the choice that they made (this type of sampling commonly arises).
  • 2. Evaluating the effects of a social program: have data on participants and non-participants; usually participants are oversampled relative to their frequency in the population.
  • Distinguish between exogenous stratification and endogenous stratification, the latter of which is choice-based. (But a special type of endogenous stratification.)
  • Oversampling in high-population areas (as is commonly done to reduce sampling costs or to increase representation of some groups) could be exogenous stratification (depending on the phenomenon being studied).

slide-131
SLIDE 131

Notation:

  • Let Pi = P (i | Z) in a random sample, and P*i its analog in a choice-based sample (CBS).
  • Under CBS, sampling is assumed to be random within the i partitions of the data: P (Z | i) = P* (Z | i), but P (Z) ≠ P* (Z).
  • Suppose that we want to recover P (i | Z) from choice-based data.
  • We observe

P* (i | Z) (assume Z are discrete conditioning cells), P* (Z), P* (i)

  • Frequency weights: P (i) / P* (i).

slide-132
SLIDE 132

By Bayes’ Rule

P (A | B) = P (A, B) / P (B) = P (B | A) · P (A) / P (B)

P* (i | Z) = P* (Z | i) · P* (i) / P* (Z)

P (i | Z) = P (Z | i) · P (i) / P (Z)

slide-133
SLIDE 133

By Bayes’ Rule

P (i | Z) = [P* (i | Z) · P* (Z) / P* (i)] · P (i) / P (Z)

P (Z) = Σ_j P (Z | j) P (j),  with P (Z | j) = P* (Z | j) = P* (j | Z) · P* (Z) / P* (j)

⇒ P (i | Z) = [P* (i | Z) P* (Z) P (i) / P* (i)] / Σ_j [P* (j | Z) P* (Z) P (j) / P* (j)]

= P* (i | Z) [P (i) / P* (i)] / Σ_j P* (j | Z) [P (j) / P* (j)]
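The reweighting formula can be checked on a small constructed example: build a population joint distribution over (i, Z), form a choice-based sample that oversamples one choice while preserving P(Z | i), and verify that the weights P(i)/P*(i) recover the population conditional probabilities. All numbers below are illustrative:

```python
import numpy as np

# Population joint distribution over (choice i, cell Z): rows = i, cols = Z
P_iZ = np.array([[0.10, 0.30],    # choice 0
                 [0.24, 0.36]])   # choice 1
P_i = P_iZ.sum(axis=1)            # marginal choice probabilities P(i)
P_i_given_Z = P_iZ / P_iZ.sum(axis=0)

# Choice-based sample: choice 0 oversampled to 50%, but P(Z|i) preserved
Pstar_i = np.array([0.5, 0.5])
Pstar_iZ = (P_iZ / P_i[:, None]) * Pstar_i[:, None]   # P*(i,Z) = P(Z|i) P*(i)
Pstar_i_given_Z = Pstar_iZ / Pstar_iZ.sum(axis=0)

# Reweight: P(i|Z) = P*(i|Z) w_i / sum_j P*(j|Z) w_j, with w_i = P(i)/P*(i)
w = P_i / Pstar_i
recovered = Pstar_i_given_Z * w[:, None]
recovered /= recovered.sum(axis=0)

assert np.allclose(recovered, P_i_given_Z)
```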

slide-134
SLIDE 134
  • To recover P (i | Z) from choice-based sampled data, you need to know P (j), P* (j) ∀j. P* (j) can be estimated from the sample, but P (j) requires outside information. Need the weights P (i) / P* (i).
  • Note: The problem set asks you to consider how CBS biases the coefficients and intercept in a logit model. (Can show that the bias is only in the constant.)

slide-135
SLIDE 135

Application and Extension: Berry, Levinsohn, and Pakes (1995)

  • Develop an equilibrium model and estimation techniques for analyzing demand and supply in differentiated product markets.
  • Use it to study the automobile industry.
  • The goal is to estimate the parameters of both the demand and cost functions, incorporating own- and cross-price elasticities and elasticities with respect to product attributes (car horsepower, MPG, air conditioning, size, ...), using only aggregate product-level data supplemented with data on the distribution of consumer characteristics (e.g., the income distribution from the CPS).
  • Want to allow for flexible substitution patterns.

slide-136
SLIDE 136

Key assumptions

(i) assumptions on the joint distribution of observed and unobserved product and consumer characteristics;

(ii) price-taking by consumers; Nash equilibrium assumptions on producers in an oligopolistic, differentiated products market.

slide-137
SLIDE 137

Notation

  • ζ: individual characteristics
  • x (observed), ξ (unobserved), p (price): product characteristics
  • uij = u (ζi, pj, xj, ξj; θ): utility if person i chooses j (Cobb-Douglas assumption here)
  • j = 0, 1, ..., J, where 0 = not purchasing any (the outside good)

slide-138
SLIDE 138

Notation

  • Define

Aj = {ζ : u (ζ, pj, xj, ξj; θ) ≥ u (ζ, pr, xr, ξr; θ), r = 0, ..., J},

  • the set of ζ that induces choice of good j. This is defined over individual characteristics, which may be observed or unobserved.

slide-139
SLIDE 139

Market Share

sj (p, x, ξ; θ) = ∫_{ζ∈Aj} f (ζ) dζ  (s is the vector of market shares)

  • Special functional form:

uij = u (ζi, pj, xj, ξj; θ) = xjβ − αpj + ξj + εij = δj + εij

slide-140
SLIDE 140

Market Share

  • δj = xjβ − αpj + ξj = mean utility from good j
  • ξj is the mean across consumers of the unobserved component of utility
  • εij are the only elements representing consumer characteristics
  • Special case:

ξj = 0 (no unobserved characteristic);  εij iid over i, j, independent of xj

  • Then the share is

sj = ∫_{−∞}^{∞} Π_{q≠j} F (δj − δq + ε) f (ε) dε

  • A one-dimensional integral; it has a closed-form solution under the extreme value distribution.
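The closed-form claim can be checked by simulation: with iid type-1 extreme value εij, the share integral reduces to the logit formula sj = e^{δj}/Σ_q e^{δq} (no outside good in this small example; all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
delta = np.array([0.0, 1.0, -0.5])    # mean utilities of three goods
R = 400000
# iid type-1 extreme value (Gumbel) draws for every good and simulated consumer
eps = rng.gumbel(size=(R, 3))
mc_shares = np.bincount(np.argmax(delta + eps, axis=1), minlength=3) / R

closed = np.exp(delta) / np.exp(delta).sum()   # closed-form logit shares
assert np.max(np.abs(mc_shares - closed)) < 0.005
```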

slide-141
SLIDE 141

Why is assumption that utility is additively separable and iid in consumer and product characteristics highly restrictive?

(a) It implies that all substitution effects depend only on the δs (since there is a unique vector of market shares associated with each δ vector). Therefore, conditional on market shares, substitution patterns don’t depend on the characteristics of the product. Example: if a Mercedes and a Yugo have the same market share, then they must have the same δs and the same cross derivative with respect to any third car (e.g., a BMW).

∂si/∂pk = − ∫ [ Π_{q≠i,k} F (δi − δq + ε) ] F′ (δi − δk + ε) (∂δk/∂pk) f (ε) dε  (the same if the δs are the same)

slide-142
SLIDE 142

Why is assumption that utility is additively separable and iid in consumer and product characteristics highly restrictive?

(b) Two products with the same market share have the same own-price derivatives (not good, because you expect product markups to depend on more than market share).

(c) It also assumes that individuals value product characteristics in the same way (no preference heterogeneity).

slide-143
SLIDE 143

Alternative Model (Random Coefficients Versions)

uij = xjβ̄ − αpj + ξj + Σ_k σk xjk νik + εij,  βik = β̄k + σk νik,  with νik mean zero and unit variance

  • Could still assume εij has the iid extreme value distribution.

slide-144
SLIDE 144

Model Actually Used

  • Impose an alternative functional form assumption because they want to incorporate prior information on the distribution of relevant consumer characteristics and on interactions between consumer and product characteristics:

uij = (yi − pj)^α G (xj, ξj, νi) e^{ε(i,j)}

  • Assume G is log linear:

ũij = log uij = α log (yi − pj) + xjβ̄ + ξj + Σ_k σk xjk νik + εij

slide-145
SLIDE 145

Model Actually Used

  • Outside (no purchase) good utility:

ũi0 = α log yi + ξ0 + σ0 νi0 + εi0

  • Note: Prices are likely to be correlated with unobserved product attributes, ξ, which leads to an endogeneity problem. (ξ may represent things like style, prestige, reputation, etc.) Quantity demanded: qj = M sj (x, ξ, p; θ), where M is market size and sj the share.
  • ξ enters nonlinearly, so we need to use some transformation to be able to apply instrumental variables (the principle of the replacement function).

slide-146
SLIDE 146

Cost Side

  • Multiproduct firms f = 1, . . . , F. Each produces a subset τf of the J possible products. The cost of producing a good is assumed to be independent of output levels and log linear in a vector of cost characteristics (Wj, ωj):

ln mcj = Wjγ + ωj

⇒ Πf = Σ_{j∈τf} (pj − mcj) M sj (p, x, ξ; θ)

  • Nash assumption: the firm chooses the prices that maximize profit, taking as given the attributes of its products and the prices and attributes of its competitors’ products. pj satisfies

sj (p, x, ξ; θ) + Σ_{r∈τf} (pr − mcr) ∂sr (p, x, ξ; θ)/∂pj = 0

slide-147
SLIDE 147

Cost Side

  • Define

Δjr = −∂sr/∂pj if r and j are produced by the same firm, 0 otherwise

⇒ s (p, x, ξ; θ) − Δ (p, x, ξ; θ) [p − mc] = 0
⇒ p = mc + Δ (p, x, ξ; θ)⁻¹ s (p, x, ξ; θ)  (markup equation)

slide-148
SLIDE 148
  • The markup term b (p, x, ξ; θ) ≡ Δ (p, x, ξ; θ)⁻¹ s (p, x, ξ; θ) depends only on the parameters of the demand system and the equilibrium price vector.
  • Since p is a function of ω, b (p, x, ξ; θ) is also a function of ω (the unobserved cost determinants).
  • Let ln mcj = Wjγ + ωj. Then

p = exp {Wγ + ω} + b (p, x, ξ; θ)
ln (p − b (p, x, ξ; θ)) = Wγ + ω  (pricing equation)

slide-149
SLIDE 149

Estimation

  • Need instruments for both the demand and pricing equations, i.e., variables correlated with (p, q) but uncorrelated with ξ and ω. Let Z = (X, W) (p, q not included in Z).
  • Assume

E (ξj | Z) = E (ωj | Z) = 0
E ((ξj, ωj)′ (ξj, ωj) | Z) = Ω (Zj)  (finite for almost every Zj)

  • Note that demand for any product is a function of the characteristics of all products, so we don’t have any exclusion restrictions.

slide-150
SLIDE 150

Data

  • J vectors (xj, Wj, pj, qj)
  • n: number of households sampled
  • s^n: vector of sampled market shares
  • Assume that a true θ0 generated the population data according to the model.
  • s^n converges to s0 (multinomial sampling)
  • √n (s^n − s0) = Op (1)

slide-151
SLIDE 151
  • Assume we could calculate

{ξj (θ, s, P), ωj (θ, s, P)}_{j=1}^{J}

for alternative values of θ.

  • They show that any choice of

(a) an observed vector of market shares, s
(b) a distribution of consumer characteristics, P
(c) a parameter vector of the model, θ

  • implies a unique sequence of values for the two unobserved characteristics, ξj (θ, s, P) and ωj (θ, s, P).

slide-152
SLIDE 152
  • Then any function of Z must be uncorrelated with the vector

{ξj (θ, s0, P0), ωj (θ, s0, P0)}_{j=1}^{J} when θ = θ0

  • Can use GMM.
  • Note: The conditional moment restriction implies an infinite number of unconditional restrictions:

min_θ ‖ (1/J) Σ_j Hj (Z) [ξj (θ, s0, P0), ωj (θ, s0, P0)]′ ‖

slide-153
SLIDE 153

Computation in the Logit Case

  • Logit: εij is extreme value (“Weibull” in older terminology).

δj = xjβ − αpj + ξj
uij = xjβ − αpj + ξj + εij

sj (p, x, ξ) = e^{δj} / (1 + Σ_{q=1}^{J} e^{δq})

δj = ln sj − ln s0, j = 1, . . . , J
ξj = ln sj − ln s0 − xjβ + αpj

  • See the paper for more details.
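The logit inversion δj = ln sj − ln s0 is simple enough to verify in code: compute shares from a δ vector (outside good normalized to 0) and invert them back. Names are illustrative:

```python
import numpy as np

def logit_shares(delta):
    # shares with an outside good whose mean utility is normalized to 0
    e = np.exp(delta)
    return e / (1.0 + e.sum())

def invert_shares(s):
    # delta_j = ln s_j - ln s_0, where s_0 is the outside-good share
    s0 = 1.0 - s.sum()
    return np.log(s) - np.log(s0)

delta = np.array([0.5, -1.0, 2.0])
assert np.allclose(invert_shares(logit_shares(delta)), delta)
```

Given data on (xj, pj, sj), one would then recover ξj = δj − xjβ + αpj for candidate (β, α), which is what makes IV/GMM estimation feasible.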

slide-154
SLIDE 154

Generalized Method of Moments (GMM)

References:

  • Hansen (1982), Econometrica
  • Hansen and Singleton (1982, 1988)
  • Also known as “minimum distance estimators”
  • Suppose that we have some data {xt}, t = 1...T, and we want to test hypotheses about E (xt) = µ.
  • How do we proceed? By a CLT,

(1/√T) Σ_{t=1}^{T} (xt − µ) ∼ N (0, V0)

V0 = E [(xt − µ) (xt − µ)′] if xt iid

V0 = lim_{T→∞} E [ ((1/√T) Σ_t (xt − µ)) ((1/√T) Σ_t (xt − µ))′ ]  (general case)

slide-155
SLIDE 155
  • We can decompose V0 = QDQ′, where

QQ′ = I, Q⁻¹ = Q′, D = diagonal matrix of eigenvalues
V0 = QD^{1/2}D^{1/2}Q′,  Q′V0Q = D^{1/2}D^{1/2},  D^{−1/2}Q′V0QD^{−1/2} = I

  • Under the null H0,

[(1/√T) Σ_t (xt − µ0)]′ V0⁻¹ [(1/√T) Σ_t (xt − µ0)]
= [(1/√T) Σ_t (xt − µ0)]′ QD^{−1/2}D^{−1/2}Q′ [(1/√T) Σ_t (xt − µ0)]
∼ χ² (n)
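The χ²(n) claim can be checked by Monte Carlo: compute the quadratic form with an estimated V over many replications and compare the rejection rate against the χ²(n) critical value. A small sketch (sample sizes and seeds are arbitrary):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)
n, T = 3, 500
mu0 = np.zeros(n)
reps = 2000
stats = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=(T, n))              # iid data with true mean mu0, V0 = I
    g = np.sqrt(T) * (x.mean(axis=0) - mu0)  # (1/sqrt(T)) sum (x_t - mu0)
    V = np.cov(x, rowvar=False)              # sample estimate of V0
    stats[r] = g @ np.linalg.solve(V, g)     # quadratic form
# Under H0 the statistic is approximately chi-square(n): 5% test rejects ~5%
reject = np.mean(stats > chi2.ppf(0.95, n))
```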

slide-156
SLIDE 156
  • where n is the number of moment conditions.
  • How does the test statistic behave under the alternative (µ ≠ µ0)? It should get large.

slide-157
SLIDE 157
  • Write

[(1/√T) Σ_t (xt − µ0)]′ V⁻¹ [(1/√T) Σ_t (xt − µ0)]  (4)

= [(1/√T) Σ_t (xt − µ)]′ V⁻¹ [(1/√T) Σ_t (xt − µ)]
+ 2 [(1/√T) Σ_t (µ − µ0)]′ V⁻¹ [(1/√T) Σ_t (xt − µ)]
+ [(1/√T) Σ_t (µ − µ0)]′ V⁻¹ [(1/√T) Σ_t (µ − µ0)]  (5)

  • The last term is O (T);

λ = T (µ − µ0)′ V⁻¹ (µ − µ0) is the noncentrality parameter.

slide-158
SLIDE 158
  • Problems:

(i) V0 is not known a priori.

  • Estimate VT → V0.
  • In the iid setting, use the sample covariance matrix.
  • In the general setting, approximate the limit by finite T.

(ii) µ is not known.

  • Suppose we want to test µ = ϕ (β), with ϕ specified and β unknown.

slide-159
SLIDE 159
  • Can estimate by min-χ² estimation:

min_{β∈B} [(1/√T) Σ_t (xt − ϕ (β))]′ VT⁻¹ [(1/√T) Σ_t (xt − ϕ (β))] ∼ χ² (n − k)

k = dimension of β,  n = number of moments

  • Note: For each of the k dimensions searched over, you lose one degree of freedom (will show next).

slide-160
SLIDE 160
  • Find the distribution theory for β̂. Q: This is an M-estimator, so how do you proceed?
  • FOC:

√T (∂ϕ/∂β)′|_{β̂T} VT⁻¹ (1/√T) Σ_t (xt − ϕ (β̂T)) = 0

  • Taylor expand ϕ (β̂T) around β0:

ϕ (β̂T) = ϕ (β0) + (∂ϕ/∂β)|_{β*} (β̂T − β0),  β* between β0 and β̂T

  • Get

√T (∂ϕ/∂β)′|_{β̂T} VT⁻¹ (1/√T) Σ_t [ xt − ϕ (β0) − (∂ϕ/∂β)|_{β*} (β̂T − β0) ] = 0

slide-161
SLIDE 161
  • Rearrange to solve for √T (β̂T − β0):

[ (∂ϕ/∂β)′|_{β̂T} VT⁻¹ (∂ϕ/∂β)|_{β*} ] √T (β̂T − β0) = (∂ϕ/∂β)′|_{β̂T} VT⁻¹ (1/√T) Σ_t (xt − ϕ (β0))

  • If

(∂ϕ/∂β)|_{β̂T} → D0 (convergence of a random function),  VT → V0,  (∂ϕ/∂β)|_{β*} → D0

slide-162
SLIDE 162
  • Apply the CLT:

(1/√T) Σ_t (xt − ϕ (β0)) ∼ N (0, V0)

  • Then

√T (β̂T − β0) ∼ N ( 0, (D0′ V0⁻¹ D0)⁻¹ )

slide-163
SLIDE 163
  • Why is the limiting distribution χ² (n − k)?
  • Write

(1/√T) Σ_t [xt − ϕ (β̂T)] = (1/√T) Σ_t (xt − µ0) + (1/√T) Σ_t [µ0 − ϕ (β̂T)]

  • Recall, we had

ϕ (β̂T) = ϕ (β0) + (∂ϕ/∂β)|_{β*} (β̂T − β0)  (6)

and

√T (β̂T − β0) = [ (∂ϕ/∂β)′|_{β̂T} VT⁻¹ (∂ϕ/∂β)|_{β*} ]⁻¹ (∂ϕ/∂β)′|_{β̂T} VT⁻¹ (1/√T) Σ_t (xt − µ0)  (7)

(the expression for √T (β̂T − β0) derived earlier).

slide-164
SLIDE 164
  • Note that, by (6), the second term is a linear combination of the first, so

(1/√T) Σ_t [xt − ϕ (β̂T)] = B0 (1/√T) Σ_t (xt − µ0)

  • where

B0 = I − (∂ϕ/∂β)|_{β0} [ (∂ϕ/∂β)′|_{β0} V0⁻¹ (∂ϕ/∂β)|_{β0} ]⁻¹ (∂ϕ/∂β)′|_{β0} V0⁻¹

  • Note that

(∂ϕ/∂β)′|_{β0} V0⁻¹ B0 = 0

slide-165
SLIDE 165
  • This tells us that certain linear combinations of B0 give a degenerate distribution (along k dimensions).
  • This needs to be taken into account when testing.
  • Recall that we had V0 = QDQ′, QQ′ = I, V0⁻¹ = QD^{−1/2}D^{−1/2}Q′. Then

D^{−1/2}Q′ (1/√T) Σ_t [xt − ϕ (β̂T)] = D^{−1/2}Q′ B0 (1/√T) Σ_t (xt − µ0)

  • where, letting A = D^{−1/2}Q′ (∂ϕ/∂β)|_{β0},

D^{−1/2}Q′ B0 = [ I − A (A′A)⁻¹ A′ ] D^{−1/2}Q′  (the idempotent matrix M_A times D^{−1/2}Q′)

slide-166
SLIDE 166
  • Thus

D^{−1/2}Q′ (1/√T) Σ_t [xt − ϕ (β̂T)] = M_A · D^{−1/2}Q′ · (1/√T) Σ_t (xt − µ0)

  • The matrix M_A accounts for the fact that we performed the minimization over β.

slide-167
SLIDE 167

How is distribution theory affected?

  • We have a quadratic form in normal r.v.’s with an idempotent matrix.
  • E.g., ε̂′ε̂ = ε′ Mx ε, with Mx = I − x (x′x)⁻¹ x′.
  • Key facts:

(i) Theorem: Let Y ∼ N (θ, σ²In) and let P be a symmetric matrix of rank r. Then Q = (Y − θ)′ P (Y − θ) / σ² ∼ χ²_r iff P² = P (i.e., P idempotent). (See Seber, p. 37.)

(ii) If Qi ∼ χ²_{ri}, i = 1, 2, with r1 > r2, and Q = Q1 − Q2 is independent of Q2, then Q ∼ χ²_{r1−r2}.

  • Apply these results to

ε̂′ε̂ / σ² = ε′ Mx ε / σ² ∼ χ² (rank Mx)

slide-168
SLIDE 168
  • The rank of an idempotent matrix is equal to its trace, and

tr (A) = Σ_{i=1}^{n} λi,  λi eigenvalues  (8)

  • For an idempotent matrix, the eigenvalues are all 0 or 1.
  • (Note: rank = number of non-zero eigenvalues; for an idempotent matrix these are all 0 or 1, so rank = trace.)

slide-169
SLIDE 169

rank (I − x (x′x)⁻¹ x′) = rank (I) − rank (x (x′x)⁻¹ x′)

where rank (I) = n and

rank (x (x′x)⁻¹ x′) = trace (x (x′x)⁻¹ x′) = trace (x′x (x′x)⁻¹),  since tr (AB) = tr (BA)
= trace (Ik) = k

⇒ rank (I − x (x′x)⁻¹ x′) = n − k
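All of these properties of the residual-maker matrix (idempotency, trace = rank = n − k, eigenvalues 0 or 1) are easy to confirm numerically on a random design matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 10, 4
x = rng.normal(size=(n, k))
M = np.eye(n) - x @ np.linalg.solve(x.T @ x, x.T)   # I - x(x'x)^{-1}x'

assert np.allclose(M @ M, M)                        # idempotent
assert np.isclose(np.trace(M), n - k)               # trace = rank = n - k
assert np.linalg.matrix_rank(M) == n - k
eig = np.linalg.eigvalsh(M)
assert np.all((np.abs(eig) < 1e-8) | (np.abs(eig - 1) < 1e-8))  # eigenvalues 0 or 1
```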

slide-170
SLIDE 170
  • Thus, by the same reasoning, the limiting distribution of

[(1/√T) Σ_t (xt − ϕ (β̂T))]′ VT⁻¹ [(1/√T) Σ_t (xt − ϕ (β̂T))] ∼ χ² (n − k),  where n − k = rank (M_A)

  • We preserve the χ² form but lose degrees of freedom in estimating β.
  • In the case where n = k (the just-identified case), we can estimate β but have no degrees of freedom left to perform the test.
  • Would GMM provide a method for estimating β if we used a weighting matrix other than VT⁻¹?
  • Why not replace V0⁻¹ by W0?

slide-171
SLIDE 171

min_{β∈B} [(1/√T) Σ_t (xt − ϕ (β))]′ W0 [(1/√T) Σ_t (xt − ϕ (β))]

  • Could choose W0 = I (avoids the need to estimate a weighting matrix).
  • Result: The asymptotic covariance is altered and the asymptotic distribution of the criterion is different, but (1/√T) Σ_t (xt − ϕ (β̂T)) will still be normal.

slide-172
SLIDE 172
  • What is the advantage of focusing on minimum χ² estimation?
  • Choosing W0 = V0⁻¹ gives the smallest covariance matrix: the most efficient estimator of β and the most powerful test of the restrictions.
  • Show this: show that

(D0′ W0 D0)⁻¹ (D0′ W0 V0 W0 D0) (D0′ W0 D0)⁻¹ − (D0′ V0⁻¹ D0)⁻¹ is p.s.d.,

  • where (D0′ W0 D0)⁻¹ (D0′ W0 V0 W0 D0) (D0′ W0 D0)⁻¹ is the covariance matrix of √T (β̂T − β0) when a general weighting matrix is used.
  • Equivalent to showing that

D0′ V0⁻¹ D0 − (D0′ W0 D0) (D0′ W0 V0 W0 D0)⁻¹ (D0′ W0 D0) is p.s.d.

  • Show that it can be written as a quadratic form: take any vector α.

slide-173
SLIDE 173

α′ [ D0′ V0⁻¹ D0 − (D0′ W0 D0) (D0′ W0 V0 W0 D0)⁻¹ (D0′ W0 D0) ] α

= α′ D0′ V0^{−1/2} [ I − V0^{1/2} W0 D0 (D0′ W0 V0 W0 D0)⁻¹ D0′ W0 V0^{1/2} ] V0^{−1/2} D0 α

= α′ D0′ V0^{−1/2} [ I − Ṽ (Ṽ′Ṽ)⁻¹ Ṽ′ ] V0^{−1/2} D0 α ≥ 0,  where Ṽ = V0^{1/2} W0 D0

(= 0 if W0 = V0⁻¹)

  • Therefore W0 = V0⁻¹ is the optimal choice for the weighting matrix.
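The p.s.d. comparison can be checked numerically on random positive definite matrices: the sandwich covariance with an arbitrary weight W dominates the efficient one, with equality at W = V⁻¹. All matrices below are randomly generated for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 5, 2
D = rng.normal(size=(n, k))
A = rng.normal(size=(n, n)); V = A @ A.T + n * np.eye(n)   # p.d. V0
B = rng.normal(size=(n, n)); W = B @ B.T + n * np.eye(n)   # arbitrary p.d. weight

def sandwich(W):
    # (D'WD)^{-1} (D'WVWD) (D'WD)^{-1}
    G = D.T @ W @ D
    return np.linalg.solve(G, D.T @ W @ V @ W @ D) @ np.linalg.inv(G)

eff = np.linalg.inv(D.T @ np.linalg.solve(V, D))   # (D'V^{-1}D)^{-1}
diff = sandwich(W) - eff
# difference is positive semidefinite ...
assert np.all(np.linalg.eigvalsh((diff + diff.T) / 2) > -1e-10)
# ... and exactly zero at the optimal weight W = V^{-1}
assert np.allclose(sandwich(np.linalg.inv(V)), eff)
```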

slide-174
SLIDE 174

Many standard estimators can be interpreted as GMM estimators

  • Some examples:

(1) OLS:

yt = xtβ + ut,  E (ut xt) = 0 ⇒ E ((yt − xtβ) xt) = 0

min_{β∈B} [(1/√T) Σ_t (yt − xtβ) xt]′ VT⁻¹ [(1/√T) Σ_t (yt − xtβ) xt]

  • where VT is the estimator for E (xt ut ut′ xt′); in the iid case this equals σ² E (xt xt′) if homoskedastic.

slide-175
SLIDE 175

(2) Instrumental Variables

y_t = x_tβ + u_t,  E(u_t x_t) ≠ 0,  E(u_t z_t) = 0,  E(x_t z_t) ≠ 0

β̂_T = arg min_{β∈B}  [(1/√T) Σ_t (y_t − x_tβ) z_t]′ V_T⁻¹ [(1/√T) Σ_t (y_t − x_tβ) z_t]

where V_T = Ê(z_t u_t u_t′ z_t′) in the iid case
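In the just-identified scalar case the moment condition Z′(y − Xβ)/T = 0 is solved exactly by β̂_IV = (Z′X)⁻¹Z′y. A simulated sketch (illustrative only) showing IV is consistent where OLS is not:

```python
import numpy as np

# Just-identified IV as GMM: x is endogenous (correlated with u through v),
# z is a valid instrument (correlated with x, uncorrelated with u).
rng = np.random.default_rng(3)
T = 50_000
z = rng.normal(size=T)            # instrument
v = rng.normal(size=T)
u = rng.normal(size=T) + v        # error correlated with x via v
x = z + v                         # endogenous regressor
beta0 = 1.5
y = beta0 * x + u

b_iv = (z @ y) / (z @ x)          # (Z'X)^{-1} Z'y, scalar case: consistent
b_ols = (x @ y) / (x @ x)         # inconsistent: plim is beta0 + 0.5 here
```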

Heckman Classical Discrete Choice Theory

slide-176
SLIDE 176
  • Estimator for

lim_{T→∞} E [(1/√T) Σ_t z_t u_t] [(1/√T) Σ_s z_s u_s]′  (time series case)

  • Suppose

E(u_t u_t′ | z_t) = σ² I

  • then

V₀ = σ² E(z_t z_t′)

  • Can verify that 2SLS and GMM give the same estimator

Heckman Classical Discrete Choice Theory

slide-177
SLIDE 177

β̂_2SLS = [x′z (z′z)⁻¹ z′x]⁻¹ x′z (z′z)⁻¹ z′y

  • Note: in the first stage, regress x on z:

x̂ = z (z′z)⁻¹ z′x,  ŷ = z (z′z)⁻¹ z′y

var(β̂_2SLS) = [E(x_i z_i′) E(z_i z_i′)⁻¹ E(z_i x_i′)]⁻¹ E(x_i z_i′) E(z_i z_i′)⁻¹ E(z_i u_i u_i′ z_i′)
  · E(z_i z_i′)⁻¹ E(x_i z_i′)′ [E(x_i z_i′) E(z_i z_i′)⁻¹ E(z_i x_i′)]⁻¹

  • Under GMM:

Heckman Classical Discrete Choice Theory

slide-178
SLIDE 178

var(β̂_GMM) = (D₀′V₀⁻¹D₀)⁻¹  (when W₀ = V₀⁻¹)

D₀ = ∂ϕ/∂β |_{β₀} = plim (1/n) Σ x_i z_i′ = E(x_i z_i′)

W₀ = (σ²)⁻¹ E(z_i z_i′)⁻¹
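The 2SLS/GMM equivalence under homoskedasticity can be seen directly: with weight W = (Z′Z)⁻¹ (proportional to the optimal (σ²E(zz′))⁻¹), the GMM formula reproduces the first-stage-fitted-values regression exactly. A simulated, overidentified sketch (not from the slides):

```python
import numpy as np

# Two instruments for one endogenous regressor; homoskedastic errors.
rng = np.random.default_rng(4)
T = 2_000
Z = rng.normal(size=(T, 2))
v = rng.normal(size=T)
x = Z @ np.array([1.0, 0.5]) + v
u = rng.normal(size=T) + v               # endogeneity through v
y = 1.5 * x + u
X = x[:, None]

# GMM with W = (Z'Z)^{-1}: the 2SLS formula from the slide.
W = np.linalg.inv(Z.T @ Z)
A = X.T @ Z @ W @ Z.T @ X
b_2sls = np.linalg.solve(A, X.T @ Z @ W @ Z.T @ y)

# Same thing via first-stage fitted values x_hat = Z(Z'Z)^{-1}Z'x.
x_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ x)
b_first_stage = (x_hat @ y) / (x_hat @ x_hat)
```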

Heckman Classical Discrete Choice Theory

slide-179
SLIDE 179

In the presence of heteroskedasticity, the weighting matrix would be different (and 2SLS and GMM are not the same):

W₀ = E(z′uu′z)⁻¹ = E(z′ E(uu′ | z) z)⁻¹ = E(z′Vz)⁻¹

  • with panel data one could have a block-diagonal

E(uu′ | z) = diag(V₁, V₂, …, V_T) = V

  • allowing for correlation over time for a given individual, but iid across individuals.

Heckman Classical Discrete Choice Theory

slide-180
SLIDE 180

Nonlinear least squares

y_t = ϕ(x_t; β) + u_t,  E(u_t ϕ(x_t; β)) = 0

min_{β∈B}  [(1/√T) Σ_t (y_t − ϕ(x_t; β)) ϕ(x_t; β)]′ V_T⁻¹ [(1/√T) Σ_t (y_t − ϕ(x_t; β)) ϕ(x_t; β)]

  • General Method of Moments

min_{β∈B}  [Σ_t f_t(β)]′ V⁻¹ [Σ_t f_t(β)]
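A sketch of the nonlinear case (simulated data; the functional form ϕ(x; β) = exp(βx) and the grid-search minimizer are illustrative choices, not from the slides):

```python
import numpy as np

# NLS as GMM with scalar moment f_t(b) = (y_t - phi(x_t; b)) * phi(x_t; b),
# phi(x; b) = exp(b*x), minimising the quadratic criterion over a grid.
rng = np.random.default_rng(5)
T = 5_000
x = rng.normal(size=T)
beta0 = 0.5
y = np.exp(beta0 * x) + rng.normal(0, 0.1, T)

def criterion(b):
    f = (y - np.exp(b * x)) * np.exp(b * x)   # moment, with V = 1 (scalar)
    g = f.sum() / np.sqrt(T)
    return g * g

grid = np.linspace(0.0, 1.0, 1001)
b_hat = grid[np.argmin([criterion(b) for b in grid])]
```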

Heckman Classical Discrete Choice Theory

slide-181
SLIDE 181

Nonlinear least squares

  • where

(1/√T) Σ_t f_t(β₀) → N(0, V₀),  (1/T) Σ_t f_t(β) → E f_t(β)

  • In general, f_t is a random function.

Heckman Classical Discrete Choice Theory

slide-182
SLIDE 182
  • Suppose we want to estimate β (5 × 1) and we have 6 potential instruments. Can we test the validity of the instruments? What if we have 5 instruments?
  • If we assume E(ε_i | x_i) = 0 instead of E(ε_i x_i) = 0 (i.e., conditional instead of unconditional), then we have an infinite number of moment conditions:

E(ε_i f(x_i)) = E(E(ε_i | x_i) f(x_i)) = 0  for any f(x_i)

  • How to optimally choose which moment conditions to use is a current area of research. How might you use GMM to check whether a variable is normally distributed?
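With 6 instruments for 5 parameters, the one overidentifying restriction is testable: the minimized efficient-GMM criterion (Hansen's J statistic) is asymptotically χ² with (moments − parameters) degrees of freedom when all instruments are valid. A simulated scalar sketch, 2 instruments for 1 parameter (illustrative only):

```python
import numpy as np

# Two-step GMM with 2 valid instruments for 1 parameter; J ~ chi2(1).
rng = np.random.default_rng(6)
T = 10_000
Z = rng.normal(size=(T, 2))
x = Z @ np.array([1.0, 1.0]) + rng.normal(size=T)
u = rng.normal(size=T)                    # both instruments valid here
y = 2.0 * x + u

# First step: 2SLS-type estimate, then efficient weight from residuals.
W1 = np.linalg.inv(Z.T @ Z)
b1 = (x @ Z @ W1 @ Z.T @ y) / (x @ Z @ W1 @ Z.T @ x)
e = y - b1 * x
V = (Z * e[:, None]).T @ (Z * e[:, None]) / T   # estimate of E(z u u' z')
W = np.linalg.inv(V)
b2 = (x @ Z @ W @ Z.T @ y) / (x @ Z @ W @ Z.T @ x)

# Minimised criterion = Hansen's J; compare to chi2(1) critical values.
g = Z.T @ (y - b2 * x) / np.sqrt(T)       # scaled sample moment
J = g @ W @ g
```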

Heckman Classical Discrete Choice Theory

slide-183
SLIDE 183

Return to main text

Heckman Classical Discrete Choice Theory