 
“A Course in Applied Econometrics”
Lecture 8: Discrete Choice Models

Guido Imbens
IRP Lectures, UW Madison, August 2008

Outline

1. Introduction
2. Multinomial and Conditional Logit Models
3. Independence of Irrelevant Alternatives
4. Models without IIA
5. Berry-Levinsohn-Pakes
6. Models with Multiple Unobserved Choice Characteristics
7. Hedonic Models

1. Introduction

Various versions of multinomial logit models were developed by McFadden in the 1970s.

In industrial organization (IO) applications with a substantial number of choices, the IIA property has been found to be particularly unattractive because of its unrealistic implications for substitution patterns.

The random effects approach is a more appealing generalization than either nested logit or unrestricted multinomial probit.

The generalization by Berry-Levinsohn-Pakes (BLP) allows for endogenous choice characteristics and unobserved choice characteristics, using only aggregate choice data.

2. Multinomial and Conditional Logit Models

Models for discrete choice with more than two choices.

The choice $Y_i$ takes on non-negative, unordered integer values between zero and $J$.

Examples are travel modes (bus/train/car), employment status (employed/unemployed/out of the labor force), and car choices (SUV, sedan, pickup truck, convertible, minivan).

We wish to model the distribution of $Y_i$ in terms of two kinds of covariates: individual-specific, choice-invariant covariates $Z_i$ (e.g., age), and choice-specific (and possibly individual-specific) covariates $X_{ij}$.
2.A Multinomial Logit

Individual-specific covariates only.

$$\Pr(Y_i = j \mid Z_i = z) = \frac{\exp(z'\gamma_j)}{1 + \sum_{l=1}^{J} \exp(z'\gamma_l)},$$

for choices $j = 1, \ldots, J$, and for the first choice:

$$\Pr(Y_i = 0 \mid Z_i = z) = \frac{1}{1 + \sum_{l=1}^{J} \exp(z'\gamma_l)}.$$

The $\gamma_l$ here are choice-specific parameters. This multinomial logit model leads to a very well-behaved likelihood function, and it is easy to estimate using standard optimization techniques.

2.B Conditional Logit

Suppose all covariates vary by choice (and possibly also by individual). The conditional logit model specifies:

$$\Pr(Y_i = j \mid X_{i0}, \ldots, X_{iJ}) = \frac{\exp(X_{ij}'\beta)}{\sum_{l=0}^{J} \exp(X_{il}'\beta)},$$

for $j = 0, \ldots, J$. Now the parameter vector $\beta$ is common to all choices, and the covariates are choice-specific.

Also easy to estimate.

The multinomial logit model can be viewed as a special case of the conditional logit model. Suppose we have a vector of individual characteristics $Z_i$ of dimension $K$, and $J$ vectors of coefficients $\gamma_j$, each of dimension $K$. Then define the $JK$-dimensional vectors

$$X_{i1} = \begin{pmatrix} Z_i \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots, \quad X_{iJ} = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ Z_i \end{pmatrix}, \quad \text{and} \quad X_{i0} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix},$$

and define the parameter vector as $\beta = (\gamma_1', \ldots, \gamma_J')'$. Then

$$\Pr(Y_i = j \mid Z_i) = \frac{\exp(Z_i'\gamma_j)}{1 + \sum_{k=1}^{J} \exp(Z_i'\gamma_k)} = \frac{\exp(X_{ij}'\beta)}{\sum_{k=0}^{J} \exp(X_{ik}'\beta)} = \Pr(Y_i = j \mid X_{i0}, \ldots, X_{iJ}).$$

2.D Link with Utility Maximization

Utility, for individual $i$, associated with choice $j$, is

$$U_{ij} = X_{ij}'\beta + \varepsilon_{ij}. \tag{1}$$

Individual $i$ chooses option $j$ if choice $j$ provides the highest level of utility:

$$Y_i = j \quad \text{if} \quad U_{ij} \ge U_{il} \ \text{for all} \ l = 0, \ldots, J.$$

Now suppose that the $\varepsilon_{ij}$ are independent across choices and individuals and have type I extreme value distributions:

$$F(\varepsilon) = \exp(-\exp(-\varepsilon)), \qquad f(\varepsilon) = \exp(-\varepsilon) \cdot \exp(-\exp(-\varepsilon)).$$

(This distribution has a unique mode at zero, a mean of about 0.58, a second moment of about 1.98, and a variance of about 1.65.)

Then the choice $Y_i$ follows the conditional logit model.
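The stacking argument above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names and the values of `z` and `gammas` are hypothetical and chosen only for illustration. It computes the multinomial logit probabilities directly, then rebuilds them as a conditional logit with block-stacked covariates.

```python
import math

def conditional_logit_probs(X, beta):
    """Pr(Y = j | X_0..X_J) = exp(X_j'beta) / sum_l exp(X_l'beta)."""
    index = [sum(x * b for x, b in zip(x_j, beta)) for x_j in X]
    m = max(index)  # subtract the max before exponentiating, for stability
    weights = [math.exp(v - m) for v in index]
    total = sum(weights)
    return [w / total for w in weights]

def multinomial_logit_probs(z, gammas):
    """Choice 0 is the base choice, with its index normalized to zero."""
    index = [0.0] + [sum(a * g for a, g in zip(z, gamma)) for gamma in gammas]
    weights = [math.exp(v) for v in index]
    total = sum(weights)  # equals 1 + sum_{l>=1} exp(z'gamma_l)
    return [w / total for w in weights]

def stack_as_conditional(z, gammas):
    """Build X_0..X_J and beta = (gamma_1', ..., gamma_J')' as in the text."""
    K, J = len(z), len(gammas)
    X = [[0.0] * (J * K)]  # X_0 is the zero vector
    for j in range(J):
        x = [0.0] * (J * K)
        x[j * K:(j + 1) * K] = z  # Z_i sits in the j-th block
        X.append(x)
    beta = [coef for gamma in gammas for coef in gamma]
    return X, beta

# Hypothetical values: K = 2 characteristics, J = 2 non-base choices.
z = [1.0, 0.5]
gammas = [[0.2, -0.1], [-0.3, 0.4]]

p_mnl = multinomial_logit_probs(z, gammas)
X, beta = stack_as_conditional(z, gammas)
p_cl = conditional_logit_probs(X, beta)
```

Under these assumptions `p_mnl` and `p_cl` agree up to floating-point error, which is the equality of the two probability expressions in the text.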
[Figure: densities of the extreme value distribution (solid) and the normal distribution (dashed).]

3. Independence of Irrelevant Alternatives

The main problem with the conditional logit is the property of Independence of Irrelevant Alternatives (IIA).

The conditional probability of choosing $j$ given either $j$ or $l$:

$$\Pr(Y_i = j \mid Y_i \in \{j, l\}) = \frac{\Pr(Y_i = j)}{\Pr(Y_i = j) + \Pr(Y_i = l)} = \frac{\exp(X_{ij}'\beta)}{\exp(X_{ij}'\beta) + \exp(X_{il}'\beta)}.$$

This probability does not depend on the characteristics $X_{im}$ of any other alternative $m \ne j, l$.

IIA also has unattractive implications for marginal probabilities when new choices are introduced.

Although multinomial and conditional logit models may fit well, they are not necessarily attractive as behavioral/structural models, because they generate unrealistic substitution patterns.

Suppose that individuals have the choice of three restaurants, Chez Panisse (C), Lalime's (L), and the Bongo Burger (B). Suppose we have two characteristics, price and quality:

price: $P_C = 95$, $P_L = 80$, $P_B = 5$,
quality: $Q_C = 10$, $Q_L = 9$, $Q_B = 2$,
market share: $S_C = 0.10$, $S_L = 0.25$, $S_B = 0.65$.

These numbers are roughly consistent with a conditional logit model where the utility associated with individual $i$ and restaurant $j$ is

$$U_{ij} = -0.2 \cdot P_j + 2 \cdot Q_j + \varepsilon_{ij}.$$

Now suppose that we raise the price at Lalime's to 1000 (or raise it to infinity, corresponding to taking it out of business).

The conditional logit model predicts that the market share for Lalime's gets divided between Chez Panisse and the Bongo Burger in proportion to their original market shares, and thus $\tilde{S}_C = 0.13$ and $\tilde{S}_B = 0.87$: most of the individuals who would have gone to Lalime's will now dine (if that is the right term) at the Bongo Burger.

That seems implausible. The people who were planning to go to Lalime's would appear to be more likely to go to Chez Panisse if Lalime's is closed than to go to the Bongo Burger, implying $\tilde{S}_C \approx 0.35$ and $\tilde{S}_B \approx 0.65$.
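The restaurant example can be reproduced with a few lines of arithmetic. This is a minimal sketch using the prices, qualities, and utility coefficients stated above; the function name `logit_shares` and the single-letter keys are illustrative choices, not from the slides.

```python
import math

def logit_shares(utilities):
    """Conditional logit shares: exp(u_j) / sum_l exp(u_l)."""
    m = max(utilities.values())  # subtract the max for numerical stability
    weights = {k: math.exp(u - m) for k, u in utilities.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# C = Chez Panisse, L = Lalime's, B = Bongo Burger (values from the text).
price = {"C": 95, "L": 80, "B": 5}
quality = {"C": 10, "L": 9, "B": 2}

# Systematic utility -0.2 * P_j + 2 * Q_j, as in the text.
utility = {k: -0.2 * price[k] + 2 * quality[k] for k in price}

shares = logit_shares(utility)

# Closing Lalime's: drop it from the choice set.
shares_closed = logit_shares({k: u for k, u in utility.items() if k != "L"})

# IIA: the ratio of C to B is the same before and after the closure.
ratio_before = shares["C"] / shares["B"]
ratio_after = shares_closed["C"] / shares_closed["B"]
```

The model-implied shares come out at roughly 0.09, 0.24, and 0.67, close to the stated market shares. After the closure the model gives Chez Panisse roughly 0.12 and the Bongo Burger roughly 0.88. The 0.13 and 0.87 quoted above come from rescaling the observed shares instead (0.10/0.75 and 0.65/0.75). Either way the C-to-B ratio is unchanged, which is the IIA property.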
4. Models without IIA

Recall the latent utility setup with the utility

$$U_{ij} = X_{ij}'\beta + \varepsilon_{ij}. \tag{2}$$

In the conditional logit model we assume independent extreme value $\varepsilon_{ij}$. The independence is essentially what creates the IIA property. (This is not completely correct: with other distributions for the unobserved components, say normal errors, we would not get IIA exactly, but something pretty close to it.)

The solution is to allow in some fashion for correlation between the unobserved components in the latent utility representation. In particular, with a choice set that contains multiple versions of similar choices (like Chez Panisse and Lalime's), we should allow the latent utilities for these choices to be similar.

Here we discuss three ways of avoiding the IIA property. All can be interpreted as relaxing the independence between the $\varepsilon_{ij}$.

The first is the nested logit model, where the researcher groups together sets of choices. This allows for non-zero correlation between unobserved components of choices within a nest and maintains zero correlation across nests.

Second, the unrestricted multinomial probit model, with no restrictions on the covariance between unobserved components beyond normalizations.

Third, the mixed or random coefficients logit, where the marginal utilities associated with choice characteristics vary between individuals, generating positive correlation between the unobserved components of choices that are similar in observed choice characteristics.

Nested Logit Models

Partition the set of choices $\{0, 1, \ldots, J\}$ into $S$ sets $B_1, \ldots, B_S$.

Now let the conditional probability of choice $j$, given that your choice is in the set $B_s$, be equal to

$$\Pr(Y_i = j \mid X_i, Y_i \in B_s) = \frac{\exp(\rho_s^{-1} X_{ij}'\beta)}{\sum_{l \in B_s} \exp(\rho_s^{-1} X_{il}'\beta)},$$

for $j \in B_s$, and zero otherwise. In addition suppose the marginal probability of a choice in the set $B_s$ is

$$\Pr(Y_i \in B_s \mid X_i) = \frac{\left(\sum_{l \in B_s} \exp(\rho_s^{-1} X_{il}'\beta)\right)^{\rho_s}}{\sum_{t=1}^{S} \left(\sum_{l \in B_t} \exp(\rho_t^{-1} X_{il}'\beta)\right)^{\rho_t}}.$$

If we fix $\rho_s = 1$ for all $s$, then

$$\Pr(Y_i = j \mid X_i) = \frac{\exp(X_{ij}'\beta)}{\sum_{t=1}^{S} \sum_{l \in B_t} \exp(X_{il}'\beta)},$$

and we are back in the conditional logit model.

The implied joint distribution function of the $\varepsilon_{ij}$ is

$$F(\varepsilon_{i0}, \ldots, \varepsilon_{iJ}) = \exp\left(-\sum_{s=1}^{S} \left(\sum_{j \in B_s} \exp\left(-\rho_s^{-1} \varepsilon_{ij}\right)\right)^{\rho_s}\right).$$

Within the sets the correlation coefficient for the $\varepsilon_{ij}$ is approximately equal to $1 - \rho_s$. Between the sets the $\varepsilon_{ij}$ are independent.

The nested logit model could capture the restaurant example by having two nests, the first $B_1 = \{\text{Chez Panisse}, \text{Lalime's}\}$, and the second one $B_2 = \{\text{Bongo Burger}\}$.
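The two nested logit formulas combine into choice probabilities by multiplying the within-nest probability by the nest probability. The sketch below applies this to the restaurant example. The function name, the dictionary layout, and in particular the value $\rho_1 = 0.5$ are assumptions made for illustration; the slides do not report an estimated $\rho$.

```python
import math

def nested_logit_probs(index, nests, rho):
    """Pr(Y = j) = Pr(Y in B_s) * Pr(Y = j | Y in B_s) for j in nest B_s.

    index: dict mapping choice -> systematic utility X_j'beta
    nests: list of lists of choices (a partition of the choice set)
    rho:   list with one rho_s per nest
    """
    # Inner sums: sum over l in B_s of exp(X_l'beta / rho_s).
    inner = [sum(math.exp(index[j] / r) for j in nest)
             for nest, r in zip(nests, rho)]
    denom = sum(s ** r for s, r in zip(inner, rho))
    probs = {}
    for nest, r, s in zip(nests, rho, inner):
        p_nest = s ** r / denom  # marginal probability of the nest
        for j in nest:
            probs[j] = p_nest * math.exp(index[j] / r) / s
    return probs

# Systematic utilities implied by -0.2 * P_j + 2 * Q_j in the restaurant example.
index = {"C": 1.0, "L": 2.0, "B": 3.0}
nests = [["C", "L"], ["B"]]

# rho_s = 1 for all s recovers the conditional logit.
p_logit = nested_logit_probs(index, nests, [1.0, 1.0])

# Hypothetical rho_1 = 0.5: positive correlation within the {C, L} nest.
p_nested = nested_logit_probs(index, nests, [0.5, 1.0])

# Closing Lalime's under each model.
index_closed = {"C": 1.0, "B": 3.0}
p_logit_closed = nested_logit_probs(index_closed, [["C"], ["B"]], [1.0, 1.0])
p_nested_closed = nested_logit_probs(index_closed, [["C"], ["B"]], [0.5, 1.0])

# Fraction of Lalime's former share that moves to Chez Panisse.
diversion_logit = (p_logit_closed["C"] - p_logit["C"]) / p_logit["L"]
diversion_nested = (p_nested_closed["C"] - p_nested["C"]) / p_nested["L"]
```

With this assumed $\rho_1$, a larger fraction of Lalime's customers moves to Chez Panisse than under the conditional logit, so the ratio of Chez Panisse to Bongo Burger shares is no longer invariant to the closure. How large the shift is depends entirely on the value of $\rho_1$.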