
MECT Microeconometrics Blundell Lecture 2 Censored Data Models



  1. MECT Microeconometrics, Blundell Lecture 2: Censored Data Models. Richard Blundell (http://www.ucl.ac.uk/~uctp39a/), University College London, February-March 2016. Course: ECONG107.

  2. Censored Data Models
     - Censored and truncated data. Examples:
       - earnings
       - hours of work (mroz.dta is a 'typical' data set to play with)
       - top coding of wealth
       - expenditure on cars (this was James Tobin's original example, which became known as Tobin's Probit model or the Tobit model)
     - Typical definitions:
       - Censored data include the censoring points.
       - Truncated data exclude the censoring points.
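The two definitions can be made concrete with a toy sample. This is only an illustrative sketch: the `latent` values are made up, and the censoring point is assumed to be zero.

```python
# Hypothetical latent outcomes (e.g. desired hours of work); values are illustrative only.
latent = [-1.2, 0.4, 2.5, -0.3, 1.1]

# Censored sample: every unit is kept, but values at or below 0 are recorded as 0.
censored = [max(0.0, v) for v in latent]

# Truncated sample: units at or below the censoring point are dropped entirely.
truncated = [v for v in latent if v > 0.0]

print(censored)   # [0.0, 0.4, 2.5, 0.0, 1.1]
print(truncated)  # [0.4, 2.5, 1.1]
```

The censored sample has the same number of observations as the latent one, whereas the truncated sample is shorter.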

  3. - A mixture of discrete and continuous processes. In general we should model the process of censoring or truncation as a separate discrete mechanism, i.e. the 'selectivity' model.
     - To begin with, we have a model in which the two processes are generated from the same underlying continuous latent variable model, e.g. corner solution models in economics:
       $$y_i^* = x_i'\beta + u_i$$
       with
       $$y_i = \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ 0 & \text{otherwise} \end{cases}$$
       or
       $$y_i = \begin{cases} y_i^* & \text{if } u_i > -x_i'\beta \\ 0 & \text{otherwise} \end{cases}$$
     - Sometimes we also define $D_i$:
       $$D_i = \begin{cases} 1 & \text{if } y_i^* > 0 \\ 0 & \text{otherwise} \end{cases}$$

  4. The general specification for the censored regression model is
     $$y_i^* = x_i'\beta + u_i, \qquad y_i = \max\{0, y_i^*\},$$
     where $y^*$ is the unobservable underlying process (similar to what was used with discrete choice models) and $y$ is the observed data.
     - When $u$ is normally distributed, $u \mid x \sim N(0, \sigma^2)$, the model is the Tobit model.
     - Note that
       $$P(y > 0 \mid x) = P(u > -x'\beta \mid x) = \Phi\left(\frac{x'\beta}{\sigma}\right).$$
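The probability of a positive observation can be checked by simulation. The sketch below is only illustrative: the values of `xb` (standing in for $x'\beta$) and `sigma`, the sample size, and the seed are all assumptions, and only the Python standard library is used.

```python
import random
from statistics import NormalDist


def prob_positive(xb, sigma):
    """Analytic P(y > 0 | x) = Phi(x'beta / sigma) in the Tobit model."""
    return NormalDist().cdf(xb / sigma)


def simulate_share_positive(xb, sigma, n=200_000, seed=0):
    """Share of positive y = max(0, y*) when y* = xb + u and u ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    positives = 0
    for _ in range(n):
        y_star = xb + rng.gauss(0.0, sigma)
        y = max(0.0, y_star)
        positives += y > 0.0
    return positives / n


xb, sigma = 0.5, 2.0  # illustrative values, not taken from the lecture
analytic = prob_positive(xb, sigma)
simulated = simulate_share_positive(xb, sigma)
print(round(analytic, 4), round(simulated, 4))
```

The two numbers should agree up to simulation noise.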

  5. - Consider the moments of the truncated normal.
     - Assume $w \sim N(0, \sigma^2)$. Then $w \mid w > c$, where $c$ is an arbitrary constant, is a truncated normal.
     - The density function for the truncated normal is
       $$f(w \mid w > c) = \frac{f(w)}{1 - F(c)} = \frac{\frac{1}{\sigma}\,\phi\left(\frac{w}{\sigma}\right)}{1 - \Phi\left(\frac{c}{\sigma}\right)}, \qquad w > c,$$
       where $f$ is the density function of $w$ and $F$ is the cumulative distribution function of $w$.

  6. - We can now write
       $$E(w \mid w > c) = \int_c^\infty w\, f(w \mid w > c)\, dw = \sigma\, \frac{\phi\left(\frac{c}{\sigma}\right)}{1 - \Phi\left(\frac{c}{\sigma}\right)}.$$
       Applying this result to the regression model with $c = -x'\beta$, and using $\phi(-z) = \phi(z)$ and $1 - \Phi(-z) = \Phi(z)$, yields
       $$E(y \mid x, y > 0) = x'\beta + E(u \mid u > -x'\beta) = x'\beta + \sigma\, \frac{\phi\left(\frac{x'\beta}{\sigma}\right)}{\Phi\left(\frac{x'\beta}{\sigma}\right)}.$$
     - Note that $\phi(w)/\Phi(w)$ is the Inverse Mills Ratio, usually written $\lambda(w)$.
     - Also note that, contrary to the discrete choice models, the variance of the residual plays a central role here: it determines the size of the partial effects.
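The closed form for the truncated mean can be verified numerically. The following is a sketch under assumed values ($c = -0.5$, $\sigma = 2$); the function names are hypothetical, and the integral is approximated with a simple midpoint rule rather than a library integrator.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.pdf is phi, STD.cdf is Phi


def truncated_mean(c, sigma):
    """Closed form E(w | w > c) for w ~ N(0, sigma^2)."""
    z = c / sigma
    return sigma * STD.pdf(z) / (1.0 - STD.cdf(z))


def inverse_mills(z):
    """Inverse Mills ratio lambda(z) = phi(z) / Phi(z)."""
    return STD.pdf(z) / STD.cdf(z)


def truncated_mean_numeric(c, sigma, upper_sd=10.0, steps=100_000):
    """Midpoint-rule integral of w * f(w | w > c) over [c, upper_sd * sigma]."""
    dist = NormalDist(0.0, sigma)
    upper = upper_sd * sigma
    h = (upper - c) / steps
    total = 0.0
    for k in range(steps):
        w = c + (k + 0.5) * h
        total += w * dist.pdf(w)
    return total * h / (1.0 - dist.cdf(c))


c, sigma = -0.5, 2.0  # illustrative: c = -x'beta with x'beta = 0.5
closed = truncated_mean(c, sigma)
numeric = truncated_mean_numeric(c, sigma)
via_mills = sigma * inverse_mills(-c / sigma)  # sigma * lambda(x'beta / sigma)
print(closed, numeric, via_mills)
```

All three numbers should coincide up to integration error, which illustrates that $E(u \mid u > -x'\beta) = \sigma\lambda(x'\beta/\sigma)$.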

  7. OLS Bias
     - Truncated data: suppose one uses only the positive observations to estimate the model and the unobservables are normally distributed. Then, as we have seen,
       $$E(y \mid x, y > 0) = x'\beta + \sigma\lambda\left(\frac{x'\beta}{\sigma}\right),$$
       where the last term is $E(u \mid x, u > -x'\beta)$, which is generally non-zero.
     - A model of the form
       $$y = x'\beta + \sigma\lambda + v$$
       would have $E(v \mid x, y > 0) = 0$.
     - This implies the inconsistency of OLS of $y$ on $x$ alone: it is an omitted variable problem. The omitted term $\sigma\lambda(x'\beta/\sigma)$ is a function of $x$, so the resulting error term is correlated with $x$.

  8. Censored data:
     - Now suppose we use all observations, both positive and zero.
     - Under normality of the residual, we obtain
       $$E(y \mid x) = \Phi\left(\frac{x'\beta}{\sigma}\right) x'\beta + \sigma\,\phi\left(\frac{x'\beta}{\sigma}\right).$$
     - Thus, once again, the OLS estimates will be biased and inconsistent.
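A small simulation can illustrate the OLS bias with truncated and with censored data. This is a sketch only: the data-generating values (`beta0`, `beta1`, `sigma`), the single normal regressor, the sample size, and the seed are assumptions chosen for illustration, and `ols` is a hypothetical helper for a simple regression with an intercept.

```python
import random


def ols(xs, ys):
    """Intercept and slope from a simple least-squares regression of y on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


rng = random.Random(1)
beta0, beta1, sigma = 0.0, 1.0, 1.0  # assumed "true" parameters
xs = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
y_star = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
y = [max(0.0, v) for v in y_star]  # censored at zero

# OLS on the (normally unobserved) latent variable recovers beta1.
_, slope_latent = ols(xs, y_star)

# OLS on all observations, zeros included (censored data).
_, slope_censored = ols(xs, y)

# OLS on the positive observations only (truncated data).
pos = [(x, v) for x, v in zip(xs, y) if v > 0.0]
_, slope_truncated = ols([p[0] for p in pos], [p[1] for p in pos])

print(slope_latent, slope_censored, slope_truncated)
```

In this design both the censored and the truncated slopes come out well below the latent slope, consistent with the inconsistency results on the two slides above.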

  9. The Maximum Likelihood Estimator
     - Let $\{(y_i, x_i),\ i = 1, \ldots, N\}$ be a random sample of data on the model. The contribution to the likelihood of a zero observation is determined by
       $$P(y_i = 0 \mid x_i) = 1 - \Phi\left(\frac{x_i'\beta}{\sigma}\right).$$
       The contribution to the likelihood of a non-zero observation is determined by
       $$f(y_i \mid x_i) = \frac{1}{\sigma}\,\phi\left(\frac{y_i - x_i'\beta}{\sigma}\right),$$
       which is not invariant to $\sigma$. Thus, the overall contribution of observation $i$ to the log-likelihood function is
       $$\ln l_i(x_i; \beta, \sigma) = 1(y_i = 0)\,\ln\left[1 - \Phi\left(\frac{x_i'\beta}{\sigma}\right)\right] + 1(y_i > 0)\,\ln\left[\frac{1}{\sigma}\,\phi\left(\frac{y_i - x_i'\beta}{\sigma}\right)\right].$$
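The per-observation contribution translates directly into code. The sketch below assumes censoring at zero and takes the index $x_i'\beta$ as a precomputed number `xb`; the function names are hypothetical. It uses $\Phi(-z)$ in place of $1 - \Phi(z)$, which is the same quantity by symmetry but numerically safer for large $z$.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal


def tobit_loglik_i(y, xb, sigma):
    """Log-likelihood contribution of one observation, censoring at zero."""
    if y <= 0.0:
        # Zero observation: ln[1 - Phi(xb / sigma)] = ln Phi(-xb / sigma).
        return math.log(STD.cdf(-xb / sigma))
    # Positive observation: ln[(1 / sigma) * phi((y - xb) / sigma)].
    return math.log(STD.pdf((y - xb) / sigma) / sigma)


def tobit_loglik(ys, xbs, sigma):
    """Sample log-likelihood: the sum of the individual contributions."""
    return sum(tobit_loglik_i(y, xb, sigma) for y, xb in zip(ys, xbs))


# One censored and one uncensored observation, with illustrative values.
print(tobit_loglik([0.0, 1.0], [0.0, 1.0], 1.0))
```

A numerical optimiser could then maximise `tobit_loglik` over $\beta$ and $\sigma$; that step is omitted here.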

  10. and the sample log-likelihood is
      $$\ln L_N(\beta, \sigma) = \sum_{i=1}^{N} \left\{ (1 - D_i)\,\ln\left[1 - \Phi\left(\frac{x_i'\beta}{\sigma}\right)\right] + D_i\left[\ln\phi\left(\frac{y_i - x_i'\beta}{\sigma}\right) - \ln\sigma\right] \right\},$$
      where $D_i$ equals one when $y_i^* > 0$ and equals zero otherwise.
      - Notice that both $\beta$ and $\sigma$ are separately identified. Moreover, if $D_i = 1$ for all $i$, the ML and the OLS estimators will be the same.
      - First-order conditions (FOC), with $\phi_i \equiv \phi(x_i'\beta/\sigma)$ and $\Phi_i \equiv \Phi(x_i'\beta/\sigma)$:
        $$\frac{\partial \ln L}{\partial \beta} = \frac{1}{\sigma^2} \sum_{i=1}^{N} \left[ D_i\,(y_i - x_i'\beta)\,x_i - (1 - D_i)\,\frac{\sigma\phi_i}{1 - \Phi_i}\,x_i \right]$$
        $$\frac{\partial \ln L}{\partial \sigma^2} = \sum_{i=1}^{N} \left[ (1 - D_i)\,\frac{x_i'\beta\,\phi_i}{2\sigma^3\,(1 - \Phi_i)} + D_i\left( \frac{(y_i - x_i'\beta)^2}{2\sigma^4} - \frac{1}{2\sigma^2} \right) \right]$$

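  11. Or write as:
      $$(1)\quad \frac{\partial \ln L}{\partial \beta} = -\frac{1}{\sigma^2} \sum_{i \in 0} \frac{\sigma\phi_i}{1 - \Phi_i}\,x_i + \frac{1}{\sigma^2} \sum_{i \in +} (y_i - x_i'\beta)\,x_i$$
      $$(2)\quad \frac{\partial \ln L}{\partial \sigma^2} = \frac{1}{2\sigma^3} \sum_{i \in 0} \frac{x_i'\beta\,\phi_i}{1 - \Phi_i} - \frac{N_+}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i \in +} (y_i - x_i'\beta)^2$$
      where $i \in 0$ indexes the zero observations, $i \in +$ the positive observations, and $N_+$ is the number of positive observations.
      - Note that premultiplying (1) by $\beta'/(2\sigma^2)$ and adding (2) cancels the sums over the zero observations, leaving
        $$-\frac{N_+}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i \in +} (y_i - x_i'\beta)\,y_i = 0 \quad\Longrightarrow\quad \hat\sigma^2 = \frac{1}{N_+} \sum_{i \in +} (y_i - x_i'\beta)\,y_i,$$
        that is, the positive observations only contribute to the estimation of $\sigma$.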
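The score expressions can be checked against finite differences of the log-likelihood. The sketch below assumes a single regressor without an intercept and uses made-up data and parameter values; `loglik` and `score` are hypothetical helper names. It also evaluates the combination $\beta/(2\sigma^2) \times (1) + (2)$, which should depend on the positive observations only.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def loglik(beta, s2, xs, ys):
    """Tobit log-likelihood with scalar x, parameterised in (beta, sigma^2)."""
    sigma = math.sqrt(s2)
    total = 0.0
    for x, y in zip(xs, ys):
        xb = x * beta
        if y > 0.0:
            total += math.log(STD.pdf((y - xb) / sigma) / sigma)
        else:
            total += math.log(STD.cdf(-xb / sigma))  # ln(1 - Phi_i)
    return total


def score(beta, s2, xs, ys):
    """Analytic derivatives of the log-likelihood w.r.t. beta and sigma^2."""
    sigma = math.sqrt(s2)
    d_beta = d_s2 = 0.0
    for x, y in zip(xs, ys):
        xb = x * beta
        if y > 0.0:
            d_beta += (y - xb) * x / s2
            d_s2 += (y - xb) ** 2 / (2 * s2 ** 2) - 1 / (2 * s2)
        else:
            ratio = STD.pdf(xb / sigma) / STD.cdf(-xb / sigma)  # phi_i / (1 - Phi_i)
            d_beta += -ratio * x / sigma
            d_s2 += xb * ratio / (2 * sigma ** 3)
    return d_beta, d_s2


xs = [0.5, -1.0, 1.5, 2.0, -0.3, 0.8]  # illustrative data
ys = [0.9, 0.0, 1.2, 2.6, 0.0, 0.0]
beta, s2, h = 0.7, 1.3, 1e-6

# Central finite differences of the log-likelihood.
num_beta = (loglik(beta + h, s2, xs, ys) - loglik(beta - h, s2, xs, ys)) / (2 * h)
num_s2 = (loglik(beta, s2 + h, xs, ys) - loglik(beta, s2 - h, xs, ys)) / (2 * h)
ana_beta, ana_s2 = score(beta, s2, xs, ys)

# beta / (2 sigma^2) * (1) + (2): only the positive observations should remain.
combo = beta / (2 * s2) * ana_beta + ana_s2
n_pos = sum(1 for y in ys if y > 0.0)
rhs = -n_pos / (2 * s2) + sum(
    (y - x * beta) * y for x, y in zip(xs, ys) if y > 0.0
) / (2 * s2 ** 2)
print(num_beta, ana_beta, num_s2, ana_s2, combo, rhs)
```

Agreement between the numeric and analytic derivatives is a useful guard against sign and power-of-$\sigma$ slips in hand-derived scores.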

  12. - Also, if we define $m_i \equiv E(y_i^* \mid y_i)$, then we can write (1) as
        $$\frac{\partial \ln L}{\partial \beta} = c \sum_{i=1}^{N} x_i\,(m_i - x_i'\beta), \qquad c = \frac{1}{\sigma^2},$$
        or, setting this to zero,
        $$\sum_{i=1}^{N} x_i m_i = \sum_{i=1}^{N} x_i x_i'\beta,$$
        which defines an EM algorithm for the Tobit model. Note also that
        $$m_i = \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ x_i'\beta - \sigma\,\dfrac{\phi_i}{1 - \Phi_i} & \text{otherwise,} \end{cases}$$
        again replacing $y^*$ with its best guess, given $y$, when it is unobserved.
      - Using Theorems 1 and 2 from Lecture 6, the MLE of $\beta$ and $\sigma^2$ is consistent and asymptotically normally distributed.
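The EM iteration implied by $\sum_i x_i m_i = \sum_i x_i x_i'\beta$ can be sketched in a deliberately simplified setting. The assumptions here are strong and are not from the lecture: a single regressor with no intercept, $\sigma$ treated as known (so only $\beta$ is updated), and simulated data with an assumed true $\beta = 1$. The names `e_step` and `em_beta` are hypothetical.

```python
import random
from statistics import NormalDist

STD = NormalDist()


def e_step(x, y, beta, sigma):
    """m_i = E(y*_i | y_i): the observed y if positive, else the truncated mean."""
    if y > 0.0:
        return y
    z = x * beta / sigma
    return x * beta - sigma * STD.pdf(z) / STD.cdf(-z)  # x'b - sigma * phi / (1 - Phi)


def em_beta(xs, ys, sigma, beta=0.0, iters=100):
    """Iterate: E-step fills in m_i, M-step solves sum(x * m) = sum(x * x) * beta."""
    sxx = sum(x * x for x in xs)
    for _ in range(iters):
        m = [e_step(x, y, beta, sigma) for x, y in zip(xs, ys)]
        beta = sum(x * mi for x, mi in zip(xs, m)) / sxx
    return beta


# Simulated censored data under assumed parameter values.
rng = random.Random(2)
true_beta, sigma = 1.0, 1.0
xs = [rng.gauss(0.5, 1.0) for _ in range(4_000)]
ys = [max(0.0, true_beta * x + rng.gauss(0.0, sigma)) for x in xs]

beta_hat = em_beta(xs, ys, sigma)
print(beta_hat)
```

A full EM implementation would also update $\sigma^2$ in the M-step, which requires the conditional second moment of $y^*$ for the censored observations; that part is left out of this sketch.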

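  13. - Exercise: derive the asymptotic covariance matrix from the expected values of the second partial derivatives of $\ln L$.
      - Note that the negative expected Hessian has the general form
        $$-\begin{pmatrix} E\,\dfrac{\partial^2 \ln L}{\partial\beta\,\partial\beta'} & E\,\dfrac{\partial^2 \ln L}{\partial\beta\,\partial\sigma^2} \\ \cdot & E\,\dfrac{\partial^2 \ln L}{\partial(\sigma^2)^2} \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{N} a_i\,x_i x_i' & \sum_{i=1}^{N} b_i\,x_i \\ \cdot & \sum_{i=1}^{N} c_i \end{pmatrix},$$
        where a dot denotes the symmetric element.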

  14. LM or Score Test
      - Let the log-likelihood be written $\ln L(\theta_1, \theta_2)$, where $\theta_1$ is the set of parameters that are unrestricted under the null hypothesis and $\theta_2$ are the $k_2$ parameters restricted under $H_0$:
        $$H_0: \theta_2 = 0 \qquad H_1: \theta_2 \neq 0$$
      - e.g. $y_i^* = x_{1i}'\beta_1 + x_{2i}'\beta_2 + u_i$ with $u_i \sim N(0, \sigma^2)$, where $\theta_1 = (\beta_1', \sigma^2)'$ and $\theta_2 = \beta_2$.

  15. $$\frac{\partial \ln L(\theta_1, \theta_2)}{\partial\theta} = \sum_i \frac{\partial \ln l_i(\theta_1, \theta_2)}{\partial\theta} \qquad\text{or}\qquad S(\theta) = \sum_i S_i(\theta)$$
      - Let $\tilde\theta$ be the MLE under $H_0$. Then
        $$\frac{1}{\sqrt{N}}\,S(\tilde\theta) \overset{a}{\sim} N(0, H),$$
        and therefore
        $$\frac{1}{N}\,S(\tilde\theta)'\,H^{-1}\,S(\tilde\theta) \overset{a}{\sim} \chi^2(k_2).$$

  16. In the Tobit model, consider the case of $H_0: \beta_2 = 0$:
      $$\frac{\partial \ln L}{\partial\beta_2} = \frac{1}{\sigma^2} \sum_i D_i\,(y_i - x_i'\beta)\,x_{2i} - \frac{1}{\sigma^2} \sum_i (1 - D_i)\,\frac{\sigma\phi_i}{1 - \Phi_i}\,x_{2i}$$
      $$\frac{\partial \ln L}{\partial\beta_2} = \frac{1}{\sigma^2} \sum_i e_i^{(1)}\,x_{2i}$$
      where
      $$e_i^{(1)} = D_i\,(y_i - x_i'\beta) + (1 - D_i)\left(-\frac{\sigma\phi_i}{1 - \Phi_i}\right)$$
      is known as the first-order 'generalised' residual, which reduces to $u_i = y_i - x_i'\beta$ in the general linear model case.
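The generalised residual and the resulting score for $\beta_2$ are straightforward to compute once restricted estimates are available. The sketch below assumes a scalar $x_{2i}$, censoring at zero, and that the index $x_i'\beta$ evaluated at the restricted estimates is passed in as `xb`; the function names and the example numbers are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()


def generalised_residual(y, xb, sigma):
    """First-order generalised residual e_i^(1) for the Tobit model."""
    if y > 0.0:
        # D_i = 1: the usual residual y_i - x_i'beta.
        return y - xb
    # D_i = 0: minus sigma * phi_i / (1 - Phi_i), using 1 - Phi(z) = Phi(-z).
    z = xb / sigma
    return -sigma * STD.pdf(z) / STD.cdf(-z)


def score_beta2(ys, xbs, x2s, sigma):
    """Score for a scalar beta_2: (1 / sigma^2) * sum_i e_i^(1) * x_2i."""
    total = sum(
        generalised_residual(y, xb, sigma) * x2
        for y, xb, x2 in zip(ys, xbs, x2s)
    )
    return total / sigma ** 2


# Two illustrative observations: one positive, one censored.
print(generalised_residual(2.0, 0.5, 1.0))
print(generalised_residual(0.0, 0.0, 1.0))
print(score_beta2([2.0, 0.0], [0.5, 0.0], [1.0, 1.0], 1.0))
```

Forming the LM statistic additionally needs an estimate of the variance of the score, which is not covered by this sketch.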

  17. This kind of Score or LM test can be extended to specification tests for heteroskedasticity and for non-normality. Notice that estimation under the alternative is avoided, at least in terms of the test statistic. If $H_0$ is rejected, then estimation under $H_1$ is unavoidable.
      - Consider the normal distribution
        $$f(u_i) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\,\frac{u_i^2}{\sigma^2}\right),$$
        which can be written in terms of log scores as
        $$\frac{\partial \ln f(u_i)}{\partial u_i} = -\frac{u_i}{\sigma^2}.$$
      - A popular generalisation (Pearson family of distributions) is
