Slide Set 5 CLRM: sample properties of OLS
Pietro Coretto (pcoretto@unisa.it)
Econometrics, Master in Economics and Finance (MEF), Università degli Studi di Napoli "Federico II"
Version: Saturday 28th December, 2019 (h16:05)
By "finite sample" we mean n < +∞, that is, a finite sample size. Note: the terminology "finite sample" refers to a sample size n that is not "too large". When n is "large enough", the asymptotic regime takes over, but that is the object of the second part of the course.

We will investigate the finite-sample properties of both b and s². A1–A4 are the assumptions introduced in the previous slide sets.

Finite sample properties of b

Proposition (unbiasedness of b). Assume A1, A2 and A3. Then E[b | X] = β, that is, b is an unbiased estimator of β conditional on X.

Proof. First note that, by the linearity of the expectation, E[b | X] = β whenever E[b − β | X] = 0. Recall the decomposition of the estimation error

    b − β = (X′X)⁻¹ X′ε,

so that

    E[b − β | X] = E[(X′X)⁻¹ X′ε | X].

Pulling out what is known (a function of X), we obtain

    E[b − β | X] = (X′X)⁻¹ X′ E[ε | X] = 0.  ∎

Note that this is a conditional unbiasedness statement, which is stronger than the more traditional unconditional statement. It is stronger because

    E[b | X] = β  ⟹  E[E[b | X]] = E[b] = β.
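The conditional unbiasedness E[b | X] = β can be checked numerically: hold the design X fixed, redraw ε many times, and average the resulting OLS estimates. A minimal sketch, in which the design matrix, β and σ are made-up values for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 100, 3
beta = np.array([1.0, 2.0, -0.5])
sigma = 1.5

# Fixed design: intercept plus two made-up regressors, held constant across replications
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])

R = 5000
b_draws = np.empty((R, K))
for r in range(R):
    eps = rng.normal(scale=sigma, size=n)            # fresh error draw, X held fixed
    y = X @ beta + eps
    b_draws[r] = np.linalg.solve(X.T @ X, X.T @ y)   # OLS: b = (X'X)^{-1} X'y

print(b_draws.mean(axis=0))   # close to beta, illustrating E[b | X] = beta
```

The Monte Carlo average of the b draws should match β up to simulation noise of order 1/√R.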

Proposition (variance-covariance of b). Assume A1, A2, A3 and A4. Then the following hold:

(a) Var[b | X] = σ² (X′X)⁻¹

(b) (Gauss-Markov theorem) Let b₀ be any estimator of β that is linear in y and unbiased. Then Var[b₀ | X] ⪰ Var[b | X].

(c) Cov[b, e | X] = 0

Recall that, for matrices, A ⪰ B means that A − B is positive semi-definite (PSD). Because of (b), the OLS estimator is also called BLUE (Best Linear Unbiased Estimator). "Best" here is in terms of mean squared error (MSE): since b is unbiased, its MSE equals its variance.

Proof of part (a). Since β is not a random variable, its variance is zero, hence

    Var[b | X] = Var[b − β | X]
               = Var[(X′X)⁻¹ X′ε | X]
               = Var[Aε | X],   with A = (X′X)⁻¹ X′
               = A Var[ε | X] A′
               = (X′X)⁻¹ X′ Var[ε | X] X (X′X)⁻¹    (∗)
               = (X′X)⁻¹ X′ (σ² Iₙ) X (X′X)⁻¹
               = σ² (X′X)⁻¹ (X′X) (X′X)⁻¹
               = σ² (X′X)⁻¹.

(∗) This is because for an invertible matrix B we have (B′)⁻¹ = (B⁻¹)′, so that A′ = X(X′X)⁻¹.  ∎
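Part (a) can also be verified by simulation: with X fixed and ε redrawn, the Monte Carlo covariance matrix of the OLS draws should match σ²(X′X)⁻¹. A hedged sketch with a made-up design:

```python
import numpy as np

rng = np.random.default_rng(1)
n, K, sigma = 80, 2, 2.0
beta = np.array([0.5, 1.0])
X = np.column_stack([np.ones(n), rng.uniform(-1, 1, size=n)])  # fixed design

R = 20_000
b = np.empty((R, K))
for r in range(R):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    b[r] = np.linalg.solve(X.T @ X, X.T @ y)

V_theory = sigma**2 * np.linalg.inv(X.T @ X)   # sigma^2 (X'X)^{-1}
V_mc = np.cov(b, rowvar=False)                 # Monte Carlo covariance of the b draws
print(np.abs(V_mc - V_theory).max())           # small: the two matrices agree
```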

Proof of part (b). First note that OLS is a linear map of y: defining A = (X′X)⁻¹ X′, we have b = Ay. Define an alternative linear estimator b₀, assuming that it exists,

    b₀ = Cy = (D + A) y,   with D = C − A.

Then

    b₀ = (D + A) y = Dy + Ay = D(Xβ + ε) + b = DXβ + Dε + b.

By assumption b₀ is unbiased conditional on X, therefore E[b₀ | X] = β. Therefore

    E[b₀ | X] = E[DXβ + Dε + b | X]
              = DXβ + E[Dε | X] + E[b | X]
              = DXβ + D E[ε | X] + β
              = DXβ + β.

Unbiasedness of b₀ thus requires DXβ = 0 for every β, which implies DX = 0, hence

    b₀ = Dε + b.    (5.1)

Now subtract β from both sides of (5.1) and obtain the estimation error of b₀:

    b₀ − β = Dε + (b − β) = Dε + Aε = (D + A) ε.

In order to show that OLS is BLUE, we need to show that for any b₀ (linear in y and unbiased) it holds that Var[b₀ | X] ⪰ Var[b | X], that is, Var[b₀ | X] − Var[b | X] is PSD. Same trick as before: the variance of a constant is zero, so we start from Var[b₀ | X] = Var[b₀ − β | X]:

    Var[b₀ | X] = Var[b₀ − β | X]
                = Var[(D + A) ε | X]
                = (D + A) Var[ε | X] (D + A)′
                = σ² (D + A)(D′ + A′)
                = σ² (DD′ + AD′ + DA′ + AA′).

Now

    DA′ = DX (X′X)⁻¹ = 0,   AD′ = (DA′)′ = 0,
    AA′ = (X′X)⁻¹ X′X (X′X)⁻¹ = (X′X)⁻¹,

so that

    Var[b₀ | X] = σ² [DD′ + (X′X)⁻¹] = σ² DD′ + σ² (X′X)⁻¹.

Any matrix of the form DD′ is PSD (it is a Gram matrix, like any variance-covariance matrix). Finally,

    Var[b₀ | X] − Var[b | X] = σ² DD′,

which is PSD. This completes the proof.  ∎

Proof of part (c). First note that e = Mε; indeed, recalling that MX = 0,

    e = My = M(Xβ + ε) = MXβ + Mε = Mε.

Now work out the formula for the covariance between the two vectors:

    Cov[b, e | X] = E[(b − E[b | X])(e − E[e | X])′ | X]
                 = E[(X′X)⁻¹ X′ε (Mε)′ | X]
                 = E[(X′X)⁻¹ X′ εε′ M | X]
                 = (X′X)⁻¹ X′ E[εε′ | X] M    (pull out what is known)
                 = σ² (X′X)⁻¹ X′M.

Since X′M = (MX)′ = 0, it follows that Cov[b, e | X] = 0.  ∎
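The Gauss-Markov inequality can be illustrated numerically. Take any linear unbiased competitor b₀ = Cy with CX = I; a convenient made-up example (an assumption for this sketch, not part of the slides) is a weighted least-squares map with arbitrary positive weights. Then Var[b₀ | X] − Var[b | X] should have only non-negative eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(2)
n, K, sigma2 = 50, 3, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # made-up fixed design

# Competitor: C = (X'WX)^{-1} X'W with arbitrary positive weights W.
# C satisfies CX = I, so b0 = Cy is linear in y and unbiased.
w = rng.uniform(0.5, 2.0, size=n)
W = np.diag(w)
C = np.linalg.solve(X.T @ W @ X, X.T @ W)

A = np.linalg.solve(X.T @ X, X.T)   # OLS map: b = Ay

V_ols = sigma2 * A @ A.T            # = sigma^2 (X'X)^{-1}
V_alt = sigma2 * C @ C.T            # variance of the competitor under A4

eigs = np.linalg.eigvalsh(V_alt - V_ols)
print(eigs)   # all non-negative (up to rounding): the difference is PSD
```

The specific weights do not matter: any linear unbiased C yields a PSD difference, which is exactly the content of part (b).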

Finite sample properties of s²

In order to do inference on β we need the variance of b, but it depends on σ². Since each ε_i is recovered via the residual e_i, the natural analog estimator of σ² would be

    σ̂² = (1/n) Σᵢ eᵢ² = e′e / n.

However, this estimator is biased, because the residuals are a two-stage estimate of the ε_i: we must first estimate β in order to obtain e. An unbiased estimator is obtained by correcting the biased one with the degrees of freedom, as usual:

    s² = 1/(n − K) Σᵢ eᵢ² = e′e / (n − K).

Proposition (unbiased estimation of σ²). Assume A1, A2, A3, A4; then s² is an unbiased estimator of σ².

Proof.

    E[e′e | X] = E[(Mε)′ Mε | X]
               = E[ε′ M′M ε | X]
               = E[ε′ M ε | X]                   (M is symmetric and idempotent)
               = Σᵢ Σⱼ m_ij E[ε_i ε_j | X]       (verify this!)
               = σ² Σᵢ m_ii                      (by A4, E[ε_i ε_j | X] = σ² if i = j, else 0)
               = σ² tr(M).
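The bias of the analog estimator and the unbiasedness of s² show up clearly in simulation: across replications, e′e/n averages to (n − K)/n · σ² while e′e/(n − K) averages to σ². A sketch with made-up values of n, K and σ²:

```python
import numpy as np

rng = np.random.default_rng(3)
n, K, sigma2 = 30, 4, 2.0
beta = rng.normal(size=K)
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])  # fixed design

R = 20_000
s2 = np.empty(R)
sig2_hat = np.empty(R)
for r in range(R):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    e = y - X @ np.linalg.solve(X.T @ X, X.T @ y)   # OLS residuals
    sig2_hat[r] = e @ e / n                          # biased analog estimator
    s2[r] = e @ e / (n - K)                          # degrees-of-freedom corrected

print(sig2_hat.mean())   # close to (n - K)/n * sigma2, i.e. biased downward
print(s2.mean())         # close to sigma2
```

With n = 30 and K = 4 the bias factor (n − K)/n ≈ 0.87 is large enough to be visible at a glance.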

Moreover,

    tr(M) = tr(Iₙ − P) = n − tr(P)

and

    tr(P) = tr(X (X′X)⁻¹ X′) = tr((X′X)⁻¹ X′X) = tr(I_K) = K,

using the cyclic property of the trace. Therefore E[e′e | X] = (n − K) σ². This proves that the analog estimator σ̂² is biased; in fact,

    E[σ̂² | X] = E[e′e / n | X] = ((n − K)/n) σ².

It also shows that s² is unbiased; in fact,

    E[s² | X] = E[e′e / (n − K) | X] = ((n − K)/(n − K)) σ² = σ².  ∎

Estimated standard errors of b

We know that

    Var[b | X] = σ² (X′X)⁻¹,

which depends on population quantities. We can estimate the variance matrix of b by the plug-in principle:

    V̂ar(b) = s² (X′X)⁻¹.

Let Se(b_k) be the square root of the k-th diagonal element of V̂ar(b); then Se(b_k) is the standard error of the parameter estimate b_k, that is,

    Se(b_k) = s √[(X′X)⁻¹]_kk.
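The plug-in standard errors take only a few lines of NumPy. The data-generating numbers below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n, K = 200, 3
beta = np.array([1.0, -2.0, 0.7])
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ beta + rng.normal(scale=1.3, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                 # OLS estimate
e = y - X @ b                         # residuals
s2 = e @ e / (n - K)                  # unbiased estimate of sigma^2

V_hat = s2 * XtX_inv                  # plug-in Var-hat(b) = s^2 (X'X)^{-1}
se = np.sqrt(np.diag(V_hat))          # Se(b_k) = s * sqrt([(X'X)^{-1}]_kk)

for k in range(K):
    print(f"b_{k} = {b[k]: .4f}   Se = {se[k]:.4f}")
```

These are exactly the coefficient standard errors reported by any standard regression routine under assumptions A1–A4.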
