Slide Set 4: CLRM estimation




1. Slide Set 4: CLRM estimation
Pietro Coretto, pcoretto@unisa.it
Econometrics, Master in Economics and Finance (MEF)
Università degli Studi di Napoli “Federico II”
Version: Saturday 28th December, 2019 (h16:05)

Least Squares Method (LS)

Given an additive regression model
$$ y = f(X; \beta) + \varepsilon $$
note that $\varepsilon$ is not observed, but it is a function of the observables and the unknown parameter:
$$ \varepsilon = y - f(X; \beta) $$
LS method: assume the signal $f(X; \beta)$ is much stronger than the error $\varepsilon$, and look for a $\beta$ such that the “size” of $\varepsilon$ is as small as possible, where the size of $\varepsilon$ is measured by some norm $\|\varepsilon\|$.
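As a side note, here is a minimal numerical sketch of the idea (Python/numpy, simulated data; all names hypothetical): for the linear case $f(X;\beta) = X\beta$, a candidate $\beta$ closer to the truth leaves a smaller residual norm $\|y - X\beta\|_2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + one regressor
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(scale=0.3, size=n)        # y = f(X; beta) + eps

# "Size" of the error for two candidate parameter vectors:
for beta in (np.array([0.0, 0.0]), beta_true):
    eps = y - X @ beta                                   # eps = y - f(X; beta)
    print(beta, np.linalg.norm(eps))                     # the better beta leaves a smaller norm
```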

2. Ordinary Least Squares estimator (OLS)

OLS = LS with the norm $\|\cdot\|_2$. Therefore the OLS objective function is
$$ S(\beta) = \|\varepsilon\|_2^2 = \varepsilon'\varepsilon = (y - f(X;\beta))'(y - f(X;\beta)), $$
and the OLS estimator $b$ is defined as the optimal solution
$$ b = \arg\min_{\beta \in \mathbb{R}^K} S(\beta). $$
For the linear model
$$ S(\beta) = \|\varepsilon\|_2^2 = \varepsilon'\varepsilon = (y - X\beta)'(y - X\beta) = \sum_{i=1}^n \varepsilon_i^2 = \sum_{i=1}^n (y_i - x_i'\beta)^2. $$
$S(\beta)$ is nicely convex!

Proposition (OLS estimator). The “unique” OLS estimator is
$$ b = (X'X)^{-1} X'y. $$
To see this, first we introduce two simple matrix derivative rules:
1. Let $a, b \in \mathbb{R}^p$; then $\frac{\partial\, a'b}{\partial b} = \frac{\partial\, b'a}{\partial b} = a$.
2. Let $b \in \mathbb{R}^p$ and let $A \in \mathbb{R}^{p \times p}$ be symmetric; then $\frac{\partial\, b'Ab}{\partial b} = 2Ab$.
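A quick numerical check of the closed form (Python/numpy, simulated data): compute $b$ by solving the normal equations and compare it with numpy's built-in least-squares routine.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(scale=0.3, size=n)

# Closed form b = (X'X)^{-1} X'y, computed by solving the linear system
b = np.linalg.solve(X.T @ X, X.T @ y)   # preferred over forming an explicit inverse

# Cross-check against numpy's least-squares routine
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(b, b_lstsq)
print(b)                                 # close to beta, up to sampling noise
```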

3. Proof. Rewrite the LS objective function:
$$ S(\beta) = (y - X\beta)'(y - X\beta) = y'y - \beta'X'y - y'X\beta + \beta'X'X\beta. $$
Note that the transpose of a scalar is the scalar itself, so $y'X\beta = (y'X\beta)' = \beta'X'y$, and we can write
$$ S(\beta) = y'y - 2\beta'(X'y) + \beta'(X'X)\beta. \tag{4.1} $$
Since $S(\cdot)$ is convex, there exists a minimum $b$ which satisfies the first-order conditions
$$ \left. \frac{\partial S(\beta)}{\partial \beta} \right|_{\beta = b} = 0. $$
Applying the previous derivative rules (1) and (2) to the 2nd and 3rd terms of (4.1),
$$ \frac{\partial S(b)}{\partial b} = -2(X'y) + 2(X'X)b = 0, $$
which leads to the so-called “normal equations”
$$ (X'X)b = X'y. $$
The matrix $X'X$ is square and symmetric (see homeworks). Based on A3, with probability 1 $X'X$ is nonsingular, so $(X'X)^{-1}$ exists and the normal equations can be written as
$$ (X'X)^{-1}(X'X)b = (X'X)^{-1}X'y \implies b = (X'X)^{-1}X'y, $$
which proves the desired result. ∎
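A numerical sanity check of the first-order conditions (simulated data, numpy assumed): at the OLS solution the gradient $-2(X'y) + 2(X'X)b$ vanishes.

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 80, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.5, size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)

# First-order condition: the gradient of S vanishes at b
grad = -2 * (X.T @ y) + 2 * (X.T @ X) @ b
print(np.allclose(grad, 0.0))   # True: (X'X)b = X'y holds at the optimum
```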

4. Formulation in terms of sample averages

It can be shown (see homeworks) that
$$ X'X = \sum_{i=1}^n x_i x_i' \quad \text{and} \quad X'y = \sum_{i=1}^n x_i y_i. $$
Define
$$ S_{xx} = \frac{1}{n} X'X = \frac{1}{n}\sum_{i=1}^n x_i x_i' \quad \text{and} \quad s_{xy} = \frac{1}{n} X'y = \frac{1}{n}\sum_{i=1}^n x_i y_i. $$
Therefore $b = (X'X)^{-1}X'y$ can be written as
$$ b = \left(\frac{1}{n} X'X\right)^{-1} \frac{1}{n} X'y = \left(\frac{1}{n}\sum_{i=1}^n x_i x_i'\right)^{-1} \left(\frac{1}{n}\sum_{i=1}^n x_i y_i\right) = S_{xx}^{-1} s_{xy}. $$

Once $\beta$ is estimated via $b$, the estimated error, also called the “residual”, is obtained as
$$ e = y - Xb. $$
Fitted values, also called predicted values, are $\hat{y} = Xb$, so that $e = y - \hat{y}$. Note that
$$ \hat{y}_i = b_1 + b_2 x_{i2} + b_3 x_{i3} + \ldots \quad \text{for all } i = 1, 2, \ldots, n. $$
What is $\hat{y}_i$? It is the estimated conditional expectation of $Y$ when $X_1 = 1, X_2 = x_{i2}, \ldots, X_K = x_{iK}$.
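A short check (simulated data) that the sample-average formulation gives the same estimator, since the $1/n$ factors cancel:

```python
import numpy as np

rng = np.random.default_rng(2)
n, K = 60, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([0.5, 1.5, -1.0]) + rng.normal(scale=0.4, size=n)

S_xx = (X.T @ X) / n          # (1/n) sum_i x_i x_i'
s_xy = (X.T @ y) / n          # (1/n) sum_i x_i y_i
b = np.linalg.solve(S_xx, s_xy)

y_hat = X @ b                 # fitted values
e = y - y_hat                 # residuals
assert np.allclose(b, np.linalg.solve(X.T @ X, X.T @ y))  # same estimator
```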

5. Algebraic/geometric properties of the OLS

Proposition (orthogonality of residuals). The column space of $X$ is orthogonal to the residual vector.

Proof. Write the normal equations:
$$ X'Xb - X'y = 0 \implies X'(y - Xb) = 0 \implies X'e = 0. $$
Therefore, for every column $X_{\cdot k}$ (observed regressor), it holds that the inner product $X_{\cdot k}' e = 0$. ∎

Proposition (residuals sum to zero). If the linear model includes the constant term, then
$$ \sum_{i=1}^n e_i = \sum_{i=1}^n (y_i - x_i'b) = 0. $$

Proof. By assumption we have a linear model with a constant/intercept term, that is,
$$ y_i = \beta_1 + \beta_2 x_{i2} + \beta_3 x_{i3} + \ldots + \varepsilon_i. $$
Therefore $X_{\cdot 1} = 1_n = (1, 1, \ldots, 1)'$. Applying the previous property to the 1st column of $X$,
$$ X_{\cdot 1}' e = 1' e = \sum_{i=1}^n e_i = 0, $$
and this proves the property. ∎
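Both propositions are easy to verify numerically (simulated data with an intercept column):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # first column is the constant
y = X @ np.array([2.0, 1.0, 0.5]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

print(np.allclose(X.T @ e, 0.0))   # every column of X is orthogonal to e
print(np.isclose(e.sum(), 0.0))    # with an intercept, residuals sum to zero
```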

6. Proposition (fitted vector is a projection). $\hat{y}$ is the projection of $y$ onto the space spanned by the columns of $X$ (the regressors).

Proof.
$$ \hat{y} = Xb = X(X'X)^{-1}X'y = Py. $$
It suffices to show that $P = X(X'X)^{-1}X'$ is symmetric and idempotent. First,
$$ P' = \left( X(X'X)^{-1}X' \right)' = X \left( (X'X)^{-1} \right)' X' = X \left( (X'X)' \right)^{-1} X' = X(X'X)^{-1}X' = P, $$
therefore $P$ is symmetric. Moreover,
$$ PP = \left( X(X'X)^{-1}X' \right)\left( X(X'X)^{-1}X' \right) = X(X'X)^{-1}(X'X)(X'X)^{-1}X' = X(X'X)^{-1}X' = P, $$
which shows that $P$ is also idempotent, and this completes the proof. ∎

$P$ is called the influence matrix, because it measures the impact of the observed $y_i$'s on each predicted $\hat{y}_i$. The elements of the diagonal of $P$ are called leverages, because they measure the influence of $y_i$ on the corresponding $\hat{y}_i$.
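A numerical illustration (simulated data): build $P$, verify symmetry and idempotency, and inspect the leverages. As an extra, well-known fact not stated on the slide, the leverages sum to $K$, since the trace of an idempotent matrix equals its rank.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, -1.0, 2.0]) + rng.normal(size=n)

P = X @ np.linalg.solve(X.T @ X, X.T)   # P = X (X'X)^{-1} X'
print(np.allclose(P, P.T))              # symmetric
print(np.allclose(P @ P, P))            # idempotent
print(np.allclose(P @ y, X @ np.linalg.solve(X.T @ X, X.T @ y)))  # P y = y_hat

leverages = np.diag(P)                  # influence of y_i on its own fitted value
print(leverages.sum())                  # trace(P) = rank(X) = K = 3
```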

7. Proposition (orthogonal decomposition). The OLS fit decomposes the observed vector $y$ into the sum of two orthogonal components:
$$ y = \hat{y} + e = Py + My. $$
Remark: orthogonality implies that the individual contributions of each term of the decomposition of $y$ are well identified.

Proof. First notice that
$$ e = y - \hat{y} = y - Py = (I - P)y = My, $$
where $M = I - P$. Therefore $y = \hat{y} + e = Py + My$. It remains to show that $\hat{y} = Py$ and $e = My$ are orthogonal vectors. Note that $MP = PM = 0$; in fact,
$$ (I - P)P = IP - PP = P - P = 0. $$
Moreover,
$$ \langle Py, My \rangle = (Py)'(My) = y'P'My = y'PMy = y'0y = 0, $$
and this completes the proof. ∎

$M = I - P$ is called the residual maker matrix because it maps $y$ into $e$. It allows us to write $e$ in terms of the observables $y$ and $X$. Properties:
- $M$ is idempotent and symmetric (show it);
- $MX = 0$; in fact, $MX = (I - P)X = X - X = 0$.

Remark: it can be shown that this decomposition is also unique (a consequence of the Hilbert projection theorem).
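A numerical check of the decomposition and of the properties of $M$ (simulated data):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, -1.0, 2.0]) + rng.normal(size=n)

P = X @ np.linalg.solve(X.T @ X, X.T)
M = np.eye(n) - P                        # residual maker

print(np.allclose(M @ P, 0.0))           # MP = 0
print(np.allclose(M @ X, 0.0))           # MX = 0
print(np.allclose(y, P @ y + M @ y))     # y = y_hat + e
print(np.isclose((P @ y) @ (M @ y), 0))  # the two components are orthogonal
```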

8. OLS Projection
[Figure: geometry of the OLS projection of $y$ onto the column space of $X$. Source: Greene, W. H. (2011), “Econometric Analysis”, 7th Edition.]

Estimate of the variance of the error term

The minimum of the LS objective function is
$$ S(b) = (y - Xb)'(y - Xb) = e'e. $$
This is called the “residual sum of squares”:
$$ \mathrm{RSS} = \sum_{i=1}^n e_i^2 = e'e. $$
Note that
$$ e = My = M(X\beta + \varepsilon) = M\varepsilon, $$
and
$$ \mathrm{RSS} = e'e = (M\varepsilon)'(M\varepsilon) = \varepsilon'M'M\varepsilon = \varepsilon'M\varepsilon. $$
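Since $\varepsilon$ is simulated here (unlike with real data, where it is unobserved), we can verify $e = M\varepsilon$ and $\mathrm{RSS} = \varepsilon'M\varepsilon$ directly:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 0.5, -0.5])
eps = rng.normal(size=n)                 # the true (normally unobserved) errors
y = X @ beta + eps

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

print(np.allclose(e, M @ eps))           # e = M eps
print(np.isclose(e @ e, eps @ M @ eps))  # RSS = e'e = eps' M eps
```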

9. Unbiased estimation of the error variance
$$ s^2 = \frac{1}{n-K}\sum_{i=1}^n e_i^2 = \frac{e'e}{n-K} = \frac{\mathrm{RSS}}{n-K} $$
$$ \mathrm{SER} = \text{“standard error of the regression”} = s $$

Estimation error decomposition

The sampling estimation error is given by $b - \beta$. Now
$$ b - \beta = (X'X)^{-1}X'y - \beta = (X'X)^{-1}X'(X\beta + \varepsilon) - \beta = (X'X)^{-1}(X'X)\beta + (X'X)^{-1}X'\varepsilon - \beta = (X'X)^{-1}X'\varepsilon. $$
The bias is the expected estimation error: $\mathrm{Bias}(b) = \mathrm{E}[b - \beta]$.
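A Monte Carlo sketch (simulated design in which the errors are drawn independently of $X$, so the expected estimation error is zero by construction): across replications, the average of $b - \beta$ is approximately $0$ and the average of $s^2$ is approximately $\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(7)
n, K, sigma = 50, 3, 0.7
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])  # fixed regressors
beta = np.array([1.0, 2.0, -0.5])

errs, s2s = [], []
for _ in range(5000):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    errs.append(b - beta)                 # sampling error (X'X)^{-1} X' eps
    s2s.append(e @ e / (n - K))           # unbiased variance estimate

print(np.mean(errs, axis=0))              # approximately 0 in this design
print(np.mean(s2s), sigma**2)             # approximately sigma^2
```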

10. TSS = total sum of squares

Let $\bar{y}$ be the sample average of the observed $y_1, y_2, \ldots, y_n$:
$$ \bar{y} = \frac{1}{n}\sum_{i=1}^n y_i, $$
and let $\bar{\mathbf{y}} = (\bar{y}, \bar{y}, \ldots, \bar{y})'$ ($n$ times). We can also write $\bar{\mathbf{y}} = \bar{y}\, 1_n$.

TSS = the deviance (variability) observed in the dependent variable $y$:
$$ \mathrm{TSS} = \sum_{i=1}^n (y_i - \bar{y})^2 = (y - \bar{\mathbf{y}})'(y - \bar{\mathbf{y}}). $$
This is a variability measure, because it computes the squared deviations of $y$ from its observed unconditional mean.

ESS = explained sum of squares

ESS = the overall deviance of the predicted values of $y$ with respect to the unconditional mean of $y$:
$$ \mathrm{ESS} = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 = (\hat{y} - \bar{\mathbf{y}})'(\hat{y} - \bar{\mathbf{y}}). $$
At first look this is not exactly a measure of variability (why?). But it turns out that another property of the OLS is that
$$ \frac{1}{n}\sum_{i=1}^n \hat{y}_i = \frac{1}{n}\sum_{i=1}^n y_i. $$
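A numerical illustration (simulated data with an intercept): the fitted values share the sample mean of $y$; combined with the orthogonality shown earlier, this also yields the familiar decomposition $\mathrm{TSS} = \mathrm{ESS} + \mathrm{RSS}$, a well-known consequence not stated on this slide.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.8, -0.3]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ b

TSS = ((y - y.mean()) ** 2).sum()
ESS = ((y_hat - y.mean()) ** 2).sum()
RSS = ((y - y_hat) ** 2).sum()

print(np.isclose(y_hat.mean(), y.mean()))  # fitted values share the sample mean of y
print(np.isclose(TSS, ESS + RSS))          # holds when the model has an intercept
```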
