


  1. Workshop 8.2a: Heterogeneity (Murray Logan, 23 Jul 2016)

  2. Section 1 Linear modelling assumptions

  3. Assumptions

     y_i = β₀ + β₁x_i + ε_i,   ε_i ∼ N(0, σ²)

  4. Linear modelling assumptions

     y_i = β₀ + β₁x_i + ε_i        (linearity)
     ε_i ∼ N(0, σ²)                (normality)

     V = cov(ε) =
         | σ²  0   ···  0  |
         | 0   σ²  ···  0  |     σ² on the diagonal: homogeneity of variance
         | ⋮   ⋮   ⋱   ⋮  |     0 off the diagonal: zero covariance (= independence)
         | 0   0   ···  σ² |
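
     A minimal sketch (not from the original slides) of what these assumptions imply in R: data simulated with a constant residual standard deviation satisfy homogeneity of variance, and the residual plot should show no trend or changing spread. The seed and parameter values are made up for illustration.

        ## simulate data that satisfies the linear modelling assumptions
        set.seed(1)                                       # arbitrary seed for reproducibility
        n     <- 16
        x     <- 1:n
        beta0 <- 40; beta1 <- 1.5; sigma <- 5             # hypothetical parameter values
        y     <- beta0 + beta1 * x + rnorm(n, 0, sigma)   # constant sigma = homogeneous variance
        fit   <- lm(y ~ x)
        plot(resid(fit) ~ fitted(fit))                    # no pattern expected under the assumptions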

  5. Dealing with Heterogeneity

        y    x
     41.9    1
     48.5    2
     43.0    3
     51.4    4
     51.2    5
     37.7    6
     50.7    7
     65.1    8
     51.7    9
     38.9   10
     70.6   11
     51.4   12
     62.7   13
     34.9   14
     95.3   15
     63.9   16

  6. Dealing with Heterogeneity

     > data1 <- read.csv('../data/D1.csv')
     > summary(data1)
            x               y
      Min.   : 1.00   Min.   :34.90
      1st Qu.: 4.75   1st Qu.:42.73
      Median : 8.50   Median :51.30
      Mean   : 8.50   Mean   :53.68
      3rd Qu.:12.25   3rd Qu.:63.00
      Max.   :16.00   Max.   :95.30

     y_i = β₀ + β₁x_i + ε_i,   ε_i ∼ N(0, σ²)
     • estimate β₀, β₁ and σ²
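
     As a hedged sketch of the "estimate β₀, β₁ and σ²" step (assuming the file path shown above is available), ordinary least squares via lm() gives the naive, homogeneous-variance estimates:

        ## OLS estimates of beta_0, beta_1 and sigma^2, ignoring any heterogeneity
        data1    <- read.csv('../data/D1.csv')
        data1.lm <- lm(y ~ x, data = data1)
        coef(data1.lm)               # beta_0 (intercept) and beta_1 (slope)
        summary(data1.lm)$sigma^2    # estimated residual variance sigma^2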

  7. Dealing with Heterogeneity

  8. Dealing with Heterogeneity

  9. Dealing with Heterogeneity

     Variance-covariance matrix (one row and one column per observation; σ² on every diagonal entry, 0 everywhere else):

     V = cov(ε) =
         | σ²  0   0   ···  0  |
         | 0   σ²  0   ···  0  |
         | 0   0   σ²  ···  0  |
         | ⋮   ⋮   ⋮   ⋱   ⋮  |
         | 0   0   0   ···  σ² |

  10. Dealing with Heterogeneity

      y_i = β₀ + β₁x_i + ε_i (linearity),   ε_i ∼ N(0, σ²) (normality)

      Homogeneity of variance and zero covariance (= independence) mean that the variance-covariance matrix is simply σ² times the identity matrix:

      V = cov(ε) =
          | σ²  0   ···  0  |          | 1  0  ···  0 |
          | 0   σ²  ···  0  |  = σ² ×  | 0  1  ···  0 |
          | ⋮   ⋮   ⋱   ⋮  |          | ⋮  ⋮  ⋱   ⋮ |
          | 0   0   ···  σ² |          | 0  0  ···  1 |
          (variance-covariance matrix)   (identity matrix)
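
      A small sketch of this matrix in R (the value of σ² here is hypothetical): σ² times the identity matrix has σ² on the diagonal and zeros everywhere else.

        ## homogeneous variance-covariance matrix: sigma^2 times the identity matrix
        n      <- 16
        sigma2 <- 25                 # made-up residual variance
        V      <- sigma2 * diag(n)   # diag(n) is the n x n identity matrix
        V[1:4, 1:4]                  # inspect the top-left corner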

  11. Dealing with Heterogeneity

      [Scatterplot of y against x (x = 1 to 16): the spread of y increases with x.]

      • variance proportional to X
      • variance inversely proportional to X

  12. Dealing with Heterogeneity

      • variance inversely proportional to X

      Each observation now gets its own variance, rather than σ² times the identity matrix:

          | σ² × 1/√X₁   0            ···  0           |
      V = | 0            σ² × 1/√X₂   ···  0           |
          | ⋮            ⋮            ⋱   ⋮           |
          | 0            0            ···  σ² × 1/√Xₙ  |
          (variance-covariance matrix)

  13. Dealing with Heterogeneity

      V = σ² × ω,  where ω =
          | 1/√X₁   0       ···  0      |
          | 0       1/√X₂   ···  0      |
          | ⋮       ⋮       ⋱   ⋮      |
          | 0       0       ···  1/√Xₙ  |
          (weights matrix)

  14. Dealing with Heterogeneity: calculating weights

      > 1/sqrt(data1$x)
       [1] 1.0000000 0.7071068 0.5773503 0.5000000 0.4472136 0.4082483 0.3779645 0.3535534 0.3333333
      [10] 0.3162278 0.3015113 0.2886751 0.2773501 0.2672612 0.2581989 0.2500000
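
      A sketch tying these weights back to the ω matrix of slide 13 (assuming data1 has been read in as on slide 6): placing the 1/√x values on the diagonal of an otherwise zero matrix gives the weights matrix.

        ## build the diagonal weights matrix omega from the 1/sqrt(x) values
        w     <- 1 / sqrt(data1$x)
        omega <- diag(w)             # 16 x 16, with 1/sqrt(x_i) on the diagonal
        round(omega[1:4, 1:4], 4)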

  15. Generalized least squares (GLS)

      1. use OLS to estimate the fixed effects
      2. use these estimates to estimate the variances via ML
      3. use these to re-estimate the fixed effects (OLS)

  16. Generalized least squares (GLS)

      ML is biased (for variance) when N is small:
      • use REML
      • maximise the likelihood of the residuals rather than of the data
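
      A minimal sketch of the ML versus REML contrast, assuming data1 from the earlier slides (the object names fit.ml and fit.reml are illustrative): the same model is fitted both ways and the two estimates of σ compared. ML effectively ignores the degrees of freedom used up by the fixed effects, so its variance estimate is biased downwards.

        ## compare ML and REML estimates of the residual standard deviation
        library(nlme)
        fit.ml   <- gls(y ~ x, data = data1, method = 'ML')
        fit.reml <- gls(y ~ x, data = data1, method = 'REML')
        c(ML = fit.ml$sigma, REML = fit.reml$sigma)   # the ML estimate is the smaller of the two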

  17. Variance structures

      Variance function          Variance structure            Description
      varFixed(~x)               V = σ² × x                    variance proportional to x (the covariate)
      varExp(form = ~x)          V = σ² × e^(2δx)              variance proportional to the exponential of x raised to a constant power
      varPower(form = ~x)        V = σ² × |x|^(2δ)             variance proportional to the absolute value of x raised to a constant power
      varConstPower(form = ~x)   V = σ² × (δ₁ + |x|^δ₂)²       a variant on the power function
      varIdent(form = ~1 | A)    V = σ_j² × I                  when A is a factor, variance is allowed to be different for each level (j) of the factor
      varComb(form = ~x | A)     (product of two structures)   combination of two of the above
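
      As a sketch of how a couple of these structures are used in practice (the object names data1.glsPow and data1.glsExp are illustrative, not from the slides), the same fixed-effects model can be refitted with varPower() or varExp() and the fits compared by AIC, just as is done with varFixed() on the following slides.

        ## two alternative variance structures from the table, applied to data1
        library(nlme)
        data1.glsPow <- gls(y ~ x, data = data1, weights = varPower(form = ~x), method = 'REML')
        data1.glsExp <- gls(y ~ x, data = data1, weights = varExp(form = ~x),   method = 'REML')
        AIC(data1.glsPow, data1.glsExp)   # smaller AIC = better-supported variance structure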

  18. Generalized least squares (GLS)

      > library(nlme)
      > data1.gls <- gls(y~x, data1, method='REML')
      > plot(data1.gls)

      [Plot: standardized residuals against fitted values for data1.gls.]

      > data1.gls1 <- gls(y~x, data=data1, weights=varFixed(~x), method='REML')
      > plot(data1.gls1)

      [Plot: standardized residuals against fitted values for data1.gls1.]

  19. Generalized least squares (GLS)

      > library(nlme)
      > data1.gls2 <- gls(y~x, data=data1, weights=varFixed(~x^2), method='REML')
      > plot(data1.gls2)

      [Plot: standardized residuals against fitted values for data1.gls2.]

  20. Generalized least squares (GLS)

      WRONG (raw residuals):
      > plot(resid(data1.gls) ~ fitted(data1.gls))
      > plot(resid(data1.gls2) ~ fitted(data1.gls2))

      [Plots: raw residuals against fitted values for data1.gls and data1.gls2.]

  21. Generalized least squares (GLS)

      CORRECT (normalized residuals):
      > plot(resid(data1.gls, 'normalized') ~ fitted(data1.gls))
      > plot(resid(data1.gls2, 'normalized') ~ fitted(data1.gls2))

      [Plots: normalized residuals against fitted values for data1.gls and data1.gls2.]
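
      As a sketch of what 'normalized' means here (this reconstruction is an assumption for illustration, not shown on the slides): with a varFixed(~x^2) structure and no correlation structure, the normalized residuals are the raw residuals divided by their fitted standard deviations, σ × x_i for positive x.

        ## reproduce the normalized residuals of the varFixed(~x^2) fit by hand
        r.norm <- resid(data1.gls2, type = 'normalized')
        r.hand <- resid(data1.gls2) / (data1.gls2$sigma * data1$x)   # sd_i = sigma * x_i
        all.equal(as.numeric(r.norm), as.numeric(r.hand))            # expected to be TRUE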

  22. Generalized least squares (GLS)

      > plot(resid(data1.gls, 'normalized') ~ data1$x)
      > plot(resid(data1.gls2, 'normalized') ~ data1$x)

      [Plots: normalized residuals against data1$x for data1.gls and data1.gls2.]

  23. Generalized least squares (GLS)

      > AIC(data1.gls, data1.gls1, data1.gls2)
                 df      AIC
      data1.gls   3 127.6388
      data1.gls1  3 121.0828
      data1.gls2  3 118.9904

      > library(MuMIn)
      > AICc(data1.gls, data1.gls1, data1.gls2)
                 df     AICc
      data1.gls   3 129.6388
      data1.gls1  3 123.0828
      data1.gls2  3 120.9904

      > # OR
      > anova(data1.gls, data1.gls1, data1.gls2)
                 Model df      AIC      BIC    logLik
      data1.gls      1  3 127.6388 129.5559 -60.81939
      data1.gls1     2  3 121.0828 123.0000 -57.54142
      data1.gls2     3  3 118.9904 120.9076 -56.49519
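
      A small follow-up sketch: expressing the three fits as AIC differences from the best model makes the ranking explicit (0 marks the preferred variance structure; here that is data1.gls2, the variance ∝ x² fit).

        ## delta-AIC relative to the best of the three fits
        aics <- AIC(data1.gls, data1.gls1, data1.gls2)
        aics$AIC - min(aics$AIC)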

  24. Generalized least squares (GLS)

      > summary(data1.gls)
      Generalized least squares fit by REML
        Model: y ~ x
        Data: data1
             AIC      BIC    logLik
        127.6388 129.5559 -60.81939

      Coefficients:
                     Value Std.Error  t-value p-value
      (Intercept) 40.33000  7.189442 5.609615  0.0001
      x            1.57074  0.743514 2.112582  0.0531

       Correlation:
        (Intr)
      x -0.879

      Standardized residuals:
              Min          Q1         Med          Q3         Max
      -2.00006105 -0.29319830 -0.02282621  0.35357567  2.29099872

      Residual standard error: 13.70973
      Degrees of freedom: 16 total; 14 residual

      > summary(data1.gls2)
      Generalized least squares fit by REML
        Model: y ~ x
        Data: data1
             AIC      BIC    logLik
        118.9904 120.9075 -56.49519

      Variance function:
       Structure: fixed weights
       Formula: ~x^2

      Coefficients:
                     Value Std.Error   t-value p-value
      (Intercept) 41.21920  1.493556 27.598018  0.0000
      x            1.49282  0.469988  3.176287  0.0067

       Correlation:
        (Intr)
      x -0.671

      Standardized residuals:
              Min          Q1         Med          Q3         Max
      -1.49259798 -0.59852829 -0.07669281  0.77799410  1.54157863

      Residual standard error: 1.393108
      Degrees of freedom: 16 total; 14 residual
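
      Going one step beyond the summaries above, nlme's intervals() gives approximate 95% confidence intervals for the GLS coefficients; the call below is a sketch applied to the weighted fit.

        ## approximate 95% CIs for the coefficients of the varFixed(~x^2) fit
        library(nlme)
        intervals(data1.gls2, which = 'coef')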
