Statistical Modelling in Stata 5: Linear Models
Mark Lunt, Centre for Epidemiology Versus Arthritis, University of Manchester

  1. Statistical Modelling in Stata 5: Linear Models
     Mark Lunt, Centre for Epidemiology Versus Arthritis, University of Manchester, 17/11/2020

  2. Structure
     This week: What is a linear model? How good is my model? Does a linear model fit this data?
     Next week: Categorical variables; Interactions; Confounding; Other considerations; Variable selection; Polynomial regression

  3. Statistical Models
     "All models are wrong, but some are useful." (G. E. P. Box)
     "A model should be as simple as possible, but no simpler." (attr. Albert Einstein)

  4. What is a Linear Model?
     Describes the relationship between variables.
     Assumes that the relationship can be described by straight lines.
     Tells you the expected value of an outcome (y) variable, given the values of one or more predictor (x) variables.
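
     As a minimal sketch (not part of the original slides), a model like this is fitted in Stata with the regress command; the auto dataset shipped with Stata and the variables mpg, weight and length stand in here for real course data:

        sysuse auto, clear            // illustrative data: Stata's bundled auto dataset
        regress mpg weight length     // expected mpg given weight and length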

  5. Variable Names
     Outcome:    Dependent variable / Y-variable / Response variable / Output variable
     Predictors: Independent variables / x-variables / Regressors / Input variables / Explanatory variables / Carriers / Covariates

  6. The Equation of a Linear Model
     The equation of a linear model with outcome Y and predictors x_1, ..., x_p is
        Y = β_0 + β_1 x_1 + β_2 x_2 + ... + β_p x_p + ε
     β_0 + β_1 x_1 + β_2 x_2 + ... + β_p x_p is the linear predictor.
     Ŷ = β_0 + β_1 x_1 + β_2 x_2 + ... + β_p x_p is the predictable part of Y.
     ε is the error term, the unpredictable part of Y. We assume that ε is normally distributed with mean 0 and variance σ².
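
     Continuing the hypothetical auto example above, the fitted linear predictor b_0 + b_1 x_1 + b_2 x_2 can be written out from the coefficients stored by regress:

        display "mpg-hat = " _b[_cons] " + " _b[weight] "*weight + " _b[length] "*length"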

  7. Linear Model Assumptions
     The mean of Y | x is a linear function of x.
     The observations Y_1, Y_2, ..., Y_n are independent.
     The variance of Y | x is constant.
     The distribution of Y | x is normal.
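
     A quick, informal way to eyeball these assumptions after the regress call above (a sketch; formal checks come later in the course):

        rvfplot                       // residuals vs fitted values: non-linearity, non-constant variance
        predict resid, residuals
        qnorm resid                   // normal quantile plot of the residuals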

  8. Parameter Interpretation
     [Figure: a straight line Y = β_0 + β_1 x, with slope β_1 and intercept β_0]
     β_1 is the amount by which Y increases if x_1 increases by 1 and none of the other x variables change.
     β_0 is the value of Y when all of the x variables are equal to 0.
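
     For example, in the hypothetical auto model, the expected change in mpg for a 1000 lb increase in weight, with length held fixed, is 1000 × b_weight; lincom reports this together with its confidence interval:

        lincom 1000*weight            // expected change in mpg per 1000 lb of weight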

  9. Estimating Parameters
     The β_j in the previous equation are referred to as parameters or coefficients.
     Don't use the expression "beta coefficients": it is ambiguous.
     We need to obtain estimates of them from the data we have collected.
     Estimates are normally given roman letters b_0, b_1, ..., b_p.
     The values given to the b_j are those which minimise Σ(Y − Ŷ)²: hence "least squares estimates".
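
     In the hypothetical auto example, the quantity being minimised can be computed directly and compared with the residual sum of squares stored by regress:

        predict mpg_hat, xb
        generate sq_err = (mpg - mpg_hat)^2
        summarize sq_err
        display "sum of squared residuals = " r(sum) "   (regress stores e(rss) = " e(rss) ")"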

  10. Inference on Parameters
     If the assumptions hold, the sampling distribution of b_j is normal with mean β_j and variance σ²/(n s²_x) (for sufficiently large n), where:
        σ² is the variance of the error terms ε,
        s²_x is the variance of x_j, and
        n is the number of observations.
     We can perform t-tests of hypotheses about β_j (e.g. β_j = 0).
     We can also produce a confidence interval for β_j.
     Inference on β_0 (the intercept) is usually not interesting.
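
     After regress, the coefficient table already reports the t statistic, p-value and 95% confidence interval for each b_j; as a sketch, the same quantities can be reproduced by hand for the hypothetical weight coefficient:

        display "t = " _b[weight]/_se[weight]
        display "95% CI: " _b[weight] - invttail(e(df_r), 0.025)*_se[weight]
        display "     to " _b[weight] + invttail(e(df_r), 0.025)*_se[weight]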

  11. Inference on the Predicted Value
     Y = β_0 + β_1 x_1 + ... + β_p x_p + ε
     Predicted value: Ŷ = b_0 + b_1 x_1 + ... + b_p x_p
     Observed values will differ from predicted values because of:
        random error (ε), and
        uncertainty about the parameters β_j.
     We can calculate a 95% prediction interval, within which we would expect 95% of observations to lie: a reference range for Y.
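
     A sketch of calculating this reference range in the hypothetical auto example, using the standard error of the forecast (predict's stdf option):

        predict yhat, xb
        predict se_f, stdf
        generate lo_pred = yhat - invttail(e(df_r), 0.025)*se_f
        generate hi_pred = yhat + invttail(e(df_r), 0.025)*se_f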

  12. Prediction Interval
     [Figure: scatter plot of Y1 against x1, with the fitted line and 95% prediction interval]

  13. Inference on the Mean
     The mean value of Y at a given value of x does not depend on ε.
     The standard error of Ŷ is called the standard error of the prediction (by Stata).
     We can calculate a 95% confidence interval for Ŷ.
     This can be thought of as a confidence region for the regression line.
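
     Continuing the prediction-interval sketch above, the confidence interval for the mean uses the standard error of the prediction (predict's stdp option) instead:

        predict se_m, stdp
        generate lo_mean = yhat - invttail(e(df_r), 0.025)*se_m
        generate hi_mean = yhat + invttail(e(df_r), 0.025)*se_m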

  14. Confidence Interval
     [Figure: scatter plot of Y1 against x1, with the fitted line and 95% confidence interval for the mean]

  15. Analysis of Variance (ANOVA)
     The variance of Y is Σ(Y − Ȳ)²/(n − 1) = [Σ(Ŷ − Ȳ)² + Σ(Y − Ŷ)²]/(n − 1).
     SS_reg = Σ(Ŷ − Ȳ)² (the regression sum of squares)
     SS_res = Σ(Y − Ŷ)² (the residual sum of squares)
     Each part has associated degrees of freedom: p d.f. for the regression, n − p − 1 for the residual.
     The mean square is MS = SS / d.f.
     MS_reg should be similar to MS_res if there is no association between Y and x.
     F = MS_reg / MS_res gives a measure of the strength of the association between Y and x.
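
     regress prints this ANOVA table at the top of its output; as a sketch, the same quantities can be recovered from the stored results in the hypothetical auto example:

        display "SS_reg = " e(mss) "   SS_res = " e(rss)
        display "MS_reg = " e(mss)/e(df_m) "   MS_res = " e(rss)/e(df_r)
        display "F = " (e(mss)/e(df_m))/(e(rss)/e(df_r)) "   (reported by regress as e(F) = " e(F) ")"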
