Statistical Modelling in Stata 5: Linear Models
Mark Lunt
Centre for Epidemiology Versus Arthritis
University of Manchester
17/11/2020

Structure
This Week
  What is a linear model?
  How good is my model?
  Does a linear model fit this data?
Next Week
  Categorical Variables
  Interactions
  Confounding
  Other Considerations
  Variable Selection
  Polynomial Regression

Statistical Models
  "All models are wrong, but some are useful." (G.E.P. Box)
  "A model should be as simple as possible, but no simpler." (attr. Albert Einstein)

What is a Linear Model?
  Describes the relationship between variables.
  Assumes that the relationship can be described by straight lines.
  Tells you the expected value of an outcome or y variable, given the values of one or more predictor or x variables.

Variable Names
  Outcome               Predictor
  Dependent variable    Independent variables
  Y-variable            x-variables
  Response variable     Regressors
  Output variable       Input variables
                        Explanatory variables
                        Carriers
                        Covariates

The Equation of a Linear Model
  The equation of a linear model, with outcome $Y$ and predictors $x_1, \ldots, x_p$:
  $$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_p x_p + \varepsilon$$
  $\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_p x_p$ is the linear predictor.
  $\hat{Y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_p x_p$ is the predictable part of $Y$.
  $\varepsilon$ is the error term, the unpredictable part of $Y$.
  We assume that $\varepsilon$ is normally distributed with mean 0 and variance $\sigma^2$.
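The split of Y into a predictable part and an error term can be sketched in a few lines of Python. This is only an illustration: the function names and the parameter values below are hypothetical, not taken from the slides, and the error term is drawn from a normal distribution as the slide assumes.

```python
import random

def linear_predictor(beta, x):
    """Return beta[0] + beta[1]*x[0] + ... + beta[p]*x[p-1]."""
    if len(beta) != len(x) + 1:
        raise ValueError("beta needs one more entry than x (the intercept)")
    return beta[0] + sum(b_j * x_j for b_j, x_j in zip(beta[1:], x))

def simulate_outcome(beta, x, sigma, rng):
    """Draw one Y = linear predictor + epsilon, with epsilon ~ N(0, sigma^2)."""
    return linear_predictor(beta, x) + rng.gauss(0.0, sigma)

# Hypothetical parameter values, chosen only for illustration.
beta = [2.0, 0.5, -1.0]   # beta_0, beta_1, beta_2
x = [4.0, 1.0]            # x_1, x_2

y_hat = linear_predictor(beta, x)   # predictable part: 2 + 0.5*4 - 1*1
rng = random.Random(2020)           # fixed seed so the draws are repeatable
y = simulate_outcome(beta, x, sigma=1.5, rng=rng)
print(y_hat, y)
```

Repeated draws of `y` scatter around `y_hat`, which is why the linear predictor is described as the expected value of Y given x.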

Linear Model Assumptions
  Mean of $Y|x$ is a linear function of $x$.
  Variables $Y_1, Y_2, \ldots, Y_n$ are independent.
  The variance of $Y|x$ is constant.
  Distribution of $Y|x$ is normal.

Parameter Interpretation
  [Figure: the line $Y = \beta_0 + \beta_1 x$, with intercept $\beta_0$ on the Y axis and slope $\beta_1$ shown as the rise for a unit increase in x]
  $\beta_1$ is the amount by which the expected value of $Y$ increases if $x_1$ increases by 1 and none of the other x variables change.
  $\beta_0$ is the expected value of $Y$ when all of the x variables are equal to 0.

Estimating Parameters
  The $\beta_j$ in the previous equation are referred to as parameters or coefficients.
  Don't use the expression "beta coefficients": it is ambiguous.
  We need to obtain estimates of them from the data we have collected.
  Estimates are normally given roman letters $b_0, b_1, \ldots, b_p$.
  The values given to the $b_j$ are those which minimise $\sum (Y - \hat{Y})^2$: hence "least squares estimates".
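For a single predictor the least squares estimates have a closed form, which the following sketch computes directly. The function names and the five data points are made up for illustration; with several predictors the same criterion is minimised, but the arithmetic is normally left to software.

```python
def least_squares(x, y):
    """Return (b0, b1) minimising sum((y_i - b0 - b1*x_i)**2) for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def residual_ss(x, y, b0, b1):
    """Sum of squared differences between observed and fitted values."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares(x, y)
best = residual_ss(x, y, b0, b1)
shifted = residual_ss(x, y, b0 + 0.1, b1)   # any other line fits worse
print(b0, b1, best, shifted)
```

Moving either estimate away from its least squares value, as in `shifted`, can only increase the sum of squared residuals.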

Inference on Parameters
  If the assumptions hold, the sampling distribution of $b_j$ is normal with mean $\beta_j$ and variance $\sigma^2 / (n s_x^2)$ (for sufficiently large $n$), where:
    $\sigma^2$ is the variance of the error terms $\varepsilon$,
    $s_x^2$ is the variance of $x_j$, and
    $n$ is the number of observations.
  Can perform t-tests of hypotheses about $\beta_j$ (e.g. $\beta_j = 0$).
  Can also produce a confidence interval for $\beta_j$.
  Inference on $\beta_0$ (the intercept) is usually not interesting.
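The variance formula above can be turned into a standard error and an approximate confidence interval for the slope in the one-predictor case. This is a sketch under stated assumptions: the data and the function name are hypothetical, $\sigma^2$ is estimated by the residual mean square, and a normal quantile is used in line with the large-sample statement on the slide. Stata's `regress` reports intervals based on the t distribution, which are wider in small samples such as this one.

```python
import math
from statistics import NormalDist

def slope_inference(x, y, level=0.95):
    """Return (b1, se_b1, (lower, upper)) for a one-predictor linear model."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # s_xx equals n * s_x^2 when s_x^2 is computed with divisor n.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)              # estimate of the error variance
    se_b1 = math.sqrt(sigma2_hat / s_xx)    # square root of sigma^2 / (n s_x^2)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # normal approximation (assumption)
    return b1, se_b1, (b1 - z * se_b1, b1 + z * se_b1)

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b1, se_b1, ci = slope_inference(x, y)
test_statistic = b1 / se_b1    # statistic for the hypothesis beta_1 = 0
print(b1, se_b1, ci, test_statistic)
```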

Inference on the Predicted Value
  $$Y = \beta_0 + \beta_1 x_1 + \ldots + \beta_p x_p + \varepsilon$$
  Predicted value: $\hat{Y} = b_0 + b_1 x_1 + \ldots + b_p x_p$
  Observed values will differ from predicted values because of:
    Random error ($\varepsilon$)
    Uncertainty about the parameters $\beta_j$
  We can calculate a 95% prediction interval, within which we would expect 95% of observations to lie.
  Reference range for $Y$.

Prediction Interval
  [Figure: Y1 plotted against x1, showing the prediction interval]

Inference on the Mean
  The mean value of $Y$ at a given value of $x$ does not depend on $\varepsilon$.
  The standard error of $\hat{Y}$ is called the standard error of the prediction (by Stata).
  We can calculate a 95% confidence interval for $\hat{Y}$.
  This can be thought of as a confidence region for the regression line.
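The difference between a confidence interval for the mean and a prediction interval for a new observation can be sketched for the one-predictor case. The data and the function name are hypothetical, and a normal quantile is used as a large-sample approximation (Stata uses the t distribution). The only difference between the two standard errors is the extra 1 under the square root, which carries the random error $\varepsilon$ of an individual observation. In Stata these two standard errors correspond to the `stdp` and `stdf` options of `predict` after `regress`.

```python
import math
from statistics import NormalDist

def intervals_at(x, y, x0, level=0.95):
    """Return (y_hat, ci_mean, prediction_interval) at x0 for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))                 # residual standard deviation
    y_hat = b0 + b1 * x0
    leverage = 1 / n + (x0 - x_bar) ** 2 / s_xx  # uncertainty about the line
    se_mean = s * math.sqrt(leverage)            # standard error of Y-hat
    se_obs = s * math.sqrt(1 + leverage)         # adds the random error term
    z = NormalDist().inv_cdf(0.5 + level / 2)    # normal approximation (assumption)
    ci_mean = (y_hat - z * se_mean, y_hat + z * se_mean)
    pred_int = (y_hat - z * se_obs, y_hat + z * se_obs)
    return y_hat, ci_mean, pred_int

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

y_hat, ci_mean, pred_int = intervals_at(x, y, x0=3.0)
print(y_hat, ci_mean, pred_int)
```

Both intervals are narrowest at the mean of x and widen as `x0` moves away from it, which gives the curved bands usually drawn around a regression line.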

Confidence Interval
  [Figure: Y1 plotted against x1, showing the confidence interval]

Analysis of Variance (ANOVA)
  Variance of $Y$ is
  $$\frac{\sum (Y - \bar{Y})^2}{n - 1} = \frac{\sum (Y - \hat{Y})^2 + \sum (\hat{Y} - \bar{Y})^2}{n - 1}$$
  $SS_{reg} = \sum (\hat{Y} - \bar{Y})^2$ (regression sum of squares)
  $SS_{res} = \sum (Y - \hat{Y})^2$ (residual sum of squares)
  Each part has associated degrees of freedom: $p$ d.f. for the regression, $n - p - 1$ for the residual.
  The mean square is $MS = SS / df$.
  $MS_{reg}$ should be similar to $MS_{res}$ if there is no association between $Y$ and $x$.
  $F = MS_{reg} / MS_{res}$ gives a measure of the strength of the association between $Y$ and $x$.
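The decomposition above can be checked numerically. The sketch below fits a one-predictor model (so p = 1) to made-up data and computes the sums of squares, mean squares and F statistic; the function name and the data are hypothetical.

```python
def anova_table(x, y):
    """Return sums of squares, mean squares and F for a one-predictor model."""
    n = len(x)
    p = 1                                   # number of predictors
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    ss_reg = sum((fi - y_bar) ** 2 for fi in fitted)            # regression SS
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual SS
    ss_total = sum((yi - y_bar) ** 2 for yi in y)               # total SS
    ms_reg = ss_reg / p
    ms_res = ss_res / (n - p - 1)
    return {
        "ss_reg": ss_reg, "ss_res": ss_res, "ss_total": ss_total,
        "df_reg": p, "df_res": n - p - 1,
        "ms_reg": ms_reg, "ms_res": ms_res, "f": ms_reg / ms_res,
    }

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

table = anova_table(x, y)
print(table)
```

Here almost all of the total sum of squares is attributed to the regression, so the mean square for the regression is far larger than the residual mean square and F is large.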
