
Simple linear regression
STAT 401A - Statistical Methods for Research Workers
Jarad Niemi, Iowa State University
October 4, 2013

Model: Simple Linear Regression

Recall the one-way ANOVA model:

$$Y_{ij} \stackrel{ind}{\sim} N(\mu_i, \sigma^2)$$

where $Y_{ij}$ is the observation for individual $j$ in group $i$.

The simple linear regression model is

$$Y_i \stackrel{ind}{\sim} N(\beta_0 + \beta_1 X_i, \sigma^2)$$

where $Y_i$ and $X_i$ are the response and explanatory variable, respectively, for individual $i$.

Terminology (all of these are equivalent), listed as name for $Y$ / name for $X$:

response / explanatory
outcome / covariate
dependent / independent
endogenous / exogenous
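To make the model statement concrete, here is a minimal Python sketch that draws independent responses from $N(\beta_0 + \beta_1 X_i, \sigma^2)$. The function name `simulate_slr` and the parameter values are hypothetical, chosen only for illustration; they are not taken from the slides.

```python
import random


def simulate_slr(xs, beta0, beta1, sigma, seed=None):
    """Draw Y_i ~ N(beta0 + beta1 * x_i, sigma^2), independently for each x_i."""
    rng = random.Random(seed)  # local generator so the draw is reproducible
    return [rng.gauss(beta0 + beta1 * x, sigma) for x in xs]


# Hypothetical explanatory values and parameters (illustration only)
xs = list(range(1, 13))
ys = simulate_slr(xs, beta0=1.4, beta1=-0.03, sigma=0.15, seed=1)

for x, y in zip(xs, ys):
    print(f"x = {x:2d}   y = {y:.3f}")
```

Setting `sigma=0` returns the mean line $\beta_0 + \beta_1 x$ exactly, which is a quick way to see that the only randomness in the model is the normal scatter around that line.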

Figure: Telomere length vs. years post diagnosis. Scatterplot with years post diagnosis (jittered) on the x-axis and telomere length on the y-axis. Data: R package abd, data set Telomeres.

Model: Interpretation

$$E[Y_i \mid X_i = x] = \beta_0 + \beta_1 x, \qquad V[Y_i \mid X_i = x] = \sigma^2$$

If $X_i = 0$, then $E[Y_i \mid X_i = 0] = \beta_0$. $\beta_0$ is the expected response when the explanatory variable is zero.

If $X_i$ increases from $x$ to $x + 1$, then

$$E[Y_i \mid X_i = x + 1] - E[Y_i \mid X_i = x] = (\beta_0 + \beta_1 x + \beta_1) - (\beta_0 + \beta_1 x) = \beta_1$$

$\beta_1$ is the expected increase in the response for each unit increase in the explanatory variable.

$\sigma$ is the standard deviation of the response for a fixed value of the explanatory variable.

Model: Estimators

Remove the mean:

$$Y_i = \beta_0 + \beta_1 X_i + e_i, \qquad e_i \stackrel{iid}{\sim} N(0, \sigma^2)$$

So $e_i = Y_i - (\beta_0 + \beta_1 X_i)$, which we approximate by the residual

$$r_i = \hat{e}_i = Y_i - (\hat\beta_0 + \hat\beta_1 X_i)$$

The least squares, maximum likelihood, and Bayesian estimators are

$$\hat\beta_1 = SXY / SXX$$

$$\hat\beta_0 = \bar{Y} - \hat\beta_1 \bar{X}$$

$$\hat\sigma^2 = SSE / (n - 2), \qquad d.f. = n - 2$$

where

$$\bar{X} = \frac{1}{n} \sum_{i=1}^n X_i, \qquad \bar{Y} = \frac{1}{n} \sum_{i=1}^n Y_i$$

$$SXY = \sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y})$$

$$SXX = \sum_{i=1}^n (X_i - \bar{X})(X_i - \bar{X}) = \sum_{i=1}^n (X_i - \bar{X})^2$$

$$SSE = \sum_{i=1}^n r_i^2$$
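The estimator formulas above can be sketched directly in Python. This is a minimal illustration, not the author's code: the function name `fit_slr` and the five data points are hypothetical, and the computation simply follows the definitions of SXY, SXX, and SSE given on the slide.

```python
def fit_slr(x, y):
    """Estimate beta0, beta1, sigma^2 using the SXY / SXX formulas."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)

    beta1_hat = sxy / sxx
    beta0_hat = ybar - beta1_hat * xbar

    # Residuals r_i = Y_i - (beta0_hat + beta1_hat * X_i)
    residuals = [yi - (beta0_hat + beta1_hat * xi) for xi, yi in zip(x, y)]
    sse = sum(r * r for r in residuals)
    sigma2_hat = sse / (n - 2)  # n - 2 degrees of freedom

    return {
        "beta0_hat": beta0_hat,
        "beta1_hat": beta1_hat,
        "sigma2_hat": sigma2_hat,
        "sse": sse,
        "residuals": residuals,
    }


# Hypothetical toy data (illustration only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
fit = fit_slr(x, y)
print(fit["beta0_hat"], fit["beta1_hat"], fit["sigma2_hat"])
```

For this toy data, SXY = 6 and SXX = 10, so the slope estimate is 0.6, the intercept estimate is 4 - 0.6 * 3 = 2.2, and SSE = 2.4 on 3 degrees of freedom.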

Model: Standard errors

How certain are we about $\hat\beta_0$ and $\hat\beta_1$ being equal to $\beta_0$ and $\beta_1$? We quantify this uncertainty using their standard errors:

$$SE(\beta_0) = \hat\sigma \sqrt{\frac{1}{n} + \frac{\bar{X}^2}{(n-1) s_X^2}}, \qquad d.f. = n - 2$$

$$SE(\beta_1) = \hat\sigma \sqrt{\frac{1}{(n-1) s_X^2}}, \qquad d.f. = n - 2$$

where

$$s_X^2 = SXX / (n - 1)$$

$$s_Y^2 = SYY / (n - 1)$$

$$SYY = \sum_{i=1}^n (Y_i - \bar{Y})^2$$

Correlation coefficient:

$$r_{XY} = \frac{SXY / (n - 1)}{s_X s_Y}$$

Coefficient of determination:

$$R^2 = r_{XY}^2 = \frac{SST - SSE}{SST}, \qquad SST = SYY = \sum_{i=1}^n (Y_i - \bar{Y})^2$$

The coefficient of determination is the proportion (often quoted as a percentage) of the total response variation explained by the explanatory variable(s).
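As a rough check on the standard error and $R^2$ formulas above, the following Python sketch computes them from the sums of squares. The function name `slr_summary` and the toy data are hypothetical; the arithmetic follows the slide's definitions of $s_X^2$, $s_Y^2$, $r_{XY}$, and SST.

```python
import math


def slr_summary(x, y):
    """Standard errors, correlation, and R^2 for simple linear regression."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

    beta1_hat = sxy / sxx
    beta0_hat = ybar - beta1_hat * xbar
    sse = sum((yi - beta0_hat - beta1_hat * xi) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(sse / (n - 2))

    s2_x = sxx / (n - 1)  # sample variance of X
    s2_y = syy / (n - 1)  # sample variance of Y

    se_beta0 = sigma_hat * math.sqrt(1 / n + xbar ** 2 / ((n - 1) * s2_x))
    se_beta1 = sigma_hat * math.sqrt(1 / ((n - 1) * s2_x))

    r_xy = (sxy / (n - 1)) / (math.sqrt(s2_x) * math.sqrt(s2_y))
    r_squared = (syy - sse) / syy  # (SST - SSE) / SST with SST = SYY

    return {
        "se_beta0": se_beta0,
        "se_beta1": se_beta1,
        "r_xy": r_xy,
        "r_squared": r_squared,
    }


# Hypothetical toy data (illustration only)
summary = slr_summary([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(summary)
```

On this toy data the two routes to the coefficient of determination agree: the squared correlation and (SST - SSE) / SST both give 0.6.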

Model: Pvalues and confidence intervals

We can compute two-sided pvalues via

$$2 P\left( t_{n-2} > \left| \frac{\hat\beta_0}{SE(\beta_0)} \right| \right) \qquad \text{and} \qquad 2 P\left( t_{n-2} > \left| \frac{\hat\beta_1}{SE(\beta_1)} \right| \right)$$

These test the null hypothesis that the corresponding parameter is zero.

We can construct $100(1 - \alpha)\%$ confidence intervals via

$$\hat\beta_0 \pm t_{n-2}(1 - \alpha/2) \, SE(\beta_0) \qquad \text{and} \qquad \hat\beta_1 \pm t_{n-2}(1 - \alpha/2) \, SE(\beta_1)$$

These provide ranges of the parameter consistent with the data.
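The pvalue and interval formulas above can be sketched in Python using only the standard library. This is an illustrative approximation, not a production routine: the t distribution is handled by Simpson's rule integration of its density and a bisection search for the quantile (a statistics library would normally be used instead), and all function names are hypothetical. The inputs in the usage lines are the slope estimate, standard error, and error degrees of freedom reported in the SAS output elsewhere in these slides.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def t_cdf(t, df, steps=2000):
    """P(T <= t) via Simpson's rule on [0, |t|] plus symmetry (steps is even)."""
    a = abs(t)
    if a == 0.0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area


def t_quantile(p, df):
    """Approximate quantile by bisection; assumes the answer lies in [-100, 100]."""
    if p < 0.5:
        return -t_quantile(1 - p, df)
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def two_sided_p_value(estimate, se, df):
    """2 * P(t_df > |estimate / se|): tests that the parameter is zero."""
    return 2 * (1 - t_cdf(abs(estimate / se), df))


def confidence_interval(estimate, se, df, alpha=0.05):
    """estimate +/- t_df(1 - alpha/2) * se."""
    q = t_quantile(1 - alpha / 2, df)
    return estimate - q * se, estimate + q * se


# Slope estimate and standard error from the SAS output (n = 39, d.f. = 37)
p_slope = two_sided_p_value(-0.02637, 0.00909, 37)
ci_slope = confidence_interval(-0.02637, 0.00909, 37)
print(p_slope, ci_slope)
```

Because the inputs are rounded to five decimals, the results should match the reported pvalue and confidence limits only up to rounding in the last digit.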

Figure: Telomere length vs. years post diagnosis (jittered), repeated in the Pvalues and confidence intervals section.

Model: Pvalues and confidence intervals

SAS code:

    DATA t;
      INFILE 'telomeres.csv' DSD FIRSTOBS=2;
      INPUT years length;
    RUN;

    PROC REG DATA=t;
      /* CLB requests the 95% confidence limits shown in the output */
      MODEL length = years / CLB;
    RUN;

SAS output:

    The REG Procedure
    Model: MODEL1
    Dependent Variable: length

    Number of Observations Read    39
    Number of Observations Used    39

    Analysis of Variance

    Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model               1           0.22777        0.22777       8.42    0.0062
    Error              37           1.00033        0.02704
    Corrected Total    38           1.22810

    Root MSE           0.16443    R-Square    0.1855
    Dependent Mean     1.22026    Adj R-Sq    0.1634
    Coeff Var         13.47473

    Parameter Estimates

    Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|    95% Confidence Limits
    Intercept     1               1.36768           0.05721      23.91      <.0001     1.25176     1.48360
    years         1              -0.02637           0.00909      -2.90      0.0062    -0.04479    -0.00796
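The reported output can be tied back to the formulas from the earlier slides with a few lines of arithmetic. The Python sketch below copies the numbers from the output above and recomputes the derived quantities; the variable names and the helper `expected_length` are hypothetical, and small discrepancies are expected because the inputs are rounded.

```python
import math

# Values copied from the PROC REG output above
ss_model, ss_error, ss_total = 0.22777, 1.00033, 1.22810
df_error = 37                     # n - 2 with n = 39
beta0_hat, se_beta0 = 1.36768, 0.05721
beta1_hat, se_beta1 = -0.02637, 0.00909

mse = ss_error / df_error         # sigma^2 estimate: SSE / (n - 2)
root_mse = math.sqrt(mse)         # sigma estimate, reported as Root MSE
r_square = (ss_total - ss_error) / ss_total   # (SST - SSE) / SST
f_value = ss_model / mse          # model has 1 degree of freedom
t_intercept = beta0_hat / se_beta0
t_years = beta1_hat / se_beta1


def expected_length(years):
    """Estimated mean telomere length at a given number of years post diagnosis."""
    return beta0_hat + beta1_hat * years


print(f"Root MSE   {root_mse:.5f}")
print(f"R-Square   {r_square:.4f}")
print(f"F value    {f_value:.2f}")
print(f"t values   {t_intercept:.2f}  {t_years:.2f}")
print(f"Estimated mean length at 5 years: {expected_length(5):.5f}")
```

In terms of the interpretation slide, the intercept 1.36768 is the estimated mean telomere length at zero years post diagnosis, and each additional year is associated with an estimated decrease of 0.02637 in mean telomere length.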
