Rcourse: Linear model
Sonja Grath, Noémie Becker & Dirk Metzler
Winter semester 2014-15
Background and basics
Contents
1 Background and basics
2 Analysis of variance
3 Model checking
Background and basics
Intuitive linear regression
What is linear regression? It is the straight line that best approximates a set of points: y = a + b·x, where a is called the intercept and b the slope.
Background and basics
Linear regression by eye
I give you the following points:
x <- 0:8 ; y <- c(12,10,8,11,6,7,2,3,3) ; plot(x,y)
[Figure: scatter plot of y against x]
By eye we would say a = 12 and b = (2 − 12)/8 = −1.25.
Background and basics
Best fit in R
y is modelled as a function of x. In R this job is done by the function lm(). Let's try it on the R console.
[Figure: scatter plot with the fitted regression line and vertical residual segments]
The linear model does not explain all of the variation. The remaining error is called the "residual". The purpose of linear regression is to minimize this error. But do you remember how we do this?
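A minimal sketch of the console session described above, fitting the example points with lm() (the object name regr.obj follows the later slides):

```r
# Fit y as a linear function of x for the example points
x <- 0:8
y <- c(12, 10, 8, 11, 6, 7, 2, 3, 3)
regr.obj <- lm(y ~ x)
coef(regr.obj)  # estimated intercept (about 11.76) and slope (about -1.22)
```

Note the formula syntax `y ~ x`, which reads "y is modelled as a function of x".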
Background and basics
Statistics
We define the linear regression $y = \hat{a} + \hat{b} \cdot x$ by minimizing the sum of the squares of the residuals:
$(\hat{a}, \hat{b}) = \arg\min_{(a,b)} \sum_i \big(y_i - (a + b \cdot x_i)\big)^2$
This assumes that a and b exist such that for all $(x_i, y_i)$, $y_i = a + b \cdot x_i + \varepsilon_i$, where all $\varepsilon_i$ are independent and follow the normal distribution with variance $\sigma^2$.
Background and basics
Statistics
We estimate a and b by calculating
$(\hat{a}, \hat{b}) := \arg\min_{(a,b)} \sum_i \big(y_i - (a + b \cdot x_i)\big)^2$
We can calculate $\hat{a}$ and $\hat{b}$ by
$\hat{b} = \frac{\sum_i (y_i - \bar{y}) \cdot (x_i - \bar{x})}{\sum_i (x_i - \bar{x})^2} = \frac{\sum_i y_i \cdot (x_i - \bar{x})}{\sum_i (x_i - \bar{x})^2}$
and $\hat{a} = \bar{y} - \hat{b} \cdot \bar{x}$.
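The closed-form estimates above can be checked against lm() directly; a sketch using the example data from the earlier slides:

```r
# Closed-form least-squares estimates for the example data
x <- 0:8
y <- c(12, 10, 8, 11, 6, 7, 2, 3, 3)
b.hat <- sum((y - mean(y)) * (x - mean(x))) / sum((x - mean(x))^2)
a.hat <- mean(y) - b.hat * mean(x)
# The second formula for the slope gives the same value,
# because sum(mean(y) * (x - mean(x))) is zero:
b.hat2 <- sum(y * (x - mean(x))) / sum((x - mean(x))^2)
c(a.hat, b.hat)  # agrees with coef(lm(y ~ x))
```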
Background and basics
Back to our example
The commands used to produce this graph are the following:
[Figure: scatter plot with the regression line and vertical residual segments]
regr.obj <- lm(y ~ x)
fitted <- predict(regr.obj)
plot(x,y); abline(regr.obj)
for(i in 1:9) { lines(c(x[i],x[i]),c(y[i],fitted[i])) }
Analysis of variance
Contents
1 Background and basics
2 Analysis of variance
3 Model checking
Analysis of variance
Reminder: ANOVA
I am sure you all remember from statistics courses: We observe different mean values for different groups.
[Figure: two strip charts of observed values (Beobachtungswert) for three groups (Gruppe 1-3); left panel: high variability within groups; right panel: low variability within groups]
Could it be just by chance? It depends on the variability of the group means and of the values within groups.
Analysis of variance
Reminder: ANOVA
ANOVA table ("ANalysis Of VAriance")

            DF   SS      SS/DF   F value
Groups       1   88.82   88.82   30.97
Residuals    7   20.07    2.87

(DF = degrees of freedom, SS = sum of squares, SS/DF = mean sum of squares)

Under the hypothesis H0 "the group mean values are equal" (and the values are normally distributed), F is Fisher-distributed with 1 and 7 DF, $p = \mathrm{Fisher}_{1,7}([30.97, \infty)) \leq 8 \cdot 10^{-4}$. We can reject H0.
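The tail probability above can be reproduced with R's F-distribution function; the raw data behind the table are not shown, so only the F value from the table is used here:

```r
# p-value for F = 30.97 with 1 and 7 degrees of freedom:
# the upper tail of the Fisher (F) distribution
p <- pf(30.97, df1 = 1, df2 = 7, lower.tail = FALSE)
p  # well below the usual 0.05 threshold, so H0 is rejected
```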
Analysis of variance
ANOVA in R
In R, ANOVA is performed using summary.aov() and summary(). These functions apply to a fitted regression, i.e. the result of the command lm(). summary.aov() gives you only the ANOVA table, whereas summary() outputs other information such as residuals, R-squared, etc. Let's see a couple of examples with self-generated data in R.
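One possible self-generated example (the group names and numbers are illustrative, not from the slides): two groups with clearly different means, analysed via lm():

```r
# Self-generated data: two groups of 10 values with different means
set.seed(1)
group <- factor(rep(c("A", "B"), each = 10))
value <- c(rnorm(10, mean = 0), rnorm(10, mean = 3))
fit <- lm(value ~ group)
summary.aov(fit)  # the ANOVA table only
summary(fit)      # coefficients, residuals, R-squared, ...
```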
Model checking
Contents
1 Background and basics
2 Analysis of variance
3 Model checking
Model checking
Model checking
When you perform a linear model you have to check not only the p-values of your effects but also the variance and the normality of the residuals. Why? This is because we assumed in our model that the residuals are normally distributed and have the same variance. In R you can do that directly by using the function plot() on your regression object. Let's try it on one example. We will focus on the first two graphs.
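A sketch of the diagnostic plots, reusing the example data from the regression slides; plot() applied to an lm object produces the diagnostic graphs, and `which = 1:2` selects the first two:

```r
# Diagnostic plots for a fitted linear model
x <- 0:8
y <- c(12, 10, 8, 11, 6, 7, 2, 3, 3)
regr.obj <- lm(y ~ x)
par(mfrow = c(1, 2))
plot(regr.obj, which = 1:2)  # 1: residuals vs fitted, 2: normal Q-Q plot
```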
Model checking
Model checking: Good example
This is what it should look like:
[Figure: diagnostic plots of a well-behaved model]
On the first graph (residuals vs fitted values), we should see no trend (equal variance). On the second graph (normal Q-Q plot), points should be close to the line (normality).
Model checking
Model checking: Bad example
This is a more problematic case: