

  1. Workshop 7.2a: Introduction to Linear models Murray Logan 19 Jul 2017

  2. Section 1 Revision

  3. Aims of statistical modelling
     Use samples to:
     • describe relationships
     • test inferences (relationships/effects)
     • build predictive models

  4. Mathematical models
     y = β0 + β1 x
     [figure: a straight line passing exactly through the points, y against x for x = 0 to 6]

  5. Statistical models
     y = β0 + β1 x + ε,  ε ~ Norm(0, σ²)
     [figure: observations scattered around a straight line, y against x]
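
A minimal R sketch of the contrast between these two slides: the mathematical model places every point exactly on the line, while the statistical model adds Normal errors. The parameter values and seed below are illustrative, not taken from the workshop.

    set.seed(1)
    x <- 0:6
    beta0 <- 2; beta1 <- 1.5; sigma <- 1             # illustrative values only
    y_exact <- beta0 + beta1 * x                     # mathematical model: no error
    y_obs   <- y_exact + rnorm(length(x), 0, sigma)  # statistical model: adds ε ~ N(0, σ²)
    plot(x, y_obs)                                   # scatter around the line
    abline(beta0, beta1)                             # the underlying straight line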

  6. Linear models
     y = β0 + β1 x + ε
     [figure: straight-line fit through the scattered observations]

  7. Linear models
     y = β0 + β1 x + β2 x²
     [figure: curvilinear (quadratic) fit through the scattered observations]
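
Note that "linear" refers to the parameters, not the shape of the curve: the quadratic above is still a linear model because y is a linear combination of β0, β1 and β2. A sketch in R, with made-up data:

    x <- 0:6
    y <- c(2, 9, 22, 41, 66, 97, 118)   # hypothetical curvilinear data
    fit <- lm(y ~ x + I(x^2))           # I() keeps x^2 as a literal square in the formula
    coef(fit)                           # estimates of β0, β1 and β2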

  8. Non-linear models
     y = α β^x
     [figure: exponential curve through the scattered observations]
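
By contrast, y = αβ^x cannot be written as a linear combination of its parameters, so lm() does not apply; in R such models are fit with nls() and starting values. The data and starting values below are invented for illustration:

    x <- 0:6
    y <- c(2, 5, 11, 24, 52, 115, 250)  # hypothetical exponential data
    fit <- nls(y ~ alpha * beta^x, start = list(alpha = 2, beta = 2))
    coef(fit)                           # estimates of α and β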

  9. Linear models
     y_i = β0 + β1 × x_i + ε_i
     response variable = population intercept + (population slope × predictor variable) + error
     The intercept term (β0) and the slope term (β1 × x_i) make up the systematic
     component; the error (ε_i) is the stochastic component.

 10. Linear models
     y_i = β0 + β1 × x_i + ε_i
     response (vector) = intercept (single value) + slope (single value) × predictor (vector) + error (vector)
     As before, the intercept and slope terms form the systematic component and
     the error term the stochastic component.

 11. Vectors and Matrices
     Vector (has length ONLY):
       [ 3.0, 2.5, 6.0, 5.5, 9.0, 8.6, 12.0 ]
     Matrix (has length AND width):
       [ 1 0 ]
       [ 1 1 ]
       [ 1 2 ]
       [ 1 3 ]
       [ 1 4 ]
       [ 1 5 ]
       [ 1 6 ]
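
The matrix on this slide is exactly the model matrix that R constructs from a model formula; a quick sketch using the data that appear on the following slides:

    Y <- c(3, 2.5, 6, 5.5, 9, 8.6, 12)  # the response vector (length only)
    X <- 0:6                            # the predictor vector
    model.matrix(~ X)                   # 7 x 2 matrix: a column of 1s, then X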

 12. Estimation
     y = β0 + β1 x + ε
     Ordinary Least Squares
     [figure: scatter of observations with the fitted straight line]

 13. Estimation
     Data:
       Y:  3.0  2.5  6.0  5.5  9.0  8.6  12.0
       X:    0    1    2    3    4    5     6
     Written out observation by observation:
       3.0 = β0 × 1 + β1 × 0 + ε1
       2.5 = β0 × 1 + β1 × 1 + ε2
       6.0 = β0 × 1 + β1 × 2 + ε3
       5.5 = β0 × 1 + β1 × 3 + ε4
       ...

 14. Estimation
       3.0  = β0 × 1 + β1 × 0 + ε1
       2.5  = β0 × 1 + β1 × 1 + ε2
       6.0  = β0 × 1 + β1 × 2 + ε3
       5.5  = β0 × 1 + β1 × 3 + ε4
       9.0  = β0 × 1 + β1 × 4 + ε5
       8.6  = β0 × 1 + β1 × 5 + ε6
       12.0 = β0 × 1 + β1 × 6 + ε7
     In matrix form:
       [  3.0 ]   [ 1 0 ]            [ ε1 ]
       [  2.5 ]   [ 1 1 ]            [ ε2 ]
       [  6.0 ]   [ 1 2 ]            [ ε3 ]
       [  5.5 ] = [ 1 3 ] × [ β0 ] + [ ε4 ]
       [  9.0 ]   [ 1 4 ]   [ β1 ]   [ ε5 ]
       [  8.6 ]   [ 1 5 ]            [ ε6 ]
       [ 12.0 ]   [ 1 6 ]            [ ε7 ]
     response values = model matrix × parameter vector + residual vector
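
The matrix equation above can be solved directly through the normal equations, b = (X'X)⁻¹ X'y, and checked against lm(). This is a sketch of the idea only; lm() itself uses a QR decomposition rather than an explicit inverse.

    y <- c(3, 2.5, 6, 5.5, 9, 8.6, 12)
    x <- 0:6
    X <- cbind(1, x)                        # the model matrix from the slide
    b <- solve(t(X) %*% X) %*% t(X) %*% y   # OLS estimates of β0 and β1
    b
    coef(lm(y ~ x))                         # lm() returns the same estimates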

 15. Inference testing
     H0: β1 = 0 (slope equals zero)
     The t-statistic:
       t = param / SE(param),  here  t = β1 / SE(β1)

 16. Inference testing
     H0: β1 = 0 (slope equals zero)
     The t-statistic and the t distribution
     [figure: t distribution plotted over the range −4 to 4]
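
A sketch of this test by hand in R, using the slide-13 data: pull β1 and its standard error out of the lm() summary, form t, and take a two-tailed p-value from the t distribution.

    x <- 0:6
    y <- c(3, 2.5, 6, 5.5, 9, 8.6, 12)
    fit <- lm(y ~ x)
    est <- summary(fit)$coefficients
    t_stat <- est["x", "Estimate"] / est["x", "Std. Error"]  # t = β1 / SE(β1)
    p_val  <- 2 * pt(-abs(t_stat), df = fit$df.residual)     # two-tailed p-value
    c(t = t_stat, p = p_val)     # matches est["x", c("t value", "Pr(>|t|)")]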

  17. Section 2 Linear model Assumptions

 18. Assumptions
     • Independence - unbiased, scale of treatment
     • Normality - residuals
     • Homogeneity of variance - residuals
     • Linearity
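
In practice these assumptions are checked from the residuals of the fitted model; a minimal sketch with the slide-13 data:

    x <- 0:6
    y <- c(3, 2.5, 6, 5.5, 9, 8.6, 12)
    fit <- lm(y ~ x)
    par(mfrow = c(1, 2))
    plot(fit, which = 1)   # residuals vs fitted: linearity, homogeneity of variance
    plot(fit, which = 2)   # Q-Q plot of residuals: normality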

 19. Assumptions: Normality
     [figure: scatterplots showing the distribution of y at each x, contrasting
     normally and non-normally distributed responses]

 20. Assumptions: Homogeneity of variance
     [figure: scatterplots of Y against X with the matching residuals-vs-predicted
     plots, contrasting equal and unequal variance]

 21. Assumptions: Linearity
     Trendline
     [figure: scatterplot with a straight trendline and the matching residual plot]

 22. Assumptions: Linearity
     Loess (lowess) smoother
     [figure: scatterplot with a loess smoother and the matching residual plot]

 23. Assumptions: Linearity
     Spline smoother
     [figure: scatterplot with a spline smoother and the matching residual plot]
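
The three linearity checks above (trendline, loess, spline) can be reproduced with base R; the data here are simulated purely for illustration:

    set.seed(2)
    x <- seq(0, 30, length.out = 25)
    y <- 2 * x + rnorm(25, sd = 5)      # hypothetical data
    plot(x, y)
    abline(lm(y ~ x))                   # straight trendline
    lines(lowess(x, y), lty = 2)        # loess (lowess) smoother
    lines(smooth.spline(x, y), lty = 3) # spline smoother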

 24. Assumptions
     y_i = β0 + β1 × x_i + ε_i,  ε_i ~ N(0, σ²)

 25. Assumptions
     y_i = β0 + β1 × x_i + ε_i,  ε_i ~ N(0, σ²)

 26. Example
     Make these data and call the data frame DATA:
       Y:  3.0  2.5  6.0  5.5  9.0  8.6  12.0
       X:    0    1    2    3    4    5     6

 27. Example
     Make these data and call the data frame DATA (try this):
     > DATA <- data.frame(Y=c(3, 2.5, 6.0, 5.5, 9.0, 8.6, 12), X=0:6)
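
A natural next step, not shown on this slide, is to fit the linear model to DATA; a sketch:

    fit <- lm(Y ~ X, data = DATA)       # assumes DATA from the line above
    summary(fit)                        # estimates, SEs and t-tests for β0 and β1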

 28. Worked Examples
     > fert <- read.csv('../data/fertilizer.csv', strip.white=T)
     > str(fert)
     'data.frame': 10 obs. of 2 variables:
      $ FERTILIZER: int  25 50 75 100 125 150 175 200 225 250
      $ YIELD     : int  84 80 90 154 148 169 206 244 212 248
     > head(fert)
       FERTILIZER YIELD
     1         25    84
     2         50    80
     3         75    90
     4        100   154
     5        125   148
     6        150   169
     > summary(fert)
        FERTILIZER         YIELD
      Min.   : 25.00   Min.   : 80.0
      1st Qu.: 81.25   1st Qu.:104.5
      Median :137.50   Median :161.5
      Mean   :137.50   Mean   :163.5
      3rd Qu.:193.75   3rd Qu.:210.5
      Max.   :250.00   Max.   :248.0
     > library(INLA)
     > fert.inla <- inla(YIELD ~ FERTILIZER, data=fert)
     > summary(fert.inla)
     Call: inla(formula = YIELD ~ FERTILIZER, data = fert)
     Time used:
      Pre-processing  Running inla  Post-processing   Total
              0.3043        0.0715           0.0217  0.3974
     Fixed effects:
                    mean      sd 0.025quant 0.5quant 0.975quant    mode kld
     (Intercept) 51.9341 12.9747    25.9582  51.9335    77.8990 51.9339   0
     FERTILIZER   0.8114  0.0836     0.6439   0.8114     0.9788  0.8114   0

     The model has no random effects
     Model hyperparameters:
                                               mean     sd 0.025quant 0.5quant 0.975quant   mode
     Precision for the Gaussian observations 0.0035 0.0015     0.0012   0.0032      0.007 0.0028

     Expected number of effective parameters (std dev): 2.00 (0.00)
     Number of equivalent replicates: 5.00
     Marginal log-Likelihood: -61.65
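
For comparison (an addition, not part of the slides): the same model fit by ordinary least squares. With a Gaussian response and INLA's default diffuse priors, the OLS estimates should sit very close to the posterior means reported above (intercept ≈ 52, slope ≈ 0.81).

    fert.lm <- lm(YIELD ~ FERTILIZER, data = fert)
    coef(fert.lm)                       # compare with INLA's fixed-effect means
    confint(fert.lm)                    # compare with the 0.025/0.975 quantiles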

 29. Worked Examples
     Question: is there a relationship between fertilizer concentration and grass yield?
     Linear model:
       Y_i = β0 + β1 F_i + ε_i,  ε_i ~ N(0, σ²)

 30. Example: Exploratory data analysis
     > library(car)
     > scatterplot(Y ~ X, data=DATA)
     [figure: scatterplot of Y against X for DATA]

 31. Example: Exploratory data analysis
     > library(car)
     > peake <- read.csv('../data/peake.csv')
     > scatterplot(SPECIES ~ AREA, data=peake)
     [figure: scatterplot of SPECIES against AREA]

 32. Example: Exploratory data analysis
     > scatterplot(SPECIES ~ AREA, data=peake, smoother=gamLine)
     [figure: scatterplot of SPECIES against AREA with a GAM smoother]
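
gamLine is one of several smoother functions supplied by the car package for this purpose; swapping in the loess smoother works the same way. (The smoother= argument matches the car version used in the workshop; newer car releases moved to a smooth= list interface.)

    library(car)
    scatterplot(SPECIES ~ AREA, data = peake, smoother = loessLine)  # loess instead of GAM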
