Correlation and Regression


  1. Chapter 9: Correlation and Regression
     - 9-1 Overview
     - 9-2 Correlation
     - 9-3 Regression
     - 9-4 Variation and Prediction Intervals
     - 9-5 Multiple Regression
     - 9-6 Modeling

     Chapter 9, Triola, Elementary Statistics, MATH 1342

  2. Section 9-1 & 9-2: Overview and Correlation and Regression. Created by Erin Hodgess, Houston, Texas.

  3. Overview: Paired Data (p.506)
     - Is there a relationship?
     - If so, what is the equation?
     - Use that equation for prediction.

  4. Definition: A correlation exists between two variables when one of them is related to the other in some way.

  5. Definition: A scatterplot (or scatter diagram) is a graph in which the paired (x, y) sample data are plotted with a horizontal x-axis and a vertical y-axis. Each individual (x, y) pair is plotted as a single point.

  6. Scatter Diagram of Paired Data (p.507)

  7. Positive Linear Correlation (p.498). Figure 9-2: Scatter Plots

  8. Negative Linear Correlation. Figure 9-2: Scatter Plots

  9. No Linear Correlation. Figure 9-2: Scatter Plots

  10. Definition (p.509): The linear correlation coefficient r measures the strength of the linear relationship between paired x and y values in a sample.

  11. Assumptions (p.507)
      1. The sample of paired data (x, y) is a random sample.
      2. The pairs of (x, y) data have a bivariate normal distribution.

  12. Notation for the Linear Correlation Coefficient
      - n = number of pairs of data presented.
      - Σ denotes the addition of the items indicated.
      - Σx denotes the sum of all x-values.
      - Σx² indicates that each x-value should be squared and then those squares added.
      - (Σx)² indicates that the x-values should be added and the total then squared.
      - Σxy indicates that each x-value should first be multiplied by its corresponding y-value. After obtaining all such products, find their sum.
      - r represents the linear correlation coefficient for a sample.
      - ρ represents the linear correlation coefficient for a population.
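
The difference between Σx² and (Σx)² is easy to miss, so the notation can be sketched in a few lines of Python. This is only an illustration: it borrows the four (x, y) pairs from the "Calculating r" slides of this deck, and the variable names are made up for the example.

```python
# Paired sample data from the "Calculating r" slides of this deck.
x = [1, 1, 3, 5]
y = [2, 8, 6, 4]

n = len(x)                                  # n: number of pairs
sum_x = sum(x)                              # Σx
sum_y = sum(y)                              # Σy
sum_x_sq = sum(v * v for v in x)            # Σx²: square each value, then add
sq_sum_x = sum_x ** 2                       # (Σx)²: add first, then square the total
sum_xy = sum(a * b for a, b in zip(x, y))   # Σxy: multiply each pair, then add

print(n, sum_x, sum_y, sum_x_sq, sq_sum_x, sum_xy)  # 4 10 20 36 100 48
```

Note that Σx² (36) and (Σx)² (100) differ; both appear in the formula for r.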

  13. Definition: The linear correlation coefficient r measures the strength of a linear relationship between the paired values in a sample.

      Formula 9-1:

      $$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}$$

      Calculators can compute r. ρ (rho) is the linear correlation coefficient for all paired data in the population.

  14. Rounding the Linear Correlation Coefficient r
      - Round to three decimal places so that it can be compared to critical values in Table A-6 (see p.510).
      - Use a calculator or computer if possible.

  15. Calculating r: Data

      x | 1 | 1 | 3 | 5
      y | 2 | 8 | 6 | 4

      This data is from exercise #7 on p.521.

  16. Calculating r

  17. Calculating r: Data

      x | 1 | 1 | 3 | 5
      y | 2 | 8 | 6 | 4

      $$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}$$

      $$r = \frac{4(48) - (10)(20)}{\sqrt{4(36) - (10)^2}\,\sqrt{4(120) - (20)^2}} = \frac{-8}{59.330} = -0.135$$
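
The hand calculation on this slide can be checked with a short Python sketch of the sample correlation formula (Formula 9-1 in the text). The function name `linear_correlation` is an illustrative choice, not something from the slides.

```python
import math

def linear_correlation(x, y):
    """Sample linear correlation coefficient r for paired lists x and y."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Data from the slide (exercise #7).
r = linear_correlation([1, 1, 3, 5], [2, 8, 6, 4])
print(round(r, 3))  # -0.135
```

Rounding to three decimal places matches the rounding rule given earlier for comparing r against Table A-6.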

  18. Interpreting the Linear Correlation Coefficient (p.511)
      - If the absolute value of r exceeds the value in Table A-6, conclude that there is a significant linear correlation.
      - Otherwise, there is not sufficient evidence to support the conclusion of significant linear correlation.

  19. Example: Boats and Manatees. Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant linear correlation between the number of registered boats and the number of manatees killed by boats.

      Using the same procedure previously illustrated, we find that r = 0.922. Referring to Table A-6, we locate the row for which n = 10. Using the critical value for α = 0.05, we have 0.632. Because the absolute value of r = 0.922 exceeds 0.632, we conclude that there is a significant linear correlation between the number of registered boats and the number of manatee deaths from boats.

  20. Properties of the Linear Correlation Coefficient r (see also p.512)
      1. –1 ≤ r ≤ 1
      2. The value of r does not change if all values of either variable are converted to a different scale.
      3. The value of r is not affected by the choice of x and y: interchange x and y and the value of r will not change.
      4. r measures the strength of a linear relationship.
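
Properties 1 to 3 can be checked numerically. The sketch below reuses the four pairs from the "Calculating r" slides. The factor 2.54 is a hypothetical rescaling (as if x were converted from inches to centimeters) chosen only for illustration, and `linear_correlation` is an illustrative helper name.

```python
import math

def linear_correlation(x, y):
    """Sample linear correlation coefficient r for paired lists x and y."""
    n = len(x)
    num = n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)
    den_x = math.sqrt(n * sum(a * a for a in x) - sum(x) ** 2)
    den_y = math.sqrt(n * sum(b * b for b in y) - sum(y) ** 2)
    return num / (den_x * den_y)

x = [1, 1, 3, 5]
y = [2, 8, 6, 4]
r = linear_correlation(x, y)

# Property 1: r always lies between -1 and 1.
in_range = -1 <= r <= 1

# Property 2: converting one variable to a different scale leaves r unchanged.
x_rescaled = [2.54 * v for v in x]  # hypothetical unit conversion
same_after_rescale = math.isclose(r, linear_correlation(x_rescaled, y))

# Property 3: interchanging x and y leaves r unchanged.
same_after_swap = math.isclose(r, linear_correlation(y, x))

print(in_range, same_after_rescale, same_after_swap)
```

`math.isclose` is used rather than `==` because rescaling introduces tiny floating-point differences.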

  21. Interpreting r: Explained Variation (p.503 and p.533). The value of r² is the proportion of the variation in y that is explained by the linear relationship between x and y.

  22. Example: Boats and Manatees. Using the boat/manatee data in Table 9-1, we have found that the value of the linear correlation coefficient is r = 0.922. What proportion of the variation in manatee deaths can be explained by the variation in the number of boat registrations?

      With r = 0.922, we get r² = 0.850. We conclude that 0.850 (or about 85%) of the variation in manatee deaths can be explained by the linear relationship between the number of boat registrations and the number of manatee deaths from boats. This implies that 15% of the variation in manatee deaths cannot be explained by the number of boat registrations.
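
The arithmetic in this example is small enough to verify directly. This sketch starts from the value r = 0.922 reported on the slide; the variable names are illustrative.

```python
r = 0.922                           # value reported for the boat/manatee data

r_squared = round(r ** 2, 3)        # proportion of variation in y explained
unexplained = round(1 - r ** 2, 3)  # proportion left unexplained

print(r_squared, unexplained)  # 0.85 0.15
```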

  23. Common Errors Involving Correlation (pp.503-504)
      1. Causation: It is wrong to conclude that correlation implies causality.
      2. Averages: Averages suppress individual variation and may inflate the correlation coefficient.
      3. Linearity: There may be some relationship between x and y even when there is no significant linear correlation.

  24. Common Errors Involving Correlation. Figure 9-3: Scatterplot of Distance above Ground and Time for Object Thrown Upward
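
The thrown-object figure illustrates the linearity error: height depends entirely on time, yet the linear correlation can be zero. The sketch below uses hypothetical numbers, not the data behind Figure 9-3. It assumes the height follows h(t) = 20t − 5t² and is sampled at t = 0, 1, 2, 3, 4.

```python
import math

def linear_correlation(x, y):
    """Sample linear correlation coefficient r for paired lists x and y."""
    n = len(x)
    num = n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)
    den_x = math.sqrt(n * sum(a * a for a in x) - sum(x) ** 2)
    den_y = math.sqrt(n * sum(b * b for b in y) - sum(y) ** 2)
    return num / (den_x * den_y)

# Hypothetical trajectory: up and back down, symmetric around t = 2.
times = [0, 1, 2, 3, 4]
heights = [20 * t - 5 * t ** 2 for t in times]  # [0, 15, 20, 15, 0]

r = linear_correlation(times, heights)
print(r)  # 0.0
```

Here r is 0 even though height is an exact function of time, because the relationship is quadratic rather than linear.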

  25. Formal Hypothesis Test (p.504)
      - We wish to determine whether there is a significant linear correlation between two variables.
      - We present two methods.
      - Both methods let
        H₀: ρ = 0 (no significant linear correlation)
        H₁: ρ ≠ 0 (significant linear correlation)

  26. Figure 9-4: Testing for a Linear Correlation (p.505)

  27. Method 1: Test Statistic is t (follows format of earlier chapters)
      - Test statistic:

        $$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$

      - Critical values: Use Table A-3 with degrees of freedom = n – 2.

  28. Method 2: Test Statistic is r (uses fewer calculations)
      - Test statistic: r
      - Critical values: Refer to Table A-6 (no degrees of freedom).

  29. Example: Boats and Manatees. Using the boat/manatee data in Table 9-1, test the claim that there is a linear correlation between the number of registered boats and the number of manatee deaths from boats. Use Method 1.

      $$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \frac{0.922}{\sqrt{\dfrac{1 - 0.922^2}{10 - 2}}} = 6.735$$
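
The Method 1 computation can be reproduced with a short sketch. The values r = 0.922 and n = 10 come from the slide. The critical value 2.306 is an assumption added here: it is the standard two-tailed t critical value for 8 degrees of freedom at α = 0.05, which the slide itself does not state.

```python
import math

r = 0.922   # sample correlation from the slide
n = 10      # number of data pairs

# Test statistic for H0: rho = 0, with n - 2 degrees of freedom.
t = r / math.sqrt((1 - r ** 2) / (n - 2))
print(round(t, 3))  # 6.735

# Assumed critical value (two-tailed, alpha = 0.05, df = 8); not given on the slide.
critical_t = 2.306
reject_h0 = abs(t) > critical_t
print(reject_h0)  # True
```

Under that assumed significance level, the test statistic falls well inside the critical region, which agrees with the Method 2 conclusion for the same data.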

  30. Method 1: Test Statistic is t (follows format of earlier chapters). Figure 9-5 (p.516)

  31. Example: Boats and Manatees. Using the boat/manatee data in Table 9-1, test the claim that there is a linear correlation between the number of registered boats and the number of manatee deaths from boats. Use Method 2.

      The test statistic is r = 0.922. The critical values of r = ±0.632 are found in Table A-6 with n = 10 and α = 0.05.
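
The Method 2 decision rule reduces to a single comparison. In this sketch the critical value is typed in from the slide (Table A-6, n = 10, α = 0.05) rather than computed, and `is_significant` is a hypothetical helper name.

```python
def is_significant(r, critical_r):
    """Method 2: reject H0 (rho = 0) when |r| exceeds the table's critical value."""
    return abs(r) > critical_r

r = 0.922           # test statistic from the slide
critical_r = 0.632  # Table A-6 value for n = 10, alpha = 0.05 (from the slide)

if is_significant(r, critical_r):
    print("Significant linear correlation")
else:
    print("Not sufficient evidence of a significant linear correlation")
```

For comparison, the r = –0.135 computed earlier from four pairs would need its own Table A-6 row (n = 4), which this deck does not quote.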

  32. Method 2: Test Statistic is r (uses fewer calculations)
      - Test statistic: r
      - Critical values: Refer to Table A-6 with n = 10 (no degrees of freedom).

      Figure 9-6 (p.507)
