
STAT 113: Variability. Colin Reimer Dawson, Oberlin College



  1. STAT 113: Variability
     Colin Reimer Dawson, Oberlin College
     September 14, 2017
     Sections: Last Time: Shape and Center; Variability; Variance and Standard Deviation; Transformations

  2. Outline
     • Last Time: Shape and Center
     • Variability
     • Boxplots and the IQR
     • Variance and Standard Deviation
     • Transformations

  3. Distribution of a Quantitative Variable
     The distribution of a quantitative variable is characterized by:
     A. Shape (symmetric, skewed, bimodal, etc.)
     B. Center (mean, median)
     C. Spread (Interquartile Range, Standard Deviation)
     D. Outliers (if any)

  4. Skewness
     • A distribution is skewed when the extreme values on one side are more extreme than those on the other.
     • We call a distribution right-skewed when the longer “tail” is on the right, and left-skewed when the longer tail is on the left.

  5. Distribution of a Quantitative Variable
     The distribution of a numeric variable is characterized by:
     A. Shape (symmetric, skewed, bimodal, etc.)
     B. Center (mean, median)
     C. Spread (Interquartile Range, Standard Deviation)
     D. Outliers (if any)

  6. Resistance/Robustness
     • The mean is strongly affected by skew and by outliers.
     • The mean is pulled toward the extreme values.
     • In these cases, we generally prefer a measure of central tendency that is resistant to the influence of extreme values (also called robust).
     • The median is a resistant/robust measure of center.
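
The pull of an extreme value on the mean can be seen with a small sketch. The income values below are invented for illustration and are not from the slides; only Python's standard `statistics` module is used.

```python
from statistics import mean, median

# Hypothetical household incomes in thousands of dollars (made-up values).
incomes = [32, 41, 45, 50, 58]

# The same data with one extreme value added on the right.
with_outlier = incomes + [900]

# Without the outlier, the mean and median are close together.
print(mean(incomes), median(incomes))

# With the outlier, the mean is pulled far to the right,
# while the median moves only slightly.
print(mean(with_outlier), median(with_outlier))
```

Here the single added value shifts the mean by more than 100 (thousand dollars), while the median changes from 45 to 47.5.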

  7. Outline
     • Last Time: Shape and Center
     • Variability
     • Boxplots and the IQR
     • Variance and Standard Deviation
     • Transformations

  8. Distribution of a Quantitative Variable
     The distribution of a numeric variable is characterized by:
     A. Shape (symmetric, skewed, bimodal, etc.)
     B. Center (mean, median)
     C. Spread (Interquartile Range, Standard Deviation)
     D. Outliers (if any)

  9. Measures of Variability
     • We want to quantify the consistency, or lack thereof, of the data.
     • A general term for “lack of consistency” is variability.
     • We will look at:
       • Range
       • Interquartile Range
       • Variance / Standard Deviation

  10. The Range
      The range is easy to compute, but not very reliable.
      [Figure: Historical Annual Returns for Two Hypothetical Index Funds. Panels: Fund C1, Fund C2; horizontal axis from −20 to 30.]

  11. The Range
      The range is easy to compute, but not very reliable.
      [Figure: Annual Returns for 3 random samples of 5 years. Panels: Fund E (Full Data Set), Fund Sample 1, Fund Sample 2, Fund Sample 3; horizontal axis from −10 to 15.]
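
A rough simulation of the idea in the figure: the range of a small sample can differ a lot from one sample to the next. The returns below are randomly generated stand-ins, not the slide's Fund E data, and `value_range` is a helper name chosen for this sketch.

```python
import random

def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

random.seed(113)  # fixed seed so the sketch is repeatable

# Hypothetical annual returns (percent) for 100 years; invented data.
returns = [random.gauss(3, 5) for _ in range(100)]

# Three random samples of 5 years each, mirroring the figure's setup.
samples = [random.sample(returns, 5) for _ in range(3)]

print("full data range:", round(value_range(returns), 1))
for i, sample in enumerate(samples, start=1):
    print(f"sample {i} range:", round(value_range(sample), 1))
```

A sample's range can never exceed the range of the full data set, and it depends entirely on which two extreme years happen to be drawn.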

  12. Outline
      • Last Time: Shape and Center
      • Variability
      • Boxplots and the IQR
      • Variance and Standard Deviation
      • Transformations

  13. Robust Measures of Variability
      • We’d like a more robust measure of variability, which is not affected so much by extreme values.
      • Analogous to the median: describe the “middle” part of the data.
      • The idea: find the “middle half” of the data, and then take its range.
      • Specifically, exclude the lowest 25% and the highest 25%, and take the difference between the highest and lowest remaining values.
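
The "middle half" recipe above can be sketched literally. This is a simplified illustration, assuming the number of values to drop from each end is rounded down; standard quartile rules handle the boundaries more carefully. The function name and the heights are made up.

```python
def middle_half_range(values):
    """Drop the lowest 25% and highest 25% of the values,
    then take the range of what is left."""
    ordered = sorted(values)
    k = len(ordered) // 4  # values to drop from each end (rounded down)
    middle = ordered[k:len(ordered) - k]
    return max(middle) - min(middle)

heights = [20, 31, 33, 34, 35, 36, 38, 50]  # hypothetical heights in inches

print(max(heights) - min(heights))  # full range: 30
print(middle_half_range(heights))   # range of the middle half: 3
```

The two extreme heights (20 and 50) drive the full range but are excluded from the middle half.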

  14. Quartiles
      • The median divides the data in two.
      • Percentiles divide the data into 100 pieces.
      • Quartiles divide the data into four pieces. The kth quartile (written Q_k) is the point below which k quarters of the data lie.
      • So, in terms of quartiles, the median is Q_2, the minimum value is Q_0, and the maximum value is Q_4.
      • We can calculate the range using quartiles as Q_4 − Q_0.
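
A small sketch of these relationships using Python's standard `statistics.quantiles`. The heights are invented, and the `"inclusive"` method is one of several quartile conventions; other conventions can give slightly different values for the first and third quartiles.

```python
from statistics import median, quantiles

heights = [22, 27, 30, 31, 33, 35, 36, 40, 48]  # hypothetical heights in inches

# quantiles(..., n=4) returns the three interior cut points Q1, Q2, Q3.
# method="inclusive" treats the minimum as the 0th percentile and the
# maximum as the 100th percentile.
q1, q2, q3 = quantiles(heights, n=4, method="inclusive")
q0, q4 = min(heights), max(heights)

print(q0, q1, q2, q3, q4)       # 22 30.0 33.0 36.0 48
print(q2 == median(heights))    # True: Q2 is the median
print(q4 - q0)                  # 26: the range, written with quartiles
```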

  15. Quartiles
      [Figure: Height (in.) data on an axis from 20 to 50, with Q_0, Q_1, Q_2, Q_3, and Q_4 marked.]

  16. The Inter-Quartile Range (IQR)
      The Inter-Quartile Range (or IQR) is the distance between the first and third quartiles:
      IQR = Q_3 − Q_1
      Pedantic Note: The IQR is a single number, not the two quartiles themselves.
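
To illustrate why the IQR is called robust, the sketch below compares it with the range when one value becomes extreme. The returns are invented, `iqr` is a helper written for this example, and the `"inclusive"` quartile convention is an assumption.

```python
from statistics import quantiles

def iqr(values):
    """Inter-quartile range: Q3 minus Q1, a single number."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

returns = [-4, -1, 0, 2, 3, 5, 6, 8, 11]  # hypothetical annual returns (percent)
one_bad_year = [-40] + returns[1:]        # same data, lowest value made extreme

for data in (returns, one_bad_year):
    # Print the range first, then the IQR, for each data set.
    print(max(data) - min(data), iqr(data))
```

The range grows from 15 to 51 when the lowest year changes, while the IQR stays at 6.0 in both cases.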

  17. The Inter-Quartile Range (IQR)
      [Figure: Height (in.) data on an axis from 20 to 50, with Q_0 through Q_4, the Range, and the IQR marked.]

  18. The Five-Number Summary
      • The quartiles are very natural to report together to describe the center and spread of a distribution.
      • Q_0 through Q_4 collectively form the five-number summary of a quantitative distribution.
      Five-Number Summary = (x_min, Q_1, Median, Q_3, x_max) = (Q_0, Q_1, Q_2, Q_3, Q_4)
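
A minimal sketch of a five-number summary in Python. The function name and data are hypothetical, and the quartiles use the standard library's `"inclusive"` convention, which may differ slightly from the values other software reports.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (Q0, Q1, Q2, Q3, Q4) = (min, Q1, median, Q3, max)."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))

heights = [22, 27, 30, 31, 33, 35, 36, 40, 48]  # hypothetical heights in inches
print(five_number_summary(heights))  # (22, 30.0, 33.0, 36.0, 48)
```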

  19. Box-and-Whisker Plots
      From the five-number summary, we construct a graph called a box-and-whisker plot (or just box plot, for short):
      1. Draw an axis.
      2. Draw a rectangle (box) from Q_1 to Q_3.
      3. Draw a line across the box (or place a dot) at Q_2.
      4. Draw lines (whiskers) extending outward from the box on both sides to either
         (a) (Simplest version) x_min and x_max.
         (b) (R default) the most extreme data values no farther out than Q_1 − 1.5 IQR and Q_3 + 1.5 IQR.
      5. In version (b), plot points beyond the whiskers individually.
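
The numbers behind a version (b) box plot can be sketched without drawing anything. In this sketch the whiskers stop at the most extreme data values that lie within the 1.5 IQR limits, which is how R's `boxplot` draws them. The function name, the data, and the `"inclusive"` quartile convention are assumptions, so the quartiles may not match R's exactly.

```python
from statistics import quantiles

def boxplot_stats(values, k=1.5):
    """Compute the box, whisker ends, and individually plotted points."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Limits beyond which points are plotted individually.
    low_limit, high_limit = q1 - k * iqr, q3 + k * iqr
    inside = [v for v in values if low_limit <= v <= high_limit]
    beyond = [v for v in values if v < low_limit or v > high_limit]
    return {
        "box": (q1, q2, q3),
        "whiskers": (min(inside), max(inside)),
        "points_beyond": sorted(beyond),
    }

heights = [22, 27, 30, 31, 33, 35, 36, 40, 48]  # hypothetical heights in inches
print(boxplot_stats(heights))
```

For these heights the limits are 21 and 45, so the whiskers end at 22 and 40, and 48 is plotted as an individual point.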

  20. Box-and-Whisker Plot: Version 1
      [Figure: Height (in.) data on an axis from 20 to 50 with Q_0 through Q_4, the Range, and the IQR marked, shown with the corresponding box-and-whisker plot.]

  21. Box-and-Whisker Plot: Version 2
      [Figure: Height (in.) data on an axis from 20 to 50 with Q_0 through Q_4, the Range, and the IQR marked, shown with the corresponding box-and-whisker plot.]

  22. Box-and-Whisker Plot: Right Skew
      [Figure: density of 2001 Household Income (Thousands of 2016$), axis from 0 to 2000, shown with a box-and-whisker plot of the same variable.]

  23. Box-and-Whisker Plot: Right Skew
      [Figure: density of 2001 Household Income (Thousands of 2016$), axis from 0 to 2000, shown with a box-and-whisker plot of the same variable that includes individually plotted points.]

  24. Matching Graphs to Variables
      Handout

  25. Outline
      • Last Time: Shape and Center
      • Variability
      • Boxplots and the IQR
      • Variance and Standard Deviation
      • Transformations
