
STAT 113 Confidence Intervals, Colin Reimer Dawson, Oberlin College



  1. STAT 113 Confidence Intervals. Colin Reimer Dawson, Oberlin College. October 3, 2017.

  2. Outline: Sampling Distributions; Confidence Intervals; Bootstrap Confidence Intervals (Bootstrap Resampling, Bootstrap Confidence Intervals, Bootstrap Percentile Intervals)

  3. Two Main Goals of Inference
     1. Assessing strength of evidence about “yes/no” questions (hypothesis testing)
     2. Estimating unknown quantities in a population using a sample (confidence intervals)

  4. Statistics vs. Parameters
     • Summary values (like mean, median, standard deviation) can be computed for populations or for samples.
     • In a population, such a summary value is called a parameter.
     • In a sample, these values are called statistics, and are used to estimate the corresponding parameter.

     Value                 Population Parameter   Sample Statistic
     Mean                  $\mu$                  $\bar{X}$
     Proportion            $p$                    $\hat{p}$
     Correlation           $\rho$                 $r$
     Slope of a Line       $\beta_1$              $\hat{\beta}_1$
     Difference in Means   $\mu_1 - \mu_2$        $\bar{X}_1 - \bar{X}_2$
     ...                   ...                    ...

  5. Using Samples to Make Estimates About Populations
     • The set of all gumballs from my factory is my population.
     • The mean flavor-life in the population is a population parameter (write $\mu$ for the population mean).
     • Ideally I can test a random sample.
     • The mean flavor-life in the sample is a sample statistic (write $\bar{x}$ for the sample mean).
     Statistic : Sample :: Parameter : Population

  6. Variability due to Sampling
     • Samples are imperfect reflections of the population.
     • However, some populations are more compatible with the sample than others.
     • If we imagine a continuum of populations (or just population means), some are more plausible than others because they make the data more likely.

  7. Sampling Distributions
     Definition (sampling distribution): Consider all possible random samples of a fixed size, n, from a population. Each one has its own value for a particular statistic (like $\bar{x}$). A sampling distribution is the collection of all of those $\bar{x}$ values (or whatever the statistic is).
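The definition can be approximated by simulation: rather than enumerating every possible sample, draw many random samples of size n and record each sample mean. The sketch below is only an illustration. The population is hypothetical (10,000 values drawn from a Normal distribution with mean 66.77 minutes and standard deviation 2.85 minutes, chosen to resemble the gumball example; the slides do not state the population's shape or standard deviation), and the function name `simulate_sample_means` is made up for this example.

```python
import random
import statistics

def simulate_sample_means(population, n, num_samples, seed=0):
    """Approximate the sampling distribution of the mean by repeated sampling."""
    rng = random.Random(seed)
    # Each iteration draws one simple random sample (without replacement)
    # and records its mean: one "case" of the sampling distribution.
    return [statistics.mean(rng.sample(population, n)) for _ in range(num_samples)]

# Hypothetical population (an assumption, not taken from the slides).
pop_rng = random.Random(113)
population = [pop_rng.gauss(66.77, 2.85) for _ in range(10_000)]

sample_means = simulate_sample_means(population, n=10, num_samples=2_000)
print("number of simulated samples:", len(sample_means))
print("mean of the sample means:", round(statistics.mean(sample_means), 2))
print("population mean:", round(statistics.mean(population), 2))
```

A histogram of `sample_means` would play the role of the lower panel in the gumball figure: much narrower than the population, and centered near the population mean.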

  8. Sampling Distribution of Gumball Means
     [Figure: two density plots on a 60 to 75 minute axis. Top: population flavor-life (min.), mean = 66.77. Bottom: sample mean flavor life (n = 10), s = 0.9.]

  9. Self-Check Quiz
     1. What are the cases in the context of a sampling distribution? Possible samples of a fixed size n.
     2. What is the variable in the relevant sampling distribution for the gumball life example? The sample mean: each case has its own sample mean.

  10. Standard Error
      Definition (standard error): The distribution of a quantitative variable has a standard deviation. The sampling distribution of a quantitative sample statistic (like a mean) has a standard deviation too. This has a special name: the standard error (e.g., “of the mean”).
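To make the definition concrete, the standard error can be estimated as the standard deviation of many simulated sample means. This is a minimal sketch under stated assumptions: the population is modeled as Normal with mean 66.77 and standard deviation 2.85 (hypothetical values, not given on the slides), and the comparison with the shortcut σ/√n uses a standard textbook result that this part of the slides does not state.

```python
import math
import random
import statistics

# Hypothetical population model (assumption): Normal, mean 66.77, sd 2.85.
POP_MEAN, POP_SD, N = 66.77, 2.85, 10
rng = random.Random(0)

def one_sample_mean():
    """Mean of one random sample of size N from the assumed population."""
    return statistics.mean([rng.gauss(POP_MEAN, POP_SD) for _ in range(N)])

sample_means = [one_sample_mean() for _ in range(5_000)]

# Standard error = standard deviation of the sampling distribution.
standard_error = statistics.stdev(sample_means)

# Textbook shortcut for the standard error of a mean: sigma / sqrt(n).
shortcut = POP_SD / math.sqrt(N)

print("simulated standard error:", round(standard_error, 2))
print("sigma / sqrt(n):", round(shortcut, 2))
```

Both numbers should land near the s = 0.9 shown for the sampling distribution of gumball means, which is why the assumed population standard deviation was set to 2.85.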

  11. Sampling Distribution of Gumball Means
      [Figure: two density plots on a 60 to 75 minute axis. Top: population flavor-life (min.), mean = 66.77. Bottom: sample mean flavor life (n = 10), s = 0.9.]

  12. Properties of Sampling Distribution
      Most (about 95%) of simple random samples have a sample mean ($\bar{x}$) which is within 2 Standard Errors of the population mean ($\mu$).

  13. Sampling Distribution of Gumball Means
      [Figure: two density plots on a 60 to 75 minute axis. Top: population flavor-life (min.), mean = 66.77. Bottom: sample mean flavor life (n = 10), s = 0.9.]

  14. Properties of Sampling Distribution
      Most (about 95%) of simple random samples have a sample mean ($\bar{x}$) which is within 2 Standard Errors of the population mean ($\mu$).
      The population mean $\mu$ is within 2 Standard Errors of most (about 95%) sample means.
      Deeeeeep....
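The "about 95%" claim can be checked by simulation. The sketch below counts how many simulated sample means fall within 2 standard errors of the population mean. Because distance is symmetric, this is the same count as the number of sample means that have µ within 2 standard errors of them. The Normal population with mean 66.77 and standard deviation 2.85 is an assumption made for illustration, not something the slides specify.

```python
import random
import statistics

# Hypothetical population model (assumption): Normal, mean 66.77, sd 2.85.
POP_MEAN, POP_SD, N = 66.77, 2.85, 10
rng = random.Random(1)

# Simulate many samples of size N and keep each sample mean.
sample_means = [
    statistics.mean([rng.gauss(POP_MEAN, POP_SD) for _ in range(N)])
    for _ in range(5_000)
]

# Standard error: the standard deviation of the simulated sample means.
standard_error = statistics.stdev(sample_means)

# |x_bar - mu| <= 2 SE reads the same in both directions:
# x_bar is within 2 SE of mu, and mu is within 2 SE of x_bar.
within = [abs(m - POP_MEAN) <= 2 * standard_error for m in sample_means]
fraction_within = sum(within) / len(within)

print("fraction within 2 standard errors:", round(fraction_within, 3))
```

Under these assumptions the printed fraction should come out close to 0.95.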

  15. Outline: Sampling Distributions; Confidence Intervals; Bootstrap Confidence Intervals (Bootstrap Resampling, Bootstrap Confidence Intervals, Bootstrap Percentile Intervals)

  16. Margins of Error
      In a Gallup poll released yesterday, a sample of 1500 U.S. adults were asked whether they approved or disapproved of the job that Donald Trump is doing as president. 42% of respondents said “approve” and 54% said “disapprove.” The poll’s margin of error was 3 percentage points.
      • What’s the meaning of that 3%?
      Margin of Error: It defines a range of “plausible” values for each population proportion. Precisely, a 95% margin of error of 3 points means that 95% of surveys with the same procedure and sample size will yield sample statistics which are within 3 points of the corresponding population parameter.

  17. Confidence Intervals
      • A point estimate of some population parameter (like a mean), together with some measure of our confidence/uncertainty (e.g., MoE), defines a confidence interval.
      • Can be written in the form “statistic ± MoE”.
      Stating Confidence Intervals
      • “With 95% confidence, the mean flavor-life of our gumballs is between 65.3 and 67.1 minutes.”
      • “With 95% confidence, between 43 (i.e., 46 − 3) and 49 (i.e., 46 + 3) percent of registered voters prefer Hillary Clinton to Donald Trump.”
      • “With 95% confidence, between 39 (42 − 3) and 45 (42 + 3) percent of U.S. adults approve of the president’s job performance.”
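The “statistic ± MoE” form is simple enough to write as a small helper. The function name `confidence_interval` is made up for this illustration; the numbers are the two poll examples from the slide.

```python
def confidence_interval(statistic, margin_of_error):
    """Return (lower, upper) for an interval of the form statistic ± MoE."""
    return (statistic - margin_of_error, statistic + margin_of_error)

# Presidential approval: 42% approve, margin of error 3 points.
approve = confidence_interval(42, 3)
print("approval interval:", approve)

# Clinton vs. Trump preference: 46%, margin of error 3 points.
preference = confidence_interval(46, 3)
print("preference interval:", preference)
```

The two tuples are the endpoints quoted in the confidence statements: 39 to 45 percent and 43 to 49 percent.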

  18. Self-Check: Confidence Intervals
      HBO Sports/Marist gave 1253 U.S. adults the following poll question in spring 2015 (I have edited for length): "Top college men’s football and basketball programs bring in a lot of money to their schools... Do you think student athletes in [these top programs] should be paid for the hours they are required to spend practicing, traveling, and playing on the team, OR should not be paid given the value of their scholarship and a chance to earn a degree?"
      This poll’s 95% margin of error is 2.8%. The results are given in the following table. Find a 95% confidence interval for the percentage of U.S. adults who chose the first option.

      Should be paid       33%
      Should not be paid   65%
      Unsure                2%

  19. How to Determine the Margin of Error?
      The population mean $\mu$ is within 2 Standard Errors of most (about 95%) sample means (from simple random samples).
      Margin of Error: A 95% margin of error of 3 points means that 95% of surveys with the same procedure and sample size will yield sample statistics which are within 3 points of the corresponding population parameter.
      If the sampling distribution is approximately Normal (bell-shaped), then the 95% Margin of Error is about 2 Standard Errors.

  20. Confidence Interval of Gumball Flavor Life
      [Figure: two panels on a 60 to 75 minute axis. Top: density of the sample mean flavor life (n = 10), s = 0.9. Bottom: dot plot of one sample's flavor-life values (min.), sample mean = 67.23.]

  21. A Different Confidence Interval of Gumball Flavor Life
      [Figure: two panels on a 60 to 75 minute axis. Top: density of the sample mean flavor life (n = 10), s = 0.9. Bottom: dot plot of a different sample's flavor-life values (min.), sample mean = 71.08.]
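The two gumball samples can be turned into intervals with the "about 2 standard errors" rule. The sketch below uses only numbers that appear on the slides (standard error 0.9, sample means 67.23 and 71.08, population mean 66.77). The helper names are hypothetical, and the rule assumes the sampling distribution is approximately Normal.

```python
STANDARD_ERROR = 0.9      # spread of the sampling distribution (n = 10)
POPULATION_MEAN = 66.77   # known here only because this is a teaching example

def interval_from_se(sample_mean, standard_error, multiplier=2):
    """95% interval as sample mean ± (about 2) standard errors."""
    margin_of_error = multiplier * standard_error
    return (round(sample_mean - margin_of_error, 2),
            round(sample_mean + margin_of_error, 2))

def captures(interval, value):
    """True if value lies inside the closed interval (lower, upper)."""
    lower, upper = interval
    return lower <= value <= upper

first = interval_from_se(67.23, STANDARD_ERROR)
second = interval_from_se(71.08, STANDARD_ERROR)

print("first sample:", first, "captures mu:", captures(first, POPULATION_MEAN))
print("second sample:", second, "captures mu:", captures(second, POPULATION_MEAN))
```

The first interval contains the population mean and the second does not, which illustrates that a 95% procedure still misses for roughly 1 sample in 20.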

  22. Example: Carbon in Forest Biomass
      • Scientists hoping to curb deforestation estimate [1] that the carbon stored in tropical forests in Latin America, sub-Saharan Africa, and southeast Asia has a total biomass of 247 gigatons.
      • To arrive at this estimate, they first estimate the mean amount of carbon per square kilometer.
      • Based on a sample of n = 4079 inventory plots, the sample mean is $\bar{x} = 11600$ tons with a standard error of 1000 tons.
      • Give and interpret a 95% confidence interval for the mean carbon per square kilometer in the entire set of forests.
      [1] Saatchi, S.S. et al. “Benchmark Map of Forest Carbon Stocks in Tropical Regions Across Three Continents,” Proceedings of the National Academy of Sciences, 5/31/11.
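One way to work the carbon example is to apply the "margin of error is about 2 standard errors" rule to the reported numbers. This is a sketch that assumes the sampling distribution of the mean is approximately Normal; the variable names are illustrative.

```python
sample_mean_tons = 11_600      # sample mean carbon per square kilometer
standard_error_tons = 1_000    # reported standard error

# 95% margin of error is about 2 standard errors (Normal approximation).
margin_of_error = 2 * standard_error_tons

lower = sample_mean_tons - margin_of_error
upper = sample_mean_tons + margin_of_error

print(f"95% CI: {lower} to {upper} tons of carbon per square kilometer")
```

Read as a confidence statement: with 95% confidence, the mean carbon per square kilometer across the entire set of forests is between 9,600 and 13,600 tons.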
