STAT 113 Bootstrap Confidence Intervals, Colin Reimer Dawson

STAT 113: Bootstrap Confidence Intervals
Colin Reimer Dawson, Oberlin College, 3 March 2017
Sections: Confidence Intervals; Bootstrap Resampling; Bootstrap Confidence Intervals; Bootstrap Percentile Intervals

Using Samples to Make Estimates About Populations
Statistic : Sample :: Parameter : Population
We want to use our sample statistic to estimate the corresponding population parameter.

Standard Error
Definition: The distribution of a quantitative variable has a standard deviation. The sampling distribution of a quantitative sample statistic (like a mean) has a standard deviation too. This has a special name: the standard error (e.g., “of the mean”).
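The definition above can be checked by simulation. The following is a minimal base R sketch, not course code: the population (`population`), the sample size, and the number of repetitions are all made-up assumptions. It draws many samples, takes the standard deviation of their means, and compares that with the familiar population SD divided by the square root of n.

```r
set.seed(113)

# Hypothetical population: 10,000 right-skewed values (an assumption, not course data)
population <- rexp(10000, rate = 1 / 50)
n <- 40

# Draw many samples of size n and record the mean of each one
sample_means <- replicate(5000, mean(sample(population, size = n)))

# Standard error = standard deviation of the sampling distribution of the mean
se_simulated <- sd(sample_means)

# Comparison value from probability theory: population SD / sqrt(n)
se_theory <- sd(population) / sqrt(n)

print(c(simulated = se_simulated, theory = se_theory))
```

The two numbers should agree closely, which is the sense in which the standard error is "just" a standard deviation, only of a statistic rather than of individual cases.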

Confidence Intervals
• A point estimate of some population parameter (like a mean), together with some measure of our confidence/uncertainty (e.g., a margin of error, MoE), defines a confidence interval.
• Can be written in the form “statistic ± MoE”.
• “With 95% confidence, the mean flavor-life of our gumballs is between 65.3 and 67.1 minutes.”
• “With 95% confidence, between 39 (42 − 3) and 45 (42 + 3) percent of U.S. adults approve of Donald Trump’s job performance as president.”

How to Determine the Margin of Error?
The population mean µ is within 2 Standard Errors of most (about 95%) sample means (from simple random samples).
Margin of Error: A 95% margin of error of 3 points means that 95% of surveys with the same procedure and sample size will yield sample statistics which are within 3 points of the corresponding population parameter.
If the sampling distribution is approximately Normal, then a 95% Margin of Error is about 2 Standard Errors.

Interpretations of CIs
• “95% CIs contain 95% of the cases in the population.” False. They represent uncertainty about a population parameter, not about individual points.
• “There is a 95% chance that the sample mean falls in the 95% CI.” False. Any given CI is centered around the sample mean for that sample, so the sample mean is inside 100% of the time.
• “95% of samples produce confidence intervals that contain the population parameter.” True. This is the definition of a confidence interval.
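The "True" statement can be illustrated with a small coverage simulation. This is only a sketch under stated assumptions: the population is taken to be Normal with a made-up mean and standard deviation (`mu`, `sigma`), and the standard error is treated as known so that each interval is simply sample mean ± 2 SE.

```r
set.seed(113)

# Hypothetical pulse-rate population (values chosen for illustration only)
mu <- 68
sigma <- 10
n <- 50
se <- sigma / sqrt(n)  # standard error of the mean, treated as known here

# For each of 10,000 samples, build "mean +/- 2 SE" and check whether it contains mu
covers <- replicate(10000, {
  xbar <- mean(rnorm(n, mean = mu, sd = sigma))
  (xbar - 2 * se <= mu) && (mu <= xbar + 2 * se)
})

# Proportion of intervals that captured the population mean
coverage <- mean(covers)
print(coverage)
```

The proportion should land close to 95%. Note what varies from run to run: the interval moves with each sample, while the population mean stays fixed.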

Correct or Incorrect?
A 98% confidence interval for mean pulse rate in the Oberlin student population is 65 to 71. The interpretation “I am 98% sure that all students will have pulse rates between 65 and 71” is
A. Correct
B. Incorrect

Correct or Incorrect?
A 98% confidence interval for mean pulse rate in the Oberlin student population is 65 to 71. The interpretation “I am 98% sure that the mean pulse rate for this sample of students will fall between 65 and 71” is
A. Correct
B. Incorrect

Correct or Incorrect?
A 98% confidence interval for mean pulse rate in the Oberlin student population is 65 to 71. The interpretation “I am 98% sure that the mean pulse rate for the population of all students will fall between 65 and 71” is
A. Correct
B. Incorrect

Correct or Incorrect?
A 98% confidence interval for mean pulse rate in the Oberlin student population is 65 to 71. The interpretation “98% of the pulse rates for students at this college will fall between 65 and 71” is
A. Correct
B. Incorrect

Summary
To create a 95% confidence interval for a parameter:
1. Take many random samples from the population, and compute the sample statistic for each sample.
2. Compute the standard error as the standard deviation of all these statistics.
3. For your actual sample, use statistic ± 2 SE.
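The three steps can be sketched in base R as follows. This assumes, unrealistically, that the whole population is available to sample from; the simulated `population` of prices and all of its parameters are made up for illustration.

```r
set.seed(113)

# Hypothetical population of 20,000 prices (an assumption for illustration)
population <- rnorm(20000, mean = 180000, sd = 80000)
n <- 50

# Step 1: take many random samples and compute the statistic (here, the mean) for each
stats <- replicate(5000, mean(sample(population, size = n)))

# Step 2: the standard error is the standard deviation of those statistics
se <- sd(stats)

# Step 3: for the one "actual" sample, report statistic +/- 2 SE
actual_sample <- sample(population, size = n)
ci <- mean(actual_sample) + c(-2, 2) * se
print(ci)
```

Step 1 is the part that cannot be carried out in practice, since it requires repeated access to the population.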

Ok, but...
In reality we only have one sample. How do we know what the standard error is?
• Standard error depends on population characteristics, particularly variability.
• We can use the sample to estimate not only the parameter of interest (e.g., mean, proportion), but also the variability.
• Two approaches: (1) Simulation, (2) Probability theory

Estimating the Margin of Error from One Sample
• Since we only have one sample, we have to estimate the Margin of Error using only the information it contains.
• Idea: Let the whole sample (not just the statistic of interest) serve as an estimate for the whole population.

Note: We do not literally make copies of the data, or increase our sample size, by bootstrapping!

Sampling from the Pseudo-Population
• Sampling from the estimated population is equivalent to sampling from the sample, but never “using up” the cases.
• In other words, we sample with replacement from the sample.
• The resulting sample is called a bootstrap sample.
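In base R, sampling with replacement from the sample is a single call to `sample()` with `replace = TRUE`. The sketch below uses a tiny made-up sample of pulse rates (`original_sample` is hypothetical) so that the repeated and omitted cases are easy to see.

```r
set.seed(113)

# Hypothetical original sample of 8 pulse rates (made-up values)
original_sample <- c(62, 65, 66, 68, 70, 71, 74, 77)

# A bootstrap sample: same size as the original, drawn WITH replacement,
# so some cases can appear more than once and others not at all
boot_sample <- sample(original_sample,
                      size = length(original_sample),
                      replace = TRUE)

print(boot_sample)
print(table(boot_sample))  # how many times each original value was drawn
```

The bootstrap sample has the same size as the original sample, which is consistent with the note that bootstrapping does not increase the sample size.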

Bootstrap Statistic and Bootstrap Distribution
• We compute the relevant statistic (e.g., mean) on the bootstrap sample. This is a bootstrap statistic.
• Over many bootstrap samples, each contributing a bootstrap statistic, we get a bootstrap distribution.
• Each bootstrap statistic differs from the “pseudopopulation parameter” (which is really the real sample statistic).
• We hope these differences are similar in size to the differences between true sample statistics and the population parameter.
Bootstrap statistic : Actual sample statistic :: Actual sample statistic : Actual population parameter
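A bootstrap distribution can be built in base R by repeating the resampling step many times. This is a sketch under assumptions: the single sample (`original_sample`) is simulated from a made-up skewed distribution, and 5000 resamples is an arbitrary choice.

```r
set.seed(113)

# Hypothetical single sample of 50 prices (simulated for illustration)
original_sample <- rexp(50, rate = 1 / 180000)

# Each bootstrap statistic is the mean of one resample drawn with replacement
boot_means <- replicate(5000, mean(sample(original_sample, replace = TRUE)))

# The bootstrap distribution is centered near the original sample statistic,
# which plays the role of the pseudo-population parameter
print(c(sample_mean = mean(original_sample), boot_center = mean(boot_means)))

# Differences between bootstrap statistics and the sample statistic
deviations <- boot_means - mean(original_sample)
print(summary(deviations))
```

The `deviations` are the stand-ins for the differences we cannot observe directly, namely those between sample statistics and the population parameter.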

Examples: StatKey http://lock5stat.com/statkey

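Population vs. Sample vs. Sampling Dist. vs. Bootstrap Dist.

    library(mosaic)  # provides read.file(), do(), resample(), and the formula interface to mean()
    # Treat the full data set (ames.csv) as the population, then draw one sample of 50 cases
    Population <- read.file("http://colindawson.net/data/ames.csv")
    Sample <- sample(Population, size = 50)
    # Sampling distribution: means of 5000 fresh samples from the population
    SamplingDist <- do(5000) * sample(Population, size = 50) %>% mean(~Price, data = .)
    # Bootstrap distribution: means of 5000 resamples (with replacement) from the one sample
    BootstrapDist <- do(5000) * resample(Sample) %>% mean(~Price, data = .)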

Population vs. Sample vs. Sampling Dist. vs. Bootstrap Dist.
[Figure: four stacked histograms. Vertical axes: Pop. Cases, Samp. Cases, Samples, Boot. Samples. Horizontal axes: Price for the first two panels, Mean Price for the last two.]
• What is the center of the sampling distribution?
• What is the center of the bootstrap distribution?
• How does the spread compare?

Estimating the Margin of Error
[Figure: two histograms over Mean Price, labeled Samples and Boot. Samples, each with the middle 95% marked.]
• The spread of the bootstrap distribution approximates the spread of the true sampling distribution.
• We can use the bootstrap distribution to get a Margin of Error for our Confidence Interval.
• Where should the center of the CI be?
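One way to turn the bootstrap distribution into a margin of error is sketched below in base R. The sample is simulated with made-up parameters, and the interval is centered at the original sample statistic rather than at the unknown population parameter, since the sample statistic is the only center we actually have.

```r
set.seed(113)

# Hypothetical single sample of 50 home prices (simulated for illustration)
original_sample <- rnorm(50, mean = 180000, sd = 40000)

# Bootstrap distribution of the sample mean
boot_means <- replicate(5000, mean(sample(original_sample, replace = TRUE)))

# Bootstrap estimate of the standard error: the spread of the bootstrap statistics
boot_se <- sd(boot_means)

# Center the interval at the original sample statistic and use 2 SE as the 95% MoE
center <- mean(original_sample)
ci <- center + c(-2, 2) * boot_se
print(round(ci))
```

This mirrors the "statistic ± 2 SE" recipe, with the bootstrap SE standing in for the standard error that would otherwise require many samples from the population.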

Adjusting the Confidence Level
If the sampling distribution is approximately Normal, then a 95% Margin of Error is about 2 Standard Errors.
If the bootstrap distribution is approximately Normal, 95% of the bootstrap statistics are within 2 SE of the bootstrap center (i.e., the original sample statistic). That is, 95% of bootstrap statistics are within the 95% CI.
If the bootstrap distribution is symmetric, then capturing the middle X% of the bootstrap statistics yields an X% confidence interval!
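Capturing the middle X% of the bootstrap statistics amounts to reading off two quantiles. The following base R sketch assumes a simulated sample of pulse rates and a hypothetical helper, `percentile_ci()`, written only for this illustration.

```r
set.seed(113)

# Hypothetical single sample of 50 pulse rates (simulated for illustration)
original_sample <- rnorm(50, mean = 68, sd = 10)
boot_means <- replicate(5000, mean(sample(original_sample, replace = TRUE)))

# Hypothetical helper: the middle `level` of the bootstrap statistics,
# found by cutting off (1 - level) / 2 in each tail
percentile_ci <- function(boot_stats, level = 0.95) {
  tail_prob <- (1 - level) / 2
  quantile(boot_stats, probs = c(tail_prob, 1 - tail_prob), names = FALSE)
}

ci_90 <- percentile_ci(boot_means, 0.90)
ci_95 <- percentile_ci(boot_means, 0.95)
ci_99 <- percentile_ci(boot_means, 0.99)

# Higher confidence levels give wider intervals
print(rbind(ci_90, ci_95, ci_99))
```

Changing the confidence level only changes how much is trimmed from each tail; no new resampling is needed.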
