Unit 3: Foundations for inference
Lecture 1: Variability in estimates and CLT
Statistics 101
Thomas Leininger
May 28 2013

Outline
1 Announcements
2 Variability in estimates
   Example
   Sampling distributions - via simulation
   Sampling distributions - via CLT
Statistics 101 U3 - L1: Variability in estimates and CLT Thomas Leininger

Announcements
Labs 2 & 3 due today
PS 3 due tomorrow
Projects
Statistics 101 (Thomas Leininger) U3 - L1: Variability in estimates and CLT May 28 2013 2 / 16

Example
http://pewresearch.org/pubs/2191/young-adults-workers-labor-market-pay-careers-advancement-recession

Margin of error
41% ± 2.9%: We are 95% confident that 38.1% to 43.9% of the public believe young adults, rather than middle-aged or older adults, are having the toughest time in today’s economy.
49% ± 4.4%: We are 95% confident that 44.6% to 53.4% of 18-34 year olds have taken a job they didn’t want just to pay the bills.
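The two intervals above are simply the point estimate minus and plus the margin of error. A minimal sketch of that arithmetic follows; the function name `interval` is an illustrative choice, not something from the lecture.

```python
def interval(point_estimate, margin_of_error):
    """Return (lower, upper) bounds for point_estimate +/- margin_of_error."""
    return (point_estimate - margin_of_error, point_estimate + margin_of_error)

# Percentages quoted on the slide
low, high = interval(41.0, 2.9)
print(f"{low:.1f}% to {high:.1f}%")

low2, high2 = interval(49.0, 4.4)
print(f"{low2:.1f}% to {high2:.1f}%")
```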

Parameter estimation
We are often interested in population parameters.
Since complete populations are difficult (or impossible) to collect data on, we use sample statistics as point estimates for the unknown population parameters of interest.
Sample statistics vary from sample to sample.
Quantifying how sample statistics vary provides a way to estimate the margin of error associated with our point estimate.
But before we get to quantifying the variability among samples, let’s try to understand how and why point estimates vary from sample to sample.
Suppose we randomly sample 1,000 adults from each state in the US. Would you expect the sample means of their heights to be the same, somewhat different, or very different?
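One way to build intuition for sample-to-sample variation is a small simulation. The sketch below is only illustrative: it assumes a single made-up population of heights (normal, mean 170 cm, SD 10 cm) rather than real state-level data, and draws five random samples of 1,000 from it.

```python
import random
import statistics

random.seed(101)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 adult heights in cm (made-up parameters)
population = [random.gauss(170, 10) for _ in range(100_000)]

# Draw five independent random samples of 1,000 and record each sample mean
sample_means = [statistics.mean(random.sample(population, 1000)) for _ in range(5)]

for i, m in enumerate(sample_means, start=1):
    print(f"sample {i}: mean height = {m:.2f} cm")

print(f"population mean = {statistics.mean(population):.2f} cm")
```

Under these assumptions each sample mean is a slightly different point estimate of the same population mean.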

Average number of Duke games attended
Next let’s look at the population data for the number of Duke basketball games attended:
[Figure: histogram of the population; x-axis: number of Duke games attended (0 to 70), y-axis: frequency]

Average number of Duke games attended (cont.)
Sampling distribution, n = 10:
[Figure: histogram of sample means from samples of n = 10; y-axis: frequency]
What does each observation in this distribution represent?
Sample mean, x̄, of samples of size n = 10.
Is the variability of the sampling distribution smaller or larger than the variability of the population distribution? Why?
Smaller, sample means will vary less than individual observations.
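The comparison between population variability and sampling-distribution variability can be sketched with a simulation. The population below is a made-up right-skewed stand-in, not the Duke data from the slides, and the helper `sampling_distribution` and the choice of 2,000 repetitions are assumptions for illustration.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is reproducible

# Hypothetical right-skewed population standing in for "games attended"
population = [round(random.expovariate(1 / 6)) for _ in range(5_000)]

def sampling_distribution(pop, n, reps=2_000):
    """Return `reps` sample means, each from a random sample of size n."""
    return [statistics.mean(random.sample(pop, n)) for _ in range(reps)]

means_10 = sampling_distribution(population, n=10)

pop_sd = statistics.pstdev(population)
sd_means = statistics.stdev(means_10)
print(f"population SD:             {pop_sd:.2f}")
print(f"SD of sample means (n=10): {sd_means:.2f}")
```

Each entry of `means_10` is one sample mean, which is what each observation in the sampling distribution represents.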

Average number of Duke games attended (cont.)
Sampling distribution, n = 30:
[Figure: histogram of sample means from samples of n = 30; y-axis: frequency]
How did the shape, center, and spread of the sampling distribution change going from n = 10 to n = 30?
Shape is more symmetric, center is about the same, spread is smaller.

Average number of Duke games attended (cont.)
Sampling distribution, n = 70:
[Figure: histogram of sample means from samples of n = 70; y-axis: frequency]
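The change in shape, center, and spread across n = 10, 30, and 70 can be explored numerically. This is a rough sketch on a made-up right-skewed population (not the Duke data); the `skewness` helper is a simple moment-based measure chosen here for illustration.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical right-skewed population (NOT the Duke data from the slides)
population = [round(random.expovariate(1 / 6)) for _ in range(5_000)]

def skewness(values):
    """Moment-based skewness: mean cubed deviation divided by SD cubed."""
    m = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)

summary = {}
for n in (10, 30, 70):
    # 2,000 sample means for each sample size
    means = [statistics.mean(random.sample(population, n)) for _ in range(2_000)]
    summary[n] = (statistics.mean(means), statistics.stdev(means), skewness(means))
    center, spread, skew = summary[n]
    print(f"n = {n:2d}: center = {center:.2f}, spread = {spread:.2f}, skewness = {skew:.2f}")
```

With a population like this one, the center should stay close to the population mean while the spread shrinks as n grows.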

Average number of Duke games attended (cont.)
Question
The mean of the sampling distribution is 5.75, and the standard deviation of the sampling distribution (also called the standard error) is 0.75. Which of the following is the most reasonable guess for the 95% confidence interval for the true average number of Duke games attended by students?
(a) 5.75 ± 0.75
(b) 5.75 ± 2 × 0.75 → (4.25, 7.25)
(c) 5.75 ± 3 × 0.75
(d) cannot tell from the information given
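To see why two standard errors is the natural choice for 95%, the sketch below computes each candidate interval together with the share of a normal distribution lying within k standard deviations of its mean. It assumes the sampling distribution is approximately normal; the helper name `normal_coverage` is illustrative.

```python
import math

mean, se = 5.75, 0.75  # values given in the question

def normal_coverage(k):
    """Probability that a normal variable falls within k SDs of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    low, high = mean - k * se, mean + k * se
    print(f"+/- {k} SE: ({low:.2f}, {high:.2f}), coverage approx {normal_coverage(k):.3f}")
```

Under the normality assumption, the coverages come out near 0.683, 0.954, and 0.997 for k = 1, 2, and 3.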

Central limit theorem
The distribution of the sample mean is well approximated by a normal model:
$$\bar{x} \sim N\left(\text{mean} = \mu,\ SE = \frac{\sigma}{\sqrt{n}}\right)$$
If σ is unknown, use s.
So it wasn’t a coincidence that the sampling distributions we saw earlier were symmetric.
We won’t go into proving why SE = σ/√n, but note that as n increases SE decreases.
As the sample size increases we would expect samples to yield more consistent sample means, hence the variability among the sample means would be lower.
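Without proving the formula, it can be checked empirically. The sketch below compares the standard deviation of many simulated sample means with σ/√n. The population, the sample size n = 50, and the 4,000 repetitions are all made-up choices for illustration.

```python
import math
import random
import statistics

random.seed(11)  # fixed seed so the sketch is reproducible

# Hypothetical skewed population (made up for illustration)
population = [random.expovariate(1 / 6) for _ in range(20_000)]
sigma = statistics.pstdev(population)

n = 50
# Sampling with replacement keeps the draws independent
means = [statistics.fmean(random.choices(population, k=n)) for _ in range(4_000)]

simulated_se = statistics.stdev(means)
formula_se = sigma / math.sqrt(n)
print(f"simulated SE:    {simulated_se:.3f}")
print(f"sigma / sqrt(n): {formula_se:.3f}")
```

The two numbers should agree closely; rerunning with a larger n should shrink both.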

CLT - conditions
Certain conditions must be met for the CLT to apply:
1. Independence: Sampled observations must be independent.
   This is difficult to verify, but is more likely if random sampling/assignment is used, and, if sampling without replacement, n < 10% of the population.
2. Sample size/skew/outliers: Either
   1) the population distribution is normal OR
   2) n > 30 and the population distribution is not extremely skewed.
   This is also difficult to verify for the population, but we can check it using the sample data, and assume that the sample mirrors the population.
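The numeric parts of these conditions can be gathered into a small checklist. This is a sketch only: the function name `clt_checklist`, the sample values, and the population size of 6,000 are hypothetical, and the skewness figure is reported for judgment rather than compared against a cutoff, since the slides give no numeric threshold for "extremely skewed". Random sampling itself cannot be checked from the data.

```python
import statistics

def clt_checklist(sample, population_size):
    """Summarize the numeric CLT conditions for a sample (illustrative helper)."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.pstdev(sample)
    # Moment-based skewness of the sample, used as a stand-in for the population
    skew = sum((x - mean) ** 3 for x in sample) / (n * sd ** 3)
    return {
        "n": n,
        "under_10_percent": n < 0.10 * population_size,  # independence condition
        "n_over_30": n > 30,                             # sample size condition
        "sample_skewness": skew,                         # inspect, no fixed cutoff
    }

# Hypothetical sample of games attended (made-up values, not from the lecture)
sample = [0] * 8 + [1] * 7 + [2] * 6 + [3] * 5 + [5] * 4 + [8] * 3 + [12] * 2 + [20, 35]

result = clt_checklist(sample, population_size=6_000)
for name, value in result.items():
    print(f"{name}: {value}")
```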

CLT - sample size/skew condition - simulations (1)
http://onlinestatbook.com/stat_sim/sampling_dist/index.html
