Inferential Statistics Chapters 6 & 7 - PDF document

5/1/2017

IMGD 2905
Inferential Statistics
Chapters 6 & 7

Overview
• Use simple statistics to infer population parameters
Figure (inferential statistics): http://3.bp.blogspot.com/_94E2PdKwaXE/S-xQRuoiKAI/AAAAAAAAABY/xvDRcG_Mcj0/s1600/120909_0159_1.png

Outline
• Overview (done)
• Foundation (next)
• Confidence Intervals
• Hypothesis Testing

Dice Rolling (1 of 4)
• Have 1d6, sample (i.e., roll 1 die)
• What is the probability distribution of the values?
• "Square" distribution
Figure: http://www.investopedia.com/articles/06/probabilitydistribution.asp

Dice Rolling (2 of 4)
• Have 1d6, sample twice and sum (i.e., roll 2 dice)
• What is the probability distribution of the values?
• "Triangle" distribution
Figure: http://www.investopedia.com/articles/06/probabilitydistribution.asp

Dice Rolling (3 of 4)
• Have 1d6, sample thrice and sum (i.e., roll 3 dice)
• What is the probability distribution of the values?
• What's happening to the shape?
Figure: http://www.investopedia.com/articles/06/probabilitydistribution.asp

Dice Rolling (4 of 4)
• Same holds for experiments with dice (i.e., observing sample sum and mean of dice rolls)
Figure: http://www.muelaner.com/uncertainty-of-measurement/
• Ok, neat – but what about experiments with other distributions?
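The dice-rolling slides can be checked by brute force. The sketch below is only an illustration (the function name `sum_distribution` is made up here, not taken from the slides): it enumerates every equally likely outcome for 1, 2, and 3 dice and prints a text bar chart, so the "square", "triangle", and increasingly bell-like shapes become visible.

```python
from collections import Counter
from itertools import product


def sum_distribution(num_dice, sides=6):
    """Exact probability of each total when rolling `num_dice` fair dice."""
    faces = range(1, sides + 1)
    # Enumerate every equally likely outcome and count how often each total occurs.
    counts = Counter(sum(roll) for roll in product(faces, repeat=num_dice))
    outcomes = sides ** num_dice
    return {total: counts[total] / outcomes for total in sorted(counts)}


if __name__ == "__main__":
    for num_dice in (1, 2, 3):
        print(f"{num_dice}d6")
        for total, p in sum_distribution(num_dice).items():
            # One '#' per percentage point, so the bars mirror the distribution's shape.
            print(f"  {total:2d} {p:.3f} {'#' * round(p * 100)}")
```

Enumeration is exact but grows as 6^n, so for many dice a random simulation would be the more practical choice.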

Sampling Distributions
• With "enough" samples, looks "bell-shaped" → Normal!
• How many is enough?
  – 30 (15 if symmetric distribution)
• Central Limit Theorem
  – Sum of independent variables tends towards Normal distribution
Figure: http://flylib.com/books/2/528/1/html/2/images/figu115_1.jpg

Why do we care about sample means following normal distribution? (1 of 4)
• What if we had only a sample mean and no measure of spread?
  – e.g., mean rank for Overwatch is 50
• What can we say about the population mean?

Why do we care about sample means following normal distribution? (2 of 4)
• What if we had only a sample mean and no measure of spread?
  – e.g., mean rank for Overwatch is 50
• What can we say about the population mean?
  – Not a whole lot!
  – Yes, population mean could be 50. But it could be 100. How likely is each? → No idea!
Figure: sample mean marked, population mean unknown

Why do we care about sample means following normal distribution? (3 of 4)
• Remember this?
Figure: http://www.six-sigma-material.com/images/PopSamples.GIF
• With mean and standard deviation
• Allows us to predict range to bound population mean

Why do we care about sample means following normal distribution? (4 of 4)
Figure: sample mean with probable range of population mean

Outline
• Overview (done)
• Foundation (done)
• Confidence Intervals (next)
• Hypothesis Testing
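The Central Limit Theorem claim on the Sampling Distributions slide can be tried on a distribution that looks nothing like a bell. The following is a rough sketch under stated assumptions: the exponential distribution, the seed, and all function names are choices made for this illustration, not part of the lecture. It draws repeated samples and reports the mean of the sample means, their standard deviation, and the share of sample means within one standard deviation of their center (roughly 0.68 for a normal distribution).

```python
import random
import statistics


def sample_means(draw, sample_size, num_samples, rng):
    """Mean of each of `num_samples` independent samples of `sample_size` draws."""
    return [
        statistics.fmean([draw(rng) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]


def share_within_one_sd(values):
    """Fraction of values within one standard deviation of their mean."""
    center = statistics.fmean(values)
    spread = statistics.stdev(values)
    return sum(abs(v - center) <= spread for v in values) / len(values)


def draw_exponential(rng):
    # Assumed example distribution: skewed, with mean 1 and standard deviation 1.
    return rng.expovariate(1.0)


if __name__ == "__main__":
    rng = random.Random(2905)  # fixed seed so reruns give the same numbers
    for n in (1, 5, 30):
        means = sample_means(draw_exponential, n, 2000, rng)
        print(
            f"n={n:2d} mean of means={statistics.fmean(means):.3f} "
            f"sd of means={statistics.stdev(means):.3f} "
            f"share within 1 sd={share_within_one_sd(means):.3f}"
        )
```

As the sample size grows, the standard deviation of the sample means should shrink toward σ/√n and the within-one-sd share should drift toward the normal value.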

Sampling Error (1 of 2)
• Population of 200 game times
  – Mean μ = 69.637
  – Std Dev σ = 10.411
• Experiment w/20 samples
  – Each 15 game times
• Observations?
  – Statistics differ each time!
  – Sometimes higher, sometimes lower than population (μ, σ)
  – Sample range varies a lot more than sample standard deviation
  – Population mean always within sample range
• This variation → Sampling error

Sampling Error (2 of 2)
• Error from estimating population parameters from sample statistics
• Exact error often cannot be known (do not know population parameters)
• But size of error based on:
  – Variation in population (σ) itself – more variation, more sample statistic variation
  – Sample size (N) – larger sample, lower error
• Q: Why can't we just make sample size super large?
• How much does it vary? → Standard error

Standard Error (1 of 2)
• Amount sample means vary from sample to sample
• Also likelihood that sample statistic is near population parameter
  – Depends upon sample size (N)
  – Depends upon standard deviation
(Example next)

Standard Error (2 of 2)
Figure: standard error, 100 samples, N=3 and N=20 (http://www.biostathandbook.com/standarderror.html)
• For N = 20:
  – What will happen to x's?
  – What will happen to bars?
  – What will happen to dots?
• Estimate population parameter → confidence interval
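The sampling-error experiment can be imitated in a few lines. This is a minimal sketch, assuming a synthetic population: the slides do not include the 200 game times, so `make_population` invents normally distributed values near the reported mean and standard deviation, and all function names are hypothetical. It uses the usual estimate of the standard error of the mean, s / √N, and compares how much sample means vary for N=3 versus N=20.

```python
import math
import random
import statistics


def make_population(rng, size=200, mean=70.0, stddev=10.0):
    # Hypothetical stand-in for the slide's 200 game times (real data not given).
    return [rng.gauss(mean, stddev) for _ in range(size)]


def standard_error(sample):
    """Estimated standard error of the mean: s / sqrt(N)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def spread_of_sample_means(population, sample_size, num_samples, rng):
    """Standard deviation of the means of repeated random samples."""
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]
    return statistics.stdev(means)


if __name__ == "__main__":
    rng = random.Random(2017)  # fixed seed for repeatable output
    population = make_population(rng)
    print(f"population mean={statistics.fmean(population):.3f}")

    # 20 samples of 15 game times each: the statistics differ every time.
    for i in range(1, 21):
        sample = rng.sample(population, 15)
        print(
            f"sample {i:2d}: mean={statistics.fmean(sample):6.2f} "
            f"s={statistics.stdev(sample):5.2f} SE={standard_error(sample):4.2f}"
        )

    # 100 samples each at N=3 and N=20: larger samples give tighter sample means.
    for n in (3, 20):
        print(f"N={n:2d}: sd of sample means={spread_of_sample_means(population, n, 100, rng):.2f}")
```

The last two lines of output are the point of the exercise: the sample means at N=20 should cluster much more tightly around the population mean than those at N=3.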

Confidence Interval
• Range of values with specific certainty that population parameter is within
  – e.g., 90% confidence interval for mean League of Legends match duration: [28.5 minutes, 32.5 minutes]
• Have sample of durations
• Compute interval containing population duration (with 90% confidence)
• In general: probability of μ in interval [c1, c2]
Figure: 90% confidence interval from 28.5 to 32.5 (http://www.comfsm.fm/~dleeling/statistics/notes009_normalcurve90.png)

Confidence Interval for the Mean
• Probability of μ in interval [c1, c2]
  – P(c1 < μ < c2) = 1 − α
  – [c1, c2] is confidence interval
  – α is significance level
  – 100(1 − α) is confidence level
• Typically want α small, so confidence level 90%, 95% or 99% (more on effect later)
• Say, α = 0.1. Could do k experiments, find sample means, sort
  – Cumulative distribution
• Interval from distribution:
  – Lower bound: 5%
  – Upper bound: 95%
  → 90% confidence interval
• We have to do k experiments, each of size n?

Confidence Interval Estimate
• Estimate interval from 1 experiment/sample, size n
• Compute sample mean, sample standard error (SE)
• Multiply SE by t distribution value
• Add/subtract from sample mean → Confidence interval
• e.g., mean 30.5, SE × t is 2
  – 30.5 − 2 = 28.5
  – 30.5 + 2 = 32.5
  – [28.5, 32.5]
• Ok, what is t distribution?
  – Parameterized by α and n

t distribution
• Looks like standard normal, but a bit "squashed"
• Gets more squashed as n gets smaller
• Note, can use standard normal (z distribution) when sample size is large enough (N = 30+)
• aka Student's t distribution ("Student" was the anonymous name used when published by William Gosset)
Figure: http://ci.columbia.edu/ci/premba_test/c0331/images/s7/6317178747.gif

Confidence Interval Example
(Sorted) Game Time
1.9  2.7  2.8  2.8  2.8  2.9  3.1  3.1
3.2  3.2  3.3  3.4  3.6  3.7  3.8  3.9
3.9  3.9  4.1  4.1  4.2  4.2  4.4  4.5
4.5  4.8  4.9  5.1  5.1  5.3  5.6  5.9
• x̄ = 3.90, stddev s = 0.95, n = 32
• A 90% confidence interval (α is 0.1) for population mean (μ):
  3.90 ± (1.645 × 0.95) / √32 = [3.62, 4.18]
  – Lookup 1.645 in table, or =TINV(0.1,31)
• With 90% confidence, μ in that interval. Chance of error 10%.
• But, what does that mean? (See next slide for depiction of meaning)

Meaning of Confidence Interval (α)
• If 100 experiments and confidence level is 90%: in 90 cases interval includes μ, in 10 cases it does not include μ
  e.g., α = 0.1
Figure: f(x) with one interval drawn per experiment

Experiment/Sample   Includes μ?
1                   yes
2                   yes
3                   no
…                   …
100                 yes
Total yes           90   (≥ 100(1 − α))
Total no            10   (≤ 100α)
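The estimate-from-one-sample recipe and the "100 experiments" reading of a confidence interval can both be sketched in code. Assumptions to note: Python's standard library has no t quantile function, so this sketch uses the normal (z) multiplier, which the t distribution slide allows for N = 30+; the population parameters passed to `coverage` (mean 50, standard deviation 10) are invented for the illustration; and the function names are hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

# Sorted game times from the Confidence Interval Example slide.
GAME_TIMES = [
    1.9, 2.7, 2.8, 2.8, 2.8, 2.9, 3.1, 3.1,
    3.2, 3.2, 3.3, 3.4, 3.6, 3.7, 3.8, 3.9,
    3.9, 3.9, 4.1, 4.1, 4.2, 4.2, 4.4, 4.5,
    4.5, 4.8, 4.9, 5.1, 5.1, 5.3, 5.6, 5.9,
]


def z_interval(mean, stddev, n, confidence=0.90):
    """Two-sided confidence interval for the mean using the normal (z) multiplier."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.645 when confidence is 0.90
    margin = z * stddev / math.sqrt(n)       # multiplier times standard error
    return mean - margin, mean + margin


def coverage(mu, sigma, n, experiments, confidence, rng):
    """Fraction of repeated experiments whose interval contains the true mean mu."""
    hits = 0
    for _ in range(experiments):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = z_interval(
            statistics.fmean(sample), statistics.stdev(sample), n, confidence
        )
        hits += low <= mu <= high
    return hits / experiments


if __name__ == "__main__":
    # Summary statistics of the raw data, rounded as on the slide.
    x_bar = round(statistics.fmean(GAME_TIMES), 2)
    s = round(statistics.stdev(GAME_TIMES), 2)
    low, high = z_interval(x_bar, s, len(GAME_TIMES), 0.90)
    print(f"mean={x_bar} s={s} 90% CI=[{low:.2f}, {high:.2f}]")

    # Assumed population (mu=50, sigma=10): how often does the interval catch mu?
    print(coverage(50, 10, 32, 1000, 0.90, random.Random(7)))
```

The coverage figure should land near 0.90; it tends to sit slightly below because the z multiplier is a little smaller than the t multiplier for n = 32.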

How does Confidence Interval Size Change?
• With number of samples (N)
• With confidence level (α)

How does Confidence Interval Change (1 of 2)?
• What happens to confidence interval when sample is larger (N increases)?
  – Hint: think about Standard Error

How does Confidence Interval Change (2 of 2)?
• 90% CI = [6.5, 9.4]
  – 90% chance population value is between 6.5 and 9.4
• 95% CI = [6.1, 9.8]
  – 95% chance population value is between 6.1 and 9.8
• Why is interval wider when we are "more" confident?
Figure: http://vassarstats.net/textbook/f1002.gif

Using Confidence Interval (1 of 2)
• Indicator of spread → Error bars
• CI can be more informative than standard deviation → indicates range of population parameter (make sure 30+ samples!)
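Both questions on these slides can be explored numerically. The sketch below is an illustration only: it assumes the z-based interval (reasonable for 30+ samples), reuses s = 0.95 from the game-time example, and the function name `margin_of_error` is made up here. It prints the half-width of the interval as N grows and as the confidence level rises.

```python
import math
from statistics import NormalDist


def margin_of_error(stddev, n, confidence):
    """Half-width of a two-sided z-based confidence interval for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * stddev / math.sqrt(n)


if __name__ == "__main__":
    s = 0.95  # sample standard deviation from the game-time example

    # Larger N: standard error shrinks with sqrt(N), so the interval narrows.
    for n in (32, 128, 512):
        print(f"N={n:3d}  90% half-width={margin_of_error(s, n, 0.90):.3f}")

    # Higher confidence: a larger multiplier is needed, so the interval widens.
    for confidence in (0.90, 0.95, 0.99):
        print(f"N= 32  {confidence:.0%} half-width={margin_of_error(s, 32, confidence):.3f}")
```

Quadrupling N halves the width, which is one practical answer to why sample size cannot simply be made "super large": each further halving costs four times as many observations.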
