Business Statistics CONTENTS Two types of error The power of a - PowerPoint PPT Presentation

POWER AND DESIGN Business Statistics

CONTENTS
▪ Two types of error
▪ The power of a test
▪ Experimental design
▪ Choosing sample size
▪ Power curves
▪ Power and “big data”
▪ Old exam question
▪ Further study

TWO TYPES OF ERROR
What is the precise meaning of the significance level $\alpha$?
▪ $\alpha$ is the maximum acceptable probability of rejecting the null hypothesis when it is in fact true
▪ so $P(\text{reject } H_0 \mid H_0) \le \alpha$
▪ to be read as: “the conditional probability of rejecting $H_0$, given that $H_0$ is true”
There are two possible decisions:
▪ reject $H_0$
▪ do not reject $H_0$
And there are two possible realities:
▪ $H_0$ is true
▪ $H_0$ is not true

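TWO TYPES OF ERROR
Organize the situations in a 2 × 2 table:

                        $H_0$ is true        $H_0$ is not true
reject $H_0$            type I error         correct decision
do not reject $H_0$     correct decision     type II error

Correct decision
▪ no further concern, because OK
Wrong decision
▪ type I error
▪ type II error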

TWO TYPES OF ERROR
Probability of a type I error: $P(\text{reject } H_0 \mid H_0) \le \alpha$
▪ conventionally $\alpha = 5\%$
▪ but you may choose another significance level if you think that is appropriate
▪ e.g., aircraft safety: better use $\alpha = 1\%$ or even $\alpha = 0.1\%$
▪ e.g., choosing a colour of your shampoo flask: $\alpha = 10\%$ is OK
▪ Anyhow, you control the maximum type I error by choosing $\alpha$ in advance
▪ and accept that you will once in a while reject $H_0$ while it is true
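The claim that the chosen significance level caps how often a true null hypothesis is rejected can be checked by simulation. The sketch below is only an illustration under stated assumptions: a two-sided z-test for a mean with known standard deviation, and made-up values for the hypothesized mean, standard deviation and sample size. The function name `type_i_error_rate` is hypothetical.

```python
import random
from statistics import NormalDist


def type_i_error_rate(alpha=0.05, mu_0=10.0, sigma=2.0, n=25,
                      trials=10_000, seed=1):
    """Share of simulated samples in which a true H0 is rejected.

    Every sample is drawn with the population mean equal to mu_0,
    so each rejection counted here is a type I error.
    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    std_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(mu_0, sigma) for _ in range(n)) / n
        if abs(sample_mean - mu_0) / std_error > z_crit:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    for alpha in (0.10, 0.05, 0.01):
        rate = type_i_error_rate(alpha)
        print(f"alpha = {alpha:.2f}: true H0 rejected in {rate:.3f} of samples")
```

The simulated rejection rate should land close to the chosen significance level, which is the sense in which that level is "controlled in advance".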

TWO TYPES OF ERROR
Probability of a type II error: $P(\text{do not reject } H_0 \mid \text{specific } H_1) = \beta$
▪ for instance, $H_0: \mu = 10$ vs. $H_1: \mu = 12$
▪ how to choose $\beta$?
▪ You cannot simply control the maximum type II error $\beta$
▪ because it depends on the true value of the unknown (!) parameter (that is to be tested itself)
▪ as well as on $\alpha$
▪ but it could be good to know it, at least
▪ usually, this is done through the power concept

TWO TYPES OF ERROR
Controlling $\alpha$ and $\beta$ is important
▪ Example: airport security
▪ $H_0$: bag does not contain a weapon
▪ Type I error:
  ▪ the bag did not contain a weapon, but the machine said it did
  ▪ loss of time in manual search, offended clients, delayed flights
  ▪ management will try to minimize the probability of this error type
▪ Type II error:
  ▪ the bag did contain a weapon, but the machine did not detect it
  ▪ hijacks, loss of crew and aircraft, liability claims, loss of credibility
  ▪ management will try to minimize the probability of this error type

TWO TYPES OF ERROR
There is a trade-off:
▪ minimizing $\alpha$ typically leads to increasing $\beta$
▪ minimizing $\beta$ typically leads to increasing $\alpha$
$\alpha$ and $\beta$ can only be decreased simultaneously by changing the set-up of the research
▪ most importantly, by increasing sample size
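The trade-off can be made concrete with a small calculation. The sketch below assumes a two-sided z-test for a mean with known standard deviation, uses the hypothesized mean 10 and alternative mean 12 from the earlier example, and assumes a standard deviation of 5 purely for illustration. The function name `type_ii_error` is hypothetical.

```python
from statistics import NormalDist


def type_ii_error(alpha, n, mu_hyp=10.0, mu_true=12.0, sigma=5.0):
    """P(do not reject H0 | mean = mu_true) for a two-sided z-test.

    sigma = 5 is an assumed value chosen only for illustration.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Distance between true and hypothesized mean, in standard errors.
    shift = (mu_true - mu_hyp) / (sigma / n ** 0.5)
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


if __name__ == "__main__":
    for n in (25, 100):
        for alpha in (0.10, 0.05, 0.01):
            print(f"n = {n:3d}, type I level = {alpha:.2f}: "
                  f"type II probability = {type_ii_error(alpha, n):.3f}")
```

For a fixed sample size, lowering the significance level raises the type II error probability; moving to the larger sample size lowers it at every significance level.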

EXERCISE 1
What error do we make?
a. A population has $\mu = 120$. We reject $H_0: \mu \le 125$.
b. A population has $\pi = 0.3$. We do not reject $H_0: \pi \le 0.4$.
c. A population has $\sigma^2 = 2.5$. We do not reject $H_0: \sigma^2 \le 2$.

EXPERIMENTAL DESIGN
In business and politics, a lot depends on what clients, the market and the public require
So you do experiments:
▪ market surveys
▪ questionnaires
▪ customer cards
▪ polls
▪ website analysis (tracking cookies, etc.)
▪ etc.

EXPERIMENTAL DESIGN
How to set up such experiments
▪ qualitative research methods (interview techniques, etc.)
▪ quantitative research methods (choosing sample size, etc.)
Here we will focus on sample size for $\mu$ and $\pi$

CHOOSING SAMPLE SIZE
Recall that the confidence interval of a mean $\mu$ is
$\left( \bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right)$
▪ This means that the width of the confidence interval scales with a factor $\frac{1}{\sqrt{n}}$
If we need to estimate $\mu$ with a (minimal) precision of $\pm E$ (the allowable error), we need a certain (minimal) sample size

CHOOSING SAMPLE SIZE
This gives $E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
▪ so $n = \left( \frac{z_{\alpha/2}\, \sigma}{E} \right)^2$
Example
▪ to realize a 95% confidence interval for the mean of a population with $\sigma = 3$ with precision $E = 1$: $n = (1.96 \times 3 / 1)^2 \approx 34.6$, so use $n = 35$
▪ always round up to the next whole number when determining sample size
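The sample-size formula for a mean is easy to wrap in a small helper. This is a minimal sketch, assuming a known population standard deviation and a normal critical value. The function name `sample_size_for_mean` is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n for which the confidence-interval half-width is at most `margin`.

    Implements n = (z * sigma / margin)^2, rounded up to a whole number.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


if __name__ == "__main__":
    # The slide example: standard deviation 3, precision 1, 95% confidence.
    print(sample_size_for_mean(sigma=3, margin=1))
```

Because the required size grows with the square of 1/margin, halving the allowable error roughly quadruples the sample size.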

CHOOSING SAMPLE SIZE
Observe that we need to know $\sigma$
▪ how to know it?
Three suggestions:
▪ take a small preliminary sample and use the sample $s$ instead of $\sigma$ in the formula
▪ estimate rough lower and upper limits $a$ and $b$ and set $\sigma = \frac{b-a}{\sqrt{12}}$, based on a uniform distribution
▪ estimate rough lower and upper limits $a$ and $b$ and set $\sigma = \frac{b-a}{4}$, based on the fact that most of the values of a normal distribution are between $\mu - 2\sigma$ and $\mu + 2\sigma$
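The two range-based rules of thumb can be written out as follows. This is only a sketch: the function names are hypothetical and the limits 4 and 16 in the usage lines are made-up numbers.

```python
import math


def sigma_from_range_uniform(low, high):
    """Standard deviation of a uniform distribution on [low, high]."""
    return (high - low) / math.sqrt(12)


def sigma_from_range_normal(low, high):
    """Rough guess that treats [low, high] as mean +/- 2 standard deviations."""
    return (high - low) / 4


if __name__ == "__main__":
    low, high = 4, 16  # illustrative rough limits, not from the slides
    print(f"uniform assumption: {sigma_from_range_uniform(low, high):.2f}")
    print(f"normal assumption:  {sigma_from_range_normal(low, high):.2f}")
```

The uniform-based guess is always the larger of the two, so it leads to a larger (more cautious) sample size.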

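CHOOSING SAMPLE SIZE
Likewise, the confidence interval of a proportion $\pi$ is
$\left( p - z_{\alpha/2} \sqrt{\frac{\pi(1-\pi)}{n}},\; p + z_{\alpha/2} \sqrt{\frac{\pi(1-\pi)}{n}} \right)$
This gives $E = z_{\alpha/2} \sqrt{\frac{\pi(1-\pi)}{n}}$
▪ so $n = \pi(1-\pi) \left( \frac{z_{\alpha/2}}{E} \right)^2$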

CHOOSING SAMPLE SIZE
Observe that we need to know $\pi$
▪ so to determine the sample size to estimate $\pi$, you need $\pi$
Four suggestions
▪ take a small preliminary sample and use the sample $p$ instead of $\pi$ in the sample size formula
▪ take a small preliminary sample, find a confidence interval and from this interval use the value closest to 0.5 instead of $\pi$ in the sample size formula
▪ use a prior sample or historical data
▪ assume that $\pi = 0.50$
  ▪ this conservative method ensures the desired precision ($\pi(1-\pi)$ has a maximum at $\pi = 0.50$)
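The sample-size formula for a proportion, with the conservative default of 0.50, might be sketched as below. The function name `sample_size_for_proportion` is hypothetical, and the ±5% margin, the 95% confidence level and the prior guess of 0.20 in the usage lines are illustrative assumptions.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence=0.95, pi_guess=0.5):
    """Smallest n for which the confidence-interval half-width is at most `margin`.

    Implements n = pi(1 - pi) * (z / margin)^2, rounded up.
    pi_guess = 0.5 is the conservative choice: it maximizes pi(1 - pi).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(pi_guess * (1 - pi_guess) * (z / margin) ** 2)


if __name__ == "__main__":
    print(sample_size_for_proportion(margin=0.05))                 # conservative
    print(sample_size_for_proportion(margin=0.05, pi_guess=0.20))  # with a prior guess
```

A prior guess away from 0.50 reduces the required sample size, but only the conservative choice guarantees the precision whatever the true proportion turns out to be.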

EXERCISE 2
We ask a sample of persons whether they are in favor of or against Brexit. We want to estimate the proportion in favor, with a margin of error of no more than ±2%. What sample size should we use?

POWER
▪ Suppose we test a one-sample mean
  ▪ the null hypothesis is $H_0: \mu = \mu_{hyp} = 3$
  ▪ at a significance level $\alpha = 0.05$
▪ If the true parameter is $\mu = \mu_{true} = 3$
  ▪ there is a probability $\alpha = 0.05$ to reject $H_0$
  ▪ so this is the probability to make a type I error

POWER
▪ But if the true parameter is $\mu = \mu_{true} = 3.1$ instead
  ▪ there is a larger probability to reject $H_0$
  ▪ which is the correct decision
  ▪ how large depends on $\sigma$ and $n$ (recall $z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$)
▪ And if the true parameter is $\mu = \mu_{true} = 10$ instead
  ▪ there is an even larger probability to reject $H_0$
  ▪ which is the correct decision

POWER
So, the probability of rejecting an incorrect $H_0$ on the mean depends on
▪ the pre-defined maximum probability $\alpha$ of rejecting a correct $H_0$
▪ the sample size $n$
▪ the standard deviation of the population $\sigma$
▪ the difference between the hypothesized $\mu$ ($\mu_{hyp}$) and the true $\mu$ ($\mu_{true}$)

POWER
Power is defined as the probability of rejecting $H_0$ when it should indeed be rejected
▪ when a specific $H_1$ is true
So: power $= P(\text{reject } H_0 \mid \text{specific } H_1)$
Therefore: power $= 1 - P(\text{do not reject } H_0 \mid \text{specific } H_1) = 1 - \beta$
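For one common special case the definition can be written out explicitly. The derivation below is a sketch that assumes a two-sided z-test for a mean with known standard deviation; the symbol $\delta$ (the distance between the true and the hypothesized mean, measured in standard errors) is introduced here only for this illustration.

```latex
% Assumption: two-sided z-test, known sigma, sample mean normally distributed.
\[
  \delta = \frac{\mu_{true} - \mu_{hyp}}{\sigma / \sqrt{n}}, \qquad
  Z = \frac{\bar{x} - \mu_{hyp}}{\sigma / \sqrt{n}} \sim N(\delta, 1)
  \quad \text{when } \mu = \mu_{true}
\]
\[
  \text{power}
  = P\left( |Z| > z_{\alpha/2} \mid \mu = \mu_{true} \right)
  = 1 - \Phi\left( z_{\alpha/2} - \delta \right) + \Phi\left( -z_{\alpha/2} - \delta \right)
\]
\[
  \beta = 1 - \text{power}
  = \Phi\left( z_{\alpha/2} - \delta \right) - \Phi\left( -z_{\alpha/2} - \delta \right)
\]
```

Setting $\delta = 0$ gives a rejection probability of exactly $\alpha$, which matches the type I error probability when the hypothesized mean is the true mean.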

POWER
To calculate the power of a test for the mean, you need
▪ the significance level $\alpha$ (you choose it)
▪ the sample size $n$ (you choose it)
▪ the standard deviation $\sigma$
▪ the hypothesized mean $\mu_{hyp}$ (you choose it)
▪ the true mean $\mu_{true}$ (you have no clue)
Therefore, we typically do not calculate power
▪ but rather calculate a power function or power curves for different values of $\mu_{true}$
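A power function of this kind can be tabulated in a few lines. The sketch below assumes a two-sided z-test with known standard deviation and a hypothesized mean of 3; the standard deviation of 5 and sample size of 25 (a standard error of 1) are assumptions chosen for illustration, not values stated in the slides. The function name `power_two_sided_z` is hypothetical.

```python
from statistics import NormalDist


def power_two_sided_z(mu_true, mu_hyp=3.0, sigma=5.0, n=25, alpha=0.05):
    """P(reject H0: mean = mu_hyp | mean = mu_true) for a two-sided z-test.

    sigma and n are assumed values; together they give a standard error of 1.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Distance between true and hypothesized mean, in standard errors.
    shift = (mu_true - mu_hyp) / (sigma / n ** 0.5)
    return 1 - std_normal.cdf(z_crit - shift) + std_normal.cdf(-z_crit - shift)


if __name__ == "__main__":
    for mu_true in (3.0, 3.1, 3.4, 4.0, 5.0, 6.0, 10.0):
        print(f"true mean = {mu_true:4.1f}: power = {power_two_sided_z(mu_true):.3f}")
```

At a true mean equal to the hypothesized mean the function returns the significance level, and it rises toward 1 as the true mean moves away in either direction; plotting it against the true mean gives a power curve.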

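POWER CURVES
For a fixed $H_0: \mu = \mu_{hyp}$, what happens for different values of $\mu_{true}$?
[Figure: power curve. Vertical axis: $P(\text{reject } H_0)$, from 0 to 1. Horizontal $\mu$-axis: different options for $\mu_{true}$, with $\mu_0 = \mu_{hyp}$ marked. At one specific $\mu_1 = \mu_{true}$, the figure marks $\beta = P(\text{type II error})$.]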

POWER CURVES
Effect of different values of $\mu_{true}$ (two-sided testing, $\alpha = 0.05$, hypothesized value 3):
▪ $\mu_{true} = 6$
▪ $\mu_{true} = 5$
▪ $\mu_{true} = 4$
▪ $\mu_{true} = 3.4$
[Figures: one slide per value. Upper panel: two curves, labelled 3 and the value of $\mu_{true}$, on a horizontal axis from −1 to 10 and a vertical scale from 0 to 0.4. Lower panel: the power, on a scale from 0 to 1; the marked power is lower when $\mu_{true}$ is closer to 3.]

POWER CURVES
Effect of different values of $\alpha$ (two-sided testing, hypothesized value 3, $\mu_{true} = 5.5$):
▪ $\alpha = 0.05$
▪ $\alpha = 0.1$
[Figures: one slide per value. Upper panel: two curves, labelled 3 and 5.5, on a horizontal axis from −1 to 10 and a vertical scale from 0 to 0.4. Lower panel: the power, on a scale from 0 to 1; the marked power is higher for $\alpha = 0.1$ than for $\alpha = 0.05$.]
