
Business Statistics. $\mu$: Estimates, Confidence Intervals, and Tests



  1. $\mu$: ESTIMATES, CONFIDENCE INTERVALS, AND TESTS
     Business Statistics

  2. CONTENTS
     ▪ Estimating parameters
     ▪ The sampling distribution
     ▪ Confidence intervals for $\mu$
     ▪ Hypothesis tests for $\mu$
     ▪ The $t$-distribution
     ▪ Comparison of $z$ and $t$
     ▪ Old exam question
     ▪ Further study

  3. ESTIMATING PARAMETERS
     Central task in inferential statistics
     ▪ Estimation
       ▪ estimating a parameter (population value) from a sample
     ▪ Example
       ▪ what proportion of cars in Amsterdam is electric?
       ▪ population value: $\pi$
       ▪ sample of size $n = 200$ cars yields 26 electric cars
       ▪ so, $p = \frac{26}{200} = 0.13$
       ▪ this suggests $\pi \approx 0.13$

  4. ESTIMATING PARAMETERS
     Terminology
     ▪ Parameter
       ▪ a characteristic descriptive of the population
       ▪ e.g., $\mu$, $\pi$, $\sigma$ (or $\sigma^2$)
     ▪ Estimator
       ▪ a statistic derived from a sample to infer the value of a population parameter
       ▪ e.g., $\bar{X}$, $P$, $S$ (or $S^2$)
     ▪ Estimate
       ▪ the value of the estimator in a particular sample
       ▪ e.g., $\bar{x}$, $p$, $s$ (or $s^2$)

  5. ESTIMATING PARAMETERS

  6. ESTIMATING PARAMETERS
     Population parameter, estimator, and estimate
     ▪ Mean
       ▪ population parameter: $\mu$
       ▪ estimator: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
       ▪ estimate: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
     ▪ Standard deviation
       ▪ population parameter: $\sigma$
       ▪ estimator: $S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2}$
       ▪ estimate: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$
     ▪ Proportion
       ▪ population parameter: $\pi$
       ▪ estimator: $P = \frac{X}{n}$
       ▪ estimate: $p = \frac{x}{n}$
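
To make the difference between an estimator (a formula) and an estimate (a number) concrete, here is a minimal sketch that evaluates the three formulas on one sample. The beer prices are invented for illustration and are not data from the course; only the count of 26 electric cars out of 200 comes from the slides.

```python
import math
import statistics

# Hypothetical sample of beer prices in euros (invented for illustration).
prices = [1.90, 2.10, 2.05, 2.20, 1.95, 2.16]

n = len(prices)
x_bar = sum(prices) / n  # estimate of the population mean mu

# Sample standard deviation with the n - 1 denominator.
s = math.sqrt(sum((x - x_bar) ** 2 for x in prices) / (n - 1))

# Proportion estimate from the electric-car example: 26 out of 200.
p = 26 / 200

print(f"x_bar = {x_bar:.3f}")
print(f"s     = {s:.3f} (statistics.stdev gives {statistics.stdev(prices):.3f})")
print(f"p     = {p:.2f}")
```

The hand-written formula for `s` and the standard-library `statistics.stdev` both use the $n - 1$ denominator, so they should agree.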

  7. ESTIMATING PARAMETERS
     ▪ Another example (Amsterdam, 2015):
       ▪ what is the mean price of a glass of beer?
       ▪ population value: $\mu$
       ▪ sample of size $n = 64$ glasses of beer yields $\bar{x} = €2.06$
       ▪ this suggests that $\mu \approx €2.06$
     ▪ But suppose we had taken a different sample
       ▪ again with sample size $n = 64$
       ▪ but now perhaps yielding $\bar{x} = €2.13$
       ▪ then we would estimate $\mu \approx €2.13$
     ▪ Obviously there is sampling variation
       ▪ so a distribution of $\bar{x}$-values (the sampling distribution of $\bar{X}$)
     ▪ Solution: point estimates and confidence intervals

  8. THE SAMPLING DISTRIBUTION
     ▪ Example
       ▪ Consider a discrete uniform population consisting of the integers {0, 1, 2, 3}
       ▪ The population parameters are:
         ▪ $\mu = 1.5$
         ▪ $\sigma = 1.118$

  9. THE SAMPLING DISTRIBUTION
     ▪ Sample $n = 2$ values and calculate $\bar{x}$
     ▪ Do this for all possible samples of size $n = 2$
     ▪ You will get a distribution of $\bar{x}$-values: the distribution of $\bar{X}$
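
The enumeration described here is small enough to carry out in full. The sketch below lists every sample of size 2 from {0, 1, 2, 3} and tabulates the resulting sample means. It assumes ordered sampling with replacement, which the slide does not state explicitly; under that assumption there are 16 equally likely samples.

```python
import math
from collections import Counter
from fractions import Fraction
from itertools import product

population = [0, 1, 2, 3]
n = 2

# Population parameters (each value is equally likely).
mu = sum(population) / len(population)
sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / len(population))

# All ordered samples of size n, drawn with replacement (an assumption).
samples = list(product(population, repeat=n))
means = [Fraction(sum(sample), n) for sample in samples]

# Sampling distribution of the sample mean: value -> probability.
counts = Counter(means)
dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

mean_of_means = float(sum(m * prob for m, prob in dist.items()))
sd_of_means = math.sqrt(
    sum(float(prob) * (float(m) - mean_of_means) ** 2 for m, prob in dist.items())
)

for m, prob in dist.items():
    print(f"x_bar = {float(m):.1f}  probability = {prob}")
print(f"population:   mu = {mu}, sigma = {sigma:.3f}")
print(f"sample means: mean = {mean_of_means}, sd = {sd_of_means:.3f}")
```

The mean of the sample means equals the population mean, while their standard deviation is smaller than $\sigma$ by a factor $\sqrt{2}$.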

  10. THE SAMPLING DISTRIBUTION
     ▪ We will study the variance of the estimate of a population parameter from a sample statistic
     ▪ We will do so by studying how the sample statistic varies when you draw a different sample
     ▪ Example:
       ▪ GMAT score of MBA students
       ▪ $N = 2637$
       ▪ $\mu = 520.78$
       ▪ $\sigma = 86.60$

  11. THE SAMPLING DISTRIBUTION
     ▪ Consider eight random samples, each of size $n = 5$
       ▪ the sample means ($\bar{x}_1 = 504.0$, $\bar{x}_2 = 576.0$, …, $\bar{x}_8 = 582$) tend to be close to the population mean ($\mu = 520.78$)
       ▪ sometimes a bit lower, sometimes a bit higher

  12. THE SAMPLING DISTRIBUTION
     ▪ The dot plots show that the sample means ($\bar{x}_1, \ldots, \bar{x}_8$) have much less variation than the individual data points ($x_1, \ldots, x_{2637}$)

  13. THE SAMPLING DISTRIBUTION
     ▪ An estimator is a random variable since samples vary
       ▪ so we write it as a capital letter, e.g., $X$, $\bar{X}$, $S$, etc.
     ▪ The sampling distribution of an estimator is the probability distribution of all possible values the statistic may assume when a random sample of (a fixed) size $n$ is taken
       ▪ so we write $X \sim N(\mu, \sigma)$, etc.

  14. THE SAMPLING DISTRIBUTION
     ▪ The sampling distribution of $\bar{X}$
       ▪ for a population with $\mu = \mu_X$ and $\sigma^2 = \sigma_X^2$
     ▪ If the CLT holds
       ▪ $\bar{X} \sim N\left(\mu_X, \frac{\sigma_X^2}{n}\right)$
       ▪ 3 things: shape, mean, dispersion
     ▪ So, the statistic $\bar{X}$
       ▪ is normally distributed
       ▪ has mean $\mu_X$
       ▪ and has standard deviation $\frac{\sigma_X}{\sqrt{n}}$
     ▪ Fortunately, the CLT holds pretty often

  15. THE SAMPLING DISTRIBUTION
     ▪ The standard deviation of the distribution of sample means $\bar{X}$
       ▪ is given by $\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}}$
       ▪ has a special name: standard error of the mean
       ▪ is often abbreviated as the standard error (SE)
         (that's a bit confusing, because we will meet more standard errors later on)
       ▪ decreases with increasing sample size
       ▪ but only according to the “law of diminishing returns” ($1/\sqrt{n}$)
       ▪ is often calculated by software (SPSS, etc.)
       ▪ is the basis for confidence intervals and hypothesis tests (see later)
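
The “diminishing returns” behaviour is easy to see numerically. The sketch below evaluates $\sigma/\sqrt{n}$ for the GMAT population ($\sigma = 86.60$) at a few sample sizes; the sizes beyond $n = 5$ are chosen here purely for illustration.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


sigma = 86.60  # population standard deviation of the GMAT scores

# Each step multiplies the sample size by 4, which only halves the SE.
for n in (5, 20, 80, 320):
    print(f"n = {n:3d}  SE = {standard_error(sigma, n):6.2f}")
```

Because the standard error shrinks with $\sqrt{n}$, halving it requires four times as many observations.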

  16. EXERCISE 1 What is the meaning of the standard error?

  17. CONFIDENCE INTERVALS FOR $\mu$
     ▪ A sample mean $\bar{x}$ is a point estimate of the population mean $\mu$
       ▪ it is the best possible estimate of $\mu$
       ▪ but it will probably not be completely right
     ▪ A confidence interval (CI) for the mean is a range of possible values for $\mu$: $\mu_{\text{lower}} \le \mu \le \mu_{\text{upper}}$
       ▪ such that the interval $CI_\mu = [\mu_{\text{lower}}, \mu_{\text{upper}}]$ contains the true value ($\mu$) with a certain probability (e.g., 95%)
     Note: to simplify notation, we will drop the “$X$” from $\mu_X$ now, and write just $\mu$

  18. CONFIDENCE INTERVALS FOR $\mu$
     ▪ From the CLT it follows that under certain conditions:
       ▪ the distribution of $\bar{X}$ is normal
       ▪ the best estimate of $\mu$ is $\bar{x}$
       ▪ the standard deviation of $\bar{X}$ is $\frac{\sigma}{\sqrt{n}}$
     ▪ This implies that:
       ▪ with probability 2.5%, $\bar{X} < \mu - 1.96\frac{\sigma}{\sqrt{n}} \Rightarrow \mu > \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$
       ▪ with probability 2.5%, $\bar{X} > \mu + 1.96\frac{\sigma}{\sqrt{n}} \Rightarrow \mu < \bar{X} - 1.96\frac{\sigma}{\sqrt{n}}$
       ▪ so with probability 95%, $\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$
     ▪ So, if we find a sample mean $\bar{x}$, we can construct the following 95% confidence interval for $\mu$:
       $CI_{\mu,0.95} = \left[\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right]$

  19. CONFIDENCE INTERVALS FOR $\mu$
     Three notations for a confidence interval for $\mu$
     ▪ $\left[\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right]$
     ▪ $\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}$
     ▪ $\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$

  20. CONFIDENCE INTERVALS FOR $\mu$
     Example
     ▪ Population
       ▪ $\mu = 520.78$ (unknown)
       ▪ $\sigma = 86.60$ (known)
       ▪ normally distributed (assumed)
     ▪ Sample
       ▪ $n = 5$ (chosen)
       ▪ $\bar{x} = 504.0$ (estimated)
     ▪ Calculation
       ▪ standard error of mean: $\frac{86.60}{\sqrt{5}} = 38.73$
       ▪ $1.96 \times 38.73 = 75.91$
       ▪ $CI_{\mu,0.95} = [428.09, 579.91]$
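
The calculation in this example can be reproduced in a few lines. This is a minimal sketch; the function name `z_interval` is chosen here for illustration, and it applies only when $\sigma$ is known.

```python
import math


def z_interval(x_bar: float, sigma: float, n: int, z: float = 1.96):
    """Confidence interval for the mean when sigma is known."""
    se = sigma / math.sqrt(n)  # standard error of the mean
    margin = z * se            # half-width of the interval
    return x_bar - margin, x_bar + margin


# Values from the GMAT example: x_bar = 504.0, sigma = 86.60, n = 5.
lower, upper = z_interval(504.0, 86.60, 5)
print(f"95% CI for mu: [{lower:.2f}, {upper:.2f}]")
```

Rounded to two decimals, the bounds agree with the interval computed by hand in the example.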

  21. EXERCISE 2
     Write the confidence interval $[428.09, 579.91]$ in two alternative ways.

  22. CONFIDENCE INTERVALS FOR $\mu$
     ▪ The factor 1.96 is of course related to the 95% probability
     ▪ Other confidence levels use a different factor $z_{\alpha/2}$
       ▪ where $z_{\alpha/2}$ is such that $P(Z \le z_{\alpha/2}) = 1 - \alpha/2$ if $Z$ is drawn from a $Z$-distribution (standard normal)
     ▪ General form of a $(1 - \alpha) \times 100\%$ confidence interval of the mean:
       $CI_{\mu,1-\alpha} = \left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$
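
The factor for any confidence level can be obtained as the $1 - \alpha/2$ quantile of the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` and the choice of levels are illustrative assumptions.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical value z_{alpha/2} for the given confidence level."""
    alpha = 1 - confidence
    # The 1 - alpha/2 quantile leaves alpha/2 probability in the upper tail.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

For a 95% confidence level this gives the familiar factor of about 1.96.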

  23. CONFIDENCE INTERVALS FOR ๐œˆ

  24. CONFIDENCE INTERVALS FOR $\mu$
     ▪ Trade-off
       ▪ narrow CI ⇔ low confidence level
       ▪ wide CI ⇔ high confidence level
     ▪ Choice of confidence level depends on application
       ▪ more precision required for a refinery than for a dairy farm

  25. CONFIDENCE INTERVALS FOR $\mu$
     ▪ A confidence interval either does or does not contain $\mu$
     ▪ The confidence level quantifies the risk
     ▪ Out of 100 confidence intervals at the 95% level, approximately 95 will contain $\mu$, while approximately 5 will not contain $\mu$
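
This long-run interpretation can be checked by simulation. The sketch below draws many samples from a normal population with the GMAT parameters, builds a 95% interval from each, and counts how often the interval contains $\mu$. The normal population, the number of repetitions, and the seed are assumptions made for this illustration.

```python
import math
import random


def coverage(mu, sigma, n, reps, z=1.96, seed=1):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n and compute its mean.
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - z * se <= mu <= x_bar + z * se:
            hits += 1
    return hits / reps


rate = coverage(mu=520.78, sigma=86.60, n=5, reps=20000)
print(f"fraction of intervals containing mu: {rate:.3f}")
```

Each individual interval either contains $\mu$ or does not; it is the procedure that succeeds in roughly 95% of repetitions.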

  26. HYPOTHESIS TESTS FOR $\mu$
     ▪ We can use the standard error to perform a hypothesis test
       ▪ recall that $CI_{\mu,0.95} = [428.09, 579.91]$
     ▪ Suppose we hypothesize $\mu = 550$
     ▪ The value 550 is inside the 95% confidence interval for $\mu$
       ▪ therefore the sample statistic and its confidence interval do not suggest that the hypothesis ($\mu = 550$) is wrong
       ▪ and we will not reject the hypothesis
       ▪ notice that we didn't say that $\mu = 550$; we only said that we can't reject it (at a 5% significance level)

  27. HYPOTHESIS TESTS FOR $\mu$
     ▪ Another example: suppose we hypothesize that $\mu = 600$
     ▪ The value 600 is outside the confidence interval for $\mu$
       ▪ finding a confidence interval not containing $\mu$ happens only in 5% of the cases
       ▪ so we conclude that $\mu \ne 600$ (at a 5% significance level)
       ▪ therefore the sample statistic and its confidence interval suggest that the hypothesis ($\mu = 600$) is wrong
       ▪ and we will reject the hypothesis
     Much more on hypothesis tests later on!
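
The decision rule used in these two examples amounts to checking whether the hypothesized value lies inside the 95% confidence interval. A minimal sketch, with a function name chosen here for illustration:

```python
def reject_at_5_percent(mu_0, ci_95):
    """Reject the hypothesis mu = mu_0 when mu_0 lies outside the 95% CI."""
    lower, upper = ci_95
    return not (lower <= mu_0 <= upper)


ci_95 = (428.09, 579.91)  # interval from the GMAT example

for mu_0 in (550, 600):
    decision = "reject" if reject_at_5_percent(mu_0, ci_95) else "do not reject"
    print(f"hypothesis mu = {mu_0}: {decision}")
```

Note that “do not reject” is not the same as accepting the hypothesis as true.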

  28. THE $t$-DISTRIBUTION
     ▪ A closer look at $CI_{\mu,0.95} = \left[\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right]$
     ▪ Given a sample mean $\bar{x}$, you can find a 95% confidence interval for the population mean $\mu$
     ▪ Sounds great when you don't know $\mu$ ...
     ▪ ... but it assumes you do know $\sigma$!
     ▪ There are many situations in which you don't know $\mu$ and you also don't know $\sigma$
     ▪ So what to do?

  29. THE $t$-DISTRIBUTION
     ▪ A simple strategy
       ▪ If the population standard deviation $\sigma$ is unknown, we can estimate it with the sample standard deviation $s$
       ▪ Then we use $\pm 1.96\frac{s}{\sqrt{n}}$ instead of $\pm 1.96\frac{\sigma}{\sqrt{n}}$
     ▪ But we pay a price for that
       ▪ The reason is that $s$ is itself an estimate of $\sigma$, and therefore uncertain
       ▪ The price we pay is that the factor “1.96” must be somewhat larger
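
A simulation suggests why the factor 1.96 is too small once $\sigma$ is replaced by $s$. The sketch below repeats the interval construction with $s$ estimated from each sample of size $n = 5$ and measures how often the interval still contains $\mu$. The normal population, repetition count, and seed are assumptions for this illustration.

```python
import math
import random
import statistics


def coverage_with_s(mu, sigma, n, reps, z=1.96, seed=1):
    """Fraction of intervals x_bar +/- z * s / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        s = statistics.stdev(sample)  # sample standard deviation (n - 1)
        margin = z * s / math.sqrt(n)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / reps


rate = coverage_with_s(mu=520.78, sigma=86.60, n=5, reps=20000)
print(f"coverage with s and factor 1.96: {rate:.3f}")
```

With such a small sample the coverage typically comes out noticeably below 95%, so a larger factor is needed to restore the stated confidence level.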
