SLIDE 1
Chapter 7
Inferences Based on a Single Sample: Estimation with Confidence Intervals
SLIDE 2 Large-Sample Confidence Interval for a Population Mean
How to estimate the population mean and assess the estimate’s reliability?
$\bar{x}$ is an estimate of $\mu$, and we use the CLT to assess how accurate that estimate is
According to the CLT, 95% of all $\bar{x}$ from samples of size n lie within $\pm 1.96\,\sigma_{\bar{x}}$ of the mean $\mu$
We can use this to assess the accuracy of $\bar{x}$ as an estimate of $\mu$
SLIDE 3 Large-Sample Confidence Interval for a Population Mean
We are 95% confident, for any $\bar{x}$ from a sample of size n, that $\mu$ will lie in the interval
$$\bar{x} \pm 1.96\,\sigma_{\bar{x}} = \bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$$
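The "95% confident" statement can be checked by simulation. The sketch below is illustrative only: the population values (mu = 50, sigma = 10), the sample size, and the function name are assumptions chosen for the demonstration, not values from the slides. It draws many samples, builds the interval x_bar ± 1.96·sigma/sqrt(n) for each, and counts how often the interval contains mu.

```python
import random
import statistics


def coverage_of_95_interval(mu=50.0, sigma=10.0, n=40, trials=2000, seed=0):
    """Fraction of intervals x_bar +/- 1.96*sigma/sqrt(n) that contain mu.

    mu, sigma, n and trials are hypothetical values picked for illustration.
    """
    rng = random.Random(seed)
    half_width = 1.96 * sigma / n ** 0.5  # sigma is treated as known here
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials


coverage = coverage_of_95_interval()
print(f"share of intervals containing mu: {coverage:.3f}")
```

The printed share should land close to .95; it is the procedure that is right about 95% of the time, not any single interval.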
SLIDE 4 Large-Sample Confidence Interval for a Population Mean
We usually don’t know $\sigma$, but with a large sample s is a good estimator of $\sigma$.
We can calculate confidence intervals for different confidence coefficients
Confidence coefficient – probability that a randomly selected confidence interval encloses the population parameter
Confidence level – confidence coefficient expressed as a percentage
SLIDE 5
Large-Sample Confidence Interval for a Population Mean
The confidence coefficient is equal to $1-\alpha$, and $\alpha$ is split equally between the two tails of the distribution ($\alpha/2$ in each)
SLIDE 6 Large-Sample Confidence Interval for a Population Mean
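The confidence interval is expressed more generally as
$$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
For samples of size n > 30, the confidence interval is expressed as
$$\bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
Requires that the sample used be random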
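A minimal sketch of the large-sample interval, x_bar ± z·s/sqrt(n), using only the Python standard library. The function name and the sample data are hypothetical; the data exist only so that n exceeds 30.

```python
from statistics import NormalDist, fmean, stdev


def large_sample_ci(sample, confidence=0.95):
    """Large-sample confidence interval for a mean, with s standing in for sigma."""
    n = len(sample)
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    x_bar = fmean(sample)
    half_width = z * stdev(sample) / n ** 0.5  # stdev() is the sample s
    return x_bar - half_width, x_bar + half_width


# Hypothetical data: 35 observations, so the large-sample condition n > 30 holds.
sample = [20 + (i % 7) for i in range(35)]
low, high = large_sample_ci(sample, confidence=0.95)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

Raising `confidence` to 0.99 widens the interval, since a larger z multiplies the same standard error.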
SLIDE 7 Large-Sample Confidence Interval for a Population Mean
Commonly used values of $z_{\alpha/2}$

Confidence level $100(1-\alpha)$    $\alpha$    $\alpha/2$    $z_{\alpha/2}$
90%                                 .10         .05           1.645
95%                                 .05         .025          1.96
99%                                 .01         .005          2.575
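The tabled values can be reproduced from the standard normal distribution. This is a sketch using Python's `statistics.NormalDist`; the dictionary name is an assumption for illustration. Note that the exact 99% value is about 2.576, and printed tables often list it as 2.575.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

z_values = {}
for level in (0.90, 0.95, 0.99):
    alpha = 1 - level
    # z_{alpha/2} leaves an area of alpha/2 in the upper tail
    z_values[level] = standard_normal.inv_cdf(1 - alpha / 2)

for level, z in z_values.items():
    print(f"confidence level {level:.0%}: z = {z:.3f}")
```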
SLIDE 8
Small-Sample Confidence Interval for a Population Mean
Two problems presented by sample sizes of less than 30
–The CLT no longer guarantees that the sampling distribution of $\bar{x}$ is approximately normal
–Population standard deviation $\sigma$ is almost always unknown, and s may provide a poor estimate when n is small
SLIDE 9 Small-Sample Confidence Interval for a Population Mean
If we can assume that the sampled population is approximately normal, then the sampling distribution of $\bar{x}$ can be assumed to be approximately normal
Instead of using
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
we use
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
This t is referred to as the t-statistic
SLIDE 10 Small-Sample Confidence Interval for a Population Mean
The t-statistic has:
–a sampling distribution very similar to z
–variability dependent on the sample size n
Variability is expressed as (n-1) degrees of freedom (df). As df gets smaller, variability increases
SLIDE 11 Small-Sample Confidence Interval for a Population Mean
The table for the t-distribution contains values of $t_\alpha$ for various combinations of degrees of freedom and $\alpha$
The partial table below shows the components of the table
Need Table 7.3 from text inserted here.
SLIDE 12
Small-Sample Confidence Interval for a Population Mean
Comparing t and z distributions for the same $\alpha$, with df=4 for the t-distribution, you can see that the t-score is larger, and therefore the confidence interval will be wider.
The closer df gets to 30, the more closely the t-distribution approximates the normal distribution
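As a worked comparison, take 95% confidence, so that $\alpha/2 = .025$. The critical values below come from standard z and t tables, not from the slide itself:

```latex
z_{.025} = 1.96, \qquad t_{.025}\ (\text{df} = 4) = 2.776, \qquad \frac{2.776}{1.96} \approx 1.42
```

For the same s and n, the interval based on t with df = 4 is therefore about 42% wider than the one based on z.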
SLIDE 13 Small-Sample Confidence Interval for a Population Mean
When creating a confidence interval around $\mu$ for a small sample we use
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
basing $t_{\alpha/2}$ on n-1 degrees of freedom
We assume a random sample drawn from a population that is approximately normally distributed
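A minimal sketch of the small-sample interval x_bar ± t·s/sqrt(n) at 95% confidence. The Python standard library has no t-distribution, so the critical values are hard-coded from a standard t table for df = 1 to 10; that lookup, the function name, and the five data points are assumptions made for illustration.

```python
from statistics import fmean, stdev

# Upper-tail critical values t_{.025} (95% confidence), keyed by degrees of freedom.
# Copied from a standard t table; only df = 1..10 are included in this sketch.
T_025 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
         6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def small_sample_ci_95(sample):
    """95% t-interval for the mean; assumes an approximately normal population."""
    n = len(sample)
    df = n - 1
    if df not in T_025:
        raise ValueError("this sketch only tabulates t for df = 1..10")
    t = T_025[df]
    x_bar = fmean(sample)
    half_width = t * stdev(sample) / n ** 0.5
    return x_bar - half_width, x_bar + half_width


sample = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical data, n = 5 so df = 4
low, high = small_sample_ci_95(sample)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

With a statistics package available, the table lookup would normally be replaced by that package's t quantile function.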
SLIDE 14 Large-Sample Confidence Interval for a Population Proportion
Confidence intervals around a proportion are confidence intervals around the probability of success in a binomial experiment
Sample statistic of interest is $\hat{p}$
Mean of the sampling distribution of $\hat{p}$ is p, so $\hat{p}$ is an unbiased estimator of p
Standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{pq/n}$, where q=1-p
For large samples, the sampling distribution of $\hat{p}$ is approximately normal
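SLIDE 15 Large-Sample Confidence Interval for a Population Proportion
Sample size n is large if $\hat{p} \pm 3\sigma_{\hat{p}}$ falls between 0 and 1
Confidence interval is calculated as
$$\hat{p} \pm z_{\alpha/2}\,\sigma_{\hat{p}} = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{pq}{n}} \approx \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
where $\hat{p} = x/n$ (x is the number of successes in the sample) and $\hat{q} = 1 - \hat{p}$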
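A minimal sketch of the large-sample interval for a proportion, p_hat ± z·sqrt(p_hat·q_hat/n), including the check that p_hat ± 3 standard errors stays between 0 and 1. The function name and the counts (120 successes in 400 trials) are hypothetical, and the check uses the estimated standard error because p itself is unknown.

```python
from statistics import NormalDist


def proportion_ci(x, n, confidence=0.95):
    """Large-sample confidence interval for a proportion from x successes in n trials."""
    p_hat = x / n
    q_hat = 1 - p_hat
    se = (p_hat * q_hat / n) ** 0.5  # estimated standard deviation of p_hat
    # "n is large" check: p_hat +/- 3 standard errors must stay inside (0, 1)
    if not (p_hat - 3 * se > 0 and p_hat + 3 * se < 1):
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p_hat - z * se, p_hat + z * se


low, high = proportion_ci(x=120, n=400)  # hypothetical counts, p_hat = .30
print(f"95% CI for p: ({low:.4f}, {high:.4f})")
```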
SLIDE 16 Large-Sample Confidence Interval for a Population Proportion
When p is near 0 or 1, the confidence intervals calculated using the formulas presented are misleading
An adjustment can be used that works for any p, even with very small sample sizes
$$\tilde{p} \pm z_{\alpha/2}\sqrt{\frac{\tilde{p}(1-\tilde{p})}{n+4}}$$
where $\tilde{p} = (x+2)/(n+4)$ is the adjusted sample proportion
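A sketch of the adjusted interval. It assumes the usual definition of the adjusted proportion, p_tilde = (x + 2)/(n + 4), which adds two successes and two failures to the sample. The function name and the counts (2 successes in 20 trials) are hypothetical.

```python
from statistics import NormalDist


def adjusted_proportion_ci(x, n, confidence=0.95):
    """Adjusted interval for a proportion, centered on p_tilde instead of p_hat."""
    p_tilde = (x + 2) / (n + 4)  # assumed definition of the adjusted proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * (p_tilde * (1 - p_tilde) / (n + 4)) ** 0.5
    return p_tilde - half_width, p_tilde + half_width


# Hypothetical small sample with p near 0: 2 successes in 20 trials.
low, high = adjusted_proportion_ci(x=2, n=20)
print(f"adjusted 95% CI for p: ({low:.4f}, {high:.4f})")
```

The interval is centered on p_tilde = 4/24, which is about .167, and not on the raw proportion 2/20 = .10. That shift away from the boundary is the adjustment.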
SLIDE 17 Determining the Sample Size
When we want to estimate $\mu$ to within x units with a $(1-\alpha)$ level of confidence, we can calculate the sample size needed
We use the sampling error (SE), which is half the width of the confidence interval
To estimate $\mu$ with sampling error SE and $100(1-\alpha)\%$ confidence,
$$n = \frac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2}$$
where $\sigma$ is estimated by s or R/4 (R is the range)
SLIDE 18 Determining the Sample Size
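Assume a sample with $\alpha$=.01, and a range R of .4
What size sample do we need to achieve a desired SE of .025?
With $z_{.005} = 2.575$ and $\sigma \approx R/4 = .1$,
$$n = \frac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2} = \frac{(2.575)^2(.1)^2}{(.025)^2} = 106.09$$
so we round up to n = 107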
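The arithmetic in this example can be reproduced in a few lines. The sketch below uses the slide's numbers (z = 2.575, R = .4, SE = .025); the function name is hypothetical, and rounding up to the next whole unit is the usual convention because a sample cannot contain a fraction of an observation.

```python
import math


def sample_size_for_mean(z, sigma, se):
    """Exact and rounded-up n from n = z^2 * sigma^2 / SE^2."""
    n_exact = z ** 2 * sigma ** 2 / se ** 2
    return n_exact, math.ceil(n_exact)


R = 0.4            # range given on the slide
sigma_est = R / 4  # rough estimate of sigma from the range
n_exact, n_needed = sample_size_for_mean(z=2.575, sigma=sigma_est, se=0.025)
print(f"n = {n_exact:.2f}, so sample at least {n_needed} units")
```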
SLIDE 19 Determining the Sample Size
Sample size can also be estimated for population proportion p
$$n = \frac{(z_{\alpha/2})^2\,pq}{(SE)^2}$$
Since pq is unknown, you must estimate it. Estimates with p equal or close to .5 are the most conservative, because pq is largest at p = .5
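A sketch of the sample-size calculation for a proportion, showing why p = .5 is the conservative choice. The function name, the target SE of .03, and the alternative guess p = .2 are assumptions for illustration.

```python
import math


def sample_size_for_proportion(z, se, p=0.5):
    """n = z^2 * p * q / SE^2, rounded up; p = .5 gives the largest (safest) n."""
    q = 1 - p
    return math.ceil(z ** 2 * p * q / se ** 2)


# Hypothetical target: 95% confidence (z = 1.96) and SE = .03.
n_conservative = sample_size_for_proportion(z=1.96, se=0.03)      # p = .5 -> 1068
n_with_guess = sample_size_for_proportion(z=1.96, se=0.03, p=0.2)  # p = .2 -> 683
print(n_conservative, n_with_guess)
```

If the guess of p = .2 turned out to be too far from the truth, the smaller sample might miss the target SE; the p = .5 sample size cannot.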
SLIDE 20 Finite Population Correction for Simple Random Sampling
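Used when the sample size n is large relative to the size of the population N, when n/N > .05
Standard error calculation for $\bar{x}$ with correction
$$\hat{\sigma}_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N}}$$
Standard error calculation for $\hat{p}$ with correction
$$\hat{\sigma}_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\sqrt{\frac{N-n}{N}}$$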
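A sketch of both corrected standard errors, using the correction factor sqrt((N - n)/N). The function names and the numbers (s = 10, p_hat = .5, n = 100, N = 1000) are hypothetical; here n/N = .10, so the correction applies.

```python
def fpc(n, N):
    """Finite population correction factor sqrt((N - n) / N)."""
    return ((N - n) / N) ** 0.5


def se_mean_corrected(s, n, N):
    """Estimated standard error of x_bar with the finite population correction."""
    return (s / n ** 0.5) * fpc(n, N)


def se_proportion_corrected(p_hat, n, N):
    """Estimated standard error of p_hat with the finite population correction."""
    return (p_hat * (1 - p_hat) / n) ** 0.5 * fpc(n, N)


# Hypothetical survey: n = 100 drawn from N = 1000, so n/N = .10 > .05.
se_mean = se_mean_corrected(s=10.0, n=100, N=1000)
se_prop = se_proportion_corrected(p_hat=0.5, n=100, N=1000)
print(f"corrected SE of mean: {se_mean:.4f}, of proportion: {se_prop:.4f}")
```

The uncorrected values would be 1.0 and .05, so the correction shrinks each standard error by about 5%.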
SLIDE 21 Sample Survey Designs
- Simple Random Sample
- Stratified Random Sampling
–Separation into two or more groups of sampling units
–Produces estimators with smaller standard errors
–Increases representativeness
–Can reduce cost
SLIDE 22 Sample Survey Designs
- Systematic Sampling
–Sampling of every nth unit
–Samples are easier to select
–Can lead to systematic bias, particularly if there is periodicity in the population you are sampling from
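The periodicity problem with sampling every nth unit can be seen in a toy example. Everything here is hypothetical: a made-up series of 364 daily values with a spike every 7th day, sampled every 7th unit from two different starting points.

```python
def systematic_sample(units, k, start=0):
    """Take every k-th unit, beginning at index `start`."""
    return units[start::k]


def mean(values):
    return sum(values) / len(values)


# Hypothetical population: 364 daily values, 150 on every 7th day and 100 otherwise.
population = [150 if day % 7 == 0 else 100 for day in range(364)]
population_mean = mean(population)

# The sampling interval k = 7 matches the 7-day cycle in the data.
on_spike = systematic_sample(population, k=7, start=0)   # hits the spike every time
off_spike = systematic_sample(population, k=7, start=3)  # never hits the spike

print(f"population mean: {population_mean:.2f}")
print(f"sample mean, start=0: {mean(on_spike):.2f}")
print(f"sample mean, start=3: {mean(off_spike):.2f}")
```

Neither sample mean matches the population mean, and which way the estimate is off depends entirely on the starting point. Choosing an interval that does not line up with the cycle avoids the problem in this toy case.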
SLIDE 23 Sample Survey Designs
- Randomized Response Sampling
–Used when questions in the survey are of a sensitive nature and likely to result in false answers
SLIDE 24 Sample Survey Designs
Nonresponse – when units in the sample do not produce observations
Nonresponse can produce bias in the results of the survey if there is a relationship between the type of response and whether or not a response is achieved.
If a random sample is called for, any nonresponse means that your sample is no longer random