Samples and Statistics



  1. ST 435/535 Statistical Methods for Quality and Productivity Improvement / Statistical Process Control
     Inferences About Process Quality: Statistics and Sampling Distributions

     Samples and Statistics

     “The objective of statistical inference is to draw conclusions or make decisions about a population, based on a sample selected from the population.”

     Inference is simplest when the sample is a random sample from the population: the sample values $X_1, X_2, \ldots, X_n$ are statistically independent and all have the same distribution.

     That is not possible when sampling without replacement from a finite population; in that case, a random sample is one that is drawn in such a way that all $\binom{N}{n}$ possible samples have the same probability of being chosen.
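
The finite-population definition can be illustrated with a small sketch. It assumes a hypothetical population of N = 5 labelled units and samples of size n = 2; the names `population`, `draws` and `counts` are illustrative and do not come from the slides. `random.sample` draws without replacement, so each of the $\binom{N}{n}$ subsets should appear about equally often.

```python
import math
import random
from collections import Counter
from itertools import combinations

random.seed(0)  # fixed seed so the sketch is reproducible

population = ["A", "B", "C", "D", "E"]  # hypothetical finite population, N = 5
n = 2                                    # sample size

# Number of distinct samples of size n: "N choose n".
num_samples = math.comb(len(population), n)

# Draw many samples without replacement and count how often each subset appears.
draws = 20_000
counts = Counter(
    frozenset(random.sample(population, n)) for _ in range(draws)
)

# Every possible subset should have relative frequency close to 1 / num_samples.
for subset in combinations(population, n):
    freq = counts[frozenset(subset)] / draws
    print(sorted(subset), round(freq, 3))
```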

  2. It is not always possible or desirable to use a random sample. For example, the successive values plotted in a control chart are rarely independent, because they are influenced by slow-changing properties of the system.

     When we know, or suspect, that the sample was not a random sample, we should use appropriate methods.

  3. Statistic

     A statistic is a quantity that can be calculated from only the values in a sample. Examples of statistics:

     Sample mean:
     $$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

     Sample standard deviation:
     $$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

     A quantity like $\bar{x} - \mu$ is not a statistic, because to calculate it we must know the value of the population parameter $\mu$.
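
As a quick illustration of the two formulas, the sketch below computes the sample mean and sample standard deviation by hand and checks them against Python's `statistics` module. The data values are made up for the example.

```python
import math
import statistics

x = [9.8, 10.1, 10.4, 9.9, 10.3]  # made-up measurements, for illustration only
n = len(x)

# Sample mean: (1/n) * sum of x_i
x_bar = sum(x) / n

# Sample standard deviation: square root of the sum of squared deviations over n - 1
s = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))

# The standard library gives the same values.
assert math.isclose(x_bar, statistics.mean(x))
assert math.isclose(s, statistics.stdev(x))

print(x_bar, s)
```

Both quantities use only the sample values, which is what makes them statistics.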

  4. Sampling distribution

     A statistic computed from a random sample is itself a random variable, and has its own probability distribution.

     The distribution of a statistic of a random sample is called its sampling distribution, to emphasize that we are dealing with a statistic and not a single observation.

  5. Sampling from a normal distribution

     Suppose that $X_1, X_2, \ldots, X_n$ is a random sample from a normal population with mean $\mu$ and variance $\sigma^2$. That is, $X_1, X_2, \ldots, X_n$ are independent, and each is distributed as $N(\mu, \sigma^2)$.

     Then the sampling distribution of the sample mean $\bar{X}$ is $N(\mu, \sigma^2/n)$, or equivalently
     $$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$
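
A simulation can make this result concrete. The sketch below assumes hypothetical values for $\mu$, $\sigma$ and $n$, draws many normal samples, and compares the spread of the resulting sample means with $\sigma/\sqrt{n}$.

```python
import math
import random
import statistics

random.seed(1)

mu, sigma, n = 10.0, 2.0, 16   # hypothetical population parameters and sample size
reps = 20_000                  # number of simulated samples

# Each replicate: draw a random sample of size n and record its mean.
means = [
    statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(reps)
]

# Theory: the sample mean is N(mu, sigma^2 / n), so its sd is sigma / sqrt(n).
theoretical_se = sigma / math.sqrt(n)
print("mean of sample means:", statistics.fmean(means))
print("sd of sample means:  ", statistics.stdev(means))
print("sigma / sqrt(n):     ", theoretical_se)
```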

  6. The sampling distribution of the sample variance is a scaled chi-square distribution:
     $$\chi^2 = \frac{(n-1) S^2}{\sigma^2} \sim \chi^2_{n-1}.$$

     The $\chi^2$ distribution with $\nu$ degrees of freedom, here $n - 1$, is the Gamma distribution with shape parameter $r = \nu/2$ and rate parameter $\lambda = 1/2$.
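
The link between the scaled sample variance and the Gamma distribution can be checked numerically. This sketch assumes hypothetical values of $\sigma$ and $n$. Note that `random.gammavariate` is parameterized by shape and scale, so a rate of 1/2 corresponds to a scale of 2. A $\chi^2_\nu$ variable has mean $\nu$ and variance $2\nu$, which gives something simple to compare against.

```python
import random
import statistics

random.seed(2)

mu, sigma, n = 0.0, 3.0, 6   # hypothetical values; nu = n - 1 = 5
nu = n - 1
reps = 20_000

# Simulate (n - 1) S^2 / sigma^2 from normal samples.
scaled_var = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    scaled_var.append(nu * statistics.variance(sample) / sigma**2)

# Draw directly from Gamma(shape = nu/2, rate = 1/2).
# random.gammavariate takes a *scale* parameter, so scale = 1 / rate = 2.
gamma_draws = [random.gammavariate(nu / 2, 2.0) for _ in range(reps)]

# Both should have mean close to nu and variance close to 2 * nu.
print(statistics.fmean(scaled_var), statistics.variance(scaled_var))
print(statistics.fmean(gamma_draws), statistics.variance(gamma_draws))
```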

  7. These sampling distributions are used to derive confidence intervals for $\mu$ and $\sigma^2$, respectively.

     However, the confidence interval for $\mu$ requires that we know the value of $\sigma$; this is rarely the case.

     When $\sigma$ is unknown, we use a third sampling result: the sampling distribution of
     $$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
     is Student’s $t$-distribution with $n - 1$ degrees of freedom.
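
To see why replacing $\sigma$ by $S$ changes the distribution, the following sketch simulates both $T$ and $Z$ from the same samples, using hypothetical parameter values and a deliberately small $n$. It compares how often each statistic falls outside $\pm 1.96$, the two-sided 5% cutoff for a standard normal.

```python
import math
import random
import statistics

random.seed(3)

mu, sigma, n = 5.0, 1.5, 5   # hypothetical values; small n makes the effect visible
reps = 20_000

def t_and_z(sample):
    """Return (T, Z) for one sample: T uses s, Z uses the true sigma."""
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)
    root_n = math.sqrt(len(sample))
    return (x_bar - mu) / (s / root_n), (x_bar - mu) / (sigma / root_n)

pairs = [t_and_z([random.gauss(mu, sigma) for _ in range(n)]) for _ in range(reps)]

# Fraction of replicates falling outside +/- 1.96.
tail_t = sum(abs(t) > 1.96 for t, _ in pairs) / reps
tail_z = sum(abs(z) > 1.96 for _, z in pairs) / reps
print("P(|T| > 1.96) ~", tail_t)   # noticeably above 0.05 for small n
print("P(|Z| > 1.96) ~", tail_z)   # close to 0.05
```

The heavier tails of $T$ are the reason the unknown-$\sigma$ case uses $t$ quantiles rather than normal ones.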

  8. Sampling from a Bernoulli distribution

     Recall the notion of a sequence of independent trials, each resulting in success or failure, used to introduce the binomial distribution.

     Let $X_i$ be the indicator of success at the $i$th trial:
     $$X_i = \begin{cases} 1 & \text{if the $i$th trial is a success;} \\ 0 & \text{if the $i$th trial is a failure.} \end{cases}$$

     Each $X_i$ follows the Bernoulli distribution with parameter $p = P(X_i = 1)$.

  9. The number of successes in $n$ trials is $X = X_1 + X_2 + \cdots + X_n$, which follows the binomial distribution with parameters $n$ and $p$.

     The sample mean $\bar{X} = X/n = \hat{p}$ also has a discrete distribution, most easily described in terms of the distribution of $X$; in particular $E(\bar{X}) = p$ and $\mathrm{Var}(\bar{X}) = p(1-p)/n$.

     By the Central Limit Theorem, $\bar{X}$ is approximately normal, $N(p, p(1-p)/n)$.
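
The mean, variance, and normal approximation for $\hat{p}$ can be verified exactly from the binomial probabilities. The sketch below assumes hypothetical values $n = 100$ and $p = 0.3$, and the cutoff 0.35 is chosen only for illustration.

```python
import math
from statistics import NormalDist

n, p = 100, 0.3   # hypothetical number of trials and success probability

def binom_pmf(k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Exact mean and variance of p_hat = X / n, computed from the binomial pmf.
mean_p_hat = sum((k / n) * binom_pmf(k) for k in range(n + 1))
var_p_hat = sum((k / n - mean_p_hat) ** 2 * binom_pmf(k) for k in range(n + 1))

# Normal approximation N(p, p(1 - p)/n) compared with the exact probability
# that p_hat is at most 0.35.
approx = NormalDist(p, math.sqrt(p * (1 - p) / n)).cdf(0.35)
exact = sum(binom_pmf(k) for k in range(36))   # P(X <= 35)

print(mean_p_hat, var_p_hat, exact, approx)
```

The approximation is used here without a continuity correction, so a small gap between `exact` and `approx` is expected.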

  10. Sampling from a Poisson distribution

      If $X_1, X_2, \ldots, X_n$ are independent and each has the Poisson distribution with parameter $\lambda$, then $X = X_1 + X_2 + \cdots + X_n$ follows the Poisson distribution with parameter $n\lambda$.

      The sample mean $\bar{X} = X/n = \hat{\lambda}$ also has a discrete distribution, most easily described in terms of the distribution of $X$; in particular $E(\bar{X}) = \lambda$ and $\mathrm{Var}(\bar{X}) = \lambda/n$.

      By the Central Limit Theorem, $\bar{X}$ is approximately normal, $N(\lambda, \lambda/n)$.

  11. More generally, if $X_1, X_2, \ldots, X_n$ are independent and $X_i$ has the Poisson distribution with parameter $\lambda_i$, then $X = X_1 + X_2 + \cdots + X_n$ follows the Poisson distribution with parameter $\sum_{i=1}^{n} \lambda_i$.
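
The additivity of independent Poisson variables can be checked numerically by convolving their probability mass functions. The rates in `lams` and the truncation point `K` below are hypothetical choices made for the example.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lams = [0.5, 1.2, 2.3]   # hypothetical rates lambda_i
total = sum(lams)
K = 40                   # largest count examined

# pmf of X_1 + ... + X_n by repeated convolution of the individual pmfs.
pmf_sum = [1.0] + [0.0] * K          # point mass at 0 to start
for lam in lams:
    pmf_i = [poisson_pmf(k, lam) for k in range(K + 1)]
    pmf_sum = [
        sum(pmf_sum[j] * pmf_i[k - j] for j in range(k + 1))
        for k in range(K + 1)
    ]

# Compare with the Poisson(sum of lambda_i) pmf.
max_diff = max(abs(pmf_sum[k] - poisson_pmf(k, total)) for k in range(K + 1))
print(max_diff)   # should be at the level of floating-point rounding
```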

  12. Inferences About Process Quality: Point Estimation of Process Parameters

      Point Estimation

      In any of these sampling contexts, we need to make inferences about the parameter(s) of the corresponding model.

      A point estimator of a parameter is a sample statistic that approximates the parameter.

      As a statistic, it has a sampling distribution, with a mean and a variance. The standard deviation of its sampling distribution is called its standard error.

  13. If an estimator $\hat{\theta}$ of some parameter $\theta$ satisfies $E(\hat{\theta}) = \theta$, it is called unbiased.

      In some situations, but not all, unbiased estimators are best.

      The mean squared error of an estimator $\hat{\theta}$ of some parameter $\theta$ is
      $$E[(\hat{\theta} - \theta)^2] = \mathrm{bias}(\hat{\theta})^2 + \mathrm{Var}(\hat{\theta}),$$
      which for an unbiased $\hat{\theta}$ is just $\mathrm{Var}(\hat{\theta})$.

      In a random sample, the sample mean $\bar{X}$ and variance $s^2$ are always unbiased estimators of the population mean $\mu$ and variance $\sigma^2$, respectively, but $s$ is biased for $\sigma$.
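
The claim that $s^2$ is unbiased while $s$ is not can be seen in a short simulation. The sketch below assumes normal data with hypothetical parameter values, and also checks the decomposition of mean squared error into squared bias plus variance on the simulated values of $s$.

```python
import random
import statistics

random.seed(4)

mu, sigma, n = 0.0, 1.0, 5   # hypothetical values
reps = 20_000

s_values = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    s_values.append(statistics.stdev(sample))

mean_s2 = statistics.fmean(s * s for s in s_values)   # estimates E(s^2)
mean_s = statistics.fmean(s_values)                   # estimates E(s)

# Empirical check of MSE = bias^2 + variance for s as an estimator of sigma.
mse = statistics.fmean((s - sigma) ** 2 for s in s_values)
bias = mean_s - sigma
var = statistics.pvariance(s_values)   # divisor reps, so the identity holds exactly

print("average s^2:", mean_s2)   # close to sigma^2
print("average s:  ", mean_s)    # below sigma
print("MSE:", mse, " bias^2 + var:", bias**2 + var)
```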

  14. In some situations, the sample range, $x_{(n)} - x_{(1)}$, has been used to construct an estimator of the population standard deviation $\sigma$ because it requires little computation.

      This construction is critically dependent on the assumption that the data are normally distributed; for any other distribution, the relationship between the range and the standard deviation is different.
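
One way to see this dependence on normality is to simulate a range-based estimator. The slides do not give the construction, so the sketch below assumes the common one: divide the range by the tabulated constant $d_2$, which is about 2.326 for normal samples of size 5. It applies the same estimator to normal and to exponential data, both with standard deviation 1.

```python
import random
import statistics

random.seed(5)

n = 5
D2 = 2.326      # tabulated E(range) / sigma for normal samples of size 5 (assumed construction)
reps = 20_000

def mean_range_estimate(draw):
    """Average of range / D2 over many samples of size n produced by draw()."""
    estimates = []
    for _ in range(reps):
        sample = [draw() for _ in range(n)]
        estimates.append((max(sample) - min(sample)) / D2)
    return statistics.fmean(estimates)

# Both populations have standard deviation 1.
normal_est = mean_range_estimate(lambda: random.gauss(0.0, 1.0))
expo_est = mean_range_estimate(lambda: random.expovariate(1.0))

print("normal data:     ", normal_est)   # close to 1
print("exponential data:", expo_est)     # systematically below 1
```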

  15. Inferences About Process Quality: Statistical Inference for a Single Sample

      Inference for a Single Sample

      Inferences about some parameter may be made using:
      - a point estimator;
      - an interval estimator;
      - a hypothesis test.

  16. Mean of a normal population

      Point estimator

      The usual point estimator of $\mu$ is the unbiased $\bar{X}$.

      The sampling distribution of $\bar{X}$ is $N(\mu, \sigma^2/n)$, so its standard error is $\sigma/\sqrt{n}$.

      When $\sigma$ is unknown, we replace it by $s$ to get the estimated standard error $s/\sqrt{n}$.

  17. Interval estimator

      The usual interval estimator is a confidence interval, derived from the distribution of $Z$ (when $\sigma$ is known) or $T$ (when $\sigma$ is unknown).

      Known $\sigma$:
      $$\bar{X} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}$$

      Unknown $\sigma$:
      $$\bar{X} \pm t_{\alpha/2,\, n-1} \times \frac{s}{\sqrt{n}}$$

      In each case, the interval contains $\mu$ with probability $1 - \alpha$, and is called a $100(1 - \alpha)\%$ confidence interval.

      The confidence level $100(1 - \alpha)\%$ is often 95%, but sometimes 99% is preferred.
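
The coverage statement can be checked by simulation for the known-$\sigma$ interval. The sketch below uses hypothetical values for $\mu$, $\sigma$ and $n$, and the helper name `z_interval` is illustrative. Only the $z$ case is shown because Python's standard library provides normal quantiles but not $t$ quantiles.

```python
import math
import random
import statistics
from statistics import NormalDist

random.seed(6)

mu, sigma, n = 50.0, 4.0, 10   # hypothetical values; sigma treated as known
alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}

def z_interval(sample):
    """Known-sigma confidence interval: x_bar +/- z * sigma / sqrt(n)."""
    x_bar = statistics.fmean(sample)
    half_width = z * sigma / math.sqrt(len(sample))
    return x_bar - half_width, x_bar + half_width

# Repeat the experiment and count how often the interval contains mu.
reps = 5_000
hits = 0
for _ in range(reps):
    lo, hi = z_interval([random.gauss(mu, sigma) for _ in range(n)])
    hits += lo <= mu <= hi

coverage = hits / reps
print("z_{alpha/2}:", z)
print("empirical coverage:", coverage)   # should be near 1 - alpha
```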
