Introduction to Business Statistics QM 220 Chapter 11



SLIDE 1

Department of Quantitative Methods & Information Systems

Introduction to Business Statistics QM 220 Chapter 11

Dr. Mohammad Zainal
SLIDE 2

Chapter 11: Chi-Square Test
11.1 The Chi-Square Distribution

Like the t distribution, the chi-square distribution has only one parameter, called the degrees of freedom (df).

The shape of a specific chi-square distribution depends on the number of degrees of freedom.

The chi-square distribution curve starts at the origin and lies entirely to the right of the vertical axis.

QM-220, M. Zainal 2

SLIDE 3

11.1 The Chi-Square Distribution

The chi-square distribution has only one parameter, called the degrees of freedom.

The shape of a chi-square distribution curve is skewed to the right for small df and becomes symmetric for large df.

The chi-square distribution assumes nonnegative values only, and these are denoted by the symbol χ² (read as "chi-square").

The peak (or mode) of a chi-square distribution curve with 1 or 2 degrees of freedom occurs at zero, and for a curve with 3 or more degrees of freedom it occurs at df − 2.

SLIDE 4

11.1 The Chi-Square Distribution

Example: Find the value of χ² for 7 degrees of freedom and an area of .10 in the right tail of the chi-square distribution curve.

SLIDE 5

11.1 The Chi-Square Distribution

Example: Find the value of χ² for 12 degrees of freedom and an area of .05 in the left tail of the chi-square distribution curve.
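The two lookups in the examples on Slides 4 and 5 are normally done with a chi-square table, but they can also be reproduced numerically. The following is a minimal sketch using only the Python standard library; the function names `chi2_cdf` and `chi2_value` are illustrative and are not part of the course material. It evaluates the cumulative probability with a series for the incomplete gamma function, then searches by bisection for the χ² value that leaves the requested area in the right tail. A left-tail area of .05 is treated as a right-tail area of .95.

```python
import math


def chi2_cdf(x, df):
    """P(chi-square variable with df degrees of freedom <= x).

    Uses the power series for the regularized lower incomplete
    gamma function P(a, z) with a = df / 2 and z = x / 2.
    """
    if x <= 0:
        return 0.0
    a = df / 2.0
    z = x / 2.0
    term = 1.0 / a          # n = 0 term of the series
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)  # ratio of consecutive series terms
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_value(df, right_tail_area):
    """Value of chi-square with the given area to its right (bisection)."""
    target = 1.0 - right_tail_area
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < target:  # widen the bracket until it holds the answer
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Slide 4: df = 7, area .10 in the right tail
print(round(chi2_value(7, 0.10), 3))   # 12.017
# Slide 5: df = 12, area .05 in the left tail = area .95 in the right tail
print(round(chi2_value(12, 0.95), 3))  # 5.226
```

The results should agree with a printed chi-square table to the three decimal places such tables usually report.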

SLIDE 6

11.3 Contingency Tables

Often we may have information on more than one variable for each element.

Such information can be summarized and presented using a two-way classification table, which is also called a contingency table or cross-tabulation.

Suppose a university has a total of 20,758 students enrolled.

SLIDE 7

11.3 Contingency Tables

By classifying these students based on gender and whether these students are full-time or part-time, we can prepare a contingency table.

It is also called a 2 × 2 (read as "two by two") contingency table.

A contingency table can be of any size. For example, it can be 2 × 3, 3 × 2, 3 × 3, or 4 × 2.

In these notations, the first digit refers to the number of rows in the table, and the second digit refers to the number of columns.

SLIDE 8

11.3 Contingency Tables

In general, an R × C table contains R rows and C columns.

Each of the four boxes that contain numbers in the table is called a cell.

The number of cells in a contingency table is obtained by multiplying the number of rows by the number of columns. In our example we had 2 × 2 = 4 cells.

The subjects that belong to a cell of a contingency table possess two characteristics.

For example, the 2615 students listed in the second cell of the first row in Table 11.5 are male and part-time.

SLIDE 9

11.3 Contingency Tables

The numbers written inside the cells are usually called the joint frequencies.

For example, 2615 students belong to the joint category of male and part-time. Hence, 2615 is referred to as the joint frequency of this category.

SLIDE 10

11.4 A Test of Independence or Homogeneity

In a test of independence for a contingency table, we test the null hypothesis that the two attributes (characteristics) of the elements of a given population are not related (that is, they are independent) against the alternative hypothesis that the two characteristics are related (that is, they are dependent).

For example, we may want to test if the affiliation of people with the Democratic and Republican parties is independent of their income levels. We perform such a test by using the chi-square distribution.

SLIDE 11

11.4 A Test of Independence or Homogeneity

As another example, we may want to test if there is an association between being a man or a woman and having a preference for watching sports or soap operas on television.

Degrees of Freedom for a Test of Independence

A test of independence involves a test of the null hypothesis that two attributes of a population are not related. The degrees of freedom for a test of independence are

$$df = (R - 1)(C - 1)$$

where R and C are the number of rows and the number of columns, respectively, in the given contingency table.

SLIDE 12

11.4 A Test of Independence or Homogeneity

The value of the test statistic χ² for a test of independence is calculated as

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O and E are the observed and expected frequencies, respectively, for a cell.

The null hypothesis in a test of independence is always that the two attributes are not related. The alternative hypothesis is that the two attributes are related.

The frequencies obtained from the performance of an experiment for a contingency table are called the observed frequencies.

SLIDE 13

11.4 A Test of Independence or Homogeneity

The expected frequency E for a cell is calculated as

$$E = \frac{(\text{row total})(\text{column total})}{\text{sample size}}$$

A test of independence is always right-tailed.

To apply a chi-square test of independence, the sample size should be large enough so that the expected frequency for each cell is at least 5.

If the expected frequency for a cell is not at least 5, we either increase the sample size or combine some categories.

SLIDE 14

11.4 A Test of Independence or Homogeneity

Example: A random sample of 300 adults was selected, and they were asked if they favor giving more freedom to schoolteachers to punish students for violence and lack of discipline. Based on the results of the survey, the two-way classification of the responses of these adults is presented in the following table.

Does the sample provide sufficient evidence to conclude that the two attributes, gender and opinions of adults, are dependent? Use a 1% significance level.

SLIDE 15

11.4 A Test of Independence or Homogeneity

Solution

Step 1. State the null and alternative hypotheses.

H0: The two attributes are independent.
H1: These attributes are dependent.

Step 2. Select the distribution to use.

We use the chi-square distribution to make a test of independence for a contingency table.

SLIDE 16

11.4 A Test of Independence or Homogeneity

Step 3. Determine the rejection and nonrejection regions.

The test of independence is always right-tailed. The area of the rejection region is .01, and it falls in the right tail of the chi-square distribution curve.

The contingency table contains two rows and three columns. The degrees of freedom are

$$df = (R - 1)(C - 1) = (2 - 1)(3 - 1) = 2$$

From Table VI of Appendix C, the critical value of χ² for df = 2 and α = .01 is 9.210.

SLIDE 17

11.4 A Test of Independence or Homogeneity

SLIDE 18

11.4 A Test of Independence or Homogeneity

Step 4. Calculate the value of the test statistic.

Step 5. Make a decision.

The value of the test statistic χ² = 8.252 is less than the critical value of χ² = 9.210, and it falls in the nonrejection region. So, we fail to reject the null hypothesis.

SLIDE 19

11.4 A Test of Independence or Homogeneity

Example: A researcher wanted to study the relationship between gender and owning cell phones. She took a sample of 2000 adults and obtained the information given in the following table.

                Own Cell Phones    Do Not Own Cell Phones
Men                   640                   450
Women                 440                   470

At the 5% level of significance, can you conclude that gender and owning a cell phone are related for all adults?
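The cell phone example on Slide 19 can be worked through in a few lines of Python. This is a sketch under two assumptions: the hypotheses are set up as in the earlier example (H0: gender and owning a cell phone are independent), and the critical value 3.841 for df = 1 and α = .05 is taken from a standard chi-square table. The function name `chi_square_independence` is illustrative only.

```python
def chi_square_independence(observed):
    """Return (test statistic, degrees of freedom, expected frequencies)
    for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # E = (row total)(column total) / sample size, for every cell
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # chi-square = sum over all cells of (O - E)^2 / E
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df, expected


# Rows: men, women. Columns: own a cell phone, do not own one.
observed = [[640, 450], [440, 470]]
statistic, df, expected = chi_square_independence(observed)

critical_value = 3.841  # assumed: table value for df = 1, alpha = .05

print(df)                                          # 1
print(all(e >= 5 for row in expected for e in row))  # True: sample-size condition holds
print(f"{statistic:.2f}")                          # 21.45
print(statistic > critical_value)                  # True: falls in the rejection region
```

Under these assumptions the statistic falls well inside the right-tail rejection region, which would lead to rejecting the null hypothesis of independence at the 5% level.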

SLIDE 20

11.5 Inferences About the Population Variance

So far we explained how to make inferences (confidence intervals and hypothesis tests) about the population mean and population proportion.

However, we may often need to control the variance (or standard deviation).

Consequently, there may be a need to estimate and to test a hypothesis about the population variance σ².

We will learn how to make a confidence interval for the population variance (or standard deviation) and explain how to test a hypothesis about the population variance.

SLIDE 21

11.5 Inferences About the Population Variance

For example, suppose a machine is set up to fill packages of cookies so that the net weight of cookies per package is 32 ounces.

The machine will not put exactly 32 ounces of cookies into each package. Some of the packages will contain less and some will contain more than 32 ounces.

However, if the variance is too large, some of the packages will contain quite a bit less than 32 ounces of cookies and some others will contain quite a bit more than 32 ounces.

The manufacturer will not want a large variation in the amounts of cookies put into different packages.

SLIDE 22

11.5 Inferences About the Population Variance

To keep this variation within some specified acceptable limit, the machine will be adjusted from time to time.

Before the manager decides to adjust the machine at any time, he must estimate the variance or test a hypothesis or do both to find out if the variance exceeds the maximum acceptable value.

Like every sample statistic, the sample variance is a random variable, and it possesses a sampling distribution.

If all the possible samples of a given size are taken from a population and their variances are calculated, the probability distribution of these variances is called the sampling distribution of the sample variance.

SLIDE 23

11.5 Inferences About the Population Variance

If the population from which the sample is selected is (approximately) normally distributed, then

$$\frac{(n - 1)s^2}{\sigma^2}$$

has a chi-square distribution with n − 1 degrees of freedom.

Estimation of the Population Variance

The value of the sample variance s² is a point estimate of the population variance σ².

SLIDE 24

11.5 Inferences About the Population Variance

Assuming that the population from which the sample is selected is (approximately) normally distributed, the (1 − α)100% confidence interval for the population variance σ² is

$$\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} \quad \text{to} \quad \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2}}$$

where $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are obtained from the chi-square distribution table for α/2 and 1 − α/2 areas in the right tail of the chi-square distribution curve, respectively, and for n − 1 degrees of freedom.

SLIDE 25

11.5 Inferences About the Population Variance

The confidence interval for the population standard deviation can be obtained by simply taking the positive square roots of the two limits of the confidence interval for the population variance.

The procedure for making a confidence interval for σ²:

1. Take a sample of size n and compute s².
2. Calculate α/2 and 1 − α/2. Find two values of χ² from the chi-square distribution table.
3. Substitute all the values in the formula for the confidence interval for σ² and simplify.

SLIDE 26

11.5 Inferences About the Population Variance

Hypothesis Tests About the Population Variance

A test of hypothesis about the population variance can be one-tailed or two-tailed.

To make a test of hypothesis about σ², we perform the same five steps we have used earlier in hypothesis-testing examples.

The procedure to test a hypothesis about σ² discussed in this section is applied only when the population from which a sample is selected is (approximately) normally distributed.

SLIDE 27

11.5 Inferences About the Population Variance

The value of the test statistic χ² is calculated as

$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$$

where s² is the sample variance, σ² is the hypothesized value of the population variance, and n − 1 represents the degrees of freedom. The population from which the sample is selected is assumed to be (approximately) normally distributed.

SLIDE 28

11.5 Inferences About the Population Variance

Example: One type of cookie manufactured by a food company is Cocoa Cookies. The machine that fills packages of these cookies is set up in such a way that the average net weight of these packages is 32 ounces with a variance of .015 square ounces. From time to time the quality control inspector at the company selects a sample of a few such packages, calculates the variance of the net weights of these packages, and constructs a 95% confidence interval for the population variance. If either one or both of the two limits of this confidence interval are not in the interval .008 to .030, the machine is stopped and adjusted. A recently taken random sample of 25 packages from the production line gave a sample variance of .029 square ounces. Based on this sample information, do you think the machine needs an adjustment? Assume that the net weights of cookies in all packages are normally distributed.
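The three-step procedure for a confidence interval for σ² can be applied to the Cocoa Cookies sample as follows. This is a sketch, not a worked solution from the slides: the two χ² values for df = 24 (39.364 for a right-tail area of .025 and 12.401 for a right-tail area of .975) are assumed from a standard chi-square table, and the function name is illustrative.

```python
def variance_confidence_interval(n, s2, chi2_alpha_half, chi2_one_minus_alpha_half):
    """Confidence limits for the population variance.

    chi2_alpha_half:           table value with area alpha/2 in the right tail
    chi2_one_minus_alpha_half: table value with area 1 - alpha/2 in the right tail
    Both are for n - 1 degrees of freedom.
    """
    lower = (n - 1) * s2 / chi2_alpha_half
    upper = (n - 1) * s2 / chi2_one_minus_alpha_half
    return lower, upper


n = 25              # sample size, so df = 24
s2 = 0.029          # sample variance in square ounces
chi2_025 = 39.364   # assumed table value: df = 24, right-tail area .025
chi2_975 = 12.401   # assumed table value: df = 24, right-tail area .975

lower, upper = variance_confidence_interval(n, s2, chi2_025, chi2_975)

# The machine is adjusted if either limit lies outside .008 to .030.
needs_adjustment = not (0.008 <= lower <= 0.030 and 0.008 <= upper <= 0.030)

print(f"{lower:.4f} to {upper:.4f}")  # 0.0177 to 0.0561
print(needs_adjustment)               # True
```

With these table values the upper limit exceeds .030, so under the stated rule the machine would be stopped and adjusted. Taking square roots of the two limits would give the corresponding interval for the standard deviation.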

SLIDE 29

11.5 Inferences About the Population Variance

Example: One type of cookie manufactured by a food company is Cocoa Cookies. The machine that fills packages of these cookies is set up in such a way that the average net weight of these packages is 32 ounces with a variance of .015 square ounces. From time to time the quality control inspector at the company selects a sample of a few such packages, calculates the variance of the net weights of these packages, and makes a test of hypothesis about the population variance. She always uses α = .01. The acceptable value of the population variance is .015 square ounces or less. If the conclusion from the test of hypothesis is that the population variance is not within the acceptable limit, the machine is stopped and adjusted. A recently taken random sample of 25 packages from the production line gave a sample variance of .029 square ounces. Based on this sample information, do you think the machine needs an adjustment? Assume that the net weights of cookies in all packages are normally distributed.
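A possible way to carry out this test numerically is sketched below. It assumes a right-tailed test with H0: σ² ≤ .015 against H1: σ² > .015, and it takes the critical value 42.980 for df = 24 and α = .01 from a standard chi-square table; neither the hypotheses nor the critical value appear on the slide, and the function name is illustrative.

```python
def variance_test_statistic(n, s2, sigma0_sq):
    """Test statistic (n - 1) * s^2 / sigma0^2 for a hypothesis about a variance."""
    return (n - 1) * s2 / sigma0_sq


n = 25             # sample size, so df = 24
s2 = 0.029         # sample variance in square ounces
sigma0_sq = 0.015  # hypothesized (maximum acceptable) population variance

statistic = variance_test_statistic(n, s2, sigma0_sq)

critical_value = 42.980  # assumed table value: df = 24, right-tail area .01
reject_h0 = statistic > critical_value  # right-tailed rejection rule

print(round(statistic, 3))  # 46.4
print(reject_h0)            # True
```

Under these assumptions the statistic falls in the rejection region, which would indicate that the population variance is not within the acceptable limit and that the machine needs an adjustment.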

SLIDE 30

11.5 Inferences About the Population Variance

Example: The variance of scores on a standardized mathematics test for all high school seniors was 150 in 2004. A sample of scores for 20 high school seniors who took this test this year gave a variance of 170. Test at the 5% significance level if the variance of current scores of all high school seniors on this test is different from 150. Assume that the scores of all high school seniors on this test are (approximately) normally distributed.
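One way to set up this two-tailed test is sketched below in LaTeX. The hypotheses follow from the wording "different from 150"; the two critical values for df = 19, each with an area of .025 in one tail, are assumed from a standard chi-square table and are not given on the slide.

```latex
\begin{align*}
H_0 &: \sigma^2 = 150, \qquad H_1 : \sigma^2 \neq 150 \\
df &= n - 1 = 20 - 1 = 19 \\
\chi^2 &= \frac{(n - 1)s^2}{\sigma^2} = \frac{(19)(170)}{150} \approx 21.533 \\
\text{Critical values} &: \chi^2_{.975} = 8.907, \qquad \chi^2_{.025} = 32.852
\end{align*}
```

Since 8.907 < 21.533 < 32.852, the statistic falls in the nonrejection region under these assumptions, so the sample would not give sufficient evidence that the variance of current scores differs from 150.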