Chapter 11 Categorical Data Analysis - PowerPoint PPT Presentation



slide-1
SLIDE 1

Chapter 11

Categorical Data Analysis

slide-2
SLIDE 2

Categorical Data and the Multinomial Distribution

Properties of the Multinomial Experiment

1. The experiment consists of n identical trials.
2. Each trial has k possible outcomes, called classes, categories, or cells.
3. The probabilities of the k outcomes remain constant from trial to trial.
4. The trials are independent.
5. The variables of interest are the cell counts n1, n2, …, nk: the number of observations that fall into each of the k classes.
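
The properties of a multinomial experiment can be illustrated with a small simulation. This is only a sketch: the function name `simulate_multinomial`, the cell probabilities, and the seed are assumptions chosen for illustration, not values from the slides.

```python
import random
from collections import Counter


def simulate_multinomial(n, probs, seed=None):
    """Run n independent, identical trials and return the k cell counts.

    probs holds the (constant) probability of each of the k outcomes.
    """
    rng = random.Random(seed)
    k = len(probs)
    # Each trial independently lands in one of the k cells.
    outcomes = rng.choices(range(k), weights=probs, k=n)
    tally = Counter(outcomes)
    # Cell counts n1, n2, ..., nk (zero for any cell never observed).
    return [tally.get(i, 0) for i in range(k)]


# Hypothetical experiment: n = 100 trials, k = 3 cells.
counts = simulate_multinomial(n=100, probs=[0.2, 0.5, 0.3], seed=1)
print(counts, sum(counts))
```

Whatever the individual counts turn out to be, they always sum to n, which is why only k-1 of them are free to vary.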

slide-3
SLIDE 3

Testing Category Probabilities: One-Way Table

In a multinomial experiment with categorical data from a single qualitative variable, we summarize data in a one-way table.

Schema for one-way table for an experiment with k outcomes

Category   1    2    …   k
Count      n1   n2   …   nk

slide-4
SLIDE 4

Testing Category Probabilities: One-Way Table

Hypothesis Testing for a One-Way Table

  • Based on the χ² statistic, which allows comparison between the observed distribution of counts and an expected distribution of counts across the k classes

  • Expected count for class i: E(ni) = n·pi, where n is the total number of trials and pi is the hypothesized probability of being in class i according to H0

  • The test statistic, χ², is calculated as

$$\chi^2 = \sum_{i=1}^{k} \frac{[n_i - E(n_i)]^2}{E(n_i)}$$

and the rejection region is determined by the χ² distribution using k-1 df and the desired α
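
The one-way χ² statistic can be computed directly from the observed counts and the hypothesized probabilities. The sketch below assumes a hypothetical helper name, `chi_square_one_way`, and made-up counts; neither comes from the slides.

```python
def chi_square_one_way(observed, probs):
    """Return (chi-square statistic, expected counts, df) for a one-way table.

    observed: cell counts n1..nk
    probs:    hypothesized cell probabilities p1..pk under H0
    """
    n = sum(observed)
    # Expected count for each class: E(ni) = n * pi
    expected = [n * p for p in probs]
    # Sum of (observed - expected)^2 / expected over the k classes
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, expected, df


# Hypothetical data: 100 trials in 3 classes.
stat, expected, df = chi_square_one_way([30, 50, 20], [0.25, 0.5, 0.25])
print(stat, expected, df)  # 2.0 [25.0, 50.0, 25.0] 2
```

Here the expected counts are 25, 50, and 25, so only the first and last cells contribute: 25/25 + 0 + 25/25 = 2.0.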

  

slide-5
SLIDE 5

Testing Category Probabilities: One-Way Table

Hypothesis Testing for a One-Way Table

  • The null hypothesis is often formulated as "no difference" among the categories, H0: p1 = p2 = p3 = … = pk = 1/k, but it can also be formulated with unequal hypothesized probabilities

  • The alternative hypothesis is Ha: at least one of the multinomial probabilities does not equal its hypothesized value


slide-7
SLIDE 7

Testing Category Probabilities: One-Way Table

One-Way Tables: an example

H0: pNone = .10, pStandard = .65, pMerit = .25
Ha: At least 2 proportions differ from the proposed plan
Expected count in each category = Total × p
Rejection region with α = .01 and df = k-1 = 2: χ² > 9.21034
Since the test statistic falls in the rejection region, we reject H0
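
The critical value 9.21034 can be reproduced without a table. For 2 df specifically, the upper-tail probability of the χ² distribution is exp(-x/2), so the critical value is -2·ln(α). The sketch below relies on that special case only; it does not generalize to other df, and the function names and the statistic values in the usage lines are hypothetical.

```python
import math


def chi_square_critical_df2(alpha):
    """Upper-tail critical value of the chi-square distribution with 2 df.

    Valid for df = 2 only, where P(X > x) = exp(-x / 2).
    """
    return -2.0 * math.log(alpha)


def in_rejection_region(statistic, alpha):
    """True when the statistic exceeds the df = 2 critical value."""
    return statistic > chi_square_critical_df2(alpha)


critical = chi_square_critical_df2(0.01)
print(round(critical, 5))  # 9.21034

# Hypothetical statistic values, for illustration only.
print(in_rejection_region(10.0, 0.01))  # True
print(in_rejection_region(5.0, 0.01))   # False
```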

slide-8
SLIDE 8

Testing Category Probabilities: One-Way Table

Conditions Required for a Valid χ² Test

  • A multinomial experiment has been conducted

  • The sample size is large, with E(ni) at least 5 for every cell
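
The sample-size condition is easy to check before running the test. A minimal sketch, assuming a hypothetical helper `expected_counts_ok`; the probabilities are the hypothesized values from the one-way example, while the sample sizes 40 and 100 are made up.

```python
def expected_counts_ok(n, probs, minimum=5):
    """Check that every expected cell count n * pi is at least `minimum`."""
    expected = [n * p for p in probs]
    return all(e >= minimum for e in expected), expected


# With n = 40 the smallest expected count is about 4, so the condition fails.
ok_small, expected_small = expected_counts_ok(40, [0.10, 0.65, 0.25])
print(ok_small, expected_small)

# With n = 100 every expected count is at least 5.
ok_large, expected_large = expected_counts_ok(100, [0.10, 0.65, 0.25])
print(ok_large, expected_large)
```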

slide-9
SLIDE 9

Testing Category Probabilities: Two-Way (Contingency) Table

Used when classifying with two qualitative variables

H0: The two classifications are independent
Ha: The two classifications are dependent

Test statistic:

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{[n_{ij} - E_{ij}]^2}{E_{ij}}, \quad \text{where } E_{ij} = \frac{R_i C_j}{n}$$

Rejection region: χ² > χ²α, where χ²α has (r-1)(c-1) df

General r x c Contingency Table

                      Column
                1     2     …     c     Row Totals
Row    1        n11   n12   …     n1c   R1
       2        n21   n22   …     n2c   R2
       …        …     …     …     …     …
       r        nr1   nr2   …     nrc   Rr
Column Totals   C1    C2    …     Cc    n
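
The contingency-table statistic follows the same pattern as the one-way case, with expected counts built from the row and column totals. This is a sketch under stated assumptions: `chi_square_two_way` is a hypothetical name and the 2 x 2 table is invented for illustration.

```python
def chi_square_two_way(table):
    """Return (chi-square statistic, expected counts, df) for an r x c table.

    table: list of r rows, each a list of c observed counts n_ij.
    """
    row_totals = [sum(row) for row in table]        # R_i
    col_totals = [sum(col) for col in zip(*table)]  # C_j
    n = sum(row_totals)
    # Expected count under independence: E_ij = R_i * C_j / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, expected, df


# Hypothetical 2 x 2 table with n = 100.
stat, expected, df = chi_square_two_way([[20, 30], [30, 20]])
print(stat, expected, df)  # 4.0 [[25.0, 25.0], [25.0, 25.0]] 1
```

Every expected count is 50 · 50 / 100 = 25, and each cell deviates from it by 5, so the statistic is 4 · (25/25) = 4.0 with (2-1)(2-1) = 1 df.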

slide-10
SLIDE 10

Testing Category Probabilities: Two-Way (Contingency) Table

Conditions Required for a Valid χ² Test

  • The n observed counts are a random sample from the population of interest

  • The sample size is large, with the expected count Eij at least 5 for every cell

slide-11
SLIDE 11

Testing Category Probabilities: Two-Way (Contingency) Table

Sample Statistical package output

slide-12
SLIDE 12

A Word of Caution about Chi-Square Tests

  • When an expected cell count is less than 5, the χ² probability distribution should not be used

  • If H0 is not rejected, do not accept H0 and conclude that the classifications are independent, because that conclusion risks a Type II error

  • Do not infer causality when H0 is rejected. Contingency table analysis determines statistical dependence only.