Chapter 11 Categorical Data Analysis

Categorical Data and the Multinomial Distribution
Properties of the Multinomial Experiment
1. Experiment has n identical trials
2. There are k possible outcomes to each trial, called classes, categories or cells
3. Probabilities of the k outcomes remain constant from trial to trial
4. Trials are independent
5. Variables of interest are the cell counts, $n_1, n_2, \ldots, n_k$, the number of observations that fall into each of the k classes
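The five properties above can be illustrated with a small simulation. This is only a sketch: the function name `simulate_multinomial`, the probabilities, and the seed are hypothetical choices for illustration, not part of the original slides.

```python
import random
from collections import Counter

def simulate_multinomial(n, probs, seed=None):
    """Run n independent, identical trials and return the k cell counts."""
    rng = random.Random(seed)
    classes = range(len(probs))          # the k possible outcomes
    # Each trial uses the same probabilities (constant from trial to trial).
    draws = rng.choices(classes, weights=probs, k=n)
    tally = Counter(draws)
    return [tally.get(c, 0) for c in classes]

# Hypothetical experiment: n = 100 trials, k = 3 classes.
counts = simulate_multinomial(100, [0.2, 0.3, 0.5], seed=1)
print(counts, sum(counts))
```

The cell counts always sum to n, which is why only k − 1 of them are free to vary.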

Testing Category Probabilities: One-Way Table
In a multinomial experiment with categorical data from a single qualitative variable, we summarize the data in a one-way table.
Schema for a one-way table for an experiment with k outcomes:

Category   1       2       …   k
Count      $n_1$   $n_2$   …   $n_k$

Testing Category Probabilities: One-Way Table
Hypothesis Testing for a One-Way Table
• Based on the $\chi^2$ statistic, which compares the observed distribution of counts with the expected distribution of counts across the k classes
• Expected count for class i: $E(n_i) = n p_i$, where n is the total number of trials and $p_i$ is the hypothesized probability of being in class i according to $H_0$
• The test statistic, $\chi^2$, is calculated as
$$\chi^2 = \sum_{i=1}^{k} \frac{[n_i - E(n_i)]^2}{E(n_i)}$$
and the rejection region is determined by the $\chi^2$ distribution using k − 1 df and the desired $\alpha$
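As a rough sketch, the one-way statistic described above (squared deviations of observed from expected counts, each divided by the expected count, summed over the k classes) can be computed as follows. The function name `chi_square_one_way` and the observed counts are made up for illustration.

```python
def chi_square_one_way(observed, probs):
    """Return (chi-square statistic, expected counts, degrees of freedom)."""
    n = sum(observed)                         # total number of trials
    expected = [n * p for p in probs]         # E(n_i) = n * p_i under H0
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1                    # k - 1
    return stat, expected, df

# Hypothetical data: 100 trials over k = 3 classes.
observed = [30, 50, 20]
probs = [0.25, 0.50, 0.25]                    # hypothesized probabilities
stat, expected, df = chi_square_one_way(observed, probs)
print(stat, expected, df)                     # 2.0 [25.0, 50.0, 25.0] 2
```

The statistic is then compared with the upper-tail critical value of the chi-square distribution with `df` degrees of freedom.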

Testing Category Probabilities: One-Way Table
Hypothesis Testing for a One-Way Table
• The null hypothesis is often formulated as "no difference" among the classes, $H_0: p_1 = p_2 = p_3 = \ldots = p_k = 1/k$, but it can also specify unequal probabilities
• The alternative hypothesis is $H_a$: at least one of the multinomial probabilities does not equal its hypothesized value

Testing Category Probabilities: One-Way Table
One-Way Tables: an example
$H_0$: $p_{None} = .10$, $p_{Standard} = .65$, $p_{Merit} = .25$
$H_a$: At least 2 proportions differ from the proposed plan
Expected count = Total × p
Rejection region with $\alpha = .01$, df = k − 1 = 2: $\chi^2 > 9.21034$
Since the test statistic falls in the rejection region, we reject $H_0$
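The critical value 9.21034 quoted in the example can be checked by hand for this particular case. With 2 df, the chi-square distribution is an exponential distribution with mean 2, so its upper-tail probability is exp(−x/2) and the critical value is −2 ln α. The helper name `chi2_critical_df2` is hypothetical, and the shortcut holds only for df = 2; other df need a table or a statistics library.

```python
import math

def chi2_critical_df2(alpha):
    """Upper-tail critical value of the chi-square distribution with 2 df.

    Valid only for df = 2, where P(X > x) = exp(-x / 2).
    """
    return -2.0 * math.log(alpha)

critical = chi2_critical_df2(0.01)
print(round(critical, 5))  # 9.21034
```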

Testing Category Probabilities: One-Way Table
Conditions Required for a Valid $\chi^2$ Test
• A multinomial experiment has been conducted
• Sample size is large, with $E(n_i)$ at least 5 for every cell
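The sample-size condition can be checked before running the test. The sketch below uses the hypothesized probabilities from the example (.10, .65, .25); the sample sizes 40 and 60 and the helper name `expected_counts_ok` are assumptions chosen only to show one failing and one passing case.

```python
def expected_counts_ok(n, probs, minimum=5):
    """Check that every expected cell count n * p_i is at least `minimum`."""
    expected = [n * p for p in probs]
    return all(e >= minimum for e in expected), expected

probs = [0.10, 0.65, 0.25]

# n = 40: the smallest expected count is about 4, below 5, so the check fails.
ok_small, expected_small = expected_counts_ok(40, probs)

# n = 60: the smallest expected count is about 6, so the check passes.
ok_large, expected_large = expected_counts_ok(60, probs)

print(ok_small, ok_large)  # False True
```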

Testing Category Probabilities: Two-Way (Contingency) Table
Used when classifying with two qualitative variables
General r × c Contingency Table

                  Column
Row               1          2          …   c          Row Totals
1                 $n_{11}$   $n_{12}$   …   $n_{1c}$   $R_1$
2                 $n_{21}$   $n_{22}$   …   $n_{2c}$   $R_2$
…                 …          …          …   …          …
r                 $n_{r1}$   $n_{r2}$   …   $n_{rc}$   $R_r$
Column Totals     $C_1$      $C_2$      …   $C_c$      n

$H_0$: The two classifications are independent
$H_a$: The two classifications are dependent
Test Statistic:
$$\chi^2 = \sum \frac{[n_{ij} - E_{ij}]^2}{E_{ij}}, \quad \text{where } E_{ij} = \frac{R_i C_j}{n}$$
Rejection region: $\chi^2 > \chi^2_\alpha$, where $\chi^2_\alpha$ has (r − 1)(c − 1) df
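A minimal sketch of the two-way calculation, assuming the table is given as a list of rows of observed counts. The function name `chi_square_two_way` and the 2 × 2 counts are hypothetical. Each expected count is the row total times the column total divided by n.

```python
def chi_square_two_way(table):
    """Return (chi-square statistic, expected counts, df) for an r x c table."""
    row_totals = [sum(row) for row in table]            # R_i
    col_totals = [sum(col) for col in zip(*table)]      # C_j
    n = sum(row_totals)
    # E_ij = R_i * C_j / n under the independence hypothesis H0
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)  # (r - 1)(c - 1)
    return stat, expected, df

# Hypothetical 2 x 2 table of observed counts.
table = [[20, 30],
         [30, 20]]
stat2, expected2, df2 = chi_square_two_way(table)
print(stat2, df2)  # 4.0 1
```

Here every expected count is 25, so each cell contributes (5²)/25 = 1 to the statistic.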

Testing Category Probabilities: Two-Way (Contingency) Table
Conditions Required for a Valid $\chi^2$ Test
• The n observed counts are a random sample from the population of interest
• Sample size is large, with the expected count $E(n_{ij})$ at least 5 for every cell

Testing Category Probabilities: Two-Way (Contingency) Table
[Figure: Sample statistical package output]

A Word of Caution about Chi-Square Tests
• When an expected cell count is less than 5, the $\chi^2$ probability distribution should not be used
• If $H_0$ is not rejected, do not accept $H_0$ (that the classifications are independent); doing so risks a Type II error
• Do not infer causality when $H_0$ is rejected. Contingency table analysis determines statistical dependence only.
