SLIDE 1

Intro to Contingency Tables

Author: Nicholas Reich Course: Categorical Data Analysis (BIOSTATS 743)

Made available under the Creative Commons Attribution-ShareAlike 4.0 International License.

SLIDE 2

Independence

Definition: Two categorical variables are independent iff $\pi_{ij} = \pi_{i+}\pi_{+j}$, $\forall\, i \in \{1, 2, \ldots, I\}$ and $j \in \{1, 2, \ldots, J\}$

◮ Or: $P(X = i, Y = j) = P(X = i)\,P(Y = j)$

◮ Independence implies that the conditional distribution reverts to the marginal distribution:

$$\pi_{j|i} = \frac{\pi_{ij}}{\pi_{i+}} = \frac{\pi_{i+}\pi_{+j}}{\pi_{i+}} = \pi_{+j}$$

◮ Or, under the independence assumption, $P(Y = j \mid X = i) = P(Y = j)$
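A minimal numerical sketch (not part of the original slides) of what independence means for a cell-probability table: the joint table is the outer product of its marginals, and every conditional row distribution equals the column marginal. The 2 × 3 marginals below are made up for illustration.

```python
import numpy as np

# Hypothetical marginal distributions pi_{i+} and pi_{+j}
row_marginals = np.array([0.4, 0.6])        # pi_{i+}
col_marginals = np.array([0.2, 0.3, 0.5])   # pi_{+j}

# Under independence, every cell is the product of its marginals
pi = np.outer(row_marginals, col_marginals)  # pi_ij = pi_{i+} * pi_{+j}

# The conditional distribution of Y given X = 0 reverts to the marginal of Y
cond_Y_given_X0 = pi[0] / pi[0].sum()
print(np.allclose(cond_Y_given_X0, col_marginals))  # True
```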

SLIDE 3

Testing for independence (two-way contingency table)

◮ Under $H_0: \pi_{ij} = \pi_{i+}\pi_{+j}$, $\forall\, i, j$, the expected cell counts are

$$\mu_{ij} = n\,\pi_{i+}\pi_{+j}$$

◮ Usually $\pi_{i+}$ and $\pi_{+j}$ are unknown. Their MLEs are

$$\hat{\pi}_{i+} = \frac{n_{i+}}{n}, \qquad \hat{\pi}_{+j} = \frac{n_{+j}}{n}$$

◮ Estimated expected cell counts are

$$\hat{\mu}_{ij} = n\,\hat{\pi}_{i+}\hat{\pi}_{+j} = \frac{n_{i+}n_{+j}}{n}$$

◮ Pearson $\chi^2$ statistic:

$$X^2 = \sum_{i=1}^{I}\sum_{j=1}^{J} \frac{(n_{ij} - \hat{\mu}_{ij})^2}{\hat{\mu}_{ij}}$$

SLIDE 4

◮ $\hat{\mu}_{ij}$ requires estimating $\pi_{i+}$ and $\pi_{+j}$, which have $I - 1$ and $J - 1$ degrees of freedom, respectively. Notice the constraints $\sum_i \pi_{i+} = \sum_j \pi_{+j} = 1$

◮ The degrees of freedom are

$$(IJ - 1) - (I - 1) - (J - 1) = (I - 1)(J - 1)$$

◮ $X^2$ is asymptotically $\chi^2_{(I-1)(J-1)}$

◮ It is helpful to look at the residuals $\left\{\frac{(O - E)^2}{E}\right\}$; they can give useful information about where the model fits well and where it does not
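A minimal sketch (not from the slides) of this test on a hypothetical 2 × 3 table of counts, with `scipy.stats.chi2_contingency` used only as a cross-check of the hand computation.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

# Hypothetical 2x3 table of observed counts n_ij
n = np.array([[30, 20, 10],
              [20, 30, 40]])
N = n.sum()

# Estimated expected counts: mu_hat_ij = n_{i+} n_{+j} / n
mu_hat = np.outer(n.sum(axis=1), n.sum(axis=0)) / N

# Pearson chi-square statistic, degrees of freedom, and p-value
X2 = ((n - mu_hat) ** 2 / mu_hat).sum()
df = (n.shape[0] - 1) * (n.shape[1] - 1)
p_value = chi2.sf(X2, df)

# Cell contributions (O - E)^2 / E show where the independence model fits poorly
contributions = (n - mu_hat) ** 2 / mu_hat

print(X2, df, p_value)
stat, p, dof, expected = chi2_contingency(n, correction=False)
print(stat, p)  # should match X2 and p_value above
```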

SLIDE 5

Measures of Diagnostic Tests

                    Diagnosis
Disease Status      +              −
D                   $\pi_{11}$     $\pi_{12}$
$\bar{D}$           $\pi_{21}$     $\pi_{22}$

◮ Sensitivity: $P(+ \mid D) = \frac{\pi_{11}}{\pi_{1+}}$
◮ Specificity: $P(- \mid \bar{D}) = \frac{\pi_{22}}{\pi_{2+}}$
◮ An ideal diagnostic test has high sensitivity and high specificity

SLIDE 6

Example:

                    Diagnosis
Disease Status      +        −
D                   0.86     0.14
$\bar{D}$           0.12     0.88

◮ Sensitivity = 0.86
◮ Specificity = 0.88

However, from the clinical point of view, sensitivity and specificity do not directly answer the question of interest (the probability of disease given a test result). So we introduce the Positive Predictive Value and the Negative Predictive Value.

SLIDE 7

◮ Positive predictive value (PPV) $= P(D \mid +) = \frac{\pi_{11}}{\pi_{+1}}$
◮ Negative predictive value (NPV) $= P(\bar{D} \mid -) = \frac{\pi_{22}}{\pi_{+2}}$
◮ Relationship between PPV and sensitivity:

$$\text{PPV} = P(D \mid +) = \frac{P(D \cap +)}{P(+)} = \frac{P(+ \mid D)P(D)}{P(+ \mid D)P(D) + P(+ \mid \bar{D})P(\bar{D})} = \frac{P(+ \mid D)P(D)}{P(+ \mid D)P(D) + (1 - P(- \mid \bar{D}))P(\bar{D})}$$

$$= \frac{\text{Sensitivity} \times \text{Prevalence}}{\text{Sensitivity} \times \text{Prevalence} + (1 - \text{Specificity}) \times (1 - \text{Prevalence})}$$

SLIDE 8

The same example:

                    Diagnosis
Disease Status      +        −
D                   0.86     0.14
$\bar{D}$           0.12     0.88

◮ If the prevalence is $P(D) = 0.02$
◮ PPV $= \frac{0.86 \times 0.02}{0.86 \times 0.02 + 0.12 \times 0.98} \approx 13\%$
◮ Notice:

$$\text{PPV} = \frac{\pi_{11}}{\pi_{11} + \pi_{21}}$$

◮ This is only true when $\frac{n_1}{n_1 + n_2}$ equals the disease prevalence
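A minimal sketch (not from the slides) of the PPV/NPV formulas above; the function names are illustrative. It reproduces the ≈ 13% PPV for the slide's sensitivity 0.86, specificity 0.88, and prevalence 0.02.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV via Bayes' rule: Sens*Prev / (Sens*Prev + (1 - Spec)*(1 - Prev))."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(sensitivity, specificity, prevalence):
    """NPV: Spec*(1 - Prev) / (Spec*(1 - Prev) + (1 - Sens)*Prev)."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Slide example: sensitivity 0.86, specificity 0.88, prevalence 0.02
print(positive_predictive_value(0.86, 0.88, 0.02))  # ~0.13
print(negative_predictive_value(0.86, 0.88, 0.02))  # ~0.997
```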

SLIDE 9

Comparing two groups

We first consider 2 × 2 tables. Suppose that the response variable Y has two categories, success and failure. The explanatory variable X has two categories, group 1 and group 2, with fixed sample sizes in each group.

                    Response Y
Explanatory X       Success              Failure                  Row Total
group 1             $n_{11} = x_1$       $n_{12} = n_1 - x_1$     $n_1$
group 2             $n_{21} = x_2$       $n_{22} = n_2 - x_2$     $n_2$

The goal is to compare the probability of an outcome (success) of Y across the two levels of X. Assume $X_1 \sim \text{bin}(n_1, \pi_1)$ and $X_2 \sim \text{bin}(n_2, \pi_2)$.

◮ difference of proportions
◮ relative risk
◮ odds ratio

SLIDE 10

Difference of Proportions

                    Response Y
Explanatory X       Success              Failure                  Row Total
group 1             $n_{11} = x_1$       $n_{12} = n_1 - x_1$     $n_1$
group 2             $n_{21} = x_2$       $n_{22} = n_2 - x_2$     $n_2$

◮ The difference of proportions of successes is $\pi_1 - \pi_2$
◮ Comparison on failures is equivalent to comparison on successes: $(1 - \pi_1) - (1 - \pi_2) = \pi_2 - \pi_1$

◮ Difference of proportions takes values in [−1, 1]

SLIDE 11

◮ The estimate of $\pi_1 - \pi_2$ is

$$\hat{\pi}_1 - \hat{\pi}_2 = \frac{n_{11}}{n_1} - \frac{n_{21}}{n_2}$$

◮ The estimate of the asymptotic standard error is

$$\hat{\sigma}(\hat{\pi}_1 - \hat{\pi}_2) = \left[\frac{\hat{\pi}_1(1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2(1 - \hat{\pi}_2)}{n_2}\right]^{1/2}$$

◮ The statistic for testing $H_0: \pi_1 = \pi_2$ vs. $H_a: \pi_1 \neq \pi_2$ is

$$Z = (\hat{\pi}_1 - \hat{\pi}_2)/\hat{\sigma}(\hat{\pi}_1 - \hat{\pi}_2),$$

which asymptotically follows a standard normal distribution (the difference of two asymptotically normal estimators is asymptotically normal)

◮ The CI is given by

$$(\hat{\pi}_1 - \hat{\pi}_2) \pm Z_{\alpha/2}\,\hat{\sigma}(\hat{\pi}_1 - \hat{\pi}_2)$$
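A minimal sketch (not from the slides) of the Wald test and confidence interval above, using made-up counts for the two groups.

```python
import math
from scipy.stats import norm

# Hypothetical counts: successes x_i out of n_i in each group
x1, n1 = 45, 100   # group 1
x2, n2 = 30, 100   # group 2

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

z = diff / se
p_value = 2 * norm.sf(abs(z))          # two-sided test of H0: pi1 = pi2
z_crit = norm.ppf(0.975)               # critical value for a 95% Wald CI
ci = (diff - z_crit * se, diff + z_crit * se)
print(diff, z, p_value, ci)
```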

SLIDE 12

Relative Risk

◮ Definition:

$$r = \pi_1/\pi_2$$

◮ Motivation: The difference between $\pi_1 = 0.010$ and $\pi_2 = 0.001$ is more noteworthy than the difference between $\pi_1 = 0.410$ and $\pi_2 = 0.401$. The "relative risk" (0.010/0.001 = 10 vs. 0.410/0.401 ≈ 1.02) is more informative than the "difference of proportions" (0.009 for both).

◮ The estimate of r is

$$\hat{r} = \hat{\pi}_1/\hat{\pi}_2$$

SLIDE 13

◮ The estimator converges to normality faster on the log scale.
◮ The estimator of $\log r$ is

$$\log \hat{r} = \log \hat{\pi}_1 - \log \hat{\pi}_2$$

◮ The asymptotic standard error of $\log \hat{r}$ is

$$\hat{\sigma}(\log \hat{r}) = \left(\frac{1 - \pi_1}{\pi_1 n_1} + \frac{1 - \pi_2}{\pi_2 n_2}\right)^{1/2}$$

◮ Delta method: If $\sqrt{n}(\hat{\beta} - \beta_0) \to N(0, \sigma^2)$, then $\sqrt{n}(f(\hat{\beta}) - f(\beta_0)) \to N(0, [f'(\beta_0)]^2\sigma^2)$ for any function $f$ such that $f'(\beta_0)$ exists
◮ Here $\beta = \pi_1$ or $\pi_2$ and $f(\beta) = \log(\pi_1)$ or $\log(\pi_2)$

SLIDE 14

◮ The CI for $\log \hat{r}$ is

$$\left[\log \hat{r} - Z_{1-\alpha/2}\,\hat{\sigma}(\log \hat{r}),\; \log \hat{r} + Z_{1-\alpha/2}\,\hat{\sigma}(\log \hat{r})\right]$$

◮ The CI for $\hat{r}$ is

$$\left[\exp\{\log \hat{r} - Z_{1-\alpha/2}\,\hat{\sigma}(\log \hat{r})\},\; \exp\{\log \hat{r} + Z_{1-\alpha/2}\,\hat{\sigma}(\log \hat{r})\}\right]$$
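A minimal sketch (not from the slides) of the relative-risk interval: compute $\log \hat{r}$ and its standard error, build the Wald interval on the log scale, then exponentiate the endpoints. The counts are made up.

```python
import math
from scipy.stats import norm

# Hypothetical counts: successes x_i out of n_i in each group
x1, n1 = 45, 100
x2, n2 = 30, 100

p1, p2 = x1 / n1, x2 / n2
rr = p1 / p2
log_rr = math.log(rr)

# Asymptotic standard error of log(rr) from the formula above
se_log_rr = math.sqrt((1 - p1) / (p1 * n1) + (1 - p2) / (p2 * n2))

z = norm.ppf(0.975)                                   # 95% interval
ci_log = (log_rr - z * se_log_rr, log_rr + z * se_log_rr)
ci_rr = (math.exp(ci_log[0]), math.exp(ci_log[1]))    # exponentiate endpoints
print(rr, ci_rr)
```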

SLIDE 15

Odds Ratio

◮ Odds in group 1:

$$\phi_1 = \frac{\pi_1}{1 - \pi_1}$$

◮ Interpretation: $\phi_1 = 3$ means a success is three times as likely as a failure in group 1
◮ Odds ratio:

$$\theta = \frac{\phi_1}{\phi_2} = \frac{\pi_1/(1 - \pi_1)}{\pi_2/(1 - \pi_2)}$$

◮ Interpretation: $\theta = 4$ means the odds of success in group 1 are four times the odds of success in group 2

SLIDE 16

◮ The estimate is

$$\hat{\theta} = \frac{n_{11}n_{22}}{n_{12}n_{21}}$$

◮ $\log(\hat{\theta})$ converges to normality much faster than $\hat{\theta}$
◮ An estimate of the asymptotic standard error of $\log(\hat{\theta})$ is

$$\hat{\sigma}(\log \hat{\theta}) = \left(\frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}\right)^{1/2}$$

SLIDE 17

This formula can be derived using the Delta method. Recall

$$\log \hat{\theta} = \log(\hat{\pi}_1) - \log(1 - \hat{\pi}_1) - \log(\hat{\pi}_2) + \log(1 - \hat{\pi}_2)$$

First, take $f(\beta) = \log(\pi_1) - \log(1 - \pi_1)$ with

$$\sigma^2 = \frac{\pi_1(1 - \pi_1)}{n_1}, \qquad f'(\beta) = \frac{1}{\pi_1} + \frac{1}{1 - \pi_1}, \qquad [f'(\beta)]^2\sigma^2 = \frac{1}{n_1\pi_1} + \frac{1}{n_1(1 - \pi_1)}$$

The estimate of this term is

$$\frac{1}{n_{11}} + \frac{1}{n_{12}}$$

Similarly, when $f(\beta) = \log(\pi_2) - \log(1 - \pi_2)$, the estimated term is $\frac{1}{n_{21}} + \frac{1}{n_{22}}$; adding the two and taking the square root gives $\hat{\sigma}(\log \hat{\theta})$.

SLIDE 18

◮ The Wald CI for $\log \theta$ is

$$\log \hat{\theta} \pm Z_{\alpha/2}\,\hat{\sigma}(\log \hat{\theta})$$

◮ Exponentiating the endpoints provides a confidence interval for $\theta$
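A minimal sketch (not from the slides) of the odds-ratio estimate and its Wald interval, using made-up cell counts.

```python
import math
from scipy.stats import norm

# Hypothetical 2x2 table of counts
#                 success  failure
n11, n12 = 40, 60          # group 1
n21, n22 = 25, 75          # group 2

theta_hat = (n11 * n22) / (n12 * n21)
log_theta = math.log(theta_hat)
se_log_theta = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)

z = norm.ppf(0.975)                        # 95% Wald interval
ci = (math.exp(log_theta - z * se_log_theta),
      math.exp(log_theta + z * se_log_theta))
print(theta_hat, ci)
```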

SLIDE 19

Relationship between Odds Ratio and Relative Risk

◮ A large odds ratio does not, in general, imply a large relative risk
◮ From the definitions of relative risk and odds ratio, we have

$$\theta = \frac{\pi_1}{\pi_2} \cdot \frac{1 - \pi_2}{1 - \pi_1} = \text{relative risk} \times \frac{1 - \pi_2}{1 - \pi_1}$$

◮ When the probabilities $\pi_1$ and $\pi_2$ (the risk in each row group) are both very small, the second ratio above is ≈ 1. Thus

odds ratio ≈ relative risk

◮ This means that when the relative risk is not directly estimable, e.g., in case-control studies, and the probabilities $\pi_1$ and $\pi_2$ are both very small, the relative risk can be approximated by the odds ratio.
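A quick numeric check (not from the slides) of the rare-outcome approximation, reusing the probabilities from the relative-risk motivation plus one made-up common-outcome pair.

```python
def relative_risk(p1, p2):
    return p1 / p2

def odds_ratio(p1, p2):
    return (p1 / (1 - p1)) / (p2 / (1 - p2))

# Rare outcomes: the odds ratio and relative risk nearly coincide
print(relative_risk(0.010, 0.001), odds_ratio(0.010, 0.001))  # 10.0 vs ~10.09

# Common outcomes (made-up values): the odds ratio can greatly overstate the relative risk
print(relative_risk(0.90, 0.50), odds_ratio(0.90, 0.50))      # 1.8 vs 9.0
```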

SLIDE 20

Case-Control Studies and Odds Ratio

Consider the case-control study of lung cancer:

              Lung Cancer
Smoker        Cases    Controls
Yes           688      650
No            21       59
Total         709      709

◮ People are recruited based on lung cancer status; therefore $P(Y = j)$ is known, but $P(X = i)$ is unknown
◮ Conditional probabilities $P(X = i \mid Y = j)$ can be estimated
◮ Conditional probabilities $P(Y = j \mid X = i)$ cannot be estimated
◮ Relative risk and difference of proportions cannot be estimated

SLIDE 21

◮ Odds can be estimated:

$$\text{Odds of lung cancer among smokers} = \frac{P(\text{Case} \mid \text{Smoker})}{P(\text{Control} \mid \text{Smoker})} = \frac{P(\text{Case} \cap \text{Smoker})/P(\text{Smoker})}{P(\text{Control} \cap \text{Smoker})/P(\text{Smoker})} = \frac{P(\text{Case} \cap \text{Smoker})}{P(\text{Control} \cap \text{Smoker})} = 688/650 \approx 1.06$$

◮ The odds do not depend on the probability of being a smoker, since $P(\text{Smoker})$ cancels
◮ The odds ratio can also be estimated:

$$\theta = \frac{P(X = 1 \mid Y = 1)\,P(X = 2 \mid Y = 2)}{P(X = 1 \mid Y = 2)\,P(X = 2 \mid Y = 1)} = \frac{688 \times 59}{650 \times 21} \approx 2.97$$
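A minimal sketch (not from the slides) recomputing the two quantities above from the lung-cancer table; the variable names are mine.

```python
# Counts from the case-control table on the previous slide
cases_smoker, controls_smoker = 688, 650
cases_nonsmoker, controls_nonsmoker = 21, 59

# Odds of being a case among smokers
odds_smoker = cases_smoker / controls_smoker                   # ~1.06

# Sample odds ratio: unchanged by which margin was fixed by the sampling design
theta_hat = (cases_smoker * controls_nonsmoker) / (controls_smoker * cases_nonsmoker)
print(odds_smoker, theta_hat)                                  # ~1.06, ~2.97
```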

SLIDE 22

Supplementary: Review of the Delta Method

The Delta method builds upon the Central Limit Theorem to allow us to examine the convergence of the distribution of a function g of a random variable X. It is not too complicated to derive the Delta method in the univariate case. We need to use Slutsky’s Theorem along the way; it will be helpful to first review ideas of convergence in order to better understand where Slutsky’s Theorem fits into the derivation.

SLIDE 23

Delta Method: Convergence of Random Variables

Consider a sequence of random variables $X_1, X_2, \ldots, X_n$, where the distribution of $X_i$ may be a function of $i$.

◮ Let $F_n(x)$ be the CDF of $X_n$ and $F(x)$ be the CDF of $X$. We say that $X_n$ converges in distribution to $X$, written $X_n \to_d X$, if $\lim_{n\to\infty}[F_n(x) - F(x)] = 0$ for all $x$ where $F(x)$ is continuous.

◮ We say that $X_n$ converges in probability to $X$, written $X_n \to_p X$, if $\lim_{n\to\infty} P(|X_n - X| > \epsilon) = 0$ for every $\epsilon > 0$. Note that if $X_n \to_p X$, then $X_n \to_d X$, since $F_n(x) = P(X_n \le x)$ and $F(x) = P(X \le x)$. (This is not a proof, but an intuition. The Wikipedia article on convergence has a nice proof.)

SLIDE 24

Delta Method: Slutsky’s Theorem and First-Order Taylor Approximation

◮ Recall that Slutsky's Theorem tells us that if some random variable $X_n$ converges in distribution to $X$ and some random variable $Y_n$ converges in probability to $c$, then $X_n + Y_n$ converges in distribution to $X + c$ and $X_nY_n$ converges in distribution to $cX$.

◮ Recall that the first-order Taylor approximation of a function $g$ centered at $u$ can be written as $g(x) = g'(u)(x - u) + g(u) + R(x)$, where $R(x) = \sum_{i \geq 2} \frac{g^{(i)}(u)(x - u)^i}{i!}$.

SLIDE 25

Delta Method: Hand-wave-y Derivation

Suppose we know that $\sqrt{n}(X_n - \theta) \to_d N(0, \sigma^2)$ and we are interested in the behavior of some function $g(X_n)$ as $n \to \infty$. If $g'(\theta)$ exists and is not zero, we can write $g(X_n) \approx g'(\theta)(X_n - \theta) + g(\theta)$ using Taylor's approximation:

$$g(X_n) = g'(\theta)(X_n - \theta) + g(\theta) + \sum_{i \geq 2} \frac{g^{(i)}(\theta)(X_n - \theta)^i}{i!}$$

SLIDE 26

Delta Method: Hand-wave-y Derivation

Some manipulation gives:

$$\sqrt{n}\,g(X_n) = \sqrt{n}\,g'(\theta)(X_n - \theta) + \sqrt{n}\,g(\theta) + \sqrt{n}\sum_{i \geq 2} \frac{g^{(i)}(\theta)(X_n - \theta)^i}{i!}$$

Or, using the definition of $R$ from the previous slide,

$$\sqrt{n}\,(g(X_n) - g(\theta)) = \sqrt{n}\,g'(\theta)(X_n - \theta) + \sqrt{n}\,R(X_n)$$

SLIDE 27

Delta Method: Hand-wave-y Derivation

Since $g'(\theta)$ is a constant with respect to $n$ and $\sqrt{n}(X_n - \theta) \to_d N(0, \sigma^2)$, we have $g'(\theta)\sqrt{n}(X_n - \theta) \to_d N(0, \sigma^2 [g'(\theta)]^2)$. It can be shown that the scaled remainder term $\sqrt{n}\,R(X_n) \to_p 0$ (see the Stephens link from McGill below for a proof). We now have the necessary setup to apply Slutsky's Theorem, and we can conclude that $\sqrt{n}(g(X_n) - g(\theta)) \to_d N(0, \sigma^2 [g'(\theta)]^2)$.
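A small simulation sketch (not from the slides) checking the delta-method conclusion for $g(x) = \log x$ applied to a binomial proportion, the case used earlier for the log relative risk; the sample size and probability below are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n, pi = 500, 0.3            # hypothetical sample size and true success probability
reps = 20_000

# Sample proportions: sqrt(n)(pi_hat - pi) is approximately N(0, pi(1 - pi))
pi_hat = rng.binomial(n, pi, size=reps) / n

# Delta method with g(x) = log(x): the variance of sqrt(n)(log pi_hat - log pi)
# should be close to [g'(pi)]^2 * pi(1 - pi) = (1 - pi) / pi
empirical_var = np.var(np.sqrt(n) * (np.log(pi_hat) - np.log(pi)))
delta_method_var = (1 - pi) / pi
print(empirical_var, delta_method_var)   # the two values should be close
```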

SLIDE 28

Delta Method: References

◮ http://www.stat.rice.edu/~dobelman/notes_papers/math/TaylorAppDeltaMethod.pdf
◮ https://en.wikipedia.org/wiki/Convergence_of_random_variables
◮ http://www.stat.cmu.edu/~larry/=stat325.01/chapter5.pdf
◮ https://en.wikipedia.org/wiki/Slutsky%27s_theorem
◮ http://www.math.mcgill.ca/dstephens/OldCourses/556-2007/Math556-19-AsympNormal.pdf