Dongmei Li Department of Public Health Sciences Office of Public Health Studies University of Hawai’i at Mānoa
Outline
Chi-square test
Logistic regression

Chi-Square Test - Example
Data below reveal a negative association between smoking and education level. Let us test H0: no association in the population vs. Ha: association in the population.
Expected frequencies:
$$E_i = \frac{\text{row total} \times \text{column total}}{\text{table total}}$$
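Chi-square statistic:
$$\chi^2_{\text{stat}} = \sum_{\text{all cells}} \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ is the observed count in cell $i$ and $E_i$ is the expected count in cell $i$, calculated as (row total × column total) / table total.
Degrees of freedom: $df = (R - 1)(C - 1)$, where $R$ is the number of rows and $C$ is the number of columns.
Convert the $\chi^2_{\text{stat}}$ to a P-value with a chi-square table or software.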
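The expected-count and chi-square formulas above can be sketched in a few lines of Python. This is only an illustration: the 2 x 3 table `observed` is hypothetical (it is not the smoking and education data from the slides), and the function names are made up for this example.

```python
# Hypothetical 2 x 3 table of observed counts (illustrative only).
observed = [
    [20, 30, 50],
    [30, 30, 40],
]


def expected_counts(table):
    """Expected count per cell: row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for an R x C table."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


stat, df = pearson_chi_square(observed)
print(f"X2 = {stat:.3f}, df = {df}")
```

For this made-up table every row has expected counts 25, 30 and 45, so only the first and third columns contribute to the statistic.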
$\chi^2_{\text{stat}} = 13.20$ with 4 df

Probability in right tail
df   0.975  0.25  0.20  0.15  0.10  0.05  0.025  0.01   0.005
4    0.48   5.39  5.99  6.74  7.78  9.49  11.14  13.28  14.86
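With 4 df, $\chi^2_{\text{stat}} = 13.20$ falls between the table entries 11.14 (right-tail probability 0.025) and 13.28 (right-tail probability 0.01), so 0.01 < P < 0.025.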
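As a rough check on the table lookup, the right-tail probability can also be computed directly. The sketch below uses the closed form of the chi-square tail that holds when df is even, which covers the 4 df case here. The function name is hypothetical, and statistical software would normally be used instead.

```python
import math


def chi_square_right_tail(x, df):
    """Right-tail probability P(X2 >= x); valid only for positive even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    # For df = 2k the tail is exp(-x/2) * sum_{j<k} (x/2)^j / j!
    terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * terms


p_value = chi_square_right_tail(13.20, 4)
print(f"P = {p_value:.4f}")  # about 0.0103
```

The result lies between 0.01 and 0.025, in agreement with the bracketing read from the table.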
Two different chi-square statistics are used in practice.
Pearson's chi-square statistic (covered) is:
$$\chi^2_{\text{stat}} = \sum_{\text{all cells}} \frac{(O_i - E_i)^2}{E_i}$$
Yates' continuity-corrected chi-square statistic is:
$$\chi^2_{\text{stat},c} = \sum_{\text{all cells}} \frac{\left(|O_i - E_i| - \frac{1}{2}\right)^2}{E_i}$$
The continuity-corrected method produces smaller chi-square statistics and larger P-values.
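To see how the two statistics differ, the sketch below evaluates both on a small table. The 2 x 2 counts are hypothetical and the function name is made up for this example. The continuity correction is typically applied to 2 x 2 tables, which is why one is used here.

```python
def chi_square_stats(table):
    """Return (Pearson, Yates continuity-corrected) chi-square statistics."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    pearson = 0.0
    yates = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count
            pearson += (o - e) ** 2 / e
            yates += (abs(o - e) - 0.5) ** 2 / e
    return pearson, yates


# Hypothetical 2 x 2 table (illustrative only).
observed = [
    [12, 8],
    [5, 15],
]

pearson, yates = chi_square_stats(observed)
print(f"Pearson X2 = {pearson:.3f}, Yates X2 = {yates:.3f}")
```

With these counts every cell has |O - E| = 3.5, so the correction shrinks each numerator from 3.5 squared to 3 squared and the corrected statistic is the smaller of the two.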
Data set: Presentation4_chisqtest.jmp
JMP output: the P-values from both tests indicate a significant association.
Surviving third-degree burns
These data (Presentation4_Burn.jmp) refer to 435 adults
Source: http://statmaster.sdu.dk/courses/st111/module14/index.html
Variable  Description
Midpoint: Midpoint of the group corresponding to the
survive: Binary variable: survived=1, died=0
A first idea might be to model the relationship
However, the scatterplot of the proportions of patients
The curved relationship is typical for many situations
Some examples of the curved relationship
The following scatterplot shows the logit-transformed
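The simple logistic regression model relates $p_x$ to $x$ through the log odds (logit):
$$\log\left[\frac{p_x}{1 - p_x}\right] = a + bx$$
Alternatively, it can be written as
$$p_x = P(D \mid X = x) = \frac{1}{1 + e^{-(a + bx)}}$$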
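The two ways of writing the model describe the same relationship, which a short numerical sketch can illustrate. The coefficients `a` and `b` below are hypothetical values chosen for illustration. They are not the estimates from the burn data.

```python
import math


def logit(p):
    """Log odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def logistic_probability(a, b, x):
    """p_x = P(D | X = x) = 1 / (1 + exp(-(a + b*x)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


# Hypothetical intercept and slope (assumed for illustration only).
a, b = -2.0, 0.5

for x in (0, 2, 4, 6):
    p = logistic_probability(a, b, x)
    # The logit of p_x recovers the linear predictor a + b*x.
    print(f"x = {x}: p = {p:.3f}, logit(p) = {logit(p):.3f}, a + b*x = {a + b * x:.3f}")
```

The probabilities follow an S-shaped curve in x, while their logits fall on a straight line with intercept a and slope b.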
Data set: Presentation4_Burn.jmp
Estimated logistic regression
If X has several discrete levels or is measured on a continuous scale, there is no change in the interpretation of a (the log odds of D when X = 0).
The log odds ratio comparing two exposure groups is
$$\log(OR) = \log\left[\frac{\text{odds for } D \mid X = x+1}{\text{odds for } D \mid X = x}\right] = \log\left[\frac{p_{x+1}/(1 - p_{x+1})}{p_x/(1 - p_x)}\right] = \log[p_{x+1}/(1 - p_{x+1})] - \log[p_x/(1 - p_x)] = [a + b(x+1)] - [a + bx] = b$$
b is the log odds ratio associated with a unit increase in X.
10.66 is the log odds ratio of death associated with a unit increase in midpoint (midpoints of set intervals of log(area + 1)).
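The derivation above implies that the odds ratio for a one-unit increase in X equals exp(b) at every starting value x. The sketch below checks this numerically from the probabilities themselves. The values of `a` and `b` are hypothetical and the function names are made up for this example.

```python
import math


def probability(a, b, x):
    """P(D | X = x) under the simple logistic model."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


# Hypothetical coefficients (assumed for illustration only).
a, b = -1.5, 0.8

# Odds ratio for x + 1 versus x at several starting values of x.
ratios = [
    odds(probability(a, b, x + 1)) / odds(probability(a, b, x))
    for x in (0.0, 1.0, 2.5)
]

print("odds ratios:", [round(r, 4) for r in ratios])
print("exp(b):     ", round(math.exp(b), 4))
```

All three ratios agree with exp(b), even though the probabilities themselves change by different amounts at each x.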
Consider a study of the analgesic effects of treatments on
Look at the difference between male and female on pain
Look at the treatment effect on pain
Data set: Presentation4_logistic.jmp
Analyze > Fit Model
Logistic regression results
Suppose the exposure variable X only takes on two values (1 is exposed and 0 is unexposed).
When X = 0, then $\log[p_0/(1 - p_0)] = a + b \cdot 0 = a$.
So, a is the log odds of D amongst the unexposed.
The slope parameter b is just the log odds ratio:
$$\log(OR) = \log\left[\frac{\text{odds for } D \mid X = 1}{\text{odds for } D \mid X = 0}\right] = \log\left[\frac{p_1/(1 - p_1)}{p_0/(1 - p_0)}\right] = \log[p_1/(1 - p_1)] - \log[p_0/(1 - p_0)] = (a + b) - (a) = b$$
0.63 is the log odds ratio of No Pain comparing females vs. males.
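The claim that the slope equals the log odds ratio for a binary exposure can be checked by fitting the model to a small 2 x 2 data set. The sketch below uses a bare Newton-Raphson fit written for this example. It is not necessarily the algorithm JMP uses, the counts are hypothetical, and only the coefficient 0.63 comes from the slides.

```python
import math


def fit_simple_logistic(x, y, iterations=25):
    """Maximum likelihood fit of log[p/(1-p)] = a + b*x by Newton-Raphson."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Score vector (first derivatives of the log-likelihood).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix entries, using weights p * (1 - p).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix inverse) times (score).
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical data: 10 of 40 unexposed and 20 of 40 exposed have the outcome.
x = [0] * 40 + [1] * 40
y = [1] * 10 + [0] * 30 + [1] * 20 + [0] * 20

a_hat, b_hat = fit_simple_logistic(x, y)
table_log_or = math.log((20 / 20) / (10 / 30))  # log odds ratio from the table

print(f"a = {a_hat:.4f}, b = {b_hat:.4f}, table log OR = {table_log_or:.4f}")

# Converting the slide's log odds ratio of 0.63 to an odds ratio.
print(f"exp(0.63) = {math.exp(0.63):.2f}")  # 1.88
```

The fitted slope matches the log odds ratio computed directly from the table, and the fitted intercept matches the log odds among the unexposed. Exponentiating 0.63 gives an odds ratio of about 1.88.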
Odds ratio for sex
The odds ratio of reporting no pain comparing females
Calculate the odds ratio of no pain for comparing
A study is conducted to examine the effect of age on
1. Fit a logistic regression to examine the effect of age on
2. Fit a logistic regression to examine the effect of age
Data set: Presentation4_logisticCHD.jmp