ERRORS IN EPIDEMIOLOGICAL STUDIES
Pratap Singhasivanon
Department of Tropical Hygiene
An important goal of epidemiological studies is to measure
A false or mistaken result obtained in a study or experiment
ERROR = SYSTEMATIC ERROR (BIAS) + RANDOM ERROR
Random error: fluctuation of an estimate around the population value (RANDOM VARIABILITY)
Systematic error: error due to factors that are inherent in the design, conduct and analysis
Result obtained in the sample differs from the result that would be obtained if the entire population were studied
Systematic error; random error
an observation on a sample from the true population value
Refers to fluctuations around a true value because of sampling variability
Any difference between the true value and that actually obtained that is the result
[Figure: error plotted against study size, with curves for systematic error (bias) and random error (chance). Source: Rothman, 2002]
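The contrast between the two kinds of error can be illustrated with a small simulation. This is a minimal sketch and not part of the original slides: the function names, the normal measurement model, and the values chosen for `bias`, `sd` and the study sizes are all assumptions made for illustration. It suggests that the spread of estimates (random error) shrinks as study size grows, while the shift caused by bias (systematic error) stays about the same.

```python
import random
import statistics


def simulate_estimates(true_value, bias, sd, n, n_studies=200, seed=0):
    """Return the sample mean from each of n_studies simulated studies of size n."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_studies):
        # Every measurement is shifted by the same bias (systematic error)
        # and scattered with standard deviation sd (random error).
        sample = [rng.gauss(true_value + bias, sd) for _ in range(n)]
        estimates.append(statistics.fmean(sample))
    return estimates


def summarize(true_value, estimates):
    """Split the error of the estimates into a systematic and a random part."""
    return {
        "systematic_error": statistics.fmean(estimates) - true_value,
        "random_error_sd": statistics.stdev(estimates),
    }


# Hypothetical values: true value 100, bias +2, measurement SD 10.
small = summarize(100, simulate_estimates(100, bias=2, sd=10, n=10))
large = summarize(100, simulate_estimates(100, bias=2, sd=10, n=1000))
print(small)
print(large)
```

Increasing `n` narrows `random_error_sd` but leaves `systematic_error` close to the assumed bias, which is why a larger study cannot correct a biased design.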
Bias occurs when an estimated association
(RR, OR, difference in means etc.) deviates from the true measure of association
Consequence of bias: systematic error in RR, OR etc.
Bias may be introduced at design, implementation or analysis phase of a study
SELECTION BIAS
INFORMATION BIAS
CONFOUNDING
OR = 9.1
causal non-causal
test is capable of measuring what it is intended to measure
the truth, no systematic error and random error should be as small as possible
Different combinations of high and low reliability and validity
[Figure: VALIDITY (high, low) against RELIABILITY (high, low)]
[Figure: external population, target population and study sample, with internal (INT.) and external (EXT.) VALIDITY]
SELECTION BIAS
is a distortion in the estimate of effect resulting from the manner in which subjects are selected for the study population
MAJOR SOURCES OF SELECTION BIAS
1) flaws in the choice of groups to be compared
2) choice of sampling frame
3) loss to follow-up or nonresponse during data collection
4) selective survival
Systematic error resulting from manner
Can occur when:
differ systematically from those in the target population
different populations
Distortions that arise from
– Procedures used to select subjects
– Factors that influence study participation
– Factors that influence participant attrition
Systematic error in identifying or
– Examples are…
If cases & controls or exposed & non-
May result from withdrawal or losses to
Problematic
– Can result in over- or under-estimation of the true magnitude of the relationship between an exposure and an outcome
– May produce an apparent association when none exists
about association of exposure & disease
– May conceal a real association
To avoid it, ensure that:
– Subjects are representative of target population
– Study and comparison groups are similar except for variables being investigated
– Subject losses are kept to a minimum
INFORMATION BIAS
is a distortion in the estimate of effect due to measurement error or misclassification of subjects on one or more variables
MAJOR SOURCES OF INFORMATION BIAS
1) invalid measurement
2) incorrect diagnostic criteria
3) omissions, imprecisions
4) other inadequacies in previously recorded data
Systematic error in the measurements of information on exposure or outcome
Result in:
Differences in accuracy of:
exposure data between cases and controls
groups
Definition:
the erroneous classification of an individual, a value, or an attribute into a category other than that to which it should be assigned
“cutoff point” in disease diagnosis or exposure classification
Hence errors are made in classifying to either disease or exposure status
Occurs when there is equal misclassification of exposure between diseased and non-diseased study subjects
OR
When there is equal misclassification of disease between exposed and non-exposed study subjects
True Classification

              Cases   Controls   Total
Exposed        100       50       150
Nonexposed      50       50       100
Total          150      100       250

OR = ad/bc = 2.0; RR = [a/(a+b)] / [c/(c+d)] = 1.3

Nondifferential misclassification: overestimate exposure in 10 cases, 10 controls

              Cases   Controls   Total
Exposed        110       60       170
Nonexposed      40       40        80
Total          150      100       250

OR = ad/bc = 1.8; RR = [a/(a+b)] / [c/(c+d)] = 1.3
Bias towards the null
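The arithmetic behind these two tables can be checked with a short script. This is only an illustrative sketch: the function names are hypothetical, and the cell labels follow the slide's formulas (a, b = exposed cases and controls; c, d = nonexposed cases and controls).

```python
def odds_ratio(a, b, c, d):
    """OR = ad/bc for a 2 x 2 table with cells a, b (exposed) and c, d (nonexposed)."""
    return (a * d) / (b * c)


def risk_ratio(a, b, c, d):
    """RR = [a/(a+b)] / [c/(c+d)], using the row totals as denominators."""
    return (a / (a + b)) / (c / (c + d))


true_table = (100, 50, 50, 50)      # true classification
observed_table = (110, 60, 40, 40)  # 10 cases and 10 controls wrongly counted as exposed

print(round(odds_ratio(*true_table), 1), round(risk_ratio(*true_table), 1))
print(round(odds_ratio(*observed_table), 1), round(risk_ratio(*observed_table), 1))
```

The observed OR moves from 2.0 towards 1.0, which is the direction of bias the slide describes.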
Non-Differential Misclassification

"True Situation"

            Cases   Controls   Total
Exp.          85       40       125
Not Exp.      15       60        75
Total        100      100       200

OR = 8.5

50% of exposed misclassified as unexposed

            Cases     Controls
Exp.          43         20
Not Exp.    15 + 42    60 + 20
Total        100        100

OR = 3.0
Bias towards the null (1.0)
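The same effect can be sketched more generally by applying a sensitivity and specificity of exposure classification equally to cases and controls. This is an assumption-laden illustration rather than the slide's own method: `misclassify_exposure` is a hypothetical helper, and it works with expected (fractional) counts, so it gives an OR slightly below the slide's 3.0, which comes from rounding 42.5 up to 43.

```python
def misclassify_exposure(exposed, unexposed, sensitivity=1.0, specificity=1.0):
    """Expected observed (exposed, unexposed) counts in one group."""
    observed_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    observed_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return observed_exposed, observed_unexposed


def odds_ratio(a, b, c, d):
    """OR = ad/bc; a, b = exposed cases/controls, c, d = unexposed cases/controls."""
    return (a * d) / (b * c)


cases = (85, 15)     # true exposed, unexposed cases
controls = (40, 60)  # true exposed, unexposed controls

# Nondifferential: the same 50% sensitivity is applied to both groups.
obs_cases = misclassify_exposure(*cases, sensitivity=0.5)
obs_controls = misclassify_exposure(*controls, sensitivity=0.5)

true_or = odds_ratio(cases[0], controls[0], cases[1], controls[1])
observed_or = odds_ratio(obs_cases[0], obs_controls[0], obs_cases[1], obs_controls[1])
print(true_or, round(observed_or, 2))
```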
Example
Case-Control study:
If more cases are mistakenly classified as being exposed than controls
Cohort study:
If exposed group is more likely to be mistakenly classified as having developed the outcome than the unexposed group
Leads to overestimation of the true magnitude of the measure of association
[Figure: Prevalence of Down syndrome at birth by birth order. x-axis: birth order (1 to 5+); y-axis: affected babies per 1000 live births]
[Figure: Prevalence of Down syndrome by maternal age. x-axis: maternal age (<20 to 40+); y-axis: affected babies per 1000 live births]
Here’s what we’d like to assess: [diagram: Exposure, Disease]
Here’s where confounding acts: [diagram: Exposure, Disease, Confounding]
Where one line style represents a causal relationship and the other represents a non-causal relationship
Because…...
disease includes BOTH the contribution of the exposure AND the confounder
confounder gives an incorrect estimate of the impact of the exposure on the disease outcome
with or without exposure
but not caused by the exposure
pathway between exposure and outcome
E = Exposure D = Disease C = Confounder
C E D
part of causal pathway
E C D
E and C are part of a causal pathway
MIXING OF EFFECTS
The estimate of the effect of the exposure is mixed with the effect of an extraneous factor
COFFEE DRINKING, CIGARETTE SMOKING AND CORONARY HEART DISEASE
EXPOSURE (coffee drinking)
DISEASE (heart disease)
CONFOUNDING VARIABLE (cigarette smoking)
can lead to overestimation or underestimation of an effect depending on the direction of the association that the confounding factor has with exposure and disease.
direction of an effect.
Example: Alcohol, Smoking, Oral cancer
[Diagrams: relationships among exposure (E), extraneous factor (F) and disease (D)]
Situation in which F is a confounder for a D-E association.
Situation in which F is not a confounder for a D-E association.
To be confounding, the extraneous variable must have the following characteristics
1) A confounding variable must be a risk factor for the disease.
2) A confounding variable must be associated with the exposure under study (in the population from which the cases derive).
3) A confounding variable must not be an intermediate step in the causal path between the exposure and the disease.
presence or absence of confounding involve the comparison of a crude effect measure with an adjusted effect measure that corrects for distortions due to extraneous variables.
when the crude and adjusted effect measures differ in value.
DESIGN
ANALYSIS
Relation of Confounder to Disease and Exposure

AGE      MI* (%)   CONTROLS (%)   OC** USE (%)
25-29       3          16             29
30-34       9          14             10
35-39      16          20              8
40-44      30          21              4
45-49      42          18              3

*MI: MYOCARDIAL INFARCTION (DISEASE)
**OC: ORAL CONTRACEPTIVE (EXPOSURE)
[Figure: exposure (E) by disease (D) diagrams; CRUDE RR = 4]
EXPOSURE: alcohol (ALC); DISEASE: oral cancer (O Ca); stratified by smoking (SM)

ALL SUBJECTS
           DISEASE +   DISEASE -   Total
ALC +         200         800       1000
ALC -          50         950       1000
Total         250        1750       2000
CRUDE CIR = 4.0

SMOKERS
           DISEASE +   DISEASE -
ALC +         194         706
ALC -          21          79
CIR (SM) = 1.02

NON-SMOKERS
           DISEASE +   DISEASE -
ALC +           6          94
ALC -          29         871

ADJUSTED CIR = 1.13
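The alcohol, smoking and oral cancer figures can be re-derived with a short script. This is a sketch under stated assumptions: the cell layout is inferred from the counts on the slide (smokers: 194 of 900 drinkers and 21 of 100 non-drinkers diseased; non-smokers: 6 of 100 and 29 of 900), the function names are hypothetical, and the slide does not say how its adjusted CIR was obtained, so the Mantel-Haenszel risk ratio is used here as one common choice.

```python
def risk_ratio(a, n1, c, n0):
    """Cumulative incidence ratio: (a/n1) / (c/n0)."""
    return (a / n1) / (c / n0)


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio over strata of (a, n1, c, n0)."""
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator


# (diseased exposed, total exposed, diseased unexposed, total unexposed)
smokers = (194, 900, 21, 100)      # assumed layout, see lead-in
non_smokers = (6, 100, 29, 900)    # assumed layout, see lead-in

crude = risk_ratio(200, 1000, 50, 1000)
adjusted = mantel_haenszel_rr([smokers, non_smokers])
print(round(crude, 2), round(risk_ratio(*smokers), 2), round(adjusted, 2))
```

Under these assumptions the adjusted value comes out near 1.14, close to the 1.13 on the slide and far from the crude 4.0, which is the point of the example: most of the crude association is due to smoking.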
Calculate crude measure of association:

          Smokers   Non smokers   Total
CHD         305         58         363
No CHD      345        292         637
Total       650        350        1000

RR = 2.8 (2.21,3.63)

Calculate stratum-specific measures of association...

STRATUM 1: MEN

          Smokers   Non smokers   Total
CHD         300         50         350
No CHD      300        150         450
Total       600        200         800

RR = 2.0 (1.55,2.58)

STRATUM 2: WOMEN

          Smokers   Non smokers   Total
CHD           5          8          13
No CHD       45        142         187
Total        50        150         200

RR = 1.9 (0.64,5.47)
gender as a potential confounder...
MEN RR = 2.0
WOMEN RR = 1.9
and CHD because the crude RR of 2.8 is NOT the same as the stratum-specific RR’s of approx. 2.0
roughly the same
smoking and CHD because the crude RR of 2.8 is NOT the same as the stratum-specific RR’s of 2.0 or 1.9
there is confounding present
Are 2.0 and 1.9 more than 10% different from 2.8? 10% of 2.8 = 0.28, and the difference between our stratum-specific RR’s and the crude RR is greater than 0.28
Not a replacement for a statistical test: simply a way to initially judge whether something is a potentially confounding factor
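The 10% rule of thumb can be written out for the smoking and CHD data. This is an illustrative sketch only: the function names are hypothetical, and the threshold is the informal screening rule described above, not a statistical test.

```python
def risk_ratio(a, n1, c, n0):
    """RR = (a/n1) / (c/n0): risk in smokers over risk in non-smokers."""
    return (a / n1) / (c / n0)


def differs_by_more_than(crude, stratum_rr, fraction=0.10):
    """True if the stratum RR differs from the crude RR by more than fraction * crude."""
    return abs(crude - stratum_rr) > fraction * crude


# (CHD in smokers, all smokers, CHD in non-smokers, all non-smokers)
crude_rr = risk_ratio(305, 650, 58, 350)
men_rr = risk_ratio(300, 600, 50, 200)
women_rr = risk_ratio(5, 50, 8, 150)

for label, rr in (("men", men_rr), ("women", women_rr)):
    print(label, round(rr, 1), differs_by_more_than(crude_rr, rr))
```

Both stratum-specific RRs differ from the crude RR by more than 10%, so gender would be flagged as a potential confounder by this rule.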
IN STUDY DESIGN…
confounders (i.e. simply don’t include confounder in study)
groups to attempt to even out confounders
assuring even distribution among study groups
IN DATA ANALYSIS…
method to adjust for confounders
collected data (frequency or group)
but it means throwing away data
assume that ORs are constant across strata
$$\widehat{OR}_{MH} = \frac{\sum_i a_i d_i / n_i}{\sum_i b_i c_i / n_i}$$
across strata by removing the effect of the confounder
Example of the Mantel Haenszel Method ...
STRATUM 1: Pre-Menopause

             Stroke   No Stroke   Total
Fm Hsty        16        47         63
No Fm Hsty      2         7          9
Total          18        54         72

OR = 1.19

STRATUM 2: Post-Menopause

             Stroke   No Stroke   Total
Fm Hsty        24        13         37
No Fm Hsty     58        33         91
Total          82        46        128

OR = 1.05
$$\widehat{OR}_{MH} = \frac{\sum_i a_i d_i / n_i}{\sum_i b_i c_i / n_i}$$
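The Mantel-Haenszel calculation for the family history and stroke example can be sketched as follows. The function name is hypothetical, and the cells are assumed to be ordered as a, b (family history: stroke, no stroke) and c, d (no family history: stroke, no stroke), with n the stratum total. The summary value is computed here from the two tables and is not a figure stated on the slides.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary OR: sum(a*d/n) / sum(b*c/n) over strata of (a, b, c, d)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


pre_menopause = (16, 47, 2, 7)
post_menopause = (24, 13, 58, 33)

for a, b, c, d in (pre_menopause, post_menopause):
    print(round(a * d / (b * c), 2))  # stratum-specific ORs

or_mh = mantel_haenszel_or([pre_menopause, post_menopause])
print(round(or_mh, 2))
```

Because the stratum-specific ORs are similar, the summary OR falls between them, weighted towards the larger stratum.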
HOW TO REPORT DATA WITH CONFOUNDERS
IF YOU HAVE A CONFOUNDER…
it’s wrong)
Mantel-Haenszel OR (this is like compiling stratum-specific OR’s)
Are stratum-specific OR’s the same?
NO: INTERACTION… report stratum-specific OR or RR
YES: crude OR = stratum-specific?
  NO: CONFOUNDING. Report summary measure (MH OR)
  YES: NO CONFOUNDING or INTERACTION. Report crude OR or RR
rather than mere presence or absence
degree of confounding = crude measure / adjusted measure
= 4.00 / 1.13 = 3.53
Crude = 1.68, Adjusted = 3.97: d.c. = 1.68 / 3.97 = 0.42
underestimation
AGE     Recent Use of OC    MI    Controls    OR
25-29   Yes                  4       62       7.2
        No                   2      224
30-34   Yes                  9       33       8.9
        No                  12      390
35-39   Yes                  4       26       1.5
        No                  33      330
40-44   Yes                  6        9       3.7
        No                  65      362
45-49   Yes                  6        5       3.9
        No                  93      301
TOTAL   Yes                 29      135       1.7
        No                 205     1607
$$\widehat{OR}_a(\mathrm{MH}) = 3.97$$
4-fold risk of MI among recent OC users as compared to non-users.
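The crude and age-adjusted odds ratios for OC use and MI can be recomputed from the stratified counts. This is a sketch with hypothetical function names. It uses 224 controls among non-users aged 25-29, the value consistent with the reported stratum OR of 7.2 and the column total of 1607.

```python
def odds_ratio(a, b, c, d):
    """OR = ad/bc; a, b = OC users (MI, controls), c, d = non-users (MI, controls)."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary OR: sum(a*d/n) / sum(b*c/n) over strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


age_strata = {
    "25-29": (4, 62, 2, 224),
    "30-34": (9, 33, 12, 390),
    "35-39": (4, 26, 33, 330),
    "40-44": (6, 9, 65, 362),
    "45-49": (6, 5, 93, 301),
}

crude_or = odds_ratio(29, 135, 205, 1607)
adjusted_or = mantel_haenszel_or(age_strata.values())
degree_of_confounding = crude_or / adjusted_or

print(round(crude_or, 2), round(adjusted_or, 2), round(degree_of_confounding, 2))
```

A ratio of crude to adjusted OR below 1 indicates that the crude estimate understates the association, here because OC use is concentrated in younger women while MI is concentrated in older women.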
TYPES OF ASSOCIATION
1. Noncausally associated (Secondarily)
2. Causally associated
Example No.   Type of Confounding   Unadjusted Relative Risk   Adjusted Relative Risk
1             Positive              3.5                        1.0
2             Positive              3.5                        2.1
3             Positive              0.3                        0.7
4             Negative              1.0                        3.2
5             Negative              1.5                        3.2
6             Negative              0.8                        0.2
7             Qualitative           2.0                        0.7
8             Qualitative           0.6                        1.8

Hypothetical Examples of Unadjusted and Adjusted Relative Risks According to Type of Confounding (Positive or Negative)
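The pattern in the table can be expressed as a simple rule. This is a sketch: the rule is inferred from the eight examples and is not stated on the slides, and the function name is hypothetical. It assumes confounding is "positive" when the unadjusted RR lies further from the null value of 1.0 than the adjusted RR, "negative" when it lies closer to the null, and "qualitative" when the two lie on opposite sides of the null. Distance from the null is measured on the log scale so that 0.5 and 2.0 count as equally far from 1.0.

```python
import math


def confounding_type(unadjusted, adjusted):
    """Classify confounding from an unadjusted and an adjusted relative risk."""
    if (unadjusted - 1) * (adjusted - 1) < 0:
        return "Qualitative"  # estimates on opposite sides of the null
    unadjusted_distance = abs(math.log(unadjusted))
    adjusted_distance = abs(math.log(adjusted))
    if unadjusted_distance > adjusted_distance:
        return "Positive"     # crude estimate exaggerates the effect
    if unadjusted_distance < adjusted_distance:
        return "Negative"     # crude estimate understates the effect
    return "None"


examples = [(3.5, 1.0), (3.5, 2.1), (0.3, 0.7), (1.0, 3.2),
            (1.5, 3.2), (0.8, 0.2), (2.0, 0.7), (0.6, 1.8)]
for number, (unadjusted, adjusted) in enumerate(examples, start=1):
    print(number, confounding_type(unadjusted, adjusted))
```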