[PPT] - ERRORS IN EPIDEMIOLOGICAL STUDIES PowerPoint Presentation

SLIDE 1

ERRORS IN EPIDEMIOLOGICAL STUDIES

Pratap Singhasivanon

Department of Tropical Hygiene

SLIDE 2

An important goal of epidemiological studies is to measure accurately the occurrence of exposure/risk factors and disease outcome

SLIDE 3

ERROR: A false or mistaken result obtained in a study or experiment

ERROR = SYSTEMATIC ERROR (BIAS) + RANDOM ERROR

SYSTEMATIC ERROR (BIAS): Error due to factors that are inherent in the design, conduct and analysis

RANDOM ERROR: Fluctuation of an estimate around the population value (RANDOM VARIABILITY). Result obtained in sample differs from result that would be obtained if the entire population were studied

SLIDE 4

ERROR

Is defined as a false or mistaken result obtained in a study or experiment

Consists of 2 components

Systematic error
Random error

SLIDE 5

RANDOM ERROR

Is the divergence, due to chance alone, of an observation on a sample from the true population value

SLIDE 6

RANDOM ERROR

Refers to fluctuations around a true value because of sampling variability

SYSTEMATIC ERROR

Any difference between the true value and that actually obtained that is the result of all causes other than sampling variability.
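The distinction between the two components can be illustrated with a small simulation. This is only a sketch: the population mean, standard deviation and the size of the bias are hypothetical values chosen for illustration, not figures from the lecture. Repeated "studies" are drawn at two sample sizes; the spread of the estimates (random error) shrinks as the sample grows, while a constant measurement offset (systematic error) stays the same.

```python
import random
import statistics

TRUE_MEAN = 50.0  # hypothetical true population value
TRUE_SD = 10.0    # hypothetical population standard deviation
BIAS = 3.0        # hypothetical systematic error added to every measurement


def simulate_estimates(sample_size, bias=0.0, replicates=500, seed=1):
    """Return the sample means obtained from repeated studies of one size."""
    rng = random.Random(seed)
    means = []
    for _ in range(replicates):
        sample = [rng.gauss(TRUE_MEAN, TRUE_SD) + bias for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


small = simulate_estimates(25)
large = simulate_estimates(400)
biased_large = simulate_estimates(400, bias=BIAS)

# Random error: spread of the estimates around their centre.
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)

# Systematic error: the centre of the estimates is shifted, whatever the size.
mean_large = statistics.fmean(large)
mean_biased_large = statistics.fmean(biased_large)

print(f"spread of estimates, n=25:  {spread_small:.2f}")
print(f"spread of estimates, n=400: {spread_large:.2f}")
print(f"mean estimate, unbiased, n=400: {mean_large:.2f}")
print(f"mean estimate, biased, n=400:   {mean_biased_large:.2f}")
```

Increasing the sample size narrows the spread but does nothing to the offset, which is why a large study can still be wrong.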

SLIDE 7

Errors in epidemiological studies

[Figure: error plotted against study size, with curves for systematic error (bias) and random error (chance). Source: Rothman, 2002]

SLIDE 8

Bias

Bias occurs when an estimated association (RR, OR, difference in means etc.) deviates from the true measure of association

Consequence of bias: systematic error in RR, OR etc.

Bias may be introduced at the design, implementation or analysis phase of a study

SLIDE 9

SYSTEMATIC ERROR:

SELECTION BIAS
INFORMATION BIAS
CONFOUNDING

SLIDE 10

Should I believe my measurement?

Smoking → Lung cancer

OR = 9.1

Random error? Bias? Confounding? True association (causal or non-causal)?

SLIDE 11

VALIDITY

Is the expression of the degree to which a test is capable of measuring what it is intended to measure

A study is valid if its results correspond to the truth: there should be no systematic error, and random error should be as small as possible

SLIDE 12

A high reliability means that in repeated measurements the results fall very close to each other; conversely, a low reliability means that they are scattered.

SLIDE 13

Different combinations of high and low reliability and validity

[Figure: panels showing the combinations of VALIDITY (high or low) and RELIABILITY (high or low)]

SLIDE 14

Internal and External Validity

[Diagram: External Population, Target Population and Study Sample, with internal validity (INT.) and external validity (EXT.) marked]

SLIDE 15

SELECTION BIAS

is a distortion in the estimate of effect resulting from the manner in which subjects are selected for the study population

MAJOR SOURCES OF SELECTION BIAS

1) flaws in the choice of groups to be compared
2) choice of sampling frame
3) loss to follow up or nonresponse during data collection
4) selective survival

SLIDE 16

Selection Bias

Systematic error resulting from the manner in which subjects are selected or retained in the study

Can occur when:

Characteristics of subjects selected for study differ systematically from those in the target population

Study and comparison groups are selected from different populations

SLIDE 17

Selection Bias

Distortions that arise from

– Procedures used to select subjects
– Factors that influence study participation
– Factors that influence participant attrition

Systematic error in identifying or selecting subjects

– Examples are…

SLIDE 18

Selection Bias

Example:

Cases & controls, or exposed & non-exposed individuals, are selected in such a way that an association is observed even though exposure & disease are not associated

May result from withdrawal or losses to follow-up of study subjects

SLIDE 19

Selection Bias

Problematic

– Can result in over- or under-estimation of the true magnitude of the relationship between an exposure and an outcome
– May produce an apparent association when none exists
– May conceal a real association

OR/RR may be incorrect estimates ⇒ Invalid inferences about association of exposure & disease

SLIDE 20

Selection Bias

To avoid it, ensure that:

– Subjects are representative of target population
– Study and comparison groups are similar except for variables being investigated
– Subject losses are kept to a minimum

SLIDE 21

INFORMATION BIAS

is a distortion in the estimate of effect due to measurement error or misclassification of subjects on one or more variables

MAJOR SOURCES OF INFORMATION BIAS

1) invalid measurement
2) incorrect diagnostic criteria
3) omissions and imprecisions
4) other inadequacies in previously recorded data

SLIDE 22

Information Bias

Definition:

A distortion in the measure of association caused by a lack of accurate measurements of the exposure (risk factor) or disease state

Also known as Measurement bias

SLIDE 23

Information bias

Systematic error in the measurements of information on exposure or outcome

Results in:

Differences in accuracy of:

exposure data between cases and controls

outcome data between different exposure groups

SLIDE 24

Information bias

Sources of information bias include:

Defects in the measurement instruments

Deficiencies in the questionnaires

Inaccurate diagnostic procedures

Ambiguous definition of exposure

Poorly defined diagnostic criteria of disease

Incomplete or unreliable data sources

SLIDE 25

Information Bias

Cause:

Information bias arises when study variables (exposure, disease, or confounders) are inaccurately measured or classified, resulting in Misclassification

SLIDE 26

MISCLASSIFICATION BIAS

Definition:

the erroneous classification of an individual, a value, or an attribute into a category other than that to which it should be assigned

Often results from an improper “cutoff point” in disease diagnosis or exposure classification

Hence errors are made in classifying either disease or exposure status

SLIDE 27

MISCLASSIFICATION BIAS

Types of misclassification bias

Non-differential (random)

Differential (systematic)

SLIDE 28

Nondifferential Misclassification Bias

Occurs when there is equal misclassification of exposure between diseased and non-diseased study subjects

OR

When there is equal misclassification of disease between exposed and non-exposed study subjects

SLIDE 29

Non-differential Misclassification Bias

If exposure or disease is dichotomous, then non-differential misclassification causes a bias of the RR or OR towards the null

SLIDE 30

Nondifferential Misclassification Bias

True classification

              Cases   Controls   Total
Exposed        100       50       150
Nonexposed      50       50       100
Total          150      100       250

OR = ad/bc = 2.0; RR = [a/(a+b)] / [c/(c+d)] = 1.3

Nondifferential misclassification: exposure overestimated in 10 cases and 10 controls

              Cases   Controls   Total
Exposed        110       60       170
Nonexposed      40       40        80
Total          150      100       250

OR = ad/bc = 1.8; RR = [a/(a+b)] / [c/(c+d)] = 1.3

Bias towards the null
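The two sets of figures on this slide can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the cell labels follow the slide's formulas (a = exposed cases, b = exposed controls, c = nonexposed cases, d = nonexposed controls).

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio ad/bc of a 2x2 table."""
    return (a * d) / (b * c)


def risk_ratio(a, b, c, d):
    """Ratio of row proportions: [a/(a+b)] / [c/(c+d)]."""
    return (a / (a + b)) / (c / (c + d))


true_table = (100, 50, 50, 50)           # true classification
misclassified_table = (110, 60, 40, 40)  # 10 cases and 10 controls moved to "exposed"

or_true = odds_ratio(*true_table)
or_misclassified = odds_ratio(*misclassified_table)
rr_true = risk_ratio(*true_table)
rr_misclassified = risk_ratio(*misclassified_table)

print(f"OR: true {or_true:.2f}, misclassified {or_misclassified:.2f}")
print(f"RR: true {rr_true:.2f}, misclassified {rr_misclassified:.2f}")
```

The OR moves from 2.0 to about 1.83, that is, towards the null value of 1.0; the RR changes only in the second decimal place (1.33 to 1.29), which is why both round to 1.3 on the slide.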

SLIDE 31

Non-Differential Misclassification

"True Situation"

            Cases   Controls   Total
Exp.          85       40       125
Not Exp.      15       60        75
Total        100      100       200

OR = 8.5

50% of exposed misclassified as unexposed

            Cases     Controls
Exp.          43         20
Not Exp.    15 + 42    60 + 20
Total        100        100

OR = 3.0

Bias towards the null (1.0)
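The same calculation can be sketched in Python. The helper below is illustrative only; it moves a fixed fraction of truly exposed subjects into the unexposed row, using the same fraction for cases and controls (which is what makes the misclassification non-differential). Fractional counts are kept, so the result is 2.96 rather than the 3.0 obtained when 42.5 is rounded to 43.

```python
def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Exposure odds ratio from a case-control table."""
    return (exp_cases * unexp_controls) / (exp_controls * unexp_cases)


def misclassify_exposed(exposed, unexposed, fraction_missed):
    """Move a fraction of the truly exposed into the unexposed group."""
    moved = exposed * fraction_missed
    return exposed - moved, unexposed + moved


# True situation from the slide.
cases = (85, 15)     # exposed, not exposed
controls = (40, 60)  # exposed, not exposed
true_or = odds_ratio(cases[0], cases[1], controls[0], controls[1])

# 50% of the exposed are recorded as unexposed in BOTH groups.
obs_cases = misclassify_exposed(*cases, fraction_missed=0.5)
obs_controls = misclassify_exposed(*controls, fraction_missed=0.5)
observed_or = odds_ratio(obs_cases[0], obs_cases[1], obs_controls[0], obs_controls[1])

print(f"true OR = {true_or:.1f}, observed OR = {observed_or:.2f}")
```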

SLIDE 32

Differential Misclassification

Occurs when misclassification of exposure is not equal between diseased and non-diseased study subjects

OR

When misclassification of disease is not equal between exposed and non-exposed study subjects

SLIDE 33

Differential Misclassification

Causes a bias in the RR or OR either towards or away from the null, depending on the proportions of study subjects misclassified

SLIDE 34

Differential Misclassification

Direction of bias is towards the null if

fewer cases are considered to be exposed or

fewer exposed are considered to be diseased

Direction of bias is away from the null if

more cases are considered to be exposed or

more exposed are considered to be diseased

SLIDE 35

Differential Misclassification

Example

Case-Control study:

If more cases are mistakenly classified as being exposed than controls ⇒ overestimation of OR

Cohort study:

If exposed group is more likely to be mistakenly classified as having developed the outcome than the unexposed group ⇒ overestimation of RR

Leads to over- or under-estimation of the true magnitude of the measure of association
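A numerical sketch may help to show why the direction of bias depends on which group is misclassified. The true counts below are borrowed from the "True Situation" table of the non-differential example (true OR = 8.5); the two misclassification scenarios are hypothetical and are not taken from the lecture.

```python
def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Exposure odds ratio from a case-control table."""
    return (exp_cases * unexp_controls) / (exp_controls * unexp_cases)


def misclassify_exposed(exposed, unexposed, fraction_missed):
    """Move a fraction of the truly exposed into the unexposed group."""
    moved = exposed * fraction_missed
    return exposed - moved, unexposed + moved


cases = (85, 15)     # exposed, not exposed (true)
controls = (40, 60)  # exposed, not exposed (true)
true_or = odds_ratio(*cases, *controls)

# Hypothetical scenario 1: only controls under-report exposure (50% missed).
# Relatively more cases than controls are then considered exposed.
controls_obs = misclassify_exposed(*controls, fraction_missed=0.5)
or_controls_underreport = odds_ratio(*cases, *controls_obs)

# Hypothetical scenario 2: only cases under-report exposure (50% missed).
# Fewer cases are then considered exposed.
cases_obs = misclassify_exposed(*cases, fraction_missed=0.5)
or_cases_underreport = odds_ratio(*cases_obs, *controls)

print(f"true OR                     = {true_or:.2f}")
print(f"controls under-report only  = {or_controls_underreport:.2f}  (away from the null)")
print(f"cases under-report only     = {or_cases_underreport:.2f}  (towards the null)")
```

With the same true table, the observed OR is inflated in the first scenario and pulled almost to 1.0 in the second, so the direction cannot be predicted without knowing who was misclassified.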

SLIDE 36

[Figure: Prevalence of Down syndrome at birth by birth order. x-axis: Birth Order (1, 2, 3, 4, 5+); y-axis: Affected Babies per 1000 Live Births]

SLIDE 37

[Figure: Prevalence of Down syndrome by Maternal Age. x-axis: Maternal Age (<20, 20-24, 25-29, 30-34, 35-39, 40+); y-axis: Affected Babies per 1000 Live Births]

SLIDE 38

SLIDE 39

SLIDE 40

Here’s what we’d like to assess: Exposure → Disease

Here’s where confounding acts: Confounding is linked to both Exposure and Disease

[Diagram legend: one arrow type represents a causal relationship, the other represents a non-causal relationship]

SLIDE 41

Why is confounding a problem?

Because…

the estimate of association between exposure and disease includes BOTH the contribution of the exposure AND the confounder

the estimate of association which includes the confounder gives an incorrect estimate of the impact of the exposure on the disease outcome

SLIDE 42

Criteria for confounding

A confounder:

must be an independent predictor of disease with or without exposure

must be associated (correlated) with exposure but not caused by the exposure

must not be an intermediate link in a causal pathway between exposure and outcome

SLIDE 43

Potential Confounder If ...

[Diagrams: arrangements of E, D and C; arrows not recoverable]

E = Exposure, D = Disease, C = Confounder

SLIDE 44

NOT A POTENTIAL CONFOUNDER

[Diagram of C, E and D]: part of causal pathway

Or

[Diagram of E, C and D]: E and C are part of a causal pathway

SLIDE 45

CONFOUNDING

MIXING OF EFFECTS

The estimate of the effect of the exposure of interest is distorted because it is mixed with the effect of an extraneous factor

SLIDE 46

CONFOUNDING

COFFEE DRINKING, CIGARETTE SMOKING AND CORONARY HEART DISEASE

EXPOSURE (coffee drinking)
DISEASE (heart disease)
CONFOUNDING VARIABLE (cigarette smoking)

SLIDE 47

The distortion introduced by a confounding factor can lead to overestimation or underestimation of an effect, depending on the direction of the association that the confounding factor has with exposure and disease.

Confounding can even change the apparent direction of an effect.

Example: Alcohol, Smoking, Oral cancer

SLIDE 48

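[Diagram of exposure (E) and disease (D); graphic not recoverable]

SLIDE 49

[Diagram of exposure (E) and disease (D); graphic not recoverable]

SLIDE 50

[Diagram of exposure (E) and disease (D); graphic not recoverable]

SLIDE 51

[Diagram of exposure (E) and disease (D); graphic not recoverable]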

SLIDE 52

Situation in which F is a confounder for a D-E association.

Situation in which F is not a confounder for a D-E association.

[Diagrams of E, F and D for each situation; arrows not recoverable]

SLIDE 53

To be confounding, the extraneous variable must have the following characteristics

A confounding variable must be a risk factor for the disease.

A confounding variable must be associated with the exposure under study (in the population from which the cases derive).

A confounding variable must not be an intermediate step in the causal path between the exposure and the disease.

SLIDE 54

The data-based criterion for establishing the presence or absence of confounding involves the comparison of a crude effect measure with an adjusted effect measure that corrects for distortions due to extraneous variables.

Confounding is acknowledged to be present when the crude and adjusted effect measures differ in value.

SLIDE 55

CONTROL OF CONFOUNDING

DESIGN

RESTRICTION
MATCHING

ANALYSIS

STRATIFICATION
MATHEMATICAL MODEL (Multivariate analysis)

SLIDE 56

Relation of Confounder to Disease and Exposure

AGE      MI* (%)   CONTROLS (%)   OC** USE (%)
25-29       3          16             29
30-34       9          14             10
35-39      16          20              8
40-44      30          21              4
45-49      42          18              3

*MI: MYOCARDIAL INFARCTION
**OC: ORAL CONTRACEPTIVE

(MI and CONTROLS columns: relation of age to DISEASE; OC USE column: relation of age to EXPOSURE)

SLIDE 57

CRUDE RR (cRR)

Collapsed in 1 table without separation into subgroups.

[Diagram: 2×2 table of exposure (E) by disease (D); group totals 1000 and 1000, overall total 2000]

CRUDE RR = 4

SLIDE 58

[Diagram of exposure (E) and disease (D); graphic not recoverable]

SLIDE 59

[Diagram of exposure (E) and disease (D); graphic not recoverable]

SLIDE 60

EXPOSURE: ALC (alcohol); DISEASE: Oral Ca (oral cancer)

CRUDE

           DISEASE +   DISEASE -   Total
ALC +         200         800       1000
ALC -          50         950       1000
Total         250        1750       2000

CRUDE CIR = 4.0

SMOKERS

           DISEASE +   DISEASE -
ALC +         194         706
ALC -          21          79

CIR (smokers) = 1.02

NON-SMOKERS

           DISEASE +   DISEASE -
ALC +           6          94
ALC -          29         871

CIR (non-smokers) = 1.86

ADJUSTED CIR = 1.13
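The crude and stratum-specific cumulative incidence ratios can be recomputed from the counts on this slide. The sketch below assumes the layout that the marginal totals imply: among smokers, 194 diseased and 706 non-diseased alcohol users versus 21 and 79 non-users; among non-smokers, 6 and 94 versus 29 and 871. The function name is illustrative.

```python
def cumulative_incidence_ratio(cases_exp, noncases_exp, cases_unexp, noncases_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    risk_exposed = cases_exp / (cases_exp + noncases_exp)
    risk_unexposed = cases_unexp / (cases_unexp + noncases_unexp)
    return risk_exposed / risk_unexposed


# (diseased, non-diseased) for alcohol users, then for non-users.
smokers = (194, 706, 21, 79)
non_smokers = (6, 94, 29, 871)

# The crude table is the cell-by-cell sum of the two strata.
crude = tuple(s + n for s, n in zip(smokers, non_smokers))

cir_crude = cumulative_incidence_ratio(*crude)
cir_smokers = cumulative_incidence_ratio(*smokers)
cir_non_smokers = cumulative_incidence_ratio(*non_smokers)

print(f"crude CIR       = {cir_crude:.2f}")
print(f"smokers CIR     = {cir_smokers:.2f}")
print(f"non-smokers CIR = {cir_non_smokers:.2f}")
```

The stratum-specific values come out at about 1.03 (the slide shows 1.02) and 1.86, both far below the crude value of 4.0, which is the pattern expected when smoking confounds the alcohol and oral cancer association.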

SLIDE 61

IDENTIFYING A CONFOUNDER

an example

           Smokers   Non smokers   Total
CHD          305          58        363
No CHD       345         292        637
Total        650         350       1000

Calculate crude measure of association:

$$RR = \frac{a/(a+b)}{c/(c+d)}$$

RR = 2.8 (2.21, 3.63)

SLIDE 62

Calculate stratum-specific measures of association...

STRATUM 1: MEN

           Smokers   Non smokers   Total
CHD          300          50        350
No CHD       300         150        450
Total        600         200        800

RR = 2.0 (1.55, 2.58)

STRATUM 2: WOMEN

           Smokers   Non smokers   Total
CHD            5           8         13
No CHD        45         142        187
Total         50         150        200

RR = 1.9 (0.64, 5.47)
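The crude and stratum-specific RRs and their intervals can be reproduced as follows. This is a sketch, not necessarily the method used to prepare the slides: it assumes the usual large-sample interval on the log scale (standard error of ln RR = sqrt(1/a - 1/(a+b) + 1/c - 1/(c+d))), which does match the printed intervals. Here a and b are smokers with and without CHD, c and d are non-smokers with and without CHD.

```python
import math


def risk_ratio_ci(a, b, c, d, z=1.96):
    """Risk ratio with a log-scale (Wald-type) confidence interval."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


tables = {
    "crude": (305, 345, 58, 292),
    "men": (300, 300, 50, 150),
    "women": (5, 45, 8, 142),
}

results = {name: risk_ratio_ci(*cells) for name, cells in tables.items()}

for name, (rr, lower, upper) in results.items():
    print(f"{name:6s} RR = {rr:.2f} ({lower:.2f}, {upper:.2f})")
```

The wide interval for women reflects the small number of events in that stratum (13 CHD cases), not a different underlying effect.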

SLIDE 63

IS THERE A CONFOUNDER?

CRUDE RR for smoking and CHD = 2.8

STRATUM-SPECIFIC RR for smoking and CHD with gender as a potential confounder...

MEN RR = 2.0
WOMEN RR = 1.9
(roughly the same)

Criteria: overlapping confidence intervals or the 10% rule

Gender confounds the association between smoking and CHD because the crude RR of 2.8 is NOT the same as the stratum-specific RRs of approx. 2.0

SLIDE 64

IS THERE A CONFOUNDER?

THE 10% RULE

We assert that gender confounds the association between smoking and CHD because the crude RR of 2.8 is NOT the same as the stratum-specific RRs of 2.0 or 1.9

The 10% rule is a good rule of thumb for assessing whether there is confounding present

Are 2.0 and 1.9 more than 10% different from 2.8? 10% of 2.8 = 0.28, and the differences between our stratum-specific RRs and the crude RR (0.8 and 0.9) are greater than 0.28

Not a replacement for a statistical test: simply a way to initially judge whether something is a potentially confounding factor
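The rule of thumb is simple enough to write down as a function. The sketch below follows the slide's arithmetic, taking 10% of the crude estimate as the yardstick; the function name and the `threshold` parameter are illustrative.

```python
def differs_by_more_than_10_percent(crude, stratum_estimate, threshold=0.10):
    """True when the estimates differ by more than `threshold` of the crude value."""
    return abs(crude - stratum_estimate) > threshold * crude


crude_rr = 2.8
stratum_rrs = {"men": 2.0, "women": 1.9}

flags = {
    name: differs_by_more_than_10_percent(crude_rr, rr)
    for name, rr in stratum_rrs.items()
}

for name, flagged in flags.items():
    print(f"{name}: stratum RR {stratum_rrs[name]} vs crude {crude_rr} -> differs: {flagged}")
```

Both strata are flagged, consistent with the conclusion that gender is a potential confounder here. The two stratum RRs compared with each other (2.0 versus 1.9) would not be flagged.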

SLIDE 65

HOW TO CONTROL FOR CONFOUNDERS?

IN STUDY DESIGN…

RESTRICTION of subjects according to potential confounders (i.e. admit only subjects at one level of the confounder, so that it cannot vary within the study)

RANDOM ALLOCATION of subjects to study groups to attempt to even out confounders

MATCHING subjects on potential confounder, thus assuring even distribution among study groups

SLIDE 66

HOW TO CONTROL FOR CONFOUNDERS?

IN DATA ANALYSIS…

STRATIFIED ANALYSIS using the Mantel Haenszel method to adjust for confounders

IMPLEMENT A MATCHED-DESIGN after you have collected data (frequency or group)

RESTRICTION is still possible at the analysis stage but it means throwing away data

MODEL FITTING using regression techniques

SLIDE 67

Using the Mantel Haenszel Method to Report Adjusted ORs

Can only use with confounders because we assume that ORs are constant across strata

General Formula, where i = stratum and N_i = stratum total:

$$OR_{MH} = \frac{\sum_i a_i d_i / N_i}{\sum_i b_i c_i / N_i}$$

This technique generates a summary measure across strata by removing the effect of the confounder

SLIDE 68

Example of the Mantel Haenszel Method ...

STRATUM 1: Pre-Menopause

              Stroke   No Stroke   Total
Fm Hsty         16        47         63
No Fm Hsty       2         7          9
Total           18        54         72

OR = 1.19

STRATUM 2: Post-Menopause

              Stroke   No Stroke   Total
Fm Hsty         24        13         37
No Fm Hsty      58        33         91
Total           82        46        128

OR = 1.05

$$OR_{MH} = \frac{\frac{16 \times 7}{72} + \frac{24 \times 33}{128}}{\frac{2 \times 47}{72} + \frac{58 \times 13}{128}} = 1.08$$
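The worked example can be evaluated directly. This is a minimal sketch with an illustrative function name; each stratum is given as (a, b, c, d) = (family history with stroke, family history without stroke, no family history with stroke, no family history without stroke).

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


def odds_ratio(a, b, c, d):
    """Cross-product ratio ad/bc of a single 2x2 table."""
    return (a * d) / (b * c)


strata = [
    (16, 47, 2, 7),    # stratum 1: pre-menopause, N = 72
    (24, 13, 58, 33),  # stratum 2: post-menopause, N = 128
]

stratum_ors = [odds_ratio(*table) for table in strata]
or_mh = mantel_haenszel_or(strata)

print("stratum ORs:", [round(value, 2) for value in stratum_ors])
print(f"OR_MH = {or_mh:.2f}")
```

With these counts the expression evaluates to about 1.08. Because the Mantel-Haenszel estimate is a weighted average of the stratum-specific ORs, it has to fall between 1.05 and 1.19.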

SLIDE 69

Mantel Haenszel RELATIVE RISK

$$RR_{MH} = \frac{\sum_i a_i (c_i + d_i) / N_i}{\sum_i c_i (a_i + b_i) / N_i}$$
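As an illustration, the formula can be applied to the smoking and CHD data stratified by gender from the earlier example (men: 300, 300, 50, 150; women: 5, 45, 8, 142). The summary value below is computed here and does not appear on the slides; the function names are illustrative.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio over a list of (a, b, c, d) tables.

    a, b = exposed with and without disease; c, d = unexposed with and without.
    """
    numerator = sum(a * (c + d) / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(c * (a + b) / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


def risk_ratio(a, b, c, d):
    """Risk ratio of a single 2x2 table."""
    return (a / (a + b)) / (c / (c + d))


strata = [
    (300, 300, 50, 150),  # men
    (5, 45, 8, 142),      # women
]

# Crude table: cell-by-cell sum of the strata.
crude_table = tuple(sum(cells) for cells in zip(*strata))

rr_crude = risk_ratio(*crude_table)
rr_mh = mantel_haenszel_rr(strata)

print(f"crude RR = {rr_crude:.2f}")
print(f"RR_MH    = {rr_mh:.2f}")
```

The summary RR of about 1.99 sits between the stratum-specific values (2.0 and 1.9) and well below the crude 2.8.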

SLIDE 70

HOW TO REPORT DATA WITH CONFOUNDERS

IF YOU HAVE A CONFOUNDER…

DO NOT report crude OR or RR (you know it’s wrong)

GOOD: Report stratum-specific OR or RR

BEST: Report summary measures such as a Mantel-Haenszel OR (this is like compiling stratum-specific ORs)

SLIDE 71

THIRD VARIABLE SUMMARY

Are stratum-specific ORs the same?

NO: INTERACTION… report stratum-specific OR or RR

YES: crude OR = stratum-specific?

    NO: CONFOUNDING. Report summary measure (MH OR)
    YES: NO CONFOUNDING or INTERACTION. Report crude OR or RR
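The decision path can be sketched as a small function. Two things here are assumptions made for illustration and are not part of the lecture: "the same" is read as "within 10% of each other", and the crude estimate is compared with the summary (MH) estimate as the representative of the stratum-specific values. In practice the first question is usually answered with a formal test of homogeneity.

```python
def roughly_equal(x, y, tolerance=0.10):
    """Hypothetical reading of 'the same': within `tolerance` of the larger value."""
    return abs(x - y) <= tolerance * max(x, y)


def third_variable_summary(crude, stratum_specific, summary, tolerance=0.10):
    """Return which measure to report, following the slide's decision path."""
    lowest, highest = min(stratum_specific), max(stratum_specific)
    if not roughly_equal(lowest, highest, tolerance):
        return "interaction: report stratum-specific measures"
    if not roughly_equal(crude, summary, tolerance):
        return "confounding: report summary (MH) measure"
    return "no confounding or interaction: report crude measure"


# Smoking and CHD by gender: crude 2.8, strata 2.0 and 1.9, summary about 1.99.
decision = third_variable_summary(2.8, [2.0, 1.9], 1.99)
print(decision)
```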

SLIDE 72

Degree of Confounding

measures the amount of confounding rather than mere presence or absence

degree of confounding = crude measure / adjusted measure

Example: Crude = 4.00, Adjusted = 1.13; d.c. = 4.00 / 1.13 = 3.54 (overestimation)

Example: Crude = 1.68, Adjusted = 3.97; d.c. = 1.68 / 3.97 = 0.42 (underestimation)

SLIDE 73

AGE      Recent Use of OC    MI    Controls    OR

25-29    Yes                  4       62       7.2
         No                   2      224
30-34    Yes                  9       33       8.9
         No                  12      390
35-39    Yes                  4       26       1.5
         No                  33      330
40-44    Yes                  6        9       3.7
         No                  65      362
45-49    Yes                  6        5       3.9
         No                  93      301
TOTAL    Yes                 29      135       1.7
         No                 205     1607

Adjusted OR (MH) = 3.97

4-fold risk of MI among recent OC users as compared to non-users.
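The crude OR, the age-adjusted Mantel-Haenszel OR and the degree of confounding can all be recomputed from this table. One assumption is flagged in the code: the number of non-user controls aged 25-29 is taken as 224, the value that agrees with both the column total of 1607 and the stratum OR of 7.2. Function names are illustrative.

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio ad/bc of a single 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# (OC users with MI, OC-user controls, non-users with MI, non-user controls)
strata_by_age = {
    "25-29": (4, 62, 2, 224),  # 224 assumed: consistent with total 1607 and OR 7.2
    "30-34": (9, 33, 12, 390),
    "35-39": (4, 26, 33, 330),
    "40-44": (6, 9, 65, 362),
    "45-49": (6, 5, 93, 301),
}

crude_table = tuple(sum(cells) for cells in zip(*strata_by_age.values()))
crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(list(strata_by_age.values()))
degree_of_confounding = crude_or / adjusted_or

print(f"crude OR              = {crude_or:.2f}")
print(f"adjusted OR (MH)      = {adjusted_or:.2f}")
print(f"degree of confounding = {degree_of_confounding:.2f}")
```

The crude OR of about 1.68 is well below the adjusted 3.97, so age confounds towards the null here: older women have more MI but use OC less.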

SLIDE 74

TYPES OF ASSOCIATION

A. Not statistically associated (Independent)
B. Statistically associated

1. Noncausally associated (Secondarily)
2. Causally associated

a. Indirectly associated
b. Directly causal

SLIDE 75

Hypothetical Examples of Unadjusted and Adjusted Relative Risks According to Type of confounding (Positive or Negative)

Example No.   Type of Confounding   Unadjusted Relative Risk   Adjusted Relative Risk
1             Positive              3.5                        1.0
2             Positive              3.5                        2.1
3             Positive              0.3                        0.7
4             Negative              1.0                        3.2
5             Negative              1.5                        3.2
6             Negative              0.8                        0.2
7             Qualitative           2.0                        0.7
8             Qualitative           0.6                        1.8
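The pattern in these eight examples can be expressed as a rule on the log scale. The rule below is inferred from the table rather than stated in the lecture: confounding is called positive when the unadjusted estimate lies further from the null (1.0) than the adjusted one, negative when it lies closer to the null, and qualitative when the two fall on opposite sides of the null.

```python
import math


def type_of_confounding(unadjusted_rr, adjusted_rr):
    """Classify confounding by comparing log relative risks (inferred rule)."""
    log_unadjusted = math.log(unadjusted_rr)
    log_adjusted = math.log(adjusted_rr)
    if log_unadjusted * log_adjusted < 0:
        return "Qualitative"  # opposite sides of the null
    if abs(log_unadjusted) > abs(log_adjusted):
        return "Positive"     # unadjusted estimate exaggerates the effect
    if abs(log_unadjusted) < abs(log_adjusted):
        return "Negative"     # unadjusted estimate understates the effect
    return "None"


examples = [
    (3.5, 1.0), (3.5, 2.1), (0.3, 0.7),
    (1.0, 3.2), (1.5, 3.2), (0.8, 0.2),
    (2.0, 0.7), (0.6, 1.8),
]

classified = [type_of_confounding(u, a) for u, a in examples]

for number, ((unadjusted, adjusted), label) in enumerate(zip(examples, classified), start=1):
    print(f"example {number}: {unadjusted} -> {adjusted}: {label}")
```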