
Ph.D. course in epidemiology: Fall 2012. Confounding. Analysis of cohort studies



  1. Ph.D. course in epidemiology: Fall 2012. Confounding. Analysis of cohort studies.
     C & H, Ch. 6, 14-15. 18 September 2012. www.biostat.ku.dk/~nk/epiE12. Per Kragh Andersen.

     • Epidemiology relies on observational studies or experiments of nature.
     • Often these are poor experiments: there is no control for confounding by extraneous influences.
     • Definition: A confounder is a variable whose influence we would have controlled if we had been able to design the natural experiment.

     Example: confounding by age, Fig. 14.1
     Probability trees for unexposed and exposed subjects: age distribution first, then failure (F) or survival (S) given age.

     | Age  | P(age), unexposed | P(age), exposed | P(F given age) | P(S given age) |
     |------|-------------------|-----------------|----------------|----------------|
     | < 55 | 0.8               | 0.4             | 0.1            | 0.9            |
     | 55+  | 0.2               | 0.6             | 0.3            | 0.7            |

     • Probability of failure for unexposed: (0.8 × 0.1) + (0.2 × 0.3) = 0.14
     • Probability of failure for exposed: (0.4 × 0.1) + (0.6 × 0.3) = 0.22
     • Difference entirely due to difference in age structure.
     • When there is a true effect, its magnitude can be distorted by such influences.
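
     The two failure probabilities in the age example are weighted averages of the age-specific probabilities, with each group's own age distribution as weights. A minimal Python sketch of that arithmetic follows; the function name `marginal_risk` and the variable names are illustrative and do not come from the slides.

```python
def marginal_risk(age_weights, risks):
    """Weighted average of age-specific risks; weights are the age distribution."""
    return sum(w * r for w, r in zip(age_weights, risks))

# Age-specific failure probabilities (< 55, 55+), identical in both groups
risks = [0.1, 0.3]

p_unexposed = marginal_risk([0.8, 0.2], risks)  # unexposed are mostly < 55
p_exposed = marginal_risk([0.4, 0.6], risks)    # exposed are mostly 55+
crude_rr = p_exposed / p_unexposed

print(round(p_unexposed, 2), round(p_exposed, 2), round(crude_rr, 2))
```

     The crude ratio comes out at roughly 1.57 although the age-specific risks are identical in the two groups, which is the whole of the confounding effect in this example.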

  2. Confounding when RR = 2
     Probability trees for unexposed and exposed subjects: age distribution first, then failure (F) or survival (S) given age.

     | Age  | P(age), unexposed | P(F given age), unexposed | P(age), exposed | P(F given age), exposed |
     |------|-------------------|---------------------------|-----------------|-------------------------|
     | < 55 | 0.8               | 0.1                       | 0.4             | 0.2                     |
     | 55+  | 0.2               | 0.2                       | 0.6             | 0.4                     |

     Results.
     • The true relative risk: RR_T = 0.2 / 0.1 = 0.4 / 0.2 = 2
     • Probability of failure for unexposed: (0.8 × 0.1) + (0.2 × 0.2) = 0.12
     • Probability of failure for exposed: (0.4 × 0.2) + (0.6 × 0.4) = 0.32
     • The apparent relative risk: RR_O = 0.32 / 0.12 = 2.67

     Confounding: schematically.
     A variable C is a potential confounder for the relation E → O if it is
     • 1) related to the exposure: E − C
     • 2) an independent risk factor for the outcome: C → O
     • 3) not a consequence of the exposure, i.e. not on the pathway E → C → O
     That is: E − C, with arrows E → O and C → O.

     Confounding
     A confounder is:
     • associated with outcome: e.g., older persons have higher disease probability,
     • associated with the exposure: e.g., older persons are more / less likely to be exposed,
     • not a result of exposure, i.e. not an intermediate variable.
     Not a statistical property; cannot be seen from tables; common sense is required!
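
     Because the exposed risk is twice the unexposed risk in each age group, the apparent relative risk factors into the true relative risk times a term that depends only on the two age distributions. The worked equation below is a sketch of that decomposition; the symbols $w^{E}_a$ and $w^{U}_a$ (age distributions of exposed and unexposed) and $p_a$ (age-specific risk among the unexposed) are notation introduced here, not taken from the slides.

```latex
\[
RR_O
= \frac{\sum_a w^{E}_a \, (RR_T \, p_a)}{\sum_a w^{U}_a \, p_a}
= RR_T \times \frac{(0.4 \times 0.1) + (0.6 \times 0.2)}{(0.8 \times 0.1) + (0.2 \times 0.2)}
= 2 \times \frac{0.16}{0.12}
\approx 2.67
\]
```

     The second factor, 0.16 / 0.12, equals 1 whenever the two groups share the same age distribution, in which case the apparent and true relative risks coincide.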

  3. Confounding.
     The problem is that we do not always get a fair comparison between exposed and non-exposed.
     [Figure: age composition (Young / Old) of the EXPOSED and NON-EXPOSED groups.]
     A randomly selected exposed person tends to be older than a randomly chosen non-exposed.

     Controlling confounding, Sect. 14.2
     In controlled experiments there are two ways of controlling confounding:
     1. Randomization of subjects to experimental groups so that the distributions of the confounder are the same.
     2. Hold the confounder constant.

     Standardization is a classical statistical technique for controlling for extraneous variables (in particular: age) in the analysis of an observational study.
     1. Direct standardization simulates randomization by equalizing the distribution of extraneous variables.
     2. Indirect standardization simulates the second method: holding extraneous variables constant.
     We first discuss direct standardization and then later turn to the main ways of “holding the confounder constant”:
     • stratified (“Mantel-Haenszel”) analysis
     • or (more importantly) regression analysis: logistic, Poisson, Cox.

     Direct standardization, Sect. 14.3
     1. Estimate age-specific rates (or risks) in each group.
     2. Calculate marginal rates (risks) if the age distribution were fixed to that of some agreed standard population. A standard population is another term for a common age distribution.
     3. Direct standardization is good for illustrative purposes as it provides absolute rates.

  4. Probability trees of Fig. 14.1 for unexposed and exposed subjects (age-specific failure probabilities 0.1 for < 55 and 0.3 for 55+).
     Marginal failure probability (with 50-50 age distribution) is (0.5 × 0.1) + (0.5 × 0.3) = 0.2 for both groups.

     The Diet data
     D = number of cases, Y = person-years, Rate = cases per 1000 person-years, RR = rate ratio (exposed / unexposed).

     | Current age | Exposed (< 2750 kcal): D | Y      | Rate  | Unexposed (≥ 2750 kcal): D | Y      | Rate | RR   |
     |-------------|--------------------------|--------|-------|----------------------------|--------|------|------|
     | 40–49       | 2                        | 311.9  | 6.41  | 4                          | 607.9  | 6.58 | 0.97 |
     | 50–59       | 12                       | 878.1  | 13.67 | 5                          | 1271.1 | 3.93 | 3.48 |
     | 60–69       | 14                       | 667.5  | 20.97 | 8                          | 888.9  | 9.00 | 2.33 |
     | Total       | 28                       | 1857.5 | 15.07 | 17                         | 2768.9 | 6.14 | 2.46 |

     Direct standardization in the diet data.
     We can standardize the age-specific rates to a population with equal numbers of person-years in each age group.
     Exposed: (1/3 × 6.41) + (1/3 × 13.67) + (1/3 × 20.97) = 13.68
     Unexposed: (1/3 × 6.58) + (1/3 × 3.93) + (1/3 × 9.00) = 6.50
     Estimate of rate ratio is 13.68 / 6.50 = 2.10.

     Choice of weights
     • Sometimes the overall age structure of the whole study is used.
     • Use of a standard age structure can facilitate comparison with other work.
     • In cancer epidemiology, standard populations approximating the European, US or World population age distribution are used.
     • Equal weights essentially give a comparison between cumulative rates in the two groups.
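
     Direct standardization with equal weights is just a weighted mean of the age-specific rates. The sketch below redoes that calculation from the cases and person-years in the diet table rather than from the rounded rates; the function name `standardized_rate` and the `per=1000` scaling argument are choices made for this illustration.

```python
# Diet data: (cases D, person-years Y) per age band 40-49, 50-59, 60-69
exposed = [(2, 311.9), (12, 878.1), (14, 667.5)]     # < 2750 kcal
unexposed = [(4, 607.9), (5, 1271.1), (8, 888.9)]    # >= 2750 kcal

def standardized_rate(strata, weights, per=1000):
    """Directly standardized rate: weighted mean of the age-specific rates."""
    return sum(w * per * d / y for w, (d, y) in zip(weights, strata))

# Standard population with equal person-years in each age band
weights = [1 / 3, 1 / 3, 1 / 3]

rate_exp = standardized_rate(exposed, weights)
rate_unexp = standardized_rate(unexposed, weights)
rate_ratio = rate_exp / rate_unexp

print(round(rate_exp, 2), round(rate_unexp, 2), round(rate_ratio, 2))
```

     With these inputs the standardized rates are about 13.7 and 6.5 per 1000 person-years, giving a rate ratio of about 2.10. Swapping in another `weights` list (for example the study's overall age structure) changes the absolute rates and, in general, the ratio.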

  5. Stratified (Mantel-Haenszel) analysis, Ch. 15.
     • Aim is to hold age constant.
     • Compare exposed and unexposed persons within age strata.
     • Compute a combined estimate of effect over all strata.
     • This implies a model in which there is no (systematic) variation of effect over strata.
     • If estimates are similar we combine them, by a suitable average.

     If the effect of exposure is the same in all age strata, we can re-parameterize rates as:

     | Age   | Exposed (low energy)                  | Unexposed (high energy) | Rate ratio |
     |-------|---------------------------------------|-------------------------|------------|
     | 40–49 | $\lambda^0_1 = \theta\lambda^0_0$     | $\lambda^0_0$           | $\theta$   |
     | 50–59 | $\lambda^1_1 = \theta\lambda^1_0$     | $\lambda^1_0$           | $\theta$   |
     | 60–69 | $\lambda^2_1 = \theta\lambda^2_0$     | $\lambda^2_0$           | $\theta$   |

     This is the proportional hazards model: for every stratum $a$, $\lambda^a_1 = \theta\lambda^a_0$.
     $\theta$ is the effect of exposure “controlled for” age.

     Data

     | Age ($a$)       | Exposed: low energy (1) | Unexposed: high energy (0) |
     |-----------------|-------------------------|----------------------------|
     | 40–49 ($a = 0$) | $D_{10}, Y_{10}$        | $D_{00}, Y_{00}$           |
     | 50–59 ($a = 1$) | $D_{11}, Y_{11}$        | $D_{01}, Y_{01}$           |
     | 60–69 ($a = 2$) | $D_{12}, Y_{12}$        | $D_{02}, Y_{02}$           |

     The Mantel-Haenszel estimate
     The MH-estimate for $\theta$ is (the weighted average):

     $$\theta_{MH} = \frac{\sum_a Q_a}{\sum_a R_a} = \frac{Q}{R}, \qquad Q_a = \frac{D_{1a} Y_{0a}}{Y_{0a} + Y_{1a}}, \qquad R_a = \frac{D_{0a} Y_{1a}}{Y_{0a} + Y_{1a}}.$$

     This may be calculated by hand.
     Note that only $\theta$ is estimated, not the $\lambda$’s.
     Maximum likelihood estimation of all parameters: later.
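
     The Mantel-Haenszel estimate can be checked numerically on the diet data (cases and person-years per age band, as tabulated on the slide "The Diet data"). The following is a minimal sketch; the function name `mantel_haenszel_rate_ratio` and the tuple layout are choices made here for illustration.

```python
# One tuple per age band: (D1, Y1, D0, Y0)
# D1, Y1 = cases and person-years among the exposed (low energy)
# D0, Y0 = cases and person-years among the unexposed (high energy)
strata = [
    (2, 311.9, 4, 607.9),     # 40-49
    (12, 878.1, 5, 1271.1),   # 50-59
    (14, 667.5, 8, 888.9),    # 60-69
]

def mantel_haenszel_rate_ratio(strata):
    """Return (theta_MH, Q, R), where Q and R are the sums of Q_a and R_a."""
    q = sum(d1 * y0 / (y0 + y1) for d1, y1, d0, y0 in strata)
    r = sum(d0 * y1 / (y0 + y1) for d1, y1, d0, y0 in strata)
    return q / r, q, r

theta_mh, q, r = mantel_haenszel_rate_ratio(strata)
print(round(theta_mh, 2), round(q, 2), round(r, 2))
```

     On these data the estimate is about 2.40, between the crude rate ratio (2.46) and the equal-weights standardized ratio (2.10).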

  6. The Mantel-Haenszel test
     The Mantel-Haenszel test for no exposure effect is $U^2 / V$, where

     $$U = \sum_a U_a, \qquad V = \sum_a V_a,$$

     and where

     $$U_a = D_{1a} - (D_{0a} + D_{1a}) \frac{Y_{1a}}{Y_{0a} + Y_{1a}}, \qquad V_a = (D_{0a} + D_{1a}) \frac{Y_{0a} Y_{1a}}{(Y_{0a} + Y_{1a})^2}.$$

     This test may also be based on the likelihood principle.
     When $\theta = 1$, this is approximately $\chi^2_1$-distributed.

     An approximate confidence interval for $\theta$ can be obtained using a standard error for $\log(\hat\theta)$ and then calculating the error factor in the usual way:

     $$\mathrm{sd}(\log(\theta_{MH})) = \sqrt{\frac{V}{QR}}.$$

     (NB: calculations by hand.)

     Is it reasonable to assume constant rate ratio?
     Estimate $\theta$ and compute the expected number of unexposed cases given the total number of cases and the split of risk time between exposed and unexposed:

     $$E_{0a} = (D_{0a} + D_{1a}) \frac{Y_{0a}}{Y_{0a} + \theta_{MH} Y_{1a}}$$

     (cases should occur in proportion $Y_{0a} : \theta_{MH} Y_{1a}$).
     Then, compute the “Breslow-Day” test statistic for homogeneity over strata:

     $$\sum_a \frac{(D_{0a} - E_{0a})^2}{E_{0a}} \sim \chi^2_{A-1}$$

     (where $A$ is the number of age strata). If this is sufficiently small, accept that the rate ratio is constant.

     The diet data.
     • $\theta_{MH} = 2.40$,
     • 90% c.i. from 1.44 to 4.01,
     • MH-test statistic: 8.48 $\sim \chi^2_1$, P = 0.004,
     • Breslow-Day test statistic: 1.65 $\sim \chi^2_2$, P = 0.44.
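
     The test statistic, the confidence interval and the homogeneity check can be reproduced from the diet data with a few lines of Python. This is a sketch under stated assumptions: the multiplier 1.645 is taken as the normal quantile behind the 90% error factor, and the tail probabilities use closed forms that hold only for 1 and 2 degrees of freedom. The variable names are illustrative.

```python
import math

# One tuple per age band: (D1, Y1, D0, Y0); 1 = exposed, 0 = unexposed
strata = [
    (2, 311.9, 4, 607.9),     # 40-49
    (12, 878.1, 5, 1271.1),   # 50-59
    (14, 667.5, 8, 888.9),    # 60-69
]

q = sum(d1 * y0 / (y0 + y1) for d1, y1, d0, y0 in strata)
r = sum(d0 * y1 / (y0 + y1) for d1, y1, d0, y0 in strata)
theta = q / r

# Mantel-Haenszel test of theta = 1
u = sum(d1 - (d0 + d1) * y1 / (y0 + y1) for d1, y1, d0, y0 in strata)
v = sum((d0 + d1) * y0 * y1 / (y0 + y1) ** 2 for d1, y1, d0, y0 in strata)
mh_stat = u ** 2 / v
p_mh = math.erfc(math.sqrt(mh_stat / 2))  # upper tail of chi-square, 1 df

# 90% confidence interval via the error factor (1.645 is an assumption here)
sd_log_theta = math.sqrt(v / (q * r))
error_factor = math.exp(1.645 * sd_log_theta)
ci = (theta / error_factor, theta * error_factor)

# Homogeneity: squared deviations from expected counts given theta
bd_unexposed_only = 0.0
bd_both_cells = 0.0
for d1, y1, d0, y0 in strata:
    e0 = (d0 + d1) * y0 / (y0 + theta * y1)  # expected unexposed cases
    e1 = (d0 + d1) - e0                      # expected exposed cases
    bd_unexposed_only += (d0 - e0) ** 2 / e0
    bd_both_cells += (d0 - e0) ** 2 / e0 + (d1 - e1) ** 2 / e1
p_bd = math.exp(-bd_both_cells / 2)  # upper tail of chi-square, 2 df

print(round(mh_stat, 2), round(p_mh, 3), [round(x, 2) for x in ci])
print(round(bd_unexposed_only, 2), round(bd_both_cells, 2), round(p_bd, 2))
```

     Computed this way, the MH statistic is about 8.47 (the slide reports 8.48) with P about 0.004, and the interval is about 1.44 to 4.01. The reported homogeneity statistic of 1.65 (P = 0.44) is matched when the squared deviations of both the unexposed and the exposed cases are summed in each stratum; summing the unexposed term alone gives about 0.95. That is an observation about the arithmetic, not a statement of how the author computed the figure.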
