biospecimen assessment

Biospecimen Assessment. Michelle Danaher, University of Maryland, Baltimore County and Eunice Kennedy Shriver National Institute of Child Health and Human Development.


  1. Biospecimen Assessment. Michelle Danaher, University of Maryland, Baltimore County and Eunice Kennedy Shriver National Institute of Child Health and Human Development. 1/60

  2. Outline: Biospecimen Assessment
     - Chapter 1: Background
     - Pooling variables in biomarker assessment models
     - Chapter 2: Gene-environment interactions
     - Chapter 3: Estimation of interaction effects using pooled biospecimen
     - Constrained estimation in biomarker assessment models
     - Chapter 4: Minkowski-Weyl priors for models with parameter constraints

     Chapter 1 Chapter 2 Chapter 3 Chapter 4 Acknowledgments 2/60

  3. Outline: Biospecimen Assessment, with related publications:
     - Danaher, M.R., Schisterman, E.F., Roy, A. and Albert, P.S. (2012). Estimation of gene-environment interaction by pooling biospecimens. Statistics in Medicine, 31: 3241–3252. DOI: 10.1002/sim.5357
     - Danaher, M.R., Roy, A., Chen, Z., Mumford, S.L. and Schisterman, E.F. (2012). Minkowski-Weyl priors for models with parameter constraints: an analysis of the BioCycle Study. Journal of the American Statistical Association, in press.

     3/60

  4. Chapter 1: Background 4/60

  10. Background: Why are we interested in biospecimen assessment?
      - Epidemiologic studies increasingly depend on biomarkers to indicate exposure.
      - The cost of assays limits how many biomarker measurements epidemiologists can afford. We develop methods for proper statistical inference when biospecimens are pooled.
      - Biomarkers have known relationships due to biology. We develop methods for effectively incorporating known biological constraints into a model.

      5/60

  11. Background: Pooling variables in biomarker assessment models. During World War II, Dorfman (1943) proposed pooling biospecimens to screen for syphilis. Later, pooling was used to obtain a sum of individual components, which is then used to estimate model parameters (Brown and Fisher, 1972; Rohde, 1976). 6/60
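To make the saving from pooled screening concrete, here is a minimal Python sketch of the expected number of tests per person under a two-stage scheme of the kind Dorfman described: test the pool once, then retest every member only if the pool is positive. It assumes a perfect test and independent infections; the function name and the prevalence value are illustrative and do not come from the slides.

```python
def dorfman_tests_per_person(prevalence: float, pool_size: int) -> float:
    """Expected tests per person under two-stage pooled screening.

    Assumes a perfect test and independent individuals (illustrative only).
    """
    if pool_size < 2:
        return 1.0  # no pooling: one test per person
    p_pool_negative = (1.0 - prevalence) ** pool_size
    # One pooled test shared by pool_size people, plus one individual
    # retest per person whenever the pool tests positive.
    return 1.0 / pool_size + (1.0 - p_pool_negative)


if __name__ == "__main__":
    for k in (2, 5, 10, 20):
        print(k, round(dorfman_tests_per_person(0.01, k), 4))
```

Values below 1 mean pooling needs fewer tests than testing everyone individually; the benefit shrinks as prevalence rises.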

  12. Background: Pooling variables in biomarker assessment models. The gene-environment interaction is a particularly important hypothesis. Traditional interaction estimators require genotyping a large number of individuals. We propose a novel pooling strategy that yields a substantially less expensive estimator. 7/60

  13. Background: Pooling variables in biomarker assessment models. When two or more exposures are measured in pools, it is of interest to estimate their interaction effects. Weinberg and Umbach (1999) proposed the set-based logistic model, which
      - estimates main effects of exposures measured in pools, and
      - estimates interaction effects when one exposure is used to form pooling strata.

      However, it cannot estimate interaction, quadratic, or higher-order effects of exposures measured in pools. We propose a novel method for estimating interactions of pooled exposures. 8/60

  14. Background: Constrained estimation in biomarker assessment models.

      [Figure: two panels showing a constraint region on axes from 0 to 120 (horizontal) and 0 to 20 (vertical), with elements labeled V and U and marked points (10, 0) and (98.89, 8.89).]

      Gelfand et al. (1992) proposed drawing samples from an unconstrained posterior and retaining only those that satisfy the constraints. Dunson et al. (2003) proposed a transformation approach for draws falling outside the constraint space. We propose a Bayesian model with a prior density on the constraint space using the Minkowski-Weyl decomposition. 9/60
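The "draw, then keep only what satisfies the constraints" idea can be sketched in a few lines of Python. This is a toy illustration, not the method of any cited paper: the "posterior" is a stand-in of independent normals, and the constraint `0 <= theta[0] <= theta[1]` is a hypothetical example.

```python
import random


def constrained_draws(n_draws, mean, sd, constraint, seed=0):
    """Draw from an unconstrained stand-in posterior and keep valid draws."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_draws):
        # Stand-in posterior: independent normals (an assumption for the demo).
        draw = [rng.gauss(m, s) for m, s in zip(mean, sd)]
        if constraint(draw):
            kept.append(draw)
    return kept


def ordered_nonnegative(theta):
    """Hypothetical constraint: 0 <= theta_1 <= theta_2."""
    return 0.0 <= theta[0] <= theta[1]


draws = constrained_draws(5000, mean=[0.5, 1.0], sd=[1.0, 1.0],
                          constraint=ordered_nonnegative)
acceptance_rate = len(draws) / 5000
print(f"kept {len(draws)} of 5000 draws ({acceptance_rate:.1%})")
```

The acceptance rate falls as the constraint region becomes small relative to the unconstrained posterior, so many draws are wasted. That inefficiency is one reason to consider a prior defined directly on the constraint space.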

  15. Chapter 3: Estimation of interaction effects using pooled biospecimen 10/60

  16. Outline: Estimation of interaction effects using pooled biospecimen
      - Introduction
      - Notation and assumptions
      - A cohort study: the EM algorithm, variance estimation, simulation
      - A case-control study: the EM algorithm, variance estimation, simulation
      - Discussion

      Intro Notation Cohort Cohort Simulation Case Control Case Control Simulation Discussion 11/60

  19. Introduction: Why are we interested in pooling biospecimens?
      - Pooling saves financial resources and invaluable biological resources.
      - When measuring a continuous biomarker, pooling may reduce the percentage of measurements below the limit of detection (LOD).

      [Figure: two panels, "Pooled exposure assessment" and "Individual exposure assessment", each with the LOD marked.]

      12/60
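The LOD point can be checked with a small simulation. The Python sketch below makes several assumptions that are not in the slides: exposures are normally distributed, the assay on a pool reads the average of its members, and the mean, standard deviation, and LOD values are arbitrary.

```python
import random


def fraction_below_lod(values, lod):
    """Share of measurements that fall below the limit of detection."""
    return sum(v < lod for v in values) / len(values)


def simulate(n=20000, pool_size=2, mean=10.0, sd=10.0, lod=0.0, seed=1):
    rng = random.Random(seed)
    individual = [rng.gauss(mean, sd) for _ in range(n)]
    # Assumed pooled assay: the average of the pool members' values.
    pooled = [sum(individual[i:i + pool_size]) / pool_size
              for i in range(0, n, pool_size)]
    return fraction_below_lod(individual, lod), fraction_below_lod(pooled, lod)


frac_individual, frac_pooled = simulate()
print(f"below LOD: individual {frac_individual:.3f}, pooled {frac_pooled:.3f}")
```

Averaging shrinks the spread of the measurements, so fewer pooled values fall below the LOD when the mean lies above it. If the mean lies below the LOD the effect reverses, which is why the slide says pooling "may" help.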

  20. Notation and assumptions
      - $n$ is the number of individuals in the study.
      - $p$ is the number of individuals' biospecimens within each pool.
      - $n_p = n / p$ is the number of pools of biospecimens.
      - $x_{ij} = [x_{ij1}, x_{ij2}]'$ are the two continuous exposures of interest for the $j$th person in the $i$th pool.
      - $x_i^p = \sum_{j=1}^{p} x_{ij}$ is the observed exposure within a pool.
      - $y_{ij}$ is the binary response of the $j$th person in the $i$th pool.

      13/60
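As a small illustration of this notation, the following Python sketch forms the pooled exposures $x_i^p$ by summing the two-component exposure vectors of consecutive individuals. The function name and the toy data are hypothetical, and grouping consecutive individuals is only one possible pooling design.

```python
def pool_exposures(exposures, pool_size):
    """Sum two-component exposure vectors within consecutive pools."""
    n = len(exposures)
    if n % pool_size != 0:
        raise ValueError("n must be a multiple of the pool size p")
    pools = []
    for start in range(0, n, pool_size):
        members = exposures[start:start + pool_size]
        # x_i^p = sum over j of x_ij, computed component by component.
        pools.append([sum(x[k] for x in members) for k in range(2)])
    return pools


exposures = [[1.0, 2.0], [3.0, 4.0], [0.5, 0.5], [1.5, 2.5]]  # n = 4
pooled = pool_exposures(exposures, pool_size=2)                # n_p = 2
print(pooled)
```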

  21. Notation and assumptions. We assume that the disease status and exposures have the following distributions,

      $$y_{ij} \mid x_{ij}, \beta \overset{ind}{\sim} \mathrm{Ber}[\pi(x_{ij}, \beta)], \qquad x_{ij} \mid \mu, \Sigma \overset{iid}{\sim} N_2(\mu, \Sigma),$$

      for $i = 1, 2, \ldots, n_p$ and $j = 1, 2, \ldots, p$. Here $\pi(x_{ij}, \beta)$ is the logistic link given by

      $$\pi(x_{ij}, \beta) = \frac{\exp\{\beta' D(x_{ij})\}}{1 + \exp\{\beta' D(x_{ij})\}},$$

      where $\beta = [\beta_0, \beta_1, \beta_2, \beta_3]'$ are the unknown parameters of interest and $D(x_{ij}) = [1, x_{ij1}, x_{ij2}, x_{ij1} \times x_{ij2}]'$. Define the unknown parameters by $\theta = [\beta', \gamma']'$, where $\gamma = [\sigma_1, \sigma_2, \rho, \mu']'$. 14/60
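The model on this slide can be simulated directly. The Python sketch below is one possible reading, using only the standard library: `design` builds $D(x_{ij})$, `pi` evaluates the logistic link, and `simulate_cohort` draws bivariate normal exposures followed by Bernoulli outcomes. All parameter values and function names are illustrative assumptions.

```python
import math
import random


def design(x):
    """D(x) = [1, x1, x2, x1 * x2]."""
    return [1.0, x[0], x[1], x[0] * x[1]]


def pi(x, beta):
    """Logistic probability exp(eta) / (1 + exp(eta)), eta = beta' D(x)."""
    eta = sum(b * d for b, d in zip(beta, design(x)))
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)  # avoids overflow for very negative eta
    return e / (1.0 + e)


def draw_exposure(rng, mu, sigma1, sigma2, rho):
    """One draw from N_2(mu, Sigma) built from two standard normals."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x1 = mu[0] + sigma1 * z1
    x2 = mu[1] + sigma2 * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return [x1, x2]


def simulate_cohort(n, beta, mu, sigma1, sigma2, rho, seed=0):
    """Return a list of (y, x) pairs for n individuals."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = draw_exposure(rng, mu, sigma1, sigma2, rho)
        y = 1 if rng.random() < pi(x, beta) else 0
        data.append((y, x))
    return data


cohort = simulate_cohort(1000, beta=[-1.0, 0.5, 0.5, 0.2],
                         mu=[0.0, 0.0], sigma1=1.0, sigma2=1.0, rho=0.3)
print(sum(y for y, _ in cohort), "cases out of", len(cohort))
```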

  22. A cohort study. The likelihood for a cohort study is as follows,

      $$L(\theta \mid y_1, \ldots, y_p, X_1, \ldots, X_p) = \prod_{i=1}^{n_p} \prod_{j=1}^{p} \Pr(Y = y_{ij} \mid x_{ij}, \beta)\, \phi(x_{ij} \mid \mu, \Sigma).$$

      When $p = 2$ and pooling is matched within disease status (i.e. $y_1 = y_2$, so both members of a pool share the same disease status),

      $$L(\theta \mid y_1, X_1, X^p) = \prod_{i=1}^{n_p} g(x_{i1}, y_{i1}, x_i^p, \theta),$$

      where the function $g$ is defined as

      $$g(x_{i1}, y_{i1}, x_i^p, \theta) = \Pr(Y = y_{i1} \mid x_{i1}, \beta)\, \phi(x_{i1} \mid \mu, \Sigma) \times \Pr(Y = y_{i1} \mid x_i^p - x_{i1}, \beta)\, \phi(x_i^p - x_{i1} \mid \mu, \Sigma).$$

      15/60
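For readers who want to evaluate $g$ numerically, here is a hedged Python sketch of $\log g$ for a pool of size two. It works on the log scale for numerical stability and parameterizes $\Sigma$ by $(\sigma_1, \sigma_2, \rho)$ as in $\gamma$. The function names are illustrative.

```python
import math


def log_phi(x, mu, sigma1, sigma2, rho):
    """Log density of the bivariate normal N_2(mu, Sigma)."""
    z1 = (x[0] - mu[0]) / sigma1
    z2 = (x[1] - mu[1]) / sigma2
    one_minus_r2 = 1.0 - rho * rho
    quad = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / one_minus_r2
    log_norm = math.log(2.0 * math.pi * sigma1 * sigma2 * math.sqrt(one_minus_r2))
    return -log_norm - 0.5 * quad


def log_pr_y(y, x, beta):
    """log Pr(Y = y | x, beta) = y * eta - log(1 + exp(eta))."""
    eta = beta[0] + beta[1] * x[0] + beta[2] * x[1] + beta[3] * x[0] * x[1]
    log1pexp = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
    return y * eta - log1pexp


def log_g(x_i1, y_i1, x_pool, beta, mu, sigma1, sigma2, rho):
    """log g for a pool of two people who share disease status y_i1."""
    # The second member's exposure is the pooled sum minus the first's.
    x_i2 = [x_pool[0] - x_i1[0], x_pool[1] - x_i1[1]]
    return (log_pr_y(y_i1, x_i1, beta) + log_phi(x_i1, mu, sigma1, sigma2, rho)
            + log_pr_y(y_i1, x_i2, beta) + log_phi(x_i2, mu, sigma1, sigma2, rho))


value = log_g([0.3, -0.2], 1, [1.0, 0.5], beta=[-1.0, 0.5, 0.5, 0.2],
              mu=[0.0, 0.0], sigma1=1.0, sigma2=1.0, rho=0.3)
print(value)
```

Because the two pool members are exchangeable, `log_g` returns the same value when `x_i1` is replaced by `x_pool - x_i1`.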

  23. A cohort study: The EM algorithm. The first step of the EM algorithm is to evaluate $Q$,

      $$Q(\theta \mid \theta^t) = E_f\{\log[L(\theta \mid y_1, X_1, X^p)]\} = \sum_{i=1}^{n_p} \int_{x_{i1} \in \mathbb{R}^2} \log[g(x_{i1}, y_{i1}, x_i^p, \theta)]\, f_{X_1 \mid y_1, X^p, \theta^t}(x_{i1} \mid y_1, X^p, \theta^t)\, dx_{i1},$$

      where the density $f_{X_1 \mid y_1, X^p, \theta^t}$ is

      $$f_{X_1 \mid y_1, X^p, \theta^t}(x_{i1} \mid y_1, X^p, \theta^t) = \frac{g(x_{i1}, y_{i1}, x_i^p, \theta^t)}{\int_{x_{i1} \in \mathbb{R}^2} g(x_{i1}, y_{i1}, x_i^p, \theta^t)\, dx_{i1}}.$$

      16/60
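The slides do not say how the integrals in the E-step are computed. One possible approach, offered here only as an assumption, is self-normalized importance sampling. As a function of $x_{i1}$, the product $\phi(x_{i1} \mid \mu, \Sigma)\, \phi(x_i^p - x_{i1} \mid \mu, \Sigma)$ is proportional to a $N_2(x_i^p / 2, \Sigma / 2)$ density, so draws from that normal can be weighted by the two logistic terms. The Python sketch below estimates $E_f(x_{i1})$ and $E_f(x_{i1} x_{i1}')$ for one pool; all names are illustrative.

```python
import math
import random


def pr_y(y, x, beta):
    """Pr(Y = y | x, beta) under the logistic model with interaction."""
    eta = beta[0] + beta[1] * x[0] + beta[2] * x[1] + beta[3] * x[0] * x[1]
    if eta >= 0:
        p1 = 1.0 / (1.0 + math.exp(-eta))
    else:
        p1 = math.exp(eta) / (1.0 + math.exp(eta))
    return p1 if y == 1 else 1.0 - p1


def e_step_moments(y, x_pool, beta, sigma1, sigma2, rho, n_draws=4000, seed=0):
    """Estimate E_f(x_i1) and E_f(x_i1 x_i1') by importance sampling."""
    rng = random.Random(seed)
    # Proposal N_2(x_pool / 2, Sigma / 2): same rho, sds scaled by 1/sqrt(2).
    s1, s2 = sigma1 / math.sqrt(2.0), sigma2 / math.sqrt(2.0)
    total_w = 0.0
    mean = [0.0, 0.0]
    second = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = [x_pool[0] / 2.0 + s1 * z1,
              x_pool[1] / 2.0 + s2 * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)]
        x2 = [x_pool[0] - x1[0], x_pool[1] - x1[1]]
        # Weight: the part of g not already covered by the proposal.
        w = pr_y(y, x1, beta) * pr_y(y, x2, beta)
        total_w += w
        for r in range(2):
            mean[r] += w * x1[r]
            for c in range(2):
                second[r][c] += w * x1[r] * x1[c]
    if total_w == 0.0:
        raise ValueError("all importance weights underflowed to zero")
    mean = [m / total_w for m in mean]
    second = [[v / total_w for v in row] for row in second]
    return mean, second


mean, second = e_step_moments(1, [2.0, 4.0], [0.0, 0.0, 0.0, 0.0], 1.0, 1.0, 0.3)
print(mean, second)
```

Note that $\mu$ does not appear in the sampler: once the pooled sum is known, the mean cancels out of the conditional distribution of $x_{i1}$.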

  24. A cohort study: The EM algorithm. The second step in the EM algorithm is to maximize $Q$ with respect to (w.r.t.) the unknown parameters $\theta$. We use Newton-Raphson to maximize $Q$ w.r.t. $\beta$, where

      $$\beta^{k+1} = \beta^k - \left[\frac{\partial^2 Q(\theta \mid \theta^t)}{\partial \beta\, \partial \beta'}\right]^{-1} \frac{\partial Q(\theta \mid \theta^t)}{\partial \beta},$$

      and $k$ increases until convergence. The values of $\mu$ and $\Sigma$ which maximize $Q$ are

      $$\mu^{t+1} = \frac{1}{n_p} \sum_{i=1}^{n_p} \frac{x_i^p}{2},$$

      $$\Sigma^{t+1} = \mu^{t+1} (\mu^{t+1})' + \frac{1}{n_p} \sum_{i=1}^{n_p} \left[ \frac{x_i^p x_i^{p\prime}}{2} - x_i^p (\mu^{t+1})' + E_f(x_{i1} x_{i1}') - \frac{x_i^p}{2} E_f(x_{i1}') - E_f(x_{i1}) \frac{x_i^{p\prime}}{2} \right].$$

      17/60
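The closed-form updates for $\mu$ and $\Sigma$ are easy to sanity-check in code. The Python sketch below implements them for $p = 2$, taking the E-step moments $E_f(x_{i1})$ and $E_f(x_{i1} x_{i1}')$ as inputs. In the check, the moments are replaced by known individual values, in which case the update should reduce to the ordinary sample mean and maximum-likelihood covariance of all $2 n_p$ individuals. Function names and data are illustrative.

```python
def m_step_mu_sigma(pools, e_x1, e_x1x1):
    """Closed-form updates of mu and Sigma for pools of size two.

    pools[i]  : pooled exposure x_i^p (length-2 list)
    e_x1[i]   : E_f(x_i1) (length-2 list)
    e_x1x1[i] : E_f(x_i1 x_i1') (2x2 nested list)
    """
    n_p = len(pools)
    mu = [sum(xp[k] for xp in pools) / (2.0 * n_p) for k in range(2)]
    # Start from mu mu' and add the averaged bracketed terms.
    sigma = [[mu[r] * mu[c] for c in range(2)] for r in range(2)]
    for xp, m1, m2 in zip(pools, e_x1, e_x1x1):
        for r in range(2):
            for c in range(2):
                term = (xp[r] * xp[c] / 2.0 - xp[r] * mu[c] + m2[r][c]
                        - xp[r] * m1[c] / 2.0 - m1[r] * xp[c] / 2.0)
                sigma[r][c] += term / n_p
    return mu, sigma


def outer(v):
    return [[v[r] * v[c] for c in range(2)] for r in range(2)]


# Check with fully observed individuals: pool 1 = {a, b}, pool 2 = {c, d}.
a, b, c, d = [1.0, 2.0], [3.0, 5.0], [0.0, 1.0], [2.0, 0.0]
pools = [[a[0] + b[0], a[1] + b[1]], [c[0] + d[0], c[1] + d[1]]]
mu_hat, sigma_hat = m_step_mu_sigma(pools, [a, c], [outer(a), outer(c)])
print(mu_hat, sigma_hat)
```

Individual terms in the sum are not symmetric, but the averaged result is, because $\mu^{t+1}$ is itself the average of $x_i^p / 2$.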
