A Model Selection Approach for Genome Wide Association Studies

  1. Model Selection Simulation results for GWAS
     A Model Selection Approach for Genome Wide Association Studies
     Florian Frommlet, Piotr Twarog, Malgorzata Bogdan
     Department of Statistics and Decision Support Systems, University of Vienna, Austria
     Paris, August 2010

  2. Genome Wide Association Studies
     Data structure: $Y \leftarrow X_1, \dots, X_p$
     • Up to one million SNPs $X_1, \dots, X_p$
     • Trait $Y$: quantitative or categorical (case-control)
     Question: Which $X_i$ are actually associated with the trait?
     • Virtually all GWAS published so far: single-marker analysis
     Model selection approach
     • Model specified by index vector $M = [i_1, \dots, i_{k_M}]$
     • $M:\; Y = X_M \beta_M + \epsilon$, with $X_M = [X_{i_1}, \dots, X_{i_{k_M}}]$
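
To make the index-vector notation concrete, here is a minimal sketch of fitting one candidate model M by least squares. The toy sizes, the binomial genotype generator, the causal indices in `M_true`, and the helper name `fit_submodel` are all illustrative assumptions, not part of the slides.

```python
import numpy as np

def fit_submodel(X, y, M):
    """Least-squares fit of y on the columns of X listed in the index vector M."""
    X_M = X[:, M]                                   # n x k_M design matrix for model M
    beta_M, *_ = np.linalg.lstsq(X_M, y, rcond=None)
    rss = float(np.sum((y - X_M @ beta_M) ** 2))    # residual sum of squares
    return beta_M, rss

rng = np.random.default_rng(0)
n, p = 200, 50                                      # toy sizes, far smaller than a real GWAS
X = rng.binomial(2, 0.4, size=(n, p)).astype(float) # synthetic genotypes coded 0/1/2
X -= X.mean(axis=0)                                 # centre columns so no intercept is fitted
M_true = [3, 17, 42]                                # hypothetical causal SNP indices
y = X[:, M_true] @ np.array([0.5, 0.5, 0.5]) + rng.normal(size=n)

beta_hat, rss = fit_submodel(X, y, M_true)
print(beta_hat, rss)
```

A single-marker analysis corresponds to calling `fit_submodel` once per SNP with a one-element index vector, whereas the model selection approach compares index vectors of varying length.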

  5. Classical model selection criteria
     Selection criteria based on the likelihood $L_M$
     • Penalization of model size: $-2 \log L_M + \text{Penalty} \cdot k_M$
     • Examples: AIC, BIC, RIC, Mallows' $C_p$, etc.
     • AIC: Penalty = 2; BIC: Penalty = $\log n$
     $L_1$-penalization: LASSO etc.
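
The penalized form above can be sketched for a Gaussian linear model, where $-2 \log L_M$ reduces to $n \log(\mathrm{RSS}/n)$ up to a constant shared by all models. The function names and the two residual sums of squares below are hypothetical values chosen for illustration only.

```python
import numpy as np

def neg2_loglik_gaussian(rss, n):
    """-2 log L_M with the error variance profiled out, dropping model-independent constants."""
    return n * np.log(rss / n)

def penalized_criterion(rss, n, k, penalty):
    """Generic criterion: -2 log L_M + penalty * k_M."""
    return neg2_loglik_gaussian(rss, n) + penalty * k

def aic(rss, n, k):
    return penalized_criterion(rss, n, k, 2.0)        # AIC: penalty = 2

def bic(rss, n, k):
    return penalized_criterion(rss, n, k, np.log(n))  # BIC: penalty = log n

n = 649
small = dict(rss=640.0, k=3)   # hypothetical fit of a 3-SNP model
large = dict(rss=636.0, k=4)   # hypothetical fit after adding one more SNP

aic_prefers_large = aic(n=n, **large) < aic(n=n, **small)
bic_prefers_large = bic(n=n, **large) < bic(n=n, **small)
print(aic_prefers_large, bic_prefers_large)
```

With these made-up numbers the extra SNP lowers $-2 \log L_M$ by about 4, which is enough to pay the AIC penalty of 2 but not the BIC penalty of $\log 649$.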

  8. Situation when p > n
     Classical theory for AIC and BIC
     • Developed for p constant and n → ∞
     • Results no longer valid when p > n, e.g. BIC is no longer consistent
     Sparsity
     • Theory is possible when the number of true signals k ≪ p
     • Reasonable assumption: only a few SNPs are expected to be associated with the trait
     Surprise
     • Under sparsity and p > n, BIC chooses models that are too large
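
A rough back-of-the-envelope check suggests why BIC overselects here. It uses the n and p of the simulation scenario later in the deck and assumes, as a simplification not stated on the slides, that each null SNP contributes an independent chi-square(1) drop in $-2 \log L_M$.

```python
import math

n, p = 649, 309_790                     # sample size and SNP count from the simulation scenario

bic_penalty = math.log(n)               # what BIC charges for one additional SNP
best_null_gain = 2 * math.log(p)        # rough order of the largest of p null chi-square(1) statistics

# Probability that a single null SNP beats the BIC penalty: P(chi2_1 > log n)
p_exceed = math.erfc(math.sqrt(bic_penalty / 2))
expected_false_hits = p * p_exceed      # expected number of null SNPs that individually improve BIC

print(bic_penalty, best_null_gain, expected_false_hits)
```

Under these assumptions the per-SNP BIC penalty is far below the gain available from the best of the p null markers, so pure noise variables can enter the model.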

  11. Modifications of BIC
      $$\mathrm{BIC} = -2 \log L_M + k_M \log n$$
      For the situation p > n under sparsity [Bogdan et al. (2004)]:
      $$\mathrm{mBIC} = -2 \log L_M + k_M \log(n p^2 + d)$$
      • In a particular sense controlling FWE (related to Bonferroni)
      FDR-controlling model selection criterion:
      $$\mathrm{mBIC2} = -2 \log L_M + k_M \log(n p^2 + d) - 2 \log(k_M!)$$
      • Adaptivity to level of sparsity [Abramovich et al. (2006)]
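
The three criteria can be written as small functions to compare their penalties. This is only a sketch of the formulas as they appear on the slide; the slide does not define the constant d, so it is passed in as a parameter and set to 0 below purely as a placeholder assumption.

```python
import math

def bic(neg2_loglik, k, n):
    return neg2_loglik + k * math.log(n)

def mbic(neg2_loglik, k, n, p, d):
    # Penalty per SNP grows with p, unlike plain BIC.
    return neg2_loglik + k * math.log(n * p**2 + d)

def mbic2(neg2_loglik, k, n, p, d):
    # math.lgamma(k + 1) equals log(k!), which relaxes the penalty for larger models.
    return mbic(neg2_loglik, k, n, p, d) - 2 * math.lgamma(k + 1)

n, p, d = 649, 309_790, 0.0            # d = 0 is a placeholder, not a value from the slides
per_snp_bic = bic(0.0, 1, n)           # penalty for one SNP under BIC
per_snp_mbic = mbic(0.0, 1, n, p, d)   # penalty for one SNP under mBIC
print(per_snp_bic, per_snp_mbic)
print(mbic(0.0, 40, n, p, d) - mbic2(0.0, 40, n, p, d))   # penalty relief from the log(k!) term at k = 40
```

For a single SNP, mBIC and mBIC2 coincide because $\log(1!) = 0$; the two criteria only diverge as $k_M$ grows.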

  14. Theoretical papers
      ABOS: Asymptotic Bayes optimality under sparsity
      • Multiple testing, normal mixtures:
        M. Bogdan, A. Chakrabarti, F. Frommlet, J.K. Ghosh. Bayes oracle and asymptotic optimality of multiple testing procedures under sparsity. arXiv:1002.3501.
      • General priors, model selection:
        F. Frommlet, M. Bogdan, A. Chakrabarti. Asymptotic Bayes optimality under sparsity of selection rules for general priors. arXiv:1005.4753.

  16. Simulation scenario
      Population reference sample POPRES from dbGaP
      • 309790 SNPs for 649 individuals of European ancestry
      • k = 40 SNPs selected to be causal: MAF between 0.3 and 0.5, pairwise correlation between -0.12 and 0.1
      • Simulation of 1000 replicates from the additive model $M$: $Y = X_M \beta_M + \epsilon$, $\epsilon_i \sim N(0, 1)$
      Two scenarios
      1. Effect size constant for all SNPs at $\beta_j = 0.5$
      2. $\beta_j$ equally distributed between 0.27 and 0.66
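
The trait simulation can be sketched as follows. The POPRES genotypes are not reproduced here, so the causal SNPs are replaced by independent synthetic binomial genotypes, and "equally distributed" in scenario 2 is read as evenly spaced effect sizes; both are assumptions, as are the names `simulate_trait`, `beta_s1` and `beta_s2`.

```python
import numpy as np

def simulate_trait(X_causal, beta, rng):
    """Draw Y = X_M beta_M + eps with eps_i ~ N(0, 1)."""
    return X_causal @ beta + rng.normal(size=X_causal.shape[0])

rng = np.random.default_rng(2010)
n, k = 649, 40                                   # individuals and causal SNPs, as on the slide
maf = rng.uniform(0.3, 0.5, size=k)              # stand-in minor allele frequencies
X_causal = rng.binomial(2, maf, size=(n, k)).astype(float)   # synthetic genotypes, NOT POPRES data

beta_s1 = np.full(k, 0.5)                        # scenario 1: constant effect size
beta_s2 = np.linspace(0.27, 0.66, k)             # scenario 2: assumed evenly spaced effects

Y1 = simulate_trait(X_causal, beta_s1, rng)      # one replicate for scenario 1
Y2 = simulate_trait(X_causal, beta_s2, rng)      # one replicate for scenario 2
print(Y1.shape, Y2.shape)
```

Repeating the two `simulate_trait` calls 1000 times with fresh noise would mirror the number of replicates described on the slide.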

  20. Heritability
      Overall heritability is defined as
      $$H^2 = \frac{\mathrm{Var}(X_M \beta_M)}{1 + \mathrm{Var}(X_M \beta_M)}$$
      Heritability of an individual effect is defined as
      $$h_j^2 = \frac{\beta_j^2 \, \mathrm{Var}(X_j)}{1 + \mathrm{Var}(X_M \beta_M)}$$
      Scenario 1
      • Overall heritability: $H^2 \approx 0.82$
      • Individual effect: $h_j^2 \sim 0.022$
      Scenario 2
      • Overall heritability: $H^2 \approx 0.81$
      • Individual effect: $h_j^2$ ranging from 0.006 to 0.037
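
As a plausibility check of the heritability definitions, the sketch below evaluates them under simplifying assumptions that are not from the slides: uncorrelated SNPs, genotype variance 2·MAF·(1 − MAF) as under Hardy-Weinberg equilibrium, and a single hypothetical MAF of 0.4 for all causal SNPs.

```python
import numpy as np

def heritability(beta, var_x, noise_var=1.0):
    """Return (H^2, h_j^2) assuming uncorrelated SNPs, so Var(X_M beta_M) = sum_j beta_j^2 Var(X_j)."""
    per_snp = beta**2 * var_x
    total = noise_var + per_snp.sum()
    return per_snp.sum() / total, per_snp / total

k = 40
maf = np.full(k, 0.4)                 # hypothetical common MAF inside the 0.3-0.5 range
var_x = 2 * maf * (1 - maf)           # genotype variance under Hardy-Weinberg equilibrium

H2_s1, h2_s1 = heritability(np.full(k, 0.5), var_x)               # scenario 1
H2_s2, h2_s2 = heritability(np.linspace(0.27, 0.66, k), var_x)    # scenario 2, evenly spaced effects assumed
print(H2_s1, h2_s1[0])
print(H2_s2, h2_s2.min(), h2_s2.max())
```

The values obtained this way land close to the figures quoted on the slide, though they need not match exactly because the real causal SNPs have varying MAF and small pairwise correlations.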
