Learning Hierarchical Bayesian Networks for Genome-Wide Association Studies


  1. Learning Hierarchical Bayesian Networks for Genome-Wide Association Studies
     Raphaël Mourad (1), Christine Sinoquet (2) and Philippe Leray (1)
     KOD team (KnOwledge and Decision)
     (1) LINA, UMR CNRS 6241, Ecole Polytechnique de l'Université de Nantes, France
     (2) LINA, UMR CNRS 6241, Université de Nantes, France
     Presented by Raphaël Mourad, PhD student in Bioinformatics, raphael.mourad@univ-nantes.fr

  2. Mourad R. et al.: Learning Hierarchical Bayesian Networks for GWAS (COMPSTAT 2010)
     Outline
     1/ Introduction
     2/ Fundamental concept of association genetics
     3/ Presentation of genetic data
     4/ Our approach
     5/ Results and discussion
     6/ Conclusion and outlook

  3. Introduction

  4. ● Context: complex genetic diseases are multifactorial diseases caused by a combination of genetic factors (e.g. genes) and environmental factors (e.g. sex, age). Examples: diabetes, asthma, hypertension, some cancers.

  5. ● Dissecting the genetic basis of these diseases: genome-wide association studies (GWAS) → identification of genetic markers associated with common, complex diseases.
     [Figure: markers along a chromosome.] The variability of the human genome is covered by hundreds of thousands of markers.

  6. Fundamental concept of association genetics

  7. ● Linkage disequilibrium (LD):
     → dependences generally observed between SNPs (single-nucleotide polymorphisms) that lie close together on the chromosome,
     → the basis of GWAS.
     [Figure: a chromosome carrying a causal mutation, with neighbouring markers in LD with it.] LD holds between markers and their surrounding area on the chromosome.
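
To make the notion of dependence between close SNPs concrete, here is a minimal sketch that measures it as the squared Pearson correlation between two genotype vectors. The 0/1/2 genotype coding, the function name `genotype_r2`, and the simulated data are assumptions made for illustration; the slides do not state which dependence measure the authors use.

```python
import numpy as np

def genotype_r2(snp_a, snp_b):
    """Squared Pearson correlation between two genotype vectors coded 0/1/2.

    This is one common proxy for LD on unphased genotypes (an assumption here,
    not necessarily the measure used in the presentation).
    """
    a = np.asarray(snp_a, dtype=float)
    b = np.asarray(snp_b, dtype=float)
    if a.std() == 0 or b.std() == 0:
        return 0.0  # a constant SNP carries no dependence information
    r = np.corrcoef(a, b)[0, 1]
    return float(r * r)

# Simulated (hypothetical) data: 2000 individuals.
rng = np.random.default_rng(0)
n = 2000
close_a = rng.integers(0, 3, size=n)
# A "neighbouring" SNP that copies close_a except for about 10% of individuals.
resampled = rng.random(n) < 0.1
close_b = np.where(resampled, rng.integers(0, 3, size=n), close_a)
# A "distant" SNP drawn independently.
far = rng.integers(0, 3, size=n)

print("close pair r2:", genotype_r2(close_a, close_b))
print("distant pair r2:", genotype_r2(close_a, far))
```

The close pair should score far higher than the distant pair, which is the pattern that LD produces along a chromosome.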

  8. Presentation of genetic data

  9. DNA: > 100k SNPs (ternary variables).
     Phenotype: 1 binary variable (1000 non-affected individuals, 1000 affected individuals).
     ● Characteristics:
     → large number of genetic variables (SNPs): combinatorial explosion,
     → strong dependences among genetic variables.
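
The data layout described above can be sketched as a genotype matrix plus a phenotype vector. The 0/1/2 coding of the ternary variables, the array names, and the reduced number of SNPs (1,000 instead of more than 100k, to keep the example small) are assumptions for illustration. The last line counts the SNP pairs at 100k SNPs, to show why exhaustive multi-SNP analysis explodes combinatorially.

```python
import numpy as np
from math import comb

n_controls, n_cases = 1000, 1000   # non-affected and affected individuals
n_snps = 1_000                      # illustration only; the slide mentions > 100k

rng = np.random.default_rng(1)
# One row per individual, one column per SNP, each entry in {0, 1, 2}
# (hypothetical random genotypes, not real data).
genotypes = rng.integers(0, 3, size=(n_controls + n_cases, n_snps), dtype=np.int8)
# Binary phenotype: 0 = non-affected, 1 = affected.
phenotype = np.concatenate([
    np.zeros(n_controls, dtype=np.int8),
    np.ones(n_cases, dtype=np.int8),
])

# Number of distinct SNP pairs when there are 100,000 SNPs.
pairs_100k = comb(100_000, 2)

print(genotypes.shape, int(phenotype.sum()), pairs_100k)
```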

  10. Our approach

  11. ● Reduce the data dimension by synthesizing the information of highly dependent SNPs, a dependence that is due to LD.
     [Figure: cliques of highly dependent SNPs are each connected to a latent variable (LV) that synthesizes the information of the SNP clique, which reduces the data dimension.]

  12. ● Provide a flexible probabilistic model, adapted to genetic data, to reduce dimension.
     [Figure: along the genome sequence, the data are characterized by dependences within blocks of SNPs. The proposed modelling is a Forest of Hierarchical Latent Class models (FHLCM), with latent variables placed above the observed variables (SNPs).]
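
For readers unfamiliar with latent class models, the block below writes out the standard factorization that such a model encodes. The notation ($X_i$ for SNPs, $H$ for a latent variable, $T$ trees) is assumed here and does not come from the slides. It covers the single-latent-variable case and the product over trees of a forest, not the full hierarchical parameterization of an FHLCM.

```latex
% Latent class model: one latent variable H with observed children X_1, ..., X_n,
% which are conditionally independent given H.
\[
P(X_1, \dots, X_n) \;=\; \sum_{h} P(H = h) \prod_{i=1}^{n} P(X_i \mid H = h)
\]

% Forest of T trees with no edges between trees: the joint distribution over all
% observed variables factorizes over the trees, where \mathbf{X}^{(t)} denotes
% the observed variables of tree t.
\[
P(\mathbf{X}) \;=\; \prod_{t=1}^{T} P\big(\mathbf{X}^{(t)}\big)
\]
```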

  13. ● Advantages of this modelling:
     → hierarchical, thus:
       - various degrees of dimension reduction,
       - various degrees of LD strength,
     → each latent variable can reveal multiple-SNP patterns, potentially relevant to explain the disease,
     → contrary to the Hierarchical Latent Class model, SNPs are not constrained to be dependent upon one another,
     → high-order interactions between SNPs can be taken into account.

  14. ● Proposed algorithm to learn both the parameters and the structure of FHLCMs from data: CFHLC (Construction of Forests of Hierarchical Latent Class models).
     → based on an agglomerative hierarchical procedure to ensure scalability,
     → uses clique partitioning methods for an efficient discovery of non-overlapping cliques of dependent SNPs,
     → unlike Hwang et al.'s algorithm, not restricted to binary variables and binary trees.
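
The two ingredients named above, clique partitioning and agglomeration, can be illustrated with a toy sketch. This is not the CFHLC algorithm: the dependence measure (squared correlation), the greedy partitioning heuristic, the threshold, and the summary of each clique by its rounded mean genotype are all stand-ins chosen for brevity, whereas the authors learn latent variables within a probabilistic model. All function names are hypothetical.

```python
import numpy as np

def pairwise_r2(data):
    """Squared Pearson correlation between columns (stand-in dependence measure)."""
    with np.errstate(invalid="ignore", divide="ignore"):
        r = np.corrcoef(data, rowvar=False)
    return np.nan_to_num(r) ** 2  # constant columns count as independent

def partition_into_cliques(dep, threshold):
    """Greedily split variables into non-overlapping cliques of the graph
    that has an edge i-j whenever dep[i, j] >= threshold."""
    adjacent = dep >= threshold
    unassigned = list(range(dep.shape[0]))
    cliques = []
    while unassigned:
        clique = [unassigned.pop(0)]
        for v in list(unassigned):
            # v joins only if it is adjacent to every current member.
            if all(adjacent[v, u] for u in clique):
                clique.append(v)
                unassigned.remove(v)
        cliques.append(clique)
    return cliques

def build_layers(genotypes, threshold=0.5, max_layers=10):
    """Agglomerate bottom-up; return the number of variables at each level."""
    current = np.asarray(genotypes, dtype=float)
    sizes = [current.shape[1]]
    for _ in range(max_layers):
        if current.shape[1] < 2:
            break
        cliques = partition_into_cliques(pairwise_r2(current), threshold)
        if all(len(c) == 1 for c in cliques):
            break  # nothing left to merge
        # Stand-in for a latent variable: rounded mean genotype of the clique.
        current = np.column_stack(
            [np.rint(current[:, c].mean(axis=1)) for c in cliques]
        )
        sizes.append(current.shape[1])
    return sizes

# Hypothetical data: 3 blocks of 4 strongly dependent SNPs plus 2 independent SNPs.
rng = np.random.default_rng(42)
n = 500
columns = []
for _ in range(3):
    base = rng.integers(0, 3, size=n)
    for _ in range(4):
        noisy = rng.random(n) < 0.05
        columns.append(np.where(noisy, rng.integers(0, 3, size=n), base))
for _ in range(2):
    columns.append(rng.integers(0, 3, size=n))
genotypes = np.column_stack(columns)

layer_sizes = build_layers(genotypes)
print("variables per level:", layer_sizes)
```

Variables that join no clique are carried up unchanged, which loosely mirrors the forest structure in which SNPs are not forced to depend on one another.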

  15. Schema of the algorithm [figure]

  16. Results and discussion

  17. ● Test protocol:
     → C++ implementation,
     → run on a standard PC (3.8 GHz, 3.3 GB RAM),
     → tested on simulated unphased genotypic data consisting of 2000 individuals and 1k, 10k or 100k SNPs, generated with the software Hapsimu.

  18. Scalability

  19. Visual display of an FHLCM for a 100-SNP sequence.
     [Figure labels: high-dependence regions, low-dependence regions, latent variables, observed variables (SNPs), high-order dependences.]

  20. Conclusion and outlook

  21. Conclusion:
     ● The CFHLC algorithm has been shown to be efficient on genome-scale data,
     ● It can provide a data dimension reduction of 80%.
     Perspectives:
     ● Application to the detection of genetic associations using the FHLCM's latent variables,
     ● Visualization of LD structure through the FHLCM's graph.

  22. Thanks for your attention

  24. Questions

  25. Impact of window size on running time

  26. Impact of window size on dimension reduction

  27. Bibliography
     General on GWAS:
     - Balding D. (2006): A tutorial on statistical methods for population association studies.
     Specific to probabilistic graphical models:
     - Verzilli (2007): Bayesian graphical models for genome-wide association studies.
     - Hwang (2006): Learning hierarchical Bayesian networks for large-scale data analysis.
