
Sophisticated models in Bio++ (Julien Dutheil, Bastien Boussau)



  1. Sophisticated models in Bio++. Julien Dutheil, Bastien Boussau (Birc, Aarhus; LBBE, Lyon). Friday, December 19th 2008. J. Dutheil, B. Boussau (Birc; LBBE) Models in Bio++ 19/12/08 1 / 13

  2. Models of sequence evolution: a tree. [Figure: a tree with leaves a, b, c]

  3. Models of sequence evolution: a tree and a model of substitution. [Figure: a tree with leaves a, b, c]

  4. Models of substitution in Bio++
     • for proteins and nucleic acids (codons: soon!)
     • with a gamma law to account for evolutionary rate heterogeneity between sites
     • possibility for a class of invariant sites
     • possibility for covarion (heterotachous) models:
        • on-off models (Tuffley and Steel 1998)
        • change between rates of evolution (Galtier 2001)

  5. Homogeneous and branch-heterogeneous models in Bio++: homogeneous model. [Figure: a tree with leaves a, b, c]

  6. Homogeneous and branch-heterogeneous models in Bio++: homogeneous model and heterogeneous model. [Figure: two trees, each with leaves a, b, c]

  7. A simple model of substitution: Tamura's (1992)
     • κ: transition/transversion ratio
     • θ: equilibrium G+C content
     Galtier and Gouy, Mol. Biol. Evol. 1998.
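As an illustration of how κ and θ define the Tamura (1992) model, here is a minimal Python sketch of its substitution rate matrix. It assumes the usual parameterization (equilibrium frequencies (1-θ)/2 for A and T, θ/2 for C and G, with transitions weighted by κ). The function name `t92_rate_matrix` is hypothetical, the matrix is left unnormalized, and this is not the Bio++ implementation.

```python
def t92_rate_matrix(kappa, theta):
    """Build an unnormalized Tamura (1992) rate matrix.

    kappa: transition/transversion ratio.
    theta: equilibrium G+C content (between 0 and 1).
    Returns (states, equilibrium frequencies, rate matrix Q as nested lists).
    """
    states = ["A", "C", "G", "T"]
    # Equilibrium frequencies: G and C share theta, A and T share 1 - theta.
    freqs = [(1 - theta) / 2, theta / 2, theta / 2, (1 - theta) / 2]
    # Transitions are purine<->purine (A, G) or pyrimidine<->pyrimidine (C, T).
    transitions = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
    Q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(states):
        for j, y in enumerate(states):
            if i != j:
                weight = kappa if (x, y) in transitions else 1.0
                Q[i][j] = weight * freqs[j]
        # The diagonal is still 0.0 here, so this makes the row sum to zero.
        Q[i][i] = -sum(Q[i])
    return states, freqs, Q


states, freqs, Q = t92_rate_matrix(kappa=2.0, theta=0.6)
for state, row in zip(states, Q):
    print(state, [round(rate, 3) for rate in row])
```

Raising θ raises every rate towards G and C, which is why a branch-specific θ is a natural parameter for detecting a shift in G+C content.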

  8. Galtier and Gouy model of sequence evolution (1998)
     [Figure: model parameters on a tree with leaves a, b, c]
     • 1 model per branch
     • each model is characterized by an equilibrium G+C content

  9. Models in Bio++: general non-homogeneous model of substitution. In the homogeneous case, θ and κ are constant over the tree (case 'a'). In Galtier and Gouy's 1998 model, κ is constant over the tree and one distinct θ is allowed per branch (case 'b'). Between these two extremes lie models in which some branches, but not all, share a common value of θ (case 'c'). In the most general case 'd', there are two sets of parameters, one for κ and another for θ, that are shared by the branches of the tree.
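The four cases can be pictured as different ways of assigning shared parameters to branches. The sketch below is only an illustration: the branch names, the parameter labels and the helper `count_parameters` are hypothetical, and it counts only the κ and θ values (no branch lengths or root frequencies).

```python
# Hypothetical branch labels for a small rooted tree with leaves a, b, c
# and one internal branch "ab".
branches = ["a", "b", "c", "ab"]


def count_parameters(assignment):
    """Count distinct kappa and theta parameters used across all branches."""
    kappas = {kappa for kappa, _ in assignment.values()}
    thetas = {theta for _, theta in assignment.values()}
    return len(kappas) + len(thetas)


cases = {
    # Case a: homogeneous, one kappa and one theta for the whole tree.
    "a": {br: ("kappa1", "theta1") for br in branches},
    # Case b: Galtier and Gouy (1998), shared kappa, one theta per branch.
    "b": {br: ("kappa1", "theta_" + br) for br in branches},
    # Case c: shared kappa, some branches share a theta.
    "c": {
        "a": ("kappa1", "theta1"),
        "b": ("kappa1", "theta1"),
        "c": ("kappa1", "theta2"),
        "ab": ("kappa1", "theta3"),
    },
    # Case d: kappa and theta each have their own grouping of branches.
    "d": {
        "a": ("kappa1", "theta1"),
        "b": ("kappa2", "theta1"),
        "c": ("kappa1", "theta2"),
        "ab": ("kappa2", "theta2"),
    },
}

for name, assignment in cases.items():
    print(name, count_parameters(assignment))
```

The difference in parameter counts between two nested cases gives the degrees of freedom when they are compared with a likelihood ratio test.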

  10. Associating models to branches

  11. Bio++ and BppSuite. BppSuite is a set of programs implementing various methods for the evolutionary study of sequences:
      • BppDist: distance estimation and tree reconstruction
      • BppPars: parsimony analyses
      • BppML: ML reconstruction of phylogenetic trees, including using non-homogeneous models
      • BppSeqGen: sequence simulation, including using non-homogeneous models
      • BppAncestor: ancestral sequence reconstruction, including using non-homogeneous models
      • BppSeqMan: sequence and alignment manipulation
      • BppConsense: building of consensus trees
      • BppPhySamp: select sequences according to a tree or a distance matrix
      • BppReRoot: automatic re-rooting of trees

  12. Specifying options of BppSuite programs. Launching an analysis with bppml. Example: bppml param=fichier.opt
      Contents of fichier.opt:
        alphabet = DNA
        sequence.file = sequences.fasta
        sequence.format = Fasta
        sequence.sites_to_use = complete
        tree.file = tree.dnd
        etc.
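To make the `key = value` layout of such an option file concrete, here is a small Python sketch that reads one into a dictionary. It is not the parser used by BppSuite. The function name `parse_options` is hypothetical, and the spelling `sequence.sites_to_use` is an assumption about a key that is garbled in the slide text.

```python
# Example option file content, following the slide (key spellings partly assumed).
EXAMPLE_OPTIONS = """
alphabet = DNA
sequence.file = sequences.fasta
sequence.format = Fasta
sequence.sites_to_use = complete
tree.file = tree.dnd
"""


def parse_options(text):
    """Parse 'key = value' lines into a dict, skipping blank or malformed lines."""
    options = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or "=" not in line:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


options = parse_options(EXAMPLE_OPTIONS)
for key, value in options.items():
    print(key, "->", value)
```

A helper like this can be handy for checking that the option files for the different hypotheses differ only in the model settings.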

  13. Associating models to branches in BppSuite

  14. Exercise
      • THE DATA: A well-known scientist is working on a family of homologous genes (file "sequences.fasta"). These sequences come from closely related species and have been named according to their species of origin: S vulg, S con, S dio, S lat, S dic. In species S dio, S lat, and S dic specifically, the gene is found on the sex chromosomes. For each of these species, the alignment thus contains two sequences, one from the X chromosome (X is put at the end of the name), and one from the Y chromosome (Y is put at the end of the name). The famous scientist has built a rooted phylogenetic tree relating all sequences in his dataset (file "tree.dnd").
      • THE PROBLEM: The scientist suspects that Biased Gene Conversion (BGC) may have occurred on the branch leading to the group containing sequences S dioY, S latY, and S dicY. This BGC is expected to increase the number of substitutions towards bases G and C. Your aim is to test for the presence of BGC on this branch.

  15. Exercise
      • THE AIMS:
         • Using BppML, devise a test to see whether the data reject BGC on this branch.
         • Meanwhile, try to accurately characterize the evolution in this dataset. Is there significant rate heterogeneity? Covarion-like evolution? How important has process heterogeneity been in the evolution of this dataset?
      • THE METHOD:
         • Option files have been partially filled. You need to complete them to build a proper model for hypothesis 0 (model 0: there was no heterogeneity in the evolution of the dataset), hypothesis 1 (model 1: there has been one significant change in the evolutionary process on one particular branch), and hypothesis 2 (model 2: the evolution has been globally heterogeneous, with different processes on different branches).
         • Play with the options to better characterize sequence evolution.
         • Use likelihood ratio tests to compare hypotheses.
      • BONUS QUESTION:
         • Think of another way to test whether the evolutionary process has been particular on the branch of interest. BppSuite may be useful once again; you may need to do a little bit of programming.
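For the likelihood ratio tests, a minimal Python sketch of the computation is given below. It assumes the compared models are nested and that the usual chi-square approximation applies, with degrees of freedom equal to the number of extra free parameters. The function names are hypothetical and the log-likelihood values in the usage lines are made up for illustration; they are not results from the exercise data.

```python
import math


def chi2_sf(x, df):
    """Survival function P(X > x) of a chi-square law with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 and df = 2, then the recurrence
    # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half))
    else:
        k, q = 2, math.exp(-half)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def likelihood_ratio_test(lnl_null, lnl_alt, extra_params):
    """Return (statistic, p-value) for nested models.

    lnl_null, lnl_alt: maximized log-likelihoods of the simpler and richer model.
    extra_params: number of additional free parameters in the richer model.
    """
    statistic = 2.0 * (lnl_alt - lnl_null)
    return statistic, chi2_sf(statistic, extra_params)


# Made-up log-likelihoods, for illustration only.
stat, p_value = likelihood_ratio_test(lnl_null=-1234.5, lnl_alt=-1230.0, extra_params=1)
print("statistic:", stat, "p-value:", p_value)
```

In the exercise, model 0 versus model 1 and model 1 versus model 2 would each be compared this way, using the log-likelihoods reported by bppml and the difference in the number of model parameters.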
