POLYTOMY REFINEMENT FOR THE CORRECTION OF DUBIOUS DUPLICATIONS IN GENE TREES

  1. POLYTOMY REFINEMENT FOR THE CORRECTION OF DUBIOUS DUPLICATIONS IN GENE TREES • Manuel Lafond (1), Cedric Chauve (2,3), Riccardo Dondi (4), Nadia El-Mabrouk (1) • (1) Université de Montréal, Canada • (2) Université Bordeaux 1, France • (3) Simon Fraser University, Canada • (4) Università degli Studi di Bergamo, Italy

  2. Introduction • Gene tree G for the SLC24a2 gene family (solute carrier 24). [Figure: gene tree G with one SLC24 gene each from Mouse, Human, Chimp, Rat, Microbat, Megabat and Squirrel]

  3. Introduction • Species tree S for the species having a gene in G. [Figure: species tree S over Microbat, Megabat, Mouse, Rat, Human, Chimp and Squirrel]

  4. Introduction • G and S disagree. [Figure: S (leaves, left to right: MicBat, MegBat, Mous, Rat, Sqrl, Chmp, Hum) next to G (leaves, left to right: MicBat, Mous, Hum, Chmp, Sqrl, Rat, MegBat)]

  5. Introduction • LCA mapping: associate each ancestral gene with the species it belonged to. [Figure: S with nodes labeled u, v, w, x, y, z; each internal node of G carries the label of the node of S it maps to]

  6. Introduction • G and S disagree => duplication of an ancestral gene. [Figure: S and G with the LCA mapping]
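The LCA mapping and the duplication test can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the species-tree topology, the internal node names (`bats`, `murids`, `rodents`, `primates`, `euarch`, `root`) and the nested-pair encoding of gene trees are all assumptions made for the example. A gene-tree node is called a duplication when it maps to the same species-tree node as one of its children.

```python
# Hypothetical species tree as child -> parent. The topology and the internal
# node names are assumptions for this sketch, not read from the slides.
PARENT = {
    "MicBat": "bats", "MegBat": "bats",
    "Mous": "murids", "Rat": "murids",
    "murids": "rodents", "Sqrl": "rodents",
    "Hum": "primates", "Chmp": "primates",
    "rodents": "euarch", "primates": "euarch",
    "bats": "root", "euarch": "root",
}

def path_to_root(s):
    """Return [s, parent(s), ..., root]."""
    path = [s]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def lca(a, b):
    """Lowest common ancestor of two species-tree nodes."""
    seen = set(path_to_root(a))
    for node in path_to_root(b):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")

def reconcile(gene_tree):
    """Return (mapped_species, events) for a gene tree given as nested pairs.

    A leaf is the name of the species carrying the gene; an internal node is
    a 2-tuple. events lists (mapped_species, "Dup" or "Spec") in post-order.
    """
    if isinstance(gene_tree, str):
        return gene_tree, []
    left, right = gene_tree
    s_left, ev_left = reconcile(left)
    s_right, ev_right = reconcile(right)
    s_here = lca(s_left, s_right)
    # Duplication: the node maps to the same species as one of its children.
    kind = "Dup" if s_here in (s_left, s_right) else "Spec"
    return s_here, ev_left + ev_right + [(s_here, kind)]

# A gene tree that agrees with the assumed S, and one that does not.
G_ok = (("MicBat", "MegBat"), (("Mous", "Rat"), ("Hum", "Chmp")))
G_bad = (("MicBat", "Mous"), ("MegBat", "Hum"))
print(reconcile(G_ok)[1])   # every event is "Spec"
print(reconcile(G_bad)[1])  # the last event is ("root", "Dup")
```

In `G_bad` both children of the root already map to the root of S, so the root of the gene tree is forced to be a duplication even though no species appears on both sides.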

  7. Introduction • Extant species are expected to have 2 copies of the gene. • None of them do. That’s dubious! [Figure: S and G with the LCA mapping]

  8. Introduction • If some species were represented on both sides of the duplication, it would be an Apparent Duplication (AD). [Figure: S and G with the LCA mapping, with an extra Hum gene added to G]

  9. Introduction • Non-apparent duplication (NAD): the left and right subtrees of the duplication share no gene from the same species. [Figure: S and G with the LCA mapping; a duplication node of G is labeled NAD]
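The AD/NAD distinction depends only on the species found on each side of a duplication node. The sketch below assumes a hypothetical encoding (a gene tree is a nested pair and each leaf is the name of the species carrying that gene) and assumes the node has already been identified as a duplication.

```python
def species_below(tree):
    """Set of species at the leaves of a gene (sub)tree given as nested pairs."""
    if isinstance(tree, str):
        return {tree}
    left, right = tree
    return species_below(left) | species_below(right)

def duplication_kind(left, right):
    """Classify a node already known to be a duplication.

    Returns "AD" when at least one species has a gene on both sides of the
    node, and "NAD" when the two sides share no species.
    """
    shared = species_below(left) & species_below(right)
    return "AD" if shared else "NAD"

print(duplication_kind(("MicBat", "Mous"), ("MegBat", "Hum")))  # NAD
print(duplication_kind(("Hum", "Mous"), ("Hum", "Chmp")))       # AD
```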

  10. Introduction • Missing gene copies must have been lost some time ago. • NADs usually imply a bunch of losses. [Figure: S and G with the LCA mapping; a duplication node of G is labeled NAD]

  11. Introduction • NADs are called dubious or ambiguous duplications in the Ensembl database. • About 44% of duplication nodes are dubious. • The SLC24 gene tree has 32 duplication nodes, 24 of which are dubious. • Simulations showed that only 5% of duplications were actually NADs (Chauve & El-Mabrouk, 2009).

  12. Introduction • Alternative scenario for the root of G: no duplication occurred. [Figure: S and G; the root of G is labeled NAD]

  13. Introduction • Alternative scenario for the root of G: no duplication occurred => speciation => the bat genes should be separated from the others. [Figure: S next to a two-leaf sketch of G whose root separates the MicBat/MegBat genes from the Hum/Mous/Rat/Chmp/Sqrl genes]

  14. Introduction • Break G as little as possible: send the maximal bat subtrees left and the maximal rodent/primate subtrees right. [Figure: S and G]

  15. Introduction • Break G as little as possible: send the maximal bat subtrees left and the maximal rodent/primate subtrees right. [Figure: G and the resulting tree G’, whose leaves read MicBat, MegBat, Rat, Sqrl, Mous, Hum, Chmp]

  16. Introduction • G’ may end up with two unresolved polytomies. • We are looking for a binary refinement of these polytomies. [Figure: G’ with leaves MicBat, MegBat, Rat, Sqrl, Mous, Hum, Chmp]

  17. Introduction • Other sources of polytomies: • Lack of phylogenetic signal in the sequences, causing some gene tree construction algorithms to leave the gene tree partially unresolved. • Contraction of gene tree branches with low support (e.g. bootstrap values). [Figure: polytomy over SLC genes from Rat, Mous, Sqrl, Hum and Chmp]

  18. Previous work • Find a binary refinement minimizing: • Duplications + losses (Chang & Eulenstein, 2006, O(n^3)); • Duplications + losses (Lafond, Swenson & El-Mabrouk, 2012, O(n)); • Duplications, then losses (Zheng, Wu & Zhang, 2012, O(n)); • Losses: it’s a linear problem. • Our problem here: minimize NAD nodes. • For all these optimization criteria, polytomies can be refined independently, so we reduce the problem to a single polytomy.

  19. Introduction • Given: a polytomy P and a species tree S. • Find: a binary refinement of P that minimizes the number of NADs created. [Figure: S over Mous, Rat, Sqrl, Chmp, Hum; polytomy P over SLC genes from Rat, Sqrl, Mous, Hum, Chmp]

  20. Introduction • Given: a polytomy P and a species tree S. • Objective: find a binary refinement of P that minimizes the number of NADs created. [Figure: S, P, and a binary refinement of P with a node labeled NAD]

  21. Introduction • Given: a polytomy P and a species tree S. • Objective: find a binary refinement of P that minimizes the number of NADs created. [Figure: S, P, and a second binary refinement of P, shown without a NAD label]

  22. A simple example [Figure: species tree S over species a, b, c, d, e; polytomy P over genes a1, c1, e1, a2, b1, c2, d1]

  23. Reconnecting subtrees [Figure: S over a, b, c, d, e and the genes a1, c1, e1, a2, b1, c2, d1 of P]

  27. Reconnecting subtrees [Figure: S over a, b, c, d, e with the genes of P regrouped as a2, c2, d1, b1 and a1, c1, e1]

  31. Reconnecting subtrees • a1 and c1 are connected by Speciation (Spec). [Figure: S and the genes a1, c1, e1, a2, b1, c2, d1 of P]

  33. Reconnecting subtrees • a1 and (a2, b1) are connected by Apparent Duplication (AD). [Figure: S and the genes a1, c1, e1, a2, b1, c2, d1 of P]

  35. Reconnecting subtrees • a1, (a2, b1) are connected by Non-Apparent Duplication (NAD). [Figure: S and the genes a1, c1, e1, a2, b1, c2, d1 of P]

  36. Relationship graph • Each subtree is a vertex. • Each pair of vertices (x, y) is connected by an edge labeled with the connection type of x and y. [Figure: S and the relationship graph on the subtrees of P]
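A relationship graph for the simple example can be built as follows. This is a sketch under stated assumptions: the species-tree topology ((a,b),c),(d,e), its internal node names, and the grouping of P into the five subtrees a1, c1, e1, (a2,b1), (c2,d1) are guesses, since the slide text does not spell them out. Two subtrees are connected by AD if they share a species, by NAD if they share none but joining them still yields a duplication under the LCA mapping, and by Spec otherwise.

```python
from functools import reduce
from itertools import combinations

# Assumed species tree ((a,b),c),(d,e) as child -> parent. The topology and
# the internal node names are guesses made for this sketch.
PARENT = {"a": "ab", "b": "ab", "ab": "abc", "c": "abc",
          "d": "de", "e": "de", "abc": "r", "de": "r"}

def path_to_root(s):
    """Return [s, parent(s), ..., root]."""
    path = [s]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def lca(x, y):
    """Lowest common ancestor of two species-tree nodes."""
    seen = set(path_to_root(x))
    return next(node for node in path_to_root(y) if node in seen)

def relation(x, y):
    """Connection type of two subtrees, each given as its set of species."""
    if x & y:
        return "AD"                      # some species occurs on both sides
    sx, sy = reduce(lca, x), reduce(lca, y)
    # Disjoint species sets: a duplication here would be non-apparent.
    return "NAD" if lca(sx, sy) in (sx, sy) else "Spec"

def relationship_graph(subtrees):
    """Map each unordered pair of subtree names to its edge label."""
    return {frozenset((u, v)): relation(subtrees[u], subtrees[v])
            for u, v in combinations(subtrees, 2)}

# Assumed subtrees of the polytomy P, each reduced to its species set.
SUBTREES = {"a1": {"a"}, "c1": {"c"}, "e1": {"e"},
            "a2b1": {"a", "b"}, "c2d1": {"c", "d"}}
GRAPH = relationship_graph(SUBTREES)
for pair, kind in GRAPH.items():
    print(sorted(pair), kind)
```

Under these assumptions the pair a1, c1 is labeled Spec and the pair a1, (a2,b1) is labeled AD, which matches the two connections named on the "Reconnecting subtrees" slides.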

  39. Relationship graph • Speciation clique: a clique made up exclusively of “Spec” edges. [Figure: relationship graph with edge legend Spec, AD, NAD]

  42. Theorem • There exists a binary refinement with zero NADs iff there exists a set W of disjoint speciation cliques in the relationship graph such that W together with the AD edges forms a single connected component. [Figure: relationship graph with edge legend Spec, AD, NAD]

  47. Theorem • The minimum number of NADs in a binary refinement is d iff the minimum number of connected components formed by W together with the AD edges, taken over all sets W of disjoint speciation cliques in the relationship graph, is d + 1.
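The right-hand side of the theorem can be evaluated by brute force on a small labeled graph. The sketch below is only an illustration of the statement, not the authors' algorithm: it enumerates every partition of the vertices into speciation cliques (singletons count as trivial cliques), adds the AD edges, counts connected components, and returns the minimum minus one. It is exponential in the number of vertices. The vertex names and edge labels are a hypothetical example.

```python
from itertools import combinations

def partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [part[i] + [first]] + part[i + 1:]
        yield part + [[first]]

def components(vertices, edges):
    """Number of connected components, using a small union-find."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

def min_nads(vertices, label):
    """Minimum over sets W of disjoint Spec cliques of (#components - 1)."""
    ad_edges = [tuple(e) for e, kind in label.items() if kind == "AD"]
    best = len(vertices)
    for part in partitions(list(vertices)):
        pairs = [p for block in part for p in combinations(block, 2)]
        # Keep only partitions whose blocks are all speciation cliques.
        if all(label[frozenset(p)] == "Spec" for p in pairs):
            best = min(best, components(vertices, ad_edges + pairs))
    return best - 1

# Hypothetical labeled relationship graph on five subtrees.
VERTICES = ["a1", "c1", "e1", "a2b1", "c2d1"]
LABEL = {frozenset(k): v for k, v in {
    ("a1", "c1"): "Spec", ("a1", "e1"): "Spec", ("c1", "e1"): "Spec",
    ("c1", "a2b1"): "Spec", ("e1", "a2b1"): "Spec",
    ("a1", "a2b1"): "AD", ("c1", "c2d1"): "AD",
    ("a1", "c2d1"): "NAD", ("e1", "c2d1"): "NAD", ("a2b1", "c2d1"): "NAD",
}.items()}
print(min_nads(VERTICES, LABEL))  # 0
```

Here the speciation clique {a1, c1, e1} plus the two AD edges connects all five vertices, so the value is 0.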
