
ORTHOLOGY AND PARALOGY CONSTRAINTS: SATISFIABILITY AND CONSISTENCY

ORTHOLOGY AND PARALOGY CONSTRAINTS: SATISFIABILITY AND CONSISTENCY. Manuel Lafond, Nadia El-Mabrouk, University of Montreal. Outline: Introduction (gene trees, orthologs, paralogs); 3 problems, given a set of orthologs and paralogs


  1. ORTHOLOGY AND PARALOGY CONSTRAINTS: SATISFIABILITY AND CONSISTENCY. Manuel Lafond, Nadia El-Mabrouk, University of Montreal

  2. Outline • Introduction: gene trees, orthologs, paralogs, … • 3 problems, given a set of orthologs and paralogs: satisfiability; consistency with a species tree S; self-consistency • Experiments

  3. Introduction • Gene trees reflect the evolutionary history of a family of homologous genes, i.e. genes that all descend from a common ancestor • Gene trees don't have to be binary. [Figure: gene tree G with leaves a1, a2, b1, c1, d1, where a, b, c, d are species]

  4. Introduction • Ancestral genes may have undergone speciation or duplication. [Figure: gene tree G with leaves a1, a2, b1, c1, d1; each internal node is labeled Speciation or Duplication]

  5. Introduction • Orthologs: LCA has undergone speciation • Paralogs: LCA has undergone duplication (LCA = Lowest Common Ancestor). For instance, according to G: a1, b1 are paralogs; a1, c1 are orthologs. [Figure: gene tree G with leaves a1, a2, b1, c1, d1 and Speciation/Duplication labels]

  6. Introduction If we have G (and trust its Dup/Spec labeling), then we have all orthology/paralogy relationships. [Figure: gene tree G with leaves a1, a2, b1, c1, d1, next to a table sorting every pair of genes into Paralogs or Orthologs]
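
The step from a labeled gene tree to the full set of relationships can be sketched in a few lines of Python. This is only an illustration: the tree `G` below is a hypothetical stand-in (the figure is not recoverable from the transcript), chosen so that a1, b1 come out as paralogs and a1, c1 as orthologs as stated on slide 5. The nested-tuple encoding and the function names are assumptions, not the authors' code.

```python
from itertools import combinations

# Hypothetical labeled gene tree (NOT the one on the slide):
# internal node = (label, children), leaf = gene name.
G = ("spec", [("dup", ["a1", "a2", "b1"]), ("spec", ["c1", "d1"])])

def leaves(node):
    """List the genes below a node."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node[1] for leaf in leaves(child)]

def relations(node):
    """Map each pair of genes to 'orthologs' or 'paralogs' from its LCA label."""
    if isinstance(node, str):
        return {}
    label, children = node
    rel = {}
    for child in children:
        rel.update(relations(child))
    # Two genes taken from different children have this node as their LCA.
    kind = "orthologs" if label == "spec" else "paralogs"
    for left, right in combinations(children, 2):
        for x in leaves(left):
            for y in leaves(right):
                rel[frozenset((x, y))] = kind
    return rel
```

Calling `relations(G)` returns one entry per pair of genes, so a tree on 5 genes yields 10 relationships.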

  7. Introduction How does that go the other way around? If we have the orthology/paralogy relationships, can we get the gene tree? [Figure: the table of Paralogs and Orthologs over a1, a2, b1, c1, d1, with a question mark in place of the gene tree]

  8. Introduction Various software tools let us infer orthology (and sometimes paralogy) without a gene tree • Sequence-based: COG (Tatusov, Galperin, Natale & Koonin, 2000); OrthoMCL (Li, Stoeckert & Roos, 2003); InParanoid (Berglund, Sjolund, Ostlund & Sonnhammer, 2008); Proteinortho (Findeiß, Steiner, Marz, Stadler & Prohaska, 2011); … • Gene order-based: GIGA (Thomas, 2010); SYNERGY (Wapinski, Pfeffer, Friedman & Regev, 2007); [Unnamed] (Lafond, Swenson & El-Mabrouk, 2013)

  9. Introduction Various software tools let us infer orthology (and sometimes paralogy) without a gene tree • Sequence-based: COG, OrthoMCL, InParanoid, Proteinortho, … • Gene order-based: GIGA, SYNERGY, [Unnamed] • None of them finds ALL orthologies/paralogies!

  10. Satisfiability Orthologs = (a, b) (a, c) (c, d) Paralogs = (a, d) (b, d) Is there some gene tree and Dup/Spec labeling that displays these relationships ?

  11. Satisfiability Orthologs = (a,b) (a,c) (c,d) Paralogs = (a,d) (b,d) [Figure, shown on slides 11 and 12: gene tree with leaves c, a, b, d]

  13. Satisfiability Orthologs = (a,b) (a,c) (c,d) Paralogs = (a,d) (b,c) (b,d)

  14. Satisfiability Orthologs = (a,b) (a,c) (c,d) Paralogs = (a,d) (b,c) (b,d) [Figure, animated over slides 14 to 19: a gene tree is attempted on the genes d and a, then b, then c]

  20. Satisfiability Orthologs = (a,b) (a,c) (c,d) Paralogs = (a,d) (b,c) (b,d) I JUST CAN'T! THESE DON'T MAKE SENSE!

  21. Consistency with a species tree S Orthologs = (a,d) (c,d) Paralogs = (a,c) (b,d) [Figure: species tree S over a, b, c, d; the gene tree G is still a question mark]

  22. Consistency with a species tree S Orthologs = (a,d) (c,d) Paralogs = (a,c) (b,d) [Figure: species tree S and a candidate gene tree G, both over a, b, c, d]

  23. Consistency with a species tree S Consistency with a species tree S: if genes from species sets X, Y are separated by speciation in G, then species X, Y are separated in S. [Figure: species tree S and gene tree G, both over a, b, c, d]

  24. Consistency with a species tree S Consistency with a species tree S: if genes from species sets X, Y are separated by speciation in G, then species X, Y are separated in S. [Figure: species tree S and gene tree G, both over a, b, c, d, with a speciation node of G highlighted]
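
A minimal sketch of this check in Python follows. It rests on one interpretive assumption that the slides do not spell out: two species sets X, Y are read as "separated in S" when the smallest subtrees of S containing X and containing Y are disjoint. The encodings (nested tuples for S, `(label, children)` pairs for G, a `species_of` dictionary) and all function names are hypothetical.

```python
from itertools import combinations

def clusters(S):
    """Return (leaf set of S, list of the leaf sets of all subtrees of S)."""
    if isinstance(S, str):
        return frozenset([S]), [frozenset([S])]
    leafset, found = frozenset(), []
    for child in S:
        child_leaves, child_found = clusters(child)
        leafset |= child_leaves
        found += child_found
    return leafset, found + [leafset]

def lca_cluster(all_clusters, species):
    """Smallest subtree of S (as a leaf set) that contains the given species."""
    return min((c for c in all_clusters if species <= c), key=len)

def gene_species(node, species_of):
    """Set of species whose genes appear below a node of the gene tree."""
    if isinstance(node, str):
        return frozenset([species_of[node]])
    return frozenset().union(*(gene_species(c, species_of) for c in node[1]))

def consistent(G, S, species_of):
    """True if every speciation node of G separates species sets that S separates."""
    _, all_clusters = clusters(S)

    def check(node):
        if isinstance(node, str):
            return True
        label, children = node
        if label == "spec":
            subtrees = [lca_cluster(all_clusters, gene_species(c, species_of))
                        for c in children]
            # Assumption: "separated" means these subtrees of S do not overlap.
            if any(x & y for x, y in combinations(subtrees, 2)):
                return False
        return all(check(c) for c in children)

    return check(G)
```

With a hypothetical species tree `((("a", "b"), "c"), "d")`, a speciation node that splits genes of {a, c} from a gene of {b} is rejected, because the smallest subtree of S containing a and c also contains b. Duplication nodes are never tested.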

  25. Consistency with a species tree S Orthologs = (a,d) (c,d) Paralogs = (a,c) (b,d) [Figure: species tree S over a, b, c, d; the gene tree G is still a question mark]

  26. Consistency with a species tree S Orthologs = (a,d) (c,d) Paralogs = (a,c) (b,d) [Figure: species tree S and a second candidate gene tree G, both over a, b, c, d]

  27. Consistency with a species tree S Orthologs = (a,d) (c,d) Paralogs = (a,c) (b,d) [Figure: species tree S and gene tree G, both over a, b, c, d, with a speciation node of G highlighted]

  28. Self-consistency Orthologs = (a,d) (c,d) Paralogs = (a,c) (b, d) Can we build a gene tree G displaying these relationships such that there exists some species tree S consistent with it ?

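  29. Self-consistency Orthologs = (a,d) (c,d) Paralogs = (a,c) (b,d) [Figure: gene tree G with leaves d, a, c, b and a highlighted speciation node]

  30. Self-consistency Orthologs = (a,d) (c,d) Paralogs = (a,c) (b,d) [Figure: gene tree G with leaves d, a, c, b and a highlighted speciation node, next to a species tree S over a, b, c, d]

  31. Not self-consistent [Figure: species tree S with leaves a, b, c and gene tree G with leaves a1, b1, c1, a2, c2, b2]

  32. Not self-consistent [Figure: species tree S with leaves a, b, c, gene tree G with leaves a1, b1, c1, a2, c2, b2, and a second species tree S' with leaves b, a, c]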

  33. The problem(s) Given a set C of orthologs and paralogs: 1. Is C satisfiable? Does there exist a DS-tree (a gene tree with a Dup/Spec labeling) that exhibits all relationships in C? 2. Is C consistent with a given species tree S? Is there some DS-tree that satisfies C and is also consistent with S? 3. Is C self-consistent? Is there some species tree that C is consistent with?

  34. Satisfiability Orthologs = (a,b) (a,c) (c,d) Paralogs = (a,d) (b,c) (b,d) [Figure: constraint graph R on a, b, c, d, with one edge color for Orthologs and one for Paralogs]

  35. Satisfiability Orthologs = (a,b) (a,c) (c,d) Paralogs = (a,d) (b,c) (b,d) [Figure: constraint graph R on a, b, c, d, next to its paralogy subgraph R_P and its orthology subgraph R_O]

  36. Satisfiability (Hernandez-Rosales et al., 2012) If R is a complete graph, then the given set of relationships is satisfiable iff R_O is P4-free (equivalently, iff R_P is P4-free). [Figure: R, R_P and R_O on a, b, c, d]
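
For small examples the P4-free condition can be tested by brute force. The sketch below is an illustration, not the authors' implementation: it looks for an induced path on 4 vertices by trying every ordered 4-tuple of genes, which is only practical for a handful of genes. The function name and the edge-list encoding are assumptions.

```python
from itertools import combinations, permutations

def has_induced_p4(vertices, edges):
    """True if the graph contains an induced path w - x - y - z."""
    edge_set = {frozenset(e) for e in edges}

    def adjacent(u, v):
        return frozenset((u, v)) in edge_set

    for quad in combinations(vertices, 4):
        for w, x, y, z in permutations(quad):
            path = adjacent(w, x) and adjacent(x, y) and adjacent(y, z)
            chords = adjacent(w, y) or adjacent(w, z) or adjacent(x, z)
            if path and not chords:
                return True
    return False
```

On the example of slide 34, the orthology edges (a,b), (a,c), (c,d) form the induced path b - a - c - d, so the relationships are not satisfiable. If (b,c) is taken as an orthology instead of a paralogy, no induced P4 remains.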

  37. Unknown relationships Orthologs = (a,b) (a,c) (c,d) Paralogs = (a,d) (b,d) The (b,c) relationship is unknown. Our relationships are satisfiable iff we can decide the (b,c) relationship such that R_O will be P4-free. [Figure, shown on slides 37 and 38: graph R on a, b, c, d with no edge between b and c]

  39. Unknown relationships Orthologs = (a,b) (a,c) (c,d) Paralogs = (a,d) (b,d) The (b,c) relationship is unknown. Our relationships are satisfiable iff we can decide the (b,c) relationship such that R_O will be P4-free. This problem is equivalent to the Graph Sandwich Problem on the class of cographs. [Figure: graph R on a, b, c, d]

  40. Satisfiability Theorem (Golumbic, Kaplan and Shamir, 1994): A relationship graph R is satisfiable iff at least one of the following holds: 1) R_O is disconnected, and each of its components is satisfiable 2) R_P is disconnected, and each of its components is satisfiable
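
The theorem translates directly into a recursive test. The Python sketch below is an illustration under two assumptions the slide leaves implicit: a single gene is trivially satisfiable, and "component" means the relationships restricted to the genes of that connected component. Function names and the edge-list encoding are hypothetical.

```python
def components(vertices, edges):
    """Connected components of the graph induced on `vertices` by `edges`."""
    vertices = set(vertices)
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            adj[u].add(v)
            adj[v].add(u)
    seen, comps = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def satisfiable(genes, orthologs, paralogs):
    """Recursive test following the two cases of the theorem."""
    genes = set(genes)
    if len(genes) <= 1:
        return True  # assumed base case: one gene has no constraints
    for edges in (orthologs, paralogs):  # R_O first, then R_P
        comps = components(genes, edges)
        if len(comps) > 1 and all(
            satisfiable(c, orthologs, paralogs) for c in comps
        ):
            return True
    return False
```

On slide 10's relationships (with (b,c) unknown) R_P is disconnected and the recursion succeeds. Once (b,c) is added as a paralogy, as on slide 13, both R_O and R_P are connected and the test fails.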

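  41. Constructing a gene tree [Figure: relationship graph R on the genes a, b, c, d, e, f, g]

  42. Constructing a gene tree R_P is connected, nothing to do here. [Figure: relationship graph R on a, b, c, d, e, f, g]

  43. Constructing a gene tree R_O has 2 components, X and Y. [Figure: relationship graph R with the components X and Y outlined]

  44. Constructing a gene tree R_O has 2 components, X and Y. All edges going from X to Y are either black or blue (paralogy or unknown). [Figure: relationship graph R with the components X and Y outlined]

  45. Constructing a gene tree R_O has 2 components, X and Y. All edges going from X to Y are either black or blue (paralogy or unknown). Make it all blue! [Figure: relationship graph R with the components X and Y outlined]

  46. Constructing a gene tree Now, all genes of X are paralogs of all genes of Y. We can start building our gene tree as such: [Figure: relationship graph R with X and Y outlined, next to a partial gene tree whose root has the children X and Y]

  47. Constructing a gene tree Repeat with X, and Y. [Figure: R_P[X] and R_O[X] on the genes a, b, c, next to the partial gene tree with children X and Y]

  48. Constructing a gene tree Repeat with X, and Y. [Figure: R_P[X] and R_O[X] on the genes a, b, c, next to the partial gene tree in which X is replaced by a subtree on a, b, c]

  49. Constructing a gene tree Repeat with X, and Y. [Figure: partial gene tree with a subtree on a, b, c and a child Y]

  50. Constructing a gene tree Repeat with X, and Y. [Figure: R_P[Y] on the genes d, e, f, g, next to the gene tree with leaves a, b, c, d, e, f, g]

  51. Constructing a gene tree [Figure: relationship graph R on a, b, c, d, e, f, g, next to the finished gene tree with leaves a, b, c, d, e, f, g]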
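
The construction walked through on slides 41 to 51 can be sketched as a recursion that returns a labeled tree instead of a yes/no answer. This is an illustrative reading of the slides, not the authors' code: when R_O is disconnected its components are joined under a duplication node (all cross pairs become paralogs), and when R_P is disconnected its components are joined under a speciation node. The tuple encoding and names are assumptions.

```python
def components(vertices, edges):
    """Connected components of the graph induced on `vertices` by `edges`."""
    vertices = set(vertices)
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            adj[u].add(v)
            adj[v].add(u)
    seen, comps = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def build_ds_tree(genes, orthologs, paralogs):
    """Return a (label, children) tree displaying the relationships, or None."""
    genes = sorted(genes)
    if len(genes) == 1:
        return genes[0]
    # R_O disconnected -> duplication root; R_P disconnected -> speciation root.
    for edges, label in ((orthologs, "dup"), (paralogs, "spec")):
        comps = components(genes, edges)
        if len(comps) > 1:
            subtrees = [build_ds_tree(c, orthologs, paralogs) for c in comps]
            if all(t is not None for t in subtrees):
                return (label, subtrees)
    return None  # both graphs connected: no DS-tree exists
```

On the relationships of slide 10, the sketch puts c under a speciation root next to a duplication node that separates d from the orthologs a and b, which decides the unknown (b,c) pair as orthologs. It pays no attention to a species tree, which is the gap the following slides address.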

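  52. Consistency with a species tree S [Figure: species tree S and the constructed gene tree G, both over a, b, c, d, e, f, g]

  53. Consistency with a species tree Consistency with S: if genes from species sets X, Y are separated by speciation in G, then species X, Y are separated in S. [Figure: species tree S and gene tree G, both over a, b, c, d, e, f, g]

  54. Consistency with a species tree Consistency with S: if genes from species sets X, Y are separated by speciation in G, then species X, Y are separated in S. Inconsistent! [Figure: species tree S and gene tree G, both over a, b, c, d, e, f, g]

  55. Careful component selection Problem: at this step Y, we chose to separate {e,g} from {f,d} by speciation, contradicting S. [Figure: R_P[Y] on the genes d, e, f, g, species tree S over a, b, c, d, e, f, g, and the subtree of G on e, g, f, d]

  56. Careful component selection [Figure: species tree S with leaves a, b, c, d and a graph on the genes a, b, c, d]

  57. Careful component selection [Figure: species tree S with leaves a, b, c, d, a graph on the genes a, b, c, d, and its subgraph R_P]

  58. Careful component selection S does not separate {a,c} from {b}. NOT CAREFUL. [Figure: species tree S with leaves a, b, c, d, subgraph R_P on the genes a, b, c, d, and a gene tree with leaves a, c, b, d]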
