Group theoretic formalization of double-cut-and-join model of chromosomal rearrangement


  1. Group theoretic formalization of double-cut-and-join model of chromosomal rearrangement
     Sangeeta Bhatia
     PhD Supervisor: Prof. Andrew Francis
     Centre for Research in Mathematics, University of Western Sydney
     7th November 2013

  2. Rare is better – large scale mutations
     ◮ Large-scale genome rearrangements, such as insertions or deletions of genes, gene duplications and inversions of genes, make good phylogenetic markers precisely because they are rare.
     ◮ Our focus: determining a measure of difference between species based on such large-scale genome rearrangements.
     ◮ Our tool: algebra/group theory.

  5. An example – Double cut and join
     ◮ Genome representation – graph.
     ◮ Rearrangement events
        ◮ Inversion of a section
        ◮ Translocation of a section
        ◮ Fission/Fusion of strands

  9. Double-cut-and-join: genome representation
     ◮ A “gene” or region has two extremities: a head and a tail.
     ◮ Store “adjacencies”, i.e. which gene extremities are adjacent on the genome.
     ◮ Example
        [Figure: a genome drawn as a graph, with vertices labelled $5_h,4_t$; $1_h,3_t$; $3_h,2_t$; $2_h$; $1_t$; $5_t,4_h$]
        $\{\,1_t,\ \{1_h,3_t\},\ \{3_h,2_t\},\ 2_h,\ \{5_h,4_t\},\ \{5_t,4_h\}\,\}$

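  10. Double cut and join – the cut
      [Figure: a linear chromosome before and after the double cut]
      Before: $1_t,\ \{1_h,2_t\},\ \{2_h,3_t\},\ \{3_h,4_t\},\ 4_h$
      After the cut: $1_t,\ 1_h,\ 2_t,\ \{2_h,3_t\},\ 3_h,\ 4_t,\ 4_h$

  11. Double cut and join operation – inversion
      [Figure: rejoining the cut ends so that the middle section is inverted]
      After the cut: $1_t,\ 1_h,\ 2_t,\ \{2_h,3_t\},\ 3_h,\ 4_t,\ 4_h$
      After the join: $1_t,\ \{1_h,3_h\},\ \{3_t,2_h\},\ \{2_t,4_t\},\ 4_h$

  12. Double cut and join operation – excision
      [Figure: rejoining the cut ends so that the middle section becomes a circular chromosome]
      After the cut: $1_t,\ 1_h,\ 2_t,\ \{2_h,3_t\},\ 3_h,\ 4_t,\ 4_h$
      After the join: $1_t,\ \{1_h,4_t\},\ 4_h$ and $\{2_t,3_h\},\ \{2_h,3_t\}$

  13. Circularization/Linearization
      [Figure: a linear chromosome and the circular chromosome obtained by joining its ends]
      Linear: $1_t,\ \{1_h,2_t\},\ \{2_h,3_t\},\ \{3_h,4_t\},\ 4_h$
      Circular: $\{4_h,1_t\},\ \{1_h,2_t\},\ \{2_h,3_t\},\ \{3_h,4_t\}$

  14. Fusion/Fission
      [Figure: one linear chromosome and the two linear chromosomes obtained by cutting it]
      Fused: $1_t,\ \{1_h,2_t\},\ \{2_h,3_t\},\ \{3_h,4_t\},\ 4_h$
      Split: $1_t,\ \{1_h,2_t\},\ 2_h$ and $3_t,\ \{3_h,4_t\},\ 4_h$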
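
The operations on slides 10 to 14 can be replayed on the adjacency-set representation. The following is a minimal Python sketch, not the presenter's code: the helper names `make_genome`, `cut` and `join` are hypothetical, and a telomere is stored as a one-element set, which is an assumption made here for convenience.

```python
def make_genome(*parts):
    """Build a genome from strings: "1h,2t" is an adjacency, "1t" a telomere."""
    return frozenset(frozenset(part.split(",")) for part in parts)


def cut(genome, p, q):
    """Cut the adjacency {p, q}, leaving p and q as telomeres."""
    adjacency = frozenset([p, q])
    if p == q or adjacency not in genome:
        raise ValueError(f"{p},{q} is not an adjacency of this genome")
    return (genome - {adjacency}) | {frozenset([p]), frozenset([q])}


def join(genome, p, q):
    """Join the telomeres p and q into the adjacency {p, q}."""
    telomeres = {frozenset([p]), frozenset([q])}
    if p == q or not telomeres <= genome:
        raise ValueError(f"{p} and {q} must be two distinct telomeres")
    return (genome - telomeres) | {frozenset([p, q])}


# One linear chromosome carrying genes 1, 2, 3, 4 (the starting point of slide 10).
linear = make_genome("1t", "1h,2t", "2h,3t", "3h,4t", "4h")

# The double cut of slide 10, followed by the two possible rejoinings.
opened = cut(cut(linear, "1h", "2t"), "3h", "4t")
inversion = join(join(opened, "1h", "3h"), "2t", "4t")   # slide 11
excision = join(join(opened, "1h", "4t"), "2t", "3h")    # slide 12

# Operations that need a single cut or a single join.
circular = join(linear, "4h", "1t")   # slide 13
fission = cut(linear, "2h", "3t")     # slide 14
```

Each result can be compared with the adjacency lists read off the corresponding slide, for example `inversion == make_genome("1t", "1h,3h", "3t,2h", "2t,4t", "4h")`.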

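  15. Distance under the DCJ model – Adjacency graph
      [Figure: adjacency graph of two genomes on five genes.
      Top row of vertices: $1_h$; $\{2_t,3_t\}$; $\{2_h,4_t\}$; $3_h$; $\{4_h,5_t\}$; $\{5_h,1_t\}$.
      Bottom row of vertices: $\{1_h,2_t\}$; $\{4_t,3_t\}$; $\{2_h,3_h\}$; $\{1_t,4_h\}$; $\{5_h,5_t\}$.]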

  17. DCJ operator – Our re-formulation
      ◮ We assign a numeric label to each gene extremity. Let $i$ be a gene. Then $i_t \mapsto 2i-1$ and $i_h \mapsto 2i$.
      ◮ Thus if there are $n$ genes, we get $2n$ labels. Let us call this set $X$.
      ◮ A genome on $n$ genes is a permutation $\pi$ on the set $X$ such that $\pi(i) = j \iff \pi(j) = i$.

  19. DCJ operator – Our re-formulation
      ◮ For example, for the genome $\{1_t, (1_h, 2_h), 2_t\}$, the labels are $1_t \mapsto 1$, $1_h \mapsto 2$, $2_t \mapsto 3$, $2_h \mapsto 4$, and it is encoded as
        $$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 3 & 2 \end{pmatrix}$$
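
The labelling and encoding can be sketched in a few lines of Python. This is an illustration only: the function names `label`, `encode` and `is_genome` are hypothetical, and storing the permutation as a dict with telomeres as fixed points is a choice made here.

```python
def label(extremity):
    """Map an extremity such as "3t" or "3h" to its label: i_t -> 2i-1, i_h -> 2i."""
    gene, end = int(extremity[:-1]), extremity[-1]
    if end not in ("t", "h"):
        raise ValueError(f"unknown extremity {extremity!r}")
    return 2 * gene - 1 if end == "t" else 2 * gene


def encode(n_genes, adjacencies):
    """Return the genome as a dict pi on X = {1, ..., 2n}; telomeres stay fixed."""
    pi = {x: x for x in range(1, 2 * n_genes + 1)}
    for a, b in adjacencies:
        i, j = label(a), label(b)
        pi[i], pi[j] = j, i
    return pi


def is_genome(pi):
    """Check the defining property pi(i) = j <=> pi(j) = i."""
    return all(pi[pi[i]] == i for i in pi)


# The genome {1t, (1h, 2h), 2t} from the slide.
pi = encode(2, [("1h", "2h")])
print(pi)  # {1: 1, 2: 4, 3: 3, 4: 2}
```

The printed dict is the two-line notation from the slide, read column by column.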

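  22. DCJ operator – Our re-formulation
      For $i, j \in X$,
      $$D_{ij}(\pi) = \begin{cases} (i\,j)\,\pi\,(i\,j) & \text{if } \pi = \dots(k\,i)(l\,j) \text{ and } k \neq i \text{ or } j \neq l, \\ (i\,j)\,\pi & \text{if } i \text{ and } j \text{ are fixed in } \pi \text{ or } \pi = \dots(i\,j). \end{cases}$$
      ◮ Clearly, $D_{ij} = D_{ji}$.
      ◮ Also, $D_{ij}^2$ is the identity.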
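
A minimal Python sketch of the operator, assuming a genome is stored as a dict on the labels and that products act right to left (the convention that reproduces the product computed on slide 28). The names `transposition` and `dcj` are hypothetical.

```python
def transposition(i, j):
    """Return the swap (i j) as a function on labels."""
    return lambda x: j if x == i else i if x == j else x


def dcj(pi, i, j):
    """Apply D_ij to the genome pi (a dict on the labels 1..2n)."""
    if i == j:
        raise ValueError("i and j must be distinct")
    swap = transposition(i, j)
    if pi[i] == j or (pi[i] == i and pi[j] == j):
        # Second case: (i j) pi cuts the adjacency (i j) or joins two telomeres.
        return {x: swap(pi[x]) for x in pi}
    # First case: the conjugation (i j) pi (i j).
    return {x: swap(pi[swap(x)]) for x in pi}


pi = {1: 1, 2: 4, 3: 3, 4: 2}   # the genome (2 4) from slide 19

print(dcj(pi, 1, 3))  # {1: 3, 2: 4, 3: 1, 4: 2}  (joins the telomeres 1 and 3)
print(dcj(pi, 2, 4))  # {1: 1, 2: 2, 3: 3, 4: 4}  (cuts the adjacency (2 4))
print(dcj(pi, 1, 2))  # {1: 4, 2: 2, 3: 3, 4: 1}  (conjugation by (1 2))
```

On this small genome the two remarks on the slide can be checked directly: `dcj(pi, i, j) == dcj(pi, j, i)` and `dcj(dcj(pi, i, j), i, j) == pi` for every pair of distinct labels.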

  23. KEY RESULTS

  26. Key result # 1 – Structure of the group of $D_{ij}$s
      ◮ Let $\Gamma_n$ be the set of genomic permutations on $n$ regions. $D_{ij}$ is a bijection on $\Gamma_n$.
      ◮ Let $D$ be the subgroup of $S_{\Gamma_n}$ generated by the $D_{ij}$ operators. Let the cardinality of $\Gamma_n$ be $\gamma$. If $\gamma/2$ is even, then $D$ is the alternating group of degree $\gamma$. Otherwise it is the symmetric group of degree $\gamma$.
      ◮ Conjecture: $\gamma/2$ is even $\forall\, n > 2$.
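
The conjecture can be checked numerically for small $n$. The sketch below assumes that $\Gamma_n$ is exactly the set of permutations on $2n$ labels satisfying the condition of slide 17 (the identity included), so that $\gamma$ is the number of such permutations; the function name `count_genomes` is hypothetical.

```python
def count_genomes(n_genes):
    """Count the permutations pi on 2n labels with pi(pi(i)) = i for all i."""
    # a(m) = a(m-1) + (m-1) * a(m-2): label m is either fixed
    # or paired with one of the other m-1 labels.
    previous, current = 1, 1  # a(0), a(1)
    for m in range(2, 2 * n_genes + 1):
        previous, current = current, current + (m - 1) * previous
    return current


for n in range(1, 8):
    gamma = count_genomes(n)
    # gamma / 2 is even exactly when gamma is divisible by 4.
    group = "alternating" if gamma % 4 == 0 else "symmetric"
    print(n, gamma, group)
```

Under this assumption the first values are $\gamma = 2, 10, 76, 764$ for $n = 1, 2, 3, 4$, so $\gamma/2$ is odd for $n \le 2$ and even for the values of $n > 2$ tried here. This is a check on small cases, not a proof.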

  27. Key result # 2 – Characterization of cycles and paths of $AG(A,B)$
      Theorem. Let $A$ and $B$ be genomes and let $\alpha$ be a $k$-cycle in the product $\pi_A\pi_B$. If $\alpha$ contains a point that is fixed in $\pi_A$ or $\pi_B$, then the extremities in $\alpha$ form a path of length $k$ in $AG(A,B)$. If $\alpha$ does not contain any point that is fixed in $\pi_A$ or $\pi_B$, then let $\beta$ be the cycle in $\pi_A\pi_B$ that contains $\pi_B(i)$ for any $i \in \alpha$. Then $\alpha\beta$ is a cycle in $AG(A,B)$.

  28. Characterization of cycles and paths of $AG(A,B)$ – example
      $\pi_A = (1,10)(2)(3,5)(4,7)(6)(8,9)$
      $\pi_B = (1,8)(2,3)(4,6)(5,7)(9,10)$
      [Figure: adjacency graph $AG(A,B)$, each vertex shown with its cycle.
      Top row (genome $A$): $\{2_t,3_t\}$: $(3,5)$; $\{2_h,4_t\}$: $(4,7)$; $3_h$: $(6)$; $\{4_h,5_t\}$: $(8,9)$; $\{5_h,1_t\}$: $(1,10)$; $1_h$: $(2)$.
      Bottom row (genome $B$): $\{1_h,2_t\}$: $(2,3)$; $\{4_t,3_t\}$: $(5,7)$; $\{2_h,3_h\}$: $(6,4)$; $\{1_t,4_h\}$: $(1,8)$; $\{5_h,5_t\}$: $(10,9)$.]
      $\pi_A\pi_B = (1,9)(8,10)(2,5,4,6,7,3)$
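
The product in this example can be recomputed with a short Python sketch. It assumes that products act right to left, $(\pi_A\pi_B)(x) = \pi_A(\pi_B(x))$, which is the convention that yields the cycles printed on the slide; the helper names are hypothetical.

```python
def from_cycles(n_labels, cycles):
    """Build a permutation on 1..n_labels (as a dict) from disjoint cycles."""
    pi = {x: x for x in range(1, n_labels + 1)}
    for cycle in cycles:
        for k, x in enumerate(cycle):
            pi[x] = cycle[(k + 1) % len(cycle)]
    return pi


def compose(p, q):
    """Return the product p q, acting right to left: (p q)(x) = p(q(x))."""
    return {x: p[q[x]] for x in q}


def cycles_of(pi):
    """Nontrivial cycles of pi, each written starting from its smallest label."""
    seen, result = set(), []
    for start in sorted(pi):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = pi[x]
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result


pi_a = from_cycles(10, [(1, 10), (3, 5), (4, 7), (8, 9)])
pi_b = from_cycles(10, [(1, 8), (2, 3), (4, 6), (5, 7), (9, 10)])
print(cycles_of(compose(pi_a, pi_b)))
# [(1, 9), (2, 5, 4, 6, 7, 3), (8, 10)]
```

The 6-cycle contains the labels 2 and 6, which are the fixed points of $\pi_A$, so by the theorem it corresponds to a path. The cycle $(1,9)$ contains no fixed point, and $\pi_B(1) = 8$ lies in $(8,10)$, so the two 2-cycles together correspond to one cycle of the adjacency graph.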

  35. Key result # 3 – DCJ Distance
      $$d_{DCJ}(\pi_A, \pi_B) = \frac{l(\pi_A\pi_B)}{2} + \frac{E}{2}$$
      where $l(\pi_A\pi_B)$ is the length of $\pi_A\pi_B$ and $E$ is the number of cycles in $\pi_A\pi_B$ that move two fixed points of $\pi_A$ or of $\pi_B$.
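
A Python sketch of this formula follows. Two readings are assumed, since the slide does not spell them out: the length of a permutation is taken to be the minimal number of transpositions whose product it is (the sum of cycle length minus one over all cycles), and $E$ counts the cycles that contain two fixed points of $\pi_A$ or two fixed points of $\pi_B$. The function names are hypothetical.

```python
def from_cycles(n_labels, cycles):
    """Build a permutation on 1..n_labels (as a dict) from disjoint cycles."""
    pi = {x: x for x in range(1, n_labels + 1)}
    for cycle in cycles:
        for k, x in enumerate(cycle):
            pi[x] = cycle[(k + 1) % len(cycle)]
    return pi


def all_cycles(pi):
    """Return every cycle of pi, fixed points included, as a set of labels."""
    seen, cycles = set(), []
    for start in pi:
        if start in seen:
            continue
        cycle, x = set(), start
        while x not in cycle:
            cycle.add(x)
            x = pi[x]
        seen |= cycle
        cycles.append(cycle)
    return cycles


def dcj_distance(pi_a, pi_b):
    """Evaluate (l(pi_a pi_b) + E) / 2 under the assumptions stated above."""
    product = {x: pi_a[pi_b[x]] for x in pi_b}   # right-to-left product
    fixed_a = {x for x in pi_a if pi_a[x] == x}
    fixed_b = {x for x in pi_b if pi_b[x] == x}
    cycles = all_cycles(product)
    length = sum(len(c) - 1 for c in cycles)
    e = sum(1 for c in cycles
            if len(c & fixed_a) == 2 or len(c & fixed_b) == 2)
    if (length + e) % 2:
        raise ValueError("length + E is odd; the inputs may not be genomes")
    return (length + e) // 2


# The genomes of slide 28.
pi_a = from_cycles(10, [(1, 10), (3, 5), (4, 7), (8, 9)])
pi_b = from_cycles(10, [(1, 8), (2, 3), (4, 6), (5, 7), (9, 10)])
print(dcj_distance(pi_a, pi_b))  # 4
```

For the genomes of slide 28 the product has length $1 + 1 + 5 = 7$ and one cycle containing both fixed points of $\pi_A$, giving $(7 + 1)/2 = 4$.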

  36. Key result # 4 – Number of sorting scenarios
      Let $\pi_A$ and $\pi_B$ be genomic permutations on $n$ regions such that $\pi_B\pi_A$ encodes a cycle in the adjacency graph $AG(A,B)$. Then the number of optimal sorting scenarios between $\pi_A$ and $\pi_B$ is $n^{n-2}$.

  40. An example
      Let $\pi_a = (1,8)(2,3)(4,5)(6,7)$, $\pi_b = (1,2)(3,4)(5,6)(7,8)$.
      $D_{28}(\pi_a) = (1,2)(8,3)(4,5)(6,7)$
      $D_{48}D_{28}(\pi_a) = (1,2)(4,3)(8,5)(6,7)$
      $D_{68}D_{48}D_{28}(\pi_a) = (1,2)(3,4)(5,6)(7,8)$
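
The three steps of this example can be replayed in Python. The sketch assumes the operations are instances of the operator $D_{ij}$ defined on slide 22, with genomes stored as dicts; `from_cycles` and `dcj` are hypothetical helper names.

```python
def from_cycles(n_labels, cycles):
    """Build a permutation on 1..n_labels (as a dict) from disjoint cycles."""
    pi = {x: x for x in range(1, n_labels + 1)}
    for cycle in cycles:
        for k, x in enumerate(cycle):
            pi[x] = cycle[(k + 1) % len(cycle)]
    return pi


def dcj(pi, i, j):
    """Apply D_ij to the genome pi (a dict on the labels 1..2n)."""
    swap = lambda x: j if x == i else i if x == j else x
    if pi[i] == j or (pi[i] == i and pi[j] == j):
        return {x: swap(pi[x]) for x in pi}        # (i j) pi
    return {x: swap(pi[swap(x)]) for x in pi}      # (i j) pi (i j)


pi_a = from_cycles(8, [(1, 8), (2, 3), (4, 5), (6, 7)])
pi_b = from_cycles(8, [(1, 2), (3, 4), (5, 6), (7, 8)])

# Apply D_28, then D_48, then D_68, keeping every intermediate genome.
steps = [pi_a]
for i, j in [(2, 8), (4, 8), (6, 8)]:
    steps.append(dcj(steps[-1], i, j))

print(steps[-1] == pi_b)  # True
```

Three operations transform $\pi_a$ into $\pi_b$, and each intermediate genome in `steps` matches the corresponding line of the example.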
