Tying up loose strands: Defining equations of the strand symmetric model

Colby Long and Seth Sullivant, North Carolina State University. June 8, 2015.


  1. Tying up loose strands: Defining equations of the strand symmetric model. Colby Long and Seth Sullivant, North Carolina State University. June 8, 2015.

  2. Phylogenetic Models
     Problem: Find a tree that represents the evolutionary history of a group of taxa.
     Data:
     - Species 1: ACCGTAGATGACT...
     - Species 2: ACTGTAGATGACT...
     - Species 3: ACCGTACATGACT...

     Latent variable graphical models:
     - Model evolution at a single locus.
     - Give a probability distribution on $n$-tuples of DNA characters.

  3. Phylogenetic Models
     - Tree parameter: a binary leaf-labelled tree $T$ with label set $[n]$.
     - A random variable $X_v$ is associated to each node $v$ of $T$.
     - The state space of each $X_v$ is $\{A, C, G, T\}$.
     - A transition matrix $M^k$ is associated to each edge, with $M^k_{ij} = P(X_v = i \mid X_w = j)$.
     - The entries of the transition matrices are the stochastic (or numerical) parameters.
     - To find the probability of observing a particular state at the leaves, sum over all histories, that is, the possible states of the internal nodes.
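
For the three-leaf claw tree $K_{1,3}$, the sum over histories reduces to a single sum over the state of the one internal node. The display below is a sketch under assumptions not stated on this slide: edge $k$ joins the internal (root) node $w$ to leaf $k$, and $\pi$ denotes the root distribution.

```latex
p_{i_1 i_2 i_3}
  = \sum_{r \in \{A, C, G, T\}} \pi_r \, M^1_{i_1 r} \, M^2_{i_2 r} \, M^3_{i_3 r}
```

Each summand is one history: the root takes state $r$, and each leaf $k$ is reached through the entry $M^k_{i_k r} = P(X_k = i_k \mid X_w = r)$.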

  9. Jukes-Cantor Example
     With rows and columns ordered $A, C, G, T$:

     $$M^k = \begin{pmatrix} \alpha_k & \beta_k & \beta_k & \beta_k \\ \beta_k & \alpha_k & \beta_k & \beta_k \\ \beta_k & \beta_k & \alpha_k & \beta_k \\ \beta_k & \beta_k & \beta_k & \alpha_k \end{pmatrix}, \qquad M^k_{ij} = P(X_v = i \mid X_w = j)$$

     $$p_{CCA} = \pi_A \beta_1 \beta_2 \alpha_3 + \pi_C \alpha_1 \alpha_2 \beta_3 + \pi_G \beta_1 \beta_2 \beta_3 + \pi_T \beta_1 \beta_2 \beta_3$$

     $$\psi_T : \Theta_T \to \Delta^{4^n - 1} \subseteq \mathbb{R}^{4^n}$$

     $M_T = \psi_T(\Theta_T)$ is the model. $V_T = \mathrm{im}(\psi_T)$, and $I_T = I(V_T)$ is the ideal of phylogenetic invariants.
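
The expression for $p_{CCA}$ can be checked numerically by summing over the four possible root states. The following is a minimal sketch, assuming a three-leaf claw tree; the function names and all numeric parameter values are hypothetical and chosen only for illustration.

```python
from itertools import product

STATES = "ACGT"

def jukes_cantor(alpha, beta):
    """Jukes-Cantor matrix as nested dicts: M[i][j] = P(child = i | parent = j)."""
    return {i: {j: (alpha if i == j else beta) for j in STATES} for i in STATES}

def leaf_probability(leaves, pi, matrices):
    """Sum over root states r of pi[r] * prod_k M^k[leaf_k][r] on a claw tree."""
    total = 0.0
    for r in STATES:
        term = pi[r]
        for leaf, M in zip(leaves, matrices):
            term *= M[leaf][r]
        total += term
    return total

# Hypothetical parameters; each pair satisfies alpha + 3 * beta = 1.
params = [(0.7, 0.1), (0.85, 0.05), (0.4, 0.2)]
pi = {"A": 0.1, "C": 0.2, "G": 0.3, "T": 0.4}
matrices = [jukes_cantor(a, b) for a, b in params]

# Brute-force sum over histories for the leaf pattern CCA.
p_cca = leaf_probability("CCA", pi, matrices)

# Closed form from the slide.
(a1, b1), (a2, b2), (a3, b3) = params
closed_form = (pi["A"] * b1 * b2 * a3 + pi["C"] * a1 * a2 * b3
               + pi["G"] * b1 * b2 * b3 + pi["T"] * b1 * b2 * b3)

# The 4^3 leaf-pattern probabilities form a point of the probability simplex.
total = sum(leaf_probability(l, pi, matrices) for l in product(STATES, repeat=3))
```

Here `p_cca` and `closed_form` agree up to floating-point error, and `total` equals 1 up to floating-point error, matching the statement that $\psi_T$ maps into the simplex.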

  10. The Strand Symmetric Model (SSM)
      The Strand Symmetric Model (SSM) reflects the double-stranded structure of DNA. A-T and C-G are always paired, so a mutation in one induces a mutation in the other. We insist that the root distribution satisfies $\pi_A = \pi_T$ and $\pi_C = \pi_G$. Likewise, if we let $\theta_{ij}$ be the entries of the transition matrices,

      $$\theta_{AA} = \theta_{TT}, \quad \theta_{AC} = \theta_{TG}, \quad \theta_{AG} = \theta_{TC}, \quad \theta_{AT} = \theta_{TA},$$
      $$\theta_{CC} = \theta_{GG}, \quad \theta_{CG} = \theta_{GC}, \quad \theta_{CT} = \theta_{GA}, \quad \theta_{GT} = \theta_{CA}.$$

      Given any tree $T$, we want to be able to determine $I_T$ for the SSM.
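
The eight equalities say that the transition matrix is unchanged when both indices are replaced by their complementary bases. A minimal sketch of that check follows; the helper names and the numeric entries are hypothetical.

```python
STATES = "ACGT"
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def is_strand_symmetric(theta, tol=1e-12):
    """True if theta[i][j] == theta[c(i)][c(j)] for all states i, j."""
    return all(
        abs(theta[i][j] - theta[COMPLEMENT[i]][COMPLEMENT[j]]) <= tol
        for i in STATES for j in STATES
    )

def from_free_rows(row_a, row_c):
    """Build a strand symmetric matrix from the 8 free entries in rows A and C."""
    theta = {"A": dict(row_a), "C": dict(row_c)}
    theta["T"] = {COMPLEMENT[j]: row_a[j] for j in STATES}  # theta_T,c(j) = theta_A,j
    theta["G"] = {COMPLEMENT[j]: row_c[j] for j in STATES}  # theta_G,c(j) = theta_C,j
    return theta

# Hypothetical numeric values, chosen only for illustration.
theta = from_free_rows(
    {"A": 0.70, "C": 0.10, "G": 0.15, "T": 0.05},
    {"A": 0.05, "C": 0.80, "G": 0.05, "T": 0.10},
)
symmetric = is_strand_symmetric(theta)

# Perturbing one entry breaks theta_AC = theta_TG.
broken = {i: dict(row) for i, row in theta.items()}
broken["A"]["C"] += 0.01
still_symmetric = is_strand_symmetric(broken)
```

The construction makes the parameter count visible: the 16 entries fall into 8 pairs, so each transition matrix has 8 free entries before any normalization.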

  11. Determining the ideal of the SSM
      Theorem (Casanellas-Sullivant 2005). For any binary phylogenetic tree $T$, the ideal of phylogenetic invariants for the SSM on $T$ can be computed from the ideal of phylogenetic invariants for the claw tree, $I_{SSM}$.
      - Theoretically, $I_{SSM}$ can be computed with elimination.
      - Computing the required Gröbner basis is not feasible in practice.
      - The Fourier transform gives a monomial parameterization for group-based models.
      - We require something analogous for the Strand Symmetric Model.

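  12. Matrix-Valued Group-Based Models ([1])
      Identify states with elements of $\mathbb{Z}_2 \times \{0, 1\}$:

      $$A = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad G = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad T = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad C = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$

      With rows and columns ordered $A, G, T, C$ (so $A, G$ lie over $0 \in \mathbb{Z}_2$ and $T, C$ over $1$):

      $$E = \begin{pmatrix} \theta_1 & \theta_8 & \theta_3 & \theta_2 \\ \theta_7 & \theta_5 & \theta_4 & \theta_6 \\ \theta_3 & \theta_2 & \theta_1 & \theta_8 \\ \theta_4 & \theta_6 & \theta_7 & \theta_5 \end{pmatrix}$$

      $E^{j_1 j_2}_{i_1 i_2} = E^{k_1 k_2}_{i_1 i_2}$ whenever $j_1 - j_2 = k_1 - k_2$ in $\mathbb{Z}_2$.
      This makes the strand symmetric model a matrix-valued group-based model.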
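
The condition on $E$ says that its $2 \times 2$ blocks depend only on the difference of the $\mathbb{Z}_2$ labels, so $E$ has the block form $\begin{pmatrix} B_0 & B_1 \\ B_1 & B_0 \end{pmatrix}$. The sketch below checks this, and also checks the standard fact that a blockwise $\mathbb{Z}_2$ Fourier (Hadamard) change of basis block-diagonalizes such a matrix. The integers standing in for $\theta_1, \dots, \theta_8$ are hypothetical placeholders, and the normalization and index conventions used in the talk may differ from this illustration.

```python
# Row/column order A, G, T, C, as on the slide.
# Integers stand in for theta_1..theta_8 (hypothetical placeholder values).
t = {n: n for n in range(1, 9)}
E = [
    [t[1], t[8], t[3], t[2]],  # A
    [t[7], t[5], t[4], t[6]],  # G
    [t[3], t[2], t[1], t[8]],  # T
    [t[4], t[6], t[7], t[5]],  # C
]

def block(M, a, b):
    """2x2 block of M with row group a and column group b (groups indexed by Z_2)."""
    return [row[2 * b:2 * b + 2] for row in M[2 * a:2 * a + 2]]

def matmul(X, Y):
    """Plain matrix product of two lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Blockwise Z_2 Fourier matrix H = [[I, I], [I, -I]]; note H * H = 2 * I.
H = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, -1, 0],
    [0, 1, 0, -1],
]

# F = H E H equals 2 * (H E H^{-1}), which avoids fractions.
F = matmul(matmul(H, E), H)

B0, B1 = block(E, 0, 0), block(E, 0, 1)
two_sum = [[2 * (B0[i][j] + B1[i][j]) for j in range(2)] for i in range(2)]
two_diff = [[2 * (B0[i][j] - B1[i][j]) for j in range(2)] for i in range(2)]
```

After the change of basis the off-diagonal blocks of `F` vanish and the diagonal blocks are `two_sum` and `two_diff`, i.e. $2(B_0 + B_1)$ and $2(B_0 - B_1)$.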

  13. The Group-Valued Fourier Transform
      In the new coordinates, the parameterization of the cone over the SSM for $K_{1,3}$ is given by

      $$q^{mno}_{ijk} = d^{mm}_{0i} \, e^{nn}_{0j} \, f^{oo}_{0k} + d^{mm}_{1i} \, e^{nn}_{1j} \, f^{oo}_{1k}$$

      if $m + n + o \equiv 0$ in $\mathbb{Z}_2$, and $q^{mno}_{ijk} = 0$ otherwise.
      This is a projection of the space of rank 2 tensors:

      $$Q = \begin{pmatrix} d^0_{00} \\ d^0_{01} \\ d^1_{00} \\ d^1_{01} \end{pmatrix} \otimes \begin{pmatrix} e^0_{00} \\ e^0_{01} \\ e^1_{00} \\ e^1_{01} \end{pmatrix} \otimes \begin{pmatrix} f^0_{00} \\ f^0_{01} \\ f^1_{00} \\ f^1_{01} \end{pmatrix} + \begin{pmatrix} d^0_{10} \\ d^0_{11} \\ d^1_{10} \\ d^1_{11} \end{pmatrix} \otimes \begin{pmatrix} e^0_{10} \\ e^0_{11} \\ e^1_{10} \\ e^1_{11} \end{pmatrix} \otimes \begin{pmatrix} f^0_{10} \\ f^0_{11} \\ f^1_{10} \\ f^1_{11} \end{pmatrix}$$

      $$I_{SSM} = I(\mathrm{Sec}^2(\mathrm{Seg}(\mathbb{P}^3 \times \mathbb{P}^3 \times \mathbb{P}^3))) \cap \mathbb{C}[q^{mno}_{ijk} : m + n + o = 0].$$
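
The parameterization can be evaluated directly to see which Fourier coordinates survive the projection. This is a minimal sketch: the array layout `d[m][r][i]` for $d^{mm}_{ri}$ (and likewise for `e`, `f`) is an assumption made for the example, and the parameter values are random positive integers with no biological meaning.

```python
from itertools import product
import random

def ssm_fourier_coordinates(d, e, f):
    """q[(m, n, o, i, j, k)] = sum_r d[m][r][i] * e[n][r][j] * f[o][r][k]
    when m + n + o is even, and 0 otherwise."""
    q = {}
    for m, n, o, i, j, k in product((0, 1), repeat=6):
        if (m + n + o) % 2 == 0:
            q[(m, n, o, i, j, k)] = sum(
                d[m][r][i] * e[n][r][j] * f[o][r][k] for r in (0, 1)
            )
        else:
            q[(m, n, o, i, j, k)] = 0
    return q

def random_params(rng):
    """Hypothetical parameters indexed as params[m][r][i], all positive."""
    return [[[rng.randint(1, 9) for _ in range(2)] for _ in range(2)]
            for _ in range(2)]

rng = random.Random(0)
d, e, f = (random_params(rng) for _ in range(3))
q = ssm_fourier_coordinates(d, e, f)

# Coordinates kept by the projection: those with m + n + o = 0 in Z_2.
surviving = [key for key in q if (key[0] + key[1] + key[2]) % 2 == 0]
```

Of the $4^3 = 64$ coordinates of the full tensor, 32 satisfy $m + n + o \equiv 0$; these are the variables of the ring $\mathbb{C}[q^{mno}_{ijk} : m + n + o = 0]$ in which $I_{SSM}$ lives.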

  14. A Candidate Ideal
      Using elimination, the same authors found that the generators of $I_{SSM}$ include
      - 32 equations in degree 3,
      - 18 equations in degree 4,
      - 0 equations in degree 5.

      Unknown for degree $\geq 6$.
      Theorem (Long-Sullivant 2014). Let $I_F$ be the ideal generated by the 50 equations found in [1]. Then $I_F = I_{SSM}$.
      We know that $I_F \subseteq I_{SSM}$ and $I_{SSM}$ is prime, so we just need to show:
      1. $\dim(I_F) = \dim(I_{SSM})$.
      2. $I_F$ is prime.

  16. How to show $I_F$ is prime?
      Dimension is easy:
      - Compute $\dim(I_F)$ with Macaulay2.
      - Compute $\dim(I_{SSM})$ as a tropical secant variety [3].
      - $\dim(I_F) = \dim(I_{SSM}) = 20$.

      Lemma [6, Proposition 23]. Let $k$ be a field and $J \subset k[x_1, \dots, x_n]$ be an ideal containing a polynomial $f = g x_1 + h$ with $g, h$ not involving $x_1$ and $g$ a non-zero divisor modulo $J$. Let $J_1 = J \cap k[x_2, \dots, x_n]$ be the elimination ideal. Then $J$ is prime if and only if $J_1$ is prime.
      $J$ not prime $\Rightarrow$ $J_1$ not prime. Given $a, b \notin J$ with $ab \in J$: $a' := (g a - h_d x^{d-1} f) \notin J$, and $a' b \in J_1$ with lower $x_1$-degree.

  17. Proving $I_F$ is prime.
      1. Start with $I_0 = I_F$ and $k = 1$.
      2. Find a polynomial $f_k = g_k x_k + h_k \in I_{k-1}$.
      3. Verify that $g_k$ is not a zero-divisor mod $I_{k-1}$.
      4. Eliminate $x_k$ to obtain the ideal $I_k$.
      5. Generate a decreasing chain of elimination ideals $I_F = I_0 \supset I_1 \supset I_2 \supset \dots \supset \langle 0 \rangle$.

      By repeated application of the lemma, $\langle 0 \rangle$ prime $\Rightarrow$ $I_F$ prime.
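
A toy illustration of such a chain may help. It is not the computation for $I_F$: the ideal below is that of the twisted cubic, chosen only because every step can be checked by hand, and the variables are named $x, y, z$ with $z$ eliminated first.

```latex
\begin{aligned}
I_0 &= \langle\, y - x^2,\; z - x^3 \,\rangle \subset k[x, y, z],
    & f_1 &= 1 \cdot z + (-x^3), & g_1 &= 1, \\
I_1 &= I_0 \cap k[x, y] = \langle\, y - x^2 \,\rangle,
    & f_2 &= 1 \cdot y + (-x^2), & g_2 &= 1, \\
I_2 &= I_1 \cap k[x] = \langle 0 \rangle.
\end{aligned}
```

In both steps $g_k = 1$ is trivially a non-zero divisor. Since $\langle 0 \rangle$ is prime in $k[x]$, the lemma gives that $I_1$ is prime, and a second application gives that $I_0$ is prime. In the real computation the $g_k$ are not constants, which is why the non-zero-divisor check in step 3 carries the work.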

  18. The Result

      $$I_{SSM} = I(\mathrm{Sec}^2(\mathrm{Seg}(\mathbb{P}^3 \times \mathbb{P}^3 \times \mathbb{P}^3))) \cap \mathbb{C}[q^{mno}_{ijk} : m + n + o = 0].$$

      To reduce computation time:
      - Take advantage of the group action on $I_F$.
      - Eliminate variables in a particular order.

      We show $I_F = I_{SSM}$, and therefore we can determine the ideal for the strand symmetric model for any binary tree $T$.

  19. Another Application: CFN mixture models
      The CFN model is a two-state group-based phylogenetic model. Mixture models correspond to join varieties.
      Goal: Find a generating set for the ideal of phylogenetic invariants for two-tree CFN mixtures on the same tree.
      [Figure: snowflake and caterpillar trees]
      $I_S * I_S$ is generated by 32 equations in degree 3 and 18 equations in degree 4.
