


  1. 2/25/09
     CSCI1950-Z Computational Methods for Biology, Lecture 9
     Ben Raphael, February 23, 2009
     http://cs.brown.edu/courses/csci1950-z/

     Outline
     Searching Through Trees
       1. Branch-swapping: NNI, SPR, TBR.
       2. MCMC
     Consensus Trees and Supertrees

  2. Heuristic Search
       1. Start with an arbitrary tree T.
       2. Check the "neighbors" of T.
       3. Move to a neighbor if it provides the best improvement in parsimony/likelihood score.
     Caveat: the search could get stuck in a local optimum and not achieve the global optimum.

     Trees and Splits
     Given a set X, a split is a partition of X into two non-empty subsets A and B, written X = A | B.
     For a phylogenetic tree T with leaves L, each edge e defines a split $L_e = A \mid B$, where A and B are the leaves in the two subtrees obtained by removing e.
     [Figure: an edge e separating leaf sets A and B]
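The definition of the split induced by an edge can be sketched in code. This is a minimal illustration, not part of the lecture: it assumes the tree is written as nested Python tuples with string leaf labels (an arbitrary rooting of the unrooted tree), and the names `leaf_set` and `splits` are hypothetical.

```python
def leaf_set(tree):
    """Return the set of leaf labels below a nested-tuple tree node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Return the set of splits A | B, one per edge of the tree."""
    all_leaves = leaf_set(tree)
    found = set()

    def visit(node):
        below = leaf_set(node)
        rest = all_leaves - below
        if rest:  # the root has no edge above it, so it defines no split
            # A split is unordered: A | B is the same as B | A.
            found.add(frozenset([below, rest]))
        if not isinstance(node, str):
            for child in node:
                visit(child)

    visit(tree)
    return found


# Hypothetical example tree on five leaves.
example_tree = (("a", "b"), ("c", ("d", "e")))
for side_1, side_2 in map(tuple, splits(example_tree)):
    print(sorted(side_1), "|", sorted(side_2))
```

For this five-leaf tree the set contains five trivial splits (one leaf against the rest) and two non-trivial ones, one per edge of the unrooted tree.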

  3. Computing the Splits Metric
     A phylogenetic tree T defines a collection of splits $\Sigma(T) = \{ L_e \mid e \text{ is an edge in } T \}$.
     Theorem: $\rho(T_1, T_2) = |\Sigma(T_1) \setminus \Sigma(T_2)| + |\Sigma(T_2) \setminus \Sigma(T_1)| = |\Sigma(T_1)| + |\Sigma(T_2)| - 2\,|\Sigma(T_1) \cap \Sigma(T_2)|$
     Proof: (whiteboard)
     Notation: $A \setminus B = \{x : x \in A,\ x \notin B\}$

     Nearest Neighbor Interchange (NNI)
     Rearrange the four subtrees defined by one internal edge.
     Claim: The number of NNI neighbors of a binary tree with n leaves is 2(n-3).
     Proof: (whiteboard)
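The two forms of the splits metric in the theorem can be checked on a small example. This is a sketch under stated assumptions: each split is stored as a frozenset of two frozensets, the two split collections are made up for illustration, and only non-trivial splits are listed (both trees share every trivial split, so those cancel in either form).

```python
def split(a, b):
    """Build an unordered split A | B from two iterables of leaf labels."""
    return frozenset([frozenset(a), frozenset(b)])


def splits_metric(sigma_1, sigma_2):
    """rho(T1, T2): the number of splits found in exactly one of the two trees."""
    return len(sigma_1 - sigma_2) + len(sigma_2 - sigma_1)


# Hypothetical non-trivial splits of two trees on leaves a..e.
sigma_1 = {split("ab", "cde"), split("abc", "de")}
sigma_2 = {split("ab", "cde"), split("abd", "ce")}

rho = splits_metric(sigma_1, sigma_2)
# Second form from the theorem: |S1| + |S2| - 2 |S1 & S2|
rho_alt = len(sigma_1) + len(sigma_2) - 2 * len(sigma_1 & sigma_2)
print(rho, rho_alt)
```

Both expressions give the same value because every split is either shared by the two trees or counted once in exactly one of the set differences.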

  4. Subtree Pruning and Regrafting (SPR)
       1. Remove a branch.
       2. Reconnect the incident vertex by subdividing a branch.
     Claim: The number of SPR neighbors of a binary tree is 2(n-3)(2n-7).
     Proof: (whiteboard)
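To get a feel for how much larger the SPR neighborhood is than the NNI neighborhood, the two claimed counts can be tabulated for a few values of n. The function names below are hypothetical, and the code simply evaluates the formulas stated in the claims (2(n-3) for NNI, from the previous slide, and 2(n-3)(2n-7) for SPR).

```python
def nni_neighbor_count(n):
    """Number of NNI neighbors of a binary tree with n leaves, per the claim."""
    return 2 * (n - 3)


def spr_neighbor_count(n):
    """Number of SPR neighbors of a binary tree with n leaves, per the claim."""
    return 2 * (n - 3) * (2 * n - 7)


for n in (4, 5, 10, 20):
    print(n, nni_neighbor_count(n), spr_neighbor_count(n))
```

The NNI count grows linearly in n while the SPR count grows quadratically, so an SPR search examines many more candidates per step.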

  5. Tree Bisection and Reconnection (TBR)
       1. Remove a branch.
       2. Reconnect the subtrees by adding a new branch that subdivides a branch in each.

     Relationship between Operations
       • Every NNI is an SPR and every SPR is a TBR.
       • Every TBR is a single SPR or a composition of two SPRs.
       • All three types of operations are invertible: if $T \xrightarrow{\alpha} T'$, then $T' \xrightarrow{\alpha^{-1}} T$.
     Theorem: For all T and T' in B(n), there is a sequence of NNI (or SPR or TBR) operations that transforms T into T'.
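The invertibility statement can be checked for the simplest case, an NNI around a single internal edge. This is a minimal sketch, assuming the four subtrees hanging off the edge are represented by the placeholder labels "A" to "D"; the names `quartet` and `nni_neighbors` are hypothetical.

```python
def quartet(p, q, r, s):
    """Arrangement with subtrees p, q on one side of the internal edge and r, s on the other."""
    return frozenset([frozenset([p, q]), frozenset([r, s])])


def nni_neighbors(arrangement):
    """Return the two arrangements reachable by one NNI across the internal edge."""
    side_1, side_2 = (sorted(side) for side in arrangement)
    a, b = side_1
    c, d = side_2
    # Swap one subtree from each side of the edge; there are two distinct ways.
    return {quartet(a, c, b, d), quartet(a, d, b, c)}


start = quartet("A", "B", "C", "D")
for neighbor in nni_neighbors(start):
    # Each NNI can be undone by another NNI across the same edge.
    print(start in nni_neighbors(neighbor))
```

Each internal edge contributes exactly two rearrangements, which is where the factor 2 in the NNI neighbor count comes from.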

  6. Relationship between Operations
     [Figure: diagram relating NNI, SPR, and TBR]

     Heuristic Search
       1. Start with an arbitrary tree T.
       2. Check the "neighbors" of T.
       3. Move to a neighbor if it provides the best improvement in parsimony/likelihood score.
     PAUP* (a widely used phylogenetic package) includes the command:
       hsearch nreps = num swap = type
     where type = NNI, SPR, or TBR.
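The three heuristic search steps amount to greedy hill climbing. The sketch below is not PAUP*'s implementation; it is a generic loop in which `neighbors` and `score` are caller-supplied functions (hypothetical names), lower scores are better as with parsimony, and a toy one-dimensional landscape stands in for tree space.

```python
def hill_climb(start, neighbors, score):
    """Greedy search: move to the best-scoring neighbor until none improves."""
    current = start
    while True:
        best = min(neighbors(current), key=score, default=current)
        if score(best) >= score(current):
            return current  # no neighbor improves the score: a local optimum
        current = best


# Toy landscape (an assumption for illustration): states are list indices,
# neighbors are adjacent indices, and the score is the list entry.
scores = [3, 1, 2, 5, 0, 4]


def neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(scores)]


def score(i):
    return scores[i]


print(hill_climb(0, neighbors, score))  # stops at a local optimum
print(hill_climb(3, neighbors, score))  # reaches the global optimum
```

Starting from index 0 the search stops at index 1 even though index 4 scores better, which is one reason a search may be repeated from several starting trees.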

  7. From Likelihood to Bayesian
     Given data $X = (x_1, \ldots, x_n)$, we found the tree T and branch lengths $t^*$ that maximized the likelihood $\Pr[X \mid T, t^*]$.
     What about other trees? Could we compute $\Pr[T, t^* \mid X]$?

     Back to Coin Flipping
     Flip a coin with p = Pr[heads] unknown.
     Earlier we computed the maximum likelihood estimate of p, using $L(p) = \Pr[D \mid p]$.
     $\Pr[p \mid D] = \Pr[p, D] / \Pr[D] = \Pr[D \mid p]\,\Pr[p] / \Pr[D]$
     [Figure: prior and posterior for p after 11 tosses (5 heads) and after 44 tosses (20 heads)]
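The coin-flipping posterior can be approximated numerically on a grid of values of p. This is a sketch under an assumption the slide does not state: the prior is taken to be uniform on [0, 1]. The function names are hypothetical.

```python
def posterior_on_grid(heads, tosses, grid_size=1001):
    """Approximate Pr[p | D] on an evenly spaced grid of p values."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0] * grid_size  # assumed uniform prior Pr[p]
    unnormalized = [
        p ** heads * (1 - p) ** (tosses - heads) * pr  # Pr[D | p] Pr[p]
        for p, pr in zip(grid, prior)
    ]
    total = sum(unnormalized)  # plays the role of Pr[D]
    return grid, [u / total for u in unnormalized]


def mean_and_sd(grid, posterior):
    """Posterior mean and standard deviation of p."""
    mean = sum(p * w for p, w in zip(grid, posterior))
    variance = sum((p - mean) ** 2 * w for p, w in zip(grid, posterior))
    return mean, variance ** 0.5


for heads, tosses in [(5, 11), (20, 44)]:
    grid, posterior = posterior_on_grid(heads, tosses)
    print(tosses, mean_and_sd(grid, posterior))
```

Both data sets have the same fraction of heads, so the posteriors are centered in nearly the same place, but the posterior after 44 tosses is noticeably narrower.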

  8. Bayesian Methods
     Bayes' Theorem:
     $\Pr[T, t^* \mid X] = \frac{\Pr[X, T, t^*]}{\Pr[X]} = \frac{\Pr[X \mid T, t^*]\,\Pr[T, t^*]}{\Pr[X]} = \frac{\Pr[X \mid T, t^*]\,\Pr[T, t^*]}{\sum_{T', t'} \Pr[X \mid T', t']\,\Pr[T', t']}$
     Here $\Pr[T, t^* \mid X]$ is the posterior and $\Pr[T, t^*]$ is the prior.
     Problem: Cannot compute the denominator.
     Solution: Use the power of Markov chains to draw ("sample") trees according to the distribution $\Pr[T, t^* \mid X]$.
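One way to see why the denominator is out of reach is that it ranges over every tree topology, in addition to all branch lengths. The sketch below is an illustration added here rather than something the slides state: it uses the standard count of unrooted binary trees on n labeled leaves, (2n-5)!! for n >= 3, and the function name is hypothetical.

```python
def num_unrooted_binary_trees(n):
    """(2n-5)!! = 1 * 3 * 5 * ... * (2n-5), for n >= 3 labeled leaves."""
    count = 1
    for k in range(3, 2 * n - 4, 2):
        count *= k
    return count


for n in (4, 5, 10, 20, 50):
    print(n, num_unrooted_binary_trees(n))
```

Already at n = 10 there are over two million topologies, so enumerating every term of the sum is not practical for realistic data sets.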

  9. Markov Chain Monte Carlo
     To sample from a distribution π:
       • Define a Markov chain with equilibrium distribution π.
       • Simulate the chain through many transitions.
       • After many transitions (e.g. ~10000), the chain will be at equilibrium π ("burn-in").
       • Output every n-th state (n ~ 50).
     Example: the Jukes-Cantor model of DNA, with states A, C, G, T and equilibrium distribution $q_A = q_C = q_G = q_T = 1/4$.

     MCMC on Trees
       1. Define a Markov chain:
            • States are trees T.
            • Equilibrium distribution is the posterior $\Pr[T, t^* \mid X]$.
       2. Simulate the Markov chain for many steps (burn-in).
       3. Output T from every n-th (e.g. n = 50) step.
     [Figure: NNI neighborhood for trees with 5 leaves]
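The burn-in and thinning recipe can be tried on the Jukes-Cantor example. This is a sketch with assumptions: it uses a discrete-time simplification in which each step changes the base with probability `change_prob` (a hypothetical parameter) and picks the new base uniformly from the other three; the burn-in and thinning values follow the slide.

```python
import random
from collections import Counter

BASES = "ACGT"


def step(state, rng, change_prob=0.3):
    """One transition: with probability change_prob, move to one of the other three bases."""
    if rng.random() < change_prob:
        return rng.choice([b for b in BASES if b != state])
    return state


def sample_chain(n_samples, burn_in=10_000, thin=50, seed=0):
    """Discard burn_in transitions, then keep every thin-th state."""
    rng = random.Random(seed)
    state = "A"
    for _ in range(burn_in):
        state = step(state, rng)
    samples = []
    while len(samples) < n_samples:
        for _ in range(thin):
            state = step(state, rng)
        samples.append(state)
    return samples


freqs = Counter(sample_chain(4000))
for base in BASES:
    print(base, freqs[base] / 4000)
```

The printed frequencies should all be close to 1/4, the equilibrium distribution, regardless of the starting base.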

  10. MCMC on Trees
      For transitions, can use NNI, SPR, TBR, or other operations.
      Can define* the transition probabilities of this Markov chain without computing $Z = \sum_{T', t'} \Pr[X \mid T', t']\,\Pr[T', t']$ (Metropolis algorithm).
      *"involves burning of incense, casting of chicken bones, use of magical incantations, and invoking the opinions of more prestigious colleagues." -- Felsenstein

      How Many Times Did Wings Evolve?
        • Previous studies had shown loss of wings: winged → wingless transitions.
        • Gain of wings (wingless → winged transition) appears to be much more complicated.
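The key point of the Metropolis algorithm, that only ratios of unnormalized probabilities are needed, can be shown on a toy state space. This is a minimal sketch, not a tree sampler: the four states, their weights, and the ring-shaped proposal are made up for illustration, and the proposal is symmetric, as the basic Metropolis rule requires.

```python
import random
from collections import Counter


def metropolis(unnormalized, propose, start, n_steps, seed=0):
    """Run a Metropolis chain and return the list of visited states."""
    rng = random.Random(seed)
    state = start
    visited = []
    for _ in range(n_steps):
        candidate = propose(state, rng)
        # The normalizing constant Z cancels in this ratio.
        ratio = unnormalized(candidate) / unnormalized(state)
        if rng.random() < min(1.0, ratio):
            state = candidate
        visited.append(state)
    return visited


# Hypothetical target: four states on a ring, weights known only up to Z.
weights = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}


def propose(state, rng):
    """Symmetric proposal: step to the left or right neighbor on the ring."""
    return (state + rng.choice([-1, 1])) % 4


visits = Counter(metropolis(weights.get, propose, start=0, n_steps=200_000))
total = sum(visits.values())
for s in sorted(weights):
    print(s, visits[s] / total)
```

The visit frequencies approach the weights divided by their sum, even though that sum is never used inside `metropolis`. For trees, the states would be tree topologies with branch lengths and the proposal would be an operation such as NNI.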

  11. Phylogeny of Insects (Nature 2003)
      Build a phylogeny of winged and wingless stick insects.
      Used data from:
        • 18S ribosomal DNA (~1,900 base pairs (bp))
        • 28S rDNA (2,250 bp)
        • Portion of histone 3 (H3, 372 bp)
      Used multiple tree reconstruction techniques.

      Most Parsimonious Evolutionary Tree of Winged and Wingless Insects
        • All most parsimonious reconstructions gave a wingless ancestor.
        • All required multiple transitions between winged and wingless.

  12. Most Parsimonious Evolutionary Tree of Winged and Wingless Insects
      [Figure: most parsimonious evolutionary tree]

      Will Wingless Insects Fly Again?
        • All most parsimonious reconstructions required the re-invention of wings.
        • It is likely that wing developmental pathways are conserved in wingless stick insects.

  13. Next Questions
        • How to combine/merge trees?
        • How to determine "confidence" in a particular tree/branch?

      Multiple Trees?

  14. Consensus Trees

      Strict Consensus Tree

  15. Strict Consensus
      No non-trivial splits in common! The strict consensus tree is unresolved.

      Splits Equivalence Theorem
      A phylogenetic tree T defines a collection of splits $\Sigma(T) = \{ L_e \mid e \text{ is an edge in } T \}$.
      Splits $A_1 \mid B_1$ and $A_2 \mid B_2$ are pairwise compatible if at least one of $A_1 \cap A_2$, $A_1 \cap B_2$, $B_1 \cap A_2$, and $B_1 \cap B_2$ is the empty set.
      Splits Equivalence Theorem: Let Σ be a collection of splits. There is a phylogenetic tree T such that Σ(T) = Σ if and only if the splits in Σ are pairwise compatible.
      The Pairwise Compatibility Theorem (for binary characters) follows from this theorem.
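The pairwise compatibility condition translates directly into a four-way intersection test. This is a minimal sketch, assuming each split is given as a pair of Python sets; the function names and the example splits are hypothetical.

```python
from itertools import combinations


def compatible(split_1, split_2):
    """True if at least one of the four pairwise intersections is empty."""
    (a1, b1), (a2, b2) = split_1, split_2
    return any(not (x & y) for x in (a1, b1) for y in (a2, b2))


def pairwise_compatible(splits):
    """True if every pair of splits in the collection is compatible."""
    return all(compatible(s, t) for s, t in combinations(splits, 2))


# Hypothetical splits on leaves a..e.
s1 = (set("ab"), set("cde"))
s2 = (set("abc"), set("de"))
s3 = (set("ac"), set("bde"))

print(pairwise_compatible([s1, s2]))      # these two splits fit on one tree
print(pairwise_compatible([s1, s2, s3]))  # s1 and s3 conflict
```

By the Splits Equivalence Theorem, the first collection corresponds to some tree, while no single tree can display all three splits of the second.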

  16. Majority Consensus Tree
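The slides show the consensus trees only as figures, so the following sketch relies on the standard definitions rather than on text from the lecture: the strict consensus keeps the splits present in every input tree, and the majority consensus keeps the splits present in more than half of them. Splits are stored as frozensets of two frozensets, only non-trivial splits are listed, and the three input trees are made up for illustration.

```python
from collections import Counter


def split(a, b):
    """Build an unordered split A | B from two iterables of leaf labels."""
    return frozenset([frozenset(a), frozenset(b)])


def strict_consensus(split_sets):
    """Splits that appear in every tree."""
    return set.intersection(*map(set, split_sets))


def majority_consensus(split_sets):
    """Splits that appear in more than half of the trees."""
    counts = Counter(s for splits in split_sets for s in set(splits))
    return {s for s, c in counts.items() if c > len(split_sets) / 2}


# Hypothetical non-trivial splits of three trees on leaves a..e.
tree_1 = {split("ab", "cde"), split("abc", "de")}
tree_2 = {split("ab", "cde"), split("abd", "ce")}
tree_3 = {split("ac", "bde"), split("abc", "de")}
trees = [tree_1, tree_2, tree_3]

print(len(strict_consensus(trees)))    # no non-trivial split is shared by all
print(len(majority_consensus(trees)))  # splits supported by two of three trees
```

In this example the strict consensus has no non-trivial splits and is therefore unresolved, while the majority consensus still recovers two splits.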
