1/27/09
CSCI1950-Z Computational Methods for Biology Lecture 2
Ben Raphael January 26, 2009
http://cs.brown.edu/courses/csci1950-z/
Outline
- Review of trees. Counting features.
- Character-based phylogeny
  - Maximum parsimony
  - Algorithm
Gorilla:    CCTGTGACGTAACAAACGA
Chimpanzee: CCTGTGACGTAGCAAACGA
Human:      CCTGTGACGTAGCAAACGA
2-state character
Non-informative character
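To make the notion of a character concrete, the sketch below treats each alignment column as one character and collects the states observed in it. The names `alignment` and `states_per_column` are illustrative and do not come from the lecture.

```python
# Each column of the alignment is treated as one character.
alignment = {
    "Gorilla":    "CCTGTGACGTAACAAACGA",
    "Chimpanzee": "CCTGTGACGTAGCAAACGA",
    "Human":      "CCTGTGACGTAGCAAACGA",
}

def states_per_column(seqs):
    """Return, for each column, the set of states observed across species."""
    rows = list(seqs.values())
    return [set(column) for column in zip(*rows)]

columns = states_per_column(alignment)

# 1-based positions of the columns that show more than one state.
variable = [pos for pos, states in enumerate(columns, start=1) if len(states) > 1]

print(variable)
print(sorted(columns[variable[0] - 1]))
```

On these three sequences only column 12 varies, and it shows the two states A and G, so it is a 2-state character. Every other column is constant.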
            Value1   Value2
Mouth       Smile    Frown
Eyebrows    Normal   Pointed
[Figure: a parent node in state t with a left child (score s_i, edge cost δ_{i,t}) and a right child (score s_j, edge cost δ_{j,t}).]
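$s_A(v) = \min_i \{ s_i(u) + \delta_{i,A} \} + \min_j \{ s_j(w) + \delta_{j,A} \}$

First term, over the states i of the left child u:

i   s_i(u)   δ_{i,A}   sum
A   0        0         0
T   ∞        3         ∞
G   ∞        4         ∞
C   ∞        9         ∞

The minimum is 0, so the first term contributes 0 to $s_A(v)$.

Second term, over the states j of the right child w:

j   s_j(w)   δ_{j,A}   sum
A   ∞        0         ∞
T   ∞        3         ∞
G   ∞        4         ∞
C   0        9         9

The minimum is 9, so $s_A(v) = 0 + 9 = 9$.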
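The single entry $s_A(v)$ can be checked with a few lines of code. This is a minimal sketch, assuming the left child u is a leaf carrying A and the right child w is a leaf carrying C, which is what the ∞ entries suggest. The names `delta_to_A`, `s_u`, `s_w`, and `best` are illustrative.

```python
import math

INF = math.inf
STATES = ["A", "T", "G", "C"]

# Cost delta_{i,A} of pairing child state i with parent state A.
# The value 0 for A itself is an assumption; 3, 4, 9 appear in the tables.
delta_to_A = {"A": 0, "T": 3, "G": 4, "C": 9}

# Leaf score vectors: 0 for the observed state, infinity for all others.
s_u = {"A": 0, "T": INF, "G": INF, "C": INF}   # assumed: leaf u carries A
s_w = {"A": INF, "T": INF, "G": INF, "C": 0}   # assumed: leaf w carries C

def best(child_scores, delta_col):
    """Minimum over child states i of s_i(child) + delta_{i,t}."""
    return min(child_scores[i] + delta_col[i] for i in STATES)

s_A_v = best(s_u, delta_to_A) + best(s_w, delta_to_A)
print(s_A_v)
```

The left child contributes 0 and the right child contributes 9, giving 9 in total.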
Repeat for T, G, and C
Repeat for right subtree
Repeat for root
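The three "repeat" steps amount to filling in one score vector per internal node, from the leaves up. The sketch below does this for a small example. Note the assumptions: only the A column of the cost matrix (0, 3, 4, 9) is visible in the slides, so the remaining entries of `DELTA`, the tree shape ((A, C), (T, G)), and the leaf states T and G are hypothetical choices made for illustration.

```python
import math

INF = math.inf
STATES = "ATGC"

# Assumed cost matrix: only the A column (0, 3, 4, 9) appears in the slides.
DELTA = {
    "A": {"A": 0, "T": 3, "G": 4, "C": 9},
    "T": {"A": 3, "T": 0, "G": 2, "C": 4},
    "G": {"A": 4, "T": 2, "G": 0, "C": 4},
    "C": {"A": 9, "T": 4, "G": 4, "C": 0},
}

def leaf_scores(state):
    """Score vector of a leaf: 0 for its observed state, infinity otherwise."""
    return {t: (0 if t == state else INF) for t in STATES}

def node_scores(left, right):
    """s_t(v) = min_i{s_i(u) + d(i,t)} + min_j{s_j(w) + d(j,t)} for every t."""
    return {
        t: min(left[i] + DELTA[i][t] for i in STATES)
           + min(right[j] + DELTA[j][t] for j in STATES)
        for t in STATES
    }

# Hypothetical tree ((A, C), (T, G)), processed bottom-up.
left = node_scores(leaf_scores("A"), leaf_scores("C"))
right = node_scores(leaf_scores("T"), leaf_scores("G"))
root = node_scores(left, right)

print(left)
print(right)
print(root)
```

Under these assumptions the smallest root score is 9, reached for T, with 7 coming from the left subtree and 2 from the right.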
At the root the minimum score is 9, attained by T, so label the root with T.
The 9 is derived from 7 + 2, so the left child is T and the right child is T.
And the tree is thus labeled…
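The labeling step can be sketched as a traceback over the score vectors: pick the best state at the root, then at each child pick the state that achieved the minimum for the parent's state. As before, the cost matrix (apart from its A column) and the tree ((A, C), (T, G)) are assumptions, and `forward` and `traceback` are illustrative names.

```python
import math

INF = math.inf
STATES = "ATGC"

# Assumed cost matrix: only the A column (0, 3, 4, 9) appears in the slides.
DELTA = {
    "A": {"A": 0, "T": 3, "G": 4, "C": 9},
    "T": {"A": 3, "T": 0, "G": 2, "C": 4},
    "G": {"A": 4, "T": 2, "G": 0, "C": 4},
    "C": {"A": 9, "T": 4, "G": 4, "C": 0},
}

def forward(tree):
    """Return (scores, kids) for a nested-tuple tree; leaves are state strings."""
    if isinstance(tree, str):
        return {t: (0 if t == tree else INF) for t in STATES}, None
    kids = [forward(sub) for sub in tree]
    scores = {
        t: sum(min(k[0][i] + DELTA[i][t] for i in STATES) for k in kids)
        for t in STATES
    }
    return scores, kids

def traceback(node, state):
    """Label a subtree given the state chosen for its root."""
    scores, kids = node
    if kids is None:
        return state
    # For each child, look up the state that achieved the minimum.
    chosen = [min(STATES, key=lambda i: k[0][i] + DELTA[i][state]) for k in kids]
    return (state, *[traceback(k, c) for k, c in zip(kids, chosen)])

root = forward((("A", "C"), ("T", "G")))      # hypothetical tree
root_state = min(STATES, key=lambda t: root[0][t])
labeled = traceback(root, root_state)

print(root[0][root_state], labeled)
```

Each labeled node is written as (state, left subtree, right subtree). Ties are broken by the order of `STATES`, which is an arbitrary choice.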
How many computations do we perform for n species, m characters, and k states per character?

Forward step:
$s_t(\text{parent}) = \min_i \{ s_i(\text{left child}) + \delta_{i,t} \} + \min_j \{ s_j(\text{right child}) + \delta_{j,t} \}$

Traceback: one “lookup” per internal node, i.e. (n-1) operations.

For each character: (4k - 2)(n-1) + (n-1) operations ≤ C n k
For all m characters: ≤ C m n k operations
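The counts above can be evaluated for concrete sizes. The sketch below takes the stated expressions at face value and does not re-derive them. The function names and the sample values of n, m, and k are illustrative, and the observation that C = 4 is one constant that satisfies the bound is an addition, not something stated in the slides.

```python
def ops_per_character(n, k):
    """Operation count for one character, as stated: forward step plus traceback."""
    forward = (4 * k - 2) * (n - 1)
    traceback = n - 1
    return forward + traceback

def total_ops(n, m, k):
    """Operation count for all m characters."""
    return m * ops_per_character(n, k)

n, m, k = 4, 19, 4   # illustrative sizes only
print(ops_per_character(n, k), total_ops(n, m, k))

# (4k - 2)(n - 1) + (n - 1) = (4k - 1)(n - 1) <= 4nk, so C = 4 works here.
print(ops_per_character(n, k) <= 4 * n * k)
```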
[Figure: tree with leaf states a, c, and t, and state sets {t,a} and {a,c} at internal nodes.]
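The set labels above, such as {t,a} and {a,c}, look like the candidate sets of Fitch's set-based method for unit costs. That identification is an assumption, since the extracted text does not name the method. A minimal sketch of the bottom-up pass of Fitch's method follows, on a hypothetical tree that is not the one from the slides.

```python
def fitch(tree):
    """Return (candidate_set, changes) for a nested-tuple binary tree.

    Leaves are single-state strings. An internal node takes the intersection
    of its children's sets if that is non-empty, otherwise the union, in which
    case one state change is counted.
    """
    if isinstance(tree, str):
        return {tree}, 0
    (left_set, left_cost), (right_set, right_cost) = (fitch(sub) for sub in tree)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1

# Hypothetical tree, chosen only for illustration.
example = (("a", "c"), ("a", ("t", "a")))
states, changes = fitch(example)
print(states, changes)
```

Here the subtrees ("a", "c") and ("t", "a") produce the sets {a,c} and {t,a} with one change each, and the root set is {a} with two changes in total.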
As seen previously:
    A   T   G   C
A   0   1   1   1
T   1   0   1   1
G   1   1   0   1
C   1   1   1   0
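With every change costing 1, the total cost of a labeling is simply the number of edges whose endpoints differ. The sketch below builds such a unit-cost matrix and sums it along a short chain of states. The zero diagonal is an assumption (it is the natural reading of a matrix whose off-diagonal entries are all 1), and `path_cost` is an illustrative helper.

```python
STATES = "ATGC"

# Unit-cost matrix: 0 on the diagonal (assumed), 1 for any change of state.
delta = {i: {j: int(i != j) for j in STATES} for i in STATES}

def path_cost(states):
    """Total cost along consecutive edges of a chain of states."""
    return sum(delta[a][b] for a, b in zip(states, states[1:]))

for row in STATES:
    print(row, [delta[row][col] for col in STATES])

print(path_cost("AACA"))
```

The chain A, A, C, A has two state changes (A to C and C to A), so its cost is 2.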
[Equation fragment: a piecewise definition involving 2n−2 and n, with value 0 otherwise.]
[Figure: tree with leaves a, b, c.]
Note: Root is node 2n-1
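The note on node numbering can be illustrated with a small sketch. It assumes that leaves are numbered 1..n and internal nodes n+1..2n-1 in post-order. Only "root is node 2n-1" is stated in the slides, so the rest of the scheme, the function name `number_nodes`, and the example tree are assumptions. A rooted binary tree with n leaves has n-1 internal nodes, hence 2n-1 nodes and 2n-2 edges.

```python
def number_nodes(tree, n):
    """Number leaves 1..n and internal nodes n+1..2n-1 in post-order.

    Returns (root_id, edges), where edges is a list of (parent, child) pairs.
    """
    edges = []
    next_leaf = iter(range(1, n + 1))
    next_internal = iter(range(n + 1, 2 * n))

    def visit(node):
        if isinstance(node, str):
            return next(next_leaf)
        children = [visit(child) for child in node]
        me = next(next_internal)   # assigned after both children (post-order)
        edges.extend((me, c) for c in children)
        return me

    return visit(tree), edges

tree = (("a", "b"), "c")           # hypothetical tree on leaves a, b, c
root_id, edges = number_nodes(tree, n=3)
print(root_id, edges)
```

Because the root is visited last in post-order, it receives the largest label, 2n-1, and the edge list has 2n-2 entries.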