CSE 527 Lecture 10: Parsimony and Phylogenetic Footprinting



slide-1
SLIDE 1

Parsimony and Phylogenetic Footprinting

CSE 527 Lecture 10

slide-2
SLIDE 2

Phylogenies

(aka Evolutionary Trees)

“Nothing in biology makes sense, except in the light of evolution”

-- Dobzhansky
slide-3
SLIDE 3
  • A Complex Question:

Given data (sequences, anatomy, ...) infer the phylogeny

  • A Simpler Question:

Given data and a phylogeny, evaluate “how much change” is needed to fit data to tree

slide-4
SLIDE 4

Human    A T G A T ...
Chimp    A T G A T ...
Gorilla  A T G A G ...
Rat      A T G C G ...
Mouse    A T G C T ...

Parsimony

General idea ~ Occam’s Razor: Given data where change is rare, prefer an explanation that requires few events

slide-5
SLIDE 5

Parsimony

[Tree figure, node labels for alignment column 1: A A A A A A A A A]

0 changes (of course other, less parsimonious, answers are possible)

slide-6
SLIDE 6

Parsimony

[Tree figure, node labels for alignment column 2: T T T T T T T T T]

0 changes

slide-7
SLIDE 7

Parsimony

[Tree figure, node labels for alignment column 3: G G G G G G G G G]

0 changes

slide-8
SLIDE 8

Parsimony

[Tree figure, node labels for alignment column 4: A C A/C A A A A C C]

1 change

slide-9
SLIDE 9

Parsimony

[Tree figure, node labels for alignment column 5: T G/T G/T G/T T T G T G]

2 changes
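
The change counts in these examples can be checked by counting tree edges whose two endpoints carry different characters. A minimal sketch follows. The topology, ((Human, Chimp), Gorilla) joined at the root with (Rat, Mouse), is an assumption that is not spelled out in the slide text, and the internal node names HC, HCG, RM and root are hypothetical.

```python
# Assumed topology, not given in the slide text:
# ((Human, Chimp), Gorilla) joined at the root with (Rat, Mouse).
# Internal node names (HC, HCG, RM, root) are hypothetical.
EDGES = [
    ("root", "HCG"), ("root", "RM"),
    ("HCG", "HC"), ("HCG", "Gorilla"),
    ("HC", "Human"), ("HC", "Chimp"),
    ("RM", "Rat"), ("RM", "Mouse"),
]


def count_changes(labels):
    """Count edges whose parent and child carry different characters."""
    return sum(labels[parent] != labels[child] for parent, child in EDGES)


# Alignment column 4: Human A, Chimp A, Gorilla A, Rat C, Mouse C.
site4 = {"Human": "A", "Chimp": "A", "Gorilla": "A", "Rat": "C", "Mouse": "C",
         "HC": "A", "HCG": "A", "RM": "C"}
site4_root_a = dict(site4, root="A")  # root labeled A
site4_root_c = dict(site4, root="C")  # root labeled C

# Alignment column 5: Human T, Chimp T, Gorilla G, Rat G, Mouse T.
site5 = {"Human": "T", "Chimp": "T", "Gorilla": "G", "Rat": "G", "Mouse": "T",
         "HC": "T", "HCG": "T", "RM": "T", "root": "T"}

print(count_changes(site4_root_a), count_changes(site4_root_c), count_changes(site5))
# prints: 1 1 2
```

Under this assumed tree, column 4 needs one change whether the root is labeled A or C, which is one way to read the "A/C" label in the figure.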

slide-10
SLIDE 10

Counting Events Parsimoniously

  • Lesson of example – no unique reconstruction
  • But there is a unique minimum number, of course
  • How to find it?
  • Early solutions 1965-75
slide-11
SLIDE 11

[Tree figure: leaves labeled G, G, T, T, T; each of the nine nodes carries a score table with one entry per character A, C, G, T]

Sankoff & Rousseau, ‘75

$P_u(s)$ = best parsimony score of subtree rooted at node u, assuming u is labeled by character s

slide-12
SLIDE 12

Sankoff-Rousseau Recurrence

$P_u(s)$ = best parsimony score of subtree rooted at node u, assuming u is labeled by character s

For leaf u:

$$P_u(s) = \begin{cases} 0 & \text{if } u \text{ is a leaf labeled } s \\ \infty & \text{if } u \text{ is a leaf not labeled } s \end{cases}$$

For internal node u:

$$P_u(s) = \sum_{v \in \mathrm{child}(u)} \min_{t \in \{A,C,G,T\}} \left[ \mathrm{cost}(s,t) + P_v(t) \right]$$

Time: $O(\text{alphabet}^2 \times \text{tree size})$
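
The recurrence can be sketched in a few lines of Python. This is an illustrative sketch, not the lecture's own code. The nested-tuple tree encoding, the function names `sankoff` and `unit_cost`, and the unit cost function (0 for no change, 1 for any substitution) are assumptions made for the example.

```python
import math

ALPHABET = "ACGT"


def unit_cost(s, t):
    """Assumed cost function: 0 for no change, 1 for any substitution."""
    return 0 if s == t else 1


def sankoff(node, cost=unit_cost):
    """Return {s: P_node(s)} for a tree given as nested tuples of leaf characters."""
    if isinstance(node, str):
        # Leaf: 0 for its own label, infinity for every other character.
        return {s: 0 if s == node else math.inf for s in ALPHABET}
    child_tables = [sankoff(child, cost) for child in node]
    # Internal node: sum over children of the cheapest (edge cost + child score).
    return {
        s: sum(min(cost(s, t) + table[t] for t in ALPHABET) for table in child_tables)
        for s in ALPHABET
    }


# Two leaves, both labeled T, under one parent.
print(sankoff(("T", "T")))
# prints: {'A': 2, 'C': 2, 'G': 2, 'T': 0}
```

Each internal node does alphabet-squared work per child, which matches the stated time bound when summed over the tree.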

slide-13
SLIDE 13

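Sankoff & Rousseau, ‘75

$P_u(s)$ = best parsimony score of subtree rooted at node u, assuming u is labeled by character s

For internal node u:

$$P_u(s) = \sum_{v \in \mathrm{child}(u)} \min_{t \in \{A,C,G,T\}} \left[ \mathrm{cost}(s,t) + P_v(t) \right]$$

[Figure: node u with children v1 and v2, each node carrying a score table over A, C, G, T. For a fixed s and each child v, a table lists cost(s,t) + $P_v(t)$ for t = A, C, G, T together with its minimum; $P_u(s)$ is the sum of the minima for v1 and v2.]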

slide-14
SLIDE 14

Sankoff & Rousseau, ‘75

$P_u(s)$ = best parsimony score of subtree rooted at node u, assuming u is labeled by character s

For internal node u:

$$P_u(s) = \sum_{v \in \mathrm{child}(u)} \min_{t \in \{A,C,G,T\}} \left[ \mathrm{cost}(s,t) + P_v(t) \right]$$

Example: u has two children v1 and v2, both leaves labeled T.

          A  C  G  T
    v1    ∞  ∞  ∞  0
    v2    ∞  ∞  ∞  0
    u     2  2  2  0

Computing $P_u(s)$ for s = A:

    t     cost(s,t) + P_v1(t)    cost(s,t) + P_v2(t)
    A     0 + ∞                  0 + ∞
    C     1 + ∞                  1 + ∞
    G     1 + ∞                  1 + ∞
    T     1 + 0                  1 + 0
    min   1                      1

sum: $P_u(A)$ = 1 + 1 = 2

slide-15
SLIDE 15

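Sankoff & Rousseau, ‘75

$P_u(s)$ = best parsimony score of subtree rooted at node u, assuming u is labeled by character s

[Tree figure: leaves labeled G, G, T, T, T; score tables at all nine nodes, listed here in the order they appear on the slide]

                      A  C  G  T
    Leaf (T)          ∞  ∞  ∞  0
    Leaf (T)          ∞  ∞  ∞  0
    Leaf (G)          ∞  ∞  0  ∞
    Leaf (G)          ∞  ∞  0  ∞
    Leaf (T)          ∞  ∞  ∞  0
    Internal node     2  2  2  0
    Internal node     2  2  1  1
    Internal node     2  2  1  1
    Root              4  4  2  2

Min = 2 (G or T)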
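
The root table can be reproduced by running the recurrence over the example alignment. The sketch below assumes unit substitution costs and the topology ((Human, Chimp), Gorilla) joined with (Rat, Mouse). Neither is stated in the slide text, though both are consistent with the tables shown. Only the five displayed alignment columns are used, and the names `TREE`, `ALIGNMENT` and `table` are hypothetical.

```python
import math

ALPHABET = "ACGT"

# Assumed topology (not stated in the slide text).
TREE = ((("Human", "Chimp"), "Gorilla"), ("Rat", "Mouse"))

# Only the five columns shown on the slides; the sequences continue beyond them.
ALIGNMENT = {
    "Human": "ATGAT",
    "Chimp": "ATGAT",
    "Gorilla": "ATGAG",
    "Rat": "ATGCG",
    "Mouse": "ATGCT",
}


def table(node, site):
    """P_node as a list in A, C, G, T order for one alignment column, unit costs."""
    if isinstance(node, str):
        # Leaf: 0 for the observed character, infinity otherwise.
        return [0 if s == ALIGNMENT[node][site] else math.inf for s in ALPHABET]
    children = [table(child, site) for child in node]
    # (s != t) is the unit cost: 0 when the characters match, 1 otherwise.
    return [
        sum(min((s != t) + p[j] for j, t in enumerate(ALPHABET)) for p in children)
        for s in ALPHABET
    ]


root = table(TREE, 4)  # fifth column: T, T, G, G, T
best = min(root)
print(root, best, [s for s, p in zip(ALPHABET, root) if p == best])
# prints: [4, 4, 2, 2] 2 ['G', 'T']

per_site = [min(table(TREE, i)) for i in range(5)]
print(per_site, sum(per_site))
# prints: [0, 0, 0, 1, 2] 3
```

The per-column minima agree with the change counts from the earlier parsimony examples (0, 0, 0, 1, 2).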

slide-16
SLIDE 16

Parsimony – Generalities

  • Parsimony is not necessarily the best way to evaluate a phylogeny (maximum likelihood generally preferred)
  • But it is a natural approach, & fast.
  • Finding the best tree: a much harder problem
  • Much is known about these problems; Inferring Phylogenies by Joe Felsenstein is a great resource.
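
One way to get a feel for why finding the best tree is harder than scoring a given one (an illustration that is not on the slide): the number of rooted binary tree topologies on n labeled leaves is (2n - 3)!!, so exhaustive scoring becomes infeasible quickly. The function name below is hypothetical.

```python
def rooted_binary_trees(n_leaves):
    """(2n - 3)!!: number of rooted binary topologies on n labeled leaves, n >= 2."""
    count = 1
    for k in range(3, 2 * n_leaves - 2, 2):  # odd factors 3, 5, ..., 2n - 3
        count *= k
    return count


for n in (3, 5, 10):
    print(n, rooted_binary_trees(n))
# prints:
# 3 3
# 5 105
# 10 34459425
```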

slide-17
SLIDE 17

Phylogenetic Footprinting

See link to Tompa’s slides on course web page

http://www.cs.washington.edu/homes/tompa/papers/ortho.ppt