Parsimony

Feb 08, 2024

SLIDE 1

Parsimony

Site    123456789...
Taxon1  CGACCAGGT...
Taxon2  CGACCAGGT...
Taxon3  CGGTCCGGT...
Taxon4  CGGCCTGGT...

Figure: one of the three possible unrooted trees, with tips Taxon1, Taxon3, Taxon2, Taxon4

Figure: same tree but with data from site 6 inserted in place of taxon names (Taxon1 = A, Taxon3 = C, Taxon2 = A, Taxon4 = T)

SLIDE 2

"Standard" Parsimony

SLIDE 3

Important things to note about that last slide:

Two (2) steps was the minimum

– no way to explain the observed data with just 1 evolutionary change

More than one way to assign ancestral character states to get 2 steps

– one interior node must have A, but the other interior node can have anything except G

Enumerating all possible combinations of ancestral states is not the most efficient way to determine the number of steps

– more on this later
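The brute-force enumeration mentioned above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: it assumes the tree ((Taxon1,Taxon2),(Taxon3,Taxon4)) and the site 6 data, and the names `quartet_steps`, `x` and `y` are made up for the example.

```python
from itertools import product

STATES = "ACGT"

def quartet_steps(tips, x, y):
    """Changes on the unrooted tree ((t1,t2),(t3,t4)) given interior states x and y.

    x is the interior node joining t1 and t2, y the one joining t3 and t4.
    Each of the five branches contributes 1 if its two ends differ.
    """
    t1, t2, t3, t4 = tips
    return (t1 != x) + (t2 != x) + (x != y) + (t3 != y) + (t4 != y)

# Site 6 of the example alignment: Taxon1=A, Taxon2=A, Taxon3=C, Taxon4=T.
site6 = ("A", "A", "C", "T")

# Score every one of the 4 x 4 = 16 ancestral-state combinations.
scores = {(x, y): quartet_steps(site6, x, y) for x, y in product(STATES, repeat=2)}
min_steps = min(scores.values())
best = sorted(pair for pair, s in scores.items() if s == min_steps)

print(min_steps)  # 2
print(best)       # [('A', 'A'), ('A', 'C'), ('A', 'T')]
```

The three optimal assignments all put A at the node next to Taxon1 and Taxon2, and anything except G at the other node, which matches the notes above.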

SLIDE 4

Parsimony Steps

Site    123456789...
Taxon1  CGACCAGGT...
Taxon2  CGACCAGGT...
Taxon3  CGGTCCGGT...
Taxon4  CGGCCTGGT...
Steps   001102000...

Let's call this tree 1: (1,2,(3,4))

Tree 1's length for first 9 sites = 4

SLIDE 5

Parsimony Steps

Site    123456789...
Taxon1  CGACCAGGT...
Taxon2  CGACCAGGT...
Taxon3  CGGTCCGGT...
Taxon4  CGGCCTGGT...
Steps   002102000...

Tree 2: (1,3,(2,4))

Tree 2's length for first 9 sites = 5

SLIDE 6

Parsimony Steps

Site    123456789...
Taxon1  CGACCAGGT...
Taxon2  CGACCAGGT...
Taxon3  CGGTCCGGT...
Taxon4  CGGCCTGGT...
Steps   002102000...

Tree 3: (1,4,(2,3))

Tree 3's length for first 9 sites = 5

SLIDE 7

Parsimony (using only 9 sites)

Tree 1: 4 steps (most parsimonious)

Tree 2: 5 steps

Tree 3: 5 steps

This is the simplest explanation of the data for the first 9 sites according to the parsimony criterion. Choosing one of the other two trees requires additional (ad hoc) justification.
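The three tree lengths can be re-derived with a short script. The sketch below is an illustration under stated assumptions: it uses only the first 9 sites shown on the slides, counts steps with set intersections and unions rather than by enumeration, and the helper names `fitch_pair` and `quartet_length` are invented for the example.

```python
# First 9 sites of the example alignment, keyed by taxon number.
ALIGNMENT = {
    1: "CGACCAGGT",
    2: "CGACCAGGT",
    3: "CGGTCCGGT",
    4: "CGGCCTGGT",
}

def fitch_pair(a, b):
    """Combine two state sets: intersection if non-empty (0 steps), else union (1 step)."""
    common = a & b
    return (common, 0) if common else (a | b, 1)

def quartet_length(pair1, pair2, alignment=ALIGNMENT):
    """Parsimony length of the unrooted tree (pair1, pair2), summed over all sites."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        left, s1 = fitch_pair({alignment[pair1[0]][i]}, {alignment[pair1[1]][i]})
        right, s2 = fitch_pair({alignment[pair2[0]][i]}, {alignment[pair2[1]][i]})
        _, s3 = fitch_pair(left, right)
        total += s1 + s2 + s3
    return total

lengths = {
    "tree 1 (1,2,(3,4))": quartet_length((1, 2), (3, 4)),
    "tree 2 (1,3,(2,4))": quartet_length((1, 3), (2, 4)),
    "tree 3 (1,4,(2,3))": quartet_length((1, 4), (2, 3)),
}
print(lengths)  # tree 1: 4, tree 2: 5, tree 3: 5
```

Only sites 3, 4 and 6 vary among the four taxa, so they are the only sites that contribute steps on any of the three trees.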

SLIDE 8

Wagner vs. Fitch Parsimony

Wagner (ordered characters)

Character state tree figure: states 0, 2 and 3 are each connected to state 1

Note: this is just one possible character state tree

This "tree" says that all changes between 0 and 2, 0 and 3, or 2 and 3 must go through state 1 (and thus require 2 steps)

Fitch (unordered characters)

In Fitch Parsimony, a change between any two states is possible, and all changes count just 1 step

(distinction exists only in case of multistate characters)
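One way to make the ordered/unordered distinction concrete is to turn a character state tree into a table of step costs. The following is a minimal sketch, assuming the state tree described above (0, 2 and 3 each joined to 1); the function name `steps_from_state_tree` is hypothetical.

```python
from collections import deque

def steps_from_state_tree(edges, states):
    """Cost of changing i -> j = number of edges on the path between i and j."""
    neighbours = {s: set() for s in states}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    cost = {}
    for start in states:
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search outward from `start`
            cur = queue.popleft()
            for nxt in neighbours[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        for end in states:
            cost[start, end] = dist[end]
    return cost

states = [0, 1, 2, 3]

# Ordered (Wagner): every change between 0, 2 and 3 is routed through state 1.
wagner = steps_from_state_tree([(0, 1), (1, 2), (1, 3)], states)

# Unordered (Fitch): any change between two different states costs 1 step.
fitch = {(i, j): int(i != j) for i in states for j in states}

print(wagner[0, 2], fitch[0, 2])  # 2 1
```

With only two states the two tables are identical, which is why the distinction matters only for multistate characters.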

SLIDE 9

Transversion Parsimony

Transitions (A↔G, C↔T) more common than transversions (all other changes)

Transitions saturate faster than transversions, thus transversions are sometimes more reliable for reconstructing history

Transversion parsimony is extreme: it ignores all transitions and counts 1 step for each transversion

SLIDE 10

Saturation

Figure: tree with tip states C, G, A, A, A, G and the changes C→A, A→G, A→G marked on its branches

Transversions rarer, should trust them more

Transitions common, often involved in parallelism (shown here), convergence, or reversal

Saturation refers to the loss of historical information due to the effect of "multiple hits"

SLIDE 11

Implementing Transversion Parsimony

Ambiguity codes:

– R means purine (A or G)

– Y means pyrimidine (C or T)

Replace nucleotides with either R or Y

– only transversions will be detectable

Note: Nexus data file format allows you to do this substitution virtually

– no need to actually modify your data
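The R/Y replacement itself is easy to sketch outside of any phylogenetics package. The Python below is an illustration only (it does modify a copy of the data, unlike the virtual substitution mentioned above); `ry_code` is a made-up helper name, and the sequences are the first 9 sites of the example alignment.

```python
# Map each nucleotide to its purine (R) or pyrimidine (Y) class.
RY = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y"})

def ry_code(seq):
    """Return the sequence with A/G replaced by R and C/T replaced by Y."""
    return seq.upper().translate(RY)

alignment = {
    "Taxon1": "CGACCAGGT",
    "Taxon2": "CGACCAGGT",
    "Taxon3": "CGGTCCGGT",
    "Taxon4": "CGGCCTGGT",
}
recoded = {name: ry_code(seq) for name, seq in alignment.items()}

for name, seq in recoded.items():
    print(name, seq)
# Taxon1 YRRYYRRRY
# Taxon2 YRRYYRRRY
# Taxon3 YRRYYYRRY
# Taxon4 YRRYYYRRY
```

After recoding, sites 3 and 4 no longer vary (their differences were transitions), and only the transversion at site 6 remains visible.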

SLIDE 12

Transversion Parsimony

SLIDE 13

Step Matrices

          To
          A  C  G  T
From  A   -  1  1  1
      C   1  -  1  1
      G   1  1  -  1
      T   1  1  1  -

Step matrix for Fitch parsimony

SLIDE 14

Step Matrices

          To
          A  C  G  T
From  A   -  5  1  5
      C   5  -  5  1
      G   1  5  -  5
      T   5  1  5  -

This step matrix implements something like transversion parsimony, but less severe

It counts 5 for each transversion

SLIDE 15

Step Matrices

          To
          A  C  G  T
From  A   -  5  1  5
      C   5  -  5  1
      G   1  5  -  5
      T   5  1  5  -

This step matrix implements something like transversion parsimony, but less severe

And counts 1 step for each transition
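Scoring a tree under an arbitrary step matrix is usually done with a dynamic-programming pass over the tree (Sankoff's algorithm). The slides do not show that procedure, so the sketch below is an assumption about how one might do it, with invented names (`make_matrix`, `sankoff`, `length`); it scores site 6 on tree 1 under both matrices shown above.

```python
import math

STATES = "ACGT"
PURINES = {"A", "G"}

def is_transition(a, b):
    """True for A<->G and C<->T changes."""
    return a != b and ((a in PURINES) == (b in PURINES))

def make_matrix(ti, tv):
    """Step matrix with cost `ti` per transition and `tv` per transversion."""
    return {(a, b): 0 if a == b else (ti if is_transition(a, b) else tv)
            for a in STATES for b in STATES}

def sankoff(tree, cost):
    """Return {state: minimum cost of the subtree if its root has that state}."""
    if isinstance(tree, str):  # a tip: only the observed state is allowed
        return {s: (0 if s == tree else math.inf) for s in STATES}
    left, right = (sankoff(child, cost) for child in tree)
    return {
        s: min(cost[s, t] + left[t] for t in STATES)
           + min(cost[s, t] + right[t] for t in STATES)
        for s in STATES
    }

def length(tree, cost):
    """Minimum cost over all possible root states."""
    return min(sankoff(tree, cost).values())

# Site 6 on tree 1: (Taxon1=A, Taxon2=A) versus (Taxon3=C, Taxon4=T).
site6_on_tree1 = (("A", "A"), ("C", "T"))

fitch_score = length(site6_on_tree1, make_matrix(ti=1, tv=1))     # 2
weighted_score = length(site6_on_tree1, make_matrix(ti=1, tv=5))  # 6
print(fitch_score, weighted_score)
```

Under the 1:5 matrix the cheapest explanation is one transversion (5) plus one transition (1), so the same site on the same tree scores 6 instead of 2. Both matrices here are symmetric, so where the tree is rooted does not affect the score.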

SLIDE 16

Generalized Parsimony

SLIDE 17

Important points

Do not compare scores across parsimony variants

– A tree with a transversion parsimony score of 25 is not necessarily better than a tree with a Fitch parsimony score of 31

Parsimony does not provide any guidance for selecting weights for step matrices

– parsimony cannot tell us that the transition:transversion weight ratio 1:5 is better than 1:1

SLIDE 18

Other variants

Camin-Sokal parsimony

– characters are assumed irreversible

– ancestral state assumed known

– forces use of rooted trees

Dollo parsimony

– derived state can arise only once, but as many reversals as needed are allowed

– popular for modeling restriction sites (which are lost more easily than they are gained)

Unweighted parsimony, equal-weighted parsimony

– usually means Fitch parsimony (what I call standard parsimony)
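The Camin-Sokal assumptions listed above can be illustrated with an asymmetric cost table in which a reversal is simply forbidden. This is a hedged sketch, not something taken from the slides: the binary character, the rooted tree, and the function name `rooted_length` are all hypothetical, with 0 taken as the known ancestral state and 1 as the derived state.

```python
import math

def rooted_length(tree, cost, root_state=0, states=(0, 1)):
    """Minimum total cost on a rooted tree whose root is fixed to `root_state`."""
    def below(node):
        if not isinstance(node, tuple):  # a tip holding its observed state
            return {s: (0 if s == node else math.inf) for s in states}
        tables = [below(child) for child in node]
        return {s: sum(min(cost[s][t] + tab[t] for t in states) for tab in tables)
                for s in states}
    return below(tree)[root_state]

# cost[from][to]; Camin-Sokal forbids the reversal 1 -> 0 by making it infinitely costly.
reversible = [[0, 1], [1, 0]]
camin_sokal = [[0, 1], [math.inf, 0]]

# Hypothetical rooted tree: tips are character states, nested tuples are clades.
tree = (0, (((1, 0), 1), 1))

print(rooted_length(tree, reversible))   # 2: one gain, then one reversal
print(rooted_length(tree, camin_sokal))  # 3: three independent gains
```

Fixing the root state is what ties the calculation to a rooted tree: without a known ancestral state there is no way to tell a gain from a reversal.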

SLIDE 19

Counting steps with a minimum of effort

Tip states: T, C, C, A, A, G

State sets at the interior nodes: {A,G}, {A,C}, {A}, {A,C}, {A,C,T}

(+1 step) at each of four interior nodes

4 steps total
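The set-based counting shown on this slide can be sketched as a short recursive function. The original figure is not recoverable from the text, so the tree shape used below is an assumption: it is one topology that reproduces the listed tip states and interior-node sets. The function name `fitch` is likewise chosen for the example.

```python
def fitch(tree):
    """Return (state set, steps) for a rooted binary tree given as nested tuples."""
    if isinstance(tree, str):  # a tip: its set is just the observed state
        return {tree}, 0
    (lset, lsteps), (rset, rsteps) = fitch(tree[0]), fitch(tree[1])
    common = lset & rset
    if common:  # sets overlap: keep the intersection, no extra step
        return common, lsteps + rsteps
    return lset | rset, lsteps + rsteps + 1  # disjoint: take the union, +1 step

# Assumed topology, with tips in the order T, C, C, A, A, G.
tree = ("T", ("C", (("C", "A"), ("A", "G"))))

root_set, steps = fitch(tree)
print(sorted(root_set), steps)  # ['A', 'C', 'T'] 4
```

Each tip is visited once and each interior node does one set operation, so no enumeration of ancestral state combinations is needed.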

SLIDE 20

What is "weighted" parsimony?

When someone says they are using weighted parsimony, this can mean more than one thing:

Some changes weighted more than others

– i.e. generalized parsimony

Some sites weighted more than other sites

– weighting may be determined a priori

– weighting may be dynamic (i.e. a function of the number of changes reconstructed)
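Weighting sites (as opposed to weighting changes) amounts to multiplying each site's step count by that site's weight before summing. The sketch below uses the per-site step counts given earlier for the three trees; the weight values are hypothetical and chosen a priori purely for illustration, and `weighted_length` is an invented name.

```python
# Per-site step counts for the first 9 sites, as listed on the earlier slides.
steps = {
    "tree 1": [0, 0, 1, 1, 0, 2, 0, 0, 0],
    "tree 2": [0, 0, 2, 1, 0, 2, 0, 0, 0],
    "tree 3": [0, 0, 2, 1, 0, 2, 0, 0, 0],
}

def weighted_length(site_steps, weights):
    """Sum of (site weight x steps at that site)."""
    if len(site_steps) != len(weights):
        raise ValueError("need exactly one weight per site")
    return sum(w * s for w, s in zip(weights, site_steps))

equal_weights = [1] * 9
# Hypothetical a priori weights: site 3 counts double, all other sites count once.
site3_doubled = [1, 1, 2, 1, 1, 1, 1, 1, 1]

unweighted = {name: weighted_length(s, equal_weights) for name, s in steps.items()}
weighted = {name: weighted_length(s, site3_doubled) for name, s in steps.items()}

print(unweighted)  # {'tree 1': 4, 'tree 2': 5, 'tree 3': 5}
print(weighted)    # {'tree 1': 5, 'tree 2': 7, 'tree 3': 7}
```

A dynamic scheme would recompute the weights from the step counts themselves instead of fixing them in advance; no particular formula for that is implied here.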