One Step Mutation (OSM) matrices joint work with Sequence Evolution - - PDF document



slide-1
SLIDE 1

1

One Step Mutation (OSM) matrices

joint work with

Sequence Evolution

slide-2
SLIDE 2

2

Sequence Evolution

acggcatagccgattac
acgggatagcccattac

slide-3
SLIDE 3

3

Sequence Evolution

acggcatagccgattac
acgggatagcccattac
acggcatatccactggattac
acgggat--cccattac
acgggat--cccattac

slide-4
SLIDE 4

4

Sequence Evolution

acggcatagccgattac
acgggatagcccattac
acggcatatccactggattac
acgggat--cccattac
acgggat--cccattac
acgggat--cccaatac
ccgggatagcttccattac
acgacatatccactggattcc
accccctatccactggattac

Multiple Sequence Alignment (MSA)

seq 1  t c a t t g t c c a t t c g a
seq 2  t a g c c c t t t a a a t g c
seq 3  t a g c c c t t t g a a c g c
seq 4  c a a c t g a t t a t t c a c

Alignment column or alignment pattern
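As a rough illustration of what an alignment column (pattern) is, the sketch below reads the four rows of the MSA column by column. The rows are copied as they appear in the extracted slide; the function name `alignment_columns` and the use of `Counter` for pattern counts are choices made here, not part of the talk.

```python
from collections import Counter

# Rows of the MSA from the slide (character order as extracted).
msa = {
    "seq 1": "tcattgtccattcga",
    "seq 2": "tagccctttaaatgc",
    "seq 3": "tagccctttgaacgc",
    "seq 4": "caactgattattcac",
}

def alignment_columns(rows):
    """Return the alignment patterns, one string per column."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*rows) walks the alignment column by column.
    return ["".join(col) for col in zip(*rows)]

columns = alignment_columns(list(msa.values()))
pattern_counts = Counter(columns)   # how often each pattern occurs

print(columns[0])       # first column, read from seq 1 down to seq 4
print(len(columns))     # number of alignment columns
```

Each entry of `columns` is one pattern; `pattern_counts` is the kind of observed pattern count that likelihood methods work from.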

slide-5
SLIDE 5

5

Example: Binary Alphabet {R, Y}

R R R R

pattern

slide-6
SLIDE 6

6

Binary Alphabet {R, Y}

Y R R R

pattern

slide-7
SLIDE 7

7

Binary Alphabet {R, Y}

permutation matrix σ_A

permutation matrix σ_AB

slide-8
SLIDE 8

8

Internal branches

matrix multiplication: σ_A σ_B = σ_AB
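The relation σ_A σ_B = σ_AB can be checked numerically. The sketch below is one possible encoding, not necessarily the one used in the talk: patterns over {R, Y} for the taxa A, B, C, D are stored as 4-bit integers (R = 0, Y = 1, A as the most significant bit), and a mutation on an edge flips the bits of all leaves below that edge. The helper names `mask`, `sigma`, `matmul` and `show` are hypothetical.

```python
TAXA = "ABCD"                 # assumed leaf order; A is the most significant bit
N_PATTERNS = 2 ** len(TAXA)   # 16 patterns over {R, Y}

def mask(leaves):
    """Bitmask with a 1 for every leaf below the edge."""
    return sum(1 << (len(TAXA) - 1 - TAXA.index(x)) for x in leaves)

def sigma(leaves):
    """Permutation matrix that flips R <-> Y at the given leaves."""
    m = mask(leaves)
    return [[1 if (i ^ m) == j else 0 for j in range(N_PATTERNS)]
            for i in range(N_PATTERNS)]

def matmul(a, b):
    """Plain matrix product for square lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def show(index):
    """Write a pattern index as a string over {R, Y}."""
    return "".join("Y" if (index >> (len(TAXA) - 1 - pos)) & 1 else "R"
                   for pos in range(len(TAXA)))

sigma_A, sigma_B, sigma_AB = sigma("A"), sigma("B"), sigma("AB")
product = matmul(sigma_A, sigma_B)

print(product == sigma_AB)            # mutation on A then on B acts like edge AB
print(show(sigma_A[0].index(1)))      # pattern reached from RRRR via edge A
```

In this encoding a mutation on the internal edge AB has the same effect on the pattern as one mutation on each of the two leaf edges below it.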

One Step Mutation Matrix

[Figure: One Step Mutation matrix, rows and columns labeled 0 to 15]

slide-9
SLIDE 9

9

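Examples of OSM-Graphs

Branch Lengths:

[Figure: tree with branch lengths d_A, d_B, d_C, d_D, d_AB, d_CD]

Total branch length:

$$\Delta = d_A + d_B + d_C + d_D + d_{AB} + d_{CD}$$

Relative edge length:

$$p_{\text{edge}} = \frac{d_{\text{edge}}}{\Delta}$$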
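A small numerical sketch of the relative edge lengths: each branch length is divided by the total branch length, so the results sum to 1. The six branch lengths below are made-up values for illustration only.

```python
# Hypothetical branch lengths for the six edges of the four-taxon tree.
d = {"A": 0.2, "B": 0.2, "C": 0.3, "D": 0.3, "AB": 0.6, "CD": 0.4}

total_length = sum(d.values())                       # total branch length
p = {edge: length / total_length for edge, length in d.items()}

for edge, value in p.items():
    print(edge, round(value, 3))
```

The dictionary `p` plays the role of the relative branch lengths p_A, ..., p_CD used on the following slides.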

slide-10
SLIDE 10

10

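Some Formalisms:

[Figure: tree with relative branch lengths p_A, p_B, p_C, p_D, p_AB, p_CD]

relative branch lengths:

$$p_A + p_B + p_C + p_D + p_{AB} + p_{CD} = 1$$

$p_{\text{edge}}\,\sigma_{\text{edge}}$: general permutation matrix $\sigma_{\text{edge}}$, used to assign mutation probabilities

Constructing the OSM:

$$M_T = \sum_{\text{edge}} p_{\text{edge}}\,\sigma_{\text{edge}}$$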
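The construction of the OSM as a weighted sum of edge permutation matrices can be sketched as follows. The assumptions are labeled in the code: patterns are 4-bit integers (R = 0, Y = 1, leaf order A, B, C, D), each edge flips the leaves below it, and the relative branch lengths are made-up values. The function name `one_step_matrix` is hypothetical.

```python
TAXA = "ABCD"            # assumed leaf order; A is the most significant bit
N = 2 ** len(TAXA)

# Hypothetical relative branch lengths; they sum to 1.
p = {"A": 0.10, "B": 0.10, "C": 0.15, "D": 0.15, "AB": 0.30, "CD": 0.20}

def mask(leaves):
    """Bitmask with a 1 for every leaf below the edge."""
    return sum(1 << (len(TAXA) - 1 - TAXA.index(x)) for x in leaves)

def one_step_matrix(p):
    """M_T = sum over edges of p_edge * sigma_edge, as a list of rows."""
    M = [[0.0] * N for _ in range(N)]
    for edge, p_edge in p.items():
        m = mask(edge)
        for i in range(N):
            # sigma_edge maps pattern i to pattern i XOR m.
            M[i][i ^ m] += p_edge
    return M

M_T = one_step_matrix(p)
print(M_T[0b0000][mask("A")])    # probability that one mutation turns RRRR into YRRR
print(M_T[0b0000][mask("AB")])   # probability that one mutation turns RRRR into YYRR
```

Every row of `M_T` sums to 1, so the matrix can be read as the transition matrix of a single substitution placed on the tree in proportion to branch length.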

slide-11
SLIDE 11

11

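Many Substitutions:

One substitution:

$$M_T = \sum_{\text{edge}} p_{\text{edge}}\,\sigma_{\text{edge}}$$

k substitutions:

$$M_T^k = \Bigl(\sum_{\text{edge}} p_{\text{edge}}\,\sigma_{\text{edge}}\Bigr)^k$$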

Many Substitutions: Random walk
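One way to picture the random walk is to follow the probability distribution over patterns after k substitutions, which is a row of the k-th power of M_T. The sketch below avoids building the matrix and pushes probability mass along the edges directly. Edge bitmasks, relative lengths, and the names `step` and `after_k` are assumptions made for illustration.

```python
# Hypothetical edges of the four-taxon tree as bitmasks over the leaves
# A, B, C, D (A = most significant bit), with made-up relative lengths.
P_EDGE = {0b1000: 0.10, 0b0100: 0.10, 0b0010: 0.15, 0b0001: 0.15,
          0b1100: 0.30, 0b0011: 0.20}
N_PATTERNS = 16

def step(dist, p_edge):
    """One substitution: move probability mass along every edge."""
    new = [0.0] * len(dist)
    for i, mass in enumerate(dist):
        if mass:
            for m, p in p_edge.items():
                new[i ^ m] += mass * p
    return new

def after_k(start, k, p_edge):
    """Row `start` of M_T^k: distribution over patterns after k substitutions."""
    dist = [0.0] * N_PATTERNS
    dist[start] = 1.0
    for _ in range(k):
        dist = step(dist, p_edge)
    return dist

one = after_k(0b0000, 1, P_EDGE)
two = after_k(0b0000, 2, P_EDGE)
print(one[0b1000])               # RRRR -> YRRR in one substitution
print(round(two[0b0000], 6))     # back at RRRR after two substitutions
```

Returning to the start pattern after two steps requires hitting the same edge twice, so that entry equals the sum of the squared relative branch lengths.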

slide-12
SLIDE 12

12

Maximum Parsimony (MP)

$$\min\{\,k \in \mathbb{N} : M_T^k(i, j) > 0\,\}$$

describes the minimal number of mutations to move from pattern i to j.

MP: For a tree T and pattern j compute:

$$\min\{\,k \in \mathbb{N} : M_T^k(R \ldots R, j) > 0 \ \text{or}\ M_T^k(Y \ldots Y, j) > 0\,\}$$
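The parsimony score in this formulation only depends on which entries of the k-th power of M_T are positive. A minimal sketch, assuming all relative branch lengths are positive and using the same hypothetical bitmask encoding of patterns (R = 0, Y = 1, leaf order A, B, C, D): track the set of patterns reachable in exactly k substitutions from R...R or Y...Y and stop at the first k that contains the target pattern. Here k = 0 is allowed, so the two constant patterns score 0.

```python
# Hypothetical edge bitmasks for the four-taxon tree: A, B, C, D, AB, CD.
EDGE_MASKS = [0b1000, 0b0100, 0b0010, 0b0001, 0b1100, 0b0011]

def parsimony_score(pattern, edge_masks, n_taxa=4):
    """Smallest k with M_T^k(R...R, pattern) > 0 or M_T^k(Y...Y, pattern) > 0."""
    all_y = (1 << n_taxa) - 1
    reachable = {0, all_y}            # support after exactly 0 substitutions
    k = 0
    while pattern not in reachable:
        # Support after exactly one more substitution.
        reachable = {i ^ m for i in reachable for m in edge_masks}
        k += 1
        if k > 2 ** n_taxa:
            raise ValueError("pattern is unreachable with these edges")
    return k

scores = {j: parsimony_score(j, EDGE_MASKS) for j in range(16)}
print(scores[0b1100])    # YYRR
print(scores[0b1010])    # YRYR
```

With these edges, YYRR needs one substitution (on edge AB) while YRYR needs two, which matches the usual parsimony count on the tree ((A,B),(C,D)).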

Maximum Likelihood

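We assume that the number of substitutions is Poisson distributed with parameter Δ. Then we compute the expected OSM as

$$\overline{M}_T = \exp(-\Delta) \sum_{k=0}^{\infty} \frac{\Delta^k (M_T)^k}{k!} = \exp(-\Delta)\,\exp(\Delta M_T)$$

$$\overline{M}_T = \exp(-\Delta)\, H_{2^n} \exp(\Delta D_T)\, H_{2^n}$$

where

$$D_T = H_{2^n} M_T H_{2^n}$$

and

$$H_{2^n} = \underbrace{H_2 \otimes H_2 \otimes \cdots \otimes H_2}_{n \text{ times}}, \qquad H_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

These identities hold with $H_{2^n}$ rescaled by $2^{-n/2}$, so that it is its own inverse; for the unscaled $H_2$ above, $H_{2^n} H_{2^n} = 2^n I$.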
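The Hadamard diagonalization can be checked against the Poisson series on a small example. The sketch below assumes the bitmask encoding of patterns used for illustration here (R = 0, Y = 1, leaf order A, B, C, D) and made-up relative branch lengths. It uses the fact that entry (s, i) of the n-fold Kronecker product of H_2 is (-1) raised to the number of 1-bits shared by s and i, and that the diagonal of D_T (with the 2^-n normalization) is the sum of p_edge times that sign at the edge's bitmask. All function names are hypothetical.

```python
import math

N_TAXA = 4
N = 2 ** N_TAXA
# Hypothetical edge bitmasks (A, B, C, D, AB, CD) with made-up relative lengths.
P_EDGE = {0b1000: 0.10, 0b0100: 0.10, 0b0010: 0.15, 0b0001: 0.15,
          0b1100: 0.30, 0b0011: 0.20}

def sign(s, i):
    """Entry (s, i) of H_2 (x) ... (x) H_2: (-1)^(number of shared 1-bits)."""
    return -1 if bin(s & i).count("1") % 2 else 1

def eigenvalues(p_edge):
    """Diagonal of D_T = 2^-n * H M_T H for M_T = sum_e p_e sigma_e."""
    return [sum(p * sign(s, m) for m, p in p_edge.items()) for s in range(N)]

def expected_osm_entry(i, j, delta, p_edge):
    """(i, j) entry of exp(-delta) * H exp(delta * D_T) H / 2^n."""
    lam = eigenvalues(p_edge)
    return sum(sign(s, i) * math.exp(delta * (lam[s] - 1.0)) * sign(s, j)
               for s in range(N)) / N

def series_entry(i, j, delta, p_edge, terms=60):
    """Same entry from the truncated Poisson series, as a cross-check."""
    dist = [0.0] * N
    dist[i] = 1.0                      # row i of M_T^0
    total = 0.0
    for k in range(terms):
        total += math.exp(-delta) * delta ** k / math.factorial(k) * dist[j]
        new = [0.0] * N
        for a, mass in enumerate(dist):
            for m, p in p_edge.items():
                new[a ^ m] += mass * p
        dist = new                     # row i of M_T^(k+1)
    return total

closed = expected_osm_entry(0b0000, 0b1100, 1.0, P_EDGE)
series = series_entry(0b0000, 0b1100, 1.0, P_EDGE)
print(abs(closed - series) < 1e-12)
```

The closed form needs only 2^n exponentials per entry, whereas the series has to be truncated; for Δ = 1 the two agree to floating-point accuracy after a few dozen terms.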

slide-13
SLIDE 13

13

Maximum Likelihood

The likelihood of a tree T with branch length Δ, given an alignment of length L, is then

$$\Pr(T, \Delta) = \prod_{i=1}^{L} \overline{M}_T\bigl(\{R \ldots R, Y \ldots Y\}, \text{pattern}(i)\bigr)$$

$$\overline{M}_T = \exp(-\Delta)\, H_{2^n} \exp(\Delta D_T)\, H_{2^n}$$
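A sketch of the likelihood computation for a binary alignment follows. The slide does not spell out how the two constant start patterns R...R and Y...Y are combined; the code assumes equal weights of 1/2 each, which is an assumption made here so that the site probabilities sum to 1. The pattern encoding (R = 0, Y = 1, leaf order A, B, C, D), the relative branch lengths, the alignment columns and all function names are likewise illustrative.

```python
import math

N_TAXA = 4
N = 2 ** N_TAXA
ALL_Y = N - 1
# Hypothetical edge bitmasks (A, B, C, D, AB, CD) with made-up relative lengths.
P_EDGE = {0b1000: 0.10, 0b0100: 0.10, 0b0010: 0.15, 0b0001: 0.15,
          0b1100: 0.30, 0b0011: 0.20}

def sign(s, i):
    """Entry (s, i) of the n-fold Kronecker product of H_2."""
    return -1 if bin(s & i).count("1") % 2 else 1

def expected_osm_entry(i, j, delta, p_edge):
    """(i, j) entry of the expected OSM via the Hadamard diagonalization."""
    total = 0.0
    for s in range(N):
        lam = sum(p * sign(s, m) for m, p in p_edge.items())
        total += sign(s, i) * math.exp(delta * (lam - 1.0)) * sign(s, j)
    return total / N

def encode(column):
    """Map a column over {R, Y} to its pattern index (R = 0, Y = 1)."""
    return int(column.replace("R", "0").replace("Y", "1"), 2)

def site_probability(j, delta, p_edge):
    """Assumed: start from R...R or Y...Y with probability 1/2 each."""
    return 0.5 * (expected_osm_entry(0, j, delta, p_edge)
                  + expected_osm_entry(ALL_Y, j, delta, p_edge))

def log_likelihood(columns, delta, p_edge):
    """Sum of log site probabilities over all alignment columns."""
    return sum(math.log(site_probability(encode(c), delta, p_edge))
               for c in columns)

columns = ["RRRR", "YRRR", "YYRR", "RRRR", "RYRY"]   # made-up binary alignment
print(log_likelihood(columns, 1.0, P_EDGE))
```

Working with the log-likelihood avoids underflow when the product runs over many alignment columns.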

Another View at the Mutations

From the above formula, we can analytically compute the posterior probability of the number of mutations that have occurred on a fixed tree.

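$$\Pr(k \text{ mutations} \mid \text{pattern}) = \frac{\exp(-\Delta)\,\Delta^k}{k!} \cdot \frac{(M_T)^k\bigl((R, \ldots, R), \text{pattern}\bigr)}{\overline{M}_T\bigl((R, \ldots, R), \text{pattern}\bigr)}$$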
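The posterior distribution of the number of mutations can be evaluated term by term: the Poisson weight for k times the (R...R, pattern) entry of the k-th power of M_T, divided by the same entry of the expected OSM. The sketch below assumes the illustrative bitmask encoding (R = 0, Y = 1, leaf order A, B, C, D) and made-up relative branch lengths; function names are hypothetical.

```python
import math

N = 16
# Hypothetical edge bitmasks (A, B, C, D, AB, CD) with made-up relative lengths.
P_EDGE = {0b1000: 0.10, 0b0100: 0.10, 0b0010: 0.15, 0b0001: 0.15,
          0b1100: 0.30, 0b0011: 0.20}

def sign(s, i):
    """Entry (s, i) of the 4-fold Kronecker product of H_2."""
    return -1 if bin(s & i).count("1") % 2 else 1

def expected_osm_entry(i, j, delta, p_edge):
    """(i, j) entry of the expected OSM via the Hadamard diagonalization."""
    total = 0.0
    for s in range(N):
        lam = sum(p * sign(s, m) for m, p in p_edge.items())
        total += sign(s, i) * math.exp(delta * (lam - 1.0)) * sign(s, j)
    return total / N

def posterior_mutation_counts(pattern, delta, p_edge, k_max=10, start=0):
    """Pr(k mutations | pattern) for k = 0..k_max, starting from R...R."""
    denom = expected_osm_entry(start, pattern, delta, p_edge)
    dist = [0.0] * N
    dist[start] = 1.0                          # row `start` of M_T^0
    posterior = []
    for k in range(k_max + 1):
        poisson = math.exp(-delta) * delta ** k / math.factorial(k)
        posterior.append(poisson * dist[pattern] / denom)
        new = [0.0] * N
        for a, mass in enumerate(dist):
            for m, p in p_edge.items():
                new[a ^ m] += mass * p
        dist = new                             # row `start` of M_T^(k+1)
    return posterior

post = posterior_mutation_counts(0b1000, 1.0, P_EDGE, k_max=40)
for k in range(5):
    print(k, round(post[k], 4))
```

For the pattern YRRR the posterior puts no mass on k = 0, since at least one substitution is needed to leave RRRR, and the probabilities over all k sum to 1 up to the truncation at `k_max`.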

similar work by Rasmus Nielsen, John Huelsenbeck, Jonathan Bollback (2002, 2003, 2005)

$$\overline{M}_T = \exp(-\Delta)\, H_{2^n} \exp(\Delta D_T)\, H_{2^n}$$

slide-14
SLIDE 14

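14

Posterior probabilities: clock-like tree

[Figure: clock-like tree, scale bar 0.2]

$$ppd[k \mid a] = \frac{\exp(-\Delta)\,\Delta^k}{k!} \cdot \frac{\pi_0 M_T^k(0, a) + \pi_1 M_T^k(1, a)}{\pi_0 \overline{M}_T(0, a) + \pi_1 \overline{M}_T(1, a)}$$

Δ = 1.0

Posterior probabilities: five Taxa Tree

[Figure: posterior probabilities for Pattern AB|CDE and Pattern ABE|CD; axis label: alignment patterns]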

slide-15
SLIDE 15

15

Summary and Outlook

Developed an evolutionary model that describes the action of a single substitution on an alignment pattern.

This leads to a tree-topology mediated random walk on the space of words of length n.

Maximum Parsimony and Maximum Likelihood are “extreme” cases within this framework.

Practical Aspect: Analytical formula for the posterior probabilities of the number of substitutions for a pattern.

Open Questions:

  • Connection between OSM and Hadamard transform (Hendy, Penny 1989) and its generalization, the Fourier calculus on evolutionary trees (Székely, Steel, Erdős 1993).
  • Other types of substitution distributions?
  • Computational issues


slide-16
SLIDE 16

16

The real stuff

Observed pattern count: $O(d_1, \ldots, d_{4^n})$

Maximum likelihood etc.

slide-17
SLIDE 17

17

The real stuff

Observed pattern count: $O(d_1, \ldots, d_{4^n})$

Maximum likelihood etc.

$\hat{T}$

$E(p_1, \ldots, p_{4^n})$

OSM

slide-18
SLIDE 18

18

The real stuff

Observed pattern count: $O(d_1, \ldots, d_{4^n})$

Maximum likelihood etc.

$\hat{T}$

$E(p_1, \ldots, p_{4^n})$

OSM

How many mutations are required to change E() into O()?