
slide-1
SLIDE 1

RADICAL PHYLOGENETIC INVERSION

Mike Hendy

Institute of Fundamental Sciences Massey University Palmerston North New Zealand

November 2010

Mike Hendy RADICAL PHYLOGENETIC INVERSION

slide-2
SLIDE 2

Acknowledgements

◮ David Penny, Massey University
◮ Mike Steel, Canterbury University
◮ Peter Waddell, University of South Carolina
◮ Andreas Dress, Universität Bielefeld

slide-6
SLIDE 6

Local to Global

◮ $X = \{x_0, x_1, \dots, x_n\}$ is a set of $n + 1$ taxa, and $T$ is an $X$–tree (the leaves represent the taxa). Example $X = \{x_0, x_1, x_2, x_3\}$:

[Figure: X–tree with x0 at the top and leaves x1, x2, x3 below]

◮ Nucleotide sequences evolve from an ancestral sequence, under a model of nucleotide substitution governed by local stochastic matrices $M_e$ on each edge of $T$.

[Figure: the same tree with sequence ACGGTTT at the top and ATGGTAT, ACGGTGT, ACGATGT at the leaves]

◮ Usually all we can observe are the global data: the aligned homologous sequences for the taxa $x_i \in X$.

[Figure: the tree replaced by "?" above the sequences ACGGTTT, ATGGTAT, ACGGTGT, ACGATGT]

◮ The inverse problem of phylogenetics: deduce the local structure of $T$, and (if possible) the stochastic matrices $M_e$, from the global data (aligned sequences).

slide-9
SLIDE 9

Global to Local - Pathset Group

[Figure: X–tree with x0 at the top and leaves x1, x2, x3 below]

The path $\Pi_{ij}$ is the set of edges of $T$ connecting the leaves $x_i$, $x_j$. The pathset group of $T$ is the group generated by the paths $\Pi_{01}, \Pi_{02}, \dots, \Pi_{0n}$ under disjoint union.

◮ The 8 pathsets of the pathset group of $T_{23}$: $\Pi_\emptyset$, $\Pi_{01}$, $\Pi_{02}$, $\Pi_{12}$, $\Pi_{03}$, $\Pi_{13}$, $\Pi_{23}$, $\Pi_{0123}$.

[Figure: one panel per pathset, showing its edges on the tree]

◮ The pathset $\Pi_{0123}$ is the disjoint union of $\Pi_{01}$, $\Pi_{02}$ and $\Pi_{03}$.
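The pathset group can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the edges of $T_{23}$ are labelled by the set of leaves they separate from $x_0$ (the labels e1, e2, e3, e23, e123 used later in the talk), and "disjoint union" is read as symmetric difference, so an edge used twice cancels. The names `EDGES`, `path_from_root` and `pathset_group` are hypothetical.

```python
from itertools import combinations

# Edges of T23, each named by the set of leaves it separates from x0
# (assumed labelling: e1, e2, e3, e23, e123).
EDGES = [frozenset(s) for s in ({1}, {2}, {3}, {2, 3}, {1, 2, 3})]


def path_from_root(i):
    """Edges on the path from x0 to leaf xi: every edge whose label contains i."""
    return frozenset(e for e in EDGES if i in e)


def pathset_group(n):
    """Combine the generators Pi_01..Pi_0n in every possible way.

    Each generator is its own inverse under symmetric difference, so every
    group element is the combination of some subset of the generators.
    """
    group = {}
    for r in range(n + 1):
        for leaves in combinations(range(1, n + 1), r):
            edges = frozenset()
            for i in leaves:
                edges = edges ^ path_from_root(i)  # symmetric difference
            # Index by the even-sized taxon set B (x0 is added when r is odd).
            b = frozenset(leaves) | ({0} if r % 2 else set())
            group[b] = edges
    return group


group = pathset_group(3)
for b, edges in group.items():
    print(sorted(b), sorted(sorted(e) for e in edges))
```

Under these assumptions the dictionary has 8 entries, and the entry for {0, 1, 2, 3} consists of the edges e1, e2, e3 and e123, since e23 is used twice and cancels.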

slide-12
SLIDE 12

Global to Local - Pathsets

◮ The edge $e_{23}$ belongs to the four pathsets $\Pi_B$ where $|B \cap \{2, 3\}|$ is an odd number.

[Figure: the 8 pathsets $\Pi_\emptyset$, $\Pi_{01}$, $\Pi_{02}$, $\Pi_{12}$, $\Pi_{03}$, $\Pi_{13}$, $\Pi_{23}$, $\Pi_{0123}$, one panel each]

◮ In general, edge $e_A$ is in $\Pi_B$ iff $|A \cap B|$ is odd.

◮ An $X$–tree has $2^{|X|-1} = 2^n$ pathsets.

slide-14
SLIDE 14

Global to Local - Pathsets

◮ Signed sums of pathsets on $X = \{x_0, x_1, x_2, x_3\}$:

$A = \{2, 3\}$: $\Pi_{02} + \Pi_{12} + \Pi_{03} + \Pi_{13} - \Pi_\emptyset - \Pi_{01} - \Pi_{23} - \Pi_{0123}$

$A = \{1, 2, 3\}$: $\Pi_{01} + \Pi_{02} + \Pi_{03} + \Pi_{0123} - \Pi_\emptyset - \Pi_{12} - \Pi_{13} - \Pi_{23}$

$A = \{1, 2\}$: $\Pi_{01} + \Pi_{02} + \Pi_{13} + \Pi_{23} - \Pi_\emptyset - \Pi_{12} - \Pi_{03} - \Pi_{0123}$

[Figure: tree diagrams of the pathsets in each sum]

slide-16
SLIDE 16

Pathsets

◮ Each edge $e_A \in E(T)$ belongs to the $2^{|X|-2}$ pathsets $\Pi_B$ with $|A \cap B|$ odd.

[Figure: X–tree on x0, x1, x2, x3, x4 with edges e1234, e1, e2, e3, e4, e14, e124]

The edge $e_{124}$ belongs to the 8 pathsets $\Pi_{01}$, $\Pi_{02}$, $\Pi_{13}$, $\Pi_{23}$, $\Pi_{04}$, $\Pi_{0124}$, $\Pi_{34}$ and $\Pi_{1234}$.

◮ Each pathset $\Pi_B$ comprises all edges $e_A \in E(T)$ with $|A \cap B|$ odd.

The pathset $\Pi_{1234}$ contains the edges $e_1$, $e_2$, $e_3$, $e_4$ and $e_{124}$.
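The parity rule can be checked against the five-taxon example with a short Python sketch. The edge labels are taken from the slide; the function names (`even_subsets`, `in_pathset`, and so on) are hypothetical and only illustrate the rule "edge $e_A$ is in $\Pi_B$ iff $|A \cap B|$ is odd".

```python
from itertools import combinations

TAXA = range(5)  # x0, ..., x4
# Edge labels from the slide: e1234, e1, e2, e3, e4, e14, e124.
EDGES = [frozenset(s) for s in
         ({1, 2, 3, 4}, {1}, {2}, {3}, {4}, {1, 4}, {1, 2, 4})]


def even_subsets(taxa):
    """All even-sized subsets B of the taxon indices (one per pathset)."""
    taxa = list(taxa)
    for r in range(0, len(taxa) + 1, 2):
        for b in combinations(taxa, r):
            yield frozenset(b)


def in_pathset(edge, b):
    """Parity rule: edge e_A lies in pathset Pi_B iff |A ∩ B| is odd."""
    return len(edge & b) % 2 == 1


def pathsets_containing(edge):
    return [b for b in even_subsets(TAXA) if in_pathset(edge, b)]


def edges_of_pathset(b):
    return [e for e in EDGES if in_pathset(e, b)]


e124 = frozenset({1, 2, 4})
print([sorted(b) for b in pathsets_containing(e124)])
print([sorted(e) for e in edges_of_pathset(frozenset({1, 2, 3, 4}))])
```

The first line printed lists the eight index sets named on the slide; the second lists e1, e2, e3, e4 and e124.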

slide-19
SLIDE 19

Global to Local - Pathweights

◮ Suppose $w : E(T) \to \mathbb{R}$ is an edge-weighting function. Then for any even-ordered subset $A \subseteq X$, the weight of the pathset $\Pi_A$ is

$$w(\Pi_A) = \sum_{\substack{e_B \in E(T):\\ |A \cap B| \equiv 1 \pmod 2}} w(e_B).$$

◮ Given the set of global pathset weights, we can find the local edge weights by

$$2^{|X|-2}\, w(e_B) = \sum_{\substack{A \subseteq X:\\ |A| \equiv 0 \pmod 2\\ |A \cap B| \equiv 1 \pmod 2}} w(\Pi_A) \;-\; \sum_{\substack{A \subseteq X:\\ |A| \equiv 0 \pmod 2\\ |A \cap B| \equiv 0 \pmod 2}} w(\Pi_A).$$

◮ Example: $w(e_B) = -\ln(\det(M_{e_B}))$, the log-det weight of the transition matrices. If we can measure the log-det of a pathset, we can determine the log-det of each edge of $T$.
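The two pathweight formulas can be tried numerically. The sketch below is an illustration, not part of the talk: it assumes the tree $T_{23}$ with edges labelled e1, e2, e3, e23, e123, and the edge weights in `W_EDGE` are made-up values. `pathset_weight` implements the forward sum and `edge_weight` the inversion.

```python
from itertools import combinations

N = 3  # taxa x0..x3, so |X| = N + 1 = 4

# Hypothetical edge weights on T23, keyed by the edge label A.
W_EDGE = {
    frozenset({1}): 0.10,
    frozenset({2}): 0.20,
    frozenset({3}): 0.30,
    frozenset({2, 3}): 0.05,
    frozenset({1, 2, 3}): 0.40,
}


def even_subsets(n):
    """Even-sized subsets of {0, ..., n}."""
    for r in range(0, n + 2, 2):
        for a in combinations(range(n + 1), r):
            yield frozenset(a)


def pathset_weight(a):
    """w(Pi_A): sum of w(e_B) over edges with |A ∩ B| odd."""
    return sum(w for b, w in W_EDGE.items() if len(a & b) % 2 == 1)


def edge_weight(b):
    """Invert: signed sum of pathset weights, divided by 2^(|X| - 2)."""
    total = 0.0
    for a in even_subsets(N):
        sign = 1 if len(a & b) % 2 == 1 else -1
        total += sign * pathset_weight(a)
    return total / 2 ** (N + 1 - 2)


print(edge_weight(frozenset({2, 3})))  # recovers the weight of e23
print(edge_weight(frozenset({1, 2})))  # {1, 2} is not an edge of T23
```

With these inputs the first call returns the weight assigned to e23 and the second returns zero (up to rounding), because no edge of $T_{23}$ has label {1, 2}.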

slide-22
SLIDE 22

Global to Local - Pathweights

◮ The transition matrices multiply along paths, and their determinants multiply across the edges of a pathset. If we can determine

$$\det(M_{\Pi_B}) = \prod_{\substack{e_A \in E(T):\\ |A \cap B| \equiv 1 \pmod 2}} \det(M_{e_A})$$

from the sequences at the leaves, then we can calculate the determinant of each stochastic matrix:

$$\left(\det(M_{e_A})\right)^{2^{|X|-2}} = \prod_{\substack{B \subseteq X:\\ |B| \equiv 0 \pmod 2\\ |A \cap B| \equiv 1 \pmod 2}} \det(M_{\Pi_B}) \prod_{\substack{B \subseteq X:\\ |B| \equiv 0 \pmod 2\\ |A \cap B| \equiv 0 \pmod 2}} \det(M_{\Pi_B})^{-1}.$$

◮ The product above has value 1 iff there is no edge with edge-split $A$. This can determine the topology of $T$.
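The link between this product formula and the additive pathweight formula can be written out as a short derivation. It is a sketch that assumes every determinant involved is positive, so that logarithms are defined. Taking $-\ln$ of the pathset determinant gives

```latex
\begin{aligned}
-\ln\det(M_{\Pi_B})
  &= \sum_{\substack{e_A \in E(T):\\ |A \cap B| \equiv 1 \pmod 2}} -\ln\det(M_{e_A}), \\
\text{so with } w(e_A) = -\ln\det(M_{e_A}):\quad
w(\Pi_B) &= \sum_{\substack{e_A \in E(T):\\ |A \cap B| \equiv 1 \pmod 2}} w(e_A).
\end{aligned}
```

Exponentiating the additive inversion for $2^{|X|-2}\, w(e_A)$ then turns the signed sum of pathset weights into the ratio of products of pathset determinants.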

slide-28
SLIDE 28

Neyman’s (Cavender, Farris) Model

◮ Given a set $X = \{x_0, x_1, \dots, x_n\}$ of $n + 1$ taxa on a phylogenetic tree $T$.

◮ Each edge $e$ of $T$ is assigned the probability $p_e$ that the character states at the endpoints of $e$ differ. ($p_e$ is independent of the states at its endpoints (symmetric).)

◮ The stochastic matrix for edge $e$ is therefore

$$P_e = \begin{pmatrix} 1 - p_e & p_e \\ p_e & 1 - p_e \end{pmatrix}.$$

◮ At a site the (notional) root $x_0$ is assigned a character $\chi(x_0) \in \{R, Y\}$.

◮ Each taxon has a homologous 2–state (R and Y) sequence generated recursively by the parameters $p_e$.

◮ The probability of each possible site pattern (assignment of states at the leaves of $T$) is a polynomial function of the $p_e$ terms.

slide-34
SLIDE 34

Example - Felsenstein

If at a site $x_0$ is assigned the state R, then $p_{RRYY}$, the probability of site pattern RRYY, is obtained by summing the probabilities of four subcases of assigning states at the internal vertices, given the substitution probabilities $P$ and $Q$ on the edges of $T$.

[Figure: tree $T_{23}$ with leaves x0, x1, x2, x3 and each edge labelled P or Q]

The four subcases (vertex states, then probability):

R R R R Y Y: $(1 - P)(1 - Q)^2 P Q$
R R R Y Y Y: $(1 - P)^2 (1 - Q)^2 Q$
R R Y R Y Y: $P^2 Q^3$
R R Y Y Y Y: $(1 - P)(1 - Q)^2 P Q$

◮ Summing: $p_{RRYY} = Q(1 - 2Q + Q^2 - P^2 + 2P^2 Q)$.
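The sum over the four subcases can be reproduced with a few lines of Python. The sketch assumes the shape of $T_{23}$ implied by the later slides: $x_0$ and $x_1$ attach to one internal vertex `u`, $x_2$ and $x_3$ to the other internal vertex `v`, with probability $P$ on the edges to $x_1$ and $x_2$ and $Q$ on the other three edges. The function names are hypothetical.

```python
from itertools import product


def edge_prob(p, a, b):
    """Probability of seeing states a and b at the two ends of an edge
    whose change probability is p."""
    return p if a != b else 1.0 - p


def p_rryy(P, Q):
    """Sum over the states of the two internal vertices u and v of T23."""
    x0, x1, x2, x3 = "R", "R", "Y", "Y"
    total = 0.0
    for u, v in product("RY", repeat=2):
        total += (edge_prob(Q, x0, u)     # edge e123 (x0 to u)
                  * edge_prob(P, u, x1)   # edge e1
                  * edge_prob(Q, u, v)    # edge e23 (internal)
                  * edge_prob(P, v, x2)   # edge e2
                  * edge_prob(Q, v, x3))  # edge e3
    return total


def closed_form(P, Q):
    """The polynomial quoted on the slide."""
    return Q * (1 - 2 * Q + Q ** 2 - P ** 2 + 2 * P ** 2 * Q)


print(p_rryy(0.1, 0.2), closed_form(0.1, 0.2))
```

The two printed values agree up to floating-point rounding, which is a quick consistency check on the expansion.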

slide-37
SLIDE 37

Inversion

◮ Given the character $\chi(x_0) = R$ at a site, the probability of site pattern RRYY ($\chi(x_1) = R$, $\chi(x_2) = Y$, $\chi(x_3) = Y$) is $p_{RRYY} = Q(1 - 2Q + Q^2 - P^2 + 2P^2 Q)$.

◮ Similar formulae can be derived for $p_{RRRR}, \dots, p_{RYYY}$.

◮ Inverting, $P$ and $Q$ can be expressed as rational functions (with radicals) of the pattern probabilities:

$$P = \frac{1}{2}\left(1 - \sqrt[4]{\frac{\mu_{12}\,\mu_{0123}}{\mu_{03}}}\right), \qquad Q = \frac{1}{2}\left(1 - \sqrt[4]{\frac{\mu_{03}\,\mu_{0123}}{\mu_{12}}}\right), \tag{1}$$

where

$\mu_{12} = 1 - 2(p_{RYRR} + p_{RRYR} + p_{RYRY} + p_{RRYY})$,
$\mu_{0123} = 1 - 2(p_{RYRR} + p_{RRYR} + p_{RRRY} + p_{RYYY})$,
$\mu_{03} = 1 - 2(p_{RRRY} + p_{RYRY} + p_{RRYY} + p_{RYYY})$.
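Formula (1) can be exercised end to end: generate the eight pattern probabilities from chosen $P$ and $Q$, then recover them. The Python sketch below makes the same assumption about the shape of $T_{23}$ as the slides on eigenvalues (probability $P$ on the edges to $x_1$ and $x_2$, $Q$ elsewhere), and it assumes $P, Q < 1/2$ so that the fourth roots are taken of positive numbers. `pattern_probs` and `invert` are hypothetical names.

```python
from itertools import product


def pattern_probs(P, Q):
    """Probabilities of the 8 site patterns on T23 with x0 fixed to R,
    keyed by the states of (x0, x1, x2, x3), e.g. 'RRYY'."""
    def f(p, a, b):
        return p if a != b else 1.0 - p

    probs = {}
    for x1, x2, x3 in product("RY", repeat=3):
        total = 0.0
        for u, v in product("RY", repeat=2):  # internal vertex states
            total += (f(Q, "R", u) * f(P, u, x1) * f(Q, u, v)
                      * f(P, v, x2) * f(Q, v, x3))
        probs["R" + x1 + x2 + x3] = total
    return probs


def invert(p):
    """Recover (P, Q) from the pattern probabilities using formula (1)."""
    mu12 = 1 - 2 * (p["RYRR"] + p["RRYR"] + p["RYRY"] + p["RRYY"])
    mu0123 = 1 - 2 * (p["RYRR"] + p["RRYR"] + p["RRRY"] + p["RYYY"])
    mu03 = 1 - 2 * (p["RRRY"] + p["RYRY"] + p["RRYY"] + p["RYYY"])
    P = 0.5 * (1 - (mu12 * mu0123 / mu03) ** 0.25)
    Q = 0.5 * (1 - (mu03 * mu0123 / mu12) ** 0.25)
    return P, Q


probs = pattern_probs(0.1, 0.2)
print(invert(probs))
```

With exact pattern probabilities the recovered pair matches the inputs up to rounding; with probabilities estimated from finite sequences the ratios under the roots could be negative, a case this sketch does not handle.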

slide-42
SLIDE 42

Eigenvalues

◮ The stochastic matrices

$$M_{e_1} = M_{e_2} = \begin{pmatrix} 1 - P & P \\ P & 1 - P \end{pmatrix}, \qquad M_{e_3} = M_{e_{23}} = M_{e_{123}} = \begin{pmatrix} 1 - Q & Q \\ Q & 1 - Q \end{pmatrix}$$

describe the substitution probabilities on the edges of $T_{23}$.

◮ These matrices have a common diagonalising matrix,

$$H_1 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad \text{with} \quad H_1^{-1} M_{e_1} H_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 - 2P \end{pmatrix}.$$

◮ Hence $M_{e_1}$ has eigenvalues 1, $\lambda_1 = 1 - 2P$, and $M_{e_3}$ has eigenvalues 1, $\lambda_3 = 1 - 2Q$.

◮ The transition matrices multiply along paths, hence so do their eigenvalues.

◮ Hence the eigenvalue across path $\Pi_{ij}$ connecting $x_i$ to $x_j$ is

$$\mu_{ij} = \prod_{e \in \Pi_{ij}} \lambda_e.$$
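The diagonalisation and the multiplication of eigenvalues along a path can be checked with plain Python lists, without any linear-algebra library. This is a minimal sketch; `matmul`, `neyman` and `diagonalise` are hypothetical helper names, and the values of $P$ and $Q$ are arbitrary.

```python
def matmul(a, b):
    """Product of two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


H1 = [[1, 1], [1, -1]]
H1_INV = [[0.5, 0.5], [0.5, -0.5]]  # H1 is its own inverse up to a factor 1/2


def neyman(p):
    """Two-state symmetric stochastic matrix with change probability p."""
    return [[1 - p, p], [p, 1 - p]]


def diagonalise(m):
    """Return H1^{-1} m H1."""
    return matmul(matmul(H1_INV, m), H1)


P, Q = 0.1, 0.2
print(diagonalise(neyman(P)))  # diagonal entries 1 and 1 - 2P

# Along a two-edge path the matrices multiply, and so do the eigenvalues.
path_matrix = matmul(neyman(P), neyman(Q))
print(diagonalise(path_matrix))  # second diagonal entry (1 - 2P)(1 - 2Q)
```

The off-diagonal entries vanish in both cases (up to rounding), and the second diagonal entry of the path matrix is the product of the two edge eigenvalues.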

slide-50
SLIDE 50

Pathsets

[Figure: tree $T_{23}$ with leaves x0, x1, x2, x3, followed by panels highlighting the path $\Pi_{01}$, the path $\Pi_{12}$, the path $\Pi_{13}$ and the pathset $\Pi_{0123} = \Pi_{01} \cup \Pi_{23}$]

Subtract paths $\Pi_{02}$, $\Pi_{03}$, $\Pi_{23}$.

◮ Counting the occurrences of each edge in $\Pi_{01}$, $\Pi_{12}$, $\Pi_{13}$ and $\Pi_{0123}$, minus the occurrences in $\Pi_{02}$, $\Pi_{03}$ and $\Pi_{23}$, we see edge $e_1$ is counted 4 times, and each other edge cancels.

◮ Hence

$$\frac{\mu_{01}\,\mu_{12}\,\mu_{13}\,\mu_{0123}}{\mu_{02}\,\mu_{03}\,\mu_{23}} = \lambda_1^4.$$

◮ $p_{e_1} = \frac{1}{2}(1 - \lambda_1)$ is a rational function (with radicals) of the eigenvalues $\mu_B$, and thus of the site pattern probabilities $p_{RRRR}, \dots, p_{RYYY}$.
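The edge-counting argument is easy to mechanise. The Python sketch below lists the edges of each pathset of $T_{23}$ (assuming the labels e1, e2, e3, e23, e123 and the tree shape used in the earlier slides), counts net occurrences with `collections.Counter`, and then repeats the calculation numerically with made-up eigenvalues in `LAM`.

```python
import math
from collections import Counter

# Edges in each pathset of T23 (assumed labelling e1, e2, e3, e23, e123).
PATHSETS = {
    "01": ["e123", "e1"],
    "02": ["e123", "e23", "e2"],
    "03": ["e123", "e23", "e3"],
    "12": ["e1", "e23", "e2"],
    "13": ["e1", "e23", "e3"],
    "23": ["e2", "e3"],
    "0123": ["e123", "e1", "e2", "e3"],
}


def net_counts(added, subtracted):
    """Net number of times each edge occurs: added minus subtracted pathsets."""
    counts = Counter()
    for name in added:
        counts.update(PATHSETS[name])
    for name in subtracted:
        counts.subtract(PATHSETS[name])
    return {edge: c for edge, c in counts.items() if c != 0}


net = net_counts(["01", "12", "13", "0123"], ["02", "03", "23"])
print(net)

# Numerical version with hypothetical edge eigenvalues (P = 0.1, Q = 0.2).
LAM = {"e1": 0.8, "e2": 0.8, "e3": 0.6, "e23": 0.6, "e123": 0.6}


def mu(name):
    """Pathset eigenvalue: product of the edge eigenvalues in the pathset."""
    return math.prod(LAM[e] for e in PATHSETS[name])


ratio = (mu("01") * mu("12") * mu("13") * mu("0123")
         / (mu("02") * mu("03") * mu("23")))
p_e1 = 0.5 * (1 - ratio ** 0.25)
print(p_e1)
```

Only e1 survives the cancellation, with a net count of 4, and the recovered `p_e1` matches the change probability that corresponds to the chosen eigenvalue of e1.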

slide-53
SLIDE 53

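◮ For 2–state sequences evolving under Neyman’s model on any $X$–tree $T$ ($X = \{x_0, x_1, \dots, x_n\}$), the local ($\lambda_A$) and global ($\mu_B$) eigenvalues are related by the products:

$$\mu_B = \prod_{\substack{A \subseteq X^*:\\ |A \cap B| \equiv 1 \pmod 2}} \lambda_A, \qquad (\lambda_A)^{2^{n-1}} = \prod_{\substack{B \subseteq X:\\ |B| \equiv 0 \pmod 2\\ |A \cap B| \equiv 1 \pmod 2}} \mu_B \prod_{\substack{B \subseteq X:\\ |B| \equiv 0 \pmod 2\\ |A \cap B| \equiv 0 \pmod 2}} \mu_B^{-1}.$$

◮ The local eigenvalues ($\lambda_A$) are derived from the edge change probabilities $p_{e_A}$:

$$\lambda_A = \begin{cases} 1 - 2p_{e_A} & \text{if } e_A \text{ is an edge of } T, \\ 1 & \text{otherwise.} \end{cases}$$

◮ The global eigenvalues ($\mu_B$) are derived from the site pattern probabilities $s_C$:

$$\mu_B = 1 - 2\Bigg(\sum_{\substack{C \subseteq X^*:\\ |B \cap C| \equiv 1 \pmod 2}} s_C\Bigg).$$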
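The product relations between local and global eigenvalues can be sketched for any small $n$. In the Python below each even-ordered $B \subseteq X$ is represented by $B^* = B \setminus \{x_0\}$, which leaves $|A \cap B|$ unchanged because $A \subseteq X^*$ never contains $x_0$; that encoding, the function names and the example eigenvalues are assumptions made for illustration, and all eigenvalues are taken to be positive.

```python
from itertools import combinations
from math import prod


def subsets(items):
    """All subsets of the given items, as frozensets."""
    items = list(items)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)


def global_eigenvalues(lam, n):
    """mu_B for every B, keyed by B* = B without x0.

    lam maps edge labels A (subsets of {1..n}) to lambda_A; labels that are
    absent are treated as lambda_A = 1, i.e. not an edge of T.
    """
    return {b: prod(l for a, l in lam.items() if len(a & b) % 2 == 1)
            for b in subsets(range(1, n + 1))}


def local_eigenvalue(mu, a, n):
    """Recover lambda_A as the 2^(n-1)-th root of the ratio of products."""
    num = prod(m for b, m in mu.items() if len(a & b) % 2 == 1)
    den = prod(m for b, m in mu.items() if len(a & b) % 2 == 0)
    return (num / den) ** (1.0 / 2 ** (n - 1))


# Hypothetical eigenvalues on T23 (P = 0.1 on e1, e2; Q = 0.2 elsewhere).
LAMBDA = {frozenset({1}): 0.8, frozenset({2}): 0.8, frozenset({3}): 0.6,
          frozenset({2, 3}): 0.6, frozenset({1, 2, 3}): 0.6}

mu = global_eigenvalues(LAMBDA, 3)
for a in subsets(range(1, 4)):
    if a:
        print(sorted(a), round(local_eigenvalue(mu, a, 3), 6))
```

Labels that are not edges of the tree come back with eigenvalue 1, so the printed table also reveals which splits are present.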

slide-55
SLIDE 55

◮ Given this invertible relationship between the local ($\lambda_A$) and global ($\mu_B$) eigenvalues, we can extend it to a relationship between the probabilities, as $p_{e_A} = (1 - \lambda_A)/2$ for each edge $e_A$ of $T$,

◮ and

$$s_C = 2^{-n}\Bigg(\sum_{\substack{B \subseteq X^*:\\ |B \cap C| \equiv 0 \pmod 2}} \mu_B \;-\; \sum_{\substack{B \subseteq X^*:\\ |B \cap C| \equiv 1 \pmod 2}} \mu_B\Bigg)$$

for each site pattern $C$.
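The pair of relations between site pattern probabilities $s_C$ and global eigenvalues $\mu_B$ can be checked as a round trip. The Python sketch below indexes both by subsets of $X^* = \{1, \dots, n\}$ and reads the parity condition in the sum as $|B \cap C|$, matching the formula for $\mu_B$; the distribution `s` is an arbitrary made-up example rather than one generated on a tree.

```python
from itertools import combinations


def subsets(n):
    """All subsets of X* = {1, ..., n}, as frozensets."""
    items = range(1, n + 1)
    return [frozenset(c) for r in range(n + 1)
            for c in combinations(items, r)]


def mu_from_patterns(s, n):
    """mu_B = 1 - 2 * (sum of s_C over patterns C with |B ∩ C| odd)."""
    return {b: 1 - 2 * sum(p for c, p in s.items() if len(b & c) % 2 == 1)
            for b in subsets(n)}


def patterns_from_mu(mu, n):
    """s_C = 2^-n * (sum of mu_B with |B ∩ C| even minus those with it odd)."""
    result = {}
    for c in subsets(n):
        total = sum((-1 if len(b & c) % 2 else 1) * m for b, m in mu.items())
        result[c] = total / 2 ** n
    return result


n = 3
weights = range(1, 2 ** n + 1)
# Hypothetical pattern distribution: weights 1..8 normalised to sum to 1.
s = {c: w / sum(weights) for c, w in zip(subsets(n), weights)}

mu = mu_from_patterns(s, n)
recovered = patterns_from_mu(mu, n)
print(max(abs(recovered[c] - s[c]) for c in s))
```

The printed maximum discrepancy is at rounding level, which illustrates that the two sums are inverse to each other whenever the $s_C$ sum to 1.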

slide-58
SLIDE 58

◮ For models of 4–state sequences evolving on an $X$–tree $T$, similar relationships hold between the local and global eigenvalues if the $4 \times 4$ stochastic matrices $M_{e_A}$ have a common diagonalising matrix.

◮ The stochastic matrices for the generalised Kimura 3–substitution types (gK3ST) model are diagonalised by the Hadamard matrix

$$H_2 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}.$$

For the gK3ST and submodels (Jukes Cantor, K2ST) the site pattern and edge probabilities are similarly related with polynomial and rational (with radicals) functions.

◮ Models with stochastic matrices with more than 3 independent parameters cannot have a common diagonalising matrix.
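That $H_2$ diagonalises a three-parameter substitution matrix can be verified directly. The Python sketch below builds a K3ST-style matrix whose entry in row $i$, column $j$ depends only on `i XOR j`; this state ordering and the parameter names `a`, `b`, `c` are assumptions chosen so that the matrix matches the row order of $H_2$ shown above, not notation taken from the talk.

```python
H2 = [[1, 1, 1, 1],
      [1, -1, 1, -1],
      [1, 1, -1, -1],
      [1, -1, -1, 1]]


def matmul(a, b):
    """Product of two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def k3st(a, b, c):
    """Stochastic matrix with three substitution probabilities a, b, c.

    Entry (i, j) is f[i XOR j], with f = [1 - a - b - c, a, b, c].
    """
    f = [1 - a - b - c, a, b, c]
    return [[f[i ^ j] for j in range(4)] for i in range(4)]


def conjugate(m):
    """Return H2^{-1} m H2, using H2^{-1} = H2 / 4."""
    hm = matmul(matmul(H2, m), H2)
    return [[x / 4 for x in row] for row in hm]


d = conjugate(k3st(0.10, 0.05, 0.02))
for row in d:
    print([round(x, 6) for x in row])
```

The result is diagonal, with entries 1, $1 - 2(a + c)$, $1 - 2(b + c)$ and $1 - 2(a + b)$ under this ordering, so matrices of this form on different edges share the same diagonalising matrix.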

slide-59
SLIDE 59

THANKS!
