Maximum Agreement Subtrees, Seth Sullivant, North Carolina State University



SLIDE 1

Maximum Agreement Subtrees

Seth Sullivant

North Carolina State University

March 24, 2018

Seth Sullivant (NCSU) Maximum Agreement Subtrees March 24, 2018 1 / 23

SLIDE 3

Phylogenetics

Problem

Given a collection of species, find the tree that explains their history.

[Figure omitted. Credit: Yeates, Meier, Wiegman, 2015]

SLIDE 4

Rooted Binary X-Trees

Definition

A rooted tree T has a distinguished vertex ρ, the root. A rooted binary phylogenetic X-tree T is a binary tree that has a distinguished root vertex and whose leaves are labeled by X.

[Figure: a rooted binary tree with leaves labeled 3, 4, 6, 8, 1, 2, 5, 7]

In phylogenetics, we only have access to data on extant (not extinct) species. We have no data on the species corresponding to internal nodes of the tree.

SLIDE 5

Induced subtrees

Let X be a label set, with n = |X|. Let T be a binary rooted phylogenetic X-tree. Given S ⊆ X, the restriction T|S is the rooted binary tree induced by T on the leaves in S.

[Figure: a tree T with leaves labeled 3, 4, 6, 8, 1, 2, 5, 7 and its restriction T|S to S = {2, 3, 5}]

Definition

Given binary rooted phylogenetic X-trees T1 and T2, MAST(T1, T2) = max{#S : S ⊆ X and T1|S = T2|S}. This is the size of a maximum agreement subtree.
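The definitions of T|S and MAST(T1, T2) can be made concrete with a small brute-force sketch. It assumes a tree is encoded as a leaf label or a pair (left, right) of subtrees. That encoding and the function names (leaves, restrict, canonical, mast_brute_force) are illustrative choices, not notation from the talk. The search tries every subset S, so it is only usable for very small n.

```python
from itertools import combinations

def leaves(t):
    """Leaf labels of a tree given as a label or a pair (left, right)."""
    return {t} if not isinstance(t, tuple) else leaves(t[0]) | leaves(t[1])

def restrict(t, s):
    """Return the restriction T|S, or None if no leaf of t lies in s."""
    if not isinstance(t, tuple):
        return t if t in s else None
    left, right = restrict(t[0], s), restrict(t[1], s)
    if left is None:
        return right          # suppress the degree-2 vertex
    if right is None:
        return left
    return (left, right)

def canonical(t):
    """Order-independent form, so (a, b) and (b, a) compare equal."""
    if not isinstance(t, tuple):
        return t
    return frozenset((canonical(t[0]), canonical(t[1])))

def mast_brute_force(t1, t2):
    """Largest #S with T1|S = T2|S (both trees on the same label set)."""
    labels = sorted(leaves(t1))
    for size in range(len(labels), 0, -1):
        for subset in combinations(labels, size):
            s = set(subset)
            if canonical(restrict(t1, s)) == canonical(restrict(t2, s)):
                return size
    return 0

# Hypothetical 4-leaf example (not the 8-leaf trees from the slides).
t1 = (((1, 2), 3), 4)
t2 = (((1, 2), 4), 3)
print(restrict(t1, {1, 3, 4}))   # ((1, 3), 4)
print(mast_brute_force(t1, t2))  # 3, for example via S = {1, 2, 3}
```

Any two leaves always form an agreement set, so the result is at least 2 whenever n ≥ 2.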

SLIDE 9

Example

[Figure: two trees T1 and T2 on the leaf set {1, ..., 8}]

MAST(T1, T2) = 3

[Figure: agreement subtrees on the leaf sets {2, 3, 5}, {5, 6, 7}, and {2, 4, 7}]

Theorem (Steel-Warnow 1993)

There is an O(n^2) algorithm to compute MAST(T1, T2) of binary rooted phylogenetic X-trees.
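A polynomial-time computation can be sketched with a memoized recursion over pairs of subtrees. This is a standard dynamic program for rooted binary trees, written in the spirit of the theorem. It is not claimed to be the Steel-Warnow algorithm verbatim, and this simple version does not attempt to achieve the O(n^2) bound. Trees are assumed to be encoded as a leaf label or a pair (left, right); the names leaf_set and mast are illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def leaf_set(t):
    """Frozen set of leaf labels below t (t is a label or a pair)."""
    if not isinstance(t, tuple):
        return frozenset([t])
    return leaf_set(t[0]) | leaf_set(t[1])

@lru_cache(maxsize=None)
def mast(a, b):
    """Size of a maximum agreement subtree of rooted binary trees a and b."""
    if not isinstance(a, tuple):
        return 1 if a in leaf_set(b) else 0
    if not isinstance(b, tuple):
        return 1 if b in leaf_set(a) else 0
    (a1, a2), (b1, b2) = a, b
    return max(
        mast(a1, b1) + mast(a2, b2),   # match the root splits directly
        mast(a1, b2) + mast(a2, b1),   # match the root splits crosswise
        mast(a1, b), mast(a2, b),      # agreement lives inside one child of a
        mast(a, b1), mast(a, b2),      # agreement lives inside one child of b
    )

# Hypothetical small examples (not the trees drawn on the slides).
print(mast((((1, 2), 3), 4), (((1, 2), 4), 3)))  # 3
print(mast(((1, 2), (3, 4)), ((1, 3), (2, 4))))  # 2
```

There are O(n^2) pairs of subtrees, which is where the quadratic count in the theorem comes from. The tuple hashing done by the cache adds overhead that a careful implementation would avoid.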

SLIDE 10

What is the distribution of MAST(T1, T2)?

Problem

Determine the distribution of MAST(T1, T2) for reasonably “nice” probability distributions on rooted binary trees:

  • Uniform distribution
  • Yule-Harding distribution

Remark

Simulations [Bryant-McKenzie-Steel 2003] suggest that under both the uniform distribution and the Yule-Harding distribution E[MAST(T1, T2)] ∼ c√n, where n = |X|, for some constant c depending on the distribution.

SLIDE 11

Motivation: Comparing New Phylogenetic Methods

  • Suppose we come up with a new phylogenetic method. This method takes a data set D and constructs the tree M(D).
  • If we know the correct tree T, we can evaluate the method by computing MAST(T, M(D)).
  • If MAST(T, M(D)) is consistently small (for many different D), then we conclude that the new method does not work well.
  • How small is small? Is it smaller than what you would expect to see by random chance?
  • We need to know the distribution of MAST(T, T′).

SLIDE 13

Motivation: Cospeciation

  • Let TH be a phylogenetic tree of host species, and TP a phylogenetic tree of parasite species.
  • Hosts and parasites are paired, so TH and TP have the same label set.
  • If MAST(TH, TP) is “large”, reject the hypothesis that TH and TP evolved independently, i.e. large MAST(TH, TP) ⇒ cospeciation.
  • We need the distribution of MAST(T1, T2) for random trees under the null hypothesis of independence to perform the hypothesis test.

[Figure omitted. Source: Hafner, M.S., Nadler, S.A. (1988) Nature 332: 258-259]

SLIDE 17

Motivation: Cool Math

Suppose that both T1 and T2 are comb trees.

[Figure: two comb trees, one with leaves in the order 1, 2, ..., 9 and one with leaves in the order w1, w2, ..., w9 = 4, 2, 7, 1, 3, 5, 6, 8, 9]

A maximum agreement subtree corresponds to a longest increasing subsequence of the permutation w = w1w2w3w4w5w6w7w8w9, whose length is denoted L(w). MAST(T1, T2) for uniformly random comb trees is equivalent to L(w) for uniformly random permutations w ∈ Sn.

Theorem (Baik-Deift-Johansson 1999)

$$E[L(w)] = 2\sqrt{n} - c\,n^{1/6} + o(n^{1/6}), \qquad c \approx 1.77108,$$

$$\frac{L(w) - 2\sqrt{n}}{n^{1/6}} \to \text{Tracy-Widom random variable (in distribution)}.$$
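The quantity L(w) can be computed in O(n log n) time with the usual patience-sorting idea. The sketch below is a generic textbook routine, not code from the talk, and the name lis_length is an illustrative choice. It is applied to the permutation 4 2 7 1 3 5 6 8 9 from the comb tree figure.

```python
from bisect import bisect_left

def lis_length(w):
    """Length of a longest strictly increasing subsequence of w."""
    tails = []  # tails[i] = smallest last element of an increasing run of length i + 1
    for x in w:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)   # x extends the longest run found so far
        else:
            tails[i] = x      # x gives a smaller tail for runs of length i + 1
    return len(tails)

w = [4, 2, 7, 1, 3, 5, 6, 8, 9]
print(lis_length(w))  # 6, for example 1 3 5 6 8 9
```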

SLIDE 18

Random Trees

Biologists are interested in models for random trees as models for speciation processes.

  • Uniform distribution: Select a tree uniformly at random from all (2n − 3)!! rooted binary phylogenetic trees.
  • Yule-Harding distribution: Grow a random tree by successively splitting leaves selected uniformly at random, then apply leaf labels at random.
  • β-splitting model, α-splitting model, etc.

[Figure: a tree with leaves labeled 1 through 5]

Question

How well do the different random tree models match the shape and structure of phylogenetic trees occurring in nature?
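The two main models can be sampled with a few lines of code. This is a minimal sketch, assuming a tree is encoded as a leaf label or a pair (left, right); all function names are illustrative. For the uniform model it uses the standard sequential construction: leaf m + 1 is attached to one of the 2m − 1 edges (counting a root edge) of a tree on m leaves, which gives 1 · 3 · 5 · · · (2n − 3) = (2n − 3)!! equally likely outcomes. The Yule-Harding sampler follows the leaf-splitting description directly.

```python
import random

def leaves(t):
    """Leaf labels (or placeholders) of t, left to right."""
    return [t] if not isinstance(t, tuple) else leaves(t[0]) + leaves(t[1])

def num_nodes(t):
    return 1 if not isinstance(t, tuple) else 1 + num_nodes(t[0]) + num_nodes(t[1])

def insert_at(t, idx, leaf):
    """Attach `leaf` on the edge above the idx-th node of t (preorder)."""
    if idx == 0:
        return (t, leaf)
    left, right = t
    n_left = num_nodes(left)
    if idx <= n_left:
        return (insert_at(left, idx - 1, leaf), right)
    return (left, insert_at(right, idx - 1 - n_left, leaf))

def uniform_tree(n, rng):
    """Uniform random rooted binary tree on leaves 1..n."""
    t = 1
    for leaf in range(2, n + 1):
        t = insert_at(t, rng.randrange(num_nodes(t)), leaf)
    return t

def split_leaf(t, idx):
    """Replace the idx-th leaf (left to right) of an unlabeled shape by a cherry."""
    if t is None:
        return (None, None)
    left, right = t
    n_left = len(leaves(left))
    if idx < n_left:
        return (split_leaf(left, idx), right)
    return (left, split_leaf(right, idx - n_left))

def yule_harding_tree(n, rng):
    """Split uniformly chosen leaves n - 1 times, then label leaves at random."""
    shape = None
    for k in range(1, n):
        shape = split_leaf(shape, rng.randrange(k))
    labels = list(range(1, n + 1))
    rng.shuffle(labels)
    it = iter(labels)
    def relabel(t):
        return next(it) if t is None else (relabel(t[0]), relabel(t[1]))
    return relabel(shape)

rng = random.Random(2018)  # fixed seed only for reproducibility
print(uniform_tree(5, rng))
print(yule_harding_tree(5, rng))
```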

SLIDE 19

Properties of Random Trees

Proposition

Both Yule-Harding and uniform random trees satisfy exchangeability and sampling consistency.

Exchangeability: Permuting the leaf labels of a tree does not change its probability.

[Figure: P(T) = P(T′) for two trees T and T′ on leaves 1 through 5 that differ only by a permutation of the leaf labels]

Sampling Consistency: If T is a random tree and S ⊆ X, then T|S is a random tree from the same distribution on the leaf label set S.

Theorem (Aldous)

The expected depth of a uniformly random tree is Θ(√n). The expected depth of a Yule-Harding random tree is Θ(log n).

SLIDE 20

Conjecture About The Maximum Agreement Subtree

Conjecture

For any exchangeable sampling consistent distribution on rooted binary phylogenetic X-trees, E[MAST(T1, T2)] = Θ(√n), where n = |X|.

Recall that f(n) = Θ(√n) means that there are positive constants c and C such that c√n ≤ f(n) ≤ C√n. Note that the constants c and C might depend on the probability distribution.

We hope further to show that, asymptotically, E[MAST(T1, T2)] ∼ d√n for some d (depending on the distribution) as n → ∞.

SLIDE 24

Upper bounds

Theorem (BHLSSS)

For any exchangeable sampling consistent distribution on rooted binary phylogenetic trees, E[MAST(T1, T2)] = O(√n).

Proof sketch for uniform distribution.

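For S ⊆ X, let X_S = 1 if T1|S = T2|S, and X_S = 0 otherwise. Let

$$Y_{n,k} = \sum_{S \subseteq X,\ \#S = k} X_S = \text{number of agreement sets of size } k.$$

Then

$$E[Y_{n,k}] = \binom{n}{k} P(X_S = 1) = \binom{n}{k} \frac{1}{(2k-3)!!} \to 0 \quad \text{if } k > c\sqrt{n}.$$

Since MAST(T1, T2) ≥ k exactly when Y_{n,k} ≥ 1, Markov's inequality gives P(MAST(T1, T2) ≥ k) ≤ E[Y_{n,k}]. So E[Y_{n,k}] → 0 for large n implies that P(MAST(T1, T2) > c√n) goes to 0.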
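The first-moment bound can be explored numerically. The sketch below evaluates E[Y_{n,k}] = C(n, k) / (2k − 3)!! for the uniform distribution and finds the first k at which the expectation drops below 1, to compare with √n. The function names are illustrative, and the cutoff "below 1" is only a convenient marker, not a constant taken from the talk.

```python
import math

def odd_double_factorial(m):
    """m!! for odd m >= -1, with (-1)!! = 1!! = 1."""
    result = 1
    for j in range(3, m + 1, 2):
        result *= j
    return result

def expected_agreement_sets(n, k):
    """E[Y_{n,k}] for two independent uniform trees on n leaves."""
    return math.comb(n, k) / odd_double_factorial(2 * k - 3)

def first_k_below_one(n):
    """Smallest k with E[Y_{n,k}] < 1, using exact integer arithmetic."""
    k = 1
    while math.comb(n, k) >= odd_double_factorial(2 * k - 3):
        k += 1
    return k

for n in (100, 1000, 10000):
    k = first_k_below_one(n)
    print(n, k, k / math.sqrt(n))
```

For large n the printed ratio k/√n gives a rough numerical feel for the constant c in the argument.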

SLIDE 28

Lower Bounds: Uniform Distribution

Theorem (BHLSSS)

Under the uniform distribution on trees, E[MAST(T1, T2)] = Ω(n^{1/8}).

Proof Sketch.

  • The expected depth of a uniform random tree is Θ(n^{1/2}). So with high probability there is a subset S ⊆ X such that T1|S is a comb tree of size c n^{1/2}.
  • Similarly, with high probability there is a subset S′ ⊆ S with #S′ = Θ(n^{1/4}) such that T1|S′ and T2|S′ are both comb trees.
  • By sampling consistency, T1|S′ and T2|S′ are uniformly random comb trees with Θ(n^{1/4}) leaves.
  • By Baik-Deift-Johansson, T1|S′ and T2|S′ have an agreement subtree of expected size Θ(n^{1/8}).

SLIDE 29

Lower Bounds: Yule-Harding Distribution

Theorem (BHLSSS)

Under the Yule-Harding distribution on trees, E[MAST(T1, T2)] = Ω(n^α), where α is the positive root of 2^{2−α} = (α + 1)(α + 2) (α ≈ .344).

From the Steel-Warnow algorithm, we see that for trees T1 and T2 of the following shapes

[Figure: T1 with root subtrees A and B; T2 with root subtrees C and D]

MAST(T1, T2) ≥ max(MAST(A, C) + MAST(B, D), MAST(A, D) + MAST(B, C))
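The exponent α can be checked numerically. The sketch below solves 2^{2−α} = (α + 1)(α + 2) by bisection on [0, 1]. The left side is decreasing and the right side is increasing there, so the root in that interval is unique. The function names and the bracket [0, 1] are choices made for this illustration.

```python
def f(alpha):
    """Difference of the two sides of 2^(2 - alpha) = (alpha + 1)(alpha + 2)."""
    return 2.0 ** (2.0 - alpha) - (alpha + 1.0) * (alpha + 2.0)

def positive_root(lo=0.0, hi=1.0, tol=1e-12):
    """Bisection, assuming f(lo) > 0 > f(hi) and f decreasing on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

alpha = positive_root()
print(round(alpha, 3))  # 0.344
```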

SLIDE 30

Lower Bounds: Fixed Tree Shape

Theorem (Misra-S.)

Let T1 and T2 be uniformly random trees with the same tree shape with n leaves. Then E[MAST(T1, T2)] = Θ(√n).

The idea comes from random comb trees and connections to longest increasing subsequences.

[Figure: two comb trees, one with leaves in the order 1, 2, ..., 9 and one with leaves in the order w1, w2, ..., w9 = 4, 2, 7, 1, 3, 5, 6, 8, 9]

SLIDE 34

The Simplest Proof of Ω(√n) Lower Bound

Let w1w2 · · · wn be a uniformly random permutation. Break this into blocks of length k:

$$B_1 | B_2 | \cdots | B_{n/k} = (w_1 \cdots w_k) | (w_{k+1} \cdots w_{2k}) | \cdots | (w_{n-k+1} \cdots w_n)$$

Let’s call block B_i awesome if one of the numbers (i − 1)k + 1, . . . , ik appears in that block. Note that the awesome blocks give AN increasing subsequence (but probably not the longest): pick one such number from each awesome block. For example, with n = 16 and k = 4,

1 5 6 11 | 8 9 2 3 | 16 15 13 4 | 7 14 10 12

blocks 1, 2, and 4 are awesome, giving the increasing subsequence 1, 8, 14.

So if we can get a lower bound on the expected number of awesome blocks, that will give a lower bound on the length of the longest increasing subsequence.
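The block construction is easy to run on the 16-element example. The sketch below assumes n is a multiple of k, and the function name awesome_subsequence is an illustrative choice. From each awesome block it picks the first entry that lies in that block's target range, so the picked entries are automatically increasing.

```python
def awesome_subsequence(w, k):
    """One entry from each awesome block of w (blocks of length k)."""
    picked = []
    for i in range(len(w) // k):
        block = w[i * k:(i + 1) * k]
        targets = range(i * k + 1, (i + 1) * k + 1)  # (i-1)k+1, ..., ik for 1-based i
        hits = [x for x in block if x in targets]
        if hits:                                     # the block is awesome
            picked.append(hits[0])
    return picked

w = [1, 5, 6, 11, 8, 9, 2, 3, 16, 15, 13, 4, 7, 14, 10, 12]
print(awesome_subsequence(w, 4))  # [1, 8, 14]
```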

SLIDE 38

A block B_i is awesome if one of the numbers (i − 1)k + 1, . . . , ik appears in that block. The probability that B_i is awesome is approximately

$$1 - \left(1 - \frac{k}{n}\right)^{k}.$$

The expected number of awesome blocks is then

$$\left(1 - \left(1 - \frac{k}{n}\right)^{k}\right) \frac{n}{k}.$$

Taking k = √n, the expected number of awesome blocks is

$$\left(1 - \left(1 - \frac{1}{\sqrt{n}}\right)^{\sqrt{n}}\right) \sqrt{n} \ \ge\ (1 - e^{-1}) \sqrt{n}.$$

Proposition

The expected length of the longest increasing subsequence of a uniformly random permutation is bounded below by (1 − e^{−1})√n.
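The approximation can be compared with the exact probability. A block holds a uniformly random k-subset of {1, ..., n}, so it misses its k target numbers with probability C(n − k, k) / C(n, k). This exact formula is a supplementary calculation, not a formula from the slides. Each factor (n − k − j)/(n − j) of that ratio is at most 1 − k/n, so the exact probability is at least 1 − (1 − k/n)^k and the bound (1 − e^{−1})√n is a genuine lower bound. The function names below are illustrative.

```python
import math

def exact_awesome_probability(n, k):
    """P(a block of k entries contains at least one of its k target numbers)."""
    return 1 - math.comb(n - k, k) / math.comb(n, k)

def approx_awesome_probability(n, k):
    """The approximation 1 - (1 - k/n)^k."""
    return 1 - (1 - k / n) ** k

for m in (4, 10, 30, 100):
    n, k = m * m, m                      # k = sqrt(n), so there are n / k = m blocks
    exact = m * exact_awesome_probability(n, k)
    approx = m * approx_awesome_probability(n, k)
    bound = (1 - math.exp(-1)) * m
    print(n, exact, approx, bound)       # exact >= approx >= bound
```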

SLIDE 42

Extending These Ideas for Trees of Same Shape

Proposition

The leaf set of any tree on n leaves can be grouped into at least n/(4k) blobs of size between k and 2k − 2. The blobs yield a scaffold tree which can force a structure for certain agreement subtrees between two trees of the same shape.

[Figure: example with n = 17 and k = 3, showing two trees with leaves labeled 1 through 17]

SLIDE 46

Let T1, T2 be two random trees with the same shape. They have corresponding blobs B_1(T_i), . . . , B_{n/4k}(T_i), i = 1, 2.

Call a blob B_j awesome if B_j(T1) ∩ B_j(T2) ≠ ∅. The expected number of awesome blobs is at least

$$\frac{n}{4k} \left(1 - \left(1 - \frac{k}{n}\right)^{k}\right).$$

Awesome blobs give AN agreement subtree between T1 and T2, a subtree of the scaffold. Taking k = √n gives:

Proposition

If T1 and T2 are uniformly random trees with n leaves and the same shape, then

$$E[\mathrm{MAST}(T_1, T_2)] \ \ge\ \frac{1 - e^{-1}}{4} \sqrt{n}.$$

SLIDE 47

Trees with Same Shape

Theorem (Misra-S.)

Let T1 and T2 be uniformly random trees with the same tree shape with n leaves. Then E[MAST(T1, T2)] = Θ(√n).

Conjecture (Martin–Thatte 2013)

If T1 and T2 are arbitrary completely balanced trees with n leaves, then MAST(T1, T2) ≥ √n.

SLIDE 48

Summary: Now We Know How Little We Know

  • Computing the distribution of MAST(T1, T2) is a generalization of hard problems in combinatorial probability.
  • Simulations suggest that E[MAST(T1, T2)] ∼ c n^{1/2} for the uniform and Yule-Harding distributions.
  • We have upper bounds of the form C n^{1/2} for all exchangeable, sampling consistent distributions.
  • We have lower bounds of the form c n^α for uniform, Yule-Harding distributions, fixed shape, and some β-splitting examples.

Question

Is E[MAST(T1, T2)] ∼ c n^{1/2} universal for all exchangeable, sampling consistent distributions? What else can be said about the distribution of MAST(T1, T2)?

SLIDE 49

References

Aldous, David. Probability distributions on cladograms. Random discrete structures (Minneapolis, MN, 1993), 1–18, IMA Vol. Math. Appl., 76, Springer, New York, 1996.

Baik, Jinho; Deift, Percy; Johansson, Kurt. On the distribution of the length of the longest increasing subsequence of random permutations. J. Amer. Math. Soc. 12 (1999), no. 4, 1119–1178.

Bernstein, Daniel Irving; Ho, Lam Si Tung; Long, Colby; Steel, Mike; St. John, Katherine; Sullivant, Seth. Bounds on the expected size of the maximum agreement subtree. SIAM J. Discrete Math. 29 (2015), no. 4, 2065–2074.

Bryant, David; McKenzie, Andy; Steel, Mike. The size of a maximum agreement subtree for random binary trees. Bioconsensus (Piscataway, NJ, 2000/2001), 55–65, DIMACS Ser. Discrete Math. Theoret. Comput. Sci., 61, Amer. Math. Soc., Providence, RI, 2003.

Martin, Daniel M.; Thatte, Bhalchandra D. The maximum agreement subtree problem. Discrete Appl. Math. 161 (2013), no. 13–14, 1805–1817.