Evolutionary Analysis
From trees to networks
- Dr. Taoyang Wu
School of Computing Sciences, University of East Anglia
Shanghai Jiao Tong University August 2016
Research interests
◮ Discrete Mathematics
  ◮ Optimal realisations (DM 2012 & 2015, DAM 2013)
  ◮ Trees and graphs (JDA 2009, DM 2011, SIAM DM 2014)
  ◮ Distance problems for permutation groups (DM 2009 & 2010)
◮ Real-world networks: Protein-Protein Interaction
  ◮ Inferring PPI evolution (TCBB 2013)
  ◮ Modelling PPI networks (TCS 2013)
◮ Phylogenetics
  ◮ Tree space (TCBB 2013, BMB 2014, AiAM 2015)
  ◮ Tree reconciliation (BMC Bioinfor. 2011, COCOA 2013)
  ◮ Tree shape statistics (TPB 2016; JMB 2015)
  ◮ Phylogenetic networks (SB 2015; MBE 2016; JMB 2016; Algorithmica 2016)
1 Introduction
2 Phylogenetic Trees
  ◮ Tree inference
  ◮ Combinatorial properties
  ◮ Statistical properties
3 Phylogenetic Networks
  ◮ Information bottleneck
  ◮ Network reconstruction
Figure: Examples of Phylogenetic Trees
Figure: From A History of Architecture on the Comparative Method for the Student, Craftsman, and Amateur, 1954.
Figure: Part of The Tree of Languages; from the internet.
Figure: Molecular phylogeny for 31,749 species of seed plants; from [Zanne et al, Nature, 2014].
Figure: Angiosperm phylogeny of 3,467 species; from [Werner et al, Nature Communications, 2014].
◮ Tree T = (V, E): connected, acyclic graph
◮ Semi-labelled tree: leaves are labelled.
◮ Phylogenetic tree: binary semi-labelled tree
◮ Rooted vs unrooted
◮ Motivation: evolutionary relationships in biology etc.
Figure: A rooted phylogenetic tree (left) and an unrooted phylogenetic tree (right).
Definition
$\mathcal{T}_n$ and $\mathcal{T}^*_n$: the collections of rooted and unrooted phylogenetic trees, respectively, with leaf set $\{1, \dots, n\}$.
Figure: A glimpse at the tree space
In general, we have (Schröder)
$$|\mathcal{T}^*_n| = 1 \times 3 \times \cdots \times (2n - 5) \quad \text{and} \quad |\mathcal{T}_n| = 1 \times 3 \times \cdots \times (2n - 3)$$
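To get a feel for how quickly the tree space grows, the two products above can be evaluated directly. The following is a minimal sketch; the function names are illustrative and not part of the slides.

```python
def odd_product(m: int) -> int:
    """Return 1 * 3 * 5 * ... * m for odd m (the empty product, m < 3, is 1)."""
    result = 1
    for k in range(3, m + 1, 2):
        result *= k
    return result


def count_unrooted(n: int) -> int:
    """Number of unrooted phylogenetic trees on n >= 3 leaves: 1 * 3 * ... * (2n - 5)."""
    if n < 3:
        raise ValueError("need at least 3 leaves")
    return odd_product(2 * n - 5)


def count_rooted(n: int) -> int:
    """Number of rooted phylogenetic trees on n >= 2 leaves: 1 * 3 * ... * (2n - 3)."""
    if n < 2:
        raise ValueError("need at least 2 leaves")
    return odd_product(2 * n - 3)


if __name__ == "__main__":
    for n in (3, 4, 5, 10, 20):
        print(n, count_unrooted(n), count_rooted(n))
```

Already for n = 10 there are more than two million unrooted trees, which is why exhaustive search over the tree space is rarely an option.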
Given a dataset on a taxon set X, find an optimal phylogenetic tree to explain the evolutionary relationships.
Data
◮ Morphological data (traits)
◮ Genetic data (molecular sequences)
Criteria
◮ Maximum Parsimony: minimises the total number of character-state changes.
◮ Maximum Likelihood: maximises the likelihood function.
◮ Minimum Evolution: minimises the sum of the edge lengths.
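To make the parsimony criterion concrete, the sketch below scores a single character on a fixed rooted binary tree using Fitch's algorithm. This is an illustration that is not taken from the slides: the nested-tuple tree encoding and the function names are assumptions, and it only evaluates one given tree rather than searching for the optimal one.

```python
def fitch(tree, states):
    """Return (candidate_states, changes) for a rooted binary tree.

    tree   : a leaf name (str) or a pair (left_subtree, right_subtree)
    states : dict mapping each leaf name to its character state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    common = left_set & right_set
    if common:
        # The children can agree on a state: no extra change is needed here.
        return common, left_changes + right_changes
    # The children disagree: one additional state change is required.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_score(tree, states):
    """Minimum number of state changes needed to explain one character on the tree."""
    return fitch(tree, states)[1]


if __name__ == "__main__":
    states = {"a": "A", "b": "A", "c": "G", "d": "G"}
    print(parsimony_score((("a", "b"), ("c", "d")), states))
    print(parsimony_score((("a", "c"), ("b", "d")), states))
```

Under this criterion the first tree, which groups taxa with identical states, is preferred because it needs fewer changes. For several characters the scores are simply summed.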
The tree inference problem is NP-hard for
◮ Maximum Parsimony [Foulds-Graham 1982]
◮ Maximum Likelihood [Chor-Tuller 2005]
◮ Minimum Evolution [Bastkowski-Moulton-Spillner-W 2016]
The search space (i.e., tree-space) is large and complicated.
Motivation:
Three operations:
◮ NNI (Nearest neighbour interchange)
◮ SPR (Subtree prune and regraft)
◮ TBR (Tree bisection and reconnection)
Nearest neighbour interchange (NNI)
Figure: A schematic representation of the NNI operation
Subtree prune and regraft (SPR)
Figure: A schematic representation of the SPR operation
Note: All degree-two vertices are suppressed.
Tree bisection and reconnection (TBR)
Figure: A schematic representation of the TBR operation
A TBR operation consists of two steps:
◮ Bisection: deleting an edge e;
◮ Reconnection: inserting a new edge f;
Note: All degree-two vertices are suppressed.
$G_{TBR}(n) = (V_n, E_n)$ with
◮ $V_n$: the trees in $\mathcal{T}^*_n$;
◮ $E_n$: two trees $T_1$ and $T_2$ are adjacent if there exists a TBR operation transforming $T_1$ into $T_2$.
Similarly, we can define $G_{NNI}(n)$ and $G_{SPR}(n)$. Note that all three operations are symmetric and
NNI ⊆ SPR ⊆ TBR,
that is, any NNI operation is an SPR operation, while any SPR operation is a TBR operation.
◮ $G_{NNI}(n)$ is regular with degree $2(n - 3)$ (Robinson 1971);
◮ $G_{SPR}(n)$ is regular with degree $2(n - 3)(2n - 7)$ (Allen & Steel 2001);
◮ $G_{TBR}(n)$ is not regular; the maximal degree is attained by caterpillar trees (Humphries 2008).
Theorem (Humphries-W, TCBB 2013)
For each vertex $T \in \mathcal{T}^*_n$ with $n \ge 3$, its degree in $G_{TBR}(n)$ is
$$4\Gamma(T) - (8n^2 - 18n + 6)$$
with $\Gamma(T) := \sum_{\{u,v\}} \mathrm{dist}_T(u, v)$ denoting the sum of the distances between all pairs of leaves of T.
For the vertices in $G_{TBR}(n)$:
◮ Maximal degree: Caterpillar Trees
◮ Minimal degree: Semi-regular Trees (see also [Szekely-Wang-W, DM 2011])
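The degree formula is easy to evaluate on small trees. The sketch below computes Γ(T) by breadth-first search and applies the formula from the theorem. It assumes that Γ(T) sums over unordered pairs of leaves and that distances are counted in edges; the edge-list encoding and all function names are illustrative.

```python
from collections import deque


def adjacency(edges):
    """Build an adjacency dict from a list of undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def leaf_distance_sum(adj):
    """Gamma(T): sum of path lengths over all unordered pairs of leaves."""
    leaves = [v for v, nbrs in adj.items() if len(nbrs) == 1]
    total = 0
    for source in leaves:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist[leaf] for leaf in leaves)
    return total // 2  # every unordered pair was counted twice


def tbr_degree(adj):
    """Degree of T in G_TBR(n) according to 4 * Gamma(T) - (8n^2 - 18n + 6)."""
    n = sum(1 for nbrs in adj.values() if len(nbrs) == 1)
    return 4 * leaf_distance_sum(adj) - (8 * n * n - 18 * n + 6)


def nni_degree(n):
    return 2 * (n - 3)


def spr_degree(n):
    return 2 * (n - 3) * (2 * n - 7)


if __name__ == "__main__":
    caterpillar6 = adjacency([("a", "u1"), ("b", "u1"), ("u1", "u2"), ("c", "u2"),
                              ("u2", "u3"), ("d", "u3"), ("u3", "u4"),
                              ("e", "u4"), ("f", "u4")])
    print(nni_degree(6), spr_degree(6), tbr_degree(caterpillar6))
```

On six leaves the caterpillar has a larger TBR neighbourhood than the tree with three cherries around a central vertex, in line with the extremal statements above.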
Lemma
For two “distinct” TBR operations θ and θ′, θ(T) = θ′(T) implies that both θ and θ′ are NNI operations.
Note: Here two TBR operations are distinct if
◮ they delete different edges in the bisection step, or
◮ they use different edges in the reconnection step.
◮ The number of trees in $\mathcal{T}_n$ is
$$\varphi(n) := (2n - 3)!! = 1 \cdot 3 \cdots (2n - 3)$$
◮ Under the proportional to distinguishable arrangements (PDA) model, each tree has the same probability of being generated, that is, we have
$$P_u(T) = \frac{1}{\varphi(n)} \qquad (1)$$
for every T in $\mathcal{T}_n$.
Under the Yule–Harding model [Yule 1925, Harding 1971],
◮ Beginning with a two-leaf tree, we “grow” it by repeatedly splitting a leaf into two new leaves.
◮ The leaf to split is chosen uniformly at random among all the leaves of the current tree.
◮ After obtaining an unlabelled tree with n leaves, we label each leaf with a label sampled uniformly at random (without replacement) from {1, · · · , n}.
When branch lengths are ignored, the Yule–Harding model has been shown [Aldous, 1996] to generate the same distribution on trees as Kingman’s coalescent process, and so we call it the YHK model.
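The growth process described above can be simulated in a few lines. This is a minimal sketch under the stated rules; the integer node identifiers, the child-map encoding, and the function names are assumptions made for illustration only.

```python
import random


def yule_harding_tree(n, rng):
    """Grow a rooted binary tree with n >= 2 leaves by repeated leaf splitting.

    Returns (children, labelling): children maps each internal node to its
    two children, labelling maps each leaf node to a label in {1, ..., n}.
    """
    if n < 2:
        raise ValueError("need at least 2 leaves")
    children = {0: (1, 2)}  # start from the two-leaf tree rooted at node 0
    leaves = [1, 2]
    next_id = 3
    while len(leaves) < n:
        i = rng.randrange(len(leaves))  # uniform choice among current leaves
        split = leaves[i]
        children[split] = (next_id, next_id + 1)
        leaves[i] = next_id             # the split leaf becomes internal
        leaves.append(next_id + 1)
        next_id += 2
    labels = list(range(1, n + 1))
    rng.shuffle(labels)                 # uniform labelling without replacement
    return children, dict(zip(leaves, labels))


def count_cherries(children):
    """Number of internal nodes whose two children are both leaves."""
    return sum(1 for kids in children.values()
               if all(k not in children for k in kids))


if __name__ == "__main__":
    tree, labelling = yule_harding_tree(10, random.Random(2016))
    print(count_cherries(tree), labelling)
```

Passing an explicit `random.Random` instance keeps the simulation reproducible, which is convenient when comparing empirical cherry counts between models.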
◮ Cherry: a subtree with two leaves
◮ Pitchfork: a subtree with three leaves
Figure: A tree with three cherries and one pitchfork.
Given a phylogenetic tree T, let
◮ A(T): the number of pitchforks;
◮ C(T): the number of cherries.
For n ≥ 2, consider the random variables
◮ $A_n$: the number of pitchforks in a random tree with n leaves;
◮ $C_n$: the number of cherries in a random tree with n leaves.
What is the joint distribution of $A_n$ and $C_n$?
Theorem (W-Choi, 2016)
For n > 3 and 1 < b < n, we have
$$
\begin{aligned}
P_y(A_{n+1} = a, C_{n+1} = b) ={}& \frac{2a}{n} P_y(A_n = a, C_n = b) + \frac{a + 1}{n} P_y(A_n = a + 1, C_n = b - 1) \\
&+ \frac{2(b - a + 1)}{n} P_y(A_n = a - 1, C_n = b) + \frac{n - a - 2b + 2}{n} P_y(A_n = a, C_n = b - 1).
\end{aligned}
$$
Note: A similar formula holds for the PDA model.
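The recursion lends itself to a table-filling computation with exact rational arithmetic. The sketch below is one possible implementation, not the authors' code. Two assumptions are made: the base case for n = 4 is worked out by hand (the caterpillar, with one pitchfork and one cherry, has probability 2/3, and the balanced tree, with no pitchfork and two cherries, has probability 1/3), and probabilities of states that cannot occur are treated as zero.

```python
from fractions import Fraction


def joint_distribution(n_leaves):
    """Return {(a, b): P_y(A_n = a, C_n = b)} under the YHK model, for n >= 4."""
    if n_leaves < 4:
        raise ValueError("this sketch starts from n = 4")
    # Hand-computed base case for n = 4 (an assumption of this sketch).
    dist = {(1, 1): Fraction(2, 3), (0, 2): Fraction(1, 3)}

    def p(table, a, b):
        return table.get((a, b), Fraction(0))

    for n in range(4, n_leaves):
        new = {}
        for a in range(0, n + 2):
            for b in range(1, n + 2):
                value = (Fraction(2 * a, n) * p(dist, a, b)
                         + Fraction(a + 1, n) * p(dist, a + 1, b - 1)
                         + Fraction(2 * (b - a + 1), n) * p(dist, a - 1, b)
                         + Fraction(n - a - 2 * b + 2, n) * p(dist, a, b - 1))
                if value:
                    new[(a, b)] = value
        dist = new
    return dist


if __name__ == "__main__":
    for state, prob in sorted(joint_distribution(6).items()):
        print(state, prob)
```

Using `Fraction` avoids rounding error, so the probabilities at each n sum to exactly 1, which is a convenient sanity check on the table.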
◮ A dynamic approach to computing the joint distributions.
◮ A unified approach to calculating the moments of the joint (and the marginal) distributions.
◮ The cherry distributions are log-concave. That is, for n > 2 and 1 < k < n, we have $P_y(C_n = k)^2 \ge P_y(C_n = k + 1) P_y(C_n = k - 1)$.
◮ There exists a unique change point for the cherry distributions between the YHK and the PDA models.
◮ Similar results for clade sizes and clan sizes [Zhu-Than-W, 2015].
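The log-concavity statement can be checked numerically for small n. The sketch below uses a marginal recursion for the cherry count under the YHK model that is derived here from the leaf-splitting rule rather than quoted from the slides: splitting one of the 2k leaves that lie in a cherry keeps the count at k, and splitting any other leaf raises it to k + 1. Function names are illustrative.

```python
from fractions import Fraction


def cherry_distribution(n_leaves):
    """Return {k: P_y(C_n = k)} under the YHK model, for n >= 2."""
    dist = {1: Fraction(1)}  # the two-leaf tree is a single cherry
    for n in range(2, n_leaves):
        new = {}
        for k, prob in dist.items():
            stay = Fraction(2 * k, n)  # the split leaf lies in a cherry
            new[k] = new.get(k, Fraction(0)) + stay * prob
            new[k + 1] = new.get(k + 1, Fraction(0)) + (1 - stay) * prob
        dist = {k: prob for k, prob in new.items() if prob}
    return dist


def is_log_concave(dist, n):
    """Check P(k)^2 >= P(k + 1) * P(k - 1) for all 1 < k < n."""
    def p(k):
        return dist.get(k, Fraction(0))
    return all(p(k) ** 2 >= p(k + 1) * p(k - 1) for k in range(2, n))


if __name__ == "__main__":
    print(cherry_distribution(6))
    print(all(is_log_concave(cherry_distribution(n), n) for n in range(3, 30)))
```

A numerical check of this kind is of course no substitute for the proof; it only confirms the inequality for the values of n that are actually computed.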
Phylogenetic trees are useful, but networks provide a better tool for studying
◮ conflicting signals
◮ recombination
◮ gene flow
◮ hybridization
◮ horizontal gene transfer
◮ · · ·
Figure: A phylogenetic tree and network relating 15 plant species from the genus Solanum; from [Bastkowski-Moulton-Spillner-Wu, 2015, Bull.
Figure: A partial pedigree of Prince Charles; from [Gusfield, 2014].
Figure: A history with recombination; from [Gusfield, 2014].
A (rooted) phylogenetic network:
◮ a directed acyclic graph
◮ a unique root
◮ leaves are labelled by taxa
◮ no vertex with one parent and one child
◮ binary
A central problem: How to reconstruct phylogenetic networks?
Figure: Input trees
◮ A tree is encoded by its subtrees on three leaves.
◮ A polynomial-time algorithm to assemble trees [Aho et al. 1981].
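The assembly idea can be sketched as a short recursive procedure in the spirit of the algorithm of Aho et al.: connect a and b for every triplet ab|c on the current leaf set, split the leaves into connected components, and recurse on each component. The encoding below is an assumption for illustration: a triplet ab|c is written as the tuple (a, b, c), and the output is a nested tuple, which may be non-binary if the triplets do not resolve every branching.

```python
def build(leaves, triplets):
    """Return a rooted tree displaying all triplets, or None if none exists.

    leaves   : iterable of leaf names
    triplets : list of tuples (a, b, c) meaning a and b are closer than c
    """
    leaves = set(leaves)
    if len(leaves) == 1:
        return next(iter(leaves))
    # Only triplets whose three leaves are all present matter at this level.
    relevant = [t for t in triplets if set(t) <= leaves]

    parent = {x: x for x in leaves}  # union-find over the current leaves

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, _ in relevant:
        parent[find(a)] = find(b)

    components = {}
    for x in leaves:
        components.setdefault(find(x), set()).add(x)
    if len(components) == 1:
        return None  # the triplets are incompatible on this leaf set

    subtrees = []
    for block in sorted(components.values(), key=min):
        subtree = build(block, relevant)
        if subtree is None:
            return None
        subtrees.append(subtree)
    return tuple(subtrees)


if __name__ == "__main__":
    triplets = [("a", "b", "c"), ("a", "c", "d"), ("b", "c", "e"), ("d", "e", "a")]
    print(build("abcde", triplets))
    print(build("abc", [("a", "b", "c"), ("a", "c", "b")]))
```

Sorting the components by their smallest label only serves to make the output deterministic; it has no effect on which tree is found.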
Question: Are networks encoded by their trees?
Figure: Networks N and N′ and trees T1 and T2, each with root ρ and leaves a, b, c.
Answer: No.
Question: Are networks encoded by their subnetworks?
Figure: An example of a subnetwork.
Theorem (Huber-Iersel-Moulton-Wu, 2015, Syst. Biol.)
For every n ≥ 3, there exist two non-isomorphic phylogenetic networks N1 and N2 with n leaves such that they display the same set of subnetworks (and the same set of trees).
In [Huber-Moulton, 2013, Algorithmica], it is shown that level-1 networks are encoded by their subnetworks.
Figure: level-1 = all undirected cycles are disjoint
Figure: Eight types of level-1 networks on three leaves: T1(x, y; z), S1(x, y; z), S2(x; y; z), N1(x, y; z), N2(x, y; z), N3(x; y; z), N4(x; y; z), N5(x; y; z).
Figure: Input trinets
Input: A collection of trinets.
Task:
(1) To decide whether there exists a binary level-1 phylogenetic network displaying the collection of trinets.
(2) To construct such a network if it exists.
In [Huber-Iersel-Moulton-Scornavacca-Wu, in revision for Algorithmica], we show that when some trinets are missing, then
◮ the trinet assembling problem is NP-hard;
◮ it can be solved by an $O(3^n \, \mathrm{poly}(n))$ algorithm.
Question: How about “real data” (often noisy and containing conflicting signals)?
Figure: A schematic view of Trinet-based Level One Network reconstructor, from [Oldman∗-Wu∗-Iersel-Moulton, in revision for MBE]. Panel labels: “An alignment on X = {a, . . . , j}”, “Identify a suitable subset of taxa”, “A dense set of trinets”.
Figure: The inferred phylogeny of 7 Giardia strains by Trilonet; data from [Cooper et al, Curr. Biol., 2007]. Leaf labels: Giardia_lamblia_ATCC_50803_WB, Giardia_intestinalis_isolate_246, Giardia_intestinalis_isolate_55, Giardia_intestinalis_isolate_JH, Giardia_intestinalis_isolate_335, Giardia_intestinalis_isolate_303, Giardia_intestinalis_isolate_305; one node is labelled #H1.
Trilonet is an algorithm for inferring level-1 networks:
◮ Constructing a network directly from sequence data (without using breaking points or gene trees).
◮ Efficient, and robust to noisy data.
◮ Implemented in Java, and will be available at https://www.uea.ac.uk/computing/trilonet
◮ Consistent.
Future improvements include
◮ level-k networks
◮ statistical consistency
More realistic models:
◮ Superimposing molecular evolutionary models on edges
◮ Quantifying the contribution made by reticulate processes
Reconstructing networks
◮ Rigorous statistical frameworks (Maximum Likelihood or Bayesian)
◮ Accounting for non-tree-like patterns resulting from
  ◮ Sequencing errors (e.g. SNP calling)
  ◮ Incomplete Lineage Sorting (see, e.g., Yu et al. 2014 PNAS)
◮ Efficient algorithms for searching the network space
Figure: Space of level-1 networks with four taxa; from [Huber-Linz-Moulton-Wu, J. Math. Biol., 2016]
Figure: A generalisation of the NNI operation on networks.
Figure: Space of networks with four taxa, with parts labelled N0(X), N1(X) and N2(X); from [Huber-Moulton-Wu, J. Theoretical Biol., in press]
◮ An exciting research field
  ◮ scientific curiosity
  ◮ practical impact
◮ A growing field of research
  ◮ new types of data (big data, time series)
  ◮ new applications (cancer evolution, cultural evolution)
◮ A genuine multi-disciplinary area
  ◮ mathematics (combinatorics, optimisation, probability, statistics)
  ◮ computer science (algorithms, data science)
  ◮ biology etc.
◮ Vincent Moulton and Katharina Huber (UEA)
◮ Mike Steel (Canterbury, NZ)
◮ Kwok Poi Choi (NUS)
◮ Leo van Iersel (TU Delft, the Netherlands), Celine Scornavacca (Montpellier, France), Simone Linz (Auckland, NZ), Andreas Spillner (Greifswald, Germany), Cuong Than (Tübingen, Germany), Joe Zhu (Oxford)
◮ Previous and current students: Sarah Bastkowski (TGAC) and James Oldman (UEA)
◮ Your attention!