Bioinformatics: Network Analysis
Evolution of Genes and Genomes
COMP 572 (BIOS 572 / BIOE 564) - Fall 2013 Luay Nakhleh, Rice University
The Traditional Phylogeny Reconstruction Problem
[Figure: a tree over the taxa U, V, W, X, Y and the input sequences AGGGCAT, TAGCCCA, TAGACTT, TGCACAA, TGCGCTT]
[Source: W.P. Maddison, Syst. Biol. 46(3):523-536,1997.]
[Figure: at a single locus i sampled from taxa A, B, C, D, E, the gene tree is shown alongside the species phylogeny]
[Figure: for six loci (Locus 1 to Locus 6) sampled from the same taxa, each locus has its own gene tree, shown alongside the species phylogeny]
infer species phylogeny while accounting for these events, ...
relationships, inform about gene function, understand genomic structural variations and their role in disease (e.g., cancer), ...
Lineage sorting
[Source: W.P. Maddison, Syst. Biol. 46(3):523-536,1997.]
[Source: http://topicpages.ploscompbiol.org/wiki/Detection_of_horizontal_gene_transfer]
minimum number of tree transformation operations (often, the “subtree prune and regraft” operation) that reconciles a gene tree with a species tree.
to explain the evolutionary history of the gene under study.
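The "subtree prune and regraft" (SPR) operation named above can be illustrated with a minimal Python sketch. It assumes rooted binary trees written as nested tuples with unique leaf labels, and that the pruned subtree is a proper subtree; the function names `prune`, `regraft`, and `spr` are hypothetical and not taken from any tool.

```python
def prune(tree, target):
    """Remove the subtree `target`; its parent node is suppressed."""
    if not isinstance(tree, tuple):
        return tree  # a leaf that is not the target stays as it is
    left, right = tree
    if left == target:
        return right  # the sibling takes the parent's place
    if right == target:
        return left
    return (prune(left, target), prune(right, target))


def regraft(tree, subtree, at):
    """Attach `subtree` on the edge directly above the node `at`."""
    if tree == at:
        return (tree, subtree)
    if not isinstance(tree, tuple):
        return tree
    left, right = tree
    return (regraft(left, subtree, at), regraft(right, subtree, at))


def spr(tree, subtree, at):
    """One SPR move: prune `subtree`, then regraft it above `at`."""
    return regraft(prune(tree, subtree), subtree, at)


# Illustrative tree on taxa A..E (an assumed topology, not from the slides).
species_tree = ((("A", "B"), "C"), ("D", "E"))
moved = spr(species_tree, "B", "E")
print(moved)
```

Under the usual interpretation, each such move corresponds to one proposed transfer event, so the number of moves needed to turn the species tree into the gene tree is the quantity being minimized.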
[Figure: Reconcile. A species tree and a gene tree are combined into a reconciled gene tree]
number of duplications and losses (or a weighted sum thereof) to explain the incongruence between the gene tree and species tree.
probabilistic reconciliations.
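A parsimony reconciliation of this kind can be sketched in Python under the commonly used LCA mapping. The sketch assumes binary trees written as nested tuples, gene-tree leaves labeled by the species they were sampled from, and it counts duplications only (loss counting and weighting are omitted). All function names are hypothetical.

```python
def leaves(tree):
    """Set of species labels below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return leaves(tree[0]) | leaves(tree[1])


def clades(species_tree):
    """One leaf set per species-tree node; each set identifies a node."""
    if isinstance(species_tree, str):
        return [leaves(species_tree)]
    return ([leaves(species_tree)]
            + clades(species_tree[0])
            + clades(species_tree[1]))


def lca(species_clades, species_set):
    """Smallest species-tree clade containing `species_set` (the LCA node)."""
    return min((c for c in species_clades if species_set <= c), key=len)


def count_duplications(gene_tree, species_clades):
    """A gene node is a duplication if it maps to the same node as a child."""
    if isinstance(gene_tree, str):
        return 0
    here = lca(species_clades, leaves(gene_tree))
    kids = [lca(species_clades, leaves(child)) for child in gene_tree]
    dup = 1 if here in kids else 0
    return dup + sum(count_duplications(child, species_clades)
                     for child in gene_tree)


# Illustrative three-species example (assumed, not from the slides).
species = clades((("A", "B"), "C"))
print(count_duplications((("A", "B"), "C"), species))  # congruent gene tree
print(count_duplications((("A", "C"), "B"), species))  # incongruent gene tree
```

A congruent gene tree needs no duplications, while the incongruent one is explained here by a single duplication at the root followed by losses.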
[Source: L. Arvestad and J. Lagergren (Royal Institute of Technology and Stockholm Bioinformatics Center, Stockholm, Sweden) and B. Sennblad (Stockholm University and Stockholm Bioinformatics Center, Stockholm, Sweden), "The Gene Evolution Model and Computing Its Associated Probabilities."]
[Figure: times T1 and T2, with MRCA(C,G) and MRCA(H,C,G) marked]
P[((H,C),G)] = 1 - \frac{2}{3} e^{-(T_2 - T_1)/N}
P[((H,G),C)] = \frac{1}{3} e^{-(T_2 - T_1)/N}
P[(H,(C,G))] = \frac{1}{3} e^{-(T_2 - T_1)/N}
[Figure: probability (y-axis, 0 to 1) plotted against (T2 - T1)/N (x-axis, 0 to 3) for the three gene tree topologies (HC)G, (HG)C, H(CG), with generic labels (AB)C, (AC)B, A(BC)]
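The three gene tree probabilities above can be evaluated numerically. The following is a minimal Python sketch, assuming T1, T2, and N are expressed in the units used on the slide; the function name `gene_tree_probs` is hypothetical.

```python
import math


def gene_tree_probs(t1, t2, n):
    """Probabilities of the three rooted gene tree topologies on H, C, G."""
    x = (t2 - t1) / n                # internal branch length scaled by N
    discordant = math.exp(-x) / 3.0  # each of the two discordant topologies
    return {
        "((H,C),G)": 1.0 - 2.0 * discordant,
        "((H,G),C)": discordant,
        "(H,(C,G))": discordant,
    }


# Illustrative values only; they are not taken from the slides.
for x in (0.0, 0.5, 1.0, 3.0):
    probs = gene_tree_probs(0.0, x, 1.0)
    print(x, {topology: round(p, 3) for topology, p in probs.items()})
```

As the scaled internal branch (T2 - T1)/N shrinks to zero, all three topologies become equally likely; as it grows, the topology matching the species tree dominates.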
coalescence” of the gene tree within the branches of the species tree, and
framework.
sequence alignments).
distance-based, and summary statistics.
incongruence between a species tree and its “contained” gene trees, as well as among the gene trees themselves, and the need for new methods to establish phylogenetic relationships in light of this incongruence
for
[Source: M.D. Rasmussen and M. Kellis, Method article. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA; Broad Institute, Cambridge, Massachusetts 02139, USA.]
[Source: M.S. Bansal, E.J. Alm, and M. Kellis (Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA), "Efficient algorithms for the reconciliation problem with gene duplication, horizontal transfer and loss," doi:10.1093/bioinformatics/bts225.]
[Source: Y. Yu, J.H. Degnan, and L. Nakhleh. Department of Computer Science, Rice University, Houston, Texas, United States of America; Department of Mathematics and Statistics, University of Canterbury, Christchurch, New Zealand; National Institute of Mathematical and Biological Synthesis, Knoxville, Tennessee, United States of America.]
[Source: M. Stolzer, H. Lai, M. Xu, D. Sathaye, B. Vernot, and D. Durand (Carnegie Mellon University, Pittsburgh, PA 15213, USA; University of Washington, Seattle, WA 98195, USA), "Inferring duplications, losses, transfers and incomplete lineage sorting with nonbinary species trees," doi:10.1093/bioinformatics/bts386.]
similar to that caused by true evolutionary events.
estimates before or during the species phylogeny inference process.
genome:
[Source: Bourque et al., Genome Research, 12(1):26-36,2002.]
[Source: Hampton et al., Genome Research, 19(2):167-177,2009.]
abstracted out, and the genome is turned into a list of signed numbers, where each element in the list corresponds to a gene, and the sign corresponds to the direction (strand) on the genome.
very large state space (all possible permutations of the list), and the evolution
G1 = (1 2 3 4 5 6 7 8)
G2 = (1 2 −5 −4 −3 6 7 8)
Breakpoints (arrows) are missing adjacencies.
[Figure: examples of the three rearrangement operations: inversion, transposition, and inverted transposition]
[Source: Slides on Comparative Genomics, by B.M.E. Moret, PSB 2010]
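The signed-permutation view of G1 and G2 can be made concrete with a short Python sketch. It assumes linear genomes framed by fixed ends 0 and n+1 (a common convention, adopted here as an assumption), and the function names are hypothetical.

```python
def invert(genome, i, j):
    """Inversion: reverse the segment genome[i:j] and flip every sign in it."""
    return genome[:i] + [-g for g in reversed(genome[i:j])] + genome[j:]


def framed(genome):
    """Add fixed end markers 0 and n+1 around a linear genome."""
    return [0] + list(genome) + [len(genome) + 1]


def adjacencies(genome):
    """All adjacencies of a genome, readable from either strand."""
    ext = framed(genome)
    adj = set()
    for a, b in zip(ext, ext[1:]):
        adj.add((a, b))
        adj.add((-b, -a))  # the same adjacency read on the reverse strand
    return adj


def breakpoints(g1, g2):
    """Number of adjacencies of g1 that are missing in g2."""
    adj2 = adjacencies(g2)
    ext = framed(g1)
    return sum((a, b) not in adj2 for a, b in zip(ext, ext[1:]))


g1 = [1, 2, 3, 4, 5, 6, 7, 8]
g2 = invert(g1, 2, 5)       # inverts the block 3 4 5
print(g2)                   # [1, 2, -5, -4, -3, 6, 7, 8]
print(breakpoints(g1, g2))  # 2
```

The two breakpoints are the lost adjacencies (2, 3) and (5, 6); the adjacencies inside the inverted block are preserved because they are read from the opposite strand.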
transform one genome into another
minimizes the sum of distances to all three genomes)
for similar models for the other events.
parsimony principle. There is need for probabilistic inference.
data sets. There is need for efficient algorithms and high-performance computing techniques.
and/or genome rearrangements.
rise to the incongruence.
requires accounting for “evolution within a gene” (e.g., nucleotide evolution) and “evolution within and across the branches of the species phylogeny” (i.e., gene tree incongruence).