SLIDE 2 01‐Apr‐15 2
How to tell the difference?
- If the two branches of a node contain the same species, it is a gene duplication node
– Only if those two proteins are both really present in the same genome and there has been no horizontal gene transfer (HGT)
- If the two branches of a node contain different species, it is a speciation node
– Unless genes were lost from the genomes; this is called unrecognized paralogy
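The species-overlap rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree encoding (leaves as `(species, gene)` tuples, internal nodes as two-element lists), the function names, and the frog/mouse example tree are all hypothetical, and the sketch ignores the caveats about HGT and gene loss.

```python
def species_below(node):
    """Return the set of species found in the leaves below a node."""
    if isinstance(node, tuple):          # leaf: (species, gene)
        return {node[0]}
    return species_below(node[0]) | species_below(node[1])


def genes_below(node):
    """Return the gene names found in the leaves below a node."""
    if isinstance(node, tuple):
        return [node[1]]
    return genes_below(node[0]) + genes_below(node[1])


def classify_nodes(node):
    """List (genes under node, label) for every internal node, root first."""
    if isinstance(node, tuple):
        return []
    left, right = node
    # Same species on both sides of the node -> duplication, else speciation.
    shared = species_below(left) & species_below(right)
    label = "duplication" if shared else "speciation"
    return ([(genes_below(node), label)]
            + classify_nodes(left) + classify_nodes(right))


# Hypothetical gene tree (not taken from the slides).
tree = [[("frog", "frog_X"), ("mouse", "mouse_X")],
        [("frog", "frog_Y"), ("mouse", "mouse_Y")]]

for genes, label in classify_nodes(tree):
    print(label, genes)
```

In this made-up tree the root is labelled a duplication (frog and mouse occur on both sides) and the two inner nodes are labelled speciations.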
The evolution of gene functions
- Orthologs are directly related to the same ancestral gene, but have been retained in different species
– Since there is only one gene to perform the function, orthologous proteins generally perform the same function in these different genomes
[Figure: orthologs, from the ancestral genome to the current genomes]
- Paralogs have evolved side by side in the same genome for a while
– Since there are two genes with a redundant function, paralogous proteins might be more free to change their function in time
– Some can acquire new functions (neo- and sub-functionalization)
– Most gene families evolve through duplications
[Figure: paralogs, from the ancestral genome to the current genome]
Question
[Figure: Tree 1 with leaves Mouse A, Mouse B, Human A, Human B; Tree 2 with leaves Mouse C, Mouse D, Human C, Human D]
- Observe the two simplified gene trees above of two homologs from mouse and two homologs from human.
1. Which are the speciation nodes and which are the gene duplication nodes?
2. What kind of homologs are Mouse A and Human A in Tree 1?
3. What kind of homologs are Human C and Human D in Tree 2?
4. Which genes may have the same function in Tree 1?
5. Which genes may have the same function in Tree 2?
Answers
[Figure: Tree 1 with leaves Mouse A, Mouse B, Human A, Human B; Tree 2 with leaves Mouse C, Mouse D, Human C, Human D]
- Observe the two simplified gene trees above of two homologs from mouse and two homologs from human.
1. See above.
2. The Mouse A and Human A genes in Tree 1 are orthologs.
3. The Human C and Human D genes in Tree 2 are paralogs.
4. In Tree 1, Mouse A / Human A, and Mouse B / Human B are more likely to have the same function.
5. We cannot say which genes may have the same function in Tree 2. What we can say is that the genes are all homologous, so they will probably have a similar function to some extent.
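Answers 2 and 3 can be checked with a short Python sketch that classifies a pair of genes by the node where they last shared an ancestor: a speciation node gives orthologs, a duplication node gives paralogs. The tree topologies below are an assumption inferred from the answers, since the figures themselves are not reproduced here; the tree encoding and the function names are hypothetical as well.

```python
def leaves(node):
    """Return all (species, gene) leaves below a node."""
    if isinstance(node, tuple):
        return [node]
    return leaves(node[0]) + leaves(node[1])


def relation(node, gene_a, gene_b):
    """Classify two genes by the type of their last common ancestor node."""
    names = {gene for _, gene in leaves(node)}
    if gene_a == gene_b or not {gene_a, gene_b} <= names:
        raise ValueError("need two different genes that are both in the tree")
    left, right = node
    for child in (left, right):
        child_names = {gene for _, gene in leaves(child)}
        if gene_a in child_names and gene_b in child_names:
            # Both genes sit deeper in this child, so keep descending.
            return relation(child, gene_a, gene_b)
    # The genes split at this node: it is their last common ancestor.
    shared = {s for s, _ in leaves(left)} & {s for s, _ in leaves(right)}
    return "paralogs" if shared else "orthologs"


# ASSUMED topologies (inferred from the answers, not read from the figure).
tree1 = [[("mouse", "Mouse A"), ("human", "Human A")],
         [("mouse", "Mouse B"), ("human", "Human B")]]
tree2 = [[("mouse", "Mouse C"), ("mouse", "Mouse D")],
         [("human", "Human C"), ("human", "Human D")]]

print(relation(tree1, "Mouse A", "Human A"))
print(relation(tree2, "Human C", "Human D"))
```

Under these assumed topologies, every mouse gene in Tree 2 is related to every human gene through the same speciation node, which is one way to see why answer 5 cannot single out a pair with the same function.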
Using orthology for function prediction
[Figure: gene tree over time, from the ancestor to the present species, with Hex1 and Hex2 in mouse and human]
- Researchers are often trying to identify orthologs in model organisms, because if the function of the ortholog has been studied in the model organism, it might perform the same function in other organisms as well
- However, note that the definition of orthology says nothing about function
[Figure labels: tetrapods; Hex]
Identifying orthologs
- Orthologs are best identified by studying phylogenetic trees
– They are derived from a speciation node
– There can be simple 1:1 orthologs, but if one or both of the daughter lineages expanded by gene duplication (possibly resulting in many paralogs), there can also be 1:many or many:many orthology relationships
– This can make function prediction based on orthology difficult
- Operational definition of orthology: bi‐directional best hits (BBH)
– Blast human_Hex1 against all proteins in mouse: mouse_Hex1 is the best hit
– Blast mouse_Hex1 against all proteins in human: if human_Hex1 is the best hit, then human_Hex1 and mouse_Hex1 are probably orthologs
– If there is no 1:1 BBH, then it is likely that one or both of the orthologs duplicated since the speciation event
[Figure: bi-directional best hits between Protein_1 in Sp_1 and Protein_2 in Sp_2]
[Figure: gene tree of Hex1 and Hex2 in frog, mouse and human, descending from Hex]
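The BBH procedure can be sketched in Python as follows. This is a minimal illustration, not a BLAST wrapper: it assumes the two searches have already been run and their results loaded into dictionaries mapping each query to `{subject: score}`. The function names and all scores are made up for the example.

```python
def best_hit(scores, query):
    """Return the highest-scoring subject for a query, or None if it has no hits."""
    hits = scores.get(query, {})
    return max(hits, key=hits.get) if hits else None


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    pairs = []
    for a in a_vs_b:
        b = best_hit(a_vs_b, a)
        if b is not None and best_hit(b_vs_a, b) == a:
            pairs.append((a, b))
    return sorted(pairs)


# Made-up scores standing in for the two BLAST runs (human vs mouse, mouse vs human).
human_vs_mouse = {
    "human_Hex1": {"mouse_Hex1": 950.0, "mouse_Hex2": 610.0},
    "human_Hex2": {"mouse_Hex2": 900.0, "mouse_Hex1": 600.0},
}
mouse_vs_human = {
    "mouse_Hex1": {"human_Hex1": 940.0, "human_Hex2": 590.0},
    "mouse_Hex2": {"human_Hex2": 910.0, "human_Hex1": 605.0},
}

for human_gene, mouse_gene in bidirectional_best_hits(human_vs_mouse, mouse_vs_human):
    print(human_gene, "<->", mouse_gene)
```

A query without a reciprocal best hit simply drops out of the result, which corresponds to the "no 1:1 BBH" case above. Tied scores are resolved arbitrarily here (the first maximum wins), so a real analysis would need an explicit rule for ties.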