19-10-31 Phylogenetics 2: Phylogenetic and genealogical homology



SLIDE 1

Phylogenetics 2: Phylogenetic and genealogical homology

human    GTG CTG TCT CCT GCC GAC AAG ACC AAC GTC AAG GCC GCC TGG GGC AAG GTT GGC GCG CAC
cow      ... ... ... G.C ... ... ... T.. ..T ... ... ... ... ... ... ... ... ... .GC A..
rabbit   ... ... ... ..C ..T ... ... ... ... A.. ... A.T ... ... .AA ... A.C ... AGC ...
rat      ... ..C ... G.A .AT ... ..A ... ... A.. ... AA. TG. ... ..G ... A.. ..T .GC ..T
opossum  ... ..C ..G GA. ..T ... ... ..T C.. ..G ..A ... AT. ... ..T ... ..G ..A .GC ...

human    GCT GGC GAG TAT GGT GCG GAG GCC CTG GAG AGG ATG TTC CTG TCC TTC CCC ACC ACC AAG
cow      ... ..A .CT ... ..C ..A ... ..T ... ... ... ... ... ... AG. ... ... ... ... ...
rabbit   .G. ... ... ... ..C ..C ... ... G.. ... ... ... ... T.. GG. ... ... ... ... ...
rat      .G. ..T ..A ... ..C .A. ... ... ..A C.. ... ... ... GCT G.. ... ... ... ... ...
opossum  ..C ..T .CC ..C .CA ..T ..A ..T ..T .CC ..A .CC ... ..C ... ... ... ..T ... ..A

human    ACC TAC TTC CCG CAC TTC GAC CTG AGC CAC GGC TCT GCC CAG GTT AAG GGC CAC GGC AAG
cow      ... ... ... ..C ... ... ... ... ... ... ... ..G ... ... ..C ... ... ... ... G..
rabbit   ... ... ... ..C ... ... ... T.C .C. ... ... ... .AG ... A.C ..A .C. ... ... ...
rat      ... ... ... T.T ... A.T ..T G.A ... .C. ... ... ... ... ..C ... .CT ... ... ...
opossum  ..T ... ... ..C ... ... ... ... TC. .C. ... ..C ... ... A.C C.. ..T ..T ..T ...

Alignment of mammalian beta-globin gene sequences (a dot marks a nucleotide identical to the human sequence)
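The dot notation in the alignment can be expanded back into full sequences. The sketch below assumes that a dot means "same nucleotide as the first (human) row", which is the usual convention for this kind of display; the function name `expand_dots` is made up for illustration, and only the first ten codons of the human and cow rows are used.

```python
def expand_dots(reference: str, row: str) -> str:
    """Replace each '.' in `row` with the reference nucleotide in that column.

    Assumption: '.' means "identical to the reference row" (here, human).
    Spaces between codons are ignored.
    """
    ref = reference.replace(" ", "")
    seq = row.replace(" ", "")
    if len(ref) != len(seq):
        raise ValueError("rows must have the same aligned length")
    return "".join(r if s == "." else s for r, s in zip(ref, seq))


# First ten codons of the human and cow rows from the alignment.
human = "GTG CTG TCT CCT GCC GAC AAG ACC AAC GTC"
cow = "... ... ... G.C ... ... ... T.. ..T ..."

cow_full = expand_dots(human, cow)
print(cow_full)  # GTGCTGTCTGCCGCCGACAAGTCCAATGTC
```

Expanding the rows this way makes it possible to compare any two species directly, not only each species against human.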

SLIDE 2

human    GTG CTG TCT CCT GCC GAC ACG TAC TAA GTC AAG GCC GCC TGG GGC AAG GTT GGC GCG CAC
cow      ... ... ... G.C ... ... ... ..T ... ... ... ... ... ... ... ... ... ... .GC A..
rabbit   ... ... ... ..C ..T ... ... ..T ... A.. ... A.T ... ... .AA ... A.C ... AGC ...

Homology: similarity among two or more individuals or lineages in a feature/character, or character-state, that is the result of inheritance from a common ancestor

Homologous character versus homologous character state

[Figure: DNA alignment (ACG TAC TAA, ACG TAT TAA, ACG TAT TAA) mapped onto a tree; the sixth nucleotide is C in the first sequence and T in the other two.]

SLIDE 3

Analogy: non-phylogenetic similarity (filled squares)

Homology: inherited from a common ancestor (filled circles)

Molecular evolution: positional homology of a character

[Figure: the same DNA alignment (ACG TAC TAA, ACG TAT TAA, ACG TAT TAA) and tree with states C and T, annotated “change somewhere along this branch”.]

SLIDE 4

[Figure: DNA alignment (ACG TAC TAA, ACG TAT TAA, ACG TAT TAA) with positions numbered 1 to 9, mapped onto a tree with states C and T at position 6.]

1. Position 6 is homologous (positional homology).
2. State “T” is homologous (shared ancestry).

Phylogenies distinguish homology from similarity

It is possible for character-states to be identical but non-homologous (homoplasy)

  • 1. Convergence
  • 2. Reversal
  • 3. Parallelism

Homoplasy arises in molecular datasets when multiple substitutions occur at a single site.

  • “multiple substitutions”
  • “multiple hits”
  • “superimposed substitutions”
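To make "multiple hits" concrete, the sketch below compares the number of substitutions that actually occurred along one lineage with what a comparison of only the first and last states would reveal. The state histories are illustrative inputs, and the function names are made up for this example.

```python
def count_changes(history):
    """Number of substitutions along a lineage, given its states in time order."""
    return sum(1 for a, b in zip(history, history[1:]) if a != b)


def observed_difference(history):
    """1 if the first and last states differ, else 0.

    This is all that a comparison of start and end sequences can see.
    """
    return int(history[0] != history[-1])


# Two illustrative histories at a single site.
for history in (["T", "C", "A"], ["A", "C", "A"]):
    print(" => ".join(history),
          "| substitutions:", count_changes(history),
          "| observed differences:", observed_difference(history))
```

In both histories two substitutions occurred, yet at most one difference is visible; in the second history (a reversal) none is visible at all.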
SLIDE 5

Phylogenies distinguish homology from similarity: convergence

[Figure: tree with sequences ACG GAT TAA, ACG GAG TAA, ACG GAA TAA, and ACG GAA TAA; node labels T, G, T; changes at position 6 marked as G ⇒ A on one lineage and T ⇒ C ⇒ A on another.]

Convergent (non-phylogenetic) similarity of nucleotide character states; i.e., homoplasy

1. Position 6 is homologous (positional homology).
2. State “A” is not homologous (non-phylogenetic similarity).

Phylogenies distinguish homology from similarity: reversal

[Figure: tree with sequences ACG GAA TAA, ACG GAT TAA, ACG GAT TAA, and ACG GAA TAA; node labels A, T, A; change at position 6 marked as A ⇒ C ⇒ A.]

Reversal leads to non-homologous similarities in character state.

1. Position 6 is homologous (positional homology).
2. State “A” is not homologous (non-phylogenetic similarity).

SLIDE 6

Phylogenies distinguish homology from similarity: parallelism

[Figure: tree with sequences ACG GAA TAA, ACG GAT TAA, ACG GAA TAA, and ACG GAT TAA; node labels A, A, A; change A ⇒ T marked on two separate lineages.]

Parallel evolution leads to non-homologous similarities in character states.

1. Position 6 is homologous (positional homology).
2. State “A” is not homologous (non-phylogenetic similarity).

VERY IMPORTANT consequences of phylogenetic homology

1. Homologous characters can have NON-homologous states in different species!
2. Homologous characters can have IDENTICAL states in different species and those states are NOT homologous!

(Don’t confuse similarities in state with homology!)
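The distinction between identical states and homologous states can be written as a small check. This is a minimal sketch, assuming we somehow know the full state history at one homologous position from the common ancestor to each tip (in practice this has to be inferred from a phylogeny); the function name and the list-based representation are made up for illustration.

```python
def shared_state_is_homologous(lineage_a, lineage_b):
    """Decide whether two tips share a state by inheritance.

    Each lineage lists the states at one homologous position, from the
    common ancestor (first element) to the tip (last element).
    The tip states are homologous only if they are identical AND neither
    lineage changed state since the common ancestor.
    """
    if lineage_a[0] != lineage_b[0]:
        raise ValueError("both lineages must start from the same ancestral state")
    if lineage_a[-1] != lineage_b[-1]:
        return False  # the states are not even identical
    ancestral = lineage_a[0]
    return all(state == ancestral for state in lineage_a + lineage_b)


print(shared_state_is_homologous(["T"], ["T"]))            # inherited unchanged
print(shared_state_is_homologous(["A", "C", "A"], ["A"]))  # reversal
print(shared_state_is_homologous(["A", "T"], ["A", "T"]))  # parallelism
```

Only the first call reports homology; the reversal and parallelism cases return identical tip states that are nonetheless not homologous.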

SLIDE 7

Alternatives to the phylogenetic concept of homology

“Percent homology”: for DNA (or amino acid) sequences, use percent similarity instead
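Percent similarity between two aligned sequences is a simple count and makes no claim about ancestry. The sketch below computes it as the share of aligned positions with identical states; the function name is made up for illustration, and gaps or ambiguity codes are not handled.

```python
def percent_similarity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which two sequences are identical.

    Assumes the sequences are already aligned and of equal length;
    spaces between codons are ignored.
    """
    a = seq_a.replace(" ", "")
    b = seq_b.replace(" ", "")
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal aligned length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Two of the nine-nucleotide example sequences used on earlier slides.
print(round(percent_similarity("ACG TAC TAA", "ACG TAT TAA"), 1))  # 88.9
```

The result is a measure of similarity only: eight of nine positions match, but whether any matching state is homologous still depends on the phylogeny.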

SLIDE 8

Trees within trees: reticulation

[Figure: reticulate history. Conventional representation with tips A B C D E F I, decomposed into two trees: Path 1 (A C D E F I) and Path 2 (C D E F I B).]

Trees within trees

[Figure: tree relating Species 1, Species 2, Species 3, and Species 4.]

SLIDE 9

Trees within trees

[Figure: tree relating Species 1, Species 2, Species 3, and Species 4, shown in two successive builds.]

SLIDE 10

Trees within trees

[Figure: tree relating Species 1, Species 2, Species 3, and Species 4.]

Trees within trees

[Figure: time axis with the labels Drift, Selection, and Population history.]

SLIDE 11

Trees within trees

Polymorphism and substitution (highly simplified) along a branch of a phylogeny

Residence time: the time that a particular neutral polymorphism is present in a population. Mean residence time is determined by effective population size (Ne).

[Figure: time axis along a branch marking Population substitution 1, Population substitution 2, Population substitution 3, and the coalescent.]

[Figure: tree with sequences GAC, GAT, GAT, and GAG; node labels A, T, A; changes along one lineage A ⇒ C, C ⇒ A, A ⇒ G (that is, A ⇒ C ⇒ A ⇒ G).]

Forms of homology (for genes)

  • 1. ORTHOLOGY.

Orthologous genes are derived from the divergence of an organismal lineage; i.e., a speciation event. Thus if we look at orthologs on a phylogeny we see that their most recent common ancestor represents the coalescence of two organismal lineages.

  • 2. PARALOGY.

Paralogous genes are derived from a divergence event within a genome; i.e., a gene duplication event. In this case if we look at paralogs on a phylogeny we see that they coalesce at a gene duplication event.

  • 3. PRO-ORTHOLOGY.

A gene is pro-orthologous to another gene if they coalesce at a speciation event that predates a gene duplication event. Thus a single-copy gene in organism A is pro-orthologous to a gene that is present in multiple copies in organism B due to gene duplication events that followed the divergence of organisms A and B.

  • 4. SEMI-ORTHOLOGY.

This is a term that simply takes the reverse perspective of pro-orthology. Any one of the multi-copy genes in the genome of organism B is said to be semi-orthologous to a single-copy gene in the genome of organism A, if the most recent common ancestors of those genes coalesce at a point in time that predates the gene duplication event.

  • 5. PARTIAL HOMOLOGY.

This refers to the situation that arises when the evolutionary histories of different segments within the same gene coalesce at different ancestors. This can arise from evolutionary processes such as homologous recombination or exon shuffling.

  • 6. GAMETOLOGY.

Gametologs coalesce at an event that isolated those genes on opposite sex chromosomes; i.e., they coalesce at the point when they became isolated from the process of recombination.

  • 7. XENOLOGY.

Genes that coalesce at either a speciation or duplication event, but whose evolutionary histories do not fit with that of the organismal lineages which carry such genes due to one or more lateral gene transfer events.

  • 8. SYNOLOGY.

Homologous genes found within the same organism’s genome that have different evolutionary histories due to the fusion of formerly evolutionarily independent genomes, such as in endosymbiosis.

SLIDE 12

Pro-orthologs pre-date the involved duplication event

Examples of different types of homology in gene families:

  • Homo and Rattus (rat) Ldh-C are orthologous.
  • Homo Ldh-C and Homo Ldh-A are paralogous.
  • Homo Ldh-C and Rattus Ldh-A also are paralogous.
  • Gallus (chicken) Ldh-A is pro-orthologous to both Homo Ldh-C and Homo Ldh-A.
  • Homo Ldh-C is semi-orthologous to the Gallus Ldh-A.
  • All mammalian Ldh-A genes are semi-orthologous to the non-mammalian Ldh-A.

Note that the gene duplication that gave rise to the Ldh-C gene is specific to an ancestor of all present-day mammals. Mammalian Ldh-C and Ldh-A genes are paralogous.

[Figure: Ldh gene tree with clades labelled Ldh-A and Ldh-C and the gene duplication event marked; tip labels Mus, Cricetinae, Homo, Gallus, Sceloporus, Rattus, Sus, Homo, Sus, Rabbit, Mus, Rattus.]

Human Ldh-C and human Ldh-A are paralogs (all these genes are homologs).
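The Ldh relationships listed above follow from one rule: find the most recent common ancestor of two genes in the gene tree and ask what kind of event it represents. The sketch below applies that rule to a deliberately simplified toy tree; the topology, the nested-tuple representation, and the function names are assumptions made for illustration, not a reconstruction of the tree in the figure.

```python
# Toy gene tree. Internal nodes are (event, left, right); tips are strings.
# Assumed topology: a speciation separates the chicken lineage from mammals,
# then a duplication in the mammalian ancestor yields Ldh-A and Ldh-C.
GENE_TREE = (
    "speciation",
    "Gallus Ldh-A",
    ("duplication",
     ("speciation", "Homo Ldh-A", "Rattus Ldh-A"),
     ("speciation", "Homo Ldh-C", "Rattus Ldh-C")),
)


def tips(node):
    """Set of tip names below a node."""
    if isinstance(node, str):
        return {node}
    _, left, right = node
    return tips(left) | tips(right)


def mrca_event(node, gene_a, gene_b):
    """Event type at the most recent common ancestor of two genes."""
    if isinstance(node, str):
        raise ValueError("genes must be two different tips of the tree")
    event, left, right = node
    for child in (left, right):
        if {gene_a, gene_b} <= tips(child):
            return mrca_event(child, gene_a, gene_b)
    if not {gene_a, gene_b} <= tips(node):
        raise ValueError("gene not found in tree")
    return event


print(mrca_event(GENE_TREE, "Homo Ldh-C", "Rattus Ldh-C"))  # speciation
print(mrca_event(GENE_TREE, "Homo Ldh-C", "Homo Ldh-A"))    # duplication
print(mrca_event(GENE_TREE, "Gallus Ldh-A", "Homo Ldh-C"))  # speciation
```

A speciation at the common ancestor corresponds to orthology and a duplication to paralogy. The chicken case returns a speciation that predates the duplication, which is the pro-orthology/semi-orthology situation described on the slide.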

SLIDE 13

Pro-orthologs pre-date the involved duplication event

[Figure: Ldh gene tree with clades labelled Ldh-A and Ldh-C, the gene duplication event, and a speciation event marked; tip labels Mus, Cricetinae, Homo, Gallus, Sceloporus, Rattus, Sus, Homo, Sus, Rabbit, Mus, Rattus.]

Mouse Ldh-C and human Ldh-C are orthologs (all these genes are homologs).

Organism history can be different from gene history

Selected examples (in notes):

  • Non-phylogenetic lineage sorting due to DRIFT
  • Birth-death evolution in gene families
  • Trans-species evolution
  • Recombination
  • Lateral gene transfer (LGT)

Other sources:

  • Statistical error
  • Human error
SLIDE 14

  • 1. Ancestral polymorphism
  • 1. Ancestral polymorphism: residence times
  • rate of fixation [under drift] slows with increasing Ne
  • ultimate fate is fixation or loss
  • larger Ne yields a longer residence time of a polymorphism in a population

If we run this simulation long enough it will go to fixation or loss; it just takes much longer when Ne is large
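The slide's simulation is not reproduced here, but the same behaviour can be seen in a minimal drift model. The sketch below assumes a Wright-Fisher-style population of `n_copies` gene copies (a stand-in for Ne) with no selection or mutation, starting from two alleles at equal frequency; all names and parameter values are illustrative choices.

```python
import random


def generations_until_fixation_or_loss(n_copies, start_freq=0.5, rng=None):
    """Simulate neutral drift until one allele is fixed or lost.

    Each generation, every one of the n_copies gene copies is drawn at
    random from the previous generation (Wright-Fisher-style resampling).
    Returns the number of generations the polymorphism persisted.
    """
    rng = rng or random.Random()
    count = round(n_copies * start_freq)
    generations = 0
    while 0 < count < n_copies:
        freq = count / n_copies
        count = sum(rng.random() < freq for _ in range(n_copies))
        generations += 1
    return generations


def mean_residence_time(n_copies, replicates=30, seed=1):
    """Average persistence time of the polymorphism over replicate runs."""
    rng = random.Random(seed)
    times = [generations_until_fixation_or_loss(n_copies, rng=rng)
             for _ in range(replicates)]
    return sum(times) / len(times)


small = mean_residence_time(20)
large = mean_residence_time(100)
print("mean residence time, 20 copies: ", small)
print("mean residence time, 100 copies:", large)
```

Every run ends in fixation or loss, but the larger population keeps the polymorphism for many more generations on average, which is the pattern the bullets describe.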

SLIDE 15

  • 1. Ancestral polymorphism: rapid loss vs. persistence

What happens here depends on Ne!

rapid loss of ancestral polymorphism due to strong drift

ancestral polymorphism persists for a long time because drift is weak


SLIDE 16

Speciation is a process, not an event; the process takes time!

“Deep coalescents” (ancestral) have no relationship to the process of speciation

Coalescents after speciation indicate the pattern of speciation

Figure credit: Fredrick Leliaert

SLIDE 17


Overview:

Small Ne and long time to speciation events: gene trees = species trees

Large Ne and short time to speciation events: gene trees ≠ species trees

  • non-phylogenetic lineage sorting
  • incomplete lineage sorting
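The dependence on Ne and on the time between speciation events can be made quantitative for the simplest case. The sketch below uses the standard neutral-coalescent result for three species with one sampled gene copy each, which is not taken from these slides: the gene tree differs from the species tree with probability (2/3)·exp(−t / 2Ne), where t is the number of generations between the two speciation events. It assumes a diploid, constant-size ancestral population; the function name is made up for illustration.

```python
import math


def discordance_probability(generations_between_speciations, ne):
    """P(gene tree topology differs from the species tree), three species.

    Assumptions: neutral coalescent, one gene copy sampled per species,
    diploid ancestral population of constant effective size `ne`.
    """
    # Length of the internal branch in coalescent units.
    t = generations_between_speciations / (2.0 * ne)
    return (2.0 / 3.0) * math.exp(-t)


# Small Ne, long time between speciation events: discordance is negligible.
print(discordance_probability(100_000, 1_000))
# Large Ne, short time between speciation events: discordance is common.
print(discordance_probability(1_000, 100_000))
```

The factor 2/3 appears because, when the two lineages fail to coalesce in the ancestral population between the speciation events, all three possible gene tree topologies become equally likely and two of them disagree with the species tree.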
SLIDE 18

  • 2. Birth-death evolution in gene families

[Figure: a tandem array of 2 genes (1 and 2) in the ancestral species evolves through time into genes A1, A2, B1, B2, C1, and C2 in Species A, Species B, and Species C. Gene “birth” events are marked (+) and gene “death” events are marked (−).]

[Figure: the species tree (A, B, C) shown beside the Gene (1) tree (A1, B1, C1).]

  • 3. Trans-species evolution
SLIDE 19

  • 3. Recombination

[Figure: parental contribution and resultant offspring in three panels. Panel labels: A B, a b, a B, A b; “Gene conversion between the same gene” (A B C D, a b c d, A B C D, a B C d); “Gene conversion between different genes” (Gene A, Gene A, Gene A, Gene B).]

Among alleles within an individual: think about patterns of population coalescence

Among alleles within an individual: this is a nonreciprocal exchange

Among genes (paralogs) within an individual: this is also a nonreciprocal exchange. Remember that different genes can have different phylogenetic histories

partial orthology

  • 4. Lateral gene transfer (LGT)

[Figure: species tree (A, B, C) and gene tree (a, b, c) drawn against a time axis.]

xenology

SLIDE 20

Organism history can be different from gene history

Selected examples (in notes):

  • Birth-death evolution in gene families
  • Trans-species evolution
  • Recombination
  • Lateral gene transfer (LGT)

Other sources:

  • Statistical error
  • Human error