SLIDE 1

Neutral theory 3: Rates and patterns of molecular evolution

Predictions of the neutral theory

1. Within-species variation is correlated with divergence between species.
2. Evolutionary rate is inversely related to functional constraint.
3. Base composition at neutral sites reflects mutational equilibrium.
4. A molecular clock.

Neutral theory is “the” rigid null hypothesis for molecular evolution.

SLIDE 2

  • 1. Polymorphism and divergence are correlated

Neutral population polymorphism within species is correlated with neutral divergence between species

Neutral theory is a bridge between microevolution and macroevolution

  • 1. Variation within and among species: polymorphism & divergence

This is one place where the genetic code is relevant:

1. Synonymous (S)
2. Non-synonymous (NS)

Neutrality and selection have different impacts on polymorphism:

1. Neutrality: NS residence times determined by Ne
2. Selection: NS residence times reduced by natural selection

Let’s look at the ratio NS:S [ratio of counts]

SLIDE 3

  • 1. Variation within and among species: polymorphism & divergence

Comparison of the ratio of synonymous and nonsynonymous polymorphism within species to divergence between species. Neutral theory suggests that the fraction of variation that is nonsynonymous within species should be the same as between species.

[Figure: genealogies within populations nested in a species-level phylogeny of three species. S:NS counts for polymorphism within a species: 12:4 (Species 1), 6:2 (Species 2), 10:3 (Species 3). S:NS counts for substitutions between species: 17:6, 14:5, 19:6.]

              Synonymous (S)   Non-synonymous (NS)   S:NS
Polymorphic   28               9                     3.1
Fixed         50               17                    2.9

Data are hypothetical. Ratios are tested by using a G-test on the counts of S and NS. These hypothetical data are not significant. If positive selection were acting, residence times for NS would be lower within species and polymorphic S:NS > fixed S:NS.
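The G-test on the S and NS counts can be sketched in a few lines. This is a minimal illustration, not the procedure used to prepare the slide: the function name `g_test_2x2` is hypothetical, no continuity correction is applied, and the p-value assumes a chi-square distribution with 1 degree of freedom.

```python
import math

def g_test_2x2(table):
    """G-test of independence for a 2x2 table of counts.

    Returns (G, p_value). The p-value uses the chi-square distribution
    with 1 degree of freedom, for which P(X > g) = erfc(sqrt(g / 2)).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if observed > 0:  # a zero count contributes nothing to G
                g += observed * math.log(observed / expected)
    g = max(2.0 * g, 0.0)  # guard against a tiny negative value from rounding
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value

# Rows: polymorphic, fixed. Columns: S, NS (the hypothetical counts above).
counts = [[28, 9], [50, 17]]
g, p = g_test_2x2(counts)
print(f"G = {g:.3f}, p = {p:.2f}")
```

For these counts G comes out near 0.014, far below the 5% critical value of 3.841 for 1 degree of freedom, which agrees with the statement that the data are not significant.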

  • 2. Rate of evolution is inversely related to functional constraint

Rate variation is well known:

  • Fast genes (D-loop) versus slow genes (histones)
  • Introns versus exons
  • Synonymous versus nonsynonymous sites

Neutral theory is consistent with such rate variation

  • Asserts only that polymorphism is selectively equivalent
  • Frequency of such polymorphism can change among genes, sites etc.

SLIDE 4

Note: the substitution rate is commonly measured in two ways:

  • Tree length (the sum of the branch lengths of the tree)
  • Average of all pairwise distances

[Figure: a gene tree of 18 primate genera, Cebus through Eulemur (scale bar: 0.01), labeled "Tree length"; and a multiple sequence alignment of Mus2.FAS, Human_GIA, Human_GIB, Mus_GIA, Rabbit_GIA, and Sus_GIA (columns 1 to 120), labeled "Ave of all pairwise distances".]

  • 2. Rate of evolution is inversely related to functional constraint

[Figure: bar chart of the mean number of substitutions per site at codon positions 1, 2, and 3, with one axis for the mean pairwise substitution rate and one for the substitution rate as a sum of branch lengths over the tree. Shown beside the gene tree for primate epsilon globins (scale bar: 0.01): Cebus, Saimiri, Aotus, Callithrix, Lagothrix, Brachyteles, Alouatta, Ateles, Pan, Homo, Pongo, Macaca, Hylobates, Tarsius, Galago, Otolemur, Cheirogaleus, Eulemur.]

Note: the mean number of substitutions per site was computed in all cases by using the Jukes and Cantor (1969) correction. Under both measures of substitution rate, 3rd codon positions evolve faster than 1st and 2nd positions.

Mean number of substitutions per site at the three codon positions of the epsilon-globin gene of primates. Two measures are presented: (i) the average over all pairwise comparisons between genes; and (ii) the sum of the branch lengths of the epsilon-globin gene tree.
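The Jukes and Cantor (1969) correction converts the observed proportion of differing sites, p, into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3). A minimal sketch of measure (i), the average over all pairwise comparisons, follows. The function names and the toy sequences are hypothetical and are not the epsilon-globin data.

```python
import math

def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences, skipping gaps."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def jukes_cantor(p):
    """JC69 distance: estimated substitutions per site for observed proportion p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("the correction is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def mean_pairwise_distance(seqs):
    """Average JC69 distance over all pairs of aligned sequences."""
    dists = [
        jukes_cantor(p_distance(a, b))
        for i, a in enumerate(seqs)
        for b in seqs[i + 1:]
    ]
    return sum(dists) / len(dists)

# Hypothetical toy alignment (three sequences, ten sites).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
mean_d = mean_pairwise_distance(toy)
print(f"mean pairwise JC69 distance = {mean_d:.4f}")
```

The corrected distance always exceeds the raw proportion p, because it accounts for multiple substitutions at the same site. To compare codon positions, the same functions could be applied to every third column of a coding alignment.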

SLIDE 5

  • 2. Rate of evolution is inversely related to functional constraint

Neutral Model

Classes of new mutations:

  • Beneficial: rare; frequency is effectively zero
  • Deleterious: frequency = fD
  • Neutral: frequency = f0

Kimura, 1968 (µT is the total mutation rate per site):

  • The neutral mutation rate per site is: µ0 = µT f0
  • The neutral substitution rate per site is: k = µT f0
  • f0 = 1 - fD

  • 2. Rate of evolution is inversely related to functional constraint

The rate of evolution depends on the “size (f0) of the selective sieve”. Kimura’s f0 is the fraction of mutations that passes through the “sieve”.

[Figure: new mutations passing through a selective sieve to fixation, shown for a “slow gene” and for a “fast gene”.]
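The sieve idea reduces to one multiplication, k = µT f0 with f0 = 1 - fD. The sketch below applies it to a "slow" and a "fast" gene. The mutation rate and both values of fD are hypothetical numbers chosen only for illustration.

```python
def neutral_substitution_rate(mu_total, f_deleterious):
    """k = mu_T * f0, with f0 = 1 - fD (beneficial mutations treated as negligible)."""
    if not 0.0 <= f_deleterious <= 1.0:
        raise ValueError("f_deleterious must lie between 0 and 1")
    f_neutral = 1.0 - f_deleterious
    return mu_total * f_neutral

MU_TOTAL = 2.5e-9  # hypothetical total mutation rate per site per year

# Narrow sieve: most mutations are deleterious, so few pass through.
slow_gene = neutral_substitution_rate(MU_TOTAL, f_deleterious=0.95)
# Wide sieve: most mutations are neutral.
fast_gene = neutral_substitution_rate(MU_TOTAL, f_deleterious=0.20)

print(f"slow gene: k = {slow_gene:.3g} per site per year")
print(f"fast gene: k = {fast_gene:.3g} per site per year")
```

With the same underlying mutation rate, the two genes differ in rate only because they differ in f0.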

SLIDE 6

  • 2. Rate of evolution is inversely related to functional constraint

Example: 3rd codon positions versus synonymous sites

Some changes at 3rd codon positions are NOT synonymous.

Prediction:

1. f0 for 3rd codon positions < f0 for synonymous sites
2. rate for 3rd codon positions < rate for synonymous sites

  • 2. Rate of evolution is inversely related to functional constraint

The mean number of substitutions per site between primates and rodents is t = t0 + t1. The unit of time is 2 × 80 my, based on the time since primates and rodents shared a common ancestor. Data from Bielawski, Dunn and Yang (2000) Genetics 156:1299-1308.

This result is consistent with neutral theory, given that f0 is smaller for 3rd codon positions because some mutations at such sites will be nonsynonymous.

[Figure: an ancestral gene diverging into a primate gene and a rodent gene, with branch lengths t0 and t1.]

[Figure: histogram of the number of proteins by substitutions / site / 2 × 80 million years, for 3rd codon positions and for synonymous sites.]

  • Mean at 3rd positions: 0.40
  • Mean at synonymous sites: 0.61

The average substitution rate between primates and rodents is higher for synonymous sites as compared with third codon positions. The results are based on a sample of 82 nuclear genes.
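The two means can be turned into per-year rates by dividing by the total time on both lineages, 2 × 80 million years. The sketch below also takes the ratio of the two rates. Reading that ratio as the f0 of 3rd positions relative to synonymous sites assumes the same mutation rate at both kinds of site, which is an assumption made here, not a claim from the slide.

```python
YEARS_SINCE_SPLIT = 80e6  # primate-rodent divergence time used on the slide

def per_year_rate(subs_per_site, years_since_split=YEARS_SINCE_SPLIT):
    """Substitutions per site per year, counting both lineages (2 x time)."""
    return subs_per_site / (2.0 * years_since_split)

rate_third = per_year_rate(0.40)       # mean at 3rd codon positions
rate_synonymous = per_year_rate(0.61)  # mean at synonymous sites

# Under k = mu_T * f0 with a shared mu_T, the rate ratio equals the f0 ratio.
relative_f0 = rate_third / rate_synonymous

print(f"3rd positions:    {rate_third:.3g} per site per year")
print(f"synonymous sites: {rate_synonymous:.3g} per site per year")
print(f"relative f0 (3rd / synonymous): {relative_f0:.2f}")
```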

SLIDE 7

  • 2. Rate of evolution is inversely related to functional constraint

We can put sites into a wide variety of categories:

  • 5’ and 3’ flanking regions
  • 5’ and 3’ untranslated regions
  • Introns
  • Exons
  • 3rd positions of 4-fold degenerate codons
  • Nonsynonymous sites of a codon
  • Functional domains
  • Pseudogenes

Comparison of mean substitution rates in different parts of genes and pseudogenes. Data are from Li et al. (1985). Substitution rate is the mean number of substitutions per site per 10^9 years. Rates are an average over 3000 mammalian genes.

[Figure: bar chart of substitutions per site per 10^9 years (axis 0 to 5) for the 5' flanking region, 5' untranslated region, nonsynonymous sites, synonymous sites, introns, 3' untranslated region, 3' flanking region, and pseudogenes.]

  • 2. Rate of evolution is inversely related to functional constraint

SLIDE 8

  • 2. Rate of evolution is inversely related to functional constraint

Sites subject to selection will have variable f0 depending on the level of functional constraint acting on that site:

1. Nonsynonymous sites
2. Functional domains
3. Etc.

Note: many of the shaded sites are located in the heme pocket or at the interfaces between globin subunits, consistent with the notion that sites most critical to protein function evolve at the slowest rates.

Multiple sequence alignment of four vertebrate beta-globin genes representing 450 million years of evolution. Amino acids shaded in green represent sites that appear to have been conserved over those 450 million years.

  • 2. Rate of evolution is inversely related to functional constraint

SLIDE 9

  • 2. Rate of evolution is inversely related to functional constraint

  • C chain: 1.1 × 10^-9 / site / year
  • A & B chains: 0.2 × 10^-9 / site / year
  • Roughly 5-fold higher rate in the C chain

Note that the C-chain rate is still lower than in many other proteins, and lower than at synonymous sites, so its amino acid sequence must still have considerable functional importance to the protein, probably in folding to the lowest free energy so that disulfide bonds can be formed.

Substitution rate differs in different polypeptide domains of preproinsulin.

  • 2. Rate of evolution: differences among genes

Under neutral theory:

  • The synonymous substitution rate (kS) is equal to the neutral mutation rate.
  • The nonsynonymous substitution rate (kN) measures the substitution rate for neutral amino acid changes.
  • Thus the ratio of these rates (kN / kS) represents the fraction of amino acid mutations that are neutral: this is f0 for amino acids.

  • The fraction of amino acid mutations that are deleterious (fD) must be 1 - (kN / kS).

Let’s estimate the width of the selective sieve, taking the Neuroleukin gene of primates as an example:

  • kN = 0.016
  • kS = 0.300

The fraction of amino acid changes that are neutral is 0.016 / 0.300 = 0.053, a small amount. Hence the fraction of amino acid changes that are deleterious is 1 - 0.053 = 0.95!
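The Neuroleukin arithmetic can be written as a small helper. This is a sketch under the neutral-theory reading given above (kS equals the neutral mutation rate). The function name `selective_sieve` is hypothetical.

```python
def selective_sieve(k_n, k_s):
    """Return (f0, fD) for amino acid changes from the rates kN and kS.

    f0 = kN / kS is the fraction of amino acid mutations that are neutral;
    fD = 1 - f0 is the fraction that are deleterious. This reading only
    makes sense when kN <= kS; a ratio above 1 falls outside the model.
    """
    if k_s <= 0:
        raise ValueError("kS must be positive")
    f_neutral = k_n / k_s
    return f_neutral, 1.0 - f_neutral

# Neuroleukin gene of primates (values from the slide).
f0, fD = selective_sieve(k_n=0.016, k_s=0.300)
print(f"f0 = {f0:.3f}, fD = {fD:.3f}")  # f0 = 0.053, fD = 0.947
```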

SLIDE 10

Estimated level of functional constraint for three nuclear genes of primates

[Figure: fraction of deleterious mutations and fraction of neutral mutations for the A1 adenosine receptor, prolactin, and a pseudogene.]

Estimates obtained from relative rates of synonymous and nonsynonymous substitution. Data are from Bielawski, Dunn, and Yang (2000) Genetics 156:1299-1308.

  • Most mutations in a gene evolving under strong functional constraints are deleterious.
  • A less functionally constrained gene has more neutral mutations.
  • Pseudogenes are non-functional, so all mutations are expected to be neutral.

  • 2. Rate of evolution: differences among genes

1. Nonsynonymous rates vary among genes due to differences in functional constraints.
2. Synonymous rates vary due to differences in mutation rates [and in some cases weak selective constraints].

It is best to estimate kS separately for each gene when trying to estimate f0 and fD.

SLIDE 11

  • 2. Rate of evolution: differences among genes

Distribution of nonsynonymous and synonymous substitution rates for 82 nuclear genes of primates.

[Figure: histogram of the number of proteins by substitutions / site / 80 million years, for the nonsynonymous rate and the synonymous rate.]

  • Mean rate of nonsynonymous substitution: 0.045 / site / 80 million years
  • Mean rate of synonymous substitution: 0.201 / site / 80 million years

Data from Bielawski, Dunn, and Yang (2000) Genetics 156:1299-1308. Method: GY94 under ML.

  • 3. Patterns of mutation

Neutral theory: nucleotide frequencies at sites free from selection will reflect mutational equilibrium. For example:

  • Pseudogenes
  • 3’ flanking regions
  • Synonymous sites

SLIDE 12

[Figure: three bar charts of nucleotide frequencies (A, C, G, T; axis 0 to 0.45), one each for the 1st, 2nd, and 3rd codon positions.]

Nucleotide frequencies in the human beta-globin gene differ among the three positions of the codon. Frequencies at positions 1 and 2 reflect selection acting on the protein product of the gene. Frequencies at position 3 reflect a strong influence of mutation pressure.
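Frequencies of this kind can be tallied directly from a coding sequence. The sketch below is a minimal illustration: the function name is hypothetical, and the toy sequence is not the human beta-globin gene.

```python
from collections import Counter

def codon_position_frequencies(cds):
    """Return three dicts of A/C/G/T frequencies, one per codon position."""
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    result = []
    for position in range(3):
        bases = cds[position::3]  # every third base, starting at this codon position
        counts = Counter(bases)
        result.append({base: counts[base] / len(bases) for base in "ACGT"})
    return result

# Hypothetical toy coding sequence: ATG GCC AAG TTT
toy_cds = "ATGGCCAAGTTT"
freqs = codon_position_frequencies(toy_cds)
for index, table in enumerate(freqs, start=1):
    print(f"codon position {index}: {table}")
```

On real data, large differences between the position 3 profile and the profiles at positions 1 and 2 are what the figure illustrates.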

  • 3. Patterns of mutation: human beta globin
  • 4. Molecular clock

k = µ

Neutral theory (1968) predicts that the rate of molecular evolution [substitution] should be approximately constant over time, where time is measured in generations.

Zuckerkandl and Pauling (1965) noticed an approximately uniform rate of amino acid substitution, with time measured in years.

The notion of a molecular clock is somewhat controversial.
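The result k = µ follows from a short standard argument, sketched here for a diploid population of N individuals, where µ is the neutral mutation rate per site per generation (the symbol N is introduced for this sketch). Each generation, 2Nµ new neutral mutations arise, and each one is eventually fixed with probability 1/(2N):

```latex
\[
k \;=\; \underbrace{2N\mu}_{\text{new neutral mutations per generation}}
\;\times\;
\underbrace{\frac{1}{2N}}_{\text{fixation probability of each}}
\;=\; \mu
\]
```

Population size cancels, which is why the predicted rate is constant per generation regardless of N.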

SLIDE 13

  • 4. Molecular clock

Linear relation between mitochondrial substitution rate and time since common ancestor in teleost fishes

[Figure: scatter plot of the mean number of substitutions / site (axis up to 0.2) against time in millions of years (axis up to 200).]

Linear relationship is expected under a uniform rate of substitution. Substitutions are the mean number of changes at first codon positions of all mitochondrial protein coding genes. Data were kindly provided by K. Dunn.
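Under a uniform rate, distance grows in proportion to time, so the clock rate can be estimated as the slope of a line through the origin. The sketch below uses least squares with no intercept. The data points are hypothetical and are not the teleost data provided by K. Dunn.

```python
def clock_rate_through_origin(times, distances):
    """Least-squares slope of distance on time for a line forced through zero."""
    if len(times) != len(distances) or not times:
        raise ValueError("times and distances must be non-empty and equal in length")
    numerator = sum(t * d for t, d in zip(times, distances))
    denominator = sum(t * t for t in times)
    if denominator == 0:
        raise ValueError("at least one time must be non-zero")
    return numerator / denominator

# Hypothetical divergence times (millions of years) and distances (subst/site).
times_my = [50, 100, 150, 200]
subs_per_site = [0.05, 0.11, 0.14, 0.20]

rate = clock_rate_through_origin(times_my, subs_per_site)
print(f"estimated rate = {rate:.5f} substitutions / site / million years")
```

Forcing the line through the origin reflects the expectation that two sequences with zero divergence time have zero distance. Systematic departures of the points from the fitted line would suggest rate variation among lineages.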

Note: many proteins exhibit constant rate in terms of absolute time in years

Some initial problems with neutral theory:

1. Expected heterozygosity was larger than observed in natural populations.
2. Some molecules evolved according to a per-million-year clock even though generation times were very different.
3. There were more problems to follow, but we will not consider them.

SLIDE 14

Nearly neutral theory

[Photos: Motoo Kimura and Tomoko Ohta]

Ohta: What happens if some mutations are only mildly deleterious?

Slightly deleterious mutations:

  • Small selective coefficients (s)
  • When s is small, population size plays an important role:
      • Large Ne: selection effective
      • Small Ne: selection less effective [more deleterious alleles get fixed!]
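The dependence on population size can be made concrete with Kimura's diffusion approximation for the fixation probability of a new mutation. This formula is brought in here for illustration and does not appear on the slide. The sketch assumes a diploid population with Ne = N and genic (additive) selection, and the value of s is hypothetical.

```python
import math

def fixation_probability(s, n):
    """Kimura's diffusion approximation for a single new mutant copy.

    Assumes a diploid population with Ne = N and genic selection, so that
    u = (1 - exp(-2s)) / (1 - exp(-4Ns)). A neutral mutation fixes with
    probability 1 / (2N).
    """
    if s == 0:
        return 1.0 / (2 * n)
    # expm1 keeps precision for small s; the two minus signs cancel in the ratio.
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * n * s)

def relative_to_neutral(s, n):
    """Fixation probability divided by the neutral value 1 / (2N)."""
    return fixation_probability(s, n) * 2 * n

s = -0.001  # hypothetical slightly deleterious selection coefficient
small_population = relative_to_neutral(s, n=100)
large_population = relative_to_neutral(s, n=10_000)

print(f"N = 100:    {small_population:.3f} of the neutral fixation probability")
print(f"N = 10000:  {large_population:.3g} of the neutral fixation probability")
```

With the same s, the mutation behaves almost like a neutral one in the small population and is essentially never fixed in the large one.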

Ohta and Kimura (1971): the slightly deleterious model of evolution

  • more adjustments were to follow

SLIDE 15

The fate of a beneficial recessive allele (A1) is not always predictable under the combined effects of directional selection and genetic drift. If there is no genetic drift (left: Nes = infinity), the fate of the recessive allele (A1) is always determined by selection. When there is drift (right: Nes < infinity), the fate of the recessive allele (A1) is not necessarily determined by selection; hence a deleterious allele can be fixed in a population.

[Figure: two panels, labeled Nes = 100 and Nes = infinity.]

Note that Nes > 1 does not guarantee that an allele is going to be fixed; it simply indicates that (as a long-term average) the frequency with which it is fixed will be greater than the frequency under genetic drift alone.

Remember this slide from PopGen Topic 8: Selection - drift

Slightly deleterious model

[Figure: two panels showing neutral mutations and slightly deleterious mutations relative to fixation ("FIXED"), one for large population size (selection very effective) and one for small population size (selection a little less effective).]

SLIDE 16

Nearly neutral theory

Slightly beneficial mutations were incorporated later: mildly deleterious + slightly beneficial = “Nearly neutral model”

[Figure: two panels showing neutral mutations and slightly beneficial mutations relative to fixation ("FIXED"), one for large population size (selection very effective) and one for small population size (selection a little less effective).]

[Figure: classes of mutation along an axis from deleterious to beneficial. Strictly neutral model: neutral. Slightly deleterious model: neutral, slightly deleterious. Nearly neutral model: neutral, slightly deleterious, slightly beneficial.]

The strictly neutral model was extended to accommodate nearly neutral mutations.

The fraction of “neutral” mutations (f0) changes with Ne. Note that Ne changes over time!

SLIDE 17

Nearly neutral theory reconciles some observations with theory:

1. Predicts lower natural levels of heterozygosity
2. Possible reconciliation of predicted rate constancy in generations with rate constancy in years

Neutrality depends on environmental conditions

Strictly neutral and nearly neutral models:

  • the distribution of fitness effects of new mutations changes according to environment

Genetic environment and physical environment change:

  • the 3D space of a protein reflects a genetic as well as a physical environment
  • neutral substitutions, recombination, LGT, etc. change the genetic environment
  • the physical environment changes daily, weekly, seasonally, yearly...

SLIDE 18

Success of neutral theory: “the null model”

Roger Stanier (1970), at the general meeting of the Society for General Microbiology, commented that evolutionary studies are “a relatively harmless habit, like eating peanuts”.

Neutral theory provides the foundation for the science of molecular evolution.

Success of neutral theory: James F. Crow (1985)

  • 1. The theory provides the best explanation for the dramatic differences in the rates and patterns of evolution in molecules as compared with morphology.
  • 2. The neutral theory provides a common framework for understanding the dramatic differences among genes, codon positions, introns, and pseudogenes.
  • 3. The neutral theory correctly predicts the differences in rates among molecular datasets as well as the similarity of substitution rates between the so-called “living fossil” organisms and the most rapidly changing species.
  • 4. The neutral theory has stimulated theoretical studies as well as studies of natural variation in a framework based on a rigid null hypothesis.