Functional divergence 2: Tests of populations
Some background on the relationship between natural selection and neutral polymorphism:
- Residence times
- Selective sweep
- Linkage disequilibrium
Residence time: the time it takes for one allele to replace another in a population; i.e., the duration that the involved locus is polymorphic.
[Plot: residence time with fitness A1A1 = 1, A1A2 = 0.9, A2A2 = 0.5; Ne = 100]
[Plot: residence time with fitness A1A1 = 1, A1A2 = 0.9, A2A2 = 0.5; Ne = 10000]
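The residence time for these fitness values can be explored with a small Wright-Fisher simulation. The sketch below is a minimal illustration, not the method used to draw the plots: the function name, the starting frequency `p0` of A1, and the use of binomial sampling of 2Ne gene copies are all assumptions.

```python
import random


def residence_time(n_e, w11=1.0, w12=0.9, w22=0.5, p0=0.1,
                   rng=None, max_gen=1_000_000):
    """Return (generations the locus stayed polymorphic, final frequency of A1).

    Assumed model: diploid Wright-Fisher population of size n_e, viability
    selection followed by binomial sampling of 2 * n_e gene copies.
    The starting frequency p0 is an assumption; the slides do not state it.
    """
    rng = rng or random.Random()
    p = p0
    for generation in range(max_gen):
        if p <= 0.0 or p >= 1.0:
            # One allele has replaced the other: the locus is monomorphic.
            return generation, p
        q = 1.0 - p
        mean_w = p * p * w11 + 2 * p * q * w12 + q * q * w22
        # Expected frequency of A1 after selection.
        p_selected = (p * p * w11 + p * q * w12) / mean_w
        # Genetic drift: draw 2 * n_e gene copies for the next generation.
        copies = sum(1 for _ in range(2 * n_e) if rng.random() < p_selected)
        p = copies / (2 * n_e)
    return max_gen, p


if __name__ == "__main__":
    generations, final_p = residence_time(100, rng=random.Random(1))
    print(generations, final_p)
```

Running it repeatedly for Ne = 100 and for larger Ne gives a feel for how drift and selection together set the residence time; larger Ne takes longer to run because every generation samples 2Ne copies.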
Populations moving on a fitness landscape carry a “cargo” of neutral polymorphism. Linkage disequilibrium means that the fate of some of this neutral cargo will depend on the nature of selection.
Residence times:
- alleles under directional selection << neutral alleles
- alleles under balancing selection >> neutral alleles
A stable fitness landscape
Environmental change leads to a dramatic change in the fitness landscape
Fisher’s fundamental theorem of natural selection (FFTNS) predicts strong selection pressure to increase the average fitness of the population
One model for strong selection pressure
Assume the new peak requires fixation of an allele [or alleles]:
1. Direct action of selection: change in frequency of selected allele
2. Indirect effect: change in neutral allele frequencies
Selective sweep: dramatic loss of population polymorphism at loci closely linked to a locus fixed by directional selection
“hitchhiking” ⇒ Selective sweep
Physically very close, so the likelihood of recombination breaking up this configuration is lowest
[Figure: chromosome carrying a neutral allele, a strongly beneficial allele and a slightly deleterious allele]
[Plots: fitness A1A1 = 1, A1A2 = 0.9, A2A2 = 0.5; Ne = 10000]
The beneficial mutant is said to “sweep” through the population. Linked neutral polymorphism should behave as indicated in the plot to the left; instead it is dragged to fixation as well [plot above].
Allele frequencies at six linked sites on the chromosome, one row per time point:
t0, before selective sweep:    0.25  0.10  0.70  0.001  0.15  0.30
t1, selective sweep:           0.26  0.15  0.71  0.10   0.19  0.31
t2, selective sweep:           0.55  0.50  0.75  0.50   0.55  0.50
t3, selective sweep:           0.85  0.88  0.90  0.90   0.90  0.88
t4, beneficial allele fixed:   0.95  0.99  1     1      1     0.99
In a short time, most of the linked variation is lost
Selective sweep
Selective sweep:
- Dramatic loss of population polymorphism
- Fixation of deleterious alleles
Strongly beneficial allele
Recombination rate determines the region of the genome subject to the selective sweep.
- High recombination rate: small region of selective sweep
- Low recombination rate: larger region of selective sweep
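The dependence on recombination rate can be illustrated with a deterministic two-locus model. The sketch below is a simplified illustration under stated assumptions: a haploid model, a beneficial allele A with advantage `s` that arises on a chromosome carrying neutral allele B, and a neutral polymorphism at frequency `p_b` on the other chromosomes. The function name and all parameter values are hypothetical.

```python
def sweep_heterozygosity(r, s=0.1, p_a0=0.001, p_b=0.5, max_gen=10_000):
    """Neutral heterozygosity 2p(1-p) at the linked site once A is nearly fixed.

    r is the recombination rate between the selected and the neutral site.
    Haplotype frequencies are tracked in the order AB, Ab, aB, ab.
    """
    x = [p_a0, 0.0, (1.0 - p_a0) * p_b, (1.0 - p_a0) * (1.0 - p_b)]
    fitness = [1.0 + s, 1.0 + s, 1.0, 1.0]
    for _ in range(max_gen):
        if x[0] + x[1] >= 0.999:  # beneficial allele A is effectively fixed
            break
        # Selection acts on haplotypes carrying A.
        mean_w = sum(freq * w for freq, w in zip(x, fitness))
        x = [freq * w / mean_w for freq, w in zip(x, fitness)]
        # Recombination erodes the association (linkage disequilibrium) D.
        d = x[0] * x[3] - x[1] * x[2]
        x = [x[0] - r * d, x[1] + r * d, x[2] + r * d, x[3] - r * d]
    p_neutral = x[0] + x[2]  # frequency of neutral allele B
    return 2.0 * p_neutral * (1.0 - p_neutral)


if __name__ == "__main__":
    for rate in (0.0, 0.001, 0.01, 0.1):
        print(rate, round(sweep_heterozygosity(rate), 4))
```

With tight linkage the neutral allele hitchhikes and heterozygosity collapses; with loose linkage most of the neutral variation survives the sweep. Drift is ignored here, so this shows only the hitchhiking effect itself.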
Some important questions:
- How is molecular variation maintained at individual loci?
- Can I analyze and interpret the evolution of my favorite gene under strict neutrality?
- How do I test the fit of my data to the hypothesis of neutrality?
- What does it mean when a neutrality test is rejected for my gene?
- How can I determine if my favorite gene has been subjected to adaptive selection pressures?
[Figure: mutations divided into deleterious, neutral and adaptive fractions under the neutral model and under the selectionist model]
Focused on this fraction, regardless of model. This fraction contains the information we are interested in.
In some cases we are trying to estimate the fraction. In other cases we use tests that assume a strict neutral model.
[Figure: mutation classes under each model, on a scale from deleterious to beneficial; weak selection applies to the slightly deleterious and slightly beneficial classes]
- Strictly neutral model: neutral
- Slightly deleterious model: neutral, slightly deleterious
- Nearly neutral model: neutral, slightly deleterious, slightly beneficial
The strictly neutral model was extended to accommodate nearly neutral mutations
How is molecular variation maintained at individual loci?
Rejecting a model of strict neutrality does not reject neutral evolution! It only means that weak selection might be relevant to variation at the involved locus. There could still be a large fraction of strictly neutral variation.
Can I analyze and interpret the evolution of my favorite gene under strict neutrality?
“nearly neutral theory might be more realistic than the strictly neutral theory, but the latter is certainly more useful than the former.”
Many tests of micro- and macro-evolution are based on the strictly neutral model.
Can I analyze and interpret the evolution of my favorite gene under strict neutrality?
Examples of evolutionary problems that rely on tests formulated from the strictly neutral model include:
- Is a population structured? [What was its history; how much historical gene flow?]
- What is the predicted level of inbreeding depression of a captive [endangered] population?
- Is there a molecular clock? [What was the date of a divergence suggested by a gene?]
- What is the fraction of amino acid substitutions in a particular domain that are neutral?
The answer to this question depends on many things.
How do I test the fit of my population to the hypothesis of neutral evolution?
Two broad categories of tests:
Allelic distribution tests [focus on θ = 4Neµ]:
1. Ewens-Watterson test (1972)
2. Tajima’s D test (1989)
3. Fu and Li’s D test (1993)
Heterogeneity tests [focus on polymorphism within species and divergence between species]:
1. Hudson-Kreitman-Aguadé (1987)
2. McDonald-Kreitman (1991)
3. Many extensions; e.g., Akashi (1994)
Allelic distribution tests
θ can be related to different ways of summarizing population polymorphism:
- k: the mean number of nucleotide differences between a pair of sequences.
- S: the number of variable nucleotide sites in a sample of genes from a population (this is called the number of segregating sites).
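Both summaries can be computed directly from aligned sequences and turned into estimates of θ. The sketch below is a minimal illustration that assumes an ungapped alignment of at least four sequences: k is itself an estimate of θ, S/a1 is Watterson's estimate, and Tajima's D (from the list of tests above) is their difference scaled by the variance constants from Tajima (1989). The function name and the example sequences are hypothetical.

```python
from itertools import combinations
from math import sqrt


def polymorphism_summaries(seqs):
    """Return (k, S, Watterson's theta, Tajima's D) for aligned sequences.

    D is None when it is undefined (fewer than 4 sequences or S == 0).
    """
    n = len(seqs)
    pairs = list(combinations(seqs, 2))
    # k: mean number of differences between a pair of sequences.
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # S: number of alignment columns with more than one nucleotide.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    a1 = sum(1.0 / i for i in range(1, n))
    theta_w = s / a1
    if n < 4 or s == 0:
        return k, s, theta_w, None
    # Variance constants from Tajima (1989).
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    d = (k - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))
    return k, s, theta_w, d


if __name__ == "__main__":
    example = ["AAAA", "AAAT", "AATT", "ATTT"]  # hypothetical data
    print(polymorphism_summaries(example))
```

In practice the significance of D is judged against its expected distribution under neutrality, which this sketch does not attempt.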
Tajima’s D test
Under strict neutrality, k and S/a1 both estimate θ, where a1 = 1 + 1/2 + ... + 1/(n - 1) for a sample of n sequences
Tajima’s D is the scaled difference between them: D is proportional to k - S/a1 [under neutrality D ≈ 0]
Reject neutrality when D is too positive or too negative
- D < 0: directional selection [or population growth]
- D > 0: balancing selection [or subdivision]
Fu and Li’s D has the same interpretation [and limitations]
Heterogeneity tests: Hudson-Kreitman-Aguadé Test
The ratio of polymorphism to divergence should be the same among two [or more] loci under strict neutrality
HKA test applied to synonymous and non-coding variation:
- Directional selection: deficiency of silent polymorphism
- Balancing selection: excess of silent polymorphism
- Recombination rate: determines the extent
[Figure: estimate of synonymous θ plotted across regions of a gene or genome (5’ untranslated DNA, 5’ leader, exon 1, intron 1, exon 2, 3’ DNA). One panel shows a deficiency of polymorphism [selective sweep]; the other shows an excess of polymorphism [balanced polymorphism].]
Heterogeneity tests: McDonald-Kreitman Test
Patterns of polymorphism to divergence: different substitution types
- Synonymous : Nonsynonymous
- Conservative : Radical
- Preferred : Unpreferred codons
Results:
- Positive selection: deficiency of polymorphism
- Negative selection: mildly deleterious mutants lead to excess polymorphism
- Balancing selection: excess polymorphism
Changes in effective population size complicate interpretation of results
Comparison of the ratio of synonymous and nonsynonymous polymorphism within species to divergence between species. Neutral theory suggests that the fraction of variation that is nonsynonymous within species should be the same as between species.
[Figure: tree of Species 1, Species 2 and Species 3 with S:NS counts. Polymorphism within a species: 12:4, 6:2, 10:3. Substitutions between species: 17:6, 14:5, 19:6.]
              Synonymous (S)   Non-synonymous (NS)   S:NS
Polymorphic   28               9                     3.1
Fixed         50               17                    2.9
Data are hypothetical. Ratios are tested by using a G-test on the counts of S and NS. These hypothetical data are not significant. If positive selection were acting, residence times for NS would be lower within species and polymorphic S:NS > fixed S:NS.
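The G-test on the 2 x 2 table of counts can be reproduced in a few lines. The sketch below is a minimal version without any continuity correction; the function name is hypothetical, and the P value uses the chi-square distribution with one degree of freedom, computed through `math.erfc`.

```python
from math import erfc, log, sqrt


def g_test_2x2(poly_s, poly_ns, fixed_s, fixed_ns):
    """G-test of independence for an MK-style 2 x 2 table; returns (G, P)."""
    observed = [[poly_s, poly_ns], [fixed_s, fixed_ns]]
    total = poly_s + poly_ns + fixed_s + fixed_ns
    row_sums = [sum(row) for row in observed]
    col_sums = [observed[0][j] + observed[1][j] for j in range(2)]
    g = 0.0
    for i in range(2):
        for j in range(2):
            count = observed[i][j]
            if count > 0:  # a zero count contributes nothing to G
                expected = row_sums[i] * col_sums[j] / total
                g += 2.0 * count * log(count / expected)
    g = max(g, 0.0)  # guard against tiny negative rounding error
    # Survival function of chi-square with 1 degree of freedom.
    p_value = erfc(sqrt(g / 2.0))
    return g, p_value


if __name__ == "__main__":
    # Hypothetical counts from the table above.
    print(g_test_2x2(28, 9, 50, 17))
```

For the hypothetical counts the test statistic is close to zero, in line with the statement that these data are not significant.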
What does it mean when neutrality is rejected for my gene?
The described tests are collectively called neutrality tests
Problem: the null hypothesis of a neutrality test is a composite hypothesis
i. Strict neutrality
ii. Population demographics are stable
Overdominant selection & population structure ⇒ excess of polymorphism
Selective sweep & bottleneck ⇒ deficiency of polymorphism
[Figure: two histograms of the number of populations against allele frequency (0 to 1): the initial distribution at t = 0 generations and the distribution after t = 50 generations]
population structure ⇒ excess of polymorphism
bottleneck ⇒ deficiency of polymorphism
[Figure: pre-bottleneck population, bottleneck event, post-bottleneck population]
1. Change in allele frequencies, as compared with pre-bottleneck population
2. Reduction in diversity
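The reduction in diversity through a bottleneck follows from the standard expectation that heterozygosity decays by a factor (1 - 1/(2N)) per generation in a population of size N. The sketch below applies that expectation generation by generation; the function name and the population sizes are hypothetical, and mutation is ignored.

```python
def expected_heterozygosity(h0, population_sizes):
    """Expected heterozygosity after each generation of drift.

    population_sizes lists the (diploid) population size in every generation,
    so a bottleneck is simply a run of small values. Mutation is ignored.
    """
    trajectory = [h0]
    for n in population_sizes:
        trajectory.append(trajectory[-1] * (1.0 - 1.0 / (2.0 * n)))
    return trajectory


if __name__ == "__main__":
    # Hypothetical history: large population, 5-generation bottleneck, recovery.
    sizes = [1000] * 10 + [10] * 5 + [1000] * 10
    h = expected_heterozygosity(0.5, sizes)
    print(round(h[10], 4), round(h[15], 4), round(h[-1], 4))
```

Almost all of the loss happens during the few bottleneck generations, and recovery of population size does not restore the lost diversity; only new mutation (not modelled here) can do that.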
Significant neutrality test ≠ evidence against neutral theory
A significant result can be obtained from sequences in a population that is evolving under a strictly neutral model but has experienced a recent change in population size.
Significant neutrality test ≠ unambiguous evidence against neutral evolution.
Evidence of a selective sweep is not evidence against a large fraction of unlinked sites, in the same gene and in other genes, being dominated by neutral evolution.
MK test differs from the others
Are these tests useful? YES!
Example: Adh of Drosophila melanogaster
- Alcohol dehydrogenase (Adh) is an enzyme involved in alcohol metabolism [along with many other genes]
- Adh activity is necessary because fruit in the diet often contains alcohol
- Adh has two alleles: Fast (F) and Slow (S)
- The F allele has up to two-fold higher activity than the S allele
- Alcohol tolerance is significantly correlated with the frequency of the F allele
- Allele frequencies exhibit a cline [the most famous cline in biology]
- Clines are reciprocal in the northern and southern hemispheres
Adh cline is impressive
(A) Diagram of a protein gel electrophoresis apparatus, and (B) a photograph of a “stained” protein gel. The blue “blotches” are the proteins; their position indicates how far they migrated in the electric field. [Gel bands labelled Fast and Slow]
The F and S alleles are due to a single amino acid substitution. Could this single change really be the direct target of natural selection?
Reactions can depend on the capture of a specific covalent bond!
Subtle changes in the precise orientation of amino acid side chains can increase or decrease an enzymatic reaction by 1 million fold!
- Complementary surfaces
- Precise orientation with respect to side chains
Complementary surfaces or clefts:
i. Very precise 3D surfaces
ii. Side chain interactions via physicochemical properties of side chains
Hang on; are there non-adaptive explanations as well? Yes.
1. Structured populations with moderate gene flow
2. Adh polymorphism is neutral, and only linked to another locus subject to natural selection
HKA and MK tests used to directly test the F/S variation.
Until now, all arguments have assumed that alcohol metabolism correlates with fitness
[Figure: map of the Adh and Adh-dup region, including the 5’ untranslated region, with nucleotide positions from -1500 to 3500]
Kreitman and Hudson (1991)
- 4750 nucleotide region of DNA [see map]
- 11 sequences of D. melanogaster
- Subdivided into “windows” of 100 bp
- Measured diversity at synonymous and non-coding sites
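The window-based measurement can be sketched as follows. This is an illustration only, not the procedure of Kreitman and Hudson (1991): it assumes non-overlapping windows and counts differences at every site, whereas the study restricted the count to synonymous and non-coding sites. The function name and example sequences are hypothetical.

```python
from itertools import combinations


def windowed_diversity(seqs, window=100):
    """Average pairwise difference k in consecutive windows of an alignment.

    Returns a list of (window start, k) tuples. All sites are counted, which
    is a simplification of the synonymous/non-coding restriction in the study.
    """
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    result = []
    for start in range(0, length, window):
        end = min(start + window, length)
        differences = sum(
            sum(1 for a, b in zip(x[start:end], y[start:end]) if a != b)
            for x, y in pairs
        )
        result.append((start, differences / len(pairs)))
    return result


if __name__ == "__main__":
    example = ["AAAAAA", "AAAATT", "AAAATA"]  # hypothetical alignment
    print(windowed_diversity(example, window=2))
```

Plotting k per window against position, next to the value expected under neutrality, gives the kind of observed-versus-expected profile used to look for local excesses or deficiencies of polymorphism.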
[Figure: observed and expected average pairwise difference, k, plotted against nucleotide position (-1500 to 3500)]
Results from Kreitman and Hudson (1991)
- Very compelling evidence for a balanced polymorphism in exon 4 centered on window 1500 [F/S site: position 1490]
- Evidence for selective sweep in Adh-dup: adaptive fine-tuning
BUT not completely definitive on its own:
- Look at other [non-enzyme] genes: no cline
- Look at other enzyme genes: cline
Comparison of the ratio of synonymous and nonsynonymous polymorphism to divergence in Drosophila is not consistent with strict neutral theory. Nonsynonymous changes represent a significantly smaller fraction of the variation within species as compared with substitutions between species.
[Figure: tree of D. melanogaster, D. simulans and D. yakuba showing polymorphism within a species and substitutions between species, with S:NS counts 14:2, 11:0, 17:0, 2:1, 1:0, 15:5]
              Synonymous (S)   Non-synonymous (NS)   S:NS
Polymorphic   42               2                     21
Fixed         17               7                     2.4
Data from: McDonald and Kreitman (1991). Ratios are significantly different according to a G-test (P = 0.006)
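The reported significance can be checked from the four counts. The sketch below assumes that the G statistic was adjusted with Williams' correction; that is an assumption, since the slide says only "G-test". The function name is hypothetical.

```python
from math import erfc, log, sqrt


def g_test_williams(table):
    """G-test for a 2 x 2 table with Williams' correction; returns (G_adj, P)."""
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    g = 0.0
    for i in range(2):
        for j in range(2):
            if table[i][j] > 0:
                expected = row_sums[i] * col_sums[j] / total
                g += 2.0 * table[i][j] * log(table[i][j] / expected)
    # Williams' correction factor for a 2 x 2 table (assumed, see lead-in).
    q = 1.0 + (
        (total / row_sums[0] + total / row_sums[1] - 1.0)
        * (total / col_sums[0] + total / col_sums[1] - 1.0)
    ) / (6.0 * total)
    g_adj = max(g / q, 0.0)
    # Chi-square survival function with 1 degree of freedom.
    return g_adj, erfc(sqrt(g_adj / 2.0))


if __name__ == "__main__":
    adh_counts = [[42, 2], [17, 7]]  # rows: polymorphic, fixed; columns: S, NS
    print(g_test_williams(adh_counts))
```

With these counts the corrected statistic gives a P value close to the reported 0.006; without the correction the P value comes out slightly smaller.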
McDonald and Kreitman (1991) used the MK test [not as sensitive to demographics]
Two explanations of the results:
1. Fixation of adaptive mutations [selective sweep effect]
2. Dramatic recent expansion in Ne [see below]