SLIDE 1
Estimating the contribution of sequence context to nucleotide substitution rate heterogeneity
Helen Lindsay and Gavin A. Huttley
SLIDE 2 The Gamma Model
- Yang (1993) used a gamma distribution to model rate variation in α- and β-globin genes
- The gamma distribution is often approximated by four equiprobable bins
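A minimal sketch of the four-bin approximation mentioned above. It assumes a gamma distribution with mean 1 (shape alpha, scale 1/alpha) and estimates the mean rate of each equiprobable bin by sampling. This is a Monte Carlo stand-in for illustration, not the analytical calculation used in published software; the function name, alpha value, sample size and seed are all hypothetical choices.

```python
import random


def discrete_gamma_rates(alpha, n_bins=4, n_samples=200_000, seed=0):
    """Estimate the mean rate in each of n_bins equiprobable gamma bins.

    Assumption: rates follow a gamma distribution with shape alpha and
    scale 1/alpha, so the expected rate is 1.
    """
    rng = random.Random(seed)
    draws = sorted(rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_samples))
    size = n_samples // n_bins  # each bin holds the same share of the draws
    means = [sum(draws[i * size:(i + 1) * size]) / size for i in range(n_bins)]
    # Rescale so the category rates average exactly 1.
    overall = sum(means) / n_bins
    return [m / overall for m in means]


rates = discrete_gamma_rates(alpha=0.5)
for category, rate in enumerate(rates, start=1):
    print(f"bin {category}: relative rate {rate:.3f}")
```

A small alpha gives strongly unequal bin rates (most sites slow, a few fast); a large alpha pushes all four rates towards 1.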
SLIDE 3
Gamma rate variation
SLIDE 4 Improvements on the Gamma model
- Allow sites to change rates
- Allow clustering of rates
- Consider other/multiple rate distributions
SLIDE 5
What causes substitution rate variation?
SLIDE 6 What causes substitution rate variation?
- Natural selection
SLIDE 7 What causes substitution rate variation?
- Natural selection
- Differential repair
SLIDE 8 What causes substitution rate variation?
- Natural selection
- Differential repair
- Nucleotide properties
SLIDE 9
AG CG TG (slow) (fast)
SLIDE 10 Data
- 470 alignments, each 50 000 nucleotides long, of introns from human, chimpanzee and macaque one-to-one orthologs.
- Sampled from Ensembl version 49.
SLIDE 11
SLIDE 12
The baseline model
SLIDE 13
The CpG model
SLIDE 14
The Gamma Model
SLIDE 15
Gamma vs Dinucleotide models
SLIDE 16
Gamma vs Dinucleotide models
SLIDE 17
Gamma vs Dinucleotide models
SLIDE 18
186.05 40.77 51.07 175.52
SLIDE 19 Accounting for CpG substitutions decreases rate variation
SLIDE 20
- Independent sites
- Reversible
- Compositional variance
[Figure: labels include alignment position (nucleotides), G+C%, G+C% (alignment), GA, GG, rate]
SLIDE 21 Advantages of dinucleotide models
- Less likelihood computation
- Equivalently parameter-rich
- No assumed distribution of rate variation
- Can incorporate known mutation biases, for example deamination of methylated cytosine.
- Smaller alphabet than amino acids
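To illustrate the mutation-bias point above, here is a minimal sketch that splits the differences between two aligned sequences by whether the site falls within a CpG dinucleotide. It is an illustration only, not the models fitted in this talk. The function name and the example sequences are hypothetical, and the first sequence is treated as the reference purely as a simplifying assumption.

```python
def classify_differences(seq_a, seq_b):
    """Count aligned differences by whether the seq_a site lies in a CpG.

    Assumption: seq_a is treated as the reference state. Gapped columns
    ("-") are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"CpG": 0, "other": 0}
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        if a == b or "-" in (a, b):
            continue
        # A C followed by G, or a G preceded by C, is part of a CpG.
        c_of_cpg = a == "C" and seq_a[i + 1:i + 2] == "G"
        g_of_cpg = a == "G" and i > 0 and seq_a[i - 1] == "C"
        counts["CpG" if (c_of_cpg or g_of_cpg) else "other"] += 1
    return counts


example = classify_differences("ACGTACGGA", "ATGCACAGA")
print(example)
```

In the example, the C to T and G to A differences both sit in CpG dinucleotides of the first sequence, which is the pattern expected from deamination of methylated cytosine on either strand.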
SLIDE 22 Acknowledgements
Australian National University
University of Singapore