Hitch-hiking and polygenic adaptation

  1. Hitch-hiking and polygenic adaptation. Kevin Thornton, Ecology and Evolutionary Biology, UC Irvine

  2. Linked selection vs. fates of selected mutations. Hudson & Kaplan, 1995; De Vladar & Barton, 2014

  3. Modeling traditions
     Population genetics     | Evol. quantitative genetics
     Fixed effect sizes      | Variable effect sizes
     Single selected site    | Many sites
     Directional selection   | Stabilizing selection
     Partial linkage         | LE or QLE

  4. Tree structures. [Figure: example genealogies; panels: "Neutral" and "Recent hard sweep".]

  5. Patterns reflect the tree structures

  6. Linked selection during polygenic adaptation
     • Use forward simulations
     • fwdpy11 is a Python package
     • Uses a C++ back-end (Thornton, 2014, Genetics)

  7. Simulation scheme
     • Fitness under Gaussian stabilizing selection (GSS): $w = e^{-(z - z_o)^2 / (2 V_S)}$
     • 10 unlinked loci, θ = ρ = 1,000 per locus
     • Additive mutations arise at rate µ, Θ = 4Nµ
     • Two thetas cannot possibly be confusing.
     • N = 5,000 diploids
     • Evolve under GSS for 10N generations with optimal trait value of 0
     • Shift optimal trait value to z_o > 0 and evolve for 10N more generations
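
The fitness function and the scaled mutation rate on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not fwdpy11 code: the function names `gss_fitness` and `scaled_mutation_rate` are hypothetical, and the µ values are the ones that appear on the figure panels of the later slides.

```python
import math


def gss_fitness(z, z_opt, VS):
    """Gaussian stabilizing selection: w = exp(-(z - z_opt)^2 / (2 * VS))."""
    return math.exp(-((z - z_opt) ** 2) / (2.0 * VS))


def scaled_mutation_rate(N, mu):
    """Theta = 4 * N * mu for a diploid population of size N."""
    return 4.0 * N * mu


N = 5000  # diploids, as on the slide

# Mutation rates taken from the figure panels (assumed to be the same mu).
for mu in (0.00025, 0.001, 0.005):
    print(f"mu = {mu}: Theta = {scaled_mutation_rate(N, mu):g}")

# Fitness of an individual still at the old optimum (z = 0) right after
# the optimum shifts to z_o = 1, with V_S = 1.
print(gss_fitness(0.0, 1.0, 1.0))
```

With N = 5,000, the three mutation rates correspond to Θ of about 5, 20, and 100. An individual sitting at the old optimum has fitness exp(-1/2), roughly 0.61, immediately after a shift to z_o = 1 with V_S = 1.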

  8. Adaptation occurs rapidly and before fixation. [Figure 1: Large optimum shift, z_o = 1 with V_S = 1. Panels: µ = 2.5 × 10^-4, 0.001, 0.005. Top row: mean trait value and 5 × V(G); lower rows: mutation frequency trajectories, each labeled with effect size and origination time. x-axis: time since optimum shift (units of N generations).]

  9. Contributions of different loci. [Figure 2: Mean trait value per locus, colored by rank. Panels: optimal trait value z_o = 1 with mutation rate µ = 0.00025 and µ = 0.005. y-axis: mean genetic value of locus; x-axis: time since optimum shift (units of N generations).]

  10. Sweeps from SGV start out rare. [Figure: number of haplotypes with mutation vs. effect size. Panels: z_o = 1.00 with µ = 0.00025, 0.001, 0.005.] This predicts “hard” sweep signals due to sweeps from large-effect SGV.

  11. Temporal and spatial patterns of “selection signals”. [Figure 3: Mean statistic per window over time for a large optimum shift. Panels: z_o = 1 with µ = 0.00025, 0.001, 0.005. Rows: mean H' and mean z-score. Series: distance from window with causal mutations (0 to 5). x-axis: time since optimum shift (units of N generations).] z-scores are for the nSL statistic (Ferrer-Admetlla et al., 2014, MBE).

  12. Similar patterns for new mutations vs SGV. [Figure 4: Same data, but conditioning on fixations of large effect. Panels: z_o = 1 with µ = 0.00025, 0.001, 0.005, shown separately for standing variation and new mutation. Rows: mean H' and mean z-score. x-axis: time since optimum shift (units of N generations).]

  13. Mutational variance matters. [Figure 5: Choose σ_µ so that probability of a large-effect mutation is constant. Time scale is determined by δq of fixations. Panels: Pr(|γ| ≥ γ̂) = 0.75 and Pr(|γ| ≥ γ̂) = 0.05, each with µ = 0.00025, 0.001, 0.005. y-axis: mean H'; series: distance from window with causal mutations (0 to 5); x-axis: time since optimum shift (units of N generations).]
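
One way to read "choose σ_µ so that probability of a large-effect mutation is constant" is as a quantile calculation. The sketch below assumes that effect sizes are Gaussian with mean zero and standard deviation σ_µ, and solves Pr(|γ| ≥ γ̂) = p for σ_µ. The function names and the threshold γ̂ = 0.1 are hypothetical; the slide does not say how γ̂ is defined, so this may not match the procedure actually used for the figure.

```python
from statistics import NormalDist


def sigma_for_tail_probability(gamma_hat, p):
    """Return sigma such that Pr(|gamma| >= gamma_hat) = p for gamma ~ N(0, sigma).

    Uses Pr(|gamma| >= g) = 2 * (1 - Phi(g / sigma)), so
    sigma = g / Phi^{-1}(1 - p / 2).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return gamma_hat / NormalDist().inv_cdf(1.0 - p / 2.0)


def tail_probability(gamma_hat, sigma):
    """Pr(|gamma| >= gamma_hat) for gamma ~ N(0, sigma)."""
    return 2.0 * (1.0 - NormalDist(0.0, sigma).cdf(gamma_hat))


gamma_hat = 0.1  # hypothetical "large effect" threshold
for p in (0.75, 0.05):  # the two probabilities shown on the figure panels
    sigma = sigma_for_tail_probability(gamma_hat, p)
    print(f"p = {p}: sigma_mu = {sigma:.4f}, check = {tail_probability(gamma_hat, sigma):.4f}")
```

Under these assumptions, a larger target probability p requires a larger σ_µ for the same threshold, which is why the two rows of panels differ in mutational variance even at the same µ.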

  14. Implications
     • Patterns unique to “soft sweeps” are not generated by this model!
     • We are using supervised machine learning (Schrider/Kern) to further investigate this.
     • Hitch-hiking signals decrease as Θ increases
     • Keep in mind that our “tests” are usually designed to detect hard sweeps
     Data not shown:
     • Small optimum shifts leave less dramatic patterns

  15. Tree sequences: representing genetic data using tables. Kelleher et al., 2016, PLoS Computational Biology, a.k.a. "The msprime paper". [Figure: tree topologies and mutations for three samples; two trees covering the intervals 0 to 5 and 5 to 10; y-axis: time ago; x-axis: genomic position.]
      Nodes:
        ID  time
        0   0.0
        1   0.0
        2   0.0
        3   1.0
        4   2.0
      Edges:
        left  right  parent  child
        0     10     3       1
        0     10     4       3
        0     5      3       0
        0     5      4       2
        5     10     3       2
        5     10     4       0
      Sites:
        ID  position  ancestral state
        0   2.5       A
        1   7.5       G
      Mutations:
        ID  site  node  derived state
        0   0     2     T
        1   1     3     C
        2   1     1     G
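
To make the table representation concrete, here is a small pure-Python sketch that stores the node, edge, site, and mutation tables as read from this slide and recovers each sample's allele at each site. It deliberately avoids the tskit and msprime APIs. The names `Edge`, `Site`, `Mutation`, `parents_at`, and `genotypes` are hypothetical, nodes 0 to 2 are assumed to be the samples, and edge intervals are treated as half-open.

```python
from collections import namedtuple

Edge = namedtuple("Edge", "left right parent child")
Site = namedtuple("Site", "position ancestral_state")
Mutation = namedtuple("Mutation", "site node derived_state")

# Tables as read from the slide (row order follows the slide).
node_times = [0.0, 0.0, 0.0, 1.0, 2.0]
samples = [0, 1, 2]  # assumption: the three time-0 nodes are the samples
edges = [
    Edge(0, 10, 3, 1),
    Edge(0, 10, 4, 3),
    Edge(0, 5, 3, 0),
    Edge(0, 5, 4, 2),
    Edge(5, 10, 3, 2),
    Edge(5, 10, 4, 0),
]
sites = [Site(2.5, "A"), Site(7.5, "G")]
mutations = [Mutation(0, 2, "T"), Mutation(1, 3, "C"), Mutation(1, 1, "G")]


def parents_at(position):
    """Child -> parent map for the tree covering `position` (left <= x < right)."""
    return {e.child: e.parent for e in edges if e.left <= position < e.right}


def genotypes(site_id):
    """Allele carried by each sample at a site.

    Walk from the sample toward the root; the first mutation met on the way
    up is the most recent one and determines the sample's state. Assumes at
    most one mutation per node at a given site.
    """
    site = sites[site_id]
    parent = parents_at(site.position)
    state_on_node = {m.node: m.derived_state for m in mutations if m.site == site_id}
    alleles = []
    for u in samples:
        while u is not None and u not in state_on_node:
            u = parent.get(u)  # None once we step past the root
        alleles.append(site.ancestral_state if u is None else state_on_node[u])
    return alleles


for site_id, site in enumerate(sites):
    print(site.position, genotypes(site_id))
```

The point of the design is that each edge is stored once with the genomic interval over which it applies, so adjacent trees share the edges they have in common (here, the edges 3→1 and 4→3 span the whole region) instead of each tree being stored separately.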

  16. Tree sequence simplification... (Kelleher et al., 2018, PLoS Computational Biology)

  17. ...can be done in FAST linear time...

  18. ...and give a huge performance boost... [Figure: speedup due to pedigree recording vs. scaled recombination rate (ρ = 4Nr, from 10^3 to 10^5). Series: N = 1e+03, N = 1e+04, N = 5e+04.]

  19. ...allowing chromosome-scale simulations in large N. [Figure 6: N = 2 × 10^5 diploids, ρ = 10^5 (≈ 100 Mb in humans), γ ∼ N(0, 0.25), V_S = 1. Analysis based on n = 3,000 diploids. Panels: Θ = 10 ("Polygenic adaptation") and Θ = 100 ("Complete and partial sweeps"). y-axis: expected proportion of singleton mutations; x-axis: distance from trees with selected mutations (units of 4Nr); series: generations since optimum shift (0 to 250).]

  20. Facilitates better testing
     • Methods for detecting polygenic adaptation of continuous traits shouldn’t be evaluated with simulations of strong sweeps.
     • Methods assuming linkage equilibrium need to be tested using simulations involving partial linkage.
     • etc.

  21. Resources
     • fwdpy11: https://fwdpy11.readthedocs.org
     • msprime: https://msprime.readthedocs.org
     • Tree sequence tutorials: https://tskit-dev.github.io/tutorials/
     • The tree sequence toolkit: https://github.com/tskit-dev/tskit (“almost ready”)

  22. Thanks
     • David Lawrie
     • Khoi Hyunh
     • Jaleal Sanjak
     • Tony Long
     • Jerome Kelleher, Jaime Ashander, Peter Ralph
     • NIH for funding
     • UCI HPCC for computing support
