Hitch-hiking and polygenic adaptation (Kevin Thornton, Ecology and Evolutionary Biology, UC Irvine)



SLIDE 1

Hitch-hiking and polygenic adaptation

Kevin Thornton

Ecology and Evolutionary Biology, UC Irvine

SLIDE 2

Linked selection vs. fates of selected mutations

Hudson & Kaplan, 1995; De Vladar & Barton, 2014

SLIDE 3

Modeling traditions

Population genetics    | Evol. quantitative genetics
Fixed effect sizes     | Variable effect sizes
Single selected site   | Many sites
Directional selection  | Stabilizing selection
Partial linkage        | LE or QLE

SLIDE 4

Tree structures

[Figure panels: neutral; recent hard sweep]

SLIDE 5

Patterns reflect the tree structures

SLIDE 6

Linked selection during polygenic adaptation

  • Use forward simulations
  • fwdpy11 is a Python package
  • Uses a C++ back-end (Thornton, 2014, Genetics)

SLIDE 7

Simulation scheme

[Diagram of a locus]

Fitness of an individual with trait value z: $w = e^{-(z - z_o)^2 / (2 V_S)}$

  • 10 unlinked loci, θ = ρ = 1,000 per locus
  • Additive mutations arise at rate µ, Θ = 4Nµ
  • Two thetas cannot possibly be confusing.
  • N = 5,000 diploids
  • Evolve under Gaussian stabilizing selection (GSS) for 10N generations with an optimal trait value of 0
  • Shift the optimal trait value to z_o > 0 and evolve for 10N more generations
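
The fitness function and the scaled mutation rate in this scheme can be sketched in a few lines of plain Python. This is only an illustration of the two formulas on the slide, not the fwdpy11 implementation; the function names are made up for the example, V_S = 1 and z_o = 1 are taken from the Figure 1 caption, and the three values of µ are the ones shown in the later figure panels.

```python
import math


def gss_fitness(z, z_opt, vs):
    """Gaussian stabilizing selection: w = exp(-(z - z_opt)^2 / (2 * VS))."""
    return math.exp(-((z - z_opt) ** 2) / (2.0 * vs))


def scaled_mutation_rate(n_diploids, mu):
    """Theta = 4 * N * mu, the 'second theta' on the slide."""
    return 4.0 * n_diploids * mu


N = 5_000   # diploids, from the slide
VS = 1.0    # from the Figure 1 caption

# Scaled mutation rate for each mutation rate used in the figures.
for mu in (2.5e-4, 1e-3, 5e-3):
    print(f"mu = {mu:g}: Theta = {scaled_mutation_rate(N, mu):g}")

# An individual sitting at the old optimum (z = 0) before and after
# the optimum moves to z_o = 1.
w_before = gss_fitness(0.0, 0.0, VS)
w_after = gss_fitness(0.0, 1.0, VS)
print(f"fitness before shift: {w_before:.4f}, after shift: {w_after:.4f}")
```

With N = 5,000 the three mutation rates correspond to Θ = 5, 20 and 100, and an individual at the old optimum drops from fitness 1 to e^(-1/2) when the optimum shifts by one unit.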

SLIDE 8

Adaptation occurs rapidly and before fixation

[Figure 1 panels, one column per mutation rate, all with z_o = 1: µ = 2.5 × 10^-4 (γ̂ = 0.045), µ = 0.001 (γ̂ = 0.089), µ = 0.005 (γ̂ = 0.200). Top row: mean trait value and 5 × V(G). Lower rows: mutation frequency against time since optimum shift (units of N generations), with each trajectory labelled by its effect size γ and origination time o.]

Figure 1: Large optimum shift, z_o = 1 with V_S = 1.

SLIDE 9

Contributions of different loci

[Figure 2 panels: µ = 0.00025 and µ = 0.005, both with optimal trait value z_o = 1. x-axis: time since optimum shift (units of N generations). y-axis: mean genetic value of locus.]

Figure 2: Mean trait value per locus, colored by rank.

SLIDE 10

Sweeps from SGV start out rare

[Figure panels: µ = 0.00025, µ = 0.001 and µ = 0.005, all with z_o = 1.00. Axes: effect size and number of haplotypes with mutation.]

This predicts “hard” sweep signals due to sweeps from large-effect SGV.

SLIDE 11

Temporal and spatial patterns of “selection signals”

[Figure 3 panels: µ = 0.00025, µ = 0.001 and µ = 0.005, all with z_o = 1. Rows: mean H' and mean z-score. x-axis: time since optimum shift (units of N generations). Series: distance from window with causal mutations (1 to 5).]

Figure 3: Mean statistic per window over time for a large optimum shift. z-scores are for the nSL statistic (Ferrer-Admetlla et al., 2014, MBE).

SLIDE 12

Similar patterns for new mutations vs SGV

[Figure 4 panels: µ = 0.00025, µ = 0.001 and µ = 0.005, all with z_o = 1, shown separately for new mutations and for standing variation. Rows: mean H' and mean z-score. x-axis: time since optimum shift (units of N generations).]

Figure 4: Same data, but conditioning on fixations of large effect

SLIDE 13

Mutational variance matters

[Figure 5 panels: µ = 0.00025, µ = 0.001 and µ = 0.005, each shown for Pr(|γ| ≥ γ̂) = 0.05 and for Pr(|γ| ≥ γ̂) = 0.75. x-axis: time since optimum shift (units of N generations). y-axis: mean H'. Series: distance from window with causal mutations (1 to 5).]

Figure 5: Choose σµ so that probability of a large-effect mutation is constant. Time scale is determined by δq of fixations.
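
One way to pick σµ for a target tail probability is sketched below. It assumes that effect sizes are Gaussian with mean zero and standard deviation σµ, which the slides state explicitly only for Figure 6, so treat that as an assumption here. Under it, Pr(|γ| ≥ γ̂) = 2(1 − Φ(γ̂/σµ)), and solving for σµ gives σµ = γ̂ / Φ⁻¹(1 − p/2). The function names and the value γ̂ = 0.1 are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def sigma_mu_for_tail(gamma_hat, p):
    """Return sigma_mu such that Pr(|gamma| >= gamma_hat) = p,
    assuming gamma ~ N(0, sigma_mu^2)."""
    if gamma_hat <= 0.0:
        raise ValueError("gamma_hat must be positive")
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    # Two-sided tail: p / 2 of the mass lies above gamma_hat.
    return gamma_hat / NormalDist().inv_cdf(1.0 - p / 2.0)


def tail_probability(gamma_hat, sigma_mu):
    """Pr(|gamma| >= gamma_hat) for gamma ~ N(0, sigma_mu^2)."""
    return 2.0 * (1.0 - NormalDist(0.0, sigma_mu).cdf(gamma_hat))


gamma_hat = 0.1  # hypothetical threshold for a "large-effect" mutation
for p in (0.05, 0.75):  # the two probabilities shown in the Figure 5 panels
    s = sigma_mu_for_tail(gamma_hat, p)
    print(f"p = {p}: sigma_mu = {s:.4f}, check = {tail_probability(gamma_hat, s):.4f}")
```

The check column recomputes the tail probability from the returned σµ, so it should reproduce the requested p. A larger p requires a larger σµ for the same threshold.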

SLIDE 14

Implications

  • Patterns unique to “soft sweeps” are not generated by this model!
  • We are using supervised machine learning (Schrider/Kern) to further investigate this.
  • Hitch-hiking signals decrease as Θ increases
  • Keep in mind that our “tests” are usually designed to detect hard sweeps

Data not shown:

  • Small optimum shifts leave less dramatic patterns

SLIDE 15

Tree sequences: representing genetic data using tables

Kelleher, et al. 2016. PLoS Computational Biology a.k.a "The msprime paper"

[Figure: an example tree sequence, drawn as trees (axes: time ago and genomic position) and stored as four tables.]

  • Nodes: ID, time
  • Edges: left, right, parent, child
  • Sites: ID, position, ancestral state (positions 2.5 and 7.5, ancestral states A and G)
  • Mutations: ID, site, node, derived state

The figure also shows the tree topologies with their mutations and the genomic intervals each tree covers.
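
The table layout can be illustrated with a small pure-Python sketch. This is not the tskit API, and the node and edge values below are illustrative rather than a transcription of the slide; zero-based IDs and the helper names are choices made for the example. Each edge says that `child` inherits from `parent` over the half-open genomic interval [left, right), so the tree at any position is recovered by selecting the edges that cover it.

```python
from typing import NamedTuple


class Node(NamedTuple):
    time: float  # time ago; 0.0 for present-day samples


class Edge(NamedTuple):
    left: float
    right: float
    parent: int
    child: int


# Illustrative tables: three samples (nodes 0-2) and two ancestors (3, 4).
# A node's ID is its row index in the node table.
nodes = [Node(0.0), Node(0.0), Node(0.0), Node(1.0), Node(2.0)]
edges = [
    Edge(0, 10, 3, 0),
    Edge(0, 5, 3, 1),
    Edge(5, 10, 3, 2),
    Edge(0, 5, 4, 2),
    Edge(5, 10, 4, 1),
    Edge(0, 10, 4, 3),
]


def parents_at(position, edge_table):
    """Return {child: parent} for the tree covering `position`."""
    return {e.child: e.parent for e in edge_table if e.left <= position < e.right}


def breakpoints(edge_table):
    """Sorted positions at which the tree can change."""
    return sorted({e.left for e in edge_table} | {e.right for e in edge_table})


print(breakpoints(edges))
for pos in (2.5, 7.5):
    print(pos, parents_at(pos, edges))
```

In this toy example, samples 1 and 2 swap parents at position 5 while the rest of the genealogy is shared, which is why storing edges with intervals is more compact than storing each tree separately. For real analyses the tskit library provides these tables and the tree iteration directly.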

SLIDE 16

Tree sequence simplification...

Kelleher, et al. 2018. PLoS Computational Biology

SLIDE 17

... can be done in FAST linear time...

SLIDE 18

... and gives a huge performance boost...

[Figure: speedup due to pedigree recording (y-axis, 10 to 50) against scaled recombination rate ρ = 4Nr (x-axis, 10^3 to 10^5), for N = 10^3, N = 10^4 and N = 5 × 10^4.]

SLIDE 19

... allowing chromosome-scale simulations in large N

[Figure 6 panels: Θ = 10 and Θ = 100. x-axis: distance from trees with selected mutations (units of 4Nr). y-axis: expected proportion of singleton mutations. Series: generations since optimum shift (50 to 250). Annotated regions: "Complete and partial sweeps" domain and "Polygenic adaptation" domain.]

Figure 6: N = 2 × 10^5 diploids, ρ = 10^5 (≈ 100 Mb in humans), γ ∼ N(0, 0.25), V_S = 1. Analysis based on n = 3,000 diploids.

SLIDE 20

Facilitates better testing

  • Methods for detecting polygenic adaptation of continuous traits shouldn’t be evaluated with simulations of strong sweeps.
  • Methods assuming linkage equilibrium need to be tested using simulations involving partial linkage.

  • etc.

SLIDE 21

Resources

  • fwdpy11: https://fwdpy11.readthedocs.org
  • msprime: https://msprime.readthedocs.org
  • Tree sequence tutorials: https://tskit-dev.github.io/tutorials/
  • The tree sequence toolkit: https://github.com/tskit-dev/tskit (“almost ready”)

SLIDE 22

Thanks

  • David Lawrie
  • Khoi Hyunh
  • Jaleal Sanjak
  • Tony Long
  • Jerome Kelleher, Jaime Ashander, Peter Ralph
  • NIH for funding
  • UCI HPCC for computing support
