Recent selection in Tibet, Greenland & China, Anders Albrechtsen



SLIDE 1

Recent selection in Tibet, Greenland & China

Anders Albrechtsen April 3, 2019

SLIDE 2

Signatures of recent/ongoing selection Tibet Greenland SFS for NGS data

SLIDE 3

SLIDE 4

Probability of fixation

SLIDE 5

SLIDE 6

SLIDE 7

Altitude adaptation in Tibet

SLIDE 8

Altitude adaptation in Tibet

Yi et al. 2010

  • Low oxygen has a large effect on fitness
  • People living at high altitude generally have more birth defects
SLIDE 9

Oxygen and height

SLIDE 10

Altitude adaptation in Tibet

Yi et al. 2010

  • Full exomes of 50 Tibetan individuals, sequenced at an average coverage of 18X.
  • Compared to 40 Han Chinese individuals sequenced at an average of 6X (1000G).
  • Estimated joint allele frequencies for each SNP using a Bayesian approach.

SLIDE 11

2D site frequency spectrum

SLIDE 12

2D SFS and Fst

Fst from heterozygosity:

$$F_{ST} = \frac{\sigma_B}{\sigma_T} = \frac{H_{\text{total}} - H_{\text{subpopulations}}}{H_{\text{total}}}$$
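The heterozygosity form of Fst can be sketched for a single biallelic site as follows. This is only an illustration: it assumes equally weighted subpopulations and takes heterozygosity as 2p(1−p), and the function names are hypothetical rather than taken from the lecture or any tool.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_from_heterozygosity(freqs):
    """Fst = (H_total - H_subpopulations) / H_total.

    freqs: allele frequency of the same allele in each subpopulation.
    Assumption: all subpopulations get equal weight.
    """
    p_bar = sum(freqs) / len(freqs)
    h_total = heterozygosity(p_bar)
    if h_total == 0.0:
        # Monomorphic site: no variation to partition.
        return 0.0
    h_sub = sum(heterozygosity(p) for p in freqs) / len(freqs)
    return (h_total - h_sub) / h_total


if __name__ == "__main__":
    print(fst_from_heterozygosity([0.5, 0.5]))  # identical frequencies
    print(fst_from_heterozygosity([0.2, 0.8]))  # moderately differentiated
    print(fst_from_heterozygosity([0.0, 1.0]))  # fixed difference
```

Genome-wide estimates are usually computed as a ratio of sums over sites rather than an average of per-site ratios, and published analyses often use a different estimator than this simple one.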

SLIDE 13

Population Branch Statistic (PBS)

$$\mathrm{PBS} = \mathrm{TBS} = \frac{T^{TH} + T^{TD} - T^{HD}}{2}, \qquad T^{AB} = -\log\left(1 - F_{st}^{AB}\right)$$

SLIDE 14

Population frequencies

EPAS1 SNP allele frequencies

Allele   Tibetan   Han      Danish
C        0.13      0.9125   1
G        0.87      0.0875
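As a rough worked example, the frequencies above can be plugged into the PBS definition from the previous slide. This sketch reads T, H and D in that formula as Tibetan, Han and Danish, sets the Danish G frequency to 0 (inferred from the C frequency of 1), and uses a simple two-population heterozygosity-based Fst with equal weights. All three are assumptions, and the published analysis may have used a different Fst estimator.

```python
from math import log


def fst(p1, p2):
    """Two-population Fst from heterozygosity, equal weights (illustrative assumption)."""
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    h_sub = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_sub) / h_total


def branch_length(f):
    """T = -log(1 - Fst); undefined when Fst = 1 (a fixed difference)."""
    return -log(1.0 - f)


def pbs(f_target_a, f_target_b, f_a_b):
    """Population branch statistic for the target population."""
    return (branch_length(f_target_a) + branch_length(f_target_b)
            - branch_length(f_a_b)) / 2.0


# Frequency of the G allele; the Danish value is an assumption (1 - C frequency).
freq_g = {"Tibetan": 0.87, "Han": 0.0875, "Danish": 0.0}

f_th = fst(freq_g["Tibetan"], freq_g["Han"])
f_td = fst(freq_g["Tibetan"], freq_g["Danish"])
f_hd = fst(freq_g["Han"], freq_g["Danish"])
pbs_tibetan = pbs(f_th, f_td, f_hd)

if __name__ == "__main__":
    print(f"Fst(T,H)={f_th:.3f}  Fst(T,D)={f_td:.3f}  Fst(H,D)={f_hd:.3f}")
    print(f"PBS for the Tibetan branch: {pbs_tibetan:.3f}")
```

Nearly all of the divergence lands on the Tibetan branch, because Han and Danish frequencies are close to each other while both differ strongly from the Tibetan frequency.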

SLIDE 15

EPAS1

  • a type of hypoxia-inducible factor
  • active under low oxygen
  • a variant of the gene confers increased athletic performance, called the "super athlete gene".

SLIDE 16

Genotyping in 366 individuals

Independent genotyping

  • 366 Tibetans
  • Genotyped for the EPAS1 SNP
  • Phenotypes available

Associations within the Tibetan population

                           CC     CG      GG      p-value
N                          10     84      272
Hemoglobin concentration   178    178.9   167.5   0.0013
Erythrocyte counts         5.3    5.6     5.2     0.0015
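As a small consistency check, the genotype counts in the table (10 + 84 + 272 = 366) can be converted to allele frequencies by counting allele copies. The sketch below is only that arithmetic, with a hypothetical helper name.

```python
# Genotype counts from the independent genotyping of 366 Tibetans.
genotype_counts = {"CC": 10, "CG": 84, "GG": 272}


def allele_frequency(counts, allele):
    """Frequency of `allele`, counting copies across diploid genotypes."""
    n_chromosomes = 2 * sum(counts.values())
    copies = sum(genotype.count(allele) * n for genotype, n in counts.items())
    return copies / n_chromosomes


freq_g = allele_frequency(genotype_counts, "G")
freq_c = allele_frequency(genotype_counts, "C")

if __name__ == "__main__":
    print(f"G: {freq_g:.3f}  C: {freq_c:.3f}")
```

The G frequency obtained this way is close to the 0.87 estimated for Tibetans from the exome data.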

SLIDE 17

Is this extreme compared to other populations?

SLIDE 18

Other genes with large FST

SLIDE 19

EPAS1

SLIDE 20

Haplotype is extremely different

SLIDE 21

How did they adapt so fast?

SLIDE 22

Adaptive introgression

SLIDE 23

Conclusion

  • Tibetans have adapted to life at high altitude
  • A locus, EPAS1, was found that has undergone strong adaptive selection
  • The locus is associated with hemoglobin concentration and erythrocyte counts
  • The mutations were introduced by Denisovan introgression
  • First (and only) example of adaptive introgression in humans
SLIDE 24

Human adaptation to the Arctic environment

SLIDE 25

Brief overview of Greenland’s history

  • Inhabited on and off by different Arctic cultures for ∼4500 years
  • Visited by Vikings, Danish colony from 1814, now an autonomous country

SLIDE 26

The modern Greenlandic population

  • Small: N≃57,000
  • Live in coastal towns
  • Descendants of Inuit
  • But most also have European ancestry
  • On average ∼25%

From Moltke et al. 2014, AJHG

SLIDE 27

A mutation causes 15% of type 2 diabetes in Greenland [1]

Very large, almost recessive effect
Recessive model: 2-h glucose: 3.8 mmol/l; T2D: OR = 10.3
Heritability: the variant explains 15% of all T2D in Greenland

[1] Moltke et al., Nature, 2014

SLIDE 28

Life in the Arctic is extreme: cold temperatures & fat-rich diet

SLIDE 29

Allele frequencies and population size

SLIDE 30

Frequency spectrum of Inuit

SLIDE 31

2D SFS between GL and Han

SLIDE 32

Split time from East Asia

Analyses of the exome data using ∂a∂i

Tree based on Fst

SLIDE 33

Recent changes in population size

SLIDE 34

Selection scan using PBS - ((HAN, GR) CEU)

SLIDE 35

Top loci

FADS

  • fatty acid desaturases

TBX15

  • TBX15 plays an important role in the differentiation of brown (subcutaneous) adipocytes.
  • Upon stimulation by cold exposure, these cells can produce heat by lipid oxidation.

FN3KRP

  • an enzyme that acts on fructosamines, psicosamines and ribulosamines and protects against nonenzymatic glycation.
  • FN3KRP can act to counteract the negative fitness effects of a PUFA-rich diet.

SLIDE 36

Why selection?

  • Tested for association between top SNPs and metabolic traits
  • Marginally significant associations with multiple traits, including LDL

  • Selected alleles associated with decreased weight and height:
SLIDE 37

Why selection?

  • The association with height replicates in Europe:

[Figure: effect size (SD) by cohort: ADDITION (N = 0), SDC (N = 1306), Inter99 (N = 6116)]
SLIDE 38

Why selection? Take 2

  • Testing for association with red blood cell membrane fatty acid composition:
  • Mutation seems to compensate for high-fat diet
  • Height due to effect of fatty acid composition on growth hormone levels?
  • Either way, the results suggest that selection in this region is a new example of human adaptation where we know the genetic basis

SLIDE 39

Conclusion

  • We find multiple loci with recent adaptation to life in the Arctic
  • As expected, the genes are involved in polyunsaturated fatty acid metabolism and cold adaptation
  • Surprisingly, the loci also affect height and weight
  • Mutations also have an effect in Europe
SLIDE 40

How are the SFS estimated?

Can we construct the SFS using NGS data? Yes, but be careful.

SLIDE 41

When can calling SNPs and genotypes be a problem?

low/medium depth data

  • Capture data
  • low depth sequencing due to price
  • ancient DNA (only a finite amount of DNA)

What depth is high enough? Depends on the analysis

  • SFS is extremely sensitive to both genotype and SNP calling
  • admixture proportions are sensitive to genotype calling
  • ABBA-BABA (D-stats) can be used regardless of depth
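To illustrate why genotype calling is fragile at low depth, here is a minimal sketch of genotype likelihoods at a biallelic site. It assumes independent reads, a symmetric per-base error rate `err`, and a flat prior over the three genotypes; these are simplifying assumptions, not the model of any particular caller.

```python
from math import comb


def genotype_likelihoods(n_ref, n_alt, err=0.01):
    """P(reads | G) for G = 0, 1, 2 copies of the alternative allele.

    Assumptions: reads are independent and an error flips a read
    to the other allele with probability `err`.
    """
    # Probability that a single read shows the alternative allele, per genotype.
    p_alt_given_g = (err, 0.5, 1.0 - err)
    depth = n_ref + n_alt
    return [comb(depth, n_alt) * p ** n_alt * (1.0 - p) ** n_ref
            for p in p_alt_given_g]


def normalise(likelihoods):
    """Rescale likelihoods to sum to one (posterior under a flat prior)."""
    total = sum(likelihoods)
    return [x / total for x in likelihoods]


# Two reference reads versus twenty reference reads, no alternative reads.
low_depth = normalise(genotype_likelihoods(n_ref=2, n_alt=0))
high_depth = normalise(genotype_likelihoods(n_ref=20, n_alt=0))

if __name__ == "__main__":
    print("depth 2 :", [round(x, 4) for x in low_depth])
    print("depth 20:", [round(x, 6) for x in high_depth])
```

With only two reads, a heterozygote remains quite plausible even though no alternative read was seen, so a hard genotype call discards real uncertainty. At depth 20 the same observation leaves almost no support for a heterozygote.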
SLIDE 42

Estimating SFS using uncertainty of the data

Likelihood of the SFS for a single site:

$$P(X^s \mid \eta) = \sum_{j=0}^{2N} p(X^s \mid \eta, J = j)\, p(J = j \mid \eta) \propto \sum_{j=0}^{2N} \eta_j \sum_{g \in \{0,1,2\}^N} p(G = g \mid J = j) \prod_{i=1}^{N} P(X_i^s \mid G_i = g_i),$$

where

$$p(G = g \mid J = j) = \binom{2N}{j}^{-1} 2^{\sum_{i=1}^{N} I_1(g_i)} \quad \text{when } \sum_{i=1}^{N} g_i = j, \text{ else } 0.$$

SFS for a region:

$$P(X \mid \eta) = \prod_{s=1}^{r} P(X^s \mid \eta)$$

Fast calculations with dynamic programming and EM [2]

[2] Nielsen et al., SNP Calling, Genotype Calling, and Sample Allele Frequency Estimation from New-Generation Sequencing Data
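The dynamic programming and EM steps can be sketched as below. This is an illustrative reading of the likelihood on this slide, not the implementation from the cited paper or any tool: the function names are hypothetical, genotype likelihoods are passed in as an N x 3 array per site, and no log-scaling is used, so it is only suitable for small N.

```python
from math import comb

import numpy as np


def site_allele_count_likelihoods(gl):
    """P(X^s | J = j) for j = 0..2N from genotype likelihoods.

    gl: array of shape (N, 3) with P(X_i | G_i = g) for g = 0, 1, 2.
    The recursion adds one individual at a time; the factor 2 for
    heterozygotes is the 2^{I_1(g_i)} term.
    """
    gl = np.asarray(gl, dtype=float)
    n = gl.shape[0]
    h = np.array([1.0])
    for l0, l1, l2 in gl:
        new = np.zeros(len(h) + 2)
        new[:-2] += h * l0         # individual carries 0 derived alleles
        new[1:-1] += h * 2.0 * l1  # heterozygote
        new[2:] += h * l2          # individual carries 2 derived alleles
        h = new
    binom = np.array([comb(2 * n, j) for j in range(2 * n + 1)], dtype=float)
    return h / binom


def em_sfs(site_likelihoods, n_iter=100):
    """EM estimate of eta from an (r, 2N+1) matrix of P(X^s | J = j)."""
    site_likelihoods = np.asarray(site_likelihoods, dtype=float)
    k = site_likelihoods.shape[1]
    eta = np.full(k, 1.0 / k)  # flat starting point (an arbitrary choice)
    for _ in range(n_iter):
        post = site_likelihoods * eta            # E-step: P(J = j | X^s, eta)
        post /= post.sum(axis=1, keepdims=True)
        eta = post.mean(axis=0)                  # M-step: average over sites
    return eta


if __name__ == "__main__":
    # Two individuals, four sites with perfectly known genotypes (toy data).
    one_hot = np.eye(3)
    sites = [[0, 0], [1, 0], [2, 1], [0, 0]]
    liks = np.array([site_allele_count_likelihoods(one_hot[g]) for g in sites])
    print(em_sfs(liks))
```

With perfectly known genotypes the estimate reduces to the observed proportion of sites in each allele-count class. With real low-depth data each site contributes fractionally to several classes.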

SLIDE 43

SFS based on genotype likelihoods

  • Can be estimated even with low(ish) depth, e.g. 2X
  • Must be done with genotype likelihoods unless depth is high (>10X)
  • Can be done in any dimension

1D: thetas, e.g. Tajima's π, Tajima's D, population sizes
2D: Fst and PBS
>2D: useful for demographic inference
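As an example of the 1D case, the sketch below computes Watterson's theta, Tajima's π and Tajima's D from an unfolded SFS given as counts of sites with 1 to n−1 derived alleles among n chromosomes. It uses the standard textbook formulas; the function name and the input layout are assumptions made for illustration.

```python
import numpy as np


def sfs_summaries(sfs):
    """Return (theta_W, pi, Tajima's D) from an unfolded 1D SFS.

    sfs[i-1] is the number of sites where the derived allele is seen
    i times, for i = 1..n-1 (fixed classes excluded).
    """
    sfs = np.asarray(sfs, dtype=float)
    n = len(sfs) + 1                 # number of sampled chromosomes
    i = np.arange(1, n)
    s = sfs.sum()                    # number of segregating sites
    a1 = (1.0 / i).sum()
    a2 = (1.0 / i ** 2).sum()
    theta_w = s / a1
    pi = (i * (n - i) * sfs).sum() / (n * (n - 1) / 2.0)
    # Constants for the variance of (pi - theta_W), following Tajima (1989).
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    tajima_d = (pi - theta_w) / np.sqrt(e1 * s + e2 * s * (s - 1))
    return theta_w, pi, tajima_d


if __name__ == "__main__":
    n = 10
    neutral = 100.0 / np.arange(1, n)          # expected shape under neutrality
    singletons = np.array([50.0] + [0.0] * 8)  # excess of rare alleles
    print(sfs_summaries(neutral))
    print(sfs_summaries(singletons))
```

An SFS proportional to 1/i gives π equal to Watterson's theta and D near zero, while an excess of singletons gives a negative D.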

SLIDE 44

Time for exercises

Data from 1000 Genomes

  • 2500 individuals sequenced at low/medium depth (3-8X)
  • multiple populations

Human genomes

  • 3Gb
  • BAM file size 5Gb per X

Reduced genome

  • 22 regions of 100 kb (one for each autosome)
  • 1 Mb region on chr5
  • 3 x 10 individuals from African (YRI), European (CEU) and East Asian (JPT) populations
SLIDE 45

Site frequency spectrum for low/medium depth data

SLIDE 46

Selection scan using empirical Bayes

LCT locus in Europeans based on 3X data

[Figure: Tajima's D on chr2 in 100 kb windows. x-axis: position (Mb), about 120 to 130; y-axis: Tajima's D, about −2 to 2; series: EB, p1e−6mLike, p1e−3mLike, p1e−6HWE, p1e−3HWE]

SLIDE 47

Site frequency spectrum for low/medium depth data

SLIDE 48

There are no possible filters that can solve the problem