Topic outline - Quick look to the pioneers: HapMap - 1000 Genomes


Carolina Medina Gomez PhD. SNPs and Diseases, Molecular School of Medicine, Thursday, November 16th, 2017.


SLIDE 1

SNPs and Diseases
Molecular School of Medicine
Thursday, November 16th, 2017

Carolina Medina Gomez PhD

SLIDE 2
Topic outline

  • Quick look to the pioneers: HapMap
  • 1000 Genomes project
  • Description
  • Diversity Panel
  • The HRC consortium
  • Local Panels

Learning Aims

  • Acquire awareness of the implications of population diversity
  • Comprehend the utility of large haplotype reference panels and large biobank data
  • Use this knowledge for the mapping of complex traits

SLIDE 3

The HapMap Project

AIM: Perform a comprehensive sampling of common genetic variation that may form the basis of phenotypic differences in humans

Populations: YRI, CEU, CHB+JPT

A second generation human haplotype map of over 3.1 million SNPs. 2007, Nature 449: 851-861.

SLIDE 4

The HapMap Project

YRI CEU CHB+JPT

HapMap II r 22: Build 36 - 2007

270 samples

SLIDE 5

HapMap III r22 Build 36 - 2010

1,184 Samples – DEPICT, LDSC

Name  # of samples  Population
ASW   53            African ancestry in Southwest USA
CEU   112           Utah residents with Northern and Western European ancestry from the CEPH collection
CHB   137           Han Chinese in Beijing, China
CHD   109           Chinese in Metropolitan Denver, Colorado
GIH   101           Gujarati Indians in Houston, Texas
JPT   113           Japanese in Tokyo, Japan
LWK   110           Luhya in Webuye, Kenya
MEX   58            Mexican ancestry in Los Angeles, California
MKK   156           Maasai in Kinyawa, Kenya
TSI   102           Toscani in Italia
YRI   147           Yoruba in Ibadan, Nigeria

Integrating common and rare genetic variation in diverse populations. 2010, Nature 467: 52-58.

SLIDE 6

The 1000 Genomes project – Build 37

Phase 1 – 2010

AIM: Catalogue human genetic variation by sequencing the whole genomes of 1,092 individuals from 14 worldwide populations. Discover human genetic variation of all types (95% of variation > 1% frequency) at the population level.

The 1000 Genomes Project Consortium. 2010, Nature 467: 1061-1073.

SLIDE 7

The 1000 Genomes project – Build 37

Phase 3 – 2015

AIM: Catalogue human genetic variation by sequencing the whole genomes of 2,504 individuals from 26 worldwide populations. Discover human genetic variation of all types (99% of variation > 1% frequency) at the population level.

A global reference for human genetic variation. 2015, Nature 526: 68-.

SLIDE 8

The 1000 Genomes project – Build 37

Phase 3 – 2014

AIM: Catalogue human genetic variation by sequencing the whole genomes of 2,504 individuals from 26 worldwide populations. Discover human genetic variation of all types (99% of variation > 1% frequency) at the population level.

A global reference for human genetic variation. 2015, Nature 526: 68-.

SLIDE 9

The American Journal of Human Genetics 96, 37–53, January 8, 2015

SLIDE 10

Phased design

Generation R imputation

Panel    Year   SNPs imputed   SNPs with MAF>0.01 and r2>=0.3
HapMap   2012   3,021,329      2,671,724
1KG      2017   47,072,644     11,361,791
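
The MAF>0.01 and r2>=0.3 cut-offs quoted for the Generation R imputations can be sketched as a simple per-variant filter. This is a minimal illustration, not the pipeline actually used: the `ImputedVariant` class, the variant names and their values are all made up, and only the two thresholds and the four SNP counts come from the slide.

```python
from dataclasses import dataclass


@dataclass
class ImputedVariant:
    variant_id: str  # hypothetical identifier, for illustration only
    maf: float       # minor allele frequency in the study sample
    r2: float        # imputation quality (estimated r2)


def passes_filter(v: ImputedVariant, min_maf: float = 0.01, min_r2: float = 0.3) -> bool:
    """Apply the cut-offs quoted on the slide: MAF > 0.01 and r2 >= 0.3."""
    return v.maf > min_maf and v.r2 >= min_r2


# Made-up variants; none of these values come from Generation R.
variants = [
    ImputedVariant("var_common_good", maf=0.25, r2=0.98),
    ImputedVariant("var_rare", maf=0.004, r2=0.85),
    ImputedVariant("var_poorly_imputed", maf=0.12, r2=0.21),
    ImputedVariant("var_borderline", maf=0.011, r2=0.30),
]
kept = [v.variant_id for v in variants if passes_filter(v)]

# Fraction of imputed SNPs that survive the filter, using the slide's counts.
hapmap_fraction = 2_671_724 / 3_021_329
kg_fraction = 11_361_791 / 47_072_644

print(kept)
print(f"HapMap: {hapmap_fraction:.1%} retained; 1KG: {kg_fraction:.1%} retained")
```

With the slide's counts, roughly 88% of the HapMap-imputed SNPs pass the filter against roughly 24% of the 1KG-imputed SNPs. In absolute numbers the 1KG imputation still yields about four times as many usable variants.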

SLIDE 11

Neither of the two variants was present in, or tagged by, HapMap variants (one is common)

SLIDE 12
SLIDE 13
SLIDE 14

Imputation Servers

SLIDE 15

The Haplotype Reference Consortium – Build 37

Phase 1 – 2016

AIM: To bring together as many whole-genome sequencing data sets as possible. This reference panel consists of 64,976 haplotypes at 39,235,157 SNPs.

A reference panel of 64,976 haplotypes for genotype imputation. Nature Genetics 48(10).

SLIDE 16

The Haplotype Reference Consortium – Build 37

Phase 1 – 2016

AIM: To bring together as many whole-genome sequencing data sets as possible. This reference panel consists of 64,976 haplotypes at 39,235,157 SNPs.

A reference panel of 64,976 haplotypes for genotype imputation. Nature Genetics 48(10).

34% increase r2 at 0.1% r2 at ~0.4%
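
To get a feel for why panel size matters at low allele frequencies, the sketch below counts how many copies of a minor allele a panel is expected to carry at a given MAF. The haplotype count for the HRC and the 2,504 individuals for 1000 Genomes Phase 3 come from the slides. The function name is hypothetical, and two simplifications are assumptions of this sketch: every individual is treated as contributing two haplotypes, and the allele frequency is taken to be the same in both panels.

```python
def expected_minor_allele_copies(n_haplotypes: int, maf: float) -> float:
    """Expected number of haplotypes carrying the minor allele in a panel."""
    return n_haplotypes * maf


HRC_HAPLOTYPES = 64_976          # from the HRC slide
KG_PHASE3_INDIVIDUALS = 2_504    # from the 1000 Genomes Phase 3 slide

hrc_individuals = HRC_HAPLOTYPES // 2        # two haplotypes per individual
kg_haplotypes = 2 * KG_PHASE3_INDIVIDUALS

maf = 0.001  # MAF of 0.1%
hrc_copies = expected_minor_allele_copies(HRC_HAPLOTYPES, maf)
kg_copies = expected_minor_allele_copies(kg_haplotypes, maf)

print(f"HRC: {hrc_individuals} individuals, about {hrc_copies:.0f} copies at MAF 0.1%")
print(f"1000G Phase 3: {kg_haplotypes} haplotypes, about {kg_copies:.0f} copies at MAF 0.1%")
```

A variant at 0.1% frequency is expected on about 65 haplotypes in a panel the size of the HRC, against about 5 in a panel the size of 1000 Genomes Phase 3, so far more haplotype context is available for imputing it.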

SLIDE 17
SLIDE 18
SLIDE 19

http://www.nealelab.is/blog/2017/7/19/rapid-gwas-of-thousands-of-phenotypes-for-337000-samples-in-the-uk-biobank

SLIDE 20

The Million Veteran Program began collecting data in 2011, and it has the goal of reaching 1 million participants by 2020 or 2021.

Now… imagine if we combine data

Predictions for next GIANT freeze: 1.5 million

SLIDE 21

Analytical Issues!

SLIDE 22

Current challenges

  • Perception challenge: Are we using the correct multiple-testing threshold, or should we change it as we include more rare variants that constitute independent tests (low LD)?
  • Methodological challenge: Is it necessary to correct further for population stratification (e.g., by implementing mixed models) to avoid false-positive signals?
  • Computational challenge: Can we store and analyze the data with our current computational power?
  • Follow-up challenge: Can we correctly identify variants/genes for follow-up studies?
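
One common diagnostic for the stratification question is the genomic inflation factor (lambda GC): the median observed association chi-square statistic divided by its expected median under the null. The slide does not name this diagnostic, so its use here is an assumption. The sketch below is a minimal illustration on simulated z-scores, with a hypothetical function name and an arbitrary seed and sample size.

```python
import random
from statistics import NormalDist, median


def genomic_inflation(z_scores):
    """Lambda GC: median observed chi-square (1 df) over its expected null median."""
    # Median of a 1-df chi-square equals the squared 75th percentile of N(0, 1), ~0.455.
    expected_median = NormalDist().inv_cdf(0.75) ** 2
    observed_median = median(z * z for z in z_scores)
    return observed_median / expected_median


# Simulated association z-scores; not real GWAS output.
rng = random.Random(2017)
null_z = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
# Mimic uniform inflation by multiplying every chi-square statistic by 1.2.
inflated_z = [z * 1.2 ** 0.5 for z in null_z]

lambda_null = genomic_inflation(null_z)
lambda_inflated = genomic_inflation(inflated_z)

print(f"lambda (null): {lambda_null:.3f}")
print(f"lambda (inflated): {lambda_inflated:.3f}")
```

A value close to 1 is consistent with well-calibrated test statistics. Values clearly above 1 can reflect stratification or relatedness, but in very large samples also genuine polygenic signal, so lambda alone does not settle the question raised on the slide.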

SLIDE 23

The discovery of genetic variants associated with a trait or disease is determined by different parameters
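
The slide text does not list the parameters, so the following is only a sketch under stated assumptions: a quantitative trait scaled to unit variance, an additive per-allele effect `beta` in standard-deviation units, a two-sided test at significance level `alpha`, and a small-effect approximation in which the non-centrality parameter is N x 2 x MAF x (1 - MAF) x beta^2. The function name and the example values are hypothetical.

```python
from statistics import NormalDist


def gwas_power(n: int, maf: float, beta: float, alpha: float = 5e-8) -> float:
    """Approximate power of a two-sided single-variant test for an additive effect."""
    std_normal = NormalDist()
    # Share of trait variance explained by the variant (trait variance assumed to be 1).
    variance_explained = 2.0 * maf * (1.0 - maf) * beta ** 2
    # Square root of the non-centrality parameter, i.e. the expected z-score.
    expected_z = (n * variance_explained) ** 0.5
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    upper_tail = 1.0 - std_normal.cdf(z_crit - expected_z)
    lower_tail = std_normal.cdf(-z_crit - expected_z)
    return upper_tail + lower_tail


# Same rare variant (MAF 0.1%, effect 0.2 SD per allele) at two sample sizes.
p_small = gwas_power(n=50_000, maf=0.001, beta=0.2)
p_large = gwas_power(n=500_000, maf=0.001, beta=0.2)

print(f"power at N=50,000:  {p_small:.4f}")
print(f"power at N=500,000: {p_large:.4f}")
```

Under these assumptions, sample size, allele frequency, effect size and the significance threshold jointly set the power. A rare variant that is almost undetectable in tens of thousands of samples becomes detectable in a biobank-scale sample.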
SLIDE 24

  • We are surpassing the 1M barrier
  • New imputation panels allow the exploration of variants with MAF ~0.1%
  • More variants, more opportunities to tag the causal variant

The new GWAS era is a treasure trove for making new fundamental discoveries in human genetics.

10 Years of GWAS Discovery: Biology, Function, and Translation. AJHG, July 2017.

SLIDE 25

Resolution magnification

P <= 6.6x10^-9

Identification of 153 new loci associated with heel bone mineral density and functional involvement of GPC6 in osteoporosis. Nature Genetics.
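
To relate a threshold such as P <= 6.6x10^-9 to the conventional 5x10^-8, it can help to read both as Bonferroni-style corrections. The slide does not say how the stricter threshold was derived, so the Bonferroni reading and the family-wise alpha of 0.05 are assumptions of this sketch, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: float) -> float:
    """Per-test significance threshold for a given number of independent tests."""
    return alpha / n_tests


def implied_independent_tests(alpha: float, p_threshold: float) -> float:
    """Number of independent tests a given per-test threshold would correspond to."""
    return alpha / p_threshold


# The conventional genome-wide threshold corresponds to ~1 million independent tests.
conventional = bonferroni_threshold(alpha=0.05, n_tests=1_000_000)

# Reading the slide's threshold the same way (assumption: alpha = 0.05).
implied_tests = implied_independent_tests(alpha=0.05, p_threshold=6.6e-9)

print(f"conventional threshold: {conventional:.1e}")
print(f"independent tests implied by 6.6e-9: {implied_tests:,.0f}")
```

Read this way, the stricter threshold would correspond to several million independent tests rather than one million, which is the direction expected once many low-frequency variants in weak LD are tested.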

SLIDE 26

Original 2x2 scenario of genetic architecture needs to be redefined under the scope of the 1000G and other projects

SLIDE 27

Debatable Facts

SLIDE 28
Take home messages

  • Understanding of human genome diversity is key for the design of genetic studies
  • Large and more comprehensive panels provide the best performance and yield in terms of quality and MAF coverage, resulting in greater power (even more so in combination with MegaBiobanks!)
  • Most of the novel variants to be discovered are of “low to rare” allele frequency, highly population specific & enriched for functional aspects
  • Upscaling of technology, either through interfacing with -omic data or through experimental perturbations, is necessary for making new fundamental discoveries in human genetics

SLIDE 29
SLIDE 30

10 Years of GWAS Discovery: Biology, Function, and Translation. AJHG, July 2017.