Statistics and analytical issues: SNP and GWAS

Linda Broer (l.broer@erasmusmc.nl), Genetic Laboratory, Department of Internal Medicine, Erasmus MC, Rotterdam

Overview

  • Genetic architecture of a trait
  • Arrays
  • GWAS
  • Ethnicity
  • Effect size / power
  • Consortia

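Types of genetic studies

[Figure: genetic architecture of traits, plotting the frequency of the genetic variant (rare to common) against its effect size (small to big)]

  • Rare variant, big effect: rare, monogenic traits (linkage)
  • Common variant, small effect: common, complex traits (association)
  • Rare variant, small effect: probably real (impossible to identify with current methods)
  • Common variant, big effect: few examples

Modified from McCarthy et al., Nat Rev Genet 2008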

Genome-wide linkage study

  • Assumption: trait is determined by rare variants with large effect
  • Hypothesis free
  • Resolution is poor (5-20 million base pairs)
  • Works well for monogenic traits
  • Need to know or estimate the model of inheritance!
  • Common traits / complex diseases? Not effective
  • Example 1: hemophilia in European royalty

Candidate gene approach

  • Assumption: trait is determined by common variants with small effect
  • Hypothesis driven
  • Based on prior (biological) knowledge
  • Association analysis of few variants
  • Excellent resolution (1 bp)
  • Often results in false-positive or false-negative findings. Why?

Example of false-positive candidate gene study

  • In model organisms, heat shock proteins (HSPs) are the most important pathway determining longevity after IGF1
  • In centenarians, an association between HSPs and longevity has been shown
  • In genetics …

Genome-wide approach

  • Scale-up of the candidate gene approach to genome-wide
  • Hypothesis-free approach
  • Resolution: 5-50 thousand base pairs
  • Very effective

Which genotyping technique to use?

Array-technology for genotyping SNPs

  • Created for genotyping many SNPs (> 0.3 million)
  • Two major companies: Illumina & Affymetrix
  • Illumina: tagSNP optimized
  • Affymetrix: population-specific arrays
  • Primarily used for genome-wide testing (GWAS)
  • But also for: pharmacogenetics, clinical research, linkage analysis

Array-technology

  • Illumina bead-array
  • Each bead has probes for one SNP attached
  • Each bead is spotted in multifold to increase accuracy and redundancy

[Figure: a bead carrying an address sequence (23 bp) and a probe (50 bp)]

Procedure

  • DNA normalization (200 ng DNA) and whole genome amplification

[Figure: DNA pellet after whole genome amplification]

Procedure

  • Hybridization on array
  • Single base extension (SBE): 1 base added to the probe

[Figure: fragmented gDNA hybridized to the probe on a bead (address + probe) at the SNP position; a labelled ddNTP (T-DNP in the figure) is added]

Procedure

  • DNA collection on array
  • Every dot represents a SNP
  • Colors: red and green are homozygous, yellow is heterozygous
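The colour coding above can be illustrated with a toy genotype call from two fluorescence channels. This is only a sketch under stated assumptions: the function name `call_genotype`, the labels A (red channel) and B (green channel), and the thresholds are all hypothetical, and real array software calls genotypes by clustering intensities across many samples rather than with fixed cut-offs.

```python
def call_genotype(red, green, low=0.2, high=0.8, min_signal=0.1):
    """Toy genotype call from two channel intensities (all thresholds assumed).

    Mostly red   -> homozygous for allele A ("AA")
    Mostly green -> homozygous for allele B ("BB")
    Mixed signal -> heterozygous ("AB"), which shows up as yellow
    """
    total = red + green
    if total < min_signal:
        # Too little signal in both channels to make any call.
        return "no call"
    green_fraction = green / total
    if green_fraction <= low:
        return "AA"
    if green_fraction >= high:
        return "BB"
    return "AB"


# Hypothetical intensities for four dots on the array.
calls = [
    call_genotype(1.00, 0.05),  # red dot
    call_genotype(0.05, 1.00),  # green dot
    call_genotype(0.50, 0.50),  # yellow dot
    call_genotype(0.01, 0.02),  # failed dot
]
```

A fixed threshold is used here only because it keeps the idea visible in a few lines; it says nothing about how a specific vendor's caller works.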

GWAS analysis

  • Analyzing all SNPs in 1 run
  • Visualizing results in plots: Manhattan plot, each dot represents 1 SNP
  • Combine GWASs: meta-analysis of all data
  • Select SNPs
  • Replication
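As a rough illustration of what a Manhattan plot draws, the sketch below converts per-SNP p-values into plot heights (-log10 of the p-value) and flags SNPs below the genome-wide significance threshold of 5 x 10^-8 used in this deck. The function name `manhattan_points` and the toy results are hypothetical, not output of any real study.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # threshold line drawn on the Manhattan plots


def manhattan_points(results):
    """Turn (chromosome, position, p_value) records into plot points.

    Returns one (chromosome, position, -log10(p), is_significant) tuple
    per SNP, ordered along the genome as on the x-axis of the plot.
    """
    points = []
    for chromosome, position, p_value in results:
        height = -math.log10(p_value)  # small p-values become tall dots
        points.append((chromosome, position, height, p_value < GENOME_WIDE_ALPHA))
    return sorted(points, key=lambda point: (point[0], point[1]))


# Hypothetical association results: (chromosome, base-pair position, p-value).
toy_results = [
    (1, 1_500_000, 0.20),
    (11, 68_000_000, 3e-9),
    (15, 28_000_000, 1e-30),
    (2, 45_000_000, 1e-4),
]
points = manhattan_points(toy_results)
hits = [(chromosome, position) for chromosome, position, _, significant in points if significant]
```

Plotting the third element of each tuple against genomic position, one dot per SNP, gives the skyline shape the plot is named after.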

Manhattan plot: “Holland” plot

[Figure: Manhattan plot for lumbar spine BMD; genome-wide significance threshold at 5 x 10^-8]

Rivadeneira et al., Nat Genet., 2009

Manhattan plot: “Dubai” plot

[Figure: Manhattan plot with a peak at the HERC2/OCA2 gene (12 kb on Chr. 15q11), P < 1 x 10^-206]

Rotterdam Study: Kayser et al, Am J Hum Genet, 2008

Manhattan plot: true “Manhattan” plot

[Figure: Manhattan plot; genome-wide significance threshold at 5 x 10^-8]

Lango, Estrada, Rivadeneira et al., Nature, 2010

  • 180 loci identified
  • 10-15% variance explained

GWAS catalog (https://www.ebi.ac.uk/gwas/)

  • Online collection of all published GWAS
  • Quality controlled
  • Manually curated
  • Literature-derived
  • Regularly updated
  • Currently contains: 3,172 publications and 52,491 unique SNP-trait associations

[Figures: GWAS on cardiovascular traits; GWAS on cancer]

Out-of-Africa

  • Not all variants got to travel: bottleneck event
  • Africans have more variants than Europeans/Asians
  • ‘Unique’ variants appeared in those that left Africa
  • Adaptation to new environment
  • Some of these came from already existing hominids outside Africa
  • Frequencies of variants can differ between ethnic groups

Example: rs776746

  • SNP in the CYP3A5 gene, which encodes an enzyme that metabolizes clinical drugs
  • The G allele defines the CYP3A5*3 allele
  • CYP3A5*3 inactivates the gene

Side note: humans are not the only species with a bottleneck event

  • Cheetahs: 2 bottleneck events (10,000 years ago and in the last 100 years); cheetahs are genetically so similar that they are close to identical twins
  • Elephant seals: only 20-50 individuals left in 1890
  • Florida panthers: isolated from other cougars; only 30-50 individuals left in 1980; many recessive diseases present in the population

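Consequences for study design

  • Example: cases have sickle cell anemia, controls are of European ancestry
  • What will you find? Multiple variants across the genome show evidence of association
  • Most cases are of African ancestry
  • All controls are of European ancestry
  • The associations therefore reflect the ancestry difference between cases and controls, not the disease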
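The sickle cell example can be made concrete with a small calculation. The sketch below takes a SNP that has no effect on disease but whose allele frequency differs between two ancestry groups, and compares an allelic chi-square test when cases and controls come from different groups versus the same group. The allele frequencies (0.60 and 0.10), the sample sizes, and the function names are assumptions chosen for illustration only.

```python
from statistics import NormalDist


def allelic_chi_square(case_alt, case_total, control_alt, control_total):
    """Pearson chi-square (1 df) on a 2x2 table of allele counts."""
    table = [
        [case_alt, case_total - case_alt],
        [control_alt, control_total - control_alt],
    ]
    row_totals = [case_total, control_total]
    column_totals = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    grand_total = case_total + control_total
    statistic = 0.0
    for row in range(2):
        for column in range(2):
            expected = row_totals[row] * column_totals[column] / grand_total
            statistic += (table[row][column] - expected) ** 2 / expected
    return statistic


def expected_alt_alleles(n_people, allele_frequency):
    """Expected count of the alternative allele (two alleles per person)."""
    return round(2 * n_people * allele_frequency)


# Assumed allele frequencies of a disease-unrelated SNP in two ancestry groups.
FREQ_GROUP_A, FREQ_GROUP_B = 0.60, 0.10
N_CASES = N_CONTROLS = 1000

# Chi-square value (1 df) that corresponds to p = 5 x 10^-8.
critical_value = NormalDist().inv_cdf(1 - 5e-8 / 2) ** 2

# Cases all from group A, controls all from group B: ancestry is confounded.
confounded = allelic_chi_square(
    expected_alt_alleles(N_CASES, FREQ_GROUP_A), 2 * N_CASES,
    expected_alt_alleles(N_CONTROLS, FREQ_GROUP_B), 2 * N_CONTROLS,
)

# Cases and controls both from group A: ancestry is matched.
matched = allelic_chi_square(
    expected_alt_alleles(N_CASES, FREQ_GROUP_A), 2 * N_CASES,
    expected_alt_alleles(N_CONTROLS, FREQ_GROUP_A), 2 * N_CONTROLS,
)
```

With mismatched ancestry the statistic far exceeds the genome-wide critical value even though the SNP does nothing; with matched ancestry it drops to zero. Expected counts are used instead of random sampling so the outcome does not depend on a random seed.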

Power is an issue in GWAS

                                TRUTH
GWA Study                       H0: No Association     HA: Association
Accept H0 (No Association)      OK                     Beta (β) error
Reject H0 (Association)         Alpha (α) error        OK

Power (1-β) of a GWA study will depend on:

FIXED FACTORS

  • Allele frequency
  • Effect size
  • Linkage disequilibrium

MODIFIABLE FACTORS

  • Phenotype definition
  • Alpha level
  • Sample size

Effect size and frequency are important to consider

[Figure: power with 1000 cases / 1000 controls for OR=2, OR=1.5, OR=1.3 and OR=1.2]
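To give a feel for how effect size and allele frequency drive power, here is a minimal sketch for the 1000 cases / 1000 controls setting. It assumes a simple allelic test (a two-proportion z-test on allele counts, normal approximation) at alpha = 5 x 10^-8; this is a swapped-in textbook approximation, not necessarily the method behind the original figure, and `allelic_power` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def allelic_power(odds_ratio, control_freq, n_cases, n_controls, alpha=5e-8):
    """Approximate power of a two-sided allelic test (normal approximation)."""
    # Allele frequency in cases implied by the odds ratio.
    odds = odds_ratio * control_freq / (1 - control_freq)
    case_freq = odds / (1 + odds)

    # Each person contributes two alleles.
    case_alleles, control_alleles = 2 * n_cases, 2 * n_controls
    standard_error = sqrt(
        case_freq * (1 - case_freq) / case_alleles
        + control_freq * (1 - control_freq) / control_alleles
    )
    z_effect = abs(case_freq - control_freq) / standard_error
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Probability of crossing the threshold (the far tail is ignored).
    return NormalDist().cdf(z_effect - z_critical)


# Power for the odds ratios on the slide, at two assumed allele frequencies.
power_table = {
    (odds_ratio, freq): allelic_power(odds_ratio, freq, 1000, 1000)
    for odds_ratio in (2.0, 1.5, 1.3, 1.2)
    for freq in (0.05, 0.30)
}
```

Under these assumptions, power falls steeply as the odds ratio shrinks and is lower for the rarer allele at every odds ratio, which is the pattern the slide title points to.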

Sample size

  • Sample size needed to detect associations is >>20,000
  • Preferably even over 100,000 samples
  • Most study populations don’t have this many samples
  • Rotterdam Study: ~15,000 samples
  • Exceptions: UK Biobank (~500,000 samples), 23andMe (~200,000 samples and growing)
  • Working together with others is the only solution
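The order of magnitude of these sample sizes can be checked with a back-of-the-envelope calculation. The sketch below inverts the normal approximation for a two-sided allelic test to get the number of cases (with equally many controls) needed for 80% power at alpha = 5 x 10^-8. The function name `cases_needed`, the 80% power target, and the allele frequency of 0.30 are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist


def cases_needed(odds_ratio, control_freq, alpha=5e-8, power=0.80):
    """Approximate number of cases needed, assuming equally many controls."""
    # Allele frequency in cases implied by the odds ratio.
    odds = odds_ratio * control_freq / (1 - control_freq)
    case_freq = odds / (1 + odds)

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance_sum = case_freq * (1 - case_freq) + control_freq * (1 - control_freq)

    # Standard two-proportion sample size formula, counted in alleles per group.
    alleles_per_group = (z_alpha + z_power) ** 2 * variance_sum / (case_freq - control_freq) ** 2
    return ceil(alleles_per_group / 2)  # two alleles per person


# Required cases for a range of odds ratios at an assumed allele frequency of 0.30.
required = {odds_ratio: cases_needed(odds_ratio, 0.30) for odds_ratio in (2.0, 1.5, 1.3, 1.2, 1.1)}
```

Under these assumptions, an odds ratio of 1.1 already requires on the order of 20,000 cases plus as many controls, and rarer alleles or smaller effects push the number higher.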

Large consortia

  • Rotterdam Study
  • CHARGE
  • GEFOS (GEnetic Factors for OSteoporosis)
  • GIANT (Genetic Investigation of ANthropometric Traits)

Consortia: working together does work

  • Much larger sample sizes can be achieved
  • Go from competition to cooperation
  • Creates better science!

But…

  • Only ‘cosmopolitan’ variants found
  • Trying to set up a call with the US, Europe and Australia is impossible
  • Can slow things down as you are waiting for each other
  • Typical GWAS takes ~3-7 years
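One common way consortia combine cohorts without sharing individual-level data is a fixed-effect, inverse-variance weighted meta-analysis of per-cohort effect estimates. The sketch below shows that calculation for a single SNP; the five cohorts and their numbers are hypothetical, and the choice of this particular meta-analysis method is an assumption, not something stated in the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(z_score):
    """Two-sided p-value for a z-score under the standard normal."""
    return 2 * NormalDist().cdf(-abs(z_score))


def fixed_effect_meta(estimates):
    """Inverse-variance weighted meta-analysis.

    estimates: list of (beta, standard_error) for the same SNP and the
    same effect allele in each cohort.
    Returns (pooled_beta, pooled_standard_error, p_value).
    """
    weights = [1 / standard_error ** 2 for _, standard_error in estimates]
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    return pooled_beta, pooled_se, two_sided_p(pooled_beta / pooled_se)


# Hypothetical per-cohort results for one SNP: (beta, standard error).
cohort_estimates = [(0.05, 0.02)] * 5

single_cohort_p = two_sided_p(0.05 / 0.02)
pooled_beta, pooled_se, pooled_p = fixed_effect_meta(cohort_estimates)
```

In this toy case no single cohort comes close to 5 x 10^-8, but the pooled estimate crosses it, because the standard error shrinks as the cohorts' weights add up.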

[Figure series: Manhattan plots for lumbar spine BMD as the combined sample size grows, each with the genome-wide significance threshold at 5 x 10^-8. Cohorts listed: Rotterdam Study, ERF Study, Twins UK, deCODE Genetics, Framingham Study. Loci labelled at each sample size:]

  • N=5,000: no loci labelled
  • N=6,200: LRP5
  • N=8,500: LRP5
  • N=15,000: LRP5, OPG, MHC, C6orf10, 1p36, RANK-L
  • N=19,125: 1p36, C6orf10, LRP5, RANK-L, OPG, SP7

Rivadeneira et al., Nat Genet., 2009

The success of consortia

  • 2005: everyone doing their own thing
  • 2007: starting to work together
  • 2009: we’re getting somewhere
  • 2011: is anything not significant?
  • 2013: I’ve given up counting them
  • 2015: wow, it’s pretty ☺

What has GWAS achieved, and what will it achieve?

E. D. Green et al. Nature 470, 204-213 (2011) doi:10.1038/nature09764

In summary / Take Home Messages

  • Before doing genetic research, determine the genetic architecture of your trait and adjust your methodology accordingly
  • Arrays are quickly becoming so cheap that they are feasible for any study
  • GWAS is the workhorse of genetic epidemiology of complex traits
  • Allele frequencies (and trait variation) can differ between ethnicities
  • Sample size is the only truly adjustable determinant of power
  • Working together in consortia is not just a necessity, it pays off

Questions