Introduction to Complex Genetics: Genetic Association Studies


MolMed Course “Genetics for Dummies”, Rotterdam, 1 November 2017. Introduction to Complex Genetics: Genetic Association Studies. André G Uitterlinden, Genetic Laboratory, Department of Internal Medicine, Department of Epidemiology


slide-1
SLIDE 1

MolMed Course “Genetics for Dummies” Rotterdam, 1 November, 2017

Introduction to Complex Genetics:

Genetic Association Studies

André G Uitterlinden

Genetic Laboratory Department of Internal Medicine

Department of Epidemiology Department of Clinical Chemistry

www.glimdna.org

Professor Trifonius Zonnebloem Professor Cuthbert Calculus Professeur Tryphon Tournesol

our website…
slide-2
SLIDE 2

DNA polymorphisms

Definition: a DNA sequence variation that occurs in the population with a frequency of at least … 0.1%? 0.5%? 1%? 5%?

Genome frequency: ~1 in 100 base pairs is polymorphic … probably more, depending on the samples sequenced globally.

2016:

  • >150 million polymorphisms in the human genome with f > 0.1%
  • >50 million polymorphisms with f > 1%

slide-3
SLIDE 3

Overlap between Rotterdam Study WES coding (RefSeq) variants* across other publicly available datasets

Rotterdam Study WES (60x), n=2,628 samples

  • 1000G (N=2,504)
  • ExAC (N=60,706)
  • ESP (N=6,503)
  • GoNL (N=500)
  • UK10K (N=3,781)
  • Combined databases (N=73,994)

[Figure: overlap counts per dataset]

*Numbers are x1000

Van Rooij, Verkerk, Kraaij, unpublished

slide-4
SLIDE 4

Types of DNA polymorphisms (1)

What do they look like?

  • Single Nucleotide Polymorphisms (SNPs; e.g. A to G)
  • Insertion/deletion (e.g., AATCGC / -)
  • Variable Number of Tandem Repeats (VNTRs)

    * homopolymers: repeat unit = 1 bp (e.g., poly-A)
    * microsatellites: repeat unit = 2-6 bp (e.g., GT or CA)
    * minisatellites: repeat unit = >7 bp

slide-5
SLIDE 5

Types of DNA polymorphisms (2)

POSSIBLE FUNCTIONALITY:

  • 5’promoter: affecting mRNA expression by production
  • 3’UTR: affecting mRNA expression by stability
  • intron: affecting mRNA expression, splicing
  • exons: affecting protein structure/activity

What do they do?

[Figure: gene structure showing 5’ promoter, exon, intron and 3’ UTR]

slide-6
SLIDE 6

Types of DNA polymorphisms (3)

What do they do?

“NON-FUNCTIONAL” ???

  • exons: synonymous codon changes (e.g., both TCT and TCC encode the amino acid serine)
  • introns
  • intergenic areas (“gene deserts”)

[Figure: gene structure showing 5’ promoter, exon, intron and 3’ UTR]

Problem: for all these cases functional SNPs have been identified.

>> We have NOT analyzed ALL SNPs in the human genome PROPERLY. They could all very well be functional!
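The synonymous-codon example on this slide can be checked mechanically. The sketch below is illustrative only: it uses a small excerpt of the standard genetic code (not the full 64-codon table), and the helper name `is_synonymous` is hypothetical. It classifies a codon change at the protein level only; as the slide stresses, a "synonymous" label does not prove a variant is non-functional.

```python
# Excerpt of the standard genetic code; only the codons needed for the example.
CODON_TO_AA = {
    "TCT": "Ser", "TCC": "Ser", "TCA": "Ser", "TCG": "Ser",
    "CCT": "Pro",
    "TTT": "Phe",
}

def is_synonymous(ref_codon, alt_codon):
    """True when both codons encode the same amino acid."""
    return CODON_TO_AA[ref_codon] == CODON_TO_AA[alt_codon]

# TCT -> TCC changes the third base but still encodes serine.
print("TCT>TCC synonymous:", is_synonymous("TCT", "TCC"))
# TCT -> CCT changes serine to proline, so the protein sequence changes.
print("TCT>CCT synonymous:", is_synonymous("TCT", "CCT"))
```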

slide-7
SLIDE 7

SNPs, alleles, genotypes and haplotypes

SNP = Single Nucleotide Polymorphism

[Figure: the sequence A C G T A G C A shown on both chromosomes (strands), with labels marking allele, genotype and haplotype]

slide-8
SLIDE 8

Genetic Association Analysis (1) case-control design

Test for “association” by counting variants of a (candidate) gene.

Compare allele frequencies: A = wild-type allele; B = risk allele

CONTROL group: A 70%, B 30%
DISEASE group: A 50%, B 50%
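The allele counting on this slide can be worked through as a small calculation. This is a sketch, not the analysis used in the course: it assumes hypothetical counts of 1,000 alleles per group chosen to match the slide's percentages (controls 70% A / 30% B, disease group 50% A / 50% B), and the function names are made up for illustration.

```python
def allelic_odds_ratio(case_b, case_a, ctrl_b, ctrl_a):
    """Odds of carrying risk allele B in cases relative to controls."""
    return (case_b * ctrl_a) / (case_a * ctrl_b)

def chi_square_2x2(case_b, case_a, ctrl_b, ctrl_a):
    """Pearson chi-square for a 2x2 table (1 degree of freedom, no continuity correction)."""
    n = case_b + case_a + ctrl_b + ctrl_a
    numerator = n * (case_b * ctrl_a - case_a * ctrl_b) ** 2
    denominator = ((case_b + case_a) * (ctrl_b + ctrl_a)
                   * (case_b + ctrl_b) * (case_a + ctrl_a))
    return numerator / denominator

# Hypothetical allele counts (assumption: 1,000 alleles per group).
cases = {"B": 500, "A": 500}
controls = {"B": 300, "A": 700}

odds_ratio = allelic_odds_ratio(cases["B"], cases["A"], controls["B"], controls["A"])
chi2 = chi_square_2x2(cases["B"], cases["A"], controls["B"], controls["A"])
print(f"allelic odds ratio for B: {odds_ratio:.2f}")
print(f"chi-square (1 df): {chi2:.1f}")
```

With one degree of freedom, a chi-square value above 3.84 corresponds to p < 0.05, so these illustrative counts would count as an association.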

slide-9
SLIDE 9

Genetic Association Analysis (2) Quantitative Trait analyses

Humans are diploid: compare characteristics by their genotype

Population (-based sample): mean Femoral Neck BMD by genotype

Genotype   dose-effect   recessive   dominant
AA         0.82±0.12     0.82±0.12   0.82±0.12
AB         0.80±0.13     0.82±0.13   0.79±0.13
BB         0.78±0.13     0.78±0.13   0.79±0.13
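The three patterns on this slide (dose-effect, dominant, recessive) differ only in how the genotype is turned into a number before it is tested against the trait. A minimal sketch of that coding, assuming B is the effect allele; the sample genotypes and the helper `encode` are hypothetical:

```python
# Numeric coding of genotypes under each genetic model (B = effect allele).
GENOTYPE_CODING = {
    "dose-effect": {"AA": 0, "AB": 1, "BB": 2},  # each B allele adds one step
    "dominant":    {"AA": 0, "AB": 1, "BB": 1},  # one B allele is enough
    "recessive":   {"AA": 0, "AB": 0, "BB": 1},  # two B alleles are required
}

def encode(genotypes, model):
    """Translate genotype strings into the numeric predictor for one model."""
    coding = GENOTYPE_CODING[model]
    return [coding[g] for g in genotypes]

sample = ["AA", "AB", "BB", "AB"]  # hypothetical individuals
for model in GENOTYPE_CODING:
    print(model, encode(sample, model))
```

The encoded values would then serve as the predictor in a regression on the quantitative trait (here BMD).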

slide-10
SLIDE 10

What is a Genome-Wide Association Study?

  • Method for interrogating millions of common DNA variations across the human genome
  • Based on classic association study design
  • GWAS is based on “Linkage Disequilibrium” (LD): DNA variation is inherited in blocks, so not all variants have to be tested because one or a few will predict others

slide-11
SLIDE 11

One Tag SNP May Serve as Proxy for Many

CAGATCGCTGGATGAATCGCATCTGTAAGCAT
CGGATTGCTGCATGGATCGCATCTGTAAGCAC
CAGATCGCTGGATGAATCGCATCTGTAAGCAT
CAGATCGCTGGATGAATCCCATCAGTACGCAT
CGGATTGCTGCATGGATCCCATCAGTACGCAT
CGGATTGCTGCATGGATCCCATCAGTACGCAC

[Figure labels: Block 1, Block 2; SNP3, SNP5, SNP8]
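The tag-SNP idea can be made concrete with the six haplotypes on this slide. The sketch below finds the variable positions and computes the LD measure r² between pairs of them. Two assumptions: the SNPs are numbered left to right along the sequence (so the slide's SNP3, SNP5 and SNP8 are taken to be the third, fifth and eighth variable positions), and each variable position has exactly two alleles. The function names are illustrative.

```python
# Haplotypes copied from the slide (six chromosomes, 32 bases each).
HAPLOTYPES = [
    "CAGATCGCTGGATGAATCGCATCTGTAAGCAT",
    "CGGATTGCTGCATGGATCGCATCTGTAAGCAC",
    "CAGATCGCTGGATGAATCGCATCTGTAAGCAT",
    "CAGATCGCTGGATGAATCCCATCAGTACGCAT",
    "CGGATTGCTGCATGGATCCCATCAGTACGCAT",
    "CGGATTGCTGCATGGATCCCATCAGTACGCAC",
]

def polymorphic_sites(haplotypes):
    """Return 0-based positions where more than one base is observed."""
    length = len(haplotypes[0])
    return [i for i in range(length) if len({h[i] for h in haplotypes}) > 1]

def r_squared(haplotypes, i, j):
    """LD r^2 between two biallelic sites, computed from phased haplotypes."""
    n = len(haplotypes)
    ref_i, ref_j = haplotypes[0][i], haplotypes[0][j]  # alleles on the first haplotype
    p_i = sum(h[i] == ref_i for h in haplotypes) / n
    p_j = sum(h[j] == ref_j for h in haplotypes) / n
    p_ij = sum(h[i] == ref_i and h[j] == ref_j for h in haplotypes) / n
    d = p_ij - p_i * p_j  # disequilibrium coefficient D
    return d * d / (p_i * (1 - p_i) * p_j * (1 - p_j))

sites = polymorphic_sites(HAPLOTYPES)
# Assumption: SNP numbering runs left to right along the sequence.
snp = {f"SNP{k + 1}": pos for k, pos in enumerate(sites)}

print("variable positions:", sites)
print("r2(SNP1, SNP3):", round(r_squared(HAPLOTYPES, snp["SNP1"], snp["SNP3"]), 3))
print("r2(SNP3, SNP5):", round(r_squared(HAPLOTYPES, snp["SNP3"], snp["SNP5"]), 3))
```

Under these assumptions SNP1 and SNP3 are in complete LD (r² = 1), so genotyping SNP3 predicts SNP1, whereas SNP3 and SNP5 are only weakly correlated (r² of about 0.11), so each block needs its own tag.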

slide-12
SLIDE 12
slide-13
SLIDE 13
slide-14
SLIDE 14

Candidate Gene Analysis: 1990-2005

[Flow diagram with the following elements:]

  • Association analysis with disease phenotype in populations
  • Identify DNA polymorphisms
  • Identification of Candidate Gene
  • Identify haplotypes
  • Meta-analysis to quantify effect size
  • Functional analysis in cells/serum/etc.
  • Specific Expression
  • Animal Model
  • Mendelian Disease
  • GWA

slide-15
SLIDE 15

Candidate gene association analysis “in practice”:

Many controversial & irreproducible results because of:

  • Small sample size
  • Ill-defined choice of polymorphisms
  • Lack of standardized genotyping
  • Lack of standardized phenotype data
  • Publication bias

>> How to improve?

  • Combine study populations (across Europe, globally): meta-analysis
  • Rationalise choice of polymorphisms: functionality, haplotypes
  • Standardize genotyping methods: reference DNA plate
  • Standardize phenotypes across populations: meta-analysis of individual-level data
  • Run prospective meta-analyses

slide-16
SLIDE 16

Genetic determinants of osteoporosis ?

  • Vitamin D: VDR, DBP
  • Estrogen: ERα, ERβ, Aromatase, LH, LHR, GnRH
  • TGFb/BMP/Wnt-signalling: TGFb, LRP5/6, BMP2, FRZB, SOST
  • Homocysteine: MTHFR, MS, MTRR, CBS, THYMS
  • Matrix molecules: Collagen Ia1, osteocalcin, AHSG, LOX
  • Thyroid hormone: TSHR, DIO1, DIO2, DIO3, MCT8
  • Cortisol: GR, 11B-HSD
  • IGF/GH: IGFI, IGFBP3

slide-17
SLIDE 17

“GENOMOS”: a large-scale, multi-centre study for prospective meta-analyses of osteoporosis candidate gene variants

“Genetic Markers for Osteoporosis”, EU FP5 sponsored: 3 million euro, Jan 2003 – Jan 2007

Genes analysed:

  • ESR1
  • COLIA1
  • VDR
  • TGFb
  • LRP5&6

Total number of subjects (early 2006): 26,264

  • 18,405 women
  • 7,859 men
  • 6,498 fractures
  • 2,380 vertebral fx

Participating centres: Rotterdam* (*coordinating centre), Aberdeen, Aarhus, Antwerp, Cambridge, Barcelona, Firenze, Ioannina, Warsaw, Graz, Amsterdam

[Map: numbered centres; the legend distinguishes “Participant + Epidemiological Cohort” from “Participant”]

slide-18
SLIDE 18

GENOMOS RESULTS (March 2008)

GENE (n=6)   SNPs (n=17)   Sample n   BMD FN    BMD LS    FX Vert     FX Non-Vert
ESR1         3             18,917     -         -         20-30%      10-20%
COLI         1             20,786     0.15 SD   0.15 SD   10% (Sp1)   -
VDR          5             26,242     -         -         10% (Cdx)   -
TGFb         5             28,924     -         -         -           -
LRP5         2             37,760     0.15 SD   0.15 SD   12-26%      6-14%
LRP6         1             37,760     -         -         -           -

PUBLICATIONS:

  • Ioannidis et al., JAMA 2004
  • Uitterlinden et al., Ann Int Med 2006
  • Ralston et al., PLoS Med 2006
  • Langdahl et al., Bone 2008
  • van Meurs et al., JAMA 2008

slide-19
SLIDE 19

EU-FP7 project: GEFOS (2008-2012)

Number of subjects:

  • GENOMOS: >150,000
  • of which GWAS: 40,000

www.gefos.org

Coordinated by Erasmus MC

[Map legend:]

  • GENOMOS study population
  • idem + GWAS
  • idem, under negotiation / in development

slide-20
SLIDE 20

GEFOS HYPOTHESIS-FREE GWAS:

AS SAMPLE SIZE INCREASES, GENOME-WIDE SIGNIFICANT SIGNALS BECOME GRADUALLY EVIDENT

N=5,000

p = 5 × 10⁻⁸

  • Rotterdam Study
  • ERF Study
  • Twins UK
  • deCODE Genetics
  • Framingham Study

LUMBAR SPINE BMD

Rivadeneira et al., Nat Genet., 2009
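The slides do not state where the 5 × 10⁻⁸ line comes from. The usual rationale is a Bonferroni correction of a 0.05 significance level for roughly one million independent common-variant tests; that figure is an assumption of this sketch, as are the example p-values and the helper name.

```python
import math

ALPHA = 0.05                   # conventional family-wise error rate
INDEPENDENT_TESTS = 1_000_000  # assumption: approximate number of independent common variants

threshold = ALPHA / INDEPENDENT_TESTS  # Bonferroni-corrected per-test threshold

def is_genome_wide_significant(p_value, cutoff=threshold):
    """True when a single-SNP p-value passes the corrected threshold."""
    return p_value < cutoff

print(f"threshold: {threshold:.1e}")
for p in (1e-6, 3e-9):  # hypothetical p-values
    print(p, is_genome_wide_significant(p))
```

A signal at p = 1e-6 would look striking in a single-gene study but stays below the line here, which is why the growing sample sizes on these slides matter.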

slide-21
SLIDE 21

N=6,200

p = 5 × 10⁻⁸

  • Rotterdam Study
  • ERF Study
  • Twins UK
  • deCODE Genetics
  • Framingham Study

LRP5

LUMBAR SPINE BMD

Rivadeneira et al., Nat Genet., 2009

slide-22
SLIDE 22

N=8,500

p = 5 × 10⁻⁸

  • Rotterdam Study
  • ERF Study
  • Twins UK
  • deCODE Genetics
  • Framingham Study

LRP5

LUMBAR SPINE BMD

Rivadeneira et al., Nat Genet., 2009

slide-23
SLIDE 23

N=15,000

  • Rotterdam Study
  • ERF Study
  • Twins UK
  • deCODE Genetics
  • Framingham Study

LRP5 OPG MHC C6orf10 1p36 RANK-L

LUMBAR SPINE BMD

Rivadeneira et al., Nat Genet., 2009

p = 5 × 10⁻⁸

slide-24
SLIDE 24

N=19,125

  • Rotterdam Study
  • ERF Study
  • Twins UK
  • deCODE Genetics
  • Framingham Study

1p36 C6orf10 LRP5 RANK-L OPG

LUMBAR SPINE BMD

SP7

Rivadeneira et al., Nat Genet., 2009

p = 5 × 10⁻⁸

slide-25
SLIDE 25

The GEFOS collaboration has generated the greatest leaps in the sheer number of discoveries… for common variants

slide-26
SLIDE 26

Association analysis of a rare SNP in PLS3 with Fractures and BMD in the Rotterdam Study (RS)

  • Small effect on BMD
  • PLS3 genotype dependent fracture risk was not fully explained by BMD

Van Dijk, Zillikens et al., NEJM 2013

slide-27
SLIDE 27

WGS identifies rare non-coding variants near EN1 with large effects on BMD and FX (EN1=engrailed 1: transcription factor)

Houfeng Zheng 1,2 *, Vincenzo Forgetta 1,2*, Yi-Hsiang Hsu 3-5*, Karol Estrada 4-7*, ..Carolina Medina-Gómez 6,30,31, …..Ling Oei 6,30,31,….,Robert Kraaij 6,30,31,……. Nathalie van der Velde6,42, ……..Jeroen van Rooij 6,31, …….André G Uitterlinden 6,30,31,……. Fernando Rivadeneira 6,30,31†, J Brent Richards 1,2,28† for the UK10K and GEFOS Consortia

Evidence for regulatory function in osteoblasts (Nature, Sept 2015)

[Figures: association signals at the EN1 locus; allele frequency vs. effect size]

FRACTURE META-ANALYSIS in 508,253 subjects:

Locus   SNP          Effect allele   Frequency   OR (95% CI)        p-value       N cases   N controls   I2
EN1     rs11692564   T               0.02        0.85 (0.80-0.89)   2.0 × 10⁻¹¹   98,742    409,511      0.00

slide-28
SLIDE 28

Identification of rare variants requires NGS data and big sample size for replication

[Figure labels: 2015; UK10K; Imputed GEFOS UK10K; N=33K; 2; less frequent variants]

slide-29
SLIDE 29

Even bigger sample size yields many more discoveries for both common and less frequent variants…….

[Figure labels: 2016; 1000GP; UK Biobank, n=160K; >300; ~240]

(Unpublished data)

slide-30
SLIDE 30

Genetic “architecture” of human BMD

[Figure: population frequency of BMD value plotted against BMD value, from low to high]

  • Rare (both tails of the distribution): monogenic mutations with large effects
  • Common: polymorphisms with subtle effects

Explained variance: 5% >> 20%?

ANALYTICAL APPROACHES:

  • Linkage in pedigrees + exome sequencing
  • Exome + genome sequencing
  • GWAS + GEFOS + GENOMOS

Genes and loci named on the slide:

LRP5 SOST CLCN7 TCIRG1 CATK OSTM1 RANKL RANK COLIA1 COLIA2 CRTAP LEPRE LRP5 CYP17 ESR1 PLS3

ANXA11 LIN7C RSPH10B TNFAIP8L3 ARHGAP1 LRP3 RTDR1 TNFRSF11B BBOX1 LRP4 RUNX2 TNFSF11 BCR LRP5 SERPINE2 TOE1 CDC5L LSM12 SETD4 TOP2B CDK5 LYRM5 SFTPD TSGA10IP CLIP4 MAP3K11 SHFM1 TSPYL6 COL11A1 MAP3K12 SIRT3 TSR1 CTNNB1 MBL2 SLC25A13 TTC21B CYLD MEF2C SLC45A1 UNKL DAB2IP MEOX1 SNX20 USHBP1 DCDC1 MEPE SOX4 WDFY1 DLX5 MKKS SOX6 WDR43 DLX6 MPP2 SOX9 WDR86 DYDC1 MPP3 SP1 WDR88 ERC1 MYO9B SP7 WFIKKN1 ESR1 NAB1 SPIRE1 WNT1 FOXC2 PAX6 SPP1 WNT10B FOXF1 PIGC SPTBN1 WNT16 GPR141 PKD2L1 STARD3NL WNT3 GPR177 PLAC9 STK38L WNT4 GRB10 PTPRN2 SUPT3H WNT4 HDAC5 QRFP SUV420H1 WNT5B IBSP RAB18 TIPARP WNT9B IGFBP6 RADIL TLR5 XKR9 INSIG2 RBMS3 TMEM16J ZBTB40 ITGA2B RIC8B TMEM175 ZCCHC2 JAG1 RPE65 TMEM87B ZDHHC23

EN1 LGR4 PLS3

slide-31
SLIDE 31

Examples of Phenotypes currently subject to GWAS in our research group

  • Osteoporosis (GEFOS): BMD FN+LS, geometry, etc.
  • Osteoarthritis (TREAT OA): Kellgren score 2+, TJR, etc.
  • Menopause, Menarche (REPROGEN)
  • BMI, fat%, waist circumference, waist hip ratio (CHARGE/GIANT)
  • Lean Mass (DXA), grip strength (GEFOS, CHARGE)
  • Height (GIANT)
  • Hcy (CHARGE)
  • Vitamin D (Sunshine, CHARGE)
  • Microbiome (CHARGE)
  • Pain (CHARGE)
  • Educational Attainment (SSGC/CHARGE)

slide-32
SLIDE 32

GWAS: successful approach to identify new genetic loci in complex diseases (2011)

Trait/disease      Nr subjects GWAS   Nr hits   Explained variance
Height             135,000            210       14%
BMD                20,000             20        2-3%
BMI                126,000            18        ~1.5%
MI                 2500+3500          9         2.8%
LDL                20,000             11        8%
HDL                20,000             14        9%
Blood pressure     35,000             8         0.5%
Breast cancer      4000+4000          8         5.4%
Age at menopause   10,000             4         2.7%

slide-33
SLIDE 33

Willer et al., Nature Genetics, jan 2009: 145 authors

GWAS issues:

  • GWAS hits are just a start to find causal variant(s)
  • Much follow-up research, …to be done by you (in collab.)
  • GWAS creates new genome annotation/function/biology (e.g., snRNA)
  • Small effect size does NOT mean small biological relevance

slide-34
SLIDE 34

Determining “Functional” Effects of DNA Polymorphisms is a slow and difficult process

Organizational level, with “read-out” of functionality:

DNA polymorphism

mRNA

  • level, stability, splicing/isoforms

Protein

  • level, stability, isoforms, protein-protein

Cells

  • e.g., transcriptional activity
  • e.g., cell growth inhibition

Humans

  • Serum parameters
  • Intervention
  • Association with disease: > 70 yrs follow-up!

slide-35
SLIDE 35

GIANT meta-analyses GWAS BMI in n = 32,000 (stage 1); replication in n = 59,000 (stage 2)

Loci that are p < 5 × 10⁻⁸:

Known (risk allele freq %):

  • FTO (41%)
  • MC4R (21%)

New (risk allele freq %):

  • TMEM18 (84%)
  • KCTD15 (67%)
  • GNPDA2 (45%)
  • SH2B1 (41%)
  • MTCH2 (34%)
  • NEGR1 (62%)

Explained variance = 0.84% (in stage 2 samples)

Individual SNP odds ratios for being:

  • Overweight (BMI > 25): 1.03 – 1.14
  • Obese (BMI > 30): 1.03 – 1.25

Willer et al., Nature Genetics, jan 2009: 145 authors

slide-36
SLIDE 36

Future Efforts in GWAS……….

  • Unanswered Questions:
      • Causative SNP? Causative gene? Biological mechanism?
      • Limited explained variance per trait/disease: “dark matter”
  • The Hunt for Genetic “Dark Matter”:
      • Other types of genetic variation:
          • Rare variants (<5%, <1%, etc.)
          • Copy Number Variations (CNVs)
      • Interaction:
          • Gene-Gene and Gene-Environment
  • Technological Developments:
      • High Throughput Sequencing: >> 1000 genomes sequence project….
slide-37
SLIDE 37

WHY PERSONALIZED/PRECISION MEDICINE?

“Doctors prescribe medication they know little about to cure diseases they understand even less in people they know nothing about” (Voltaire (1694-1778), ?)

« Les médecins administrent des médicaments dont ils savent très peu, à des malades dont ils savent moins, pour guérir des maladies dont ils ne savent rien »

slide-38
SLIDE 38

Commercial SNP analysis

[Dec 2008, Caucasian male (AU)]

Rheumatoid Arthritis; GENE: SNP rs-number (genotype):

              deCODE            23andMe
RR            1.86 (7 SNPs)     0.60 (6 SNPs)
HLA DRB1      -                 rs6457617 (CT)
              rs660895 (AG)     -
PADI4         -                 rs11203366 (AG)
PTPN22        rs2476601 (GG)    rs2476601 (GG)
MMEL1         -                 rs3890745 (CT)
6q23          rs2327832 (AA)    rs2327832 (AA)
              rs13192841 (GG)   -
TRAF1/C5      rs3761847 (AG)    rs3761847 (AG)
IL2/IL21      rs6822844 (GG)    -
STAT4         rs7574865 (GT)    -

Recreational Genomics is “not a medical necessity”….
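One reason the two providers can report such different relative risks for the same person is that they score different SNP panels. The sketch below compares the two rs-number lists from this slide with set operations. The assignment of each SNP to a provider is read off the slide's two columns and should be treated as an assumption; the variable names are illustrative.

```python
# SNPs scored by each provider, as read from the slide (column assignment is an assumption).
decode_snps = {
    "rs660895", "rs2476601", "rs2327832", "rs13192841",
    "rs3761847", "rs6822844", "rs7574865",
}
snps_23andme = {
    "rs6457617", "rs11203366", "rs2476601",
    "rs3890745", "rs2327832", "rs3761847",
}

shared = decode_snps & snps_23andme        # scored by both providers
only_decode = decode_snps - snps_23andme   # scored by deCODE only
only_23andme = snps_23andme - decode_snps  # scored by 23andMe only

print("shared:", sorted(shared))
print("deCODE only:", sorted(only_decode))
print("23andMe only:", sorted(only_23andme))
```

Under this reading only three of the ten distinct SNPs are shared, so the two risk estimates rest largely on different evidence.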

slide-39
SLIDE 39

My Personal (Full) Genome… (13 Sept 2013)

Illumina, TruSight Individual Genome Sequencing (IGS) test, CLIA-certified, CAP-accredited, Physician-led

(Part of “Understand Your Genome” event, London, UK: 12/13 September 2013)

  • Costs: $5,000
  • Analysis time: 1 month
  • Coverage: > 30x, >90% of genome, only SNVs, NO indel, NO CNV, NO structural variants + 2.5 mio Omni SNP array

Reports: 1600 genes for 1221 conditions (exonic variants). Only “clinically significant variants” are discussed (= Mendelian, high penetrance).

My result: 5,377 variants =

  • 558 nothing reported in literature
  • 3,959 benign
  • 854 likely benign
  • 6 of clinical significance: 2 likely pathogenic variants, 2 “suspicious”, carrier of 2 pathogenic variants

Variant Statistics:

Total               3,348,002
in Genes            1,280,794
in Coding regions   18,857
in UTR              25,877
Splice site         2,336
Stop/Gain           86
Stop/Lost           31
Non-synonymous      9,988
Synonymous          8,751
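The breakdown of the 5,377 reported variants can be checked with a few lines of arithmetic. The counts are taken from the slide; the dictionary keys and variable names are illustrative only.

```python
# Classification of the reported variants, counts as given on the slide.
classified = {
    "nothing reported in literature": 558,
    "benign": 3959,
    "likely benign": 854,
    "of clinical significance": 6,
}

total = sum(classified.values())
shares = {label: 100 * count / total for label, count in classified.items()}

print("total reported variants:", total)
for label, share in shares.items():
    print(f"{label}: {share:.2f}%")
```

The four classes add up to the 5,377 variants stated on the slide, and the clinically significant ones make up roughly one in a thousand of them.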

slide-40
SLIDE 40
Population Genomics: what have we learned?

  • New Biology: Dozens of novel genes/pathways discovered to be involved in disease phenotypes and risk factors
  • Potential for Prediction: A still increasing part of heritability of phenotypes is being explained
  • Better Epidemiology: Mendelian Randomization is now more feasible to analyse causality of “classic” epidemiological associations
  • High Impact and Exemplary: Large-scale international collaborations allow for very robust evidence for genetic & genomic discoveries
  • Populations are a bunch of individuals: opportunities for studying “personalized aspects” of medicine

>> Translational Research based on these discoveries is opportune

slide-41
SLIDE 41

…..IGNORANCE CAN BE DAUNTING……EDUCATION IS IMPORTANT !!

Annual Courses organized by the Genetic Laboratory: in 2017:

  • 9th edition of “Genetics for Dummies” (1-2 Nov; MolMed)
  • 14th edition of “SNP Course” (13-17 Nov; MolMed)
  • 12th edition of “Genomics in Medicine” (Aug; ESP57; NIHES)
  • 2nd edition of Microbiome course (Sept; MolMed)

www.molmed.nl www.nihes.nl

slide-42
SLIDE 42

It’s a fine balance……

Sougia, Crete, sept 2012