Emerging Genomics Technologies in Research of Complex Traits and Diseases



slide-1
SLIDE 1

André G Uitterlinden

Genetic Laboratory Department of Internal Medicine

Department of Epidemiology Department of Clinical Chemistry

www.glimdna.org

Emerging Genomics Technologies in Research of Complex Traits and Diseases

Power of Programming, Munich, D, 14 March 2014

Note: for non-commercial purposes only

slide-2
SLIDE 2
slide-3
SLIDE 3

RNA

  • Dynamic
  • Unstable
  • Tissue-specific regulation
  • Quantitative measurement

Clinical + Biological Relevance

slide-4
SLIDE 4

(Slide background: a large block of raw DNA sequence, AGGAGTCTGACTGACCATTGGACTAGGGGATTGACC...)

HUMAN DNA IS HIGHLY VARIABLE

DNA variants are:

*Frequent in the genome:

  • >75 million (?) variable loci in the genome (~2%)
  • SNPs, in/dels, CNVs, VNTRs
  • dbSNP, HapMap, 1KG, “local” NGS efforts, ...

*Frequent in the population:

  • > 5% = common polymorphism
  • 1-5% = less common variant
  • < 1% = rare variant/mutation

Abbreviations: SNP = Single Nucleotide Polymorphism; IN/DEL = Insertion/Deletion; CNV = Copy Number Variation; VNTR = Variable Number of Tandem Repeats
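The population-frequency classes on this slide can be written as a small lookup. This is only a sketch: the function name is hypothetical, frequencies are taken as fractions rather than percentages, and placing exactly 5% and exactly 1% in the "less common" class is an assumption, since the slide does not say where the boundaries fall.

```python
def classify_variant(minor_allele_frequency: float) -> str:
    """Label a variant by its minor allele frequency (a fraction between 0 and 0.5)."""
    if not 0.0 <= minor_allele_frequency <= 0.5:
        raise ValueError("minor allele frequency must lie between 0 and 0.5")
    if minor_allele_frequency > 0.05:
        return "common polymorphism"
    # Assumption: the boundary values 0.05 and 0.01 count as "less common".
    if minor_allele_frequency >= 0.01:
        return "less common variant"
    return "rare variant/mutation"


# Hypothetical frequencies, one per class.
for maf in (0.30, 0.03, 0.002):
    print(f"MAF {maf:.3f}: {classify_variant(maf)}")
```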

slide-5
SLIDE 5

Time needed for genotyping 1 SNP in 7,000 DNA samples of the Rotterdam Study

  • 1996: 6 months (RFLP, Epp tubes)
  • 1999: 3 months (RFLP, 96-well plates)
  • 2001: 1 week (SBE, 384-well plates)
  • 2003: 1 day (TaqMan, manual)
  • 2004: 6 hrs (TaqMan, Caliper pipetting robot)
  • 2005: 3 hrs (TaqMan, Deerac, “Fast” PCR)
  • 2007: 6 sec (Illumina 550K array, 600 DNAs/week)
  • 2010: < 0.0006 sec (Illumina HiSeq2000 sequencers)

The influence of “technology-push”

slide-6
SLIDE 6

Human Ageing Research: Bone as an Example...

Figure: BMD versus age (yr; axis ticks 25, 50, 75, 100) for men and women, with the phases bone growth, peak BMD and bone loss. Osteoporosis: low BMD, fractures.

Influences: maternal genotype, paternal genotype, environmental factors, ageing

DNA collections with bone endpoints: ERGO/Rotterdam Study, EPOS, GenR, CALEUR, AGGO

slide-7
SLIDE 7

Osteoporotic fracture is a “complex” phenotype:

Clinical Expression: Hip fx, Wrist fx, Vertebral fx, etc.

Fracture Risk: Bone Strength, Impact Force, Fall Risk

Risk Factors: BMD, Quality, Geometry

DNA mutations and polymorphisms

Environmental factors: diet, exercise, sun exposure, ...

+ Age, Sex, Age-at-Menopause, Height, OA, etc.

slide-8
SLIDE 8

Environmental influences can differ between populations!

Dietary calcium intake:

  • HOLLAND: > 1100 mg/day
  • BELGIUM: < 500 mg/day

Geographical distance: < 100 km

Photos: Barbara Obermayer-Pietsch, Stuart Ralston

slide-9
SLIDE 9

Genetic Architecture of Diseases/Traits:

Figure: effect size (small to big) versus frequency of the genetic variant (rare to common), with the regions “rare, monogenic” and “common, complex”.

Study designs to identify “risk” alleles:

  • Linkage Analysis in pedigrees
  • Genome-Wide Association Study
  • Next-Generation High-Throughput Sequencing

slide-10
SLIDE 10

Genome-Wide Association Study (GWAS)

  • DNA collection: e.g. 1000 cases vs. 1000 controls
  • Genotyping arrays (Illumina, Affymetrix): genotype calls (AA, AB, BB) for SNP1, SNP2, SNP3, ... SNP550,000
  • DATA ANALYSIS (e.g., PLINK)
  • Plot by chromosome (1 to X): each dot is one SNP in, e.g., 2000 subjects
  • Select SNPs
  • Replication
  • Combine GWAS: meta-analysis of all data

  • Effects per SNP are usually small
  • We are looking at common variants
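Each dot in such a plot comes from one association test per SNP. As a minimal sketch of what that can look like for a case-control design, the example below runs a Pearson chi-square test on a 2x2 table of allele counts. The allele counts are invented for illustration, and the allelic test is only one of several tests that tools such as PLINK offer; the slide does not say which test was used.

```python
import math


def allelic_chi_square(case_a: int, case_b: int, control_a: int, control_b: int):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts."""
    table = [[case_a, case_b], [control_a, control_b]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_a + control_a, case_b + control_b]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 1000 cases and 1000 controls carry 2000 alleles each.
chi2, p = allelic_chi_square(case_a=900, case_b=1100, control_a=800, control_b=1200)
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

A genome-wide scan repeats this test for every SNP and plots -log10(p) against chromosomal position.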
slide-11
SLIDE 11

A “Dubai” plot: GWAS of human iris colour

Plot axes: chromosome / position vs. P-value (-log10)

HERC2/OCA2 gene, 12 kb on Chr. 15q11

P < 1 x 10^-206, n = 5974

Rotterdam Study: Kayser et al, Am J Hum Genet, 2008

slide-12
SLIDE 12

A “Holland” plot: GWAS for BMD in the Rotterdam Study

N=5,000

5 x 10^-8

  • Rotterdam Study
  • ERF Study
  • Twins UK
  • deCODE Genetics
  • Framingham Study

LUMBAR SPINE BMD

Rivadeneira et al., Nat Genet., 2009
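The 5 x 10^-8 line drawn in these plots is conventionally motivated as a Bonferroni correction of a 5% family-wise error rate. The sketch below assumes roughly one million independent common-variant tests across the genome, which is the usual rule of thumb and not a number given on the slide.

```python
import math


def bonferroni_threshold(family_wise_alpha: float, n_independent_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return family_wise_alpha / n_independent_tests


# Assumption: about one million independent tests for common variants.
threshold = bonferroni_threshold(0.05, 1_000_000)
print(f"per-SNP threshold: {threshold:.0e}")
print(f"height of the line on the -log10(P) axis: {-math.log10(threshold):.2f}")
```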

slide-13
SLIDE 13

A real Manhattan plot: “height” in the GIANT consortium

5 x 10^-8

Lango, Estrada, Rivadeneira et al., Nature, 2010

  • 180,000 subjects
  • 180 loci identified
  • 10-15% variance explained
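To see why many loci can still add up to a modest share of variance, a common approximation is that one additive SNP with allele frequency p and effect beta (in standard deviations of the trait) explains about 2p(1-p)beta^2 of the trait variance. The sketch below assumes a standardized trait, Hardy-Weinberg proportions and independent loci; the three loci are hypothetical and are not GIANT results.

```python
def variance_explained(allele_frequency: float, effect_in_sd: float) -> float:
    """Approximate fraction of trait variance explained by one additive SNP."""
    return 2.0 * allele_frequency * (1.0 - allele_frequency) * effect_in_sd ** 2


# Hypothetical loci: (allele frequency, effect per allele in trait SD).
loci = [(0.30, 0.05), (0.10, 0.08), (0.45, 0.03)]
total = sum(variance_explained(p, beta) for p, beta in loci)
print(f"{len(loci)} loci together explain about {100 * total:.2f}% of the variance")
```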
slide-14
SLIDE 14
Grades of Evidence (from Very Good to Not so Good):

  • Collaborative prospective meta-analysis of individual-level data in consortia
  • Meta-analysis of published data
  • >2 large studies (n > 1000 each)
  • 1-3 smaller studies
  • 1 small study (n<500)
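Combining per-cohort results is often done with an inverse-variance weighted fixed-effect meta-analysis. The sketch below shows that standard technique under the assumption that each cohort reports an effect estimate and its standard error for the same allele; the cohort values are invented, and the slides do not specify which meta-analysis model the consortia used.

```python
import math


def fixed_effect_meta(estimates):
    """Pool (beta, standard error) pairs with inverse-variance weights."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, p_value


# Hypothetical per-cohort estimates for one SNP: (beta, standard error).
cohorts = [(0.08, 0.03), (0.05, 0.04), (0.10, 0.05)]
beta, se, p = fixed_effect_meta(cohorts)
print(f"pooled beta = {beta:.3f}, se = {se:.3f}, p = {p:.2e}")
```

The pooled standard error is smaller than that of any single cohort, which is why consortia reach significance for effects that individual studies cannot detect.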

slide-15
SLIDE 15

EUROPE by prejudice... (according to USA)

(From: Yanko Tsvetkov, alphadesigner.com)

slide-16
SLIDE 16

The GEFOS/GENOMOS consortium

Number of subjects: GENOMOS: >150,000, of which GWAS: 40,000

www.gefos.org www.genomos.eu

Map legend: GENOMOS study population; idem + GWAS; idem, under negotiation / in development

slide-17
SLIDE 17

GEFOS HYPOTHESIS-FREE GWAS:

AS SAMPLE SIZE INCREASES, GENOME-WIDE SIGNIFICANT SIGNALS BECOME GRADUALLY EVIDENT

N=5,000

5 x 10^-8

  • Rotterdam Study
  • ERF Study
  • Twins UK
  • deCODE Genetics
  • Framingham Study

LUMBAR SPINE BMD

Rivadeneira et al., Nat Genet., 2009

slide-18
SLIDE 18

N=6,200

5 x 10^-8

  • Rotterdam Study
  • ERF Study
  • Twins UK
  • deCODE Genetics
  • Framingham Study

LRP5

LUMBAR SPINE BMD

Rivadeneira et al., Nat Genet., 2009

slide-19
SLIDE 19

N=8,500

5 x 10^-8

  • Rotterdam Study
  • ERF Study
  • Twins UK
  • deCODE Genetics
  • Framingham Study

LRP5

LUMBAR SPINE BMD

Rivadeneira et al., Nat Genet., 2009

slide-20
SLIDE 20

N=15,000

  • Rotterdam Study
  • ERF Study
  • Twins UK
  • deCODE Genetics
  • Framingham Study

LRP5 OPG MHC C6orf10 1p36 RANK-L

LUMBAR SPINE BMD

Rivadeneira et al., Nat Genet., 2009

5 x 10^-8

slide-21
SLIDE 21

N=19,125

  • Rotterdam Study
  • ERF Study
  • Twins UK
  • deCODE Genetics
  • Framingham Study

1p36 C6orf10 LRP5 RANK-L OPG SP7

LUMBAR SPINE BMD

Rivadeneira et al., Nat Genet., 2009

5 x 10^-8
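The gradual appearance of signals across the last five slides follows from statistical power. As a rough sketch, the expected test statistic for an additive SNP on a standardized trait grows with the square root of the sample size, so power at a fixed threshold rises as cohorts are added. The allele frequency and effect size below are hypothetical, and the normal approximation ignores covariates and relatedness; only the sample sizes are taken from the slides.

```python
from statistics import NormalDist


def gwas_power(n_subjects: int, allele_frequency: float, effect_in_sd: float,
               alpha: float = 5e-8) -> float:
    """Approximate power of a two-sided single-SNP test on a standardized trait."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    # Expected z-score: sqrt(N * 2p(1-p)) * beta, valid for small effects.
    expected_z = (n_subjects * 2.0 * allele_frequency * (1.0 - allele_frequency)) ** 0.5 * effect_in_sd
    return norm.cdf(expected_z - z_crit) + norm.cdf(-expected_z - z_crit)


# Sample sizes from the slides; frequency 0.30 and effect 0.06 SD are assumptions.
for n in (5_000, 6_200, 8_500, 15_000, 19_125):
    print(f"N = {n:>6}: power = {gwas_power(n, 0.30, 0.06):.3f}")
```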

slide-22
SLIDE 22

Willer et al., Nature Genetics, Jan 2009: 145 authors

GWAS issues:

  • GWAS hits are just a start to find causal genes/variant(s)
  • Follow-up research per individual locus
  • GWAS creates new genome annotation/function/biology
  • Small effect size does NOT mean small biological relevance

slide-23
SLIDE 23

NHGRI GWA Catalog: www.genome.gov/GWAStudies, www.ebi.ac.uk/fgpt/gwas/

Published Genome-Wide Associations through 12/2012: published GWA at p ≤ 5 x 10^-8 for 17 trait categories

As of 11/19/13, the catalog includes 1751 publications and 11,912 SNPs.

With current GWAS efforts we have:

  • Genotyped only 0.3% of nucleotides in the human genome
  • Selected for “Universal/Cosmopolitan” variants
  • Explained 2-30% of genetic variance per disease (some exceptions)
  • Not yet analysed many more phenotypes

slide-24
SLIDE 24

What are eQTLs?

  • expression Quantitative Trait Loci
  • genomic variations that explain expression traits
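In practice, "explaining an expression trait" usually means regressing the expression level of a gene on the genotype dosage (0, 1 or 2 copies of an allele) of a SNP. The sketch below fits that line by ordinary least squares without covariates. The dosages and expression values are made up, and real analyses adjust for technical and biological covariates.

```python
def eqtl_slope(dosages, expression):
    """Least-squares slope and intercept of expression on allele dosage."""
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    return slope, intercept


# Hypothetical data: six subjects, expression on a log2 scale.
dosages = [0, 0, 1, 1, 2, 2]
expression = [5.0, 5.2, 5.9, 6.1, 7.0, 7.2]
slope, intercept = eqtl_slope(dosages, expression)
print(f"expression changes by {slope:.2f} per allele copy (intercept {intercept:.2f})")
```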

slide-25
SLIDE 25

Published online, Nat. Genet., 8 September 2013

Cis-eQTL:

  • 2,500,000 SNPs
  • 34,000 Illumina probes (16,332 genes, cis-window: < 250 kb)

Trans-eQTL:

  • 4,611 SNPs *
  • 34,000 Illumina probes (16,332 genes, trans-window: > 5 Mb)

* Based on data from Teri Manolio’s ‘A Catalog of Genome-Wide Association Studies’
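The two windows on this slide amount to a distance rule between a SNP and an expression probe. The sketch below encodes that rule with the slide's thresholds. Treating SNP-probe pairs on different chromosomes as trans, measuring distance to a single probe position, and labelling the gap between 250 kb and 5 Mb as "not tested" are assumptions made for illustration.

```python
CIS_WINDOW_BP = 250_000      # from the slide: cis-window < 250 kb
TRANS_WINDOW_BP = 5_000_000  # from the slide: trans-window > 5 Mb


def classify_eqtl(snp_chrom: str, snp_pos: int, probe_chrom: str, probe_pos: int) -> str:
    """Classify a SNP-probe pair as cis, trans, or outside both windows."""
    if snp_chrom != probe_chrom:
        return "trans"  # assumption: other chromosomes count as trans
    distance = abs(snp_pos - probe_pos)
    if distance < CIS_WINDOW_BP:
        return "cis"
    if distance > TRANS_WINDOW_BP:
        return "trans"
    return "not tested"  # assumption: pairs between the two windows are skipped


# Hypothetical coordinates.
print(classify_eqtl("15", 28_365_000, "15", 28_400_000))
print(classify_eqtl("15", 28_365_000, "11", 68_100_000))
```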

slide-26
SLIDE 26

CHARGE WG on RNA expression differences with Age (by array): study design, number of samples

Discovery meta-analysis: 7,257 whole blood samples (6 cohorts)

  • Differential Expression with Age (LM)
  • Adjusted for: sex, smoking, fasting status, blood cell counts, RNA quality (RIN), batch effects, family structure, other technical covariates
  • Cohorts: EGCUT, FHS-2nd gen, InChianti, KORA, RS and SHIP

Replication: 8,009 whole blood samples (7 cohorts)

  • Differential Expression with Age (LM)
  • Adjusted for: sex, smoking, fasting status, blood cell counts, RNA quality (RIN), batch effects, family structure, other technical covariates
  • Cohorts: BSGS, DILGOM, Fehrmann, FHS-3rd gen, GTP, HVH, and NIDDK/Pima

Generalization: 4,644 samples (9 other tissues/cell types)

  • Differential Expression with Age (LM) using different tissues
  • Tissues: Brain (cerebellum + frontal cortex) (n=394), CD4+ cells (n=515), CD8+ cells (n=299), CD14+ cells (n=213), LCLs (n=869), Lymphocytes (n=1,244), Monocytes (n=354) and PBMCs (n=362)
  • Cohorts: EGCUT, IMMVAR, GARP, GENOA, MESA, NABEC-UKBEC and SAFHS

In total, 19,910 samples analyzed

slide-27
SLIDE 27

Smoking cessation effect on blood RNA expression

  • Only SHIP and FHS have the data
  • Replication is extremely tricky due to differences in platform and number of observations

CHARGE WG on RNA expression, unpublished data

slide-28
SLIDE 28

What will Next Generation Sequencing (NGS) bring us?

>> increased levels of resolution in genome analysis:

RFLP > TaqMan > SNP Array > SNP Array + Imputation > Full Exome NGS > Full Genome NGS (= data are too complex...?)

slide-29
SLIDE 29

A new genomics centre: “Genomics@Rotterdam”, Erasmus MC

(4 x HiSeq2000/2500; >50 projects; >14,000 human NGS samples sequenced)

Application: samples/month; ~costs*

  • DNA Full Exome (WES) (@ 50 X; Nimblegen/Agilent): ~ 1,500; € 550
  • RNA Seq Transcriptome (@ 18 Gb): ~ 500; € 280
  • DNA Whole Genome Seq @ 30 X: ~ 32; ? (€ 2,000?)
  • DNA Whole Genome Seq @ 12 X: ~ 80; ? (€ 1,000?)

*Prices subject to change; by project volume and date
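The "@ 30 X" and "@ 12 X" figures are mean coverage depths. As a back-of-the-envelope sketch, the raw sequence needed per sample is the coverage multiplied by the size of the target. The genome size of about 3.1 Gb used below is an approximation brought in for illustration and does not come from the slide.

```python
HUMAN_GENOME_BP = 3.1e9  # assumption: approximate size of the human genome


def bases_needed(mean_coverage: float, target_size_bp: float = HUMAN_GENOME_BP) -> float:
    """Raw bases required to reach a mean coverage over a target of the given size."""
    return mean_coverage * target_size_bp


for coverage in (30, 12):
    print(f"{coverage}X whole genome: about {bases_needed(coverage) / 1e9:.0f} Gb per sample")
```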

slide-30
SLIDE 30

X-exome sequencing of three affected males revealed a frameshift, c.235delT in exon 3; p.(Tyr79Ilefs*6), of PLS3.

Plastin 3 (PLS3) is an F-actin bundling protein.

Van Dijk, Zillikens et al., NEJM, 17 October 2013

slide-31
SLIDE 31

Genetic “architecture” of human BMD

Figure: population frequency of BMD value (low to high). Rare monogenic mutations with large effects sit in both tails; common polymorphisms with subtle effects sit in between.

Genes shown: LRP5 SOST CLCN7 TCIRG1 CATK OSTM1 RANKL RANK COLIA1 COLIA2 CRTAP LEPRE LRP5 CYP17 ESR1 PLS3

ANALYTICAL APPROACHES:

  • LINKAGE IN PEDIGREES + EXOME SEQUENCING
  • GWAS + GEFOS + GENOMOS
  • EXOME + GENOME SEQUENCING

ANXA11 LIN7C RSPH10B TNFAIP8L3 ARHGAP1 LRP3 RTDR1 TNFRSF11B BBOX1 LRP4 RUNX2 TNFSF11 BCR LRP5 SERPINE2 TOE1 CDC5L LSM12 SETD4 TOP2B CDK5 LYRM5 SFTPD TSGA10IP CLIP4 MAP3K11 SHFM1 TSPYL6 COL11A1 MAP3K12 SIRT3 TSR1 CTNNB1 MBL2 SLC25A13 TTC21B CYLD MEF2C SLC45A1 UNKL DAB2IP MEOX1 SNX20 USHBP1 DCDC1 MEPE SOX4 WDFY1 DLX5 MKKS SOX6 WDR43 DLX6 MPP2 SOX9 WDR86 DYDC1 MPP3 SP1 WDR88 ERC1 MYO9B SP7 WFIKKN1 ESR1 NAB1 SPIRE1 WNT1 FOXC2 PAX6 SPP1 WNT10B FOXF1 PIGC SPTBN1 WNT16 GPR141 PKD2L1 STARD3NL WNT3 GPR177 PLAC9 STK38L WNT4 GRB10 PTPRN2 SUPT3H WNT4 HDAC5 QRFP SUV420H1 WNT5B IBSP RAB18 TIPARP WNT9B IGFBP6 RADIL TLR5 XKR9 INSIG2 RBMS3 TMEM16J ZBTB40 ITGA2B RIC8B TMEM175 ZCCHC2 JAG1 RPE65 TMEM87B ZDHHC23

  • Expl. variance: 5% >> 20%?

slide-32
SLIDE 32

Genome-wide Exploration in Cohort studies (in NCHA)

Total samples: N=32,000

Cohorts and SNP arrays:

  • GenR (n=6,000 children; 11,000 parents): 6,000 on SNP array (Illumina 610K)
  • ERF (n=3,000): 3,000 (Illumina 300K)
  • RS I, II, III (n=12,000): 12,000 (Illumina 550/610K)
  • LLS (n=3,000): 3,500 (Illumina 610K)
  • PROSPER (n=5,000): 5,000 (Illumina 610K)
  • B-PROOF (n=3,000): 3,000 (Illumina OmniXp)

Other data types (values in slide order; • marks a cell without a number; parentheses as on the slide):

Exome Sequencing: • 1,500 4,500 • (600) •
Exome Array (illum): 1,000 1,500 3,000 1,000 •
Full Genome Seq.: • 50 100 250 •
Methylation Array: 1,000 (1,000) 1,000 (1,000) • (500)
Methylation Seq.: •
RNA Array: • 200 1,000 500 •
RNA Seq.: • (700) 900 (700) •
Microbiome Seq: (1,500) • (1,500)
slide-33
SLIDE 33

My Personal Genome... (13 Sept 2013)

Illumina TruSight Individual Genome Sequencing (IGS) test: CLIA-certified, CAP-accredited, physician-led

(Part of “Understand Your Genome” event, London, UK, 12/13 September 2013)

  • Costs: $ 5,000
  • Analysis time: 1 month
  • Coverage: > 30x, > 90% of genome; only SNVs, NO indel, NO CNV, NO structural variants
  • + 2.5 million SNP Omni array

Variant Statistics:

  • Total: 3,348,002
  • In genes: 1,280,794
  • In coding regions: 18,857
  • In UTR: 25,877
  • Splice site: 2,336
  • Stop/Gain: 86
  • Stop/Lost: 31
  • Non-synonymous: 9,988
  • Synonymous: 8,751

Reports: 1600 genes for 1221 conditions (exonic variants). Only “clinically significant variants” are discussed (= Mendelian, high penetrance).

My result: 5,377 variants =

  • 558 nothing reported in literature
  • 3,959 benign
  • 854 likely benign
  • 6 of clinical significance: 2 likely pathogenic variants, 2 “suspicious”, carrier of 2 pathogenic variants
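The reported categories can be tallied to check that they account for all 5,377 variants and to see how small the clinically relevant share is. The counts below are the ones on the slide; the dictionary layout is just one convenient way to hold them.

```python
# Counts as reported on the slide for the 5,377 variants in the clinical report.
reported = {
    "nothing reported in literature": 558,
    "benign": 3959,
    "likely benign": 854,
    "of clinical significance": 6,
}

total = sum(reported.values())
for label, count in reported.items():
    print(f"{label}: {count} ({100 * count / total:.1f}%)")
print(f"total: {total}")
```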

slide-34
SLIDE 34

Human Complex Genetics

The Next Frontiers:

>>> NEXT GEN SEQUENCING:

  • EXOME SEQUENCING
  • FULL GENOME SEQUENCING
  • RNA SEQUENCING
  • MICROBIOME SEQUENCING

>> Full genome sequencing of individual cells... (cancer, brain, immunology, ...)
>> Full genome sequencing of several cells from a subject
>> Coupling of genomic levels: DNA > methylation > RNA > protein
>> Microbiome Sequencing (100x more bacterial cells than human cells per subject):
   > Flora of Intestine, Skin, Nose, Eye, Mouth, etc.

slide-35
SLIDE 35
Population Genomics: what have we learned?

  • New Biology: Dozens of novel genes/pathways discovered to be involved in disease phenotypes and risk factors
  • Potential for Prediction: A still increasing part of heritability of phenotypes is being explained
  • Better Epidemiology: Mendelian Randomization is now more feasible to analyse causality of “classic” epidemiological associations
  • High Impact and Exemplary: Large-scale international collaborations allow for very robust evidence for genetic & genomic discoveries

>> Translational Research based on these discoveries is opportune
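One simple form of Mendelian Randomization is the Wald ratio: the effect of a genetic variant on the outcome divided by its effect on the risk factor. The sketch below shows that estimator with a first-order standard error that ignores uncertainty in the variant-exposure effect. The input numbers are hypothetical, and the estimate is only causal under the usual instrumental-variable assumptions.

```python
def wald_ratio(beta_gene_outcome: float, se_gene_outcome: float, beta_gene_exposure: float):
    """Wald ratio estimate of the exposure-outcome effect and its approximate standard error."""
    if beta_gene_exposure == 0:
        raise ValueError("the variant must be associated with the exposure")
    estimate = beta_gene_outcome / beta_gene_exposure
    # First-order approximation: treats the gene-exposure effect as known exactly.
    standard_error = se_gene_outcome / abs(beta_gene_exposure)
    return estimate, standard_error


# Hypothetical inputs: the variant raises the exposure by 0.10 SD per allele
# and changes the outcome by -0.02 (standard error 0.005) per allele.
estimate, standard_error = wald_ratio(-0.02, 0.005, 0.10)
print(f"estimated effect of exposure on outcome: {estimate:.2f} (se {standard_error:.2f})")
```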

slide-36
SLIDE 36

Special Thx to:

  • NWO, ZonMW, NGI-NCHA
  • EU-GEFOS
  • EU-GENOMOS
  • EU-TREATOA
  • CHARGE
  • GIANT
  • EAGLE
  • etc.

www.glimdna.org

Fernando Rivadeneira, Karol Estrada, Anke Ennemans, Mila Jhamai, Joyce van Meurs, Hanneke Kerkhof, Carolina Medina, Ramazan Buyukcelic, Carola Zillikens, Martha Castano-Betancourt, Pascal Arp, Maria Jongerden, Robert Kraaij, Slavik Koval, Michael Verbiest, Liz Herrera-Duran, Annemieke Verkerk, Marjolein Peters, Marijn Verkerk, Jeroen van Rooij, Lisette Stolk, Ling Oei, Sarah Hinxton, Manushka Ganesh, Anis Abuseiris, Stephan Nouwens