Genome-wide analysis of DNA methylation in samples from the Genotype-Tissue Expression (GTEx) project


slide-1
SLIDE 1

Genome-wide analysis of DNA methylation in samples from the Genotype-Tissue Expression (GTEx) project

Peter Hickey (@PeteHaitch)
Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
Single Cell Open Research Endeavour (SCORE), Walter and Eliza Hall Institute of Medical Research
Slides: www.bit.ly/AGTA2018

slide-2
SLIDE 2

GTEx to eGTEx via a ‘pilot’ study

slide-3
SLIDE 3

The Genotype-Tissue Expression (GTEx) project is an ongoing effort to build a comprehensive public resource to study [human] tissue-specific gene expression and regulation.

  • GTEx Consortium, 2015, Science 348, 648–660
slide-4
SLIDE 4

[eGTEx] extends the GTEx project to combine gene expression with additional intermediate molecular measurements on the same tissues.

  • eGTEx Project, 2017, Nat. Genet. 49, 1664–1670
slide-5
SLIDE 5

Hmm, this eGTEx study is gonna be huge. And the human brain is hella cool. Let’s do a pilot study.

  • Artist’s impression of conversation in Hansen and Feinberg labs, c. 2015

slide-6
SLIDE 6

BrainEpigenome (the ‘pilot’ study)

Rizzardi, L.*, Hickey, P.F.*, et al. Neuronal brain region-specific DNA methylation and chromatin accessibility are associated with neuropsychiatric disease heritability. bioRxiv (2017), doi:10.1101/120386 (in press, Nature Neuroscience)

UCSC Track Hub: www.bit.ly/BrainEpigenomeHub

slide-7
SLIDE 7

Map of human brain methylome was limited (c. 2015)

http://epigenomesportal.ca/ihec/grid.html (Build: 2017-10)

  • Little whole genome bisulfite sequencing (WGBS) data
  • Few (if any) biological replicates
  • Mostly bulk tissue samples
  • Few brain region-specific differentially methylated regions (DMRs) [1,2]

[1] Davies, M. N. et al. Functional annotation of the human brain methylome identifies tissue-specific epigenetic variation across brain and blood. Genome Biol. 13, R43 (2012).
[2] Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).

slide-8
SLIDE 8

A good map requires biological replicates, multiple brain regions, and multiple cell types

  • WGBS (bulk): n = 27
  • WGBS (NeuN sorted): n = 45
  • ATAC-seq (NeuN sorted): n = 22
  • RNA-seq (NeuN sorted): n = 20

[Figure: samples by Donor and Tissue. Tissues: BRNCTXB (frontal cortex), BRNACC (anterior cingulate cortex), BRNHPP (hippocampus), BRNNCC (nucleus accumbens)]

slide-9
SLIDE 9

Bulk tissue samples are uninformative for brain region-specific mCG due to variation of neuronal proportion in sampled tissue

[Figure; axis label: Tissue]

slide-10
SLIDE 10

Let’s try fluorescence activated nuclei sorting (FANS)

  • WGBS (bulk): n = 27
  • WGBS (NeuN+, NeuN-): n = 45
  • ATAC-seq (NeuN sorted): n = 22
  • RNA-seq (NeuN sorted): n = 20

[Figure: samples by Donor and Tissue. Tissues: BRNCTXB (frontal cortex), BRNACC (anterior cingulate cortex), BRNHPP (hippocampus), BRNNCC (nucleus accumbens)]

slide-11
SLIDE 11

And let’s do some more assays

  • WGBS (bulk): n = 27
  • WGBS (NeuN+, NeuN-): n = 45
  • ATAC-seq (NeuN+, NeuN-): n = 22
  • RNA-seq (NeuN+, NeuN-): n = 20

[Figure: samples by Donor and Tissue. Tissues: BRNCTXB (frontal cortex), BRNACC (anterior cingulate cortex), BRNHPP (hippocampus), BRNNCC (nucleus accumbens)]

slide-12
SLIDE 12

FANS & WGBS reveals brain region-specificity of mCG in NeuN+ (but not NeuN-) samples

CG-DMRs

Comparison         n          Size
NeuN+ vs. NeuN-    100,875*   70.0 Mb
NeuN+              13,074*    11.9 Mb
NeuN-              114*       0.1 Mb

*21,802 novel DMRs

slide-13
SLIDE 13

NeuN+ samples: mCH shows limited strand specificity, ‘tracks’ mCG, and can be used to identify CH-DMRs

[Figure: methylation tracks at GABBR2, chr9: 101,348,685 − 101,404,045 (width = 55,361, extended = 15,000). Tracks: mCG (S), mCG (L), mCA (+), mCT (+), mCA (−), mCT (−); axis ticks at 0.2, 0.5, 0.8]

[Figure: PCA of NeuN+ mCH (1 kb bins), PC1 (22.5%) vs. PC2 (8.0%). Legend: Region (BRNCTXB, BRNACC, BRNHPP, BRNNCC; also shown as BA24, BA9, HC, NAcc); Context & strand (A: mCA (+), a: mCA (−), T: mCT (+), t: mCT (−))]

CH-DMRs

Samples   n         Size
NeuN+     15,029†   39.6 Mb‡

†Before merging across strand and context
‡After merging across strand and context

slide-14
SLIDE 14

Enrichment of DMRs over genomic features

[Figure: heatmap of log2(OR) for CG-DMR (NeuN+) and CH-DMR (NeuN+); colour scale “Value” from −4 to 4. Features: Satellite, LINE, LTR, intergenic, five_utr, Low_complexity, Simple_repeat, DNA, SINE, OpenSea, CGI, promoter, Shores, three_utr, exonic, Shelves, intronic, DEG promoters, CG-DMRs (NeuN+), DEGs, CH-DMRs (NeuN+), FANTOM5, H3K27ac, OCR (union), CG-DMRs (POS) union]
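The heatmap on this slide reports enrichment as log2(OR). As a minimal sketch of what such a value means, the snippet below computes a log2 odds ratio from a 2x2 overlap table. The counts are made up for illustration, and the choice of counting unit (bases, CpGs, or regions) is an assumption; the slides do not say how the overlaps were tallied.

```python
import math


def log2_odds_ratio(a, b, c, d, pseudocount=0.0):
    """Return the log2 odds ratio for a 2x2 table.

    a: units inside DMRs and inside the feature
    b: units inside DMRs but outside the feature
    c: units outside DMRs but inside the feature
    d: units outside both
    """
    a, b, c, d = (x + pseudocount for x in (a, b, c, d))
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive; consider a pseudocount")
    return math.log2((a * d) / (b * c))


# Hypothetical counts, not taken from the talk.
enriched = log2_odds_ratio(400, 600, 1000, 8000)   # DMRs over-represented in the feature
depleted = log2_odds_ratio(50, 950, 2000, 7000)    # DMRs under-represented in the feature

print(f"enriched feature: log2(OR) = {enriched:.2f}")
print(f"depleted feature: log2(OR) = {depleted:.2f}")
```

A positive value means DMRs overlap the feature more often than the rest of the genome does, a negative value means less often, and 0 means no difference.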

slide-15
SLIDE 15

Enrichment of DMRs over genomic features

[Figure: same log2(OR) heatmap as on slide 14]

  • CG-DMRs and CH-DMRs co-occur
  • CG-DMRs are enhancer-centric
  • CH-DMRs are enriched over…

slide-16
SLIDE 16

Enrichment of DMRs over genomic features

[Figure: same log2(OR) heatmap as on slide 14]

  • CG-DMRs and CH-DMRs co-occur
  • CG-DMRs are enhancer-centric
  • CH-DMRs are enriched over genes (D…

slide-17
SLIDE 17

Enrichment of DMRs over genomic features

[Figure: same log2(OR) heatmap as on slide 14]

  • CG-DMRs and CH-DMRs co-occur
  • CG-DMRs are enhancer-centric
  • CH-DMRs are DEG-centric

slide-18
SLIDE 18

CG-DMRs in NeuN+ samples are enriched for GWAS heritability of neuropsychiatric traits

  • Stratified linkage disequilibrium score regression*
  • 27 ‘brain-linked’ traits (e.g., schizophrenia, ADHD)
  • 3 ‘non-brain-linked’ traits (e.g., height)

*Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. (2015), doi:10.1038/ng.3404

slide-19
SLIDE 19

eGTEx (work in progress)

eGTEx Project. Enhancing GTEx by bridging the gaps between genotype, gene expression, and disease. Nature Genetics (2017), doi:10.1038/ng.3969

slide-20
SLIDE 20

eGTEx study design

eGTEx Project. Enhancing GTEx by bridging the gaps between genotype, gene expression, and disease. Nature Genetics (2017), doi:10.1038/ng.3969

slide-21
SLIDE 21

Re-wrote bsseq to process and analyse eGTEx-sized (and bigger) datasets

  • Processed data is too large to store and operate on in-memory (10s – 100s of GB)
  • Data stored on-disk in HDF5 file
  • Improved parallelization of key steps
      • Importing files
      • Smoothing
      • DMR calling
      • Permutation testing
  • Available through Bioconductor
      • https://bioconductor.org/packages/bsseq/
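
The idea of keeping processed data on disk and operating on it one block at a time can be sketched as follows. This is not how bsseq is implemented (bsseq is an R/Bioconductor package and, per the slide, stores data in HDF5); the sketch uses a flat binary file and Python's standard library as a stand-in, and every function and file name in it is hypothetical.

```python
import os
import tempfile
from array import array


def write_counts(path, values):
    """Write unsigned integer counts to a flat binary file (stand-in for an on-disk dataset)."""
    with open(path, "wb") as fh:
        array("I", values).tofile(fh)


def iter_blocks(path, block_size):
    """Yield successive blocks of counts so that only one block is in memory at a time."""
    itemsize = array("I").itemsize
    with open(path, "rb") as fh:
        while True:
            raw = fh.read(block_size * itemsize)
            if not raw:
                break
            block = array("I")
            block.frombytes(raw)
            yield block


def mean_methylation(m_path, cov_path, block_size=4):
    """Compute sum(methylated reads) / sum(coverage), accumulated block by block."""
    total_m = 0
    total_cov = 0
    for m_block, cov_block in zip(iter_blocks(m_path, block_size),
                                  iter_blocks(cov_path, block_size)):
        total_m += sum(m_block)
        total_cov += sum(cov_block)
    return total_m / total_cov if total_cov else float("nan")


# Toy data: methylated read counts and coverage at 10 loci.
with tempfile.TemporaryDirectory() as tmp:
    m_path = os.path.join(tmp, "M.bin")
    cov_path = os.path.join(tmp, "Cov.bin")
    write_counts(m_path, [8, 2, 5, 0, 9, 1, 7, 3, 6, 4])
    write_counts(cov_path, [10] * 10)
    result = mean_methylation(m_path, cov_path, block_size=4)

print(f"mean methylation: {result:.2f}")
```

Because each block is read and summarised independently, the same pattern lends itself to handing blocks to separate workers, which is one plausible route to the kind of parallelization listed above.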
slide-22
SLIDE 22

mCG distinguishes eGTEx samples by tissue

slide-23
SLIDE 23

eGTEx NeuN+ samples are (mostly) consistent with BrainEpigenome NeuN+ samples

slide-24
SLIDE 24

eGTEx NeuN+ samples are (mostly) consistent with BrainEpigenome NeuN+ samples

slide-25
SLIDE 25

eGTEx NeuN+ samples are (mostly) consistent with BrainEpigenome NeuN+ samples

slide-26
SLIDE 26

5-group: 16x as many CG-DMRs in eGTEx NeuN+ samples as in BrainEpigenome NeuN+ samples

CG-DMRs

Comparison   n         Size
5-group      181,146   196.9 Mb

slide-27
SLIDE 27

Basal ganglia: Discover 2x as many CG-DMRs in eGTEx NeuN+ samples as in BrainEpigenome NeuN+ samples

CG-DMRs

Comparison      n         Size
5-group         181,146   196.9 Mb
Basal ganglia   16,866    24.0 Mb
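
The fold changes quoted in the slide titles can be checked against the table values with a few lines of arithmetic. The sketch below assumes the comparison is against the BrainEpigenome NeuN+ row reported earlier in the deck (13,074 CG-DMRs, 11.9 Mb); the slides do not state which row was used, so treat that choice as an assumption.

```python
# CG-DMR counts and total sizes as reported on the slides.
# Assumption: the reference is the BrainEpigenome NeuN+ row (13,074 DMRs, 11.9 Mb).
brain_epigenome_neun_pos = {"n": 13_074, "size_mb": 11.9}
egtex = {
    "5-group": {"n": 181_146, "size_mb": 196.9},
    "Basal ganglia": {"n": 16_866, "size_mb": 24.0},
}


def fold_changes(new, ref):
    """Ratio of each quantity in `new` to the same quantity in `ref`."""
    return {key: new[key] / ref[key] for key in ref}


ratios = {name: fold_changes(row, brain_epigenome_neun_pos)
          for name, row in egtex.items()}

for name, r in ratios.items():
    print(f"{name}: {r['n']:.1f}x by count, {r['size_mb']:.1f}x by size")
# 5-group: 13.9x by count, 16.5x by size
# Basal ganglia: 1.3x by count, 2.0x by size
```

Under that assumption, the rounded "16x" and "2x" figures line up with the ratios of total DMR size (Mb) rather than with the ratios of DMR counts.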

slide-28
SLIDE 28

Hippocampus: What the hell is going on?

CG-DMRs

Comparison      n         Size
5-group         181,146   196.9 Mb
Basal ganglia   16,866    24.0 Mb
Hippocampus     11,702    24.4 Mb

slide-29
SLIDE 29

Ongoing eGTEx analyses

  • Complete analyses of CG-DMRs
  • Identify CH-DMRs and analyse
  • Stratified linkage disequilibrium score regression
  • Do BrainEpigenome results replicate?
  • What can brain region-specific DMRs tell us?
  • Variably methylated regions (VMRs)
  • Allele-specific methylation using phased GTEx genomes
  • Use sorted data to deconvolute bulk brain samples
  • Integration with other GTEx and eGTEx data

[Figure legend: BRNNCC, BRNCDT, BRNPTM, THYROID]

slide-30
SLIDE 30

Summary

  • BrainEpigenome
      • FANS + WGBS reveals many brain region-specific CG-DMRs and CH-DMRs for NeuN+ (but not NeuN-) samples.
      • Neuronal CG-DMRs are enriched for heritability of several neurological, psychiatric, and behavioral-cognitive phenotypes.
  • eGTEx
      • More tissues + more replicates = huge increase in DMRs.
      • The scale of these projects necessitated extensive improvements to computational methods and software engineering.
      • There will still be heaps of analyses on the table after the initial eGTEx publication(s).
      • Get involved!
slide-31
SLIDE 31

Acknowledgements

  • Dr. Lindsay Rizzardi
  • Assoc. Prof. Kasper Hansen
  • Prof. Andy Feinberg

Sequencing gurus: Rakel Tryggvadóttir, Adrian Idrizi, Colin Callahan
ATAC-seq experiments: Varenka Rodriguez DiBlasi
Flow sorting: Hao Zhang and Hopkins Flow Facility
Funding: eGTEx (U01MH104393), CFAR (5P30AI094189-04, 1S10OD016315-01, and 1S10RR13777001), AGTA Travel Award
Donors and families: NIH NeuroBioBank at the University of Maryland & University of Pittsburgh

slide-32
SLIDE 32

Links

Papers

Rizzardi, L.*, Hickey, P.F.*, et al. Neuronal brain region-specific DNA methylation and chromatin accessibility are associated with neuropsychiatric disease heritability. bioRxiv (2017), doi:10.1101/120386 (in press, Nature Neuroscience)

eGTEx Project. Enhancing GTEx by bridging the gaps between genotype, gene expression, and disease. Nature Genetics (2017), doi:10.1038/ng.3969

Genome Browser

www.bit.ly/BrainEpigenomeHub

Slides

www.bit.ly/AGTA2018

Software

http://bioconductor.org/packages/bsseq/

@PeteHaitch

slide-33
SLIDE 33

Bonus slides

slide-34
SLIDE 34

eGTEx capture bisulfite-sequencing study

  • Aim: Study genetic influence on DNA methylation in human brain
  • Assay: Targeting 46 Mb (1 million CpGs) with Roche NimbleGen capture
      • 55% of CpGs not captured by microarrays or other targeted panels
      • CG-DMRs
          • Neuronal (BrainEpigenome and eGTEx)
          • NeuN+ vs. NeuN- (BrainEpigenome)
          • GABAergic vs. glutamatergic [1]
      • CG-VMRs (eGTEx)
      • Haplotype-dependent allele-specific DMRs and meQTLs [2]
      • Fetal brain meQTLs [3]
      • ‘Epigenetic age’ CpGs [4]
  • Samples: > 100 donors (BRNCTXB, BRNCDT, BRNNCC, BRNHPP, and THYROID)

[1] Dracheva et al., unpublished
[2] Do, C. et al. Mechanisms and Disease Associations of Haplotype-Dependent Allele-Specific DNA Methylation. Am. J. Hum. Genet. (2016)
[3] Court, F. et al. Genome-wide parent-of-origin DNA methylation analysis reveals the intricacies of human imprinting and suggests a germline methylation-independent mechanism of establishment. Genome Res. (2014)
[4] Horvath, S. DNA methylation age of human tissues and cell types. Genome Biol. (2013)

slide-35
SLIDE 35

GTEx -> eGTEx

[1] GTEx Consortium. Human genomics. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science 348, 648–660 (2015).
[2] https://gtexportal.org/home/tissueSummaryPage
[3] eGTEx Project. Enhancing GTEx by bridging the gaps between genotype, gene expression, and disease. Nat. Genet. 49, 1664–1670 (2017).