Global profiling of methylation in the cancer genome
Andy Feber, UCL Cancer Institute, University College London, UK
Illumina, Manchester, 7th September 2010
What determines a cancer phenotype?
genetic factors
non-genetic factors
- epigenetics
- environment
Epigenetics and Cancer
- How are epigenetic changes involved in cancer?
- Definition: “The study of heritable changes in gene expression that occur independent of changes in the primary DNA sequence”
Background - epigenetics
Histone modifications, e.g.,
- acetylation
- methylation
DNA methylation
Non-coding RNA (ncRNA) & micro RNA (miRNA)
One Genome… Many Methylomes
DNA Methylation
- DNA methylation is the addition of a methyl group to cytosine, generally in CpG dinucleotides
- There are 28.6 million CpG sites in the human genome, 70% of which are methylated
- CpG-rich regions, known as CpG islands (CGIs), are generally located near the start of genes and are associated with promoters
- CGIs were previously thought to be the key sites of epigenetic regulation of gene expression and have been the main focus of epigenetic research
- Recently, changes in methylation at regions outside CGIs, known as CGI shores, have been shown to be more significantly associated with gene regulation
- Only 7% of CpGs reside within CGIs, so many CpGs remain unanalyzed by conventional approaches (microarray, PCR, bisulphite sequencing)
- Next-generation sequencing now allows the profiling of over 100 × 10^6 loci at one time
- Combined with enrichment strategies such as methylated DNA immunoprecipitation (MeDIP), sequencing (MeDIP-seq) allows whole-genome methylation (the methylome) to be assessed in a single experiment
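To put the percentages quoted above into absolute numbers, the arithmetic can be sketched as follows. The variable names are illustrative only, and the percentages are taken as exact for the purpose of the calculation.

```python
# Figures quoted on the slide; integer arithmetic avoids rounding noise.
TOTAL_CPG = 28_600_000          # CpG sites in the human genome
PCT_METHYLATED = 70             # percent of CpGs that are methylated
PCT_IN_CGI = 7                  # percent of CpGs that lie within CpG islands

methylated_cpg = TOTAL_CPG * PCT_METHYLATED // 100
cgi_cpg = TOTAL_CPG * PCT_IN_CGI // 100
non_cgi_cpg = TOTAL_CPG - cgi_cpg   # CpGs outside CGIs

print(f"methylated CpGs:   {methylated_cpg:,}")
print(f"CpGs within CGIs:  {cgi_cpg:,}")
print(f"CpGs outside CGIs: {non_cgi_cpg:,}")
```

On these figures, roughly 26.6 million CpGs lie outside CGIs, which is the population that CGI-focused approaches leave unexamined.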
MeDIP-seq
[Figure: short sequence reads (TCCGGTGG, CCTCACTCCGG, CGCGCCTCAC, TGATGTCGCG, GCTGATGTCG, TGTCGCGCC, TCGCGCCTC, CCTCACTCCG, CTCCGGTGG) aligned to the reference sequence ….CGTGATGTCGCGCCTCACTCCGGTGG…]
Determining methylation from read count
- How to determine absolute methylation levels both within a genome and
between genomes?
- Within a given genomic region, MeDIP enrichment is proportional to the
number of methylated CpG sites.
- Simple enrichment ratios/read counts do not accurately reflect the absolute
methylation levels within a particular region of interest.
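The point about enrichment can be illustrated with a minimal sketch. It assumes a simple linear model in which the expected MeDIP signal of a region scales with the number of methylated CpGs it contains. The function name and the parameters `r` and `base` are hypothetical and chosen only for illustration.

```python
def expected_medip_signal(n_cpg, methylation, r=1.0, base=0.0):
    """Expected MeDIP signal for one region under a simple linear model.

    n_cpg       -- number of CpG sites in the region
    methylation -- fraction of those CpGs that are methylated (0.0 to 1.0)
    r, base     -- hypothetical scale factor and background level
    """
    if not 0.0 <= methylation <= 1.0:
        raise ValueError("methylation must be between 0 and 1")
    return base + r * n_cpg * methylation


# A CpG-poor, fully methylated region and a CpG-rich, 25% methylated region
# produce the same expected signal, so read counts alone cannot separate them.
sparse_region = expected_medip_signal(n_cpg=10, methylation=1.0)
dense_region = expected_medip_signal(n_cpg=40, methylation=0.25)
print(sparse_region, dense_region)
```

Under this assumption, recovering absolute methylation requires correcting the read count for local CpG density.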
[Figure: MeDIP enrichment (read count) versus absolute methylation value for hypothetical genomic regions A (0%), B (50%) and C (100%)]
Bioinformatic challenge…
- Enrichment bias makes absolute methylation levels difficult to quantify
$$f(A \mid m) = \prod_{p} G\left(A_p \mid A_{\mathrm{base}} + r \sum_{c} C_{cp}\, m_c,\ \nu^{-1}\right)$$
Batman : Bayesian Tool for Methylation Analysis
MPNST Methylome
- To define the methylome (methylated genome) associated with a malignant phenotype
- Using MeDIP-seq to identify tumor-specific differential methylation that correlates with tumor progression/development
Aim
- Pools of ten cases per sample cohort
– Malignant Peripheral Nerve Sheath Tumors (MPNST)
– Benign neurofibromas (NF)
– Normal cultured Schwann cells (SC)
- Age and gender matched
- MPNST: 6 female, 4 male, median age 30.7 (range 12 to 58)
- NF: 6 female, 4 male, median age 27.7 (range 15 to 54)
Samples
Benign Disease
Familial (germline mutation in NF1)
Sporadic (often with alterations in NF1, e.g. LOH)
Plexiform
Dermal
Malignant Peripheral Nerve Sheath Tumours
10-15% develop malignant disease
Neurofibromatosis type 1 (NF1)
3000 cases/year
Only 20% of patients disease free after 5 years
Malignant Peripheral Nerve Sheath Tumors (MPNST)
Sample   Total number of reads   Total mapped reads   Total unique mapped reads*
MPNST    140119516               133145064            75918388
NF       140442616               134234980            81619250
SC       138120350               131484108            68697944
Read Stats
* Those with a Maq score of >10 and both paired reads mapping uniquely
- Covering ~68% of CpGs in each of the
three genomes.
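The read statistics can be turned into mapping rates with a few lines of arithmetic. This is a minimal sketch using the counts from the table. The dictionary layout and the function name `mapping_rates` are illustrative choices, not part of the original analysis.

```python
# (total reads, total mapped reads, total unique mapped reads) per pooled sample
read_stats = {
    "MPNST": (140119516, 133145064, 75918388),
    "NF": (140442616, 134234980, 81619250),
    "SC": (138120350, 131484108, 68697944),
}


def mapping_rates(total, mapped, unique):
    """Return (mapped fraction, uniquely mapped fraction) of all reads."""
    return mapped / total, unique / total


for sample, counts in read_stats.items():
    mapped_frac, unique_frac = mapping_rates(*counts)
    print(f"{sample}: {mapped_frac:.1%} mapped, {unique_frac:.1%} uniquely mapped")
```

In all three pools about 95% of reads map, while roughly half to just under 60% pass the stricter unique-mapping filter.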
Copy Number Correction
Sample   Batman v Infinium Pearson correlation
MPNST    0.78
NF       0.80
SC       0.77
MeDIP-seq Verification
- MeDIP-seq was initially verified using Illumina Infinium 27K Human BeadChips, which interrogate ~27,500 CpG sites across the genome
- Comparison of MeDIP-seq data with the arrays showed a high degree of correlation
- Similar to correlations observed between:
– BeadArray v bisulphite sequencing
– BATMAN v bisulphite sequencing
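A comparison of this kind comes down to a Pearson correlation between paired methylation estimates at the same CpG sites. The sketch below uses only the standard library. The function name and the toy values are hypothetical, not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired estimates (fraction methylated) at five CpG sites
batman_scores = [0.10, 0.35, 0.50, 0.80, 0.95]
array_betas = [0.05, 0.40, 0.45, 0.70, 0.90]
print(f"r = {pearson(batman_scores, array_betas):.2f}")
```

In practice each sample would contribute one such pair of vectors, restricted to the CpG sites covered by both platforms.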
Global changes in methylation
- What are the global changes involved in MPNST development?
- To assess changes in global methylation, the methylation status of each CpG site was binned into 3 methylation states: low (<40%), intermediate (40-60%) and high (>60%)
- Global analysis revealed a small change in global methylation (1%), compared to other tumours, which show global loss of methylation ranging from 10-20%
[Figure legend: intermediate methylation, low methylation, high methylation]
- Analysis of the regulatory features CGIs, CGI shores and promoters shows similar levels of global methylation between MPNST and Schwann cell controls
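The three-state binning described above can be sketched as follows. The thresholds come from the slide. The function name and the toy scores are illustrative.

```python
from collections import Counter


def methylation_state(percent):
    """Bin a methylation score (0-100%) into low, intermediate or high."""
    if percent < 40:
        return "low"
    if percent > 60:
        return "high"
    return "intermediate"   # 40-60% inclusive


# Hypothetical per-CpG methylation scores, in percent
scores = [5, 38, 40, 52, 60, 61, 97]
state_counts = Counter(methylation_state(s) for s in scores)
print(state_counts)
```

Comparing the proportion of CpGs in each state between two samples gives the kind of global summary shown in the figures.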
Global repeat methylation
- One of the most commonly cited features of the cancer methylome is hypomethylation of repeats
- Methylation over LINE and SINE repeats changes slightly; interestingly, LINE repeats appear to lose low-methylated CpGs
- The largest changes in global methylation are seen in satellite repeats, with a 25% change in methylation between MPNST and non-neoplastic Schwann cells
[Figure legend: intermediate methylation, low methylation, high methylation]
DMR - Differentially Methylated Regions
- Regions of differential methylation were defined by average Batman methylation scores over 1 kb
- Regions were called differentially methylated if they had an average difference of 33% in Batman methylation score
- Increasing numbers of DMRs during progression from non-neoplastic Schwann cell controls to MPNSTs
DMRs                  Hypermethylated   Hypomethylated
h2bDMR (SC v NF)      45239             46587
b2mDMR (NF v MPNST)   41886             45230
cDMR (SC v MPNST)     48391             53075
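The DMR definition above (average score per 1 kb, 33% difference) can be sketched as below. This rests on several assumptions that the slides do not confirm: windows are fixed and non-overlapping, the 33% threshold is an absolute difference in percentage points, and "hyper"/"hypo" is reported for the second sample relative to the first. All names and the toy data are hypothetical.

```python
from collections import defaultdict

WINDOW = 1000      # window size in bp (1 kb, as on the slide)
MIN_DIFF = 33.0    # minimum difference in mean methylation, percentage points


def window_means(sites):
    """Average methylation per window from (position, percent) pairs."""
    totals = defaultdict(lambda: [0.0, 0])
    for position, percent in sites:
        entry = totals[position // WINDOW]
        entry[0] += percent
        entry[1] += 1
    return {w: total / n for w, (total, n) in totals.items()}


def call_dmrs(sites_a, sites_b):
    """Return (start, end, direction) for windows where B differs from A."""
    means_a = window_means(sites_a)
    means_b = window_means(sites_b)
    dmrs = []
    for w in sorted(means_a.keys() & means_b.keys()):
        diff = means_b[w] - means_a[w]
        if abs(diff) >= MIN_DIFF:
            direction = "hyper" if diff > 0 else "hypo"
            dmrs.append((w * WINDOW, (w + 1) * WINDOW, direction))
    return dmrs


# Toy per-CpG scores for a control and a tumour sample on one chromosome
control = [(100, 80), (600, 90), (1200, 50), (2500, 10)]
tumour = [(100, 30), (600, 40), (1200, 55), (2500, 60)]
dmr_calls = call_dmrs(control, tumour)
print(dmr_calls)
```

Running the same comparison for each pair of cohorts (SC v NF, NF v MPNST, SC v MPNST) would yield counts of the kind tabulated above.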
Feature                        Hypermethylated   Hypomethylated
CGI                            385               79
CGI shores                     2119              1669
Promoters                      1097              1098
Non-CGI-associated promoters   293               175
Exons                          11858             11432
Introns                        61709             57632
miRNA                          22                30
Conserved regions              16535             27805
Satellite repeats              142               1398
LTR repeats                    14339             12665
LINE repeats                   34515             25359
SINE repeats                   32661             39502

Feature                        Hypermethylated   Hypomethylated
CGI                            49                47
CGI shores                     996               1382
Promoters                      484               812
Non-CGI-associated promoters   39                95
Exons                          7885              12104
Introns                        48086             49503
miRNA                          19                31
Conserved regions              18566             16113
Satellite repeats              128               259
LTR repeats                    10805             10773
LINE repeats                   25526             22110
SINE repeats                   28764             36448
DMRs in Genomic Features
- Comparison of DMRs in different genomic features shows in which regions
methylation changes during disease progression
- Association of feature DMRs with genes allows identification of potential candidate onco- and tumor-suppressor genes
SC v NF (h2bDMR) SC v MPNST (cDMRs)
DMR Enrichment
[Figure legend: hypermethylated, hypomethylated]
- Relative enrichment analysis was carried out to identify those features that have a significantly (p<0.001, red bars) higher number of DMRs than would be expected by chance
- Significant enrichment of hypomethylated satellite and SINE repeats, and of hypermethylated LINE repeats
- Of the regions assumed to be functionally relevant in the regulation of gene expression, only CGI shores and promoters (not associated with a CGI) were significantly enriched
- Previous studies have focused on CGIs and CGI-associated promoters, suggesting many sites potentially important in cancer have been missed
- Are DMRs enriched in specific genomic features?
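The slides do not state which statistical test was used for the enrichment analysis. As one possible illustration of "more DMRs than expected by chance", the sketch below computes a one-sided binomial tail probability. It assumes that each DMR independently falls in a feature with a probability equal to the fraction of tested regions covered by that feature. The function name and the example numbers are hypothetical.

```python
import math


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), with terms evaluated in log space."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if k <= 0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log(1.0 - p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


# Hypothetical example: 1000 DMRs, a feature covering 2% of tested regions,
# and 45 DMRs observed within that feature (20 would be expected by chance).
p_value = binomial_tail(n=1000, k=45, p=0.02)
print(f"enriched at p<0.001: {p_value < 0.001}")
```

A permutation-based approach that shuffles DMR positions across the genome would be an alternative that avoids the independence assumption.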
[Figure legend: hypermethylated, hypomethylated]
Enrichment in repeats
- Analysis of aberrant methylation in repeats located either within or outside introns showed a
distinct pattern of repeat methylation
- We see significant enrichment of both hypomethylated non-intronic SINEs and non-intronic satellite repeats
- Also significant enrichment of intronic SINE repeats in early disease
- Enrichment of hypermethylated intronic LINE repeats, as well as non-intronic LINEs
Discrete types of satellite repeats show enrichment
[Figure legend: hypermethylated, hypomethylated]
- Satellite repeats can be divided into 19 different types of repeat
- Enrichment analysis by satellite repeat type highlighted 2 specific types of repeat that undergo hypomethylation, SATR1 and ARL
- SATR1 hypomethylation appears to be an early event in tumourigenic progression, whereas ARL hypomethylation may be a later event
- Do satellite repeats undergo sequence-specific methylation?
- Knock-out of specific DNMT family members has been shown to alter methylation of specific satellite repeats
- What is the role of aberrant satellite repeat methylation in cancer?
Feature                        Hypermethylated   Hypomethylated
CGI                            385               79
CGI shores                     2119              1669
Promoters                      1097              1098
Non-CGI-associated promoters   293               175
Exons                          11858             11432
Introns                        61709             57632
miRNA                          22                30
Conserved regions              16535             27805
Satellite repeats              142               1398
LTR repeats                    14339             12665
LINE repeats                   34515             25359
SINE repeats                   32661             39502
DMRs in Genomic Features
- Where to start?
- 101,466 unique cDMRs
- Do DMRs associate with candidate genes?
SC v MPNST (cDMRs)
Candidate genes
MEST
- Imprinted region, differentially methylated in glioblastomas (which also have frequent NF1 mutations)
WT1
- Wilms tumor suppressor 1 gene
[Figure panel labels: MPNST, NF, SC]
Association of methylation with gene expression
- Is aberrant methylation a key driver of the tumourigenic process?
- Which regions of the genome show the strongest correlation with gene expression?
- Integration of independent gene expression data from MPNST (n=10) and NF (n=28): Henderson et al., 2005 (Affy U95) and Miller et al., 2009 (Affy U133+)
Effect of methylation on gene expression across canonical gene features
- Does methylation across canonical gene features differ between genes with high expression and genes with low expression in MPNST?
- The largest difference in methylation (13%) is observed in the 1st exon, which shows a strong inverse relationship with gene expression
- If CGI shores have a greater effect on gene expression than CGIs, is there a difference in methylation between genes with high and low expression?
Effect of methylation on gene expression: CGIs and CGI shores
- Strong inverse relationship in both upstream and downstream shores, with no difference in CGIs
- Largest difference in methylation (11%) seen ~800 bp to 1.5 kb outside the CGI, suggesting these regions are important in the regulation of gene expression
Does gene expression reflect methylation state?
- Partition clustering (with 10,000 permutations) of the expression of genes associated with DMRs between NF and MPNST shows a significant association for:
– hypermethylated CGI shores (p=0.0001)
– hypermethylated non-CGI promoters (p=0.0003)
– hypomethylated CGI shores (p=0.0001)
- Can the expression of genes associated with DMRs discriminate between disease phenotypes?
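The idea behind a permutation-derived p-value can be illustrated with a simple label-permutation test on the expression of a single gene. This is not the partition clustering procedure used in the study, only a minimal sketch of how shuffling sample labels yields an empirical p-value. The function names and expression values are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_perm=10000, seed=0):
    """Empirical two-sided p-value for a difference in group means."""
    rng = random.Random(seed)            # fixed seed for reproducibility
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)              # reassign sample labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:     # tolerance for floating-point ties
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_perm + 1)


# Hypothetical log-expression values of one DMR-associated gene
mpnst_expr = [8.1, 7.9, 8.4, 8.0, 8.3]
nf_expr = [5.2, 5.0, 5.5, 4.9, 5.1]
p_perm = permutation_p_value(mpnst_expr, nf_expr)
print(f"permutation p-value below 0.05: {p_perm < 0.05}")
```

With 10,000 permutations the smallest p-value this estimator can report is about 0.0001, which matches the resolution of the values quoted above.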
[Figure panels: Hyper CGI Shore (N=1056); Hyper non-CGI Promoter (N=702)]
[Figure panels: SOX10 (MPNST, NF); CDKN2A (MPNST, NF)]
Summary
- MeDIP-seq provides high-resolution methylation profiling of the human epigenome
- Provides insights into the role aberrant methylation plays in regions not accessible by other technologies
- Whole-genome methylation profiles can identify potential prognostic/diagnostic molecular markers of malignant development and progression
- Still not the whole picture; other epigenetic modifications are out there
– Non-CpG methylation
– 5-hydroxymethylcytosine
Acknowledgements
Lab: UCL Cancer Institute, UK:
Adrienne Flanagan, Andrew Teschendorff, Elia Stupka, Nadege Presneau, Bernadine Idowu
Gurdon Institute, UK:
Thomas Down
Barts and The London School of Medicine and Dentistry, UK:
Vardman Rakyan
Illumina, USA:
Gray Schroth, Zhang Lu
Funding:
SACT (Skeletal Action Cancer Trust)