
Epigenomic enrichment analysis using Bioconductor - EuroBioc 2019



  1. Epigenomic enrichment analysis using Bioconductor EuroBioc 2019 – Brussels Dario Righelli – PhD Istituto per le Applicazioni del Calcolo «M. Picone» – CNR - Napoli d.righelli@na.iac.cnr.it || dario.righelli@gmail.com drighelli

  2. What’s the aim? Compare methods and provide guidelines on epigenomic data analysis

  3. ATAC-seq dataset
     o Before Fear Induction Condition (E0): 4 biological replicates
     o After Fear Induction Condition (E1): 4 biological replicates
     o Catching differences in open chromatin regions
     o Yijing Su et al. 2017 - Nature Neuroscience - Neuronal activity modifies the chromatin accessibility landscape in the adult brain

  4. ChIP-seq dataset (NULL dataset)
     o Home Cage Controls: Histone 3, Lysine 9 Acetylation (H3K9ac)
     o 9 biological replicates
     o How many random differences are we able to catch inside a control dataset?

  5. BWA and Bowtie2 perform the same
     o The most used aligners for epigenomics data
     o Correlation computed on ChIP-seq data coverages
     o Used the deepTools plotCorrelation tool
     o Correlations between the coverages of the same samples in the BWA and Bowtie2 BAMs have a value of 1
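The coverage comparison on this slide can be illustrated with a minimal sketch. It is not the deepTools implementation: it assumes coverage has already been summarised as read counts per genomic bin, it uses Pearson correlation (the slide does not say which method was chosen), and all counts below are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-bin read counts for one sample (illustrative values only).
coverage_bwa = [12, 40, 7, 0, 55, 23, 9, 31]
coverage_bowtie2 = [12, 40, 7, 0, 55, 23, 9, 31]   # identical coverage
coverage_other = [11, 42, 7, 1, 50, 25, 9, 30]     # slightly different coverage

r_same = pearson(coverage_bwa, coverage_bowtie2)   # identical profiles
r_close = pearson(coverage_bwa, coverage_other)    # similar, not identical
print(round(r_same, 6), round(r_close, 4))
```

A correlation of 1 between the two aligners means the binned coverage profiles are linearly identical, which is how the slide's result is read here.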

  6. A Bioconductor Approach
     o MACS2 (not in Bioconductor): most used peak caller; broad and narrow peaks options
     o DEScan2: has a peak detector in R; peak resolution -> bin size; can work with external peaks
     o DiffBind: no peak detection; fast on matrix construction; uses external peaks
     o CSAW: starts from BAM files; computes a matrix of bins x samples
     o edgeR: widely used method; very flexible in usage
     [Diagram: Peak Callers (DEScan2, MACS2 Broad/Narrow, CSAW) -> Peak Consensus & Matrices (DEScan2, DiffBind, CSAW) -> Differential Enrichment (edgeR)]

  7. Counts Normalization Affects Differentially Accessible Regions (DARs)
     ATAC-seq dataset
     o Pay attention to the normalization process
     o A common first attempt is to apply a classic RNA-seq normalization
     o Different normalizations do not always give the same results
     o A more specific normalization may be required for this kind of data
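A small sketch of why the choice of normalization can change which regions look differential. It compares two generic scaling schemes, total-count scaling and upper-quartile scaling. Neither is claimed to be what the talk used, and the counts are hypothetical: one region in the second sample is strongly opened, which inflates that sample's total.

```python
import math
import statistics

def total_count_factors(samples):
    """Scaling factors proportional to each sample's total count."""
    totals = [sum(s) for s in samples]
    mean_total = sum(totals) / len(totals)
    return [t / mean_total for t in totals]

def upper_quartile_factors(samples):
    """Scaling factors proportional to the 75th percentile of nonzero counts."""
    uq = [statistics.quantiles([c for c in s if c > 0], n=4)[2] for s in samples]
    mean_uq = sum(uq) / len(uq)
    return [q / mean_uq for q in uq]

def log2_fold_change(samples, factors, region, pseudo=1.0):
    """log2(sample 2 / sample 1) for one region after dividing by the factors."""
    a = samples[0][region] / factors[0]
    b = samples[1][region] / factors[1]
    return math.log2((b + pseudo) / (a + pseudo))

# Hypothetical counts per region; only the last region really changes.
e0 = [100, 120, 80, 90, 110, 100, 95, 105, 100, 100]
e1 = [100, 120, 80, 90, 110, 100, 95, 105, 100, 2000]
samples = [e0, e1]

# Region 0 has identical raw counts in both samples.
lfc_total = log2_fold_change(samples, total_count_factors(samples), region=0)
lfc_uq = log2_fold_change(samples, upper_quartile_factors(samples), region=0)
print(round(lfc_total, 2), round(lfc_uq, 2))
```

With total-count scaling the unchanged region appears strongly decreased, while upper-quartile scaling leaves it close to zero. This is one way the same counts can yield different DAR calls.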

  8. Comparing DARs across methods
     o The biggest overlap on the detected peaks is the one shared by all the methods
     o CSAW and DiffBind show a large number of non-overlapping regions
     o DEScan2 shows the lowest number of non-overlapping regions
     o The large number of non-overlapping regions from CSAW and DiffBind suggests a possibly high level of false-positive regions
     o Ad-hoc designed UpSet plot on GRanges
     o Based on the findOverlaps method
     [Figure: UpSet plot "ATAC-seq DARs"; sets: DEScan_Z10_K4_DARs, DiffBindNarrow, CSAW, DiffBindBroad; axes: Intersection Size (largest bar 16652), Set Size]
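The idea behind an UpSet plot on genomic ranges can be sketched as follows. This is not the authors' implementation and does not call findOverlaps. As a simplifying assumption, peaks from all methods are pooled and merged whenever they overlap, and each merged region is labelled with the set of methods that contributed to it. Method names and coordinates are hypothetical.

```python
from collections import Counter

def upset_counts(peak_sets):
    """Count merged regions by the combination of methods supporting them.

    peak_sets maps a method name to a list of (chrom, start, end) peaks
    with inclusive coordinates. Overlapping peaks are chained into one region.
    """
    tagged = sorted(
        (chrom, start, end, method)
        for method, peaks in peak_sets.items()
        for chrom, start, end in peaks
    )
    counts = Counter()
    current = None  # (chrom, end, methods) of the region being built
    for chrom, start, end, method in tagged:
        if current and current[0] == chrom and start <= current[1]:
            # Overlaps the open region: extend it and record the method.
            current = (chrom, max(current[1], end), current[2] | {method})
        else:
            if current:
                counts[frozenset(current[2])] += 1
            current = (chrom, end, {method})
    if current:
        counts[frozenset(current[2])] += 1
    return counts

# Hypothetical peaks for three methods.
peaks = {
    "DEScan2": [("chr1", 100, 200), ("chr1", 500, 600)],
    "CSAW": [("chr1", 150, 250), ("chr1", 900, 950)],
    "DiffBind": [("chr1", 180, 220), ("chr1", 550, 650), ("chr2", 100, 200)],
}
counts = upset_counts(peaks)
for methods, n in counts.most_common():
    print(sorted(methods), n)
```

Each key of `counts` corresponds to one bar of the intersection panel. Chaining overlaps is a design choice: a pairwise overlap test can assign regions differently when long peaks bridge several short ones.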

  9. Peaks contrasts on NULL dataset show no results
     H3K9ac ChIP-seq dataset
     o Compared performances on a null dataset of ChIP-seq H3K9ac samples
     o Performed 126 permutations of samples
     o Samples are randomly divided into two groups
     o All the possible permutations on 9 samples (126)
     o All the methods find mostly 0 Differential Enriched Peaks on the random conditions
     o Sometimes some differences have been found
     o With and without normalization
     [Figure: nElem (0 to 8) per method, grouped by normalized (NO/YES); the method labels on the x-axis are not recoverable]
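The figure of 126 matches the number of ways to choose 4 samples out of 9, i.e. C(9,4) = 126. The sketch below enumerates those splits under the assumption that the two groups had 4 and 5 samples, which the slide does not state. Sample labels are hypothetical.

```python
from itertools import combinations

samples = [f"S{i}" for i in range(1, 10)]  # 9 hypothetical control samples

# Every way to pick 4 samples for group A; the remaining 5 form group B.
splits = [
    (group_a, tuple(s for s in samples if s not in group_a))
    for group_a in combinations(samples, 4)
]

print(len(splits))  # 126
```

Each split can then be treated as a mock two-condition contrast. On a null dataset, any differential peak found in such a contrast is a false positive.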

  10. What’s Next? On-going and future works

  11. Some comparisons are still needed
     o Compare CSAW on ChIP-seq
     o Compare normalization methods with all epigenomics methods
     o Explore in-silico biological functions of results
     o Test on an ATAC-seq single-cell dataset

  12. Acknowledgements
     • Dr. Claudia Angelini - Istituto per le Applicazioni del Calcolo-CNR
     • Dr. Davide Risso - University of Padua
     • Dr. Lucia Peixoto - Elson S. Floyd College of Medicine, Washington State University
     • Dr. Timothy Triche Jr. - Van Andel Research Institute
     • Dr. Ben Johnson - Van Andel Research Institute
     • Thank you for your attention!

  13. Napoli R/Bioconductor Meetup
     o Since Nov 2018
     o R Consortium Array Group
     o At least 25 people at any event, with a good turnover of attendees
     o Eight meetups until now: R Package Creation; scRNA-seq Analysis; Differentially Methylated Regions Analysis; Microscope Image Processing; Chromosomal Copy Number Changes Detection; Bulk RNA-seq Differential Expression; Hi-C data analysis using HiCeekR; Metagenomics analysis workflow
     o Contacts: https://www.facebook.com/pg/NapoliRBiocMeetup, http://lists.moo.gs/mailman/listinfo/biocmeetup.naples, napoli.r.bioc@gmail.com

  14. Napoli R/Bioconductor Meetup
     • Part of a wider idea
     • Third city in the world: Boston (USA), New York (USA), Napoli (IT)
     • Useful to: share ideas and workflows; create new collaborations; extend the bioinformatics community

  15. Is there a best Aligner? Bowtie2 vs BWA

  16. Comparing DARs across methods (2)
     ATAC-seq dataset
     o Ad-hoc designed UpSet plot on GRanges
     o Based on the findOverlaps method
     o Results description
     [Figure: UpSet plot "ATAC-seq Regions Nar/Broa & DEScan2"; sets: DiffBindMACSNarrow, DEScanZ10K4, DiffBindMACSBroad, DEScanMACSBroad, DEScanMACSNarrow; axes: Intersection Size (largest bar 73600), Set Size]

  17. Duplicates Removal doesn’t impact peak detection
     o Diagonal correlations on counts matrices show that there are no big differences between samples with duplicates and samples without duplicates
     o Duplicates removed with samtools (rmDup)
     o DEScan2 counts matrices
     o DiffBind counts matrices
     [Figures: correlation heatmaps for DEScan2 and DiffBind, noDup_E0_1..noDup_E1_4 against withDup_E0_1..withDup_E1_4, color scale -1 to 1; bar plot "Final Peaks with/without Duplicates" for Dup_DEScan2, noDup_DEScan2, Dup_DiffBind, noDup_DiffBind, y-axis 0 to 40000]

  18. DEScan2 – Differential Enriched Scan 2
     • Filters out the peaks with a score lower than a user-defined threshold
     • Aligns the peaks over a user-defined number of samples
     • Different thresholds produce different trends in the number of final peaks detected
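The two steps described on this slide can be sketched as below. This is not the DEScan2 API: the function name, the parameters `min_score` and `min_samples`, and the peak values are all hypothetical. The label "Z10_K4" used elsewhere in the deck may correspond to a score threshold of 10 and 4 samples, but that reading is an assumption.

```python
def filter_and_align(peaks_by_sample, min_score, min_samples):
    """Keep regions scoring >= min_score that are found in >= min_samples samples.

    peaks_by_sample maps a sample name to a list of (chrom, start, end, score).
    Returns merged regions as (chrom, start, end, n_samples) tuples.
    """
    # Step 1: drop peaks whose score is below the threshold.
    kept = sorted(
        (chrom, start, end, sample)
        for sample, peaks in peaks_by_sample.items()
        for chrom, start, end, score in peaks
        if score >= min_score
    )
    # Step 2: merge overlapping peaks and track which samples support each region.
    regions = []
    for chrom, start, end, sample in kept:
        last = regions[-1] if regions else None
        if last and last[0] == chrom and start <= last[2]:
            last[2] = max(last[2], end)
            last[3].add(sample)
        else:
            regions.append([chrom, start, end, {sample}])
    # Step 3: keep only regions supported by enough samples.
    return [
        (chrom, start, end, len(found_in))
        for chrom, start, end, found_in in regions
        if len(found_in) >= min_samples
    ]

# Hypothetical peaks with scores for three samples.
peaks = {
    "E0_1": [("chr1", 100, 200, 12.0), ("chr1", 400, 450, 6.0)],
    "E0_2": [("chr1", 120, 210, 15.0), ("chr1", 405, 455, 11.0)],
    "E1_1": [("chr1", 130, 190, 10.5), ("chr1", 800, 900, 20.0)],
}
strict = filter_and_align(peaks, min_score=10, min_samples=2)
lenient = filter_and_align(peaks, min_score=5, min_samples=2)
print(len(strict), len(lenient))
```

Lowering the score threshold lets one more region reach the required sample support, which is consistent with the slide's point that different thresholds change the number of final peaks.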
