monaLisa: MOtif aNAlysis with Lisa
European Bioconductor Meeting 2019
Dania Machlab, Lukas Burger, Michael Stadler
Friedrich Miescher Institute for Biomedical Research
Background and Motivation
Francois Spitz & Eileen E. M. Furlong (2012) Nature Reviews Genetics
Co-binding, Chromatin remodeling, Blocking repositioning, Architectural role
Use monaLisa to:
- Identify enriched motifs
- Select motifs explaining observed changes
Background and Motivation
[Figure: genome schematic with Gene A and an enhancer, showing tracks for ATAC-seq (Condition 1, Condition 2), RNA-seq (Condition 1, Condition 2) and predicted TFBS]
Identify Enriched Motifs
[Figure: motif enrichment heatmaps. Panel and axis labels: enrichment (log2), FDR (−log10), Percent G+C, delta methylation density of promoters. Motifs shown: CTCF, CTCFL, RARAvar2, Rarbvar2, KLF4, Klf1, Klf12, E2F7, BHLHE41, KLF13, ZEB1, ERG, ETS1, ETV5, ELK3, ETV1, ETV4, FEV, FLI1, ERF, ETV3, ID4, KLF14, SP4]
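The log2 enrichment shown in the figure can be illustrated with a small sketch. It assumes regions have been grouped into bins (for example by delta methylation or logFC) and that each bin is compared against all other bins pooled together. The function names, the pseudocount and the toy counts are hypothetical, and this is not necessarily how monaLisa computes its enrichment or FDR values.

```python
import math

def log2_enrichment(fg_hits, fg_total, bg_hits, bg_total, pseudocount=1.0):
    """log2 ratio of the motif-hit fraction in foreground vs. background regions."""
    # The pseudocount keeps the ratio finite when a bin has no hits (assumption).
    fg_frac = (fg_hits + pseudocount) / (fg_total + 2 * pseudocount)
    bg_frac = (bg_hits + pseudocount) / (bg_total + 2 * pseudocount)
    return math.log2(fg_frac / bg_frac)

def binned_enrichment(hits_per_bin, regions_per_bin):
    """Compare each bin against all remaining bins pooled as background."""
    total_hits = sum(hits_per_bin)
    total_regions = sum(regions_per_bin)
    return [
        log2_enrichment(h, n, total_hits - h, total_regions - n)
        for h, n in zip(hits_per_bin, regions_per_bin)
    ]

# Hypothetical counts: regions with a motif hit in three bins of 100 regions each.
scores = binned_enrichment([10, 10, 40], [100, 100, 100])
```

With these toy counts the third bin gets a positive score and the first two get negative scores. A significance test per bin would be needed on top of this to obtain the FDR values.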
Select Motifs using Stability Selection
Randomized lasso stability selection
Meinshausen & Bühlmann (2010) Journal of the Royal Statistical Society
Y ~ X: perform regularized regression, with the observed logFC as response (Y) and the predicted TFBS as predictors (X)
[Figure: comparison of Lasso with Cross Validation, Lasso Stability Selection and Randomized Lasso Stability Selection. Labels: regularization parameter, weakness parameter, true signal, noise, small 𝜇, large 𝜇]
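A minimal sketch of randomized lasso stability selection in the sense of Meinshausen & Bühlmann (2010) may help make the idea concrete. It is a plain-Python illustration, not the glmnet or stabs implementation. It assumes a single fixed regularization parameter `lam` instead of a regularization path, omits the selection cutoff and error control, and uses simulated data. All function names, parameter values and the data are hypothetical.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; return exactly 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def weighted_lasso(X, y, penalties, n_iter=50):
    """Coordinate descent for (1/2n)||y - Xb||^2 + sum_j penalties[j] * |b_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue
            # Correlation of feature j with the partial residual (feature j removed)
            rho = sum(col[i] * residual[i] for i in range(n)) / n + z * beta[j]
            new_beta = soft_threshold(rho, penalties[j]) / z
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                beta[j] = new_beta
    return beta

def randomized_lasso_stability(X, y, lam, weakness=0.5, n_subsamples=50, seed=0):
    """Selection probability per feature over random half-size subsamples."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        rows = rng.sample(range(n), n // 2)
        Xs = [X[i] for i in rows]
        ys = [y[i] for i in rows]
        # Randomized lasso: feature j is penalized by lam / W_j, W_j ~ U[weakness, 1]
        penalties = [lam / rng.uniform(weakness, 1.0) for _ in range(p)]
        beta = weighted_lasso(Xs, ys, penalties)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    return [c / n_subsamples for c in counts]

# Simulated data (assumption): features 0 and 1 carry true signal, the rest are noise.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(80)]
y = [2.0 * row[0] - 2.0 * row[1] + data_rng.gauss(0, 0.5) for row in X]
probs = randomized_lasso_stability(X, y, lam=0.3)
```

Here `probs` holds one selection probability per predictor. Features would then be kept if their probability exceeds a chosen cutoff. In the setting of the slides, each column of `X` would correspond to the predicted TFBS of one motif and `y` to the observed logFC.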
Select Motifs Explaining Observed Changes in Accessibility
glmnet::glmnet and stabs::stabsel are used
[Figure: selection probability per motif (0.0 to 1.0) together with a Pearson correlation (Pear. Cor.) heatmap (−1 to 1). Motifs shown: NFATC1, TEAD2, TEAD3, NKX2−8, Nkx2−5(var.2), NFIC, KLF5, GATA3, Gata1, GATA1::TAL1, HNF1A, Nr2f6, Hnf4a]
Summary and Outlook
- We can identify TFs enriched in regions of interest that display certain log-fold changes
- We can select TFs that are likely to explain the observed log-fold changes using stability selection
- We can use any fold-change defined on regions of interest (ATAC-seq, methylation, expression, ChIP-seq …) to select motifs explaining the observed logFC
- We want to look at motif enrichment without using existing databases (unbiased view)
- Enriched k-mers, grouping them, aligning them to predict the motif
- Submit to Bioconductor
- https://github.com/fmicompbio/monaLisa
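The k-mer idea from the outlook could start from a simple enrichment score per k-mer, as in the sketch below. It only covers the first step (finding enriched k-mers) and leaves out grouping and alignment. The function names, the pseudocount and the toy sequences are hypothetical, reverse complements are ignored for brevity, and this is not taken from monaLisa.

```python
import math
from collections import Counter

def count_kmers(sequences, k):
    """Count all overlapping k-mers across a list of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts

def kmer_log2_enrichment(foreground, background, k, pseudocount=1.0):
    """log2 ratio of k-mer frequency in foreground vs. background sequences."""
    fg = count_kmers(foreground, k)
    bg = count_kmers(background, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    n_kmers = 4 ** k  # number of possible k-mers, used to normalize the pseudocount
    scores = {}
    for kmer in set(fg) | set(bg):
        fg_freq = (fg[kmer] + pseudocount) / (fg_total + pseudocount * n_kmers)
        bg_freq = (bg[kmer] + pseudocount) / (bg_total + pseudocount * n_kmers)
        scores[kmer] = math.log2(fg_freq / bg_freq)
    return scores

# Toy sequences (assumption): the foreground shares the 6-mer GATAAG.
foreground = ["ACGTGATAAGCT", "TTGATAAGGA", "CCGATAAGTT"]
background = ["ACGTACGTACGT", "TTTTCCCCGGGG", "AGCTAGCTAGCT"]
scores = kmer_log2_enrichment(foreground, background, k=6)
top = max(scores, key=scores.get)
```

In this toy example the shared 6-mer comes out as the top-scoring k-mer. Real data would additionally need a matched background and a significance test.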