MOtif aNAlysis with Lisa, European Bioconductor Meeting 2019



SLIDE 1

monaLisa

MOtif aNAlysis with Lisa

European Bioconductor Meeting 2019

Dania Machlab, Lukas Burger, Michael Stadler

Friedrich Miescher Institute for Biomedical Research

SLIDE 2

Background and Motivation

Francois Spitz & Eileen E. M. Furlong (2012) Nature Reviews Genetics

[Figure labels: Co-binding, Chromatin remodeling, Blocking repositioning, Architectural role]

Use monaLisa to:

  • Identify enriched motifs
  • Select motifs explaining observed changes
SLIDE 3

Background and Motivation

[Figure: genome view of Gene A and an enhancer, with tracks: ATAC-seq Condition 1, ATAC-seq Condition 2, RNA-seq Condition 1, RNA-seq Condition 2, Predicted TFBS]

SLIDE 4

Identify Enriched Motifs

[Figure: motif enrichment heatmaps. Labels: enrichment (log2), FDR (−log10), Percent G+C, delta methylation density of promoters. Motifs shown: CTCF, CTCFL, RARAvar2, Rarbvar2, KLF4, Klf1, Klf12, E2F7, BHLHE41, KLF13, ZEB1, ERG, ETS1, ETV5, ELK3, ETV1, ETV4, FEV, FLI1, ERF, ETV3, ID4, KLF14, SP4]
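
The slide reports a log2 enrichment and an FDR per motif but does not say how they are computed. As a rough illustration only, the sketch below assumes each motif is scored by comparing how many foreground and background regions contain a hit, using a one-sided hypergeometric (Fisher-type) test and Benjamini-Hochberg correction. The function names, the pseudocount, and the toy counts are all hypothetical and are not taken from monaLisa.

```python
import math

def log2_enrichment(fg_hits, fg_total, bg_hits, bg_total, pseudocount=1.0):
    """log2 ratio of the hit fraction in foreground vs. background regions."""
    fg_frac = (fg_hits + pseudocount) / (fg_total + pseudocount)
    bg_frac = (bg_hits + pseudocount) / (bg_total + pseudocount)
    return math.log2(fg_frac / bg_frac)

def hypergeom_pvalue(fg_hits, fg_total, bg_hits, bg_total):
    """One-sided P(X >= fg_hits) under the hypergeometric null."""
    hits = fg_hits + bg_hits
    total = fg_total + bg_total
    upper = min(hits, fg_total)
    tail = sum(math.comb(hits, k) * math.comb(total - hits, fg_total - k)
               for k in range(fg_hits, upper + 1))
    return tail / math.comb(total, fg_total)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical counts: (fg_hits, fg_total, bg_hits, bg_total) per motif.
counts = {
    "motifA": (40, 100, 50, 500),
    "motifB": (10, 100, 50, 500),
    "motifC": (2, 100, 60, 500),
}
names = list(counts)
pvals = [hypergeom_pvalue(*counts[m]) for m in names]
fdrs = benjamini_hochberg(pvals)
for name, fdr in zip(names, fdrs):
    print(name, round(log2_enrichment(*counts[name]), 2), fdr)
```

A heatmap like the one on this slide would then plot the log2 enrichment and −log10 of the adjusted p-value for each motif.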

SLIDE 5

Select Motifs using Stability Selection

Randomized lasso stability selection

Meinshausen & Bühlmann (2010) Journal of the Royal Statistical Society

Y ~ X: perform regularized regression (Y: observed logFC, X: predicted TFBS)

[Figure: comparison of Lasso with Cross Validation, Lasso Stability Selection, and Randomized Lasso Stability Selection. Labels: regularization parameter, weakness parameter, true signal, noise, small 𝜇, large 𝜇]
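
To make the idea concrete, here is a minimal pure-Python sketch of randomized lasso stability selection in the spirit of Meinshausen & Bühlmann (2010). It assumes a plain coordinate-descent lasso with no intercept (so X and y should be centered), repeated subsamples of half the data, and per-feature penalties `lam / W_j` with `W_j` drawn uniformly from `[weakness, 1]`. The function names, parameter values, and toy data are hypothetical; this is not the implementation behind the slides.

```python
import random

def soft_threshold(z, t):
    """Shrink z towards zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, penalties, n_iter=30):
    """Coordinate-descent lasso without intercept (X and y assumed centered).

    Minimises (1/2n) * ||y - X b||^2 + sum_j penalties[j] * |b_j|.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 initially
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, penalties[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def randomized_lasso_stability(X, y, lam, weakness=0.5, n_subsamples=50, seed=0):
    """Fraction of subsamples in which each feature gets a non-zero coefficient."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)  # subsample half of the regions
        Xs = [X[i] for i in idx]
        ys = [y[i] for i in idx]
        # Randomization step: each feature gets its own, randomly inflated penalty.
        penalties = [lam / rng.uniform(weakness, 1.0) for _ in range(p)]
        beta = lasso_cd(Xs, ys, penalties)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    return [c / n_subsamples for c in counts]

# Hypothetical toy data: 100 regions, 6 motifs; only motif 0 drives the logFC.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(100)]
y = [2.0 * row[0] + rng.gauss(0, 1) for row in X]
probs = randomized_lasso_stability(X, y, lam=0.4)
print([round(p, 2) for p in probs])
```

In this toy setup the true-signal feature should be selected in nearly every subsample, while noise features are selected only occasionally. That separation is what the selection probability is meant to expose.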

SLIDE 6

Select Motifs Explaining Observed Changes in Accessibility

Uses glmnet::glmnet and stabs::stabsel

[Figure: selection probability (0.0 to 1.0) per motif, with a Pearson correlation legend (Pear. Cor., −1 to 1). Motifs shown: NFATC1, TEAD2, TEAD3, NKX2−8, Nkx2−5(var.2), NFIC, KLF5, GATA3, Gata1, GATA1::TAL1, HNF1A, Nr2f6, Hnf4a]
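
The figure pairs a selection probability with a Pearson correlation for each motif. The sketch below shows one way such a summary could be assembled. It assumes the correlation is taken between each motif's predicted TFBS counts and the observed logFC, and that motifs are reported when their selection probability reaches a cutoff. The 0.9 cutoff, the function names, and the toy values are hypothetical and do not come from the slides.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def summarize_selection(sel_prob, tfbs_counts, logfc, cutoff=0.9):
    """Return (motif, selection probability, Pearson r) for motifs at or above the cutoff."""
    rows = []
    for motif, prob in sel_prob.items():
        if prob >= cutoff:
            rows.append((motif, prob, pearson(tfbs_counts[motif], logfc)))
    # Most stably selected motifs first.
    return sorted(rows, key=lambda row: -row[1])

# Hypothetical toy values for three motifs over four regions.
sel_prob = {"motifA": 0.95, "motifB": 0.40, "motifC": 1.0}
tfbs_counts = {
    "motifA": [0, 1, 2, 3],
    "motifB": [1, 0, 1, 0],
    "motifC": [3, 2, 1, 0],
}
logfc = [-1.0, 0.0, 1.0, 2.0]

for motif, prob, r in summarize_selection(sel_prob, tfbs_counts, logfc):
    print(motif, prob, round(r, 2))
```

The sign of the correlation indicates whether more predicted sites go along with increased or decreased signal, which the selection probability alone does not show.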

SLIDE 7

Summary and Outlook

  • We can identify TFs enriched in regions of interest that display certain log-fold changes
  • We can select TFs that are likely to explain the observed log-fold changes using stability selection
  • We can use any fold change defined on regions of interest (ATAC-seq, methylation, expression, ChIP-seq …) to select motifs explaining the observed logFC

  • We want to look at motif enrichment without using existing databases (unbiased view)
  • Find enriched k-mers, then group and align them to predict the motif
  • Submit to Bioconductor
  • https://github.com/fmicompbio/monaLisa
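
The database-free outlook item starts from enriched k-mers. As a hedged illustration of that first step only (grouping and aligning k-mers is not attempted here), the sketch below counts k-mers in foreground and background sequences and computes a pseudocount-smoothed log2 enrichment. The function names, the smoothing scheme, and the toy sequences are hypothetical.

```python
from collections import Counter
import math

def count_kmers(seqs, k):
    """Count all overlapping k-mers made up only of A, C, G, T."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts

def kmer_log2_enrichment(fg_seqs, bg_seqs, k, pseudocount=1.0):
    """log2 ratio of smoothed k-mer frequencies, foreground over background."""
    fg = count_kmers(fg_seqs, k)
    bg = count_kmers(bg_seqs, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    n_kmers = 4 ** k  # number of possible k-mers, used for smoothing
    scores = {}
    for kmer in set(fg) | set(bg):
        f = (fg[kmer] + pseudocount) / (fg_total + pseudocount * n_kmers)
        b = (bg[kmer] + pseudocount) / (bg_total + pseudocount * n_kmers)
        scores[kmer] = math.log2(f / b)
    return scores

# Hypothetical toy sequences.
fg_seqs = ["ACGTGACGTG", "TTCACGTGAA"]
bg_seqs = ["ACACACACAC", "TTTTGGGGCC"]
scores = kmer_log2_enrichment(fg_seqs, bg_seqs, k=6)
top = sorted(scores.items(), key=lambda kv: -kv[1])[:3]
print(top)
```

Real data would also need reverse-complement handling and a significance test before any k-mers are grouped into a motif.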