CSEP 590B Summary

SLIDE 1

CSEP 590B “Summary”

Below, as a somewhat unusual “course summary,” I have decided to give the bulk of a research talk I presented in our CompBio seminar last spring, partly because I think the content is interesting, but more to show how deeply “computation” is embedded in modern “bio” research, and to show that many of the themes of the course are directly relevant. Also note that the last ~40 slides of Lecture 9 “CMs” were actually presented in Lecture 10, but conceptually and logistically it was easier to split the slides this way...

Asides emphasizing these connections are highlighted in a sprinkling of boxes like this.

SLIDE 2

April 2010

Goal: To give you a sense of where the “comp” fits in a modern “bio” paper.

SLIDE 3

Outline

Transcription factors & MyoD
Chromatin Immunoprecipitation (ChIP)
ChIP-seq
Computational Methods
Results

SLIDE 4

Transcription factors & TFBS motif discovery

[Figure labels: Enhancer, TF, Promoter, TSS, Gene]

SLIDE 5

Myogenesis:

Myoblast – a muscle precursor
Myotube – differentiated skeletal muscle cell

MyoD is the “master regulator”

Other players: Mef2, MyoG, ...

[Figure labels: MyoD, Myogenin, Mef2]

[Box: Network motifs]

SLIDE 6

“Standard Model”

MyoD absent or low in myoblasts
Triggering it in myoblasts (or many other cell types) starts a cascade leading to myotubes
500-1500 genes show differential expression between myoblasts & myotubes
Expectation: MyoD drives those changes, by binding their promoters, plus a few enhancer sites

[Box: Again, familiar ground]

SLIDE 7

Chromatin Immunoprecipitation

[Figure label: Antibody]

Readout: qPCR, microarray, or deep sequencing

SLIDE 8

MyoD Experimental Design

[Figure: workflow diagram. Labels: C2C12 myoblasts; C2C12 myotubes; chromatin IP with anti-MyoD antisera; gene-specific QC-PCR; Solexa sequencing]

[Box: Recall sequencing]

SLIDE 9

ChIP-seq Sample Prep

ChIP DNA
End repair
3’-dA overhang
Adapter ligation
PCR amplification
Size selection: 150-250 bp
Load 2 picomoles on the machine

Sequences shown in the figure (fragment ends drawn with A overhangs):
5’ GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG 3’
TCTAGCCTTCTCGCAGCACATCCCTTTCTCACA 5’
5’ ACACTCTTTCCCTACACGACGCTCTTCCGATCT
3’ GTTCGTCTTCTGCCGTATGCTCGAGAAGGCTAG
5’ AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
3’ TCTAGCCTTCTCGAGCATACGGCAGAAGACGAAC

Adaptor total length: 67 bp

[Box: Recall PCR]

SLIDE 10

Bioanalyzer Analysis

ChIP fragment: mean = 130, range = 60-200

[Box: Recall gel electrophoresis]

SLIDE 11

Analysis & Methods

SLIDE 12

ChIP-seq Analysis

Yields 5-20M “reads” per lane (8 lanes per run, usually 8 different samples)
Reads (35-55 bp, depending on run) are mapped back to the mouse reference genome.
Only one copy of duplicate reads retained (PCR artifacts?)
Tolerate 2 bp mismatch among 1st 28 bp
Reads not mapping uniquely are discarded
“Extended read” – pretend each is 200 bp
Overlapping extended reads presumably mark binding sites

[Box: Latest technology → ~10^9 reads per run]
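The extension-and-overlap idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the talk: the function names, the (chrom, start, strand) read representation, and the rule for extending minus-strand reads are assumptions. Only the 200 bp extension and the one-copy-per-duplicate rule come from the slide.

```python
from collections import defaultdict

EXTENDED_LEN = 200  # from the slide: "pretend each is 200 bp"


def dedup(reads):
    """Keep one copy of each identical (chrom, start, strand) read."""
    return sorted(set(reads))


def extend(read, read_len, ext_len=EXTENDED_LEN):
    """Return (chrom, lo, hi), the half-open interval of the extended read.

    Assumption of this sketch: `start` is the leftmost mapped base, so
    plus-strand reads grow to the right and minus-strand reads grow to
    the left from their rightmost base.
    """
    chrom, start, strand = read
    if strand == "+":
        lo, hi = start, start + ext_len
    else:
        hi = start + read_len
        lo = max(0, hi - ext_len)
    return chrom, lo, hi


def coverage(reads, read_len):
    """Per-base count of overlapping extended reads: {chrom: {pos: height}}."""
    cov = defaultdict(lambda: defaultdict(int))
    for read in dedup(reads):
        chrom, lo, hi = extend(read, read_len)
        for pos in range(lo, hi):
            cov[chrom][pos] += 1
    return cov


def max_height(cov, chrom):
    """Tallest pile of extended reads on one chromosome (0 if none)."""
    return max(cov[chrom].values(), default=0)
```

A per-base dictionary is far too slow for millions of reads, but it keeps the logic visible: positions where many extended reads overlap are the candidate binding sites.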

SLIDE 13

Identification of Binding Regions

Few reads = short flat peaks
Many reads = high sharp peaks

SLIDE 14

Myod Locus

[Figure: Myod locus; track label: MyoD; scale: 12]

[Box: Recall gene, TFBS prediction. Also, multi-way alignment, repeat discovery, ...]

SLIDE 15

Are the antibodies any good?

[Figure: panels A and B; label: Myog]

SLIDE 16

Analysis questions: How tall must a peak be?

Estimate Poisson null model from “islands” of height 1, 2. How likely is height 6? 12?

[Box: Recall likelihood ratio tests, etc.]
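One way to make the Poisson question concrete is sketched below. The slide does not say how the null model was fit, so the estimator here is an assumption: for a Poisson variable P(2)/P(1) = λ/2, so the counts of height-1 and height-2 islands give λ ≈ 2·n2/n1. The function names and the example counts in the usage note are hypothetical.

```python
import math


def estimate_lambda(n_height1, n_height2):
    """Estimate the Poisson mean from counts of islands of height 1 and 2.

    Uses P(2)/P(1) = lam/2 for a Poisson variable (an assumed estimator,
    not necessarily the one used in the talk).
    """
    if n_height1 <= 0:
        raise ValueError("need at least one island of height 1")
    return 2.0 * n_height2 / n_height1


def poisson_tail(lam, height, n_terms=200):
    """P(X >= height) for X ~ Poisson(lam).

    The upper tail is summed directly in log space, which stays accurate
    when the probability is far below machine epsilon.
    """
    if lam <= 0:
        return 1.0 if height <= 0 else 0.0
    return sum(
        math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
        for k in range(height, height + n_terms)
    )
```

For example, with hypothetical counts of 1000 height-1 islands and 100 height-2 islands, `estimate_lambda(1000, 100)` gives λ = 0.2, and `poisson_tail(0.2, 6)` and `poisson_tail(0.2, 12)` answer "how likely is height 6? 12?" under that null.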

SLIDE 17

Results

SLIDE 18

“Standard Model”

MyoD binding absent or rare in myoblasts
Triggering it in myoblasts (or many other cell types) starts a cascade leading to myotubes
500-1500 genes show differential expression between blasts & tubes
Expectation: MyoD binds their promoters & drives those changes

[Box: Again, familiar ground]

SLIDE 19

How Many Peaks?

As opposed to the 500-1500 genes changed, we find MyoD bound to 25,956 loci in myotubes (at 12-read cutoff; FDR < 10^-6)
In both myotubes and myoblasts, > 60,000 loci at ~0.01 FDR

(Excludes X, Y, repetitive regions)

SLIDE 20

Where are peaks?

Concentrated at promoters
Binds 41% of genes.
But 50% are >10 kb from any TSS

[Figure: peak locations; axis label: Count]

[Box: Again, familiar ground]
[Box: A surprise]
[Box: much computational analysis]
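A statistic like "50% are more than 10 kb from any TSS" comes down to a nearest-neighbor lookup per peak. The sketch below shows one way to compute it for a single chromosome; the function names and the representation of peaks and TSSs as integer positions are assumptions, not details from the talk.

```python
import bisect


def distance_to_nearest_tss(peak_pos, sorted_tss):
    """Distance in bp from a peak to the closest TSS on the same chromosome.

    `sorted_tss` must be sorted ascending; only the two TSSs flanking the
    peak need to be checked.
    """
    if not sorted_tss:
        raise ValueError("no TSS positions given")
    i = bisect.bisect_left(sorted_tss, peak_pos)
    candidates = sorted_tss[max(0, i - 1): i + 1]
    return min(abs(peak_pos - t) for t in candidates)


def fraction_distal(peaks, sorted_tss, cutoff=10_000):
    """Fraction of peaks farther than `cutoff` bp from every TSS."""
    distal = sum(distance_to_nearest_tss(p, sorted_tss) > cutoff for p in peaks)
    return distal / len(peaks)
```

Binary search keeps each lookup logarithmic in the number of TSSs, which matters once there are tens of thousands of peaks to place.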

SLIDE 21

Where are peaks?

[Box: much computational analysis]

SLIDE 22

What’s it doing?

[Box: much computational analysis]

SLIDE 23

What else is it doing?

[Box: much computational analysis]

SLIDE 24

[Figure: three ROC panels; axis label: TP]

Promoter regions: Pos: upregulated genes (384); Neg: non-regulated expressed genes (2278); AUC = 0.64
CTCF domains: Pos: upregulated genes (504); Neg: non-regulated expressed genes (3789); AUC = 0.70
CTCF domains: Pos: upregulated genes (504); Neg: intergenic (1294); AUC = 0.77

[Box: much computational analysis]
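For readers who want to see what an AUC value like those quoted on this slide measures, here is a minimal sketch. It computes AUC as the probability that a randomly chosen positive outscores a randomly chosen negative. The slide text does not say which score was ranked; treating it as a per-gene binding score such as peak height is an assumption, and the function name is hypothetical.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive has the higher score, with ties counted as one half.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

An AUC of 0.5 means the score separates the two gene sets no better than chance and 1.0 means perfect separation, so values in the 0.64-0.77 range indicate a real but modest signal.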

SLIDE 25

Binding Site/Cofactor Motifs (See paper)

Discriminative motif discovery, on very large scale
E.g., 3 papers with related approaches appeared in Bioinformatics today
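To give a feel for what "discriminative" means here, the sketch below ranks k-mers by how much more often they occur in a positive sequence set (say, bound regions) than in a negative set. It is a much simpler stand-in, not the method of the paper: the scoring rule, function names, and default k are assumptions, and reverse complements are ignored.

```python
from collections import Counter


def kmers_present(seq, k):
    """Set of distinct k-mers occurring in one sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def discriminative_kmers(pos_seqs, neg_seqs, k=6, top=5):
    """Rank k-mers by (fraction of positives containing it) minus
    (fraction of negatives containing it); ties broken alphabetically."""
    pos_counts, neg_counts = Counter(), Counter()
    for s in pos_seqs:
        pos_counts.update(kmers_present(s, k))
    for s in neg_seqs:
        neg_counts.update(kmers_present(s, k))
    scores = {
        kmer: pos_counts[kmer] / len(pos_seqs) - neg_counts[kmer] / len(neg_seqs)
        for kmer in pos_counts.keys() | neg_counts.keys()
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top]
```

On toy input where every positive sequence contains CAGCTG and no negative does, that 6-mer comes out on top with a score of 1.0. Real discriminative motif finders work with degenerate motifs or weight matrices and a statistical test rather than a raw difference of fractions.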

SLIDE 26

Summary

MyoD present (& bound) in both myoblasts & myotubes
Binds most genes, not just differentially expressed ones
Significant genome-wide binding
Although differentially bound peaks are associated with changed expression, peak height is a weak predictor of function
Implicated in broad chromatin modifications (histone H4 acetylation)
Motif discovery possible (but of limited predictive value in isolation)

SLIDE 27

Summary

MyoD present (& bound) in both myoblasts & myotubes
Binds most genes, not just differentially expressed ones
Significant genome-wide binding
Although differentially bound peaks are associated with changed expression, peak height is a weak predictor of function
Implicated in broad chromatin modifications (histone H4 acetylation)
Motif discovery possible (but of limited predictive value in isolation)

And math, stat, computational analysis is deeply interwoven with everything here...

SLIDE 28

Thanks for a fun quarter!