Predicting Regulatory Elements in P. falciparum



SLIDE 1

Chengyong Yang, Erliang Zeng, Giri Narasimhan

BioInformatics Research Group (BioRG), School of Computer Science, Florida International University, Miami, FL.

Kalai Mathee

Department of Biological Sciences, Florida International University, Miami, FL.

Predicting Regulatory Elements in P. falciparum

SLIDE 2

Outline

  • Biology of Transcription Regulation
  • Mining Regulatory Elements (or Transcription Factor Binding Motifs or TFBMs)

  • Experimental Results
  • Conclusion
  • Future Work

SLIDE 3

Transcription Regulation

[Figure: structure of a gene and its regulatory regions. Labels: enhancer; promoter region with gene-specific TF binding sites and basal TF binding sites (CAT box, CCAAT; TATA box, TATAAT); transcription start point; coding region with Exon 1, Intron, Exon 2; mRNA.]

SLIDE 4

Transcription Regulation

[ Goffart et al. Exp. Physiology (2003) ]

SLIDE 5

Outline

  • Biology of Transcription Regulation
  • Mining Regulatory Elements (or Transcription Factor Binding Motifs or TFBMs)

  • Experimental Results
  • PlasmoTFBM database & Web Query Interface
  • Conclusion
  • Future Work

SLIDE 6

Transcription Factor Binding Motifs (TFBM)

  • Why look for TFBMs?
    – Which TFs regulate a specific gene?
    – Which genes are co-regulated by the same TF?
    – Understand the strength of gene expression.
    – Understand gene regulatory pathways.

SLIDE 7

How to Find TF Binding Motifs?

Direct Experimental Assays (need to know the TFs)

  • Electrophoretic mobility shift assay
  • Nuclease protection assay

Computational Methods

  • Search for known motifs (need to know the TFBMs)
  • Predict sites based on pattern discovery in upstream sequences (only need to know the upstream sequences)

SLIDE 8

CAMDA Data Set

  • Microarray data from the DeRisi lab
  • 46 data sets spanning a 48-hour period of the P. falciparum intraerythrocytic developmental cycle
  • During the 48-hour period, P. falciparum goes through 4 stages (hpi: hours post invasion):
    – Ring (1-15 hpi)
    – Trophozoite (16-28 hpi)
    – Schizont (29-42 hpi)
    – Merozoite (43-48 hpi)
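
The stage boundaries listed on this slide can be written as a small lookup, for example to label each microarray time point with its stage. This is a minimal sketch: only the hour ranges come from the slide, while the function name `stage_for_hpi` and the error handling are assumptions made for illustration.

```python
# Stage boundaries in hours post invasion (hpi), as listed on the slide.
STAGES = [
    ("Ring", 1, 15),
    ("Trophozoite", 16, 28),
    ("Schizont", 29, 42),
    ("Merozoite", 43, 48),
]


def stage_for_hpi(hpi):
    """Return the stage name for an integer time point in hours post invasion."""
    for name, first, last in STAGES:
        if first <= hpi <= last:
            return name
    raise ValueError(f"hpi must be between 1 and 48, got {hpi}")


if __name__ == "__main__":
    for hpi in (1, 16, 38, 48):
        print(hpi, stage_for_hpi(hpi))
```

With this mapping, a time point such as 11 hpi falls in the ring stage and 38 hpi in the schizont stage.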

SLIDE 9

Broad Questions Raised

  • Are there transcriptional events that distinguish the 4 stages of the organism?
  • Are there functional similarities in the genes that share motifs?

SLIDE 10

AlignACE

  • Uses Gibbs sampling to find good alignments of upstream sequences.
  • Maximizes relative entropy to find significant motifs.
  • Significant motifs must be over-represented in the input set and must have a small probability of occurring by chance.

[ Roth et al. Nature Biotechnology (1998) ]

Pipeline: Cluster of Genes → AlignACE → Significant Motifs
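
To make the relative-entropy idea concrete, the sketch below scores a set of aligned sites against a background base composition: for each column it sums p(b) · log2(p(b) / q(b)) over the four bases. This is an illustration only, not AlignACE's own scoring code; the function name, the pseudocount, and the even split of the AT-rich background within each base pair are assumptions.

```python
import math


def relative_entropy(sites, background, pseudocount=0.5):
    """Sum over alignment columns of sum_b p(b) * log2(p(b) / q(b)).

    sites: equal-length aligned DNA strings (A, C, G, T only).
    background: dict mapping each base to its background frequency q(b).
    """
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    total = 0.0
    for col in range(width):
        # Pseudocounts keep every base frequency strictly positive.
        counts = {b: pseudocount for b in "ACGT"}
        for s in sites:
            counts[s[col]] += 1
        n = sum(counts.values())
        for b in "ACGT":
            p = counts[b] / n
            total += p * math.log2(p / background[b])
    return total


if __name__ == "__main__":
    # Assumed AT-rich background (15% GC), split evenly within each pair.
    at_rich = {"A": 0.425, "T": 0.425, "C": 0.075, "G": 0.075}
    g_rich_sites = ["AAGGGGC", "GTGGGGA", "ACGGGGC", "GAGGGGA"]  # toy sites
    a_rich_sites = ["AATAAAT", "TATAATA", "AAATATA", "TAAATAT"]  # toy sites
    print(relative_entropy(g_rich_sites, at_rich))
    print(relative_entropy(a_rich_sites, at_rich))
```

Against an AT-rich background, a G-rich alignment scores far higher than an A/T-rich one of the same width, which is why the choice of background matters for an AT-rich genome.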

SLIDE 11

Clustering of Samples and/or Genes

  • Transcription Module

SLIDE 12

Transcription Module

Transcription Module: a set of genes G and a set of conditions C such that the genes in G are co-regulated under the conditions C.

[Ihmels et al., Nature Genetics (2002)]

Iterative Signature Algorithm (ISA)

  • Defines a score for a set of genes and conditions.
  • Iteratively refines the set of genes and conditions until a “stable” transcription module is obtained.

[Ihmels et al., Bioinformatics (2004)]
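
The iterative refinement can be sketched as alternating between two steps: pick the conditions under which the current genes are unusually highly expressed, then pick the genes that are unusually highly expressed under those conditions, and stop when the gene set no longer changes. The code below is a simplified illustration of that loop, not the published ISA, which differs in its normalization and scoring details. The function names, the z-score thresholds, and the toy data are all assumptions.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of numbers to zero mean and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]


def refine_module(expr, seed_genes, t_cond=1.0, t_gene=1.0, max_iter=50):
    """Alternate condition and gene selection until the gene set is stable.

    expr: dict mapping gene name -> list of expression values per condition.
    seed_genes: initial set of gene names.
    Returns (genes, conditions): a set of names and a set of column indices.
    """
    names = sorted(expr)
    n_cond = len(expr[names[0]])
    genes, conds = set(seed_genes), set()
    for _ in range(max_iter):
        # Condition score: average expression of the current genes.
        cond_avg = [mean(expr[g][c] for g in genes) for c in range(n_cond)]
        cond_z = zscores(cond_avg)
        conds = {c for c in range(n_cond) if cond_z[c] > t_cond}
        if not conds:
            break
        # Gene score: average expression over the selected conditions.
        gene_avg = [mean(expr[g][c] for c in conds) for g in names]
        gene_z = zscores(gene_avg)
        new_genes = {g for g, z in zip(names, gene_z) if z > t_gene}
        if not new_genes or new_genes == genes:
            break  # stable (or nothing passes the threshold)
        genes = new_genes
    return genes, conds


if __name__ == "__main__":
    toy = {  # invented values: g1 and g2 are co-expressed in conditions 0 and 1
        "g1": [5, 5, 0, 0, 0, 0],
        "g2": [4, 5, 0, 0, 0, 0],
        "g3": [0, 0, 0, 0, 0, 0],
        "g4": [0, 0, 0, 1, 0, 0],
        "g5": [0, 1, 0, 0, 0, 0],
        "g6": [0, 0, 1, 0, 0, 0],
    }
    print(refine_module(toy, {"g1", "g4"}))
```

Starting from a partly wrong seed (g1 with the unrelated g4), the loop drops g4 and recruits g2, converging on the genes that rise together in the same two conditions.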

SLIDE 13

Predicting TFBMs: Method I

Gene Expression Data → ISA → Clusters of Genes → AlignACE → Significant Motifs

Using Method I: 106 transcription modules, 840 significant motifs

SLIDE 14

Strength of TFBMs

  • TFs bind to DNA in a sequence-specific manner.
  • If the motif is “strong”, then the binding is strong and the regulation is strong.
  • There is a correlation between a gene's expression and the strength of its upstream TFBMs.
  • MotifRegressor [Conlon et al., PNAS 2003] exploits this correlation.

SLIDE 15

MotifRegressor

[Conlon et al., PNAS, 2003]

  1. Rank all genes by expression and obtain the upstream sequences of highly ranked genes.
  2. Use MDscan to find motifs in the most induced and most repressed genes.
  3. Score each upstream sequence for matches to each motif reported by MDscan.
  4. Perform linear regression between motif matching score and gene expression to identify significant motifs.

46 separate runs of MotifRegressor resulted in 637 significant motifs.
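
The regression step (step 4) can be illustrated with an ordinary least squares fit of expression on motif matching score for a single motif. This is a minimal sketch, not MotifRegressor's implementation: the function name `linear_fit` is hypothetical, the motif scores are assumed to be precomputed, and the numbers in the example are invented toy values.

```python
from statistics import mean


def linear_fit(x, y):
    """Least squares fit y = intercept + slope * x.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


if __name__ == "__main__":
    # Toy data: one motif score and one log expression ratio per gene.
    motif_scores = [0.0, 1.0, 2.0, 3.0, 4.0]
    expression = [0.1, 0.9, 2.1, 2.9, 4.0]
    slope, intercept, r = linear_fit(motif_scores, expression)
    print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

A clearly positive slope suggests the motif is associated with induction and a clearly negative slope with repression; judging significance would additionally require a test on the slope, which this sketch leaves out.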

SLIDE 16

PlasmoTFBM Database

  • All results were put into a searchable MySQL database containing:
    – Modules
    – Motifs
    – Gene annotation information
    – Gene expression data
    – Upstream sequence data
    – Miscellaneous data
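
As a rough picture of how such a database can answer "which genes share a motif", the sketch below builds a tiny two-table store and queries it. Everything here is hypothetical: the slides do not show the PlasmoTFBM schema, so the table and column names are invented, the gene and motif entries are placeholders, and SQLite (from the Python standard library) stands in for MySQL.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical motif/gene layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE motifs (
            motif_id  INTEGER PRIMARY KEY,
            consensus TEXT NOT NULL
        );
        CREATE TABLE gene_motifs (
            gene_id  TEXT NOT NULL,
            motif_id INTEGER NOT NULL REFERENCES motifs(motif_id)
        );
        """
    )
    # Placeholder rows, not real PlasmoTFBM content.
    conn.executemany("INSERT INTO motifs VALUES (?, ?)",
                     [(1, "MOTIF_ONE"), (2, "MOTIF_TWO")])
    conn.executemany("INSERT INTO gene_motifs VALUES (?, ?)",
                     [("geneA", 1), ("geneB", 1), ("geneC", 2)])
    return conn


def genes_sharing_motif(conn, motif_id):
    """Return the sorted gene identifiers whose upstream region has the motif."""
    rows = conn.execute(
        "SELECT gene_id FROM gene_motifs WHERE motif_id = ? ORDER BY gene_id",
        (motif_id,),
    )
    return [gene_id for (gene_id,) in rows]


if __name__ == "__main__":
    print(genes_sharing_motif(build_demo_db(), 1))
```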

SLIDE 17

Outline

  • Biology of Transcription Regulation
  • Mining Regulatory Elements (or Transcription Factor Binding Motifs or TFBMs)

  • Experimental Results
  • Conclusion
  • Future Work

SLIDE 18

Results

  • A. Validation of known motifs
    – 1. G-Box motif
    – 2. var gene family
  • B. Motif clusters & motif-stage correlations
  • C. All motifs in a single gene of interest
  • D. Gene family analysis (SERA genes)

SLIDE 19

A: G-Box Motifs

  • The P. falciparum genome is AT-rich (15% GC)
  • G-box: a unique regulatory element
  • Identified upstream of heat shock protein (hsp) genes

[Militello et al., MBP, 2004]

Published motif: (A/G)NGGGG(C/A)
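
Searching an upstream sequence for the published G-box consensus is an instance of the "search for known motifs" approach. The sketch below translates (A/G)NGGGG(C/A) into a regular expression and scans both strands. The function names and the example sequences are assumptions for illustration; real analyses usually score sites with a weight matrix instead of an exact consensus match.

```python
import re

# (A/G)NGGGG(C/A) as a regular expression; N matches any base.
G_BOX = "[AG][ACGT]GGGG[CA]"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_g_boxes(seq):
    """Return sorted (position, strand, site) tuples for G-box matches.

    position is the 0-based offset of the site's leftmost base on the input
    sequence; site is written 5' to 3' on the matching strand. A lookahead
    is used so that overlapping sites are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(f"(?=({G_BOX}))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # Map the reverse-strand offset back to input coordinates.
        start = len(seq) - (m.start() + len(m.group(1)))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    print(find_g_boxes("TTTAAGGGGCTTT"))  # site on the forward strand
    print(find_g_boxes("AAAGCCCCTTAAA"))  # same site on the reverse strand
```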

SLIDE 20

A: G-Box Motifs

[Militello et al., MBP, 2004]

Published motif: (A/G)NGGGG(C/A)

[Figure: sequence logos of the G-box recovered from PlasmoTFBM and of a new TG-box motif.]

SLIDE 21

A: var Gene Family

  • 50 diverse var genes
  • Coding for variants of P. falciparum erythrocyte membrane protein 1 (PfEMP1)
  • Ability to switch the expression of PfEMP1
  • Allows the parasite to escape specific immune responses

[Voss et al., Mol Microbiol, 2003]

SLIDE 22

A: Significant Motifs in var Genes

[Voss et al., Mol Microbiol, 2003]

Induced at 11 hpi: PFL0935c, PF14_048, PFB0010w, PF08_0103, PFI1830c, PF10_0406, PFL1955w, PFA0765c, PFD0615c

Repressed at 38 hpi: PF08_010, PFL0935c, PF10_040, PFB0010w, PFI1830c, PFA0765c

Motifs:

  • CPE: ATGTTGTACAT
  • SPE2: TGTGCATAGTG

SLIDE 23

B: Motif Clusters

[Figure: motif clusters 1 to 12 plotted against the four stages (R, T, S, M: ring, trophozoite, schizont, merozoite).]

[Genesis, http://genome.tugraz.at/Software]

SLIDE 24

C: EBA140

  • EBA140 is implicated in merozoite invasion of erythrocytes
  • Putative vaccine target
  • Shares sequence homology and structural features with EBA175

[Thompson et al., Mol Microbiol, 2001]

SLIDE 25

C: Motifs Found in EBA140

  • MAL13P1.61, hypothetical protein, sharing all eight motifs
  • Shared by 77 genes including MAL7P1.86 (TF)
  • Correction to Figure 4, Pg 19 of Abstracts: "MAL7_1.86" should read "MAL7P1.86 (TF)"

SLIDE 26

D: SERA Gene Family

  • Serine repeat antigen (SERA)
  • Adjacent, co-regulated genes from Chr 2
  • Highly expressed in late blood cycle
  • Target of protective immune response
  • Possesses a protease function domain
  • Serves as both a vaccine target and a drug target to control P. falciparum

[Miller et al., JBC, 2002]

SLIDE 27

D: Motif Discovered in SERA

SERA genes: PFB0325c, PFB0330c, PFB0335c, PFB0340c, PFB0345c, PFB0350c, PFB0355c, PFB0360c

SERA modules: rand5-80_m22_1, PFE0415w_g1.3_c8

[Figure: motif discovered in the SERA modules.]

http://biorg.cs.fiu.edu/TFBM/tfbm.php

SLIDE 28

D: Module Average Expression Profile

PFE0415w_g1.3_c8

SLIDE 29

D: SERA TFBMs Visualization

[Figure: SERA TFBM visualization; legend: M0, M1, M2, M3, M4.]

SLIDE 30

Conclusions

  • PlasmoTFBM: first comprehensive database of P. falciparum TFBMs
  • Validated many known P. falciparum motifs
  • Discovered new interesting motifs
  • Web query interface built for biologists

SLIDE 31

Acknowledgements

  • BioRG members (Tao Li, Gaolin Zheng, Tom Milledge)
  • Prof. Shirley Liu, Harvard (MotifRegressor)
  • Haifeng Wang, Jing Zhai & Wei Shi

http://biorg.cs.fiu.edu/CAMDA2004
http://biorg.cs.fiu.edu/TFBM/tfbm.php