Detecting Network Motifs in Gene Co-expression Networks
Xinxia Peng
Genome Science & Technology Program
The University of Tennessee
Oak Ridge Natl. Lab
Motivation
Modularity of Biological Networks
Co-expression Network
Correlation Matrix* → Adjacency Matrix (cutoff: 0.8)
*Pearson’s R
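The thresholding step can be sketched as follows. This is a minimal illustration, not the presenter's code: the gene names and expression profiles are hypothetical, and it assumes that an edge is drawn when the signed Pearson's R is at or above the cutoff (the slides do not say whether the absolute value is used).

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a flat profile is treated as uncorrelated
    return cov / sqrt(var_x * var_y)

def coexpression_edges(profiles, cutoff=0.8):
    """Return the gene pairs whose Pearson's R is at or above the cutoff."""
    genes = sorted(profiles)
    edges = set()
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if pearson_r(profiles[g], profiles[h]) >= cutoff:
                edges.add((g, h))
    return edges

# Hypothetical toy profiles: geneA and geneB rise together, geneC does not.
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 1.0, 3.5, 0.5],
}
toy_edges = coexpression_edges(toy_profiles, cutoff=0.8)
```

The resulting edge set is the sparse form of the adjacency matrix: a pair is present exactly where the matrix entry would be 1.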
Genes of Similar Function Cluster Together
Densely connected subgraphs
Protein complexes
Pathways
…
Clique
maximally connected subgraph: every pair of its vertices is joined by an edge
Gene Duplication
Paralogs
“Paralogous pathways”: pathways with duplicated proteins and interactions
co-expression duplication
1dxy
Protein Domain (or Motif)
Evolutionary unit
Functional unit
Reiterated use of domains
“Network Motifs”
II and III are overlapping,
I and II are non-overlapping
Materials and Methods
Protein Domain Annotation
Protein Sequences
PlasmoDB http://plasmodb.org
HMM Library
Pfam http://www.sanger.ac.uk/Software/Pfam
HMMER
http://hmmer.wustl.edu
OIT Cluster
http://icl.cs.utk.edu/sinrg/index.html
Domain Annotations
Network Motif Discovery (1)
Input: < G, k, f >
Enumeration of k-vertex cliques → Groups of cliques (grouped by protein domain) → Network motifs
f: # of non-overlapping cliques
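A minimal sketch of this pipeline is given below, under several simplifying assumptions that are not stated on the slide: each protein carries a single domain label, cliques are grouped by their sorted tuple of domain labels, "non-overlapping" means sharing no vertex, and a greedy pass stands in for an exact count of non-overlapping cliques. All function names are hypothetical.

```python
from itertools import combinations
from collections import defaultdict

def build_adjacency(edges):
    """Turn an edge list into {vertex: set(neighbors)} for an undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)

def k_cliques(adj, k):
    """Yield every k-vertex clique by brute force (fine for toy graphs only)."""
    for combo in combinations(sorted(adj), k):
        if all(v in adj[u] for u, v in combinations(combo, 2)):
            yield combo

def greedy_non_overlapping(cliques):
    """Greedily keep cliques that share no vertex (a lower bound on the maximum)."""
    used, chosen = set(), []
    for clique in cliques:
        if used.isdisjoint(clique):
            chosen.append(clique)
            used.update(clique)
    return chosen

def find_motifs(adj, domain, k, f):
    """Group k-cliques by domain signature; keep groups with >= f disjoint instances."""
    groups = defaultdict(list)
    for clique in k_cliques(adj, k):
        signature = tuple(sorted(domain[v] for v in clique))
        groups[signature].append(clique)
    motifs = {}
    for signature, cliques in groups.items():
        instances = greedy_non_overlapping(cliques)
        if len(instances) >= f:
            motifs[signature] = instances
    return motifs

# Hypothetical toy network: two triangles joined by one edge (p3-p4).
toy_adj = build_adjacency([
    ("p1", "p2"), ("p1", "p3"), ("p2", "p3"),
    ("p4", "p5"), ("p4", "p6"), ("p5", "p6"),
    ("p3", "p4"),
])
toy_domain = {"p1": "A", "p2": "B", "p3": "C", "p4": "A", "p5": "B", "p6": "C"}
toy_motifs = find_motifs(toy_adj, toy_domain, k=3, f=2)
```

The brute-force enumeration checks every k-subset of vertices, so it only suits small examples; a real co-expression network would need a dedicated clique enumeration algorithm.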
Network Motif Discovery (2)
p-value: fraction of randomized networks in which the putative network motif is found
Randomize the real network by randomly permuting the protein domain labels
Repeat 1,000 times
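The permutation test can be sketched as below. This is an illustration under assumptions, not the original implementation: `motif_found` is a hypothetical caller-supplied predicate that reports whether the putative motif is present under a given labeling, and the toy predicate and protein names are invented for the example.

```python
import random

def permutation_p_value(domain, motif_found, n_rounds=1000, seed=0):
    """Fraction of label-permuted networks in which motif_found(labels) is True.

    domain: {vertex: domain label}; the network topology stays fixed and only
    the labels are shuffled among the vertices.
    """
    rng = random.Random(seed)
    vertices = list(domain)
    labels = [domain[v] for v in vertices]  # copy, so `domain` is not mutated
    hits = 0
    for _ in range(n_rounds):
        rng.shuffle(labels)
        shuffled = dict(zip(vertices, labels))
        if motif_found(shuffled):
            hits += 1
    return hits / n_rounds

# Hypothetical toy case: two fixed triangles; the "motif" is present when
# both triangles carry one A, one B, and one C domain.
toy_triangles = [("p1", "p2", "p3"), ("p4", "p5", "p6")]
toy_domain = {"p1": "A", "p2": "B", "p3": "C", "p4": "A", "p5": "B", "p6": "C"}

def both_triangles_abc(labels):
    return all(sorted(labels[v] for v in tri) == ["A", "B", "C"]
               for tri in toy_triangles)

toy_p = permutation_p_value(toy_domain, both_triangles_abc, n_rounds=1000)
```

For this toy case 36 of the 90 distinct labelings satisfy the predicate, so the estimate should land near 0.4; a motif that is this easy to obtain by chance would not be called significant.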
Network Motif Discovery (3)
[Figure: Domain Matching Level 2 vs. Domain Matching Level 4, illustrated with example proteins 1, 1’, 2, 2’ and domains A, B, C, D]
Protein Interaction Network and Data Visualization
Protein Interaction Network (PPI)
BIND (http://www.blueprint.org/bind/bind.php)
Vertices: genes/proteins
Edges: binary protein interactions
Protein complex: “matrix” model
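As the "matrix" model is commonly understood, every pair of proteins in a complex is connected by an edge. The sketch below shows that expansion; the function name and the toy complexes are hypothetical, and the slides do not show how the presenter actually processed the BIND records.

```python
from itertools import combinations

def matrix_model_edges(complexes):
    """Expand each complex into all pairwise edges between its members."""
    edges = set()
    for members in complexes:
        # sorted(set(...)) removes duplicate members and fixes edge orientation
        for u, v in combinations(sorted(set(members)), 2):
            edges.add((u, v))
    return edges

# Hypothetical complexes: one with three members, one with two.
toy_complexes = [["P1", "P2", "P3"], ["P3", "P4"]]
toy_ppi_edges = matrix_model_edges(toy_complexes)
```

A complex with n members contributes n(n-1)/2 edges, so large complexes produce dense subgraphs in the resulting network.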
Visualization
ALIVE (http://mouse.ornl.gov/alive)
R (http://www.r-project.org)
Results
Co-expression Network
- Complete Dataset
- R >= 0.95
- 2,292 ORFs
- 93% (2124) with strong periodic behavior
- cover 78% (2124/2714) of Overview Dataset
Prediction of Network Motifs
k: size of network motif
f: min. number of non-overlapping instances
Table entries: (# of network motifs having at least one instance in yeast PPI) / (# of network motifs found)
↑ k or f, ↑ % in yeast PPI
Percentage of network motifs having instance in yeast PPI network by Freq. x Size. Domain matching level 2
25/88 2/3 11/18 5/6 1/1 0/0 0/0 0/0
↑ k ↑ f
↑ k or f, ↑ % in yeast PPI
Percentage of network motifs having instance in yeast PPI network by Freq. x Size. Domain matching level 4
0/0 0/0 0/0 0/0 6/9 17/32 0/0 6/6 29/87 53/197 13/17 5/5
↑ f ↑ k
Example 1
Functional Annotations
DEAD/DEAH box helicase (PF00270) and Helicase conserved C-terminal domain (PF00271), WD domains, G-beta repeats (PF00400), Brix domain (PF04427), GTPase of unknown function (PF01926).
Supported by Yeast Protein Interactions
Prediction of Complementary Functional Units
Example 2
Functional Annotations
Protein kinase domain (PF00069), Calcineurin-like phosphoesterase (PF00149), AhpC/TSA family (PF00578). The AhpC/TSA family contains peroxiredoxins (Prxs), a ubiquitous family of antioxidant enzymes; Prxs can be regulated by phosphorylation.
Differential Temporal Expression
More Results
http://mouse.ornl.gov/~xpv/camda04/index.html
Conclusion
- New strategy for microarray data analysis
- Data integration
  Gene expression, sequence, protein interaction, …
- Easier for experimental verification
  Small clusters
  Implication about relationships among members
- Biological hypothesis
  Modularity of biological networks
Acknowledgements
- Dr. Jay Snoddy (Genome Science & Technology, UT-ORNL)
  Adam Tebbe, Suzanne Baktash, …
- Dr. Michael Langston (Computer Science, UT)
  Nicole Baldwin
- Dr. Arnold Saxton (Animal Science, UT)