1
Using Graph Theory to Analyze Gene Network Coherence
Francisco A. Gómez-Vela (fgomez@upo.es), Norberto Díaz-Díaz (ndiaz@upo.es), Jesús S. Aguilar, José A. Lagares, José A. Sánchez
2
Outline
n Introduction
n Proposed Methodology
n Experiments
n Conclusions
3
Introduction
n There is a need to generate patterns of expression, and
n GNs arise as a visual and intuitive solution for gene-
n They are presented as a graph:
q Nodes: genes.
q Edges: relationships among these genes.
4
5
6
n Many GN inference algorithms have been developed as
q Ponzoni et al., 2007.
q Gallo et al., 2011.
n They can be broadly classified as (Hecker M, 2009):
q Boolean networks
q Information theory models
q Bayesian networks
7
n Once the network has been generated, it is very
n The quality of a GN can be measured by a direct
n However, these approaches are not entirely accurate as
8
9
Proposed Methodology
10
n The main features of our method:
q Evaluate the similarities and differences between gene networks and biological databases.
q Take into account the indirect gene-gene relationships in the validation process.
q Using Graph Theory to evaluate with gene networks and
11
Distance Matrices
[Figure: the input network (genes A, B, C, D) and the biological database (genes A, B, C, E, F) are each converted into a distance matrix with the Floyd-Warshall algorithm.]
12
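Coherence Matrix
[Figure: the distance matrix of the input network (DM_IN) and the distance matrix of the biological database (DM_DB) are combined into a coherence matrix (CM).]
CM = |DM_IN - DM_DB|, or in general CM = |DM_i - DM_j|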
13
n This approach is a graph analysis method that solves
Network (genes A, B, C, E, F) and its Distance Matrix:
     A  B  C  E  F
A    -  2  1  1  2
B    2  -  1  1  2
C    1  1  -  2  1
E    1  1  2  -  1
F    2  2  1  1  -
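The distance matrix above can be reproduced with a plain Floyd-Warshall pass. The following is a minimal sketch, not the authors' implementation: it assumes an undirected network in which every edge has length 1, and the edge list is inferred from the entries equal to 1 in the matrix. The function name `floyd_warshall` and the 0 on the diagonal are illustrative choices.

```python
import math

def floyd_warshall(nodes, edges):
    """All-pairs shortest-path distances for an undirected, unweighted graph."""
    index = {gene: i for i, gene in enumerate(nodes)}
    n = len(nodes)
    # Start with 0 on the diagonal and infinity (no known path) elsewhere.
    dist = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for a, b in edges:
        # Assumption: every relationship has length 1 and is symmetric.
        dist[index[a]][index[b]] = 1
        dist[index[b]][index[a]] = 1
    # Allow each gene k in turn as an intermediate step between i and j.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Edges inferred from the distance-1 entries of the example matrix.
nodes = ["A", "B", "C", "E", "F"]
edges = [("A", "C"), ("A", "E"), ("B", "C"), ("B", "E"), ("C", "F"), ("E", "F")]

distance_matrix = floyd_warshall(nodes, edges)
for gene, row in zip(nodes, distance_matrix):
    print(gene, row)
```

The triple loop costs O(n^3) in the number of genes, which is the usual price of computing every indirect relationship at once.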
n Distance threshold (δ)
q It is used to exclude relationships that lack biological meaning.
q It denotes the maximum distance considered relevant when the Distance Matrix is generated.
q If the minimum distance between two genes is greater than δ, no path between them is assumed.
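The thresholding rule can be sketched in a few lines. This is an illustration under stated assumptions: the helper name `apply_distance_threshold` is hypothetical, "no path" is encoded as infinity, and the diagonal is filled with 0. The matrix is the five-gene example (A, B, C, E, F) from this slide.

```python
import math

def apply_distance_threshold(distance_matrix, delta):
    """Replace every distance greater than delta with infinity (no path assumed)."""
    return [[d if d <= delta else math.inf for d in row] for row in distance_matrix]

genes = ["A", "B", "C", "E", "F"]
dm = [
    [0, 2, 1, 1, 2],
    [2, 0, 1, 1, 2],
    [1, 1, 0, 2, 1],
    [1, 1, 2, 0, 1],
    [2, 2, 1, 1, 0],
]

# With delta = 1, only direct relationships survive.
thresholded = apply_distance_threshold(dm, delta=1)
for gene, row in zip(genes, thresholded):
    print(gene, row)
```

Raising `delta` lets longer indirect relationships count; with `delta=2` this example matrix is left unchanged.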
16
Distance Matrix of the network (genes A, B, C, E, F) after applying the distance threshold (every distance of 2 exceeds δ and becomes ∞):
     A  B  C  E  F
A    -  ∞  1  1  ∞
B    ∞  -  1  1  ∞
C    1  1  -  ∞  1
E    1  1  ∞  -  1
F    ∞  ∞  1  1  -
17
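Coherence Matrix (CM)
DM_IN, distance matrix of the input network:
     A  B  C  D
A    -  1  3  2
B    1  -  2  1
C    3  2  -  1
D    2  1  1  -
DM_IN after thresholding (the distance of 3 between A and C becomes ∞):
     A  B  C  D
A    -  1  ∞  2
B    1  -  2  1
C    ∞  2  -  1
D    2  1  1  -
DM_DB, distance matrix of the biological database:
     A  B  C  E  F
A    0  2  1  2  2
B    2  0  1  1  2
C    1  1  0  1  1
E    2  1  1  0  1
F    2  2  1  1  0
CM = |DM_i - DM_j|, computed over the common genes A, B and C:
     A  B  C
A    -  1  ∞
B    1  -  1
C    ∞  1  -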
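The coherence matrix of this example can be recomputed as an element-wise absolute difference. The sketch below is one reading of the slide, not the authors' code: it restricts both distance matrices to the genes they share, puts 0 on the diagonal, and (as an explicit assumption) scores a pair as 0 when both matrices report no path. The name `coherence_matrix` is hypothetical.

```python
import math

def coherence_matrix(genes_in, dm_in, genes_db, dm_db):
    """Return (common_genes, CM) with CM[a][b] = |DM_IN[a][b] - DM_DB[a][b]|."""
    common = [g for g in genes_in if g in genes_db]
    pos_in = {g: i for i, g in enumerate(genes_in)}
    pos_db = {g: i for i, g in enumerate(genes_db)}
    cm = []
    for a in common:
        row = []
        for b in common:
            d_in = dm_in[pos_in[a]][pos_in[b]]
            d_db = dm_db[pos_db[a]][pos_db[b]]
            if math.isinf(d_in) and math.isinf(d_db):
                # Assumption: two "no path" entries agree, so the difference is 0.
                row.append(0)
            else:
                row.append(abs(d_in - d_db))
        cm.append(row)
    return common, cm

INF = math.inf
# Thresholded input-network matrix (genes A, B, C, D).
genes_in = ["A", "B", "C", "D"]
dm_in = [
    [0, 1, INF, 2],
    [1, 0, 2, 1],
    [INF, 2, 0, 1],
    [2, 1, 1, 0],
]
# Biological-database matrix (genes A, B, C, E, F).
genes_db = ["A", "B", "C", "E", "F"]
dm_db = [
    [0, 2, 1, 2, 2],
    [2, 0, 1, 1, 2],
    [1, 1, 0, 1, 1],
    [2, 1, 1, 0, 1],
    [2, 2, 1, 1, 0],
]

common, cm = coherence_matrix(genes_in, dm_in, genes_db, dm_db)
print(common)
for gene, row in zip(common, cm):
    print(gene, row)
```

A value of 0 means both sources place the two genes at the same distance, and ∞ means only one of them connects the pair at all.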
n Coherence level threshold (θ)
q It denotes the maximum coherence level considered relevant in the Coherence Matrix.
q It is used to obtain well-known indices from the elements of the coherence matrix.
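How coherence values are mapped to indices is not spelled out here, so the following sketch uses hypothetical counting rules that should be read as assumptions rather than as the method's definition. A gene pair counts as a true positive when both sources connect it and the coherence value is at most θ, as a false positive when the input network connects it but the database disagrees, as a false negative when only the database connects it, and as a true negative when neither does. Accuracy and F-measure then follow from their standard formulas.

```python
import math

def confusion_counts(dm_in, dm_db, theta):
    """Count (tp, fp, fn, tn) over unordered gene pairs, using the assumed rules."""
    tp = fp = fn = tn = 0
    n = len(dm_in)
    for i in range(n):
        for j in range(i + 1, n):  # visit each unordered pair once
            d_in, d_db = dm_in[i][j], dm_db[i][j]
            in_path, db_path = math.isfinite(d_in), math.isfinite(d_db)
            if in_path and db_path:
                # Both sources connect the pair: compare the coherence value with theta.
                if abs(d_in - d_db) <= theta:
                    tp += 1
                else:
                    fp += 1
            elif in_path:
                fp += 1  # only the input network connects the pair
            elif db_path:
                fn += 1  # only the database connects the pair
            else:
                tn += 1  # neither source connects the pair
    return tp, fp, fn, tn

def accuracy(tp, fp, fn, tn):
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0

def f_measure(tp, fp, fn):
    # Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN).
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

INF = math.inf
# Distance matrices restricted to the common genes A, B, C of the example.
dm_in = [[0, 1, INF], [1, 0, 2], [INF, 2, 0]]
dm_db = [[0, 2, 1], [2, 0, 1], [1, 1, 0]]

tp, fp, fn, tn = confusion_counts(dm_in, dm_db, theta=1)
print(tp, fp, fn, tn)
print(accuracy(tp, fp, fn, tn), f_measure(tp, fp, fn))
```

Under these assumed rules, a larger θ tolerates bigger distance disagreements and so moves pairs from false positives to true positives.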
18
21
Experiments
22
n Input networks were obtained by applying four
n Comparison with well-known data:
23
n Several studies were carried out using different
q The distance threshold (δ) and the coherence level threshold (θ) were each varied from one to five, yielding 25 different combinations.
n The results show that the higher δ and θ values, the
q The most representative result was obtained for δ=4 and θ=1.
24
Database   Soinov                Bulashevska           Ponzoni
           Accuracy  F-measure   Accuracy  F-measure   Accuracy  F-measure
BioGrid    0.27      0.42        0.65      0.79        0.82      0.90
KEGG       0.58      0.48        0.34      0.50        0.28      0.43
SGD        0.31      0.47        0.53      0.69        1.00      1.00
YeastNet   0.29      0.45        0.50      0.66        1.00      1.00
25
q These results are consistent with the experiment carried out in Ponzoni et al., 2007.
q In that study, Ponzoni compared favourably with the Soinov and Bulashevska approaches.
26
Conclusions
27
n A new approach of a gene network validation framework
q The methodology not only takes into account the direct
relationships, but also the indirect ones.
q Graph theory has been used to perform the validation task.
28
n Experiments with Real Data.
q These results are consistent with the experiment carried out in
Ponzoni et al., 2007.
q In that study, Ponzoni compared favourably with the Soinov and Bulashevska approaches.
q The same behaviour appears in the results obtained here: Ponzoni presents better coherence values than Soinov and Bulashevska in BioGrid, SGD and YeastNet.
29
n The methodology will be improved:
q The elements of the coherence matrix will be weighted according to the distance of the gene-gene relationship.
q A new measure based on different databases will be generated.
n Moreover, a Cytoscape plugin will be implemented.
30
Pavlopoulos GA, et al. (2011) Using graph theory to analyze biological networks. BioData Mining 4:10.
Asghar A, et al. (2012) Speeding up the Floyd–Warshall algorithm for the cycled shortest path problem. Applied Mathematics Letters 25(1): 1
Bulashevska S and Eils R (2005) Inferring genetic regulatory logic from expression
Ponzoni I, et al. (2007) Inferring adaptive regulation thresholds and association rules from gene expression data through combinatorial optimization learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 4(4):624.
Poyatos JF (2011) The balance of weak and strong interactions in genetic networks. PLoS One 6(2):e14598.
31