Using Graph Theory to Analyze Gene Network Coherence
Francisco A. Gómez-Vela (fgomez@upo.es), Norberto Díaz-Díaz (ndiaz@upo.es), Jesús S. Aguilar, José A. Lagares, José A. Sánchez

Outlines n Introduction n Proposed Methodology n Experiments n Conclusions 2

Introduction Gene Network n There is a need to generate patterns of expression, and behavioral influences between genes from microarray. n GNs arise as a visual and intuitive solution for gene- gene interaction. n They are presented as a graph: q Nodes: are made up of genes. q Edges: relationships among these genes. 4

Introduction Gene Network n Many GN inference algorithms have been developed as techniques for extracting biological knowledge q Ponzoni et al., 2007. q Gallo et al., 2011. n They can be broadly classified as (Hecker M, 2009): q Boolean Network q Information Theory Model q Bayesian Networks 6

Introduction: Gene Network Validation in Bioinformatics
- Once the network has been generated, it is important to assess its reliability in order to show the quality of the generated model.
- Synthetic data based validation
  - This approach is normally used to validate new methodologies or algorithms.
- Well-known data based validation
  - Prior knowledge from the literature is used to validate gene networks.

Introduction: Well-Known Biological Data Based Validation
- The quality of a GN can be measured by a direct comparison between the obtained GN and prior biological knowledge (Wei and Li, 2007; Zhou and Wong, 2011).
- However, these approaches are not entirely accurate, as they only take direct gene-gene interactions into account for the validation task, leaving aside the weak (indirect) relationships (Poyatos, 2011).

Proposed Methodology n The main features of our method: q Evaluate the similarities and differences between gene networks and biological database. q Take into account the indirect gene-gene relationships for the validation process. q Using Graph Theory to evaluate with gene networks and obtain different measures. 10

Proposed Methodology
[Figure: the biological database and the input network are each converted into a distance matrix (DM_DB and DM_IN) by the Floyd-Warshall algorithm.]

Proposed Methodology
[Figure: the distance matrices DM_IN and DM_DB are combined into the coherence matrix CM.]
$CM = |DM_{IN} - DM_{DB}|$

Proposed Methodology: Floyd-Warshall Algorithm
- The Floyd-Warshall algorithm is a graph analysis method that finds the shortest paths between all pairs of nodes.

Example distance matrix for a network with nodes A, B, C, E and F:

     A  B  C  E  F
  A  0  2  1  1  2
  B  2  0  1  1  2
  C  1  1  0  2  1
  E  1  1  2  0  1
  F  2  2  1  1  0
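The distance matrix on this slide can be reproduced with a short Floyd-Warshall sketch. The edge list below is not given in the slides; it is an assumption inferred from the entries equal to 1 in the matrix, and the function name `floyd_warshall` is illustrative only. Edges are treated as undirected and unweighted.

```python
import math

def floyd_warshall(nodes, edges):
    """All-pairs shortest-path lengths for an undirected, unweighted graph."""
    # Start with 0 on the diagonal and infinity everywhere else.
    dist = {u: {v: (0 if u == v else math.inf) for v in nodes} for u in nodes}
    for u, v in edges:
        dist[u][v] = 1
        dist[v][u] = 1
    # Allow each node k in turn as an intermediate stop on the path i -> j.
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

nodes = ["A", "B", "C", "E", "F"]
# Assumed edges: the pairs at distance 1 in the example matrix.
edges = [("A", "C"), ("A", "E"), ("B", "C"), ("B", "E"), ("C", "F"), ("E", "F")]

dm = floyd_warshall(nodes, edges)
for u in nodes:
    print(u, [dm[u][v] for v in nodes])
```

Under these assumed edges, the printed rows match the example matrix row by row.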

Proposed Methodology Distance Threshold n Distance threshold ( δ ) q It is used to exclude relationships that lack biological meaning. q This threshold denotes the maximum distance to be considered as relevant in the Distance Matrix generation process. q If the minimum distance between two genes is greater than δ , then no path between the genes will be assumed. 14

Proposed Methodology: Distance Threshold (example, δ = 1)

Distance matrix before applying the threshold:

     A  B  C  E  F
  A  0  2  1  1  2
  B  2  0  1  1  2
  C  1  1  0  2  1
  E  1  1  2  0  1
  F  2  2  1  1  0

Distance matrix after applying δ = 1 (distances greater than δ become ∞):

     A  B  C  E  F
  A  0  ∞  1  1  ∞
  B  ∞  0  1  1  ∞
  C  1  1  0  ∞  1
  E  1  1  ∞  0  1
  F  ∞  ∞  1  1  0
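The thresholding step in this example could be sketched as follows. The helper name `apply_distance_threshold` and the nested-dictionary layout are assumptions made for illustration; the slides do not say how the matrices are stored.

```python
import math

def apply_distance_threshold(dm, delta):
    """Replace every distance greater than delta with infinity (no path assumed)."""
    return {
        u: {v: (d if d <= delta else math.inf) for v, d in row.items()}
        for u, row in dm.items()
    }

genes = ["A", "B", "C", "E", "F"]
rows = [
    [0, 2, 1, 1, 2],
    [2, 0, 1, 1, 2],
    [1, 1, 0, 2, 1],
    [1, 1, 2, 0, 1],
    [2, 2, 1, 1, 0],
]
# Distance matrix of the example network as a nested dict.
dm = {u: dict(zip(genes, row)) for u, row in zip(genes, rows)}

dm_thr = apply_distance_threshold(dm, delta=1)
for u in genes:
    print(u, [dm_thr[u][v] for v in genes])
```

With δ = 1 only the direct neighbours keep a finite distance, which is why every 2 in the example turns into ∞.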

Proposed Methodology: Coherence Matrix (example)

DM_DB:

     A  B  C  E  F
  A  0  2  1  2  2
  B  2  0  1  1  2
  C  1  1  0  1  1
  E  2  1  1  0  1
  F  2  2  1  1  0

DM_IN before the distance threshold:

     A  B  C  D
  A  0  1  3  2
  B  1  0  2  1
  C  3  2  0  1
  D  2  1  1  0

DM_IN after the distance threshold (the A-C distance of 3 becomes ∞):

     A  B  C  D
  A  0  1  ∞  2
  B  1  0  2  1
  C  ∞  2  0  1
  D  2  1  1  0

Coherence matrix, $CM = |DM_{IN} - DM_{DB}|$, over the genes present in both matrices (A, B, C):

     A  B  C
  A  0  1  ∞
  B  1  0  1
  C  ∞  1  0
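A minimal sketch of how the coherence matrix in this example might be computed. Several choices here are assumptions rather than something the slides state: the matrix is restricted to the genes shared by both networks, the assignment of the two matrices to DM_DB and DM_IN follows the order of the labels on the slide, and any pair with an infinite distance on either side is stored as ∞ (the slides later separate these cases as α and β, which this sketch does not do).

```python
import math

INF = math.inf

def to_matrix(genes, rows):
    """Build a nested dict {gene: {gene: distance}} from a list of rows."""
    return {u: dict(zip(genes, row)) for u, row in zip(genes, rows)}

def coherence_matrix(dm_in, dm_db):
    """Absolute difference of two distance matrices over their common genes."""
    common = sorted(set(dm_in) & set(dm_db))
    cm = {}
    for u in common:
        cm[u] = {}
        for v in common:
            a, b = dm_in[u][v], dm_db[u][v]
            # inf - inf is NaN in floating point, so infinite entries are
            # handled explicitly instead of being subtracted.
            cm[u][v] = INF if (math.isinf(a) or math.isinf(b)) else abs(a - b)
    return cm

dm_db = to_matrix(list("ABCEF"), [
    [0, 2, 1, 2, 2],
    [2, 0, 1, 1, 2],
    [1, 1, 0, 1, 1],
    [2, 1, 1, 0, 1],
    [2, 2, 1, 1, 0],
])
dm_in = to_matrix(list("ABCD"), [
    [0, 1, INF, 2],
    [1, 0, 2, 1],
    [INF, 2, 0, 1],
    [2, 1, 1, 0],
])

cm = coherence_matrix(dm_in, dm_db)
for u in cm:
    print(u, list(cm[u].values()))
```

Because the absolute difference is symmetric, swapping which matrix is called DM_IN and which DM_DB would not change the result.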

Proposed Methodology: Obtaining Measures
- Coherence level threshold (θ)
  - It denotes the maximum coherence level considered relevant in the coherence matrix.
  - It is used to obtain well-known indices from the elements of the coherence matrix:

  CM_i,j                      Condition      Index
  |v - y|, 0 < v, y < ∞       |v - y| ≤ θ    TP
  |v - y|, 0 < v, y < ∞       |v - y| > θ    FP
  |∞ - y|   (α)                              FN
  |∞ - ∞|   (β)                              TN

Proposed Methodology: Coherence Matrix (example, θ = 3)

     A  B  C  D  E
  A  -  1  α  4  7
  B  1  -  β  2  5
  C  α  β  -  1  8
  D  4  2  1  -  1
  E  7  5  8  1  -

Proposed Methodology: Classified Coherence Matrix (example, θ = 3)

     A   B   C   D   E
  A  -   TP  FN  FP  FP
  B  TP  -   TN  TP  FP
  C  FN  TN  -   TP  FP
  D  FP  TP  TP  -   TP
  E  FP  FP  FP  TP  -
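The classification in this example can be sketched in a few lines. The string markers `ALPHA` and `BETA` stand in for the α and β entries and are an illustrative encoding, not something defined in the slides; counting each unordered gene pair once (the upper triangle) is likewise an assumption.

```python
from collections import Counter

ALPHA = "alpha"  # |inf - y|: a path exists in only one of the two networks
BETA = "beta"    # |inf - inf|: no path in either network

def classify(entry, theta):
    """Map one coherence-matrix entry to TP, FP, FN or TN."""
    if entry == ALPHA:
        return "FN"
    if entry == BETA:
        return "TN"
    return "TP" if entry <= theta else "FP"

# Upper triangle of the example coherence matrix.
cm = {
    ("A", "B"): 1, ("A", "C"): ALPHA, ("A", "D"): 4, ("A", "E"): 7,
    ("B", "C"): BETA, ("B", "D"): 2, ("B", "E"): 5,
    ("C", "D"): 1, ("C", "E"): 8,
    ("D", "E"): 1,
}

labels = {pair: classify(value, theta=3) for pair, value in cm.items()}
counts = Counter(labels.values())
print(labels)
print(dict(counts))
```

For θ = 3 this yields four TP, four FP, one FN and one TN over the ten gene pairs.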

Results Real data experiment n Input networks were obtained by applying four inference network techniques on the well-known yeast cell cycle expression data set (Spellman et al., 1998). Soinov et al., 2003. • Bulashevska et al., 2005. • Ponzoni (GRNCOP) et al., 2007 • n Comparison with Well-Known data: BioGrid • KEGG • SGD • YeastNet • 22

Results Real data experiment n Several studies were carried out using different threshold value combinations: q Distance threshold ( δ ) and Coherence level threshold ( θ ) have been modified from one to five, generating 25 different combinations. n The results show that the higher δ and θ values, the greater is the noise introduced. q The most representative result, was obtained for δ =4 and θ =1. 23

Results Soinov Bulashevska Ponzoni Accuracy F-measure Accuracy F-measure Accuracy F-measure Biogrid 0.65 0.79 0.82 0.90 0.27 0.42 KEGG 0.34 0.50 0.28 0.43 0.58 0.48 0.53 0.69 SGD 0.31 0.47 1 1 0.29 0.45 0.50 0.66 1 1 YeastNet
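The results are reported as accuracy and F-measure. The slides do not spell out the formulas, so the sketch below assumes the standard definitions in terms of TP, FP, FN and TN. The counts used in the usage lines are illustrative only: they are the ones obtained from the θ = 3 example matrix (4 TP, 4 FP, 1 FN, 1 TN), not values from the experiment.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of gene pairs classified as TP or TN."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0

def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Illustrative counts from the theta = 3 example, not experimental data.
tp, fp, fn, tn = 4, 4, 1, 1
acc = accuracy(tp, fp, fn, tn)
f1 = f_measure(tp, fp, fn)
print(round(acc, 2), round(f1, 2))
```

Both measures are ratios, so counting every pair twice (the full symmetric matrix) instead of once would give the same values.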

Results q These results are consistent with the experiment carried out in Ponzoni et al., 2007. q Ponzoni was successfully compared with Soinov and Bulashevska approaches. 25

Conclusions n A new approach of a gene network validation framework is presented: q The methodology not only takes into account the direct relationships, but also the indirect ones. q Graph theory has been used to perform validation task. 27

Conclusions n Experiments with Real Data . q These results are consistent with the experiment carried out in Ponzoni et al., 2007. q Ponzoni was successfully compared with Soinov and Bulashevska approaches. q These behaviours are also found in the obtained results. Ponzoni presents better coherence values than Soinov and Bulashevska in BioGrid, SGD and YeastNet. 28

Future Works n The methodology has been improved: q The elements in coherence matrix will be weighted based on the gene-gene relationships distance . q A new measure, based on different databases will be generated. n Moreover, a Cytoscape plugin will be implemented. 29

Some References
- Pavlopoulos GA, et al. (2011) Using graph theory to analyze biological networks. BioData Mining 4:10.
- Asghar A, et al. (2012) Speeding up the Floyd–Warshall algorithm for the cycled shortest path problem. Applied Mathematics Letters 25(1):1.
- Bulashevska S and Eils R (2005) Inferring genetic regulatory logic from expression data. Bioinformatics 21(11):2706.
- Ponzoni I, et al. (2007) Inferring adaptive regulation thresholds and association rules from gene expression data through combinatorial optimization learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 4(4):624.
- Poyatos JF (2011) The balance of weak and strong interactions in genetic networks. PLoS One 6(2):e14598.

Thanks for your attention
