
Identification of fever and vaccine-associated gene interaction networks - PowerPoint PPT Presentation



  1. Identification of fever and vaccine-associated gene interaction networks using ontology-based literature mining. Arzucan Özgür (Bogazici University); Junguk Hur, Zuoshuang Xiang, and Yongqun Oliver He (University of Michigan). The VDOSME workshop, ICBO 2012, July 21, 2012.

  2. Motivation

  3. Fever
     • Fever is a symptom of abnormal elevation of body temperature, usually as a result of a pathologic process.
     • Fever-associated genes include PGE2, PLA2, COX-2, PTGES, and many cytokines.
     • Many vaccines cause fever, but which fever-related genes vaccination perturbs, and how, is unclear.
     • Goal: identify gene-gene and gene-vaccine interaction networks associated with fever processes using ontology-based literature mining.

  4. Workflow

  5. Fever-related literature-derived network

  6. Literature Corpus
     • Fever-related articles obtained from PubMed:
       – Query: “Fever OR Hyperthermia OR Pyrexia OR Febrile OR Pyrexial” → 179,156 articles
     • Vaccine and fever-related articles:
       – including the terms “vaccine”, “vaccination”, and their variants (e.g., “vaccines”) → 6,224 articles
       – including 186 specific vaccine names from VO → 6,537 articles
     • Sentences of titles and abstracts obtained from:
       – BioNLP database in the National Center for Integrative Biomedical Informatics (NCIBI; http://ncibi.org/)

  7. Vaccine Ontology Support - Motivating Example
     • “These results suggest that the BCG-CWS induces TNF-alpha secretion from DC via TLR2 and TLR4 and that the secreted TNF-alpha induces the maturation of DC per se.” [PMID: 11083809]
       – The term “vaccine” or its variants does not occur in the abstract.
       – Bacillus Calmette-Guérin (BCG) is a licensed tuberculosis vaccine that protects against infection with Mycobacterium tuberculosis.
     • We use the Vaccine Ontology for two purposes:
       1) Obtain vaccine-related literature.
       2) Identify specific vaccine-gene interactions.
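The motivating example can be illustrated with a minimal sketch, assuming a plain dictionary lookup stands in for the ontology-based tagging. The pattern `GENERIC_VACCINE`, the one-entry `VO_TERMS` table, and the helper `vo_mentions` are hypothetical names for illustration; they are not the VO-SciMiner implementation.

```python
import re

# Generic pattern for "vaccine", "vaccines", "vaccination", "vaccinated", ...
GENERIC_VACCINE = re.compile(r"\bvaccin\w*", re.IGNORECASE)

# Hypothetical one-entry stand-in for the 186 VO vaccine names used in the talk.
VO_TERMS = {"BCG": "Bacillus Calmette-Guérin vaccine"}


def vo_mentions(text, terms=VO_TERMS):
    """Return the dictionary terms that occur in `text` as whole words."""
    return [t for t in terms if re.search(r"\b" + re.escape(t) + r"\b", text)]


sentence = (
    "These results suggest that the BCG-CWS induces TNF-alpha secretion "
    "from DC via TLR2 and TLR4 and that the secreted TNF-alpha induces "
    "the maturation of DC per se."
)

has_generic = bool(GENERIC_VACCINE.search(sentence))  # keyword search misses it
found = vo_mentions(sentence)                          # dictionary lookup finds BCG
print(has_generic, found)
```

A keyword search for "vaccine" and its variants skips this sentence, while a lookup against specific vaccine names retrieves it.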

  8. Vaccine Ontology (VO)
     • Ontology of the vaccine domain for vaccine data standardization, integration, and analysis.
     • http://www.violinet.org/vaccineontology/
     • Classifies a large number of existing vaccines (> 1,000 vaccines) in licensed use, on trial, or in research.
     • Follows the OBO Foundry principles.
     • Led by Yongqun “Oliver” He (co-author of this paper).

  9. VO Terms Obtained from http://www.ontobee.org/

  10. Gene and Vaccine Name Identification
      • Gene names tagged using SciMiner (Hur et al., Bioinformatics, 2009)
        – Dictionary- and rule-based system
        – F-score: 76%
        – Genes reported as official human genes based on the HUGO Gene Nomenclature Committee database (http://www.genenames.org/).
      • VO-SciMiner used to identify vaccine names based on a set of 186 VO terms (Hur et al., BMC Immunology, 2011)
        – F-score: 95%

  11. Interaction Extraction
      • Example sentence: “IL-2 and IL-15 induced the production of IL-17 and IFN-gamma in a dose dependent manner by PBMCs.”
      • Path between proteins: good description of the interaction (labeled “interaction”).
      • No semantic relation between them: labeled “no interaction”.
      • Stanford Parser is used to generate the dependency parse trees (de Marneffe et al., 2006).
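As a rough sketch of how a path between two proteins can be read off a dependency tree, the snippet below runs a breadth-first search over a hand-written edge list. The `EDGES` list is an assumption reconstructed from the paths shown on the next slide; it is not actual Stanford Parser output, and `build_graph` and `dependency_path` are hypothetical helper names.

```python
from collections import deque

# Assumed (head, dependent, relation) edges for the example sentence.
EDGES = [
    ("induced", "IL-2", "nsubj"),
    ("IL-2", "IL-15", "conj_and"),
    ("induced", "production", "dobj"),
    ("production", "IL-17", "prep_of"),
    ("IL-17", "IFN-gamma", "conj_and"),
]


def build_graph(edges):
    """Treat the dependency tree as an undirected labeled graph."""
    graph = {}
    for head, dep, label in edges:
        graph.setdefault(head, []).append((dep, label))
        graph.setdefault(dep, []).append((head, label))
    return graph


def dependency_path(graph, source, target):
    """Shortest path from source to target, alternating words and relations."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]  # a path always ends in a word, never a relation label
        if node == target:
            return path
        for neighbor, label in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [label, neighbor])
    return None  # the two words are not connected


graph = build_graph(EDGES)
path_a = dependency_path(graph, "IL-2", "IL-17")
path_b = dependency_path(graph, "IL-17", "IFN-gamma")
print(" - ".join(path_a))
print(" - ".join(path_b))
```

The first path runs through the verb "induced", which describes an interaction; the second is only a conjunction between two co-listed proteins.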

  12. Path Edit Kernel
      • Edit distance: minimum number of operations (insertion, deletion, or substitution of a single word) needed to transform the first string into the second.
      • Path1: IL2 – nsubj – induced – dobj – production – prep_of – IL-17
      • Path2: IL2 – nsubj – induced – dobj – production – prep_of – IL-17 – conj_and – IFN-gamma
      • Path3: IL-17 – conj_and – IFN-gamma
      • Edit distance (Path1 -> Path2) = 2 (2 insertions)
      • Edit distance (Path1 -> Path3) = 8 (6 deletions + 2 insertions)
      • Convert to a similarity function: $\mathrm{EditSim}(p_i, p_j) = e^{-\gamma \, \mathrm{EditDist}(p_i, p_j)}$
      • Integrated as a kernel function into the SVMlight package (Joachims, 1999).
      • 56% F-score on AIMED, 85% F-score on CB (Erkan et al., EMNLP, 2007; Ozgur et al., Journal of Biomedical Semantics, 2011).
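The edit distance and its conversion to a similarity can be sketched as follows. This is a minimal word-level dynamic program, not the authors' SVM kernel code; the value `gamma=1.0` and the `sub_cost` parameter are assumptions added for illustration. With unit-cost substitutions the Path1 -> Path3 distance comes out as 7 (4 deletions + 3 substitutions); the value of 8 given on the slide corresponds to counting only deletions and insertions, which `sub_cost=2` reproduces.

```python
import math


def edit_distance(a, b, sub_cost=1):
    """Word-level edit distance between token lists (insert/delete cost 1)."""
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else sub_cost
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def edit_sim(a, b, gamma=1.0, sub_cost=1):
    """EditSim(p_i, p_j) = exp(-gamma * EditDist(p_i, p_j)); gamma is assumed."""
    return math.exp(-gamma * edit_distance(a, b, sub_cost))


path1 = "IL2 nsubj induced dobj production prep_of IL-17".split()
path2 = path1 + ["conj_and", "IFN-gamma"]
path3 = ["IL-17", "conj_and", "IFN-gamma"]

d12 = edit_distance(path1, path2)              # two insertions
d13_indel = edit_distance(path1, path3, 2)     # deletions and insertions only
d13_unit = edit_distance(path1, path3)         # unit-cost substitutions allowed
print(d12, d13_indel, d13_unit, edit_sim(path1, path2))
```

Identical paths get similarity 1, and the similarity decays exponentially as the paths diverge.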

  13. Gene-gene interaction networks
      • Gene-gene interactions in all fever-related articles
      • Articles containing the term “vaccine” and its variants
      • Articles containing the term “vaccine” and its variants + terms in the Vaccine Ontology (VO)

  14. Generic fever-related network

  15. Vaccine/VO-associated fever-related network

  16. Centrality Analysis

  17. Degree Centrality
      • The number of nodes a given node is connected to: $k_i = \sum_{j=1}^{n} A_{ij}$
      • Measures the extent of influence a node has on the network
      • The more neighbors a node has, the more important it is
      • In the example graph (nodes x, y, z): degree centrality of x = 5; of y = 2
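A minimal sketch of the row-sum definition, using a small hypothetical graph (not the graph drawn on the slide); `NODES`, `EDGES`, and both helper functions are illustrative names.

```python
# Hypothetical undirected toy graph.
NODES = ["x", "y", "a", "b", "c"]
EDGES = [("x", "a"), ("x", "b"), ("x", "c"), ("x", "y"), ("y", "c")]


def adjacency_matrix(nodes, edges):
    """Build the symmetric 0/1 adjacency matrix A of an undirected graph."""
    index = {name: i for i, name in enumerate(nodes)}
    A = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        A[index[u]][index[v]] = 1
        A[index[v]][index[u]] = 1
    return A


def degree_centrality(A):
    """k_i = sum_j A_ij, i.e. the row sums of the adjacency matrix."""
    return [sum(row) for row in A]


A = adjacency_matrix(NODES, EDGES)
degrees = dict(zip(NODES, degree_centrality(A)))
print(degrees)
```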

  18. Eigenvector Centrality
      • Proportional to the sum of the centralities of the neighbors of a given node: $x_i = \lambda^{-1} \sum_{j=1}^{n} A_{ij} x_j$
      • In matrix representation: $\lambda x = A x$
      • For a non-negative centrality vector, $\lambda$ is the largest eigenvalue of $A$ and $x$ is the corresponding eigenvector
      • Not all neighbors contribute equally to the centrality of a node
      • Defined as “prestige” in social networks
      • The prestige of a person depends not only on how many friends he has, but also on who (how prestigious) his friends are
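One common way to obtain the leading eigenvector is power iteration, sketched below on a hypothetical toy graph. The talk does not say how the centralities were computed, so the method, the iteration count, and the graph are all assumptions. Plain power iteration converges here because the toy graph is connected and contains a triangle.

```python
import math

# Hypothetical toy graph: x is linked to y, a, b, c; y is also linked to c.
NODES = ["x", "y", "a", "b", "c"]
A = [
    [0, 1, 1, 1, 1],  # x
    [1, 0, 0, 0, 1],  # y
    [1, 0, 0, 0, 0],  # a
    [1, 0, 0, 0, 0],  # b
    [1, 1, 0, 0, 0],  # c
]


def mat_vec(M, v):
    """Multiply matrix M by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def eigenvector_centrality(M, iterations=200):
    """Power iteration: repeatedly apply M and renormalise to unit length."""
    vec = [1.0] * len(M)
    lam = 0.0
    for _ in range(iterations):
        nxt = mat_vec(M, vec)
        lam = math.sqrt(sum(v * v for v in nxt))  # tends to the largest eigenvalue
        vec = [v / lam for v in nxt]
    return lam, vec


lam, vec = eigenvector_centrality(A)
scores = dict(zip(NODES, vec))
print(round(lam, 3), {k: round(v, 3) for k, v in scores.items()})
```

In this graph y and c have the same degree as each other but score higher than a and b, because they are attached both to the hub x and to each other.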

  19. Closeness Centrality
      • Inverse sum of the geodesic distances from a given node to the other nodes in the network: $\mathrm{closeness}(i) = \left[ \sum_{j=1}^{n} d_{ij} \right]^{-1}$
      • Geodesic distance $d_{ij}$: length of the shortest path between node i and node j
      • The closer a node is to the other nodes, the more important it is
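A minimal sketch of this definition for an unweighted graph, using breadth-first search for the geodesic distances. The toy `GRAPH` is hypothetical, and the sketch assumes the graph is connected (otherwise unreachable nodes are silently left out of the sum).

```python
from collections import deque

# Hypothetical connected toy graph as an adjacency list.
GRAPH = {
    "x": ["a", "b", "c", "y"],
    "y": ["x", "c"],
    "a": ["x"],
    "b": ["x"],
    "c": ["x", "y"],
}


def bfs_distances(graph, source):
    """Geodesic (shortest-path) distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(graph, node):
    """Inverse of the summed geodesic distances to all other nodes."""
    dist = bfs_distances(graph, node)
    return 1.0 / sum(d for other, d in dist.items() if other != node)


closeness_scores = {n: closeness(GRAPH, n) for n in GRAPH}
print(closeness_scores)
```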

  20. Betweenness Centrality
      • For a node i: sum, over all pairs of nodes, of the proportion of shortest paths passing through i: $\mathrm{Betweenness}(i) = \sum_{j<k} g_{jk}(i) / g_{jk}$
      • $g_{jk}(i)$: number of geodesics between j and k passing through i; $g_{jk}$: total number of geodesics between j and k
      • Measures the control of a node over the information flow of the network
      • A node is important if it is on many geodesics
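The sum over pairs can be sketched directly, counting geodesics with breadth-first search. The toy `GRAPH` is hypothetical, and excluding pairs that have i as an endpoint is an assumption; the slide does not state that convention.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy graph as an adjacency list.
GRAPH = {
    "x": ["a", "b", "c", "y"],
    "y": ["x", "c"],
    "a": ["x"],
    "b": ["x"],
    "c": ["x", "y"],
}


def bfs_counts(graph, source):
    """Return (dist, sigma): geodesic lengths and geodesic counts from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:              # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:     # u precedes v on some geodesic
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(graph, i):
    """Sum over pairs j<k (both different from i) of g_jk(i) / g_jk."""
    info = {s: bfs_counts(graph, s) for s in graph}
    dist_i, sigma_i = info[i]
    total = 0.0
    for j, k in combinations(sorted(graph), 2):
        if i in (j, k):
            continue
        dist_j, sigma_j = info[j]
        if k not in dist_j or i not in dist_j:
            continue                       # unreachable: no geodesics to count
        if dist_j[i] + dist_i[k] == dist_j[k]:
            # geodesics j..k through i = (geodesics j..i) * (geodesics i..k)
            total += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return total


betweenness_scores = {n: betweenness(GRAPH, n) for n in GRAPH}
print(betweenness_scores)
```

Here x lies on every geodesic between the leaf nodes, while y lies on none, even though y has two neighbors.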

  21. Genes that rank high in both networks - 7 genes - well studied in both contexts

  22. Genes that rank high in generic fever network - 7 genes - not well studied in vaccine context

  23. Genes that rank high in vaccine/VO fever network - 7 genes - well studied in vaccine context

  24. Gene Set Enrichment Analysis

  25. Gene Set Enrichment Analysis
      • The Database for Annotation, Visualization and Integrated Discovery (DAVID) was used.
      • 997 significantly over-represented functional terms (GO or KEGG) in the fever network.
      • 239 significantly over-represented functional terms (GO or KEGG) in the VO-associated fever network.
      • New scientific hypotheses can be generated (e.g., the role of the phosphorylation process in vaccine-induced fever response).

  26. Top 10 most significantly enriched biological functions. Values are $-\log_{10}$ of the Benjamini-Hochberg corrected P-values.
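For readers unfamiliar with the plotted quantity, here is a minimal sketch of the Benjamini-Hochberg adjustment followed by the $-\log_{10}$ transform. The raw P-values are made up for illustration, and the function is a generic textbook version, not DAVID's own code.

```python
import math


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw_p = [0.001, 0.01, 0.02, 0.5]          # hypothetical raw P-values
adj_p = benjamini_hochberg(raw_p)
scores = [-math.log10(p) for p in adj_p]  # larger value = more significant
print(adj_p, scores)
```

On this scale a corrected P-value of 0.01 maps to 2, and smaller P-values map to larger bars.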

  27. Gene-Vaccine Interaction Network

  28. Gene-Vaccine Interactions
      • SVM pipeline applied to extract gene-vaccine interactions.
      • 1,716 articles containing 2,835 interactions identified.
        – 32 articles also related to fever
        – 52 sentences with 44 unique interactions identified.
        – Specific vaccines: Brucella vaccine RB51, Shigella flexneri vaccine S602, Shigella sonnei strain WRSS1, and Shigella dysenteriae 1 strain WRSd1.
      • New scientific hypotheses can be generated.

  29. Fever-related gene-vaccine interaction network
      • Green: vaccine; Red: genes; Blue: genes associated with vaccines
      • New hypothesis: e.g., TLR-vaccine interactions in inducing fever response.

  30. Conclusion
      • Identification of fever and fever-vaccine associated gene-gene and gene-vaccine networks
      • Improved mining performance by VO
      • Phosphorylation-focused regulation enriched in the fever vaccine subnetwork suggests its crucial role
      • Identification of TLRs as potential key factors in vaccine-induced fever responses.

  31. Future work
      • Expansion of the networks by
        – including more specific vaccines (via improved VO)
        – using a sentence-level co-citation rather than SVM-based approach
        – integrating the Ontology of Adverse Events (OAE; http://www.oae-ontology.org)
      • Apply the ontology-based literature mining approach to different domains.

  32. Acknowledgments
      • University of Michigan: Dragomir Radev, Junguk Hur, Eva Feldman, Yongqun He, Alex Ade, Zuoshuang Xiang, Brian Athey, Rebecca Racz
      • Funding: NIH grant R01AI081062 & Marie Curie Career Integration Grant within the 7th European Community Framework Programme.

  33. Thank you!
