BioinfoGRID Project: Bioinformatics Grid Application for life science



SLIDE 1

http://www.itb.cnr.it/bioinfogrid ISGC 2006 – Taipei – May 1st – 4th, 2006

BioinfoGRID Project: Bioinformatics Grid Application for life science

LEGRÉ Yannick (legre@clermont.in2p3.fr) CNRS/IN2P3, LPC Clermont-Ferrand

on behalf of the BioInfoGrid consortium

Credit slides: Luciano MILANESI, ITB CNR

SLIDE 2

BioinfoGRID Project

  • The BIOINFOGRID project proposes to combine bioinformatics services and applications for molecular biology users with the Grid infrastructure operated by the EGEE and EGEE-II projects.
  • In the BIOINFOGRID initiative we plan to evaluate genomics, transcriptomics, proteomics and molecular dynamics application studies based on GRID technology.

  • start date: 1st January 2006
  • end date: 31st December 2007
SLIDE 3

Introduction

  • A typical gene lab can produce 100 terabytes of information a year, the equivalent of 1 million encyclopedias.
  • Few biologists have the computational skills needed to fully explore such an astonishing amount of data; nor do they have the skills to explore the exploding amount of data being generated from clinical trials.
  • The immense amount of data that are available, and the knowledge is the tip of the data iceberg.

Bioinformatics: Emerging Opportunities and Emerging Gaps, Paula E. Stephan and Grant Black

SLIDE 4

Bioinformatics applications in GRID

ID   MURA_BACSU     STANDARD;      PRT;   429 AA.
DE   PROBABLE UDP-N-ACETYLGLUCOSAMINE 1-CARBOXYVINYLTRANSFERASE
DE   (EC 2.5.1.7) (ENOYLPYRUVATE TRANSFERASE) (UDP-N-ACETYLGLUCOSAMINE
DE   ENOLPYRUVYL TRANSFERASE) (EPT).
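A flat-file record like the entry on this slide is line-oriented: a two-letter code at the start of each line says what the line holds (ID for the identifier line, DE for the description). As a minimal sketch, assuming that layout and ignoring every other line type, such a record could be read in Python as follows. The names `ENTRY` and `parse_entry` are illustrative only and do not come from any BioinfoGRID tool.

```python
import re

# Copy of the entry shown on the slide, one line per line code.
ENTRY = """\
ID   MURA_BACSU     STANDARD;      PRT;   429 AA.
DE   PROBABLE UDP-N-ACETYLGLUCOSAMINE 1-CARBOXYVINYLTRANSFERASE
DE   (EC 2.5.1.7) (ENOYLPYRUVATE TRANSFERASE) (UDP-N-ACETYLGLUCOSAMINE
DE   ENOLPYRUVYL TRANSFERASE) (EPT).
"""


def parse_entry(text):
    """Extract the entry name, sequence length and description (hypothetical helper)."""
    record = {"id": None, "length": None, "description": ""}
    description_parts = []
    for line in text.splitlines():
        # The line code is everything before the first space.
        code, _, body = line.partition(" ")
        body = body.strip()
        if code == "ID":
            # Entry name first, sequence length just before "AA." at the end.
            match = re.match(r"(\S+)\s+.*?(\d+)\s+AA\.$", body)
            if match:
                record["id"] = match.group(1)
                record["length"] = int(match.group(2))
        elif code == "DE":
            # Description lines are continuation lines; join them with spaces.
            description_parts.append(body)
    record["description"] = " ".join(description_parts)
    return record


if __name__ == "__main__":
    print(parse_entry(ENTRY))
```

Records of this kind are independent of each other, which is why database-wide scans split naturally into many small grid jobs.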

SLIDE 5

BIOINFOGRID: Workpackages

WP    Description
WP1   Genomics Applications in GRID
WP2   Proteomics Applications in GRID
WP3   Transcriptomics Applications in GRID
WP4   Database and Functional Genomics Applications
WP5   Molecular Dynamics Applications
WP6   Coordination of technical aspects and relation with RI Projects, user training, application support and resources integration
WP7   Dissemination and Outreach
WP8   Project Management Office

SLIDE 6

Genomics applications in GRID

Aim of WP1: use of the computational GRID to analyse molecular biological data at the genomic scale

Projects

  • The GRID version of the Portal system: unification of larger groups of bioinformatics tools into single analytical steps and their optimization for GRID
  • GRID analysis of cDNA data: computer-aided functional annotation of cDNAs in order to optimize sensitivity and specificity

WP1

SLIDE 7

Genomics applications in GRID

  • GRID analysis of genomic databases: integration of precomputed data, gene identification, differentiation of pseudogenes, comparative genome analysis, etc.
  • Multiple alignments: testing of new algorithms for computationally very demanding alignment procedures, optimization for GRID.

WP1

SLIDE 8

Proteomics Applications in GRID

Aim of WP2: use of computational GRIDs to analyse molecular biological data in proteomics

Projects

  • Perform functional protein analysis in GRID: testing the functional protein domain annotations of large protein families using GRID and related databases.

WP2

SLIDE 9

Proteomics Applications in GRID

  • Protein surface calculation in GRID: the grid will be used to elaborate the volumetric description of the protein, obtaining a precise representation of the corresponding surface.

WP2

SLIDE 10

Transcriptomics applications in GRID

Aim of WP3: use of computational GRIDs to analyse transcriptomics data and to apply phylogenetic methods based on estimated trees.

Projects

  • Algorithmic tools for gene expression data analysis in GRID: evaluate the computational tools for extracting biologically significant information from gene expression data.
  • Algorithms will focus on clustering steady-state and time-series gene expression data, multiple testing and meta-analysis of different microarray experiments from different groups, and identification of transcription sites.

WP3
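The slide mentions multiple testing without naming a procedure. As an illustration only, the sketch below uses the Benjamini-Hochberg step-up procedure, a common choice when thousands of genes are tested at once. Choosing this procedure is an assumption, and the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank k such that the k-th smallest
    p-value is <= (k / m) * alpha, then reject the k smallest p-values.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * alpha:
            cutoff_rank = rank  # keep the largest rank that passes

    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[index] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical per-gene p-values from a differential expression test.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p))
```

Each gene's p-value can be computed independently on a grid worker node; only this final correction step needs all of them together.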

SLIDE 11

Transcriptomics applications in GRID EGEEII

Data analysis specific to microarrays, allowing the GRID user to store and search microarray data, with direct access to the data files stored on Data Storage Elements on GRID servers.

Researchers perform their activities regardless of geographical location, interact with colleagues, and share and access data.

Scientific instruments and experiments provide huge amounts of data from microarrays.

WP3

SLIDE 12

Phylogenetic application in GRID

  • Phylogenetic methods: reconstructing the evolutionary history of a group of taxa is a major research thrust in computational biology and a standard part of exploratory sequence analysis. An evolutionary history not only gives relationships among taxa, but is also an important tool for inferring the universal tree of life, inferring structural, physiological, and biochemical properties of sequences from other similar sequences, and reconstructing tissue evolution.

WP3

SLIDE 13

Database Applications in GRID

Aim of WP4: this WP will make it possible to manage biological databases using the GRID EGEE-II infrastructure.

Projects

  • Biological databases on GRID: these databases will be complemented by others publicly available on the Internet, using GRID and web services where appropriate.

WP4

SLIDE 14

Functional Analogous Finder

  • Functional Analogous Finder: using the GO terms and their associations to gene products, together with a simple chi-square approach, we plan to compare the total associated GO terms and their ascending parents to validate the functional analogy between two gene products.
  • In addition, we weight each GO term depending on how often it is used; the more a term is used to describe different gene products, the less specific it is and the lower its weighting and impact on the statistic.
  • A search within the UniProt products for a functional analogue therefore involves a comparison of the GO terms of the gene product of interest with the GO terms.

WP4
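The comparison described on this slide can be sketched in a few lines. The sketch rests on several assumptions: the ontology and annotations below are toy data with made-up term names, the weight of a term is taken as the negative log of the fraction of gene products it describes, and a weighted overlap ratio stands in for the chi-square statistic mentioned on the slide, which is not reproduced here. All function names are hypothetical.

```python
import math

# Toy ontology (hypothetical): child term -> set of parent terms.
PARENTS = {
    "kinase": {"transferase"},
    "transferase": {"catalytic"},
    "hydrolase": {"catalytic"},
    "catalytic": {"function"},
    "binding": {"function"},
    "function": set(),
}

# Toy annotations (hypothetical): gene product -> directly assigned terms.
ANNOTATIONS = {
    "P1": {"kinase", "binding"},
    "P2": {"kinase"},
    "P3": {"hydrolase", "binding"},
    "P4": {"binding"},
}


def with_ancestors(terms):
    """Return the given terms plus all their ascending parents."""
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(PARENTS.get(term, ()))
    return seen


def term_weights(annotations):
    """Weight each term by -log(fraction of gene products it describes).

    A term used for every product gets weight 0; rare terms weigh more.
    """
    n = len(annotations)
    counts = {}
    for terms in annotations.values():
        for term in with_ancestors(terms):
            counts[term] = counts.get(term, 0) + 1
    return {term: -math.log(count / n) for term, count in counts.items()}


def analogy_score(a, b, annotations, weights):
    """Weighted overlap of the expanded term sets of two gene products."""
    terms_a = with_ancestors(annotations[a])
    terms_b = with_ancestors(annotations[b])
    shared = sum(weights[t] for t in terms_a & terms_b)
    total = sum(weights[t] for t in terms_a | terms_b)
    return shared / total if total else 0.0


WEIGHTS = term_weights(ANNOTATIONS)

if __name__ == "__main__":
    for other in ("P2", "P3", "P4"):
        print("P1 vs", other, round(analogy_score("P1", other, ANNOTATIONS, WEIGHTS), 3))
```

Scoring one gene product against every other product is independent per pair, so a database-wide search of this kind can be split across grid jobs by candidate.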

SLIDE 15

Molecular applications in GRID

Aim of WP5: the objective is to test the scalability of Molecular Dynamics simulations, which usually take a very long time to complete relevant analyses. Analysis will be performed notably using as a starting point the data generated by the WISDOM application on the EGEE infrastructure.

Projects

  • Wide In Silico Docking On Malaria initiative (WISDOM): this protocol has to coordinate the different analysis steps in order to complete the simulation on the GRID platform.

WP5

SLIDE 16

WISDOM Data challenge

  • Mandate: deploy a Molecular Dynamics application in a grid environment
      – Evaluation of performance
  • Goal: contribute to the WISDOM initiative dedicated to in silico drug discovery
      – Start from the results of the WISDOM docking data challenge in 2005
      – Rerank the best hits using Molecular Dynamics
  • Strategy: deployment of MD software on different grid infrastructures
      – Grid of PCs: EGEE-II
      – Grid of supercomputers: DEISA
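The two-stage idea behind the goal on this slide (start from docking results, then rerank the best hits with Molecular Dynamics) can be sketched as below. This is an illustration under stated assumptions, not the WISDOM protocol: compound names and scores are invented, lower values are assumed to be better in both stages, and `md_energy` is a hypothetical stand-in for an expensive MD run that would be submitted as a grid job.

```python
def rerank_best_hits(docking_scores, md_energy, top_n):
    """Keep the top_n compounds by docking score, then reorder them by MD energy.

    docking_scores: dict mapping compound id -> docking score (lower is better, assumed).
    md_energy: callable returning an MD-derived energy for a compound id
               (hypothetical stand-in for a simulation job).
    """
    # Stage 1: cheap filter, keep only the best docking hits.
    best_hits = sorted(docking_scores, key=docking_scores.get)[:top_n]
    # Stage 2: expensive refinement, applied only to the surviving hits.
    return sorted(best_hits, key=md_energy)


if __name__ == "__main__":
    # Invented example values.
    docking = {"cmpd_a": -9.1, "cmpd_b": -8.7, "cmpd_c": -6.0, "cmpd_d": -8.9}
    md_results = {"cmpd_a": -30.2, "cmpd_b": -41.5, "cmpd_c": -50.0, "cmpd_d": -35.8}
    print(rerank_best_hits(docking, md_results.get, top_n=3))
```

Because the MD stage dominates the cost, restricting it to the best docking hits is what keeps the workload tractable, and each hit can run as a separate job.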

SLIDE 17

Dissemination and Outreach

  • The BIOINFOGRID initial training course: http://www.itb.cnr.it/bioinfogrid/project-events/initial-training-course
  • Course objectives
      – provide bioinformatics users with a general overview of the state of the art in the development of Grid middleware and infrastructures. In particular, the state of the LCG and gLite middleware and of the EGEE infrastructure will be presented;
      – provide detailed technical information and precise instructions on how to use the GRID, to enable new users to start using the Grid in the best possible way.
  • A BioinfoGRID International Conference will be organized towards the end of the project in 2007.

WP6, WP7

SLIDE 18

http://www.itb.cnr.it/bioinfogrid

SLIDE 19

CREDITS

  • Suhai Sándor (DKFZ), Germany
  • Mazzucato Mirco (INFN), Italy
  • Breton Vincent (CNRS/IN2P3), France
  • Giorgio Maggi (INFN), Italy
  • Legre Yannick (CNRS/IN2P3), France
  • Francesco Beltrame (DIST), Italy
  • Lio’ Pietro (University of Cambridge), UK
  • Meloni Giovanni (CILEA), Italy
  • Giselle Andreas (CNR-ITB), Italy
  • Ivan Merelli (CNR-ITB), Italy
SLIDE 20

Thank you for your attention!

4th HealthGrid conference

6th – 9th June Valencia (Spain) http://valencia2006.healthgrid.org