Virtual Integration of Existing Web Databases for the Genotypic Selection of Cereal Cultivars

DB Group @ unimo. 3rd International Workshop on Data Integration in Life Sciences (DILS 06), European Bioinformatics Institute (EBI)


SLIDE 1

Virtual Integration of Existing Web Databases for the Genotypic Selection of Cereal Cultivars

Sonia Bergamaschi - Antonio Sala
www.dbgroup.unimo.it

DB Group @ unimo
Dipartimento di Ingegneria dell’Informazione, Università di Modena e Reggio Emilia, via Vignolese 905, 41100 Modena

3rd International Workshop on Data Integration in Life Sciences (DILS 06), European Bioinformatics Institute (EBI), Hinxton, UK, 20-22 July 2006

SLIDE 2

Motivations

  • To perform intelligent data integration of existing databases to create a Global Virtual View (GVV) for the genotypic selection of cereal cultivars.
  • The GVV has been realized with the MOMIS system (Mediator envirOnment for Multiple Information Sources), developed by the Database Group of the University of Modena and Reggio Emilia as part of the CEREALAB project. The project is conducted by the Agrarian Faculty of the University of Modena and Reggio Emilia in collaboration with, and funded by, the Regional Government of Emilia Romagna.

SLIDE 3

The MOMIS Integration Process

[Figure: the MOMIS integration process. Diagram labels: structured sources (Graingenes, CEREALAB, Gramene); wrapping into ODLi3 local schemas 1-3; automatic/manual and semi-automatic annotation (synsets); Common Thesaurus generation from schema-derived, lexicon-derived, inferred, and user-supplied relationships; clusters generation; GVV generation (CEREALAB) with global classes and mapping tables; OWL export.]

SLIDE 4

The ODLi3 Language

MOMIS uses an object-oriented language called ODLi3 as a common data model for integrating a given set of local information sources. ODLi3 extends ODL with the following relationships expressing intra- and inter-schema knowledge for the source schemata:

  • SYN (synonym of)
  • BT (broader terms)
  • NT (narrower terms)
  • RT (related terms)

By means of ODLi3, only one language is exploited to describe both the sources (the input of the synthesis process) and the GVV (the result of the process).

ODLi3 is based on the OCDL description logics. Translators ODLi3/OCDL and OCDL/ODLi3 are available.
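
The four relationship kinds listed above can be pictured as labeled edges between schema terms. The following is a minimal Python sketch, not ODLi3 syntax and not MOMIS code: the `Relationship` type, the `schema.class` naming, and the example terms are all hypothetical. It relies only on the usual thesaurus convention that SYN and RT are symmetric and that BT and NT are inverses of each other.

```python
from dataclasses import dataclass

# Inverse of each relationship kind (usual thesaurus convention).
INVERSE = {"SYN": "SYN", "RT": "RT", "BT": "NT", "NT": "BT"}


@dataclass(frozen=True)
class Relationship:
    source: str  # term written as "schema.class" (hypothetical naming)
    kind: str    # one of SYN, BT, NT, RT
    target: str


def inverse(rel):
    """Return the same relationship read in the opposite direction."""
    return Relationship(rel.target, INVERSE[rel.kind], rel.source)


def close_under_inverse(rels):
    """Add the inverse of every relationship to the set."""
    closed = set(rels)
    closed.update(inverse(r) for r in rels)
    return closed


# Hypothetical terms, not taken from the actual source schemata.
thesaurus = close_under_inverse({
    Relationship("gramene.Marker", "SYN", "graingenes.Probe"),
    Relationship("cerealab.Cultivar", "NT", "gramene.Germplasm"),
})
```

Storing both directions makes lookups symmetric: asking for the broader terms of one class and the narrower terms of the other returns the same edge.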

SLIDE 5

Global Virtual View Generation

MOMIS

  • Identifies and groups similar ODLi3 classes (classes that describe the same or semantically related concept in different sources) into clusters (global classes)
  • Generates mappings among global and local classes in the cluster

Cluster generation: affinity coefficients are evaluated for all possible pairs of ODLi3 classes, based on the relationships in the Common Thesaurus, properly strengthened

– Affinity coefficients determine the degree of matching of two classes based on:
    – their names (Name Affinity coefficient)
    – their attributes (Structural Affinity coefficient)
– Affinity coefficients are fused into Global Affinity coefficients, calculated by means of a linear combination of the two coefficients
– Global Affinity coefficients are used by a hierarchical clustering algorithm to include ODLi3 classes in clusters according to their degree of affinity

  • The designer may interactively refine and complete the proposed integration results
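
The cluster generation step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MOMIS implementation: the slide does not give the weights of the linear combination, the linkage criterion, or the stopping rule, so the equal weights, the single-link merge, and the threshold below are assumed, and the class names and coefficients are invented.

```python
from itertools import combinations


def global_affinity(name_aff, struct_aff, w_name=0.5):
    """Linear combination of the two coefficients (the weight is an assumed value)."""
    return w_name * name_aff + (1.0 - w_name) * struct_aff


def cluster(classes, affinity, threshold):
    """Agglomerative clustering: repeatedly merge the two clusters with the
    highest affinity until no pair reaches the threshold (assumed stopping rule)."""
    clusters = [frozenset([c]) for c in classes]

    def link(a, b):
        # Single linkage (an assumption): affinity of the closest pair of members.
        return max(affinity.get(frozenset([x, y]), 0.0) for x in a for y in b)

    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2), key=lambda p: link(*p))
        if link(a, b) < threshold:
            break
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return set(clusters)


# Hypothetical classes with (name affinity, structural affinity) per pair.
pairs = {
    ("S1.Marker", "S2.Marker"): (0.9, 0.7),
    ("S1.Marker", "S3.Trait"): (0.1, 0.2),
    ("S2.Marker", "S3.Trait"): (0.2, 0.1),
}
affinity = {frozenset(p): global_affinity(n, s) for p, (n, s) in pairs.items()}
classes = ["S1.Marker", "S2.Marker", "S3.Trait"]
global_classes = cluster(classes, affinity, threshold=0.5)
```

With these invented numbers, the two `Marker` classes end up in one cluster and `S3.Trait` stays alone. Each resulting cluster corresponds to one candidate global class that the designer could then refine.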

SLIDE 6

Mapping Refinement

  • A Mapping Table (MT) is automatically generated for each global class of a GVV.
  • The designer can extend the MT by adding:
      – Data Conversion Functions from local to global attributes
      – Join Conditions among pairs of local classes
      – Resolution Functions for global attributes, to solve data conflicts of local attribute values
  • MOMIS provides some standard kinds of resolution functions for solving data conflicts for each global attribute that maps onto local attributes coming from more than one local source:
      – Random
      – Aggregation
      – Coalescence
      – Precedence function
      – All Values
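
The slide only names the resolution functions, so the Python sketch below is one plausible reading of each name rather than the MOMIS definitions. The function signatures, the use of `None` for a missing value, the default aggregate (mean), and the source names and values are all assumptions made for illustration.

```python
import random
from statistics import mean

# Each function takes {source_name: value}; None marks a missing value.


def coalesce(values, order):
    """First non-null value, scanning the sources in the given order."""
    return next((values[s] for s in order if values.get(s) is not None), None)


def precedence(values, preferred):
    """Always take the value of the preferred source, even if it is null."""
    return values.get(preferred)


def aggregation(values, fn=mean):
    """Combine all non-null values with an aggregate function (mean by default)."""
    present = [v for v in values.values() if v is not None]
    return fn(present) if present else None


def all_values(values):
    """Keep every distinct non-null value."""
    return sorted({v for v in values.values() if v is not None})


def random_choice(values, rng=random):
    """Pick one of the non-null values at random."""
    present = [v for v in values.values() if v is not None]
    return rng.choice(present) if present else None


# Hypothetical conflicting values for one global attribute.
local = {"gramene": 12.0, "graingenes": None, "cerealab": 14.0}
```

For the sample values, `coalesce(local, ["graingenes", "cerealab", "gramene"])` skips the missing value and yields the `cerealab` entry, while `precedence(local, "graingenes")` would return `None` under the reading assumed here.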

SLIDE 7

The MOMIS Query Manager

The MOMIS Query Manager is the coordinated set of functions that allows the user to query the GVV. Query processing consists of the following steps:

  • Query rewriting
      – a global query is rewritten as an equivalent set of queries expressed on the local sources (local queries)
  • Local queries execution
      – the local queries are sent to and executed at the local sources
  • Fusion and Reconciliation
      – the local answers are fused into the global answer
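
The three steps above can be traced on a toy example. The Python sketch below is a simplified illustration, not the MOMIS Query Manager: it handles only projections, models local sources as in-memory lists of rows, and fuses answers on a single join key. The mapping table, source names, attribute names, and data are hypothetical.

```python
# Hypothetical mapping table: global attribute -> {source: local attribute}.
MAPPING = {
    "name": {"src_a": "cultivar_name", "src_b": "cv"},
    "yield": {"src_a": "yield_t_ha"},
    "gene": {"src_b": "gene_symbol"},
}

# Hypothetical local sources, modeled as lists of rows.
SOURCES = {
    "src_a": [{"cultivar_name": "Alpha", "yield_t_ha": 6.1}],
    "src_b": [{"cv": "Alpha", "gene_symbol": "G1"},
              {"cv": "Beta", "gene_symbol": "G2"}],
}


def rewrite(global_attrs):
    """Step 1: turn a global projection into one local projection per source."""
    local_queries = {}
    for g in global_attrs:
        for source, local_attr in MAPPING[g].items():
            local_queries.setdefault(source, {})[local_attr] = g
    return local_queries


def execute(local_queries):
    """Step 2: run each local query at its source, renaming to global attributes."""
    return {
        source: [{g: row[l] for l, g in attrs.items()} for row in SOURCES[source]]
        for source, attrs in local_queries.items()
    }


def fuse(answers, key):
    """Step 3: merge local answers that share the join key into global tuples."""
    fused = {}
    for rows in answers.values():
        for row in rows:
            fused.setdefault(row[key], {}).update(row)
    return list(fused.values())


result = fuse(execute(rewrite(["name", "yield", "gene"])), key="name")
```

In this sketch a later source simply overwrites an earlier one when both supply the same attribute for the same key. A fuller version would apply a resolution function at that point instead.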