A Mediation Framework for Transparent Access to largely distributed data sources


A Mediation Framework for Transparent Access to largely distributed data sources Christine Collet Christine.Collet@imag.fr Institut National Polytechnique Grenoble LSR assistant director Database group leader http://www-lsr.imag.fr/mediagrid


SLIDE 1

A Mediation Framework for Transparent Access to largely distributed data sources

Christine Collet

Christine.Collet@imag.fr
Institut National Polytechnique Grenoble
LSR assistant director
Database group leader

SLIDE 2

http://www-lsr.imag.fr/mediagrid

Project's Partners

  • LaMI (Lab. des Méthodes Informatiques), Univ. Evry-Val d'Essonne, France
  • LSR (Lab. Logiciels Systèmes Réseaux), IMAG Grenoble, France
  • PRiSM (Lab. Parallélisme, Réseaux, Systèmes, Modélisation), Univ. Versailles St Quentin, France

SLIDE 3

MEDIAGRID objective

Contribute to the definition of an open mediation framework for the Grid

  • « framework » means a reusable design of a mediation system, expressed as a set of abstract classes (or components) and the way their instances collaborate

  • « open » refers to the construction of mediation systems out of heterogeneous (hardware, software and network) elements

SLIDE 4

Mediation system

[Figure: mediation system. Elements shown: application / user, mediators, wrappers, the mediation level, the mediation schema, the exported schemas, and the mediation queries MQ1, MQ2, ..., MQn.]

SLIDE 5

Several aspects to consider

Aspects: Design, Communication, Execution, Association

  • Syntax (techniques, interfaces) and logic (semantics, schemas)
  • Systems more and more distributed; several protocols (SOAP, CORBA, RMI); several transparency levels
  • Structured: DISCO (object), LeSelect (relational)
  • Semi-structured: TSIMMIS (OEM), Tukwila, MIX, YAT, MOMIS (XML)
  • Single mediator: LeSelect, Information Manifold, …
  • Hierarchy of mediators: TSIMMIS, DISCO, …
  • GAV: TSIMMIS, Garlic, MiX, Hermes, MOMIS, Xyleme, YAT, …
  • LAV: Information Manifold, SIMS, Tukwila, PICSEL, Agora, DWQ, …
  • Thin wrapper: Information Manifold, SIMS, …
  • Thick wrapper: TSIMMIS, Garlic, …
  • Yes: IRO-DB, DWQ, IGD, …; No: Information Manifold, SIMS, Tukwila, PICSEL, Agora, DWQ, TSIMMIS, Garlic, MiX, Hermes, MOMIS, Xyleme, YAT, …
  • Classical: TSIMMIS, Garlic, MiX, Hermes, Xyleme, Inf. Manifold, YAT, …
  • Ontology-based: SIMS, Ontobroker, MOMIS, OBSERVER, DWQ, …

SLIDE 6

Revisiting …

  • Huge amount of knowledge to maintain (source descriptions, schemas, semantic relations between schemas)

  • Growing complexity with respect to the number, types and capabilities of sources

  • High dynamicity: data sources evolve, new sources become of interest, sources are removed

  • Long-running queries and continuous queries: may need to change the execution plan, to produce partial results, or to materialize results

  • Interaction: users and applications want to control query processing

SLIDE 7

MEDIAGRID big picture

[Figure: architecture in three levels (user level, mediation level, source level). Elements shown: an XQuery posed at the user level (For: Eukaryotes organisms; Where: Entirely sequenced; Return: Expression matrix) and its response; at the mediation level, the mediation schema (XML schema), the mediation queries generator, rewriting, the evaluator, and metadata (statistics, semantic correspondences, capabilities); the query in terms of exported schemas; sub-queries 1 to n and intermediary results 1 to n; at the source level, wrappers S1 to Sn with their exported schemas (XML schema).]

  • LAV mediation queries: S1 = Q(MS), S2 = Q(MS)
  • GAV mediation queries: MS = Q(S1, S2, S3)

SLIDE 8

Generation of mediation queries

Mediation schema: R(#K, A, B, C)

Sources:

  • Source S1: R1(#K, A), R2(#A, B)
  • Source S2: R3(#B, C, D), R4(#D, E)

Q: R1 ⋈ R2 ⋈ R3

Given an entity R of the mediation schema and the exported schemas, how can we generate the mediation queries that compute the data of R from the data of the sources?
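To make the example concrete, the query Q can be sketched in Python as a chain of natural joins followed by a projection onto the attributes of R. The sample rows and the helper names `natural_join` and `project` are assumptions made for illustration only; they are not part of MEDIAGRID.

```python
from typing import Dict, List

Row = Dict[str, object]


def natural_join(left: List[Row], right: List[Row]) -> List[Row]:
    """Join two relations on every attribute name they have in common."""
    if not left or not right:
        return []
    shared = set(left[0]) & set(right[0])
    joined = []
    for l_row in left:
        for r_row in right:
            if all(l_row[a] == r_row[a] for a in shared):
                joined.append({**l_row, **r_row})
    return joined


def project(rows: List[Row], attrs: List[str]) -> List[Row]:
    """Keep only the given attributes and drop duplicate rows."""
    result: List[Row] = []
    for row in rows:
        reduced = {a: row[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result


# Hypothetical source data (not from the slides).
r1 = [{"K": 1, "A": "a1"}, {"K": 2, "A": "a2"}]   # S1/R1(#K, A)
r2 = [{"A": "a1", "B": "b1"}, {"A": "a2", "B": "b2"}]  # S1/R2(#A, B)
r3 = [{"B": "b1", "C": "c1", "D": "d1"}]          # S2/R3(#B, C, D)

# Q: R1 join R2 join R3, projected onto R(K, A, B, C).
r = project(natural_join(natural_join(r1, r2), r3), ["K", "A", "B", "C"])
```

Only the row with K = 1 survives, because no R3 row matches B = "b2". R4 is not needed, since none of its attributes other than D belongs to R.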

SLIDE 9

Generation of mediation queries

  • Extract the relevant portions of a source (mapping schemas) for a given mediation schema.
    => mapping for R: (K, A) of S1/R1, (A, B) of S1/R2 and (B, C) of S2/R3

  • Find the candidate operations between mapping schemas using rules such as: if two mapping schemas have an overlapping sub-tree, then the join operation is a candidate between the two mapping schemas.

  • Generate mediation queries.
    Ex: R1(K, A) ⋈ R2(A, B) ⋈ R3(B, C)
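The candidate-join rule can be sketched as follows. This is a simplification under a stated assumption: each mapping schema is flattened to a set of attribute names, and "overlapping sub-tree" is approximated by "at least one shared attribute". The function names `candidate_joins` and `covers_target` are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

# Mapping schemas for R, taken from the example above.
mappings: Dict[str, Set[str]] = {
    "S1/R1": {"K", "A"},
    "S1/R2": {"A", "B"},
    "S2/R3": {"B", "C"},
}


def candidate_joins(schemas: Dict[str, Set[str]]) -> List[Tuple[str, str, List[str]]]:
    """Return every pair of mapping schemas that share attributes (assumed rule)."""
    candidates = []
    for left, right in combinations(sorted(schemas), 2):
        shared = schemas[left] & schemas[right]
        if shared:
            candidates.append((left, right, sorted(shared)))
    return candidates


def covers_target(schemas: Dict[str, Set[str]], target: Set[str]) -> bool:
    """Check that the mapping schemas together provide every attribute of R."""
    provided: Set[str] = set()
    for attrs in schemas.values():
        provided |= attrs
    return target <= provided


joins = candidate_joins(mappings)
```

With these inputs, R1 and R2 are join candidates on A, and R2 and R3 on B, which matches the chain in the example; R1 and R3 share nothing and are not joined directly.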

SLIDE 10

Adaptive and interactive query evaluator

Adaptive: it is able to adapt itself to the execution environment (for example when a query takes a long time to produce results, data arrive too late, or too much data has already been processed …)

Interactive: it allows user control. Users specify their interests or constraints on query processing.

SLIDE 11

QBF architecture

Components and their interfaces:

  • Query Manager: IQueryMgr
  • Context Manager: IContextMgr
  • Plan Manager: IPlanMgr
  • Buffer Manager: IBufferMgr
  • Rule Manager: IRuleMgr
  • Monitor: IMonitor

Legend: provided interface, required interface

SLIDE 12

Adaptive query evaluation

Monitor and RuleManager classes have been defined to monitor the data arrival rate, the number of data items processed, and the execution time.

[Figure: query plans over the relations R, S and T with operators p1 and p2, their variants p1' and p2', a Buffer, and the plan changes labeled "retransform" and "buffer".]

Rules (Query Scrambling technique):

  • when timeout, if not transformable, do buffer
  • when timeout, if not began, do retransform
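The two rules follow an event / condition / action pattern, which can be sketched in Python as below. This is an illustration only: the `Rule` and `RuleManager` classes here, the `detect_events` helper, and the state keys `transformable` and `began` are assumptions and do not reproduce the actual QBF classes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, bool]


@dataclass
class Rule:
    event: str                          # "when" part
    condition: Callable[[State], bool]  # "if" part
    action: str                         # "do" part


class RuleManager:
    """Hypothetical rule manager: returns the actions triggered by an event."""

    def __init__(self, rules: List[Rule]) -> None:
        self.rules = list(rules)

    def on_event(self, event: str, state: State) -> List[str]:
        return [r.action for r in self.rules
                if r.event == event and r.condition(state)]


def detect_events(elapsed_s: float, timeout_s: float) -> List[str]:
    """Hypothetical monitor check: signal a timeout once the limit is exceeded."""
    return ["timeout"] if elapsed_s > timeout_s else []


# The two rules from the slide.
RULES = [
    Rule("timeout", lambda s: not s["transformable"], "buffer"),
    Rule("timeout", lambda s: not s["began"], "retransform"),
]

manager = RuleManager(RULES)
actions = [a
           for e in detect_events(elapsed_s=12.0, timeout_s=10.0)
           for a in manager.on_event(e, {"transformable": False, "began": True})]
```

Keeping the conditions as plain callables means new monitored properties can be added to the state without changing the manager.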

SLIDE 13

Interactive query evaluation

  • User interest in: number of results, execution time, data preference, type of partial result, …

  • Refinement operations: for modifying context parameters, requesting partial results, preference, halting operators, adding new operators, …

  • Managed as: MonitoringProperty instances, rules, control operators (e.g. Counter, Partial, Buffer …)

  • With techniques: monitoring, active rules, construction of partial results, …
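One way to picture control operators such as Counter and Partial is as thin wrappers placed on a stream of results. The sketch below is a guess at their behavior based on the names alone; the class `Counter`, the function `partial`, and their semantics are assumptions, not the QBF implementation.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


class Counter:
    """Hypothetical control operator: counts the items that pass through it."""

    def __init__(self, source: Iterable[T]) -> None:
        self.source = source
        self.count = 0

    def __iter__(self) -> Iterator[T]:
        for item in self.source:
            self.count += 1
            yield item


def partial(source: Iterable[T], limit: int) -> List[T]:
    """Hypothetical control operator: return at most `limit` results.

    islice stops pulling from the source once `limit` items were produced.
    """
    return list(islice(source, limit))


# A stand-in for a long-running query producing ten results.
counted = Counter(iter(range(10)))
first_results = partial(counted, 3)
```

Here the user asks for a partial result of three items; the counter then reports how many items the plan actually had to produce to satisfy that request.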

SLIDE 14

Contribution to Data GRID

  • One of the key challenges: the provision of system and data management services in the large-scale, open and highly dynamic environment of a GRID

  • MEDIAGRID contributions: frameworks for data integration, mediation query generation, adaptive and interactive query evaluation, wrapper / mediator generation, rules, workflows, metadata …

  • More generally, DBMS components as open middleware services for data management: caching, persistency, replication, transactions, confidentiality, security, querying, workflows, composition, etc.

SLIDE 15

Possible links in UK

The OGSA-DAI project, http://www.ogsa-dai.org.uk/

  • concerned with constructing middleware to assist with access and integration of data from separate data sources via the grid

  • conceived by the UK Database Task Force; it is working closely with the Global Grid Forum DAIS-WG and the Globus team

The Open Grid Service Architecture includes a Grid Distributed Query Service (GDQS) and a Grid Query Evaluation Service (GQES).

  • distribution and implicit parallelism in query processing

SLIDE 16

CoreGrid & ObjectWeb

  • The ObjectWeb consortium can provide valuable technology and an exploitation path for CoreGrid
      • check out http://www.objectweb.org

  • ObjectWeb's focus is the development of open source distributed middleware
      • current emphasis is on Web application servers
      • plans are for the development of highly configurable component-based middleware (with multiple application domains in mind)

  • Distributed system technology and data(base) management technology in ObjectWeb can be useful for Grids, especially when it comes to merging Grid infrastructure and Web services infrastructure