Ontology-Based Mediation in the Amine System (http://asp.uma.es)



slide-1
SLIDE 1

Ontology-Based Mediation in the Amine System Project

Pisa June 2007

  • Prof. Dr. José F. Aldana Montes (jfam@lcc.uma.es)
  • Prof. Dr. Francisca Sánchez-Jiménez

  • Ismael Navas Delgado
  • Raúl Montañez
  • Almudena Pino-Ángeles
  • Aurelio A. Moya-García
  • José Luis Urdiales

http://asp.uma.es

slide-2
SLIDE 2

Outline

  • Family of problems to be solved
  • Proposed solution: from Semantics to Data Integration

  • Semantic Directories
  • Ontology-Based Mediator
  • Specific Problem: Use Case
  • Conclusions
slide-4
SLIDE 4

Family of Problems

  • Systems biology is the study of an organism, viewed as an integrated and interacting network of genes, proteins and biochemical reactions which give rise to life. (Institute for Systems Biology)
  • Instead of focusing on individual parts, the focus is on a complete system.

→ Integrated access to different data sources is needed to enable the study of the system.

slide-6
SLIDE 6

Proposed Solution

  • Requirements:
      • Easily extensible with new resources
      • Reusable elements
      • Possibility of developing different kinds of applications (not only data integration)
  • Decisions:
      • Take advantage of the Semantic Web
      • Annotate data sources with respect to ontologies
      • Reuse previous work
slide-7
SLIDE 7

Proposed Solution

  • Requirements:
      • Easily extensible with new databases
      • Reusable elements
      • Possibility of developing different kinds of applications (not only data integration)
  • Decisions:
      • Take advantage of the Semantic Web
      • Annotate data sources with respect to ontologies
      • Reuse previous work
slide-9
SLIDE 9

Semantic Directories

slide-10
SLIDE 10

SD-Core Metadata

[Figure: Semantic Directory architecture. The SD-Core metadata consists of two ontologies, OMV and SDMO, with their instances (several OMV instances and an SDMO instance). Components shown: Semantic Register, Resource Metadata Repository, Ontology Metadata Repository and Reasoner Component.]

slide-11
SLIDE 11

OMV

OMV (Ontology Metadata Vocabulary) is described in OWL, and each instance of the class OntologyImplementation represents an ontology registered in the Semantic Directory. It is possible to describe some relationships between ontologies.
slide-12
SLIDE 12

SDMO

  • OMV: links resources with registered ontologies.
  • Resource: stores information about resources (query capabilities, schema, query interface, name and URI).
  • Mapping: sets the relationships between resources and ontologies. Each mapping is related to a Similarity instance that establishes the similarity between ontology concepts and resource elements. The Mapping class is related to the OMV, Resource and Similarity classes.
  • Similarity: contains three properties (concept1, concept2 and similarityValue) that establish the similarity between an ontology concept and a resource element.
  • User: deals with users in the applications.
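
The SDMO classes listed on this slide could be modelled roughly as follows. This is only a sketch in Python, not the project's OWL definition: the field names follow the slide (concept1, concept2, similarityValue), while the `threshold` parameter, the helper `elements_for_concept`, the ontology name and the example URI and paths are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Resource:
    """Information the slide says SDMO stores about a resource."""
    name: str
    uri: str
    schema: str
    query_interface: str
    query_capabilities: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Similarity:
    concept1: str            # ontology concept
    concept2: str            # resource element
    similarity_value: float


@dataclass(frozen=True)
class Mapping:
    ontology_name: str       # stands in for the link to an OMV instance
    resource: Resource
    similarity: Similarity


def elements_for_concept(mappings: List[Mapping], ontology_name: str,
                         concept: str, threshold: float = 0.5):
    """Return (resource name, resource element) pairs mapped to a concept.

    The threshold is a hypothetical cut-off, not something the slides define.
    """
    return [
        (m.resource.name, m.similarity.concept2)
        for m in mappings
        if m.ontology_name == ontology_name
        and m.similarity.concept1 == concept
        and m.similarity.similarity_value >= threshold
    ]


# Hypothetical example data.
pdb = Resource(name="PDB", uri="http://example.org/pdb-service",
               schema="<schema/>", query_interface="web-service",
               query_capabilities=("by_id",))
mappings = [
    Mapping("AmineOntology", pdb,
            Similarity("Molecular_3D_Structure", "/entry/structure", 1.0)),
    Mapping("AmineOntology", pdb,
            Similarity("Polypeptide", "/entry/chain", 0.4)),
]
```

Keeping Similarity as a separate object, as the slide describes, lets a mapping carry a graded similarity value instead of a yes/no link.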
slide-13
SLIDE 13

Interfaces

Semantic Directory

«interface» SemanticDirectory::ResourceMetadataRepository
    + listResources(out vector)
    + listResources(in url, out vector)
    + listResources(in relatedOntology, out vector)
    + listResources(in relatedOntology, in relevantConcepts, out vector)
    + listMappings(in resourceName, out vector)
    + listMappings(in url, out vector)
    + getSchema(in resourceName, out vector)
    + getSchema(in url, out file)
    + getRelatedEelements(in ontologyName, in concept, out vector)

«interface» OntologyMetadataRepository
    + listOntologies(out vector)
    + listOntologies(in conceptName, out vector)
    + listOntologies(in conceptNameVector, out vector)
    + getOntology(in ontologyName, out vector)
    + getOntology(in url, out ontologyFile)

«interface» SemanticRegister
    + registerOntology(in url, in name)
    + registerOntology(in ontologyFile, in name)
    + registerResource(in resourceName, in queryMethod, in schemaMethod, in queryCapabilities)
    + registerResource(in resourceName, in queryMethod, in schemaMethod, in queryCapabilities, in ontologyName)
    + registerResource(in resourceName, in queryMethod, in schemaMethod, in queryCapabilities, in ontologyName, in mappings)

Our goal is to provide applications that make the semantics of the resources explicit through their commitment to an ontology registered in the Semantic Directory. The applications that can be developed using the Semantic Directory components depend on extending the infrastructure with new components built on top of the Semantic Directory.
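
To make the role of the three interfaces more concrete, here is a minimal in-memory sketch in Python. It is not the project's Java implementation: the class `SemanticDirectory`, the snake_case method names, the use of return values in place of the `out vector` parameters, and the example URL, resource names and mappings are all assumptions made for illustration, and only a subset of the operations is shown.

```python
class SemanticDirectory:
    """In-memory stand-in for the three Semantic Directory interfaces."""

    def __init__(self):
        self._ontologies = {}  # ontology name -> URL
        self._resources = {}   # resource name -> registration record

    # --- SemanticRegister ---
    def register_ontology(self, url, name):
        self._ontologies[name] = url

    def register_resource(self, resource_name, query_method, schema_method,
                          query_capabilities, ontology_name=None,
                          mappings=None):
        # A resource may only commit to an ontology that is registered.
        if ontology_name is not None and ontology_name not in self._ontologies:
            raise KeyError(f"ontology not registered: {ontology_name}")
        self._resources[resource_name] = {
            "query_method": query_method,
            "schema_method": schema_method,
            "query_capabilities": list(query_capabilities),
            "ontology_name": ontology_name,
            # concept -> resource element (hypothetical representation)
            "mappings": dict(mappings or {}),
        }

    # --- OntologyMetadataRepository ---
    def list_ontologies(self):
        return sorted(self._ontologies)

    def get_ontology(self, ontology_name):
        return self._ontologies[ontology_name]

    # --- ResourceMetadataRepository ---
    def list_resources(self, related_ontology=None, relevant_concepts=None):
        names = []
        for name, record in self._resources.items():
            if (related_ontology is not None
                    and record["ontology_name"] != related_ontology):
                continue
            if (relevant_concepts is not None
                    and not set(relevant_concepts) & set(record["mappings"])):
                continue
            names.append(name)
        return sorted(names)

    def list_mappings(self, resource_name):
        return dict(self._resources[resource_name]["mappings"])


directory = SemanticDirectory()
directory.register_ontology("http://example.org/amine.owl", "Amine")
directory.register_resource("PDB", "query", "getSchema", ["by_id"],
                            "Amine", {"Molecular_3D_Structure": "/entry/structure"})
directory.register_resource("PubChem", "query", "getSchema", ["by_name"],
                            "Amine", {"Metabolite": "/compound"})
print(directory.list_resources("Amine", ["Metabolite"]))  # ['PubChem']
```

An application built on top of the directory would first register ontologies and resources, then use the listing operations to discover which resources can answer a question about a given concept.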

slide-14
SLIDE 14

SD Conclusions

  • Generic Infrastructure
  • Basic Functionality
  • Extended functionality requires core extensions: new metadata, interfaces, …
  • Fully implemented:
      • V 0.9.A: Java, MySQL database, Racer (concept classification), Web Services
      • V 0.9.B: Java, metadata files, Jena, Web Services
      • V 0.9.C: Java, metadata files, Jena, CORBA CCM

slide-16
SLIDE 16

Mediator (Data Integration)

  • Discover available data sources: users must locate data sources that are available online and can solve their problems.
  • Select data sources & tools: users must select which data sources include information relevant to them.
  • Interact with each data source: users manually interact with each data source, getting partial results.
  • Combine results: users combine partial results to get a partial solution of the problem.
  • Building the full solution

slide-17
SLIDE 17

Approach

  • GAV (Global-as-View), used in BioBroker:
      + Easy query rewriting
      − Extension → global schema changes
  • LAV (Local-as-View):
      + Easy addition of new data sources
      − Complex rewriting process → simpler components will allow partial improvements
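
The trade-off between the two approaches can be illustrated with a deliberately simplified Python sketch. It reduces schemas to sets of concept names, whereas real GAV and LAV mappings are view definitions and LAV rewriting is considerably more involved; the dictionaries, function names and the choice of sources below are assumptions for illustration only.

```python
# GAV: each global concept is defined in terms of the sources.
gav_schema = {
    "Polypeptide": ["SwissProt"],
    "Amino_acid_sequence": ["SwissProt"],
}


def gav_rewrite(query_concepts, schema):
    """GAV rewriting: a direct lookup (unfolding) of each global concept."""
    return {concept: list(schema[concept]) for concept in query_concepts}


# LAV: each source is described in terms of the global concepts.
lav_sources = {
    "SwissProt": {"Polypeptide", "Amino_acid_sequence"},
}


def lav_rewrite(query_concepts, sources):
    """LAV rewriting: search every source description for each concept."""
    plan = {}
    for concept in query_concepts:
        plan[concept] = sorted(
            name for name, described in sources.items() if concept in described
        )
        if not plan[concept]:
            raise LookupError(f"no source describes {concept}")
    return plan


# Adding a new source (PDB) that covers two concepts.
# LAV: one new, independent description; existing entries are untouched.
lav_sources["PDB"] = {"Polypeptide", "Molecular_3D_Structure"}
# GAV: the global definitions themselves have to be edited.
gav_schema["Polypeptide"].append("PDB")
gav_schema["Molecular_3D_Structure"] = ["PDB"]
```

Even in this toy form, the GAV lookup stays trivial while every new source touches the global definitions, and the LAV side accepts new sources freely but has to search the descriptions at query time.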

slide-19
SLIDE 19

Main Characteristics

[Figure: mediator architecture.
Mediator components: Controller, Query Planner, Query Plan Solver/Evaluator, Integrator.
Semantics: Semantic Register, Resource Metadata Repository, Ontology Metadata Repository and Reasoner Component, used for ontology search/reasoning, mappings search and resource search.
Data Services: several Data Service boxes.
Messages: User Query, Ontology (Q, O); Q, O, Query Plan (QP); Q, O, QP, {R1, …, Rn}; Q, O, Result (ontology instances).
Annotations: wrappers developed as Web Services; resource semantic descriptions; technical users or software developers.]

slide-20
SLIDE 20

Component Division

[Figure: mediator architecture divided into components.
Mediator: Controller, Query Planner, Query Plan Solver/Evaluator, Integrator.
Semantics: Semantic Register, Resource Metadata Repository, Ontology Metadata Repository and Reasoner Component (ontology search/reasoning, mappings search, resource search).
Data Services: several Data Service boxes.
Messages: User Query, Ontology (Q, O); Q, O, Query Plan (QP); Q, O, QP, {R1, …, Rn}; Q, O, Result (ontology instances).]

slide-21
SLIDE 21

Partial Improvements

[Figure: mediator architecture, showing the components that can be improved independently: Controller, Query Planner, Query Plan Solver/Evaluator and Integrator, together with the Semantics block (Semantic Register, Resource Metadata Repository, Ontology Metadata Repository, Reasoner Component) and the Data Services. Messages: User Query, Ontology (Q, O); Q, O, Query Plan (QP); Q, O, QP, {R1, …, Rn}; Q, O, Result (ontology instances).]

slide-22
SLIDE 22

Mediator

Reuse Elements · Share Semantics · Reuse sources

[Figure: mediator architecture (Controller, Query Planner, Query Plan Solver/Evaluator, Integrator, Semantics block and Data Services), annotated with the technologies used: Web Services, XQuery and XML, ontology instances, LAV mappings.]

  • Controller: interacts with the user interface.
  • Query Planner: finds a query plan (QP) for the user query.
  • Query Plan Solver/Evaluator: performs the corresponding calls to the data services involved in the sub-queries.
  • Integrator: composes the results from the data services (R1, ..., Rn).
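
The four roles named on this slide (interacting with the user, finding a query plan, calling the data services involved in the sub-queries, and composing the results R1, ..., Rn) can be sketched as a small Python pipeline. This is an illustrative assumption, not the project's implementation: the real mediator uses Web Services, XQuery and XML, whereas here data services are plain functions, results are dictionaries, and all function names, service names and the query key are hypothetical.

```python
def plan_query(query_concepts, lav_mappings):
    """Query Planner: for each concept, pick the services that describe it."""
    return [
        (service, concept)
        for concept in query_concepts
        for service, concepts in sorted(lav_mappings.items())
        if concept in concepts
    ]


def solve_plan(plan, data_services, key):
    """Query Plan Solver/Evaluator: call each data service in the plan."""
    return [data_services[service](concept, key) for service, concept in plan]


def integrate(results):
    """Integrator: compose the partial results R1..Rn into one instance."""
    instance = {}
    for partial in results:
        instance.update(partial)
    return instance


def controller(query_concepts, key, lav_mappings, data_services):
    """Controller: entry point that takes the user query and returns the result."""
    plan = plan_query(query_concepts, lav_mappings)
    results = solve_plan(plan, data_services, key)
    return integrate(results)


# Hypothetical stand-ins for wrapped data sources.
def sequence_service(concept, key):
    return {concept: f"sequence-of-{key}"}


def structure_service(concept, key):
    return {concept: f"structure-of-{key}"}


lav_mappings = {
    "sequence_service": {"Amino_acid_sequence"},
    "structure_service": {"Molecular_3D_Structure"},
}
data_services = {
    "sequence_service": sequence_service,
    "structure_service": structure_service,
}

result = controller(["Amino_acid_sequence", "Molecular_3D_Structure"],
                    "example-protein", lav_mappings, data_services)
```

Because each step is a separate function with a narrow input and output, any one of them can be replaced (for example, a smarter planner) without touching the others, which is the kind of partial improvement the component division is meant to allow.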

slide-23
SLIDE 23

Mediator Conclusions

  • LAV
  • Ontology Based
  • Enabled partial improvements
  • Limited Reasoning
  • First beta version implemented (testing phase):
      • Test 1: Bioinformatics Resources
      • Test 2: Second-Hand Car Resources
slide-25
SLIDE 25

Use Case

ASP: Amine System Project http://asp.uma.es

A common and useful strategy for finding the 3D structure of a protein that cannot be obtained by crystallization is to apply comparative modeling techniques. These start from the primary sequence of the unsolved protein and predict its 3D structure by comparing it with those of solved homologous proteins.

slide-26
SLIDE 26

Domain Ontology

[Figure: domain ontology.
Concepts: Organism, Metabolic_process, Catalytic_activity, Protein_binding, Protein_domain_specific_binding, Amino_acid_sequence, Homologue_sequence, Polypeptide, Macromolecule, Molecular_entity, Molecular_3D_Structure, Functional_domain, Metabolite.
Relations: belong_to, is_formed_by, has, is_characterized, amino_acid_sequence, homologue_sequence_aas, homologue_sequence_polypeptide, develops.
Data sources and tools: SWISS-Prot, PDB, BlastP, Modeller, JMol, PubChem, Kegg, Brenda, Prosite.]

slide-27
SLIDE 27

Domain Ontology

ASP: Amine System Project http://asp.uma.es

[Figure: the domain ontology (concepts Organism, Metabolic_process, Catalytic_activity, Protein_binding, Protein_domain_specific_binding, Amino_acid_sequence, Homologue_sequence, Polypeptide, Macromolecule, Molecular_entity, Molecular_3D_Structure, Functional_domain and Metabolite, with their relations) shown together with the data sources and tools SWISS-Prot, PDB, BlastP, Modeller, JMol, PubChem, Kegg, Brenda and Prosite.]

slide-28
SLIDE 28

Next Step

[Figure: mediator architecture with the Controller, Query Planner, Query Plan Solver/Evaluator and Integrator, the Semantics block (Semantic Register, Resource Metadata Repository, Ontology Metadata Repository, Reasoner Component) and the Data Services. Messages: User Query, Ontology (Q, O); Q, O, Query Plan (QP); Q, O, QP, {R1, …, Rn}; Q, O, Result (ontology instances).]

slide-29
SLIDE 29

Application

slide-31
SLIDE 31

Conclusions

  • SD: Generic Infrastructure
      • Two Ontologies to manage metadata
      • SDMO: needs improvements and extensions
  • Mediator
      • Needs testing and improvement
      • Study the addition of reasoning in the integration

slide-32
SLIDE 32

Conclusions

Use Case: Protein structures contain fundamental information regarding their function, location and interactions, which is most of the information on their biological missions. Combining information integration with prediction techniques (as an automatic process) results in efficient information retrieval and extends the applicability of structural bioinformatics techniques to inexperienced users.

slide-33
SLIDE 33

Conclusions

  • The problem presented is important in our context, as performing this process automatically will reduce the effort required to solve it.
  • Genome projects have exponentially increased the number of known polypeptide sequences. Thus, any effort to improve the efficiency of extracting structural information at its highest level should help advance many ongoing Systems Biology projects.

slide-34
SLIDE 34

Conclusions

The main limitation found is the maintenance of the data services, because the ones developed so far make use of public databases that are not under our control. The long-term success of this and similar proposals relies on the collaboration of data and tool owners.

slide-35
SLIDE 35

Contact

José F. Aldana-Montes
jfam@lcc.uma.es
http://asp.uma.es

Thank you!