Peer-to-Peer Data Integration with Active XML Tova Milo Tel-Aviv - - PowerPoint PPT Presentation



SLIDE 1

Tova Milo – Tel Aviv University

Peer-to-Peer Data Integration with Active XML

Tova Milo

Tel-Aviv University

SLIDE 2

Active XML - Outline

Introduction

Active XML

  • Active XML documents
  • Active XML services

Novel issues

  • Exchanging Active XML data
  • Querying Active XML data
  • Distribution and replication
  • Security and Access control

Active XML Peers

  • The peer as a client
  • The peer as a server
  • Theoretical foundations

Applications

Conclusion

SLIDE 3

Introduction

SLIDE 4

Distributed data management in P2P

Information is everywhere

[Figure: peers connected through the Web, each holding XML data and offering Web services]

Data warehouses, databases, Web sites, PCs, PDAs, cell phones, home appliances, cars…

SLIDE 5

The golden triangle of distributed data management

XML

  • a standard for data representation & exchange

Query languages

  • XPath, XQuery

Web services

  • standards for distributed computing
  • Activation of methods on remote servers

[Figure: triangle with vertices XML, XPath/XQuery and Web services]

SLIDE 6

What is Active XML (AXML)?

AXML is a declarative language for distributed information management and an infrastructure to support this language, in a peer-to-peer framework.

SLIDE 7

Active XML

SLIDE 8

Active XML documents

XML documents with embedded calls to Web services

Intensional

  • Some of the data is given explicitly
  • Some is given intensionally (i.e. the means to acquire the data when needed are given)

Dynamic

  • If the external sources change, the same document will provide different information
  • Reaction to world changes

SLIDE 9

Not a new idea in databases, nor on the Web

Mixing calls and data is an old idea

  • Procedural attributes in relational systems
  • Basis of Object-oriented Databases

In the HTML world

  • Sun’s JSP, PHP+MySQL

Calls to Web services inside XML documents

  • Macromedia FLEX, Apache Jelly, Microsoft XAML

What is new is the exploitation of the idea…

SLIDE 10

A sample AXML document

<?xml version="1.0"?>
<newspaper>
  <title>Le Monde</title>
  <date>06/10/2003</date>
  <call svc="Yahoo.GetTemp">
    <city>Paris</city>
  </call>
  <call svc="TimeOut.GetEvents">
    exhibits
  </call>
</newspaper>

[Figure: tree view of the same document: a newspaper node with children title ("Le Monde"), date ("06/10/2003"), GetTemp (parameter city "Paris") and GetEvents (parameter "exhibits")]

AXML documents may contain calls:

  • to any existing Web service (e-bay.net, google.com…)
  • to any AXML Web service (to be defined)
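
As a rough illustration of what "embedded calls" means in practice, the following sketch parses the sample document with Python's standard `xml.etree.ElementTree` and lists the calls it contains. The helper name `find_calls` is made up for this example and is not part of AXML; only the `call` element and its `svc` attribute come from the slide.

```python
import xml.etree.ElementTree as ET

AXML_DOC = """<?xml version="1.0"?>
<newspaper>
  <title>Le Monde</title>
  <date>06/10/2003</date>
  <call svc="Yahoo.GetTemp">
    <city>Paris</city>
  </call>
  <call svc="TimeOut.GetEvents">exhibits</call>
</newspaper>"""


def find_calls(root):
    """Return (service name, parameters) for every embedded <call> (hypothetical helper)."""
    calls = []
    for call in root.iter("call"):
        # Parameters are either child elements or plain text content.
        params = {child.tag: (child.text or "").strip() for child in call}
        text = (call.text or "").strip()
        calls.append((call.get("svc"), params or text))
    return calls


root = ET.fromstring(AXML_DOC)
for svc, params in find_calls(root):
    print(svc, params)
```

The explicit data (title, date) and the intensional data (the two calls) live side by side in the same tree, which is all the later slides rely on.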

SLIDE 11

Materialization

We will see later that:

  • Replacing the call by its result is not the only option
  • Calls are not necessarily RPC-style synchronous invocations

<?xml version="1.0"?>
<newspaper>
  <title>Le Monde</title>
  <date>06/10/2003</date>
  <call svc="Yahoo.GetTemp">
    <city>Paris</city>
  </call>
  <call svc="TimeOut.GetEvents">
    exhibits
  </call>
</newspaper>

[Figure: tree view of the document; a SOAP call to GetTemp returns <temp>16°C</temp>, which appears in the tree as a temp node with value "16°C"]
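
A minimal sketch of materialization, assuming the simplest policy (the call is replaced by its result, which the slide notes is only one option). The remote service is replaced by a local stub, `fake_get_temp`, and both that stub and the `materialize` helper are hypothetical names; a real peer would issue a SOAP request instead.

```python
import xml.etree.ElementTree as ET

DOC = """<newspaper>
  <title>Le Monde</title>
  <date>06/10/2003</date>
  <call svc="Yahoo.GetTemp"><city>Paris</city></call>
  <call svc="TimeOut.GetEvents">exhibits</call>
</newspaper>"""


def fake_get_temp(call):
    # Stand-in for the remote service; a real peer would send a SOAP request.
    temp = ET.Element("temp")
    temp.text = "16°C"
    return [temp]


def materialize(parent, services):
    """Splice the result of each known direct <call> child in place of the call."""
    new_children = []
    for child in list(parent):
        service = services.get(child.get("svc")) if child.tag == "call" else None
        if service is None:
            new_children.append(child)           # plain data, or a call we cannot invoke
        else:
            new_children.extend(service(child))  # the call is replaced by its result
    parent[:] = new_children


doc = ET.fromstring(DOC)
materialize(doc, {"Yahoo.GetTemp": fake_get_temp})
print([child.tag for child in doc])
```

Only the GetTemp call is invoked here; the GetEvents call stays in the document as intensional data because no service is registered for it.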

SLIDE 12

AXML Web services

Parameters: AXML data

Result: AXML data

Distribute computations: by sending as parameters data containing service calls, one can delegate some work to other peers

Partial computations: by returning data containing service calls, one can give the receiver control of these calls

Great flexibility

SLIDE 13

Calling an AXML service

<?xml version="1.0"?>
<newspaper>
  <title>Le Monde</title>
  <date>06/10/2003</date>
  <call svc="TimeOut.GetEvents">
    exhibits
  </call>
</newspaper>

[Figure: tree view of the document, in which the GetTemp call has already been replaced by a temp node ("16°C"); the GetEvents call returns an exhibits node that itself contains a GetExhibits call with parameter city "Paris"]

SOAP call (still…), returning:

<exhibits>
  <call svc="Yahoo.GetExhibits">
    <city>Paris</city>
  </call>
</exhibits>

Materialization is a recursive process

Termination is an issue
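
The recursion and the termination problem can be sketched as follows. Everything here is an assumption made for illustration: the services are local stubs, the answer of `Yahoo.GetExhibits` and the self-calling `Loop` service are invented, and bounding the number of invocations with a `budget` is just one simple way to force termination, not the technique used by the AXML system.

```python
import xml.etree.ElementTree as ET


def get_events(call):
    # Intensional answer taken from the slide: the result embeds another call.
    return [ET.fromstring(
        '<exhibits><call svc="Yahoo.GetExhibits"><city>Paris</city></call></exhibits>')]


def get_exhibits(call):
    # Hypothetical answer, invented for this example.
    return [ET.fromstring("<exhibit>Example exhibit</exhibit>")]


def loop(call):
    # Hypothetical service whose answer calls the same service again.
    return [ET.fromstring('<call svc="Loop"/>')]


SERVICES = {"TimeOut.GetEvents": get_events,
            "Yahoo.GetExhibits": get_exhibits,
            "Loop": loop}


def materialize_all(root, services, budget=10):
    """Invoke known calls below `root` until none is left or the budget is spent.

    Returns True if no invocable call remains, False if the budget ran out first.
    """
    while budget > 0:
        found = next(((parent, child) for parent in root.iter() for child in parent
                      if child.tag == "call" and child.get("svc") in services), None)
        if found is None:
            return True
        parent, call = found
        position = list(parent).index(call)
        parent[position:position + 1] = services[call.get("svc")](call)
        budget -= 1
    return not any(c.get("svc") in services for c in root.iter("call"))


newspaper = ET.fromstring(
    '<newspaper><title>Le Monde</title>'
    '<call svc="TimeOut.GetEvents">exhibits</call></newspaper>')
finished = materialize_all(newspaper, SERVICES)

endless = ET.fromstring('<doc><call svc="Loop"/></doc>')
stopped = materialize_all(endless, SERVICES, budget=5)
print(finished, stopped)
```

The first document needs two rounds (GetEvents, then the GetExhibits call found in its answer); the second would never finish without the bound.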

SLIDE 14

Novel issues

SLIDE 15

Active XML - Outline

Introduction

Active XML

  • Active XML documents
  • Active XML services

Novel issues

  • Exchanging Active XML data (SIGMOD’03, PODS’05)
  • Querying Active XML data
  • Distribution and replication
  • Security and Access control

Active XML Peers

  • The peer as a client
  • The peer as a server
  • Theoretical foundations

Applications

Conclusion

SLIDE 16

To call or not to call ?

  • Materialization can be performed by the sender, before sending a document
  • or by the receiver, after receiving it

[Figure: the newspaper document tree shown on both the sender and the receiver side, with title ("Le Monde"), date ("06/10/2003"), the GetEvents call ("exhibits") and the GetTemp call (city "Paris") next to its result temp ("16°C")]

SLIDE 17

Why control the materialization of calls?

For added functionality, e.g.

  • Intensional data lets the receiver get up-to-date information

For security reasons or capabilities, e.g.

  • I don’t trust this Web service/domain
  • I don’t have the right credentials to invoke it
  • It costs money
  • Maybe the receiver doesn’t know Active XML!

For performance reasons, e.g.

  • A proxy can invoke all the services on behalf of a PDA

… and many more reasons you can think of!

SLIDE 18

How to control it? Using types

We extend XML Schema with intensional types: XMLSchema_int

Casting algorithms use signatures of services: WSDL_int

[Figure: a sender and a receiver, each with its own capabilities, ACL, cost, ..., agree on a data exchange schema; a document rooted at r with embedded calls f, g and q is cast to that schema before being exchanged]

SLIDE 19

Rewritings

The Goal:

Given

  • an AXML document d
  • a schema s

Can we rewrite d so that it matches s?

Safe rewriting: one that for sure leads to s (we know without making any call)

Possible rewriting: one that possibly leads to s (depending on the answers of the services)
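
The distinction between safe and possible rewritings can be illustrated on a deliberately tiny model. This is a toy, not the algorithm from the talk: the schema is reduced to a set of allowed child tags for one node, a service signature to the set of tags the service may return, and the extra output `performances` for GetEvents is invented. The function `classify` and its data layout are hypothetical.

```python
def classify(children, allowed, signatures):
    """Classify the rewriting of one node's children against a flat schema.

    children:   list of tags; a call is written ("call", service_name)
    allowed:    set of tags the target schema accepts (may include "call")
    signatures: service_name -> set of tags the service may return
    Returns "safe", "possible" or "none".
    """
    verdict = "safe"
    for child in children:
        if isinstance(child, str):
            if child not in allowed:
                return "none"          # explicit data cannot be rewritten away
            continue
        _, service = child
        if "call" in allowed:
            continue                   # the schema lets the call stay intensional
        outputs = signatures[service]
        if outputs <= allowed:
            continue                   # every possible answer fits: safe
        if outputs & allowed:
            verdict = "possible"       # fits only for some answers
        else:
            return "none"              # no answer can fit
    return verdict


children = ["title", "date", ("call", "GetTemp"), ("call", "GetEvents")]
signatures = {"GetTemp": {"temp"}, "GetEvents": {"exhibits", "performances"}}

print(classify(children, {"title", "date", "temp", "exhibits", "performances"}, signatures))
print(classify(children, {"title", "date", "temp", "exhibits"}, signatures))
print(classify(children, {"title", "date", "temp"}, signatures))
```

The three calls print `safe`, `possible` and `none`: in the second case the verdict depends on whether GetEvents happens to answer with exhibits, which cannot be known without making the call.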

SLIDE 20

Results

The general problem is undecidable [MSS04]

Restrictions on the considered rewritings

  • Left-to-right: no “going back and forth”
  • K-depth: bound on the nesting of function calls

(The search space is still infinite, but finitely representable)

Under these restrictions

  • We have algorithms to find safe/possible rewritings
  • They run in PTIME (for deterministic schemas)
  • We can also do it between schemas

Implementation

  • first demo at VLDB 2003 (customizable news syndication)

SLIDE 21

Active XML - Outline

Introduction

Active XML

  • Active XML documents
  • Active XML services

Novel issues

  • Exchanging Active XML data
  • Querying Active XML data (SIGMOD’04, PODS’05)
  • Distribution and replication
  • Security and Access control

Active XML Peers

  • The peer as a client
  • The peer as a server
  • Theoretical foundations

Applications

Conclusion

SLIDE 22

Querying AXML Data

Given a (tree pattern) query:

/newspaper[temp > 18°C]/exhibits//exhibit[location="Le Louvre"]

Materialize the whole document? Rather, call only the services that may contribute data to the query answer.

The problem: lazy evaluation of service calls

To call or not to call, this time when evaluating a query

[Figure: document tree: newspaper with title ("Le Monde"), a getDate call, GetTemp (city "Paris") with result temp ("19°C"), GetEvents ("exhibits"), and exhibits containing GetExhibits (city "Paris")]

SLIDE 23

Lazy evaluation

Difficulties:

  • Calls can be found everywhere in the document
  • May appear dynamically (as a result of previous calls)
  • May become (ir)relevant due to previous invocations
  • Need to take signatures of calls into consideration

Possible approach: modify the query processor

  • Trigger the calls found on the way
  • Not so great:
      – Computation is blocked
      – Optimization opportunities are lost

Our solution:

  • Derive queries that find the relevant calls (recursively)
  • Use service signatures to prune irrelevant calls
  • Invoke calls in parallel
  • Push queries to capable external sources
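
The pruning idea in the list above can be sketched under strong simplifying assumptions: the query is a plain child-only path (here /newspaper/exhibits/exhibit, without the predicates of the slide's query), and a signature is just the set of tags a service may return. The function `relevant_calls` and the signature table are hypothetical; calls with an unknown signature are conservatively treated as relevant.

```python
import xml.etree.ElementTree as ET

DOC = ET.fromstring("""
<newspaper>
  <title>Le Monde</title>
  <call svc="Yahoo.GetTemp"><city>Paris</city></call>
  <call svc="TimeOut.GetEvents">exhibits</call>
  <exhibits>
    <call svc="Yahoo.GetExhibits"><city>Paris</city></call>
  </exhibits>
</newspaper>""")

# Assumed signatures: the tags each service may return.
SIGNATURES = {
    "Yahoo.GetTemp": {"temp"},
    "TimeOut.GetEvents": {"exhibits"},
    "Yahoo.GetExhibits": {"exhibit"},
}


def relevant_calls(root, steps, signatures):
    """List the services whose calls may contribute to the child-only path `steps`."""
    relevant = []

    def visit(node, depth):
        # `node` matches steps[depth - 1]; its children are tested against steps[depth].
        if depth == len(steps):
            return
        for child in node:
            if child.tag == "call":
                outputs = signatures.get(child.get("svc"))
                # Unknown signature: keep the call, we cannot rule it out.
                if outputs is None or steps[depth] in outputs:
                    relevant.append(child.get("svc"))
            elif child.tag == steps[depth]:
                visit(child, depth + 1)

    if root.tag == steps[0]:
        visit(root, 1)
    return relevant


to_invoke = relevant_calls(DOC, ["newspaper", "exhibits", "exhibit"], SIGNATURES)
print(to_invoke)
```

For this simplified query the GetTemp call is pruned, since its signature says it can only return temp elements. With the slide's full query it would be relevant again because of the temp predicate.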

SLIDE 24

Active XML - Outline

Introduction

Active XML

  • Active XML documents
  • Active XML services

Novel issues

  • Exchanging Active XML data
  • Querying Active XML data
  • Distribution and replication
  • Security and Access control

Active XML Peers

  • The peer as a client
  • The peer as a server
  • Theoretical foundations

Applications

Conclusion

SLIDE 25

Active XML peers

SLIDE 26

Distributed data management in P2P

[Figure: the same network of XML data and Web services, with the peers now labeled AXML]

SLIDE 27

What is an AXML peer ?

Repository: manages persistent AXML data

Client: uses (AXML) Web services to dynamically enrich data

Server: easy (declarative) definition of AXML services

[Figure: AXML peers communicating through SOAP]

SLIDE 28

Global architecture

[Figure: architecture of an AXML peer: an AXML store, service descriptions, an AXML engine and a query engine; the peer exchanges XML and AXML data with other peers]

SLIDE 29

Active XML - Outline

Introduction

Active XML

  • Active XML documents
  • Active XML services

New issues

  • Exchanging Active XML data
  • Querying Active XML data
  • Distribution and replication
  • Security and Access control

Active XML Peers

  • The peer as a client
  • The peer as a server
  • Theoretical foundations

Applications

  • P2P auctions
  • News syndication
  • Other applications

Conclusion

SLIDE 30

Managing persistent AXML data

“Our newspaper should have its temperature information refreshed daily. New exhibits should be fetched every week and archived for 6 months”

Service call results enrich the document (calls can be kept for possible future reuse)

Main issues:

  • When to activate a service call? (pull/push, implicit/explicit)
  • What to do with its result? (add/replace/merge)

SLIDE 31

Example: AXML document with control attributes

<?xml version="1.0"?>
<newspaper>
  <title>Le Monde</title>
  <date>06/10/2003</date>
  <call svc="Yahoo.GetTemp" mode="lazy" valid="1 day" merge="replace">
    <city>Paris</city>
  </call>
  <call svc="TimeOut.GetEvents" mode="every Monday morning" valid="6 months" merge="append">
    exhibits
  </call>
</newspaper>
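
One plausible reading of the `valid` and `merge` attributes is sketched below: `valid` is how long a result is kept, and `merge` says whether a new result replaces the stored ones or is added to them. This interpretation, the helper names `parse_validity` and `apply_result`, the 30-day month and the sample values are all assumptions; the slide does not define the exact semantics.

```python
from datetime import datetime, timedelta

# A month is approximated as 30 days for the purpose of this sketch.
UNIT_DAYS = {"day": 1, "week": 7, "month": 30}


def parse_validity(text):
    """Turn a string such as "1 day" or "6 months" into a timedelta."""
    count, unit = text.split()
    return timedelta(days=int(count) * UNIT_DAYS[unit.rstrip("s")])


def apply_result(stored, new_value, now, valid, mode):
    """Combine a new call result with stored (timestamp, value) pairs."""
    lifetime = parse_validity(valid)
    # Results older than their validity period are dropped.
    fresh = [(t, v) for t, v in stored if now - t < lifetime]
    if mode == "replace":
        return [(now, new_value)]
    if mode == "append":
        return fresh + [(now, new_value)]
    raise ValueError(f"unknown merge mode: {mode}")


now = datetime(2003, 10, 6)
exhibits = [(datetime(2003, 1, 1), "winter show"), (datetime(2003, 9, 29), "autumn show")]

updated_exhibits = apply_result(exhibits, "new show", now, "6 months", "append")
updated_temp = apply_result([(datetime(2003, 10, 5), "14°C")], "16°C", now, "1 day", "replace")
print(updated_exhibits)
print(updated_temp)
```

With these settings the January exhibit falls out of the six-month archive, while the temperature always holds a single, most recent value.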

SLIDE 32

Providing declarative AXML services

Services can be defined by queries or updates over the AXML documents of the repository (XQuery, XPath, XUpdate)

Users can subscribe to services

Services can be composed (BPEL4WS)

Which (lazy) service calls may contribute to the answer?

let service GetExhibitsByLocation($loc) be
  for $a in document("newspaper.xml")/newspaper/exhibits,
      $b in $a//exhibit
  where $b/@name = $loc
  return <exhibits> {$b} </exhibits>
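
For readers less familiar with XQuery, here is a rough Python equivalent of the service body, evaluated over an in-memory document with `xml.etree.ElementTree`. The sample exhibits and the function name `get_exhibits_by_location` are invented, and the sketch ignores the AXML-specific part (any lazy calls inside the document would have to be materialized first).

```python
import xml.etree.ElementTree as ET

# Invented sample data standing in for newspaper.xml.
NEWSPAPER = ET.fromstring("""
<newspaper>
  <exhibits>
    <exhibit name="Le Louvre">Exhibit A</exhibit>
    <group><exhibit name="Orsay">Exhibit B</exhibit></group>
    <exhibit name="Le Louvre">Exhibit C</exhibit>
  </exhibits>
</newspaper>""")


def get_exhibits_by_location(newspaper, loc):
    """Mirror the query: one <exhibits> wrapper per matching exhibit."""
    results = []
    for a in newspaper.findall("exhibits"):        # /newspaper/exhibits
        for b in a.iter("exhibit"):                # $a//exhibit
            if b.get("name") == loc:               # where $b/@name = $loc
                wrapper = ET.Element("exhibits")   # return <exhibits>{$b}</exhibits>
                wrapper.append(b)
                results.append(wrapper)
    return results


louvre = get_exhibits_by_location(NEWSPAPER, "Le Louvre")
print([ET.tostring(r, encoding="unicode") for r in louvre])
```

As in the query, descendants at any depth are considered, so the exhibit nested inside `group` is found when asking for "Orsay".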

SLIDE 33

Active XML - Outline

Introduction

Active XML

  • Active XML documents
  • Active XML services

New issues

  • Exchanging Active XML data
  • Querying Active XML data
  • Distribution and replication
  • Security and Access control

Active XML Peers

  • The peer as a client
  • The peer as a server
  • Theoretical foundations (PODS’04, PODS’05)

Applications

Conclusion

SLIDE 34

Applications

SLIDE 35

Demos

Peer-to-peer auctions (VLDB 2002 demo)

  • Discovery of new peers/auctions through intensional answers

RSS News syndication (VLDB 2003 demo)

  • Customization of services through schemas + news subscriptions

Decentralized management of patient data (VLDB 2004 demo)

  • Use AXML to coordinate the integration of data and privacy enforcement services in a uniform way

Querying Business Processes (VLDB 2005 demo)

  • Use AXML to model and query BPEL specifications

Others…

A powerful framework for the fast development of distributed, data-centric applications.

SLIDE 36

Other applications

Dynamic warehouse on food risk management (E.dot)

  • Use AXML as the platform for the warehouse definition, construction and maintenance

Network configuration (SWAN)

  • Consider using AXML exchange of information to configure hardware/software components

Software distribution (EDOS)

  • Consider using AXML to customize distributions and keep your view of the software fresh

SLIDE 37

Conclusion

SLIDE 38

AXML documents and services

A simple paradigm…

…that allows for new, powerful features

  • Intensional parameters and results: AXML documents can be exchanged

  • Support for continuous services (streams of answers)
  • Control over the exchange of AXML data
  • Lazy query evaluation

AXML implementation goes Open Source (ObjectWeb consortium)

SLIDE 39

Thanks: Serge Abiteboul, Omar Benjelloun, Ioana Manolescu, Bernd Amann, Jerome Baumgarten, Bogdan Cautis, and many others…