
Designing Information-Preserving Mapping Schemes for XML



  1. Designing Information-Preserving Mapping Schemes for XML. Denilson Barbosa, Juliana Freire, Alberto O. Mendelzon. VLDB 2005.

  2. Motivation
     - An XML-to-relational mapping scheme consists of a procedure for shredding XML documents into relational databases, a procedure for publishing the databases back as documents, and constraints the databases must satisfy.
     - The focus to date has been mostly on the performance of queries (see, e.g., Krishnamurthy et al. [2003] for a survey) and updates (Tatarinov et al. [2001, 2002]).
     - We need to understand the properties of a mapping scheme (in any domain) to determine its suitability for a given application.
       - Well studied for traditional data models (Hull [1986], Abiteboul and Hull [1988], Miller et al. [1993])
       - We are only starting in the XML context ([XSYM’04], Bohannon et al. [2005])

  3. Information Preservation – Goals
     Answering queries:
     - Requires reconstructing every fragment of the document: losslessness [XSYM’04]
     - Previous methods (possibly with simple extensions) suffice
     Processing updates, preserving document validity:
     - Requires that the resulting database “represents” a valid document and that every valid document can be represented by some database: validation [XSYM’04]
     - Losslessness alone is not enough
     - Problem: checking whether the update is permissible

  4. Example. Consider the following DTD and a valid document:
     DTD:
       mondial ← cities, country*
       cities ← city*
       city ← name, (province | state), official+
       country ← name, capital
       name ← #PCDATA
       province ← #PCDATA
       state ← #PCDATA
       official ← #PCDATA
       capital ← #PCDATA
     Document (figure: tree with numbered nodes; text nodes in quotes):
       1 mondial
         2 cities
           3 city
             4 name (5 "Toronto")
             6 province (7 "Ontario")
             8 official (9 "David")
           10 city
             11 name (12 "Salt Lake City")
             13 state (14 "Utah")
             15 official (16 "Rocky")
             17 official (18 "Sam")
         19 country
           20 name (21 "Brazil")
           22 capital (23 "Brasilia")

  5. Example (cont’d). Consider this (lossless) mapping scheme for the same DTD:
       mondial ← cities, country*
       cities ← city*
       city ← name, (province | state), official+
       country ← name, capital
       name ← #PCDATA
       province ← #PCDATA
       state ← #PCDATA
       official ← #PCDATA
       capital ← #PCDATA
     Relations:
       city(cityId, name, ord, province, state)
       official(officialId, cityId, name, ord)
       country(countryId, name, capital, ord)
     Tuples for the example document:
       city(1, 'Toronto', 1, 'Ontario', NULL)
       city(4, 'Salt Lake City', 2, NULL, 'Utah')
       official(2, 1, 'David', 1)
       official(5, 4, 'Rocky', 1)
       official(6, 4, 'Sam', 2)
       country(7, 'Brazil', 'Brasilia', 1)
     Problems:
     - UPDATE city SET province='Utah' WHERE name='Salt Lake City'
       This is a legal SQL update, yet the resulting city has both a province and a state, which the content model (province | state) does not allow.
     - update delete //city[name='Toronto']/official[last()]
       This cannot be checked statically: whether the city keeps the official that official+ requires depends on the data.
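
The two problems can be reproduced in a small sketch. It assumes SQLite (through Python's `sqlite3` module) as the relational store and uses the relations and tuples from the slide; the column types, the SQL translation of the XPath delete, and the helper `invalid_cities` are illustrative assumptions, not part of the original mapping scheme.

```python
import sqlite3

# In-memory database holding the city and official relations from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city (cityId INTEGER PRIMARY KEY, name TEXT, ord INTEGER,
                   province TEXT, state TEXT);
CREATE TABLE official (officialId INTEGER PRIMARY KEY,
                       cityId INTEGER REFERENCES city(cityId),
                       name TEXT, ord INTEGER);
INSERT INTO city VALUES (1, 'Toronto', 1, 'Ontario', NULL);
INSERT INTO city VALUES (4, 'Salt Lake City', 2, NULL, 'Utah');
INSERT INTO official VALUES (2, 1, 'David', 1);
INSERT INTO official VALUES (5, 4, 'Rocky', 1);
INSERT INTO official VALUES (6, 4, 'Sam', 2);
""")

# Problem 1: the DBMS accepts this update without complaint.
conn.execute("UPDATE city SET province = 'Utah' WHERE name = 'Salt Lake City'")

# Problem 2: an assumed SQL rendering of
#   delete //city[name='Toronto']/official[last()]
conn.execute("""
DELETE FROM official
WHERE cityId = (SELECT cityId FROM city WHERE name = 'Toronto')
  AND ord = (SELECT MAX(ord) FROM official
             WHERE cityId = (SELECT cityId FROM city WHERE name = 'Toronto'))
""")

def invalid_cities(conn):
    """Hypothetical check of  city <- name, (province | state), official+ :
    exactly one of province/state must be set, and at least one official must exist."""
    rows = conn.execute("""
        SELECT c.name FROM city c
        WHERE (c.province IS NULL) = (c.state IS NULL)
           OR NOT EXISTS (SELECT 1 FROM official o WHERE o.cityId = c.cityId)
        ORDER BY c.cityId
    """).fetchall()
    return [name for (name,) in rows]

violations = invalid_cities(conn)
print(violations)
```

Both statements succeed at the SQL level, and only a check written with the DTD in mind reports that the two cities no longer correspond to a valid document.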

  6. Checking for Permissible Updates
     Using a mapping scheme that is only lossless, there are two options:
     - Publish the portions of the database affected by the update, and validate the result
       - Potentially expensive operation; large fragments of the document may have to be reconstructed
     - Build an (incremental) validator into the DBMS
       - In-DBMS validation is expensive (Nicola and John [2003]) and incremental validation requires maintaining considerable auxiliary information ([ICDE’04], Balmin et al. [2004])
       - Requires a new component whose functionality overlaps with the DBMS constraint checking mechanism

  7. Outline
     1. Motivation
     2. Information-Preserving Mapping Schemes
        - Losslessness
        - Validation
     3. Designing Information-Preserving Mapping Schemes
     4. LILO
        - Mapping scheme transformations
     5. Conclusion

  8. Information-Preserving Mapping Schemes
     - A mapping scheme is a triple µ = (σ, π, S): σ maps documents to databases, π publishes databases back as documents, and S is the relational schema with its constraints.
     - A class of mapping schemes is defined by the languages for writing σ, π, and the constraints in S.
     - The XDS class of mapping schemes [XSYM’04]
       - Mapping language: XQuery augmented with mapping expressions
       - Relational constraints: boolean queries in Datalog¬
       - Publishing language: SilkRoute, i.e., XQuery over “canonical” XML views of the databases
       - Powerful by design

  9. Information-Preserving Mapping Schemes
     (Figure: two panels, “lossless mapping scheme” and “lossless and validating mapping scheme”, showing σ and π between equivalence classes of documents and database instances.)
     Legend:
       X: all XML documents
       R(S): all legal instances of S
       L(X): all valid documents w.r.t. X
       [·]: equivalence class
     - µ = (σ, π, S) is lossless iff π(σ(·)) is the identity on equivalence classes of documents
     - µ = (σ, π, S) is lossless and validating iff σ and π are bijective and σ = π⁻¹ (up to equivalence)
     - µ = (σ, π, S) is lossless and validating iff X ≡ S
     Losslessness and validation are undecidable for XDS mapping schemes [XSYM’04]
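
The losslessness condition, π(σ(d)) = d, can be illustrated on a toy scale. The sketch below is not an XDS mapping scheme: it assumes documents are nested Python tuples and uses a hypothetical edge-style shredding, with `sigma` and `pi` as made-up names for the two directions.

```python
# A document is a nested tuple: (label, [children]) or (label, "text").
doc = ("country", [("name", "Brazil"), ("capital", "Brasilia")])

def sigma(doc):
    """Shred a document into edge tuples (pid, eid, ord, label)
    and value tuples (eid, text)."""
    edges, values = [], []
    counter = [0]

    def visit(node, pid, ord_):
        counter[0] += 1
        eid = counter[0]
        label, content = node
        edges.append((pid, eid, ord_, label))
        if isinstance(content, str):
            values.append((eid, content))
        else:
            for i, child in enumerate(content, start=1):
                visit(child, eid, i)

    visit(doc, None, 1)
    return edges, values

def pi(db):
    """Publish the shredded database back as a nested-tuple document."""
    edges, values = db
    text = dict(values)
    children = {}
    root = None
    for pid, eid, ord_, label in edges:
        if pid is None:
            root = (eid, label)
        else:
            children.setdefault(pid, []).append((ord_, eid, label))

    def build(eid, label):
        if eid in text:
            return (label, text[eid])
        kids = sorted(children.get(eid, []))  # restore sibling order via ord
        return (label, [build(k_eid, k_label) for _, k_eid, k_label in kids])

    return build(*root)

# Losslessness on this document: publishing the shredded data gives it back.
round_trip_ok = pi(sigma(doc)) == doc
print(round_trip_ok)
```

Note that the round trip says nothing about validation: `pi` happily publishes any instance it is given, whether or not the result conforms to a schema.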

  10. Designing Mapping Schemes
      (Figure: σ maps X to S and π maps back, giving µ_0; transformations α_1, ..., α_k and β_1, ..., β_k connect S, S_1, ..., S_k, giving µ_1, ..., µ_k.)
      - Goal: designing a mapping scheme µ_k = (σ_k, π_k, S_k) that is both lossless and validating
      - Framework for designing lossless and validating mapping schemes in XDS:
        - Start with µ_0 that is known to be lossless and validating
        - Apply equivalence-preserving transformations between µ_i and µ_{i+1}
        - In the paper: rewriting µ = (σ, π, S) in XDS and α_i, β_i in wrec-ILOG¬ into µ′ = (σ′, π′, S′) in XDS

  11. LILO – Initial Mapping Scheme
      Initial mapping scheme in LILO: Edge++ is both lossless and validating [XSYM’04]
      - Relational schema:
        - Edge, FLC, ILS, Value: document structure and content
        - Type: element types
        - Transition: transition functions of all content models in the DTD
      - Constraints:
        - Structural constraints ensure the database represents a well-formed XML document; e.g., the database encodes a tree, the ordering of siblings is consistent, etc.
        - Validating constraints ensure that the content of every element is valid; i.e., spells a word accepted by an appropriate DFA
      - Each validation constraint is implemented by a recursive Datalog¬ program
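
One of the structural constraints, that the database encodes a tree, can be sketched as a plain Python check over Edge tuples. This is only an illustration and not the Datalog¬ formulation used in LILO; the function name `encodes_tree` and the convention that the root is the one pid that never occurs as an eid are assumptions.

```python
def encodes_tree(edges):
    """edges: iterable of (pid, eid, label) tuples.
    True when every node has exactly one parent, there is a single root,
    and every node reaches that root without going around a cycle."""
    parent = {}
    for pid, eid, _label in edges:
        if eid in parent:          # a second parent for the same node
            return False
        parent[eid] = pid
    # Assumed convention: the root is a pid that is never anyone's eid.
    roots = {pid for pid in parent.values() if pid not in parent}
    if len(roots) != 1:
        return False
    root = next(iter(roots))
    for node in parent:
        seen = set()
        while node != root:
            if node in seen:       # walked around a cycle
                return False
            seen.add(node)
            node = parent[node]
    return True

sample = {(1, 19, "country"), (19, 20, "name"), (19, 22, "capital")}
print(encodes_tree(sample))
```

Inside a DBMS, the single-parent part maps naturally onto a key on `eid`, while the cycle-free part is the kind of condition that calls for recursion.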

  12. LILO Transformations – Example
      Goal: replace a validating constraint by equivalent constraints that are easier to enforce
      Example: enforcing the rule country ← name, capital
      - Initial Edge++ mapping (S_0), for the subtree country 19 with children name 20 ("Brazil") and capital 22 ("Brasilia"):
          Edge0(pid, eid, label): (1, 19, country), (19, 20, name), (19, 22, capital)
          FLC0(pid, first, last): (19, 20, 22)
          ILS0(left, right): (20, 22)
          Value0(eid, value): (20, Brazil), (22, Brasilia)
          Type0(eid, type): (19, t1)
          Transition0(type, from, label, to, acc): (t1, q0, name, q1, no), (t1, q1, capital, q2, yes)
      - Validation constraint: recursive Datalog¬ program
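
The validating constraint for country ← name, capital amounts to running the children's labels through the automaton stored in Transition0. The sketch below does this in Python instead of recursive Datalog¬; it assumes that q0 is the start state, that q0 is not accepting, and that the `acc` column says whether the target state is accepting. The name `content_is_valid` is hypothetical.

```python
# Rows of Transition0 from the slide: (type, from, label, to, acc)
TRANSITION = [
    ("t1", "q0", "name", "q1", "no"),
    ("t1", "q1", "capital", "q2", "yes"),
]

def content_is_valid(element_type, child_labels, transition=TRANSITION, start="q0"):
    """Return True when the sequence of child labels is accepted by the DFA
    of the given element type. Assumes the start state is not accepting."""
    delta = {(t, frm, label): (to, acc == "yes")
             for t, frm, label, to, acc in transition}
    state, accepting = start, False
    for label in child_labels:
        step = delta.get((element_type, state, label))
        if step is None:           # no transition: the content is rejected
            return False
        state, accepting = step
    return accepting

print(content_is_valid("t1", ["name", "capital"]))
print(content_is_valid("t1", ["name"]))
```

The loop over siblings is what the recursive program expresses declaratively, walking from the first child to the last along ILS.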

  13. LILO Transformations – Example
      Step 1: inline the name and capital elements
      S_0:
          Edge0(pid, eid, label): (1, 19, country), (19, 20, name), (19, 22, capital)
          FLC0(pid, first, last): (19, 20, 22)
          ILS0(left, right): (20, 22)
          Value0(eid, value): (20, Brazil), (22, Brasilia)
      S_1:
          Edge1(pid, eid, label): (1, 19, country)
          Country1(country, name, capital): (19, 20, 22)
          Value1(eid, value): (20, Brazil), (22, Brasilia)
      Validation constraints:
      - name and capital are unique in Country1
      - FKs: name and capital in Country1 refer to value in Value1
      α_1 : R(S_0) → R(S_1)
          Diff(e) :- Edge0(_, e, 'country')
          Diff(e) :- Edge0(_, e, 'capital')
          Diff(e) :- Edge0(_, c, 'country'), Edge0(c, e, 'name')
          Country1(e, n, c) :- Edge0(e, n, 'name'), Edge0(e, c, 'capital')
          Edge1(e, c, l) :- Edge0(e, c, l), ¬Diff(e)
          FLC1(p, f, l) :- FLC0(p, f, l), ¬Diff(p)
          ILS1(l, r) :- ILS0(l, r), ¬Diff(l)
          Value1(e, v) :- Value0(e, v)
      β_1 : R(S_1) → R(S_0)
          Edge0(e, c, l) :- Edge1(e, c, l)
          Edge0(e, c, l) :- Edge1(e, _, 'country'), Country(c, _, _), l = 'country'
          Edge0(e, c, l) :- Country1(e, c, _), l = 'name'
          Edge0(e, c, l) :- Country1(e, _, c), l = 'capital'
          FLC0(p, f, l) :- FLC1(p, f, l)
          FLC0(p, f, l) :- Country(p, f, l)
          ILS0(l, r) :- ILS1(l, r)
          ILS0(l, r) :- Country1(_, l, r)
          Value0(e, v) :- Value1(e, v)
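
To see the two rule sets at work, the following sketch evaluates them over the sample instance with Python sets. It is a hand translation of the rules on the slide, not the wrec-ILOG¬ machinery of the paper; the dictionary layout and the names `alpha1` and `beta1` are assumptions, and the relation written `Country` in two of the β_1 rules is read here as Country1.

```python
S0 = {
    "Edge": {(1, 19, "country"), (19, 20, "name"), (19, 22, "capital")},
    "FLC": {(19, 20, 22)},
    "ILS": {(20, 22)},
    "Value": {(20, "Brazil"), (22, "Brasilia")},
}

def alpha1(s0):
    """Forward transformation: inline name and capital into Country1."""
    edge0 = s0["Edge"]
    # Diff collects country and capital elements, plus name children of a country.
    diff = {e for (_, e, l) in edge0 if l in ("country", "capital")}
    diff |= {e for (_, c, lc) in edge0 if lc == "country"
               for (p, e, l) in edge0 if p == c and l == "name"}
    country = {(e, n, c) for (e, n, ln) in edge0 if ln == "name"
                         for (e2, c, lc) in edge0 if e2 == e and lc == "capital"}
    return {
        "Edge": {t for t in edge0 if t[0] not in diff},
        "Country": country,
        "FLC": {t for t in s0["FLC"] if t[0] not in diff},
        "ILS": {t for t in s0["ILS"] if t[0] not in diff},
        "Value": set(s0["Value"]),
    }

def beta1(s1):
    """Backward transformation: rebuild the Edge++ relations from S_1."""
    country = s1["Country"]
    edge0 = set(s1["Edge"])
    edge0 |= {(e, c, "country") for (e, _, l) in s1["Edge"] if l == "country"
                                for (c, _, _) in country}
    edge0 |= {(e, n, "name") for (e, n, _) in country}
    edge0 |= {(e, c, "capital") for (e, _, c) in country}
    return {
        "Edge": edge0,
        "FLC": s1["FLC"] | country,
        "ILS": s1["ILS"] | {(l, r) for (_, l, r) in country},
        "Value": set(s1["Value"]),
    }

S1 = alpha1(S0)
print(S1["Edge"], S1["Country"])
print(beta1(S1) == S0)
```

On this instance the inlined schema keeps a single Edge tuple, and applying `beta1` after `alpha1` gives back exactly the relations of S_0, which is the behaviour an equivalence-preserving transformation needs.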
