
Scalable XQuery Type Matching (Jens Teubner, IBM T. J. Watson Research Center)



  1. Scalable XQuery Type Matching. Jens Teubner · IBM T. J. Watson Research Center · teubner@us.ibm.com

  2. Scalable XQuery Type Matching

     Type matching: inspection of dynamic type information at runtime.

         typeswitch (x_1, x_2, ..., x_k)
           case t_1 return e_1
           case t_2 return e_2
           ...
           case t_n return e_n
           default return e_def

     (1) Compare runtime types of (x_1, ..., x_k) against t_i in turn.
     (2) First matching branch determines expression result.

     Likewise:

         e instance of t
         e/ax::element(n, t)

     This talk describes a scalable and efficient implementation for (1).
     → Leverage existing DBMS capabilities (aggregation).
     → Faithful to XQuery semantics.

     Scalable XQuery Type Matching Jens Teubner 2 / 14

  3. The XQuery Data Model

     XQuery: item = value + type annotation

         x = v of type t                       (atomic values)
         x = element n of type t { ··· }       (element nodes)
         x = attribute n of type t { ··· }     (attribute nodes)
         x = text { ··· }                      (text nodes) [1]
         ...

     A type annotation t references a (named) XML Schema type.
     Type information may come, e.g., from a validated XML instance.
     Type matching is XQuery's means to access type annotations.

     [1] Text, comment, and processing instruction nodes do not carry type information.

  4. The XDM Type Hierarchy

     Types arrange into a hierarchy.
     Derived types are added according to their base type.

         xs:anyType
           xs:untyped
           xs:anySimpleType
             xs:anyAtomicType
               xs:boolean
               xs:decimal
                 xs:integer
               xs:string
               xs:untypedAtomic
             user-defd. list types
           user-defd. complex types

  6. The XDM Type Hierarchy

     Types arrange into a hierarchy.
     Derived types are added according to their base type.

         xs:anyType
           xs:untyped
           xs:anySimpleType
             xs:anyAtomicType
               xs:boolean
               xs:decimal
                 my:shoesize
                 xs:integer
                   my:hatsize
               xs:string
               xs:untypedAtomic
             my:hatsizelist
           my:stockitem

     Example:

         let $x := my:hatsize(56)
         return $x instance of xs:decimal

     Existing implementations take the semantics of type matching quite literally.
     → Expensive recursion.
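To make the cost of the "literal" reading concrete, here is a minimal Python sketch of a recursive subtype check that climbs the base-type chain one step at a time. The `BASE_TYPE` table and the function name are hypothetical; the parent links are my reading of the example hierarchy and of the pre/size table shown later in the talk, not something the slides state as code.

```python
# Hypothetical base-type table for the example hierarchy.
# Each type maps to its base type; the root maps to None.
BASE_TYPE = {
    "xs:anyType": None,
    "xs:untyped": "xs:anyType",
    "xs:anySimpleType": "xs:anyType",
    "xs:anyAtomicType": "xs:anySimpleType",
    "xs:boolean": "xs:anyAtomicType",
    "xs:decimal": "xs:anyAtomicType",
    "my:shoesize": "xs:decimal",
    "xs:integer": "xs:decimal",
    "my:hatsize": "xs:integer",
    "xs:string": "xs:anyAtomicType",
    "xs:untypedAtomic": "xs:anyAtomicType",
    "my:hatsizelist": "xs:anySimpleType",
    "my:stockitem": "xs:anyType",
}


def derives_from_recursive(actual, expected):
    """Climb the base-type chain until `expected` or the root is reached."""
    if actual is None:
        return False  # walked past the root without finding `expected`
    if actual == expected:
        return True
    return derives_from_recursive(BASE_TYPE[actual], expected)


# my:hatsize -> xs:integer -> xs:decimal: two recursive steps.
print(derives_from_recursive("my:hatsize", "xs:decimal"))
print(derives_from_recursive("xs:string", "xs:decimal"))
```

The running time of this check grows with the depth of the hierarchy, and it has to be repeated for every item that is matched.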

  8. Type Ranks

     Use tree encoding to encode type hierarchy.
     → pre: preorder rank (of types!)
     → size: number of derived types
     → cf. XPath Accelerator

     Use pre values to implement type annotations.
     → "type ranks"

         type                      pre  size
         xs:anyType                  0    12
           xs:untyped                1     0
           xs:anySimpleType          2     9
             xs:anyAtomicType        3     7
               xs:boolean            4     0
               xs:decimal            5     3
                 my:shoesize         6     0
                 xs:integer          7     1
                   my:hatsize        8     0
               xs:string             9     0
               xs:untypedAtomic     10     0
             my:hatsizelist         11     0
           my:stockitem             12     0

     t_1 derives from t_2 ⇔

     \[ \underbrace{\operatorname{pre}(t_2)}_{\text{known at compile time!}} \le \operatorname{pre}(t_1) \le \underbrace{\operatorname{pre}(t_2) + \operatorname{size}(t_2)}_{\text{known at compile time!}} \]
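The pre/size table above can be reproduced with a single preorder traversal. The following Python sketch does that for the example hierarchy; `CHILDREN`, `assign_ranks`, and `derives_from` are illustrative names of my own, and the child lists are my reading of the slide's table rather than code from the talk.

```python
# Children of each type in the example hierarchy (leaves are omitted).
CHILDREN = {
    "xs:anyType": ["xs:untyped", "xs:anySimpleType", "my:stockitem"],
    "xs:anySimpleType": ["xs:anyAtomicType", "my:hatsizelist"],
    "xs:anyAtomicType": ["xs:boolean", "xs:decimal", "xs:string",
                         "xs:untypedAtomic"],
    "xs:decimal": ["my:shoesize", "xs:integer"],
    "xs:integer": ["my:hatsize"],
}


def assign_ranks(root):
    """Return {type: (pre, size)} computed by one preorder traversal."""
    ranks = {}
    counter = 0

    def visit(t):
        nonlocal counter
        pre = counter
        counter += 1
        for child in CHILDREN.get(t, []):
            visit(child)
        # All ranks handed out after `pre` during this call are descendants.
        ranks[t] = (pre, counter - pre - 1)

    visit(root)
    return ranks


RANKS = assign_ranks("xs:anyType")


def derives_from(t1, t2):
    """t1 derives from t2 iff pre(t2) <= pre(t1) <= pre(t2) + size(t2)."""
    pre1, _ = RANKS[t1]
    pre2, size2 = RANKS[t2]
    return pre2 <= pre1 <= pre2 + size2


print(RANKS["xs:decimal"])                      # (5, 3)
print(derives_from("my:hatsize", "xs:decimal")) # True
```

Once the ranks exist, `derives_from` needs only two integer comparisons, regardless of how deep the hierarchy is.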

  9. Type Ranks

         let $x := my:hatsize(56)
         return $x instance of xs:decimal

         type                      pre  size
         xs:anyType                  0    12
           xs:untyped                1     0
           xs:anySimpleType          2     9
             xs:anyAtomicType        3     7
               xs:boolean            4     0
               xs:decimal            5     3
                 my:shoesize         6     0
                 xs:integer          7     1
                   my:hatsize        8     0
               xs:string             9     0
               xs:untypedAtomic     10     0
             my:hatsizelist         11     0
           my:stockitem             12     0

     $x = 56 of type 8 (my:hatsize)

     $x instance of xs:decimal ⇔

     \[ \underbrace{5}_{\texttt{xs:decimal}} \le \underbrace{8}_{\texttt{my:hatsize}} \le \underbrace{5 + 3}_{\texttt{xs:decimal}} \]

     Decidable in constant time.

  10. Sequences and Occurrence Indicators

      The argument to type matching typically is a sequence.

      \[ (x_1, \dots, x_k) \ \text{instance of}\ t\,\circ \qquad \circ \in \{\ \text{(none)},\ ?,\ +,\ *\ \} \]

      The match succeeds iff
      (1) x_i matches t for all x_i in x = (x_1, ..., x_k), and
      (2) the sequence length k is compatible with the occurrence indicator \(\circ\).
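Condition (2) depends only on the sequence length. A minimal Python sketch, assuming the standard XQuery meaning of the indicators (no indicator: exactly one item, `?`: zero or one, `+`: one or more, `*`: any number); the function name and the use of an empty string for "no indicator" are my own conventions.

```python
def length_compatible(k, indicator):
    """Check a sequence length k against an XQuery occurrence indicator.

    `indicator` is "" (no indicator), "?", "+", or "*".
    """
    if indicator == "":
        return k == 1   # exactly one item
    if indicator == "?":
        return k <= 1   # zero or one item
    if indicator == "+":
        return k >= 1   # one or more items
    if indicator == "*":
        return True     # any length, including empty
    raise ValueError(f"unknown occurrence indicator: {indicator!r}")


print(length_compatible(2, "*"))
print(length_compatible(2, "?"))
```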

  12. Sequences and Occurrence Indicators

      Expressed in terms of type ranks:

      (1) x_i matches t for all x_i in x = (x_1, ..., x_k)

      \[ \Leftrightarrow\ \forall\, (x_i = v_i \text{ of type } t_i) \in x :\ \operatorname{pre}(t_i) \ge \operatorname{pre}(t) \ \wedge\ \operatorname{pre}(t_i) \le \operatorname{pre}(t) + \operatorname{size}(t) \]

      Type aggregation:

      \[ \Leftrightarrow\ \Bigl( \min_{(x_i = v_i \text{ of type } t_i) \in x} \operatorname{pre}(t_i) \Bigr) \ge \operatorname{pre}(t) \ \wedge\ \Bigl( \max_{(x_i = v_i \text{ of type } t_i) \in x} \operatorname{pre}(t_i) \Bigr) \le \operatorname{pre}(t) + \operatorname{size}(t) \]

      Find minimum and maximum type ranks first, then compare once.
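The equivalence between the per-item test and the aggregated test can be sketched in a few lines of Python. The function names are hypothetical, and treating the empty sequence as a match is my own choice (the universally quantified form is vacuously true there); the length check against the occurrence indicator is a separate step.

```python
def matches_per_item(type_ranks, pre_t, size_t):
    """Literal form: test every item's type rank against the interval."""
    return all(pre_t <= r <= pre_t + size_t for r in type_ranks)


def matches_aggregated(type_ranks, pre_t, size_t):
    """Aggregated form: find min and max first, then compare once."""
    if not type_ranks:
        return True  # assumption: empty sequence satisfies the item condition
    return min(type_ranks) >= pre_t and max(type_ranks) <= pre_t + size_t


# Type ranks from the talk's example: my:shoesize = 6, my:hatsize = 8,
# xs:string = 9; target type xs:decimal has pre = 5, size = 3.
print(matches_aggregated([6, 8], 5, 3))
print(matches_aggregated([6, 9], 5, 3))
```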

  13. Type Aggregation

      Aggregation (once more) beneficial for efficient XML processing.
      Implementations highly tuned in today's DBMSs.

      Likewise: use aggregation to test compatibility with the occurrence indicator \(\circ\):
      (2) the sequence length k is compatible with \(\circ\)
          ⇔ Count sequence items, then compare according to \(\circ\).

  14. Type Aggregation in Relational XQuery

      Example: XQuery on purely relational database back-ends. [2]

          iter  pos  item  type
          1     1    43    6
          1     2    56    8
          2     1    "XL"  9

      All loops unrolled, iter: logical iteration.
      pos: sequence order, item holds payload.
      New column type: preorder type ranks.

      Type aggregation:

          SELECT iter, MIN(type), MAX(type), COUNT(*)
            FROM q
           GROUP BY iter

      [2] http://www.pathfinder-xquery.org/

  15. Type Aggregation in Relational XQuery

      Example: e instance of xs:decimal*

      (1) Add type information to loop-lifted sequence encoding.

          iter  pos  item  type
          1     1    43    6     (my:shoesize)
          1     2    56    8     (my:hatsize)
          2     1    "XL"  9     (xs:string)

      (2) Aggregate, then compare.

          aggregate:

          iter  min  max
          1     6    8
          2     9    9

          compare: min ≥ 5 ∧ max ≤ 5 + 3 ?

          iter  min  max  res
          1     6    8    true
          2     9    9    false

      (3) Projection re-establishes loop-lifted encoding.

          project:

          iter  pos  item   type
          1     1    true   4     (xs:boolean)
          2     1    false  4     (xs:boolean)

      → Standard DBMS operators suffice.
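The aggregate, compare, and project steps can be tried out with Python's built-in `sqlite3` module. This is only a sketch under my own assumptions: the talk's proof of concept ran on DB2, the table name `q` follows the earlier slide, and the output column names `res` and `res_type` are made up to avoid clashing with the input columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Loop-lifted sequence encoding with the extra type-rank column.
conn.execute(
    "CREATE TABLE q (iter INTEGER, pos INTEGER, item TEXT, type INTEGER)"
)
conn.executemany(
    "INSERT INTO q VALUES (?, ?, ?, ?)",
    [(1, 1, "43", 6), (1, 2, "56", 8), (2, 1, "XL", 9)],
)

PRE_T, SIZE_T = 5, 3   # pre/size of xs:decimal in the example hierarchy
BOOLEAN_RANK = 4       # type rank of xs:boolean, the type of the result

# Aggregate per iteration, compare once, and project back to
# (iter, pos, item, type). COUNT(*) is not needed here because the
# occurrence indicator * accepts any sequence length.
rows = conn.execute(
    """
    SELECT iter,
           1 AS pos,
           MIN(type) >= :pre AND MAX(type) <= :pre + :size AS res,
           :bool AS res_type
      FROM q
     GROUP BY iter
     ORDER BY iter
    """,
    {"pre": PRE_T, "size": SIZE_T, "bool": BOOLEAN_RANK},
).fetchall()

# SQLite returns 0/1 for boolean expressions; convert for readability.
result = [(it, pos, bool(res), ty) for it, pos, res, ty in rows]
print(result)  # [(1, 1, True, 4), (2, 1, False, 4)]
```

Iterations whose sequence is empty have no rows in `q` and therefore do not show up in this result; handling them would need an extra step that this sketch leaves out.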

  17. Type Aggregation in an RDBMS

      Proof-of-concept implementation using SQL.

      [Figure: execution time [sec] (logarithmic axis, 0.1 to 1000) versus average sequence length / iteration (5, 20, 50 non-indexed; 50 indexed). Series: recursive; type ranks; type ranks + aggregation. Annotations: DB2 9 SQL; FpML schema (777 types); 10,000 for iterations.]

  18. Type Aggregation has Even Further Potential

      Type aggregation yields new runtime guarantees.
      typeswitch: Match a sequence against a number of types in turn.

                                           Traditional:   Type aggregation:
          typeswitch (x_1, x_2, ..., x_k)                 aggregate  O(k)
            case t_1 return e_1            match O(k)     compare    O(1)
            case t_2 return e_2            match O(k)     compare    O(1)
            ...                            ...            ...
            case t_n return e_n            match O(k)     compare    O(1)
            default return e_def
          total                            O(n · k)       O(n + k)

      Recursion may further increase left-hand-side complexity.
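A short Python sketch of the right-hand column: aggregate the sequence's type ranks once, then test each case with two comparisons. The function and parameter names are hypothetical, and the sketch assumes a non-empty sequence and ignores occurrence indicators to keep the complexity argument visible.

```python
def typeswitch(type_ranks, cases, default):
    """Pick the first matching branch in O(n + k).

    type_ranks: type ranks of the k sequence items (assumed non-empty).
    cases: list of ((pre, size), branch) pairs, tried in order.
    default: branch returned when no case matches.
    """
    lo, hi = min(type_ranks), max(type_ranks)   # aggregate: O(k), done once
    for (pre_t, size_t), branch in cases:       # n cases, O(1) each
        if lo >= pre_t and hi <= pre_t + size_t:
            return branch
    return default


# Ranks from the example hierarchy: xs:string = (9, 0), xs:decimal = (5, 3),
# xs:anyAtomicType = (3, 7).
cases = [((9, 0), "string"), ((5, 3), "decimal"), ((3, 7), "atomic")]
print(typeswitch([6, 8], cases, "default"))  # decimal
print(typeswitch([6, 9], cases, "default"))  # atomic
```

The traditional evaluation would rescan all k items for each of the n cases; here the scan happens once, before the first case is tried.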

  19. Summary

      A scalable implementation for XQuery's dynamic type semantics.

      Type ranks: constant time for singleton type matching.
      → Inspired by XPath Accelerator tree encoding.

      Type aggregation: use aggregation to handle sequences.
      → Exploit efficient implementations in modern DBMSs.

      New runtime guarantees: O(n · k) → O(n + k) for typeswitches.

      Faithful to XQuery semantics.
      → Paper also covers XML node matching, incl. substitution groups.
