Processing XML: XPath, XQuery (Ramakrishnan & Gehrke, Chapter 24)


  1. Processing XML: XPath, XQuery
     Ramakrishnan & Gehrke, Chapter 24 / 27
     320302 Databases & WebServices (P. Baumann)

  2. Why are we DB'ers interested?
     • It's data, stupid. That's us.
     • Database issues:
       • How are we going to model XML? Trees, graphs
       • How are we going to query XML? XQuery
       • How are we going to store XML? In a relational database? Object-oriented? Native?
       • How are we going to process XML efficiently? Many interesting research questions!
     320302 Databases & WebApplications (P. Baumann)

  3. XML Revisited
     • From a data modelling viewpoint, what does XML offer?
     • Entities (ER!)
     • Attributes
       • Single-valued, atomic
     • Relationships? Yes, but:
       • Single-root trees only
       • Unordered, no role names
       • General graphs through id/idrefs, syntax only

  4. Roadmap
     • XPath
     • XQuery

  5. Path Expressions: XPath
     • Basic concept: path = sequence of location steps. Each location step consists of:
       • an axis: tree relationship between the current node and the nodes selected by the location step (parent, child, self, descendant-or-self, attribute, ...)
       • a node test: node type + expanded-name of the nodes selected by the location step
       • 0..* predicates: further refinement
     • General location step syntax: axisname::nodetest[predicate]
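To make the step syntax axisname::nodetest[predicate] concrete, here is a minimal Python sketch that splits one unabbreviated location step into its three parts. The function name `parse_step` and the regular expression are illustrative assumptions, not part of any XPath library; the sketch ignores abbreviations such as `@` and `..`, and predicates containing nested brackets.

```python
import re

# One location step: optional "axis::", a node test, then zero or more [predicates].
# Hypothetical helper for illustration only; real XPath parsing is more involved.
STEP = re.compile(
    r"^(?:(?P<axis>[a-z-]+)::)?"      # axis name, e.g. child, descendant-or-self
    r"(?P<test>[^\[\]]+)"             # node test, e.g. cd, node(), *
    r"(?P<preds>(?:\[[^\]]*\])*)$"    # 0..* predicates, no nesting supported
)

def parse_step(step):
    """Return (axis, node_test, predicates) for a single location step."""
    match = STEP.match(step.strip())
    if match is None:
        raise ValueError(f"not a simple location step: {step!r}")
    axis = match.group("axis") or "child"   # a step without an axis uses child
    predicates = re.findall(r"\[([^\]]*)\]", match.group("preds"))
    return axis, match.group("test").strip(), [p.strip() for p in predicates]

print(parse_step("child::cd[price][1]"))
print(parse_step("descendant-or-self::node()"))
```

A step written without an axis, such as `cd`, comes back with the axis `child`, which matches the abbreviated syntax covered later in the deck.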

  6. Pattern Expressions
     • identify nodes in document
     • path through the XML document: .../node1/node2/...
     • pattern "selects" elements that match path, result is a (sub)tree
       • "all price elements of all cd elements of the catalog element": /catalog/cd/price
         result: <price>10.90</price> <price>9.90</price> <price>9.90</price>
     Example document:
       <?xml version="1.0" encoding="ISO-8859-1"?>
       <catalog>
         <cd country="USA">
           <title>Empire Burlesque</title>
           <artist>Bob Dylan</artist>
           <price>10.90</price>
         </cd>
         <cd country="UK">
           <title>Hide your heart</title>
           <artist>Bonnie Tyler</artist>
           <price>9.90</price>
         </cd>
         <cd country="USA">
           <title>Greatest Hits</title>
           <artist>Dolly Parton</artist>
           <price>9.90</price>
         </cd>
       </catalog>
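The pattern /catalog/cd/price can be tried out with Python's standard `xml.etree.ElementTree` module, as a rough sketch. ElementTree supports only a subset of XPath and evaluates paths relative to an element, so the absolute path is written here as `cd/price` relative to the `<catalog>` root; the variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# The catalog from the slide (XML declaration omitted for brevity).
CATALOG = """
<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>
"""

root = ET.fromstring(CATALOG)          # root is the <catalog> element

# /catalog/cd/price, written relative to the catalog element
price_elements = root.findall("cd/price")
prices = [p.text for p in price_elements]
print(prices)                          # ['10.90', '9.90', '9.90']
```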

  7. Paths
     • Absolute vs. relative vs. fitting:
       • path starts with slash ( / ): absolute path
       • path starts with double slash ( // ): all fitting elements, even if at different levels in tree
       • otherwise: path relative to current position
     • Relative addressing via axis:
       • node set relative to current node
       • all children of parent, child, self, ancestor, descendant, attribute, ...
     (Example document: the CD catalog from slide 6.)
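The difference between a "fitting" path (//) and a path relative to the current position might be sketched as follows, again with the standard `xml.etree.ElementTree` module. ElementTree has no document-level absolute paths on elements, so `.//price` (all price descendants of the start element) stands in for //price; this substitution and the variable names are assumptions of the sketch.

```python
import xml.etree.ElementTree as ET

CATALOG = """
<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>
"""

root = ET.fromstring(CATALOG)

# "//price": every price element, no matter how deep it sits in the tree
all_prices = [p.text for p in root.findall(".//price")]

# Relative path: evaluated against the current node, here the first cd element
first_cd = root.find("cd")
artist_of_first_cd = first_cd.find("artist").text

print(all_prices)
print(artist_of_first_cd)
```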

  8. Examples

  9. More Examples
     Tree (nodes numbered in document order):
       <1>
         <2>
           <3/>
           <4/>
         </2>
         <5/>
       </1>
     • self({2}) = {2}
     • child({1}) = {2,5}
     • parent({3}) = {2}
     • descendant({1}) = {2,3,4,5}
     • descendant-or-self({1}) = {1,2,3,4,5}
     • ancestor({4}) = {1,2}
     • ancestor-or-self({4}) = {1,2,4}
     • following({3}) = {4,5}
     • preceding({4}) = {3}
     • following-sibling({4}) = {}
     • preceding-sibling({5}) = {2}
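The axis results above can be checked with a small Python sketch that models the five-node tree as plain dictionaries. The data structures and function names are illustrative assumptions; the numbers are just node labels, as on the slide (they would not be valid XML element names).

```python
# Tree from the slide: node 1 has children 2 and 5; node 2 has children 3 and 4.
CHILDREN = {1: [2, 5], 2: [3, 4], 3: [], 4: [], 5: []}
PARENT = {c: p for p, kids in CHILDREN.items() for c in kids}
DOC_ORDER = [1, 2, 3, 4, 5]            # pre-order traversal = document order

def child(n):
    return set(CHILDREN[n])

def parent(n):
    return {PARENT[n]} if n in PARENT else set()

def descendant(n):
    result = set()
    for c in CHILDREN[n]:
        result |= {c} | descendant(c)
    return result

def ancestor(n):
    result = set()
    while n in PARENT:
        n = PARENT[n]
        result.add(n)
    return result

def following(n):
    # everything later in document order, except the node's own descendants
    later = DOC_ORDER[DOC_ORDER.index(n) + 1:]
    return set(later) - descendant(n)

def preceding(n):
    # everything earlier in document order, except the node's ancestors
    earlier = DOC_ORDER[:DOC_ORDER.index(n)]
    return set(earlier) - ancestor(n)

def following_sibling(n):
    if n not in PARENT:
        return set()
    siblings = CHILDREN[PARENT[n]]
    return set(siblings[siblings.index(n) + 1:])

def preceding_sibling(n):
    if n not in PARENT:
        return set()
    siblings = CHILDREN[PARENT[n]]
    return set(siblings[:siblings.index(n)])

print(following(3), preceding(4), preceding_sibling(5))
```

The -or-self variants follow by adding the node itself, for example `ancestor(4) | {4}`.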

  10. Wildcards
      • * selects unknown elements
      • "all child elements of all cd of catalog": /catalog/cd/*
      • "all price elements that are grandchildren of catalog": /catalog/*/price
      • "all price elements which have 2 ancestors": /*/*/price
      • "all elements": //*
      (Example document: the CD catalog from slide 6.)
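A short sketch of the wildcard patterns using the standard `xml.etree.ElementTree` module. Paths are written relative to the `<catalog>` root because ElementTree evaluates them from an element; note that `.//*` in ElementTree returns only the descendants of the start element, so `root.iter()` is used here to include the root itself, as //* would. Variable names are illustrative.

```python
import xml.etree.ElementTree as ET

CATALOG = """
<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>
"""

root = ET.fromstring(CATALOG)

cd_children = root.findall("cd/*")          # /catalog/cd/*
grandchild_prices = root.findall("*/price") # /catalog/*/price
descendants = root.findall(".//*")          # all elements below catalog
all_elements = list(root.iter())            # //* : catalog itself plus all descendants

print([e.tag for e in cd_children])
print(len(grandchild_prices), len(descendants), len(all_elements))
```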

  11. Abbreviations
      • a/b/c
        • ./child::a/child::b/child::c
      • a//@id
        • ./child::a/descendant-or-self::node()/attribute::id
      • //a
        • root(.)/descendant-or-self::node()/child::a
      • a/text()
        • ./child::a/child::text()
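The expansion rules in this list are mechanical enough to sketch as a small string rewriter. The functions `expand` and `expand_step` below are hypothetical helpers written for illustration; they split on "/" naively, so paths whose predicates contain slashes are outside what this sketch handles.

```python
def expand_step(step):
    """Expand one abbreviated step into axis::nodetest form."""
    if "::" in step:                 # already unabbreviated
        return step
    if step == ".":
        return "self::node()"
    if step == "..":
        return "parent::node()"
    if step.startswith("@"):         # @name is short for attribute::name
        return "attribute::" + step[1:]
    return "child::" + step          # default axis is child

def expand(path):
    """Expand an abbreviated path, following the slide's notation."""
    if path.startswith("//"):
        head, rest = "root(.)/descendant-or-self::node()", path[2:]
    elif path.startswith("/"):
        head, rest = "root(.)", path[1:]
    else:
        head, rest = ".", path
    # an inner // is short for /descendant-or-self::node()/
    rest = rest.replace("//", "/descendant-or-self::node()/")
    steps = [expand_step(s) for s in rest.split("/") if s]
    return "/".join([head] + steps)

for p in ["a/b/c", "a//@id", "//a", "a/text()"]:
    print(p, "->", expand(p))
```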

  12. Branch Selection
      • Selecting branches from subtree: "[...]"
      • "first cd child of catalog": /catalog/cd[1]
        • /catalog/cd[ position() = 1 ]
      • "last cd child of catalog": /catalog/cd[ last() ]
        • Note: There is no function named first()
      • "all cd elements of catalog that have a price element": /catalog/cd[ price ]
      • "all cd elements of catalog that have a price with value of 10.90": /catalog/cd[ price=10.90 ]
      (Example document: the CD catalog from slide 6.)
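Most of these branch selections can be tried with the standard `xml.etree.ElementTree` module, as sketched below. One caveat of this sketch: ElementTree compares element text as a string, so the predicate is written `price='10.90'` rather than the numeric `price=10.90` from the slide. Paths are relative to the `<catalog>` root and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

CATALOG = """
<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>
"""

root = ET.fromstring(CATALOG)

first_cd = root.find("cd[1]")                   # /catalog/cd[1]
last_cd = root.find("cd[last()]")               # /catalog/cd[last()]
with_price = root.findall("cd[price]")          # /catalog/cd[price]
at_10_90 = root.findall("cd[price='10.90']")    # string comparison, see note above

print(first_cd.findtext("title"))
print(last_cd.findtext("title"))
print(len(with_price), [cd.findtext("title") for cd in at_10_90])
```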

  13. Multiple Paths
      • Selecting several paths: | operator
      • "all title, artist elements": /catalog/cd/title | /catalog/cd/artist
      • "all the title and artist elements in the document": //title | //artist
      • "all title, artist, price elements": //title | //artist | //price
      • "all title elements of cd of catalog, and all artist elements": /catalog/cd/title | //artist
      (Example document: the CD catalog from slide 6.)
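Python's standard `xml.etree.ElementTree` module does not implement the | operator, so the union //title | //artist is emulated in this sketch by walking the tree once in document order and keeping the elements whose tag is in the wanted set. This is an assumption-laden stand-in for the operator, not how an XPath engine evaluates it; a single walk also means no element can appear twice.

```python
import xml.etree.ElementTree as ET

CATALOG = """
<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>
"""

root = ET.fromstring(CATALOG)

# //title | //artist : root.iter() visits elements in document order
wanted = {"title", "artist"}
union = [e for e in root.iter() if e.tag in wanted]
union_texts = [e.text for e in union]

print(union_texts)
```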

  14. Attributes
      • Selecting attributes: prefix attributes with @
      • "all attributes named 'country'": //@country
      • "all cd elements which have an attribute named country": //cd[@country]
      • "all cd elements with attribute named country with value 'UK'": //cd[@country='UK']
      (Example document: the CD catalog from slide 6.)
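A sketch of the attribute patterns with the standard `xml.etree.ElementTree` module. ElementTree supports attribute predicates such as `[@country]` but cannot return attribute nodes themselves, so //@country is approximated here by collecting the attribute values in Python; that workaround and the variable names are assumptions of the example.

```python
import xml.etree.ElementTree as ET

CATALOG = """
<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>
"""

root = ET.fromstring(CATALOG)

# //@country : values of every country attribute, in document order
countries = [e.get("country") for e in root.iter() if "country" in e.attrib]

# //cd[@country] and //cd[@country='UK']
cds_with_country = root.findall(".//cd[@country]")
uk_cds = root.findall(".//cd[@country='UK']")

print(countries)
print(len(cds_with_country), [cd.findtext("title") for cd in uk_cds])
```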

  15. Predicates
      • Predicates, operators, functions as usual
      • "all CDs with price below 10.0": /catalog/cd[ price<10.0 ]
      • "all CDs with country "UK" and price below 10.0": /catalog/cd[ @country="UK" ][ price<10.0 ]
      (Example document: the CD catalog from slide 6.)
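Numeric comparisons such as price<10.0 are not part of the XPath subset in Python's standard `xml.etree.ElementTree`, so this sketch selects the cd elements with a path and applies the comparison in Python. A full XPath engine would evaluate the predicate directly; the filtering approach and the names below are assumptions of the example.

```python
import xml.etree.ElementTree as ET

CATALOG = """
<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>
"""

root = ET.fromstring(CATALOG)

def price_below(cd, limit):
    """True if the cd element's price, read as a number, is below limit."""
    return float(cd.findtext("price")) < limit

# /catalog/cd[price<10.0]
cheap = [cd for cd in root.findall("cd") if price_below(cd, 10.0)]

# /catalog/cd[@country="UK"][price<10.0]
cheap_uk = [cd for cd in root.findall("cd[@country='UK']") if price_below(cd, 10.0)]

print([cd.findtext("title") for cd in cheap])
print([cd.findtext("title") for cd in cheap_uk])
```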

  16. Roadmap
      • XPath
      • XQuery
