Symmetrically Exploiting XML
Shuohao Zhang and Curtis Dyreson
School of E.E. and Computer Science Washington State University Pullman, Washington, USA
The 15th International World Wide Web Conference May 2006 Edinburgh, Scotland
1970s
Symmetrically Exploiting XML: Zhang, Dyreson
part/project works on some, but not all
Find books by E. F. Codd
return doc("author.xml")//author[name= 'E. F. Codd']/book
[Figure: author.xml drawn as a tree. Node labels: author, name, book, title, publisher, price. Values: Addison Wesley, Academic Press, DB, 46.95, Automata, 9.99]
Find books by E. F. Codd
return doc("book.xml")//book[author/name='E. F. Codd']
[Figure: book.xml drawn as a tree. Node labels: book, title, author, name, publisher, price. Values: Addison Wesley, Academic Press, DB, 46.95, Automata, 9.99, Codd]
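The two queries above differ only because the documents nest author and book in opposite orders. A minimal Python sketch of that shape dependence follows; the two small documents, the enclosing data element, and the pairing of titles with prices are assumptions made for illustration, not the slides' exact data, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

NAME = "E. F. Codd"

# Hypothetical reconstructions of the two shapes (assumed nesting and values).
author_root = ET.fromstring(
    "<data><author><name>E. F. Codd</name>"
    "<book><title>DB</title><price>46.95</price></book>"
    "<book><title>Automata</title><price>9.99</price></book>"
    "</author></data>")
book_root = ET.fromstring(
    "<data>"
    "<book><title>DB</title><price>46.95</price>"
    "<author><name>E. F. Codd</name></author></book>"
    "<book><title>Automata</title><price>9.99</price>"
    "<author><name>E. F. Codd</name></author></book>"
    "</data>")

def books_via_author(root):
    """Mirrors //author[name='E. F. Codd']/book: books are children of authors."""
    return [b.findtext("title")
            for a in root.iter("author") if a.findtext("name") == NAME
            for b in a.findall("book")]

def books_via_book(root):
    """Mirrors //book[author/name='E. F. Codd']: authors are children of books."""
    return [b.findtext("title")
            for b in root.iter("book") if b.findtext("author/name") == NAME]

# Each query only works on the shape it was written for.
print(books_via_author(author_root), books_via_author(book_root))
print(books_via_book(book_root), books_via_book(author_root))
```

Each helper finds both books on its own shape and nothing on the other, which is the problem the rest of the talk addresses.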
lack of schema knowledge
heterogeneous data
irregular data
schema evolution
closest::
a function that takes a context node and returns a sequence of
Returns a list of five nodes
Returns the first price node
The minimal distance between a title and a price is 2
Returns an empty list
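One reading of the two statements above, sketched in Python: a node is returned by closest only if its distance from the context node equals the minimal distance between any node with the context's label and any node with the requested label. This reading, the use of plain element names as "types", the small document, and the function names are all assumptions for illustration; the slides do not give the full definition.

```python
import xml.etree.ElementTree as ET
from collections import deque

# Hypothetical document: the second book has a title but no price.
DOC = ET.fromstring(
    "<author><name>E. F. Codd</name>"
    "<book><title>DB</title><price>46.95</price></book>"
    "<book><title>Automata</title></book>"
    "</author>")

def distances_from(root, start):
    """Breadth-first search over parent/child edges; returns {node: edge count}."""
    nbrs = {n: [] for n in root.iter()}
    for parent in root.iter():
        for child in parent:
            nbrs[parent].append(child)
            nbrs[child].append(parent)
    dist, queue = {start: 0}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in nbrs[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def label_distance(root, a, b):
    """Minimal distance between any node labelled a and any node labelled b."""
    best = None
    for x in root.iter(a):
        d = distances_from(root, x)
        for y in root.iter(b):
            if y is not x and (best is None or d[y] < best):
                best = d[y]
    return best

def closest(root, context, label):
    """Nodes labelled `label` whose distance from context equals the label minimum."""
    best = label_distance(root, context.tag, label)
    d = distances_from(root, context)
    return [y for y in root.iter(label) if y is not context and d[y] == best]

first_title, second_title = DOC.findall(".//title")
print(label_distance(DOC, "title", "price"))
print([p.text for p in closest(DOC, first_title, "price")])
print(closest(DOC, second_title, "price"))
```

Under this reading the second title gets an empty list: the only price is four edges away, which exceeds the title-to-price minimum of 2. The sketch runs one search per candidate node and is meant to show the semantics, not an efficient evaluation.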
author/name
author/book/publisher/name
Query
return doc("any.xml")->author[->name='E. F. Codd']->book
[Figure labels: Query; Result#1, Result#2, Result#3]
Query#1 -- return doc("author.xml")//author[name= 'E. F. Codd']/book
Query#2 -- ……
Query#3 -- return doc("book.xml")//book[author/name='E. F. Codd']
[Figure labels: Result#1, Result#2, Result#3]
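To illustrate how one closest-based query could replace the shape-specific ones, the sketch below evaluates the steps of ->author[->name='E. F. Codd']->book in Python over two differently shaped documents. It assumes closest returns nodes at the minimal label-to-label distance and treats element names as types; the documents, the extra author "A. N. Other", and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import deque

# Hypothetical documents holding the same facts in two shapes.
AUTHOR_SHAPE = ET.fromstring(
    "<data>"
    "<author><name>E. F. Codd</name>"
    "<book><title>DB</title></book><book><title>Automata</title></book></author>"
    "<author><name>A. N. Other</name><book><title>Misc</title></book></author>"
    "</data>")
BOOK_SHAPE = ET.fromstring(
    "<data>"
    "<book><title>DB</title><author><name>E. F. Codd</name></author></book>"
    "<book><title>Automata</title><author><name>E. F. Codd</name></author></book>"
    "<book><title>Misc</title><author><name>A. N. Other</name></author></book>"
    "</data>")

def distances(root, start):
    """Edge counts from start to every node, following parent/child edges."""
    nbrs = {n: [] for n in root.iter()}
    for parent in root.iter():
        for child in parent:
            nbrs[parent].append(child)
            nbrs[child].append(parent)
    dist, queue = {start: 0}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in nbrs[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def closest(root, context, label):
    """Assumed semantics: targets at the minimal context-label/target-label distance."""
    targets = list(root.iter(label))
    best = None
    for x in root.iter(context.tag):
        d = distances(root, x)
        for y in targets:
            if y is not x and (best is None or d[y] < best):
                best = d[y]
    d = distances(root, context)
    return [y for y in targets if y is not context and d[y] == best]

def codd_books(root):
    """Evaluates ->author[->name='E. F. Codd']->book step by step."""
    titles = []
    for author in closest(root, root, "author"):
        if any(n.text == "E. F. Codd" for n in closest(root, author, "name")):
            titles += [b.findtext("title") for b in closest(root, author, "book")]
    return titles

print(codd_books(AUTHOR_SHAPE))
print(codd_books(BOOK_SHAPE))
```

The same function returns the same titles for both shapes because each step asks for the nearest node of a given label instead of naming a parent or child direction.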
Compute Closest for every node
Time complexity is O(sn²)
s: number of labels in the signature
n: number of nodes
[Figure: Time (milliseconds) plotted against Number of Nodes; series: descendant, closest]
Every Closest pair is related via an LCA (lowest common ancestor)
Idea is to merge lists of types
O(sn)
[Figure labels: current lca, direction of merge, current parent, current child]
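The role of the LCA can be seen from a standard tree identity: the distance between two nodes is depth(x) + depth(y) - 2 * depth(lca(x, y)). The Python sketch below checks that identity on a small hypothetical document. It is not the O(sn) merge algorithm named on the slide, whose details are not given here, and the function names are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical document for illustration.
DOC = ET.fromstring(
    "<author><name>E. F. Codd</name>"
    "<book><title>DB</title><price>46.95</price></book>"
    "<book><title>Automata</title><price>9.99</price></book>"
    "</author>")

def parents_and_depths(root):
    """Record each node's parent and depth; iter() visits parents before children."""
    parent, depth = {root: None}, {root: 0}
    for p in root.iter():
        for c in p:
            parent[c] = p
            depth[c] = depth[p] + 1
    return parent, depth

def lca(x, y, parent, depth):
    """Lowest common ancestor: lift the deeper node, then climb in lockstep."""
    while depth[x] > depth[y]:
        x = parent[x]
    while depth[y] > depth[x]:
        y = parent[y]
    while x is not y:
        x, y = parent[x], parent[y]
    return x

def distance(x, y, parent, depth):
    """Number of edges on the path from x to y through their LCA."""
    a = lca(x, y, parent, depth)
    return depth[x] + depth[y] - 2 * depth[a]

parent, depth = parents_and_depths(DOC)
title1, title2 = DOC.findall(".//title")
price1, price2 = DOC.findall(".//price")

print(lca(title1, price1, parent, depth).tag, distance(title1, price1, parent, depth))
print(lca(title1, price2, parent, depth).tag, distance(title1, price2, parent, depth))
```

A title and the price in the same book meet at that book (distance 2), while a title and the other book's price meet at the author (distance 4), so the closest partner of a node is determined at the lowest ancestor where the two labels first meet.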
Zhang, Dyreson (IIWeb 2006)
Zhang, Dyreson, Dang (DASFAA 2006)
May break down if structure changes
Simple in syntax
Can be easily integrated in XQuery
Can be implemented efficiently
In-memory
Persistent