Xcerpt and visXcerpt: Integrating Web Querying Sacha Berger - - PowerPoint PPT Presentation



SLIDE 1

Xcerpt and visXcerpt: Integrating Web Querying

http://www.pms.ifi.lmu.de/

Institute for Informatics University of Munich

Sacha Berger François Bry Tim Furche Benedikt Linse Andreas Schroeder

SLIDE 2

Overview Data Patterns Rules

1 Data: Semi-structured Trees & Graphs

Graph data model for Xcerpt and visXcerpt

— as in RDF and semi-structured DBs like Lore
— great attention to XML specificities such as attributes and namespaces

Consistent Extension of XML

— children order may be irrelevant
— possible transparent resolution of non-hierarchical relations
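The data model described above can be sketched in Python. This is only an illustration under stated assumptions, not the Xcerpt implementation: the `Node` class, its field names, and the helpers `index_by_ident`, `resolve`, and `same_children` are hypothetical names chosen for this example, and the sample data is made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A semi-structured data node (hypothetical structure, for illustration)."""
    label: str
    ident: Optional[str] = None      # identifier that other nodes may reference
    attributes: Dict[str, str] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)
    ordered: bool = True             # False: children order is irrelevant
    ref: Optional[str] = None        # non-hierarchical link to another node's ident
    text: Optional[str] = None


def index_by_ident(root: Node) -> Dict[str, Node]:
    """Collect every identified node so that references can be resolved."""
    found: Dict[str, Node] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.ident is not None:
            found[node.ident] = node
        stack.extend(node.children)
    return found


def resolve(node: Node, index: Dict[str, Node]) -> Node:
    """Follow a reference transparently; plain nodes resolve to themselves."""
    return index[node.ref] if node.ref is not None else node


def same_children(a: Node, b: Node) -> bool:
    """Compare child labels, ignoring order unless both lists are ordered."""
    labels_a = [c.label for c in a.children]
    labels_b = [c.label for c in b.children]
    if a.ordered and b.ordered:
        return labels_a == labels_b
    return sorted(labels_a) == sorted(labels_b)


# Made-up sample: an article refers to an author defined elsewhere in the graph.
author = Node("author", ident="a1", text="Marcus Tullius Cicero")
article = Node("article", children=[Node("title"), Node("author", ref="a1")])
bib = Node("bib", ordered=False, children=[author, article])
bib_index = index_by_ident(bib)
```

Resolving `article.children[1]` through `bib_index` yields the shared `author` node, which is how the graph stays a consistent extension of a tree: the hierarchy is kept, and references add the non-hierarchical edges.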

SLIDE 3

Bibliography Entries: DBLP-style

Bibliography Entries

— rather regular schema with optionals
— several ordered lists, otherwise keyed attributes

— Identifier and label of elements
— Context-Menu: Interactive Features
— Folding elements for information focus
— Element nesting (child relation) becomes box nesting and colors
— Non-hierarchical relations as hyperlinks
— Ordered vs. unordered children list

SLIDE 4

[Figure: graph of the SKOS ontology. Concept nodes from the ACM ‘Computing Classification System’ (acm98:CCS; acm98:D, acm98:E, acm98:H and their sub-concepts such as acm98:D_1, acm98:D_4_2, acm98:E_1, acm98:H_2, acm98:H_3_4) are linked by hasTopConcept and narrower edges. Bibliography entries (mybib:journal_adm, mybib:conf_dmc, mybib:article_66_scaurus_qumran, mybib:article_66_wax_cicero, mybib:inproc_44_brutus) carry primarySubject, subject, and related edges. Each node is annotated with a text label, e.g. ‘Wax Tablets’, ‘Papyri’, ‘Data Structures’, ‘Applied Data Management’.]

Topics and Themes: SKOS Ontology

SLIDE 5


2 Patterns: Examples for Selected Data

Query-by-Example paradigm

— queries just like data plus variables, incompleteness, optionality, negation
— patterns plus variables instead of navigation

Logical Variables in Patterns

— select relevant data (n-ary queries)
— group and aggregate data
— join different data items

SLIDE 6

Basic Patterns: Variables and Incompleteness

Basic Pattern

“return the titles of all top-level sections in articles by Marcus Tullius Cicero and published in ‘Applied Data Management’.”

— Accessing Web resources: arbitrary XML documents can be accessed using their URL
— Incomplete patterns in depth: descendant allows additional intermediary elements
— Grouping collects alternative bindings for variables: essential for structural assembly
— Incomplete patterns in breadth: partial patterns allow additional child elements
— Variables are used in lieu of data: express selection, joins, or arithmetic conditions
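The pattern features listed above can be mimicked with a small matcher in Python. This is a sketch under assumptions, not Xcerpt syntax or its evaluation algorithm: the classes `Var`, `Desc`, and `Pat`, the function names, and the document structure of the sample data are all hypothetical. Data terms are either text leaves or `(label, children)` tuples.

```python
from dataclasses import dataclass
from typing import Dict, Iterator


@dataclass(frozen=True)
class Var:
    name: str                 # logical variable, used in lieu of data


@dataclass(frozen=True)
class Desc:
    pattern: object           # incomplete in depth: may match at any depth


@dataclass(frozen=True)
class Pat:
    label: str
    children: tuple
    partial: bool = True      # incomplete in breadth: extra children allowed


def match(pattern, data, b: Dict[str, object]) -> Iterator[Dict[str, object]]:
    """Yield every extension of bindings b under which pattern matches data."""
    if isinstance(pattern, Var):
        if pattern.name not in b:
            yield {**b, pattern.name: data}
        elif b[pattern.name] == data:   # repeated variable must agree (join)
            yield b
    elif isinstance(pattern, Desc):
        yield from match(pattern.pattern, data, b)
        if not isinstance(data, str):
            for child in data[1]:
                yield from match(pattern, child, b)
    elif isinstance(pattern, str):
        if pattern == data:
            yield b
    elif isinstance(pattern, Pat):
        if isinstance(data, str) or data[0] != pattern.label:
            return
        kids = data[1]
        if not pattern.partial and len(kids) != len(pattern.children):
            return
        yield from match_children(pattern.children, kids, frozenset(), b)


def match_children(pats, kids, used, b):
    """Match each sub-pattern against a distinct child, in any order."""
    if not pats:
        yield b
        return
    for i, kid in enumerate(kids):
        if i not in used:
            for b2 in match(pats[0], kid, b):
                yield from match_children(pats[1:], kids, used | {i}, b2)


# Made-up sample data; element names and titles are illustrative only.
bib = ("bib", [
    ("article", [
        ("author", ["Marcus Tullius Cicero"]),
        ("journal", ["Applied Data Management"]),
        ("section", [("title", ["First"]),
                     ("section", [("title", ["Nested"])])]),
        ("section", [("title", ["Second"])]),
    ]),
    ("article", [
        ("author", ["Someone Else"]),
        ("journal", ["Applied Data Management"]),
        ("section", [("title", ["Other"])]),
    ]),
])

query = Desc(Pat("article", (
    Pat("author", ("Marcus Tullius Cicero",)),
    Pat("journal", ("Applied Data Management",)),
    Pat("section", (Pat("title", (Var("T"),)),)),
)))

# Grouping: collect all alternative bindings of T.
titles = sorted(b["T"] for b in match(query, bib, {}))
```

Because `section` is matched as a direct child of `article`, only top-level section titles are bound to `T`; the nested section is skipped even though the pattern is partial in breadth.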

SLIDE 7

Complex Patterns: Formulas, Join, Optionality

Complex Pattern

“return titles and optionally paragraphs of all top-level sections without figures in articles on the topic ‘Wax Tablets’.”

— Terms as formulas: Terms may contain boolean connectives, variables, negation, etc.
— Subterm negation: Some subterms may be required not to occur in matching data
— Optional subterms: Local form of disjunction essential for variable schema data
— Value Joins: Expressed through multiple variable occurrences
— Optional construction: Limited form of conditional construction based on variable bindings
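The semantics of the complex pattern can be approximated procedurally in Python. Xcerpt states such a query declaratively in a single pattern; the loop below only mimics the intended result. The record layout, the field names, and the function `sections_on` are assumptions made for this sketch, and the sample data is made up.

```python
from typing import Dict, List

# Made-up sample data; field names are assumptions for illustration.
concepts = [
    {"id": "c1", "label": "Wax Tablets"},
    {"id": "c2", "label": "Papyri"},
]
articles = [
    {"subject": "c1", "sections": [
        {"title": "A", "paragraphs": ["p1", "p2"]},
        {"title": "B", "figure": "f1", "paragraphs": ["p3"]},
        {"title": "C"},
    ]},
    {"subject": "c2", "sections": [{"title": "D"}]},
]


def sections_on(topic: str) -> List[Dict[str, object]]:
    """Titles, and paragraphs where present, of figure-free sections on a topic."""
    results: List[Dict[str, object]] = []
    for concept in concepts:
        if concept["label"] != topic:
            continue
        for article in articles:
            # Value join: the same value must occur in both data items.
            if article["subject"] != concept["id"]:
                continue
            for section in article["sections"]:
                # Subterm negation: a figure must not occur in the section.
                if "figure" in section:
                    continue
                answer: Dict[str, object] = {"title": section["title"]}
                # Optional subterm: paragraphs are bound only when present.
                if "paragraphs" in section:
                    # Optional construction: emit this part only if bound.
                    answer["paragraphs"] = list(section["paragraphs"])
                results.append(answer)
    return results


wax_sections = sections_on("Wax Tablets")
```

Section `B` is dropped by the negation, while section `C` still appears in the result without a `paragraphs` entry, which is the point of optional subterms on data with a variable schema.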

SLIDE 8

3 Rules: Separation of Concerns by Views

Separation of Query and Construction

— two separate parts in rules
— no mixing of construction and querying
— instead chaining where necessary

Separation of Concerns by Views

— separate tasks of a query in rules
— efficient evaluation of chained queries
— memoization and unfolding
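One way to picture views that chain and are memoized is with cached Python functions. This is a loose analogy, not a description of how Xcerpt evaluates rules, and it does not cover unfolding. The function names `titles_by` and `title_count` and the sample data are hypothetical.

```python
from functools import lru_cache
from typing import Tuple

# Made-up base data: (author, title) pairs.
ARTICLES: Tuple[Tuple[str, str], ...] = (
    ("author-a", "title-1"),
    ("author-a", "title-2"),
    ("author-b", "title-3"),
)


@lru_cache(maxsize=None)
def titles_by(author: str) -> Tuple[str, ...]:
    """View 1: queries the base data."""
    return tuple(title for a, title in ARTICLES if a == author)


@lru_cache(maxsize=None)
def title_count(author: str) -> int:
    """View 2: chained on view 1 instead of querying the base data again."""
    return len(titles_by(author))


count_a = title_count("author-a")
```

Each view handles one task, and a later call to `titles_by("author-a")` is answered from the cache rather than recomputed.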

SLIDE 9

Rules: Inference, Views, and Chaining

Rules and Chaining

“close the skos:related relation on the provided data by adding skos:subject and traversing the closure of skos:narrower”

— Terms as formulas: Terms may contain boolean connectives, including disjunctions
— Rules separate construction from querying and allow for procedural abstraction in query programs
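The closure task quoted above can be sketched as forward chaining over sets of pairs in Python. The slide does not spell out the rules, so this follows one possible reading, stated here as an assumption: two entries become related when the subject of one equals, or is reachable via narrower from, the subject of the other. The function names and the sample identifiers are hypothetical.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def transitive_closure(edges: Set[Pair]) -> Set[Pair]:
    """Forward-chain the rule closure(a, c) <- closure(a, b), edge(b, c)."""
    closure = set(edges)
    while True:
        derived = {(a, d) for (a, b) in closure for (c, d) in edges if b == c}
        if derived <= closure:      # fixpoint reached: nothing new derived
            return closure
        closure |= derived


def close_related(related: Set[Pair], subject: Set[Pair],
                  narrower: Set[Pair]) -> Set[Pair]:
    """Extend related using subject and the closure of narrower (assumed reading)."""
    below = transitive_closure(narrower)
    # related is treated as symmetric, as in SKOS.
    result = set(related) | {(y, x) for (x, y) in related}
    for (x, c) in subject:
        for (y, d) in subject:
            if x != y and (c == d or (c, d) in below):
                result.add((x, y))
                result.add((y, x))
    return result


# Made-up sample data with neutral identifiers.
narrower = {("c", "c_1"), ("c_1", "c_1_1"), ("c", "c_2")}
subject = {("doc1", "c"), ("doc2", "c_1_1"), ("doc3", "c_2")}
related = {("doc3", "doc4")}

closed = close_related(related, subject, narrower)
```

The first function plays the role of a recursive rule, and the second chains on its result, which mirrors the separation of querying from construction: each step consumes complete results of the previous one.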
