YARR! Yet Another Rewriting Reasoner, Jörg Schönfisch, ORE Workshop



SLIDE 1

YARR! Yet Another Rewriting Reasoner

Jörg Schönfisch
ORE Workshop 2013, Ulm, Germany, 22.07.2013

SLIDE 2

Agenda

Introduction | Query Rewriting | Architecture | Performance | Current Development

  • Introduction
  • Query Rewriting
  • Architecture & Implementation
  • Performance
  • Current development

SLIDE 3

Introduction

Motivation for implementing YARR

We needed a reasoner for closed-source customer projects.

Requirements:
  • Stable implementation
  • Liberal licensing
  • Reuse of existing architecture
  • Suitable for a collaborative and concurrent deployment scenario

SLIDE 4

Introduction

YARR’s Features

  • Query answering through query rewriting (Presto algorithm)
  • Support for the OWL 2 QL profile
  • Persistence in a relational database management system (RDBMS)
  • SPARQL endpoint
  • Developed as part of a larger platform

SLIDE 5

Introduction

YARR’s Limitations

  • SPARQL 1.1 support is limited to SELECT and ASK queries
      • UPDATE, CONSTRUCT and DESCRIBE are not supported
      • Property paths (especially negation of properties) are not implemented
  • Some built-in functions are not implemented yet
      • abs, ceil, floor, …
      • Hashing functions
      • Some string operations
      • …

SLIDE 6

Introduction

OWL 2 QL Profile

  • Enables conjunctive queries to be answered in LogSpace (with respect to the size of the data) using standard relational database technology
  • Suitable for applications where relatively lightweight ontologies are used to organize large numbers of individuals and where it is useful or necessary to access the data directly via relational queries (e.g. SQL)
  • Limitations:
      • No cardinality restrictions
      • No (disjoint) unions
      • No transitive properties
      • …
  • Algorithms and prototypes for OWL 2 QL reasoning exist, but they are not easily available for commercial use in closed-source products and customer projects

SLIDE 7

Query Rewriting

  • Getting complete and sound answers from the database requires rewriting the query so that it also fetches knowledge that is only implicitly defined by the TBox and the ABox
  • Query rewriting expands the query so that implicit facts are also retrieved from the ABox

Features of query rewriting:
  • The ABox is not involved in the reasoning step during query answering
  • Changes in the knowledge base are instantly reflected in query results
  • Changes create no overhead (e.g. removal of old inferences)
  • Only read access to the data is needed
  • No preprocessing of the data is required
  • Trade-off: the knowledge base needs less storage space, but queries become more complex
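The features listed above can be illustrated with a minimal sketch. It is not YARR's implementation and not the Presto algorithm: it only handles inverse-property axioms, and the names `INVERSES`, `rewrite_atom`, `answer` and the individuals (`event1`, `athleteA`, …) are hypothetical. It shows that the rewriting step looks only at the TBox, and that a fact added to the ABox is visible to the very next query without any preprocessing.

```python
# Hypothetical mini TBox: each property maps to its declared inverse.
INVERSES = {
    "hasGoldWinner": "wonGold",
    "wonGold": "hasGoldWinner",
}


def rewrite_atom(prop, subj, obj):
    """Expand one atom into all atoms that imply it under INVERSES.

    Only the TBox (INVERSES) is consulted here; the ABox is never touched.
    """
    alternatives = [(prop, subj, obj)]
    if prop in INVERSES:
        # p(s, o) is also implied by inverse(p)(o, s).
        alternatives.append((INVERSES[prop], obj, subj))
    return alternatives


def answer(abox, prop):
    """Return every (subject, object) pair for prop, explicit or implicit."""
    results = set()
    for p, first, second in rewrite_atom(prop, "s", "o"):
        for fact_prop, a, b in abox:
            if fact_prop == p:
                binding = {first: a, second: b}
                results.add((binding["s"], binding["o"]))
    return results


# Hypothetical ABox: one explicit fact per direction.
abox = {
    ("hasGoldWinner", "event1", "athleteA"),
    ("wonGold", "athleteB", "event2"),
}
before = answer(abox, "hasGoldWinner")

# A change to the ABox needs no re-inference; the next query simply sees it.
abox.add(("wonGold", "athleteC", "event3"))
after = answer(abox, "hasGoldWinner")
```

In this sketch `before` contains both the explicit pair and the pair implied by `wonGold`, and `after` additionally contains the pair for `event3`, although nothing was recomputed or stored between the two queries.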

SLIDE 8

Preliminaries

Rewriting Example

SPARQL query:

    SELECT ?athlete ?games
    WHERE {
      ?event hasGoldWinner ?athlete ;
             occurredInGames ?games .
      ?athlete isRepresentativeOf Germany .
    }

(Relevant) TBox axioms:

    InverseProperties(hasGoldWinner,wonGold)
    InverseProperties(isRepresentativeOf,hasRepresentative)

Translation to Datalog:

    q(?athlete,?games) :- hasGoldWinner(?event,?athlete),
                          occurredInGames(?event,?games),
                          isRepresentativeOf(?athlete,Germany)

SLIDE 9

Preliminaries

Rewriting Example

Rewritten Datalog after applying the TBox axioms:

    q(?athlete,?games) :- occurredInGames(?event,?games),
                          view_1(?event,?athlete),
                          view_2(?athlete)
    view_1(?event,?athlete) :- hasGoldWinner(?event,?athlete)
    view_1(?event,?athlete) :- wonGold(?athlete,?event)
    view_2(?athlete) :- isRepresentativeOf(?athlete,Germany)
    view_2(?athlete) :- hasRepresentative(Germany,?athlete)

The translation to SQL results in a query with 24 subselects and 50 join operations in total.
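To make the Datalog-to-SQL step more concrete, here is a rough sketch using Python's built-in `sqlite3` module. It assumes a single hypothetical table `assertion(property, subject, object)` and invented individuals; YARR's real schema is spread over many tables, which is why its generated SQL is far larger than this. Each `view_n` rule pair becomes a `UNION` of two selects, and the query body becomes a join. The result projects `?athlete` and `?games`, as in the SPARQL SELECT clause.

```python
import sqlite3

# Hypothetical single-table store; not YARR's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assertion (property TEXT, subject TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO assertion VALUES (?, ?, ?)",
    [
        ("hasGoldWinner", "event1", "athleteA"),
        ("wonGold", "athleteB", "event2"),           # only the inverse is stored
        ("hasGoldWinner", "event3", "athleteC"),
        ("occurredInGames", "event1", "games1"),
        ("occurredInGames", "event2", "games2"),
        ("occurredInGames", "event3", "games1"),
        ("isRepresentativeOf", "athleteA", "Germany"),
        ("hasRepresentative", "Germany", "athleteB"),  # only the inverse is stored
    ],
)

# Each Datalog view with two rules becomes a UNION of two selects.
REWRITTEN_SQL = """
WITH view_1(event, athlete) AS (
    SELECT subject, object FROM assertion WHERE property = 'hasGoldWinner'
    UNION
    SELECT object, subject FROM assertion WHERE property = 'wonGold'
),
view_2(athlete) AS (
    SELECT subject FROM assertion
     WHERE property = 'isRepresentativeOf' AND object = 'Germany'
    UNION
    SELECT object FROM assertion
     WHERE property = 'hasRepresentative' AND subject = 'Germany'
)
SELECT v1.athlete, g.object AS games
  FROM assertion AS g
  JOIN view_1 AS v1 ON v1.event = g.subject
  JOIN view_2 AS v2 ON v2.athlete = v1.athlete
 WHERE g.property = 'occurredInGames'
 ORDER BY v1.athlete
"""

rows = conn.execute(REWRITTEN_SQL).fetchall()
```

With this sample data, `athleteB` is found even though only `wonGold` and `hasRepresentative` facts are stored for that athlete, while `athleteC` is filtered out because no representation fact exists in either direction.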

SLIDE 10

Preliminaries

Query Rewriting vs Materialization

Alternative approach to rewriting: materialization
  • Everything that can be inferred about the data is stored explicitly beforehand
  • Queries can be executed as-is
  • Changes in the data are more complex because they influence the inferences

Advantages of query rewriting
  • Changes to instances have no impact on the internal state of the reasoner
  • Highly concurrent editing and reasoning
  • Potentially requires less storage space
  • The size of the ABox is irrelevant for the reasoning step
  • Requires no additional steps when the data is loaded

Disadvantages
  • Potentially more complex queries
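For contrast, materialization can be sketched in a few lines. This is an illustration under simplifying assumptions (inverse-property axioms only, hypothetical names `INVERSES` and `materialize`, invented individuals), not a description of any particular reasoner. It shows the two costs named above: inferred facts take up storage, and a deletion leaves a stale inference behind until the store is recomputed.

```python
# Hypothetical mini TBox: each property maps to its declared inverse.
INVERSES = {"hasGoldWinner": "wonGold", "wonGold": "hasGoldWinner"}


def materialize(explicit):
    """Return the explicit facts plus every fact implied by INVERSES."""
    closure = set(explicit)
    for prop, subj, obj in explicit:
        if prop in INVERSES:
            closure.add((INVERSES[prop], obj, subj))
    return closure


explicit = {("hasGoldWinner", "event1", "athleteA")}

# Load time: inferences are computed and stored next to the explicit facts.
store = materialize(explicit)

# Query time: the query runs as-is, no rewriting needed.
won_gold = {(s, o) for p, s, o in store if p == "wonGold"}

# Deleting the explicit fact alone leaves the old inference in the store ...
deleted = ("hasGoldWinner", "event1", "athleteA")
explicit.discard(deleted)
stale_store = store - {deleted}

# ... so the inferences have to be recomputed (or maintained incrementally).
fresh_store = materialize(explicit)
```

Here `store` holds two facts for one explicit assertion, `stale_store` still contains the derived `wonGold` fact after the deletion, and only `fresh_store` is empty again. A rewriting reasoner avoids this maintenance step at the price of a larger query.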

SLIDE 11

Architecture

Reasoner Architecture

[Architecture diagram. Components shown: SPARQL Endpoint, SPARQL to Datalog, Query Rewriting, Datalog to SQL, Editing Services, Persistence Mapper, (any) RDBMS]

SLIDE 12

Architecture

Database Schema

  • Based on the structure of the OWL 2 metamodel
  • Every part of the metamodel is stored in a separate table
  • Facilitates editing of the knowledge base
  • Consists of 43 tables:
      • 6 tables, one for each type of entity
      • 4 tables, one for each type of assertion
      • 2 tables for literals
      • 21 tables for different axioms (domain, range, equivalence, …)
      • 10 auxiliary tables

SLIDE 13

Performance

SP2Bench SPARQL Benchmark

[Chart: query runtime in seconds on a logarithmic axis from 0.001 to 10,000 for SP2Bench queries 1, 2, 3a, 3b, 3c, 4, 5a, 5b, 6, 7, 8, 9, 10, 11, 12a, 12b, 12c. Series: YARR and Stardog, each at 10k, 250k and 5M.]

SLIDE 14

Current development

  • Further support for SPARQL 1.1 queries
      • Negation in property paths
      • Implement more built-in functions
      • Support for CONSTRUCT
  • Performance improvement
      • Caching
      • Optimizations in the SQL translator
      • Analysis of slow queries
  • Implementing adapters for ontology-based data access (OBDA)
      • Possibly supporting R2RML (W3C mapping language from RDF to relational data)

SLIDE 15

Thank You!

SLIDE 16

Performance

SP2Bench SPARQL Benchmark

[Chart: query runtime in seconds on a logarithmic axis from 0.0001 to 10,000 for SP2Bench queries 1, 2, 3a, 3b, 3c, 4, 5a, 5b, 6, 7, 8, 9, 10, 11, 12a, 12b, 12c. Series: LSP and OWLIM, each at 10k, 250k and 5M.]

SLIDE 17

Architecture

[Architecture overview diagram. Labels shown:]
  • (any) RDBMS
  • Persistence Mapper (Hibernate / jOOQ)
  • Editing Services
  • SPARQL Querying Services
  • OWL 2 QL Query Rewriting (Presto Algorithm), Sesame SPARQL Parser
  • Fulltext Search Services
  • Ontology Visualizations
  • Lucene based Search Engine
  • CRUD Operations on RDBMS
  • Faceted Search
  • A-Box Editing
  • T-Box Editing
  • OWL Import & Export (OWL API)
  • Search UI
  • … … …

Notes on the diagram:
  • Tested with Oracle, PostgreSQL, HSQLDB
  • Pure Java application
  • UI built with Vaadin