Scalable End-user Access to Big Data

  1. Scalable End-user Access to Big Data (1 / 12)
     Hellenic Republic, National and Kapodistrian University of Athens

  2. Ontology-based Data Access (2 / 12)
     - Capture end-user vocabulary in an “Ontology”
       - ≈ Domain model
       - Classes and relations known to end-users
       - Some minimal domain knowledge
     - Mappings that relate the Ontology with data sources
       - ‘Column “Type” is “T” in row x of table “Sensors” if sensor Nr. x is a Temperature Sensor’
     - Automatically translate queries in end-user language to queries over data sources
       - In: ‘List all temperature sensors.’
       - Out: ‘Print “Sensor Nr. x” for all rows x in “Sensors” table where “Type” column is “T.”’
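The In/Out example on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the column names `Nr` and `Type`, the class name `TemperatureSensor`, and the `MAPPINGS` dictionary are hypothetical and are not taken from the Optique platform.

```python
import sqlite3

# Hypothetical mapping from an ontology class to SQL over the "Sensors" table.
# The column names Nr and Type are assumptions based on the slide's wording.
MAPPINGS = {
    "TemperatureSensor": "SELECT Nr FROM Sensors WHERE Type = 'T' ORDER BY Nr",
}


def translate(class_name):
    """Translate 'list all instances of class_name' into SQL via the mapping."""
    return MAPPINGS[class_name]


def list_instances(conn, class_name):
    """Run the translated query and print-format the rows as on the slide."""
    rows = conn.execute(translate(class_name)).fetchall()
    return [f"Sensor Nr. {nr}" for (nr,) in rows]


def demo():
    # Toy data source: two temperature sensors (T) and one other sensor (P).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Sensors (Nr INTEGER PRIMARY KEY, Type TEXT)")
    conn.executemany("INSERT INTO Sensors VALUES (?, ?)",
                     [(1, "T"), (2, "P"), (3, "T")])
    try:
        return list_instances(conn, "TemperatureSensor")
    finally:
        conn.close()
```

Calling `demo()` answers ‘List all temperature sensors.’ without the caller ever naming the table or the `Type` column; that knowledge lives only in the mapping.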

  3. OBDA: Example (3 / 12). Based on slides by Ian Horrocks.
     - Query (engineer): Generators with a turbine fault?
     - Data: Generator(g1), hasFault(g1, f1), CondenserFault(f1)

  4. OBDA: Example (3 / 12)
     - Query (engineer): Generators with a turbine fault?
     - Data in words: g1 is a Generator; g1 has fault f1; f1 is a CondenserFault

  6. OBDA: Example (3 / 12)
     - Query (engineer): Generators with a turbine fault?
     - Data in words: g1 is a Generator; g1 has fault f1; f1 is a CondenserFault
     - Answer over the data alone: ∅

  7. OBDA: Example (3 / 12)
     - Query (engineer): Generators with a turbine fault?
     - Data in words: g1 is a Generator; g1 has fault f1; f1 is a CondenserFault
     - Ontology:
       - Condenser ⊑ CoolingDevice ⊓ ∃isPartOf.Turbine
       - CondenserFault ≡ Fault ⊓ ∃affects.Condenser
       - TurbineFault ≡ Fault ⊓ ∃affects.(∃isPartOf.Turbine)

  8. OBDA: Example (3 / 12)
     - Query (engineer): Generators with a turbine fault?
     - Data in words: g1 is a Generator; g1 has fault f1; f1 is a CondenserFault
     - Ontology in words:
       - Condenser is a CoolingDevice that is part of a Turbine
       - Condenser Fault is a Fault that affects a Condenser
       - Turbine Fault is a Fault that affects part of a Turbine

  10. OBDA: Example (3 / 12)
     - Query (engineer): Generators with a turbine fault?
     - Data in words: g1 is a Generator; g1 has fault f1; f1 is a CondenserFault
     - Ontology in words:
       - Condenser is a CoolingDevice that is part of a Turbine
       - Condenser Fault is a Fault that affects a Condenser
       - Turbine Fault is a Fault that affects part of a Turbine
     - Answer with the ontology: g1
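The step from the empty answer to g1 in this example can be reproduced with a small Python sketch. Note the swapped-in technique: the code materialises the consequences of the three axioms by forward chaining over the facts, whereas OBDA systems of the kind described in this deck rewrite the query instead. The function names and the naming scheme for anonymous witnesses (`_c(...)`, `_t(...)`) are hypothetical.

```python
def chase(classes, roles):
    """Apply the three example axioms until nothing new is derived.

    classes: set of (class_name, individual)
    roles:   set of (role_name, subject, object)
    """
    classes, roles = set(classes), set(roles)
    changed = True
    while changed:
        before = (len(classes), len(roles))
        for cls, x in list(classes):
            if cls == "CondenserFault":
                # CondenserFault ≡ Fault ⊓ ∃affects.Condenser (left to right):
                # introduce an anonymous condenser affected by x.
                w = f"_c({x})"
                classes |= {("Fault", x), ("Condenser", w)}
                roles.add(("affects", x, w))
            elif cls == "Condenser":
                # Condenser ⊑ CoolingDevice ⊓ ∃isPartOf.Turbine:
                # introduce an anonymous turbine that x is part of.
                w = f"_t({x})"
                classes |= {("CoolingDevice", x), ("Turbine", w)}
                roles.add(("isPartOf", x, w))
        # TurbineFault ≡ Fault ⊓ ∃affects.(∃isPartOf.Turbine) (right to left).
        part_of_turbine = {s for (r, s, o) in roles
                           if r == "isPartOf" and ("Turbine", o) in classes}
        for r, s, o in list(roles):
            if r == "affects" and o in part_of_turbine and ("Fault", s) in classes:
                classes.add(("TurbineFault", s))
        changed = (len(classes), len(roles)) != before
    return classes, roles


def generators_with_turbine_fault(classes, roles):
    """The engineer's query: generators that have some turbine fault."""
    return sorted(g for (r, g, f) in roles
                  if r == "hasFault"
                  and ("Generator", g) in classes
                  and ("TurbineFault", f) in classes)


# The data from the slide.
DATA_CLASSES = {("Generator", "g1"), ("CondenserFault", "f1")}
DATA_ROLES = {("hasFault", "g1", "f1")}
```

Evaluating `generators_with_turbine_fault(DATA_CLASSES, DATA_ROLES)` directly gives an empty list, because no fact mentions `TurbineFault`. Evaluating it on the result of `chase(DATA_CLASSES, DATA_ROLES)` gives `["g1"]`.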

  11. Unique Combination of Techniques (4 / 12)

  12. Optique Architecture (5 / 12)
     [Architecture diagram. Labels: End-user; IT-expert; Application; Query Formulation; Ontology & Mapping Management; Ontology; Mappings; Data models; Std. ontologies …; Query Transformation; Query Planning; Stream Adapter; Query Execution; query; results; streaming data; cross-component optimization]

  13. Integrated Platform (6 / 12)
     [Architecture diagram. Labels recovered from the slide:]
     - Client Tier: Query Formulation rich interface; Ontology and Mapping Management rich interface; answer visualisation; visualisation engines; stream analytics; mining; log analyses, etc.
     - Application Tier:
       - Query Formulation processing components: Query by Navigation; Context Sens. Ed.; Direct Ed.; Faceted Search; QDriven ont construction; Export funct.; Feedback funct.; 1-time Q and Stream Q
       - Ontology and Mapping Manager processing components: Bootstrapper; Analyser; Evolution Engine; Transformator; Approximator; Revision control & Editing; Ontology reasoner 1; Ontology reasoner 2
       - Shared triple store: ontology, mappings, configuration, queries, answers, history, etc.
       - Query Answering Component:
         - Query transformation: Query Rewriting; Semantic QOpt; Syntactic QOpt; Sem indexing
         - Distributed Query Execution: Answ Manager; Q Planner; Optimization; Query Execution; Data Federation
         - Shared database
       - Queries exchanged between components: 1-time Q and Stream Q (SPARQL, SQL)
     - Data Tier: RDBs, triple stores, temporal DBs, data streams, etc.; Cloud (virtual resource pool)

  14. The Query Formulation Interface (7 / 12)
     - Let users formulate ad-hoc queries
       - filtering on attributes
       - connecting objects
       - selecting what information to extract
       - choosing types (Facility → FixedFacility | MovableFacility)
     - Until end of year:
       - specify time ranges
       - choose entities (licenses, fields, etc.) from map
     - Later:
       - aggregation: sums, averages, etc.
       - negation (“all turbines without a fault”)
     - Intentionally restricted expressivity
       - As powerful as SQL → as hard to learn
     - Demo
       - Data from NPD FactPages ( http://factpages.npd.no/ )

  15. Ontology & Mapping Management (8 / 12)
     - OBDA relies on Ontology and Mappings
     - Tool support to create and maintain O&M
     - Results so far: Bootstrapping components
     - Coming up: tool support for O&M QC and evolution
       - when ontology changes
       - when data sources change
     [Diagram. Labels: Database; Direct Mapping; DM Ontology; DM Mappings; Alignment; HQ Ontology; Ontology]
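As a rough idea of what bootstrapping from a database via a direct mapping could look like, the Python sketch below derives one class per table and one property per column from a SQLite schema. It is loosely in the spirit of the W3C Direct Mapping and is not the Optique bootstrapper; the `Table#column` naming and the use of `rowid` as row identifier are assumptions made for the illustration.

```python
import sqlite3


def bootstrap(conn):
    """Derive (classes, properties, mappings) from a relational schema.

    Each table becomes a class and each column a property; every generated
    term gets a SQL mapping back to the table it came from.
    """
    classes, properties, mappings = [], [], {}
    tables = [name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        classes.append(table)
        mappings[table] = f'SELECT rowid FROM "{table}"'
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, column, *_rest in conn.execute(f'PRAGMA table_info("{table}")'):
            prop = f"{table}#{column}"
            properties.append(prop)
            mappings[prop] = f'SELECT rowid, "{column}" FROM "{table}"'
    return classes, properties, mappings


def demo():
    # Hypothetical one-table source, reusing the Sensors example.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Sensors (Nr INTEGER, Type TEXT)")
    try:
        return bootstrap(conn)
    finally:
        conn.close()
```

A bootstrapped ontology of this kind mirrors the schema one to one, which is why the slide's diagram shows a separate alignment step towards a higher-quality ontology.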

  16. Time & Streams (9 / 12)
     - Query processing extended for stream queries (STARQL)
       - combined queries on real-time and historical data
       - rewrite queries over temporal data
       - execution with streaming answers in ADP ( → slide 11)
     - Coming up: integration with platform architecture
       - register/unregister queries
       - stream answers
       - (also useful for one-shot queries)

  17. Query Transformation (10 / 12)
     - Based on open source -ontop- system
       - Query rewriting for OWL 2 QL ontologies
       - Covers almost all of standard SPARQL query language
     - Now testing on real queries from Statoil on EPDS
       - Efficiency problems with some rewritten queries
       - Targeted optimisation based on use-case requirements
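To give a feel for query rewriting, here is a deliberately small Python sketch that handles only subclass axioms: a query for a class is rewritten into a SQL UNION over the mappings of that class and all of its subclasses. It does not use or imitate the -ontop- API, and real OWL 2 QL rewriting covers much more. The classes `Sensor` and `PressureSensor`, the type code `'P'`, and the column names are hypothetical.

```python
import sqlite3

# Hypothetical subclass axioms (sub, super) and class-to-SQL mappings.
SUBCLASS_OF = [("TemperatureSensor", "Sensor"), ("PressureSensor", "Sensor")]
MAPPINGS = {
    "TemperatureSensor": "SELECT Nr FROM Sensors WHERE Type = 'T'",
    "PressureSensor": "SELECT Nr FROM Sensors WHERE Type = 'P'",
}


def subsumees(cls, subclass_of):
    """All classes whose instances are also instances of cls, including cls."""
    result, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            if sup == current and sub not in result:
                result.add(sub)
                frontier.append(sub)
    return result


def rewrite(cls, subclass_of, mappings):
    """Rewrite 'all instances of cls' into one SQL query over the sources."""
    parts = [mappings[c] for c in sorted(subsumees(cls, subclass_of))
             if c in mappings]
    return "\nUNION\n".join(parts)


def answer_sensor_query():
    # Toy source; the row with type 'V' matches no mapped class.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Sensors (Nr INTEGER, Type TEXT)")
    conn.executemany("INSERT INTO Sensors VALUES (?, ?)",
                     [(1, "T"), (2, "P"), (3, "T"), (4, "V")])
    try:
        sql = rewrite("Sensor", SUBCLASS_OF, MAPPINGS)
        return sorted(nr for (nr,) in conn.execute(sql))
    finally:
        conn.close()
```

`Sensor` has no mapping of its own, yet the rewritten query still returns sensors 1, 2 and 3. The number of UNION branches grows with the class hierarchy, which hints at why some rewritten queries run into the efficiency problems mentioned on the slide.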

  18. Distributed Query Execution (11 / 12)
     - Query Execution (“backend”)
     - Based on ADP – Athena Distributed Processing
       - Cutting edge parallelised database engine
       - Optimisation w.r.t. many dimensions
       - “Hadoop for Databases”
     - For Optique:
       - stream processing
       - federation (one query, many sources)
       - parallelisation (elastic clouds)
     - Cross-component optimisation of query processing
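The phrase “one query, many sources” can be illustrated with a toy Python sketch that sends the same SQL to several independent sources in parallel and merges the partial answers. This is not ADP and says nothing about how ADP plans or optimises queries; the in-memory SQLite sources, the `make_source` helper, and the table layout are assumptions for the illustration.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_source(rows):
    """Return a callable that simulates one remote source holding `rows`."""
    def run(sql):
        # Each call builds its own in-memory database inside the worker
        # thread, so no SQLite connection is shared across threads.
        conn = sqlite3.connect(":memory:")
        try:
            conn.execute("CREATE TABLE Sensors (Nr INTEGER, Type TEXT)")
            conn.executemany("INSERT INTO Sensors VALUES (?, ?)", rows)
            return conn.execute(sql).fetchall()
        finally:
            conn.close()
    return run


def federated_query(sql, sources):
    """Run the same query on every source in parallel and merge the answers."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        partials = list(pool.map(lambda source: source(sql), sources))
    return sorted(row for part in partials for row in part)


# Two hypothetical sources with disjoint sensor numbers.
SOURCES = [
    make_source([(1, "T"), (2, "P")]),
    make_source([(7, "T")]),
]
```

`federated_query("SELECT Nr FROM Sensors WHERE Type = 'T'", SOURCES)` collects the temperature sensors from both sources in one call.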

  19. www.optique-project.eu
