Scalable End-user Access to Big Data


slide-1
SLIDE 1

Scalable End-user Access to Big Data

HELLENIC REPUBLIC

National and Kapodistrian University of Athens

1 / 14

slide-2
SLIDE 2

The Problem of Data Access

[Diagram: Engineer → Application (predefined queries); Application → Engineer (answers)]

2 / 14

slide-3
SLIDE 3

When does this Go Wrong?

“I need to find all rock samples where my Company had at least a 30% share of the licence at the time the sample was taken. I’m sure the information is there but there are so many concepts involved that I can’t find it in the application.”

“I need all wellbores with a pore pressure of over 14ppg, but lower than 12ppg further down the hole. I can’t say this to the application.”

“I need to find all rock samples for this oil field, including the ones in this Excel sheet from Dinoco. The application doesn’t know about this data.”

3 / 14

slide-4
SLIDE 4

What then happens?

  • Where is this information stored, and what is it called?
  • Can you hand-craft a query for my information need?
  • Can you include data from this spreadsheet in the db?

May take weeks to respond

Takes several years to master data stores and user needs

30–70% of domain expert time spent looking for and assessing the quality of the data found

4 / 14

slide-6
SLIDE 6

The Problem of Data Access

[Diagram: Engineer → Application (predefined queries); Application → Engineer (answers)]

5 / 14

slide-7
SLIDE 7

The Problem of Data Access

[Diagram: Engineer → IT-expert (information need); IT-expert → Application (specialised query); answers are returned]

5 / 14

slide-8
SLIDE 8

Data Access, with a Data Warehouse

[Diagram: Engineer, Application and Data Warehouse, connected by the labels: queries, answers, ETL, ETL]

6 / 14

slide-9
SLIDE 9

Data Access: The Optique Solution

Ontology-based

[Diagram: Engineer, Application; labels: query, translated query, answers]

7 / 14

slide-10
SLIDE 10

Data Access: The Optique Solution

Ontology-based

[Diagram: Engineer, Application; labels: query, translated query, answers; additional inputs: Ontology, Mappings]

7 / 14

slide-11
SLIDE 11

Optique Architecture

[Architecture diagram. Labels, in reading order:]

  • Users: End-user, IT-expert
  • Inputs: Data models, Std. ontologies, …
  • Components: Visualisation & Analysis; Query Formulation; Ontology & Mapping Management; Ontology; Mappings; Queries; Query Transformation; Query Planning; Query Execution (several instances)
  • Data: streaming data, temporal data, static data, central repository
  • Output: results

8 / 14

slide-12
SLIDE 12

OBDA: Example engineer

Generators with a turbine fault?

Based on slides by Ian Horrocks

Generator(g1), hasFault(g1, f1), CondenserFault(f1)

9 / 14

slide-13
SLIDE 13

OBDA: Example engineer

Generators with a turbine fault?

Based on slides by Ian Horrocks

g1 is a Generator; g1 has fault f1; f1 is a CondenserFault

9 / 14

slide-15
SLIDE 15

OBDA: Example engineer

Generators with a turbine fault?

Based on slides by Ian Horrocks

g1 is a Generator; g1 has fault f1; f1 is a CondenserFault

∅

9 / 14

slide-16
SLIDE 16

OBDA: Example engineer

Generators with a turbine fault?

Based on slides by Ian Horrocks

g1 is a Generator; g1 has fault f1; f1 is a CondenserFault

Condenser ⊑ CoolingDevice ⊓ ∃isPartOf.Turbine
CondenserFault ≡ Fault ⊓ ∃affects.Condenser
TurbineFault ≡ Fault ⊓ ∃affects.(∃isPartOf.Turbine)

9 / 14

slide-17
SLIDE 17

OBDA: Example engineer

Generators with a turbine fault?

Based on slides by Ian Horrocks

g1 is a Generator; g1 has fault f1; f1 is a CondenserFault

Condenser is a CoolingDevice that is part of a Turbine
Condenser Fault is a Fault that affects a Condenser
Turbine Fault is a Fault that affects part of a Turbine

9 / 14

slide-19
SLIDE 19

OBDA: Example engineer

Generators with a turbine fault?

Based on slides by Ian Horrocks

g1 is a Generator; g1 has fault f1; f1 is a CondenserFault

Condenser is a CoolingDevice that is part of a Turbine
Condenser Fault is a Fault that affects a Condenser
Turbine Fault is a Fault that affects part of a Turbine

g1

9 / 14
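
The step from ∅ to the answer g1 rests on one entailment: every CondenserFault is a TurbineFault. The slides do not spell this out, so the following is a short derivation sketch from the three axioms of the example, using the fact that conjunction can be weakened and that existential restrictions are monotone in their filler:

```latex
\begin{align*}
\mathit{Condenser}
  &\sqsubseteq \mathit{CoolingDevice} \sqcap \exists \mathit{isPartOf}.\mathit{Turbine}
   \sqsubseteq \exists \mathit{isPartOf}.\mathit{Turbine} \\
\exists \mathit{affects}.\mathit{Condenser}
  &\sqsubseteq \exists \mathit{affects}.(\exists \mathit{isPartOf}.\mathit{Turbine}) \\
\mathit{CondenserFault}
  &\equiv \mathit{Fault} \sqcap \exists \mathit{affects}.\mathit{Condenser} \\
  &\sqsubseteq \mathit{Fault} \sqcap \exists \mathit{affects}.(\exists \mathit{isPartOf}.\mathit{Turbine})
   \equiv \mathit{TurbineFault}
\end{align*}
```

Since f1 is a CondenserFault, it is therefore also a TurbineFault, and g1, which has fault f1, is a generator with a turbine fault.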

slide-20
SLIDE 20

Query Rewriting

Given:

  • T (Terminology): the ontology, domain model
  • A (Assertions): the database
  • q: a query

we want ans(q, (T, A)), the answers of the query given the knowledge in T and A.

  • In general expensive to compute
  • In certain cases possible by rewriting: q′ := rewrite(q, T) such that ans(q′, (∅, A)) = ans(q, (T, A))
  • Query answering with empty ontology is cheap (same as SQL)

10 / 14
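
To make "same as SQL" concrete, here is a minimal sketch that stores the assertions of the generator example in an in-memory SQLite database and evaluates both the original query and a rewritten one as plain SQL. The table and column names (`generator`, `has_fault`, `condenser_fault`, `turbine_fault`) are assumptions made for illustration, not the Optique schema, and the rewritten query contains only the one extra disjunct needed for this data.

```python
import sqlite3


def build_abox() -> sqlite3.Connection:
    """Create an in-memory database holding the assertions A of the example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE generator (id TEXT);
        CREATE TABLE has_fault (generator_id TEXT, fault_id TEXT);
        CREATE TABLE condenser_fault (id TEXT);
        CREATE TABLE turbine_fault (id TEXT);  -- empty: no fault is stored as a TurbineFault
        INSERT INTO generator VALUES ('g1');
        INSERT INTO has_fault VALUES ('g1', 'f1');
        INSERT INTO condenser_fault VALUES ('f1');
        """
    )
    return conn


# q: generators with a turbine fault, evaluated directly on the data (empty ontology).
ORIGINAL_QUERY = """
    SELECT g.id, h.fault_id
    FROM generator g
    JOIN has_fault h ON h.generator_id = g.id
    JOIN turbine_fault t ON t.id = h.fault_id
"""

# q': q plus the disjunct obtained by replacing TurbineFault with CondenserFault.
REWRITTEN_QUERY = ORIGINAL_QUERY + """
    UNION
    SELECT g.id, h.fault_id
    FROM generator g
    JOIN has_fault h ON h.generator_id = g.id
    JOIN condenser_fault c ON c.id = h.fault_id
"""


def answers(conn: sqlite3.Connection, sql: str) -> set:
    """Return the answer tuples of an SQL query as a set."""
    return set(conn.execute(sql).fetchall())


if __name__ == "__main__":
    conn = build_abox()
    print("q  over A:", answers(conn, ORIGINAL_QUERY))
    print("q' over A:", answers(conn, REWRITTEN_QUERY))
```

The original query finds nothing because no row says that f1 is a turbine fault; the rewritten query carries that knowledge in its second disjunct, so an ordinary SQL engine can answer it without any reasoning at query time.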

slide-21
SLIDE 21

Rewriting Example

A:
Generator(g1), hasFault(g1, f1), CondenserFault(f1)

T:
Condenser ⊑ CoolingDevice ⊓ ∃isPartOf.Turbine
CondenserFault ≡ Fault ⊓ ∃affects.Condenser
TurbineFault ≡ Fault ⊓ ∃affects.(∃isPartOf.Turbine)

q = Generator(g) ∧ hasFault(g, f) ∧ TurbineFault(f)

Rewrite with T:
rewrite(q, T) = q′ = Generator(g) ∧ hasFault(g, f) ∧ CondenserFault(f) ∨ ···

Answers from q′:
ans(q′, (∅, A)) = {⟨g1, f1⟩}

11 / 14
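
The rewriting step itself can be illustrated with a deliberately simplified sketch. It assumes the relevant consequences of T have already been computed into a table of concept subsumptions (here only CondenserFault ⊑ TurbineFault) and expands each unary atom of a conjunctive query into its alternatives. The names `SUBSUMED_BY` and `rewrite` are hypothetical; rewriting as studied in the OBDA literature also has to deal with roles and existential axioms, which this sketch ignores.

```python
from itertools import product
from typing import Dict, List, Tuple

# An atom is a predicate name plus its argument variables, e.g. ("hasFault", ("g", "f")).
Atom = Tuple[str, Tuple[str, ...]]
ConjunctiveQuery = List[Atom]

# Assumption: concept subsumptions entailed by T, precomputed by some reasoner.
# Read as: every CondenserFault is a TurbineFault.
SUBSUMED_BY: Dict[str, List[str]] = {"TurbineFault": ["CondenserFault"]}


def rewrite(query: ConjunctiveQuery,
            subsumed_by: Dict[str, List[str]]) -> List[ConjunctiveQuery]:
    """Expand a conjunctive query into a union of conjunctive queries.

    Each unary atom C(x) may be replaced by D(x) for any D listed under C;
    binary atoms are left unchanged in this simplified version.
    """
    alternatives = []
    for predicate, args in query:
        names = [predicate]
        if len(args) == 1:
            names += subsumed_by.get(predicate, [])
        alternatives.append([(name, args) for name in names])
    # One disjunct per combination of choices, original query first.
    return [list(combo) for combo in product(*alternatives)]


def show(query: ConjunctiveQuery) -> str:
    """Render a conjunctive query in the notation used on the slides."""
    return " ∧ ".join(f"{p}({', '.join(args)})" for p, args in query)


if __name__ == "__main__":
    q = [("Generator", ("g",)), ("hasFault", ("g", "f")), ("TurbineFault", ("f",))]
    print(" ∨ ".join(show(disjunct) for disjunct in rewrite(q, SUBSUMED_BY)))
```

The result is a union of conjunctive queries that no longer needs T: the first disjunct is q itself, the second replaces TurbineFault(f) with CondenserFault(f), which is the disjunct that matches the assertions in A.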

slide-22
SLIDE 22

That Sounds Simple?

OBDA is well researched, with many publications in the last 10 years. So why a 4-year EU research project? Some important bits are missing:

Usability

  • How do end-users formulate queries? In first-order logic?
  • Need a user interface for ‘query formulation’
  • Ontology management
  • Mapping management, analysis and evolution

Scope

  • What about queries with time? Or geology? Or chemistry?
  • Need to extend bare-bones query rewriting

Efficiency

  • SQL databases are not good at queries from OBDA
  • Big Data is maybe not best stored in an SQL database
  • Optimize rewritten queries and storage layer

12 / 14

slide-23
SLIDE 23

www.optique-project.eu