Continuous queries Daniele DellAglio dellaglio@ifi.uzh.ch - - PowerPoint PPT Presentation



SLIDE 1

How to Build a Stream Reasoning Application

D. Dell'Aglio, E. Della Valle, T. Le-Pham, A. Mileo, and R. Tommasini

http://streamreasoning.org/events/streamapp2017

Continuous queries

Daniele Dell’Aglio

dellaglio@ifi.uzh.ch http://dellaglio.org @dandellaglio

SLIDE 2

Share, Remix, Reuse — Legally

  • This work is licensed under the Creative Commons Attribution 3.0 Unported License.
  • You are free:
    • to Share: to copy, distribute and transmit the work
    • to Remix: to adapt the work
  • Under the following conditions:
    • Attribution: You must attribute the work by inserting a credits slide stating
      – These slides are partially based on “How to Build a Stream Reasoning Application 2017” by D. Dell'Aglio, E. Della Valle, T. Le-Pham, A. Mileo, and R. Tommasini, available online at http://streamreasoning.org/events/streamapp2017
  • To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/

SLIDE 3

Continuous query evaluation

  • From SPARQL
    • One query, one answer
    • The query is sent after the data is available
  • To a continuous query language
    • One query, multiple answers
    • The query is registered in the query engine
    • The registration usually happens before the data arrives
    • Real-time responsiveness is usually required
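
The contrast between one-shot and continuous evaluation can be sketched in a few lines of Python. This is only an illustration: `one_shot`, `ContinuousEngine`, `register`, `push` and `rooms_of_alice` are hypothetical names, not part of any RSP engine, and the toy engine re-evaluates over all data received so far (no windows yet).

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]
Query = Callable[[List[Triple]], List[str]]


def one_shot(data: List[Triple], query: Query) -> List[str]:
    """SPARQL-style evaluation: the data is already there, one answer comes back."""
    return query(data)


class ContinuousEngine:
    """Toy engine (hypothetical): queries are registered first, answers follow as data arrives."""

    def __init__(self) -> None:
        self._data: List[Triple] = []
        self._registered: List[Tuple[Query, Callable[[List[str]], None]]] = []

    def register(self, query: Query, on_answer: Callable[[List[str]], None]) -> None:
        # Registration happens before any data is pushed.
        self._registered.append((query, on_answer))

    def push(self, triple: Triple) -> None:
        # Every arrival triggers a new evaluation of each registered query.
        self._data.append(triple)
        for query, on_answer in self._registered:
            on_answer(query(self._data))


def rooms_of_alice(data: List[Triple]) -> List[str]:
    """Example query: every room in which :alice has been seen."""
    return [o for s, p, o in data if s == ":alice" and p == ":isIn"]


answers: List[List[str]] = []
engine = ContinuousEngine()
engine.register(rooms_of_alice, answers.append)
engine.push((":alice", ":isIn", ":hall"))
engine.push((":alice", ":isIn", ":kitchen"))
```

After the two `push` calls, `answers` holds two results for the single registered query, whereas `one_shot` returns exactly one result per call.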

SLIDE 4

Let’s process the RDF streams!

  • In the literature there are two main approaches to stream processing
    • Data Stream Management Systems (DSMSs)
      – Roots in DBMS research
      – Aggregations and filters
    • Complex Event Processors (CEPs)
      – Roots in Discrete Event Simulation
      – Search for relevant patterns in the stream
      – Non-equi-joins on the timestamps (after, before, etc.)
  • Current systems implement features of both
    • EPL (e.g. Esper, ORACLE CEP)
  • Now we focus on the CQL/STREAM model
    • Developed in DSMS research
    • C-SPARQL (and others) are inspired by this model

SLIDE 5

Querying data streams – The CQL model

[Figure: the CQL model. Streams are infinite, unbounded sequences of timestamped elements <s,τ>. Relations are mappings T → R from time instants to finite bags of tuples <s1>, <s2>, <s3>, with R(t) the relation at time t. Three classes of operators connect them: stream-to-relation (sliding windows), relation-to-relation (relational algebra) and relation-to-stream (*Stream operators).]

SLIDE 6

CQL extension for querying RDF data streams

[Figure: the CQL model adapted to RDF. S2R operators (sliding windows) turn RDF streams into mappings, SPARQL operators work on mappings, and R2S operators (*Stream operators) turn mappings back into RDF streams.]

SLIDE 7

R2R operator

Time-based sliding window

[Figure: a stream S of items S1 to S12 along the time axis t, covered by a time-based sliding window W(ω,β), where ω is the width and β is the slide.]
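
A minimal Python sketch of a time-based sliding window W(ω,β) follows. It rests on assumptions the slide does not state: windows are half-open intervals [open, close), the first window opens at a start time `t0`, and each item S1 to S12 arrives one time unit after the previous one. The function names are hypothetical.

```python
from typing import Iterator, List, Tuple

Item = Tuple[str, int]  # (payload, timestamp)


def window_scopes(width: int, slide: int, t0: int, horizon: int) -> Iterator[Tuple[int, int]]:
    """Yield the (open, close) scopes of W(width, slide) that close by `horizon`."""
    opening = t0
    while opening + width <= horizon:
        yield (opening, opening + width)
        opening += slide  # the next window opens `slide` time units later


def window_content(stream: List[Item], scope: Tuple[int, int]) -> List[Item]:
    """Items whose timestamp falls in the half-open interval [open, close)."""
    opening, closing = scope
    return [item for item in stream if opening <= item[1] < closing]


# Assumed stream: S1..S12, one item per time unit.
stream = [(f"S{i}", i) for i in range(1, 13)]
scopes = list(window_scopes(width=5, slide=2, t0=0, horizon=9))
first = window_content(stream, scopes[0])
```

With width 5 and slide 2 the windows overlap, so an item such as S3 belongs to more than one window; with width equal to slide the windows tile the time axis without overlap (a tumbling window).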

SLIDE 8

SPARQL: a quick recap

SLIDE 9

The query output

  • What is the format of the answer?
  • We can distinguish two cases
    1. No R2S operator: the output is a relation (that changes over time)
    2. R2S operator: the output is a stream
      – An RDF stream? It depends on the query form

[Figure: S2R operators turn RDF streams into mappings, SPARQL operators work on mappings, and R2S operators turn mappings into RDF streams.]

SLIDE 10

No R2S operator: relation

[Figure: two queries registered in an RSP engine without an R2S operator. SELECT ?a ?b … FROM …. WHERE …. returns a relation of bindings (a … b …) that is updated at t1, t3, t5 and t7. CONSTRUCT {?a :prop ?b } FROM …. WHERE …. returns a set of triples (<… :prop … >) that is updated at the same time instants.]

SLIDE 11

R2S operator: stream

  • R2S operators
  • Three operators:
    • Rstream: streams out all data in the last step
    • Istream: streams out data in the last step that wasn’t in the previous step, i.e. streams out what is new
    • Dstream: streams out data in the previous step that isn’t in the last step, i.e. streams out what is old

[Figure: the query CONSTRUCT RSTREAM {?a :prop ?b } FROM …. WHERE …. is registered in an RSP engine, which emits a stream of timestamped triples <… :prop … > [t1], [t1], [t3], [t5], [t7], …]
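
The three R2S operators can be sketched as set differences between two consecutive evaluation steps. This is a simplification: relations are modelled here as sets rather than bags, and the function names and sample triples are hypothetical.

```python
from typing import FrozenSet, Tuple

Triple = Tuple[str, str, str]
Relation = FrozenSet[Triple]


def rstream(previous: Relation, current: Relation) -> Relation:
    """Rstream: everything in the last step."""
    return current


def istream(previous: Relation, current: Relation) -> Relation:
    """Istream: what is in the last step but was not in the previous one (what is new)."""
    return current - previous


def dstream(previous: Relation, current: Relation) -> Relation:
    """Dstream: what was in the previous step but is not in the last one (what is old)."""
    return previous - current


a = (":a", ":prop", ":x")
b = (":b", ":prop", ":y")
c = (":c", ":prop", ":z")
step1: Relation = frozenset({a, b})
step2: Relation = frozenset({b, c})
```

Between `step1` and `step2`, Rstream emits `b` and `c`, Istream emits only `c`, and Dstream emits only `a`.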

SLIDE 12

Some existing RSP systems (oversimplified!)

  • C-SPARQL: RDF Store + Stream processor
    • Combined architecture
  • CQELS: Implemented from scratch. Focus on performance
    • Native + adaptive joins for static data and streaming data

[Figure: C-SPARQL: a translator passes the C-SPARQL query to an RDF store and a stream processor, which produce continuous results. CQELS: a native RSP engine evaluates the CQELS query and produces continuous results.]

SLIDE 13

Some existing RSP systems (oversimplified!)

  • SPARQLstream: Ontology-based stream query answering
    • Virtual RDF views, using R2RML mappings
    • SPARQL stream queries over the original data streams
  • Instans: RETE-based evaluation

[Figure: SPARQLstream: a rewriter uses R2RML mappings to pass the SPARQLstream query to a DSMS/CEP, which produces continuous results.]

SLIDE 14

Classification of existing systems

Columns: System; Model; Continuous execution; Union, Join, Optional, Filter; Aggregates; Time window; Triple window; R2S operator

  • TA-SPARQL: TA-RDF; ✔; Limited
  • tSPARQL: tRDF; ✔
  • Streaming SPARQL: RDF Stream; ✔ ✔ ✔ ✔
  • C-SPARQL: RDF Stream; ✔ ✔ ✔ ✔ ✔; Rstream only
  • CQELS: RDF Stream; ✔ ✔ ✔ ✔ ✔; Istream only
  • SPARQLStream: (Virtual) RDF Stream; ✔ ✔ ✔ ✔ ✔
  • Instans: RDF; ✔ ✔ ✔

Disclaimer: only a partial view

SLIDE 15

Similar models, similar (not equal!) query languages

SPARQLstream:

    SELECT ?sensor
    FROM NAMED STREAM <http://www.cwi.nl/SRBench/observations> [NOW-3 HOURS SLIDE 10 MINUTES]
    WHERE {
      ?observation om-owl:procedure ?sensor ;
                   om-owl:observedProperty weather:WindSpeed ;
                   om-owl:result [ om-owl:floatValue ?value ] .
    }
    GROUP BY ?sensor
    HAVING ( AVG(?value) >= "74"^^xsd:float )

CQELS:

    SELECT ?sensor
    WHERE {
      STREAM <http://www.cwi.nl/SRBench/observations> [RANGE 10800s SLIDE 600s] {
        ?observation om-owl:procedure ?sensor ;
                     om-owl:observedProperty weather:WindSpeed ;
                     om-owl:result [ om-owl:floatValue ?value ] .
      }
    }
    GROUP BY ?sensor
    HAVING ( AVG(?value) >= "74"^^xsd:float )

C-SPARQL:

    SELECT ?sensor
    FROM STREAM <http://www.cwi.nl/SRBench/observations> [RANGE 1h STEP 10m]
    WHERE {
      ?observation om-owl:procedure ?sensor ;
                   om-owl:observedProperty weather:WindSpeed ;
                   om-owl:result [ om-owl:floatValue ?value ] .
    }
    GROUP BY ?sensor
    HAVING ( AVG(?value) >= "74"^^xsd:float )

SLIDE 16

The problem (1)

  • Where are Alice and Bob, when they are together?
  • Let’s consider a tumbling window W(ω=β=5)
  • Let’s execute the experiment 4 times

Stream S: e1 {:alice :isIn :hall} at t=1, e2 {:bob :isIn :hall} at t=3, e3 {:alice :isIn :kitchen} at t=6, e4 {:bob :isIn :kitchen} at t=9

    Execution   1st answer   2nd answer
    1           :hall [6]    :kitchen [11]
    2           :hall [5]    :kitchen [10]
    3           :hall [6]    :kitchen [11]
    4           - [7]        - [12]

Which is the correct answer?

SLIDE 17

The problem (2)

Stream S: e1 {:alice :isIn :hall} at t=1, e2 {:bob :isIn :hall} at t=3, e3 {:alice :isIn :kitchen} at t=6, e4 {:bob :isIn :kitchen} at t=9

CSPARQL:

    Execution   1st answer   2nd answer
    1           :hall [6]    :kitchen [11]
    2           :hall [5]    :kitchen [10]
    3           :hall [6]    :kitchen [11]
    4           - [7]        - [12]

CQELS:

    Execution   1st answer   2nd answer
    1           :hall [3]    :kitchen [9]
    2           No answers
    3           :hall [3]    :kitchen [9]
    4           No answers

Which system behaves in the correct way?

SLIDE 18

Understanding the RSPs

  • They share similar models, but they behave in different ways
    • The C-SPARQL, CQELS and SPARQLstream models do not uniquely determine the answer, given the inputs and the query
    • There are missing parameters (encoded in the implementations)
  • Why is it important to understand those behaviours?
    • To assess the correct implementation of the systems
    • To improve the comprehension of the benchmarking
  • The W3C RDF Stream Processing community group started to jointly work on a recommendation in 2014
    • http://www.w3.org/community/rsp/

SLIDE 19

The problem (3)

  • In the context of continuous query answering over RDF streams, how can the behaviour of existing systems be captured, compared and contrasted?
  • Why do we need it?
    • Comparison and contrast
    • Interoperability
    • Study RDF Stream Processing related problems
    • Standard RSP query language

SLIDE 20

RSEP-QL

  • A reference model that formally defines the semantics of RDF Stream Processing engines

[Figure: RSEP-QL as a model to express continuous queries, placed between applications and the data (background data and streams). It builds on RDF and SPARQL: RSP-QL covers BGP evaluation over background data and BGP evaluation over streams, and RSEP-QL adds event pattern detection operators.]

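SLIDE 21

RSEP-QL

From SPARQL to RSEP-QL

[Figure: a SPARQL query Q = (E, DS, QF) is processed by an evaluator (E) over a data layer of RDF graphs (DS) and by a result formatter (QF), producing Ans(Q). The query is then extended in three steps: Q = (E, SDS, QF), where the data layer SDS contains RDF graphs and RDF streams; Q = (E, SDS, ET, QF), where a continuous evaluator works over the evaluation times ET; and Q = (SE, SDS, ET, QF), where SE replaces E.]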

SLIDE 22

RSEP-QL: Dataset

  • Stream S: a sequence of timestamped graphs (stream items)
  • Time-based Sliding Window: 𝕏(ω,β,t0)
    • ω: width
    • β: slide
    • t0: When does the window start? (internal window param)

[Figure: a stream S of items S1 to S11 along the time axis t, covered by the window 𝕏(3,1,1) starting at t0.]

SLIDE 23

RSEP-QL: Dataset

  • Stream S: a sequence of timestamped graphs (stream items)
  • Landmark window: 𝕄(t0)

[Figure: a stream S of items S1 to S11 along the time axis t, covered by the landmark window 𝕄(2) starting at t0.]

SLIDE 24

RSEP-QL: Dataset

From SPARQL to RSEP-QL dataset

  • SPARQL dataset: RDF graphs (G, H)
  • RSEP-QL dataset: time-varying graphs and windows over streams, e.g. 𝕏(S)
    • Time-Varying Graph G: T → R, with T ⊆ ℕ and R = {RDF graph}
    • Instantaneous Graph G(t1) ∈ R

[Figure: a time-varying graph G with its instantaneous graph G(t1) at time t1, and a stream S of items S1 to S12 covered by the window 𝕏(S).]

SLIDE 25

RSEP-QL: Operators

  • Stream Processing operators (RSP-QL)
    • SPARQL operators
    • WINDOW to specify that the active element is a window (similar to GRAPH)
    • RStream, IStream, DStream to create the output stream
  • Event Processing operators (RSEP-QL)
    • Not covered in this tutorial

SLIDE 26

RSEP-QL: Evaluation Semantics

Stream Processing Evaluation Semantics

  • The SPARQL evaluation function is defined as
    • $\llbracket P \rrbracket_{DS(G)}$
  • The RSEP-QL evaluation function extends the SPARQL one by introducing the evaluation time instant $t$
    • $\langle\!\langle P \rangle\!\rangle^{t}_{SDS(A)}$
  • SPARQL operators are extended straightforwardly to the new evaluation function
  • Example: JOIN
    • $\llbracket \mathrm{JOIN}(P_1, P_2) \rrbracket_{DS(G)} = \llbracket P_1 \rrbracket_{DS(G)} \bowtie \llbracket P_2 \rrbracket_{DS(G)}$
    • $\langle\!\langle \mathrm{JOIN}(P_1, P_2) \rangle\!\rangle^{t}_{SDS(A)} = \langle\!\langle P_1 \rangle\!\rangle^{t}_{SDS(A)} \bowtie \langle\!\langle P_2 \rangle\!\rangle^{t}_{SDS(A)}$

SLIDE 27

RSEP-QL: Evaluation Semantics

Stream Processing Evaluation Semantics

  • The main difference is in the BGP evaluation:
    • $\langle\!\langle \mathrm{BGP} \rangle\!\rangle^{t}_{SDS(A)} = \llbracket \mathrm{BGP} \rrbracket_{SDS(A,t)}$
  • SDS(A,t) is:
    • SDS(G,t) = SDS(G(t)) if A is a time-varying graph G
    • SDS(𝕏(S),t) = SDS(m(𝕏(S,t))) if A is from a sliding window 𝕏
    • SDS(𝕄(S),t) = SDS(m(𝕄(S,t))) if A is from a landmark window 𝕄
  • where m denotes a merge function, $m(\mathbb{X}(S,t)) = \bigcup_{(d_i, t_i) \in \mathbb{X}(S,t)} d_i$
    • takes as input a window content, i.e. a sequence of timestamped RDF graphs
    • produces an RDF graph
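
The merge function m can be sketched in Python as a union over the graphs of a window content. The representation is an assumption made for illustration: an RDF graph is a frozen set of triples, a stream item is a (graph, timestamp) pair, and the sample data is hypothetical.

```python
from typing import FrozenSet, List, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]
StreamItem = Tuple[Graph, int]  # (RDF graph, timestamp)


def merge(window_content: List[StreamItem]) -> Graph:
    """Union of all graphs in the window content; the timestamps are dropped."""
    merged = set()
    for graph, _timestamp in window_content:
        merged |= graph
    return frozenset(merged)


# Hypothetical window content with two timestamped graphs.
content: List[StreamItem] = [
    (frozenset({(":alice", ":isIn", ":hall")}), 1),
    (frozenset({(":bob", ":isIn", ":hall")}), 3),
]
graph_at_t = merge(content)
```

The BGP is then evaluated over `graph_at_t` exactly as it would be over a static RDF graph, which is what allows the SPARQL operators to be reused.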


SLIDE 29

RSEP-QL: Evaluation Semantics

Event Processing Evaluation semantics

  • We use a new evaluation function $(\!|\,\cdot\,|\!)^{t}_{o,c}$
    • $t$ is the evaluation time instant (as in $\langle\!\langle \cdot \rangle\!\rangle^{t}_{SDS(A)}$),
    • $(o, c)$ is an additional window to identify the portion of the data on which the event may happen
  • Event pattern evaluation produces event mappings $\langle \mu, t_1, t_2 \rangle$
    • $\mu$ is a solution mapping
    • $t_1$ and $t_2$ denote the time interval justifying $\mu$

SLIDE 30

CEP operators

  • Sequence operators and CEP world
    • SEQ: joins ei and ej if ej occurs after ei
    • EQUALS: joins ei and ej if they occur simultaneously
    • AND: joins ei and ej if they both occur
    • NOT: checks that ei does not exist
    • ...

[Figure: events A, B, C and D on a stream S along a time axis with ticks at 1, 3, 6 and 9, with the labels "Sequence" and "Simultaneous".]

SLIDE 31

CEP operators: examples

  • B SEQ A
    • does not match
  • A AND C SEQ D
    • matches!
  • A SEQ NOT B SEQ C
    • does not match

[Figure: events A, B, C and D on a stream S along a time axis with ticks at 1, 3, 6 and 9.]
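
The three examples can be checked with a small Python sketch. The timestamps below are an assumption (A, B, C and D placed at 1, 3, 6 and 9, in that order), the helper names are hypothetical, and the operators are reduced to point events with a single occurrence each, which is far simpler than a full CEP semantics.

```python
from typing import Dict

# Assumed timestamps: one occurrence per event, in the order A, B, C, D.
events: Dict[str, int] = {"A": 1, "B": 3, "C": 6, "D": 9}


def seq(first: str, second: str) -> bool:
    """first SEQ second: both occur and `second` occurs strictly after `first`."""
    return first in events and second in events and events[first] < events[second]


def both(left: str, right: str) -> bool:
    """left AND right: both events occur, in any order."""
    return left in events and right in events


def seq_without(first: str, absent: str, last: str) -> bool:
    """first SEQ NOT absent SEQ last: `last` follows `first` with no `absent` in between."""
    if not seq(first, last):
        return False
    blocked = absent in events and events[first] < events[absent] < events[last]
    return not blocked


b_seq_a = seq("B", "A")
# Read here as (A AND C) SEQ D: D must follow both A and C.
a_and_c_seq_d = both("A", "C") and seq("A", "D") and seq("C", "D")
a_seq_not_b_seq_c = seq_without("A", "B", "C")
```

Under these assumptions `b_seq_a` is false because A precedes B, `a_and_c_seq_d` is true, and `a_seq_not_b_seq_c` is false because B occurs between A and C.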

SLIDE 32

CEP operators: intervals

[Figure: occurrences of the patterns P1, P2 and P3 as intervals on a time axis t from 1 to 10, illustrating the operators below.]

  • P1 SEQ P3
  • P2 AND P3
  • P2 OR P3
  • P1 PAR P2
  • P3 STARTS P1
  • P1 EQUALS P3
  • NOT(P3).[P1, P1]
  • P3 FINISHES P2
  • P2 MEETS P3

SLIDE 33

RSEP-QL: Evaluation Semantics

Evaluation semantics - Example

  • $(\!|\, \mathrm{FIRST\ EVENT}\ w_1\ P_1\ \mathrm{SEQ}\ \mathrm{EVERY\ EVENT}\ w_2\ P_2 \,|\!)^{t}_{9,16}$

[Figure: two streams S1 and S2 along a time axis t from 10 to 16. The first event matching P1 in w1 (FIRST EVENT w1 P1) is combined in sequence with the events matching P2 in w2 (EVENT w2 P2).]

SLIDE 34

RSEP-QL: Evaluation Semantics

MATCH graph pattern

  • Event patterns are enclosed in MATCH graph patterns
    • Event mappings exist only in the context of event patterns
  • The evaluation of a MATCH graph pattern produces a bag of solution mappings
    • $\langle\!\langle \mathrm{MATCH}\ E \rangle\!\rangle^{t}_{SDS(A)} = \{ \mu \mid \langle \mu, t_1, t_2 \rangle \in (\!|\, E \,|\!)^{t}_{0,t} \}$
  • It is possible to combine the MATCH graph pattern with other SPARQL graph patterns

SLIDE 35

RSEP-QL: Evaluation Semantics

Continuous evaluation

  • For each evaluation time $t \in ET$: $\langle\!\langle SE \rangle\!\rangle^{t}_{SDS(A)}$
  • The continuous evaluation is a sequence of instantaneous evaluations
  • It is not always possible to compute ET a priori
    • Can be data dependent
  • ET is expressed through a Report Policy

SLIDE 36

RSEP-QL: Evaluation Semantics

Continuous evaluation

  • A Report Policy is a set of conditions on one or more window operators in SDS
    • Initially defined in SECRET for Stream Processing engines
  • Report Policy examples:
    • P (Periodic): the window reports only at regular intervals
    • WC (Window Close): the window reports if the active window closes
    • CC (Content Change): the window reports if the content changes
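
How a report policy determines the evaluation times ET can be sketched as follows. The function names and parameters are hypothetical, and Content Change is simplified to "a new item arrives", ignoring changes caused by items leaving the window.

```python
from typing import List


def window_close_times(width: int, slide: int, t0: int, horizon: int) -> List[int]:
    """WC: report whenever a window closes, i.e. at t0 + width + k * slide."""
    times = []
    closing = t0 + width
    while closing <= horizon:
        times.append(closing)
        closing += slide
    return times


def content_change_times(arrivals: List[int], horizon: int) -> List[int]:
    """CC (simplified): report at every arrival, since each arrival changes the content."""
    return sorted({t for t in arrivals if t <= horizon})


def periodic_times(period: int, t0: int, horizon: int) -> List[int]:
    """P: report at regular intervals, independently of windows and data."""
    return list(range(t0 + period, horizon + 1, period))


arrivals = [1, 3, 6, 9]  # timestamps of the stream items
et_wc = window_close_times(width=5, slide=5, t0=0, horizon=12)
et_cc = content_change_times(arrivals, horizon=12)
et_p = periodic_times(period=4, t0=0, horizon=12)
```

Only the Content Change times depend on the data, which illustrates why ET cannot always be computed a priori.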

SLIDE 37

RSEP-QL in action

Correctness assessment

Stream S: e1 {:alice :isIn :hall} at t=1, e2 {:bob :isIn :hall} at t=3, e3 {:alice :isIn :kitchen} at t=6, e4 {:bob :isIn :kitchen} at t=9

Observed answers:

    Execution   1st answer   2nd answer
    1           :hall [6]    :kitchen [11]
    2           :hall [5]    :kitchen [10]
    3           :hall [6]    :kitchen [11]
    4           - [7]        - [12]

Answers for each window start t0:

    Window   1st answer   2nd answer
    t0=0     :hall [5]    :kitchen [10]
    t0=1     :hall [6]    :kitchen [11]
    t0=2     - [7]        - [12]

Each execution matches one window start: execution 2 matches t0=0, executions 1 and 3 match t0=1, and execution 4 matches t0=2.

SLIDE 38

RSEP-QL in action

Correctness assessment

Stream S: e1 {:alice :isIn :hall} at t=1, e2 {:bob :isIn :hall} at t=3, e3 {:alice :isIn :kitchen} at t=6, e4 {:bob :isIn :kitchen} at t=9

CSPARQL:

    Execution   1st answer   2nd answer
    1           :hall [6]    :kitchen [11]
    2           :hall [5]    :kitchen [10]
    3           :hall [6]    :kitchen [11]
    4           - [7]        - [12]

CQELS:

    Execution   1st answer   2nd answer
    1           :hall [3]    :kitchen [9]
    2           No answers
    3           :hall [3]    :kitchen [9]
    4           No answers

  • Window-close vs content-change
  • Empty relation notification (yes|no)
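
The effect of these parameters on the Alice and Bob query can be reproduced with a small simulation. It is a sketch under assumptions: windows are half-open intervals [open, close), the tumbling window has ω=β=5, Content Change is read as "evaluate at every arrival over the active window", and `run`, `together` and the parameter names are hypothetical rather than taken from C-SPARQL or CQELS.

```python
from typing import List, Optional, Tuple

Item = Tuple[str, str, int]  # (person, room, timestamp)
Answer = Tuple[Optional[str], int]  # (room or None, evaluation time)

STREAM: List[Item] = [
    (":alice", ":hall", 1), (":bob", ":hall", 3),
    (":alice", ":kitchen", 6), (":bob", ":kitchen", 9),
]
WIDTH = 5  # tumbling window: width equals slide


def together(content: List[Item]) -> Optional[str]:
    """Room in which both Alice and Bob are, according to the window content."""
    alice = {room for who, room, _ in content if who == ":alice"}
    bob = {room for who, room, _ in content if who == ":bob"}
    shared = sorted(alice & bob)
    return shared[0] if shared else None


def run(t0: int, policy: str, notify_empty: bool, horizon: int = 12) -> List[Answer]:
    answers: List[Answer] = []
    if policy == "window_close":
        opening = t0
        while opening + WIDTH <= horizon:
            closing = opening + WIDTH
            content = [i for i in STREAM if opening <= i[2] < closing]
            answers.append((together(content), closing))
            opening = closing
    elif policy == "content_change":
        for _, _, t in STREAM:
            if t < t0:
                continue  # the item arrived before the first window opened
            opening = t0 + ((t - t0) // WIDTH) * WIDTH  # active window at time t
            content = [i for i in STREAM if opening <= i[2] <= t]
            answers.append((together(content), t))
    else:
        raise ValueError(f"unknown policy: {policy}")
    # Empty relation notification: keep or drop the evaluations with no result.
    return [a for a in answers if notify_empty or a[0] is not None]
```

With the window-close policy, `run(0, ...)`, `run(1, ...)` and `run(2, ...)` give the answers at [5]/[10], [6]/[11] and [7]/[12] respectively; with the content-change policy and t0=0 the answers arrive at [3] and [9], the same timestamps as in the CQELS table.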

SLIDE 39

CSR Bench

  • CSR-bench is an extension of the SRbench benchmark that focuses on correctness
  • A test suite
    • LinkedSensorData as dataset
    • 8 parametric queries to test the RSP engines in different conditions
  • An oracle (an automatic correctness validator)
    • Based on RSP-QL

SLIDE 40

CSR Bench

[Figure: the CSR Bench oracle, with an online and an offline part, made of a data importer, a query transformer, a SPARQL engine and a result matcher. Its inputs are the dataset DS, the result R and the RSP-QL model of R for the query Q = (E, SDS, ET, QF); its output is the correctness assessment.]

SLIDE 41

CSR Bench

Experimental Results

  • All three systems that we considered in our experiments showed wrong behaviours
  • The defects we identified are related to:
    • the window operator
      – Initialization
      – Slide parameter
      – Window contents
    • timestamps of the triples
      – Internal timestamp management

[Table: results per query (Q1 to Q7) for CQELS, C-SPARQL and SPARQLstream.]

SLIDE 42

RSEP-QL: Results

  • RSEP-QL captures the evaluation semantics of existing RSP engines
  • RSEP-QL can determine the expected correct answers of an RSP engine, given the input data and query
    • At the basis of CSR Bench
  • The dynamics introduced in the continuous query evaluation process have not been fully understood
    • Not fully captured by existing models
    • RSEP-QL captures those dynamics