Processing Heterogeneous RDF Event Streams with Standing SPARQL Update




SLIDE 1

Processing Heterogeneous RDF Event Streams with Standing SPARQL Update

Mikko Rinne, Haris Abdullah, Seppo Törmä, Esko Nuutila http://cse.aalto.fi/instans/ 11.9.2012

Department of Computer Science and Engineering Distributed Systems Group

SLIDE 2

Smart Cities Need Interoperability

  • Smart environments of the future interconnect billions of sensors
    – Platforms from multiple vendors
    – Operated by different companies, public authorities or individuals
  • Highly distributed, loosely coupled solutions based on common standards are required
    – Challenge to proprietary platforms
  • Semantic web standards RDF, SPARQL and OWL offer a good base for interoperability
    – How would they work for event processing?

SLIDE 3

Solution Components

  • 1. Method: Multiple collaborating SPARQL queries and update rules processing heterogeneous events expressed in RDF
  • 2. Implementation (INSTANS): Incremental continuous query engine based on the Rete algorithm

SLIDE 4

An Event = “Anything that happens or is contemplated as happening”*)

*) Luckham, D., Schulte, R.: Event processing glossary – version 2.0 (Jul 2011)

[Figure: the (Simple) Events “Seppo came in”, “Mikko came in”, “Esko came in” and “It is 9 a.m.” combine into a Complex Event (also labelled Composite Event, “Synthesized Event”): “Seppo, Mikko and Esko are in. Meeting started in time”]

Complex Event: Summarizes, represents, or denotes a set of other events *)
SLIDE 5

Heterogeneous Event Representations

  • Variable event structures in an open environment
    – Different sensors may support different parameters
    – Queries can match the data of interest and disregard the rest
  • Semantic web standard RDF has flexible support for heterogeneous event structures
  • Alternative approaches typically cover data stream processing on individual time-annotated triples

[Figure: Example Location Update. RDF graph of event :e1 with rdf:type event:Event; event:agent :p3; event:place carrying geo:lat 60.158776, geo:long 24.881490 and geo:alt; event:time of rdf:type tl:Instant with tl:at 2011-10-03T08:17:11]
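The point that queries match the data of interest and disregard the rest can be illustrated with a small Python sketch. It is not how INSTANS stores triples; the tuple representation, the blank node names, the second event `:e2` and the `ex:accuracy` property are all assumptions made for illustration. The values for `:e1` follow the example location update as far as it can be read from the figure.

```python
# Triples as plain (subject, predicate, object) tuples; prefixed names are strings.
# Blank node labels (_:place1, _:t1, ...) are hypothetical.
EVENT_E1 = {
    (":e1", "rdf:type", "event:Event"),
    (":e1", "event:agent", ":p3"),
    (":e1", "event:place", "_:place1"),
    ("_:place1", "geo:lat", 60.158776),
    ("_:place1", "geo:long", 24.881490),
    (":e1", "event:time", "_:t1"),
    ("_:t1", "rdf:type", "tl:Instant"),
    ("_:t1", "tl:at", "2011-10-03T08:17:11"),
}

# A second, hypothetical producer: no time stamp, but an extra sensor parameter.
EVENT_E2 = {
    (":e2", "rdf:type", "event:Event"),
    (":e2", "event:agent", ":p4"),
    (":e2", "event:place", "_:place2"),
    ("_:place2", "geo:lat", 60.16),
    ("_:place2", "geo:long", 24.88),
    ("_:place2", "ex:accuracy", 15),
}


def objects(triples, subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def agent_positions(triples):
    """Return {agent: (lat, long)} for every event that has agent, lat and long.

    Properties the pattern does not mention (time, accuracy, ...) are ignored.
    """
    result = {}
    for s, p, o in triples:
        if p == "rdf:type" and o == "event:Event":
            for agent in objects(triples, s, "event:agent"):
                for place in objects(triples, s, "event:place"):
                    lats = objects(triples, place, "geo:lat")
                    longs = objects(triples, place, "geo:long")
                    if lats and longs:
                        result[agent] = (lats[0], longs[0])
    return result


positions = agent_positions(EVENT_E1 | EVENT_E2)
```

Both events match the same pattern although their structures differ, which is the property the slide attributes to RDF-encoded events.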

SLIDE 6

SPARQL Query + Update

  • SPARQL is tailor-made to query RDF data
  • SPARQL 1.1 Update supports INSERT operations, enabling
    – Memory
    – Communication between SPARQL queries
    – Stepwise processing of data
  • Applications can be constructed entirely of SPARQL queries

SLIDE 7

Close Friends Example Service

  • Mobile clients emit location updates
  • Service produces a “nearby” notification if two friends come geographically close to each other

[Figure labels: 1. Static input (RDF Store); 2. Event Producer (RDF Stream); 3. Event Channel; 4. Event Processing Agent; 5. Event Consumer; Mobile Client; Network; Configuration]

SLIDE 8

Approach 1: Single Query

CONSTRUCT { ?person1 :nearby ?person2 }
WHERE {
  # Part 1: Bind event data for pairs of persons who know each other
  GRAPH <http://externalgraphstore.org/socialnetwork> {
    ?person1 foaf:knows ?person2 }
  <bind events for p1+p2>
  # Part 2: Remove events, if a newer event can be found
  FILTER NOT EXISTS {
    ?event3 rdf:type event:Event ;
            event:agent ?person1 ;
            event:time [tl:at ?dttm3] .
    ?event4 rdf:type event:Event ;
            event:agent ?person2 ;
            event:time [tl:at ?dttm4] .
    FILTER ((?dttm1 < ?dttm3) || (?dttm2 < ?dttm4)) }
  # Part 3: Check if the latest registrations were close in space and time
  FILTER ( (abs(?lat2-?lat1)<0.01) && (abs(?long2-?long1)<0.01) &&
           (abs(hours(?dttm2)*60+minutes(?dttm2)-hours(?dttm1)*60-minutes(?dttm1))<10))}

Find the latest location for each person

  • Finds friends whose latest registrations are close in space and time
  • Doesn’t do anything for buffer management or re-execution of the query
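The three parts of the single query can be mirrored in plain Python to make the logic easier to follow. This is only a sketch of the same conditions, not an execution of the SPARQL query; the friend pair, the sample events and all function names are hypothetical. Like the query, the time check compares hours and minutes only and ignores the date.

```python
from datetime import datetime

# Hypothetical foaf:knows pairs (Part 1 of the query).
FRIENDS = {(":p1", ":p2")}

# Hypothetical location updates: (agent, lat, long, time).
EVENTS = [
    (":p1", 60.1587, 24.8814, datetime(2011, 10, 3, 8, 17, 11)),
    (":p2", 60.2500, 24.9900, datetime(2011, 10, 3, 8, 15, 0)),
    (":p2", 60.1590, 24.8820, datetime(2011, 10, 3, 8, 20, 0)),
]


def latest_per_agent(events):
    """Part 2: keep only the newest registration of each agent."""
    latest = {}
    for agent, lat, lon, t in events:
        if agent not in latest or latest[agent][2] < t:
            latest[agent] = (lat, lon, t)
    return latest


def minutes_of_day(t):
    """hours(t)*60 + minutes(t), as in the FILTER expression."""
    return t.hour * 60 + t.minute


def nearby_pairs(friends, events):
    """Part 3: friends whose latest registrations are close in space and time."""
    latest = latest_per_agent(events)
    result = []
    for p1, p2 in sorted(friends):
        if p1 in latest and p2 in latest:
            lat1, lon1, t1 = latest[p1]
            lat2, lon2, t2 = latest[p2]
            if (abs(lat2 - lat1) < 0.01 and abs(lon2 - lon1) < 0.01
                    and abs(minutes_of_day(t2) - minutes_of_day(t1)) < 10):
                result.append((p1, p2))
    return result


result = nearby_pairs(FRIENDS, EVENTS)
```

Every call recomputes the answer from the full event list, which corresponds to the missing buffer management and re-execution noted above.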

SLIDE 9

Approach 2: Window-Based Streaming SPARQL

REGISTER QUERY CloseFriends COMPUTED EVERY 2m AS
SELECT ?person1 ?person2
FROM STREAM <http://myexample.org/personlocationupdates> [RANGE 10m STEP 2m]
FROM <http://streams.org/socialnetwork.rdf>
WHERE {
  # Part 1: Bind event data for all friends
  ?person1 foaf:knows ?person2 .
  <bind events for p1+p2>
  FILTER ( ( ((?lat2-?lat1)*(?lat2-?lat1)) < 0.01*0.01) )
  FILTER ( ( ((?long2-?long1)*(?long2-?long1)) < 0.01*0.01) )
}
ORDER BY ?dttm1 ?dttm2

Window range and repetition rate

  • C-SPARQL environment handles windowing, removal of old events and repetition of query
  • Duplicate removal has to be handled by external means
  • Notification delay and duplicates lead to compromises
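The compromise between notification delay and duplicates can be seen in a toy model of a repeating window. The sketch below uses the RANGE 10m STEP 2m setting of the query above, expressed in seconds. The half-open window interval, the single detection at 130 s and the function name are assumptions; the actual window semantics of C-SPARQL may differ in detail.

```python
def window_evaluations(detection_times, range_s, step_s, end_s):
    """Evaluate a sliding window every step_s seconds up to end_s.

    Returns a list of (evaluation_time, detections_visible_in_window).
    The window is assumed to cover (t - range_s, t].
    """
    reports = []
    t = step_s
    while t <= end_s:
        in_window = [d for d in detection_times if t - range_s < d <= t]
        reports.append((t, in_window))
        t += step_s
    return reports


# One hypothetical "friends are close" situation arising at t = 130 s.
reports = window_evaluations([130], range_s=600, step_s=120, end_s=840)

hits = [t for t, found in reports if found]
first_delay = hits[0] - 130   # wait until the next scheduled evaluation
duplicates = len(hits) - 1    # the same detection reported again later
```

In this model the detection is first reported 110 s after it arises and then repeated in four more overlapping windows. Shortening the step reduces the delay but increases the number of repeated reports, and shortening the range does the opposite.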
SLIDE 10

Approach 3: Collaborative SPARQL Update Rules

  • No duplicate detections
  • Buffer management handled by SPARQL

Query 1: Maintain only the latest registration in the workspace
Query 2: Insert a “nearby” detection marker
Query 3: Emit notifications
Query 4: Delete “nearby status”

SLIDE 11

Rete-Algorithm in INSTANS

[Figure: Rete network for the sample query. α-nodes: α1 matches “a event:Event” (binds ?event), α2 matches “event:time” (binds ?event, ?time), α3 matches “tl:at” (binds ?time, ?daytime). β-memories β1–β3 feed join nodes Y1–Y3, followed by filter1 (?event, ?daytime) and select1 (?event). Numbers 1–8 mark the process flow steps for the sample tokens :e1, _:b1 and 10:05.]

Query:
  SELECT ?event
  WHERE {
    ?event a event:Event ;
           event:time ?time .
    ?time tl:at ?daytime .
    FILTER ( hours(?daytime) = 10 )
  }

Process flow:
  ① Each condition corresponds to an α-node. α1 matches with sample input “:e1 a event:Event”.
  ② “:e1” propagates to β2 and is stored there.
  ③ α2 matches with “:e1 event:time _:b1”, where “_:b1” is a blank node. Input from β2 matches with “?event” in Y2.
  ④ “:e1” and “_:b1” propagate until β3.
  ⑤ α3 matches with input “_:b1 tl:at "2011-10-03T10:05:00"^^xsd:dateTime”.
  ⑥ In Y3 “_:b1” is equal in both incoming branches and can be eliminated.
  ⑦ “:e1” and "2011-10-03T10:05:00"^^xsd:dateTime reach filter1. The condition “hour = 10” is true.
  ⑧ “:e1” is selected as a result.

  • Translation of SPARQL queries into an incremental processor
  • Each input triple propagates according to the queries and resulting states are saved within the structure
  • When a complete query is matched, results are immediately available
  • This sample query selects events between 10 and 11 o’clock
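The process flow for the sample query can be imitated with a small hand-wired matcher in Python. It is a sketch in the spirit of the network on this slide, not the INSTANS implementation and not a general Rete compiler: the class and attribute names are hypothetical, the network is fixed to this one query, and duplicate input triples are not suppressed.

```python
from datetime import datetime


class TinyRete:
    """Incremental matcher for the one sample query on this slide."""

    def __init__(self):
        self.beta2 = []    # ?event values that passed alpha1 (a event:Event)
        self.alpha2 = []   # (?event, ?time) pairs from event:time triples
        self.beta3 = []    # (?event, ?time) pairs that passed join Y2
        self.alpha3 = []   # (?time, ?daytime) pairs from tl:at triples
        self.results = []  # output of select1

    def add(self, s, p, o):
        """Feed one triple; return the results that became available."""
        before = len(self.results)
        if p == "rdf:type" and o == "event:Event":
            self.beta2.append(s)
            for event, time in self.alpha2:
                if event == s:
                    self._join2_out(event, time)
        elif p == "event:time":
            self.alpha2.append((s, o))
            if s in self.beta2:
                self._join2_out(s, o)
        elif p == "tl:at":
            self.alpha3.append((s, o))
            for event, time in self.beta3:
                if time == s:              # join Y3 on ?time
                    self._filter(event, o)
        # Triples matching no alpha node are simply dropped.
        return self.results[before:]

    def _join2_out(self, event, time):
        """Store a Y2 match in beta3 and try to complete it through Y3."""
        self.beta3.append((event, time))
        for t, daytime in self.alpha3:
            if t == time:
                self._filter(event, daytime)

    def _filter(self, event, daytime):
        """filter1: hours(?daytime) = 10, then select1."""
        if daytime.hour == 10:
            self.results.append(event)


net = TinyRete()
step1 = net.add(":e1", "rdf:type", "event:Event")
step2 = net.add(":e1", "event:time", "_:b1")
step3 = net.add("_:b1", "tl:at", datetime(2011, 10, 3, 10, 5, 0))
```

The first two triples only update the stored state; the result `:e1` becomes available when the third triple arrives. Because partial matches are kept in the memories, the same result is reached if the triples arrive in a different order.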

SLIDE 12

Comparison of Approaches

                              Single Query            C-SPARQL                  INSTANS
Correctness of notifications  Yes                     Yes, if windows overlap   Yes
Duplication elimination       Only within one query   Only inside window        Yes
Timeliness of notifications   Query triggered         Periodically triggered    Event triggered
Scalability wrt #events       No                      Yes                       Yes

SLIDE 13

Notification Delay Results

[Figure: Notification Delay [s] (axis 0.0 to 30.0) versus C-SPARQL Window Length (5s to 60s), with series C-SPARQL and INSTANS]

INSTANS: 12 ms independent of window length

  • 5 simulated friends moving on a map
  • C-SPARQL query processing delay varied 12-253 ms for 5-60 events, respectively
  • Window repetition rate is the dominant component of the notification delay
  • With events arriving at 1 event per second, the C-SPARQL notification delay was measured at 1.34 – 25.90 seconds.

SLIDE 14

Summary

  • Event processing based on
    – RDF-encoded heterogeneous events
        • Event format can evolve independently of event processing application
        • Built-in support for disjoint vocabularies
    – SPARQL Query + Update
        • Application can be built entirely out of collaborating SPARQL queries
        • Access to linked open data, future possibilities for inference
    – No proprietary extensions needed so far
        • Promise of good interoperability in multi-vendor multi-actor environments
    – Continuous incremental matching using the Rete algorithm
        • No repeating windows (processing repetition, duplicate matches, missed detections on window borders)
  • Application areas in smart spaces, context-aware mobile systems, internet-of-things, the real-time web etc.

*) ACM Special Interest Group for Applied Computing

SLIDE 15

Conclusions

  • Collaborative SPARQL queries are a promising method for event processing using semantic web technologies
  • A platform capable of continuous event-driven evaluation of parallel SPARQL queries supporting SPARQL 1.1 Update (INSERT) is needed
  • INSTANS outperforms the comparison approaches
    – Single SPARQL query lacks buffer management and repetition
    – Window-based streaming SPARQL suffers from contradicting requirements in setting operation parameters
    – INSTANS gives correct notifications without duplicates in a fraction of the notification time of Streaming SPARQL.

SLIDE 16

Background Material

SLIDE 17

Queries in Approach 3

Query 1) Window-query:
DELETE { <bind event to variables> }
WHERE {
  <bind event to variables>
  FILTER EXISTS {
    ?event2 event:agent ?person ;
            event:time [tl:at ?dttm2] .
    FILTER (?dttm < ?dttm2) } }

Query 2) Nearby detection:
INSERT { ?person1 :nearby ?person2 }
WHERE {
  ?person1 foaf:knows ?person2 .
  <bind events for p1+p2>
  # Check proximity in space and time
  FILTER ((abs(?lat2-?lat1)<0.01) && (abs(?long2-?long1)<0.01) &&
          (abs(hours(?dttm2)*60+minutes(?dttm2)-hours(?dttm1)*60-minutes(?dttm1))<10))
  # Don't insert, if the relation already exists
  FILTER NOT EXISTS { ?person1 :nearby ?person2 } }

Query 3) Notification:
SELECT ?person1 ?person2
WHERE { ?person1 :nearby ?person2 }

Query 4) Removal of “nearby” status:
DELETE { ?person1 :nearby ?person2 }
WHERE {
  ?person1 foaf:knows ?person2 .
  <bind events for p1+p2>
  FILTER ( (abs(?lat2-?lat1)>0.02) || (abs(?long2-?long1)>0.02))
  FILTER EXISTS { ?person1 :nearby ?person2 } }
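How the four queries cooperate through a shared workspace can be sketched in Python. This is an illustration of the rule logic only, under the assumption that every rule is re-checked after each incoming event; the class, its attributes and the sample data are hypothetical, and INSTANS evaluates the actual SPARQL rules through its Rete network rather than like this. The thresholds 0.01, 0.02 and 10 minutes are taken from Queries 2 and 4.

```python
from datetime import datetime


class CloseFriends:
    """Workspace plus the four rules of Approach 3, in plain Python."""

    def __init__(self, friends):
        self.friends = set(friends)  # foaf:knows pairs (static input)
        self.latest = {}             # agent -> (lat, long, time)
        self.nearby = set()          # ":nearby" markers in the workspace
        self.notifications = []      # what Query 3 would emit

    def on_event(self, agent, lat, lon, time):
        # Query 1: keep only the latest registration of each agent.
        old = self.latest.get(agent)
        if old is None or old[2] < time:
            self.latest[agent] = (lat, lon, time)
        for p1, p2 in self.friends:
            if p1 in self.latest and p2 in self.latest:
                self._update_pair(p1, p2)

    def _update_pair(self, p1, p2):
        lat1, lon1, t1 = self.latest[p1]
        lat2, lon2, t2 = self.latest[p2]
        dlat, dlon = abs(lat2 - lat1), abs(lon2 - lon1)
        dmin = abs((t2.hour * 60 + t2.minute) - (t1.hour * 60 + t1.minute))
        pair = (p1, p2)
        if pair not in self.nearby:
            # Query 2: insert the marker only if it does not exist yet.
            if dlat < 0.01 and dlon < 0.01 and dmin < 10:
                self.nearby.add(pair)
                # Query 3: a new marker produces exactly one notification.
                self.notifications.append(pair)
        elif dlat > 0.02 or dlon > 0.02:
            # Query 4: remove the marker once the friends have moved apart.
            self.nearby.discard(pair)


service = CloseFriends({(":p1", ":p2")})
service.on_event(":p1", 60.1587, 24.8814, datetime(2011, 10, 3, 8, 17))
service.on_event(":p2", 60.1590, 24.8820, datetime(2011, 10, 3, 8, 20))
```

After the second event the workspace holds one “nearby” marker and one notification has been emitted. Further updates inside the 0.01 range produce no duplicates, and the gap between the insertion threshold (0.01) and the removal threshold (0.02) keeps the marker from toggling while a friend moves around the boundary.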