Processing Heterogeneous RDF Event Streams with Standing SPARQL

Processing Heterogeneous RDF Event Streams with Standing SPARQL Update. Mikko Rinne, Haris Abdullah, Seppo Törmä, Esko Nuutila. http://cse.aalto.fi/instans/ 11.9.2012. Department of Distributed Computer Science and Systems Engineering


  1. Processing Heterogeneous RDF Event Streams with Standing SPARQL Update
     Mikko Rinne, Haris Abdullah, Seppo Törmä, Esko Nuutila
     http://cse.aalto.fi/instans/
     11.9.2012
     Department of Distributed Computer Science and Systems Engineering Group

  2. Smart Cities Need Interoperability
     • Smart environments of the future interconnect billions of sensors
       – Platforms from multiple vendors
       – Operated by different companies, public authorities or individuals
     • Highly distributed, loosely coupled solutions based on common standards are required
       – Challenge to proprietary platforms
     • Semantic web standards RDF, SPARQL and OWL offer a good base for interoperability
       – How would they work for event processing?

  3. Solution Components
     1. Method: Multiple collaborating SPARQL queries and update rules processing heterogeneous events expressed in RDF
     2. Implementation (INSTANS): Incremental continuous query engine based on the Rete-algorithm

  4. An Event = “Anything that happens or is contemplated as happening” *)
     [Figure: example event hierarchy]
     • Simple events: “It is 9 a.m.”, “Seppo came in”, “Mikko came in”, “Esko came in”
     • Complex event / composite event: “Seppo, Mikko and Esko are in.”
       – A complex event summarizes, represents, or denotes a set of other events *)
     • Synthesized event: “Meeting started in time”
     *) Luckham, D., Schulte, R.: Event processing glossary – version 2.0 (Jul 2011)

  5. Heterogeneous Event Representations
     • Variable event structures in an open environment
       – Different sensors may support different parameters
       – Queries can match the data of interest and disregard the rest
     • Semantic web standard RDF has flexible support for heterogeneous event structures
     • Alternative approaches typically cover data stream processing on individual time-annotated triples
     [Figure: Example Location Update. RDF graph of event :e1 (rdf:type event:Event) with event:agent :p3, an event:place node with geo:lat 60.158776, geo:long 24.881490 and geo:alt, and an event:time node (rdf:type tl:Instant) with tl:at 2011-10-03T08:17:11]
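
The claim that queries can match the data of interest and disregard the rest can be sketched in a few lines of Python. This is only an illustration: events are stored as plain (subject, predicate, object) tuples, the node names, coordinate values, and the `ex:battery` property are made up, and the helper `agent_positions` is hypothetical, not part of INSTANS or of any RDF library.

```python
# Minimal sketch: heterogeneous events as (subject, predicate, object) tuples.
# All identifiers and values are illustrative assumptions.

def agent_positions(triples):
    """Return (agent, lat, long) for every event that carries those fields.

    Extra properties (altitude, battery level, ...) are simply ignored.
    Assumes at most one value per subject/predicate pair.
    """
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, {})[p] = o

    positions = []
    for props in by_subject.values():
        if props.get("rdf:type") != "event:Event":
            continue
        place = by_subject.get(props.get("event:place"), {})
        if "event:agent" in props and "geo:lat" in place and "geo:long" in place:
            positions.append((props["event:agent"], place["geo:lat"], place["geo:long"]))
    return sorted(positions)


triples = [
    # Event 1: reports an altitude in addition to latitude and longitude.
    (":e1", "rdf:type", "event:Event"),
    (":e1", "event:agent", ":p3"),
    (":e1", "event:place", "_:l1"),
    ("_:l1", "geo:lat", 60.158776),
    ("_:l1", "geo:long", 24.881490),
    ("_:l1", "geo:alt", 12.0),
    # Event 2: no altitude, but an unrelated (hypothetical) battery property.
    (":e2", "rdf:type", "event:Event"),
    (":e2", "event:agent", ":p4"),
    (":e2", "event:place", "_:l2"),
    ("_:l2", "geo:lat", 60.17),
    ("_:l2", "geo:long", 24.94),
    (":e2", "ex:battery", 0.8),
]

positions = agent_positions(triples)
print(positions)
```

Both events yield a position even though their shapes differ, which is the property the slide attributes to RDF-encoded events.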

  6. SPARQL Query + Update
     • SPARQL is tailor-made to query RDF data
     • SPARQL 1.1 Update supports INSERT operations, enabling
       – Memory
       – Communication between SPARQL queries
       – Stepwise processing of data
     • Applications can be constructed entirely of SPARQL Queries

  7. Close Friends Example Service
     • Mobile clients emit location updates
     • Service produces a “nearby” notification if two friends come geographically close to each other
     [Figure: service architecture with 1. Static input (RDF Store), Configuration, 2. Event Producer (RDF Stream), 3. Event Processing Agent, 4. Event Channel, 5. Event Consumer; Mobile Client, Network]

  8. Approach 1: Single Query

     CONSTRUCT { ?person1 :nearby ?person2 }
     WHERE {
       # Part 1: Bind event data for pairs of persons who know each other
       GRAPH <http://externalgraphstore.org/socialnetwork> { ?person1 foaf:knows ?person2 }
       <bind events for p1+p2>
       # Part 2: Remove events, if a newer event can be found
       # (slide callout: find the latest location for each person)
       FILTER NOT EXISTS {
         ?event3 rdf:type event:Event ;
                 event:agent ?person1 ;
                 event:time [tl:at ?dttm3] .
         ?event4 rdf:type event:Event ;
                 event:agent ?person2 ;
                 event:time [tl:at ?dttm4] .
         FILTER ((?dttm1 < ?dttm3) || (?dttm2 < ?dttm4))
       }
       # Part 3: Check if the latest registrations were close in space and time
       FILTER ( (abs(?lat2-?lat1)<0.01) && (abs(?long2-?long1)<0.01) &&
                (abs(hours(?dttm2)*60+minutes(?dttm2)-hours(?dttm1)*60-minutes(?dttm1))<10))
     }

     • Finds friends whose latest registrations are close in space and time
     • Doesn’t do anything for buffer management or re-execution of the query
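
To make the Part 3 filter easier to read, here is the same arithmetic as a small Python sketch. The function names and sample coordinates are assumptions for illustration. Like the SPARQL expression, it compares only hours and minutes of the timestamps, so the calendar date is ignored.

```python
from datetime import datetime


def minutes_of_day(ts):
    """hours(ts)*60 + minutes(ts), as in the slide's FILTER expression."""
    return ts.hour * 60 + ts.minute


def is_nearby(lat1, long1, ts1, lat2, long2, ts2):
    """Mirror of the Part 3 filter: within 0.01 degrees and under 10 minutes apart."""
    return (
        abs(lat2 - lat1) < 0.01
        and abs(long2 - long1) < 0.01
        and abs(minutes_of_day(ts2) - minutes_of_day(ts1)) < 10
    )


a = datetime(2011, 10, 3, 8, 17, 11)
b = datetime(2011, 10, 3, 8, 21, 0)
c = datetime(2011, 10, 4, 8, 21, 0)  # same time of day, one day later

close = is_nearby(60.1587, 24.8814, a, 60.1600, 24.8850, b)     # small offsets
far = is_nearby(60.1587, 24.8814, a, 60.2587, 24.8814, b)       # 0.1 degrees north
next_day = is_nearby(60.1587, 24.8814, a, 60.1600, 24.8850, c)  # date is not compared
print(close, far, next_day)
```

The third call shows a consequence of the time-of-day arithmetic: registrations made on different days can still pass the filter.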

  9. Approach 2: Window-Based Streaming SPARQL

     REGISTER QUERY CloseFriends COMPUTED EVERY 2m AS
     SELECT ?person1 ?person2
     # Slide callout on the next line: window range and repetition rate
     FROM STREAM <http://myexample.org/personlocationupdates> [RANGE 10m STEP 2m]
     FROM <http://streams.org/socialnetwork.rdf>
     WHERE {
       # Part 1: Bind event data for all friends
       ?person1 foaf:knows ?person2
       <bind events for p1+p2>
       FILTER ( ( ((?lat2-?lat1)*(?lat2-?lat1)) < 0.01*0.01) )
       FILTER ( ( ((?long2-?long1)*(?long2-?long1)) < 0.01*0.01) )
     }
     ORDER BY ?dttm1 ?dttm2

     • C-SPARQL environment handles windowing, removal of old events and repetition of query
     • Duplicate removal has to be handled by external means
     • Notification delay and duplicates lead to compromises
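
The trade-off between notification delay and duplicates can be illustrated with a toy model of a sliding window in Python. It assumes the query runs at every multiple of STEP and sees events inside the half-open interval (tick - RANGE, tick]; the actual C-SPARQL window semantics may differ, and `window_reports` is a hypothetical helper.

```python
def window_reports(event_time, range_s, step_s, horizon_s):
    """Evaluation times (seconds) at which a periodic window query sees the event.

    Assumption: the query runs at every multiple of step_s and covers
    the interval (tick - range_s, tick].
    """
    ticks = range(step_s, horizon_s + 1, step_s)
    return [t for t in ticks if t - range_s < event_time <= t]


# RANGE 10m STEP 2m as in the slide, with an event arriving 130 s into the stream.
reports = window_reports(event_time=130, range_s=600, step_s=120, horizon_s=1200)
first_delay = reports[0] - 130

print(reports)      # the event is matched again in every overlapping window
print(first_delay)  # waiting time until the first evaluation after the event
```

Under these assumptions the event is first seen 110 seconds after it arrived and is then matched in five consecutive windows. A shorter STEP reduces the delay but increases the number of repeated matches.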

  10. Approach 3: Collaborative SPARQL Update Rules
      • Query 1: Maintain only the latest registration in the workspace
      • Query 2: Insert a “nearby” detection
      • Query 3: Emit notifications
      • Query 4: Delete “nearby status” marker

      • No duplicate detections
      • Buffer management handled by SPARQL
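
One way to picture how the four rules cooperate is the following Python sketch. It is an interpretation of the slide, not the INSTANS rule set: the class and attribute names are hypothetical, the workspace is an in-memory dictionary instead of an RDF graph, and the time-closeness check is left out for brevity.

```python
class CloseFriendsWorkspace:
    """Event-triggered sketch of the four collaborating rules (hypothetical names)."""

    def __init__(self, friends, threshold=0.01):
        self.friends = {frozenset(pair) for pair in friends}
        self.threshold = threshold
        self.latest = {}          # person -> (lat, long): the workspace
        self.nearby = set()       # "nearby status" markers, one per pair
        self.notifications = []   # emitted notifications

    def _close(self, pair):
        a, b = sorted(pair)
        if a not in self.latest or b not in self.latest:
            return False
        (lat1, long1), (lat2, long2) = self.latest[a], self.latest[b]
        return abs(lat2 - lat1) < self.threshold and abs(long2 - long1) < self.threshold

    def on_event(self, person, lat, long):
        # Rule 1: keep only the latest registration of each person.
        self.latest[person] = (lat, long)
        for pair in self.friends:
            if person not in pair:
                continue
            close = self._close(pair)
            if close and pair not in self.nearby:
                self.nearby.add(pair)                           # Rule 2: insert marker
                self.notifications.append(tuple(sorted(pair)))  # Rule 3: emit once
            elif not close and pair in self.nearby:
                self.nearby.discard(pair)                       # Rule 4: delete marker


ws = CloseFriendsWorkspace([(":p1", ":p2")])
ws.on_event(":p1", 60.1587, 24.8814)
ws.on_event(":p2", 60.1590, 24.8820)  # close: first notification
ws.on_event(":p2", 60.1591, 24.8821)  # still close: marker suppresses a duplicate
ws.on_event(":p2", 60.3000, 24.8821)  # moved away: marker deleted
ws.on_event(":p2", 60.1590, 24.8820)  # close again: second notification
print(ws.notifications)
```

The marker is what prevents duplicate detections: a notification is emitted only when the marker is newly inserted, and the marker is removed as soon as the pair is no longer close.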

  11. Rete-Algorithm in INSTANS
      • Translation of SPARQL queries into an incremental processor
      • Each input triple propagates according to the queries and resulting states are saved within the structure
      • When a complete query is matched, results are immediately available
      • This sample query selects events between 10 and 11 o’clock

      Query:
        SELECT ?event WHERE {
          ?event a event:Event ;
                 event:time ?time .
          ?time tl:at ?daytime .
          FILTER ( hours(?daytime) = 10 )
        }

      Process flow:
        ① Each condition corresponds to an α-node. α1 matches with sample input “:e1 a event:Event”.
        ② “:e1” propagates to β2 and is stored there.
        ③ α2 matches with “:e1 event:time _:b1”, where “_:b1” is a blank node. Input from β2 matches with “?event” in Y2.
        ④ “:e1” and “_:b1” propagate until β3.
        ⑤ α3 matches with input “_:b1 tl:at "2011-10-03T10:05:00"^^xsd:dateTime”.
        ⑥ In Y3 “_:b1” is equal in both incoming branches and can be eliminated.
        ⑦ “:e1” and "2011-10-03T10:05:00"^^xsd:dateTime reach filter1. The condition “hour = 10” is true.
        ⑧ “:e1” is selected as a result.

      [Figure: Rete network for the sample query, with α-nodes α1 (? a event:Event), α2 (? event:time ?) and α3 (? tl:at ?), β-nodes, nodes Y1 to Y3, filter1 and select1; nodes are numbered 1 to 8 to match the process flow]
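
The process flow above can be approximated with a very small incremental matcher in Python. This is a simplified sketch hard-wired to the sample query, not the INSTANS implementation: the class `TinyNetwork` and its fields are hypothetical, `a` is written out as `rdf:type`, and literals are plain ISO strings.

```python
from datetime import datetime


class TinyNetwork:
    """Incremental matcher for the sample query (simplified, hypothetical)."""

    def __init__(self):
        self.alpha1 = set()  # ?event matched by "?event a event:Event"
        self.alpha2 = []     # (?event, ?time) matched by "?event event:time ?time"
        self.alpha3 = []     # (?time, ?daytime) matched by "?time tl:at ?daytime"
        self.beta = []       # stored partial matches (?event, ?time)
        self.results = []    # selected ?event values

    def add(self, s, p, o):
        """Feed one triple; stored state lets later triples complete a match."""
        if p == "rdf:type" and o == "event:Event":
            if s in self.alpha1:
                return
            self.alpha1.add(s)
            for event, time in self.alpha2:
                if event == s:
                    self._join_time(event, time)
        elif p == "event:time":
            self.alpha2.append((s, o))
            if s in self.alpha1:
                self._join_time(s, o)
        elif p == "tl:at":
            self.alpha3.append((s, o))
            for event, time in self.beta:
                if time == s:
                    self._filter(event, o)

    def _join_time(self, event, time):
        # Store the partial match, then try to complete it with known tl:at triples.
        self.beta.append((event, time))
        for node, daytime in self.alpha3:
            if node == time:
                self._filter(event, daytime)

    def _filter(self, event, daytime):
        # FILTER ( hours(?daytime) = 10 ); the joining ?time node is dropped here.
        if datetime.fromisoformat(daytime).hour == 10:
            self.results.append(event)


net = TinyNetwork()
net.add(":e1", "rdf:type", "event:Event")
net.add(":e1", "event:time", "_:b1")
partial = list(net.results)  # nothing selected yet: the match is incomplete
net.add("_:b1", "tl:at", "2011-10-03T10:05:00")
print(partial, net.results)
```

Because partial matches are kept in memory, the result becomes available as soon as the last required triple arrives, and the order in which the triples arrive does not matter.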

  12. Comparison of Approaches

      Criterion                    | Single Query          | C-SPARQL               | INSTANS
      Correctness of notifications | Yes                   | Yes if windows overlap | Yes
      Duplication elimination      | Only within one query | Only inside window     | Yes
      Timeliness of notifications  | Query triggered       | Periodically triggered | Event triggered
      Scalability wrt #events      | No                    | Yes                    | Yes

  13. Notification Delay Results
      • 5 simulated friends moving on a map
      • C-SPARQL query processing delay varied 12-253 ms for 5-60 events, respectively
      • Window repetition rate is the dominant component of the notification delay
      • With 1 event per second inter-arrival time C-SPARQL notification delay measured at 1.34 – 25.90 seconds.
      • INSTANS: 12 ms, independent of window length
      [Figure: Notification Delay [s] (0.0 to 30.0) versus C-SPARQL Window Length (5s, 10s, 20s, 30s, 40s, 50s, 60s), with series C-SPARQL and INSTANS]

  14. Summary
      • Event processing based on
        – RDF-encoded heterogeneous events
          • Event format can evolve independently of event processing application
          • Built-in support for disjoint vocabularies
        – SPARQL Query + Update
          • Application can be built entirely out of collaborating SPARQL queries
          • Access to linked open data, future possibilities for inference
        – No proprietary extensions needed so far
          • Promise of good interoperability in multi-vendor multi-actor environments
        – Continuous incremental matching using the Rete-algorithm
          • No repeating windows (processing repetition, duplicate matches, missed detections on window borders)
      • Application areas in smart spaces, context-aware mobile systems, internet-of-things, the real-time web etc.
      *) ACM Special Interest Group for Applied Computing

  15. Conclusions
      • Collaborative SPARQL queries are a promising method for event processing using semantic web technologies
      • A platform capable of continuous event-driven evaluation of parallel SPARQL queries supporting SPARQL 1.1 Update (INSERT) is needed
      • INSTANS outperforms the comparison approaches
        – Single SPARQL query lacks buffer management and repetition
        – Window-based streaming SPARQL suffers from conflicting requirements in setting operation parameters
        – INSTANS gives correct notifications without duplicates in a fraction of the notification time of Streaming SPARQL.

  16. Background Material
