RSP Optimisation Techniques M.I. Ali http://intizarali.org - - PowerPoint PPT Presentation


Tutorial on RDF Stream Processing 2016 M.I. Ali, J-P Calbimonte, D. Dell'Aglio, E. Della Valle, and A. Mauri http://streamreasoning.org/events/rsp2016 RSP Optimisation Techniques M.I. Ali http://intizarali.org @intizarali


slide-1
SLIDE 1

Tutorial on RDF Stream Processing 2016

M.I. Ali, J-P Calbimonte, D. Dell'Aglio, E. Della Valle, and A. Mauri

http://streamreasoning.org/events/rsp2016

RSP Optimisation Techniques

M.I. Ali

ali.intizar@insight-centre.org http://intizarali.org @intizarali

slide-2
SLIDE 2

Data Streams are Everywhere

  • Smart Cities and IoT are leading to the era of a streaming world
  • Sensors and mobile devices are producing an enormous amount of data
  • Mostly in streaming fashion

slide-3
SLIDE 3

Introducing Semantics in Data Streams

  • Why RDF Data Streams?
  • Interoperable (easy integration)

  • Machine Readable
  • Reasoning
  • On-demand discovery
  • Ideal for the web
  • Dereferencing
slide-4
SLIDE 4

The Goal

02/11/2016

slide-5
SLIDE 5

CityPulse: Real-time IoT Data Analytics and Large Scale Data Analytics for Smart Cities Applications

  • CityPulse aims to support the integration of dynamic data sources and context-dependent on-demand adaptations of processing chains during run-time.
  • CityPulse aims to bridge the gap between the application technologies on the IoT and real world data streams.
  • It will use Cyber-Physical and Social data and will employ big data analytics and intelligent methods to aggregate, interpret and extract meaningful knowledge and perceptions from large sets of heterogeneous data streams.

slide-6
SLIDE 6

CityPulse: Real-time IoT Data Analytics and Large Scale Data Analytics for Smart Cities Applications

slide-7
SLIDE 7

Smart City Applications

slide-8
SLIDE 8

Is RSP Ready for Action?

  • Available Engines
  • CQELS
  • C-SPARQL
  • SPARQLStream
  • Processing capabilities tests
  • Benchmarks

– LS Bench
– SR Bench
– CSR Bench

  • Performance and Scalability
slide-9
SLIDE 9

Is RSP Ready for Action?

  • RSP is still in its cradle
  • On-going work on query language and semantics
  • Existing RSP engines are no more than prototypes
  • Benchmarking for performance and scalability is done in controlled environments

slide-10
SLIDE 10

Challenges for RSP Optimisation

  • Data Distribution

– Data produced by streams is highly distributed

  • Unpredictable Data Rate

– Stream observation rate is variable
– Stream Bursts

slide-11
SLIDE 11

Challenges for RSP Optimisation

  • Number of Concurrent Queries

– A large audience or number of end users, e.g. citizens of a smart city

  • Background Data Integration

– Streaming queries process a combination of streaming and static knowledge
– Currently, the static knowledge base is processed in memory

slide-12
SLIDE 12

Challenges for RSP Optimisation

  • Quasi-static Data

– Fetching quasi-static data once and processing it locally can produce outdated results

  • On-demand Discovery

– Stream processing operates in a frequently changing world
– Data and applications change quite frequently

  • Adaptation

– Streaming queries in dynamic environments need continuous monitoring

slide-13
SLIDE 13

How can we optimise RSP?

  • Benchmarking
  • Resource Optimisation
  • Resource Sharing/Join Optimisation
  • Scalability
  • Load Balancing
  • Hybrid Reasoning
slide-14
SLIDE 14

Benchmarks

  • SR Bench
  • LS Bench
  • CSR Bench

Benchmarking Infrastructure

  • CityBench
  • YABench
  • Heaven
slide-15
SLIDE 15

CityBench Benchmarking Suite - CTI

[Figure: Configurable Testbed Infrastructure (CTI). Components: Query Configuration Module, Dataset Configuration Module, RSP Engine, Static Datastore, Smart City Data Streams, Performance Evaluator. Inputs: CityBench Queries, Smart City Applications. Output: Benchmark Results]

slide-16
SLIDE 16

CityBench Benchmarking Suite

  • CityBench is designed to evaluate RSP engines for Smart City Applications
  • It comprises
  • 7 real-time smart city datasets containing live RDF streams
  • Configurable Testbed Infrastructure with 6 parameters
  • 13 queries for 3 smart city applications e.g. Travel Planner, Parking Finder and CityDashboard

slide-17
SLIDE 17

CityBench Benchmarking Suite

  • CityBench Datasets
  • Vehicle Traffic
  • Parking
  • Weather
  • Pollution
  • Cultural Events
  • Library Events
  • User Location Stream

slide-18
SLIDE 18

CityBench Benchmarking Suite- CTI

  • Configuration Parameters
  • Changes in Input Streaming Rate
  • Play Back Time
  • Variable Background Data Sizes
  • Number of Concurrent Queries
  • Number of Streams within a Single Query
  • Selection of the RSP Engine
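
The six CTI parameters listed above could be captured in a single configuration object. The sketch below is illustrative only: the class name, field names, defaults, and validation rules are assumptions made for this example and are not CityBench's actual configuration format.

```python
from dataclasses import dataclass


# Hypothetical names throughout; CityBench's real configuration keys may differ.
@dataclass(frozen=True)
class CtiConfig:
    engine: str = "cqels"          # selection of the RSP engine
    rate_factor: float = 1.0       # change in input streaming rate
    playback_minutes: int = 15     # playback time (assumed to be a duration)
    background_data_mb: int = 3    # background data size
    concurrent_queries: int = 1    # number of concurrent queries
    streams_per_query: int = 2     # number of streams within a single query

    def validate(self) -> None:
        """Reject settings that cannot describe a runnable experiment."""
        if self.engine not in {"cqels", "csparql"}:
            raise ValueError(f"unknown engine: {self.engine}")
        if self.rate_factor <= 0 or self.playback_minutes <= 0:
            raise ValueError("rate and playback time must be positive")
        if min(self.concurrent_queries, self.streams_per_query) < 1:
            raise ValueError("need at least one query and one stream")


config = CtiConfig(engine="csparql", concurrent_queries=20, background_data_mb=30)
config.validate()
```

Keeping all parameters in one immutable object makes each benchmark run reproducible from a single value.
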
slide-19
SLIDE 19

CityBench Evaluation

  • We evaluated 2 state-of-the-art RSP engines
  • CQELS
  • C-SPARQL
  • Both engines were tested for their
  • Latency
  • Memory Consumption
  • Completeness
  • Different settings were obtained by tuning the CTI parameters
  • Number of queries, users, background data size etc.

slide-20
SLIDE 20

CityBench Evaluation : Latency

  • Latency over Increasing Number of Input Streams

[Figure: latency (ms) over experiment time (minutes, 1 to 15); series: Q10_2-cqels, Q10_5-cqels, Q10_2-csparql, Q10_5-csparql, Q10_8-csparql]

slide-21
SLIDE 21

CityBench Evaluation : Latency

  • Latency over Increasing Number of Concurrent Queries
  • CQELS: Q1, Q5 and Q8

[Figure: two panels of latency (ms) over experiment time (minutes, 1 to 15); series: Q1, Q1-10, Q1-20 and Q5, Q5-10, Q5-20, Q8, Q8-10, Q8-20]

slide-22
SLIDE 22

CityBench Evaluation : Latency

  • Latency over Increasing Number of Concurrent Queries
  • C-SPARQL: Q1, Q5 and Q8

[Figure: two panels of latency (ms) over experiment time (minutes, 1 to 15); series: Q1, Q1-10, Q1-20 and Q5, Q5-10, Q5-20, Q8]

slide-23
SLIDE 23

CityBench Evaluation : Memory Consumption

  • Memory Consumption over Increasing the Number of Concurrent Queries

[Figure: two panels of memory (MB) over experiment time (minutes, 1 to 15); series: Q1, Q1-20, Q5, Q5-20 and Q1, Q1-20, Q5-1, Q5-20]

slide-24
SLIDE 24

CityBench Evaluation : Memory Consumption

  • Memory Consumption over Increasing the Size of Background Data

[Figure: memory (MB) over experiment time (minutes, 1 to 15); series: 3MB-cqels, 20MB-cqels, 30MB-cqels, 3MB-csparql, 20MB-csparql, 30MB-csparql]

slide-25
SLIDE 25

CityBench Evaluation: Completeness

  • Completeness over Increasing Stream Input Rate

[Figure: completeness (%) over stream input rate (30, 60, 90, 120, 150 triples/s) for cqels and csparql; data labels in extraction order: 91.4, 82.4, 73.2, 74.2, 54.4 and 98, 97, 96, 97, 96]

slide-26
SLIDE 26

RDF Stream Processing (RSP) : Challenges

  • Optimal Data Source Discovery
  • Streams are everywhere
  • Multiple data streams can answer the same query
  • Optimal data stream selection
  • Catering for user-defined constraints and preferences
  • On-Demand Stream Federation
  • Automated composition of primitive data streams to answer complex queries
  • Adaptation
  • Data source properties can change over time
  • Make sure selected sources remain “optimal” throughout the life cycle of the query

slide-27
SLIDE 27

Stream Discovery, Federation and Adaptation

  • Stream Discovery
  • Data interoperability:

– Semantic descriptions (ontologies and annotations)

  • Interface interoperability:

– Streams as event services (service discovery)

  • Stream Federation
  • Efficient processing of complicated event logics

– Data Stream Management Systems
– Complex Event Processing

  • Adaptation
  • Continuous monitoring to observe constraint violations
  • Trigger adaptation mechanism to select a new optimal data stream

Semantic Web, Service Oriented Architectures
DSMS and CEP
Continuous constraint checking
Semantic Web + Service Oriented Architecture + Complex Event Processing

slide-28
SLIDE 28

Motivation – Smart City Applications

  • Detecting complex events in real-time by answering continuous queries over data streams.

✓ Input data heterogeneity + Background knowledge integration + Output data reusability + Platform independence

[Figure: Event Providers (Internet of Things, Internet of People) expose events E1, E2, E3 through Service Wrappers with Semantic Annotation (e1 rdf:type E1 …) as Primitive Event Services (PES); RSP engines (CQELS engine, CSPARQL engine) combine them (E1 and E2, E2 && E3) into Complex/Composite Event Services (CES), using Ontologies & Service Descriptions (E1 subClassOf E4); the Event Consumer requests E1 ∧ E2 ∧ E3 from the event engine. Layers: Complex Event Processing (CEP), RDF Stream Processing (RSP), Semantic Event Service (SES). Accuracy constraints: C1: Acc >= 90%, C2: Acc >= 85%, C3: Acc >= 95%. Automatic adaptations?]

slide-29
SLIDE 29

Automated Complex Event Implementation System (ACEIS)

[Figure: ACEIS architecture. Labels: Application Interface (User Input, Event Request, Query Results); Semantic Annotation (IoT Data Stream, Social Data Stream); Resource Management (Resource Discovery, Event Service Composer, Composition Plan); Data Federation (Subscription Manager, Query Transformer, Query Engine); Adaptation Manager (Constraint Validation, Constraint Violation); Knowledge Base (QoI/QoS, Stream Description); Data Mgmt, Indexing, Caching; Data Store; ACEIS Core]

slide-30
SLIDE 30

Summary of the Approach

  • How to describe complex event services?
  • Create an Event Service Ontology with Event Patterns.
  • How to determine if two event patterns are functionally equivalent?
  • Create and compare canonical event patterns to find substitutes.
  • How to create event compositions and choose the optimal one?
  • Top-down traversal to find functionally-equivalent canonical patterns.
  • How to derive event service compositions efficiently?
  • Construct and utilize an Event Reusability Hierarchy for event service composition.
  • How to ensure the best remains the best?
  • Monitor user-defined constraints and trigger the adaptation mechanism if constraints are violated.

slide-31
SLIDE 31

Complex Event Service Ontology

[Figure: ontology diagram. Classes: EventService, PrimitiveEventService, ComplexEventService, EventProfile, EventRequest, Pattern, NFP, Constraint, Preference, QosWeightPreference, owls:Service, owls:ServiceProfile, owls:Grounding, owls-sp:ServiceParameter. Object properties: owls:supports, owls:presents, owls-sp:serviceParameter, hasPattern, hasNFP, hasConstraint, hasPreference, rdf:_x (contains). Data property: hasWeight (xsd:double). Legend: Class, Object property, Data property, subClassOf]

Namespaces:
default: <http://www.insight-centre.org/ces#>
rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
owls: <http://www.daml.org/services/owl-s/1.2/Service.owl#>
owls-sp: <http://www.daml.org/services/owl-s/1.2/ServiceParameter.owl#>

slide-32
SLIDE 32

Annotation of sensor streams

  • A sensor service description is annotated as:

$s_{desc} = (t_d, g, q_d, P_d, FoI_d, f_d)$

where $t_d$ is the type, $g$ the grounding, $q_d$ the QoS, $P_d$ the observed properties, $FoI_d$ the features of interest, and $f_d : P_d \to FoI_d$.

  • Similarly, a sensor service request is annotated as:

$s_r = (t_r, P_r, FoI_r, f_r, pref, C)$

where $t_r$ is the type, $P_r$ the requested properties, $FoI_r$ the features of interest, $f_r : P_r \to FoI_r$, and $pref$ and $C$ the NFP preferences and constraints. A request has no grounding.
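
To make the two tuples concrete, here is a minimal sketch that models them as Python dataclasses and checks a description against a request. The matching rule is an assumption made for illustration (exact type equality, identical property-to-feature mappings, and constraints read as minimum QoS values); ACEIS itself relies on semantic, ontology-based matching, and all names and values below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SensorDescription:           # s_desc = (t_d, g, q_d, P_d, FoI_d, f_d)
    type: str                      # t_d
    grounding: str                 # g, e.g. where the stream can be accessed
    qos: dict[str, float]          # q_d
    prop_to_foi: dict[str, str]    # f_d: keys are P_d, values are FoI_d


@dataclass
class SensorRequest:               # s_r = (t_r, P_r, FoI_r, f_r, pref, C)
    type: str                      # t_r
    prop_to_foi: dict[str, str]    # f_r: keys are P_r, values are FoI_r
    preferences: dict[str, float] = field(default_factory=dict)   # pref
    constraints: dict[str, float] = field(default_factory=dict)   # C (assumed minimums)


def matches(desc: SensorDescription, req: SensorRequest) -> bool:
    """Assumed rule: same type, every requested property observed on the same
    feature of interest, and every constraint met by the advertised QoS."""
    if desc.type != req.type:
        return False
    for prop, foi in req.prop_to_foi.items():
        if desc.prop_to_foi.get(prop) != foi:
            return False
    return all(desc.qos.get(name, float("-inf")) >= minimum
               for name, minimum in req.constraints.items())


desc = SensorDescription("TrafficSensor", "http://example.org/stream/1",
                         {"accuracy": 0.95}, {"AvgSpeed": "road_42"})
request = SensorRequest("TrafficSensor", {"AvgSpeed": "road_42"},
                        constraints={"accuracy": 0.9})
strict = SensorRequest("TrafficSensor", {"AvgSpeed": "road_42"},
                       constraints={"accuracy": 0.99})
```
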

slide-33
SLIDE 33

On-Demand Stream Federation

  • Event Request:
  • User/Application defines an event request using the CES Ontology
  • Procedure:
  • Derive canonical forms of event patterns of CESs.
  • Apply tree isomorphism algorithms over the canonical event patterns and the event request to identify reusable or equivalent event patterns.
  • Generate all possible composition plans.
  • Aggregate NFPs and compare aggregated NFP values against constraints on the event request to filter out unsatisfied composition plans.
  • Optimization using Genetic Algorithm (GA)
  • Rank the remaining composition plans based on preferences (soft constraints).
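
The last steps of the procedure (aggregate NFPs, filter by constraints, rank by preferences) can be sketched as follows. This is a simplified illustration under stated assumptions: the plan data, the aggregation rules (latencies add, accuracies multiply), and the additive weighting used for ranking are all hypothetical and are not taken from ACEIS.

```python
from math import prod

# Hypothetical plans: each plan is a list of services with their NFP values.
plans = {
    "plan_a": [{"latency": 40, "accuracy": 0.95}, {"latency": 60, "accuracy": 0.90}],
    "plan_b": [{"latency": 20, "accuracy": 0.80}, {"latency": 30, "accuracy": 0.85}],
    "plan_c": [{"latency": 90, "accuracy": 0.99}, {"latency": 80, "accuracy": 0.98}],
}


def aggregate(plan):
    # Assumed aggregation rules: latencies add up, accuracies multiply.
    return {
        "latency": sum(s["latency"] for s in plan),
        "accuracy": prod(s["accuracy"] for s in plan),
    }


def satisfies(nfp, constraints):
    # Hard constraints: an upper bound on latency, a lower bound on accuracy.
    return (nfp["latency"] <= constraints["max_latency"]
            and nfp["accuracy"] >= constraints["min_accuracy"])


def score(nfp, weights, max_latency):
    # Simple additive weighting; latency is normalised so that lower is better.
    return (weights["accuracy"] * nfp["accuracy"]
            + weights["latency"] * (1 - nfp["latency"] / max_latency))


constraints = {"max_latency": 150, "min_accuracy": 0.6}
weights = {"accuracy": 0.7, "latency": 0.3}   # soft preferences

aggregated = {name: aggregate(plan) for name, plan in plans.items()}
feasible = {n: nfp for n, nfp in aggregated.items() if satisfies(nfp, constraints)}
ranking = sorted(
    feasible,
    key=lambda n: score(feasible[n], weights, constraints["max_latency"]),
    reverse=True,
)
```

Here `plan_c` is filtered out because its aggregated latency exceeds the bound, and the two remaining plans are ordered by their weighted score.
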

slide-34
SLIDE 34

On-Demand Stream Federation

[Figure: getCompletePattern() expands a pattern that references event services ES1, ES2, ES3 into a complete event pattern tree over the events e1, e2, e3, e4 with Or, And and Seq operators]

slide-35
SLIDE 35

On-Demand Stream Federation

  • Create event reusability hierarchy
  • Reusable relation: R(ep1,ep2) holds if Rd(ep1,ep2) or Ri(ep1,ep2) holds.

[Equations: formal definitions of the relations Rd and Ri and of the hierarchy construction]

[Figure: three example event pattern trees over e1 to e4 built with SEQ and OR, labelled directly reusable, indirectly reusable, and indirectly reusable]

slide-36
SLIDE 36

Stream Federation: Composition Plan Generation

[Figure: a Query pattern over e1, e2, e3 (SEQ, OR) is matched against Event Service 1 to Event Service 4 (type=e1 loc=loc1, type=e2 loc=loc2, type=e3 loc=loc3, type=e4 loc=loc4) to produce a Composition Plan]

slide-37
SLIDE 37

Stream Federation: Composition Plan Generation

– Brute-Force Enumeration:

  • global optimum, poor scalability.

– Integer Programming (Zeng et al. 2004, Berbner et al. 2006, Alrifai et al. 2009):

  • near-optimal global planning, improved scalability,
  • requires QoS metrics to be linear,
  • operates on a fixed set of service classes in a composition plan,
  • does not perform well in dynamic environments.

– Genetic Algorithm (Canfora et al. 2004, Wu et al. 2013):

  • near-optimal global planning, good scalability,
  • does not require linear QoS metrics,
  • can provide composition plans with services with different granularity levels,
  • can adapt to changes effortlessly,
  • can achieve ~89% optimal results in 0-2 seconds using default settings in our approach.

slide-38
SLIDE 38

Stream Federation: Composition Plan Generation

  • Define fitness function
  • Population initialisation
  • Selection of individuals based on fitness
  • Crossover genetic encodings of selected individuals
  • Mutation
  • Set derived results as the next generation
  • Termination

Annotations: QoS aggregation schema based on patterns; ERH based
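
The generic loop behind these steps can be sketched as a small genetic algorithm. This is a deliberately simplified stand-in, not the ACEIS implementation: it uses a flat chromosome (one service index per primitive event), single-point crossover, and a made-up fitness function, whereas the approach described in the slides uses pattern-based QoS aggregation and ERH-based operations. All data and parameter values are hypothetical.

```python
import random

# Hypothetical candidates: for each primitive event, services as (latency, accuracy).
CANDIDATES = [
    [(40, 0.90), (25, 0.80), (60, 0.99)],   # services able to deliver e1
    [(30, 0.85), (50, 0.97)],               # services able to deliver e2
    [(20, 0.70), (35, 0.92), (45, 0.95)],   # services able to deliver e3
]


def fitness(plan):
    # Assumed fitness: accuracies multiply, latencies add; higher is better.
    latency = sum(CANDIDATES[i][g][0] for i, g in enumerate(plan))
    accuracy = 1.0
    for i, g in enumerate(plan):
        accuracy *= CANDIDATES[i][g][1]
    return accuracy - latency / 1000


def random_plan(rng):
    return [rng.randrange(len(options)) for options in CANDIDATES]


def evolve(generations=30, size=8, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    population = [random_plan(rng) for _ in range(size)]   # initialisation
    for _ in range(generations):                           # termination: fixed budget
        population.sort(key=fitness, reverse=True)
        parents = population[: size // 2]                  # selection by fitness
        children = []
        while len(parents) + len(children) < size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(CANDIDATES))        # single-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < mutation_rate:               # mutation: swap one service
                i = rng.randrange(len(child))
                child[i] = rng.randrange(len(CANDIDATES[i]))
            children.append(child)
        population = parents + children                    # next generation
    return max(population, key=fitness)


best = evolve()
```

Because the fitter half of each generation is carried over unchanged, the best fitness found never decreases from one generation to the next.
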

slide-39
SLIDE 39

Stream Federation: Composition Plan Generation

[Figure: crossover example for a Query pattern over e1, e2, e3 (SEQ, OR). Parent plans P_1 and P_2 and child plans C_1 and C_2 are drawn as trees over nodes n1 to n13, with a marked Cross Point, Reusable Node and Picked Leaf; ERF Space and CCP Space map nodes to event services es1 to es5.
Chromosomes before crossover: n7:<n6,es1>, n8:<n6,es2>. and n12:<n9n10,es3>, n13:<n9n10,es4>, n11:<n9,es5>.
Chromosomes after crossover: n12:<n9n10,es3>, n13:<n9n10,es4>, n8:<n6,es2>. and n7:<n6,es1>, n11:<n9,es5>.]

slide-40
SLIDE 40

Adaptation in Stream Processing

Why do we need adaptation?

  • Ensure that the “best” remains the “best”
  • Improves robustness

Technical Adaptation in Stream Federation

  • Monitoring quality updates of streaming sources
  • Evaluate criticality of the update (based on query-related constraints and requirements)
  • React to this change (discover new streaming sources)
slide-41
SLIDE 41

Adaptation in Stream Processing

Constraint Validation

  • Degradation in stream quality is treated as a constraint violation
  • Better performance, but better-quality streams that become available may not be considered

Adaptation Manager deals with constraint violations:

  • Switching to alternative streams: only candidate streams selected by the composition plan are considered as substitutes for stream switching
  • Re-generation of the composition plan, considering all the available (registered) streams.
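
The two reactions to a constraint violation can be sketched as a small decision function. This is an illustrative sketch only: the function names, the reading of constraints as minimum quality values, and the `replan` callback are assumptions, not the Adaptation Manager's real interface.

```python
def violated(quality, constraints):
    """Return the names of constraints whose minimum value is no longer met."""
    return [name for name, minimum in constraints.items()
            if quality.get(name, 0.0) < minimum]


def adapt(current, quality_update, constraints, candidates, replan):
    """Choose a stream after a quality update for the current stream."""
    if not violated(quality_update, constraints):
        return current                           # no violation: keep the stream
    for name, quality in candidates.items():     # first: switch to a known candidate
        if not violated(quality, constraints):
            return name
    return replan()                              # otherwise: re-generate the plan


constraints = {"accuracy": 0.9}
candidates = {"stream_b": {"accuracy": 0.85}, "stream_c": {"accuracy": 0.93}}
chosen = adapt("stream_a", {"accuracy": 0.7}, constraints, candidates,
               replan=lambda: "replanned")
```

Trying the plan's own candidates first keeps the common case cheap; full re-planning over all registered streams is the fallback.
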

slide-42
SLIDE 42

Query Transformation: Semantic Alignment

Goal: transform the composition plan into a stream query evaluated by a stream reasoning engine over RDF data streams

  • Requirements:

– Matching event pattern operators to stream query operators
– Transformation Algorithm

  • Alignments for CQELS, C-SPARQL and ETALIS:
  • Sequence and Repetition not supported by CQELS.
  • Sensor requests mapped to StreamGraphPattern (CQELS) and GroupGraphPattern (C-SPARQL).
  • AND operator mapped to stream join.
  • OR operator mapped to OPTIONAL keyword (left-outer-join).
slide-43
SLIDE 43

ACEIS in Practice

Input

  • Complex Event Request (functional & non-functional properties)
  • Sensor Metadata Repository (including quality updates)
  • Sensor Data Streams (semantically annotated data streams)

Process

  • Discover relevant data streams (list of candidate data streams)
  • Rank the candidate data streams (evaluate constraints & preferences)
  • On-demand Stream Federation (composition plan using BF/GA)

Output

  • Federated Output Data Stream

GitHub Source Code: https://github.com/CityPulse/Stream-Discovery-and-Integration-Middleware

slide-44
SLIDE 44

Query Scheduler in ACEIS

  • Goal: Deploy multiple RSP engine instances and leverage load balancing techniques to increase the capacity of the ACEIS server.
  • Multiple load balancing techniques applied:

– Equalised Queries (EQ): initialise multiple engines upfront and deploy the same number of queries on each instance.

[Figure: query scheduler in Data Federation. Components: Subscription Manager, Composition Plan, Query Transformer, Scheduler, Dispatcher, Query Monitor, Query Engine instances. Calls: getEngine(), engineID, deploy(qid,engineID), sendStats(), createEngineInstance()]

slide-45
SLIDE 45

Query Scheduler in ACEIS

  • Multiple load balancing techniques applied:

– Elastic (EL): define a maximum number of queries deployed on an instance and create instances on demand.

slide-46
SLIDE 46

Query Scheduler in ACEIS

  • Multiple load balancing techniques applied:

– Balanced Latency (BL): deploy the query on the instance with the lowest average latency.

slide-47
SLIDE 47

Query Scheduler in ACEIS

  • Multiple load balancing techniques applied:

– Elastic-Balanced-Latency (EBL): combination of EL and BL, the best strategy.
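
The four strategies (EQ, EL, BL, EBL) differ only in how the scheduler picks an engine for a new query, which the following sketch tries to make explicit. The class and attribute names are hypothetical and the latency value is a plain field here; in ACEIS it would come from monitoring, and the real scheduler interface may look quite different.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)                 # identity comparison, so index() finds the instance
class Engine:
    queries: list = field(default_factory=list)
    avg_latency_ms: float = 0.0      # assumed to be reported by a query monitor


class Scheduler:
    def __init__(self, strategy, initial_engines=2, max_queries=3):
        self.strategy = strategy     # "EQ", "EL", "BL" or "EBL"
        self.max_queries = max_queries
        self.engines = [Engine() for _ in range(initial_engines)]

    def _open(self):
        # Engines that still have room under the per-instance limit.
        return [e for e in self.engines if len(e.queries) < self.max_queries]

    def get_engine(self):
        if self.strategy == "EQ":    # keep query counts equal across instances
            return min(self.engines, key=lambda e: len(e.queries))
        if self.strategy == "BL":    # lowest average latency wins
            return min(self.engines, key=lambda e: e.avg_latency_ms)
        candidates = self._open()
        if not candidates:           # EL and EBL: create an instance on demand
            self.engines.append(Engine())
            return self.engines[-1]
        if self.strategy == "EBL":   # among engines with room, lowest latency
            return min(candidates, key=lambda e: e.avg_latency_ms)
        return candidates[0]         # EL: first instance with spare capacity

    def deploy(self, qid):
        engine = self.get_engine()
        engine.queries.append(qid)
        return self.engines.index(engine)   # engine id


eq = Scheduler("EQ", initial_engines=2)
eq_ids = [eq.deploy(f"q{i}") for i in range(4)]

el = Scheduler("EL", initial_engines=1, max_queries=2)
el_ids = [el.deploy(f"q{i}") for i in range(5)]
```
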

slide-48
SLIDE 48

Query Scheduler in ACEIS

  • Performance evaluation with EBL strategy
  • Increased capacity of CQELS from 30 to about 1000
  • Increased capacity of CSPARQL from 30 to about 90

slide-49
SLIDE 49

  • Further RSP Optimisation Opportunities
  • Sharing resources among multiple concurrent queries

– One stream buffer shared among multiple queries
– Common triple patterns in multiple queries handled once
– Join Optimisation for multiple concurrent queries
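
As a rough illustration of handling common triple patterns once, the sketch below indexes each distinct pattern a single time and records which queries subscribe to it, so an incoming triple is matched against every distinct pattern only once regardless of how many queries share it. The queries, prefixes, and matching function are hypothetical simplifications; no existing RSP engine is being described here.

```python
from collections import defaultdict

# Hypothetical queries, each reduced to a set of triple patterns.
queries = {
    "q1": {("?s", "rdf:type", ":TrafficSensor"), ("?s", ":avgSpeed", "?v")},
    "q2": {("?s", "rdf:type", ":TrafficSensor"), ("?s", ":congestion", "?c")},
}

# Each distinct pattern is registered once, with the queries subscribed to it.
subscribers = defaultdict(set)
for qid, patterns in queries.items():
    for pattern in patterns:
        subscribers[pattern].add(qid)


def matches(pattern, triple):
    # A pattern term matches if it is a variable ("?x") or equals the triple's term.
    return all(p.startswith("?") or p == t for p, t in zip(pattern, triple))


def route(triple):
    """Evaluate every distinct pattern once and return the queries to notify."""
    notified = set()
    for pattern, qids in subscribers.items():
        if matches(pattern, triple):
            notified |= qids
    return notified
```

With these two queries there are four pattern occurrences but only three distinct patterns, so the shared `rdf:type` pattern is evaluated once per triple instead of twice.
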

slide-50
SLIDE 50

Pre-processing Input Queries

slide-51
SLIDE 51

  • Further RSP Optimisation Opportunities
  • Trade-off between latency and consistency

– Querying a remote endpoint can be resource intensive
– SPARQL endpoints containing quasi-static knowledge bases can impose restrictions on the number of service calls
– A maintenance policy is needed to get the best outcome from the trade-off between latency and consistency.

slide-52
SLIDE 52

Mediator system: Highest consistency with a latency threshold

[Figure: an RDF Stream Generator feeds a Window; the window contents are joined (Join) with a Local View of the Background data (SPARQL endpoint)]

slide-53
SLIDE 53

Mediator system: Highest consistency with a latency threshold

[Figure: an RDF Stream Generator feeds a Window; the window contents are joined (Join) with a Local View of the Background data (SPARQL endpoint). A Maintenance Process refreshes the Local View: freshness decreases over time, and each Refresh is a cost/quality trade-off]
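
One way to picture a maintenance process under a latency threshold is a local view that may refresh only a bounded number of entries per window. The sketch below assumes a "stalest first" policy and a fixed per-window budget of remote calls; both are illustrative assumptions rather than the policy used in the presented system, and `fetch` stands in for a call to the remote SPARQL endpoint.

```python
class LocalView:
    def __init__(self, fetch, budget):
        self.fetch = fetch          # stand-in for a remote SPARQL endpoint call
        self.budget = budget        # max remote calls per window (latency threshold)
        self.values = {}            # key -> cached background value
        self.refreshed_at = {}      # key -> time of last refresh

    def maintain(self, keys, now):
        """Refresh the stalest keys needed by the current window, within budget."""
        ordered = sorted(sorted(set(keys)),
                         key=lambda k: self.refreshed_at.get(k, float("-inf")))
        for key in ordered[: self.budget]:
            self.values[key] = self.fetch(key)
            self.refreshed_at[key] = now

    def join(self, window, now):
        """Join stream items (key, payload) with the possibly stale local view."""
        self.maintain([key for key, _ in window], now)
        return [(key, payload, self.values.get(key)) for key, payload in window]


calls = []


def fake_fetch(key):
    # Hypothetical endpoint: records the call and returns a dummy value.
    calls.append(key)
    return key.upper()


view = LocalView(fake_fetch, budget=1)
first = view.join([("a", 1), ("b", 2)], now=0)
second = view.join([("a", 3), ("b", 4)], now=1)
```

A larger budget gives fresher join results at the cost of more remote calls per window; a budget of zero never contacts the endpoint.
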

slide-54
SLIDE 54

  • Further RSP Optimisation Opportunities
  • Trade-off between expressivity and scalability

– RSP brings the power of reasoning
– Reasoning tools need to be optimised to deal with large volumes of data with high velocity and variety
– RSP can help improve RDF stream reasoning
– Finding the right trade-off between stream processing and stream reasoning can be key

slide-55
SLIDE 55

Introduction – Stream reasoning

Stream reasoning: Stream processing + Knowledge Representation & Reasoning

Stream processing

+ Processes streams
– Lacks complex reasoning tasks

Knowledge Representation & Reasoning

+ Performs complex reasoning tasks
– Mainly on static data

slide-56
SLIDE 56

Scalability Vs Expressivity Tradeoff

[Figure: Data Streams pass through Stream Query Processing (relevant events), Semantic Complex Event Processing (complex events) and Stream Reasoning (solution sets) on the way to Applications; Scalability and Expressivity are shown as opposing directions]

slide-57
SLIDE 57

The StreamRule idea

  • 2-tier approach: not all dynamic data streams are relevant for complex reasoning
  • Enrich the ability of complex reasoning over data streams
  • Keep the solution scalable
  • Leverage existing engines from both stream processing and non-monotonic reasoning research areas

slide-58
SLIDE 58

Approach

  • Leverage existing engines from both stream processing and non-monotonic reasoning research areas
  • Enable adaptation layer to enhance scalability

[Figure: 2-tier approach. Streams and a Query enter RDF Stream Processing, which passes a Filtered Stream to Rule-based Non-monotonic Reasoning (with a Logic Program), which produces Answers; Adaptation links the two tiers]

slide-59
SLIDE 59

StreamRule

  • Query Processing

[Figure: StreamRule architecture. Labels: Web of Data, RDF Files (e.g. maps), Sensor Streams, LSD Wrappers, C-SPARQL, CQELS, Clingo, Rule-based Expressive Reasoning, Controller, Application. Callout: Scalability requires adaptation!]

slide-60
SLIDE 60

  • Conclusions:
  • RDF Stream Processing provides an exciting opportunity for applications dealing with heterogeneous data
  • RSP engines are evolving and maturing
  • Optimisation is critical for RSP to be considered for large-scale real-time applications.
  • Distributed large-scale RSP engines
  • Clear language semantics are necessary for query optimisation