A Proposal for Publishing Data Streams as Linked Data



SLIDE 1

A Proposal for Publishing Data Streams as Linked Data

http://streamreasoning.org
http://wiki.larkc.eu/c-sparql/

Davide F. Barbieri, Emanuele Della Valle

DEI – Politecnico di Milano

dbarbieri@elet.polimi.it, emanuele.dellavalle@polimi.it

  • For more information visit http://wiki.larkc.eu/UrbanComputing

SLIDE 2

Introduction Real-Time Streams on the Web

  • Streams are appearing more and more often on the Web, in sites that distribute and present information in real-time streams.

  • E.g., http://twitter.com/#search?q=just%20landed%20in

  • Check out http://activitystrea.ms/ for a standard API

Emanuele Della Valle - visit http://streamreasoning.org

LDOW2010 @ WWW 2010, Raleigh, North Carolina, April 27th, 2010

SLIDE 3

Introduction Combining Streams and Static Information

  • We anticipate a rapidly growing need to mash up this streaming information with more static information.

  • E.g., Twitter + MetaCarta

[source: http://blog.blprnt.com/blog/blprnt/just-landed-processing-twitter-metacarta-hidden-data]
SLIDE 4

Background Managing Streams

  • Streams

– unbounded sequences of time-varying data elements

  • Stream Processing

– Continuous queries registered over streams that are observed through windows

SLIDE 5

State-of-the-Art Data Stream Management Systems (DSMS)

  • Research prototypes:

– STREAM http://infolab.stanford.edu/stream/
– Aurora http://www.cs.brown.edu/research/aurora/
– Borealis http://www.cs.brown.edu/research/borealis/public/

  • Some features are embedded in

– Oracle http://www.oracle.com/technology/products/dataint/htdocs/streams_fo.html
– DB2 http://www.eweek.com/c/a/Database/IBM-DB2-Turns-25-and-Prepares-for-New-Life/

  • Start-ups

– StreamBase http://www.streambase.com/

  • Open Source

– Esper http://esper.codehaus.org/
– Data Turbine http://www.dataturbine.org/

SLIDE 6

Background Continuous SPARQL (C-SPARQL)

  • What is it?

– an extension to SPARQL for continuous querying over (virtual) streams of RDF and static RDF graphs

  • Architecture of our C-SPARQL Engine

– Based on the Large Knowledge Collider (LarKC) conceptual framework

[Figure: C-SPARQL Engine architecture. Labels: Streamed Input, RDF Streams, Window, Window Content, RDF Graphs, Select, Abstract, Reason, Answers]

SLIDE 7

Background RDF Stream

  • RDF Stream Data Type

– Ordered sequence of pairs, where each pair is made of an RDF triple and its timestamp t

(< triple >, t)

  • E.g.,

(< :traveller1 :justLanded :placeA >, T1)
(< :traveller2 :justLanded :placeB >, T1)
(< :traveller3 :justLanded :placeA >, T2)
(< :traveller1 :justLanded :placeC >, T3)
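The data type above can be sketched in a few lines of Python. This is only an illustration, not the engine's representation: the class names and the use of integers 1, 2, 3 in place of T1, T2, T3 are assumptions.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class StreamItem(NamedTuple):
    triple: Triple
    timestamp: int  # hypothetical integer ticks standing in for T1, T2, T3


# The four pairs from the slide, in arrival order.
stream: List[StreamItem] = [
    StreamItem(Triple(":traveller1", ":justLanded", ":placeA"), 1),
    StreamItem(Triple(":traveller2", ":justLanded", ":placeB"), 1),
    StreamItem(Triple(":traveller3", ":justLanded", ":placeA"), 2),
    StreamItem(Triple(":traveller1", ":justLanded", ":placeC"), 3),
]


def is_ordered(items: List[StreamItem]) -> bool:
    """True when timestamps never decrease (several triples may share one)."""
    return all(a.timestamp <= b.timestamp for a, b in zip(items, items[1:]))
```

Note that two triples share timestamp T1, so the ordering is non-decreasing rather than strictly increasing.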

SLIDE 8

Background An Example of C-SPARQL Query

Who has landed in USA in the last hour?

REGISTER QUERY WhoHasLandedInUSAinTheLastHour AS
PREFIX gno: <http://www.geonames.org/ontology#>
PREFIX c: <http://www.geonames.org/countries/#>
PREFIX : <http://example>
SELECT ?traveller ?place ?type
FROM <http://sws.geonames.org/nonExistingUSfeatureGraph>
FROM STREAM <http://someStreamGeneratedFromTwitter> [ RANGE 60m STEP 5m ]
WHERE {
  ?traveller :justLanded ?place .
  ?place gno:inCountry c:US .
  ?place gno:featureCode ?type .
}


SDoW @ ISWC 2009, Washington, USA - 25-10-2009

SLIDE 9

Background An Example of C-SPARQL Query Explained

Who has landed in USA in the last hour?

# Query registration (for continuous execution)
REGISTER QUERY WhoHasLandedInUSAinTheLastHour AS
PREFIX gno: <http://www.geonames.org/ontology#>
PREFIX c: <http://www.geonames.org/countries/#>
PREFIX : <http://example>
SELECT ?traveller ?place ?type
FROM <http://sws.geonames.org/nonExistingUSfeatureGraph>
# FROM STREAM clause; the WINDOW is the part in square brackets
FROM STREAM <http://someStreamGeneratedFromTwitter> [ RANGE 60m STEP 5m ]
WHERE {
  # Triples from a stream
  ?traveller :justLanded ?place .
  # Combined with triples from an RDF graph
  ?place gno:inCountry c:US .
  ?place gno:featureCode ?type .
}
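As a rough illustration of what one evaluation of this query computes, the Python sketch below joins the triples currently in the window with a static graph. It is not the C-SPARQL Engine: the function name, the minute-based timestamps, the half-open window (now - range, now], and the :city / :airport feature codes are all assumptions made for the example.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]
StreamItem = Tuple[Triple, int]  # (triple, timestamp in minutes): an assumption


def who_has_landed_in_us(stream: List[StreamItem], static_graph: Set[Triple],
                         now: int, range_minutes: int = 60) -> List[Triple]:
    """Join the triples inside the window (now - range, now] with the static graph."""
    answers = []
    for (traveller, predicate, place), t in stream:
        if predicate != ":justLanded" or not (now - range_minutes < t <= now):
            continue  # not a landing, or outside the window
        if (place, "gno:inCountry", "c:US") not in static_graph:
            continue  # the place is not in the US
        for s, p, feature in sorted(static_graph):
            if s == place and p == "gno:featureCode":
                answers.append((traveller, place, feature))
    return answers


# Hypothetical static graph and stream (timestamps in minutes).
static_graph = {
    (":placeA", "gno:inCountry", "c:US"),
    (":placeA", "gno:featureCode", ":city"),
    (":placeB", "gno:inCountry", "c:IT"),
    (":placeC", "gno:inCountry", "c:US"),
    (":placeC", "gno:featureCode", ":airport"),
}
stream = [
    ((":traveller1", ":justLanded", ":placeA"), 10),
    ((":traveller2", ":justLanded", ":placeB"), 10),
    ((":traveller3", ":justLanded", ":placeA"), 50),
    ((":traveller1", ":justLanded", ":placeC"), 100),
]
```

With STEP 5m, a continuous engine would repeat this evaluation every five minutes with a new value of `now`; the sketch only shows a single evaluation.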

SLIDE 10

Proposal Streaming Linked Data

  • What

– an extension of our C-SPARQL Engine that publishes data streams as Linked Data

  • Architecture

[Figure: architecture diagram. Components: HTML Clients, Linked Data Clients, Remote C-SPARQL Clients, Local C-SPARQL Clients, Streaming Linked Data Server, C-SPARQL Engine, Data Streams, RDF Streams, RDF Graphs. Connection labels: HTTP, HTML, RDF, REST, C-SPARQL, Java]

SLIDE 11

Streaming Linked Data Raw Data Stream

  • The problem

– How to publish an RDF Stream as Linked Data?

  • Proposal

– Use Named Graphs
– A Stream Graph (s-graph)

  • a metadata graph that describes the current content of the window over the stream

– Several Instantaneous Graphs (i-graphs)

  • one for each timestamp

– rdfs:seeAlso is used (and reserved) to link the graphs

– A Streaming Linked Data Vocabulary is used to describe the content of the graphs

SLIDE 12

Streaming Linked Data Raw Data Stream - Example A s-Graph (only metadata) and …

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix sld: <http://www.streaminglinkeddata.org/schema#> .
@prefix : <http://example/> .

:sgraph sld:lastUpdate "T3"^^xsd:dateTime ;
        sld:expires "T4"^^xsd:dateTime ;
        sld:windowType sld:logicalTumbling ;
        sld:windowSize "PT1H"^^xsd:duration .

:sgraph rdfs:seeAlso :igraph1 .
:igraph1 sld:receivedAt "T1"^^xsd:dateTime .
:sgraph rdfs:seeAlso :igraph2 .
:igraph2 sld:receivedAt "T2"^^xsd:dateTime .
:sgraph rdfs:seeAlso :igraph3 .
:igraph3 sld:receivedAt "T3"^^xsd:dateTime .

SLIDE 13

Streaming Linked Data Raw Data Stream - Example … and three i-Graphs (triples + few metadata)

igraph1

:igraph1 sld:receivedAt "T1"^^xsd:dateTime ;
         rdfs:seeAlso :sgraph .
:traveller1 :justLanded :placeA .
:traveller2 :justLanded :placeB .

igraph2

:igraph2 sld:receivedAt "T2"^^xsd:dateTime ;
         rdfs:seeAlso :sgraph .
:traveller3 :justLanded :placeA .

igraph3

:igraph3 sld:receivedAt "T3"^^xsd:dateTime ;
         rdfs:seeAlso :sgraph .
:traveller1 :justLanded :placeC .
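The split into one s-graph and several i-graphs can be sketched as a grouping step over the stream. The Python below is a minimal illustration under stated assumptions: `build_graphs` is a hypothetical helper, triples are plain tuples of prefixed names, timestamps are the symbolic strings T1..T3, and i-graphs are numbered in arrival order.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def build_graphs(stream: List[Tuple[Triple, str]], expires: str,
                 window_type: str = "sld:logicalTumbling",
                 window_size: str = "PT1H"):
    """Group stream triples by timestamp into i-graphs and describe them in an s-graph."""
    by_time: Dict[str, List[Triple]] = {}
    for triple, t in stream:  # dicts keep insertion order, so arrival order is preserved
        by_time.setdefault(t, []).append(triple)

    links: List[Triple] = []
    igraphs: Dict[str, List[Triple]] = {}
    last_update = None
    for n, (t, triples) in enumerate(by_time.items(), start=1):
        name = f":igraph{n}"
        # The s-graph links to each i-graph with rdfs:seeAlso ...
        links += [(":sgraph", "rdfs:seeAlso", name), (name, "sld:receivedAt", t)]
        # ... and each i-graph links back, followed by its stream triples.
        igraphs[name] = [(name, "sld:receivedAt", t),
                         (name, "rdfs:seeAlso", ":sgraph")] + triples
        last_update = t

    sgraph = [(":sgraph", "sld:lastUpdate", last_update),
              (":sgraph", "sld:expires", expires),
              (":sgraph", "sld:windowType", window_type),
              (":sgraph", "sld:windowSize", window_size)] + links
    return sgraph, igraphs


stream = [
    ((":traveller1", ":justLanded", ":placeA"), "T1"),
    ((":traveller2", ":justLanded", ":placeB"), "T1"),
    ((":traveller3", ":justLanded", ":placeA"), "T2"),
    ((":traveller1", ":justLanded", ":placeC"), "T3"),
]
sgraph, igraphs = build_graphs(stream, expires="T4")
```

For the four example pairs this yields three i-graphs, with the first one holding both triples received at T1.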

SLIDE 14

Streaming Linked Data Raw Data Stream – naming the graphs

  • Patterns

– s-graphs: http://ex.org/%stream-name%
– i-graphs: http://ex.org/%stream-name%/%timestamp%

  • Example

– s-graph
http://ex.org/just-landed-in
– i-graphs
http://ex.org/just-landed-in/2010-02-12T133441Z
http://ex.org/just-landed-in/2010-02-12T133710Z
http://ex.org/just-landed-in/2010-02-12T133933Z
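The naming patterns can be sketched as two small Python helpers. The function names and the `BASE` constant are assumptions; the timestamp layout is inferred from the example URIs above (date, then time without colons, in UTC).

```python
from datetime import datetime, timezone

BASE = "http://ex.org"  # the example host used in the slides


def sgraph_uri(stream_name: str) -> str:
    """URI of the s-graph: BASE/%stream-name%."""
    return f"{BASE}/{stream_name}"


def igraph_uri(stream_name: str, received_at: datetime) -> str:
    """URI of an i-graph: BASE/%stream-name%/%timestamp%.

    Expects a timezone-aware datetime; it is converted to UTC first.
    """
    stamp = received_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H%M%SZ")
    return f"{sgraph_uri(stream_name)}/{stamp}"
```

For instance, `igraph_uri("just-landed-in", datetime(2010, 2, 12, 13, 34, 41, tzinfo=timezone.utc))` reproduces the first i-graph URI of the example.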

SLIDE 15

Streaming Linked Data Raw Data Stream – dereferencing rules

  • Resource

http://ex.org/just-landed-in
http://ex.org/just-landed-in/2010-02-12T133441Z

  • Linked Data Clients

http://ex.org/trdf/just-landed-in
http://ex.org/trdf/just-landed-in/2010-02-12T133441Z

  • HTML Clients

http://ex.org/page/just-landed-in
http://ex.org/page/just-landed-in/2010-02-12T133441Z
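The mapping from a resource URI to the document served to each kind of client can be sketched as a path rewrite. This is a minimal sketch: the `dereference` function and the client labels "linked-data" and "html" are assumptions, and how the server decides which kind of client it is talking to is not covered here.

```python
from urllib.parse import urlsplit, urlunsplit

# Path prefix per client kind, as in the examples above.
PREFIXES = {"linked-data": "trdf", "html": "page"}


def dereference(resource_uri: str, client: str) -> str:
    """Return the document URI for a resource URI, given the kind of client."""
    prefix = PREFIXES[client]  # raises KeyError for an unknown client kind
    parts = urlsplit(resource_uri)
    return urlunsplit((parts.scheme, parts.netloc, f"/{prefix}{parts.path}",
                       parts.query, parts.fragment))
```

The same rewrite applies to s-graphs and i-graphs, since only a prefix is inserted in front of the resource path.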

SLIDE 16

Streaming Linked Data Controlling the Window - Types of window

  • physical: a given number of triples
  • logical: a variable number of triples which occur during a given time interval (e.g., 1 hour)

– Sliding: they are progressively advanced by a given STEP (e.g., 5 minutes)
– Tumbling: they are advanced by exactly their time interval
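The window types can be contrasted with a short Python sketch. The function names, the integer minute timestamps, and the half-open interval (now - interval, now] are assumptions chosen for the illustration.

```python
from typing import List, Optional, Tuple

Item = Tuple[str, int]  # (triple as text, timestamp in minutes): an assumption


def physical_window(stream: List[Item], size: int) -> List[Item]:
    """Physical window: the last `size` triples, regardless of time."""
    return stream[-size:] if size > 0 else []


def logical_window(stream: List[Item], now: int, interval: int) -> List[Item]:
    """Logical window: every triple with a timestamp in (now - interval, now]."""
    return [(triple, t) for triple, t in stream if now - interval < t <= now]


def evaluation_times(start: int, end: int, interval: int,
                     step: Optional[int] = None) -> List[int]:
    """Instants at which a logical window closes.

    Sliding windows advance by `step`; tumbling windows (step=None)
    advance by exactly their own interval.
    """
    advance = interval if step is None else step
    return list(range(start + interval, end + 1, advance))


stream = [("t1", 10), ("t2", 10), ("t3", 50), ("t4", 100)]
```

Here a one-hour sliding window with a five-minute step closes at minutes 60, 65, 70, and so on, while the tumbling variant closes only at 60 and 120.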

SLIDE 17

Streaming Linked Data Controlling the Window

  • Physical Windows

– Schema

http://ex.org/%stream-name%/physical/%size%

– Example, last 1000 triples

http://ex.org/just-landed-in/physical/1000

  • Logical Windows

– Schema

http://ex.org/%stream-name%/logical/%size%/%step%

– Example, last hour sliding with a step of 10 minutes

http://ex.org/just-landed-in/logical/PT1H/PT10M

  • NOTE 1: These URL patterns are translated into equivalent C-SPARQL queries that select a part of the stream

  • NOTE 2: The lexical space of an interval is the same as xsd:duration, i.e., the format PnYnMnDTnHnMnS defined by ISO 8601
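A first step in translating these URLs is to parse them into a window specification. The Python sketch below is an assumption-laden illustration: the function names and dictionary keys are hypothetical, and the duration parser covers only a subset of xsd:duration (days, hours, minutes, whole seconds), leaving out years and months because their length in seconds is not fixed.

```python
import re
from urllib.parse import urlsplit

# Subset of the PnYnMnDTnHnMnS format: PnDTnHnMnS with integer fields only.
_DURATION = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")


def duration_seconds(text: str) -> int:
    """Convert a duration such as PT1H or PT10M to seconds."""
    match = _DURATION.match(text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def parse_window_url(url: str) -> dict:
    """Parse .../%stream-name%/physical/%size% or .../logical/%size%/%step%."""
    parts = urlsplit(url).path.strip("/").split("/")
    if len(parts) == 3 and parts[1] == "physical":
        return {"stream": parts[0], "type": "physical", "size": int(parts[2])}
    if len(parts) == 4 and parts[1] == "logical":
        return {"stream": parts[0], "type": "logical",
                "size_s": duration_seconds(parts[2]),
                "step_s": duration_seconds(parts[3])}
    raise ValueError(f"not a window URL: {url!r}")
```

For a logical window, `size_s` and `step_s` play the role of the RANGE and STEP values in a C-SPARQL FROM STREAM clause; generating the query text itself is left out of the sketch.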

SLIDE 18

Streaming Linked Data REST APIs to Control C-SPARQL Queries

  • Operations:

– register a C-SPARQL query

  • C-SPARQL queries have to be registered in the C-SPARQL Engine

– start the computation

  • Be aware that the results are not “semantically” valid until the window is completely filled in

– pause the computation

  • the windowing mechanism will keep working, but the window content is not processed

– stop the computation

  • the window is emptied

– unregister the C-SPARQL query
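The five operations suggest a small state machine. The Python sketch below is one possible reading, not the server's implementation: the state names, the allowed transitions (for example, resuming a paused query with start), and the class itself are assumptions. It does follow the slide on two points: while paused the window keeps filling without being processed, and stop empties the window.

```python
# (current state, operation) -> next state; the transition set is an assumption.
TRANSITIONS = {
    ("unregistered", "register"): "registered",
    ("registered", "start"): "running",
    ("running", "pause"): "paused",
    ("paused", "start"): "running",
    ("running", "stop"): "registered",
    ("paused", "stop"): "registered",
    ("registered", "unregister"): "unregistered",
}


class RegisteredQuery:
    """Tracks the lifecycle of one C-SPARQL query and its window."""

    def __init__(self) -> None:
        self.state = "unregistered"
        self.window = []    # items currently held by the windowing mechanism
        self.processed = 0  # number of items handed to query evaluation

    def apply(self, operation: str) -> None:
        key = (self.state, operation)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {operation} while {self.state}")
        self.state = TRANSITIONS[key]
        if operation == "stop":
            self.window.clear()  # stop: the window is emptied

    def on_item(self, item) -> None:
        if self.state in ("running", "paused"):
            self.window.append(item)  # windowing keeps working while paused
        if self.state == "running":
            self.processed += 1       # content is processed only while running
```

A REST layer could expose each operation as a request that calls `apply`; the URL layout for that is not specified here.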

SLIDE 19

Conclusion and Retrospective

  • In our previous work we investigated C-SPARQL as an approach to treat non-RDF DSMSs as virtual RDF streams and graphs.

  • With this position paper, we propose an extension of our C-SPARQL Engine that publishes data streams as Linked Data.

  • We introduced the concept of

– A Stream Graph (s-graph)

  • a metadata graph that describes the current content of the window over the stream

– Several Instantaneous Graphs (i-graphs)

  • one for each timestamp

– rdfs:seeAlso is used (and reserved) to link the graphs

  • We define URL patterns

– to control the window
– to register, start, pause, stop, unregister a C-SPARQL query

Much More to Come! Questions?

Thank You! Keep an eye on
http://www.larkc.eu/
http://streamreasoning.org/

Emanuele Della Valle

DEI – Politecnico di Milano


emanuele.dellavalle@polimi.it