
Federated Semantic Data Management, 25-30 June 2017, Dagstuhl

Federated Semantic Data Management. 25-30 June 2017, Dagstuhl, Germany. Hala Skaf-Molli, Pascal Molli, Université de Nantes. GDD: Distributed Data Management Group.


  1. Federated Semantic Data Management. 25-30 June 2017, Dagstuhl, Germany. Hala Skaf-Molli, Pascal Molli, Université de Nantes

  2. NANTES

  3. GDD: Distributed Data Management Group
     Foundations of Distributed Data Management
     ● Distributed algorithms
     ● Distributed Data Structures
     ● Consistency criteria & protocols
     Distributed Data Management Systems
     ● Federated Query Processing:
       ○ source selection, decomposition, optimization, operators
     ● Data Integration
     ● Replication, synchronization & consistency
     ● Fog Computing
     ● Queries in the Fog

  4. Replication, synchronization, Consistency
     How can we improve linked data together? I see a mistake: how do I fix it, especially if I cannot edit the source?
     ● Idea: replicate and synchronize... git for RDF data
     ● Live Linked Data: synchronising semantic stores with commutative replicated data types. JMSO13
     ● Towards Writable and Scalable Linked Open Data. ISWC14
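The "git for RDF data" idea can be illustrated with a small sketch. It uses a generic observed-remove set with tombstones, one well-known commutative replicated data type, and is not necessarily the data type defined in the papers cited on this slide. The class `TripleORSet`, its method names, and the sample triples are hypothetical.

```python
import uuid


class TripleORSet:
    """Observed-remove set of RDF triples (generic CRDT sketch, hypothetical names)."""

    def __init__(self):
        self.added = {}       # triple -> set of unique tags, one per insertion
        self.removed = set()  # tombstones: tags whose insertion was deleted

    def insert(self, triple):
        """Insert locally and return the operation to ship to other replicas."""
        op = ("ins", triple, uuid.uuid4().hex)
        self.apply(op)
        return op

    def delete(self, triple):
        """Delete only the insertions observed locally; return the operation."""
        tags = frozenset(self.added.get(triple, set()) - self.removed)
        op = ("del", triple, tags)
        self.apply(op)
        return op

    def apply(self, op):
        """Apply a local or remote operation; the order of application does not matter."""
        kind, triple, payload = op
        if kind == "ins":
            self.added.setdefault(triple, set()).add(payload)
        else:
            self.removed |= payload

    def triples(self):
        """A triple is visible while at least one of its insertions is not tombstoned."""
        return {t for t, tags in self.added.items() if tags - self.removed}


# Two replicas of the same dataset (sample triples are illustrative only).
a, b = TripleORSet(), TripleORSet()
old_link = ("lct:intervention1", "rdfs:seeAlso", "wiwiss-berlin:DB00087")
new_link = ("lct:intervention1", "rdfs:seeAlso", "wifo5-mannheim:DB00087")

op1 = a.insert(old_link)
b.apply(op1)
op2 = b.delete(old_link)   # replica b fixes the link...
op3 = b.insert(new_link)
op4 = a.insert(("wifo5-mannheim:DB00087", "db:halfLife", "288h"))  # ...while a adds data

a.apply(op3)               # operations arrive at a in a different order
a.apply(op2)
b.apply(op4)
print(a.triples() == b.triples())
```

Because every insertion carries a unique tag and deletions only tombstone tags they have observed, both replicas converge without locking or a central editor, which is the property the slide is after.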

  5. Replication, synchronization, Consistency
     LinkedCT:
       Lct:intervention1 [
         Lct:type Drug .
         Lct:condition Lct:T-Cell-Lymphoma
         rdfs:label ‘Alemtuzumab’ .
         rdfs:seeAlso wiwiss-berlin:DB00087 ]
     Wiwiss-berlin.de:DB00087 rdf:type Drug
     wifo5-mannheim.de:DB00087 rdf:type Drug
     I’m ready to fix the problem. How can I update?

  6. [Figure: MyOrg (My Endpoint) connected to two sources, each through an Update Feed]
     Triples shown:
       Lct:intervention1 [ Lct:type Drug . Lct:condition Lct:T-Cell-Lymphoma rdfs:label ‘Alemtuzumab’ . rdfs:seeAlso wiwiss-berlin:DB00087 ]
       Lct:intervention1 [ Lct:type Drug . Lct:condition Lct:T-Cell-Lymphoma rdfs:label ‘Alemtuzumab’ . rdfs:seeAlso wifo5-mannheim.de:DB00087 ]
       wifo5-mannheim.de:DB00087 DB:Half-Life 288h
     Queries shown:
       CONSTRUCT WHERE { ?x rdfs:seeAlso ?y } (twice)
       CONSTRUCT WHERE { ?x rdf:type drugbank:drug }
     Updates shown:
       Lct:intervention1 rdfs:seeAlso wifo5-mannheim:DB00087
       wifo5-mannheim.de:DB00087 DB:Half-Life 288h

  7. Data Integration
     How to query the deep web and linked data with SPARQL?
     ● Idea: local-as-view mediator with smart materialization of views
     ● SemLAV: Local-as-view mediation for SPARQL queries. TLDKS14
     ● SemLAV: Querying deep web and linked open data with SPARQL. ESWC14 (demo)
     ● Gun: An efficient execution strategy for querying the web of data. DEXA 2013

  8. [Figure: Client, Query Executor, SemLAV, Global Schema; sources labelled with foaf:name, foaf:member and rdfs:label]
     SELECT DISTINCT * WHERE {
       ?P foaf:member ?C .
       ?C rdfs:label "Semantic Web" .
       ?P foaf:knows ?WKP .
       ?WKP foaf:name "Barack Obama"
     }

  9. Query:
       Q(P,C,WKP,N) :- member(P,C), label(C,"Semantic Web"), knows(P,WKP), name(WKP,"Barack Obama")
     LAV mappings:
       v1(P,A,I,C,L) :- made(P,A), affiliation(P,I), member(P,C), label(C,L)
       v2(A,T,P,N,C) :- title(A,T), made(P,A), name(P,N), member(P,C)
       v3(P,N,R,M)   :- name(P,N), name(R,M), knows(P,R)
       v4(P,N,G,R,C) :- name(P,N), gender(P,G), knows(P,R), member(P,C)
       v5(P,N,R,C,L) :- name(P,N), knows(P,R), member(P,C), label(C,L)
     Relevant views per query subgoal:
           member(P,C)     label(C,L)      knows(P,WKP)    name(WKP,N)
       4   v5(P,N,R,C,L)   v5(P,N,R,C,L)   v5(P,N,R,C,L)   v5(P,N,R,C,L)
       3   v4(P,N,G,R,C)   v1(P,A,I,C,L)   v4(P,N,G,R,C)   v4(P,N,G,R,C)
       2   v1(P,A,I,C,L)                   v3(P,N,R,M)     v2(A,T,P,N,C)
       2   v2(A,T,P,N,C)                                   v3(P,N,R,M)
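The per-subgoal lists of views on this slide can be reproduced with a short sketch. It is a simplification that matches views to subgoals by predicate name only, ignoring variable bindings and constants, and it is not SemLAV's actual algorithm. Ranking views by how many query subgoals they cover, and breaking ties by view name, are assumptions made here; the resulting coverage counts (4, 3, 2, 2, 2) seem to line up with the numbers shown on the slide.

```python
# Predicates used in the body of each LAV view, copied from the mappings on the slide.
views = {
    "v1": ["made", "affiliation", "member", "label"],
    "v2": ["title", "made", "name", "member"],
    "v3": ["name", "name", "knows"],
    "v4": ["name", "gender", "knows", "member"],
    "v5": ["name", "knows", "member", "label"],
}
query_subgoals = ["member", "label", "knows", "name"]

# Coverage: number of distinct query subgoals whose predicate appears in the view.
coverage = {v: len(set(body) & set(query_subgoals)) for v, body in views.items()}


def bucket(subgoal):
    """Views relevant to one subgoal, highest coverage first (ties broken by name)."""
    relevant = [v for v, body in views.items() if subgoal in body]
    return sorted(relevant, key=lambda v: (-coverage[v], v))


buckets = {g: bucket(g) for g in query_subgoals}
for g in query_subgoals:
    print(g, buckets[g])
```

Materializing the views in this order would load `v5` first, which on its own touches all four subgoals of the query.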

  10. Federated Queries & Replication
      ● How to improve data availability of linked data?
        ○ Idea: partial data replication to create new data locality, and smart source selection
          ■ Federated SPARQL Queries Processing with Replicated Fragments. ISWC15
        ○ Idea: partial data replication and query decomposition
          ■ Decomposing federated queries in presence of replicated fragments. JWS17

  11. Data replication & query decomposition
      ● Consider a BGP with three triple patterns tp1, tp2, and tp3.
        ○ Endpoint C1 is relevant for tp1 and tp3.
        ○ Endpoint C2 is relevant for tp1 and tp2.
        ○ tp1@C1 = tp1@C2
      ● Existing source selection strategies prevent assigning tp1.tp3 to C1 and tp1.tp2 to C2, even if these sub-queries generate fewer intermediate results...
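The contrast on this slide can be made concrete with a minimal sketch. The function names and the "first relevant endpoint wins" rule are assumptions made for illustration; this is not the decomposition algorithm of the cited papers. Sending tp1 to both endpoints is only safe here because the slide states that tp1@C1 = tp1@C2.

```python
# Relevance information taken from the slide: triple pattern -> relevant endpoints.
relevant = {"tp1": ["C1", "C2"], "tp2": ["C2"], "tp3": ["C1"]}


def exclusive_assignment(relevant):
    """Each triple pattern is sent to exactly one endpoint (here: the first relevant one)."""
    plan = {}
    for tp, endpoints in relevant.items():
        plan.setdefault(endpoints[0], []).append(tp)
    return plan


def replicated_assignment(relevant):
    """Each endpoint receives every triple pattern it can answer from its replicas."""
    plan = {}
    for tp, endpoints in relevant.items():
        for endpoint in endpoints:
            plan.setdefault(endpoint, []).append(tp)
    return plan


exclusive = exclusive_assignment(relevant)
replicated = replicated_assignment(relevant)
print(exclusive)   # tp2 is evaluated alone at C2
print(replicated)  # tp1 is joined with tp3 at C1 and with tp2 at C2
```

In the second plan both sub-queries contain a join that the endpoint evaluates locally, which is why they may return fewer intermediate results than shipping tp2 on its own.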

  12. Federated queries & Replication
      ● How to improve query performance on linked data?
      ● Idea: partial replication and intra-query parallelization
      ● PeNeLoop: Parallelizing Federated SPARQL Queries in Presence of Replicated Fragments. QUWEDA@ESWC17

  13. PeNeLoop Query Processing
      SELECT ?movie ?name WHERE {
        ?movie dbo:director ?director .   (tp1)
        ?movie lmdb:genre ?genre .        (tp2)
        ?genre lmdb:genre_name ?name .    (tp3)
      }
      [Figure: endpoints E1, E2 and E3 linked by join (⋈) and projection (π) operators, with labels M1 to M7. The join { tp1. tp2. } is performed locally at E2; both E1 & E3 are used to process the join. Example result: { ?movie = dbo:Seven_Samurai, ?name = “Samurai movie” }]
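A rough sketch of the parallel join: mappings produced by the local join are spread over the endpoints that hold a replica of the remaining fragment. This assumes that E1 and E3 both replicate the fragment for tp3, which the slide suggests but does not state. The endpoints are simulated by in-memory lookups, and the genre identifiers, the second movie, and the round-robin policy are all hypothetical. This is not the PeNeLoop operator itself.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical replicated fragment for tp3 (?genre lmdb:genre_name ?name).
GENRE_NAMES = {"lmdb:genre_1": "Samurai movie", "lmdb:genre_2": "Drama"}


def make_endpoint(label):
    """Simulate an endpoint that extends one mapping with ?name, or returns None."""
    def lookup(mapping):
        name = GENRE_NAMES.get(mapping["genre"])
        if name is None:
            return None
        return {**mapping, "name": name, "served_by": label}
    return lookup


def parallel_bound_join(left_mappings, endpoints):
    """Send each left mapping to one replica, round-robin, and collect the joined results."""
    with ThreadPoolExecutor(max_workers=len(endpoints)) as pool:
        futures = [
            pool.submit(endpoints[i % len(endpoints)], mapping)
            for i, mapping in enumerate(left_mappings)
        ]
        results = [future.result() for future in futures]
    return [r for r in results if r is not None]


# Mappings assumed to come out of the local join of tp1 and tp2 at E2.
left = [
    {"movie": "dbo:Seven_Samurai", "genre": "lmdb:genre_1"},
    {"movie": "ex:movie2", "genre": "lmdb:genre_2"},
    {"movie": "ex:movie3", "genre": "lmdb:genre_9"},  # no matching genre name
]
results = parallel_bound_join(left, [make_endpoint("E1"), make_endpoint("E3")])
for r in results:
    print(r["movie"], r["name"], r["served_by"])
```

With real endpoints each lookup would be an HTTP request, so the two replicas would work on different mappings at the same time instead of one endpoint serving them all in sequence.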

  14. Queries in the Fog
      How to get both data availability *and* performance?
      ● Idea: P2P resource sharing, but on the client side... in the fog of browsers
        ○ CyCLaDEs: A Decentralized Cache for Linked Data Fragments. ESWC 2016
        ○ SPARQL Queries in the Fog of Browsers. Demo@ESWC 2017

  15. [Figure: DBpedia and DrugBank LDF Servers, each behind an HTTP Cache, and clients c1 to c9]

  16. SPARQL Queries in the Fog of Browsers
      Fog of Browsers: P2P network of browsers with browser-to-browser connections (WebRTC)
      WebRTC: https://webrtc.org/

  17. FoB with Triple Pattern Fragments
      ● Servers run TPF servers (TPFs)
      ● Browsers run TPF clients (TPFc): C1, C2...
      TPF: Verborgh, Ruben, et al. "Triple Pattern Fragments: A low-cost knowledge graph interface for the Web." Web Semantics: Science, Services and Agents on the World Wide Web 37 (2016): 184-206.

  18. Clients receive SPARQL queries...
      Any client can receive SPARQL queries at any time.
      [Figure: TPF servers (TPFs); client C1 (TPFc) with W1: Q1, Q2, Q3, Q4; client C2 (TPFc) with W2: Q5, Q6]

  19. Clients receive SPARQL queries...
      Any client can receive SPARQL queries at any time. Do it yourself, or delegate some to neighbors:
      ● Client-side inter-query parallelism
      ● Q4@C4, Q3@C3...
      [Figure: TPF servers (TPFs); C1 with W1: Q1, Q2, Q3, Q4; C2 with W2: Q5, Q6; neighbors C3, C4, C5; Q4 sent to C4]

  20. Clients receive SPARQL queries...
      Can we reduce the global Execution Time (ET) of W1 and W2 by delegating queries to neighbours?
      ET(W1@C1 // W2@C2) > ET({W1 ∪ W2}@{C1-C5}) ?
      [Figure: TPF servers (TPFs); C1 with W1: Q1, Q2, Q3, Q4; C2 with W2: Q5, Q6]
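The question on this slide can be explored with a toy calculation. The per-query costs below are made up, each client is assumed to run its queries one after the other, and delegation is assumed to be free. The greedy longest-first scheduler is a stand-in chosen for illustration, not the delegation strategy of the cited demo.

```python
import heapq


def makespan(costs, n_clients):
    """Greedy longest-first scheduling: finish time of the busiest client."""
    loads = [0.0] * n_clients
    heapq.heapify(loads)
    for cost in sorted(costs, reverse=True):
        # Give the next most expensive query to the currently least loaded client.
        heapq.heappush(loads, heapq.heappop(loads) + cost)
    return max(loads)


w1 = [4.0, 3.0, 2.0, 5.0]  # hypothetical costs of Q1..Q4, received by C1
w2 = [6.0, 1.0]            # hypothetical costs of Q5 and Q6, received by C2

et_isolated = max(sum(w1), sum(w2))   # ET(W1@C1 // W2@C2)
et_delegated = makespan(w1 + w2, 5)   # ET({W1 ∪ W2}@{C1-C5})
print(et_isolated, et_delegated)
```

Under these assumptions the inequality holds by a wide margin. In practice the gain depends on the cost of shipping queries and results between browsers, which this sketch leaves out.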

  21. ladda-demo.herokuapp.com

  22. GDD Research Group: Distributed Data Management. 25-30 June 2017, Dagstuhl, Germany. P. Molli, H. Skaf (Mcf), Univ Nantes
