
Linked Data Indexing Methods: A Survey. Martin Svoboda, Irena Mlýnková



  1. Linked Data Indexing Methods: A Survey
     Martin Svoboda, Irena Mlýnková
     Charles University in Prague, The Czech Republic
     21st October 2011, SWWS@OTM, Crete, Greece

  2. Outline
     • Introduction
     • Dimensions
     • Approaches
     • Observations
     • Challenges
     • Conclusion

  3. Introduction
     • Motivation
       ‒ Web of Documents
       ‒ Web of Data
     • Linked Data
       ‒ Principles
         ‒ Unique identifiers (URIs)
         ‒ Useful description (HTTP, RDF)
         ‒ Links

  4. Introduction
     • RDF (Resource Description Framework)
       ‒ Triples
         ‒ Subject Predicate Object.
       ‒ Graph
         ‒ Directed labeled multigraph
         ‒ Vertices for subjects and objects
         ‒ Edges for particular triples
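A minimal sketch of the graph view described on this slide, assuming a handful of made-up triples (the names `John`, `Peter`, `Prague` and the predicates are illustrative, not taken from the survey). Each triple becomes one labeled edge from its subject to its object, and two triples between the same pair of vertices give parallel edges, which is why the graph is a multigraph.

```python
from collections import defaultdict

# Hypothetical example triples (subject, predicate, object).
triples = [
    ("John", "knows", "Peter"),
    ("John", "livesIn", "Prague"),
    ("Peter", "livesIn", "Prague"),
    ("John", "worksWith", "Peter"),  # second edge John -> Peter: a multigraph
]

def build_graph(triples):
    """Map each subject to the list of its outgoing (predicate, object) edges."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return graph

def vertices(triples):
    """Vertices are all terms that occur as a subject or as an object."""
    return {s for s, _, _ in triples} | {o for _, _, o in triples}

graph = build_graph(triples)
parallel = [p for p, o in graph["John"] if o == "Peter"]
```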

  5. Intent
     • Querying framework
       ‒ Architecture
         ‒ Compromise between local and distributed approaches
       ‒ Issues
         ‒ Physical storage
         ‒ Index structures
         ‒ Query processor
       ‒ Problems
         ‒ Data scalability, distribution and dynamicity

  6. Intent
     • Architecture
       ‒ Local
         ‒ Efficient processing
         ‒ Independent data
         ‒ Storage requirements
       ‒ Distributed
         ‒ Runtime requests
         ‒ Up-to-date data
         ‒ Network throughput

  7. Dimensions
     • Aspects
       ‒ Data
       ‒ Index
       ‒ Querying
     • Dimensions
       ‒ Not all combinations make sense

  8. Dimensions
     • Data distribution
       ‒ Local, distributed or global data
     • Data units
       ‒ Triples, quads, documents or other sources
     • Data dynamicity
       ‒ Durable, changeable or volatile data
     • Index organization
       ‒ Local or distributed model

  9. Dimensions
     • Index items
       ‒ Keywords, triples, quads, trees, paths or areas
     • Index content
       ‒ Pure data, statistics or summaries about data
     • Index dynamicity
       ‒ Dynamic or static structures
     • Access patterns
       ‒ Universal or limited approaches

  10. Dimensions
     • Querying layer
       ‒ Syntactic, structural or semantic querying
     • Query models
       ‒ Full text querying or graph patterns
     • Query evaluation
       ‒ Local or distributed processing
     • Query results
       ‒ Complete or incomplete results

  11. Categories
     • Main approach types
       ‒ Querying systems
         ‒ Local or distributed data
         ‒ Structural queries
         ‒ Complete results
       ‒ Search engines
         ‒ Global data cloud
         ‒ Full text queries
         ‒ Imprecise results

  12. Approaches
     • Source selection
       ‒ Andreas Harth et al.: Data Summaries for On-Demand Queries over Linked Data
       ‒ Data transformation
         ‒ 3-dimensional space
         ‒ Hash functions
       ‒ Q-trees based on R-trees
         ‒ Overlapping bounding boxes
         ‒ Buckets with summaries
     [Figure: Q-tree bounding box and bucket; recoverable labels: (5, 10, 5), (15, 20, 25), Dataset A 87, Dataset B 14]
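The source selection idea on this slide can be sketched roughly as follows. This is a simplified illustration, not the Q-tree of Harth et al.: it replaces the tree of overlapping bounding boxes with a flat grid of buckets, and the constants `SPACE` and `CELL`, the class `Summary` and the use of `zlib.crc32` as the hash function are all assumptions made for the example. Each triple is hashed to a point in a 3-dimensional space, every bucket keeps a per-source triple count, and a query pattern selects the sources whose buckets it can touch.

```python
import zlib
from collections import defaultdict

SPACE = 64  # assumed coordinate range per dimension
CELL = 16   # assumed bucket width, giving a 4 x 4 x 4 grid

def coord(term):
    """Stable hash of an RDF term to an integer coordinate in [0, SPACE)."""
    return zlib.crc32(term.encode("utf-8")) % SPACE

def bucket_of(triple):
    """Grid cell containing the 3-dimensional point of a triple."""
    return tuple(coord(term) // CELL for term in triple)

class Summary:
    def __init__(self):
        # bucket -> {source: number of triples hashed into that bucket}
        self.buckets = defaultdict(lambda: defaultdict(int))

    def add(self, source, triple):
        self.buckets[bucket_of(triple)][source] += 1

    def candidate_sources(self, pattern):
        """Estimate per-source matches for a pattern; None marks a variable.

        Hash collisions can yield false positives, never false negatives.
        """
        wanted = [None if t is None else coord(t) // CELL for t in pattern]
        totals = defaultdict(int)
        for bucket, counts in self.buckets.items():
            if all(w is None or w == b for w, b in zip(wanted, bucket)):
                for source, n in counts.items():
                    totals[source] += n
        return dict(totals)

# Hypothetical sources and triples.
summary = Summary()
summary.add("Dataset A", ("John", "knows", "Peter"))
summary.add("Dataset A", ("John", "livesIn", "Prague"))
summary.add("Dataset B", ("Mary", "knows", "Anna"))
candidates = summary.candidate_sources(("John", "knows", None))
```

Only the sources returned in `candidates` would need to be fetched at query time; the counts give a rough ranking of how promising each source is.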

  13. Approaches
     • BitMat index
       ‒ Medha Atre et al.: Matrix "Bit"loaded: A Scalable Lightweight Join Query Processor for RDF Data
       ‒ 3-dimensional matrix
         ‒ Bit values 0 or 1
       ‒ 2-dimensional slices
         ‒ S-O, O-S, P-O, P-S slices
       ‒ Implementation
         ‒ Compressed bit runs
     [Figure: bit cube with axes subjects, predicates and objects; recoverable labels: John, Peter, lives in, knows]
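As a rough illustration of the bit-cube idea, the sketch below builds a small dense cube from made-up triples, extracts an S-O slice for a fixed predicate, and compresses a bit row with plain run-length encoding. Reading "compressed bit runs" as run-length encoding is an assumption here, and the helper names (`build_ids`, `so_slice`, `run_length_encode`) are hypothetical; a real index would never materialize the dense cube.

```python
# Hypothetical example triples.
triples = [
    ("John", "knows", "Peter"),
    ("John", "livesIn", "Prague"),
    ("Peter", "livesIn", "Prague"),
]

def build_ids(terms):
    """Assign consecutive integer positions to the distinct terms, in sorted order."""
    return {t: i for i, t in enumerate(sorted(set(terms)))}

subjects = build_ids(s for s, _, _ in triples)
predicates = build_ids(p for _, p, _ in triples)
objects = build_ids(o for _, _, o in triples)

# cube[s][p][o] is 1 exactly when the triple (s, p, o) is present.
cube = [[[0] * len(objects) for _ in predicates] for _ in subjects]
for s, p, o in triples:
    cube[subjects[s]][predicates[p]][objects[o]] = 1

def so_slice(predicate):
    """S-O slice for a fixed predicate: one row per subject, one column per object."""
    p = predicates[predicate]
    return [row[p][:] for row in cube]

def run_length_encode(bits):
    """Encode a bit row as (first bit, lengths of the alternating runs)."""
    if not bits:
        return (0, [])
    runs = []
    current, length = bits[0], 0
    for b in bits:
        if b == current:
            length += 1
        else:
            runs.append(length)
            current, length = b, 1
    runs.append(length)
    return (bits[0], runs)
```

Sparse rows, which dominate in RDF data, collapse to a handful of run lengths, which is what makes the slices cheap to store.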

  14. Observations
     • String compression
       ‒ Repeating string values
         ‒ URIs and literals
       ‒ Unique integer identifiers
         ‒ Efficient processing
         ‒ Space requirements
       ‒ Translation maps
         ‒ Both directions
         ‒ Based on B-trees
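The translation maps can be sketched as a two-way dictionary. This is an in-memory stand-in under stated assumptions: the slide mentions B-trees, whereas the hypothetical `Dictionary` class below uses a Python `dict` and `list`, and the example URIs are made up.

```python
class Dictionary:
    """Two-way map between RDF terms (URIs, literals) and integer identifiers."""

    def __init__(self):
        self._to_id = {}    # term -> identifier
        self._to_term = []  # identifier -> term

    def encode(self, term):
        """Return the identifier of a term, assigning a new one on first use."""
        if term not in self._to_id:
            self._to_id[term] = len(self._to_term)
            self._to_term.append(term)
        return self._to_id[term]

    def decode(self, ident):
        """Return the term stored under an identifier."""
        return self._to_term[ident]

    def encode_triple(self, triple):
        return tuple(self.encode(term) for term in triple)

    def decode_triple(self, ids):
        return tuple(self.decode(i) for i in ids)

# Hypothetical URIs; the repeated subject and predicate are stored only once.
d = Dictionary()
t1 = d.encode_triple(("http://example.org/John", "http://example.org/knows", "http://example.org/Peter"))
t2 = d.encode_triple(("http://example.org/John", "http://example.org/knows", "http://example.org/Mary"))
```

Indices then store and compare only the small integer tuples, and the strings are looked up again when results are returned.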

  15. Observations
     • Data pruning
       ‒ Idea
         ‒ Query optimization
         ‒ Relevant data
       ‒ Methods
         ‒ Filtering selections
         ‒ Join ordering
       ‒ Problem
         ‒ Partial knowledge
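One way to picture join ordering is to evaluate the most selective triple pattern first, so that later patterns only extend a small set of bindings. The sketch below is a toy illustration under assumptions: it counts matches exactly over local data, whereas the slide's "partial knowledge" problem means a real system would have to rely on estimates; the functions `bind`, `cardinality`, `order_patterns` and `evaluate` and the data are hypothetical.

```python
def is_var(term):
    """Variables are written with a leading question mark, as in SPARQL."""
    return term.startswith("?")

def bind(pattern, triple, binding):
    """Extend a binding so that the pattern equals the triple, or return None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if result.setdefault(p, t) != t:
                return None  # variable already bound to a different value
        elif p != t:
            return None      # constant does not match
    return result

def cardinality(pattern, triples):
    """Number of triples matching the pattern on its own."""
    return sum(1 for t in triples if bind(pattern, t, {}) is not None)

def order_patterns(patterns, triples):
    """Most selective pattern first (fewest matching triples)."""
    return sorted(patterns, key=lambda p: cardinality(p, triples))

def evaluate(patterns, triples):
    """Nested-loop evaluation of a conjunctive query in the chosen order."""
    bindings = [{}]
    for pattern in order_patterns(patterns, triples):
        extended = []
        for b in bindings:
            for t in triples:
                nb = bind(pattern, t, b)
                if nb is not None:
                    extended.append(nb)
        bindings = extended
    return bindings

# Hypothetical data and query.
triples = [
    ("John", "knows", "Peter"),
    ("Mary", "knows", "Peter"),
    ("John", "livesIn", "Prague"),
    ("Peter", "livesIn", "Prague"),
    ("Mary", "livesIn", "Brno"),
]
query = [("?x", "livesIn", "?c"), ("?x", "knows", "Peter")]
results = evaluate(query, triples)
```

Here the `knows Peter` pattern matches two triples and `livesIn` matches three, so the former is evaluated first and the `livesIn` pattern is only checked for John and Mary.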

  16. Challenges
     • Data distribution
       ‒ Motivation
         ‒ Datasets are distributed
         ‒ Appropriate compromise
       ‒ Problems
         ‒ Network drawbacks
         ‒ Space requirements
         ‒ Independent datasets

  17. Challenges
     • Data scalability
       ‒ Motivation
         ‒ Web of Data size explosion
           • September 2011: 295 datasets, 31 billion triples, 504 million links
       ‒ Problems
         ‒ Scalable storage and indices
         ‒ Efficient query evaluation
         ‒ Quality, provenance and trust

  18. Challenges
     • Data dynamicity
       ‒ Motivation
         ‒ Data tends to age
       ‒ Problems
         ‒ Continuous updates
         ‒ Dynamic structures

  19. Conclusion
     • Problem
       ‒ Linked Data indexing methods
     • Contributions
       ‒ Approaches comparison
         ‒ Dimensions
         ‒ Observations
         ‒ Challenges

  20. Thank you for your attention…
     Faculty of Mathematics and Physics, Charles University in Prague
