How Caching Improves Efficiency and Result Completeness for Querying Linked Data
Olaf Hartig
http://olafhartig.de/foaf.rdf#olaf
@olafhartig
Database and Information Systems Research Group
Humboldt-Universität zu Berlin
Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 2
Can we query the Web of Data as if it were a single, giant database?
SELECT DISTINCT ?i ?label
WHERE {
  ?prof rdf:type <http://res ... data/dbprofs#DBProfessor> ;
        foaf:topic_interest ?i .
  OPTIONAL {
    ?i rdfs:label ?label
    FILTER( LANG(?label)="en" || LANG(?label)="" )
  }
}
ORDER BY ?label
Our approach: Link Traversal Based Query Execution [ISWC'09]
Main Idea
- Intertwine query evaluation with traversal of data links
- We alternate between:
  - Evaluating parts of the query (triple patterns) on a continuously augmented set of data (the query-local dataset)
  - Looking up URIs in intermediate solutions and adding the retrieved data to the query-local dataset
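The alternation between evaluating triple patterns and looking up URIs can be sketched as follows. This is a minimal, non-pipelined illustration, not the implementation from the talk: the `WEB` dictionary stands in for HTTP look-ups, and the project URI `http://example.org/AlicesPrj`, the project name, and all function names are assumptions made for the example.

```python
# Toy "Web of Data": maps a URI to the triples returned when it is looked up.
# All data here is hypothetical and replaces real HTTP look-ups.
WEB = {
    "http://bob.name": [("http://bob.name", "knows", "http://alice.name")],
    "http://alice.name": [("http://alice.name", "project", "http://example.org/AlicesPrj")],
    "http://example.org/AlicesPrj": [("http://example.org/AlicesPrj", "name", "Alice's Project")],
}


def is_var(term):
    return term.startswith("?")


def is_uri(term):
    return term.startswith("http://")


def match(pattern, triple, solution):
    """Try to extend `solution` so that `pattern` matches `triple`."""
    extended = dict(solution)
    for p, t in zip(pattern, triple):
        # Substitute a variable that is already bound in this solution.
        p = extended.get(p, p) if is_var(p) else p
        if is_var(p):
            extended[p] = t
        elif p != t:
            return None
    return extended


def look_up(uri, dataset, seen):
    """Look up `uri` once and add the retrieved triples to the dataset."""
    if uri not in seen:
        seen.add(uri)
        dataset.update(WEB.get(uri, []))


def execute(patterns):
    dataset, seen = set(), set()  # the query-local dataset starts empty
    # Seed: look up every URI mentioned in the query itself.
    for pattern in patterns:
        for term in pattern:
            if is_uri(term):
                look_up(term, dataset, seen)
    solutions = [{}]
    for pattern in patterns:
        # Step 1: evaluate one triple pattern over the current dataset.
        next_solutions = []
        for solution in solutions:
            for triple in list(dataset):
                extended = match(pattern, triple, solution)
                if extended is not None:
                    next_solutions.append(extended)
        # Step 2: look up URIs bound in the new intermediate solutions.
        for solution in next_solutions:
            for value in solution.values():
                if is_uri(value):
                    look_up(value, dataset, seen)
        solutions = next_solutions
    return solutions


query = [
    ("http://bob.name", "knows", "?acq"),
    ("?acq", "project", "?prj"),
    ("?prj", "name", "?prjName"),
]
results = execute(query)
print(results)
```

Evaluating the patterns one after another keeps the sketch short. A pipelined engine would interleave both steps per intermediate solution rather than per pattern.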
Main Idea (example)
Query (a chain of triple patterns):
- http://bob.name knows ?acq
- ?acq project ?prj
- ?prj name ?prjName
Execution steps shown in the animation:
- Look up http://bob.name; the retrieved data is added to the query-local dataset (slide label: "Descriptor object")
- The query-local dataset now contains the triple: http://bob.name knows http://alice.name
- Intermediate solution: ?acq = http://alice.name
- Look up http://alice.name; the retrieved data is added to the query-local dataset
- The query-local dataset now contains the triple: http://alice.name project http://.../AlicesPrj
- Intermediate solution: ?acq = http://alice.name, ?prj = http://.../AlicesPrj
- Intermediate solution: ?prj = http://.../AlicesPrj, ?prjName = “ … “
- Resulting solution: ?acq = http://alice.name, ?prj = http://.../AlicesPrj, ?prjName = “ … “
Characteristics
- Link traversal based query execution:
  - Evaluation on a continuously augmented dataset
  - Discovery of potentially relevant data during execution
  - Discovery driven by intermediate solutions
- Main advantage:
  - No need to know all data sources in advance
- Limitations:
  - Query has to contain a URI as a starting point
  - Ignores data that is not reachable* by the query execution
* formal definition in the paper
The Issue
Second query (a chain of triple patterns):
- http://bob.name knows ?acq
- ?acq interest ?i
- ?i label ?iLabel
Execution steps shown in the animation:
- Look up http://bob.name (again); the retrieved data is added to a new query-local dataset
- This dataset contains the triple: http://bob.name knows http://alice.name
- Each of the two queries ends up with its own query-local dataset
Reusing the Query-Local Dataset
- Instead of one query-local dataset per query, both queries share a single query-local dataset
- The triple http://bob.name knows http://alice.name, retrieved for the first query, is already in the dataset when the second query is executed
- Intermediate solution for the second query: ?acq = http://alice.name
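A small sketch can make the effect of sharing the query-local dataset concrete. It is an illustration under assumptions, not the system from the talk: the `WEB` dictionary replaces HTTP look-ups, the interest URI `http://example.org/Caching` and the second query are invented for the example, and `execute` is a hypothetical helper that receives the dataset from its caller so that the caller decides whether to reuse it.

```python
# Hypothetical toy Web: URI -> triples served when that URI is looked up.
WEB = {
    "http://bob.name": [("http://bob.name", "knows", "http://alice.name")],
    "http://alice.name": [("http://alice.name", "interest", "http://example.org/Caching")],
}


def is_var(term):
    return term.startswith("?")


def is_uri(term):
    return term.startswith("http://")


def match(pattern, triple, solution):
    """Try to extend `solution` so that `pattern` matches `triple`."""
    extended = dict(solution)
    for p, t in zip(pattern, triple):
        p = extended.get(p, p) if is_var(p) else p
        if is_var(p):
            extended[p] = t
        elif p != t:
            return None
    return extended


def execute(patterns, dataset, seen):
    """Run one query over `dataset`; return (solutions, number of new look-ups)."""
    before = len(seen)

    def look_up(uri):
        if uri not in seen:  # already retrieved: served from the shared dataset
            seen.add(uri)
            dataset.update(WEB.get(uri, []))

    for pattern in patterns:
        for term in pattern:
            if is_uri(term):
                look_up(term)
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for solution in solutions:
            for triple in list(dataset):
                extended = match(pattern, triple, solution)
                if extended is not None:
                    next_solutions.append(extended)
        for solution in next_solutions:
            for value in solution.values():
                if is_uri(value):
                    look_up(value)
        solutions = next_solutions
    return solutions, len(seen) - before


q1 = [("http://bob.name", "knows", "?acq"), ("?acq", "interest", "?i")]
q2 = [("?x", "knows", "http://alice.name")]

# No reuse: q2 starts with an empty query-local dataset.
no_reuse = execute(q2, set(), set())

# Reuse: q1 and q2 share one query-local dataset.
shared_dataset, shared_seen = set(), set()
first = execute(q1, shared_dataset, shared_seen)
reuse = execute(q2, shared_dataset, shared_seen)

print("no reuse:", no_reuse)
print("reuse:   ", reuse)
```

In this toy setting the second query finds no result on its own, because the triple it needs is only served by http://bob.name, which that query never looks up. With the shared dataset it finds one result and needs no new look-up.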
Hypothesis
- Reusing the query-local dataset (a.k.a. data caching) may benefit query performance and result completeness
Contributions
- Systematic analysis of the impact of data caching
  - Theoretical foundation*
  - Conceptual analysis*
  - Empirical evaluation of the potential impact
- Out of scope: Caching strategies (replacement, invalidation)
* see paper
Experiment – Scenario
- Information about the distributed social network of FOAF profiles
- 5 types of queries
- Experiment setup:
  - 23 persons
  - Sequential use
  ➔ 115 queries (5 query types × 23 persons)
Experiment – Complete Sequence
- no reuse experiment:
  - No data caching
- given order experiment:
  - Reuse of the query-local dataset for the complete sequence of all 115 queries
- Hit rate = (look-ups answered from cache) / (all look-up requests)
[Figure: bar charts comparing "no reuse" and "given order" for queries No. 36 to 40 (ContactInfoPhillipe, UnsetPropsPhillipe, 2ndDegree1Phillipe, 2ndDegree2Phillipe, IncomingPhillipe); panels: hit rate, query execution time (in seconds), number of query results]
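The hit rate defined above can be measured by counting look-up requests at the cache. The following is a minimal sketch under assumptions: `CountingCache` and its `fetch` callback are hypothetical names, and the callback replaces the real HTTP look-up.

```python
class CountingCache:
    """Wraps a look-up function and counts requests and cache hits."""

    def __init__(self, fetch):
        self.fetch = fetch      # function: URI -> retrieved data
        self.store = {}         # URI -> data retrieved earlier
        self.requests = 0       # all look-up requests
        self.hits = 0           # look-ups answered from the cache

    def look_up(self, uri):
        self.requests += 1
        if uri in self.store:
            self.hits += 1
        else:
            self.store[uri] = self.fetch(uri)
        return self.store[uri]

    def hit_rate(self):
        # look-ups answered from cache / all look-up requests
        return self.hits / self.requests if self.requests else 0.0


# Stand-in for an HTTP look-up; it returns no data in this sketch.
cache = CountingCache(fetch=lambda uri: [])
for uri in ["http://bob.name", "http://alice.name", "http://bob.name", "http://bob.name"]:
    cache.look_up(uri)
print(cache.hit_rate())
```

Two of the four requests repeat a URI that was already retrieved, so the sketch reports a hit rate of 0.5.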
Summary
- Contributions:
  - Theoretical foundation
  - Conceptual analysis
  - Empirical evaluation
- Main findings:
  - Additional results possible (for semantically similar queries)
  - Impact on performance may be positive but also negative
- Future work:
  - Analysis of caching strategies in our context
  - Main issue: invalidation
Backup Slides
Contributions
- Theoretical foundation (extension of the original definition)
  - Reachability by a $D_{\text{seed}}$-initialized execution of a BGP query $b$
  - $D_{\text{seed}}$-dependent solution for a BGP query $b$
  - Reachability $R(B)$ for a serial execution of $B = b_1, \ldots, b_n$
  ➔ Each solution for $b_{\text{cur}}$ is also an $R(B)$-dependent solution for $b_{\text{cur}}$
- Conceptual analysis of the impact of data caching
  - Performance factor: $p(b_{\text{cur}}, B) = c(b_{\text{cur}}, [\,]) - c(b_{\text{cur}}, B)$
  - Serendipity factor: $s(b_{\text{cur}}, B) = b(b_{\text{cur}}, B) - b(b_{\text{cur}}, [\,])$
- Empirical verification of the potential impact
- Out of scope: Caching strategies (replacement, invalidation)
Query Template Contact
SELECT * WHERE {
  <PERSON> foaf:knows ?p .
  OPTIONAL { ?p foaf:name ?name }
  OPTIONAL { ?p foaf:firstName ?firstName }
  OPTIONAL { ?p foaf:givenName ?givenName }
  OPTIONAL { ?p foaf:givenname ?givenname }
  OPTIONAL { ?p foaf:familyName ?familyName }
  OPTIONAL { ?p foaf:family_name ?family_name }
  OPTIONAL { ?p foaf:lastName ?lastName }
  OPTIONAL { ?p foaf:surname ?surname }
  OPTIONAL { ?p foaf:birthday ?birthday }
  OPTIONAL { ?p foaf:img ?img }
  OPTIONAL { ?p foaf:phone ?phone }
  OPTIONAL { ?p foaf:aimChatID ?aimChatID }
  OPTIONAL { ?p foaf:icqChatID ?icqChatID }
  OPTIONAL { ?p foaf:jabberID ?jabberID }
  OPTIONAL { ?p foaf:msnChatID ?msnChatID }
  OPTIONAL { ?p foaf:skypeID ?skypeID }
  OPTIONAL { ?p foaf:yahooChatID ?yahooChatID }
}
Query Template UnsetProps
SELECT DISTINCT ?result ?resultLabel WHERE {
  ?result rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> .
  ?result rdfs:domain foaf:Person .
  OPTIONAL { <PERSON> ?result ?var0 }
  FILTER ( !bound(?var0) )
  <PERSON> foaf:knows ?var2 .
  ?var2 ?result ?var3 .
  ?result rdfs:label ?resultLabel .
  ?result vs:term_status ?var1 .
}
ORDER BY ?var1
Query Template Incoming
SELECT DISTINCT ?result WHERE {
  ?result foaf:knows <PERSON> .
  OPTIONAL {
    ?result foaf:knows ?var1 .
    FILTER ( <PERSON> = ?var1 )
    <PERSON> foaf:knows ?result .
  }
  FILTER ( !bound(?var1) )
}
Query Template 2ndDegree1
SELECT DISTINCT ?result WHERE {
  <PERSON> foaf:knows ?p1 .
  <PERSON> foaf:knows ?p2 .
  FILTER ( ?p1 != ?p2 )
  ?p1 foaf:knows ?result .
  FILTER ( <PERSON> != ?result )
  ?p2 foaf:knows ?result .
  OPTIONAL {
    <PERSON> ?knows ?result .
    FILTER ( ?knows = foaf:knows )
  }
  FILTER ( !bound(?knows) )
}
Query Template 2ndDegree2
SELECT DISTINCT ?result WHERE {
  <PERSON> foaf:knows ?p1 .
  <PERSON> foaf:knows ?p2 .
  FILTER ( ?p1 != ?p2 )
  ?result foaf:knows ?p1 .
  FILTER ( <PERSON> != ?result )
  ?result foaf:knows ?p2 .
  OPTIONAL {
    <PERSON> ?knows ?result .
    FILTER ( ?knows = foaf:knows )
  }
  FILTER ( !bound(?knows) )
}
Experiment – Single Query
- no reuse experiment:
  - No data caching
- upper bound experiment:
  - Reuse of the query-local dataset for 3 executions of each query
  - Third execution measured
- Hit rate = (look-ups answered from cache) / (all look-up requests)
[Figure: bar charts comparing "no reuse" and "upper bound" for queries No. 36 to 40 (ContactInfoPhillipe, UnsetPropsPhillipe, 2ndDegree1Phillipe, 2ndDegree2Phillipe, IncomingPhillipe); panels: hit rate, query execution time (in seconds), number of query results]
Experiment – Single Query
- In the ideal case for $B_{\text{upper}} = [\, b_{\text{cur}}, b_{\text{cur}} \,]$:
  - $p_{\text{upper}}(b_{\text{cur}}, B_{\text{upper}}) = c(b_{\text{cur}}, [\,]) - c(b_{\text{cur}}, B_{\text{upper}}) = c(b_{\text{cur}}, [\,])$
  - $s_{\text{upper}}(b_{\text{cur}}, B_{\text{upper}}) = b(b_{\text{cur}}, B_{\text{upper}}) - b(b_{\text{cur}}, [\,]) = 0$

| Experiment | Avg.¹ number of query results (std. dev.) | Avg.¹ hit rate (std. dev.) | Avg.¹ query execution time (std. dev.) |
|---|---|---|---|
| no reuse | 4.983 (11.658) | 0.576 (0.182) | 30.036 s (46.708) |
| upper bound | 5.070 (11.813) | 0.996 (0.017) | 1.943 s (11.375) |

¹ Averaged over all 115 queries
Experiment – Single Query
- Summary (measurement errors aside):
  - Same number of query results
  - Significant improvements in query performance
Experiment – Complete Sequence
- given order experiment:
  - Reuse of the query-local dataset for the complete sequence of all 115 queries
[Figure: bar charts comparing "no reuse", "upper bound", and "given order" for queries No. 36 to 40 (ContactInfoPhillipe, UnsetPropsPhillipe, 2ndDegree1Phillipe, 2ndDegree2Phillipe, IncomingPhillipe); panels: hit rate, query execution time (in seconds), number of query results]
Experiment – Complete Sequence
[Figure: hit rate, query execution time (in seconds), and number of query results for queries No. 36 to 40 under "no reuse", "upper bound", and "given order"]
- $B_{\text{given order}} = [\, q_1, \ldots, q_{38} \,]$
- Serendipity factor for query No. 39: $s(q_{39}, B_{\text{given order}}) = b(q_{39}, B_{\text{given order}}) - b(q_{39}, [\,]) = 9 - 1 = 8$
- Performance factor for query No. 39: $p'(q_{39}, B_{\text{given order}}) = c'(q_{39}, [\,]) - c'(q_{39}, B_{\text{given order}}) = 31.48\,\text{s} - 68.64\,\text{s} = -37.16\,\text{s}$
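The two values for query No. 39 follow directly from the factor definitions: benefit with cache minus benefit without, and cost without cache minus cost with. A short sketch of that arithmetic is given below. The function and parameter names are assumptions. The inputs are the numbers stated on the slide, with the number of query results as benefit and the execution time in seconds as cost.

```python
def serendipity_factor(benefit_with_cache, benefit_without_cache):
    """s(q, B) = b(q, B) - b(q, []): additional results gained by caching."""
    return benefit_with_cache - benefit_without_cache


def performance_factor(cost_without_cache, cost_with_cache):
    """p(q, B) = c(q, []) - c(q, B): positive means caching saved cost."""
    return cost_without_cache - cost_with_cache


# Values for query No. 39 as given on the slide.
s_q39 = serendipity_factor(benefit_with_cache=9, benefit_without_cache=1)
p_q39 = performance_factor(cost_without_cache=31.48, cost_with_cache=68.64)

print(s_q39)            # additional query results
print(round(p_q39, 2))  # seconds; a negative value means the query got slower
```

The negative performance factor matches the slide: for this query, the cached run took longer than the run without a cache, while returning eight additional results.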
Experiment – Complete Sequence
- Summary:
  - Data cache may provide for additional query results
  - Impact on performance may be positive but also negative

| Experiment | Avg.¹ number of query results (std. dev.) | Avg.¹ hit rate (std. dev.) | Avg.¹ query execution time (std. dev.) |
|---|---|---|---|
| no reuse | 4.983 (11.658) | 0.576 (0.182) | 30.036 s (46.708) |
| upper bound | 5.070 (11.813) | 0.996 (0.017) | 1.943 s (11.375) |
| given order | 6.878 (12.158) | 0.932 (0.139) | 39.845 s (145.898) |

¹ Averaged over all 115 queries
Experiment – Complete Sequence
- Executing the query sequence in a random order results in measurements similar to the given order.

| Experiment | Avg.¹ number of query results (std. dev.) | Avg.¹ hit rate (std. dev.) | Avg.¹ query execution time (std. dev.) |
|---|---|---|---|
| no reuse | 4.983 (11.658) | 0.576 (0.182) | 30.036 s (46.708) |
| upper bound | 5.070 (11.813) | 0.996 (0.017) | 1.943 s (11.375) |
| given order | 6.878 (12.158) | 0.932 (0.139) | 39.845 s (145.898) |
| random orders | 6.652 (11.966) | 0.954 (0.036) | 36.994 s (118.700) |
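One way to read the table is to compare each experiment with the no reuse baseline. The sketch below does this with the averages reported above. The dictionary layout and the function name are assumptions, and the differences are taken between averages, so they describe the overall trend rather than any single query.

```python
# (avg. number of query results, avg. hit rate, avg. execution time in seconds),
# copied from the table.
AVERAGES = {
    "no reuse": (4.983, 0.576, 30.036),
    "upper bound": (5.070, 0.996, 1.943),
    "given order": (6.878, 0.932, 39.845),
    "random orders": (6.652, 0.954, 36.994),
}


def compare_to_baseline(name, baseline="no reuse"):
    """Return (additional results, seconds saved) relative to the baseline."""
    results, _, seconds = AVERAGES[name]
    base_results, _, base_seconds = AVERAGES[baseline]
    return round(results - base_results, 3), round(base_seconds - seconds, 3)


for experiment in ("upper bound", "given order", "random orders"):
    extra_results, seconds_saved = compare_to_baseline(experiment)
    print(f"{experiment}: {extra_results:+.3f} results, {seconds_saved:+.3f} s saved")
```

A negative value for seconds saved means that the average execution time increased compared with no reuse, which is the case for both the given order and the random orders.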