Efficient Detection of Empty-Result Queries Gang Luo IBM Watson - PowerPoint PPT Presentation

Efficient Detection of Empty-Result Queries Gang Luo IBM Watson Research Centre Damon Sotoudeh

Agenda  Introduction  The detection method  Related work  Future work  Conclusion

Empty-Result Queries  Queries that return nothing  Do not provide much information  May take a long time to produce  Frequently encountered: ○ CRM (at IBM): 18% ○ Biomedical domain: up to 40% ○ In interactive systems

Empty-Result Queries  In interactive systems  Users keep refining queries  Only a few parameters change  Most query parts are shared between queries ○ In an IBM CRM application, only 38% of queries are distinct

Intuition  Remember query parts that previously led to empty result sets  If a new query contains one of those parts, it will also return an empty result  No query execution required

Detection Method Numbers are set cardinalities

Detection Method Identify lowest set with cardinality zero, and the sub-tree rooted at that point

Detection Method  Easy to see that the set cardinalities above this point are all zero
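A minimal sketch of this search, assuming the executed plan is available as a tree annotated with output cardinalities. `PlanNode` and `lowest_empty_node` are hypothetical names used for illustration; they do not come from the presentation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    op: str                      # e.g. "scan", "select", "join", "project"
    cardinality: int             # number of rows this operator produced
    children: List["PlanNode"] = field(default_factory=list)


def lowest_empty_node(node: PlanNode) -> Optional[PlanNode]:
    """Return an empty node none of whose descendants is empty, or None."""
    # Look below first, so that the lowest empty node wins.
    for child in node.children:
        found = lowest_empty_node(child)
        if found is not None:
            return found
    return node if node.cardinality == 0 else None


# Illustrative plan: the selection is where the result first becomes empty.
scan_a = PlanNode("scan", 100)
sel = PlanNode("select", 0, [scan_a])
scan_b = PlanNode("scan", 50)
join = PlanNode("join", 0, [sel, scan_b])
root = PlanNode("project", 0, [join])
part = lowest_empty_node(root)   # the sub-tree rooted at `sel`
```

The sub-tree rooted at the returned node is the query part worth remembering; everything above it is empty only as a consequence.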

Detection Method  If a new query has this query part, it is an empty-result query  Provided that all the operators above it are empty-result propagating ○ Selection ○ Projection ○ Join ○ And most other SQL operators
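The condition on the operators above the empty part can be sketched as a simple membership test. The operator names and the `EMPTY_PROPAGATING` set are assumptions for illustration, listing only the operators named on the slide. An outer join is used as the counterexample because it can return rows even when one of its inputs is empty.

```python
# Operators the slide names as empty-result propagating (illustrative labels).
EMPTY_PROPAGATING = {"select", "project", "join"}


def propagates_empty(operators_above):
    """True if every operator above the empty part passes emptiness upward."""
    return all(op in EMPTY_PROPAGATING for op in operators_above)


safe = propagates_empty(["join", "project"])
unsafe = propagates_empty(["outer_join", "project"])
```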

Simplifying query plans  Abstractly  Certain operators have no influence on the emptiness of output ○ Projection ○ Hash ○ Sort, ...  Any join operator is simply a join ○ Hash join ○ Sort-merge join ○ Nested-loops join
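One way to picture this simplification, as a sketch under assumptions: a plan is a nested `(operator, [children])` tuple with relation names as leaves, selection conditions are omitted for brevity, and the operator labels are hypothetical.

```python
# Operators that do not affect whether the output is empty.
NO_INFLUENCE = {"project", "hash", "sort"}
# Physical join algorithms that are all treated as a plain join.
JOIN_VARIANTS = {"hash_join", "sort_merge_join", "nested_loops_join"}


def simplify(plan):
    """Drop no-influence operators and collapse join variants into 'join'."""
    if isinstance(plan, str):            # a leaf: just a relation name
        return plan
    op, children = plan
    children = [simplify(child) for child in children]
    if op in NO_INFLUENCE and len(children) == 1:
        return children[0]               # splice the operator out
    if op in JOIN_VARIANTS:
        op = "join"
    return (op, children)


plan = ("project", [("hash_join", [("sort", ["orders"]),
                                   ("select", ["lineitem"])])])
simplified = simplify(plan)
```

After simplification, two plans that differ only in physical operators or in projected columns map to the same query part, which makes stored parts reusable across more queries.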

Simplifying query plans

Simplifying query plans  Previous figure corresponds to the following query:

Further simplification  Convert selection conditions to DNF  Disjunctive normal form  For example: =  Interval selection does not need to be changed

Further simplification  After rewriting selections in DNF, combine the individual selection terms in each relation

Further simplification  Great news:  The output of the four simplified query parts is also empty! ○ Intuition: the original part's output is the union of their outputs  They are called atomic query parts ○ Cannot be further simplified  But generating them can take exponential time ○ Poor performance for complex queries
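The DNF step can be sketched as follows, assuming conditions are nested `("and", ...)` / `("or", ...)` tuples over atomic condition strings and that negation has already been pushed into the atoms. The function name and the sample conditions are illustrative, not the query from the slides.

```python
from itertools import product


def to_dnf(expr):
    """Return a list of conjunctions, each a frozenset of atomic conditions."""
    if isinstance(expr, str):
        return [frozenset([expr])]
    op, *args = expr
    parts = [to_dnf(arg) for arg in args]
    if op == "or":
        # A disjunction simply concatenates the terms of its operands.
        return [term for part in parts for term in part]
    if op == "and":
        # A conjunction distributes over the terms of its operands.
        return [frozenset().union(*combo) for combo in product(*parts)]
    raise ValueError(f"unsupported operator: {op}")


cond = ("and", ("or", "A.x=1", "A.x=2"), ("or", "B.y=3", "B.y=4"))
terms = to_dnf(cond)   # one conjunction per atomic query part
```

Each conjunction corresponds to one atomic query part. Because the term counts of conjoined disjunctions multiply, n two-way disjunctions joined by AND yield 2^n terms, which is the exponential cost noted above.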

Detection  How to detect an empty-result query Q?  Break Q into its atomic parts  Is every atomic part of Q covered by some atomic part in the container? ○ If yes, then Q is an empty-result query

Coverage  A selection condition X covers a selection condition Y if and only if X is true whenever Y is true  In other words, if X is false, then so is Y
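For conjunctions of interval conditions, coverage can be checked attribute by attribute. The sketch below assumes each condition is a dict mapping an attribute name to a closed `(low, high)` interval; this representation and the attribute names are assumptions made for illustration. It is a sufficient test only: it can answer False for some pairs where coverage actually holds, for example when Y is unsatisfiable.

```python
def covers(x, y):
    """True if every row satisfying y is guaranteed to satisfy x."""
    for attr, (x_lo, x_hi) in x.items():
        if attr not in y:
            return False          # y leaves attr unconstrained, x does not
        y_lo, y_hi = y[attr]
        if y_lo < x_lo or y_hi > x_hi:
            return False          # y's interval sticks out of x's interval
    return True


x = {"price": (10, 50)}
y = {"price": (20, 30), "qty": (1, 5)}
x_covers_y = covers(x, y)   # y is narrower on price and adds a constraint
y_covers_x = covers(y, x)   # x does not constrain qty at all
```

If X is known to return nothing and X covers Y, then Y returns nothing as well, since every row that would satisfy Y would also have satisfied X.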

Coverage  Notion of coverage expands the detection possibilities  But deciding coverage in general is exponential  Paper uses a restricted coverage detection  Trade off between efficiency and coverage detection  If every atomic part of query Q is covered by some empty-result atomic query part, then Q definitely generates empty results  But we may not always find such a match, even when Q is empty

Atomic query container  Is fully stored in memory  For fast access  Is of fixed size M, but M can be fairly large  Trade off between efficiency and coverage  Once the container is full, keep only the atomic parts most likely to be reused ○ E.g. evict with a least recently used (LRU) policy
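A fixed-size container with least recently used eviction might look like the sketch below. It assumes atomic parts are hashable values; `AtomicPartContainer` and its methods are hypothetical names, and LRU is only the example policy mentioned on the slide.

```python
from collections import OrderedDict


class AtomicPartContainer:
    """Keeps at most `capacity` atomic parts, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._parts = OrderedDict()   # oldest entry first, newest last

    def add(self, part):
        self._parts[part] = True
        self._parts.move_to_end(part)
        if len(self._parts) > self.capacity:
            self._parts.popitem(last=False)   # drop the least recently used

    def touch(self, part):
        """Mark a part as just used; return whether it is stored."""
        if part in self._parts:
            self._parts.move_to_end(part)
            return True
        return False

    def __contains__(self, part):
        return part in self._parts


container = AtomicPartContainer(capacity=2)
container.add("p1")
container.add("p2")
container.touch("p1")     # p1 is now the most recently used
container.add("p3")       # evicts p2, the least recently used
```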

Atomic query container  To avoid scanning the whole container  Index the container by the relations each atomic part involves
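Such an index can be sketched as a dictionary keyed by the set of relations an atomic part involves. The sketch assumes a lookup only needs parts over exactly the same relations as the query part being checked; that assumption, the class name, and the sample relation names are illustrative.

```python
from collections import defaultdict


class RelationIndex:
    """Groups stored atomic parts by the set of relations they involve."""

    def __init__(self):
        self._by_relations = defaultdict(list)

    def add(self, relations, part):
        self._by_relations[frozenset(relations)].append(part)

    def candidates(self, relations):
        # Only parts over the same relations can match, so only those are scanned.
        return self._by_relations.get(frozenset(relations), [])


index = RelationIndex()
index.add(["orders", "lineitem"], "part-1")
index.add(["customer"], "part-2")
hits = index.candidates(["lineitem", "orders"])   # order of relations is irrelevant
misses = index.candidates(["supplier"])
```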

Experiments  Based on two queries  Q1: Find the information about certain parts that were sold on certain days  Q2: Find the information about certain parts that were sold to certain customers on certain days

Experiments  The overhead is trivial compared to query execution time  [Figure: execution time or overhead in seconds (log scale, 0.001 to 1000) versus database size (1 to 3 GB); series: execute Q1, check Q1, execute Q2, check Q2]

Experiments  The overhead of our method increases with both query complexity and the number of atomic query parts stored in the container C  When the check fails, the overhead is higher than when the check succeeds

Related Work  Two general approaches  1. Find what leads to empty results ○ Time consuming ○ A lot of possibilities  2. Automatically generalize the query to obtain some answers ○ Domain specific ○ Restricted forms of queries  No best approach

Open issues  How to handle updates?  Extension beyond empty-result propagating operators  A method that combines the advantages of all current solutions ○ Not restrictive ○ Efficient

Conclusion  An efficient method for detecting empty-result queries  High detection rate once the container is well filled  Low overhead compared to actual execution of the query  Small storage requirements  Well suited to interactive systems  Reflects the existence of hot spots in the query workload

Thanks for listening! Questions?
