A Dichotomy for Non-Repeating Queries with Negation in Probabilistic Databases
Robert Fink and Dan Olteanu
PODS, June 24, 2014

Outline
  ◮ The Dichotomy
  ◮ The Interesting but Hard Queries
  ◮ The Easy Queries
  ◮ Leftovers

Problem Setting
Relational algebra query language fragment 1RA⁻
  ◮ Included: equi-joins, selections, projections, difference
  ◮ Excluded: repeating relation symbols (self-joins), unions
Tuple-independent probabilistic model
  ◮ Each tuple is associated with a fresh Boolean random variable x.
  ◮ P(x) is the probability that the tuple exists in the database.
  ◮ Simplest probabilistic model in the literature. Beyond this model, query tractability is quickly lost.
  ◮ Used by real-world large-scale probabilistic repositories, e.g., Google Knowledge Vault.
Query Evaluation Problem: For a fixed 1RA⁻ query Q: given a tuple-independent probabilistic database D and a tuple t ∈ Q(D), compute its marginal probability.
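
To make the query evaluation problem concrete, here is a minimal brute-force sketch that sums the weights of all possible worlds in which a Boolean query holds. The database `db`, the relation names R and U, and the query π_∅[R(A) − U(A)] are hypothetical illustrations, not taken from the talk; the enumeration is exponential in the number of tuples and only meant for tiny inputs.

```python
from itertools import product

def marginal_probability(tuples, query):
    """tuples: dict mapping (relation, values) -> probability of that tuple.
    query: function from a set of present tuples to True/False.
    Enumerates all 2^n possible worlds (illustration only)."""
    keys = list(tuples)
    total = 0.0
    for bits in product([False, True], repeat=len(keys)):
        world = {k for k, present in zip(keys, bits) if present}
        weight = 1.0
        for k, present in zip(keys, bits):
            # Tuples are independent, so world weights multiply.
            weight *= tuples[k] if present else 1.0 - tuples[k]
        if query(world):
            total += weight
    return total

# Hypothetical tuple-independent database over R(A) and U(A).
db = {("R", (1,)): 0.5, ("R", (2,)): 0.8, ("U", (1,)): 0.4}

def q(world):
    """Boolean query pi_empty[R(A) - U(A)]: some R-tuple has no match in U."""
    r = {v for (rel, v) in world if rel == "R"}
    u = {v for (rel, v) in world if rel == "U"}
    return len(r - u) > 0

p = marginal_probability(db, q)
print(p)  # approximately 0.86 = 1 - (1 - 0.5*0.6) * (1 - 0.8)
```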

The Main Result
Data complexity of any 1RA⁻ query Q on tuple-independent databases: polynomial time if Q is hierarchical and #P-hard otherwise.
This result strictly extends a 2004 result by Dalvi and Suciu:
We added the relational algebra difference operator
  ◮ and moved from conjunctive queries without self-joins to 1RA⁻.
Same syntactic characterization of tractable queries.
  ◮ The hierarchical property can be recognized in LOGSPACE.
The reason for tractability is, however, different.

Hierarchical 1RA⁻ Queries
Let [C] be the equivalence class of attribute C in query Q as defined by the transitivity of equi-join conditions and difference operators.
  ◮ E.g., C and D are in the same class due to the join X(C) ⋈_{C=D} Y(D) or the difference X(C) −_{C↔D} Y(D) under attribute mapping C ↔ D.
A (Boolean*) 1RA⁻ query Q is hierarchical if, for every pair of distinct attribute equivalence classes [A] and [B], there is no triple of relation symbols R, S, and T in Q such that
  ◮ R[A][¬B] has attributes in [A] and not in [B],
  ◮ S[A][B] has attributes in both [A] and [B], and
  ◮ T[¬A][B] has attributes in [B] and not in [A].
* For non-Boolean queries, we need not check for equivalence classes with attributes in the query result.
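
The definition above can be checked mechanically. The following is a minimal sketch for Boolean queries, assuming the attribute equivalence classes have already been computed and each relation symbol is given as the set of classes its attributes fall into; the function name `is_hierarchical` and this input encoding are illustrative assumptions, not the authors' algorithm.

```python
from itertools import combinations

def is_hierarchical(relations):
    """relations: dict mapping relation symbol -> set of attribute-class names.
    Returns False iff some pair of classes [A], [B] has a witness triple
    R (only [A]), S (both [A] and [B]), T (only [B])."""
    classes = set().union(*relations.values())
    for a, b in combinations(sorted(classes), 2):
        only_a = any(a in s and b not in s for s in relations.values())
        both = any(a in s and b in s for s in relations.values())
        only_b = any(b in s and a not in s for s in relations.values())
        if only_a and both and only_b:
            return False
    return True

# pi_empty[(R(A) join S(A,B)) - T(A,B)]: no relation has B without A.
print(is_hierarchical({"R": {"A"}, "S": {"A", "B"}, "T": {"A", "B"}}))  # True
# pi_empty[R(A) join S(A,B) join T(B)]: R, S, T form a forbidden triple.
print(is_hierarchical({"R": {"A"}, "S": {"A", "B"}, "T": {"B"}}))       # False
```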

Examples
Examples of hierarchical queries:
  ◮ π_∅[ (R(A) ⋈ S(A, B)) − T(A, B) ]
  ◮ π_∅[ (R(A) × T(B)) − (U(A) × V(B)) ]
  ◮ π_∅[ (M(A) × N(B)) − (R(A) × T(B)) − (U(A) × V(B)) ]
  ◮ π_∅[ (M(A) × N(B)) − π_A (R(A) × T(B)) − (U(A) × V(B)) ]
Examples of non-hierarchical queries:
  ◮ π_∅[ R(A) ⋈ S(A, B) ⋈ T(B) ]
  ◮ π_∅[ π_B(R(A) ⋈ S(A, B)) − T(B) ]
  ◮ π_∅[ T(B) − π_B(R(A) ⋈ S(A, B)) ]
  ◮ π_∅[ X(A) ⋈ (R(A) − π_A(T(B) ⋈ S(A, B))) ]

Hardness Proof Idea
Reduction from the #P-hard model counting problem for positive 2DNF:
Given
  ◮ a non-hierarchical 1RA⁻ query Q and
  ◮ a positive bipartite DNF formula Ψ,
construct a tuple-independent database D with
  ◮ size polynomial in the number of variables and clauses in Ψ, and
  ◮ tuples annotated with variables in Ψ
such that Ψ annotates Q(D).
Then #Ψ = 2^n · P_{Q(D)}, where
  ◮ P_{Q(D)} is the probability of Q(D),
  ◮ 1/2 is the probability of each variable in Ψ, and
  ◮ n is the number of variables in Ψ.

Example of Hardness Reduction
Input formula and query:
  Ψ = x₁y₁ ∨ x₁y₂,    Q = π_∅[ R(A) − π_A(T(B) ⋈ S(A, B)) ]
Construct a database such that Ψ annotates Q's (nullary) result:
Column Φ holds annotations over variables in Ψ.
  ◮ Special annotations: ⊤ (true), ⊥ (false)
Variables are used as constants for the attribute B in T and S.
S(a, b, φ): clause a has variable b exactly when φ is true.
R(a, ⊤) and T(b, ¬b): a is a clause and b is a variable in Ψ.

R        T          S            T ⋈ S         π_A(T ⋈ S)       R − π_A(T ⋈ S)
A  Φ     B   Φ      A  B   Φ     A  B   Φ      A  Φ             A  Φ
1  ⊤     x₁  ¬x₁    1  x₁  ⊤     1  x₁  ¬x₁    1  ¬x₁ ∨ ¬y₁     1  x₁y₁
2  ⊤     y₁  ¬y₁    1  y₁  ⊤     1  y₁  ¬y₁    2  ¬x₁ ∨ ¬y₂     2  x₁y₂
         y₂  ¬y₂    1  y₂  ⊥     1  y₂  ⊥
                    2  x₁  ⊤     2  x₁  ¬x₁
                    2  y₁  ⊥     2  y₁  ⊥
                    2  y₂  ⊤     2  y₂  ¬y₂

Query Q is already hard when T is the only uncertain input relation!
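
As a sanity check of this example, the sketch below rebuilds the construction for Ψ = x₁y₁ ∨ x₁y₂, evaluates Q = π_∅[R − π_A(T ⋈ S)] in every possible world, and confirms that Q holds exactly when Ψ does, so that #Ψ = 2^n · P_{Q(D)}. The encoding (clause ids 1 and 2, variable names as strings, helper names `psi` and `q_on_world`) is an illustrative assumption; only the tuples with annotation ⊤ are stored in S, and each variable has probability 1/2.

```python
from itertools import product
from fractions import Fraction

# Psi = x1 y1 OR x1 y2, given as clause id -> set of variables.
clauses = {1: {"x1", "y1"}, 2: {"x1", "y2"}}
variables = sorted(set().union(*clauses.values()))

def psi(assign):
    """Positive DNF: some clause has all of its variables true."""
    return any(all(assign[v] for v in c) for c in clauses.values())

def q_on_world(assign):
    """Evaluate pi_empty[R(A) - pi_A(T(B) join S(A,B))] in one world."""
    r = set(clauses)                                     # R(a) annotated TRUE
    t = {b for b in variables if not assign[b]}          # T(b) annotated NOT b
    s = {(a, b) for a, c in clauses.items() for b in c}  # S(a,b) TRUE iff b in clause a
    joined = {a for (a, b) in s if b in t}               # pi_A(T join S)
    return len(r - joined) > 0

count = 0
prob = Fraction(0)
agree = True
for bits in product([False, True], repeat=len(variables)):
    assign = dict(zip(variables, bits))
    agree = agree and (psi(assign) == q_on_world(assign))
    if q_on_world(assign):
        count += 1                                   # one more model of Psi
        prob += Fraction(1, 2) ** len(variables)     # weight of this world

print(agree, count, prob)  # True 3 3/8, and 2**3 * 3/8 == 3
```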

Hard Query Patterns
There are 48 (!) minimal non-hierarchical query patterns.
Binary trees with leaves A, AB, and B and inner nodes ⋈ or −.
  ◮ Some are symmetric and need not be considered separately: A and B can be exchanged, joins are commutative and associative.
  ◮ Still, many cases are left to consider due to the difference operator.

Pattern   Root   Inner node   Leaf under root   Leaves under inner node
P1.1      ⋈      ⋈            AB                A, B
P1.2      ⋈      −            AB                A, B
P1.3      −      ⋈            AB                A, B
P1.4      −      −            AB                A, B
P5.1      ⋈      ⋈            A                 B, AB
P5.2      ⋈      −            A                 B, AB
P5.3      −      ⋈            A                 B, AB
P5.4      −      −            A                 B, AB
...

There is a database construction scheme for each pattern.
Each non-hierarchical query Q matches a pattern Px.y.
P1.1 is the only hard pattern to consider w/o the difference operator!

Non-hierarchical Queries Match Minimal Hard Patterns
Each non-hierarchical query Q matches a pattern Px.y: there is a total mapping from Px.y to Q's parse tree that
  ◮ is the identity on inner nodes ⋈ and −,
  ◮ preserves ancestor-descendant relationships,
  ◮ maps leaves A, AB, B to relations R[A][¬B], S[A][B], T[¬A][B].
Pattern P5.3:  A − (B ⋈ AB)
Query Q:       π_∅[ X(A) ⋈ (R(A) − π_A(T(B) ⋈ S(A, B))) ]
The match preserves the annotation of the query pattern: Q and Px.y have the same annotation for any input database.

Evaluation of Hierarchical 1RA⁻ Queries
Approach based on knowledge compilation
  ◮ For any database D, the probability P_{Q(D)} of a 1RA⁻ query Q is the probability P_Ψ of the query annotation Ψ.
  ◮ Compile Ψ into a poly-size OBDD(Ψ).
  ◮ Compute the probability of OBDD(Ψ) in time linear in its size.
Distinction from existing tractability results [O. & Huang 2008]:
1RA⁻ queries w/o difference: annotations are read-once.
  ◮ Read-once annotations admit linear-size OBDDs.
1RA⁻ queries: annotations are not read-once.
  ◮ They admit OBDDs of size linear in the database size but exponential in the query size.
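
The last step, computing the probability of an OBDD in time linear in its size, can be sketched as a memoized bottom-up pass. The node encoding `(variable, low_child, high_child)` with terminals `True`/`False`, and the hand-built OBDD for x₁y₁ ∨ x₁y₂ under the order x₁, y₁, y₂, are illustrative assumptions; this sketch does not cover the compilation of Ψ into an OBDD.

```python
def obdd_probability(node, prob, cache=None):
    """node: True, False, or a tuple (var, low, high).
    prob: dict mapping each variable to the probability that it is true.
    Each distinct node is evaluated once, so the pass is linear in OBDD size."""
    if cache is None:
        cache = {}
    if node is True:
        return 1.0
    if node is False:
        return 0.0
    key = id(node)  # shared sub-diagrams are the same Python object
    if key not in cache:
        var, low, high = node
        # Shannon expansion: P = (1 - p) * P(low) + p * P(high)
        cache[key] = ((1.0 - prob[var]) * obdd_probability(low, prob, cache)
                      + prob[var] * obdd_probability(high, prob, cache))
    return cache[key]

# Hand-built OBDD for x1 y1 OR x1 y2 with variable order x1 < y1 < y2.
y2_node = ("y2", False, True)
y1_node = ("y1", y2_node, True)
root = ("x1", False, y1_node)

result = obdd_probability(root, {"x1": 0.5, "y1": 0.5, "y2": 0.5})
print(result)  # 0.375
```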
