Trade-offs in Static and Dynamic Query Evaluation
Ahmet Kara, Milos Nikolic, Dan Olteanu, and Haozhe Zhang
fdbresearch.github.io
KOCOON Workshop 2019, Arras
Static and Dynamic Query Evaluation
Static Query Evaluation
- Preprocessing: a data structure is built from the query and the database (cost: preprocessing time).
- Enumeration: the query result is enumerated from the data structure (cost: enumeration delay).
Dynamic Query Evaluation
- Preprocessing and enumeration as in the static case.
- Maintenance: the data structure is maintained under single-tuple updates (cost: update time).
We are interested in the trade-off between:
- preprocessing time
- enumeration delay
- (update time)
1 / 13
Landscape of Static Query Evaluation
Preprocessing time / enumeration delay:
- conjunctive: O(N^w) / O(1) [TODS ’15]
- (α-)acyclic: O(N) / O(N) [CSL ’07]
- free-connex: O(N) / O(1) [CSL ’07]
- hierarchical: O(N^{1+(w−1)ε}) / O(N^{1−ε}), ε ∈ [0, 1] [PODS ’20] (this work)
static width w = s↑ [TODS ’15] or faqw [PODS ’16]
[Plot: log_N enumeration delay against log_N preprocessing time, with the classes conjunctive, acyclic, free-connex, and hierarchical marked.]
[Plot: log_N time against ε ∈ [0, 1], showing preprocessing time O(N^{1+(w−1)ε}) and enumeration delay O(N^{1−ε}).]
2 / 13
Landscape of Dynamic Query Evaluation
Preprocessing time / update time / enumeration delay:
- conjunctive: O(N^w) / O(N^δ) / O(1) [SIGMOD ’18]
- δ0-hierarchical: w = 1, δ = 0 [PODS ’17]
- hierarchical: O(N^{1+(w−1)ε}) / O(N^{δε})∗ / O(N^{1−ε}), ε ∈ [0, 1] [PODS ’20] (this work)
- δ1-hierarchical: w ≤ 2, δ = 1
(∗): amortized update time
static width w = s↑ [TODS ’15] or faqw [PODS ’16]
dynamic width δ = maximum static width over the delta queries [PODS ’20]
[Diagram: the query classes (α-)acyclic, free-connex, hierarchical, δ0-hierarchical, and δ1-hierarchical.]
3 / 13
Contribution 1: Recovery of Prior Approaches
[Plot: log_N preprocessing time, log_N update time, and log_N enumeration delay, showing conjunctive, δ0-hierarchical (w = 1, δ = 0), and hierarchical queries; the point (1, 0, 1) is marked.]
For hierarchical queries and ε ∈ [0, 1]:
- preprocessing time O(N^{1+(w−1)ε})
- amortized update time O(N^{δε})
- enumeration delay O(N^{1−ε})
Recovers the prior approach for conjunctive queries by setting ε = 1.
Recovers the prior approach for δ0-hierarchical queries by setting ε = 1.
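The recovery of prior results can be checked numerically. The following is a minimal Python sketch that only evaluates the three exponents stated on this slide; the function name tradeoff_exponents is illustrative and not part of the talk.

```python
def tradeoff_exponents(w, delta, eps):
    """Return the exponents of N for (preprocessing, amortized update, delay).

    Evaluates the bounds O(N^{1+(w-1)eps}), O(N^{delta*eps}), O(N^{1-eps})
    from the slides; the function itself is only an illustration.
    """
    if not 0 <= eps <= 1:
        raise ValueError("eps must lie in [0, 1]")
    return (1 + (w - 1) * eps, delta * eps, 1 - eps)


# eps = 1 gives exponents (w, delta, 0), i.e. O(N^w) / O(N^delta) / O(1).
conjunctive_like = tradeoff_exponents(w=2, delta=1, eps=1)

# For a delta_0-hierarchical query (w = 1, delta = 0), eps = 1 gives (1, 0, 0).
delta0 = tradeoff_exponents(w=1, delta=0, eps=1)

# eps = 0 gives (1, 0, 1): linear preprocessing, constant update, linear delay.
lazy_end = tradeoff_exponents(w=2, delta=1, eps=0)
```

Intermediate values of ε interpolate between these end points; for example, w = 2, δ = 1, ε = 0.5 yields the exponents (1.5, 0.5, 0.5).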
4 / 13
Contribution 2: Sublinear Update Time and Delay
[Plot: log_N preprocessing time, log_N update time, and log_N enumeration delay, with the points (1, 0, 1) and (0, 1, 1), the δ0-hierarchical queries, and a region labeled "Sublinear" marked.]
For ε ∈ [0, 1]: preprocessing time O(N^{1+(w−1)ε}), amortized update time O(N^{δε}), enumeration delay O(N^{1−ε}).
First approach that allows sublinear amortized update time and sublinear enumeration delay for hierarchical queries.
5 / 13
Contribution 3: Optimality for δ1-Hierarchical Queries
For any δ1-hierarchical query, there is no algorithm that admits
- preprocessing time: arbitrary
- amortized update time: O(N^{0.5−γ})
- enumeration delay: O(N^{0.5−γ})
for any γ > 0, unless the OMv Conjecture (*) fails.
Our approach maintains any δ1-hierarchical query with
- preprocessing time: O(N^{1+ε})
- amortized update time: O(N^ε)
- enumeration delay: O(N^{1−ε})
⇒ For ε = 0.5, this is weak Pareto optimal, unless the OMv Conjecture fails.
(*) OMv Conjecture: The Online Matrix-Vector Multiplication problem cannot be solved in sub-cubic time.
[Plot: log_N preprocessing time, log_N update time, and log_N enumeration delay for δ = 1, with the points (1, 0, 1) and (1.5, 0.5, 0.5) marked; the latter, at ε = 0.5, is labeled optimal.]
6 / 13
Contribution 4: Single-Tuple vs Bulk Tuple Updates
δ = w − 1 or δ = w for hierarchical queries.
Case δ = w − 1
Time to insert N tuples: O(N · N^{(w−1)ε}) = O(N^{1+(w−1)ε}).
⇒ Preprocessing can be simulated by executing N single-tuple updates.
Case δ = w
Time to insert N tuples: O(N · N^{wε}) = O(N^{1+(w−1)ε+ε}).
⇒ Complexity gap of O(N^ε) between single-tuple updates and bulk updates.
7 / 13
Hierarchical Queries
A query is hierarchical if for any two variables X, Y:
atoms(X) ⊆ atoms(Y) or atoms(X) ⊇ atoms(Y) or atoms(X) ∩ atoms(Y) = ∅
Here atoms(X) is the set of atoms that contain the variable X, and ℱ denotes the set of free variables.
Hierarchical, for any ℱ ⊆ {A, B, C, D, F, G}:
Q(ℱ) = R(A, B, D), S(A, B), T(A, C, F), U(A, C, G)
[Diagram: variables A, B, C, D, F, G and atoms R, S, T, U.]
Not hierarchical, for any ℱ ⊆ {A, B}:
Q(ℱ) = R(A), S(A, B), T(B)
[Diagram: variables A, B and atoms R, S, T.]
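The definition translates directly into a pairwise check. Below is a minimal Python sketch, assuming a query is represented as a dictionary from atom names to variable tuples; this representation and the function names are illustrative choices, not taken from the slides.

```python
from itertools import combinations


def atoms_of(query):
    """Map each variable to the set of atom names whose schema contains it."""
    index = {}
    for name, schema in query.items():
        for var in schema:
            index.setdefault(var, set()).add(name)
    return index


def is_hierarchical(query):
    """For every pair of variables, atoms(X) and atoms(Y) must be
    comparable by inclusion or disjoint."""
    index = atoms_of(query)
    for x, y in combinations(index, 2):
        ax, ay = index[x], index[y]
        if not (ax <= ay or ax >= ay or ax.isdisjoint(ay)):
            return False
    return True


# The two example queries from this slide (free variables play no role here).
q_hier = {"R": ("A", "B", "D"), "S": ("A", "B"),
          "T": ("A", "C", "F"), "U": ("A", "C", "G")}
q_not_hier = {"R": ("A",), "S": ("A", "B"), "T": ("B",)}
```

In the second query, atoms(A) = {R, S} and atoms(B) = {S, T} overlap in S without either containing the other, so the check fails.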
8 / 13
δ0-Hierarchical Queries
A hierarchical query is δ0-hierarchical if for any bound variable X and atom R(X̄) ∈ atoms(X):
free(atoms(X)) ⊆ X̄.
Here X̄ is the schema (set of variables) of the atom R, and free(atoms(X)) is the set of free variables occurring in the atoms of atoms(X).
δ0-hierarchical:
Q(A, B, C) = R(A, B, D), S(A, B), T(A, C, F), U(A, C, G)
[Diagram: variables A, B, C, D, F, G and atoms R, S, T, U.]
Hierarchical but not δ0-hierarchical:
Q(A) = S(A, B), T(B)
[Diagram: variables A, B and atoms S, T.]
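The δ0 condition can be tested mechanically. The sketch below is a Python illustration under the assumption that a query is a dictionary from atom names to variable tuples plus a set of free variables, and that the query is already known to be hierarchical; the names are hypothetical.

```python
def atoms_of(query):
    """Map each variable to the set of atom names whose schema contains it."""
    index = {}
    for name, schema in query.items():
        for var in schema:
            index.setdefault(var, set()).add(name)
    return index


def is_delta0(query, free):
    """Check free(atoms(X)) ⊆ schema(R) for every bound variable X and
    every atom R containing X. Assumes `query` is hierarchical."""
    for var, names in atoms_of(query).items():
        if var in free:
            continue  # the condition only concerns bound variables
        # free variables occurring in any atom that contains `var`
        free_here = {v for n in names for v in query[n]} & free
        for n in names:
            if not free_here <= set(query[n]):
                return False
    return True


q1 = {"R": ("A", "B", "D"), "S": ("A", "B"),
      "T": ("A", "C", "F"), "U": ("A", "C", "G")}
q2 = {"S": ("A", "B"), "T": ("B",)}
```

For Q(A) = S(A, B), T(B), the bound variable B occurs in T, whose schema {B} does not contain the free variable A, so the query is not δ0-hierarchical.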
9 / 13
δ1-Hierarchical Queries
A hierarchical query is δ1-hierarchical if:
- The query is not δ0-hierarchical.
- For any bound variable X and atom R(X̄) ∈ atoms(X), there is an atom S(Ȳ) ∈ atoms(X) such that free(atoms(X)) ⊆ X̄ ∪ Ȳ.
Here X̄ and Ȳ are the schemas of the atoms R and S.
δ1-hierarchical:
Q(A, D, E, G) = R(A, B, D), S(A, B, E), T(A, C, F), U(A, C, G)
[Diagram: variables A, B, C, D, E, F, G and atoms R, S, T, U.]
Not δ1-hierarchical:
Q(D, G) = R(A, B, D), S(A, B, E), T(A, C, F), U(A, C, G)
[Diagram: variables A, B, C, D, E, F, G and atoms R, S, T, U.]
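The δ1 condition differs from the δ0 condition only in allowing one additional atom to help cover the free variables. A minimal Python sketch, assuming a hierarchical query given as a dictionary from atom names to variable tuples together with a set of free variables (both the representation and the function names are illustrative):

```python
def atoms_of(query):
    """Map each variable to the set of atom names whose schema contains it."""
    index = {}
    for name, schema in query.items():
        for var in schema:
            index.setdefault(var, set()).add(name)
    return index


def covered(query, free, extra_atoms):
    """extra_atoms=0: the delta_0 condition.
    extra_atoms=1: some atom S in atoms(X) may be added to R's schema."""
    for var, names in atoms_of(query).items():
        if var in free:
            continue
        free_here = {v for n in names for v in query[n]} & free
        for r in names:
            base = set(query[r])
            if extra_atoms == 0:
                ok = free_here <= base
            else:
                ok = any(free_here <= base | set(query[s]) for s in names)
            if not ok:
                return False
    return True


def is_delta1(query, free):
    """Not delta_0-hierarchical, but coverable with one additional atom."""
    return not covered(query, free, 0) and covered(query, free, 1)


q = {"R": ("A", "B", "D"), "S": ("A", "B", "E"),
     "T": ("A", "C", "F"), "U": ("A", "C", "G")}
```

With free variables {D, G}, the bound variable A occurs in S(A, B, E), and no single further atom contains both D and G, so the second condition fails.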
10 / 13
Static Query Evaluation - Example
Simple δ1-hierarchical query Q(B, C) = R(A, B), S(A, C)
[Plot: log_N enumeration delay against log_N preprocessing time.]
Lower bound [CSL ’07]: There is no algorithm that admits preprocessing time O(N) and enumeration delay O(1), unless Boolean Matrix Multiplication can be solved in quadratic time.
Known approach: Eager preprocessing, quick enumeration
- Preprocessing: Materialize the result.
- Enumeration: Enumerate from materialized result.
Known approach: Lazy preprocessing, heavy enumeration
- Preprocessing: Eliminate dangling tuples.
- Enumeration: For each B-value, enumerate distinct C-values.
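The two known approaches for Q(B, C) = R(A, B), S(A, C) can be contrasted in a short Python sketch. It assumes relations are given as lists of pairs under set semantics; the function names and index layout are illustrative and not taken from the talk.

```python
from collections import defaultdict


def eager(R, S):
    """Eager: materialize all distinct (b, c) pairs during preprocessing."""
    cs_by_a = defaultdict(set)
    for a, c in S:
        cs_by_a[a].add(c)
    return {(b, c) for a, b in R for c in cs_by_a.get(a, ())}


def lazy_preprocess(R, S):
    """Lazy: only drop dangling tuples and index what remains."""
    a_in_r = {a for a, _ in R}
    a_in_s = {a for a, _ in S}
    as_by_b = defaultdict(set)  # b -> A-values that also occur in S
    cs_by_a = defaultdict(set)  # a -> C-values, for A-values that occur in R
    for a, b in R:
        if a in a_in_s:
            as_by_b[b].add(a)
    for a, c in S:
        if a in a_in_r:
            cs_by_a[a].add(c)
    return as_by_b, cs_by_a


def lazy_enumerate(as_by_b, cs_by_a):
    """For each B-value, emit each C-value once. Duplicates reached via
    different A-values are skipped, which is where the delay is spent."""
    for b, a_values in as_by_b.items():
        seen = set()
        for a in a_values:
            for c in cs_by_a[a]:
                if c not in seen:
                    seen.add(c)
                    yield (b, c)


R = [(1, "b1"), (2, "b1"), (3, "b2")]
S = [(1, "c1"), (2, "c1"), (2, "c2"), (4, "c3")]
lazy_result = list(lazy_enumerate(*lazy_preprocess(R, S)))
```

Both variants produce the same set of tuples; they differ in whether the work of removing duplicate (b, c) pairs is paid up front or between consecutive outputs.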
Open question: Is there an algorithm that admits sub-quadratic preprocessing time and sub-linear enumeration delay?
Our approach, for ε ∈ [0, 1]:
- preprocessing time O(N^{1+ε})
- enumeration delay O(N^{1−ε})
[Plot: log_N time against ε ∈ [0, 1] for preprocessing time and enumeration delay.]
11 / 13
Dynamic Query Evaluation - Example
Simple δ1-hierarchical query Q(A) = R(A, B), S(B)
[Plot: log_N preprocessing time, log_N update time, and log_N enumeration delay for δ = 1, with the point (1, 0, 1) marked.]
Lower bound: For this query, there is no algorithm that admits arbitrary preprocessing time, amortized update time O(N^{0.5−γ}), and enumeration delay O(N^{0.5−γ}) for any γ > 0, unless the OMv Conjecture fails.
Known approach: Eager update, quick enumeration
- Preprocessing: Materialize the result.
- Upon update: Maintain the materialized result.
- Enumeration: Enumerate from materialized result.
Known approach: Lazy update, heavy enumeration
- Preprocessing: Eliminate dangling tuples.
- Upon update: Update only base relations.
- Enumeration: Eliminate dangling tuples and enumerate.
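The two known approaches for Q(A) = R(A, B), S(B) can be sketched as two small Python classes. This is an illustration under set semantics with single-tuple inserts and deletes; the class and method names are hypothetical, and the counting scheme in the eager variant is one common way to maintain such a view, not necessarily the one used in the cited work.

```python
from collections import defaultdict


class EagerView:
    """Maintains the result on every update; enumeration just reads it."""

    def __init__(self):
        self.r_by_b = defaultdict(set)  # b -> A-values with R(a, b)
        self.s = set()
        self.count = {}                 # a -> number of b with R(a, b), S(b)

    def _bump(self, a, d):
        self.count[a] = self.count.get(a, 0) + d
        if self.count[a] == 0:
            del self.count[a]

    def update_r(self, a, b, insert=True):
        if insert == (a in self.r_by_b[b]):
            return  # no change under set semantics
        (self.r_by_b[b].add if insert else self.r_by_b[b].discard)(a)
        if b in self.s:
            self._bump(a, 1 if insert else -1)

    def update_s(self, b, insert=True):
        if insert == (b in self.s):
            return
        (self.s.add if insert else self.s.discard)(b)
        for a in self.r_by_b[b]:  # may touch many A-values per update
            self._bump(a, 1 if insert else -1)

    def result(self):
        return iter(list(self.count))


class LazyView:
    """Updates only the base relations; all join work happens at enumeration."""

    def __init__(self):
        self.r, self.s = set(), set()

    def update_r(self, a, b, insert=True):
        (self.r.add if insert else self.r.discard)((a, b))

    def update_s(self, b, insert=True):
        (self.s.add if insert else self.s.discard)(b)

    def result(self):
        seen = set()
        for a, b in self.r:
            if b in self.s and a not in seen:  # skip dangling and duplicates
                seen.add(a)
                yield a


def replay(view):
    """Apply a fixed update sequence and record the result after each step."""
    view.update_r(1, "x"); view.update_r(2, "x"); view.update_r(3, "y")
    view.update_s("x")
    first = set(view.result())
    view.update_s("x", insert=False); view.update_s("y")
    second = set(view.result())
    view.update_r(3, "y", insert=False)
    return first, second, set(view.result())
```

In the eager class a single update to S may change the count of every A-value paired with that B-value, while in the lazy class each update is a constant-time set operation and the scan over R is deferred to enumeration.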
Open question: Is there an algorithm that admits sub-linear (amortized) update time and sub-linear enumeration delay?
Our approach, for ε ∈ [0, 1]:
- preprocessing time O(N)
- amortized update time O(N^ε)
- enumeration delay O(N^{1−ε})
[Plot: log_N preprocessing time, log_N update time, and log_N enumeration delay for δ = 1, with the points (1, 0, 1) and (1.0, 0.5, 0.5) marked; the latter is labeled optimal∗.]
(∗): Weak Pareto optimality by OMv Conjecture
12 / 13
Conclusion
Benefits of Our Approach
- Allows tuning the trade-off between preprocessing time, update time, and enumeration delay.
- Recovers existing results as specific points.
- Maintains hierarchical queries with sub-linear amortized update time and sub-linear enumeration delay.
- Maintains δ1-hierarchical queries with weak Pareto optimal update time and delay.
Ongoing Work
- Extension of our approach to conjunctive queries, aggregate queries, and enumeration in desired order.
- System prototype.
13 / 13