
Counting Triangles under Updates in Worst-Case Optimal Time



  1. Counting Triangles under Updates in Worst-Case Optimal Time. Ahmet Kara, Hung Q. Ngo, Milos Nikolic, Dan Olteanu, and Haozhe Zhang. fdbresearch.github.io. Highlights 2018, Berlin. Relational AI.

  2. Problem Setting. Maintain the triangle count $Q$ under single-tuple updates to $R$, $S$, and $T$! [Figure: triangle over the variables $A$, $B$, $C$ with edges labeled $R$, $S$, $T$.] $Q$ counts the number of tuples in the join of $R$, $S$, and $T$: $Q = \sum_{a,b,c} R(a,b) \cdot S(b,c) \cdot T(c,a)$.
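The definition of $Q$ above can be sketched in a few lines of Python. This is only an illustration, assuming each relation is stored as a dictionary from tuples to multiplicities; the function name `triangle_count` and the sample data are hypothetical, not taken from the slides.

```python
from collections import Counter

def triangle_count(R, S, T):
    """Q = sum over a, b, c of R(a, b) * S(b, c) * T(c, a)."""
    # Index S by its first column so that the (b, c) tuples matching
    # a given b are found without scanning all of S.
    S_by_b = {}
    for (b, c), s in S.items():
        S_by_b.setdefault(b, []).append((c, s))
    total = 0
    for (a, b), r in R.items():
        for c, s in S_by_b.get(b, []):
            # T.get returns 0 when the closing tuple (c, a) is absent.
            total += r * s * T.get((c, a), 0)
    return total

# Hypothetical sample data: tuple -> multiplicity.
R = Counter({(1, 2): 1, (1, 3): 1})
S = Counter({(2, 4): 1, (3, 4): 2})
T = Counter({(4, 1): 1})
print(triangle_count(R, S, T))  # prints 3
```

Multiplicities are multiplied, so the tuple `(3, 4)` with multiplicity 2 in `S` contributes two triangles.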

  3. The Maintenance Problem. [Figure: single-tuple updates take the database from $D_0$ to $D_1$ to $D_2$; at each step the auxiliary data structure ($A_0$, $A_1$, $A_2$) and the triangle count ($Q(D_0)$, $Q(D_1)$, $Q(D_2)$) are maintained.] Given a current database $D$ and a single-tuple update, what are the time and space complexities for maintaining $Q(D)$?

  4. Much Ado about Triangles. The triangle query served as a milestone in many fields: worst-case optimal join algorithms [Algorithmica 1997, SIGMOD R. 2013]; parallel query evaluation [Found. & Trends DB 2018]; randomized approximation in static settings [FOCS 2015]; randomized approximation in data streams [SODA 2002, COCOON 2005, PODS 2006, PODS 2016, Theor. Comput. Sci. 2017]. Intensive investigation of answering queries under updates: theoretical developments [PODS 2017, ICDT 2018]; systems developments [F. & T. DB 2012, VLDB J. 2014, SIGMOD 2017, 2018]; lower bounds [STOC 2015, ICM 2018]. So far: no dynamic algorithm maintains the exact triangle count in worst-case optimal time!

  5. Naïve Maintenance: "Compute from scratch!" Given the update $\delta R = \{(a', b') \mapsto m\}$, recompute $\sum_{a,b,c} \underbrace{\bigl(R(a,b) + \delta R(a,b)\bigr)}_{\mathit{newR}(a,b)} \cdot S(b,c) \cdot T(c,a) = \sum_{a,b,c} \mathit{newR}(a,b) \cdot S(b,c) \cdot T(c,a)$. Maintenance complexity. Time: $O(|D|^{1.5})$ using worst-case optimal join algorithms. Space: $O(|D|)$ to store the input relations.

  6. Classical Incremental View Maintenance (IVM): "Compute the difference!" Given the update $\delta R = \{(a', b') \mapsto m\}$: $\sum_{a,b,c} \bigl(R(a,b) + \delta R(a,b)\bigr) \cdot S(b,c) \cdot T(c,a) = \sum_{a,b,c} R(a,b) \cdot S(b,c) \cdot T(c,a) + \delta R(a',b') \cdot \sum_{c} S(b',c) \cdot T(c,a')$. Maintenance complexity. Time: $O(|D|)$ to intersect the $C$-values from $S$ and $T$. Space: $O(|D|)$ to store the input relations.
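The difference computation on this slide can be sketched as follows. This is a minimal illustration under the assumption that relations are dictionaries from tuples to multiplicities; the names `delta_count_for_R` and `brute_force` and the sample data are hypothetical. The linear scan over `S` mirrors the $O(|D|)$ bound rather than any optimized index.

```python
from collections import Counter

def delta_count_for_R(S, T, a, b, m):
    """Change in Q caused by the single-tuple update {(a, b) -> m} to R."""
    # delta = m * sum over c of S(b, c) * T(c, a); one scan over S.
    return m * sum(s * T.get((c, a), 0)
                   for (b2, c), s in S.items() if b2 == b)

def brute_force(R, S, T):
    """Recompute Q from scratch; used here only to check the delta."""
    return sum(r * s * T.get((c, a), 0)
               for (a, b), r in R.items()
               for (b2, c), s in S.items() if b2 == b)

# Hypothetical sample data.
R = Counter({(1, 2): 1})
S = Counter({(2, 4): 1, (3, 4): 2, (3, 5): 1})
T = Counter({(4, 1): 1, (5, 1): 3})

Q = brute_force(R, S, T)                         # count before the update
delta = delta_count_for_R(S, T, a=1, b=3, m=1)   # difference only
R[(1, 3)] += 1                                   # apply the update to R
assert Q + delta == brute_force(R, S, T)
```

A delete is the same call with a negative `m`.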

  7. Factorized Incremental View Maintenance (F-IVM): "Compute the difference by using pre-materialized views!" Given the update $\delta R = \{(a', b') \mapsto m\}$, pre-materialize $V_{ST}(b,a) = \sum_{c} S(b,c) \cdot T(c,a)$! Then $\sum_{a,b,c} \bigl(R(a,b) + \delta R(a,b)\bigr) \cdot S(b,c) \cdot T(c,a) = \sum_{a,b,c} R(a,b) \cdot S(b,c) \cdot T(c,a) + \delta R(a',b') \cdot V_{ST}(b',a')$. Maintenance complexity. Time for updates to $R$: $O(1)$ to look up in $V_{ST}$. Time for updates to $S$ and $T$: $O(|D|)$ to maintain $V_{ST}$. Space: $O(|D|^{2})$ to store the input relations and $V_{ST}$.
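A rough sketch of the materialized-view idea, assuming dictionary-based relations with multiplicities. The class name `FIVMTriangleCounter`, its method names, and the per-`c` indexes are illustrative choices, not the authors' implementation.

```python
from collections import Counter, defaultdict

class FIVMTriangleCounter:
    """Maintains Q with the view V_ST(b, a) = sum_c S(b, c) * T(c, a)."""

    def __init__(self):
        self.R = Counter()                  # (a, b) -> multiplicity
        self.S_by_c = defaultdict(Counter)  # c -> {b: S(b, c)}
        self.T_by_c = defaultdict(Counter)  # c -> {a: T(c, a)}
        self.V_ST = Counter()               # (b, a) -> view value
        self.count = 0                      # current triangle count Q

    def update_R(self, a, b, m):
        # One lookup in the materialized view.
        self.count += m * self.V_ST[(b, a)]
        self.R[(a, b)] += m

    def update_S(self, b, c, m):
        # Every T-tuple sharing c changes both the count and V_ST.
        for a, t in self.T_by_c[c].items():
            self.count += m * t * self.R[(a, b)]
            self.V_ST[(b, a)] += m * t
        self.S_by_c[c][b] += m

    def update_T(self, c, a, m):
        # Symmetric to update_S: iterate the S-tuples sharing c.
        for b, s in self.S_by_c[c].items():
            self.count += m * s * self.R[(a, b)]
            self.V_ST[(b, a)] += m * s
        self.T_by_c[c][a] += m

counter = FIVMTriangleCounter()
counter.update_S(2, 4, 1)
counter.update_T(4, 1, 1)
counter.update_R(1, 2, 1)
print(counter.count)  # prints 1
```

Updates to `R` touch a single view entry, while updates to `S` and `T` may touch as many entries as there are matching tuples, which is where the linear update time and the quadratic view size on the slide come from.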

  10. Closing the Complexity Gap. Complexity bounds for the maintenance of the triangle count. Known upper bound: maintenance time $O(|D|)$, space $O(|D|)$. Can the triangle count be maintained in sublinear time? Yes! We propose IVM$^{\varepsilon}$, with amortized maintenance time $O(|D|^{0.5})$. This is worst-case optimal! Known lower bound: the amortized maintenance time is not $O(|D|^{0.5-\gamma})$ for any $\gamma > 0$ (under reasonable complexity-theoretic assumptions).

  11. IVM$^{\varepsilon}$ Exhibits a Time-Space Tradeoff. Given $\varepsilon \in [0, 1]$, IVM$^{\varepsilon}$ maintains the triangle count with $O(|D|^{\max\{\varepsilon, 1-\varepsilon\}})$ amortized time and $O(|D|^{1+\min\{\varepsilon, 1-\varepsilon\}})$ space. [Figure: space and amortized time complexity plotted against $\varepsilon$ (ticks at 0, 0.5, 1), with levels $O(|D|^{0.5})$, $O(|D|)$, and $O(|D|^{1.5})$; worst-case optimality is marked at $\varepsilon = 0.5$.] Known maintenance approaches are recovered by IVM$^{\varepsilon}$.
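To make the tradeoff concrete, the two exponents can be tabulated for a few values of $\varepsilon$. The helper name `ivm_eps_exponents` is hypothetical; it only evaluates the two formulas stated on the slide.

```python
def ivm_eps_exponents(eps):
    """Return (time exponent, space exponent) of |D| for a given eps."""
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    time_exp = max(eps, 1 - eps)        # amortized update time
    space_exp = 1 + min(eps, 1 - eps)   # space
    return time_exp, space_exp

for eps in (0.0, 0.25, 0.5, 1.0):
    print(eps, ivm_eps_exponents(eps))
```

At $\varepsilon = 0.5$ the time exponent reaches its minimum of 0.5 while the space exponent reaches its maximum of 1.5; at the endpoints both exponents equal 1.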

  12. Main Ideas in IVM$^{\varepsilon}$. Compute the difference as in classical IVM! Materialize views as in Factorized IVM! New ingredient: use adaptive processing based on data skew! $\Rightarrow$ Treat heavy values differently from light values!

  14. Quo Vadis IVM$^{\varepsilon}$? Generalization of IVM$^{\varepsilon}$: IVM$^{\varepsilon}$ variants obtain sublinear maintenance time for counting versions of Loomis-Whitney, 4-cycle, and 4-path. Ongoing work: characterization of the class of conjunctive count queries that admit sublinear maintenance time; implementation of IVM$^{\varepsilon}$ on top of DBToaster. For details, see arxiv.org/abs/1804.02780

  18. Quick Look inside IVM$^{\varepsilon}$. Partition $R$ into a light part $R_L = \{t \in R \mid |\sigma_{A=t.A} R| < |D|^{\varepsilon}\}$ and a heavy part $R_H = R \setminus R_L$! [Figure: in the light part $R_L$, an $A$-value $a$ is paired with $B$-values $b_1, \ldots, b_n$ where $n < |D|^{\varepsilon}$; in the heavy part $R_H$, an $A$-value $a'$ is paired with $B$-values $b'_1, \ldots, b'_m$ where $m \ge |D|^{\varepsilon}$.] Derived bounds: for all $A$-values $a$, $|\sigma_{A=a} R_L| < |D|^{\varepsilon}$; moreover, $|\pi_A R_H| \le |D|^{1-\varepsilon}$. Likewise, partition $S = S_L \cup S_H$ based on $B$, and $T = T_L \cup T_H$ based on $C$! $Q$ is the sum of the skew-aware views $\sum_{a,b,c} R_U(a,b) \cdot S_V(b,c) \cdot T_W(c,a)$ with $U, V, W \in \{L, H\}$.
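The heavy/light split can be sketched for a static snapshot of one relation. Several things here are assumptions for illustration: the function name `partition_by_first_column`, passing $|D|$ in as the parameter `db_size`, and measuring the degree of a value by its number of distinct tuples. The sketch does not cover how the partition is kept up to date as the database changes.

```python
from collections import Counter

def partition_by_first_column(rel, db_size, eps):
    """Split rel into (light, heavy) parts by the degree of the first column."""
    threshold = db_size ** eps
    # degree[x] = number of distinct tuples of rel whose first column is x.
    degree = Counter(x for (x, _y) in rel)
    light = {t: m for t, m in rel.items() if degree[t[0]] < threshold}
    heavy = {t: m for t, m in rel.items() if degree[t[0]] >= threshold}
    return light, heavy

# Hypothetical data: A-value 1 has four partners, A-value 2 has one.
R = {(1, 10): 1, (1, 11): 1, (1, 12): 1, (1, 13): 1, (2, 10): 1}
R_L, R_H = partition_by_first_column(R, db_size=16, eps=0.5)
print(sorted(R_L))  # tuples of the light A-value 2
print(sorted(R_H))  # tuples of the heavy A-value 1
```

Because $S(b, c)$ is partitioned on $B$ and $T(c, a)$ on $C$, which are their first columns in the notation of the slides, the same helper would apply to them unchanged.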

  21. Adaptive Maintenance Strategy. Given an update $\delta R_* = \{(a', b') \mapsto m\}$, compute the difference for each skew-aware view using a different strategy. For each view, the evaluation order (from left to right) and the time are as follows. (i) View $\sum_{a,b,c} R_*(a,b) \cdot S_L(b,c) \cdot T_L(c,a)$: evaluate $\delta R_*(a',b') \cdot \sum_{c} S_L(b',c) \cdot T_L(c,a')$ in time $O(|D|^{\varepsilon})$. (ii) View $\sum_{a,b,c} R_*(a,b) \cdot S_H(b,c) \cdot T_H(c,a)$: evaluate $\delta R_*(a',b') \cdot \sum_{c} T_H(c,a') \cdot S_H(b',c)$ in time $O(|D|^{1-\varepsilon})$. (iii) View $\sum_{a,b,c} R_*(a,b) \cdot S_L(b,c) \cdot T_H(c,a)$: evaluate either $\delta R_*(a',b') \cdot \sum_{c} S_L(b',c) \cdot T_H(c,a')$ in time $O(|D|^{\varepsilon})$, or $\delta R_*(a',b') \cdot \sum_{c} T_H(c,a') \cdot S_L(b',c)$ in time $O(|D|^{1-\varepsilon})$.
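The three strategies listed on this slide can be sketched as below. This is an illustration only, assuming each part of $S$ and $T$ is a nested dictionary keyed by its first column; the helper names are hypothetical. The remaining combination, $S_H$ with $T_L$, does not appear in the slides above and is left out, so the function returns only the part of the difference covered here.

```python
def lookup(rel, x, y):
    """Multiplicity of tuple (x, y) in a nested-dict relation, 0 if absent."""
    return rel.get(x, {}).get(y, 0)

def delta_via_S(S_part, T_part, a, b, m):
    """Iterate the C-values paired with b in S_part, then look up T_part(c, a)."""
    return m * sum(s * lookup(T_part, c, a)
                   for c, s in S_part.get(b, {}).items())

def delta_via_T(S_part, T_part, a, b, m):
    """Iterate the C-values of T_part, looking up T_part(c, a) and S_part(b, c)."""
    return m * sum(lookup(T_part, c, a) * lookup(S_part, b, c)
                   for c in T_part)

def delta_for_shown_views(S_L, S_H, T_L, T_H, a, b, m):
    """Sum of the differences of the three views listed on the slide."""
    d_LL = delta_via_S(S_L, T_L, a, b, m)  # few C-values per light b
    d_HH = delta_via_T(S_H, T_H, a, b, m)  # few heavy C-values overall
    # Light S with heavy T: both orders are valid; pick the shorter loop.
    if len(S_L.get(b, {})) <= len(T_H):
        d_LH = delta_via_S(S_L, T_H, a, b, m)
    else:
        d_LH = delta_via_T(S_L, T_H, a, b, m)
    return d_LL + d_HH + d_LH

# Hypothetical parts: b = 2 is light in S; c = 4 is light and c = 7 heavy in T.
S_L = {2: {4: 1, 7: 1}}
S_H = {}
T_L = {4: {1: 1}}
T_H = {7: {1: 2, 3: 1}}
print(delta_for_shown_views(S_L, S_H, T_L, T_H, a=1, b=2, m=1))  # prints 3
```

Choosing the shorter loop in the mixed case is one way to read the "or" on the slide: both evaluation orders give the same value, and only their cost differs.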
