Selecting and Using Views To Compute Aggregate Queries Foto Afrati - PowerPoint PPT Presentation

Selecting and Using Views To Compute Aggregate Queries Foto Afrati (NTUA Greece) and Rada Chirkova (NC State University) F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005

What this Talk Is About � Topic: using materialized views to improve query- evaluation performance � Context: � Workloads of queries with or without aggregation � Designing optimal views under constraints � Results: � Formats of equivalent rewritings of queries using views: sufficient and necessary conditions � Designing and using views: complexity and algorithms F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 2

Using Databases: Asking Queries … “Give me recent total sales for all products in Edinburgh” F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 3

Using Databases: Asking Queries … “Give me recent total sales for all products in Edinburgh” SELECT Product, SUM(Sales) FROM SalesOfProducts WHERE Location = ‘Edinburgh’ AND Date > ’12/31/2003’ GROUP BY Product; F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 3

… And Getting Answers, Without Delay “Give me recent total sales for all products in Edinburgh” . . . F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 3

When Users Ask Complex Queries … SELECT DealerName, Expenses FROM Dealers, Reports WHERE Location = ‘San Francisco’ AND Sales > 300000; SELECT ProductID, ProductDescr, MIN(Sales) FROM AllProducts, Stores WHERE Store.Location = ‘Palo Alto’ SELECT ProductDescr, MAX(TotalSales) GROUP BY ProductID; FROM GenericProducts, Stores WHERE Date > ’05/01/2004’ GROUP BY ProductDescr; SELECT StoreLocation, AVG(Sales) FROM Stores WHERE TotalSales < 150000 GROUP BY StoreLocation; SELECT ProductID, AVG(Sales) FROM AllProducts, Dealers WHERE Dealers.Location = ‘Berkeley’ GROUP BY ProductID; SELECT ProductID, SUM(Sales) FROM AllProducts, Dealers WHERE Dealers.State = `CA’ GROUP BY ProductID; F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 4

… Can We Reduce Response Times? F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 4

Computing a Complex Query SELECT DealerName, Expenses FROM Dealers, Reports WHERE Location = ‘Edinburgh’ AND Sales > 300000; JOIN JOIN F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 5

Can Be Done with Maximum Speedup SELECT DealerName, Expenses FROM Dealers, Reports WHERE Location = ‘Edinburgh’ AND Sales > 300000; F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 6

Or with at Least Some Speedup SELECT DealerName, Expenses FROM Dealers, Reports WHERE Location = ‘Edinburgh’ AND Sales > 300000; JOIN F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 7

Our Goals and Approach � Improving query-evaluation performance � By using derived data (materialized views) � Working with a query workload at a time � Method: designing and precomputing derived data in advance F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 8

Example: Rewriting Aggregate Queries [ACGL05] Sales(CustID, DateID, ProductID, QtySold, TotalAmount, Discount) Customer(CustID, CustName, Address, City, State, RegistrDateID) Time(DateID, Month, Year) Q1: SELECT c.CustID, SUM (QtySold) Q2: SELECT t.Year, MAX (QtySold) FROM Sales s, Customer c, Time t FROM Sales s, Customer c, Time t WHERE s.DateID = t.DateID WHERE s.DateID = t.DateID AND s.CustID = c.CustID AND s.CustID = c.CustID AND Month >= 10 AND Month <= 12 AND Year > 1997 AND Year = 2004 AND State = ‘NC’ GROUP BY c.CustID; GROUP BY t.Year; F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 9

Defining a View on the Sales Relation Q1: SELECT c.CustID, SUM (QtySold) Q2: SELECT t.Year, MAX (QtySold) FROM Sales s, Customer c, Time t FROM Sales s, Customer c, Time t WHERE s.DateID = t.DateID WHERE s.DateID = t.DateID AND s.CustID = c.CustID AND Month >= 10 AND s.CustID = c.CustID AND Month <= 12 AND Year = 2004 AND Year > 1997 AND State = 'NC' GROUP BY c.CustID; GROUP BY t.Year; V1: SELECT CustID, DateID, R1: SELECT c.CustID, SUM(SumQS) SUM(QtySold) AS SumQS, FROM V1, Customer c, Time t MAX(QtySold) AS MaxQS WHERE V1.DateID = t.DateID FROM Sales AND V1.CustID = c.CustID GROUP BY CustID, DateID; AND Month >= 10 AND Month <= 12 AND Year = 2004 GROUP BY c.CustID; V1 can also be used to evaluate the query Q2 F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 10

Problem: View Selection For Queries in Presence of Aggregation � Input: � a workload Q of queries (some may be without aggregation) � an oracle O that gives view sizes � a constraint L on the views to be materialized � Output: a set of views, with or without aggregation, that � satisfies the constraint L, � produces equivalent rewritings of the queries in Q , and � minimizes the total evaluation costs of Q F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 11

The Setting � Queries, views, and rewritings: SQL select- project-join ( conjunctive ) with equality selection conditions, with or without aggregation (without HAVING) � Aggregate functions: max, min, sum, count � Sum cost model for query evaluation � Constraint on views: storage limit � Rewriting format: central rewritings F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 12

Why Consider Rewriting Formats Example (5.1 in [CNS99]) Q(x,count) :- P(x,y). V(x,count) :- P(x,s), P(x,t). R(x, √ z) :- V(x,z). R is an equivalent rewriting of Q using V [CNS99, NSS98]. F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 13

Rewriting Queries Using Views Query Q answer () :- b1(), b2(), .... , bn() Rewriting answer () :- v1(), v2(), .... , vm() ⊆ . . . answer ():- b11(), b12(), .... , b1k() . . . bm1(), bm2(), .... Expansion F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005

Query Equivalence Modulo Views Definition. Let Q be a query defined on a database schema S, and let V be a set of views defined on S; let R be a query defined using the views in V . Then Q and R are equivalent modulo V if and only if for any database D, Q(D) is equivalent to R(D V ). Here, D V is the database obtained by computing all the view relations in V on D. F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005

Evaluating an Aggregate Query Q On Database D 1. Compute the core Q’ of Q on D as a bag B. 2. Grouping: Form equivalence classes { B’ } in B based on the grouping arguments of Q. 3. Aggregation: With each equivalence class B’, associate a value that is the aggregate function of Q computed on a bag of all values of the input argument of the aggregated attribute of Q in B’. F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005

Equivalence between Aggregate Queries [NSS98] � Definition. Two aggregate queries are compatible if the tuples of arguments in their heads are identical. � Definition. For two compatible aggregate queries Q and Q’, Q and Q’ are equivalent if Q(D) = Q’(D) for every database D. F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005

Classes of Aggregate Functions � An aggregate function α is duplicate-insensitive [GHQ95] if the result of α computed over a bag of values is the same as the result of α computed over the core set of the bag. Otherwise α is duplicate sensitive. � An aggregate function α is distributive [GCB+97] if there is a function γ such that α ({{ X ij }}) = γ ({{ α ({{ X ij }} | i = 1, … , I) | j = 1, … , J }}) F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005

Results on Equivalence of Compatible Aggregate Queries [CNS99, NSS98] � Two conjunctive queries are bag-set equivalent iff they are isomorphic after duplicate subgoals are removed. � Equivalence of sum- and count- queries can be reduced to bag-set equivalence of their cores. � Equivalence of max- and min- queries can be reduced to set equivalence of their cores. F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005

Selecting and Using Views To Compute Aggregate Queries Foto Afrati - PowerPoint PPT Presentation

Selecting and Using Views To Compute Aggregate Queries Foto Afrati (NTUA Greece) and Rada Chirkova (NC State University) F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 What this Talk Is About

Aggregate Sampling Aggregate Stockpiles CIVL 3137 2 Stockpile Segregation CIVL 3137 3

Asphalt Aggregate Specifications Aggregate Specifications In order to make good asphalt

Aggregate Blending Aggregate Blending To meet the gradation specifications for a concrete or

Answering Queries Using Answering Queries Using Materialized view: result set is stored

Queries in PSM The following rules apply to the use of queries: CS 235: 1. Queries

Dotmetrics Exclusive Users Selecting basic dimensions (country, devices) Selecting timeframe

Range-Consistent Answers of Aggregate Queries under Aggregate Constraints Sergio Flesca, Filippo

Views 2 Designing the user interface Roy Scholten hi Views . Views 2 Views 2 have you heard

Differential Forms for Target Tracking and Aggregate Queries in Distributed Networks Distributed

Range Minimum and Lowest Common Ancestor Queries Slides by Solon P. Pissis November 15, 2019

Top- -k k Queries Queries on SQL on SQL Databases Databases Top Top-k Queries on SQL

Middleware Queries Queries Middleware Middleware Queries Prof. Paolo Ciaccia Prof. Paolo

Short-Run Aggregate Supply (SRAS) Video explanation in 2 minutes or 12 minutes AND Long-Run

1 1 easy to compute , 1 easy to compute 2

Module 14: Analyzing Queries Overview Queries That Use the AND Operator the OR

Geometric Algorithms Range & windowing queries (2 lectures) Database queries 2/180 G.

Drafting Your Smoke-Free Law Doug Blanke June 2-4, 2010 Washington, D.C. The Tobacco Control

Machine Learning Learning HMMs A Hidden Markov model A set of states {s 1 s n } - In

Matrix-Free Preconditioning in Online Learning Ashok Cutkosky, Tamas Sarlos Google Research

Practical Bioinformatics Mark Voorhies 4/10/2018 Mark Voorhies Practical Bioinformatics

Non-linear interlinkages and key objectives amongst the Paris Agreement and the Sustainable

Ambedkar University Delhi Work participation rates of rural women in 15 major states of India (per

SIM CQM & ASSESSMENT UPDATES WEBINAR TUESDAY, FEBRUARY 28 11:00 12:00 AGENDA Welcome

http://cs246.stanford.edu High dimensional == many features Find

Selecting and Using Views To Compute Aggregate Queries Foto Afrati - PowerPoint PPT Presentation

Selecting and Using Views To Compute Aggregate Queries Foto Afrati (NTUA Greece) and Rada Chirkova (NC State University) F. Afrati and R. Chirkova Views to Compute Aggregate Queries January 7, 2005 What this Talk Is About

Aggregate Sampling Aggregate Stockpiles CIVL 3137 2 Stockpile Segregation CIVL 3137 3

Asphalt Aggregate Specifications Aggregate Specifications In order to make good asphalt

Aggregate Blending Aggregate Blending To meet the gradation specifications for a concrete or

Answering Queries Using Answering Queries Using Materialized view: result set is stored

Queries in PSM The following rules apply to the use of queries: CS 235: 1. Queries

Dotmetrics Exclusive Users Selecting basic dimensions (country, devices) Selecting timeframe

Range-Consistent Answers of Aggregate Queries under Aggregate Constraints Sergio Flesca, Filippo

Views 2 Designing the user interface Roy Scholten hi Views . Views 2 Views 2 have you heard

Differential Forms for Target Tracking and Aggregate Queries in Distributed Networks Distributed

Range Minimum and Lowest Common Ancestor Queries Slides by Solon P. Pissis November 15, 2019

Top- -k k Queries Queries on SQL on SQL Databases Databases Top Top-k Queries on SQL

Middleware Queries Queries Middleware Middleware Queries Prof. Paolo Ciaccia Prof. Paolo

Short-Run Aggregate Supply (SRAS) Video explanation in 2 minutes or 12 minutes AND Long-Run

1 1 easy to compute , 1 easy to compute 2

Module 14: Analyzing Queries Overview Queries That Use the AND Operator the OR

Geometric Algorithms Range &amp; windowing queries (2 lectures) Database queries 2/180 G.

Drafting Your Smoke-Free Law Doug Blanke June 2-4, 2010 Washington, D.C. The Tobacco Control

Machine Learning Learning HMMs A Hidden Markov model A set of states {s 1 s n } - In

Matrix-Free Preconditioning in Online Learning Ashok Cutkosky, Tamas Sarlos Google Research

Practical Bioinformatics Mark Voorhies 4/10/2018 Mark Voorhies Practical Bioinformatics

Non-linear interlinkages and key objectives amongst the Paris Agreement and the Sustainable

Ambedkar University Delhi Work participation rates of rural women in 15 major states of India (per

SIM CQM &amp; ASSESSMENT UPDATES WEBINAR TUESDAY, FEBRUARY 28 11:00 12:00 AGENDA Welcome

http://cs246.stanford.edu High dimensional == many features Find

Geometric Algorithms Range & windowing queries (2 lectures) Database queries 2/180 G.

SIM CQM & ASSESSMENT UPDATES WEBINAR TUESDAY, FEBRUARY 28 11:00 12:00 AGENDA Welcome