SQL Workshop Summaries Doug Shook Aggregates Also called column - - PowerPoint PPT Presentation
SQL Workshop Summaries Doug Shook Aggregates Also called column - - PowerPoint PPT Presentation
SQL Workshop Summaries Doug Shook Aggregates Also called column functions AVG, SUM, MIN, MAX, COUNT Will use all values can use distinct if desired All except COUNT ignore null values A summary query that counts unpaid
2
Aggregates
Also called column functions
– AVG, SUM, MIN, MAX, COUNT
Will use all values – can use distinct if desired
– All except COUNT ignore null values A summary query that counts unpaid invoices and calculates the total due
SELECT COUNT(*) AS NumberOfInvoices, SUM(InvoiceTotal - PaymentTotal - CreditTotal) AS TotalDue FROM Invoices WHERE InvoiceTotal - PaymentTotal - CreditTotal > 0;
The result set
3
Aggregates
A summary query with COUNT(*), AVG, and SUM
SELECT 'After 9/1/2011' AS SelectionDate, COUNT(*) AS NumberOfInvoices, AVG(InvoiceTotal) AS AverageInvoiceAmount, SUM(InvoiceTotal) AS TotalInvoiceAmount FROM Invoices WHERE InvoiceDate > '2011-09-01';
The result set A summary query with MIN and MAX
SELECT 'After 9/1/2011' AS SelectionDate, COUNT(*) AS NumberOfInvoices, MAX(InvoiceTotal) AS HighestInvoiceTotal, MIN(InvoiceTotal) AS LowestInvoiceTotal FROM Invoices WHERE InvoiceDate > '2011-09-01';
The result set
4
Aggregates
A summary query for non-numeric columns
SELECT MIN(VendorName) AS FirstVendor, MAX(VendorName) AS LastVendor, COUNT(VendorName) AS NumberOfVendors FROM Vendors;
The result set A summary query with the DISTINCT keyword
SELECT COUNT(DISTINCT VendorID) AS NumberOfVendors, COUNT(VendorID) AS NumberOfInvoices, AVG(InvoiceTotal) AS AverageInvoiceAmount, SUM(InvoiceTotal) AS TotalInvoiceAmount FROM Invoices WHERE InvoiceDate > '2011-09-01';
The result set
5
Aggregates
What do you notice about the previous examples? Three exceptions
– Literals – GROUP BY – OVER
6
GROUP BY
Commonly used with aggregates
– Groups the rows based on a certain column
Commonly found with a HAVING clause
– Specifies a search condition
Allows for aggregates and non-aggregates to be
used together – Non-aggregate columns will be used for groupings
7
GROUP BY
A summary query that calculates the average invoice amount by vendor
SELECT VendorID, AVG(InvoiceTotal) AS AverageInvoiceAmount FROM Invoices GROUP BY VendorID HAVING AVG(InvoiceTotal) > 2000 ORDER BY AverageInvoiceAmount DESC;
The result set
8
GROUP BY
A summary query that counts the number of invoices by vendor
SELECT VendorID, COUNT(*) AS InvoiceQty FROM Invoices GROUP BY VendorID;
The result set
(34 rows)
9
GROUP BY
A summary query that groups by two columns
SELECT VendorState, VendorCity, COUNT(*) AS InvoiceQty, AVG(InvoiceTotal) AS InvoiceAvg FROM Invoices JOIN Vendors ON Invoices.VendorID = Vendors.VendorID GROUP BY VendorState, VendorCity ORDER BY VendorState, VendorCity;
The result set
(20 rows)
10
GROUP BY
The same summary query with a HAVING clause
SELECT VendorState, VendorCity, COUNT(*) AS InvoiceQty, AVG(InvoiceTotal) AS InvoiceAvg FROM Invoices JOIN Vendors ON Invoices.VendorID = Vendors.VendorID GROUP BY VendorState, VendorCity HAVING COUNT(*) >= 2 ORDER BY VendorState, VendorCity;
The result set
(12 rows)
11
HAVING vs. WHERE
Both serve the same general purpose
– Application is different
WHERE is applied before rows are grouped and
aggregates are calculated – HAVING is applied after
WHERE can refer to any column
– HAVING can only refer to columns in SELECT or GROUP BY
WHERE cannot contain aggregates
– HAVING can
12
HAVING vs. WHERE
A summary query with a search condition in the HAVING clause
SELECT VendorName, COUNT(*) AS InvoiceQty, AVG(InvoiceTotal) AS InvoiceAvg FROM Vendors JOIN Invoices ON Vendors.VendorID = Invoices.VendorID GROUP BY VendorName HAVING AVG(InvoiceTotal) > 500 ORDER BY InvoiceQty DESC;
The result set
(19 rows)
13
HAVING vs. WHERE
A summary query with a search condition in the WHERE clause
SELECT VendorName, COUNT(*) AS InvoiceQty, AVG(InvoiceTotal) AS InvoiceAvg FROM Vendors JOIN Invoices ON Vendors.VendorID = Invoices.VendorID WHERE InvoiceTotal > 500 GROUP BY VendorName ORDER BY InvoiceQty DESC;
The result set
(20 rows)
14
HAVING vs. WHERE
A summary query with a compound condition in the HAVING clause
SELECT InvoiceDate, COUNT(*) AS InvoiceQty, SUM(InvoiceTotal) AS InvoiceSum FROM Invoices GROUP BY InvoiceDate HAVING InvoiceDate BETWEEN '2012-01-01' AND '2012-01-31' AND COUNT(*) > 1 AND SUM(InvoiceTotal) > 100 ORDER BY InvoiceDate DESC;
The result set
15
HAVING vs. WHERE
The same summary query with a WHERE clause
SELECT InvoiceDate, COUNT(*) AS InvoiceQty, SUM(InvoiceTotal) AS InvoiceSum FROM Invoices WHERE InvoiceDate BETWEEN '2012-01-01' AND '2012-01-31' GROUP BY InvoiceDate HAVING COUNT(*) > 1 AND SUM(InvoiceTotal) > 100 ORDER BY InvoiceDate DESC;
The same result set
16
Exercises
Write a SELECT statement that returns two columns
from the Invoices table: VendorID and PaymentSum, where PaymentSum is the sum of the PaymentTotal
- column. Group the result set by VendorID.
Write a SELECT statement that returns two columns:
VendorName and PaymentSum, where PaymentSum is the sum of the PaymentTotal column. Group the result set by VendorName. Return only 10 rows, corresponding to the 10 vendors who've been paid the most.
17
Exercises
Write a SELECT statement that returns three
columns: VendorName, InvoiceCount, and
- InvoiceSum. InvoiceCount is the count of the number
- f invoices, and InvoiceSum is the sum of the
InvoiceTotal column. Group the result set by vendor. Sort the result set so that the vendor with the highest number of invoices appears first.
18
Exercises
Write a SELECT statement that returns three
columns: AccountDescription, LineItemCount, and
- LineItemSum. LineItemCount is the number of entries
in the InvoiceLineItems table that have that
- AccountNo. LineItemSum is the sum of the
InvoiceLineItemAmount column for that AccountNo. Filter the result set to include only those rows with LineItemCount greater than 1. Group the result set by account description, and sort it by descending LineItemCount. Hint: Join the GLAccounts table to the InvoiceLineItems table.
19
Exercises
Modify the solution to the previous problem to filter
for invoices dated from December 1, 2011 to February 29, 2012.
20
Exercises
Write a SELECT statement that returns four
columns: VendorName, AccountDescription, LineItemCount, and LineItemSum. LineItemCount is the row count, and LineItemSum is the sum of the InvoiceLineItemAmount column. For each vendor and account, return the number and sum of line items, sorted first by vendor, then by account description. Hint: use a four table join.
21
Exercises
Write a SELECT statement that answers this