INTRODUCTION TO RELATIONAL DATABASE SYSTEMS
DATENBANKSYSTEME 1 (INF 3131)
Torsten Grust Universität Tübingen Winter 2019/20
1
INTRODUCTION TO RELATIONAL DATABASE SYSTEMS DATENBANKSYSTEME 1 (INF - - PowerPoint PPT Presentation
INTRODUCTION TO RELATIONAL DATABASE SYSTEMS DATENBANKSYSTEME 1 (INF 3131) Torsten Grust Universitt Tbingen Winter 2019/20 1 SQL: GROUPING AND AGGREGATION In relational database design, a single table like turtles will typically hold
1
turtles turtle pos command τ₁ 1 (t,0,10) τ₁ 2 (t,10,0) τ₁ 3 (t,0,–10) τ₁ 4 (t,–10,0) ┈┈┈┈┈ ┈┈┈ ┈┈┈┈┈┈┈┈┈ τ₂ 1 (t,5,10) τ₂ 2 (t,5,–10) τ₂ 3 (t,–10,0) ┈┈┈┈┈ ┈┈┈ ┈┈┈┈┈┈┈┈┈┈ τ₃ 1 (t,0,10) τ₃ 2 (f,–5,–5) τ₃ 3 (t,10,0)
2
‹SELECT FROM WHERE block› [ GROUP BY ‹expression› [, …] ]
3
SELECT t.turtle, t.pos, t.command -- ⚠ SELECT clause not valid SQL — fix below FROM turtles t GROUP BY t.turtle turtles turtle pos command τ₁ [ 1, 2, 3, 4 ] [ (t,0,10), (t,10,0), (t,0,–10), (t,–10,0) ] ┈┈┈ ┈┈┈┈┈┈┈ ┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈ τ₂ [ 1, 2, 3 ] [ (t,5,10), (t,5,–10), (t,–10,0) ] ┈┈┈ ┈┈┈┈┈┈┈ ┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈ τ₃ [ 1, 2, 3 ] [ (t,0,10), (f,–5,–5), (t,10,0) ]
4
𝗱𝗽𝘃𝗼𝘂(‹expression›) -- number of non-NULL values in group 𝗱𝗽𝘃𝗼𝘂(*) -- number of rows in group 𝘁𝘃𝗻(‹expression›) -- sum of non-NULL values in group 𝗯𝘄𝗵(‹expression›) -- average of non-NULL values in group {𝗻𝗯𝘆 | 𝗻𝗷𝗼}(‹expression›) -- maximum/minimum of values in group 𝗯𝘀𝘀𝗯𝘇▁𝗯𝗵𝗵(‹expression› 𝗣𝗦𝗘𝗙𝗦 𝗖𝗭 ‹expression›) -- array of all values in group {𝗰𝗽𝗽𝗺▁𝗯𝗼𝗲 | 𝗰𝗽𝗽𝗺▁𝗽𝘀}(‹expression›) -- conjunction/disjunction of values
5
Data flow through a SQL SFW block with GROUP BY
6
R G A B X 1 true X 2 false X 4 true ┈ ┈ ┈┈┈┈┈ Y 8 true Y 16 true ┈ ┈ ┈┈┈┈┈ Z 32 true Z ▢ ▢
SELECT r.G, count(r.A) "count", count(*) "count*", sum(r.A) "sum", avg(r.A) "avg", max(r.A) "max", min(r.A) "min", array_agg(r.A ORDER BY r.A) "array", bool_and(r.B) "every", bool_or(r.B) "some" FROM R r GROUP BY r.G
G count count* sum avg max min array every some X 3 3 7 2.333 4 1 {1,2,4} f t Y 2 2 24 12.000 16 8 {8,16} t t Z 1 2 32 32.000 32 32 {32,▢} t t
7
8
Sketch of Data Warehouse Star Schema
9
sales sale set date store items price ‹sale ID› 𝑞 𝑒 𝑡 ‹# of items sold› ‹price of single item›
sets set name cat x y z weight year img 𝑞 dates date day day_of_week month quarter year 𝑒 stores store city state country 𝑡
10
SELECT s.country, count(*) sales, sum(f.items * f.price) turnover FROM sales f, -- ⎫ from fact table navigate to stores s -- ⎬ dimension table(s) to WHERE f.store = s.store -- ⎭ collect required details of sale GROUP BY s.country;
SELECT d."year", count(*) sales FROM sales f, dates d WHERE f."date" = d."date" GROUP BY d."year";
⋯
11
SELECT r.𝘣₁, count(*) FROM 𝑆 r GROUP BY r.𝘣₁ SELECT s.𝑐₁, s.𝑐₂, …, s.𝑐ᵣ FROM 𝑇 s GROUP BY s.𝑐₁, s.𝑐₂, …, s.𝑐ᵣ
SELECT r.𝘣₁, count(*) SELECT r.𝘣₁, count(*) FROM 𝑆 r, 𝑇 s FROM 𝑆 r, 𝑇 s WHERE r.𝘣₁ = s.𝑐₁ WHERE r.𝘣₁ = s.𝑐₁ GROUP BY r.𝘣₁ GROUP BY r.𝘣₁, r.𝘣ᵢ -- 1 ⩽ i ⩽ k
12
‹SELECT FROM WHERE block› [ GROUP BY ‹expression› [, …] [ HAVING ‹condition› ] ]
13
Data flow through a SQL SFW block with GROUP BY and HAVING
14
SELECT a.person FROM calendar c, attendees a WHERE c.appointment = a.appointment GROUP BY a.person HAVING max(c.stop - c.start) >= '01:00:00';
SELECT DISTINCT 'The tested columns do not form a (super) key for table R' FROM 𝑆 r GROUP BY r.𝘣₁, …, r.𝘣ᵢ HAVING count(*) > 1
15
SELECT count(*) cardinality FROM R
SELECT max(weight) "max", min(weight) "min", avg(weight) "avg1", sum(weight)/count(weight) "avg2" FROM sets
16
17
18