Windows 3.1? So why stick with SQL-92? @ModernSQL - - - PowerPoint PPT Presentation

windows 3 1
SMART_READER_LITE
LIVE PREVIEW

Windows 3.1? So why stick with SQL-92? @ModernSQL - - - PowerPoint PPT Presentation

Still using Windows 3.1? So why stick with SQL-92? @ModernSQL - https://modern-sql.com/ @MarkusWinand SQL:1999 WITH (Common Table Expressions) WITH (non-recursive) The Problem Nested queries are hard to read: SELECT FROM (SELECT


slide-1
SLIDE 1

Still using

Windows 3.1?

So why stick with

SQL-92?

@ModernSQL - https://modern-sql.com/ @MarkusWinand

slide-2
SLIDE 2

SQL:1999

slide-3
SLIDE 3

WITH

(Common Table Expressions)

slide-4
SLIDE 4

WITH (non-recursive) The Problem

Nested queries are hard to read: SELECT … FROM (SELECT … FROM t1 JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…)

slide-5
SLIDE 5

Understand this first

WITH (non-recursive) The Problem

Nested queries are hard to read: SELECT … FROM (SELECT … FROM t1 JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…)

slide-6
SLIDE 6

Then this...

WITH (non-recursive) The Problem

Nested queries are hard to read: SELECT … FROM (SELECT … FROM t1 JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…)

slide-7
SLIDE 7

Then this...

WITH (non-recursive) The Problem

Nested queries are hard to read: SELECT … FROM (SELECT … FROM t1 JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…)

slide-8
SLIDE 8

F i n a l l y t h e f i r s t l i n e m a k e s s e n s e

WITH (non-recursive) The Problem

Nested queries are hard to read: SELECT … FROM (SELECT … FROM t1 JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…)

slide-9
SLIDE 9

CTEs are statement-scoped views: WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…)

K e y w

  • r

d

WITH (non-recursive) Since SQL:1999

slide-10
SLIDE 10

CTEs are statement-scoped views: WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…)

Name of CTE and (here

  • ptional) column names

WITH (non-recursive) Since SQL:1999

slide-11
SLIDE 11

CTEs are statement-scoped views: WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…)

Definition

WITH (non-recursive) Since SQL:1999

slide-12
SLIDE 12

CTEs are statement-scoped views: WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…)

Introduces another CTE Don't repeat WITH

WITH (non-recursive) Since SQL:1999

slide-13
SLIDE 13

CTEs are statement-scoped views: WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…)

May refer to
 previous CTEs

WITH (non-recursive) Since SQL:1999

slide-14
SLIDE 14

WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) AS (SELECT … FROM …) SELECT … FROM b JOIN c ON (…)

Third CTE

WITH (non-recursive) Since SQL:1999

slide-15
SLIDE 15

WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) AS (SELECT … FROM …) SELECT … FROM b JOIN c ON (…)

No comma!

WITH (non-recursive) Since SQL:1999

slide-16
SLIDE 16

WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) AS (SELECT … FROM …) SELECT … FROM b JOIN c ON (…)

Main query

WITH (non-recursive) Since SQL:1999

slide-17
SLIDE 17

CTEs are statement-scoped views: WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) AS (SELECT … FROM …) SELECT … FROM b JOIN c ON (…)

Read top down

WITH (non-recursive) Since SQL:1999

slide-18
SLIDE 18
  • Literate SQL


Organize SQL code to
 improve maintainability

  • Assign column names


to tables produced by values


  • r unnest.
  • Overload tables (for testing)


with queries hide tables


  • f the same name.

Use-Cases WITH (non-recursive)

https://modern-sql.com/use-case/literate-sql https://modern-sql.com/use-case/naming-unnamed-columns https://modern-sql.com/use-case/unit-tests-on-transient-data

slide-19
SLIDE 19

WITH are the "private methods" of SQL WITH is a prefix to SELECT WITH queries are only visible in the SELECT they precede WITH in detail: https://modern-sql.com/feature/with

WITH (non-recursive) In a Nutshell

slide-20
SLIDE 20

Availability WITH (non-recursive)

1 9 9 9 2 1 2 3 2 5 2 7 2 9 2 1 1 2 1 3 2 1 5 2 1 7

5.1 10.2

MariaDB

8.0 MySQL 8.4

PostgreSQL

3.8.3

[0]

SQLite

7.0

DB2 LUW

9iR2

Oracle

2005

[1]

SQL Server

[0]Only for top-level SELECT statements [1]Only allowed at the very begin of a statement. E.g. WITH...INSERT...SELECT.
slide-21
SLIDE 21

WITH RECURSIVE

(Common Table Expressions)

slide-22
SLIDE 22

(This page is intentionally left blank)

WITH RECURSIVE The Problem

slide-23
SLIDE 23

CREATE TABLE t ( id NUMERIC NOT NULL, parent_id NUMERIC, … PRIMARY KEY (id) )

Coping with hierarchies in the Adjacency List Model[0]

WITH RECURSIVE The Problem

[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties”

slide-24
SLIDE 24

CREATE TABLE t ( id NUMERIC NOT NULL, parent_id NUMERIC, … PRIMARY KEY (id) )

Coping with hierarchies in the Adjacency List Model[0]

WITH RECURSIVE The Problem

[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties”

slide-25
SLIDE 25

SELECT * FROM t AS d0 LEFT JOIN t AS d1 ON (d1.parent_id=d0.id) LEFT JOIN t AS d2 ON (d2.parent_id=d1.id)

Coping with hierarchies in the Adjacency List Model[0]

WITH RECURSIVE The Problem

WHERE d0.id = ?

[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties”

slide-26
SLIDE 26

SELECT * FROM t AS d0 LEFT JOIN t AS d1 ON (d1.parent_id=d0.id) LEFT JOIN t AS d2 ON (d2.parent_id=d1.id)

Coping with hierarchies in the Adjacency List Model[0]

WITH RECURSIVE The Problem

WHERE d0.id = ?

[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties”

slide-27
SLIDE 27

SELECT * FROM t AS d0 LEFT JOIN t AS d1 ON (d1.parent_id=d0.id) LEFT JOIN t AS d2 ON (d2.parent_id=d1.id)

Coping with hierarchies in the Adjacency List Model[0]

WITH RECURSIVE The Problem

WHERE d0.id = ?

[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties”

slide-28
SLIDE 28

SELECT * FROM t AS d0 LEFT JOIN t AS d1 ON (d1.parent_id=d0.id) LEFT JOIN t AS d2 ON (d2.parent_id=d1.id)

WITH RECURSIVE Since SQL:1999

WHERE d0.id = ? WITH RECURSIVE d (id, parent, …) AS
 (SELECT id, parent, … FROM tbl WHERE id = ? UNION ALL SELECT id, parent, … FROM d JOIN tbl
 ON (tbl.parent=d.id) ) SELECT * FROM subtree

slide-29
SLIDE 29

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte

K e y w

  • r

d

Since SQL:1999 WITH RECURSIVE

slide-30
SLIDE 30

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte

Column list mandatory here

Since SQL:1999 WITH RECURSIVE

slide-31
SLIDE 31

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte

E x e c u t e d f i r s t

Since SQL:1999 WITH RECURSIVE

slide-32
SLIDE 32

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte

Result sent there

Since SQL:1999 WITH RECURSIVE

slide-33
SLIDE 33

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte

Result visible twice

Since SQL:1999 WITH RECURSIVE

slide-34
SLIDE 34

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n

  • 1

2 3 (3 rows)

Once it becomes part of
 the final
 result

Since SQL:1999 WITH RECURSIVE

slide-35
SLIDE 35

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n

  • 1

2 3 (3 rows)

Since SQL:1999 WITH RECURSIVE

slide-36
SLIDE 36

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n

  • 1

2 3 (3 rows)

Second
 leg of UNION 
 is executed

Since SQL:1999 WITH RECURSIVE

slide-37
SLIDE 37

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n

  • 1

2 3 (3 rows)

Result sent there again

Since SQL:1999 WITH RECURSIVE

slide-38
SLIDE 38

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n

  • 1

2 3 (3 rows)

Since SQL:1999 WITH RECURSIVE

slide-39
SLIDE 39

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n

  • 1

2 3 (3 rows)

It's a loop!

Since SQL:1999 WITH RECURSIVE

slide-40
SLIDE 40

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n

  • 1

2 3 (3 rows)

It's a loop!

Since SQL:1999 WITH RECURSIVE

slide-41
SLIDE 41

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n

  • 1

2 3 (3 rows)

It's a loop!

Since SQL:1999 WITH RECURSIVE

slide-42
SLIDE 42

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n

  • 1

2 3 (3 rows)

n=3
 doesn't match

Since SQL:1999 WITH RECURSIVE

slide-43
SLIDE 43

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n

  • 1

2 3 (3 rows)

n=3
 doesn't match Loop terminates

Since SQL:1999 WITH RECURSIVE

slide-44
SLIDE 44

Use Cases

  • Row generators


To fill gaps (e.g., in time series), generate test data.

  • Processing graphs


Shortest route from person A to B in LinkedIn/Facebook/Twitter/…

  • Finding distinct values


with n*log(N)† time complexity. […many more…]

As shown on previous slide http://aprogrammerwrites.eu/?p=1391

“[…] for certain classes of graphs, solutions utilizing relational database technology […] can offer performance superior to that of the dedicated graph databases.” event.cwi.nl/grades2013/07-welc.pdf

http://wiki.postgresql.org/wiki/Loose_indexscan

† n … # distinct values, N … # of table rows. Suitable index required

WITH RECURSIVE

slide-45
SLIDE 45

WITH RECURSIVE is the “while” of SQL WITH RECURSIVE "supports" infinite loops Except PostgreSQL, databases generally don't require the RECURSIVE keyword. DB2, SQL Server & Oracle don’t even know the keyword RECURSIVE, but allow recursive CTEs anyway.

In a Nutshell WITH RECURSIVE

slide-46
SLIDE 46

Availability WITH RECURSIVE

1 9 9 9 2 1 2 3 2 5 2 7 2 9 2 1 1 2 1 3 2 1 5 2 1 7

5.1 10.2

MariaDB

8.0 MySQL 8.4

PostgreSQL

3.8.3

[0]

SQLite

7.0

DB2 LUW

11gR2

Oracle

2005

SQL Server

[0]Only for top-level SELECT statements
slide-47
SLIDE 47

GROUPING SETS

slide-48
SLIDE 48

Only one GROUP BY operation at a time:

GROUPING SETS Before SQL:1999

SELECT year , month , sum(revenue) FROM tbl GROUP BY year, month Monthly revenue Yearly revenue SELECT year , sum(revenue) FROM tbl GROUP BY year

slide-49
SLIDE 49

GROUPING SETS Before SQL:1999

SELECT year , month , sum(revenue) FROM tbl GROUP BY year, month SELECT year , sum(revenue) FROM tbl GROUP BY year

slide-50
SLIDE 50

GROUPING SETS Before SQL:1999

SELECT year , month , sum(revenue) FROM tbl GROUP BY year, month SELECT year , sum(revenue) FROM tbl GROUP BY year UNION ALL , null

slide-51
SLIDE 51

GROUPING SETS Since SQL:1999

SELECT year , month , sum(revenue) FROM tbl GROUP BY year, month SELECT year , sum(revenue) FROM tbl GROUP BY year UNION ALL , null SELECT year , month , sum(revenue) FROM tbl GROUP BY GROUPING SETS ( (year, month) , (year) )

slide-52
SLIDE 52

GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function) disambiguates the meaning of NULL


(was the grouped data NULL or is this column not currently grouped?)

Permutations can be created using ROLLUP and CUBE


(ROLLUP(a,b,c) = GROUPING SETS ((a,b,c), (a,b),(a),())

GROUPING SETS In a Nutshell

slide-53
SLIDE 53

GROUPING SETS Availability

1 9 9 9 2 1 2 3 2 5 2 7 2 9 2 1 1 2 1 3 2 1 5 2 1 7

5.1

[0]

MariaDB

5.0

[1]

MySQL

9.5

PostgreSQL SQLite

5

DB2 LUW

9iR1

Oracle

2008

SQL Server

[0]Only ROLLUP (properitery syntax). [1]Only ROLLUP (properitery syntax). GROUPING function since MySQL 8.0.
slide-54
SLIDE 54

SQL:2003

slide-55
SLIDE 55

OVER

and

PARTITION BY

slide-56
SLIDE 56

OVER (PARTITION BY) The Problem

Two distinct concepts could not be used independently:

  • Merge rows with the same key properties
  • GROUP BY to specify key properties
  • DISTINCT to use full row as key
  • Aggregate data from related rows
  • Requires GROUP BY to segregate the rows
  • COUNT, SUM, AVG, MIN, MAX to aggregate grouped rows
slide-57
SLIDE 57

SELECT c1 , SUM(c2) tot FROM t GROUP BY c1

OVER (PARTITION BY) The Problem

Yes ⇠ Merge rows ⇢ No No ⇠ Aggregate ⇢ Yes SELECT c1 , c2 FROM t SELECT DISTINCT c1 , c2 FROM t SELECT c1 , c2 FROM t SELECT c1 , SUM(c2) tot FROM t GROUP BY c1

slide-58
SLIDE 58

SELECT c1 , SUM(c2) tot FROM t GROUP BY c1

OVER (PARTITION BY) The Problem

Yes ⇠ Merge rows ⇢ No No ⇠ Aggregate ⇢ Yes SELECT c1 , c2 FROM t SELECT DISTINCT c1 , c2 FROM t SELECT c1 , c2 FROM t SELECT c1 , SUM(c2) tot FROM t GROUP BY c1

slide-59
SLIDE 59

SELECT c1 , SUM(c2) tot FROM t GROUP BY c1

OVER (PARTITION BY) The Problem

Yes ⇠ Merge rows ⇢ No No ⇠ Aggregate ⇢ Yes SELECT c1 , c2 FROM t SELECT DISTINCT c1 , c2 FROM t SELECT c1 , c2 FROM t

JOIN ( ) ta ON (t.c1=ta.c1)

SELECT c1 , SUM(c2) tot FROM t GROUP BY c1

slide-60
SLIDE 60

SELECT c1 , SUM(c2) tot FROM t GROUP BY c1

OVER (PARTITION BY) The Problem

Yes ⇠ Merge rows ⇢ No No ⇠ Aggregate ⇢ Yes SELECT c1 , c2 FROM t SELECT DISTINCT c1 , c2 FROM t SELECT c1 , c2 FROM t

JOIN ( ) ta ON (t.c1=ta.c1)

SELECT c1 , SUM(c2) tot FROM t GROUP BY c1

, tot

slide-61
SLIDE 61

SELECT c1 , SUM(c2) tot FROM t GROUP BY c1

OVER (PARTITION BY) The Problem

Yes ⇠ Merge rows ⇢ No No ⇠ Aggregate ⇢ Yes SELECT c1 , c2 FROM t SELECT DISTINCT c1 , c2 FROM t SELECT c1 , c2 FROM t

JOIN ( ) ta ON (t.c1=ta.c1)

SELECT c1 , SUM(c2) tot FROM t GROUP BY c1

, tot

slide-62
SLIDE 62

SELECT c1 , SUM(c2) tot FROM t GROUP BY c1

OVER (PARTITION BY) Since SQL:2003

Yes ⇠ Merge rows ⇢ No No ⇠ Aggregate ⇢ Yes SELECT c1 , c2 FROM t SELECT DISTINCT c1 , c2 FROM t SELECT c1 , c2 FROM t FROM t , SUM(c2) OVER (PARTITION BY c1)

slide-63
SLIDE 63

SELECT dep, salary, SUM(salary) OVER() FROM emp dep salary 1 1000 6000 22 1000 6000 22 1000 6000 333 1000 6000 333 1000 6000 333 1000 6000

OVER (PARTITION BY) How it works

slide-64
SLIDE 64

SELECT dep, salary, SUM(salary) OVER() FROM emp dep salary 1 1000 6000 22 1000 6000 22 1000 6000 333 1000 6000 333 1000 6000 333 1000 6000

OVER (PARTITION BY) How it works

Look here

slide-65
SLIDE 65

SELECT dep, salary, SUM(salary) OVER() FROM emp dep salary 1 1000 6000 22 1000 6000 22 1000 6000 333 1000 6000 333 1000 6000 333 1000 6000

OVER (PARTITION BY) How it works

slide-66
SLIDE 66

SELECT dep, salary, SUM(salary) OVER() FROM emp dep salary 1 1000 6000 22 1000 6000 22 1000 6000 333 1000 6000 333 1000 6000 333 1000 6000

OVER (PARTITION BY) How it works

slide-67
SLIDE 67

SELECT dep, salary, SUM(salary) OVER() FROM emp dep salary 1 1000 1000 22 1000 2000 22 1000 2000 333 1000 3000 333 1000 3000 333 1000 3000

OVER (PARTITION BY) How it works

) PARTITION BY dep

slide-68
SLIDE 68

OVER

and

ORDER BY

(Framing & Ranking)

slide-69
SLIDE 69

acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3

  • 10

+20 333 4 +50 +70 333 5

  • 30

+40 333 6

  • 20

+20

OVER (ORDER BY) The Problem

SELECT id, value, FROM transactions t

slide-70
SLIDE 70

acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3

  • 10

+20 333 4 +50 +70 333 5

  • 30

+40 333 6

  • 20

+20

OVER (ORDER BY) The Problem

SELECT id, value, (SELECT SUM(value) FROM transactions t2 WHERE t2.id <= t.id) FROM transactions t

Range segregation (<=)
 not possible with
 GROUP BY or
 PARTITION BY

slide-71
SLIDE 71

OVER (ORDER BY) Since SQL:2003

SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3

  • 10

+20 333 4 +50 +70 333 5

  • 30

+40 333 6

  • 20

+20 ORDER BY id

slide-72
SLIDE 72

OVER (ORDER BY) Since SQL:2003

SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3

  • 10

+20 333 4 +50 +70 333 5

  • 30

+40 333 6

  • 20

+20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING

slide-73
SLIDE 73

OVER (ORDER BY) Since SQL:2003

SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3

  • 10

+20 333 4 +50 +70 333 5

  • 30

+40 333 6

  • 20

+20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW

slide-74
SLIDE 74

OVER (ORDER BY) Since SQL:2003

SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3

  • 10

+20 333 4 +50 +70 333 5

  • 30

+40 333 6

  • 20

+20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW

slide-75
SLIDE 75

OVER (ORDER BY) Since SQL:2003

SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3

  • 10

+20 333 4 +50 +70 333 5

  • 30

+40 333 6

  • 20

+20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW

slide-76
SLIDE 76

OVER (ORDER BY) Since SQL:2003

SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3

  • 10

+20 333 4 +50 +70 333 5

  • 30

+40 333 6

  • 20

+20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW

slide-77
SLIDE 77

OVER (ORDER BY) Since SQL:2003

SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3

  • 10

+20 333 4 +50 +70 333 5

  • 30

+40 333 6

  • 20

+20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW

slide-78
SLIDE 78

OVER (ORDER BY) Since SQL:2003

SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3

  • 10

+20 333 4 +50 +70 333 5

  • 30

+40 333 6

  • 20

+20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW

slide-79
SLIDE 79

OVER (ORDER BY) Since SQL:2003

SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3

  • 10

+20 333 4 +50 +70 333 5

  • 30

+40 333 6

  • 20

+20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW

slide-80
SLIDE 80

OVER (ORDER BY) Since SQL:2003

SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 22 2 +20 22 3

  • 10

333 4 +50 333 5

  • 30

333 6

  • 20

ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW

slide-81
SLIDE 81

OVER (ORDER BY) Since SQL:2003

SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +20 22 3

  • 10

+10 333 4 +50 +50 333 5

  • 30

+20 333 6

  • 20
. 0

ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW PARTITION BY acnt

slide-82
SLIDE 82

OVER (ORDER BY) Since SQL:2003

With OVER (ORDER BY n) a new type of functions make sense:

n ROW_NUMBER RANK DENSE_RANK PERCENT_RANK CUME_DIST 1 1 1 1 0.25 2 2 2 2 0.33… 0.75 3 3 2 2 0.33… 0.75 4 4 4 3 1 1

slide-83
SLIDE 83
  • Aggregates without GROUP BY
  • Running totals,


moving averages

  • Ranking
  • Top-N per Group
  • Avoiding self-joins

[… many more …]

Use Cases

SELECT * FROM (SELECT ROW_NUMBER() OVER(PARTITION BY … ORDER BY …) rn , t.* FROM t) numbered_t WHERE rn <= 3 AVG(…) OVER(ORDER BY … ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING) moving_avg

OVER (SQL:2003)

slide-84
SLIDE 84

OVER may follow any aggregate function OVER defines which rows are visible at each row OVER() makes all rows visible at every row OVER(PARTITION BY …) segregates like GROUP BY OVER(ORDER BY … BETWEEN) segregates using <, >

In a Nutshell OVER (SQL:2003)

slide-85
SLIDE 85

1 9 9 9 2 1 2 3 2 5 2 7 2 9 2 1 1 2 1 3 2 1 5 2 1 7

5.1 10.2

MariaDB

8.0 MySQL 8.4

PostgreSQL SQLite

[0]

7.0

DB2 LUW

8i

Oracle

2005

SQL Server

[0]Expected for release 3.25.0 (available in snapshot release).

OVER (SQL:2003) Availability

Hive Impala Spark NuoDB

slide-86
SLIDE 86

NULLS FIRST/LAST

slide-87
SLIDE 87

The sorting of NULL is implementation defined (some DBs sort NULL as great, others as very small value)

NULLS FIRST/LAST Before SQL:2003

SELECT … FROM … ORDER BY COALESCE(nullable, ?); If you know a value
 larger/smaller than any
 actual value…

slide-88
SLIDE 88

The sorting of NULL is implementation defined (some DBs sort NULL as great, others as very small value)

NULLS FIRST/LAST Before SQL:2003

SELECT … FROM … ORDER BY COALESCE(nullable, ?); ORDER BY CASE WHEN nullable IS NULL THEN 0 ELSE 1 END , nullable; Using an extra sort key
 to put NULL and NOT NULL apart is more robust This shows NULLs first (no matter if nullable
 is sorted ASC or DESC)

slide-89
SLIDE 89

SQL:2003 introduced ORDER BY … NULLS FIRST/LAST

NULLS FIRST/LAST Since SQL:2003

SELECT … FROM … ORDER BY nullable NULLS FIRST Note: PostgreSQL accepts NULLS FIRST/LAST in index definitions. This returns
 NULLs first
 (for ASC and DESC)

slide-90
SLIDE 90

NULLS FIRST/LAST Since SQL:2003

1 9 9 9 2 1 2 3 2 5 2 7 2 9 2 1 1 2 1 3 2 1 5 2 1 7

MariaDB

[0]

MySQL

[0]

8.3

[1]

PostgreSQL SQLite

[0]

11.1

[1] DB2 LUW

11gR1

[1]

Oracle SQL Server

[0]

[0]By default sorted as smallest [1]By default sorted as greatest
slide-91
SLIDE 91

FILTER

slide-92
SLIDE 92

SELECT YEAR, SUM(CASE WHEN MONTH = 1 THEN revenue
 ELSE 0
 END) JAN, SUM(CASE WHEN MONTH = 2 THEN revenue END) FEB, … FROM sales GROUP BY YEAR

FILTER The Problem

Pivot table: Years on the Y axis, month on X:

slide-93
SLIDE 93

SELECT YEAR, SUM(CASE WHEN MONTH = 1 THEN revenue
 ELSE 0
 END) JAN, SUM(CASE WHEN MONTH = 2 THEN revenue END) FEB, … FROM sales GROUP BY YEAR

FILTER The Problem

Pivot table: Years on the Y axis, month on X: Optional:
 ELSE NULL is default Aggregates
 ignore NULL*

*Exceptions:array_agg, json_objectagg, xmlagg See: https://modern-sql.com/concept/null#aggregates

slide-94
SLIDE 94

SELECT YEAR, SUM(CASE WHEN MONTH = 1 THEN revenue
 ELSE 0
 END) JAN, SUM(CASE WHEN MONTH = 2 THEN revenue END) FEB, … FROM sales GROUP BY YEAR

FILTER The Problem

Pivot table: Years on the Y axis, month on X:

slide-95
SLIDE 95

SELECT YEAR, SUM(revenue) FILTER (WHERE MONTH = 1) JAN, SUM(revenue) FILTER (WHERE MONTH = 2) FEB, … FROM sales GROUP BY YEAR;

FILTER Since SQL:2003

SQL:2003 allows FILTER (WHERE…) after aggregates:

slide-96
SLIDE 96

FILTER Since SQL:2003

Year 2016 2016 2016 2016 2016 Month 1 2 3 ... 12 Revenue 1 23 345 ... 1234 Year 2016 Jan 1 Feb 23 Mar 345 ... ... Dec 1234 S U M ( … ) F I L T E R ( W H E R E … ) S U M ( … ) F I L T E R ( W H E R E m

  • n

t h = 2 ) S U M ( r e v e n u e ) F I L T E R ( W H E R E m

  • n

t h = 3 ) S U M ( r e v e n u e ) F I L T E R ( W H E R E m

  • n

t h = … ) S U M ( r e v e n u e ) F I L T E R ( W H E R E m

  • n

t h = 1 2 )

Pivot in SQL

  • 1. Use GROUP BY

to combine rows

  • 2. Use FILTER to pick

rows per column

See: https://modern-sql.com/use-case/pivot

slide-97
SLIDE 97

FILTER Availability

1 9 9 9 2 1 2 3 2 5 2 7 2 9 2 1 1 2 1 3 2 1 5 2 1 7

5.1

MariaDB MySQL

9.4

PostgreSQL SQLite

[0]

DB2 LUW Oracle SQL Server

[0]Only with OVER clause
slide-98
SLIDE 98

Inverse Distribution Functions (percentiles)

slide-99
SLIDE 99

SELECT d1.val FROM data d1 JOIN data d2 ON (d1.val < d2.val OR (d1.val=d2.val AND d1.id<d2.id)) GROUP BY d1.val HAVING count(*) = (SELECT FLOOR(COUNT(*)/2) FROM data d3)

Inverse Distribution Functions

The Problem

Grouped rows cannot be ordered prior aggregation.

(how to get the middle value (median) of a set)

slide-100
SLIDE 100

SELECT d1.val FROM data d1 JOIN data d2 ON (d1.val < d2.val OR (d1.val=d2.val AND d1.id<d2.id)) GROUP BY d1.val HAVING count(*) = (SELECT FLOOR(COUNT(*)/2) FROM data d3)

Inverse Distribution Functions

The Problem

Grouped rows cannot be ordered prior aggregation.

(how to get the middle value (median) of a set)

Number rows Pick middle one

slide-101
SLIDE 101

SELECT d1.val FROM data d1 JOIN data d2 ON (d1.val < d2.val OR (d1.val=d2.val AND d1.id<d2.id)) GROUP BY d1.val HAVING count(*) = (SELECT FLOOR(COUNT(*)/2) FROM data d3)

Inverse Distribution Functions

The Problem

Grouped rows cannot be ordered prior aggregation.

(how to get the middle value (median) of a set)

Number rows Pick middle one

slide-102
SLIDE 102

SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY val) FROM data

Median Which value?

Since SQL:2003

Inverse Distribution Functions

slide-103
SLIDE 103

1 2 3 4

0.25 0.5 0.75 1

1 2 3 4

0.25 0.5 0.75 1

1 2 3 4 PERCENTILE_DISC

0.25 0.5 0.75 1

1 2 3 4 PERCENTILE_DISC

0.25 0.5 0.75 1

1 2 3 4 PERCENTILE_DISC(0.5)

0.25 0.5 0.75 1

1 2 3 4 PERCENTILE_CONT PERCENTILE_DISC(0.5)

0.25 0.5 0.75 1

1 2 3 4 PERCENTILE_CONT(0.5) PERCENTILE_DISC(0.5)

SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY val) FROM data

Since SQL:2003

Inverse Distribution Functions

Two variants:

  • for discrete values


(categories)

  • for continuous values


(linear interpolation)

slide-104
SLIDE 104

Inverse Distribution Functions

Availability

1 9 9 9 2 1 2 3 2 5 2 7 2 9 2 1 1 2 1 3 2 1 5 2 1 7

5.1 10.3

[0] MariaDB

MySQL

9.4

PostgreSQL SQLite

11.1

DB2 LUW

9iR1

Oracle

2012

[0]

SQL Server

[0]Only as window function (requires OVER clause)
slide-105
SLIDE 105

SQL:2006

slide-106
SLIDE 106

XMLTABLE

slide-107
SLIDE 107

SELECT id , c1 , n FROM tbl , XMLTABLE( '/d/e' PASSING x COLUMNS id INT PATH '@id' , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r

XMLTABLE Since SQL:2006

Stored in tbl.x: <d> <e id="42"> <c1>…</c1> </e> </d> XPath* expression to identify rows

*Standard SQL allows XQuery

slide-108
SLIDE 108

SELECT id , c1 , n FROM tbl , XMLTABLE( '/d/e' PASSING x COLUMNS id INT PATH '@id' , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r

XMLTABLE Since SQL:2006

Stored in tbl.x: <d> <e id="42"> <c1>…</c1> </e> </d>

*Standard SQL allows XQuery

slide-109
SLIDE 109

SELECT id , c1 , n FROM tbl , XMLTABLE( '/d/e' PASSING x COLUMNS id INT PATH '@id' , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r

XMLTABLE Since SQL:2006

Stored in tbl.x: <d> <e id="42"> <c1>…</c1> </e> </d>

*Standard SQL allows XQuery

XPath* expressions to extract data

slide-110
SLIDE 110

SELECT id , c1 , n FROM tbl , XMLTABLE( '/d/e' PASSING x COLUMNS id INT PATH '@id' , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r

XMLTABLE Since SQL:2006

Stored in tbl.x: <d> <e id="42"> <c1>…</c1> </e> </d>

*Standard SQL allows XQuery

Row number (like for unnest)

slide-111
SLIDE 111

SELECT id , c1 , n FROM tbl , XMLTABLE( '/d/e' PASSING x COLUMNS id INT PATH '@id' , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r

XMLTABLE Since SQL:2006

Stored in tbl.x: <d> <e id="42"> <c1>…</c1> </e> </d>

*Standard SQL allows XQuery

Result id | c1 | n

  • ---+----+---

42 | … | 1

slide-112
SLIDE 112

XMLTABLE Availability

1 9 9 9 2 1 2 3 2 5 2 7 2 9 2 1 1 2 1 3 2 1 5 2 1 7

MariaDB MySQL

10

[0]

PostgreSQL SQLite

9.7

DB2 LUW

11gR1

Oracle SQL Server

[0]No XQuery (only XPath). No default namespace declaration.
slide-113
SLIDE 113

SQL:2008

slide-114
SLIDE 114

FETCH FIRST

slide-115
SLIDE 115

SELECT * FROM (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn FROM data) numbered_data WHERE rn <=10

FETCH FIRST The Problem

Limit the result to a number of rows.

(LIMIT, TOP and ROWNUM are all proprietary)

SQL:2003 introduced ROW_NUMBER() to number rows.
 But this still requires wrapping to limit the result. And how about databases not supporting ROW_NUMBER()?

slide-116
SLIDE 116

SELECT * FROM (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn FROM data) numbered_data WHERE rn <=10

FETCH FIRST The Problem

Limit the result to a number of rows.

(LIMIT, TOP and ROWNUM are all proprietary)

SQL:2003 introduced ROW_NUMBER() to number rows.
 But this still requires wrapping to limit the result. And how about databases not supporting ROW_NUMBER()?

Dammit! Let's take
 LIMIT

slide-117
SLIDE 117

SELECT * FROM data ORDER BY x FETCH FIRST 10 ROWS ONLY

FETCH FIRST Since SQL:2008

SQL:2008 introduced the FETCH FIRST … ROWS ONLY clause:

slide-118
SLIDE 118

FETCH FIRST Availability

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017

5.1

MariaDB

3.19.3

[0]

MySQL

6.5

[1]

8.4

PostgreSQL

2.1.0

[1]

SQLite

7.0

DB2 LUW

12cR1

Oracle

7.0

[2]

2012

SQL Server

[0]Earliest mention of LIMIT. Probably inherited from mSQL [1]Functionality available using LIMIT [2]SELECT TOP n ... SQL Server 2000 also supports expressions and bind parameters
slide-119
SLIDE 119

SQL:2011

slide-120
SLIDE 120

OFFSET

slide-121
SLIDE 121

SELECT * FROM (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn FROM data) numbered_data WHERE rn > 10 and rn <= 20

OFFSET The Problem

How to fetch the rows after a limit?


(pagination anybody?)

slide-122
SLIDE 122

SELECT * FROM data ORDER BY x OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY

OFFSET Since SQL:2011

SQL:2011 introduced OFFSET, unfortunately!

slide-123
SLIDE 123

SELECT * FROM data ORDER BY x OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY

OFFSET Since SQL:2011

SQL:2011 introduced OFFSET, unfortunately!

OFFSET

G r a b c

  • a

s t e r s & s t i c k e r s !

https://use-the-index-luke.com/no-offset

slide-124
SLIDE 124

OFFSET Since SQL:2011

1999 2001 2003 2005 2007 2009 2011 2013 2015

5.1

MariaDB

3.20.3

[0]

4.0.6

[1]

MySQL

6.5

PostgreSQL

2.1.0

SQLite

9.7

[2]

11.1

DB2 LUW

12c

Oracle

2012

SQL Server

[0]LIMIT [offset,] limit: "With this it's easy to do a poor man's next page/previous page WWW application." [1]The release notes say "Added PostgreSQL compatible LIMIT syntax" [2]Requires enabling the MySQL compatibility vector: db2set DB2_COMPATIBILITY_VECTOR=MYS
slide-125
SLIDE 125

OVER

slide-126
SLIDE 126

WITH numbered_t AS (SELECT * ) SELECT curr.* , curr.balance

  • COALESCE(prev.balance,0)

FROM numbered_t curr LEFT JOIN numbered_t prev ON (curr.rn = prev.rn+1)

OVER (SQL:2011) The Problem

Direct access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

curr balance … rn 50 … 1 90 … 2 70 … 3 30 … 4

FROM t

slide-127
SLIDE 127

WITH numbered_t AS (SELECT * ) SELECT curr.* , curr.balance

  • COALESCE(prev.balance,0)

FROM numbered_t curr LEFT JOIN numbered_t prev ON (curr.rn = prev.rn+1)

OVER (SQL:2011) The Problem

Direct access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

curr balance … rn 50 … 1 90 … 2 70 … 3 30 … 4

FROM t , ROW_NUMBER() OVER(ORDER BY x) rn

slide-128
SLIDE 128

WITH numbered_t AS (SELECT * ) SELECT curr.* , curr.balance

  • COALESCE(prev.balance,0)

FROM numbered_t curr LEFT JOIN numbered_t prev ON (curr.rn = prev.rn+1)

OVER (SQL:2011) The Problem

Direct access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

curr balance … rn 50 … 1 90 … 2 70 … 3 30 … 4

FROM t , ROW_NUMBER() OVER(ORDER BY x) rn

slide-129
SLIDE 129

WITH numbered_t AS (SELECT * ) SELECT curr.* , curr.balance

  • COALESCE(prev.balance,0)

FROM numbered_t curr LEFT JOIN numbered_t prev ON (curr.rn = prev.rn+1)

OVER (SQL:2011) The Problem

Direct access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

curr balance … rn 50 … 1 90 … 2 70 … 3 30 … 4

FROM t , ROW_NUMBER() OVER(ORDER BY x) rn

prev balance … rn 50 … 1 90 … 2 70 … 3 30 … 4

slide-130
SLIDE 130

WITH numbered_t AS (SELECT * ) SELECT curr.* , curr.balance

  • COALESCE(prev.balance,0)

FROM numbered_t curr LEFT JOIN numbered_t prev ON (curr.rn = prev.rn+1)

OVER (SQL:2011) The Problem

Direct access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

curr balance … rn 50 … 1 90 … 2 70 … 3 30 … 4

FROM t , ROW_NUMBER() OVER(ORDER BY x) rn

prev balance … rn 50 … 1 90 … 2 70 … 3 30 … 4

slide-131
SLIDE 131

WITH numbered_t AS (SELECT * ) SELECT curr.* , curr.balance

  • COALESCE(prev.balance,0)

FROM numbered_t curr LEFT JOIN numbered_t prev ON (curr.rn = prev.rn+1)

OVER (SQL:2011) The Problem

Direct access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

curr balance … rn 50 … 1 90 … 2 70 … 3 30 … 4

FROM t , ROW_NUMBER() OVER(ORDER BY x) rn

prev balance … rn 50 … 1 90 … 2 70 … 3 30 … 4 +50 +40

  • 20
  • 40
slide-132
SLIDE 132

SELECT *, balance 


  • COALESCE( LAG(balance)


OVER(ORDER BY x)
 , 0)
 FROM t

Available functions:

LEAD / LAG
 FIRST_VALUE / LAST_VALUE
 NTH_VALUE(col, n) FROM FIRST/LAST
 RESPECT/IGNORE NULLS

OVER (SQL:2011) Since SQL:2011

SQL:2011 introduced LEAD, LAG, NTH_VALUE, … for that:

slide-133
SLIDE 133

OVER (LEAD, LAG, …) Since SQL:2011

1 9 9 9 2 1 2 3 2 5 2 7 2 9 2 1 1 2 1 3 2 1 5 2 1 7

5.1 10.2

[0]

MariaDB

8.0

[0] MySQL

8.4

[0]

PostgreSQL SQLite

[0]

9.5

[1]

11.1

DB2 LUW

8i

[1]

11gR2

Oracle

2012

[1]

SQL Server

[0]No IGNORE NULLS and FROM LAST [1]No NTH_VALUE
slide-134
SLIDE 134

System Versioning

(Time Traveling)

slide-135
SLIDE 135

INSERT UPDATE DELETE are

DESTRUCTIVE

System Versioning The Problem

slide-136
SLIDE 136

CREATE TABLE t (..., start_ts TIMESTAMP(9) GENERATED
 ALWAYS AS ROW START, end_ts TIMESTAMP(9) GENERATED
 ALWAYS AS ROW END,
 PERIOD FOR SYSTEM_TIME (start_ts, end_ts) ) WITH SYSTEM VERSIONING

System Versioning Since SQL:2011

Table can be system versioned, application versioned or both.

slide-137
SLIDE 137

ID Data start_ts end_ts 1 X 10:00:00

UPDATE ... SET DATA = 'Y' ...

ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00

DELETE ... WHERE ID = 1 INSERT ... (ID, DATA) VALUES (1, 'X')

System Versioning Since SQL:2011

slide-138
SLIDE 138

ID Data start_ts end_ts 1 X 10:00:00

UPDATE ... SET DATA = 'Y' ...

ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00

DELETE ... WHERE ID = 1 INSERT ... (ID, DATA) VALUES (1, 'X')

System Versioning Since SQL:2011

slide-139
SLIDE 139

ID Data start_ts end_ts 1 X 10:00:00

UPDATE ... SET DATA = 'Y' ...

ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00

DELETE ... WHERE ID = 1

ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00 12:00:00

System Versioning Since SQL:2011

slide-140
SLIDE 140

Although multiple versions exist, only the “current”

  • ne is visible per default.

After 12:00:00, SELECT * FROM t doesn’t return anything anymore.

ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00 12:00:00

System Versioning Since SQL:2011

slide-141
SLIDE 141

ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00 12:00:00

With FOR … AS OF you can query anything you like: SELECT * FROM t FOR SYSTEM_TIME AS OF TIMESTAMP '2018-09-12 10:30:00'

ID Data start_ts end_ts 1 X 10:00:00 11:00:00

System Versioning Since SQL:2011

slide-142
SLIDE 142

System Versioning Since SQL:2011

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017

5.1 10.3

[0] MariaDB

MySQL PostgreSQL SQLite

10.1

[1]

DB2 LUW

10gR1

[2]

Oracle

2016 SQL Server

[0]Transaction time not immutable. See MDEV-16236. [1]Third column required (tx id), history table required. [2]Functionality available using Flashback
slide-143
SLIDE 143

SQL:2016

(released: 2016-12-15)

slide-144
SLIDE 144

MATCH_RECOGNIZE

(Row Pattern Matching)

slide-145
SLIDE 145

Row Pattern Matching

Time 30 minutes

Example: Logfile

slide-146
SLIDE 146

Row Pattern Matching

Example: Logfile

Time 30 minutes

Session 1 Session 2 Session 3 Session 4

slide-147
SLIDE 147

Row Pattern Matching

Example: Logfile

Time 30 minutes

Session 1 Session 2 Session 3 Session 4

Example problem:

  • Average session duration

Two approaches:

  • Row pattern matching
  • Start-of-group tagging
slide-148
SLIDE 148

SELECT COUNT(*) sessions , AVG(duration) avg_duration FROM log MATCH_RECOGNIZE( ORDER BY ts MEASURES LAST(ts) - FIRST(ts) AS duration ONE ROW PER MATCH PATTERN ( any cont* ) DEFINE cont AS ts < PREV(ts) + INTERVAL '30' minute ) t

Since SQL:2016

Row Pattern Matching

Time 30 minutes

Oracle doesn’t support avg on intervals — query doesn’t work as shown
slide-149
SLIDE 149

SELECT COUNT(*) sessions , AVG(duration) avg_duration FROM log MATCH_RECOGNIZE( ORDER BY ts MEASURES LAST(ts) - FIRST(ts) AS duration ONE ROW PER MATCH PATTERN ( any cont* ) DEFINE cont AS ts < PREV(ts) + INTERVAL '30' minute ) t

Since SQL:2016

Row Pattern Matching

Time 30 minutes

define
 continued

Oracle doesn’t support avg on intervals — query doesn’t work as shown
slide-150
SLIDE 150

SELECT COUNT(*) sessions , AVG(duration) avg_duration FROM log MATCH_RECOGNIZE( ORDER BY ts MEASURES LAST(ts) - FIRST(ts) AS duration ONE ROW PER MATCH PATTERN ( any cont* ) DEFINE cont AS ts < PREV(ts) + INTERVAL '30' minute ) t

Since SQL:2016

Row Pattern Matching

Time 30 minutes

Oracle doesn’t support avg on intervals — query doesn’t work as shown

undefined
 pattern variable: matches any row

slide-151
SLIDE 151

SELECT COUNT(*) sessions , AVG(duration) avg_duration FROM log MATCH_RECOGNIZE( ORDER BY ts MEASURES LAST(ts) - FIRST(ts) AS duration ONE ROW PER MATCH PATTERN ( any cont* ) DEFINE cont AS ts < PREV(ts) + INTERVAL '30' minute ) t

Since SQL:2016

Row Pattern Matching

Time 30 minutes

any number


  • f “cont”


rows

Oracle doesn’t support avg on intervals — query doesn’t work as shown
slide-152
SLIDE 152

SELECT COUNT(*) sessions , AVG(duration) avg_duration FROM log MATCH_RECOGNIZE( ORDER BY ts MEASURES LAST(ts) - FIRST(ts) AS duration ONE ROW PER MATCH PATTERN ( any cont* ) DEFINE cont AS ts < PREV(ts) + INTERVAL '30' minute ) t

Since SQL:2016

Row Pattern Matching

Time 30 minutes

Very much
 like GROUP BY

Oracle doesn’t support avg on intervals — query doesn’t work as shown
slide-153
SLIDE 153

SELECT COUNT(*) sessions , AVG(duration) avg_duration FROM log MATCH_RECOGNIZE( ORDER BY ts MEASURES LAST(ts) - FIRST(ts) AS duration ONE ROW PER MATCH PATTERN ( any cont* ) DEFINE cont AS ts < PREV(ts) + INTERVAL '30' minute ) t

Since SQL:2016

Row Pattern Matching

Time 30 minutes

Very much
 like SELECT

Oracle doesn’t support avg on intervals — query doesn’t work as shown
slide-154
SLIDE 154

SELECT COUNT(*) sessions , AVG(duration) avg_duration FROM log MATCH_RECOGNIZE( ORDER BY ts MEASURES LAST(ts) - FIRST(ts) AS duration ONE ROW PER MATCH PATTERN ( any cont* ) DEFINE cont AS ts < PREV(ts) + INTERVAL '30' minute ) t

Since SQL:2016

Row Pattern Matching

Time 30 minutes

Oracle doesn’t support avg on intervals — query doesn’t work as shown
slide-155
SLIDE 155

Row Pattern Matching

Before SQL:2016

Time 30 minutes

Now, let’s try using window functions

slide-156
SLIDE 156

SELECT count(*) sessions, avg(duration) avg_duration FROM (SELECT MAX(ts) - MIN(ts) duration FROM (SELECT ts, COUNT(grp_start) OVER(ORDER BY ts) session_no FROM (SELECT ts, CASE WHEN ts >= LAG( ts, 1, DATE'1900-01-1' ) OVER( ORDER BY ts ) + INTERVAL '30' minute THEN 1 END grp_start FROM log ) tagged ) numbered GROUP BY session_no ) grouped

Row Pattern Matching

Before SQL:2016

Time 30 minutes

Start-of-group tags

slide-157
SLIDE 157

SELECT count(*) sessions, avg(duration) avg_duration FROM (SELECT MAX(ts) - MIN(ts) duration FROM (SELECT ts, COUNT(grp_start) OVER(ORDER BY ts) session_no FROM (SELECT ts, CASE WHEN ts >= LAG( ts, 1, DATE'1900-01-1' ) OVER( ORDER BY ts ) + INTERVAL '30' minute THEN 1 END grp_start FROM log ) tagged ) numbered GROUP BY session_no ) grouped

Row Pattern Matching

Before SQL:2016

Time 30 minutes

number sessions 2222 2 33 3 44 4 2 3 4 1

slide-158
SLIDE 158

SELECT count(*) sessions, avg(duration) avg_duration FROM (SELECT MAX(ts) - MIN(ts) duration FROM (SELECT ts, COUNT(grp_start) OVER(ORDER BY ts) session_no FROM (SELECT ts, CASE WHEN ts >= LAG( ts, 1, DATE'1900-01-1' ) OVER( ORDER BY ts ) + INTERVAL '30' minute THEN 1 END grp_start FROM log ) tagged ) numbered GROUP BY session_no ) grouped

Row Pattern Matching

Before SQL:2016

Time 30 minutes

2222 2 33 3 44 4 2 3 4 1

slide-159
SLIDE 159

Row Pattern Matching

Since SQL:2016

https://www.slideshare.net/MarkusWinand/row-pattern-matching-in-sql2016

slide-160
SLIDE 160

Row Pattern Matching

Availability

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017

MariaDB MySQL PostgreSQL SQLite DB2 LUW

12cR1

Oracle SQL Server

slide-161
SLIDE 161

SQL has evolved beyond the relational idea.

Modern SQL?

@MarkusWinand

slide-162
SLIDE 162

SQL has evolved beyond the relational idea. If you are using SQL like 25 years ago,


you are doing it wrong! Modern SQL?

@MarkusWinand

slide-163
SLIDE 163

SQL has evolved beyond the relational idea. If you are using SQL like 25 years ago,


you are doing it wrong!

A lot has happened since SQL-92.

Modern SQL?

@MarkusWinand

slide-164
SLIDE 164

https://www.flickr.com/photos/mfoubister/25367243054/

slide-165
SLIDE 165

I have shown you 


a few features

today

https://www.flickr.com/photos/mfoubister/25367243054/

slide-166
SLIDE 166

I have shown you 


a few features

today

https://www.flickr.com/photos/mfoubister/25367243054/

There are hundreds more to discover

slide-167
SLIDE 167

@ModernSQL modern-sql.com

My other website: https://use-the-index-luke.com

Training & co: https://winand.at/