Modernes SQL Wie PostgreSQL die Konkurrenz aussticht @MarkusWinand - - PowerPoint PPT Presentation

modernes sql wie postgresql die konkurrenz aussticht
SMART_READER_LITE
LIVE PREVIEW

Modernes SQL Wie PostgreSQL die Konkurrenz aussticht @MarkusWinand - - PowerPoint PPT Presentation

Modernes SQL Wie PostgreSQL die Konkurrenz aussticht @MarkusWinand PGConf.de - 2018-04-13 FILTER Before we start FILTER In SQL, most aggregate functions * drop null arguments prior to the aggregation. *Exceptions: Aggregate


slide-1
SLIDE 1

Modernes SQL
 Wie PostgreSQL die Konkurrenz aussticht

@MarkusWinand PGConf.de - 2018-04-13

slide-2
SLIDE 2

FILTER

slide-3
SLIDE 3

FILTER Before we start

In SQL, most aggregate functions*
 drop null arguments
 prior to the aggregation.

*Exceptions: Aggregate functions that return structured data: array_agg, json_objectagg, json_arrayagg, xmlagg See: http://modern-sql.com/concept/null#aggregates

slide-4
SLIDE 4

SELECT YEAR, SUM(CASE WHEN MONTH = 1 THEN revenue
 ELSE 0
 END) JAN, SUM(CASE WHEN MONTH = 2 THEN revenue END) FEB, … FROM sales GROUP BY YEAR

FILTER The Problem

Pivot table: Years on the Y axis, month on X:

slide-5
SLIDE 5

SELECT YEAR, SUM(CASE WHEN MONTH = 1 THEN revenue
 ELSE 0
 END) JAN, SUM(CASE WHEN MONTH = 2 THEN revenue END) FEB, … FROM sales GROUP BY YEAR

FILTER The Problem

Pivot table: Years on the Y axis, month on X:

slide-6
SLIDE 6

SELECT YEAR, SUM(revenue) FILTER (WHERE MONTH = 1) JAN, SUM(revenue) FILTER (WHERE MONTH = 2) FEB, … FROM sales GROUP BY YEAR;

FILTER Since SQL:2003

SQL:2003 allows FILTER (WHERE…) after aggregates:

slide-7
SLIDE 7

FILTER Since SQL:2003

Year 2016 2016 2016 2016 2016 Month 1 2 3 ... 12 Revenue 1 23 345 ... 1234 Year 2016 Jan 1 Feb 23 Mar 345 ... ... Dec 1234 S U M ( … ) F I L T E R ( W H E R E … ) S U M ( … ) F I L T E R ( W H E R E m

  • n

t h = 2 ) S U M ( r e v e n u e ) F I L T E R ( W H E R E m

  • n

t h = 3 ) S U M ( r e v e n u e ) F I L T E R ( W H E R E m

  • n

t h = … ) S U M ( r e v e n u e ) F I L T E R ( W H E R E m

  • n

t h = 1 2 )

Pivot in SQL

  • 1. Use GROUP BY

to combine rows

  • 2. Use FILTER to pick

rows per column

See: http://modern-sql.com/use-case/pivot

slide-8
SLIDE 8

SELECT ent , MAX(val) FILTER(WHERE att='name') name , MAX(val) FILTER(WHERE att='email') email , MAX(val) FILTER(WHERE att='website') website FROM eav GROUP BY ent HAVING COUNT(*) FILTER(WHERE att='email') = 1 AND COUNT(*) FILTER(WHERE att='website') <= 1

FILTER

Use case: Flatten the EAV-Model
 (entity-attribute-value)

Since SQL:2003

GROUP BY

slide-9
SLIDE 9

SELECT ent , MAX(val) FILTER(WHERE att='name') name , MAX(val) FILTER(WHERE att='email') email , MAX(val) FILTER(WHERE att='website') website FROM eav GROUP BY ent HAVING COUNT(*) FILTER(WHERE att='email') = 1 AND COUNT(*) FILTER(WHERE att='website') <= 1

FILTER

Use case: Flatten the EAV-Model
 (entity-attribute-value)

Since SQL:2003

Pick each
 attribute ARRAY_AGG, XMLAGG, … are useful too MAX works


  • n strings too
slide-10
SLIDE 10

SELECT ent , MAX(val) FILTER(WHERE att='name') name , MAX(val) FILTER(WHERE att='email') email , MAX(val) FILTER(WHERE att='website') website FROM eav GROUP BY ent HAVING COUNT(*) FILTER(WHERE att='email') = 1 AND COUNT(*) FILTER(WHERE att='website') <= 1

FILTER

Use case: Flatten the EAV-Model
 (entity-attribute-value)

Since SQL:2003

Mandatory

slide-11
SLIDE 11

SELECT ent , MAX(val) FILTER(WHERE att='name') name , MAX(val) FILTER(WHERE att='email') email , MAX(val) FILTER(WHERE att='website') website FROM eav GROUP BY ent HAVING COUNT(*) FILTER(WHERE att='email') = 1 AND COUNT(*) FILTER(WHERE att='website') <= 1

FILTER

Use case: Flatten the EAV-Model
 (entity-attribute-value)

Since SQL:2003

Optional, but


  • nly one

Mandatory

slide-12
SLIDE 12

FILTER Availability

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017

5.1

MariaDB MySQL

9.4

PostgreSQL SQLite DB2 LUW Oracle SQL Server

slide-13
SLIDE 13

BOOLEAN Aggregates

slide-14
SLIDE 14

Before we start

SQL uses a three-valued logic. Boolean values are either
 true, false or unknown(=null).

See: http://modern-sql.com/concept/three-valued-logic

BOOLEAN Aggregates

slide-15
SLIDE 15

SELECT ent , MAX(val) FILTER(WHERE att='name') name , MAX(val) FILTER(WHERE att='email') email , MAX(val) FILTER(WHERE att='website') website FROM eav GROUP BY ent HAVING COUNT(*) FILTER(WHERE att='email') = 1 AND COUNT(*) FILTER(WHERE att='website') <= 1

BOOLEAN Aggregates

Use case: Validate group properties
 (previous example continued)

Since SQL:2003

slide-16
SLIDE 16

SELECT ent , MAX(val) FILTER(WHERE att='name') name , MAX(val) FILTER(WHERE att='email') email , MAX(val) FILTER(WHERE att='website') website FROM eav GROUP BY ent HAVING COUNT(*) FILTER(WHERE att='email') = 1 AND COUNT(*) FILTER(WHERE att='website') <= 1

BOOLEAN Aggregates

Use case: Validate group properties
 (previous example continued)

Since SQL:2003

HAVING SOME(att='email') Equivalent to COUNT(*) FILTER(WHERE att='email') > 0

Assumption: constraint ensures only one email

slide-17
SLIDE 17

BOOLEAN Aggregates

EVERY(<cond>) ⇔ COUNT(*) FILTER(WHERE NOT(<cond>)) = 0

Since SQL:2003

ISO SQL

slide-18
SLIDE 18

BOOLEAN Aggregates

EVERY(<cond>) ⇔ COUNT(*) FILTER(WHERE NOT(<cond>)) = 0

Since SQL:2003

ISO SQL

slide-19
SLIDE 19

BOOLEAN Aggregates

EVERY(<cond>) ⇔ COUNT(*) FILTER(WHERE NOT(<cond>)) = 0

Since SQL:2003

Actually tests
 for no false!
 (unknown is removed) ISO SQL

slide-20
SLIDE 20

BOOLEAN Aggregates

EVERY(<cond>) ⇔ COUNT(*) FILTER(WHERE NOT(<cond>)) = 0 SOME(<cond>) ⇔ COUNT(*) FILTER(WHERE <cond>) > 0 ANY(<cond>) ⇔ COUNT(*) FILTER(WHERE <cond>) > 0

Since SQL:2003

}Same!

ISO SQL

slide-21
SLIDE 21

EVERY(…) SUM(CASE … WHEN TRUE THEN 0 BOOL_AND(…) WHEN FALSE THEN 1 END) = 0 BOOL_OR(…) ⇔ SUM(CASE … WHEN TRUE THEN 1 WHEN FALSE THEN 0 END) > 0

BOOLEAN Aggregates

EVERY(<cond>) ⇔ COUNT(*) FILTER(WHERE NOT(<cond>)) = 0 SOME(<cond>) ⇔ COUNT(*) FILTER(WHERE <cond>) > 0 ANY(<cond>) ⇔ COUNT(*) FILTER(WHERE <cond>) > 0

Since SQL:2003

}Same!

ISO SQL PostgreSQL

PostgreSQL seems to have a small incompatibility:
 if all values are unknown, it returns unknown instead of true.

} {

slide-22
SLIDE 22

BOOLEAN Aggregates

Since SQL:2003

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017

MariaDB MySQL

8.4[0]

PostgreSQL SQLite DB2 LUW Oracle SQL Server

[0]Only EVERY(), which returns UNKNOWN if everything is NULL. Also: bool_or (similar to SOME).

slide-23
SLIDE 23

BOOLEAN Tests

slide-24
SLIDE 24

BOOLEAN Tests

Similar to is null, there are tests for each Boolean value
 (of which there are three: true, false, unknown/null) IS [NOT] [TRUE|FALSE|UNKNOWN] Example: Truly checking for “every” (no false, no unknown): COUNT(*) FILTER(WHERE <cond> IS NOT TRUE) = 0 COUNT(*) FILTER(WHERE <cond>) = COUNT(*)

(empty group returns true — like ISO SQL’s every)

Since SQL:2003

slide-25
SLIDE 25

BOOLEAN Tests

Similar to is null, there are tests for each Boolean value
 (of which there are three: true, false, unknown/null) IS [NOT] [TRUE|FALSE|UNKNOWN] Example: Truly checking for “every” (no false, no unknown): COUNT(*) FILTER(WHERE <cond> IS NOT TRUE) = 0 COUNT(*) FILTER(WHERE <cond>) = COUNT(*)

(empty group returns true — like ISO SQL’s every)

Since SQL:2003

slide-26
SLIDE 26

BOOLEAN Tests

Similar to is null, there are tests for each Boolean value
 (of which there are three: true, false, unknown/null) IS [NOT] [TRUE|FALSE|UNKNOWN] Example: Truly checking for “every” (no false, no unknown): COUNT(*) FILTER(WHERE <cond> IS NOT TRUE) = 0 COUNT(*) FILTER(WHERE <cond>) = COUNT(*)

(empty group returns true — like ISO SQL’s every)

Since SQL:2003

slide-27
SLIDE 27

BOOLEAN Tests

Similar to is null, there are tests for each Boolean value
 (of which there are three: true, false, unknown/null) IS [NOT] [TRUE|FALSE|UNKNOWN] Example: Truly checking for “every” (no false, no unknown): COUNT(*) FILTER(WHERE <cond> IS NOT TRUE) = 0 COUNT(*) FILTER(WHERE <cond>) = COUNT(*)

(empty group returns true — like ISO SQL’s every)

Since SQL:2003

slide-28
SLIDE 28

BOOLEAN Tests Since SQL:2003

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017

5.1

MariaDB

5.0.51a

MySQL

8.3

PostgreSQL

3.5.7[0]

SQLite DB2 LUW Oracle SQL Server

[0]No IS [NOT] UNKNOWN. Use IS [NOT] NULL instead

slide-29
SLIDE 29

BOOLEAN Type

slide-30
SLIDE 30

BOOLEAN Type

CREATE TABLE … ( … deleted BOOLEAN NOT NULL, … )

Since SQL:2003

SELECT … FROM … WHERE NOT(deleted)

slide-31
SLIDE 31

BOOLEAN Type

CREATE TABLE … ( … deleted BOOLEAN NOT NULL, … )

Since SQL:2003

Alias for tinyint

MySQL MariaDB

slide-32
SLIDE 32

BOOLEAN Type

CREATE TABLE … ( … deleted BOOLEAN NOT NULL, … )

Since SQL:2003

Alias for tinyint

MySQL MariaDB

INSERT … (…, deleted, …) VALUES (…, true, …)

slide-33
SLIDE 33

BOOLEAN Type

CREATE TABLE … ( … deleted BOOLEAN NOT NULL, … )

Since SQL:2003

Alias for tinyint

MySQL MariaDB

INSERT … (…, deleted, …) VALUES (…, true, …) INSERT … (…, deleted, …) VALUES (…, false, …)

slide-34
SLIDE 34

BOOLEAN Type

CREATE TABLE … ( … deleted BOOLEAN NOT NULL, … )

Since SQL:2003

Alias for tinyint

MySQL MariaDB

INSERT … (…, deleted, …) VALUES (…, true, …) INSERT … (…, deleted, …) VALUES (…, false, …) INSERT … (…, deleted, …) VALUES (…, 42, …)

slide-35
SLIDE 35

BOOLEAN Type

CREATE TABLE … ( … deleted BOOLEAN NOT NULL, … )

Since SQL:2003

Alias for tinyint

MySQL MariaDB

INSERT … (…, deleted, …) VALUES (…, true, …) INSERT … (…, deleted, …) VALUES (…, false, …) INSERT … (…, deleted, …) VALUES (…, 42, …) UNIQUE,

slide-36
SLIDE 36

BOOLEAN Type

CREATE TABLE … ( … deleted BOOLEAN NOT NULL, … )

Since SQL:2003

Alias for tinyint

MySQL MariaDB

INSERT … (…, deleted, …) VALUES (…, true, …) INSERT … (…, deleted, …) VALUES (…, false, …) INSERT … (…, deleted, …) VALUES (…, 42, …) UNIQUE, +----+ | de | +----+ | 1 | | 0 | | 42 | +----+

slide-37
SLIDE 37

BOOLEAN Type Since SQL:2003

Note that boolean in base tables is often questionable:


  • deleted flags are often better represented as deleted_at


(or more advanced temporal models)


  • States often need more than two (or three) values


consider using an enum instead 
 See: 3 Reasons I Hate Booleans In Databases by Jeff Potter
 https://medium.com/@jpotts18/646d99696580

slide-38
SLIDE 38

BOOLEAN Type Since SQL:2003

However, boolean is also useful in 
 derived tables (subqueries)
 to improve readability Pretty name for


  • uter queries

SELECT order_id , SOME(gift_wrap IS NOT NULL) contains_gifts FROM order_lines GROUP BY order_id

slide-39
SLIDE 39

BOOLEAN Type Since SQL:2003

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017

5.1[0]

MariaDB

5.0.51a[0]

MySQL

8.4

PostgreSQL SQLite DB2 LUW Oracle SQL Server

[0]BOOLEAN, TRUE, FALSE are aliases for TINYINT(1), 1, 0 respectivley.

slide-40
SLIDE 40

CHECK Constraints

slide-41
SLIDE 41

CHECK Constraints Since SQL-92

MySQL

CREATE TABLE … ( … deleted BOOLEAN NOT NULL
 CHECK (deleted IN (true, false)), … )

slide-42
SLIDE 42

CHECK Constraints Since SQL-92

MySQL

CREATE TABLE … ( … deleted BOOLEAN NOT NULL
 CHECK (deleted IN (true, false)), … ) INSERT … (…, deleted, …) VALUES (…, true, …)

slide-43
SLIDE 43

CHECK Constraints Since SQL-92

MySQL

CREATE TABLE … ( … deleted BOOLEAN NOT NULL
 CHECK (deleted IN (true, false)), … ) INSERT … (…, deleted, …) VALUES (…, true, …) INSERT … (…, deleted, …) VALUES (…, false, …)

slide-44
SLIDE 44

CHECK Constraints Since SQL-92

MySQL

CREATE TABLE … ( … deleted BOOLEAN NOT NULL
 CHECK (deleted IN (true, false)), … ) INSERT … (…, deleted, …) VALUES (…, true, …) INSERT … (…, deleted, …) VALUES (…, false, …) INSERT … (…, deleted, …) VALUES (…, 42, …) Syntax accepted,
 Constraint ignored

slide-45
SLIDE 45

CHECK Constraints Since SQL-92

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017

10.2

MariaDB MySQL[0]

8.3

PostgreSQL

3.5.7

SQLite

9.7

DB2 LUW

11gR1

Oracle

2008R2

SQL Server

[0]Syntax accepted, but ignored without notice.

slide-46
SLIDE 46

DOMAIN

slide-47
SLIDE 47

CREATE DOMAIN positive_int AS INTEGER CHECK (VALUE > 0); CREATE TABLE order_lines ( …, quantity positive_int NOT NULL, … );

DOMAIN Since SQL:2003

A SQL domain is a set of permissible values.[SQL:2016-2: §4.12] Or: A way to manage CHECK constraints and DEFAULTs. DOMAINS are
 based on
 predefined types

slide-48
SLIDE 48

CREATE DOMAIN positive_int AS INTEGER CHECK (VALUE > 0); CREATE TABLE order_lines ( …, quantity positive_int NOT NULL, … );

DOMAIN Since SQL:2003

A SQL domain is a set of permissible values.[SQL:2016-2: §4.12] Or: A way to manage CHECK constraints and DEFAULTs.

slide-49
SLIDE 49

CREATE DOMAIN positive_int AS INTEGER CHECK (VALUE > 0); CREATE TABLE order_lines ( …, quantity positive_int NOT NULL, … );

DOMAIN Since SQL:2003

A SQL domain is a set of permissible values.[SQL:2016-2: §4.12] Or: A way to manage CHECK constraints and DEFAULTs. DOMAINS feel
 like types without
 type safety

slide-50
SLIDE 50

CREATE DOMAIN positive_int AS INTEGER CHECK (VALUE > 0); CREATE TABLE order_lines ( …, quantity positive_int NOT NULL, … );

DOMAIN Since SQL:2003

A SQL domain is a set of permissible values.[SQL:2016-2: §4.12] Or: A way to manage CHECK constraints and DEFAULTs. DOMAINS feel
 like types without
 type safety CAST(… AS <domain>) casts to the base type and
 checks the constraint

slide-51
SLIDE 51

CREATE DOMAIN positive_int AS INTEGER CONSTRAINT gt_zero CHECK (VALUE > 0); ALTER DOMAIN positive_int ADD CONSTRAINT ge_zero CHECK (VALUE >= 0); ALTER DOMAIN positive_int DROP CONSTRAINT gt_zero; ALTER DOMAIN positive_int RENAME TO unsigned_int;

DOMAIN Since SQL:2003

Domains can have multiple, named check constraints.

slide-52
SLIDE 52

CREATE DOMAIN positive_int AS INTEGER CONSTRAINT gt_zero CHECK (VALUE > 0); ALTER DOMAIN positive_int ADD CONSTRAINT ge_zero CHECK (VALUE >= 0); ALTER DOMAIN positive_int DROP CONSTRAINT gt_zero; ALTER DOMAIN positive_int RENAME TO unsigned_int;

DOMAIN Since SQL:2003

Domains can have multiple, named check constraints.

slide-53
SLIDE 53

CREATE DOMAIN positive_int AS INTEGER CONSTRAINT gt_zero CHECK (VALUE > 0); ALTER DOMAIN positive_int ADD CONSTRAINT ge_zero CHECK (VALUE >= 0); ALTER DOMAIN positive_int DROP CONSTRAINT gt_zero; ALTER DOMAIN positive_int RENAME TO unsigned_int;

DOMAIN Since SQL:2003

Domains can have multiple, named check constraints.

slide-54
SLIDE 54

CREATE DOMAIN positive_int AS INTEGER CONSTRAINT gt_zero CHECK (VALUE > 0); ALTER DOMAIN positive_int ADD CONSTRAINT ge_zero CHECK (VALUE >= 0); ALTER DOMAIN positive_int DROP CONSTRAINT gt_zero; ALTER DOMAIN positive_int RENAME TO unsigned_int;

DOMAIN Since SQL:2003

Domains can have multiple, named check constraints. PostgreSQL
 extension

slide-55
SLIDE 55

DOMAIN Since SQL:2003

PostgreSQL has a great extension: NOT VALID

PostgreSQL

slide-56
SLIDE 56

ALTER DOMAIN unsigned_int ADD CONSTRAINT gt_zero CHECK (VALUE > 0) NOT VALID; UPDATE order_lines SET quantity = ? WHERE quantity = 0; ALTER DOMAIN unsigned_int VALIDATE CONSTRAINT ge_zero; ALTER DOMAIN unsigned_int DROP CONSTRAINT gt_zero;

DOMAIN Since SQL:2003

PostgreSQL has a great extension: NOT VALID

PostgreSQL

Enforced on INSERT
 & UPDATE but not
 for existing values

slide-57
SLIDE 57

ALTER DOMAIN unsigned_int ADD CONSTRAINT gt_zero CHECK (VALUE > 0) NOT VALID; UPDATE order_lines SET quantity = ? WHERE quantity = 0; ALTER DOMAIN unsigned_int VALIDATE CONSTRAINT ge_zero; ALTER DOMAIN unsigned_int DROP CONSTRAINT gt_zero;

DOMAIN Since SQL:2003

PostgreSQL has a great extension: NOT VALID

PostgreSQL

Enforced on INSERT
 & UPDATE but not
 for existing values

slide-58
SLIDE 58

ALTER DOMAIN unsigned_int ADD CONSTRAINT gt_zero CHECK (VALUE > 0) NOT VALID; UPDATE order_lines SET quantity = ? WHERE quantity = 0; ALTER DOMAIN unsigned_int VALIDATE CONSTRAINT ge_zero; ALTER DOMAIN unsigned_int DROP CONSTRAINT gt_zero;

DOMAIN Since SQL:2003

PostgreSQL has a great extension: NOT VALID

PostgreSQL

Enforced on INSERT
 & UPDATE but not
 for existing values

slide-59
SLIDE 59

ALTER DOMAIN unsigned_int ADD CONSTRAINT gt_zero CHECK (VALUE > 0) NOT VALID; UPDATE order_lines SET quantity = ? WHERE quantity = 0; ALTER DOMAIN unsigned_int VALIDATE CONSTRAINT ge_zero; ALTER DOMAIN unsigned_int DROP CONSTRAINT gt_zero;

DOMAIN Since SQL:2003

PostgreSQL has a great extension: NOT VALID

PostgreSQL

Enforced on INSERT
 & UPDATE but not
 for existing values

slide-60
SLIDE 60

DOMAIN Since SQL:2003

PostgreSQL handles DROP DOMAIN … CASCADE proprietarily:

PostgreSQL

slide-61
SLIDE 61

DROP DOMAIN unsigned_int RESTRICT;

DOMAIN Since SQL:2003

PostgreSQL handles DROP DOMAIN … CASCADE proprietarily:

PostgreSQL

Fails if domain
 is in use
 (default)

slide-62
SLIDE 62

DROP DOMAIN unsigned_int RESTRICT; DROP DOMAIN unsigned_int CASCADE;

DOMAIN Since SQL:2003

PostgreSQL handles DROP DOMAIN … CASCADE proprietarily:

PostgreSQL

ISO behaviour is to
 copy the check constraints
 to the columns Drops COLUMNS that use the domain
 (and the domain)

slide-63
SLIDE 63

ALTER DOMAIN unsigned_int SET DEFAULT 1; ALTER DOMAIN unsigned_int DROP DEFAULT;

DOMAIN Since SQL:2003

Domains can also specify a DEFAULT value.

slide-64
SLIDE 64

DOMAIN Since SQL:2003

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017

MariaDB MySQL

8.3[0]

PostgreSQL SQLite DB2 LUW Oracle SQL Server

[0]DROP DOMAIN differs from the standard: defaults to RESTRICT, drops columns on CASCADE.

slide-65
SLIDE 65

XMLTABLE

slide-66
SLIDE 66

SELECT id , c1 , n FROM tbl , XMLTABLE( '/d/e' PASSING x COLUMNS id INT PATH '@id' , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r

XMLTABLE Since SQL:2006

Stored in tbl.x: <d> <e id="42"> <c1>…</c1> </e> </d> XPath* expression to identify rows

*Standard SQL allows XQuery,
 PostgreSQL supports only XPath

slide-67
SLIDE 67

SELECT id , c1 , n FROM tbl , XMLTABLE( '/d/e' PASSING x COLUMNS id INT PATH '@id' , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r

XMLTABLE Since SQL:2006

Stored in tbl.x: <d> <e id="42"> <c1>…</c1> </e> </d>

*Standard SQL allows XQuery,
 PostgreSQL supports only XPath

slide-68
SLIDE 68

SELECT id , c1 , n FROM tbl , XMLTABLE( '/d/e' PASSING x COLUMNS id INT PATH '@id' , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r

XMLTABLE Since SQL:2006

Stored in tbl.x: <d> <e id="42"> <c1>…</c1> </e> </d>

*Standard SQL allows XQuery,
 PostgreSQL supports only XPath

XPath* expressions to extract data

slide-69
SLIDE 69

SELECT id , c1 , n FROM tbl , XMLTABLE( '/d/e' PASSING x COLUMNS id INT PATH '@id' , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r

XMLTABLE Since SQL:2006

Stored in tbl.x: <d> <e id="42"> <c1>…</c1> </e> </d>

*Standard SQL allows XQuery,
 PostgreSQL supports only XPath

Row number (like for unnest)

slide-70
SLIDE 70

SELECT id , c1 , n FROM tbl , XMLTABLE( '/d/e' PASSING x COLUMNS id INT PATH '@id' , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r

XMLTABLE Since SQL:2006

Stored in tbl.x: <d> <e id="42"> <c1>…</c1> </e> </d>

*Standard SQL allows XQuery,
 PostgreSQL supports only XPath

Result id | c1 | n

  • ---+----+---

42 | … | 1

slide-71
SLIDE 71

XMLTABLE Availability

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017

MariaDB MySQL

10[0] PostgreSQL

SQLite

9.7

DB2 LUW

11gR1

Oracle SQL Server

[0]No XQuery (only XPath). No default namespace declaration.

slide-72
SLIDE 72

NULLS FIRST/LAST

slide-73
SLIDE 73

The sorting of NULL is implementation defined (some DBs sort NULL as great, others as very small value)

NULLS FIRST/LAST Before SQL:2003

slide-74
SLIDE 74

The sorting of NULL is implementation defined (some DBs sort NULL as great, others as very small value)

NULLS FIRST/LAST Before SQL:2003

SELECT … FROM … ORDER BY COALESCE(nullable, ?); If you know a value
 larger/smaller than any
 actual value…

slide-75
SLIDE 75

The sorting of NULL is implementation defined (some DBs sort NULL as great, others as very small value)

NULLS FIRST/LAST Before SQL:2003

SELECT … FROM … ORDER BY COALESCE(nullable, ?); ORDER BY CASE WHEN nullable IS NULL THEN 0 ELSE 1 END , nullable; Using an extra sort key
 to put NULL and NOT NULL apart is more robust This shows NULLs first (no matter if nullable
 is sorted ASC or DESC)

slide-76
SLIDE 76

SQL:2003 introduced ORDER BY … NULLS FIRST/LAST

NULLS FIRST/LAST Since SQL:2003

SELECT … FROM … ORDER BY nullable NULLS FIRST; Note: PostgreSQL accepts NULLS FIRST/LAST in index definitions. This returns
 NULLs first
 (for ASC and DESC)

slide-77
SLIDE 77

NULLS FIRST/LAST Since SQL:2003

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017

MariaDB[0] MySQL[0]

8.3[1]

PostgreSQL SQLite[0]

11.1[1]

DB2 LUW

11gR1[1]

Oracle SQL Server[0]

[0]By default sorted as smallest [1]By default sorted as greatest

slide-78
SLIDE 78

Inverse Distribution Functions (percentiles)

slide-79
SLIDE 79

SELECT d1.val FROM data d1 JOIN data d2 ON (d1.val < d2.val OR (d1.val=d2.val AND d1.id<d2.id)) GROUP BY d1.val HAVING count(*) = (SELECT FLOOR(COUNT(*)/2) FROM data d3)

Inverse Distribution Functions

The Problem

Grouped rows cannot be ordered prior aggregation.

(how to get the middle value (median) of a set)

Number rows Pick middle one

slide-80
SLIDE 80

SELECT d1.val FROM data d1 JOIN data d2 ON (d1.val < d2.val OR (d1.val=d2.val AND d1.id<d2.id)) GROUP BY d1.val HAVING count(*) = (SELECT FLOOR(COUNT(*)/2) FROM data d3)

Inverse Distribution Functions

The Problem

Grouped rows cannot be ordered prior aggregation.

(how to get the middle value (median) of a set)

Number rows Pick middle one

slide-81
SLIDE 81

SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY val) FROM data

Median Which value?

Since SQL:2003

Inverse Distribution Functions

slide-82
SLIDE 82

SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY val) FROM data

Since SQL:2003

Inverse Distribution Functions

Two variants:

  • for discrete values


(categories)

  • for continuous values


(linear interpolation)

slide-83
SLIDE 83

1 2 3 4

SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY val) FROM data

Since SQL:2003

Inverse Distribution Functions

Two variants:

  • for discrete values


(categories)

  • for continuous values


(linear interpolation)

slide-84
SLIDE 84

1 2 3 4

0.25 0.5 0.75 1

1 2 3 4

SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY val) FROM data

Since SQL:2003

Inverse Distribution Functions

Two variants:

  • for discrete values


(categories)

  • for continuous values


(linear interpolation)

slide-85
SLIDE 85

1 2 3 4

0.25 0.5 0.75 1

1 2 3 4

0.25 0.5 0.75 1

1 2 3 4 PERCENTILE_DISC

0.25 0.5 0.75 1

1 2 3 4 PERCENTILE_DISC

0.25 0.5 0.75 1

1 2 3 4 PERCENTILE_DISC(0.5)

SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY val) FROM data

Since SQL:2003

Inverse Distribution Functions

Two variants:

  • for discrete values


(categories)

  • for continuous values


(linear interpolation)

slide-86
SLIDE 86

1 2 3 4

0.25 0.5 0.75 1

1 2 3 4

0.25 0.5 0.75 1

1 2 3 4 PERCENTILE_DISC

0.25 0.5 0.75 1

1 2 3 4 PERCENTILE_DISC

0.25 0.5 0.75 1

1 2 3 4 PERCENTILE_DISC(0.5)

0.25 0.5 0.75 1

1 2 3 4 PERCENTILE_CONT PERCENTILE_DISC(0.5)

0.25 0.5 0.75 1

1 2 3 4 PERCENTILE_CONT(0.5) PERCENTILE_DISC(0.5)

SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY val) FROM data

Since SQL:2003

Inverse Distribution Functions

Two variants:

  • for discrete values


(categories)

  • for continuous values


(linear interpolation)

slide-87
SLIDE 87

Inverse Distribution Functions

Availability

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017

MariaDB MySQL

9.4

PostgreSQL SQLite

11.1

DB2 LUW

9iR1

Oracle

2012[0]

SQL Server

[0]Only as window function (requires OVER clause)

slide-88
SLIDE 88

MERGE

(won’t come before PostgreSQL 12) (was committed, is now reverted)

slide-89
SLIDE 89

Copying rows from another table is easy: INSERT INTO <target> SELECT … FROM <source> WHERE NOT EXISTS (SELECT * FROM <target> WHERE … )

MERGE The Problem

Both, 
 <target> and <source> are in scope here.

slide-90
SLIDE 90

Deleting rows that exist in another table is also possible: DELETE FROM <target> WHERE EXISTS (SELECT * FROM <source> WHERE … )

MERGE The Problem

Both, 
 <target> and <source> are in scope here.

slide-91
SLIDE 91

Updating rows from another table is awkward: UPDATE <target>
 SET … WHERE …

MERGE The Problem

Requires a name
 (table or updatable view) but not a subquery

slide-92
SLIDE 92

Updating rows from another table is awkward: UPDATE <target>
 SET … WHERE …

MERGE The Problem

Subqueries are a more common choice Sometimes, updatable views can help. Bringing another tables rows into scope of the SET clause is tricky

slide-93
SLIDE 93

Updating rows from another table is awkward: UPDATE <target>
 SET … WHERE … SET (col1, col2) = (SELECT col1, col2 FROM <source> WHERE … ) WHERE …

MERGE The Problem

Both, 
 <target> and <source> are in scope here.

slide-94
SLIDE 94

SQL:2008 introduced merge to improve this situation two-fold:

  • It has always two tables in scope,


the source can be a derived table (subquery).

  • It can do insert, update, or delete in one go.

MERGE INTO <target> USING <source> ON <join condition> WHEN MATCHED [AND <cond>] THEN [UPDATE…|DELETE…] WHEN NOT MATCHED [AND <cond>] THEN INSERT…

MERGE Since SQL:2008

WHEN/THEN
 can appear many times

slide-95
SLIDE 95

MERGE Since SQL:2008

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017

5.1

MariaDB MySQL PostgreSQL SQLite

9.1

DB2 LUW

9iR1[0]

Oracle

2008

SQL Server

[0]No AND condition.

slide-96
SLIDE 96
slide-97
SLIDE 97

About @MarkusWinand

  • Training for Developers
  • SQL Performance (Indexing)
  • Modern SQL
  • On-Site or Online
  • SQL Tuning
  • Index-Redesign
  • Query Improvements
  • On-Site or Online

https://winand.at/

slide-98
SLIDE 98

About @MarkusWinand

€0,- €10-30

sql-performance-explained.com

slide-99
SLIDE 99

About @MarkusWinand

@ModernSQL http://modern-sql.com