Formal Semantics of SQL
(and Cypher)
Paolo Guagliardo
Formal Semantics of SQL (and Cypher) Paolo Guagliardo SQL - - PowerPoint PPT Presentation
Formal Semantics of SQL (and Cypher) Paolo Guagliardo SQL Standard query language for relational databases $30B/year business Implemented in all major RDBMSs (free and commercial) First standardized in 1986 (ANSI) and 1987 (ISO)
Paolo Guagliardo
(SQL-89, SQL-92, SQL:1999, SQL:2003, SQL:2006, SQL:2008, SQL:2011, SQL:2016)
“The nice thing about standards is that you have so many to choose from” — Andrew S. Tanenbaum
SELECT * FROM R WHERE EXISTS ( SELECT * FROM ( SELECT R.A, R.A FROM R ) S )
Both PostgreSQL and Oracle output R
SELECT * FROM ( SELECT R.A, R.A FROM R ) S
PostgreSQL outputs a table with two columns named “A” Oracle throws an ERROR: reference to column “A” is ambiguous
Let’s have a look at the standard!
immediately contained in an <exists predicate>, then the <select list> is equivalent to a <value expression> that is an arbitrary <literal>.
sequence in which each <value expression> is a column reference that references a column of T and each column of T is referenced exactly once. The columns are referenced in the ascending sequence
SELECT * FROM R WHERE EXISTS ( SELECT * FROM ( SELECT R.A, R.A FROM R ) S ) SELECT * FROM ( SELECT R.A, R.A FROM R ) S SELECT S.A, S.A FROM ( SELECT R.A, R.A FROM R ) S SELECT R.A FROM R WHERE EXISTS ( SELECT 1 FROM ( SELECT R.A, R.A FROM R ) S )
Previous attempts
SELECT R.A FROM R WHERE R.A NOT IN ( SELECT S.A FROM S) SELECT R.A FROM R EXCEPT SELECT S.A FROM S SELECT R.A FROM R WHERE NOT EXISTS ( SELECT S.A FROM S WHERE S.A=R.A )
A 1 NULL A NULL
R S
A 1 A 1 NULL A
Answer
⌧ := (T1, . . . , Tk), := (N1, . . . , Nk), k > 0 ↵ := (A1, . . . , Am), 0 := (N 0
1, . . . , N 0 m), m > 0
Queries: Q := SELECT [DISTINCT] (↵:0 | *) FROM ⌧ : WHERE ✓ | Q (UNION | INTERSECT | EXCEPT) [ ALL ] Q Conditions: ✓ := TRUE | ¯ t (= | 6=) ¯ t | t IS [ NOT ] NULL | ¯ t [ NOT ] IN Q | EXISTS Q | ✓ AND ✓ | ✓ OR ✓ | NOT ✓
Essentially SQL without arithmetic, grouping and aggregation
Data model
Attribute references
JRKD,η = RD J⌧ : KD,η = J(T1, . . . , Tk) : (N1, . . . , Nk)KD,η = N1.JT1KD,η ⇥ · · · ⇥ Nk.JTkKD,η s
FROM⌧ :
WHERE✓ {
D,η=
a 2 J⌧ : KD,η | J✓KD,η;η¯
a = tt SELECT ⇤
FROM⌧ :
WHERE✓ |
D,η= s
FROM⌧ :
WHERE✓ {
D,η: 1 t SELECT ↵ : 0
FROM⌧ :
WHERE✓ |
D,η= ⇡α s
FROM⌧ :
WHERE✓ {
D,η! : 0 t SELECT DISTINCT ↵ : 0 | ⇤
FROM⌧ :
WHERE✓ |
D,η= " @ t SELECT ↵ : 0 | ⇤
FROM⌧ :
WHERE✓ |
D,η1 A JTRUEKD,η = t JtKD,η = ⇢⌘(A) if t = A t if t 2 C or t = NULL Jt1 = t2KD,η = 8 < : t if Jt1KD,η = Jt2KD,η and Jt1KD,η 6= NULL and Jt2KD,η 6= NULL f if Jt1KD,η 6= Jt2KD,η and Jt1KD,η 6= NULL and Jt2KD,η 6= NULL u if Jt1KD,η = NULL or Jt2KD,η = NULL Jt IS NULLKD,η = ⇢t if JtKD,η = NULL f if JtKD,η 6= NULL Jt IS NOT NULLKD,η = ¬Jt IS NULLKD,η J(t1, . . . tn) = (t0
1, . . . , t0 n)KD,η = n^
i=1Jti = t0
iKD,ηJ(t1, . . . tn) 6= (t0
1, . . . , t0 n)KD,η = n_
i=1Jti 6= t0
iKD,ηJ¯ t IN QKD,η = 8 < : t if 9¯ r 2 JQKD,η : J¯ t = ⌫(¯ r)KD,η = t f if 8¯ r 2 JQKD,η : J¯ t = ⌫(¯ r)KD,η = f u if @¯ r 2 JQKD,η : J¯ t = ⌫(¯ r)KD,η = t and 9¯ r 2 JQKD,η : J¯ t = ⌫(¯ r)KD,η 6= f J¯ t NOT IN QKD,η = ¬J¯ t IN QKD,η JEXISTS QKD,η = ⇢t if JQKD,η 6= ? f if JQKD,η = ? J✓1 AND ✓2KD,η = J✓1KD,η ^ J✓2KD,η J✓1 OR ✓2KD,η = J✓1KD,η _ J✓2KD,η JNOT ✓KD,η = ¬J✓KD,η JQ1 UNION ALL Q2KD,η = JQ1KD,η [ JQ2KD,η : `(JQ1K) JQ1 INTERSECT ALL Q2KD,η = JQ1KD,η \ JQ2KD,η : `(JQ1K) JQ1 EXCEPT ALL Q2KD,η = JQ1KD,η JQ2KD,η : `(JQ1K) JQ1 ? Q2KD,η = " JQ1 ? ALL Q2KD,η
? 2 {UNION, INTERSECT} JQ1 EXCEPT Q2KD,η = "(JQ1KD,η) JQ2KD,η : `(JQ1K)
and the University of Edinburgh
(Nadime Francis)