SQL Database Systems: The Complete Book Ch 2.3, 6.1-6.4 1 Project - - PowerPoint PPT Presentation

sql
SMART_READER_LITE
LIVE PREVIEW

SQL Database Systems: The Complete Book Ch 2.3, 6.1-6.4 1 Project - - PowerPoint PPT Presentation

SQL Database Systems: The Complete Book Ch 2.3, 6.1-6.4 1 Project Outline ??? Parser & SQL Query Relational Algebra Translator .sql JSqlParser Optimizer ??? Statistics Hope and Duct Tape? Query Evaluation Execution Plan Result


slide-1
SLIDE 1

SQL

Database Systems: The Complete Book Ch 2.3, 6.1-6.4

1

slide-2
SLIDE 2

Project Outline

SQL Query Parser & Translator Relational Algebra Optimizer Execution Plan Evaluation Engine Query Result

Statistics

???

Hope and Duct Tape?

JSqlParser .sql

???

2

slide-3
SLIDE 3

SQL is Human Readable

  • Lots of Syntactic Sugar
  • WHERE vs HAVING
  • Lots of Corner Cases
  • SELECT A, B vs SELECT A, SUM(B)
  • Non-obvious evaluation strategy
  • SELECT … FROM R, S, T, … WHERE …

3

SQL is hard to evaluate directly!

slide-4
SLIDE 4

Relational Algebra

  • Equivalent to SQL (to be discussed)
  • SIMPLE! (only a handful of operators)
  • “Non-declarative” (easy to rewrite)
  • Minimal corner cases or syntactic sugar

4

“RA” is easier to interpret!

slide-5
SLIDE 5

Relational Algebra

  • Basic Relational Operators
  • Select ( ), Project (π), Cross/Join (⨉/⋈),

Union (U), Relation (R, S, T, …), Minus (-)

  • Extended Relational Operators (more next week)
  • Aggregates (SUM,COUNT,MIN/MAX,AVERAGE)
  • List Operators: Sort, Limit

5

σ

slide-6
SLIDE 6

π 𝜏 ⋈

Employee Department

The Evaluation Pipeline

Parsed Query Data Results .sql How does this work? (later today) What does this look like? (today) How does this work? (next class)

6

slide-7
SLIDE 7

SQL

  • Developed by IBM (for System R) in the 1970s.
  • Standard used by many vendors.
  • SQL-86 (original standard)
  • SQL-89 (minor revisions; integrity constraints)
  • SQL-92 (major revision; basis for modern SQL)
  • SQL-99 (XML, window queries, generated default values)
  • SQL 2003 (major revisions to XML support)
  • SQL 2008 (minor extensions)
  • SQL 2011 (minor extensions; temporal databases)

7

slide-8
SLIDE 8

A Basic SQL Query

SELECT [DISTINCT] target-list FROM relation-list WHERE condition

A list of relation names (possibly with a range-variable after each name)

A list of attributes of relations in relation-list

Comparisons (‘=’, ‘<>’, ‘<‘, ‘>’, ‘<=’, ‘>=’) and other boolean predicates, combined using AND, OR, and NOT (a boolean formula)

(optional) keyword indicating that the answer should not contain duplicates

8

slide-9
SLIDE 9

A Basic SQL Query

SELECT [DISTINCT] target-list FROM relation-list WHERE condition

net.sf.jsqlparser.statement.select.PlainSelect mySelect.getDistinct() mySelect.getFromItem() and mySelect.getJoins() mySelect.getSelectItems() mySelect.getWhere()

9

slide-10
SLIDE 10

Query Evaluation

SELECT [DISTINCT] target-list FROM relation-list WHERE condition

1) Compute the 2n combinations of tuples in all relations appearing in relation-list 2) Discard tuples that fail the condition 3) Delete attributes not in target-list 4) If DISTINCT is specified, eliminate duplicate rows

This is the least efficient strategy to compute a query! A good optimizer will find more efficient strategies to compute the same answer.

10

slide-11
SLIDE 11

DISTINCT

11

Why do you explicitly indicate that you want duplicate elimination in SQL?

slide-12
SLIDE 12

Example-Wildcards

12

FirstName, LastName, Ship [James, Kirk, 1701A] [Leonard, McCoy, 1701A] [Spock, SonOfSarek, 1701A] . . . SELECT * FROM Officers WHERE Ship = ‘1701A’

Find all officers on the Enterprise (Ship 1701A)

FirstName, LastName, Ship [James, Kirk, 1701A] [Leonard, McCoy, 1701A] [Spock, SonOfSarek, 1701A] . . .

‘*’ denotes all attributes ‘Officers.*’ denotes all attributes in Officers

net.sf.jsqlparser.statement.select.AllColumns net.sf.jsqlparser.statement.select.AllTableColumns

slide-13
SLIDE 13

Example-Condition

13

FirstName, LastName, Ship [James, Kirk, 1701A] [Leonard, McCoy, 1701A] [Spock, SonOfSarek, 1701A] [Montgomery, Scott, 1701A] [Hikaru, Sulu, 2000 ] [Pavel, Chekov, 1701A] [Nyota, Uhura, 1701A] [Christine, Chapel, 0001 ] SELECT * FROM Officers WHERE Ship = ‘1701A’

Find all officers on the Enterprise (Ship 1701A)

FirstName, LastName, Ship [James, Kirk, 1701A] [Leonard, McCoy, 1701A] [Spock, SonOfSarek, 1701A] [Montgomery, Scott, 1701A] [Pavel, Chekov, 1701A] [Nyota, Uhura, 1701A]

slide-14
SLIDE 14

Example-Target List

14

FirstName, LastName, Ship [James, Kirk, 1701A] [Leonard, McCoy, 1701A] [Spock, SonOfSarek, 1701A] [Montgomery, Scott, 1701A] [Hikaru, Sulu, 2000 ] [Pavel, Chekov, 1701A] [Nyota, Uhura, 1701A] [Christine, Chapel, 0001 ]

SELECT O.FirstName,O.LastName FROM Officers O WHERE O.Ship = ‘1701A’

Find just names of all

  • fficers on the Enterprise

FirstName, LastName [James, Kirk ] [Leonard, McCoy ] [Spock, SonOfSarek] [Montgomery, Scott ] [Pavel, Chekov ] [Nyota, Uhura ]

slide-15
SLIDE 15

Example-Multiple Relations

15

FirstName, LastName, Ship [James, Kirk, 1701A] [Leonard, McCoy, 1701A] [Spock, SonOfSarek, 1701A] [Montgomery, Scott, 1701A] [Hikaru, Sulu, 2000 ] [Pavel, Chekov, 1701A] [Nyota, Uhura, 1701A] [Christine, Chapel, 0001 ] FirstName, LastName [Hikaru, Sulu ]

ID, Name, Location [1701A, Enterprise-A, Andoria ] [2000, Excelsior, Vulcan ] [1864, Reliant, Ceti Alpha VI]

SELECT FirstName,LastName FROM Officers, Ships WHERE Ship = ID AND Location = ‘Vulcan’

In English, what does this query compute?

Who is on a ship located at Vulcan?

slide-16
SLIDE 16

Example-Multiple Relations

16

SELECT FirstName,LastName FROM Officers, Ships WHERE Ship = ID AND Location = ‘Vulcan’

mySelect.getFromItem() returns ….schema.Table(Officers) mySelect.getJoins() returns List( ….select.Join(Table(Ships), {simple}) )

slide-17
SLIDE 17

Range Variables

17

SELECT FirstName,LastName FROM Officers, Ships WHERE Ship = ID AND Location = ‘Vulcan’

is the same as

SELECT Officers.FirstName,Officers.LastName FROM Officers, Ships WHERE Officers.Ship = Ships.ID AND Ships.Location = ‘Vulcan’

is the same as

SELECT O.FirstName,O.LastName FROM Officers O, Ships S WHERE O.Ship = S.ID AND S.Location = ‘Vulcan’

But it’s good style to use range variables and fully- qualified attribute names! JSqlParser calls this an “alias”

slide-18
SLIDE 18

Expressions

18

SELECT O.age, age1 = O.age*0.2, O.age*3.0 AS age2 FROM Officers O

[age, age1, age2]

Arithmetic expressions can appear in targets or conditions. Use ‘=’ or ‘AS’ to assign names to these attributes. (The behavior of unnamed attributes is unspecified)

slide-19
SLIDE 19

Strings

19

[Pavel, Chekov] [Christine, Chapel]

SELECT O.FirstName, O.LastName FROM Officers O WHERE S.LastName LIKE ‘Ch%e%’

SQL uses single quotes for ‘string literals’

slide-20
SLIDE 20

Strings

20

[Pavel, Chekov] [Christine, Chapel]

SELECT O.FirstName, O.LastName FROM Officers O WHERE O.LastName LIKE ‘Ch%e%’

LIKE is used for String Matches ‘%’ matches 0 or more characters

(like RegEx /.*/)

slide-21
SLIDE 21

Strings

21

[Pavel, Chekov] [Christine, Chapel]

SELECT O.FirstName, O.LastName FROM Officers O WHERE O.LastName LIKE ‘Ch_%e%’

LIKE is used for String Matches ‘%’ matches 0 or more characters

(like RegEx /.*/)

slide-22
SLIDE 22

UNION

22

Computes the union of any two union-compatible sets of tuples SELECT O.FirstName FROM Officers O WHERE O.LastName = ‘Kirk’ OR O.LastName = ‘Picard’

SELECT O.FirstName FROM Officers O WHERE O.LastName = ‘Kirk’ UNION SELECT O.FirstName FROM Officers O WHERE O.LastName = ‘Picard’

is the same as

slide-23
SLIDE 23

UNION

23

SELECT O.FirstName FROM Officers O WHERE O.LastName = ‘Kirk’ UNION SELECT O.FirstName FROM Officers O WHERE O.LastName = ‘Picard’

net.sf.jsqlparser.statement.select.Union myUnion.getPlainSelects()

slide-24
SLIDE 24

Nested Queries

24

SELECT O.FirstName, O.LastName FROM Officers O WHERE O.ID IN (SELECT V.Officer FROM Visited V WHERE V.Planet = ‘Vulcan’)

IN nested query must have exactly one attribute

Use NOT IN for all officers who have never visited ‘Vulcan’

What does this query compute?

net.sf.jsqlparser.statement.select.SubSelect net.sf.jsqlparser.expression.operators.relational.InExpression

slide-25
SLIDE 25

Nested Queries

25

SELECT O.FirstName, O.LastName FROM Officers O WHERE EXISTS (SELECT * FROM Visited V WHERE V.Planet = ‘Vulcan’ AND O.ID = V.Officer)

(With Correlation)

EXISTS is true if the nested query returns at least one result

The nested query can refer to attributes from the outer query

net.sf.jsqlparser.expression.operators.relational.ExistsExpression

slide-26
SLIDE 26

More Set Operators

26

IN EXISTS NOT IN NOT EXISTS

slide-27
SLIDE 27

More Set Operators

27

[op] ANY [op] ALL

SELECT * FROM Officers O WHERE O.Rank > ALL (SELECT O2.rank FROM Officers O2, Ships S WHERE O2.Ship = S.ID AND S.Name = ‘Enterprise’ )

What does this compute?

Which officers outrank every officer on the Enterprise?

net.sf.jsqlparser.expression.AllComparisonExpression

slide-28
SLIDE 28

From-Nesting

28

SELECT * FROM Officers O, (SELECT V.Officer FROM Visited V WHERE V.Planet = ‘Andoria’ ) A WHERE O.ID = A.Officer Queries are relations!

net.sf.jsqlparser.statement.select.SubSelect

slide-29
SLIDE 29

Aggregate Operators

29

SELECT COUNT(*) FROM Officers O, Ships S WHERE O.Ship = S.ID AND S.Name = ‘Enterprise’ What does this compute?

How many officers are on the Enterprise?

net.sf.jsqlparser.expression.Function

slide-30
SLIDE 30

Aggregate Operators

30

COUNT(*) COUNT(DISTINCT A[, B[, …]]) SUM([DISTINCT] A) AVG([DISTINCT] A) MAX(A) MIN(A)

Single Column/Expression

slide-31
SLIDE 31

Group Exercise

31

SELECT * FROM Officers O WHERE O.Rank > ALL (SELECT O2.rank FROM Officers O2, Ships S WHERE O2.Ship = S.ID AND S.Name = ‘Enterprise’ )

How could you write this query without ALL?

slide-32
SLIDE 32

Aggregate Operators

32

SELECT S.Name, AVG(O.Age) FROM Officers O, Ships S WHERE O.Ship = S.ID

This query is illegal! Why?

Can’t combine Aggregate and Non-Aggregate targets!

GROUP BY S.Name Grouping allows us to apply aggregates to Groups of tuples.

slide-33
SLIDE 33

Group-By Queries

33

SELECT [DISTINCT] target-list FROM relation-list WHERE condition GROUP BY grouping-list HAVING group-condition

The target-list now contains (a) grouped attributes (b) aggregate expressions Targets of type (a) must be a subset of the grouping-list (intuitively each answer tuple corresponds to a single group, and each group must have a single value for each attribute)

slide-34
SLIDE 34

Group-By Queries

SELECT [DISTINCT] target-list FROM relation-list WHERE condition GROUP BY grouping-list HAVING group-condition

The condition is applied before grouping The having-condition is applied after grouping

34

slide-35
SLIDE 35

Group-By Queries

SELECT [DISTINCT] target-list FROM relation-list WHERE condition GROUP BY grouping-list HAVING group-condition

mySelect.getGroupByColumnReferences() mySelect.getHaving()

35

slide-36
SLIDE 36

Order By/Limit

36

SELECT O.Name, O.Rank FROM Officers O

How can we compute the Top 5 officers by rank?

ORDER BY O.Rank LIMIT 5

mySelect.getOrderByElements() mySelect.getLimit()

slide-37
SLIDE 37

Defining Relations in SQL

37

CREATE TABLE Officers ( FirstName CHAR(20), LastName CHAR(20), Ship CHAR(5), ID INTEGER ) CREATE TABLE Ships ( ID CHAR(5), Name CHAR(20), Location CHAR(40) )

The schema defines not only the column names, but also their types (domains) For example a 20- character string

slide-38
SLIDE 38

Modifying Relations

38

DROP TABLE Officers ALTER TABLE Ships ADD COLUMN Commissioned DATE

Destroy the relation ‘Officers’ All schema information AND tuples are deleted

Add a new column (field) to the Ships relation Every tuple in the current instance is extended with a ‘null’ value in the new field

slide-39
SLIDE 39

Adding and Deleting Tuples

39

INSERT INTO Officers (FirstName, LastName, Ship) VALUES (‘Benjamin’, ‘Sisko’, ‘74205’)

DELETE FROM Officers O WHERE O.Ship = ‘2000’

Insert single tuples using:

Can delete all tuples satisfying some condition (e.g., Ship = 2000) More powerful data manipulation commands are available in SQL (We’ll discuss them later in the course)

slide-40
SLIDE 40

SQL

40

  • SQL is a language for querying relations
  • SELECT to access (query) data
  • Different features for different access patterns.
  • INSERT INTO, DELETE FROM to modify data
  • CREATE TABLE, DROP TABLE,

ALTER TABLE to modify relations

  • Next time…
  • Translating SQL to Relational Algebra (equivalence)