Relational Query Languages (2) SQL and QBE Walid G. Aref Query - - PowerPoint PPT Presentation

relational query languages 2 sql and qbe
SMART_READER_LITE
LIVE PREVIEW

Relational Query Languages (2) SQL and QBE Walid G. Aref Query - - PowerPoint PPT Presentation

Relational Query Languages (2) SQL and QBE Walid G. Aref Query Languages For The Relational Model Relational Algebra Procedural Domain Relational Query Query By Languages Relational Example (QBE) Calculus Declarative Tuple Relational


slide-1
SLIDE 1

Relational Query Languages (2) SQL and QBE

Walid G. Aref

slide-2
SLIDE 2

Query Languages For The Relational Model

Relational Query Languages Procedural

Relational Algebra

Declarative

Domain Relational Calculus Query By Example (QBE) Tuple Relational Calculus SQL

Walid G. Aref

slide-3
SLIDE 3

Query Languages For The Relational Model

Relational Query Languages Procedural

Relational Algebra

Declarative

Domain Relational Calculus Query By Example (QBE) Tuple Relational Calculus SQL

Walid G. Aref

slide-4
SLIDE 4

SQL: Structured Query Language

  • Standard since 1986
  • Declarative: State what you want, not how you evaluate it
  • Builds on Tuple Relational Calculus: Variables refer to tuples
  • Where are the variables in SQL?
  • Select [Distinct] (Output Attributes List)

From (Table List) Where (Qualification Predicates)

  • Example: Select DISTINCT S.name, S.gpa

From Students S ß S is a Tuple Variable Where S.age < 21

  • DISTINCT: Optional è Output table should not contain duplicates.
  • Default is that duplicates are not eliminated!
  • In contrast to Relational Algebra being based on sets (no duplicate tuples allowed,

which is expensive), SQL is based on bags (duplicate tuples allowed)

Walid G. Aref

slide-5
SLIDE 5

Expressing Relational Algebra Operators Using SQL

  • Select

: SELECT * FROM r WHERE r.id = 123 AND r.a =‘a’;

  • Project

: Select r.a, r.b+100 FROM r

  • Union

: (SELECT * FROM r) UNION (SELECT * FROM s);

  • Set Difference

: (SELECT * FROM r) EXCEPT (SELECT * FROM s);

  • Cross Product

: SELECT * FROM r, s; SELECT * FROM r CROSS JOIN s;

  • Rename

: Select cid as CourseID into Table1 FROM Courses;

  • Intersect

: (SELECT * FROM r) INTERSECT (SELECT * FROM s);

  • Join

: Select * from r, s where r.a = s.b

Walid G. Aref

slide-6
SLIDE 6

SQL: Null Values and Three Valued Logic

  • Nulls mean missing value, unknown, a

value does not exist or is not applicable.

  • Handling Nulls complicates things in SQL
  • In SQL, aggregate functions ignore null

values

  • Could have returned null as result instead.
  • What happens when Nulls are involved

in a predicate

  • Salary > 50K
  • But if salary is Null for some tuple. Returns

True or False? Or Null?

  • Three-valued Logic:
  • OR:

(unknown or true) = true (unknown or false) = unknown (unknown or unknown) = unknown

  • AND:

(true and unknown) = unknown (false and unknown) = false (unknown and unknown) = unknown

  • NOT:

(not unknown) = unknown

  • For SQL, the predicate:
  • P is unknown
  • Evaluates to true if Predicate P

evaluates to unknown

Walid G. Aref

slide-7
SLIDE 7

Our Example Relational Database Schema

  • Students(sid: string, name: string, login: string, age: integer, gpa: real)
  • Courses(cid: string, cname: string, credits: integer)
  • Enrolled(sid: string, cid: string, grade: string)
  • Instructor(iid: string, iname: string, irank: string, isalary: real)
  • Teaches(iid: string, cid: string, year: integer, semester: string)

Walid G. Aref

slide-8
SLIDE 8

Creating Tables in SQL, Keys, and Foreign Keys

  • Students(sid: string, name: string,

login: string, age: integer, gpa: real)

  • Enrolled(sid: string, cid: string,

grade: string)

  • CREATE TABLE Students

(sid: CHAR(20), name: CHAR(20), login: CHAR(10), age: INTEGER, gpa: REAL, PRIMARY KEY SID);

  • CREATE TABLE Enrolled

(sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid, cid), FOREIGN KEY (sid) REFERENCES Students ON DELETE CASCADE ON UPDATE SET DEFAULT, FOREIGN KEY (cid) REFERENCES Courses ON DELETE CASCADE ON UPDATE CASCADE)

  • Other options:
  • ON DELETE REJECT
  • UNIQUE (cid, grade)

Walid G. Aref

slide-9
SLIDE 9

Deleting and Altering Tables

  • To delete Table Students
  • DROP TABLE Students
  • Add an attribute to Table Students, and fill it with null
  • ALTER TABLE Students ADD COLUMN GraduationYear: integer
  • Can set another default value
slide-10
SLIDE 10

Data Manipulation in SQL

  • Insert new tuple into table:
  • INSERT INTO Students (sid, name, login, age, gpa)

VALUES (0111, ‘Bright, Mary’, ‘mb@purdue.edu’, 22, 4.0)

  • Can also perform bulk insert
  • Bulk Delete: Delete all the tuples that satisfy the delete predicate
  • DELETE FROM Students S

WHERE S.name = ‘Bright, Mary’

slide-11
SLIDE 11

Integrity Constraints

  • Must always be true for any instance of the database
  • Cannot detect if the data is true as long as it satisfies the integrity

constraints.

  • But can detect if the data is wrong
  • Strength in SQL
  • Integrity constraints are associated with the data definition and not with the

program logic

  • Integrity constraints are specified when schema is defined.
  • Does not depend on the programmers to enforce them. They are always verified by

the system

  • ICs are checked every time the relation is modified or updated
slide-12
SLIDE 12

Various Forms of Integrity Constraints

  • Domain Constraints: The possible data values permissible in a given

attribute

  • Key Constraints: Unique combinations of attribute values over the

entire table

  • Primary key constraint
  • Unique key constraints
  • Referential Integrity Constraints (Foreign Key Constraints):
  • Primary key value in the other relation must match with every foreign key

value.

  • Avoid “dangling pointers”
slide-13
SLIDE 13

Re Referential Integrity Constraints (Foreign Ke Key Con Constraints)

  • Foreign key values pointing to primary key values
  • Foreign Key à Primary Key
  • What should happen when a primary key value gets deleted or updated?
  • Integrity constraint enforced with the data itself not with the program logic
  • Domain experts specify what needs to be done

Create Table … … Foreign key ... References … On delete reject On update cascade On delete set default

slide-14
SLIDE 14

Supporting General Constraints

  • Can check general conditions, e.g.,
  • CREATE TABLE Students

(sid: CHAR(20), name: CHAR(20), login: CHAR(10), age: INTEGER, gpa: REAL, PRIMARY KEY SID, check gpa >= 0.0 and gpa <=4.0);

  • Check can contain other Select statements, etc., e.g., gpa > select gpa from …
  • Can also have an ASSERTION that is not associated by any one table
  • E.g., maintaining some student-professor ratio
  • Create Assertion Student-InstructorRatio(Check (

(select count(*) from students) / (select count(*) from instructors) < 15)

  • Has performance implication as these need to be rechecked with every update
slide-15
SLIDE 15

Views in SQL

  • View: A relation stored by storing the Select statement that generates it
  • CREATE VIEW BrightStudents (name, gpa)

AS SELECT name, gpa FROM Students WHERE gpa > 3.6 ORDER BY gpa desc <<< to order the tuples by gpa in descending order

  • Can limit the number of tuples in the output, e.g., add LIMIT 3 to get the top 3 students
  • Use of Views for Security: Can hide some attributes in the base tables, e.g., hiding

student’s grades or gpa when student is in the University health center. Also, hiding medical info of a student from the course instructor

  • SELECT name from BrightStudents where gpa = 4.0
  • Replace BrightStudents with its Select statement è Nested query
  • May not be very efficient during query execution

Walid G. Aref

slide-16
SLIDE 16

Materialized Views

  • Optimization: Materializing a view
  • A view can be materialized into disk by storing its result on disk
  • Previous query accesses the materialized view like a regular table
  • CREATE MATERIALIZED VIEW …
  • Is one form of redundancy as data is partly replicated in the base tables and

the materialized view tables

  • What happens when the base tables get updated?
  • Eager update of materialized views:
  • Whenever base tables get updated, will need to update the materialized views (overhead) è

Eager maintenance of materialized views (overhead)

  • Lazy update of materialized views:
  • When a base table is updated, accumulate all updates in delta tables, but do not update

materialized view tables eagerly.

  • When a query references a view, only then, propagate all delta tables related to this view into

the materialized view, then evaluate the query

slide-17
SLIDE 17

Updating Data in the Materialized View

  • Can we update the data in the materialized view?
  • Updating a view must translate to an update of the actual (base)

relations in the database.

  • When Keys of the base tables are part of the view definition
  • Thus, a tuple in the view will have a one-to-one mapping to the base

tables.

  • Thus, an update will be possible, and will reflect uniquely and

unambiguously to tuples in the base tables.

Walid G. Aref

slide-18
SLIDE 18

SQL: Scalar and Table Functions

  • User-defined functions are defined in SQL using:
  • CREATE FUNCTION .. RETURNS TABLE…
  • CREATE FUNCTION .. RETURNS INTEGER …
  • Accept parameters as input
  • Perform calculations
  • Return the result.
  • The return value can either be
  • A scalar (single) value

è Scalar Function

  • A table

è Table Function (Function that returns a table)

  • Where do we use a table function? In the FROM clause (or in any location

where a table is expected)

Walid G. Aref

slide-19
SLIDE 19

Nesting Queries in SQL

  • Find the names of the instructors who taught CS541
  • Instructor(iid: string, iname: string, irank: string, isalary: real)
  • Teaches(iid: string, cid: string, year: integer, semester: string)
  • Select i.iname

From instructors i where i.iid in (select t.iid from teaches t where t.cid = “CS541”);

  • Can plug in an SQL query anywhere in the Select statement where we are

expecting a table, e.g., in the From clause or as shown above.

  • Other locations are after: IN, EXISTS and UNIQUE, NOT IN, NOT EXISTS and NOT

UNIQUE, op ANY, op ALL, where op is any of: >, <, =, ≤, ≥, ≠

Walid G. Aref

slide-20
SLIDE 20

Nesting Queries with Correlations in SQL

  • 1. Select i.iname

From instructors i where i.iid in (select t.iid from teaches t where t.cid = “CS541”);

  • Notice that the inner query is independent of the outside query
  • 2. Select i.iname

From instructors i where exists (select * from teaches t where i.iid = t.iid and t.cid = “CS541”);

  • The inner query is dependent on (correlated with) the outer

query.

  • For every outer tuple, the inner query needs to be re-evaluated

Walid G. Aref

  • 3. Another example of a correlated

nested query: Find the names of the instructors who taught CS541 at most once Select i.iname From instructors i where unique ( select t.cid from teaches t where i.iid = t.iid and t.cid = “CS541”);

  • Inner query will need to be evaluated

for every outer tuple

slide-21
SLIDE 21

Aggregate Operators in SQL

  • SUM, COUNT, AVG, MIN, MAX
  • Example: Find the name of the youngest student

Select s1.name From Students s1 Where s1.age = (Select min(s2.age) from students s2)

  • Notice that we cannot use:

Select name, min(age) from Students

slide-22
SLIDE 22

Grouping in SQL

  • Find the average GPA for each age group in Students

Select age, avg(gpa) From Students Group by age

  • Non-aggregated attributes in SELECT attr list must

be listed in the GROUP BY clause

  • Cannot say: Select age, name, avg(gpa) From Students

Group by age

  • Either eliminate name or add it to group by clause

Group by age, name

  • What if we want to exclude some tuples?
  • Use the Where clause
  • Excluding students with GPA < 2, Find the average

GPA for each age group in Students Select age, avg(gpa) From Students where GPA >= 2 Group by age

  • Cannot add aggregates in the where clause,

e.g., where min(gpa) >2

  • What if we want to exclude some groups?
  • Use the Having clause
  • Excluding students with GPA < 2, Find the

average GPA for each age group in Students when the group has at least 3 students Select age, avg(gpa) From Students where GPA >= 2 Group by age Having count(*) > 2 Count(*) computes the number of tuples

slide-23
SLIDE 23

Examples of Using Nesting and Aggregation

  • Find name(s) of student(s) with maximum grade in the Database

Systems course

  • Students(sid: string, name: string, login: string, age: integer, gpa: real)
  • Courses(cid: string, cname: string, credits: integer)
  • Enrolled(sid: string, cid: string, grade: string)
  • Select s.name

from students s, enrolled e, courses c where c.cname = ‘Database Systems’ and c.cid = e.cid and e.sid = s.sid and e.grade >= all (Select e1.grade from enrolled e1, courses c1 where c1.cname = ‘CS541’ and c1.cid = e1.cid)

  • Select s.name

from students s, enrolled e, courses c where c.cname = ‘Database Systems’ and c.cid = e.cid and e.sid = s.sid and e.grade = (Select max(e1.grade) from enrolled e1, courses c1 where c1.cname = ‘Database Systems’ and c1.cid = e1.cid)

  • Or add
  • e.grade = (Select e1.grade

from enrolled e1, courses c1 where c1.cname = ‘Database Systems’ and c1.cid = e1.cid

  • rder by e1.grade desc limit 1)
slide-24
SLIDE 24

Triggers in SQL

  • A rule that gets triggered to execute

when some event takes place

  • Defined through a “Create Trigger”

SQL statement

  • For a trigger definition, we need to

specify:

  • Event (what activates the trigger)
  • Condition (tests whether the trigger

should execute)

  • Action (what happens when the

trigger executes)

CREATE TRIGGER YoungBrightStudents AFTER INSERT ON Students REFERENCING NEW TABLE NewStudents FOR EACH STATEMENT INSERT INTO YoungAndBright (sid, name, login, age, gpa) SELECT sid, name, login, age, gpa FROM NewStudents N WHERE N.age <= 18 and N.gpa > 3.6

  • Other options: Referencing old table
  • Before insert,
slide-25
SLIDE 25

The With Clause

  • Define a view on the fly that is local in scope to only the query it is defined

in

  • Or create a temporary table for just the query at hand and then delete that

table

  • With best(maxGPA)

select max (gpa) from students select name from students, best where students.gpa = best.maxGPA

  • Best is not defined afterwards
  • Is useful when defining recursive queries
slide-26
SLIDE 26

SQL Cursors

  • Useful when embedding SQL in

a host language

  • Allow the host program to get

the results of the query one tuple-at-a-time

  • Declare the cursor over an SQL

query

  • Open cursor, fetch, close
  • Declare dbc Cursor for

select i.iname, t.year, t.semester from instructor i, teaches t, courses c where c.cname = ‘Database System’ and c.cid = t.cid and t.iid = i.iid for update

  • EXEC SQL open dbc END-EXEC
  • EXEC SQL fetch dbc into :cn :cy :cs END-

EXEC

slide-27
SLIDE 27

Conceptual Evaluation of SQL Queries

  • Any SQL query can be conceptually evaluated by a Relational Algebra

Expression

  • Select A1,A2, …, An

From T1, T2, …, Tm Where P1 and P2 and … and Pk

  • 𝜌A1,A2, …, An(𝜏P1 and P2 and … and Pk(T1 X T2 X … X Tm))
slide-28
SLIDE 28

Other Interesting Topics in SQL

  • Supporting recursion in SQL
  • Unnesting nested queries
slide-29
SLIDE 29

Query-By-Example (QBE)

  • Builds on Domain Relational Calculus (Variables range over domains

not tuples)

  • 2D Graphical user interface to compose a query using examples

sid name login age gpa iid iname irank isalary Students Teaches iid cid year semester Instructors

slide-30
SLIDE 30

Relational Operators using QBE

  • Project: πname, age(Students)
  • Select: 𝜏(𝑜𝑏𝑛𝑓 =

!Mary’ ) ∧ (𝑏𝑕𝑓 < 20)(Students)

sid name login age gpa P. P. iid iname irank isalary Students Teaches iid cid year semester Instructors sid name login age gpa P. ’Mary’ <20 iid iname irank isalary Students Teaches iid cid year semester Instructors

slide-31
SLIDE 31

Relational Operators using QBE

  • Cross Product: Students X Enrolled
  • Join: Students S1 ⋈ Enrolled E

sid name login age gpa P. iid iname irank isalary Students Enrolled sid cid grade P. Instructors sid name login age gpa P. _y iid iname irank isalary Students Enrolled sid cid grade P. _y Instructors S1.sid = E.sid

slide-32
SLIDE 32

Relational Operators using QBE

  • Union
  • (Notice the rename of the attribute ‘Person Name’ and the output

table - Result)

sid name login age gpa _y iid iname irank isalary _x Students Enrolled sid cid grade Instructors Result Person Name P. _y P. _x

slide-33
SLIDE 33

Relational Operators using QBE

  • Set Difference
  • Intersect

sid name login age gpa P._y iid iname irank isalary _𝑧 Students Enrolled sid cid grade Instructors sid name login age gpa P._y iid iname irank isalary ¬_𝑧 Students Enrolled sid cid grade Instructors

slide-34
SLIDE 34

Relational Operators Using QBE

  • Ordering tuples
  • Example: List names or students in descending order of their GPA and

in case of equality, by their age in ascending order.

sid name login age gpa P. P.AO(2) P.DO(1) Students

slide-35
SLIDE 35

Relational Operators Using QBE

  • Aggregate functions
  • Example: Find the number of students enrolled in CS541.
  • .ALL: To ensure that duplicates are counted.
  • .UNQ: To ensure that duplicates get eliminated.

Enrolled sid cid grade P.CNT.ALL ‘CS541’

slide-36
SLIDE 36

Relational Operators Using QBE

  • Group By
  • Example: Find the number of courses enrolled by each student

Enrolled sid cid grade P.G. P.CNT.ALL

slide-37
SLIDE 37

Modifying the Database Using QBE

  • Example: Delete Students with GPA < 1
  • Example: Insert a new tuple

sid name login age gpa D. <1 Students sid name login age gpa I. Ada Bell AB@p 19 3.8 Students iid iname irank isalary U._X*1.05 Instructors

  • Example: Increase salaries
  • f instructors by 5%
slide-38
SLIDE 38