The Relational Model and Relational Algebra Cs386, Introduction to - - PowerPoint PPT Presentation

the relational model and relational algebra
SMART_READER_LITE
LIVE PREVIEW

The Relational Model and Relational Algebra Cs386, Introduction to - - PowerPoint PPT Presentation

The Relational Model and Relational Algebra Cs386, Introduction to Database Systems Jay Urbain Credits: Database Management Systems, Raghu Ramkrishnan and Johannes Gehrke Data Management, Richard Watson The relational model Overcomes


slide-1
SLIDE 1

The Relational Model and Relational Algebra

Cs386, Introduction to Database Systems Jay Urbain

Credits: Database Management Systems, Raghu Ramkrishnan and Johannes Gehrke Data Management, Richard Watson

slide-2
SLIDE 2

The relational model

Overcomes shortcomings of earlier

database models Has a strong theoretical base

Has a strong theoretical base Codd was the major developer

slide-3
SLIDE 3

Problems with other models

Programmers work at a low level of detail No commands for multiple record processing

(set processing) (set processing)

Little support for ad hoc querying by users

slide-4
SLIDE 4

Objectives of relational model research

Data independence

Logical and physical models are separate

Communicability

A simple model understood by programmers and

users

Set-processing

Increase programmer productivity

slide-5
SLIDE 5

Relational model concepts

Data structures Integrity rules Operators Operators

slide-6
SLIDE 6

Data structures

Domain

A set of values all of the same data type All the legal values of an attribute All the legal values of an attribute Defines what comparisons are legal Only attributes from the same domain should be

compared

The domain concept is rarely implemented

slide-7
SLIDE 7

Data structures

Relations

A table of n columns and m rows

A relation’s cardinality is its number of rows A relation’s degree is its number of columns A relational database is a collection of

relations

No explicit linkages between tables

slide-8
SLIDE 8

Structures

Primary key

A unique identifier of a row in a relation Can be composite

Candidate key

An attribute that could be a primary key

Alternate key

A candidate key that is not selected as the primary key

Foreign key

An attribute of a relation that is the primary key of another

relation for establishing a relationship

Can be composite

slide-9
SLIDE 9

Integrity rules

Entity integrity

No component of the primary key of a relation can be

null

Each row in a relation is uniquely identified Each row in a relation is uniquely identified

Referential integrity

A database must not contain any unmatched foreign

key values

For every foreign key there is a corresponding

primary key that it is referencing.

slide-10
SLIDE 10

Relational Algebra

Based on first-order logic (algebra of sets). Proposed by E.F. Codd, 1970, as a basis for database

query languages.

Declarative language that describes the desired answer Declarative language that describes the desired answer

without specifying how the answer is to be computed.

Inputs and outputs are query relations.

slide-11
SLIDE 11

Relational Algebra

Queries composed using collection of operators. One of two formal languages (DML) associated with the

relational model (DDL, DML). Relational Algebra expression:

Recursively defined to be a relation, a unary algebra

  • perator applied to a single expression or binary algebra
  • perator applied to two expressions.

I.e., the result of query on relations is a relation. Very

powerful!

slide-12
SLIDE 12

Relational Algebra Operations

Selection and projection

Select rows from a relation Project columns from a relation

σ π Example: select sname, rating from Student where rating > 8

, 8

( ( ))

sname rating rating

Student π δ

>

slide-13
SLIDE 13

Operations

Relational algebra has 8 operators

Restrict Project Product Product Union Intersect Difference Join Divide

slide-14
SLIDE 14

Restrict

Extracts rows from a single relation

A W X Y Z W X Y Z

slide-15
SLIDE 15

Project

Extracts columns from a single relation

A W X Y Z W X Y Z

slide-16
SLIDE 16

Product

Creates a new relation from all possible

combinations of rows in two other relations.

Cross join.

A V W v1 w1 B X Y Z x1 y1 z1 v1 w1 v2 w2 v3 w3 x1 y1 z1 x2 y2 z2 A TIMES B V W X Y Z v1 w1 x1 y1 z1 v1 w1 x2 y2 z2 v2 w2 x1 y1 z1 v2 w2 x2 y2 z2 v3 w3 x1 y1 z1 v3 w3 x2 y2 z2

slide-17
SLIDE 17

Union

Creates a new relation containing rows

appearing in one or both relations

Duplicate rows are automatically eliminated Relations must be union compatible Relations must be union compatible

A X Y x1 y1 x2 y2 x3 y3 B X Y x2 y2 x4 y4 A UNION B X Y x1 y1 x2 y2 x3 y3 x4 y4

slide-18
SLIDE 18

Intersect

Creates a new relation containing rows

appearing in both relations

Relations must be union compatible Natural join Natural join

A X Y x1 y1 x2 y2 x3 y3 B X Y x2 y2 x4 y4 A INTERSECT B X Y x2 y2

slide-19
SLIDE 19

Difference

Creates a relation containing rows in

the first relation but not in the second

Relations must be union compatible. True negation (not in, not exists

subquery).

A X Y x1 y1 x2 y2 x3 y3 B X Y x2 y2 x4 y4 A MINUS B X Y x1 y1 x3 y3

slide-20
SLIDE 20

Join

Creates a new relation from all combinations of

rows satisfying the join condition

A join B where W = Z Inner join, equijoin

Inner join, equijoin

A V W v1 wz1 v2 wz2 v3 wz3 B X Y Z x1 y1 wz1 x2 y2 wz3 A EQUIJOIN B V W X Y Z v1 wz1 x1 y1 wz1 v3 wz3 x2 y2 wz3

slide-21
SLIDE 21

Divide

Is there a value in the X column of A (e.g., x1) that has a value in

the Y column of A for every value of y in the Y column of B?

A X Y B Y x1 y1 x1 y2 x1 y3 x2 y1 x2 y3 A DIVIDE B X x1 y1 y2

slide-22
SLIDE 22

A primitive set of operators

Only five operators are required

Restrict Project Project Product Union Difference

slide-23
SLIDE 23

Relational algebra and SQL

Relational algebra is a standard for

judging a data retrieval language

Relational algebra SQL Restrict A where SELECT * FROM A

  • 1. Essentially, where all columns of A are equal to all columns of B

Restrict A where condition SELECT * FROM A WHERE condition Project A [X] SELECT X FROM A Product A times B SELECT * FROM A, B Union A union B SELECT * FROM A UNION SELECT * FROM B Difference A minus B SELECT * FROM A WHERE NOT EXISTS (SELECT * FROM B WHERE A.X = B.X AND A.Y = B.Y AND …)

1

slide-24
SLIDE 24

A complete relational database

A fully relational database supports

structures (domains and relations) integrity rules a manipulation language

Many commercial systems are not fully

relational because they do not support domains and integrity rules

Classified as relationally complete

The word “relational” is sometimes used too

freely

slide-25
SLIDE 25

Codd’s commandments

  • 1. The information rule

All data must appear to be stored as values in a table

  • 2. The guaranteed access rule

Every value in a database must be addressable by specifying its table name, column name, and the primary specifying its table name, column name, and the primary key of the row in which it is stored

  • 3. Systematic treatment of null values

There must be a distinct representation for unknown or inappropriate data

  • 4. Active on-line catalog on the relational model

There should be an on-line catalog that describes the relational model

slide-26
SLIDE 26

Codd’s commandments

5. The comprehensive data sublanguage rule

There must be a relational language that supports data definition, data manipulation, security and integrity constraints, and transaction processing operations

6. The view updating rule 6. The view updating rule

The DBMS must be able to update any view that is theoretically updateable

7. High-level insert, update, and delete

The system must support set-at-a-time operations

8. Physical data independence

Changes to storage representation or access methods will not affect application programs

slide-27
SLIDE 27

Codd’s commandments

9. Logical data independence

Information preserving changes to base tables will not affect application programs

  • 10. Integrity independence

Integrity constraints should be part of a database's definition rather Integrity constraints should be part of a database's definition rather than embedded within application programs It must be possible to change integrity constraints without affecting any existing application programs

  • 11. Distribution independence

Introduction of a distributed DBMS or redistributing existing distributed data should have no impact on existing applications

  • 12. The nonsubversion rule

It must not be possible to use a record-at-a-time interface to subvert security or integrity constraints

slide-28
SLIDE 28

Codd’s Rule 0

A relational DBMS must be able to manage

databases entirely through its relational capacities

A DBMS is either totally relational or it is not

relational

slide-29
SLIDE 29

Key points

Relational model is theoretically grounded

and practically relevant

Relational algebra is the foundation of Relational algebra is the foundation of

SQL

A relational DBMS should satisfy a range

  • f requirements to be fully relational