Databases Relational algebra Lectures for mathematics students - - PowerPoint PPT Presentation

databases relational algebra lectures for mathematics
SMART_READER_LITE
LIVE PREVIEW

Databases Relational algebra Lectures for mathematics students - - PowerPoint PPT Presentation

Databases Relational algebra Lectures for mathematics students Zbigniew Jurkiewicz, Institute of Informatics UW March 5, 2017 Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students


slide-1
SLIDE 1

Databases Relational algebra Lectures for mathematics students

Zbigniew Jurkiewicz, Institute of Informatics UW March 5, 2017

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-2
SLIDE 2

Relational algebra

Theoretical model for describing the semantics of relational databases, proposed by T. Codd (who authored the concept of relational databases). Algebra over the domain of relations, i.e. variables

  • ccuring in its expressions represent relations.

Operators are defined to parallel typical operations needed to search for information in database tables. Originally postulated also as a query language for relational databases, each expression being equivalent to some query.

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-3
SLIDE 3

Relations

Each relation is represented by name. With each relation name we associate its schema — a sequence of attributes (corresponding to the columns of the table being modeled), e.g.

R(A, B, C) Student(id, first-name, last-name)

Attribute names in a relation schema must be different.

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-4
SLIDE 4

Basic operations

Ordinary binary set-theoretic operations: the union of sets (∪), the intersection of sets (∩) and the difference of sets (−), but

both arguments must have the same schema

Cartesian product R × S

As arguments could have attributes with the same names, the column names in result schema have sometimes to be prefixed with names of relations of origin Thus for relations R(A, B, C) i S(C, D, E) the schema of their cartesian product will be R × S(A, B, R.C, S.C, D, E) Better yet, we should use renaming.

Selection σcondition(R): we select only the tuples of relation R satisfying condition.

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-5
SLIDE 5

Basic operations

Projection πattribute1,...,attributen(R): we select only the indicated attributes from relation R. Renaming ρS(R): changes the relation name or the names

  • f some of its attributes ρR(X,Y,Z)R, sometimes both at
  • nce ρS(X,Y,ZX)R.

Join R ✶

θ S: similar to cartesian product, but joins only the

pairs of tuples satisfying a given condition θ. We will look at different variants in a moment.

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-6
SLIDE 6

Selection

Relation Animals: species name weight Parrot Kropka 3.50 Parrot Lulu 5.35 Parrot Hipek 3.50 Fox Fufu 6.35 Crocodile Czako 75.00 σspecies=′Parrot′Animals: species name weight Parrot Kropka 3.50 Parrot Lulu 5.35 Parrot Hipek 3.50

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-7
SLIDE 7

Projection

Duplicate rows are eliminated (we deal with relations!) πspecies,weightAnimals: species weight Parrot 3.50 Parrot 5.35 Fox 6.35 Crocodile 75.00

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-8
SLIDE 8

Generalized projection

In addition to the names of attributes, we allow expressions

  • n attributes, e.g. arithmetical expressions.

They should be explicitly named: A + B → C Some attributes may occur more than once.

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-9
SLIDE 9

Generalized projection

R = A B 1 2 3 4 πA+B→C,A,A→A1R = C A A1 3 1 1 7 3 3

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-10
SLIDE 10

Cartesian product

R1 = A B 1 2 3 4 R2 = B C 5 6 7 8 9 10 R1 × R2 = A R1.B R2.B C 1 2 5 6 1 2 7 8 1 2 9 10 3 4 5 6 3 4 7 8 3 4 9 10

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-11
SLIDE 11

Theta-join

R ✶

θ S = σθ(R × S)

θ stands for any expression on attributes from joined relations, e.g. A < C. A theta-join where the condition has the form of equality of two attributes is called equijoin. Dangling tuple: a tuple from one of relations, which matches no tuple from the other relation. Its contents will be eliminated from the result.

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-12
SLIDE 12

Theta-join

Animals species name weight Parrot Kropka 3.50 Parrot Lulu 5.35 Parrot Hipek 3.50 Fox Fufu 6.35 Crocodile Czako 75.00 Species name continent Parrot America Fox Europe Crocodile Africa Animals ✶

species=name Species

species Animals.name weight Species.name continent Parrot Kropka 3.50 Parrot Ameryka Parrot Lulu 5.35 Parrot Ameryka Parrot Hipek 3.50 Parrot Ameryka Fox Fufu 6.35 Fox Europa Crocodile Czako 75.00 Crocodile Africa

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-13
SLIDE 13

Natural join

Notation: R ✶ S. The relations to be joined should have at least one common attribute with the same name (otherwise we will have cartesian product). The condition is a conjunction of equalities for all pairs of common attributes. The result will contain only one attribute from each pair of common attributes.

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-14
SLIDE 14

Natural join

Animals species name weight Parrot Kropka 3.50 Parrot Lulu 5.35 Parrot Hipek 3.50 Fox Fufu 6.35 Crocodile Czako 75.00 Species species continent Parrot America Fox Europe Crocodile Africa Animals ✶ Species species name weight continent Parrot Kropka 3.50 America Parrot Lulu 5.35 America Parrot Hipek 3.50 America Fox Fufu 6.35 Europe Crocodile Czako 75.00 Africa

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-15
SLIDE 15

Renaming

Used to name result relations: ρRS(A,B,X,C,D,E)(R × S). Simplified notation: R1(A1, B, X, C, D, E) := (R × S).

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-16
SLIDE 16

Compound expressions

As in any algebra, the operations may be composed and nested to obtain compound expressions. The equivalence of expressions is often used with the

  • ptimization of queries to replace the entered expression

with the equivalent, more effective one.

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-17
SLIDE 17

Compound expressions: self-join

Animals species name weight Parrot Kropka 3.50 Parrot Lulu 5.35 Parrot Hipek 3.50 Fox Fufu 6.35 Crocodile Czako 75.00 Find the names of all pairs of animals of the same species

πZ1.name,Z2.name(ρZ1Animals ✶

Z1.species=Z2.species∧ Z1.name<Z2.name

ρZ2Animals)

(the second condition was added to prevent “duplicates”).

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-18
SLIDE 18

Multisets (bags)

According to mathematical definition of relation the duplicate tuples occuring in results of some operations (e.g. projection) should be eliminated. By allowing repetitions we can extend this algebra to multisets. This is harder than it sounds: there is a problem of the appropriate semantics for set theoretic operations of intersection and difference. The intuitive extensions of most operations are closed on relations with the exception of sum, which for two relations may return multiset. Some laws of relational algebra are not true for multisets, np. (R ∪ S) − T = (R − T) ∪ (S − T) There is an operator to convert multiset to relation by eliminating duplicates: δ(R).

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-19
SLIDE 19

Grouping

To model statistical queries on databases we introduce grouping operator with optional computation of aggregate functions γA,MIN(B)→MinB(R) Note that γA1,...,An(R) = δ(R) if Ai are all attributes of R.

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-20
SLIDE 20

Sorting

The sorting operator is written as τC,B(R). As it does not have sense for relational or multiset algebras (only for possible list algebra), it should always be the

  • utermost operator of any expression!

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-21
SLIDE 21

Outer joins

They try to preserve dangling tuples when building the result The special value ⊥ stands for missing values from the

  • ther side;

Full: R

  • ✶ S

Left: R

  • ✶L S

We take only dangling tuples from the first argument;

Right: R

  • ✶R S;

Of course instead of natural join theta-join could be used.

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-22
SLIDE 22

Outer join

Animals species name weight Parrot Kropka 3.50 Parrot Lulu 5.35 Parrot Hipek 3.50 Fox Fufu 6.35 Crocodile Czako 75.00 Species species continent Parrot America Fox Europe Crocodile Africa Cow Europe Animals

  • ✶ Species

species name weight continent Parrot Kropka 3.50 America Parrot Lulu 5.35 America Parrot Hipek 3.50 America Fox Fufu 6.35 Europe Crocodile Czako 75.00 Africa Cow ⊥ ⊥ Europe

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-23
SLIDE 23

Outer join

Another example. The schema of a fictional hotel contains relations: Rooms(number,beds,bathroom,price) Bookings(id,room,pesel,from,to,paid) To find rooms without bookings we could use πnumberRooms − πroomBookings but for more fun we will use σroom=⊥(Rooms

number=room Bookings)

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students

slide-24
SLIDE 24

Applications of relational algebra

Writing queries (e.g. when modeling semantics). Describing constraints on the correctness of database

  • contents. Examples:

R ∩ S = ∅ (equational style) R ∩ S ⊆ ∅ (set-theoretic style) Referential integrity modelling πforeign-key(R) ⊆ πkey(S) πforeign-key(R) − πkey(S) = ∅ Alternate form for functional dependencies A → B : σR.A=R1.A∧R.b=R1.B(R × ρR1(R)) = ∅ Intermediate form when compiling queries into database

  • perations.

Zbigniew Jurkiewicz, Institute of Informatics UW Databases Relational algebra Lectures for mathematics students