Relational Algebra Murali Mani What is Relational Algebra? Defines - - PDF document

relational algebra
SMART_READER_LITE
LIVE PREVIEW

Relational Algebra Murali Mani What is Relational Algebra? Defines - - PDF document

Relational Algebra Murali Mani What is Relational Algebra? Defines operations (data retrieval) for relational model SQLs DML (Data Manipulation Language) has data retrieval facilities, which are equivalent to that of relational


slide-1
SLIDE 1

1

Murali Mani

Relational Algebra

Murali Mani

What is Relational Algebra?

Defines operations (data retrieval) for

relational model

SQL’s DML (Data Manipulation Language)

has data retrieval facilities, which are equivalent to that of relational algebra.

SQL and relational algebra are not for

complex operations; they support efficient, easy access of large data sets.

slide-2
SLIDE 2

2

Murali Mani

Basics

Relational Algebra is defined on bags, rather than

relations (sets).

Bag or multiset allows duplicate values; but order is

not significant.

We can write an expression using relational algebra

  • perators with parentheses

Need closure – an operator on bag returns a bag. Relational algebra includes set operators, and other

  • perators specific to relational model.

Murali Mani

Set Operators

Union, Intersection, Difference, cross product Union, Intersection and Difference are

defined only for union compatible relations.

Two relations are union compatible if they

have the same set of attributes and the types (domains) of the attributes are the same.

Eg of two relations that are not union

compatible:

Student (sNumber, sName) Course (cNumber, cName)

slide-3
SLIDE 3

3

Murali Mani

Union: ∪

Consider two bags R1 and R2 that are union-

  • compatible. Suppose a tuple t appears in R1

m times, and in R2 n times. Then in the union, t appears m + n times.

2 1 4 3 2 1 B A

R1

6 5 4 3 2 1 B A

R2

2 1 4 3 4 3 6 5 2 1 2 1 B A

R1 ∪ R2

Murali Mani

Intersection: ∩

Consider two bags R1 and R2 that are union-

  • compatible. Suppose a tuple t appears in R1

m times, and in R2 n times. Then in the intersection, t appears min (m, n) times.

2 1 4 3 2 1 B A

R1

6 5 4 3 2 1 B A

R2

4 3 2 1 B A

R1 ∩ R2

slide-4
SLIDE 4

4

Murali Mani

Difference: -

Consider two bags R1 and R2 that are union-

  • compatible. Suppose a tuple t appears in R1

m times, and in R2 n times. Then in R1 – R2, t appears max (0, m - n) times.

2 1 4 3 2 1 B A

R1

6 5 4 3 2 1 B A

R2

2 1 B A

R1 – R2

Murali Mani

Bag semantics vs Set semantics

Union is idempotent for sets:

(R1 ∪ R2) ∪ R2 = R1 ∪ R2

Union is not idempotent for bags. Intersection is idempotent for sets and bags. Difference is idempotent for sets, but not for

bags.

For sets and bags, R1 ∩ R2 = R1 – (R1 – R2).

slide-5
SLIDE 5

5

Murali Mani

Cross Product (Cartesian Product): X

Consider two bags R1 and R2. Suppose a

tuple t1 appears in R1 m times, and a tuple t2 appears in R2 n times. Then in R1 X R2, t1t2 appears mn times.

2 1 2 1 B A

R1

5 4 5 4 3 2 C B

R2

3 2 2 1 5 4 2 1 5 4 2 1 5 4 2 1 2 2 R1.B 4 2 R2.B 5 1 3 1 C A

R1 X R2

Murali Mani

Basic Relational Operations

Select, Project, Join Select: denoted σC (R): selects the subset of

tuples of R that satisfies selection condition

  • C. C can be any boolean expression, its

clauses can be combined with AND, OR, NOT.

7 2 1 6 4 3 2 2 B 7 1 5 1 C A

R σ(C ≥ 6) (R)

7 2 1 6 4 3 2 B 7 1 C A

slide-6
SLIDE 6

6

Murali Mani

Select

Select is commutative: σC2 (σC1 (R)) = σC1

(σC2 (R))

Select is idempotent: σC (σC (R)) = σC (R) We can combine multiple select conditions

into one condition. σC1 (σC2 (… σCn (R)…)) = σC1 AND C2 AND … Cn (R)

Murali Mani

Project: πA1, A2, …, An (R)

Consider relation (bag) R with set of

attributes AR. πA1, A2, …, An (R), where A1, A2, …, An ∈ AR returns the tuples in R, but only with columns A1, A2, …, An.

7 2 1 6 4 3 2 2 B 8 1 5 1 C A

R

πA, B (R)

2 1 4 3 2 2 B 1 1 A

slide-7
SLIDE 7

7

Murali Mani

Project: Bag Semantics vs Set Semantics

For bags, the cardinality of R = cardinality of

πA1, A2, …, An (R).

For sets, cardinality of R ≥ cardinality of

πA1,A2, …, An (R).

For sets and bags

project is not commutative project is idempotent

Murali Mani

Natural Join: R ⋈ S

Consider relations (bags) R with attributes

AR, and S with attributes AS. Let A = AR ∩ AS. R ⋈ S can be defined as πAR – A, A, AS - A (σR.A1 = S.A1 AND R.A2 =S.A2 AND …

R.An=S.An (R X S))

where A = {A1, A2, …, An} The above expression says: select those tuples in R X S that agree in values for each

  • f the A attributes, and project the resulting

tuples such that we have only one value for each A attribute.

slide-8
SLIDE 8

8

Murali Mani

Natural Join example

2 1 2 1 B A

R1

5 4 5 4 3 2 C B

R2

3 2 1 2 B 3 1 C A

R1 ⋈ R2

Murali Mani

Theta Join: R ⋈

C S

Theta Join is similar to cartesian product,

except that we can specify any condition C. It is defined as R

⋈ C S = (σC (R X S))

2 1 2 1 B A

R1

5 4 5 4 3 2 C B

R2 R1

⋈ R1.B<R2.BR2

5 4 2 1 5 4 2 1 5 4 2 1 2 R1.B 4 R2.B 5 1 C A

slide-9
SLIDE 9

9

Murali Mani

Outer Join: R ⋈

  • S

Similar to natural join, however, if there is a

tuple in R, that has no “matching” tuple in S,

  • r a tuple in S that has no matching tuple in

R, then that tuple also appears, with null values for attributes in S (or R).

6 5 4 8 2 B 9 7 3 1 C A

R1

7 3 3 C 12 6 11 2 10 2 D B

R2 R1 ⋈

  • R2

null 9 8 7 10 3 2 1 11 3 2 1 null 6 5 4 6 B 7 C 12 null D A

Murali Mani

Left Outer Join: R ⋈

  • LS

Similar to natural join, however, if there is a

tuple in R, that has no “matching” tuple in S, then that tuple also appears, with null values for attributes in S (note: a tuple in S that has no matching tuple in R does not appear).

6 5 4 8 2 B 9 7 3 1 C A

R1

7 3 3 C 12 6 11 2 10 2 D B

R2 R1 ⋈

  • L R2

null 9 8 7 10 3 2 1 11 3 2 1 null 6 5 4 B C D A

slide-10
SLIDE 10

10

Murali Mani

Right Outer Join: R ⋈

  • RS

Similar to natural join, however, if there is a

tuple in S, that has no “matching” tuple in R, then that tuple also appears, with null values for attributes in R (note: a tuple in R that has no matching tuple in S does not appear).

6 5 4 8 2 B 9 7 3 1 C A

R1

7 3 3 C 12 6 11 2 10 2 D B

R2 R1 ⋈

  • R R2

10 3 2 1 11 3 2 1 12 7 6 null B C D A

Murali Mani

Renaming: ρS(A1, A2, …, An) (R)

Rename relation R to S, attributes of R are

renamed to A1, A2, …, An

ρS (R) renames relation R to S, keeping the

attributes same.

7 3 3 C 12 6 11 2 10 2 D B

R2

7 3 3 C 12 6 11 2 10 2 D X

ρS(X, C, D) (R2) S

7 3 3 C 12 6 11 2 10 2 D B

ρS (R2) S

slide-11
SLIDE 11

11

Murali Mani

Example: Introducing new relations

Find the semijoin of 2 relations R, S. Semijoin denoted R ⋉ S is defined as the tuples in R, such that for a tuple t1 in R, if there exists a tuple t2 in S, and t1 and t2 agree in all attributes common to R and S, then t1 appears in the result. R1 = R ⋈ S R2 = πAR (R1) R ⋉ S = R2 ⋂ R

Murali Mani

Duplicate Elimination: δ (R)

Convert a bag to a set.

R

2 1 4 3 2 2 B 1 1 A

δ (R)

4 3 2 B 1 A

slide-12
SLIDE 12

12

Murali Mani

Extended Projection: πL (R)

Here L can be

An attribute (just like simple projection) An expression x → y, where x and y are names of

attributes, this renames attribute x to y.

An expression E → z, where E is any expression

involving attributes, constants, and arithmetic and string operators. This has an attribute called z whose values are given by E.

7 3 3 C 12 6 11 2 10 2 D B

R

πB→A, C+D→X, C, D (R)

19 14 13 X 7 3 3 C 12 6 11 2 10 D 2 A

Murali Mani

Aggregation operators

MIN, MAX, COUNT, SUM, AVG AGGB (R) considers only non-null values of R.

R

null 1 4 3 3 2 B 1 1 A 2

MINB (R) MINB (R)

4

MAXB (R) MAXB (R)

3

COUNTB (R) COUNTB (R)

9

SUMB (R) SUMB (R)

3

AVGB (R) AVGB (R)

4

COUNT* (R) COUNT* (R)

slide-13
SLIDE 13

13

Murali Mani

Aggregation Operators

MIN, MAX, SUM, AVG must be on any 1

  • attribute. COUNT can be on any 1 attribute or

COUNT* (R)

An aggregation operator returns a bag, not a

single value ! But SQL allows treatment as a single value.

4 3 B A

σB=MAXB (R) (R)

Murali Mani

Grouping Operator: γGL, AL (R)

γGL, AL (R) groups all attributes in GL, and

performs the aggregation specified in AL.

HF 93 6D&7N HF 79 SW2 94 99 77 year KR Speed KR Matrix HF SW1 starName title

StarsIn γ starName, MIN (year)→year, COUNT(title) →num (StarsIn)

94 77 year 2 KR 3 HF num starName

slide-14
SLIDE 14

14

Murali Mani

Sorting Operator: τL (R)

It sorts the tuples in R. If L is list A1, A2, …,

An, it first sorts by A1, then by A2, and so on.

Sort is used as a last operator in an

expression.

7 2 1 6 1 3 3 2 B 8 1 5 1 C A

R

8 3 1 7 2 1 1 2 B 6 3 5 1 C A

τA,B (R)

Murali Mani

Relational Algebra Operators

Set Operators: Union, Intersection,

Difference, Cartesian Product

Select, Project Join: Natural Join, Theta Join, (Left/Right)

Outer Join

Renaming, Duplicate Elimination Aggregation: MIN, MAX, COUNT, SUM, AVG Grouping, Sorting