Relational Algebra Relational Query Languages Recall: Query = - - PowerPoint PPT Presentation

relational algebra relational query languages
SMART_READER_LITE
LIVE PREVIEW

Relational Algebra Relational Query Languages Recall: Query = - - PowerPoint PPT Presentation

Relational Algebra Relational Query Languages Recall: Query = Retrieval Program Language Examples: Theoretical: 1. Relational Algebra 2. Relational Calculus a. Tuple Relational Calculus (TRC) b. Domain Relational Calculus (DRC)


slide-1
SLIDE 1

Relational Algebra

slide-2
SLIDE 2

Relational Query Languages

 Theoretical QLs give semantics to Practical QLs

CSCI1270, Lecture 2

Recall: Query = “Retrieval Program”

Theoretical:

  • 1. Relational Algebra
  • 2. Relational Calculus
  • a. Tuple Relational Calculus (TRC)
  • b. Domain Relational Calculus (DRC)

Practical:

  • 1. SQL

(originally: SEQUEL from System R)

  • 2. Quel

(used in Ingres)

  • 3. Datalog (Prolog-like – used in research lab systems)

Language Examples:

slide-3
SLIDE 3

Relational Algebra

  • Basic Operators
  • 1. select ( σ )
  • 2. project ( p )
  • 3. union (  )
  • 4. set difference ( – )
  • 5. cartesian product (  )
  • 6. rename ( ρ )
  • Closure Property

CSCI1270, Lecture 2

Relational Operator

Relation Relation Relation

Relational Operator

slide-4
SLIDE 4

Select ( σ )

CSCI1270, Lecture 2

Notation: σpredicate (Relation)

Relation: Can be name of table or result of another query Predicate:

Select rows from a relation based on a predicate

  • 2. Complex
  • predicate AND predicate
  • predicate OR predicate
  • NOT predicate

Idea:

  • 1. Simple
  • attribute1 = attribute2
  • attribute = constant value (also: ≠, <, >, ≤, ≥)
slide-5
SLIDE 5

Bank Database

CSCI1270, Lecture 2 Account bname acct_no balance

Downtown Mianus Perry R.H. Brighton Redwood Brighton A-101 A-215 A-102 A-305 A-201 A-222 A-217 500 700 400 350 900 700 750

Depositor cname acct_no

Johnson Smith Hayes Turner Johnson Jones Lindsay A-101 A-215 A-102 A-305 A-201 A-217 A-222

Customer cname cstreet ccity

Jones Smith Hayes Curry Lindsay Turner Williams Adams Johnson Glenn Brooks Green Main North Main North Park Putnam Nassau Spring Alma Sand Hill Senator Walnut Harrison Rye Harrison Rye Pittsfield Stanford Princeton Pittsfield Palo Alto Woodside Brooklyn Stanford

Branch bname bcity assets

Downtown Redwood Perry Mianus R.H. Pownel

  • N. Town

Brighton Brooklyn Palo Alto Horseneck Horseneck Horseneck Bennington Rye Brooklyn 9M 2.1M 1.7M 0.4M 8M 0.3M 3.7M 7.1M

Borrower cname lno

Jones Smith Hayes Jackson Curry Smith Williams Adams L-17 L-23 L-15 L-14 L-93 L-11 L-17 L-16

Loan bname lno amt

Downtown Redwood Perry Downtown Mianus R.H. Perry L-17 L-23 L-15 L-14 L-93 L-11 L-16 1000 2000 1500 1500 500 900 1300

slide-6
SLIDE 6

Select ( σ )

CSCI1270, Lecture 2

bname bcity assets Downtown Brighton Brooklyn Brooklyn 9M 7.1M bname bcity assets Downtown Brooklyn 9M

σ bcity = “Brooklyn” (branch) = σ assets > $8M (σ bcity = “Brooklyn” (branch)) = Notation: σpredicate (Relation)

slide-7
SLIDE 7

Project ( p )

cstreet ccity

Main North Park Putnam Nassau Spring Alma Sand Hill Senator Walnut Harrison Rye Pittsfield Stanford Princeton Pittsfield Palo Alto Woodside Brooklyn Stanford CSCI1270, Lecture 2

Notation: pA1, …, An (Relation)

  • Relation: name of a table or result of another query
  • Each Ai is an attribute
  • Idea: p selects columns (vs. σ which selects rows)

p cstreet, ccity (customer) =

slide-8
SLIDE 8

Project ( p )

bcity Brooklyn Horseneck

CSCI1270, Lecture 2

p bcity (σassets > 5M (branch)) =

Question: Does the result of Project always have the same number

  • f tuples as its input?
slide-9
SLIDE 9

Union (  )

cname

Johnson Smith Hayes Turner Jones Lindsay Jackson Curry Williams Adams CSCI1270, Lecture 2

Notation: Relation1  Relation2 Example:

(p cname (depositor))  (p cname (borrower)) = R  S valid only if:

1. R, S have same number of columns (arity) 2. R, S corresponding columns have same name and domain (compatibility)

Depositor cname acct_no Borrower cname lno

Schema:

slide-10
SLIDE 10

Set Difference ( – )

bname lno amount

Downtown Redwood Perry Downtown Perry L-17 L-23 L-15 L-14 L-16 1000 2000 1500 500 300

Notation: Relation1 - Relation2

R - S valid only if:

1. R, S have same number of columns (arity) 2. R, S corresponding columns have same domain (compatibility)

Example:

(p bname (σamount ≥ 1000 (loan))) – (p bname (σ balance < 800 (account))) =

bname acct_no balance

Mianus Brighton Redwood Brighton A-215 A-201 A-222 A-217 700 900 700 850 CSCI1270, Lecture 2

bname

Downtown Perry

slide-11
SLIDE 11

What About Intersection?

Remember: R ⋂ S = R – (R – S)

S R R – S

slide-12
SLIDE 12

Cartesian Product (  )

depositor. cname acct_no borrower. cname lno Johnson Johnson Johnson Johnson Johnson Johnson Johnson Johnson Smith … A-101 A-101 A-101 A-101 A-101 A-101 A-101 A-101 A-215 … Jones Smith Hayes Jackson Curry Smith Williams Adams Jones … L-17 L-23 L-15 L-14 L-93 L-11 L-17 L-16 L-17 … CSCI1270, Lecture 2

Notation: Relation1  Relation2 Example:

R  S like cross product for mathematical relations:

  • every tuple of R appended to every tuple of S
  • flattened!!!

depositor  borrower =

How many tuples in the result? A: 56

slide-13
SLIDE 13

Rename ( ρ )

CSCI1270, Lecture 2

Notation: r identifier (Relation)

renames a relation, or

Notation: r identifier0 (identifier1, …, identifiern) (Relation)

renames relation and columns of n-column relation

Use:

massage relations to make , – valid, or  more readable

slide-14
SLIDE 14

Rename ( ρ )

CSCI1270, Lecture 2

Notation: r identifier0 (identifier1, …, identifiern) (Relation)

Example: r result (dcname, acctno, bcname, lno) (depositor  borrower) =

dccname acctno bcname lno Johnson Johnson Johnson Johnson Johnson Johnson Johnson Johnson Smith … A-101 A-101 A-101 A-101 A-101 A-101 A-101 A-101 A-215 … Jones Smith Hayes Jackson Curry Smith Williams Adams Jones … L-17 L-23 L-15 L-14 L-93 L-11 L-17 L-16 L-17 …

result

slide-15
SLIDE 15

Example Query in RA

  • Determine lno for loans that are for an amount

that is larger than the amount of some other

  • loan. (i.e. lno for all non-minimal loans)

CSCI1270, Fall 2008, Lecture 2

Temp1  … Temp2  … Temp1 … … Can do in steps:

slide-16
SLIDE 16

Example Query in RA

CSCI1270, Lecture 2

lno amt

L-17 L-23 L-15 L-14 L-93 L-11 L-16 1000 2000 1500 1500 500 900 1300

  • 1. Find the base data we need

Temp1  p lno,amt (loan)

  • 2. Make a copy of (1)

Temp2  ρ Temp2 (lno2,amt2) (Temp1)

lno2 amt2

L-17 L-23 L-15 L-14 L-93 L-11 L-16 1000 2000 1500 1500 500 900 1300

slide-17
SLIDE 17

Example Query in RA

CSCI1270, Lecture 2

lno amt lno2 amt2

L-17 L-17 … L-17 L-23 L-23 … L-23 … 1000 1000 … 1000 2000 2000 … 2000 … L-17 L-23 … L-16 L-17 L-23 … L-16 … 1000 2000 … 1300 1000 2000 … 1300 …

  • 3. Take the cartesian product of 1 and 2

Temp3  Temp1  Temp2

slide-18
SLIDE 18

Example Query in RA

  • p lno (σamt > amt2 (p lno,amt (loan)  (

ρTemp2 (lno2,amt2)(p lno,amt (loan)))))

CSCI1270, Fall 2008, Lecture 2

  • 4. Select non-minimal loans

Temp4  σamt > amt2 (Temp3)

  • 5. Project on lno

Result  p lno (Temp4) … or, if you prefer…

slide-19
SLIDE 19

Review

Theoretical Query Languages

CSCI1270, Fall 2008, Lecture 2

1. SELECT ( σ ) 2. PROJECT ( π ) 3. UNION (  ) 4. SET DIFFERENCE ( – ) 5. CARTESIAN PRODUCT (  ) 6. RENAME ( ρ )

Relational Algebra

  • Relational algebra gives semantics to practical query languages
  • Above set: minimal relational algebra

 will now look at some redundant (but useful!) operators

slide-20
SLIDE 20

Review

CSCI1270, Lecture 2

Find the names of customers who have both accounts and loans T1  ρT1 (cname2, lno) (borrower) T2  depositor  T1 T3  σcname = cname2 (T2) Result  π cname (T3) Above sequence of operators (ρ, , σ) very common.

Express the following query in the RA:

Motivates additional (redundant) RA operators.

slide-21
SLIDE 21

Relational Algebra

Additional Operators

  • 6. Update (  ) (we’ve already been using this)

CSCI1270, Lecture 2

  • 2. Division ( )
  • 3. Generalized Projection (π)
  • 4. Aggregation

 

  • 1. Natural Join (

)

     

  • 5. Outer Joins ( )
  • 1&2 Redundant: Can be expressed in terms of minimal RA

e.g. depositor borrower = π …(σ…(depositor  ρ…(borrower)))

  • 3 – 6 Added for extra power

 

slide-22
SLIDE 22

Natural Join

CSCI1270, Lecture 2

Idea: combines ρ, , σ

A B C D 1 2 2 3 α α α β +

  • +

10 10 20 10 E B D ‘a’ ‘a’ ‘b’ ‘c’ α α β β 10 20 10 10

r s

A B C D E 1 2 2 3 3 α α α Β β +

  • +

+ 10 10 20 10 10 ‘a’ ‘a’ ‘a’ ‘b’ ‘c’ =

Relation1 Relation2 Notation:

 

πcname,acct_no,lno (σcname=cname2 (depositor  ρt(cname2,lno) (borrower))) ≡

 

depositor borrower

 

slide-23
SLIDE 23

Division

CSCI1270, Lecture 2

A B α α α β γ γ γ γ δ δ 1 2 3 1 1 3 4 6 1 2 B 1 2

r s

A α δ =

Query: Find values for A in r which have corresponding B values for all B values in s

Relation1 Relation2 Notation:

Idea: expresses “for all” queries

slide-24
SLIDE 24

Division

CSCI1270, Lecture 2

A B α α α β γ γ γ γ δ δ 1 2 3 1 1 3 4 6 1 2 B 1 2

r s

A α δ =

17  3 = 5

The largest value of i such that: i  3 ≤ 17

t

Relational Division

The largest value of t such that: ( t  s  r )

Another way to look at it: and

 

slide-25
SLIDE 25

Division

CSCI1270, Lecture 2

A B C D E α α α β β γ γ γ a a a a a a a a α γ γ γ γ γ γ β a a b a b a b b 1 1 1 1 3 1 1 1 D E a b 1 1

r s

A B C α γ a a γ γ =

t ?

A More Complex Example

slide-26
SLIDE 26

Division Adds No Power

Definition in terms of the basic algebra operation Let r(R) and s(S) be relations, and let S  R r  s = R-S (r) – R-S ( (R-S (r) x s) – R-S,S(r)) To see why

– R-S,S(r) simply reorders attributes of r – R-S(R-S (r) x s) – R-S,S(r)) gives those tuples t in R-S (r) such that for some tuple u  s, tu  r.

slide-27
SLIDE 27

Generalized Projection

CSCI1270, Lecture 2

p e1,…,en (Relation)

e1,…,en can include arithmetic expressions – not just attributes

cname limit balance Jones Turner 5000 3000 2000 2500 credit =

π cname, limit - balance (credit) =

cname limit-balance Jones Turner 3000 500

Notation: Example Then…

slide-28
SLIDE 28

Aggregate Functions and Operations

 An aggregate function takes a collection of values and returns a single value as a result.

avg: average value min: minimum value max: maximum value sum: sum of values count: number of values

 Aggregate operation in relational algebra

G1, G2, …, Gn g F1( A1), F2( A2),…, Fn( An)(E )

  • E is any relational-algebra expression
  • G1, G2 …, Gn is a list of attributes on which to group

(can be empty)

  • Each Fi is an aggregate function
  • Each Ai is an attribute name

CSCI1270, Lecture 2

slide-29
SLIDE 29

Aggregate Operation – Example

 Relation r:

CSCI1270, Lecture 2

A B         C 7 7 3 10

g sum(c) (r)

sum-C 27 No grouping

slide-30
SLIDE 30

Aggregate Operation – Example

 Relation account grouped by branch-name:

CSCI1270, Lecture 2

branch-name g sum(balance) (account)

branch-name account-number balance Perryridge Perryridge Brighton Brighton Redwood A-102 A-201 A-217 A-215 A-222 400 900 750 750 700 branch-name sum(balance) Perryridge Brighton Redwood 1300 1500 700

slide-31
SLIDE 31

Aggregate Functions (Cont.)

Result of aggregation does not have a name

– Can use rename operation to give it a name – For convenience, we permit renaming as part of aggregate operation

CSCI1270, Lecture 2

branch-name g sum(balance) as sum-balance (account)

slide-32
SLIDE 32

Outer Joins

CSCI1270, Lecture 2

bname lno amt Downtown Redwood Perry L-170 L-230 L-260 3000 4000 1700

loan =

cname lno Jones Smith Hayes L-170 L-230 L-155

borrower =

=

bname lno amt cname Downtown Redwood L-170 L-230 3000 4000 Jones Smith

Join result loses…

 any record of Perry  any record of Hayes

Motivation:

 

loan borrower =

slide-33
SLIDE 33

Outer Joins

CSCI1270, Lecture 2

bname lno amt Downtown Redwood Perry L-170 L-230 L-260 3000 4000 1700

loan =

cname lno Jones Smith Hayes L-170 L-230 L-155

borrower =

bname lno amt cname Downtown Redwood Perry L-170 L-230 L-260 3000 4000 1700 Jones Smith ┴

  • preserves all tuples in left relation
  • 1. Left Outer Join ( )

 

┴ = NULL

loan borrower =

 

slide-34
SLIDE 34

Outer Joins

CSCI1270, Fall 2008, Lecture 2

bname lno amt cname Downtown Redwood ┴ L-170 L-230 L-155 3000 4000 ┴ Jones Smith Hayes

 

bname lno amt Downtown Redwood Perry L-170 L-230 L-260 3000 4000 1700

loan =

cname lno Jones Smith Hayes L-170 L-230 L-155

borrower =

  • preserves all tuples in right relation
  • 2. Right Outer Join ( )

┴ = NULL

loan borrower =

 

slide-35
SLIDE 35

Outer Joins

CSCI1270, Fall 2008, Lecture 2

bname lno amt Downtown Redwood Perry L-170 L-230 L-260 3000 4000 1700

loan =

cname lno Jones Smith Hayes L-170 L-230 L-155

borrower =

  • preserves all tuples in both relations
  • 3. Full Outer Join ( )

┴ = NULL

 

bname lno amt cname Downtown Redwood Perry ┴ L-170 L-230 L-260 L-155 3000 4000 1700 ┴ Jones Smith ┴ Hayes

loan borrower =

 

slide-36
SLIDE 36

Update

CSCI1270, Lecture 2

  • 1. Deletion: r  r – s

e.g., account  account – σbname=Perry (account) (deletes all Perry accounts)

  • 2. Insertion: r  r  s

e.g., branch  branch  {(Waltham, Boston, 7M)} (inserts new branch with bname = Waltham, bcity = Boston, assets = 7M)

  • 3. Update: r  πe1,…,en (r)

e.g., depositor  depositor  (ρtemp (cname,acct_no) (borrower)) (adds all borrowers to depositors, treating lno’s as acct_no’s) e.g., account  πbname,acct_no,bal*1.05 (account) (adds 5% interest to account balances)

Identifier  Query Notation: Common Uses:

slide-37
SLIDE 37

Views

  • Limited access to DB.
  • Tailored schema
  • Consider a person who needs to know a customer’s

loan number but has no need to see the loan

  • amount. This person should see a relation described

as: customer-name, loan-number (borrower loan)

  • A relation that is made visible to a user as a “virtual

relation” is called a view.

slide-38
SLIDE 38

View Definition

  • A view is defined using the create view statement which has the

form create view v as <query expression> where <query expression> is any legal relational algebra query

  • expression. The view name given as v.
  • Once a view is defined, the view name can be used to refer to the

virtual relation that the view generates.

  • View definition is not the same as creating a new relation by

evaluating the query expression Rather, a view definition causes the saving of an expression to be substituted into queries using the view.

slide-39
SLIDE 39

View Examples

  • Consider the view (named all-customer)

consisting of branches and their customers. create view all-customer as branch-name, customer-name(depositor account)  branch-name, customer-name(borrower loan)

  • We can find all customers of the Perryridge

branch by writing: customer-name (branch-name = “Perryridge”(all-customer))

slide-40
SLIDE 40

Updates Through View

  • Database modifications expressed as views must be translated to

modifications of the actual relations in the database.

  • Consider the person who needs to see all loan data in the loan

relation except amount. The view given to the person, branch-loan, is defined as: create view branch-loan as branch-name, loan-number (loan)

  • Since we allow a view name to appear wherever a relation name is

allowed, the person may write: branch-loan  branch-loan  {(“Perryridge”, L-37)}

slide-41
SLIDE 41

Updates Through Views (Cont.)

  • The previous insertion must be represented

by an insertion into the actual relation loan from which the view branch-loan is constructed.

  • An insertion into loan requires a value for
  • amount. The insertion can be dealt with by

either.

– rejecting the insertion and returning an error message to the user. – inserting a tuple (“L-37”, “Perryridge”, null) into the loan relation

slide-42
SLIDE 42

Updates Through Views (Cont.)

  • Some updates through views are impossible to

translate.

create view v as branch-name = “Perryridge” (account)) v  v  (L-99, Downtown, 23)

  • Others cannot be translated uniquely

all-customer  all-customer  (Perryridge, John)

  • Have to choose loan or account, and

create a new loan/account number!

slide-43
SLIDE 43

Views Defined Using Other Views

  • One view may be used in the expression

defining another view

  • A view relation v1 is said to depend directly on

a view relation v2 if v2 is used in the expression defining v1

  • A view relation v1 is said to depend on view

relation v2 if either v1 depends directly on v2 or there is a path of dependencies from v1 to v2

slide-44
SLIDE 44

View Expansion

  • Let view v1 be defined by an expression e1 that

may itself contain uses of view relations.

  • View expansion of an expression repeats the

following replacement step: repeat Find any view relation vi in e1 Replace the view relation vi by the expression defining vi until no more view relations are present in e1

  • As long as the view definitions are not recursive,

this loop will terminate