Normalization Redundancy causes several anomalies : insert, delete - - PDF document

normalization
SMART_READER_LITE
LIVE PREVIEW

Normalization Redundancy causes several anomalies : insert, delete - - PDF document

What and Why Normalization? To remove potential redundancy in design Normalization Redundancy causes several anomalies : insert, delete and update Normalization uses concept of dependencies Functional Dependencies Idea used:


slide-1
SLIDE 1

1

Murali Mani

Normalization

Murali Mani

What and Why Normalization?

To remove potential redundancy in design

Redundancy causes several anomalies: insert,

delete and update

Normalization uses concept of dependencies

Functional Dependencies

Idea used: Decomposition

Break R (A, B, C, D) into R1 (A, B) and R2 (B, C,

D)

Use decomposition judiciously.

Murali Mani

Insert Anomaly

Greg Dave sName p2 p1 pNumber ER s2 MM s1 pName sNumber Student Note: We cannot insert a professor who has no students. Insert Anomaly: We are not able to insert “valid” value/(s)

Murali Mani

Delete Anomaly

Greg Dave sName p2 p1 pNumber ER s2 MM s1 pName sNumber Student Note: We cannot delete a student that is the only student of a professor. Note: In both cases, minimum cardinality of Professor in the corresponding ER schema is 0 Delete Anomaly: We are not able to perform a delete without losing some “valid” information.

Murali Mani

Update Anomaly

Greg Dave sName p1 p1 pNumber MM s2 MM s1 pName sNumber Student Note: To update the name of a professor, we have to update in multiple tuples. Update anomalies are due to redundancy.

Student sNumber sName Professor pNumber pName Has Advisor (1,1) (0,*) years

Note the maximum cardinality of Professor in the corresponding ER schema is * Update Anomaly: To update a value, we have to update multiple rows.

Murali Mani

Keys : Revisited

A key for a relation R (a1, a2, …, an) is a set

  • f attributes, K that uniquely determine the

values for all attributes of R.

A key K is minimal: no proper subset of K is a

key.

A superkey need not be minimal Prime Attribute: An attribute of a key

slide-2
SLIDE 2

2

Murali Mani

Keys: Example

320FL Greg 2 144FL Dave 1 address sName sNumber Student Primary Key: <sNumber> Candidate key: <sName> Some superkeys: {<sNumber, address>, <sName>, <sNumber>, <sNumber, sName>, <sNumber, sName, address>} Prime Attribute: {sNumber, sName}

Murali Mani

Functional Dependencies (FDs)

320FL Greg 2 144FL Dave 1 address sName sNumber Student Suppose we have the FD sName → address

  • for any two rows in the Student relation with the same value for sName, the

value for address must be the same

  • i.e., there is a function from sName to address

Note:

  • We will assume no null values.
  • Any key (primary or candidate) or superkey of a relation R functionally

determines all attributes of R

Murali Mani

Properties of FDs

Consider A, B, C, Z are sets of attributes Reflexive (also called trivial FD): if A ⊇ B, then

A → B

Transitive: if A → B, and B → C, then A → C Augmentation: if A → B, then AZ → BZ Union: if A → B, A → C, then A → BC Decomposition: if A → BC, then A → B, A → C

Murali Mani

Inferring FDs

Why?

Suppose we have a relation R (A, B, C) and we

have functional dependencies A → B, B → C, C → A

what is a key for R? Should we split R into multiple relations?

We can infer A → ABC, B → ABC, C → ABC.

Hence A, B, C are all keys.

Murali Mani

Algorithm for inference of FDs

  • Computing the closure of set of attributes

{A1, A2, …, An}, denoted {A1, A2, …, An}+

1.

Let X = {A1, A2, …, An}

2.

If there exists a FD B1, B2, …, Bm → C, such that every Bi ∈ X, then X = X ∪ C

3.

Repeat step 2 till no more attributes can be added.

4.

{A1, A2, …, An}+ = X

Murali Mani

Inferring FDs: Example 1

Given R (A, B, C), and FDs A → B, B → C, C

→ A, what are possible keys for R

Compute the closure of attributes:

{A}+ = {A, B, C} {B}+ = {A, B, C} {C}+ = {A, B, C}

So keys for R are <A>, <B>, <C>

slide-3
SLIDE 3

3

Murali Mani

Inferring FDs: Example 2

Consider R (A, B, C, D, E) with FDs A → B,

B → C, CD → E, does A → E?

Let us compute {A}+ {A}+ = {A, B, C} Therefore A → E is false

Murali Mani

Decomposing Relations

Greg Dave sName p2 p1 pNumber MM s2 MM s1 pName sNumber StudentProf FDs: pNumber → pName Greg Dave sName p2 p1 pNumber s2 s1 sNumber Student p2 p1 pNumber MM MM pName Professor

Murali Mani

Decomposition: Lossless Join Property

Generating spurious tuples

Greg Dave sName MM MM pName S2 S1 sNumber Student p2 p1 pNumber MM MM pName Professor MM p1 Greg s2 MM p2 Dave s1 Greg Dave sName p2 p1 pNumber MM s2 MM s1 pName sNumber StudentProf

Murali Mani

Normalization Step

Consider relation R with set of attributes AR.

Consider a FD A → B (such that no other attribute in (AR – A – B) is functionally determined by A).

If A is not a superkey for R, we may

decompose R as:

Create R’ (AR – B) Create R’’ with attributes A ∪ B Key for R’’ = A

Murali Mani

Normal Forms: BCNF

Boyce Codd Normal Form (BCNF): For every

non-trivial FD X → a in R, X is a superkey of R.

Murali Mani

BCNF example

SCI (student, course, instructor) FDs: student, course → instructor instructor → course Decomposition: SI (student, instructor) Instructor (instructor, course)

slide-4
SLIDE 4

4

Murali Mani

Dependency Preservation

We might want to ensure that all specified FDs are captured. BCNF does not necessarily preserve FDs. 2NF, 3NF preserve FDs. ER Dave MM Dave instructor student SI DB 1 ER DB 1 MM course instructor Instructor DB 1 ER Dave MM instructor DB 1 Dave course student SCI (from SI and Instructor) SCI violates the FD student, course → instructor

Murali Mani

Normal Forms: 3NF

Third Normal Form (3NF): For every non-

trivial FD X → a in R, either a is a prime attribute or X is a superkey of R.

Murali Mani

3NF - example

Lot (propNo, county, lotNum, area, price, taxRate) Candidate key: <county, lotNum> FDs: county → taxRate area → price Decomposition: Lot (propNo, county, lotNum, area, price) County (county, taxRate)

Murali Mani

3NF - example

Lot (propNo, county, lotNum, area, price) County (county, taxRate) Candidate key for Lot: <county, lotNum> FDs: county → taxRate area → price Decomposition: Lot (propNo, county, lotNum, area) County (county, taxRate) Area (area, price)

Murali Mani

Extreme Example

Consider relation R (A, B, C, D) with primary key (A, B, C), and FDs B → D, and C → D. R violates 3NF. Decomposing it, we get 3 relations as: R1 (A, B, C), R2 (B, D), R3 (C, D) Let us consider an instance where we need these 3 relations and how we do a natural join ⋈ c2 b1 a3 c1 b2 a2 c1 b1 a1 C B A d2 b2 d1 b1 D B R1 R2 d2 c2 d1 c1 D C R3 c2 c1 c1 C d1 b1 a3 d2 b2 a2 d1 b1 a1 D B A R1 ⋈ R2: violates C → D c1 C d1 b1 a1 D B A R1 ⋈ R2 ⋈ R3: no FD is violated

Murali Mani

Define Foreign Keys?

Consider the normalization step: A relation R

with set of attributes AR, and a FD A → B, and A is not a key for R, we decompose R as:

Create R’ (AR – B) Create R’’ with attributes A ∪ B Key for R’’ = A

We can also define foreign key

R’ (A) references R’’ (A)

Question: What is key for R’?

slide-5
SLIDE 5

5

Murali Mani

How does Normalization Help?

Employee Dept Works For ssn name lot did dName (1, 1) (0, *)

Employee (ssn, name, lot, dept) Dept (did, dName) FK: Employee (dept) REFERENCES Dept (did) Suppose: employees of a dept are in the same lot FD: dept → lot Decomposing: Employee (ssn, name, dept) Dept (did, dName) DeptLot (dept, lot) (or) Employee (ssn, name, dept) Dept (did, dName, lot)

Employee Dept Works For ssn name did dName (1, 1) (0, *) lot