Relational Database Design Theory Part II CPS 196.3 Introduction - - PDF document

relational database design theory part ii
SMART_READER_LITE
LIVE PREVIEW

Relational Database Design Theory Part II CPS 196.3 Introduction - - PDF document

Relational Database Design Theory Part II CPS 196.3 Introduction to Database Systems 2 Announcement ! Project proposal/progress report due today ! Midterm next Thursday in class " Everything up to todays lecture, with a focus on the


slide-1
SLIDE 1

1

Relational Database Design Theory Part II

CPS 196.3 Introduction to Database Systems

2

Announcement

! Project proposal/progress report due today ! Midterm next Thursday in class

" Everything up to today’s lecture, with a focus on the materials covered by the first two homework assignments " Open book, open notes

! Will assign an optional problem set tonight as a study

guide for midterm

" Entirely optional " If you turn it in on Tuesday in class, you can use its grade to replace your lowest homework grade so far " Solution will be posted on Tuesday midnight

! Graded Homework #2 will be available on Tuesday

3

Review

! Functional dependencies

" X → Y: If two rows agree on X, they must agree on Y #A generalization of the key concept

! Non-key functional dependencies: a source of redundancy

" No trivial X → Y where X is not a superkey #Called a BCNF violation

! BCNF decomposition: a method for removing redundancies

" Given R(X, Y, Z) and a BCNF violation X → Y, decompose R into R1(X, Y) and R2(X, Z) #A lossless join decomposition

! Schema in BCNF has no redundancy due to FD’s

slide-2
SLIDE 2

2

4

Next

! 3NF (BCNF is too much) ! Multivalued dependencies: another source of

redundancy

! 4NF (BCNF is not enough)

5

Motivation for 3NF

! Address (street_address, city, state, zip)

" street_address, city, state → zip " zip → city, state

! Keys ! BCNF?

6

To decompose or not to decompose

Address1 Address2

! FD’s in Address1 ! FD’s in Address2 ! Hey, where is street_address, city, state → zip?

" Cannot check without joining Address1 and Address2 back together

! Problem: Some lossless join decomposition is not

dependency-preserving

! Dilemma: Should we get rid of redundancy at the expense

  • f making constraints harder to enforce?
slide-3
SLIDE 3

3

7

3NF

! R is in Third Normal Form (3NF) if for every non-trivial

FD X → A (where A is single attribute), either

" X is a superkey of R, or " A is a member of at least one key of R #Intuitively, BCNF decomposition on X → A would “break” the key containing A

! So Address is already in 3NF ! Tradeoff:

" Can enforce all original FD’s on individual decomposed relations " Might have some redundancy due to FD’s

8

BNCF = no redundancy?

! Student (SID, CID, club)

" Suppose your classes have nothing to do with the clubs you join " FD’s? " BNCF? " Redundancies?

SID CID club 142 CPS196 ballet 142 CPS196 sumo 142 CPS114 ballet 142 CPS114 sumo 123 CPS196 chess 123 CPS196 golf ... ... ...

9

Multivalued dependencies

! A multivalued dependency (MVD) has the form

X ! Y, where X and Y are sets of attributes in a relation R

! X ! Y means that whenever two rows in R agree on

all the attributes of X, then we can swap their Y components and get two new rows that are also in R

X Y Z a b1 c1 a b2 c2 ... ... ... X Y Z a b1 c1 a b2 c2 a b1 c2 a b2 c1 ... ... ...

Must be in R too

slide-4
SLIDE 4

4

10

MVD examples

Student (SID, CID, club)

! SID ! CID

11

Complete MVD + FD rules

! FD reflexivity, augmentation, and transitivity ! MVD complementation:

If X ! Y, then X ! attrs(R) – X – Y

! MVD augmentation:

If X ! Y and V ⊆ W, then XW ! YV

! MVD transitivity:

If X ! Y and Y ! Z, then X ! Z – Y

! Replication (FD is MVD):

If X → Y, then X ! Y

! Coalescence:

If X ! Y and Z ⊆ Y and there is some W disjoint from Y such that W → Z, then X → Z

Try proving things using these!

12

An elegant solution: chase

! Given a set of FD’s and MVD’s D, does another

dependency d (FD or MVD) follow from D?

! Procedure

" Start with the hypotheses of d, and treat them as “seed” tuples in a relation " Apply the given dependencies in D repeatedly

  • If we apply an FD, we infer equality of two symbols
  • If we apply an MVD, we infer more tuples

" If we infer the conclusion of d, we have a proof " Otherwise, if nothing more can be inferred, we have a counterexample

slide-5
SLIDE 5

5

13

Proof by chase

! In R(A, B, C, D), does A ! B and B ! C imply

that A ! C?

A B C D a b1 c1 d1 a b2 c2 d2 A B C D a b1 c2 d1 a b2 c1 d2

Have Need

14

Another proof by chase

! In R(A, B, C, D), does A → B and B → C imply

that A → C?

Have Need c1 = c2 In general, both new tuples and new equalities may be generated

A B C D a b1 c1 d1 a b2 c2 d2

15

Counterexample by chase

! In R(A, B, C, D), does A ! BC and CD → B imply

that A → B?

A B C D a b1 c1 d1 a b2 c2 d2

Have Need b1 = b2

slide-6
SLIDE 6

6

16

4NF

! A relation R is in Fourth Normal Form (4NF) if

" For every non-trivial MVD X ! Y in R, X is a superkey " That is, all FD’s and MVD’s follow from “key → other attributes” (i.e., no MVD’s and no FD’s besides key functional dependencies)

! 4NF is stronger than BCNF

" Because every FD is also a MVD

17

4NF decomposition algorithm

! Find a 4NF violation

" A non-trivial MVD X ! Y in R where X is not a superkey

! Decompose R into R1 and R2, where

" R1 has attributes X ∪ Y " R2 has attributes X ∪ Z (Z contains attributes not in X or Y)

! Repeat until all relations are in 4NF ! Almost identical to BCNF decomposition algorithm ! Any decomposition on a 4NF violation is lossless

18

4NF decomposition example

Student (SID, CID, club)

SID CID club 142 CPS196 ballet 142 CPS196 sumo 142 CPS114 ballet 142 CPS114 sumo 123 CPS196 chess 123 CPS196 golf ... ... ...

slide-7
SLIDE 7

7

19

3NF, BCNF, 4NF, and beyond

! Of historical interests

" 1NF: All column values must be atomic " 2NF: There is no partial functional dependency (a non- trivial FD X → A where X is a proper subset of some key)

No Possible Possible Redundancy due to MVD’s No No Possible Redundancy due to FD’s Possible Possible No Lose FD’s? 4NF BCNF 3NF Anomaly/normal form

20

Summary

! Philosophy behind BCNF, 4NF:

Data should depend on the key, the whole key, and nothing but the key!

! Philosophy behind 3NF:

… But not at the expense of more expensive constraint enforcement!