normalization
play

Normalization Improve the schema by decomposing relations and - PDF document

Normalization Improve the schema by decomposing relations and removing anomalies. CS 235: Boyce-Codd Normal Form (BCNF): all FDs Introduction to Databases follow from the fact key everything . Formally, R is in BCNF if every


  1. Normalization • Improve the schema by decomposing relations and removing anomalies. CS 235: • Boyce-Codd Normal Form (BCNF): all FD’s Introduction to Databases follow from the fact key → everything . • Formally, R is in BCNF if every nontrivial FD for Svetlozar Nestorov R , say X → A , has X a superkey. – “Nontrivial” = right-side attribute not in left side. Lecture Notes #6 BCNF properties Example 1. Guarantees no redundancy due to FD’s. Beers(name, manf, manfAddr). • FD’s: 2. Guarantees no update anomalies = one – name → manf, occurrence of a fact is updated, not all. – manf → manfAddr . 3. Guarantees no deletion anomalies = • Only key is name. valid fact is lost when tuple is deleted. • manf → manfAddr violates BCNF with a left side unrelated to any key. Decomposition into BCNF Proof • Setting: relation R , given FD’s F . Suppose • Let Y → A is a BCNF violation and follows relation R has BCNF violation X → B . from F • Then the computation of Y + used at least • We need only look among FD’s of F for a BCNF violation. one FD X → B from F . • If there are no violations, then the relation • X must be a subset of Y . is in BCNF. • Thus, if Y is not a superkey, X cannot be a • Don’t we have to consider implied FD’s? superkey either, and X → B is also a BCNF violation. • No, because… 1

  2. Decomposition Algorithm (1/2) Decomposition Algorithm (2/2) For every violation X → B among given 3. Find the FD’s for the decomposed FD’s: relations. 1. Compute X + . – Project the FD’s from F = calculate all consequents of F that involve only attributes � Cannot be all attributes – why? from X + or only from ( R − X + ) ∪ X . Decompose R into X + and ( R – X + ) ∪ X . 2. X R X+ Example (1/3) Example (2/3) R = Drinkers(name, addr, beersLiked, manf, favoriteBeer) • BCNF violations? F D ’s: • name → addr – For Drinkers1, name is key and all left sides • name → favoriteBeer of FD’s are superkeys. • beersLiked → manf – For Drinkers2, {name, beersLiked} is the key, Pick BCNF violation name → addr. and beersLiked → manf violates BCNF. • Close the left side: name + = name addr favoriteBeer . • Decomposed relations: Drinkers1(name, addr, favoriteBeer) Drinkers2(name, beersLiked, manf) • Projected FD’s (skipping a lot of work): – For Drinkers1: name → addr and name → favoriteBeer . – For Drinkers2: beersLiked → manf . Third Normal Form (3NF) Example (3/3) • Sometimes we have a dilemma: – If you decompose, you can’t check the FD’s in the decomposed relations. • Decompose Drinkers2 – If you don’t decompose, you violate BCNF. • Close beersLiked + = beersLiked, manf . • Abstractly: AB → C and C → B . • Decompose: • In book: title city → theatre and theatre → city. Drinkers3(beersLiked, manf) • Another example: street city → zip, Drinkers4(name, beersLiked) zip → city. • Resulting relations are all in BCNF: • Keys: AB and AC , but C → B has a left side not a superkey. Drinkers1(name, addr, favoriteBeer) • Suggests decomposition into BC and AC . Drinkers3(beersLiked, manf) – But you can’t check the FD AB → C in these Drinkers4(name, beersLiked) relations. 2

  3. Example “Elegant” Workaround • What can go wrong if we decompose: • Define the problem away. street zip A = street , • A relation R is in 3NF iff for every nontrivial 545 Tech Sq. 02138 B = city , FD X → A , either: 545 Tech Sq. 02139 C = zip. 1. X is a superkey, or city zip Cambridge 02138 2. A is prime = member of at least one Cambridge 02139 key. • Thus, the canonical problem goes away: city street zip you don’t have to decompose because all Cambridge 545 Tech Sq. 02138 Join: attributes are prime. Cambridge 545 Tech Sq. 02139 Decomposition Properties 3NF vs. BCNF 1. We should be able to recover from the • Without proof, we assert that it is always decomposed relations the data of the possible to decompose into BCNF and original. satisfy (1). – Recovery involves projection and join (next • Also without proof, we can decompose time). into 3NF and satisfy both (1) and (2). 2. We should be able to check that the FD’s • But it is not possible to decompose into for the original relation are satisfied by BCNF and get both (1) and (2). checking the projections of those FD’s in – Street-city-zip is an example of this point. the decomposed relations. Multivalued Dependencies Example • The multivalued dependency X →→ Y • Drinkers(name, addr, phones, beersLiked) holds in a relation R if whenever we have • MVD name →→ phones . two tuples of R that agree in all the • If Drinkers has the two tuples: attributes of X , then we can swap their Y name addr phones beersLiked components and get two new tuples that sue a p1 b1 sue a p2 b2 are also in R . it must also have the same tuples with phones components swapped: name addr phones beersLiked sue a p1 b2 sue a p2 b1 3

  4. MVD Rules Splitting Doesn’t Hold • Every FD is an MVD: if X → Y , then • Sometimes you need to have several attributes on the right of an MVD. swapping Y ’s between tuples that agree • For example: Drinkers(name, areaCode, phones, on X doesn’t create new tuples. beersLiked, beerManf) • Example, in Drinkers: name →→ addr . name areaCode phones beersLiked beerManf • Complementation : if X →→ Y , then Leo 773 555-1111 Bud A.B. Leo 773 555-1111 Honkers G.I. X →→ Z , where Z is all attributes not in X Leo 800 555-9999 Bud A.B. or Y . Leo 800 555-9999 Honkers G.I. • Example: since name →→ phones • name →→ areaCode phones holds, but neither holds in Drinkers , so does name →→ areaCode nor name →→ phones do. name →→ addr beersLiked . Fourth Normal Form (4NF) • Eliminate redundancy due to multiplicative effect of MVD's. • Roughly: treat MVD's as FD's for decomposition, but not for finding keys. • Formally: R is in Fourth Normal Form if whenever MVD X →→ Y is nontrivial ( Y is not a subset of X , and X ∪ Y is not all attributes), then X is a superkey. – Remember, X → Y implies X →→ Y , so 4NF is more stringent than BCNF. • Decompose R , using 4NF violation X →→ Y , into XY and X ∪ ( R—Y ). 4

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend