csci1270 introduction to database systems
play

CSCI1270 Introduction to Database Systems Normalization CSCI1270: - PowerPoint PPT Presentation

CSCI1270 Introduction to Database Systems Normalization CSCI1270: Introduction to Database Systems Another Use for FD s: Schema Design Schema Design: Approach #1 1. Construct E/R diagram 2. Translate into tables Subjective: How do we know


  1. CSCI1270 Introduction to Database Systems Normalization CSCI1270: Introduction to Database Systems

  2. Another Use for FD ’ s: Schema Design Schema Design: Approach #1 1. Construct E/R diagram 2. Translate into tables Subjective: How do we know if any good? Schema Design: Approach #2 1. Start with universal relation 2. Determine FD ’ s 3. “ Decompose ” UR using FD ’ s as guide Schema Design: Approach #3 1. Construct E/R diagram to come up with 1 st cut design 2. Use FD ’ s to verify or refine CSCI1270: Introduction to Database Systems

  3. Decomposition 1. Decomposing the Schema R = (bname, bcity, assets, cname, lno, amt ) R 1 = (bname, bcity, assets, cname) R 2 = (cname, lno, amt) Notation: R = R 1 ∪ R 2 CSCI1270: Introduction to Database Systems

  4. Decomposition 2. Decomposing the Instance R = bname bcity assets cname lno amt Dntn Bkln 9M Jones L-17 1000 Dntn Bkln 9M Johnson L-23 2000 Mianus Hnck 1.7M Jones L-93 500 Dntn Bkln 9M Hayes L-17 1000 R 1 = R 2 = bname bcity assets cname cname lno amt Dntn Bkln 9M Jones Jones L-17 1000 Johnson L-23 2000 Dntn Bkln 9M Johnson Jones L-93 500 Mianus Hnck 1.7M Jones Hayes L-17 1000 Dntn Bkln 9M Hayes BTW: Not a Good Decomposition CSCI1270: Introduction to Database Systems

  5. Goals of Decomposition 1. Lossless Joins Want to be able to reconstruct big relation by joining smaller ones (Natural join) (i.e.: R 1 R 2 = R ?) 2. Dependency Preservation Want to minimize the cost of global integrity constraints based on FD ’ s (i.e.: Avoid big joins in assertions) 3. Redundancy Avoidance Avoid unnecessary data dupl. (motivation for decomposition) Summary: LJ: Information loss DP: Efficiency (time) RA: Efficiency (space), update anomalies CSCI1270: Introduction to Database Systems

  6. Another Use for FD ’ s: Schema Design Example: R = bname bcity assets cname lno amt Dntn Bkln 9M Jones L-17 1000 Dntn Bkln 9M Johnson L-23 2000 Mianus Hnck 1.7M Jones L-93 500 Dntn Bkln 9M Hayes L-17 1000 R: “ Universal Relation ” Tuple meaning: Jones has a loan (L-17) for $1000 taken out of the Dntn branch in Bkln which has assets of $9M Design: pro : Fast queries (No need for joins!) con : Redundancy, update anomalies, deletion anomalies CSCI1270: Introduction to Database Systems

  7. Decomposition Goal #1: Lossless Joins A Bad Decomposition bname bcity assets cname cname lno amt Dntn Bkln 9M Jones Jones L-17 1000 Dntn Bkln 9M Johnson Johnson L-23 2000 = Mianus Hnck 1.7M Jones Jones L-93 500 Dntn Bkln 9M Hayes Hayes L-17 1000 bname bcity assets cname lno amt Dntn Bkln 9M Jones L-17 1000 Dntn Bkln 9M Jones L-93 500 Dntn Bkln 9M Johnson L-23 3000 Mianus Hnck 1.7M Jones L-17 1000 Mianus Hnck 1.7M Jones L-93 500 Dntn Bkln 9M Hayes L-17 1000 CSCI1270: Introduction to Database Systems

  8. Decomposition Goal #1: Lossless Joins A Bad Decomposition bname bcity assets cname lno amt Dntn Bkln 9M Jones L-17 1000 → Dntn Bkln 9M Jones L-93 500 = Dntn Bkln 9M Johnson L-23 3000 → Mianus Hnck 1.7M Jones L-17 1000 Mianus Hnck 1.7M Jones L-93 500 Dntn Bkln 9M Hayes L-17 1000 Problem: adds meaningless tuples “ Lossy join ” : B y adding noise, have lost meaningful information as a result of decomposition CSCI1270: Introduction to Database Systems

  9. Lossless Joins Is the Following Decomposition Lossless or Lossy? R 1 = R 2 = bname bcity assets cname bname lno amt Dntn Bkln 9M Jones Dntn L-17 1000 Dntn Bkln 9M Johnson Dntn L-23 2000 Mianus Hnck 1.7M Jones Mianus L-93 500 Dntn Bkln 9M Hayes bname bcity assets cname lno amt A: Lossy. … R 1 R 2 includes: Dntn Bkln 9M Jones L-23 2000 Dntn Bkln 9M Johnson L-17 1000 Dntn Bkln 9M Hayes L-23 2000 ( R 1 R 2 has 7 … tuples, whereas R has 4) CSCI1270: Introduction to Database Systems

  10. Lossless Joins Is the Following Decomposition Lossless or Lossy? R 1 = R 2 = bname assets cname lno lno bcity amt Dntn 9M Jones L-17 L-17 Bkln 1000 Dntn 9M Johnson L-23 L-23 Bkln 2000 Mianus 1.7M Jones L-93 L-93 Hnck 500 Dntn 9M Hayes L-17 A: Lossless. R 1 R 2 has 4 tuples CSCI1270: Introduction to Database Systems

  11. Lossless Joins Lossless or Lossy? R 1 = R 2 = bname bcity assets bname lno amt cname Dntn L-17 1000 Jones Dntn Bkln 9M Dntn L-23 2000 Johnson Mianus Bkln 1.7M Mianus L-93 500 Jones Dntn L-17 1000 Hayes A: Lossless. R 1 R 2 has 4 tuples Q: When is decomposition lossless? CSCI1270: Introduction to Database Systems

  12. Ensuring Lossless Joins A Decomposition of R , R = R 1 ∪ R 2 is Lossless iff R 1 ∩ R 2 → R 1 or R 1 ∩ R 2 → R 2 (i.e.: Intersecting atts must form a super key for one of the resulting smaller relations) Intuition: Original relation R has n tuples A A  R 1 … R 2 ● ● ●  ● ●  ● ● ▪ A a key ⇒ | R 2 | ≤ n, & ● Relationship with R 1 is n:1 ▪ A not a key ⇒ | R 1 | = n ∴ n tuples in result CSCI1270: Introduction to Database Systems

  13. Decomposition Goal #2: Dependency Preservation Goal: Efficient integrity checks of FD ’ s An Example With No Dependency Preservation: R = (bname, bcity, assets, cname, lno, amt) bname → bcity assets lno → amt bname Decomposition: R = R 1 ∪ R 2 R 1 = (bname, assets, cname, lno) R 2 = (lno, bcity, amt) Lossless, but Not DP. Why? CSCI1270: Introduction to Database Systems

  14. Decomposition Goal #2: Dependency Preservation (cont.) Decomposition (cont.): R = R 1 ∪ R 2 R 1 = (bname, assets, cname, lno) R 2 = (lno, bcity, amt) Lossless, but Not DP. Why? A: bname → bcity crosses 2 tables CREATE ASSERTION bname-bcity CHECK NOT EXISTS (SELECT * FROM R 1 AS x 1 , R 2 AS y 1 ,R 1 AS x 2 , R 2 AS y 2 WHERE x 1 .lno = y 1 .lno AND x 2 .lno = y 2 .lno AND x 1 .lno = x 2 .lno AND x 1 .bname = x 2 .bname AND y 1 .bcity <> y 2 .bcity) CSCI1270: Introduction to Database Systems

  15. Decomposition Goal #2: Dependency Preservation To Ensure Best Possible Efficiency of FD Checks Ensure that only a SINGLE table be examined for each FD i.e.: Ensure that A 1 , …, A n → B 1 , …, B m can be checked by examining one table as in: R i = … A 1 … … … A n B 1 B m Above: R i “ covers ” the FD, A 1 , …, A n → B 1 , …, B m To Test if Decomposition R = R 1 ∪ … R n is DP, 1. See which FD ’ s of R are covered by R 1 , …, R n 2. Compare closure of (1) to closure of FD ’ s of R CSCI1270: Introduction to Database Systems

  16. Decomposition Goal #2: Dependency Preservation More Formally: To test if R = R 1 ∪ … ∪ R n is dependency preserving wrt R ’ s FD set, F : 1. Compute F + 2. Compute G G ← ∅ For i ← 1 to n DO Add to G those FD ’ s in F + covered by R i 3. Compute G + 4. If F + = G + : Decomposition is DP If F + ≠ G + : Decomposition is not DP CSCI1270: Introduction to Database Systems

  17. Decomposition Goal #2: Dependency Preservation (cont.) More Formally (cont.): To test if R = R 1 ∪ … ∪ R n is dependency preserving wrt R ’ s FD set, F : 1. Compute F + 2. Compute G 3. Compute G + 4. Compute F + - G + Example: F = {A → B, AB → D, C → D} R 1 = (A, B, C); R 2 = (C, D) Is this decomposition of (A, B, C, D) DP? CSCI1270: Introduction to Database Systems

  18. Aside: Computing F + Many Algorithms Call For It If you know Armstrong ’ s Axioms cold, can generate lazily: 1. Compute F c 2. Use Armstrong ’ s Axioms to derive (X → Y) ∈ F c + as needed CSCI1270: Introduction to Database Systems

  19. Decomposition Goal #2: Dependency Preservation Example: F = {A → B, AB → D, C → D} R 1 = (A, B, C); R 2 = (C, D) Is R = R 1 ∪ R 2 DP? A: 1. F + = {A → B, AB → D, C → D} + Note: (A → D) ∈ F+ 2. G = ∅ ∪ {A → B, …} ∪ {C → D, …} Note: (A → D) ∉ G 3. G + = {…} Note: (A → D) ∉ G + 4. F + ≠ G + because (A → D) ∈ (F + - G + ) ∴ Decomposition is not DP CSCI1270: Introduction to Database Systems

  20. Decomposition Goal #2: Dependency Preservation Example: F = {A → B, AB → D, C → D} What is a DP decomposition of F ? A: R = R 1 ∪ R 2 s.t. R 1 = (A, B, D); R 2 = (C, D) 1. F + = { A → B, AB → D,C → D} + 2. G+ = {A → B, AB → D, C → D } + 3. F + = G + Note: G + cannot introduce FD ’ s not in F + ∴ Decomposition is DP Q: Does it satisfy lossless joins? A: No CSCI1270: Introduction to Database Systems

  21. Decomposition Goals Summary Lossless Joins Motivation: Avoid information loss Idea: No noise introduced when reconstitution universal relation via joins Test: At each decomposition test: R = R 1 ∪ R 2 (R 1 ∩ R 2 ) → R 1 or (R 1 ∩ R 2 ) → R 2 Ensured for: BCNF, 3NF Dependency Preservation Motivation: Efficient FD assertions Idea: No gic ’ s require joins of more than 1 table with itself Test: R = R 1 ∪ … ∪ R n is DP if closure of FD ’ s covered by each R i = closure of FD ’ s covered by R = F + Ensured for: 3NF CSCI1270: Introduction to Database Systems

  22. Decomposition Goal #3 Redundancy Avoidance Redundancy: A B C a x 1 1. Name FD of this relation? e x 1 Ans: B → C g y 2 h y 2 m y 2 2. Name the super keys of this relation A: All sets of atts that include A n z 1 p z 1 3. When do we have redundancy? A: When ∃ some FD, X → Y covered by relation & X not a super key CSCI1270: Introduction to Database Systems

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend