1
1
The Relational Data Model
Lecture 6
2
Outline
- Relational Data Model
- Functional Dependencies
- Logical Schema Design
The Relational Data Model Lecture 6 1 Outline Relational Data - - PDF document
The Relational Data Model Lecture 6 1 Outline Relational Data Model Functional Dependencies Logical Schema Design Reading Chapter 8 2 1 The Relational Data Model Relational Physical Data Schema storage Modeling Have
1
1
2
2
3
Data Modeling Relational Schema Physical storage E/R diagrams Tables: column names: attributes rows: tuples Complex file organization and index structures.
Have seen this in SQL Have seen this too Discuss next
4
Name Price Category Manufacturer gizmo $19.99 gadgets GizmoWorks Power gizmo $29.99 gadgets GizmoWorks SingleTouch $149.99 photography Canon MultiTouch $203.99 household Hitachi Tuples or rows or records Attribute names Table name or relation name Products:
3
5
Relational Schema: – Relation name plus attribute names – E.g. Product(Name, Price, Category, Manufacturer) – In practice we add the domain for each attribute Database Schema – Set of relational schemas – E.g. Product(Name, Price, Category, Manufacturer), Company(Name, Address, Phone), . . . . . . .
6
– values of corresponding domains
This is all mathematics, not to be confused with SQL tables ! (What's a difference?)
4
7
Name Price Category Manufacturer gizmo $19.99 gadgets GizmoWorks Power gizmo $29.99 gadgets GizmoWorks SingleTouch $149.99 photography Canon MultiTouch $203.99 household Hitachi Relational schema:Product(Name, Price, Category, Manufacturer) Instance:
8
3.9 Carol 3.7 Bob 3.8 Alice Courses GPA Name
OS DB Math OS DB OS Math
Student
3.9 Carol 3.7 Bob 3.8 Alice GPA Name
Student
Course OS DB Math OS Carol OS Alice DB Bob Alice Carol Alice Student Course DB Math Math
Takes Course
May need to add keys
5
9
10
Definition: If two tuples agree on the attributes then they must also agree on the attributes Formally: A1, A2, …, An B1, B2, …, Bm A1, A2, …, An B1, B2, …, Bm
6
11
12
7
13
Position Phone
14
Product: name price, manufacturer Person: ssn name, age Company: name stockprice, president
8
15
Product(name, category, color, department, price) name color category department color, category price Consider these FDs: What do they say ?
16
FD’s are constraints on relations:
99 Toys Green Gadget Tweaker 49 Toys Green Gadget Gizmo price department color category name
Does this instance satisfy all the FDs ? name color category department color, category price
9
17
59 Office-supp. Green Stationary Gizmo 99 Toys Black Gadget Tweaker 49 Toys Green Gadget Gizmo price department color category name
What about this one ? name color category department color, category price
18
If some FDs are satisfied, then
If all these FDs are true: name color category department color, category price Then this FD also holds: name, category price Why ??
10
19
Is equivalent to
Bm ... B1 Am ... A1
A1, A2, …, An B1, B2, …, Bm A1, A2, …, An B1 A1, A2, …, An B2 . . . . . A1, A2, …, An Bm
20
Trivial Rule Why ?
Am … A1
where i = 1, 2, ..., n A1, A2, …, An Ai
11
21
If and then Why ? A1, A2, …, An B1, B2, …, Bm B1, B2, …, Bm C1, C2, …, Cp A1, A2, …, An C1, C2, …, Cp
22 ... C1 Cp Bm … B1 Am … A1
12
23
Start from the following FDs: Infer the following FDs:
Which Rule did we apply ?
Inferred FD
24
Answers:
Transitivity on 3, 7
Split/combine on 5, 6
Trivial rule
Transitivity on 4, 1
Trivial rule
Which Rule did we apply ? Inferred FD
13
25
student major major, course room course time What else can we infer ?
26
If then Augmentation follows from trivial rules and transitivity How ? A1, A2, …, An B A1, A2, …, An , C1, C2, …, Cp B
14
27
– E.g. R(A, B, C, D): how many FDs are possible ?
– Still way too many
28
Given a set of attributes A1, …, An The closure, {A1, …, An}+ , is the set of attributes B s.t. A1, …, An B name color category department color, category price Example: Closures: name+ = {name, color} {name, category}+ = {name, category, color, department, price} color+ = {color}
15
29
Start with X={A1, …, An}. Repeat until X doesn’t change do: if B1, …, Bn C is a FD and B1, …, Bn are all in X then add C to X. {name, category}+ = {name, category, color, department, price} name color category department color, category price Example:
30
Compute {A,B}+ X = {A, B, } Compute {A, F}+ X = {A, F, } R(A,B,C,D,E,F) A, B C A, D E B D A, F B
16
31
A, B C A, D B B D Example: Step 1: Compute X+, for every X: A+ = A, B+ = BD, C+ = C, D+ = D AB+ = ABCD, AC+ = AC, AD+ = ABCD ABC+ = ABD+ = ACD+ = ABCD (no need to compute– why ?) BCD+ = BCD, ABCD+ = ABCD Step 2: Enumerate all FD’s X Y, s.t. Y ⊆ X+ and X∩Y = ∅: AB CD, ADBC, ABC D, ABD C, ACD B
32
– Designer derives them from real-world knowledge of users – Problem: knowledge might not be available
– Analyze given database instance and find all FD’s satisfied by that instance – Useful if designers don’t get enough information from users – Problem: FDs might be artifical for the given instance
17
33
Do all FDs make sense in practice ?
34
Course Dept, Room Dept, Room Course Student, Dept Course, Room Student, Course Dept, Room Student, Room Dept, Course Do all FDs make sense in practice ?
18
35
36
19
37
name, category price category color Keys are: {name, category} and all supersets
student address room, time course student, course room, time
38
20
39
40
what is his city ? Example: Persons with several phones SSN Name, City
Westfield 908-555-2121 987-65-4321 Joe Seattle 206-555-6543 123-45-6789 Fred Seattle 206-555-1234 123-45-6789 Fred City PhoneNumber SSN Name
but not SSN PhoneNumber
21
41
Break the relation into two:
Westfield 987-65-4321 Joe Seattle 123-45-6789 Fred City SSN Name 908-555-2121 987-65-4321 206-555-6543 123-45-6789 206-555-1234 123-45-6789 PhoneNumber SSN
Westfield 908-555-2121 987-65-4321 Joe Seattle 206-555-6543 123-45-6789 Fred Seattle 206-555-1234 123-45-6789 Fred City PhoneNumber SSN Name
42
Person buys Product name price name ssn
Conceptual Model: Relational Model: plus FD’s Normalization: Eliminates anomalies
22
43
R1 = projection of R on A1, ..., An, B1, ..., Bm R2 = projection of R on A1, ..., An, C1, ..., Cp R(A1, ..., An, B1, ..., Bm, C1, ..., Cp) R1(A1, ..., An, B1, ..., Bm) R2(A1, ..., An, C1, ..., Cp)
44
Camera 19.99 Gizmo Camera 24.99 OneClick Gadget 19.99 Gizmo Category Price Name 19.99 Gizmo 24.99 OneClick 19.99 Gizmo Price Name Camera Gizmo Camera OneClick Gadget Gizmo Category Name
Lossless decomposition
23
45
Camera 19.99 Gizmo Camera 24.99 OneClick Gadget 19.99 Gizmo Category Price Name Camera Gizmo Camera OneClick Gadget Gizmo Category Name Camera 19.99 Camera 24.99 Gadget 19.99 Category Price
What’s incorrect ?? Lossy decomposition
46
R(A1, ..., An, B1, ..., Bm, C1, ..., Cp) If A1, ..., An B1, ..., Bm Then the decomposition is lossless R1(A1, ..., An, B1, ..., Bm) R2(A1, ..., An, C1, ..., Cp) Example: name price, hence the first decomposition is lossless Note: don’t need necessarily A1, ..., An C1, ..., Cp
24
47
48
A simple condition for removing anomalies from relations: In English (though a bit vague): Whenever a set of attributes of R is determining another attribute, it should determine all the attributes of R. A relation R is in BCNF if: If A1, ..., An B is a non-trivial dependency in R , then {A1, ..., An} is a key for R
25
49
A’s Others B’s R1 Is there a 2-attribute relation that is not in BCNF ? Repeat choose A1, …, Am B1, …, Bn that violates the BNCF condition split R into R1(A1, …, Am, B1, …, Bn) and R2(A1, …, Am, [others]) continue with both R1 and R2 Until no more violations R2
50
What are the dependencies? SSN Name, City What are the keys? {SSN, PhoneNumber} Is it in BCNF?
Westfield 908-555-1234 987-65-4321 Joe Westfield 908-555-2121 987-65-4321 Joe Seattle 206-555-6543 123-45-6789 Fred Seattle 206-555-1234 123-45-6789 Fred City PhoneNumber SSN Name
26
51
Westfield 987-65-4321 Joe Seattle 123-45-6789 Fred City SSN Name 908-555-1234 987-65-4321 908-555-2121 987-65-4321 206-555-6543 123-45-6789 206-555-1234 123-45-6789 PhoneNumber SSN
SSN Name, City Let’s check anomalies:
52
Find a dependency that violates the BCNF condition: A’s Others B’s R1 R2 Heuristics: choose B , B , … B “as large as possible”
1 2 m
Decompose: 2-attribute relations are BCNF Continue until there are no BCNF violations left. A1, A2, …, An B1, B2, …, Bm
27
53
Person(name, SSN, age, hairColor, phoneNumber) SSN name, age age hairColor Decompose in BCNF (in class): Step 1: find all keys (How ? Compute S+, for various sets S) Step 2: now decompose
54
28
55
R’ is in general larger than R. Must ensure R’ = R Decompose Recover
56
29
57
Unit Company Product Unit Company Unit Product FD’s: Unit → Company; Company, Product → Unit So, there is a BCNF violation, and we decompose. Unit → Company No FDs Notice: we loose the FD: Company, Product Unit
58
Unit Company Product Unit Company Unit Product Galaga99 UW Galaga99 databases Bingo UW Bingo databases No problem so far. All local FD’s are satisfied. Let’s put all the data back into a single table again (anomalies?): Galaga99 UW databases Bingo UW databases Violates the dependency: company, product -> unit!
30
59
A simple condition for removing anomalies from relations: A relation R is in 3rd normal form if : Whenever there is a nontrivial dependency A1, A2, ..., An → B for R , then {A1, A2, ..., An } is a key for R,
Tradeoff: BCNF = no anomalies, but may lose some FDs 3NF = keeps all FDs, but may have some anomalies