Normalization (part 2) Professor Larry Heimann Carnegie Mellon University Information Systems Program
Homework review Below is a table containing information about people who registered for events at a community center: (first_name, last_name, email, home_telephone, work_telephone, event_name, event_date, event_price, spouse_name, child_name_1, child_name_2, child_name_3) a) Identify all the functional dependencies b) Identify the table's normal form c) Convert the table into a database with all tables in 3NF
Homework review Below is a table containing information about people who registered for events at a community center: (first_name, last_name, email, home_telephone, work_telephone, event_name, event_date, event_price, spouse_name, child_name_1, child_name_2, child_name_3) a) Identify all the functional dependencies F = {(first_name, last_name) → email, home_telephone, work_telephone, spouse_name, child_name_ i ; event_name → event_date, event_price}
Homework review Below is a table containing information about people who registered for events at a community center: (first_name, last_name, email, home_telephone, work_telephone, event_name, event_date, event_price, spouse_name, child_name_1, child_name_2, child_name_3) b) Identify the table's normal form The repeated attributes child_name_(1,2,3) make it clear that this is 0NF . One could also argue that lacking a legitimate primary key also places this as 0NF , but we’ll accept (first_name, last_name) as a primary key in this case.
Homework review Below is a table containing information about people who registered for events at a community center: (first_name, last_name, email, home_telephone, work_telephone, event_name, event_date, event_price, spouse_name, child_name_1, child_name_2, child_name_3) c) Convert the table into a database with all tables in 3NF people(person_id, first_name, last_name, email, home_telephone, work_telephone, spouse_name) children(child_id, person_id , child_name) events(event_id, event_name, event_date, event_price) person_events( event_id , person_id )
Homework review Below is a table containing information about people who registered for events at a community center: (first_name, last_name, email, home_telephone, work_telephone, event_name, event_date, event_price, spouse_name, child_name_1, child_name_2, child_name_3) c) Convert the table into a database with all tables in 3NF people(person_id, first_name, last_name, email, home_telephone, work_telephone, spouse_name) children(child_id, person_id , child_name) events(event_id, event_name, event_date, event_price) person_events( event_id , person_id )
Fact: just over half of adults today are married Implication: lots of nulls in the people table
Homework review Below is a table containing information about people who registered for events at a community center: (first_name, last_name, email, home_telephone, work_telephone, event_name, event_date, event_price, spouse_name, child_name_1, child_name_2, child_name_3) c) Convert the table into a database with all tables in 3NF (revised) people(person_id, first_name, last_name, email, home_telephone, work_telephone) relatives(relative_id, person_id , name, relation) events(event_id, event_name, event_date, event_price) person_events( event_id , person_id )
Asides Some issues dealing with households: • Di ff erent kinds of relations • People switching households over time Patterns for dealing with household membership Candidate keys and primary keys — some better than others
Homework review Below is a table that a dentist users to keep track of her work on her patients' teeth: (first_name, last_name, insurance_company_name, patient_insurance_number, insurance_company_phone_number, visit_date, which_tooth, symptoms, diagnosis, treatment_notes, material_used, fee, amount_billed, amount_currently_owed) a) Identify all the functional dependencies b) Identify the table's normal form c) Convert the table into a database with all tables in 3NF
Homework review Below is a table that a dentist users to keep track of her work on her patients' teeth: (first_name, last_name, insurance_company_name, patient_insurance_number, insurance_company_phone_number, visit_date, which_tooth, symptoms, diagnosis, treatment_notes, material_used, fee, amount_billed, amount_currently_owed) a) Identify all the functional dependencies F = { (first_name, last_name) → patient_insurance_number, amount_currently_owed; insurance_company_name → insurance_company_phone_number; (first_name, last_name), visit_date → which_tooth, symptoms, diagnosis, treatment_notes, material_used, fee, amount_billed }
Homework review Below is a table that a dentist users to keep track of her work on her patients' teeth: (first_name, last_name, insurance_company_name, patient_insurance_number, insurance_company_phone_number, visit_date, which_tooth, symptoms, diagnosis, treatment_notes, material_used, fee, amount_billed, amount_currently_owed) b) Identify the table's normal form As in the first problem, we could pick on the (first_name, last_name) primary key and say it’s 0NF , but we won’t. No repeated elements and no partial dependencies, so we are at least in 2NF . However, there are transitive dependencies, so can’t move onto 3NF .
Homework review Below is a table that a dentist users to keep track of her work on her patients' teeth: (first_name, last_name, insurance_company_name, patient_insurance_number, insurance_company_phone_number, visit_date, which_tooth, symptoms, diagnosis, treatment_notes, material_used, fee, amount_billed, amount_currently_owed) c) Convert the table into a database with all tables in 3NF patients(patient_id, first_name, last_name, insurance_id, patient_insurance_number, amount_currently_owed) insurances(insurance_id, company_name, company_phone_number) visits(visit_id, patient_id , which_tooth, symptoms, diagnosis, treatment_notes, material_used, fee, amount_billed)
Homework review Below is a table that a dentist users to keep track of her work on her patients' teeth: (first_name, last_name, insurance_company_name, patient_insurance_number, insurance_company_phone_number, visit_date, which_tooth, symptoms, diagnosis, treatment_notes, material_used, fee, amount_billed, amount_currently_owed) c) Convert the table into a database with all tables in 3NF (revised) patients(patient_id, first_name, last_name, amount_currently_owed) patient_insurances( patient_id , insurance_id, patient_insurance_number) insurances(insurance_id, company_name, company_phone_number) visits(visit_id, patient_id , which_tooth, symptoms, diagnosis, treatment_notes, material_used, fee, amount_billed)
Homework review Given a relation (table) R = (A, B, C, D, E) with F = {A → B, E → C, DE → A} a) what are all the candidate keys in this case? -- not A because it is determined by {C E} , not B because it is determined by A , not C because it is determined by E � -- not D alone cannot be the key because you need E , as well to determine A ( and the same goes for why not E alone) � -- only candidate key is {D E} in this case
Class exercise Given a relation (table) R = (A, B, C, D, E) with F = {A → B, E → C, DE → A} c) decompose the relation into additional relations until it is in 3NF -- Relation is 1NF because there is a composite primary key {D E} but there is also a partial dependency {E → C} � -- remove the partial dependency by decomposing to R1 = (C, E) ; R2 = (A, B, D, E) � -- remove the transitive dependency by decomposing further to R1 = (C, E) ; R21 = (A, D, E) ; R22 = (A, B)
Homework review Given a relation (table) R = (A, B, C, D, E) with F = {BC → ADE, D → B} a) Both {BC} and {CD} are candidate keys b) The relation R has no partial or transitive dependencies, so it is 3NF � c) Trick question; not automatically BCNF because two candidate keys, but they are overlapping composite keys. This is a rare edge-case that I personally have never seen in real life and would never ask on an exam.
Dealing with World Bank data • Looking at the raw data… • Lessons • Practical database design is not just looking functional dependencies of the fields, but looking at the data that would fill the fields. • Are repeating elements? Don’t forget that repeats can be the result of time. • Are there null values that we could prudently avoid? ( Remember that some nulls are normal and inevitable — we can’t fear them — but if the design could minimize them, all the better ) • Do we have primary keys that make sense and easy to identify?
Moving on …
Beyond 3NF • Introducing multi-valued dependencies In BCNF and below, everything revolved around functional dependencies A multi-valued dependency is one where certain tuples are required to exist in a relation; sometimes referred to as tuple-generating dependencies. • Fourth Normal Form (4NF) Multi-valued dependencies must be transformed into functional dependencies; implies that one value and not multiple values are dependent on a primary key Eliminate multiple sets of multiple valued or multi-valued dependencies, sometimes described as non-trivial multi-valued dependencies • Fifth Normal Form (5NF) Cyclical dependencies are removed; a cyclic dependency is simply something that depends on one thing, where that one thing is either directly in indirectly dependent on itself
Recommend
More recommend