database normalization
play

Database Normalization Asst. Prof. Dr. Kanda Runapongsa Saikaew - PDF document

Electricite Du Laos (EDL) Database Normalization Asst. Prof. Dr. Kanda Runapongsa Saikaew (krunapon@kku.ac.th) Department of Computer Engineering Khon Kaen University 1 Overview What and why normalization Background to normalization


  1. Electricite Du Laos (EDL) Database Normalization Asst. Prof. Dr. Kanda Runapongsa Saikaew (krunapon@kku.ac.th) Department of Computer Engineering Khon Kaen University 1 Overview  What and why normalization  Background to normalization  Functional dependency  Superkey  Candidate key  Primary key  Normal forms and their purposes  1NF  2NF 2  3NF What and why normalization  Normalization is a systematic way of ensuring that a database structure is suitable for general-purpose querying and free of certain undesirable characteristics — insertion, update, and deletion anomalies — that could lead to a loss of data integrity  Integrity constraints, in particular functional dependencies , can be used to identify schemas with such problems and to suggest refinements. 3 1 Dr. Kanda Runapongsa Saikaew, Computer Engineering, KKU

  2. Electricite Du Laos (EDL) Normalized tables  Informally, a relational database table is often described as "normalized" if it is in the third normal form (3NF)  Most 3NF tables are free of insertion, update, and deletion anomalies, i.e. in most cases 3NF tables adhere to BCNF, 4NF, and 5NF (but typically not 6NF) 4 When we apply normalization  A standard piece of database design guidance is that the designer should begin by fully normalizing the design  Selectively denormalize only in places where doing so is absolutely necessary to address performance issues  Some modeling disciplines, such as the dimensional modeling approach to data warehouse design, explicitly recommend non-normalized designs 5 Functional dependency  Functional dependency: Attribute B has a functional dependency on attribute B (i.e., A -> B)  If, for each value of attribute A, there is exactly one value of attribute B  If value of A is repeating in tuples then value of B will also repeat  In our example, branch address and branch phone number has a functional dependency on branch no. 6 2 Dr. Kanda Runapongsa Saikaew, Computer Engineering, KKU

  3. Electricite Du Laos (EDL) All instance that satisfies AB  C A B C D a1 b1 c1 d1 a1 b1 c1 d2 a1 b2 c2 d1 a2 b1 c3 d1 7 Trivial functional dependency  Trivial functional dependency  A trivial functional dependency is a functional dependency of attribute of a superset of itself  Examples  {Branch No, Branch Address} -> {Branch Address}  {Branch Address} -> {Branch Address} 8 Full functional dependency  An attribute is fully functional dependency on a set of attributes X if it is  Functionally dependent on X, and  Not functionally dependent on any proper subset of X  Example:  {Employee Address} has a functional dependency on {Employee ID, Skill}, but not a full functional dependency because it is also dependent on {Employee ID} 9 3 Dr. Kanda Runapongsa Saikaew, Computer Engineering, KKU

  4. Electricite Du Laos (EDL) Transitive dependency  Transitive dependency is an indirect functional dependency, one in which X → Z only by virtue of X → Y and Y → Z  Example:  Employee ID -> Employee office  Employee office -> Employee phone  Employee ID -> Employee phone 10 Superkey  Superkey  A superkey is an attribute or set of attributes that uniquely identifies rows within a table  Two distinct rows are always guaranteed to have distinct superkeys  Example  {Employee ID, Salary, Year} would be a superkey for the {Employee ID, Salary, Bonus, Year} table  {Employee ID, Year} would also be a 11 superkey Candidate key  Candidate key is a minimal superkey, that is, a superkey for which we can say that no proper subset of it is also a superkey  Example:  {Employee ID, Year} would be a candidate key for the table {Employee ID, Salary, Bonus, Year} table. 12 4 Dr. Kanda Runapongsa Saikaew, Computer Engineering, KKU

  5. Electricite Du Laos (EDL) Non-prime attribute  A non-prime attribute is an attribute that does not occur in any candidate key  Bonus would be a non-prime attribute in the "Employees' Bonus" table. 13 Primary key  Most DBMSs require a table to be defined as having a single unique key, rather than a number of possible unique keys  A primary key is a candidate key which the database designer has designated for this purpose 14 Example: Constraints on Entity Set  Consider relation obtained from Hourly_Emps:  Hourly_Emps ( ssn, name, lot, rating, hrly_wages , hrs_worked )  Notation : We will denote this relation schema by listing the attributes: SNLRWH  This is really the set of attributes {S,N,L,R,W,H}.  Sometimes, we will refer to all attributes of a relation by using the relation name. (e.g., Hourly_Emps for SNLRWH)  Some FDs on Hourly_Emps:  ssn is the key: S  SNLRWH  rating determines hrly_wages : R  W 15 5 Dr. Kanda Runapongsa Saikaew, Computer Engineering, KKU

  6. Electricite Du Laos (EDL) Data redundancy example1 (1/3) ssn name lot rating Hr_Wages Hr_Worked 1 A 48 8 10 40 2 B 22 8 10 30 3 C 35 5 7 30 4 D 35 5 7 32 5 E 35 8 10 40 16 Data redundancy example1 (2/3)  Redu dunda ndant nt Stora rage ge: The rating value 8 corresponds to the hourly wage 10, and this association is repeated three times  Upda date e Anoma malie lies: s: The hourly_wages in the first tuple could be updated without making a similar change in the second tuple 17 Data redundancy example1 (3/3)  Inserti rtion Anomalies: We cannot insert a tuple for an employee unless we know the hourly wage for the employee‟s rating value  Deletion A Anomalies: If we delete all tuples with a given rating value (e.g., we delete the tuples for C and D) we lose the association between the rating value and its hourly_wage value 18 6 Dr. Kanda Runapongsa Saikaew, Computer Engineering, KKU

  7. Electricite Du Laos (EDL) Decompositions (1/2)  Redundancy arises when a relational schema forces an association between attributes that is not natural.  Functional dependencies can be used to identify such situations and suggest refinement to the schema 19 Decompositions (2/2)  The essential idea is that many problems arising from redundancy can be addressed by replacing a relation with a collection of „smaller‟ relations  R consists of replacing the relation schema by two (or more) relation schemas that each contain a subset of attributes of R and together include all attributes in R 20 Examples  We can decompose Hourly_Emps into two relations:  Hourly_Emps2(ssn, name, lot, rating, hours_worked)  Wages(rating, hourly_wages)  This is more efficient than updating several tuples (as in the original design), and it eliminates the potential for inconsistency 21 7 Dr. Kanda Runapongsa Saikaew, Computer Engineering, KKU

  8. Electricite Du Laos (EDL) Data redundancy example2 Redundant Storage: the branchNo B001 and B002 corresponds to their branchAddress and telNo are repeated 22 Example2 Problem  Update Anomalies: If we change address of Tom Daniels, we also need to change address of Sally Adams  Insertion Anomalies: We cannot insert a tuple for a staff unless we know the address and phone number of the branch of that stuff  Deletion Anomalies: If we delete all tuples with a given branch value (e.g., we delete the tuples for branch B002) we lose the association between the branch and its address and phone number 23 Example2 Solution 24 8 Dr. Kanda Runapongsa Saikaew, Computer Engineering, KKU

  9. Electricite Du Laos (EDL) Normal Forms (1/2)  Given a relation schema, we need to decide whether it is a good design or we need to decompose it into smaller relations  To provide such guidance, several normal forms have been proposed 25 Normal Forms (2/2)  The normal forms based on FDs are first normal form (1NF), second normal form (2NF), and third normal form (3NF)  Every relation in 3NF is also in 2NF  Every relation in 2NF is also in 1NF 26 First Normal Form (1NF)  A relation is in first normal form if every field contains only atomic values, that is, no lists or sets  This requirement is implicit in our definition of the relation model  Table faithfully represents a relation and has no "repeating groups" 27 9 Dr. Kanda Runapongsa Saikaew, Computer Engineering, KKU

  10. Electricite Du Laos (EDL) Branch Table is not in 1NF 28 Converting Branch Table into 1NF 29 Second normal form (2NF)  2NF only applies to tables with composite primary keys.  No non-prime attribute in the table is functionally dependent on a part (proper subset) of a candidate key 30 10 Dr. Kanda Runapongsa Saikaew, Computer Engineering, KKU

  11. Electricite Du Laos (EDL) TempStaffAllocation Table is not in 2NF 31 Formal Definition of 2NF  Formal definition of 2NF is a table that is in 1NF and every non-prime attribute is fully functional dependent on the primary key  Full functional dependency indicates that if A and B are columns of a table, B is fully dependent on A if B is functionally dependent on A but not on any proper subset of A 32 Converting TempStaffAllocation table to 2NF 33 11 Dr. Kanda Runapongsa Saikaew, Computer Engineering, KKU

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend