Database Design and the Entity/Relationship Model Youve just been - - PowerPoint PPT Presentation
Database Design and the Entity/Relationship Model Youve just been - - PowerPoint PPT Presentation
Database Design and the Entity/Relationship Model Youve just been hired by Bank of America as their DBA for their online banking web site. You are asked to create a database that monitors: customers accounts loans
You’ve just been hired by Bank of America as
their DBA for their online banking web site.
You are asked to create a database that
monitors:
- customers
- accounts
- loans
- branches
- transactions, …
Now what??!!!
1.
Requirements Specification
- Determine the requirements of clients
2. Conceptual Design
- Express client requirements in terms of some high-level
model (E/R model).
- Confirm with clients that requirements are correct.
3. Functional Requirements
- Specify required data operations
- priorities, response times
4. Logical Design
- Convert E/R model to relational, object-based, XML-based,…
5. Physical Design
- Specify file organizations, build indexes
Conceptual Design
The E/R Data Model
What is a Data Model?
Framework for organizing and interpreting data
Example: E/R Data Model
Entity1 Entity2 Relationship Attribute1a Attribute1b Attribute2a Attribute2b Attribute2c
E/R Data Model
Basics
Entities
noun phrases (e.g., Bob Smith, Thayer St. Branch) contained in entity sets (e.g. Employee, Branch) have attributes (e.g., Employee = (essn, ename, …))
Relationships
verb phrases (e.g., works_at, works_for) relate 2 (binary) or more (n-ary) entities relationship sets characterize relationships amongst entity
sets
e.g., (Bob Smith, Thayer St Branch) Works_At
E/R Data Model
An Example
Employee Works_At
essn
Branch
ename phone children since seniority bname bcity
Works_For
manager worker
Entity Set Relationship Set Attribute Employee Works_At
phone
Lots of notation to come. Color is irrelevant
E/R Data Model
Types of Attributes
Employee Works_At
essn
Branch
ename phone children since seniority bname bcity
Works_For
manager worker
Default
ename children seniority
Multivalued Derived
E/R Data Model
Types of relationships
Employee Works_At
essn
Branch
ename phone children since seniority bname bcity
Works_For
manager worker
Many-to-One (n:1) Many-to-Many (n:m) Works_At Works_For
E/R Data Model
Recursive relationships
Employee Works_At
essn
Branch
ename phone children since seniority bname bcity
Works_For
manager worker
Employee
manager worker
Works_For Recursive relationships: Must be declared with roles
E/R Data Model
Design Issue #1: Entity Sets vs. Attributes
Employee
phone_no phone_loc
Employee
no loc
Phone Uses
vs
(a) (b) To resolve, determine how phones are used
1. Can many employees share a phone?
(If yes, then (b))
2. Can employees have multiple phones?
(if yes, then (b), or (a) with multivalued attributes)
3. Else (a)
An Example: How to model bank loans
E/R Data Model
Design Issue #2: Entity Sets vs. Relationship Sets
Customer
cssn cname
vs
(a) To resolve, determine how loans are issued
1. Can there be more than one customer per loan?
If yes, then (a). Otherwise, loan info must be replicated for each
customer (wasteful, potential update anomalies)
2. Is loan a noun or a verb?
Both, but more of a noun to a bank. (hence (a) probably more
appropriate)
Borrows Loan
lno amt
Customer
cssn cname
(b) Loans Branch
bname bcity lno amt
An Example:
E/R Data Model
Design Issue #3: Relationship Cardinalities
Customer Borrows Loan
? ?
Variations on Borrows:
1. Can a customer hold multiple loans? 2. Can a loan be jointly held by more than 1
customer?
E/R Data Model
Design Issue #3: Relationship Cardinalities
Customer Borrows Loan
? ?
Type Illustrated Multiple Loans? Joint Loans?
One-to-One (1:1) No No Many-to-one (n:1) No Yes One-to-many (1:n) Yes No Many-to-many (n:m) Yes Yes Cardinalities of Borrows:
Borrows Borrows Borrows Borrows
E/R Data Model
Design Issue #3: Relationship Cardinalities (cont)
In general...
1:1 n:1 1:n n:m
An Example: Works_At
E/R Data Model
Design Issue #4: N-ary vs Binary Relationship Sets
Employee Works_at Branch Dept Employee
WAE
Branch Dept WA
Binary: Ternary:
WAB WAD
vs
(Joe, Thayer, Acct) Works_At (Joe, w3) WAE (Thayer, w3) WAB (Acct, w3) WAD
Choose n-ary when possible!
Key = set of attributes identifying individual entities or relationships
E/R Data Model
Keys
Employee essn
ename eaddress ephone A. Superkey:
any attribute set that distinguishes identities
e.g., {essn}, {essn, ename, eaddress}
B. Candidate Key:
“minimal superkey” (can’t remove attributes and preserve “keyness”)
e.g., {essn}, {ename, eaddress}
C. Primary Key:
candidate key chosen as the key by a DBA
e.g., {essn} (denoted by underline)
E/R Data Model
Relationship Set Keys
Employee essn
ename ...
Works_At Branch bname
bcity
... Q: What attributes are needed to represent relationships in Works_At?
since
e1 e2 e3 b1 b2
A: {essn, bname, since}
E/R Data Model
Relationship Set Keys (cont.)
Q: What are the candidate keys of Works_At?
e1 e2 e3 b1 b2
A: {essn}
Employee essn
ename ...
Works_At Branch bname
bcity
...
since
E/R Data Model
Relationship Set Keys (cont.)
Q: What are the candidate keys if Works_At is...?
A: {essn, bname}
- b. n:m
- a. 1:n
- c. 1:1
A: {bname} A: {essn} or {bname}
Employee essn
ename ...
Works_At Branch bname
bcity
...
since ? ?
General Rules for Relationship Set Keys
E/R Data Model
Relationship Set Keys (cont.)
E1 P (E1)
...
R E2 P (E2)
...
If R is:
R 1:1 1:n n:1 n:m Candidate Keys P (E1) or P (E2) P (E2) P (E1) P (E1) P (E2)
Idea:
- Existence of one entity depends on another
Example: Loans and Loan Payments
E/R Data Model
Existence Dependencies and Weak Entity Sets
Loan lno
lamt
Loan_Pmt Payment pno
pdate pamt
Weak Entity Set Identifying Relationship Total Participation
E/R Data Model
Existence Dependencies and Weak Entity Sets
Weak Entity Sets
existence of payments depends upon loans
have no superkeys: different payment records (for different loans) can be identical
instead of keys, discriminators: discriminate
between payments for given loan (e.g., pno)
Loan lno
lamt
Loan_Pmt Payment pno
pdate pamt
E/R Data Model
Existence Dependencies and Weak Entity Sets
Identifying Relationships We say:
Loan is dominant in Loan_Pmt Payment is subordinate in Loan_Pmt Payment is existence dependent on Loan
Loan lno
lamt
Loan_Pmt Payment pno
pdate pamt
E/R Data Model
Existence Dependencies and Weak Entity Sets
All elements of Payment appear in
Loan_Pmt
Loan lno
lamt
Payment pno
pdate pamt
Loan_Pmt
E/R Data Model
Existence Dependencies and Weak Entity Sets
E1 attam E2 attb1 attbn Q. Is {attb1, …, attbn} a superkey of E2?
... ...
atta1
A: No
Q. Name a candidate key of E2
A: {atta1, attb1}
R Q. Does total participation of E2 in R E2 is
existence-dep?
A: No
E/R Data Model
Extensions to the Model: Specialization and Generalization
An Example:
Customers can have checking and savings accts Checking ~ Savings (many of the same attributes)
Old Way: Customer Has1 Savings Acct acct_no
balance interest
Has2 Checking Acct acct_no
balance
- verdraft
E/R Data Model
Extensions to the Model: Specialization and Generalization
Customer Has Account acct_no
balance
Checking Acct
- verdraft
interest
Isa
Savings Acct An Example:
Customers can have checking and savings accts Checking ~ Savings (many of the same attributes)
New Way:
superclass subclasses
E/R Data Model
Extensions to the Model: Specialization and Generalization
Subclass Distinctions:
1. User-Defined vs. Condition-Defined
User: Membership in subclasses explicitly determined
(e.g., Employee, Manager < Person)
Condition: Membership predicate associated with
subclasses - e.g: Person
Isa
Child Adult Senior age < 18 18 age age > 65
E/R Data Model
Extensions to the Model: Specialization and Generalization
Subclass Distinctions:
2. Overlapping vs. Disjoint
Overlapping: Entities can belong to >1 entity set
(e.g., Adult, Senior)
Disjoint: Entities belong to exactly 1 entity set
(e.g., Child) Person
Isa
Child Adult Senior age < 18 18 age age > 65
E/R Data Model
Extensions to the Model: Specialization and Generalization
Subclass Distinctions:
3. Total vs. Partial Membership
Total: Every entity of superclass belongs to a subclass
e.g.,
Partial: Some entities of superclass do not belong to any
subclass (e.g., Suppose Adult condition is age 21 ) Person
Isa
Child Adult Senior age < 18 age 18 age 65
E/R Data Model
Extensions to the Model: Aggregation
E/R: No relationships between relationships
E.g.: Associate loan officers with Borrows relationship set
Customer Loan Borrows Employee Loan_Officer
?
Associate Loan Officer with Loan?
What if we want a loan officer for every (customer, loan) pair?
E/R Data Model
Summary
Entities, Relationships (sets) Both can have attributes (simple, multivalued, derived,
composite)
Cardinality or relationship sets (1:1, n:1, n:m) Keys: superkeys, candidate keys, primary key
DBA chooses primary key for entity sets Automatically determined for relationship sets
Weak Entity Sets, Existence Dependence, Total/Partial
Participation
Specialization and Generalization (E/R + inheritance)
E/R Relational Schema
Entity Sets E = (a1, …, an)
E a1
an
…
E/R Relational Schema
Entity Sets E = (a1, …, an) Relationship Sets R = (a1, b1, c1, …, ck) a1: E1’s key b1: E2’s key c1, …, ck: attributes of R
E a1
an
…
E2
bm
E1 a1
an ck
R
… … …
a1
b1
c1
Not the whole story for Relationship Sets …
What about…
E2
bm
E1 a1
an ck
R
… … …
a1 b1
c1
Rule of Thumb Fewer tables good, as long as no redundancy
- a1 is a key for R
- a1 also a key for E1 = (a1, …, an)
Could have: R = (a1, b1, c1, …, ck) but…
- Ignore R
- Add b1, c1, …, ck to E1 instead (i.e: E1 = (a1, …, an,, b1, c1, …, ck))
Instead:
Relationship Cardinality Relational Schema n:m E1 = (a1, …, an) E2 = (b1, …, bm) R = (a1, b1, c1, …, cn)
E2
bm
E1 a1
an ck
R
… … …
a1 b1
c1
? ?
R
Relationship Cardinality Relational Schema n:m E1 = (a1, …, an) E2 = (b1, …, bm) R = (a1, b1, c1, …, cn) n:1 E1 = (a1, …, an, b1, c1, …, cn) E2 = (b1, …, bm)
E2
bm
E1 a1
an ck
R
… … …
a1 b1
c1
? ?
R R
Relationship Cardinality Relational Schema n:m E1 = (a1, …, an) E2 = (b1, …, bm) R = (a1, b1, c1, …, cn) n:1 E1 = (a1, …, an, b1, c1, …, cn) E2 = (b1, …, bm) 1:n E1 = (a1, …, an) E2 = (b1, …, bm,, a1, c1, …, cn)
R R R E2
bm
E1 a1
an ck
R
… … …
a1 b1
c1
? ?
Relationship Cardinality Relational Schema n:m E1 = (a1, …, an) E2 = (b1, …, bm) R = (a1, b1, c1, …, cn) n:1 E1 = (a1, …, an, b1, c1, …, cn) E2 = (b1, …, bm) 1:n E1 = (a1, …, an) E2 = (b1, …, bm,, a1, c1, …, cn) 1:1 Treat as n:1 or 1:n
R R R R E2
bm
E1 a1
an ck
R
… … …
a1 b1
c1
? ?
Acct-Branch Account acct_no
balance
Branch bname
bcity assets
Borrower Customer cname
ccity
Loan lno
amt cstreet
Depositor Loan-Branch
- Q. How many tables does this get translated into?
- A. 6 (account, branch, customer, loan, depositor, borrower)
- Q. What are the schemas?
Acct-Branch Account acct_no
balance
Branch bname
bcity assets
Borrower Customer cname
ccity
Loan lno
amt cstreet
Depositor Loan-Branch
Account bname acct_no balance Depositor cname acct_no Customer cname cstreet ccity Branch bname bcity assets Borrower cname lno Loan bname lno amt
Account bname acct_no balance
Downtown Mianus Perry R.H. Brighton Redwood Brighton A-101 A-215 A-102 A-305 A-201 A-222 A-217 500 700 400 350 900 700 750
Depositor cname acct_no
Johnson Smith Hayes Turner Johnson Jones Lindsay A-101 A-215 A-102 A-305 A-201 A-217 A-222
Customer cname cstreet ccity
Jones Smith Hayes Curry Lindsay Turner Williams Adams Johnson Glenn Brooks Green Main North Main North Park Putnam Nassau Spring Alma Sand Hill Senator Walnut Harrison Rye Harrison Rye Pittsfield Stanford Princeton Pittsfield Palo Alto Woodside Brooklyn Stanford
Branch bname bcity assets
Downtown Redwood Perry Mianus R.H. Pownel
- N. Town
Brighton Brooklyn Palo Alto Horseneck Horseneck Horseneck Bennington Rye Brooklyn 9M 2.1M 1.7M 0.4M 8M 0.3M 3.7M 7.1M
Borrower cname lno
Jones Smith Hayes Jackson Curry Smith Williams Adams L-17 L-23 L-15 L-14 L-93 L-11 L-17 L-16
Loan bname lno amt
Downtown Redwood Perry Downtown Mianus R.H. Perry L-17 L-23 L-15 L-14 L-93 L-11 L-16 1000 2000 1500 1500 500 900 1300
E/R Relational Schema
Weak Entity Sets
E1 = (a1, …, an) E2 = (a1, b1, …, bm)
E1 a1
an
E2 b1 … bm
…
IR
E/R Relational Schema
Multivalued Attributes
Emp = (ssn, name) Emp-Depts = (ssn, dept)
Emp
ssn name dept
Emp
ssn name 001 … Smith …
Emp-Depts
ssn dept 001 001 … Acct Sales …
E/R Relational Schema
Subclasses
Method 1: E = (a1, …, an) E1 = (a1, b1, …, bm) E2 = (a1, c1, …, ck)
E a1
an
E2
c1 b1
Isa
E1
… … bm … ck
E/Rb Relational Schema
Subclasses
Method 1: E = (a1, …, an) E1 = (a1, b1, …, bm) E2 = (a1, c1, …, ck) Method 2: E1 = (a1, …, an, b1, …, bm) E2 = (a1, …, an, c1, …, ck)
E a1
an
E2
c1 b1
Isa
E1
… … bm … ck
Subclasses example:
Method 1:
Account = (acct_no, balance) SAccount = (acct_no, interest) CAccount = (acct_no, overdraft)