Design Theory for Relational Databases
Thomas Schwarz, SJ
Design Theory for Relational Databases Thomas Schwarz, SJ Contents - - PowerPoint PPT Presentation
Design Theory for Relational Databases Thomas Schwarz, SJ Contents There are many ways a database scheme can be constructed A poorly designed scheme: Has problems with checking constraints Has problems with data coherence E.g.
Thomas Schwarz, SJ
constructed
first name
R(๐) A1, A2, โฆAn โถ B1, B2, โฆ, Bm A1, โฆ, An, B1, โฆ, Bm โ ๐ A1, โฆ, An B1, B2, โฆ, Bm
is equivalent to all of:
A1, A2, โฆAn โถ B1 A1, A2, โฆAn โถ B2 โฎ A1, A2, โฆAn โถ Bm
studioName, starName)
starName
โ
determines all attributes
Ai1, Ai2, โฆAim โj : Ai1, Ai2, โฆAim โถ Aj
attribute the key instead of the set with only element this attribute
and FDs and ,
?
R(A, B, C) A โ B B โ C A โ C
instance satisfying all FDs in T also satisfies all FDs in S
satisfying one is equal to the set of relation instances satisfying the other one.
S T
A1, A2, โฆAn โถ B1 A1, A2, โฆAn โถ B2 โฎ A1, A2, โฆAn โถ Bm
A1, A2, โฆAn โถ B1 A1, A2, โฆAn โถ B2 โฎ A1, A2, โฆAn โถ Bm A1, A2, โฆAn โถ B1, B2, โฆ, Bm
are those of the that are not among the
Ai โ Ai A, B โ A A1A2โฆAn โ B1B2โฆBm A1A2โฆAn โ C1C2โฆCr Ci Bi Ai
be a set of attributes
is the set
that every relation that satisfies all the FDs in also satisfies .
๐ ๐น = {A1, A2, โฆ, An} ๐น ๐น+ B ๐ A1, A2, โฆ, An โ B
and a set of functional dependencies .
2.Set to be . 3.Repeatedly search for some FD such that and . Then add to 4.Stop when the search fails and output .
๐น ๐ ๐น+ ๐ ๐ ๐น B1, B2, โฆ, Bm โ C โ ๐ B1, B2, โฆ, Bm โ ๐น C โ ๐น C ๐ ๐ = ๐น+
, G, H, I, J, K, L, M, M} and the set of functional dependencies {{E, F} -> {G}, {F} -> {I, J}, {E, H} -> {K, L}, K -> {M}, L -> {N} on R. What is the key for R?:
,H}
,H,K,L}
K -> {M}, L -> {N}
.
{{E, F} โ G, {F} โ I, {F} โ J, {E, H} โ K, {E, H} โ L, K โ M, L โ N}
{E}
.
.
.
{{E, F} โ G, {F} โ I, {F} โ J, {E, H} โ K, {E, H} โ L, K โ M, L โ N}
{E, F} = ๐ ๐ ๐ ๐ {E, F}+ = {E, F, G, I, J}
because of (1).
because of (2):
because of (3):
contains all the attributes.
, is a minimal candidate key and therefore a key.
{{E, F} โ G, {F} โ I, {F} โ J, {E, H} โ K, {E, H} โ L, K โ M, L โ N} {E, F, H} = ๐ ๐ ๐ ๐ = {E, F, G, H, I} ๐ ๐ = {E, F, G, H, I, J} ๐ = {E, F, G, H, I, J, K} ๐ = {E, F, G, H, I, J, K, L} ๐ = {E, F, G, H, I, J, K, L, M} ๐ = {E, F, G, H, I, J, K, L, M, N} {E, F, H}+ {F, H}+ = {F, H, I, J} {E, F, H}
B โ {A1, A2, โฆ, An}+ ๐ ๐ A1A2โฆAn โ B
leads to an FD that is not true.
follow from the in all relations,
is also equal in all and hence equal in .
has to be true and we have a contradiction
B โ {A1, A2, โฆ, An}+ ๐ ๐ A1A2โฆAn โ B ๐ A1A2โฆAn โ B B X1, X2, โฆXm โ B X1, X2, โฆXm A1A2โฆAn A1A2โฆAn โ X1, โฆXm A1A2โฆAn X1, โฆXm B A1A2โฆAn โ B
with regards to , but holds in all relations that also fulfill .
A1A2โฆAn โ B B โ {A1, A2, โฆ, An}+ ๐ B โ {A1, A2, โฆ, An}+ ๐ A1A2โฆAn โ B ๐
{A1 A2 ... An}+ every thing else 0 0 0 0 0 ... 0 0 0 0 1 1 ... 1
in is violated
need to be on the left side, i.e. in and the
side of the table.
should have been in
๐ C1C2โฆCr โ D ๐ Ci {A1, A2, โฆ, An}+ D D {A1, A2, โฆ, An}+
{A1 A2 ... An}+ every thing else 0 0 0 0 0 ... 0 0 0 0 1 1 ... 1
A1A2โฆAn โ B
{A1 A2 ... An}+ every thing else 0 0 0 0 0 ... 0 0 0 0 1 1 ... 1
proof
and then
R A1, A2, โฆ, Am โ B1, B2, โฆ, Bn B1, B2, โฆ, Bn โ C1, C2, โฆCt A1, A2, โฆ, Am โ C1, C2, โฆCt
that describe a relation
relation can be derived from the basis
:
have singleton right sides
is no longer a basis
we drop an attribute from the right side, then the result is no longer a basis
๐บ ๐บ ๐บ ๐บ
determines the other attributes
including the trivial ones , and
plus all augmentations
A โ B, A โ C A โ B, C A โ A, B A โ A, C A โ A, B, C B โ A, B โ C C โ A C โ B
right side singleton
{A โ B, A โ C, B โ A, B โ C, C โ A, C โ B} A โ B A โ C & C โ B {A โ C, B โ A, B โ C, C โ A, C โ B}
{A โ C, B โ A, B โ C, C โ A, C โ B} B โ A {A โ C, B โ C, C โ A, C โ B}
?
R ๐ L R ฯL(R) L ๐
calculate the closure in
is a FD calculated this way and , add the FD to
removed
๐ M โ L M+ R M โ X X โ L ๐ ๐ ๐ ๐
with projected on
๐ = {A โ B, B โ C, C โ D} L = {A, C, D} {A}+ = {A, B, C, D} {B}+ = {B, C, D} {C}+ = {C, D} {D}+ = {D}
attributes on the left would follow trivially
attributes are in
follows from the
L ๐ = {A โ C, A โ D, C โ D} A โ D ๐ = {A โ C, C โ D}
for every star
we need to repeat this update operation for every star or we get incoherent information
cartoon, we have no information left on the movie!
movies = (title, year, length, genre, studioName, starName)
movies = (title, year, length, genre, studioName, starName)
movies1=(title, year, length, genre, studioName) movies2=(title, year, studioName)
then is a superkey
A1โฆAn โ B A1โฆAn
title, year --> studio
title, year
{title, year}+ = {title, year, length, genre, studio}
movies(title, year, length, genre, studio) starsIn(title, year, star)
information multiple times
is in BCNF
BNCF .
,
are keys. So, BCNF
R(A, B) A โ B, B โ A R A โ B, B โ A A A โ B B โ B A โ B, B โ A A, B
information?
โจ
A1A2โฆAn โ B1โฆBm A1โฆAn
prod(title, year, studio, president, presAddr) title year -->studio studio --> president president --> presAddr
title, year
studio --> president president --> presAddr studio --> president
studio --> president {studio}+ = {president, presAddr}
(title, year, studio) (studio, president, presAddr)
title, year -->studio studio --> president, president --> presAddr
(president, presAddr)
that violates BCNF
as one relation and as the
X โ Y X+ X+ X โช โ(X+) X X+
the FDs given)
the FDs given)
R(A, B, C, D); AB โ C; BC โ D; CD โ A; AD โ B
with FD , where is not a key
and
is a tuple. It is projected as and
.
and , i.e.
because is a projection.
. )
there is only one value for
R(A, B, C) B โ C B R1(A, B) R2(B, C) t = (a, b, c) t1 = (a, b) t2 = (b, c) t โ R1 โ R2 t1 = (a, b) โ R1 t2 = (b, c) โ R2 t โ R1 โ R2 (a, b, x) โ R R1 (a, y, c) โ R B โ C x (a, b, x = c)
the BCNF decomposition algorithm cannot loose information
A, B
bookings(title, theater, city) theater --> city title, city --> theater title, city title, theater
but without the FDs
and
R(A, B, C) B โ A, B โ C R1(A, B) R2(B, C)
A B C 1 2 3 4 2 5
A B C 1 2 3 4 2 5 A B 1 2 4 2 B C 2 3 2 5
A B 1 2 4 2 B C 2 3 2 5 A B C 1 2 3 4 2 3 1 2 5 4 2 5
, because it only has two attributes
derived
(theater, city) (theater, title) title, city --> theater
Theater City AMC Wauwatosa Marcus 1 Milwaukee Marcus 2 Wauwatosa Theater Title Marcus 2 Doolittle AMC Doolittle
with FD has a lossless join into and
, the join is not loss-less
attributes :
?
R(A, B, C) B โ C R(A, B) R(B, C) B โ C B โ A R S1, S2, โฆSn Si ฯS1(R) โ ฯS2(R) โ โฆ โ ฯSn(R) = R
in which we project is not important.
tuple could appear in the projections
ฯS1(R) โ ฯS2(R) โ โฆ โ ฯSn(R) โ R t โ R
decomposed relationship
the decomposed relationship
relationship
with projections on , and
is then represented in the decomposition tableau
R(A, B, C, D) S1 = {A, D} S2 = {A, C} S3 = {B, C, D} S1 โ S2 โ S3
A B C D
a b1 c1 d a b2 c d2 a3 b c d
D
tuple has certain a and d values, but the join might give some values for the b and c column
A B C D
a b1 c1 d a b2 c d2 a3 b c d
downโ identities between the elements in the tableau.
dropping them
,
attributes on the right side, we can use it to equalize the entries for attributes on the right of an FD
A โ B B โ C CD โ A
A B C D
a b1 c1 d a b2 c d2 a3 b c d
:
A โ B
A B C D
a b1 c1 d a b2 c d2 a3 b c d
A B C D
a b1 c1 d a b1 c d2 a3 b c d
A B C D
a b1 c1 d a b1 c d2 a3 b c d
A B C D
a b1 c d a b1 c d2 a3 b c d
A B C D
a b1 c d a b1 c d2 a3 b c d
A B C D
a b1 c d a b1 c d2 a b c d
t
with unsubscripted variables?
and decomposition into
R(A, B, C, D) B โ AD {A, B}, {B, C}, {C, D}
and decomposition into
R(A, B, C, D) B โ AD {A, B}, {B, C}, {C, D}
A B C D
c1 a b d1 a2 b c d2 a3 b3 c d
and decomposition into
R(A, B, C, D) B โ AD {A, B}, {B, C}, {C, D}
A B C D
c1 a b d1 a2 b c d2 a3 b3 c d
A B C D
c1 a b d1 a b c d1 a3 b3 c d
example
, , in .
(a, b, c1, d1) (a, b, c, d1) (a3, b3, c, d) R B โ CD
A B C D
c1 a b d1 a b c d1 a3 b3 c d
A B
a b a3 b3
B C
b c1 b b3 c c
C D
c1 d1 c c d1 d
A B
a b a3 b3
B C
b c1 b b3 c c
C D
c1 d1 c c d1 d
A B C D
a b c1 d1 a b c d1 a b c d a3 b3 c d1 a3 b3 c d
be decomposed into , , . Assume FDs , , . Is the decomposition lossless?
R(A, B, C, D, E) {A, B, C} {B, C, D} {A, C, E} A โ D CD โ E E โ D
be decomposed into , , . Assume FDs , , . Is the decomposition lossless?
R(A, B, C, D, E) {A, B, C} {B, C, D} {A, C, E} A โ D CD โ E E โ D
A B C D E
a b c d1 e1 a2 b c d e2 a b3 c d3 e
be decomposed into , , . Assume FDs , , . Is the decomposition lossless?
R(A, B, C, D, E) {A, B, C} {B, C, D} {A, C, E} A โ D CD โ E E โ D A โ D
A B C D E
a b c d1 e1 a2 b c d e2 a b3 c d1 e
A B C D E
a b c d1 e1 a2 b c d e2 a b3 c d3 e
be decomposed into , , . Assume FDs , , . Is the decomposition lossless?
R(A, B, C, D, E) {A, B, C} {B, C, D} {A, C, E} A โ D CD โ E E โ D CD โ E
A B C D E
a b c d1 e a2 b c d e2 a b3 c d1 e
A B C D E
a b c d1 e1 a2 b c d e2 a b3 c d1 e
be decomposed into , , . Assume FDs , , . Is the decomposition lossless?
, ,
R(A, B, C, D, E) {A, B, C} {B, C, D} {A, C, E} A โ D CD โ E E โ D E โ D A โ D CD โ E
A B C D E
a b c d1 e a2 b c d e2 a b3 c d1 e
B C D C A C E
A B C D E
a b c d1 e a2 b c d e2 a b3 c d1 e b c d1
A B C
a b c a2 b c a b3 c b c d b3 c d1 a c e a2 a c c e2 e
B C D C A C E A B C D E
a b c d1 e a2 b c d e2 a b3 c d1 e b c d1
A B C
a b c a2 b c a b3 c b c d b3 c d1 a c e a2 c e2
A B C D E
a b c d1 e e a b c d e a2 b c d1 e2 a2 b c d e2 a b3 c d1 e2 a b3 c d1
is a non-trivial FD, then
is a superkey
not in are each member of some key (not necessarily the same)
R A1A2โฆAn โ B1B2โฆBm {A1, A2, โฆ, An} {B1, B2, โฆ, Bn} {A1, A2, โฆ, An}
bookings(title, theater, city) theater --> city title, city --> theater
such that
R
for
: use as a schema
are a superkey for , add another relation whose schema is a key for R
R ๐พ ๐ฟ ๐พ X โ A โ ๐ฟ XA R
, ,
R(A, B, C, D, E) AB โ C C โ B A โ D
, , follows from the other two
AB โ C C โ B A โ D
,
and
, ,
S1(A, B, C) S2(B, C) S3(A, D) R A, B, E A, C, E S1(A, B, C) S2(B, C) S3(A, D) S4(A, C, E)
decomposition that is a superkey .
is all the attributes.
๐ ๐
.
.
๐ ๐ โ ๐น โ B
rest of attributes r,s,t, e, f, b1 ** r,s t1 e1 f1 b **
๐น ๐ โ ๐น B
row row FD
๐
it, we loose the subscript in the column corresponding to the right side
row
B
decomposition
consequence get all the FDs
relation is by definition in third normal form
basis:
, then the basis is not minimal
using many relations
A B C 1 2 {3,4} 4 5 {3,4} A B C 1 2 3 1 3 4 4 5 3 4 5 4
lecturers are independent.
calc 1 | Lang | Krenz calc 1 | Ross | Sanders calc 1 | Lang | Sanders calc 2 | Ash | Gillen calc 2 | Ash | Engbers calc 1 | Ross | Schwarz calc 1 | Lang | Schwarz
calc 1 | Ross | Krenz calc 1 | Lang | Krenz calc 1 | Ross | Sanders calc 1 | Lang | Sanders calc 2 | Ash | Gillen calc 2 | Ash | Engbers calc 1 | Ross | Schwarz calc 1 | Lang | Schwarz calc1 | {Ross, Lang} | {Krenz, Sanders, Schwarz} calc2 | {Ash} | {Gillen, Engbers}
dependency
exist in the relation.
and in
and in the other attributes
and also exist
A1, A2, โฆ, An โ B1, โฆBm A1, A2, โฆ, An b1โฆbm bโฒ
1โฆbโฒ m
B1, B2, โฆ, Bm x1โฆxr xโฒ
1โฆxโฒ r
a1โฆanbโฒ
1โฆbโฒ mx1โฆxr
aโฒ
1โฆaโฒ nb1โฆbmxโฒ 1โฆxโฒ r
:
and Bs
A1, A2, โฆ, An A1, A2, โฆ, An B1, B2, โฆ, Bm
and course โ book course โ lecturer
calc 1 | Ross | Krenz calc 1 | Lang | Krenz calc 1 | Ross | Sanders calc 1 | Lang | Sanders calc 2 | Ash | Gillen calc 2 | Ash | Engbers calc 1 | Ross | Schwarz calc 1 | Lang | Schwarz
movies
then
A1โฆAn โ B1โฆBm
implies
A-attributes
A1โฆAn โ B1โฆBm B1โฆBm โ C1โฆCk A1โฆAn โ C1โฆCk
street, city
street is not true.
โ โ
and are the attributes not in the As and Bs, then
A1โฆAn โ B1โฆBm C1โฆCk A1โฆAn โ C1โฆCk
is a non-trivial MVD
is a super-key
A1โฆAn โ B1โฆBm A1โฆAn
BCNF 3NF
โ
3NF BCNF 4NF eliminate redundancies due to FDs no yes yes eliminates redundancies due to MVDs no no yes preserves FDs yes no no preserves MVDs no no no lossless joins yes yes yes