Introduction - - PowerPoint PPT Presentation

introduction slide 229
SMART_READER_LITE
LIVE PREVIEW

Introduction - - PowerPoint PPT Presentation

Introduction slide 229 Fonctional Dependencies slide 233 Relation decompositions slide 246


slide-1
SLIDE 1

Normal Forms 1

  • Normalisation of relational schemas
  • Introduction

slide 229

Fonctional Dependencies

slide 233

Relation decompositions

slide 246

Normal Forms

slide 255

  • Bad design => data redonduncy

Update mistakes (duplicate values) insertion mistakes (null values,

unconsistencies)

Deletion mistakes (loss of informations)

Introduce a concept of « good" schema

(without duplicates)

Allow to compare two schemas

  • ORDER table

PRODUCT QUANTITY COLOR VENDOR ADDRESS Umbrella 110 red Labaleine Paris Hat 50 green Lemelon Lyon Bag 65 black Toutcuir Lyon parasol 15 yellow Labaleine Paris Umbrella 5 red Labaleine Paris ceinture 25 green Letour Nantes Bag 65 black Legrand Paris

slide-2
SLIDE 2

Normal Forms 2

  • Study data properties

Functional dependencies Multivalued dependencies Product dependencies, ...

Normal Forms

partial order on schemas

decomposition / synthesis algorithms to get

3rd normal form schemas

  • !

Universal Relation FD name

  • n

pn city wn expdate

  • date

Normalisation lastname expqty oqty R1(.....) R2(....) ......

" #

$ %!&

Property defined on the schema

Specific case of integrity constraint defined on intension (so valid for all possible

extensions)

Definition

B depends functionaly from A if, for a given value

  • f A, a unique value of B is corresponding (for all

extensions)

A and B are attributes sets

Notation A → B

slide-3
SLIDE 3

Normal Forms 3

'

  • PERSONS(pn, name, lastname, city)

ORDERS(on, odate, wn, oqty, nb) EXPEDITIONS(on, expdate, expqty)

(

!

PN → NAME PN → LASTNAME PN → CITY ON → ODATE ON → PN ON → WN ON → OQTY ON, EXPDATE → EXPQTY NAME → CITY ? PN → WN ? OQTY → EXPQTY ?

  • Reflexivity

Y ⊂ X

X → Y

Augmentation

X → Y

XZ → YZ

Transitivity

X → Y and Y → Z

X → Z

  • ! )

Union

X → Y and X → Z

X → YZ

Pseudo-transitivity

X → Y and YW → Z XW → Z

Decomposition

X → Y and Z ⊂ Y

X → Z

slide-4
SLIDE 4

Normal Forms 4

  • *+

Transitive closure of F (a set of DF) is

denoted by F+

F+ = F U DF produced using axioms For example

ON → PN and PN → NAME so ON → NAME ON → NAME so ON, WN → NAME, WN mainly transitivity and pseudo-transitivity

  • Nodes = attributes

Edges = FD

  • n

pn

  • date wn
  • qty

name lastname city expdate expqty

  • *+
  • n

pn

  • date wn
  • qty

name lastname city expdate expqty

  • $ $

a functional dependency X → A is

elementary if

A is not included into X It does not exist X’ included into X so that X’→ A

Allow to simplify the process of transitive

closure (if not it is always possible to create new FDs using augmentation)

Example:

PN → NAME PN, WN → NAME not EFD

slide-5
SLIDE 5

Normal Forms 5

"

,+

Definition

Minimal subset of EFD allowing to produce all

  • ther ones

Example

(pn → name; pn → lastname; pn → city;

  • n → odate; on → pn; on → wn; on → oqty;
  • n, expdate → expqty)

Theorem

Any FD set has (at least) one minimal coverture

#

  • $

Definition

Minimal set of attributes allowing to determine

all other ones

R(A1, A2, ..., An) a relation schema. F+ the

set of FDs associated to R. X (subset of R) is a key for R iff:

X → A1, A2, ..., An It does not exist Y subset of X such as Y → A1,

A2, ..., An

'

  • $ %&

Example

  • n, expdate is a key for the example schema

Remarks

A relation may have several keys A relation has always one key (in the worst case

the entire schema)

(

!

slide-6
SLIDE 6

Normal Forms 6

  • !

Decomposition of a relation schema

The decomposition of a relation schema R(A1,

A2, ..., An) is its substitution by a set of relation schemas R1, R2, ..., Rp such as:

schema(R) = schema(R1) ∪ schema(R2) ∪ ...

schema(Rp)

Criterias of good decomposition

Decomposition without loss of informations Decomposition preserving FD

"

! .

  • the decomposition of a relation schema R(A1,

A2, ..., An) by a set of relation schemas R1, R2, ..., Rp is without loss of information iff:

R = R1

R2 ... Rp

"

! + !

R(A1, A2, ..., An) and FDR (associated set of

FDs)

decomposition of R in R1, R2, ..., Rp (with

FDR1, ...FDRn resp. set of FDs of R1, ..., Rn) preserves FDs iff:

FDR

+ = FDR1 + U ... U FDRn +

"

ORDERS(on, odate, wn, oqty, pn, name,

lastnale, city)

O(on, odate, wn, oqty, pn) and P(name,

lastnale, city): loss of info.

O(on, odate, wn, oqty, pn) and P(pn, name,

lastnale, city): no loss of info and FDs preservation

O(on, pn) and P(on, odate, wn, oqty, name,

lastname, city): no loss of info but loss of FDs (pn → name for example)

slide-7
SLIDE 7

Normal Forms 7

"

) $

R(X,Y,Z) and X → Y R(X,Y,Z)=R1(X,Y) R2(X,Z)

It is always possible to decompose a relation

schema using a FD

It is not possible to decompose a schema relation

without FD

FD decomposition preserves information

"

!

ORDERS(on, odate, wn, oqty, pn, name, lastame, city) pn → name; pn → lastname; pn → city;

  • n → odate; on → pn; on → wn; on → oqty;

ORDERS(on, odate, wn, oqty, pn, name, lastame, city) O1(on, odate, wn, oqty) C2(on, pn) O(on, pn, name, lastname, city) C3(pn, name, lastname, city)

  • n → odate, wn, oqty

pn → name, lastname, city

""

!

Ensures that decomposition is without loss of

information (uses the binary decomposition principle)

Does not ensure FDs preservations Decomposition is not unique (depends on the

  • rder of used FDs during decomposition)

"#

ORDERS(on, odate, wn, oqty, pn, name, lastame, city) C2(on, odate, wn, oqty, pn)

C1(pn, name, lastname, city) pn → name, lastname, city

slide-8
SLIDE 8

Normal Forms 8

"'

/

"(

/

First normal form Second normal form Third normal form ...

"

Definition

A relation is in first normal form if all its attributes

are atomics (definition of relational data model)

An atomic attribute is not:

multivalued (list of values) structured (composed by sub-attributes)

#

Definition

A relation is in second normal form:

It is in first normal form All non key attributes plenary depend from keys

Example

O(on, expdate, expqty, pn) not in 2NF because

  • n, expdate key and on → pn (pn does not

plenary depend from on, expdate)

slide-9
SLIDE 9

Normal Forms 9

#

*

Goal: eliminate duplicates produced by

transitivity on fonctional dependencies

Definition

A relation is in third normal form iff:

It is 2NF It does not exist FD among non key attributes

#

1.

R(A1, ..., An) and FDR associated set of FDs

compute FDR + determine key(s) of R Split attributes between key attributes

(belonging at least to one key) and non key attributes

Apply normal form definitions (starting from

1NF)

#

)

Any relation R has at least one

decomposition in 3NF preserving information and FDs

Two possible approaches:

decomposition: the decomposition process

stops when derived relations are in 3NF (but no preservation of FDs)

synthesis from minimal coverture

#

0$ % /&

  • inputs: universal relation + associated set of FDs
  • algorithm principle:

1.

From G graph of FDs, compute C minimal coverture

2.

Put all isolated attributes in the same relation (all are keys)

3.

Search the greatest set of attributes X deriving other attributes

4.

define relation R(X, A1, ..., An)

5.

delete FDs X → A1, .., X→ An from the graph of C minimal coverture

6.

Delete isolated attributes from C

7.

Restart algorithm form step 3 until C is empty

slide-10
SLIDE 10

Normal Forms 10

#"

/

Table WINE(NAME, STATE, COUNTRY)

Chenas, France, Beaujolais Juliénas, France, Beaujolais Chablis, France, Bourgogne Chablis, USA, Californie

With following FDs: country → state; name, state → country WINE is in 3NF, but there are duplicates (in this example can be handle by Boyce Codd Kent 3NF)

##

2

Allow to refine a schema design Can be used with a classical design

methodology such as E/R design

major cons: suppose that we are able to

get a minimal coverture of FDs (if a single FD is missing, all derivations are wrong)

Other normal forms with other types of

dependencies

3NF objective may be antagonist with

efficiency (denormalisation)