Databases Data Modeling Lectures for students of mathematics - - PowerPoint PPT Presentation

SLIDE 1

Databases Data Modeling Lectures for students of mathematics

Zbigniew Jurkiewicz

Institute of Informatics UW

March 26, 2017

SLIDE 2

“All models are wrong, but some are useful.” George Box

SLIDE 3

What is the best way to build applications?

The fact that a system works well should be nearly invisible to its users (unless it breaks). Examples:

  • bathroom
  • lift (elevator)

SLIDE 4

Phases of the project

1. Agree with the users on the requirements for the system to be built.
2. Describe the agreed requirements in the form of a specification.
3. Design the architecture of the system and the methods of its realization.
4. Implement the system and deploy it.

SLIDE 5

Data modeling

When modeling the requirements and describing typical use cases, analysts discover many different data objects. With the object-oriented approach, objects are modeled using class diagrams, usually in UML notation. In the database field, however, there is a simpler (and much older) approach based on Entity-Relationship Diagrams (ERD).

SLIDE 6

Entity-Relationship Diagrams (ERD)

Entity-Relationship Diagrams were first described by Bachman and Chen. They should describe the associations between stored data, i.e. data that cannot be derived from other stored data. Now they are used mostly for modeling databases during physical design.

Two main components:

  • entities, describing (in a simplified way) real objects of interest from the modeled domain;
  • relationships between entities.

SLIDE 7

Figure: Example ERD

SLIDE 8

Entities

Entities are used to model objects. An example:

Figure: Example entity

SLIDE 9

Relationships

Relationships connect two or more entities. Most often the relationships are binary. An example:

Figure: Example relationship

SLIDE 10

ERD — object-oriented extensions

Some entities are similar to subclasses in the object-oriented approach, e.g. the goods sold by a company (modeled as the entity Product) could be divided into hardware, software and office materials. Some relational database systems (e.g. PostgreSQL) can represent such hierarchies of tables, but the subclasses should be disjoint.


SLIDE 11

Hierarchical entities: realization

It is not enough to model entities with a hierarchy; we also have to decide how to implement them. The possibilities are:

  • one common table;
  • only separate tables for subtypes;
  • common table + tables for subentities.
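
The three realizations listed above can be sketched as table definitions. This is only an illustration under assumed names: the Product hierarchy with the subtypes hardware and software, the column names, and the helper `table_names` are all hypothetical, and SQLite (through Python's standard `sqlite3` module) stands in for whatever database system is actually used.

```python
import sqlite3

# Option 1: one common table; a discriminator column says which subtype a row is.
SINGLE_TABLE = """
CREATE TABLE product (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    kind TEXT NOT NULL CHECK (kind IN ('hardware', 'software')),
    weight_kg REAL,      -- used only by hardware rows
    license TEXT         -- used only by software rows
);
"""

# Option 2: only separate tables for subtypes; common columns are repeated.
SUBTYPE_TABLES_ONLY = """
CREATE TABLE hardware (id INTEGER PRIMARY KEY, name TEXT NOT NULL, weight_kg REAL);
CREATE TABLE software (id INTEGER PRIMARY KEY, name TEXT NOT NULL, license TEXT);
"""

# Option 3: common table plus one table per subentity, sharing the primary key.
COMMON_PLUS_SUBTABLES = """
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE hardware (
    id INTEGER PRIMARY KEY REFERENCES product(id),
    weight_kg REAL
);
CREATE TABLE software (
    id INTEGER PRIMARY KEY REFERENCES product(id),
    license TEXT
);
"""


def table_names(ddl):
    """Create the schema in a fresh in-memory database and list its tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(ddl)
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]
```

The first option avoids joins at the price of columns that are NULL for most rows; the third keeps common data in one place but needs a join to read a complete subtype row.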

SLIDE 12

Exclusive relationships

An example: an invoice is associated through such a relationship with either a company or a physical person (the customer).

Implementation:

  • two (or more) fields with foreign keys for the relationships + a consistency constraint: all of them except one must be NULL;
  • one common field with a foreign key (types must match) + an additional selecting attribute.
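
The first implementation variant (several foreign keys, of which exactly one is filled in) might look as follows. This is a minimal sketch, not a prescribed schema: the table and column names and the helper `try_insert` are hypothetical, and it relies on SQLite evaluating `x IS NULL` to 0 or 1 so that the two tests can be added up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE invoice (
    id INTEGER PRIMARY KEY,
    company_id INTEGER REFERENCES company(id),
    person_id  INTEGER REFERENCES person(id),
    -- consistency constraint: exactly one of the two foreign keys is NULL
    CHECK ((company_id IS NULL) + (person_id IS NULL) = 1)
);
INSERT INTO company VALUES (1, 'Acme');
INSERT INTO person  VALUES (1, 'Jan Kowalski');
""")


def try_insert(company_id, person_id):
    """Return True if the invoice row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute(
            "INSERT INTO invoice (company_id, person_id) VALUES (?, ?)",
            (company_id, person_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

An invoice for the company only or for the person only is accepted; an invoice with both customers, or with none, is rejected by the constraint.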

SLIDE 13

Database design

  • Discovering attributes and dependencies, grouping into “objects”.
  • Mapping entities and relationships into tables.
  • Normalization and denormalization.
  • Tuning the database, defining the access paths.

SLIDE 14

Database design

The two design activities that interest us most are the logical design of the database and the physical design of its implementation. Logical design is concerned with defining tables, determining the access paths for tables (e.g. indexes) and matching indexes to application needs.

Using views greatly simplifies the design of forms and reports, so views should be planned too.

Physical design should decide about the distribution of data into files and disks, the backup (archiving) and restore plans, and the integration with the mechanisms and tools of the operating system.
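
To make the logical-design notions above concrete, here is a small sketch of an index (an access path) and a view prepared for a report. The schema, the data and all names are hypothetical; SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE exam_result (
    student_id INTEGER REFERENCES student(id),
    course TEXT NOT NULL,
    grade REAL NOT NULL
);
-- access path matched to the join used by the report below
CREATE INDEX exam_result_by_student ON exam_result(student_id);
-- view planned for a report form: one row per student with the average grade
CREATE VIEW student_average AS
    SELECT s.name AS name, AVG(r.grade) AS average
    FROM student s JOIN exam_result r ON r.student_id = s.id
    GROUP BY s.id, s.name;
INSERT INTO student VALUES (1, 'Ala'), (2, 'Olek');
INSERT INTO exam_result VALUES
    (1, 'Algebra', 4.0), (1, 'Logic', 5.0), (2, 'Algebra', 3.0);
""")

# The report only needs to query the view, not the underlying join.
report = conn.execute(
    "SELECT name, average FROM student_average ORDER BY name"
).fetchall()
```

The report code stays the same if the underlying tables are later reorganized, as long as the view keeps its columns.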

SLIDE 15

UML: object-oriented approach to modeling

UML (Unified Modeling Language) was designed by Booch, Jacobson and Rumbaugh. In 1997 the Object Management Group accepted UML 1.1 as one of its industry standards.

UML covers two aspects of modeling:

  • static: the system structure;
  • dynamic: the system behavior.

SLIDE 16

Class diagrams

They serve to model the structure of the system. They are used during:

  • business modeling (for modeling objects in the domain of the system);
  • design (esp. database design), when more “technical” classes are introduced;
  • reverse engineering.

SLIDE 17

Class diagrams

The main ingredients are classes with names. The simplest representation of a class contains nothing else. Usually, however, in addition to its name a class also contains:

  • attributes;
  • methods.

SLIDE 18

Class diagrams: associations

Kinds:

  • Association
  • Aggregation and composition
  • Inheritance
  • Dependency
  • Refinement: used for realization or a more detailed description

SLIDE 19

Class diagrams: aggregations and compositions

A class may be connected with many superior classes through aggregation relationships, but with only a single superior class through a composition relationship.
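
The difference can be illustrated in code. The sketch below uses hypothetical classes: an `Engine` belongs to exactly one `Car` (composition), while the same `Student` object may be a part of several `Course` objects (aggregation). Python itself does not enforce either rule; the ownership is only a convention expressed by who creates and holds the part.

```python
class Engine:
    """Component owned by exactly one Car (composition)."""

    def __init__(self, power_kw):
        self.power_kw = power_kw


class Car:
    def __init__(self, power_kw):
        # Composition: the Car creates its Engine and is its only owner.
        self.engine = Engine(power_kw)


class Student:
    def __init__(self, name):
        self.name = name


class Course:
    """Aggregates students; one student may belong to many courses."""

    def __init__(self, title):
        self.title = title
        self.students = []

    def enroll(self, student):
        # Aggregation: the Course only refers to a Student created elsewhere.
        self.students.append(student)
```

Enrolling one `Student` in two courses leaves a single shared object referenced from both, which is exactly what composition would forbid.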

SLIDE 20

Dependencies

A dependency relationship between two classes means that the classes have no other association, but the dependent class uses an object of the other class, e.g. as a parameter in one of its methods, or creates it.
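
A minimal sketch of such a dependency, with hypothetical class names: `Printer` keeps no reference to `Report` among its attributes and only receives a `Report` as a method parameter.

```python
class Report:
    def __init__(self, title):
        self.title = title


class Printer:
    """Depends on Report: uses it as a parameter but stores no link to it."""

    def render(self, report):
        return f"*** {report.title} ***"
```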

SLIDE 21

Inheritance

Wrong inheritance example:

SLIDE 22

Refinement

SLIDE 23

Refinement

Refinement connects two descriptions of the same thing at different abstraction levels. When a refinement connects an abstract type with a class realizing it, it is called realization. It can be used for modeling different implementations of the same thing.
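
In a programming language, realization roughly corresponds to a concrete class implementing an abstract type. The sketch below assumes a hypothetical abstract type `Stack` and one possible realization, `ListStack`; other realizations (e.g. a linked one) could implement the same abstract type.

```python
from abc import ABC, abstractmethod


class Stack(ABC):
    """Abstract type: says what a stack does, not how."""

    @abstractmethod
    def push(self, item): ...

    @abstractmethod
    def pop(self): ...


class ListStack(Stack):
    """One realization of Stack, backed by a Python list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()
```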

SLIDE 24

When to use aggregation?

Are there any operations on the whole object, which are automatically applied to components?

SLIDE 25

When to select aggregation, and when inheritance?

If the type of an object (e.g. Student) can change (e.g. from undergraduate to graduate) without changing any other attributes, then using inheritance would force a class change! Only advanced programming languages allow it.
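
One way to avoid the class change is to keep the changeable "type" as a replaceable part of the object rather than as a subclass. The sketch below is only an illustration with hypothetical names; the part is reduced to an enum value held by the `Student`.

```python
from enum import Enum


class Level(Enum):
    UNDERGRADUATE = "undergraduate"
    GRADUATE = "graduate"


class Student:
    def __init__(self, name, level):
        self.name = name
        self.level = level  # kind of student stored as data, not as a subclass

    def promote(self):
        # The object keeps its identity and all its other attributes.
        self.level = Level.GRADUATE
```

With subclasses `Undergraduate` and `Graduate` the same promotion would require creating a new object and copying every attribute.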

SLIDE 26

Attributes and associations

In some situations there are doubts whether to use an attribute or an association. The general rule is as follows:

  • Attributes are used to connect objects with values. Values are any elements without identity, e.g. the number 1 (there is no concept of “this” number 1).
  • Associations connect objects with objects.

Values can easily be written out directly and read back. Objects are more trouble, because all connections to other objects have to be written out as well (and possibly those objects too, etc.).
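
The difference shows up as soon as an object is written out. In the hypothetical sketch below, `name` and `salary` are attributes holding values, while `department` is an association to another object, so writing out an `Employee` forces us to write out (part of) the `Department` as well. JSON is used here only as a convenient output format.

```python
import json


class Department:
    def __init__(self, name):
        self.name = name


class Employee:
    def __init__(self, name, salary, department):
        self.name = name              # attribute: a value
        self.salary = salary          # attribute: a value
        self.department = department  # association: another object


def dump(employee):
    """Write out the employee; the associated object must be written out too."""
    return json.dumps(
        {
            "name": employee.name,
            "salary": employee.salary,
            "department": {"name": employee.department.name},
        },
        sort_keys=True,
    )
```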

SLIDE 27

Modeling behavior

Modeling system behavior includes two things:

  • external behavior: communication of the system with its environment, seen from the external perspective; it is modeled by use cases;
  • internal behavior: the network of activities and dependencies between them that serves to realize the external behavior; it is described by interaction models.

Besides those two models there is the physical model, composed of two diagrams:

  • component diagram;
  • deployment diagram.

SLIDE 28

Use cases

This model is used mostly for specifying the requirements. Two main ingredients: actors and use cases. The use case model sets the boundaries of the system.

Usually these are the boundaries of the application being built, but use cases are also used for business modeling (so-called business use cases). These are different approaches to modeling with use cases and should not be mixed in a single model.

SLIDE 29

Modeling use cases

1. For each use case scenario, find the activities to be performed.
2. Determine the order of activities.
3. Assign the activities as methods to appropriate classes. Add arguments to pass data in messages.
4. Build sequence diagrams.
5. If some communicating classes are not connected, extend the class diagram with appropriate associations.
6. Build collaboration diagrams.

SLIDE 30

Use cases

  • Each use case should represent the realization of some real need of a user (actor). This should be reflected in the name of the use case, which should not describe the system's actions instead. Displaying results of exam is not a good name; a better one would be Browsing results of exam.
  • The use cases should match fragments of the user manual. If we use packages for grouping a large number of use cases, then for each package there should be a chapter in the user manual.
  • The scenarios (descriptions) of use cases should be written from the perspective of the user, not the system developer.
  • You should not create use cases grouped around “objects”, e.g. Processing information about students. They usually have too many actors and overly long specifications.

SLIDE 31

State diagram

State diagrams (or transition diagrams) are used for describing the history of classes. Technically they are generalized finite automata. Each state diagram contains a network composed of nodes representing class states and arcs representing transitions between states. Each state diagram describes the life history of one instance of a given class.

If the operations on objects of the class can be performed (or can occur) in any order, the state diagram is probably superfluous. However, when a class represents (reifies) some use case, process, interface manager etc., the order of steps becomes important, so some mechanism for receiving and serving events should be described.

SLIDE 32

State diagram

  • A state is an abstraction of some combinations of attribute values, such that all these combinations result in a qualitatively similar answer (reaction) to all events.
  • Transitions are labeled with conditions. When the condition is satisfied and the starting state of the transition is the current state, the transition is taken and the current state changes to its ending state.
  • Events can be used instead of conditions.
  • Additionally, control actions can be associated with a transition. The actions activate appropriate processes and are performed when the transition is taken.
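
The mechanics described above (current state, event-labeled transitions, actions performed when a transition is taken) can be sketched as a small table-driven automaton. The `Order` class, its states, events and action names are hypothetical and serve only as an illustration.

```python
# Hypothetical life cycle of an Order: (state, event) -> (next state, action)
TRANSITIONS = {
    ("new", "pay"): ("paid", "send receipt"),
    ("paid", "ship"): ("shipped", "notify customer"),
    ("new", "cancel"): ("cancelled", None),
}


class Order:
    def __init__(self):
        self.state = "new"  # the current state is kept in an attribute
        self.log = []       # actions performed when transitions were taken

    def handle(self, event):
        """Take the transition for this event if one starts in the current state."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            return False  # the event is ignored in this state
        self.state, action = TRANSITIONS[key]
        if action is not None:
            self.log.append(action)
        return True
```

An order that receives `ship` before `pay` stays in the state `new`; after `pay` and then `ship` it ends in `shipped`, with both actions recorded in order.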

SLIDE 33

State diagram

For more complicated behaviors we use nested networks, also called Harel diagrams. In such networks each transition from a composite state applies to all its internal substates. When modeling the behavior of an object we try to show two things: its state changes and its interactions with other objects. When constructing the state diagram for a class, remember that the class should have attributes or relationships whose changes realize the change of state. In other words, states are represented indirectly by the values of attributes and/or relationships.

SLIDE 34

Packages

Packages are a mechanism for dividing a system into subsystems (divide et impera). They serve to group any constructs (use cases, classes etc.) into larger units. Packages can also contain other packages, thus creating hierarchies of packages and subpackages.

SLIDE 35

Packages

The dependencies between packages are promoted from dependencies between their elements.

If a package A depends on a package B, this usually means that at least one element of package A uses the services offered by an element of package B. If there are many dependencies between two or more classes from different packages, it may point to the need to move one of them to a different package.

If a package offers many services divided into groups, one may model each group with a separate interface. The other packages then get dependencies on these interfaces. The same technique is used for components in the physical model.

SLIDE 36

Deployment diagram

The deployment diagram describes the material architecture of the system. It contains objects of two kinds: nodes and connections.

Nodes: physical objects containing some computational resources. A node can be a type (class) or an instance.

Figure: example nodes: Dell Pentium 466 (a type), yellow11: Dell Pentium 466 (an instance), <<Router>> Cisco Router X2000 (a node with a stereotype)

SLIDE 37

Deployment diagram

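Connections: the communication paths; the type of a connection may be given by a stereotype.

Figure: example deployment diagram with the nodes client A: Compaq Pro PC, client B: Compaq Pro PC, application server: Silicon Graphics O2 and database server: VAX; the connections carry the stereotypes <<TCP/IP>>, <<TCP/IP>> and <<DecNet>>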

SLIDE 38

Stereotypes

Stereotypes are used to define new kinds of modeling elements in terms of already existing elements. The name of a stereotype is surrounded by guillemets (angle quotation marks), e.g. <<Persistent>>, and placed close to the element name.

SLIDE 39

The limits of UML

UML was created as a notation for object-oriented modeling and it fits this role quite well. One should remember, however, that there are other approaches to building programs, and not try to overuse it. For example, in functional programming we often use higher-order functions (functionals) and generic functions, which have no counterparts in UML notation.
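
For readers unfamiliar with the term, a higher-order function is one that takes functions as arguments or returns them. The names below are hypothetical and the example is only meant to show the kind of construct that a class diagram has no natural place for.

```python
def compose(f, g):
    """Return the function x -> f(g(x))."""
    return lambda x: f(g(x))


def twice(f):
    """Return f applied two times in a row."""
    return compose(f, f)


def add_three(x):
    return x + 3
```

Here `twice(add_three)` is itself a function (adding 6), created at run time without any class being involved.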

SLIDE 40

Readability of diagrams

Most diagrams contain three basic kinds of elements:

  • “bubbles” (classes on class diagrams, use cases on use case diagrams);
  • lines (associations on class diagrams, transitions on state diagrams);
  • labels (the names of use cases, roles and association names on class diagrams).

SLIDE 41

Readability of diagrams

There are some guidelines common to most analytic and design diagrams (based on Scott W. Ambler’s book The Elements of UML Style).

  • Avoid crossing lines. If it is absolutely impossible to avoid it, one of the lines should pass over the other using a small “bridge” (cf. electronic diagrams).
  • Avoid diagonal and curved lines. It is easier to follow a straight line (horizontal or vertical). Placing bubbles on the grid (built into most CASE tools) makes it easier to connect bubbles with only horizontal and vertical lines.

SLIDE 42

Readability of diagrams

  • The sizes of bubbles should be logically consistent. A larger bubble looks more important, so if we do not need this effect we should use bubbles of the same size. Unfortunately, some tools automatically fit the size of a bubble to its contents (e.g. its name) and it is not easy to change this.
  • Use clear separation (empty space) between elements. With crowded bubbles and lines it is sometimes hard to determine which element a given label belongs to.
  • Try to place objects on diagrams from left to right and from top to bottom (unless you are in Japan ;-). The starting point of the diagram, e.g. the initial state on a state diagram or the actor initiating the use case on a use case diagram, should be placed more or less close to the top left corner.

SLIDE 43

Readability of diagrams

  • Show only what is important. Avoid too many details if they do not help with understanding the application. You need not show everything on a single diagram.
  • Avoid exotic notation, even if it is correct according to the standard. In most notations (including UML), 20% of the “basic notation” is enough to do 80% of the jobs. Use the remaining elements only when they are really necessary.

SLIDE 44

Readability of diagrams

  • Try to draw small diagrams. It is better to have a few diagrams showing different aspects of the model than one diagram with mixed information. The classic rule talks about 7 ± 2 elements, because psychology claims that this is the number of independent information units a person is able to process simultaneously.
  • Concentrate, especially during the initial steps, on the contents of the diagram. You can think about readability later if necessary: the diagram may turn out to be superfluous, or it may be divided into several smaller diagrams.
  • Use uniform notational conventions, if possible patterned after the vocabulary of the modeled domain, especially on diagrams which will be shown to final users.

SLIDE 45

CASE tools

By a CASE tool we understand an integrated environment for modeling, designing and building applications. The central part of such a tool is a common repository. The repository allows concurrent work on many projects, with common fragments shared between them. This applies to single objects as well as to whole models.

SLIDE 46

The architecture of a typical CASE tool

SLIDE 47

Capabilities of CASE tools

  • Drawing models (includes knowledge about the semantics of symbols and correctness rules)
  • Model repository connected with diagrams (e.g. when we change the name of a class, the change should be immediately visible on all diagrams)
  • Support for navigation between models
  • Cooperation of many users working on the same project
  • Code generation (usually only a framework), e.g. SQL
  • Reverse engineering
  • Integration with other tools
  • Models at several levels of abstraction
  • Model exchange with different tools (export and import)

SLIDE 48

CASE tools

For managing the repository such systems often use a relational database, e.g. SQL Anywhere (Sybase). Some of them support reverse engineering, i.e. inserting objects from already existing programs and databases into analytic and design models.

SLIDE 49

Extensions

The XMI (XML Metadata Interchange) specification proposes a standard for exchanging object-oriented data between programs. Additional extensions, such as also exchanging diagrams, are being considered.

SLIDE 50

Sources

The link www.omg.org/uml contains information about UML resources at OMG and other addresses. The link www.uml-forum.com contains (contained?) materials from working-group discussions on the further development of the UML standard, and other interesting links.
