DM534: Introduction to Relational Databases 2019 Slides by - - PowerPoint PPT Presentation

dm534 introduction to relational databases
SMART_READER_LITE
LIVE PREVIEW

DM534: Introduction to Relational Databases 2019 Slides by - - PowerPoint PPT Presentation

DM534: Introduction to Relational Databases 2019 Slides by Christian Wiwie (edits by Rolf Fagerberg) Some perspectjve ... What are Databases? Repository for large data amounts Describes a logical structure of contained data


slide-1
SLIDE 1

DM534: Introduction to Relational Databases

2019 Slides by Christian Wiwie (edits by Rolf Fagerberg)

slide-2
SLIDE 2

Nov 19, 2019 2

What are Databases?

  • Repository for large data amounts
  • Describes a logical structure of contained data
  • Guarantees data integrity by enforcing constraints
  • Allows for effjcient access
  • Consistent and safe storage

DB “SDU” Some perspectjve ...

slide-3
SLIDE 3

Nov 19, 2019 3

Database Management System (DBMS)

  • A DBMS manages databases
  • Access to database only via DBMS

... DB “SDU” DB “...” DB “DTU” User App Administrator DBMS

slide-4
SLIDE 4

Nov 19, 2019 4

Why learn about Databases?

  • Used almost everywhere
  • Crucial for safety & integrity of stored data
  • Jobs exist dealing specifjcally with databases
  • Increasingly relevant

– Technical advances → More & larger data amounts

slide-5
SLIDE 5

Nov 19, 2019 5

Where are Databases used?

  • Wherever large amounts of data are managed
  • Ofuen multjple DBMS in use that cater specifjc needs
  • Google uses Bigtable for web indexing, Google Maps, ...
  • Facebook uses MySQL; TAO for graph search, …
  • Other applicatjons

– Corporate data: payrolls, inventory, sales, customers, ... – Web search: Google, Live, Yahoo, ... – Social networks: Facebook, Twituer, ... – Scientjfjc and medical databases

slide-6
SLIDE 6

Nov 19, 2019 6

Features of a modern DBMS

  • Highly effjcient access to stored data using indexes
  • Backup/log mechanisms ensure data safety
  • Security policies to manage access permissions
  • Data consistency: Can enforce complex data

constraints, including dependencies

  • Flexible searching, sortjng, fjltering
  • Ensures all the above with simultaneous multj-user

access

slide-7
SLIDE 7

Nov 19, 2019 7

Databases vs. storage in fjles

  • File storage does not provide most of these features

→ Structure and constraints need to be imposed manually

  • Complex operatjons

– not trivial to do right → Error prone – are slow, e.g. searching, sortjng

slide-8
SLIDE 8

Nov 19, 2019 8

Types of DBMS / databases

  • Data can be modeled and organized difgerently
  • Optjmized for specifjc kinds of operatjons
  • Relatjonal DBMS (RDBMS) / databases the most wide-

spread

– Based on mathematjcal relatjons – Basically, a database is a collectjon of relatjons – e.g. MySQL, PostgreSQL, ...

  • Graph DBMS / databases

– Data is a network, with entjtjes and connectjons between them – e.g. neo4j

slide-9
SLIDE 9

Nov 19, 2019 9

DBMS type popularity

slide-10
SLIDE 10

Nov 19, 2019 10

Most widely used DBMS

  • Ranking of most widely used DBMS

Source: htups://db-engines.com/en/ranking

slide-11
SLIDE 11

Nov 19, 2019 11

Internal Structure of a Database

  • Multjple levels of

abstractjon

  • Higher levels

independent of lower levels

  • Sofuware independent
  • f how data is logically

and physically structured and stored Database

View 1 View 2 View 3 Conceptual Data Model Physical Data Model

App App

Logical Data Model

slide-12
SLIDE 12

Nov 19, 2019 12

View 1 View 2 View 3 Conceptual Data Model Physical Data Model

App App

Logical Data Model

Internal Structure of a Database

We will be looking at this part Database

slide-13
SLIDE 13

Nov 19, 2019 13

Conceptual Data Model

  • Semantjcs of stored data
  • Which entjtjes (concepts) are stored?
  • Which relatjonships exist between entjtjes?
  • Independent of DBMS type and specifjc DBMS used

From here on we are exam relevant

slide-14
SLIDE 14

Nov 19, 2019 14

Conceptual Data Model

Most widely used conceptual model: Entjty-Relatjonship (ER) diagrams Cardinality: How many entjtjes are involved in a relatjonship?

Entjty Type 1 Entjty Type 2 Cardinality Cardinality Relatjonship Type Aturibute 1 Aturibute 2 Aturibute 1 Aturibute 2

slide-15
SLIDE 15

Nov 19, 2019 15

Conceptual Data Model

  • Example Cardinalitjes
  • Read:

– One person owns one or more cars

Person Car

1..n

Owns Name Age Color Brand

slide-16
SLIDE 16

Nov 19, 2019 16

Conceptual Data Model

  • Example Cardinalitjes
  • Read:

– One car is owned by exactly one person

→ Constraints do not necessarily hold in reality (joint

  • wnership)

Person Car

1

Owns Name Age Color Brand

slide-17
SLIDE 17

Nov 19, 2019 17

Logical Data Model

  • Usually derived from conceptual data model
  • Expressed in terms of data structures specifjc to

type of DBMS

– Relatjonal DBMS: relatjonal (logical) data model – Graph DBMS: a graph structure

  • But: Stjll independent of specifjc DBMS used
slide-18
SLIDE 18

Nov 19, 2019 18

Relatjonal (Logical) Data Model

  • Main structural concept: relatjons

– Basically a table with rows and columns

  • A relatjon has a relatjon schema

– Specifjes structure of data that can be stored in

relatjon

  • relatjons != relatjonship

– Relatjonship is part of conceptual data model – A relatjon can hold data for entjtjes or relatjonships

We focus on relatjonal DBMS

slide-19
SLIDE 19

Nov 19, 2019 19

Relatjonal (Logical) Data Model

  • A relatjon schema consists of:

– a name – a set of aturibute names – Optjonally: aturibute types

relatjon_name(aturibute1, aturibute2, …) or relatjon_name(aturibute1: type1, aturibute2: type2, …)

slide-20
SLIDE 20

Nov 19, 2019 20

Relatjon Schemas

  • A relatjon usually corresponds to

– Real world entjty types (e.g. car, person, …) – Real world relatjonship types (e.g. person owns

car)

  • Example relatjon schemas:

– Car(color, brand) – Person(name: CHAR(20),age: INTEGER) – Owns(name, age, color, brand)

slide-21
SLIDE 21

Nov 19, 2019 21

Relatjon Instances

  • A relatjon or relatjon schema does not specify which

data is stored

  • A relatjon instance is a realizatjon of a relatjon with

data

– Data must conform to relatjon’s schema

  • Many relatjon instances can exist for the same

relatjon

slide-22
SLIDE 22

Nov 19, 2019 22

Tuples

  • A data entry in a relatjon instance is called tuple
  • A tuple is a realizatjon of the relatjon’s schema

– Assigns values to the atuributes of the relatjon – Must conform to relatjon schema

slide-23
SLIDE 23

Nov 19, 2019 23

Tuples

  • Example tuples of the relatjon Car(color, brand):

– (‘red’, ‘Ford’) – (‘blue’, ‘Mercedes’)

  • Example tuples of the relatjon Person(name, age):

– (‘Henry’, 36) – (‘Thomas’, 22)

slide-24
SLIDE 24

Nov 19, 2019 24

Relatjon Instances

  • Can be visualized by a table:

aturibute1 attribute2 ... ... ... Name

}

Aturibute / Column

{

Tuple / Row

Relatjon Instance / Table

slide-25
SLIDE 25

Nov 19, 2019 25

Relatjon Instance

  • Example relatjon instance of the person relatjon

name age ‘Henry’ 36 ‘Thomas’ 22 Person

slide-26
SLIDE 26

Nov 19, 2019 26

Database Instance

  • A database instance is the collectjon of all its

relatjon instances

– i.e. all relatjon schemas and their corresponding tuples

name age ‘Henry’ 36 ‘Thomas’ 22 Person color brand ‘red’ ‘Ford’ ‘blue’ ‘Mercedes’ Car ...

slide-27
SLIDE 27

Nov 19, 2019 27

From ER Diagrams to Relatjons

Standard translatjon:

  • Each entjty is converted directly to a relatjon (same

atuributes and keys).

  • Each relatjonship is converted to a relatjon with

atuributes consistjng of the keys of its related entjtjes plus its own atuributes (if any). More on keys later.

Person Car Owns pID Age cID Brand Age Age Age Date Name

slide-28
SLIDE 28

Nov 19, 2019 28

From ER Diagrams to Relatjons

Example: Person(pID: INTEGER, Name: CHAR(20)) Car(cID: INTEGER, Brand: CHAR(20)) Owns(pID: INTEGER, cID: INTEGER, Date: CHAR(10))

Person Car Owns pID Age cID Brand Age Age Age Date Name

slide-29
SLIDE 29

Nov 19, 2019 29

Integrity Constraints (ICs)

slide-30
SLIDE 30

Nov 19, 2019 30

Integrity Constraints (ICs)

  • Conditjon that must be true for any database

instance

  • Specifjed when relatjon schemas are defjned
  • Checked whenever relatjon instances are modifjed

– i.e., when tuple is added, deleted, or modifjed

slide-31
SLIDE 31

Nov 19, 2019 31

Domain constraints

  • Domain of valid values for an aturibute

– e.g., INTEGER, FLOAT, CHAR(20), … – correspond to data types in programming languages

  • Example relatjon schema:

Person(name: CHAR(20),age: INTEGER) → DBMS will not allow insertjon of this tuple

name age ‘Henry’ 36 ‘Mads’ ‘Doe’ Domain constraint violatjon

slide-32
SLIDE 32

Nov 19, 2019 32

Semantjc integrity constraints

  • Semantjc restrictjons on the data

– e.g., age >= 18

  • Example relatjon schema:

Person(name: CHAR(20),age: INTEGER) → DBMS will not allow insertjon of this tuple

name age ‘Henry’ 36 ‘Mads’ 16 Constraint violatjon

slide-33
SLIDE 33

Nov 19, 2019 33

Primary Keys

  • Set of relatjon atuributes

– that uniquely identjfjes tuples of relatjon – all tuples need to have unique values for these atuributes

  • Example: CPR is primary key of relatjon Person

→ There cannot be two tuples with same CPR number

CPR Name Birthday Address ... ... ... ... 1904651243 Svensson 19.04.1965 ... ... ... ... ... 1904651243 ... ... ... ... ... ... ... Not allowed

slide-34
SLIDE 34

Nov 19, 2019 34

Primary Keys

  • Primary key “points” to exactly one tuple

→ can be used to lookup corresponding tuple → e.g., person can be looked up using CPR

CPR Name Birthday Address ... ... ... ... 1904651243 Svensson 19.04.1965 ... ... ... ... ... What is the name of the person with CPR=1904651243 ?

slide-35
SLIDE 35

Nov 19, 2019 35

Foreign Keys

  • Allow to associate tuples in difgerent relatjons
  • Tuple of source relatjon → tuple of target relatjon

– Source and target relatjon can be the same – Can only point to a primary key in the target relatjon

slide-36
SLIDE 36

Nov 19, 2019 36

Example: University Database

Relatjon schemas:

– Students(sid: string, name: string, login: string, age: integer,

gpa:real)

– Courses(cid: string, cname:string, credits:integer) – Enrolled(sid:string, cid:string, grade:string) Student Course 1..n 1..n Enrolled name sid login age grade cid cname credits gpa

slide-37
SLIDE 37

Nov 19, 2019 37

Example: Foreign Keys

slide-38
SLIDE 38

Nov 19, 2019 38

Query Languages

slide-39
SLIDE 39

Nov 19, 2019 39

Query Languages

  • Allow manipulatjon and retrieval of data from a

database

  • Query languages != programming languages
  • not expected to be “turing complete”

→ i.e., not every operatjon can be expressed

  • not intended to be used for complex calculatjons
  • support easy, effjcient access to large data sets
slide-40
SLIDE 40

Nov 19, 2019 40

Relatjonal Query Languages

  • Based on relatjonal algebra
  • For relatjonal databases, i.e. relatjonal data model
  • Relatjonal model supports simple, powerful QLs:

– Strong formal foundatjon based on logic – Allows for much optjmizatjon

  • SQL: Most widely used relatjonal query language

→ Understanding Relatjonal Algebra is key to understanding SQL, query processing!

slide-41
SLIDE 41

Nov 19, 2019 41

Relatjonal Query Languages

More on relatjonal query languages and relatjonal algebra at next lecture.