Database Systems Database Systems 1 Creating a Database System - - PowerPoint PPT Presentation

database systems database systems
SMART_READER_LITE
LIVE PREVIEW

Database Systems Database Systems 1 Creating a Database System - - PowerPoint PPT Presentation

Database Systems Database Systems 1 Creating a Database System Design Construction Application Usage 2 Step 0 Analyze Analyze the domain Learn what things the system is intended to handle. Learn what the constraints of


slide-1
SLIDE 1

Database Systems Database Systems

1

slide-2
SLIDE 2

Creating a Database System

Design Construction Application Usage

2

slide-3
SLIDE 3

Step 0 – Analyze

  • Analyze the domain

– Learn what things the system is intended to handle. – Learn what the constraints of the domain are.

  • Analyze the system

– Learn what the system is intended to do, what

  • perations it should perform.
  • We’ve partly done part of this for you…

3

slide-4
SLIDE 4

Step 1 – Design

  • Design the database

– Create a schema that captures the constraints

  • f the domain.
  • E-R diagram, translated to a set of relation

schemas. schemas.

– Note any constraints that are not directly captured by the structure of the schema.

  • Value constraints.
  • Multiplicity constraints.
  • Broken dependencies.

4

slide-5
SLIDE 5

Course Objectives – Design

When the course is through, you should

– Given a domain, know how to design a database that correctly models the domain and its constraints.

”We want a database that we can use for scheduling courses and lectures. This is how it’s supposed to work: …”

5

slide-6
SLIDE 6

The Relational Data Model

  • Relations are sets of tuples.

– Tuples are sets of data. – All tuples in a relation have the same set of attributes with values. – A database schema is a set of interconnected relation – A database schema is a set of interconnected relation schemas.

  • Relations are a very simple model, mapping well

to how we tend to ”think” about data.

  • Most DBMS of today use a relational model.

6

slide-7
SLIDE 7

The Entity-Relationship approach

  • Design your database by drawing a picture
  • f it – an Entity-Relationship diagram.

– Allows us to sketch the design of a database informally (which is good when communi- informally (which is good when communi- cating with customers).

  • Use (more or less) mechanical methods to

convert your diagram to relations.

– This means that the diagram can be a formal specification as well.

7

slide-8
SLIDE 8

E-R diagram for Scheduler

code name Given

GivenCourse

teacher period #students

Course

In

Lecture

Of weekday time

Room

name #seats 8

slide-9
SLIDE 9

Translate to relations

Courses(code, name) GivenCourses(course, period, #students, teacher) course -> Courses.code Lectures(course, period, room, weekday, hour) (course, period) -> GivenCourses.(course, period) room -> Rooms.name Rooms(name, #seats)

9

slide-10
SLIDE 10

E-R summary

  • Entities, attributes
  • Relationships, multiplicity
  • Weak entities
  • Subclassing (ISA)
  • Translation to relations

10

slide-11
SLIDE 11

Step 1b – Verification

  • Once you have a schema, verify that it is correct

with respect to the constraints of the domain!

– Functional dependencies, independencies

  • For the constraints found in step 1, show that
  • For the constraints found in step 1, show that

your design captures the constraint.

– For those constraints that are not captured, argue why not.

  • Anyone can make a schema, not everyone can

make a correct schema…

11

slide-12
SLIDE 12

Functional dependencies (FDs)

  • X A

– ”X determines A”, ”X gives A” – ”A depends on X”

  • X is a set of attributes, A is a single
  • X is a set of attributes, A is a single

attribute.

  • Examples:

– code name – code, period teacher

12

slide-13
SLIDE 13

Quiz!

What’s the point of functional dependencies? Why are E-R diagrams not enough?

– E-R diagrams can be wrong.

  • Bad design – you’ve thought wrong when doing the diagram.
  • Bad design – you’ve thought wrong when doing the diagram.
  • Bad translation – something went wrong when translating to

relations.

– Dependencies are an extra safe-guard against errors, as well as a formal proof that your design is correct. – Dependencies represent ”extra” constraints that your diagram cannot handle.

13

slide-14
SLIDE 14

Anomalies

  • Redundancy

– Same course name stored several times. – Redundancy in a schema leads to possible anomalies. Courses(code, period, name, teacher)

  • Update anomaly

– If we update the course name, we must remember to update all tuples.

  • Deletion anomaly

– If a course is not scheduled to be given in any period this year, we lose its name!

14

slide-15
SLIDE 15

Using FDs to detect redundancy

  • Whenever X A holds for a relation R,

but X is not a key for R, then values of A will be redundantly repeated!

Courses(code, period, name, teacher)

{(’TDA356’, 2, ’Databases’, ’Niklas Broberg’), (’TDA356’, 4, ’Databases’, ’Rogardt Heldal’)}

code name code, period teacher

15

slide-16
SLIDE 16

Decomposition

  • Fix the problem by decomposing Courses:

– Create one relation with the attributes from the offending FD, in this case code and name.

Courses(code, period, name, teacher)

code name code, period teacher this case code and name. – Keep the original relation, but remove all attributes from the RHS

  • f the FD. Insert a reference from the LHS in this relation, to the

key in the first.

Courses(code, name) GivenCourses(code, period, teacher) code -> Courses.code

16

slide-17
SLIDE 17

Boyce-Codd Normal Form

  • A relation R is in Boyce-Codd Normal

Form (BCNF) if, whenever a nontrivial FD X A holds on R, X is a superkey of R.

– Remember: nontrivial means A is not part of X – Remember: nontrivial means A is not part of X – Remember: a superkey is any superset of a key (including the keys themselves).

Courses(code, name) GivenCourses(code, period, teacher)

17

slide-18
SLIDE 18

Quiz!

Why not use BCNF decomposition for designing database schemas? Why go via E-R diagrams?

– Decomposition doesn’t handle all situations

  • gracefully. E.g.
  • Self-relationships
  • Many-to-one vs. many-to-”exactly one”
  • Subclasses
  • Single-attribute entities

– E-R diagrams are graphical, hence easier to sell than some mathematical formulae.

18

slide-19
SLIDE 19

Other normal forms

3NF is a relaxation of BCNF

– Accept a relation even if there are dependencies X A where X is not a superkey, as long as A is in some key (prime). superkey, as long as A is in some key (prime).

  • 4NF is a strengthening of BCNF

– Requires us to handle not only FDs, but also independencies (INDs).

19

slide-20
SLIDE 20

Properties of normalization

  • BCNF decomposition guarantees

– Lossless join – No redundancy (in most cases)

  • 3NF decomposition guarantees
  • 3NF decomposition guarantees

– Lossless join – Dependencies are preserved

  • 4NF decomposition guarantees

– Lossless join – No redundancy

20

slide-21
SLIDE 21

Summary – dependencies and normal forms

  • Functional dependencies (FDs)

– X A, ”X determines A”, ”A depends on X” – Trivial FDs, computing closures (X+, F+), finding all implied FDs of a domain.

  • Independencies (INDs)

  • Independencies (INDs)

– X ↠ Y | Z, ”Y and Z are independent with respect to X” – X ↠ Y, ”Y is independent from everything else w.r.t X”

  • Normal Forms

– BCNF, 3NF, 4NF

  • Normalization algorithms

21

slide-22
SLIDE 22

Summary – Design

  • Model the domain as an E-R diagram.
  • Translate the diagram to a set of relation

schemas, forming a schema for the schemas, forming a schema for the database.

  • Verify using dependencies and normal

forms that your design is correct.

22

slide-23
SLIDE 23

Creating a Database System

Design Construction Application Usage

23

slide-24
SLIDE 24

Step 2 – Construction

  • Implement the database schema in a

DBMS.

– Relations become tables, attributes become columns. columns. – Constraints on relations.

  • Primary keys
  • References, foreign keys
  • Value constraints (checks)
  • General constraints (assertions)

24

slide-25
SLIDE 25

Course Objectives – Construction

When the course is through, you should

– Given a database schema with related constraints, implement the database in a constraints, implement the database in a relational (SQL) DBMS. – SQL Data Definition Language

25

slide-26
SLIDE 26

SQL Data Definition Language

  • CREATE TABLE, attributes
  • Constraints (name them!)

– PRIMARY KEY – FOREIGN KEY … REFERENCES CHECK – CHECK

  • Tuple-based
  • Attribute-based
  • Assertions, CREATE ASSERTION
  • Views, CREATE VIEW

– Persistent queries – Virtual tables

26

slide-27
SLIDE 27

Step 2b – Help constructs

  • Implement auxiliary functionality around

the core database (the tables) in order to simplify the application.

– Triggers to ensure consistency and simplify – Triggers to ensure consistency and simplify complex modifications. – Views to simplify complex queries (and modifications) and present a proper interface. – Privileges to ensure that things are done correctly.

27

slide-28
SLIDE 28

Triggers

  • Triggers are actions to perform when

some event occurs.

– Events: insertion, deletion, update – Condition: test whether to run the action or – Condition: test whether to run the action or not. – Action: The stuff to actually do.

A trigger is sometimes referred to as an ECA – Event-Condition-Action

28

slide-29
SLIDE 29

Basic trigger structure

CREATE TRIGGER name [BEFORE|AFTER] [INSERT|DELETE|UPDATE] ON tablename REFERENCING [NEW|OLD] [ROW|TABLE] AS variable FOR EACH [ROW|STATEMENT] WHEN condition WHEN condition action to perform

Decide whether to run the trigger or not. What should happen when the trigger is triggered.

29

slide-30
SLIDE 30

Views

  • Views can be seen in two ways:

– Persistent queries.

  • A view is a query that is given a name. The

”contents” of the view are fetched by the query from other tables. from other tables.

– Virtual tables.

  • A view is a table that has no data of its own, its

”contents” are stored in other tables.

  • A query on a view is translated to the query that

was used to define it.

  • An update on a view can be translated to updates
  • n base tables using special instead of triggers.

30

slide-31
SLIDE 31

Summary – Construction

  • Implement the database schema in a

DBMS.

  • Implement help constructs to make the
  • Implement help constructs to make the

application simpler and/or safer.

31

slide-32
SLIDE 32

Creating a Database System

Design Construction Application Usage

32

slide-33
SLIDE 33

Step 3 – Operate

  • Identify (and write) the queries needed by

the application.

– Exact queries might not be known until runtime, but the structure would probably be runtime, but the structure would probably be known.

  • Identify (and write) any modifications that

the application will need to perform.

33

slide-34
SLIDE 34

Course Objectives – Usage

When the course is through, you should

– Know how to query a database for relevant data using SQL. data using SQL. – Know how to change the contents of a database using SQL. – Relational Algebra – SQL Data Manipulation Language

34

slide-35
SLIDE 35

Relational Algebra

  • An algebra whose operands are relations

(or variables representing relations).

  • Operators representing the most common
  • perations on relations.
  • perations on relations.

– Selecting rows () – Projecting columns () – Composing (joining) relations

  • Cartesian product (x) (Cross product, product)
  • Join, natural join (⋈

⋈ ⋈ ⋈C, ⋈ ⋈ ⋈ ⋈)

35

slide-36
SLIDE 36

SELECT-FROM-WHERE

  • Basic structure of an SQL query:

SELECT attributes FROM tables WHERE tests over rows WHERE tests over rows SELECT X FROM T WHERE C

πX(σC(T))

36

slide-37
SLIDE 37

SELECT-FROM-WHERE- GROUPBY-HAVING-ORDERBY

  • Full structure of an SQL query:

SELECT attributes FROM tables WHERE tests over rows GROUP BY attributes

Only the SELECT and FROM clauses must be included.

GROUP BY attributes HAVING tests over groups ORDER BY attributes SELECT X, G FROM T WHERE C GROUP BY Y HAVING D ORDER BY Z;

Z’(πX,G’(σD’(Y,G’(σC(T)))))

X must be a subset of Y. Primes ’ mean we need some renaming.

Note the difference! Note the difference!

37

slide-38
SLIDE 38

Relations as sets

  • Relations are sets of tuples.
  • Set theory has plenty to borrow from:

– More operators:

  • ∪ – UNION

  • ∪ – UNION
  • ∩ – INTERSECT
  • ∖ – EXCEPT (MINUS)
  • Beware of bags vs sets!

– SQL uses bags, except for set operations. – DISTINCT (), ALL

38

slide-39
SLIDE 39

Summary SQL and Relational Algebra

  • SQL is based on

relational algebra.

– Operations over relations

  • SELECT-FROM-

WHERE-GROUPBY- HAVING-ORDERBY

– Grouping and aggregation

  • Grouping ()
  • SUM, AVG, MIN, MAX,

COUNT

– Set operations

  • Union (∪)

HAVING-ORDERBY

  • Operations for:

– Selection of rows () – Projection of columns () – Combining tables

  • Cartesian product (x)
  • Join, natural join, outer

join (⋈

⋈ ⋈ ⋈C, ⋈ ⋈ ⋈ ⋈, ⋈ ⋈ ⋈ ⋈)

  • Union ( )
  • Intersect (∩)
  • Set difference (∖)

– Miscellaneous

  • Renaming ()
  • Duplicate elimination ()
  • Sorting ()
  • Subqueries

– Sequencing – (Views)

˚

39

slide-40
SLIDE 40

SQL Modifications

  • Insertion

– INSERT INTO table (…) VALUES (…) – INSERT INTO table query

  • Deletion
  • Deletion

– DELETE FROM table WHERE test on rows

  • Updating

– UPDATE table SET attribute = value WHERE test on rows

40

slide-41
SLIDE 41

Summary – Usage

  • Identify (and write) the queries and

modifications that the application will need to perform.

  • …might not be needed when using a high-

level interface.

41

slide-42
SLIDE 42

Creating a Database System

Design Construction Application Usage

42

slide-43
SLIDE 43

Step 4 – Application

  • Write the external application, making use
  • f what you’ve done in the previous steps.

– Write the application in a ”real” programming language. – Communicate with the database via some special interface.

  • Ensure that your application is safe with

respect to concurrency and system crashes.

43

slide-44
SLIDE 44

Course Objectives – Applications

When the course is through, you should

– Know how to connect to and use a database from external applications. from external applications.

  • … using JDBC

44

slide-45
SLIDE 45

Two approaches

  • We have SQL for manipulating the
  • database. To be able to write ordinary

applications that use SQL, we can either

– Extend SQL with ”ordinary” programming – Extend SQL with ”ordinary” programming language constructs.

  • SQL/PSM, PL/SQL

– Extend an ordinary programming language to support database operations through SQL.

  • Embedded SQL, SQL/CLI (ODBC), JDBC, …

45

slide-46
SLIDE 46

SQL/PSM

  • Procedures, functions

– Parameters, local declarations – Functions return values – Calling (CALL procedures, use functions as values)

  • Assigning to variables
  • Assigning to variables

– SET – SELECT … INTO …

  • Cursors

– Declaring, fetching values

  • Control flow structures

– if, while, repeat, exceptions and handlers

46

slide-47
SLIDE 47

Embedded SQL

  • Write SQL inline in host language code.

– Prepend SQL with EXEC SQL

  • Shared variables.

– Prepend with colon in SQL code. – Prepend with colon in SQL code.

  • No inherent control structures!

– Uses control structures of the host language.

  • Compiled by a preprocessor into

procedure calls of the host language.

47

slide-48
SLIDE 48

JDBC

  • JDBC = Java DataBase Connectivity
  • JDBC is Java’s call-level interface to SQL

DBMS’s.

– A library with operations that give full access – A library with operations that give full access to relational databases, including:

  • Creating, dropping or altering tables, views, etc.
  • Modifying data in tables
  • Querying tables for information

48

slide-49
SLIDE 49

Objects in JDBC

  • DriverManager

– Register drivers, create connections.

  • Connection

– Create statements or prepared statements. – Close when finished.

  • Statement

– Execute queries or modifications.

  • PreparedStatement

– Execute a particular query or modification, possibly parametrized.

  • ResultSet

– Iterate through the result set of a query.

49

slide-50
SLIDE 50

ACID Transactions

  • DBMS must ensure that different

processes don’t interfere with each other!

– ”ACID”: Atomicity, Consistency, Isolation, Durability. – The isolation levels of transactions may vary.

  • Serializable
  • Read Committed
  • Read Uncommitted
  • Repeatable Read

– Isolation level affects only that transaction!

50

slide-51
SLIDE 51

Summary – Applications

  • Write the external application.
  • Communicate with the database.
  • Use transactions to safeguard against

interference and system crasches.

51

slide-52
SLIDE 52

Creating a Database System

0. Analyze 1. Design

  • Verify correctness

2. Construct

  • Construct help views and triggers
  • Construct help views and triggers

3. Operate 4. Write application Note that the order given here is artificial. 1 should come first, but the rest will probably be mixed, and typically depend on some larger-scale system development plan.

52

slide-53
SLIDE 53

Examination

  • Written exam: Mar 15 (Tue) 08:30-12:30, M

– 60 points (3/4/5 = 24/36/48, G/VG = 24/42) – Allowed aids: One handwritten A4 paper containing anything you like (on both sides), to be handed in together with the exam. – Divided into 7 distinct blocks:

  • E-R diagrams

(12)

  • FDs and Normal Forms

(12)

  • SQL DDL

(8)

  • SQL DML

(8)

  • Relational Algebra

(6)

  • Transactions

(6)

  • XML

(8)

53

slide-54
SLIDE 54

Non-standard Exam Structure

  • Each block will have two or three sections: A, B

and possibly C. Everyone is expected to know the A questions, while B and C questions are intended for those seeking higher grades.

  • You can only get points from one section within

each block!

– Less time spent on blocks that you know well. – Harder to get ”stray” points. – A’s give ~35, B’s ~40, C’s 60.

54

slide-55
SLIDE 55

Block 1 – E-R diagrams (12)

”A small train company wants to design a booking system for their customers. …”

  • Given the problem description above, construct an E-R

diagram. diagram.

  • Translate the E-R diagram into a database schema.

Note that all parts of E-R modelling belong in A! B&C: More complex domain, analyze a diagram, reason about benefits and drawbacks of different designs for the domain, …

55

slide-56
SLIDE 56

Block 2 – FDs and NFs (12)

”A car rental company has the following, not very success- ful, database. They want your help to improve it. …”

  • Identify all functional dependencies you expect to hold

in the domain.

  • Indicate which of those dependencies violate BCNF
  • Indicate which of those dependencies violate BCNF

with respect to the relations in the database.

  • Do a complete decomposition of the database so that

the resulting relations are in BCNF. B&C: More complex domain, decompose to 3NF or 4NF, argue about the meaning of dependencies, analyze benefits and drawbacks of different normal forms for the domain, …

56

slide-57
SLIDE 57

Block 3 – SQL DDL and Modifications (8)

”A grocery store wants a database to store information. After studying their domain you have come up with the following database schema. …”

  • Write SQL statements that create the relations as

tables in a DBMS, with all constraints. B&C: Write a trigger for something, argue about the effects

  • f some constraint, sketch a set of triggers that

together achieve some effect, more complex variations, argue about indexes or authorization,… You will not be required to write PSM code on the exam! But you may if you want to.

57

slide-58
SLIDE 58

Block 4 – SQL Queries (8)

”The grocery store wants your help in getting proper information from their database. …”

  • Write a number of SQL queries, given descriptions

about what they should return. about what they should return. Note that all parts of SQL belong in A! B&C: more complex queries, argue about the differences of queries, rewrite a query where you must use some specific construct, …

58

slide-59
SLIDE 59

Block 5 – Relational Algebra (6)

”Here is a schema for a database over persons and their

  • employments. …”
  • Explain (in words) what a given relational-algebraic

expression will return.

  • Translate a relational-algebraic expression to SQL.

Note that all operators of relational algebra belong in A! B&C: More complex expressions, write a relational- algebraic expression given a text description, translate an SQL query into relational algebra, …

59

slide-60
SLIDE 60

Block 6 – Transactions (6)

”Here are some transactions that run in parallel. …”

  • What will the end results of these transactions be, and

what could happen if they were not run as transactions? transactions?

  • What isolation level should this transaction run in to

ensure … ? B&C: Show a transaction that would interfere with this under isolation level …, argue about benefits and drawbacks of using a particular isolation level for some transaction, …

60

slide-61
SLIDE 61

Block 7 –XML (8)

”A medical research facility wants a database that uses a semi- structured model to represent different degrees of knowledge regarding the outbreak of epidemic diseases. …”

  • Suggest how to model this domain as a DTD.
  • Give an XML document that conforms to some DTD.
  • Given this DTD, what does this XPath/XQuery expression

compute?

B&C: More complex queries to interpret, write an XQuery expression that computes something, discuss the benefits of the semi- structured data model for this particular domain, …

61

slide-62
SLIDE 62

Things that will not appear

  • n the exam!
  • PSM, Embedded SQL
  • JDBC (or any other interface)
  • Calculations with indexes
  • Grant diagrams
  • Anything Oracle-specific

62

slide-63
SLIDE 63

Good Luck! Good Luck!

63