NEBC Database Course 2008 Welcome Introduction to Databases Tim - - PowerPoint PPT Presentation

nebc database course 2008 welcome introduction to
SMART_READER_LITE
LIVE PREVIEW

NEBC Database Course 2008 Welcome Introduction to Databases Tim - - PowerPoint PPT Presentation

NEBC Database Course 2008 Welcome Introduction to Databases Tim Booth : tbooth@ceh.ac.uk Course Overview What you will take home: What is a relational database? Why would you use one? How do you interact with one? How do you


slide-1
SLIDE 1

NEBC Database Course 2008 Welcome Introduction to Databases

Tim Booth : tbooth@ceh.ac.uk

slide-2
SLIDE 2

Course Overview

What you will take home:

  • What is a relational database?
  • Why would you use one?
  • How do you interact with one?
  • How do you design and build one?
  • What is the theory behind good relational design?
  • What software is available?
  • How can I build user interfaces, or display my data on the web?
slide-3
SLIDE 3

What is a database?

  • Most general definition:

“A database is an integrated collection of data organised to meet the needs of one or more users”

  • Databases stored on computer may be of various types:
  • Flat file database such as a comma-

separated file

  • Indexed file database such as BLAST dbs
  • 'Flat' relational database such as

Access .mdb

  • Database managed by a dedicated

application (RDBMS) such as PostgreSQL

  • r Oracle.
slide-4
SLIDE 4

Benefits of a Relational Approach

  • Database structure reflects the structure of the data
  • Eliminate redundant data
  • Data validation
  • Powerful querying via SQL
  • Security, data integrity and multi-user access
  • Compatibility with a huge range of existing software and

development tools

slide-5
SLIDE 5

What is a Relational Database?

  • A relational database is a collection of tables. Each table has

rows and columns, and looks rather like a spreadsheet.

  • The movie table in the example database:
slide-6
SLIDE 6

The Example Database

  • The 'bighit' database represents a video store.
  • Each person has access to an individual copy of this database,

and we will explore it as the course progresses.

  • The next slide shows a selection of tables in the database, and

links between them.

slide-7
SLIDE 7

The Example Database

slide-8
SLIDE 8

More about database tables

  • Tables are properly called 'relations'.
  • A table holds a collection of records or 'tuples'. These are

shown as rows when you view the data.

  • Each record has a fixed set of fields or 'attributes', each of

which contains data of a specified type. These appear as columns in the data view.

  • Inside the database, the order of the rows and columns is not
  • significant. The database sorts them when you ask to display

the data.

  • A database can have many different tables, corresponding to

things we want to store data about.

  • Other entities can also live in your database – we shall meet

them later!

slide-9
SLIDE 9

Access the sample database

  • Each user has a private copy of the sample database. We will

log in now.

  • Start the pgAdmin III application from under System Tools in

the main menu.

  • File → Add Server
  • Select the server in the tree view, login with user/pass
  • In the tree browser, open 'databases' then 'student0?'.
  • Open the 'bighit' schema, then the 'tables' link.
  • Select the 'movie' table – note the definition which appears in

the right pane.

  • Right click and select View data → View all rows.
slide-10
SLIDE 10

Introducing SQL

  • SQL is a language we will use to get at the data in the database
  • Generally human readable
  • Retrieve information with SELECT statements
  • eg. SELECT * FROM MOVIE
  • A result set comes back.
slide-11
SLIDE 11

Bring up an SQL Editor Window

  • Click the pencil icon to get a new SQL editor window.
  • In the new window, use the eraser icon to clear any text

already in the input box.

slide-12
SLIDE 12

Talking SQL

  • Type the query shown (1) and hit the green arrow (2) to run it.

2 2 1 1

slide-13
SLIDE 13

More about SQL

  • SQL is the Structured Query Language
  • The language and structure of databases are intimately linked.
  • Sometimes pronounced 'sequel' – especially by Microsoft

devotees.

  • Invented by Dr. Edgar Frank 'Ted' Codd (1924-2003), who was

from Oxford, and set down the foundations of the relational database concept while working in the IBM San Jose Research Laboratory.

  • The classic paper 'A relational model of data for large shared

data banks' appeared in the ACM (Association for Computing Machinery) magazine, June 1970

slide-14
SLIDE 14

Being More Specific

  • The general form of basic SELECT statements:

SELECT title, year FROM movie WHERE year > 1990 ORDER BY year SELECT WHAT FROM WHICH TABLE WHERE SOME TEST ORDER BY WHICH COLUMN

  • The last two parts are both optional.
  • SELECT * is a special case to get all columns.
slide-15
SLIDE 15

Modifying Data

  • INSERT, DELETE and UPDATE are the commands to modify

data. DELETE FROM reservation where accountid = 23 DELETE FROM TABLE WHERE CONDITION INSERT INTO reservation (accountid, movieid, datereserved) VALUES (23, 15, '12 June') INSERT INTO TABLE (COLUMNS) VALUES (VALUES)

slide-16
SLIDE 16

Modifying Data 2

  • INSERT, DELETE and UPDATE are the commands to modify

data. UPDATE movie SET genre = 'fantasy', rating = 'PG' WHERE id = 15 UPDATE TABLE SET COLUMN1 = VALUE1, COLUMN2 = VALUE2 WHERE CONDITION

slide-17
SLIDE 17

Getting Exactly What You Want

  • In a RDB you try not to store redundant or derived data.
  • For example, we store the year of release of a movie, but not

the age.

  • We can specify a calculation within the SELECT statement to

calculate the age of a movie on-the-fly. A) SELECT title, year from movie B) SELECT title, extract(YEAR FROM now()) - year FROM movie

  • 'extract' and 'now' are functions.
slide-18
SLIDE 18

Name That Column

  • Column aliases allow you to specify the name of a column

retrieved with SELECT.

  • Running the previous example, we see that the second column

gets called '?column?', because the database has no idea what to call the result of the formula.

  • We can simply say:

SELECT title, extract(YEAR FROM now()) - year as age FROM movie

slide-19
SLIDE 19

Learning by Example

  • SQL allows you to derive data by supplying a variety of

functions and constructs.

  • Rather than just talking through them, we will introduce some

by example in the exercises.

slide-20
SLIDE 20

Relational Databases in Biology

  • Some databases, such as the NCBI Taxonomy, can be

downloaded and set up on a local machine. ftp://ftp.ncbi.nih.gov/pub/taxonomy/

  • Ensembl permits direct access to the core database, which we

will look at later. The document describing how to get access is here: http://cvsweb.sanger.ac.uk/cgi−bin/cvsweb.cgi/ ~checkout~/ensembl/docs/tutorial/ensembl_tutorial.pdf

  • For large datasets, deploying an RDBMS can provide fast,

flexible access to the data. Most data-portal websites have such a database at their heart.

slide-21
SLIDE 21

Credits

  • The Bighit database is adapted from the website of Greg

Riccardi - http://riccardi.pearsoncmg.com/dbmgmt/.

  • Examples and quotes from Database Design and Management,

by Rob Eadie (Continuum, 1998).

  • Programming the Perl DBI, Alligator Descartes & Tim Bunce,

(O'Reilly 2000)

  • Course originally delivered in 2004; developed by Tim Booth,

Joe Wood, Bela Tiwari and Dan Swan at the EGTDC.