Simple SQL Queries (1) Review Collecting data Storing data into - - PDF document
Simple SQL Queries (1) Review Collecting data Storing data into - - PDF document
Simple SQL Queries (1) Review Collecting data Storing data into databases Create relational tables in SQL Server 2005 Insert tuples into tables Specifying some integrity constraints CMPT 354: Database I -- Simple SQL (1)
CMPT 354: Database I -- Simple SQL (1) 2
Review
- Collecting data
- Storing data into databases
– Create relational tables in SQL Server 2005 – Insert tuples into tables – Specifying some integrity constraints
CMPT 354: Database I -- Simple SQL (1) 3
What the Next?
- Collecting/storing data is not the ultimate
goal
– A database is useless if it can only store data, but cannot answer any queries
- Query answering, the major functionality of
databases
– How to ask a query? A query can be arbitrarily complicated – What answer should I expect?
CMPT 354: Database I -- Simple SQL (1) 4
Outline
- Data and queries
- Introduction to SQL
- Review: specifying database tables
- Maintaining the structures of tables
CMPT 354: Database I -- Simple SQL (1) 5
Querying Unstructured Data
- Natural language queries
– “I want jokes which are related to the topic of the class today, and can wake my students up” – Data: jokes, no matter where they are, could be come up by people – Hard to be answered by a computer
- Keyword search
– Find web stuff related to “jokes, SQL, databases” – A search engine like Google can do it, but the results may not be accurate or even relevant
CMPT 354: Database I -- Simple SQL (1) 6
One Result
In meeting with DBA friends one of my friend suggested while searching for “MSSQL Client” Microsoft returns you suggestion as “MySQL Client“. I did not believe it so I tested it myself. He was correct. Here is the screen shot.
http://blog.sqlauthority.com/2007/08/01/sql-server-sql-joke-sql-humor-sql-laugh-search-sql/
CMPT 354: Database I -- Simple SQL (1) 7
One More …
www.edream.org/images/blog/117/no-clue.jpg
CMPT 354: Database I -- Simple SQL (1) 8
Querying Structured Data
- List the students in class CMPT 354
– Student information is well organized in the student database – Each record contains student name, id, email, … – Each record has the same format
- Answers should be more accurate than
keyword search
CMPT 354: Database I -- Simple SQL (1) 9
Data and Queries
Structured data (tables) Unstructured data (e.g., text) Structured queries SQL, relational databases XML Unstructured queries (e.g., keyword search) Keyword search in text-rich relational databases Information retrieval
CMPT 354: Database I -- Simple SQL (1) 10
History
- IBM Sequel language
– Part of the System R project at the IBM San Jose Research Laboratory (http://www.mcjones.org/ System_R/SQL_Reunion_95/SRC-1997-018.pdf)
- Renamed Structured Query Language (SQL)
- ANSI and ISO standard SQL:
– SQL-86, 89, 92 – SQL:1999 (Y2K compliant language name), SQL:2003
- Commercial systems offer most, if not all, SQL-92
features, plus varying feature sets from later standards and special proprietary features
CMPT 354: Database I -- Simple SQL (1) 11
What Can SQL Do?
- Define databases
– What kinds of data? (e.g., names are character strings) – How to store them? (e.g., in alphabetical order) – Quality control (e.g., age cannot be negative)
- Retrieve data from databases
– Which attributes are needed? – Which tuples should be retrieved?
- Access control (e.g., only instructors can assign
final grades)
- Interface to other language / development tools
(e.g., C/C++, Java, 4GL, etc.)
CMPT 354: Database I -- Simple SQL (1) 12
Parts in SQL
- Data definition language (DDL)
- Interactive data-manipulation language
(DML)
- Integrity
- View definition
- Transaction control
- Embedded SQL and dynamic SQL
- Authorization
CMPT 354: Database I -- Simple SQL (1) 13
Data Definition Language (DDL)
- Specification of tables
– A table is also called a relation
- The schema for each table
- The domain of values associated with each
attribute
- Integrity constraints
- The set of indices to be maintained for each
relations
- Security and authorization information for each
relation
- The physical storage structure of each relation on
disk
CMPT 354: Database I -- Simple SQL (1) 14
Tables (Relations)
- Attributes (columns, dimensions)
– Attribute names are not stored explicitly in the table, they are stored in a meta-data table
- Tuples (records, rows)
– Each tuple has a (internal) row-id, explained as the address of the tuple Attributes (columns, dimensions) Tuples (records, rows)
CMPT 354: Database I -- Simple SQL (1) 15
Tables and Spreadsheets
- Tables are typed – each attribute has a pre-
defined data type
– Spreadsheet can be of undefined data type
- Disadvantages of spreadsheets
– Lack of auditing and revision control – Lack of security – Lack of concurrency: unlike databases, spreadsheets typically allow only one user to be making changes at any given time – Prone to errors due to entering information incorrectly or in the wrong place
CMPT 354: Database I -- Simple SQL (1) 16
Do You Know? (from Wikipedia)
- The concept of an electronic spreadsheet – "Budgeting
Models and System Simulation" (Richard Mattessich, 1961)
- Rene K. Pardo and Remy Landau filed U.S. Patent
4,398,249 on some related algorithms in 1970
– While the patent was initially rejected by the patent office as being a purely mathematical invention, Pardo and Landau won a court case in 1983 establishing that "something does not cease to become patentable merely because the point of novelty is in an algorithm." This case helped establish the viability of software patents.
- Dan Bricklin: the generally recognized inventor of
spreadsheets as a commercial product for the personal computer
- VisiCalc, the first application turning the personal computer
from a hobby for computer enthusiasts into a business tool
CMPT 354: Database I -- Simple SQL (1) 17
Inventors of SpreadSheets
Richard Mattessich
CMPT 354: Database I -- Simple SQL (1) 18
Basic Types in SQL
- char(n): Fixed length character string, with user-specified
length n
- varchar(n): Variable length character strings, with user-
specified maximum length n
- int: Integer (a finite subset of the integers that is machine-
dependent)
– smallint: Small integer (a machine-dependent subset of the integer domain type)
- numeric(p,d): Fixed point number, with user-specified
precision of p digits, with n digits to the right of decimal point
– real, double precision: Floating point and double-precision floating point numbers, with machine-dependent precision – float(n): Floating point number, with user-specified precision of at least n digits.
CMPT 354: Database I -- Simple SQL (1) 19
Tables and User Data
- Tables are holders of user data
– Only data fitting the type requirement can be held – A table can be empty
- Data in applications are independent of
tables
– Tables can be defined properly to hold data – Data can be preprocessed properly to fit tables
- Both tables and data can be modified
CMPT 354: Database I -- Simple SQL (1) 20
Review: Create a Table
- The create table command:
create table r (A1 D1, A2 D2, ..., An Dn, (integrity-constraint1), ..., (integrity-constraintk))
- Example:
create table branch (branch_name char(15) not null, branch_city char(30), assets integer)
CMPT 354: Database I -- Simple SQL (1) 21
Drop Table Construct
- Delete all information about the dropped
relation from the database
– drop table r
- Delete all tuples from a table
– delete from r
- Tips: how to delete all tuples in a large table
efficiently?
– Drop the table and then recreate it – Why?
CMPT 354: Database I -- Simple SQL (1) 22
Alter Table Constructs
- Add attributes to an existing relation
alter table r add A D
– A is the name of the attribute to be added to relation r, and D is the domain of A – All tuples in the relation are assigned null as the value for the new attribute.
- Drop attributes of a relation:
alter table r drop column A
– A is the name of an attribute of relation r – Dropping attributes is not supported by many databases (why?)
CMPT 354: Database I -- Simple SQL (1) 23
Summary
- Data and queries
– Different data allows different kinds of queries – Different queries may lead to different quality (e.g., accuracy)
- SQL
– DDL: data definition language – DML: data manipulation language
- Maintaining tables in relational databases
CMPT 354: Database I -- Simple SQL (1) 24
To Do List
- Create a table in SQL Server, modify its
structure by adding new attributes, changing data types of attributes, and removing some attributes
- Does SQL Server allow deleting an