[PPT] - Storage and Indexing (continued) CMPSCI 645 Mar 4, 2008 Slides PowerPoint Presentation

SLIDE 1

1

Storage and Indexing (continued)

CMPSCI 645 Mar 4, 2008

Slides Courtesy of R. Ramakrishnan and J. Gehrke

SLIDE 2

Today

 Index selection & performance tuning  Storing data: disks and files

2

SLIDE 3

Scan Equality Range Insert Delete

Heap File Sorted File Clustered Tree Index Unclustered Tree Index Unclustered Hash Index

BD .5BD BD 2D Search + D BD Dlog2B

D(log2B + #matching pages)

Search + BD Search + BD 1.5BD DlogF1.5 B

D(logF1.5B + #matching pages)

Search + D Search + D BD(R+. 15) D(1+logF.

.15B)

D(logF.15B + #matching recs)

Search + 3D Search + 3D BD(R+. 125) 2D BD 4D 4D

Cost of Operations

 Several assumptions underlie these (rough) estimates!

SLIDE 4

4

Index selection

 For each query in the workload:

Which relations does it access?
Which attributes are retrieved?
Which attributes are involved in selection/join conditions?

How selective are these conditions likely to be?

 For each update in the workload:

The type of update (INSERT/DELETE/UPDATE), and the

attributes that are affected.

SLIDE 5

5

Choice of Indexes

 What indexes should we create?

Which relations should have indexes?
What field(s) should be the search key?
Should we build several indexes?

 For each index, what kind of an index should it

be?

Hash/tree?
Clustered?

SLIDE 6

6

Choice of Indexes (Contd.)

 One approach: Consider the most important queries

in turn. Consider the best plan using the current indexes, and see if a better plan is possible with an additional index. If so, create it.

We must understand how a DBMS evaluates queries and

creates query evaluation plans!

For now, we discuss simple 1-table queries.

 Before creating an index, must also consider the

impact on updates in the workload!

Trade-off: Indexes can make queries go faster, updates
slower. Require disk space, too.

SLIDE 7

7

Index Selection Guidelines

 Attributes in WHERE clause are candidates for index keys.

Exact match condition suggests hash index.
Range query suggests tree index.
Clustering is especially useful for range queries; can also help on

equality queries if there are many duplicates.  Multi-attribute search keys should be considered when a

WHERE clause contains several conditions.

Order of attributes is important for range queries.
Such indexes can sometimes enable index-only strategies for

important queries.

For index-only strategies, clustering is not important!

 Choose indexes that benefit as many queries as possible.

Since only one index can be clustered per relation, choose it based on

important queries that would benefit the most from clustering.

SLIDE 8

8

Examples of Clustered Indexes

 B+ tree index on E.age can be

used to get qualifying tuples.

How selective is the condition?
Is the index clustered?

SELECT E.dno FROM Emp E WHERE E.age>40

SLIDE 9

9

Examples of Clustered Indexes

 Consider the GROUP BY query.

If many tuples have E.age > 10, using

E.age index and sorting the retrieved tuples by E.dno may be costly.

Clustered E.dno index may be better!

SELECT E.dno, COUNT (*) FROM Emp E WHERE E.age>10 GROUP BY E.dno

Compare index on <age>, index on <dno>.

SLIDE 10

10

Examples of Clustered Indexes

 Equality queries and duplicates:

Clustering on E.hobby helps!

SELECT E.dno FROM Emp E WHERE E.hobby=’Stamps’

SLIDE 11

11

Index-Only Plans

 Some queries can be

answered without retrieving any tuples from one or more of the relations involved, if a suitable index is available.

SELECT E.dno, COUNT(*) FROM Emp E GROUP BY E.dno

<E.dno>

SLIDE 12

12

Index-Only Plans

SELECT E.dno, MIN(E.sal) FROM Emp E GROUP BY E.dno

<E.dno,E.sal> Tree index <E.sal,E.dno> ? What about

SLIDE 13

13

Index-Only Plans

SELECT AVG(E.sal) FROM Emp E WHERE E.age=25 AND E.sal BETWEEN 3000 AND 5000

<E.age, E.sal> or <E.sal, E.age> Tree index

SLIDE 14

14

Index-Only Plans (Contd.)

 Index-only plans

are possible if we have a tree index with key <dno,age> or <age,dno>

Which is better?
What if we consider

the second query?

SELECT E.dno, COUNT () FROM Emp E WHERE E.age=30 GROUP BY E.dno SELECT E.dno, COUNT () FROM Emp E WHERE E.age>30 GROUP BY E.dno

SLIDE 15

Creating indexes in SQL

 SQL:1999 standard does not include any

statement for creating or dropping index structures!

15

CREATE INDEX age_index USING BTREE ON Emp(age)

SLIDE 16

16

Summary

 Understanding the nature of the workload for the

application, and the performance goals, is essential to developing a good design.

What are the important queries and updates? What

attributes/relations are involved?

 Indexes must be chosen to speed up important

queries (and perhaps some updates!).

Index maintenance overhead on updates to key fields.
Choose indexes that can help many queries, if possible.
Build indexes to support index-only strategies.
Clustering is an important decision; only one index on a

given relation can be clustered!

Order of fields in composite index key can be important.

SLIDE 17

Disks and Files

17

SLIDE 18

18

Disks and DBMS Design

 DBMS stores information on disks.  This has major implications for DBMS design!

READ: transfer data from disk to main memory (RAM)

for data processing.

WRITE: transfer data from RAM to disk for persistent

storage.

Both are high-cost operations, relative to in-memory
perations, so must be planned carefully!

SLIDE 19

19

Why Not Store Everything in Main Memory?

 Main memory is volatile. We want data to be

saved between runs. (Obviously!)

 Costs too much. $100 will buy you either 1GB

f RAM or 160GB of disk today.

 32-bit addressing limitation.

232 bytes can be directly addressed in memory.
Number of objects cannot exceed this number.

SLIDE 20

20

Arranging Pages on Disk

 `Next’ block concept:

blocks on same track, followed by
blocks on same cylinder, followed by
blocks on adjacent cylinder

 Blocks in a file should be arranged

sequentially on disk (by `next’), to minimize seek and rotational delay.

 For a sequential scan, pre-fetching several

pages at a time is a big win!

SLIDE 21

21

Buffer Management in a DBMS

 Data must be in RAM for DBMS to operate on it!  Table of <frame#, pageid> pairs is maintained.

DB

MAIN MEMORY DISK disk page free frame

Page Requests from Higher Levels

BUFFER POOL

choice of frame dictated by replacement policy

SLIDE 22

22

More on Buffer Management

 Requestor of page must unpin it, and indicate

whether page has been modified:

dirty bit is used for this.

 Page in pool may be requested many times,

a pin count is used. A page is a candidate for

replacement iff pin count = 0.

 CC & recovery may entail additional I/O

when a frame is chosen for replacement. (Write-Ahead Log protocol; more later.)

SLIDE 23

23

When a Page is Requested ...

 If requested page is not in pool:

Choose a frame for replacement
If frame is dirty, write it to disk
Read requested page into chosen frame

 Pin the page and return its address.  If requests can be predicted (e.g., sequential scans)

pages can be pre-fetched several pages at a time!

SLIDE 24

24

Buffer Replacement Policy

 Frame is chosen for replacement by a

replacement policy:

Least-recently-used (LRU), Clock, MRU etc.

 Policy can have big impact on # of I/O’s;

depends on the access pattern.

 Sequential flooding: Nasty situation caused by

LRU + repeated sequential scans.

# buffer frames < # pages in file means each page

request causes an I/O. MRU much better in this situation (but not in all situations, of course).

SLIDE 25

25

DBMS vs. OS File System

OS does disk space & buffer mgmt: why not let OS manage these tasks?

 Differences in OS support: portability issues  Some limitations, e.g., files can’t span disks.  Buffer management in DBMS requires ability to:

pin a page in buffer pool, force a page to disk

(important for implementing CC & recovery),

adjust replacement policy, and pre-fetch pages based
n access patterns in typical DB operations.

SLIDE 26

26

Files

 Access method layer offers an abstraction of

data on disk: a file of records residing on multiple pages

A number of fields are organized in a record
A collection of records are organized in a page
A collection of pages are organized in a file

SLIDE 27

27

Files of Records

 Page or block is OK when doing I/O, but

higher levels of DBMS operate on records and files of records.

 FILE: A collection of pages, each containing a

collection of records. Must support:

insert/delete/modify record
read a particular record (specified using record id)
scan all records (possibly with some conditions on

the records to be retrieved)

SLIDE 28

28

Unordered (Heap) Files

 Simplest file structure contains records in no

particular order.

 As file grows and shrinks, disk pages are

allocated and de-allocated.

 To support record level operations, we must:

keep track of the pages in a file
keep track of free space on pages
keep track of the records on a page

 There are many alternatives for keeping track

f this.

SLIDE 29

29

Heap File Implemented as a List

 (heap file name, header page id) stored in a known place.  Two doubly linked lists, for full pages & pages with space.

Each page contains 2 `pointers’ plus data.

 Upon insertion, scan the list of pages with space, or ask

disk space manager to allocate a new page

Header Page Data Page Data Page Data Page Data Page Data Page Data Page Pages with Free Space Full Pages

SLIDE 30

30

Heap File Using a Page Directory

 A directory entry per page; it can include the

number of free bytes on the page.

 The directory is a collection of pages; linked

list implementation is just one alternative.

Much smaller than linked list of all HF pages!

Data Page 1 Data Page 2 Data Page N Header Page

Page Format

 How to store a collection of records on a page?  Consider a page as a collection of slots, one for

each record.

 A record is identified by rid = <page id, slot #>  Record ids (rids) are used in indexes

(Alternatives 2 and 3).

SLIDE 32

32

Page Format: Fixed Length Records

 Moving records for free space management changes

rid! May not be acceptable.

Slot 1 Slot 2 Slot N PACKED N Free Space number

f records

. . .

M 1 . . . M ... 3 2 1 UNPACKED, BITMAP Slot 1 Slot 2 Slot N Slot M 1 1 number

f slots

. . .

SLIDE 33

33

Page Format: Variable Length Records

 Can move records on page without changing rid; so, attractive

for fixed-length records too. (level of indirection)

Page i Rid = (i,N) Rid = (i,2) Rid = (i,1) 20 16 24

N

Pointer to start

f free

space

SLOT DIRECTORY N . . . 2 1

# slots

SLIDE 34

34

Record Format: Fixed Length

 Information of a record type e.g., the number of fields

and field types is stored in the system catalog.

 Fixed length record: (1) the number of fields is fixed,

(2) each field has a fixed length.

 Store fields consecutively in a record.  Finding i’th field does not require scan of record. Base address (B)

L1 L2 L3 L4 F1 F2 F3 F4

Address = B+L1+L2

SLIDE 35

35

Record Format: Variable Length

 Variable length record: (1) number of fields is fixed, (2)

some fields are variable length

 Two alternatives:

 Second offers direct access to i’th field, efficient storage

f nulls (special don’t know value); small directory overhead.

$ $ $ $

Scan

Fields Delimited by Special Symbols

F1 F2 F3 F4 F1 F2 F3 F4 S1 S2 S3 S4 E4

Array of Field Offsets

SLIDE 36

36

System Catalogs

 For each index:

structure (e.g., B+ tree) and search key fields

 For each relation:

name, file name, file structure (e.g., Heap file)
attribute name and type, for each attribute
index name, for each index
integrity constraints

 For each view:

view name and definition

Storage and Indexing (continued)

CMPSCI 645 Mar 4, 2008

Today

 Index selection & performance tuning  Storing data: disks and files

Scan Equality Range Insert Delete

Heap File Sorted File Clustered Tree Index Unclustered Tree Index Unclustered Hash Index

BD .5BD BD 2D Search + D BD Dlog2B

Search + BD Search + BD 1.5BD DlogF1.5 B

Search + D Search + D BD(R+. 15) D(1+logF.

Search + 3D Search + 3D BD(R+. 125) 2D BD 4D 4D

Cost of Operations

 Several assumptions underlie these (rough) estimates!

Index selection

 For each query in the workload:

How selective are these conditions likely to be?

 For each update in the workload:

attributes that are affected.

Choice of Indexes

 What indexes should we create?

 For each index, what kind of an index should it

be?

Choice of Indexes (Contd.)

 One approach: Consider the most important queries

in turn. Consider the best plan using the current indexes, and see if a better plan is possible with an additional index. If so, create it.

creates query evaluation plans!

 Before creating an index, must also consider the

impact on updates in the workload!

Index Selection Guidelines

WHERE clause contains several conditions.

important queries.

important queries that would benefit the most from clustering.

Examples of Clustered Indexes

 B+ tree index on E.age can be

used to get qualifying tuples.

SELECT E.dno FROM Emp E WHERE E.age>40

Examples of Clustered Indexes

 Consider the GROUP BY query.

E.age index and sorting the retrieved tuples by E.dno may be costly.

SELECT E.dno, COUNT (*) FROM Emp E WHERE E.age>10 GROUP BY E.dno

Compare index on <age>, index on <dno>.

Examples of Clustered Indexes

 Equality queries and duplicates:

SELECT E.dno FROM Emp E WHERE E.hobby=’Stamps’

Index-Only Plans

 Some queries can be

answered without retrieving any tuples from one or more of the relations involved, if a suitable index is available.

SELECT E.dno, COUNT(*) FROM Emp E GROUP BY E.dno

<E.dno>

Index-Only Plans

SELECT E.dno, MIN(E.sal) FROM Emp E GROUP BY E.dno

<E.dno,E.sal> Tree index <E.sal,E.dno> ? What about

Index-Only Plans

SELECT AVG(E.sal) FROM Emp E WHERE E.age=25 AND E.sal BETWEEN 3000 AND 5000

<E.age, E.sal> or <E.sal, E.age> Tree index

Index-Only Plans (Contd.)

 Index-only plans

are possible if we have a tree index with key <dno,age> or <age,dno>

the second query?

SELECT E.dno, COUNT (*) FROM Emp E WHERE E.age=30 GROUP BY E.dno SELECT E.dno, COUNT (*) FROM Emp E WHERE E.age>30 GROUP BY E.dno

Creating indexes in SQL

 SQL:1999 standard does not include any

statement for creating or dropping index structures!

CREATE INDEX age_index USING BTREE ON Emp(age)

Summary

 Understanding the nature of the workload for the

application, and the performance goals, is essential to developing a good design.

attributes/relations are involved?

 Indexes must be chosen to speed up important

queries (and perhaps some updates!).

given relation can be clustered!

Disks and Files

Disks and DBMS Design

 DBMS stores information on disks.  This has major implications for DBMS design!

for data processing.

storage.

Why Not Store Everything in Main Memory?

 Main memory is volatile. We want data to be

saved between runs. (Obviously!)

 Costs too much. $100 will buy you either 1GB

 32-bit addressing limitation.

SELECT E.dno, COUNT () FROM Emp E WHERE E.age=30 GROUP BY E.dno SELECT E.dno, COUNT () FROM Emp E WHERE E.age>30 GROUP BY E.dno