Storage and Indexing (continued) CMPSCI 645 Mar 4, 2008 Slides - - PowerPoint PPT Presentation

storage and indexing continued
SMART_READER_LITE
LIVE PREVIEW

Storage and Indexing (continued) CMPSCI 645 Mar 4, 2008 Slides - - PowerPoint PPT Presentation

Storage and Indexing (continued) CMPSCI 645 Mar 4, 2008 Slides Courtesy of R. Ramakrishnan and J. Gehrke 1 Today Index selection & performance tuning Storing data: disks and files 2 Cost of Operations Scan Equality Range


slide-1
SLIDE 1

1

Storage and Indexing (continued)

CMPSCI 645 Mar 4, 2008

Slides Courtesy of R. Ramakrishnan and J. Gehrke

slide-2
SLIDE 2

Today

 Index selection & performance tuning  Storing data: disks and files

2

slide-3
SLIDE 3

Scan Equality Range Insert Delete

Heap File Sorted File Clustered Tree Index Unclustered Tree Index Unclustered Hash Index

BD .5BD BD 2D Search + D BD Dlog2B

D(log2B + #matching pages)

Search + BD Search + BD 1.5BD DlogF1.5 B

D(logF1.5B + #matching pages)

Search + D Search + D BD(R+. 15) D(1+logF.

.15B)

D(logF.15B + #matching recs)

Search + 3D Search + 3D BD(R+. 125) 2D BD 4D 4D

Cost of Operations

 Several assumptions underlie these (rough) estimates!

slide-4
SLIDE 4

4

Index selection

 For each query in the workload:

  • Which relations does it access?
  • Which attributes are retrieved?
  • Which attributes are involved in selection/join conditions?

How selective are these conditions likely to be?

 For each update in the workload:

  • The type of update (INSERT/DELETE/UPDATE), and the

attributes that are affected.

slide-5
SLIDE 5

5

Choice of Indexes

 What indexes should we create?

  • Which relations should have indexes?
  • What field(s) should be the search key?
  • Should we build several indexes?

 For each index, what kind of an index should it

be?

  • Hash/tree?
  • Clustered?
slide-6
SLIDE 6

6

Choice of Indexes (Contd.)

 One approach: Consider the most important queries

in turn. Consider the best plan using the current indexes, and see if a better plan is possible with an additional index. If so, create it.

  • We must understand how a DBMS evaluates queries and

creates query evaluation plans!

  • For now, we discuss simple 1-table queries.

 Before creating an index, must also consider the

impact on updates in the workload!

  • Trade-off: Indexes can make queries go faster, updates
  • slower. Require disk space, too.
slide-7
SLIDE 7

7

Index Selection Guidelines

 Attributes in WHERE clause are candidates for index keys.

  • Exact match condition suggests hash index.
  • Range query suggests tree index.
  • Clustering is especially useful for range queries; can also help on

equality queries if there are many duplicates.  Multi-attribute search keys should be considered when a

WHERE clause contains several conditions.

  • Order of attributes is important for range queries.
  • Such indexes can sometimes enable index-only strategies for

important queries.

  • For index-only strategies, clustering is not important!

 Choose indexes that benefit as many queries as possible.

  • Since only one index can be clustered per relation, choose it based on

important queries that would benefit the most from clustering.

slide-8
SLIDE 8

8

Examples of Clustered Indexes

 B+ tree index on E.age can be

used to get qualifying tuples.

  • How selective is the condition?
  • Is the index clustered?

SELECT E.dno FROM Emp E WHERE E.age>40

slide-9
SLIDE 9

9

Examples of Clustered Indexes

 Consider the GROUP BY query.

  • If many tuples have E.age > 10, using

E.age index and sorting the retrieved tuples by E.dno may be costly.

  • Clustered E.dno index may be better!

SELECT E.dno, COUNT (*) FROM Emp E WHERE E.age>10 GROUP BY E.dno

Compare index on <age>, index on <dno>.

slide-10
SLIDE 10

10

Examples of Clustered Indexes

 Equality queries and duplicates:

  • Clustering on E.hobby helps!

SELECT E.dno FROM Emp E WHERE E.hobby=’Stamps’

slide-11
SLIDE 11

11

Index-Only Plans

 Some queries can be

answered without retrieving any tuples from one or more of the relations involved, if a suitable index is available.

SELECT E.dno, COUNT(*) FROM Emp E GROUP BY E.dno

<E.dno>

slide-12
SLIDE 12

12

Index-Only Plans

SELECT E.dno, MIN(E.sal) FROM Emp E GROUP BY E.dno

<E.dno,E.sal> Tree index <E.sal,E.dno> ? What about

slide-13
SLIDE 13

13

Index-Only Plans

SELECT AVG(E.sal) FROM Emp E WHERE E.age=25 AND E.sal BETWEEN 3000 AND 5000

<E.age, E.sal> or <E.sal, E.age> Tree index

slide-14
SLIDE 14

14

Index-Only Plans (Contd.)

 Index-only plans

are possible if we have a tree index with key <dno,age> or <age,dno>

  • Which is better?
  • What if we consider

the second query?

SELECT E.dno, COUNT (*) FROM Emp E WHERE E.age=30 GROUP BY E.dno SELECT E.dno, COUNT (*) FROM Emp E WHERE E.age>30 GROUP BY E.dno

slide-15
SLIDE 15

Creating indexes in SQL

 SQL:1999 standard does not include any

statement for creating or dropping index structures!

15

CREATE INDEX age_index USING BTREE ON Emp(age)

slide-16
SLIDE 16

16

Summary

 Understanding the nature of the workload for the

application, and the performance goals, is essential to developing a good design.

  • What are the important queries and updates? What

attributes/relations are involved?

 Indexes must be chosen to speed up important

queries (and perhaps some updates!).

  • Index maintenance overhead on updates to key fields.
  • Choose indexes that can help many queries, if possible.
  • Build indexes to support index-only strategies.
  • Clustering is an important decision; only one index on a

given relation can be clustered!

  • Order of fields in composite index key can be important.
slide-17
SLIDE 17

Disks and Files

17

slide-18
SLIDE 18

18

Disks and DBMS Design

 DBMS stores information on disks.  This has major implications for DBMS design!

  • READ: transfer data from disk to main memory (RAM)

for data processing.

  • WRITE: transfer data from RAM to disk for persistent

storage.

  • Both are high-cost operations, relative to in-memory
  • perations, so must be planned carefully!
slide-19
SLIDE 19

19

Why Not Store Everything in Main Memory?

 Main memory is volatile. We want data to be

saved between runs. (Obviously!)

 Costs too much. $100 will buy you either 1GB

  • f RAM or 160GB of disk today.

 32-bit addressing limitation.

  • 232 bytes can be directly addressed in memory.
  • Number of objects cannot exceed this number.
slide-20
SLIDE 20

20

Arranging Pages on Disk

 `Next’ block concept:

  • blocks on same track, followed by
  • blocks on same cylinder, followed by
  • blocks on adjacent cylinder

 Blocks in a file should be arranged

sequentially on disk (by `next’), to minimize seek and rotational delay.

 For a sequential scan, pre-fetching several

pages at a time is a big win!

slide-21
SLIDE 21

21

Buffer Management in a DBMS

 Data must be in RAM for DBMS to operate on it!  Table of <frame#, pageid> pairs is maintained.

DB

MAIN MEMORY DISK disk page free frame

Page Requests from Higher Levels

BUFFER POOL

choice of frame dictated by replacement policy

slide-22
SLIDE 22

22

More on Buffer Management

 Requestor of page must unpin it, and indicate

whether page has been modified:

  • dirty bit is used for this.

 Page in pool may be requested many times,

  • a pin count is used. A page is a candidate for

replacement iff pin count = 0.

 CC & recovery may entail additional I/O

when a frame is chosen for replacement. (Write-Ahead Log protocol; more later.)

slide-23
SLIDE 23

23

When a Page is Requested ...

 If requested page is not in pool:

  • Choose a frame for replacement
  • If frame is dirty, write it to disk
  • Read requested page into chosen frame

 Pin the page and return its address.  If requests can be predicted (e.g., sequential scans)

pages can be pre-fetched several pages at a time!

slide-24
SLIDE 24

24

Buffer Replacement Policy

 Frame is chosen for replacement by a

replacement policy:

  • Least-recently-used (LRU), Clock, MRU etc.

 Policy can have big impact on # of I/O’s;

depends on the access pattern.

 Sequential flooding: Nasty situation caused by

LRU + repeated sequential scans.

  • # buffer frames < # pages in file means each page

request causes an I/O. MRU much better in this situation (but not in all situations, of course).

slide-25
SLIDE 25

25

DBMS vs. OS File System

OS does disk space & buffer mgmt: why not let OS manage these tasks?

 Differences in OS support: portability issues  Some limitations, e.g., files can’t span disks.  Buffer management in DBMS requires ability to:

  • pin a page in buffer pool, force a page to disk

(important for implementing CC & recovery),

  • adjust replacement policy, and pre-fetch pages based
  • n access patterns in typical DB operations.
slide-26
SLIDE 26

26

Files

 Access method layer offers an abstraction of

data on disk: a file of records residing on multiple pages

  • A number of fields are organized in a record
  • A collection of records are organized in a page
  • A collection of pages are organized in a file
slide-27
SLIDE 27

27

Files of Records

 Page or block is OK when doing I/O, but

higher levels of DBMS operate on records and files of records.

 FILE: A collection of pages, each containing a

collection of records. Must support:

  • insert/delete/modify record
  • read a particular record (specified using record id)
  • scan all records (possibly with some conditions on

the records to be retrieved)

slide-28
SLIDE 28

28

Unordered (Heap) Files

 Simplest file structure contains records in no

particular order.

 As file grows and shrinks, disk pages are

allocated and de-allocated.

 To support record level operations, we must:

  • keep track of the pages in a file
  • keep track of free space on pages
  • keep track of the records on a page

 There are many alternatives for keeping track

  • f this.
slide-29
SLIDE 29

29

Heap File Implemented as a List

 (heap file name, header page id) stored in a known place.  Two doubly linked lists, for full pages & pages with space.

  • Each page contains 2 `pointers’ plus data.

 Upon insertion, scan the list of pages with space, or ask

disk space manager to allocate a new page

Header Page Data Page Data Page Data Page Data Page Data Page Data Page Pages with Free Space Full Pages

slide-30
SLIDE 30

30

Heap File Using a Page Directory

 A directory entry per page; it can include the

number of free bytes on the page.

 The directory is a collection of pages; linked

list implementation is just one alternative.

  • Much smaller than linked list of all HF pages!

Data Page 1 Data Page 2 Data Page N Header Page

DIRECTORY

slide-31
SLIDE 31

31

Page Format

 How to store a collection of records on a page?  Consider a page as a collection of slots, one for

each record.

 A record is identified by rid = <page id, slot #>  Record ids (rids) are used in indexes

(Alternatives 2 and 3).

slide-32
SLIDE 32

32

Page Format: Fixed Length Records

 Moving records for free space management changes

rid! May not be acceptable.

Slot 1 Slot 2 Slot N PACKED N Free Space number

  • f records

. . .

M 1 . . . M ... 3 2 1 UNPACKED, BITMAP Slot 1 Slot 2 Slot N Slot M 1 1 number

  • f slots

. . .

slide-33
SLIDE 33

33

Page Format: Variable Length Records

 Can move records on page without changing rid; so, attractive

for fixed-length records too. (level of indirection)

Page i Rid = (i,N) Rid = (i,2) Rid = (i,1) 20 16 24

N

Pointer to start

  • f free

space

SLOT DIRECTORY N . . . 2 1

# slots

slide-34
SLIDE 34

34

Record Format: Fixed Length

 Information of a record type e.g., the number of fields

and field types is stored in the system catalog.

 Fixed length record: (1) the number of fields is fixed,

(2) each field has a fixed length.

 Store fields consecutively in a record.  Finding i’th field does not require scan of record. Base address (B)

L1 L2 L3 L4 F1 F2 F3 F4

Address = B+L1+L2

slide-35
SLIDE 35

35

Record Format: Variable Length

 Variable length record: (1) number of fields is fixed, (2)

some fields are variable length

 Two alternatives:

 Second offers direct access to i’th field, efficient storage

  • f nulls (special don’t know value); small directory overhead.

$ $ $ $

Scan

Fields Delimited by Special Symbols

F1 F2 F3 F4 F1 F2 F3 F4 S1 S2 S3 S4 E4

Array of Field Offsets

slide-36
SLIDE 36

36

System Catalogs

 For each index:

  • structure (e.g., B+ tree) and search key fields

 For each relation:

  • name, file name, file structure (e.g., Heap file)
  • attribute name and type, for each attribute
  • index name, for each index
  • integrity constraints

 For each view:

  • view name and definition

 Catalogs are themselves stored as relations!