Indexed Files : Outline ! Introduction ! Indexed Files ! Full Index - - PowerPoint PPT Presentation




SLIDE 1

rasitjutrakul

Indexed Files : Outline

• Introduction
• Indexed Files
• Full Index Organization
• Indexed Sequential Files
• Multilevel Indexes
• Overflow Management
• Performance Analysis

SLIDE 2

Indexed Files

File structure              Ordered sequential processing   Random access
Sequential file structure   fast                            slow
Direct file structure       slow                            fast
Indexed file structure      fast                            fast

SLIDE 3

Indexed file

Figure: to find a record, look up its key in the index to get a block address, then read that block of the data file.

Data file
block #   key   name
0         675   Somchai
1         693   Somwang
2         270   Somnuek
3         105   Somsamorn
4         987   Somroo

Index (key → block address), in data-file order: 675 → 0, 693 → 1, 270 → 2, 105 → 3, 987 → 4
Index sorted by key: 105 → 3, 270 → 2, 675 → 0, 693 → 1, 987 → 4
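The lookup shown in the figure can be sketched in a few lines of Python. This is a minimal illustration only: the in-memory lists stand in for disk blocks, and the names `data_blocks`, `index`, `retrieve_one`, and `retrieve_all` are hypothetical, not part of the slides.

```python
from bisect import bisect_left

# Data file: one record per block, in arrival (unsorted) order.
data_blocks = [
    (675, "Somchai"),
    (693, "Somwang"),
    (270, "Somnuek"),
    (105, "Somsamorn"),
    (987, "Somroo"),
]

# Index: (key, block address) pairs kept sorted by key.
index = sorted((key, addr) for addr, (key, _) in enumerate(data_blocks))
index_keys = [key for key, _ in index]


def retrieve_one(key):
    """Binary-search the index, then fetch the block the entry points to."""
    pos = bisect_left(index_keys, key)
    if pos == len(index_keys) or index_keys[pos] != key:
        return None  # key is not in the file
    _, addr = index[pos]
    return data_blocks[addr]


def retrieve_all():
    """Ordered sequential processing: follow the index in key order."""
    return [data_blocks[addr] for _, addr in index]
```

Random access goes through the sorted index, and ordered processing walks the same index from start to end, which is why the indexed structure is fast for both.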

SLIDE 4

Full Index Organization

Figure: full index over a 24-record data file, with one index entry per record.

Index entries (key → data block #), stored in index blocks #0 to #7:
4 → 0, 7 → 0, 10 → 1, 16 → 1, 18 → 2, 19 → 2, 20 → 3, 21 → 3, 22 → 4, 23 → 4, 24 → 5, 25 → 5, 26 → 6, 28 → 6, 33 → 7, 35 → 7, 37 → 8, 39 → 8, 41 → 9, 44 → 9, 48 → 10, 78 → 10, 81 → 11, 92 → 11

Data blocks #0 to #11 (keys): 4 7 | 10 16 | 18 19 | 20 21 | 22 23 | 24 25 | 26 28 | 33 35 | 37 39 | 41 44 | 48 78 | 81 92

SLIDE 5

Indexed File Structure

index blocks : m1 blocks, data blocks : m2 blocks

• Binary search for the target key in the index
• RetrieveOne : SL[BinarySearch] + 1 rba
• RetrieveAll : 1 rba + m2 sba
• DeleteOne : SL[RetrieveOne] + 2 sba
• InsertOne : SL[RetrieveOne] + 2 sba
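To make the RetrieveOne cost concrete, here is a small sketch that binary-searches blocked index entries and counts one access per block read. It assumes the 24-key layout of the Full Index Organization slide (3 index entries per index block, 2 records per data block); the helper names `chunk` and `retrieve_one` are illustrative only.

```python
KEYS = [4, 7, 10, 16, 18, 19, 20, 21, 22, 23, 24, 25,
        26, 28, 33, 35, 37, 39, 41, 44, 48, 78, 81, 92]


def chunk(items, size):
    """Split a list into consecutive blocks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


data_blocks = chunk(KEYS, 2)                                        # m2 = 12
index_blocks = chunk([(k, i // 2) for i, k in enumerate(KEYS)], 3)  # m1 = 8


def retrieve_one(key):
    """Return (found, block_accesses) for a full-index lookup."""
    accesses = 0
    lo, hi = 0, len(index_blocks) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        block = index_blocks[mid]
        accesses += 1                      # read one index block
        if key < block[0][0]:
            hi = mid - 1
        elif key > block[-1][0]:
            lo = mid + 1
        else:
            for k, addr in block:
                if k == key:
                    accesses += 1          # read the data block
                    return key in data_blocks[addr], accesses
            return False, accesses         # key falls in a gap
    return False, accesses
```

For example, `retrieve_one(23)` hits the middle index block on the first probe and needs 2 accesses, while `retrieve_one(4)` needs 3 index probes plus 1 data access.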

SLIDE 6

Indexed Sequential Files

• If records in the data file are ordered,
  – ordered sequential processing is fast.
  – the file does not have to be fully indexed (keep only the maximum key of each data block).
  – the number of index entries decreases, so the number of index blocks and the search length decrease, which improves performance.

Figure: two data blocks holding 16345 Siripol, 17324 Sirirak, 17543 Siri, 18932 Siriroj and 19823 Toy, 20221 Tao, 23847 Ting, 38211 Took. The index keeps only the block maxima 18932 and 38211.
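A sparse index of this kind can be sketched as follows, using the records from the figure. The code is an assumption-laden illustration: blocks are plain Python lists, and `sparse_index` and `retrieve_one` are hypothetical names.

```python
from bisect import bisect_left

# Ordered data file: each inner list is one data block.
blocks = [
    [(16345, "Siripol"), (17324, "Sirirak"), (17543, "Siri"), (18932, "Siriroj")],
    [(19823, "Toy"), (20221, "Tao"), (23847, "Ting"), (38211, "Took")],
]

# Keep only the maximum key of each data block.
sparse_index = [block[-1][0] for block in blocks]  # [18932, 38211]


def retrieve_one(key):
    """Find the first block whose maximum key is >= key, then scan it."""
    pos = bisect_left(sparse_index, key)
    if pos == len(sparse_index):
        return None  # key is larger than every key in the file
    for k, name in blocks[pos]:
        if k == key:
            return name
    return None
```

The index here has 2 entries instead of 8, at the price of scanning inside one data block after the index search.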

SLIDE 7

Indexed Sequential File

Figure: indexed sequential file over a 24-record data file. The index keeps one entry (maximum key → data block #) per data block, stored in index blocks #0 to #3; x marks the last entry.

Index: 7 → 0, 16 → 1, 19 → 2 | 21 → 3, 23 → 4, 25 → 5 | 28 → 6, 35 → 7, 39 → 8 | 44 → 9, 78 → 10, x → 11

Data blocks #0 to #11 (keys): 4 7 | 10 16 | 18 19 | 20 21 | 22 23 | 24 25 | 26 28 | 33 35 | 37 39 | 41 44 | 48 78 | 81 92

SLIDE 8

Indexed Sequential Files

• 100,000 records, each of size 500 bytes
• index record size = 20 bytes
• block size = 2000 bytes
  – 1 block = 4 data records, 1 block = 100 index records
  – 25,000 data blocks
• Full index :
  – index file : 100,000 records = 1000 index blocks
• Indexed sequential :
  – index file : 25,000 records = 250 index blocks
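The block counts above follow from integer division on the stated sizes. A short check, with variable names chosen here for illustration:

```python
RECORDS = 100_000
RECORD_SIZE = 500        # bytes per data record
INDEX_ENTRY_SIZE = 20    # bytes per index record
BLOCK_SIZE = 2000        # bytes per block


def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


data_bf = BLOCK_SIZE // RECORD_SIZE                    # data records per block
index_bf = BLOCK_SIZE // INDEX_ENTRY_SIZE              # index records per block
data_blocks = ceil_div(RECORDS, data_bf)               # blocks in the data file
full_index_blocks = ceil_div(RECORDS, index_bf)        # one entry per record
indexed_seq_blocks = ceil_div(data_blocks, index_bf)   # one entry per data block
```

The indexed sequential index is smaller by exactly the data blocking factor (4), because it holds one entry per data block rather than one per record.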

SLIDE 9

Multilevel Indexed Sequential

• Trimming search length = better performance
• Modify the logical structure of the index file

Figure: index structures with one level, two levels, and three levels.

SLIDE 10

Indexed Sequential File : 2 levels

Figure: two-level index. Each level 0 entry is the maximum key of one level 1 index block; x marks the last entry.

level 0: 19 25 39 x
level 1: 7 → 0, 16 → 1, 19 → 2 | 21 → 3, 23 → 4, 25 → 5 | 28 → 6, 35 → 7, 39 → 8 | 44 → 9, 78 → 10, x → 11

Data blocks #0 to #11 (keys): 4 7 | 10 16 | 18 19 | 20 21 | 22 23 | 24 25 | 26 28 | 33 35 | 37 39 | 41 44 | 48 78 | 81 92

SLIDE 11

Indexed Sequential File : 3 levels

Figure: three-level index. Each entry at one level is the maximum key of an index block at the next level; x marks the last entry.

level 0: 25 x
level 1: 19 25 | 39 x
level 2: 7 → 0, 16 → 1, 19 → 2 | 21 → 3, 23 → 4, 25 → 5 | 28 → 6, 35 → 7, 39 → 8 | 44 → 9, 78 → 10, x → 11

Data blocks #0 to #11 (keys): 4 7 | 10 16 | 18 19 | 20 21 | 22 23 | 24 25 | 26 28 | 33 35 | 37 39 | 41 44 | 48 78 | 81 92
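A three-level lookup over the structure in this figure might look like the sketch below. It assumes that `x` behaves as "larger than any key" (modelled here with `math.inf`) and that each index entry carries the number of its child block; the names `LEVELS`, `DATA`, and `retrieve_one` are hypothetical.

```python
from bisect import bisect_left
import math

X = math.inf  # stands in for the slides' "x": matches any remaining key

DATA = [[4, 7], [10, 16], [18, 19], [20, 21], [22, 23], [24, 25],
        [26, 28], [33, 35], [37, 39], [41, 44], [48, 78], [81, 92]]

# Each index block is a list of (max key, child block number); level 0 is the root.
LEVELS = [
    [[(25, 0), (X, 1)]],                                            # level 0
    [[(19, 0), (25, 1)], [(39, 2), (X, 3)]],                        # level 1
    [[(7, 0), (16, 1), (19, 2)], [(21, 3), (23, 4), (25, 5)],
     [(28, 6), (35, 7), (39, 8)], [(44, 9), (78, 10), (X, 11)]],    # level 2
]


def retrieve_one(key):
    """Return (found, block_accesses): one block per index level plus one data block."""
    child, accesses = 0, 0
    for level in LEVELS:
        block = level[child]
        accesses += 1
        keys = [k for k, _ in block]
        # First entry whose maximum key is >= key; X guarantees a match.
        child = block[bisect_left(keys, key)][1]
    accesses += 1  # read the data block
    return key in DATA[child], accesses
```

Every lookup costs exactly one access per index level plus one data access, so the cost depends on the height of the index and not on a binary search across all index blocks.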

SLIDE 12

Overflow Records

• Insertion generates overflow records
• Allocate empty slots in each block
• Reorganize the file if needed
• Allocate extra overflow blocks (overflow area)
  – overflow records are linked in a logical, ordered, chained fashion with the primary block to which they belong
  – overflow records are not blocked

SLIDE 13

Overflow Records

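Figure: primary area and overflow area. The primary blocks hold keys 2 4 | 10 13 | 18 19 | 20 21 | 22 23 | 24 25 | 26 28 | 33 35 | 37 38 | 41 44 | 48 78 | 81 92, each followed by an overflow pointer (x = no overflow). The overflow area holds the displaced records 5, 6, 7, 15, 16, and 39, chained to their primary blocks.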
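One way to read the figure is that an insertion into a full primary block pushes the block's largest key out to an ordered overflow chain. The sketch below implements that reading; it is an assumption about the policy, with a hypothetical `PrimaryBlock` class and a block capacity of 2 taken from the figure.

```python
from bisect import insort

BLOCK_CAPACITY = 2  # assumed: two records per primary block, as in the figure


class PrimaryBlock:
    def __init__(self, keys):
        self.keys = sorted(keys)   # records stored in the primary area
        self.overflow = []         # ordered chain in the overflow area

    def insert(self, key):
        """Insert in key order; spill the largest key when the block is full."""
        insort(self.keys, key)
        if len(self.keys) > BLOCK_CAPACITY:
            insort(self.overflow, self.keys.pop())

    def scan(self):
        """Ordered traversal: primary records first, then the overflow chain."""
        return self.keys + self.overflow
```

Starting from a block holding 4 and 7 and inserting 2, 6, and 5 leaves 2 and 4 in the primary block with 5, 6, 7 on the chain. Each overflow record costs an extra access when read, which is why long chains degrade performance.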

SLIDE 14

Performance Analysis

• The number of rba's needed to retrieve a target depends on the height of the index tree.
• The height depends on the NBLK and BF of the index.
• Let k be the average number of index entries per index block.
• Let the index tree be an h-level tree.

$$NBLK_{index} = 1 + k + k^2 + \cdots + k^{h-1} = \frac{k^h - 1}{k - 1}$$

$$NBLK_{data} = k^h$$

$$h = \log_k (NBLK_{data})$$

SLIDE 15

Performance : RetrieveOne

• 100,000 records, each of size 500 bytes
• index record size = 20 bytes
• block size = 2000 bytes
• Full index : 1000 blocks : $\log_2 1000 \approx 10$ rba for the index search
• Indexed seq (1 level) : 250 blocks
  – $1 + \log_2 250 \approx 9$ rba
• Indexed seq (multilevel) : BF = 2000/20 = 100
  – $h = \lceil \log_{100} 100000 \rceil = 3$
  – 1 + h = 4 rba
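The access counts on this slide can be reproduced with a few lines of arithmetic. The sketch rounds each logarithm up to a whole number of block accesses; variable names are illustrative.

```python
import math

full_index_blocks = 1000
indexed_seq_blocks = 250
records = 100_000
index_bf = 2000 // 20   # index entries per block

# Binary search over the index blocks of a full index.
full_index_search = math.ceil(math.log2(full_index_blocks))

# Single-level indexed sequential: binary search plus one data block access.
indexed_seq_total = 1 + math.ceil(math.log2(indexed_seq_blocks))

# Multilevel: one access per index level plus one data block access.
h = math.ceil(math.log(records, index_bf))
multilevel_total = 1 + h
```

The multilevel index wins because its cost grows with the logarithm base 100 (the index blocking factor) rather than base 2.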

SLIDE 16

Initial Loading

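Keys to load: 4, 7, 9, 34, 63, 66, 70, 71

Figure: successive states while the keys are loaded in order, with two records per data block. Data block 4; data block 4 7 with index entry 7; data blocks 4 7 | 9 with index entry 7; data blocks 4 7 | 9 34 with index entries 7 34.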

SLIDE 17

Initial Loading

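Keys to load: 4, 7, 9, 34, 63, 66, 70, 71

Figure: loading continues. After 63 and 66: data blocks 4 7 | 9 34 | 63 66, lower-level index blocks 7 34 | 66, top-level index 34 66. Final state: data blocks 4 7 | 9 34 | 63 66 | 70 71, lower-level index blocks 7 34 | 66 x, top-level index 34 x.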
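Initial loading of sorted keys can be sketched as a bottom-up build: fill the data blocks, then build each index level from the maximum keys of the level below. This is a minimal sketch, not the slides' exact procedure. It builds the whole structure at once instead of key by key, and it stores the real maximum key where the slides write `x`. The function names are hypothetical.

```python
def chunk(items, size):
    """Split a list into consecutive blocks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def initial_load(sorted_keys, data_bf=2, index_bf=2):
    """Return (data_blocks, index_levels), with index_levels[0] as the top level."""
    data_blocks = chunk(sorted_keys, data_bf)
    # Lowest index level: one (max key, block number) entry per data block.
    entries = [(block[-1], n) for n, block in enumerate(data_blocks)]
    levels = []
    while True:
        blocks = chunk(entries, index_bf)
        levels.insert(0, blocks)
        if len(blocks) <= 1:
            break  # a single block is the top of the index
        # Next level up: one entry per index block just written.
        entries = [(block[-1][0], n) for n, block in enumerate(blocks)]
    return data_blocks, levels


data_blocks, levels = initial_load([4, 7, 9, 34, 63, 66, 70, 71])
```

With two entries per block, the eight keys produce four data blocks, two lower-level index blocks, and one top-level index block.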

SLIDE 18

Reorganization Point

• Reorganize when performance has deteriorated by 50% from the performance just after initial loading.
• Let n1 be the number of RetrieveOne operations per unit time.
• Let n2 be the number of RetrieveAll operations per unit time.
• Let L be the average length of the overflow chains.

$$NBLK_{index} = 1 + k + k^2 + \cdots + k^{h-1} = \frac{k^h - 1}{k - 1}$$

$$NBLK_{data} = k^h$$

$$h = \log_k (NBLK_{data})$$

SLIDE 19

Physical Structure

Figure: index hierarchy on disk. The master index sits above the cylinder indexes, and each cylinder index sits above its track indexes.

SLIDE 20

Physical Structure

Figure: the master index points to the cylinder indexes. Within each cylinder, tracks trk 0, trk 1, trk 2, ..., trk 19 each begin with a track index followed by data blocks; trk 0 also holds the cylinder index.

SLIDE 21

Physical Structure

Layout without overflow blocks: level 0 index | level 1 index, data blocks ... | level 1 index, data blocks ... | level 1 index, data blocks ... | etc.

Layout with overflow blocks: level 0 index | level 1 index, data blocks ..., overflow blocks | level 1 index, data blocks ..., overflow blocks | etc.

• Faster access
  – Mingling the data and index blocks : locality
  – Keep master index (level 0 index) in RAM

SLIDE 22

Example

• 10,000 records, 160 bytes/record; key is 16 bytes, pointer is 4 bytes
• HP7925 : 256 bytes/sector, 64 sectors/track, 9 tracks/cylinder, 815 cylinders
• Choose BF = 6 : utilization = (160 x 6)/(256 x 4) = 93.8%
• 1 block = 1024 bytes : 1024/(16+4) = 51 index entries
• 1 track = 64/4 = 16 blocks
• 10,000 records : 10000/6 = 1667 blocks
• number of cylinders = 1667/(16 x 9 - 10) = 13 cylinders
  – 16 blocks/track, 9 tracks/cylinder, less 10 index blocks per cylinder (9 track index blocks + 1 cylinder index block)
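The arithmetic in this example can be checked step by step. The sketch below uses only the figures stated on the slide; the constant and variable names are chosen here for illustration.

```python
import math

RECORDS, RECORD_SIZE = 10_000, 160
KEY_SIZE, POINTER_SIZE = 16, 4
SECTOR_SIZE, SECTORS_PER_TRACK, TRACKS_PER_CYLINDER = 256, 64, 9
SECTORS_PER_BLOCK, BF = 4, 6

block_size = SECTOR_SIZE * SECTORS_PER_BLOCK                       # bytes per block
utilization = RECORD_SIZE * BF / block_size                        # fraction of block used
index_entries_per_block = block_size // (KEY_SIZE + POINTER_SIZE)
blocks_per_track = SECTORS_PER_TRACK // SECTORS_PER_BLOCK
data_blocks = math.ceil(RECORDS / BF)

# Each cylinder gives up 9 track index blocks and 1 cylinder index block.
index_blocks_per_cylinder = TRACKS_PER_CYLINDER + 1
data_blocks_per_cylinder = (blocks_per_track * TRACKS_PER_CYLINDER
                            - index_blocks_per_cylinder)
cylinders = math.ceil(data_blocks / data_blocks_per_cylinder)
```

The file needs 13 of the drive's 815 cylinders, with 134 data blocks available in each.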