File Organisation - 1 Dr. V. V. Subrahmanyam Associate Professor, - - PowerPoint PPT Presentation


slide-1
SLIDE 1

File Organisation - 1

  • Dr. V. V. Subrahmanyam

Associate Professor, SOCIS, IGNOU

slide-2
SLIDE 2

Introduction

  • A file is a collection of records.
  • A key element of file management is the way in which the records themselves are organised inside the file.
  • This organisation heavily affects system performance as far as record retrieval is concerned.

slide-3
SLIDE 3

Contd…

  • The access method for records in a file depends on the physical medium on which the file is stored.
  • For example, magnetic tape is sequential by nature, so records are read sequentially.
  • With disks, random access to records is possible.
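The difference between the two access styles can be sketched in Python. This is only an illustration: it assumes fixed-length 16-byte records and uses an in-memory `io.BytesIO` buffer as a stand-in for a disk file. The record size and the helper names `make_file`, `read_sequential` and `read_random` are hypothetical and do not come from the slides.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length in bytes

def make_file(names):
    """Build an in-memory 'file' of fixed-length, space-padded records."""
    data = b"".join(n.encode("ascii").ljust(RECORD_SIZE) for n in names)
    return io.BytesIO(data)

def read_sequential(f, n):
    """Tape-style access: read every record before record n to reach it."""
    f.seek(0)
    record = b""
    for _ in range(n + 1):
        record = f.read(RECORD_SIZE)
    return record.decode("ascii").rstrip()

def read_random(f, n):
    """Disk-style access: jump straight to the byte offset of record n."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE).decode("ascii").rstrip()

f = make_file(["Asha", "Bala", "Chitra", "Dev"])  # sample data
print(read_sequential(f, 2))
print(read_random(f, 2))
```

Both calls return the same record; the sequential version reads three records to get there, while the random version performs one seek and one read.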

slide-4
SLIDE 4

File Organisation

  • It is a way of arranging the records in a file when the file is stored on secondary storage devices.

slide-5
SLIDE 5

Some important definitions

  • File: It is a collection of related data or facts.
  • Fields: These are the columns, each containing one type of information.
  • A file is a group of records; records contain fields; fields contain data items. Data items contain characters (alphabets, digits, special characters etc.).
  • Each character occupies one byte of storage space (in a single-byte encoding such as ASCII).
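The file, record, field and character hierarchy can be made concrete with a small Python sketch. It assumes ASCII text and a made-up fixed-length layout of 20 bytes for the author, 30 for the title and 4 for the year. The layout, the helper names and the sample values are illustrative assumptions, not part of the slides.

```python
import struct

# Assumed layout: author (20 bytes), title (30 bytes), year (4 bytes), ASCII text.
RECORD_FORMAT = "20s30s4s"

def pack_record(author, title, year):
    """Encode three fields into one fixed-length record (padded with zero bytes)."""
    return struct.pack(RECORD_FORMAT, author.encode("ascii"),
                       title.encode("ascii"), year.encode("ascii"))

def unpack_record(raw):
    """Decode a record back into its fields, dropping the padding."""
    return tuple(field.rstrip(b"\x00").decode("ascii")
                 for field in struct.unpack(RECORD_FORMAT, raw))

# Sample data only; the file is simply the records laid end to end.
records = [pack_record("Author A", "Title One", "2001"),
           pack_record("Author B", "Title Two", "2005")]
file_bytes = b"".join(records)

print(struct.calcsize(RECORD_FORMAT))  # bytes per record: 20 + 30 + 4
print(len(file_bytes))                 # bytes in the whole file
print(unpack_record(records[0]))
```

Because every character takes one byte here, the record size is just the sum of the field widths, and the file size is the record size multiplied by the number of records.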

slide-6
SLIDE 6

Example

  • In the context of a traditional library, the author catalogue is a file. Each individual author catalogue card is a record. Each column in the card, such as author, title etc., is a field.

slide-7
SLIDE 7

Objectives of File Organisation

  • Optimal selection of records, i.e. records should be accessed as fast as possible.
  • Any insert, update or delete transaction on records should be easy and quick, and should not harm other records.
  • No duplicate records should be introduced as a result of an insert, update or delete.
  • Records should be stored efficiently so that the cost of storage is minimal.

slide-8
SLIDE 8

Data Files and Index Files

  • A database is a collection of files that together implement a logical data model.
  • The two types of files in a physical database structure are (i) data files and (ii) index files.
  • Data Files: These files store the facts that comprise the database.
  • Index Files: These files support access to the data files but usually do not themselves store facts other than key values.

slide-9
SLIDE 9

Example

  • Consider a simple bibliographical database. It consists of records containing bibliographical details about books. Each record about a book consists of several fields (Author, Title, Imprint etc.). For fast access to the records, an index file or inverted file is created, each record of which may hold the index term (Author's name or Subject descriptor etc.) and an index number. It is similar to a back-of-the-book index.
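A minimal Python sketch of such an inverted file follows. The book records, the field names and the helper `build_inverted_index` are invented for illustration; the record number plays the role of the index number mentioned above.

```python
from collections import defaultdict

# Data file: record number -> bibliographic record (sample data, not from the slides).
books = {
    1: {"author": "Rao", "title": "Cataloguing Basics", "imprint": "Delhi, 2001"},
    2: {"author": "Mehta", "title": "Indexing Methods", "imprint": "Mumbai, 2005"},
    3: {"author": "Rao", "title": "Library Automation", "imprint": "Delhi, 2010"},
}

def build_inverted_index(data, field):
    """Map each index term (a field value) to the record numbers that hold it."""
    index = defaultdict(list)
    for record_no, record in sorted(data.items()):
        index[record[field]].append(record_no)
    return dict(index)

author_index = build_inverted_index(books, "author")
print(author_index)

# Use the index to fetch only the matching records from the data file.
titles = [books[n]["title"] for n in author_index["Rao"]]
print(titles)
```

The index file stores only key values and record numbers; the facts themselves stay in the data file.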
slide-10
SLIDE 10

Structure/Organisation and Access method

  • The method of organising the records in a file is referred to as its structure or organisation.
  • The method of searching the file in order to retrieve the data is called the access method.
      – Sequential
      – Random

slide-11
SLIDE 11
  • For a particular file, the most appropriate organisation is determined on the basis of the operational characteristics of the storage medium and the nature of the operations to be performed on the data.
  • Magnetic disks are examples of direct access storage devices and magnetic tapes are examples of sequential storage devices.

slide-12
SLIDE 12

Types of file organisations

  • Sequential File Organisation
  • Indexed Sequential Access Method
  • Heap File Organisation
  • Hash/Direct File Organisation
  • B+ Tree File Organisation
  • Cluster File Organisation
slide-13
SLIDE 13

Sequential File Organisation

  • This is the simplest technique.
  • Records are written in a sequence in one long list.
  • They are arranged in the same sequence in which they were originally entered/written into the file.
  • The file is read from the beginning in the sequence in which the records are arranged.

slide-14
SLIDE 14

Contd…

  • To retrieve a record, start at the beginning of the file and read the records one after another in sequence until the required record is found.
  • Time consuming for large files.
  • These files are stored on sequential storage devices such as magnetic tapes.
  • Suitable only for archive, backup and transport copies of databases.
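The retrieval procedure described above amounts to a linear scan. A minimal Python sketch, assuming records are dictionaries with a `key` field (the record layout and the function name `sequential_search` are illustrative):

```python
def sequential_search(records, key):
    """Scan from the first record; return (record, records_read),
    or (None, records_read) if the key is absent."""
    reads = 0
    for record in records:
        reads += 1
        if record["key"] == key:
            return record, reads
    return None, reads

# Sample file of five records.
file_records = [{"key": k, "data": f"record-{k}"} for k in (611, 612, 614, 618, 624)]

print(sequential_search(file_records, 618))
print(sequential_search(file_records, 999))
```

The number of records read grows with the position of the target, and a missing key forces a scan of the whole file, which is why the method is time consuming for large files.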

slide-15
SLIDE 15

Sequential Organisation

slide-16
SLIDE 16

Inserting a New Record

  • Requires creation of a new file.
  • To maintain the file sequence, records are copied to the new file up to the point where the amendment is required.
  • The changes are then made and copied into the new file.
  • Following this, the remaining records in the original file are copied to the new file.
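The copy-and-insert steps can be sketched in Python as follows. The sketch assumes each record is represented only by its key and that files are plain lists; the function name `insert_into_sequential_file` is hypothetical.

```python
def insert_into_sequential_file(old_file, new_record):
    """Return a new file (list) with new_record placed in key order.
    The old file is left untouched."""
    new_file = []
    inserted = False
    for record in old_file:
        # Copy records until the insertion point, then write the new one.
        if not inserted and new_record < record:
            new_file.append(new_record)
            inserted = True
        new_file.append(record)   # copy the existing record unchanged
    if not inserted:              # new key is larger than every existing key
        new_file.append(new_record)
    return new_file

old = [611, 612, 614, 618, 624]
new = insert_into_sequential_file(old, 615)
print(new)
```

Every insertion touches every record in the file, which is the cost the later organisations try to avoid.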
slide-17
SLIDE 17

Inserting a New Record

slide-18
SLIDE 18

Disadvantages

  • The sorted file method always involves the effort of sorting the records.
  • Each time an insert, update or delete transaction is performed, the file is sorted. Hence identifying the record, inserting, updating or deleting it, and then sorting the file always takes some time and may make the system slow.

slide-19
SLIDE 19

Indexed Sequential Access Method (ISAM)

  • It is designed to overcome the limitations of the sequential file.
  • A file is sequenced on a particular field and an index for that file is built based on that very field.
  • This index provides a mechanism for faster search.
  • This technique allows both sequential and random processing.

slide-20
SLIDE 20

ISAM

[Figure: an index file whose entries pair a primary key with a block pointer, pointing into a data file organised in blocks.]
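A rough Python sketch of the index-plus-data-file idea follows. It assumes a sparse index with one entry per block holding the lowest key in that block; the slides do not say whether the index is sparse or dense, so that choice, the sample keys and the function names are all assumptions.

```python
from bisect import bisect_right

# Data file in blocks, each sorted on the key (sample keys and values).
data_blocks = [
    [(101, "A"), (105, "B"), (110, "C")],
    [(120, "D"), (125, "E"), (130, "F")],
    [(140, "G"), (150, "H")],
]

# Index file: one entry per block = (lowest key in block, block pointer).
index = [(block[0][0], ptr) for ptr, block in enumerate(data_blocks)]
index_keys = [key for key, _ in index]

def isam_lookup(key):
    """Random processing: use the index to pick one block, then scan only it."""
    pos = bisect_right(index_keys, key) - 1
    if pos < 0:
        return None              # key is smaller than every indexed key
    _, ptr = index[pos]
    for k, value in data_blocks[ptr]:
        if k == key:
            return value
    return None

def isam_scan():
    """Sequential processing: read the blocks in key order."""
    return [k for block in data_blocks for k, _ in block]

print(isam_lookup(125))
print(isam_lookup(126))
print(isam_scan())
```

A lookup searches the small index and then a single block instead of the whole file, while the sorted blocks still allow an in-order pass.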

slide-21
SLIDE 21

Inserting a Record

  • While inserting a record, maintaining the sequence of records may sometimes necessitate shifting subsequent records.
  • For a large file this is a costly and inefficient process.
  • Instead, an overflow area is provided: records that overflow their logical area are moved into a designated overflow area, and a pointer to the overflow location is kept.

slide-22
SLIDE 22

Inserting a Record

[Figure: an original logical block holding keys 611, 612, 614, 618, 624; after an insertion the logical block holds 611, 612, 614, 615, 618, with an overflow block attached.]
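The overflow mechanism can be sketched in Python using the key values from the figure. The block capacity of five and the rule that the highest key is the one displaced are assumptions made for illustration; the function names are hypothetical.

```python
from bisect import insort

BLOCK_CAPACITY = 5  # assumed number of records per logical block

def insert_with_overflow(block, overflow, key):
    """Insert key in sequence; if the block exceeds its capacity,
    move its highest key into the overflow block."""
    insort(block, key)                 # keeps the logical block sorted
    if len(block) > BLOCK_CAPACITY:
        insort(overflow, block.pop())  # displaced record goes to overflow

def lookup(block, overflow, key):
    """Check the logical block first, then follow the 'pointer' to overflow."""
    return key in block or key in overflow

block = [611, 612, 614, 618, 624]
overflow = []
insert_with_overflow(block, overflow, 615)
print(block)
print(overflow)
print(lookup(block, overflow, 624))
```

Only the affected block and its overflow block change; records in the rest of the file are not shifted.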

slide-23
SLIDE 23

Advantages

  • Since each record has its data block address, searching for a record in a large database is easy and quick with a proper primary key.
  • This method gives the flexibility of using any column as the key field, and the index will be generated based on that. In addition to the primary key and its index, we can have indexes generated for other fields too.

slide-24
SLIDE 24

Contd…

  • It supports range retrieval and partial retrieval of records. Since the index is based on the key value, we can retrieve the data for a given range of values. In the same way, when a partial key value is provided, say student names starting with 'JA', the matching records can also be found easily.
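Both kinds of retrieval rely on the index being sorted on the key. A minimal Python sketch, assuming an in-memory sorted list of (name, block pointer) pairs; the student names, pointers and function names are sample values invented for illustration.

```python
from bisect import bisect_left, bisect_right

# Sorted key index: (student name, block pointer). Sample data only.
index = sorted([("ANIL", 0), ("JACOB", 1), ("JAYA", 1), ("JOHN", 2), ("RAVI", 3)])
keys = [name for name, _ in index]

def range_query(low, high):
    """Return index entries with low <= key <= high."""
    return index[bisect_left(keys, low):bisect_right(keys, high)]

def prefix_query(prefix):
    """Return index entries whose key starts with prefix."""
    start = bisect_left(keys, prefix)   # first key not smaller than the prefix
    end = start
    while end < len(keys) and keys[end].startswith(prefix):
        end += 1
    return index[start:end]

print(prefix_query("JA"))
print(range_query("JACOB", "JOHN"))
```

In both cases a binary search finds the starting point, and the matching entries are then adjacent in the index, so no full scan is needed.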

slide-25
SLIDE 25

Disadvantages

  • An extra cost to maintain the index has to be borne, i.e. we need extra space on the disk to store the index. When there are multiple key-index combinations, the disk space required also increases.
  • As new records are inserted, these files have to be restructured to maintain the sequence. Similarly, when a record is deleted, the space used by it needs to be released. Otherwise, the performance of the database will degrade.