FILE SYSTEM IMPLEMENTATION
Sunu Wibirama
FILE SYSTEM IMPLEMENTATION Sunu Wibirama Outline File-System - - PowerPoint PPT Presentation
FILE SYSTEM IMPLEMENTATION Sunu Wibirama Outline File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Discussion 11. Outline File-System Structure
Sunu Wibirama
11.
Outline
File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Discussion
11.
Outline
File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Discussion
11.
File System Structure
File system is provided by OS to allow the data to be
stored, located, and retrieved easily
Two problems on file system design:
How the file system should look to the user Algorithms and data structures to map logical file
system onto the physical secondary-storage devices
File system organized into layers, uses features from
lower levels to create new features for use by higher levels.
I/O Control controls the physical device using
device driver
Basic file system needs only to issue generic
commands to the device driver to read and write physical block on the disk (ex. drive 1, cylinder 73, track 2, sector 11)
11.
Outline
File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Discussion
11.
File System Implementation
File organization module knows physical
and logical blocks, translating logical block address to physical block address. It also manages free-space on the disk
Logical file system manages metadata
information (all file system structure except the actual data or contents of the file).
Logical file system maintains file structure
via file-control blocks (FCB)
File control block – storage structure
consisting of information about a file
Basic file system and I/O control can be
used by multiple file systems.
Recently, OS supports more than one FS
11.
A Typical File Control Block
11.
On-Disk File System Structures
Boot control block contains info needed by system to boot
OS from that volume (UNIX: boot block, NTFS: partition boot sector)
Volume control block contains volume details (UNIX:
superblock, NTFS: master file table)
Directory structure organizes the files (UNIX: inode
numbers, NTFS: master file table)
Per-file File Control Block (FCB) contains many details
about the file
11.
In-Memory File System Structures
It is used for file-system management and performance
improvement (via caching).
The data are loaded at mount time and discarded at
dismount.
The structures including:
In-memory mount table: information of each mounted
volume
In-memory directory structure System-wide open-file table: a copy of FCB of each open
file
Per-process open-file table: a pointer to the appropriate
entry in the system-wide open-file table, as well as other information based on process that uses the file.
11.
New File Creation Process
Application program calls the logical file
system
Logical file system knows the directory
The system then reads the appropriate
directory into memory, updates it with the new file name and FCB, and writes it back to the disk.
Now, the new created file can be used for
I/O operation, which will be explained in the next slide
11.
In-Memory File System Structures
File open File read
11.
Outline
File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Discussion
11.
Directory Implementation
Directory-allocation and directory-management
algorithms significantly affects the efficiency, performance, and reliability of the file system. There are two impementations: linear list and hash table
Linear list of file names with pointer to the data
blocks.
simple to program time-consuming to execute, because it requires a
linear search to create or delete file.
Hash Table – linear list with hash data structure.
decreases directory search time problem:
collisions (two file names hash to the same
location)
fixed size of hash table (for 64 entries, the
hash function converts filename into integers from 0 to 63)
11.
Outline
File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Discussion
11.
Dynamic Storage Allocation
At any given time, we have a set of free spaces (holes) of various sizes
scattered through the disk.
Dynamic Storage Allocation problem: satisfying a request of size n
from a list of free spaces
There are three solutions for dynamic storage allocation problem:
First Fit: allocate the first hole that is big enough. Stop searching the
entire list as soon as we find a free hole that is large enough
Best Fit: Allocate the smallest hole that is big enough. We must
search the entire list in the storage
Worst Fit: allocate the largest hole. Again, we must search the entire
list in the storage
In term of speed and disk utilization : first fit and best fit are better then
worst fit
11.
External Fragmentation
First fit and best fit suffer from external fragmentation :
there is enough total storage space to satisfy the request, but the available spaces are not contiguous
OS File 2 File 3 File 8 50k 100k File 9 125k
11.
Internal Fragmentation
Another approach : break the physical storage into fixed-sized
blocks and allocate storage in units based on block size.
The storage space allocated to file may be slightly larger than the
requested space, but we get internal fragmentation.
OS A (part 3) A (part 2) A (part 1) Room for growth Allocated to A
11.
Fragmentation Solution?
Minimize external fragmentation?
Large disk blocks (but internal fragmentation occurs) Choose the best allocation methods to allocate disk block Compaction:
Shuffle memory contents to place all free memory together in one large block
(example: disk defragmenter)
Only for “execution time address binding” data (dynamic address relocation)
File 9 125k OS File 2 File 3 File 8 OS File 2 File 3 File 8 50k 100k OS File 2 File 3 File 8 90k 60k (a) (b)
11.
Allocation Methods
An allocation method refers to how disk blocks are
allocated for files:
Contiguous allocation Linked allocation Indexed allocation
11.
Contiguous Allocation of Disk Space
11.
Contiguous Allocation
OS : IBM VM/CMS Each file occupies a set of contiguous blocks on the disk Simple – only starting location (block #) and length (number of
blocks) are required
Both sequential and direct access are supported Disadvantages:
Wasteful of space (dynamic storage-allocation problem) External fragmentation: free space is broken into chunks We must determine the size of needed space before we allocated Preallocation may be inefficient if the file grow slowly over a long
period (months or years). The file therefore has a large amount of internal fragmentation
11.
Contiguous Allocation
(a) Contiguous allocation of disk space for 7 files. (b) The state of the disk after files D and F have been removed.
11.
Contiguous Allocation
One of several solutions: use a modified contiguous
allocation scheme (ex. : Veritas file system, replacing standard UNIX UFS)
Initially, a contiguous block of space is allocated If it is not to be large enough, another block is added,
which is called extent.
An extent is a contiguous block of disks A file consists of one or more extents
Another approach, we use Linked Allocation Method
11.
Linked Allocation
Linked allocation solves all
problems of contiguous allocation
Each file is a linked list of disk
blocks: blocks may be scattered anywhere on the disk.
Ex: File “Jeep”
Start at block 9 Then: block 16, 1, 10 Finally end at block 25
pointer (4 bytes) 1 block (512 bytes)
508 bytes
visible part to user
11.
Linked Allocation
Advantages:
Free-space management system – no waste of space File can grow, depends on available free blocks
Disadvantages:
No random access (only sequential access) Space required for pointers (0.78 percent of the disk is being used
for pointers, rather than for information). Solution, uses clusters (unit of blocks), so that pointers use much smaller percentage
Clusters operation increase internal fragmentation Reliability, pointer damage will cause unlinked blocks in a file.
FAT (File Allocation Table): variation on linked allocation (ex.: MS-DOS
and OS/2 operating systems)
Located at the beginning of each volume
11.
File-Allocation Table
To do random access:
to read the FAT
11.
Indexed Allocation
Linked allocation: solves external fragmentation and size-declaration
problem
FAT + Linked Allocation = efficient file retrieval using random access Without FAT, linked allocation cannot support efficient direct access:
because the pointers to the blocks are scattered with the block
themselves all over the disk
the pointers must be retrieved in order (sequential access).
How about collecting all pointers together into one location?
11.
Indexed Allocation
Brings all pointers together into the
index block
Each file has its own index block, which
is an array of disk-block addresses
Directory contains the address of index
block
Support direct access without external
fragmentation
Each file has its allocation for all
pointers, so that it has wasted space greater than linked allocation (which contains one pointer per block).
We want the index block as small as
possible, then we have several mechanisms:
Index block
11.
Indexed Allocation
Linked Scheme
An index block is normally one disk block Large files -> we can link together several index blocks Example: an index block contains:
Multilevel index
First-level index block points to second-level index blocks which in turn point
to the file blocks (see next slide)
Combined scheme
In unix, for example: 15 pointers in fileʼs inode 12 first pointer: direct blocks, for small file (no more than 12 blocks). If the
block size is 4KB, then up to 48KB (12 x 4KB) can be accessed directly
The next pointers point to indirect blocks, which implement multilevel index
based on their sequence (see next two slide)
11.
Multilevel Index
1st-level index block
2nd-level index block
file
Back to Indexed Allocation
11.
Combined Scheme: UNIX UFS (4K bytes per block)
Back to Indexed Allocation
11.
Outline
File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Discussion
11.
Free-Space Management
Disk space is limited, we need to reuse the space from deleted files for
new files, if possible
To keep track of free disk space, the system maintains a free-space
list.
Free-space list records all free disk blocks: those not allocated to
some file or directory
Creating file:
search the free-space list for the required amount of space allocate that space to the new file remove that space from the free-space list
Deleting file: the disk space of the file is added to free-space list
11.
Free-Space Management
Free-space list concept Bit vector (n blocks) Bit vector requires extra space.
Example:
block size = 212 bytes disk size = 230 bytes (1 gigabyte) n (amount of bits) = 230/212 = 218 bits (or 32K bytes)
…
1 2 n-1 bit[i] =
0 ⇒ block[i] free 1 ⇒ block[i] occupied Finding the free block number {(number of bits per word) *(number of 0-value words)} +offset of first 1 bit
1 1 1 1 1 1
1 2 3 4 5 6 7 8 9 10 11
1st word 2nd Word 3rd word
Assume 1word = 3bits, then finding block 11: (3 X 3)+2
11.
Free-Space Management
The other free-space management methods include:
(1) Linked list
Link together all the free disk blocks Keeping a pointer to the first free block in a special location
The first block contains a pointer to the next free disk block. Must read each block to traverse list, increase I/O operation
time. (2) Grouping
Storing the address of n free blocks in the first free block. n-1 blocks are actually free blocks but the last block
contains the addresses of another n free blocks. (3) Counting (for contiguous free block)
Several contiguous blocks may be freed simultaneously Keep the address of the first free block and n of free
contiguous blocks that follow the first block.
Each entry in free-space list consists of disk address and a
count (ex. address 4 + 15 --> we have 16 free blocks. 1st block starts at address 4, followed by 15 free blocks)
Fragmentation in Windows (FAT 32) and Linux (ext)
http://en.wikipedia.org/wiki/Defragmentation
Yes, but in very small quantity
‘clumps’ to manage small and large file (remember combined scheme in indexed allocation)
that are not predictably being fragmented in shorter time.
Possible to allocate bigger blocks for a file
disk using shake-fs
Fragmented Part
(*you should have known this before...)
http://geekblog.oneandoneis2.org/index.php/2006/08/17/ why_doesn_t_linux_need_defragmenting
Empty hard disk (*simplified assumption)
Start with FAT file system....
I have hello.txt OK, now add bye.txt I want to change hello.txt, dude...
Just copy, delete the original content, and wrap it up in the larger space..... Or, Put your extended file content to the next space.....
If the first approach requires huge read and write operation, then the most possible approach is the second one. That’s why FAT suffers from large fragmentation
1st approach 2nd approach
Initial condition
Add bye.txt
Change hello.txt