The need for File Systems



SLIDE 1

The need for File Systems

Need to store data and programs in files
Must be able to store lots of data
Must be nonvolatile and survive crashes and power outages
Must allow multiple processes concurrent access
Store on disks
OS manages files in a file system

SLIDE 2

Different views on file systems

User view of file systems

  • file names, protections, operations

File system designer's view

  • how to keep track of free blocks
  • how disk blocks are grouped and managed to form files

First look at files from a user's perspective.

SLIDE 3

User views of file systems

File naming

  • give users useful names for files
  • MS-DOS limitation: 8.3 names (8-character name, 3-character extension)
  • file extensions
  • NTFS
  • Unix – up to 255 chars
  • extensions may have meaning to programs (e.g. cc)

OS viewing files as a sequence of bytes gives the most flexibility.

  • leave it up to the application what to do with the contents
SLIDE 4

File Types

Regular files – ASCII or binary
Directories
Character special files
Block special files
Symbolic links
Sockets, named pipes, doors
The file command, /etc/magic
Compressed files and archives
Sequential files – on tape
Random access files

SLIDE 5

File attributes - metadata

Mostly returned by the stat(2) family

  • filename (not from stat(2))
  • size
  • mtime, ctime, atime
  • mode (permissions and type)
  • nlinks
  • uid, gid
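
A minimal sketch of reading these attributes from Python, whose os.stat wraps the stat(2) family. The temporary file and its five-byte contents are made up for illustration. Note that the filename is what you pass in, not something stat hands back.

```python
import os
import stat
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name

st = os.stat(path)  # wraps stat(2)
info = {
    "size": st.st_size,                      # bytes
    "mode": stat.filemode(st.st_mode),       # type and permissions as text
    "is_regular": stat.S_ISREG(st.st_mode),  # file type is encoded in the mode
    "nlinks": st.st_nlink,
    "uid": st.st_uid,
    "gid": st.st_gid,
    "mtime": st.st_mtime,
    "ctime": st.st_ctime,
    "atime": st.st_atime,
}
print(info)
os.unlink(path)
```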
SLIDE 6

File operations

Create, Delete
Open, Close
Read, Write
Append, Seek
Get attributes, Set attributes
Rename
File descriptors – open files
Memory-mapped files – map files into process address space
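
The operations above can be sketched with Python's thin wrappers over the system calls. This is only an illustration on a throwaway temporary file; the byte contents are arbitrary.

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()     # create + open; fd is a file descriptor
os.write(fd, b"0123456789")       # write
os.lseek(fd, 4, os.SEEK_SET)      # seek to byte offset 4
chunk = os.read(fd, 3)            # read three bytes from that offset

# Map the whole file into the process address space and change it
# through memory instead of through write().
with mmap.mmap(fd, 0) as m:
    m[0:2] = b"AB"

os.close(fd)                      # close
with open(path, "rb") as f:
    final = f.read()
os.unlink(path)                   # delete
```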

SLIDE 7

Directories

Root directory
Could have just a root directory with all files in it

  • not useful on multiuser systems
  • may be useful for flash, etc.

Have directories in the root directory
Hierarchical directories – tree
Path names

  • absolute and relative to cwd

Directory ops: Create, Delete, Opendir, Closedir, Readdir, Rename, link, unlink
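
A small sketch of several directory operations, assuming a Unix-like system where hard links are available. The directory and file names are invented for the example.

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()                        # scratch directory
os.mkdir(os.path.join(root, "docs"))             # create a directory
target = os.path.join(root, "docs", "a.txt")
with open(target, "w") as f:
    f.write("data")

os.link(target, os.path.join(root, "b.txt"))     # link: second name, same file
links_after_link = os.stat(target).st_nlink

with os.scandir(root) as it:                     # opendir / readdir / closedir
    names = sorted(entry.name for entry in it)

os.unlink(os.path.join(root, "b.txt"))           # unlink: remove one name
links_after_unlink = os.stat(target).st_nlink

# A relative path is resolved against the current working directory.
old_cwd = os.getcwd()
os.chdir(root)
same_file = os.path.samefile("docs/a.txt", target)
os.chdir(old_cwd)

shutil.rmtree(root)
```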

SLIDE 8

Directories and Files

Contents of a directory
Unix . and .. directories (dot and dot-dot)
Disk partitions
Slices
Filesystems
Disk labels
How to store files and directories on disk?
Want efficiency and reliability

SLIDE 9

Contiguous Allocation of files

Advantages

  • simple to implement (address of first block + number of blocks)
  • very good read performance

Disadvantages

  • Disk becomes fragmented after a while (need to compact, keep track of holes)
  • Files change in size

CD-ROMs are a good use for this

  • write once
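
A toy model of contiguous allocation, assuming a first-fit search for holes (the slides do not name a placement policy). It shows how a disk can have enough free blocks in total and still have no hole big enough for a new file. The class name and the 10-block disk are hypothetical.

```python
class ContiguousDisk:
    def __init__(self, nblocks):
        self.owner = [None] * nblocks   # which file occupies each block
        self.files = {}                 # name -> (first_block, length)

    def allocate(self, name, length):
        """First fit: use the first run of `length` free blocks."""
        run = 0
        for i, occupant in enumerate(self.owner):
            run = run + 1 if occupant is None else 0
            if run == length:
                start = i - length + 1
                for b in range(start, i + 1):
                    self.owner[b] = name
                self.files[name] = (start, length)
                return start
        return None                     # no hole is large enough

    def delete(self, name):
        start, length = self.files.pop(name)
        for b in range(start, start + length):
            self.owner[b] = None


disk = ContiguousDisk(10)
disk.allocate("a", 3)                   # blocks 0-2
disk.allocate("b", 4)                   # blocks 3-6
disk.allocate("c", 3)                   # blocks 7-9
disk.delete("a")
disk.delete("c")
free_blocks = disk.owner.count(None)    # two separate holes of 3 blocks
result = disk.allocate("d", 5)          # fails: no 5 contiguous free blocks
```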
SLIDE 10

Linked List allocation of files

Keep a linked list of disk blocks
No external fragmentation, little internal
Need to store first block in directory
Random access is slow
Pointer takes some disk block space.
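
A rough simulation of why random access is slow here: each block stores the pointer to the next one, so reaching the n-th block of a file means reading every block before it. The dictionary standing in for the disk and the block numbers are assumptions for the example.

```python
# Each simulated disk block holds (next_block, payload); -1 ends the chain.
disk = {}
reads = 0

def write_file(blocks, payloads):
    """Chain the given block numbers together; return the first block."""
    for i, (block, payload) in enumerate(zip(blocks, payloads)):
        nxt = blocks[i + 1] if i + 1 < len(blocks) else -1
        disk[block] = (nxt, payload)
    return blocks[0]

def read_nth(first_block, n):
    """Return the payload of the n-th block (0-based), counting disk reads."""
    global reads
    block = first_block
    for _ in range(n):
        reads += 1                  # must read a block just to get its pointer
        block = disk[block][0]
    reads += 1                      # finally read the block we wanted
    return disk[block][1]

first = write_file([7, 2, 9, 4], ["w", "x", "y", "z"])
last_payload = read_nth(first, 3)
```

Fetching the fourth block costs four simulated reads, and only the first block number (7 here) needs to be kept in the directory.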

SLIDE 11

Linked List allocation using a table in memory

Table has a pointer to each disk block
FAT – File Allocation Table
Entire block can be used for data
Random access works well
Problem is entire table needs to be in memory (proportionate to disk size)
“Implementing pointers using arrays”
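
A sketch of "implementing pointers using arrays": the next-block pointers live in one in-memory table indexed by block number, so following a chain touches no data blocks. The 16-entry table, the block numbers, and the 4-byte entry size used in the size estimate are illustrative assumptions, not the layout of any particular FAT variant.

```python
EOF = -1
fat = [EOF] * 16                 # one entry per disk block, held in memory

def link_chain(blocks):
    """Record a file's blocks in the table; return its first block."""
    for cur, nxt in zip(blocks, blocks[1:]):
        fat[cur] = nxt
    fat[blocks[-1]] = EOF
    return blocks[0]

def nth_block(first, n):
    """Follow n pointers in memory; no disk reads are needed."""
    block = first
    for _ in range(n):
        block = fat[block]
    return block

def chain(first):
    out, block = [], first
    while block != EOF:
        out.append(block)
        block = fat[block]
    return out

first = link_chain([4, 7, 2, 10, 12])
fourth = nth_block(first, 3)

# The table grows with the disk: one entry per block.
def table_bytes(disk_bytes, block_size=1024, entry_size=4):
    return (disk_bytes // block_size) * entry_size

size_for_1gib = table_bytes(2**30)   # 2**20 entries * 4 bytes = 4 MiB
```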

SLIDE 12

I-nodes

Needs to be in memory only when file is open.
Point to a bunch of disk blocks.
Point to a block that points to more disk blocks.

(picture of an i-node: file attributes, 10 or so direct block ptrs, more block ptrs)

SLIDE 13

Directories

Need to map filenames to disk blocks

  • on disk (inode number)

File attributes can be in the directory too, or elsewhere, like in the inode
Typical filename max length is 255
Linear search of a directory can be slow
Use a hash table or a tree to speed up lookups and/or cache the searches – DNLC (directory name lookup cache)
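
A minimal sketch of a directory as a name-to-inode-number map, comparing a linear scan of the entries with a hash-table lookup. The Directory class, file names, and inode numbers are hypothetical. A name cache such as the DNLC would sit in front of either lookup in the same spirit.

```python
class Directory:
    MAX_NAME = 255                       # typical filename length limit

    def __init__(self):
        self.entries = []                # stored order: (name, inode_number)
        self.index = {}                  # hash table: name -> inode_number

    def add(self, name, inode_number):
        if len(name.encode()) > self.MAX_NAME:
            raise ValueError("filename too long")
        self.entries.append((name, inode_number))
        self.index[name] = inode_number

    def lookup_linear(self, name):
        """Scan every entry until the name matches: O(n) per lookup."""
        for entry_name, inode_number in self.entries:
            if entry_name == name:
                return inode_number
        return None

    def lookup_hashed(self, name):
        """Hash-table lookup: roughly constant time per lookup."""
        return self.index.get(name)


d = Directory()
for i in range(1000):
    d.add(f"file{i}.txt", 100 + i)

linear = d.lookup_linear("file999.txt")  # walks all 1000 entries
hashed = d.lookup_hashed("file999.txt")  # one probe
```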

SLIDE 14

File storage

Storing files on disk has many of the same issues as memory allocation.

  • contiguous
  • noncontiguous – with fixed size blocks

Block size

  • too small is too slow (seek time and rotational delay for each block)
  • too big wastes space (internal fragmentation)

½ kB, 1 kB, 4 kB commonly used
Need to keep track of free blocks

  • use a linked list or a bitmap
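
A sketch of the bitmap option for tracking free blocks, plus a one-line helper for the space wasted in a file's last block. The class name, the 10-block disk, and the 2500-byte file are assumptions for illustration.

```python
class BlockBitmap:
    """One bit per disk block: 0 = free, 1 = in use."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.bits = bytearray((nblocks + 7) // 8)

    def is_free(self, block):
        return (self.bits[block // 8] >> (block % 8)) & 1 == 0

    def allocate(self):
        """Return the lowest-numbered free block and mark it used."""
        for block in range(self.nblocks):
            if self.is_free(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        raise OSError("disk full")

    def free(self, block):
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF


def wasted_bytes(file_size, block_size):
    """Internal fragmentation: unused bytes in the file's last block."""
    return -file_size % block_size


bitmap = BlockBitmap(10)
first = bitmap.allocate()
second = bitmap.allocate()
bitmap.free(first)
reused = bitmap.allocate()

waste = {size: wasted_bytes(2500, size) for size in (512, 1024, 4096)}
```

For the 2500-byte file the waste is 60 bytes with 512-byte blocks, 572 with 1 kB blocks, and 1596 with 4 kB blocks.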
SLIDE 15

Other file issues

Disk Quotas

  • hard and soft limit or just hard limit
  • time based or not
  • number of files or just space used

Backups – Importance of data

  • equipment can be replaced
  • but losing data is unacceptable
  • Backups to tape or other media
  • full and incremental, restores
  • physical security of tapes
  • offsite copies, encryption
SLIDE 16

Filesystem consistency

System crashes can leave the filesystem in an inconsistent state.

  • need for scandisk and fsck
  • check blocks and files
  • missing blocks, duplicate blocks
  • lost+found

sync – write out modified blocks

  • done every 30 seconds

fsck can take a long time on large filesystems with lots of files

  • can make booting up slow
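
A simplified sketch of the block check described above: count how often each block appears in files and in the free list, then report blocks that appear nowhere (missing) or more than once (duplicates). This is not how fsck is actually implemented; the function name and sample data are hypothetical.

```python
from collections import Counter

def check_blocks(nblocks, files, free_list):
    """files maps a file name to its list of block numbers."""
    in_use = Counter(b for blocks in files.values() for b in blocks)
    free = Counter(free_list)
    missing = [b for b in range(nblocks) if in_use[b] == 0 and free[b] == 0]
    duplicated = [b for b in range(nblocks) if in_use[b] > 1]
    used_and_free = [b for b in range(nblocks) if in_use[b] and free[b]]
    return missing, duplicated, used_and_free

missing, duplicated, used_and_free = check_blocks(
    nblocks=8,
    files={"a": [0, 1, 2], "b": [2, 3]},   # block 2 is claimed by two files
    free_list=[3, 5, 6],                   # block 3 is both used and free
)
```

Because every block and every file has to be visited, the running time grows with the size of the filesystem, which is why a full check can make booting slow.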
SLIDE 17

Logging or Journaling

A filesystem that logs changes to on-disk data in a separate sequential rolling log.

  • maintain accurate picture of filesystem
  • speeds up booting greatly
  • more reliable in case of power failure

Records each disk transaction
Filesystem can be checked with the log instead of a full scan

SLIDE 18

Logging or Journaling

  • log update at start
  • modify filesystem
  • mark done

When the filesystem is checked, if an intent to change is marked but not completed, the file structure for that block is checked.
UFS logging, ext3, ReiserFS
DiskSuite, Veritas – separate log
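
A toy model of the three steps above. It assumes recovery simply redoes any logged update that was never marked done, which is one common approach and a simplification of what real journaling filesystems do. The class and the crash flag are hypothetical.

```python
class JournaledStore:
    def __init__(self):
        self.blocks = {}     # stand-in for on-disk blocks
        self.log = []        # sequential log of update records

    def update(self, block, value, crash_before_apply=False):
        txid = len(self.log)
        # 1. log the intended update first
        self.log.append({"tx": txid, "block": block,
                         "value": value, "done": False})
        if crash_before_apply:
            return           # simulated crash: intent logged, nothing applied
        # 2. modify the filesystem
        self.blocks[block] = value
        # 3. mark the log record done
        self.log[txid]["done"] = True

    def recover(self):
        """Look only at log records not marked done, not the whole disk."""
        repaired = []
        for record in self.log:
            if not record["done"]:
                self.blocks[record["block"]] = record["value"]
                record["done"] = True
                repaired.append(record["block"])
        return repaired


store = JournaledStore()
store.update(1, "a")
store.update(2, "b", crash_before_apply=True)
before = dict(store.blocks)
repaired = store.recover()
after = dict(store.blocks)
```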

SLIDE 19

Unix Inodes

Structure contains metadata (stuff returned by stat(2))

  • mode (permissions and type)
  • nlinks, size
  • uid, gid
  • atime, ctime, mtime
  • device file is on

10 or 12 direct disk block addresses
Single, double, triple indirect blocks

(picture of inodes)

SLIDE 20

Unix Inodes

Small files can be accessed quickly directly from inode.
Larger files use the indirect indexing.
Capacity of unix files using inodes.

  • example with 4 byte (32-bit) addressing
  • block size of 1 kB (1024 bytes)
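
A sketch of how a logical block number within a file maps onto direct and indirect pointers, using the 4-byte addresses and 1 kB blocks of this example. The count of 12 direct pointers is an assumption, and the function and its return format are made up for illustration.

```python
def locate(logical_block, direct=12, block_size=1024, addr_size=4):
    """Return the indexing level and the index used at each level."""
    per_block = block_size // addr_size          # 256 addresses per block
    n = logical_block
    if n < direct:
        return ("direct", n)
    n -= direct
    if n < per_block:
        return ("single", n)
    n -= per_block
    if n < per_block ** 2:
        return ("double", n // per_block, n % per_block)
    n -= per_block ** 2
    if n < per_block ** 3:
        return ("triple", n // per_block ** 2,
                (n // per_block) % per_block, n % per_block)
    raise ValueError("block number beyond maximum file size")


small = locate(5)            # within the direct pointers
medium = locate(12)          # first block reached via the single indirect block
large = locate(300)          # needs the double indirect block
huge = locate(268 + 65536)   # first block needing the triple indirect block
```

Each extra level costs one more disk read to fetch an indirect block, which is why small files are the fast case.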
SLIDE 21

Unix Inodes

4 byte addresses and 1 kB block size

Level        # of blocks              # of bytes
Direct       12                       12 kB
Single Ind   256                      256 kB
Double Ind   256*256 = 65,536 (64k)   64 MB
Triple Ind   256*64k = 16M            16 GB

Max file size is 16 GB + 64 MB + 268 kB
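
The arithmetic can be checked with a few lines, assuming 12 direct pointers, 1024-byte blocks, and 4-byte block addresses, with kB, MB, and GB read as powers of two. The function name is hypothetical.

```python
def capacity(direct=12, block_size=1024, addr_size=4):
    per_block = block_size // addr_size      # addresses per indirect block
    blocks = {
        "direct": direct,
        "single": per_block,
        "double": per_block ** 2,
        "triple": per_block ** 3,
    }
    sizes = {level: n * block_size for level, n in blocks.items()}
    return blocks, sizes, sum(sizes.values())


blocks, sizes, total = capacity()
KB, MB, GB = 2**10, 2**20, 2**30
print(sizes["double"] // MB, "MB via the double indirect block")
print(sizes["triple"] // GB, "GB via the triple indirect block")
print(total, "bytes maximum file size")
```

With these parameters the double indirect level addresses 65,536 blocks, which is exactly 64 MB, and the total comes to 16 GB + 64 MB + 268 kB.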

SLIDE 22

Unix Inodes

newfs, mkfs
Superblock (found in block 1, with backup copies elsewhere)

  • info about filesystem layout
  • # of inodes
  • # of disk blocks
  • free list for disk blocks

Directory entry needs filename and inode number

SLIDE 23

Filesystems

Berkeley Fast Filesystem, ufs

  • file names up to 255 chars
  • cylinder groups – keep data blocks of a file close together

Linux – ext2, ext3, ext4
XFS, ReiserFS
VxFS
WAFL (slide to come)
ZFS (more on this later)
NFS, CIFS
Virtual filesystems, /proc

SLIDE 24

WAFL File System

Write Anywhere File Layout from Network Appliance

  • optimized for random writes

Used on file servers from NetApp
Provides files using NFS, CIFS, ftp, http
Servers have NVRAM cache for writes
Ease of use is a principle of WAFL
Similar to Berkeley Fast File System, but with several changes.
Uses inodes – 16 pointers to blocks (or indirect blocks)

SLIDE 25

WAFL Snapshots

Each filesystem has a root inode
A snapshot duplicates a root inode
A snapshot is a read-only version of the filesystem at the point in time it is taken.
Existing blocks are not overwritten.
Space used by a snapshot is the blocks modified since the snapshot was taken.
Need to keep track of how many snapshots reference a block. When the count gets to zero the block can be freed.
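
A heavily simplified sketch of the snapshot idea, not WAFL's actual data structures: the "root inode" is modeled as a flat map from logical block to physical block, a snapshot copies that map, writes always go to a new physical block, and a reference count decides when a block can be freed. All names are hypothetical.

```python
class CowFS:
    def __init__(self):
        self.data = {}        # physical block -> contents
        self.refs = {}        # physical block -> number of roots using it
        self.root = {}        # active root: logical block -> physical block
        self.snapshots = {}   # snapshot name -> frozen copy of a root
        self.next_block = 0

    def write(self, logical, contents):
        new = self.next_block             # never overwrite an existing block
        self.next_block += 1
        self.data[new] = contents
        self.refs[new] = 1
        old = self.root.get(logical)
        self.root[logical] = new
        if old is not None:
            self._unref(old)

    def _unref(self, block):
        self.refs[block] -= 1
        if self.refs[block] == 0:         # nobody references it: free it
            del self.refs[block], self.data[block]

    def snapshot(self, name):
        self.snapshots[name] = dict(self.root)   # duplicate the root
        for block in self.root.values():
            self.refs[block] += 1

    def delete_snapshot(self, name):
        for block in self.snapshots.pop(name).values():
            self._unref(block)

    def read(self, logical, snapshot=None):
        root = self.root if snapshot is None else self.snapshots[snapshot]
        return self.data[root[logical]]


fs = CowFS()
fs.write(0, "v1")
fs.snapshot("monday")
fs.write(0, "v2")                         # old block stays for the snapshot
live, frozen = fs.read(0), fs.read(0, "monday")
blocks_with_snapshot = len(fs.data)
fs.delete_snapshot("monday")
blocks_after_delete = len(fs.data)
```

Taking the snapshot copies only the root map, and the snapshot starts costing space only once blocks are modified after it.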

See Figure 11.17 on page 446