
  • 1

11: File System Basics

Last Modified: 6/15/2004 12:12:09 PM

  • 2

File Systems

Last time we talked about disk internals
Despite complex internals, disks export a simple array of sectors
How do we go from that to a file system?
What exactly do we expect from a file system?

  • 3

File System Basics

File systems are probably the OS abstraction that the average user is most familiar with
  Files
  Directories
  Access controls (owners, groups, permissions)

  • 4

Files

A file is a collection of data with system-maintained properties like
  Owner, size, name, last read/write time, etc.
Files often have “types” which allow users and applications to recognize their intended use
Some file types are understood by the file system (mount point, symbolic link, directory)
Some file types are understood by applications and users (.txt, .jpg, .html, .doc, …)
  Could the system understand these types and customize its handling?

  • 5

Basic File Operations

UNIX
  create(name)
  open(name, mode)
  read(fd)
  write(fd)
  sync(fd)
  seek(fd, pos)
  close(fd)
  unlink(name)
  rename(old, new)

Windows
  CreateFile(name, CREATE)
  CreateFile(name, OPEN)
  ReadFile(handle)
  WriteFile(handle)
  FlushFileBuffers(handle)
  SetFilePointer(handle)
  CloseHandle(handle)
  DeleteFile(name)
  CopyFile(name)
  MoveFile(name)
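The UNIX column above can be tried out from Python, whose os module wraps the corresponding system calls fairly directly. This is only a sketch: the file names and the helper demo_basic_ops are made up for illustration, and os.lseek and os.fsync stand in for the slide's seek and sync.

```python
import os
import tempfile

def demo_basic_ops(directory):
    """Exercise the UNIX-style calls from the slide through Python's os module."""
    old_path = os.path.join(directory, "demo.txt")      # hypothetical file names
    new_path = os.path.join(directory, "renamed.txt")

    # create/open: O_CREAT creates the file if it does not exist yet.
    fd = os.open(old_path, os.O_CREAT | os.O_RDWR, 0o600)
    os.write(fd, b"hello, file system")   # write(fd)
    os.fsync(fd)                          # sync(fd): push cached data to the device
    os.lseek(fd, 7, os.SEEK_SET)          # seek(fd, pos)
    data = os.read(fd, 4)                 # read(fd): reads from the current offset
    os.close(fd)                          # close(fd)

    os.rename(old_path, new_path)         # rename(old, new)
    existed_after_rename = os.path.exists(new_path)
    os.unlink(new_path)                   # unlink(name)
    return data, existed_after_rename, os.path.exists(new_path)

with tempfile.TemporaryDirectory() as tmp:
    result = demo_basic_ops(tmp)
```

Note that read and write take no position argument: the offset is state attached to the open file, which is why seek exists as a separate call.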

  • 6

Directories

Directories provide a way for users to organize their files *and* a convenient way for users to identify and share data
Logically directories store information like file name, size, modification time, etc. (not always kept in the directory, though)
Most file systems support hierarchical directories (/usr/local/bin or C:\WINNT)
  People like to organize information hierarchically
Recall: the OS often records a current working directory for each process
  Can therefore refer to files by absolute and relative names


  • 7

Directories are special files

Directories are files containing information to be interpreted by the file system itself
  List of files and other directories contained in this directory
  Some attributes of each child, including where to find it!
How should the list of children be organized?
  Flat file? B-tree?
Many systems keep the entries in no particular order, but this is extremely bad for large directories!

  • 8

Multiple parent directories?

One natural question is “can a file be in more than one directory”?
Soft links
  Special file interpreted by the FS (like directories in that sense)
  Tell the FS to look at a different pathname for this file
  If the file is deleted or moved, the soft link will point to the wrong place
Hard links
  Along with other file info, maintain a reference count
  Delete file = decrement reference count
  Only reclaim storage when the reference count goes to 0
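The difference between the two kinds of link can be observed on a POSIX system. The sketch below assumes a file system that supports both link types; the file names and the helper demo_links are illustrative only. The st_nlink field reported by os.stat is the reference count described above.

```python
import os
import tempfile

def demo_links(directory):
    target = os.path.join(directory, "data.txt")   # hypothetical file names
    hard = os.path.join(directory, "hard.txt")
    soft = os.path.join(directory, "soft.txt")
    with open(target, "w") as f:
        f.write("payload")

    os.link(target, hard)       # hard link: a second name for the same file
    os.symlink(target, soft)    # soft link: a special file holding a pathname
    count_with_two_names = os.stat(target).st_nlink

    os.unlink(target)           # remove one name; the count drops from 2 to 1
    count_after_unlink = os.stat(hard).st_nlink
    with open(hard) as f:
        still_readable = f.read()   # storage is not reclaimed while count > 0
    # The soft link still exists but now names a path that is gone.
    soft_dangles = os.path.islink(soft) and not os.path.exists(soft)
    return count_with_two_names, count_after_unlink, still_readable, soft_dangles

with tempfile.TemporaryDirectory() as tmp:
    link_result = demo_links(tmp)
```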

  • 9

Path Name Translation

To find file “/foo/bar/baz”
  Find the special root directory file (how does the FS know where that is?)
  In the special root directory file, look for entry foo; that entry will tell you where foo is
  Read special directory file foo and look for entry bar to tell you where bar is
  Read special directory file bar and look for entry baz to tell you where baz is
  Finally, find baz
FS can cache common prefixes for efficiency
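The walk above, together with prefix caching, can be sketched with nested dictionaries standing in for directory files. Everything here (the resolve function, the dictionary layout, the cache keyed by path prefix) is an assumption made for illustration, not how any particular file system stores directories.

```python
def resolve(root, path, prefix_cache=None):
    """Walk one directory per path component, starting at the root directory.

    Directories are dicts mapping a name to a child; anything else is a file.
    Returns (node, number of directory lookups performed).
    """
    parts = [p for p in path.split("/") if p]
    node, start = root, 0
    if prefix_cache is not None:
        # Reuse the longest directory prefix that was translated before.
        for i in range(len(parts) - 1, 0, -1):
            cached = prefix_cache.get("/" + "/".join(parts[:i]))
            if cached is not None:
                node, start = cached, i
                break
    lookups = 0
    for i in range(start, len(parts)):
        if not isinstance(node, dict) or parts[i] not in node:
            raise FileNotFoundError(path)
        node = node[parts[i]]
        lookups += 1
        if prefix_cache is not None and isinstance(node, dict):
            prefix_cache["/" + "/".join(parts[: i + 1])] = node
    return node, lookups

root = {"foo": {"bar": {"baz": b"contents"}}}
cache = {}
first = resolve(root, "/foo/bar/baz", cache)    # walks foo, bar, baz
second = resolve(root, "/foo/bar/baz", cache)   # starts from cached /foo/bar
```

With the cache warm, the second translation needs a single lookup instead of three, which is the saving the last bullet refers to.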

  • 10

File Buffer Cache

Cache data read
  Exploit temporal locality of access by caching pathname translation information
  Exploit temporal locality of access by leaving recently accessed chunks of a file in memory in hopes that they will be accessed again (let the app give a hint if not?)
  Exploit spatial locality of access by bringing in large chunks of a file at once
Data written is also cached
  For correctness, should be write-through to disk
  Normally is write-behind
    FS periodically walks the buffer cache and “flushes” things older than 30 seconds to disk
    Unreliable!
Usually LRU replacement
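A toy model may make the interaction of LRU replacement and write-behind concrete. The BufferCache class below is a sketch under simplifying assumptions: a dict stands in for the disk, time is passed in explicitly rather than read from a clock, and a dirty block is written back when it is evicted.

```python
from collections import OrderedDict

class BufferCache:
    """Toy block cache: LRU replacement, write-behind flushing of old dirty blocks."""

    def __init__(self, disk, capacity, flush_age=30.0):
        self.disk = disk                # dict: block number -> data (stands in for the disk)
        self.capacity = capacity
        self.flush_age = flush_age
        self.blocks = OrderedDict()     # block number -> data, least recently used first
        self.dirty_since = {}           # block number -> time of first unflushed write

    def _touch(self, blockno, data):
        self.blocks[blockno] = data
        self.blocks.move_to_end(blockno)            # mark as most recently used
        while len(self.blocks) > self.capacity:
            victim, victim_data = self.blocks.popitem(last=False)
            if victim in self.dirty_since:          # never drop unwritten data
                self.disk[victim] = victim_data
                del self.dirty_since[victim]

    def read(self, blockno):
        data = self.blocks[blockno] if blockno in self.blocks else self.disk[blockno]
        self._touch(blockno, data)
        return data

    def write(self, blockno, data, now):
        self.dirty_since.setdefault(blockno, now)   # write-behind: disk not touched yet
        self._touch(blockno, data)

    def flush_old(self, now):
        """Periodic pass: write back blocks that have been dirty for flush_age seconds."""
        for blockno, since in list(self.dirty_since.items()):
            if now - since >= self.flush_age:
                self.disk[blockno] = self.blocks[blockno]
                del self.dirty_since[blockno]

disk = {0: b"a", 1: b"b", 2: b"c"}
cache = BufferCache(disk, capacity=2)
for n in (0, 1, 0, 2):          # block 1 is least recently used when 2 arrives
    cache.read(n)
```

A crash between write and flush_old loses the new data, which is the "Unreliable!" point on the slide.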

  • 11

File Buffer Cache

Typically the cache is system wide (shared by all processes)
  Shared libraries, executables, and other commonly accessed files are likely to be in memory already
Competes with the virtual memory system for physical memory
  Processes have less memory available to them to store code and data (address space)
  Some systems have integrated VM/FS caches

  • 12

Protection System

Most FS implement a protection scheme to control:
  Who can access a file
  How they can access it (e.g. read/write/exec/..)
Any protection system dictates whether a given action performed by a given subject on a given object should be allowed. In this case:
  Objects = files
  Principals = users
  Actions = operations
We’ll talk more about protection systems later in the semester
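As one concrete instance of the subject/object/action check, the classic UNIX owner/group/other permission bits can be evaluated as below. The slides do not commit to a particular scheme, so this is an assumed example; the function allowed and its parameters are hypothetical, and real systems add cases (superuser, ACLs) that are ignored here.

```python
import stat

def allowed(uid, gids, owner, group, mode, action):
    """Return True if the subject may perform action ('r', 'w' or 'x') on the file.

    Subject = (uid, gids); object = file described by (owner, group, mode).
    """
    user_bit, group_bit, other_bit = {
        "r": (stat.S_IRUSR, stat.S_IRGRP, stat.S_IROTH),
        "w": (stat.S_IWUSR, stat.S_IWGRP, stat.S_IWOTH),
        "x": (stat.S_IXUSR, stat.S_IXGRP, stat.S_IXOTH),
    }[action]
    if uid == owner:                 # only the owner bits apply to the owner
        return bool(mode & user_bit)
    if group in gids:                # otherwise the group bits, if a member
        return bool(mode & group_bit)
    return bool(mode & other_bit)    # everyone else

# A file with mode rw-r----- owned by user 1000, group 50.
owner_can_write = allowed(1000, {50}, 1000, 50, 0o640, "w")
member_can_read = allowed(1001, {50}, 1000, 50, 0o640, "r")
member_can_write = allowed(1001, {50}, 1000, 50, 0o640, "w")
stranger_can_read = allowed(1002, set(), 1000, 50, 0o640, "r")
```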


  • 13

File Systems

We talked a bit about disk internals
Despite complex internals, disks export a simple array of sectors
How do we go from that to a file system?

  • 14

Exercise for the Reader ☺

If you were going to build your own file system on top of a fixed-size file, what would you do?
What other information would you need to store there besides file data and directory data?
How would you organize things?

  • 15

Some questions

Would you keep each file together sequentially?
  If you did, what would you do if a file grew or shrank?
  If not, how would you keep track of the multiple pieces?

  • 16

File Layout

Option 1: All blocks in a file must be allocated contiguously
  Only need to list start and length in the directory
  Causes fragmentation of free space
  Also causes copying as files grow
Option 2: Allow files to be broken into pieces
  Fixed-size pieces (blocks) or variable-size pieces (extents)?
  If we are going to allow files to be broken into multiple pieces, how will we keep track of them?

  • 17

Blocks or Extents?

If fixed-size blocks, then store just the starting location for each one
If variable-size extents, need to store starting location and length
  But maybe you can have fewer extents?
Blocks = less external fragmentation
Extents = less internal fragmentation

  • 18

Finding all Parts of a File

Option 2A: List all blocks in the directory
  Directories will get pretty big, and the directory must also change every time a file grows or shrinks
Option 2B: Linked structure
  Directory points to first piece (block or extent), first one points to second one
  File can expand and contract without copying
  Good for sequential access, terrible for other kinds
Option 2C: Indexed structure
  Directory points to index block, which contains pointers to data blocks
  Good for random access as well as sequential access
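The cost difference between options 2B and 2C shows up when reading the n-th block of a file. In the sketch below a dict stands in for the disk and the second return value counts simulated block reads; the two function names and the block layouts are assumptions for illustration.

```python
def read_block_linked(disk, first, n):
    """Linked layout: each block is (data, next_block_number).

    Reaching block n means reading every block before it.
    """
    blockno, reads = first, 0
    for _ in range(n):
        blockno = disk[blockno][1]    # read a block just to learn where the next is
        reads += 1
    return disk[blockno][0], reads + 1

def read_block_indexed(disk, index_block, n):
    """Indexed layout: the index block is a list of data block numbers."""
    pointers = disk[index_block]      # one read for the index block
    return disk[pointers[n]], 2       # one more read for the data block itself

linked_disk = {10: ("A", 11), 11: ("B", 12), 12: ("C", 13), 13: ("D", None)}
indexed_disk = {5: [20, 21, 22, 23], 20: "A", 21: "B", 22: "C", 23: "D"}

linked = read_block_linked(linked_disk, 10, 3)
indexed = read_block_indexed(indexed_disk, 5, 3)
```

The linked read costs n + 1 block reads, while the indexed read costs two regardless of n, and only one once the index block is cached.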


  • 19

Unix Inodes

Inode = index node
  Files broken into fixed-size blocks
  Inodes contain pointers to all the file's blocks
  Directory points to location of inodes
Each inode contains 15 block pointers
  First 12 point directly to data blocks
  Then singly, doubly, and triply indirect blocks
Inodes often contain information like last modification time, size, etc. that could logically be associated with a directory
Note: Indirect blocks are sometimes numbered as file blocks -1, -2, etc.

  • 20

Inode (Not to scale!)

Size
Last Mod
Owner
Permissions
LBA of file block 1
LBA of file block 2
LBA of file block 3
...
LBA of file block 10
LBA of file block 11
LBA of file block 12
LBA of singly indirect block
LBA of doubly indirect block
LBA of triply indirect block
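Given this layout, finding the pointer for a particular file block is a matter of arithmetic. The sketch below assumes 4 KB blocks and 4-byte LBAs, so each indirect block holds 1024 pointers; the function locate and its 0-based block indexing are choices made for this example.

```python
NUM_DIRECT = 12
PTRS_PER_BLOCK = 4096 // 4   # assumed: 4 KB blocks, 4-byte LBAs

def locate(block_index):
    """Map a 0-based file block index to (level, indices along the pointer chain).

    level 0 = direct pointer, 1 = singly, 2 = doubly, 3 = triply indirect.
    The indices say which slot to follow at each step.
    """
    if block_index < 0:
        raise ValueError("negative block index")
    if block_index < NUM_DIRECT:
        return 0, [block_index]
    remaining = block_index - NUM_DIRECT
    for level in (1, 2, 3):
        span = PTRS_PER_BLOCK ** level        # data blocks reachable at this level
        if remaining < span:
            indices = []
            for _ in range(level):
                span //= PTRS_PER_BLOCK
                indices.append(remaining // span)
                remaining %= span
            return level, indices
        remaining -= span
    raise ValueError("beyond the maximum file size")
```

For example, locate(12) gives level 1 with slot 0: the thirteenth block of a file is the first one that requires reading the singly indirect block.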

  • 21

Max file size?

Assume:
  4K data pages and indirect blocks
  LBAs are typically 4 bytes
First 48K directly reachable from the inode
One singly indirect block reaches 1024 more blocks = 4 MB
One doubly indirect block points to 1024 more singly indirect blocks, which each point to 4 MB of data = 4 GB
One triply indirect block points to 1024 more doubly indirect blocks, which each point to 4 GB of data = 4 TB
Max file or directory size = 4 TB + 4 GB + 4 MB + 48 K
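The arithmetic above can be checked mechanically. The sketch uses the slide's assumptions (4 KB blocks, 4-byte LBAs, 12 direct pointers); the constant and function names are made up for the example.

```python
BLOCK_SIZE = 4 * 1024    # bytes per data or indirect block
LBA_SIZE = 4             # bytes per block pointer
NUM_DIRECT = 12          # direct pointers in the inode

def max_file_size():
    """Return the bytes reachable through each pointer class, and the total."""
    ptrs = BLOCK_SIZE // LBA_SIZE            # 1024 pointers per indirect block
    direct = NUM_DIRECT * BLOCK_SIZE         # 48 KB
    single = ptrs * BLOCK_SIZE               # 4 MB
    double = ptrs ** 2 * BLOCK_SIZE          # 4 GB
    triple = ptrs ** 3 * BLOCK_SIZE          # 4 TB
    return direct, single, double, triple, direct + single + double + triple

direct, single, double, triple, total = max_file_size()
```

The triply indirect term dominates: the other three together add well under 1% to the total.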

  • 22

Other index structures?

Why this particular index structure?
  Direct pointers to the first 12 blocks are good for small files
Could you imagine other index structures?
  Definitely
Flat vs. multilevel index structures?

  • 23

Path Name Traversal Revisited

Directories are just special files, so they have inodes of their own
To find “/foo/bar/baz” (assuming nothing is cached)
  Look in the superblock and find the location of the inode for /
  Read the inode for /, find the location of the first data block of /
  Read the first data block of /
  Repeat with all blocks of / until you find the entry for foo
    If you read as far as block 13, you must read the singly indirect block first…
  When found, the entry for foo gives the address of the inode for foo; read the inode for foo, …
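Counting the disk reads in this traversal shows why caching matters. The sketch below models a disk as nested dicts with a superblock, an inode table, and data blocks; the layout, the inode and block numbers, and the lookup function are all invented for illustration, and only direct blocks are modeled.

```python
def lookup(disk, path):
    """Resolve an absolute path to an inode number, counting simulated disk reads."""
    reads = 1                                   # read the superblock
    ino = disk["superblock"]["root_inode"]
    for name in [p for p in path.split("/") if p]:
        inode = disk["inodes"][ino]
        reads += 1                              # read the directory's inode
        for blockno in inode["blocks"]:         # scan its data blocks in order
            entries = disk["blocks"][blockno]
            reads += 1
            if name in entries:
                ino = entries[name]             # entry gives the child's inode number
                break
        else:
            raise FileNotFoundError(path)
    return ino, reads

disk = {
    "superblock": {"root_inode": 2},
    "inodes": {2: {"blocks": [100, 101]}, 5: {"blocks": [110]},
               7: {"blocks": [120]}, 9: {"blocks": [130]}},
    "blocks": {100: {"etc": 3}, 101: {"foo": 5}, 110: {"bar": 7},
               120: {"baz": 9}, 130: b"data"},
}
found = lookup(disk, "/foo/bar/baz")
```

Here eight reads are needed before the inode for baz is even touched, because the entry for foo happens to sit in the second data block of the root directory.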

  • 24

More questions

Remember you are building your own FS
When someone creates a new file, where would you put it?
  How would you find free space?
  How much space would you look for (how do you know how big the file will get?)
  Would you take the first acceptable chunk of free space you found?
If someone deleted a file, what would happen?


  • 25

Keeping track of free space

Every time you need free space, start at / and see what's not taken? No!
Linked list of free space
  Just put freed blocks on the end and pull blocks from the front to allocate
  Hard to manage spatial locality (why important?)
  If the middle of the list gets corrupted, how to repair?
Bitmap
  Divide all space into blocks
  Bit per block (0 = free; 1 = allocated)
  Easy to find groups of nearby blocks
  Useful for disk recovery
  How big? If you had a 40 GB disk, then you have 10M 4K blocks; if each needs 1 bit, then 10M/8 = 1.25 MB for the bitmap
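A bitmap allocator along these lines fits in a few lines of code. The Bitmap class and the first-fit search below are one possible sketch rather than any specific file system's allocator; bitmap_bytes repeats the size estimate from the last bullet using binary units.

```python
class Bitmap:
    """One bit per block: 0 = free, 1 = allocated."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)

    def is_allocated(self, b):
        return bool(self.bits[b // 8] & (1 << (b % 8)))

    def set(self, b, allocated):
        if allocated:
            self.bits[b // 8] |= 1 << (b % 8)
        else:
            self.bits[b // 8] &= ~(1 << (b % 8)) & 0xFF

    def allocate_run(self, length):
        """First-fit search for `length` contiguous free blocks.

        Marks them allocated and returns the first block number, or None.
        """
        run_start, run_len = 0, 0
        for b in range(self.num_blocks):
            if self.is_allocated(b):
                run_start, run_len = b + 1, 0    # run broken; restart after b
                continue
            run_len += 1
            if run_len == length:
                for i in range(run_start, run_start + length):
                    self.set(i, True)
                return run_start
        return None

def bitmap_bytes(disk_bytes, block_bytes):
    """Size of the bitmap needed to cover a disk, one bit per block."""
    return (disk_bytes // block_bytes + 7) // 8

bm = Bitmap(16)
bm.set(2, True)
first_run = bm.allocate_run(2)    # blocks 0 and 1 are free
second_run = bm.allocate_run(3)   # next free run of three starts after block 2
```

Because neighboring blocks are neighboring bits, asking for a contiguous run is cheap, which is the spatial-locality advantage over the linked free list.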

  • 26

Answers?

We are going to look at two different file systems
  Fast File System (FFS)
  Log-Structured File Systems (LFS)
Remember *your* answers to the questions we just posed. At the end of today, if you think your answers are better, then maybe you will go on to write your own file system (MeFS)

  • 27

Right answer?

Remember, much like CPU or disk scheduling algorithms, the “right” answer depends a lot on your workload
  Are there many small files (< 48 K) or all big files?
  Are files usually read sequentially from the beginning or randomly?
  Are files read and written equally?
  Are files in the same directory usually accessed together?
How to find out?
  Analyze static FS snapshots
  Take FS access traces
  Even then, which systems do you look at? Are they “representative”?

  • 28

Outtakes

  • 29

Some questions

What would your directory structure look like (directory_t?)
How would you find the root directory?
  What if the root directory got really, really big?

  • 30

File Interfaces

Structured Record files
  Think of a file as containing data structures, not bytes
Record-Stream Translation
  When accessing a record, read/write the correct number of bytes
Byte Stream Files
  Appear to the user as a stream of bytes
Stream-Block Translation
  If you read/write a single byte from a block, you get the whole thing
Raw Storage Interface
  Sectors or FS blocks


  • 31

Byte Stream File Interface

fileId = open(fileName)
Close(fileId)
Seek(fileId, filePosition)
Read(fileId, buffer, length)
Write(fileId, buffer, length)

  • 32

Record Oriented File Interface

fileId = open(fileName)
Close(fileId)
getRecord(fileId, record)
putRecord(fileId, record)
Seek(fileId, recordNum)
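A record-oriented interface can be layered on a byte stream by multiplying the record number by a fixed record size, which is the record-stream translation mentioned earlier. The record format below (a 4-byte id plus a 16-byte name) and the function names are hypothetical, chosen only to make the sketch concrete.

```python
import io
import struct

# Assumed fixed-size record: little-endian unsigned id, then a 16-byte name field.
RECORD = struct.Struct("<I16s")

def put_record(stream, record_num, rec_id, name):
    stream.seek(record_num * RECORD.size)     # Seek(fileId, recordNum) in byte terms
    stream.write(RECORD.pack(rec_id, name.encode()))

def get_record(stream, record_num):
    stream.seek(record_num * RECORD.size)
    rec_id, raw = RECORD.unpack(stream.read(RECORD.size))
    return rec_id, raw.rstrip(b"\0").decode()

stream = io.BytesIO()                 # any seekable byte stream would do
put_record(stream, 0, 1, "alice")
put_record(stream, 1, 2, "bob")
put_record(stream, 0, 7, "carol")     # overwrite record 0 in place
```

Variable-length records would need an index or length prefixes instead of this simple multiplication.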

  • 33

Open system call

fileId = Open(fileName)
  Read the directory containing the base file name
  Get the inode of the file
  Determine permission to open the file (different modes: read, read/write, append, etc.)
  OS creates an internal file descriptor for this file, or may find that one already exists if other processes are accessing this file
  Allocate resources (buffers, etc.) to support access to the file
  Create an entry in the process control block for this file; the process control block has an array of file information
  Return the index into the array in the PCB as the fileId

  • 34

Accessing an open file

fileId = open(“/foo/bar/baz”, flags);
read(fileId, buffer, len);
syscall(READ, fileId, buffer, len);

[Figure: the Process Control Block (per-process info) holds an array of open-file entries indexed 0 (stdin), 1 (stdout), 2 (stderr), 3, …, N. Each entry holds the current byte offset and a pointer to FS info. The open file info (FS info) holds the inode number; pointers to cached copies of the inode, indirect blocks, data blocks, etc.; the time of last access; and the number of processes using the file.]
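The two-level bookkeeping described here, a per-process array of offsets pointing at shared per-file info, can be modeled as follows. The class names, the dict used as the system-wide table, and the in-memory data field are assumptions for this sketch; directory lookup and permission checks are left out.

```python
class OpenFile:
    """System-wide open file info (the 'FS info' side of the picture)."""
    def __init__(self, inode_number, data):
        self.inode_number = inode_number
        self.data = data             # stands in for cached inode and data blocks
        self.num_processes = 0       # how many processes have the file open

class Process:
    def __init__(self):
        # Per-process array; slots 0-2 are reserved for stdin, stdout, stderr.
        self.fd_table = ["stdin", "stdout", "stderr"]

    def open(self, system_table, inode_number, data=b""):
        info = system_table.get(inode_number)
        if info is None:             # first opener creates the shared entry
            info = system_table[inode_number] = OpenFile(inode_number, data)
        info.num_processes += 1
        self.fd_table.append({"offset": 0, "info": info})
        return len(self.fd_table) - 1    # the fileId is just an index into the array

    def read(self, fd, length):
        entry = self.fd_table[fd]
        start = entry["offset"]
        chunk = entry["info"].data[start:start + length]
        entry["offset"] += len(chunk)    # the offset is private to this process
        return chunk

table = {}
p1, p2 = Process(), Process()
fd1 = p1.open(table, 42, b"abcdef")
fd2 = p2.open(table, 42)                 # finds the existing shared entry
```

Each process advances its own offset while both share one copy of the cached file state, so two readers of the same file do not interfere with each other's position.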

  • 35

Other kinds of “special files”

Device special files