StoringData:DisksandFiles Chapter7 Yea,fromthetableofmymemory - - PDF document

storing data disks and files
SMART_READER_LITE
LIVE PREVIEW

StoringData:DisksandFiles Chapter7 Yea,fromthetableofmymemory - - PDF document

StoringData:DisksandFiles Chapter7 Yea,fromthetableofmymemory Illwipeawayalltrivialfondrecords. -- Shakespeare, Hamlet


slide-1
SLIDE 1

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 1

StoringData:DisksandFiles

Chapter7

“Yea,fromthetableofmymemory I’llwipeawayalltrivialfondrecords.”

  • - Shakespeare,Hamlet

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 2

DisksandFiles

  • DBMSstoresinformationon(“hard”)disks.
  • ThishasmajorimplicationsforDBMSdesign!

READ:transferdatafromdisktomainmemory(RAM).

WRITE:transferdatafromRAMtodisk.

Botharehigh-costoperations,relativetoin-memory

  • perations,somustbeplannedcarefully!

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 3

WhyNotStoreEverythinginMainMemory?

  • Coststoomuch.$1000willbuyyoueither

128MBofRAMor7.5GBofdisktoday.

  • Mainmemoryisvolatile.Wewantdatatobe

savedbetweenruns.(Obviously!)

  • Typicalstoragehierarchy:

Mainmemory(RAM)forcurrentlyuseddata.

Diskforthemaindatabase(secondarystorage).

Tapesforarchivingolderversionsofthedata (tertiarystorage).

slide-2
SLIDE 2

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 4

Disks

  • Secondarystoragedeviceofchoice.
  • Mainadvantageovertapes:randomaccess vs.

sequential.

  • Dataisstoredandretrievedinunitscalled

diskblocksorpages.

  • UnlikeRAM,timetoretrieveadiskpage

variesdependinguponlocationondisk.

Therefore,relativeplacementofpagesondiskhas majorimpactonDBMSperformance!

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 5

ComponentsofaDisk

✁ ✁ ✂ ✂ ✂

Platters

Theplattersspin(say,90rps).

Spindle

Thearmassemblyis movedinorouttoposition aheadonadesiredtrack. Tracksunderheadsmake acylinder (imaginary!).

Diskhead Armmovement Armassembly

Onlyonehead reads/writesatany

  • netime.

Tracks Sector

Blocksizeisamultiple

  • fsectorsize(whichisfixed).

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 6

AccessingaDiskPage

  • Timetoaccess(read/write)adiskblock:

seektime(movingarmstopositiondiskheadontrack)

rotationaldelay(waitingforblocktorotateunderhead)

transfertime(actuallymovingdatato/fromdisksurface)

  • Seektimeandrotationaldelaydominate.

Seektimevariesfromabout1to20msec

Rotationaldelayvariesfrom0to10msec

Transferrateisabout1msecper4KBpage

  • KeytolowerI/Ocost:reduceseek/rotation

delays!Hardwarevs.softwaresolutions?

slide-3
SLIDE 3

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 7

ArrangingPagesonDisk

  • `Next’blockconcept:

blocksonsametrack,followedby

blocksonsamecylinder,followedby

blocksonadjacentcylinder

  • Blocksinafileshouldbearranged

sequentiallyondisk(by`next’),tominimize seekandrotationaldelay.

  • Forasequentialscan,pre-fetching several

pagesatatimeisabigwin!

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 8

RAID

  • DiskArray:Arrangementofseveraldisks

thatgivesabstractionofasingle,largedisk.

  • Goals:Increaseperformanceandreliability.
  • Twomaintechniques:

Datastriping:Dataispartitioned;sizeofa partitioniscalledthestripingunit.Partitionsare distributedoverseveraldisks.

Redundancy:Moredisks=>morefailures. Redundantinformationallowsreconstructionof dataifadiskfails.

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 9

RAIDLevels

  • Level0:Noredundancy
  • Level1:Mirrored(twoidenticalcopies)

Eachdiskhasamirrorimage(checkdisk)

Parallelreads,awriteinvolvestwodisks.

Maximumtransferrate=transferrateofonedisk

  • Level0+1:StripingandMirroring

Parallelreads,awriteinvolvestwodisks.

Maximumtransferrate=aggregatebandwidth

slide-4
SLIDE 4

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 10

RAIDLevels(Contd.)

  • Level3:Bit-InterleavedParity

StripingUnit:Onebit.Onecheckdisk.

Eachreadandwriterequestinvolvesalldisks;disk arraycanprocessonerequestatatime.

  • Level4:Block-InterleavedParity

StripingUnit:Onediskblock.Onecheckdisk.

Parallelreadspossibleforsmallrequests,large requestscanutilizefullbandwidth

Writesinvolvemodifiedblockandcheckdisk

  • Level5:Block-InterleavedDistributedParity

SimilartoRAIDLevel4,butparityblocksare distributedoveralldisks

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 11

DiskSpaceManagement

  • LowestlayerofDBMSsoftwaremanagesspace
  • ndisk.
  • Higherlevelscalluponthislayerto:

allocate/de-allocateapage

read/writeapage

  • Requestforasequence ofpagesmustbesatisfied

byallocatingthepagessequentiallyondisk! Higherlevelsdon’tneedtoknowhowthisis done,orhowfreespaceismanaged.

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 12

BufferManagementinaDBMS

  • DatamustbeinRAMforDBMStooperateonit!
  • Tableof<frame#, pageid>pairsismaintained.

DB

MAINMEMORY DISK diskpage freeframe

PageRequestsfromHigherLevels

BUFFERPOOL choiceofframedictated byreplacementpolicy

slide-5
SLIDE 5

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 13

WhenaPageisRequested...

  • Ifrequestedpageisnotinpool:

Chooseaframeforreplacement

Ifframeisdirty,writeittodisk

Readrequestedpageintochosenframe

  • Pinthepageandreturnitsaddress.

* Ifrequestscanbepredicted(e.g.,sequentialscans)

pagescanbepre-fetched severalpagesatatime!

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 14

MoreonBufferManagement

  • Requestorofpagemustunpinit,andindicate

whetherpagehasbeenmodified:

dirty bitisusedforthis.

  • Pageinpoolmayberequestedmanytimes,

apincountisused.Apageisacandidatefor replacement iff pincount=0.

  • CC&recoverymayentailadditionalI/O

whenaframeischosenforreplacement. (Write-AheadLogprotocol;morelater.)

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 15

BufferReplacementPolicy

  • Frameischosenforreplacementbya

replacementpolicy:

Least-recently-used(LRU),Clock,MRUetc.

  • Policycanhavebigimpacton#ofI/O’s;

dependsontheaccesspattern.

  • Sequentialflooding:Nastysituationcausedby

LRU+repeatedsequentialscans.

#bufferframes<#pagesinfilemeanseachpage requestcausesanI/O.MRUmuchbetterinthis situation(butnotinallsituations,ofcourse).

slide-6
SLIDE 6

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 16

DBMSvs.OSFileSystem

OSdoesdiskspace&buffermgmt:whynotlet OSmanagethesetasks?

  • DifferencesinOSsupport:portabilityissues
  • Somelimitations,e.g.,filescan’tspandisks.
  • BuffermanagementinDBMSrequiresabilityto:

pinapageinbufferpool,forceapagetodisk (importantforimplementingCC&recovery),

adjustreplacementpolicy,andpre-fetchpagesbased

  • naccesspatternsintypicalDBoperations.

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 17

RecordFormats:FixedLength

  • Informationaboutfieldtypessameforall

recordsinafile;storedinsystem catalogs.

  • Finding i’th fieldrequiresscanofrecord.

Baseaddress(B)

L1 L2 L3 L4 F1 F2 F3 F4

Address=B+L1+L2

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 18

RecordFormats:VariableLength

  • Twoalternativeformats(#fieldsisfixed):

* Secondoffersdirectaccessto i’th field,efficientstorage

  • fnulls (specialdon’tknowvalue);smalldirectoryoverhead.

4 $ $ $ $

Field Count

FieldsDelimitedbySpecialSymbols

F1F2F3 F4 F1F2F3F4

ArrayofFieldOffsets

slide-7
SLIDE 7

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 19

PageFormats:FixedLengthRecords

* Recordid=<pageid,slot#>.Infirst

alternative,movingrecordsforfreespace managementchangesrid;maynotbeacceptable.

Slot1 Slot2 SlotN

... ...

N M 1 ... M...321 PACKED UNPACKED,BITMAP Slot1 Slot2 SlotN Free Space SlotM 1 1 number

  • frecords

number

  • fslots

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 20

PageFormats:VariableLengthRecords

* Canmoverecordsonpagewithoutchangingrid;

so,attractiveforfixed-lengthrecordstoo.

Pagei Rid=(i,N) Rid=(i,2) Rid=(i,1) Pointer tostart

  • ffree

space

SLOTDIRECTORY

N ...21 20 16 24 N #slots

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 21

FilesofRecords

  • PageorblockisOKwhendoingI/O,but

higherlevelsofDBMSoperateonrecords,and filesofrecords.

FILE:Acollectionofpages,eachcontaininga

collectionofrecords.Mustsupport:

insert/delete/modifyrecord

readaparticularrecord(specifiedusingrecordid)

scanallrecords(possiblywithsomeconditionson therecordstoberetrieved)

slide-8
SLIDE 8

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 22

Unordered(Heap)Files

  • Simplestfilestructurecontainsrecordsinno

particularorder.

  • Asfilegrowsandshrinks,diskpagesare

allocatedandde-allocated.

  • Tosupportrecordleveloperations,wemust:

keeptrackofthepages inafile

keeptrackoffreespaceonpages

keeptrackoftherecords onapage

  • Therearemanyalternativesforkeepingtrack
  • fthis.

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 23

HeapFileImplementedasaList

  • TheheaderpageidandHeapfilenamemust

bestoredsomeplace.

  • Eachpagecontains2`pointers’plusdata.

Header Page Data Page Data Page Data Page Data Page Data Page Data Page Pageswith FreeSpace FullPages

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 24

HeapFileUsingaPageDirectory

  • Theentryforapagecanincludethenumber
  • ffreebytesonthepage.
  • Thedirectoryisacollectionofpages;linked

listimplementationisjustonealternative.

MuchsmallerthanlinkedlistofallHFpages!

Data Page1 Data Page2 Data PageN Header Page

DIRECTORY

slide-9
SLIDE 9

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 25

SystemCatalogs

  • Foreachindex:

structure(e.g.,B+tree)andsearchkeyfields

  • Foreachrelation:

name,filename,filestructure(e.g.,Heapfile)

attributenameandtype,foreachattribute

indexname,foreachindex

integrityconstraints

  • Foreachview:

viewnameanddefinition

  • Plusstatistics,authorization,bufferpoolsize,etc.

* Catalogsarethemselvesstoredasrelations!

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 26

Attr_Cat(attr_name, rel_name,type,position)

attr_name rel_name type position attr_name Attribute_Cat string 1 rel_name Attribute_Cat string 2 type Attribute_Cat string 3 position Attribute_Cat integer 4 sid Students string 1 name Students string 2 login Students string 3 age Students integer 4 gpa Students real 5 fid Faculty string 1 fname Faculty string 2 sal Faculty real 3

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 27

Summary

  • Disksprovidecheap,non-volatilestorage.

Randomaccess,butcostdependsonlocationofpage

  • ndisk;importanttoarrangedatasequentiallyto

minimizeseek androtationdelays.

  • BuffermanagerbringspagesintoRAM.

PagestaysinRAMuntilreleasedbyrequestor.

Writtentodiskwhenframechosenforreplacement (whichissometimeafterrequestorreleasesthepage).

Choiceofframetoreplacebasedonreplacementpolicy.

Triestopre-fetch severalpagesatatime.

slide-10
SLIDE 10

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 28

Summary(Contd.)

  • DBMSvs.OSFileSupport

DBMSneedsfeaturesnotfoundinmany OS’s,e.g., forcingapagetodisk,controllingtheorderof pagewritestodisk,filesspanningdisks,abilityto controlpre-fetchingandpagereplacementpolicy basedonpredictableaccesspatterns,etc.

  • Variablelengthrecordformatwithfieldoffset

directoryofferssupportfordirectaccessto i’th fieldandnullvalues.

  • Slottedpageformatsupportsvariablelength

recordsandallowsrecordstomoveonpage.

DatabaseManagementSystems3ed,R. Ramakrishnan andJ.Gehrke 29

Summary(Contd.)

  • Filelayerkeepstrackofpagesinafile,and

supportsabstractionofacollectionofrecords.

Pageswithfreespaceidentifiedusinglinkedlist

  • rdirectorystructure(similartohowpagesinfile

arekepttrackof).

  • Indexessupportefficientretrievalofrecords

basedonthevaluesinsomefields.

  • Catalogrelationsstoreinformationabout

relations,indexesandviews.(Informationthat iscommontoallrecordsinagivencollection.)