Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke - - PowerPoint PPT Presentation



SLIDE 1

1 340151 Big Data & Cloud Computing (P. Baumann)

Storing Data: Disks and Files

Garcia-Molina, Ullman, Widom; Ramakrishnan/Gehrke, Ch. 9

"Digital information lasts forever - or five years, whichever comes first."
-- Jeff Rothenberg, RAND Corp., 1997
SLIDE 2

Why Not Everything in Main Memory?

  • Costs too much
  • [Rama/Gehrke]: $1000 will buy you either 128 MB of RAM or 7.5 GB of disk
  • Today: 80 EUR will buy you either 4 GB of RAM or 1 TB of disk
  • …but today we have multi-terabyte databases!
  • Main memory is volatile
  • want data to be saved between runs (obviously!)
  • Typical storage hierarchy:
  • Main memory (RAM) for currently used data
  • Disk for main database (secondary storage)
  • Tapes for archiving older versions of data (tertiary storage)
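
The price comparison on this slide can be turned into a cost-per-gigabyte ratio. The following is a minimal sketch, assuming decimal units (1 GB = 1000 MB, 1 TB = 1000 GB); the helper name `cost_per_gb` is made up for illustration and does not come from the slides.

```python
def cost_per_gb(price, capacity_gb):
    """Price paid per gigabyte of capacity."""
    return price / capacity_gb


# Figures quoted on the slide; decimal units are an assumption.
old_ratio = cost_per_gb(1000, 0.128) / cost_per_gb(1000, 7.5)   # [Rama/Gehrke] figures
new_ratio = cost_per_gb(80, 4) / cost_per_gb(80, 1000)          # "today" figures

print(f"RAM vs. disk cost per GB, [Rama/Gehrke]: about {old_ratio:.0f}x")
print(f"RAM vs. disk cost per GB, today:         about {new_ratio:.0f}x")
```

Under these assumptions RAM costs roughly 59 times as much per gigabyte as disk in the older figures and about 250 times as much in the newer ones, so the cost argument against keeping everything in main memory still holds.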

SLIDE 3

Storage Capacity

  • Absolute times as of 2003, but ratios still ~ same
SLIDE 4

Storage Cost

  • Again, absolute values as of 2003, but ratios still ~ same
SLIDE 5

Storage Hierarchies

[Figure: storage hierarchy. Levels shown: main memory, magnetic disks, RAID systems, magnetic tapes, optical media, magneto-optical media; grouped into primary, secondary, and tertiary memory. Along the storage capacity axis, levels get larger, cheaper, and slower.]

SLIDE 6

Numbers

source: http://carlos.bueno.org/2014/11/cache.html

SLIDE 7

Nearline (Tertiary) Storage

  • Usually tape
  • Formerly reel; today: cartridge
  • Capacity: 10 GB ... ~6 TB per tape
  • Tape robots
  • HSM = Hierarchical Storage Management

  • multi-Petabytes
SLIDE 8

Caching & Virtual Memory

  • Cache: fast memory holding frequently used parts of a slower, larger memory
  • small (L1) cache holds a few kilobytes of the memory "most recently used" by the processor
  • Most operating systems keep the most recently used "pages" of memory in main memory and put the rest on disk
  • Virtual memory
  • programs don't know whether they are accessing main memory or a page on secondary memory (most operating systems)
  • Database systems usually take explicit control over secondary memory access
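
The "keep the most recently used pages in fast memory" idea can be sketched as a small least-recently-used (LRU) page cache. This is an illustrative sketch only, not how any particular operating system or DBMS implements its buffer management; the class name `LRUPageCache` and the `read_page` callback are hypothetical.

```python
from collections import OrderedDict


class LRUPageCache:
    """Keep at most `capacity` pages in memory; evict the least recently used."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page      # hypothetical callback: page_id -> bytes
        self.pages = OrderedDict()      # page_id -> contents, least recent first
        self.hits = 0
        self.misses = 0

    def get(self, page_id):
        if page_id in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_id)    # mark as most recently used
            return self.pages[page_id]
        self.misses += 1
        data = self.read_page(page_id)         # slow path: fetch from "disk"
        self.pages[page_id] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)     # evict the least recently used page
        return data


# Usage: a two-page cache in front of a fake "disk".
cache = LRUPageCache(capacity=2, read_page=lambda pid: f"page-{pid}".encode())
for pid in [1, 2, 1, 3, 2]:
    cache.get(pid)
print(cache.hits, cache.misses, list(cache.pages))
```

A database system that takes explicit control over secondary memory access would typically choose its own replacement policy instead of relying on the operating system; LRU is used here only because it matches the "most recently used" wording on the slide.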
SLIDE 9

Where Databases Reside

  • Hard disk is the secondary storage device of choice
  • Many flavors: floppy (hard, soft); Winchester; RAM disks; optical, CD-ROM; arrays
  • Main advantage over tapes: random access vs. sequential
  • Data is stored and retrieved in units called disk blocks or pages
  • Unlike RAM, time to retrieve a disk page varies depending upon its location on disk
  • relative placement of pages on disk has major impact on DBMS performance!

SLIDE 10

The Miracle Called "Hard Disk"

  • Disk head contains magnet, hovering over spinning platter
  • flight height: 10-20 nm
  • (x 5,000 gives the diameter of one hair!)
SLIDE 11

Components of a Disk

  • platters spin
  • arm assembly moves in or out to position a head on the desired track
  • Tracks under heads = a cylinder (imaginary!)
  • Block size = N * sector size (sector size is fixed)

  • ...typical numbers?
SLIDE 12

Typical Numbers

  • Diameter: 1 inch ... 15 inches
  • Cylinders: 40 (floppy) ... 20,000
  • Surfaces: 1 (old CDs) ... 2 (floppies) ... 30
  • Sector size: 512 B ... 50 kB
  • Capacity: 360 kB (old floppy) ... 4 TB

SLIDE 13

Disk Access Time

[Figure: "I want block X" -> ? -> "block X in memory"]

SLIDE 14

Disk Access Time

Time = Seek Time + Rotational Delay + Transfer Time + Other

SLIDE 15

Seek Time

SLIDE 16

Average Random Seek Time

  • Typical S: 10 ms ... 40 ms (= millions of times RAM access!)

SLIDE 17

Average Rotational Delay

  • R = 1/2 revolution
  • typical R = 4.17 ms (7,200 RPM)
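
The figure for 7,200 RPM follows directly from the half-revolution rule. A minimal sketch; the function name is illustrative, and the spindle speeds other than 7,200 RPM are assumed examples, not values from the slides.

```python
def avg_rotational_delay_ms(rpm):
    """Average rotational delay: half of one revolution, in milliseconds."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


# 7,200 RPM is the slide's example; the other speeds are assumed for comparison.
for rpm in (5400, 7200, 10000, 15000):
    print(f"{rpm:>6} RPM: {avg_rotational_delay_ms(rpm):.2f} ms")
```

At 7,200 RPM one revolution takes about 8.33 ms, so on average the head waits about 4.17 ms for the desired block to come around.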

SLIDE 18

Transfer Rate

  • Transfer rate: t
  • typical t: 10 ... 50 MB/second
  • transfer time T:

$$T = \frac{\text{block size}}{t}$$

  • Ex: block size 32 kB, t = 32 MB/second

transfer time = …?
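
The exercise on this slide can be worked through numerically. A minimal sketch, assuming decimal units (1 kB = 1000 B, 1 MB = 1000 kB); `transfer_time_ms` is an illustrative helper name.

```python
def transfer_time_ms(block_size_bytes, rate_bytes_per_s):
    """Transfer time T = block size / transfer rate, in milliseconds."""
    return block_size_bytes / rate_bytes_per_s * 1000.0


KB = 1000          # assumption: decimal kilobyte
MB = 1000 * KB     # assumption: decimal megabyte

t_ms = transfer_time_ms(32 * KB, 32 * MB)
print(f"transfer time: {t_ms:.2f} ms")
```

With these units the block takes about 1 ms to transfer. If kB and MB are both read as binary units, the result is slightly smaller (about 0.98 ms); either way, it is well below the seek time and rotational delay.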

SLIDE 19

Other Delays

  • CPU time to issue I/O
  • Contention for controller
  • Contention for bus, memory
  • Typical value (relative to other values):
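
Putting the four terms together gives a simple cost model for one random block access. This is a sketch under stated assumptions: the parameter values in the usage lines (10 ms seek, 7,200 RPM, 32 kB block, 32 MB/s, decimal units) are picked from the ranges on the preceding slides, and the "Other" term defaults to zero here purely as an assumption of this sketch.

```python
def access_time_ms(seek_ms, rpm, block_bytes, rate_bytes_per_s, other_ms=0.0):
    """Time = Seek Time + Rotational Delay + Transfer Time + Other (in ms)."""
    rotational_ms = 0.5 * 60_000.0 / rpm                    # half a revolution
    transfer_ms = block_bytes / rate_bytes_per_s * 1000.0   # block size / t
    return seek_ms + rotational_ms + transfer_ms + other_ms


# Assumed example values, not measurements.
total = access_time_ms(seek_ms=10.0, rpm=7200,
                       block_bytes=32_000, rate_bytes_per_s=32_000_000)
print(f"random access to one block: {total:.2f} ms")
```

With these numbers the seek and the rotational delay account for roughly 14 of the 15 ms, which is why the placement of blocks on disk matters so much.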

SLIDE 20

Sequential Read?

  • So far: random block access

  • What about: Reading next block?
  • Disks optimized towards "consecutive" reading!
  • Blocks within track
  • Tracks within cylinder
  • Next cylinder
SLIDE 21

"Next Block" Costs

  • "Next" block concept:
  • blocks on same track, followed by
  • blocks on same cylinder, followed by
  • blocks on adjacent cylinder
  • If we don’t need to change cylinder:

$$\text{time to get block} = \frac{\text{block size}}{t} + \text{negligible}$$

  • + switch track (i.e., read next arm)
  • + once in a while, next cylinder
SLIDE 22

Random vs Sequential Read

  • Rule of Thumb:
  • Random I/O: Expensive
  • Sequential I/O: Less expensive
  • Ex: 1 KB Block:
  • Random I/O: ~ 20 ms
  • Sequential I/O: ~ 1 ms
  • relative difference is smaller for larger blocks
  • Whenever possible, arrange file blocks sequentially on disk (by "next") to minimize seek and rotational delay
  • For sequential scan, pre-fetching several pages at a time is a big win! ("burst read")
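
The claim that the relative difference shrinks for larger blocks can be checked with a simplified model. This is a sketch, not a measurement: it assumes a 10 ms seek, 7,200 RPM, and 32 MB/s (all assumed values), charges seek and rotational delay once for a sequential run, and ignores track and cylinder switches.

```python
def random_read_ms(n_blocks, block_bytes, seek_ms=10.0, rpm=7200, rate=32_000_000):
    """Every block pays seek + rotational delay + transfer."""
    rotational_ms = 0.5 * 60_000.0 / rpm
    transfer_ms = block_bytes / rate * 1000.0
    return n_blocks * (seek_ms + rotational_ms + transfer_ms)


def sequential_read_ms(n_blocks, block_bytes, seek_ms=10.0, rpm=7200, rate=32_000_000):
    """Only the first block pays seek + rotational delay; the rest just transfer."""
    rotational_ms = 0.5 * 60_000.0 / rpm
    transfer_ms = block_bytes / rate * 1000.0
    return seek_ms + rotational_ms + n_blocks * transfer_ms


# Read 1000 blocks of increasing size and compare the two access patterns.
for block_bytes in (1_000, 32_000, 1_000_000):
    r = random_read_ms(1000, block_bytes)
    s = sequential_read_ms(1000, block_bytes)
    print(f"{block_bytes:>9} B blocks: random {r:8.0f} ms, "
          f"sequential {s:8.0f} ms, ratio {r / s:6.1f}")
```

In this model the random-to-sequential ratio drops from several hundred for 1 kB blocks to below 2 for 1 MB blocks, because the transfer time starts to dominate the fixed positioning cost.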

SLIDE 23

...Writing?

  • Cost for writing is similar to cost for reading
  • ... unless we want to verify!
  • Then, need to add:

$$\frac{\text{block size}}{t} + \text{(full) rotation}$$
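
The extra cost of verification can be added to the read cost model. A minimal sketch following the slide's rule (one extra transfer plus one full rotation); the function name and the example parameters are assumptions.

```python
def write_time_ms(seek_ms, rpm, block_bytes, rate_bytes_per_s, verify=False):
    """Write cost: same as a read, plus transfer + full rotation when verifying."""
    revolution_ms = 60_000.0 / rpm
    transfer_ms = block_bytes / rate_bytes_per_s * 1000.0
    cost = seek_ms + revolution_ms / 2.0 + transfer_ms   # same as reading the block
    if verify:
        cost += transfer_ms + revolution_ms              # wait a full turn, read back
    return cost


# Assumed example values: 10 ms seek, 7,200 RPM, 32 kB block, 32 MB/s.
plain = write_time_ms(10.0, 7200, 32_000, 32_000_000)
checked = write_time_ms(10.0, 7200, 32_000, 32_000_000, verify=True)
print(f"write: {plain:.2f} ms, write + verify: {checked:.2f} ms")
```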

SLIDE 24

...To Modify a Block?

  • (a) Read Block
  • (b) Modify in Memory
  • (c) Write Block
  • [ (d) Verify ]
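
The four steps above can be sketched against an ordinary file standing in for the disk. This is an illustration only: `BLOCK_SIZE`, `modify_block`, and `set_first_byte` are hypothetical names, and on a real system the read-back in step (d) may be served from the operating system's cache rather than from the device.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size in bytes


def modify_block(path, block_no, modify, verify=False):
    """Read one block, change it in memory, write it back, optionally verify."""
    offset = block_no * BLOCK_SIZE
    with open(path, "r+b") as f:
        f.seek(offset)
        block = bytearray(f.read(BLOCK_SIZE))   # (a) read block
        modify(block)                           # (b) modify in memory
        f.seek(offset)
        f.write(block)                          # (c) write block
        f.flush()
        os.fsync(f.fileno())                    # ask the OS to push the write out
        if verify:                              # (d) verify by reading back
            f.seek(offset)
            if f.read(BLOCK_SIZE) != bytes(block):
                raise IOError("verification failed")


def set_first_byte(block):
    block[0] = 0xFF


# Usage: a file of three zero-filled blocks; change the first byte of block 1.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(3 * BLOCK_SIZE))
    path = tmp.name

modify_block(path, 1, set_first_byte, verify=True)

with open(path, "rb") as f:
    data = f.read()
os.remove(path)
print(data[BLOCK_SIZE], len(data))
```

Even a one-byte change costs a full block read and a full block write, which is one reason databases try to batch modifications to the same block.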
SLIDE 25

Wrap-Up

  • Capacities grow, data hunger grows larger
  • Moore's Law vs Greg's Law vs disk growth
  • Databases are heavily I/O bound
  • Disk space management largely determines performance
  • Disk access time = Seek Time + Rotational Delay + Transfer Time + Other