System Notes 02: Hardware Hector Garcia-Molina CS 245 Notes 2 1 - - PowerPoint PPT Presentation



slide-1
SLIDE 1

CS 245 Notes 2 1

CS 554: Advanced Database System

Notes 02: Hardware

Hector Garcia-Molina

slide-2
SLIDE 2

Outline

  • Hardware: Disks
  • Access Times (disk)
  • Optimizations (disk access time)
  • Other Topics:

– Storage costs
– Using secondary storage
– Disk failures

slide-3
SLIDE 3


Hardware DBMS Data Storage

slide-4
SLIDE 4

Typical Computer

[Figure: a typical computer with CPU (P), memory (M), and disk controller (C), with secondary storage attached]

slide-5
SLIDE 5

Secondary storage

Many flavors:

  • Disk:

– Floppy (hard, soft)
– Removable packs
– Winchester (most common)
– SSD disks
– Optical, CD-ROM…
– Arrays

  • Tape:

– Reel, cartridge
– Robots

slide-6
SLIDE 6

“Typical Disk”

Terms: Platter, Head, Cylinder, Track, Sector (physical), Block (logical), Gap …

[Figure: a disk with a platter and a head labeled]

slide-7
SLIDE 7

Top View

[Figure: top view of a platter with a track, a sector, and a gap labeled]

slide-8
SLIDE 8

Block

  • Block = group of sectors that form a unit of access
  • One read/write operation will read/write one block

slide-9
SLIDE 9

Disk Access Time

[Figure: a request “I want block X” is sent to the disk, and block X arrives in memory]

How long does this take?

slide-10
SLIDE 10

Time = Seek Time + Rotational Delay + Transfer Time + Other

  • Seek time: time to move the head to the desired cylinder (track)
  • Rotational delay: time spent waiting for the desired sector to rotate under the head
  • Transfer time: time to transfer the data on the sectors to memory

[Figure: platter and head]
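The formula above can be turned into a small calculator. This is a minimal sketch, not part of the original slides: the function name and parameters are hypothetical, the rotational delay is taken as half a revolution (the average case), and the sample numbers in the usage line are illustrative only.

```python
def access_time_ms(seek_ms, rpm, block_bytes, transfer_mbit_per_s, other_ms=0.0):
    """Estimate the time to read one random block, in milliseconds."""
    # Average rotational delay: half of one revolution.
    rotational_ms = 0.5 * 60_000.0 / rpm
    # Transfer time: bits to move divided by the transfer rate (bits per second).
    transfer_ms = (block_bytes * 8) / (transfer_mbit_per_s * 1_000_000) * 1000.0
    return seek_ms + rotational_ms + transfer_ms + other_ms


# Illustrative values: 4 ms seek, 7,200 rpm, 4 KiB block, 1000 Mbit/s.
t = access_time_ms(seek_ms=4.0, rpm=7200, block_bytes=4096, transfer_mbit_per_s=1000)
print(f"{t:.2f} ms")
```

With numbers like these, seek time and rotational delay dominate; the transfer of a single small block contributes only a few hundredths of a millisecond.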

slide-11
SLIDE 11

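Seek Time

[Figure: seek time plotted against cylinders traveled (1 to N); seek time is x for a short move and grows to about 3x or 5x for a move across N cylinders]

  • It takes time to start the head moving
  • Once the head is moving, it travels fast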

slide-12
SLIDE 12

Average Random Seek Time

$$S = \frac{\sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} \mathrm{SEEKTIME}(i \to j)}{N(N-1)}$$

  • Start at cylinder i, go to cylinder j
  • There are N starting cylinders and N-1 destination cylinders
  • Total: N(N-1) possible (i, j) pairs

slide-13
SLIDE 13

Average Random Seek Time

$$S = \frac{\sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} \mathrm{SEEKTIME}(i \to j)}{N(N-1)}$$

“Typical” S: 10 ms → 40 ms
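The average S can be evaluated numerically by summing over all ordered pairs of distinct cylinders. The sketch below is an illustration under assumptions: `seek_time_ms` is a hypothetical linear model (a fixed start-up cost plus a per-cylinder travel cost), and its constants are made up rather than taken from any real drive.

```python
def seek_time_ms(i, j, startup_ms=1.0, per_cylinder_ms=0.0005):
    """Hypothetical seek model: fixed start-up cost plus travel time."""
    return startup_ms + per_cylinder_ms * abs(i - j)


def average_random_seek_ms(n_cylinders, seek=seek_time_ms):
    """Average SEEKTIME(i -> j) over all N(N-1) ordered pairs with i != j."""
    total = 0.0
    for i in range(1, n_cylinders + 1):
        for j in range(1, n_cylinders + 1):
            if i != j:
                total += seek(i, j)
    return total / (n_cylinders * (n_cylinders - 1))


print(average_random_seek_ms(1000))
```

Any other seek model can be passed in through the `seek` parameter, for example one fitted to measured seek times.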

slide-14
SLIDE 14

Typical Seek Time

  • Ranges from

– 4 ms for high-end drives
– 15 ms for mobile devices

  • Typical SSD (Solid State): ranges from

– 0.08 ms
– 0.16 ms

  • Source: Wikipedia, "Hard disk drive performance characteristics"

slide-15
SLIDE 15

Rotational Delay

[Figure: the disk platter rotates; the head is at one position and the wanted block is at another, so the head must wait for the block to rotate under it]

slide-16
SLIDE 16

Average Rotational Delay

R = 1/2 revolution

Typical HDD figures:

HDD spindle speed [rpm]    Average rotational latency [ms]
 4,200                     7.14
 5,400                     5.56
 7,200                     4.17
10,000                     3.00
15,000                     2.00

Source: Wikipedia, "Hard disk drive performance characteristics"

R = 0 for SSDs
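The table values follow directly from R = 1/2 revolution. The short sketch below recomputes them from the spindle speed; the function name is an assumption made for illustration.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: half of one revolution, in milliseconds."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2


for rpm in (4200, 5400, 7200, 10_000, 15_000):
    print(f"{rpm:>6} rpm: {avg_rotational_latency_ms(rpm):.2f} ms")
```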

slide-17
SLIDE 17

Transfer Rate: # bits transferred/sec

  • Transfer rates:

– HDD: up to 1000 Mbit/sec
– 12x Blu-Ray: 432 Mbit/sec
– 1x CD: 1.23 Mbit/sec
– SSDs: limited by the interface, e.g., SATA at 3000 Mbit/sec

  • Transfer time = Amount of data transferred / Transfer rate
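As a worked example of the transfer-time formula, the sketch below applies it to the rates listed above. The payload size of 100 MB is an arbitrary choice, and the function and dictionary names are hypothetical.

```python
def transfer_time_s(n_bytes, rate_mbit_per_s):
    """Transfer time = amount of data transferred / transfer rate."""
    bits = n_bytes * 8
    return bits / (rate_mbit_per_s * 1_000_000)


# Peak rates from the slide, in Mbit/sec.
RATES_MBIT_PER_S = {
    "HDD": 1000,
    "12x Blu-Ray": 432,
    "1x CD": 1.23,
    "SSD over SATA": 3000,
}

payload_bytes = 100 * 1_000_000  # 100 MB, an arbitrary example size
for name, rate in RATES_MBIT_PER_S.items():
    print(f"{name:>14}: {transfer_time_s(payload_bytes, rate):8.2f} s")
```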

slide-18
SLIDE 18


Other Delays

  • CPU time to issue I/O
  • Contention delay for disk controller

– Different programs can be using the disk

  • Contention delay for bus, memory

– Different programs can be transferring data

These delays are negligible compared to Seek time + rotational delay + transfer time

slide-19
SLIDE 19


  • So far: One (Random) Block Access
  • What about: Reading “Next” block?
slide-20
SLIDE 20

If we do things right (e.g., Double Buffer, Stagger Blocks…)

Time to get “next” block = Block Size / Transfer rate + Negligible

Negligible:

  • skip gap
  • switch track
  • once in a while, next cylinder

slide-21
SLIDE 21

Rule of Thumb

  • Random I/O: Expensive
  • Sequential I/O: Much less
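The rule of thumb can be checked with a rough cost model. This is a sketch under simplifying assumptions: every random block pays seek, rotational delay, and transfer, while a sequential run pays seek and rotational delay once and then only transfer time per block (gap skips and track switches are treated as negligible). The function names and sample numbers are illustrative.

```python
def random_read_ms(n_blocks, seek_ms, rotational_ms, transfer_ms):
    """Every block pays seek + rotational delay + transfer."""
    return n_blocks * (seek_ms + rotational_ms + transfer_ms)


def sequential_read_ms(n_blocks, seek_ms, rotational_ms, transfer_ms):
    """Only the first block pays seek + rotational delay; the rest pay transfer only."""
    return seek_ms + rotational_ms + n_blocks * transfer_ms


n = 1000
print("random:    ", random_read_ms(n, 4.0, 4.0, 0.03), "ms")
print("sequential:", sequential_read_ms(n, 4.0, 4.0, 0.03), "ms")
```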

slide-22
SLIDE 22

Cost for Writing is similar to Reading … unless we want to verify: then we need to add (full) rotation + Block size / Transfer rate

slide-24
SLIDE 24

  • To Modify a Block?

To Modify Block:

(a) Read Block into Memory
(b) Modify block in Memory
(c) Write Block
[(d) Verify?]
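Steps (a) through (d) can be sketched against an ordinary file. This is an illustration only: the block size, the function name `modify_block`, and the `modify` callback are assumptions, and a read-back through the operating system may be served from its cache rather than from the disk surface, so the verify step here is weaker than a true disk-level verify.

```python
import os

BLOCK_SIZE = 4096  # assumed block size in bytes


def modify_block(path, block_no, modify, verify=True):
    """Read one block, change it in memory, write it back, optionally verify."""
    offset = block_no * BLOCK_SIZE
    with open(path, "r+b") as f:
        f.seek(offset)
        block = f.read(BLOCK_SIZE)        # (a) read block into memory
        new_block = modify(block)         # (b) modify block in memory
        if len(new_block) != len(block):
            raise ValueError("modify() must preserve the block size")
        f.seek(offset)
        f.write(new_block)                # (c) write block
        f.flush()
        os.fsync(f.fileno())              # ask the OS to push the data to the device
        if verify:                        # (d) verify by reading the block back
            f.seek(offset)
            if f.read(BLOCK_SIZE) != new_block:
                raise IOError("verification failed")
    return new_block
```

A call such as `modify_block("data.bin", 1, lambda b: b.upper())` would rewrite the second block of a hypothetical file `data.bin` in place.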

slide-25
SLIDE 25

Random Access Time

  • Hard Drive: Ranges from 2.9 msec (high-end server drive) to 12 msec (laptop HDD)
  • Due to the need to move the heads and wait for the data to rotate under the read/write head

slide-26
SLIDE 26

Data Transfer Rate

  • Hard Disk: Once the head is positioned, an enterprise HDD can transfer data at about 140 MBytes/sec.
  • In practice, much lower speeds because….
  • Data transfer rate also depends on the rotational speed (of the platter)!

slide-27
SLIDE 27

Reliability

  • Hard Disk: According to a study performed by CMU for both consumer and enterprise-grade HDDs, their average failure rate is 6 years, and life expectancy is 9–11 years.

slide-28
SLIDE 28

Cost and Capacity


  • Hard Drive:
  • In 2013: HDDs of up to 6 TB were available.
  • In 2014: Cost: around $50 per TeraByte
slide-29
SLIDE 29

Kibibytes

  • 1 kibibyte = 2^10 bytes = 1024 bytes.

from Wikipedia

slide-30
SLIDE 30

Outline

  • Hardware: Disks
  • Access Times
  • Optimizations ← here
  • Other Topics

– Storage Costs
– Using Secondary Storage
– Disk Failures

slide-31
SLIDE 31


Optimizations (in controller or O.S.)

  • Disk Scheduling Algorithms

– e.g., elevator algorithm

  • Pre-fetch (Double buffering)
  • Arrays (RAID)
  • Mirrored Disks
slide-32
SLIDE 32

Disk Scheduling: Elevator Algorithm

Situation: Have many read/write requests

Question: In which order do you process the requests?

slide-33
SLIDE 33

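Disk Scheduling: Elevator Algorithm

[Figure: the head at the current cylinder, with pending requests on both sides]

  • 1. Process requests for the cylinders ahead of the head, in its current direction of travel
  • 2. Then reverse and process the requests in the other direction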
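The two-step sweep can be sketched as follows. This is a simplified illustration, not a controller implementation: it assumes a fixed batch of pending requests (no new arrivals during the sweep), and the function names are hypothetical.

```python
def elevator_order(current, requests, direction=1):
    """Return the order in which pending cylinder requests are served.

    direction=+1 sweeps toward higher cylinders first, -1 toward lower ones.
    """
    ahead = sorted(c for c in requests if (c - current) * direction >= 0)
    behind = sorted(c for c in requests if (c - current) * direction < 0)
    if direction > 0:
        # Sweep upward through `ahead`, then come back down through `behind`.
        return ahead + behind[::-1]
    # Sweep downward through `ahead`, then go back up through `behind`.
    return ahead[::-1] + behind


def head_travel(current, order):
    """Total number of cylinders the head crosses when serving `order`."""
    total = 0
    for cylinder in order:
        total += abs(cylinder - current)
        current = cylinder
    return total


pending = [10, 95, 52, 30, 70]
order = elevator_order(50, pending)
print(order, head_travel(50, order))
print(pending, head_travel(50, pending))  # first-come-first-served, for comparison
```

Serving the requests in arrival order makes the head zigzag across the disk, while the elevator order changes direction only once.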

slide-34
SLIDE 34

Double Buffering Algorithm

Problem: You have a File

» Sequence of Blocks B1, B2, …, Bn

You have a Program that:

» Processes B1
» Processes B2
» Processes B3
» ...

slide-35
SLIDE 35

Single Buffer Solution (“naïve” solution)

(1) Read B1 → Buffer
(2) Process Data in Buffer
(3) Read B2 → Buffer
(4) Process Data in Buffer
...

slide-36
SLIDE 36

Say

  P = time to process one block
  R = time to read in 1 block
  n = # blocks

(1) Read B1 → Buffer (takes R)
(2) Process Data in Buffer (takes P)
(3) Read B2 → Buffer (takes R)
(4) Process Data in Buffer (takes P)
...

Time to process n blocks = n(P + R)

slide-37
SLIDE 37

Double Buffering

[Figure: the disk holds blocks A to G; memory has two buffers, and block A is read into the first one]

Read block 1

slide-38
SLIDE 38

Double Buffering

[Figure: the disk holds blocks A to G; in memory, block A is being processed while block B is read into the second buffer]

Process block 1 AND read block 2 simultaneously

slide-39
SLIDE 39

Double Buffering

[Figure: the disk holds blocks A to G; in memory, block B is being processed while block C is read into the buffer that held A]

Process block 2 AND read block 3 simultaneously
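The overlap of processing and reading can be sketched with a single background I/O thread. This is one possible way to express the idea, not the method of any particular DBMS: `read_block` and `process` are hypothetical callbacks supplied by the caller, and blocks are numbered from 0 here. In CPython, blocking file reads release the interpreter lock, so the read of the next block can proceed while the current one is processed.

```python
from concurrent.futures import ThreadPoolExecutor


def process_file(read_block, process, n_blocks):
    """Process blocks 0..n_blocks-1, reading block k+1 while block k is processed."""
    results = []
    if n_blocks == 0:
        return results
    with ThreadPoolExecutor(max_workers=1) as io:
        pending = io.submit(read_block, 0)              # read the first block
        for k in range(n_blocks):
            current = pending.result()                  # wait until block k is in memory
            if k + 1 < n_blocks:
                pending = io.submit(read_block, k + 1)  # start reading the next block
            results.append(process(current))            # process while that read runs
    return results


# Toy usage: "reading" block k returns k * 10, "processing" adds 1.
print(process_file(lambda k: k * 10, lambda b: b + 1, 4))
```

At any moment at most two blocks are held: the one being processed and the one being read, which matches the two buffers in the slides.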

slide-40
SLIDE 40

Say P > R

What is processing time?

  P = Processing time/block
  R = IO time/block
  n = # blocks

slide-41
SLIDE 41

Double Buffering

[Figure: the disk holds blocks A to G; memory has two buffers, and block A is read into the first one]

Read block 1: takes time R

slide-42
SLIDE 42

Double Buffering

[Figure: the disk holds blocks A to G; in memory, block A is being processed while block B is read into the second buffer]

Process block 1 (takes P) AND read block 2 (takes R) simultaneously

Time needed = P (since P > R)

slide-43
SLIDE 43

Double Buffering

[Figure: the disk holds blocks A to G; in memory, block B is being processed while block C is read into the buffer that held A]

Process block 2 (takes P) AND read block 3 (takes R) simultaneously

Time needed = P (since P > R)

slide-44
SLIDE 44

Say P ≥ R

What is processing time?

  P = Processing time/block
  R = IO time/block
  n = # blocks

  • Double buffering time = R + nP
  • Single buffering time = n(R + P)
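The two totals can be compared numerically. The sketch below simply encodes the formulas from the slides; the function names and the sample values of n, P, and R are illustrative assumptions.

```python
def single_buffer_time(n, P, R):
    """Read and process strictly alternate: n * (P + R)."""
    return n * (P + R)


def double_buffer_time(n, P, R):
    """First read cannot be overlapped; afterwards each step costs P (needs P >= R)."""
    if P < R:
        raise ValueError("this formula assumes P >= R")
    return R + n * P


n, P, R = 100, 5, 3
print("single buffer:", single_buffer_time(n, P, R))
print("double buffer:", double_buffer_time(n, P, R))
```

The saving is (n - 1) R: every read except the first is hidden behind the processing of the previous block.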

slide-45
SLIDE 45

Using disk array to accelerate disk access

  • Why use multiple disks:

– Multiple disks ⇒ multiple disk heads
– Multiple outputs = Increased data rate

slide-46
SLIDE 46

Techniques to exploit multiple disks

  • Block Striping:

– Store blocks of a file over multiple disks
– (This technique uses multiple disks for the second reason: multiple outputs give an increased data rate)

  • Mirror disk:

– Store the same data on multiple disks

  • RAID:

– Redundant Array of Independent (Inexpensive) Disks

slide-47
SLIDE 47

Block Striping

  • Blocks of the same file are stored on different disks

[Figure: the data blocks of one file spread across several disks]
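One common way to spread blocks is round-robin placement, sketched below. The slides do not say which placement is used, so the round-robin rule and the function name `locate_block` are assumptions made for illustration.

```python
def locate_block(block_no, n_disks):
    """Map a logical block number to (disk index, block index on that disk)."""
    return block_no % n_disks, block_no // n_disks


# Blocks 0..7 of one file striped over 4 disks.
for b in range(8):
    disk, slot = locate_block(b, 4)
    print(f"block {b} -> disk {disk}, slot {slot}")
```

Consecutive blocks land on different disks, so a sequential scan can keep all disks busy at once.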

slide-48
SLIDE 48

Disk Mirroring

  • Mirrored disks contain identical content
  • Read operation: up to n times as fast (with n mirrored disks)
  • Write operation: about the same as 1 disk

[Figure: mirrored disks, logically one disk]

slide-49
SLIDE 49

Disk Arrays

  • RAIDs (various flavors)

[Figure: an array of disks, logically one disk; data blocks 00, 01, 10, 00 and parity block 11 (even parity)]

slide-50
SLIDE 50

Disk Failures

  • Intermittent read failure

– Cause: power fluctuations/failure

  • Intermittent write failure

– Cause: power fluctuations/failure

  • Media decay (discuss first)

– Disk surface worn out

  • Permanent failure ⇒ redundancy…

– Disk crash

slide-51
SLIDE 51

Coping with media decay

  • Disk has a number of spare blocks
  • When writing a block fails n times:

– Mark block as bad
– Replace block with one of the spare blocks

slide-52
SLIDE 52

Coping with Read/Write Failures

  • Detection:

– Read (verify) after writing data
– Better: Use checksum

  • Detect and Correct:

– Redundancy

slide-53
SLIDE 53

Detecting read error:

  • Block contains a checksum
  • The checksum is computed from the data in the block
  • Reading a data block:

– Re-compute the checksum from the data and compare it with the recorded checksum

[Figure: a block consisting of the data followed by its checksum]
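The checksum check can be sketched with CRC-32 from Python's standard `zlib` module. The slides do not name a specific checksum, so CRC-32 is a stand-in, and the layout (data followed by a 4-byte checksum) and the function names are assumptions.

```python
import zlib


def write_block(data: bytes) -> bytes:
    """Return the stored form of a block: the data followed by a 4-byte CRC-32."""
    return data + zlib.crc32(data).to_bytes(4, "big")


def read_block(stored: bytes) -> bytes:
    """Re-compute the checksum and compare it with the recorded one."""
    data, recorded = stored[:-4], int.from_bytes(stored[-4:], "big")
    if zlib.crc32(data) != recorded:
        raise IOError("checksum mismatch: read error detected")
    return data


stored = write_block(b"some block contents")
print(read_block(stored))
```

A checksum only detects the error; correcting it requires a redundant copy of the data.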

slide-54
SLIDE 54

Power failure during a write operation

  • Copy of data in memory will be lost
  • Copy of data on disk may be corrupted
  • Bottom line:

– Power failure during a write operation can be catastrophic: data is lost

  • Solution: stable storage update policy
slide-55
SLIDE 55

Stable Storage Update Policy

[Figure: 1 logical block is stored as two physical blocks, Block A and Block B]

  1. Write data to Block A
  2. Read data back and verify (repeat if needed)
     If the Block A write fails after n tries:
       1. Mark Block A as bad
       2. Replace with spare block
       3. Repeat
  3. Write data to Block B …
slide-56
SLIDE 56

Stable Storage Update Policy

[Figure: 1 logical block is stored as two physical blocks, Block A and Block B]

  • Power failure during Block A write: we still have an older copy of the data in Block B
  • Power failure during Block B write: the new copy was already written correctly to Block A
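The two failure cases can be simulated in memory. This is a toy model under assumptions: the class `StableBlock`, the `fail_during` parameter, and the use of CRC-32 to recognize a torn copy are all hypothetical, and the retry and spare-block steps of the policy are left out.

```python
import zlib


class StableBlock:
    """One logical block kept as two physical copies, A and B (in-memory simulation)."""

    def __init__(self, data=b""):
        self.a = self._seal(data)
        self.b = self._seal(data)

    @staticmethod
    def _seal(data):
        return data, zlib.crc32(data)

    @staticmethod
    def _torn(data):
        # A torn write: partial data whose recorded checksum does not match.
        partial = data[: len(data) // 2]
        return partial, zlib.crc32(partial) ^ 1

    @staticmethod
    def _valid(copy):
        data, checksum = copy
        return zlib.crc32(data) == checksum

    def write(self, data, fail_during=None):
        """Write A, then B. fail_during='A' or 'B' simulates a power failure mid-write."""
        if fail_during == "A":
            self.a = self._torn(data)
            return
        self.a = self._seal(data)
        if fail_during == "B":
            self.b = self._torn(data)
            return
        self.b = self._seal(data)

    def read(self):
        """Prefer copy A; fall back to copy B if A is corrupted."""
        if self._valid(self.a):
            return self.a[0]
        if self._valid(self.b):
            return self.b[0]
        raise IOError("both copies are corrupted")
```

Because the two copies are never written at the same time, a single power failure can damage at most one of them, so `read()` always finds either the old or the new value intact.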

slide-57
SLIDE 57

Coping with Disk Crash

  • “3 ways”:

– Redundancy, redundancy, redundancy

  • Different ways to achieve redundancy:

– Exact copy (mirror)
– RAID

slide-58
SLIDE 58

Disk Mirroring

  • Mirrored disks contain identical content
  • Advantage: with n mirrored disks, tolerates n-1 disk failures
  • Disadvantage: expensive…

[Figure: mirrored disks, logically one disk]

slide-59
SLIDE 59

Disk Arrays

  • RAID (level 4/5): tolerates one failure

[Figure: data blocks 00, 01, 10, 00 and parity block 11 (even parity)]

Data on a failed disk can be reconstructed using parity
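The parity example from the figure can be reproduced with a bytewise XOR. This is a minimal sketch of even parity over equal-sized blocks; the function names are hypothetical, and each 2-bit value from the figure is stored in one byte for simplicity.

```python
from functools import reduce


def parity_block(blocks):
    """Even parity: bytewise XOR of all blocks (all blocks have the same length)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity block."""
    return parity_block(list(surviving_blocks) + [parity])


# Data blocks 00, 01, 10, 00 from the figure.
data = [bytes([0b00]), bytes([0b01]), bytes([0b10]), bytes([0b00])]
parity = parity_block(data)
print(format(parity[0], "02b"))

# Suppose the disk holding the third block (10) fails.
survivors = [data[0], data[1], data[3]]
print(format(reconstruct(survivors, parity)[0], "02b"))
```

XOR-ing the surviving blocks with the parity block cancels everything except the missing block, which is why one parity block is enough to survive one disk failure.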

slide-60
SLIDE 60

Summary

  • Secondary storage, mainly disks
  • I/O times
  • I/Os should be avoided (if possible), especially random ones…

slide-61
SLIDE 61

Outline

  • Hardware: Disks
  • Access Times
  • Optimizations
  • Other Topics

– Storage Costs
– Using Secondary Storage
– Disk Failures

here