
CS 554: Advanced Database System Notes 02: Hardware (Hector Garcia-Molina, CS 245 Notes 2)



  1. CS 554: Advanced Database System Notes 02: Hardware Hector Garcia-Molina CS 245 Notes 2 1

  2. Outline • Hardware: Disks • Access Times (disk) • Optimizations (disk access time) • Other Topics: – Storage costs – Using secondary storage – Disk failures

  3. Hardware [diagram: DBMS, Data Storage]

  4. Typical Computer [diagram: CPU (P), Memory (M), Disk Controller (C), Secondary Storage]

  5. Secondary storage. Many flavors: - Disk: floppy (hard, soft), removable packs, Winchester (most common), SSDs, optical (CD-ROM…), arrays - Tape: reel, cartridge, robots

  6. “Typical Disk” [diagram: platters and heads]. Terms: Platter, Head, Cylinder, Track, Sector (physical), Block (logical), Gap

  7. Top View [diagram: a track divided into sectors separated by gaps]

  8. Block = group of sectors that form a unit of access. One read/write operation reads/writes one block.

  9. Disk Access Time: “I want block X” in memory. How long does it take?

  10. Time = Seek Time + Rotational Delay + Transfer Time + Other. Seek time: time to move the head to the desired cylinder (track). Rotational delay: time spent waiting for the desired sector to rotate under the head. Transfer time: time to transfer the data on the sectors to memory.
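
The access-time sum on this slide can be tried out numerically. The sketch below is only an illustration: the function name, the 4 KiB block size, and the choice of a 4 ms seek, 7,200 rpm, and 1000 Mbit/s are assumptions picked for the example, and the rotational term uses the half-revolution average that the notes give on a later slide.

```python
def random_access_time_ms(seek_ms, rpm, block_bytes, rate_bits_per_s, other_ms=0.0):
    """Time = seek + rotational delay + transfer + other, all in milliseconds."""
    rotational_ms = 0.5 * 60_000.0 / rpm                       # half a revolution on average
    transfer_ms = block_bytes * 8 / rate_bits_per_s * 1000.0   # data size / transfer rate
    return seek_ms + rotational_ms + transfer_ms + other_ms

# Assumed example values: 4 ms seek, 7,200 rpm, 4 KiB block, 1000 Mbit/s.
t = random_access_time_ms(4.0, 7200, 4096, 1_000_000_000)
print(f"{t:.2f} ms")
```

With these numbers the transfer term is tiny compared with the seek and rotational terms, which is the pattern the later slides rely on.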

  11. Seek Time [graph: seek time versus cylinders traveled (1 to N); y-axis marks at x and at 3x or 5x]. It takes time to start the head moving; once the head is moving, it travels fast.

  12. Average Random Seek Time: start at cylinder i, go to cylinder j. $S = \dfrac{\sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} \mathrm{SEEKTIME}(i \to j)}{N(N-1)}$. There are N starting cylinders and, for each, N-1 destination cylinders. Total: N(N-1) possible (i, j) pairs.

  13. Average Random Seek Time: $S = \dfrac{\sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} \mathrm{SEEKTIME}(i \to j)}{N(N-1)}$. “Typical” S: 10 ms → 40 ms
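
The average S can be computed by brute force over all (i, j) pairs with i ≠ j. The sketch below assumes a linear seek model (a fixed start-up cost plus a per-cylinder cost); that model, its parameter values, and the function names are illustrative assumptions, not figures from the notes.

```python
def average_seek_ms(n_cylinders, seek_time):
    """Average seek_time(i, j) over all ordered pairs with i != j."""
    total = 0.0
    for i in range(1, n_cylinders + 1):
        for j in range(1, n_cylinders + 1):
            if i != j:
                total += seek_time(i, j)
    return total / (n_cylinders * (n_cylinders - 1))

def linear_seek_ms(i, j, startup_ms=1.0, per_cylinder_ms=0.01):
    """Assumed model: start-up cost plus a cost proportional to the distance."""
    return startup_ms + per_cylinder_ms * abs(i - j)

avg = average_seek_ms(100, linear_seek_ms)
print(f"{avg:.4f} ms")
```

Under this linear model the average distance between two distinct cylinders is (N+1)/3, so the brute-force result can be checked against `startup_ms + per_cylinder_ms * (N + 1) / 3`.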

  14. Typical Seek Time • HDD: ranges from 4 ms (high-end drives) to 15 ms (mobile devices) • Typical SSD (solid state): ranges from 0.08 ms to 0.16 ms • Source: Wikipedia, "Hard disk drive performance characteristics"

  15. Rotational Delay [diagram: the disk platter rotates; the head is at one position and the block I want is elsewhere on the track]

  16. Average Rotational Delay: R = 1/2 revolution (R = 0 for SSDs). Typical HDD figures:

      HDD spindle speed [rpm]    Average rotational latency [ms]
      4,200                      7.14
      5,400                      5.56
      7,200                      4.17
      10,000                     3.00
      15,000                     2.00

      Source: Wikipedia, "Hard disk drive performance characteristics"
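
The latency figures follow directly from R = 1/2 revolution: one revolution takes 60,000 / rpm milliseconds, and the average wait is half of that. A minimal sketch (the function name is an assumption):

```python
def average_rotational_latency_ms(rpm):
    """Average rotational delay: half of one revolution, in milliseconds."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2

for rpm in (4200, 5400, 7200, 10_000, 15_000):
    print(f"{rpm:>6} rpm -> {average_rotational_latency_ms(rpm):.2f} ms")
```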

  17. Transfer Rate: # bits transferred/sec • Transfer rates: – HDD: up to 1000 Mbit/sec – 12x Blu-ray: 432 Mbit/sec – 1x CD: 1.23 Mbit/sec – SSDs: limited by the interface, e.g., SATA at 3000 Mbit/sec • Transfer time = amount of data transferred / transfer rate

  18. Other Delays • CPU time to issue I/O • Contention delay for the disk controller – Different programs can be using the disk • Contention delay for bus, memory – Different programs can be transferring data • These delays are negligible compared to seek time + rotational delay + transfer time

  19. • So far: One (Random) Block Access • What about: Reading “Next” block?

  20. If we do things right (e.g., double buffering, staggered blocks…): Time to get “next” block = Block Size / Transfer Rate + Negligible. The negligible part covers: - skip gap - switch track - once in a while, next cylinder

  21. Rule of Thumb: Random I/O: Expensive. Sequential I/O: Much less.
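
A rough comparison makes the rule of thumb concrete. The sketch below assumes every random block pays a full seek plus rotational delay, while a sequential run pays them once and then only the per-block transfer time. The 4 ms seek and 4.17 ms rotational delay are taken from the earlier slides; the 0.03 ms per-block transfer time and the block count are assumptions chosen for illustration.

```python
def random_read_ms(n_blocks, seek_ms, rotational_ms, transfer_ms):
    """Every block pays seek + rotational delay + transfer."""
    return n_blocks * (seek_ms + rotational_ms + transfer_ms)

def sequential_read_ms(n_blocks, seek_ms, rotational_ms, transfer_ms):
    """One seek and one rotational delay, then only transfer time per block."""
    return seek_ms + rotational_ms + n_blocks * transfer_ms

random_ms = random_read_ms(1000, 4.0, 4.17, 0.03)
sequential_ms = sequential_read_ms(1000, 4.0, 4.17, 0.03)
print(f"random: {random_ms:.2f} ms, sequential: {sequential_ms:.2f} ms")
```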

  22. Cost for Writing: similar to reading… unless we want to verify: then we need to add one (full) rotation + Block Size / Transfer Rate

  23. • To Modify a Block?

  24. To Modify a Block: (a) Read block into memory (b) Modify block in memory (c) Write block [(d) Verify?]
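
Steps (a) to (d) can be sketched against an ordinary file that stands in for the disk. The 4096-byte block size, the function names, and the use of a temporary file are assumptions made for this illustration; note that the read-back in step (d) will usually be served from the operating system's cache, so it shows the control flow rather than a real media check.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size for this sketch

def modify_block(path, block_no, modify, verify=False):
    """Read-modify-write of one block; `modify` edits a bytearray in place."""
    offset = block_no * BLOCK_SIZE
    with open(path, "r+b") as f:
        f.seek(offset)
        block = bytearray(f.read(BLOCK_SIZE))   # (a) read block into memory
        modify(block)                           # (b) modify block in memory
        f.seek(offset)
        f.write(block)                          # (c) write block
        f.flush()
        if verify:                              # (d) optional read-back check
            f.seek(offset)
            if f.read(BLOCK_SIZE) != bytes(block):
                raise IOError("verification failed")

def set_first_byte(block):
    block[0] = 0xFF

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "data.bin")
    with open(path, "wb") as f:
        f.write(bytes(3 * BLOCK_SIZE))          # three zero-filled blocks
    modify_block(path, 1, set_first_byte, verify=True)
    with open(path, "rb") as f:
        contents = f.read()
```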

  25. Random Access Time • Hard drive: ranges from 2.9 msec (high-end server drive) to 12 msec (laptop HDD) • Due to the need to move the heads and wait for the data to rotate under the read/write head

  26. Data Transfer Rate • Hard disk: once the head is positioned, an enterprise HDD can transfer data at about 140 MBytes/sec. • In practice, speeds are much lower because… • The data transfer rate also depends on the rotational speed of the platter!

  27. Reliability • Hard Disk: According to a study performed by CMU for both consumer and enterprise-grade HDDs, their average failure rate is 6 years, and life expectancy is 9 – 11 years.

  28. Cost and Capacity • Hard drive: – In 2013: HDDs of up to 6 TB were available. – In 2014: cost was around $50 per terabyte.

  29. Kibibytes • 1 kibibyte = $2^{10}$ bytes = 1024 bytes (from Wikipedia)

  30. Outline • Hardware: Disks • Access Times • Optimizations (we are here) • Other Topics – Storage Costs – Using Secondary Storage – Disk Failures

  31. Optimizations (in controller or O.S.) • Disk Scheduling Algorithms – e.g., elevator algorithm • Pre-fetch (Double buffering) • Arrays (RAID) • Mirrored Disks

  32. Disk Scheduling: Elevator Algorithm. Situation: there are many pending read/write requests. Question: in which order do you process the requests?

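  33. Disk Scheduling: Elevator Algorithm [diagram: sweep starting from the current cylinder] 1. Process the requests for the cylinders ahead of the current cylinder, in the direction the head is moving. 2. Then reverse and process the remaining requests in the other direction.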
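
The two sweeps of the elevator algorithm can be sketched as a function that orders a set of pending cylinder requests. This is a simplified illustration: it assumes all requests are known up front and the head reverses at the last pending request, and the function names and the request list are hypothetical.

```python
def elevator_order(requests, current, direction=1):
    """Order requests: sweep in `direction` from `current`, then reverse."""
    ahead = sorted(c for c in requests if (c - current) * direction >= 0)
    behind = sorted(c for c in requests if (c - current) * direction < 0)
    if direction > 0:
        return ahead + behind[::-1]    # sweep up, then come back down
    return ahead[::-1] + behind        # sweep down, then come back up

def head_movement(order, start):
    """Total number of cylinders the head travels when serving `order`."""
    total, position = 0, start
    for cylinder in order:
        total += abs(cylinder - position)
        position = cylinder
    return total

pending = [98, 183, 37, 122, 14, 124, 65, 67]   # hypothetical request queue
order = elevator_order(pending, current=53, direction=1)
print(order, head_movement(order, 53))
print(head_movement(pending, 53))               # arrival order, for comparison
```

For this queue the sweep visits the cylinders above 53 in increasing order and then the remaining ones in decreasing order, which moves the head far less than serving the requests in arrival order.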

  34. Double Buffering Algorithm. Problem: You have a file » a sequence of blocks B1, B2, …, Bn. You have a program that: » processes B1 » processes B2 » processes B3 ...

  35. Single Buffer Solution (“naïve” solution) (1) Read B1 → Buffer (2) Process data in buffer (3) Read B2 → Buffer (4) Process data in buffer ...

  36. Say P = time to process one block, R = time to read in one block, n = # blocks. (1) Read B1 → Buffer: takes R (2) Process data in buffer: takes P (3) Read B2 → Buffer: takes R (4) Process data in buffer: takes P ... Time to process n blocks = n(P + R)

  37. Double Buffering [diagram: blocks A B C D E F G on disk, buffers in memory]: read block 1.

  38. Double Buffering [diagram: blocks A B C D E F G on disk, buffers in memory]: process block 1 AND read block 2 simultaneously.

  39. Double Buffering [diagram: blocks A B C D E F G on disk, buffers in memory; block 1 is done]: process block 2 AND read block 3 simultaneously.

  40. Say P > R, where P = processing time/block, R = I/O time/block, n = # blocks. What is the processing time?

  41. Double Buffering [diagram: blocks A B C D E F G on disk, buffers in memory]: read block 1 (takes R).

  42. Double Buffering [diagram: blocks A B C D E F G on disk, buffers in memory]: process block 1 (takes P) AND read block 2 (takes R) simultaneously. Time needed = P (since P > R).

  43. Double Buffering [diagram: blocks A B C D E F G on disk, buffers in memory; block 1 is done]: process block 2 (takes P) AND read block 3 (takes R) simultaneously. Time needed = P (since P > R).

  44. Say P ≥ R, where P = processing time/block, R = I/O time/block, n = # blocks. What is the processing time? • Double buffering time = R + nP • Single buffering time = n(R + P)
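
The two totals can be compared with a short calculation. In the sketch below, the single-buffer formula is the one from the slides. The double-buffer formula is written with max(P, R) so that it also covers P < R; that generalization is an assumption of this sketch, and for P ≥ R it reduces to the R + nP given on the slide. The function names and sample values are illustrative.

```python
def single_buffer_time(n, P, R):
    """Read and process strictly alternate: n(P + R)."""
    return n * (P + R)

def double_buffer_time(n, P, R):
    """First read, then n - 1 overlapped steps, then the last processing step.

    Each overlapped step (process block k while reading block k + 1) takes
    max(P, R). For P >= R this equals R + n * P.
    """
    return R + (n - 1) * max(P, R) + P

n, P, R = 10, 5, 3   # assumed sample values with P >= R
print(single_buffer_time(n, P, R), double_buffer_time(n, P, R))
```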

  45. Using a disk array to accelerate disk access • Why use multiple disks: – Multiple disks → multiple disk heads – Multiple outputs = increased data rate

  46. Techniques to exploit multiple disks • Block striping: – Store blocks of a file over multiple disks – (This technique uses multiple disks as in point 2 of the previous slide: multiple outputs) • Mirrored disks: – Store the same data on multiple disks • RAID: – Redundant Array of Independent (Inexpensive) Disks

  47. Block Striping • Blocks of the same file are stored on different disks [diagram: data blocks of one file spread across several disks]

  48. Disk Mirroring • Mirrored disks contain identical content and are logically one disk • Read operation: n times as fast (with n mirrored disks) • Write operation: about the same as 1 disk

  49. Disk Arrays • RAIDs (various flavors) [diagram: data blocks plus a parity block (even parity), with block values 00 01 00 10 11; logically one disk]
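
Even parity over blocks is a bitwise XOR: the parity block is chosen so that the XOR of all blocks in the stripe is zero, and any single lost block can be rebuilt by XOR-ing the survivors. In the sketch below, treating the first four values on the slide (00, 01, 00, 10) as data blocks is an assumption, since the extracted text does not say which of the five blocks is the parity block; the function names are hypothetical.

```python
from functools import reduce

def parity_block(blocks):
    """Bytewise XOR of equally sized blocks (even parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors and the parity."""
    return parity_block(list(surviving_blocks) + [parity])

# Assumed reading of the slide: data blocks 00, 01, 00, 10 (one byte each here).
data = [bytes([0b00]), bytes([0b01]), bytes([0b00]), bytes([0b10])]
parity = parity_block(data)
rebuilt = reconstruct(data[:3], parity)   # pretend the fourth block was lost
print(format(parity[0], "02b"), format(rebuilt[0], "02b"))
```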

  50. Disk Failures • Intermittent read failure – Cause: power fluctuations/failure • Intermittent write failure – Cause: power fluctuations/failure • Media decay (discuss first) – Disk surface worn out • Permanent failure (→ redundancy…) – Disk crash

  51. Coping with media decay • The disk has a number of spare blocks • When writing a block fails n times: – Mark the block as bad – Replace the block with one of the spare blocks
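
The spare-block policy can be sketched with a toy in-memory disk. Everything here is a simulation built on assumptions: the class and its fields are hypothetical, write failures are injected through a `faulty` set, and real drives perform this remapping in firmware.

```python
class RemappingDisk:
    """Toy disk that swaps in a spare block after repeated write failures."""

    def __init__(self, n_blocks, n_spares, max_attempts=3, faulty=()):
        self.max_attempts = max_attempts   # the "n" on the slide
        self.faulty = set(faulty)          # simulated: writes to these always fail
        self.spares = list(range(n_blocks, n_blocks + n_spares))
        self.remap = {}                    # logical block -> spare physical block
        self.bad = set()                   # physical blocks marked as bad
        self.storage = {}                  # physical block -> data

    def _try_write(self, physical, data):
        if physical in self.faulty:
            return False
        self.storage[physical] = data
        return True

    def write(self, logical, data):
        physical = self.remap.get(logical, logical)
        while True:
            # any() stops at the first successful attempt
            if any(self._try_write(physical, data) for _ in range(self.max_attempts)):
                return physical
            self.bad.add(physical)         # mark the block as bad
            if not self.spares:
                raise IOError("no spare blocks left")
            physical = self.spares.pop(0)  # replace it with a spare block
            self.remap[logical] = physical

    def read(self, logical):
        return self.storage[self.remap.get(logical, logical)]

disk = RemappingDisk(n_blocks=8, n_spares=2, faulty={3})
where = disk.write(3, b"payload")
print(where, disk.bad, disk.read(3))
```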

  52. Coping with Read/Write Failures • Detection: – Read (verify) after writing data – Better: use a checksum • Detect and correct: → Redundancy

  53. Detecting read errors: • Each block contains a checksum alongside its data • The checksum is computed from the data in the block • Reading a data block: – Re-compute the checksum from the data and compare it with the recorded checksum
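
The checksum scheme can be sketched with CRC-32 from Python's standard `zlib` module. The notes do not say which checksum is used, so CRC-32, the 4-byte trailer layout, and the function names are assumptions made for this example.

```python
import struct
import zlib

def make_block(data: bytes) -> bytes:
    """Store the data followed by a 4-byte CRC-32 checksum of that data."""
    return data + struct.pack(">I", zlib.crc32(data))

def is_valid(block: bytes) -> bool:
    """Re-compute the checksum from the data and compare with the recorded one."""
    data, recorded = block[:-4], struct.unpack(">I", block[-4:])[0]
    return zlib.crc32(data) == recorded

block = make_block(b"some block contents")
corrupted = bytes([block[0] ^ 0x01]) + block[1:]   # flip one bit in the data
print(is_valid(block), is_valid(corrupted))
```

A checksum of this kind only detects the error; correcting it needs redundancy, as the previous slide notes.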
