dr mais nijim

Dr. Mais Nijim 1 20.07.2010 Motivation Introduction Related Work - PowerPoint PPT Presentation

  1. Dr. Mais Nijim 1 20.07.2010

  2. Motivation; Introduction; Related Work

  3. Global online satellite image distribution system operated at the Earth Resources Observation and Science (EROS) Center of the U.S. Geological Survey. The EROS system motivates the need for prefetching to improve the performance of hybrid storage systems.

  4. Reference:

  5. Studies show that new data is growing annually at a rate of 30%. Supercomputing centers and rich media organizations such as Lawrence National Laboratory, Oak Ridge National Laboratory, NASA, Google, and CNN rely on large-scale storage systems to meet demanding requirements for large data capacity with high performance and reliability.

  6. Large-scale storage systems have to be developed to fulfill rapidly increasing demands for both large storage capacity and high I/O performance. Storage capacity: employing more disks. I/O performance: increasing the number of storage components.

  7. Hybrid storage system: Solid State Drives, Hard Disks, Tapes

  8. Solid State Disks: Highly accessed storage objects in a hybrid storage system can be prefetched and cached to a high-speed storage component. Solid-state disks can be readily connected to any other type of storage device.

  9. Tape Storage: Hybrid storage systems are cost-effective because tapes are inexpensive. Tape storage systems have high reliability, long archive lifetimes, and low cost. Tapes are an ideal storage platform for a wide variety of data-intensive applications.

  10. Prefetching is a promising way to reduce the latency of data transfers among SSDs, HDDs, and tapes. Prefetching aims to reduce the number of requests issued to HDDs or tapes by caching popular data in SSDs. Aggressive prefetching is needed to reduce I/O latency efficiently, but an overaggressive scheme may waste I/O bandwidth by transferring useless data.

  11. [Fig. 2: The Hybrid Storage System Architecture with Prefetching. Web users reach an FTP server with solid-state disks over a LAN; the FTP server connects to a SAN. On a data miss, upper-level prefetching moves data from the SAN to the FTP server, and lower-level prefetching moves data from tapes to the SAN.]

  12. Move data from parallel tape storage to hard disks. Parallel tape storage can increase the aggregate bandwidth between the disk storage and the tape storage through parallel load/unload operations.

  13. To support parallelism, a data striping technique is used. To obtain the optimal striping, data size and workload are considered.

  14. Striping can cause many small data requests, which increases switch time. Data Placement Algorithm: we propose a data clustering algorithm that clusters objects with a high probability of being requested together, since related data objects are highly likely to be requested together.

  15. [Figure: data placement example. Index: 1, 2, 3; Reference: O1, O2, O3; Priority: 4, 3, 2. Tape Libraries 1 to 3 hold objects O11, O12, O13, O21, O22, O23, O31, O32, O33, O41, O42, O43. Steps 1 to 4: Disk 1 receives O11, O13, O31, O33; Disk 2 receives O12, O21, O32. Updated priority: 3, 2, 1. LRU eviction: either O11, O12, or O13.]

  16. [Flowchart: start with the request set R = {r_i, r_{i+1}, …, r_j} and, for each request r, P(r) and Tape-library(r). If the data is in the disk system, no prefetching is needed. Otherwise, fetch data from as many tapes as possible and place it with round-robin data placement. If the disk is full, apply the LRU eviction policy.]

  17. The first component is solid-state partitioning (PaSSD). It dynamically partitions the array of solid-state disks among the HDDs so as to maximize I/O performance. Space is allocated dynamically depending on the popularity, the size of contents, and the access pattern.

  18. Two approaches: (I) content-popularity-based weight assignment; (II) collaborative-popularity-based weight assignment.
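
The slides do not give the weight formulas, so the following is only a sketch of what a content-popularity-based split might look like. It assumes popularity is an integer access count per HDD and divides the solid-state blocks proportionally; the function name `partition_ssd` and the disk names are hypothetical.

```python
def partition_ssd(total_blocks, popularity):
    """Split `total_blocks` SSD blocks among HDDs in proportion to popularity.

    `popularity` maps a (hypothetical) HDD name to a non-negative integer
    weight, e.g. the access count of the content stored on that disk.
    """
    if sum(popularity.values()) == 0:
        # No access history yet: treat all disks as equally popular.
        popularity = {disk: 1 for disk in popularity}
    total_weight = sum(popularity.values())
    # Floor of each proportional share.
    alloc = {d: total_blocks * w // total_weight for d, w in popularity.items()}
    # Hand the blocks lost to rounding to the most popular disks first.
    leftover = total_blocks - sum(alloc.values())
    for disk in sorted(popularity, key=popularity.get, reverse=True)[:leftover]:
        alloc[disk] += 1
    return alloc
```

Calling `partition_ssd(10, {"hdd0": 5, "hdd1": 3, "hdd2": 2})` gives `hdd0` five blocks, `hdd1` three, and `hdd2` two. Recomputing the split as the counts change is one way to make the allocation dynamic.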

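  19. [Flowchart: start with the request set R = {r_i, r_{i+1}, …, r_j} and, for each request r, P(r), block(r), and disk(r). If the data is in the solid-state disks, no prefetching is needed. Otherwise, apply PaSSD and fetch data from as many disks as possible. If the solid-state disks are full, apply the LRU eviction policy.]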

  20. Server Access Model; Access time with no prefetching; Access time with prefetching

  21. The ultimate goal of our analytical model is to provide criteria for mathematically evaluating the performance of our algorithm. The metric is the average access time improvement S, where S is defined from two quantities: the access time when prefetching is not carried out and the access time when prefetching is carried out.

  22. In the server access model, we consider multiple users accessing the network through the FTP server, modeled as an M/G/1 round-robin queuing system. In this system, the average time to finish a job that requires a service time x is calculated from x and the system utilization. A job is defined as the retrieval of an object, so that equation gives the average retrieval time of an item.
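
The equation on slide 22 is not present in this transcript. For reference, the standard textbook result for an M/G/1 round-robin queue in the processor-sharing limit (time quantum approaching zero) is given below; it is an assumption that the slide shows this form. The symbols $\lambda$ (arrival rate) and $\bar{x}$ (mean service time) are introduced here for illustration.

```latex
% Mean time to finish a job that needs service time x,
% M/G/1 round robin in the processor-sharing limit:
T(x) = \frac{x}{1 - \rho}, \qquad \rho = \lambda \, \bar{x} < 1
% \rho is the system utilization; T(x) grows without bound as \rho \to 1.
```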

  23. s = s' + s'', where s' is the size of the object located in disks and s'' is the size of the object located in tapes.

  24. The average service time x is calculated as

  25. With prefetching, a proportion h_s of the users' requests results in a hit in the solid-state disks, which means that this portion is served by the solid-state disks. The failure ratio f_s = 1 - h_s is the portion of requests located in the disk system and/or the tapes. The portion h_d results in a hit in the disk system. f_d = 1 - h_s - h_d is the portion of requests served by the tape storage.
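
The hit and failure ratios on slide 25 combine into an expected access time as a weighted sum over the three tiers. The sketch below shows only that weighting; it is not the authors' full model, and the function name and the per-tier times `t_ssd`, `t_disk`, and `t_tape` are hypothetical parameters.

```python
def average_access_time(h_s, h_d, t_ssd, t_disk, t_tape):
    """Expected access time given the tier hit ratios from slide 25.

    h_s: fraction of requests served by the solid-state disks
    h_d: fraction of requests served by the disk system
    The remainder f_d = 1 - h_s - h_d is served by tape storage.
    """
    f_d = 1.0 - h_s - h_d
    if h_s < 0 or h_d < 0 or f_d < -1e-12:
        raise ValueError("hit ratios must be non-negative and sum to at most 1")
    f_d = max(f_d, 0.0)   # absorb floating-point rounding just below zero
    return h_s * t_ssd + h_d * t_disk + f_d * t_tape
```

Raising `h_s` or `h_d` through prefetching shifts weight away from the slow tape term, which is the effect the model sets out to quantify.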

  28. Average number of items to be prefetched from disk to solid-state disks; average number of items to be prefetched from tapes to disks.

  29. Probability of items to be prefetched from the disk system to the solid-state disks; probability of items to be prefetched from the tape.

  30. The hit ratio in the disk system will be increased by the number of prefetched items. When data objects are prefetched from the disk system to the solid-state disks, the hit ratio in the solid-state disks is expected to rise.

  31. [Equation: access time t, combining a zero-cost term for hits (h · 0) with the retrieval terms r_d and r_t, and expanded in terms of h_s, h_d, f_d, n(F), p_1, p_2, b, λ, and s.]

  34. The use of large-scale parallel disk systems continues to rise as the demand for data-intensive applications with large capacities grows. Traditional storage systems scale up storage capacity by employing more hard disk drives, which tends to be an expensive solution due to the ever-increasing cost of HDDs. In hybrid storage systems, judiciously transferring data back and forth among SSDs, HDDs, and tapes is critical for I/O performance. A multi-layer prefetching algorithm (PreHySys) can reduce the miss rate of high-end storage components, thereby reducing the average response time for data requests in hybrid storage systems.
