Differentiated Storage Services M. Mesnier, J.B. Akers, F. Chen, T. Luo


  1. Differentiated Storage Services M. Mesnier, J.B. Akers, F. Chen, T. Luo Presentation by Szymon Bachnij

  2. Introduction ● DSS is a proposed architecture for I/O classification ● we want to define separate classes of I/O ● our goal is to assign a storage system policy to each of those classes in order to manage data and I/O requests efficiently

  3. Challenges: ● Computer system performance depends on the storage system ● Storage systems are becoming more and more complex ● Storage systems need some information to perform any optimization ● ... but too much information is not a good idea

  4. Requirements

  5. Operating system: ● a classifier is associated with every I/O request ● a new field must be added to each OS structure describing I/O, and it is always copied into the actual I/O command (SCSI, ATA) ● the OS scheduler needs to be changed

  6. Filesystem: ● must have its own classification scheme ● each class has its own policy ● the class assigned to an I/O can change (e.g. when a file changes its size)

  7. Storage system: ● must extract the classifier, find the appropriate policy and enforce it ● doesn’t need to remember the class of each data block ● has to report when the location of a block changes

  8. Application: ● the O_CLASSIFIED flag must be passed when opening a file in order to use DSS ● the POSIX scatter/gather operations are overloaded ● changes in the VFS are essential in order to handle DSS features
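
The application-side interface on slide 8 can be sketched as follows. This is only an illustration under stated assumptions: the slide does not say how the scatter/gather calls are overloaded, so the sketch assumes the class travels as one extra leading `iovec` element that points at a 1-byte buffer. The helper name `dss_build_iov` is hypothetical, and `O_CLASSIFIED` is not defined by stock kernels.

```c
#include <stddef.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: build a two-element scatter/gather list in which
 * element 0 carries the I/O class and element 1 carries the payload.
 * The "class in the first element" layout is an assumption, not something
 * the slide states.
 */
static int dss_build_iov(struct iovec iov[2], unsigned char *io_class,
                         void *data, size_t len)
{
    iov[0].iov_base = io_class;  /* 1-byte class identifier */
    iov[0].iov_len  = 1;
    iov[1].iov_base = data;      /* the actual file data */
    iov[1].iov_len  = len;
    return 2;                    /* number of iovec elements filled in */
}
```

On a DSS-enabled kernel, the resulting list would presumably be passed to `writev(fd, iov, 2)` on a descriptor opened with `O_CLASSIFIED`. On an unmodified kernel the same call would simply write the class byte as ordinary data, which is why the flag is needed to opt in.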

  9. Implementation

  10. Operating system ● interface for classifying I/O requests

  11. Operating system ● the class is then copied from the BIO to the 5-bit vendor-specific Group Number field in byte 6 of the SCSI CDB: SCpnt->cmnd[6] = SCpnt->request->bio->bi_class; ● adding I/O classification is a matter of tracking an I/O from the filesystem through the block layers to the device drivers

  12. File system ● Goal: tell the storage system which blocks should be cached and in what order cached blocks should be evicted

  13. File system ● class ID and priority may change ● we use 19 of the 32 available IDs ● the lower the number, the higher the priority

  14. File system ● provides a POSIX interface for user-level I/O
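
The copy into the CDB on slide 11 can be illustrated outside the kernel with a small sketch. The helper names `dss_set_class` and `dss_get_class` and the mask macros are hypothetical. Unlike the one-line assignment on the slide, this version masks the class to 5 bits and preserves the upper three bits of byte 6, which is a defensive choice made here rather than something the slides describe.

```c
#include <stdint.h>

#define DSS_CLASS_BITS 5u
#define DSS_CLASS_MASK ((1u << DSS_CLASS_BITS) - 1u)  /* 0x1F: 32 class IDs */

/* Store the class in the low 5 bits of CDB byte 6, keeping bits 7..5. */
static void dss_set_class(uint8_t *cdb, uint8_t io_class)
{
    cdb[6] = (uint8_t)((cdb[6] & ~DSS_CLASS_MASK) |
                       (io_class & DSS_CLASS_MASK));
}

/* What the storage system would do on receipt: extract the classifier. */
static uint8_t dss_get_class(const uint8_t *cdb)
{
    return (uint8_t)(cdb[6] & DSS_CLASS_MASK);
}
```

A 5-bit field gives 2^5 = 32 distinct values, which matches the 32 available class IDs mentioned on slide 13.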

  15. File system ● example for PostgreSQL

  16. Storage system Baseline algorithm: ● at the beginning we have a ‘free list’ of allocations ● when a data block is cached, its allocation is moved to the ‘dirty list’ ● when the ‘free list’ drops below some level, the ‘syncer daemon’ begins to clean the ‘dirty list’

  17. Storage system Selective allocation: ● the caching decision is not based on request size ● metadata and small files are always cached ● large files are cached conditionally (depending on the state of the ‘syncer daemon’)

  18. Storage system Selective eviction: ● is not an LRU algorithm ● entries with the lowest priority are evicted first ● if this is not enough, entries with the next-lowest priority are evicted ● metadata and small files rarely leave the cache ● large files are usually evicted because of their priority, but also because of their size
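
The baseline algorithm on slide 16 can be modeled with two counters. This is a minimal sketch, not the presented implementation: the struct, the function names, and in particular the `high_water` stop condition are assumptions, since the slide only says that cleaning starts when the free list drops below some level.

```c
#include <stddef.h>

struct cache_state {
    size_t free_count;   /* allocations on the free list */
    size_t dirty_count;  /* allocations on the dirty list */
    size_t low_water;    /* syncer starts when free_count < low_water */
    size_t high_water;   /* assumed: syncer stops at this free level */
};

/* Cache one block: move an allocation from the free list to the dirty list.
 * Returns 1 on success, 0 if no free allocation is available. */
static int cache_block(struct cache_state *c)
{
    if (c->free_count == 0)
        return 0;
    c->free_count--;
    c->dirty_count++;
    return 1;
}

/* One pass of the syncer daemon: clean dirty allocations back onto the
 * free list. Returns the number of allocations cleaned. */
static size_t syncer_run(struct cache_state *c)
{
    size_t cleaned = 0;
    if (c->free_count >= c->low_water)
        return 0;                       /* no pressure, nothing to do */
    while (c->free_count < c->high_water && c->dirty_count > 0) {
        c->dirty_count--;
        c->free_count++;
        cleaned++;
    }
    return cleaned;
}
```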
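
The admission decision on slide 17 can be written as a small predicate. The enum and function names are hypothetical, and the exact condition for large files is an assumption: the slide only says it depends on the syncer daemon's state, so this sketch admits large files only while the syncer is idle.

```c
#include <stdbool.h>

enum dss_kind { DSS_METADATA, DSS_SMALL_FILE, DSS_LARGE_FILE };

/* Decide whether a block should be admitted to the cache.
 * The decision uses the I/O class, not the request size. */
static bool should_cache(enum dss_kind kind, bool syncer_active)
{
    switch (kind) {
    case DSS_METADATA:
    case DSS_SMALL_FILE:
        return true;            /* always cached */
    case DSS_LARGE_FILE:
        return !syncer_active;  /* assumed: only without cache pressure */
    }
    return false;               /* unknown kind: do not cache */
}
```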
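
Selective eviction as described on slide 18 can be sketched with per-class block counts. The function name and the array layout are assumptions made for illustration. The sketch relies on the convention from slide 13 that a lower class number means a higher priority, so it walks the classes from the highest number downward.

```c
#include <stddef.h>

#define DSS_NUM_CLASSES 32

/* cached[i] holds the number of cached blocks in class i.
 * Evict up to `need` blocks, lowest priority (highest class number) first,
 * moving to the next-lowest class only when the current one is empty.
 * Returns the number of blocks actually evicted. */
static size_t evict_blocks(size_t cached[DSS_NUM_CLASSES], size_t need)
{
    size_t freed = 0;
    for (int cls = DSS_NUM_CLASSES - 1; cls >= 0 && freed < need; cls--) {
        size_t take = cached[cls];
        if (take > need - freed)
            take = need - freed;
        cached[cls] -= take;
        freed += take;
    }
    return freed;
}
```

With 5 blocks in class 0, 3 in class 10 and 4 in class 18, a request to evict 6 blocks empties class 18, takes 2 blocks from class 10, and leaves class 0 untouched.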

  19. Evaluation

  20. Environment ● single Linux machine (Fedora 13) ● kernel version: 2.6.34 ● 8-core system with 8GB of RAM ● file system: Ext3 ● storage device: 5-disk LSI RAID-1E array ● cache: Intel 32GB X25-E SSD

  21. Test methodology ● a workload generator that takes as input: file size distribution, request file size, read/write ratio, number of subdirectories

  22. File server ● file server workload based on SPECsfs2008 ● over 262,000 files and 8,500 directories created ● over 262,000 transactions performed ● read/write ratio is 2:1 ● 184GB of storage used ● 18GB cache

  23. E-mail server ● e-mail server workload based on a study of e-mail server file sizes ● 1 million files in 1,000 directories ● 1 million transactions performed ● read/write ratio is 2:1 ● 204GB of storage used ● 20GB cache

  24. Results

  25. Database ● database used: PostgreSQL ● highest priority for: metadata, user tables, log files and temporary tables (all in one class) ● index files have lower priority ● 8GB cache

  26. Database results

  27. The end
