Differentiated Storage Services: M. Mesnier, J.B. Akers, F. Chen, T. Luo

SLIDE 1

Differentiated Storage Services

Presentation by Szymon Bachnij

  • M. Mesnier, J.B. Akers, F. Chen, T. Luo
SLIDE 2

Introduction

  • DSS is a proposed architecture for I/O classification
  • we want to define separate classes of I/O
  • our goal is to assign a storage system policy to each of those classes, so that data and I/O requests are managed efficiently

SLIDE 3

Challenges:

  • Computer system performance depends on the storage system
  • Storage systems are becoming more and more complex
  • The storage system needs some information to perform any optimization
  • ... but too much information is not a good idea
SLIDE 4

Requirements

SLIDE 5

Operating system:

  • a classifier is associated with every I/O request
  • a new field must be added to each OS structure describing I/O; it is always copied to the actual I/O command (SCSI, ATA)
  • the OS scheduler needs to be changed
SLIDE 6

Filesystem:

  • must have its own classification scheme
  • each class has its own policy
  • the class assigned to I/O can change (e.g. when a file changes its size)
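A minimal sketch of how such a classification scheme might look, assuming just three classes and a single size threshold. The class names, the `dss_classify` helper and the 4 KB cut-off are hypothetical and are not taken from the slides; they only illustrate how the class of a file's I/O can change as the file grows.

```c
#include <stdint.h>

/* Hypothetical class IDs: a lower number means a higher priority. */
enum dss_class {
    DSS_CLASS_METADATA   = 0,
    DSS_CLASS_SMALL_FILE = 1,
    DSS_CLASS_LARGE_FILE = 2
};

/* Hypothetical threshold separating "small" from "large" files. */
#define DSS_SMALL_FILE_MAX (4u * 1024u)

/* Classify an I/O by what it touches and by the current size of the file. */
static enum dss_class dss_classify(int is_metadata, uint64_t file_size)
{
    if (is_metadata)
        return DSS_CLASS_METADATA;
    return file_size <= DSS_SMALL_FILE_MAX ? DSS_CLASS_SMALL_FILE
                                           : DSS_CLASS_LARGE_FILE;
}
```

With this sketch, a file that grows from 4096 to 4097 bytes moves from the small-file class to the large-file class on its next I/O.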

SLIDE 7

Storage system:

  • must extract the classifier, find the appropriate policy and enforce it
  • doesn’t need to remember the class of each data block
  • has to signal when the location of a block changes

SLIDE 8

Application:

  • the O_CLASSIFIED flag is needed when opening a file to use DSS
  • POSIX scatter/gather operations are overloaded
  • changes in the VFS are essential in order to handle DSS features
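One way the overloaded scatter/gather call could carry a class is sketched below. It assumes that, for a file opened with O_CLASSIFIED, the first `iovec` element holds a one-byte class ID and the remaining elements hold the data. The slides do not spell out this layout, so treat it as an assumption. `dss_build_iov` is a hypothetical helper; only `struct iovec` from `<sys/uio.h>` is a real POSIX type.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Assumption: with O_CLASSIFIED set, iov[0] carries a one-byte class ID
 * and iov[1] carries the payload. Fills both elements and returns the
 * element count that would be passed to writev().
 */
static int dss_build_iov(struct iovec iov[2], uint8_t *class_id,
                         void *buf, size_t len)
{
    iov[0].iov_base = class_id;          /* the class travels first */
    iov[0].iov_len  = sizeof(*class_id);
    iov[1].iov_base = buf;               /* then the actual data */
    iov[1].iov_len  = len;
    return 2;
}
```

On an unmodified kernel, `writev()` would simply write the class byte as ordinary data; only a DSS-aware VFS would interpret and strip it.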

SLIDE 9

Implementation

SLIDE 10

Operating system

  • interface for classifying I/O requests
SLIDE 11

Operating system

  • then we copy the class from the BIO to the 5-bit vendor-specific Group Number field in byte 6 of the SCSI CDB:

SCpnt->cmnd[6] = SCpnt->request->bio->bi_class;

  • adding I/O classification is a matter of tracking an I/O from the filesystem, through the block layers, to the device drivers
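The copy into the CDB can be sketched in user space as follows. This is an illustration only, assuming a 10-byte CDB held in a plain byte array; `dss_cdb_set_class` and `dss_cdb_get_class` are hypothetical helpers, not kernel functions. Unlike the one-line assignment on the slide, the sketch masks the value to 5 bits and leaves the upper 3 bits of byte 6 untouched.

```c
#include <stdint.h>

/* The Group Number field occupies the low 5 bits of byte 6. */
#define DSS_GROUP_MASK 0x1Fu

/* Store a class ID in byte 6 of a 10-byte CDB, preserving the upper 3 bits. */
static void dss_cdb_set_class(uint8_t cdb[10], uint8_t class_id)
{
    cdb[6] = (uint8_t)((cdb[6] & ~DSS_GROUP_MASK) |
                       (class_id & DSS_GROUP_MASK));
}

/* Read the class ID back, as the storage system would on receipt. */
static uint8_t dss_cdb_get_class(const uint8_t cdb[10])
{
    return (uint8_t)(cdb[6] & DSS_GROUP_MASK);
}
```

A 5-bit field gives 32 possible class IDs, so any value above 31 is truncated by the mask.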

SLIDE 12

File system

  • Goal: tell the storage system which blocks should be cached and the order of eviction of cached blocks
SLIDE 13

File system

  • class ID and priority may change
  • we use 19 of the 32 available IDs
  • the lower the number, the higher the priority

SLIDE 14

File system

  • a POSIX interface is provided for user-level I/O
SLIDE 15

File system

  • example for PostgreSQL
SLIDE 16

Storage system

Baseline algorithm:

  • at the beginning we have a ‘free list’ of allocations
  • when a data block is cached, its allocation is moved to the ‘dirty list’
  • when the ‘free list’ drops below some level, the ‘syncer daemon’ begins to clean the ‘dirty list’
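The baseline algorithm can be sketched with two counters instead of real lists. The `dss_cache` structure, its field names and the `low_watermark` threshold are all hypothetical, and the sketch assumes that a cleaned allocation returns to the free list, which the slide does not state explicitly.

```c
#include <stddef.h>

/* Hypothetical cache state: every allocation is either free or dirty. */
struct dss_cache {
    size_t free_count;     /* allocations on the free list */
    size_t dirty_count;    /* allocations on the dirty list */
    size_t low_watermark;  /* hypothetical level that wakes the syncer */
    int    syncer_active;  /* 1 while the syncer daemon is cleaning */
};

/* Cache one block: move an allocation from the free list to the dirty list.
 * Returns 0 on success, -1 if no free allocation is left. */
static int dss_cache_block(struct dss_cache *c)
{
    if (c->free_count == 0)
        return -1;
    c->free_count--;
    c->dirty_count++;
    if (c->free_count < c->low_watermark)
        c->syncer_active = 1;   /* free list dropped below the level */
    return 0;
}

/* One syncer step: clean one dirty allocation and put it back on the
 * free list (assumption), then stop once the free list has recovered. */
static void dss_syncer_step(struct dss_cache *c)
{
    if (c->dirty_count > 0) {
        c->dirty_count--;
        c->free_count++;
    }
    if (c->free_count >= c->low_watermark)
        c->syncer_active = 0;
}
```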

SLIDE 17

Storage system

Selective allocation:

  • the decision about caching is not based on request size
  • metadata and small files are always cached
  • large files are cached conditionally (depending on the state of the ‘syncer daemon’)
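The allocation rule above fits in one small function. The sketch assumes that "conditionally" means large files are cached only while the syncer daemon is idle; the slide only says the decision depends on the daemon's state. The `dss_kind` enum and `dss_should_cache` are hypothetical names.

```c
/* Hypothetical coarse grouping of the I/O classes. */
enum dss_kind { DSS_KIND_METADATA, DSS_KIND_SMALL_FILE, DSS_KIND_LARGE_FILE };

/* Returns 1 if the block should be cached, 0 otherwise.
 * Assumption: large files are cached only while the syncer is idle. */
static int dss_should_cache(enum dss_kind kind, int syncer_active)
{
    switch (kind) {
    case DSS_KIND_METADATA:
    case DSS_KIND_SMALL_FILE:
        return 1;               /* always cached */
    case DSS_KIND_LARGE_FILE:
        return !syncer_active;  /* cached conditionally */
    }
    return 0;                   /* unknown kind: do not cache */
}
```

Note that the request size never appears as an input; only the class and the cache state matter.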

SLIDE 18

Storage system

Selective eviction:

  • it is not an LRU algorithm
  • entries with the lowest priority are evicted first
  • if this is not enough, entries with the next-lowest priority are evicted
  • metadata and small files rarely leave the cache
  • large files are usually evicted because of their priority, but also because of their size
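Priority-ordered eviction can be sketched as a walk over per-class counters, starting from the lowest priority (the highest class number). The `cached` array and `dss_evict` are hypothetical; a real cache would track individual blocks rather than counts.

```c
#include <stddef.h>

#define DSS_NUM_CLASSES 32  /* a 5-bit class ID allows 32 classes */

/* cached[i] is the number of cached blocks in class i; a lower i means
 * a higher priority. Evicts up to `needed` blocks, lowest priority
 * first, and returns the number of blocks actually evicted. */
static size_t dss_evict(size_t cached[DSS_NUM_CLASSES], size_t needed)
{
    size_t evicted = 0;
    for (int cls = DSS_NUM_CLASSES - 1; cls >= 0 && evicted < needed; cls--) {
        size_t take = cached[cls];
        if (take > needed - evicted)
            take = needed - evicted;  /* do not evict more than required */
        cached[cls] -= take;
        evicted += take;
    }
    return evicted;
}
```

With 5, 3 and 4 blocks cached in classes 0, 1 and 2, a request to evict 6 blocks empties class 2, takes 2 blocks from class 1, and leaves class 0 untouched.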

SLIDE 19

Evaluation

SLIDE 20

Environment

  • single Linux machine (Fedora 13)
  • kernel version: 2.6.34
  • 8-core system with 8GB of RAM
  • file system: Ext3
  • storage device: 5-disk LSI RAID-1E array
  • cache: Intel 32GB X25-E SSD
SLIDE 21

Test methodology

  • Workload generator that takes as input: file size distribution, request file size, read/write ratio, number of subdirectories
SLIDE 22

File server

  • file server workload based on SPECsfs2008
  • over 262,000 files and 8,500 directories created

  • over 262,000 transactions performed
  • read/write ratio is 2:1
  • 184GB of memory used
  • 18GB cache
SLIDE 23

E-mail server

  • e-mail server workload based on a study of e-mail server file sizes
  • 1 million files, 1,000 directories
  • 1 million transactions performed
  • read/write ratio is 2:1
  • 204GB memory used
  • 20GB cache
SLIDE 24

Results

SLIDE 25

Database

  • used database: PostgreSQL
  • highest priority for: metadata, user tables, log files and temporary tables (all in one class)

  • index files have lower priority
  • 8GB cache
SLIDE 26

Database results

SLIDE 27

The end