

SLIDE 1

Revisiting the Metadata Architecture of Parallel File Systems

Nawab Ali#1, Ananth Devulapalli*2, Dennis Dalessandro*3, Pete Wyckoff*4 and P. Sadayappan#5

# The Ohio State University
* Ohio Supercomputer Center

SLIDE 2

11/17/2008 PDSW 2008 2

Presentation Outline

  • Introduction
  • Object-based Storage Devices
  • Parallel File Systems Design Goals
  • Offloading MDS Operations to OSDs
  • OSD System Design
  • Experiments
      – Microbenchmarks
      – Applications
  • Conclusions & Future Work
SLIDE 3

Introduction

  • HPC applications increasingly generate or process large data sets
      – Sloan Digital Sky Survey
      – Large Hadron Collider
      – Climate Research
  • Existing parallel file systems are unable to cope with the I/O throughput requirements
      – Server-oriented design inhibits high performance

SLIDE 4

Parallel File System Design Limitations

  • I/O bandwidth and latency limit file system performance
  • Store-and-forward latency
  • Dedicated I/O and metadata servers limit
      – Scalability
      – Manageability
      – Performance

[Figure: Typical parallel file system design]

SLIDE 5

Object-based Storage Devices

  • New storage technology
      – SCSI extension
  • Intelligent, higher-level object interface
      – Object encapsulation
      – Attributes
  • User assigned, but device managed
  • Large space for rich metadata
  • Secure building block for direct-access file systems
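
To make the object interface concrete, here is a minimal Python sketch of an object that bundles data with attributes. The class and method names (`ToyObject`, `write`, `set_attr`, `get_attr`) are hypothetical and do not come from the slides or from any OSD library. The only detail borrowed from the OSD standard is that an attribute is addressed by a page number and an attribute number; everything else is an assumption for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyObject:
    """Hypothetical in-memory stand-in for one OSD object (not a real API)."""
    partition_id: int
    object_id: int
    data: bytearray = field(default_factory=bytearray)
    # Attributes keyed by (page, number); values are opaque bytes.
    attrs: dict = field(default_factory=dict)

    def write(self, offset: int, payload: bytes) -> None:
        # Grow the object with zero bytes if the write ends past the current end.
        end = offset + len(payload)
        if end > len(self.data):
            self.data.extend(b"\x00" * (end - len(self.data)))
        self.data[offset:end] = payload

    def set_attr(self, page: int, number: int, value: bytes) -> None:
        # The caller assigns the value; the object (device) stores and manages it.
        self.attrs[(page, number)] = value

    def get_attr(self, page: int, number: int) -> bytes:
        return self.attrs[(page, number)]
```

A file system built this way could keep file metadata such as owner or size in attributes next to the data, which is the idea the later slides build on.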

SLIDE 6

OSD Architecture

SLIDE 7

Parallel File System Design Goals

  • Use intelligent peripherals (OSDs) to improve performance, scalability and manageability
  • Serverless, direct-access storage model

[Figure: OSD-based parallel file system design]

SLIDE 8

Metadata Design Goals

  • Offload file metadata operations to OSDs
  • Make the case for recoupling data and metadata
      – Simplify parallel file system design

SLIDE 9

Existing Metadata Architectures

[Figure panels: Stock PVFS; OSD IOS]

SLIDE 10

Offloading MDS Operations to OSDs

[Figure panels: Dedicated OSD MDS; Distributed OSD MDS]
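
The slide shows the two schemes only as diagrams. One plausible way to read the difference is as a placement question: which OSD holds the metadata for a given file? The sketch below is an assumption made for illustration, not the authors' implementation. The function names are hypothetical, and both the fixed index 0 and the path hashing are invented placement policies.

```python
import zlib


def dedicated_mds(path: str, num_osds: int) -> int:
    """Dedicated scheme: one designated OSD holds all metadata.

    Index 0 is an arbitrary choice made for this sketch.
    """
    return 0


def distributed_mds(path: str, num_osds: int) -> int:
    """Distributed scheme: metadata is spread across all OSDs.

    Hashing the path with CRC32 is an assumed policy, chosen because it is
    deterministic across runs; the slides do not specify the placement rule.
    """
    return zlib.crc32(path.encode("utf-8")) % num_osds
```

With four OSDs, `dedicated_mds` sends every metadata request to the same device, while `distributed_mds` spreads requests for different paths over all four.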

SLIDE 11

OSD System Design

  • OSD Initiator
      – Exports OSD interface to client applications
      – Generates SCSI commands
  • OSD Target
      – Software implementation of OSD
      – OSD command processor
      – Data management
      – Attribute management
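
The division of labor between initiator and target can be sketched as follows. This is a toy model under stated assumptions: the class names, the opcode strings, and the use of Python dictionaries as commands are all hypothetical. A real initiator builds SCSI commands and sends them over a transport; here the "transport" is a direct method call.

```python
class ToyTarget:
    """Hypothetical software target: a command processor over two stores."""

    def __init__(self):
        self.data = {}   # object id -> bytes (data management)
        self.attrs = {}  # (object id, name) -> value (attribute management)

    def process(self, command: dict):
        # Command processor: route each command by its opcode.
        op = command["op"]
        if op == "CREATE":
            self.data[command["oid"]] = b""
            return "OK"
        if op == "WRITE":
            self.data[command["oid"]] = command["payload"]
            return "OK"
        if op == "READ":
            return self.data[command["oid"]]
        if op == "SET_ATTR":
            self.attrs[(command["oid"], command["name"])] = command["value"]
            return "OK"
        if op == "GET_ATTR":
            return self.attrs[(command["oid"], command["name"])]
        raise ValueError(f"unsupported opcode: {op}")


class ToyInitiator:
    """Hypothetical initiator: exposes calls to clients, emits commands."""

    def __init__(self, target: ToyTarget):
        self.target = target  # stands in for a SCSI transport

    def create(self, oid: int):
        return self.target.process({"op": "CREATE", "oid": oid})

    def write(self, oid: int, payload: bytes):
        return self.target.process({"op": "WRITE", "oid": oid, "payload": payload})

    def read(self, oid: int) -> bytes:
        return self.target.process({"op": "READ", "oid": oid})

    def set_attr(self, oid: int, name: str, value):
        return self.target.process(
            {"op": "SET_ATTR", "oid": oid, "name": name, "value": value}
        )

    def get_attr(self, oid: int, name: str):
        return self.target.process({"op": "GET_ATTR", "oid": oid, "name": name})
```

Client code only ever talks to `ToyInitiator`, so the target could be swapped for a hardware device without changing the callers.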

SLIDE 12

Latency Microbenchmarks

  • 1 client
  • Latency as a function of number of I/O elements
  • PVFS stat: OSD-based MDS has higher latency than PVFS
  • PVFS create: Dist. OSD MDS has lower latency because it does not create a metafile

[Figure panels: PVFS stat; PVFS create]
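
A single-client latency measurement of this kind can be sketched with a small timing harness. The code below is an assumption-laden illustration, not the benchmark used in the talk: it times `create` and `stat` on whatever local file system backs the temporary directory, not on PVFS or an OSD, and the function names and the default file count are hypothetical.

```python
import os
import tempfile
import time


def mean_latency_us(op, paths):
    """Run op on every path and return the mean time per call in microseconds."""
    start = time.perf_counter()
    for p in paths:
        op(p)
    return (time.perf_counter() - start) / len(paths) * 1e6


def create(path):
    # "x" mode fails if the file already exists, so every call is a true create.
    with open(path, "x"):
        pass


def measure(n=200):
    """Return (create latency, stat latency) in microseconds for n files."""
    with tempfile.TemporaryDirectory() as d:
        paths = [os.path.join(d, f"f{i}") for i in range(n)]
        create_us = mean_latency_us(create, paths)
        stat_us = mean_latency_us(os.stat, paths)
    return create_us, stat_us
```

To reproduce the shape of the experiment, the same harness would be pointed at a mounted parallel file system and repeated for each number of I/O elements.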

SLIDE 13

Throughput Microbenchmarks

  • Create throughput as a function of number of clients
  • 4 PVFS servers or 4 OSDs respectively
  • Disk-based storage: Dist. OSD MDS has a higher create throughput than PVFS
  • RAM-based storage: PVFS outperforms the OSD variants

[Figure panels: Disk-based storage; RAM-based storage]

SLIDE 14

Checkpoint Performance Results

[Figure panels: Checkpoint size = 32 kB; Checkpoint size = 256 kB]

  • 8 clients, varying I/O elements
  • Processes write to individual checkpoint files
  • Distributed OSD MDS performs better than other schemes
  • Performance degrades with number of I/O elements because more datafiles must be created
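
The file-per-process checkpoint pattern can be sketched as below. This is an illustration under assumptions, not the benchmark from the talk: threads stand in for the 8 client processes, files go to a local temporary directory instead of a parallel file system, 32 kB is taken to mean 32 * 1024 bytes, and all names are hypothetical.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def write_checkpoint(directory: str, rank: int, size: int) -> int:
    """Write one checkpoint file for one client; return the bytes written."""
    path = os.path.join(directory, f"ckpt.{rank}")
    with open(path, "wb") as f:
        written = f.write(b"\0" * size)
        f.flush()
        os.fsync(f.fileno())  # force the data to storage before returning
    return written


def checkpoint_round(num_clients: int = 8, size: int = 32 * 1024):
    """Run one checkpoint round; return (total bytes, elapsed seconds)."""
    with tempfile.TemporaryDirectory() as d:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=num_clients) as pool:
            # Each client writes its own file, so no file is shared.
            total = sum(
                pool.map(lambda r: write_checkpoint(d, r, size), range(num_clients))
            )
        elapsed = time.perf_counter() - start
    return total, elapsed
```

Because every client creates its own file, each round issues one create per client, which is why create cost matters for small checkpoints.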

SLIDE 15

SSCA-3 Performance Results

  • 1 client. 4 PVFS servers or 4 OSDs respectively
  • test1: 5000 files. 1 GB data. Small metadata footprint
      – Execution time dominated by small I/O
  • test7GB: 95000 files. 7 GB data. Metadata intensive

[Figure panels: SSCA-3 test1; SSCA-3 test7GB]

SLIDE 16

Conclusions

  • Dedicated MDSs limit the performance of parallel file systems
  • Possibly recouple data and metadata using OSDs
  • Presented two techniques for offloading metadata to OSDs
  • Performance of OSD-based file system is comparable to that of PVFS
      – Expect performance to improve significantly with the availability of hardware OSDs
SLIDE 17

Future Work

  • Metadata query based on user-defined attributes
  • File system scalability analysis
      – OSC Glenn cluster
      – iSER
  • Replicated metadata
SLIDE 18

Thank You