an Object-Based File System for Large-Scale Federated IT Infrastructures



SLIDE 1

an Object-Based File System for Large-Scale Federated IT Infrastructures

Jan Stender, Zuse Institute Berlin

HPC File Systems: From Cluster To Grid, October 3-4, 2007

SLIDE 2

an object-based file system for federated IT infrastructures.

In this talk ...

  • Introduction: Object-based File Systems
  • Target Environment
  • Architecture
  • Features
  • Implementation
  • Current State & Plans

SLIDE 3

The XtreemOS EU Project

  • XtreemFS is part of the XtreemOS project
  • EU project, 18 partners from all over Europe, incl. NEC, SAP, Telefonica, Mandriva, Red Flag Linux
  • Develops a distributed operating system around Kerrighed, a single system image Linux kernel
  • The XtreemFS Team:

– Zuse Institute Berlin
– Barcelona Supercomputing Center
– NEC High Performance Computing, Stuttgart
– CNR, Pisa, Italy
– Universität Düsseldorf
– SAP Research

SLIDE 5

Object-based File Systems

Block-based File Systems:

  • The unit of distribution is the disk block
  • File system addresses blocks over the network
  • Metadata and block management at a central server

Object-based File Systems:

  • Storage devices can be more intelligent today
  • Split files into parts (objects), then distribute and address them
  • Only metadata at the server; block management is done by the storage devices
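
As a rough illustration of splitting a file into parts, the sketch below maps a byte range of a file onto fixed-size objects. The 128 KiB object size and the name `byte_range_to_objects` are assumptions made for this example; the talk does not state how XtreemFS sizes or numbers its objects.

```python
from typing import List, Tuple

OBJECT_SIZE = 128 * 1024  # assumed object size in bytes (hypothetical value)


def byte_range_to_objects(offset: int, length: int,
                          object_size: int = OBJECT_SIZE) -> List[Tuple[int, int, int]]:
    """Map a byte range to (object_number, offset_in_object, length) pieces."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    pieces = []
    end = offset + length
    while offset < end:
        obj_no = offset // object_size       # which object holds this byte
        obj_off = offset % object_size       # position inside that object
        chunk = min(object_size - obj_off, end - offset)
        pieces.append((obj_no, obj_off, chunk))
        offset += chunk
    return pieces
```

A client that holds such a list only needs to ask the storage devices for whole or partial objects; which disk blocks back an object stays the device's concern.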

SLIDE 6

Object-based File Systems

SLIDE 7

Object-based File Systems

several available ...

  • Lustre (Open-Source)
  • Panasas ActiveStor (commercial)
  • Ceph (Research, Open-Source)

common properties:

  • parallel designs for high-performance LAN access
  • centralized, one data center, one organization
  • control over failures of hardware

SLIDE 9

Target Environment

  • federation: clusters can join/leave/fail

– no centralized services at an organization

  • connected over the Internet

– complex failure cases (like network splits)
– no control over hardware

SLIDE 10

Target Environment

  • spanning administration domains

– cross-organization authentication
– virtual organization (VO) support necessary

  • commonly referred to as The Grid

SLIDE 12

Architecture

SLIDE 14

Features

  • POSIX-compliant file system API
  • advanced metadata management

– replication and partitioning of metadata
– extended metadata and queries

  • high performance

– parallel file access (striping)
– client-side caching

  • high data safety and availability

– replication of files
– automatic access pattern-based replica management
– RAID, end-to-end checksums
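
To show what parallel file access through striping can look like, here is a minimal sketch that spreads a file's objects over several storage servers (OSDs). Round-robin placement and the names `stripe_objects`, `osd-a` and `osd-b` are assumptions for illustration; the slides do not specify the striping policy.

```python
from typing import Dict, List


def stripe_objects(num_objects: int, osds: List[str]) -> Dict[str, List[int]]:
    """Assign object numbers to storage servers round-robin (RAID-0 style)."""
    if not osds:
        raise ValueError("at least one OSD is required")
    placement: Dict[str, List[int]] = {osd: [] for osd in osds}
    for obj_no in range(num_objects):
        # Consecutive objects land on different servers, so a sequential
        # read can be served by several servers at once.
        placement[osds[obj_no % len(osds)]].append(obj_no)
    return placement


layout = stripe_objects(5, ["osd-a", "osd-b"])
```

With two servers, neighbouring objects alternate between them, which is what lets a client fetch them in parallel.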

SLIDE 17

Features - Metadata Management

  • partitioning:

– split up volume (DB) into smaller parts

  • replication:

– primary/secondary with fail-over
– granularity: volumes / volume partitions

[Figure: an MRC with its volumes and volume DB; a volume holds dir and file entries. Metadata shown: name, timestamps, owner/group/ACL, content locations, size, extended attributes.]
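
The metadata fields named on this slide can be pictured as a simple record, and partitioning as a function that decides which part of a volume holds a given entry. This is only a sketch: the class `FileMetadata`, the function `partition_for`, and the hash-based partitioning scheme are assumptions, not the MRC's actual data model.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FileMetadata:
    name: str
    size: int
    owner: str
    group: str
    mtime: float                     # one of the timestamps
    content_locations: List[str]     # where the file's objects are stored
    acl: Dict[str, str] = field(default_factory=dict)
    extended_attributes: Dict[str, str] = field(default_factory=dict)


def partition_for(path: str, num_partitions: int) -> int:
    """Pick a volume partition for a path using a stable hash (assumed scheme)."""
    digest = hashlib.sha1(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


meta = FileMetadata(name="results.dat", size=4096, owner="alice", group="vo1",
                    mtime=0.0, content_locations=["osd-a"])
```

A stable hash keeps lookups for the same path on the same partition; a directory-subtree split would be another reasonable choice.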

SLIDE 20

XtreemFS – Replication

replication of files

  • read/write replication
  • fully transparent to client
  • guarantees sequential consistency
  • primary/secondary approach with fault-tolerant lease negotiation

consistency coordination

  • currently at object level
  • synchronous, asynchronous or on-demand
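
The role of a lease in a primary/secondary scheme can be sketched as follows: a replica acts as primary only while it holds an unexpired, time-limited lease. The class `LeaseTable` is hypothetical and keeps all leases in one in-memory table; the fault-tolerant negotiation between replicas mentioned on the slide is not modelled here.

```python
from typing import Dict, Optional, Tuple


class LeaseTable:
    """Tracks which replica currently acts as primary for each file."""

    def __init__(self, duration: float) -> None:
        self.duration = duration
        # file_id -> (holder, expiry time)
        self._leases: Dict[str, Tuple[Optional[str], float]] = {}

    def acquire(self, file_id: str, replica: str, now: float) -> bool:
        """Grant or renew the lease if it is free, expired, or held by `replica`."""
        holder, expiry = self._leases.get(file_id, (None, 0.0))
        if holder is None or holder == replica or now >= expiry:
            self._leases[file_id] = (replica, now + self.duration)
            return True
        return False

    def primary(self, file_id: str, now: float) -> Optional[str]:
        """Return the current primary, or None if no valid lease exists."""
        holder, expiry = self._leases.get(file_id, (None, 0.0))
        return holder if holder is not None and now < expiry else None
```

Because a lease expires on its own, a failed primary stops being primary without anyone having to reach it, which is what makes fail-over possible.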
SLIDE 21

Features - Replication - Consistency Coordination

synchronous

  • writing: acknowledge after all updates have been acknowledged

  • reading: on any replica
SLIDE 22

Features - Replication - Consistency Coordination

asynchronous

  • writing: acknowledge when performed locally
  • reading: check and fetch latest data

SLIDE 23

Features - Replication - Consistency Coordination

on-demand

  • writing: acknowledge when performed locally, do not disseminate updates
  • reading: check and fetch latest data
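
The three coordination modes (synchronous, asynchronous, on-demand) differ in when a write is acknowledged and whether a read must first check for newer data. The in-memory sketch below contrasts them. Per-object version numbers, the `backlog` list and all names are assumptions for illustration; no networking or failure handling is modelled.

```python
from enum import Enum
from typing import Dict, List, Tuple

Versioned = Tuple[int, bytes]  # (version, data); versions are an assumption


class Mode(Enum):
    SYNCHRONOUS = "synchronous"
    ASYNCHRONOUS = "asynchronous"
    ON_DEMAND = "on-demand"


class Replica:
    def __init__(self) -> None:
        self.objects: Dict[int, Versioned] = {}


def write(mode: Mode, primary: Replica, secondaries: List[Replica],
          backlog: List[Tuple[int, Versioned]], obj_no: int, data: bytes) -> None:
    """Apply a write at the primary; return when the mode allows acknowledging it."""
    version = primary.objects.get(obj_no, (0, b""))[0] + 1
    primary.objects[obj_no] = (version, data)
    if mode is Mode.SYNCHRONOUS:
        # Acknowledge only after every replica has the update.
        for replica in secondaries:
            replica.objects[obj_no] = (version, data)
    elif mode is Mode.ASYNCHRONOUS:
        # Acknowledge now; the update is pushed to the others later.
        backlog.append((obj_no, (version, data)))
    # ON_DEMAND: acknowledge now and do not disseminate the update.


def read(mode: Mode, local: Replica, primary: Replica, obj_no: int) -> bytes:
    """Read from a replica, first fetching newer data unless the mode is synchronous."""
    if mode is not Mode.SYNCHRONOUS:
        latest = primary.objects.get(obj_no)
        current = local.objects.get(obj_no, (0, b""))
        if latest is not None and latest[0] > current[0]:
            local.objects[obj_no] = latest
    return local.objects.get(obj_no, (0, b""))[1]
```

The trade-off is visible in the code: synchronous writes pay the cost up front so that any replica can serve reads, while the other two modes make writes cheap and move the check to read time.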

SLIDE 24

Features - Replication - Use Cases

problem

  • large file staged / generated by a single process
  • access by many clients
  • each client accesses only a small portion
  • clients reside on different sites

solution

  • creation of a new (initially empty) local replica per client
  • replicas are updated in background / on demand
  • replica can be used immediately, required objects may be transferred on demand
  • example: large database
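
The solution above, an initially empty local replica that pulls in objects only when they are needed, can be sketched like this. `LazyReplica` and the `fetch_remote` callback are hypothetical names; in a real deployment the callback would contact a remote replica, whereas here it is any function that returns an object's bytes.

```python
from typing import Callable, Dict, List


class LazyReplica:
    """A local replica that starts empty and fetches objects on first access."""

    def __init__(self, fetch_remote: Callable[[int], bytes]) -> None:
        self._fetch_remote = fetch_remote
        self._objects: Dict[int, bytes] = {}
        self.fetched: List[int] = []  # record of transfers, for inspection

    def read_object(self, obj_no: int) -> bytes:
        if obj_no not in self._objects:
            # Transfer only the object that was actually requested.
            self._objects[obj_no] = self._fetch_remote(obj_no)
            self.fetched.append(obj_no)
        return self._objects[obj_no]


# Stand-in for a large remote file made of 1000 one-byte objects.
remote_file = {i: bytes([i % 256]) for i in range(1000)}
replica = LazyReplica(remote_file.__getitem__)
first = replica.read_object(7)
again = replica.read_object(7)
```

A client that touches a small portion of the file transfers only that portion, and repeated reads are served locally.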
SLIDE 25

Features - Replication - Use Cases

problem

  • a producer gradually generates a large file
  • a consumer wants to access already written parts of the file
  • consumer and producer concurrently work on the same file

solution

  • consumer and producer each have a local replica
  • producer asynchronously updates consumer's replica
  • consumer can access written objects locally

SLIDE 28

Implementation

Protocol:

  • HTTP (with JSON encoding for RPCs)

MRC, OSD, Directory Service:

  • staged server implementation (non-blocking I/O)
  • Java (~40,000 LOC) + BerkeleyDB (MRC)

File System Client:

  • FUSE-based implementation (for now)
  • C (~13,000 LOC)
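
An RPC carried over HTTP with JSON encoding might look roughly like the request built below. The URL layout, the operation name `stat`, the host `mrc.example.org` and the argument format are all assumptions for illustration; the slides do not describe the actual wire format. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_rpc_request(base_url: str, operation: str, args: list) -> urllib.request.Request:
    """Build an HTTP POST whose body is the JSON-encoded argument list."""
    body = json.dumps(args).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{operation}",   # hypothetical: operation name as URL path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_rpc_request("http://mrc.example.org:8080", "stat", ["myVolume/dir/file"])
```

One appeal of this kind of protocol is that requests are plain text and can be inspected or reproduced with standard HTTP tools.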

SLIDE 30

Next Steps & Future Plans

next steps:

  • performance improvements

– read/write access
– striping
– replication (failure-free case)

  • public release (by the end of 2007)

medium to long-term goals:

  • RAID & checksums
  • monitoring
SLIDE 31

Thanks for your attention!