Interposed Request Routing for Scalable Network Storage

Darrell Anderson, Jeff Chase, and Amin Vahdat
Department of Computer Science, Duke University

Duke University • Department of Computer Science

Goals

Devise a highly scalable network storage architecture

  • Interpose on a standard file system protocol.

– Prototype supports NFS version 3.

  • Distribute responsibilities and data.

– Divide functions (e.g., data vs. metadata).
– Scale functions by aggregating servers.

This talk:

  • Request routing to scale functions.

In the Beginning...

Diagram: NFS client and NFS server connected by a network.

  • Client sends and receives standard NFS packets.
  • Server sends and receives standard NFS packets.

Interposed Routing

Diagram: NFS client, a µProxy (µ), and several specialized servers (*Server).

  • Client sends and receives standard NFS packets.
  • Slice µProxy intercepts and redirects NFS packets to specialized servers.

Outline

Interposed routing

Slice architecture

  • Functional decomposition
  • Data decomposition

Functions

  • Block-I/O
  • Small-file
  • Metadata

Request routing

Performance

Slice Architecture

Diagram: the client's µProxy routes bulk I/O to the network storage array (striping policy), small-file reads and writes to the small-file servers (file placement policy), and name space requests to the directory servers (name routing policy).

Functional Decomposition

Diagram (same figure as Slice Architecture): the µProxy divides the request stream by function: bulk I/O, small-file read/write, and name space requests.

Data Decomposition

Diagram (same figure as Slice Architecture): each function is spread across multiple servers by its own policy: striping, file placement, or name routing.

Outline

Interposed routing

Slice architecture

  • Functional decomposition
  • Data decomposition

Functions

  • Block-I/O Storage Nodes
  • Small-file Servers
  • Directory Servers

Request routing

Performance

Block-I/O Storage Nodes

Network storage nodes provide all storage in Slice.

  • Prototype uses a simple object-based model.

– Read, write, remove, truncate.

  • Clients access storage nodes directly.

– Static striping, or flexible block-maps.
– Optional RAID “10” mirrored striping.

Diagram: client µProxy sends bulk I/O to the network storage array under the striping policy.
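
Static striping can be illustrated with a small sketch. The slides do not give the stripe unit or the exact layout, so the 32 KB `STRIPE_UNIT`, the round-robin placement, and the pairing of adjacent nodes as mirrors are all assumptions made for illustration, and the function names are hypothetical.

```python
from typing import List, Tuple

STRIPE_UNIT = 32 * 1024  # assumed stripe unit in bytes; not stated in the slides


def stripe_target(offset: int, num_nodes: int,
                  stripe_unit: int = STRIPE_UNIT) -> Tuple[int, int]:
    """Map a file byte offset to (storage node index, offset on that node).

    Round-robin placement: consecutive stripe units go to consecutive nodes.
    """
    if num_nodes < 1:
        raise ValueError("need at least one storage node")
    block = offset // stripe_unit      # which stripe unit of the file
    node = block % num_nodes           # node that holds this unit
    local_block = block // num_nodes   # index of the unit on that node
    return node, local_block * stripe_unit + offset % stripe_unit


def mirrored_targets(offset: int, num_nodes: int,
                     stripe_unit: int = STRIPE_UNIT) -> List[Tuple[int, int]]:
    """RAID-10 style placement: stripe over mirror pairs, return both replicas.

    Assumes nodes 2k and 2k+1 form mirror pair k.
    """
    if num_nodes < 2 or num_nodes % 2:
        raise ValueError("mirrored striping needs an even number of nodes")
    pair, local = stripe_target(offset, num_nodes // 2, stripe_unit)
    return [(2 * pair, local), (2 * pair + 1, local)]
```

Under these assumptions a mirrored write goes to both nodes of a pair, while a mirrored read may be served by either one.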

Small-File Servers

Handle read and write operations on small files.

  • All I/O requests below threshold (e.g., 64 KB).

– Also the initial “small” segments of large files.

  • Absorb and aggregate I/O on small files.

– Data backed by storage array.

  • Storage nodes need not handle small files well.

Diagram: client µProxy sends small-file reads and writes to the small-file servers under the file placement policy; the small-file servers are backed by the network storage array.

Directory Servers

Handle name space operations.

  • Associate name with attributes (lookup, getattr).
  • Manage directory contents (create, readdir).

– Preserve dependencies between objects.

  • Create affects new object and its parent directory.

Diagram: client µProxy sends name space requests to the directory servers under the name routing policy; the network storage array sits alongside.
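
Taken together, the three functions suggest a dispatch step along the following lines. This is a minimal sketch, not the µProxy's actual logic: it assumes the threshold is compared against the byte offset of a read or write, it lists only the name space procedures the slides mention (plus MKDIR from the routing policies), and `Request` and `classify` are hypothetical names.

```python
from dataclasses import dataclass

SMALL_FILE_THRESHOLD = 64 * 1024  # the slides give 64 KB as an example value

# NFS version 3 procedures grouped by the function assumed to serve them.
NAME_SPACE_OPS = {"LOOKUP", "GETATTR", "CREATE", "MKDIR", "READDIR"}
DATA_OPS = {"READ", "WRITE"}


@dataclass(frozen=True)
class Request:
    proc: str        # NFS procedure name, e.g. "READ"
    offset: int = 0  # byte offset; only meaningful for READ and WRITE


def classify(req: Request, threshold: int = SMALL_FILE_THRESHOLD) -> str:
    """Return the server class that should receive this request."""
    if req.proc in NAME_SPACE_OPS:
        return "directory"
    if req.proc in DATA_OPS:
        # I/O below the threshold, including the first segment of a large
        # file, goes to a small-file server; the rest goes to storage nodes.
        return "small-file" if req.offset < threshold else "storage"
    raise ValueError(f"procedure not covered by this sketch: {req.proc}")
```

For example, `classify(Request("READ", 4096))` selects a small-file server, while the same read at offset 64 KB or beyond goes to the storage nodes.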

Outline

Interposed routing

Slice architecture

  • Functional decomposition
  • Data decomposition

Functions

  • Block-I/O Storage Nodes
  • Small-file Servers
  • Directory Servers

Request routing

Performance

Request Routing Goals

Focus on name space.

  • Spread name space across multiple servers.

– Balance capacity and load.

  • (Maybe) keep entries on same server as parent.

– Some name space ops involve multiple sites.

  • Create entry, update parent modify time.

Request Routing

Three policies for name space request routing:

  • Volume Partitioning:

– Divide the name space into volumes.
– Volumes have well defined mount points.

  • Mkdir Switching:

– Items on same server as parent directory.
– Some mkdirs redirect to another server.

  • Name Hashing:

– Name space is a distributed hash table.
– Requests hash by name, parent dir.
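
Two of these policies reduce to a pure function of the request, which the sketch below illustrates. It is a hedged example only: the choice of SHA-1, the key layout, and the use of path strings for volume partitioning are assumptions (real NFS requests carry file handles rather than paths), and both function names are hypothetical.

```python
import hashlib
from typing import Dict


def name_hash_server(parent_handle: bytes, name: str, num_servers: int) -> int:
    """Name hashing: pick a directory server from (parent dir, name)."""
    key = parent_handle + b"/" + name.encode("utf-8")
    digest = hashlib.sha1(key).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_servers


def volume_server(path: str, mounts: Dict[str, int]) -> int:
    """Volume partitioning: the longest matching mount point wins.

    `mounts` maps a mount point such as "/home" to a server index.
    Raises ValueError if no mount point covers the path.
    """
    covering = [m for m in mounts
                if path == m or path.startswith(m.rstrip("/") + "/")]
    return mounts[max(covering, key=len)]
```

With name hashing, entries of one directory scatter across servers, so an operation such as create may touch two sites; with volume partitioning, everything below a mount point stays on one server.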

Outline

Interposed routing

Slice architecture

  • Functional decomposition
  • Data decomposition

Functions

  • Block-I/O Storage Nodes
  • Small-file Servers
  • Directory Servers

Request routing

Performance

Experiment Configuration

Hardware

  • Client: 450 MHz P3 with 32 bit 33 MHz PCI.
  • Server: 733 MHz P3 with 64 bit 66 MHz PCI.
  • Server: 8x 18 GB Seagate Ultra-2 Cheetah disks.
  • Gigabit Ethernet with 9 KB “jumbo” frames.

Software

  • FreeBSD 4.0-release.
  • Modified NFS stack and firmware for zero-copy.
  • NFS uses UDP/IP with 32 KB MTU.
  • Slice kernel modules; µProxy is IP filter on client.

Block-I/O Scaling

Figure: two plots of bandwidth (MB/s) versus number of storage nodes, one for single-client bandwidth and one for aggregate bandwidth, each with read, write, mirror-read, and mirror-write series.

Name Space Scaling

Figure: average time (s) versus number of clients, with series N-UFS, N-MFS, Slice-1, Slice-2, Slice-4, and Slice-8.

Mkdir Switching Affinity

Figure: average time (s) versus directory affinity (%), with series for 1, 4, 8, and 16 clients.
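
The affinity axis can be read as a knob on the mkdir switching policy. The sketch below assumes that "directory affinity" is the probability that a new directory stays on its parent's server; the slides do not define the term, and the uniform choice among the other servers is also an assumption. `route_mkdir` is a hypothetical name.

```python
import random


def route_mkdir(parent_server: int, num_servers: int,
                affinity: float, rng: random.Random) -> int:
    """Choose the directory server for a new directory.

    With probability `affinity` the directory stays with its parent;
    otherwise the mkdir is redirected to one of the other servers.
    """
    if not 0.0 <= affinity <= 1.0:
        raise ValueError("affinity must be between 0 and 1")
    if num_servers == 1 or rng.random() < affinity:
        return parent_server
    # Pick uniformly among the servers other than the parent's.
    other = rng.randrange(num_servers - 1)
    return other if other < parent_server else other + 1
```

At 100% affinity the whole tree stays on one server; lower values spread directories across servers at the cost of more operations that span two sites.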

SPECsfs97 Throughput

Figure: delivered load (IOPS) versus offered load (IOPS), with series NFS, Slice-1, Slice-2, Slice-4, Slice-6, Slice-8, and Ideal.

SPECsfs97 Latency

Figure: average latency (msec/op) versus delivered load (IOPS), with series NFS, Slice-1, Slice-2, Slice-4, Slice-6, Slice-8, and Celerra 506.

Summary

Slice interposes between NFS client and server.

  • Simple redirection of NFS version 3 packets.

– Slice µProxy inspects and rewrites packets.

  • Separates functions normally handled by a central server.

– Functional decomposition for request stream.
– Data decomposition to scale each function.

  • Prototype shows performance and scalability.

http://www.cs.duke.edu/ari/slice

EOF

Handling Failures

Approach: write-ahead logging.

  • µProxy logs intentions for “dangerous” operations to coordinator.

– Also logs when finished.

  • Coordinator completes or aborts aging operations.

– Roll forward, or back.

  • Independent of client, server, and storage nodes.

Diagram: NFS client, µProxy (µ), and coordinator, with the message sequence:

  • 1. Request
  • 2. Danger!
  • 3. (do it)
  • 4. Safe again
  • 5. Response
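
The five-step sequence can be sketched as an intention log kept by the coordinator. This is an illustration under assumptions, not the Slice implementation: the in-memory log, the timeout value, and the names `Coordinator` and `run_dangerous` are hypothetical, and the recovery action (roll forward or back) is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Coordinator:
    timeout: float = 5.0  # assumed age (seconds) after which an op is suspect
    pending: Dict[int, float] = field(default_factory=dict)

    def log_intent(self, op_id: int, now: float) -> None:
        self.pending[op_id] = now        # step 2: "Danger!"

    def log_done(self, op_id: int) -> None:
        self.pending.pop(op_id, None)    # step 4: "Safe again"

    def aging(self, now: float) -> List[int]:
        """Operations logged but not finished within the timeout."""
        return [op for op, t in self.pending.items()
                if now - t > self.timeout]


def run_dangerous(coord: Coordinator, op_id: int,
                  operation: Callable[[], str], now: float) -> str:
    """What the µProxy would do around a multi-site operation."""
    coord.log_intent(op_id, now)   # 2. log the intention first
    result = operation()           # 3. perform the operation
    coord.log_done(op_id)          # 4. log completion
    return result                  # 5. response goes back to the client
```

If the operation never completes, its intention stays in `pending`, and `aging` eventually reports it so the coordinator can complete or abort it.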