SLIDE 1

Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers over InfiniBand

  • K. Vaidyanathan
  • P. Balaji
  • H.-W. Jin
  • D.K. Panda

Network-Based Computing Laboratory, Department of Computer Science and Engineering, The Ohio State University

SLIDE 2

Presentation Outline

  • Introduction and Background
  • Characterization of local and network-based file systems
  • Multi File System for Data-Centers
  • Experimental Results
  • Conclusions
SLIDE 3

Introduction

  • Exponential growth of Internet
    – Primary means of electronic interaction
    – Online book-stores, World-cup scores, Stock markets
    – Ex. Google, Amazon, etc.
  • Highly Scalable and Available Web-Services
  • Performance is critical for such Services
  • Utilizing Clusters for Web-Services? [shah01]
    – High Performance-to-cost ratio
    – Has been proposed by Industry and Research Environments

[shah01]: H. V. Shah, D. B. Minturn, A. Foong, G. L. McAlpine, R. S. Madukkarumukumana and G. J. Regnier, "CSP: A Novel System Architecture for Scalable Internet and Communication Services", USITS 2001

SLIDE 4

Cluster-Based Data-Centers

  • Nodes are logically partitioned
    – Each partition provides a specific service (serving static and dynamic content)
    – Use high-speed interconnects like InfiniBand, Myrinet, etc.
  • Requests get forwarded through multiple tiers
  • Replication of content on all nodes

[Figure: Clients over WAN → Proxy Server → Web Server (Apache) → Application Server (PHP) → Database Server (MySQL) → Storage]

SLIDE 5

Shared Cluster-Based Data-Centers

  • Hosting several unrelated services on a single data-center
    – Currently used by several ISPs and Web Service Providers (IBM, HP)
  • Replication of content
    – Amount of data replicated increases linearly with the number of web-sites hosted

[Figure: Clients over WAN → Proxy Server → Web Server → Application Server → Database Server → Storage, with the content of Websites A, B and C replicated on every node]

SLIDE 6

Issues in Shared Cluster-Based Data-Centers

  • File System Caches being shared across multiple web-sites
  • Under-utilization of the aggregate cache of all nodes
  • Web-site Content
    – Replication of content on all nodes if we use a local file system
    – Need to fetch the document over the network if we use a network file system; however, no replication is required
  • Can we adapt the file system to avoid these issues?
SLIDE 7

File System Interactions

[Figure: data-center tiers (Proxy Server, Web Server, Application Server, Database Server) connected over a SAN, distinguishing data-center interactions from file system interactions through local and network-based file systems]

SLIDE 8

Existing File Systems

  • Network-based File Systems: Parallel Virtual File System (PVFS) and Lustre (supports client-side caching)
  • Local File Systems: ext3fs and memory file system (ramfs)

[Figure: compute nodes (running the Web Server, with client-side caches and local file systems) connected over a SAN to a Metadata Manager and I/O (OST) nodes holding the data and server-side caches]

SLIDE 9

Presentation Outline

  • Introduction and Background
  • Characterization of local and network-based file systems
  • Multi File System for Data-Centers
  • Experimental Analysis
  • Conclusions
SLIDE 10

Characterization of local and network-based File Systems

  • Network Traffic Requirements
  • Aggregate Cache
  • Cache Pollution Effects
SLIDE 11

Network Traffic Requirements

  • Absolute Network Traffic generated
    – Static Content
    – Dynamic Content
  • Network Utilization
    – Large/Small bursts (static or dynamic content)
  • Overhead of Metadata Operations
SLIDE 12

Aggregate Cache in Data-Centers

  • Local File Systems use only a single node's cache
    – Small files get huge benefits if in memory; otherwise, we pay the penalty of accessing the disk
    – Large files may not fit in memory and also have high penalties in accessing the disk
  • Network File Systems use the aggregate cache of all nodes
    – Large files, if striped, can reside in the file system cache on multiple nodes
    – Small files also get benefits due to the aggregate cache
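As a rough illustration of why striping lets a large file occupy several nodes' caches at once, the sketch below maps each stripe of a file to an I/O server round-robin. The stripe size and server count are arbitrary examples, not the actual PVFS/Lustre distribution policy:

```python
def stripe_placement(file_size, stripe_size, num_servers):
    """Map each stripe of a file to an I/O server, round-robin."""
    num_stripes = -(-file_size // stripe_size)  # ceiling division
    return [stripe % num_servers for stripe in range(num_stripes)]

# A 1 MB file with 64 KB stripes spreads over 16 stripes; with 4 I/O
# servers, each server (and its cache) holds only a quarter of the file.
placement = stripe_placement(1 << 20, 64 * 1024, 4)
```

With local file systems, by contrast, the whole file competes for one node's cache.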

SLIDE 13

Cache Pollution Effects

  • Working set – frequently accessed documents; usually fits in memory
  • Shared Data-Centers
    – Multiple web-sites share the file system cache; each web-site has a smaller amount of file system cache to utilize
    – Bursts of requests/accesses to one web-site may result in cache pollution
    – May result in a drastic drop in the number of cache hits
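Cache pollution can be illustrated with a small simulation (a plain LRU cache standing in for the file system cache; the sizes and site names are made up): a burst of requests from one web-site evicts another site's working set, turning its hits into misses.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache tracking only which documents are resident."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def access(self, doc):
        # Hit: refresh recency. Miss: insert, evicting the LRU entry.
        if doc in self.entries:
            self.entries.move_to_end(doc)
            return True
        self.entries[doc] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)
        return False

cache = LRUCache(capacity=8)
site_a = [f"A/{i}" for i in range(8)]   # site A's working set fits exactly
for doc in site_a:
    cache.access(doc)                   # warm the cache
hits_before = sum(cache.access(doc) for doc in site_a)

for i in range(8):                      # burst of requests to site B
    cache.access(f"B/{i}")
hits_after = sum(cache.access(doc) for doc in site_a)
```

After the burst, site A's entire working set has been evicted: `hits_before` is 8 and `hits_after` is 0, the "drastic drop" described above.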

SLIDE 14

Presentation Outline

  • Introduction and Background
  • Characterization of local and network-based file systems
  • Multi File System for Data-Centers
  • Experimental Results
  • Conclusions
SLIDE 15

Multi File System for Data-Centers

Characterization            ext3fs   ramfs   pvfs           lustre
Network Traffic generated   Min      Min     More traffic   Min
Use of Aggregate Cache      No       No      Yes            Yes
Cache pollution effects     Yes      No      Yes            Yes
Metadata overhead           No       No      Yes            Yes

SLIDE 16

Multi File System for Data-Centers

  • A combination of file systems for different environments
  • Memory file system and local file system (ext3fs) for workloads with high temporal locality
  • Memory file system and network file system (pvfs/lustre) for workloads with low temporal locality
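A minimal sketch of such a placement rule (the function name and the in-working-set test are hypothetical; the slides do not give the exact policy):

```python
def choose_file_system(temporal_locality, in_working_set):
    """Pick a backing file system for a document under the multi
    file system approach: hot documents live on the memory file
    system; the rest go to the local file system when locality is
    high, or to the network file system when locality is low."""
    if in_working_set:
        return "ramfs"
    if temporal_locality == "high":
        return "ext3fs"
    return "pvfs/lustre"
```

The working set stays on ramfs in either regime, which is what shields it from cache pollution.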

SLIDE 17

Presentation Outline

  • Introduction and Background
  • Characterization of local and network-based file systems in data-centers
  • Multi File System for Data-Centers
  • Experimental Results
  • Conclusions
SLIDE 18

Experimental Test-bed

  • Cluster 1:
    – 8 SuperMicro SUPER X5DL8-GG nodes; Dual Intel Xeon 3.0 GHz processors
    – 512 KB L2 Cache, 2 GB memory; PCI-X 64-bit 133 MHz
  • Cluster 2:
    – 8 SuperMicro SUPER P4DL6 nodes; Dual Intel Xeon 2.4 GHz processors
    – 512 KB L2 Cache, 512 MB memory; PCI-X 64-bit 133 MHz
  • Mellanox MT23108 Dual Port 4x HCAs; MT43132 24-port switch
  • Apache 2.0.48 Web and PHP 4.3.7 Servers; MySQL 4.0.12, PVFS 1.6.2, Lustre 1.0.4

SLIDE 19

Workloads

  • Zipf workloads: the relative probability of a request for the i-th most popular document is proportional to 1/i^α with α ≤ 1
    – High temporal locality (constant α)
    – Low temporal locality (varying α)
  • TPC-W traces according to the specifications

Class     File Sizes   Size
Class 0   1K – 250K    25 MB
Class 1   1K – 1MB     100 MB
Class 2   1K – 4MB     450 MB
Class 3   1K – 16MB    2 GB
Class 4   1K – 64MB    6 GB
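The Zipf request distribution above can be sketched as follows (the document count, α values and seed are arbitrary examples, not the parameters used in the experiments):

```python
import random

def zipf_probs(num_docs, alpha):
    """P(i) proportional to 1/i**alpha for the i-th most popular document."""
    weights = [1.0 / (i ** alpha) for i in range(1, num_docs + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def request_trace(num_docs, alpha, num_requests, seed=0):
    """Draw a synthetic request trace. A lower alpha spreads requests
    over more documents, i.e. lower temporal locality."""
    rng = random.Random(seed)
    probs = zipf_probs(num_docs, alpha)
    return rng.choices(range(1, num_docs + 1), weights=probs, k=num_requests)
```

Varying α in `request_trace` reproduces the "low temporal locality" sweep used on later slides.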

SLIDE 20

Experimental Analysis (Outline)

  • Basic Performance of different file systems
  • Network Traffic Requirements
  • Impact of Aggregate Cache
  • Cache Pollution Effects
  • Multi File System for Data-Centers
SLIDE 21

Basic Performance

  • Network File Systems incur high overhead for metadata operations (open() and close())
  • Lustre supports a client-side cache
  • For large files, network-based file systems do better than the local file system due to striping of the file

Latency (usecs)             ext3fs         ramfs         pvfs           lustre
                            4K     1M      4K    1M      4K     1M      4K     1M
Open & Close overhead       6      6       6     6       1060   1060    876    876
Read Latency (cache)        4      1602    4     1578    680    13825   7.7    1998
Read Latency (no cache)     1500   76312   1400  2379    9600   44108   3000   50713
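The metadata-overhead row could be reproduced with a loop like the following (a generic sketch, not the benchmark actually used on the slides): time bare open()/close() pairs, which on a network file system each cost a round trip to the metadata server.

```python
import os
import time

def open_close_latency_us(path, iters=1000):
    """Average latency (microseconds) of one open()+close() pair."""
    start = time.perf_counter()
    for _ in range(iters):
        fd = os.open(path, os.O_RDONLY)
        os.close(fd)
    elapsed = time.perf_counter() - start
    return elapsed / iters * 1e6
```

Running it against a file on each mount point (ext3fs, ramfs, PVFS, Lustre) would give one "Open & Close overhead" column per file system.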

SLIDE 22

Network Traffic Requirements

[Charts: number of packets sent/received (up to 800,000) for Zipf Classes 0–3 and TPC-W Classes 0–3, comparing ext3fs, pvfs and lustre]

  • Absolute Network Traffic Generated:
    – Increases proportionally compared to the local file system for PVFS
    – For Lustre, the traffic is close to that of the local file system
    – For dynamic content, the network traffic does not increase with an increase in database size

SLIDE 23

Impact of Caching and Metadata Operations

  • Local File Systems are better for workloads with high temporal locality
  • Surprisingly, Lustre performs comparably with local file systems

[Charts: transactions per second (TPS) for Zipf Classes 0–3 and TPC-W Classes 0–3, comparing ext3fs, ramfs, pvfs and lustre]

SLIDE 24

Impact of Aggregate Cache

[Chart: TPS for workloads with varying temporal locality (α from 0.8 down to 0.3), comparing ext3fs, pvfs and lustre]

  • The aggregate cache improves data-center performance for network-based file systems

SLIDE 25

Cache Pollution Effects in Shared Data-Centers

  • For small workloads, web-sites are not affected
  • For large workloads, cache pollution affects multiple web-sites
  • Placing files on a memory file system might avoid the cache pollution effects

[Chart: percentage of cached vs. non-cached content for single and shared data-centers, Zipf Classes 0–4]

SLIDE 26

Multi File System Data-Centers

  • Performance benefits for static content are close to 48%
  • Performance benefits for dynamic content are close to 41%

[Charts: performance improvement (up to 60%) under low, medium and heavy load for Zipf Classes 0–2 and TPC-W Classes 0–2]

SLIDE 27

Multi File System Data-Centers

  • Benefits are two-fold:
    – Avoidance of cache pollution
    – Reduced overhead of open() and close() operations for small files

[Chart: TPS for workloads with varying temporal locality (α from 0.75 down to 0.45), comparing pvfs alone with pvfs plus ramfs]

SLIDE 28

Conclusions & Future Work

  • Fragmentation of resources in shared data-centers
    – Under-utilization of the file system cache in clusters
    – Cache pollution affects performance
  • Studied the impact of file systems in terms of network traffic, aggregate cache and cache pollution effects
  • Proposed a Multi File System approach to utilize the benefits of each file system
    – Combination of network and memory file systems for static content with low temporal locality
    – Combination of memory and local file systems for static content with high temporal locality and for dynamic content
  • Propose to perform dynamic reconfiguration based on each node's memory cache and to provide prioritization and QoS

SLIDE 29

Web Pointers

http://www.cse.ohio-state.edu/~panda http://nowlab.cse.ohio-state.edu

{vaidyana,balaji,jinhy,panda}@cse.ohio-state.edu
