Okeanos: Wasteless Journaling for Fast and Reliable Multistream Storage

Andromachi Hatzieleftheriou, Stergios V. Anastasiadis
Department of Computer Science, University of Ioannina, Greece


  1. Okeanos: Wasteless Journaling for Fast and Reliable Multistream Storage
     Andromachi Hatzieleftheriou, Stergios V. Anastasiadis
     Department of Computer Science, University of Ioannina, Greece

  2. Outline
     • Motivation
     • Design
     • Implementation
     • Evaluation
     • Conclusions

  3. Motivation
     • Synchronous small writes
        - critical for system and application reliability
     • Multistream concurrency
        - effectively random I/O
     • In page-sized disk accesses
        - async writes have good performance due to batching in memory
        - sync writes result in wasteful traffic due to excessive full-page I/Os
     [Figure: Write traffic (Linux ext3, page size = 4KB). Total journal volume (MB) versus request size (KB) for data journaling (data & metadata) and ordered mode (metadata only).]
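To make the full-page waste concrete, here is a minimal arithmetic sketch. It assumes each synchronous request starts at a page boundary and is written out in whole 4KB pages, and it ignores metadata and journal bookkeeping. The function names `full_page_bytes` and `wasted_fraction` are illustrative and do not come from the slides.

```python
PAGE_SIZE = 4096  # page size quoted on the slide (4KB)


def full_page_bytes(request_size, page_size=PAGE_SIZE):
    """Bytes written when a request is rounded up to whole pages.

    Assumes request_size > 0 and a page-aligned starting offset.
    """
    pages = -(-request_size // page_size)  # ceiling division
    return pages * page_size


def wasted_fraction(request_size, page_size=PAGE_SIZE):
    """Share of the written bytes that the request did not modify."""
    written = full_page_bytes(request_size, page_size)
    return (written - request_size) / written
```

Under these assumptions, a 1KB request still writes a full 4KB page, so three quarters of the transferred bytes carry unmodified data.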

  4. Design Goals
     1. Reliable storage
        - keep data on disk
     2. Inexpensive synchronous small writes
        - sequential disk throughput
     3. Reduce disk bandwidth waste due to:
        - writes with high positioning overhead
        - unnecessary writes of unmodified data
     • Proposed approach:
        - batch random small writes in memory
        - journal data updates at subpage granularity

  5. Wasteless Journaling
     [Diagram labels: Memory (pages, data deltas); Disk (journal, filesystem).]
     • Idea:
        1. Synchronously transfer data deltas from memory to journal
        2. Occasionally move data blocks from memory to final location
     • Still wasteful!
        - large writes
        - disk traffic duplication

  6. Selective Journaling
     [Diagram labels: Memory (pages, data deltas); Disk (journal, filesystem).]
     • Definition:
        - write threshold differentiates requests by size
     • Idea:
        1. Transfer large requests to final location without journaling of data
        2. Treat small requests according to wasteless journaling
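The size-based routing described on this slide can be sketched as follows. This is only an illustration: the function name `route_write`, the returned labels, and any threshold value passed in are hypothetical, since the slides do not state the threshold used. Requests strictly below the threshold are treated as small, matching the conclusions slide ("journals only updates below a write threshold").

```python
def route_write(request_size, write_threshold):
    """Pick a path for a synchronous write based on its size.

    Returns "journal_delta" for small requests (handled as in wasteless
    journaling) and "final_location" for large requests (data written in
    place without data journaling). Both labels are illustrative.
    """
    if request_size < write_threshold:
        return "journal_delta"
    return "final_location"
```

For example, with a hypothetical 4096-byte threshold, a 512-byte request would be journaled as a delta, while a 64KB request would go directly to its final location.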

  7. Consistency
     • Wasteless Journaling:
        - atomic updates of both data and metadata
     • Selective Journaling:
        - data updates either journaled or not depending on request size
        - consistency at least as strict as default ext3 journaling mode (ordered)

  8. Prototype Implementation
     [Diagram labels: journal descriptor block (header, followed by one tag per data delta); multiwrite journal block (data deltas); data copies; page cache (modified data); block buffer (original data).]
     • Each tag records:
        - block number of final location
        - offset in page
        - length in bytes
     • Multiwrite journal block
        - accumulates multiple subpage data updates
     • During recovery
        - apply data deltas to corresponding final disk blocks
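The tag fields and the recovery step above can be illustrated with a small in-memory sketch. It is not the prototype's on-disk format: the class and function names are hypothetical, tags and deltas are kept in one object for brevity (the diagram places tags in the descriptor block), and the "disk" is a plain dictionary of page-sized buffers.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page and block size (4KB)


@dataclass
class Tag:
    block_num: int  # block number of the final location
    offset: int     # offset of the delta within the page
    length: int     # length of the delta in bytes


class MultiwriteJournalBlock:
    """Accumulates several subpage deltas in one page-sized payload."""

    def __init__(self, capacity=PAGE_SIZE):
        self.capacity = capacity
        self.tags = []
        self.payload = bytearray()

    def add_delta(self, block_num, offset, data):
        """Append a delta; return False if the payload has no room left."""
        if offset < 0 or offset + len(data) > PAGE_SIZE:
            raise ValueError("delta must fit inside a single page")
        if len(self.payload) + len(data) > self.capacity:
            return False
        self.tags.append(Tag(block_num, offset, len(data)))
        self.payload += data
        return True


def replay(journal_block, disk):
    """Recovery: apply each delta to its final block, in logged order."""
    pos = 0
    for tag in journal_block.tags:
        block = disk.setdefault(tag.block_num, bytearray(PAGE_SIZE))
        delta = journal_block.payload[pos:pos + tag.length]
        block[tag.offset:tag.offset + tag.length] = delta
        pos += tag.length
```

Replaying in logged order means that a later delta to the same byte range overwrites an earlier one, which is the behavior one would expect from a journal, though the slides do not spell out ordering details.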

  9. Experiments
     • Implemented in Linux kernel 2.6.18 ext3
     • Experimentation environment:
        - x86-based servers
        - quad-core 2.66GHz processor
        - 3GB RAM
        - Seagate Cheetah SAS 300GB 15KRPM disks
     • Workloads:
        - Microbenchmarks
        - Postmark
        - MPI-IO over PVFS2

  10. Latency
     [Figure: two panels at 1 Mbps/stream, write latency (ms) and read latency (us) versus number of streams (0 to 100), for ordered, data, wasteless, selective, and NILFS.]
     • Data and wasteless journaling achieve substantially lower write latency
        - similar to NILFS (stable Linux port of LFS)
     • NILFS read latency is significantly higher due to poor storage locality

  11. Disk Traffic
     [Figure: two panels at 1 Kbps/stream, journal throughput (MB/s) and file system throughput (MB/s) versus number of streams (0 to 8000), for ordered, data, wasteless, and selective. Lower is better.]
     • Data journaling is expensive in terms of journal traffic
     • Ordered journaling incurs increased filesystem traffic
     • Wasteless and selective journaling substantially reduce journal and filesystem traffic

  12. Application-Level Workloads
     [Figure: two panels. Postmark: transactions/s versus request size (KB). MPI-IO over PVFS2 (write size 1KB): throughput (MB/s) versus threads per client. Curves: wasteless, data, selective, ordered.]
     • Small files workload
        - wasteless increases transaction throughput
     • Parallel I/O workload
        - 13 clients, 1 PVFS2 data server, 1 PVFS2 metadata server (15 machines)
        - wasteless doubles the throughput of parallel application checkpointing

  13. Conclusions & Future Work
     • Key concept:
        - apply subpage journaling of data updates to ensure reliability
     • Wasteless Journaling
        - merges subpage writes into page-sized journal blocks
     • Selective Journaling
        - journals only updates below a write threshold
     • Performance benefits demonstrated over ext3:
        - reduced write latency
        - improved transaction throughput
        - avoided bandwidth waste
     • Future Work
        - extend to virtualization environments and flash memory systems
