SLIDE 1

Understanding Manycore Scalability of File Systems

Changwoo Min, Sanidhya Kashyap, Steffen Maass, Woonhak Kang, and Taesoo Kim

SLIDE 2

Application must parallelize I/O operations

  • Death of single core CPU scaling

– CPU clock frequency: 3 ~ 3.8 GHz
– # of physical cores: up to 24 (Xeon E7 v4)

  • From mechanical HDD to flash SSD

– IOPS of a commodity SSD: 900K
– Non-volatile memory (e.g., 3D XPoint): 1,000x ↑

But file systems become a scalability bottleneck

SLIDE 3

Problem: lack of understanding of internal scalability behavior

[Figure: Exim mail server on RAMDISK, messages/sec vs. #core for btrfs, ext4, F2FS, and XFS]

Embarrassingly parallel application!
Three scalability behaviors observed:
  • 1. Saturated
  • 2. Collapsed
  • 3. Never scale
  • Intel 80-core machine: 8-socket, 10-core Xeon E7-8870
  • RAM: 512GB, 1TB SSD, 7200 RPM HDD
SLIDE 4

Even on a slower storage medium, the file system becomes a bottleneck

[Figure: Exim email server at 80 cores, messages/sec on RAMDISK, SSD, and HDD for btrfs, ext4, F2FS, and XFS]

SLIDE 5

Outline

  • Background
  • FxMark design

– A file system benchmark suite for manycore scalability

  • Analysis of five Linux file systems
  • Pilot solution
  • Related work
  • Summary
SLIDE 6

Research questions

  • What file system operations are not scalable?
  • Why are they not scalable?
  • Is it a problem of implementation or design?
SLIDE 7

Technical challenges

  • Applications are usually stuck with a few bottlenecks

→ cannot see the next level of bottlenecks before resolving them
→ difficult to understand overall scalability behavior

  • How to systematically stress file systems to understand their scalability behavior?

SLIDE 9

FxMark: evaluate & analyze manycore scalability of file systems

  • FxMark: 19 micro-benchmarks and 3 applications
  • File systems: tmpfs (memory FS), ext4 (journaling FS, J/NJ), XFS (journaling FS), btrfs (CoW FS), F2FS (log FS)
  • Storage medium: SSD
  • # core: 1, 2, 4, 10, 20, 30, 40, 50, 60, 70, 80
  • >4,700 experiments in total

SLIDE 10

Microbenchmark: unveil hidden scalability bottlenecks

  • Data block read, exercised at three sharing levels: Low, Medium, and High (a minimal sketch follows below)

[Diagram: sharing level (Low / Medium / High) vs. what is shared (file, block, process), with the read operations (R) issued by each core]
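A minimal sketch (not FxMark's actual code; paths and helper structure are illustrative): the Low level reads a block of a per-core private file, Medium reads a per-core block of one shared file, and High has every core read the same block of the shared file.

    /* Illustrative data-block-read worker; not FxMark's code. */
    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>

    #define BLOCK_SIZE 4096
    enum { LOW, MEDIUM, HIGH };                 /* sharing level */

    /* One worker per core; each pread() of one block counts as one operation. */
    static void reader(int core, int level, long iters)
    {
        char path[64], buf[BLOCK_SIZE];
        off_t off = 0;

        if (level == LOW) {                     /* private file, private block     */
            snprintf(path, sizeof(path), "/mnt/test/private.%d", core);
        } else {
            snprintf(path, sizeof(path), "/mnt/test/shared");
            if (level == MEDIUM)                /* shared file, per-core block     */
                off = (off_t)core * BLOCK_SIZE;
            /* HIGH: shared file and the same block (offset 0) for every core     */
        }

        int fd = open(path, O_RDONLY);
        for (long i = 0; i < iters; i++)
            pread(fd, buf, BLOCK_SIZE, off);
        close(fd);
    }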

SLIDE 11

Stress different components with various sharing levels

SLIDE 12

Evaluation

  • Data block read, Low sharing level

[Figure: M ops/sec vs. #core; legend: btrfs, ext4, ext4NJ, F2FS, tmpfs, XFS]

Linear scalability

SLIDE 13

Outline

  • Background
  • FxMark design
  • Analysis of five Linux file systems

– What are scalability bottlenecks?

  • Pilot solution
  • Related work
  • Summary
SLIDE 14

Summary of results: file systems are not scalable

[Figure: throughput vs. #core (10-80) for all 19 microbenchmarks (DRBL, DRBM, DRBH, DWOL, DWOM, DWAL, DWTL, DWSL, MRPL, MRPM, MRPH, MRDL, MRDM, MWCL, MWCM, MWUL, MWUM, MWRL, MWRM, plus DRBL/DRBM/DWOL/DWOM with O_DIRECT) and 3 applications (Exim in messages/sec, RocksDB in ops/sec, DBENCH in GB/sec); legend: btrfs, ext4, ext4NJ, F2FS, tmpfs, XFS]


SLIDE 18

Data block read

[Figure: M ops/sec vs. #core for DRBL (Low), DRBM (Medium), and DRBH (High)]

  • Low: all file systems linearly scale
  • Medium: XFS shows performance collapse
  • High: all file systems show performance collapse

SLIDE 19

Page cache is maintained for efficient access of file data

  • 1. read a file block
  • 2. look up the page cache
  • 3. cache miss
  • 4. read a page from disk
  • 5. copy page

[Diagram: reader, OS kernel page cache, and disk]

SLIDE 20

Page cache hit

  • 1. read a file block
  • 2. look up the page cache
  • 3. cache hit
  • 4. copy page

[Diagram: reader, OS kernel page cache, and disk]
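A condensed pseudocode sketch of the read path on the two slides above; the function and type names are illustrative, not the actual Linux kernel API.

    /* Pseudocode sketch of the read path (illustrative names only). */
    struct page *read_file_block(struct file *file, long index)
    {
        struct page *page;

        page = page_cache_lookup(file, index);      /* 2. look up the page cache  */
        if (!page) {                                /* 3. cache miss              */
            page = alloc_page();
            read_page_from_disk(file, index, page); /* 4. read the page from disk */
            page_cache_insert(file, index, page);
        }                                           /* 3'. cache hit skips I/O    */
        return page;                                /* caller copies the page out */
    }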

SLIDE 21

Page cache can be evicted to secure free memory

[Diagram: OS kernel page cache and disk]

SLIDE 22

… only when not being accessed

  • 1. read a file block
  • 4. copy page

Reference counting is used to track the # of tasks accessing a page:

    access_a_page(...) {
        atomic_inc(&page->_count);
        ...
        atomic_dec(&page->_count);
    }

[Diagram: reader, OS kernel page cache, and disk]
SLIDE 24

Reference counting becomes a scalability bottleneck

    access_a_page(...) {
        atomic_inc(&page->_count);
        ...
        atomic_dec(&page->_count);
    }

[Figure: DRBH, M ops/sec vs. #core; the refcounting code is annotated with 100 CPI vs. 20 CPI (cycles per instruction)]

High contention on a page reference counter → huge memory stall

Many more: directory entry cache, XFS inode, etc.
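The CPI gap above comes from cache-line bouncing on the shared counter. A self-contained user-space analogue (not the kernel code; thread and iteration counts are arbitrary) contrasts one contended atomic counter with padded per-thread counters:

    /* Contended vs. core-local atomic increments; build: gcc -O2 -pthread refcount.c */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    #define NTHREADS 8
    #define ITERS    10000000L

    static atomic_long shared_count;                      /* one hot cache line, like page->_count */
    static struct { atomic_long v; char pad[56]; } local_count[NTHREADS];  /* padded per thread   */

    static void *bump(void *arg)
    {
        long id = (long)arg;
        for (long i = 0; i < ITERS; i++) {
            atomic_fetch_add(&shared_count, 1);           /* contended: bounces between cores */
            atomic_fetch_add(&local_count[id].v, 1);      /* uncontended: stays core-local    */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t[NTHREADS];
        for (long i = 0; i < NTHREADS; i++)
            pthread_create(&t[i], NULL, bump, (void *)i);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(t[i], NULL);
        printf("shared = %ld, local[0] = %ld\n",
               (long)atomic_load(&shared_count), (long)atomic_load(&local_count[0].v));
        return 0;
    }

Profiling the two increments separately (e.g., with perf) attributes far more cycles per instruction to the contended one, which is the effect the slide quantifies.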

SLIDE 25

Lessons learned

High locality can cause performance collapse

Cache hits should be scalable
→ When cache hits are dominant, the scalability of the cache-hit path matters.

SLIDE 26

Data block overwrite

[Figure: M ops/sec vs. #core for DWOL (Low) and DWOM (Medium)]

  • Low: all file systems degrade gradually
  • Medium: ext4, F2FS, and btrfs show performance collapse

SLIDE 27

Btrfs is a copy-on-write (CoW) file system

  • Directs a write to a block to a new copy of the block
    → Never overwrites the block in place
    → Maintains multiple versions of a file system image

[Diagram: file system tree at time T and, after a write (W), at time T+1]

SLIDE 28

CoW triggers disk block allocation for every write

[Diagram: a write (W) at time T triggers block allocation, producing the tree at time T+1]

→ Disk block allocation becomes a bottleneck
→ Similarly for ext4 journaling and F2FS checkpointing
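A pseudocode sketch of that behavior (illustrative names, not btrfs internals): even a logically in-place overwrite goes through the block allocator, so every writer contends on allocation and on updating the CoW index.

    /* Pseudocode: copy-on-write overwrite of one block (illustrative only). */
    void cow_overwrite(struct inode *inode, long blkno, const void *data)
    {
        long old_blk = lookup_block(inode, blkno);    /* current on-disk location       */
        long new_blk = allocate_block();              /* every write hits the allocator */

        write_block(new_blk, data);                   /* write the new copy             */
        update_block_pointer(inode, blkno, new_blk);  /* the index is CoW-updated too   */
        release_block(old_blk);                       /* old version reclaimed later    */
    }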

SLIDE 29

Lessons learned

Overwriting could be as expensive as appending
→ Critical for log-structured FS (F2FS) and CoW FS (btrfs)

Consistency guarantee mechanisms should be scalable
→ Scalable journaling
→ Scalable CoW index structure
→ Parallel log-structured writing

SLIDE 31

Entire file is locked regardless of update range

  • All tested file systems hold an inode mutex for write operations
    – Range-based locking is not implemented

    ***_file_write_iter(...) {
        mutex_lock(&inode->i_mutex);
        ...
        mutex_unlock(&inode->i_mutex);
    }

SLIDE 32

Lessons learned

A file cannot be concurrently updated
– Critical for VMs and DBMSs, which manage large files

Need to consider techniques used in parallel file systems
→ E.g., range-based locking (a sketch follows below)
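A pseudocode sketch of range-based locking (hypothetical API; none of the tested file systems implement it): a writer locks only the byte range it touches, so non-overlapping writes to the same file can proceed in parallel.

    /* Pseudocode: range-based write locking (hypothetical names). */
    ssize_t range_locked_write(struct inode *inode, loff_t pos, size_t len,
                               const char *buf)
    {
        struct file_range_lock rl;
        ssize_t ret;

        lock_file_range(inode, &rl, pos, pos + len);   /* lock only [pos, pos + len)         */
        ret = do_write(inode, pos, len, buf);          /* overlapping ranges still serialize */
        unlock_file_range(inode, &rl);
        return ret;
    }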

SLIDE 34

Summary of findings

  • High locality can cause performance collapse
  • Overwriting could be as expensive as appending
  • A file cannot be concurrently updated
  • All directory operations are sequential
  • Renaming is system-wide sequential
  • Metadata changes are not scalable
  • Non-scalability often means wasting CPU cycles
  • Scalability is not portable

See our paper

Many of them are unexpected and counter-intuitive
→ Contention at the file system level to maintain data dependencies

SLIDE 35

Outline

  • Background
  • FxMark design
  • Analysis of five Linux file systems
  • Pilot solution

– If we remove contention in a file system, is it then scalable?

  • Related work
  • Summary
SLIDE 37

RocksDB on a 60-partitioned RAMDISK scales better

[Figure: ops/sec vs. #core, a single-partitioned RAMDISK vs. a 60-partitioned RAMDISK; legend: btrfs, ext4, F2FS, tmpfs, XFS; annotation: 2.1x]

** Tested workload: DB_BENCH overwrite **

Reduced contention on file systems helps improve performance and scalability

SLIDE 39

But partitioning makes performance worse on HDD

[Figure: ops/sec vs. #core (up to 20 cores), a single-partitioned HDD vs. a 60-partitioned HDD; legend: btrfs, ext4, F2FS, XFS; annotation: 2.7x]

** Tested workload: DB_BENCH overwrite **

Reduced spatial locality degrades performance
→ Medium-specific characteristics (e.g., spatial locality) should be considered

SLIDE 40

Related work

  • Scaling operating systems
    – Mostly use a memory file system to factor out the effect of I/O operations
  • Scaling file systems
    – Scalable file system journaling
      • ScaleFS [MIT:MSThesis'14]
      • SpanFS [ATC'15]
    – Parallel log-structured writing on NVRAM
      • NOVA [FAST'16]
SLIDE 41

Summary

  • Comprehensive analysis of the manycore scalability of five widely-used file systems using FxMark
  • Manycore scalability should be of utmost importance in file system design
  • New challenges in scalable file system design
    – Minimizing contention, scalable consistency guarantees, spatial locality, etc.
  • FxMark is open source
    – https://github.com/sslab-gatech/fxmark