A Year in the Life of a Parallel File System


  1. A Year in the Life of a Parallel File System Glenn K. Lockwood, Shane Snyder, Teng Wang, Suren Byna, Philip Carns, Nicholas J. Wright November 15, 2018 - 1 -

  2. Why was my job's I/O slow? Socrates (left) and Plato (right) contemplating I/O performance in The School of Athens by Raphael. 1511. - 2 -

  3. Why was my job's I/O slow? 1. You are doing something wrong 2. Another job/system task is competing with you 3. The storage system is degraded - 3 -

  4. Why was my job's I/O slow? 1. You are doing something wrong 2. Another job/system task is competing with you 3. The storage system is degraded (most frustrating, least studied) - 4 -

  5. Our holistic approach to I/O variation 1. Measure performance variation over a year on large-scale production HPC systems 2. Collect telemetry from across the entire system 3. Quantitatively describe why I/O varies so much - 5 -

  6. 1. Observing variation in the wild • Probe I/O performance daily – Jobs scaled to achieve >80% peak fs performance – 45–300 sec per probe – App I/O patterns: IOR shared-file and IOR file-per-process with O(1 MiB) transfers; VPIC and BD-CATS shared-file and HACC file-per-process with O(100 MiB) transfers • Run in diverse production environments – Two DOE HPC facilities (ALCF, NERSC) – Three large-scale systems (Mira, Edison, Cori) – Two parallel file system implementations (GPFS, Lustre) – Five file systems (Mira gpfs1, Edison lustre[1-3], Cori lustre1) - 6 -

  7. 2. Collecting diverse data for holistic analysis • Compute nodes: Darshan • I/O nodes and storage servers: LMT, ggiostat • Service nodes: Slurm, Cobalt, Cray SDB - 7 -

  8. Year-long I/O performance dataset • 366 days of testing • 11,986 jobs run • 220 metrics measured per job – some derived or degenerate – sometimes undefined …and not very insightful at a glance - 8 -
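A minimal sketch, assuming the per-job probe results have been exported to a single CSV (the file name and the "date" column are hypothetical), of how such a table might be loaded and screened for the degenerate or undefined metrics mentioned above:

```python
import pandas as pd

# Hypothetical export of the year-long probe results: one row per probe job,
# one column per collected metric (file name and column names are assumptions).
df = pd.read_csv("year_in_the_life_probes.csv", parse_dates=["date"])
print(df.shape)  # roughly 11,986 rows (jobs) x ~220 columns (metrics)

# Undefined metrics (all NaN) and degenerate metrics (a single constant value)
# carry no information, so drop them before any correlation analysis.
undefined = [c for c in df.columns if df[c].isna().all()]
degenerate = [c for c in df.columns if df[c].nunique(dropna=True) <= 1]
df = df.drop(columns=sorted(set(undefined + degenerate)))
```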

  9. I/O performance variation in production - 9 -

  10. Two flavors of I/O performance variation - 10 -

  11. Performance varies over the long term Systematic, long-term problem for one I/O pattern - 11 -

  12. Performance varies over the short term Transient bad I/O day for all jobs - 12 -

  13. Performance also experiences transient losses Transient I/O problems - 13 -

  14. Again: Why was my job's I/O so slow? • Could be: – Long-term systematic problems – Short-term transient problems • The next questions: – What causes long-term, systematic problems? – What causes short-term transient problems? • Our approach: – Separate problems over these two time scales – Independently classify causes of longer-term and shorter-term variation - 14 -

  15. Separating short-term from long-term • Goal: Numerically distinguish time-dependent variation • Simple moving averages (SMAs) from financial market technical analysis • Where short-term average performance diverges from overall average - 15 -

  16. Quantitatively bound long-term problems • Goal: Numerically distinguish time-dependent variation • Simple moving averages (SMAs) from financial market technical analysis • Where short-term average performance diverges from overall average • Example: Bug in a specific file system client version - 16 -

  17. Separating short-term from long-term variation Mira (GPFS), all benchmarks • Goal: Contextualize transient variation happening during long-term variation • Two SMAs at different time windows (e.g., 14 days and 49 days) - 17 -

  18. Separating short-term from long-term variation Mira (GPFS), all benchmarks • Goal: Contextualize transient variation happening during long-term variation • Two SMAs at different time windows (e.g., 14 days and 49 days) • Crossover points: short-term behavior matches long-term behavior • Divergence regions: short-term behavior diverges from long-term behavior - 18 -
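A minimal sketch of this moving-average bookkeeping, assuming a pandas Series of daily fraction-of-peak performance for one benchmark indexed by date; the 14-day and 49-day windows follow the slide's example, while the relative-gap threshold used to declare divergence is an assumption:

```python
import pandas as pd

def divergence_mask(perf: pd.Series, short_win=14, long_win=49, rel_tol=0.05):
    """Flag days where the short-window SMA diverges from the long-window SMA.

    perf: daily fraction-of-peak performance for one benchmark, indexed by date.
    rel_tol: assumed relative-gap cutoff; the paper's exact criterion may differ.
    """
    sma_short = perf.rolling(window=short_win, min_periods=short_win).mean()
    sma_long = perf.rolling(window=long_win, min_periods=long_win).mean()
    gap = (sma_short - sma_long) / sma_long  # signed relative gap
    # Sign changes of `gap` are the crossover points; contiguous runs where its
    # magnitude exceeds rel_tol are the divergence regions.
    return gap.abs() > rel_tol
```

Contiguous runs of True in the returned mask delimit the divergence regions that the next slides correlate against the rest of the telemetry.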

  19. What causes divergence regions? Mira (GPFS), all benchmarks • Capitalize on widely ranging performance (and all 219 other metrics) • Correlate performance in this region with other metrics – Bandwidth contention – IOPS contention – Data server CPU load – ... - 19 -

  20. What causes short-term variation over a year? • Each spot is a correlation within a single divergence region with p-value < 10⁻⁵ • Dot radius ∝ -log(p-value) - 20 -
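A minimal sketch of this per-region correlation screen, assuming a DataFrame holding the probe jobs that fall inside one divergence region, with a performance column alongside the other collected metrics (the column name "perf" is hypothetical; the p < 10⁻⁵ cutoff is the one quoted on the slide):

```python
import pandas as pd
from scipy import stats

def region_correlations(region: pd.DataFrame, perf_col="perf", alpha=1e-5):
    """Correlate performance against every other metric within one divergence region.

    Returns (metric, r, p) tuples for metrics whose Pearson correlation with
    performance is significant at the assumed p < alpha threshold.
    """
    hits = []
    for metric in region.columns:
        if metric == perf_col:
            continue
        paired = region[[perf_col, metric]].dropna()
        if len(paired) < 3 or paired[metric].nunique() < 2:
            continue  # too few samples, or a constant metric: correlation undefined
        r, p = stats.pearsonr(paired[perf_col], paired[metric])
        if p < alpha:
            hits.append((metric, r, p))
    return hits
```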

  21. Source of bimodality - 21 -

  22. Identifying sources of transient variation Mira (GPFS), all benchmarks • Partitioning allows us to classify short-term performance variation • Can’t correlate truly transient variation though - 22 -

  23. Identifying sources of transient variation Mira (GPFS), all benchmarks • Confidently classifying transients is statistically impossible • Classifying in aggregate is possible! • If we observe a possible relationship… – One time? Maybe coincidence – Many times? Maybe not a coincidence - 23 -

  24. Identifying sources of transient variation 1. Identify jobs affected by transient issues 2. Define divergence regions 3. Classify jobs based on region, calculate p-values 4. Repeat for all transients and calculate aggregate p-values - 24 -
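A minimal sketch of steps 3–4, assuming each probe job near a transient has already been flagged as slow or not and as coinciding with a candidate cause (e.g., bandwidth contention) or not; Fisher's exact test and Fisher's method for combining p-values are shown as one standard way to do this, not necessarily the exact statistics used in the study:

```python
from scipy import stats

def transient_evidence(slow_flags, contended_flags):
    """p-value that slow probe jobs coincide with contention within one transient.

    slow_flags, contended_flags: parallel booleans, one entry per probe job in
    the window around the transient. Uses Fisher's exact test on the 2x2 table.
    """
    a = b = c = d = 0
    for slow, contended in zip(slow_flags, contended_flags):
        if slow and contended:
            a += 1
        elif slow:
            b += 1
        elif contended:
            c += 1
        else:
            d += 1
    _, p = stats.fisher_exact([[a, b], [c, d]])
    return p

def aggregate_evidence(per_transient_pvalues):
    """Combine per-transient p-values into one aggregate p-value (Fisher's method)."""
    _, p_combined = stats.combine_pvalues(per_transient_pvalues, method="fisher")
    return p_combined
```

One observed coincidence proves little on its own, but aggregating the evidence across every transient is what lets a candidate cause clear (or miss) the p < 10⁻⁵ bar quoted on the next slide.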

  25. Sources of transient variation in practice • #1 source is resource contention • Other factors implicated but too rare to meet p < 10⁻⁵ • 16% of anomalies defy classification - 25 -

  26. Overall findings • Baseline performance and variability change over time – Patches & updates – Sustained bandwidth contention from scientific campaigns • Partitioning performance in time yields more insight – Can classify short-term and transient variation – Quantifies effects of contention and suggests avenues for system architecture optimization • We can learn things from other fields of study - 26 -

  27. Try this at home! Reproducibility (code + year-long dataset): https://www.nersc.gov/research-and-development/tokio/a-year-in-the-life-of-a-parallel-file-system/ (or see the paper appendix) pytokio framework: https://github.com/nersc/pytokio This material is based upon work supported by the U.S. Department of Energy, Office of Science, under contracts DE-AC02-05CH11231 and DE-AC02-06CH11357. This research used resources and data generated from resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, and the Argonne Leadership Computing Facility, a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357. - 27 -
