Parallel I/O Performance: From Events to Ensembles

Parallel I/O Performance: From Events to Ensembles
Andrew Uselton, National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory
In collaboration with:
• Lenny Oliker
• David Skinner
• Mark Howison
• Nick Wright
• Noel Keen
• John Shalf
• Karen Karavanic

Parallel I/O Evaluation and Analysis
• Explosion of sensor & simulation data makes I/O a critical component
• Petascale I/O requires new techniques: analysis, visualization, diagnosis
• Statistical methods can be revealing
• Present case studies and optimization results for:
  • MADbench: a cosmology application
  • GCRM: a climate simulation

IPM-I/O is an interposition library that wraps I/O calls with tracing instructions.
[Diagram labels: job, input, Read I/O, Barrier, Write I/O, output, IPM-I/O, job trace]
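The idea of wrapping I/O calls with tracing instructions can be sketched in a few lines. This is only an analogy in Python, assuming a hypothetical in-memory `TRACE` list and a hypothetical `traced` decorator; the slides do not show how IPM-I/O itself intercepts calls or what its trace records contain.

```python
import functools
import io
import time

# Hypothetical in-memory event log; the real IPM-I/O trace format is not shown in the slides.
TRACE = []

def traced(op_name, task_id=0):
    """Wrap an I/O callable so that every call appends one event to TRACE."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            end = time.perf_counter()
            # write() returns a byte count, read() returns the data itself.
            nbytes = result if isinstance(result, int) else len(result)
            TRACE.append({"task": task_id, "op": op_name, "start": start,
                          "duration": end - start, "bytes": nbytes})
            return result
        return wrapper
    return decorate

# Interpose on an in-memory file so the sketch needs no real file system.
buf = io.BytesIO()
write = traced("write")(buf.write)
read = traced("read")(buf.read)

write(b"x" * 4096)
buf.seek(0)
data = read(4096)

for event in TRACE:
    print(event["op"], event["bytes"], f"{event['duration']:.6f} s")
```

The application code calls `write` and `read` as usual; the wrapper records one event per call without changing the result.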

Events to Ensembles
• The details of a trace can obscure as much as they reveal, and it does not scale
• Statistical methods reveal what the trace obscures, and they do scale
[Figures: trace view of Task 0 through Task 10,000 against wall clock time; histogram of counts]
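A minimal sketch of moving from events to an ensemble: collapse one duration per task into a histogram whose size depends on the number of bins, not the number of tasks. The durations here are synthetic (drawn from an exponential distribution with a fixed seed) and the `histogram` helper is hypothetical; nothing below is data from the talk.

```python
import random
from collections import Counter

random.seed(0)  # synthetic data only, not measurements from the talk

# One hypothetical event per task: (task_id, duration_seconds).
events = [(task, random.expovariate(1.0)) for task in range(10_000)]

def histogram(durations, bin_width):
    """Count durations per bin, keyed by the left edge of each bin."""
    counts = Counter(int(d // bin_width) for d in durations)
    return {b * bin_width: n for b, n in sorted(counts.items())}

hist = histogram([d for _, d in events], bin_width=0.5)
for left_edge, count in hist.items():
    print(f"{left_edge:5.1f} s  {count}")
```

Ten thousand trace rows reduce to a few dozen bins, which is the sense in which the statistical view scales where the trace view does not.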

Case Study #1
• MADCAP analyzes the Cosmic Microwave Background radiation.
• MADbench is an out-of-core matrix solver that writes and reads all of memory multiple times.

CMB Data Analysis
• time domain: O(10^12)
• pixel sky map: O(10^8)
• angular power spectrum: O(10^4)

MADbench Overview
• MADCAP is the maximum likelihood CMB angular power spectrum estimation code
• MADbench is a lightweight version of MADCAP
• Out-of-core calculation due to large size and number of pix-pix matrices

Computational Structure
• I. Compute, Write (Loop)
• II. Compute/Communicate (no I/O)
• III. Read, Compute, Write (Loop)
• IV. Read, Compute/Communicate (Loop)
The compute intensity can be tuned down to emphasize I/O.
[Figure: task vs. wall clock time]

MADbench I/O Optimization
Phase II. Read # 4 5 6 7 8
[Figure: task vs. wall clock time]

MADbench I/O Optimization
[Figure: histogram of count vs. duration (seconds)]

MADbench I/O Optimization
A statistical approach revealed a systematic pattern.
[Figure: cumulative probability vs. duration (seconds)]
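A cumulative probability curve of this kind can be computed directly from the event durations. The sketch below assumes a hypothetical `ecdf` helper and a synthetic two-mode set of durations (90 fast reads, 10 slow ones); the values are illustrative and are not the MADbench measurements.

```python
import bisect

def ecdf(durations):
    """Return a function F where F(x) is the fraction of durations <= x."""
    ordered = sorted(durations)
    n = len(ordered)
    return lambda x: bisect.bisect_right(ordered, x) / n

# Synthetic durations in seconds: 90 fast reads and 10 slow ones. Illustrative only.
durations = [1.0 + 0.01 * i for i in range(90)] + [30.0 + i for i in range(10)]
F = ecdf(durations)

print(F(2.0))   # fraction of reads finishing within 2 s
print(F(29.0))  # same value across the gap: a plateau in the curve
print(F(40.0))  # all reads have finished
```

A plateau in the curve separates two populations of events, which is the kind of systematic pattern that is hard to see in a raw trace.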

MADbench I/O Optimization
Lustre patch eliminated slow reads.
[Figure: Before and After, process # vs. time]

Case Study #2
• Global Cloud Resolving Model (GCRM), developed by scientists at CSU
• Runs at resolutions fine enough to simulate cloud formation and dynamics
• Mark Howison's analysis fixed its I/O problems

GCRM I/O Optimization
At 4km resolution GCRM is dealing with a lot of data. The goal is to work at 1km and 40k tasks, which will require 16x as much data.
[Figure: Task 0 through Task 10,000 vs. wall clock time, with the desired checkpoint time marked]
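The 16x figure is consistent with simple grid arithmetic, under the assumption (not stated on the slide) that data volume scales with the number of horizontal grid cells and that the vertical levels and variable count stay fixed. The `data_scale` helper is hypothetical.

```python
def data_scale(old_km, new_km, horizontal_dims=2):
    """Growth in cell count when the grid spacing shrinks from old_km to new_km."""
    return (old_km / new_km) ** horizontal_dims

# Going from 4 km to 1 km is 4x finer in each of two horizontal dimensions.
print(data_scale(4, 1))
```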

GCRM I/O Optimization
• Worst case 20 sec
• Insight: all 10,000 are happening at once

GCRM I/O Optimization
• Worst case 3 sec
• Collective buffering reduces concurrency
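How collective buffering reduces concurrency can be illustrated with a toy model: instead of every task writing its own buffer, contiguous groups of tasks hand their data to a smaller set of aggregators, and only the aggregators touch the file system. This is a pure-Python simulation with no MPI involved; the `collective_buffering` helper and the aggregator count of 80 are assumptions for illustration, not values from the slides.

```python
def collective_buffering(task_buffers, num_aggregators):
    """Gather contiguous groups of task buffers into one buffer per aggregator."""
    n = len(task_buffers)
    group = -(-n // num_aggregators)  # ceiling division: tasks per aggregator
    return [b"".join(task_buffers[i:i + group]) for i in range(0, n, group)]

# 10,000 tasks, each holding a small hypothetical 8-byte buffer.
task_buffers = [bytes([t % 256]) * 8 for t in range(10_000)]

aggregated = collective_buffering(task_buffers, num_aggregators=80)

print("writers without aggregation:", len(task_buffers))
print("writers with aggregation:   ", len(aggregated))
print("bytes per aggregated write: ", len(aggregated[0]))
```

The same bytes reach the file in the same order, but through far fewer concurrent writers, each issuing a larger request.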

GCRM I/O Optimization
[Figure: Before and After, with the desired checkpoint time marked]

GCRM I/O Optimization
• Aligned I/O
• Worst case 1 sec
• Insight: Still need better worst case behavior
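Aligned I/O means that each write starts on a boundary the file system cares about, such as a stripe boundary, so that a single write does not straddle more storage targets than necessary. A minimal sketch of the offset arithmetic follows; the 1 MiB stripe size, the record sizes, and the helper names are assumptions, since the slides do not give the actual values.

```python
def align_up(offset, boundary):
    """Round offset up to the next multiple of boundary."""
    return -(-offset // boundary) * boundary

def aligned_layout(sizes, boundary):
    """Place records back to back, padding so each one starts on a boundary."""
    offsets = []
    cursor = 0
    for size in sizes:
        cursor = align_up(cursor, boundary)
        offsets.append(cursor)
        cursor += size
    return offsets

STRIPE = 1 << 20  # assumed 1 MiB stripe size, not taken from the slides
sizes = [700_000, 1_500_000, 1_048_576]  # hypothetical record sizes in bytes

offsets = aligned_layout(sizes, STRIPE)
print(offsets)
```

The cost is some padding between records; the benefit is that every record begins exactly at a stripe boundary.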

GCRM I/O Optimization
[Figure: Before and After, with the desired checkpoint time marked]

GCRM I/O Optimization
• Sometimes the trace view is the right way to look at it.
• Metadata is being serialized through task 0.

GCRM I/O Optimization
• Defer metadata ops so there are fewer and they are larger
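Deferring metadata operations can be sketched as a small buffer that collects updates and issues them as one larger operation. The `DeferredMetadata` class, the key names, and the `sink` callable are hypothetical; the slides do not show how the GCRM I/O layer implements this.

```python
class DeferredMetadata:
    """Collect metadata updates and issue them as one batched operation."""

    def __init__(self, sink):
        self.sink = sink    # callable that performs one metadata operation
        self.pending = {}

    def set(self, key, value):
        # No I/O happens here; the update is only recorded.
        self.pending[key] = value

    def flush(self):
        # One larger operation instead of many small ones.
        if self.pending:
            self.sink(dict(self.pending))
            self.pending.clear()

ops = []  # stands in for the file system: each entry is one metadata operation
meta = DeferredMetadata(ops.append)

for step in range(100):
    meta.set(f"var{step}/units", "K")
meta.flush()

print("metadata operations issued:", len(ops))
print("attributes in the batch:   ", len(ops[0]))
```

If the metadata path is serialized through a single task, cutting the number of operations shortens the serialized section directly.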

GCRM I/O Optimization
[Figure: Before and After, with the desired checkpoint time marked]

Conclusions and Future Work
• Traces do not scale and can obscure underlying features
• Statistical methods scale and give useful diagnostic insights into large datasets
• Future work: gather statistical info directly in IPM
• Future work: automatic recognition of model and moments within IPM

Acknowledgements
• Julian Borrill wrote MADCAP/MADbench
• Mark Howison performed the GCRM optimizations
• Noel Keen wrote the I/O extensions for IPM
• Kitrick Sheets (Cray) and Tom Wang (SUN/Oracle) assisted with the diagnosis of the Lustre bug
• This work was funded in part by the DOE Office of Advanced Scientific Computing Research (ASCR) under contract number DE-AC02-05CH11231
