Towards system-scale optimisation of HPC applications
TADaaM : Topology-Aware System-Scale Data Management for High-Performance Computing Applications
Emmanuel Jeannot October 2016
INTRODUCTION
Optimize application execution at system-scale
[Figure: Topology, Applications, Data]
Tadaam, october 2016 Emmanuel Jeannot - 2
Multithreaded application
Software layers: scientific app, multithreaded runtime system, multithreaded computation library
Examples: MPI (progress threads), OpenMP, parallel BLAS
Hardware: multicore + parallel
Problem: each thread ignores the existence of the other threads! Mapping? Priority? Scheduling?
[Figure: First-year accumulated Curie utilization. Axes: job size, node hours. Series: cumulated node hours]
BW median case: 2048 nodes
Curie median case (install time): 256 nodes
Problem: message transfer is not aware of other applications! Contention, routing, message scheduling
Cf.: "Demonstrating Improved Application Performance Using Dynamic Monitoring and Task Mapping", A. Gentile, J. Brandt, K. Devine, K. Pedretti
[Figure: Topology, Applications, Data, ?]
Problematic:
Stateful system-wide service layer
[Diagram labels: hardware; application; memory hierarchy; cache size; network topology; allocated resources; other applications; storage; application needs; application a; application b; programming model]
[Diagram labels: SW stack; applications; programming models; libraries; compilers; runtime systems; operating systems; batch scheduler; hardware; storage; network]
Environment model; application need and model; optimization algorithm; optimized execution
Stateful system-wide service layer
[Diagram labels: hardware; memory hierarchy; cache size; network topology; allocated resources; other applications; storage; application a; application b]
Mesh/graph partitioning (Scotch)
Platform model (hwloc)
Topology-aware locality mechanisms (TreeMatch)
Parallel mesh adaptation (PaMPA)
Communication optimization (NewMadeleine)
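To make the idea of topology-aware locality concrete, here is a minimal sketch of process placement driven by a communication matrix. It is not TreeMatch's algorithm: the two-level machine model (cores grouped into nodes), the hop costs, the function names, and the sample matrix are all assumptions made for illustration, and the exhaustive search only works for a handful of processes.

```python
from itertools import permutations


def hop_distance(core_a, core_b, cores_per_node):
    """Assumed cost model: 0 on the same core, 1 inside a node, 2 across nodes."""
    if core_a == core_b:
        return 0
    if core_a // cores_per_node == core_b // cores_per_node:
        return 1
    return 2


def mapping_cost(comm, mapping, cores_per_node):
    """Sum of (communication volume x hop distance) over all process pairs.

    comm[i][j] is the volume exchanged between processes i and j (symmetric);
    mapping[i] is the core that process i is placed on.
    """
    n = len(comm)
    return sum(
        comm[i][j] * hop_distance(mapping[i], mapping[j], cores_per_node)
        for i in range(n)
        for j in range(i + 1, n)
    )


def best_mapping(comm, cores_per_node):
    """Brute-force search over all placements (illustration only, O(n!))."""
    n = len(comm)
    return min(
        permutations(range(n)),
        key=lambda m: mapping_cost(comm, m, cores_per_node),
    )


# Hypothetical workload: processes 0 and 2 talk a lot, as do 1 and 3.
comm = [
    [0, 1, 10, 1],
    [1, 0, 1, 10],
    [10, 1, 0, 1],
    [1, 10, 1, 0],
]
identity = (0, 1, 2, 3)  # rank i on core i, ignoring the topology
optimized = best_mapping(comm, cores_per_node=2)

print("identity cost:", mapping_cost(comm, identity, 2))
print("optimized cost:", mapping_cost(comm, optimized, 2))
```

With two cores per node, the default rank order puts each heavily communicating pair on different nodes, while the searched placement co-locates them. A practical tool would replace the brute-force search with a heuristic and obtain the machine model from a platform description rather than a fixed `cores_per_node`.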
Emmanuel Jeannot, senior research scientist (DR2), Inria, team leader
Guillaume Aupy, research scientist (CR2), Inria
Alexandre Denis, experienced research scientist (CR1), Inria
Brice Goglin, experienced research scientist (CR1), Inria
Guillaume Mercier, assistant professor, Bordeaux Institute of Technology
François Pellegrini, professor, University of Bordeaux
Raphaël Blanchard, PhD student, CIFRE, Onera
Cyril Bordage, postdoc, COLOC, Inria
Remi Barat, PhD student, CIFRE, CEA
Nicolas Denoyelle, research engineer, COLOC, Inria
Clément Foyer, engineer, ELCI, Inria
Cédric Lachat, postdoc, ELCI, Inria
Benjamin Lorendeau, PhD student, CIFRE, EDF
Farouk Mansouri, postdoc, Inria
Adèle Villiermet, PhD student, COLOC, Inria
Hugo Taboada, PhD student, CEA
Cécile Boutors, team assistant
Inria Bordeaux Sud-Ouest