Introduction Resource awareness HPX RA in HPX Example Summary
How runtime systems can support resource awareness in HPC: the HPX case
Tommaso Bianucci
Technische Universität München
22 June 2018
Tommaso Bianucci Technische Universität München
◮ 1 ExaFLOPS = 10^18 FLOPS
◮ Billions of cores?
◮ Heterogeneous hardware
  ◮ Manycore CPUs
  ◮ GPUs
  ◮ FPGAs
→ These machines expose an extreme degree of parallelism.
◮ Scaling-impaired applications
◮ Unbalanced execution tree
This causes:
◮ Poor parallel performance
◮ Suboptimal resource usage
→ Some applications do not scale well.
Current predominant model in HPC:
◮ Fork-join for shared memory (OpenMP)
◮ Communicating Sequential Processes for distributed memory (MPI)
Problems:
◮ Global barriers
◮ Load imbalance
P. A. Grubel: "Dynamic adaptation in HPX, a task-based parallel runtime system", 2016.
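The load-imbalance problem can be illustrated with a small cost model. The sketch below is not taken from the talk: the function names and the task costs are hypothetical, and it simulates schedules instead of running threads. It compares a fork-join style static split, where every worker waits at the closing barrier for the slowest block, with a greedy dynamic assignment in which each task goes to the currently least-loaded worker.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical model: each task has a fixed cost in arbitrary time units.

// Fork-join with static partitioning: contiguous blocks, one per worker.
// Because of the barrier at the end, the region lasts as long as the
// most heavily loaded worker.
long static_makespan(const std::vector<long>& costs, std::size_t workers) {
    if (costs.empty() || workers == 0) return 0;
    const std::size_t chunk = (costs.size() + workers - 1) / workers;
    std::vector<long> load(workers, 0);
    for (std::size_t i = 0; i < costs.size(); ++i)
        load[i / chunk] += costs[i];
    return *std::max_element(load.begin(), load.end());
}

// Task-based scheduling, approximated greedily: every task is handed to
// the worker with the least accumulated work so far.
long dynamic_makespan(const std::vector<long>& costs, std::size_t workers) {
    if (costs.empty() || workers == 0) return 0;
    std::vector<long> load(workers, 0);
    for (long c : costs)
        *std::min_element(load.begin(), load.end()) += c;
    return *std::max_element(load.begin(), load.end());
}
```

With costs {8, 8, 8, 8, 1, 1, 1, 1} and two workers, the static split gives one worker 32 units and the other 4, so the barrier is reached after 32 units; the greedy assignment finishes after 18.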
Resource awareness
◮ Adaptive allocation and usage of resources
◮ The system is aware of its own resources
◮ At runtime vs. before execution
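As a minimal illustration of deciding at runtime instead of before execution, the sketch below sizes a worker pool from what the machine reports when the program runs. It uses only the C++ standard library; the helper name `choose_worker_count` and the sizing policy are assumptions made for this example, not part of HPX or of the talk.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>

// Hypothetical helper: choose how many workers to start, based on the
// hardware threads visible at runtime and the amount of pending work.
std::size_t choose_worker_count(std::size_t pending_tasks) {
    // hardware_concurrency() is only a hint and may return 0 if unknown.
    const unsigned hw = std::thread::hardware_concurrency();
    const std::size_t available = (hw == 0) ? 1 : hw;
    // Never start more workers than tasks, and always start at least one.
    return std::max<std::size_t>(1, std::min(available, pending_tasks));
}
```

A fully resource-aware system would repeat such a decision during execution and cover more than core counts; this only shows the basic query-then-adapt step.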
◮ Computational units
◮ Memory
◮ Bus/network bandwidth
◮ I/O devices
◮ Power
◮ Thermal
◮ Buffers
◮ Queues
E.g.: Invasive computing on MPSoC
E.g.: load balancing, task scheduling
E.g.: Invasive MPI + job scheduler integration
C++ runtime system for
◮ Task-based parallelism
◮ Shared memory + Distributed memory parallelization
◮ Fine-grained parallelism
◮ Asynchronous scheduling and execution
◮ Lightweight synchronization
◮ Active Global Address Space (AGAS)
◮ Performance monitoring framework
standard library for parallelism and concurrency” 2017.
Capabilities
→ Work stealing + NUMA-awareness
→ Dynamic relocation of objects
→ Directly addressing HW accelerators
→ Easier integration into applications
Limitations
→ Worker threads and localities cannot be changed at runtime
→ E.g. no DVFS support
→ No built-in facility
\[
z_{c,0} = 0, \qquad z_{c,n+1} = z_{c,n}^2 + c
\]
\[
M = \{\, c \in \mathbb{C} : \lim_{n \to \infty} |z_{c,n}| < +\infty \,\}
\]
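The limit in this definition cannot be evaluated directly, so membership is usually approximated by iterating a finite number of times. The sketch below relies on the standard fact that once |z| exceeds 2 the sequence is guaranteed to diverge; the function name `escape_iterations` and the iteration budget `max_iter` are assumptions chosen for this illustration.

```cpp
#include <complex>

// Iterates z <- z^2 + c starting from z = 0 and returns how many
// iterations were performed before |z| exceeded 2, capped at max_iter.
// A return value of max_iter means the point is treated as belonging
// to the set for this iteration budget.
int escape_iterations(std::complex<double> c, int max_iter) {
    std::complex<double> z{0.0, 0.0};
    int k = 0;
    while (k < max_iter && std::abs(z) <= 2.0) {
        z = z * z + c;
        ++k;
    }
    return k;
}
```

For c = 1 the sequence is 0, 1, 2, 5, so the function returns 3; for c = 0, c = -1 and c = -2 the sequence stays bounded and the function returns `max_iter`.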
#include <complex.h>
#include <string.h>

void mandelbrot_kernel(int taskNo, staticData *sd) {
    // Computes one row of the image
    int i = taskNo;
    for (int j = 0; j < sd->xRes; ++j) {
        // Pixel (i, j) mapped to a point of the complex plane
        double x = getX(j, sd);
        double y = getY(i, sd);
        complex double Z = 0 + 0 * I;
        complex double C = x + y * I;

        int k = 0;
        do { // Check the convergence of the sequence
            Z = Z * Z + C;
            k++;
        } while (cabs(Z) < 2 && k < max_iter);

        if (k == max_iter) { // In case it did not diverge...
            memcpy(img[i][j], black, 3); // ...we set a black pixel
        }
        else { // If it diverged...
            // ...we set the color according to k (# iterations)
            memcpy(img[i][j], getColor(k), 3);
        }
    }
}
void mandelbrot_seq(...) {

    staticData *sd = assembleStaticData(...);

    // Iterate over the rows sequentially
    for (int i = 0; i < yRes; ++i)
    {
        mandelbrot_kernel(i, sd);
    }
}
#include <hpx/hpx.hpp>
#include <utility>
#include <vector>

void mandelbrot_hpx(...) {

    staticData *sd = assembleStaticData(...);

    std::vector<hpx::future<void>> futures;
    for (int i = 0; i < yRes; ++i)
    {
        // One asynchronous task per image row
        hpx::future<void> f = hpx::async(&mandelbrot_kernel, i, sd);
        futures.push_back(std::move(f)); // hpx::future is move-only
    }

    hpx::wait_all(futures); // Block until every row has been computed
}
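The same one-task-per-row pattern can be tried without an HPX installation, because `hpx::async` and `hpx::future` follow the interface of their standard-library counterparts. The sketch below is a stand-in, not the code from the talk: it uses `std::async`, which starts a new OS thread per task instead of HPX's lightweight threads, and it counts bounded points instead of writing pixels. The function names, the pixel-to-plane mapping and the region [-2, 1] x [-1, 1] are assumptions made for this example.

```cpp
#include <complex>
#include <future>
#include <vector>

// Hypothetical row kernel: counts the points of one image row whose
// sequence stays within |z| <= 2 for max_iter iterations.
int bounded_points_in_row(int row, int x_res, int y_res, int max_iter) {
    int count = 0;
    const double y = -1.0 + 2.0 * (row + 0.5) / y_res;  // pixel centre
    for (int j = 0; j < x_res; ++j) {
        const double x = -2.0 + 3.0 * (j + 0.5) / x_res;
        const std::complex<double> c{x, y};
        std::complex<double> z{0.0, 0.0};
        int k = 0;
        while (k < max_iter && std::abs(z) <= 2.0) {
            z = z * z + c;
            ++k;
        }
        if (std::abs(z) <= 2.0) ++count;  // never left the radius-2 disk
    }
    return count;
}

// Sequential reference: one row after the other.
int count_sequential(int x_res, int y_res, int max_iter) {
    int total = 0;
    for (int i = 0; i < y_res; ++i)
        total += bounded_points_in_row(i, x_res, y_res, max_iter);
    return total;
}

// Task-based version: one asynchronous task per row, then collect.
int count_parallel(int x_res, int y_res, int max_iter) {
    std::vector<std::future<int>> futures;
    futures.reserve(y_res);
    for (int i = 0; i < y_res; ++i) {
        futures.push_back(std::async(std::launch::async,
                                     bounded_points_in_row,
                                     i, x_res, y_res, max_iter));
    }
    int total = 0;
    for (auto& f : futures)
        total += f.get();  // waits for the row if it is not finished yet
    return total;
}
```

Both functions return the same count for the same arguments, since the rows are independent and each task only reads its own inputs. One thread per row is tolerable for a small image; for thousands of rows a runtime with lightweight tasks is the more appropriate tool.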
◮ Future exascale computing requires smart code.
◮ Resource awareness can be a way to achieve better performance.
◮ HPX has the potential to become a major runtime system for HPC, thanks to both its performance and programmability.