Charm++ as an Energy Efficient Runtime
BILGE ACUN - CHARM++ WORKSHOP 2017 4/18/17
Interaction Between the Runtime System and the Resource Manager
- Allows dynamic interaction between the system resource manager or scheduler and the job runtime system
- Meets system-level constraints such as power caps and hardware configurations
- Achieves the objectives of both datacenter users and system administrators
Charm++ has three main components:
such as object loads, CPU temperatures
decisions and redistributes load
CPU temperatures remain below the temperature threshold, change the power cap
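The last bullet fragment above describes a control rule: keep CPU temperatures below a threshold by changing the power cap. A minimal sketch of such a rule follows, written in Python rather than against any Charm++ API. The function name, step size, margin, and cap limits are all hypothetical and are not taken from the slides.

```python
def adjust_power_cap(core_temps_c, threshold_c, cap_w,
                     step_w=5.0, margin_c=5.0,
                     min_cap_w=50.0, max_cap_w=150.0):
    """Return a new power cap in watts from the current core temperatures.

    Hypothetical rule (an assumption, not the talk's algorithm):
    - lower the cap when the hottest core reaches the threshold,
    - raise it when every core is at least `margin_c` below the threshold,
    - otherwise leave it unchanged.
    The result is clamped to [min_cap_w, max_cap_w].
    """
    hottest = max(core_temps_c)
    if hottest >= threshold_c:
        new_cap = cap_w - step_w
    elif hottest < threshold_c - margin_c:
        new_cap = cap_w + step_w
    else:
        new_cap = cap_w
    return min(max_cap_w, max(min_cap_w, new_cap))
```

Calling `adjust_power_cap([60, 72], threshold_c=70, cap_w=100)` lowers the cap by one step because one core is above the threshold. A real runtime would apply the returned value through whatever power-capping interface the platform provides.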
BILGE ACUN 1, EUN KYUNG LEE 1, YOONHO PARK 1, LAXMIKANT V.
1 IBM T.J. WATSON RESEARCH CENTER
2 UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
have different temperature, cooling characteristics
even further temperature variation
with a regular workload (figure annotations: 7 °C, 20 °C)
[Figure: Temperature distribution of 1800 cores. Panels: Cori at NERSC (Intel Haswell); Minsky at IBM (POWER8)]
[Figure: CPU utilization levels of 30%, 10%, 60%, and 99%; annotation: workload starts]
an exponential modeling space
- (10^10) * 44 * (10^4) ≈ 2^52
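The size estimate above can be checked with a few lines of arithmetic. The surviving slide text does not say what each factor stands for, so this sketch only confirms that the product is close to 2^52.

```python
import math

# Product quoted on the slide: (10^10) * 44 * (10^4)
configurations = (10**10) * 44 * (10**4)

# Number of bits needed to index that many configurations
bits = math.log2(configurations)

print(configurations)  # 4400000000000000
print(round(bits, 2))  # just under 52
```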
[Figure labels: Ambient, Fan, Core, Core]
input and output parameters
Experimental Setup:
POWER8 processors
cores, 160 SMT cores
temperature, power readings
[Figure: neural network model pipeline. Stages: Raw Data, Pre-Processing, Training, Deployment (Training Phase, Deployment Phase). Parameter labels: Core Utilizations, Fan Speeds, Ambient Temperature, Core Frequencies, Chip Power, Core Temperatures (Prediction)]
[Figure: Mean Absolute Error [°C] vs. Number of Samples used for Training. Series: Levenberg-Marquardt, Scaled conjugate gradient, Resilient]
[Figure: Mean Absolute Error [°C] vs. Core number. Legend: Median, 25%-75%, 9%-91%]
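Both plots report mean absolute error in °C, that is, the average absolute gap between predicted and measured core temperatures. A minimal sketch of that metric follows; the function name and the sample values are made up for illustration and are not data from the talk.

```python
def mean_absolute_error(predicted_c, measured_c):
    """Average of |prediction - measurement| over all samples, in deg C."""
    if len(predicted_c) != len(measured_c) or not predicted_c:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(p - m) for p, m in zip(predicted_c, measured_c))
    return total / len(predicted_c)


# Illustrative values only (hypothetical, not measurements from the talk).
predicted = [61.0, 63.5, 58.0, 70.0]
measured = [60.0, 64.0, 59.5, 70.0]
mae = mean_absolute_error(predicted, measured)
print(mae)  # 0.75
```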
core to another?
workload?
- Preemptive fan control removes temperature peaks and keeps the temperature at the same level as reactive fan control.
- The key idea is to cool the processor proactively, for example, before the application starts.
- It can be done via the job scheduler and/or the runtime, without taking total control of the fan.
Power Reduction = Maximum Power – Stable Power
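As a sketch of how the formula above could be applied to a sampled fan power trace: the slide does not define how stable power is measured, so this example assumes it is the mean of the last few samples. The function name, the window size, and the trace values are hypothetical.

```python
def power_reduction(fan_power_w, stable_window=5):
    """Maximum power minus stable power for one fan power trace, in watts.

    Assumption (not from the slides): 'stable power' is the mean of the
    last `stable_window` samples of the trace.
    """
    if stable_window < 1 or len(fan_power_w) < stable_window:
        raise ValueError("trace is shorter than the stable window")
    max_power = max(fan_power_w)
    tail = fan_power_w[-stable_window:]
    stable_power = sum(tail) / len(tail)
    return max_power - stable_power


# Illustrative trace in watts (made up): a peak of 90 W settling at 50 W.
trace = [20.0, 45.0, 90.0, 80.0, 60.0, 50.0, 50.0, 50.0, 50.0, 50.0]
reduction = power_reduction(trace)
print(reduction)  # 40.0
```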
35% reduction in fan power
[Figure panels: BEFORE, AFTER]
18% reduction in fan power
53% reduction in fan power on average
customizable periods
guided load balancing algorithm.
chip and core level variations.
Conference for High Performance Computing, Networking, Storage and Analysis, pages 647-658. IEEE, 2014.
How early to set the cooling speed?
[Figure annotation: Workload Starts]
- Peak fan power can be reduced by 54 W, a 58% reduction in cooling power.
- 2790 J of energy is saved (red area minus black area in the figure).
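The energy figure is described as a difference between two areas, which corresponds to integrating two fan power curves over time and subtracting. A minimal sketch using the trapezoidal rule follows. Which curve is red and which is black cannot be recovered from the text, so the two traces here are simply labeled as hypothetical reactive and preemptive runs, and all values are made up.

```python
def energy_joules(times_s, power_w):
    """Trapezoidal integral of power over time (watts * seconds = joules)."""
    if len(times_s) != len(power_w) or len(times_s) < 2:
        raise ValueError("need at least two matching time/power samples")
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


# Illustrative fan power traces (hypothetical values, not from the talk).
times = [0.0, 10.0, 20.0, 30.0]          # seconds
reactive = [30.0, 90.0, 90.0, 40.0]      # watts, fan reacts to a peak
preemptive = [40.0, 40.0, 40.0, 40.0]    # watts, fan set ahead of time

saved = energy_joules(times, reactive) - energy_joules(times, preemptive)
print(saved)  # 950.0
```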