Energy Simulation with SimGrid Millian Poquet - - PowerPoint PPT Presentation

SLIDE 1

Energy Simulation with SimGrid

Millian Poquet millian.poquet@inria.fr

Slides from SimGrid tutorials and F. C. Heinrich (Cluster’17)

SLIDE 2

Introduction Overview and Models Validation (CLUSTER’17) Conclusion

Chicken-and-egg Situation

How to save energy? Do costly experiments
∎ Typically: spend MJ to save a few %
∎ Classical issue in optimization...

Can we do more reasonable experiments?

SLIDE 3

Simulation to the rescue

The fastest path from idea to data.

Comfortable
∎ Thousands of runs within the week on your laptop
∎ Preliminary results from partial implementations
∎ Focus on ideas, don’t fiddle with technical subtleties (yet)

Challenges
∎ Validity: Realistic results (controlled experimental bias)
∎ Scalability: Simulate big enough problems fast enough
∎ Applicability: Should simulate what is important to users

SLIDE 4

Outline

1 Introduction
2 Overview and Models
3 Validation (CLUSTER’17)
4 Conclusion

SLIDE 5

SimGrid at a glance

∎ 18-year-old open-source project
∎ Collaboration: France (Inria, CNRS, Grenoble, Lyon, Rennes...), US (UCSD, Hawaii), UK, Austria (Vienna)...
∎ Papers: 500 cite it, 300 use it, 60 extend it
∎ LOC: ≈150k C/C++
∎ Initially focused on Grids; we argue that the same techniques can be used for P2P, HPC, Cloud...
∎ Goal: Usable tool with predictive capability
∎ Model Checking capabilities

SLIDE 6

Software Architecture

Essentially a library, architected like an OS.
∎ 1 system process (kernel + user code)
∎ Mutual exclusion on actors’ execution
∎ Maestro dictates who runs
∎ User code advances simulation time via syscalls

[Figure: SimGrid simulation process with Actors 0 to 3. Legend: simulation data, execution control (maestro), user-given user code (start, send, compute, end).]
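The OS-like execution scheme can be illustrated with a toy scheduler. This is only a sketch of the idea, not SimGrid code: `worker` and `maestro` are hypothetical names, Python generators stand in for actors, and each `yield` plays the role of a syscall that asks for simulated time.

```python
import heapq

def worker(durations):
    """Hypothetical user code: each `yield d` is a syscall asking for d simulated seconds."""
    for d in durations:
        yield d

def maestro(actors):
    """Run actors one at a time (mutual exclusion) and advance the simulated clock.

    actors: list of (name, generator). Returns (final clock, list of (time, name) wake-ups).
    """
    clock = 0.0
    queue = []  # entries are (wake-up time, tie-breaker, name, generator)
    for order, (name, gen) in enumerate(actors):
        heapq.heappush(queue, (0.0, order, name, gen))
    trace = []
    while queue:
        clock, order, name, gen = heapq.heappop(queue)
        trace.append((clock, name))
        try:
            delay = next(gen)  # only this actor runs, until its next syscall
        except StopIteration:
            continue           # the actor has finished
        heapq.heappush(queue, (clock + delay, order, name, gen))
    return clock, trace

end, trace = maestro([("A", worker([2.0, 1.0])), ("B", worker([0.5]))])
print(end, trace)
```

Simulated time only moves when the maestro pops the next wake-up, so user code between two yields costs no simulated time at all.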

SLIDE 7

Internals Organization

User-visible components
∎ S4U (MSG): general purpose
∎ SimDag: DAGs of parallel tasks (ptasks)
∎ SMPI: online/offline MPI

Internally: Strict layers
∎ S4U: User-friendly sugar
∎ SIMIX: Processes, synchro
∎ SURF: Resource usage
∎ Models: Action completion computation

[Figure: layers S4U, SIMIX, SURF, LMM (or others), from user code and processes down to conditions, actions (work remaining), and variables x1..xn under capacity constraints (≤ CP, ≤ CL1 ... ≤ CL4).]
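The bottom layer shares resource capacities between variables under constraints of the form x1 + x2 + ... ≤ C. A minimal sketch of max-min fair sharing by progressive filling is given below. It is a simplified illustration, not SimGrid's solver: the function name, the dictionary layout, and the example capacities are all assumptions, and every flow is assumed to use at least one resource.

```python
def max_min_share(capacities, routes):
    """Max-min fair rates by progressive filling.

    capacities: {resource: capacity}
    routes:     {flow: [resources it uses]} (each flow uses at least one resource)
    Returns {flow: rate}.
    """
    remaining = dict(capacities)
    active = set(routes)
    rates = {}
    while active:
        # Fair share each resource could still give to its unfixed flows.
        shares = {}
        for res, cap in remaining.items():
            users = [f for f in active if res in routes[f]]
            if users:
                shares[res] = cap / len(users)
        # The most constrained resource fixes the rate of all its flows.
        bottleneck = min(shares, key=shares.get)
        share = shares[bottleneck]
        for flow in [f for f in active if bottleneck in routes[f]]:
            rates[flow] = share
            active.remove(flow)
            for res in routes[flow]:
                remaining[res] -= share
    return rates

# Hypothetical platform: two links, three flows.
rates = max_min_share({"L1": 10.0, "L2": 4.0},
                      {"x1": ["L1"], "x2": ["L1", "L2"], "x3": ["L2"]})
print(rates)
```

Here L2 is the first bottleneck, so x2 and x3 are fixed first and x1 then takes what is left on L1.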

SLIDE 8

Network Models

Several are available:
∎ Fast flow-based, towards realism and speed (by default): contention, slow start, TCP congestion, cross-traffic effects
∎ Constant time: A bit faster, no hope for realism
∎ Coordinate-based: Easier to instantiate P2P scenarios
∎ Packet-level: NS3 bindings

SLIDE 9

DVFS and Energy Model

DVFS
∎ Modern CPUs can reduce computation speed to save energy
∎ Power states (pstates): Levels of performance. Governors pick them.
∎ SimGrid: Manually switch pstates, which changes the flop rate

Energy Model
∎ For one pstate, consumption = linear function of CPU use
∎ Classically accepted model in the literature, rarely challenged

SLIDE 10

Basic Energy Model Instantiation

<host id="MyHost2" speed="100.0Mf">
  <prop id="watt_per_state" value="100.0:200.0" />
  <prop id="watt_off" value="10" />
</host>

∎ watt_off: the host is off ⇒ 10 Watts
∎ watt_per_state: power consumption interval [min:max]
  ∎ Idling host ⇒ 100 Watts
  ∎ Fully loaded host (100.0Mf = 100 Mflops/s) ⇒ 200 Watts
  ∎ Linear model in between: CPU loaded at 50% ⇒ 150 Watts
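The linear interpolation between the two watt_per_state values can be checked with a few lines of Python. This is a sketch of the arithmetic only: `host_power` is a hypothetical helper, not part of SimGrid.

```python
def host_power(load, watt_per_state="100.0:200.0"):
    """Power (Watts) of a switched-on host at a CPU load between 0.0 and 1.0.

    watt_per_state uses the "min:max" format of the platform file above.
    """
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0 and 1")
    idle, full = (float(w) for w in watt_per_state.split(":"))
    return idle + load * (full - idle)

for load in (0.0, 0.5, 1.0):
    print(load, host_power(load))
```

A host that is switched off is not covered by this formula: it draws the separate watt_off value.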

SLIDE 11

DVFS Energy Model Instantiation

<host id="MyHost1" speed="100.0Mf,50.0Mf,20.0Mf" pstate="0">
  <prop id="watt_per_state" value="95.0:200.0, 93.0:170.0, 90.0:150.0" />
  <prop id="watt_off" value="10" />
</host>

∎ speed: 3 pstates {0,1,2}: 100, 50 and 20 Mflops/s
∎ pstate: Initial pstate (here, pstate=0, i.e. 100 Mflops/s)
∎ watt_per_state: two power values [min:max] per pstate, as before
  ∎ Here, CPU loaded at 50% in pstate 2 consumes 120 Watts (90 + 0.5 × (150 − 90)).
  ∎ Remember, pstates are numbered from 0! pstate 2 is 20 Mflops/s peak
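To see how pstates trade speed against power, the host description above can be turned into a small table. The helpers below (`parse_pstates`, `power`, `energy_for`) are hypothetical and only understand the `Mf` unit; the 1 Gflop workload is an illustrative assumption.

```python
def parse_pstates(speed, watt_per_state):
    """Return one (flops_per_second, idle_watts, full_watts) tuple per pstate."""
    table = []
    for s, w in zip(speed.split(","), watt_per_state.split(",")):
        s = s.strip()
        if not s.endswith("Mf"):
            raise ValueError("this sketch only understands the Mf unit")
        idle, full = (float(x) for x in w.split(":"))
        table.append((float(s[:-2]) * 1e6, idle, full))
    return table

def power(table, pstate, load):
    """Watts drawn in `pstate` at a CPU load between 0.0 and 1.0."""
    _, idle, full = table[pstate]
    return idle + load * (full - idle)

def energy_for(table, pstate, flops):
    """Joules needed to execute `flops` at full load in `pstate`."""
    flops_per_second, _, full = table[pstate]
    return full * flops / flops_per_second

table = parse_pstates("100.0Mf,50.0Mf,20.0Mf",
                      "95.0:200.0, 93.0:170.0, 90.0:150.0")
print(power(table, 2, 0.5))
print([energy_for(table, p, 1e9) for p in range(3)])
```

With these particular values the slower pstates draw less power but run so much longer that the same computation costs more energy, which is why the choice of pstate is worth simulating rather than guessing.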

SLIDE 12

ON/OFF Energy Model

ON ↔ OFF takes time (seconds) and energy (Joules). Many ways to do it
∎ Not easy for the noise: everybody wants something specific
∎ SimGrid provides basic mechanisms, you have to help yourself
∎ Switching on/off is instantaneous
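Since the cost of a transition is left to the user, one possible way to account for it is sketched below. Everything here is an assumption made for illustration: `gap_energy` is a hypothetical helper, the idle and off powers reuse the 100 W and 10 W of the earlier host example, and the transition time and energy are invented placeholders.

```python
def gap_energy(gap_s, idle_w, off_w, switch_s, switch_j):
    """Energy (Joules) spent during an idle gap, for two strategies.

    switch_s, switch_j: total time and energy of switching off and back on.
    Returns (stay_on, power_cycle); power_cycle is None if the gap is too short.
    """
    stay_on = idle_w * gap_s
    if gap_s < switch_s:
        return stay_on, None
    power_cycle = switch_j + off_w * (gap_s - switch_s)
    return stay_on, power_cycle

# Hypothetical transition cost: 160 s and 20 kJ for a full off/on cycle.
print(gap_energy(600, 100, 10, 160, 20000))
print(gap_energy(100, 100, 10, 160, 20000))
```

In a simulation, the same bookkeeping would be driven by user code that keeps the host unavailable for the transition time and adds the transition energy to the reported total.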

SLIDE 13

CLUSTER’17 paper

Heinrich, Cornebize, Degomme, Legrand, Carpen-Amarie, Hunold, Orgerie, Quinson: Predicting the Energy-Consumption of MPI Applications at Scale Using Only a Single Node.

Main goal: Validate performance and energy predictions

Quick overview:
1 Obtain a platform model
  ∎ How does MPI perform on this platform?
2 Run the application on one node, all cores
  ∎ Process interference (memory contention, L1-L3 caches)
  ∎ Measure the energy consumption
3 Run the application on one node, one core
  ∎ Measure the energy consumption
4 Feed measurements / platform model into simulator

SLIDE 14

MPI Simulation in SimGrid

SLIDE 15

Contribution 1: Problem

Energy Model should be application-dependent.

[Figure: Power (Watts) per workload (Idle, NAS-EP, NAS-LU, HPL), one panel per node. Taurus cluster, 13 nodes @ 2300 MHz.]

SLIDE 16

Contribution 1: Solution

Instantiate the energy model presented before!

[Figure: Power (Watts) against number of active cores (1, 4, 8, 12) for frequencies from 1200 to 2200 MHz, with the average idle consumption (Pidle) and Pstatic marked. Taurus cluster, Lyon, NAS-EP.]
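Instantiating the model for a given application amounts to turning power measurements into a watt_per_state value. A minimal sketch follows, assuming one list of idle samples and one list of fully loaded samples per pstate; the helper name and the sample values are hypothetical, not measurements from the paper.

```python
from statistics import mean

def watt_per_state(samples_by_pstate):
    """Build a watt_per_state value from power samples (Watts).

    samples_by_pstate: one (idle_samples, loaded_samples) pair per pstate,
    where loaded_samples were taken while running the application of interest.
    """
    parts = []
    for idle_samples, loaded_samples in samples_by_pstate:
        parts.append(f"{mean(idle_samples):.1f}:{mean(loaded_samples):.1f}")
    return ", ".join(parts)

# Hypothetical samples for two pstates.
value = watt_per_state([([94.0, 96.0], [199.0, 201.0]),
                        ([93.0], [170.0])])
print(value)
```

Because the loaded samples depend on the workload, a separate value would be produced for each application.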

SLIDE 17

Contribution 1: Outcome

[Figure: NAS-EP, run-time (in s) and energy (in kJ) against nodes x processes per node (1x12, 4x12, 8x12, 12x12). Legend: Reality, Simulation, Ideal scaling.]

SLIDE 18

Contribution 2: Problem

∎ Previous benchmark (NAS-EP) uses almost no communication. What about more complicated applications?
∎ NAS-LU uses collective communications and is memory bound
∎ Applications often contend, e.g., on L1 or L3 caches

[Figure: NAS-LU, run-time (in s) and energy (in kJ) for 1x12, 4x12, 8x12, 12x12. Legend: Reality, Simulation (uncorrected), Ideal scaling.]

SLIDE 19

Contribution 2: Solution

We unbias by computing speedup factors through trace alignment.
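One way such factors could be computed is sketched below. It assumes, for illustration only, that each trace is a list of (code region, duration) pairs in execution order and that aligning two traces means pairing entries at the same position; the region names and timings are hypothetical, and the paper's actual procedure may differ.

```python
from collections import defaultdict

def speedup_factors(one_core, all_cores):
    """Per-region ratio of all-cores time to one-core time.

    Both traces are lists of (region, seconds) in execution order.
    A factor above 1 means the region runs slower when all cores are busy.
    """
    if [r for r, _ in one_core] != [r for r, _ in all_cores]:
        raise ValueError("traces do not align: region sequences differ")
    solo = defaultdict(float)
    contended = defaultdict(float)
    for (region, t_one), (_, t_all) in zip(one_core, all_cores):
        solo[region] += t_one
        contended[region] += t_all
    return {region: contended[region] / solo[region] for region in solo}

# Hypothetical traces of the same rank, run alone and with all cores in use.
factors = speedup_factors(
    [("rhs", 1.0), ("blts", 2.0), ("rhs", 1.0)],
    [("rhs", 1.5), ("blts", 2.0), ("rhs", 1.5)],
)
print(factors)
```

The simulator can then scale the compute time of each region by its factor, which corrects the bias of timings taken on a single core.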

SLIDE 20

Contribution 2: Outcome

[Figure: NAS-LU, run-time (in s) and energy (in kJ) for 1x12, 4x12, 8x12, 12x12. Legend: Reality, Simulation (corrected), Simulation (uncorrected), Ideal scaling.]

SLIDE 21

Contribution 3: Problem

HPL is more complicated than this. Two main issues:

1 HPL sends many large messages from rank to rank.
  ⇒ Faster intra-node communications should be accounted for

[Figure: HPL, run-time (in s) for 1x12, 4x12, 8x12, 12x12. Legend: Reality, Simulation (loopback same as ethernet), Ideal scaling.]

SLIDE 22

Contribution 3: Problem (2/2)

2 HPL makes heavy use of MPI_Iprobe in order to run computations while waiting for data.
  ∎ But Iprobes do consume significant amounts of energy!
  ∎ We hence cannot ignore Iprobes!

[Figure: Power consumption (Watts) over time (seconds) on taurus-8.]

SLIDE 23

Contribution 3: Solution

1 Calibrate loopback usage by sending local messages
2 Iprobe issue is simple: Scale CPU usage while probing via parameter --cfg=smpi/iprobe-cpu-usage:0.61 (0.61 is the value used here)
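The effect of this scaling on the energy estimate can be sketched with the linear model from earlier. The helper below is hypothetical and reflects one plausible reading of the parameter, namely that time spent probing is charged at a fraction of full CPU load; the 100 W and 200 W figures reuse the basic host example and are not Taurus measurements.

```python
def iprobe_energy(duration_s, idle_w, full_w, cpu_usage=0.61):
    """Energy (Joules) charged for `duration_s` seconds spent in MPI_Iprobe calls.

    cpu_usage is the fraction of full load attributed to probing.
    """
    power_w = idle_w + cpu_usage * (full_w - idle_w)
    return power_w * duration_s

# 10 s of probing on a host drawing 100 W idle and 200 W fully loaded.
print(iprobe_energy(10, 100.0, 200.0))                 # probing counted
print(iprobe_energy(10, 100.0, 200.0, cpu_usage=0.0))  # probing treated as idle
```

Treating probing as idle time underestimates the energy, which is the gap the calibrated parameter is meant to close.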

SLIDE 24

Contribution 3: Outcome

[Figure: HPL, run-time (in s) against nodes x processes per node (1x12, 4x12, 8x12, 12x12). Legend: Reality, Simulation (calibrated), Simulation (loopback same as ethernet), Ideal scaling.]

[Figure: HPL, energy (in kJ) against nodes x processes per node. Legend: Reality, Simulation (w/ Iprobes), Simulation (w/o Iprobes).]

SLIDE 25

Validation Recap

[Figure: NAS-EP, NAS-LU and HPL side by side, each with run-time (in s) and energy (in kJ) against nodes x processes per node (1x12, 4x12, 8x12, 12x12). Legend: Reality, Simulation, Ideal scaling.]

SLIDE 26

Take-aways

SimGrid can be helpful to your research
∎ Versatile: Several communities (Scheduling, Grids, HPC, P2P, Clouds)
∎ Accurate: Model limits known thanks to validation studies
∎ Sound: Easy to use, extensible, fast to execute, scalable, well tested
∎ Open: LGPL; User-community much larger than contributors group
∎ Around for 18 years, ready for at least 18 more years

∎ Discover: http://simgrid.gforge.inria.fr/
∎ Learn: tutorials, user manual and examples
∎ Join: mailing list, #simgrid on irc.debian.org
