Entropy - Journée Thématique Emergente "aspects énergétiques du calcul"


SLIDE 1

Ecole des Mines de Nantes

Entropy

Fabien Hermenier, Adrien Lèbre, Jean Marc Menaud

Journée Thématique Emergente "aspects énergétiques du calcul" (Emerging Thematic Day on "energy aspects of computing")

menaud@mines-nantes.fr

Wednesday, 9 February 2011

SLIDE 2

Fabien Hermenier, Adrien Lèbre, J.M. Menaud - January 2011 - Ascola

Outline

  • Motivation
  • Entropy project
  • Dynamic consolidation principle
  • Reconfiguration problem
  • Some results
  • Extension to HPC
  • Cluster Context Switch
  • Conclusion

SLIDE 3

Motivation

  • Data center / cluster environment
  • Resources are statically allocated to jobs
  • Resources are underused
  • Static allocation of resources vs. dynamic utilization
  • => Data centers are oversized

[Figure: breakdown of energy consumption for a data center with PUE = 2. Labels: air conditioning, servers, fan, memory, AC/DC, CPU, disk, idle, run. Values shown: 50 % / 50 %, 45 % / 55 %, 20 % / 80 %.]

SLIDE 4

Motivation

  • Dynamic consolidation
  • Each task of a job is embedded in a virtual machine (VM)
  • Resources are allocated according to needs
  • VMs are mixed so that they are hosted on a reduced number of nodes
  • VMs must always stay online
  • Unused servers can be turned off
  • VMs are remixed when necessary, without downtime
  • But remixing VMs takes time!
  • Packing the VMs implies several migrations
  • Some migrations have to be delayed in order to succeed
  • Temporary hosting is necessary
  • ... -> Performance degradation

Reactivity is a key factor


SLIDE 6

Dynamic consolidation

Entropy observes the current CPU, memory and network requirements of each VM and computes a globally optimized placement that satisfies all of these requirements while using a minimum number of hosts.

Entropy can be classified as an IaaS system.

SLIDE 7

Global Design

  • A configuration:
  • Each VM is assigned to a node.
  • Each VM requires a fixed amount of memory.
  • Each VM requires a variable amount of CPU.

(Simplification: VMs executing a computation are active and each requires its own CPU.)

  • => A configuration may be viable
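The notion of a viable configuration can be made concrete with a short check. The sketch below is only an illustration and rests on assumptions that the slide does not spell out: a configuration is taken to be viable when, on every node, the memory of the hosted VMs fits in the node's memory and each active VM gets its own CPU. The names `VM`, `Node` and `is_viable` are hypothetical and are not taken from the Entropy code base.

```python
from typing import NamedTuple


class VM(NamedTuple):
    mem: int       # fixed amount of memory required by the VM
    active: bool   # an active VM is assumed to need one CPU of its own


class Node(NamedTuple):
    mem: int   # memory capacity of the node
    cpus: int  # number of CPUs of the node


def is_viable(placement, vms, nodes):
    """Return True if every node can host the VMs assigned to it.

    placement maps a VM name to a node name (assumed data layout).
    """
    used_mem = {name: 0 for name in nodes}
    used_cpu = {name: 0 for name in nodes}
    for vm_name, node_name in placement.items():
        vm = vms[vm_name]
        used_mem[node_name] += vm.mem
        used_cpu[node_name] += 1 if vm.active else 0
    return all(
        used_mem[name] <= node.mem and used_cpu[name] <= node.cpus
        for name, node in nodes.items()
    )


vms = {"vm1": VM(2, True), "vm2": VM(1, True), "vm3": VM(1, False)}
nodes = {"n1": Node(mem=3, cpus=1), "n2": Node(mem=3, cpus=1)}

# Two active VMs on a single-CPU node: not viable under the assumption above.
crowded = {"vm1": "n1", "vm2": "n1", "vm3": "n2"}
# One active VM per node: viable.
spread = {"vm1": "n1", "vm2": "n2", "vm3": "n1"}
```

With these toy values, `is_viable(crowded, vms, nodes)` is false because `n1` would need two CPUs, while `is_viable(spread, vms, nodes)` is true.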

SLIDE 8

Using Live Migrations at Cluster Scale

  • The Virtual Machines Packing Problem (VMPP)
  • Compute the minimum number of nodes needed to obtain a viable configuration
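The slides do not say how Entropy solves the VMPP, so the following is not its method. As a rough illustration of the packing problem itself, here is a first-fit-decreasing heuristic, a classic bin-packing technique that gives a feasible packing but does not guarantee the minimum number of nodes. It assumes identical nodes and the same simplification as before (an active VM needs one CPU). The function name and the data layout are hypothetical.

```python
def first_fit_decreasing(vms, node_mem, node_cpus):
    """Pack VMs onto identical nodes, largest memory demand first.

    vms maps a VM name to a (memory, active) pair (assumed layout).
    Returns one list of VM names per node used.
    """
    nodes = []  # each entry: [free_mem, free_cpus, hosted_vm_names]
    by_mem = sorted(vms.items(), key=lambda item: item[1][0], reverse=True)
    for name, (mem, active) in by_mem:
        cpu = 1 if active else 0
        if mem > node_mem or cpu > node_cpus:
            raise ValueError(f"{name} does not fit on any node")
        for node in nodes:
            if node[0] >= mem and node[1] >= cpu:
                break  # first node with enough room
        else:
            node = [node_mem, node_cpus, []]  # open a new node
            nodes.append(node)
        node[0] -= mem
        node[1] -= cpu
        node[2].append(name)
    return [hosted for _, _, hosted in nodes]


vms = {
    "vm1": (2, True),
    "vm2": (1, True),
    "vm3": (1, False),
    "vm4": (2, True),
}
packing = first_fit_decreasing(vms, node_mem=3, node_cpus=2)
```

On this toy input the four VMs fit on two nodes. An exact approach would be needed to prove that no smaller packing exists in general.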

SLIDE 9

In Action

4 tasks, 3 or 4 servers

[Figure: % CPU over time for 4 tasks, without Entropy and with Entropy]

  • Without Entropy: 4 tasks on 4 servers
  • With Entropy: server no. 3 is stopped and consumption is reduced by 25%


SLIDE 11

Order VM Operations (1/2)

[Figure: current status, correct status, and non-viable manipulations]

SLIDE 12

Order VM Operations (2/2)

2 steps

SLIDE 13

Migration Interdependences

  • One additional node is required (critical energy consumption)

3 steps
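The ordering and interdependence issues can be illustrated with a small planner. This is a simplified sketch under stated assumptions, not Entropy's planner: only memory is considered, a migration is feasible when its destination currently has enough free memory, memory released during a step becomes available only in the next step, and a cycle is broken by first moving one VM to a spare "pivot" node. `Migration`, `plan_steps` and the `pivot` parameter are hypothetical names.

```python
from typing import NamedTuple


class Migration(NamedTuple):
    vm: str
    src: str
    dst: str
    mem: int  # memory the VM needs on its destination


def plan_steps(migrations, free_mem, pivot=None):
    """Group migrations into sequential steps of parallel, feasible moves.

    free_mem maps every node name (sources included) to its free memory.
    """
    free = dict(free_mem)
    pending = list(migrations)
    steps = []
    while pending:
        step = []
        room = dict(free)  # memory still unclaimed during this step
        for m in pending:
            if room[m.dst] >= m.mem:
                room[m.dst] -= m.mem
                step.append(m)
        if not step:
            # Interdependent migrations: bypass through the pivot node.
            blocked = next(
                (m for m in pending if pivot is not None and free[pivot] >= m.mem),
                None,
            )
            if blocked is None:
                raise ValueError("no feasible migration and no usable pivot node")
            step = [Migration(blocked.vm, blocked.src, pivot, blocked.mem)]
            # The VM still has to reach its final destination later on.
            index = pending.index(blocked)
            pending[index] = Migration(blocked.vm, pivot, blocked.dst, blocked.mem)
        else:
            for m in step:
                pending.remove(m)
        for m in step:
            free[m.dst] -= m.mem
            free[m.src] += m.mem
        steps.append(step)
    return steps


# Sequential dependency: vm1 can enter n2 only after vm2 has left it.
chain = plan_steps(
    [Migration("vm1", "n1", "n2", 2), Migration("vm2", "n2", "n3", 1)],
    {"n1": 0, "n2": 1, "n3": 1},
)

# Cycle: vm1 and vm2 must swap nodes, and neither node has free memory.
swap = plan_steps(
    [Migration("vm1", "n1", "n2", 1), Migration("vm2", "n2", "n1", 1)],
    {"n1": 0, "n2": 0, "n3": 1},
    pivot="n3",
)
```

In this toy setting the chain needs 2 steps and the swap needs 3 steps plus one additional node, which mirrors the step counts shown on the slides.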

SLIDE 14

Optimizing the reconfiguration process

  • Determine an efficient reconfiguration plan (thanks to a cost function)
  • Cost model:
  • the necessary steps before migrating a VM
  • the amount of memory to migrate
  • the parallelism inside a single step

[Figure: two reconfiguration plans compared; labels: Cost = 4, Cost = 9, 3 steps, 2 steps]
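The slide names the ingredients of the cost model but not the formula. One plausible way to combine them, given here purely as an assumption, is: the local cost of a migration is the memory to move, the duration of a step is the largest local cost in it (migrations of a step run in parallel), and each migration also pays for all the steps that had to complete before it. The function `plan_cost` is hypothetical.

```python
def plan_cost(steps):
    """Cost of a plan given as a list of steps.

    Each step is a list of memory amounts, one per migration run in parallel.
    Assumed formula: cost(migration) = duration of preceding steps + its memory,
    duration(step) = max memory in the step, cost(plan) = sum over migrations.
    """
    total = 0
    elapsed = 0  # accumulated duration of the steps already executed
    for step in steps:
        if not step:
            continue  # an empty step delays nothing
        for mem in step:
            total += elapsed + mem
        elapsed += max(step)
    return total


parallel_plan = [[1, 1], [1]]      # two migrations in parallel, then one
sequential_plan = [[1], [1], [1]]  # the same three migrations, one per step
```

Under this formula `plan_cost(parallel_plan)` is 4 and `plan_cost(sequential_plan)` is 6, so the plan that delays fewer migrations is preferred. These toy plans are not the ones drawn on the slide.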

SLIDE 15

Architecture overview

Entropy is a virtual machine (VM) manager for clusters. It acts as an infinite control loop that performs a globally optimized dynamic VM placement, without downtime, according to cluster resource usage and scheduler objectives.

  • Extract the current configuration: the position of each VM and its state (active or inactive)
  • Compute a viable configuration using a minimum number of nodes
  • Plan and reduce the migration process if migrations are necessary
  • Migration orders are sent to the hypervisors concerned
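The four phases of the loop can be sketched as a function that takes each phase as a callable. This is a schematic outline only: the callables `observe`, `compute_target`, `make_plan` and `execute` are hypothetical placeholders for the monitoring, placement, planning and hypervisor layers, and `max_iterations` exists only so that the sketch can terminate, whereas the real loop is described as infinite.

```python
import itertools
import time


def control_loop(observe, compute_target, make_plan, execute,
                 max_iterations=None, period=0.0):
    """Run the observe / decide / plan / act cycle.

    Returns the number of iterations in which a reconfiguration was executed.
    """
    rounds = itertools.count() if max_iterations is None else range(max_iterations)
    reconfigurations = 0
    for _ in rounds:
        current = observe()                # 1. extract the current configuration
        target = compute_target(current)   # 2. compute a viable configuration
        if target != current:
            for step in make_plan(current, target):  # 3. plan the migrations
                execute(step)              # 4. send the orders to the hypervisors
            reconfigurations += 1
        time.sleep(period)
    return reconfigurations


# Toy in-memory "cluster": the target always consolidates every VM on n1.
state = {"vm1": "n1", "vm2": "n2"}


def observe():
    return dict(state)


def compute_target(current):
    return {vm: "n1" for vm in current}


def make_plan(current, target):
    moves = [(vm, target[vm]) for vm in current if current[vm] != target[vm]]
    return [moves] if moves else []


def execute(step):
    for vm, dst in step:
        state[vm] = dst


runs = control_loop(observe, compute_target, make_plan, execute, max_iterations=3)
```

In this toy run only the first iteration triggers a reconfiguration. The next two find the cluster already consolidated and do nothing.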


SLIDE 17

The benefit of dynamic consolidation is limited by the duration of the reconfiguration process.

  • Entropy computes equivalent configurations with "cheap" reconfiguration plans until the minimum is reached
  • What are the benefits?
  • Better reactivity
  • A reduced overhead
  • A stable packing

[Figure annotations: "Reduced by 9%", "Always better", "From 14 to 6 minutes"]


SLIDE 19

Dynamic consolidation

  • Servers unused by online applications (web, HA, etc.) can be:
  • Turned off
  • OR
  • Used by preemptible applications (simulation, HPC, etc.)
  • The main problem
  • How can I make better use of my cluster by running as many preemptible applications as possible?

SLIDE 20

Entropy Advanced

[Figure: three processors-over-time diagrams showing the running jobs and a queue of four jobs]

  • Jobs arrive in the queue and have to be scheduled.
  • FCFS + EASY backfilling: jobs 2 and 3 have been backfilled. Some resources are unused (dark areas).
  • EASY backfilling with preemption: the 4th job can be started without impacting the first one. A small amount of resources is still unused.

⇒ consolidation and preemption to finely exploit distributed resources
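For readers unfamiliar with the baseline policy, here is a compact sketch of FCFS with EASY backfilling. It covers only the classic, non-preemptive variant: the preemption-based extension mentioned on the slide is not modeled. Jobs are assumed to declare a processor count and a walltime, and `Job` and `easy_backfill` are hypothetical names.

```python
from typing import NamedTuple


class Job(NamedTuple):
    name: str
    procs: int     # processors requested
    walltime: int  # declared maximum run time


def easy_backfill(now, free, running, queue):
    """Return the names of the queued jobs that can start at time `now`.

    running is a list of (end_time, procs) pairs; queue is in arrival order.
    """
    started = []
    queue = list(queue)
    running = list(running)
    # 1. FCFS: start jobs from the head of the queue while they fit.
    while queue and queue[0].procs <= free:
        job = queue.pop(0)
        free -= job.procs
        running.append((now + job.walltime, job.procs))
        started.append(job.name)
    if not queue:
        return started
    # 2. Reservation for the blocked head job: find the "shadow time" at which
    #    enough processors will be free, and the processors left over then.
    head = queue[0]
    available = free
    shadow = None
    for end, procs in sorted(running):
        available += procs
        if available >= head.procs:
            shadow = end
            break
    if shadow is None:
        return started  # the head job is larger than the whole cluster
    extra = available - head.procs
    # 3. Backfill: a later job may start now if it does not delay the head job.
    for job in queue[1:]:
        if job.procs > free:
            continue
        if now + job.walltime <= shadow:
            pass  # finishes before the reservation begins
        elif job.procs <= extra:
            extra -= job.procs  # runs past the shadow time on spare processors
        else:
            continue
        free -= job.procs
        started.append(job.name)
    return started


# 8-processor cluster: 6 processors are busy until t=10, 2 are free now.
waiting = [Job("J1", 8, 10), Job("J2", 2, 5), Job("J3", 1, 3)]
picked = easy_backfill(now=0, free=2, running=[(10, 6)], queue=waiting)
```

Here `J1` must wait for the whole cluster, `J2` is backfilled because it ends before `J1`'s reservation, and `J3` finds no processor left.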

SLIDE 21

General idea: manipulate vJobs instead of jobs

  • As with ordinary processes, each vJob is in a particular state:
  • A cluster-wide context switch (a set of VM context switches) makes it possible to efficiently rebalance the cluster according to the scheduler objectives, the available resources, and the queue of waiting vJobs (elasticity) [VTDC 2010]

SLIDE 22

Reconfiguration plan

SLIDE 23

Experiment on a cluster

  • Benefits
  • improved resource usage
  • suspend/resume is transparent to the developer
  • Resource usage


SLIDE 25

Conclusion

  • Manipulating VMs is tedious and may not be cost-effective
  • Entropy manages VMs instead of processes
  • Provides an efficient and reactive dynamic consolidation policy
  • and a generic cluster-wide context switch based on mechanisms provided by the VMM

http://entropy.gforge.inria.fr

LGPL. Uses an abstract VMM (Jasmine-VMM): ESX, Hyper-V, Xen, KVM ...

ANR Arpège SelfXL (2008-2011), ANR Arpège MyCloud (2010-2013), FUI Cool-IT (2011-2013), ANR Emergence Entropy (2011-2012)

SLIDE 26

J.M. Menaud - June 2010