A Tool for Environment Deployment in Clusters and Light Grids




slide-1
SLIDE 1

A Tool for Environment Deployment in Clusters and Light Grids

presented by Guillaume Huard

Yiannis Georgiou, Julien Leduc, Brice Videau, Johann Peyrard and Olivier Richard

Laboratoire Informatique et Distribution, Grenoble, France

Mescal Project

slide-2
SLIDE 2

Outline

◮ Introduction
◮ Related Work and a new approach
◮ Kadeploy2 Environment deployment tool
◮ Performance Evaluations
◮ Work in Progress
◮ Conclusion and Perspectives

slide-3
SLIDE 3

Outline

◮ Introduction
  ◮ Introduction and Motivations
  ◮ Environment deployment
◮ Related Work and a new approach
◮ Kadeploy2 Environment deployment tool
◮ Performance Evaluations
◮ Work in Progress
◮ Conclusion and Perspectives

slide-4
SLIDE 4

Introduction and Motivations

◮ Cluster-Grid ... for High Performance Computing
◮ Exploitation and Administration
◮ Issues:
  ◮ Cluster: software installation and configuration
  ◮ Grid: software heterogeneity among clusters
◮ Experimental platforms, research
  ◮ Need for various software environments for the experiments (resource administrator, NFS, authentication, ...)

[Figure: a cluster, with servers (user gateway, central services) and computing nodes]

slide-5
SLIDE 5

Environment deployment

Proposed solution:

◮ Environment deployment tool

[Figure: layered stack. The environment (applications, middleware, tools, distro, OS such as Linux or FreeBSD) is specifiable and configurable; below it are the hardware and the network]

◮ Typical sequence of an environment deployment:

1) Submission of requested nodes
2) Environment deployment
3) Work on the environment
4) Work finishes, nodes return to the initial reference environment

[Figure: timeline of steps 1 to 4; labels: new experiment, environment creation, at the batch scheduler]

slide-6
SLIDE 6

Outline

◮ Introduction
◮ Related Work and a new approach
  ◮ Related Work
  ◮ A new way of exploitation based on the deployment operation
◮ Kadeploy2 Environment deployment tool
◮ Performance Evaluations
◮ Work in Progress
◮ Conclusion and Perspectives

slide-7
SLIDE 7

Related Work

◮ Cluster management tools (Rocks, LCFG, Quattor, Oscar, ...)
  ◮ Automated installation, configuration and management
◮ Imaging tools (Partimage, g4u, Frisbee, ...)
  ◮ Creation of a disk/partition image
  ◮ Used mostly for maintenance, but also for installation
◮ Automated installation tools (SIS, Kickstart, ...)
  ◮ OS and software installation and configuration
◮ Virtualization (Xen/XenoServer, VServer, ...)
  ◮ Different approach: flexible infrastructure
  ◮ BUT: what if we want to evaluate virtualization itself?

slide-8
SLIDE 8

Related Work: Synthesis

◮ Cluster and grid exploitation -> various existing software solutions
◮ Environment deployment operations -> only in the installation or maintenance phases
◮ Solutions not as flexible as desired

slide-9
SLIDE 9

Kadeploy2: A new way of cluster and grid exploitation

◮ Enables every user to use the deployment operation on a cluster or grid and deploy the environment of his or her preference.

[Figure: a server holding environment images 1 and 2, and five nodes, each with available partitions and protected partitions]

◮ Solution for the software environment homogeneity problem on a cluster or light grid
  ◮ Light grid: simplification of the grid ... services/administration homogeneity
◮ Fast and robust deployment tool that proposes:
  ◮ Access control for every user to the deployment operations
  ◮ Simple method of environment creation
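The architecture slide later in this deck describes environment creation as a simple archive of the root partition in compressed tar format. The following is a minimal sketch of that idea using Python's standard `tarfile` module; the function names and paths are hypothetical and this is not Kadeploy2's own code.

```python
import os
import tarfile


def create_environment(root_dir, archive_path):
    """Archive the contents of root_dir into a gzip-compressed tar file.

    root_dir stands in for a mounted root partition (hypothetical path).
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in sorted(os.listdir(root_dir)):
            # Store entries relative to the root so they can be
            # unpacked onto any target partition.
            tar.add(os.path.join(root_dir, name), arcname=name)
    return archive_path


def extract_environment(archive_path, target_dir):
    """Unpack an environment archive onto a target directory."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target_dir)
```

On a real node the archive would have to be created with enough privileges to read every file, and ownership and permissions would need to be preserved on extraction; the sketch leaves those concerns out.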

slide-10
SLIDE 10

Outline

◮ Introduction
◮ Related Work and a new approach
◮ Kadeploy2 Environment deployment tool
  ◮ Kadeploy2: Architecture
  ◮ Kadeploy2: Deployment procedure
  ◮ Kadeploy2: Deployment procedure optimizations
◮ Performance Evaluations
◮ Work in Progress
◮ Conclusion and Perspectives

slide-11
SLIDE 11

Kadeploy2: Architecture and Principles

[Figure: Kadeploy2 architecture; labels: users, client, submission, batch scheduler, Kadeploy2 server, database, environment repository, diffusion mechanism, network booting protocols, hardware reboot mechanism, computing nodes]

◮ Uses the usual protocols for network booting (PXE, TFTP, DHCP)
◮ Use of a database (MySQL): state, configuration, environment description
◮ Fast mechanism for environment diffusion: pipeline (flat tree, chain)
◮ Integration with the resource manager (e.g. PBS, OAR, ...)
◮ Deployment process: transfers the new environment to every computing node (specific partition)
◮ Robustness (hardware side): use of remote mechanisms for hardware reboot
◮ Environment creation: simple archiving of the root partition in compressed tar format
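To see why a pipelined chain can make diffusion fast, here is a simplified timing model. It is an illustration under stated assumptions and not a description of Kadeploy2's actual diffusion code: every link has the same bandwidth, nodes forward each chunk as soon as they receive it, and the image size, bandwidth and chunk size are hypothetical parameters.

```python
def sequential_time(image_mb, link_mb_s, nodes):
    """Baseline: the server sends the full image to each node in turn."""
    return nodes * image_mb / link_mb_s


def chain_time(image_mb, link_mb_s, nodes, chunk_mb=1.0):
    """Pipelined chain: node i forwards chunks to node i+1 while receiving.

    The last node finishes after one full image transfer plus the time
    one chunk needs to travel across the (nodes - 1) intermediate hops.
    """
    if nodes == 0:
        return 0.0
    pipeline_fill = (nodes - 1) * chunk_mb / link_mb_s
    return image_mb / link_mb_s + pipeline_fill
```

In this model the chain time grows only by one chunk transfer per extra node, whereas the sequential baseline grows by one full image transfer per node.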


slide-12
SLIDE 12

The deployment procedure: Concepts

◮ Procedure controlled by a minimal system (mini-kernel, initrd)
  ◮ Mounted in memory
◮ Preinstallation
  ◮ Disk partitioning
◮ Transfer + write
  ◮ Environment diffusion on the deployment partition
◮ Postinstallation
  ◮ Finalizes the configuration of services that lack autoconfiguration procedures
◮ Robustness (software side)
  ◮ Failing nodes excluded by timeouts
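The timeout-based exclusion of failing nodes can be sketched as follows. This is a minimal illustration, not Kadeploy2's implementation: `deploy_one` is a hypothetical per-node callable, and the single shared deadline is an assumption made for the example.

```python
import concurrent.futures as cf
import time


def deploy_with_timeout(nodes, deploy_one, timeout_s):
    """Run deploy_one(node) on all nodes in parallel.

    Returns (deployed, excluded). A node is excluded when its step
    raises an error or does not finish before the shared deadline.
    """
    deployed, excluded = [], []
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(nodes)))
    futures = [(node, pool.submit(deploy_one, node)) for node in nodes]
    deadline = time.monotonic() + timeout_s
    for node, future in futures:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            future.result(timeout=remaining)
            deployed.append(node)
        except Exception:
            # Covers both a timeout and a failure inside deploy_one.
            excluded.append(node)
    # Do not block on stragglers; the deployment goes on without them.
    pool.shutdown(wait=False)
    return deployed, excluded
```

The point of the design is that one slow or dead node never delays the others: the deployment continues with the nodes that answered in time.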

slide-13
SLIDE 13

Deployment procedure steps and time chart

1) Submission
2) Attribution / Session opening
3) Deployment permission controls
4) Boot deployment minimal-system
5) Preinstallation
6) Environment propagation + decompression on the partition
7) Postinstallation
8) Order of reboot
9) Boot on the new environment
10) Work on the environment
11) Session end indication
12) Deployment permission rights withdrawal / End of session
13) Order of reboot
14) Boot on the reference environment

[Figure: time chart for one computing node. Steps 1 to 3 run in the reference environment, steps 5 to 8 in the minimal system and steps 10 to 13 in the user environment; steps 4, 9 and 14 are the reboots between them]

◮ Steps 4, 9 and 14 (reboot phases) are the most time consuming
◮ Motivation for optimization

slide-14
SLIDE 14

1st optimization method: nomini

1) Submission
2) Attribution / Session opening
3) Deployment permission controls
4) Reference environment preparation
5) Preinstallation
6) Environment propagation + decompression on the partition
7) Postinstallation
8) Order of reboot
9) Boot on the new environment
10) Work on the environment
11) Session end indication
12) Deployment permission rights withdrawal / End of session
13) Order of reboot
14) Boot on the reference environment

[Figure: time chart for one computing node. Steps 1 to 8 run in the reference environment and steps 10 to 13 in the user environment; steps 9 and 14 are reboots]

◮ 1st reboot elimination: procedure controlled by the reference environment (no minimal system)

Constraints:

◮ Diffusion mechanism installed on the reference environment
◮ Deployment on a different partition than the current root

Robustness -> guaranteed (same arguments as the default method)

slide-15
SLIDE 15

2nd optimization method: pivot

1) Submission
2) Attribution / Session opening
3) Deployment permission controls
4) Reference environment preparation
5) Preinstallation
6) Environment propagation + decompression on the partition
7) Postinstallation
8) Reference environment preparation
9) Change the root file system (user environment) + services launching
10) Work on the environment
11) Session end indication
12) Deployment permission rights withdrawal / End of session
13) Change the root file system (reference environment) + services launching

[Figure: time chart for one computing node. Steps 1 to 8 run in the reference environment and steps 10 to 12 in the user environment; step 9 is the pivot and step 13 the un-pivot]

◮ Extension of the nomini method (1st reboot eliminated)
◮ 2nd reboot elimination: just change the root filesystem, using the system command pivot_root
◮ This is reversible! -> 3rd reboot elimination

Drawbacks:

◮ Same constraints as the 1st optimization method (based on it)
◮ Cannot change the kernel or the kernel parameters

Robustness -> unchanged
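The three procedures differ mainly in how many reboots they need. The small sketch below records the reboot steps listed on the procedure slides and estimates the time spent rebooting; the per-reboot duration is a hypothetical parameter, not a figure measured in this deck.

```python
# Step numbers that are reboots in each method, taken from the
# default, nomini and pivot procedure slides.
REBOOT_STEPS = {
    "default": (4, 9, 14),  # minimal system, new environment, reference
    "nomini": (9, 14),      # no boot on the minimal system
    "pivot": (),            # pivot_root replaces the remaining reboots
}


def reboot_count(method):
    """Number of reboots a deployment method goes through."""
    return len(REBOOT_STEPS[method])


def reboot_overhead(method, reboot_s):
    """Time spent rebooting, assuming every reboot takes reboot_s seconds."""
    return reboot_count(method) * reboot_s
```

The model ignores what pivot costs in exchange, namely that the kernel and its parameters cannot be changed.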


slide-16
SLIDE 16

Outline

◮ Introduction
◮ Related Work and a new approach
◮ Kadeploy2 Environment deployment tool
◮ Performance Evaluations
  ◮ Time to deploy at the cluster level
  ◮ Time to deploy at the grid level
◮ Work in Progress
◮ Conclusion and Perspectives

slide-17
SLIDE 17

Execution time of a deployment

◮ Platform: Grid5000, the French nationwide experimental grid:
  ◮ 9 geographically distributed sites
  ◮ Every site hosts 1 to 3 clusters (from 256 CPUs to 1K CPUs)
  ◮ All sites connected by RENATER (French academic network)
    ◮ >10Gbits (2006)
◮ Used 2 clusters for our performance measurements:
  ◮ GDX cluster, LRI laboratory @ Orsay (AMD Opteron biprocessor 2 GHz, 2 GB RAM, Gigabit Ethernet)
  ◮ Sophia cluster, INRIA laboratory @ Sophia-Antipolis (AMD Opteron biprocessor 2 GHz, 2 GB RAM, Myrinet/Gigabit Ethernet)

slide-18
SLIDE 18

Default deployment method

◮ Metric: time to reach each of the 5 most time-consuming steps in the deployment procedure (from session opening to boot on the desired environment)
◮ GDX cluster (180 nodes)

[Figure: Kadeploy2 default deployment method on GDX cluster. Axes: time (sec) and #nodes. Curves: reboot/first check, preinstall, environment propagation + copy, postinstall, reboot/last check]

◮ Bottom curve: time to boot the minimal system
◮ Upper curve: total time to boot the desired environment

slide-19
SLIDE 19

Comparison of deployment methods

[Figure: Kadeploy2 deployment procedure (methods). Axes: time (sec) and #nodes. Curves: default, nomini, pivot]

◮ Optimization methods are 70 to 160 seconds faster

slide-20
SLIDE 20

Deployment on a lightweight grid of 2 clusters

◮ Time diagram of a deployment (default method) on 2 Grid5000 sites using 260 nodes (180 nodes in site 1 (GDX) and 80 nodes in site 2 (Sophia))

[Figure: axes: # nodes and time (sec). Curves: deploying, deployed, deploying_site1, deployed_site1, deploying_site2, deployed_site2]

◮ Current "boot-to-boot" time at grid level: 450 seconds

slide-21
SLIDE 21

Outline

◮ Introduction
◮ Related Work and a new approach
◮ Kadeploy2 Environment deployment tool
◮ Performance Evaluations
◮ Work in Progress
  ◮ Various operating systems support
◮ Conclusion and Perspectives

slide-22
SLIDE 22

Various operating systems support

◮ Formerly specific to GNU/Linux and the ext2 filesystem
◮ Need for other OS integration: FreeBSD, Solaris, MacOSX, Windows, ...
◮ Use of byte-to-byte copy for the environment image construction (dd command)
  ◮ Usable at disk-device level (assuming LBA)
  ◮ No restriction on the disk data layout: everything can be replicated

General method for various operating systems integration:

1. Boot on the minimal deployment system.
2. Write a new minimal system for the target OS at the beginning of a primary partition.
3. Boot on the new minimal deployment system.
4. Make the appropriate changes to the disk partitions according to the needs of the new filesystem, and write the regular deployment environment.
5. Reboot on the new OS environment.
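The byte-to-byte copy used for image construction can be illustrated with a short Python equivalent of what the dd command does. This is only a sketch of the technique: the function name, paths and block size are hypothetical, and the deck itself uses dd rather than any such script.

```python
def raw_copy(src_path, dst_path, block_size=1024 * 1024):
    """Copy src_path to dst_path block by block, ignoring any filesystem.

    src_path may be a regular image file or, with sufficient privileges,
    a disk device. Returns the number of bytes copied.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break  # end of the source reached
            dst.write(block)
            copied += len(block)
    return copied
```

Because the copy never interprets the data, it works for any partition layout or filesystem, which is what makes this approach independent of the operating system.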


slide-23
SLIDE 23

Outline

◮ Introduction
◮ Related Work and a new approach
◮ Kadeploy2 Environment deployment tool
◮ Performance Evaluations
◮ Work in Progress
◮ Conclusion and Perspectives
  ◮ Conclusion
  ◮ Perspectives

slide-24
SLIDE 24

Conclusion and Current State

◮ Environment deployment: new ways to exploit clusters and grids
◮ Kadeploy2: simple, effective and robust deployment tool
◮ A mandatory tool providing the reconfiguration facilities for the Grid5000 platform

slide-25
SLIDE 25

Perspectives

◮ New deployment optimizations
◮ Refine the general method for various platform and OS support
◮ More user-friendly procedures: environment creation and update method

slide-26
SLIDE 26

Questions

and some advertising...

https://www.grid5000.fr/

http://www-id.imag.fr/Logiciels/kadeploy/

slide-27
SLIDE 27

Backup Slides

[Figure: memory and disk partitions of a node in the reference environment, the minimal system and the user environment, separated by reboots. Legend: under preparation for deployment, current root file system]

slide-28
SLIDE 28

Backup Slides

[Figure: three plots on the GDX cluster, each with axes time (sec) and #nodes.
Kadeploy2 default deployment method: reboot/first check, preinstall, environment propagation + copy, postinstall, reboot/last check.
Kadeploy2 nomini deployment method: first check, preparation/preinstall, environment propagation + copy, postinstall, reboot/last check.
Kadeploy2 pivot deployment method: first check, preparation/preinstall, environment propagation + copy, postinstall, last check]

slide-29
SLIDE 29

Backup Slides

[Figure: memory and disk partitions of a node in the reference environment, the minimal system, the minimal OS and the new OS environment, separated by reboots. Legend: under preparation for deployment, current root file system]