A Tool for Environment Deployment in Clusters and Light Grids
presented by Guillaume Huard
Yiannis Georgiou, Julien Leduc, Brice Videau, Johann Peyrard and Olivier Richard
Laboratoire Informatique et Distribution, Grenoble, FRANCE
Mescal
Outline
◮ Introduction
◮ Related Work and a new approach
◮ Kadeploy2 Environment deployment tool
◮ Performance Evaluations
◮ Work in Progress
◮ Conclusion and Perspectives
Introduction and Motivations
◮ Clusters and grids for High Performance Computing
  ◮ Exploitation and administration
  ◮ Issues:
    ◮ Cluster: software installation and configuration
    ◮ Grid: software heterogeneity among clusters
◮ Experimental platforms, research
  ◮ Need for various software environments for the experiments (resource administrator, NFS, authentication, ...)
[Figure: cluster layout showing the user gateway, central services, servers and computing nodes]
Environment deployment
Proposed solution:
◮ Environment deployment tool
[Figure: environment layers. Labels: Applications, Middleware, Tools, Distro, OS (Linux, FreeBSD, ...), Hardware, Network; the environment is marked specifiable and configurable]
◮ Typical sequence of an environment deployment:
  1) Submission of requested nodes
  2) Environment deployment
  3) Work on the environment
  4) Work finishes, nodes return to the initial reference environment
[Figure: timeline of a new experiment, from environment creation through steps 1 to 4 at the batch scheduler]
Related Work
◮ Cluster management tools (Rocks, LCFG, Quattor, Oscar, ...)
  ◮ Automated installation, configuration and management
◮ Imaging tools (Partimage, g4u, Frisbee, ...)
  ◮ Creation of a disk/partition image
  ◮ Used mostly for maintenance, but also for installation
◮ Automated installation tools (SIS, Kickstart, ...)
  ◮ OS and software installation and configuration
◮ Virtualization (Xen/XenoServer, VServer, ...)
  ◮ Different approach: flexible infrastructure
  ◮ BUT: what if we want to evaluate virtualization itself?
Related Work-Synthesis
◮ Cluster and grid exploitation -> various existing software solutions
◮ Environment deployment operations -> only during installation or maintenance phases
◮ Solutions not as flexible as desired
Kadeploy2: A new way of cluster and grid exploitation
◮ Enables every user to use the deployment operation on a cluster or grid and deploy the environment of his or her preference.
[Figure: a server holding Environment Image 1 and Environment Image 2, and nodes whose disks show available partitions and protected partitions]
◮ Solution for the software environment homogeneity problem on a cluster or light grid
  ◮ Light grid: simplification of grid ... services/administration homogeneity
◮ Fast and robust deployment tool that proposes:
  ◮ Access control for every user to the deployment operations
  ◮ Simple method of environment creation
Kadeploy2: Architecture and Principles
[Figure: Kadeploy2 architecture. Components shown: users, client, batch scheduler (submission), Kadeploy2 server, database, environment repository, diffusion mechanism, network booting protocols, hardware reboot mechanism, computing nodes]
◮ Uses the usual network booting protocols (PXE, TFTP, DHCP)
◮ Uses a database (MySQL) for state, configuration and environment descriptions
◮ Fast mechanism for environment diffusion: pipeline (flat tree, chain)
◮ Integration with the resource manager (e.g. PBS, OAR, ...)
◮ Deployment process: transfers the new environment to a specific partition on every computing node
◮ Robustness (hardware side): remote mechanisms for hardware reboot
◮ Environment creation: simple archiving of the root partition in compressed tar format
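The environment creation principle listed on this slide (archiving a root partition as a compressed tar) can be sketched as follows. This is only an illustration under assumptions, not Kadeploy2's own tooling: the function name `archive_environment` and its arguments are hypothetical, and a real root archive would also need to skip pseudo-filesystems such as /proc and /sys.

```python
import tarfile


def archive_environment(root_dir, archive_path):
    """Archive the tree under root_dir into a gzip-compressed tar file.

    Hypothetical helper. archive_path should live outside root_dir so the
    archive does not try to include itself.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname="." stores entries relative to the root (./etc/...),
        # so the archive can be unpacked onto any target partition.
        tar.add(root_dir, arcname=".")
    # Re-open the archive to return the list of stored member names.
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

Storing relative paths is the design choice that matters here: the same archive can then be decompressed onto whichever deployment partition a node provides.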
The deployment procedure: Concepts
◮ Procedure controlled by a minimal-system (mini-kernel, initrd)
  ◮ Memory mounted
◮ Preinstallation
  ◮ Disk partitioning
◮ Transfer + Write
  ◮ Environment diffusion on the deployment partition
◮ Post-Installation
  ◮ Finalizes the configuration of services that lack autoconfiguration procedures
◮ Robustness (Software Side)
  ◮ Failing nodes excluded by timeouts
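The software-side robustness rule (failing nodes excluded by timeouts) could look roughly like the sketch below. It is an assumption-laden illustration, not Kadeploy2 code: `exclude_late_nodes`, the `has_reached_step` callback and the polling interval are all hypothetical names and choices.

```python
import time


def exclude_late_nodes(nodes, has_reached_step, timeout_s, poll_s=0.5):
    """Split nodes into (ready, excluded) after waiting at most timeout_s.

    has_reached_step is a caller-supplied callable (hypothetical): it takes
    a node name and returns True once that node has finished the current step.
    """
    pending = set(nodes)
    ready = set()
    deadline = time.monotonic() + timeout_s
    while pending and time.monotonic() < deadline:
        done = {n for n in pending if has_reached_step(n)}
        ready |= done
        pending -= done
        if pending:
            time.sleep(poll_s)
    # Nodes still pending missed the deadline; they are dropped so the
    # deployment can continue on the remaining nodes.
    return sorted(ready), sorted(pending)
```

With this shape, a single slow or dead node costs at most one timeout per step instead of blocking the whole deployment.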
Deployment procedure steps and time chart
1) Submission
2) Attribution / Session opening
3) Deployment permission controls
4) Boot deployment minimal-system
5) Preinstallation
6) Environment propagation + decompression on the partition
7) Postinstallation
8) Order of reboot
9) Boot on the new environment
10) Work on the environment
11) Session end indication
12) Deployment permission rights withdrawal / End of session
13) Order of reboot
14) Boot on the reference environment
[Figure: timeline of a computing node. Reference environment (steps 1, 2, 3), reboot (4), minimal-system (5, 6, 7, 8), reboot (9), user environment (10, 11, 12, 13), reboot (14). A deployment timetable with one row per step accompanies it]
◮ Steps 4, 9 and 14 (the reboot phases) are the most time consuming
◮ This motivates the optimizations
1st optimization method: nomini
1) Submission
2) Attribution / Session opening
3) Deployment permission controls
4) Reference environment preparation
5) Preinstallation
6) Environment propagation + decompression on the partition
7) Postinstallation
8) Order of reboot
9) Boot on the new environment
10) Work on the environment
11) Session end indication
12) Deployment permission rights withdrawal / End of session
13) Order of reboot
14) Boot on the reference environment
[Figure: timeline of a computing node. Reference environment (steps 1 to 8), reboot (9), user environment (10, 11, 12, 13), reboot (14)]
◮ 1st reboot elimination: procedure controlled by the reference environment (no minimal-system)
Constraints:
◮ Diffusion mechanism installed on the reference environment
◮ Deployment on a different partition than the current root
Robustness -> guaranteed (same arguments as the default method)
2nd optimization method: pivot
1) Submission
2) Attribution / Session opening
3) Deployment permission controls
4) Reference environment preparation
5) Preinstallation
6) Environment propagation + decompression on the partition
7) Postinstallation
8) Reference environment preparation
9) Change the root file system (user environment) + services launching
10) Work on the environment
11) Session end indication
12) Deployment permission rights withdrawal / End of session
13) Change the root file system (reference environment) + services launching
[Figure: timeline of a computing node. Reference environment (steps 1 to 8), pivot (9), user environment (10, 11, 12), un-pivot (13)]
◮ Extension of the nomini method (1st reboot eliminated)
◮ 2nd reboot elimination: just change the root filesystem, using the system command pivot_root
◮ This is reversible! -> 3rd reboot elimination
Drawbacks:
◮ Same constraints as the 1st optimization method (it is based on it)
◮ Cannot change the kernel or the kernel parameters
Robustness -> unchanged
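To make the root switch of the pivot method more concrete, here is a small sketch that only builds the command sequence and executes nothing. The device name, mount point and `oldroot` directory are hypothetical, and the steps Kadeploy2 really performs (including the services launching) are not described on the slides; the only firm element is the `pivot_root new_root put_old` calling convention, where `put_old` must lie under `new_root`.

```python
import posixpath


def pivot_commands(device, new_root="/mnt/userenv", put_old="oldroot"):
    """Return the commands for switching root to an already deployed partition.

    Nothing is executed. device, new_root and put_old are illustrative
    values; restarting services after the switch is left out.
    """
    old = posixpath.join(new_root, put_old)
    return [
        ["mount", device, new_root],    # the new root must be a mount point
        ["mkdir", "-p", old],           # where the previous root gets re-attached
        ["pivot_root", new_root, old],  # pivot_root new_root put_old
    ]
```

Because the kernel keeps running across the switch, this also illustrates the drawback noted on the slide: the kernel and its parameters stay those of the reference environment.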
Execution time of a deployment
◮ Platform: Grid5000, the French nationwide experimental grid:
  ◮ 9 geographically distributed sites
  ◮ Every site hosts 1 to 3 clusters (from 256 CPUs to 1K CPUs)
  ◮ All sites connected by RENATER (French Academic Network) -> 10 Gbit/s (2006)
◮ 2 clusters used for our performance measurements:
  ◮ GDX cluster, LRI Laboratory @ Orsay (AMD Opteron biprocessor 2 GHz, 2 GB RAM, Gigabit Ethernet)
  ◮ Sophia cluster, INRIA Laboratory @ Sophia-Antipolis (AMD Opteron biprocessor 2 GHz, 2 GB RAM, Myrinet/Gigabit Ethernet)
Default deployment method
◮ Metric: time to reach each of the 5 most time-consuming steps in the deployment procedure (from session opening to boot on the desired environment)
◮ GDX cluster (180 nodes)
[Figure: Kadeploy2 default deployment method on GDX cluster. Time (sec) against number of nodes, one curve per step: reboot, first check; preinstall; environment propagation + copy; postinstall; reboot, last check]
  ◮ Bottom curve: time to boot the minimal system
  ◮ Upper curve: total time to boot the desired environment
Comparison of deployment methods
[Figure: Kadeploy2 deployment procedure (methods). Time (sec) against number of nodes for the default, nomini and pivot methods]
◮ The optimization methods are 70 to 160 seconds faster
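One way to read this comparison is through the number of full reboots each method still performs, taken from the step lists of the three methods. The sketch below only encodes those counts; the names `REBOOT_STEPS` and `reboots_saved` are illustrative, and no timing figures are assumed.

```python
# Steps that are a full reboot in each method, as listed on the slides.
REBOOT_STEPS = {
    "default": [4, 9, 14],  # minimal system, new environment, reference environment
    "nomini": [9, 14],      # no boot on the minimal system
    "pivot": [],            # root switched with pivot_root instead of rebooting
}


def reboots_saved(method, baseline="default"):
    """Number of reboots a method avoids compared with the baseline."""
    return len(REBOOT_STEPS[baseline]) - len(REBOOT_STEPS[method])
```

Since the reboot phases are identified as the most time-consuming steps, fewer reboots is the plausible source of the measured gain.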
Deployment on a lightweight grid of 2 clusters
◮ Time diagram of a deployment (default method) on 2 Grid5000 sites using 260 nodes (180 nodes in site 1 (GDX) and 80 nodes in site 2 (Sophia))
[Figure: number of nodes against time (sec), with curves deploying, deployed, deploying_site1, deployed_site1, deploying_site2, deployed_site2]
◮ Current "boot-to-boot" time at grid level: 450 seconds
Various operating systems support
◮ Formerly specific to GNU/Linux and the ext2 filesystem
◮ Need for integration of other OSes: FreeBSD, Solaris, MacOSX, Windows...
◮ Use of byte-to-byte copy for the environment image construction (dd command)
  ◮ usable at disk-device level (assuming LBA)
  ◮ no restriction on the disk data layout: everything can be replicated
General method for various operating systems integration:
1. Boot on the minimal deployment system.
2. Write a new minimal system for the target OS at the beginning of a primary partition.
3. Boot on the new minimal deployment system.
4. Make the appropriate changes to the disk partitions according to the needs of the new filesystem, and write the regular deployment environment.
5. Reboot on the new OS environment.
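The byte-to-byte copy mentioned on this slide is what `dd` does: it reads fixed-size blocks from one file or device and writes them unchanged to another, without interpreting the filesystem. The sketch below shows the same idea; `raw_copy` and its block size are hypothetical, and this is not how Kadeploy2 itself builds images.

```python
def raw_copy(src_path, dst_path, block_size=1024 * 1024):
    """Copy src to dst byte for byte in fixed-size blocks; return bytes copied."""
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # empty read means end of the source
                break
            dst.write(block)
            copied += len(block)
    return copied
```

Because nothing in the loop depends on the data layout, the same copy works for any operating system's partition, which is the property the slide relies on.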
Conclusion and Current State
◮ Environment deployment: new ways to exploit clusters and grids
◮ Kadeploy2: simple, effective and robust deployment tool
◮ A mandatory tool providing the reconfiguration facilities for the Grid5000 platform
Perspectives
◮ New deployment optimizations
◮ Refine the general method for various platform and OS support
◮ More user-friendly procedures: environment creation and update method
Questions
and some advertising...
https://www.grid5000.fr/
http://www-id.imag.fr/Logiciels/kadeploy/
Backup Slides
[Figure: memory and disk partitions of a node across the reboots of the default method (reference environment, minimal-system, user environment), marking the current root file system and the partition under preparation for deployment]
[Figure: three plots of time (sec) against number of nodes on GDX cluster, one per Kadeploy2 deployment method. Default: reboot, first check; preinstall; environment propagation + copy; postinstall; reboot, last check. Nomini: first check; preparation, preinstall; environment propagation + copy; postinstall; reboot, last check. Pivot: first check; preparation, preinstall; environment propagation + copy; postinstall; last check]
[Figure: memory and disk partitions of a node across the reboots of the multi-OS method (reference environment, minimal system, minimal OS, new OS environment), marking the current root file system and the partition under preparation for deployment]