
A Tool for Environment Deployment in Clusters and Light Grids



  1. A Tool for Environment Deployment in Clusters and Light Grids
     Presented by Guillaume Huard
     Yiannis Georgiou, Julien Leduc, Brice Videau, Johann Peyrard and Olivier Richard
     Laboratoire Informatique et Distribution, Grenoble, France
     Mescal Project

  2. Outline
     ◮ Introduction
     ◮ Related Work and a new approach
     ◮ Kadeploy2 Environment deployment tool
     ◮ Performance Evaluations
     ◮ Work in Progress
     ◮ Conclusion and Perspectives

  3. Outline
     ◮ Introduction
       ◮ Introduction and Motivations
       ◮ Environment deployment
     ◮ Related Work and a new approach
     ◮ Kadeploy2 Environment deployment tool
     ◮ Performance Evaluations
     ◮ Work in Progress
     ◮ Conclusion and Perspectives

  4. Introduction and Motivations
     ◮ Cluster-Grid for High Performance Computing
       ◮ Exploitation and Administration
     ◮ Issues:
       ◮ Cluster: software installation and configuration
       ◮ Grid: software heterogeneity among clusters
     ◮ Experimental platforms, research
       ◮ Need for various software environments for the experiments
     [Figure: a cluster with computing nodes, servers, a user gateway and central services (resource administrator, NFS, authentication, ...)]

  5. Environment deployment
     Proposed solution:
     ◮ Environment deployment tool
     [Figure: software stack with labels Applications, Tools, Middleware, Distro, OS (Linux, FreeBSD, ...), Hardware, Network; annotations: Specifiable Environment, Configurable]
     ◮ Typical sequence of an environment deployment:
       1) Submission of requested nodes at the batch scheduler
       2) Environment deployment
       3) Work on the environment
       4) Work finishes, nodes return to the initial reference environment
     [Figure: cycle through steps 1 to 4, with labels Environment creation and New experiment]

  6. Outline
     ◮ Introduction
     ◮ Related Work and a new approach
       ◮ Related Work
       ◮ A new way of exploitation based on the deployment operation
     ◮ Kadeploy2 Environment deployment tool
     ◮ Performance Evaluations
     ◮ Work in Progress
     ◮ Conclusion and Perspectives

  7. Related Work
     ◮ Cluster management tools (Rocks, LCFG, Quattor, Oscar, ...)
       ◮ Automated installation, configuration and management
     ◮ Imaging tools (Partimage, g4u, Frisbee, ...)
       ◮ Creation of a disk/partition image
       ◮ Used mostly for maintenance, but also for installation
     ◮ Automated installation tools (SIS, Kickstart, ...)
       ◮ OS and software installation and configuration
     ◮ Virtualization (Xen/XenoServer, VServer, ...)
       ◮ Different approach: flexible infrastructure
       ◮ BUT: what if we want to evaluate virtualization itself?

  8. Related Work: Synthesis
     ◮ Cluster and Grid exploitation -> various existing software solutions
     ◮ Environment deployment operations -> used only in the installation or maintenance phases
     ◮ Solutions not as flexible as desired

  9. Kadeploy2: A new way of cluster and grid exploitation
     ◮ Enables every user to run the deployment operation on a cluster or grid and deploy the environment of his or her choice
     [Figure: a server and several nodes, with labels Environment Image 1, Environment Image 2, Protected Partitions, Available Partitions]
     ◮ Solution for the software environment homogeneity problem on a cluster or light grid
       ◮ Light grid: simplification of the grid (homogeneous services and administration)
     ◮ Fast and robust deployment tool that offers:
       ◮ Access control for every user to the deployment operations
       ◮ A simple method of environment creation

  10. Outline
      ◮ Introduction
      ◮ Related Work and a new approach
      ◮ Kadeploy2 Environment deployment tool
        ◮ Kadeploy2: Architecture
        ◮ Kadeploy2: Deployment procedure
        ◮ Kadeploy2: Deployment procedure optimizations
      ◮ Performance Evaluations
      ◮ Work in Progress
      ◮ Conclusion and Perspectives

  11. Kadeploy2: Architecture and Principles
      [Figure: architecture diagram with labels Users, Batch Scheduler, Submission, Kadeploy2, Environment Repository, Database, Diffusion Mechanism, Network booting protocols, Hardware reboot mechanism, Computing Nodes, Client, Server]
      ◮ Uses the usual network booting protocols (PXE, TFTP, DHCP)
      ◮ Uses a database (MySQL): state, configuration, environment descriptions
      ◮ Fast mechanism for environment diffusion: pipeline (flat tree, chain)
      ◮ Integration with the resource manager (e.g. PBS, OAR, ...)
      ◮ Deployment process: transfers the new environment to every computing node (on a specific partition)
      ◮ Robustness (hardware side): use of remote mechanisms for hardware reboot
      ◮ Environment creation: simple archiving of the root partition in compressed tar format
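The last bullet, environment creation as a compressed tar archive of a root partition, can be illustrated with a minimal Python sketch. This is not Kadeploy2's own tooling: the function name `create_environment_archive` and the default exclusion list are assumptions made for the example.

```python
import os
import tarfile
import tempfile


def create_environment_archive(root_dir, archive_path,
                               exclude=("proc", "sys", "tmp")):
    """Archive root_dir into a gzip-compressed tar file.

    Top-level directories named in `exclude` are skipped. The default
    list is a hypothetical choice for this sketch, not a Kadeploy2 rule.
    """
    def keep(tarinfo):
        # Member names look like "./etc/hostname" because arcname is ".".
        parts = tarinfo.name.split("/")
        top = parts[1] if len(parts) > 1 else ""
        # Returning None drops the member (and, for a directory, its content).
        return None if top in exclude else tarinfo

    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(root_dir, arcname=".", filter=keep)
    return archive_path


def build_demo_archive():
    """Build a tiny fake root tree, archive it, and return the member names."""
    with tempfile.TemporaryDirectory() as workdir:
        root = os.path.join(workdir, "root")
        os.makedirs(os.path.join(root, "etc"))
        os.makedirs(os.path.join(root, "proc"))
        with open(os.path.join(root, "etc", "hostname"), "w") as f:
            f.write("node-1\n")
        with open(os.path.join(root, "proc", "ignored"), "w") as f:
            f.write("x")
        archive = create_environment_archive(
            root, os.path.join(workdir, "env.tgz"))
        with tarfile.open(archive) as tar:
            return tar.getnames()
```

Storing members relative to `.` keeps the archive independent of where the root tree was mounted when it was captured, so it can be unpacked onto any target partition.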

  12. The deployment procedure: Concepts
      ◮ Procedure controlled by a minimal system (mini-kernel, initrd)
        ◮ Memory mounted
      ◮ Preinstallation
        ◮ Disk partitioning
      ◮ Transfer + Write
        ◮ Environment diffusion on the deployment partition
      ◮ Post-installation
        ◮ Finalizes the configuration of services that lack autoconfiguration procedures
      ◮ Robustness (software side)
        ◮ Failing nodes excluded by timeouts
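The software-side robustness rule (failing nodes are excluded by timeouts) can be sketched as follows. This is an illustrative model, not Kadeploy2 code: the function name, the phase tuples and the per-phase timeout values are all assumptions, and node response times are passed in as plain data instead of being measured.

```python
def run_phases(nodes, phases):
    """Walk the nodes through successive phases, dropping slow or silent ones.

    `phases` is a list of (name, timeout_seconds, durations) tuples, where
    `durations` maps a node to the seconds it took to finish the phase, or
    to None if it never answered. All values are hypothetical inputs.
    Returns (surviving_nodes, excluded) where `excluded` maps each dropped
    node to the phase in which it was dropped.
    """
    alive = list(nodes)
    excluded = {}
    for name, timeout, durations in phases:
        still_alive = []
        for node in alive:
            took = durations.get(node)
            if took is not None and took <= timeout:
                still_alive.append(node)
            else:
                # The node missed the deadline: exclude it and carry on
                # with the others instead of blocking the deployment.
                excluded[node] = name
        alive = still_alive
    return alive, excluded


# Example with made-up timings.
demo_phases = [
    ("preinstall", 60, {"n1": 10, "n2": None, "n3": 12}),
    ("transfer", 300, {"n1": 50, "n3": 400}),
]
demo_alive, demo_excluded = run_phases(["n1", "n2", "n3"], demo_phases)
```

A node excluded in one phase takes no part in later phases, so a single failure cannot stall the remaining nodes.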

  13. Deployment procedure steps and time chart
      1) Submission
      2) Attribution / Session opening
      3) Deployment permission controls
      4) Boot deployment minimal-system
      5) Preinstallation
      6) Environment propagation + Decompression on the partition
      7) Postinstallation
      8) Order of reboot
      9) Boot on the new environment
      10) Work on the environment
      11) Session end indication
      12) Deployment permission rights withdrawal / End of session
      13) Order of reboot
      14) Boot on the reference environment
      [Figure: computing node state diagram. Reference Environment: steps 1, 2, 3; reboot: step 4; Minimal-system: steps 5, 6, 7, 8; reboot: step 9; User Environment: steps 10, 11, 12, 13; reboot: step 14]
      [Figure: deployment timetable, one row per step (rows 1 to 14)]
      ◮ Steps 4, 9 and 14 (reboot phases) are the most time consuming
      ◮ Motivation for optimization

  14. 1st Optimization method: nomini
      1) Submission
      2) Attribution / Session opening
      3) Deployment permission controls
      4) Reference environment preparation
      5) Preinstallation
      6) Environment propagation + Decompression on the partition
      7) Postinstallation
      8) Order of reboot
      9) Boot on the new environment
      10) Work on the environment
      11) Session end indication
      12) Deployment permission rights withdrawal / End of session
      13) Order of reboot
      14) Boot on the reference environment
      [Figure: computing node state diagram. Reference Environment: steps 1, 2, 3 and 4, 5, 6, 7, 8; reboot: step 9; User Environment: steps 10, 11, 12, 13; reboot: step 14]
      ◮ 1st reboot elimination: procedure controlled by the reference environment (no minimal-system)
      Constraints:
      ◮ Diffusion mechanism installed on the reference environment
      ◮ Deployment on a different partition than the current root
      Robustness -> guaranteed (same arguments as the default method)

  15. 2nd Optimization method: pivot
      1) Submission
      2) Attribution / Session opening
      3) Deployment permission controls
      4) Reference environment preparation
      5) Preinstallation
      6) Environment propagation + Decompression on the partition
      7) Postinstallation
      8) Reference environment preparation
      9) Change the root file system (user environment) + services launching
      10) Work on the environment
      11) Session end indication
      12) Deployment permission rights withdrawal / End of session
      13) Change the root file system (reference environment) + services launching
      [Figure: computing node state diagram. Reference Environment: steps 1, 2, 3 and 4, 5, 6, 7, 8; pivot: step 9; User Environment: steps 10, 11, 12; un-pivot: step 13]
      ◮ Extension of the nomini method (1st reboot eliminated)
      ◮ 2nd reboot elimination: just change the root filesystem, using the system command pivot_root
      ◮ This is reversible! -> 3rd reboot elimination
      Drawbacks:
      ◮ Same constraints as the 1st optimization method (it is based on it)
      ◮ Cannot change the kernel or the kernel parameters
      Robustness -> unchanged
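The difference between the three methods comes down to which steps reboot the node. The Python sketch below records the reboot steps as listed on the slides and counts them up to a given step. The helper names and the `reboot_seconds` parameter are assumptions for illustration; the model deliberately ignores the (non-zero) cost of changing the root filesystem in the pivot method.

```python
# Steps at which each method reboots the node, as listed on the slides:
# default reboots at steps 4, 9 and 14; nomini drops the first reboot;
# pivot replaces the remaining two with root filesystem changes.
REBOOT_STEPS = {
    "default": (4, 9, 14),
    "nomini": (9, 14),
    "pivot": (),
}


def reboots_until(method, last_step):
    """Count the reboots a method performs up to and including last_step."""
    return sum(1 for step in REBOOT_STEPS[method] if step <= last_step)


def reboot_time_saved(method, reboot_seconds, last_step=9):
    """Rough saving versus the default method, in seconds.

    `reboot_seconds` is a hypothetical per-reboot cost supplied by the
    caller. The default last_step=9 covers session opening up to the
    point where the desired environment is running.
    """
    avoided = reboots_until("default", last_step) - reboots_until(method, last_step)
    return avoided * reboot_seconds
```

For instance, `reboots_until("nomini", 9)` counts one reboot before the user environment is running, against two for the default method and none for pivot.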

  16. Outline
      ◮ Introduction
      ◮ Related Work and a new approach
      ◮ Kadeploy2 Environment deployment tool
      ◮ Performance Evaluations
        ◮ Time to deploy at the cluster level
        ◮ Time to deploy at the grid level
      ◮ Work in Progress
      ◮ Conclusion and Perspectives

  17. Execution time of a deployment
      ◮ Platform: Grid5000, the French nationwide experimental grid:
        ◮ 9 geographically distributed sites
        ◮ Every site hosts 1 to 3 clusters (from 256 CPUs to 1K CPUs)
        ◮ All sites connected by RENATER (French Academic Network) -> 10 Gbit/s (2006)
      ◮ 2 clusters used for our performance measurements:
        ◮ GDX cluster, LRI Laboratory @ Orsay (AMD Opteron biprocessor 2 GHz, 2 GB RAM, Gigabit Ethernet)
        ◮ Sophia cluster, INRIA Laboratory @ Sophia-Antipolis (AMD Opteron biprocessor 2 GHz, 2 GB RAM, Myrinet/Gigabit Ethernet)

  18. Default deployment method
      ◮ Metric: time to reach each of the 5 most time-consuming steps in the deployment procedure (from session opening to boot on the desired environment)
      ◮ GDX cluster (180 nodes)
      [Figure: Kadeploy2 default deployment method on GDX cluster. x-axis: #nodes (0 to 200); y-axis: time (sec, 0 to 500). Curves: reboot/first check, preinstall, environment propagation + copy, postinstall, reboot/last check]
      ◮ Bottom curve: time to boot the minimal system
      ◮ Upper curve: total time to boot the desired environment

  19. Comparison of deployment methods
      [Figure: Kadeploy2 deployment procedure (methods). x-axis: #nodes (0 to 200); y-axis: time (sec, 0 to 500). Curves: default, nomini, pivot]
      ◮ Optimization methods are 70-160 sec faster

  20. Deployment on a lightweight grid of 2 clusters
      ◮ Time diagram of a deployment (default method) on 2 Grid5000 sites using 260 nodes: 180 nodes in site 1 (GDX) and 80 nodes in site 2 (Sophia)
      [Figure: x-axis: time (sec, 0 to 1200); y-axis: # nodes (0 to 300). Curves: deploying, deployed, deploying_site1, deployed_site1, deploying_site2, deployed_site2]
      ◮ Current "boot-to-boot" time at grid level: 450 seconds
