gLite on Grid’5000: towards a real-size testbed for production grids



SLIDE 1

gLite on Grid’5000: towards a real-size testbed for production grids

Sébastien Badia and Lucas Nussbaum

Partially funded by the Simglite project (call for proposals "Interfaces Recherche en grilles – Grilles de production", Institut des Grilles du CNRS and Action Aladdin INRIA)

SLIDE 2

Goal

◮ Use Grid’5000 as a testbed for gLite

◮ Use cases: developers of gLite components, and of applications interacting with the gLite middleware

◮ Be able to run experiments in a stable environment (no variation between experiments), so that results can be compared

◮ Be able to create experimental conditions required by an experiment, possibly hard to meet in a production environment (e.g. a service crash)

◮ Be able to replace components of the infrastructure, to test new versions and interoperability

◮ Avoid overloading or influencing the production infrastructure with test jobs

SLIDE 3

Grid’5000

◮ Experimental platform for research on distributed systems and high performance parallel computing

◮ 1700 nodes (7000 cores), 10 sites in France

◮ Reconfigurable by users: operating system on nodes can be replaced using Kadeploy, network isolation with KaVLAN

SLIDE 4

Deployed gLite infrastructure

◮ One VO and its VOMS (Virtual Organization Membership Service), the users directory

◮ Several sites, each composed of:

  1. One BDII (Berkeley Database Information Index), directory of resources available on each site

  2. One CE (Computing Element), task submission service for a given computing site

  3. Worker nodes and a batch scheduler to access them. Torque/Maui was used

  4. One UI (User Interface), used by users to access the resources
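
As an illustration of the site layout listed above, the following Ruby sketch splits the nodes reserved for one site into gLite roles. The `assign_roles` helper, the node names, and the policy of giving the first three nodes to the BDII, CE and UI are assumptions made for this example; they are not taken from the deployment scripts.

```ruby
# Hypothetical helper: split the nodes of one site into gLite roles.
# Assumed policy (not from the slides): the first three nodes host the
# BDII, the CE and the UI; every remaining node becomes a worker node.
def assign_roles(nodes)
  raise ArgumentError, 'a site needs at least 4 nodes' if nodes.size < 4

  bdii, ce, ui, *workers = nodes
  { bdii: bdii, ce: ce, ui: ui, workers: workers }
end

# Example with made-up node names.
site = assign_roles(%w[node-1 node-2 node-3 node-4 node-5])
puts "BDII: #{site[:bdii]}, CE: #{site[:ce]}, UI: #{site[:ui]}"
puts "Workers: #{site[:workers].join(', ')}"
```

Keeping the service nodes separate from the worker nodes mirrors the list above, where the batch scheduler is the only path to the workers.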

SLIDE 5

Tools developed

◮ Scientific Linux 5.5 image, minimal and generic (working on all Grid’5000 clusters) for the Kadeploy deployment tool

◮ Ruby scripts enabling an automated installation of gLite from RPM repositories

◮ Description of the platform to deploy (VO, sites, clusters) in a configuration file

◮ Creation of a certification authority to generate and automatically sign user and machine certificates

◮ Pre-filling of the RPM cache on nodes using Kadeploy to accelerate deployment

https://github.com/sbadia/gdeploy/
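
To make the idea of a platform description more concrete, here is a minimal Ruby sketch that parses such a description from YAML. The key names (`vo`, `sites`, `clusters`), the values and the `load_platform` helper are all hypothetical; the actual file format used by the scripts is not shown in these slides.

```ruby
require 'yaml'

# Hypothetical platform description; the keys below are illustrative
# and do not reproduce the format used by the deployment scripts.
PLATFORM_YAML = <<~YAML
  vo: testvo
  sites:
    - name: site-a
      clusters: [cluster-1, cluster-2]
    - name: site-b
      clusters: [cluster-3]
YAML

# Parse the description and check that each site lists at least one cluster.
def load_platform(text)
  platform = YAML.safe_load(text)
  platform.fetch('sites').each do |site|
    if site.fetch('clusters').empty?
      raise ArgumentError, "site #{site['name']} has no cluster"
    end
  end
  platform
end

platform = load_platform(PLATFORM_YAML)
cluster_count = platform['sites'].sum { |s| s['clusters'].size }
puts "VO #{platform['vo']}: #{platform['sites'].size} sites, #{cluster_count} clusters"
```

Validating the description before touching any node is a cheap way to catch mistakes early, since a full deployment takes a long time.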

SLIDE 6

Deployment process

◮ Update SL and add gLite RPM repositories on all nodes

◮ Create CA

◮ Create VO and configure VOMS

◮ On each site: configure BDII, configure worker nodes and batch (on each cluster), configure CE, configure UI
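
The process above can be sketched as a function that expands a platform description into an ordered list of steps. This is only an illustration: the hash layout, the step wording and the ordering of the per-site steps are assumptions, and the real scripts may run some of these steps in parallel.

```ruby
# Build an ordered list of deployment steps from a platform description.
# The hash layout and the per-site ordering are assumptions made for
# illustration; they are not taken from the deployment scripts.
def deployment_plan(platform)
  steps = [
    'update SL and add gLite RPM repositories on all nodes',
    'create CA',
    "create VO #{platform[:vo]} and configure VOMS"
  ]
  platform[:sites].each do |site|
    steps << "#{site[:name]}: configure BDII"
    site[:clusters].each do |cluster|
      steps << "#{site[:name]}/#{cluster}: configure worker nodes and batch"
    end
    steps << "#{site[:name]}: configure CE"
    steps << "#{site[:name]}: configure UI"
  end
  steps
end

# Example with made-up VO, site and cluster names.
example = {
  vo: 'testvo',
  sites: [
    { name: 'site-a', clusters: %w[cluster-1 cluster-2] },
    { name: 'site-b', clusters: %w[cluster-3] }
  ]
}
deployment_plan(example).each_with_index do |step, i|
  puts "#{i + 1}. #{step}"
end
```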

SLIDE 7

Results

Use of Grid’5000 to deploy the gLite middleware

◮ Deployment on up to 926 nodes (17 clusters, 9 sites)

◮ Installation of machines with Scientific Linux 5.5 using Kadeploy: 10 minutes

◮ Configuration of gLite with one VO on 597 nodes (6 sites, 10 clusters): 170 minutes
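
For a rough sense of scale, the figures above can be turned into an amortized configuration time per node. The `amortized_seconds` helper is introduced here for illustration only. The result is an average over the whole run, not a measured per-node duration, since nothing here says how much of the configuration ran in parallel.

```ruby
# Average time per node, in seconds, for a run of the given length.
# This is an amortized figure, not a measured per-node duration.
def amortized_seconds(total_minutes, node_count)
  total_minutes * 60.0 / node_count
end

# Figures quoted above: 170 minutes to configure gLite on 597 nodes.
puts format('%.1f s per node on average', amortized_seconds(170, 597))
```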

SLIDE 8

Future work

◮ Improvements to the deployment script
  ◮ Deployment of several VOs
  ◮ Deployment of other gLite services: storage, monitoring

◮ Collaborations
  ◮ Experiments on evolution of gLite components
  ◮ Experiments on tools interacting with the gLite middleware: workflow engines, pilot job managers, etc.
  ◮ Simulation of service crashes
  ◮ Load injection
  ◮ Submission of a large number of fake tasks