Ganga: a job management and optimising tool for job submission to the grid



SLIDE 1

INFSO-RI-508833

Enabling Grids for E-sciencE

www.eu-egee.org

Ganga

a job management and optimising tool for job submission to the grid

Andrew Maier (CERN)

for the Ganga Development team

SLIDE 2

ISGC 2006, May 2006 - A. Maier (CERN) 2


Overview

  • People – Sponsors
  • Motivation
  • Ganga introduction
  • Architecture
  • Ganga in action
  • Current users of Ganga
  • Summary
SLIDE 3

People - Sponsors

  • Ganga is an ATLAS/LHCb joint project
  • Development work supported by PPARC through GridPP and by EGEE through ARDA
  • Core team:

– U.Egede (Imperial), K.Harrison (Cambridge), D.Liko (CERN), A.Maier (CERN), J.T.Moscicki (CERN), A.Soroko (Oxford), CL.Tan (Birmingham)

  • Contributions from many others, from summer students to senior researchers
  • Valued contributions from our colleagues at Academia Sinica
SLIDE 4

Motivation (1)

  • To submit a job to the grid a number of “problems” have to be overcome:

– Get a certificate
– Write a JDL with the resources needed
– Create a script to run the application on the worker node
– Monitor the progress of your job
– Retrieve your output

  • The grid may not be the only computing resource you want to use:

– Local machine: for debugging or short tests
– A local batch system: for small or intermediate datasets
– Grid: for intermediate to large scale datasets

SLIDE 5

Motivation (2)

  • Working with these resources can differ, e.g.:

– For a job running locally: no need for a JDL
– For a job running on a batch system: a shared file system may make it easy to retrieve results
– One may have to monitor the job manually
– The commands to submit a job have different syntax

  • As a user you are probably not interested in these technicalities:

➔ Factor out these differences: use Ganga

SLIDE 6

Ganga Overview

[Diagram labels: Ganga4. Applications: scripts, Gaudi, Athena. Backends: AtlasPROD, DIAL, DIRAC, LCG2, gLite, pbs, localhost, LSF. Operations on a Job: prepare, configure; store & retrieve job definition; submit, kill; get output, update status; plus split, merge, monitor.]

SLIDE 7

Introduction to Ganga

  • What is Ganga

– Ganga is a Python application that helps the user configure, prepare, submit and monitor applications on the local host, a batch system or a grid system
– The goal is to make submitting a job independent of the batch system used

Configure once, run anywhere

[Diagram labels: User, GANGA, LSF, DIRAC, Localhost]

SLIDE 8

Ganga job abstraction (1)

  • User tasks are represented in Ganga in terms of a set of building blocks
  • Application

– Specification of the software to be run, including values for configurable parameters

  • Backend

– Specification of batch or Grid system to be used, including resource requirements (minimum memory, maximum CPU, etc)

  • Dataset

– Used to specify input and/or output, for example a collection of input files containing event data

SLIDE 9

Ganga job abstraction (2)

  • Job

– Full task specification - input Dataset, Application, output Dataset, Backend - and bookkeeping information such as ID and status

  • Splitter

– Rule for dividing a Job into subjobs that can be run in parallel; rule may relate to Application parameters and/or to input Dataset

  • Merger

– Rule for merging outputs from subjobs
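
The building blocks above can be pictured as plain data classes. The following is a minimal sketch only: the class and field names are stand-ins chosen for illustration (they are not Ganga's actual classes), and the splitter and merger rules shown are simply one possible choice.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Application:
    """Software to run, with values for its configurable parameters."""
    exe: str
    args: List[str] = field(default_factory=list)


@dataclass
class Backend:
    """Batch or grid system to use, with resource requirements."""
    name: str
    min_memory_mb: Optional[int] = None  # hypothetical requirement field
    max_cpu_s: Optional[int] = None      # hypothetical requirement field


@dataclass
class Dataset:
    """A collection of files used as input or output."""
    files: List[str] = field(default_factory=list)


@dataclass
class Job:
    """Full task specification plus bookkeeping (ID and status)."""
    application: Application
    backend: Backend
    inputdata: Dataset = field(default_factory=Dataset)
    outputdata: Dataset = field(default_factory=Dataset)
    id: int = 0
    status: str = "new"


def split_by_file(job: Job, files_per_subjob: int) -> List[Job]:
    """Splitter rule: one subjob per chunk of the input Dataset."""
    files = job.inputdata.files
    chunks = [files[i:i + files_per_subjob]
              for i in range(0, len(files), files_per_subjob)]
    return [Job(job.application, job.backend, Dataset(chunk), id=n)
            for n, chunk in enumerate(chunks)]


def merge_outputs(subjobs: List[Job]) -> Dataset:
    """Merger rule: concatenate the output file lists of all subjobs."""
    merged = Dataset()
    for sub in subjobs:
        merged.files.extend(sub.outputdata.files)
    return merged
```

Here the splitter works on the input Dataset; a rule based on Application parameters would follow the same pattern, producing one Job per parameter value instead of one per file chunk.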

SLIDE 10

Internal architecture

  • Ganga 4 is decomposed into 4 functional components: Client, Application Manager, Job Manager, Remote Registry
  • These components also describe the components in a distributed model
  • Strategy: design each component so that it could be a separate service
  • But allow two or more components to be combined into a single service

SLIDE 11

Client

  • Runs the Ganga interface (CLIP, GPI, GUI)
  • The user interacts exclusively through the client
  • With the client, the user creates, modifies, submits and monitors jobs
  • Job configuration is kept in a registry which can be local (within the client) or remote

SLIDE 12

Client

  • The client is a thin client (pure Python)
  • The client can be a command line client or a GUI
  • The client interacts with the application manager to configure applications
  • It submits and monitors jobs via the job manager
  • It keeps state by storing persistent information in the registry
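
One way to picture this division of labour is a client that holds references to the three other components and only talks to them through narrow interfaces, so that each could later sit behind a separate service. This is an illustrative sketch under that assumption; all class and method names below are hypothetical and are not taken from the Ganga code base.

```python
from typing import Dict, Protocol


class ApplicationManager(Protocol):
    def configure(self, job: dict) -> dict: ...


class JobManager(Protocol):
    def submit(self, job: dict) -> str: ...


class Registry(Protocol):
    def save(self, job_id: int, job: dict) -> None: ...
    def load(self, job_id: int) -> dict: ...


class LocalApplicationManager:
    """In-process stand-in; a remote service could replace it."""
    def configure(self, job: dict) -> dict:
        return {**job, "configured": True}


class LocalJobManager:
    def submit(self, job: dict) -> str:
        return "submitted" if job.get("configured") else "failed"


class LocalRegistry:
    """Registry kept inside the client process."""
    def __init__(self) -> None:
        self._jobs: Dict[int, dict] = {}

    def save(self, job_id: int, job: dict) -> None:
        self._jobs[job_id] = dict(job)

    def load(self, job_id: int) -> dict:
        return dict(self._jobs[job_id])


class Client:
    """Thin client: delegates all real work to the three components."""
    def __init__(self, apps: ApplicationManager, jobs: JobManager,
                 registry: Registry) -> None:
        self.apps = apps
        self.jobs = jobs
        self.registry = registry

    def submit(self, job_id: int, job: dict) -> str:
        configured = self.apps.configure(job)               # application manager
        configured["status"] = self.jobs.submit(configured)  # job manager
        self.registry.save(job_id, configured)               # persistent state
        return configured["status"]
```

Because the client depends only on the three interfaces, swapping `LocalRegistry` for a remote implementation would not change the client code.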

SLIDE 13

Application Manager

  • Prepares and configures the application
  • Compiles user code
  • Sets up the necessary environment
  • Provides information to the client on available applications, versions, platforms, etc.

SLIDE 14

Job Manager

  • Submits the configured job to the submission backend
  • A submission handler submits a job to a backend

– Creates the starter script and the JDL

  • Performs the monitoring
  • The application runtime handler prepares the application-dependent wrapper script, depending on the backend

– E.g., DIRAC knows how to run LHCb applications with a different setup than LSF
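
The idea of a runtime handler chosen by both application and backend can be sketched as a lookup table keyed on the pair. This is an illustration only: the table, the function name and the script bodies are hypothetical placeholders, since the slides do not show the real setup steps for either backend.

```python
from typing import Dict, Tuple

# (application, backend) -> wrapper script template.
# The script bodies are placeholders; the real setup commands are not
# given in the slides.
WRAPPER_TEMPLATES: Dict[Tuple[str, str], str] = {
    ("Gaudi", "LSF"): (
        "#!/bin/sh\n"
        "# placeholder for LSF-specific application setup\n"
        "{command}\n"
    ),
    ("Gaudi", "DIRAC"): (
        "#!/bin/sh\n"
        "# placeholder for DIRAC-specific application setup\n"
        "{command}\n"
    ),
}


def make_wrapper(application: str, backend: str, command: str) -> str:
    """Return the wrapper script for this application/backend pair."""
    try:
        template = WRAPPER_TEMPLATES[(application, backend)]
    except KeyError:
        raise ValueError(
            f"no runtime handler for {application} on {backend}"
        ) from None
    return template.format(command=command)
```

The same command yields a different wrapper per backend, and a pair with no entry fails early instead of producing a script that cannot run.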

SLIDE 15

Remote Registry

  • Keeps track of jobs
  • Is a “passive” data store, typically using a database backend
  • Keeps a roaming profile of the user jobs
  • Ganga uses the AMGA metadata catalogue
  • Keeps track of the job status
SLIDE 16

Design

SLIDE 17

Ganga in action

  • The key operations for a user running jobs are typically

– Job definition
– Job submission
– Job cancellation
– Job monitoring
– Output retrieval

  • These are performed in Ganga using simple GPI/User commands
  • Technicalities are hidden from the user
SLIDE 18

Job definition

  • A job can be defined in Ganga starting from an instance of the Job class
  • Job properties can be passed as arguments to the constructor

– j = Job( application = Executable(), backend = LCG())

  • Job properties and sub-properties can also be set through assignments

– j.application.exe = "/bin/echo"
– j.application.args = [ "Hello World" ]

SLIDE 19

Job submission

  • User command: job.submit()
  • Outcome: job submitted, split into subjobs, command status returned
  • Beneath the surface:

– Application Manager: perform application configuration for job; perform application configuration for subjobs
– Job Manager: split job into subjobs; create wrapper scripts, submit to backend
– Archivist: register subjobs, allocate workspace; register subjobs as submitted
– Data exchanged: job → derived parameters; job and derived parameters → subjobs; subjobs → derived parameters; subjobs and derived parameters → status
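
The interplay of the three components during submission can be sketched as a single function. This is a rough illustration: the order of the steps is inferred from the diagram labels, and every class, method and status string below is a hypothetical stand-in, not Ganga's implementation.

```python
from typing import Dict, List


class ApplicationManager:
    def configure(self, job: Dict) -> Dict:
        """Return derived parameters for a job or subjob."""
        return {"command": [job["exe"], *job["args"]]}


class Archivist:
    def __init__(self) -> None:
        self.records: Dict[str, str] = {}

    def register(self, subjob_ids: List[str], status: str) -> None:
        for sid in subjob_ids:
            self.records[sid] = status


class JobManager:
    def split(self, job: Dict, n: int) -> List[Dict]:
        """Split a job into n subjobs with ids '<job id>.<index>'."""
        return [{**job, "id": f"{job['id']}.{i}"} for i in range(n)]

    def submit(self, subjob: Dict, params: Dict) -> bool:
        """Stand-in for creating a wrapper script and submitting it."""
        return bool(params["command"])


def submit(job: Dict, n_subjobs: int, apps: ApplicationManager,
           archivist: Archivist, jobs: JobManager) -> str:
    apps.configure(job)                            # configure the job
    subjobs = jobs.split(job, n_subjobs)           # split into subjobs
    params = [apps.configure(s) for s in subjobs]  # configure each subjob
    ids = [s["id"] for s in subjobs]
    archivist.register(ids, "new")                 # register subjobs
    ok = all(jobs.submit(s, p) for s, p in zip(subjobs, params))
    if ok:
        archivist.register(ids, "submitted")       # register as submitted
    return "submitted" if ok else "failed"
```

From the user's point of view all of this stays behind the one `submit()` call; only the returned status is visible.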

SLIDE 20

Monitoring and output retrieval

  • User action: none
  • Outcome: changes in job status reported/updated periodically, output retrieved automatically when job completes
  • Beneath the surface:

– Components shown: Archivist, Job Manager, Control, Monitoring thread
– Determine active jobs for each backend
– For each backend with active jobs: query status of active jobs
– For each job with change of status: report change of status to user, retrieve output if job completed, register new job status
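
One pass of such a monitoring loop might look like the sketch below. It is only an illustration of the steps listed above: the registry layout, the status strings and the `query` and `report` callbacks are assumptions made for the example, and output retrieval is reduced to recording the job id.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def monitoring_pass(
    registry: Dict[str, Dict[str, str]],
    query: Callable[[str, List[str]], Dict[str, str]],
    report: Callable[[str, str, str], None],
) -> List[str]:
    """Run one monitoring pass; return ids whose output was retrieved."""
    # Determine active jobs for each backend.
    active: Dict[str, List[str]] = defaultdict(list)
    for job_id, record in registry.items():
        if record["status"] in ("submitted", "running"):
            active[record["backend"]].append(job_id)

    retrieved: List[str] = []
    # For each backend with active jobs: query the status of those jobs.
    for backend, job_ids in active.items():
        for job_id, new_status in query(backend, job_ids).items():
            old_status = registry[job_id]["status"]
            if new_status == old_status:
                continue
            # For each job with a change of status: report it, retrieve
            # output if the job completed, register the new status.
            report(job_id, old_status, new_status)
            if new_status == "completed":
                retrieved.append(job_id)  # stand-in for output retrieval
            registry[job_id]["status"] = new_status
    return retrieved
```

A monitoring thread would call this function periodically, which is why no user action is needed for status updates or output retrieval.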

SLIDE 21

CLIP: simple job from 1st principles

Job defined in Cambridge; job run in Amsterdam

SLIDE 22

Running ATLAS jobs on LCG

SLIDE 23

Ganga GUI

  • Ganga ships with a GUI
  • Based on PyQt
  • Completely dockable
  • Dynamically built on the internal plugin architecture
  • Includes a job builder wizard

SLIDE 24

Ganga GUI (top half)

[Screenshot labels: logical job folder list, list of jobs, job details]

SLIDE 25

Ganga GUI (bottom half)

[Screenshot labels: Ganga/Python prompt, saved scriptlets, a scriptlet]

SLIDE 26

Ganga Users - ATLAS

  • ATLAS uses the Athena framework for creating jobs
  • ATLAS DDM is being deployed on LCG sites

– Dataset concept
– Will be used to distribute data
– DQ2 plugin for Ganga gives access to the catalog

  • Site configuration

– Short queues to support analysis jobs
– Fast response for users

  • New middleware components

– gLite Resource Broker
– Bulk submission

SLIDE 27

ATLAS Future Strategy

  • Based on these developments Ganga plans for a new version by the end of May
  • In addition Ganga will connect to Panda, the Production and Analysis of ATLAS on OSG
  • ATLAS will use Ganga as frontend to submit jobs to the various backends
  • ATLAS will use Application Monitoring based on MonaLisa to measure the actual performance of the various backends

SLIDE 28

Ganga Users - LHCb

  • Applications in LHCb are based on a common framework called Gaudi

– Users write code to be dynamically linked against the Gaudi framework
– These shared libraries, the user options and the dataset to be analysed will be sent for execution
– Ganga has a plugin for Gaudi-based applications and helps the user to configure the job

SLIDE 29

LHCb Analysis Job

  • Gaudi-based applications:

In [3]: dv = DaVinci(version='v12r12')
In [4]: print dv
DaVinci {
  version = 'v12r12' ,
  extraopts = None ,
  package = 'Phys' ,
  cmt_user_path = '/afs/cern.ch/user/u/uegede/cmtuser' ,
  masterpackage = None ,
  optsfile = File {
    name = '' }
  }

Annotations on the slide:

– Specify extra option file properties appended to the options file
– Specify the options file to be used
– Specify the package you are working on

SLIDE 30

Analysis access to the Grid

  • Analysis jobs: no direct submission to LCG

– Instead: submission to the DIRAC workload management system

  • Advantages:

– Reduce the knowledge required to submit to the Grid
– Provide transparent access to the LFC file catalogue for reading and writing data
– Allow LHCb to set priorities and/or restrictions for analysis jobs

SLIDE 31

Other users of Ganga

  • Grid production testbed

– Tests the functionality and availability of grid sites

  • Geant 4 simulation

– New versions are tested against the results of earlier versions

  • Biomed

– (see Hurng-Chun Lee's presentation yesterday)

  • ITU

– Wants to use Ganga in autumn, to aid the negotiation of new digital TV frequencies

SLIDE 32

Summary

  • Ganga simplifies the submission of jobs to the grid

– Prepares
– Submits
– Monitors and retrieves results

  • Ganga features a plugin architecture that makes it easy to add backends or applications
  • Ganga factors out the technicalities of submitting to different computing resources (local machine, batch system, grid)
  • Demo session of Ganga, tomorrow
  • Ganga motto:

Configure once, run anywhere

SLIDE 33

Spare

  • Spare
SLIDE 34

Ganga Tool vs Framework

  • Ganga is a lightweight user tool

– Easy to install (pure Python)
– “Designed and optimized” for users

  • But also: Ganga is a developer framework

– Plugin model: independent and rapid development of handlers (backends, applications)
– Promote but do not force common GPI abstractions
– We neither require nor invent abstract base classes which are least common denominators between systems. Example: you may implement a very complex application (e.g. ADA) and enable submission to DIAL only, if that is your main use case
– The design of the framework does not attempt to match all possible applications with all possible backends
– But: it enables common tools to be built on top of the GPI: GUI, scripts, …
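
A plugin model of this kind can be sketched with a small registry in which each handler announces the one application/backend pair it supports. The decorator, the registry and the handler bodies below are hypothetical and only illustrate the principle; they are not Ganga's plugin API.

```python
from typing import Callable, Dict, List, Tuple

# Registry of handlers, keyed by (application, backend).
_PLUGINS: Dict[Tuple[str, str], Callable[[dict], str]] = {}


def handler(application: str, backend: str):
    """Decorator that registers a handler for one supported pair."""
    def register(func: Callable[[dict], str]) -> Callable[[dict], str]:
        _PLUGINS[(application, backend)] = func
        return func
    return register


@handler("ADA", "DIAL")
def ada_on_dial(job: dict) -> str:
    # A complex application may support a single backend only.
    return f"ADA job {job['name']} prepared for DIAL"


@handler("Executable", "Local")
def executable_on_local(job: dict) -> str:
    return f"Executable job {job['name']} prepared for Local"


@handler("Executable", "LSF")
def executable_on_lsf(job: dict) -> str:
    return f"Executable job {job['name']} prepared for LSF"


def supported_backends(application: str) -> List[str]:
    """List the backends for which this application has a handler."""
    return sorted(b for (a, b) in _PLUGINS if a == application)


def prepare(application: str, backend: str, job: dict) -> str:
    """Dispatch to the registered handler, or fail for unsupported pairs."""
    try:
        func = _PLUGINS[(application, backend)]
    except KeyError:
        raise ValueError(f"{application} is not supported on {backend}") from None
    return func(job)
```

Tools built on top, such as a GUI, can call `supported_backends` to offer only the combinations that actually exist, without any handler having to implement a common base class.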

SLIDE 35

Status of the various backends

  • LCG

– Standard LCG Resource Broker
– Currently available

  • CondorG

– Direct submission to CE
– Currently available

  • gLite

– New Resource Broker
– Bulk submission
– Under development

  • PANDA

– Production and Analysis on OSG
– First version

  • Prodsys

– Production and Distributed Analysis on LCG