INFSO-RI-508833
Enabling Grids for E-sciencE
www.eu-egee.org
Ganga
a job management and optimising tool for job submission to the grid
Andrew Maier (CERN)
for the Ganga Development team
ISGC 2006, May 2006 - A. Maier (CERN) 2
Overview
People - Sponsors
EGEE through ARDA
– U.Egede (Imperial), K.Harrison (Cambridge), D.Liko (CERN), A.Maier (CERN), J.T.Moscicki (CERN), A.Soroko (Oxford), CL.Tan (Birmingham)
researchers
Motivation (1)
to be overcome:
– Get a certificate
– Write a JDL with the resources needed
– Create a script to run the application on the worker node
– Monitor the progress of your job
– Retrieve your output
want to use:
– local machine - for debugging or short tests
– a local batch system - for small or intermediate datasets
– grid - for intermediate to large scale datasets
Motivation (2)
– For a job running locally - no need for a JDL
– For a job running on a batch system - a shared file system may make it easy to retrieve results
– One may have to monitor the job manually
– The commands to submit a job have different syntax
technicalities:
➔ Factor out these differences: use Ganga
Ganga Overview
[Diagram: Ganga4 overview. Applications and backends shown: scripts, Gaudi, Athena, AtlasPROD, DIAL, DIRAC, LCG2, gLite, pbs, localhost, LSF. Operations shown: prepare, configure; store & retrieve job definition; submit, kill; get output; update status; plus split, merge, monitor.]
Introduction to Ganga
– Ganga is a Python application that helps the user configure, prepare, submit and monitor applications on the local host, a batch system or a grid system
– The goal is to make job submission transparent with respect to the batch system used
Configure once, run anywhere
[Diagram: the User works through GANGA, which targets LSF, DIRAC or Localhost]
Ganga job abstraction (1)
– Specification of the software to be run, including values for configurable parameters
– Specification of batch or Grid system to be used, including resource requirements (minimum memory, maximum CPU, etc)
– Used to specify input and/or output, for example a collection of input files containing event data
Ganga job abstraction (2)
– Full task specification - input Dataset, Application, output Dataset, Backend - and bookkeeping information such as ID and status
– Rule for dividing a Job into subjobs that can be run in parallel; rule may relate to Application parameters and/or to input Dataset
– Rule for merging outputs from subjobs
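The job abstraction described on these two slides can be illustrated with a small data model. This is only a sketch under stated assumptions, not Ganga's own classes: the names `Application`, `Backend`, `Job`, `split_by_input_files` and `merge_outputs`, and all field names, are hypothetical and chosen to mirror the wording of the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Application:
    """Software to be run, including values for configurable parameters."""
    exe: str
    args: List[str] = field(default_factory=list)


@dataclass
class Backend:
    """Batch or Grid system to be used, including resource requirements."""
    name: str                               # hypothetical label, e.g. "Local" or "LSF"
    min_memory_mb: Optional[int] = None
    max_cpu_minutes: Optional[int] = None


@dataclass
class Job:
    """Full task specification plus bookkeeping information (ID, status)."""
    application: Application
    backend: Backend
    inputdata: List[str] = field(default_factory=list)
    outputdata: List[str] = field(default_factory=list)
    id: Optional[int] = None
    status: str = "new"


def split_by_input_files(job: Job, files_per_subjob: int) -> List[Job]:
    """Splitting rule tied to the input dataset: one subjob per chunk of files."""
    if files_per_subjob < 1:
        raise ValueError("files_per_subjob must be at least 1")
    chunks = [job.inputdata[i:i + files_per_subjob]
              for i in range(0, len(job.inputdata), files_per_subjob)]
    return [Job(application=job.application, backend=job.backend, inputdata=chunk)
            for chunk in chunks]


def merge_outputs(subjobs: List[Job]) -> List[str]:
    """Merging rule: collect the outputs of all subjobs, in subjob order."""
    merged: List[str] = []
    for sub in subjobs:
        merged.extend(sub.outputdata)
    return merged
```

A splitting rule could equally act on application parameters rather than on the input dataset; the file-based rule is used here only because it is the simpler one to show.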
Internal architecture
[Diagram: Client, Application Manager, Job Manager, Remote Registry]
into 4 functional components
describe the components in a distributed model.
component so that it could be a separate service.
more components into a single service
Client
(CLIP, GPI, GUI)
through the client
creates, modifies, submits and monitors jobs
registry which can be local (within the client) or remote.
Client
python)
line client or a GUI
application manager to configure applications
via the job manager
persistent information in the registry
Application Manager
application
environment
client on available applications, versions, platforms, etc.
Job Manager
submission backend
to a backend
Creates the starter script and the JDL
Prepares the application-dependent wrapper script, depending on the backend.
E.g., DIRAC knows how to run LHCb applications, with a different setup than LSF.
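As a rough illustration of the Job Manager's role, the sketch below generates a starter (wrapper) script for every backend and a JDL file only for grid-style backends. It is a minimal sketch under assumptions, not Ganga's implementation: the function names, the file names `wrapper.sh` and `job.jdl`, and the backend labels in `BACKENDS_NEEDING_JDL` are hypothetical. The JDL attributes used (`Executable`, `StdOutput`, `StdError`, `InputSandbox`, `OutputSandbox`) are standard ones.

```python
import shlex
from typing import Dict, List

# Hypothetical backend labels; in this sketch only these need a JDL file.
BACKENDS_NEEDING_JDL = {"LCG", "gLite"}


def make_wrapper_script(exe: str, args: List[str]) -> str:
    """Starter script that runs the application on the worker node."""
    command = " ".join(shlex.quote(part) for part in [exe, *args])
    return "#!/bin/sh\n" + command + "\n"


def make_jdl(wrapper_name: str) -> str:
    """Minimal JDL that ships the wrapper script and returns stdout/stderr."""
    lines = [
        f'Executable = "{wrapper_name}";',
        'StdOutput = "stdout";',
        'StdError = "stderr";',
        f'InputSandbox = {{"{wrapper_name}"}};',
        'OutputSandbox = {"stdout", "stderr"};',
    ]
    return "\n".join(lines) + "\n"


def prepare_submission(backend: str, exe: str, args: List[str]) -> Dict[str, str]:
    """Return the files (name -> content) to create for the given backend."""
    files = {"wrapper.sh": make_wrapper_script(exe, args)}
    if backend in BACKENDS_NEEDING_JDL:
        files["job.jdl"] = make_jdl("wrapper.sh")
    return files
```

For example, `prepare_submission("Local", "/bin/echo", ["Hello World"])` yields only the wrapper script, while the same call with `"LCG"` also yields a JDL file. The user-facing job definition is identical in both cases, which is the point of factoring the backend differences out.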
Remote Registry
typically using a database backend
the user jobs
metadata catalogue
Design
Ganga in action
– Job definition
– Job submission
– Job cancellation
– Job monitoring
– Output retrieval
commands
Job definition
constructor
– j = Job( application = Executable(), backend = LCG())
through assignments
– j.application.exe = "/bin/echo"
– j.application.args = [ "Hello World" ]
Job submission
[Sequence diagram between Job Manager, Application Manager and Archivist]
Actions shown:
– Perform application configuration for job
– Perform application configuration for subjobs
– Split job into subjobs
– Create wrapper scripts, submit to backend
– Register subjobs, allocate workspace
– Register subjobs as submitted
Messages shown: job; derived parameters; job and derived parameters; subjobs; subjobs and derived parameters; status
Monitoring and output retrieval
retrieved automatically when job completes
[Diagram: Job Manager monitoring thread, Control and Archivist]
– Determine active jobs for each backend
⇒ For each backend with active jobs: query status of active jobs
⇒ For each job with change of status: report change of status to user, retrieve output if job completed; register new job status
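The monitoring steps listed on this slide can be sketched as a single pass of a polling loop. This is an illustrative sketch only: the registry layout (a dict of job id to a record with `backend` and `status`), the status names in `FINAL_STATES`, and the `query_status` and `retrieve_output` callables are all assumptions standing in for the real backend and Archivist interfaces.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical status names: jobs in one of these are no longer polled.
FINAL_STATES = {"completed", "failed", "killed"}


def monitoring_pass(registry: Dict[int, dict],
                    query_status: Callable[[str, List[int]], Dict[int, str]],
                    retrieve_output: Callable[[int], None]) -> List[int]:
    """Run one monitoring pass and return the ids of jobs whose status changed."""
    # Determine active jobs for each backend.
    active = defaultdict(list)
    for job_id, record in registry.items():
        if record["status"] not in FINAL_STATES:
            active[record["backend"]].append(job_id)

    changed = []
    # For each backend with active jobs: query the status of those jobs.
    for backend, job_ids in active.items():
        reported = query_status(backend, job_ids)
        for job_id in job_ids:
            old = registry[job_id]["status"]
            new = reported.get(job_id, old)
            if new == old:
                continue
            # Report the change to the user and fetch output on completion.
            print(f"job {job_id}: {old} -> {new}")
            if new == "completed":
                retrieve_output(job_id)
            # Register the new job status.
            registry[job_id]["status"] = new
            changed.append(job_id)
    return changed
```

In a real client this pass would run repeatedly in a background thread; grouping jobs by backend keeps it to one status query per backend per pass.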
CLIP: simple job from 1st principles
Job defined in Cambridge Job run in Amsterdam
Running ATLAS jobs on LCG
Ganga GUI
internal plugin architecture
wizard
Ganga GUI (top half)
Logical Job Folder list List of Jobs Job details
Ganga GUI (bottom half)
Ganga/Python prompt Saved scriptlets A scriptlet
Ganga Users - ATLAS
– Dataset concept
– Will be used to distribute data
– DQ2 plugin for Ganga gives access to the catalog

– Short queues to support analysis jobs
– Fast response for users

– gLite Resource Broker
– Bulk submission
ATLAS Future Strategy
version by the end of May
Analysis of ATLAS on OSG
backends
measure the actual performance of the various backends
Ganga Users - LHCb
framework called Gaudi
– Users write code to be dynamically linked against the Gaudi framework
– These shared libraries, the user options and the dataset to be analysed will be sent for execution
– Ganga has a plugin for Gaudi-based applications and helps the user to configure the job
LHCb Analysis Job
In [3]: dv = DaVinci(version='v12r12')
In [4]: print dv
DaVinci {
 version = 'v12r12' ,
 extraopts = None ,
 package = 'Phys' ,
 cmt_user_path = '/afs/cern.ch/user/u/uegede/cmtuser' ,
 masterpackage = None ,
 name = '' } }
Specify extra option file properties appended to the
Specify the options file to be used
Specify the package you are working on
Analysis access to the Grid
submission to LCG
– Instead: Submission to the DIRAC workload management system
– Reduce the knowledge required to submit to the Grid
– Provide transparent access to the LFC file catalogue for reading and writing data
– Allow LHCb to set priorities and/or restrictions for analysis jobs
Other users of Ganga
– Tests the functionality and availability
– new versions are tested against result
– (see Hurng-Chun Lee's presentation yesterday)
– want to use Ganga in autumn, to aid the negotiation of new digital TV frequencies
Summary
– prepares
– submits
– monitors and retrieves results
additional backends or applications
different computing resources (local machine, batch system, grid)
Configure once, run anywhere
Spare
Ganga Tool vs Framework
easy to install (pure-python) “designed and optimized” for users
Plugin model
Promote but not force common GPI abstractions
denominators between systems, example:
– you may implement a very complex application (e.g. ADA) and enable submission to DIAL only, if that's your main use case
all possible backends
But: make it possible to build common tools on top of GPI: GUI, scripts, …
Status of the various backends
– Standard LCG Resource Broker
– Currently available

– Direct submission to CE
– Currently available

– New Resource Broker
– Bulk submission
– Under development

– Production and Analysis on OSG
– First version

– Production and Distributed Analysis on LCG