22nd April 2005 BaBar Distributed Computing - S. J. Gowdy 1
BaBar Distributed Computing
Stephen J. Gowdy SLAC Super B-Factory Workshop 22nd April 2005
Overview: Foundations, Tier-A Sites, Data Distribution
– Allows code to be seen anywhere in the world
– Configuration file at each site:
– Location of external software, compilers
– Server names (Objectivity lock servers, etc.)
– More site customisation here (these modifications are not in CVS)
– Installed at sites as $BFROOT/bin
– importrel imports a release
– By default it imports all architectures; importarch can be used to import only selected platforms (in which case importrel would be told not to import any)
– Once local, run “gmake siteinstall” to reconfigure the release for the local site so it works as it would at SLAC
logical file name (LFN)
/store/PR/R14/AllEvents/0004/02/14.3.1a/AllEvents_00040228_14.3.1aV00
/store/PR/R14/AllEvents/0004/02/14.3.1a/AllEvents_00040228_14.3.1aV00.01.root
– Site-specific configuration file: $BFROOT/kanga/config/KanAccess.cfg
[yakut06] ~/reldirs/tstanalysis-24/workdir > KanAccess /store/PR/R14/AllEvents/0004/02/14.3.1a/AllEvents_00040228_14.3.1aV00.01.root
root://kanolb-a:1094///store/PR/R14/AllEvents/0004/02/14.3.1a/AllEvents_00040228_14.3.1aV00.01.root
– This one uses xrootd for access
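The translation shown above is simple to sketch: the site configuration supplies an access prefix (here, an xrootd redirector), and the physical URL is just that prefix prepended to the LFN. The prefix value and function name below are illustrative assumptions, not the actual KanAccess implementation.

```python
# Minimal sketch of LFN-to-URL translation as in the session above:
# prepend a site-specific access prefix to the logical file name.
# The prefix would normally come from KanAccess.cfg; this value is assumed.
SITE_PREFIX = "root://kanolb-a:1094//"

def lfn_to_url(lfn: str, prefix: str = SITE_PREFIX) -> str:
    """Turn a logical file name into a site-local physical URL."""
    return prefix + lfn

print(lfn_to_url("/store/PR/R14/AllEvents/0004/02/14.3.1a/"
                 "AllEvents_00040228_14.3.1aV00.01.root"))
```

A different site would only change the prefix; user code sees the same LFN everywhere.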
– Resilient against many failure modes
– Very little overhead to disk IO
– Now part of ROOT distribution
From 15th April 2005
– Generally two classes of files from production:
– Header
– User Data (ntuple-like information associated with particles)
– (B)Tag Information (event level information)
– Candidates (physics level reconstructed objects)
– Analysis Object Data (AOD, detector level information)
– Truth (if MC data)
– Event Summary Data (ESD)
/store/PR/R14/AllEvents/0004/02/14.3.1a/AllEvents_00040228_14.3.1aV00.02E.root
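The path layout is regular enough to pick apart mechanically. The field names in the sketch below (run, version, volume, component) are inferred from the two example paths on this page and are assumptions, not the official BaBar naming specification.

```python
import re

# Sketch of decomposing an LFN into its visible fields. Field meanings are
# inferred from the examples (e.g. the trailing letter is assumed to flag
# the component, such as "E" for ESD) and may not match BaBar's definitions.
LFN_RE = re.compile(
    r"^/store/(?P<kind>\w+)/(?P<series>\w+)/(?P<stream>\w+)/"
    r"\d+/\d+/(?P<release>[\w.]+)/"
    r"\w+_(?P<run>\d+)_[\w.]+"
    r"V(?P<version>\d+)\.(?P<volume>\d+)(?P<component>[A-Z]?)\.root$"
)

def parse_lfn(lfn: str):
    """Return the LFN's fields as a dict, or None if it does not match."""
    m = LFN_RE.match(lfn)
    return m.groupdict() if m else None

info = parse_lfn("/store/PR/R14/AllEvents/0004/02/14.3.1a/"
                 "AllEvents_00040228_14.3.1aV00.02E.root")
```

With this layout, the ESD file for an event collection differs from the micro file only in the volume/component suffix.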
Monte Carlo
– Currently have 189 skims defined
– Each skim can decide to be only a pointer, a deep copy of the micro, or a deep copy of the micro and mini collections (not available at all sites)
– Analysis runs on skims
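The three materialisation choices trade disk for self-containedness. The sketch below illustrates the trade-off; the per-event sizes are invented for illustration, and only the relative ordering (pointer ≪ micro < micro+mini) reflects the text.

```python
# Illustration of the three skim materialisation modes described above.
# All sizes are assumed values, not measured BaBar numbers.
POINTER_KB = 0.1   # assumed: a pointer skim stores only references
MICRO_KB = 2.0     # assumed micro size per event
MINI_KB = 6.0      # assumed mini size per event

def skim_disk_kb(n_events: int, mode: str) -> float:
    """Approximate disk cost of one skim in each mode."""
    cost = {"pointer": POINTER_KB,
            "deep-micro": MICRO_KB,
            "deep-micro-mini": MICRO_KB + MINI_KB}
    return n_events * cost[mode]
```

A pointer skim is cheap but only usable where the parent collections are on disk, which is why the deep-copy modes exist for sites that do not hold everything.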
– Supports Oracle and MySQL
– Analysis performed on datasets
– Example datasets are:
– Site access to databases
– SLAC (Prompt Calibration, analysis, simulation, skimming)
– CC-IN2P3, France (analysis, simulation)
– RAL (analysis, simulation)
– Padova (Event Reconstruction, skimming, simulation)
– GridKa, Germany (analysis, skimming, simulation)
– CNAF, Italy (analysis)
– Expertise and needed level of resources
– Based on their resources contributed (50% get distributed to all other countries)
– Actual usage reported every six months to the International Finance Committee (funding agencies)
– Do distribution
– Sites can choose to import certain datasets
– Sites can have a local database to remember which files have been imported
– Bookkeeping tools warn users if they do not have all of the data locally
– Could set up other Tier-A sites for export
– Cluster of servers
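The local import-tracking idea is straightforward to sketch: record each imported file in a small site database, then report which files of a dataset are still missing locally. The schema and function names below are invented for illustration; they are not BaBar's actual bookkeeping code.

```python
import sqlite3

# Sketch of a per-site import-tracking database, as described above.
# Schema and API are assumed, not BaBar's real tools.
def make_import_db():
    """Create an in-memory database of imported files."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE imported (lfn TEXT PRIMARY KEY)")
    return db

def record_import(db, lfn):
    """Remember that a file has been imported to this site."""
    db.execute("INSERT OR IGNORE INTO imported (lfn) VALUES (?)", (lfn,))

def missing_locally(db, dataset_lfns):
    """Files of a dataset that have not been imported at this site."""
    have = {row[0] for row in db.execute("SELECT lfn FROM imported")}
    return [lfn for lfn in dataset_lfns if lfn not in have]
```

A bookkeeping tool can run the missing-files query before a job starts and warn the user, which is the behaviour the bullet above describes.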
Tier-A sites based on Analysis Working Groups
– Each AWG has a set of skims associated with it
– All the skims for an AWG are put at one site
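The placement rule amounts to a two-level lookup: skim → AWG → site. The group names and site assignments below are invented for illustration (the sites are real Tier-A sites from the list above, but the assignments are not claimed to be the actual ones).

```python
# Sketch of the AWG-based placement rule: all of a group's skims live at
# that group's single assigned Tier-A site. Assignments here are assumed.
AWG_SITE = {"Charm": "CC-IN2P3", "BReco": "RAL"}
SKIM_AWG = {"SkimA": "Charm", "SkimB": "Charm", "SkimC": "BReco"}

def site_for_skim(skim: str) -> str:
    """Site hosting a skim = site assigned to the skim's AWG."""
    return AWG_SITE[SKIM_AWG[skim]]
```

The practical effect is that a physicist in one AWG can do all their analysis at a single site rather than chasing skims across the collaboration.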
– Users must learn about the computing system when using a specific site (that they may not have used before)
– In the future the “Grid” is forecast to solve this
[yakut06] ~ > BbkDatasetTcl -l '*BSemiExcl-Run4-*R16a'
BbkDatasetTcl: 7 datasets found:-
BSemiExcl-Run4-OffPeak-R16a
BSemiExcl-Run4-OnPeak-R16a
SP-1005-BSemiExcl-Run4-R16a
SP-1235-BSemiExcl-Run4-R16a
SP-1237-BSemiExcl-Run4-R16a
SP-3429-BSemiExcl-Run4-R16a
SP-998-BSemiExcl-Run4-R16a
[yakut06] ~ > BbkDatasetTcl 'BSemiExcl-Run4-*R16a'
BbkDatasetTcl: wrote BSemiExcl-Run4-OffPeak-R16a.tcl (7 collections, 1300477/132941301 events, ~9990.6/pb)
BbkDatasetTcl: wrote BSemiExcl-Run4-OnPeak-R16a.tcl (73 collections, 22851621/1448776065 events, ~99532.6/pb)
Selected 80 collections, 24152098/1581717366 events, ~109523.1/pb, from 2 datasets
– Extra disk space originally made available for CM2 conversion, ~80 TB to be freed of old Kanga+Objy
– SLAC CPU time is a mix of dedicated and batch use
[Plots: Disk space; Batch time + Dedicated CPU]
– Note: IN2P3 uses dynamic staging system (HPSS)
– Batch utilization has come back strong after decline last summer
[Plots: Disk space; Batch time]
– Actual disk space slightly exceeding 2004 MOU
– Batch use peaked in October; recent drop is the effect of transitioning away from old Kanga
[Plots: Disk space; Batch time]
– Disk already above 2004 MOU, including CNAF
– Dedicated CPU reached 2004 MOU in Dec 2004; analysis started to add to that
[Plots: Disk space; Dedicated CPU + Batch time]
– Disk space reached MOU in mid-2004
– CPU usage continued positive trend of 1st half of 2004
– With analysis use, has peaked above MOU level
[Plots: Disk space; Batch time]