

  1. BaBar Distributed Computing
     Stephen J. Gowdy, SLAC
     Super B-Factory Workshop, 22nd April 2005

  2. Overview
     ● Foundations
     ● Tier-A Sites
     ● Data Distribution

  3. Software Distribution
     ● All source is in a CVS repository in AFS
       – Allows code to be seen anywhere in the world
     ● SoftRelTools/SiteConfig used to configure each site
       – Location of external software, compilers
       – Server names (Objectivity lock servers, etc.)
     ● UserLogin package to set up the environment
       – More site customisation here (these modifications are not in CVS)
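The per-site settings above (external software locations, server names) amount to a keyed lookup by site. A minimal sketch of that idea, assuming entirely invented keys and values; the real SoftRelTools/SiteConfig is a package inside the BaBar release, not a Python module:

```python
# Hypothetical sketch of a SiteConfig-style per-site lookup.
# All site names, keys and values below are invented for illustration.
SITE_CONFIG = {
    "SLAC": {
        "objy_lockserver": "objylock.slac.stanford.edu",   # hypothetical
        "external_sw": "/afs/slac/package/babar/external",  # hypothetical
    },
    "IN2P3": {
        "objy_lockserver": "objylock.in2p3.fr",             # hypothetical
        "external_sw": "/sps/babar/external",               # hypothetical
    },
}

def site_setting(site: str, key: str) -> str:
    """Return a site-specific setting, e.g. the Objectivity lock server."""
    return SITE_CONFIG[site][key]

print(site_setting("SLAC", "objy_lockserver"))
```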

  4. Software Distribution (Cont.)
     ● "bin" package contains bootstrapping scripts
       – Installed at sites as $BFROOT/bin
     ● importrel is used to import a BaBar software release
       – By default it imports all architectures; importarch can be used to import only selected platforms (telling importrel not to import any itself)
       – Once local, run "gmake siteinstall" to reconfigure the release for the local site
     ● Applications should then run as they would at SLAC

  5. Eventstore
     ● Collection names are trivially mapped to the first logical file name (LFN)
       /store/PR/R14/AllEvents/0004/02/14.3.1a/AllEvents_00040228_14.3.1aV00
       /store/PR/R14/AllEvents/0004/02/14.3.1a/AllEvents_00040228_14.3.1aV00.01.root
     ● Mapping from LFN to physical file name via a site-specific configuration file
       – $BFROOT/kanga/config/KanAccess.cfg
       [yakut06] ~/reldirs/tstanalysis-24/workdir > KanAccess /store/PR/R14/AllEvents/0004/02/14.3.1a/AllEvents_00040228_14.3.1aV00.01.root
       root://kanolb-a:1094///store/PR/R14/AllEvents/0004/02/14.3.1a/AllEvents_00040228_14.3.1aV00.01.root
       – This site uses xrootd for access
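The KanAccess example above is a prefix translation: the site's access-protocol prefix is prepended to the LFN. A sketch of just that step, assuming a single hard-coded xrootd prefix taken from the example (the real mapping is driven by KanAccess.cfg and can be more elaborate):

```python
# Illustrative sketch of the LFN -> PFN translation KanAccess performs.
# The prefix is copied from the slide's example; real sites configure
# their own rules in $BFROOT/kanga/config/KanAccess.cfg.
XROOTD_PREFIX = "root://kanolb-a:1094//"

def lfn_to_pfn(lfn: str) -> str:
    """Prepend the site's access prefix to a logical file name."""
    return XROOTD_PREFIX + lfn

lfn = ("/store/PR/R14/AllEvents/0004/02/14.3.1a/"
       "AllEvents_00040228_14.3.1aV00.01.root")
print(lfn_to_pfn(lfn))
```

Run against the slide's LFN this reproduces the `root://kanolb-a:1094///store/...` physical name shown above.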

  6. Eventstore (Cont.)
     ● xrootd is used for production data access
       – Resilient against many failure modes
       – Very little overhead over plain disk IO
       – Now part of the ROOT distribution (from 15th April 2005)
     ● Latest versions at http://xrootd.slac.stanford.edu

  7. Eventstore (Cont.)
     ● Collections are made up of different components
       – Generally two classes of files from production
     ● Micro
       – Header
       – User Data (ntuple-like information associated with particles)
       – (B)Tag information (event-level information)
       – Candidates (physics-level reconstructed objects)
       – Analysis Object Data (AOD, detector-level information)
       – Truth (if MC data)
     ● Mini
       – Event Summary Data (ESD)
     ● (A third class contains RAW and simulation data)
     ● File names let you know what is in them
       /store/PR/R14/AllEvents/0004/02/14.3.1a/AllEvents_00040228_14.3.1aV00.02E.root
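The file name at the end of the slide packs several fields together. A hedged sketch that pulls it apart; the field names below (skim, block, file index, component letter) are assumptions read off the slide's two examples, not an official naming specification:

```python
import re

# Hedged guess at the structure of a Kanga event-store file name,
# based only on the examples AllEvents_00040228_14.3.1aV00.01.root
# and AllEvents_00040228_14.3.1aV00.02E.root shown on the slides.
NAME_RE = re.compile(
    r"(?P<skim>\w+)_(?P<block>\d+)_(?P<release>[\d.]+\w*V\d+)"
    r"\.(?P<index>\d+)(?P<component>[A-Z]?)\.root$"
)

m = NAME_RE.search("AllEvents_00040228_14.3.1aV00.02E.root")
print(m.group("skim"), m.group("release"), m.group("component"))
```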

  8. Eventstore (Cont.)
     ● Production skimming is done on data and Monte Carlo
       – Currently 189 skims are defined
     ● Skims vary a great deal in selection rate (from a small fraction of a percent to ~10%)
     ● Each skim can decide to be only a pointer skim, to deep-copy the micro, or to deep-copy the micro and mini
       – All include the Tag, Candidates and User data
       – Pointer skims require the underlying production collections (not available at all sites)
       – Deep-copy skims are expected to perform better
     ● Analysis runs on skims
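The three skim output styles above differ in which components are physically copied. A toy sketch, assuming the component lists implied by the slides ("all include the Tag, Candidates and User data"); the data structure itself is invented for illustration:

```python
# Hedged sketch of the three skim output styles described above.
# The exact component sets per style are an assumption for illustration.
COMMON = ["Tag", "Candidates", "UserData"]  # included by every skim

def skim_components(style: str) -> list:
    """Components physically copied into a skim of the given style."""
    if style == "pointer":          # only points at production files
        return list(COMMON)
    if style == "deep-micro":       # deep-copies the whole micro
        return COMMON + ["Header", "AOD"]
    if style == "deep-micro-mini":  # deep-copies micro and mini (ESD)
        return COMMON + ["Header", "AOD", "ESD"]
    raise ValueError("unknown skim style: " + style)

print(skim_components("deep-micro-mini"))
```

A pointer skim keeps storage small but, as the slide notes, only works at sites that also hold the underlying production collections.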

  9. Bookkeeping
     ● RDBMS-based system
       – Supports Oracle and MySQL
     ● Knows about collections (Data Set Entity)
     ● Groups collections into datasets
       – Analysis is performed on datasets
       – Example datasets:
         ● AllEvents-Run5-OnPeak-R18 (data)
         ● SP-998-Run4 (MC)
     ● Tool to mirror databases to different sites
     ● A key distribution system allows off-site access to the databases
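The grouping the Bookkeeping system maintains is a one-to-many map from dataset to collections. A minimal sketch of that relation; the dataset name comes from the slide, the collection names are invented placeholders, and the real system lives in an RDBMS, not a Python dict:

```python
from collections import defaultdict

# Sketch of the dataset -> collections grouping; collection names
# below are hypothetical placeholders, not real BaBar collections.
datasets = defaultdict(list)

def add_collection(dataset: str, collection: str) -> None:
    """Register a collection as belonging to a dataset."""
    datasets[dataset].append(collection)

add_collection("SP-998-Run4", "/store/SP/998/collection-a")  # hypothetical
add_collection("SP-998-Run4", "/store/SP/998/collection-b")  # hypothetical

print(len(datasets["SP-998-Run4"]))
```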

  10. Simulation Production

  11. Tier-A Sites
     ● Currently have 6 Tier-A sites
       – SLAC (Prompt Calibration, analysis, simulation, skimming)
       – CC-IN2P3, France (analysis, simulation)
       – RAL (analysis, simulation)
       – Padova (Event Reconstruction, skimming, simulation)
       – GridKa, Germany (analysis, skimming, simulation)
       – CNAF, Italy (analysis)

  12. Tier-A Sites (Cont.)
     ● Tasks at each Tier-A site are based on local expertise and the level of resources needed
     ● Countries receive a Common Fund rebate based on the resources they contribute (50% of the cost saving at SLAC; the other 50% is distributed to all other countries)
       – Actual usage is reported every six months to the International Finance Committee (the funding agencies)
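The rebate rule above is simple arithmetic: half the cost saving at SLAC goes back to the contributing country, half is spread over everyone else. A worked example with an invented saving figure:

```python
# Worked example of the 50/50 Common Fund rebate split described above.
# The cost-saving amount is invented purely for illustration.
def rebate_split(cost_saving: float) -> tuple:
    """Return (contributing country's share, share spread over others)."""
    contributor = 0.5 * cost_saving
    others = 0.5 * cost_saving
    return contributor, others

contributor, others = rebate_split(100_000.0)  # hypothetical saving
print(contributor, others)
```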

  13. Data Distribution
     ● The primary method uses the Bookkeeping tools to do distribution
       – Sites can choose to import certain datasets
         ● Perhaps only the AOD, or the full AOD & ESD
       – A site can keep a local database to remember which files have been imported
       – The Bookkeeping tools warn users if they do not have all of the data locally
     ● All data import and export is via SLAC
       – Other Tier-A sites could be set up for export
       – Served by a cluster of servers
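The "warn users if they do not have all of the data locally" check above is, at heart, a set difference between what a dataset needs and what the local import database records. A sketch under that assumption; file names are invented, and the real tools query an RDBMS rather than Python sets:

```python
# Sketch of the local-completeness check the Bookkeeping tools perform.
# File names are hypothetical placeholders.
def missing_files(dataset_files: set, imported: set) -> set:
    """Files of a dataset not yet imported at this site."""
    return dataset_files - imported

needed = {"f1.root", "f2.root", "f3.root"}   # what the dataset requires
local = {"f1.root", "f3.root"}               # what the site has imported

gap = missing_files(needed, local)
if gap:
    print("warning: missing locally:", sorted(gap))
```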

  14. Data Distribution (Cont.)
     ● Recently decided to allocate datasets to Tier-A sites based on Analysis Working Groups (AWGs)
       – Each AWG has a set of skims associated with it
       – All the skims for an AWG are placed at one site

  15. Summary
     ● BaBar has a very productive Distributed Computing system
     ● For analysis, users have the inconvenience of having to use a specific site (one they may not have used before)
       – In the future the "Grid" is forecast to solve this

  16. Backup Slides

  17. BbkDatasetTcl
      [yakut06] ~ > BbkDatasetTcl -l '*BSemiExcl-Run4-*R16a'
      BbkDatasetTcl: 7 datasets found:-
      BSemiExcl-Run4-OffPeak-R16a
      BSemiExcl-Run4-OnPeak-R16a
      SP-1005-BSemiExcl-Run4-R16a
      SP-1235-BSemiExcl-Run4-R16a
      SP-1237-BSemiExcl-Run4-R16a
      SP-3429-BSemiExcl-Run4-R16a
      SP-998-BSemiExcl-Run4-R16a
      [yakut06] ~ > BbkDatasetTcl 'BSemiExcl-Run4-*R16a'
      BbkDatasetTcl: wrote BSemiExcl-Run4-OffPeak-R16a.tcl (7 collections, 1300477/132941301 events, ~9990.6/pb)
      BbkDatasetTcl: wrote BSemiExcl-Run4-OnPeak-R16a.tcl (73 collections, 22851621/1448776065 events, ~99532.6/pb)
      Selected 80 collections, 24152098/1581717366 events, ~109523.1/pb, from 2 datasets
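The patterns in the session above behave like shell globs, which is why the leading-`*` query matches all seven dataset names while the anchored query matches only two. Python's `fnmatch` reproduces both queries against the names listed (assuming glob semantics; the real matching happens inside BbkDatasetTcl's database query):

```python
from fnmatch import fnmatchcase

# Dataset names exactly as listed by "BbkDatasetTcl -l" above.
names = [
    "BSemiExcl-Run4-OffPeak-R16a",
    "BSemiExcl-Run4-OnPeak-R16a",
    "SP-1005-BSemiExcl-Run4-R16a",
    "SP-1235-BSemiExcl-Run4-R16a",
    "SP-1237-BSemiExcl-Run4-R16a",
    "SP-3429-BSemiExcl-Run4-R16a",
    "SP-998-BSemiExcl-Run4-R16a",
]

# First query: leading * also matches the SP-* simulation datasets.
listed = [n for n in names if fnmatchcase(n, "*BSemiExcl-Run4-*R16a")]
# Second query: anchored at the start, so only the two data datasets match.
selected = [n for n in names if fnmatchcase(n, "BSemiExcl-Run4-*R16a")]

print(len(listed), len(selected))  # prints: 7 2
```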

  18. SLAC Usage
      [Plots: disk space; batch time + dedicated CPU]
      – Extra disk space originally made available for the CM2 conversion; ~80 TB to be freed of old Kanga+Objy
      – SLAC CPU time is a mix of dedicated and batch use

  19. IN2P3 Usage
      [Plots: disk space; batch time]
      – Note: IN2P3 uses a dynamic staging system (HPSS)
      – Batch utilisation has come back strongly after a decline last summer

  20. RAL Usage
      [Plots: disk space; batch time]
      – Actual disk space slightly exceeds the 2004 MOU
      – Batch use peaked in October; the recent drop is the effect of transitioning away from old Kanga

  21. INFN Usage
      [Plots: disk space; dedicated CPU + batch time]
      – Disk already above the 2004 MOU, including CNAF
      – Dedicated CPU reached the 2004 MOU in December 2004; analysis has started to add to that

  22. GridKa Usage
      [Plots: disk space; batch time]
      – Disk space reached the MOU in mid-2004
      – CPU usage continued the positive trend of the first half of 2004
      – With analysis use, it has peaked above the MOU level
