

SLIDE 1

ATLAS Production System in ATLAS Data Challenge 2

Luc Goossens (CERN/EP/ATC), Kaushik De (UTA)

27 September 2004, CHEP 2004

SLIDE 2

  • in this talk

– introduction – terminology and conceptual model – architecture and components – experience so far – conclusions and outlook

SLIDE 3

  • introduction

– ATLAS decided to undertake a series of Data Challenges (DC) in order to validate its Computing Model, its software, and its data model
– DC2 started in July 2004 and introduced the new ATLAS Production System (prodsys):

  • unsupervised production across many sites spread over three different Grids (US Grid3, NorduGrid, LCG-2)
  • 4 major components:

    – production supervisor
    – executor
      » one executor per “grid flavor”, developed by the corresponding grid experts
    – common data management system
    – common central production database for all of ATLAS

SLIDE 4

  • terminology and conceptual model

[Diagram: a task applies a transformation to an input dataset to produce an output dataset; the task is split into jobs, each applying the transformation to input files and producing output files plus a log file; task and job definitions are kept in prodDB/AMI and prodDB]

SLIDE 5

  • architecture

– as simple as possible (well, almost)
– flexible
– target automatic production
– based on DC1 experience with AtCom (the DC1 interactive production system) and GRAT

  • core engine with plug-ins

– some buzz technologies

  • XML, Jabber, Webservices, ...

– federation of grids

  • LCG, Nordugrid, Grid3
  • legacy systems only as backup

– use middleware components as much as possible

  • avoid inventing ATLAS’ own version of grid

– broker, catalogs, information system, ...

  • risky dependency !

SLIDE 6

[Architecture diagram: supervisor instances, each connected through Jabber to one executor (two LCG executors, one NorduGrid executor, one Grid3 executor, one legacy executor); all supervisors work from the common prodDB and the common data management system (dms), which in turn talks to the replica catalogs (LRC, RLS) of the individual grids (LCG, NG, Grid3, legacy)]

SLIDE 7

[Same architecture diagram, with the legacy facility shown concretely as an LSF batch system]

SLIDE 8

  • prodDB = production database

– holds records for

  • job transformations
  • job definitions
    – status of jobs
  • job executions
  • logical files

– Oracle database hosted at CERN

SLIDE 9

[ER diagram: the main prodDB record types and their fields]

– jobTrans: uses, implementation, formalPars, ...
– jobDefinition: jobName, jobXML, currentState, lastAttempt, supervisor, priority, ...
– jobExecution: attemptNr, jobstatus, supervisor, executor, joboutputs, metadata, ...
– logicalFile: logicalFileName, logicalCollection, datasetName, guid, metadata, ...
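To make the schema above concrete, here is a hedged Python sketch of the four record types as dataclasses; the field names follow the slide, while the types and defaults are guesses (the real store is an Oracle database at CERN, not Python objects).

    """Sketch of the prodDB record types listed above (illustrative only)."""
    from dataclasses import dataclass, field


    @dataclass
    class JobTrans:
        uses: str = ""            # software release the transformation relies on
        implementation: str = ""  # script/executable implementing the transformation
        formalPars: str = ""      # XML signature, see the next slide


    @dataclass
    class JobDefinition:
        jobName: str = ""
        jobXML: str = ""          # actual parameters, inputs, outputs (see slides 11-12)
        currentState: str = ""    # e.g. pending / running / done / aborted
        lastAttempt: int = 0
        supervisor: str = ""
        priority: int = 0


    @dataclass
    class JobExecution:
        attemptNr: int = 0
        jobstatus: str = ""
        supervisor: str = ""
        executor: str = ""
        joboutputs: str = ""
        metadata: dict = field(default_factory=dict)


    @dataclass
    class LogicalFile:
        logicalFileName: str = ""
        logicalCollection: str = ""
        datasetName: str = ""
        guid: str = ""
        metadata: dict = field(default_factory=dict)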

SLIDE 10

<signature>
  <formalPar>
    <name>inputfile</name>
    <position>1</position>
    <type>LFN</type>
    <metaType>inputLFN</metaType>
  </formalPar>
  <formalPar>
    <name>outputfile</name>
    <position>2</position>
    <type>LFN</type>
    <metaType>outputLFN</metaType>
  </formalPar>
  ...
  <formalPar>
    <name>ranseed</name>
    <position>7</position>
    <type>natural</type>
    <metaType>plain</metaType>
  </formalPar>
</signature>

jobTrans:formalPars

SLIDE 11

<jobDef>
  <jobPars>
    <actualPar>
      <name>inputfile</name>
      <position>1</position>
      <type>LFN</type>
      <metaType>inputLFN</metaType>
      <value>dc2.003014.evgen.M1_minbias._00020.pool.root</value>
    </actualPar>
    ...
  </jobPars>
  <jobInputs>
    <fileInfo>
      <LFN>dc2.003014.evgen.M1_minbias._00020.pool.root</LFN>
      <logCol>/datafiles/dc2/evgen/dc2.003014.evgen.M1_minbias/</logCol>
    </fileInfo>
  </jobInputs>
  <jobOutputs>...</jobOutputs>
  <jobLogs>...</jobLogs>
</jobDef>

jobDefinition:jobXML
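A small illustration of how the actualPar entries above could be consumed: the Python sketch below orders the parameters by their <position>, as declared in the jobTrans signature on the previous slide, and builds a command line for the transformation. The transformation name and the random-seed value are hypothetical, and this is not the prodsys code itself.

    """Illustrative only: bind ordered actual parameters to a command line."""
    import xml.etree.ElementTree as ET

    JOB_XML = """
    <jobDef>
      <jobPars>
        <actualPar>
          <name>ranseed</name><position>7</position>
          <type>natural</type><metaType>plain</metaType>
          <value>1234567</value>
        </actualPar>
        <actualPar>
          <name>inputfile</name><position>1</position>
          <type>LFN</type><metaType>inputLFN</metaType>
          <value>dc2.003014.evgen.M1_minbias._00020.pool.root</value>
        </actualPar>
      </jobPars>
    </jobDef>
    """

    def command_line(job_xml: str, transformation: str) -> str:
        root = ET.fromstring(job_xml)
        pars = root.findall("./jobPars/actualPar")
        # order the actual parameters by the <position> declared in the
        # transformation signature (previous slide)
        pars.sort(key=lambda p: int(p.findtext("position")))
        values = [p.findtext("value") for p in pars]
        return " ".join([transformation] + values)

    print(command_line(JOB_XML, "dc2.simul.trf"))
    # -> dc2.simul.trf dc2.003014.evgen.M1_minbias._00020.pool.root 1234567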

SLIDE 12

<jobDef>
  <jobPars>...</jobPars>
  <jobInputs>...</jobInputs>
  <jobLogs>
    <fileInfo>
      <stream>stdboth</stream>
      <LFN>dc2.003014.simul.M1_minbias._00980.job.log</LFN>
      <logCol>/logfiles/dc2/simul/dc2.003014.simul.M1_minbias/</logCol>
      <dataset><name>dc2.003014.simul.M1_minbias.log</name></dataset>
      <SEList><SE>castorgrid.cern.ch</SE></SEList>
    </fileInfo>
  </jobLogs>
  <jobOutputs>
    <fileInfo>
      <LFN>dc2.003014.simul.M1_minbias._00980.pool.root</LFN>
      <logCol>/datafiles/dc2/simul/dc2.003014.simul.M1_minbias/</logCol>
      <dataset><name>dc2.003014.simul.M1_minbias</name></dataset>
      <SEList><SE>castorgrid.cern.ch</SE></SEList>
    </fileInfo>
  </jobOutputs>
</jobDef>

jobDefinition:jobXML

SLIDE 13

[Architecture diagram repeated, to introduce the supervisor (next slide)]

SLIDE 14

  • supervisor

– consumes jobs from the production database
– submits them to one of the executors it is connected with
– follows up on the job
– validates presence of expected outputs
– takes care of final registration of output products in case of success
– possibly takes care of clean-up in case of failure
– will retry n times if necessary
– implementation -> Windmill

  • http://heppc12.uta.edu/windmill/

– no brokering

  • “how-many-jobs-do-you-want” protocol

– possibly stateless
– uses Jabber to communicate with executors (a minimal loop is sketched below)
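The bullets above describe a simple lifecycle; the following Python sketch shows one way that loop could look. All class and method names (ProdDB, Executor, DMS front-ends, the Job record) are assumptions for illustration, not Windmill's actual API, and the real supervisor exchanges XML messages with its executors over Jabber rather than calling methods directly.

    """Illustrative sketch of the supervisor cycle (not Windmill's real code)."""

    MAX_ATTEMPTS = 3  # "will retry n times if necessary"


    class Supervisor:
        def __init__(self, proddb, executor, dms):
            self.proddb = proddb      # production database front-end
            self.executor = executor  # one of the connected executors
            self.dms = dms            # data management system front-end

        def cycle(self):
            # "how-many-jobs-do-you-want": the executor states its capacity,
            # the supervisor does no brokering of its own
            wanted = self.executor.num_jobs_wanted()

            # consume pending job definitions from prodDB and hand them over
            jobs = self.proddb.fetch_pending(limit=wanted)
            handles = {self.executor.submit(job): job for job in jobs}

            # follow up until every job reaches a final state
            for handle, job in handles.items():
                status = self.executor.get_status(handle)
                if status == "finished" and self.outputs_present(job):
                    self.register_outputs(job)      # final registration on success
                    self.proddb.set_state(job, "done")
                elif status == "failed":
                    self.executor.clean_up(handle)  # possible clean-up on failure
                    next_state = "pending" if job.attempt < MAX_ATTEMPTS else "aborted"
                    self.proddb.set_state(job, next_state, attempt=job.attempt + 1)

        def outputs_present(self, job):
            # validate the presence of the expected output files in the catalog
            return all(self.dms.exists(lfn) for lfn in job.output_lfns)

        def register_outputs(self, job):
            for lfn in job.output_lfns:
                self.dms.register(lfn, dataset=job.output_dataset)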

SLIDE 15

[Architecture diagram repeated, to introduce the executors (next slide)]

SLIDE 16

  • executor

– one for each facility flavor

  • LCG (lexor), NG (dulcinea), Grid3 (capone), PBS, LSF, BQS, Condor?, …

– translates the facility-neutral job definition into the facility-specific language

  • XRSL, JDL, wrapper scripts, …

– implements a facility-neutral interface (sketched after this slide)

  • usual methods: submit, getStatus, kill, …

– possibly stateless
– two implementation strategies

  • executor subclass
  • SOAP adapter + executor webservice (Capone)

– see other talks in this conference
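As a rough illustration of that facility-neutral interface, here is a hedged Python sketch. The method names follow the slide (submit, getStatus/kill, plus the capacity query and the translation step), but the class, the signatures, and the toy LSF example are assumptions, not the actual lexor/dulcinea/capone code.

    """Illustrative sketch of the facility-neutral executor interface."""
    from abc import ABC, abstractmethod


    class Executor(ABC):
        """One concrete subclass per facility flavor (lexor, dulcinea, capone, ...)."""

        @abstractmethod
        def num_jobs_wanted(self) -> int:
            """Answer the supervisor's 'how-many-jobs-do-you-want' request."""

        @abstractmethod
        def translate(self, job_xml: str) -> str:
            """Turn the neutral job definition into the facility's own language
            (JDL for LCG, XRSL for NorduGrid, a wrapper script for a batch system)."""

        @abstractmethod
        def submit(self, job_xml: str) -> str:
            """Submit the translated job and return a facility-specific handle."""

        @abstractmethod
        def get_status(self, handle: str) -> str:
            """Report the job state ('running', 'finished', 'failed', ...)."""

        @abstractmethod
        def kill(self, handle: str) -> None:
            """Abort the job on the facility."""


    class LSFExecutor(Executor):
        """Hypothetical legacy example: wrap the job for a local LSF queue."""

        def num_jobs_wanted(self) -> int:
            return 10  # e.g. keep at most 10 jobs queued locally

        def translate(self, job_xml: str) -> str:
            return "#!/bin/sh\n# run the transformation described in " + job_xml

        def submit(self, job_xml: str) -> str:
            wrapper = self.translate(job_xml)
            print(wrapper)         # a real implementation would pass this to bsub
            return "lsf-job-0001"  # and return the LSF job id as the handle

        def get_status(self, handle: str) -> str:
            return "finished"      # a real implementation would query the batch system

        def kill(self, handle: str) -> None:
            pass                   # a real implementation would cancel the batch job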

SLIDE 17

[Architecture diagram repeated, to introduce the data management system (next slide)]

SLIDE 18

  • data management system

– allows global cataloguing of files

  • we have opted to interface to existing replica catalog flavors

– allows global file movement

  • an ATLAS job can get/put a file anywhere

– presents a uniform interface on top of all the facility-native data management tools (a minimal sketch follows this slide)
– we only counted on the ability to do inter-grid file transfers

  • ideally jobs should be able to use input files located in other grids and write output files into other grids

  • this was not exercised

– stateless
– implementation -> Don Quijote

  • see separate talk by Miguel Branco
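A minimal sketch, under assumed class and method names, of what such a uniform interface could look like: one front-end hiding several per-grid replica catalogs (the LRC/RLS flavors shown in the architecture diagrams) behind exists/get/put calls. Don Quijote's real interface may well differ; see the referenced talk.

    """Illustrative sketch of a uniform DMS front-end (not Don Quijote's API)."""


    class ReplicaCatalog:
        """Minimal stand-in for one grid's native catalog."""

        def __init__(self):
            self.replicas = {}  # logical file name -> list of physical replicas

        def lookup(self, lfn):
            return self.replicas.get(lfn, [])

        def register(self, lfn, surl):
            self.replicas.setdefault(lfn, []).append(surl)


    class DMS:
        """Uniform interface on top of the per-grid catalogs and transfer tools."""

        def __init__(self, catalogs):
            self.catalogs = catalogs  # e.g. {"lcg": ..., "ng": ..., "grid3": ...}

        def exists(self, lfn):
            # global cataloguing: a file is known if any grid's catalog has it
            return any(cat.lookup(lfn) for cat in self.catalogs.values())

        def get(self, lfn, destination):
            # global movement: fetch the first available replica, whichever
            # grid it lives on (the transfer itself is omitted in this sketch)
            for cat in self.catalogs.values():
                replicas = cat.lookup(lfn)
                if replicas:
                    return "copy {} -> {}".format(replicas[0], destination)
            raise FileNotFoundError(lfn)

        def put(self, local_path, lfn, grid, surl):
            # store the file on the chosen grid and register it in that catalog
            self.catalogs[grid].register(lfn, surl)
            return "copy {} -> {}".format(local_path, surl)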

SLIDE 19

  • experience

– since the start of DC2 (July) the system has

  • recorded 235000 jobExecutions, 158000 jobDefinitions, and 251000 logicalFiles
  • approx. evenly distributed over the three Grid flavors
  • run 157 tasks using 22 jobTrans
  • consumed ~1.5 million SI2k-months of CPU (~5000 CPU-months)

– we had high dependency on middleware

  • broker in LCG, RLS in Grid3/NG, ...
  • we suffered a lot !
  • many bugs were found and corrected

– DC2 started before development was finished

  • we suffered a lot !
  • many bugs were found and corrected

– detailed experience reports per Grid in other talks

SLIDE 20

  • conclusion

– for DC2 ATLAS relies completely on a federation of grid systems (LCG, NorduGrid, Grid3)
– the ATLAS production system allows for automatic production on this federation of grids
– the ATLAS production system is based directly on the services offered by these grids
– stress-testing these services in the context of a major production was a new experience and many lessons were learned
– it was possible, but not easy

  • a lot of manpower was needed to compensate for missing and/or buggy software