PanDAMon Integration in CMS



slide-1
SLIDE 1

Support for Distributed Computing

CERN IT Department CH-1211 Geneva 23 Switzerland

www.cern.ch/it

PanDAMon Integration in CMS

Workshop on Analysis Tools Development May 16th 2013 Nicolò Magini CERN IT-SDC-OL


slide-2
SLIDE 2

Outline

  • Status after the prototype
  • Current status of the testbed deployment
  • Plans for the integration testbed
  • Next steps


slide-3
SLIDE 3

Monitoring of PanDA jobs

  • Reminder: “Monitoring of jobs in PanDA” is more than “PanDA Monitor”
  • ATLAS ops and users take advantage of Dashboard (populated from PanDA DB) to complement PanDA Monitor, especially for
      – Task monitoring
      – Historical view
  • Here I’m going to look only at the “PanDA Monitor” itself, in particular for job debugging


slide-4
SLIDE 4

PanDAMon for the prototype

  • Using ATLAS PanDA Monitor as-is, with minimal updates by V. Fine (ATLAS PanDAMon developer) to make it functional for CMS jobs
  • Already used successfully by CMS power users in the proof-of-concept phase


slide-5
SLIDE 5

PanDAMon for the prototype

  • viewlogfiles: perform LFN2PFN conversion with the PhEDEx datasvc to find the log file location (instead of looking it up in the central ATLAS catalog)
      – Recently had an issue with logfile retrieval, now fixed by V. Fine
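
The LFN2PFN lookup mentioned here can be sketched roughly as below. This is only an illustration, not the code used in PanDAMon: it assumes the PhEDEx data service exposes an `lfn2pfn` call under the base URL shown and returns JSON with a `phedex.mapping` list. The node name, LFN, PFN and sample response are made up, and no network request is made.

```python
from urllib.parse import urlencode

# Assumed base URL of the PhEDEx data service (JSON format, production instance).
DATASVC_BASE = "https://cmsweb.cern.ch/phedex/datasvc/json/prod"


def lfn2pfn_url(node, lfn, protocol="srmv2"):
    """Build the lfn2pfn query URL for one LFN at one PhEDEx node."""
    query = urlencode({"node": node, "lfn": lfn, "protocol": protocol})
    return f"{DATASVC_BASE}/lfn2pfn?{query}"


def extract_pfns(response):
    """Map each LFN to its PFN, skipping entries without a resolved PFN."""
    mappings = response.get("phedex", {}).get("mapping", [])
    return {m["lfn"]: m["pfn"] for m in mappings if m.get("pfn")}


# Hypothetical response, shaped like the assumed datasvc output.
sample_response = {
    "phedex": {
        "mapping": [
            {
                "node": "T2_XX_Example",
                "protocol": "srmv2",
                "lfn": "/store/temp/user/example/job_1.log.tgz",
                "pfn": "srm://se.example.org/cms/store/temp/user/example/job_1.log.tgz",
            },
            # An LFN the site could not resolve: no PFN is returned.
            {
                "node": "T2_XX_Example",
                "protocol": "srmv2",
                "lfn": "/store/temp/user/example/job_2.log.tgz",
                "pfn": None,
            },
        ]
    }
}

log_locations = extract_pfns(sample_response)
```

Keeping URL construction and response parsing separate means the parsing step can be checked against a recorded response without contacting the service.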


slide-6
SLIDE 6

Testbed deployment

  • Additional 2-core, 8 GB VM could be useful as PanDA Mon “development instance” to test deployment and new modules

vocms09 Panda Mon (varnish) SLC6 LB Preslav VM 2 cores, 8 GB mem, 500 GB disk prototype
vocms35 Panda Mon (varnish) SLC6 LB Preslav VM 2 cores, 8 GB mem, 500 GB disk prototype
vocms33 Panda Mon SLC6 - power node LB Preslav 23-JAN-14 24 cores, 32 GB mem, 2x750 GB disk prototype
vocms100 Panda Mon SLC6 LB (temporary node, this is the ASO spare) Preslav 27-JAN-14 8 cores, 24 GB mem, 3x1TB disk spare

slide-7
SLIDE 7

Testbed status

  • Basic quattor configuration performed by VOC on all machines following ATLAS templates
  • Now in contact with ATLAS Distributed Computing operators for software deployment and configuration procedures


slide-8
SLIDE 8

PanDAMon testbed goals

  • During testbed phase
      – Reproduce working PanDA Monitor setup from prototype phase in CMS instance
      – Identify “ATLAS” assumptions in monitoring, assess usability for CMS
          • Some examples found by developers in job debugging views are reported in the following slides
          • More will surely be found by CMS ops and users; feedback will be gathered
      – Produce new PanDAMon custom modules for CMS integration for items not covered by current PanDAMon or Dashboard


slide-9
SLIDE 9

Navigation

  • A lot of information on the website is aggregated by cloud
  • For CMS, more useful to look at sites rather than clouds?
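
The difference between the two views can be illustrated with a small sketch. The job records, site names, cloud names and field names below are hypothetical and do not reflect the PanDA DB schema; the point is only that the same records can be grouped by either key.

```python
from collections import Counter

# Hypothetical job records; field names are illustrative only.
jobs = [
    {"site": "T2_AA_Alpha", "cloud": "C1", "status": "finished"},
    {"site": "T2_AA_Alpha", "cloud": "C1", "status": "failed"},
    {"site": "T2_BB_Beta", "cloud": "C1", "status": "finished"},
    {"site": "T2_CC_Gamma", "cloud": "C2", "status": "running"},
]


def count_by(records, key):
    """Count jobs per status, grouped by the given key ('site' or 'cloud')."""
    counts = {}
    for record in records:
        counts.setdefault(record[key], Counter())[record["status"]] += 1
    return counts


per_cloud = count_by(jobs, "cloud")  # coarse view: sites are merged together
per_site = count_by(jobs, "site")    # finer view: one entry per site
```

In the per-cloud view the failure at `T2_AA_Alpha` is folded into the totals for `C1`, while the per-site view attributes it to the site directly.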


slide-10
SLIDE 10

Dataset info

  • Dataset info linking to DQ2


slide-11
SLIDE 11

Dataset info

  • Need to update to link to DAS/DBS
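
As a rough idea of what such a link could look like, the sketch below builds a DAS query URL for a dataset name. The base URL and the `input` query parameter are assumptions about the DAS web interface, the dataset name is made up, and this is not how PanDAMon generates its links.

```python
from urllib.parse import urlencode

# Assumed entry point of the DAS web interface.
DAS_BASE = "https://cmsweb.cern.ch/das/request"


def das_dataset_link(dataset):
    """Build a link that asks DAS for the given dataset name."""
    query = urlencode({"input": f"dataset={dataset}"})
    return f"{DAS_BASE}?{query}"


# Hypothetical dataset name, for illustration only.
link = das_dataset_link("/ExamplePrimary/ExampleProcessed-v1/AOD")
```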
slide-12
SLIDE 12

Task monitoring

  • Linked to ATLAS Task Monitoring
  • Integrate with CMS Task Monitoring
slide-13
SLIDE 13

Output file links

  • Links to log and output file locations working in “viewlogfile” page, need to fix in “findfile”
  • (do we want to update output location in PanDA DB from /store/temp/user to /store/user after ASO?)
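
If the output location were updated, the change to each stored LFN might look like the sketch below. It assumes the only difference between the temporary and final location is the leading prefix, which may not hold in a real deployment; the function name and the example path are hypothetical.

```python
TEMP_PREFIX = "/store/temp/user/"
FINAL_PREFIX = "/store/user/"


def final_output_lfn(lfn):
    """Return the LFN under /store/user for a file recorded under /store/temp/user.

    Assumption: the rest of the path is unchanged by the ASO transfer.
    LFNs outside the temporary area are returned untouched.
    """
    if lfn.startswith(TEMP_PREFIX):
        return FINAL_PREFIX + lfn[len(TEMP_PREFIX):]
    return lfn


# Hypothetical output file path, for illustration only.
updated = final_output_lfn("/store/temp/user/example/task_1/output_1.root")
```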


slide-14
SLIDE 14

Error reporting

  • ASO failures reported to DB and visible in monitoring but not in “Error details”
  • CMS transformation (job wrapper) exit code visible in PanDAMon, but not the detailed error message (which includes cmsRun messages)
  • Update links to support mail…


slide-15
SLIDE 15

Next steps

  • Next week: deploy PanDAMon as-is on dev server in testbed setup
  • When testbed setup is ready, start looking into reported issues
  • Interact with PanDAMon developers to learn how to integrate new modules if needed by CMS
      – First session already done
  • Reproduce deployment on prod server
