p3s - a few technicalities M.Potekhin (Brookhaven National - - PowerPoint PPT Presentation




SLIDE 1

p3s - a few technicalities

M.Potekhin (Brookhaven National Laboratory) potekhin@bnl.gov

DUNE Collaboration Meeting - CERN, January 2018

SLIDE 2

M Potekhin | DQM - DUNE Collaboration Meeting@CERN, January 2018

protoDUNE-SP data flow

[Diagram. Labels shown: protoDUNE (NP04) DAQ; Online Monitoring; Online buffer; FTS1; FTS2; CERN EOS; CASTOR (tape); FNAL dCache; ENSTORE (tape); SAM (Metadata); Prompt Processing System; Web UI/Visualization; Monitoring Web Interface; processing in US and European Grids/Clouds; Other US sites; "custodial copy" and "primary copy"; markers A, B, C; groupings "protoDUNE Infrastructure at CERN" and "US infrastructure".]

SLIDE 3


Documentation

  • User-level documentation for p3s

– See the "documents" folder on GitHub: https://github.com/DUNE/p3s
– Documents exist in both "md" and "pdf" formats
– For now the most relevant document is "JOB"
– Links to all of these are at https://wiki.dunescience.org/wiki/ProtoDUNE-SP

  • Expert-level documentation for server maintenance is in the works and will be placed in the same location

SLIDE 4


Storage (identity and access to EOS)

  • p3s jobs run under the pilot identity np04dqm
  • could also run as the developer mxp
  • when the pilot executes the job it still retains the pilot identity
  • in either case, if your job description refers to directories that are not publicly accessible, the job will fail
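
Since a job fails when its description points at a directory the executing identity cannot access, a pre-submission check can catch the problem early. The following is a minimal sketch and not part of p3s: `unreadable_dirs` is a hypothetical helper, and it only tests the identity that runs the check, so it is meaningful only when run under the same account the job will use (for example np04dqm). On a FUSE-mounted EOS path the result may not reflect every server-side access rule.

```python
import os
from typing import Iterable, List


def unreadable_dirs(paths: Iterable[str]) -> List[str]:
    """Return the paths that are not directories the current identity can list and enter.

    Hypothetical helper for illustration only; p3s itself may perform
    no such check.
    """
    bad = []
    for path in paths:
        # A directory needs both read (list) and execute (enter) permission.
        if not (os.path.isdir(path) and os.access(path, os.R_OK | os.X_OK)):
            bad.append(path)
    return bad
```

A caller would pass every directory named in the job description and refuse to submit if the returned list is not empty.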

SLIDE 5


Storage (data)

  • Right now we fully depend on EOS for data
  • The F-FTS team knows the agreed-upon location of our "inbox"; that is where the input data will arrive

  • re: identity/access to EOS - see previous slide... either you can use your existing account and ensure that

– the primary Unix group is np-comp
– to submit batch jobs: you are in the e-group np04-t0comp-users
– to read/write EOS: you are in eos-experiment-cenf-np04-readers and eos-experiment-cenf-np04-writers, to read and write respectively

  • ...or at least for now use the prod account np04dqm
  • "Everything" (including all logs) is currently under

– /eos/experiment/neutplatform/protodune/np04tier0/p3s/

  • Perhaps we need a better structure, since FUSE has hiccups when there is a large number of files in the same directory

  • Condor logs had to be moved from EOS to AFS due to a CERN policy
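
One common way to avoid very large flat directories, such as the ones that cause the FUSE hiccups mentioned above, is to shard files into subdirectories keyed on a prefix of their identifier. The sketch below only illustrates that idea; it is not the p3s layout, and `sharded_path`, the two-character prefix, and the example base path are all assumptions.

```python
import posixpath
import uuid


def sharded_path(base: str, file_uuid: str, ext: str, width: int = 2) -> str:
    """Build base/<first `width` hex chars of the UUID>/<uuid>.<ext>.

    Hypothetical layout: with width=2 the files spread over at most
    256 subdirectories instead of one flat directory.
    """
    # uuid.UUID raises ValueError on malformed input and gives a
    # normalized lowercase hex string without dashes.
    hex_id = uuid.UUID(file_uuid).hex
    return posixpath.join(base, hex_id[:width], f"{file_uuid}.{ext}")
```

For example, `sharded_path("/tmp/p3s", "123e4567-e89b-12d3-a456-426614174000", "out")` places the file under the `12` subdirectory of `/tmp/p3s`.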

SLIDE 6


Storage (logs)

  • As mentioned, all logs are currently under

– /eos/experiment/neutplatform/protodune/np04tier0/p3s/

  • Look at the UUID in the p3s monitor (p3s-web.cern.ch) to match an object (pilot, job) to its log
  • stdout and stderr are captured in uuid.out and uuid.err respectively, where uuid is the actual (long) identifier, in the directory "joblog"
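
The naming rule above can be turned into a small lookup helper. This is a sketch under one assumption that the slide does not state: that the "joblog" directory sits directly under the p3s base path. The function name `job_log_paths` is hypothetical.

```python
import posixpath
from typing import Tuple

# Base path quoted on the slide.
P3S_BASE = "/eos/experiment/neutplatform/protodune/np04tier0/p3s/"


def job_log_paths(job_uuid: str, base: str = P3S_BASE) -> Tuple[str, str]:
    """Return (stdout_path, stderr_path) for a job UUID copied from the monitor.

    Assumes "joblog" is an immediate subdirectory of `base`; adjust if
    the real layout differs.
    """
    joblog = posixpath.join(base, "joblog")
    return (
        posixpath.join(joblog, job_uuid + ".out"),
        posixpath.join(joblog, job_uuid + ".err"),
    )
```

Given the UUID shown in the monitor, the two returned paths can be passed straight to a pager or `tail`.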

SLIDE 7


Storage (software)

  • Tried hosting software on EOS (Dorota): it does not perform well or breaks
  • AFS was scheduled to be decommissioned in late 2018 but this is likely to be pushed back
  • AFS has

– user space up to 10GB in an account
– work space up to 100GB

  • The latter seems the right place to put software built locally since it is supposed to be more robust, albeit with more latency
  • Condor logs are now also in AFS due to CERN policy (EOS breaks Condor daemons)

SLIDE 8


Setup and wrappers

  • There are (previously) working examples of wrappers (payload scripts) in the repository, under p3s/inputs
  • ...see the "larsoft" folder there; there is also a breakdown for a few types of larsoft jobs
  • Need updated examples of CVMFS-only setup
  • ...perhaps event display won't need local builds
  • Need instructions (from Tom) for local builds as well
