SLIDE 1

Distributed physics analysis for ATLAS: presentation to the LHCC Comprehensive Review

James R. Catmore Research Associate, Lancaster University, UK ATLAS B-physics working group

SLIDE 2 James Catmore LHCC Comprehensive Review, CERN, 20th November 2007

ATLAS Analysis computing model

[Diagram. Labels: Detector; Monte Carlo; Reconstruction; AOD & TAG building; TIER 0 (CERN) and PRODUCTION SYSTEM (TIER 2 SITES); TIER 1 SITES; TIER 2 SITES; MIDDLEWARE; ANALYSTS; Jobs; Results. Data formats shown: Digits, ESD, AOD, TAG]
SLIDE 3

ATLAS distributed analysis tools

  • Distributed data management: DQ2
  • Replica catalogue providing mappings to concrete datasets on sites across the Grids
  • Dataset: basic data unit for an analyst - may contain thousands of files
  • Tools (command line and web) for listing catalogue content
  • Metadata service: AMI
  • Web interface giving information on file provenance
  • Grid user interface: GANGA

SLIDE 4

GANGA

  • Grid user interface: GANGA
  • Single tool for all Grid-based work (including analysis and small Monte Carlo productions)
  • Trivial switching between Grid running and local execution (for testing purposes)
  • Grid backends include LCG (EGEE), NorduGrid, OSG (Panda)
  • Interface via the command line or a GUI

www.cern.ch/ganga
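
The backend switching described on this slide can be pictured with a minimal Python sketch. It is not GANGA code: the `Job` class, the `BACKENDS` table and the two runner functions are hypothetical names invented here, and they only show the idea that the job definition stays fixed while the execution target changes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical stand-ins for execution backends; these are NOT GANGA classes.
def run_locally(task: str) -> str:
    return f"ran '{task}' on the local machine"


def run_on_grid(task: str) -> str:
    return f"submitted '{task}' to the Grid"


BACKENDS: Dict[str, Callable[[str], str]] = {
    "Local": run_locally,
    "Grid": run_on_grid,
}


@dataclass
class Job:
    """A job description that does not depend on where it runs."""
    task: str
    backend: str = "Local"
    log: List[str] = field(default_factory=list)

    def submit(self) -> str:
        # Only this lookup depends on the chosen backend.
        message = BACKENDS[self.backend](self.task)
        self.log.append(message)
        return message


job = Job(task="J/psi analysis")
local_message = job.submit()   # test locally first
job.backend = "Grid"           # switch target; the job definition is unchanged
grid_message = job.submit()
```

Keeping the backend as a single attribute is what makes the switch trivial in this sketch: nothing else in the job has to be edited between the local test and the Grid run.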

SLIDE 5

TAGs

  • Event TAGs are the method by which reconstructed events are selected for analysis
  • Built from AOD according to offline analysis-style code
  • Initially consist of files containing event metadata and a pointer to the POOL file from which the tag was made
  • Later loaded into a relational database for access by physicists
  • Typical information held in the TAG: temporal conditions, quality & detector status, trigger, physics
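
As an illustration of TAG-based selection once the TAGs are in a relational database, the sketch below uses Python's built-in `sqlite3` module. The table layout, column names and file names are assumptions made for this example and do not reflect the real ATLAS TAG schema; the point is only that a query on event metadata yields pointers back to the files that hold the selected events.

```python
import sqlite3

# In-memory database standing in for a TAG database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tags ("
    " run INTEGER, event INTEGER, good_quality INTEGER,"
    " dimuon_trigger INTEGER, n_muons INTEGER, pool_file TEXT)"
)

# A few made-up TAG records: metadata plus a pointer to the source file.
rows = [
    (1, 1, 1, 1, 2, "aod_0001.pool.root"),
    (1, 2, 1, 0, 1, "aod_0001.pool.root"),
    (1, 3, 0, 1, 2, "aod_0002.pool.root"),
    (2, 1, 1, 1, 3, "aod_0003.pool.root"),
]
conn.executemany("INSERT INTO tags VALUES (?, ?, ?, ?, ?, ?)", rows)

# Select events with good data quality, a fired dimuon trigger and >= 2 muons.
selected = conn.execute(
    "SELECT run, event, pool_file FROM tags"
    " WHERE good_quality = 1 AND dimuon_trigger = 1 AND n_muons >= 2"
    " ORDER BY run, event"
).fetchall()

# Only the files that contain selected events need to be read by the job.
files_to_read = sorted({pool_file for _, _, pool_file in selected})
```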

SLIDE 6

Analysts’ work-flow (data taking)

1. Prepare analysis code
2. Prepare TAG selection
3. Set up analysis job
4. Submit to the Grid
5. Retrieve results
6. Merge results
7. Inspect in ROOT

SLIDE 7

Analysts’ work-flow (now)

1. Prepare analysis code
2. Locate dataset
3. Set up analysis job
4. Submit to the Grid
5. Retrieve results
6. Merge results
7. Inspect in ROOT

SLIDE 8

Example for this presentation

  • Submission of a simple Athena analysis job to the Grid using GANGA
  • Search for J/ψ→μμ decays in AOD using the Athena package BPhysAnalysisTools
  • Run over Monte Carlo AOD data sample produced by the Production System
  • Job goes to the data
  • Physics analyses using exactly this method are in progress now for commissioning preparation
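
The quantity at the heart of a J/ψ→μμ search is the invariant mass of a muon pair, m² = (E₁ + E₂)² − |p₁ + p₂|², in units where c = 1. The sketch below is a generic plain-Python illustration and is not taken from BPhysAnalysisTools; the function names and the 0.2 GeV mass window are assumptions chosen for the example.

```python
import math

MUON_MASS_GEV = 0.105658  # muon rest mass in GeV
JPSI_MASS_GEV = 3.0969    # J/psi rest mass in GeV


def energy(momentum, mass=MUON_MASS_GEV):
    """Relativistic energy from a (px, py, pz) momentum in GeV."""
    px, py, pz = momentum
    return math.sqrt(px * px + py * py + pz * pz + mass * mass)


def invariant_mass(mu1, mu2):
    """Invariant mass of two muons given their momentum 3-vectors."""
    total_e = energy(mu1) + energy(mu2)
    px, py, pz = (a + b for a, b in zip(mu1, mu2))
    m2 = total_e ** 2 - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


def is_jpsi_candidate(mu1, mu2, window_gev=0.2):
    """True if the pair mass lies within an (assumed) window of the J/psi mass."""
    return abs(invariant_mass(mu1, mu2) - JPSI_MASS_GEV) < window_gev


# Back-to-back muons with momenta chosen so the pair mass equals the J/psi mass.
p = math.sqrt((JPSI_MASS_GEV / 2) ** 2 - MUON_MASS_GEV ** 2)
signal_like = invariant_mass((p, 0.0, 0.0), (-p, 0.0, 0.0))

# Two collinear muons have a pair mass of only twice the muon mass.
background_like = is_jpsi_candidate((5.0, 0.0, 0.0), (5.0, 0.0, 0.0))
```

A real selection would also require opposite charges and a common vertex; those steps are left out here to keep the example short.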

SLIDE 9

Step 1: prepare your analysis code

SLIDE 10

Step 2: Find your dataset in AMI (current)

[Screenshot of AMI. Annotations: wildcards supported; dataset number; simulation chain step]

All official ATLAS MC data on the Grid uses a strict naming policy

AMI page: http://ami3.in2p3.fr:8080/opencms/opencms/AMI/www
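
A strict naming policy is what makes wildcard searches useful: every field sits in a predictable position. The sketch below shows the idea with Python's standard `fnmatch` module. The dataset names and the six-field layout are hypothetical, made up for this example, and the code does not talk to AMI.

```python
from fnmatch import fnmatchcase

# Hypothetical dataset names; the field layout is an assumption for illustration:
# project.number.physics.step.type.version
CATALOGUE = [
    "mc.000001.jpsi_mumu.simul.HITS.v1",
    "mc.000001.jpsi_mumu.recon.AOD.v1",
    "mc.000001.jpsi_mumu.recon.ESD.v1",
    "mc.000002.minbias.recon.AOD.v1",
]


def find_datasets(pattern, catalogue=CATALOGUE):
    """Return the names matching a shell-style wildcard pattern."""
    return [name for name in catalogue if fnmatchcase(name, pattern)]


def describe(name):
    """Split a name into its fields, relying on the fixed field order."""
    project, number, physics, step, data_type, version = name.split(".")
    return {
        "project": project,
        "number": number,
        "physics": physics,
        "step": step,
        "type": data_type,
        "version": version,
    }


# AOD from the reconstruction step of dataset number 000001, any version.
aod_matches = find_datasets("mc.000001.*.recon.AOD.*")

# Everything produced by the reconstruction step, whatever the sample.
recon_matches = find_datasets("*.recon.*")
```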

SLIDE 11

Step 2: Find your dataset in AMI (current)

[Screenshot of AMI. Annotations: DQ2 link; dataset name; data type; Athena release; status]

SLIDE 12

Step 2: prepare TAG selection (data taking)

  • Event Quantities
  • Data Quality
  • Trigger information
  • Electron objects
  • Photon objects
  • Muon objects
  • Tau-jet objects
  • Jets
  • Physics attributes

The quantities above are available for TAG selection.

Developing the correct selection will be a major task and will probably involve several analysts or an entire physics group. Local testing will be required (local use of the TAGs is trivial).
SLIDE 13

Step 3: Set up analysis job.... GANGA window

SLIDE 14

Step 3: Set up analysis job.... GANGA window

Job control panel: New, Open, Submit, Retrieve output, Save, Copy, Kill, Delete

Job details window

SLIDE 15

Step 3: Set up analysis job.... GANGA window

Job monitoring window: Type, Back-end, ID, Status

SLIDE 16

Step 3: Set up analysis job.... job builder

Job builder sections: Athena, Back-end control, Dataset/TAG, Results

SLIDE 17

Where is the data.....?

Best sites are FZKDISK and LYONDISK: 592 files each
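
Choosing where to send the job amounts to finding the sites that hold the most files of the dataset. A minimal sketch follows, assuming the replica information is already available as a mapping from site name to file count. FZKDISK and LYONDISK with 592 files each come from this slide; `SITE_A`, `SITE_B` and their counts are hypothetical, and the function does not query DQ2.

```python
def best_sites(replicas):
    """Return the site names holding the largest number of files."""
    if not replicas:
        return []
    most_files = max(replicas.values())
    return sorted(site for site, n in replicas.items() if n == most_files)


# File counts per site for one dataset (SITE_A and SITE_B are made up).
replicas = {
    "FZKDISK": 592,
    "LYONDISK": 592,
    "SITE_A": 310,
    "SITE_B": 75,
}

chosen = best_sites(replicas)
```

Returning every site that ties for the maximum, rather than a single winner, leaves the final choice to the submission step.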

SLIDE 18

Where is the data?

SLIDE 19

Step 4: Submit your job....

SLIDE 20

Step 6: Monitor the job

after 5 minutes or so.....

SLIDE 21

Step 7: Retrieve and merge your results

Retrieve Merge
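
Merging means combining the partial outputs of the individual Grid sub-jobs into one result. The sketch below shows the principle for histograms represented as plain dictionaries of bin index to count; this representation and the function name are assumptions for illustration, not the mechanism GANGA uses. For ROOT files, the `hadd` utility distributed with ROOT performs this kind of merge.

```python
from collections import Counter


def merge_histograms(subjob_histograms):
    """Add up the bin contents of histograms given as {bin_index: count}."""
    merged = Counter()
    for histogram in subjob_histograms:
        merged.update(histogram)  # Counter.update adds counts bin by bin
    return dict(sorted(merged.items()))


# Made-up outputs from three sub-jobs that each processed part of the dataset.
subjob_outputs = [
    {10: 3, 11: 10},
    {11: 7, 12: 1},
    {10: 2},
]

merged_histogram = merge_histograms(subjob_outputs)
```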

SLIDE 22

Step 8: Collect results; analysis in ROOT

SLIDE 23

Producing Monte Carlo data in GANGA

  • Physicists will often need to produce quick, small samples of Monte Carlo data (<10 000 events)
  • GANGA provides a plug-in for running the full Athena simulation chain
  • Uses the standard ATLAS JobTransformation mechanism used in the main production system to guarantee trustworthiness
  • Naming abides by ATLAS conventions
  • User-generated datasets saved on the Grid and registered in DQ2 under the user’s name

SLIDE 24

Monitoring

ATLAS dashboard - reached from GANGA web page

SLIDE 25

Conclusions and outlook

  • ATLAS physicists are doing distributed physics analysis now on Monte Carlo data
  • TAGs are under preparation for data-taking
  • Datasets are well documented in AMI, but replicating them evenly across the sites remains a problem
  • The submission tools are sufficiently robust to handle this
  • Heavy reliance on LXPLUS UI at the moment
  • How will user support work when data-taking begins?
  • ATLAS distributed computing will be ready for data